Xgrid Introduction

A brief introduction to Xgrid, discussing massively parallel tasks and how Xgrid can be put to work on these.

(For more up to date reading, visit the
Xgrid Archives at Macwize.com)

Xgrid is Apple's rather ahead-of-its-time grid computing platform. Platform is perhaps a bit generous in this case, though, as it's a somewhat incomplete, confusingly documented and poorly disseminated system. Neat nevertheless.

I'll save giving an overview of Xgrid here as Apple
give an overview on their developer pages.

What I'll try and do here, however, is to fill in some gaps and mention some of the traps for the unwary. I'll also provide some links to the better resources out there in webland.


To give a flavour of what Xgrid could be used for, I'll mention some useful tasks and divide them into categories.

1. Embarrassingly Parallel CPU Tasks

This means that the task can be split into a number of independent tasks that are completely independent of the others. Furthermore, they only require number crunching of the computers rather than using other resources like files systems.

Examples in this category are things like Montecarlo simulations, pi calculations (only when the algorithms are appropriate), the Mandelbrot demonstration (Xgrid Mandelbrot.app from the Developer Tools' Examples folder), password cracking etc.

2. Embarrassingly parallel but Resource Dependent Tasks

The tasks can all happen simultaneously as above. However, in this case there is significant resource requirements of the grid computers, most often the filesystem or network.

The classic example here is video conversion or ripping. In these instances a large volume of network traffic will need to take place in order to pass files to each computer. This adds some complexity and careful consideration to the design of the Xgrid architecture.

A useful commercial software package that does video conversion in batches using Xgrid is
VisualHub.

3. Partially Dependent Tasks

The tasks can't operate entirely independently and need to communicate back to each other somehow. It is assumed that nonetheless an algorithm is known that reduces the total execution time even though it won't be as significant as an embarrassingly parallel algorithm.

The classic group of this type of distributed algorithm are those that use Pascal's triangle-like problems where each vector in a row is dependent upon two cells of the previous row.


Pascal's Triangle

See this tutorial on MPI and Pascal's Triangle at Daugerresearch

These pages will concentrate on the first two types of tasks as they are the ones for which Xgrid is most suited.

Example 1



Let's take a most obvious choice of EP (embarrassingly parallel) tasks in IT security - brute-force code breaking and password cracking.

In these scenarios a trial and error approach is needed and the cracker needs to run over all possible choices in the search space to see if it is the correct one. As soon as a correct choice is found, the search is over.

It is obvious that, for example, a dictionary attack on an encrypted password is EP. You can hand off all words starting with 'a' to computer one, those with 'b' to computer two and so on. There is no dependency on the work being performed by any computer with any other, though it may be acceptable to terminate the processes on the remainder when one of the grid computers discovers the solution. However, that's not the real point about their independence.

Now we don't have to have 26 computers available in order to break down the workload for this. We can hand out all a,b,c to the first computer, d,e,f,g to the second and break them down however we like. We could be smart and figure out the likelihood of w,x,y,z as being small and so bundle up more according to their statistical frequency in the English language. The point is simply that we can partition the "space" however we like.

Note: Xgrid is only granular to the task level - it can't divide up the work it is sent into smaller parcels itself, so if you have a large domain to search, it is up to the user to partition this effectively. A good rule of thumb is to make your packages of Xgrid tasks small enough so that a lower spec'd machine is not going to be a bottleneck in the process. With eight tasks and four computers it is possible that the lowest performing computer only just finishes its first task and starts on the second when the other three computers have finished two each. Thus the processing time is almost double what it could be if the domain was broken into smaller parcels. Obviously the circumstances determine whether this can be done or not and converting four videos is hard to split into smaller tasks than four.