What’s New
Welcome to the Pooch manual! Pooch has been updated. New features include:

version 1.7:

Please read on for more!






Pooch Application





Parallel OperatiOn and Control Heuristic Application
Version 1.7



Copyright © 2001-6 Dauger Research, Inc.

Code and manual written by Dean E. Dauger


Table of Contents


I. Installing Pooch

II. Your First Parallel Computation

III. Introduction

IV. Menus and Windows
Menus
Job Window
Network Scan Window
Node Info Window
Get Files Window
Node Registration Schedule Dialog

V. Pooch Pro

VI. AppleScript
Standard Suite
Pooch Suite

VII. Security

VIII. Frequently Asked Questions

IX. Revision History

X. Credits

XI. Contact Information

A. Compiling the MPIs







I. Installing Pooch

Make sure your Macintosh running OS X 10.2 is connected properly to the Internet. Almost any type of network connection (10BaseT, 100BaseT, Gigabit, AirPort, or a mix) is sufficient. If this Mac is on an isolated network, manually configure the TCP/IP control panel to a unique IP address from 192.168.1.1 to 192.168.1.254. There are two ways to install Pooch:
• For automatic installation, double-click the Pooch Installer.
(There is no step three. Or two.)

• For manual installation:

1. Copy the Pooch folder, containing the Pooch application, to your hard drive.
2. Open the Pooch application.

So that your Mac will be available for computation if rebooted, we recommend the following:

3. From the File menu, select “Set Pooch in Account Startup Items”.

That’s it! Your Mac is now ready for parallel computing. Repeat for additional Macs.






II. Your First Parallel Computation

Once Pooch is installed and running on each Mac of a local area network, you can use them as a parallel computer. You will need a parallel application to run. For our first test, let’s use the Power Fractal app. Run it on your machine first and note its performance.

Step One - Selecting a Parallel Application

Switch to the Pooch application and select New Job… from the File menu, which results in a new Job Window. Drag the Power Fractal app from the Finder to Pooch’s Job Window or click Select App….



Step Two - Selecting Nodes

By default, Pooch selects your own node as participating in the parallel job. To add other nodes, click on Select Nodes… in the Job Window to invoke the Network Scan Window.



Double-clicking on a node moves it to the node list of the Job Window.








Step Three - Launching the Parallel Job

Finally, click Launch Job to start your parallel job.



Pooch should now be distributing the code and starting the parallel application. Once the job runs, you should see a measurable performance increase.

Congratulations! You are now operating your first parallel computer.



Optional - Your Second Parallel Computation

1. Open version 1.1 or later of Power Fractal. Note its performance.
2. From the Parallel menu, select “Automatically Launch onto Four Nodes”.

The demo will now automatically ask Pooch to create a job using the four best nodes on the local network, then submit itself for parallel launch. This is an example of “Computational Grid” computing.







III. Introduction

What is Pooch?

Pooch serves as both the user interface to your parallel computer and the manager of your parallel computer. Cluster and grid computing are both examples of parallel computation. Pooch is the parallel computer management and user interface software that can prepare, launch, and monitor these types of parallel computing jobs.
The first role of Pooch is to automate the process of starting a parallel application. A parallel application is much like a normal application except that it is designed to run on a collection of intercommunicating computational nodes. Manually starting such an application is a tedious process. Pooch automates this process, minimizing the user’s effort to operate and control a parallel computation. However, Pooch provides much more beyond this fundamental function.
Pooch provides access to a great deal of information, customization, and automation regarding the life of jobs run on the cluster and the health of the cluster. Jobs can be launched immediately or queued for a later time or when specific conditions are met. Pooch can intelligently and dynamically acquire nodes using its heuristic algorithms or using custom AppleScripts prior to launch. Statistical information about nodes can be accessed, and some node behavior can be customized. Job status is monitored, and statistics about jobs are collected and retained during and after jobs have completed. Pooch serves as launcher, scheduler, queuing system, and multisystem cluster monitor, all presented in a friendly user interface.
The first thing that makes Pooch unique from all other software of its kind on any platform is the extension of the revered Macintosh ease-of-use to parallel computing. Since 1998, the AppleSeed project at UCLA’s Department of Physics has unquestionably demonstrated the practicality of leveraging the advantages of the Macintosh platform for the goals of numerically-intensive parallel computing. While others used Linux and Windows NT with a hodgepodge of command-line only, unreliable, unstable code fragments, AppleSeed provided the only known solution for parallel computing on the Mac OS. Users of software built by the AppleSeed team are not required to know the inner workings of an operating system, nor are they required to hire expert assistance in such matters. Demonstrated at UCLA and around the world time and time again for three years, that advantage has enabled college students, high school and junior high school students, and even ordinary scientists on shoestring budgets to successfully build and operate their very own parallel computer.
We have continued this success in the latest incarnation of parallel computing on the Mac. Two months after the introduction of Pooch, a fellow in Hawaii used it to build and run his own five-iMac cluster. He then informed Dauger Research of his success and that he was in the sixth grade. While achieving such strides in ease of use, the same software harnesses massively parallel hardware.

Macintosh Parallel Computing

Forming a parallel computer out of a network of computers is called “cluster computing”. In this case, the hardware of the parallel computer is a cluster of Macintoshes. AppleSeed provided two major software pieces essential for using such hardware for parallel computation.
The first is a communications library named MacMPI, authored by Dr. Viktor K. Decyk and introduced in 1998. MacMPI provides a means for executables on each Macintosh to communicate with one another. It is a source code wrapper library that translates the calls by the parallel code into calls directly to the Mac OS. MPI (Message-Passing Interface) is a standard application programming interface (API) of the high-performance parallel computing industry. That API is supported on almost all large parallel computers (Cray, IBM SP, Fujitsu, SGI, etc.). Using the Mac OS, MacMPI supports a commonly used subset of those MPI calls.
The second software component satisfies a different essential need of the parallel computation: initiation. To begin, MacMPI requires information about the Macintoshes designated to participate in the computation. In addition, since, when compiled, it is mixed with the object code of the executable itself, MacMPI cannot copy and start the parallel application or start itself on all the Macintoshes that will perform the computation. These needs are filled by parallel computing launching software.
Through 2000, AppleSeed provided software called the Launch Den Mother and the Launch Puppy (LDM & LP) that filled this second need. Together, MacMPI and LDM & LP were the key ingredients that allowed students and researchers around the world to develop and run their own parallel computing software on the Mac OS.
Meanwhile, Apple was introducing prereleases of Mac OS X in preparation for its final release in 2001. Mac OS X was fundamentally distinct from the original Mac OS while being the future of the Mac OS. Apple required Mac developers to port their code to Carbon, a new API, so that it could run natively in OS X. Carbonizing MacMPI (renamed MacMPI_X) was straightforward, but the launching mechanism was another story. Most of LDM & LP’s fundamental design made a simple conversion to OS X impossible. For example, LDM & LP’s primary communications mechanisms and LDM’s user interface were tied to pieces of the Mac OS not carried forward into Mac OS X.
Since “porting” LDM & LP to OS X would largely be a rewrite, their author decided instead to start from scratch. Rather than reconstructing what came before, this was an opportunity to create software that vastly superseded LDM & LP’s functions, capabilities, and use of the Macintosh user interface. Utilizing three years of experience with parallel computation using Macintosh clusters at UCLA Physics, this project called for a rethinking and redesign of the user interface for a parallel computer. Hence, Pooch was born.

Using Pooch

Pooch is meant to be used in an environment where your cluster is a computational resource shared by a cooperative group, much like how a workplace printer is shared. A few scenarios are possible:





System Requirements

In general, Pooch needs 4 MB of free memory on a Macintosh with a valid TCP/IP connection. Pooch requires CarbonLib version 1.2 or later and Mac OS 9 or later. The latest CarbonLib is available from http://www.info.apple.com/support/downloads.html.
Because Pooch is a Carbon application, it can run on OS X too. In version 10.2 of OS X, Apple fixed significant bugs present in previous versions of OS X. Thus, we are proud to say clustering on Mac OS X 10.2 and 10.3 is fully operational and fully supported. Pooch Pro requires OS X 10.2.1 or later.






IV. Menus and Windows

Pooch Menus

File menu


Edit menu


Node menu

To use the items on the Node menu, you must select one node from either the node list of the Job Window or the list of nodes in the Node View of the Network Scan Window. Contextual menu-clicking on a single node in the Job Window or the Network Scan Window also invokes this menu.


Network menu



Settings menu


Pooch menu


General Preference Pane


Clicking on the icons in the toolbar at the top of the window switches the Preference Window to the other panes. It is normal for this window to disappear when Pooch is in the background. For backwards compatibility, the Settings Menu is shown by default.

Remote Head Node - This preference modifies and coordinates four elements of Pooch’s behavior to support launching jobs into and retrieving files from a cluster with a designated head node. If this feature is enabled, the default Network Scan window scans the head node’s network instead of the Local network, the Forward via job option chooses the head node, the Retrieve Output and Queue Job options are enabled, the Access Nodes via Proxy setting is enabled, and the Automatically Acquire Nodes job option automatically selects the head node as the starting node. New head node IP addresses can be entered via the address field of the Network Scan window. New in 1.7.

Nodes Preference Pane


Remote Network Job Access - Normally Pooch on other nodes contacts yours directly to forward jobs and files. For example, a node accessed using the Remote Node Scan feature might not be able to return files to your nodes because of firewalls or other network barriers. This feature permits the Network Scan Window to pull these files onto your machine. Typically this is used in combination with the via Proxy setting of the previous item. None turns off this feature, Only My Jobs accesses only jobs that resulted from a job from this node and user, typically a job triggered by Retrieve Output job option (see Options of the Job Window), while All Jobs will pull all jobs destined for this node. If the name and IP address of your node changes, then this feature might not successfully locate the jobs. New in 1.7.

Jobs Preference Pane

Selecting On Job Submit from the New Job Automatically Reloads menu causes a new job to be invoked as soon as the previous job is submitted to the launch queue. New in 1.7.



Pooch Windows

Job Window



The Job Window is where all parallel jobs are prepared for launching, scheduling, or queuing. A job requires information about the parallel application you want executed and what nodes will run this application. This window is resizable so it can display large amounts of information.







The Recent File Lists folder contains the file lists of previously submitted jobs, each of which can be opened to access specific files. The Recent Node Lists folder contains the node lists of those previously submitted jobs, which can also be opened to recall specific nodes. The Remote Scan List displays the cached IP addresses used in the Remote Node Scan feature of the Network Scan Window. For your convenience, an icon at the top of this drawer contains the name and IP address information of your node. This drawer is available only in OS X 10.2 or later.

Variations of the Job window

The Job window offers two variations from the node list. The first is the ability to acquire nodes automatically from the cluster at launch time. This feature allows the executable to be launched on nodes as they become available. The second, new in 1.6, is the ability to distribute a single-processor executable on a cluster, giving each instance of the executable a different integer. This feature gives you the ability to explore a parameter space using an executable not otherwise designed for parallel execution.

Automatically Acquire Nodes



Grid Job Type

New in 1.6, the Grid job type allows you to distribute single-processor tasks on a cluster automatically, exercising a subset of parallel computing called “distributed computing”, well-known in brute-force key breaking and SETI@Home. This feature implicitly activates the Queue this Job, Automatically Acquire Nodes, and Retrieve Output options because its operation requires these capabilities. Launching this job creates subfolders beginning with “case” where the original executable resides and numbered with the integer assigned to that job. The output of the executables will be returned to these folders. Use the Subdirectory job option to customize these folder names.
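
As a rough sketch of how a single-processor executable might use the integer it is given, the Python below maps a case index onto one point of a small two-parameter grid and builds a folder name starting with “case”. How Pooch actually delivers the integer to the executable is not described here, so the case_index argument, the parameter names and values, and the exact folder-name format are all assumptions made for illustration.

```python
from itertools import product

# Hypothetical parameter space: every combination of these values is one case.
TEMPERATURES = [1.0, 2.0, 4.0]
FIELD_STRENGTHS = [0.1, 0.5]


def parameters_for_case(case_index):
    """Map the integer assigned to this instance onto one grid point."""
    grid = list(product(TEMPERATURES, FIELD_STRENGTHS))
    if not 0 <= case_index < len(grid):
        raise ValueError("case index outside the parameter grid")
    return grid[case_index]


def case_folder_name(case_index):
    """Folder name starting with "case" and ending in the integer (assumed format)."""
    return "case" + str(case_index)
```

With this layout, six instances (indices 0 through 5) would cover the whole grid, each one working on its own parameter pair.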






Network Scan Window

Pooch’s Network Scan window features three different views, toggled via the segmented view control in its upper left corner.





The Node View displays the nodes discovered on the network, along with diagnostics and statistics about these nodes. This view is used to access nodes, query their status and health, and select them for parallel computational jobs.


The Job View retrieves from those nodes data about jobs that are queued, launching, running, or terminated. This view compiles data from these sources to display the condition and other statistics about parallel computing jobs.






New in 1.6, the Network view displays nodes in saved node lists, as well as remotely scanned networks in one large view. Each network of nodes can be revealed by clicking on the node list’s or network’s disclosure triangle.


Node View


When this window is in Node View, Pooch scans the network for available nodes and displays its results here. Once Pooch has finished scanning, you can use this window to select nodes for your computation. Double-clicking on a node moves it to the node list of the Job Window, which will open if not already present. Clicking the Add button while a node is selected will have the same effect. Nodes also present in the Job Window will appear in gray in the Network Scan Window.
Selecting a number from the pop-up menu of Add Best moves that many nodes that Pooch considers “best” to the Job Window. The selection excludes nodes that are already on the Job Window’s node list. The rating system Pooch uses to rank the nodes is described fully in the Network Scan Columns section below. This mechanism is designed to be more likely to select nodes that perform better than others and be less likely to select nodes that are busy or heavily loaded. Selecting the highest number from this menu selects all nodes that Pooch can communicate with and are not busy with a parallel job.
Normally, the Network Scan Window will scan the network again according to the Automatic Rescan specified in the Network menu. The Refresh button directs Pooch to rescan and refresh the list immediately. This window is resizable for large amounts of information.
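
The Add Best behavior described above can be approximated with a short Python sketch. It simply takes the highest-rated nodes that are not already in the job and have a nonzero rating. This is a simplification: Pooch’s own selection is a heuristic, and the Node class, its field names, and the strict top-N ordering here are assumptions, not Pooch’s implementation.

```python
from dataclasses import dataclass


@dataclass
class Node:
    name: str
    rating: float          # zero means busy or unreachable (see the Rating column)
    in_job: bool = False   # already on the Job Window's node list


def add_best(nodes, count):
    """Return up to `count` usable nodes, highest rating first."""
    candidates = [n for n in nodes if not n.in_job and n.rating > 0]
    candidates.sort(key=lambda n: n.rating, reverse=True)
    return candidates[:count]
```

Asking for more nodes than are available simply returns every usable node, which mirrors the effect of choosing the highest number on the menu.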

- After selecting one node, you can use this pop-up button to get information about that node. The Get Node Info and Get Job Queue items invoke a Node Info Window. The Get Files item invokes a Finder-like window from which you can drag files to the Finder.
- The Search box at the top right of the Network Scan window accepts a search string to limit the displayed data. It compares the search string against the node and job data available in the list below and displays only those items that match. For example, if you type “G5”, it will show nodes that use a G5 processor. If you type “dual”, only those with two processors will appear. Other text will be compared against the names and other data to reveal appropriate matches. As with any search using text, overly specific search strings will yield little or no data, and overly general strings will produce excess. New in 1.6.
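
The filtering described for the Search box can be pictured with the following Python sketch. It assumes a case-insensitive substring match against every displayed field. The manual does not specify the matching rule, so that rule and the row layout are assumptions.

```python
def matches(search, fields):
    """True if the search string appears in any field (case-insensitive)."""
    needle = search.strip().lower()
    return needle == "" or any(needle in str(value).lower() for value in fields)


def filter_rows(search, rows):
    """Keep only the rows (dicts of column name to value) that match."""
    return [row for row in rows if matches(search, row.values())]
```

For example, with rows whose CPU Type values are “Dual G5” and “G4”, the search string “dual” keeps only the first row, and an empty search string keeps both.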

- The address box at the top middle of the window provides access to remote clusters of nodes, something like the address box in a web browser. (Revised in 1.6.) Pooch normally can only directly scan within its own local network neighborhood. This is the default scanning process. You may always return Pooch to this mode by selecting Local Network from this pop-up menu or typing “Local” in the box. Technically, unless your network administrators have set up your routers to support IP multicasting, this means that Pooch can only see other nodes in the same IP subnet. (This limitation is inherited from the Internet-standard Service Location Protocol implemented in Apple’s Network Services Location Manager and Apple’s Bonjour implementation.)
Entering an IP address in this box allows you to circumvent this limitation. Enter the address, in DNS (e.g., mymac.mydomain.com) or “dotted quad” (e.g., 192.1.2.34) form, of a Mac you know is running Pooch. Your local Pooch will then ask the far away Pooch to do its local scan and relay the results. Essentially, the Pooch on your Mac “borrows the nose” of the other Pooch. This feature allows you to scan for nodes on any other accessible subnet. This is useful for circumstances where, say, you want to access a chem.ucla.edu Mac from a physics.ucla.edu Mac. Selecting Local Network returns the network search to your local network.
Once you enter one address, Pooch will remember that address and make it available on the menu of the pop-up button for future use. As you accumulate more addresses, Pooch will re-sort the list in order of most recently accessed first.
The functionality of this feature extends well beyond just one domain, however. From your Mac, you may use this to access nodes on any subnet anywhere on the Internet. This means you can even access your cluster running Pooch from your home dial-up connection.
This feature is very powerful. With this power comes responsibility. There are security features in Pooch that will make unauthorized access without Pooch extraordinarily unlikely. These issues are discussed in the Security section.
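
A minimal Python sketch of the address box behavior follows. It distinguishes the three kinds of entry mentioned above (Local, a “dotted quad”, or a DNS name) and keeps a history ordered by most recently accessed first. The function names and the rule that anything other than a valid IPv4 address is treated as a DNS name are assumptions for illustration.

```python
import ipaddress


def classify_address(text):
    """Classify an address-box entry as 'local', 'dotted quad', or 'dns name'."""
    entry = text.strip()
    if entry.lower() in ("local", "local network"):
        return "local"
    try:
        ipaddress.IPv4Address(entry)
        return "dotted quad"
    except ValueError:
        # Assumption: anything that is not a valid IPv4 address is a DNS name.
        return "dns name"


def remember(history, address):
    """Return a new history with `address` first and no duplicates."""
    return [address] + [a for a in history if a != address]
```

Re-entering an address that is already in the history moves it back to the top rather than adding a second copy.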






Job View


When the Network Scan window is in Job View, it scans the network for nodes and accesses data, if present, about queued, launching, running, terminated, and aborted jobs and compiles that job data for display here. The root level of the list shows all the jobs, named according to their executable, found on the network. The job icon for each job is color coded as follows:

Icon Color - Meaning
white - A normal parallel computing job
blue - A queued job (“on ice”)
yellow - The job is being launched
green - The job is running
gray - The job has ended
white X on black - The job was aborted

Opening the disclosure triangle displays the nodes that will be included in, are being used for, or were included in that job, depending on the state of the job. If possible, the information about a particular node will be queried directly, otherwise the data is derived from the data associated with the job.
This view correlates and compiles this data from all the nodes on the network. A variety of otherwise independent jobs may be accessed as far back as the Job View Access Time preference will permit. Since nodes used in one job can be reused for another one later, nodes will naturally appear more than once in multiple jobs. Although information such as the job ID, number of nodes and files, origin, and submission time should be consistent between nodes of a job, the process ID, start time, and end time could be different. If certain nodes of a job are not available, data about a node will be derived from the job data, if possible, residing on the other nodes. The Last Access and Determined Via columns are easy ways to tell whether or not the data are fresh.
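
To illustrate the kind of compilation described above, here is a hedged Python sketch that groups per-node reports by job ID, keeping the fields that can differ between nodes (process ID and start time) separate for each node. The report format and field names are hypothetical. The manual does not describe Pooch’s internal data structures.

```python
def compile_jobs(reports):
    """Group per-node job reports (dicts) by job ID."""
    jobs = {}
    for report in reports:
        # Job-wide data is taken from the first report seen for that job ID.
        job = jobs.setdefault(
            report["job_id"], {"name": report["name"], "nodes": {}}
        )
        # Per-node data may legitimately differ from node to node.
        job["nodes"][report["node"]] = {
            "process_id": report["process_id"],
            "start_time": report["start_time"],
        }
    return jobs
```

A node that took part in two jobs appears once under each job, which matches the way nodes can show up more than once in the Job View.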

Network View



In the Network View, the window organizes nodes into saved node lists followed by remote networks. Opening the disclosure triangle of a node list displays the nodes it contains and includes those nodes for access. Opening a remote network triggers Pooch to access the remote node and query the nodes in its local area network, just like when using the address box in the Node View. The difference is that you can have multiple networks open and accessible at once, and multiple saved node lists open as well. You may mix and match nodes from these multiple networks when selecting nodes for your parallel computing job. New in 1.6.

Network Pane


When the Network Scan window is initially opened, a pane splitter appears on the far left side of the window. Dragging that to the right opens a network pane, listing both saved node lists and remote networks. The remote networks shown are the same ones that were entered via the address box earlier, and they are listed in the same order, with the Local Network listed first. This pane gives another way to quickly select and scan a remote network, much like the left pane in OS X’s Finder windows.
Clicking on the plus sign below the Network Pane creates a new saved node list. You may think of these as iTunes playlists, but for nodes. You may drag nodes from any network to these saved node lists, creating arbitrary groups of nodes useful for different kinds of parallel computing jobs. For example, you may have a group that you know is only available at night, or another set of nodes that are all G5s or dual-processor machines, but are on disparate, mixed networks or clusters. This gives you a way to group commonly used nodes together, allowing you to conveniently view, scan, and monitor these nodes as if they were together and include them in a new job in the Job window. Double-click a node list to scan it. When viewing a saved node list, select a node and press the delete key to remove it from the list. The Network Pane and saved node lists are new in 1.6.

Network Scan Columns

The Network Scan Window displays information about each node Pooch queries. This information is shown in a spreadsheet-like format. This window is invoked using the columns button at the top of the Network Scan Window or the Show Columns… item of the Network Scan Columns submenu of the Network menu. The columns can also be added or removed using the Network Scan Columns submenu of the Network menu. You can drag a column at its header and resize it by clicking on its right edge. Clicking on a column header re-sorts the list according to that column.
Sometimes, if the node is running a computationally intensive task, the Pooch on that node may not be able to respond immediately, delaying the appearance of this information. In that case, it probably should not be used for a new job.

Column - Function
IP Address - the “dotted quad” Internet Protocol address of this node
Machine Type - the machine type, based on its ROMs
CPU Type - the processor type (e.g., G3 or G4), including number of processors
Clock Speed - the processor clock cycle speed in millions of cycles per second
OS Version - the version of the operating system (e.g., 9.0.4 or 10.0)
Load - the estimated load on the system in %. On OS 9, this item uses the amount of CPU time, in sixtieths of a second, each application has consumed according to the Process Manager approximately every ten seconds. On OS X, this item is calculated using the results of a call to the Unix “ps” command. Since the resolution is rather low and the process of measurement can influence the measurement (like Quantum Mechanics), this number can be somewhat inaccurate. It is intended, if the number is measurably nonzero, to hint that this node is probably in use and should be avoided.
Free Memory - the amount of contiguous free RAM available on this machine, which is the largest available space for a newly launched application. Updated in 1.7 to report more than 2 GB.
Free Disk Space - the amount of free space on the drive on which this node’s Pooch resides
Achieved Performance - measured achieved single-precision floating-point performance according to a benchmark based on code from the Power Fractal app. Automatically recognizes the Velocity Engine and SSE, and updated once per day. (not multiprocessor aware)
Peak Single-Precision - measured single-precision floating-point performance of the FPU of this processor based on an IBM benchmark. Updated once per day. (not multiprocessor aware)
Peak Double-Precision - measured double-precision floating-point performance of the FPU of this processor based on an IBM benchmark. Updated once per day. (not multiprocessor aware)
Peak Vector Float - measured single-precision vector floating-point (vector float or vFloat) performance of the vector FPU (in the Velocity Engine or SSE) of this processor based on an IBM benchmark. Updated once per day. (not multiprocessor aware)
User Idle Time - the time that has elapsed since Pooch last detected interaction from the user, such as moving the mouse or other activity
Rating - a nonnegative real number describing Pooch’s rating of that node. This number is a function of the information Pooch has collected. If the node is busy, or Pooch has not successfully communicated with the node, the rating is zero. Otherwise, it is related to the above properties according to the following heuristic function:


The inputs to this function are Achieved Performance divided by 300 MFlops, the load (from 0 to 1), Free Memory divided by 32 MB, Free Disk Space divided by 16 MB, and User Idle Time divided by 15 minutes. The rating is designed to be very low when the node is undesirable to use, such as when it is heavily loaded or is running out of memory or disk space. Note that, assuming load and memory space are non-issues, the rating is approximately logarithmic with processor performance. Also, many of these variables can easily change at any moment, causing the rating to fluctuate. This rating function may change in future versions of Pooch. (To create and use your own rating, use the AppleScript interface described in Chapter VI.)
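
The heuristic function itself is not reproduced here, so the following Python sketch shows only the normalization of its five inputs as stated above. The function name, argument units, and dictionary keys are assumptions. How Pooch combines these values into a single rating is deliberately left out.

```python
def rating_inputs(achieved_mflops, load, free_memory_mb, free_disk_mb, idle_minutes):
    """Normalize the quantities that feed Pooch's rating (combination not shown)."""
    return {
        "performance": achieved_mflops / 300.0,  # Achieved Performance / 300 MFlops
        "load": load,                            # already between 0 and 1
        "memory": free_memory_mb / 32.0,         # Free Memory / 32 MB
        "disk": free_disk_mb / 16.0,             # Free Disk Space / 16 MB
        "idle": idle_minutes / 15.0,             # User Idle Time / 15 minutes
    }
```

A custom rating written through the AppleScript interface could start from the same normalized quantities and weight them however suits your cluster.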

Jobs Queued - the number of jobs that are queued on that particular node
Current Job Name - if the node is running a job, the name of the first parallel executable it is running
Last Access - the time and date this data was accessed, according to that node’s clock
Pooch Version - the version of Pooch that node is running
Job ID - the job identification number assigned to that job
Node Count - the number of nodes that job will use, is using, or had used
File Count - the number of files supplied for that job, as submitted to Pooch
Process ID - the process identification number of a particular node’s instance of the executable of a job when it is running or a job when it was running
Submission Time - the time and date this job was originally submitted
Start Time - the time and date a particular node’s instance of the executable of a job was first determined to be running
End Time - the time and date a particular node’s instance of the executable of a job was first determined to be ended
Elapsed Time - the amount of time a particular node’s instance of the executable of a job took that node
Job Origin - the node that was the origin of the job
Determined Via - the node that provided the information about this job or node
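
As an illustration of the Load column on OS X, the Python sketch below sums the per-process CPU percentages reported by “ps”. The exact “ps” invocation Pooch uses is not stated, so the command line and function names here are assumptions. Note that on a multiprocessor machine the sum can exceed 100.

```python
import subprocess


def total_cpu_percent(ps_output):
    """Sum the %CPU column from `ps -A -o %cpu` output (first line is the header)."""
    total = 0.0
    for line in ps_output.splitlines()[1:]:
        line = line.strip()
        if line:
            total += float(line)
    return total


def sample_load():
    """Run ps once and return the summed CPU percentage (assumed invocation)."""
    result = subprocess.run(
        ["ps", "-A", "-o", "%cpu"], capture_output=True, text=True, check=True
    )
    return total_cpu_percent(result.stdout)
```

Because running “ps” itself consumes CPU time, a single sample like this is only a hint, which is consistent with the caveat in the Load description.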

Finally, the Status column has a number of different possible displays.

Status Reading - Meaning
(icon not shown) - attempting to connect to this node
(icon not shown) - a connection was established to Pooch at this node and security authorization is being approved. If the node is under heavy load, Pooch might not be getting enough CPU time to respond. On OS 9, writing your parallel app to give more background time (for example, using the checkesc function in MacMPI) would help.
(icon not shown) - access to Pooch at this node is in progress
(icon not shown) - access to Pooch at this node is almost done
Okay - the Pooch at this node was successfully accessed and is not busy
BUSY - the Pooch at this node was successfully accessed and is busy running an application launched by Pooch
(“breathing”) - the Pooch at this node is being used for remote neighborhood access and is returning addresses of and/or accessing other nodes
(icon not shown) - retrieving job data
(icon not shown) - retrieving file and folder data
(icon not shown) - exchanging data or commands
Queued - this job was successfully queued and is awaiting launch
Launching… - Pooch is presently attempting to launch this job
Running - this job was successfully launched and is running
Ended - Pooch has determined that this job terminated
Aborted - Pooch was used to kill this job and it is no longer running











Node Info Window



Selecting one node in the Job window or the Network Scan window allows you to access that node’s information. Much of the information on the left is also available in the columns of the Network Scan Window, with a few additions:

Item - Function
DNS Name - the IP name of this node according to the DNS network, if available
Total Memory - the total memory available, including virtual memory (Updated 1.7)

The right side shows a list of running processes each with an approximate distribution of processor load. The determination of this information is OS specific. On OS 9, Pooch derives this information from sampling the Process Manager every ten seconds. On OS X, Pooch calls the Unix “ps” command, whose results can change at any moment and can be adversely influenced by the call to “ps”.
After selecting a process on this list, clicking Kill Process will direct the Pooch on that node, if it is running OS 9, to ask that process to quit. Applications that a user is presently using can respond to this request, and some applications ignore this request. If that node is running OS X, Pooch will use a “kill -9” command instead. However, in accordance with the Unix underpinnings of OS X, Pooch can only kill processes that Pooch has permission to kill.
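
The OS X behavior described above (a “kill -9” that succeeds only when permitted) can be sketched in Python as follows. SIGKILL is the signal that “kill -9” sends. The function name and the choice to report failure as False are assumptions, and this sketch applies to Unix-like systems only.

```python
import os
import signal


def kill_process(pid):
    """Send SIGKILL to `pid`; return False if not permitted or no such process."""
    try:
        os.kill(pid, signal.SIGKILL)
        return True
    except PermissionError:
        # The caller lacks permission to signal this process.
        return False
    except ProcessLookupError:
        # The process has already exited.
        return False
```

Unlike the polite quit request used on OS 9, a process cannot ignore SIGKILL, so it gets no chance to save its state.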
Clicking on the pop-up menu will allow you to switch between Stats & Apps and Job Queue. In the Job Queue mode, this window will display parallel jobs in the queue on this node. The columns list the number of nodes and files selected in this job and the launch time of this job. If no launch time was selected, or if the launch time has passed, it will display Normal. Selecting one will allow you to kill that job using the Kill Job button.

Get Files Window

Selecting a node in the Network Scan Window or the Job Window and selecting Get Files from the Node menu will have your Pooch contact the Pooch on that node and list the files there. It displays information about the files in a window much like a Finder window. You can drag files out of it to the Finder, you can double-click on a folder to access its contents, and you can Command-click on the window title to pop up a menu to back out a directory. As a security feature, you can only access the folder in which the remote Pooch resides and any of its subdirectories (folder aliases will not resolve), and you can only get files out, not drag files in. This access is completely independent of Mac OS File Sharing.
Node Registration Schedule Dialog




This dialog allows you to select the times at which Pooch registers and deregisters itself on the network, which determines this node’s visibility to other Pooches. As seen in the above screen shot, you may set the schedule so that, on weekdays, this node becomes visible on the network at 5:15 pm, after you leave for home, and is no longer available for computation at 9 am, when you arrive at work the next morning. The Schedule Type pop-up menu can be set to Everyday, where the daily schedule will be the same every day; Weekdays & Weekends, where the schedule for weekends (Saturday and Sunday) is distinct from the schedule on Monday through Friday; or Weekly, where the schedule on each day of the week can be set separately. The Day pop-up menu selects which day you are editing; its choices depend on the Schedule Type setting.
For each day, the large circular dial is used to specify what portion of the day Pooch will register itself. The circle represents a 24-hour clock with midnight at the top and noon at the bottom. The highlighted sector of the circle indicates the portion of the day registration will occur. Dragging the mouse inside the circle selects times rounded to the nearest 15 minutes. Dragging outside the circle, or using the text fields, allows you to select any minute of the day.
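The dial’s behavior can be illustrated with a short sketch. The Python below is not Pooch’s code. It assumes the dial runs clockwise (the manual states only that midnight is at the top and noon at the bottom), and the function names are hypothetical. It shows how a dial position maps to a time rounded to 15 minutes, and how a registration period that wraps past midnight, such as 5:15 pm to 9 am, can be tested.

```python
def dial_angle_to_minutes(angle_degrees, snap=15):
    """Convert a dial angle to minutes after midnight.

    Assumption: 0 degrees is the top of the dial (midnight) and angles
    increase clockwise, so 180 degrees is noon. `snap` models dragging
    inside the circle, which rounds to the nearest 15 minutes.
    """
    minutes = (angle_degrees % 360) / 360 * 24 * 60
    snapped = int(minutes / snap + 0.5) * snap
    return snapped % (24 * 60)


def is_registered(now, start, end):
    """Return True if minute-of-day `now` lies in the registration period.

    The period may wrap past midnight (start later in the day than end).
    """
    if start <= end:
        return start <= now < end
    return now >= start or now < end


def fmt(minutes):
    """Format minutes after midnight as HH:MM on a 24-hour clock."""
    return f"{minutes // 60:02d}:{minutes % 60:02d}"


start = dial_angle_to_minutes(258.75)   # 5:15 pm on the assumed dial
end = dial_angle_to_minutes(135)        # 9 am on the assumed dial
print(fmt(start), fmt(end), is_registered(22 * 60, start, end))
```

Passing snap=1 to dial_angle_to_minutes models dragging outside the circle, where any minute of the day can be selected.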







V. Pooch Pro

Pooch was created to make clustering as easy to use as possible. Generally there are two ways ease-of-use can be applied: 1. minimize the effort to perform an otherwise technically challenging feat; or 2. apply the same amount of effort to achieve much more than could otherwise be done. With the Pooch Application, we have done our best, in the context of cluster computing, to pursue the former while making progress with the latter. It is clear that many users need a solution that focuses on this first goal.
However, we recognize that we can do still more to address the latter. To do so, we require a version of Pooch that enables larger clusters to be managed for use by a larger number of users. Administrators are often responsible for managing much larger quantities of computational resources than single users usually do. They require more detailed control over how the hardware is used and shared amongst users. Administrators would like to accomplish as much as they can with the effort they employ.
Pooch Pro is designed to support that latter form of cluster computing. This “Pro” version of Pooch supplies the infrastructure of users, groups, and administrative accounts to manage the usage of cluster resources. These capabilities imply mechanisms to identify parallel computing jobs with users and track their time spent against usage policies decided by the cluster administrator as well as an account hierarchy between users and administrators.
But we should be clear that the introduction of Pooch Pro by no means is the end of Pooch. We recognize that a spectrum of cluster solutions exists, from an individual running a few nodes to many users sharing a university lab as a cluster. To better satisfy these cluster categories, Pooch has bifurcated into two incarnations: 1. Pooch continues to make it easy for a user, not necessarily skilled in computing, to keep their cluster up and running so they can maximize their time applying, rather than maintaining, the technology; while 2. Pooch Pro allows one to administer a cluster for a large number of users, all via a user interface that is easy to use for administrators and users alike. Both will evolve and improve to address their users’ needs.
When Pooch Pro is first installed, it creates an initial user database file with two groups and two user accounts, user and admin. The password for both accounts is set to a pseudorandomly generated string supplied with your Pooch CD. (For the Pooch Pro downloadable from the web site, the admin password is “pooch”.) When you set up your accounts, we recommend first logging into your admin account and either setting a new password or creating a new administrator-level account for yourself and deleting the old admin account. Then, after configuring the user accounts, use Update User Database to distribute your changes to all other cluster nodes. If you need to back up the encrypted user database file, it is at /Users/Shared/pooch/poochuserdata.

Pooch Pro Menu

User menu





Command-line Access

When using the macmpi.tar.gz package with Pooch Pro, we recommend that you create users on the system whose user names match the user names registered in Pooch Pro’s user database and use the -m flag when launching jobs. This makes it possible to match command-line based logins (e.g., ssh) with jobs submitted by users known to Pooch Pro. Without those consistent user names, Pooch Pro will not recognize jobs submitted by users it does not know and will therefore kill those jobs because they appear to be foreign.

Pooch Pro Windows

User Administration Window

Pooch Pro’s User Administration Window provides access to the user and group account database. Usually only administrators would have access to this window. This window features two different views, toggled by the view buttons in its upper left corner. A dot in the close box indicates that user or group data has recently been changed.


The User View displays all user accounts in the cluster, including statistical data about these users. This view is used to administer, create, edit, and delete user accounts.


The Group View lists all groups of user accounts on the cluster and the users they contain in a hierarchical format. This view is used to administer groups and organize which users they include.



User View


The User View of the User Administration Window is where user accounts can be created, edited, and deleted by selecting the account and clicking on the buttons at the bottom of the window.

Group View


The Group View of the User Administration Window is where groups can be created, edited, and deleted. Users can be dragged between groups here.

User Administration Columns

The User Administration Window displays information about users and groups in a spreadsheet-like format. This window is invoked using the columns button at the top of the User Administration Window. You can drag a column by its header and resize it by clicking on its right edge. Clicking on a column header re-sorts the list according to that column.

Column Function
Name the full name of a user or group
Short Name the short name of a user (a.k.a. user name) or group. The ID number of that group or user is shown in parentheses.
Group the short name of the user’s group, with the group ID in parentheses
Compute Time Quota the compute time allocated to a user. The quota is specified as a particular amount of time, summed over all nodes used by that user, allowed within a certain wall-clock time. For example, if a user is allocated five hours per day, then Pooch will allow that user’s jobs, in aggregate, to use up to five nodes one hour each, or one node for five hours, every day.
Rollover Minutes Expiration the expiration limit of this user’s rollover minutes. If a user does not use up their quota on a previous quota interval, the user may carry those minutes over to the next interval. However, those minutes will expire after the time specified here. Using the above example of a user quota set to five hours per day, if this rollover expiration is set to two days and the user did not use any time yesterday, then that user could use up to ten hours on the cluster today.
Maximum Job Duration the wall-clock time limit of this user’s jobs. Jobs submitted by the user will be limited to running for the amount of time specified here.
Total Time Remaining the sum of the compute time available to the user if that user were to start a job now. Pooch keeps track of the time used by jobs that users submit and compares that time against the quota and rollover minutes policy settings. The time left is displayed in this column.
Total Time Used the sum of the compute time used by the user within the quota and rollover minute time intervals
Quota Time Remaining the sum of the compute time available to the user according to the compute time quota rule only
Quota Time Used the sum of the compute time used by the user within the quota time intervals
Rollover Time Remaining the sum of the compute time available to the user due to rollover minutes only
Rollover Time Used the sum of the compute time used by the user outside the quota time interval but within the rollover minute expiration
User Flags the flags of a user or group, specifying user policy features, shown graphically. Their meanings are shown in the chart below.
User Flag Icon Meaning
this user or users of this group are allowed to administrate on the cluster
this user or users of this group have a compute time quota limiting their allocation on the cluster
this user or users of this group have been allowed rollover minutes
this user or users of this group have been allowed to migrate their access to the cluster from node to node
this user or users of this group have been allowed to change their own password
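
The quota and rollover arithmetic described in the columns above can be worked through in a short Python sketch. This is an illustration only, not Pooch Pro’s accounting code: the function names are hypothetical, and the rule for which earlier intervals still count is an assumption chosen to reproduce the manual’s example (five hours per day, two-day rollover expiration, nothing used yesterday, ten hours available today).

```python
def node_minutes(jobs):
    """Sum compute time over all nodes; each job is (nodes, minutes per node)."""
    return sum(nodes * minutes for nodes, minutes in jobs)


def time_remaining(quota, used_now, past_usage, rollover_intervals):
    """Return (quota_remaining, rollover_remaining, total_remaining) in minutes.

    quota              -- node-minutes allowed per quota interval
    used_now           -- node-minutes already used in the current interval
    past_usage         -- node-minutes used in earlier intervals, most recent first
    rollover_intervals -- rollover expiration, measured in quota intervals

    Assumption: an expiration of N intervals lets unused minutes from the
    previous N - 1 intervals carry over. Usage beyond the current quota is
    charged against the rollover minutes.
    """
    carried = past_usage[:max(rollover_intervals - 1, 0)]
    rollover = sum(max(quota - used, 0) for used in carried)
    overdraft = max(used_now - quota, 0)
    quota_remaining = max(quota - used_now, 0)
    rollover_remaining = max(rollover - overdraft, 0)
    return quota_remaining, rollover_remaining, quota_remaining + rollover_remaining


# Five nodes for one hour each costs the same as one node for five hours.
print(node_minutes([(5, 60)]), node_minutes([(1, 300)]))

# Five hours per day, two-day expiration, nothing used yesterday or today.
print(time_remaining(300, 0, [0, 0], 2))
```

The three values returned correspond to the Quota Time Remaining, Rollover Time Remaining, and Total Time Remaining columns, expressed in minutes.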

Edit User Sheet

Editing a user via double-clicking or the Edit button invokes the Edit User Sheet.


The sliders set the compute time policy for this particular user. A user’s total compute quota can be limited to a certain amount of wall-clock node time per time interval, which can be a day, a week, a month, or a year. This time is the sum of all the time this user keeps any node or nodes busy, as tracked by Pooch. This calculation is compared against the quota specified in this dialog and the quota of this user’s group. In addition, the user may be allowed minutes to be “rolled over” from previous quota intervals. This feature allows users to combine previously unused minutes with their current quota for their compute time. Users’ individual jobs can also be limited to running for a specified amount of time using the Limit Job Duration slider. When creating a new user, a New User Sheet appears that is very similar to the Edit User Sheet. Unless one is entered, the default password given to new users is “pooch”, all lowercase. Once the short name and user ID are set, they cannot be changed without deleting the user.

Edit Group Sheet

Editing a group via double-clicking or the Edit button while in the Group View invokes the Edit Group Sheet.


As with an individual user, the sliders set compute time policy, but instead of applying to just a particular user, the policy is set for all the users in this group at once. A user’s compute time is limited to either that user’s particular quota or the group’s, whichever is more limiting. This hierarchy applies individually to the rollover minutes and job duration limits as well. When creating a new group, a New Group Sheet appears that is very similar to the Edit Group Sheet. Once the short name and group ID are set, they cannot be changed without deleting the group.
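
The “whichever is more limiting” rule can be sketched in a few lines of Python. The names and the use of None for “no limit set” are assumptions of this sketch, not part of Pooch Pro; the point is only that each setting is compared separately between the user and the group.

```python
def effective_limit(user_limit, group_limit):
    """Return the more restrictive of two limits, in minutes.

    None stands for "no limit set" (an assumption of this sketch).
    """
    limits = [limit for limit in (user_limit, group_limit) if limit is not None]
    return min(limits) if limits else None


def effective_policy(user, group):
    """Apply the rule separately to quota, rollover, and job duration."""
    keys = ("quota", "rollover", "job_duration")
    return {key: effective_limit(user.get(key), group.get(key)) for key in keys}


user = {"quota": 300, "job_duration": None}
group = {"quota": 600, "rollover": 2880, "job_duration": 120}
print(effective_policy(user, group))
```

Here the user’s own quota of 300 minutes is tighter than the group’s, while the rollover and job duration limits come from the group because the user has none set.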


AppleScript

Pooch Pro supports three AppleScript types regarding user functions. New in 1.6.5.

user noun -- a user of the cluster
group noun -- a user group of the cluster

The user record is data about the user, including settings specific to the user as well as calculated quota and rollover CPU time used. Similarly, a group record is data about a group. The units of the time-valued properties, such as quota or rollover time, are in minutes. Via an administrator account, these records can be accessed using the usual Standard Suite in AppleScript vocabulary. For example, to get a complete list of users, use the following command:
get every user
To create a new user, use make:
make new user with properties {name:"Test User", user name:"testuser", has quota:yes, quota:10000, group ID:301}
These capabilities are useful to set up and create a new user programmatically via AppleScript or a Unix script. If not all properties are provided, as in the make example, Pooch Pro will fill in the remaining fields with default values. Properties that return calculated time used and time remaining are, of course, ignored if fed into Pooch Pro.

login noun -- a user’s login session to the cluster

The login record is about the login state of Pooch. Data about the currently running user can be returned via the get command, while delete can log out the user. To log in from AppleScript, use make:
make new login with properties {user:{user name:"testuser"}, password:"pooch"}







VI. AppleScript

Pooch has the ability to respond to commands from a script written in AppleScript. This feature makes possible customized launch configurations and queuing systems. For example, simply by dragging an application to it, a script could launch that application automatically onto the four “best” nodes found on the network. Also, it is possible to write and launch scripts from OS X’s Unix command line; thus, launching jobs through Pooch from OS X’s Unix shell is possible. In addition, because the underlying mechanism of AppleScript is AppleEvents, the inter-application messaging system on the Mac OS, it is possible for Mac applications, including mainstream ones, to ask Pooch to query the network and launch parallel computing jobs.
You may look at the Pooch dictionary by selecting Open Dictionary… from the File menu of Script Editor (an application commonly supplied with the Mac OS for editing AppleScript scripts) and selecting Pooch App in that dialog. We also present that vocabulary here. Pooch fits its AppleScript vocabulary within the official AppleScript object-oriented approach, accompanied by only a few Pooch-specific commands and data structures.

Standard Suite

The AppleScript Standard Suite was defined by Apple Computer, Inc., to create a standard vocabulary that all AppleScript-aware applications would understand and respond to in a consistent manner. This vocabulary is meant to encompass as many operations as possible that a user could do with an application, while allowing only a minimum of application-specific operations to fall outside this suite. The goal is that, once an AppleScript user learns this suite for one application, that user may apply that knowledge to all other applications with the greatest possible efficiency.

make
new
type class -- the class of the new element. Keyword 'new' is optional in AppleScript
[at reference] -- the location at which to insert the element
[with data anything] -- the initial data for the element, such as a list of alias records for the file list of a job
[with properties record] -- the initial values with properties of these items, needed for nodes in a node list of a job
[with last type class] -- use the data from the last submitted job for the new job
Result: reference -- to the new object(s)


make is probably the second most common command you will use with Pooch. You may use it to create a new job.
set myjob to make new job
This command directs Pooch to open a new Job Window with only the node this Pooch is running on as node zero. It then sets the variable myjob to be a reference to the new job just created.
By default, make new job creates a new job with an empty file list and a node list containing the node this Pooch is running on. “make new job with last job” causes Pooch to derive the new job from the last job this Pooch launched according to the settings in the New Job Automatically Reloads submenu.
To add an application or files to the job, use the make new file form:
make new file at myjob with data alias "Power Fractal" of the startup disk
This command adds the Power Fractal application, which resides on the root directory of the startup disk, to the file list of myjob. The word alias typecasts the reference into an alias. Script Editor often checks to make sure the file exists before allowing you to compile the script. Adding files is similar, and this mechanism also accepts lists of files.
To add a node explicitly to the job, use the make new node form:
make new node at myjob with properties node {name:"Mac number 2", address:"192.168.1.2"}
The information between the braces is interpreted as a record containing the name and address of the node. This record is typecast as type node, then added to the node list of myjob. You may also construct a list of nodes,
set mynodelist to {{name:"computer01", address:"192.168.0.1"}, {name:"computer02", address:"192.168.0.2"}}
and add that list to the job.
make new node at myjob with properties mynodelist
In addition, you may add the result of a node scan directly to the node list.
make new node at myjob with properties (node scan)
The node scan command is described in the Pooch Suite section below.

The remaining AppleScript commands of the Standard Suite that Pooch recognizes are:
close reference -- the object to close
count reference -- the object whose elements are to be counted
delete reference -- the element to delete
exists reference -- the object in question
get reference -- the object whose data is to be returned
open reference -- list of objects to open
set reference -- the object to change
to anything -- the new value

These commands are general purpose because they can refer to almost any object, within reason, within the Pooch application. Some of them are used simply, e.g., close job window closes the job window, or close every window closes all windows. count the node list of myjob returns an integer giving the number of nodes in myjob. Using the open command is the equivalent of dragging the item onto the Pooch icon, so its function overlaps with make new file. Commands that do not make logical sense (such as close BUSY check of myjob) either return an error or do nothing.
The objects that these commands act upon can vary widely. Individual nodes of a node list or files of a file list can be referred to by number, such as node 2 or file 4 or item 5. Note that, in AppleScript, all counting is 1-based, so the nodes of an n-node node list are node 1 through node n. The node list or file list of a job can also be specified. For example, exists file 4 of the file list of myjob tests if that file of the job exists, while delete the node list of myjob clears the node list. In the former example, file 4 of the file list of myjob may also be simplified to file 4 of myjob because the reference is clear; however item 4 of myjob is not specific enough.
The set command is one of the most versatile members in this suite and will probably be the most common command you use. Once you create a job and set the variable j to be a reference to that job:
set j to make new job
you can set entire file or node lists:
set the file list of j to {alias "Enterprise:Pooch folder:pinput2d"}
set the node list of j to {{name:"computer01", address:"192.168.0.1"}, {name:"computer02", address:"192.168.0.2"}}
set a particular file of a file list to another member of the same list:
set file 2 of j to file 1 of j
or set particular options of the current job:
set BUSY check of j to no
set launch failure queues job of j to yes
set target subfolder of j to "run001 folder"
set start time of j to date "Wednesday, October 31, 2001 12:00:00 AM"
set tasks per computer of j to use processor count
Setting the target subfolder option to "" and start time to current date or an earlier date turns off those job options. All the substructures that make up a job are listed in the job class listing in the Pooch Suite section of Pooch’s dictionary.
In addition, Pooch’s Application Class has two properties beyond those of the normal suite. You can query the launching a parallel job property to find out whether or not Pooch is currently launching a parallel job onto nodes. This information is useful for scripts or parallel applications to coordinate their activities with Pooch’s launch process. Also, you may find out whether or not this Pooch is registered by accessing the node registration property.

Pooch Suite

The commands particular to Pooch are in the Pooch Suite. Expanded in 1.6.
node scan reference
[for best integer] -- Search for a given number of 'best' nodes
[for [best] processors integer] -- Search for a given number of processors using the 'best' nodes
[starting at string] -- scan starting at a remote node specified by an IP address in a character string
[BUSY boolean] -- Include the nodes labeled BUSY
[all boolean] -- Include all nodes found
[new scan boolean] -- Explicitly perform a new node scan
Result: node -- list of nodes

job scan reference
[starting at string] -- scan starting at a remote node specified by an IP address in a character string
[all boolean] -- Include all jobs found
[new scan boolean] -- Explicitly perform a new node scan
[lists boolean] -- Explicitly include node and file lists in job records
[recalling integer] -- Recalls job records back this far in time (e.g., days, weeks, or months)
Result: job -- list of jobs

network scan reference
[for best integer] -- Search for a given number of 'best' nodes
[for [best] processors integer] -- Search for a given number of processors using the 'best' nodes
[starting at string] -- scan starting at a remote node specified by an IP address in a character string
[BUSY boolean] -- Include the nodes labeled BUSY
[all boolean] -- Include all nodes found
[new scan boolean] -- Explicitly perform a new node scan
[nodes boolean] -- return found nodes
[jobs boolean] -- return found jobs
[lists boolean] -- Explicitly include node and file lists in job records
[hidden boolean] -- hide the network scan to reduce its visual impact
Result: node or job -- list of nodes or jobs

network scan directs Pooch to open the Network Scan Window and begin scanning the network. By default it returns node records, but the with jobs switch will have it return job records instead. job scan is equivalent to network scan with jobs, and node scan is equivalent to network scan with nodes. Pooch will wait about eight seconds before returning the results of its search. The command will only return nodes which, in that time, successfully reported their status as Okay after being discovered. Note that the results of this search may or may not overlap with the nodes already present in a job you are constructing.
When the returned object is a list of nodes, each item of this list is a record containing information about the node’s name, IP address, load, free memory, OS version, achieved performance, rating, and so on. These items generally correspond to the node scan columns described in detail in Chapter IV. A complete list of the members of these records is given in the node class section of the Pooch Suite in the Pooch dictionary. Much of the information about a node can change at any time, so it is not recommended to cache data about other nodes. Likewise, job data can change and evolve over time, as queued jobs are launched and running jobs end.
Passing an IP address (in a string format, such as "192.168.1.7" or "farawaymac.otherdomain.edu") in the starting at option directs Pooch to perform a Remote Node Scan using a Pooch at that address. The success of this operation depends on whether or not a Pooch is actually running on a Mac at that address. Assuming that Pooch successfully relays information about other nodes, your Pooch will request the status of those nodes and return the results to your script as before.
Passing an integer n in the for best option of this command directs Pooch to return only the best n nodes of the search for nodes. Nodes that are busy or cannot be contacted are never included in this list. It is possible for this command to return fewer than n nodes. Which nodes are the “best” nodes is determined by Pooch’s internal rating calculation, which depends upon data about each node that change from moment to moment. To determine the “best” nodes according to your own criteria, you can construct a script that sorts the results of node scan using a rating function of your own design.
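The idea of ranking scan results with your own rating function can be sketched as follows. The example is written in Python rather than AppleScript so that it can be run without Pooch. The record fields (free_memory_mb, load, status) and the rating formula are hypothetical stand-ins for whatever properties of the node records you choose to use; Pooch’s internal rating is not reproduced here.

```python
def my_rating(node):
    """Hypothetical rating: favor free memory, penalize current load."""
    return node["free_memory_mb"] * (1.0 - node["load"])


def best_nodes(nodes, n, rating=my_rating):
    """Return up to n nodes whose status is Okay, highest rating first."""
    available = [node for node in nodes if node["status"] == "Okay"]
    return sorted(available, key=rating, reverse=True)[:n]


# Made-up scan results for illustration only.
scan = [
    {"name": "computer01", "status": "Okay", "load": 0.10, "free_memory_mb": 512},
    {"name": "computer02", "status": "BUSY", "load": 0.90, "free_memory_mb": 2048},
    {"name": "computer03", "status": "Okay", "load": 0.25, "free_memory_mb": 1024},
    {"name": "computer04", "status": "Okay", "load": 0.00, "free_memory_mb": 256},
]

for node in best_nodes(scan, 2):
    print(node["name"])
```

As with the for best option, asking for more nodes than are available simply returns a shorter list.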
To specify a number of processors to use, use the for processors option of this command. Pooch uses the same criteria as the for best option, except it limits the nodes it returns so that the sum of the number of processors on the return list most closely matches the number of processors specified here. This is useful if you are using the Use Processor Count setting of the Tasks per Computer job option and want to launch only a limited number of tasks.
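One way to picture “most closely matches” is a greedy pass over the nodes in best-first order. The Python sketch below is an assumption about how such a selection could work, not a description of Pooch’s actual rule; the function name and the processors field are hypothetical.

```python
def nodes_for_processors(ranked_nodes, target):
    """Pick nodes in ranked order so the processor total is close to target.

    Assumption: stop as soon as adding the next node would move the total
    further from the target. On a tie, the node is included, so the total
    may exceed the target by a small amount.
    """
    chosen, total = [], 0
    for node in ranked_nodes:
        new_total = total + node["processors"]
        if abs(new_total - target) > abs(total - target):
            break
        chosen.append(node)
        total = new_total
    return chosen, total


# Made-up node records, already sorted best first.
ranked = [
    {"name": "computer01", "processors": 2},
    {"name": "computer02", "processors": 2},
    {"name": "computer03", "processors": 1},
    {"name": "computer04", "processors": 4},
]

chosen, total = nodes_for_processors(ranked, 4)
print([node["name"] for node in chosen], total)
```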
The output of this routine can be placed directly into a job’s node list:
make new node at j with properties node scan for best 4
but you may instead wish to edit the search results yourself before passing them to your job:
set nodesearch to node scan for best 4 starting at "farawaymac.otherdomain.com"
-- edit the list in nodesearch here
make new node at j with properties nodesearch

Be creative!

launch job -- job to launch: required

The launch command directs Pooch to submit a job that you pass by reference. For example,
launch myjob
would launch a job that you originally made using set myjob to make new job.

bark

This command makes Pooch bark. Try it and listen for yourself.


Additional Reading

Documentation about AppleScript in general is available in print and on the web.

Derrick Schneider (with Hans Hansen & Tim Holmes), The Tao of AppleScript, 2nd edition, (BMUG and Hayden Books, Indianapolis, Indiana, USA, 1994).

Apple’s official AppleScript web site http://applescript.apple.com/

MacScripter - an independent web site with links, information, and resources, including numerous examples, related to scripting on the Mac http://www.macscripter.net/

Apple’s AppleScript mailing list http://www.lists.apple.com/applescript-users







VII. Security

Pooch is its own lock and key. You should keep track of your Pooch like you keep track of your keys.
Before Pooch will accept commands from another Pooch, it must receive a passcode that matches its own. Then, all subsequent commands use a 512-bit encryption key that rotates for each message in a pseudo-random manner. Only those two Pooches can predict the next encryption and decryption keys. If a mistake in the passcode or commands is made at any time, Pooch will reject the connection. Since Pooch waits a second or two before it accepts another connection, an exhaustive search for the correct encryption keys (trying 2^512 possibilities at one per second would take over 10^145 years) is extraordinarily unlikely to succeed.
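The size of that number is easy to check. The following Python sketch simply repeats the arithmetic in the parenthetical above, assuming one attempt per second and a year of 365.25 days; the function name is hypothetical.

```python
SECONDS_PER_YEAR = int(365.25 * 24 * 60 * 60)  # 31,557,600 seconds


def brute_force_years(key_bits, tries_per_second=1):
    """Years needed to try every key of the given length, one after another."""
    possibilities = 2 ** key_bits
    return possibilities // (tries_per_second * SECONDS_PER_YEAR)


years = brute_force_years(512)
# Report the order of magnitude from the number of decimal digits.
print(f"on the order of 10^{len(str(years)) - 1} years")
```

The result exceeds 10^145 years, consistent with the figure quoted above. Even at a billion attempts per second, the time shrinks by only nine orders of magnitude.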
The first passcode and the start of the rotating key are 512-bit pseudo-random numbers derived from the registration name of that Pooch (which is set at compile time). Therefore, only Pooches of the same registration will be able to communicate with one another. Because the registration name is unique for each Pooch customer, a copy of Pooch registered to, say, MIT, will not be able to communicate with a Pooch registered to UC Berkeley. (For cross-registered Pooches or other custom configurations or encryption methods, please e-mail Dauger Research.)
Security for your cluster then becomes dependent on the security of your Pooch registered with your registration name. Your Pooch can be installed on the Macs of your cluster, and, if no additional copies of Pooch exist, no one can get in. But if you make a copy of that Pooch and bring it home to access the cluster, the security of that cluster depends on how securely you keep that extra copy of Pooch.
The nature of this security is analogous to having the ability to copy a key to a locked office. It is not uncommon to entrust a group of people with the keys to a shared resource, such as office equipment. The security of the equipment is shared by those who have copies of the key. These people understand the responsibility that comes with the privilege for that access. Access to Pooch can be shared in a similar way.
In testing by Dauger Research, Pooch’s ability to break through firewalls is no greater than that of a typical web browser.
If someone loses a key to an office, it is not uncommon for the office locks to be rekeyed and a new set of keys distributed to the users of that office. Likewise, if you feel the security of your Pooch has been compromised, it is possible to obtain a “rekeyed” Pooch. The Pooch code allows pseudo-random variants of the 512-bit passcode based on the same registration name. The rekeyed Pooch will not communicate with the previous Pooch of the same registration. An active Pooch subscription with Dauger Research will enable you to obtain a limited number of rekeyed Pooches.
If you are using the downloadable demonstration version of Pooch, you should be aware that the same version can be downloaded by anyone else on the web. So, if they have the IP address of your Mac, they could access your Mac, to the extent that Pooch allows, over the Internet. Although guessing your Mac’s IP address is unlikely, a uniquely registered Pooch makes for much better security.








VIII. Frequently Asked Questions

Does Pooch work on machines running an OS before OS 9?
We’re sorry, but no. Pooch uses many of the latest APIs implemented by Apple. The Data Browser interface was sufficiently implemented only in CarbonLib 1.2, and the NSLM v1.1 implementation, used by Pooch to register and search for nodes, requires OS 9. Pooch’s predecessor, the Launch Den Mother and Launch Puppy, runs on OS 8.x and is available from daugerresearch.com, but is officially not supported.

Does Pooch work on OS X?
Yes. Pooch is a Carbon app, so it runs on both OS X and OS 9. We have even run parallel apps on a mixed cluster where some Macs were running OS 9, some OS X 10.1, and some OS X 10.2 through 10.3. Due to bugs in earlier OS X versions, we highly recommend using OS X 10.2.1 or later.

Can I use Pooch to run a node while it is logged out?
Yes. This feature is new as of version 1.3.5. You may either use the standard Pooch Installer (while holding down the option key) or the Pooch command-line installer, poochclinstaller.tar.gz. Due to security issues in OS X, a restart may be required to fully enable this feature. Jobs launched onto logged out nodes will run behind the login window. Because of OS X’s design, that is the only way the executable can reliably run in that environment.

How do I disable Pooch’s run-at-logout feature?
You may do so by deselecting the Enable Pooch at Logout… preference. Due to security features in OS X, administrative authorization and a restart may be required to fully disable this feature.

How do I uninstall Pooch completely?
You may remove Pooch and all its components by allowing Pooch to run normally, then holding down the Option key after launching the Pooch Installer. In the Pooch Installer, a dialog should appear that allows you to select either to upgrade or uninstall Pooch. Clicking on uninstall will delete the running Pooch and the components that allow it to run at logout. The latter process may require administrative authorization.

How do I obtain a Pooch subscription, and what do I get?
With your ordered copy of Pooch, you get a one-year subscription, which can be renewed annually at 25% of the purchase price of a new copy at the time of renewal. As a subscriber, you will receive free updates to the software (approximately twice per year) and technical support via e-mail. The subscription also entitles you to up to six free rekeyed Pooches per year, should you feel your security has been compromised. If you would like to upgrade your Pooch, please contact Dauger Research.

When I get an update, do I have to reinstall it on every Mac myself?
On the CD you receive, Dauger Research, Inc., will supply an updater program we call a Pooch Package. You can use the usual Pooch launching mechanism to deliver this Package to all your nodes, which will then unpack and update Pooch to the new one on all your nodes automatically and simultaneously. No reboot necessary.

What is different about Pooch compared to its predecessor, the Launch Den Mother and Launch Puppy?
LDM & LP were a pair of Macintosh applications that assisted in starting a parallel application. I created their earliest incarnations in 1998. Neither was designed to run all the time in the background. The Launch Den Mother discovered the existence and addresses of nodes using AppleTalk calls that looked for a side-effect of the Program Linking toggle in File Sharing. It then sent an AppleEvent (of a type allowed only when Guest Access was on) to the Finder on that Mac to launch its Launch Puppy. Correct execution of this AppleEvent required that the LP be in a very particular location on a particular hard drive. Once LP was up, LDM communicated with it via the Program-to-Program Communications Toolbox (PPC Toolbox), introduced in 1990 to support AppleEvents and AppleScript, written to use AppleTalk only, and toggled via the aforementioned Program Linking button. Finally, LDM’s user interface consisted of one large CustomGetFile dialog box. LDM & LP contained an innovative combination of technologies. There was nothing else out there that could do what they could do, and they came to be used in scores of Mac clusters worldwide.

Then, Apple announced the future of the Mac OS to be OS X. Once the specifics about X’s application support became clear (early 2000), it was evident that LDM & LP were in trouble. CustomGetFile was not supported in Carbon, so its associated 3000 lines of code would be useless. Although AppleEvents would be available in another form on X, Guest Access was permanently disabled (although there were ways to use password access). The calls LDM used to discover nodes (e.g., PLookupName) did not exist in Carbon, so another method had to be found. PPC Toolbox was completely unsupported in X. (Just try to find a Program Linking button in native X.) Apple itself was discouraging the use of AppleTalk in favor of TCP/IP. It was clear that a port of LDM & LP to X was completely impractical: almost everything interesting about LDM relied on APIs Apple listed as “Unsupported”.

So, rather than patch LDM & LP together piece by piece for X, I decided to abandon those 13000 lines of LDM & LP source code and start something new. Numerous convenience features intrinsic to AppleTalk did not exist in TCP/IP, so a completely fresh approach was necessary. Although the ideas had been rolling around in my head since early 2000, only the CarbonLib versions available in late 2000 had enough functionality to provide what I needed. Then, in early 2001 (simultaneous with finishing my doctoral dissertation), I developed Pooch. Pooch:


Using these APIs and experience with LDM & LP resulted in fundamental design requirements in Pooch different from LDM. For example, on OS 9, SLP requires a registration by a running application, forcing Pooch to run all the time. But, I figured I would take advantage of these requirements rather than fight them. These fundamental design differences gave the opportunity to do things that LDM & LP could never do:


And then there were other features added to enhance the user experience, such as extending the drag-and-drop metaphor from the icon to the Job Window, becoming a repository for files and nodes and allowing the entire Finder to become a file dialog.

But the primary goal was the same: Make operating a powerful parallel computer as simple and as easy to use as possible. Easy enough for anyone to use.





IX. Revision History

v1.7: May 15, 2006
Significant new features include:


Pooch changes in detail: Universal Binary release. Implemented and added user interface for remote job access mechanism. Implemented Job file type support in user interface and Job window, including save sheet reminders and standard save sheet dialogs. Implemented Head Node setting, influencing default behavior in the Network Scan window, the Job forwarding option, via Proxy setting, automatic node acquisition, retrieve output job option, and queuing job option. Implemented more ways to acquire file icons in file pane of Job window. Added support libraries for complete reading and writing of folders and files. Modified Node Access library to locate and retrieve jobs and files on remote nodes. Implemented node access retry in Node Access library to attempt primary address of remote node. Added recognition for .app extension like that of .out. Set job name just prior to job submission. Implemented job window reinvoke after job submission preference. Eased access to port number of job in Job window. Updated use of Open Transport API for endian changes. Increased proxy time based on data transmission sizes. Log list of IP addresses on startup. Added retry for large send sizes from background mechanism. Added reporting of user name data in AppleEvent job structures. Added access delay to launch state in AppleEvent interface when job just launched. Added AppleEvent dictionary entry and implementation for TCP port of job data. Allowed for initial file name to be specified in Navigation dialogs. Fixed queuing system to launch properly if only one job queued. Added logging for launch errors, including identification of problem nodes, and reporting. Fixed endian conversion for multi-virtual node launch. Implemented additional timeout detection in launch mechanism. Implemented optimized secure encryption methods. Implemented secure connections at beginning of launch mechanism, including logging. Fixed allocation bug on post-launch reporting to origin node. 
Updated Preferences window with new features and allowed for its resizing. Fixed tunnel address string detection in Network Scan library. Updated Status window with antialiased, positioned fonts. Fixed possible Navigation data crash when file dialogs are closed. Fixed possible crash when refreshing in Network view of Network Scan window. Restricted Kill Job button if accessing by proxy, while triggering proxy node if Kill Job is invoked. Edited Status icon to reflect new node states and consistent display of status text. Implemented support to detect and display greater than 2 GB of RAM using sysctl. Implemented workarounds for resource handle bug in recent jobs list, preferences saving, and process watch list. Added support for Nodes per Box specific to a node in job structure and contextual menu access in node pane of Job window. Added support calculations for virtual node specification. Added support for long job names based on long file names. Added support for job identification for retrieval. Fixed endian conversion for grid and node acquire structure types. Fixed peak Vector float flops endian conversion. Fixed endian conversion backwards compatibility with old Job properties structure types as well as new extensions. Added file info parameter record endian conversions. Handled endian PinRect issues in Network Scan Columns and User Administration windows. Extended search support for possible quad- and octuple-processors. Implemented launch of Console.app to display poochlog.txt. Reorganized File menu with new items. Modified Paste behavior to the default in dialogs, the address field in the Network Scan window, and submit to the Job window. Optimized backgrounding timings. Node Info window displays node name in title bar, if available. Allowed Node Info data to display without process data. Updated Total Memory field to include virtual memory estimate, and updated Free Memory field to show greater than 2 GB.
Process checking for null jobs before converting endians. Ignore /private prefix in acquired paths to processes. Added routine to tell Console to open a file. Added standardized Nodes per Box Job structuring setting. Altered code to fit within Xgrid’s library restriction, fixing a bug that appeared in later versions of OS X 10.4. Updated benchmarking code to include new Power Fractal implementations in AltiVec and SSE. Added option to save and retrieve Pooch Pro user login data in Keychain, implementing time delay if user is not present. Converted endians for encrypted Pooch Pro user data to disk and over the network. Redirected Pooch Job History data into centralized, common system location from user location, with consumption of old job data. Converted endians for Job History data. Implemented Job accumulation cache and filter to optimize network and job scan for user minutes calculation. Added delay to accumulate calculation until user idle. Fixed job accumulation when launch occurred before Network scan. Log if user or group data is corrupt. Converted endians for initial password calculation. Extended Pooch user export/import to include quota and rollover flags. Fixed possible crash when refreshing on Network view of Network Scan window. Fixed Unix permissions of the queued jobs folder. Prevented crash during deletion of one-year-old archived jobs when getting such jobs. Converted endians for Get Files window and Bonjour and NSL/SLP access implementations. Signaled automatic user database update and direct Job window click as higher priority to queuing system. Added icns data for new Job, Job queue, Job history, Address list, and node list file types. Converted endians for client-side proxy node operation. Modified poochdaemon installers to conform to Unix permissions requirements in late versions of 10.4. Revamped Pooch Package and Pooch Installer to incorporate both CFM and UB versions of Pooch and install the appropriate version, from OS 9 to Intel. 
Implemented hidden feature to launch jobs on both OS 9 nodes and Intel nodes.

v1.6.5: January 18, 2006
Significant new features include:


Pooch changes in detail: Modified /tmp/pooch folder creation to set all Unix permissions to true. Fixed search to find flops and processor count correctly. Added to AppleScript interface to support new command-line features. Fixed ps parser to not read past end of file or duplicate ps output. Randomized ps temp file name to reduce file and permissions conflicts. Modified message diagnostics to print more reliably. Added ability to open Pooch job queue files to create a new Job window. Prevented pooch.unix from calling Process Manager or Apple Event Manager. Migrated Node Info window and its functions to use Node Access library. Modified the Pooch at Logout feature to use root privileges to start the poochdaemon. Removed excess debugging messages, enabling some according to user’s preference. Added cmd-ctrl-Q trigger for launch queue. Fixed text import to navigate folders. Fixed CPU time tracking to exclude null data. Fixed Network Scan window’s nodelist pane behavior to properly delete playlist entries. Implemented Kill All Job behavior when option-clicking on Kill Job button, also displaying “Kill All Jobs”. Fixed job status field display color when it follows a BUSY node in Node Scan window. Implemented export and import of user data via tab-delimited text files, flexibly parsing header and data fields. Implemented new AppleScript data types for user, group, login session, and time interval selectors and connecting the handlers to the user database, making it possible to modify the login state and user database of the cluster via scripts. Added default processor per box selector. Enhanced randomness of TCP port number selection to help prevent conflicts. Added recognition of dual-core G5 CPU type.
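
The tab-delimited user import described above parses header and data fields flexibly. A minimal sketch in Python of one way such parsing could work follows. The column names (name, group, quota) and the function parse_users are hypothetical and are not Pooch’s actual export format. Columns are matched by header label rather than by position, and unrecognized columns are ignored.

```python
import csv
import io

# Hypothetical field names for illustration; Pooch's real export columns may differ.
KNOWN_FIELDS = {"name", "group", "quota"}

def parse_users(text):
    """Parse tab-delimited user records, matching columns by header label."""
    reader = csv.reader(io.StringIO(text), delimiter="\t")
    try:
        header = next(reader)
    except StopIteration:
        return []  # empty input: no header, no records
    # Normalize labels so "Name", " name " and "NAME" all match.
    columns = [label.strip().lower() for label in header]
    users = []
    for row in reader:
        if not any(cell.strip() for cell in row):
            continue  # skip blank lines
        record = {}
        for label, cell in zip(columns, row):
            if label in KNOWN_FIELDS:
                record[label] = cell.strip()
        users.append(record)
    return users

# Columns appear in a different order than KNOWN_FIELDS, plus one unknown column.
sample = "Quota\tName\tNotes\n120\talice\tadmin\n\n60\tbob\t\n"
print(parse_users(sample))
```

Matching by label means a spreadsheet that reorders or adds columns can still be imported, which is one plausible reading of “flexibly parsing”.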

Added endian flippers for all 40+ data structure types and implemented them at all data entrance and exit points in Pooch due to migration to Universal. Modified header dependencies, disabled bridge code, modified resource behavior and icon access, compensated for missing target ID AppleEvent coercion and API changes for migration to Xcode from Metrowerks. Added recognition of Intel CPU type for Pentium 4.
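
To illustrate what an endian flipper does, here is a minimal sketch in Python. The record layout (a 32-bit node ID, a 16-bit port, and a 16-bit flags field) and the name flip_record are assumptions made for this example; Pooch’s own structures and routines are not shown in this manual. The point is that each field is byte-swapped individually, so a record written on a big-endian PowerPC Mac can be read on a little-endian Intel Mac.

```python
import struct

# Hypothetical record layout for illustration only:
# 32-bit unsigned node ID, 16-bit unsigned port, 16-bit unsigned flags.
FIELDS = "IHH"

def flip_record(raw):
    """Swap the byte order of every field in one packed record."""
    values = struct.unpack(">" + FIELDS, raw)   # read fields in one byte order
    return struct.pack("<" + FIELDS, *values)   # write them in the other

big = struct.pack(">" + FIELDS, 0x0A000001, 5000, 0x0001)
little = flip_record(big)

# The field values survive the trip; only their byte order changes.
print(struct.unpack("<" + FIELDS, little))
```

Applying the flipper at every point where data enters or leaves the program, as the note above describes, keeps the in-memory form native on each machine.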

v1.6: May 10, 2005
Significant new features include:


Pooch changes in detail: Major renovations to the Network Scan window, both at internal and external levels. Internally, factored the routines into separate components for node access, network scanning and compilation, and the window user interface, making it possible for each lower-level component to operate independently of the upper-level components. Externally, used the Segmented control to switch views, added a new Network View, added a playlist-inspired left-side pane, with a splitter that detects mouse-overs and saves state, for user-defined node lists and networks, replaced the remote node scan button with an address box with a pop-up menu, added accessors to alter job history lookback dynamically, added an action pop-up to eliminate the node info pop-up and network scan columns button, added a Search field to reduce both node and job data, added alternating colored backgrounds in DataBrowser for OS X 10.4, reorganized the refresh button and other controls, following cues from Finder, iTunes, and Safari. All this while still properly supporting drag-and-drop between Network Scan panes and the Job window and preserving the previous look for OS X 10.2, 10.1, and OS 9. Added ability to automatically hide the Network Scan window. Generalized job aggregate data structures. Worked around bug that corrupted QuickDraw after editing DataBrowser item. Updated CarbonEvent handlers. Added routines to translate HIRect and the traditional QuickDraw Rect. Added dozens of wrappers and constants for support routines for the new and updated controls in the Network Scan window. Added the ability to add complete node data, node acquire data, and arbitrary job data to the Job window, as well as being able to acquire a complete job record from the Job window.
Created and implemented new Grid Job Type, with the ability to launch a series of jobs, stepping through a range of integers for an executable and customize how the index is entered on the command line and how the index appears in the subfolders. Added actions on the event of a job ending, including creating a job log file to include in automatically retrieved files. Fixed a Job properties access bug when no files or nodes are present. Added Job name and Node List set/get accessors for job structures. Abstracted file list updates. Inserted workaround for jumping Job window drawers, especially in 10.4. Adjusting to changes in OS X 10.4, altered the Pooch StartupItem installation to properly set Unix permissions and eliminate excess files, and wrote a new mechanism based on CFPreferences to add Pooch to the Account Startup Items list. Created a new version of Pooch that operates in the Mach-O space without a user interface, adjusting all components to the new environment, and eliminating CFM wrapper routines, yet preserving the bark. Added implementations for menu items, configuration dialog, and Xgrid job launching routine. Reorganized internal Unix and utility routines. Generalized StringListHandle support routines. Altered network error reporting routines for selective reporting into poochlog file. Replaced all external mentions of “Rendezvous” with “Bonjour” for 10.3 and later. Altered the Bonjour internal interface to optionally expose multiple addresses per node. Reduced potential memory leak in resolution callback. Installed correct call sequence to stop Bonjour’s services resolver, eliminating “burden on the network” warnings. Added detection and reporting of nil infoPtr returned from NSLPrepareRequest. Added new triggers to the job launch queue.
Created all-new implementation of job queue internal data system to centralize the queue on the machine and update the format for OS X, implementing new job data storage and retrieval systems while automatically converting data from the old system. New queue is more efficient and reports on possible errors in poochlog. Redirected Launch mechanism messages to poochlog and factored node scan for automatic acquire code if running in Unix mode. Reduced declarations of Job properties structure. Generalized the file/folder scan structures and the user login detection routine. Reduced AppleEvent dependencies on old url data structures. Fixed scrambled name bug when importing nodes without names. Added ability to accept and expose node acquire data, file list entity, node list entity, and complete job data via AppleEvents. Added and implemented network scan and job scan, with new modifiers, AppleScript commands to acquire job data instead of node data. Detected new clock speed format for > 2GHz and fixed clock speed display. Added calculations using UTCDateTime. Altered 64-bit comparison calculations. Cached system version Gestalt calls. Updated FSRef routines to report descriptive errors. Created node and job data to text conversion routine for Search and other features. Generalized Node List structure manipulation routines. Added ability to minimize windows on command and detect mouse movement events. Added self identity modification address list implementation and import routines, generalizing friendly-address implementation. Fixed a potential bug when importing addresses while the Job window was open. Altered subdirectory creation to not delete existing directories. Changed Node Info window to use frames only before 10.3 and provide a horizontal scroll in the Job Queue view. Added accessors for strings in preferences file. Altered window names and hide Settings menu on 10.4 to follow OS X conventions. Added Job file type.
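
The Grid Job Type in the v1.6 notes above steps an executable through a range of integers, with control over how the index appears on the command line and in the subfolder names. A minimal sketch of that idea in Python follows. The function grid_jobs, its parameters, and the template syntax are hypothetical; they are not Pooch’s actual settings or file layout.

```python
def grid_jobs(executable, first, last,
              arg_template="{index}", folder_template="run{index}"):
    """Return one (subfolder, command line) pair per index in [first, last]."""
    jobs = []
    for index in range(first, last + 1):
        argument = arg_template.format(index=index)    # how the index reaches the program
        folder = folder_template.format(index=index)   # where that run's output would go
        jobs.append((folder, [executable, argument]))
    return jobs

# Three runs of a hypothetical executable, with zero-padded subfolder names.
for folder, command in grid_jobs("./fractal", 1, 3,
                                 arg_template="-n{index}",
                                 folder_template="run{index:03d}"):
    print(folder, " ".join(command))
```

Keeping the command-line template separate from the folder template lets the same index be formatted differently in each place, for example zero-padded in folder names so that they sort correctly.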

v1.5.5: September 15, 2004
Significant new features include:


In detail: Reorganized job processing and archival routines. Added user data to job record before launch. Added logging to /Users/Shared/pooch/poochlog.txt. Added data recording to launch mechanism for LAM. Added logging of launch events to poochlog. Optimized calls to update Status Bar during launch, reducing flicker and excess CPU consumption. Factored forwarding job detection. Added preparation, access, command, and launch functions for mpich-gm and LAM, connecting them to the main launch engine. Cached calls to TickCount(). Added function and detection flags to redirect file collection for auto retrieve function. Added check for nil commID before proceeding with launch. Added calls to place Terminal.app in foreground when launching using Terminal. Added archiving of forwarded job data. Report when communication retry count exceeded. Added 3 second timeout for key rotation function during launch. Added recognition of G4 7447 and 7457 and G5 970 and 970FX in Network Scan Window, Node Info Window, and AppleEvent results. Added check for nil ProcessSerialNumber in quit and AppleEvent functions as workaround to prevent killing user space. Extended explicit collection of command-line data from ps command. Increased scope of pid search for wraparound process numbers. Prevent overflow of command string read from ps. Report reason and quantity limit in error string when user policy triggers job kill. Added Perl script routine. Added check for 0 process id to prevent logout of user space (#3805951). Added retention and processing of friendly addresses in listener functions. Redirect application file of watch list and archived job for multiple executable instance when launching multiple tasks per node. Added simultaneous send of multiple packet data to proxy controller routines. Added state requirements for mpich-gm launch command. Added state and explicit check for access to Internet Services and Open Transport before endpoint access with appropriate retry loopback. 
Report access from outside friendly address list. Added slave LAM command. Added AutoRetrieve directory command. Separated functions for network access and listener activity, making cluster access possible while another Pooch is running. Added primary address setting with Network menu modifications. Added pro version application AppleEvent variable. Added flags to node scan AppleEvent to report all and BUSY nodes, with a new scan performed only when requested or required. Added cancel launch AppleEvent. Added auto retrieve, forward via, and user name AppleEvent access to job data. Added mpich-gm and lam job types to AppleEvent access. Added LAM path access. Retrieve node status data even if communication is not complete. Added check to prevent user file addition to jobs. Include process path watch to BUSY check. Factored common preferences access. Cleaned up job view entry additions in Network Scan Window. Optimized and bottlenecked status text drawing in Network Scan Window, adding grayscale option. Adjusted initial order for job view. Overwrite job item entry only if node “owns” the job. Added ability to send multiple special commands to a node during network scan. Correctly kill multiple task per node jobs via Network Scan Window. Reorder node status access if special commands are to be sent. Optimized accumulated job history task. Made Rendezvous and NSLM registration sensitive to listener state. Added user interface for friendly address import in menu and preferences window. Adjusted User Admin Window column dimensions.

v1.5.6: Prevented crash during deletion of one-year-old archived jobs.

v1.5: June 28, 2004
Significant new features include:


In detail: New User menu and infrastructure for user log in and log out, password modification, user info display, user migration, user administration, and user database updating. New Show Job View menu item. New detection of User Idle Time, reporting mechanisms in the Node Info Window and the Node Scan Window and incorporation into the Rating function. Improved reporting of dialog status for compatibility with Carbon events. Pooch Pro requires log in for Job Window to function. Added recognition of correct user and group icons to the Job Window. Revised Job Window file acceptance for secure handling of user database files. Revised job launch rules incorporating user compute time quota calculations. Added flags to Status structure for Pooch Pro and encryption algorithm existence and User Idle Time. Extended Node Scan Window column infrastructure to allow over 32 columns while maintaining backwards compatibility. Corrected count of selection in Job View of Node Scan Window. Fixed activation of Get File Window and Get URL from Job View of Node Scan Window. Updated Pooch Version column of Node Scan Window for “Pro” label. Added Elapsed Time calculation and User Name column for jobs listed in the Node Scan Window. Restricted users from killing others’ jobs, except for administrators, via the Node Scan Window. Added standard Okay/Cancel and double-entry sheets. Modified parallel progress bar structure for additional flags for Pooch Pro. Added version number to About Box. Added links to User Administration accessors for windows routines. Made Job submission visible to Pooch Pro code. Created wrapper library for access to Apple’s CDSA implementation for strong encryption and decryption. Added mechanisms to filter commands requiring correct execution of job launch and other Pooch communication sequences. Added support for encrypted communications. Cleared out job handle on new connections. Added support for incorporation of new user data. 
Added block for queuing system based on user compute time quota calculation. Fixed an issue where cancel was not accepted during the file info step of the launch sequence. Added Node Scan Window support for view specification on open. Added time interval to string conversion. Added mechanism to save current process watch list to disk in case Pooch is restarted. Change to leave last call to ps in place. Added comparison of job execution time with user compute time quota calculation and maximum duration limits to process watch routine. Introduced new Pooch resource structure and support infrastructure for cross-platform compatible resource files. Corrected Job Archive Limit preference for year limit. New modules for user database management and file infrastructure, including password hash and user limit resolution. New administrative windows and dialogs. New job history accumulation and management organized by user. New user log in management for reporting of currently logged in user’s data. New Pooch Pro icon and About Box graphic.

v1.4.5: May 14, 2004
Significant new features include:


In detail: New Network menu. New hierarchical menus for Launch Unix Jobs, Access Nodes via, Local Node Scan, and Node Deregistration. New Job Port Range support and user interface. New Job Queue menu item in File menu. New Proxy Connection support for up to sixteen simultaneous bidirectional connections via the node assisting with the Remote Node Scan. New preference to prevent remote access to processes not launched by Pooch. Adjusted event callbacks to not filter key events when dialogs are open. Had Demo version remove poochdaemon at expiration. Changed Node Scan Window user interface, logic, and accessors for other windows to respond correctly to node selection in Job View, including double-clicking and contextual menus. Allowed multiple selection in Job View. Allowed network connections to nodes to continue while adding them to the Job Window. Fixed an issue which corrupted status information, particularly processor counts, when adding nodes to the Node pane of the Job Window. Added support for OS X-style, callback-supporting, single-entry dialog sheets supporting copy and paste. Report retry count in Status column of Network Scan Column when Verbose Error Reporting is on. Report proxy node in Determined Via column when a connection to a node is attempted via that proxy. When sorting by Status in Job View, secondary sorting occurs using Submission Time. Added proper sorting using Process ID. When new nodes are introduced to the Network Scan Window, SLP data is chosen first, so that Rendezvous data can be reaccessed later. Generalized automatic rescan mechanism to extend time for proxy connections. Allowed for very long (up to 127 characters) DNS names for IP addresses. When proxy is available, Network Scan Window automatically alternates between proxy and direct attempts with a pseudorandom starting try, extending timeouts exponentially. Added support for simple OS X-style Cancel and Okay dialogs. Added proper support for Command-. to simple modal dialogs.
Added new and renamed preferences to Preferences Window, requiring its heightening. Adjusted menu disabling and hiding for new Network menu. By performing post-open triggers, worked around an issue where Job Window’s drawers, while opening, would randomly drag their parent window around the screen. Added drag-out support for items in Recent Items drawer. Allowed multiple selections in Recent Items drawer. Utilized new-style sheets for Job Options requiring them. Updated plst and vers resources to allow Version data to appear correctly in Finder. Extended Automatic Rescan interval to ten minutes. Added support for use of /Users/Shared folder instead of /tmp. Disallowed Kill Process in Node Info Window when no process data given. Corrected pid data reporting when concluding pid search. Supported restriction of process ID data. Passed proxy access flag through Status structure. Fixed node structure creator routine for correct behavior with nil strings. Added automatic listener endpoint check for failure and auto-recovery. Separated small network send buffer from receive buffer. Added response to T_DATA event on OTLookErr. Corrected bug in endpoint struct open routine to set TCP_NODELAY. Extended timeout delaying reuse of endpoint structures. Added dialog indicating success of Startup Items setting. Revised splash graphic.
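
The v1.4.5 notes above mention that, when a proxy is available, the Network Scan Window alternates between proxy and direct attempts with a pseudorandom starting try and exponentially extended timeouts. A minimal sketch of such a retry schedule in Python follows. The function name attempt_schedule, the 2-second base timeout, and the doubling factor are all assumptions for illustration; the manual does not state Pooch’s actual values.

```python
import random

def attempt_schedule(tries, base_timeout=2.0, rng=random):
    """Return (route, timeout in seconds) pairs for successive connection tries."""
    routes = ("direct", "proxy")
    start = rng.randrange(2)  # pseudorandom choice of which route is tried first
    schedule = []
    for n in range(tries):
        route = routes[(start + n) % 2]     # alternate routes on every try
        timeout = base_timeout * (2 ** n)   # assumed growth: double on each try
        schedule.append((route, timeout))
    return schedule

# A seeded generator makes the example repeatable.
for route, timeout in attempt_schedule(4, rng=random.Random(1)):
    print(route, timeout)
```

Randomizing the first route spreads the load when many nodes are scanned at once, and growing timeouts give slow links a progressively better chance of answering.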

v1.4: January 2, 2004
Significant new features include:


In detail: Added node list type to the Apple Event dictionary and enabled access to nodelist_ip information via AppleScript and interapplication communication. Added accessors to security authentication and authorization in OS X. Restructured Pooch utility code into six separate modules depending on category and type. Customized behaviors and appearance to OS X 10.2 and 10.3. Enhanced signaling and communication between Network Scan Window and the Apple Event support and launch modules. Prevented registration of 0.0.0.0 address via SLP. Prevented the automatic SLP registration retry from disturbing other code behavior. Upgraded CarbonLib headers to version 1.6. Added and implemented automatic node acquire mode of the Job Window and launching mechanism, with corresponding adjustments for job monitoring and display. Added Job Option drawer with corresponding controls. Added Recent Items drawer to Job Window. Added clipboard distribution feature, including pasting into the Job Window and copying using remote data. Correct handling of nil AllAddresses data from Rendezvous. Searched through all addresses resolved by Rendezvous for valid addresses, rather than only use the first. Correctly tallied Rendezvous addresses with repeat accesses through the Network Scan Window. Responds to Rendezvous discovery events at all times while the Network Scan Window is open, rather than just during the SLP scan. Added and implemented zoom button for Get Files window. Allowed Node Info and Job Queue to be invoked when viewing the Get Files window. Enabled delayed trigger for Get Files and Node Info windows for compatibility with Carbon Events. Used smaller system font for consistency with Finder. Replaced old Job IDs for the queuing system with Queued Job IDs. Implemented Job Timeline data structures to hold past, present, and future job data. Implemented job archive and retrieval mechanism, establishing Job resource type.
Updated Process Watch List to incorporate job as well as process data. Implemented Job Timeline accessors for Job Archive, Process Watch List, and Job Queue. Added launch mechanism to Network Scan Window subscriber list. Prevented uninitialized data from entering the Apple Event identification routine. Accessorized Pooch version data for access by multiple parts of Pooch. Fixed an issue where the Apple Event node scan command would retrieve BUSY nodes. Limited node communication retries in the launch mechanism to 10, then reporting failure. Separated submission jobs versus working jobs in the launch mechanism, including appropriate resubmission into queue. Bridged compatibility between new Job Properties data and older Job Queue accessors and kill mechanism, including automatic acquire features. Check for null address in job data. Allowed job submission to place new jobs at either end of queue. Allowed display of IP address in launch status window if no name is present. Once launched, all processes are now submitted with job and file data for full recognition and tracking. Correctly initialized LaunchApplication records in launch mechanism. Enabled preparation and retrieval of Job Timeline data on remote nodes with GetJobTimelineCommand. Attempted to register network communication activity as user activity for screen savers. Track and retain data about aborted jobs. Implemented command-line argument Job Option on remote nodes. Performed a+w permission operation on all files created by Pooch. Prevent caching of 0.0.0.0 IP address. Returns true when checking if 127.0.0.1 is self. Created and implemented Pooch Preference window with three panes. Implemented appropriate Carbon Events to invoke and drive the Pooch Preference window. Added preference to kill the Settings menu. Conversion of Node Scan Window to Node View and Job View of Network Scan Window with addition of Carbon Events, the brushed metal look, and related control modifications. 
Added mechanisms to retrieve job information and coordinate the killing of jobs across multiple nodes. Added and implemented Zoom button to Network Scan Window. Generalized, and updated with antialiased graphics, the Node Registration interface for use with the Start Time Job Option. Implemented little arrows controls for arrows of the Start Time Job Option. Job Window rejects files if nil is passed for file specification. Check for possible null node names and addresses in Recent Node Lists. Appended Recent Nodes and Files, rather than replace the present node and file lists in the Job. Added Carbon Events and brushed metal look to Job Window. Separated job recall into node list and automatic acquire modes. Job Window edits keyboard focus. Accessorized self node addition to Job Window. Allowed specification of initial Job Windows node list versus automatic acquire option. Eliminated OS 9-style brushed metal when in OS X. Relabeled Launch Job button depending on Job Options. Responded to modified dragging and tracking in Job Window while using Carbon Events. Generalized IconRef access for files and special cases. Use of OS X-style display name instead of actual file name in file list pane of Job Window. Modified highlight region drawing for brushed metal look and automatic acquire option. Added Job ID and submission time initialization when Job Window submits jobs. Abstracted control updating separately for Job Window and its drawers. Embedded Node Info Window’s data display in its new two tab control. Node Info window lists changed to smaller system font. Node Info Window no longer disables Node menu inappropriately. Progress pie in Node Info Window is antialiased in OS X. Rearranged data display to hold more data and animate as retrieval completed. Allowed for scrolling in Job Queue mode. Accepted supplementary Status structure data. Display proper rounding for OS version numbers. Worked around potential race condition during fast logout to login transition. 
Created IconRef accessor to icon resources. Created and implemented Job Archive Time Access Interval preference. Enabled access to Preference Window through Pooch menu item. Enabled and implemented Pooch at Logout toggle with administrative authorization. Added application and other main event implementation through Carbon Events. Extended Network Scan Columns controls in three columns. Added and implemented Job Item lists, with disclosure triangles, and new Job-related display columns for Job View, compiled from retrieved data. Implemented acceptance of supplemental status data, Job Timeline retrieval, and multiple node commands. Designed new suite of Job icons, Network Neighborhood icon, Node Info icon, and others. Added recognition of PowerPC G5 CPU type to Network Scan Window and AppleEvent interface.

v1.3.5: June 20, 2003
Major new features include:


In detail: Added implementation to monitor login/logout state so that it will hide all its menu bar when logged out and quit if the login status changes. Added support for Carbon events to support the Dock menu items and horizontal and vertical scroll wheels in the Node Scan, Get Files, Node Info, and Job Windows. Added support for contextual menus in the Node Scan and Job Windows. Minimized the appearance of the menu bar while running at logout. Implemented retention of additional column sort criterion and direction information in the Node Scan Window. Implemented time out reset when OT support was off when the user wanted to perform a Node Scan. Added code to prevent the IP address of a node in the Node Scan window from being overwritten while communication with that node was in progress. Added support and preferences for saving files in the Temporary Items folder and /tmp/pooch in addition to the folder where Pooch resides. Implemented Reveal in Finder AppleEvent support for sandboxes. Modified launch mechanisms and job window accessors to accept Unix command-line arguments and convey them to the executables, if applicable, for all job types. Added code to change Pooch’s Unix umask so all can read and write files that Pooch creates. Optimized the Recent Node and File List menus so that the menu would not take so long to appear. Added interface and implementation for opening File Sharing from any window that specified a node. Added and implemented skip node zero, job type, tasks per node, and Unix command-line arguments properties to the job definition and Okay/BUSY status and time stamp to the node definition in AppleScript interface. Fixed a bug in the AppleScript interface and the Node Scan Window that reported negative memory if 2 GB of RAM was installed and increased the detection limit to 4 GB. 
Improved the Node Scan completion queue in the AppleScript interface to more intelligently allow the needed amount of time for communication with other nodes to complete, including while performing a Remote Node Scan. Added a more intelligent Pooch Preferences folder locator in case the normal Preferences folder was not available. Implemented high-level IP address caching and comparison routines with corresponding changes in the launch mechanism and the job window, for cases where Pooch’s node has more than one IP address. Implemented a workaround in the Node Registration Schedule dialog so the day selection menu would redraw properly when switching schedule types. Fixed automatic benchmark starting mechanism when the node is not registered. Created a caching mechanism for ps access. Improved internal support for variable length job options, including command-line arguments. Changed default sandbox preferences. Changed launch mechanism to only put Finder in foreground if Pooch knows it is in the foreground. Optimized mpich launch timing. Changed the behavior of the Subdirectory option of the Job Window to be more likely to create a new subdirectory name rather than reuse the old one. Created a workaround and warning for blank node names, a circumstance that Rendezvous will not accept. Eliminated extraneous diagnostic reporting in Rendezvous implementation. Created a workaround for a DataBrowser display while scrolling bug in 10.2.3. Created a workaround to maintain background colors after DataBrowser reset them to white. Added command-control-R key command to reset Pooch’s network and node settings, in case OS X did not notify Pooch.
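
The umask change listed in the v1.3.5 notes above can be illustrated with standard POSIX calls. This is a sketch only and is not Pooch's actual code; the function name create_shared_file is hypothetical. Clearing the umask means the mode passed to open is applied without any bits masked off, so a new file created with mode 0666 is readable and writable by all users:

```c
#include <assert.h>
#include <fcntl.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: creates a file that every user can read and write.
   An already existing file is truncated but keeps its current mode.
   Returns 0 on success, -1 on failure. */
int create_shared_file(const char *path)
{
    int fd;

    assert(path != NULL);
    umask(0);   /* process-wide: no permission bits are masked from now on */
    fd = open(path, O_WRONLY | O_CREAT | O_TRUNC, 0666);
    if (fd < 0)
        return -1;
    return close(fd);
}
```

Because umask applies to the whole process, a single call early in a program affects every file created afterwards.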

v1.3.1: December 31, 2002
A few minor bug fixes and cosmetic updates. In particular, a bug in the Node Scan features of Pooch and the new Rendezvous implementation that occurred only in OS X 10.1 was fixed.

v1.3: October 4, 2002
Major new features include:


In detail: Created complete Rendezvous registration (with auto-retry), continuous browse, and on-request resolution implementations. Revised Pooch’s OS X process acquisition code to focus its search for processes it launched. Excluded requests for null process number watches. Keep track of jobs by session ID so that multiple tasks can be associated with a particular job. Adjusted for differences in ps on OS X 10.2 versus 10.1. Deletes previous mpich procgroupfile before writing a new one. Extended internal job data format and accessors to handle node information for Tasks per Computer and Use Port Number features. Added internal time stamp to node status information. Implemented discovery accessors and additional job option retention and added these settings to preferences file. Modified slave and master node launching mechanisms for multiple tasks per computer for all MPI types. Modified launch sequence to always query for latest status from nodes before launch and update internal job data. Launch process now forwards all job data to slave nodes before launch. Created complete recursive file and folder copy mechanism. Modified master and slave node to recursively copy executables and files on command. Implemented wait states for mpich launching due to fragility in mpich’s p4. Job Window updates node 0 name, address, and status information of the current job if node 0 was this node. Fixed possible memory leak in recent node and file list constructors. Created task per computer counter for Job Window node list display. Fixed node list display error if name was zero length. Adjusted to condensed font if node numbering became too wide. Changed default subdirectory job option suffix from “ƒ” to “folder”. Implemented new Tasks per Computer and Port Number job options in Job Window. Modified job window accessors to handle extra node information. Set Running Applications list to order by descending CPU time by default, rather than ascending application name. 
Optimized Node Info access and disconnection speed. Created higher-level AppleEvent node information accessors and creators and extended implementations to AppleEvent and Job Window accessors. Added AppleEvent access to new Job Window features. Implemented two second minimum time for node scan command to complete to allow Rendezvous and SLP discovery time to initialize. Implemented “for best processors” clause. Fixed node scan request queue bug that prevented more than one request from being queued. Added “time stamp” property for typeNode AppleEvent data. Added information to help tags for Node Scan Window pop-ups. Fixed bug that prevented node scan columns from being reorganized in OS X. Generalized Node Scan Window node list accessors to handle NodeStruct types from Rendezvous in addition to raw URL data from NSL/SLP. Added reset NSL/SLP list accessor. Encapsulated slave launching mechanisms for each MPI type. Created workaround for CloseOpenTransport hang bug. Connected node scan tunnel to Rendezvous in addition to NSL/SLP. Extended slave node launching mechanisms to handle multiple Tasks per Computer feature. Separated MPI/Pro launch into its own command. Created and implemented retain job information and duplicate files commands.

v1.2.1: June 30, 2002
When launching jobs via the Terminal.app while Terminal is not yet running, added a search by the name “Terminal.app” in addition to its signature code. Improved the error code reporting in case Terminal.app launching fails. If two copies of Pooch are launched, and the first has already set up its network connections, the second copy of Pooch will remain inert, retrying once per minute, rather than interfere with the first. If NSLStandardDeregister fails, Pooch will report the error, rather than simply beep. Changed the Node Scan Window code so the browser list sort and movability default settings are more consistent across columns. Changed the Rating column margin setting to 5 pixels on OS 9. Corrected a problem in the Job Window where the Launch Job button would remain disabled after using the Recent Node Lists menu. Updated this manual’s mpich description to recommend installation in the home directory.

v1.2: March 27, 2002
Major new features include:


In detail: Added the ability to recognize folders, Unix scripts, bundles, and bundle icons to the Job Window. Added the flexibility of adding more than one executable to the Job Window, while maintaining that the first file is the first executable, even when files are later removed, and labeling that executable with a •. Altered the style specification of columns of the Job Window, Node Scan Window, Node Info Window, and Get Files Window for greater consistency between OS X and OS 9. Altered the internal control and window spacing specification for greater consistency throughout Pooch. Altered the Job Options pop-up to accept submenus. Added and implemented the Job Type submenu to the Job Window. Altered Job Window so its node list displays IP address when there is no name. Altered the Node Info Window text display, Node Scan Window node count placard, Get Files placard, and about box to use the correct antialiased fonts when running on OS X. Added calls to the Remote Node Scan cache of IP addresses to “warm up” the DNS search when the Node Scan Window is opened rather than only when the pop-up is first clicked. Fixed a bug preventing help tags from appearing under some circumstances. Corrected the status string region of the Node Scan Window to enable itself more quickly when its window is made active. Added the Clock Speed column to the Node Scan Window. Prevented the remote node scan node information from being overwritten prematurely. Prevented the Node Scan Window from displaying too many nodes. Optimized the parallel node query retry mechanism. Changed the ThemeWindowBackground of the Start Time dialog and single entry dialogs to the default on OS X. Added code to save and retrieve information about a series of recently launched jobs. Added the recent node and file list submenus with abbreviated names and the ability to restore them to the current Job Window. 
Changed the Register this Node preference so that it could accept Always, Never, and Scheduled modes instead of just on and off. Added Node Registration Schedule dialog, switchable between Everyday, Weekend/Weekday, and Weekly modes, with corresponding implementation in the node registration code, while attempting to follow Apple’s HI guidelines. Added implementation and interface to save, recall, and display recently used nodes, files, and applications for new jobs, including on-the-fly abbreviation of node names in menus. Correctly retrieve and recognize Unix permissions of all files and folders on OS X, then use that information to recognize Unix-based applications. All permissions of files and folders are preserved when copying between OS X machines. Added code to correctly recognize, copy, and launch application bundles, especially Cocoa apps. Added the option to launch Unix-based executables via a Unix tcsh script rather than through a call to system. Added code to recognize Process Manager processes by signature. Added the option to launch Unix-based executables through the Terminal app. Terminal is called by sending AppleEvents. If Terminal is not running, it is launched via Launch Services. Added the ability to “kill -9” processes on OS X. Added code for writing procgroup files and specifying Unix command-line options to pass cluster information to the mpich master and slaves. Added environment variable specification and launch code for MPI-Pro jobs. Added the ability to kill all running processes tracked by Pooch at once. Added utilities to extract and search for arbitrary information, including searching for particular POSIX paths or particular process ID numbers, from “ps” output. Organize process tracking by process ID on OS X. Tracks all Unix processes, rather than only those reported by the Carbon Process Manager. Added tracking of Unix process creation by POSIX path, even if the process takes up to a minute to appear in “ps”. 
Extended data structures and added implementations to supply Unix process information to other nodes in addition to Process Manager information while retaining backwards compatibility. Added “node registration” information and implementation to AppleScript interface. Added “job type” to AppleScript job class type and corresponding implementation to specify MacMPI, mpich, and MPI-Pro job types. Correctly clear the job options structure when there are no options present in a job data structure. Added job type, subdirectory, and start time accessors for job data structures. Added and implemented Kill All Jobs on Deregister, Recent Node Lists Length, Recent Apps and Files Lists Length, and Use Terminal For Unix Jobs preferences. Added and implemented the Accept Remote Preferences Command preference in a private resource that cannot be remotely overwritten. Added and implemented Job Type automatic reloading preference. Schedule more time to launching process during a parallel launch, improving launch speeds. Encapsulated and optimized the Parallel Launch Status window so that it updates itself much more smoothly and cleanly and uses internal timers to leave only a minimal burden on the launch process. Corrected an error on OS X where garbage might appear in the about box when NSL/SLP attempted to register. When NSL/SLP fails to successfully register (e.g., when the network interface is unavailable), Pooch retries with decreasing, random frequency, rather than at regular intervals. Changed the Select Apps and Files dialog to accept folders, bundles, and multiple selections. Correctly report network, hard drive, and system activity to the Power Manager when remote requests and commands occur or when launching jobs. Created new commands to launch bundle-based applications, mpich executables, and MPI-Pro executables, to create a new subdirectory and specify a bundle application simultaneously, and to retrieve OS X file information. 
Changed the “change directory” command to accept a parent directory specification. Extended the set file information command to accept OS X file information. Fixed a bug where the Parallel Launch Status window progress bars would jump. Altered send file sequence of the launch process to recursively explore a folder structure using FSOpenIterator and FSGetCatalogInfoBulk. Altered the TCP port number information. Provided correct rounding of processor clock speed in Node Info Window, Node Scan Window, and AppleScript interface. Recognizes the PowerPC G4 7455. 36,326 lines of code.
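
The v1.2 notes above mention utilities that search “ps” output for particular process ID numbers. As a rough illustration of that kind of search, and not Pooch's actual code, the following sketch scans captured ps output line by line. The function name ps_output_has_pid is hypothetical, and the sketch assumes the process ID is the first column of each line, as in the default output of ps:

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: returns 1 if some line of ps_output begins
   (after optional leading whitespace) with the decimal process ID pid,
   and 0 otherwise. Assumes the PID is the first column of each line. */
int ps_output_has_pid(const char *ps_output, long pid)
{
    const char *line;

    assert(ps_output != NULL);
    line = ps_output;
    while (*line != '\0') {
        char *end;
        const char *next;
        long value = strtol(line, &end, 10);   /* skips leading whitespace */

        /* end != line means at least one digit was parsed;
           a header line such as "  PID TT ..." parses no digits */
        if (end != line && value == pid)
            return 1;

        next = strchr(line, '\n');             /* advance to the next line */
        if (next == NULL)
            break;
        line = next + 1;
    }
    return 0;
}
```

A search by POSIX path could follow the same line-by-line structure, using strstr on each line instead of strtol.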

v1.1.2: January 11, 2002
Added Node Scan Columns “Show Columns…” window. Optimized and made more fault tolerant, because of particular behaviors in Open Transport, the initial connection algorithms in the launching mechanism for greater speed and reliability when launching onto large numbers of nodes. Reorganized and rearranged the Job Window and Node Info Window for almost complete compliance with Apple’s Aqua user interface guidelines. Made the job window horizontally resizable. Added recognition of the PowerPC 7450. Under OS X, made the “Remote Node Scan…”, “Open Apps and Files…”, “Place in Subdirectory…”, and “Start Job at Time…” dialogs appear as sheets. Modified the connection algorithms in the Node Scan Window to attempt access to all found nodes once before giving other nodes a second chance to connect. Modified timeouts and other timings for the Node Scan Window, Node Info Window, and the launch process. Greater use of DrawStringUsingThemeTextBox, with the correct font and size selections, in the Node Info Window, the Job Window, the Node Scan Window, the Get Files window, and the About Box for smoother text under OS X. Added Always Include this Node in Local Scan setting with corresponding implementation. Added Accept Commands from Local Subnet Only setting with corresponding implementation. Gave the launch process code and network scan engine code greater encapsulation. Instructed Pooch to wake up the Macs upon successful launch on all nodes. Modified to recognize window classes correctly when determining the frontmost window. Corrected certain behaviors in the launch progress bar. Corrected a cosmetic error when the network scan code reports its registration. Corrected internal network structure allocation code for the case when a part of Pooch requests to connect to more than 64 other Pooches. Correctly notifies the system of network activity when Pooch receives commands. Acquires the correct IP address when multiple network interfaces are active.

v1.1: October 2, 2001
Fully operational on OS 9.x with CarbonLib 1.2.x and today’s OS X 10.1. AppleScript interface implemented and supported, making customized parallel launches, customized job queues, Unix command-line calls to Pooch, and fully automated parallel application launching possible.

Worked with Apple to fix NSL, Open Transport, and CoreServices bugs in 10.0.x, leading to the fixes in 10.1 release. On X, now uses the Unix “ps” command to determine load distribution on a node. On X 10.1, uses CSCopyMachineName to determine the computer name. Heuristic node rating added and exposed to the Node Scan Window and AppleScript interfaces. Add Best… pop-up replaces Add All button, leading to alteration of corresponding selection list calculation routines. Revised Node Scan Window code so that, when automatic rescan occurs, the scan results no longer disappear and reappear when being updated. Nodes selected for launch in the Job Window now appear in the Node Scan Window in gray, rather than disappearing. Revised Network Neighborhood Node Scan pop-up menu and behavior. Added the ability to recognize and launch Mach-O, Unix-based executables. Altered Open Transport endpoint settings and optimized background time calculation and internal task code to improve network performance, file transfer speed, and launch speed. When remembering the last job and node 0 was itself, updates the node 0 IP address if it has changed. Added Verbose Error Reporting setting to reduce “nuisance calls”. Added the job option to skip the launch of node zero. Internal window code structure reorganized and factored to a greater degree. In X, ATSUI text methods now used for custom content Data Browser columns to match font type and size of the other columns.

v1.0.2: May 2, 2001
Fully operational on OS 9.x with CarbonLib 1.2.x and today’s OS X.

Prevented a possible circumstance where Pooch’s connector endpoint would continuously reject an incoming password. Added code to the Set Pooch in Login Items to operate correctly when there is no loginwindow.plist file. Fixed a cosmetic bug in the Job Launch status window where the text could overlap. Fixed a cosmetic bug so that the node list font looks the same as the file list font. Corrected a Launch Job button behavior. The DNS translation routines correctly report a dotted quad.

Remaining known issue on X: The problem of the Process Manager returning zero is still present.

v1.0.1: April 25, 2001
Fully operational on OS 9.x with CarbonLib 1.2.x. This Pooch contains workarounds for known bugs in OS X, so this Pooch should be fully operational with today’s X.

Known issues in OS X 10.0.1:
• Pooch uses the Process Manager to monitor the CPU time taken by other apps and estimate load. However, Apple’s documentation says the Process Manager on X returns zero CPU time for all processes, making these calculations impossible directly in Carbon. A workaround through Unix may eventually be devised.
• During the search for nodes, NSLM does not respect maxSearchTime parameter, resulting in infinite searches. The current workaround is to set maxSearchTime to zero and use NSLCancelRequest.
• The official way to access the user-specified computer name entered in the Sharing system preference, CSCopyMachineName, returns the wrong name. Another way is being used.
• Setting Pooch in Login Items automatically. There is no documented way to do that, so I edit the loginwindow.plist file myself.
• A few cosmetic errors remain, such as custom content Data Browser columns not quite matching the font size of standard text Data Browser items.

v1.0: April 17, 2001
First public release - Fully functional on OS 9.x with CarbonLib 1.2.x. Due to known bugs in OS X, Pooch on OS X cannot be officially supported. Over 23,000 lines of code in C.

Known issues in OS X Release Candidate:
• During search for nodes, NSLM does not respect maxSearchTime parameter, resulting in infinite searches. Attempts to cancel the search result in a hang. However, this problem does not restrict launching a job onto OS X machines from an OS 9 machine.
• There is no reliable way to access the user-specified computer name entered in the Sharing system preference. The official way, CSCopyMachineName, returns the wrong name.
• An assortment of cosmetic errors, such as Aqua buttons not being compatible with texture backgrounds.





X. Credits

Pooch Concept, Code, Engineering, and User-Interface Design
Dean E. Dauger, Ph. D.

Pooch Icon Art
Jean-Marie Venturini

Advice and Brainstorming
Viktor K. Decyk, Ph. D., Alan B. Dauger, Ph. D.,
Kevin T. Sinclair, EE & MBA, Robert M. Zirpoli, III, M. E., Catherine C. Venturini, M. S.

Beta Testing
Viktor K. Decyk, Ph. D., Frank S. Tsung, Ph. D., John W. Tonge, Ph. D.

Compiler
Metrowerks CodeWarrior Pro 6

We also wish to thank employees of Apple Computer for their assistance, technical and otherwise, who helped demystify the future of the Mac OS and saved a lot of headaches:

Tim Parker, Warner Yuen, Robert Kehrer,
John Tucker, Kevin Arnold, Steve Zimmer, Ernest Prabhakar, Ph. D.







XI. Contact Information

Pooch is released by Dauger Research, Inc. The latest version of this document is available from the Pooch web site. We are open to suggestions and wish lists regarding parallel computing issues. Depending on your needs and the nature of the request, custom configurations of Pooch can be arranged.

Dauger Research, Inc.: http://daugerresearch.com/
Pooch web site: http://daugerresearch.com/pooch/
The Dauger Research Vault, providing tutorials, sample code, and other documentation: http://daugerresearch.com/vault/
Pooch technical support, questions, and feedback: pooch@daugerresearch.com
Dauger Research mailing list info: http://daugerresearch.com/mailing-list/

The latest MacMPI libraries are available from the AppleSeed web site.

AppleSeed: http://exodus.physics.ucla.edu/appleseed/
UCLA Plasma Physics Group: http://exodus.physics.ucla.edu/

Pooch and this document are Copyright © 2001-6 Dauger Research, Inc.
Macintosh, Mac OS, and Mac OS X are trademarks of Apple Computer, Inc.


Last Update: May 15, 2006







Appendix. Compiling the MPIs

This Appendix was written to provide basic information on compiling your code with MacMPI_X, mpich, and MPI-Pro and running these codes in Pooch. Most of this information is derived from outside sources and is not necessarily complete for all purposes. We provide this information as a convenience. Note that these MPIs were not created and are not maintained by Dauger Research, Inc., so this information is subject to change at any time.

MacMPI_X

MacMPI_X, an Open Transport/Carbon-based implementation of an MPI subset, is available from the AppleSeed Development web site:

http://exodus.physics.ucla.edu/appleseed/dev/


Both C and Fortran versions are available with corresponding header files. An introduction to MPI, some basic examples of MPI programming, and links to references on MPI are available on the site as well. The following information is derived from the above link. Although MacMPI is only a subset of MPI-1, Dauger Research, Inc., recommends its use because of its long history of reliability on the Mac OS, its wide flexibility compiling in different environments, and its extremely helpful diagnostics.

Compiling MacMPI_X.f with the Absoft Pro Fortran v7.0 compiler in OS X’s Terminal

Download MacMPI_X.f and mpif.h and include them in the same directory as your Fortran source. Compile MacMPI_X using:

f77 -O -c MacMPI_X.f

which generates MacMPI_X.o. To compile your code, you will need to link with MacMPI_X.o and the Carbon libraries. For example, depending on whether you are using Fortran 77 or Fortran 90/95 to compile test.f, you would use commands like these:

f77 -O test.f MacMPI_X.o /System/Library/Frameworks/Carbon.framework/Carbon
f95 -O test.f MacMPI_X.o /System/Library/Frameworks/Carbon.framework/Carbon

These commands create a Mach-O executable, which Pooch should recognize. By default, the Absoft compiler on OS X creates executables with a 512 kB stack. If your application needs more, you may wish to use the -s flag or the limit command.

Compiling MacMPI_X.f with the Absoft Pro Fortran v7.0 compiler on OS 9

Again, you will need to download MacMPI_X.f and mpif.h into the directory where your source is located. Compile MacMPI_X using:

f77 -O -c MacMPI_X.f

which generates an object file named MacMPI_X.f.o. To compile your main code, test.f in this example, in Fortran 77 or Fortran 90, use commands like these:

f77 -O test.f MacMPI_X.f.o
f90 -O test.f MacMPI_X.f.o

This version of the Absoft compiler on OS 9 will create a Carbon application by default, which should be executable on both OS 9 and X. Pooch should be able to recognize and launch this application.

Compiling MacMPI_X.c with Metrowerks CodeWarrior Pro 6 for both OS 9 and X

There are two major options for using MacMPI_X.c in CodeWarrior: 1. using the Standard C Console window; and 2. creating a Macintosh application. In most cases where you are porting an ANSI C-compliant code from another platform, you would probably want to use the Standard C Console. A Macintosh application (the Power Fractal app and Parallel Fractal Demo are examples) would need to know how to organize menus, windows, and so forth.

Standard Console C

If your C source code is standard ANSI C (i.e., this code is multiplatform and uses ANSI C calls like fprintf and scanf), you should start creating your project by selecting the Mac OS C Stationery category. Under the Standard Console > Carbon category, select the “Std C Console Carbon” stationery. Add your source as you normally would. Then add MacMPI_X.c. Be sure mpi.h is in your source directory as well.
There are a few settings that should be set for the best behavior of your executable. In the Target Settings Panel named "PPC Target", edit the 'SIZE' Flags to use localAndRemoteHLEvents. Also in the same panel, be sure to set the Preferred Heap Size so that your code will have enough available memory. In addition, your code will need to edit certain run-time console flags. Before your main() code, add “#include <SIOUX.h>”, and at the top of your main() code add these two lines:

SIOUXSettings.asktosaveonclose=0;
SIOUXSettings.autocloseonquit=1;

This will have the app quit when it falls out of main(). Otherwise, the apps will wait for user input before quitting, locking up the remote machines forever.
In addition, if you don't already have one, you may wish to add a call to printf prior to calling MPI_Init(). It appears to be necessary for proper initialization of the app's runtime environment, without which a crash may occur.

Mac OS Toolbox C

If your C source code is a Macintosh C application (i.e., your C calls Macintosh Toolbox routines exclusively), then create your project using the Mac OS C Stationery category. Under the Mac OS Toolbox category, select the “MacOS Toolbox Carbon” stationery. Add your source as you normally would, then add MacMPI_X.c as well. Since MacMPI relies on ANSI C routines, confirm that the stationery included CodeWarrior's ANSI libraries in your project. Again, in the Target Settings Panel named "PPC Target", edit the 'SIZE' Flags to use localAndRemoteHLEvents. And, when you write your code, be sure to have your remote apps (node ID > 0) quit without direct human interaction. It would also be helpful to have your app's event loop respond correctly to Quit AppleEvents, so Pooch can kill them remotely, if necessary.
Also, you may wish to set the monitor flag of MacMPI_X to 0, which will turn off the MacMPI status window. If you find your event loop and windows conflicting with MacMPI, this change may help.

Compiling MacMPI_X.c with the GNU cc compiler through OS X’s Terminal

In order to use cc, you must install the Mac OS X Developer Tools from the Dev Tools CD. This CD should have accompanied your OS X User Install CD but is also available for download from the Apple Developer web site. Be sure to download MacMPI_X.c and mpi.h into the directory where your source is located. To compile your application, in the case of test.c, from the Terminal window, compile the MacMPI_X.c code and link with the Carbon library and include files using this one-line command:

cc test.c MacMPI_X.c /System/Library/Frameworks/Carbon.framework/Carbon -I /Developer/Headers/FlatCarbon

This command creates a Mach-O executable, which Pooch should recognize and be able to launch.

Compiling MacMPI_X.c in a Cocoa application on OS X

Create a Cocoa application using Project Builder on OS X. Download MacMPI_X.c and mpi.h into your source directory. Include MacMPI_X.c and Carbon.framework in your project. Because MacMPI_X expects to read the nodelist_ip file using fopen, and this file is generally placed in the same directory where the Cocoa bundle application resides, it is necessary to set the default directory to the directory of the application very early in the code (before calling MPI_Init). To do this, add the following code (thanks to Steve Hayman of Apple Canada) to the beginning of main() in main.m:

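// Full path of the application bundle, e.g. /path/to/MyApp.app
NSString *path = [[NSBundle mainBundle] bundlePath];
char cpath[1024];

[path getCString:cpath];
{
    char *lastSlash, *tp;

    // Walk forward to the last '/' in the bundle path.
    for (lastSlash = tp = cpath; (tp = strchr(tp, '/')) != NULL; lastSlash = tp++) ;
    lastSlash[1] = 0; // truncate after it, leaving the parent of the bundle directory
}
chdir(cpath); // fopen can now find nodelist_ip next to the bundle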

When your code calls MPI_Init, MacMPI should then be able to locate the node list information. This executable should be recognized and launched by version 1.2 and later of Pooch.
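The path handling in the snippet above can also be sketched in plain C, which may be handy for checking the logic outside of a Cocoa project. This is only an illustrative sketch, not part of MacMPI or Pooch: the function names truncate_to_parent and file_is_readable are hypothetical, and the sketch assumes the path uses '/' separators with no trailing slash, as a bundle path normally does.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: truncate `path` in place so that it names the
   parent directory, keeping the trailing '/'.
   Returns 0 on success, or -1 (path unchanged) if it contains no '/'. */
int truncate_to_parent(char *path)
{
    char *lastSlash;

    assert(path != NULL);
    lastSlash = strrchr(path, '/');   /* last '/' in the string */
    if (lastSlash == NULL)
        return -1;
    lastSlash[1] = '\0';              /* cut everything after it */
    return 0;
}

/* Hypothetical helper: return 1 if `name` can be opened for reading
   relative to the current working directory, otherwise 0. */
int file_is_readable(const char *name)
{
    FILE *f = fopen(name, "r");

    if (f == NULL)
        return 0;
    fclose(f);
    return 1;
}
```

For example, passing a buffer containing /Applications/MyApp.app to truncate_to_parent leaves /Applications/ in the buffer, which is the directory you would hand to chdir. After the chdir, calling file_is_readable("nodelist_ip") before MPI_Init is one way to confirm that the working directory is the one you intended.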

mpich

mpich is an open source implementation of the MPI standard, and it is used extensively on Linux-based clusters. By 2002, it was ported to Mac OS X. The following information was derived from the mpich documentation.
In order for mpich to work with Pooch, some minor modifications to mpich had to be made. The latest mpich distribution (v1.2.4 as of this writing) came from:

http://www-unix.mcs.anl.gov/mpi/mpich/


Under open-source license, the version of mpich modified for Pooch by Dauger Research, Inc., is available from the Dauger Research web site at:

http://daugerresearch.com/pooch/mpich/


The modification allows Pooch to inform mpich which TCP port it should use. Details on the changes can be found in the Read Me files and source code. This version of mpich should compile precisely as the mpich documentation describes. That is, follow steps 1 through 3 on pages 4 and 5 in mpichman-chp4.pdf.
In summary, cd to the mpich-1.2.4 directory that was created after decompressing and untarring the mpich archive. We recommend extracting the archive to a directory where you have full read-write-execute access, such as your home directory (e.g., /Users/yourusername/). Then run the following command, which sets the install location of the mpich libraries to the mpich-1.2.4 directory in /Users/yourusername/ and detects various features of your computer:

./configure --prefix=/Users/yourusername/mpich-1.2.4 |& tee c.log

(You may use another directory if you wish.) Note: if you are using Absoft’s Fortran 90, you might want to make sure the .a files in /Applications/Absoft/lib/ are up to date (e.g., use ranlib) and add the --disable-f90modules flag when running ./configure:

./configure --prefix=/Users/yourusername/mpich-1.2.4 --disable-f90modules |& tee c.log

Then make mpich using:

make |& tee make.log

This step may take many minutes as it compiles all of mpich. A very large amount of output will be created.
To compile your C code, you may use the mpicc command residing in the bin directory of the mpich library directory:

./bin/mpicc -o test.out test.c

Pooch should recognize the resulting executable. IMPORTANT: Be sure to select “mpich” from the Job Type submenu of the Options… pop-up on the Job Window. Pooch should then be able to launch your mpich executable.
Please note that using this modified mpich with Pooch DOES NOT require:
• NFS
• rsh
• inetd.conf files
• .rhosts files
• mpirun
This combination of mpich and Pooch removes these traditional requirements.

MPI/Pro

MPI/Pro is a commercial implementation of the MPI standard developed and released by MPI Software Technology, Inc., whose web site is:

http://www.mpi-softtech.com/


MPI/Pro is a trademark of MSTI. Some of the people there have been involved with the MPI standard since its beginning. They provide a commercial impetus to the performance and quality of their MPI implementation.
To install MPI/Pro, you may use its installer package. No special modification or configuration is needed for MPI/Pro to work with Pooch. Also, configuring inetd.conf or .rhosts files is not necessary when using MPI/Pro with Pooch. Once you have finished installing MPI/Pro, you may compile your C code using the following command:

cc -o test.out test.c -lmpipro -lpthread -lm

Pooch should recognize the resulting executable. IMPORTANT: Be sure to select “MPI/Pro” from the Job Options pop-up on the Job Window. Pooch should then be able to launch your executable. (Thanks go to Bobby Hunter of MSTI for his insight into MPI/Pro.) Again, using NFS, rsh, inetd.conf files, .rhosts files, or mpirun is not necessary.

mpich-gm

mpich-gm is a version of ANL’s mpich modified to use Myricom’s Myrinet hardware interface. The Mac OS X version of mpich-gm is available here:

http://www.myri.com/scs/


Myrinet is a trademark of Myricom. No modification of their distribution was required.
After installing the gm libraries and drivers on all nodes with Myrinet hardware, you may use the following configure line to install mpich-gm:

./configure --with-device=ch_gm -prefix=/dir/for/gm --enable-sharedlib

Then run the usual “make” and “make install” commands. Configuring host files is not necessary when using mpich-gm with Pooch. Be sure that all dylibs, or dynamic libraries, created by this mpich are deleted. With these dylibs removed, you need to install mpich-gm only on the machine you use to compile your code. Once you have finished installing mpich-gm, you may compile your C code using its mpicc command:

mpicc -o test.out test.c

Pooch should recognize the resulting executable. IMPORTANT: Be sure to select “mpich-gm” from the Job Type option in the Job Window. Pooch should then be able to launch your executable on a Mac cluster connected using Myrinet hardware. Many thanks go to Prof. John Huelsenbeck of UCSD for his support and help. As before, using shared storage (NFS, etc.), ssh, rsh, inetd.conf files, .rhosts files, or mpirun is not necessary.

LAM/MPI

LAM/MPI is an open-source MPI implementation created and supported at the Pervasive Technology Labs at Indiana University. The original Mac OS X version of LAM is available here:

http://www.lam-mpi.org/


With the help of Dr. Jeff Squyres there, we have produced a modified version of LAM that operates with Pooch. It is available here:

http://daugerresearch.com/pooch/lam/


To install this LAM, you may use the following configure line:

./configure --prefix=/usr/local/lamPooch --with-boot=pooch

If you don’t have a Fortran compiler installed, you will need to add --without-fc. If you want a “dual-use” LAM that will operate normally and can also be used with Pooch, remove the --with-boot=pooch argument. After that, use the usual “make” and “make install” commands. This places the LAM binaries in /usr/local/lamPooch, where Pooch can find them. Configuring host files or other static data is not necessary when using this LAM/MPI with Pooch. Because LAM’s run time environment (RTE) executables are needed for LAM to run, you will need to repeat the above LAM installation process on every node of your cluster.
Once you have finished installing LAM, you may compile your C code using its mpicc command:

/usr/local/lamPooch/bin/mpicc -o test.out test.c

Pooch should recognize the resulting executable. IMPORTANT: Be sure to select “LAM/MPI” from the Job Type option in the Job Window. Pooch should then be able to launch your executable on a Mac cluster whose nodes have the above LAM executables installed. As always, using shared storage (NFS, etc.), ssh, rsh, inetd.conf files, .rhosts files, or mpirun is not required.

1 Also operates on OS 9. See the Introduction for system requirements.

2 This contrasts with distributed computing, such as SETI@Home, in which the pieces of its computation are all but completely isolated from one another.

3 As of version 1.2, Pooch can recognize and launch a wide variety of applications, including Classic apps, CFM-based Carbon apps, bundle-based Carbon apps, Cocoa apps, Mach-O executables, Unix scripts, and AppleScript apps. In version 1.7, Pooch supports launching Universal Binaries.

4 Unix permissions will be preserved only when copying between OS X machines.

5 For more about the different types of parallel computing, see http://daugerresearch.com/pooch/parallelzoology.html