Class pjrun

java.lang.Object
  extended by pjrun

public class pjrun
extends Object

Class pjrun is a main program that runs a program, other than a Parallel Java (PJ) program, on a cluster parallel computer using the PJ job queue. (PJ programs interact with the PJ job queue directly and do not need to use the pjrun program.)

Reserving Nodes on the Cluster

Usage: java -Dpj.np=K pjrun
K = Number of nodes

The pjrun program contacts the PJ Job Scheduler Daemon and requests a job running on K nodes of the cluster. The job goes into the job queue and may sit in the job queue for some time until K nodes are available. Once K nodes are available, the pjrun program prints their names on the standard output. For example:

     $ java -Dpj.np=4 pjrun
     thug01
     thug02
     thug03
     thug04
 
You can then do whatever you want with those nodes, such as logging into them and running programs on them. Other PJ jobs will not be assigned those nodes for as long as the pjrun program is running. The pjrun program continues to run until killed externally. To release the assigned nodes, kill the pjrun program, for example by typing CTRL-C.
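
The reservation lifecycle described above (start pjrun, wait for K node names, kill pjrun to release the nodes) could be driven from another Java program roughly as follows. This is only a sketch under stated assumptions: NodeReservation and its methods are hypothetical names, not part of the PJ library. It assumes that pjrun is on the default classpath, that it prints one node name per line as in the example output, and that terminating the process releases the nodes just as CTRL-C does.

```java
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;
import java.util.Scanner;

/** Hypothetical helper, not part of the PJ library. */
final class NodeReservation implements AutoCloseable {
    private final Process pjrun;      // the running pjrun process that holds the nodes
    private final List<String> nodes; // node names printed by pjrun

    private NodeReservation(Process pjrun, List<String> nodes) {
        this.pjrun = pjrun;
        this.nodes = nodes;
    }

    /** Builds the command line from the usage above: java -Dpj.np=K pjrun */
    static List<String> command(int k) {
        return List.of("java", "-Dpj.np=" + k, "pjrun");
    }

    /** Reads k non-blank lines, assumed to hold one node name each. */
    static List<String> readNodeNames(Scanner in, int k) {
        List<String> names = new ArrayList<>();
        while (names.size() < k && in.hasNextLine()) {
            String line = in.nextLine().trim();
            if (!line.isEmpty()) {
                names.add(line);
            }
        }
        if (names.size() < k) {
            throw new IllegalStateException(
                "pjrun output ended after " + names.size() + " of " + k + " node names");
        }
        return names;
    }

    /** Starts pjrun and blocks until it has printed k node names. */
    static NodeReservation reserve(int k) throws IOException {
        Process p = new ProcessBuilder(command(k))
                .redirectError(ProcessBuilder.Redirect.INHERIT)
                .start();
        try {
            List<String> names = readNodeNames(new Scanner(p.getInputStream()), k);
            return new NodeReservation(p, names);
        } catch (RuntimeException e) {
            p.destroy(); // do not leave a half-started pjrun behind
            throw e;
        }
    }

    List<String> nodes() {
        return nodes;
    }

    /** Kills pjrun; per the description above, this releases the nodes. */
    @Override
    public void close() {
        pjrun.destroy();
    }
}
```

A caller could wrap the reservation in a try-with-resources statement, such as try (NodeReservation r = NodeReservation.reserve(4)) { ... }, so that pjrun is killed and the nodes are released when the block exits, even if an exception is thrown.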

You can put a time limit on the pjrun program this way:

Usage: java -Dpj.np=K -Dpj.jobtime=T pjrun
K = Number of nodes
T = Job time (seconds)

In this case, the pjrun program terminates itself automatically after T seconds. You can also kill the pjrun program manually before then.
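
As a small illustration of the time-limited usage, the command line could be assembled like this. PjrunCommand is a hypothetical helper name. The pj.np and pj.jobtime properties come from the usage above, while the checks that K and T are at least 1 are assumptions of this sketch rather than documented requirements.

```java
import java.time.Duration;
import java.util.ArrayList;
import java.util.List;

/** Hypothetical helper, not part of the PJ library. */
final class PjrunCommand {
    private PjrunCommand() {}

    /** Builds: java -Dpj.np=K -Dpj.jobtime=T pjrun, with T in whole seconds. */
    static List<String> withTimeLimit(int nodes, Duration jobTime) {
        // Assumed sanity checks; the documentation does not state valid ranges.
        if (nodes < 1) {
            throw new IllegalArgumentException("nodes must be at least 1");
        }
        long seconds = jobTime.getSeconds();
        if (seconds < 1) {
            throw new IllegalArgumentException("job time must be at least 1 second");
        }
        List<String> cmd = new ArrayList<>();
        cmd.add("java");
        cmd.add("-Dpj.np=" + nodes);
        cmd.add("-Dpj.jobtime=" + seconds);
        cmd.add("pjrun");
        return cmd;
    }
}
```

For instance, PjrunCommand.withTimeLimit(4, Duration.ofMinutes(10)) yields the arguments for java -Dpj.np=4 -Dpj.jobtime=600 pjrun, which can be passed to a ProcessBuilder.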

Example

Here is how to run an MPI program via the PJ job queue in the author's installation. Type this command in one shell:

     $ java -Dpj.np=4 pjrun
     thug01
     thug02
     thug03
     thug04
 
Then type this command in another shell:

     $ mprun -np 4 -l "thug01,thug02,thug03,thug04" foo ...
 
mprun is the MPI launcher program. The -np option tells the MPI launcher to use 4 nodes, the number requested from the PJ job queue. The -l option tells the MPI launcher to use the specific cluster nodes assigned by the PJ job queue. foo is the MPI program to run, followed by its command line arguments.

As long as the pjrun program remains running, the cluster nodes remain reserved for use by MPI (or anything else). MPI jobs running on those nodes will not interfere with PJ jobs running on other nodes, and vice versa. When the MPI jobs are finished, kill the pjrun program.
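
The step from pjrun's output to the mprun command line can also be sketched in code. MprunCommand is a hypothetical helper name. The -np and -l options are taken from the example above, which describes the author's installation; other MPI launchers may use different options.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical helper, not part of the PJ library. */
final class MprunCommand {
    private MprunCommand() {}

    /**
     * Builds: mprun -np N -l "node1,node2,..." program args...
     * where N is the number of reserved nodes.
     */
    static List<String> build(List<String> nodes, String program, List<String> programArgs) {
        if (nodes.isEmpty()) {
            throw new IllegalArgumentException("at least one node is required");
        }
        List<String> cmd = new ArrayList<>();
        cmd.add("mprun");
        cmd.add("-np");
        cmd.add(String.valueOf(nodes.size())); // same count that was requested via pj.np
        cmd.add("-l");
        cmd.add(String.join(",", nodes));      // comma-separated list, no shell quoting needed
        cmd.add(program);
        cmd.addAll(programArgs);
        return cmd;
    }
}
```

Because the result is a list of separate arguments, it can be handed to a ProcessBuilder directly, without the quotes that the shell form needs around the node list.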


Method Summary
static void main(String[] args)
          Main program.
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Method Detail

main

public static void main(String[] args)
                 throws Exception
Main program.

Throws:
Exception


Copyright © 2005-2012 by Alan Kaminsky. All rights reserved. Send comments to ark@cs.rit.edu.