Using Sun GridEngine Scheduler: additional features

The documentation for SGE is in /usr/local/sge/doc. The main administration guide is in PDF format: SGE53AdminUserDoc.pdf. The manual pages are collected in a second PDF, SGE53Ref.pdf. There are also man pages for all SGE commands.

A useful start in administering SGE is the qmon graphical configuration tool.

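SGE is a good deal more flexible than PBS, as it deals directly with hosts rather than with entities like PBS queues. For instance (permissions allowing), a batch job may be submitted either to a job pool or directly to a compute node. In the examples below, the first command submits to the pool and the second targets the queue of a single node: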

  qsub myjob.sh

  qsub -q comp000.q  myjob.sh

Another powerful feature of SGE is the ability to submit "array" jobs. This allows a user to submit a range of jobs with a single qsub. For example:

  qsub  -t 1-100:2  myjob.sh

This will submit 50 tasks (1, 3, 5, 7, ..., 99). Each task learns which one it is from the $SGE_TASK_ID environment variable. For example, a job script might look like:

#!/bin/sh
TASK=$SGE_TASK_ID
# Run my code for input case $TASK and write the results to the
# matching output file.
# Stop if the data directory is missing rather than run elsewhere.
cd /users/nrcb/data || exit 1
DATE=`date`
echo "This is the standard output for task $TASK on $DATE"
/users/nrcb/bin/mycode.exe "input.$TASK" "output.$TASK"

This would enable a user to run the code mycode.exe taking its input from a series of input files input.1, input.3, ..., input.99 and sending the output of each run to the matching output file output.1, output.3, ..., output.99.
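Before submitting a large array job it can help to check which task numbers and file names it will touch. The following is a minimal sketch that can be run by hand, without SGE. The helpers list_tasks and task_files are hypothetical names introduced for illustration; they only mimic the numbering of the -t 1-100:2 example above.

```shell
#!/bin/sh
# Dry run: list the files each task of "qsub -t 1-100:2" would use.
# list_tasks and task_files are illustrative helpers, not SGE commands.

# Print the task IDs for a range FIRST-LAST:STEP, one per line.
list_tasks() {
    first=$1; last=$2; step=$3
    id=$first
    while [ "$id" -le "$last" ]; do
        echo "$id"
        id=$((id + step))
    done
}

# Print the input and output file names for one task ID,
# following the input.N / output.N naming used in the job script.
task_files() {
    echo "input.$1 output.$1"
}

for t in $(list_tasks 1 100 2); do
    task_files "$t"
done
```

Running the script prints one line per task, so a missing input file can be spotted before any job is queued.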

Another feature of SGE is the ability to request a variable resource. For instance, you can give a process range when submitting a parallel job, e.g.:

  qsub  -pe score 5-9 my_pe_job.sh

This would run a parallel job on the maximum number of available CPUs in the range 4-8.
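Because the size of the job is only decided when it starts, the job script may want to find out what it was given. A minimal sketch follows, assuming the standard NSLOTS variable that SGE sets inside a parallel job to the number of slots granted. How slots map to CPUs depends on how the parallel environment is configured on this cluster. The helper slots_granted is a hypothetical name.

```shell
#!/bin/sh
#$ -cwd
# NSLOTS is set by SGE inside a parallel job to the number of slots
# granted. The fallback of 1 only applies when this script is run
# by hand outside the scheduler.
slots_granted() {
    echo "${NSLOTS:-1}"
}

echo "Granted $(slots_granted) slot(s)"
```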

The scalar host queues are comp000.q, ..., comp084.q. These will all accept scalar batch and interactive jobs.

An SGE scalar batch job is very similar to a PBS one. SGE has similar commands to PBS, e.g. qsub, qstat and qdel. Options to qsub may be embedded in the job script on lines beginning with #$. For instance:

#!/bin/sh
# This is a test
#$ -cwd
pwd
hostname
date

The -cwd option runs the job in the directory from which it was submitted. Unlike in PBS, if you omit this option then all standard output and error files are sent to the user's home directory by default. Alternatively, you can give -cwd on the qsub command line instead of embedding it:

  qsub -cwd myjob.sh
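Several options can be embedded in the same script, one per #$ line. The sketch below assumes the standard qsub options -N (job name) and -j y (merge standard error into standard output) alongside -cwd, and the JOB_ID and JOB_NAME variables that SGE sets for a running job. The job name testjob and the helper banner are hypothetical.

```shell
#!/bin/sh
# SGE reads lines that begin with "#$" as qsub options;
# the shell itself treats them as ordinary comments.
# Run the job in the directory it was submitted from.
#$ -cwd
# Name the job (testjob is a made-up name).
#$ -N testjob
# Merge standard error into the standard output file.
#$ -j y

# JOB_NAME and JOB_ID are set by SGE. The fallbacks only apply
# when the script is run by hand outside the scheduler.
banner() {
    echo "job ${JOB_NAME:-testjob} (id ${JOB_ID:-none}) in $(pwd)"
}

banner
uname -n
date
```

Keeping each explanatory comment on its own line, rather than after the option, avoids any risk of the comment text being read as part of the option.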

Another difference between PBS and SGE is that SGE standard error and output files appear in the default location immediately and are not stored in a spool file while a job is running.

All compute nodes are set up to run both scalar and parallel jobs.

Subsections

Parallel jobs : Using automatic job submission with mpisub

2004-06-17