To do this you must first write a simple script file, because you cannot submit an executable directly to SGE. This is best explained using an example. Suppose you have compiled an MPI binary called mpitest (see section 8.2 for instructions on how to compile) and want to run it on 4 CPUs. You will then need to write a script called, say, score.sh whose contents are:
#!/bin/bash
#$ -masterq ehtpx-cluster.q -cwd -V
scout -wait -F $HOME/.score/ndfile.$JOB_ID -e /tmp/scrun.$JOB_ID \
  -nodes=$((NSLOTS-1))x1 /users/nrcb/mpi/mpitest
Here the variables $JOB_ID and $NSLOTS are defined by SGE when your job runs, so you must not define values for these yourself. Next you must submit the script file to the batch system using the "score" parallel queue thus:
[nrcb@ehtpx-cluster]$ qsub -pe score 5 ./score.sh
your job 38 ("score.sh") has been submitted
[nrcb@ehtpx-cluster]$ qstat
job-ID  prior  name      user  state  submit/start at      queue            master  ja-task-ID
----------------------------------------------------------------------------------------------
38      0      score.sh  nrcb  t      08/14/2002 09:55:28  comp000.q        SLAVE
38      0      score.sh  nrcb  t      08/14/2002 09:55:28  comp001.q        SLAVE
38      0      score.sh  nrcb  t      08/14/2002 09:55:28  comp002.q        SLAVE
38      0      score.sh  nrcb  t      08/14/2002 09:55:28  comp003.q        SLAVE
38      0      score.sh  nrcb  t      08/14/2002 09:55:28  ehtpx-cluster.q  MASTER
Because SCore always spawns MPI jobs from the front end server, you
need to include an extra "slot" to account for this. So to run a parallel
job on 4 compute nodes you need to request 5 slots: 4 slots for the
parallel execution and 1 slot for the spawning process. In this case the front
end server is called ehtpx-cluster. Options to qsub may be embedded in the job
script after #$ at the beginning of a job script line.
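The slot arithmetic described above can be sketched in shell. This is only an illustration: the variable names compute_nodes, slots and granted_slots are made up here and are not part of SGE or SCore, while the qsub command line and the -nodes expression follow the example in this section. The qsub command is printed rather than executed.

```shell
#!/bin/bash
# Hypothetical helper variables, not defined by SGE or SCore.
compute_nodes=4                   # nodes that will run the MPI processes
slots=$((compute_nodes + 1))      # one extra slot for the spawning process

# The submission command for this node count (printed, not executed).
echo "qsub -pe score ${slots} ./score.sh"

# Inside the job, SGE sets NSLOTS to the number of slots granted.
# granted_slots stands in for NSLOTS here, since NSLOTS must not be set by hand.
granted_slots=${slots}
echo "-nodes=$((granted_slots - 1))x1"
```

With compute_nodes set to 4 this prints the qsub line used earlier in this section, followed by -nodes=4x1, which is the value the job script passes to scout.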
The SGE qsub option -masterq ehtpx-cluster.q in the job script
refers to the spawning queue on the front end server. Change the name
ehtpx-cluster to whatever the name of your front end server is. The SGE qsub
option -cwd runs the job from the directory you were in when you submitted it.
The SGE qsub option -V carries all your currently set environment variables
over when the job executes.
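As a rough illustration of embedded options, the sketch below writes a throwaway job script with one option per #$ line and then lists those lines with grep. The temporary file and the variable names script_file and embedded_options are assumptions made for this example; the options themselves are the three described above. Nothing is submitted to SGE.

```shell
#!/bin/bash
# Write a throwaway job script; the body is a placeholder, not a real job.
script_file=$(mktemp)
cat > "$script_file" <<'EOF'
#!/bin/bash
#$ -masterq ehtpx-cluster.q
#$ -cwd
#$ -V
echo "job body goes here"
EOF

# Lines starting with "#$" carry qsub options; strip the prefix to show them.
embedded_options=$(grep '^#\$' "$script_file" | sed 's/^#\$ *//')
echo "$embedded_options"

rm -f "$script_file"
```

The same three options could instead be given on the qsub command line; embedding them keeps them together with the script.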
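The qstat command can be used to query the job. The output above shows that the
job has been dispatched to the four compute nodes and the front end server; the
state t means that it is being transferred to those hosts and is about to start
running.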
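The qstat listing can also be summarised with standard tools. The sketch below does not call qstat; it works on the five rows from the listing above, copied into a shell variable. The column positions (queue in field 8, master or slave in field 9) are an assumption based on that listing and may differ on other SGE versions.

```shell
#!/bin/bash
# Sample rows copied from the qstat listing for job 38.
qstat_rows='38 0 score.sh nrcb t 08/14/2002 09:55:28 comp000.q SLAVE
38 0 score.sh nrcb t 08/14/2002 09:55:28 comp001.q SLAVE
38 0 score.sh nrcb t 08/14/2002 09:55:28 comp002.q SLAVE
38 0 score.sh nrcb t 08/14/2002 09:55:28 comp003.q SLAVE
38 0 score.sh nrcb t 08/14/2002 09:55:28 ehtpx-cluster.q MASTER'

# Count the SLAVE rows of job 38.
slave_count=$(printf '%s\n' "$qstat_rows" |
  awk '$1 == 38 && $9 == "SLAVE" { n++ } END { print n + 0 }')

# Find the queue that holds the MASTER (spawning) task.
master_queue=$(printf '%s\n' "$qstat_rows" |
  awk '$1 == 38 && $9 == "MASTER" { print $8 }')

echo "job 38: ${slave_count} slave tasks, master on ${master_queue}"
```

Four slave tasks plus one master task match the 5 slots requested with qsub -pe score 5.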
If the job normally prints to the screen (standard output) then this output will
be directed to a file named after the job script, with .o and the job id number
appended, e.g. score.sh.o38 in the above example. Similarly, any errors will be
sent to score.sh.e38. By default these files are written to the user's home
directory; because the example script uses -cwd, they appear in the directory
the job was submitted from.
For a parallel job 2 additional files are generated, holding the output and
the errors from the parallel environment's job start and stop scripts. In this
example the files would be score.sh.po38 and score.sh.pe38 respectively.
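The naming scheme for these files can be sketched as follows, using the job name score.sh and job id 38 from the submission earlier in this section. The variable names are made up for the illustration; the suffixes o, e, po and pe are the ones described above.

```shell
#!/bin/bash
# Job name and id as reported by qsub in the example submission.
job_name="score.sh"
job_id=38

# Build the four expected file names: job output, job errors,
# and the two extra files produced for a parallel job.
stdout_file="${job_name}.o${job_id}"
stderr_file="${job_name}.e${job_id}"
pe_stdout_file="${job_name}.po${job_id}"
pe_stderr_file="${job_name}.pe${job_id}"

printf '%s\n' "$stdout_file" "$stderr_file" "$pe_stdout_file" "$pe_stderr_file"
```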
The status of a submitted job may be queried with the qstat command. A user may list only his or her own jobs by using the -u option to qstat, e.g. qstat -u bloggs. See man qstat for full details of qstat options.
SGE can be set up so that it emails you when the job is complete. Please see the man page for qsub or ask your administrator.
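As a hedged sketch of one way to request mail: the qsub man page describes -m e (send mail when the job ends) and -M (the address to send to). The address below is a placeholder, and whether mail is actually delivered depends on how your site is set up. The command is built and printed here, not executed.

```shell
#!/bin/bash
# Placeholder address, to be replaced with your own.
mail_to="user@example.com"

# -m e asks for mail at the end of the job, -M names the recipient.
qsub_cmd="qsub -pe score 5 -m e -M ${mail_to} ./score.sh"
echo "$qsub_cmd"
```

The same two options could instead be embedded in the job script on a line beginning with #$.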