
Running a job in the SCore multi-user environment

NOTE: By default the SCore multi-user environment is not running. If your system is set up to use the SCore multi-user environment, you can share the cluster with other users just as you would a single machine. Your job can be run interactively and will time-share with other jobs using the same cpus. Please consult your system administrator if you are unsure which environment you are running in.

Once you are logged in to the front end server, find the local hostname, if you do not already know it, by running the hostname command. For example:

[nrcb@vmserve nrcb]$ hostname
vmserve.streamline.com

To find out whether the multi-user environment is running, type ``sctop'' followed by the hostname (you may use the short form of the name), e.g.

[nrcb@vmserve nrcb]$  sctop vmserve
[nrcb@vmserve nrcb]$

If you are running in a suitable terminal window, the window should clear and display a parallel ``top'' showing users' running jobs and the free nodes. For example:

Up 4.489[H], 4 Hosts, Load Average: 1.70, 3 Jobs
Host#: 0123
#Jobs: 2222

JID       User    Resource P S        Time           Binary    Command
8 nrcb@vmserve:1790  4x1@0 1 R 20.835[S]/34.185[S] i386-linux  hello
7 nrcb@vmserve:1789  2x1@2 1 R 53.789[S]/75.649[S] i386-linux  mpitest
9 nrcb@vmserve:1791  2x1@0 1 R  92.693[m]/1.448[S] i386-linux  sum

Press Ctrl-C to stop the output. The above shows 3 jobs running simultaneously on a 4 cpu cluster. The ``hello'' code is running on all 4 cpus, while the ``mpitest'' and ``sum'' codes are each running on a separate pair of cpus, so the total number of jobs on each node is 2.
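
The #Jobs line can be cross-checked against the Resource column. The sketch below assumes, based only on the sample output above, that a Resource entry such as 2x1@2 means 2 hosts with 1 process each, starting at host number 2. The function name jobs_per_host is made up for this illustration and is not an SCore tool.

```shell
# Hypothetical helper, not part of SCore: count jobs per host from sctop
# Resource fields, ASSUMING the form HOSTSxPROCS@FIRSTHOST (e.g. 2x1@2).
jobs_per_host() {
    nhosts=$1
    shift
    # One Resource field per input line; awk does the counting.
    printf '%s\n' "$@" | awk -v n="$nhosts" '
        {
            # parts[1] = hosts, parts[2] = procs per host, parts[3] = first host
            split($0, parts, "[x@]")
            first = parts[3] + 0
            last = first + parts[1]
            for (i = first; i < last && i < n; i++) count[i]++
        }
        END {
            # One count per host, in the style of the "#Jobs:" line.
            for (i = 0; i < n; i++) printf "%d", count[i]
            printf "\n"
        }'
}

# The three Resource fields from the sample sctop output on a 4 host cluster.
jobs_per_host 4 4x1@0 2x1@2 2x1@0
```

For the three jobs listed above this prints 2222, which agrees with the #Jobs line in the sample display.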

If you are not running in the SCore parallel environment, or it is temporarily unavailable, you will not see any special output. Press Ctrl-C to get your terminal to respond.
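
The short form of the hostname accepted by sctop can be derived in the shell rather than typed by hand. This is a minimal sketch using POSIX parameter expansion; short_hostname is a name chosen here for illustration only.

```shell
# Illustrative helper, not part of SCore: strip the domain part from a
# fully qualified hostname, e.g. vmserve.streamline.com -> vmserve.
short_hostname() {
    # ${1%%.*} removes the longest suffix starting at the first dot.
    printf '%s\n' "${1%%.*}"
}

# Using the hostname from the example above:
short_hostname vmserve.streamline.com

# On the front end server the result could be passed straight to sctop:
#   sctop "$(short_hostname "$(hostname)")"
```

A name that has no domain part is returned unchanged.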

To run your parallel code in the SCore multi-user environment, use the scrun command. For example:

[nrcb@vmserve pbs]$ scrun -nodes=2 ./hello
SCORE: Connected (jid=10)
<0:0> SCORE: 2 nodes (2x1) ready.
 My process number is  1 of  2procs
 My process number is  0 of  2procs

The command above runs a 2 processor job.

[nrcb@vmserve pbs]$ scrun -nodes=4 ./hello
SCORE: Connected (jid=11)
<0:0> SCORE: 4 nodes (4x1) ready.
 My process number is  1 of  4procs
 My process number is  2 of  4procs
 My process number is  3 of  4procs
 My process number is  0 of  4procs

The command above runs a 4 processor job, and so on.

If you have the Sun Grid Engine job scheduler running on other compute nodes, you also need to add the scored option. For example, if on your system the multi-user scored is running on comp35, you will need to use:

[nrcb@ehtpx-cluster]$ scrun -nodes=4,scored=comp35  ./hello
SCORE: Connected (jid=11)
<0:0> SCORE: 4 nodes (4x1) ready.
 My process number is  1 of  4procs
 My process number is  2 of  4procs
 My process number is  3 of  4procs
 My process number is  0 of  4procs

Under SCore you cannot run a parallel job using more processes than there are cpus. For example, if you have a 32 cpu system, attempting to use more than 32 processes will simply result in an error and the job will be aborted:

[nrcb@ehtpx-cluster]$ scrun -nodes=64,scored=comp35  ./hello
FEP:ERROR SCore-D Login failed: Resource unavailable.
[nrcb@ehtpx-cluster]$
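
A job script can check the requested process count before calling scrun, so that an oversized request is caught with a clearer message. The sketch below is an assumption-laden illustration: nodes_ok is a hypothetical function, and the total cpu count must be supplied by the user, since this guide does not describe a command that reports it.

```shell
# Hypothetical pre-check, not part of SCore. The second argument (total
# cpus) must be supplied by the user; it is not queried from the cluster.
nodes_ok() {
    requested=$1
    total_cpus=$2
    if [ "$requested" -gt "$total_cpus" ]; then
        echo "refusing: $requested processes requested, only $total_cpus cpus" >&2
        return 1
    fi
    return 0
}

# With the 32 cpu system mentioned above: 64 is rejected, 4 passes.
nodes_ok 64 32 || echo "would not call scrun"
nodes_ok 4 32 && echo "would call: scrun -nodes=4 ./hello"
```

The check only compares two numbers. It does not know how many cpus are already in use by other jobs.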

The system is set up to use the most sensible defaults. For instance, if you have a Myrinet network, your job will run over Myrinet by default. You may force your job to run over ethernet using the network option to scrun, e.g.

[nrcb@ehtpx-cluster]$ scrun -nodes=4,scored=comp35,network=ethernet ./hello
SCORE: Connected (jid=12)
<0:0> SCORE: 4 nodes (4x1) ready.
 My process number is  1 of  4procs
 My process number is  2 of  4procs
 My process number is  3 of  4procs
 My process number is  0 of  4procs

For other options that you can add to the scrun command, please see the documentation by pointing your web browser at:

file:/opt/score/doc/html/en/man/index.html

on the front end server, and also at the pccluster.org web site.
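
The three options used in this section (nodes, scored and network) are given as one comma-separated string after a single dash. The sketch below assembles that string from its parts; scrun_opts is a hypothetical helper written for this illustration, and it covers only the options shown in the examples above.

```shell
# Hypothetical helper, not part of SCore: build the scrun option string
# from a node count plus an optional scored host and network name.
scrun_opts() {
    opts="-nodes=$1"
    # Append each optional setting only if a non-empty value was given.
    if [ -n "${2:-}" ]; then opts="$opts,scored=$2"; fi
    if [ -n "${3:-}" ]; then opts="$opts,network=$3"; fi
    printf '%s\n' "$opts"
}

scrun_opts 2
scrun_opts 4 comp35
scrun_opts 4 comp35 ethernet

# To run a job with the assembled options:
#   scrun "$(scrun_opts 4 comp35 ethernet)" ./hello
```

Passing an empty string as the second argument skips the scored option while still allowing a network to be named.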



2004-06-17