What is a PBS script?

A PBS script is a shell script that tells the PBS client how to execute your job. At its simplest, it consists of a set of PBS commands (comment lines beginning with #PBS), followed by the shell commands you wish to run, followed by an "exit" command.

The very least you should know

The Cluster Resources web page has full documentation for Torque, including several examples of how to use PBS scripts to submit jobs. The most important are reproduced here. The Torque documentation may be found here.

The basics of a PBS script

Just like a shell script, a PBS script must begin with:

#!/bin/bash

Any shell supported by the system (sh, bash, csh, ksh) will do. Then, at the very least, you should give your job a name (if you do not, Torque falls back to the name of the script file):

#PBS -N /jobname/

Usefully, the job name that you use can later be referenced by the environment variable $PBS_JOBNAME. Once the shell and job name are defined, enter the shell commands you wish to execute, followed by "exit":

/shell command 1/
/shell command 2/
...
exit 0


That's it! You're ready to run jobs on Solomon!
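
As a concrete sketch of the recipe above, the following writes a minimal job script and dry-runs it with bash before it is ever submitted. The file name hello.pbs, the job name hello, and the fallback value for $PBS_JOBNAME are illustrative assumptions, not anything Solomon requires.

```shell
# Create a scratch directory so the example does not clutter the current one.
demo_dir=$(mktemp -d)

# Write the job script. The quoted 'EOF' keeps the $ signs literal in the file.
cat > "$demo_dir/hello.pbs" <<'EOF'
#!/bin/bash
#PBS -N hello
# Under PBS, PBS_JOBNAME holds the name given with -N.
# The fallback after :- is only there so the script can be dry-run outside PBS.
echo "Job name: ${PBS_JOBNAME:-hello}"
exit 0
EOF

# Dry run with PBS_JOBNAME cleared; to bash, the #PBS line is an ordinary comment.
output=$(PBS_JOBNAME= bash "$demo_dir/hello.pbs")
status=$?
echo "$output (exit status $status)"
```

On the cluster, the same file would be submitted with qsub hello.pbs.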

More advanced usage of the PBS script

In order to exercise control over your job, you may use the following PBS flags in your PBS script. The first and most important flags are the error and output targets, which can be designated by:

#PBS -e /errorfile/
#PBS -o /outputfile/


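Anything your script writes to standard error, including any errors encountered during its execution, is written to /errorfile/; anything it writes to standard output goes to /outputfile/. If your commands write their results to files of their own, /outputfile/ should be empty, unless your job was terminated for some reason (exceeded walltime, exceeded CPU time, was terminated by the administrator ...).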
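
In Torque, -e names the file that receives the job's standard error and -o the file that receives its standard output. A rough local imitation of that split, using plain shell redirection in place of PBS, might look like the sketch below. The file names myjob.out and myjob.err and the function fake_job are hypothetical.

```shell
demo_dir=$(mktemp -d)

# Stand-in for a job: one line to standard output, one to standard error.
fake_job() {
    echo "result: 42"
    echo "warning: something went wrong" >&2
}

# PBS performs this redirection for you based on -o and -e; here it is done by hand.
fake_job > "$demo_dir/myjob.out" 2> "$demo_dir/myjob.err"

out_text=$(cat "$demo_dir/myjob.out")
err_text=$(cat "$demo_dir/myjob.err")
echo "outputfile contains: $out_text"
echo "errorfile contains:  $err_text"
```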

The -m flag tells Torque to send you an e-mail when the job [b]egins, [e]nds, or [a]borts. Give any combination of the letters a, b, and e, without the brackets (for example, -m ae):

#PBS -m [abe]

Using -l allows you to control more carefully the resources that your job uses. The most important of these are nodes and walltime. nodes controls what type of nodes, and how many, are used. For instance:

#PBS -l nodes=2

specifies that two nodes are to be used. Using -l nodes=/nodename/ specifies that only a particular node may be used. Several options may be strung together with a +, as in:

#PBS -l nodes=2+node4

The nodes keyword has additional functionality by which nodes may be sorted and selected, but, as the nodes on Solomon are identical, this is not useful here. walltime sets the maximum elapsed time that a job may take on the compute nodes; the default is one hour. A walltime of two hours is requested by:

#PBS -l walltime=2:00:00
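
When run lengths are estimated in seconds, a small helper can format them for the walltime request. This is only a convenience sketch; to_walltime is a hypothetical function name and is not part of PBS or Torque.

```shell
# Convert a number of seconds into the HH:MM:SS form used by -l walltime.
to_walltime() {
    local total=$1
    printf '%02d:%02d:%02d\n' $((total / 3600)) $((total % 3600 / 60)) $((total % 60))
}

two_hours=$(to_walltime 7200)
echo "#PBS -l walltime=$two_hours"
```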

Last but not least, the -V flag exports the environment variables of the shell you submit from to your job, so that your script can use important things like your path and the $SANDIA directory.
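
Putting the flags from this section together, a job header might be assembled as in the sketch below. The job name, file names, node count, and walltime are placeholder values, and pbs_header is a hypothetical helper written for this example only.

```shell
# Print a PBS header for the given job name, node count, and walltime.
pbs_header() {
    local name=$1 nodes=$2 walltime=$3
    cat <<EOF
#!/bin/bash
#PBS -N $name
#PBS -e $name.err
#PBS -o $name.out
#PBS -m abe
#PBS -l nodes=$nodes
#PBS -l walltime=$walltime
#PBS -V
EOF
}

header=$(pbs_header myjob 2 2:00:00)
echo "$header"
```

The shell commands of the job would follow the printed header in the script file.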

Using resources effectively and wisely

...depending on what you want

Issues with our previous cluster, Flare, have raised an interesting question regarding cluster task management: "Where is the best place to run my job?" Fortunately, using PBS makes choosing where to run your job somewhat simpler. In the old days, we simply ran jobs by logging in to whichever slave node we happened to find available; we knew our files would be there because all home directories were mounted over the Network File System (NFS). Solomon is set up in the same way, so it is feasible to run a job directly from your home directory. A PBS script written to do so might look something like this:

#!/bin/csh
#PBS -N myjob
#/some list of PBS commands/
premix-d
exit 0


This job would use the cklink file and any other files that already exist in your home directory. This may not be the best way to run jobs, however; among other things, it creates a high network load for processes (such as Premix) that read from and write to disk frequently. For this reason, one might take advantage of shell variables and write a PBS script that looks like this:

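#!/bin/csh
#PBS -N myjob
...
# Work in a scratch directory on the compute node's local disk
set PBS_WORKDIR = /tmp/$PBS_JOBNAME
mkdir -p $PBS_WORKDIR
cp cklink $PBS_WORKDIR
cp tplink $PBS_WORKDIR
cp inp $PBS_WORKDIR
cd $PBS_WORKDIR
premix-d
# Copy the results back to the home directory
cp f.out ~/
cp recover ~/
exit 0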


This script would execute in the directory /tmp/myjob on the node where the job runs. It has the disadvantage that, if your job terminates early, any files will be abandoned in /tmp on the node you were using. It is also possible to use PBS to do interesting things, such as:

#!/bin/csh
...
set PBS_WORKDIR = ~/$PBS_JOBNAME
...


which would place your job's files in an appropriately named directory in your home file system. This should work well, because Solomon is designed for high data throughput to the NFS-mounted drive.
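
One way to soften the drawback of the /tmp approach, where files are abandoned if the job dies, is to have the script remove its own scratch directory on exit. The sketch below is written in bash and is only an illustration under stated assumptions: the plain cp standing in for the solver, the input file contents, and the fallbacks used when the PBS variables are unset are all hypothetical, and $PBS_O_WORKDIR is taken to be the directory from which the job was submitted.

```shell
demo_dir=$(mktemp -d)
echo "hypothetical input" > "$demo_dir/inp"

# Write the job script. The quoted 'EOF' keeps the $ signs literal in the file.
cat > "$demo_dir/myjob.pbs" <<'EOF'
#!/bin/bash
#PBS -N myjob
# Fallbacks after :- let the script be dry-run outside PBS.
job_name="${PBS_JOBNAME:-myjob}"
submit_dir="${PBS_O_WORKDIR:-$PWD}"
work_dir="${TMPDIR:-/tmp}/$job_name.$$"

mkdir -p "$work_dir"
# Remove the scratch directory whenever the script exits, even after a failure.
trap 'rm -rf "$work_dir"' EXIT

cp "$submit_dir/inp" "$work_dir/"
cd "$work_dir" || exit 1
# Stand-in for the real solver (premix-d in the text): copy input to output.
cp inp f.out
# Copy the result back before the scratch directory is removed.
cp f.out "$submit_dir/"
echo "$work_dir"
exit 0
EOF

# Dry run from the directory that holds the input file.
scratch=$(cd "$demo_dir" && bash myjob.pbs)
echo "scratch directory was: $scratch"
```

A job killed with a signal that cannot be caught would still leave its files behind, so this reduces the clutter in /tmp rather than eliminating it.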

Closing Remarks

This should be enough to get you running on PBS. Happy computing!