Pittsburgh Supercomputing Center 

Advancing the state-of-the-art in high-performance computing,
communications and data analytics.

Greenfield User Guide

 

Running jobs 

Torque, an open source version of the Portable Batch System (PBS), controls all access to Greenfield's compute processors for both batch and interactive jobs. 

To run a job on Greenfield, use the qsub command to submit a job script to the scheduler. A job script consists of PBS directives, comments and executable commands. 

Queue structure

There are two queues on Greenfield: the batch queue and the debug queue. Interactive jobs can run in either queue and the method for doing so is discussed below.

The debug queue is used for debugging runs. It is not to be used for production runs. The maximum walltime for jobs in the debug queue is 30 minutes.

The batch queue is for all production runs. Jobs can use a maximum of 168 hours of walltime.

You determine how much memory your job will be allocated through the value of your core request.

Job scripts

Job scripts consist of PBS directives, comments and executable commands.   The last line of your batch script must end with a newline.

  • PBS directives tell the scheduler what resources the job needs and define settings such as the output file name, which grant will be charged for the job, etc. They begin with #PBS and must appear before any executable statements in the job script.
  • Greenfield resources are allocated to a job based on the number of cores requested. At a minimum, you must include a directive requesting one processor and the number of cores your job needs.  The number of cores must be a multiple of 15.
  • The number of cores you request determines the amount of memory your job will be allocated.  For every 15 cores requested, 750GB of memory is allocated to your job.
  • The first line in the script cannot be a PBS directive. Any PBS directive in the first line is ignored. Generally, the first line identifies which shell should be used for your batch job.
  • Comment lines begin with # and can be anywhere in the job script.
  • Executable commands perform tasks like moving files into place,  setting up a work environment, running your program, and so on.
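The core-to-memory rule above can be checked with a little shell arithmetic. This is only an illustrative sketch (Bourne syntax); the 15-cores-per-processor and 750GB figures come from the list above:

```shell
# Memory allocated scales with the core request:
# every 15 cores (one processor) brings 750GB with it.
cores=30                        # must be a multiple of 15
mem_gb=$(( cores / 15 * 750 ))
echo "${cores} cores -> ${mem_gb}GB"
```

A 30-core request is thus allocated 1500GB; a 45-core request would be allocated 2250GB.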

 

File use in batch jobs

Use the /crucible filesystem for I/O in batch jobs.  Be sure to move files from /crucible to the Data Supercell when the job completes to avoid losing them. 

IO to /tmp

Some applications will write temporary files while running.  The /tmp directories on Greenfield are very small, and can fill up quickly.  Your job will fail if this happens.

To prevent this, set the environment variable TMPDIR to be $SCRATCH_RAMDISK.  $SCRATCH_RAMDISK is automatically defined for you in batch jobs.

For Bourne-type shells, type

export TMPDIR=$SCRATCH_RAMDISK

For C-type shells, type

setenv TMPDIR $SCRATCH_RAMDISK
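A quick way to confirm the redirection took effect is to ask mktemp where it creates files, since mktemp honors TMPDIR. This sketch uses Bourne syntax; note that $SCRATCH_RAMDISK is only defined inside batch jobs, so it falls back to /tmp here purely for illustration:

```shell
# Outside a batch job $SCRATCH_RAMDISK is undefined; fall back to /tmp for illustration
SCRATCH_RAMDISK=${SCRATCH_RAMDISK:-/tmp}
export TMPDIR=$SCRATCH_RAMDISK
# mktemp honors TMPDIR, so the file it reports should live under that directory
mktemp
```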

For Java applications, use this definition instead.  Note the leading underscore:

For Bourne-type shells

export _JAVA_OPTIONS="-Djava.io.tmpdir=$SCRATCH_RAMDISK"

For C-type shells

setenv _JAVA_OPTIONS "-Djava.io.tmpdir=$SCRATCH_RAMDISK"

 

Basic job script

Following is a basic job script with an explanation of the commands; a list of useful PBS directives; and sample scripts for OpenMP, Java, MPI and serial jobs.

 

A simple job might look like this:

#!/bin/csh
#  Request 15 cores 
# Note that this means the job will be allocated 750GB of memory
#PBS -l nodes=1:ppn=15
# Request 5 minutes of cpu time
#PBS -l walltime=5:00
# Rename output file to test.out
#PBS -o test.out
# Combine standard output and error into one file
#PBS -j oe
set echo
# Define where /tmp files will be written
setenv TMPDIR $SCRATCH_RAMDISK
# move to scratch directory
cd /crucible/group-name/user-name
# copy executable to the scratch directory
cp $HOME/myprogram .
# run my executable
./myprogram
#!/bin/csh
This line identifies which shell should be used for your batch job. The first line in a job script cannot be a PBS directive. PBS directives in the first line of a script are ignored.
# Request 15 cores
This is a comment. All lines beginning with # (other than #PBS directives) are comment lines.
#PBS -l nodes=1:ppn=15
This directive is required.
Request 15 cores for your job. The value for nodes must always be 1; the job will fail if it is not.   Because Greenfield processors each contain 15 cores, your core request must be a multiple of 15.   Note that this also means that the job is allocated 750GB of memory.
#PBS -l walltime=5:00
Request the amount of time your job will need. This requests 5 minutes of walltime. Specify the time in the format HH:MM:SS. At most two digits can be used for minutes and seconds. Do not use leading zeroes in your walltime specification.
#PBS -o test.out
Name the output file test.out.
#PBS -j oe 
This combines your .o (standard output) and .e (standard error) output into one file, in this case named test.out. This will make your job easier to debug.
set echo
This command causes your batch output to display each command next to its corresponding output. This makes your job easier to debug. If you are using the Bourne shell or one of its descendants use set -x instead.
setenv TMPDIR $SCRATCH_RAMDISK
Write any temporary files created by the application to $SCRATCH_RAMDISK. Otherwise, your job may fail if the small default temporary space fills up.
cd /crucible/group-name/user-name
Move to your scratch directory before beginning your executable. We recommend this because the scratch filesystem is much larger and much faster than the home directory filesystem.  You must substitute your group name and user name for group-name and user-name in the command.
cp $HOME/myprogram .
Copy your program file (executable) to your scratch directory. Be sure to also copy any input files your job needs to your scratch directory so they are available.

Remember to always copy any files you want to keep out of your scratch space as soon as possible after your job finishes.

 

This table lists commonly used PBS directives.  Type man qsub for a complete list.

 
Function Directive Description
Request cores and processors for the job -l nodes=1:ppn=cores You must request the number of processors and cores that your job requires.  Always request 1 processor with 'nodes=1' or your job will fail.  Because there are 15 cores per processor, the number of cores (ppn) must be a multiple of 15.
Request time for the job -l walltime=HH:MM:SS Specify the time in the format HH:MM:SS.

You can omit the hours, so -l walltime=5:00 requests 5 minutes. At most two digits can be used for minutes and seconds. Do not use leading zeroes in your walltime specification.
Rename the standard output -o filename Specify an output filename. By default the standard output is named script-name.ojob-id
Combine the standard output and error into one file -j oe This combines your .o and .e output into one file, in this case your .o file. This makes your job easier to debug.

Your stdout and stderr files are each limited in size if they are not redirected to a file. If your job exceeds either limit it will be killed by the system. If you have a program that you think will exceed either of these limits you should redirect either your stdout or stderr output or both to a scratch file on /crucible. Moreover, unless you redirect your stdout and stderr output you cannot see it until your job ends.
Specify the queue -q queue-name Choose which queue to submit the job to. The default is the batch queue.
Get notified by mail when a job starts, ends or is aborted. Report memory use of the job in the email. -m a|b|e|n Defines the conditions under which a mail message will be sent about a job. If "a", mail is sent when the job is aborted by the system. If "b", mail is sent when the job begins execution. If "e", mail is sent when the job ends. If "n", no mail is sent. This is the default.

The email will list the amount of memory which the job used. This data is very useful for estimating how many cores to request, since memory allocation is tied to the number of cores a job requests.
Define which users get email notification -M userlist Specifies the users to receive mail about the job. Userlist is a comma-separated list of email addresses. If omitted, it defaults to the user submitting the job. You should specify your full email address when using the -M option.
Restart the job automatically -r y|n Indicates whether or not a job should be automatically restarted if it fails due to a system problem. The default is to not restart the job. Note that a job which fails because of a problem in the job itself will not be restarted.
Specify the group to charge the job to -W group_list=charge_id Indicates to which charge_id you want a job to be charged. If you only have one grant on greenfield you do not need to use this option; otherwise, you should charge each job to the appropriate grant.

You can see your valid charge_ids by typing groups at the greenfield prompt. Typical output will look like
sy2be6n ec3l53p eb3267p jb3l60q
Your default charge_id is the first group in the list; in this example "sy2be6n". If you do not specify -W group_list for your job, this is the grant that will be charged.

If you want to switch your default charge_id, send email to PSC support.
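Because the default charge_id is simply the first entry that groups prints, you can capture it in a script. A sketch using standard awk (the variable name is arbitrary):

```shell
# The first group listed by `groups` is the default charge_id
default_charge_id=$(groups | awk '{print $1}')
echo "$default_charge_id"
```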
Define job dependencies -W depend=dependency:jobid Specifies how the execution of this job depends on the status of other jobs. Some values for dependency are:
after this job can be scheduled after job jobid begins execution.
afterok this job can be scheduled after job jobid finishes successfully.
afternotok this job can be scheduled after job jobid finishes unsuccessfully.
afterany this job can be scheduled after job jobid finishes in any state.
before this job must begin execution before job jobid can be scheduled.
beforeok this job must finish successfully before job jobid begins
beforenotok this job must finish unsuccessfully before job jobid begins
beforeany this job must finish in any state before job jobid begins

Specifying "before" dependencies requires that job jobid be submitted with -W depend=on:count. See the man page for details on this and other dependencies.

Make environment variables available in the batch job -v variable_list This option exports the environment variables named in the variable_list to the environment of your batch job. The -V option exports all your environment variables.

Sample scripts for several types of jobs are given below.

 

A sample job script to run an OpenMP program is

 

#!/bin/csh
#  Request 30 cores  
#PBS -l nodes=1:ppn=30
# Request 5 minutes of cpu time 
#PBS -l walltime=5:00
#  Name the output file test.out
#PBS -o test.out
# Combine standard output and error into one file  
#PBS -j oe
set echo 
# Define where /tmp files will be written
setenv TMPDIR $SCRATCH_RAMDISK
# move to scratch directory 
cd  /crucible/group-name/user-name
#  copy executables to the scratch directory
cp $HOME/myopenmp1 . 
cp $HOME/myopenmp2 . 
#  run my executable 
setenv OMP_NUM_THREADS 15
numactl -N +0 -m +0 ./myopenmp1 &
numactl -N +1 -m +1 ./myopenmp2 &
wait

The OpenMP example follows the basic job script but it requests 30 cores and runs two executables.

Additional commands for this job are:

setenv OMP_NUM_THREADS 15
The environment variable OMP_NUM_THREADS defines the number of threads (cores) your job can use. This command sets OMP_NUM_THREADS to 15.
numactl -N +0 -m +0 ./myopenmp1
This command runs your executable. The numactl command ensures that the threads for this executable do not migrate across the nodes assigned to the job. In the case that any process exceeds the memory of the node it is assigned to, this command also prevents that process from using memory from the other node.

Note that this command does not prevent your threads from migrating across the cores in one node. If it is important to prevent that, we recommend that you do not pack multiple executables into one job. Use these commands in your job:

If you used the GNU compilers:

setenv OMP_NUM_THREADS 30
setenv GOMP_CPU_AFFINITY 0-29
./myopenmp

If you used the Intel compilers:

setenv OMP_NUM_THREADS 30
setenv KMP_AFFINITY compact
./myopenmp

 

 

A sample script to run an MPI program is:

 

#!/bin/csh
#  Request 15 cores  
#PBS -l nodes=1:ppn=15
#  Request 5 minutes of cpu time
#PBS -l walltime=5:00
#  Name the output file test.out
#PBS -o test.out
#  Combine standard output and error into one file 
#PBS -j oe
set echo
# Define where /tmp files will be written
setenv TMPDIR $SCRATCH_RAMDISK
# load the MPI  module
module load openmpi_x86-64
# move to scratch directory 
cd  /crucible/group-name/user-name
#  copy executable to the scratch directory
cp $HOME/mympi .
#  run my executable
mpirun -np $PBS_NP ./mympi

This script follows the basic script but includes these lines:

module load openmpi_x86-64
This makes the mpirun command available in your job.
mpirun -np $PBS_NP ./mympi
You must use the mpirun command to launch your executable on Greenfield's compute processors. The value for the -np option is the number of your MPI tasks. You should normally set -np to $PBS_NP. This will run each of your MPI tasks on its own core.

 

 

A sample script to run a Java program is:

 

#!/bin/csh
#  Request 15 cores  
#PBS -l nodes=1:ppn=15
#  Request 5 minutes of cpu time
#PBS -l walltime=5:00
#  Name the output file test.out
#PBS -o test.out
#  Combine standard output and error into one file 
#PBS -j oe 
set echo
#  Define where /tmp files will be written
setenv _JAVA_OPTIONS "-Djava.io.tmpdir=$SCRATCH_RAMDISK"
# move to scratch directory 
cd  /crucible/group-name/user-name
#  copy executable to the scratch directory
cp $HOME/MyJavaApp.class .
module load java
#  run my executable
java -XX:ParallelGCThreads=15 MyJavaApp

The Java example follows the basic example script but also includes these lines:

setenv _JAVA_OPTIONS "-Djava.io.tmpdir=$SCRATCH_RAMDISK"
Defines where temporary files will be written. Otherwise, your job may fail if the small default temporary space is filled.
module load java
This command loads the java module, which defines some necessary paths to Java binaries.
java -XX:ParallelGCThreads=15 MyJavaApp
This command calls the Java interpreter to run your program.

The ParallelGCThreads option specifies the number of Java threads to generate for the purposes of garbage collection. Normally, for performance reasons, you will generate one thread per core, although this is application dependent. In this example you are asking for 15 threads, which will give one thread per core.  You should always set ParallelGCThreads to at least 2.  More information can be found here: http://docs.oracle.com/javase/8/docs/technotes/guides/vm/gctuning/index.html

If you are using the IBM version of Java the option to request 15 threads would be

java -Xgcthreads15 MyJavaApp

The Java system call Runtime.getRuntime().availableProcessors() will always return 4096. To get the correct value for your number of cores you should instead call System.getenv("PBS_NP") and multiply the returned value by two.

 

 

A sample job to run a serial program is:

 

#!/bin/csh
#  Request 15 cores  
#PBS -l nodes=1:ppn=15
#PBS -l walltime=5:00
#  Name the output file test.out
#PBS -o test.out
#PBS -j oe
#PBS -q batch
set echo
# Define where /tmp files will be written
setenv TMPDIR $SCRATCH_RAMDISK
# move to scratch directory 
cd  /crucible/group-name/user-name
#  copy executable to the scratch directory
cp $HOME/myserial .
#  run my executable
./myserial

To run a serial program, give the name of your program as a command in your batch script. Use the nodes and ppn parameters to ask for the number of cores you need so that enough memory is allocated to your job for your program to run. Your serial program will have access to all the memory in all the cores allocated to your job.

 

Packing serial jobs

Running many small jobs places a great burden on the scheduler and is probably inconvenient for you. An alternative is to pack many executions into a single job, which you then submit to PBS with a single qsub command. The basic method to pack jobs is to run each program execution in the background and place a wait command after all your executions.

To pack serial executables, use the following in place of the command to run your single executable.

numactl -C +0 ./myserial0 &
numactl -C +1 ./myserial1 &
numactl -C +2 ./myserial2 &
numactl -C +3 ./myserial3 &
numactl -C +4 ./myserial4 &
numactl -C +5 ./myserial5 &
numactl -C +6 ./myserial6 &
numactl -C +7 ./myserial7 &
numactl -C +8 ./myserial8 &
numactl -C +9 ./myserial9 &
numactl -C +10 ./myserial10 &
numactl -C +11 ./myserial11 &
numactl -C +12 ./myserial12 &
numactl -C +13 ./myserial13 &
numactl -C +14 ./myserial14 &
wait

The ampersand at the end of each line runs the executable in the background. All executables will run concurrently.
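Rather than writing the fifteen launches out by hand, you can generate them with a Bourne-shell loop. This sketch only prints the commands so you can inspect them before pasting them into a job script (or replacing echo with eval to run them):

```shell
# Print one numactl launch per core, then the closing wait
i=0
while [ "$i" -lt 15 ]; do
    echo "numactl -C +$i ./myserial$i &"
    i=$(( i + 1 ))
done
echo "wait"
```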

 

Qsub command

After you create your batch script, submit it to PBS with the qsub command.

qsub myscript.job

Your batch output--your .o and .e files--is returned to the directory from which you issued the qsub command after your job finishes.

 

Interactive access

A form of interactive access is available on greenfield by using the -I option to qsub. For example, the command

qsub -I -l nodes=1:ppn=15 -l walltime=5:00 -q debug

requests interactive access to 15 cores for 5 minutes in the debug queue. Your qsub -I request will wait until it can be satisfied. If you want to cancel your request you should type ^C.

When you get your shell prompt back your interactive job is ready to start. At this point any commands you enter will be run as if you had entered them in a batch script. Stdin, stdout, and stderr are connected to your terminal. To run an MPI or hybrid program you must use the mpirun command just as you would in a batch script.

When you finish your interactive session type ^D. When you use qsub -I you are charged for the entire time you hold your processors whether you are computing or not. Thus, as soon as you are done executing commands you should type ^D.

 

X11 connections in interactive use

In order to use any X11 tool, you must also include -X on the qsub command line:

qsub -X -I -l nodes=1:ppn=15 -l walltime=5:00

This assumes that the DISPLAY variable is set. Two ways in which DISPLAY is automatically set for you are:

  1. Connecting to greenfield with ssh -X greenfield.psc.xsede.org
  2. Enabling X11 tunneling in your Windows ssh tool

 

Monitoring and Killing Jobs

The qstat and pbsnodes commands provide information about jobs and queues. The qdel command is used to kill a job.

qstat

The qstat command requests the status of jobs or queues.  Options include:

-a
Displays the status of the queues, including  running and queued jobs. For each job it shows the amount of walltime and the number of cores and processors requested.  For running jobs it shows the amount of walltime the job has already used.
-s
Includes comments provided by the batch administrator or scheduler.
-f
Provides a full status display
-u username
Displays all running or queued jobs belonging to user username

pbsnodes

The pbsnodes -a command shows details about the nodes.

qdel

The qdel command is used to kill queued and running jobs. For example,

qdel 54

The argument to qdel is the jobid of the job you want to kill.  The jobid is displayed when the job is submitted.  You can also find the jobid with qstat -u username. If you cannot kill a job you want to kill, send email to PSC support.

 

Debugging

Strategy

Your first few runs should be on a small version of your problem, not your largest problem size. It is easier to solve code problems if you are using fewer processors.  Do this even if you are porting a working code from another system.

Estimating memory requirements

Use the -m PBS directive to report the memory used by a debugging job.  Use this figure to estimate the memory requirements of production runs.

 

Debuggers

The gdb debugger is available on Greenfield.  See man gdb for complete details.  Send email to PSC support for more information.

 

Compiler options

Several compiler options can be useful to you when you are debugging your program.   The -g option to the  GNU compilers produces more informative error messages. For example, you will probably be given the line number of the source code statement that caused the failure. Once you have a production version of your code, you should not use the -g option or your program will run slower.

Variables on Greenfield are not automatically initialized. This can cause your program to fail if it relies on variables being initialized. The -Wall and -O options to the GNU compilers will catch certain cases of uninitialized variables.
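As an illustration of the last point, compiling a deliberately uninitialized read with -Wall -O makes the GNU compiler flag it. A sketch; uninit.c is a throwaway file name used only for this demonstration:

```shell
# Create a tiny program that reads an uninitialized variable
cat > uninit.c <<'EOF'
#include <stdio.h>
int main(void) {
    int x;                     /* never initialized */
    printf("%d\n", x);
    return 0;
}
EOF
# -g keeps debugging symbols; -Wall with -O enables the uninitialized-use warning
gcc -g -Wall -O -o uninit uninit.c 2> warnings.txt
grep -c uninitialized warnings.txt
```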

There are more options to the GNU compilers that may assist you in your debugging. For more information see the appropriate man pages.

 

