Pople
- System Configuration
- Access
- Storing Files
- Transferring files
- Creating Programs
- Running Jobs
- Queue structure
- Scheduling policies
- Sample batch jobs
- Qsub command
- Interactive access
- Using the module command in a batch script
- Using the module command in an interactive job
- Monitoring and Killing Jobs
- Debugging
- Improving Performance
- Software Packages
- Pople and the TeraGrid
- Stay Informed
- Reporting a Problem
System Configuration
Hardware
Pople is an SGI Altix 4700 shared-memory NUMA system comprising
192 blades. Each blade holds 2 Itanium2 Montvale 9130M
There are multiple frontend processors, which are also Itanium2 processors and which run the same version of SuSE Linux as the compute processors. You log in to one of these frontend processors, not to the compute processors.
Software
The Intel C, C++ and Fortran compilers and the GNU C and C++ compilers are installed on pople, as are the facilities to enable you to run OpenMP, MPI and hybrid OpenMP and MPI programs.
Access
Connecting
To connect to pople you must ssh to tg-login.pople.psc.teragrid.org or to pople.psc.edu. When you are prompted for a password enter your PSC Kerberos password.
Changing your password
Use the kpasswd command to change your PSC Kerberos password, not the passwd command. You have the same password on all PSC production platforms. If you change your password on one PSC system using kpasswd you change it on all other PSC systems.
You must change your pople password within 30 days of the date on your initial password form or your password will be disabled. We will also disable your password if you do not change it at least once a year. We will send you an email notice warning you that your password is about to be disabled in the latter case. See the PSC password policies for more information. If your password is disabled send email to remarks@psc.edu to have it reset.
Changing your login shell
You can use the chsh command to change your login shell. When doing so, specify a shell from the /usr/psc/shells directory.
Storing Files
File Systems
File systems are file storage spaces directly connected to a system. There are currently two such areas available to you on pople.
- $HOME
This is your home directory. Your $HOME directory has a 5-Gbyte quota. $HOME is visible to all of pople's compute and frontend processors. $HOME is backed up daily, although it is still a good idea to store your important $HOME files to golem. Golem, PSC's file archival system, is discussed below.
- $SCRATCH
This is pople's scratch area to be used as a working space for your running jobs. $SCRATCH is visible to all of pople's compute and frontend processors. You should use the name $SCRATCH to refer to your scratch area since we may change its implementation. $SCRATCH is a parallel file system.
$SCRATCH is not a permanent storage space. Files can only remain on $SCRATCH for up to 7 days and then we will delete them. In addition, we will delete $SCRATCH files if we need to free up space to keep jobs running. Finally, $SCRATCH is not backed up. For these three reasons, you should store copies of your $SCRATCH files to your local site or to golem as soon as you can after you create them. Golem, PSC's file archival system, is discussed below.
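One way to see which $SCRATCH files are getting close to the 7-day limit is to list them by age with find. The sketch below assumes the limit is measured from a file's last modification time, which this document does not state. The function name list_stale_files is only illustrative.

```shell
#!/bin/sh
# Sketch: list regular files under a directory that were last modified
# more than a given number of whole days ago. Using modification time
# to judge a file's age is an assumption, not documented PSC behavior.
list_stale_files() (
    dir=$1
    days=$2
    # find truncates a file's age to whole days, so "-mtime +5"
    # matches files that are at least 6 full days old.
    find "$dir" -type f -mtime +"$days" -print
)

# Example: files in $SCRATCH that should be copied elsewhere soon.
if [ -n "${SCRATCH:-}" ]; then
    list_stale_files "$SCRATCH" 5
fi
```

Any file this reports is a candidate for copying to golem or to your local site before it is deleted.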
File Repositories
File repositories are file storage spaces which are not directly connected to a frontend or compute processor. You cannot, for example, open a file that resides in a file repository. You must use explicit file copy commands to move files to and from a repository. You currently have one file repository available to you on pople: golem, PSC's file archival system.
- golem
Golem is a combination tape-and-disk archival system. The far program should be used to transfer files between golem and pople. You should transfer files between golem and pople outside of your batch jobs. Otherwise your jobs will be holding compute processors while your files are being transferred. You can use scp or kftp to transfer files between golem and your remote machine. If you need to store a file to golem that is 2 Tbytes or larger, first send email to remarks@psc.edu so that special arrangements can be made to store your file.
Transferring Files
You can use either the scp or the kftp program to transfer files between your remote machine and pople and between your remote machine and golem. Which method will perform better varies based on location. Therefore you should try both approaches and see which performs better for you. If you want assistance in improving the performance of your file transfers send email to remarks@psc.edu.
Creating Programs
The Intel C, C++ and Fortran compilers and the GNU C and C++ compilers are installed on pople and they can be used to create OpenMP, MPI, hybrid and serial programs. The commands you should use to create each of these types of programs are shown in the table below.
| | OpenMP | MPI | Hybrid | Serial |
| Intel Fortran | ifort -openmp myopenmp.f | ifort mympi.f -lmpi | ifort -openmp myhybrid.f -lmpi | |
| Intel C | icc -openmp myopenmp.c | icc mympi.c -lmpi | icc -openmp myhybrid.c -lmpi | |
| Intel C++ | | | | |
| GNU C | gcc -fopenmp myopenmp.c | gcc mympi.c -lmpi | gcc -fopenmp myhybrid.c -lmpi | |
| GNU C++ | | | | |
Man pages are available for ifort, icc and icpc and for gcc and g++.
The UPC compiler is installed on pople. Online instructions for its use are available.
A native Java compiler and interpreter are available on pople. Issue the command
module load jrockit/5.0
to get access to the javac and java commands.
Running Jobs
Queue structure
Torque, an open source version of the Portable Batch System (PBS), controls all access to pople's compute processors, for both batch and interactive jobs. Currently pople has two queues: the batch queue and the debug queue. Interactive jobs can run in either queue. The method for doing so is discussed below.
Jobs submitted to the batch queue are actually split by the system into two subqueues: the batch_r and the batch_l queue. Jobs in the batch_r or regular queue have requested walltimes that range up to and including 24 hours. Jobs in the batch_r queue can request up to and including 604 cores.
Jobs in the batch_l or long queue have requested walltimes that range above 24 hours up to and including 168 hours. Jobs in the batch_l queue can ask for at most 32 cores. There are 128 cores reserved for the batch_l queue. The batch_l queue is to be used for long jobs that cannot be checkpointed.
Jobs that request more than 24 hours of walltime will probably wait longer in the queue than jobs that request 24 hours or less of walltime. Thus, you should realistically estimate your walltime needs and not request more than 24 hours of walltime if your job will not need it.
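The batch queue limits above can be collected into a small shell function for checking a request before you submit it. This is only a sketch of the stated limits, not of how the scheduler itself routes jobs. The name batch_subqueue is hypothetical, and the function takes the walltime in minutes to keep the arithmetic in integers.

```shell
#!/bin/sh
# batch_subqueue WALLTIME_MINUTES CORES
# Prints the subqueue whose stated limits the request fits, or "none".
#   batch_r: walltime up to 24 hours (1440 min), at most 604 cores
#   batch_l: walltime above 24 and up to 168 hours (10080 min),
#            at most 32 cores
batch_subqueue() (
    minutes=$1
    cores=$2
    if [ "$minutes" -le 1440 ] && [ "$cores" -le 604 ]; then
        echo batch_r
    elif [ "$minutes" -gt 1440 ] && [ "$minutes" -le 10080 ] \
         && [ "$cores" -le 32 ]; then
        echo batch_l
    else
        echo none
    fi
)

batch_subqueue 300 64     # 5 hours, 64 cores: prints batch_r
batch_subqueue 2880 32    # 48 hours, 32 cores: prints batch_l
batch_subqueue 2880 64    # 48 hours, 64 cores: prints none
```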
The maximum walltime for the debug queue is 30 minutes and the maximum number of cores you can request is 32. The debug queue is not to be used for short production jobs.
We plan to create several other queues to meet user needs. If you would like to make a suggestion about pople's queue structure send email to remarks@psc.edu.
Scheduling policies
The batch and debug queues are basically FIFO queues. However, there are mechanisms in place to prevent a single user from dominating either queue and to prevent idle time on the machine. The result is some deviation from a strictly FIFO scheme. We will modify the scheduling policies on pople to meet user needs. If you have suggestions or comments about the scheduling policies on pople or find that they do not meet your needs send email to remarks@psc.edu.
Sample batch jobs
To run a batch job on pople you submit a batch script to the scheduler. A job script consists of PBS directives, comments and executable commands. The last line of your batch script must end with a newline.
A sample job script to run an OpenMP program is
#!/bin/csh
#PBS -l nodes=1:ppn=4
# nodes must always be 1 and ppn must be a multiple of 4
#PBS -l walltime=5:00
#PBS -j oe
#PBS -q batch
set echo
ja
#move to my $SCRATCH directory
cd $SCRATCH
#copy executable to $SCRATCH
cp $HOME/myopenmp .
#run my executable
setenv OMP_NUM_THREADS 4
./myopenmp
ja -chlst
The first line in the script cannot be a PBS directive. Any PBS directive in the first line is ignored. Here, the first line identifies which shell should be used for your batch job.
The four #PBS lines are PBS directives.
- #PBS -l nodes=1:ppn=4
This directive specifies the number of cores to allocate for the job. For performance reasons the actual allocation of resources is done by blades, with each blade containing four cores. You must request cores in multiples of four. Jobs do not share blades.
In this directive the value of nodes is always '1'. The value of ppn is the number of cores requested. Here we request 4 cores. The number of cores must be a multiple of four, or the job submission will fail. Within your batch script the environment variable PBS_PPN is set to the number of cores you requested.
Each blade has 8 Gbytes of physical memory. If your job exceeds the amount of physical memory available to it (a job requesting 16 cores will run on 4 blades and thus have 32 Gbytes of memory available to it), it will be killed by the system, and a message similar to
PBS: Job killed: cpuset memory_pressure X reached/exceeded limit Y
will be written to its stderr. In this message 'X' and 'Y' are integers.
If this happens to your job you should resubmit it and ask for more cores. The output from the ja command, which is discussed below, can help you determine how many blades your job needs.
- #PBS -l walltime=5:00
The second directive requests 5 minutes of walltime. Specify the time in the format HH:MM:SS. At most two digits can be used for minutes and seconds. Do not use leading zeroes in your walltime specification.
- #PBS -j oe
The next directive combines your .o and .e output into one file, in this case your .o file. This makes your job easier to debug.
Your stdout and stderr files are each limited to 20 Mbytes. If your job exceeds either of these limits it will be killed by the system. If you have a program that you think will exceed either of these limits you should redirect your stdout or stderr output, or both, to a $SCRATCH file. Another option is to run your job from $SCRATCH.
- #PBS -q batch
The final PBS directive requests that your job be run in the batch queue. The system will route your job to the batch_r or batch_l queue based on your resource requests.
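The blade arithmetic described for the nodes directive can be sketched in shell: round a core count up to a multiple of four, and estimate how many cores to request for a given memory need. The sketch assumes that the full 8 Gbytes of each blade is usable by the job, which may be optimistic. Both function names are illustrative.

```shell
#!/bin/sh
# round_up_cores N
# Rounds N up to the next multiple of 4, since cores are allocated by
# blade and each blade has four cores.
round_up_cores() (
    n=$1
    echo $(( (n + 3) / 4 * 4 ))
)

# cores_for_memory GBYTES
# Number of cores to request so that the allocated blades provide at
# least GBYTES of memory, at 8 Gbytes and 4 cores per blade.
cores_for_memory() (
    mem_gb=$1
    blades=$(( (mem_gb + 7) / 8 ))   # integer ceiling of mem_gb / 8
    echo $(( blades * 4 ))
)

round_up_cores 10      # prints 12
cores_for_memory 30    # 4 blades: prints 16
cores_for_memory 33    # 5 blades: prints 20
```

The printed value is what you would use for ppn, for example nodes=1:ppn=16.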
The remaining lines in the script are comments and command lines.
- set echo
This command causes your batch output to display each command next to its corresponding output. This makes your job easier to debug. If you are using the Bourne shell or one of its descendants use
set -x
instead.
- ja
The ja command turns on job accounting for your job. This allows you to obtain information on the elapsed time and memory and IO usage of your program, plus other data.
You must pair the command with another ja command at the end of your job. The option -t to this second ja command turns off job accounting and writes your accounting data to stdout. The other options to the second example ja command determine what output you will receive from ja. We recommend these options because we think they will provide detailed but useful information about your job's processes. However, you can look at the man page for ja to see what reporting options you want to use.
There is no overhead to using ja. We strongly recommend that you use ja so you can understand the resource usage of your jobs, which you can use when you submit future jobs. The output from ja can also be used for debugging and performance improvement purposes.
- Comment lines
The other lines in the sample script that begin with '#' are comment lines. The '#' for comments and PBS directives must be in column one of your scripts.
- setenv OMP_NUM_THREADS 4
This command sets the number of threads for your OpenMP program to use. It is set to 1 by default. You should set this value to the number of cores you requested with your PBS nodes directive so each of your threads will run on its own core.
- ./myopenmp
This command runs your executable.
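For a Bourne shell or bash batch script, the setenv line has a direct counterpart. The sketch below takes the thread count from PBS_PPN, described above, and falls back to 4 when PBS_PPN is unset. The fallback value and the function name omp_threads are assumptions made so that the example also runs outside a batch job.

```shell
#!/bin/sh
# Bourne-shell counterpart of "setenv OMP_NUM_THREADS 4".
# Inside a batch job PBS_PPN holds the number of cores requested.
# The default of 4 is only an illustrative fallback.
omp_threads() {
    echo "${PBS_PPN:-4}"
}

OMP_NUM_THREADS=$(omp_threads)
export OMP_NUM_THREADS
echo "OMP_NUM_THREADS=$OMP_NUM_THREADS"
```

Setting the value from PBS_PPN keeps the thread count in step with the ppn request if you later change the number of cores.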
A sample job to run an MPI program is
#!/bin/csh
#PBS -l nodes=1:ppn=4
# nodes must always be '1' and ppn must be a multiple of 4
#PBS -l walltime=5:00
#PBS -j oe
#PBS -q batch
set echo
ja
#move to my $SCRATCH directory
cd $SCRATCH
#copy executable to $SCRATCH
cp $HOME/mympi .
#run my executable
mpirun -np 4 ./mympi
ja -chlst
This script is identical to the OpenMP script except when you run your executable. You do not have to set the variable OMP_NUM_THREADS, but you have to use the mpirun command to launch your executable on pople's compute processors. The value for the -np option is the number of cores you want your program to run on. You should set -np to the number of cores you requested with your PBS nodes directive. You must use mpirun to run your MPI executable or it will run on a frontend and degrade overall system performance.
A sample job to run a hybrid OpenMP and MPI program is
#!/bin/csh
#PBS -l nodes=1:ppn=64
# nodes must always be '1' and ppn must be a multiple of 4
#PBS -l walltime=5:00
#PBS -j oe
#PBS -q batch
set echo
ja
#move to my $SCRATCH directory
cd $SCRATCH
#copy executable to $SCRATCH
cp $HOME/myhybrid .
#run my executable
setenv OMP_NUM_THREADS 4
omplace -nt 4 mpirun -np 16 ./myhybrid
ja -chlst
This script is identical to the above two scripts except when you run your executable. You use a combination of the mpirun and omplace commands to run your hybrid program. The value of the -np option to the mpirun command is the number of your MPI tasks. The value of the -nt option to the omplace command is the number of your OpenMP threads per MPI task. The value of the -nt option and the value you set OMP_NUM_THREADS to must be the same. The product of these two values should be the total number of cores you requested with your PBS nodes specification.
The omplace command ensures that each of your OpenMP threads runs on its own core. You must use mpirun to run your hybrid executable or it will run on a frontend and degrade overall system performance.
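The relationship between ppn, -np and -nt can be checked with a few lines of shell before you submit a hybrid job. This is a minimal sketch; check_hybrid_layout is a hypothetical helper, not a command installed on pople.

```shell
#!/bin/sh
# check_hybrid_layout PPN NP NT
# Reports the core count implied by NP MPI tasks with NT OpenMP threads
# each, and succeeds only if it equals the requested PPN.
check_hybrid_layout() (
    ppn=$1 np=$2 nt=$3
    total=$(( np * nt ))
    echo "$np MPI tasks x $nt threads = $total cores (requested ppn=$ppn)"
    [ "$total" -eq "$ppn" ]
)

# The values from the sample hybrid script: ppn=64, -np 16, -nt 4.
check_hybrid_layout 64 16 4
```

The function returns a nonzero status on a mismatch, so it can guard a submission, for example check_hybrid_layout 64 16 4 && qsub myscript.job.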
Qsub command
After you create your batch script you submit it to PBS with the qsub command.
qsub myscript.job
Your batch output--your .o and .e files--is returned to the directory from which you issued the qsub command after your job finishes.
You can also specify PBS directives as command-line options. Thus, you could omit the PBS directives from the above sample scripts and submit the scripts with the command
qsub -l nodes=1:ppn=4 -l walltime=5:00 -j oe -q batch myscript.job
Command-line directives override directives in your scripts.
If you are going to run a program that uses X-windows during your job you must add the option -X to your qsub command.
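If you submit the same script with different resource requests, it can help to assemble the qsub command line from a few parameters. The sketch below only prints the command rather than running it, so it can be tried on any machine. The function name qsub_command is illustrative, and the options are the ones shown in the example above.

```shell
#!/bin/sh
# qsub_command CORES WALLTIME QUEUE SCRIPT
# Prints a qsub command line carrying the directives as command-line
# options. Nothing is submitted; the line is written to stdout.
qsub_command() (
    cores=$1
    walltime=$2
    queue=$3
    script=$4
    echo "qsub -l nodes=1:ppn=$cores -l walltime=$walltime -j oe -q $queue $script"
)

qsub_command 4 5:00 batch myscript.job
# prints: qsub -l nodes=1:ppn=4 -l walltime=5:00 -j oe -q batch myscript.job
```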
Interactive access
A form of interactive access is available on pople by using the -I option to qsub. For example, the command
qsub -I -l nodes=1:ppn=4 -l walltime=5:00 -q debug
requests interactive access to 4 cores for 5 minutes in the debug queue. Your qsub -I request will wait until it can be satisfied. If you want to cancel your request you should type ^C.
When you get your shell prompt back your interactive job is ready to start. At this point any commands you enter will be run as if you had entered them in a batch script. Stdin, stdout, and stderr are connected to your terminal. To run an MPI or hybrid program you must use the mpirun command just as you would in a batch script.
When you finish your interactive session type ^D. When you use qsub -I you are charged for the entire time you hold your processors whether you are computing or not. Thus, as soon as you are done executing commands you should type ^D.
X-11 connections in interactive use
In order to use any X-11 tool, you must also include -X on the qsub command line:
qsub -X -I -l nodes=1:ppn=4 -l walltime=5:00 -q debug
This assumes that the DISPLAY variable is set. Two ways in which DISPLAY is automatically set for you are:
- Connecting to pople with ssh -X pople.psc.edu
- Enabling X-11 tunneling in your Windows ssh tool
Totalview, Fluent and TAU are among the packages which require X-11 connections.
Using the module command in a batch script
Depending on your login shell and the shell you use in your batch script you may have to make changes to your batch script if you want to use the module command in your batch script.
If your login shell is csh and your batch script also uses csh, then to use the module command in your batch script you must include the commands
source /usr/share/modules/init/csh
source /etc/csh.cshrc.psc
in your batch script after your PBS specifications. If you use tcsh as your login shell and as your batch shell you must include the commands
source /usr/share/modules/init/tcsh
source /etc/csh.cshrc.psc
in your script.
If your login shell is csh or tcsh and you use sh or bash in your batch script you must start your job with the line
#!/bin/sh -l
or
#!/bin/bash -l
depending on whether you want to use sh or bash in your batch script.
If your login shell is sh or bash and you use csh or tcsh as your batch shell you must include the two source commands in your batch script that were described above in the first case.
If your login shell is sh or bash and you use either sh or bash as your batch shell you do not need to make any changes to your batch scripts.
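The cases above amount to a small decision table, which can be written as a shell function that prints what a batch script needs for a given pair of shells. This is a sketch of the rules as stated here, with module_setup as a hypothetical name. For a csh or tcsh batch shell it is assumed that the init file to source is the one named after the batch shell, including the combinations the text does not spell out.

```shell
#!/bin/sh
# module_setup LOGIN_SHELL BATCH_SHELL
# Prints the lines a batch script needs before it can use the module
# command, following the cases described in this section.
module_setup() {
    ms_login=$1
    ms_batch=$2
    case "$ms_batch" in
        csh|tcsh)
            # Place these after the PBS directives in the script.
            echo "source /usr/share/modules/init/$ms_batch"
            echo "source /etc/csh.cshrc.psc"
            ;;
        sh|bash)
            case "$ms_login" in
                # csh-family login shell: start the script with a
                # login-shell interpreter line.
                csh|tcsh) echo "#!/bin/$ms_batch -l" ;;
                # sh or bash login shell. Other login shells are not
                # covered by the text and are treated the same here.
                *)        echo "no changes needed" ;;
            esac
            ;;
        *)
            echo "unrecognized batch shell: $ms_batch" >&2
            return 1
            ;;
    esac
}

module_setup csh csh
module_setup tcsh bash
```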
Using the module command in an interactive job
You do not need to issue any special commands if you want to use the module command in an interactive session, but you should not switch your shell from your login shell during your interactive session.
Monitoring and Killing Jobs
The qstat -a command displays the status of the queues. It shows running and queued jobs. For each job it shows the amount of walltime and the number of cores and processors requested. For running jobs it shows the amount of walltime the job has already used. The qstat -f command, which takes a jobid as an argument, provides more extensive information for a single job.
The qdel command is used to kill queued and running jobs. An example is the command
qdel 54
The argument to qdel is the jobid of the job you want to kill. You are shown the jobid when you submit your job, and you can also get it with the qstat command. If you cannot kill a job you want to kill, send email to remarks@psc.edu.
Debugging
Debugging strategy
Your first few runs should be on a small version of your problem. Your first run should not be for your largest problem size. It is easier to solve code problems if you are using fewer processors. This strategy should be followed even if you are porting a working code from another system.
You should use the debug queue for your debugging runs. Do not run a debugging run on any of pople's front ends. You should always run a pople program with qsub.
The debug queue is intended to be used in the classic debugging cycle in which you run a debugging job, check its output and then submit another debugging job. You should not flood the debug queue with jobs, nor should you chain your jobs through the debug queue by having a debug job submit its successor.
The debug queue should not be used for production runs that use only a few processors.
Compiler options
Several compiler options can be useful to you when you are debugging your program. If you use the -g option to the Intel or GNU compilers, the error messages you receive when your program fails will probably be more informative. For example, you will probably be given the line number of the source code statement that caused the failure. Once you have a production version of your code you should not use the -g option or your program will run slower.
The -check bounds option to the ifort compiler will cause your program to tell you if it exceeds an array bound while running.
Variables on pople are not automatically initialized. This can cause your program to fail if it relies on variables being initialized. The -check uninit and -ftrapuv options to the ifort compiler will catch certain cases of uninitialized variables, as will the -Wall and -O options to the GNU compilers.
There are more options to the Intel and GNU compilers that may assist you in your debugging. For more information see the appropriate man pages.
Improving Performance
Assistance with improving performance
If you would like to improve the performance of your code, you can get optimization assistance from PSC. This assistance includes consulting assistance from PSC, special queue handling if necessary, and service unit discounts, all of which are designed to enable you to scale up your code as quickly as possible. Send email to remarks@psc.edu if you would like performance improvement assistance with your program.
Software Packages
A list of software packages installed on Pople is available. If you would like us to install a package that is not in this list send email to remarks@psc.edu.
Pople and the TeraGrid
Pople is on the TeraGrid. Thus, you have additional methods of connecting to pople, of transferring files to and from pople and of running jobs on pople. For information on using the TeraGrid see the general online documentation for the TeraGrid and the PSC-specific online TeraGrid documentation.
Stay Informed
As a user of pople, it is imperative that you stay informed of changes to the machine's environment. Refer to this document frequently. In addition, important system information is posted to the PSC's Web page of bboard posts.
You will also periodically receive email from PSC with information about pople. In order to ensure that you receive this email, you should make sure your email forwarding is set properly by following the instructions for setting your email forwarding.
Reporting a Problem
You have two options for reporting problems on pople.
- You can call the User Services Hotline at 1-800-221-1641 from 9:00 a.m. until 8:00 p.m., Eastern time, on weekdays, and from 9:00 a.m. until 4:00 p.m., Eastern time, on Saturdays.
- You can send email to remarks@psc.edu.