Jonas

rachel.psc.edu and jonas.psc.edu



System Architecture and Configuration

Jonas is a set of SMP machines. Each machine has 64 1.15 GHz EV7 processors with 256 Gbytes of shared memory. Each SMP machine is distinct and your job can run on only one machine. When you submit a job to the system, the OS determines on which of the SMP machines your job will run. Future plans include aggregating the SMP machines so you can run jobs that request more resources.

You log in to a front end node with 2 EV67 processors, not to any of the SMP machines. The front end node and the SMP machines run the Tru64 Unix operating system.

Stay Informed

As a user of jonas, it is imperative that you stay informed of changes to the machine's environment. Refer to this document frequently. In addition, important system status information is posted to the PSC's Web page of bboard posts.

You will also periodically receive email from PSC with information about jonas. To ensure that you receive this email, make sure that your email forwarding is set properly by following the online instructions for setting your email forwarding.

Access to Jonas

Applying for a grant

To apply for time on jonas, see the NRBSC website at http://www.nrbsc.org/resources/.

Connecting to jonas

Use ssh to connect to jonas.

ssh jonas.psc.edu  -l username

The first time you log in, you will receive a message similar to

Host key not found from list of known hosts.  Are you sure 
you want to continue connecting?

Answer 'yes' to make the connection. You should not receive this message on subsequent connections.

You will next be prompted for your jonas password. Your jonas password is your PSC Kerberos password, which is also your PSC AFS password. If you enter your password successfully you will be connected to jonas.

Changing your jonas password

You must change your jonas password within 30 days of the date on the initial password sheet. If you don't, logins will be disabled on your account. Contact PSC User Services if this happens.

Use the kpasswd command to change your PSC Kerberos password. Do not use the passwd command to change your PSC password. You have the same password on all PSC production systems. If you change it on one system using kpasswd it will change on all PSC production systems.

See the general PSC password policies.

Changing your login shell

You should change your login shell with the chsh command. When doing so you must specify a shell from the /usr/psc/shells directory. In your batch jobs, however, you can use the shells in /bin.

Accounting

Charging algorithm

For the HP Marvels (rachel and jonas), the idea of a "virtual cpu" is defined in terms of memory. Let M represent the total memory available to users on the system (in Gbytes), and C represent the total number of CPUs on the system. Then P, the per-processor share of memory is defined as:
    P = M/C

The charging algorithm for rachel and jonas is given as:
     SU = max (ceil(m/P), N) * w
where:

  • ceil = the ceiling function, which rounds any non-integer real number up to the next integer
  • m = memory requested, in Gbytes
  • P = per-processor share of memory, as defined above
  • N = number of processors requested
  • w = walltime used by job, in hours
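As a rough illustration of the formula above, the following Python sketch computes the charge for a job. The defaults M = 236 Gbytes and C = 64 CPUs are assumptions, chosen to match the per-machine figures quoted elsewhere in this document (the largest vmem request and the processor count). The function and parameter names are hypothetical, and xbanner remains the authoritative source for actual charges.

```python
import math

def service_units(mem_gb, nprocs, walltime_hours,
                  total_mem_gb=236.0, total_cpus=64):
    """Sketch of SU = max(ceil(m/P), N) * w, with P = M/C.

    total_mem_gb (M) and total_cpus (C) are assumed values, not
    official figures; substitute the real ones for your system.
    """
    per_proc_share = total_mem_gb / total_cpus   # P = M/C
    virtual_cpus = max(math.ceil(mem_gb / per_proc_share), nprocs)
    return virtual_cpus * walltime_hours

# A job requesting 32 Gbytes and 1 processor that runs for 5 hours
print(service_units(32, 1, 5))
```

With these assumed values the 32 Gbyte request counts as 9 virtual CPUs, so the 5 hour job is charged 45 SUs even though it asked for only one processor.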

Tracking charges

User accounting data is available on jonas with the xbanner command. Account information such as the total amount of the grant, the date of the last job and the remaining amount of the grant are displayed by xbanner.

Accounting information for grants is also available at the PSC Grant Management System on the Web at https://grants.psc.edu/arms.

You will need your PSC Kerberos password to access this system. This system can provide more detailed information than xbanner, although some of the information is only available to grant PIs. The system has extensive internal documentation.

Storing Files

File Systems

File systems are file storage spaces directly connected to a system. There are currently two such areas available to you on jonas.

/usr/users/n/username

This is your home directory. The numeral 'n' will be replaced by an integer and 'username' will be replaced by your userid. You can also refer to this directory as $HOME. You have a 1 Gbyte quota for your home directory. Thus you will probably not be able to store your data files on $HOME. Your home directory is backed up. $HOME is visible to all of the SMP machines, but through a relatively slow connection.

$LOCAL

Each of the SMP machines has 6 Tbytes of local disk space. The local space for a machine is not visible to the other SMP machines. However, the local spaces for all of the SMP machines are visible to the front end node, but through a very slow connection.

When you run a batch job you cannot determine on which SMP machine your job will run. Thus, you cannot ensure that it will run on the same SMP machine with the same local disk system as any of your prior runs. Therefore, you should consider this local space only as working disk space. In other words, you should copy your data files between golem and this local disk system at the beginning and end of your batch jobs with the far command. The file archiver golem is discussed below.

Within a job you can refer to the local space assigned to that job on its SMP machine as $LOCAL. You should refer to it with the variable name since we could change the implementation of $LOCAL for performance reasons. See the sample batch job below for an example of how to use $LOCAL and far in a batch job.

Files on a $LOCAL are, however, accessible to you on jonas's front end both while the job is running and after the job ends, either with the tcscp command or with standard Unix commands. To use either type of command you need to know your job's PBS jobid, which is given when you submit your job or in your job's .o output, and on which SMP machine your job ran. This can be found by using the qstat -f command's exec_host output field for a running job or in the .o output of a finished job. These two pieces of information are used to refer to your files on a $LOCAL. For example, if your job is running or ran on salk64a and the PBS jobid is or was 15786 then the local disk space for that job can be referred to as

/salk64a/local/15786

in front end Unix commands and as

salk64a:/local/15786

in tcscp commands.
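The two naming forms above differ only in punctuation, so they are easy to build from the same two pieces of information. A minimal Python sketch, using a hypothetical helper name and the host and jobid from the example:

```python
def local_disk_names(smp_host, jobid):
    """Return (front end path, tcscp spec) for a job's local disk space."""
    front_end_path = f"/{smp_host}/local/{jobid}"   # for standard Unix commands
    tcscp_spec = f"{smp_host}:/local/{jobid}"       # for tcscp commands
    return front_end_path, tcscp_spec

print(local_disk_names("salk64a", 15786))
```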

Files on a local disk system left by a batch job will remain in place for a week after the job ends. However, they are not backed up and if disk space is low on a local disk system they can be deleted at any time to allow currently running jobs to continue. Thus, you should move any files that you want a permanent copy of from their local disk system to golem as soon as possible.

You must use tcscp to copy files from a local disk system to golem after a job ends. Once a job ends you cannot use far to perform this transfer. A tcscp copy will use a very fast path between the SMP machines and golem. See the discussion below of tcscp for an example of how to use tcscp.

You can also access your files on any of the local file systems using standard Unix commands if you use a filename similar to the one described above. For example, the command

tail /salk64b/local/21689/output.dat

can be used to examine the end of file output.dat for job 21689 on SMP machine salk64b. This command can be issued while your job is running. Other Unix commands, such as ls or rm, will work similarly.

However, the connection between the jonas front end and the local file systems is very slow. Thus, you should not use the cp command to copy large files between $HOME and the local file systems. In general, you should limit your interactions with Unix commands between the front end and the local file systems on the SMP machines to essential operations.

File Repositories

File repositories are file storage spaces which are not directly connected to a front end node or compute processors. You cannot, for example, in a program open a file that resides in a file repository. You must use explicit file copy commands to move files to and from the repository. You currently have one file repository available to you on jonas, golem, PSC's file archiver.

golem

Golem runs Cray's DMF file archival system. It is a combination tape-and-disk archival system.

The far program and the tcscp program can be used to transfer files between golem and jonas.

You can use kftp, gridftp, scp or sftp to transfer files between golem and your remote machine. We strongly recommend you use kftp instead of scp for remote file transfer if kftp is available. We recommend against using sftp. See the golem Web page for more information.

You should store your data files on golem rather than in your home directory because your home directory space is limited. At the beginning of your batch jobs you should transfer your data files to $LOCAL and then at the end of your batch jobs copy files that you want a permanent copy of back to golem.

If you need to store a file to golem that is 2 Tbytes or larger please first contact User Services so that special arrangements can be made to store your file.

Transferring Files

Kftp

Jonas is running Kerberos 5 (K5) client and server software. If your local site also has K5 client/server software installed, you can transfer files to and from jonas whether you are logged into jonas or your local machine. The examples below assume that you are logged into your local machine.

Before you can use kftp to transfer files, you must authenticate yourself to jonas. To do this use the kinit command.

kinit username@PSC.EDU

For 'username' substitute your PSC userid. PSC.EDU is PSC's Kerberos realm name.

After you enter this command you are prompted for your PSC Kerberos password, which is the password you use to login to jonas.

Once you are authenticated you can use the kftp command to actually perform your file transfers.

kftp jonas.psc.edu

The kftp command functions like the ftp command.

You should not use kftp to transfer files to $LOCAL.

You should verify that the Kerberos commands operate on your local system as described here. Some installations of Kerberized ftp differ in their implementation.

Man pages for kinit and kftp are available on jonas.

A Unix kftp client is available at http://www.pdc.kth.se/heimdal. A Windows kftp client is available at http://web.mit.edu/network/kerberos-form.html.

Kftp is much faster for file transfers than either scp (discussed below) or sftp.

scp

The scp program can be used to transfer files between your remote machine and your jonas home directory. You should not use it to transfer files to $LOCAL.

The format for the scp command is

scp source-filename target-filename

where the filename on the remote system, whether it is the target or the source, must be specified as

username@system:filename

For example, to copy a file to your home directory on jonas when you are logged in to your home system use a command such as

scp filename username@jonas.psc.edu:/usr/users/n/username/filename

If you are logged in to jonas and you want to copy over a file from your home system to jonas, use a command such as

scp username@remote-system:filename  filename

The first time you use scp to or from jonas, you will receive a message similar to

Host key not found from list of known hosts.  Are you sure 
you want to continue connecting?

Answer 'yes' to make the connection. You should not receive this message on subsequent connections.

You will be prompted next for your password on the remote system. For jonas you should use your PSC Kerberos password.

You may be able to improve your scp transfer rate by using the blowfish encryption method rather than the default method, if your version of scp supports it. To use this method issue your scp command as

scp -c blowfish source-filename target-filename

For more information on the scp command, see the scp man page.

Scp is part of the ssh distribution.

We strongly recommend that you use kftp rather than scp for remote file transfers if kftp is available.

Far

You can use the far program to move files between jonas and golem.

Tcscp

The tcscp command, created by PSC, allows you to copy files from a local file system on an SMP machine to golem. Standard Unix file protections are used to determine which files you can copy with tcscp. Thus, other users will not be able to copy your files on a local file system unless you set the file permissions to allow this.

The format of the command is based on the cp command, with the addition of the ability to specify source and target machines as well as source and target filenames. For example, the command

tcscp salk64a:/local/15786/output.dat golem:output.dat

copies output.dat from your directory /local/15786 on SMP machine salk64a to your golem home directory. You can get the name of the machine your job ran on from your .o output or from qstat -f for a running job. The PBS jobid is available when you submit your job or from your .o output. You issue the tcscp command interactively while logged into one of jonas's front end nodes.

The wildcard characters '*' and '?' are permitted in source filename specifications and are treated as the shell treats them.

Just as with the cp command, you can specify multiple source filenames:

tcscp salk64a:/local/15786/output1.dat  \
  salk64a:/local/15786/output2.dat golem:

This command will copy output1.dat and output2.dat from your directory /local/15786 on salk64a to your golem home directory. When you use this form of the command the last file specification is the target specification and must be a directory.

The tcscp command has several options. The -r option allows you to recursively copy directories and their contents, just like cp. The -v option runs the command in verbose mode. In verbose mode the fully expanded filenames used in the copy are shown, as is timing data about the transfer. The -no option specifies that you do not want existing files to be overwritten if a target file has the same filename as an existing file. The default behavior of tcscp is to overwrite existing files. When you use the -no option existing files are skipped over by tcscp. The -nk option causes tcscp to delete its source files after it successfully copies them. Finally, the -h option provides help information for tcscp.
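If you assemble tcscp commands from a script, the options described above can be mapped onto an argument list. The Python sketch below only builds and prints the command; it does not run it. The function and parameter names are hypothetical, and placing the options before the source filenames is an assumption based on the statement that tcscp follows cp.

```python
import shlex

def tcscp_command(sources, target, recursive=False, verbose=False,
                  no_overwrite=False, delete_sources=False):
    """Build a tcscp argument list from the documented options.

    Option placement (before the sources) is assumed, not confirmed.
    """
    args = ["tcscp"]
    if recursive:
        args.append("-r")    # copy directories and their contents
    if verbose:
        args.append("-v")    # show expanded filenames and timing data
    if no_overwrite:
        args.append("-no")   # skip targets that already exist
    if delete_sources:
        args.append("-nk")   # delete sources after a successful copy
    args.extend(sources)
    args.append(target)
    return args

cmd = tcscp_command(
    ["salk64a:/local/15786/output1.dat", "salk64a:/local/15786/output2.dat"],
    "golem:",
    verbose=True,
    no_overwrite=True,
)
print(shlex.join(cmd))
```

Note that shlex.join quotes wildcard characters such as '*', so a source specification containing wildcards would be passed to tcscp literally rather than expanded by the shell first.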

Tar

Whether you are transferring files between jonas and golem or between jonas and your remote system, if you have many files (1000 or more) it is much more efficient to tar them into one file and then transfer that single tar file. This is especially true if the files are small (64 Kbytes or smaller).

Tru64 tar--located at /bin/tar--can only create a tar file up to 8 Gbytes. Gnu tar--located at /usr/psc/gnu/bin/tar--can create tar files larger than 8 Gbytes. However, a file created by Gnu tar that is larger than 8 Gbytes cannot be read by Tru64 tar.

You should first contact User Services if you are going to create a tar file that is 50 Gbytes or larger. You should move your tar file to golem or to your remote system as soon as you can after you create it.
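The same bundling idea can be sketched in Python with the standard tarfile module, for cases where the files are prepared on a machine with Python installed. Whether Python is available on jonas, and whether an archive written this way can be read by Tru64 tar, are not verified here. The function name is hypothetical.

```python
import os
import tarfile

def bundle_directory(src_dir, tar_path):
    """Pack every file under src_dir into one uncompressed tar file.

    tar_path should lie outside src_dir so the archive does not
    try to include itself. Returns the number of files added.
    """
    count = 0
    with tarfile.open(tar_path, "w") as archive:
        for root, _dirs, files in os.walk(src_dir):
            for name in sorted(files):
                full_path = os.path.join(root, name)
                # Store paths relative to src_dir so the archive unpacks cleanly
                archive.add(full_path,
                            arcname=os.path.relpath(full_path, src_dir))
                count += 1
    return count
```

A call such as bundle_directory("results", "results.tar") produces a single file that can then be transferred in one operation instead of one operation per small file.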

Compilers

HP compilers for Fortran90, C, and C++ are available, as are Gnu C and C++ compilers.

Compile your program by executing one of the following sets of statements.

C programs: cc program-name.c for HP compiler or gcc program-name.c for Gnu compiler
C++ programs: cxx program-name.C for HP compiler or g++ program-name.C for Gnu compiler
Fortran90 programs: f90 program-name.f90

Please note that the HP C compiler expects a .c extension (lower case) while the HP C++ compiler expects a .C (upper case) extension.

The HP compilers will only run on jonas's front end, not on its compute processors.

The make command executes /bin/make. To use Gnu make use the gmake command.

We recommend that you use the options

-O -fast -tune ev7 -arch ev7

when you compile your programs using the HP compilers.

Communication libraries

MPI

The MPI message passing library is available on jonas. To enable your programs to use MPI, you must include the MPI header file in your source and link to the MPI libraries when you compile.

For Fortran, use this include directive in your source:

include 'mpif.h'

and compile with a command like:

f90 program.f90 -lmpi

For C, use this include directive:

#include <mpi.h>

and compile with a command similar to:

cc program.c -lmpi

OpenMP

The OpenMP library for shared memory communication is available on jonas. MPI will use shared memory for communication on jonas so you do not need to use OpenMP to do this.

If you want to use OpenMP you should use the -omp option when compiling. Also, before you execute your program set the environment variable OMP_NUM_THREADS to the number of virtual processors your job will use. The variable $PBS_VPPN is set to this value by the system. If you do this each thread will run on its own processor. See the discussion of running jobs on jonas below for more information on running OpenMP jobs on jonas, virtual processors, and $PBS_VPPN.

Module software

The Module package provides for the dynamic modification of a user's environment via module files. Module can be used:

  • to manage multiple versions of applications, tools and libraries
  • to manage software where complex changes to the environment are necessary
  • to manage software where name conflicts with other software would cause problems

Module is available automatically for interactive use, although if you are going to use modules you should not switch your shell from your login shell during your interactive session.

To use module in a batch job, add one of these commands to your script:

For csh type shells (csh, tcsh)
source /usr/local/Modules/default/init/shell-name, where shell-name is either csh or tcsh.

For sh type shells (sh, bash, ksh)
. /usr/local/Modules/default/init/shell-name, where shell-name is one of sh, bash, or ksh.

Some useful module commands are:

  • module avail: lists all the available modules
  • module help foo: displays help on module foo
  • module whatis foo: displays a brief description of module foo
  • module display foo: indicates what changes would be made to the environment by loading module foo without actually loading it
  • module load foo: loads module foo
  • module unload foo: reverses all changes to the environment made by previously loading module foo

Some modules are defined by the system (type module avail for a list), but you can create your own. For more information on module and how to create a modulefile, see the man pages for module and modulefile.

Running a Job

Scheduling policies

The Portable Batch System (PBS) controls all access to jonas's processors, for both batch and interactive jobs. PBS on jonas currently has only one queue. Interactive and batch jobs compete in this queue for scheduling.

The scheduler actually schedules virtual processors on the machine. A virtual processor is a physical processor associated with 3.7 Gbytes of memory. Memory is the most critical resource on jonas, especially given its potential impact on swapping performance. We tie memory to physical processors in this way to reduce job swapping.

Because memory is tied to processors in this way, specifying just the amount of memory you want for your job is enough for the system to determine how many virtual processors you need.

The scheduling policies on jonas are designed to favor jobs requesting large numbers of virtual processors to make the best use of the jonas resource. Not allowing idle time is also important to the scheduler.

There are also policies in place to prevent a single user from dominating the machine.

Send email to remarks@psc.edu if you have computing needs that cannot be met by these scheduling policies.

Batch access

Use the qsub command to submit a job to PBS. A PBS job script consists of PBS directives, comments and executable commands. The last line of your job script must end with a newline character.

A sample job script for an MPI program is

#!/bin/csh
#PBS -l walltime=5:00:00
#PBS -l vmem=32gb
#PBS -j oe

set echo

cd $LOCAL

# copy executable and input files
cp $HOME/a.out .
far get input.dat .

# execute program
dmpirun -np ${PBS_VPPN} ./a.out

# cleanup in case dmpirun failed
mpiclean

# store output file
far store output.dat .

# clean up files if no longer needed
rm -f *

The first line in the script cannot be a PBS directive. Any PBS directive in the first line is ignored. Here, the first line identifies which shell should be used.

The next three lines are PBS directives.

#PBS -l walltime=5:00:00
The first directive requests a wall clock time limit of 5 hours. Specify the time in the format HH:MM:SS. Only two digits can be used for minutes and seconds. Do not use leading zeros. The default walltime is 30 minutes. The maximum time you can request is 168 hours (1 week).
#PBS -l vmem=32gb
This directive specifies the maximum amount of memory your entire job can use at one time. The largest value you can specify for vmem is 236 Gbytes. As was discussed above, from your memory specification the system can determine that your job will run on ceiling(32/3.7) = 9 virtual processors and thus on 9 physical processors. Your job will reserve 9 physical processors while it runs even if it does not use them all.

If it is more convenient you can specify the number of processors you want instead of the amount of memory. For example,

#PBS -l nodes=1:ppn=9

requests 9 physical processors, which the system will translate into a request for 9 virtual processors with a total memory request of 9 * 3.7 = 33.3 Gbytes. The value for nodes must always be '1' in your processor specification.

If you specify both processors and memory the system will select as your number of virtual processors the larger of your physical processor specification and the virtual processor value computed from your memory request. You can request at most 64 virtual processors in a job.

We ask that your job not use more processors while it is running than your final virtual processor request. If you do so you can degrade system performance.

#PBS -j oe
The final PBS directive combines your stdout and stderr output into one file, in this case stdout. This will make your program easier to debug.
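The translation from a vmem or ppn request into virtual processors, as described for the directives above, can be sketched in Python. This is an illustration of the stated rules (3.7 Gbytes per virtual processor, the larger of the two values wins, at most 64 per job), not the scheduler's actual code. The function and constant names are hypothetical.

```python
import math

GB_PER_VIRTUAL_PROCESSOR = 3.7   # memory tied to each physical processor
MAX_VIRTUAL_PROCESSORS = 64      # per-job limit

def virtual_processors(vmem_gb=None, ppn=None):
    """Estimate the virtual processors a vmem and/or ppn request implies."""
    if vmem_gb is None and ppn is None:
        raise ValueError("specify vmem_gb, ppn, or both")
    from_memory = 0
    if vmem_gb is not None:
        from_memory = math.ceil(vmem_gb / GB_PER_VIRTUAL_PROCESSOR)
    from_ppn = ppn if ppn is not None else 0
    needed = max(from_memory, from_ppn)   # the larger request wins
    if needed > MAX_VIRTUAL_PROCESSORS:
        raise ValueError("request exceeds 64 virtual processors")
    return needed

print(virtual_processors(vmem_gb=32))          # the vmem=32gb example
print(virtual_processors(vmem_gb=32, ppn=12))  # ppn is the larger request
```

The first call reproduces the 9 virtual processors worked out for vmem=32gb; the second returns 12 because the processor request exceeds the value derived from memory.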

The remaining lines in the script are comments or command lines.

set echo
This command causes your batch output to display each command next to its corresponding output. This will make your program easier to debug. If you are using the Bourne shell or one of its descendants use 'set -x' instead of 'set echo'.
Comment lines
The other lines in the sample script that begin with '#' are comment lines. The '#' for comments and PBS directives must begin in column one of your script file. The remaining lines in the sample script are executable commands.
dmpirun
The dmpirun command is used to launch your executable. The -np option to dmpirun indicates the number of processors your program will run on and should be set to $PBS_VPPN, which is set by the system to the number of virtual processors your job is requesting. Dmpirun cannot read directly from stdin but you can use input redirection. If your job is a serial job or an OpenMP job, you can just specify your executable name, although you must set the OMP_NUM_THREADS variable before running your OpenMP executable. You should set OMP_NUM_THREADS to $PBS_VPPN.
mpiclean
The mpiclean command must be executed to clean up system resources properly after a failed dmpirun execution.

A sample job script for an OpenMP program is

#!/bin/csh
#PBS -l walltime=5:00:00
#PBS -l nodes=1:ppn=8
#PBS -j oe

set echo

cd $LOCAL

# copy executable and input files
cp $HOME/a.out .
far get input.dat .

# specify number of OpenMP threads
setenv OMP_NUM_THREADS ${PBS_VPPN}

# execute program
./a.out

# store output file
far store output.dat .

# clean up files if no longer needed
rm -f *

There are several differences between the MPI and the OpenMP scripts. We recommend that you specify only the number of processors you want when using OpenMP and set OMP_NUM_THREADS to this value. This will ensure that each of your threads runs on a different processor and that each thread has enough memory in which to run.

Also, you do not run your executable with dmpirun. You just specify the executable name on its own command line. Nor do you need to run mpiclean after your executable is finished.

After you create your script you must make it executable with the chmod command.

chmod 755 yourscript.job

Then you can submit it to PBS with the qsub command.

qsub yourscript.job

Batch output, including your job's stdout and stderr output, is returned to the directory from which you issued the qsub command after your job finishes.

You can also specify the PBS directives as command-line options to qsub. Thus, you could omit the PBS directives in the sample script above and submit the script with

qsub -l walltime=5:00:00 -l vmem=32gb -j oe yourscript.job

Command line options override PBS directives included in your script.

The -W group_list option, which is used to indicate to which chargeid a job should be charged, must be used as a command line option. It cannot be a PBS specification inside your batch script. If you only have one jonas grant--like most users--then you do not need to use -W group_list.

The -M and -m options can be used to have the system send you email when your job undergoes specified state transitions. See the qsub man page for more information on these options and other qsub options.

Interactive access via qsub -I

A form of interactive access is available by using the -I option to qsub. An example command using qsub -I is

qsub -I -l walltime=10:00 -l nodes=1:ppn=2

This command requests interactive access to 2 processors for 10 minutes.

The system will respond with a message similar to

qsub: waiting for job 54.jonas.psc.edu to start

Your qsub request will wait until it can be satisfied. If you want to cancel your request you should type ^C. You will have to try your interactive job at a later time. The qstat command can be used to show how many jobs are running on the system.

When your job starts you will receive the message

qsub: job 54.jonas.psc.edu ready

and then the OS command prompt. You can use the -M and -m options to qsub to have the system send you email when your job has started.

At this point any commands you enter will be run on your processors as if you had entered them in a batch script. Your commands will be executed on one of the SMP compute platforms. For example, to run an MPI code you must enter a dmpirun command.

Stdin, stdout, and stderr are all connected to your terminal, although you will still need to use input redirection for your MPI code to read stdin.

When you are finished with your interactive session type ^D. The system will respond

qsub: job 54.jonas.psc.edu completed

When you use qsub -I you are charged for the total time that you hold your processors and your memory whether you are computing or not. Thus, as soon as you are done running executables you should type ^D.

Interactive access can be used to compile your jonas programs.

Monitoring and Killing Jobs

Qstat

The qstat command is used to display the status of the PBS queue. It includes running and queued jobs. The -f and -a options to qstat provide you with more extensive status listings. See the man page for qstat for more details.

Qdel

The qdel command is used to kill queued and running jobs.

Qalter

The qalter command is used to alter queue options for queued and running jobs.

Qstatform

The qstatform command shows how many virtual processors are in use on each of jonas's two nodes. There are 64 virtual processors available on each node.

Reporting a Problem

You have several options for reporting problems on jonas.

  • You can call the User Services Hotline at 1-800-221-1641 from 9:00 a.m. until 8:00 p.m., Eastern time, on weekdays, and 9:00 a.m. until 4:00 p.m., Eastern time, on Saturdays.
  • You can send email to remarks@psc.edu.

Other Documentation