New on Bridges

 

Outbound Connections for Interactive Jobs

08/22/2017

A new option, --egress, has been enabled for interactive jobs. It allows your interactive job to make outbound network connections, so that, for example, you can monitor your Bridges jobs from your local machine. To enable outbound connections, add --egress to your interact command:

interact --egress ...more options...
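For example, to start an interactive session with outbound connections enabled, assuming interact accepts the usual partition, node-count, and walltime options (the RM-small partition name is taken from the examples later on this page; the other values are only illustrative):

interact --egress -p RM-small -N 1 -t 01:00:00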

More information on the interact command is found in the Running Jobs section of the Bridges User Guide.

Allocating for GPUs Separately

03/03/2017

From the beginning of Bridges, users with a Bridges Regular Memory allocation have had access to Bridges' GPU nodes. Due to high user demand, the GPU nodes will be allocated as a separate resource beginning June 1, 2017. Starting then, users with existing Regular Memory allocations will no longer have access to the GPU nodes; only users with a Bridges GPU allocation will be able to use them. You will be notified of the exact transition date as it approaches.

If you have a current Regular Memory allocation on Bridges that ends later than June 2017 and you wish to use the GPU nodes, you can request a transfer of some of the Service Units (SUs) in your Bridges Regular Memory allocation to a Bridges GPU allocation. 

When you apply for an allocation that will start on June 1, 2017 or later, you will be able to request the Bridges GPU resource when submitting your proposal.

Transferring part of a Regular Memory allocation

Beginning in early May, you can transfer some of your Regular Memory allocation to a GPU allocation.  To do this, submit a transfer request through the XSEDE User Portal.  Instructions for submitting a transfer request are found here: https://portal.xsede.org/knowledge-base/-/kb/document/avva

Charging for GPUs

Bridges contains two types of GPU nodes: nodes with NVIDIA Tesla K80 GPUs and nodes with NVIDIA Tesla P100 GPUs. Because the two node types differ in performance, they are charged at different rates.

K80 nodes 

The K80 nodes hold 4 GPU units each, which can be allocated separately.  Service units (SUs) are defined in terms of GPU-hours:

1 GPU-hour = 1 SU

Note that the use of an entire K80 GPU node for one hour would be charged 4 SUs.
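For example, a job that uses 2 GPUs on a K80 node for 3 hours would be charged 2 × 3 = 6 SUs.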

P100 nodes

The P100 nodes hold 2 GPU units each, which can be allocated separately.  Service units (SUs) are defined in terms of GPU-hours:

1 GPU-hour = 2.5 SUs

Note that the use of an entire P100 node for one hour would be charged 5 SUs.
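For example, a job that uses 1 P100 GPU for 4 hours would be charged 4 × 2.5 = 10 SUs.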

More information

For more information on Bridges' GPU nodes, see https://www.psc.edu/index.php/bridges/user-guide/gpu-use.  

For information on running a job on Bridges, including how to submit a job that uses GPUs and how to select the type of GPU node you will use, see https://www.psc.edu/index.php/bridges/user-guide/running-jobs.
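As a rough sketch, a batch script requesting P100 GPUs might look like the following. The partition name GPU and the --gres specification are assumptions; check the Running Jobs page above for the exact syntax Bridges expects.

#!/bin/bash
#SBATCH -p GPU                # GPU partition (name is an assumption; see the user guide)
#SBATCH --gres=gpu:p100:2     # request 2 P100 GPUs on one node (gres type name is an assumption)
#SBATCH -N 1
#SBATCH -t 01:00:00
nvidia-smi                    # placeholder; replace with your GPU application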

For other questions on Bridges, please email Bridges support.

 

Updated Scratch File System, pylon5

 03/03/2017

An upgraded scratch file system, named pylon5, will be available on Bridges as of March 7 to replace pylon1.  pylon5 will serve the same purpose as pylon1 did, specifically:

  • It is fast, temporary storage for running jobs
  • Files are wiped after 30 days
  • It is not backed up

Be aware that any job scripts which specifically reference /pylon1 must be edited to reference /pylon5 instead.
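One way to update such a script in place is with sed; the script name myjob.sh below is only a placeholder, and it is wise to keep a backup copy first:

sed -i 's|/pylon1/|/pylon5/|g' myjob.sh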

You are responsible for moving your files from pylon1 to pylon5.  You will have from March 7 to March 30 to do this.  The pylon1 file system will be decommissioned on April 4, 2017, and any files remaining there will be lost.

Use the rsync or cp command to move your files.

Moving your pylon1 files to pylon5

Substitute your groupname and username for "yourgroup" and "yourusername" in these examples.

We recommend using rsync to copy your files from pylon1 to pylon5.  An advantage of rsync over the cp command is that if the transfer gets interrupted, running the same rsync command again only copies those files that were missed (or partially copied) in the first try.  

We also recommend running the file transfer via the srun command.  This avoids degrading service on the login nodes and captures any errors in the job output file.

To transfer all of your pylon1 files, use this command:

srun -p RM-small rsync -av  /pylon1/yourgroup/yourusername/   /pylon5/yourgroup/yourusername/

If you want rsync to calculate checksums of the files in both places as additional validation, add the -c option:

srun -p RM-small rsync -avc  /pylon1/yourgroup/yourusername/   /pylon5/yourgroup/yourusername/

More information and options for the rsync command can be found by typing  rsync --help.

 

Time Limit on Large Memory Partition Increased

02/17/2017

In response to demand from users taking advantage of Bridges' Large Memory nodes, the walltime limit in the LM partition has been extended to 14 days. For information on how to run jobs in the LM partition, see https://www.psc.edu/bridges/user-guide/running-jobs
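For example, a batch job could request the new maximum walltime like this (a minimal sketch; LM jobs may also require a memory request and other options described on the Running Jobs page, and myjob.sh is only a placeholder):

sbatch -p LM -t 14-00:00:00 myjob.sh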

Omni-Path User Group

The Intel Omni-Path Architecture User Group is open to all interested users of Intel's Omni-Path technology.

More information on OPUG