There are several distinct file spaces available on Bridges, each serving a different function.
- Home ($HOME), your home directory on Bridges
- pylon5 ($SCRATCH), a Lustre system for persistent file storage. Pylon5 has replaced pylon1.
- Node-local storage ($LOCAL), scratch storage on the local disk associated with a running job
- Memory storage ($RAMDISK), scratch storage in the local memory associated with a running job
Note that pylon2 was decommissioned on June 19, 2018.
Three months after your grant expires, all of your Bridges files associated with that grant will be deleted, no matter which file space they are in. You will be able to log in during this 3-month period to transfer files, but you will not be able to run jobs or create new files.
Access to files in any Bridges space is governed by Unix file permissions which you control. If your data has additional security or compliance requirements, please contact email@example.com.
This is your Bridges home directory. It is the usual location for your batch scripts, source code and parameter files. Its path is /home/username, where username is your PSC userid. You can refer to your home directory with the environment variable $HOME. Your home directory is visible to all of Bridges's nodes.
Your home directory is backed up daily, although it is still a good idea to store copies of your important files in another location, such as the pylon5 file system or a local file system at your site. If you need to recover a home directory file from backup, send email to firstname.lastname@example.org. The recovery process will take 3 to 4 days.
Your home directory has a 10GB quota. You can check your home directory usage using the quota command or the command du -sh. To improve the access speed to your home directory files you should stay as far below your home directory quota as you can.
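As a rough illustration of checking usage against the quota, the sketch below turns the output of du into a percentage. The function name usage_percent is hypothetical, and the assumption that 10GB means 10 * 1024 * 1024 KiB may not match how the quota is actually counted; the quota command remains the authoritative source.

```shell
#!/bin/bash
# usage_percent DIR [QUOTA_KB]
# Print the share of the quota that DIR occupies, as a whole-number percentage.
# Assumption: the default quota of 10GB is counted as 10485760 KiB.
usage_percent() {
    local dir=$1
    local quota_kb=${2:-10485760}
    local used_kb
    # du -sk reports the total size of DIR in KiB, followed by a tab and the path.
    used_kb=$(du -sk "$dir" | cut -f1)
    echo $(( used_kb * 100 / quota_kb ))
}

# Example (not run here because it scans the whole directory):
#   usage_percent "$HOME"
```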
Three months after a grant expires, the files in your home directory associated with that grant will be deleted.
The pylon5 file system is persistent storage, and can be used as working space for your running jobs. It provides fast access for data read or written by running jobs. IO to pylon5 is much faster than to your home directory.
Pylon5 is a Lustre file system shared across all of Bridges' nodes. It is available on Bridges compute nodes as $SCRATCH.
Files on pylon5 are not backed up, so you should store copies of important pylon5 files in another location.
The path of your pylon5 home directory is /pylon5/groupname/username, where groupname is the name for the PSC group associated with your grant. Use the id command to find your group name.
id -Gn will list all the groups you belong to.
id -gn will list the group associated with your current session.
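The path layout above can be assembled from the output of id. This is a minimal sketch: the function name pylon5_dir is hypothetical, and it assumes your login name is the same as your PSC userid.

```shell
#!/bin/bash
# pylon5_dir [GROUP] [USER]
# Print the pylon5 directory for a group and user, following the
# /pylon5/groupname/username layout. With no arguments it uses the group
# of the current session (id -gn) and the current login name (id -un).
pylon5_dir() {
    local group=${1:-$(id -gn)}
    local user=${2:-$(id -un)}
    printf '/pylon5/%s/%s\n' "$group" "$user"
}

# Example: list the directory for every group you belong to.
#   for g in $(id -Gn); do pylon5_dir "$g"; done
```

If you hold more than one grant, passing the group name explicitly avoids writing to the directory of whichever group your session happens to be using.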
If you have more than one grant, you will have a pylon5 directory for each grant. Be sure to use the appropriate directory when working with multiple grants.
Your usage quota for each of your grants is the Pylon storage allocation you received when your proposal was approved. If your total use in pylon5 exceeds this quota your access to the partitions on Bridges will be shut off until you are under quota.
Use the du -sh or projects command to check your pylon5 usage. You can also check your usage on the XSEDE User Portal.
If you have multiple grants, it is very important that you store your files in the correct pylon5 directory.
Three months after a grant expires, the files in any pylon5 directories associated with that grant will be deleted.
Sharing files on pylon5
Unix file permissions can be used to share pylon5 files among members of your group. To do this, use the chmod command to give a file protection of 750 to each directory in the path from your top-level pylon5 directory down to the directory that contains the files you want to share. Then use chmod to give each file you want to share a protection of 740.
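The two chmod steps can be sketched as a small helper. The function name share_with_group is hypothetical and the example paths are placeholders; the modes 750 and 740 are the ones given above.

```shell
#!/bin/bash
# share_with_group TOP FILE
# Give FILE mode 740 and give mode 750 to every directory from FILE's
# directory up to and including TOP (your top-level pylon5 directory).
share_with_group() {
    local top=${1%/}
    local file=$2
    local dir
    dir=$(dirname "$file")
    # Refuse files that are not located underneath TOP.
    case "$dir/" in
        "$top"/*) ;;
        *) echo "error: $file is not under $top" >&2; return 1 ;;
    esac
    chmod 740 "$file" || return 1
    # Walk upward one directory at a time until TOP has been handled.
    while :; do
        chmod 750 "$dir" || return 1
        [ "$dir" = "$top" ] && break
        dir=$(dirname "$dir")
    done
}

# Example (placeholder paths):
#   share_with_group /pylon5/groupname/username /pylon5/groupname/username/data/results.txt
```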
If you want more fine-grained control than this, such as giving only certain members of a group access to a file, then you need to use Access Control Lists (ACLs). Suppose, for example, that you want to give janeuser access to a file in a directory, but no one else in the group. Then issue a command similar to the following:
setfacl -m user:janeuser:rx directoryname
for each directory from your top-level pylon5 directory down your directory hierarchy to the directory that contains the file you want to share with janeuser. Then give janeuser access to a specific file with a command similar to
setfacl -m user:janeuser:r filename
User janeuser will now be able to read this file, but no one else in the group will have access to it.
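Because one setfacl command is needed per directory level, it can help to list the commands before running them. The sketch below only prints them; the function name acl_commands is hypothetical, and it assumes the paths contain no whitespace.

```shell
#!/bin/bash
# acl_commands USER TOP FILE
# Print (without running) the setfacl commands that give USER rx access to
# every directory from TOP down to FILE's directory, then r access to FILE.
# Assumption: FILE lies underneath TOP and the paths contain no whitespace.
acl_commands() {
    local user=$1
    local top=${2%/}
    local file=$3
    local rel dir part
    rel=${file#"$top"/}   # path of FILE relative to TOP
    dir=$top
    echo "setfacl -m user:$user:rx $dir"
    # Descend one path component at a time; the final component is the file.
    while [ "$rel" != "${rel#*/}" ]; do
        part=${rel%%/*}
        dir=$dir/$part
        rel=${rel#*/}
        echo "setfacl -m user:$user:rx $dir"
    done
    echo "setfacl -m user:$user:r $dir/$rel"
}

# Example (placeholder paths): review the output, then pipe it to bash to apply it.
#   acl_commands janeuser /pylon5/groupname/username /pylon5/groupname/username/data/results.txt
```

Afterwards, getfacl on the file shows whether the entry for janeuser is present.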
There are man pages for chmod, setfacl and getfacl.
Each of Bridges's nodes has a local file system attached to it. This local file system is only visible to the node to which it is attached. The local file system provides fast access to local storage.
This file space is available on all nodes as $LOCAL.
If your application performs a lot of small reads and writes, then you could benefit from using $LOCAL. Many genomics applications are of this type.
$LOCAL is only available when your job is running, and can only be used as working space for a running job. Once your job finishes your local files are inaccessible and deleted. To use local space, copy files to $LOCAL at the beginning of your job and back out to a persistent file space before your job ends.
If a node crashes all the $LOCAL files are lost. Therefore, you should checkpoint your $LOCAL files by copying them to pylon5 during long runs.
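One way to checkpoint during a long run is a background loop that copies $LOCAL to pylon5 at a fixed interval. This is a sketch under stated assumptions: the function names, the 1800-second interval, the checkpoint directory name and my_program are all hypothetical.

```shell
#!/bin/bash
# checkpoint_once SRC DEST
# Copy the contents of SRC into DEST with rsync, creating DEST if needed.
checkpoint_once() {
    local src=$1
    local dest=$2
    mkdir -p "$dest" && rsync -a "$src"/ "$dest"/
}

# checkpoint_every INTERVAL SRC DEST
# Repeat checkpoint_once every INTERVAL seconds until the loop is killed.
checkpoint_every() {
    local interval=$1
    local src=$2
    local dest=$3
    while sleep "$interval"; do
        checkpoint_once "$src" "$dest" || echo "checkpoint failed" >&2
    done
}

# In a job script (hypothetical program name and interval):
#   checkpoint_every 1800 "$LOCAL" "$SCRATCH/checkpoint" &
#   ckpt_pid=$!
#   ./my_program
#   kill "$ckpt_pid"
```

The copy may catch a file while your program is still writing it, so a checkpoint is only as consistent as your application's own output allows.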
If you are running a multi-node job the variable $LOCAL points to the local file space on the node that is running your rank 0 process.
The maximum amount of local space varies by node type. The RSM (128GB) nodes have a maximum of 3.7TB. The LSM (3TB) nodes have a maximum of 14TB and the ESM (12TB) nodes have a maximum of 49TB.
To check on your local file space usage type:
du -sh
There is no charge for the use of $LOCAL.
To use $LOCAL you must first copy your files to $LOCAL at the beginning of your script, before your executable runs. The following script is an example of how to do this:
RC=1
n=0
# Retry the copy until rsync succeeds or 20 attempts have been made.
while [[ $RC -ne 0 && $n -lt 20 ]]; do
    rsync -aP "$sourcedir" "$LOCAL/"
    RC=$?
    n=$((n + 1))
    sleep 10
done
Set $sourcedir to point to the directory that contains the files to be copied before you execute your program. This code will try at most 20 times to copy your files. If it succeeds, the loop will exit. If an invocation of rsync was unsuccessful, the loop will try again and pick up where it left off.
At the end of your job you must copy your results back from $LOCAL or they will be lost. The following script will do this.
mkdir -p "$SCRATCH/results"
RC=1
n=0
# Retry the copy until rsync succeeds or 20 attempts have been made.
while [[ $RC -ne 0 && $n -lt 20 ]]; do
    rsync -aP "$LOCAL/" "$SCRATCH/results"
    RC=$?
    n=$((n + 1))
    sleep 10
done
This code fragment copies your files to a directory named results in your pylon5 file space. The directory must exist before rsync runs, so the fragment creates it first with the mkdir command. Again, the loop runs at most 20 times and stops as soon as a copy succeeds.
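Since the copy-in and copy-out steps use the same retry pattern, it can be factored into one helper. The following is a sketch, not part of the Bridges environment: the function name retry is hypothetical, and unlike the loops above it does not sleep after a successful attempt and it returns a non-zero status when every attempt fails.

```shell
#!/bin/bash
# retry MAX DELAY COMMAND [ARGS...]
# Run COMMAND until it succeeds, at most MAX times, sleeping DELAY seconds
# between attempts. Returns 0 on success and 1 if all attempts fail.
retry() {
    local max=$1
    local delay=$2
    local n=0
    shift 2
    until "$@"; do
        n=$((n + 1))
        if [ "$n" -ge "$max" ]; then
            echo "giving up after $n attempts: $*" >&2
            return 1
        fi
        sleep "$delay"
    done
}

# Usage matching the scripts above:
#   retry 20 10 rsync -aP "$sourcedir" "$LOCAL/"
#   retry 20 10 rsync -aP "$LOCAL/" "$SCRATCH/results"
```

Checking the return status lets a job script stop early instead of running a program on incomplete input.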
Memory files ($RAMDISK)
You can also use the memory allocated for your job for IO rather than using disk space. This will offer the fastest IO on Bridges.
In a running job the environment variable $RAMDISK will refer to the memory associated with the nodes in use.
The amount of memory space available to you depends on the size of the memory on the nodes and the number of nodes you are using. You can only perform IO to the memory of nodes assigned to your job.
If you do not use all of the cores on a node, you are allocated memory in proportion to the number of cores you are using. Note that you cannot use 100% of a node's memory for IO; some is needed for program and data usage.
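The proportional rule can be worked through with shell arithmetic. The figures in this sketch are assumptions for illustration only: a node with 128GB of memory and 28 cores, with the job using 7 of them. The function name is hypothetical, and the result is an upper bound, since part of that memory is needed by the program itself.

```shell
#!/bin/bash
# memory_share_gb NODE_MEM_GB NODE_CORES CORES_USED
# Print the memory, in whole GB, allocated in proportion to the cores used.
memory_share_gb() {
    local node_mem_gb=$1
    local node_cores=$2
    local cores_used=$3
    echo $(( node_mem_gb * cores_used / node_cores ))
}

# Assumed example: 128GB node, 28 cores, 7 cores in use.
memory_share_gb 128 28 7   # prints 32
```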
$RAMDISK is only available to you while your job is running, and can only be used as working space for a running job. Once your job ends this space is inaccessible. To use memory files, copy files to $RAMDISK at the beginning of your job and back out to a permanent space before your job ends. If your job terminates abnormally your memory files are lost.
Within your job you can cd to $RAMDISK, copy files to and from it, and use it to open files. Use the command du -sh to see how much space you are using.
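The copy-in, run, copy-out pattern for memory files can be sketched as a single helper. The function name run_in_ramdisk and the example paths are hypothetical; the sketch uses cp rather than rsync and copies everything in $RAMDISK back out, including the staged input files.

```shell
#!/bin/bash
# run_in_ramdisk INPUT_DIR OUTPUT_DIR COMMAND [ARGS...]
# Stage INPUT_DIR into $RAMDISK, run COMMAND there, then copy the contents
# of $RAMDISK to OUTPUT_DIR. Returns the exit status of COMMAND.
run_in_ramdisk() {
    local input=$1
    local output=$2
    local status
    shift 2
    if [ -z "${RAMDISK:-}" ]; then
        echo "RAMDISK is not set; this must run inside a job" >&2
        return 1
    fi
    # Copy the input files into memory before the program starts.
    cp -a "$input"/. "$RAMDISK"/ || return 1
    ( cd "$RAMDISK" && "$@" )
    status=$?
    # Report how much memory space is in use, then save everything.
    du -sh "$RAMDISK" >&2
    mkdir -p "$output" || return 1
    cp -a "$RAMDISK"/. "$output"/ || return 1
    return "$status"
}

# Example (hypothetical program and paths):
#   run_in_ramdisk "$SCRATCH/input" "$SCRATCH/results" ./my_program
```

Results are copied out even when the command fails, which preserves partial output for debugging, but nothing is saved if the job itself is killed.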
If you are running a multi-node job the $RAMDISK variable points to the memory space on the node that is running your rank 0 process.