Golem
To support the file repository needs of its users, PSC has deployed a file archival system named golem.
Golem is a combination tape-and-disk system. Initially, files moved to golem reside on disk. Factors such as file size and time of last access determine when a file gets migrated to tape. When you access a migrated file, it is automatically read in from tape.
Golem passwords
Your golem password is your AFS password, which is also your Kerberos password. Initially, it is the password given on your PSC initial account form. To change your AFS password, you can use the kpasswd command on bigben, rachel, jonas or on a PSC AFS system, or use a web form at http://www.psc.edu/private/htbin/passwd/setpw.cgi.
Methods of file transfer
Golem is strictly a file archiver and is not used for computing. You also cannot log in to golem to perform file transfers. Instead, while logged in to another system, either at PSC or at your remote site, you can use any of the methods described below to transfer files to and from golem.
If you are going to store a file to golem that is 2 Terabytes or larger, please contact User Services ahead of time so that special arrangements can be made to handle your file.
tcscp
The PSC-developed tcscp program can be used to transfer files between golem and jonas or rachel. For more information please see the discussions of the use of tcscp on rachel and jonas.
far
The PSC-developed far program can be used to transfer files between golem and PSC's computational platforms, including bigben, rachel, and jonas. On bigben, it should only be used interactively. It can be used interactively or in batch jobs on rachel and jonas.
kftp and krcp
Another option for golem file transfers is to use Kerberos ftp (kftp) or Kerberos rcp (krcp). Golem runs Kerberos version 5 (K5) Kerberized services.
You must have a K5 client on your remote machine to use kftp or krcp to golem. Many Kerberos distributions are available. We recommend these:
| Unix K5 client: | http://www.pdc.kth.se/heimdal |
| Windows K5 client: | http://sourceforge.net/projects/filezilla |
K5 clients are installed on all PSC systems.
Before you transfer files to and from golem using Kerberos, you must authenticate yourself to golem. The kinit command is used to do this. The format of the kinit command is
% kinit username@Kerberos realm name
For 'username' substitute your golem username and for 'Kerberos realm name' substitute PSC.EDU, which is the name of PSC's Kerberos realm. After you press Return, you are prompted for your password, which, for realm PSC.EDU, is your AFS password.
Once you are authenticated you can use kftp or krcp to actually do the file transfers.
You should verify that the Kerberos commands operate on your local system as described here. Some installations of Kerberized ftp differ in their implementation. For example, at NCSA, the ftp program is "kerberized" and there is no separate "kftp" command. Then the session would be something like:
kinit user@PSC.EDU
ftp golem.psc.edu
From SDSC, a port number is needed on the kftp command:
kinit user@PSC.EDU
kftp golem.psc.edu 21
kftp
% kftp remote_machine_name
For 'remote_machine_name' substitute 'golem.psc.edu'. The kftp command functions identically to the ftp command.
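Because kftp accepts the same commands as ftp, a typical session comes down to a handful of standard ftp commands. The following is a minimal sketch that only prints those commands for one upload and one download. The function name kftp_commands and the file names are hypothetical placeholders, not part of any PSC tool.

```shell
#!/bin/sh
# Sketch: list the standard ftp commands for one upload and one download.
# kftp_commands is a hypothetical helper; the file names are placeholders.
kftp_commands() {
    cat <<EOF
binary
put $1
get $2
quit
EOF
}

# Print the commands for review. After a successful kinit, they could be
# typed at the prompt of "kftp golem.psc.edu".
kftp_commands source.f90 output.dat
```

Printing the commands rather than piping them into kftp keeps the sketch safe to run on a machine that has no K5 client installed.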
krcp
% krcp source-file destination-file
The format of "source-file" is hostname:path. If the source file is a local file, the hostname is not necessary. Specify the "destination-file" as username@hostname:/path. If username is omitted, the username on the local machine is used. The krcp command functions identically to the rcp command.
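To illustrate the username@hostname:/path form of the destination, here is a small sketch that assembles a krcp command line for storing a local file on golem. The helper golem_dest, the username janeuser, the file name results.tar, and the remote path /path/results.tar are all assumptions made for illustration. The script prints the commands instead of running them.

```shell
#!/bin/sh
# Sketch: build a krcp command line for storing a local file on golem.
# The username, file name, and remote path are hypothetical placeholders.
GOLEM_HOST="golem.psc.edu"

# golem_dest USER PATH -> prints USER@golem.psc.edu:PATH
golem_dest() {
    printf '%s@%s:%s\n' "$1" "$GOLEM_HOST" "$2"
}

# Print the commands rather than run them, so they can be reviewed first.
echo "kinit janeuser@PSC.EDU"
echo "krcp results.tar $(golem_dest janeuser /path/results.tar)"
```

The local source file needs no hostname, as described above, so only the destination is built by the helper.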
GridFTP
GridFTP is a secure, reliable, high-performance data transfer protocol optimized for high-bandwidth wide-area networks. It is based on FTP, but includes additional features to meet requirements from data grid projects.
The GridFTP servers at PSC are:
gridftp.bigben.psc.teragrid.org
gridftp.rachel.psc.teragrid.org
gridftp.archive.psc.teragrid.org
Please note that transfers to gridftp.rachel.psc.teragrid.org are stored on the archiver. Rachel users should store data on the archiver, then stage it over to rachel during job execution. After the job finishes, results should be stored back to the archiver.
For more information on GridFTP, please see http://www.globus.org/toolkit/docs/4.0/data/gridftp.
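As a rough sketch of what a GridFTP transfer to the archiver might look like, the following script builds a command line for globus-url-copy, the command-line GridFTP client in the Globus Toolkit. It assumes that you already hold a valid grid credential and that globus-url-copy is installed. The helper gsiftp_url and both file paths are hypothetical placeholders. The script only prints the command.

```shell
#!/bin/sh
# Sketch: build a globus-url-copy command for a transfer to the archiver's
# GridFTP server. The local and remote paths are hypothetical placeholders.
GRIDFTP_HOST="gridftp.archive.psc.teragrid.org"

# gsiftp_url ABSOLUTE_REMOTE_PATH -> prints gsiftp://HOST/PATH
gsiftp_url() {
    printf 'gsiftp://%s%s\n' "$GRIDFTP_HOST" "$1"
}

# Local files are given as file:// URLs with an absolute path.
echo "globus-url-copy file:///tmp/results.tar $(gsiftp_url /path/results.tar)"
```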
scp
Another option for file transfer to and from golem is the scp command. The format of the scp command is
scp source-filename target-filename
The scp command's format and its effect are modeled on the cp command, with the addition of the ability to specify a userid and machine name for either the source or target filename.
Since you cannot log in to golem to perform file transfers, you will always be logged in to your remote system when using scp. Thus, if you are transferring to golem, your scp command will be similar to
scp source.f90 janeuser@golem.psc.edu:source.f90
This command will copy source.f90 from your remote system to golem, where it will be managed by DMF.
If you are transferring from golem to your remote system, your scp command will be similar to
scp janeuser@golem.psc.edu:output.dat output.dat
This command will copy output.dat from golem to your remote system.
The first time you use scp to transfer files to or from golem you will receive a message similar to
Host key not found from list of known hosts. Are you sure you want
to continue connecting?
Answer 'yes' to make the connection. You should not receive this message on subsequent connections.
You will then be prompted for your golem password. You will need to supply your golem password each time you use scp.
Scp is part of the ssh distribution. PSC provides a recommended list of sites that distribute ssh. For more information on scp see the scp man page.
We strongly recommend that you use kftp rather than scp if kftp is available.
sftp
The sftp protocol is a secure version of ftp. The format and commands are similar to ftp. From a local machine, open a connection to a remote machine, log in to the remote machine, and use put and get to transfer files. We strongly recommend that you not use sftp if another file transfer method is available.
Performance
The performance of the transfer methods to and from golem can be ranked from best to worst as follows.
- tcscp
- far
- kftp, gridftp
- scp
- sftp
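One way to apply this ranking is to check which of the tools are installed on the machine you are using and take the first one found, in ranked order. The sketch below does only that. The function first_available is a hypothetical helper, and globus-url-copy is assumed as the GridFTP client. Note that an installed tool is not necessarily usable from your host: tcscp and far, for example, work only from the PSC platforms described above.

```shell
#!/bin/sh
# Sketch: report the first transfer tool, in the ranked order above, that is
# installed on this machine. globus-url-copy is assumed as the GridFTP client.
first_available() {
    for tool in "$@"; do
        if command -v "$tool" >/dev/null 2>&1; then
            echo "$tool"
            return 0
        fi
    done
    return 1
}

first_available tcscp far kftp globus-url-copy scp sftp \
    || echo "no transfer tool found" >&2
```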
See also
- Guidelines for file storage
- File protections
- File retention after a grant expires
- Maintenance downtimes
- The Andrew File System (AFS)