DMOVER
Scheduled Data Transfer for Distributed Computational Workflows
DMOVER is an easy-to-use harness for scheduling data transfer jobs, using GridFTP (globus-url-copy) and GSI-OpenSSH (gsiscp).
Installed on: DMOVER commands are available on the BigBen login nodes.
Usage Summary
To use DMOVER:
- ssh to a login node where DMOVER is available.
- Load the dmover module to make the DMOVER client programs available in your $PATH.
module load dmover
- Create (or retrieve) a transfer file, containing a list of file transfer source and destination pairs.
- Obtain a user proxy certificate for authentication to all file transfer source and destination hosts.
- Submit the DMOVER job.
dsub -f path-to-the-transfer-file
- Check on the status of the DMOVER job.
dstat
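The submission step can also be scripted. The following is a minimal sketch in Python that only assembles a dsub argument list and does not execute anything. The -f option is the one shown above; the -n (nodes) and -s (stripes per node) options are taken from the BigBen example later on this page. The helper names build_dsub_command and total_stripes are hypothetical, and treating -n and -s as optional is an assumption based on the usage summary, not on the dsub man page.

```python
from typing import List, Optional


def build_dsub_command(transfer_file: str,
                       nodes: Optional[int] = None,
                       stripes_per_node: Optional[int] = None) -> List[str]:
    """Assemble a dsub argument list (hypothetical helper; nothing is run)."""
    command = ["dsub"]
    if nodes is not None:
        if nodes < 1:
            raise ValueError("nodes must be at least 1")
        command += ["-n", str(nodes)]
    if stripes_per_node is not None:
        if stripes_per_node < 1:
            raise ValueError("stripes_per_node must be at least 1")
        command += ["-s", str(stripes_per_node)]
    command += ["-f", transfer_file]
    return command


def total_stripes(nodes: int, stripes_per_node: int) -> int:
    """Total GridFTP stripes: nodes multiplied by stripes per node."""
    return nodes * stripes_per_node


if __name__ == "__main__":
    print(" ".join(build_dsub_command("example1", nodes=4, stripes_per_node=2)))
    print(total_stripes(4, 2), "stripes total")
```

On a login node with the dmover module loaded, the returned list could be passed to subprocess.run; building the list separately keeps the sketch testable on machines where dsub is not installed.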
Example: Transfer File 1
Each source and destination pair consists of a file:// URL and either a fully qualified GridFTP (gsiftp://) URL or a fully qualified GSI-OpenSSH (gsiscp://) URL.
file:// URLs specify full paths on the login host to the specific files.
The first URL on each line is the source URL and the second is the destination URL.
# Comment: Transfer Example File 1
file:///bessemer/janedoe/test/5GB gsiftp://gridftp.mercury.ncsa.teragrid.org/gpfs_scratch1/janedoe/5GBtest
gsiftp://gridftp.mercury.ncsa.teragrid.org/gpfs_scratch1/janedoe/10GB file:///bessemer/janedoe/test/10GBtest
file:///bessemer/janedoe/test/100MB gsiscp://tg-login.ncsa.teragrid.org/gpfs_scratch1/janedoe/100MBtest
gsiscp://tg-login.ncsa.teragrid.org/gpfs_scratch1/janedoe/10MB file:///bessemer/janedoe/test/10MBtest
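A transfer file in this format can be checked before submission. The sketch below is an illustration in Python, not part of DMOVER: it reads transfer-file text and returns the (source, destination) pairs. It assumes, based only on the example above, that lines starting with # are comments, that blank lines are ignored, and that the two URLs on a line are separated by whitespace. The function name parse_transfer_file is hypothetical, and alias-style endpoints from a user configuration file are not handled here.

```python
from typing import List, Tuple
from urllib.parse import urlparse

# URL schemes that appear in the transfer file examples on this page.
KNOWN_SCHEMES = {"file", "gsiftp", "gsiscp"}


def parse_transfer_file(text: str) -> List[Tuple[str, str]]:
    """Return (source, destination) pairs from transfer-file text."""
    pairs = []
    for number, line in enumerate(text.splitlines(), start=1):
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        fields = line.split()
        if len(fields) != 2:
            raise ValueError(f"line {number}: expected 'source destination'")
        for url in fields:
            if urlparse(url).scheme not in KNOWN_SCHEMES:
                raise ValueError(f"line {number}: unsupported URL {url!r}")
        pairs.append((fields[0], fields[1]))
    return pairs


if __name__ == "__main__":
    sample = (
        "# Comment: Transfer Example File 1\n"
        "file:///bessemer/janedoe/test/5GB "
        "gsiftp://gridftp.mercury.ncsa.teragrid.org/gpfs_scratch1/janedoe/5GBtest\n"
    )
    for source, destination in parse_transfer_file(sample):
        print(source, "->", destination)
```

Rejecting malformed lines up front is a design choice of this sketch; how dsub itself reports a bad transfer file is not described on this page.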
Example: DMOVER job submission on BigBen
- ssh to a login node where DMOVER is available.
myhost janedoe$ ssh tg-login.bigben.psc.teragrid.org
janedoe@tg-login.bigben.psc.teragrid.org's password:
Last login: Tue Sep 2 23:21:32 2008 from 68.162.157.171
Pittsburgh Supercomputing Center Cray XT3 SN 2352 login node. UNICOS/lc 1.5.47
This system is for the use of authorized users only. Unauthorized use may be monitored and recorded. In the course of such monitoring or through system maintenance, the activities of authorized users may be monitored. By using this system you expressly consent to such monitoring.
**SCRATCH file systems are for temporary storage only. They are volatile and not backed up so valuable data must be moved to FAR asap after a run completes.
janedoe@tg-login11:~>
- Execute module load dmover to make the DMOVER client programs available in your $PATH.
janedoe@tg-login11:~> module load dmover
janedoe@tg-login11:~> which dsub
/usr/local/packages/dmover/1.0/bin/dsub
- Create (or retrieve) a transfer file, containing a list of file transfer source and destination pairs.
...
janedoe@tg-login11:~> cat example1
# Comment: Transfer Example File 1
file:///bessemer/janedoe/test/5GB gsiftp://gridftp.mercury.ncsa.teragrid.org/gpfs_scratch1/janedoe/5GBtest
gsiftp://gridftp.mercury.ncsa.teragrid.org/gpfs_scratch1/janedoe/10GB file:///bessemer/janedoe/test/10GBtest
file:///bessemer/janedoe/test/100MB gsiscp://tg-login.ncsa.teragrid.org/gpfs_scratch1/janedoe/100MBtest
gsiscp://tg-login.ncsa.teragrid.org/gpfs_scratch1/janedoe/10MB file:///bessemer/janedoe/test/10MBtest
- Obtain a user proxy certificate for authentication to all file source and destination hosts.
janedoe@tg-login11:~> myproxy-logon -s myproxy.teragrid.org
Enter MyProxy pass phrase:
A credential has been received for user janedoe in /tmp/x509up_u17780.
janedoe@tg-login11:~> grid-proxy-info
subject  : /C=US/O=National Center for Supercomputing Applications/CN=Jane Doe
issuer   : /C=US/O=National Center for Supercomputing Applications/OU=Certificate Authorities/CN=MyProxy
identity : /C=US/O=National Center for Supercomputing Applications/CN=Jane Doe
type     : end entity credential
strength : 1024 bits
path     : /tmp/x509up_u17780
timeleft : 11:59:53
- Execute dsub -f path-to-the-transfer-file to submit the DMOVER job.
The following example submits a DMOVER job that requests 4 nodes with 2 stripes per node, giving an 8-stripe GridFTP service for the GridFTP transfers listed in the example1 transfer file. Each GSI-OpenSSH (gsiscp) transfer uses a single allocated node.
janedoe@tg-login11:~> dsub -n 4 -s 2 -f example1
Proxy Check Succeeded.
Using proxy at /tmp/x509up_p32637.filetXSR1J.1
DMover Submision Initiated
Using 4 Node(s).
Configuring 2 Stripe(s) per Node.
8 Stripes total.
Logging to /usr/users/0/janedoe/dmover.2008-09-03-07_31_35.32669
Executing qsub:
1927.fred001.psc.teragrid.org
- Check on the status of the DMOVER job using the dstat command.
...
janedoe@tg-login11:~> dstat
Job id        Name             User             Time Use S Queue
------        ---------------- ---------------- -------- - -----
1927.fred001  dmover_transfer  janedoe          00:00:06 R dmover
...
janedoe@tg-login11:~> dstat
janedoe@tg-login11:~>
(Done)
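In the session above, an empty dstat listing means the job has left the queue. A script that waits for completion could apply the same test to captured dstat output. The sketch below is a hedged illustration in Python: it assumes job rows begin with a numeric job id such as 1927.fred001, as in the example, and that header, separator, and prompt lines do not. The function name active_job_ids is hypothetical.

```python
from typing import List


def active_job_ids(dstat_output: str) -> List[str]:
    """Return the job ids listed in dstat output; empty means nothing is queued."""
    job_ids = []
    for line in dstat_output.splitlines():
        fields = line.split()
        # Job rows start with a numeric id such as "1927.fred001";
        # the "Job id" header and the dashed separator row do not.
        if fields and fields[0][0].isdigit():
            job_ids.append(fields[0])
    return job_ids


if __name__ == "__main__":
    listing = (
        "Job id Name User Time Use S Queue\n"
        "------ ---------------- ---------------- -------- - -----\n"
        "1927.fred001 dmover_transfer janedoe 00:00:06 R dmover\n"
    )
    print(active_job_ids(listing))
    print(active_job_ids(""))
```

The listing would normally come from running dstat on the login node; it is hard-coded here so the sketch runs anywhere.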
Example: User Configuration File
You can create an SSH-style configuration file to store commonly used DMOVER preferences and to define shortcut aliases for long host URLs. To learn more about creating a DMOVER configuration file, see the dmover_config man page in the "Related Documents" section below.
janedoe@tg-login11:~> cat ~/.dmover/config
# Example ~/.dmover/config user configuration file
Host bigben
    URI gsiftp
    Hostname gridftp.bigben.psc.teragrid.org
    URI gsiscp
    Hostname gridftp.bigben.psc.teragrid.org
Host mercury
    URI gsiftp
    Hostname gridftp.mercury.ncsa.teragrid.org
    URI gsiscp
    Hostname tg-login.ncsa.teragrid.org
    URIOrder gsiftp, gsiscp
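To make the structure of this file concrete, the sketch below reads configuration text into a Python dictionary. It is an illustration only: the authoritative format is defined by the dmover_config man page, and the sketch knows just the four keywords visible in the example (Host, URI, Hostname, URIOrder). It assumes that each Hostname applies to the most recent URI, and that URI, Hostname, and URIOrder entries belong to the most recent Host. It reads the file as a stream of whitespace-separated tokens, so it does not depend on how keywords are split across lines. The function name parse_dmover_config is hypothetical.

```python
from typing import Dict, List, Optional

KEYWORDS = {"Host", "URI", "Hostname", "URIOrder"}


def parse_dmover_config(text: str) -> Dict[str, dict]:
    """Parse config text into {alias: {"hosts": {scheme: hostname}, "order": [...]}}."""
    tokens: List[str] = []
    for line in text.splitlines():
        line = line.split("#", 1)[0]  # drop comments
        tokens.extend(line.replace(",", " ").split())

    config: Dict[str, dict] = {}
    alias: Optional[str] = None
    scheme: Optional[str] = None
    i = 0
    while i < len(tokens):
        keyword = tokens[i]
        if keyword not in KEYWORDS:
            raise ValueError(f"unexpected token {keyword!r}")
        if keyword != "Host" and alias is None:
            raise ValueError(f"{keyword} appears before any Host entry")
        i += 1
        if keyword == "URIOrder":
            # Collect every scheme name up to the next keyword.
            while i < len(tokens) and tokens[i] not in KEYWORDS:
                config[alias]["order"].append(tokens[i])
                i += 1
            continue
        if i >= len(tokens):
            raise ValueError(f"{keyword} is missing its value")
        value = tokens[i]
        i += 1
        if keyword == "Host":
            alias, scheme = value, None
            config[alias] = {"hosts": {}, "order": []}
        elif keyword == "URI":
            scheme = value
        else:  # Hostname
            if scheme is None:
                raise ValueError("Hostname appears before any URI entry")
            config[alias]["hosts"][scheme] = value
    return config


if __name__ == "__main__":
    example = """
    Host mercury
    URI gsiftp
    Hostname gridftp.mercury.ncsa.teragrid.org
    URI gsiscp
    Hostname tg-login.ncsa.teragrid.org
    URIOrder gsiftp, gsiscp
    """
    print(parse_dmover_config(example))
```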
Example: Transfer File 2
With a ~/.dmover/config file like the one illustrated above, Transfer Example File 1 can be simplified to:
janedoe@tg-login11:~> cat example2
# Comment: Transfer Example File 2
file:///bessemer/janedoe/test/5GB mercury:/gpfs_scratch1/janedoe/5GBtest
mercury:/gpfs_scratch1/janedoe/10GB file:///bessemer/janedoe/test/10GBtest
file:///bessemer/janedoe/test/100MB mercury:/gpfs_scratch1/janedoe/100MBtest
mercury:/gpfs_scratch1/janedoe/10MB file:///bessemer/janedoe/test/10MBtest
Note that given the URIOrder specified for the mercury alias in the configuration file, each transfer in the example2 transfer file will first be attempted using GridFTP. If transfers attempted using GridFTP fail, DMOVER will retry using the gsiscp method instead.
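The fallback order described here can be pictured as expanding each alias into a list of fully qualified URLs, tried from first to last. The Python sketch below illustrates that idea with the mercury entry from the example configuration. It is not DMOVER's own resolution code, which this page does not show; the MERCURY dictionary layout and the function name candidate_urls are hypothetical.

```python
from typing import Dict, List

# The mercury alias from the example configuration, written out by hand.
MERCURY = {
    "hosts": {
        "gsiftp": "gridftp.mercury.ncsa.teragrid.org",
        "gsiscp": "tg-login.ncsa.teragrid.org",
    },
    "order": ["gsiftp", "gsiscp"],  # URIOrder gsiftp, gsiscp
}


def candidate_urls(endpoint: str, aliases: Dict[str, dict]) -> List[str]:
    """Expand 'alias:/path' into fully qualified URLs, in the order to try them."""
    if "://" in endpoint:
        return [endpoint]  # already fully qualified, e.g. file:///...
    alias, separator, path = endpoint.partition(":")
    if not separator or alias not in aliases:
        raise ValueError(f"unknown endpoint {endpoint!r}")
    entry = aliases[alias]
    return [f"{scheme}://{entry['hosts'][scheme]}{path}"
            for scheme in entry["order"]]


if __name__ == "__main__":
    for url in candidate_urls("mercury:/gpfs_scratch1/janedoe/5GBtest",
                              {"mercury": MERCURY}):
        print(url)
```

With this entry, the first candidate is the gsiftp:// URL used in Transfer Example File 1, and the second is the gsiscp:// URL that would be used on retry.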
Related Documents
Man pages
- dsub ... DMOVER job submission.
- ddel ... DMOVER job cancellation.
- dstat ... DMOVER job submission status in DMOVER queue.
- dmover_config ... format of ~/.dmover/config DMOVER user configuration file.
Papers and presentations:
- Derek Simmel & Robert Budden, DMOVER: Scheduled Data Transfer for Distributed Computational Workflows, TeraGrid 2008, Las Vegas, NV, June, 2008.
- Derek Simmel & Robert Budden, DMOVER: Scheduled Data Transfer for HPC Grid Workflows, HPC 2008, Cetraro, Italy, July, 2008.
Related work:
- Chris Rapier & Benjamin Bennett, PSC, Michael Stevens, CMU, High Performance SSH/SCP - HPN-SSH. Performance improvements provided by HPN-SSH are incorporated into GSI-OpenSSH.
Questions?
Please address your questions regarding DMOVER to PSC User Services.