DMOVER

Scheduled Data Transfer for Distributed Computational Workflows

DMOVER is an easy-to-use harness for scheduling data transfer jobs, using GridFTP (globus-url-copy) and GSI-OpenSSH (gsiscp).

Installed on: DMOVER commands are available on the BigBen login nodes.

Usage Summary

To use DMOVER:

  1. ssh to a login node where DMOVER is available.
  2. Load the dmover module to make the DMOVER client programs available in your $PATH.
    module load dmover 
  3. Create (or retrieve) a transfer file, containing a list of file transfer source and destination pairs.
  4. Obtain a user proxy certificate for authentication to all file transfer source and destination hosts.
  5. Submit the DMOVER job.
    dsub  -f  path-to-the-transfer-file
  6. Check on the status of the DMOVER job.
    dstat
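
Step 3 can be scripted when many files are involved. The following is a minimal Python sketch, not part of DMOVER: the helper name make_transfer_lines and its arguments are hypothetical, and it assumes the line format used by the transfer file examples in this document (source URL, whitespace, destination URL, with # starting a comment line).

```python
from pathlib import PurePosixPath


def make_transfer_lines(local_files, remote_base_url, comment=None):
    """Pair each absolute local path with a URL under remote_base_url.

    Hypothetical helper: it only builds text lines, it does not call DMOVER.
    """
    lines = []
    if comment:
        lines.append(f"# {comment}")
    for path in local_files:
        p = PurePosixPath(path)
        if not p.is_absolute():
            raise ValueError(f"file:// URLs need a full path, got {path!r}")
        source = f"file://{p}"  # "/a/b" becomes "file:///a/b"
        destination = f"{remote_base_url.rstrip('/')}/{p.name}"
        lines.append(f"{source}   {destination}")
    return lines


lines = make_transfer_lines(
    ["/bessemer/janedoe/test/5GB", "/bessemer/janedoe/test/100MB"],
    "gsiftp://gridftp.mercury.ncsa.teragrid.org/gpfs_scratch1/janedoe",
    comment="Generated transfer file",
)
print("\n".join(lines))
```

Writing the printed lines to a file gives a transfer file that can be passed to dsub -f.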

Example: Transfer File 1

Each line of the transfer file pairs a local file:// URL with a fully qualified remote URL, which is either a GridFTP (gsiftp://) URL or a GSI-OpenSSH (gsiscp://) URL.

file:// URLs specify full paths on the login host to the specific files.

The first URL on each line is the source URL and the second is the destination URL.

# Comment: Transfer Example File 1
file:///bessemer/janedoe/test/5GB   gsiftp://gridftp.mercury.ncsa.teragrid.org/gpfs_scratch1/janedoe/5GBtest
gsiftp://gridftp.mercury.ncsa.teragrid.org/gpfs_scratch1/janedoe/10GB   file:///bessemer/janedoe/test/10GBtest
file:///bessemer/janedoe/test/100MB   gsiscp://tg-login.ncsa.teragrid.org/gpfs_scratch1/janedoe/100MBtest
gsiscp://tg-login.ncsa.teragrid.org/gpfs_scratch1/janedoe/10MB   file:///bessemer/janedoe/test/10MBtest
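
Before submitting a job it can help to check a transfer file for malformed lines. The sketch below is an illustration only and is not DMOVER's own parser: the function name parse_transfer_file is hypothetical, and it assumes that blank lines and lines starting with # are ignored and that every other line holds exactly two whitespace-separated, fully qualified URLs.

```python
from urllib.parse import urlsplit

# URL schemes that appear in the transfer file example above.
KNOWN_SCHEMES = {"file", "gsiftp", "gsiscp"}


def parse_transfer_file(text):
    """Return a list of (source_url, destination_url) pairs."""
    pairs = []
    for lineno, raw in enumerate(text.splitlines(), start=1):
        line = raw.strip()
        if not line or line.startswith("#"):
            continue  # assumed: comments and blank lines carry no transfers
        fields = line.split()
        if len(fields) != 2:
            raise ValueError(f"line {lineno}: expected 2 URLs, found {len(fields)}")
        for url in fields:
            scheme = urlsplit(url).scheme
            if scheme not in KNOWN_SCHEMES:
                raise ValueError(f"line {lineno}: unexpected scheme in {url!r}")
        pairs.append((fields[0], fields[1]))  # first is source, second is destination
    return pairs


sample = """# Comment: Transfer Example File 1
file:///bessemer/janedoe/test/5GB   gsiftp://gridftp.mercury.ncsa.teragrid.org/gpfs_scratch1/janedoe/5GBtest
gsiscp://tg-login.ncsa.teragrid.org/gpfs_scratch1/janedoe/10MB   file:///bessemer/janedoe/test/10MBtest
"""
pairs = parse_transfer_file(sample)
for source, destination in pairs:
    print(source, "->", destination)
```

This check accepts only fully qualified URLs, so it would reject host aliases defined in a user configuration file.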

Example: DMOVER job submission on BigBen

  1. ssh to a login node where DMOVER is available.
    myhost janedoe$ ssh tg-login.bigben.psc.teragrid.org
    janedoe@tg-login.bigben.psc.teragrid.org's password: 
    Last login: Tue Sep  2 23:21:32 2008 from 68.162.157.171
    
    Pittsburgh Supercomputing Center
    Cray XT3 SN 2352 login node.
    UNICOS/lc 1.5.47
      
    This system is for the use of authorized users only.  Unauthorized use may
    be monitored and recorded.  In the course of such monitoring or through
    system maintenance, the activities of authorized users may be monitored.
    By using this system you expressly consent to such monitoring.
    
    **SCRATCH file systems are for temporary storage only. They are volatile and
    not backed up so valuable data must be moved to FAR asap after a run
    completes. 
    
    janedoe@tg-login11:~> 
    
  2. Execute module load dmover to make the DMOVER client programs available in your $PATH.
    janedoe@tg-login11:~> module load dmover
    
    janedoe@tg-login11:~> which dsub
    /usr/local/packages/dmover/1.0/bin/dsub
    
  3. Create (or retrieve) a transfer file, containing a list of file transfer source and destination pairs.
    ...
    janedoe@tg-login11:~> cat example1
    # Comment: Transfer Example File 1
    file:///bessemer/janedoe/test/5GB gsiftp://gridftp.mercury.ncsa.teragrid.org/gpfs_scratch1/janedoe/5GBtest
    gsiftp://gridftp.mercury.ncsa.teragrid.org/gpfs_scratch1/janedoe/10GB file:///bessemer/janedoe/test/10GBtest
    file:///bessemer/janedoe/test/100MB gsiscp://tg-login.ncsa.teragrid.org/gpfs_scratch1/janedoe/100MBtest
    gsiscp://tg-login.ncsa.teragrid.org/gpfs_scratch1/janedoe/10MB file:///bessemer/janedoe/test/10MBtest
     
  4. Obtain a user proxy certificate for authentication to all file transfer source and destination hosts.
    janedoe@tg-login11:~> myproxy-logon -s myproxy.teragrid.org
    Enter MyProxy pass phrase:
    A credential has been received for user janedoe in /tmp/x509up_u17780.
    
    janedoe@tg-login11:~> grid-proxy-info
    subject  : /C=US/O=National Center for Supercomputing Applications/CN=Jane Doe
    issuer   : /C=US/O=National Center for Supercomputing Applications/OU=Certificate Authorities/CN=MyProxy
    identity : /C=US/O=National Center for Supercomputing Applications/CN=Jane Doe
    type     : end entity credential
    strength : 1024 bits
    path     : /tmp/x509up_u17780
    timeleft : 11:59:53
    
  5. Execute dsub -f path-to-the-transfer-file to submit the DMOVER job.

    The following example submits a DMOVER job that requests 4 nodes (-n 4) with 2 stripes per node (-s 2). Together these provide the 8-stripe GridFTP service used by the GridFTP transfers in the example1 transfer file. Each GSI-OpenSSH (gsiscp) transfer uses a single allocated node.

    janedoe@tg-login11:~> dsub -n 4 -s 2 -f example1
    Proxy Check Succeeded.
    Using proxy at /tmp/x509up_p32637.filetXSR1J.1
    DMover Submision Initiated
    Using 4 Node(s).
    Configuring 2 Stripe(s) per Node.
    8 Stripes total.
    Logging to /usr/users/0/janedoe/dmover.2008-09-03-07_31_35.32669
    Executing qsub: 1927.fred001.psc.teragrid.org
    
  6. Check on the status of the DMOVER job using the dstat command.
    ...
    janedoe@tg-login11:~> dstat
    Job id Name             User             Time Use S Queue
    ------ ---------------- ---------------- -------- - -----
    1927.fred001 dmover_transfer  janedoe          00:00:06 R dmover          
    
    ...
    janedoe@tg-login11:~> dstat
    janedoe@tg-login11:~>
    

    (Done: the job no longer appears in the dstat listing.)
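
The submission in step 5 can also be assembled from a script. This is a hedged Python sketch: the function dsub_command is hypothetical, and the meaning of -n (nodes) and -s (stripes per node) is inferred from the sample output above, so check the dsub man page before relying on it. The sketch only builds and prints the command line; it does not run dsub.

```python
import shlex


def dsub_command(transfer_file, nodes=None, stripes_per_node=None):
    """Build a dsub argument list using only the flags shown in the example."""
    argv = ["dsub"]
    for flag, value in (("-n", nodes), ("-s", stripes_per_node)):
        if value is None:
            continue  # leave the option out and let dsub use its default
        if value < 1:
            raise ValueError(f"{flag} must be a positive integer")
        argv += [flag, str(value)]
    argv += ["-f", transfer_file]
    return argv


def total_stripes(nodes, stripes_per_node):
    """Total stripes as reported in the sample output: nodes times stripes per node."""
    return nodes * stripes_per_node


argv = dsub_command("example1", nodes=4, stripes_per_node=2)
print(shlex.join(argv))
print(total_stripes(4, 2), "stripes total")
```

On a login node with the dmover module loaded, the list could be handed to subprocess.run(argv, check=True) to submit the job.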

Example: User Configuration File

You can create an SSH-style configuration file to store commonly used DMOVER preferences and to define shortcut aliases for long host URLs. To learn more about creating a DMOVER configuration file, see the dmover_config man page in the "Related Documents" section below.

janedoe@tg-login11:~> cat ~/.dmover/config
# Example ~/.dmover/config user configuration file

Host bigben
	URI gsiftp
		Hostname gridftp.bigben.psc.teragrid.org
	URI gsiscp
		Hostname gridftp.bigben.psc.teragrid.org

Host mercury
	URI gsiftp
		Hostname gridftp.mercury.ncsa.teragrid.org
	URI gsiscp
		Hostname tg-login.ncsa.teragrid.org
	URIOrder gsiftp, gsiscp
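
To make the structure of the file concrete, here is a rough Python sketch that reads the layout shown above into a dictionary. It is not the DMOVER parser: the function name parse_dmover_config is hypothetical, and the grammar (keyword followed by a value, Hostname belonging to the most recent URI, URIOrder as a comma-separated list) is an assumption drawn from this one example. The dmover_config man page is the authority on the real format.

```python
def parse_dmover_config(text):
    """Return {alias: {"uris": {scheme: hostname}, "order": [scheme, ...]}}."""
    hosts = {}
    alias = None   # Host block currently being read
    scheme = None  # URI entry currently being read
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue
        parts = line.split(None, 1)
        keyword = parts[0]
        value = parts[1].strip() if len(parts) > 1 else ""
        if keyword == "Host":
            alias, scheme = value, None
            hosts[alias] = {"uris": {}, "order": []}
        elif alias is None:
            raise ValueError(f"{keyword!r} appears before any Host line")
        elif keyword == "URI":
            scheme = value
        elif keyword == "Hostname":
            if scheme is None:
                raise ValueError("Hostname appears before any URI line")
            hosts[alias]["uris"][scheme] = value
        elif keyword == "URIOrder":
            hosts[alias]["order"] = [s.strip() for s in value.split(",")]
    return hosts


sample = """
Host mercury
    URI gsiftp
        Hostname gridftp.mercury.ncsa.teragrid.org
    URI gsiscp
        Hostname tg-login.ncsa.teragrid.org
    URIOrder gsiftp, gsiscp
"""
hosts = parse_dmover_config(sample)
print(hosts["mercury"]["uris"]["gsiscp"])
print(hosts["mercury"]["order"])
```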

Example: Transfer File 2

With the ~/.dmover/config file shown above, Transfer File 1 can be simplified to:

janedoe@tg-login11:~> cat example2
# Comment: Transfer Example File 2
file:///bessemer/janedoe/test/5GB   mercury:/gpfs_scratch1/janedoe/5GBtest
mercury:/gpfs_scratch1/janedoe/10GB   file:///bessemer/janedoe/test/10GBtest
file:///bessemer/janedoe/test/100MB   mercury:/gpfs_scratch1/janedoe/100MBtest
mercury:/gpfs_scratch1/janedoe/10MB   file:///bessemer/janedoe/test/10MBtest

Because the mercury alias in the configuration file sets URIOrder to gsiftp, gsiscp, DMOVER first attempts each transfer in the example2 transfer file with GridFTP. If a GridFTP transfer fails, DMOVER retries it with gsiscp.
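
The alias expansion and fallback order can be pictured with a small Python sketch. It illustrates the behavior described here and is not DMOVER's implementation: the names candidate_urls and transfer_with_fallback, the dictionary layout, and the attempt callback are all hypothetical.

```python
# Hand-written stand-in for the mercury entry of the configuration file.
HOSTS = {
    "mercury": {
        "uris": {
            "gsiftp": "gridftp.mercury.ncsa.teragrid.org",
            "gsiscp": "tg-login.ncsa.teragrid.org",
        },
        "order": ["gsiftp", "gsiscp"],
    }
}


def candidate_urls(spec, hosts):
    """Expand 'alias:/path' into full URLs, in the alias's URIOrder."""
    alias, sep, path = spec.partition(":")
    if not sep or path.startswith("//") or alias not in hosts:
        return [spec]  # already a full URL such as file:///..., or not an alias
    entry = hosts[alias]
    order = entry["order"] or list(entry["uris"])
    return [f"{scheme}://{entry['uris'][scheme]}{path}" for scheme in order]


def transfer_with_fallback(urls, attempt):
    """Try each URL in order; return the first one for which attempt() succeeds."""
    for url in urls:
        if attempt(url):
            return url
    return None


urls = candidate_urls("mercury:/gpfs_scratch1/janedoe/5GBtest", HOSTS)
print(urls)

# Simulate a GridFTP failure so that the gsiscp URL is used instead.
used = transfer_with_fallback(urls, lambda url: not url.startswith("gsiftp://"))
print(used)
```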

Related Documents

Man pages

  • dsub ... DMOVER job submission.
  • ddel ... DMOVER job cancellation.
  • dstat ... status of submitted DMOVER jobs in the DMOVER queue.
  • dmover_config ... format of ~/.dmover/config DMOVER user configuration file.

Related work:

  • Chris Rapier and Benjamin Bennett (PSC), Michael Stevens (CMU): High Performance SSH/SCP - HPN-SSH. Performance improvements provided by HPN-SSH are incorporated into GSI-OpenSSH.

Questions?

Please address your questions regarding DMOVER to PSC User Services.