Data Exacell (DXC): User Guide

Getting started

When your account has been created on the DXC, you will receive an email with your username.  You are not assigned an initial password; you must create a password using the PSC password change utility before you can log in to the DXC.

Once you have created your password, use SSH to connect to the DXC. SSH is a program that enables secure logins over an insecure network. It is client-server software, which means that both your local computer and the remote computer must have it installed. Free versions are available for Macs, Windows machines, and many versions of Unix.

Read more about using SSH to connect to PSC systems.  

Nodes

The DXC is composed of many distinct machines, called nodes. Some of the nodes are general purpose, and others have very specialized uses, like Hadoop or Galaxy. The tools that you want to use will determine which nodes you will log in to.

File spaces

There are several distinct file spaces on the DXC.

/home
This is the default space for all your files. You have a home directory named /home/username, where username is your DXC username. The /home file system is shared across the DXC, so you can access your files in /home from all DXC nodes. There is currently no quota on the /home file system.
Hadoop File System
Hadoop users will be given space on the Hadoop File System, or HDFS. The HDFS is only available on the Hadoop nodes: dxchd01 through dxchd04. All files to be used with Hadoop must reside in the HDFS. For more information about the HDFS, see the DXC Hadoop and Spark documentation.
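
Because the HDFS is visible only from the Hadoop nodes, a script can check where it is running before it touches HDFS. The following is a minimal sketch that relies only on the node names given above (dxchd01 through dxchd04). The function name is_hadoop_node is hypothetical, and the sketch assumes the machine's host name begins with the short node name.

```shell
# Return success if the given host name is one of the DXC Hadoop nodes
# (dxchd01 through dxchd04, as listed in this guide). A fully qualified
# name is reduced to its short name before the comparison.
is_hadoop_node() {
    short_name=${1%%.*}
    case "$short_name" in
        dxchd0[1-4]) return 0 ;;
        *)           return 1 ;;
    esac
}

# Report whether the current machine is a Hadoop node.
if is_hadoop_node "$(uname -n)"; then
    echo "HDFS is available on this node"
else
    echo "HDFS is not available here; log in to dxchd01 through dxchd04"
fi
```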

File Transfer

There are multiple methods to move files in and out of the file systems on the DXC.  

Our recommended method for file transfer into the DXC is Globus Online. If you cannot use Globus Online, but do have access to Globus client software, we recommend globus-url-copy. Otherwise, you can use sftp or scp.
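
The order of preference above can be sketched as a small shell function. This is an illustration only: the function name recommend_transfer_tool and its "yes" argument are hypothetical, and the sketch assumes that having globus-url-copy on your PATH is what "access to Globus client software" means.

```shell
# Print the transfer method this guide recommends, in order of preference.
# $1 is "yes" if you can use Globus Online (you have a Globus account or
# InCommon credentials); any other value means you cannot.
recommend_transfer_tool() {
    if [ "$1" = "yes" ]; then
        echo "Globus Online"
    elif command -v globus-url-copy >/dev/null 2>&1; then
        echo "globus-url-copy"
    else
        echo "sftp or scp"
    fi
}

# Example: what to use when Globus Online is not an option.
recommend_transfer_tool no
```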

Globus Online

Globus Online is our recommended method of transferring data to and from the DXC. 

Before you can use Globus Online to transfer files, you must either create a Globus account or be affiliated with an InCommon institution.

Create a Globus Online account

Go to https://www.globus.org/, click "Sign Up" in the upper right, and follow the instructions.

Once you have done that, you can use Globus Online to transfer files to the DXC.

Use InCommon credentials

If you are affiliated with an InCommon institution, you can use your userid and password for that institution to authenticate to Globus.

Before you can use Globus with your InCommon credentials, you must register with PSC as an InCommon user.

To register with PSC as an InCommon user, follow these steps.

  1. Go to https://cilogon.org/
  2. Select your institution from the 'Select an Identity Provider' list
  3. Click on the 'Log On' button

    This will take you to a login page for your institution.

  4. Enter your username and password for your institution
  5. Click on the 'Login' button

    You will be redirected back to the CILogon Service web page.

  6. Find and copy your "Certificate Subject" string

    Near the top of the CILogon Service web page you will see a field called "Certificate Subject" with a string like /DC=org/DC=cilogon/C=US/O=My Institution/CN=My Name A1234. Copy this string. You'll need it in step 9.

  7. Log off from the CILogon Service web page
  8. Log in to https://dirs.psc.edu/cgi-bin/teragrid/userpage/list.pl with your PSC username and password

    This site lists the certificate subjects (DNs) that we have in our PSC database for your PSC account.

  9. Add your CILogon Certificate Subject (DN) to this list
    1. Click on the 'Add DN' link at the top left

      This will take you to the "Adding DN" page.

    2. Paste the certificate subject that you copied in step 6 into the DN: field

      Make sure there are no extra spaces before or after the pasted string.

    3. Click on 'Create' to add your new CILogon DN (certificate subject) to the PSC database

      You can click on the 'List DNs' link at the top left to confirm that your new DN was added.

Within an hour you should be able to use Globus Online to copy files to and from endpoint psc#dxc using your userid and password at your institution. If you have any questions about using InCommon, contact PSC as described under Technical questions at the end of this guide.
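
Stray spaces around the pasted certificate subject are easy to miss. As a rough sketch, you could trim the string in a local shell before pasting it into the DN: field. The function name trim_dn is hypothetical, the sample value is the example string from step 6, and the prefix check is based only on that example rather than on any stated CILogon rule.

```shell
# Strip leading and trailing whitespace from a pasted certificate subject.
trim_dn() {
    printf '%s' "$1" | sed -e 's/^[[:space:]]*//' -e 's/[[:space:]]*$//'
}

# Example value from this guide, with stray spaces added on both sides.
# Substitute your own Certificate Subject.
pasted='  /DC=org/DC=cilogon/C=US/O=My Institution/CN=My Name A1234  '
dn=$(trim_dn "$pasted")

# The guide's example subject begins with /DC=org/DC=cilogon, so flag
# anything that does not (an assumption drawn from that example only).
case "$dn" in
    /DC=org/DC=cilogon/*) echo "DN looks well formed: $dn" ;;
    *)                    echo "Unexpected DN format: $dn" ;;
esac
```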

Moving files with Globus Online

To start a file transfer, log in to the Globus site.  

Globus transfers files between known endpoints. The endpoint for the DXC is psc#dxc.  Use this endpoint whether you use Globus or InCommon credentials to authenticate to Globus.

Choose 'Transfer Files' and you will be taken to a graphical interface where you will select endpoints and identify the files to be copied.

If you do not enter a path for the psc#dxc endpoint, your destination will be your DXC home directory.  Enter a path if you want a different destination on the DXC.

If you are unable to use either Globus or InCommon credentials to authenticate to Globus, contact PSC as described under Technical questions at the end of this guide to see if you can use other methods of authentication.

Sftp and scp

You can transfer files between your local systems and the DXC using the SSH file transfer clients sftp and scp. Both graphical and command line versions of these clients are available.

Using a graphical sftp or scp app

If you have a graphical sftp or scp client application on your local system, you can use it to transfer files to the DXC. Use data.dxc.psc.edu for the endpoint and your PSC userid and password for authentication.

Using sftp from the command line

You can use the command line sftp client to transfer files to and from the  DXC interactively.

  1. First authenticate to data.dxc.psc.edu using a command like:
     $ sftp joeuser@data.dxc.psc.edu

    where joeuser is your PSC userid.

    The first time you connect to data.dxc.psc.edu using sftp, you may be prompted to accept the server's host key. Enter yes to accept the host key:


    The authenticity of host 'data.psc.xsede.org (128.182.70.103)' can't be established.
    RSA key fingerprint is d5:77:f2:d9:07:f6:32:b6:c3:eb:0d:d1:29:ed:9b:80.
    Are you sure you want to continue connecting (yes/no)? yes
    Warning: Permanently added 'data.dxc.psc.edu' (RSA) to the list of known hosts. 

    You will then be prompted to enter your PSC password.

  2. Transfer files using the put command to copy a file from your local system to the DXC, or get to copy a file from the DXC to your local system.

    Examples:

    • Copy local file file1.dat to the newdata directory under your DXC home directory:
      sftp> put file1.dat newdata/file1.dat
      Uploading file1.dat to /home/joeuser/newdata/file1.dat
      file1.dat          100% 1016KB   1.0MB/s   00:00
    • Copy file1 from your DXC home directory to directory /Users/JoeUser/Documents/ on your local system:
      sftp> get /home/joeuser/file1 /Users/JoeUser/Documents/file1
      Fetching /home/joeuser/file1 to /Users/JoeUser/Documents/file1
      /home/joeuser/file1          100%   31     0.0KB/s   00:00
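
The host key prompt in step 1 shows a fingerprint before asking you to answer yes. If PSC gives you the expected fingerprint through a channel you trust, you can compare the two strings before accepting. The sketch below is one way to do that for the colon-separated hexadecimal form shown in this guide. The function name same_fingerprint is hypothetical, and the values are placeholders copied from the sample output above, not a verified DXC host key.

```shell
# Compare two colon-separated hexadecimal fingerprints, ignoring case and
# any whitespace picked up while copying. An empty value never matches.
same_fingerprint() {
    fp_a=$(printf '%s' "$1" | tr -d '[:space:]' | tr '[:upper:]' '[:lower:]')
    fp_b=$(printf '%s' "$2" | tr -d '[:space:]' | tr '[:upper:]' '[:lower:]')
    [ -n "$fp_a" ] && [ "$fp_a" = "$fp_b" ]
}

# Placeholder values: "trusted" should come from PSC, "shown" is what the
# prompt displayed on your terminal.
trusted='d5:77:f2:d9:07:f6:32:b6:c3:eb:0d:d1:29:ed:9b:80'
shown='D5:77:F2:D9:07:F6:32:B6:C3:EB:0D:D1:29:ED:9B:80 '

if same_fingerprint "$trusted" "$shown"; then
    echo "Fingerprints match; answer yes"
else
    echo "Fingerprints differ; answer no and contact PSC"
fi
```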

Other sftp commands

At the sftp> prompt, you can use other sftp commands to manage your files and transfer them to and from the DXC. Enter a question mark for a list of available sftp commands.

Examples (where joeuser is the user's PSC userid, and entered commands follow the sftp> prompt):

  • What directory am I in on the DXC?
    sftp> pwd
    Remote working directory: /home/joeuser
  • What directory am I in on my local system?
    sftp> lpwd
    Local working directory: /Users/JoeUser/Documents
  • Change directories on my local system to /Users/JoeUser/Documents/project1:
    sftp> lcd /Users/JoeUser/Documents/project1
  • Make a new directory called "newdata" under my current directory on the DXC :
    sftp> mkdir newdata
  • Exit from this sftp session :
    sftp> exit
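
The same commands can be collected in a batch file and passed to sftp with its -b option instead of being typed at the prompt. The sketch below only generates such a file's contents and does not contact the DXC. The function name make_sftp_batch and the file names are hypothetical, and file names containing spaces are not handled. Note that sftp -b is intended for non-interactive authentication such as SSH keys; this guide describes only password logins, so whether batch mode is usable on the DXC is an assumption to confirm with PSC.

```shell
# Print an sftp batch script that creates a remote directory and uploads
# each named local file into it.
# $1 = remote directory; remaining arguments = local files.
make_sftp_batch() {
    remote_dir=$1
    shift
    # The leading "-" tells sftp to continue if mkdir fails, for example
    # because the directory already exists.
    echo "-mkdir $remote_dir"
    for f in "$@"; do
        echo "put $f $remote_dir/$(basename "$f")"
    done
    echo "exit"
}

# Show the batch commands for two local files.
make_sftp_batch newdata file1.dat file2.dat

# To run a saved batch file (not executed here):
#   sftp -b upload.batch joeuser@data.dxc.psc.edu
```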

Using scp from the command line

For scripted transfers, or transfers that you want to execute directly from your command-line shell, you can use the SSH scp client.

Examples (where joeuser is the user's PSC userid, and entered commands follow the $ prompt):

    • Copy my local file /Users/JoeUser/Documents/project1/file1.dat to my home directory on the DXC:
      $ scp /Users/JoeUser/Documents/project1/file1.dat joeuser@data.dxc.psc.edu:.
      joeuser@data.dxc.psc.edu's password: 
      file1.dat          100% 1016KB   1.0MB/s   00:00  
      
      The first time that you use scp to transfer files to the DXC, you may receive a warning similar to:
      The authenticity of host '(128.182.nn.nnn)' can't be established.
      RSA key fingerprint is 05:9d:1b:98:f9:92:71:60:e7:66:bd:35:d8:89:58:d2.
      Are you sure you want to continue connecting (yes/no)? yes
      Warning: Permanently added 'data.dxc.psc.edu' (RSA) to the list of known hosts.
      

      You will then be prompted for your PSC password.

    • Copy all the files in my newdata directory on the DXC to directory /Users/JoeUser/Documents/project1/newdata on my local system.  If directory /Users/JoeUser/Documents/project1/newdata does not exist, it will be created.
      $ scp -r joeuser@data.dxc.psc.edu:newdata /Users/JoeUser/Documents/project1
      joeuser@data.dxc.psc.edu's password: 
      file2.dat          100% 1016KB   1.0MB/s   00:00
      file3.dat          100% 1016KB   1.0MB/s   00:01
      file1.dat          100% 1016KB   1.0MB/s   00:00
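
For scripted transfers, it can help to build the scp command lines in one place. The following is a minimal dry-run sketch: it prints the commands rather than running them, so nothing is transferred. The function names and the DXC_HOST variable are hypothetical; the host name data.dxc.psc.edu and the userid joeuser come from the examples in this guide.

```shell
DXC_HOST=data.dxc.psc.edu

# Print the scp command that uploads a local file to the DXC home directory.
# $1 = PSC userid, $2 = local file
scp_upload_cmd() {
    echo "scp $2 $1@$DXC_HOST:."
}

# Print the scp command that recursively downloads a DXC directory.
# $1 = PSC userid, $2 = remote directory, $3 = local parent directory
scp_download_dir_cmd() {
    echo "scp -r $1@$DXC_HOST:$2 $3"
}

# Print the two commands used in the examples above.
scp_upload_cmd joeuser /Users/JoeUser/Documents/project1/file1.dat
scp_download_dir_cmd joeuser newdata /Users/JoeUser/Documents/project1
```

To run a printed command, paste it at your shell prompt; scp will then ask for your PSC password.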
      

User Information

Passwords
Connect to PSC systems:
Policies
Technical questions:

Send mail to remarks@psc.edu or call the PSC hotline: 412-268-6350.