Pittsburgh Supercomputing Center 

Advancing the state-of-the-art in high-performance computing, communications and informatics.

Instructions for running MDCS R2011 on Blacklight

There are three things you must do before you can use MDCS on Blacklight:

  • You must install the MATLAB Parallel Computing Toolbox (PCT) on your local machine
  • You must request access to MDCS by submitting this form. PSC has 32 licenses for MDCS workers, but special permission is required to use them.
  • You must set your PSC password. Even if you access PSC resources through the XSEDE User Portal, MDCS requires your PSC password to be set.

After you have successfully installed PCT on your local machine, you need to define the correct environment to run on Blacklight.  The exact steps to do this depend on the version of MATLAB you have. 

These instructions are for versions R2011a and R2011b. 

If you have MATLAB R2012a, see these instructions for R2012a.

For MATLAB versions R2011a and R2011b

Initial setup

  1. Download MDCS_2011_Utils_PSC_Blacklight.zip. Note that it is a zip file.
  2. Extract all the files from MDCS_2011_Utils_PSC_Blacklight.zip.
  3. Copy all the files from MDCS_2011_Utils_PSC_Blacklight into a directory in the MATLAB path.

    Caution: Simply putting the MDCS_2011_Utils_PSC_Blacklight directory into the MATLAB path may not be sufficient. Be sure that all the files from MDCS_2011_Utils_PSC_Blacklight are also in the MATLAB path.

    A good place to put these files is the MATLAB default directory on your machine. This directory is usually the one that MATLAB is in after launching. It may be something like C:\Users\johndoe\Documents\MATLAB. If you are not sure what the MATLAB default directory on your machine is, you can determine it with these steps:

    1. Launch MATLAB on your local machine.
    2. In the MATLAB window, type pwd.
  4. Create a directory under the MATLAB default directory to serve as your working directory. In this example, the working directory is named MDCSDataLocation_R2011a. You can choose any name you like, but substitute it consistently for MDCSDataLocation_R2011a throughout this example.
  5. Create a similar directory on Blacklight, preferably in $SCRATCH.
    1. Login to Blacklight.
    2. Create the directory with a command like:
      mkdir -p $SCRATCH/MDCSDataLocation_R2011a
  6. Launch MATLAB on your local machine, if you have not done so already.
  7. Import the PSC Blacklight configuration file.
    1. On the MATLAB Desktop menu bar, click Parallel > Manage Configurations.
    2. In the Configurations Manager window, click File > Import.
    3. In the Import Configuration dialog box, browse to find the MAT-file Blacklight_PSC_R2011a.mat, then click Open.
  8. Make the Blacklight_PSC_R2011a configuration the default by selecting the button to its left in the Configurations Manager.
  9. Edit properties in the Blacklight_PSC_R2011a configuration so that they are correct for Blacklight.  Usually this means simply substituting your Blacklight username for 'johndoe'.
    1. Right click on the Blacklight_PSC_R2011a configuration and select Properties.
    2. Edit the following properties in the Generic Scheduler Configuration Properties window that opens:
      1. Folder where job data is stored : Set this to be the full path to the MDCSDataLocation_R2011a directory on your local machine (e.g. C:\Users\johndoe\Documents\MATLAB\MDCSDataLocation_R2011a)
      2. Function called when submitting parallel jobs: Edit this to contain the full path to the MDCSDataLocation_R2011a directory that you created on Blacklight in step 5.  This setting should look like
        {@parallelSubmitFcn, 'blacklight.psc.teragrid.org', '/brashear/johndoe/MDCSDataLocation_R2011a'}

        Note that $SCRATCH on Blacklight is defined per user as '/brashear/' followed by the username, so for user johndoe it is '/brashear/johndoe/'.

    3. Click OK and close the Configuration Manager.

    Note: There is a Configuration Validation function at the bottom of the Configuration Manager. Do not bother to run it, as it will queue a job that will have a long wait time, and will likely fail due to a timeout. Instead, follow the steps in the next section, Running Jobs, to verify that your setup is working.
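
The caution in step 3 about getting every utility file onto the MATLAB path can be checked from the MATLAB prompt. The following is a minimal sketch, not part of the PSC utilities. It assumes that parallelSubmitFcn and ClusterInfo, the two names used in these instructions, are among the files extracted from the zip file. The variable names requiredNames and missing are illustrative only.

```matlab
% Sketch: confirm that names used in these instructions resolve on the path.
% Assumption: parallelSubmitFcn and ClusterInfo are files from the PSC zip.
requiredNames = {'parallelSubmitFcn', 'ClusterInfo'};
missing = {};
for k = 1:numel(requiredNames)
    % which() returns an empty string when the name is not found on the path
    if isempty(which(requiredNames{k}))
        missing{end+1} = requiredNames{k}; %#ok<AGROW>
    end
end
if isempty(missing)
    disp('All listed utility files were found on the MATLAB path.');
else
    fprintf('Not found on the MATLAB path: %s\n', missing{:});
end
```

If any name is reported as not found, copy the extracted files into the MATLAB default directory as described in step 3 and run the check again.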

Running jobs

The instructions below show how to run MATLAB jobs on your local machine and on Blacklight.  The example uses the function lotsOfPauses(x,y), which is defined in the file MDCS_Utils_PSC_Blacklight\MDCS_Examples\lotsOfPauses.m. The function pauses y times, for x seconds each time.  In the example, we use 16 pauses of 5 seconds each.

The serial version, which uses a for loop, should take approximately 5*16=80 secs to finish. In the parallel version, for has been replaced with parfor.
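
The contents of lotsOfPauses.m are not reproduced on this page. As a rough illustration of the behavior described above, a function of this kind might look like the sketch below. The name lotsOfPausesSketch is hypothetical and the body is an assumption based only on the description (y pauses of x seconds each, parfor in place of for, total elapsed time returned).

```matlab
function t = lotsOfPausesSketch(x, y)
% Hypothetical stand-in for lotsOfPauses: pause y times, x seconds each,
% and return the total elapsed time in seconds.
tStart = tic;
parfor i = 1:y
    % Iterations are independent, so workers in an open matlabpool can
    % share them. Without a pool, parfor runs like an ordinary for loop.
    pause(x);
end
t = toc(tStart);
end
```

Saved as lotsOfPausesSketch.m, it could be called as lotsOfPausesSketch(5,16), in the same way as the real function.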

On your local machine

To run lotsOfPauses(x,y) in parallel with PCT on your local machine with 2 workers, type

matlabpool local 2 
lotsOfPauses(5,16)

If your local machine is at least a dual core and nothing else is taxing the CPU cores, this should take slightly more than 40 secs.

On Blacklight

Follow these steps to run lotsOfPauses on Blacklight.

  1. Check your current Blacklight submission options. Type
    ClusterInfo.state
  2. Clear the current option settings. Type
    ClusterInfo.clear
  3. Specify the queue needed for the job. Use the debug queue for this sample job. Type
    ClusterInfo.setQueueName('debug')

    Note: The debug queue is for small test jobs. Bigger jobs and production jobs must use the batch queue. See the Blacklight document for more details.

  4. Specify the amount of time the job needs to run. This sample job needs two minutes. Type
    ClusterInfo.setWallTime('00:02:00')
  5. Specify the number of MDCS licenses that the job needs.

    There are a limited number of MDCS licenses on Blacklight that are shared amongst all users. Each job needs to specify how many licenses it requires. The number of licenses required for an MDCS job is the number of parallel workers in matlabpool plus one for the master worker. For example, this sample job requires 8+1=9 MDCS licenses. To specify the number of licenses needed, type

    ClusterInfo.setUserDefinedOptions('-l licenses=MATLAB_Distrib_Comp_Engine:9')

    The job will wait in queue until both the requisite number of cores and MDCS licenses are available.

  6. Optionally, you can move to a unique subdirectory.  To do so, type
    cd MDCS_Examples
  7. Submit the job.  Type
    j=batch(@lotsOfPauses,1,{5,16},'matlabpool',8)
    A User Credentials window will open.  Enter your Blacklight login and password. Use your PSC login and password, not XSEDE's.
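
Steps 2 through 7 above can also be collected into a short script so that the license count always matches the pool size. This is a sketch under stated assumptions, not a PSC-supplied script. The variable names numWorkers, numLicenses and licenseOption are illustrative, and the check with which() is only a guard so that the script does nothing harmful on a machine where the PSC utilities are not installed.

```matlab
% Sketch: gather the Blacklight submission settings in one place.
numWorkers  = 8;                 % pool size passed to batch()
numLicenses = numWorkers + 1;    % workers plus one for the master worker
licenseOption = sprintf('-l licenses=MATLAB_Distrib_Comp_Engine:%d', numLicenses);

% ClusterInfo comes from the PSC utility files. Submit only when it is on
% the MATLAB path; otherwise just show the option string that would be used.
if ~isempty(which('ClusterInfo'))
    ClusterInfo.clear;
    ClusterInfo.setQueueName('debug');
    ClusterInfo.setWallTime('00:02:00');
    ClusterInfo.setUserDefinedOptions(licenseOption);
    j = batch(@lotsOfPauses, 1, {5, 16}, 'matlabpool', numWorkers);
else
    disp(licenseOption);
end
```

Changing numWorkers in one place then updates both the pool size and the number of licenses requested.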

Checking the status of your job

When the job has been submitted, you can check its status, either in real time or at your convenience.

In real time

Check the status of your job in real time with

j.state

You can see details of the job with simply

j

When your job completes, j.state is set to finished. To see the output, type

j.getAllOutputArguments()

For lotsOfPauses, the only return value is the total time it took to run. With 16 pauses of 5 seconds spread over 8 workers, each worker handles 2 pauses, so the result should be a little more than 10 seconds.
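
The run times quoted on this page (about 80 seconds in serial, 40 seconds with 2 workers and 10 seconds with 8 workers) follow from dividing the pauses evenly among the workers. The sketch below reproduces that arithmetic. It assumes ideal scheduling with no startup or communication overhead, which is why measured times are slightly higher. The variable names are illustrative.

```matlab
% Sketch: ideal run time of lotsOfPauses(5,16) for several worker counts.
pauseSeconds = 5;            % x in lotsOfPauses(x,y)
numPauses    = 16;           % y in lotsOfPauses(x,y)
workerCounts = [1 2 8];      % serial, local pool, Blacklight pool

% The busiest worker performs ceil(numPauses/workers) pauses back to back.
idealSeconds = ceil(numPauses ./ workerCounts) * pauseSeconds;

for k = 1:numel(workerCounts)
    fprintf('%d worker(s): about %d seconds\n', workerCounts(k), idealSeconds(k));
end
```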

At your convenience

You can shut down your computer once the job is submitted. Later when you want to retrieve the job, run the following:

sched=findResource()
sched.Jobs         % Lists all jobs in the database
j=sched.Jobs(2)    % Get a handle to the 2nd job in the database
j.getAllOutputArguments()

Capturing output to the command window from a parallel job

If you are executing a MATLAB script in parallel that prints to the command window, you will need to turn on capturing of the diary, and afterwards retrieve the diary. For example:

  1. Submit the job:
    j=batch('matrixOps_spmd.m','Matlabpool',16,'CaptureDiary',true)
  2. Retrieve the diary after job finishes
    j.diary

Checking for and interpreting errors

Typing the job handle (simply j in our example) displays the job information, which includes the field TaskID of errors. If that field is empty, no task reported an error; otherwise, it lists the tasks that failed. To see a specific error message, type

j.Tasks(#).Error

where # should be replaced by the ID of a task or worker that has an error (in our example between 1 and 9; the ID of the master worker is 1 and the IDs of the 8 slave workers are 2-9).
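
Rather than inspecting tasks one at a time, the Error property of every task can be checked in a loop. The following is a minimal sketch. So that it runs without a cluster, it builds a stand-in struct named j with a simulated error on task 2 when no job handle exists in the workspace. That stand-in, the identifier demo:task and the variable failedTasks are illustrative assumptions. With a real job handle j from batch, the loop is expected to work the same way, assuming the Error property is empty for tasks that succeeded.

```matlab
% Stand-in job for demonstration only. Skipped when a real handle j exists.
if ~exist('j', 'var')
    simulated = MException('demo:task', 'Simulated worker failure');
    j = struct('Tasks', struct('Error', {[], simulated, []}));
end

% Collect the IDs of tasks that report an error and print each message.
failedTasks = [];
for k = 1:numel(j.Tasks)
    err = j.Tasks(k).Error;
    if ~isempty(err)
        failedTasks(end+1) = k; %#ok<AGROW>
        fprintf('Task %d failed: %s\n', k, err.message);
    end
end
if isempty(failedTasks)
    disp('No task reported an error.');
end
```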

Checking whether the license server is down

If your job is reported to be finished, but the StartTime and Running Duration fields under job information are empty, then the job probably did not start because the MATLAB license server is down. Send an email to PSC requesting that the status of the MATLAB license server be checked.
