MATLAB
MATLAB is a high-level language and interactive environment that enables you to perform computationally intensive tasks faster than with traditional programming languages.
At PSC, the MATLAB Distributed Computing Server (MDCS) is installed on Blacklight. With MDCS, you develop your program or model on your own local multicore computer using MATLAB's Parallel Computing Toolbox™ and then scale up to many cores by running it with MDCS on Blacklight.
There are three things you must do before you can use MDCS on Blacklight:
- You must install the MATLAB Parallel Computing Toolbox (PCT) on your local machine.
- You must request access to MATLAB by submitting this form. PSC has 32 licenses for MDCS workers, but special permission is required to use them.
- You must set your PSC password. Even if you access PSC resources through the XSEDE User Portal, MDCS requires your PSC password to be set.
Instructions for running MDCS on PSC’s Blacklight
>> Download these instructions as a PDF.
Note: You must have the MATLAB Parallel Computing Toolbox (PCT) installed on your local machine before you can use MDCS on Blacklight.
Initial setup
After you have successfully installed PCT on your local machine, take these steps to define the correct environment to run on Blacklight.
- Download and unzip MDCS_Utils_PSC_Blacklight. It is available here.
- Copy all the files from MDCS_Utils_PSC_Blacklight into a directory in the MATLAB path.
Caution: Simply putting the MDCS_Utils_PSC_Blacklight directory into the MATLAB path may not be sufficient. Be sure that all the files from MDCS_Utils_PSC_Blacklight are also in the MATLAB path.
A good place to put these files is the MATLAB default directory on your machine. This is usually the directory that MATLAB is in after launching. It may be something like C:\Users\johndoe\Documents\MATLAB. If you are not sure what the MATLAB default directory on your machine is, you can determine it with these steps:
- Launch MATLAB on your local machine.
- In the MATLAB window, type
pwd
- Create a directory under the MATLAB default directory to be your working directory. In this example, the working directory is named MDCSDataLocation_R2011a. You can choose any name you like; be consistent in substituting your chosen name for MDCSDataLocation_R2011a throughout this example.
- Create a similar directory on Blacklight, preferably in $SCRATCH.
- Log in to Blacklight.
- Create the directory with a command like:
mkdir -p $SCRATCH/MDCSDataLocation_R2011a
- Launch MATLAB on your local machine, if you have not done so already.
- Import the PSC Blacklight configuration file.
- On the MATLAB Desktop menu bar, click Parallel > Manage Configurations.
- In the Configurations Manager window, click File > Import.
- In the Import Configuration dialog box, browse to find the MAT-file Blacklight_PSC_R2011a.mat, then click Open.
- Make the Blacklight_PSC_R2011a configuration the default by selecting the button to its left in the Configurations Manager.
- Edit properties in the Blacklight_PSC_R2011a configuration so that they are correct for Blacklight. Usually this means simply substituting your Blacklight username for 'johndoe'.
- Right click on the Blacklight_PSC_R2011a configuration and select Properties.
- Edit the following properties in the Generic Scheduler Configuration Properties window that opens:
- Folder where job data is stored: Set this to the full path to the MDCSDataLocation_R2011a directory on your local machine (e.g. C:\Users\johndoe\Documents\MATLAB\MDCSDataLocation_R2011a).
- Function called when submitting parallel jobs: Edit this to contain the full path to the MDCSDataLocation_R2011a directory that you created on Blacklight in step 4. This setting should look like
{@parallelSubmitFcn, 'blacklight.psc.teragrid.org', '/brashear/johndoe/MDCSDataLocation_R2011a'}
Note that $SCRATCH on Blacklight is defined as /brashear/ followed by your username, for example '/brashear/johndoe/'.
- Click OK and close the Configurations Manager.
Note: There is a Configuration Validation function at the bottom of the Configurations Manager. Do not run it: it queues a job that will have a long wait time and will likely fail due to a timeout. Instead, follow the steps in the next section, Running jobs, to verify that your setup is working.
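Before moving on, it may help to confirm that MATLAB can actually see the utility files, since the caution above notes that adding only the directory may not be enough. The following is a minimal sketch, assuming that the functions named in this document (parallelSubmitFcn and ClusterInfo) are provided by files of the same names in MDCS_Utils_PSC_Blacklight; the file names are an assumption, not something stated in the package documentation.

```matlab
% Sketch: check that the PSC utility files are visible on the MATLAB path.
% The two names below are taken from commands used in this document; that
% they correspond to files in MDCS_Utils_PSC_Blacklight is an assumption.
names = {'parallelSubmitFcn', 'ClusterInfo'};
for k = 1:numel(names)
    location = which(names{k});   % empty string if nothing is found
    if isempty(location)
        fprintf('%s was NOT found on the MATLAB path\n', names{k});
    else
        fprintf('%s found at %s\n', names{k}, location);
    end
end
```

If either name is reported as not found, copy the files themselves into a directory on the path as described in the setup steps.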
Running jobs
Given below are instructions for running MATLAB on your local machine and on Blacklight. This example uses the function lotsOfPauses(x,y), which is defined in the file MDCS_Utils_PSC_Blacklight\MDCS_Examples\lotsOfPauses.m. The function pauses y times, for x seconds each time, and returns the total time it took to run. In the example, we use 16 pauses of 5 seconds each.
The serial version, which uses a for loop, should take approximately 5*16=80 secs to finish. In the parallel version, for has been replaced with parfor.
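For readers who do not have the file open, the following is a hypothetical stand-in for the parallel version, written only from the description above. The name lotsOfPausesSketch and the use of tic/toc are assumptions; the actual lotsOfPauses.m shipped with the PSC utilities may differ.

```matlab
function t = lotsOfPausesSketch(x, y)
% Hypothetical stand-in for lotsOfPauses.m (not the PSC file itself).
% Pauses y times for x seconds each and returns the elapsed time.
tStart = tic;
parfor i = 1:y
    pause(x);          % each iteration is independent, so parfor applies
end
t = toc(tStart);       % total wall-clock time in seconds
end
```

When no pool of workers is open, parfor runs the iterations on the client, so the sketch behaves like the serial version.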
On your local machine
To run lotsOfPauses(x,y) in parallel with PCT on your local machine with 2 workers, type
matlabpool local 2
lotsOfPauses(5,16)
If your local machine is at least a dual core and nothing else is taxing the CPU cores, this should take slightly more than 40 secs.
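The timing estimates in this example all follow from the same arithmetic: the pauses are split across the workers, so the run takes roughly the number of rounds times the pause length, plus overhead. A small sketch of that calculation, with variable names chosen here for illustration, covering the serial case, the 2-worker local case, and the 8-worker pool used in the Blacklight example:

```matlab
% Sketch: ideal run time for 16 pauses of 5 seconds, ignoring overhead.
pauseSeconds = 5;
numPauses    = 16;
workerCounts = [1 2 8];   % serial, local pool, Blacklight example

% Each worker handles ceil(numPauses / workers) pauses one after another.
expectedSeconds = ceil(numPauses ./ workerCounts) * pauseSeconds;

for k = 1:numel(workerCounts)
    fprintf('%d worker(s): about %d seconds\n', ...
        workerCounts(k), expectedSeconds(k));
end
% Prints 80, 40 and 10 seconds respectively.
```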
On Blacklight
Follow these steps to run lotsOfPauses on Blacklight.
- Check your current Blacklight submission options. Type
ClusterInfo.state
- Clear the current option settings. Type
ClusterInfo.clear
- Specify the queue needed for the job. Use the debug queue for this sample job. Type
ClusterInfo.setQueueName('debug')
Note: The debug queue is for small test jobs. Bigger jobs and production jobs must use the batch queue. See the Blacklight document for more details.
- Specify the amount of time the job needs to run. This sample job needs two minutes. Type
ClusterInfo.setWallTime('00:02:00')
- Specify the number of MDCS licenses that the job needs. There is a limited number of MDCS licenses on Blacklight, shared among all users, so each job must specify how many licenses it requires. The number of licenses required for an MDCS job is the number of parallel workers in matlabpool plus one for the master worker. For example, this sample job requires 8+1=9 MDCS licenses. To specify the number of licenses needed, type
ClusterInfo.setUserDefinedOptions('-l licenses=MATLAB_Distrib_Comp_Engine:9')
The job will wait in the queue until both the requisite number of cores and MDCS licenses are available.
- Optionally, you can move to a unique subdirectory. To do so, type
cd MDCS_Examples
- Submit the job. Type
j=batch(@lotsOfPauses,1,{5,16},'matlabpool',8)
A User Credentials window will open. Enter your Blacklight login and password. Use your PSC login and password, not XSEDE's.
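The submission steps above can be collected into one short script. This is only a sketch: it assumes the MDCS_Utils_PSC_Blacklight files (which provide ClusterInfo) are on the MATLAB path and that the Blacklight configuration is the default. The variables numWorkers and numLicenses are introduced here for illustration, so that the license count stays in step with the pool size.

```matlab
% Sketch: the submission commands from the steps above, in one place.
% Requires the PSC utility files and a working Blacklight configuration.
numWorkers  = 8;                 % size of the matlabpool requested below
numLicenses = numWorkers + 1;    % workers plus one for the master worker

ClusterInfo.clear                        % reset earlier option settings
ClusterInfo.setQueueName('debug')        % small test jobs only
ClusterInfo.setWallTime('00:02:00')      % two minutes
ClusterInfo.setUserDefinedOptions( ...
    sprintf('-l licenses=MATLAB_Distrib_Comp_Engine:%d', numLicenses))

% Submit lotsOfPauses(5,16) with one output argument.
j = batch(@lotsOfPauses, 1, {5, 16}, 'matlabpool', numWorkers);
```

Deriving the license count from numWorkers avoids the two numbers drifting apart when the pool size is changed.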
Checking the status of your job
When the job has been submitted, you can check its status, either in real time or at your convenience.
In real time
Check the status of your job in real time with
j.state
You can see details of the job with simply
j
When your job completes, j.state is set to finished. To see the output, type
j.getAllOutputArguments()
For lotsOfPauses, the only return value is the total time it took to run. With 8 workers, each worker handles two of the 16 five-second pauses, so this should be a little more than 10 seconds.
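If you would rather not retype j.state by hand, a simple polling loop can watch the job until it ends. This is a sketch under assumptions: j is the job handle returned by batch in the current session, the 15-second interval is arbitrary, and 'failed' is assumed to be the state reported for a job that did not complete.

```matlab
% Sketch: poll the job state until the job ends, then fetch the output.
% Assumes j is the handle returned by batch earlier in this session.
pollSeconds = 15;   % arbitrary interval chosen for illustration
while ~any(strcmp(j.state, {'finished', 'failed'}))
    fprintf('Job state: %s\n', j.state);
    pause(pollSeconds);
end

if strcmp(j.state, 'finished')
    results = j.getAllOutputArguments();   % cell array of outputs
    disp(results);
else
    disp('The job did not finish; check the task errors.');
end
```

Because jobs may wait in the queue for some time, the loop can run for a long while; it can be interrupted with Ctrl+C without affecting the submitted job.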
At your convenience
You can shut down your computer once the job is submitted. Later when you want to retrieve the job, run the following:
sched=findResource()
sched.Jobs                % Lists all jobs in the database
j=sched.Jobs(2)           % Get a handle to the 2nd job in the database
j.getAllOutputArguments()
Capturing output to the command window from a parallel job
If you are executing a MATLAB script in parallel that prints to the command window, you will need to turn on capturing of the diary, and afterwards retrieve the diary. For example:
- Submit the job:
j=batch('matrixOps_spmd.m','Matlabpool',16,'CaptureDiary',true)
- Retrieve the diary after the job finishes:
j.diary
Checking for and interpreting errors
Errors are reported in the field TaskID of errors within the job information obtained by typing the job handle (in our example simply j). If the field TaskID of errors is empty, then there are no errors; otherwise, there are errors. To see the specific error message, type
j.Tasks(#).Error
where # should be replaced by the ID of a task or worker that has an error (in our example between 1 and 9; the ID of the master worker is 1 and the IDs of the 8 slave workers are 2-9).
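Rather than checking tasks one at a time, you can loop over all of them. The sketch below assumes j is a job handle as in the example, and that each task exposes ErrorMessage and ID properties in your MATLAB release; if ErrorMessage is not available, inspect the Error field shown above instead.

```matlab
% Sketch: print the error message, if any, for every task of job j.
% Assumes task objects expose ErrorMessage and ID in this release.
tasks = j.Tasks;
for k = 1:numel(tasks)
    msg = tasks(k).ErrorMessage;    % assumed empty when no error occurred
    if ~isempty(msg)
        fprintf('Task %d: %s\n', tasks(k).ID, msg);
    end
end
```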
Checking whether the license server is down
If your job is reported to be finished, but the StartTime and Running Duration fields under job information are empty, then the job probably did not actually start because the MATLAB license server is down. Send an email to PSC requesting that the status of the MATLAB license server be checked.