ALLPATHS-LG is a whole-genome shotgun assembler that can generate high-quality genome assemblies using short reads (~100bp) such as those produced by the new generation of sequencers. The significant difference between ALLPATHS and traditional assemblers such as Arachne is that ALLPATHS assemblies are not necessarily linear, but instead are presented in the form of a graph. This graph representation retains ambiguities, such as those arising from polymorphism, uncorrected read errors, and unresolved repeats, thereby providing information that has been absent from previous genome assemblies.

ALLPATHS-LG was developed at the Broad Institute.

ALLPATHS-LG is installed on Blacklight.


Running large assemblies on Blacklight efficiently requires special steps that may vary depending on your particular assembly problem. If you expect your assembly to take more than a few hours, please contact PSC User Services beforehand to discuss your assembly with PSC staff.

Sequence data conversion: Before ALLPATHS-LG can use your sequencing data, you must convert the data and add metadata to it. Please see the document on how to convert data for details.

Limits on stdout and stderr files: Blacklight limits stdout and stderr files to 20 Mbytes each. If either limit is exceeded, the job is killed. ALLPATHS-LG writes large amounts of data to stdout and stderr, often more than 20 Mbytes. To prevent the system from killing your job, redirect stdout and stderr to a file in $SCRATCH on the ALLPATHS-LG command line.
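The redirection described above might look like the following minimal sketch. The log file name and the run_assembly function are hypothetical stand-ins for your real ALLPATHS-LG command line, and the fallback to a temporary directory is an assumption that only exists so the sketch can be tried on a machine where $SCRATCH is not defined.

```shell
#!/bin/bash
# Use $SCRATCH when it is defined; the fallback to a temporary directory is
# an assumption so the sketch can be tried off Blacklight.
LOGDIR="${SCRATCH:-${TMPDIR:-/tmp}}"
LOGFILE="$LOGDIR/allpaths_run.log"   # hypothetical log file name

# Hypothetical stand-in for the real ALLPATHS-LG command line.
run_assembly() {
    echo "progress message on stdout"
    echo "diagnostic message on stderr" >&2
}

# "> file" redirects stdout; "2>&1" then sends stderr to the same file,
# so neither stream counts against the 20-Mbyte job output limit.
run_assembly > "$LOGFILE" 2>&1

# Inspect the size of the log afterwards if desired.
ls -l "$LOGFILE"
```

The order matters: "2>&1" must come after "> file", otherwise stderr still goes to the job's own stderr file.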

When you are ready to run your assembly, follow these steps:

  1. Prepare a batch job containing commands to
    1. Set up the module command. Blacklight system modules define environment variables which make your life easier when using different software packages. See documentation for the module command for more information.
    2. Load the system module for ALLPATHS. First, determine which releases are available by typing
      module avail allpaths
      Then load the appropriate one for your work, for example:
      module load allpaths-lg/41370
    3. Change the allowable stack size. ALLPATHS-LG requires this.
      ulimit -s 100000
    4. Prepare the input data using PrepareAllPathsInputs

      The PrepareAllPathsInputs perl script can be used to convert input data to ALLPATHS-LG format files. See the ALLPATHS-LG Manual for details on PrepareAllPathsInputs. See the example job script for an example of using PrepareAllPathsInputs.

    5. Call the RunAllPathsLG module to control the assembly pipeline.

      ALLPATHS-LG consists of a series of modules. The module pipeline can be controlled through the RunAllPathsLG module. See the ALLPATHS-LG Manual for details on RunAllPathsLG. See the example job script for an example of using RunAllPathsLG.

  2. Submit the job with the qsub command.
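The steps above can be sketched as a short shell session that writes a batch job file and submits it. This is an illustration under stated assumptions, not the PSC example job script: the file name allpaths.job, the PBS resource requests, the path of the modules init script, the directory layout under $SCRATCH, and the argument names passed to PrepareAllPathsInputs.pl and RunAllPathsLG are all assumptions that should be checked against the module documentation and the ALLPATHS-LG Manual.

```shell
#!/bin/bash
# Write a sketch of the batch job to a file (allpaths.job is a hypothetical name).
# The quoted 'EOF' keeps $SCRATCH and friends unexpanded until the job runs.
cat > allpaths.job <<'EOF'
#!/bin/bash
# Placeholder resource requests (assumptions; adjust for your assembly).
#PBS -l ncpus=16
#PBS -l walltime=2:00:00
#PBS -j oe

set -e

# 1. Set up the module command (init script path is an assumption).
source /usr/share/modules/init/bash

# 2. Load the ALLPATHS-LG module (release taken from the example above).
module load allpaths-lg/41370

# 3. Change the allowable stack size, as ALLPATHS-LG requires.
ulimit -s 100000

# Hypothetical layout: <PRE>/<REFERENCE_NAME>/<DATA_SUBDIR> under $SCRATCH.
PRE=$SCRATCH/assemblies
REF=my_genome
DATA=data
mkdir -p "$PRE/$REF/$DATA"

# Assumed to be the directory holding in_groups.csv and in_libs.csv.
cd "$PBS_O_WORKDIR"

# 4. Prepare the input data; output goes to $SCRATCH, not the job's stdout.
PrepareAllPathsInputs.pl DATA_DIR="$PRE/$REF/$DATA" PLOIDY=1 \
    IN_GROUPS_CSV=in_groups.csv IN_LIBS_CSV=in_libs.csv \
    > "$SCRATCH/prepare.log" 2>&1

# 5. Control the assembly pipeline through RunAllPathsLG.
RunAllPathsLG PRE="$PRE" REFERENCE_NAME="$REF" DATA_SUBDIR="$DATA" \
    RUN=run1 TARGETS=standard OVERWRITE=True \
    > "$SCRATCH/assembly.log" 2>&1
EOF

# Submit only where qsub exists, so the sketch is harmless elsewhere.
if command -v qsub >/dev/null 2>&1; then
    qsub allpaths.job
else
    echo "qsub not found; wrote allpaths.job without submitting"
fi
```

Both ALLPATHS-LG commands redirect stdout and stderr to files in $SCRATCH, which keeps the job's own output files under the 20-Mbyte limit.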
Last Updated on Thursday, 14 February 2013 16:19  

For technical questions, call the PSC hotline: 412-268-6350 / 800-221-1641.
