Pittsburgh Supercomputing Center 

Advancing the state-of-the-art in high-performance computing,
communications and data analytics.

Trinity

Trinity, developed at the Broad Institute, represents a novel method for the efficient and robust de novo reconstruction of transcriptomes from RNA-Seq data. Trinity combines three independent software modules: Inchworm, Chrysalis, and Butterfly, applied sequentially to process large volumes of RNA-Seq reads. Trinity partitions the sequence data into many individual de Bruijn graphs, each representing the transcriptional complexity at a given gene or locus, and then processes each graph independently to extract full-length splicing isoforms and to tease apart transcripts derived from paralogous genes.

Installed on blacklight.

In this document, you will find instructions for running a Trinity job on blacklight.

Other resources that may be helpful include:

Running a Trinity job

Trinity jobs are submitted to blacklight's batch queues to run. Only very small test runs with a small number of cores and a short run time can be run interactively. Production runs of any size must be submitted as batch jobs.

To run a Trinity job, follow these steps.  Details for each step follow.

    1. Choose which version of Trinity to use. Multiple versions of Trinity are installed. It is important to know which version you are using, as the command line options and the default settings can change between versions. It is also a good practice to use the same version throughout your project. Help on selecting a version is given in the details for step 1 below.

      There is always a default version of Trinity defined, but it changes as new versions are added and older versions deleted. For this reason, you should never just load the default version; it may change without notice. Always load the specific version that you want.

    2. Create a job script. The script will contain commands to
      1. Combine stdout and stderr
      2. Load the appropriate Trinity module
      3. Set the stack size to unlimited
      4. Copy your input files to $SCRATCH
      5. Move to your $SCRATCH directory
      6. Run Trinity. One thing you must do in the Trinity command line is redirect the Trinity output to a file.
      7. Copy your Trinity output file from $SCRATCH

       

    3. Submit your job with the qsub command.

  1. Choose which version of Trinity to use

    Multiple versions of Trinity are installed. It is important to know which version you are using, as the command line options and the default settings can change between versions.

    The module command can tell you what versions are installed.  When you have chosen a version, you will use the "module load" command to set up the correct environment to run that version of Trinity. For more information, see documentation on the module command.

    • See what versions are available

      To see what versions of Trinity are available, type

      module available trinity

      This example shows four versions installed, from r2012-01-25 to r2012-06-08:

      tg-login1:~> module avail trinity
      -------------------------- /usr/local/opt/modulefiles --------------------------
      trinity/r2012-01-25 trinity/r2012-05-18
      trinity/r2012-03-17 trinity/r2012-06-08
      tg-login1:~>
      
    • See the options for a specific version

      You can see the available options and defaults for a given version. First load the specific Trinity module you are interested in with the module load command. Type in the full name of the module from the module available trinity command. For example, to see the options for version r2012-03-17, type

      module load trinity/r2012-03-17

      To see the Trinity command line options, now type

      Trinity.pl

    • Read the release notes for a specific version

      You can read the release notes for any version by looking at the Release.Notes file, found in the top level directory (/usr/local/packages/trinity/version) for that version. For example, to see the release notes for version r2012-06-08, look at the file /usr/local/packages/trinity/r2012-06-08/Release.Notes.
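
    One way to avoid loading the default version by accident is to check the module name before loading it. The following is a minimal sketch, not part of the PSC instructions: the helper names is_pinned and load_pinned_module are hypothetical, and the module command is only called if it exists on the system.

```shell
# Hypothetical helpers: refuse to load a module unless an explicit
# version is given in "name/version" form, so the default is never used.
is_pinned() {
    case "$1" in
        */?*) return 0 ;;   # e.g. trinity/r2012-03-17
        *)    return 1 ;;   # bare name such as "trinity"
    esac
}

load_pinned_module() {
    if ! is_pinned "$1"; then
        echo "refusing to load '$1': no version given" >&2
        return 1
    fi
    # The module command exists only where Environment Modules is set up.
    if type module >/dev/null 2>&1; then
        module load "$1"
    else
        echo "module command not found; would load $1" >&2
    fi
}

# Usage:
# load_pinned_module trinity/r2012-03-17
```

    Calling load_pinned_module with a bare name such as trinity returns an error instead of silently picking up whatever the default happens to be.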

  2. Create a job script

    Create a job script which will do all the set-up necessary and then run Trinity. See the blacklight document for extensive details on the structure of blacklight job scripts, including the necessary PBS directives.

    Your script should contain commands to

    1. Combine stdout and stderr

      In a batch job, the messages and errors that are normally displayed on the monitor while an interactive job runs are instead written to two files, stdout and stderr, respectively.

      Redirect stdout and stderr by using the PBS directive -j oe. This combines both stdout and stderr into one file, which makes debugging easier. Put this line into your batch script:

      #PBS -j oe
      

      For more information on PBS directives in batch jobs, see the blacklight document.

    2. Load the appropriate Trinity module

      Use module load to define the correct environment to run a specific version of Trinity.  Type in the full name of the module from the module available trinity command.

      module load trinity/r2012-03-17
      
    3. Set the stack size to unlimited

      You must set the stack size to unlimited in your batch script before running Trinity, or the job will fail.

      If you are using bash, type

      ulimit -s unlimited

      If you are using csh, type

      limit stacksize unlimited
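
      Because the job fails without an unlimited stack, it can help to confirm that the limit actually changed. A minimal sketch for bash or sh follows; the warning text is illustrative only.

```shell
# Raise the stack limit, then report the value that is in effect.
if ! ulimit -s unlimited 2>/dev/null; then
    echo "warning: could not set stack size to unlimited" >&2
fi
echo "stack size is now: $(ulimit -s)"
```

      The reported value ends up in the job's output file, so a later failure can be checked against the limit that was really in effect.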
    4. Copy your input files to $SCRATCH

      Your $SCRATCH directory on blacklight is intended to be used as working space for your running jobs. All of the files that your job needs should be copied to $SCRATCH.   Copy them with

      cp inputfile $SCRATCH

    5. Move to your $SCRATCH directory

      Move to your $SCRATCH before starting Trinity with

      cd $SCRATCH
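
      The copy and the move to $SCRATCH can be combined so that the job stops early if a file is missing. This is a sketch under assumptions: the function name stage_inputs is hypothetical, and the file names in the usage line are placeholders.

```shell
# Hypothetical helper: copy each input file to $SCRATCH, then cd there.
# Returns non-zero as soon as anything goes wrong.
stage_inputs() {
    if [ -z "$SCRATCH" ] || [ ! -d "$SCRATCH" ]; then
        echo "SCRATCH is not set or is not a directory" >&2
        return 1
    fi
    for f in "$@"; do
        if [ ! -r "$f" ]; then
            echo "input file not readable: $f" >&2
            return 1
        fi
        cp "$f" "$SCRATCH"/ || return 1
    done
    cd "$SCRATCH" || return 1
}

# Usage inside a job script (placeholder file names):
# stage_inputs reads_1.fq reads_2.fq || exit 1
```

      Stopping at this point avoids starting Trinity against an incomplete set of input files.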
      
    6. Run Trinity

      Some typical command lines are given below. We recommend that you look at the complete list of options available, given at http://trinityrnaseq.sourceforge.net.

      It is important that the output produced by Trinity be redirected into a file. By default, Trinity output is written to stdout.  This can cause trouble on blacklight, because stdout and stderr files are limited to 20 Mbytes each. If either file exceeds this limit, the job will be killed.

      The Trinity package writes a lot of information to stdout and stderr and often exceeds these limits. To prevent your job from being killed by the system, you should redirect Trinity output to a different file. To redirect your Trinity output, use the ">" operator. Here is a command line showing Trinity output redirected to a file called my-trinity-output.out.

      Trinity.pl command-line-options > my-trinity-output.out
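
      Note that ">" on its own captures stdout only. In bash or sh, appending "2>&1" sends stderr to the same file. The sketch below demonstrates this with noisy_tool, a hypothetical stand-in for Trinity.pl so that it can be tried anywhere; the "2>&1" part is an addition to the command line shown above, not part of it.

```shell
# noisy_tool is a hypothetical stand-in for Trinity.pl; it writes one
# line to stdout and one line to stderr.
noisy_tool() {
    echo "progress message"
    echo "diagnostic message" >&2
}

# ">" redirects stdout; "2>&1" then points stderr at the same file.
noisy_tool > my-trinity-output.out 2>&1

# Both lines are now in the file rather than in the job's stdout/stderr.
wc -l < my-trinity-output.out
```

      The order matters: "2>&1" must come after the "> file" redirection for both streams to end up in the file.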

       

      Typical Trinity command lines

      Variables in brackets should be replaced with the desired options or names of your input files. Do not include the brackets themselves in the command line.

      Strand Specific Sequencing (Preferred Library Method typical of the dUTP/UDG sequencing method)

      Please note that other methods of Strand Specific library generation may require FR orientation. See the Trinity website for a full explanation.

      Trinity.pl --seqtype fq --JM 100G --left <yourreads1.fq> --right <yourreads2.fq> --output <dirnameforoutput> --SS_lib_type RF --min_contig_length <contiglengthmincutoff> --CPU 16 --bflyCPU 16 --bflyGCThreads 16 > my-trinity-output.out

      Non-Strand Specific Library

      Trinity.pl --seqtype fq --JM 100G --left <yourreads1.fq> --right <yourreads2.fq> --output <dirnameforoutput> --min_contig_length <contiglengthmincutoff> --CPU 16 --bflyCPU 16 --bflyGCThreads 16 > my-trinity-output.out

    7. Copy your Trinity output file from $SCRATCH

      Although $SCRATCH files should be available for 21 days, it is good practice to copy your Trinity output file back to your home directory before the job ends. Copy it back to your home directory with

      cp my-trinity-output.out $HOME
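
    Putting the seven sub-steps together, a bash job script might look like the sketch below. It is written as a command that saves the script to a file named my-job-script so it can be reviewed before submission. The read file names, the output directory name, and the choice of trinity/r2012-03-17 are placeholders. Other PBS directives your job needs (for example cores and run time) are described in the blacklight document and are not shown here.

```shell
# Write an example job script to my-job-script. The quoted 'EOF' keeps
# $SCRATCH and $HOME unexpanded so they are evaluated when the job runs.
cat > my-job-script <<'EOF'
#!/bin/bash
# 1. Combine stdout and stderr
#PBS -j oe
# (add the other PBS directives required by blacklight here)

# 2. Load a specific Trinity version, never the default
module load trinity/r2012-03-17

# 3. Set the stack size to unlimited
ulimit -s unlimited

# 4. Copy input files to $SCRATCH (placeholder file names)
cp reads_1.fq reads_2.fq $SCRATCH

# 5. Move to $SCRATCH
cd $SCRATCH

# 6. Run Trinity with its output redirected to a file
Trinity.pl --seqtype fq --JM 100G --left reads_1.fq --right reads_2.fq --output trinity_out --CPU 16 --bflyCPU 16 --bflyGCThreads 16 > my-trinity-output.out

# 7. Copy the Trinity output file back to the home directory
cp my-trinity-output.out $HOME
EOF
```

    The Trinity options are taken from the non-strand-specific command line above; add --SS_lib_type and --min_contig_length as your data requires.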
  3. Submit your job with the qsub command

    qsub my-job-script

    For a detailed description of the qsub command and its options, see the blacklight document.
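
    Before submitting, it can be worth checking that the job script contains the pieces this document requires. The following is a hypothetical pre-submission check, not a PSC tool: the function name preflight and its messages are illustrative, and it assumes the Trinity command and its redirection sit on a single line.

```shell
# Hypothetical check: look for each requirement from step 2 in the
# job script and report the first one that is missing.
preflight() {
    script=$1
    if [ ! -r "$script" ]; then
        echo "cannot read job script: $script" >&2
        return 1
    fi
    grep -q '^#PBS -j oe' "$script" \
        || { echo "missing: #PBS -j oe" >&2; return 1; }
    grep -Eq '^module load trinity/.+' "$script" \
        || { echo "missing: module load with an explicit version" >&2; return 1; }
    grep -Eq 'ulimit -s unlimited|limit stacksize unlimited' "$script" \
        || { echo "missing: unlimited stack size" >&2; return 1; }
    grep -Eq 'Trinity\.pl.*>' "$script" \
        || { echo "missing: Trinity output redirection" >&2; return 1; }
    return 0
}

# Usage on blacklight: submit only if the check passes.
# preflight my-job-script && qsub my-job-script
```

    The check is purely textual, so it catches omissions but cannot confirm that the module version exists or that the input files are present.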
