FAQ for Trinity use on Blacklight

  1. How many cores should I request for Trinity? How many should I tell Trinity to use? How do I do this?
  2. What should I do when my job gets killed with an error?

1. How many cores should I request for Trinity? How many should I tell Trinity to use? How do I do this?
There is a distinction between the number of cores you request for your job via the #PBS directive and the number specified in the Trinity command line with the --CPU option.

The number of cores you request with the #PBS directive determines the amount of memory that will be available for your job. On Blacklight, each core has 8 GB of memory associated with it, so if your job requests 128 cores, you will be allocated 1 TB (1024 GB) of memory. Use the #PBS directive to request as many cores as you need to get sufficient memory for your job.

A rough rule of thumb is that you will need 1 GB of memory per million reads.
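
The arithmetic above can be sketched as a small shell helper. It assumes the 8 GB per core stated for Blacklight and the rough rule of 1 GB per million reads; the function and variable names are hypothetical, and the result is only an estimate of what to request.

```shell
#!/bin/bash

GB_PER_CORE=8            # Blacklight: 8 GB of memory per core (from the text above)
GB_PER_MILLION_READS=1   # rough rule of thumb, not a guarantee

# Memory (in GB) allocated to a job that requests the given number of cores.
memory_gb_for_cores() {
    echo $(( $1 * GB_PER_CORE ))
}

# Smallest number of cores whose memory covers the given number of reads
# (in millions), rounding up to a whole core.
cores_for_reads() {
    local mem_gb=$(( $1 * GB_PER_MILLION_READS ))
    echo $(( (mem_gb + GB_PER_CORE - 1) / GB_PER_CORE ))
}

memory_gb_for_cores 128   # prints 1024
cores_for_reads 1000      # prints 125
```

If the system only accepts core requests in fixed increments, round the result up accordingly before putting it in the #PBS directive.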

However, the number of cores that Trinity uses when it runs is determined by the --CPU option on the Trinity command line, regardless of the number requested with the #PBS directive. In the initial stages Trinity uses only 16 or maybe 32 cores productively. (Later stages like QuantifyGraph or Butterfly can use many more.)

So if your job requires a terabyte of memory, your job script should contain lines like these:

#!/bin/bash
#PBS -l ncpus=128
     .
     .
     .
Trinity.pl --CPU 16 other-command-line-options
     .
     .

2. What should I do when my job gets killed with an error like this:
  • PBS: job killed: stdout file size 96800KB exceeded limit 20480KB
    Redirect the output from Trinity to a file on the Trinity command line, like this:
    Trinity.pl command-line-options >& my-trinity-output.out
    If you don't do this, Trinity writes its output to stdout and stderr by default. This causes trouble on Blacklight, because the stdout and stderr files are limited to 20 MB (20480 KB) each. If either file exceeds this limit, the job is killed. The Trinity package writes a lot of information to stdout and stderr and often exceeds these limits.


  • PBS: job killed: pid 26111814 (java) thread count 2573 exceeded limit 2500
    When you run Butterfly, you should limit the number of Java garbage collection threads by using this command line option:
    --bflyGCThreads 4
    These threads are only for Java's internal memory management, so they should not really affect your performance. Anything between 2 and 16 threads is reasonable. If you do not include the --bflyGCThreads option, Java will launch as many threads as there are cores, and you will certainly exceed the limit.
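
The recommendations in this FAQ can be combined in one job script, roughly as sketched below. The trinity_args helper is a hypothetical name introduced for illustration and is not part of Trinity; other-command-line-options and my-trinity-output.out are placeholders, as in the examples above.

```shell
#!/bin/bash
#PBS -l ncpus=128

# Hypothetical helper (not part of Trinity): prints the Trinity options
# recommended in this FAQ. Arguments: --CPU value, --bflyGCThreads value.
trinity_args() {
    local cpu="${1:-16}"
    local gc="${2:-4}"
    # The FAQ suggests between 2 and 16 garbage collection threads.
    if [ "$gc" -lt 2 ] || [ "$gc" -gt 16 ]; then
        echo "bflyGCThreads should be between 2 and 16, got $gc" >&2
        return 1
    fi
    echo "--CPU $cpu --bflyGCThreads $gc"
}

# In a real job script the Trinity call would follow, with all output
# redirected to a file so the stdout and stderr limits are not exceeded:
#   Trinity.pl $(trinity_args 16 4) other-command-line-options >& my-trinity-output.out
```

Here 128 cores are requested only to obtain 1 TB of memory, while Trinity itself is told to use 16 cores and 4 garbage collection threads.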
Last Updated on Thursday, 29 November 2012 15:51  

More on Trinity

Examples

  • Example files which show how to run Trinity in one run (for very small datasets) or in four stages (for larger datasets).

Technical questions:

Call the PSC hotline: 412-268-6350 / 800-221-1641 or mail to remarks@psc.edu.