Pittsburgh Supercomputing Center 

Advancing the state-of-the-art in high-performance computing,
communications and data analytics.

Celera WGS Assembler


Celera Assembler is a de novo whole-genome shotgun (WGS) DNA sequence assembler.

Installed on blacklight.


Other resources that may be helpful include:

  • The standard citation is the original paper: Myers et al. "A Whole-Genome Assembly of Drosophila." Science 287 2196-2204 (2000).
  • More recent papers describe
    • modifications for human genome assembly: Istrail et al. 2004; Levy et al. 2007
    • metagenomics assembly: Venter et al. 2004; Rusch et al. 2007
    • haplotype separation: Levy et al. 2007; Denisov et al. 2008
    • a Sanger+pyrosequencing hybrid pipeline: Goldberg et al. 2006
    • native assembly of 454 data: Miller et al. 2008
    There are links to these papers, and more, in the Publications section of the on-line documentation.
  • Website: http://wgs-assembler.sourceforge.net/

Running Celera WGS Assembler

On blacklight:

The Celera WGS Assembler is made available through the module command. To load the Celera WGS Assembler module, enter:

module load wgs
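
To confirm that the module took effect in the current shell, a quick check along the following lines may help. This is only a sketch: have_runca is a hypothetical helper name, not part of the Celera WGS Assembler or the module system, and it assumes that loading the wgs module places runCA on the PATH.

```shell
# Hypothetical helper (not part of Celera Assembler): report whether the
# runCA executable can be found on the PATH of the current shell.
# Assumption: "module load wgs" adds the directory holding runCA to PATH.
have_runca() {
  if command -v runCA >/dev/null 2>&1; then
    # command -v prints the resolved location of the executable
    echo "runCA found: $(command -v runCA)"
  else
    echo "runCA not found; run 'module load wgs' first" >&2
    return 1
  fi
}
```

Calling have_runca after "module load wgs" should print the location of runCA. A non-zero return status suggests the module is not loaded in this shell.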

General Usage:

runCA -d <dir> -p <prefix> [options] <frg> ...

where:

-d <dir>       Use <dir> as the working directory. Required.
-p <prefix>    Use <prefix> as the output prefix. Required.
-s <specFile>  Read options from the specifications file <specFile>. <specFile> can also be one of the following keywords:
                 [no]OBT - run with[out] overlap-based trimming (OBT)
                 noVec   - run with OBT but without Vector
-version       Print version information.
-help          Print this usage information.
-options       Describe specFile options and show their default values.
<frg>          A CA-formatted fragment file.
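
The usage summary above can be sketched as a small dry-run helper that checks for the two required values and prints the resulting command line without running it. The function name runca_cmd is hypothetical and exists only for illustration. It uses nothing beyond the -d and -p options and the <frg> arguments listed above, and it assumes the arguments contain no spaces.

```shell
# Hypothetical helper (not part of Celera Assembler): print the runCA
# command line for a working directory, an output prefix, and one or
# more fragment files. Nothing is executed; the command is only echoed.
runca_cmd() {
  if [ "$#" -lt 3 ]; then
    echo "usage: runca_cmd <dir> <prefix> <frg> ..." >&2
    return 1
  fi
  dir=$1
  prefix=$2
  shift 2
  # -d and -p are both required; the fragment files follow the options.
  echo "runCA -d $dir -p $prefix $*"
}
```

For example, "runca_cmd test test a006.frg.bz2" prints "runCA -d test -p test a006.frg.bz2". Once the printed command looks right, it can be entered at the prompt after the wgs module has been loaded.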

Example

runCA -p test -d test $CELERA_WGS_HOME/example/a006.frg.bz2