FASTX Toolkit
The FASTX-Toolkit is a collection of command-line tools for preprocessing short-read FASTA/FASTQ files.
Next-generation sequencing machines usually produce FASTA or FASTQ files containing many short-read sequences (possibly with quality information). The main processing step for such files is mapping (also called aligning) the sequences to reference genomes or other databases using specialized programs. However, it is sometimes more productive to preprocess the FASTA/FASTQ files before mapping, manipulating the sequences to produce better mapping results.
The FASTX-Toolkit tools perform some of these preprocessing tasks.
Documentation
Usage on Bridges-2
To see which versions of FASTX Toolkit are available, which one is the default if there is more than one, and some help, type
module spider fastx
To use FASTX Toolkit, include a command like this in your batch script or interactive session to load the FASTX Toolkit module (note that ‘module load’ is case-sensitive):
module load fastx-toolkit
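Once the module is loaded, the toolkit's programs should be available on your PATH. The following is a minimal sketch of one preprocessing step, not a recommended pipeline. The file names (example.fastq, example.filtered.fastq) and the thresholds are illustrative assumptions. The -Q33 option assumes Phred+33 quality encoding and may not appear in the built-in help of every version, so check fastq_quality_filter -h on your system before relying on it.

```shell
#!/bin/bash
# Sketch of a single FASTX Toolkit preprocessing step.
# File names and thresholds are illustrative assumptions.

# Load the module only where the `module` command exists (e.g. on Bridges-2).
if command -v module >/dev/null 2>&1; then
    module load fastx-toolkit
fi

# Create a tiny one-read FASTQ file so the sketch is self-contained.
# A FASTQ record has four lines: header, sequence, separator, qualities.
printf '@read1\nACGTACGTAC\n+\nIIIIIIIIII\n' > example.fastq

# Run the quality filter only if the tool is on PATH.
if command -v fastq_quality_filter >/dev/null 2>&1; then
    # -q 20 : minimum quality score a base must reach
    # -p 80 : minimum percentage of bases that must reach that score
    # -Q33  : assumption, treats qualities as Phred+33 encoded
    fastq_quality_filter -Q33 -q 20 -p 80 \
        -i example.fastq -o example.filtered.fastq
else
    echo "fastq_quality_filter not found; load the FASTX Toolkit module first" >&2
fi
```

In a batch script, the same commands would follow your scheduler directives, with example.fastq replaced by your own input file.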