
Sequence header

Additional per-sequence information can be placed in a header before the first alignment block. The header has one line per sequence, in exactly the same order as the sequences appear in the alignment. Each line is formatted as:

#=SQ <seqname> <weight> <database source name> <database accession> <source coordinates as start..stop::original length> <description>

The fields are: a sequence weight, which compensates for biased representation of subfamilies of sequences in the alignment; source information, if the sequence came from a database, consisting of identifier, accession number, and source coordinates; and a description of the sequence.
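
As an illustration of the field layout, here is a minimal Python sketch of a parser for a single #=SQ line. It is not taken from any SELEX-reading software. It assumes that fields are separated by whitespace, that only the description may contain spaces, and that a bare '0' in the coordinate field stands for missing coordinates. The names SQRecord and parse_sq_line, and the database identifiers in the usage example, are hypothetical.

```python
import re
from typing import NamedTuple, Optional


class SQRecord(NamedTuple):
    """One parsed #=SQ header line (hypothetical container, not a standard type)."""
    name: str
    weight: Optional[float]
    source: Optional[str]
    accession: Optional[str]
    start: int
    stop: int
    orig_len: int
    description: Optional[str]


# Matches the coordinate field, e.g. "1..50::120".
_COORDS = re.compile(r"^(\d+)\.\.(\d+)::(\d+)$")


def _opt(field: str) -> Optional[str]:
    """Map the '-' placeholder to None."""
    return None if field == "-" else field


def parse_sq_line(line: str) -> SQRecord:
    # Assumption: whitespace-separated fields; everything after the
    # coordinate field is the description and may contain spaces.
    parts = line.split(None, 6)
    if len(parts) != 7 or parts[0] != "#=SQ":
        raise ValueError("not a complete #=SQ line: %r" % line)
    _, name, weight, source, accession, coords, desc = parts

    if coords == "0":
        # Placeholder for unavailable source coordinates.
        start = stop = orig_len = 0
    else:
        match = _COORDS.match(coords)
        if match is None:
            raise ValueError("bad source coordinates: %r" % coords)
        start, stop, orig_len = (int(g) for g in match.groups())

    return SQRecord(
        name=name,
        weight=None if weight == "-" else float(weight),
        source=_opt(source),
        accession=_opt(accession),
        start=start,
        stop=stop,
        orig_len=orig_len,
        description=_opt(desc.strip()),
    )
```

For example, parse_sq_line("#=SQ seq1 0.75 EXAMPLE_DB X00000 1..50::120 hypothetical protein") yields a record with weight 0.75, start 1, stop 50, and original length 120. The database name and accession here are made-up placeholders.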

If any #=SQ line is present, every field must be present on each line, and there must be one #=SQ line for each sequence. If no information is available for a field, use '-', except for the source coordinates, which are given as '0'.
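
The placeholder rule can be sketched in Python as follows. This is only an illustration under stated assumptions: the helper names format_sq_line and check_sq_header are hypothetical, fields are joined with single spaces, and the header check assumes each line is well formed enough to carry a sequence name as its second whitespace-separated token.

```python
def format_sq_line(name, weight=None, source=None, accession=None,
                   coords=None, description=None):
    """Build one #=SQ line, filling missing fields with placeholders.

    coords is an optional (start, stop, original_length) tuple.
    """
    def dash(value):
        # '-' stands in for any unavailable field other than coordinates.
        return "-" if value is None else str(value)

    if coords is None:
        coord_field = "0"  # placeholder for unavailable source coordinates
    else:
        start, stop, orig_len = coords
        coord_field = "%d..%d::%d" % (start, stop, orig_len)

    return " ".join(["#=SQ", name, dash(weight), dash(source),
                     dash(accession), coord_field, dash(description)])


def check_sq_header(sq_lines, alignment_names):
    """Return True if there is exactly one #=SQ line per sequence,
    in the same order as the sequences appear in the alignment."""
    header_names = [line.split()[1] for line in sq_lines]
    return header_names == list(alignment_names)
```

A sequence with no extra information, format_sq_line("seq2"), comes out as "#=SQ seq2 - - - 0 -", so every field is still present on the line.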



Sean Eddy
Mon Apr 17 09:54:19 CDT 1995