
Local (Needleman/Wunsch) style searching: hmmls

hmmls ("HMM local search") finds high-scoring local matches of subsequences to the whole HMM. It is used for nucleic acid models or for protein domain models, where one expects a full match to the model to be found within a longer sequence.

The command line and options are essentially identical to hmmsw and hmmfs. The output is the score of the match, the start and end position of the match on the target sequence, and the name and description of the matched target sequence.
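As an illustration of working with output of this shape, the following Python sketch parses one hit per line. The exact column layout is not specified here, so the whitespace-separated order (score, start, end, name, description) is an assumption, and `Hit` and `parse_hit` are hypothetical names introduced only for this example.

```python
from typing import NamedTuple


class Hit(NamedTuple):
    score: float      # match score, in bits
    start: int        # start position of the match on the target sequence
    end: int          # end position of the match on the target sequence
    name: str         # name of the matched target sequence
    description: str  # free-text description (may be empty)


def parse_hit(line: str) -> Hit:
    """Parse one hit line, ASSUMING the layout 'score start end name [description]'."""
    # Split on whitespace at most 4 times so the description stays in one piece.
    fields = line.split(None, 4)
    if len(fields) < 4:
        raise ValueError(f"expected at least 4 fields, got: {line!r}")
    score, start, end, name = fields[:4]
    description = fields[4].strip() if len(fields) == 5 else ""
    return Hit(float(score), int(start), int(end), name, description)
```

If the real output uses a different column order or adds header lines, only `parse_hit` would need to change.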

hmmls is somewhat older than hmmsw and hmmfs, and has some idiosyncrasies. One is that you must sometimes specify the maximum length of the hits that you expect; by default, this is 1000 residues. If you expect matches longer than this, use hmmls -w <max length> to set a longer width. Another idiosyncrasy is that hmmls does not correct for the length of the target sequence, so hmmls scores are typically inflated by 5-10 bits over what they should be.
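A minimal sketch of the width rule in Python: it adds `-w` only when the expected hit length exceeds the 1000-residue default. The `-w` option and the default come from the text above; the helper name `hmmls_args` and the argument order (model file, then sequence database) are assumptions for illustration.

```python
from typing import List

DEFAULT_MAX_HIT_LENGTH = 1000  # default maximum hit length, in residues


def hmmls_args(model_path: str, db_path: str, expected_hit_length: int) -> List[str]:
    """Build an hmmls argument list, widening the window only when needed."""
    args = ["hmmls"]
    if expected_hit_length > DEFAULT_MAX_HIT_LENGTH:
        # Matches longer than the default need an explicit wider window.
        args += ["-w", str(expected_hit_length)]
    # ASSUMPTION: the model file comes first, then the sequence database.
    args += [model_path, db_path]
    return args
```

The resulting list could be passed to `subprocess.run` once the argument order has been checked against the program's own usage message.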

hmmfs is almost always preferable. Its scores are more correct, it allows for significant fragmentary matches to just a part of the model, and it uses a memory-efficient search algorithm and a less ad hoc implementation than hmmls. For instance, hmmls often has difficulty parsing closely spaced (tandemly repeated) domains, a problem that hmmfs cleanly resolves. The only reason to use hmmls is if you have good reason to force the full HMM to match in the target sequence. (However, this need does arise frequently!)



Sean Eddy
Mon Apr 17 09:54:19 CDT 1995