
Constrained simulated annealing

If you know the structures of some of your sequences, you might create a structure-based alignment of them, either manually or automatically. You may then want to align the remaining homologous sequences to this alignment without disturbing the structural alignment. hmmt provides the option -a <alignfile> to do this. Because part of the training set is kept in a known alignment while the rest is aligned by simulated annealing, I call this procedure "constrained simulated annealing".

Take the globin demos as an example. bashford.slx is a structural alignment of seven globins, created manually by Don Bashford. globins50.fa contains 50 randomly chosen globins. To train a model on both files while keeping the alignment of the sequences in bashford.slx fixed:

> hmmt -a bashford.slx -o newglobin.slx newglobin.hmm globins50.fa

The model will be saved in newglobin.hmm, and the final alignment will be saved in newglobin.slx. Saving the final alignment with the -o option is important here. If you instead used hmma with newglobin.hmm to make the alignment, you would first have to merge the two sequence files somehow, and the alignment of the sequences in bashford.slx would not be guaranteed to stay fixed, because hmma has no constraint mechanism (yet).

Both the sequences in the alignment and the sequences in the unaligned training set are used for training, so there should be no overlap between the two files: any sequence that occurs in both will be counted twice.
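
One way to check for such overlap before training is sketched below in Python. This script is not part of the HMMER distribution, and the function names (read_fasta, read_selex, find_overlap) are made up for illustration. It assumes a simple SELEX layout in which every non-blank line that does not start with # or % holds a sequence name followed by aligned residues, with ., -, _ and ~ treated as gap characters; adjust the parser if your alignment file differs. Sequences are compared both by name and by ungapped, upper-cased residues.

```python
GAP_CHARS = ".-_~"  # assumed gap symbols; adjust for your alignment files


def read_fasta(lines):
    """Return {name: residues} from FASTA-formatted lines."""
    seqs = {}
    name = None
    for line in lines:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # The name is the first word after '>'.
            fields = line[1:].split()
            name = fields[0] if fields else ""
            seqs[name] = ""
        elif name is not None:
            seqs[name] += line.upper()
    return seqs


def read_selex(lines):
    """Return {name: ungapped residues} from SELEX-like lines.

    Assumption: each data line is '<name> <aligned residues>'; lines
    starting with '#' or '%' are comments or annotation and are skipped.
    Multi-block alignments are handled by concatenating per name.
    """
    seqs = {}
    for line in lines:
        stripped = line.strip()
        if not stripped or stripped[0] in "#%":
            continue
        fields = stripped.split(None, 1)
        name = fields[0]
        aligned = fields[1] if len(fields) > 1 else ""
        residues = "".join(
            c for c in aligned if c not in GAP_CHARS and not c.isspace()
        )
        seqs[name] = seqs.get(name, "") + residues.upper()
    return seqs


def find_overlap(aligned, unaligned):
    """Return (shared names, [(aligned name, unaligned name)] with equal residues)."""
    shared_names = sorted(set(aligned) & set(unaligned))
    by_residues = {seq: name for name, seq in aligned.items() if seq}
    shared_residues = sorted(
        (by_residues[seq], name)
        for name, seq in unaligned.items()
        if seq in by_residues
    )
    return shared_names, shared_residues
```

For the globin example, the two dictionaries could be built with read_selex(open("bashford.slx")) and read_fasta(open("globins50.fa")). Any name or residue match that find_overlap reports points to a sequence that would be counted twice, so it should be removed from the unaligned file before running hmmt.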



Sean Eddy
Mon Apr 17 09:54:19 CDT 1995