Classified Bibliography for Nucleic Acid and Protein Sequence Analysis
REQUIRED READINGS
Dynamic Programming
A general method applicable to the search for similarities in the amino acid sequences of two proteins. Needleman, S.B., Wunsch, C.D. 1970 Journal of Molecular Biology 48:443-453.
Substitution Matrices
Amino acid substitution matrices from an information theoretic perspective. Altschul, S.F. 1991 Journal of Molecular Biology 219:555-565.
A model of evolutionary change in proteins. Dayhoff, M.O., Schwartz, R.M., Orcutt, B.C. 1978 In "Atlas of Protein Sequence and Structure, vol. 5, suppl. 3," M.O. Dayhoff (ed.), pp. 345-352, Natl. Biomed. Res. Found., Washington.
Improved Sensitivity of Nucleic Acid Database Searches Using Application Specific Scoring Matrices. States, D.J., Gish, W., Altschul, S.F. 1991 Methods: A Companion to Methods in Enzymology 3:66-70.
A structural basis for sequence comparisons: an evaluation of scoring methodologies. Johnson, M.S., Overington, J.P. 1993 Journal of Molecular Biology 233:716-738.
Local Similarities
A new algorithm for best subsequence alignments with applications to tRNA-rRNA comparisons. Waterman, M.S., Eggert, M. 1987 Journal of Molecular Biology 197:723-728.
Database Searches
Basic local alignment search tool. Altschul, S.F., Gish, W., Miller, W., Myers, E.W., Lipman, D.J. 1990 Journal of Molecular Biology 215:403-410.
Improved tools for biological sequence comparison. Pearson, W.R., Lipman, D.J. 1988 Proceedings of the National Academy of Sciences USA 85:2444-2448.
Searching Protein Sequence Libraries: Comparison of the Sensitivity and Selectivity of the Smith-Waterman and FASTA algorithms. Pearson, W.R. 1991 Genomics 11:635-650.
Global Multiple Sequence Alignment
A tool for multiple sequence alignment. Lipman, D.J., Altschul, S.F., Kececioglu, J.D. 1989 Proceedings of the National Academy of Sciences USA 86:4412-4415.
A strategy for the rapid multiple alignment of protein sequences. Confidence levels from tertiary structure comparisons. Barton, G.J., Sternberg, M.J. 1987 Journal of Molecular Biology 198:327-337.
Simultaneous comparison of three or more sequences related by a tree. Sankoff, D., Cedergren, R.J. 1983 In "Time Warps, String Edits and Macromolecules: The Theory and Practice of Sequence Comparison," D. Sankoff & J.B. Kruskal (eds.), pp. 253-263, Addison-Wesley, Reading, MA.
Local Multiple Alignment and Motifs
A workbench for multiple alignment construction and analysis. Schuler, G.D., Altschul, S.F., Lipman, D.J. 1991 Proteins 9:180-190.
An expectation maximization (EM) algorithm for the identification and characterization of common sites in unaligned biopolymer sequences. Lawrence, C.E., Reilly, A.A. 1990 Proteins 7:41-51.
Finding sequence motifs in groups of functionally related proteins. Smith, H.O., Annau, T.M., Chandrasegaran, S. 1990 Proceedings of the National Academy of Sciences USA 87:826-830.
Identifying protein-binding sites from unaligned DNA fragments. Stormo, G.D., Hartzell, G.W. III 1989 Proceedings of the National Academy of Sciences USA 86:1183-1187.
Rigorous pattern-recognition methods for DNA sequences. Analysis of promoter sequences from Escherichia coli. Galas, D.J., Eggert, M., Waterman, M.S. 1985 Journal of Molecular Biology 186:117-128.
Detecting Subtle Sequence Signals: A Gibbs Sampling Strategy for Multiple Alignment. Lawrence, C.E., Altschul, S.F., Boguski, M.S., Liu, J.S., Neuwald, A.F., Wootton, J.C. 1993 Science 262:208-214.
Sequence Profiles
Automatic generation of primary sequence patterns from sets of related protein sequences. Smith, R.F., Smith, T.F. 1990 Proceedings of the National Academy of Sciences USA 87:118-122.
Profile scanning for three-dimensional structure patterns in protein sequences. Gribskov, M., Homyak, M., Edenfield, J., Eisenberg, D. 1988 Computer Applications in the Biosciences 4:61-66.
Profile analysis: detection of distantly related proteins. Gribskov, M., McLachlan, A.D., Eisenberg, D. 1987 Proceedings of the National Academy of Sciences USA 84:4355-4358.