MEME Program - Multiple Expectation maximization for Motif Elicitation
MEME is a tool for discovering motifs in groups of sequences. A motif is defined here as a position-dependent letter frequency matrix that describes a set of similar subsequences of equal length. (A MEME motif is similar to a gapless profile.)
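To make the definition concrete, here is a minimal sketch of how a position-dependent letter frequency matrix could be built from a set of equal-length subsequences. The function name `frequency_matrix` and the example sites are hypothetical and chosen only for illustration; this is not MEME's own code.

```python
from collections import Counter

def frequency_matrix(sites, alphabet="ACGT"):
    """Return one {letter: frequency} dict per motif position.

    `sites` is a list of equal-length subsequences (hypothetical input).
    """
    width = len(sites[0])
    if any(len(s) != width for s in sites):
        raise ValueError("all subsequences must have the same length")
    matrix = []
    for pos in range(width):
        # Count the letters observed at this position across all sites.
        counts = Counter(s[pos] for s in sites)
        matrix.append({a: counts[a] / len(sites) for a in alphabet})
    return matrix

# Illustrative sites, not taken from any real data set.
sites = ["TATAAT", "TATGAT", "TACAAT", "TATAAT"]
matrix = frequency_matrix(sites)
```

Each entry of `matrix` describes one motif position, so the matrix has as many columns as the motif is wide and no gaps are allowed, which is why such a motif resembles a gapless profile.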
MEME takes as input a group of DNA or protein sequences in Pearson/FASTA format and outputs as many motifs as requested. MEME uses statistical modeling techniques to automatically choose the best width and content of each motif. For each motif MEME discovers, it outputs:
- the subsequences in the input set that match the motif,
- an information content diagram that visually displays the degree of conservation of each position of the motif,
- a three-level consensus sequence describing the most conserved residues at each position, and
- the simplified (rounded to 1 digit) frequency matrix.
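The last two outputs can be sketched from a frequency matrix as follows. This is an illustration under stated assumptions, not MEME's actual procedure: the function names, the `min_freq` threshold of 0.2, and the use of `-` as a placeholder are all hypothetical, and the example matrix is invented.

```python
def multilevel_consensus(matrix, levels=3, min_freq=0.2):
    """Return `levels` strings; string k holds the (k+1)-th most frequent
    letter at each position, or '-' if its frequency is below `min_freq`.

    `min_freq` is an assumed cutoff, not a value taken from MEME.
    """
    rows = [[] for _ in range(levels)]
    for column in matrix:
        # Rank letters by decreasing frequency, breaking ties alphabetically.
        ranked = sorted(column.items(), key=lambda kv: (-kv[1], kv[0]))
        for k in range(levels):
            if k < len(ranked) and ranked[k][1] >= min_freq:
                rows[k].append(ranked[k][0])
            else:
                rows[k].append("-")
    return ["".join(r) for r in rows]

def simplified_matrix(matrix):
    """Round every frequency to one decimal digit."""
    return [{a: round(f, 1) for a, f in col.items()} for col in matrix]

# Invented three-position DNA motif for illustration.
matrix = [
    {"A": 0.0, "C": 0.0, "G": 0.0, "T": 1.0},
    {"A": 0.6, "C": 0.0, "G": 0.4, "T": 0.0},
    {"A": 0.11, "C": 0.23, "G": 0.04, "T": 0.62},
]
consensus = multilevel_consensus(matrix)
rounded = simplified_matrix(matrix)
```

With this input the first consensus level is `TAT`, and the second level records the alternative residues `G` and `C` at the two less conserved positions.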
Several statistical scores are reported for each motif, the most useful of which is probably its information content (IC). Motifs with IC values of 18 bits or more are usually the most useful as probes for searching large sequence databases. Motifs with lower IC values tend to produce many false positives in searches.
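As a rough guide to what an IC value means, the sketch below computes the information content of a frequency matrix as the sum over positions of log2(alphabet size) minus the Shannon entropy of that position. It assumes a uniform background distribution, which is a simplification; the value MEME reports may be computed against different background frequencies. The function name and example matrices are hypothetical.

```python
import math

def information_content(matrix, alphabet_size=4):
    """Total information content in bits, assuming a uniform background."""
    total = 0.0
    for column in matrix:
        # Shannon entropy of the column; zero frequencies contribute nothing.
        entropy = -sum(f * math.log2(f) for f in column.values() if f > 0)
        total += math.log2(alphabet_size) - entropy
    return total

# Nine fully conserved DNA positions (invented example).
conserved = [{"A": 1.0, "C": 0.0, "G": 0.0, "T": 0.0}] * 9
# One position with no preference among the four letters.
uninformative = [{"A": 0.25, "C": 0.25, "G": 0.25, "T": 0.25}]

ic_conserved = information_content(conserved)
ic_uninformative = information_content(uninformative)
```

Under this assumption a fully conserved DNA position contributes 2 bits, so the 18-bit guideline corresponds to the equivalent of nine fully conserved DNA positions, while a position with equal letter frequencies contributes nothing.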