MEME Examples

The following examples use data files provided in this release of MEME. MEME writes its output to standard output, so redirect it to a file if you want to use it with the program MAST.

1) A simple DNA example:

$ meme crp0.s -dna -mod oops -pal > ex1

MEME looks for a single motif in the file crp0.s, which contains DNA sequences in FASTA format. The OOPS model is used, so MEME assumes that every sequence contains exactly one occurrence of the motif. The palindrome switch (-pal) is given, so motifs are tested to see whether they are palindromes and are reported as such if they are. Because no width was specified, MEME automatically chooses the best width for the motif.
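
As a rough illustration of what "palindrome" means for DNA, the sketch below assumes the usual convention that a DNA palindrome is a sequence equal to its own reverse complement. The helper names are hypothetical and are not part of MEME. MEME tests motifs rather than plain strings, so this is only a string-level simplification of the idea.

```python
# Illustrative sketch only: these helpers are hypothetical, not MEME code.
# Assumption: a DNA "palindrome" is a sequence equal to its reverse complement.

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an upper- or lower-case DNA string."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


def is_dna_palindrome(seq: str) -> bool:
    """True if the sequence reads the same on both strands (5' to 3')."""
    return seq.upper() == reverse_complement(seq)


if __name__ == "__main__":
    print(is_dna_palindrome("GAATTC"))   # True: GAATTC is its own reverse complement
    print(is_dna_palindrome("GATTACA"))  # False
```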

2) A fast DNA example:

$ meme crp0.s -dna -mod oops -pal -w 20 -noshorten > ex2

This example differs from example 1) in that MEME is told to consider only motifs of width 20. This causes MEME to execute about 10 times faster. The -w and -noshorten switches can also be used with protein datasets if the width of the motifs is known in advance.

3) A simple protein example:

$ meme lipocalin.s -mod oops -maxw 50 -nmotifs 2 > ex3

MEME searches for two motifs, each of width at most 50. (Specifying -maxw 50 makes MEME run faster since it does not have to consider motifs longer than 50; otherwise motif widths of up to 173 would be considered with this particular dataset.) Because the OOPS model is specified, each motif is assumed to occur exactly once in each of the sequences.

4) Another simple protein example:

$ meme farntrans5.s -mod tcm -maxw 50 -nmotifs 3 > ex4

MEME searches for three motifs of maximum width 50 using the TCM sequence model, which allows each motif to have any number of occurrences in each sequence. This dataset contains motifs that are repeated several times in each sequence.

6) Annotating sequences with MEME motifs and searching for homologs in databases:

$ mast ex3 -d sprot31 -z 3

MAST reads the motifs from ex3 (the MEME output saved in example 3) and searches a database (the Swiss-Prot version 31 database in this example). It indicates which sequences appear to contain the motifs and may therefore be homologs of the sequence family in which MEME discovered the motifs. MAST uses a scoring function which combines individual log-odds scores for all the motifs. MAST produces:

  1. a histogram of ZSCORES
  2. a list of high-scoring sequences sorted by p-value
  3. motif schematic diagrams for high-scoring sequences
  4. the high-scoring sequences annotated with most likely motif positions and their scores.

The MAXSUM score for a sequence is the sum of the single maximum score for each motif, regardless of whether that score is over the threshold, and regardless of whether its position would overlap the maximum-scoring position for another motif. The ZSCORE of a sequence is (MAXSUM - AVG)/SD, where AVG and SD are calculated from the dataset being searched using motifnormal. Schematic diagrams and annotations are printed only for sequences with a ZSCORE of at least <zthresh> (3 in the command above).
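
The arithmetic above can be sketched as follows. This is a minimal illustration, not MAST's implementation: the function names are hypothetical, and the scores, AVG, and SD are made-up numbers chosen only to make the calculation easy to follow. In MAST, AVG and SD come from the dataset being searched.

```python
# Illustrative sketch only: function names and all numbers are invented.
from typing import Sequence


def maxsum(per_motif_scores: Sequence[Sequence[float]]) -> float:
    """Sum the single best score of each motif over all positions.

    Thresholds and overlaps between motifs are deliberately ignored,
    as described in the text.
    """
    return sum(max(scores) for scores in per_motif_scores)


def zscore(maxsum_value: float, avg: float, sd: float) -> float:
    """ZSCORE = (MAXSUM - AVG) / SD."""
    return (maxsum_value - avg) / sd


if __name__ == "__main__":
    # One list of position scores per motif (made-up values).
    scores = [[-2.0, 5.0, 1.0], [3.0, -1.0], [0.5, 4.5, 2.0]]
    m = maxsum(scores)              # 5.0 + 3.0 + 4.5 = 12.5
    z = zscore(m, avg=2.5, sd=2.5)  # (12.5 - 2.5) / 2.5 = 4.0
    zthresh = 3                     # same value as -z 3 in the example
    print(m, z, z >= zthresh)       # 12.5 4.0 True
```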

The schematic (block) diagrams show the order and spacing of *non-overlapping* matches to the motifs in each high-scoring sequence. Strong motif matches (score >= threshold) are shown in brackets "[]", weak matches (0 <= score < threshold) are shown in angle brackets "<>", and the length of each stretch of non-motif sequence ("spacer") is shown as a number between dashes "-". For example,

27-[3]-44-<4>-99-[1]-7

shows an initial spacer of length 27, followed by a strong match to motif 3, a spacer of length 44, a weak match to motif 4, a spacer of length 99, a strong match to motif 1, and a final spacer of length 7.
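
For readers who want to post-process such diagrams, the sketch below splits a diagram string into its elements. It assumes exactly the notation described above (plain numbers for spacers, "[n]" for strong matches, "<n>" for weak matches, joined by dashes). The function name parse_diagram is hypothetical and is not part of MAST.

```python
# Illustrative sketch only: parse_diagram is a hypothetical helper.
# Assumption: the diagram uses only the notation described in the text.
import re
from typing import List, Tuple

_TOKEN = re.compile(r"\[(\d+)\]|<(\d+)>|(\d+)")


def parse_diagram(diagram: str) -> List[Tuple[str, int]]:
    """Return (kind, value) pairs; kind is 'spacer', 'strong', or 'weak'.

    For spacers the value is a length; for matches it is the motif number.
    """
    elements = []
    for token in diagram.strip().split("-"):
        match = _TOKEN.fullmatch(token)
        if match is None:
            raise ValueError(f"unrecognized diagram element: {token!r}")
        strong, weak, spacer = match.groups()
        if strong is not None:
            elements.append(("strong", int(strong)))
        elif weak is not None:
            elements.append(("weak", int(weak)))
        else:
            elements.append(("spacer", int(spacer)))
    return elements


if __name__ == "__main__":
    for kind, value in parse_diagram("27-[3]-44-<4>-99-[1]-7"):
        print(kind, value)
```

Splitting on "-" is enough here because neither spacer lengths nor motif numbers contain a dash in this notation; a diagram format that adds other symbols would need a different tokenizer.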