MEME Output

The following section shows actual output from the MEME program. It is followed by a section explaining that output.

Motif   5

(best) e_ll_0  -7031 e_ll  -6814 ll  -6750 sig 9.980e-001
(best) lrt 5.396e-001 bonferroni 1.000e+000 root 9.980e-001

(best) RAADIRDVTKRVLAHLLGVTI --> RAADIRDVxxRVLxHL
(best) w  16 nsites   8.0 lambda 0.0052288 IC/col  2.269

(best) IC 36.298
Bayes optimal threshold for information content scores =   7.5718
Alignment of sites with IC scores over 7.57175:
sequence         start     IC        pre site             post
--------         -----     --        --- ----             ----
U12340               1  52.50            RAADIRDVTKRVLAHL LGVTISNPSL
P08838               1  46.94            RAADIRDVTKRVTGHL LGVEIPNPSM
L15191               1  51.32            RAADIRDVAKRVLAHL LGVELPNPAT
U15110               1  35.40            RSADIKDVSLRIISHI LGLEIHDLST
M21450               1  41.54            RAADVRDIGKRLLRNI LGLAIIDLSA
P32670               2  39.45          E RALDVRDVCFQLLQQI YGEQRFPAPG
P23388               5  43.48       VLSG RAIDLRDAGQRVLQHL GRVRTGETHL
Z37113               1  45.39            RAADLRDVGRRVLAQL DAAAAGAGLT
(consensus)     (   8)                   RAADIRDVxxRVLxHL

Information content of positions in motif:
                                     7.1
                                     6.4
                                     5.7
                                     5.0
                                     4.3
Info Content                         3.5 *
                                     2.8 *  * **   *   *
                                     2.1 *  * **   *   *
                                     1.4 ** ***** **** **
                                     0.7 ****************
                                     0.0 ----------------
(consensus)                              RAADIRDVxxRVLxHL
                                          SL V  I   LI QI
                                           I L      I  N

                                      A  0870000120000300
                                      C  0000000010000000
                                      D  0009009000000000
                                      E  0000000000000000
                                      F  0000000001000000
                                      G  0000000030000100
                                      H  0000000000000070
                                      I  0010600100011003
                                      K  0000000005000000
                                      L  0010100001027006
                                      M  0000000000000000
                                      N  0000000000000010
                                      P  0000000000000000
                                      Q  0000000001000210
                                      R  9000090001900100
                                      S  0100000010000100
                                      T  0000000020001000
                                      V  0000200700060000
                                      W  0000000000000000
                                      Y  0000000000000000

Explanation of MEME output

This is the output for the fifth motif discovered by MEME in a group of sequences. The first six lines give various statistical measures of the strength of the motif and other motif data. The best measure of motif strength is probably its information content (IC), shown on the sixth line. The next lines show a default threshold which can be used for searches with the motif on small datasets.

The occurrences (hits above the default threshold) of the motif in the original dataset are shown next. Each line shows the

  • sequence identifier (e.g., P23388),
  • starting position of the hit (e.g., 5),
  • IC score of the hit (e.g., 43.48),
  • the left flanking sequence (e.g., VLSG),
  • the actual sequence matching the motif (e.g., RAIDLRDAGQRVLQHL), and
  • the right flanking sequence (e.g., GRVRTGETHL).
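
As an illustration of the fields listed above, the following Python sketch splits one alignment line into its parts. It is not part of MEME; the names Site and parse_site_line are hypothetical, and the handling of a missing flank (telling it apart by the motif width w) is an assumption based only on the sample output shown here.

```python
from typing import NamedTuple


class Site(NamedTuple):
    sequence: str  # sequence identifier
    start: int     # starting position of the hit
    ic: float      # IC score of the hit
    pre: str       # left flanking sequence (may be empty)
    site: str      # residues matching the motif
    post: str      # right flanking sequence (may be empty)


def parse_site_line(line: str, width: int) -> Site:
    """Split one alignment line into fields.

    `width` is the motif width (w). Assumption: when only two residue
    fields are present, the one whose length equals `width` is the site.
    """
    fields = line.split()
    sequence, start, ic = fields[0], int(fields[1]), float(fields[2])
    rest = fields[3:]
    if len(rest) == 3:
        pre, site, post = rest
    elif len(rest) == 2 and len(rest[0]) == width:
        pre, site, post = "", rest[0], rest[1]
    elif len(rest) == 2:
        pre, site, post = rest[0], rest[1], ""
    elif len(rest) == 1:
        pre, site, post = "", rest[0], ""
    else:
        raise ValueError(f"unexpected alignment line: {line!r}")
    return Site(sequence, start, ic, pre, site, post)


hit = parse_site_line(
    "P23388               5  43.48       VLSG RAIDLRDAGQRVLQHL GRVRTGETHL", 16
)
print(hit.sequence, hit.start, hit.ic, hit.pre, hit.site, hit.post)
```

The split is ambiguous when a flank happens to be exactly as long as the motif; a parser relying on the fixed column positions of the output would avoid that, at the cost of depending on the exact layout.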

The next lines in the output of MEME show the information content of the motif on a column-by-column basis. The information content of a column depends on the frequencies of the letters in that column compared to the overall frequencies of those letters in the group of sequences. The more conserved a position is, and the rarer the conserved letters are, the higher its information content.
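
The dependence on both conservation and rarity can be illustrated with a small Python sketch. It assumes the information content of a column is the relative entropy, in bits, between the column frequencies and the background frequencies; the exact formula MEME uses is not stated here, and the function name and the three-letter background below are made up for the example.

```python
import math


def column_information(freqs, background):
    """Relative entropy (bits) of one column against background frequencies.

    Assumed formula: sum over letters of f * log2(f / p), skipping letters
    that do not occur in the column.
    """
    return sum(
        f * math.log2(f / background[letter])
        for letter, f in freqs.items()
        if f > 0
    )


# Hypothetical background: A is common, W is rare.
background = {"A": 0.5, "L": 0.4, "W": 0.1}

rare_conserved = column_information({"W": 1.0}, background)
common_conserved = column_information({"A": 1.0}, background)
mixed = column_information({"A": 0.5, "L": 0.5}, background)

print(f"conserved rare letter:   {rare_conserved:.2f} bits")
print(f"conserved common letter: {common_conserved:.2f} bits")
print(f"mixed column:            {mixed:.2f} bits")
```

Under this assumption, a column conserved for the rare letter scores highest, a column conserved for the common letter scores lower, and the mixed column scores lowest.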

Beneath the motif information content plot is the three-level consensus sequence of the motif. This shows the (up to) three most common letters occurring in each column of the motif. The rules for printing letters are:

  • If three or fewer letters have a combined frequency above 0.8, they are printed, most frequent letter first, one above the other.
  • If no three letters have a combined frequency above 0.8, "x" is printed.

This provides a simplified picture of the conserved letters in each column of the motif at a glance.
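
The printing rules can be sketched in Python as follows. This is an illustration rather than MEME's own code: the function names are hypothetical, and treating the 0.8 threshold as a strict inequality and stopping at the first letter that crosses it are assumptions.

```python
def consensus_column(freqs, threshold=0.8, max_letters=3):
    """Return the letters to print for one column, most frequent first.

    Letters are added in order of decreasing frequency until their combined
    frequency exceeds `threshold`. If the top `max_letters` letters do not
    get there, the column is reported as "x".
    """
    ranked = sorted(freqs.items(), key=lambda item: item[1], reverse=True)
    chosen, total = [], 0.0
    for letter, freq in ranked[:max_letters]:
        chosen.append(letter)
        total += freq
        if total > threshold:
            return chosen
    return ["x"]


def consensus_lines(columns):
    """Stack the per-column letters into three text lines."""
    picks = [consensus_column(col) for col in columns]
    return [
        "".join(p[level] if level < len(p) else " " for p in picks)
        for level in range(3)
    ]


# Made-up columns: fully conserved, mostly conserved, and variable.
columns = [
    {"R": 1.0},
    {"V": 0.7, "I": 0.2, "A": 0.1},
    {"A": 0.3, "Q": 0.2, "G": 0.15, "R": 0.15, "S": 0.1, "K": 0.1},
]
for line in consensus_lines(columns):
    print(repr(line))
```

The first line printed is the one-line consensus; the second and third lines carry the next most common letters where they are needed to pass the threshold.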

The actual motif is shown below the consensus sequence as a matrix with one row per letter and one column per motif position. Each entry is multiplied by 10 and rounded to a single digit to conserve space. MEME also prints the log-odds matrix corresponding to the motif. That matrix is used for searching by the MAST program and is omitted from the sample output shown above.
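
To read the compact matrix back into approximate values, each digit can be divided by 10. The Python sketch below does this for two rows copied from the sample output; the function name is hypothetical, and the code assumes every entry is a single digit, as in the sample.

```python
def decode_motif_rows(lines):
    """Map each letter to its list of approximate per-position values.

    Assumes each row is a letter followed by one digit per motif position,
    where a digit d stands for a value of roughly d / 10.
    """
    matrix = {}
    for line in lines:
        letter, digits = line.split()
        matrix[letter] = [int(d) / 10 for d in digits]
    return matrix


rows = [
    "D  0009009000000000",
    "R  9000090001900100",
]
motif = decode_motif_rows(rows)
print(motif["R"][0])    # value for R at the first motif position
print(len(motif["D"]))  # number of motif positions
```

Because of the rounding, the decoded values in a column need not sum to exactly 1. In the sample, the only nonzero digit in the first column is the 9 for R.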