Motif ID / Description / % promoters / Incidence / Total motifs
O$PTBP / Plant TATA binding protein factor / 30% / 7 / 31
O$VTBP / Vertebrate TATA binding protein factor / 73% / 7 / 68
P$AHBP / Arabidopsis homeobox protein / 88% / 7 / 67
P$BRRE / Brassinosteroid response element / 35% / 2 / 2
P$CCAF / Circadian control factors / 52% / 4 / 9
P$DOFF / DNA binding with one finger / 84% / 7 / 20
P$GAPB / GAP-box (light response elements) / 42% / 4 / 7
P$GBOX / Plant G-box/C-box bZIP proteins / 69% / 5 / 12
P$GTBX / GT-box elements / 91% / 7 / 50
P$IDDF / ID domain factors / 38% / 5 / 9
P$L1BX / L1 box / 81% / 7 / 37
P$LREM / Light responsive element motif / 58% / 7 / 13
P$MADS / MADS box proteins / 84% / 6 / 28
P$MIIG / MYB IIG-type binding sites / 51% / 3 / 5
P$MSAE / M-phase-specific activator elements / 52% / 5 / 5
P$MYBL / MYB-like proteins / 94% / 7 / 19
P$MYBS / MYB proteins with single DNA binding repeat / 82% / 7 / 25
P$NACF / Plant specific NAC transcription factors / 56% / 6 / 12
P$NCS1 / Nodulin consensus sequence 1 / 57% / 7 / 18
P$OPAQ / Opaque-2 like transcriptional activators / 69% / 6 / 14
P$RAV5 / 5'-part of bipartite RAV1 binding site / 17% / 5 / 5
P$SALT / Salt/drought responsive elements / 25% / 3 / 3

Table S1 Identification of transcription factor binding sites within promoters that drive preferential transcript expression in the outer tissues of the developing fruit. The table lists 22 binding sites (motif IDs) identified in both the GhPRP3 and GhCHS1 promoters by the MatInspector program (Quandt et al. 1995). For each motif, the description of the corresponding transcription factor as given by MatInspector is shown, together with the percentage of plant promoters in the MatInspector database (63,000 plant promoters with an average length of 612 nucleotides) in which the binding site occurs (% promoters). Incidence is the number of promoters in the dataset of seven (GhPRP3, GhCHS1, AGL5, AT2G40250, PG, PPC2 and CRC) in which the motif is present, and total motifs is the total number of times the motif was found across the seven promoters.
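The two count columns can be illustrated with a minimal Python sketch. It assumes per-promoter hit counts are already available for a motif; the function name `summarize_motif` and the example counts are hypothetical and are not taken from the MatInspector output behind Table S1.

```python
def summarize_motif(hits_per_promoter):
    """Return (incidence, total motifs) for one motif.

    hits_per_promoter maps promoter name -> number of times the motif
    was found in that promoter (hypothetical input format).
    """
    # Incidence: number of promoters with at least one occurrence.
    incidence = sum(1 for n in hits_per_promoter.values() if n > 0)
    # Total motifs: occurrences summed over all promoters.
    total = sum(hits_per_promoter.values())
    return incidence, total


# Hypothetical counts for illustration only; they are chosen to give
# incidence 2 and total 2, the same shape as the P$BRRE row.
example_hits = {
    "GhPRP3": 1, "GhCHS1": 1, "AGL5": 0, "AT2G40250": 0,
    "PG": 0, "PPC2": 0, "CRC": 0,
}
incidence, total = summarize_motif(example_hits)
print(incidence, total)
```

A motif found several times in one promoter raises the total but adds only one to the incidence, which is why total motifs can greatly exceed incidence in the table.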

Fig. S1 Identification of three novel motifs common to the GhPRP3 and GhCHS1 promoters. Motifs identified by the MEME program (Bailey and Elkan 1994) are presented as position-specific probability matrices that specify the probability of each possible nucleotide appearing at each possible position in an occurrence of the motif. The total height of the stack of nucleotides at each position represents the information content of that site in the motif in bits. The height of the individual letters in a stack is the probability of the letter at that position multiplied by the total information content of the stack.
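The relationship between probabilities, stack height and letter height described in the caption can be sketched as follows. This is a minimal illustration that assumes a four-letter DNA alphabet and ignores any small-sample correction that logo software may apply; the function names and the example column are hypothetical and do not come from the MEME output.

```python
import math

MAX_BITS = math.log2(4)  # 2 bits for a four-letter DNA alphabet


def information_content(probs):
    """Information content (bits) of one motif position.

    probs maps each nucleotide to its probability at this position.
    Assumption: no small-sample correction is applied.
    """
    entropy = -sum(p * math.log2(p) for p in probs.values() if p > 0)
    return MAX_BITS - entropy


def letter_heights(probs):
    """Height of each letter: its probability times the stack's information content."""
    ic = information_content(probs)
    return {base: p * ic for base, p in probs.items()}


# Hypothetical column, for illustration only.
column = {"A": 0.5, "C": 0.25, "G": 0.25, "T": 0.0}
heights = letter_heights(column)
print(information_content(column))  # total stack height in bits
print(heights)                      # per-letter heights in bits
```

For this column the entropy is 1.5 bits, so the stack is 0.5 bits tall and the letter A occupies half of it. A position where all four nucleotides are equally likely has zero information content and therefore no visible stack.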
