Usary et al.

Supplementary Materials Figure 1. Sequence analysis of GATA3 in tumor and cell line genomic DNAs and cDNAs. A) Wild-type sequence of the intron junction and exon 5 nucleotides 1483-1490. B) Tumor Ull-214 showing a splice site mutation due to a CA deletion at the exon 5 splice acceptor site. C) Wild-type sequence of nucleotides 1499-1513. D) Mutant Ull-030 showing a frameshift mutation due to an A insertion. E) Wild-type sequence of nucleotides 1559-1573. F) Tumor-derived cell line MCF-7 showing a frameshift mutation due to a G insertion, sequenced from cDNA. G) Wild-type sequence of nucleotides 1582-1597. H) Tumor BR00-0587 showing the missense mutation CTC>TTC (Leu to Phe). I) Wild-type sequence of nucleotides 1651-1665. J) Tumor Ull-011 showing the missense mutation CGA>CTA (Arg to Leu). K) Tumor BR99-0348 showing the nonsense mutation CGA>TGA (Arg to Stop), sequenced from cDNA. L) Wild-type sequence of the exon 4 to exon 5 region taken from cDNA. M) Tumor BR99-0207 sequence across the exon 4 to exon 5 region obtained from cDNA.

Supplementary Materials Figure 2. Immunohistochemical analysis of paraffin-embedded cell lines for GATA3 expression. A) MCF-7 cells, B) 293T-GATA3-WT, C) 293T-Empty Vector, D) 293T-GATA3-R367L. Magnification was 200X.

Supplementary Materials Figure 3. Hierarchical clustering analysis of 115 breast tumors using the “intrinsic” gene set, highlighting the GATA3 mRNA expression cluster (adapted from Sorlie et al. 2003). A) Scaled-down cluster diagram of 115 tumors clustered using the 561 cDNA clone “intrinsic” gene set. B) Close-up view of the “luminal epithelial/ESR1+” gene expression cluster showing expression of GATA3 (red text). Two genes (HNF3FOXA1 and TFF3) that were induced by GATA3 in 293T cells were also present in this previously published tumor data set and are shown in pink.

Supplementary Materials Figure 4. Hierarchical clustering analysis of the 415 genes associated with tumor grade. Significance Analysis of Microarrays was used to identify the genes that correlated with tumor grade (grade I vs II vs III) across the set of 115 tumors analyzed in Sorlie et al. 2003. The genes were clustered using average-linkage clustering analysis, while the order of the samples was maintained according to the breast tumor “intrinsic” gene set analysis of Sorlie et al. 2003 (Figure 1), which identifies 5 distinct subtypes of breast tumors. GATA3 was present on this tumor grade-associated list and is identified in red, while the 10 GATA3-induced genes from the 293T cell experiments that were also present in this list are identified in pink.
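
The gene-only clustering described for Figure 4 can be illustrated with a minimal sketch. It assumes a correlation-based distance (1 minus the Pearson correlation), which the legend does not specify, and the function names and toy matrix are hypothetical. Rows (genes) are reordered by average-linkage clustering while columns (samples) stay in their given order.

```python
from itertools import combinations
from math import sqrt


def pearson_distance(a, b):
    """Return 1 - Pearson correlation (an assumed metric, not stated in the legend)."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    if var_a == 0 or var_b == 0:
        return 1.0  # treat a flat profile as uncorrelated
    return 1.0 - cov / sqrt(var_a * var_b)


def average_linkage_order(rows):
    """Cluster rows agglomeratively and return row indices in merge order."""
    if not rows:
        return []
    clusters = [[i] for i in range(len(rows))]
    dist = {
        (i, j): pearson_distance(rows[i], rows[j])
        for i, j in combinations(range(len(rows)), 2)
    }

    def cluster_distance(c1, c2):
        # Average linkage: mean of all pairwise distances between members.
        total = sum(dist[(min(i, j), max(i, j))] for i in c1 for j in c2)
        return total / (len(c1) * len(c2))

    while len(clusters) > 1:
        a, b = min(
            combinations(range(len(clusters)), 2),
            key=lambda p: cluster_distance(clusters[p[0]], clusters[p[1]]),
        )
        merged = clusters[a] + clusters[b]
        clusters = [c for k, c in enumerate(clusters) if k not in (a, b)]
        clusters.append(merged)
    return clusters[0]


def reorder_genes(matrix):
    """Reorder gene rows only; the sample (column) order is left untouched."""
    return [matrix[i] for i in average_linkage_order(matrix)]


# Toy expression matrix: 4 genes (rows) x 4 samples (columns), made-up values.
toy = [
    [1.0, 2.0, 3.0, 4.0],
    [4.0, 3.0, 2.0, 1.0],
    [2.0, 4.0, 6.0, 8.5],
    [8.0, 6.0, 4.0, 1.5],
]
order = average_linkage_order(toy)
```

In this toy matrix, genes 0 and 2 rise across the samples and genes 1 and 3 fall, so each pair ends up adjacent in the returned order.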

Supplementary Materials Table 1. Microarray and promoter analysis of the ectopic expression of GATA3 in 293T epithelial cells. Wild-type GATA3 and the R367L mutant were expressed in 293T cells and compared to the expression pattern of the empty vector control lines. Significance Analysis of Microarrays was used to identify the genes that changed in expression when the empty vector controls were compared to the WT and R367L mutants together. This analysis identified 74 genes, 73 of which were induced (asparagine synthetase was the lone repressed gene). rVista (Loots et al., 2002) was used to identify GATA3 binding sites in the promoter region of these genes: a 1 signifies the presence of a GATA3 site, a 0 signifies that no site was identified, and a blank signifies that the result is undetermined due to the lack of a clear mouse ortholog (which is needed for rVista analysis). On the right-hand side is the data summary of the EASE analysis (Dennis et al., 2003), which identifies gene annotation categories that are overrepresented relative to chance. Only categories with an EASE score of less than 0.05 are shown.
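
For readers who want to process Table 1 programmatically, the sketch below shows one way to read its conventions: the 1/0/blank coding of the rVista column and the EASE score cutoff of 0.05. The function names and the example category names and scores are hypothetical and are not taken from the table.

```python
def parse_rvista_flag(cell):
    """Map a table cell to True (site present), False (no site), or None (undetermined)."""
    value = cell.strip()
    if value == "1":
        return True
    if value == "0":
        return False
    if value == "":
        return None  # no clear mouse ortholog, so rVista could not be run
    raise ValueError(f"unexpected rVista cell: {cell!r}")


def significant_categories(ease_scores, threshold=0.05):
    """Return category names with an EASE score below the threshold, best first."""
    kept = [name for name, score in ease_scores.items() if score < threshold]
    return sorted(kept, key=lambda name: ease_scores[name])


# Made-up category names and scores, for illustration only.
example_scores = {
    "category A": 0.001,
    "category B": 0.20,
    "category C": 0.049,
    "category D": 0.05,
}
shown = significant_categories(example_scores)
```

The comparison is strict, matching the legend's "less than 0.05", so a category scoring exactly 0.05 is excluded.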
