Supplementary Materials and Methods
Microarray Datasets
Preparation of labeled cRNA samples for Affymetrix DNA microarrays has been described previously (Berchuck et al., 2005).
Four microarray datasets of clinical ovarian samples and one microarray dataset of ovarian cancer cell lines, all of which include clear cell and serous subtypes, were obtained from the Gene Expression Omnibus (GEO) web site at http://www.ncbi.nlm.nih.gov/sites/entre and from the website of the University of Texas M.D. Anderson Cancer Center at http://www.mdanderson.org/departments/expther/bastovcalab/. The GEO datasets were: GSE6008, including 99 samples (OCCC: 8, OS: 41, OM: 13, OE: 37) from the University of Michigan (Hendrix et al., 2006); GSE2109, including 138 samples (OCCC: 11, OS: 87, OM: 13, OE: 27) from the Expression Project for Oncology (expO) (https://expo.intgen.org/geo/); GSE4198, including 44 samples (OCCC: 6, OS: 38) from Stanford University (Schaner et al., 2003); and GSE3001, including 10 ovarian cancer cell lines (OCCC: 5, OS: 5) from Hiroshima University (Komatsu et al., 2006). A dataset including 50 samples (OCCC: 9, OS: 23, OM: 9, OE: 9) was obtained from the University of Texas M.D. Anderson Cancer Center (PMID16144910) (Marquez et al., 2005).
Bioinformatics Methodologies
Identification of differentially expressed genes between OCCC and non-OCCC: We identified differentially expressed genes by comparing OCCC samples with non-clear cell carcinoma (non-OCCC) samples using SAM (Tusher et al., 2001) with the Wilcoxon-type statistic and a q-value threshold of 5%. Statistical analysis of gene expression data was performed with R version 2.8.2 (R Development Core Team, R: A language and environment for statistical computing. R Foundation for Statistical Computing, Vienna, http://www.R-project.org) and Bioconductor version 2.3 (Ihaka and Gentleman, 1996).
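The gene-wise comparison can be illustrated with a minimal Python sketch. It is not the SAM implementation: SAM estimates the false discovery rate by permuting sample labels, whereas this sketch assumes a normal approximation to the Wilcoxon rank-sum statistic (without tie correction) and substitutes Benjamini-Hochberg q-values. The function names `midranks`, `rank_sum_p`, and `bh_qvalues` are hypothetical.

```python
import math
from statistics import NormalDist


def midranks(values):
    """Return 1-based ranks of `values`, averaging ranks over ties."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # mean of the 1-based ranks i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def rank_sum_p(x, y):
    """Two-sided Wilcoxon rank-sum p-value (normal approximation, no tie correction)."""
    n1, n2 = len(x), len(y)
    ranks = midranks(list(x) + list(y))
    w = sum(ranks[:n1])  # rank sum of the first group
    mean = n1 * (n1 + n2 + 1) / 2
    sd = math.sqrt(n1 * n2 * (n1 + n2 + 1) / 12)
    z = (w - mean) / sd
    return 2 * (1 - NormalDist().cdf(abs(z)))


def bh_qvalues(pvals):
    """Benjamini-Hochberg adjusted p-values (a stand-in for SAM's permutation q-values)."""
    m = len(pvals)
    order = sorted(range(m), key=lambda i: pvals[i])
    q = [0.0] * m
    running_min = 1.0
    for rank in range(m, 0, -1):  # walk from the largest p-value down
        i = order[rank - 1]
        running_min = min(running_min, pvals[i] * m / rank)
        q[i] = running_min
    return q
```

In such a sketch, one p-value would be computed per probe set from the OCCC and non-OCCC expression values, and probe sets with a q-value below 0.05 would be retained.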
Hierarchical clustering: Genes selected by SAM were used to conduct average-linkage hierarchical clustering of KyotoOv38, GSE6008, and the external datasets (GSE2109, GSE4198, GSE3001, and PMID16144910) with Cluster version 3.0 (http://rana.lbl.gov/eisen/). Datasets from other microarray platforms were mapped to U133A as follows: for the U133 Plus 2.0 dataset (GSE2109), only the probe sets overlapping with U133A were used; for PMID16144910, the “best match” annotation file provided by Affymetrix (http://www.netaffx.com) (Ramaswamy et al., 2003) was used; for GSE4198, Chip Comparer (http://tenero.duhs.duke.edu/genearray/perl/chip/chipcomparer.pl) was used; and for GSE3001, the files were merged using GeneIDs. For the cDNA spotted array datasets (GSE4198 and GSE3001), only genes with an SD > 0.7 across all samples were used. For all analyses, levels of gene expression were standardized using mean-centering. Heat maps and dendrograms were generated with Java TreeView (http://jtreeview.sourceforge.net/).
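The variability filter and mean-centering steps can be sketched in Python as follows. The sketch assumes that expression values are stored per gene across samples, that centering is applied per gene, and that the SD is the sample standard deviation (n - 1 denominator); none of these details is specified above. The function names are hypothetical.

```python
from statistics import mean, stdev


def filter_by_sd(expression, threshold=0.7):
    """Keep genes whose sample SD across all samples exceeds `threshold`.

    `expression` maps a gene identifier to its list of expression values.
    """
    return {gene: values for gene, values in expression.items()
            if stdev(values) > threshold}


def mean_center(values):
    """Subtract the mean of one gene's values so that they average to zero."""
    m = mean(values)
    return [v - m for v in values]


def preprocess(expression, threshold=0.7):
    """Apply the SD filter, then mean-center each remaining gene."""
    kept = filter_by_sd(expression, threshold)
    return {gene: mean_center(values) for gene, values in kept.items()}
```

The centered matrix would then be passed to the clustering software; the clustering itself is not reproduced here.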
Gene Set Enrichment Analysis (GSEA): GSEA was performed following the developer’s protocol (Subramanian et al., 2005) (http://www.broad.mit.edu/gsea/) to determine whether the up-regulated or down-regulated gene sets in the OCCC signature were enriched in OCCC samples or non-OCCC samples in the external datasets (GSE2109, GSE4198, GSE3001, and PMID16144910).
Allez analysis: The biological characteristics of the OCCC signature were evaluated by the enrichment of MSigDB gene sets (v2.5 updated April 7 2008, Subramanian et al., 2005) using the R package allez 1.0 (Newton et al., 2007). Briefly, for each gene set, the proportion of the annotated genes in the OCCC signature was compared to that for all probe set genes. A gene set was considered significantly enriched if the nominal p-value was less than 0.05 (Pyeon et al., 2007).
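The idea of comparing the proportion of set members in the signature with that among all genes can be illustrated with a classical over-representation test. The Python sketch below is not the allez procedure, which uses random-set scoring (Newton et al., 2007); it assumes a hypergeometric upper-tail test as a simpler stand-in, and the function name and argument names are hypothetical.

```python
from math import comb


def hypergeom_upper_tail(k, set_size, signature_size, total_genes):
    """P(X >= k) for X = number of gene-set members among the signature genes.

    Assumes the signature is a random draw of `signature_size` genes from
    `total_genes`, of which `set_size` belong to the gene set.
    """
    denom = comb(total_genes, signature_size)
    upper = min(set_size, signature_size)
    numer = sum(
        comb(set_size, i) * comb(total_genes - set_size, signature_size - i)
        for i in range(k, upper + 1)
    )
    return numer / denom
```

A small p-value from such a test indicates that the gene set is represented in the signature more often than expected from its proportion among all probe set genes.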
Pathway analysis: Ingenuity Pathway Analysis (IPA; Ingenuity Systems®, http://www.ingenuity.com) is a commercial application that calculates the association between a particular gene set and known pathways. Pathways activated in OCCC were searched using IPA with the filters “Human” and “Relax Relation (including Indirect)”.
Evaluation of the effect of oxidative stress on the OCCC signature: To evaluate enrichment of the OCCC signature among genes induced by oxidative stress, we generated an enrichment score as defined by GSEA procedures (Sweet-Cordero et al., 2005). The rank-ordered gene list of N total genes, $L = \{g_1, \ldots, g_N\}$, was based on the log fold changes, $r_j$, between conditions “with” and “without” oxidative stress. Up to each position $i$ in the list, the fraction of genes in the OCCC signature (set S of $N_H$ genes), weighted by absolute log fold changes, $P_{hit}$, and the fraction of genes not in S, $P_{miss}$, were evaluated:
$$P_{hit}(S, i) = \sum_{g_j \in S,\ j \le i} \frac{|r_j|}{N_R}, \qquad P_{miss}(S, i) = \sum_{g_j \notin S,\ j \le i} \frac{1}{N - N_H}, \qquad \text{where } N_R = \sum_{g_j \in S} |r_j|.$$
The running enrichment score (RES) is the difference between the empirical distribution functions, $RES(i) = P_{hit}(S, i) - P_{miss}(S, i)$. The magnitude of enrichment was assessed by the maximum deviation of RES from zero.
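The running enrichment score can be sketched in Python as below. The sketch assumes that genes are supplied already ordered by log fold change and that hits are weighted by the absolute log fold change with an exponent of 1, following the weighted GSEA statistic of Subramanian et al. (2005); the function and argument names are hypothetical.

```python
def running_enrichment_score(ranked_genes, log_fold_changes, signature):
    """Return the running enrichment score at each position of a ranked list.

    ranked_genes: gene identifiers, ordered by log fold change.
    log_fold_changes: log fold change for each gene, in the same order.
    signature: set of gene identifiers (the gene set S).
    """
    hits = [gene in signature for gene in ranked_genes]
    n_total = len(ranked_genes)
    n_hits = sum(hits)
    # Normalizing constant: total absolute log fold change over the hits.
    weight_sum = sum(abs(r) for r, hit in zip(log_fold_changes, hits) if hit)
    if n_hits == 0 or n_hits == n_total or weight_sum == 0:
        raise ValueError("signature must cover some, but not all, ranked genes")

    p_hit = 0.0
    p_miss = 0.0
    res = []
    for r, hit in zip(log_fold_changes, hits):
        if hit:
            p_hit += abs(r) / weight_sum      # weighted step up for a hit
        else:
            p_miss += 1.0 / (n_total - n_hits)  # uniform step for a miss
        res.append(p_hit - p_miss)
    return res


def enrichment_magnitude(res):
    """Return the value of RES with the maximum deviation from zero."""
    return max(res, key=abs)
```

By construction, both cumulative fractions reach 1 at the end of the list, so the running score returns to zero at the last position.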
Statistical Analysis
The Mann-Whitney U test with Bonferroni correction was used to compare the expression levels in OCCC with those in non-OCCC as detected by RT-PCR analyses. Fisher’s exact test was used in the analysis of contingency tables. The correlation between expression levels in the KyotoOv38 microarray dataset and those measured by semi-quantitative RT-PCR or qRT-PCR was evaluated using Spearman’s rank correlation. A p-value less than 0.05 was considered statistically significant.
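Two of these procedures can be illustrated with a short Python sketch: Spearman’s correlation computed as the Pearson correlation of midranks, and the Bonferroni adjustment of a family of p-values. The function names are hypothetical, and the sketch does not compute a p-value for the correlation.

```python
import math


def midranks(values):
    """Return 1-based ranks of `values`, averaging ranks over ties."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # mean of the 1-based ranks i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def spearman_rho(x, y):
    """Spearman's rank correlation: Pearson correlation of the midranks."""
    rx, ry = midranks(x), midranks(y)
    mx = sum(rx) / len(rx)
    my = sum(ry) / len(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    var_x = sum((a - mx) ** 2 for a in rx)
    var_y = sum((b - my) ** 2 for b in ry)
    return cov / math.sqrt(var_x * var_y)


def bonferroni(pvals):
    """Multiply each p-value by the number of tests, capped at 1."""
    m = len(pvals)
    return [min(1.0, p * m) for p in pvals]
```

Under the Bonferroni correction, an adjusted p-value below 0.05 corresponds to an unadjusted p-value below 0.05 divided by the number of comparisons.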
Supplementary References
Berchuck A, Iversen ES, Lancaster JM, Pittman J, Luo J, Lee P et al (2005). Patterns of gene expression that characterize long-term survival in advanced stage serous ovarian cancers. Clin Cancer Res 11: 3686-96.
Hendrix ND, Wu R, Kuick R, Schwartz DR, Fearon ER, Cho KR (2006). Fibroblast growth factor 9 has oncogenic activity and is a downstream target of Wnt signaling in ovarian endometrioid adenocarcinomas. Cancer Res 66: 1354-62.
Ihaka R, Gentleman R (1996). R: A language for data analysis and graphics. J Comput Graph Stat 5: 299-314.
Komatsu M, Hiyama K, Tanimoto K, Yunokawa M, Otani K, Ohtaki M et al (2006). Prediction of individual response to platinum/paclitaxel combination using novel marker genes in ovarian cancers. Mol Cancer Ther 5: 767-75.
Marquez RT, Baggerly KA, Patterson AP, Liu J, Broaddus R, Frumovitz M et al (2005). Patterns of gene expression in different histotypes of epithelial ovarian cancer correlate with those in normal fallopian tube, endometrium, and colon. Clin Cancer Res 11: 6116-26.
Newton MA, Quintana FA, den Boon JA, Sengupta S, Ahlquist P (2007). Random-set methods identify distinct aspects of the enrichment signal in gene-set analysis. Ann Appl Stat 1: 85-106.
Pyeon D, Newton MA, Lambert PF, den Boon JA, Sengupta S, Marsit CJ et al (2007). Fundamental differences in cell cycle deregulation in human papillomavirus-positive and human papillomavirus-negative head/neck and cervical cancers. Cancer Res 67: 4605-19.
Ramaswamy S, Ross KN, Lander ES, Golub TR (2003). A molecular signature of metastasis in primary solid tumors. Nat Genet 33: 49-54.
Schaner ME, Ross DT, Ciaravino G, Sorlie T, Troyanskaya O, Diehn M et al (2003). Gene expression patterns in ovarian carcinomas. Mol Biol Cell 14: 4376-86.
Subramanian A, Tamayo P, Mootha VK, Mukherjee S, Ebert BL, Gillette MA et al (2005). Gene set enrichment analysis: a knowledge-based approach for interpreting genome-wide expression profiles. Proc Natl Acad Sci U S A 102: 15545-50.
Sweet-Cordero A, Mukherjee S, Subramanian A, You H, Roix JJ, Ladd-Acosta C et al (2005). An oncogenic KRAS2 expression signature identified by cross-species gene-expression analysis. Nat Genet 37: 48-55.
Tusher VG, Tibshirani R, Chu G (2001). Significance analysis of microarrays applied to the ionizing radiation response. Proc Natl Acad Sci U S A 98: 5116-21.