This is supplementary material to the paper "ProfCom:
a web tool for profiling the complex functionality of gene groups identified
from high-throughput data".
Here we present several examples of data analyses by ProfCom. We bring together several independent studies that performed gene expression analyses to identify over- or under-expressed genes in different cancer types. For each study we collected the set of differentially expressed genes that was originally identified. Hereafter we refer to each of these sets as set A. The set of all human genes was used as the reference set (referred to as set B). For each case we analyzed the enrichment of GO terms, and of “complex functions” constructed from GO terms, in set A. To compare ProfCom with related tools, we additionally examined the examples with GENECODIS [1].
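As an illustration of how a “complex function” can be read, the sketch below treats each GO term as a set of annotated genes and applies EXCLUDING as a set difference. This is only one plausible reading of the notation used in the examples, not ProfCom's implementation; the gene identifiers, the toy annotations, and the helper name `complex_function` are all hypothetical.

```python
# Toy GO annotations: every gene ID and membership here is made up for illustration.
annotations = {
    "cell adhesion": {"g1", "g2", "g3", "g4", "g5"},
    "homophilic cell adhesion": {"g4"},
    "structural molecule activity": {"g5", "g6"},
}


def complex_function(base, excluded, annotations):
    """Return genes annotated with `base` and with none of the `excluded` terms.

    Hypothetical helper: models "base EXCLUDING t1 EXCLUDING t2" as set difference.
    """
    genes = set(annotations[base])
    for term in excluded:
        genes -= annotations.get(term, set())
    return genes


result = complex_function(
    "cell adhesion",
    ["homophilic cell adhesion", "structural molecule activity"],
    annotations,
)
print(sorted(result))  # ['g1', 'g2', 'g3']
```

Applying the same function to the genes of set A and to those of set B gives the two counts that the examples below compare.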
Example 1. Gene Expression in Ovarian Cancer Reflects Both Morphology and Biological Behavior, Distinguishing Clear Cell from Other Poor-Prognosis Ovarian Carcinomas
Gene expression in 113 ovarian epithelial tumors was analyzed using oligonucleotide microarrays [2]. In total, 73 genes were identified that were expressed 2- to 29-fold higher in clear cell ovarian carcinoma than in each of the other ovarian carcinoma types. Standard functional profiling of these genes reveals statistically significant enrichment of several GO terms.
One of the enriched terms was “cell adhesion”. In set A of 73 up-regulated genes, 10 genes belonged to this category, while the term classifies 390 genes in the whole human genome. This group of genes may be of particular interest, as different studies have shown that cell adhesion molecules can play an important role in the development of epithelial ovarian cancer [3;4]. By analyzing the relations between GO terms in set A, ProfCom classifies these genes more specifically. The complex function “cell adhesion EXCLUDING homophilic cell adhesion EXCLUDING structural molecule activity” inferred by ProfCom classifies only 245 genes in the whole genome (compared with 390) and still all 10 genes in set A. The resolved complex function is therefore more specific (the same selectivity, with the number of genes matched genome-wide reduced about 1.6-fold, from 390 to 245). In addition, this information may be useful for analyzing the molecular mechanisms of the cancer.
GENECODIS detects the single GO term “cell adhesion” as being overrepresented. However, it provides no further evidence that could help to understand the role of the up-regulated “cell adhesion” genes.
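The effect of the narrower category on enrichment can be sketched with a one-sided hypergeometric test, a common choice for this kind of over-representation analysis. The text does not state which test ProfCom uses or the size of set B, so both the test and the genome size `N = 20000` below are assumptions made purely for illustration; only the counts 73, 10, 390 and 245 come from the example.

```python
from math import comb


def hypergeom_sf(k, N, K, n):
    """P(X >= k) when n genes are drawn without replacement from N genes,
    K of which carry the annotation (one-sided hypergeometric tail)."""
    upper = min(K, n)
    total = comb(N, n)
    return sum(comb(K, i) * comb(N - K, n - i) for i in range(k, upper + 1)) / total


N = 20000  # ASSUMED size of set B (all human genes); not given in the text
n = 73     # size of set A
k = 10     # set A genes matched by the term and by the complex function

p_single = hypergeom_sf(k, N, 390, n)   # "cell adhesion"
p_complex = hypergeom_sf(k, N, 245, n)  # complex function inferred by ProfCom

print(f"single term:      p = {p_single:.3e}")
print(f"complex function: p = {p_complex:.3e}")
```

With the same 10 genes matched in set A, the category that covers fewer genes genome-wide yields the smaller tail probability, whatever genome size is assumed.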
Example 2. Comprehensive Gene Expression Analysis of Prostate Cancer Reveals Distinct Transcriptional Programs Associated with Metastatic Disease
This study [5] performed a comprehensive gene expression analysis of prostate cancer using oligonucleotide arrays with 63,175 probe sets to identify genes with strong differential expression between non-recurrent primary prostate cancers and metastatic prostate cancers. Among the highly ranked over-expressed genes (73 genes selected on the basis of the t test statistic), the authors found by manual analysis genes that participate in cell cycle regulation, DNA replication, and DNA repair. Standard functional profiling of these genes reveals statistically significant enrichment of several GO terms.
For example, 10 of the 73 over-expressed genes were annotated with the term “regulation of progression through cell cycle”. This category may be relevant for understanding the transcriptional programs associated with metastatic disease. According to the GO annotation, the term “regulation of progression through cell cycle” covers approximately 160 genes in the human genome. It is clear that only a fraction of the genes classified by this term may be involved in the molecular model of the cancer.
ProfCom classifies these genes by the complex function “regulation of progression through cell cycle EXCLUDING growth factor activity EXCLUDING transcription”, which is more specific. Only 106 genes from the whole human genome are classified by this complex function.
GENECODIS detects the single GO term “regulation of progression through cell cycle” as being overrepresented. However, it provides no further evidence that could help to understand the role of the up-regulated “regulation of progression through cell cycle” genes.
Example 3. Patterns of Gene Expression in Different Histotypes of Epithelial Ovarian Cancer Correlate with Those in Normal Fallopian Tube, Endometrium, and Colon
Microarray analysis was performed to compare gene expression in 50 ovarian cancer specimens, including all four histotypes, with gene expression in 5 pools of normal ovarian surface epithelial cells [6]. The data were analyzed to determine whether changes in gene expression correlated with histotype, grade, or stage.
Several sets of genes that show the greatest ability to differentiate between the considered cancer subtypes were originally identified. For example, 47 selected genes were at least 2-fold differentially expressed in mucinous ovarian cancers compared with the other histotypes and with normal ovarian surface epithelial cells. Standard functional profiling reveals several significantly overrepresented GO terms. It is widely known that the processes of Ca++ homeostasis are often disordered in many cancer types [7]. Therefore, the significant enrichment of the GO term “calcium ion binding” among the differentially expressed genes is of particular interest. Eight genes from set A (MRC1, EFHD2, PLS1, ANXA10, LDLR, MMP1, S100P, THBS2) are annotated with this term. On the other hand, 780 genes in the whole human genome are classified as “calcium ion binding”. Using the conventional GO term vocabulary, the standard profiling procedure cannot supply any evidence that would discriminate these 8 genes from the other human “calcium ion binding” genes and thus clarify the molecular mechanism involved.
The complex function “calcium ion binding EXCLUDING membrane EXCLUDING endoplasmic reticulum” inferred by ProfCom is more specific than the single GO term “calcium ion binding”: only 363 genes in the human genome (compared with 780) are classified by this complex function. This is not only better from a statistical viewpoint (equal selectivity, with the number of genes matched genome-wide reduced about 2.1-fold, from 780 to 363) but also supplies valuable biological information that can help in drawing conclusions about the molecular mechanisms involved in the considered cancer type.
Analysis by GENECODIS also detects the single GO term “calcium ion binding” as being overrepresented. However, it provides no further evidence that could help to differentiate the calcium ion binding genes in set A from those in set B.
Example 4. Exploration of Global Gene Expression Patterns in Pancreatic Adenocarcinoma Using cDNA Microarrays
This study [8] used cDNA microarrays to analyze global gene expression patterns in 14 pancreatic cancer cell lines, 17 resected infiltrating pancreatic cancer tissues, and 5 samples of normal pancreas, and identified 125 differentially expressed genes. Standard functional profiling of these genes reveals statistically significant enrichment of several GO terms.
For example, the single GO term “cell adhesion” was found to be enriched. A group of 9 genes from this category is present in set A. According to the GO annotation, 391 genes from the whole genome are classified by this functional category. As already mentioned, several studies have presented biochemical and cell biological evidence [3;4] suggesting that genes from this category may have important implications for cancer development. However, it is clear that only a fraction of the cell adhesion genes may be involved in the molecular model of the cancer in the analyzed case.
ProfCom inferred complex functions that classify these genes more specifically. The complex function “cell adhesion EXCLUDING homophilic cell adhesion EXCLUDING cytoplasm” classifies 8 genes in set A and only 206 genes in set B. The resolved complex function is more specific (almost equal selectivity, with 8 of the 9 set A genes retained, and the number of genes matched in set B reduced about 1.9-fold, from 391 to 206) and is more informative for analyzing the molecular mechanisms of the cancer.
GENECODIS also reported the single GO term “integral to plasma membrane” as being overrepresented. However, it was unable to identify the complex function, because the function is constructed with the “EXCLUDING” Boolean operation, which is beyond the scope of the GENECODIS classification models.
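The comparison between the single term and the complex function in this example can be reproduced with a few lines of arithmetic. The sketch below assumes two working definitions that the text does not spell out: “selectivity” is taken as the share of the set A hits that the complex function retains, and the gain in “specificity” as the fold reduction in the number of genes matched in set B. The function name `compare` is hypothetical.

```python
def compare(single_a, single_b, complex_a, complex_b):
    """Return (retained, shrink) under the ASSUMED definitions:
    retained = share of set A hits kept by the complex function,
    shrink   = fold reduction in the number of genes matched in set B."""
    retained = complex_a / single_a
    shrink = single_b / complex_b
    return retained, shrink


# Counts from Example 4: "cell adhesion" versus the inferred complex function.
retained, shrink = compare(single_a=9, single_b=391, complex_a=8, complex_b=206)

print(f"set A hits retained:     {retained:.2f}")  # 0.89
print(f"fold reduction in set B: {shrink:.2f}")    # 1.90
```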
1. Carmona-Saez P, Chagoyen M, Tirado F, Carazo JM, Pascual-Montano A: GENECODIS: a web-based tool for finding significant concurrent annotations in gene lists. Genome Biol 2007;8:R3.
2. Schwartz DR, Kardia SL, Shedden KA, Kuick R, Michailidis G, Taylor JM, Misek DE, Wu R, Zhai Y, Darrah DM, Reed H, Ellenson LH, Giordano TJ, Fearon ER, Hanash SM, Cho KR: Gene expression in ovarian cancer reflects both morphology and biological behavior, distinguishing clear cell from other poor-prognosis ovarian carcinomas. Cancer Res 8-15-2002;62:4722-4729.
3. Hong G, Baudhuin LM, Xu Y: Sphingosine-1-phosphate modulates growth and adhesion of ovarian cancer cells. FEBS Lett 11-5-1999;460:513-518.
4. Spizzo G, Went P, Dirnhofer S, Obrist P, Moch H, Baeuerle PA, Mueller-Holzner E, Marth C, Gastl G, Zeimet AG: Overexpression of epithelial cell adhesion molecule (Ep-CAM) is an independent prognostic marker for reduced survival of patients with epithelial ovarian cancer. Gynecol Oncol 2006;103:483-488.
5. LaTulippe E, Satagopan J, Smith A, Scher H, Scardino P, Reuter V, Gerald WL: Comprehensive gene expression analysis of prostate cancer reveals distinct transcriptional programs associated with metastatic disease. Cancer Res 8-1-2002;62:4499-4506.
6. Marquez RT, Baggerly KA, Patterson AP, Liu J, Broaddus R, Frumovitz M, Atkinson EN, Smith DI, Hartmann L, Fishman D, Berchuck A, Whitaker R, Gershenson DM, Mills GB, Bast RC, Jr., Lu KH: Patterns of gene expression in different histotypes of epithelial ovarian cancer correlate with those in normal fallopian tube, endometrium, and colon. Clin Cancer Res 9-1-2005;11:6116-6126.
7. Revankar CM, Advani SH, Naik NR: Altered Ca2+ homeostasis in polymorphonuclear leukocytes from chronic myeloid leukaemia patients. Mol Cancer 2006;5:65.
8. Iacobuzio-Donahue CA, Maitra A, Olsen M, Lowe AW, van Heek NT, Rosty C, Walter K, Sato N, Parker A, Ashfaq R, Jaffee E, Ryu B, Jones J, Eshleman JR, Yeo CJ, Cameron JL, Kern SE, Hruban RH, Brown PO, Goggins M: Exploration of global gene expression patterns in pancreatic adenocarcinoma using cDNA microarrays. Am J Pathol 2003;162:1151-1162.