We propose a robust model for discovering differentially expressed genes that directly incorporates biological significance, i.e., effect size. Using the so-called c-fold rule, we transform gene expression data into a nominal observed random variable with three categories: below a fixed lower threshold, above a fixed upper threshold, or between the two thresholds. The three levels may arise from three different distributions, corresponding to underexpressed, non-differentially expressed, and overexpressed genes. This leads to a statistical model for a 3-component mixture of trinomial distributions with suitable constraints on the parameter space. To obtain the maximum likelihood estimates, we show how to implement a constrained EM algorithm with a latent label indicating the component to which each gene belongs. Different strategies for statistically significant gene discovery are discussed and compared. We illustrate the method with a small simulation study and a real dataset on multiple sclerosis.
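The two steps described in the abstract, discretization by the c-fold rule and EM estimation of a 3-component trinomial mixture, can be sketched as follows. This is a minimal illustration and not the authors' implementation: it assumes the input is a list of per-gene log2 expression ratios with symmetric thresholds at ±log2(c), and it runs a plain, unconstrained EM, because the abstract does not state the parameter constraints used in the paper. The function names `c_fold_counts` and `em_trinomial_mixture` and the starting values are hypothetical.

```python
import math


def c_fold_counts(log_ratios, c=2.0):
    """Per gene, count replicates below -log2(c), within, and above +log2(c).

    Assumption: rows are genes, entries are log2 expression ratios.
    """
    t = math.log2(c)
    counts = []
    for row in log_ratios:
        below = sum(x < -t for x in row)
        above = sum(x > t for x in row)
        counts.append((below, len(row) - below - above, above))
    return counts


def em_trinomial_mixture(counts, n_iter=100, eps=1e-9):
    """Unconstrained EM for a 3-component mixture of trinomials.

    Components 0, 1, 2 are meant to play the roles of under expressed,
    not differential, and over expressed genes (hypothetical start values).
    """
    theta = [[0.6, 0.3, 0.1], [0.1, 0.8, 0.1], [0.1, 0.3, 0.6]]
    pi = [0.1, 0.8, 0.1]
    resp = []
    for _ in range(n_iter):
        # E-step: posterior probability of each latent label per gene.
        # The multinomial coefficient is the same for every component
        # and cancels, so it is omitted.
        resp = []
        for n in counts:
            logw = [
                math.log(pi[k] + eps)
                + sum(n[j] * math.log(theta[k][j] + eps) for j in range(3))
                for k in range(3)
            ]
            m = max(logw)
            w = [math.exp(v - m) for v in logw]
            s = sum(w)
            resp.append([v / s for v in w])
        # M-step: update mixing weights and category probabilities.
        pi = [sum(r[k] for r in resp) / len(counts) for k in range(3)]
        for k in range(3):
            num = [
                sum(r[k] * n[j] for r, n in zip(resp, counts)) + eps
                for j in range(3)
            ]
            tot = sum(num)
            theta[k] = [v / tot for v in num]
    return pi, theta, resp
```

Under these assumptions, `resp[g][k]` is the estimated posterior probability that gene `g` belongs to component `k`, which is the kind of quantity a discovery rule could threshold. A constrained version would restrict `theta` in the M-step according to the paper's constraints.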

A Three Component Latent Class Model for Robust Semiparametric Gene Discovery
Sapienza - Università di Roma
Citation Information: Statistical Applications in Genetics and Molecular Biology. Volume 10, Issue 1, Pages 1–19, ISSN (Online) 1544-6115, DOI: 10.2202/1544-6115.1565, January 2011
Published Online: 2011-01-21
Keywords: differentially expressed genes; effect size; microarray data; mixture model

















