Statistical Applications in Genetics and Molecular Biology
Editor-in-Chief: Sanguinetti, Guido
A common experimental strategy using microarrays is to develop a signature of genes that respond to some treatment in a model system, and then to ask whether the same genes respond in an analogous way in a more natural and uncontrolled environment. In statistical terms, the question is whether genes score similarly on some statistical test in two independent data sets. Approaches to this problem that ignore the gene-gene correlations common to all microarray data sets are known to overstate statistical confidence levels. Permutation approaches have been proposed to give more accurate confidence levels, but they cannot be applied when sample sizes are small. Here we argue that the product moment correlation between test statistics in the two experiments is an ideal measure for summarizing concordance between the experiments, because confidence levels that account for intergene correlations depend on only a single number: the average squared correlation between gene pairs in the data set. The resulting null standard deviation is shown to vary by less than a factor of two over six distinct experimental data sets, suggesting that a universal constant may be used for this quantity. We show how a hidden assumption of the permutation approach may lead to incorrect p-values, while the analytic approach presented here is shown to be insensitive to this assumption.
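The idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes that the per-gene test statistics are roughly standardized, that both experiments share the same gene-gene correlation structure, and that the null variance of the product moment correlation is approximately 1/n + (1 - 1/n) * (average squared correlation over distinct gene pairs). The function names `mean_squared_offdiag_corr` and `concordance_z` are hypothetical, and the variance formula is an assumption consistent with the abstract's description rather than a quotation from the paper.

```python
import numpy as np


def mean_squared_offdiag_corr(expr):
    """Average squared Pearson correlation over distinct gene pairs.

    expr: 2-D array, rows = genes, columns = samples.
    Note: this naive estimate is biased upward when there are few
    samples; no bias correction is attempted in this sketch.
    """
    c = np.corrcoef(np.asarray(expr, dtype=float))
    n = c.shape[0]
    if n < 2:
        raise ValueError("need at least two genes")
    sq = c ** 2
    # Sum of all squared entries minus the diagonal, averaged over
    # the n * (n - 1) ordered off-diagonal pairs.
    return (sq.sum() - np.trace(sq)) / (n * (n - 1))


def concordance_z(stats_a, stats_b, mean_sq_corr):
    """Correlation between two vectors of per-gene test statistics,
    with an assumed null standard deviation and the resulting z-score.

    mean_sq_corr: average squared gene-gene correlation (assumed to
    apply to both experiments).
    """
    a = np.asarray(stats_a, dtype=float)
    b = np.asarray(stats_b, dtype=float)
    if a.ndim != 1 or a.shape != b.shape:
        raise ValueError("inputs must be 1-D arrays of equal length")
    n = a.size
    r = np.corrcoef(a, b)[0, 1]
    # Assumed null variance: 1/n from the diagonal terms plus the
    # contribution of correlated gene pairs.
    null_sd = np.sqrt(1.0 / n + (1.0 - 1.0 / n) * mean_sq_corr)
    return r, null_sd, r / null_sd
```

With `mean_sq_corr = 0` the null standard deviation reduces to the familiar 1/sqrt(n) for independent genes; any positive average squared correlation widens the null distribution, which is why ignoring intergene correlations overstates confidence.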