Statistical Applications in Genetics and Molecular Biology
Editor-in-Chief: Sanguinetti, Guido
A Compendium to Ensure Computational Reproducibility in High-Dimensional Classification Tasks
We demonstrate the concept and an implementation of a compendium for the classification of high-dimensional data from microarray gene expression profiles. A compendium is an interactive document that bundles primary data, statistical processing methods, figures, and derived data together with the textual documentation and conclusions. Interactivity allows the reader to modify and extend these components. We address the following questions: How much does the discriminatory power of a classifier depend on the choice of the algorithm used to identify it? Which alternative classifiers could be used just as well? How robust is the result? The answers to these questions are essential prerequisites for the validation and biological interpretation of the classifiers. We illustrate the approach by examining these questions for a specific breast cancer microarray data set that was first studied by Huang et al. (2003).
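The kind of comparison these questions call for can be sketched as follows. This is a minimal illustration, not the compendium's own code: the data are synthetic stand-ins for expression profiles, and every name in it (`make_data`, `nearest_centroid`, `one_nn`, `cv_accuracy`, `compare`) as well as the choice of two simple classifiers and repeated 5-fold cross-validation are assumptions made for the example.

```python
import random
import statistics


def make_data(n_per_class=20, n_genes=200, n_informative=10, shift=1.5, seed=0):
    """Synthetic two-class data (hypothetical): many genes, few informative."""
    rng = random.Random(seed)
    X, y = [], []
    for label in (0, 1):
        for _ in range(n_per_class):
            row = [rng.gauss(0.0, 1.0) for _ in range(n_genes)]
            if label == 1:
                # Shift only the first n_informative genes in class 1.
                for g in range(n_informative):
                    row[g] += shift
            X.append(row)
            y.append(label)
    return X, y


def sq_dist(a, b):
    """Squared Euclidean distance between two profiles."""
    return sum((u - v) ** 2 for u, v in zip(a, b))


def nearest_centroid(train_X, train_y, x):
    """Predict the class whose mean training profile is closest to x."""
    centroids = {}
    for label in set(train_y):
        rows = [r for r, l in zip(train_X, train_y) if l == label]
        centroids[label] = [statistics.fmean(col) for col in zip(*rows)]
    return min(centroids, key=lambda l: sq_dist(centroids[l], x))


def one_nn(train_X, train_y, x):
    """Predict the class of the single closest training profile."""
    best = min(range(len(train_X)), key=lambda i: sq_dist(train_X[i], x))
    return train_y[best]


def cv_accuracy(classify, X, y, k=5, seed=0):
    """k-fold cross-validated accuracy for one random partition of the samples."""
    idx = list(range(len(X)))
    random.Random(seed).shuffle(idx)
    folds = [idx[i::k] for i in range(k)]
    correct = 0
    for fold in folds:
        held_out = set(fold)
        train_X = [X[i] for i in idx if i not in held_out]
        train_y = [y[i] for i in idx if i not in held_out]
        correct += sum(classify(train_X, train_y, X[i]) == y[i] for i in fold)
    return correct / len(X)


def compare(X, y, repeats=5):
    """Mean, minimum, and maximum accuracy per classifier over repeated partitions."""
    results = {}
    for name, clf in (("nearest_centroid", nearest_centroid), ("one_nn", one_nn)):
        accs = [cv_accuracy(clf, X, y, seed=s) for s in range(repeats)]
        results[name] = (statistics.fmean(accs), min(accs), max(accs))
    return results


if __name__ == "__main__":
    X, y = make_data()
    for name, (mean_acc, lo, hi) in compare(X, y).items():
        print(f"{name}: mean={mean_acc:.2f} range=[{lo:.2f}, {hi:.2f}]")
```

Comparing the mean accuracies speaks to the dependence on the algorithm, while the spread across repeated partitions gives a rough indication of robustness. A real analysis would replace the synthetic data with the actual expression profiles and would typically include gene selection inside each cross-validation fold.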