Statistical Applications in Genetics and Molecular Biology

Editor-in-Chief: Stumpf, Michael P.H.

Volume 11, Issue 3 (Apr 2012)


The practical effect of batch on genomic prediction

Hilary S. Parker, Johns Hopkins Bloomberg School of Public Health
Jeffrey T. Leek, Johns Hopkins Bloomberg School of Public Health
Published Online: 2012-04-16 | DOI: https://doi.org/10.1515/1544-6115.1766

Measurements from microarrays and other high-throughput technologies are susceptible to non-biological artifacts like batch effects. It is known that batch effects can alter or obscure the set of significant results and biological conclusions in high-throughput studies. Here we examine the impact of batch effects on predictors built from genomic technologies. To investigate batch effects, we collected publicly available gene expression measurements with known outcomes, and estimated batches using date. Using these data we show (1) the impact of batch effects on prediction depends on the correlation between outcome and batch in the training data, and (2) removing expression measurements most affected by batch before building predictors may improve the accuracy of those predictors. These results suggest that (1) training sets should be designed to minimize correlation between batches and outcome, and (2) methods for identifying batch-affected probes should be developed to improve prediction results for studies with high correlation between batches and outcome.

Keywords: batch effects; prediction; microarrays; reproducibility; research design
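
The two findings in the abstract can be illustrated with a small simulation. The sketch below is not the authors' analysis: it uses synthetic data rather than the public gene expression sets, a simple nearest-centroid classifier chosen only for brevity, and it assumes the batch-affected genes are known in advance (the abstract calls for methods to identify them, which this sketch does not attempt). All function names, effect sizes, and the `confounding` parameter are hypothetical choices made for illustration.

```python
import numpy as np

N_GENES, N_SIGNAL, N_BATCH = 500, 20, 20  # assumed sizes, illustration only


def simulate(n, confounding, rng):
    """Simulate expression with a true outcome signal and a batch artifact.

    confounding = P(batch label equals outcome label): 0.5 means batch and
    outcome are independent, 1.0 means they are perfectly confounded.
    """
    y = rng.integers(0, 2, size=n)
    same = rng.random(n) < confounding
    batch = np.where(same, y, 1 - y)
    x = rng.normal(size=(n, N_GENES))
    # First N_SIGNAL genes carry the biological signal.
    x[:, :N_SIGNAL] += 1.0 * y[:, None]
    # Next N_BATCH genes are shifted by batch, not by outcome.
    x[:, N_SIGNAL:N_SIGNAL + N_BATCH] += 2.0 * batch[:, None]
    return x, y


def fit_centroid(x, y, k):
    """Select the k genes with the largest standardized class difference
    and store the class centroids on those genes."""
    m0, m1 = x[y == 0].mean(axis=0), x[y == 1].mean(axis=0)
    score = np.abs(m1 - m0) / x.std(axis=0)
    genes = np.argsort(score)[-k:]
    return genes, m0[genes], m1[genes]


def predict(model, x):
    """Assign each sample to the nearer class centroid."""
    genes, m0, m1 = model
    d0 = ((x[:, genes] - m0) ** 2).sum(axis=1)
    d1 = ((x[:, genes] - m1) ** 2).sum(axis=1)
    return (d1 < d0).astype(int)


def accuracy(confounding, drop_batch_genes=False, seed=0):
    """Train with the given batch/outcome confounding, then test on data
    where batch is independent of outcome."""
    rng = np.random.default_rng(seed)
    x_tr, y_tr = simulate(200, confounding, rng)
    x_te, y_te = simulate(2000, 0.5, rng)
    if drop_batch_genes:
        # Assumes the batch-affected genes are already known.
        keep = np.ones(N_GENES, dtype=bool)
        keep[N_SIGNAL:N_SIGNAL + N_BATCH] = False
        x_tr, x_te = x_tr[:, keep], x_te[:, keep]
    model = fit_centroid(x_tr, y_tr, k=20)
    return float((predict(model, x_te) == y_te).mean())


if __name__ == "__main__":
    print("independent batch:      ", accuracy(0.5))
    print("confounded batch:       ", accuracy(1.0))
    print("confounded, genes removed:", accuracy(1.0, drop_batch_genes=True))
```

Under these assumptions, the predictor trained on perfectly confounded data tends to select the batch-affected genes and to perform near chance on the test set, while removing those genes before training tends to restore accuracy. The sizes of these effects depend entirely on the simulated effect sizes and say nothing quantitative about real microarray data.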

About the article

Citation Information: Statistical Applications in Genetics and Molecular Biology, ISSN (Online) 1544-6115, DOI: https://doi.org/10.1515/1544-6115.1766.
