Statistical Applications in Genetics and Molecular Biology

Editor-in-Chief: Stumpf, Michael P.H.

Volume 6, Issue 1

Super Learner

Mark J. van der Laan / Eric C Polley / Alan E. Hubbard
Published Online: 2007-09-16 | DOI: https://doi.org/10.2202/1544-6115.1309

When trying to learn a model for the prediction of an outcome given a set of covariates, a statistician has many estimation procedures in their toolbox. A few examples of these candidate learners are least squares, least angle regression, random forests, and spline regression. Previous articles (van der Laan and Dudoit (2003); van der Laan et al. (2006); Sinisi et al. (2007)) theoretically validated the use of cross-validation to select an optimal learner among many candidate learners. Motivated by this use of cross-validation, we propose a new prediction method that creates a weighted combination of many candidate learners to build the super learner. This article proposes a fast algorithm for constructing a super learner in prediction, which uses V-fold cross-validation to select the weights that combine an initial set of candidate learners. In addition, this paper contains a practical demonstration of the adaptivity of this so-called super learner to various true data-generating distributions. This approach to constructing a super learner generalizes to any parameter that can be defined as a minimizer of a loss function.
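The idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes squared-error loss, three toy candidate learners (a mean predictor and polynomial fits of degree 1 and 2), and unconstrained least squares for the weights. The function names `fit_mean`, `fit_poly`, and `super_learner` are hypothetical. Each candidate is fit on the training folds, its predictions on the held-out fold are collected, and the outcome is then regressed on those cross-validated predictions to obtain the combination weights.

```python
import numpy as np

def fit_mean(X, y):
    """Candidate learner: always predict the training mean."""
    m = y.mean()
    return lambda Xn: np.full(len(Xn), m)

def fit_poly(degree):
    """Candidate learner factory: polynomial least squares on the first covariate."""
    def fit(X, y):
        coef = np.polyfit(X[:, 0], y, degree)
        return lambda Xn: np.polyval(coef, Xn[:, 0])
    return fit

def super_learner(X, y, learners, V=5, seed=0):
    """Combine candidate learners with weights chosen by V-fold cross-validation.

    Assumption: weights minimize squared error of the combined cross-validated
    predictions, with no constraint on sign or sum.
    """
    n = len(y)
    rng = np.random.default_rng(seed)
    folds = np.array_split(rng.permutation(n), V)

    # Z[i, k] = prediction for observation i from learner k,
    # fit without the fold that contains observation i.
    Z = np.empty((n, len(learners)))
    for val_idx in folds:
        train = np.ones(n, dtype=bool)
        train[val_idx] = False
        for k, fit in enumerate(learners):
            Z[val_idx, k] = fit(X[train], y[train])(X[val_idx])

    # Regress the outcome on the cross-validated predictions.
    weights, *_ = np.linalg.lstsq(Z, y, rcond=None)

    # Refit every candidate on all data and combine with the selected weights.
    fitted = [fit(X, y) for fit in learners]

    def predict(Xn):
        return np.column_stack([f(Xn) for f in fitted]) @ weights

    return predict, weights

# Simulated data with a quadratic truth (purely illustrative).
rng = np.random.default_rng(1)
X = rng.uniform(-2, 2, size=(200, 1))
y = X[:, 0] ** 2 + rng.normal(scale=0.1, size=200)

predict, weights = super_learner(X, y, [fit_mean, fit_poly(1), fit_poly(2)])
```

On data like this the quadratic candidate should receive most of the weight, which is the kind of adaptivity to the data-generating distribution the abstract describes. Practical implementations often constrain the weights to be non-negative or to sum to one; that choice is left out here to keep the sketch short.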

Keywords: cross-validation; loss-based estimation; machine learning; prediction

Citation Information: Statistical Applications in Genetics and Molecular Biology, Volume 6, Issue 1, ISSN (Online) 1544-6115, ISSN (Print) 2194-6302, DOI: https://doi.org/10.2202/1544-6115.1309.

©2011 Walter de Gruyter GmbH & Co. KG, Berlin/Boston.
