Jump to ContentJump to Main Navigation
Show Summary Details
More options …

Statistical Applications in Genetics and Molecular Biology

Editor-in-Chief: Stumpf, Michael P.H.

6 Issues per year

IMPACT FACTOR 2016: 0.646
5-year IMPACT FACTOR: 1.191

CiteScore 2016: 0.94

SCImago Journal Rank (SJR) 2016: 0.625
Source Normalized Impact per Paper (SNIP) 2016: 0.596

Mathematical Citation Quotient (MCQ) 2016: 0.06

See all formats and pricing
More options …
Volume 10, Issue 1 (Jul 2011)


Volume 10 (2011)

Volume 9 (2010)

Volume 6 (2007)

Volume 5 (2006)

Volume 4 (2005)

Volume 2 (2003)

Volume 1 (2002)

High-Dimensional Regression and Variable Selection Using CAR Scores

Verena Zuber
  • University of Leipzig
/ Korbinian Strimmer
  • University of Leipzig
Published Online: 2011-07-18 | DOI: https://doi.org/10.2202/1544-6115.1730

Variable selection is a difficult problem that is particularly challenging in the analysis of high-dimensional genomic data. Here, we introduce the CAR score, a novel and highly effective criterion for variable ranking in linear regression based on Mahalanobis-decorrelation of the explanatory variables. The CAR score provides a canonical ordering that encourages grouping of correlated predictors and down-weights antagonistic variables. It decomposes the proportion of variance explained and it is an intermediate between marginal correlation and the standardized regression coefficient. As a population quantity, any preferred inference scheme can be applied for its estimation. Using simulations, we demonstrate that variable selection by CAR scores is very effective and yields prediction errors and true and false positive rates that compare favorably with modern regression techniques such as elastic net and boosting. We illustrate our approach by analyzing data concerned with diabetes progression and with the effect of aging on gene expression in the human brain. The R package “care” implementing CAR score regression is available from CRAN.

Keywords: variable importance; variable selection; decorrelation; lasso; elastic net; boosting; CAR score

About the article

Published Online: 2011-07-18

Citation Information: Statistical Applications in Genetics and Molecular Biology, ISSN (Online) 1544-6115, ISSN (Print) 2194-6302, DOI: https://doi.org/10.2202/1544-6115.1730.

Export Citation

©2011 Walter de Gruyter GmbH & Co. KG, Berlin/Boston. Copyright Clearance Center

Comments (0)

Please log in or register to comment.
Log in