Estimating player contribution in hockey with regularized logistic regression

Robert B. Gramacy
  • Corresponding author
  • Booth School of Business, The University of Chicago, 5807 S Woodlawn Ave, Chicago, IL 60637, USA
/ Shane T. Jensen
  • The Wharton School, University of Pennsylvania, 3730 Walnut St., Philadelphia, PA 19102, USA
/ Matt Taddy
  • Booth School of Business, The University of Chicago, 5807 S Woodlawn Ave, Chicago, IL 60637, USA
Published Online: 2013-03-30 | DOI: https://doi.org/10.1515/jqas-2012-0001


We present a regularized logistic regression model for evaluating player contributions in hockey. The traditional metric for this purpose is the plus-minus statistic, which allocates a single unit of credit (for or against) to each player on the ice for a goal. However, plus-minus scores measure only the marginal effect of players, do not account for sample size, and provide a very noisy estimate of performance. We investigate a related regression problem: what does each player on the ice contribute, beyond aggregate team performance and other factors, to the odds that a given goal was scored by their team? Because hockey analysis involves a large-p (number of players) and imbalanced design setting, a major part of our contribution is a careful treatment of prior shrinkage in model estimation. We showcase two recently developed techniques – for posterior maximization or simulation – that make such analysis feasible. Each approach is accompanied by publicly available software, and we include the simple commands used in our analysis. Our results show that most players do not stand out as measurably strong (positive or negative) contributors. This allows the stars to really shine, reveals diamonds in the rough overlooked by earlier analyses, and argues that some of the highest paid players in the league are not making contributions worth their expense.
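
The regression question posed above can be written down roughly as follows. This is only a sketch, and the notation is assumed for illustration rather than taken from this page: $y_i \in \{0,1\}$ records whether goal $i$ was scored by the home team, $x_{ij}$ is $+1$ if player $j$ was on the ice for the home team, $-1$ if on the ice for the away team, and $0$ otherwise, $\alpha_i$ collects team and other non-player terms, $\beta_j$ is the effect of player $j$, and $\lambda \ge 0$ is a penalty weight.

```latex
% Hypothetical notation: log-odds that goal i was scored by the home team
\log\frac{q_i}{1-q_i} = \alpha_i + \sum_{j=1}^{p} x_{ij}\,\beta_j,
\qquad q_i = \Pr(y_i = 1).

% L1-penalized (lasso) estimate of the player effects
\hat{\beta} = \arg\min_{\beta}
\left\{ -\sum_{i=1}^{n} \bigl[\, y_i \log q_i + (1-y_i)\log(1-q_i) \,\bigr]
        + \lambda \sum_{j=1}^{p} \lvert \beta_j \rvert \right\}.
```

Under independent Laplace priors on the $\beta_j$, maximizing the posterior is equivalent to minimizing this L1-penalized negative log-likelihood, which is one way prior shrinkage can pull the estimates for most players to exactly zero.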

Keywords: Bayesian shrinkage; lasso; logistic regression; regularization; sports analytics


Corresponding author: Robert B. Gramacy, Booth School of Business, The University of Chicago, 5807 S Woodlawn Ave, Chicago, IL 60637, USA, Tel.: +773-702-0739

1 Note that we include goalies in our analysis.

2 Fitted in R using the command fit <- glm(goals ~ XP, family="binomial").

3 We used forward step-wise regression with the Bayes information criterion (BIC).

4 This is the lowest possible budget from which lines can be formed satisfying (4).

5 Sweater sales is another matter.

6 We omitted goalie-skater and goalie-goalie interaction terms.

