Estimating player contribution in hockey with regularized logistic regression

Robert B. Gramacy 1, Shane T. Jensen 2 and Matt Taddy 1
  • 1 Booth School of Business, The University of Chicago, 5807 S Woodlawn Ave, Chicago, IL 60637, USA
  • 2 The Wharton School, University of Pennsylvania, 3730 Walnut St., Philadelphia, PA 19102, USA

Abstract

We present a regularized logistic regression model for evaluating player contributions in hockey. The traditional metric for this purpose is the plus-minus statistic, which allocates a single unit of credit (for or against) to each player on the ice for a goal. However, plus-minus scores measure only the marginal effect of players, do not account for sample size, and provide a very noisy estimate of performance. We investigate a related regression problem: what does each player on the ice contribute, beyond aggregate team performance and other factors, to the odds that a given goal was scored by their team? Because hockey analysis is a large-p (number of players) setting with an imbalanced design, a major part of our contribution is a careful treatment of prior shrinkage in model estimation. We showcase two recently developed techniques – for posterior maximization or simulation – that make such analysis feasible. Each approach is accompanied by publicly available software, and we include the simple commands used in our analysis. Our results show that most players do not stand out as measurably strong (positive or negative) contributors. This allows the stars to really shine, reveals diamonds in the rough overlooked by earlier analyses, and argues that some of the highest paid players in the league are not making contributions worth their expense.
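
To make the contrast between plus-minus and the regression question concrete, the following is a minimal sketch in plain Python, not the software used in the analysis. It assumes a signed coding that is not stated in the abstract (+1 for a player on the home side, -1 for the away side, with the response indicating whether the home team scored), uses an L1 penalty as one simple stand-in for prior shrinkage under posterior maximization, and omits team effects and other factors. The goal records, the player labels, and the function names are all hypothetical.

```python
import math

# Hypothetical toy data (not from the paper): each record is
# (players on ice for the home side, players for the away side, home_scored).
GOALS = [
    ({"A", "B"}, {"C", "D"}, 1),
    ({"A", "C"}, {"B", "D"}, 1),
    ({"A", "D"}, {"B", "C"}, 1),
    ({"C", "D"}, {"A", "B"}, 0),
    ({"B", "D"}, {"A", "C"}, 0),
    ({"B", "C"}, {"A", "D"}, 0),
]
PLAYERS = sorted(set().union(*(home | away for home, away, _ in GOALS)))


def plus_minus(goals):
    """Classic plus-minus: +1 to each player on the scoring side, -1 to the rest."""
    pm = {p: 0 for p in PLAYERS}
    for home, away, home_scored in goals:
        scorers, conceders = (home, away) if home_scored else (away, home)
        for p in scorers:
            pm[p] += 1
        for p in conceders:
            pm[p] -= 1
    return pm


def design_row(home, away):
    """Assumed coding: +1 if on the home side, -1 if away, 0 if off the ice."""
    return [1.0 if p in home else -1.0 if p in away else 0.0 for p in PLAYERS]


def soft_threshold(v, t):
    """Proximal operator of the L1 penalty: shrink v toward zero by t."""
    if v > t:
        return v - t
    if v < -t:
        return v + t
    return 0.0


def fit_l1_logistic(goals, lam, step=0.5, iters=2000):
    """L1-penalized logistic regression fitted by proximal gradient descent."""
    X = [design_row(home, away) for home, away, _ in goals]
    y = [scored for _, _, scored in goals]
    n, k = len(X), len(PLAYERS)
    beta = [0.0] * k
    for _ in range(iters):
        # Residuals: fitted probability that the home team scored, minus outcome.
        resid = []
        for xi, yi in zip(X, y):
            z = sum(b * x for b, x in zip(beta, xi))
            resid.append(1.0 / (1.0 + math.exp(-z)) - yi)
        # Gradient of the average negative log-likelihood.
        grad = [sum(r * xi[j] for r, xi in zip(resid, X)) / n for j in range(k)]
        # Gradient step followed by shrinkage.
        beta = [soft_threshold(b - step * g, step * lam) for b, g in zip(beta, grad)]
    return dict(zip(PLAYERS, beta))


pm = plus_minus(GOALS)
coef = fit_l1_logistic(GOALS, lam=0.2)
```

In this toy data player A is on the ice for the scoring side of every goal. Plus-minus hands the other three players a negative score simply for facing A more often than they play alongside A, whereas the penalized fit assigns A a positive coefficient and shrinks the others to exactly zero. This is only an illustration of the kind of shrinkage the abstract describes, not a reproduction of its results.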

JQAS, an official journal of the American Statistical Association, publishes research on the quantitative aspects of professional and collegiate sports. Articles deal with such subjects as measurements of player performance, tournament structure, and the frequency and occurrence of records. Additionally, the journal serves as an outlet for professionals in the sports world to raise issues and ask questions that relate to quantitative sports analysis.