Published by De Gruyter, October 12, 2020

Algorithmically deconstructing shot locations as a method for shot quality in hockey

Devan G. Becker, Douglas G. Woolford and Charmaine B. Dean

Abstract

Spatial point processes have been successfully used to model the relative efficiency of shot locations for each player in professional basketball games. Those analyses were possible because each player makes enough baskets to reliably fit a point process model. Goals in hockey are rare enough that a point process cannot be fit to each player’s goal locations, so novel techniques are needed to obtain measures of shot efficiency for each player. A Log-Gaussian Cox Process (LGCP) is used to model all shot locations, including goals, of each NHL player who took at least 500 shots during the 2011–2018 seasons. Each player’s LGCP surface is treated as an image and these images are then used in an unsupervised statistical learning algorithm that decomposes the images into a linear combination of spatial basis functions. The coefficients of these basis functions are shown to be a useful tool for comparing players. To incorporate goals, the locations of all shots that resulted in a goal are treated as a “perfect player” and used in the same algorithm (goals are further split into perfect forwards, perfect centres and perfect defence). These perfect players are compared to other players as a measure of shot efficiency. This analysis provides a map of common shooting locations, identifies regions with the most goals relative to the number of shots and demonstrates how each player’s shot locations differ from scoring locations.
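
The decomposition step described in the abstract can be illustrated with a small sketch. The abstract does not name the algorithm, so the code below assumes non-negative matrix factorization with the standard multiplicative updates for squared error, which is one common way to write non-negative images as a linear combination of non-negative basis functions. Everything in the example is hypothetical: the 12-pixel "surfaces", the two generating patterns (`SLOT`, `POINT`), the player names, the mixing weights, and the choice of Euclidean distance between normalised coefficients as the comparison to the "perfect" player. No LGCP is fitted here; the rows of `V` simply stand in for fitted intensity surfaces.

```python
import random


def transpose(A):
    """Transpose a matrix stored as a list of rows."""
    return [list(col) for col in zip(*A)]


def matmul(A, B):
    """Multiply two matrices stored as lists of rows."""
    Bt = transpose(B)
    return [[sum(a * b for a, b in zip(row, col)) for col in Bt] for row in A]


def nmf(V, rank, n_iter=500, seed=0, eps=1e-9):
    """Approximate a non-negative matrix V (players x pixels) by W @ H,
    with W (players x rank) and H (rank x pixels) both non-negative,
    using multiplicative updates for the squared-error objective."""
    rng = random.Random(seed)
    n, m = len(V), len(V[0])
    W = [[rng.random() + 0.1 for _ in range(rank)] for _ in range(n)]
    H = [[rng.random() + 0.1 for _ in range(m)] for _ in range(rank)]
    for _ in range(n_iter):
        # H <- H * (W^T V) / (W^T W H), elementwise
        Wt = transpose(W)
        num = matmul(Wt, V)
        den = matmul(matmul(Wt, W), H)
        H = [[H[k][j] * num[k][j] / (den[k][j] + eps) for j in range(m)]
             for k in range(rank)]
        # W <- W * (V H^T) / (W H H^T), elementwise
        Ht = transpose(H)
        num = matmul(V, Ht)
        den = matmul(W, matmul(H, Ht))
        W = [[W[i][k] * num[i][k] / (den[i][k] + eps) for k in range(rank)]
             for i in range(n)]
    return W, H


def frobenius_error(V, W, H):
    """Frobenius norm of the reconstruction residual V - W @ H."""
    WH = matmul(W, H)
    return sum((v - a) ** 2 for rv, ra in zip(V, WH)
               for v, a in zip(rv, ra)) ** 0.5


def shares(row):
    """Normalise a coefficient row so that its entries sum to one."""
    total = sum(row)
    return [x / total for x in row]


# Hypothetical spatial patterns on a flattened 12-pixel grid.
SLOT = [0, 0, 1, 2, 4, 2, 1, 0, 0, 0, 0, 0]
POINT = [0, 0, 0, 0, 0, 0, 0, 1, 2, 4, 2, 1]

# Hypothetical mixing weights; "perfect" stands in for the goal surface.
MIX = {
    "player_A": (5, 1),
    "player_B": (1, 5),
    "player_C": (3, 3),
    "player_D": (4, 2),
    "perfect": (6, 0.5),
}

names = list(MIX)
V = [[a * s + b * p for s, p in zip(SLOT, POINT)] for a, b in MIX.values()]

W, H = nmf(V, rank=2)

# Compare each player's normalised coefficients with the perfect player's.
profiles = {name: shares(w) for name, w in zip(names, W)}
ref = profiles["perfect"]
distance = {
    name: sum((x - y) ** 2 for x, y in zip(p, ref)) ** 0.5
    for name, p in profiles.items()
}

for name in sorted(distance, key=distance.get):
    print(f"{name}: distance to perfect = {distance[name]:.3f}")
```

Each row of `H` plays the role of a spatial basis function and each row of `W` holds one player's coefficients. Because the factorization is not unique, the basis functions and distances depend on the rank, the initialisation, and the number of iterations, so results from a sketch like this are best read comparatively.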


Corresponding author: Devan G. Becker, The University of Western Ontario, London, Canada, E-mail:

Funding source: Natural Sciences and Engineering Research Council of Canada

Award Identifier / Grant number: RGPIN-2015-04221

Award Identifier / Grant number: RGPIN-2014-06187

Funding source: CANSSI Collaborative Research Team Grant

Acknowledgment

We acknowledge the support of the Natural Sciences and Engineering Research Council of Canada (NSERC), [funding reference numbers RGPIN-2015-04221 and RGPIN-2014-06187]. Additional support was provided by a CANSSI Collaborative Research Team grant. We would also like to thank Michael Schuckers and Nathan Sandholtz for helpful conversations regarding this work.

  1. Author contribution: All the authors have accepted responsibility for the entire content of this submitted manuscript and approved submission.

  2. Research funding: This study was supported by the Natural Sciences and Engineering Research Council of Canada (NSERC), [funding reference numbers RGPIN-2015-04221 and RGPIN-2014-06187] and a CANSSI Collaborative Research Team grant.

  3. Conflict of interest statement: The authors declare no conflicts of interest regarding this article.

Received: 2020-01-30
Accepted: 2020-09-18
Published Online: 2020-10-12
Published in Print: 2021-06-25

© 2020 Walter de Gruyter GmbH, Berlin/Boston