Skill importance in women’s soccer

Matthew Heiner 1 , Gilbert W. Fellingham 1 , and Camille Thomas 2
  • 1 Statistics, Brigham Young University, Provo, UT, USA
  • 2 Physical Education and Human Performance, Southern Utah University, Cedar City, UT, USA

Abstract

Soccer analytics often follow one of two approaches: (1) regression models on the number of shots taken or goals scored to predict match winners, or (2) spatial and/or temporal analysis of plays to evaluate strategy. We propose a new model to evaluate skill importance in soccer. Play-by-play data were collected from 22 NCAA Division I women’s soccer matches using a new skill notation system. Using a Bayesian approach, we model play sequences as discrete absorbing Markov chains. From the posterior distributions, we estimate the probability that each of 35 distinct offensive skills leads to a shot during a single possession.
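
The abstract does not spell out the model, so the following is only a minimal sketch of the general technique it names, not the authors' implementation. It assumes independent Dirichlet priors on each row of the transition matrix (a common conjugate choice), samples transition matrices from the resulting posterior, and for each draw solves the absorbing-chain system (I - Q) b = r for the probability of ending in a shot. The state names, the transition counts, the uniform prior, and all function names are hypothetical and chosen purely for illustration.

```python
import random

# Hypothetical states and made-up counts; not the paper's 35 skills or its data.
TRANSIENT = ["pass", "dribble", "cross"]
ABSORBING = ["shot", "possession_lost"]

# Rows are TRANSIENT states; columns are TRANSIENT followed by ABSORBING.
COUNTS = [
    [40, 25, 10, 5, 20],
    [30, 10, 8, 7, 25],
    [5, 2, 1, 12, 20],
]


def sample_dirichlet(alpha, rng):
    """Draw one Dirichlet vector by normalizing independent gamma draws."""
    gammas = [rng.gammavariate(a, 1.0) for a in alpha]
    total = sum(gammas)
    return [g / total for g in gammas]


def shot_probabilities(P):
    """Solve (I - Q) b = r by Gauss-Jordan elimination.

    P holds one row per transient state. Q is its transient-to-transient
    part and r is the first absorbing column (here, "shot").
    """
    n = len(P)
    # Augmented matrix [I - Q | r].
    A = [[(1.0 if i == j else 0.0) - P[i][j] for j in range(n)] + [P[i][n]]
         for i in range(n)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(A[r][col]))
        A[col], A[pivot] = A[pivot], A[col]
        for r in range(n):
            if r != col:
                factor = A[r][col] / A[col][col]
                A[r] = [x - factor * y for x, y in zip(A[r], A[col])]
    return [A[i][n] / A[i][i] for i in range(n)]


def posterior_shot_samples(counts, prior=1.0, draws=2000, seed=0):
    """Posterior draws of P(possession ends in a shot | current skill)."""
    rng = random.Random(seed)
    samples = []
    for _ in range(draws):
        # Assumed conjugate update: Dirichlet(prior + observed counts) per row.
        P = [sample_dirichlet([c + prior for c in row], rng) for row in counts]
        samples.append(shot_probabilities(P))
    return samples


samples = posterior_shot_samples(COUNTS)
posterior_mean = [sum(s[i] for s in samples) / len(samples)
                  for i in range(len(TRANSIENT))]
```

Each entry of `posterior_mean` summarizes one skill, and the spread of the corresponding column of `samples` gives a rough sense of the posterior uncertainty. Sampling whole transition matrices, rather than plugging in point estimates, lets that uncertainty carry through to the shot probabilities.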
