Published by De Gruyter, January 25, 2014

A Bayesian stochastic model for batting performance evaluation in one-day cricket

Theodoro Koulis, Saman Muthukumarana and Creagh Dyson Briercliffe

Abstract

We consider the modeling of individual batting performance in one-day international (ODI) cricket by using a batsman-specific hidden Markov model (HMM). The batsman-specific number of hidden states allows us to account for the heterogeneous dynamics found in batting performance. Parallel sampling is used to choose the optimal number of hidden states. Using the batsman-specific HMM, we then introduce measures of performance to assess individual players via reliability analysis. By classifying states as either up or down, we compute the availability, reliability, failure rate and mean time to failure for each batsman. By choosing an appropriate classification of states, an overall prediction of batting performance of a batsman can be made. The classification of states can also be modified according to the type of game under consideration. One advantage of this batsman-specific HMM is that it does not require the consideration of unforeseen factors. This is important since cricket has gone through several rule changes in recent years that have introduced further unforeseen dynamic factors into the game. We showcase the approach using data from 20 different batsmen having different underlying dynamics and representing different countries.


Corresponding author: Theodoro Koulis, Department of Statistics, University of Manitoba, 338 Machray Hall, Winnipeg Manitoba R3T2N2, Canada, Tel.: 204-474-8205, Fax: 204-474-7621, e-mail:

Acknowledgments

The authors have been partially supported by research grants from the Natural Sciences and Engineering Research Council of Canada. The authors thank the Editor, Associate Editor, and the two anonymous reviewers whose comments led to an improvement in the manuscript.

Appendix A: Gibbs Sampling

Gibbs sampling proceeds by repeated application of the following steps:

  1. Given the current value of $\Theta=\theta$, we generate a sample path $y^{(N)}$ of the performance states according to (Zucchini and MacDonald 2009):

    $$P(Y^{(N)}=y^{(N)} \mid X^{(N)}=x^{(N)}, \theta) = P(Y_N=y_N \mid X^{(N)}=x^{(N)}, \theta) \times \prod_{n=1}^{N-1} P(Y_n=y_n \mid X^{(N)}=x^{(N)}, Y_{n+1}^{N}, \theta), \tag{16}$$

    where

    $$P(Y_N=j \mid X^{(N)}=x^{(N)}, \theta) \propto \phi_N(j), \tag{17}$$

    and

    $$P(Y_n=j \mid X^{(N)}=x^{(N)}, Y_{n+1}^{N}, \theta) \propto \phi_n(j)\, p_{j,Y_{n+1}}. \tag{18}$$

    Recall that $\phi_n(j)$ are the forward step probabilities (2) of the FB algorithm. We simulate the performance states in reverse order by using (17) to generate $y_N$ and (18) to generate $y_n$ given $y_{n+1}^{N}=(y_{n+1},\ldots,y_N)$ for $n=N-1,\ldots,1$. Let $n_{i,j}$ be the number of transitions from state $i$ to $j$, and let $n_i=(n_{i,1},\ldots,n_{i,K})$.

  2. Using the simulated hidden performance states $y^{(N)}$, we decompose the observations $x^{(N)}$ into regime contributions by generating $r^{(N)}=(r_1,\ldots,r_N)$ as described in Section 2.1. Recall that $r_n=(r_{1,n},\ldots,r_{K,n})$ with $x_n=\sum_{i=1}^{K} r_{i,n}$, and $r_{i,n}=0$ for $i>y_n$. Let $\nu_j$ be the number of times regime $j$ was active:

    $$\nu_j=\sum_{n=1}^{N} \mathbb{1}(y_n>j).$$

    When not-out scores are present, we need to modify the regime decomposition. Under no censoring, and conditional on $X_n=x_n$ and $Y_n=k$, the joint distribution of the regimes $R_{1,n},\ldots,R_{k,n}$ is multinomial with total $x_n$. Under right-censoring, this still holds, except that $x_n$ is a right-censored observation, so that $X_n \ge x_n$. In this case, we must sample from the distribution of the regimes $R_{1,n},\ldots,R_{K,n}$ conditional on $X_n \ge x_n$, which requires an additional sampling step. First, conditional on $X_n \ge x_n$, we draw a completed score $X_n=\tilde{x}_n$ from the truncated Poisson distribution $P(\tilde{x}_n \mid X_n \ge x_n)$ with rate $\lambda_k$. Then, conditional on $X_n=\tilde{x}_n$, we draw a sample of regimes according to a multinomial with total $\tilde{x}_n$ and probability vector proportional to $(\mu_1,\ldots,\mu_k)$.

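  3. Finally, we note that

    $$\begin{aligned}
    P(Y^{(N)}=y^{(N)}, R^{(N)}=r^{(N)} \mid X^{(N)}=x^{(N)}, \theta)
    &\propto P(Y^{(N)}=y^{(N)} \mid X^{(N)}=x^{(N)}, \theta)\, P(R^{(N)}=r^{(N)} \mid X^{(N)}=x^{(N)}, Y^{(N)}=y^{(N)}, \theta) \\
    &\propto \pi_{y_1} \prod_{i,j=1}^{K} p_{i,j}^{n_{i,j}} \prod_{i=1}^{K} \mu_i^{\sum_j r_{i,j}} e^{-\nu_i \mu_i}.
    \end{aligned} \tag{19}$$

    Given the priors and the current values of $y^{(N)}$ and $r^{(N)}$, we update the parameter $\theta$ by drawing $\mu$ and $P$ according to

    $$\mu_i \sim \mathrm{Gamma}\Big(a_i+\sum_j r_{i,j},\; b_i+\nu_i\Big), \tag{20}$$
    $$p_i \sim \mathrm{Dir}(\alpha_i+n_i). \tag{21}$$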
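
As a rough illustration, one sweep of the three steps above might be sketched in Python as follows. This is only a sketch under stated assumptions, not the authors' implementation. The function names are hypothetical, states are indexed from 0, and the forward probabilities ϕn(j) are taken as a precomputed input `phi`. Regime i is treated as active whenever i ≤ yn (in line with ri,n=0 for i>yn), the Gamma prior is read as shape-rate, and λk is assumed to equal μ1+…+μk.

```python
import random

def sample_states(phi, P, rng):
    """Step 1: backward-sample a state path via (17)-(18). States are 0-based."""
    N, K = len(phi), len(P)
    y = [0] * N
    y[-1] = rng.choices(range(K), weights=phi[-1])[0]           # (17)
    for n in range(N - 2, -1, -1):
        w = [phi[n][j] * P[j][y[n + 1]] for j in range(K)]      # (18)
        y[n] = rng.choices(range(K), weights=w)[0]
    return y

def sample_truncated_poisson(x_min, lam, rng):
    """Step 2 (not-out scores): draw from Poisson(lam) conditioned on X >= x_min."""
    weights, term, m, total = [], 1.0, x_min, 0.0
    # Unnormalized pmf relative to its value at x_min; stop once past the
    # mode and the remaining terms are negligible.
    while m <= lam or term > 1e-12 * total:
        weights.append(term)
        total += term
        m += 1
        term *= lam / m
    return x_min + rng.choices(range(len(weights)), weights=weights)[0]

def split_into_regimes(x, state, mu, rng):
    """Step 2: multinomial split of score x over regimes 0..state (assumed active)."""
    r = [0] * len(mu)
    for i in rng.choices(range(state + 1), weights=mu[:state + 1], k=x):
        r[i] += 1
    return r

def update_parameters(y, r, a, b, alpha, rng):
    """Step 3: conjugate draws (20)-(21) given states y and regime counts r."""
    K = len(a)
    n = [[0] * K for _ in range(K)]                    # transition counts n_{i,j}
    for s, t in zip(y, y[1:]):
        n[s][t] += 1
    nu = [sum(1 for s in y if s >= i) for i in range(K)]   # assumption: active iff i <= y_n
    r_tot = [sum(r_n[i] for r_n in r) for i in range(K)]
    # gammavariate takes (shape, scale), so the rate b_i + nu_i is inverted.
    mu = [rng.gammavariate(a[i] + r_tot[i], 1.0 / (b[i] + nu[i])) for i in range(K)]
    P = []
    for i in range(K):                                 # Dirichlet via normalized Gammas
        g = [rng.gammavariate(alpha[i][j] + n[i][j], 1.0) for j in range(K)]
        P.append([v / sum(g) for v in g])
    return mu, P

# One sweep on made-up inputs (all values below are hypothetical).
rng = random.Random(1)
phi = [[0.7, 0.3], [0.4, 0.6], [0.2, 0.8]]
P = [[0.8, 0.2], [0.3, 0.7]]
mu = [10.0, 25.0]
x, not_out = [4, 31, 52], [False, False, True]
y = sample_states(phi, P, rng)
r = []
for x_n, s, censored in zip(x, y, not_out):
    lam = sum(mu[:s + 1])                              # assumed lambda_k
    score = sample_truncated_poisson(x_n, lam, rng) if censored else x_n
    r.append(split_into_regimes(score, s, mu, rng))
mu, P = update_parameters(y, r, [1.0, 1.0], [0.1, 0.1], [[1.0, 1.0], [1.0, 1.0]], rng)
```

The truncated Poisson draw enumerates probabilities from the censored score upward rather than using rejection sampling, so a not-out score far above λk does not stall the sweep.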

References

Akaike, H. 1973. “Information Theory and an Extension of The Maximum Likelihood Principle.” in 2nd International Symposium on Information Theory, 267–281.

Albert, J. 2002. “Hitting with Runners in Scoring Position.” Chance 15:8–16.

Bailey, M. and S. Clarke. 2004. “Market Inefficiencies in Player Head to Head Betting on The 2003 Cricket World Cup.” pp. 185–202 in Economics, Management and Optimization in Sport, edited by S. Butenko, J. Gil-Lafuente, and P. Pardalos. Heidelberg: Springer-Verlag.

Bailey, M. and S. Clarke. 2006. “Predicting the Match Outcome in One Day International Cricket Matches, While The Match is in Progress.” Journal of Sports Science and Medicine 5:480–487.

Barbu, V. and N. Limnios. 2008. Semi-Markov Chains and Hidden Semi-Markov Models Toward Applications: Their Use in Reliability and DNA Analysis. Lecture Notes in Statistics, Springer Science + Business Media.

Beaudoin, D. 2003. The Best Batsmen and Bowlers in One-Day Cricket. Master’s thesis, Simon Fraser University, Department of Statistics and Actuarial Science.

Brewer, B. J. 2008. “Getting Your Eye in: A Bayesian Analysis of Early Dismissals in Cricket,” pre-print, arXiv:0801.4408v2 [stat.AP].

Carlin, B. P. and S. Chib. 1995. “Bayesian Model Choice via Markov Chain Monte Carlo Methods.” Journal of the Royal Statistical Society. Series B (Methodological) 57:473–484.

Congdon, P. 2006. “Bayesian Model Choice Based on Monte Carlo Estimates of Posterior Model Probabilities.” Computational Statistics & Data Analysis 50:346–357.

Daniyal, M., T. Nawaz, I. Mubeen, and M. Aleem. 2012. “Analysis of Batting Performance in Cricket Using Individual and Moving Range (mr) Control Charts.” International Journal of Sports Science and Engineering 6:195–202.

de Campos, C. P. and A. Benavoli. 2011. “Inference with Multinomial Data: Why to Weaken the Prior Strength,” pp. 2107–2112 in Proceedings of the Twenty-Second International Joint Conference on Artificial Intelligence.

de Silva, B. and T. B. Swartz. 1997. “Winning the Coin Toss and The Home Team Advantage in One-Day International Cricket Matches.” New Zealand Statistician 32:16–22.

Frühwirth-Schnatter, S. 2010. Finite Mixture and Markov Switching Models. Springer Series in Statistics, Springer.

Green, P. J. 1995. “Reversible Jump Markov chain Monte Carlo Computation and Bayesian Model Determination.” Biometrika 82:711–732.

Hintze, J. L. and R. D. Nelson. 1998. “Violin plots: A Box plot-Density Trace Synergism.” The American Statistician 52:181–184.

Jensen, S. T., B. B. McShane, and A. J. Wyner. 2009. “Hierarchical Bayesian Modeling of Hitting Performance in Baseball.” Bayesian Analysis 4:631–652.

Kass, R. E. and A. E. Raftery. 1995. “Bayes Factors.” Journal of the American Statistical Association 90:773–795.

Kimber, A. C. and A. R. Hansford. 1993. “A Statistical Analysis of Batting in Cricket.” Journal of the Royal Statistical Society. Series A (Statistics in Society) 156:443–455.

Lemmer, H. H. 2004. “A Measure for The Batting Performance of Cricket Players.” South African Journal for Research in Sport, Physical Education and Recreation 26:55–64.

Lemmer, H. H. 2007. “The Allocation of Weights in The Calculation of Batting and Bowling Performance Measures.” South African Journal for Research in Sport, Physical Education and Recreation 29:75–85.

Lemmer, H. H. 2011. “The Single Match Approach to Strike Rate Adjustments in Batting Performance Measures in Cricket.” Journal of Sports Science & Medicine 10:630–634.

Martin, J. 1967. Bayesian Decision Problems and Markov Chains. Publications in Operations Research, Wiley.

Rabiner, L. R. 1989. “A Tutorial on Hidden Markov Models and Selected Applications in Speech Recognition.” Proceedings of the IEEE 77:257–286.

Raj, B. 2002. “Asymmetry of Business Cycles: The Markov-Switching Approach.” pp. 687–710 in Handbook of Applied Econometrics and Statistical Inference, edited by A. Ullah, A. Wan, and A. Chaturvedi. New York: Marcel Dekker.

Roy, D. and R. Gupta. 1992. “Classifications of Discrete Lives.” Microelectronics Reliability 32:1459–1473.

Ryden, T. 2008. “EM versus Markov Chain Monte Carlo for Estimation of Hidden Markov Models: A Computational Perspective.” Bayesian Analysis 3:659–688.

Sadek, A. and N. Limnios. 2002. “Asymptotic Properties for Maximum Likelihood Estimators for Reliability and Failure Rates of Markov Chains.” Communications in Statistics-Theory and Methods 31:1837–1861.

Schwarz, G. 1978. “Estimating the Dimension of a Model.” The Annals of Statistics 6:461–464.

Scott, S. L. 2002. “Bayesian Methods for Hidden Markov Models: Recursive Computing in the 21st Century.” Journal of the American Statistical Association 97:337–351.

Swartz, T. B., P. S. Gill, D. Beaudoin, and B. de Silva. 2006. “Optimal Batting Orders in One-Day Cricket.” Computers and Operations Research 33:1939–1950.

Swartz, T. B., P. S. Gill, and S. Muthukumarana. 2009. “Modelling and Simulation for One-Day Cricket.” Canadian Journal of Statistics 37:143–160.

Tucker, B. C. and M. Anand. 2005. “On the Use of Stationary versus Hidden Markov Models to Detect Simple versus Complex Ecological Dynamics.” Ecological Modelling 185:177–193.

Valero, J. and T. B. Swartz. 2012. “An Investigation of Synergy Between Batsmen in Opening Partnerships.” Sri Lankan Journal of Applied Statistics 13:87–98.

van Staden, P. 2009. “Comparison of Cricketers’ Bowling and Batting Performances using Graphical Displays.” Current Science 96:764–766.

van Staden, P., A. Meiring, J. Steyn, and I. Fabris-Rotelli. 2010. “Meaningful Batting Averages in Cricket.” pp. 75–82 in Proceedings of the 52nd Annual Conference of the South African Statistical Association for 2010, edited by P. Debba, F. Lombard, V. Yadavalli, and L. Fatti. Potchefstroom: North-West University.

Xie, M., O. Gaudoin, and C. Bracquemond. 2002. “Redefining Failure Rate Function for Discrete Distributions.” International Journal of Reliability, Quality and Safety Engineering 9:275–285.

Zucchini, W. and I. MacDonald. 2009. Hidden Markov Models for Time Series: An Introduction Using R. Chapman & Hall/CRC Monographs on Statistics & Applied Probability, Taylor & Francis.

Published Online: 2014-1-25
Published in Print: 2014-1-1

©2014 by Walter de Gruyter Berlin/Boston