Published by De Gruyter March 30, 2013

Determining the level of ability of football teams by dynamic ratings based on the relative discrepancies in scores between adversaries

Anthony Costa Constantinou and Norman Elliott Fenton

Abstract

A rating system provides relative measures of superiority between adversaries. We propose a novel and simple approach, which we call pi-rating, for dynamically rating Association Football teams solely on the basis of the relative discrepancies in scores through relevant match instances. The pi-rating system is applicable to any other sport where the score is considered a good indicator for prediction purposes, as well as for determining the relative performances between adversaries. To examine how well the ratings capture a team’s performance, we have a) assessed them against two recently proposed football ELO rating variants and b) used them as the basis of a football betting strategy against published market odds. The results show that the pi-ratings considerably outperform the widely accepted ELO ratings and, perhaps more importantly, demonstrate profitability over a period of five English Premier League seasons (2007/2008–2011/2012), even allowing for the bookmakers’ built-in profit margin. This is the first academic study to demonstrate profitability against market odds using such a relatively simple technique, and the resulting pi-ratings can be incorporated as parameters into other more sophisticated models in an attempt to further enhance forecasting capability.


Corresponding author: Anthony Costa Constantinou, Electronic Engineering and Computer Science, Queen Mary, University of London, CS332, RIM GROUP, EECS, Mile End, London E1 4NS, UK

We acknowledge the financial support of the Engineering and Physical Sciences Research Council (EPSRC) for this research, and thank the reviewers and Editors of this Journal, whose comments have led to significant improvements in the paper.

Appendix

Appendix A Rating development over a period of 20 seasons.

Figure A.1 Rating development over a period of 20 seasons, assuming λ=0.035 and γ=0.7, for the six most popular EPL teams (from season 1992/1993 to season 2011/2012 inclusive).


Appendix B Learning rates λ and γ

Table B.1

Squared error values generated based on learning rates λ and γ.

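Rows give λ, columns give γ.

| λ \ γ | 0.05 | 0.10 | 0.15 | 0.20 | 0.25 | 0.30 | 0.35 | 0.40 | 0.45 | 0.50 | 0.55 | 0.60 | 0.65 | 0.70 | 0.75 | 0.80 | 0.85 | 0.90 | 0.95 | 1.00 |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0.005 | 2.8436 | 2.8365 | 2.8301 | 2.8243 | 2.8192 | 2.8148 | 2.8111 | 2.8082 | 2.8061 | 2.8050 | 2.8049 | 2.8059 | 2.8082 | 2.8119 | 2.8173 | 2.8246 | 2.8340 | 2.8461 | 2.8616 | 2.8809 |
| 0.010 | 2.7273 | 2.7212 | 2.7156 | 2.7105 | 2.7057 | 2.7015 | 2.6978 | 2.6946 | 2.6921 | 2.6902 | 2.6891 | 2.6890 | 2.6902 | 2.6930 | 2.6981 | 2.7064 | 2.7193 | 2.7389 | 2.7680 | 2.8115 |
| 0.015 | 2.6847 | 2.6800 | 2.6756 | 2.6716 | 2.6680 | 2.6647 | 2.6617 | 2.6591 | 2.6568 | 2.6549 | 2.6535 | 2.6527 | 2.6527 | 2.6538 | 2.6567 | 2.6624 | 2.6730 | 2.6918 | 2.7261 | 2.7896 |
| 0.020 | 2.6658 | 2.6618 | 2.6582 | 2.6548 | 2.6517 | 2.6489 | 2.6463 | 2.6439 | 2.6418 | 2.6399 | 2.6384 | 2.6372 | 2.6366 | 2.6368 | 2.6381 | 2.6417 | 2.6494 | 2.6659 | 2.7012 | 2.7800 |
| 0.025 | 2.6561 | 2.6526 | 2.6493 | 2.6462 | 2.6434 | 2.6408 | 2.6384 | 2.6362 | 2.6342 | 2.6325 | 2.6310 | 2.6297 | 2.6289 | 2.6286 | 2.6293 | 2.6315 | 2.6371 | 2.6506 | 2.6842 | 2.7760 |
| 0.030 | 2.6518 | 2.6485 | 2.6453 | 2.6424 | 2.6397 | 2.6372 | 2.6348 | 2.6327 | 2.6308 | 2.6290 | 2.6275 | 2.6263 | 2.6254 | 2.6250 | 2.6253 | 2.6268 | 2.6309 | 2.6415 | 2.6723 | 2.7749 |
| 0.035 | 2.6506 | 2.6473 | 2.6442 | 2.6414 | 2.6387 | 2.6362 | 2.6339 | 2.6318 | 2.6300 | 2.6283 | 2.6269 | 2.6258 | 2.6251 | 2.6247 | 2.6249 | 2.6258 | 2.6286 | 2.6369 | 2.6644 | 2.7758 |
| 0.040 | 2.6510 | 2.6478 | 2.6448 | 2.6421 | 2.6396 | 2.6372 | 2.6349 | 2.6328 | 2.6310 | 2.6295 | 2.6281 | 2.6270 | 2.6262 | 2.6257 | 2.6257 | 2.6264 | 2.6287 | 2.6355 | 2.6598 | 2.7782 |
| 0.045 | 2.6527 | 2.6497 | 2.6467 | 2.6440 | 2.6414 | 2.6389 | 2.6368 | 2.6348 | 2.6330 | 2.6315 | 2.6302 | 2.6291 | 2.6284 | 2.6279 | 2.6277 | 2.6282 | 2.6300 | 2.6359 | 2.6577 | 2.7818 |
| 0.050 | 2.6554 | 2.6524 | 2.6494 | 2.6466 | 2.6440 | 2.6416 | 2.6395 | 2.6376 | 2.6358 | 2.6343 | 2.6331 | 2.6321 | 2.6313 | 2.6307 | 2.6305 | 2.6310 | 2.6327 | 2.6376 | 2.6572 | 2.7865 |
| 0.055 | 2.6591 | 2.6559 | 2.6530 | 2.6502 | 2.6476 | 2.6453 | 2.6431 | 2.6412 | 2.6394 | 2.6379 | 2.6367 | 2.6356 | 2.6348 | 2.6343 | 2.6343 | 2.6348 | 2.6362 | 2.6405 | 2.6579 | 2.7918 |
| 0.060 | 2.6635 | 2.6603 | 2.6574 | 2.6546 | 2.6519 | 2.6496 | 2.6475 | 2.6454 | 2.6436 | 2.6421 | 2.6408 | 2.6398 | 2.6390 | 2.6386 | 2.6386 | 2.6392 | 2.6407 | 2.6445 | 2.6598 | 2.7979 |
| 0.065 | 2.6685 | 2.6653 | 2.6622 | 2.6594 | 2.6568 | 2.6546 | 2.6524 | 2.6503 | 2.6485 | 2.6468 | 2.6454 | 2.6444 | 2.6436 | 2.6433 | 2.6435 | 2.6441 | 2.6456 | 2.6492 | 2.6628 | 2.8044 |
| 0.070 | 2.6740 | 2.6707 | 2.6676 | 2.6647 | 2.6621 | 2.6598 | 2.6576 | 2.6555 | 2.6536 | 2.6518 | 2.6504 | 2.6493 | 2.6487 | 2.6484 | 2.6487 | 2.6493 | 2.6509 | 2.6545 | 2.6668 | 2.8113 |
| 0.075 | 2.6799 | 2.6766 | 2.6735 | 2.6706 | 2.6679 | 2.6655 | 2.6631 | 2.6609 | 2.6589 | 2.6572 | 2.6559 | 2.6549 | 2.6542 | 2.6540 | 2.6542 | 2.6550 | 2.6569 | 2.6605 | 2.6718 | 2.8187 |
| 0.080 | 2.6862 | 2.6828 | 2.6797 | 2.6769 | 2.6741 | 2.6715 | 2.6691 | 2.6669 | 2.6648 | 2.6632 | 2.6620 | 2.6610 | 2.6603 | 2.6600 | 2.6603 | 2.6613 | 2.6633 | 2.6671 | 2.6775 | 2.8264 |
| 0.085 | 2.6929 | 2.6895 | 2.6864 | 2.6833 | 2.6805 | 2.6779 | 2.6757 | 2.6734 | 2.6713 | 2.6698 | 2.6684 | 2.6674 | 2.6667 | 2.6666 | 2.6670 | 2.6680 | 2.6701 | 2.6738 | 2.6838 | 2.8345 |
| 0.090 | 2.6999 | 2.6965 | 2.6932 | 2.6902 | 2.6874 | 2.6849 | 2.6825 | 2.6803 | 2.6783 | 2.6767 | 2.6754 | 2.6743 | 2.6737 | 2.6736 | 2.6740 | 2.6751 | 2.6772 | 2.6809 | 2.6906 | 2.8429 |
| 0.095 | 2.7072 | 2.7037 | 2.7005 | 2.6974 | 2.6946 | 2.6920 | 2.6896 | 2.6875 | 2.6856 | 2.6839 | 2.6826 | 2.6816 | 2.6811 | 2.6811 | 2.6815 | 2.6826 | 2.6847 | 2.6884 | 2.6977 | 2.8518 |
| 0.100 | 2.7147 | 2.7113 | 2.7080 | 2.7048 | 2.7020 | 2.6995 | 2.6971 | 2.6951 | 2.6933 | 2.6917 | 2.6903 | 2.6893 | 2.6889 | 2.6889 | 2.6895 | 2.6906 | 2.6926 | 2.6963 | 2.7051 | 2.8610 |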
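Selecting the learning rates from Table B.1 amounts to picking the grid cell with the smallest squared error. The following is a minimal Python sketch of that step, using only a nine-cell excerpt of the table around its minimum; the dictionary layout and the function name `best_learning_rates` are illustrative assumptions, not part of the paper.

```python
# Excerpt of Table B.1: squared error keyed by (lambda, gamma).
# Only nine cells around the minimum are copied here; the layout is an
# illustrative assumption, not the authors' data structure.
errors = {
    (0.030, 0.65): 2.6254, (0.030, 0.70): 2.6250, (0.030, 0.75): 2.6253,
    (0.035, 0.65): 2.6251, (0.035, 0.70): 2.6247, (0.035, 0.75): 2.6249,
    (0.040, 0.65): 2.6262, (0.040, 0.70): 2.6257, (0.040, 0.75): 2.6257,
}


def best_learning_rates(grid):
    """Return the (lambda, gamma) pair with the smallest squared error."""
    return min(grid, key=grid.get)


best = best_learning_rates(errors)
print(best, errors[best])
```

On this excerpt the selected pair is λ=0.035 and γ=0.70, the same values assumed in Figure A.1.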

Appendix C Description of the ratings ELOb and ELOg

In this section we provide a brief description of the ratings ELOb and ELOg as defined by the authors of the ratings (Hvattum and Arntzen 2010).

C.1 Description of ELOb

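Let $\ell_0^H$ and $\ell_0^A$ be the ratings, at the start of a match, of the home and away teams respectively. The ELO ratings assume that the home and away teams should score $\gamma^H$ and $\gamma^A$ respectively, where:

$$\gamma^H = \frac{1}{1 + c^{(\ell_0^A - \ell_0^H)/d}}, \qquad \gamma^A = 1 - \gamma^H$$

and the parameters c and d serve only to set a scale of the ratings. The authors suggest that we use c=10 and d=400 (but alternative values of c and d give identical rating systems). Assuming that the actual score for the home team follows:

$$\alpha^H = \begin{cases} 1 & \text{if the home team wins} \\ 0.5 & \text{if the match is drawn} \\ 0 & \text{if the home team loses} \end{cases}$$

then the actual score for the away team is $\alpha^A = 1 - \alpha^H$. At the end of the match, the revised ELO rating for the home team is (the away team’s $\ell_1^A$ is calculated in the same way):

$$\ell_1^H = \ell_0^H + k(\alpha^H - \gamma^H)$$

with k=20 as a suitable parameter value.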
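The ELOb update can be sketched in Python as follows. This is a minimal illustration that assumes the standard ELO expected-score and update formulas of Hvattum and Arntzen (2010), with an expected home score of 1/(1 + c^((away − home)/d)) and a revised rating of old rating + k × (actual − expected). The function names are hypothetical and do not come from the paper.

```python
def elo_expected_home(rating_home, rating_away, c=10.0, d=400.0):
    """Expected score of the home team (assumed standard ELO formula)."""
    return 1.0 / (1.0 + c ** ((rating_away - rating_home) / d))


def actual_home_score(goals_home, goals_away):
    """Actual score: 1 for a home win, 0.5 for a draw, 0 for a home loss."""
    if goals_home > goals_away:
        return 1.0
    if goals_home == goals_away:
        return 0.5
    return 0.0


def elo_b_update(rating_home, rating_away, goals_home, goals_away,
                 k=20.0, c=10.0, d=400.0):
    """Return the revised (home, away) ratings after one match."""
    gamma_home = elo_expected_home(rating_home, rating_away, c, d)
    alpha_home = actual_home_score(goals_home, goals_away)
    # The away team's expected and actual scores are the complements.
    gamma_away = 1.0 - gamma_home
    alpha_away = 1.0 - alpha_home
    new_home = rating_home + k * (alpha_home - gamma_home)
    new_away = rating_away + k * (alpha_away - gamma_away)
    return new_home, new_away


# Two equally rated teams; the home side wins 2-0.
print(elo_b_update(0.0, 0.0, 2, 0))
```

Because the two teams' expected scores sum to 1 and so do their actual scores, the points gained by one side equal the points lost by the other, so the sum of the two ratings is unchanged by the update.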

C.2 Description of ELOg

The ELOg rating is a variant of ELOb that also takes score difference into account; the only difference is that k is replaced by the expression:

$$k = k_0 (1 + \delta)^{\lambda}$$

where δ is the absolute goal difference, and k0>0 and λ>0 are fixed parameters; the authors suggest k0=10 and λ=1 as suitable parameter values.
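The goal-difference-dependent k of ELOg can be sketched as a small Python function. It assumes the expression k = k0(1 + δ)^λ from Hvattum and Arntzen (2010); the function name `elo_g_k` is hypothetical.

```python
def elo_g_k(goals_home, goals_away, k0=10.0, lam=1.0):
    """Update weight k for ELOg, assuming k = k0 * (1 + delta) ** lam,
    where delta is the absolute goal difference of the match."""
    if k0 <= 0 or lam <= 0:
        raise ValueError("k0 and lam must both be positive")
    delta = abs(goals_home - goals_away)
    return k0 * (1.0 + delta) ** lam


# With the suggested k0=10 and lam=1, a draw keeps k at k0,
# while a three-goal margin quadruples it.
print(elo_g_k(1, 1))
print(elo_g_k(3, 0))
```

The returned value would take the place of the fixed k in the ELOb update, so larger winning margins move the ratings further.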

Appendix D Optimised [k] and [k0, λ] values for the ratings ELOb and ELOg.

Figure D.1 Optimised k0 and λ values for ELOg. Minimum squared error of expected goal difference observed when k0=2 and λ=2.8, where e=0.3405.


Figure D.2 Optimised k-value for ELOb. Minimum squared error of expected goal difference observed when k=56, where e=0.3514.


  1. It might also be worth mentioning that the ELO rating algorithm was featured prominently in the popular movie The Social Network (also known as the Facebook movie), in a scene where Eduardo Saverin writes the mathematical formula for the ELO rating system on Zuckerberg’s dorm room window.

  2. If the rating is applied to a single league competition, the average team in that league will have a rating of 0. If the rating is applied to more than one league in which adversaries between the different leagues (or cup competitions) play against each other, the average team over all leagues will have a rating of 0.

  3. If the prediction is +4 in favour of the home side, then an actual result of 5–0 gives an error of approximately 1. But if the prediction is 0 in favour of the home side and the actual result is 1–0, this gives the same error.

  4. The learning parameters could have been optimised based on predictions of type {H, D, A} (corresponding to home win, draw and away win), based on profitability, based on scoring rules, or based on many other accuracy measurements and metrics. We have chosen score difference for optimising the learning parameters since the pi-ratings themselves are exclusively determined by that information.

  5. The first five EPL seasons (1992/1993 to 1996/1997) are solely considered for generating the initial ratings for the competing teams. This is important because training the model on ignorant team ratings (i.e., starting from 0) will negatively affect the training procedure. Thus, learning parameters λ and γ are trained during the subsequent ten seasons, 1997/1998 to 2006/2007 inclusive.

  6. For the pi-rating system the ratings are segregated into intervals of 0.10 (from ≤–1.1 to >1.6), for ELOb the ratings are segregated into intervals of 25 (from ≤–330 to >345), and for ELOg the ratings are segregated into intervals of 35 (from ≤–475 to >470).

  7. Assumes a profit margin of 5%.

  8. For the newly promoted team Wolves the development of the ratings starts at match instance 760, since no performances have been recorded relative to the residual EPL teams during the two preceding seasons.

  9. Where the pi-ratings of the home and away team follow ∼Normal(x, y) distributions for capturing rating uncertainty, where x is the pi-rating value ($R_{\alpha H}$ or $R_{\beta A}$) and y is the pi-rating variance, which can be measured over n preceding match instances.
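The Normal(x, y) description in the last note can be illustrated with a short Python sketch. It assumes that a team's rating after each match is kept in a list, that x is the most recent rating, and that y is the sample variance of the last n ratings. The choice of sample rather than population variance, and the name `rating_distribution`, are assumptions made for illustration only.

```python
from statistics import variance


def rating_distribution(rating_history, n):
    """Return (x, y) for a Normal(x, y) over a team's rating.

    x is the latest rating; y is the sample variance of the ratings
    recorded over the n preceding match instances (an assumed reading
    of the note, not a definition taken from the paper).
    """
    if n < 2:
        raise ValueError("need at least two match instances for a variance")
    if len(rating_history) < n:
        raise ValueError("not enough match instances recorded")
    window = rating_history[-n:]
    return window[-1], variance(window)


# Hypothetical rating history for one team.
x, y = rating_distribution([0.0, 0.1, 0.2, 0.3], n=3)
print(x, y)
```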

References

Baio, G., and M. Blangiardo. 2010. “Bayesian Hierarchical Model for the Prediction of Football Results.” Journal of Applied Statistics 37(2):253–264.

Buchner, A., W. Dubitzky, A. Schuster, P. Lopes, P. O’Doneghue, J. Hughes, D. A. Bell, K. Adamson, J. A. White, J. M. C. C. Anderson and M. D. Mulvenna. 1997. Corporate Evidential Decision Making in Performance Prediction Domains. Proceedings of the Thirteenth Conference on Uncertainty in Artificial Intelligence (UAI ’97). Providence, Rhode Island, USA: Brown University.

Clarke, S. R. and J. M. Norman. 1995. “Home Ground Advantage of Individual Clubs in English Soccer.” The Statistician 44:509–521.

Constantinou, A. C. and N. E. Fenton. 2012. Evidence of an (Intended) Inefficient Association Football Gambling Market. Under Review. Draft available at: http://constantinou.info/downloads/papers/evidenceofinefficiency.pdf.

Constantinou, A. C., N. E. Fenton, and M. Neil. 2012a. “pi-football: A Bayesian Network Model for Forecasting Association Football Match Outcomes.” Knowledge-Based Systems, 322–339. Draft available at: http://www.constantinou.info/downloads/papers/pi-model11.pdf.

Constantinou, A. C., N. E. Fenton, and M. Neil. 2012b. Profiting from an Inefficient Association Football Gambling Market: Prediction, Risk and Uncertainty Using Bayesian Networks. Under Review. Draft available at: http://www.constantinou.info/downloads/papers/pi-model12.pdf.

Crowder, M., M. Dixon, A. Ledford and M. Robinson. 2002. “Dynamic Modelling and Prediction of English Football League Matches for Betting.” The Statistician 51:157–168.

Dixon, M., and S. Coles. 1997. “Modelling Association Football Scores and Inefficiencies in the Football Betting Market.” Applied Statistics 46:265–280.

Dixon, M., and P. Pope. 2004. “The Value of Statistical Forecasts in the UK Association Football Betting Market.” International Journal of Forecasting 20:697–711.

Dunning, E. 1999. Sport Matters: Sociological Studies of Sport, Violence and Civilisation. London: Routledge.

Dunning, E. G., A. Joseph and R. E. Maguire. 1993. The Sports Process: A Comparative and Developmental Approach. p. 129. Champaign: Human Kinetics.

Elo, A. E. 1978. The Rating of Chess Players, Past and Present. New York: Arco Publishing.

Fenton, N. E. and M. Neil. 2012. Risk Assessment and Decision Analysis with Bayesian Networks. London: Chapman and Hall.

FIFA. 2012. FIFA. Retrieved March 27, 2012, from FIFA/Coca-Cola World Ranking Procedure: http://www.fifa.com/worldranking/procedureandschedule/menprocedure/index.html.

Football-Data. 2012. Football-Data.co.uk. Retrieved August 2, 2012, from Football Results, Statistics & Soccer Betting Odds Data: http://www.football-data.co.uk/englandm.php.

Forrest, D., J. Goddard and R. Simmons. 2005. “Odds-Setters as Forecasters: The Case of English Football.” International Journal of Forecasting 21:551–564.

Goddard, J. 2005. “Regression Models for Forecasting Goals and Match Results in Association Football.” International Journal of Forecasting 21:331–340.

Goddard, J. and I. Asimakopoulos. 2004. “Forecasting Football Results and the Efficiency of Fixed-odds Betting.” Journal of Forecasting 23:51–66.

Halicioglu, F. 2005a. “Can We Predict the Outcome of the International Football Tournaments?: The Case of Euro 2000.” Doğuş Üniversitesi Dergisi 6:112–122.

Halicioglu, F. 2005b. Forecasting the Professional Team Sporting Events: Evidence from Euro 2000 and 2004 Football Tournaments. 5th International Conference on Sports and Culture: Economic, Management and Marketing Aspects. Athens, Greece, pp. 30–31.

Harville, D. A. 1977. “The Use of Linear-model Methodology to Rate High School or College Football Teams.” Journal of the American Statistical Association 72:278–289.

Hirotsu, N. and M. Wright. 2003. “An Evaluation of Characteristics of Teams in Association Football by Using a Markov Process Model.” The Statistician 52(4):591–602.

Hvattum, L. M. and H. Arntzen. 2010. “Using ELO Ratings for Match Result Prediction in Association Football.” International Journal of Forecasting 26:460–470.

Joseph, A., N. Fenton and M. Neil. 2006. “Predicting Football Results Using Bayesian Nets and Other Machine Learning Techniques.” Knowledge-Based Systems 7:544–553.

Karlis, D. and I. Ntzoufras. 2000. “On Modelling Soccer Data.” Student 229–244.

Karlis, D. and I. Ntzoufras. 2003. “Analysis of Sports Data by Using Bivariate Poisson Models.” The Statistician 52(3):381–393.

Knorr-Held, L. 1997. Hierarchical Modelling of Discrete Longitudinal Data, Applications of Markov Chain Monte Carlo. Munich: Utz.

Knorr-Held, L. 2000. “Dynamic Rating of Sports Teams.” The Statistician 49(2):261–276.

Koning, R. 2000. “Balance in Competition in Dutch Soccer.” The Statistician 49(3):419–431.

Koning, R., H. Koolhaas, M. Renes and G. Ridder. 2003. “A Simulation Model for Football Championships.” European Journal of Operational Research 148:268–276.

Kuonen, D. 1996. Statistical Models for Knock-Out Soccer Tournaments. Technical Report, Department of Mathematics, École Polytechnique Fédérale de Lausanne.

Kuypers, T. 2000. “Information and Efficiency: an Empirical Study of a Fixed Odds Betting Market.” Applied Economics 32:1353–1363.

Lee, A. J. 1997. “Modeling Scores in the Premier League: is Manchester United Really the Best?” Chance 10:15–19.

Leitner, C., A. Zeileis and K. Hornik. 2010. “Forecasting Sports Tournaments by Ratings of (prob)abilities: A Comparison for the EURO 2008.” International Journal of Forecasting 26:471–481.

Maher, M. J. 1982. “Modelling Association Football Scores.” Statistica Neerlandica 36:109–118.

Min, B., J. Kim, C. Choe, H. Eom, and R. B. McKay. 2008. “A Compound Framework for Sports Results Prediction: A Football Case Study.” Knowledge-Based Systems 21:551–562.

Mueller, F. O., R. C. Cantu and S. P. Camp. 1996. Catastrophic Injuries in High School and College Sports. Champaign: Human Kinetics, p. 57.

Murali, V. 2011, October 28. Bleacher Report. Retrieved March 28, 2012, from World Football: 40 Biggest Scandals in Football History: http://bleacherreport.com/articles/909932-world-football-40-biggest-scandals-in-football-history.

Poulter, D. R. 2009. “Home Advantage and Player Nationality in International Club Football.” Journal of Sports Sciences 27(8):797–805.

Reid, D. A. and M. S. Nixon. 2011. “Using Comparative Human Descriptions for Soft Biometrics.” International Joint Conference on Biometrics (IJCB) 2011.

Rotshtein, A., M. Posner and A. Rakytyanska. 2005. “Football Predictions Based on a Fuzzy Model with Genetic and Neural Tuning.” Cybernetics and Systems Analysis 41(4):619–630.

Rue, H. and O. Salvesen. 2000. “Prediction and Retrospective Analysis of Soccer Matches in a League.” The Statistician 3:339–418.

Tsakonas, A., G. Dounias, S. Shtovba and V. Vivdyuk. 2002. Soft Computing-Based Result Prediction of Football Games. The First International Conference on Inductive Modelling (ICIM 2002). Lviv, Ukraine.

Published Online: 2013-03-30

©2013 by Walter de Gruyter Berlin Boston