Published by De Gruyter, February 17, 2016

Is there a Pythagorean theorem for winning in tennis?

Stephanie Ann Kovalchik

Abstract

Bill James’ discovery of a Pythagorean formula for win expectation in baseball has been a useful resource to analysts and coaches for over 30 years. Extensions of the Pythagorean model have been developed for all of the major professional team sports but for none of the individual sports. The present paper attempts to address this gap by deriving a Pythagorean model for win production in tennis. Using performance data for the top 100 male singles players between 2004 and 2014, this study shows that, among the most commonly reported performance statistics, a model of break points won provides the closest approximation to the Pythagorean formula, explaining 85% of the variation in season wins and having the lowest cross-validation prediction error among the models considered. The mid-season projections of the break point model performed comparably to those of an expanded model that included eight other serve and return statistics as well as player ranking. A simple match prediction algorithm based on a break point model with the previous 9 months of match history had a prediction accuracy of 67% when applied to 2015 match outcomes, whether using the least-squares or the Pythagorean power coefficient. By demonstrating the striking similarity between the Pythagorean formula for baseball wins and the break point model for match wins in tennis, this paper identifies a simple yet potentially powerful analytic tool with a wide range of uses in player performance evaluation and match forecasting.
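For readers unfamiliar with the Pythagorean form, the following is a minimal sketch rather than the paper's own implementation. James' baseball formula estimates the win fraction as runs scored squared divided by the sum of runs scored squared and runs allowed squared. The sketch assumes the tennis analogue simply substitutes break points won and break points conceded for runs; the abstract does not spell out the exact specification. The function names, the default power of 2, and the log-ratio regression through the origin used to estimate a least-squares power are all illustrative assumptions.

```python
import math


def pythagorean_expectation(won, conceded, power=2.0):
    """Expected win fraction: won**power / (won**power + conceded**power).

    In James' baseball formula, `won` is runs scored and `conceded` is runs
    allowed. Using break points won and conceded here is an assumption made
    for illustration.
    """
    if won < 0 or conceded < 0:
        raise ValueError("counts must be non-negative")
    if won == 0 and conceded == 0:
        return 0.5  # no information; treating this as even is an assumption
    w = won ** power
    return w / (w + conceded ** power)


def fit_power(records):
    """Least-squares power from (bp_won, bp_conceded, wins, losses) records.

    Fits log(wins / losses) = power * log(bp_won / bp_conceded) through the
    origin. All four counts must be positive. This is one common way to
    estimate a Pythagorean exponent, not necessarily the paper's procedure.
    """
    sxy = 0.0
    sxx = 0.0
    for bp_won, bp_conceded, wins, losses in records:
        x = math.log(bp_won / bp_conceded)
        y = math.log(wins / losses)
        sxy += x * y
        sxx += x * x
    if sxx == 0.0:
        raise ValueError("need at least one record with bp_won != bp_conceded")
    return sxy / sxx


# Hypothetical season totals, invented purely to exercise the functions.
season = [(20, 10, 40, 10), (30, 10, 45, 5), (10, 20, 10, 40)]
estimated_power = fit_power(season)
expected_fraction = pythagorean_expectation(30, 10, estimated_power)
```

Fitting on the log-ratio scale keeps the estimate a one-line closed form, which makes it easy to compare a fitted power against the fixed value of 2.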


Corresponding author: Stephanie Ann Kovalchik, Institute of Sport, Exercise & Active Living, Victoria University, Footscray Park, VIC, Australia, Tel.: +61 450 509 098

Acknowledgments

I am grateful to the staff at the ATP and flashscore.com for making a vast amount of tennis data available to the public and the research presented in this paper possible.

Published Online: 2016-2-17
Published in Print: 2016-3-1

©2016 by De Gruyter

Downloaded on 30.1.2023 from https://www.degruyter.com/document/doi/10.1515/jqas-2015-0057/html