
Journal of Quantitative Analysis in Sports

An official journal of the American Statistical Association

Editor-in-Chief: Steve Rigdon, PhD

CiteScore 2018: 1.67

SCImago Journal Rank (SJR) 2018: 0.587
Source Normalized Impact per Paper (SNIP) 2018: 1.970

Volume 9, Issue 2



Ranking rankings: an empirical comparison of the predictive power of sports ranking methods

Daniel Barrow / Ian Drayer / Peter Elliott / Garren Gaut / Braxton Osting
Published Online: 2013-05-27 | DOI: https://doi.org/10.1515/jqas-2013-0013


In this paper, we empirically evaluate the predictive power of eight sports ranking methods. For each ranking method, we implement two versions, one using only win-loss data and one utilizing score-differential data. The methods are compared on 4 datasets: 32 National Basketball Association (NBA) seasons, 112 Major League Baseball (MLB) seasons, 22 NCAA Division 1-A Basketball (NCAAB) seasons, and 56 NCAA Division 1-A Football (NCAAF) seasons. For each season of each dataset, we apply 20-fold cross validation to determine the predictive accuracy of the ranking methods. The non-parametric Friedman hypothesis test is used to assess whether the predictive errors for the considered rankings over the seasons are statistically dissimilar. The post-hoc Nemenyi test is then employed to determine which ranking methods have significant differences in predictive power. For all datasets, the null hypothesis – that all ranking methods are equivalent – is rejected at the 99% confidence level. For NCAAF and NCAAB datasets, the Nemenyi test concludes that the implementations utilizing score-differential data are usually more predictive than those using only win-loss data. For the NCAAF dataset, the least squares and random walker methods have significantly better predictive accuracy at the 95% confidence level than the other methods considered.
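As a rough illustration of the Friedman step described above, the following sketch ranks the methods within each season by predictive error and computes the Friedman chi-square statistic from the mean ranks. It is a minimal sketch and not the authors' implementation: the function names and the error values are hypothetical, and no tie correction is applied to the statistic.

```python
from typing import List, Tuple


def average_ranks(values: List[float]) -> List[float]:
    """Rank values in ascending order (1 = smallest); ties get their mean rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    pos = 0
    while pos < len(order):
        end = pos
        # Extend the run while the next value ties with the current one.
        while end + 1 < len(order) and values[order[end + 1]] == values[order[pos]]:
            end += 1
        mean_rank = (pos + end) / 2 + 1  # mean of the 1-based positions in the run
        for idx in order[pos:end + 1]:
            ranks[idx] = mean_rank
        pos = end + 1
    return ranks


def friedman_statistic(errors: List[List[float]]) -> Tuple[float, List[float]]:
    """errors[s][m] is the predictive error of method m in season s (hypothetical layout)."""
    n = len(errors)      # number of seasons (blocks)
    k = len(errors[0])   # number of ranking methods (treatments)
    rank_sums = [0.0] * k
    for row in errors:
        for m, r in enumerate(average_ranks(row)):
            rank_sums[m] += r
    mean_ranks = [s / n for s in rank_sums]
    chi2 = 12 * n / (k * (k + 1)) * (
        sum(r * r for r in mean_ranks) - k * (k + 1) ** 2 / 4
    )
    return chi2, mean_ranks


# Hypothetical errors for 3 methods over 4 seasons (made-up numbers).
example_errors = [
    [0.30, 0.35, 0.40],
    [0.28, 0.33, 0.41],
    [0.31, 0.36, 0.39],
    [0.29, 0.34, 0.42],
]
chi2, mean_ranks = friedman_statistic(example_errors)
```

Under the null hypothesis that all methods are equivalent, the statistic is approximately chi-square distributed with k − 1 degrees of freedom, so it would be compared against the corresponding quantile. The mean ranks are the quantities a post-hoc test such as Nemenyi's would then compare pairwise.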

Keywords: cross validation; Friedman test; Nemenyi test; hypothesis testing; sports rankings


  • Berry, S. M. 2003. “A Statistician Reads the Sports Pages: College Football Rankings: The BCS and the CLT.” Chance 16:46–49.

  • Bradley, R. A. and M. E. Terry. 1952. “Rank Analysis of Incomplete Block Designs: I. The Method of Paired Comparisons.” Biometrika 39:324–345.

  • Burer, S. 2012. “Robust Rankings for College Football.” Journal of Quantitative Analysis in Sports 8(2).

  • Callaghan, T., P. J. Mucha, and M. A. Porter. 2007. “Random Walker Ranking for NCAA Division I-A Football.” American Mathematical Monthly 114:761–777.

  • CBB. 2012. “http://www.sports-reference.com/cbb/.” webpage accessed November 1, 2012.

  • CFB. 2012. “http://www.sports-reference.com/cfb/.” webpage accessed October 29, 2012.

  • Chan, V. 2011. “Prediction Accuracy of Linear Models for Paired Comparisons in Sports.” Journal of Quantitative Analysis in Sports 7(3), Article 18.

  • Chartier, T. P., E. Kreutzer, A. N. Langville, and K. E. Pedings. 2011a. “Sensitivity and Stability of Ranking Vectors.” SIAM Journal on Scientific Computing 33:1077–1102.

  • Chartier, T. P., E. Kreutzer, A. N. Langville, and K. E. Pedings. 2011b. “Sports Ranking with Nonuniform Weighting.” Journal of Quantitative Analysis in Sports 7(3), Article 6.

  • Colley, W. N. 2002. “Colley’s Bias-Free College Football Ranking Method: The Colley Matrix Explained.” Technical report, Princeton University.

  • David, H. A. 1963. The Method of Paired Comparisons. Charles Griffin & Co.

  • Demšar, J. 2006. “Statistical Comparisons of Classifiers Over Multiple Data Sets.” JMLR 7:1–30.

  • Dwork, C., R. Kumar, M. Naor, and D. Sivakumar. 2001a. “Rank Aggregation Methods for the Web.” pp. 613–622, in: Proceedings of the 10th International Conference on World Wide Web. ACM.

  • Dwork, C., R. Kumar, M. Naor, and D. Sivakumar. 2001b. “Rank Aggregation Revisited.” pp. 613–622, in: Proceedings International Conference World Wide Web (WWW10).

  • Elo, A. E. 1961. “The New U.S.C.F. Rating System.” Chess Life 16:160–161.

  • Foulds, L. R. 1992. Graph Theory Applications. Springer.

  • Gill, R. 2009. “Assessing Methods for College Football Rankings.” Journal of Quantitative Analysis in Sports 5(2), Article 3.

  • Glickman, M. E. 1995. “A Comprehensive Guide to Chess Ratings.” American Chess Journal 3:59–102.

  • Harville, D. 1977. “The Use of Linear-Model Methodology to Rate High School or College Football Teams.” Journal of the American Statistical Association 72:278–289.

  • Herbrich, R., T. Minka, and T. Graepel. 2007. “Trueskill: A Bayesian Skill Rating System.” Advances in Neural Information Processing Systems 19:569.

  • Hirani, A. N., K. Kalyanaraman, and S. Watts. 2011. “Least Squares Ranking on Graphs.” arXiv:1011.1716v4.

  • Hochbaum, D. S. 2010. “The Separation and Separation-Deviation Methodology for Group Decision Making and Aggregate Ranking.” TutORials in Operations Research 7:116–141.

  • Horn, R. A. and C. R. Johnson. 1991. Matrix Analysis. Cambridge University Press.

  • Jiang, X., L.-H. Lim, Y. Yao, and Y. Ye. 2010. “Statistical Ranking and Combinatorial Hodge Theory.” Mathematical Programming Ser. B 127:203–244.

  • Keener, J. P. 1993. “The Perron-Frobenius Theorem and the Ranking of Football Teams.” SIAM Review 35:80–93.

  • Langville, A. N. and C. D. Meyer. 2012. Who’s #1?: The Science of Rating and Ranking. Princeton University Press.

  • Leake, R. 1976. “A Method for Ranking Teams: With an Application to College Football.” Management Science in Sports 4:27–46.

  • Massey, K. 1997. Statistical Models Applied to the Rating of Sports Teams, Master’s thesis, Bluefield College.

  • Miwa, T. 2012. “http://cse.niaes.affrc.go.jp/miwa/probcalc/s-range/.” webpage accessed November 26, 2012.

  • MLB. 2012. “http://www.baseball-reference.com/.” webpage accessed October 29, 2012.

  • NBA. 2012. “http://www.basketball-reference.com/.” webpage accessed October 29, 2012.

  • Osting, B., C. Brune, and S. Osher. 2013a. “Enhanced statistical rankings via targeted data collection.” JMLR, W&CP 28(1):489–497.

  • Osting, B., J. Darbon, and S. Osher. 2013b. “Statistical Ranking Using the ℓ1-Norm on Graphs.” accepted to AIMS J. Inverse Problems and Imaging.

  • Page, L., S. Brin, R. Motwani, and T. Winograd. 1999. “The PageRank Citation Ranking: Bringing Order to the Web.” Technical report, Stanford InfoLab Technical Report 1999–66.

  • Pickle, D. and B. Howard. 1981. “Computer to Aid in Basketball Championship Selection.” NCAA News 4.

  • Shaffer, J. P. 1995. “Multiple Hypothesis Testing.” Annual Review of Psychology, 46:561–584.

  • Stefani, R. T. 1977. “Football and Basketball Predictions Using Least Squares.” IEEE Transactions on Systems, Man, and Cybernetics 7:117–121.

  • Stefani, R. T. 1980. “Improved Least Squares Football, Basketball, and Soccer Predictions.” IEEE Transactions on Systems, Man, and Cybernetics 10:116–123.

  • Stefani, R. 2011. “The Methodology of Officially Recognized International Sports Rating Systems.” Journal of Quantitative Analysis in Sports 7(4), Article 10.

  • Tran, N. M. 2011. “Pairwise Ranking: Choice of Method Can Produce Arbitrarily Different Rank Order.” arXiv:1103.1110v1.

  • Trono, J. A. 2010. “Rating/Ranking Systems, Post-Season Bowl Games, and ‘The Spread’.” Journal of Quantitative Analysis in Sports 6(3), Article 6.

  • Xu, Q., Y. Yao, T. Jiang, Q. Huang, B. Yan, and W. Lin. 2011. “Random Partial Paired Comparison for Subjective Video Quality Assessment via HodgeRank.” pp. 393–402, in Proceedings of the 19th ACM International Conference on Multimedia.

About the article

Corresponding author: Braxton Osting, UCLA, Department of Mathematics, 405 Hilgard Avenue, Los Angeles, CA 90095, USA, Tel.: +3108252601

Published Online: 2013-05-27

Published in Print: 2013-06-01


In equation (2), we take the fraction to be [expression missing from source] if the game results in a 0–0 tie.

A digraph is weakly connected if replacing its arcs with undirected edges yields a connected graph.
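The definition above translates directly into a check: drop the arc directions and test ordinary connectivity with a breadth-first search. The sketch below is illustrative only; the function name and the node and arc representation are assumptions, not taken from the paper.

```python
from collections import defaultdict, deque


def is_weakly_connected(nodes, arcs):
    """Return True if the digraph is connected once arc directions are ignored."""
    nodes = list(nodes)
    if not nodes:
        return True
    # Build an undirected adjacency structure: each arc (u, v) links u and v both ways.
    neighbors = defaultdict(set)
    for u, v in arcs:
        neighbors[u].add(v)
        neighbors[v].add(u)
    # Breadth-first search from an arbitrary node.
    seen = {nodes[0]}
    queue = deque([nodes[0]])
    while queue:
        u = queue.popleft()
        for v in neighbors[u]:
            if v not in seen:
                seen.add(v)
                queue.append(v)
    return seen >= set(nodes)


# Hypothetical three-team example: arcs A->B and C->B.
connected = is_weakly_connected("ABC", [("A", "B"), ("C", "B")])
# Team C has no recorded game here, so it is isolated.
disconnected = is_weakly_connected("ABC", [("A", "B")])
```

In the first example no team can reach every other team along directed arcs, yet the graph is still weakly connected because every team is linked to another by at least one game.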

Recall that for an irreducible matrix with non-negative entries, there exists a positive, real eigenvalue (called the Perron-Frobenius eigenvalue) such that no other eigenvalue is larger in magnitude. The Perron-Frobenius eigenvalue is simple and the corresponding eigenvector (called the Perron-Frobenius eigenvector) has non-negative entries. See, for example, Horn and Johnson (1991).
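One common way to approximate the Perron-Frobenius eigenvalue and eigenvector numerically is power iteration. The sketch below assumes the matrix is primitive (for example, all entries positive), which is what makes the iteration converge. The function name and the 2×2 example matrix are hypothetical, and this is not necessarily how the authors computed their rankings.

```python
def perron_vector(matrix, max_iter=1000, tol=1e-12):
    """Approximate the dominant eigenvalue and eigenvector by power iteration.

    The eigenvector is normalized so that its entries sum to 1.
    """
    n = len(matrix)
    x = [1.0 / n] * n  # positive starting vector with unit sum
    eigenvalue = 0.0
    for _ in range(max_iter):
        # One matrix-vector product: y = matrix @ x.
        y = [sum(row[j] * x[j] for j in range(n)) for row in matrix]
        # Because x sums to 1, sum(y) estimates the eigenvalue near convergence.
        eigenvalue = sum(y)
        if eigenvalue == 0.0:
            raise ValueError("iteration collapsed to the zero vector")
        new_x = [value / eigenvalue for value in y]
        converged = max(abs(a - b) for a, b in zip(new_x, x)) < tol
        x = new_x
        if converged:
            break
    return eigenvalue, x


# Hypothetical positive matrix with eigenvalues 4 and -1.
lam, vec = perron_vector([[1.0, 2.0], [3.0, 2.0]])
```

For this example the iteration settles on an eigenvalue of about 4 with eigenvector proportional to (2, 3), that is, roughly (0.4, 0.6) after normalization. In a ranking setting the entries of such a vector would serve as team ratings.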

The matrices W and S are irreducible if the corresponding directed graph is strongly connected. The matrix W is irreducible if there is no partition of the teams V = V1 ∪ V2 such that no team in V1 has beaten a team in V2.
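Strong connectivity can be checked by confirming that one fixed team reaches every other team along the arcs, and is reached by every other team. The sketch below does this with two depth-first searches, one on the graph and one on its reverse. The function names and the (winner, loser) arc convention are assumptions for illustration; the result does not depend on which direction the arcs point, since reversing every arc preserves strong connectivity.

```python
from collections import defaultdict


def reachable(start, adjacency):
    """Return the set of nodes reachable from start by following adjacency."""
    seen = {start}
    stack = [start]
    while stack:
        u = stack.pop()
        for v in adjacency[u]:
            if v not in seen:
                seen.add(v)
                stack.append(v)
    return seen


def is_strongly_connected(teams, wins):
    """wins is a list of (winner, loser) pairs, one arc per recorded win."""
    teams = set(teams)
    if not teams:
        return True
    forward = defaultdict(set)
    backward = defaultdict(set)
    for winner, loser in wins:
        forward[winner].add(loser)
        backward[loser].add(winner)
    start = next(iter(teams))
    # Strongly connected iff start reaches everyone and everyone reaches start.
    return reachable(start, forward) >= teams and reachable(start, backward) >= teams


# Hypothetical seasons: a cycle of wins versus a strict hierarchy.
cycle = is_strongly_connected("ABC", [("A", "B"), ("B", "C"), ("C", "A")])
hierarchy = is_strongly_connected("ABC", [("A", "B"), ("A", "C"), ("B", "C")])
```

In the hierarchy example, taking V1 = {C} and V2 = {A, B} gives a partition in which no team in V1 has beaten a team in V2, so the corresponding matrix would be reducible.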

Citation Information: Journal of Quantitative Analysis in Sports, Volume 9, Issue 2, Pages 187–202, ISSN (Online) 1559-0410, ISSN (Print) 2194-6388, DOI: https://doi.org/10.1515/jqas-2013-0013.


©2013 by Walter de Gruyter Berlin Boston.

