Published by De Gruyter February 24, 2015

Building an NCAA men’s basketball predictive model and quantifying its success

Michael J. Lopez and Gregory J. Matthews


Computing and machine learning advancements have led to the creation of many cutting-edge predictive algorithms, some of which have been demonstrated to provide more accurate forecasts than traditional statistical tools. In this manuscript, we provide evidence that the combination of modest statistical methods with informative data can meet or exceed the accuracy of more complex models when it comes to predicting the NCAA men’s basketball tournament. First, we describe a prediction model that uses logistic regression to merge the point spreads set by Las Vegas sportsbooks with possession-based team efficiency metrics. The set of probabilities generated from this model most accurately predicted the 2014 tournament, relative to approximately 400 competing submissions, as judged by the log loss function. Next, we attempt to quantify the degree to which luck played a role in the success of this model by simulating tournament outcomes under different sets of true underlying game probabilities. We estimate that under the most optimistic of game probability scenarios, our entry had roughly a 12% chance of outscoring all competing submissions and just under a 50% chance of finishing with one of the ten best scores.
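As a rough illustration of the two ideas in the abstract, the sketch below computes the log loss of a set of predicted win probabilities and then estimates, by simulation, how often one entry would beat a set of rivals if the true game probabilities were known. It is not the authors' code. The function names (`log_loss`, `win_share`), the clipping constant, and the simplification that a fixed slate of independent games is played (rather than a full bracket) are all assumptions made for this example.

```python
import math
import random

def log_loss(probs, outcomes, eps=1e-15):
    """Mean negative log-likelihood of binary outcomes (1 = first team won)."""
    total = 0.0
    for p, y in zip(probs, outcomes):
        p = min(max(p, eps), 1 - eps)  # clip so log(0) never occurs
        total += y * math.log(p) + (1 - y) * math.log(1 - p)
    return -total / len(probs)

def win_share(entry, rivals, true_probs, n_sims=2000, seed=0):
    """Fraction of simulated game slates in which `entry` has the strictly
    lowest log loss among all submissions (hypothetical helper)."""
    rng = random.Random(seed)
    wins = 0
    for _ in range(n_sims):
        # Draw each game's result independently from the assumed true probability.
        outcomes = [1 if rng.random() < p else 0 for p in true_probs]
        best_rival = min(log_loss(r, outcomes) for r in rivals)
        if log_loss(entry, outcomes) < best_rival:
            wins += 1
    return wins / n_sims

# Illustrative numbers only: the entry matches the assumed true probabilities,
# which corresponds to the most favorable scenario for that entry.
true_probs = [0.9, 0.8, 0.7, 0.6]
rivals = [[0.5, 0.5, 0.5, 0.5], [0.95, 0.9, 0.6, 0.5]]
share = win_share(true_probs, rivals, true_probs)
```

Even when an entry equals the true probabilities, `share` will generally fall below 1, because a rival with more extreme predictions can score better on any single realized set of outcomes. That is the sense in which luck contributes to winning a log loss contest.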

Corresponding author: Michael J. Lopez, Skidmore College – Mathematics and Computer Science, 815 N. Broadway Harder Hall, Saratoga Springs, New York 12866, USA, Tel.: +9784072221, e-mail:



Published Online: 2015-2-24
Published in Print: 2015-3-1

©2015 by De Gruyter
