Abstract
Advances in computing and machine learning have produced many cutting-edge predictive algorithms, some of which have been shown to provide more accurate forecasts than traditional statistical tools. In this manuscript, we provide evidence that combining modest statistical methods with informative data can meet or exceed the accuracy of more complex models in predicting the NCAA men’s basketball tournament. First, we describe a prediction model that uses logistic regression to merge the point spreads set by Las Vegas sportsbooks with possession-based team efficiency metrics. Judged by the log loss function, the set of probabilities generated by this model predicted the 2014 tournament more accurately than any of approximately 400 competing submissions. Next, we attempt to quantify the degree to which luck played a role in the success of this model by simulating tournament outcomes under different sets of true underlying game probabilities. We estimate that under the most optimistic of game probability scenarios, our entry had roughly a 12% chance of outscoring all competing submissions and slightly less than a 50% chance of finishing with one of the ten best scores.
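As a rough illustration of the two ideas in the abstract, the sketch below computes the log loss of a set of predicted win probabilities and estimates, by simulation, how often one entry would beat a set of rivals under assumed true game probabilities. It is a minimal sketch and not the authors' code: the function names `log_loss` and `simulate_win_rate`, the clipping constant `eps`, and the treatment of games as a fixed list of independent matchups are all assumptions made for illustration.

```python
import math
import random


def log_loss(probs, outcomes, eps=1e-15):
    """Mean negative log-likelihood of binary outcomes (1 = win, 0 = loss)."""
    total = 0.0
    for p, y in zip(probs, outcomes):
        # Clip to avoid log(0) when a prediction is exactly 0 or 1 (assumed eps).
        p = min(max(p, eps), 1 - eps)
        total += y * math.log(p) + (1 - y) * math.log(1 - p)
    return -total / len(probs)


def simulate_win_rate(true_probs, ours, rivals, n_sims=1000, seed=0):
    """Estimate how often `ours` has a strictly lower log loss than every rival.

    Assumption: games are independent and the same fixed set of matchups is
    played in every simulated tournament, which ignores bracket structure.
    """
    rng = random.Random(seed)
    wins = 0
    for _ in range(n_sims):
        # Draw one set of game results from the assumed true probabilities.
        outcomes = [1 if rng.random() < p else 0 for p in true_probs]
        our_score = log_loss(ours, outcomes)
        if all(our_score < log_loss(r, outcomes) for r in rivals):
            wins += 1
    return wins / n_sims
```

For example, `simulate_win_rate(true_probs, ours, rivals)` with `rivals` given as a list of probability lists returns the fraction of simulated tournaments in which `ours` scores best. Even an entry equal to the true probabilities will not win every simulation, which is the sense in which luck enters the final ranking.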
©2015 by De Gruyter