Using Tree Ensembles to Analyze National Baseball Hall of Fame Voting Patterns: An Application to Discrimination in BBWAA Voting

Brian M. Mills 1  and Steven Salaga 2
  • 1 University of Michigan
  • 2 University of Michigan

We predict the induction of Major League Baseball hitters and pitchers into the National Baseball Hall of Fame by the Baseball Writers’ Association of America (BBWAA). We employ a random forest algorithm for binary classification, improving upon past models with a simplistic input approach. Our results suggest that the random forest technique is a fruitful line of research for prediction in the sports world. We find an error rate as low as 0.91% in our most accurate forest, with no out-of-bag error higher than 2.6% in any tree ensemble. We extend the results to examine the possibility of discrimination in BBWAA voting, finding little evidence of exclusion based on race.
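For readers unfamiliar with the out-of-bag error quoted above, the following is a minimal sketch of how such an estimate can be computed for a bagged tree ensemble. It is not the authors' model: it uses one-split decision stumps rather than full trees, tries every feature at the split unless `n_features` is set, and runs on purely synthetic data. All names here (`make_data`, `fit_stump`, `oob_error`, the two features) are hypothetical and chosen for illustration only.

```python
import random

def make_data(n=120, seed=1):
    """Synthetic binary data (an assumption, not Hall of Fame data):
    feature 0 carries signal, feature 1 is pure noise."""
    rng = random.Random(seed)
    rows, labels = [], []
    for _ in range(n):
        y = int(rng.random() < 0.3)
        rows.append((rng.gauss(3.0 if y else 0.0, 1.0), rng.gauss(0.0, 1.0)))
        labels.append(y)
    return rows, labels

def predict_stump(stump, row):
    f, t, polarity = stump
    # polarity 1: predict class 1 when value >= t; polarity 0: when value < t
    return int((row[f] >= t) == bool(polarity))

def fit_stump(rows, labels, feature_ids):
    """Choose the (feature, threshold, polarity) with the fewest training errors."""
    best = None
    for f in feature_ids:
        for t in sorted({r[f] for r in rows}):
            for polarity in (1, 0):
                errors = sum(
                    predict_stump((f, t, polarity), r) != y
                    for r, y in zip(rows, labels)
                )
                if best is None or errors < best[0]:
                    best = (errors, f, t, polarity)
    return best[1:]

def oob_error(rows, labels, n_trees=25, n_features=None, seed=0):
    """Each stump is fit on a bootstrap sample; every row is then scored
    only by the stumps whose bootstrap sample did not contain it."""
    rng = random.Random(seed)
    n, p = len(rows), len(rows[0])
    k = p if n_features is None else n_features
    votes = [[0, 0] for _ in range(n)]  # out-of-bag votes for class 0 / class 1
    for _ in range(n_trees):
        sample = [rng.randrange(n) for _ in range(n)]  # draw with replacement
        in_bag = set(sample)
        feats = rng.sample(range(p), k)  # random feature subset for this stump
        stump = fit_stump([rows[i] for i in sample],
                          [labels[i] for i in sample], feats)
        for i in range(n):
            if i not in in_bag:
                votes[i][predict_stump(stump, rows[i])] += 1
    # Rows that were never out of bag are skipped; ties go to class 0.
    scored = [(v, y) for v, y in zip(votes, labels) if sum(v) > 0]
    wrong = sum(int(v[1] > v[0]) != y for v, y in scored)
    return wrong / len(scored)

rows, labels = make_data()
print(f"out-of-bag error: {oob_error(rows, labels):.3f}")
```

Because each observation is scored only by ensemble members that never saw it during fitting, the out-of-bag error behaves like a built-in validation estimate and needs no separate hold-out set.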


JQAS, an official journal of the American Statistical Association, publishes research on the quantitative aspects of professional and collegiate sports. Articles deal with such subjects as measurements of player performance, tournament structure, and the frequency and occurrence of records. Additionally, the journal serves as an outlet for professionals in the sports world to raise issues and ask questions that relate to quantitative sports analysis.
