Quantitative evaluation of fielding ability in baseball has been an ongoing challenge for statisticians. Detailed recording of ball-in-play data in recent years has spurred the development of sophisticated fielding models. Foremost among these approaches, Jensen et al. (2009) used a hierarchical Bayesian model to estimate spatial fielding curves for individual players. These previous efforts have not addressed the evolution of a player's fielding ability over time. We extend the work of Jensen et al. (2009) to model the fielding ability of individual players over multiple seasons. Several different models are implemented and compared via posterior predictive validation on hold-out data. Among our candidates, we find that a model that imposes shrinkage towards an age-specific average gives the best performance. Our temporal models allow us to distinguish a fielder's performance in a single season from their performance over an entire career.
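To make the age-based shrinkage concrete, here is a minimal Python sketch, not the paper's hierarchical model: it shrinks one hypothetical fielder's raw per-season success rates toward an assumed league-wide age curve, with the pseudo-count strength `kappa` standing in for the hierarchical variance. All numbers are invented for illustration.

```python
import numpy as np

# Hypothetical per-season data for one fielder: age, successes, opportunities.
ages = np.array([24, 25, 26, 27, 28])
made = np.array([310, 298, 305, 280, 270])
opps = np.array([350, 340, 352, 330, 325])

# Assumed league-wide age curve (mean success probability by age); invented numbers.
age_curve = {24: 0.875, 25: 0.880, 26: 0.878, 27: 0.865, 28: 0.850}

# Shrink each season's raw rate toward the age-specific average with
# beta-binomial pseudo-counts; kappa plays the role of the hierarchical
# variance (larger kappa = stronger shrinkage toward the age curve).
kappa = 100.0
for age, m, n in zip(ages, made, opps):
    prior_mean = age_curve[age]
    post_mean = (m + kappa * prior_mean) / (n + kappa)
    print(f"age {age}: raw {m / n:.3f} -> shrunk {post_mean:.3f}")
```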
National Football League teams have complex drafting strategies, based on college and combine performance, that are intended to predict success in the NFL. In this paper, we focus on the tight end position, which is growing in importance as the NFL moves towards a more passing-oriented league. We create separate prediction models for (1) the NFL draft and (2) NFL career performance, based on data available prior to the draft: college performance, NFL combine results, and physical measures. We use linear regression and recursive partitioning decision trees to predict both NFL draft order and NFL career success from this pre-draft data. With both modeling approaches, we find that the measures most predictive of NFL draft order are not necessarily the most predictive of NFL career success. This finding suggests that current drafting strategies for tight ends can be improved upon. After factoring the salary cost of drafted players into our analysis in order to identify the tight ends with the highest value, we find that size measures (BMI, weight, height) are over-emphasized in the NFL draft.
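As a rough illustration of the two modeling approaches, here is a short scikit-learn sketch on synthetic data; the three feature columns are hypothetical stand-ins for pre-draft measures, and the two toy targets are constructed so that different features drive draft order versus career value, echoing the paper's finding. It is not the paper's fitted model.

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.tree import DecisionTreeRegressor

rng = np.random.default_rng(0)

# Synthetic pre-draft features for 120 tight ends; columns are hypothetical
# stand-ins for, e.g., BMI, 40-yard dash time, college receiving yards/game.
X = rng.normal(size=(120, 3))
draft_order = X @ np.array([1.2, 0.8, 0.3]) + rng.normal(size=120)   # toy target
career_value = X @ np.array([0.2, 0.6, 1.1]) + rng.normal(size=120)  # toy target

# Fit both model types to each target; the coefficients and importances
# highlight different features for draft order than for career value.
for name, y in [("draft order", draft_order), ("career value", career_value)]:
    lm = LinearRegression().fit(X, y)
    tree = DecisionTreeRegressor(max_depth=3, random_state=0).fit(X, y)
    print(name, "| linear coefs:", np.round(lm.coef_, 2),
          "| tree importances:", np.round(tree.feature_importances_, 2))
```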
A plethora of statistics have been proposed to measure the effectiveness of pitchers in Major League Baseball. While many of these are quite traditional (e.g., ERA, wins), some have gained currency only recently (e.g., WHIP, K/BB). Some of these metrics may have predictive power, but it is unclear which are the most reliable or consistent. We address this question by constructing a Bayesian random effects model that incorporates a point mass mixture and fitting it to data on twenty metrics spanning approximately 2,500 players and 35 years. Our model identifies FIP, HR/9, ERA, and BB/9 as the highest-signal metrics for starters and GB%, FB%, and K/9 as the highest-signal metrics for relievers. In general, the metrics identified by our model are independent of team defense. Our procedure also provides a relative ranking of metrics, separately for starters and relievers, and shows that these rankings differ quite substantially between the two groups. Our methodology is compared to a Lasso-based procedure and is internally validated by detailed case studies.
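The point mass mixture at the heart of the model is a spike-and-slab prior. The sketch below is illustrative rather than the paper's actual fitting procedure: on simulated data, it computes the closed-form posterior probability that a player's latent effect on a single metric is nonzero, using conjugate normal marginal likelihoods.

```python
import numpy as np
from scipy.stats import norm

rng = np.random.default_rng(1)

# Simulated seasons of one standardized metric: y[j, t] is player j in season t.
# Half the players carry a real latent effect (the slab); half are pure noise.
n_players, n_seasons, sigma, tau, w = 200, 6, 1.0, 0.8, 0.5
alpha = np.where(rng.random(n_players) < w, rng.normal(0, tau, n_players), 0.0)
y = alpha[:, None] + rng.normal(0, sigma, (n_players, n_seasons))

# Spike-and-slab posterior inclusion probability for each player's effect,
# from the conjugate marginal likelihoods of the point mass at zero (spike)
# versus the N(0, tau^2) slab.
ybar = y.mean(axis=1)
se2 = sigma**2 / n_seasons
m0 = norm.pdf(ybar, 0.0, np.sqrt(se2))            # spike: alpha_j = 0
m1 = norm.pdf(ybar, 0.0, np.sqrt(se2 + tau**2))   # slab: alpha_j ~ N(0, tau^2)
p_signal = w * m1 / (w * m1 + (1 - w) * m0)
print("mean inclusion prob (signal vs. null players):",
      p_signal[alpha != 0].mean().round(2), p_signal[alpha == 0].mean().round(2))
```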
Traditional NBA player evaluation metrics are based on scoring differential or some pace-adjusted linear combination of box score statistics like points, rebounds, assists, etc. These measures treat performances with the outcome of the game still in question (e.g., tie score with five minutes left) in exactly the same way as they treat performances with the outcome virtually decided (e.g., when one team leads by 30 points with one minute left). Because they ignore the context in which players perform, these measures can result in misleading estimates of how players help their teams win. We instead use a win probability framework for evaluating the impact NBA players have on their teams' chances of winning. We propose a Bayesian linear regression model to estimate an individual player's impact, after controlling for the other players on the court. We introduce several posterior summaries to derive rank-orderings of players within their team and across the league. This allows us to identify highly paid players with low impact relative to their teammates, as well as players whose high impact is not captured by existing metrics.
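A minimal version of the idea: with a Gaussian prior on player effects, the Bayesian linear regression has a closed-form posterior, so each player's impact on win probability can be estimated while controlling for everyone else on the court. The sketch below uses simulated shifts and invented noise scales; it is a toy stand-in for the paper's model, not its implementation.

```python
import numpy as np

rng = np.random.default_rng(2)

# Toy shifts: each row flags who is on the court (+1 home, -1 away, 0 off),
# and the response is the change in home win probability over that shift.
n_shifts, n_players = 5000, 100
X = rng.choice([-1.0, 0.0, 1.0], size=(n_shifts, n_players), p=[0.05, 0.90, 0.05])
beta_true = rng.normal(0, 0.02, n_players)
y = X @ beta_true + rng.normal(0, 0.1, n_shifts)

# With a N(0, tau^2 I) prior on player effects, the posterior is Gaussian in
# closed form (equivalent to ridge regression); the posterior mean is each
# player's impact after controlling for everyone else on the court.
tau2, sigma2 = 0.02**2, 0.1**2
A = X.T @ X / sigma2 + np.eye(n_players) / tau2
post_mean = np.linalg.solve(A, X.T @ y / sigma2)
post_sd = np.sqrt(np.diag(np.linalg.inv(A)))

# Rank-order players by posterior mean impact per shift.
top = np.argsort(post_mean)[::-1][:5]
print("top five:", top, np.round(post_mean[top], 3), np.round(post_sd[top], 3))
```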
Numerous statistics have been proposed to measure offensive ability in Major League Baseball. While some of these measures may offer moderate predictive power in certain situations, it is unclear which simple offensive metrics are the most reliable or consistent. We address this issue by using a hierarchical Bayesian variable selection model to determine which offensive metrics are most predictive within players across time. Our methodology allows for full estimation of the posterior distributions of our parameters and automatically adjusts for multiple testing, providing a distinct advantage over alternative approaches. We implement our model on a set of fifty different offensive metrics and discuss our results in comparison to other variable selection techniques. We find that a large number of metrics demonstrate signal; however, these metrics (i) are highly correlated with one another, (ii) can be reduced to about five without much loss of information, and (iii) relate to traditional notions of performance (e.g., plate discipline, power, and ability to make contact).
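The redundancy finding, that fifty metrics compress to roughly five dimensions, can be illustrated with a quick principal components check on synthetic data generated from five latent skills. This is separate from, and much simpler than, the variable selection model itself.

```python
import numpy as np
from sklearn.decomposition import PCA

rng = np.random.default_rng(3)

# Fifty synthetic correlated "offensive metrics" generated from five latent
# skill factors (think power, discipline, contact, ...) plus noise.
n_players, n_metrics, n_factors = 500, 50, 5
factors = rng.normal(size=(n_players, n_factors))
loadings = rng.normal(size=(n_factors, n_metrics))
metrics = factors @ loadings + 0.5 * rng.normal(size=(n_players, n_metrics))

# A handful of components captures nearly all the variation, echoing the
# finding that the fifty metrics compress to roughly five dimensions.
pca = PCA().fit(metrics)
print("variance explained by first 5 PCs:",
      round(pca.explained_variance_ratio_[:5].sum(), 3))
```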
Within sports analytics, there is substantial interest in comprehensive statistics intended to capture overall player performance. In baseball, one such measure is wins above replacement (WAR), which aggregates the contributions of a player in each facet of the game: hitting, pitching, baserunning, and fielding. However, current versions of WAR depend upon proprietary data, ad hoc methodology, and opaque calculations. We propose a competitive aggregate measure, openWAR, that is based on public data, a methodology with greater rigor and transparency, and a principled standard for the nebulous concept of a “replacement” player. Finally, we use simulation-based techniques to provide interval estimates for our openWAR measure that are easily portable to other domains.
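The interval estimates can be illustrated with a simple resampling sketch in the spirit of openWAR's simulation-based uncertainty: bootstrap a hypothetical player's per-play run values to obtain an interval for the season total. The units and numbers below are invented, and this is not the openWAR software itself.

```python
import numpy as np

rng = np.random.default_rng(4)

# Hypothetical per-play run values credited to one player over a season.
play_values = rng.normal(loc=0.02, scale=0.25, size=600)

# Resample plays with replacement to get an interval estimate for the season
# aggregate, mirroring the simulation-based approach to uncertainty.
n_boot = 10_000
totals = np.array([rng.choice(play_values, size=play_values.size, replace=True).sum()
                   for _ in range(n_boot)])
lo, hi = np.percentile(totals, [2.5, 97.5])
print(f"point estimate {play_values.sum():.1f}, 95% interval ({lo:.1f}, {hi:.1f})")
```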
We present a regularized logistic regression model for evaluating player contributions in hockey. The traditional metric for this purpose is the plus-minus statistic, which allocates a single unit of credit (for or against) to each player on the ice when a goal is scored. However, plus-minus scores measure only the marginal effect of players, do not account for sample size, and provide a very noisy estimate of performance. We investigate a related regression problem: what does each player on the ice contribute, beyond aggregate team performance and other factors, to the odds that a given goal was scored by their team? Due to the large-p (number of players) and imbalanced design setting of hockey analysis, a major part of our contribution is a careful treatment of prior shrinkage in model estimation. We showcase two recently developed techniques, for posterior maximization or simulation, that make such analysis feasible. Each approach is accompanied by publicly available software, and we include the simple commands used in our analysis. Our results show that most players do not stand out as measurably strong (positive or negative) contributors. This allows the stars to really shine, reveals diamonds in the rough overlooked by earlier analyses, and argues that some of the highest-paid players in the league are not making contributions worth their expense.
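The regression setup can be sketched as follows: each goal is a row, each player a signed indicator column, and an L1 penalty shrinks most player effects to exactly zero so that only strong contributors stand out. This toy scikit-learn version on simulated goals is a generic stand-in, not the authors' publicly available software or the commands from the paper.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(5)

# Toy goal events: each column is a player, coded +1 if on the ice for the
# home team and -1 for the away team; the response is 1 if home scored.
n_goals, n_players = 8000, 300
X = np.zeros((n_goals, n_players))
for i in range(n_goals):
    on_ice = rng.choice(n_players, 12, replace=False)
    X[i, on_ice[:6]], X[i, on_ice[6:]] = 1.0, -1.0

beta = np.zeros(n_players)
beta[:15] = rng.normal(0, 0.5, 15)  # only a few players truly move the odds
y = rng.binomial(1, 1 / (1 + np.exp(-(X @ beta))))

# L1-penalized logistic regression shrinks most player effects to exactly
# zero, so only measurably strong (positive or negative) contributors remain.
fit = LogisticRegression(penalty="l1", solver="liblinear", C=0.1).fit(X, y)
print("players with nonzero estimated effect:", np.count_nonzero(fit.coef_))
```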