The action of crossing the ball in soccer has a long history as an effective tactic for producing goals. Lately, the benefit of crossing the ball has come under question, and alternative strategies have been suggested. This paper utilizes player tracking data to explore crossing at a deeper level. First, we investigate the spatio-temporal conditions that lead to crossing. Then we introduce an intended target model that investigates crossing success. Finally, a contextual analysis is provided that assesses the benefits of crossing in various situations. The analysis is based on causal inference techniques and suggests that crossing remains an effective tactic in particular contexts.
Examining data for the 10 Olympic Games contested this century, we ask whether confirmation bias exists in judged events. We theorize that if such bias is present, then competitors in judged events should perform closer to their predicted results than competitors in non-judged events. Among a sample of over 5100 predicted medalists from the 10 Games, we find that, all else equal, the differences between ex-ante conventional wisdom and ex-post observed outcomes are larger for competitors in timed events than for competitors in judged events. These results suggest that confirmation bias may exist in judged events at the Olympic Games.
In order to make comparisons of competitive balance across sports leagues, we need to take into account how different season lengths influence observed measures of balance. We develop the first measures of competitive balance that are invariant to season length. The most commonly used measure, the ASD/ISD or Noll-Scully ratio, is biased. It artificially inflates the imbalance for leagues with long seasons (e.g., MLB) compared to those with short seasons (e.g., NFL). We provide a general model of competition that leads to unbiased variance estimates. The result is a new ordering across leagues: the NFL goes from having the most balance to being tied for the least, while MLB becomes the sport with the most balance. Our model also provides insight into competitive balance at the game level. We shift attention from team-level to game-level measures as these are more directly related to the predictability of a representative contest. Finally, we measure competitive balance at the season level. We do so by looking at the predictability of the final rankings as seen from the start of the season. Here the NBA stands out for having the most predictable results and hence the lowest full-season competitive balance.
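As a rough illustration of the season-length issue described above, the sketch below computes the Noll-Scully (ASD/ISD) ratio in its usual textbook form: the standard deviation of team win percentages divided by the idealized value 0.5/sqrt(G) for a league of evenly matched teams playing G games each. The function name, the use of the population standard deviation, and the sample win percentages are assumptions made for illustration; this is not the unbiased estimator the paper develops.

```python
import math
from statistics import pstdev


def noll_scully_ratio(win_pcts, games_per_team):
    """Return ASD/ISD for one season.

    win_pcts: iterable of team win percentages in [0, 1].
    games_per_team: number of games each team plays (G).
    Assumption: population standard deviation is used for the ASD.
    """
    if games_per_team <= 0:
        raise ValueError("games_per_team must be positive")
    asd = pstdev(win_pcts)                      # actual spread of win percentages
    isd = 0.5 / math.sqrt(games_per_team)       # spread under pure coin-flip games
    return asd / isd


# Hypothetical league: identical spread of win percentages, different season lengths.
same_spread = [0.4, 0.5, 0.6]
short_season = noll_scully_ratio(same_spread, games_per_team=16)
long_season = noll_scully_ratio(same_spread, games_per_team=162)
print(round(short_season, 3), round(long_season, 3))
```

Because the denominator shrinks like 1/sqrt(G), the same observed spread of win percentages yields a larger ratio in a longer season, which is one simple way to see why comparisons across leagues with different season lengths need care.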
Vast amounts of eSports data should be easily accessible but often are not. League of Legends (LoL) offers only rudimentary statistics such as levels, items, gold, and deaths. We present a new way to capture more useful data. We track every champion’s location multiple times every second. We track every ability cast and attack made, all damage dealt and avoided, vision, health, mana, and cooldowns. We track continuously, invisibly, remotely, and live. Using a combination of computer vision, dynamic client hooks, machine learning, visualization, logistic regression, large-scale cloud computing, and fast and frugal trees, we generate this new high-frequency data on millions of ranked LoL games, calibrate an in-game win probability model, develop enhanced definitions for standard metrics, introduce dozens more advanced metrics, automate player improvement analysis, and apply a new player-evaluation framework to the basic and advanced stats. How much does an individual contribute to a team’s performance? We find that individual actions conditioned on changes to estimated win probability correlate almost perfectly with team performance: regular kills and deaths explain far less than smart kills and worthless deaths. Our approach offers applications for other eSports and traditional sports. All the code is open-sourced.
Defensive repositioning strategies (shifts) have become more prevalent in Major League Baseball in recent years. In 2018, batters faced some form of the shift in 34% of their plate appearances (Sawchik, Travis. 2019. “Don’t Worry, MLB–Hitters Are Killing The Shift On Their Own.” FiveThirtyEight, January 17, 2019. Also available at fivethirtyeight.com/features/dont-worry-mlb-hitters-are-killing-the-shift-on-their-own/). Most teams use a shift that overloads one side of the infield and adjusts the positioning of the outfield. In this work we describe a mathematical approach to the positioning of players over the entire field of play without the limitations of traditional positions or current methods of shifting. The model uses historical data for individual batters, and it leaves open the possibility of fewer than four infielders. The model also incorporates risk penalties for positioning players too far from areas of the field in which extra-base hits are more likely. This work is meant to serve as a decision-making tool for coaches and managers to best use their defensive assets. Our simulations show that an optimal positioning with three infielders lowered predicted batting average on balls in play (BABIP) by 5.9% for right-handers and by 10.3% for left-handers on average when compared to a standard four-infielder placement of players.
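For readers unfamiliar with the metric, the sketch below computes BABIP using its standard definition, (H - HR) / (AB - K - HR + SF), and shows how a relative reduction between two placements could be measured. The function names and the sample numbers are hypothetical, and the abstract does not state whether its 5.9% and 10.3% figures are relative reductions or percentage-point differences; the relative interpretation here is an assumption.

```python
def babip(hits, home_runs, at_bats, strikeouts, sac_flies):
    """Batting average on balls in play: (H - HR) / (AB - K - HR + SF)."""
    balls_in_play = at_bats - strikeouts - home_runs + sac_flies
    if balls_in_play <= 0:
        raise ValueError("no balls in play")
    return (hits - home_runs) / balls_in_play


def relative_reduction(baseline, alternative):
    """Fractional drop from baseline to alternative (assumed interpretation)."""
    if baseline <= 0:
        raise ValueError("baseline must be positive")
    return (baseline - alternative) / baseline


# Hypothetical season line, for illustration only.
example = babip(hits=150, home_runs=20, at_bats=500, strikeouts=100, sac_flies=5)
print(round(example, 3))

# Hypothetical predicted BABIP under two defensive placements.
print(round(relative_reduction(0.300, 0.2823), 3))
```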
Amid much recent interest, we discuss a Variance Gamma model for Rugby Union matches (applications to other sports are possible). Our model emerges as a special case of the recently introduced Gamma Difference distribution, though there is a rich history of applied work using the Variance Gamma distribution, particularly in finance. Restricting to this special case adds analytical tractability and computational ease. Our three-dimensional model extends classical two-dimensional Poisson models for soccer. Analytical results are obtained for match outcomes, total score, and the awarding of bonus points. Model calibration is demonstrated using historical results, bookmakers’ data, and tournament simulations.
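For context, the classical two-dimensional Poisson baseline mentioned above can be sketched as follows: each team's goal count is treated as an independent Poisson variable, and match-outcome probabilities are obtained by summing over a truncated score grid. This illustrates only the baseline, not the Variance Gamma model itself; the function names, the scoring rates, and the truncation at 20 goals are assumptions chosen for illustration.

```python
import math


def poisson_pmf(k, lam):
    """Probability of exactly k events under a Poisson(lam) distribution."""
    return math.exp(-lam) * lam ** k / math.factorial(k)


def match_outcome_probs(lam_home, lam_away, max_goals=20):
    """Return (home win, draw, away win) probabilities.

    Assumes independent Poisson goal counts and truncates the score
    grid at max_goals per team, which loses negligible mass for
    typical soccer scoring rates.
    """
    home = draw = away = 0.0
    for h in range(max_goals + 1):
        p_h = poisson_pmf(h, lam_home)
        for a in range(max_goals + 1):
            p = p_h * poisson_pmf(a, lam_away)
            if h > a:
                home += p
            elif h == a:
                draw += p
            else:
                away += p
    return home, draw, away


# Hypothetical scoring rates, for illustration only.
p_home, p_draw, p_away = match_outcome_probs(1.5, 1.1)
print(round(p_home, 3), round(p_draw, 3), round(p_away, 3))
```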