## 1 Introduction and related literature

In this paper we investigate the extent to which being left-handed impacts elite performance and rankings in one-on-one interactive sports such as tennis, fencing and badminton. Our goal is to provide a coherent framework for measuring the benefit of being left-handed in these sports and tracking how this benefit evolves over time. We also aim to provide a framework for considering such questions as “who are the most talented players?” Of course, for this latter question to be reasonable it must be the case that the lefty advantage (to the extent that it exists) can be decoupled from the notion of “talent.” Indeed it’s not at all clear that such a decoupling exists.

### 1.1 Causes and extent of the lefty advantage

In fact much of the early research into the performance of left-handers in sports relied on the so-called “innate superiority hypothesis” (ISH), where left-handers were said to have an edge in sporting competitions due to inherent neurological advantages associated with being left-handed (Geschwind and Galaburda 1985; Nass and Gazzaniga 1987). The presence of larger right hemispheric brain regions associated with visual and spatial functions, a lack of lateralization, and a larger corpus callosum (Witelson 1985) (the brain structure involved in communication across hemispheres) were all suggested as neurological mechanisms for this edge. Applications of the ISH to sport occurred primarily in fencing (Bisiacchi et al. 1985; Taddei, Viggiano, and Mecacci 1991; Akpinar et al. 2015), where left-handers appeared to have advantages in attentional tasks (in terms of response to visual stimuli), though there were also proponents for this view in other sports such as tennis (Holtzen 2000).

The idea that an innate advantage was responsible for the significant over-representation of left-handers in professional sports gradually lost momentum following the works of (Wood and Aggleton 1989; Aggleton and Wood 1990; Grouios et al. 2000a). These papers analyzed interactive sports such as tennis and non-interactive sports such as darts and pool. They found there was a surplus of left-handers in the interactive sports, but not generally in the non-interactive sports. (One exception is golf where Loffing and Hagemann (2016, Box 12.1) noted that the proportion of top left-handed^{1} golfers is higher than in the general population.) It was reasoned that any innate superiority should also bring left-handers into prominence in non-interactive sports and so alternative explanations were sought. The primary argument of (Wood and Aggleton 1989; Aggleton and Wood 1990) was that the prominence of left-handers in a given sport was due to the strategic advantages of being left-handed in that sport.

Indeed the prevailing^{2} explanation today for the over-representation of left-handers in professional interactive sports is the negative frequency-dependent selection (NFDS) effect. This effect is also assumed to underlie the so-called “fighting hypothesis” (Raymond et al. 1996) which explains why there is long-lasting handedness polymorphism in humans despite the fitness costs that appear to be associated with left-handedness. The NFDS effect is best summarized as stating that right-handed players have less familiarity competing against left-handed players (because of the much smaller percentage of lefties in the population) and therefore perform relatively poorly against them as a result. Key evidence supporting this hypothesis was the demonstration of mechanisms for how NFDS effects might arise (Daems and Verfaillie 1999; Stone 1999; Grossman et al. 2000; Grouios et al. 2000b). The difficulty of playing elite left-handed players in one-on-one interactive sports has long been recognized. For example, Breznik (2013) quotes Monica Seleš, the former women’s world number one tennis player:

“It’s strange to play a lefty (most players are right-handed) because everything is opposite and it takes a while to get used to the switch. By the time I feel comfortable, the match is usually over.”

A more general overview and discussion of NFDS effects can be found in the recent book chapter of Loffing and Hagemann (2016) who also provide extensive statistics regarding the percentage of top lefties across various sports. It is also perhaps worth mentioning that the debate between the ISH and the NFDS mechanism is not quite settled and some research, e.g. (Gursoy 2009) in boxing and (Breznik 2013) in tennis, still argues that the ISH has a role to play.

Recent analyses of combat sports (such as judo (Sterkowicz, Lech, and Blecharz 2010), mixed martial arts (Dochtermann, Gienger, and Zappettini 2014), and boxing (Loffing and Hagemann 2015)) also support the existence of NFDS effects on performance, although they suggest that alternative explanations must still be considered and that the resulting advantage is small. This agrees with (Loffing, Hagemann, and Strauss, 2012a) which suggests that although left-handedness provides an advantage, modern professionalism and training are acting to counter the advantage. Deliberate training was shown in (Schorer et al. 2012) to improve the performance of handball goalies against players of specific handedness while (Ullén, Hambrick, and Mosing 2016) explores the issue of deliberate training vs innate talent in depth. A recent article (Liew 2015) in the Telegraph newspaper in the UK, for example, noted how seven of the first seventeen Wimbledon champions in the open era were left-handed men while there were only two left-handers among the top 32 seeds in the 2015 tournament. Some of this variation is undoubtedly noise (see Section 6) but there do appear to be trends in the value of left-handedness. For example, the same Telegraph article noted that a reverse effect might be taking place in women’s tennis. Specifically, the article noted that 2015 was the first time in the history of the WTA tour that there were four left-handed women among the top 10 in tennis. In most sports the lefty advantage appears to be weaker in women than in men (Loffing and Hagemann 2016). The issue of gender effects of handedness in professional tennis is discussed in (Breznik 2013) where it is shown through descriptive statistics and a PageRank-style analysis that women do indeed have a smaller lefty advantage than men although it’s worth noting their data only extends to 2011. 
It is also suggested in (Breznik 2013) that the lefty advantage in tennis is weaker in Grand Slams than on the ATP and Challenger tours. They conjecture that possible explanations for this are that the very best players are more able to adjust to playing lefties and they may also be in a better position to tailor their training in anticipation of playing lefties.

Many other researchers have studied the extent of the lefty advantage and how it might arise. For example, (Goldstein and Young 1996) determines a game theoretic evolutionarily stable strategy from payoff matrices of summarized performance, whereas (Billiard, Faurie, and Raymond 2005) explicitly uses frequency dependent interactions between left- and right-handed competitors. The work of Abrams and Panaggio (2012) has some similarity to ours as they also model professionals as being the top performers from a general population skill distribution. They use differential equations to define an equilibrium of transitions between left- and right-handed populations. These papers rely then on the NFDS mechanism to generate the lefty advantage. We note that equilibrium-style models suggest the strength of the lefty advantage might be inversely proportional to the proportion of top lefties. Such behavior is not a feature of our modeling framework but nor is it inconsistent with it as we do not model the NFDS mechanism (and resulting equilibrium). Instead our main goal is to measure the size of the lefty advantage rather than building a model that leads to this advantage.

Several researchers have considered how the lefty advantage has evolved with time. In addition to their aforementioned contributions, (Breznik 2013) also plot the mean rank of top lefties and righties over time in tennis and they obtain broadly similar results to those obtained by our Kalman filtering approach. Other researchers have also analyzed the proportion of top lefties in tennis over time. For example, (Loffing, Hagemann, and Strauss 2012b) fit linear and quadratic functions to the data and then extrapolate to draw conclusions on future trends. Their quadratic fit for the proportion of lefties in men’s tennis uses data from 1970 to 2010 and predicts a downwards trend from 1990 onwards. This is contradicted by our data from 2010 to 2015 in Section 6 which suggests that the number of top lefties may have been increasing in recent years. They also perform a separate analysis for amateur players, showing that the lefty advantage increases as the quality of players improves. It is also worth noting that Ghirlanda, Frasnelli, and Vallortigara (2009) introduce a model suggesting the possibility of the lefty advantage remaining stable over time.

### 1.2 Latent ability and competition models

Our work in this paper builds on the extensive^{3} latent ability and competition models literature. The original two competition models are the Bradley-Terry-Luce (BTL) model (Bradley and Terry 1952; Luce 1959) and the Thurstone-Mosteller (TM) model (Thurstone 1927; Mosteller 1951). BTL assumes each player *i* has skill $S_i$, so that the probability of player *i* beating player *j* is a logistic function of the difference in skills. Specifically BTL assumes

$$p(i \triangleright j \mid S_i, S_j) := \frac{1}{1 + e^{-(S_i - S_j)}} \tag{1}$$

where *i* ▷ *j* denotes the event of *i* beating *j*. TM is defined similarly, but with the probability of *i* beating *j* being a probit function of the difference in their skills. Given match-play data, the skill of each player may be inferred using maximum likelihood estimation (MLE) where the probability of the match-play results is assumed to satisfy

$$p(M \mid S) = \prod_{i<j} \binom{T_{ij}}{M_{ij}}\, p(i \triangleright j \mid S_i, S_j)^{M_{ij}} \left(1 - p(i \triangleright j \mid S_i, S_j)\right)^{T_{ij} - M_{ij}}$$

where $M_{ij}$ is the number of matches where player *i* beats player *j*, and $T_{ij} := M_{ij} + M_{ji}$ is the total number of matches between players *i* and *j*. The inferred skills can then be used to predict the outcome of future matches.
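To make the estimation step concrete, the following is a minimal sketch of BTL fitting by maximum likelihood in plain Python. It assumes the logistic win probability $1/(1+e^{-(S_i-S_j)})$ described above. The function names, the layout of the `wins` dictionary, and the use of fixed-step gradient ascent are illustrative assumptions and are not taken from the paper.

```python
import math

def btl_win_prob(s_i, s_j):
    """Probability that player i beats player j under BTL."""
    return 1.0 / (1.0 + math.exp(-(s_i - s_j)))

def btl_log_likelihood(skills, wins):
    """Log-likelihood of the match results, up to an additive constant.

    wins[(i, j)] is the number of matches in which i beat j. The binomial
    coefficients do not depend on the skills and are dropped.
    """
    return sum(m * math.log(btl_win_prob(skills[i], skills[j]))
               for (i, j), m in wins.items())

def fit_btl(n_players, wins, steps=2000, lr=0.05):
    """Fixed-step gradient ascent on the log-likelihood (hypothetical helper)."""
    skills = [0.0] * n_players
    for _ in range(steps):
        grad = [0.0] * n_players
        for (i, j), m in wins.items():
            p = btl_win_prob(skills[i], skills[j])
            grad[i] += m * (1.0 - p)   # d/dS_i of m * log p(i beats j)
            grad[j] -= m * (1.0 - p)   # d/dS_j of the same term
        skills = [s + lr * g for s, g in zip(skills, grad)]
        mean = sum(skills) / n_players
        skills = [s - mean for s in skills]   # skills are only identified up to a shift
    return skills
```

Only differences in skill enter the likelihood, which is why the sketch re-centers the skills after each step. A finite maximizer requires that no group of players wins all of its matches against the rest.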

There are a few notable extensions to the BTL and TM models including ELO (Elo 1978), Glicko (Glickman 1999) and TrueSkill™ (Herbrich, Minka, and Graepel 2007). ELO models the performance of each player in a match as having a Gaussian distribution centered around their respective skill. Glicko and TrueSkill™ extend the ELO model by putting a Gaussian prior on the skill of each player. These models have been widely applied to various competition settings. For example, ELO was developed as a chess ranking system, and TrueSkill™ has been used for online match making for video games on Xbox Live. These models allow one to infer the skill level of each player and thereby construct player rankings.

### 1.3 Contributions of this work

In this paper we propose a Bayesian latent ability model for identifying the advantage of being left-handed in one-on-one interactive sports but with the additional complication of having a latent factor, i.e. the advantage of left-handedness, that we need to estimate. Inference is further complicated by the truncated nature of data-sets that arise from only observing data related to the top players. The resulting pattern of data “missingness” therefore depends on the latent factor and so it is important that we model it explicitly. We show how to infer the advantage of left-handedness when only the proportion of top left-handed players is available. In this case we show that the distribution of the number of left-handed players among the top *n* (out of *N*) converges as *N* → ∞ to a binomial distribution with a success probability that depends on the tail-length of the innate skill distribution. Since this result would not be possible if we used short- or long-tailed skill distributions, we also argue for the use of a medium-tailed distribution such as the Laplace distribution when modeling the “innate”^{4} skills of players. We also use this result to develop a simple Kalman filtering model for inferring how the lefty advantage has varied through time in a given sport. Our Kalman filter/smoother enables us to smooth any spurious signals over time and should lead to a more robust inference regarding the dynamics of the lefty advantage.

We also consider various extensions of our model. For example, in order to estimate the innate skills of top players we consider the case when match-play data among the top *n* players is available. This model is a direct generalization of the Glicko model described earlier. Unlike other models, this extension learns simultaneously from (i) the over-representation of lefties among top players and (ii) match-play results between top lefties and righties. Previously these phenomena were studied separately. We observe that including match-play data in our model makes little difference to the inference of the lefty advantage and therefore helps justify our focus on the simplified model that only considers the proportion of lefties in the top *n* players. This extension does help us to identify the innate skills of players, however, we acknowledge that these so-called innate skills may only be of interest to the extent that the NFDS mechanism is responsible for the lefty advantage. (To the extent that the innate superiority hypothesis holds, it’s hard to disentangle the notion of innate skill or talent from the lefty advantage and using the phrase “innate skills” would be quite misleading in this case.)

The remainder of this paper is organized as follows. In Section 2 we describe our skill and handedness model and also develop our main theoretical results here. In Section 3 we introduce match-play results among top players into the model while in Section 4 we consider a variation where we only know the handedness and external rankings of the top players. We present numerical results in Section 5 using data from men’s professional tennis in 2014. In Section 6 we propose a simple Kalman filtering model for inferring how the lefty advantage in a given sport varies through time and we conclude in Section 7 where possible directions for future research are also outlined. Various proofs and other technical details are deferred to the appendix.

## 2 The latent skill and handedness model

We assume there is a universe of *N* players and for $i = 1, \dots, N$ we model the skill, $S_i$, of the *i*^{th} player as the sum of his innate skill $G_i$ and the lefty advantage *L* if he is in fact left-handed. That is, we assume

$$S_i = G_i + H_i L$$

where $H_i$ is the handedness indicator with $H_i = 1$ if the player is left-handed and $H_i = 0$ otherwise. The generative framework of our model is:

- Left-handed advantage: $L\sim \text{N}(0,\sigma_L^2)$ where $\sigma_L$ is assumed to be large and N denotes the normal distribution.
- For players $i=1,2,\dots,N$
  - Handedness: $H_i\sim \text{Bernoulli}(q)$ where *q* is the proportion of left-handers in the overall population.
  - Innate skill: $G_i\sim \mathrm{G}$ for some given distribution G
  - Skill: $S_i=G_i+H_iL$.
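The generative model above can be simulated directly. The sketch below draws handedness and innate skills for *N* players, forms the skills $S_i = G_i + H_i L$, and counts the left-handers among the top *n*. It makes two assumptions that go beyond this point in the text: the innate skill distribution G is taken to be Laplace with scale `b` (the choice argued for in Section 2.1), and *L* is held fixed rather than drawn from its prior. All function names are hypothetical.

```python
import random

def sample_innate_skill(rng, b=1.0):
    """Laplace(0, b) draw, obtained as the difference of two exponential draws."""
    return b * (rng.expovariate(1.0) - rng.expovariate(1.0))

def count_top_lefties(N, n, L, q, rng, b=1.0):
    """Simulate N players for a fixed L and count left-handers among the top n."""
    players = []
    for _ in range(N):
        h = 1 if rng.random() < q else 0     # H_i ~ Bernoulli(q)
        g = sample_innate_skill(rng, b)      # G_i ~ G (Laplace assumed here)
        players.append((g + h * L, h))       # S_i = G_i + H_i * L
    players.sort(reverse=True)               # highest skill first
    return sum(h for _, h in players[:n])
```

For example, `count_top_lefties(20000, 200, 0.0, 0.11, random.Random(0))` returns a count that fluctuates around 11% of 200, and the count grows as `L` is increased.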

The joint probability distribution corresponding to the generative model then satisfies

$$p(S_{1:N}, L, H_{1:N}) = p(L)\prod_{i=1}^{N} p(H_i)\, p(S_i \mid H_i, L)$$

where we note again that *N* is the number^{5} of players in our population universe. We assume we know (from public results of professional competitions etc.) the identity of the top *n* players have indices in

Note, however, that even when we condition on

### 2.1 Medium-tailed priors for the innate skill distribution

Thus far we have not specified the distribution G from which the innate skill levels are drawn in the generative model. Here we provide support for the use of medium-tailed distributions such as the Laplace distribution for modeling these skill levels. We do this by investigating the probability of top players being left-handed as the population size *N* becomes infinitely large. Consider then

$$p(n_l \mid L; n, N)$$

where $n_l$ denotes the number of left-handers among the top *n* players. For the skill distribution to be plausible, the probability that top players are left-handed should be increasing in *L* and be consistent with what we observe in practice for a given sport. Letting $\text{Binomial}(x; n, p)$ denote the probability of *x* successes in a binomial distribution with *n* trials and success probability *p*, and letting *g* denote the PDF of G, we have the following result.

**Proposition 1.** *Assume that g has support* $\mathbb{R}$. *Then*

$$\lim_{N\to\infty} p(n_l \mid L; n, N) = \text{Binomial}\left(n_l;\, n,\, \frac{q}{q + (1-q)\,c(L)}\right) \tag{6}$$

*where*

$$c(L) := \lim_{s\to\infty} \frac{g(s)}{g(s-L)}$$

*if the limit* $c(L)$ *exists.*

*Proof.* See Appendix A.1. □

The function $c(L)$ characterizes the tail length of the skill distribution. If $c(L) = 0$ for all *L* > 0 as is the case for example with the normal distribution, then *g* is said to be short-tailed. In contrast, long-tailed distributions such as the *t* distribution, have $c(L) = 1$ for all *L*.

If we use a short-tailed innate skill distribution and *L* > 0, then the success probability in (6) equals 1 so that all of the top players are left-handed in the limit. This suggests^{6} that other skill distributions may be more appropriate. As an alternative, consider a long-tailed distribution. In this case we have $p(n_l \mid L; n, N) \to \text{Binomial}(n_l; n, q)$ as *N* → ∞. This too is undesirable, since the probability of a top player being left-handed does not depend on *L* in the limit and agrees with the probability of being left-handed in the general population. As a consequence, such a distribution would be unsatisfactory for modeling in those sports where we typically see left-handers over-represented among the top players.

We therefore argue that the ideal distribution for modeling the innate skill distribution is a medium-tailed distribution such as the standard Laplace distribution which has PDF

$$g(s) = \frac{1}{2b}\exp\left(-\frac{|s|}{b}\right)$$

where *b* > 0 is the scale parameter. In this case $c(L) = e^{-L/b}$ and (6) becomes

$$\lim_{N\to\infty} p(n_l \mid L; n, N) = \text{Binomial}\left(n_l;\, n,\, \frac{q\,e^{L/b}}{q\,e^{L/b} + 1 - q}\right)$$

which is much more plausible. For very small values of *L*, the probability of top players being left-handed is approximately *q* which is what we would expect given the small advantage of being left-handed. For large positive values of *L* we see that the probability approaches 1 and for intermediate positive values of *L* we see that the probability of top players being left handed lies in the interval $(q, 1)$. We therefore use the Laplace^{7} distribution for modeling the innate skill levels in the remainder of this paper.
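The behavior described above for Laplace skills can be checked numerically. The sketch below assumes that the limiting probability of a top player being left-handed is $q e^{L/b} / (q e^{L/b} + 1 - q)$, which is the form implied by the large-population posterior given in Section 3.2.1. Equivalently, the log-odds of a top player being left-handed are the population log-odds shifted by $L/b$. The function name is hypothetical.

```python
import math

def top_lefty_probability(L, q, b=1.0):
    """Limiting probability that a top player is left-handed (Laplace skills assumed)."""
    log_odds = math.log(q / (1.0 - q)) + L / b   # population log-odds shifted by L/b
    if log_odds >= 0:
        return 1.0 / (1.0 + math.exp(-log_odds))
    e = math.exp(log_odds)                       # branch avoids overflow for very negative values
    return e / (1.0 + e)
```

With `q = 0.11` the function returns `q` at `L = 0`, increases monotonically in `L`, and approaches 1 for large positive `L`, in line with the three regimes discussed in the text.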

### 2.2 Large *N* inference using only aggregate handedness data

Following the results of Proposition 1, we assume here that we only know the number $n_l$ of the top *n* players who are left-handed. We shall see later that only knowing $n_l$ results in little loss of information regarding *L* compared to the full information case of Section 3 where we have knowledge of the handedness and all match-play results among the top *n* players. We shall make use of this observation in Section 6 when we build a model for inferring the dynamics of *L* through time series observations of $n_l$.

### 2.2.1 Posterior of *L* in an infinitely large population

Applying Bayes’ rule to (6) yields

$$\begin{aligned} \lim_{N\to\infty} p(L \mid n_l; n, N) &\propto p(L)\left(\frac{q\,e^{L/b}}{q\,e^{L/b}+1-q}\right)^{n_l}\left(\frac{1-q}{q\,e^{L/b}+1-q}\right)^{n_r} \\ &\propto p(L)\left(\frac{\exp\left(n_l/n\cdot L/b\right)}{q\exp\left(L/b\right)+1-q}\right)^{n} \end{aligned} \tag{9}$$

where $n_r := n - n_l$ is the number of top right-handed players, all factors independent of *L* were absorbed into the constant of proportionality and the binomial distribution term on the second line is now written explicitly as a function of $n_l$. We shall verify empirically in Section 5 that (9) is a good approximation of $p(L \mid n_l; n, N)$ when *N* is large. As the number of top players *n* increases while keeping $n_l/n$ fixed, the *n*^{th} power in (9) causes the distribution to become more peaked around its mode. This effect can be seen in Figure 1 where we have plotted the r.h.s of (9) for different values of *n*. If *n* is sufficiently large then the data will begin to overwhelm the prior on *L* and the posterior will become dominated by the likelihood factor, i.e. the second term on the r.h.s. of (9). This likelihood term achieves its maximum at

$$L^* := b\log\left(\frac{n_l}{n_r}\cdot\frac{1-q}{q}\right) \tag{10}$$

which we plot as the dashed vertical line in Figure 1. We can clearly see from the figure that the density becomes more peaked around $L^*$ as *n* increases while keeping $n_l/n$ fixed. The value^{8} of $L^*$ provides an easy-to-calculate point estimate of *L* for large values of *n*.
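As a rough illustration of the large-population posterior, the sketch below evaluates its logarithm up to a constant and computes the maximizer of the likelihood factor. It assumes the Laplace form of the posterior stated in Section 3.2.1, namely $p(L)\,\big(\exp(n_l/n \cdot L/b) / (q\exp(L/b) + 1 - q)\big)^n$, and the closed-form maximizer $L^* = b\log\big((n_l/n_r)\cdot(1-q)/q\big)$ obtained by setting the derivative of the log-likelihood to zero. The function names and the default `sigma_L = 10` (the value used in Section 5) are choices made for this sketch.

```python
import math

def likelihood_mode(n_l, n, q, b=1.0):
    """Maximizer L* of the likelihood factor."""
    n_r = n - n_l
    return b * math.log((n_l / n_r) * ((1.0 - q) / q))

def log_posterior(L, n_l, n, q, sigma_L=10.0, b=1.0):
    """Unnormalized log-posterior of L given n_l lefties among the top n."""
    z = L / b
    m = max(z, 0.0)
    # log(q * exp(z) + 1 - q), computed without overflow for large |z|
    log_den = m + math.log(q * math.exp(z - m) + (1.0 - q) * math.exp(-m))
    log_prior = -0.5 * (L / sigma_L) ** 2
    return log_prior + n_l * z - n * log_den
```

Evaluating `log_posterior` on a grid for `(n_l, n) = (15, 100)` and `(150, 1000)` shows the same mode but a much sharper peak in the second case, which is the effect of the *n*^{th} power described above.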

The bell-shaped posteriors in Figure 1 suggest that we might be able to approximate the posterior of *L* as a Gaussian distribution. This can be achieved by first approximating (9) as a Gaussian distribution over *L* via a Laplace approximation (Barber 2012, Sec. 28.2) to the second term on the r.h.s. of (9). Specifically, we set the mean of the Laplace approximation equal to the mode, $L^*$, of (6) and then set the precision to the negative of the second derivative of the logarithm evaluated at the mode. This yields:

$$\left(\frac{\exp\left(n_l/n\cdot L/b\right)}{q\exp\left(L/b\right)+1-q}\right)^{n} \stackrel{\propto}{\sim} \text{N}\left(L;\, L^*,\, \frac{n\,b^2}{n_l\,n_r}\right) \tag{11}$$

Note that this use of the Laplace approximation is non-standard as the left side of (11) is not a distribution over *L*, but merely a function of *L*. However if *L* and, up to a constant of proportionality, is well approximated by a Gaussian. We can then multiply the normal approximation in (11) by the other term, *n* and where the likelihood factor was set^{9} to the exact value of (6) or the Laplace approximation of (11). It is evident from the figure that the Laplace approximation is extremely accurate. This gives us the confidence to use the Laplace approximation in Section 6 when we build a dynamic model for *L*.
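The Gaussian approximation can be written down in closed form under the Laplace skill assumption. With log-likelihood $\ell(L) = n_l L/b - n\log(q e^{L/b} + 1 - q)$, the second derivative at the mode is $-n_l n_r/(n b^2)$, so the likelihood factor is approximated by a Gaussian with mean $L^*$ and precision $n_l n_r/(n b^2)$. Multiplying by the $\text{N}(0, \sigma_L^2)$ prior adds the two precisions. The sketch below is one way to code this up; the function name is hypothetical and the derivation is ours rather than quoted from the paper.

```python
import math

def gaussian_posterior_approx(n_l, n, q, sigma_L=10.0, b=1.0):
    """Gaussian approximation to the posterior of L. Returns (mean, variance)."""
    n_r = n - n_l
    mode = b * math.log((n_l / n_r) * ((1.0 - q) / q))    # likelihood mode L*
    lik_precision = n_l * n_r / (n * b * b)                # minus the second derivative at L*
    post_precision = lik_precision + 1.0 / sigma_L ** 2    # add the prior precision
    post_mean = lik_precision * mode / post_precision      # the prior mean is zero
    return post_mean, 1.0 / post_precision
```

With a diffuse prior the posterior mean is only slightly shrunk from $L^*$ toward zero, and the variance falls roughly like $1/n$ when $n_l/n$ is held fixed.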

### 2.3 Interpreting the posterior of *L*

In the aggregate data regime we do not know the posterior distributions of the skills and thus cannot directly infer the effect of left-handedness on match-play outcomes. However, we can still infer this effect in aggregate. As we shall see, the relative ranking of left-handers is governed by the value of *L*. In particular, if we continue to assume the Laplace distribution for innate skills, then we show in Appendix A.3 that for any fixed and finite value of *L*, the difference in skills between players at quantiles $\lambda_j$ and $\lambda_i$ satisfies

$$S_{[N\lambda_j]} - S_{[N\lambda_i]} \to b\log\left(\lambda_i/\lambda_j\right) \quad \text{as } N\to\infty \tag{12}$$

where the convergence is in probability and we use $S_{[N\lambda_i]}$ to denote the $[N\lambda_i]$^{th} order statistic of $S_{1:N}$. Noting^{10} that the player at quantile $\lambda_i$ has rank $N\lambda_i$, for large *N* it immediately follows that

$$\frac{N\lambda_i}{N\lambda_j} \approx \exp\left(\frac{S_{[N\lambda_j]} - S_{[N\lambda_i]}}{b}\right).$$

Consider now a left-handed player of rank *x* with innate skill, *G*. All other things being equal, if this player was instead right-handed then his skill would change from $G + L$ to $G$ and his rank would change from *x* to $x\,e^{L/b}$. We can therefore interpret the advantage of being left-handed as improving, i.e. lowering, one’s rank by a multiplicative factor of $e^{L/b}$.

We can use this result to infer the improvement in rank for lefties due to their left-handedness in various sports (Flatt 2008). We do so by substituting the fraction of top players that are left-handed, $n_l/n$, into (10) to obtain $L^*$, our point estimate of *L*. Following the preceding discussion, the (multiplicative) change in rank due to going from left-handed to right-handed can then be approximated by

$$e^{L^*/b} = \frac{n_l}{n_r}\cdot\frac{1-q}{q} \tag{14}$$

which again follows from (10). It is important to interpret (14) correctly. In particular, it represents the multiplicative drop in rank if a particular left-handed player were somehow to give up the advantage of being left-handed. It does not represent his drop in rank if he and all other left-handed players were to simultaneously give up the advantage of being left-handed. In this latter case, the drop in rank would not be as steep as that given in (14) since that player would still remain higher-ranked than the other left-handed players who were below him in the original ranking. In fact, we can argue that the approximate absolute drop in ranking for a left-handed player of rank *x* when all left-handers give up the benefit of being left-handed is given by the number of right-handed players between ranks *x* and $x\,e^{L^*/b}$, i.e.

$$x\left(e^{L^*/b} - 1\right)\frac{n_r}{n}. \tag{15}$$

We can argue for (15) by noting that on average a fraction $n_r/n$ of the players between ranks *x* and $x\,e^{L^*/b}$ are right-handed, so that the new rank of the left-handed player originally ranked *x* will be

$$x + x\left(e^{L^*/b} - 1\right)\frac{n_r}{n} \tag{16}$$

when all left-handed players simultaneously give up the advantage of being left-handed. Simplifying (16) using (14) yields a new ranking of $x\,n_l/(nq)$. We therefore refer to the r.h.s. of (14) and $n_l/(nq)$ as the multiplicative drops in rank when a single left-hander and when all left-handers, respectively, give up the lefty advantage.

Proportion of left-handers in several interactive one-on-one sports (Flatt 2008; Loffing and Hagemann 2016) with the relative changes in rank under the Laplace distribution for innate skills with $q = 11\%$.

Sport | Approx % left-handed | Drop in rank, one left-hander ($e^{L^*/b}$) | Drop in rank, all left-handers ($n_l/(nq)$) |
---|---|---|---|
Tennis | 15% | 1.43 | 1.36 |
Badminton | 23% | 2.41 | 2.09 |
Fencing (épée) | 30% | 3.47 | 2.73 |
Table-tennis | 32% | 3.81 | 2.91 |
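The two rank multipliers in the table can be reproduced with a few lines of arithmetic. The sketch below assumes $q = 0.11$ (the population proportion of left-handers quoted in the text) and takes the first multiplier to be the ratio of the odds of being left-handed among top players to the odds in the general population, which is what $e^{L^*/b}$ reduces to under Laplace skills. The second multiplier adds the expected number of right-handers overtaken, which simplifies to the top-player fraction divided by $q$. The function name is hypothetical.

```python
def rank_multipliers(frac_left, q=0.11):
    """Return (drop if one lefty gives up the advantage, drop if all lefties do)."""
    odds_top = frac_left / (1.0 - frac_left)   # n_l / n_r among top players
    odds_pop = q / (1.0 - q)                   # odds of left-handedness in the population
    drop_alone = odds_top / odds_pop
    # only right-handers between ranks x and x * drop_alone are overtaken
    drop_all = 1.0 + (drop_alone - 1.0) * (1.0 - frac_left)
    return drop_alone, drop_all
```

Calling `rank_multipliers(0.15)` gives values close to the tennis row of the table, and the other rows follow from 0.23, 0.30 and 0.32. Small differences in the last digit can arise from rounding of the input percentages.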

These results, while pleasing, are not very surprising. After all, a back-of-the-envelope calculation could come to a similar conclusion as follows. The proportion of top table-tennis players who are left-handed is ≈ 32% but the proportion^{11} of left-handers in the general population is ≈ 11%. Assuming the top left-handers are uniformly spaced among the top right-handers, we would therefore need to reduce the ranking of all left-handers in table-tennis by a factor of ≈ 32/11 ≈ 2.91.

While these results and specifically the interpretation of *N* → ∞) using either short- or long-tailed distributions to model the innate skills.

## 3 Including match-play and handedness data

Thus far we have not considered the possibility of using match-play data among the top *n* players to infer the value of *L*. In this section we extend our model in this direction so that *L* and the innate skills of the top *n* players can be inferred simultaneously. We suspect (and indeed this is confirmed in the numerical results of Section 5) that inclusion of the match-play data adds little information regarding the value of *L* over and beyond what we can already infer from the basic model of Section 2. However, it does in principle allow us to try and answer hypothetical questions regarding the innate skills of players and win-probabilities for players with and without the benefit of left-handedness. We therefore extend our basic model as follows:

- For each combination of players $i<j$
  - Match-play results: $M_{ij}\sim \text{Binomial}(T_{ij},\, p(i \triangleright j\mid S_i,S_j;\sigma_M))$ where the probability that player *i* defeats player *j* is defined according to

$$p(i \triangleright j\mid S_i,S_j;\sigma_M) := \frac{1}{1+e^{-\sigma_M\left(S_i-S_j\right)}}. \tag{17}$$

In contrast with the win probability in (1), our win probability in (17) has a hyperparameter *σ*_{M} that we use to adjust for the predictability of each sport. In less predictable sports, for example, weaker players will often beat stronger players and so even when *σ*_{M} which accentuates the effects of skill disparity. Having an appropriate *σ*_{M} allows the model to fit the data much more accurately than if we had simply set *σ _{M}* = 1 as is the case in BTL. It’s worth noting that instead of scaling by

*σ*

_{M}in (17), we could have scaled the skills themselves so that

*L*consistent across sports.

### 3.1 The posterior distribution

The joint probability distribution corresponding to the extended model now satisfies

As before we condition on

where *n* ranked players. Bayes’ rule therefore implies

where in the final line we have simplified the conditional probabilities and dropped the *L*. This last statement follows because, as emphasized above, the first *n* players are not player rankings but are merely^{12} player indicators from the universe of *N* players. We know the form of *q*) handedness) satisfies

where we recall that *g* denotes the PDF of the innate skill distribution, G. Letting

We can now simplify (20) to obtain

### 3.2 Inference via MCMC

We use a Metropolis-Hastings (MH) algorithm to sample from the posterior in (22). By virtue of working with the random variables *σ*_{M}, as we discuss below. More specifically, we will use a Gaussian proposal distribution with mean vector equal to the current state of the Markov chain and covariance matrix, ^{13}*λ* = 2.38 and Σ is an approximation to the covariance of the posterior distribution. In the numerical results of Section 5, we will run multiple chains in order to properly diagnose convergence to stationarity using the well-known Gelman-Rubin *μ* is an approximation to the mean of the posterior distribution and where *γ* is set sufficiently large so as to ensure the starting points are over-dispersed.

### 3.2.1 Approximating the mean and covariance of the posterior

The posterior of the skills *i*^{th} largest of

where *n* players.

We can estimate the mean and covariance of the distribution given by (23) via Monte Carlo. First let us apply Bayes’ rule to separate *L*,

We would like to jointly sample from this distribution by first sampling *L*, then sampling *L*, and finally sampling *L* and *L* and *L* and

1. As discussed in Section 2.2, for large populations the posterior of *L* can be approximated as

   $$p(L\mid M_{[1:n]},H_{[1:n]}) \stackrel{\propto}{\sim} p(L)\left(\frac{\exp\left(n_l/n\cdot L/b\right)}{q\exp\left(L/b\right)+1-q}\right)^{n}. \tag{25}$$

   We can easily simulate from the distribution on the r.h.s. of (25) by computing its CDF numerically and then using the inverse transform approach. It can be seen from Figure 2 in Section 5 that this approximation is very accurate for large *N*.
2. It is intractable to simulate $S_{[n]}$ directly according to the conditional distribution on the r.h.s. of (24). It seems reasonable to assume, however, that

   $$p(S_{[n]}\mid L,M_{[1:n]},H_{[1:n]}) \approx p(S_{[n]}\mid L), \tag{26}$$

   where we ignore the conditioning on $M_{[1:n]}$ and $H_{[1:n]}$. As with *L*, we can use the inverse transform approach to generate $S_{[n]}$ according to the distribution on the r.h.s. of (26) by noting that its PDF is proportional to (David and Nagaraja 2003, p. 12)

   $$F(S_{[n]}\mid L)^{N-n}\,\left(1-F(S_{[n]}\mid L)\right)^{n-1} f(S_{[n]}\mid L)$$

   where *f* and *F* are as defined in (21) and the following discussion.
3. Finally, we can handle the conditional distribution of $S_{[1:n-1]}$ on the r.h.s. of (24) by assuming

   $$p(S_{[1:n-1]}\mid S_{[n]},L,M_{[1:n]},H_{[1:n]}) \approx p(S_{[1:n-1]}\mid S_{[n]},L) \tag{27}$$

   where we again ignore the conditioning on $M_{[1:n]}$ and $H_{[1:n]}$. It is easy to simulate $S_{[1:n-1]}$ from the distribution on the r.h.s of (27). We do this by simply generating *n* − 1 samples from the distribution $p(S\mid S>S_{[n]},L)=p(G+HL\mid G+HL>S_{[n]},L)$ (a simple truncated distribution) and then ordering the samples.
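Step 1 above can be sketched as follows. The code tabulates the unnormalized posterior of *L* on a grid, accumulates it into a numerical CDF, and inverts that CDF with a binary search. The grid limits, the number of grid points, the default `sigma_L = 10` and all function names are assumptions made for this sketch, and the draws are restricted to the grid points.

```python
import bisect
import math
import random

def log_post_L(L, n_l, n, q, sigma_L=10.0, b=1.0):
    """Unnormalized log of the approximate posterior of L (Laplace skills assumed)."""
    z = L / b
    m = max(z, 0.0)
    log_den = m + math.log(q * math.exp(z - m) + (1.0 - q) * math.exp(-m))
    return -0.5 * (L / sigma_L) ** 2 + n_l * z - n * log_den

def make_L_sampler(n_l, n, q, lo=-5.0, hi=5.0, points=4001, **kwargs):
    """Build an inverse-transform sampler for L on a regular grid."""
    grid = [lo + (hi - lo) * k / (points - 1) for k in range(points)]
    logp = [log_post_L(x, n_l, n, q, **kwargs) for x in grid]
    peak = max(logp)
    cdf, running = [], 0.0
    for v in logp:
        running += math.exp(v - peak)   # subtract the peak to avoid underflow everywhere
        cdf.append(running)
    total = cdf[-1]

    def draw(rng):
        u = rng.random() * total        # uniform on [0, total)
        return grid[min(bisect.bisect_left(cdf, u), points - 1)]

    return draw
```

A typical use is `draw = make_L_sampler(15, 100, 0.11)` followed by repeated calls `draw(rng)` with `rng = random.Random(...)`. Steps 2 and 3 can reuse the same grid-based inversion once *f* and *F* are available.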

We can run steps 1 to 3 repeatedly to generate many samples of *μ*, and covariance matrix, Σ, of the true posterior distribution of *n* players have now been ordered according to their BTL ranking). As described above, the resulting Σ is used in the proposal distribution for the MH algorithm while we use *μ* as the mean of the over-dispersed starting points for each chain. The accuracy of the approximation is empirically investigated in Section 5 and is found to be very close to the true mean and covariance of *L* and *μ*, and covariance, Σ, could also be used in other MCMC algorithms such as Hamiltonian Monte Carlo (Neal 2011, Sec 4.1) or elliptical slice sampling (Murray, Adams, and Mackay 2010).

### 3.2.2 Setting *σ*_{M} via an empirical Bayesian approach

The hyperparameter *σ*_{M} was introduced in (17) to adjust for the predictability of the sport and we need to determine an appropriate *σ*_{M} in order to fully specify our model. A simple way to do this is to set *σ*_{M} to be the maximum likelihood estimator over the match-play data where the skills are set to be

We are thus adopting an empirical Bayes approach where a point estimate of the random variables is used to set the hyperparameter, *σ*_{M}; see (Murphy 2012, p. 172).
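A one-dimensional maximum likelihood fit of $\sigma_M$ could look like the sketch below. It treats the skills as fixed inputs (whatever point estimate is used), evaluates the match-play log-likelihood under the win probability in (17), and maximizes it by ternary search, which is valid here because the log-likelihood is concave in $\sigma_M$. The search interval `[0, 20]`, the function names and the `wins` dictionary layout are assumptions made for this sketch.

```python
import math

def log_sigmoid(z):
    """Numerically stable log(1 / (1 + exp(-z)))."""
    if z >= 0:
        return -math.log1p(math.exp(-z))
    return z - math.log1p(math.exp(z))

def match_log_likelihood(sigma_m, skills, wins):
    """Log-likelihood of match results; wins[(i, j)] counts matches where i beat j."""
    return sum(m * log_sigmoid(sigma_m * (skills[i] - skills[j]))
               for (i, j), m in wins.items())

def fit_sigma_m(skills, wins, lo=0.0, hi=20.0, iters=200):
    """Ternary search for the maximizing sigma_M on [lo, hi]."""
    for _ in range(iters):
        a = lo + (hi - lo) / 3.0
        c = hi - (hi - lo) / 3.0
        if match_log_likelihood(a, skills, wins) < match_log_likelihood(c, skills, wins):
            lo = a
        else:
            hi = c
    return 0.5 * (lo + hi)
```

If the stronger player wins every match in the data, the likelihood keeps increasing in $\sigma_M$ and the search simply returns a value near the upper end of the interval, so the bound should be chosen with care.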

An alternative to the empirical Bayes approach would be to allow *σ*_{M} to be a random variable in the generative model and to infer its value via MCMC. Unfortunately this approach leads to complications. Recall from Section 3 that *σ*_{M} can be interpreted as scaling the left-handed advantage and innate skill distributions. If *σ*_{M} is allowed to be random then this effectively changes the skill distribution. For example if the skills were normally distributed conditioned on *σ*_{M}, and *σ*_{M} were distributed as an inverse gamma distribution, then the skills would effectively have a *t* distribution (as the inverse gamma is a conjugate prior to the normal distribution). Since we wish to keep our skills as Laplace distributed, or more generally medium-tailed, it is simpler to fix *σ*_{M} as a hyperparameter.

## 4 Using external rankings and handedness data

An alternative variation on our model is one where we know the individual player handedness of each of the top *n* players and also have an external ranking scheme of their total skills. For example, such a ranking may be available for the professional athletes in a given sport, e.g. the official world rankings maintained by the World ATP Tour for men’s tennis (Stefani 1997). We will assume without loss of generality that the player indices are ordered as per the given rankings so that the *i*^{th} ranked player has index *i* in our model and *n* are the top *n* players to assume that the *i*^{th} indexed player is also the *i*^{th} ranked player for

where *I* denotes the indicator function. We can simulate from the posterior distribution in (29) using a Gibbs sampler. The conditional marginal distribution (required for the Gibbs sampler) of each player’s skill is then a simple truncated distribution so that

for *S*_{1} and *S _{n}* satisfy

and

Conveniently, the skills of the odd ranked players can be updated simultaneously since they are all independent of each other conditional on the skills of the even ranked players. Similarly, the even-ranked players can also be updated simultaneously conditional on the skills of the odd-ranked players. This makes the sampling parallelizable and efficient to implement when using Metropolis-within-Gibbs. Our algorithm therefore updates the variables in three blocks:

- Update all even skills simultaneously.
- Update all odd skills simultaneously.
- Update *L* via a Metropolis-Hastings^{14} (MH) step with $$p(L\mid {S}_{1:n},{H}_{1:n})\propto p\left(L\right)\prod _{i=1}^{n}g({S}_{i}-L{H}_{i})F{({S}_{n}\mid L)}^{N-n}.$$
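
The three-block update can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the implementation used for the results: it assumes Laplace innate skills with scale `b`, a population fraction `q` of left-handers, a normal prior on *L* with standard deviation `sigma_L`, and random-walk proposals with step size `step`. The lowest-ranked skill *S*_{n} is updated with its own MH step because its conditional also involves the factor *F*(*S*_{n} | *L*)^{N−n}. All function names and default values are hypothetical.

```python
import numpy as np

def laplace_cdf(x, b=1.0):
    """CDF of a zero-mean Laplace distribution with scale b."""
    x = np.asarray(x, dtype=float)
    tail = 0.5 * np.exp(-np.abs(x) / b)
    return np.where(x < 0, tail, 1.0 - tail)

def laplace_ppf(u, b=1.0):
    """Inverse CDF of a zero-mean Laplace distribution with scale b."""
    u = np.asarray(u, dtype=float)
    m = np.minimum(u, 1.0 - u)
    return -np.sign(u - 0.5) * b * np.log(2.0 * m)

def sample_truncated_laplace(rng, loc, lo, hi, b=1.0):
    """Draw from Laplace(loc, b) restricted to (lo, hi) by inverting the CDF."""
    u_lo, u_hi = laplace_cdf(lo - loc, b), laplace_cdf(hi - loc, b)
    u = u_lo + (u_hi - u_lo) * rng.uniform(size=np.shape(loc))
    return np.clip(loc + laplace_ppf(u, b), lo, hi)

def log_target(S, H, L, N, q=0.11, b=1.0, sigma_L=10.0):
    """Log of p(L) * prod_i g(S_i - L H_i) * F(S_n | L)^(N - n), up to a constant."""
    F_n = q * laplace_cdf(S[-1] - L, b) + (1.0 - q) * laplace_cdf(S[-1], b)
    return (-0.5 * (L / sigma_L) ** 2
            - np.abs(S - L * H).sum() / b
            + (N - len(S)) * np.log(F_n))

def run_sampler(S, H, N, n_iter=200, q=0.11, b=1.0, step=0.1, seed=0):
    """S must be sorted in decreasing order (index 0 is the top-ranked player)."""
    rng = np.random.default_rng(seed)
    S, H = np.array(S, dtype=float), np.asarray(H, dtype=float)
    n, L, draws = len(S), 0.0, []
    for _ in range(n_iter):
        # Blocks 1 and 2: players of one parity are conditionally independent
        # given the other parity, so each block is a single vectorised draw.
        for start in (2, 1):
            idx = np.arange(start, n - 1, 2)
            S[idx] = sample_truncated_laplace(rng, L * H[idx], S[idx + 1], S[idx - 1], b)
            if start == 2:  # the top player only has a lower neighbour
                S[0] = sample_truncated_laplace(rng, L * H[0], S[1], np.inf, b)
        # Lowest-ranked player: random-walk MH step that respects the ordering.
        prop = S.copy()
        prop[-1] += step * rng.normal()
        if prop[-1] < S[-2]:
            log_ratio = log_target(prop, H, L, N, q, b) - log_target(S, H, L, N, q, b)
            if np.log(rng.uniform()) < log_ratio:
                S = prop
        # Block 3: random-walk MH step for L.
        L_prop = L + step * rng.normal()
        log_ratio = log_target(S, H, L_prop, N, q, b) - log_target(S, H, L, N, q, b)
        if np.log(rng.uniform()) < log_ratio:
            L = L_prop
        draws.append(L)
    return np.array(draws), S
```

A call such as `run_sampler(np.linspace(10.0, 8.0, 10), [0, 0, 0, 1, 0, 0, 1, 1, 0, 0], N=100000)` returns the chain of *L* draws together with the final skill vector. The starting skills, handedness indicators and *N* in that call are made-up values chosen only so that the example runs.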

In the absence of match-play data, we believe this model should yield slightly more accurate inference regarding *L* than the base model of Section 2 when the left-handers are not evenly spaced among the top *n* players. For example, it may be the case that all left-handers in the top *n* are ranked below all the right-handers in the top *n*. While such a scenario is of course unlikely, it would suggest that the value of *L* is not as large as that inferred by the base model which only considers *n*_{l} and not the relative ranking of the *n*_{l} players among the top *n*. The model here accounts for the relative ranking and as such, should yield a more accurate inference of *L* to the extent that the lefties are not evenly spaced among the top *n* players.

## 5 Numerical results

We now apply our models and results to men’s ATP tennis. Specifically, we use handedness data as well as match-play results from ATP Tennis Navigator (Tennis Navigator 2004), a database that includes more than seven thousand players from 1980 until the present and hundreds of thousands of match results at various levels of professional and semi-professional tennis. We restrict ourselves to players for whom handedness data is available and who have played a minimum number of games (here, set at thirty). This last restriction is required because we run BTL as a preprocessing step in order to extract the top *n* = 150 players before applying our methods and because BTL can be susceptible to large errors if the graph of matches (with players as nodes, wins as directed edges) is not strongly^{15} connected. Using data on numbers of recreational tennis players (Tennis Europe 2015; The Physical Activity Council 2016), we roughly estimate a universe of *N* that we chose. Specifically, we also considered *N* = 1 and *N* = 50 million and obtained very similar results regarding *L*. We used data from 2014 for all of our experiments and our results were based on the model of Section 2 with a Laplace prior on the skills and an uninformative prior on *L* with *σ*_{L} = 10. The MCMC chains for the extended model of Section 3 were initialized by a random perturbation from the approximate mean as outlined in Section 3.2, and convergence was checked using the Gelman-Rubin diagnostic (Gelman and Rubin 1992).

### 5.1 Posterior distribution of *L*

In Figure 2, we display the posterior of *L* obtained using each of the different models of Sections 2 to 4 using data from 2014 only. We observe that these inferred posteriors are essentially identical. This is an interesting result and it suggests that for large populations there is essentially no additional information conveyed to the posterior of *L* by match-play data or ranked handedness if we are already given the proportion of left-handers among the top *n* players. The posterior of *L* can be interpreted in terms of a change in rank as discussed in Section 2.3. Since the posterior of the aggregate handedness with the Laplace approximation agrees with the posterior obtained from the full match-play data, we would argue the results of Table 1 are valid even in the light of the match-play data. These results suggest that being left-handed in tennis improves a player’s rank by a factor of approximately 1.36 on average. Of course, these results were based on 2014 data only and as we shall see in Section 6, there is substantial evidence to suggest that *L* has varied through time.

While match-play data and ranked handedness therefore provide little new information on *L* over and beyond knowing the proportion of left-handers in the top *n* players, we can use the match-play model to answer hypothetical questions regarding win probabilities when the lefty advantage is stripped away from the players’ skills. We discuss such hypothetical questions in Section 5.3.

### 5.2 On the importance of conditioning on ${\text{Top}}_{n,N}$

We now consider if it’s important to condition on

It makes sense then to assess if there is value in conditioning on *L* and this may also be seen in Figure 2. Indeed when we fail to condition on *L* that places more probability on a lefty disadvantage than a lefty advantage and whose mode is negative. The reason for this is that in 2014 there were few very highly ranked left-handers with Nadal being the only left-hander among the top 20. In fact when we apply BTL to the match-play data among the top 150 players that year we find that the mean rank of top right-handers is 74.07 while the mean rank of top lefties is 83.39. A model that only considers results among top 150 players that year therefore concludes there appears to be a disadvantage to being left-handed. In contrast, when we also condition upon *L* over and beyond just considering the match-play results among the top *n* players.

### 5.3 Posterior of skills with and without the advantage of left-handedness

We now consider the posterior distribution of the innate skills and how they differ (in the case of left-handers) from the posterior of the total skills. We also consider the effect of *L* on match-play probabilities and rankings of individual players. These results are based on the extended model of Section 3. In Figures 3 and 4 we demonstrate the posterior of the skills of Rafael Nadal who plays^{16} left-handed and the right-handed Roger Federer. During their careers these two players have forged perhaps the greatest rivalry in modern sport. The figures display the posterior distributions of the innate skill, *G*, and total skill,

Table 2: Probability of match-play results with and without the advantage of left-handedness.

| | Djokovic | Federer | Nishikori | **Nadal** | Murray | Raonic | **Klizan** | **Lopez** |
|---|---|---|---|---|---|---|---|---|
| Djokovic | – | 38.9 | 31.5 | **29.3** | 21.9 | 20.2 | **8.0** | **7.6** |
| Federer | 38.9 | – | 42.0 | **39.5** | 30.6 | 28.5 | **12.0** | **11.5** |
| Nishikori | 31.5 | 42.0 | – | **47.4** | 37.8 | 35.5 | **15.9** | **15.2** |
| **Nadal** | **22.9** | **31.8** | **39.2** | – | **40.3** | **37.9** | 17.3 | 16.7 |
| Murray | 21.9 | 30.6 | 37.8 | **48.5** | – | 47.5 | **23.7** | **22.8** |
| Raonic | 20.2 | 28.5 | 35.5 | **46.0** | 47.5 | – | **25.6** | **24.7** |
| **Klizan** | **5.9** | **8.9** | **11.9** | 17.3 | **18.2** | **19.7** | – | 48.8 |
| **Lopez** | **5.6** | **8.5** | **11.4** | 16.7 | **17.5** | **19.0** | 48.8 | – |

The players are ordered according to their rank from the top ranked player (Djokovic) to the lowest ranked player (Lopez). A player’s rank is given by the posterior mean of his total skill, *S*. Each cell gives the probability of the lower ranked player beating the higher ranked player. Above the diagonal the advantage of left-handedness is included in the calculations whereas below the diagonal it is not. The left-handed players are identified in bold font together with the match-play probabilities that change when the left-handed advantage is excluded, i.e. when left- and right-handed players meet.

While Nadal and Federer is probably the most interesting match-up for tennis fans, this match-up also points to one of the weaknesses of our model. Specifically, we do not allow for player interaction effects in determining win probabilities whereas it is well known that some players match up especially well against other players. Nadal, for example, is famous for matching up particularly well against Federer and has a 23-15 career head-to-head win/loss record^{17} against Federer despite Federer often having a superior record (to Nadal’s) against other players. For this reason (and others outlined in the introduction) we do acknowledge that inference regarding individual players should be conducted with care.

Table 2 extends the analysis of Federer and Nadal to other top ranked players. Each cell in the table provides the probability of the lower ranked player beating the higher ranked player according to their posterior skill distributions. Above the diagonal the advantage of left-handedness is included in the calculations whereas below the diagonal it is not. The effect on the winning probability due to left-handedness can be observed by comparing the values above and below the diagonal. For example Nadal has a 39.5% chance of beating Federer with the advantage of left-handedness included, but this drops to 31.8% when the advantage is excluded. The left-handed players are identified in bold font together with the match-play probabilities that change when the left-handed advantage is excluded, i.e. when left- and right-handed players meet. In all cases removing the advantage of left-handedness decreases the winning probability of left-handed players, although the magnitude of this effect varies on account of the non-linearity of the sigmoidal match-play probabilities in (17).
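
The non-linearity can be made concrete with a few entries of Table 2. The sketch below assumes that the match-play probability in (17) is logistic in the skill difference (that equation is not restated in this section) and treats each tabulated, posterior-averaged probability as if it came from a single skill gap, so the numbers are only approximate. Under those assumptions, removing the advantage should shift every lefty-versus-righty match-up by roughly the same amount on the log-odds scale, while the change in percentage points depends on how close the match-up is to 50%.

```python
import math

def logit(p):
    """Log-odds of a probability p in (0, 1)."""
    return math.log(p / (1.0 - p))

# (with advantage, without advantage) in percent, copied from Table 2.
pairs = {
    "Nadal beats Federer": (39.5, 31.8),
    "Nadal beats Djokovic": (29.3, 22.9),
    "Klizan beats Djokovic": (8.0, 5.9),
    "Lopez beats Federer": (11.5, 8.5),
}

drops = {}   # change in percentage points
shifts = {}  # change on the log-odds scale
for name, (with_adv, without_adv) in pairs.items():
    drops[name] = with_adv - without_adv
    shifts[name] = logit(with_adv / 100.0) - logit(without_adv / 100.0)
    print(f"{name}: drop = {drops[name]:.1f} points, log-odds shift = {shifts[name]:.3f}")
```

The percentage-point drops range from about 2 to about 8, whereas the log-odds shifts are nearly constant, which is what a common additive advantage passed through a sigmoid would produce.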

If the advantage of left-handedness was removed then the decrease in left-handers’ skills would lead to a change in their rankings. In Table 3, for example, we see how the ranking (as determined by posterior skill means) of the top four left-handed players changes when we remove the left-handed advantage. We also display how these rankings would change when we only use handedness data and external rankings as in Section 4, and when we only have aggregate handedness data of the top *n* players as in Section 2.2. We see that the change in rankings suggested by each of the methods largely agree, although there are some minor variations. Notably Nadal’s rank does not change when we use the full match-play data-set but he does drop from 4 to 5 when we use the other inference approaches. Klizan’s change in rank using the handedness and external rankings data is much smaller than for the other two methods. Overall, however, we see substantial agreement between the three approaches. This argues strongly for use of the simplest approach, i.e. the aggregate handedness approach of Section 2, when the only quantity of interest is the posterior distribution of *L*.

Table 3: Changes in rank of prominent left-handed players when the left-handed advantage is excluded.

| Player | Match-play | Handedness and rankings | Aggregate handedness |
|---|---|---|---|
| Rafael Nadal | | | |
| Martin Klizan | | | |
| Feliciano Lopez | | | |
| Fernando Verdasco | | | |

The change in rank in the “Match-play” column is computed using the MCMC samples generated using the full match-play data. The change in rank in the “Handedness and Rankings” column is computed using the MCMC samples given only individual handedness data and external skill rankings. The “Aggregate handedness” column uses the BTL ranking as the baseline and the change in rank is obtained by multiplying the baseline by the rank scaling factor of 1.36 from Table 1 and rounding to the closest integer.
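
The “Aggregate handedness” calculation is simple enough to spell out. The helper below is a hypothetical illustration of the multiply-and-round rule described in the note above; only the factor 1.36 and Nadal’s baseline rank of 4 are taken from the text.

```python
import math

RANK_SCALING_FACTOR = 1.36  # rank scaling factor quoted from Table 1

def rank_without_advantage(baseline_rank, factor=RANK_SCALING_FACTOR):
    """Scale a left-hander's baseline rank and round to the closest integer."""
    return int(math.floor(baseline_rank * factor + 0.5))

# Nadal's baseline rank of 4 maps to 5, matching the drop from 4 to 5 noted in the text.
print(rank_without_advantage(4))
```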

## 6 A dynamic model for *L*

Thus far we have only considered inference based on data collected over a single time period, but it is also interesting to investigate how the advantage of left-handedness has changed over time in a given sport. Towards this end, we assume that the advantage of left-handedness, *L*_{t}, in period *t* follows a Gaussian random walk for $t=1,\mathrm{\dots},T$. *L*_{t} is latent and therefore unobserved. Instead, we observe the number of top players, *n*_{t}, as well as the number, ${n}_{t}^{l}$, of top left-handers in each period. The generative model is:

- Initial left-handed advantage: ${L}_{0}\sim \text{N}(0,{\theta}^{2})$
- For time periods $t=1,\mathrm{\dots},T$:
  - Left-handed advantage: ${L}_{t}\sim \text{N}({L}_{t-1},{\sigma}_{K}^{2})$
  - Number of top left-handers: ${n}_{t}^{l}\sim p({n}_{t}^{l}\mid {L}_{t};{n}_{t})$

where we assume *θ*^{2} is large to reflect initial uncertainty on *L*_{0} and *σ*_{K} controls how smoothly *L* varies over time. The posterior distribution of *L* given the data is then given by

The main complexity in (31) stems from the distribution

where

where the constant of proportionality coming from the Laplace approximation does not depend on *L* are Gaussian, it is possible to analytically integrate out *L* leaving a closed form expression involving *L*.

In the top panel of Figure 5 we plot the fraction of left-handers among the top 100 men’s tennis players as a function of year from 1985 to 2016 and using data from (Bačić and Gazala 2016). In the bottom panel we plot the inferred value of *L* over this time period. We note that in 2006 and 2007, the fraction of top left-handers dropped below 11%, the estimated fraction of left-handers in the general population. A naive analysis would conclude that for those years the advantage of left-handedness was negative. However, this would ignore the randomness in the fraction of top left-handers from year to year. The Kalman filter smooths over the anomalous 2006 and 2007 years and has a posterior on *L* with positive mean throughout 1985 to 2016. We also recall our observation from the introduction where we noted that only 2 of the top 32 seeds in Wimbledon in 2015 were left-handed. There is no inconsistency between that observation and the data from 2015 in Figure 5, however. While there were indeed only 2 left-handed men among the top 32 in the official year-end world rankings, there was a total of 13 left-handers among the top 100.

Finally we note that we could also have included individual player skills as latent states in our model but this would have resulted in a much larger state space and made inference significantly^{18} more difficult. As we observed in Section 5.1, including match-play results does not change the posterior distribution of *L* significantly and so we are losing very little information regarding *L*_{t} when our model and inference are based only on the observed number of top left-handed players.
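
The filtering recursion for this random-walk model can be sketched in a few lines of Python. The sketch rests on assumptions that are not confirmed by this section: it takes the probability that a top player is left-handed to be `sigmoid(logit(q) + L / b)`, which is what Laplace innate skills with scale `b` and a population fraction `q` of left-handers would give in the upper tail, and it replaces the binomial likelihood by its Gaussian (Laplace) approximation in *L*. Only the forward filter is shown, not the smoother, and the function names and the yearly counts in the usage example are hypothetical.

```python
import math

def logit(p):
    return math.log(p / (1.0 - p))

def pseudo_observation(n_left, n_top, q=0.11, b=1.0):
    """Gaussian approximation of the binomial likelihood as a function of L.

    Returns the mode and variance (in units of L) of the approximating Gaussian.
    Requires 0 < n_left < n_top so that the mode is finite.
    """
    p_hat = n_left / n_top
    mean = b * (logit(p_hat) - logit(q))
    var = b ** 2 / (n_top * p_hat * (1.0 - p_hat))
    return mean, var

def kalman_filter(counts, n_top, sigma_k, theta=10.0, q=0.11, b=1.0):
    """Filtered means and variances of L_t for t = 1, ..., T."""
    m, v = 0.0, theta ** 2          # prior L_0 ~ N(0, theta^2)
    means, variances = [], []
    for n_left in counts:
        v_pred = v + sigma_k ** 2   # predict: L_t ~ N(L_{t-1}, sigma_K^2)
        y, r = pseudo_observation(n_left, n_top, q, b)
        gain = v_pred / (v_pred + r)
        m = m + gain * (y - m)      # update with the pseudo-observation
        v = (1.0 - gain) * v_pred
        means.append(m)
        variances.append(v)
    return means, variances

# Hypothetical counts of left-handers among the top 100 in four consecutive years.
means, variances = kalman_filter([13, 14, 12, 15], n_top=100, sigma_k=0.1)
```

Because each update is a weighted average of the previous mean and the new pseudo-observation, a single year with few left-handers pulls the filtered mean down only partially, which is the smoothing behaviour described for 2006 and 2007.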

## 7 Conclusions and further research

In this paper we have proposed a model for identifying the advantage, *L*, of being left-handed in one-on-one interactive sports. We used a Bayesian latent ability framework but with the additional complication of having a latent factor, i.e. the advantage of left-handedness, that we needed to estimate. Our results argued for the use of a medium-tailed distribution such as the Laplace distribution when modeling the innate skills of players. We showed how to infer the value of *L* when only the proportion of top left-handed players is available. In this case we showed that the distribution of the number of left-handed players among the top *n* (out of *N*) converges as *N* → ∞ to a binomial distribution with a success probability that depends on the tail-length of the innate skill distribution. We also used this result to develop a simple dynamic model for inferring how the value of *L* has varied through time in a given sport. In order to estimate the innate skills of top players we also considered the case when match-play data among the top *n* players was available. We observed that including match-play data in our model makes little or no difference to the inference of the left-handedness advantage but it did allow us to address hypothetical questions regarding match-play win probabilities with and without the benefit of left-handedness.

It is worth noting that our framework is somewhat coarse by necessity. In tennis for example, there are important factors such as player ability varying across different surfaces (clay, hard court or grass) that we don’t model. We also attach equal weight to all matches in our model estimation despite the fact that some matches and tournaments are clearly (much) more important than others. Moreover, we assume (for a given sport) that there is a single latent variable, *L*, which measures the advantage of being left-handed in that sport. We therefore assume that the total skill of each left-handed player benefits to the same extent according to the value of *L*. This of course would not be true in practice as it seems likely that some lefties take better advantage of being left-handed than others. Alternatively, our model assumes that all righties are disadvantaged to the same extent by being right-handed. Again, it seems far more likely that some right-handed players are more adversely affected playing lefties than other right-handed players. Finally, we don’t allow for interaction effects between two players in determining the probability of one player beating the other. Again, this seems unlikely to be true in practice where some players are known to “match up well” against other players. Nonetheless, we do believe our model captures the value of being left-handed in an aggregate sense and can be reasonably interpreted in that manner. While it is tempting to use the model to answer questions such as “What is the probability that Federer would beat Nadal if Nadal was right-handed?” (and we do ask and answer such questions in Section 5!), we do acknowledge that the answers to such specific questions should be taken lightly for the reasons outlined above.

There are several directions of interest for future research. First, it would be interesting to apply our model to data-sets from other one-on-one sports and to estimate how *L*_{t} varies with time in these sports. The Kalman filtering/smoothing approach developed in Section 6 is straightforward to implement and the data requirements are very limited as we only need the aggregate data for each time period, i.e. the number of top players and the number of left-handers among them. While the trends in *L*_{t} across different sports would be interesting in their own right, the cross-sport dynamics of *L*_{t} could be used to shed light on the potential explanations behind the benefit of left-handedness. For example, there is some evidence in Figure 5 suggesting that the benefit of left-handedness in men’s professional tennis has decreased with time. If such a trend could be linked appropriately with other developments in men’s tennis such as the superior strength and speed of the players, superior racket and string technology, time pressure etc. then it may be possible to attach more or less weight to the various hypotheses explaining the benefit of left-handedness. The recent^{19} work of Loffing (Loffing 2017), for example, studies the link between the lefty advantage and time pressure in elite interactive sports. While these ideas are clearly in the literature already, the Kalman filtering approach provides a systematic, straightforward and consistent approach for measuring *L*_{t}. This can only aid with identifying the explanation(s) for the benefits of left-handedness and how it varies across sports and time.

It would also be of interest to consider more complex models that can also account for interactions in the skill levels between players and/or different match-play circumstances. As discussed above, examples of the latter would include distinguishing between surfaces (clay, grass etc.) and grand-slam/regular tour matches in tennis. Given the flexibility afforded by a Bayesian approach it should be straightforward to account for such features in our models. Given limited match-play data in many sports, however, it is not clear that we would be able to learn much about such features. In tennis, for example, even the very best players may only end up playing each other a couple of times a year or less. As mentioned earlier, Nadal and Federer only faced each other once in 2014. It would therefore be necessary to consider data-sets spanning multiple years in which case it would presumably also be necessary to include form as well as the general trajectory of career arcs in our models. We note that such modeling might be of more general interest and identifying the value(s) of *L* might not be the main interest in such a study.

Continuing on from the previous point, there has been considerable interest in recent years in the so-called “interacting performances theory” (O’Donoghue 2009). This theory recognizes that the performance (and outcome of a performance) is determined by both the skill level or quality of an opponent as well as the specific type of an opponent. Indeed, different players are influenced by the same opponent types in different ways. Under this theory, it is important^{20} to be able to identify different types of players. Once these types have been identified we can then label each player as being of a specific type. It may then be possible to accommodate interaction effects between specific players (as outlined in the paragraph immediately above) by instead allowing for player-type interactions. Such a model would require considerably fewer parameters to be estimated than a model which allowed for specific player-interactions.

Returning to the issue of the left-handedness advantage, we would like to adapt these models and apply them to other sports such as cricket and baseball where one-on-one situations still arise and indeed are the main aspect of the sport. It is well known, for example, that left-handed pitchers in Major League Baseball (approx. 25%) and left-handed batsmen in elite cricket (approx. 20%) are over-represented. While the model of Section 2 that only uses aggregate handedness could be directly applied to these sports, it would be necessary to adapt the match-play model of Section 3 to handle them. This follows because the one-on-one situations that occur in these sports do not have binary outcomes like win/lose, but instead have multiple possible outcomes whose probabilities would need to be linked to the skill levels of the two participants.

We hope to consider some of these alternative directions in future research.

## A.1 Proof of proposition 1

We begin by observing that the exchangeability of players implies

Integrating (34) over the handedness of the top *n* players yields

We can expand each term in the summation by conditioning on the top players’ skills,

Most of the proof will focus on showing that the r.h.s. of (36) equals *i*^{th} order statistic of the skills *F* denotes the conditional CDF of a player’s total skill given *L*. Note that for any values of *k*, *L* and *ϵ* > 0, we can find *k*, *L*, *ϵ* and *N* we have

where *f* denotes the PDF of *F*. The conditional distribution of player handedness in (36) factorizes as

and consider now the *i*^{th} term in this product. We have

Assuming

which we recognize as *H _{i}* = 0 yields

*ϵ*> 0 there exists a

We are now in a position to prove that

For any *ϵ* > 0, for all

Observe that

where the first inequality follows from (40). Similarly, its minimum value is bounded by

where the first inequality follows from (40) and the second inequality follows from (37). Combining the upper and lower bounds of (43) and (44) implies that the first term on the r.h.s. of (42) is bounded above by

We can therefore rewrite the bound in (42) as

completing the proof of (41). Substituting (41) into (36) and (36) into (35) yields

so the limiting distribution is

## A.2 Rate of convergence for probability of left-handed top player given normal innate skills

Throughout this subsection we use *L*, is a strictly positive and known constant. The notation *N* IID variables

*If the innate skills are IID standard normal and the advantage of left-handedness, L, is strictly positive, then *

Since *L* > 0 is assumed known we will typically not bother to condition on it explicitly in the arguments below. This means the expectations that appear below are never over *L*. We begin by observing that

where the second equality follows from precisely the same argument that we used to derive (40). Lemma 1 below implies that we can replace *N* IID standard normal random variables and with the equality replaced by a greater-than-or-equal to inequality. We therefore obtain

where

We can now complete the proof by applying the results of Lemma 2 and Lemma 3 below beginning with (47). Specifically, we have

where the second inequality follows from Lemma 2, the third inequality follows from Jensen’s inequality, the fourth follows from Lemma 3 (a standard result that bounds the maximum of IID standard normal random variables) and

As stated earlier, in each of the following Lemmas it is assumed that

*Define the function*

*If the advantage of left-handedness, L, is strictly positive then *

We first recall the CDF of the total skill, *S*, is given by *s* if *L* > 0. We then obtain

where the second to last line follows from integration by parts, and the last line follows because for any *L* > 0 we have (i) *s* and (ii)

*For any constants L and c, we have*

*where *

We first note that

Integrating both sides of (48) w.r.t. *x* from −∞ to *c* yields

from which we obtain

where *X* is a standard normal random variable. The statement of the Lemma now follows by noting

where the inequality follows from (49). □

The following Lemma is well known but we include it here for the sake of completeness.

The proof follows from a simple application of Jensen’s Inequality and the fact that the sum of non-negative variables is larger than the maximum of those variables. For any constant

where we chose
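
For reference, the standard form of this argument is sketched below. It is offered as an assumption about what is intended here rather than a reproduction of the displays in this proof; the notation $M_N$ for the maximum of the $N$ IID standard normal variables $X_1,\dots,X_N$ and the symbol $t$ for the constant are introduced only for this sketch. For any $t>0$, Jensen's inequality and the bound of a maximum of non-negative terms by their sum give

$$\exp\left(t\,\mathrm{E}[M_N]\right)\le \mathrm{E}\left[\exp(tM_N)\right]=\mathrm{E}\left[\max_{i}\exp(tX_i)\right]\le \sum_{i=1}^{N}\mathrm{E}\left[\exp(tX_i)\right]=N e^{t^{2}/2},$$

so that

$$\mathrm{E}[M_N]\le \frac{\log N}{t}+\frac{t}{2}.$$

Choosing $t=\sqrt{2\log N}$ (for $N\ge 2$) makes the two terms equal and yields $\mathrm{E}[M_N]\le \sqrt{2\log N}$.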

## A.3 Difference in skills at given quantiles with Laplace distributed innate skills

*If the innate skills are IID Laplace distributed with mean 0 and scale parameter b > 0, then for any fixed finite L and sufficiently small quantiles *

*where the convergence is in probability and we use *

Since the innate skills are Laplace distributed with mean 0 and scale *b* > 0 they have the CDF

while the total skill distribution for someone from the mixed population of left- and right-handers has CDF

Since the Laplace distribution has domain ℝ, for any fixed finite *L* we have *λ*. Since

where the second equality follows from (51) with

From (David and Nagaraja 2003, p. 288) we know that

where convergence is understood to be in probability. Substituting (53) into (54) then yields (for *λ*_{i} and *λ*_{j} sufficiently small)

as claimed. □

## A.4 Estimating the Kalman filtering smoothing parameter

Here we explain how to compute the MLE for *L _{t}* through time. The likelihood of the observed handedness data over the time interval

where *σ*_{K} we first simplify the likelihood, keeping only factors involving *σ*_{K}.

We therefore obtain

where

and

The expression in (56) can then be evaluated numerically for any value of *σ*_{K}. (Note that Σ and therefore the *σ*_{K} so an explicit solution for the MLE of *σ*_{K} is unlikely to be available.)
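
One way to carry out such a numerical evaluation is sketched below. Instead of the closed-form expression, it uses the prediction-error decomposition of a linear-Gaussian state-space model, which should agree with the integrated likelihood up to terms that do not involve *σ*_{K}, provided each period’s likelihood has already been replaced by a Gaussian pseudo-observation with mean `ys[t]` and variance `rs[t]`. That replacement, the grid search, the function names and the example inputs are all assumptions made for this sketch.

```python
import math

def log_marginal_likelihood(ys, rs, sigma_k, theta=10.0):
    """Log-likelihood of Gaussian pseudo-observations with L integrated out.

    ys, rs: per-period pseudo-observation means and variances (assumed given).
    sigma_k: random-walk standard deviation; theta: prior std. dev. of L_0.
    """
    m, v, total = 0.0, theta ** 2, 0.0
    for y, r in zip(ys, rs):
        v_pred = v + sigma_k ** 2            # one-step-ahead variance of L_t
        s = v_pred + r                       # variance of the prediction error
        e = y - m                            # prediction error
        total += -0.5 * (math.log(2.0 * math.pi * s) + e * e / s)
        gain = v_pred / s
        m, v = m + gain * e, (1.0 - gain) * v_pred
    return total

def grid_mle(ys, rs, grid):
    """Return the grid value of sigma_K with the highest log-likelihood."""
    return max(grid, key=lambda sk: log_marginal_likelihood(ys, rs, sk))

# Hypothetical pseudo-observations for eight periods.
ys = [0.20, 0.25, 0.15, 0.30, 0.10, 0.05, 0.12, 0.18]
rs = [0.09] * 8
best_sigma_k = grid_mle(ys, rs, [0.01, 0.05, 0.1, 0.2, 0.5, 1.0])
```

A finer grid or a one-dimensional optimizer could be substituted for `grid_mle`. The grid is used here only to keep the example dependency-free.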

## References

Abrams, D. M. and M. J. Panaggio. 2012. “A Model Balancing Cooperation and Competition can Explain our Right-Handed World and the Dominance of Left-Handed Athletes.” Journal of the Royal Society Interface 9:2718–2722.

Aggleton, J. P. and C. J. Wood. 1990. “Is There a Left-Handed Advantage in ’Ballistic’ Sports?” International Journal of Sport Psychology 21:46–57.

Akpinar, S., R. L. Sainburg, S. Kirazci, and A. Przybyla. 2015. “Motor Asymmetry in Elite Fencers.” Journal of Motor Behavior 47:302–311.

Bačić, B. and A. H. Gazala. 2016. “Left-Handed Representation in top 100 Male Professional Tennis Players: Multi-Disciplinary Perspectives.” http://tmg.aut.ac.nz/tmnz2016/papers/Boris2016.pdf, accessed: 2017-06-15.

Barber, D. 2012. Bayesian Reasoning and Machine Learning. New York, NY, USA: Cambridge University Press.

Billiard, S., C. Faurie, and M. Raymond. 2005. “Maintenance of Handedness Polymorphism in Humans: A Frequency-Dependent Selection Model.” Journal of Theoretical Biology 235:85–93.

Bisiacchi, P. S., H. Ripoll, J. F. Stein, P. Simonet, and G. Azemar. 1985. “Left-Handedness in Fencers: An Attentional Advantage?” Perceptual and Motor Skills 61:507–513.

Bradley, R. A. and M. E. Terry. 1952. “Rank Analysis of Incomplete Block Designs: I. The Method of Paired Comparisons.” Biometrika 39:324–345. http://www.jstor.org/stable/2334029.

Breznik, K. 2013. “On the Gender Effects of Handedness in Professional Tennis.” Journal of Sports Science & Medicine 12:346.

Cui, Y., M.-Á. Gómez, B. Gonçalves, H. Liu, and J. Sampaio. 2017a. “Effects of Experience and Relative Quality in Tennis Match Performance During Four Grand Slams.” International Journal of Performance Analysis in Sport 17:783–801.

Cui, Y., M.-Á. Gómez, B. Gonçalves, and J. Sampaio. 2017b. “Identifying Different Tennis Player Types: An Exploratory Approach to Interpret Performance Based on Player Features.” in Complex Systems in Sport, International Congress Linking Theory and Practice 97.

Daems, A. and K. Verfaillie. 1999. “Viewpoint-Dependent Priming Effects in the Perception of Human Actions and Body Postures.” Visual Cognition 6:665–693.

David, H. A. and H. N. Nagaraja. 2003. Order Statistics. 3rd ed. Hoboken, N.J: Wiley-Interscience.

Del Corral, J. and J. Prieto-Rodríguez. 2010. “Are Differences in Ranks Good Predictors for Grand Slam Tennis Matches?” International Journal of Forecasting 26:551–563.

Dochtermann, N. A., C. Gienger, and S. Zappettini. 2014. “Born to Win? Maybe, but Perhaps only Against Inferior Competition.” Animal Behaviour 96:e1–e3.

Elo, A. E. 1978. The Rating of Chessplayers, Past and Present. Batsford: Arco Pub.

Fischer, G. H. and I. W. Molenaar. 2012. Rasch Models: Foundations, Recent Developments, and Applications. Springer-Verlag, NY, USA: Springer Science & Business Media.

Flatt, A. E. 2008. “Is Being Left-Handed a Handicap? The Short and Useless Answer is ‘Yes and No.’” Proceedings of Baylor University Medical Center 21:304–307.

Gelman, A. and D. B. Rubin. 1992. “Inference from Iterative Simulation Using Multiple Sequences.” Statistical Science 7:457–472.

Geschwind, N. and A. M. Galaburda. 1985. “Cerebral Lateralization: Biological Mechanisms, Associations, and Pathology: I. A Hypothesis and a Program for Research.” Archives of Neurology 42:428–459.

Ghirlanda, S., E. Frasnelli, and G. Vallortigara. 2009. “Intraspecific Competition and Coordination in the Evolution of Lateralization.” Philosophical Transactions of the Royal Society of London B: Biological Sciences 364:861–866.

Glickman, M. E. 1999. “Parameter Estimation in Large Dynamic Paired Comparison Experiments.” Applied Statistics 48:377–394.

Goldstein, S. R. and C. A. Young. 1996. “‘Evolutionary’ Stable Strategy of Handedness in Major League Baseball.” Journal of Comparative Psychology 110:164–169.

Grossman, E., M. Donnelly, R. Price, D. Pickens, V. Morgan, G. Neighbor, and R. Blake. 2000. “Brain Areas Involved in Perception of Biological Motion.” Journal of Cognitive Neuroscience 12:711–720.

Grouios, G., H. Tsorbatzoudis, K. Alexandris, and V. Barkoukis. 2000a. “Do Left-Handed Competitors have an Innate Superiority in Sports?” Perceptual and Motor Skills 90:1273–1282.

Grouios, G., H. Tsorbatzoudis, K. Alexandris, and V. Barkoukis. 2000b. “Do Left-Handed Competitors have an Innate Superiority in Sports?” Perceptual and Motor Skills 90:1273–1282.

Gursoy, R. 2009. “Effects of Left- or Right-Hand Preference on the Success of Boxers in Turkey.” British Journal of Sports Medicine 43:142–144.

Herbrich, R., T. Minka, and T. Graepel. 2007. “TrueSkill™: A Bayesian Skill Rating System.” in Advances in Neural Information Processing Systems, 569–576.

Holtzen, D. W. 2000. “Handedness and Professional Tennis.” International Journal of Neuroscience 105:101–119.

Liew, J. 2015. “Wimbledon 2015: Once they were Great – but where have all the Lefties Gone?” The Telegraph, June 27, 2015. https://www.telegraph.co.uk/sport/tennis/wimbledon/11703777/Wimbledon-2015-Once-they-were-great-but-where-have-all-the-lefties-gone.html.

Linderman, S., M. Johnson, and R. P. Adams. 2015. “Dependent Multinomial Models Made Easy: Stick-Breaking with the Pólya-Gamma Augmentation.” in Advances in Neural Information Processing Systems, 3456–3464.

Loffing, F. 2017. “Left-Handedness and Time Pressure in Elite Interactive Ball Games.” Biology Letters 13. DOI: 10.1098/rsbl.2017.0446.

Loffing, F. and N. Hagemann. 2015. “Pushing Through Evolution? Incidence and Fight Records of Left-Oriented Fighters in Professional Boxing History.” Laterality: Asymmetries of Body, Brain and Cognition 20:270–286.

Loffing, F. and N. Hagemann. 2016. “Chapter 12 – Performance Differences between Left- and Right-Sided Athletes in One-on-One Interactive Sports.” in Laterality in Sports, edited by F. Loffing, N. Hagemann, B. Strauss, and C. MacMahon, pp. 249–277. San Diego: Academic Press. https://www.sciencedirect.com/science/article/pii/B9780128014264000122.

Loffing, F., N. Hagemann, and B. Strauss. 2012a. “Left-Handedness in Professional and Amateur Tennis.” PLoS One 7. DOI: 10.1371/journal.pone.0049325.

Loffing, F., N. Hagemann, and B. Strauss. 2012b. “Left-Handedness in Professional and Amateur Tennis.” PLoS One 7:e49325.

Luce, R. D. 1959. Individual Choice Behavior: A Theoretical Analysis. New York: Wiley.

Mosteller, F. 1951. “Remarks on the Method of Paired Comparisons: I. The Least Squares Solution Assuming Equal Standard Deviations and Equal Correlations.” Psychometrika 16:3–9. http://dx.doi.org/10.1007/BF02313422.

Murphy, K. P. 2012. Machine Learning: A Probabilistic Perspective. Cambridge, MA: MIT Press.

Murray, I. and R. P. Adams. 2010. “Slice Sampling Covariance Hyperparameters of Latent Gaussian Models.” in Advances in Neural Information Processing Systems, 1732–1740.

Murray, I., R. P. Adams, and D. MacKay. 2010. “Elliptical Slice Sampling.” in International Conference on Artificial Intelligence and Statistics, 541–548.

Nass, R. and M. Gazzaniga. 1987. “Cerebral Lateralization and Specialization in Human Central Nervous System.” Handbook of Physiology. Vol. 5. New York: Oxford University Press. https://doi.org/10.1002/cphy.cp010518.

Neal, R. M. 2011. “MCMC using Hamiltonian Dynamics.” Handbook of Markov Chain Monte Carlo 2:113–162.

O’Donoghue, P. 2005. “Normative Profiles of Sports Performance.” International Journal of Performance Analysis in Sport 5:104–119.

O’Donoghue, P. 2009. “Interacting Performances Theory.” International Journal of Performance Analysis in Sport 9:26–46.

Raymond, M., D. Pontier, A.-B. Dufour, and A. P. Moller. 1996. “Frequency-Dependent Maintenance of Left Handedness in Humans.” Proceedings of the Royal Society of London B: Biological Sciences 263:1627–1633.

Roberts, G. O., A. Gelman, and W. R. Gilks. 1997. “Weak Convergence and Optimal Scaling of Random Walk Metropolis Algorithms.” The Annals of Applied Probability 7:110–120.

Roberts, G. O. and J. S. Rosenthal. 2001. “Optimal Scaling for Various Metropolis-Hastings Algorithms.” Statistical Science 16:351–367.

Schorer, J., F. Loffing, N. Hagemann, and J. Baker. 2012. “Human Handedness in Interactive Situations: Negative Perceptual Frequency Effects can be Reversed!” Journal of Sports Sciences 30:507–513.

Stefani, R. T. 1997. “Survey of the Major World Sports Rating Systems.” Journal of Applied Statistics 24:635–646.

Sterkowicz, S., G. Lech, and J. Blecharz. 2010. “Effects of Laterality on the Technical/Tactical Behavior in View of the Results of Judo Fights.” Archives of Budo 6:173–177.

Stone, J. V. 1999. “Object Recognition: View-Specificity and Motion-Specificity.” Vision Research 39:4032–4044.

Taddei, F., M. P. Viggiano, and L. Mecacci. 1991. “Pattern Reversal Visual Evoked Potentials in Fencers.” International Journal of Psychophysiology 11:257–260.

Tennis Navigator. 2004. “Tennis Software – Tennis Navigator.” http://www.tennisnavigator.com/, accessed: 2015-02-03.

Tennis Europe. 2015. “About Tennis Europe”. https://www.tenniseurope.org/page/12173, accessed: 2017-01-01.

The Physical Activity Council. 2016. “2016 Participation Report The Physical Activity Council’s Annual Study Tracking Sports, Fitness, and Recreation Participation in the US.” https://cdn4.sportngin.com/attachments/document/0112/0253/2016_Physical_Activity_Council_Report.pdf.

Thurstone, L. L. 1927. “A Law of Comparative Judgment.” Psychological Review 34:273–286.

Ullén, F., D. Z. Hambrick, and M. A. Mosing. 2016. “Rethinking Expertise: A Multifactorial Gene–Environment Interaction Model of Expert Performance.” Psychological Bulletin 142:427–446.

Witelson, S. F. 1985. “The Brain Connection: The Corpus Callosum is Larger in Left-Handers.” Science 229:665–668.

Wood, C. and J. Aggleton. 1989. “Handedness in Fast Ball Sports: Do Lefthanders have an Innate Advantage?” British Journal of Psychology 80:227–240.

Yin, S. 2017. “Do Lefties have an Advantage in Sports? It Depends.” The New York Times. https://www.nytimes.com/2017/11/21/science/lefties-sports-advantage.html.

## Footnotes

^{1}

It should be noted, however, that most of these left-handed golfers play right-handed and so being left-handed and playing left-handed are not the same.

^{2}

We do note, however, that there are other hypotheses for explaining the high proportion of lefties in elite sports. They include higher testosterone levels, personality traits, psychological advantages and early childhood selection. See (Loffing and Hagemann 2016) and the references therein.

^{3}

See also the related literature on item response theory (IRT) from psychometrics and the Rasch model (Fischer and Molenaar 2012) which is closely related to the BTL model below.

^{4}

Throughout this paper we will use the term “innate skill” to refer to all components of a player’s “skill” apart from the advantage/disadvantage associated with being left-handed. We acknowledge that the term “innate” may be quite misleading – see the discussion below – but will continue with it nonetheless for want of a better term.

^{5}

We have in mind that *N* is the total number of players in the world who can play at a good amateur level or above. In tennis, for example, a good amateur level might be the level of varsity players or strong club players. *N* will obviously vary by sport but what we have in mind is that the player level should be good enough to take advantage of the left-handedness advantage (to the extent that it exists).

^{6}

In defense of the normal distribution, we show in Appendix A.2 that a normally distributed *G* may in fact be a suitable choice if *N* is not too large. Though *N* → ∞ for normally distributed skills, the rate of convergence is only *N*. However, the result of Proposition 1 suggests that a normal skill distribution would be inappropriate for very large values of *N*.

^{7}

We will use *b* could be accommodated via changes in *σ*_{L} and *σ*_{M}.

^{8}

As a sanity check, suppose that

^{9}

In each case, we normalized so that the function integrated to 1.

^{10}

There is a slight abuse of notation here since we first need to round *Nλ _{i}* to the nearest integer.

^{11}

Note that the proportion of left-handers can vary from country to country for various reasons, including cultural factors. It is therefore difficult to pin down exactly, but a value of ≈ 11% seems reasonable.

^{12}

It is only through the act of conditioning on

^{13}

The value of *λ* = 2.38 is optimal under certain conditions; see (Roberts and Rosenthal 2001). We also divide the covariance matrix by the dimension, *n*, to help counteract the curse of dimensionality, which makes proposing a good point more difficult as the dimension of the state-space increases.
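As a rough illustration of this scaling, the sketch below builds a proposal covariance from an estimated covariance matrix. It assumes that *λ* enters squared, as in the standard random-walk Metropolis scaling result, which may differ from the exact form used in the main text; the function name `scaled_proposal_cov` and the nested-list representation are hypothetical choices made only for this example.

```python
def scaled_proposal_cov(cov, lam=2.38):
    """Return (lam**2 / n) * cov, where cov is an n x n matrix as nested lists.

    Assumption: lam enters squared and the matrix is divided by the
    dimension n, following the standard optimal-scaling heuristic.
    """
    n = len(cov)
    scale = lam ** 2 / n
    return [[scale * value for value in row] for row in cov]


# Example: a 4-dimensional identity covariance.
identity = [[1.0 if i == j else 0.0 for j in range(4)] for i in range(4)]
proposal_cov = scaled_proposal_cov(identity)
```

With the identity matrix in four dimensions, each diagonal entry of the result is 2.38² / 4, so the proposal steps shrink as the dimension grows.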

^{14}

In the numerical results of Section 5, the MH proposal distribution for *L* will be a Gaussian with mean equal to the current value of *L* and with variance set during a tuning phase to obtain an acceptance probability ≈ 0.234. (This value is theoretically optimal under certain conditions; see Roberts, Gelman, and Gilks 1997.)
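A tuning phase of this kind can be sketched as follows. This is only a minimal illustration and not the procedure used for the results in Section 5: the target `log_target` is a hypothetical standard-normal stand-in for the posterior of *L*, and the update rule (a Robbins-Monro-style adjustment of the proposal standard deviation on the log scale) is one common choice among several.

```python
import math
import random


def log_target(x):
    # Hypothetical stand-in for the log-posterior of L (standard normal).
    return -0.5 * x * x


def run_chain(x0, proposal_sd, n_steps, rng):
    """Random-walk MH with a Gaussian proposal centered at the current value.

    Returns the final state and the observed acceptance rate.
    """
    x, lp, accepted = x0, log_target(x0), 0
    for _ in range(n_steps):
        x_new = rng.gauss(x, proposal_sd)
        lp_new = log_target(x_new)
        # Accept with probability min(1, exp(lp_new - lp)).
        if lp_new >= lp or rng.random() < math.exp(lp_new - lp):
            x, lp = x_new, lp_new
            accepted += 1
    return x, accepted / n_steps


def tune_proposal_sd(target_rate=0.234, n_rounds=100, steps_per_round=500, seed=0):
    """Adjust the proposal standard deviation toward the target acceptance rate."""
    rng = random.Random(seed)
    x, sd = 0.0, 1.0
    for k in range(n_rounds):
        x, rate = run_chain(x, sd, steps_per_round, rng)
        # Widen the proposal if accepting too often, narrow it otherwise;
        # the step size decays so that the adjustment settles down.
        sd *= math.exp((rate - target_rate) / math.sqrt(k + 1))
    return sd


tuned_sd = tune_proposal_sd()
_, check_rate = run_chain(0.0, tuned_sd, 20000, random.Random(1))
```

The proposal variance is the square of `tuned_sd`. Once tuning ends the variance is held fixed, so samples drawn afterwards come from a standard MH chain.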

^{15}

Otherwise there may be players that have played and won only one match who are impossible to meaningfully rank.

^{16}

It is of interest to note that although Nadal plays left-handed and has done so from a very young age, he is in fact right-handed. Therefore, to the extent that the ISH holds, Nadal may not actually be benefitting from *L*. In contrast, to the extent that the NFDS mechanism is responsible for the lefty advantage, Nadal should be benefitting from *L*.

^{17}

The 23-15 record is as of the time of writing, although it should be noted that Federer has won their last 5 encounters. It is also interesting to note that Nadal’s advantage is explained entirely by their clay-court results, where he has a 13-2 head-to-head win/loss edge.

^{18}

Recent techniques have been developed for MCMC on multinomial linear dynamical systems (Linderman, Johnson, and Adams 2015) that significantly improve the efficiency of inference in large state-space models. However, these methods will still be orders of magnitude slower than performing inference on the reduced state space containing only *L*.

^{19}

This paper was also discussed in a recent New York Times article (Yin 2017) reflecting the general interest in the lefty advantage beyond academia.

^{20}

See for example Cui et al. (2017a), Cui et al. (2017b) and O’Donoghue (2005), all of which discuss techniques for identifying different types of tennis players.

^{21}

Here [⋅] denotes rounding to the nearest integer.