Published by De Gruyter July 23, 2013

Advanced putting metrics in golf

Kasra Yousefi and Tim B. Swartz

Abstract

Using ShotLink data that records information on every stroke taken on the PGA Tour, this paper introduces a new metric to assess putting. The methodology is based on ideas from spatial statistics where a spatial map of each green is constructed. The spatial map provides estimates of the expected number of putts from various green locations. The difficulty of a putt is a function of both its distance to the hole and its direction. A golfer’s actual performance can then be assessed against the expected number of putts.


Corresponding author: Tim B. Swartz, Simon Fraser University

Appendix

We provide the details associated with the Markov chain Monte Carlo implementation briefly discussed in Section 3.

In a Markov chain approach, it is typical to first consider the construction of a Gibbs sampling algorithm. In a Gibbs sampling algorithm, we require the full conditional distributions of the model parameters. A little algebra yields the following full conditional densities:

Referring to the full conditional distributions in (10), we observe that sampling σ is straightforward. Most statistical software packages facilitate generation of a random variate v from the required Gamma distribution, and we then set

The remaining distributions in (10) are nonstandard statistical distributions, and we therefore introduce Metropolis steps, sometimes referred to as “Metropolis within Gibbs” steps for variate generation (Gilks, Richardson and Spiegelhalter 1996). A general strategy in Metropolis is to introduce proposal distributions which facilitate variate generation and yield variates that are in the “vicinity” of the full conditional distributions. For the generation of λi, we consider putting data obtained from the 2012 PGA Tour up to and including the Ryder Cup on September 30, 2012. The data were obtained from the website www.pgatour.com and are summarized in Table 3 by considering the median putting performance by PGA Tour professionals at a distance of r feet from the pin. From Section 2.1 of the paper, we recall that the expected number of putts is given by

where τ=exp{λ}. Since λ(r, θ) is a function of the distance to the pin r and the directional angle θ, we equate the values in the fifth column of Table 3 to E(Z | λ), from which plausible values of λ can be derived for a specified distance r to the pin. For example, when r=17.5, we obtain λ=0.071. These plausible values of λ can then be used in the development of proposal densities. For example, if ri=17.5 corresponding to the ith golfer, we consider the proposal density λi∼Normal(0.071, 0.04), where the variance is conservatively large relative to plausible values of λi.
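As a rough illustration of the Metropolis step for λi, the sketch below implements a generic independence-type Metropolis-Hastings update using the Normal(0.071, 0.04) proposal quoted above for ri=17.5. The function names and the stand-in target `log_full_conditional` are hypothetical: the actual full conditional of λi comes from (10) and is not reproduced here, so an arbitrary placeholder density is used purely to make the example runnable.

```python
import math
import random

PROPOSAL_MEAN = 0.071   # plausible lambda at r = 17.5 feet (from the text)
PROPOSAL_VAR = 0.04     # proposal variance (from the text)


def log_proposal(x):
    """Log density of the Normal proposal, up to an additive constant."""
    return -0.5 * (x - PROPOSAL_MEAN) ** 2 / PROPOSAL_VAR


def log_full_conditional(x):
    """HYPOTHETICAL placeholder for the log full conditional of lambda_i.

    The real density is given by (10) in the paper; this stand-in is an
    arbitrary Normal(0.1, 0.01) shape used only so that the sketch runs.
    """
    return -0.5 * (x - 0.1) ** 2 / 0.01


def metropolis_step(current, log_target, rng):
    """One independence Metropolis-Hastings update of a scalar parameter."""
    candidate = rng.gauss(PROPOSAL_MEAN, math.sqrt(PROPOSAL_VAR))
    # Hastings ratio on the log scale:
    # target(candidate) * q(current) / (target(current) * q(candidate))
    log_ratio = (log_target(candidate) - log_target(current)) + (
        log_proposal(current) - log_proposal(candidate)
    )
    accept_prob = math.exp(min(0.0, log_ratio))
    if rng.random() < accept_prob:
        return candidate, True
    return current, False


rng = random.Random(2012)
lam = PROPOSAL_MEAN
draws = []
for _ in range(5000):
    lam, _accepted = metropolis_step(lam, log_full_conditional, rng)
    draws.append(lam)

posterior_mean = sum(draws) / len(draws)
```

Because the proposal does not depend on the current value, a proposal variance that is large relative to the target keeps the sampler from missing regions of appreciable posterior mass, which is the motivation given above for the conservative choice of 0.04.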

Table 3. Putting summaries from the 2012 PGA Tour and the resultant expected number of putts.

| Putting distance r (in feet) | Proportion of one-putts | Proportion of two-putts | Proportion of three-putts | E(Z) |
|---|---|---|---|---|
| 7.5 | 0.554 | 0.441 | 0.005 | 1.45 |
| 12.5 | 0.298 | 0.694 | 0.008 | 1.71 |
| 17.5 | 0.180 | 0.804 | 0.016 | 1.84 |
| 22.5 | 0.114 | 0.861 | 0.025 | 1.91 |
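The E(Z) column of Table 3 can be recomputed directly from the three proportions, which sum to one in every row. The following sketch performs that check; the variable and function names are illustrative only.

```python
# Rows of Table 3: distance r (feet) ->
# (proportion one-putts, proportion two-putts, proportion three-putts, published E(Z))
table3 = {
    7.5: (0.554, 0.441, 0.005, 1.45),
    12.5: (0.298, 0.694, 0.008, 1.71),
    17.5: (0.180, 0.804, 0.016, 1.84),
    22.5: (0.114, 0.861, 0.025, 1.91),
}


def expected_putts(p1, p2, p3):
    """Expected number of putts when only 1, 2 or 3 putts are possible."""
    return 1 * p1 + 2 * p2 + 3 * p3


recomputed = {
    r: expected_putts(p1, p2, p3) for r, (p1, p2, p3, _published) in table3.items()
}

for r, ez in recomputed.items():
    print(f"r = {r:4.1f} ft: E(Z) = {ez:.3f} (published {table3[r][3]:.2f})")
```

Rounded to two decimals, the recomputed values agree with the published column.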

For the generation of βj, we consider the Normal […] proposal distribution. Recall that the Normal […] distribution is also the prior distribution for βj, j=1,…,8. Matching the prior with the proposal results in a simplification of the corresponding Metropolis acceptance ratio. Recall from (3) that λi has mean μi and that […] according to (5). Referring to the case of r=17.5 in Table 3 and the above considerations, this suggests

0.071=g(17.5)

corresponding to putting angles of average difficulty. Using similar constraints at other distances r, we obtain knots for the piecewise linear function g. To be precise, we set

where values […] are set to […] and values […] are set to […].

To get a feeling for the piecewise linear function g, Figure 6 provides a plot of g versus r at the maximum posterior mean β1=0.151 corresponding to the first quadrant (green line) and at the prior mean β=0.0 (red line). We observe that a putt from any given distance is more difficult in the first quadrant than on average.

Figure 6. The piecewise linear function g evaluated at the maximum posterior mean β1=0.151 (green line) and at β=0 (red line).
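Returning to the βj update, the simplification obtained by matching the proposal to the prior can be checked numerically. In an independence Metropolis step whose proposal density equals the prior density, the two cancel in the acceptance ratio, leaving only the ratio of the remaining (likelihood) terms. The sketch below compares the two forms. The function `log_lik` and the prior variance are hypothetical stand-ins, since neither is reproduced in this appendix; only the prior mean of 0 is taken from the text.

```python
import math

PRIOR_MEAN = 0.0   # prior mean of beta_j (from the text)
PRIOR_VAR = 1.0    # ASSUMED for illustration; the paper's value is not shown here


def log_normal_pdf(x, mean, var):
    """Log density of a Normal(mean, var) distribution."""
    return -0.5 * math.log(2.0 * math.pi * var) - 0.5 * (x - mean) ** 2 / var


def log_lik(beta):
    """HYPOTHETICAL log-likelihood term of the full conditional of beta_j."""
    return -3.0 * (beta - 0.15) ** 2


def full_log_ratio(current, candidate):
    """Log Hastings ratio with prior and proposal written out explicitly."""
    log_post_cand = log_lik(candidate) + log_normal_pdf(candidate, PRIOR_MEAN, PRIOR_VAR)
    log_post_cur = log_lik(current) + log_normal_pdf(current, PRIOR_MEAN, PRIOR_VAR)
    log_q_cand = log_normal_pdf(candidate, PRIOR_MEAN, PRIOR_VAR)
    log_q_cur = log_normal_pdf(current, PRIOR_MEAN, PRIOR_VAR)
    return (log_post_cand - log_post_cur) + (log_q_cur - log_q_cand)


def simplified_log_ratio(current, candidate):
    """Same ratio after cancelling prior against proposal."""
    return log_lik(candidate) - log_lik(current)


current, candidate = 0.05, 0.40
gap = abs(full_log_ratio(current, candidate) - simplified_log_ratio(current, candidate))
```

The two ratios differ only by floating-point rounding, so in practice only the likelihood terms need to be evaluated at each βj update.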

For the generation of δ, we need to be aware of the constraint δ>0. We consider the proposal distribution Gamma(0.25, 1.0). This is based on a subjective estimate δ=0.25 and a sufficiently large variance to capture the true value of δ.
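A quick check of the Gamma(0.25, 1.0) proposal, assuming the first parameter is the shape: because the second parameter equals 1, the shape-rate and shape-scale conventions coincide, so the proposal has mean 0.25 and variance 0.25 under either reading. The sketch uses Python's standard library, whose `random.gammavariate` takes a shape and a scale; the variable names are illustrative.

```python
import random

SHAPE = 0.25   # first Gamma parameter (from the text), read as the shape
SCALE = 1.0    # second Gamma parameter; rate = 1 / scale = 1.0 as well

# Moments of a Gamma(shape, scale) distribution
proposal_mean = SHAPE * SCALE
proposal_var = SHAPE * SCALE ** 2

# Candidate values for delta drawn from the proposal
rng = random.Random(0)
candidates = [rng.gammavariate(SHAPE, SCALE) for _ in range(10000)]
sample_mean = sum(candidates) / len(candidates)
all_positive = all(c > 0.0 for c in candidates)
```

The Gamma support is the positive half-line, so every candidate respects the constraint δ>0 automatically, and a standard deviation of 0.5 is twice the subjective estimate of 0.25.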

1. Reaching a par 3/4/5 hole in regulation indicates that a golfer has landed on the green in 1/2/3 shots, respectively.

References

Banerjee, S., B. P. Carlin, and A. E. Gelfand. 2004. Hierarchical Modeling and Analysis for Spatial Data. Boca Raton, Florida: Chapman and Hall/CRC.

Beaudoin, D. and T. B. Swartz. 2003. “The Best Batsmen and Bowlers in One-Day Cricket.” South African Statistical Journal 37(2): 203–222.

Besag, J., J. York, and A. Mollie. 1991. “Bayesian Image Restoration, with Two Applications in Spatial Statistics (with Discussion).” Annals of the Institute of Statistical Mathematics 43(1): 1–59.

Broadie, M. 2008. “Assessing Golfer Performance using Golfmetrics.” Pp. 253–262 in Science and Golf V: Proceedings on the 2008 World Scientific Congress of Golf, edited by D. Crews and R. Lutz. Mesa, Arizona: Energy in Motion Inc.

Cressie, N. A. C. 1993. Statistics for Spatial Data, Revised Edition. New York: John Wiley and Sons.

Diggle, P. J., J. A. Tawn, and R. A. Moyeed. 1998. “Model-Based Geostatistics (with Discussion).” Journal of the Royal Statistical Society, Series C 47(3): 299–350.

Fearing, D., J. Acimovic, and S. C. Graves. 2011. “How to Catch a Tiger: Understanding Putting Performance on the PGA Tour.” Journal of Quantitative Analysis in Sports 7(1): Article 5.

Gilks, W. R., S. Richardson, and D. J. Spiegelhalter, (editors). 1996. Markov Chain Monte Carlo in Practice. London: Chapman and Hall.

Jensen, S. T., K. E. Shirley, and A. J. Wyner. 2009. “Bayesball: A Bayesian Hierarchical Model for Evaluating Fielding in Major League Baseball.” The Annals of Applied Statistics 3(2): 491–520.

Oliver, D. 2004. Basketball on Paper: Rules and Tools for Performance Analysis. Washington: Potomac Books.

Reich, B. J., J. S. Hodges, B. P. Carlin, and A. M. Reich. 2006. “A Spatial Analysis of Basketball Shot Chart Data.” The American Statistician 60(1): 3–12.

Schuckers, M. E. 2011. “DIGR: A Defense Independent Rating of NHL Goaltenders using Spatially Smoothed Save Percentage Maps.” MIT Sloan Sports Analytics Conference, March 4–5, 2011, Boston, MA.

Wilson, M. 2012. “Moneyball 2.0: How Missile Tracking Cameras are Remaking the NBA.”

Woolner, K. 2002. “Understanding and Measuring Replacement Level.” Pp. 55–66 in Baseball Prospectus 2002, edited by J. Sheehan. Dulles, Virginia: Brassey’s Inc.

Published Online: 2013-07-23
Published in Print: 2013-09-01

©2013 by Walter de Gruyter Berlin Boston