# Using a Markov decision process to model the value of the sacrifice bunt

Nobuyoshi Hirotsu and J. Eric Bickel

## Abstract

We use a Markov decision process to model the value of the sacrifice bunt. Specifically, we consider a nine-inning baseball game with non-identical batters and compute the degree to which sacrificing increases the probability of winning the game. We populate our model using data covering the National League of Major League Baseball, and demonstrate the importance of using the probability of winning the game when analyzing the value of the sacrifice bunt. We show how and why the criterion of maximizing the probability of winning is superior to that of maximizing the expected number of runs scored or the probability of scoring at least one run in the half-inning. Our model allows us to investigate situations that earlier models could not address, and we find that the sacrifice bunt is more beneficial than previously thought. We also discuss the effect sizes of individual sacrifice bunts and the effect of two model simplifications: the runner advancement model and the omission of double plays.


Funding statement: Grants-in-Aid for Scientific Research (C) of Japan, Grant Number: 26350434.

## Appendix A. Formulation for maximizing the expected number of runs scored in a game

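In Equation (1), $v$ is a 1944 × 1 vector made up of nine 216 × 1 vectors $v_n$ ($n = 1, \ldots, 9$; 216 = 1944/9), each entry of which represents the ENRS in the remainder of the game from the current state when the $n$-th batter is coming up to bat. Likewise, $r$ is a 1944 × 1 vector comprised of nine 216 × 1 vectors $r_n$, each entry of which represents the immediate expected reward generated by the $n$-th batter. Using $v_n$ and $r_n$, Equation (1) is written as

$$
\begin{pmatrix} v_1 \\ v_2 \\ \vdots \\ v_8 \\ v_9 \end{pmatrix}
=
\begin{pmatrix}
0 & P_1 & & & \\
 & 0 & P_2 & & \\
 & & \ddots & \ddots & \\
 & & & 0 & P_8 \\
P_9 & & & & 0
\end{pmatrix}
\begin{pmatrix} v_1 \\ v_2 \\ \vdots \\ v_8 \\ v_9 \end{pmatrix}
+
\begin{pmatrix} r_1 \\ r_2 \\ \vdots \\ r_8 \\ r_9 \end{pmatrix},
\tag{A1}
$$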

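where $P_n$ is a 216 × 216 matrix consisting of the transition probabilities for the $n$-th batter. The structure of Equation (A1) reflects the batting order: after the plate appearance of the $n$-th batter, the state moves to the $(n+1)$-th batter, and from the ninth batter back to the first. As the 216 states consist of the 24 (= 216/9) base-out states in each of the 9 innings, $P_n$ has the following structure, with block rows and columns indexed by innings 1, 2, …, 9:

$$
P_n =
\begin{pmatrix}
Q_n & Q_{0n} & & & \\
 & Q_n & Q_{0n} & & \\
 & & \ddots & \ddots & \\
 & & & Q_n & Q_{0n} \\
 & & & & Q_n
\end{pmatrix},
\tag{A2}
$$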

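where $Q_n$ is a 24 × 24 matrix representing the transitions within a half-inning on a plate appearance of the $n$-th batter, and $Q_{0n}$ is a 24 × 24 matrix representing the transition to the next inning when an out by the $n$-th batter ends the half-inning. $Q_n$ and $Q_{0n}$ are given by

$$
Q_n =
\begin{pmatrix}
A_n & B_n & 0 \\
0 & A_n & B_n \\
0 & 0 & A_n
\end{pmatrix}
\quad \text{and} \quad
Q_{0n} =
\begin{pmatrix}
0 & 0 & 0 \\
0 & 0 & 0 \\
F_n & 0 & 0
\end{pmatrix},
\tag{A3}
$$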

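where $A_n$, $B_n$, and $F_n$ are 8 × 8 matrices. Here, we number the states within a half-inning so that states 1 to 8 have no outs, states 9 to 16 have one out, and states 17 to 24 have two outs; states 1, 9 and 17 have no runners, states 2, 10 and 18 have a runner on first base, and so on. Each of these state numbers corresponds to a row and column number in $Q_n$ and $Q_{0n}$. These matrices are made up of $P_{Sn}$, $P_{Dn}$, $P_{Tn}$, $P_{Hn}$, $P_{Wn}$ and $P_{On}$, the probabilities of a single, double, triple, home run, walk and out for the $n$-th batter. For the structure of these matrices, see Bukiet et al. (1997), Hirotsu and Wright (2003, 2004, 2005) or Hirotsu and Bickel (2016). $r_n$ in Equation (A1) is made up of 27 copies of the 8 × 1 vector $d_n$:

$$
r_n = \bigl(\underbrace{d_n\ d_n\ d_n}_{\text{1st inning}}\ \underbrace{d_n\ d_n\ d_n}_{\text{2nd inning}}\ \cdots\ \underbrace{d_n\ d_n\ d_n}_{\text{9th inning}}\bigr)^T,
\quad \text{where} \quad
d_n =
\begin{pmatrix}
P_{Hn} \\
2P_{Hn} + P_{Tn} \\
2P_{Hn} + P_{Tn} + P_{Dn} + P_{Sn} \\
2P_{Hn} + P_{Tn} + P_{Dn} + P_{Sn} \\
3P_{Hn} + 2P_{Tn} + P_{Dn} + P_{Sn} \\
3P_{Hn} + 2P_{Tn} + P_{Dn} + P_{Sn} \\
3P_{Hn} + 2P_{Tn} + 2P_{Dn} + 2P_{Sn} \\
4P_{Hn} + 3P_{Tn} + 2P_{Dn} + 2P_{Sn} + P_{Wn}
\end{pmatrix}.
\tag{A4}
$$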
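As a rough illustration of how (A1), (A2) and (A4) fit together, the following Python sketch builds the vector dn from a batter's event probabilities, stacks it into rn, and assembles Pn and the cyclic block matrix of (A1) with NumPy. The function names are hypothetical, and the blocks An, Bn and Fn themselves are not reproduced here because their entries are given only in the cited papers.

```python
import numpy as np

def reward_vector(p_s, p_d, p_t, p_h, p_w):
    """d_n of (A4): expected runs on one plate appearance for the 8 base states
    (none, 1st, 2nd, 3rd, 1st&2nd, 1st&3rd, 2nd&3rd, loaded)."""
    return np.array([
        p_h,
        2 * p_h + p_t,
        2 * p_h + p_t + p_d + p_s,
        2 * p_h + p_t + p_d + p_s,
        3 * p_h + 2 * p_t + p_d + p_s,
        3 * p_h + 2 * p_t + p_d + p_s,
        3 * p_h + 2 * p_t + 2 * p_d + 2 * p_s,
        4 * p_h + 3 * p_t + 2 * p_d + 2 * p_s + p_w,
    ])

def stack_rewards(d_n, innings=9, outs=3):
    """r_n of (A4): d_n repeated for every out count in every inning (216 entries)."""
    return np.tile(d_n, innings * outs)

def build_inning_matrix(q_n, q_0n, innings=9):
    """P_n of (A2): Q_n on the block diagonal, Q_0n on the block superdiagonal."""
    return np.kron(np.eye(innings), q_n) + np.kron(np.eye(innings, k=1), q_0n)

def assemble_cyclic(p_blocks):
    """Block matrix of (A1): P_n sits in block row n and block column n + 1,
    with the last batter wrapping around to the first."""
    n = len(p_blocks)
    size = p_blocks[0].shape[0]
    big = np.zeros((n * size, n * size))
    for i, p in enumerate(p_blocks):
        j = (i + 1) % n
        big[i * size:(i + 1) * size, j * size:(j + 1) * size] = p
    return big
```

Given the nine real 216 × 216 blocks and the stacked reward vector r, the linear system v = Pv + r could then be solved with `np.linalg.solve(np.eye(1944) - big, r)`, provided that I − P is nonsingular.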

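In Equation (2), $P_b$ has the same structure as $P$. The blocks of $P_b$ that correspond to the $Q_n$ blocks of $P$ are denoted by $Q_{bn}$. $Q_{bn}$ and the blocks $r_{bn}$ of $r_b$ are given by

$$
Q_{bn} =
\begin{pmatrix}
0 & B_{bn} & 0 \\
0 & 0 & B_{bn} \\
0 & 0 & 0
\end{pmatrix}
\quad \text{and} \quad
r_{bn} = \bigl(\underbrace{d_{bn}\ d_{bn}\ 0}_{\text{1st inning}}\ \underbrace{d_{bn}\ d_{bn}\ 0}_{\text{2nd inning}}\ \cdots\ \underbrace{d_{bn}\ d_{bn}\ 0}_{\text{9th inning}}\bigr)^T,
\tag{A5}
$$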

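where $B_{bn}$ and $d_{bn}$ are

$$
B_{bn} =
\begin{pmatrix}
0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 \\
0 & 1 - P_{bn} & P_{bn} & 0 & 0 & 0 & 0 & 0 \\
0 & 0 & 1 - P_{bn} & P_{bn} & 0 & 0 & 0 & 0 \\
P_{bn} & 1 - P_{bn} & 0 & 0 & 0 & 0 & 0 & 0 \\
0 & 0 & 0 & 0 & 1 - P_{bn} & 0 & P_{bn} & 0 \\
0 & 0 & P_{bn} & 0 & 1 - P_{bn} & 0 & 0 & 0 \\
0 & 0 & 0 & P_{bn} & 0 & 1 - P_{bn} & 0 & 0 \\
0 & 0 & 0 & 0 & 0 & 0 & P_{bn} & 1 - P_{bn}
\end{pmatrix}
\quad \text{and} \quad
d_{bn} =
\begin{pmatrix}
0 \\ 0 \\ 0 \\ P_{bn} \\ 0 \\ P_{bn} \\ P_{bn} \\ P_{bn}
\end{pmatrix}.
\tag{A6}
$$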

In $r_{bn}$ in (A5), each 0 is an 8 × 1 zero vector corresponding to the situations with two outs, in which no sacrifice bunt is attempted. In (A6), the non-zero entries of $d_{bn}$ arise because, with a runner on third base, a squeeze bunt is expected to score $1 \times P_{bn}$ runs.
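The bunt blocks in (A6) can be written down directly, as in the short sketch below. It reads the failure entries as 1 − Pbn and uses 0-based indices for the eight base states; the function name `bunt_blocks` is an assumption made for illustration.

```python
import numpy as np

def bunt_blocks(p_b):
    """Return (B_bn, d_bn) of (A6) for a bunt success probability p_b.

    Rows and columns 0..7 are the base states
    none, 1st, 2nd, 3rd, 1st&2nd, 1st&3rd, 2nd&3rd, loaded.
    """
    fail = 1.0 - p_b
    B = np.zeros((8, 8))
    B[1, 1], B[1, 2] = fail, p_b   # runner on 1st: stays, or advances to 2nd
    B[2, 2], B[2, 3] = fail, p_b   # runner on 2nd: stays, or advances to 3rd
    B[3, 0], B[3, 1] = p_b, fail   # runner on 3rd: squeeze scores him, or batter on 1st
    B[4, 4], B[4, 6] = fail, p_b   # 1st & 2nd -> 2nd & 3rd on success
    B[5, 2], B[5, 4] = p_b, fail   # 1st & 3rd -> run scores, runner on 2nd
    B[6, 3], B[6, 5] = p_b, fail   # 2nd & 3rd -> run scores, runner on 3rd
    B[7, 6], B[7, 7] = p_b, fail   # loaded -> run scores, 2nd & 3rd
    # Expected runs on the bunt: p_b whenever a runner is on third.
    d = np.array([0.0, 0.0, 0.0, p_b, 0.0, p_b, p_b, p_b])
    return B, d
```

Every row with at least one runner sums to one, which is a quick way to check the transcription of the matrix; the first row is zero because no bunt is attempted with the bases empty.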

## Appendix B. Formulation for maximizing the probability of scoring in the inning

In the formulation for maximizing the probability of scoring (PS) at least one run in the inning by using a sacrifice bunt, the current state is identified by the combination of the following factors:

1. Out count (3 possibilities);

2. Occupation of the bases (8 possibilities);

3. One of the 9 batters coming up to bat (9 possibilities).

These factors produce 216 (= 3 × 8 × 9) different states in the inning. The following equation determines the PS in the inning from each state:

$$
p_s = P_s\, p_s + q.
\tag{A7}
$$

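In Equation (A7), $p_s$ is a 216 × 1 vector made up of nine 24 × 1 vectors $p_{sn}$ ($n = 1, \ldots, 9$), each entry of which represents the PS in the remainder of the inning from the current state when the $n$-th batter is coming up to bat. $P_s$ is a 216 × 216 matrix representing the transitions between the 216 states by hitting in the inning. $q$ is a 216 × 1 vector comprised of nine 24 × 1 vectors $q_n$, each entry of which represents the probability that the $n$-th batter's plate appearance scores immediately. Using $p_{sn}$ and $q_n$, Equation (A7) is written as

$$
\begin{pmatrix} p_{s1} \\ p_{s2} \\ \vdots \\ p_{s8} \\ p_{s9} \end{pmatrix}
=
\begin{pmatrix}
0 & Q^{0}_{1} & & & \\
 & 0 & Q^{0}_{2} & & \\
 & & \ddots & \ddots & \\
 & & & 0 & Q^{0}_{8} \\
Q^{0}_{9} & & & & 0
\end{pmatrix}
\begin{pmatrix} p_{s1} \\ p_{s2} \\ \vdots \\ p_{s8} \\ p_{s9} \end{pmatrix}
+
\begin{pmatrix} q_1 \\ q_2 \\ \vdots \\ q_8 \\ q_9 \end{pmatrix},
\tag{A8}
$$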

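where the entries of the $Q^{0}_{n}$ blocks correspond to the transitions in which no run scores in a single plate appearance. The probabilities of the other transitions, in which at least one run scores, are summed into the entries of $q_n$. That is, $Q^{0}_{n}$ and $q_n$ are given by

$$
Q^{0}_{n} =
\begin{pmatrix}
A^{0}_{n} & B_n & 0 \\
0 & A^{0}_{n} & B_n \\
0 & 0 & A^{0}_{n}
\end{pmatrix}
\quad \text{and} \quad
q_n =
\begin{pmatrix} q'_n \\ q'_n \\ q'_n \end{pmatrix},
\tag{A9}
$$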

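where

$$
A^{0}_{n} =
\begin{pmatrix}
0 & P_{Sn} + P_{Wn} & P_{Dn} & P_{Tn} & 0 & 0 & 0 & 0 \\
0 & 0 & 0 & 0 & P_{Sn} + P_{Wn} & 0 & P_{Dn} & 0 \\
0 & 0 & 0 & 0 & P_{Wn} & 0 & 0 & 0 \\
0 & 0 & 0 & 0 & 0 & P_{Wn} & 0 & 0 \\
0 & 0 & 0 & 0 & 0 & 0 & 0 & P_{Wn} \\
0 & 0 & 0 & 0 & 0 & 0 & 0 & P_{Wn} \\
0 & 0 & 0 & 0 & 0 & 0 & 0 & P_{Wn} \\
0 & 0 & 0 & 0 & 0 & 0 & 0 & 0
\end{pmatrix}
\quad \text{and} \quad
q'_n =
\begin{pmatrix}
P_{Hn} \\
P_{Tn} + P_{Hn} \\
P_{Sn} + P_{Dn} + P_{Tn} + P_{Hn} \\
P_{Sn} + P_{Dn} + P_{Tn} + P_{Hn} \\
P_{Sn} + P_{Dn} + P_{Tn} + P_{Hn} \\
P_{Sn} + P_{Dn} + P_{Tn} + P_{Hn} \\
P_{Sn} + P_{Dn} + P_{Tn} + P_{Hn} \\
P_{Sn} + P_{Dn} + P_{Tn} + P_{Hn} + P_{Wn}
\end{pmatrix}.
\tag{A10}
$$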
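The two pieces in (A10) partition each plate appearance into "no run, no out" and "at least one run". A minimal sketch, with a hypothetical function name and made-up probabilities in the check, is:

```python
import numpy as np

def no_score_blocks(p_s, p_d, p_t, p_h, p_w):
    """Return (A0, q_prime) of (A10) for one batter.

    A0[i, j] is the probability of moving from base state i to base state j
    without an out and without a run; q_prime[i] is the probability that at
    least one run scores from base state i. States 0..7 are
    none, 1st, 2nd, 3rd, 1st&2nd, 1st&3rd, 2nd&3rd, loaded.
    """
    A0 = np.zeros((8, 8))
    A0[0, 1] = p_s + p_w   # bases empty: single or walk puts a runner on 1st
    A0[0, 2] = p_d
    A0[0, 3] = p_t
    A0[1, 4] = p_s + p_w   # runner on 1st: single or walk gives 1st & 2nd
    A0[1, 6] = p_d         # double gives 2nd & 3rd
    A0[2, 4] = p_w         # from here on only a walk avoids both a run and an out
    A0[3, 5] = p_w
    A0[4, 7] = p_w
    A0[5, 7] = p_w
    A0[6, 7] = p_w
    any_hit = p_s + p_d + p_t + p_h
    q_prime = np.array([p_h, p_t + p_h, any_hit, any_hit,
                        any_hit, any_hit, any_hit, any_hit + p_w])
    return A0, q_prime
```

For every base state, the row sum of A0 plus the entry of q′n plus the out probability should equal one, which gives a simple consistency check on the reconstruction.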

We compare the PS between the cases to hit and to bunt, and take the maximum of them in each state:

$$
p_s = \max
\begin{cases}
P_s\, p_s + q & : \text{Hit} \\
P_{sb}\, p_s + q_b & : \text{Bunt}
\end{cases}
\tag{A11}
$$

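where $P_{sb}$ is a 216 × 216 matrix representing the transitions when a sacrifice bunt is attempted. It is obtained by substituting the following $Q^{0}_{bn}$ for $Q^{0}_{n}$ in (A9):

$$
Q^{0}_{bn} =
\begin{pmatrix}
0 & B^{0}_{bn} & 0 \\
0 & 0 & B^{0}_{bn} \\
0 & 0 & 0
\end{pmatrix},
\tag{A12}
$$

where $B^{0}_{bn}$ is

$$
B^{0}_{bn} =
\begin{pmatrix}
0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 \\
0 & 1 - P_{bn} & P_{bn} & 0 & 0 & 0 & 0 & 0 \\
0 & 0 & 1 - P_{bn} & P_{bn} & 0 & 0 & 0 & 0 \\
0 & 1 - P_{bn} & 0 & 0 & 0 & 0 & 0 & 0 \\
0 & 0 & 0 & 0 & 1 - P_{bn} & 0 & P_{bn} & 0 \\
0 & 0 & 0 & 0 & 1 - P_{bn} & 0 & 0 & 0 \\
0 & 0 & 0 & 0 & 0 & 1 - P_{bn} & 0 & 0 \\
0 & 0 & 0 & 0 & 0 & 0 & 0 & 1 - P_{bn}
\end{pmatrix}.
\tag{A13}
$$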

$q_b$ is a 216 × 1 vector, each entry of which represents the probability of scoring immediately on the sacrifice bunt. It is obtained by substituting $d_{bn}$ in (A6) for $q'_n$ in (A10).
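One standard way to solve a fixed-point equation of the form (A11) is value iteration: apply the hit and bunt updates repeatedly and keep the larger value in each state. The sketch below is not taken from the source, which does not state here which solution method was used; the function name and the two-state toy inputs in the usage note are assumptions.

```python
import numpy as np

def solve_max_scoring(P_hit, q_hit, P_bunt, q_bunt, tol=1e-12, max_iter=10_000):
    """Iterate ps <- max(P_hit ps + q_hit, P_bunt ps + q_bunt) until it settles.

    Returns the scoring probabilities and, per state, which action attains
    the maximum ("hit" is kept on ties).
    """
    ps = np.zeros(len(q_hit))
    for _ in range(max_iter):
        hit = P_hit @ ps + q_hit
        bunt = P_bunt @ ps + q_bunt
        new_ps = np.maximum(hit, bunt)
        converged = np.max(np.abs(new_ps - ps)) < tol
        ps = new_ps
        if converged:
            break
    policy = np.where(bunt > hit, "bunt", "hit")
    return ps, policy
```

As a toy usage example with two made-up states, take `P_hit = [[0, 0.2], [0, 0]]`, `q_hit = [0.3, 0.6]`, `P_bunt = [[0, 0.9], [0, 0]]` and `q_bunt = [0, 0]`. Hitting from the first state gives 0.3 + 0.2 × 0.6 = 0.42, while bunting gives 0.9 × 0.6 = 0.54, so the bunt is chosen there.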

## Appendix C. Formulation for maximizing the probability of winning a game

In Equation (4), $w$ is made up of 81 vectors $w_{nm}$ of size 17,712 × 1, each of which holds the probabilities of the home team winning in the remainder of the game from a current state in which the $n$-th batter ($n = 1, \ldots, 9$) of the home team and the $m$-th batter ($m = 1, \ldots, 9$) of the visiting team are coming up, as shown in (A14). In Equation (4), $p_{\mathrm{end}}$ is a 1,434,672 × 1 vector representing the transitions that lead to the end of the game from the states with two outs in the bottom of the ninth inning. It is also made up of 81 vectors $p_{\mathrm{end},n}$ ($n = 1, \ldots, 9$) of size 17,712 × 1, as shown in (A14).

(A14)

In Equation (4), $P_w$ is a 1,434,672 × 1,434,672 matrix whose structure is also shown in (A14). In (A14), $P_{nH}$ and $P_{mV}$ are 17,712 × 17,712 matrices (17,712 = 1,434,672/81) that contain the transition probabilities for the batting of the $n$-th batter ($n = 1, \ldots, 9$) of the home team and of the $m$-th batter ($m = 1, \ldots, 9$) of the visiting team, respectively.

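The transition matrix $P_{nH}$ can be decomposed into $P^{0}_{nH}$, $P^{1}_{nH}$, $P^{2}_{nH}$, $P^{3}_{nH}$ and $P^{4}_{nH}$:

$$
P_{nH} = P^{0}_{nH} + P^{1}_{nH} + P^{2}_{nH} + P^{3}_{nH} + P^{4}_{nH}.
\tag{A15}
$$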

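Under $P^{0}_{nH}$, $P^{1}_{nH}$, $P^{2}_{nH}$, $P^{3}_{nH}$ and $P^{4}_{nH}$, the batting of the $n$-th batter moves any state in which the home team leads by $r$ runs to a next state in which it leads by $r$, $r + 1$, $r + 2$, $r + 3$ or $r + 4$ runs, respectively. Here, $r$ is in the range [−20, 20] (i.e. 41 possibilities). $P^{0}_{nH}$ consists of blocks $Q^{0}_{n}$ and $Q^{0}_{0n}$, with block rows and columns ordered by half-inning as 1T, 1B, 2T, 2B, …, 9T, 9B (T: top, B: bottom). Because the home team bats in the bottom halves, the block rows for the top halves are zero:

$$
P^{0}_{nH} =
\begin{pmatrix}
0 & 0 & & & & \\
 & Q^{0}_{n} & Q^{0}_{0n} & & & \\
 & & \ddots & \ddots & & \\
 & & & & 0 & 0 \\
 & & & & & Q^{0}_{n}
\end{pmatrix}.
\tag{A16}
$$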

Here, the $Q^{0}_{n}$ blocks are those defined in (A9). The transition matrices $P^{0}_{mV}$, $P^{1}_{mV}$, $P^{2}_{mV}$, $P^{3}_{mV}$ and $P^{4}_{mV}$ in (A14) for the batting of a visiting team player are defined in a similar manner.
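The dimensions quoted in this appendix can be checked with a few lines of arithmetic. The factorization 17,712 = 41 × 18 × 24 (run differences × half-innings × base-out states) is an inference from the ranges given above rather than something stated explicitly, and the flat ordering used in `state_index` is purely hypothetical.

```python
RUN_DIFFS = 41      # home-team lead r in [-20, 20]
HALF_INNINGS = 18   # 9 innings x (top, bottom)
BASE_OUT = 24       # 3 out counts x 8 base occupations
BATTER_PAIRS = 81   # 9 home batters x 9 visiting batters due up

def block_size():
    """Size of one w_nm block (assumed factorization)."""
    return RUN_DIFFS * HALF_INNINGS * BASE_OUT

def total_states():
    """Size of the full vector w."""
    return block_size() * BATTER_PAIRS

def state_index(n, m, r, half_inning, base_out):
    """Flat 0-based index of a state under a hypothetical ordering.

    n, m: home and visiting batters due up (1..9); r: home lead (-20..20);
    half_inning: 0..17 (1T, 1B, ..., 9B); base_out: 0..23.
    """
    pair = (n - 1) * 9 + (m - 1)
    within = ((r + 20) * HALF_INNINGS + half_inning) * BASE_OUT + base_out
    return pair * block_size() + within
```

With these constants, `block_size()` gives 17,712 and `total_states()` gives 1,434,672, matching the sizes stated for the blocks and for w.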

In Equation (5), $P_{wb}$ is a 1,434,672 × 1,434,672 matrix representing the transitions by a sacrifice bunt. $P_{wb}$ has the same structure as $P_w$ in Equation (4). An attempted sacrifice bunt produces either no runs or one run. The blocks in $P_{wb}$ corresponding to no runs are given by $Q^{0}_{bn}$ in (A12), and the blocks corresponding to one run are given by the following $Q^{1}_{bn}$:

$$
Q^{1}_{bn} =
\begin{pmatrix}
0 & B^{1}_{bn} & 0 \\
0 & 0 & B^{1}_{bn} \\
0 & 0 & 0
\end{pmatrix},
\tag{A17}
$$

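where $B^{1}_{bn}$ is

$$
B^{1}_{bn} =
\begin{pmatrix}
0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 \\
0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 \\
0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 \\
P_{bn} & 0 & 0 & 0 & 0 & 0 & 0 & 0 \\
0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 \\
0 & 0 & P_{bn} & 0 & 0 & 0 & 0 & 0 \\
0 & 0 & 0 & P_{bn} & 0 & 0 & 0 & 0 \\
0 & 0 & 0 & 0 & 0 & 0 & P_{bn} & 0
\end{pmatrix}.
\tag{A18}
$$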

All other entries in $P_{wb}$ are zeros. Note that the non-zero entries of $Q^{1}_{bn}$ in (A17) represent the transitions in which a successful squeeze bunt scores a run.
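The no-run block of (A13) and the one-run block of (A18) should together recover the full bunt matrix of (A6), and the row sums of the one-run block should equal dbn. The sketch below encodes both blocks so that this can be checked numerically; the function name is hypothetical and the failure entries are read as 1 − Pbn.

```python
import numpy as np

def split_bunt_blocks(p_b):
    """Return (B0, B1): bunt transitions that score no run and exactly one run.

    Rows and columns 0..7 are the base states
    none, 1st, 2nd, 3rd, 1st&2nd, 1st&3rd, 2nd&3rd, loaded.
    """
    fail = 1.0 - p_b
    B0 = np.zeros((8, 8))
    # Failed bunts: no run scores.
    for row, col in [(1, 1), (2, 2), (3, 1), (4, 4), (5, 4), (6, 5), (7, 7)]:
        B0[row, col] = fail
    # Successful bunts with no runner on third: runners advance, no run.
    for row, col in [(1, 2), (2, 3), (4, 6)]:
        B0[row, col] = p_b
    # Successful squeeze bunts: the runner on third scores.
    B1 = np.zeros((8, 8))
    for row, col in [(3, 0), (5, 2), (6, 3), (7, 6)]:
        B1[row, col] = p_b
    return B0, B1
```

Summing the two blocks gives a matrix whose rows for the seven states with runners each add up to one, which is the property expected of the full bunt matrix.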

## Appendix D. Analysis for the effect of model simplifications: runner advancement and double plays

To see the effect of the simplified runner advancement, we conduct the same analysis using two other runner advancement models: an aggressive model and a conservative model. Under the aggressive model, a runner on first reaches third on a single and scores on a double. Under the conservative model, a runner on second only reaches third on a single. The results for a success probability of 0.8 are shown in Table A1(A) and (B). As these tables show, the difference in ENRS between the two models is more than one run (4.063 versus 2.916).

Table A1: Batting position for the home team to attempt a sacrifice bunt with a runner on first and no outs, when the runner advancement model is modified or a double play is included (sacrifice bunt success probability = 0.8).

InningMaximizing ENRS (ENRS: 4.063)Maximizing PS (PS: 0.287)Maximizing PW (PW: 0.5008)
−4−3−2−101234
11,2,5,6+,9++ – –9999
21,2,5,6+,9++ – – –9999
31,2,5,6+,9++ – – –99999
41,2,5,6+,9++ – – –99999
51,2,5,6+,9++ – – –99999
61,2,5,6+,9++ – – –99999
71,2,5,6+,9++ – – –99+9999
81,2,5,6+,9++ – – –99+9999
91,2,5,6+,9++ – – –91,2,5,6,9+XXXX
InningMaximizing ENRS (ENRS: 2.916)Maximizing PS (PS: 0.210)Maximizing PW (PW: 0.5001)
−4−3−2−101234
19+
29+
39+
49+
59+
69+
79+
89+9
99+9XXXX
InningMaximizing ENRS (ENRS: 3.662)Maximizing PS (PS: 0.271)Maximizing PW (PW: 0.5032)
−4−3−2−101234
(C) Including a double play
191+,2+,3+,4+,5+,6++,7+,8+,9++ – –999999
291+,2+,3+,4+,5+,6++,7+,8+,9++ – – –999999
391+,2+,3+,4+,5+,6++,7+,8+,9++ – – –999999
491+,2+,3+,4+,5+,6++,7+,8+,9++ – – –999999
591+,2+,3+,4+,5+,6++,7+,8+,9++ – – –99+9996,9
691+,2+,3+,4+,5+,6++,7+,8+,9++ – – –9+9+9996,9
71+,2+,3+,4+,5+,6++,7+,8+,9++ – – –9+6,9+6,96,96,96,9
891+,2+,3+,4+,5+,6++,7+,8+,9++ – – –9+1,2,5,6+,8,9+6,96,96,91,2,5,6,9
91+,2+,3+,4+,5+,6++,7+,8+,9++ – – –9+1+,2+,3+,4+,5+,6+,7+,8+,9++XXXX
Note: The values given in parentheses in each table are the ENRS and PW at the beginning of the game, or the PS at the beginning of the half-inning. + and ++ indicate that sacrificing increases the PS or the PW by more than 0.01 and 0.05, respectively. Cells containing an “X” represent situations in which the home team has already won the game.

Comparing Tables A1(A), A1(B) and 4(B), the effect of switching to the aggressive model is small, but that of switching to the conservative model is not. This stems from the differing assumptions about whether a runner on second scores on a single: he scores under the D’Esopo & Lefkowitz model and the aggressive model, but not under the conservative model. Further, sacrificing is recommended in far fewer situations under the aggressive and conservative models, not only for maximizing the PW but also for maximizing the ENRS and the PS. Under the aggressive model, even a double scores a runner from first base, whereas under the conservative model a runner on second base does not score on a single. In both of these models, therefore, the need for a sacrifice bunt is reduced. As a result, changing our runner advancement model would reduce the number of situations in which sacrificing is recommended.

To see the effect of ignoring double plays, we add to our model a new transition from the state with a runner on first base and no outs to the state with no runners and two outs. We set the probability of a double play to 1.6%. Comparing Tables A1(C) and 4(B), once double plays are included the ninth batter is no longer recommended to bunt in the 7th and 9th innings when maximizing the ENRS, whereas the situations in which to bunt are unchanged when maximizing the PS and change little when maximizing the PW.
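The added double-play transition can be pictured as moving 1.6% of probability mass within one row of the 24-state transition matrix. The text does not say which existing transition gives up that mass, so the sketch below assumes, purely for illustration, that it is taken from the ordinary out (runner on first, one out); the function name and the toy row in the usage note are likewise hypothetical.

```python
import numpy as np

def add_double_play(row, p_dp=0.016, out_state=9, dp_state=16):
    """Return a copy of a 24-entry transition row with a double play added.

    row: transition probabilities from "runner on first, no outs".
    out_state: 0-based index of "runner on first, one out" (state 10);
               ASSUMED here to be the source of the double-play mass.
    dp_state: 0-based index of "no runners, two outs" (state 17).
    """
    new_row = np.array(row, dtype=float)
    if new_row[out_state] < p_dp:
        raise ValueError("not enough probability mass on the ordinary out")
    new_row[out_state] -= p_dp
    new_row[dp_state] += p_dp
    return new_row
```

For example, a toy row with an out probability of 0.68 would end up with 0.664 on the ordinary out and 0.016 on the double play, leaving the row sum unchanged.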

## References

Albert, J. and J. Bennett. 2003. Curve Ball: Baseball, Statistics, and the Role of Chance in the Game (Revised Edition). New York, NY: Copernicus Books.

Bickel, J. E. 2004. “Teaching Decision Making with Baseball Examples.” INFORMS Transactions on Education 5:2–9. doi:10.1287/ited.5.1.2

Bukiet, B., E. R. Harold, and J. L. Palacios. 1997. “A Markov Chain Approach to Baseball.” Operations Research 45:14–23. doi:10.1287/opre.45.1.14

Cook, E. 1964. Percentage Baseball. Cambridge, MA: MIT Press.

D’Esopo, D. A. and B. Lefkowitz. 1977. “The Distribution of Runs in the Game of Baseball.” Pp. 55–62 in Optimal Strategies in Sports, edited by S. P. Ladany and R. E. Machol. New York: North-Holland.

Dewan, J., D. Zminda, and J. Callis, eds. 2000. Baseball Scoreboard 2000. Morton Grove, IL: STATS, Inc.

Hirotsu, N. and J. E. Bickel. 2016. “Optimal Batting Orders in Run-Limit-Rule Baseball: A Markov Chain Approach.” IMA Journal of Management Mathematics 27:297–313. doi:10.1093/imaman/dpu024

Hirotsu, N. and M. Wright. 2003. “A Markov Chain Approach to Optimal Pinch Hitting Strategies in a Designated Hitter Rule Baseball Game.” Journal of the Operations Research Society of Japan 46:353–371. doi:10.15807/jorsj.46.353

Hirotsu, N. and M. Wright. 2004. “Modeling a Baseball Game to Optimize Pitcher Substitution Strategies Using Dynamic Programming.” Pp. 132–161 in Economics, Management, and Optimization in Sports, edited by M. P. Pardalos, S. Butenko, and J. Gil-Lafuente. Berlin: Springer-Verlag. doi:10.1007/978-3-540-24734-0_9

Hirotsu, N. and M. Wright. 2005. “Modelling a Baseball Game to Optimise Pitcher Substitution Strategies Incorporating Handedness of Players.” IMA Journal of Management Mathematics 16:179–194. doi:10.1093/imaman/dpi009

Howard, R. A. 1960. Dynamic Programming and Markov Processes. Cambridge, MA: MIT Press.

Howard, R. A. 1977. “Baseball a la Russe.” Pp. 89–93 in Optimal Strategies in Sports, edited by S. P. Ladany and R. E. Machol. New York: North-Holland.

Keri, J. 2006. Baseball Between the Numbers: Why Everything You Know about the Game Is Wrong. New York, NY: Perseus Publishing.

Lindsey, G. R. 1963. “An Investigation of Strategies in Baseball.” Operations Research 11:477–501. doi:10.1287/opre.11.4.477

Sokol, J. S. 2004. “An Intuitive Markov Chain Lesson From Baseball.” INFORMS Transactions on Education 5:47–55. doi:10.1287/ited.5.1.47

Stern, H. S. 1997. “A Statistician Reads the Sports Page: Baseball by the Numbers.” Chance 10:38–41. doi:10.1080/09332480.1997.10554797

Tango, T. M., M. G. Lichtman, and A. E. Dolphin. 2007. The Book: Playing the Percentages in Baseball. Dulles, VA: Potomac Books.

Thorn, J. and P. Palmer. 1985. The Hidden Game of Baseball. New York, NY: Doubleday.

Trueman, R. E. 1977. “Analysis of Baseball as a Markov Process.” Pp. 68–76 in Optimal Strategies in Sports, edited by S. P. Ladany and R. E. Machol. New York: North-Holland.

Winston, W. 2009. Mathletics. Princeton, NJ: Princeton University Press.

Published Online: 2019-06-19
Published in Print: 2019-10-25

©2019 Walter de Gruyter GmbH, Berlin/Boston