Christopher Keyes

Talking Baseball: A case study in win probability and leverage

(Originally published on April 25, 2022)

This is part of a series of posts on sabermetrics and the mathematics of baseball. You can find more here.

On Saturday (April 23rd, 2022) the Boston Red Sox had a bizarre contest against the Tampa Bay Rays. While the Rays threw a combined no-hitter over the first 9 innings, the Red Sox held the Rays to only two baserunners in that span, sending the game to extra innings tied 0-0. In the top of the 10th, with the designated runner on second base, the Red Sox scored two runs to go up 2-0. Here's the play-by-play for the bottom of the 10th:

Ji-Man Choi strikes out. 1 out.
Josh Lowe strikes out. 2 out.
Balk. Designated runner Randy Arozarena advances to 3rd.
Taylor Walls reaches on error (by second baseman Trevor Story). Arozarena scores. Rays trail 1-2.
Walls steals second base.
Kevin Kiermaier hits a home run. Walls scores. Rays win 3-2.

With the runner starting on second base, one swing of the bat could have tied the game at any moment. Still, especially once there were two outs, it felt like the Red Sox were solidly in control of the game, right up until Kiermaier's blast. But what does the data say? Can we measure the effect of the balk, error, and steal on the win probability, and does this match my intuition as a viewer?

Historical win probability

There are different ways to estimate win probability. One is simply to comb through historical data and check how often a team in a certain situation went on to win the game. Greg Stoll's Win Expectancy Finder lets us do just that. Between 2000 and 2019, there were 11 games in which the visiting team led by two runs in the bottom of the 10th inning with a runner on 2nd base and no outs. In 9 of those games the visiting team went on to win, so we can estimate that the Red Sox win probability at the start of the bottom of the 10th inning was about 9/11, or 82%. The table below shows how these historical probabilities changed throughout the inning, calculated the same way.

Score	Out	Runner(s)	Batter	Win Probability (Visiting Team)	Total Games (2000 - 2019)	Actual Outcome
V +2	0	020	Choi	82%	11	Strikeout
V +2	1	020	Lowe	83%	30	Strikeout
V +2	2	020	Walls	94%	54	Balk
V +2	2	003	Walls	100%	12	Reach on error
V +1	2	100	Kiermaier	90.5%	126	Stolen base
V +1	2	020	Kiermaier	91%	69	Home Run
V -1	2	000		0%	N/A	Rays win

Pre-pandemic, i.e. before the designated runner rule, a two-run lead in extras was indeed a lot safer. Using the 2000 - 2019 season data, 250 MLB games reached the bottom of the 10th with the visiting team ahead by two, no runners on base, and no outs. The visiting team went on to win 231 of them, or about 92.4%.
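The counting estimate used in this section is a single division. Here is a minimal sketch in Python using only the two pairs of counts quoted above; the function name is my own and is not part of the Win Expectancy Finder.

```python
def empirical_win_prob(wins: int, games: int) -> float:
    """Fraction of historical games in a given situation that the team won."""
    if games <= 0:
        raise ValueError("need at least one historical game")
    return wins / games


# Counts quoted in the text (2000 - 2019 seasons), visiting team up by two,
# bottom of the 10th, no outs.
with_runner = empirical_win_prob(9, 11)      # runner on 2nd base
bases_empty = empirical_win_prob(231, 250)   # no runners on base

print(f"runner on 2nd: {with_runner:.1%}")
print(f"bases empty:   {bases_empty:.1%}")
```

With only 11 games behind the first estimate, a single game going the other way would move it by about 9 percentage points, which is the small sample issue that motivates the next section.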

Win probability via run expectancy

We could also use expected run values to estimate win probability in each situation. This avoids issues with small sample sizes and allows more careful fine-tuning of the run environment, if needed. In the situation of this particular game, the visiting Red Sox were ahead by two, so three things could have happened:

Rays score 0-1 runs → Red Sox win,
Rays score 2 runs → go to 11th inning tied,
Rays score 3 or more runs → Rays win.

Let's make the heuristic assumption that if the game goes to an 11th inning, there is a 50-50 chance that either team wins. We certainly could do something more sophisticated here, but since the two teams went 9 innings tied 0-0, this heuristic seems reasonably fair to me. Now we can translate win probability into run scoring probabilities, \[\mathrm{Prob}(V \text{ win}) = \mathrm{Prob}(H \text{ score } \leq 1) + \frac{1}{2} \cdot \mathrm{Prob}(H \text{ score } 2) + 0 \cdot \mathrm{Prob}(H \text{ score } \geq 3).\] Here \(V\) stands for the visiting team and \(H\) stands for the home team. We can similarly write \[\mathrm{Prob}(V \text{ win}) = 1 - \mathrm{Prob}(H \text{ win}) = 1 - \frac{1}{2} \cdot \mathrm{Prob}(H \text{ score } 2) - \mathrm{Prob}(H \text{ score } \geq 3)\] which will be somewhat more convenient. Note also that these calculations are specific to the situation in which the visiting team is up by 2 runs, but an analogous calculation would work for other lead amounts!
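For reference, here is how that analogous calculation might look for a general lead. This is a sketch under the same 50-50 extra-inning heuristic, where \(k \geq 1\) (my own notation, introduced only for this aside) is the visiting team's lead entering the bottom of the inning: \[\mathrm{Prob}(V \text{ win}) = 1 - \frac{1}{2} \cdot \mathrm{Prob}(H \text{ score } k) - \mathrm{Prob}(H \text{ score } \geq k+1).\] Setting \(k = 2\) recovers the formula above, while \(k = 1\) covers a one-run lead, where a single run ties the game and two or more runs win it for the home team.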

To estimate the scoring probabilities, we once again turn to Greg Stoll's Win Expectancy Finder. This time, we look at more recent 2021 scoring data, which tells us that on average, with no outs and a runner on second base, the batting team scored no runs 38.4% of the time, one run 32.4% of the time, two runs 14.9% of the time, and three or more runs 14.4% of the time. Assuming an average performance by the Rays, the above formula suggests that \[\mathrm{Prob}(V \text{ win}) = 100\% - \frac{1}{2} \cdot 14.9\% - 14.4\% = 78.2\%.\] Below we repeat these calculations for the entire inning.

Score	Out	Runner(s)	Batter	Prob(Rays tie)	Prob(Rays go ahead)	Win Probability (Visiting Team)	Actual Outcome
V +2	0	020	Choi	14.9%	14.4%	78.2%	Strikeout
V +2	1	020	Lowe	10.4%	7.8%	87.1%	Strikeout
V +2	2	020	Walls	4.9%	2.6%	94.9%	Balk
V +2	2	003	Walls	5.3%	2.6%	94.8%	Reach on error
V +1	2	100	Kiermaier	5.6%	7.2%	90.0%	Stolen base
V +1	2	020	Kiermaier	14.2%	7.5%	85.4%	Home Run
V -1	2	000		N/A	N/A	0%	Rays win
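The win probability column in this table can be recomputed from the two probability columns. The following is a minimal sketch, assuming the 50-50 extra-inning heuristic from above; the function and variable names are my own, and the inputs are the rounded percentages from the table, so a result can differ from the table in the last digit.

```python
def visitor_win_prob(p_tie: float, p_ahead: float, p_win_if_tied: float = 0.5) -> float:
    """Visiting team's win probability in the bottom of an extra inning.

    p_tie:         probability the home team scores exactly enough runs to tie
    p_ahead:       probability the home team scores enough runs to take the lead
    p_win_if_tied: assumed chance the visitors win if the game continues
                   (0.5 is the heuristic used in this post)
    """
    return 1.0 - (1.0 - p_win_if_tied) * p_tie - p_ahead


# (situation, Prob(Rays tie), Prob(Rays go ahead)), copied from the table above.
situations = [
    ("Choi, 0 out, runner on 2nd", 0.149, 0.144),
    ("Lowe, 1 out, runner on 2nd", 0.104, 0.078),
    ("Walls, 2 out, runner on 2nd", 0.049, 0.026),
    ("Walls, 2 out, runner on 3rd", 0.053, 0.026),
    ("Kiermaier, 2 out, runner on 1st", 0.056, 0.072),
    ("Kiermaier, 2 out, runner on 2nd", 0.142, 0.075),
]

win_probs = [visitor_win_prob(p_tie, p_ahead) for _, p_tie, p_ahead in situations]

for (label, _, _), wp in zip(situations, win_probs):
    print(f"{label:34s} {wp:.1%}")
```

Because the function only takes the probability of a tie and the probability of the home team going ahead, the same code handles the rows where the lead is one run instead of two.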

While I was watching the game, I was surprised to hear commentator (and former Red Sox player) Kevin Youkilis say that the balk moving the runner over to third didn't really matter. But this model agrees with his assertion, as we see the win expectancy dropped by only a minuscule 0.1%. Why? The Rays needed to score two runs to tie, so it's the current batter, not the runner on second or third, that really counts. In fact, some teams are even balking intentionally with multi-run leads to let a runner advance from second to third, to combat potential sign stealing.

Also interesting is that the error by second baseman Trevor Story that allowed Walls to reach base only lowered the Red Sox win probability by about 5%, from 94.8% to 90.0%. Of course, had he made an accurate throw for the third out, the game would have ended with a Red Sox win, but with two outs they still were in a reasonably good position.

Note that we could refine this approach by using more specific values instead of league averages from 2021. For instance, we might take into account how likely the Rays are to score runs in these various situations, or conversely how well the Red Sox pitching and defense can prevent runs. We also might try to take into account the specific batters at the plate, for example by using a batter's wOBA value to predict the run expectancy as in this FanGraphs article.

Leverage

By both of our methods of estimating win probability, the Red Sox were favored to win this game throughout the bottom of the 10th. Anyone watching, however, realized that a home run could have tied the game at 2-2 in one swing. Going by MLB averages for the 2022 season (through 4/23/22), about 2.4% of plate appearances end in a home run, so this outcome is rather unlikely. For comparison, an out (of any type) occurs in about 70% of plate appearances, while a hit occurs in about 20%.

The leverage index is a metric that attempts to quantify how much the outcome of the game hinges on the outcome of the current play. To compute it, we must consider the possible outcomes of the current plate appearance and how they affect win probability.

Let's look at Kevin Kiermaier's at bat when he hit the game-winning home run. We determined above that the win probability for the Red Sox was 85.4% at the time. If instead he walked (which occurs about 9% of the time on average), there would have been runners at first and second with two outs, with the Rays still down by 1 run. Using our win probability calculations from earlier, we find the Red Sox win probability would have been 82.3%, a decrease in win probability of about 3.1%.

We can similarly compute the changes in win probability if Kiermaier makes an out, is hit by a pitch, or gets a hit. For the purposes of this exercise, let's assume that if he singles (or gets any other hit), the runner scores from second base, tying the game. Using MLB-wide data from the 2022 season so far, we estimate the likelihood of each outcome using the proportion of plate appearances that ended in each outcome, ignoring anything other than these seven (out, walk, hit by pitch, single, double, triple, and home run; walks and hit by pitches share a row in the table).

Outcome	Likelihood of outcome	Out (new situation)	Runner(s) (new situation)	Score (new situation)	Prob(Rays tie)	Prob(Rays win)	Win Probability (Visiting Team)	Change in WP
Out (any)	69.4%	3	000	V +1	0%	0%	100%	+14.6%
BB/HBP	10.1%	2	120	V +1	10.4%	12.48%	82.3%	-3.1%
1B	13.5%	2	100	V +0	87.2%	12.8%	43.6%	-41.8%
2B	4.2%	2	020	V +0	78.3%	21.7%	39.2%	-46.2%
3B	0.3%	2	003	V +0	73.0%	27.0%	36.5%	-48.9%
HR	2.4%	2	000	V -1	0%	100%	0%	-85.4%

These calculations show what we knew to be true: the game outcome depended greatly on Kiermaier's at bat. If we weight the changes in win probability by the estimated likelihood of their occurrence, we find that the (absolute value of the) expected change in win probability is 20.3%.
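Written out with the rounded entries from the table, the weighted sum looks like this. It is only a sketch of the arithmetic: \(P_i\) denotes the likelihood of the \(i\)-th outcome and \(\Delta_{WP,i}\) the corresponding change in win probability, and the last digit is sensitive to rounding in the table. \[\mathbb{E}(|\Delta_{WP}|) = \sum_{i} P_i \cdot |\Delta_{WP,i}| \approx 0.694(0.146) + 0.101(0.031) + 0.135(0.418) + 0.042(0.462) + 0.003(0.489) + 0.024(0.854) \approx 0.20.\] Notice that the routine out contributes the largest single term, about 0.10, simply because it is so likely, while the home run contributes only about 0.02 despite being the largest swing.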

To actually compute the leverage index, we take the ratio of the expected change in win probability to the average expected change in win probability. \[LI = \frac{\mathbb{E}(|\Delta_{WP}|)}{\text{Average} (|\Delta_{WP}|)}\] The denominator is the average of the absolute change in win probability over all events in a given period. So, \(LI = 1\) corresponds to average leverage, while \(LI > 1\) means an above-average leverage situation. Put another way, the outcome of the game hinges more on this at bat than usual.

I had trouble finding a current value for this denominator, so I used 0.0346, as suggested in Tom Tango's article on leverage index from 2006. Hopefully these values haven't changed too much. Thus for Kevin Kiermaier's at bat, the leverage index was \[LI = \frac{0.203}{0.0346} = 5.85.\] Unsurprisingly, this is massive! The at bat was almost six times as impactful as the average play, and of course it was, because there were two outs with a runner in scoring position in extra innings, with his team down by one.

Using standard deviation instead

Another way to do this is to use the standard deviation of the win probability in this situation and compare it to the standard deviation of win probability in an average at bat. This method is also mentioned in Tango's article above, and in this 2019 blog post. In our example, we compute \[\sigma_{WP} = \sqrt{\sum_{i=1}^6 P_i \cdot (WP_i - 0.854)^2} = 0.257,\] where \(i\) runs through the six rows in the table above (i.e. \(i=1\) for an out, \(i=2\) for a walk or HBP, etc), \(P_i\) is the probability of occurrence, and \(WP_i\) is the new win probability in each case. This is a bit larger than the 0.203 we computed above, because this method gives slightly more weight to the larger swings in probability.

Using Tango's reported value of 0.0418 as the standard deviation in an average scenario, we compute the ratio to be \[LI = \frac{0.257}{0.0418} = 6.14.\] Again, this suggests that the result of Kiermaier's at bat had the potential to swing the win probability of the game about 6 times more than an average at bat.
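Both versions of the leverage calculation can be reproduced from the outcome table in a few lines. This is a minimal sketch with names of my own choosing; it uses the rounded table entries, so the results land within rounding of the figures quoted above rather than matching them digit for digit.

```python
import math

# Outcome -> (likelihood, new visiting-team win probability), from the outcome table.
# The likelihoods are rounded and sum to 0.999; they are used as-is, not renormalized.
OUTCOMES = {
    "Out": (0.694, 1.000),
    "BB/HBP": (0.101, 0.823),
    "1B": (0.135, 0.436),
    "2B": (0.042, 0.392),
    "3B": (0.003, 0.365),
    "HR": (0.024, 0.000),
}
CURRENT_WP = 0.854        # Red Sox win probability before Kiermaier's at bat
AVG_ABS_CHANGE = 0.0346   # average |change in WP| quoted from Tango's 2006 article
AVG_STD_DEV = 0.0418      # average standard deviation quoted from the same article


def expected_abs_change(outcomes, current_wp):
    """Likelihood-weighted mean of |new WP - current WP|."""
    return sum(p * abs(wp - current_wp) for p, wp in outcomes.values())


def wp_std_dev(outcomes, current_wp):
    """Square root of the likelihood-weighted mean of (new WP - current WP)^2."""
    return math.sqrt(sum(p * (wp - current_wp) ** 2 for p, wp in outcomes.values()))


mean_abs = expected_abs_change(OUTCOMES, CURRENT_WP)
std_dev = wp_std_dev(OUTCOMES, CURRENT_WP)
li_abs = mean_abs / AVG_ABS_CHANGE
li_std = std_dev / AVG_STD_DEV

print(f"expected |change in WP|: {mean_abs:.3f}  ->  LI = {li_abs:.2f}")
print(f"std dev of WP:           {std_dev:.3f}  ->  LI = {li_std:.2f}")
```

Swapping in a different row of outcomes, or updated values for the two league-average denominators, only requires changing the constants at the top.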