(Originally published on April 25, 2022)
This is part of a series of posts on sabermetrics and the mathematics of baseball. You can find more here.
On Saturday (April 23rd, 2022) the Boston Red Sox had a bizarre contest against the Tampa Bay Rays. While the Rays threw a combined no-hitter over the first 9 innings, the Red Sox held the Rays to only two baserunners in that span, sending the game to extra innings tied 0-0. In the top of the 10th, with the designated runner on second base, the Red Sox scored two runs to go up 2-0. Here's the play-by-play for the bottom of the 10th:
There are different ways to estimate win probability. One is simply to comb through historical data and check how often a team in a certain situation went on to win the game. Greg Stoll's Win Expectancy Finder lets us do just that. Between 2000 and 2019, there were 11 games in which the visiting team led by two runs in the bottom of the 10th inning, with a runner on 2nd base. In 9 of those games the visiting team went on to win, so we can estimate that the Red Sox win probability at the start of the bottom of the 10th inning was about 9/11, or 82%. The table below shows how these historical probabilities changed throughout the inning, calculated the same way.
Score | Out | Runner(s) | Batter | Win Probability (Visiting Team) | Total Games (2000 - 2019) | Actual Outcome |
---|---|---|---|---|---|---|
V +2 | 0 | 020 | Choi | 82% | 11 | Strikeout |
V +2 | 1 | 020 | Lowe | 83% | 30 | Strikeout |
V +2 | 2 | 020 | Walls | 94% | 54 | Balk |
V +2 | 2 | 003 | Walls | 100% | 12 | Reach on error |
V +1 | 2 | 100 | Kiermaier | 90.5% | 126 | Stolen base |
V +1 | 2 | 020 | Kiermaier | 91% | 69 | Home Run |
V -1 | 2 | 000 | | 0% | N/A | Rays win |
Pre-pandemic, i.e. before the designated runner rule, a two run lead in extras was indeed a lot safer. Using the 2000 - 2019 season data, 250 MLB games reached the bottom of the 10th with the visiting team ahead by two, no runners on base, and no outs. The visiting team went on to win 231 of them, or about 92.4%.
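The historical estimates above are just win counts divided by game counts. A minimal sketch of that arithmetic, using only the counts quoted in the text (the function name `empirical_win_prob` is my own, not something from the Win Expectancy Finder):

```python
def empirical_win_prob(wins: int, games: int) -> float:
    """Share of historical games in a given situation that the visiting team won."""
    if games <= 0:
        raise ValueError("need at least one historical game")
    return wins / games


# Counts quoted in the text (2000 - 2019 seasons), visiting team up by two,
# bottom of the 10th, no outs.
with_runner = empirical_win_prob(9, 11)      # runner on second
bases_empty = empirical_win_prob(231, 250)   # no runners on base

print(f"Runner on second: {with_runner:.1%}")
print(f"Bases empty:      {bases_empty:.1%}")
```

With only 11 games behind the first estimate, a single game going the other way would move it by about nine percentage points, which is the small sample size issue in a nutshell.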
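We could also use expected run values to estimate win probability in each situation. This avoids issues with small sample sizes and allows more careful fine-tuning of the run environment, if needed. In this particular game, the visiting Red Sox were ahead by two, so three things could have happened: the Rays score fewer than two runs and the Red Sox win; the Rays score exactly two runs to tie the game and send it to the 11th, where we treat each team as equally likely to win; or the Rays score three or more runs and win. This gives the formula \[\mathrm{Prob}(V \text{ win}) = 100\% - \frac{1}{2} \cdot \mathrm{Prob}(\text{Rays tie}) - \mathrm{Prob}(\text{Rays go ahead}).\]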
To estimate the scoring probabilities, we once again turn to Greg Stoll's Win Expectancy Finder. This time, we look at more recent 2021 scoring data, which tells us that on average, with no outs and a runner on second base, the batting team scored no runs 38.4% of the time, one run 32.4% of the time, two runs 14.9% of the time, and three or more runs 14.4% of the time. Assuming an average performance by the Rays, the above formula suggests that \[\mathrm{Prob}(V \text{ win}) = 100\% - \frac{1}{2} \cdot 14.9\% - 14.4\% = 78.2\%.\] Below we repeat these calculations for the entire inning.
Score | Out | Runner(s) | Batter | Prob(Rays tie) | Prob(Rays go ahead) | Win Probability (Visiting Team) | Actual Outcome |
---|---|---|---|---|---|---|---|
V +2 | 0 | 020 | Choi | 14.9% | 14.4% | 78.2% | Strikeout |
V +2 | 1 | 020 | Lowe | 10.4% | 7.8% | 87.1% | Strikeout |
V +2 | 2 | 020 | Walls | 4.9% | 2.6% | 94.9% | Balk |
V +2 | 2 | 003 | Walls | 5.3% | 2.6% | 94.8% | Reach on error |
V +1 | 2 | 100 | Kiermaier | 5.6% | 7.2% | 90.0% | Stolen base |
V +1 | 2 | 020 | Kiermaier | 14.2% | 7.5% | 85.4% | Home Run |
V -1 | 2 | 000 | | N/A | N/A | 0% | Rays win |
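The win probability column can be reproduced from the two columns before it. Here is a minimal sketch, assuming (as the factor of 1/2 in the calculation above implies) that a tied game is a coin flip from then on. The function names and the list layout for the run distribution are my own choices for illustration.

```python
def tie_and_go_ahead(run_probs, lead):
    """Split a run distribution into P(home team ties) and P(home team goes ahead).

    run_probs[k] is P(exactly k runs in the rest of the inning), except the
    last entry, which is P(that many runs or more).
    """
    if not 1 <= lead <= len(run_probs) - 2:
        raise ValueError("lead must correspond to an exact-run bucket")
    return run_probs[lead], sum(run_probs[lead + 1:])


def visitor_win_prob(p_tie, p_go_ahead, p_win_if_tied=0.5):
    """Visiting team's win probability in the bottom of an extra inning.

    p_win_if_tied = 0.5 is the coin-flip assumption for a game that stays tied.
    """
    return 1.0 - (1.0 - p_win_if_tied) * p_tie - p_go_ahead


# 2021 distribution quoted in the text: no outs, runner on second.
RUNS_2021 = [0.384, 0.324, 0.149, 0.144]  # 0, 1, 2, and 3+ runs

p_tie, p_ahead = tie_and_go_ahead(RUNS_2021, lead=2)
start_of_inning = visitor_win_prob(p_tie, p_ahead)
print(f"Start of the 10th: {start_of_inning:.4f}")

# (Prob(Rays tie), Prob(Rays go ahead), listed win probability) from the table.
TABLE_ROWS = [
    (0.149, 0.144, 0.782),
    (0.104, 0.078, 0.871),
    (0.049, 0.026, 0.949),
    (0.053, 0.026, 0.948),
    (0.056, 0.072, 0.900),
    (0.142, 0.075, 0.854),
]
for row_tie, row_ahead, listed in TABLE_ROWS:
    computed = visitor_win_prob(row_tie, row_ahead)
    print(f"computed {computed:.3f}  listed {listed:.3f}")
```

Because the table entries are already rounded to a tenth of a percent, the recomputed values can differ from the listed ones in the last digit.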
While I was watching the game, I was surprised to hear commentator (and former Red Sox player) Kevin Youkilis say that the balk moving the runner over to third didn't really matter. But this model agrees with his assertion, as we see the win expectancy dropped by only a minuscule 0.1%. Why? The Rays needed to score two runs to tie, so it's the current batter, not the runner on second or third, that really counts. In fact, some teams are even balking intentionally with multi-run leads to let a runner advance from second to third, to combat potential sign stealing.
Also interesting is that the error by second baseman Trevor Story that allowed Walls to reach base only lowered the Red Sox win probability by about 5%, from 94.8% to 90.0%. Of course, had he made an accurate throw for the third out, the game would have ended with a Red Sox win, but with two outs they still were in a reasonably good position.
Note that we could refine this approach by using more specific values instead of league averages from 2021. For instance, we might take into account how likely the Rays are to score runs in these various situations, or conversely how well the Red Sox pitching and defense can prevent runs. We also might try to take into account the specific batters at the plate, for example by using a batter's wOBA value to predict the run expectancy as in this FanGraphs article.
By both of our methods of estimating win probability, the Red Sox were favored to win this game throughout the bottom of the 10th. Anyone watching, however, realized that a home run could have tied the game at 2-2 in one swing. Going by MLB averages for the 2022 season (through 4/23/22), about 2.4% of plate appearances end in a home run, so this outcome is rather unlikely. For comparison, an out (of any type) occurs in about 70% of plate appearances, while a hit occurs in about 20%.
The leverage index is a metric that attempts to quantify how much the outcome of the game hinges on the outcome of the current play. To compute it, we must consider the possible outcomes of the current plate appearance and how they affect win probability.
Let's look at Kevin Kiermaier's at bat when he hit the game winning home run. We determined above that the win probability for the Red Sox was 85.4% at the time. If instead he walked (which occurs about 9% of the time on average), there would have been runners at first and second with two outs, with the Rays still down by 1 run. Using our win probability calculations from earlier, we find the Red Sox win probability is 82.3%, representing a decrease in win probability by about 3.1%.
We can similarly compute the changes in win probability if Kiermaier makes an out, is hit by a pitch, or gets a hit. For the purposes of this exercise, let's assume that if he singles (or gets any other hit), the runner scores from second base, tying the game. Using MLB-wide data from the 2022 season so far, we estimate the likelihood of each outcome using the proportion of plate appearances that ended in each outcome, ignoring anything other than these seven.
Outcome | Likelihood of outcome | New situation: Out | New situation: Runner(s) | New situation: Score | Prob(Rays tie) | Prob(Rays win) | Win Probability (Visiting Team) | Change in WP |
---|---|---|---|---|---|---|---|---|
Out (any) | 69.4% | 3 | 000 | V +1 | 0% | 0% | 100% | +14.6% |
BB/HBP | 10.1% | 2 | 120 | V +1 | 10.4% | 12.48% | 82.3% | -3.1% |
1B | 13.5% | 2 | 100 | V +0 | 87.2% | 12.8% | 43.6% | -41.8% |
2B | 4.2% | 2 | 020 | V +0 | 78.3% | 21.7% | 39.2% | -46.2% |
3B | 0.3% | 2 | 003 | V +0 | 73.0% | 27.0% | 36.5% | -48.9% |
HR | 2.4% | 2 | 000 | V -1 | 0% | 100% | 0% | -85.4% |
These calculations show what we knew to be true: the game outcome depended greatly on Kiermaier's at bat. If we weight the changes in win probability by the estimated likelihood of their occurrence, we find that the (absolute value of the) expected change in win probability is 20.3%.
To actually compute the leverage index, we take the ratio of the expected change in win probability to the average expected change in win probability. \[LI = \frac{\mathbb{E}(|\Delta_{WP}|)}{\text{Average} (|\Delta_{WP}|)}\] The denominator is the average of the absolute change in win probability over all events in a given period. So, \(LI = 1\) corresponds to average leverage, while \(LI > 1\) means an above average leverage situation. Put another way, the outcome of the game hinges more on this at bat than usual.
I had trouble finding a current value for this denominator, so I used 0.0346, as suggested in Tom Tango's article on leverage index from 2006. Hopefully this value hasn't changed too much. Thus for Kevin Kiermaier's at bat, the leverage index was \[LI = \frac{0.203}{0.0346} = 5.85.\] Unsurprisingly, this is massive! The at bat was almost six times as impactful as the average play, and of course it was, because there were two outs with a runner in scoring position in extra innings, with his team down by one.
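The weighted average and the ratio can be sketched in a few lines. This is only an illustration built from the rounded entries in the outcome table, so its results may differ slightly from the figures quoted above. The likelihoods sum to 99.9% and are deliberately not renormalized here, and the names `expected_abs_change` and `leverage_index` are my own.

```python
# (outcome, likelihood, new visiting-team win probability), from the table above.
OUTCOMES = [
    ("Out",    0.694, 1.000),
    ("BB/HBP", 0.101, 0.823),
    ("1B",     0.135, 0.436),
    ("2B",     0.042, 0.392),
    ("3B",     0.003, 0.365),
    ("HR",     0.024, 0.000),
]
WP_BEFORE = 0.854            # Red Sox win probability before the at bat
AVERAGE_ABS_CHANGE = 0.0346  # denominator taken from Tango's 2006 article


def expected_abs_change(outcomes, wp_before):
    """Likelihood-weighted average of |change in win probability|."""
    return sum(p * abs(wp_after - wp_before) for _, p, wp_after in outcomes)


def leverage_index(expected_change, average_change):
    """Ratio of this situation's expected swing to the average swing."""
    return expected_change / average_change


swing = expected_abs_change(OUTCOMES, WP_BEFORE)
li = leverage_index(swing, AVERAGE_ABS_CHANGE)
print(f"Expected |change in WP|: {swing:.4f}")
print(f"Leverage index:          {li:.2f}")
```

Swapping in a different `AVERAGE_ABS_CHANGE` is the only change needed if a more current league-wide value turns up.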
Another way to do this is to use the standard deviation of the win probability in this situation and compare it to the standard deviation of win probability in an average at bat. This method is also mentioned in Tango's article above, and in this 2019 blog post. In our example, we compute \[\sigma_{WP} = \sqrt{\sum_{i=1}^6 P_i \cdot (WP_i - 0.854)^2} = 0.257,\] where \(i\) runs through the six rows in the table above (i.e. \(i=1\) for an out, \(i=2\) for a walk or HBP, etc.), \(P_i\) is the probability of occurrence, and \(WP_i\) is the new win probability in each case. This is a bit larger than the 0.203 we computed above, because this method gives slightly more weight to the larger swings in probability.
Using Tango's reported value of 0.0418 as the standard deviation in an average scenario, we compute the ratio to be \[LI = \frac{0.257}{0.0418} = 6.14.\] Again, this suggests that the result of Kiermaier's at bat would have the potential to swing the win probability of the game by about 6 times more than an average at bat.
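The standard deviation version follows the same pattern. The sketch below implements the sum exactly as written in the formula, measuring spread around the pre-at-bat win probability of 0.854 rather than around the mean of the new win probabilities. It again uses the rounded table entries, so the last digit may not match the quoted values, and the function name `wp_std_dev` is hypothetical.

```python
import math

# (likelihood, new visiting-team win probability) for the six table rows:
# out, BB/HBP, 1B, 2B, 3B, HR.
OUTCOMES = [
    (0.694, 1.000),
    (0.101, 0.823),
    (0.135, 0.436),
    (0.042, 0.392),
    (0.003, 0.365),
    (0.024, 0.000),
]
WP_BEFORE = 0.854         # Red Sox win probability before the at bat
AVERAGE_STD_DEV = 0.0418  # Tango's reported value for an average scenario


def wp_std_dev(outcomes, wp_before):
    """Square root of the likelihood-weighted squared changes in win probability."""
    return math.sqrt(sum(p * (wp_after - wp_before) ** 2 for p, wp_after in outcomes))


sigma = wp_std_dev(OUTCOMES, WP_BEFORE)
li_sigma = sigma / AVERAGE_STD_DEV
print(f"sigma_WP:       {sigma:.4f}")
print(f"Leverage index: {li_sigma:.2f}")
```

Squaring the changes before averaging is what gives the rare home run, with its 85.4-point swing, more influence here than in the absolute-value version.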