Hitter volatility and the 2021 Yankees

Sometimes we receive mailbag emails that are worthy of their own post. This came from one of our readers, Aaron, a couple weeks ago:

The Yankees were, despite not believing in ‘hot’, somehow one of the streakiest teams we’ve ever seen, alternating between looking like World Series contenders and a Little League team within the span of mere weeks. What are your thoughts on the possibility that having streaky players like Stanton, Sánchez, and (later in the season) Gallo in the lineup had something to do with this?

This question addresses a broader narrative about this year’s club, and even Yankees teams from prior seasons. The thought is that an offense prone to streaks (hot or cold) is less likely to find success in a short series. From a high level, this makes some sense. Either everything is going to click at the right time and a team will bulldoze its way to a title, or a team will suddenly turn into a pumpkin and get knocked out quickly. Meanwhile, a more consistent ballclub should have a better chance of making a long October run. And, while the Yankees have extremely talented hitters, they do seem like a group of guys who are either blazing hot or ice cold, making postseason success difficult.

Seeking out consistency

When I hear people say that they want a less streaky offense, what I think they mean is one that is reliable on a day-to-day basis over a long season. I think fans want players who, in any randomly chosen day’s box score, frequently perform at or near their full-season stat line. That’s a measure of consistency, and that’s what I think Aaron is seeking out in his question.

Bill Petti, formerly of FanGraphs, has done some pretty extensive research on hitter volatility, which I think is what Aaron (and others) are looking to understand with regard to the Yankees. It’s important to note that volatility does not equate to streakiness, though there can be some overlap. Petti’s work on hitter volatility does a great job describing the types of hitters who are (and aren’t) consistent over a full season.

Petti’s most recent work on the subject was published in 2017. I can’t recreate that for the 2021 season as it’s too complex for my understanding, but there are older versions (team and player based) of a metric he created (VOL) that I recomputed using Excel. The older versions are definitely inferior, but this is the best I can do with my knowledge base and Excel. There are leaderboards for Petti’s updated individual metric, but only through 2019.

Volatility vs. Streakiness

Before diving into the Yankees’ volatility numbers, let’s clear up the difference between volatility and streakiness. Here’s what Petti wrote himself:

As I have written before, volatility is not the same thing as being streaky. Streakiness is about how extreme positive and negative performances lump together over the course of a season. Essentially, it’s the clustering of good and bad performances over long stretches. Volatility is different. It is more about the overall distribution of a player’s daily performance relative to the overall runs they create over the course of a season. If a player creates 81 runs in a given season, did he create half of a run every game (perfectly consistent/equal distribution of runs), or did he create 80 percent of his runs in only 20 percent of his games?

Make sense? In other words, streakiness is more about outlier performances over consecutive stretches of games, whereas volatility measures the range of performance across the entire season and the order of games does not matter.
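To make the distinction concrete, here is a minimal Python sketch. The two game logs are hypothetical, and the five-game rolling average is just one simple stand-in for "streakiness", not Petti's method. Both logs contain exactly the same daily values, so their spread (volatility) is identical, but only one of them clusters its good and bad games together.

```python
from statistics import pstdev


def rolling_means(values, window):
    """Average of each run of `window` consecutive games."""
    return [sum(values[i:i + window]) / window
            for i in range(len(values) - window + 1)]


def rolling_swing(values, window=5):
    """Gap between the best and worst rolling average (a crude streak proxy)."""
    means = rolling_means(values, window)
    return max(means) - min(means)


# Hypothetical daily wOBA logs: the same 20 values in a different order.
clustered = [0.600] * 10 + [0.100] * 10   # ten good games, then ten bad ones
alternating = [0.600, 0.100] * 10         # good and bad games take turns

# Spread ignores order, so it is the same for both logs.
spread_clustered = pstdev(clustered)
spread_alternating = pstdev(alternating)

# The rolling swing depends on order, so it differs.
swing_clustered = rolling_swing(clustered)
swing_alternating = rolling_swing(alternating)

print(round(spread_clustered, 3), round(spread_alternating, 3))
print(round(swing_clustered, 3), round(swing_alternating, 3))
```

The standard deviations match, while the clustered log's rolling average swings far more widely than the alternating one's. That is the sense in which a hitter can be equally volatile but more or less streaky.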

To further illustrate what Petti means here, let’s compare Gary Sánchez and DJ LeMahieu. The former had a .314 wOBA this year, and the latter recorded a .315 mark. Basically the same full-season performance! How each of them got there was quite different, though. Here are the recomputed VOL inputs and results:

Player          wOBA    Std. Dev. Daily wOBA    VOL     VOL-
Gary Sánchez    .314    .264                    .482    102
DJ LeMahieu     .315    .208                    .379    80
VOL- is indexed where 100 is average. Results above 100 are more volatile than average, and results below 100 are less volatile than average.

LeMahieu’s performance was much more evenly distributed throughout the 162-game season than Sánchez’s. Said another way, there were more individual games where DJLM’s wOBA was closer to his full-season wOBA than Gary’s. In terms of streakiness, though? Both had plenty of ups and downs throughout the year, but Sánchez seems to have had longer peaks and valleys.
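For anyone who wants to check the arithmetic, the VOL values in the table above are reproduced, to rounding, by dividing the standard deviation of daily wOBA by the season wOBA raised to the 0.52 power. To be clear, that exponent is an assumption back-solved from the table rather than something spelled out in this post, and the league-average VOL of roughly 0.472 is likewise inferred from the VOL and VOL- columns. A minimal Python sketch under those assumptions:

```python
# Assumption: exponent inferred from the table values, not stated in the post.
WOBA_EXPONENT = 0.52
# Assumption: league-average VOL back-solved from the VOL and VOL- columns.
LEAGUE_AVG_VOL = 0.472


def vol(sd_daily_woba, season_woba, exponent=WOBA_EXPONENT):
    """Spread of daily wOBA scaled by overall production."""
    return sd_daily_woba / season_woba ** exponent


def vol_minus(vol_value, league_avg=LEAGUE_AVG_VOL):
    """Index VOL so that 100 is league average; above 100 is more volatile."""
    return 100 * vol_value / league_avg


sanchez_vol = vol(0.264, 0.314)
lemahieu_vol = vol(0.208, 0.315)

print(round(sanchez_vol, 3), round(vol_minus(sanchez_vol)))
print(round(lemahieu_vol, 3), round(vol_minus(lemahieu_vol)))
```

Scaling by wOBA matters because better hitters tend to have larger raw swings in daily results. Without that adjustment, a good hitter would look volatile simply for being good.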

Before looking at any data, I’d assume most people would call Sánchez streaky and LeMahieu consistent. VOL confirms that LeMahieu was indeed consistent in 2021, though it doesn’t, on its own, indicate that Sánchez was streaky. We can see Sánchez’s streakiness via the rolling wOBA chart, while VOL tells us that the distribution of his performances was a tad more varied than the typical hitter’s.

The point here? Pick a random game across 162 and you’re more likely to find a boom-or-bust result from Sánchez than LeMahieu. Again, this has nothing to do with streakiness, but rather the frequency with which a player performs at or close to his mean performance on any given day. I believe that this is what fans want out of players more than anything else, regardless of streakiness, which is also why we see so many people want guys like Sánchez or Gallo off the team.

Team Volatility

So, would a team full of less-volatile hitters be more successful than a team full of more-volatile ones? Let’s take a look. First, we’ll go in-depth on the numbers this season and then look at the playoff teams in the nine seasons prior.


To my surprise, the Yankees were the second-most consistent offense in the majors in terms of runs scored in 2021. While that caught me off guard initially, it does make sense. I’ll explain after the table of team-by-team data, sorted by least volatile to most volatile:

[Table: Team, R, Std. Dev. R/Game, R/Game, Actual Wins – Pythag Wins, VOL]

The verdict: consistency doesn’t make a good offense. It can help! But teams can also be consistently mediocre or bad. In the Yankees’ case, the 2021 lineup was frequently mediocre. Yes, the team went on some wild streaks of hot and cold play, but far more often than not it was pretty ordinary.

Now, you may have noticed that I included the differential between each team’s actual wins and Pythagorean wins (i.e. expected wins based on run differential). This is where consistency can help. There’s a correlation between low VOL and a team beating its Pythagorean record. Petti’s study found this in past years, and I noticed the same for 2021. While the Mariners skew things a bit, this is worth noting:

  • Least volatile 15 teams: +22 wins over Pythagorean
  • Most volatile 15 teams: -27 wins over Pythagorean

I think this makes sense. Blowouts can skew a team’s run differential, and teams with frequent blowouts are generally more volatile. Look at the Blue Jays: they scored the third-most runs per game but were the fifth-most volatile offense. They fell short of their Pythagorean record by 8 wins.
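A toy example shows why blowouts matter here. The sketch below uses the classic Pythagorean expectation with an exponent of 2 (an assumption on my part, since other exponents are in common use) and two hypothetical 10-game stretches. Both offenses score the same number of runs against the same runs allowed, so their expected wins are identical, yet the steady offense wins far more often.

```python
def pythagorean_wins(runs_scored, runs_allowed, games, exponent=2.0):
    """Expected wins from run totals. The exponent of 2 is an assumption."""
    rs = runs_scored ** exponent
    ra = runs_allowed ** exponent
    return games * rs / (rs + ra)


def actual_wins(scored_by_game, allowed_by_game):
    """Count the games in which the team outscored its opponent."""
    return sum(s > a for s, a in zip(scored_by_game, allowed_by_game))


# Hypothetical stretches: both offenses score 50 runs while allowing 40.
allowed = [4] * 10
steady = [5] * 10                       # five runs every night
boom_or_bust = [15, 15, 13] + [1] * 7   # three blowouts, seven duds

expected = pythagorean_wins(sum(steady), sum(allowed), games=10)

steady_diff = actual_wins(steady, allowed) - expected
boom_diff = actual_wins(boom_or_bust, allowed) - expected

print(round(expected, 1), round(steady_diff, 1), round(boom_diff, 1))
```

The numbers are exaggerated on purpose, but the mechanism is the same one at work in the real standings: runs piled up in a blowout inflate the Pythagorean record without adding wins.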

What am I getting at here, though? We’re looking for answers about World Series odds, not beating Pythagorean records. Well, here’s the deal. The Yankees’ offense just wasn’t that good. Jaime did a nice job detailing this recently, in fact. However, being consistent may have helped them sneak into the playoffs by nabbing a few extra wins.

Prior years

The phrase “the MLB postseason is a crapshoot” has been thrown around a lot over the past decade or two. Just get in the dance and anything can happen. In that sense, it’s better to be less volatile, right? After all, as I showed for 2021 and Petti has shown previously, a less volatile offense tends to overperform in the regular season, which can help said club grab a playoff spot.

However, consistency doesn’t seem to matter once the playoffs roll around. It’s great if it helps a team get in, but it doesn’t appear to offer any edge after that. Thanks to Retrosheet, I took a look at where the last nine seasons’ pennant winners ranked in regular season VOL:

Year    Team    Result     Regular Season VOL Rank
2012    SF      Won WS     6
2013    BOS     Won WS     26
2014    SF      Won WS     24
2015    KC      Won WS     12
2016    CHC     Won WS     2
2017    HOU     Won WS     24
2018    BOS     Won WS     16
2019    WSN     Won WS     16
2020    LAD     Won WS     1
2012    DET     Lost WS    1
2013    STL     Lost WS    24
2014    KC      Lost WS    3
2015    NYM     Lost WS    27
2016    CLE     Lost WS    11
2017    LAD     Lost WS    5
2018    LAD     Lost WS    28
2019    HOU     Lost WS    28
2020    TB      Lost WS    15

All over the place! Sometimes the upper echelon clubs went to the Fall Classic. In other years, wildly inconsistent teams made it to the final round. Circling back to this season: Houston was one of the more consistent regular season clubs while Atlanta was quite volatile.
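A quick tally of the ranks in the table backs up the eyeball test. This is only a rough sketch: it assumes rank 1 is the least volatile of 30 teams, and 18 pennant winners is a tiny sample.

```python
# VOL ranks copied from the table above, in season order 2012 through 2020.
ws_winner_ranks = [6, 26, 24, 12, 2, 24, 16, 16, 1]
ws_loser_ranks = [1, 24, 3, 27, 11, 5, 28, 28, 15]


def mean(values):
    return sum(values) / len(values)


def thirds(ranks):
    """Count ranks in the top (1-10), middle (11-20), and bottom (21-30) third."""
    top = sum(1 for r in ranks if r <= 10)
    middle = sum(1 for r in ranks if 11 <= r <= 20)
    bottom = sum(1 for r in ranks if r >= 21)
    return top, middle, bottom


all_ranks = ws_winner_ranks + ws_loser_ranks

# With 30 teams, ranks drawn at random would average 15.5.
print(round(mean(ws_winner_ranks), 1), round(mean(ws_loser_ranks), 1))
print(thirds(all_ranks))
```

The pennant winners split almost evenly across the three tiers, and their average rank sits close to the 15.5 midpoint, which is what you would expect if regular season VOL had little to do with who reaches the World Series.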

The point

It’s not a bad thing to have a more consistent offense. It certainly looks like it will help over the course of a 162-game slate. But to win 11 (or 12, if you’re the Wild Card) games in October? It doesn’t seem to make a difference, although there’s a case to be made that it can’t hurt.

As we’ve seen before, getting red-hot in October can make a world of difference. That’s the best time to go on a hot streak, though I don’t think it’s possible to predict when that will happen. At the end of the day, I think I’d take my chances with a team that has a great offense and is less volatile. Things may not work out anyway given the wonkiness of playoff baseball, but it does seem like the best foot forward, in theory.

Volatility on an individual level

With the team-level stats out of the way, let’s get a bit more granular. Here are the Yankees hitters for 2021 and numbers relative to the rest of the league.

Player               PA     Std. Dev. Daily wOBA    wOBA     VOL      VOL-    Percentile
DJ LeMahieu          679    0.208                   0.315    0.379    80      98th
Tim Locastro         156    0.187                   0.237    0.395    84      94th
Gleyber Torres       516    0.223                   0.307    0.411    87      89th
Tyler Wade           145    0.231                   0.302    0.431    91      81st
Anthony Rizzo        576    0.252                   0.339    0.442    94      70th
Giancarlo Stanton    579    0.268                   0.370    0.449    95      63rd
Aaron Hicks          126    0.239                   0.279    0.464    98      49th
Gio Urshela          442    0.255                   0.309    0.470    100     44th
Brett Gardner        461    0.253                   0.305    0.470    100     44th
Rougned Odor         361    0.246                   0.289    0.470    100     43rd
Luke Voit            241    0.266                   0.332    0.472    100     42nd
Clint Frazier        218    0.248                   0.289    0.473    100     42nd
Gary Sánchez         440    0.264                   0.314    0.482    102     33rd
Aaron Judge          633    0.304                   0.387    0.499    106     22nd
Joey Gallo           616    0.293                   0.348    0.508    108     17th
Miguel Andújar       162    0.269                   0.288    0.513    109     14th
Kyle Higashioka      211    0.318                   0.272    0.625    133     1st

You’re probably not surprised to see Joey Gallo as one of the most volatile hitters in the league. And per Petti’s research, there’s a strong positive correlation between VOL- and stats like ISO and strikeout rate, two stats in which Gallo sits near the top of the leaderboard (for better and for worse). OBP has a negative correlation, meaning it can help offset some of Gallo’s volatility. But ultimately, when you’re a three-true-outcome hitter like Gallo, there are going to be a lot of .000 wOBA games because of a lack of contact and not much else. It looks worse if/when they cluster together, of course.

Again though, VOL isn’t saying that Gallo is streaky. It’s saying that he does a lot of damage in select individual games, regardless of the order. I’d suspect, though I am not sure how to prove it statistically, that such a high VOL makes him more susceptible to streaks (sure seems that way, at least). Meanwhile, to go back to LeMahieu, you essentially know what you’re going to get day in and day out, even if he was consistently pretty meh in 2021.

While Gallo seems like a very inconsistent hitter, we can’t simply say that he’ll be that way again in 2022. Petti’s research found that VOL isn’t very predictive. The shape of Gallo’s performance could be drastically different next season for all we know. What we do know is that it’s highly likely that Gallo will still be a very good hitter next year.

Ultimately, it’s not that easy to build out a low variance offense. A low VOL offense seems desirable, but it doesn’t correlate with run production.

At the end of the day, stacking good hitters, regardless of the variances in their daily performances, is what’s most important.


I think that the “complaint” that the Yankees’ offense was too streaky really comes down to the desire for a good and consistent offense through 162 games: one that is going to put up 5 or 6 runs per game very, very frequently, rather than one that blows up for 10 runs a lot but also gets shut out on the regular.

Funnily enough, the 2021 Yankees gave us plenty of consistency! Yes, I know there were some extended hot and cold streaks mixed in, but just look at the distribution of runs scored per game this season:

They scored four or fewer runs more often than not (54 percent of the time, to be exact). Gross! A middling offense, regardless of its ability to get red hot on occasion, is not a recipe for October success. Give me a great offense, ideally that’s consistent, and I’ll take my chances. If it’s not consistent, so be it.

Streakiness doesn’t seem predictable, but going on a hot streak in October obviously is ideal. As I said earlier, I’d theorize that a great offense that is less-volatile is more likely to sustain an October run, though the data doesn’t seem to bear that out. It’s still the platonic ideal for me, though.

The problem is that building a less-volatile offense is very difficult. Given that VOL isn’t particularly predictive, it’s much easier to identify hitters who project to have good or great full season performances. And with that in mind, the real criticism of the 2021 Yankees’ offense really just boils down to it underperforming its projections. It had nothing to do with consistency or streakiness. It just wasn’t a very good lineup, period.


  1. Aaron

    Thanks for taking the time to really explore the question in detail. I never knew that there was prior research into questions of streakiness and volatility.

    At the same time, I can’t help but wonder if arguing that the playoffs are a crap shoot might be oversimplifying things. While a lot can happen in a smaller sample, it probably doesn’t hurt to try tilting the odds in one’s favor as much as possible by being prepared with what should theoretically work in a larger sample (e.g., stealing what wins you can by being consistent while making sure said consistent players are actually good ones).

    If we went back far enough with that volatility study to look at teams that made it into the AL/NLCS over 20 years, it might provide further food for thought. The same might be true of the 1996-2001 Yankees, since it would be interesting to look past the usual narratives about the dynasty in order to better understand just what it may have been besides luck that allowed the team to consistently succeed.

    Thanks again for the in-depth response.

  2. JorgiePorgie

    Love the level of analysis and depth here. Learned a lot. One thing I think is inconsistent. “At the end of the day, stacking good hitters, regardless of the variances in their daily performances, is what’s most important.” seems true AND as you demonstrated earlier, low volatility seems to help you steal a few wins above Pythagorean expected since blowout wins don’t affect the standings. So actually it seems arguable that your ideal lineup is high average run producing AND (let’s say, as an added bonus) low volatility, and that might be a strategy to aim for–among other things, meaning moving toward high-quality, contact hitters who are likely to be less volatile than your true-three-outcomes type. Which obviously is a topic of much debate rn in Yankees fandom. What do you think?

  3. MikeD

    The issue is not one of volatility or streakiness, but instead one of mediocrity. Some fans might say it’s too many hitters like Stanton that are the problem. Uhh, no. Stanton and Judge were the good hitters. Everyone else were below league average. That was the problem. LeMahieu was one of the “better” hitters beyond Judge and Stanton, and he was basically blah. A 97 OPS+ with anemic power. Gleyber was below average. Gio was below average. Gardner was a total non-entity for 2/3rds of the year. LF was a black hole. It remained a black hole even after Gallo arrived. A .160/.303/.404 slash with a 93 OPS+ and a 35% K rate was not what the Yankees were hoping for. For most of the season, there was almost nothing around Judge and Stanton to put consistent pressure on the opposition and to generate sustained rallies. Fewer runners on base in front of them, and little hitting behind them means they also could be pitched around in key situations, so even they were hurt by the collective below-league average hitting of the rest of the lineup.

    People can call it streakiness. People can call it volatility. I’m calling it mediocrity.

    I do believe LeMahieu will rebound some. Gio too. Peripherals point that way. One can hope Gleyber’s .295/.365/.465 mid-July forward indicates he’s abandoned his HR-hitter approach that was almost assuredly a mirage created by the juiced ball. I’d love for Gleyber to revert to the hitter approach we saw when he first came up.

    End the mediocrity!

  4. Jim Beam

    I’m not sure about the rest of you, but I’m personally less worried about streakiness/VOL, and more interested in balance. I’m not talking about L/R necessarily, although that does play a factor. I’m talking about mixing guys that have AVG, speed, and defense with the TTO guys, and on paper, this team mostly had that (minus speed.)

    They just didn’t perform like we’d have hoped. If Torres goes back to his prospect-roots approach in the box, if DJ is healthy and figures out how to hit like a batting champ again and is aligned where he’s a defensive asset, and Urshela doesn’t play beat-up all season like DJ might have… These guys are all capable of providing consistent base-runners for the TTO guys and Judge to knock in, even if that means sacrificing or going for an easy base-knock instead of going all-out for the fences during slumps. Hopefully the new hitting coach will address the latter.

    Defense at first is an issue, but not enough of one to sacrifice the offensive necessities of a corner IF since Urshela isn’t typically a slugger for his career. Defense at catcher is an issue, and while there are plenty of potential options out there to address that, there really aren’t many that would be better offensively than Sanchez, it’s really a top-heavy position that no one is trading from in the top-tiers. That’s why the Yanks have been drafting catchers high in drafts over the last few years.

    Getting Seager @ SS will help with a lot of this including being a lefty, and there’s enough fill-in depth in the IF to deal w/ his occasional injuries. I think Correa’s a big ask from the FO to the fans, and there are reports that they’ve “soured” on Story, plus it’s hard to see where else they could bring in another decent lefty bat, other than 1B.

  5. Jimmy bob

    nice analysis

  6. Anthony Rizzeddardo

    I agree, Derek. They were consistently dreadful. This is the argument many of us have made ad nauseam for years. An all or nothing boom or bust HR or strikeout lineup is prone to these long droughts of no runs. You’re not putting the ball in play, you have no base runners to manufacture runs and with the deadened baseball you aren’t hitting home runs. It was the perfect recipe for awfulness and this Yankee organization has done nothing to adjust. The Braves and Astros have given us the blueprint for greatness. Freeman and Riley hit home runs but they also hit .300. If they were on the Yankees they’d want them to swing as hard as they can every time with an uppercut and they’d hit .220. Now you might say well what about the Tampa Rays hitting home runs. Yeah, and look where that got them in October. They reminded me of the Yankees of the last few years. Sure they win 100 games and make the postseason but then the weather gets colder, the pitching gets better and the home runs dry up. No World Series winning team has ever slugged their way to a ring. The late 90’s dynasty had guys like Paulie, Bernie and Jete hitting .300. The ’09 club hit a lot of home runs but guys like Hideki and Johnny hit at least .280 and Teixeira did too before they ruined his swing.

    This is what has made baseball unwatchable nowadays. Too much home run or strike out and starting pitchers don’t pitch more than 4 innings. So boring.

    • Dave Jordan

      Thanks Rizzadardo – you hit the nail right on the head here. Hope we see some changes to the approach next year, but not holding my breath.
