Sometimes we receive mailbag emails that are worthy of their own post. This came from one of our readers, Aaron, a couple weeks ago:
The Yankees were, despite not believing in ‘hot’, somehow one of the streakiest teams we’ve ever seen, alternating between looking like World Series contenders and a Little League team within the span of mere weeks. What are your thoughts on the possibility that having streaky players like Stanton, Sánchez, and (later in the season) Gallo in the lineup had something to do with this?
This question addresses a broader narrative about this year’s club, and even Yankees teams from prior seasons. The thought is that an offense prone to streaks (hot or cold) is less likely to find success in a short series. From a high level, this makes some sense. Either everything is going to click at the right time and a team will bulldoze its way to a title, or a team will suddenly turn into a pumpkin and get knocked out quickly. Meanwhile, a more consistent ballclub should have a better chance of making a long October run. And, while the Yankees have extremely talented hitters, they do seem like a group of guys who are either blazing hot or ice cold, making postseason success difficult.
Seeking out consistency
When I hear people say that they want a less streaky offense, what I think they mean is one that is reliable on a day-to-day basis over a long season. I think fans want players who, on any randomly selected day’s box score, frequently perform at or near their full-season stat line. That’s a measure of consistency, and that’s what I think Aaron is seeking out in his question.
Bill Petti, formerly of FanGraphs, has done some pretty extensive research on hitter volatility, which I think is what Aaron (and others) are looking to understand with regard to the Yankees. It’s important to note that volatility does not equate to streakiness, though there can be some overlap. Petti’s work on hitter volatility does a great job describing the types of hitters who are (and aren’t) consistent over a full season.
Petti’s most recent work on the subject was published in 2017 here. I can’t recreate that for the 2021 season as it’s too complex for my understanding, but there are older versions (team and player based) of a metric he created (VOL) that I recomputed using Excel. The older versions are definitely inferior, but this is the best I can do with my knowledge base and Excel. There are leaderboards for Petti’s updated individual metric, but only through 2019.
Volatility vs. Streakiness
Before diving into the Yankees’ volatility numbers, let’s clear up the difference between volatility and streakiness. Here’s what Petti wrote himself:
As I have written before, volatility is not the same thing as being streaky. Streakiness is about how extreme positive and negative performances lump together over the course of a season. Essentially, it’s the clustering of good and bad performances over long stretches. Volatility is different. It is more about the overall distribution of a player’s daily performance relative to the overall runs they create over the course of a season. If a player creates 81 runs in a given season, did he create half of a run every game (perfectly consistent/equal distribution of runs), or did he create 80 percent of his runs in only 20 percent of his games?
Make sense? In other words, streakiness is more about outlier performances over consecutive stretches of games, whereas volatility measures the range of performance across the entire season and the order of games does not matter.
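To make that distinction concrete, here is a small Python sketch. The daily wOBA values are made up for illustration, and the "longest cold streak" helper is my own stand-in for streakiness, not anything from Petti's work. The same set of games produces an identical spread no matter the order, while the streak length changes completely.

```python
import statistics

def longest_cold_streak(daily_woba, threshold):
    """Length of the longest run of consecutive games below `threshold`."""
    longest = current = 0
    for value in daily_woba:
        current = current + 1 if value < threshold else 0
        longest = max(longest, current)
    return longest

# Hypothetical daily wOBA values: ten bad games followed by ten good ones.
clustered = [0.100] * 10 + [0.500] * 10

# The exact same twenty games, but alternating bad and good.
alternating = [0.100, 0.500] * 10

# Spread of daily results ignores order, so both versions match.
spread_clustered = statistics.pstdev(clustered)
spread_alternating = statistics.pstdev(alternating)

# Streakiness depends on order, so the two versions differ.
streak_clustered = longest_cold_streak(clustered, 0.300)      # 10
streak_alternating = longest_cold_streak(alternating, 0.300)  # 1

print(spread_clustered, spread_alternating)
print(streak_clustered, streak_alternating)
```

The first hypothetical hitter looks far streakier, yet a spread-based measure treats the two identically.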
To further illustrate what Petti means here, let’s compare Gary Sánchez and DJ LeMahieu. The former had a .314 wOBA this year, and the latter recorded a .315 mark. Basically the same full season performance! How each got there was quite different, though. Here are the recomputed VOL inputs and results:
|Player||wOBA||Std. Dev. Daily wOBA||VOL||VOL-|
LeMahieu’s performance was much more evenly distributed throughout the 162-game season than Sánchez’s. Said another way, LeMahieu had more individual games in which his wOBA landed close to his full season mark than Sánchez did. In terms of streakiness, though? Both had plenty of ups and downs throughout the year, but Sánchez seems to have had longer peaks and valleys.
Before looking at any data, I’d assume most people would call Sánchez streaky and LeMahieu consistent. VOL confirms that LeMahieu was indeed consistent in 2021, though it doesn’t, on its own, indicate that Sánchez was streaky. We can see Sánchez’s streakiness via the rolling wOBA chart, while VOL tells us that the distribution of his performances was a tad more varied than the typical hitter’s.
The point here? Pick a random game across 162 and you’re more likely to find a boom-or-bust result from Sánchez than LeMahieu. Again, this has nothing to do with streakiness, but rather the frequency with which a player performs at or close to his mean performance on any given day. I believe that this is what fans want out of players more than anything else, regardless of streakiness, which is also why we see so many people want guys like Sánchez or Gallo off the team.
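For anyone who wants to see roughly how a number like this comes together, here is a minimal Python sketch. To be clear about the assumptions: I am treating VOL as the standard deviation of daily wOBA divided by season wOBA, and VOL- as that figure indexed so league average equals 100. Both are simplifications for illustration and not necessarily Petti's exact formulas. The two hitters and the league-average figure are hypothetical, and season wOBA is approximated as a plain average of daily values rather than being weighted by plate appearances.

```python
import statistics

def vol_style(daily_woba, season_woba):
    """Spread of daily wOBA relative to season wOBA (assumed formula)."""
    return statistics.pstdev(daily_woba) / season_woba

def indexed(value, league_average):
    """Scale a value so that league average equals 100 (assumed convention)."""
    return 100 * value / league_average

# Two hypothetical hitters with the same overall wOBA (.315).
steady = [0.300, 0.320, 0.310, 0.330, 0.315]
boom_bust = [0.000, 0.700, 0.000, 0.875, 0.000]

steady_season = statistics.fmean(steady)
boom_bust_season = statistics.fmean(boom_bust)

steady_vol = vol_style(steady, steady_season)
boom_bust_vol = vol_style(boom_bust, boom_bust_season)

# Hypothetical league-average VOL, used only to show the indexing step.
league_vol = (steady_vol + boom_bust_vol) / 2

print(round(indexed(steady_vol, league_vol)))
print(round(indexed(boom_bust_vol, league_vol)))
```

Same season line, very different day-to-day shape, which is the LeMahieu versus Sánchez comparison in miniature.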
So, would a team full of less-volatile hitters be more successful than a team full of more-volatile ones? Let’s take a look. First, we’ll go in-depth on the numbers this season and then look at the playoff teams in the nine seasons prior.
To my surprise, the Yankees were the second-most consistent offense in the majors in terms of runs scored in 2021. While that caught me off guard initially, it does make sense. I’ll explain after the table of team-by-team data, sorted by least volatile to most volatile:
|Team||R||Std. Dev. R/Game||R/Game||Actual Wins – Pythag Wins||VOL|
The verdict: consistency alone doesn’t make a good offense. It can help! But teams can also be consistently mediocre or bad. In the Yankees’ case, the 2021 lineup was frequently mediocre. Yes, the team went on some wild streaks of hot and cold play, but the team far more often than not was pretty ordinary.
Now, you may have noticed that I included the differential between each team’s actual wins and Pythagorean wins (i.e. expected based on run differential). This is where consistency can help. There’s a correlation between VOL and a team beating its Pythagorean record. Petti’s study found this in past years, and I noticed the same for 2021. While the Mariners skew things a bit, this is worth noting:
- Least volatile 15 teams: +22 wins over Pythagorean
- Most volatile 15 teams: -27 wins over Pythagorean
I think this makes sense. Blowouts can skew a team’s run differential, and teams with frequent blowouts are generally more volatile. Look at the Blue Jays: they scored the third-most runs per game but were the fifth-most volatile offense. They fell short of their Pythagorean record by 8 wins.
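If the Pythagorean piece is unfamiliar, here is a short Python sketch of the calculation. It assumes the classic version of the formula with an exponent of 2 (other exponents, such as 1.83, are also in common use, and I am not claiming this matches the exact version behind the numbers above). The run totals and win total are hypothetical.

```python
def pythagorean_wins(runs_scored, runs_allowed, games=162, exponent=2.0):
    """Expected wins from runs scored and allowed (assumed exponent of 2)."""
    scored = runs_scored ** exponent
    allowed = runs_allowed ** exponent
    return games * scored / (scored + allowed)

def wins_over_pythag(actual_wins, runs_scored, runs_allowed, games=162):
    """Positive means the team beat its run-differential expectation."""
    return actual_wins - pythagorean_wins(runs_scored, runs_allowed, games)

# Hypothetical team: 800 runs scored, 700 allowed, 95 actual wins.
expected = pythagorean_wins(800, 700)
difference = wins_over_pythag(95, 800, 700)

print(round(expected, 1))
print(round(difference, 1))
```

A team that piles up runs in a handful of blowouts inflates its runs scored, which pushes its expected wins up without adding actual wins.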
What am I getting to here, though? We’re looking for answers about World Series odds, not beating Pythagorean records. Well, here’s the deal. The Yankees offense just wasn’t that good. Jaime did a nice job detailing this recently, in fact. However: being consistent may have helped them sneak into the playoffs by nabbing a few extra wins.
The phrase “the MLB postseason is a crapshoot” has been thrown around a lot over the past decade or two. Just get in the dance and anything can happen. In that sense, it’s better to be less volatile, right? After all, as I showed for 2021 and Petti’s proven previously, a less volatile offense tends to overperform in the regular season, which can help said club grab a playoff spot.
However, consistency doesn’t seem to matter once the playoffs roll around. It’s great if it helps a team get in, but it doesn’t appear to carry over from there. Thanks to Retrosheet, I took a look at where the last nine seasons’ pennant winners ranked in regular season VOL:
|Year||Team||Result||Regular Season VOL Rank|
All over the place! Sometimes the upper echelon clubs went to the Fall Classic. In other years, wildly inconsistent teams made it to the final round. Circling back to this season: Houston was one of the more consistent regular season clubs while Atlanta was quite volatile.
It’s not a bad thing to have a more consistent offense. It certainly looks like it will help over the course of a 162-game slate. But to win 11 (or 12, if you’re the Wild Card) games in October? It doesn’t seem to make a difference, although there’s a case to be made that it can’t hurt.
As we’ve seen before, getting red-hot in October can make a world of difference. That’s the best time to go on a hot streak, though I don’t think it’s possible to predict when that will happen. At the end of the day, I think I’d take my chances with a team that has a great offense and is less volatile. Things may not work out anyway given the wonkiness of playoff baseball, but it does seem like the best foot forward, in theory.
Volatility on an individual level
With the team-level stats out of the way, let’s get a bit more granular. Here are the Yankees hitters for 2021 and numbers relative to the rest of the league.
|Name||PA||SD wOBA||wOBA||VOL||VOL-||MLB Percentile|
You’re probably not surprised to see Joey Gallo as one of the most volatile hitters in the league. And per Petti’s research, there’s a strong positive correlation between VOL- and stats like ISO and strikeout rate, two categories in which Gallo sits at the top of the leaderboard (for better and for worse). OBP has a negative correlation, meaning it can help offset some of Gallo’s volatility. But ultimately, when you’re a three-true-outcome hitter like Gallo, there are going to be a lot of .000 wOBA games because of a lack of contact and not much else. It looks worse if/when they cluster together, of course.
Again though, VOL isn’t saying that Gallo is streaky. It’s saying that he does a lot of damage in select individual games, regardless of the order. I’d suspect, though I am not sure how to prove it statistically, that such a high VOL makes him more susceptible to streaks (sure seems that way, at least). Meanwhile, to go back to LeMahieu, you essentially know what you’re going to get day in and day out, even if he was consistently pretty meh in 2021.
While Gallo seems like a very inconsistent hitter, we can’t simply say that he’ll be that way again in 2022. Petti’s research found that VOL isn’t very predictive. The shape of Gallo’s performance could be drastically different next season for all we know. What we do know is that it’s highly likely that Gallo will still be a very good hitter next year.
Ultimately, it’s not that easy to build out a low variance offense. A low VOL offense seems desirable, but it doesn’t correlate with run production.
At the end of the day, stacking good hitters, regardless of the variances in their daily performances, is what’s most important.
I think that the “complaint” that the Yankees’ offense was too streaky really comes down to the desire for a good and consistent offense through 162 games: one that is going to put up 5 or 6 runs per game very, very frequently, rather than one that blows up for 10 runs a lot but also gets shut out on the regular.
Funnily enough, the 2021 Yankees gave us plenty of consistency! Yes, I know there were some extended hot and cold streaks mixed in, but just look at the distribution of runs scored per game this season:
They scored four or fewer runs more often than not (54 percent of the time, to be exact). Gross! A middling offense, regardless of its ability to get red hot on occasion, is not a recipe for October success. Give me a great offense, ideally that’s consistent, and I’ll take my chances. If it’s not consistent, so be it.
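The "four or fewer" figure is just a share of games at or below a cutoff. Here is a quick Python sketch of that tally, using a made-up ten-game sample and not the Yankees' actual 2021 game log.

```python
from collections import Counter

def run_distribution(runs_per_game):
    """Share of games at each run total, sorted from fewest to most runs."""
    counts = Counter(runs_per_game)
    total = len(runs_per_game)
    return {runs: counts[runs] / total for runs in sorted(counts)}

def share_at_or_below(runs_per_game, cutoff):
    """Fraction of games in which the team scored `cutoff` runs or fewer."""
    return sum(1 for runs in runs_per_game if runs <= cutoff) / len(runs_per_game)

# Hypothetical ten-game sample of runs scored.
sample = [2, 4, 5, 0, 10, 3, 4, 7, 1, 6]

distribution = run_distribution(sample)
low_scoring_share = share_at_or_below(sample, 4)

print(distribution)
print(low_scoring_share)  # 0.6 for this sample
```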
Streakiness doesn’t seem predictable, but going on a hot streak in October obviously is ideal. As I said earlier, I’d theorize that a great offense that is less-volatile is more likely to sustain an October run, though the data doesn’t seem to bear that out. It’s still the platonic ideal for me, though.
The problem is that building a less-volatile offense is very difficult. Given that VOL isn’t particularly predictive, it’s much easier to identify hitters who project to have good or great full season performances. And with that in mind, the real criticism of the 2021 Yankees’ offense really just boils down to it underperforming its projections. It had nothing to do with consistency or streakiness. It just wasn’t a very good lineup, period.