Nylon Calculus: When can we trust a team’s stats?
One of the big questions through the early part of this season has been, “When do we have to start worrying about the Thunder?” Most expectations had the Thunder at least among the four best teams in the Western Conference and potentially a challenger to the Warriors come playoff time. But as we approach the 30-game mark of the season (or pass it for some teams), and the Thunder continue to struggle, it’s worth wondering if this is just who they are.
This raises the more general question: at what point in the season can we start to trust a team’s stats? Of course, the answer can vary depending on the statistic you’re looking at, from net rating to something very specific like opponent 3-point percentage.
Before answering this question, I want to make clear what I’m answering and what I’m not. The question, “At what point in the season can we start to trust a team’s stats as a reflection of their actual ability?” could be confused with the question, “How long does it take for a team’s stats to stabilize?” (as Darryl Blackport answered here for a player’s 3-point percentage).
For the first question, I used a method that may or may not answer the second question. I looked at the correlation between a team’s statistics after X games and the full-season result. Once the r-squared passes 0.5 (where the skill outweighs the noise), we’re into the territory where we can trust the stat as a measure of a team’s true and expected future performance level.
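To make the procedure concrete, here is a minimal sketch of the idea, not the exact code behind the table. It assumes each stat can be treated as a simple per-game average (real ratings are usually possession-weighted), and the function name `first_trustworthy_game` and the synthetic data are hypothetical, made up purely for illustration.

```python
import numpy as np

def first_trustworthy_game(game_values, threshold=0.5):
    """Return the smallest X where r-squared between the stat through
    the first X games and the full-season stat exceeds `threshold`.

    game_values: array of shape (n_teams, n_games), one row per team,
    games in chronological order. Returns None if never exceeded.
    Assumption: the stat is a plain average of per-game values.
    """
    game_values = np.asarray(game_values, dtype=float)
    n_teams, n_games = game_values.shape
    full_season = game_values.mean(axis=1)
    # Running average after each game, for every team.
    running = np.cumsum(game_values, axis=1) / np.arange(1, n_games + 1)
    for x in range(1, n_games + 1):
        r = np.corrcoef(running[:, x - 1], full_season)[0, 1]
        if r ** 2 > threshold:
            return x
    return None

# Synthetic league (NOT real NBA data): 30 teams, 82 games,
# a fixed "skill" per team plus game-to-game noise.
rng = np.random.default_rng(0)
skill = rng.normal(0.0, 3.0, size=30)
games = skill[:, None] + rng.normal(0.0, 12.0, size=(30, 82))
print(first_trustworthy_game(games))
```

In practice you would stack several seasons of team rows together, as the article does with four seasons, so the correlation is computed over more than 30 points.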
So why does this not answer the second question? Well, it might. But a better way to tackle that question would be to do what Darryl did here: take two random samples of X number of games and check the correlation, then repeat this for every permutation of games.
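For contrast, a split-sample check along those lines might look like the sketch below. It is an approximation under stated assumptions: rather than enumerating every permutation, it averages over a fixed number of random draws, it treats the stat as a plain per-game average, and the name `split_half_r2` and its parameters are hypothetical.

```python
import numpy as np

def split_half_r2(game_values, x, n_draws=200, seed=0):
    """Average r-squared between two disjoint random samples of x games.

    game_values: array of shape (n_teams, n_games). Game order is
    ignored: each draw shuffles the game indices, takes the first x
    as sample A and the next x as sample B, then correlates the
    per-team averages of A and B across teams.
    """
    game_values = np.asarray(game_values, dtype=float)
    n_teams, n_games = game_values.shape
    if 2 * x > n_games:
        raise ValueError("need at least 2 * x games per team")
    rng = np.random.default_rng(seed)
    r2_values = []
    for _ in range(n_draws):
        order = rng.permutation(n_games)
        sample_a = game_values[:, order[:x]].mean(axis=1)
        sample_b = game_values[:, order[x:2 * x]].mean(axis=1)
        r2_values.append(np.corrcoef(sample_a, sample_b)[0, 1] ** 2)
    return float(np.mean(r2_values))
```

Because the two samples never overlap, this measures how repeatable the stat is over X games, whereas the first method compares an early sample with a full season that already contains it.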
For that second question, we care less about the order of games but for the first question, we do care about the order. We want to know, on Dec. 21, what is the correlation between the team’s statistics so far (the information we have) versus where we’ll be at the end of the season (which includes the information we’ve accumulated through Dec. 21).
Perhaps there’s a better way to answer this (there probably is) but I thought my method here would be simple while doing a decent job answering the question. I looked at the numbers over the last four seasons for a number of different types of statistics.
Here are the results for how many games were needed in an initial sample to pass an r-squared of 0.5 with a team’s end-of-season numbers:
Stat | # of games |
Pace | 4 |
3PAr | 4 |
FTR | 5 |
KOBE | 5 |
AST% | 6 |
Opp. 3PAr | 7 |
Win% | 8 |
Net Rtg | 8 |
TOV% | 10 |
KOBEdef | 10 |
2PT% | 11 |
ORB% | 11 |
KOBE3s | 11 |
TS% | 12 |
EFG% | 12 |
KOBE3sdef | 12 |
Off Rtg | 13 |
Opp. EFG% | 14 |
Opp. FTR | 14 |
DRB% | 15 |
REB% | 15 |
Opp. TOV% | 15 |
FT% | 16 |
Def Rtg | 16 |
SHAC | 16 |
Opp. 2PT% | 18 |
SHAC3s | 25 |
3PT% | 26 |
SHACdef | 26 |
SHAC3sdef | 28 |
Opp. 3PT% | 33 |
Note: KOBE methodology can be found here. It measures the shot quality of a player or team while taking into account defender distance, height difference, shot clock, etc. SHAC is just the difference between a team’s actual efficiency and the KOBE stat. So it’s more of a skill-based stat.
As we can see here, I looked at both stylistic stats (such as KOBE, pace, etc.) and predictive stats (such as Net Rating). Interestingly and perhaps not surprisingly, it doesn’t take us many games to know the style of a team — what type of shot selection will they have (KOBE, 3PAr) or how fast will the team play (Pace). Also, not surprising: it takes a long time to trust a team’s opponent 3-point percentage. In fact, we’re just now passing the point where you can start to trust a team’s opponent 3-point percentage as reflective of where they will end up at the end of the season.
You’ll also notice that you can trust a team’s KOBE stats earlier than a team’s SHAC stats. Again, this makes sense as the KOBE metric is more of a stylistic metric than the SHAC one is. We know pretty early if a team is going to launch a ton of 3s during the year. But the results of those shots going in will take a bit longer to trust (think of the Magic this year). This is especially the case with 3-point shots. It would be interesting to break down the 2-point percentage into shots near the basket and mid-range shots. Presumably, shots near the basket would take fewer games to trust than mid-range shots.
One other thing you can notice in the table above: defensive statistics take longer to trust than their offensive counterparts. This makes sense, as the range of performance in defensive stats seems to be more tightly packed than in their offensive counterparts. Of course, this raises an interesting question: how do we account for the variance in a statistic? For example, at the end of the year, the opponent KOBE stat for 3s tends to be tightly bunched (within the margin of error, actually), so another interesting question to ask would be how long it takes for the variance of the league to stabilize. But that’s a question for another day.
I’ll end this post by circling back to the beginning: what does this mean for the Thunder? The table above would seem to indicate that we’re far enough into the season that the skill is likely to outweigh the noise. However, for a team like the Thunder, who are trying to mesh stars together, it would be wise to apply some context to their situation. While it’s fair to wonder if the style of the team will change much, the end results may still change, as it seems unlikely that Westbrook (the key driving force of the team) will continue to post the worst true shooting percentage of his career (the only two other seasons in which his true shooting percentage was below 50 percent were his first two years in the league).