Nylon Calculus: Closure on 3-point team defense

OAKLAND, CA - JANUARY 27: Terry Rozier

After several attempts at solving the 3-point defense myth thoroughly, I’m finally getting around to the last chapter. It’s been a tough project, one that’s had me second-guessing myself continuously and one that I’ve been working on, off-and-on, for a long time. How do we tackle the nebulous issue of 3-point defense, where it appears teams have little control over how their opponents shoot? The issue is partly philosophical and partly an exercise in prediction. And with the league attempting more and more of these shots, it’s becoming even more important that we understand the ramifications and how to deal with them, or we may not be able to gauge who the best defenses are anymore.

Background

A few years ago, Nylon Calculus began producing articles on the myths surrounding 3-point defense. The inciting incident was probably the release of SportVU shot data, which allowed the public stats community to analyze shots with a lot more detail, like defender distance. I noted in this shot analysis article that most 3-pointers had virtually no defensive resistance — they were “open.” You can see this piece here from Andrew Johnson on equating 3-point shots to luck since most of those shots are open. Then there was this piece from Seth Partnow about those shots and variance. This led to a lot of discussion on the site, like this post where Johannes Becker gives his own thoughts on the subject. All these pieces found that opponent 3-point percentage was noise, and that it was not stable from one sample — e.g. one team’s season to the following one or in 10 game subsets — to the next.

Another catalyst in the discussions was Houston, whose defensive performance in the 2015 season turned heads. Seth noted they were a regression candidate on Nylon. Also, he found that the notion the Rockets, or other teams, were “choosing” who to leave open, and thus had lower open percentages, was bogus. When they opened the 2016 season with a porous defense, their opponent 3-point percentage was again discussed, and this time by Johannes Becker.

You can also see this effect in the NCAA, as Ken Pomeroy found here. The phenomenon also exists on the individual level in the form of plus-minus stats. For instance, last year many people noticed Kawhi Leonard’s on/off defensive stats were poor, which is confusing for someone who is undeniably an elite defensive player. But, as I found along with Kevin Pelton, Bo Schwartz Madsen, and others, his statistical defensive decline could almost entirely be explained by a fluky opponent 3-point percentage. It also partially explained Isaiah Thomas’ abnormally awful on/off numbers last year, as was discussed in multiple places. This season, Kevin Pelton brought this up when discussing how the Pelicans played much better with Anthony Davis compared to DeMarcus Cousins.

The effect is now well-known enough that a multitude of people across the analytics community not only understand this phenomenon (the Kawhi Leonard opponent 3-point fluke got enough traction that Zach Lowe mentioned it), but they have started making luck adjustments. You can see examples of that from Nylon’s own Nate Walker and this new metric that uses luck adjustments (mostly for 3-point percentages) from Jacob Goldstein. I actually showed an example of a luck adjustment for RPM a couple years ago here. This is what I started a year ago, and here was my most recent attempt at implementing a 3-point defense adjustment.

This concept, understanding 3-point defense as a noisy component that’s more akin to luck than team skill, has been a pet project of Nylon Calculus for years, and it’s time to write that final chapter. My exploration stalled when some of the results were contradictory, and I began to question the fundamental idea behind the concept. For instance, the Golden State Warriors and the Boston Celtics have shown a repeatable ability to allow lower percentages from outside the arc over the years. It’s time to definitively solve the issue of 3-point defense and its meaning.

The method to the madness

Let’s get into the meat of the issue. The core problem is that most 3-pointers are open shots, and open shots inherently can’t be influenced by the defense. They’re open. It’s the same reason why we shouldn’t use individual defensive field goal percentage: the open shot data tells you nothing. By contrast, rim protection stats are actually useful because most of those shots are not open and, thus, can be influenced by the defender. It’s a basic law of shot defense: the closer you are, the more you can influence the shot.

However, the future has no guarantee of being like the past: the NBA is changing. The league-wide 3-point rate is not only increasing, but the increase has accelerated in recent years, and players are arguably shooting different types of 3-pointers more frequently. Pull-up shots from behind the arc have gone up from 22 percent of outside shots in 2014 to 26 percent this season. That’s a modest change and it’s only a small window of time. However, looking at shot tracking data again, we’re actually seeing a lower proportion of shots not labeled as “open”: from 13.9 percent to 11.7 percent. Also, when you look at unassisted 3-pointers as a proxy for defended shots (we have that data going back to 1997), the rate has actually remained fairly constant over the last two decades except for the blip during the shortened line season. One could argue we’re seeing more defended outside shots, but I have no solid evidence to support that claim, even if I think it’s reasonable.

Not only do the types of shots matter, but we have larger sample sizes in the season too. In other words, teams may see enough shots against them for statistical significance — it takes a while for 3-point percentages to stabilize. In fact, given how I set up the models the last time I attempted this for an article, I was ignoring that trend. I set up a rudimentary testing period from 2011 to 2016, which meant I was training on seasons with fewer 3-pointers and I also wasn’t properly adjusting for volume. This was definitely an error of thinking on my part, and I questioned 3-point defensive noise.

Thus, I’m trying a new methodology here. Basically, I split the season into two halves, and I use the stats from one half of the season to predict the other. Stats with stability, like point differential or free throw rate, should be able to predict the other half of the season well, and if it’s all noise there’ll be no correlation. Plus, I split up the season by alternating games rather than chronologically: games with an even index (counting the first game of the season as 1) were in one block, and odd in another. This means you get the full range of the season in the testing data instead of trying to predict, say, the latter half of the season. The model won’t be clouded by the effects of midseason trades, injuries, or how offenses get better as the season progresses. And with that detailed, let’s see what happened.
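For anyone who wants to try the split at home, here is a minimal Python sketch of the odd/even idea. It assumes each team’s season is a list of (opponent 3-pointers made, opponent 3-pointers attempted) pairs in schedule order. The function names and the data layout are assumptions made for illustration, not the code used for this article.

```python
import math

def split_half_pct(games):
    """Return opponent 3PT% in odd-numbered and even-numbered games.

    games: list of (opp_3pm, opp_3pa) tuples in schedule order.
    Games are counted from 1, so games[0] is game 1 (odd).
    """
    def pct(block):
        makes = sum(m for m, _ in block)
        attempts = sum(a for _, a in block)
        return makes / attempts

    odd_block = games[0::2]   # games 1, 3, 5, ...
    even_block = games[1::2]  # games 2, 4, 6, ...
    return pct(odd_block), pct(even_block)

def pearson(xs, ys):
    """Plain Pearson correlation between two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (spread_x * spread_y)
```

Run split_half_pct on every team-season, collect the odd-half and even-half percentages into two lists, and pass them to pearson. A stable stat should give a correlation well above zero, and pure noise should land near zero.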

Results

Dealing with the effects of the league’s ballooning 3-point rate may seem tricky at first (after all, if volume affects the stability of the stat, how do I compare 1998 to 2018?), but there was an elegant workaround. It’s just a weighted average, like how one would compute a GPA. In theory, your opponent 3-point percentage is more meaningful the more opponent attempts there are. You get a formula like this: (Opp3P%*Opp3PA + Exp3P%*A)/(Opp3PA + A). The expected 3-point percentage is just the average of what your opponents shoot. This actually helps with an outlier like the Warriors, who have the benefit of never having to face their own offense. The question is, what’s A, the weight for the expected 3-point percentage? It will show how many attempts it takes for the opponent percentages to stabilize, essentially. The larger the “A” value is, the more noise there is.
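The weighted average is short enough to write as one function. This is a sketch in Python. The function name, the argument names, and the inputs in the usage note are made up for illustration, and the value of 6000 for A is just a placeholder here.

```python
def regressed_pct(opp_pct, opp_3pa, expected_pct, a):
    """Weighted average of the observed and expected opponent 3PT%.

    opp_pct:      observed opponent 3PT% (as a fraction, e.g. 0.330)
    opp_3pa:      opponent 3-point attempts behind that percentage
    expected_pct: what those opponents shoot on average
    a:            weight on the expected percentage, in "attempts"
    """
    return (opp_pct * opp_3pa + expected_pct * a) / (opp_3pa + a)
```

With made-up inputs of 33.0 percent allowed on 2000 attempts, an expected rate of 36.0 percent, and A set to 6000, regressed_pct(0.330, 2000, 0.360, 6000) works out to (660 + 2160)/8000 = 0.3525. The estimate moves only a quarter of the way from the expected rate toward the observed one, because 2000 is a quarter of the 8000 total weight.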

For calculating the optimal weight, there are a number of methods available, from solvers in programs to nonlinear regression in R to just testing multiple models where each one has a different “A” value and seeing which one has the best fit. After messing around with a few different techniques, I converged onto the same range for the outcome: around 4000 to 7000 attempts, depending on the seasons chosen. Let’s give that number more context. The Rockets broke the record last year with 2030 attempts, and the list of players with at least 7000 attempts is just Ray Allen right now. Is 3-point percentage that noisy for opponents? What about the Warriors and Celtics, who year-after-year have performed well? (And Boston has actually been top five since 2008, so that includes different personnel and coaching — what’s in the water?)
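The simplest of those methods, testing a range of “A” values and keeping the one with the best fit, can be sketched in a few lines of Python. The row layout (first-half percentage, first-half attempts, expected percentage, second-half percentage), the function names, and the choice of squared error as the fit measure are assumptions for illustration, not the exact procedure behind the numbers above.

```python
def shrink(opp_pct, opp_3pa, expected_pct, a):
    """Weighted average of observed and expected opponent 3PT%."""
    return (opp_pct * opp_3pa + expected_pct * a) / (opp_3pa + a)

def best_weight(rows, candidates):
    """Pick the candidate A with the smallest sum of squared errors.

    rows: iterable of (half1_pct, half1_3pa, expected_pct, half2_pct),
          one row per team-season.
    candidates: iterable of A values to try.
    """
    rows = list(rows)

    def sum_squared_error(a):
        return sum(
            (shrink(pct, attempts, expected, a) - actual) ** 2
            for pct, attempts, expected, actual in rows
        )

    return min(candidates, key=sum_squared_error)
```

Something like best_weight(rows, range(0, 20001, 500)) scans A from 0 to 20000 in steps of 500. A solver or a nonlinear regression would do the same job with a finer answer, but the grid makes it easy to see how flat the error curve is around the minimum.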

Table: Opponent 3-point percentage ranks

Season  GSW  BOS
2013    7    4
2014    3    5
2015    5    4
2016    2    4
2017    1    2

Perhaps most teams have no control over their opponents, but some do. What are those teams doing, and can we quantify it? The most common factor people cite is the opponent rate. Namely, if you give up fewer outside shots, you’re probably better at defending them or are just giving up tougher ones. That’s easy to include. I’ll also throw in other components of defense, like opponent 2-point percentage and free throw rate. This is what I did last time too.

Which component predicts opponent 3-point percentage? The only one that performed well was 2-point percentage, as the others had weak effects and borderline statistical significance. I think this makes sense. If you defend inside the arc well, it’s easier to cover outside shots and you’re probably a good shot defense too. For simplicity, and because 2-point defense is correlated with other stats, I’m just including that one.

Let’s go to the final model, which you can see below. I know it looks like there are a lot of terms there, but it’s pretty simple. You have your opponent 3-point attempts, the percentage, the expected rate, and the 2-point defense stat, which is league adjusted. The expected rate is just the average of what your opponents have shot for the season (or just use the league average subtracting out the team’s own shots — there’s just a small difference.) The effect here is that opponent percentages are smoothed considerably, but there is a noticeable boost for good shot defenses, like the Warriors, and another boost once your defense has enough attempts.

Predicted 3PT% = (3PA*Opp. 3PT% + 6000*Expected 3PT%)/(3PA + 6000) + 0.26*(Opp. 2PT% – LgAvg)
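Written as code, the formula above is only two lines. This Python sketch takes percentages as fractions and assumes “LgAvg” means the league-average 2-point percentage. The function name, the argument names, and the inputs in the usage note are invented for illustration.

```python
def predicted_opp_3pt_pct(opp_3pa, opp_3pt_pct, expected_3pt_pct,
                          opp_2pt_pct, lg_avg_2pt_pct,
                          weight=6000, two_pt_coef=0.26):
    """Predicted opponent 3PT% from the formula in the article.

    All percentages are fractions (0.360 rather than 36.0).
    lg_avg_2pt_pct is assumed to be the league-average 2PT%.
    """
    # Regress the observed percentage toward the expected one.
    regressed = (opp_3pa * opp_3pt_pct + weight * expected_3pt_pct) / (opp_3pa + weight)
    # Shift for 2-point defense: a below-average opponent 2PT% lowers the prediction.
    return regressed + two_pt_coef * (opp_2pt_pct - lg_avg_2pt_pct)
```

For a made-up team allowing 33.0 percent on 2000 attempts, with an expected rate of 36.0 percent and an opponent 2-point percentage of 48.0 against a league average of 50.0, the regressed part is 0.3525 and the 2-point term is 0.26 * (-0.02) = -0.0052, for a prediction of 0.3473, or 34.7 percent.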

The fit itself did appear to be sound when I analyzed the results. Last time I did this with a flawed method, the teams with the greatest negative residuals were all the best modern defenses in the league, and vice versa: terrible defenses had the biggest boosts. It was just regressing too heavily in many cases. Instead, the tables below show a nice mix of teams with the largest adjustments. Obviously, since defensive rating includes 3-point defense, the teams at the extremes should diverge in defensive rating rank, but even with that effect it appears the adjustment is affecting almost every type of team. That’s a good sign.

Table: Biggest increases 2005-17

Team      Opponent 3PT%  Predicted 3PT%  Difference  Drtg rank
2011 CLE  41.1           36.9            4.2         29
2010 PHI  39.3           36.1            3.2         24
2009 SAC  40.6           37.7            2.9         30
2012 DEN  38.3           35.5            2.9         21
2008 IND  38.6           36.2            2.4         15
2015 NYK  38.0           35.6            2.4         28
2009 NJN  39.1           36.8            2.2         24
2016 TOR  37.3           35.1            2.2         11
2009 MIA  38.9           36.8            2.1         11
2013 PHO  38.8           36.7            2.1         23

Table: Biggest decreases 2005-17

Team      Opponent 3PT%  Predicted 3PT%  Difference  Drtg rank
2012 NOH  31.7           34.3            -2.6        15
2012 BOS  30.8           33.4            -2.6        1
2008 BOS  31.6           34.1            -2.5        1
2006 DET  32.5           35.0            -2.5        5
2015 HOU  32.2           34.3            -2.2        8
2013 POR  34.0           36.1            -2.0        26
2009 NYK  35.1           37.1            -2.0        23
2012 HOU  32.5           34.4            -2.0        17
2007 CLE  32.9           34.8            -1.9        4
2010 LAL  32.8           34.7            -1.9        4

Conclusion

I could go on and make further refinements by utilizing other numbers out there, like unassisted rates or some of the newfangled shot tracking data. I could have also made more complicated adjustments and used techniques like auto-correlation regression or some mixed effects. But I wanted a complete and thorough answer for base-level statistics that you could apply going back to the inception of the 3-point line. This doesn’t have to be complex, and if you want true precision bring in the heavy modeling guns and use all that new data. This is at least a starting point.

This is indeed a much needed adjustment as well. At the extremes, teams can see a change of one to two points per game. That can change a handful of wins and it’s enough to, say, lose homecourt advantage. We may still see teams who out-perform the model results and our expectations. The Boston Celtics have been doing this not just in the Brad Stevens era, but back in the early Kevin Garnett days as well. Even so, while the Celtics are leading the league in opponent 3-point percentage right now at 33.3 percent, the model has them at 34.9 percent compared to a league average of 36.2 percent — that’s not a bad “miss” and it does acknowledge their skill.

This is the data-ball era, and it’s 2018. Our team metrics should be smarter, even in the public sphere. We need to find and implement these adjustments, and we can’t ignore the inherent noise of 3-point defense. After all, the Golden State Warriors, once the stalwarts of outside shot defense and the first counter to any notion that it’s noise, are allowing near league average percentages right now. They need the adjustment too because luck can fall both ways.