Nylon Calculus: Measuring creation with the box score

HOUSTON, TX - APRIL 25: Russell Westbrook

Assists aren’t a great measure of offensive shot creation. They suffer from the problem of the “Rondo Assist,” when credit is given for passing to a good isolation scorer, or hitting teammates freed by a screen. These vanilla passes don’t tell us whether a player broke down the defense and created an open opportunity for a teammate.

During my stat-tracking days, I expanded the box score to include a metric called “Opportunities Created” that accounts for open scoring attempts generated by players. The basic idea was to give credit to players who create open shots for teammates, regardless of whether that player made the final pass, and regardless of whether the shot even went in (to account for variance in teammate quality). It wasn’t a perfect solution (for example, should two players receive fifty-fifty credit for a pick-and-roll?), but it captured defensive warping in a way assists alone cannot.

In recent years, optical tracking has created more fluid approaches to measuring these kinds of effects, but Opportunities Created still represent a fundamental concept missing from the original box score. With a few hundred games of data, I (finally) got around to regressing Opportunities Created on the box score to determine whether the box score can predict creation.

And good news: it can!

The fundamentals of pressuring a defense

To determine creation using the classic box score — let’s call this Box Creation — we only need a few terms.

First, assists. Sometimes there’s an assist without creation, and sometimes creation without an assist, but assists are still a decent starting point for gauging creation. However, they undercredit guys like LeBron and Kobe, who rack up hockey assists left and right because their scoring pressures the defense and leads to an open shot after multiple passes. They also overcredit guys like Rondo, who pound the ball a lot and pass to skilled scorers.

Thus, the next term in the regression is essentially “usage” — points (per 100) plus turnovers (per 100). Interestingly, points were more predictive of creation than simply using attempts, probably because the better someone is at scoring the more the defense will naturally react to him. Usage and assists give a pretty good approximation of creation, but there was the additional factor of shooting to consider.

Adding 3-pointers was more predictive than free throws, probably due to some geometric effect related to spacing — some opportunities are created without the ball — and stretching defenses. Think of a pick-and-roll player who isn’t a threat to shoot from 25 feet; opponents can go under the screen, conceding the shot, and well, that’s just not going to collapse defenses easily. On the other hand, the more court coverage required to defend a pick-and-roll, the more the defense is going to bend until it breaks.

But adding 3-point percentage isn’t straightforward. Is a 40-percent 3-point shooter who takes fifty shots a season having the same effect as Steph Curry? Of course not. So I used a sigmoid function to create a factor that can be thought of as “3-point proficiency.” In layman’s terms, it accounts for volume too; if you barely shoot 3s, you aren’t considered a good shooter, but once you start shooting them enough, 3-point “proficiency” is reflective of shooting percentage. Here’s how the metric improves with each iteration:

All told, the interaction of scoring volume, assists and shooting reflects creation quite well. (The interaction of all these effects has been similarly observed in plus-minus regressions like Daniel Myers’ Box Plus-Minus.)

And I find this to be one of the most fundamental and fascinating components of basketball: too much scoring (low assist numbers) means a player isn’t creating for others that much, and too much passing (low scoring numbers) means a player isn’t creating for others either! Instead, he must achieve an equilibrium, in which he threatens the defense enough to score himself, but also keeps them honest with passing. I discuss this and its impact on offense at great length in Thinking Basketball if you want to dive further into the how and why.

Eliminating the Rondo Effect

So how accurate is Box Creation when compared to the real thing? If we compare it to hand-tracked Opportunities Created, the Mean Absolute Error (MAE) for players with at least 500 possessions is 0.90. Among players with at least 1000 possessions logged in the data, MAE is 0.77. Since this is a per-100 (rate) statistic, that means it’s off by less than eight shots created every 1000 possessions, or about one every game-and-a-half, on average.
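As a rough illustration of how an error figure like this is computed, here is a minimal Python sketch. The player records are made-up placeholders, not values from the tracking sample, and the function and field names (`mean_absolute_error`, `tracked_oc`, `box_creation`) are hypothetical.

```python
def mean_absolute_error(records, min_possessions):
    """MAE between tracked and predicted creation (both per 100 possessions)
    for players with at least `min_possessions` tracked possessions."""
    errors = [
        abs(r["box_creation"] - r["tracked_oc"])
        for r in records
        if r["possessions"] >= min_possessions
    ]
    if not errors:
        raise ValueError("no players meet the possession threshold")
    return sum(errors) / len(errors)


# Hypothetical records for illustration only (not the author's data).
sample = [
    {"possessions": 1200, "tracked_oc": 4.0, "box_creation": 4.5},
    {"possessions": 800, "tracked_oc": 7.5, "box_creation": 7.0},
    {"possessions": 1500, "tracked_oc": 10.0, "box_creation": 11.0},
]

mae_500 = mean_absolute_error(sample, 500)    # all three players qualify
mae_1000 = mean_absolute_error(sample, 1000)  # only the two larger samples

# Converting the article's per-100 MAE of 0.77 to a per-1000 figure.
per_1000 = 0.77 * 1000 / 100
```

The last line reproduces the arithmetic in the text: an error of 0.77 per 100 possessions works out to roughly 7.7 per 1000, which is where "less than eight" comes from.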

Seventy-five percent of players (44 of 59) in the sample were within 1.0 Box Creations of their actual shots created and ninety-two percent were within 2.0. The maximum error in the 1000-possession group was 3.2. So when applying this, it will nail most players, plus-or-minus two creations, as long as the 3-point shot has been prevalent. More on that in a second.

So is Box Creation accurate enough to eliminate the aforementioned Rondo Effect? In my tracking sample, seven players had more than twice as many assists as Opportunities Created. (!) Here’s what the main culprits look like, with how much Box Creation was able to reduce the “error” in assists as a measure of creation:

Box Creation largely eliminates these overestimation effects. It still slightly overcredits some of these players, but compared to the previous discrepancies between assists and creation it’s an astronomical improvement. (And Rondo himself was pegged perfectly by Box Creation.) Here’s the full formula for Box Creation:

Creation = Ast*0.1843 + (Pts + TOV)*0.0969 - 2.3021*(3pt proficiency) + 0.0582*(Ast*(Pts + TOV)*(3pt proficiency)) - 1.1942

where

3pt proficiency = (2/(1 + EXP(-3PA)) - 1)*3P%

and all stats are per 100 possessions.
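For readers who want to apply the formula, here is a minimal Python sketch that transcribes it directly. The function and argument names are my own, and the code assumes 3P% is entered as a fraction (0.40 rather than 40), which the formula does not state explicitly.

```python
import math


def three_point_proficiency(three_pa, three_pct):
    """Volume-adjusted 3-point percentage.

    three_pa:  3-point attempts per 100 possessions
    three_pct: 3-point percentage as a fraction (assumption: 0.40, not 40)
    """
    # The sigmoid factor runs from 0 (no attempts) toward 1 (high volume).
    volume_factor = 2.0 / (1.0 + math.exp(-three_pa)) - 1.0
    return volume_factor * three_pct


def box_creation(ast, pts, tov, three_pa, three_pct):
    """Box Creation estimate; ast, pts, tov, three_pa are per 100 possessions."""
    prof = three_point_proficiency(three_pa, three_pct)
    usage = pts + tov
    return (
        0.1843 * ast
        + 0.0969 * usage
        - 2.3021 * prof
        + 0.0582 * ast * usage * prof
        - 1.1942
    )
```

Under that fraction assumption, a 40-percent shooter taking 0.5 attempts per 100 gets a proficiency of about 0.10, while the same percentage on 8 attempts per 100 gets essentially the full 0.40. That is the volume adjustment described earlier.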

That formula is a mouthful, but it reveals some fundamental components of creation.

The 1980s were a strange time…

Now, I said I would return to the prevalence of the 3-point shot. Before the 1990s, players simply didn’t shoot 3s that often, which means that the “3pt proficiency” term is representing something that wasn’t a factor during much of the 1980s. Because of this, I’m concerned the translation may break down in those years. I’m confident in Box Creation’s accuracy for the last two or three decades, but it might be overfitting to the spacing of the modern game when applied to the 1980s. (Box Creation can be used back to 1978.) Here’s a historical view of the average of the top-5 leaders per season:

Even the iteration of Box Creation without a 3-point term still shows this S-like pattern, although the difference between the 1980s and today is not so severe. I have tracked games from that era, and creation was indeed less frequent than it is today, as teams operated more in transition and ran more post-ups. Defenses also sagged back in the lane and individual “gravity” wasn’t as prevalent as it is in today’s pick-and-roll-dominant, pace-and-space game. However, players like Magic and Jordan were likely creating at rates that are more comparable to modern ball-dominators, and the issue caused by the 3-point term is reflected in their Box Creation exploding once they cross the proficiency threshold at the end of the 1980s.

Top historical seasons

So who is the best creator over the last forty seasons according to the metric? The current MVP, Russell Westbrook. He nearly broke the stat this year, but that’s not too surprising. A young Westbrook was near the leaders when I tracked his game years ago, and we know his time of possession was the highest in the four years NBA.com has put out that stat. Also, Westbrook had the third-highest scoring rate in NBA history this season behind Kobe and Michael. In addition to his scoring, Russ averaged nearly 15 assists per 100 possessions, and no one has ever scored at that volume and dished out so many dimes. Here’s a list of the top-10 Box Creation seasons in NBA history:

Among the top 100 scoring seasons in league history, James Harden is the only player besides Westbrook to average over 14 assists per 100. No one else besides them has ever been over 12 assists per 100 while scoring over 32.5 points per 100.

So, in all likelihood, we just watched the two highest creation seasons in NBA history.

Ben Taylor is the founder of Backpicks.com, where the full historical list of Box Creation can be found. His book, “Thinking Basketball,” is available on Amazon.