Nylon Calculus: Grouping players by offensive role, again

OAKLAND, CA - MAY 16: Stephen Curry #30 of the Golden State Warriors and Seth Curry #31 of the Portland Trail Blazers look on during Game Two of the 2019 Western Conference Finals of the NBA Playoffs at the ORACLE Arena on May 16, 2019 in Oakland, California. NOTE TO USER: User expressly acknowledges and agrees that, by downloading and or using this Photograph, user is consenting to the terms and conditions of the Getty Images License Agreement. Mandatory Copyright Notice: Copyright 2019 NBAE (Photo by Noah Graham/NBAE via Getty Images)

I have a preoccupation with sorting things. My mom is a retired librarian and I think she ingrained in me this deep-seated need to categorize. Encyclopedia Brown, Goosebumps, books about famous pirates: each of my favorite childhood genres had its own space on the library’s shelves and a corresponding entry filed neatly away in the giant card catalog. Even our bookcases at home held traces of the Dewey Decimal System, with a separate shelf just for autobiographies. Now, as an adult without my own library collection to shelve and reshelve, I am left to entertain myself by creating catalogs of NBA players.

In the past, I’ve used algorithmic clustering of Synergy’s play-type data to define players’ offensive roles. Last year I updated these hierarchies and tried to make the player-sorting process more transparent by sharing the results as interactive dashboards. Defining offensive roles in this way has some potentially practical applications. Role definitions could be a starting point for understanding how to optimize a team’s lineup fit. They could be a first step to finding useful player comps and rating scoring efficiency among those who share similar play-making responsibilities. They could help explain the dynamics of the free-agent marketplace and predict the cost of new contracts. But, as I said, this sorting thing is mostly just a compulsion for me.

My approach, once again, has been to sort players by the types of plays which they have used to try to score. The NBA’s stat site breaks down individual scoring attempts into 11 types of plays: scoring as the pick-and-roll ball-handler, in isolation, with a spot-up shot, working around an off-ball screen, from a handoff, as the pick-and-roll roll man, off a cut, on a putback, on a post-up, in transition, or anything in between (lumped into the catch-all “miscellaneous” category). These useful data are provided by Synergy using a proprietary, real-time, video-indexing, statistical engine that logs every play of every game. More information about how specific types of plays are coded can be found in this helpful guide.

Retracing my steps, I began my latest version of player sorting with a bit of machine learning. Using k-means clustering, I found a familiar branching of offensive roles, with players split first into ball-handlers (who tried to score as the ball-handler in pick-and-roll or in isolation), wings (who tried to score spotting up or working off-ball around screens and handoffs), and bigs (who tried to score as the roll man in pick-and-roll, on cuts and putbacks, or posting up). These three coarse player groupings could be further refined into specific roles based on how frequently players used each of the characteristic play types, e.g., distinguishing the assist-dependent roll-and-cut big from the more versatile big who created his own offense by trying to score on post-ups.

The benefit of using an unsupervised approach like k-means or hierarchical clustering is that — once you have all of the data wrangled — it’s actually pretty straightforward to implement. It’s basically just a few lines of code. Plus it sounds impressive, right?
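To make the "few lines of code" claim concrete, here is a minimal sketch of the k-means loop applied to play-type frequencies. Everything in it is an assumption for illustration: the player names, the five columns, and the frequency values are made up (they are not Synergy data), and the starting centroids are picked with a simple farthest-point rule so the result is repeatable. In practice a library routine such as scikit-learn's `KMeans` would replace the hand-written function; the loop is spelled out here only so each step is visible.

```python
# Made-up play-type frequency shares (NOT real Synergy numbers).
# Columns: P&R ball handler, isolation, spot-up, roll man, post-up.
players = {
    "guard_a": [0.40, 0.20, 0.20, 0.00, 0.00],
    "guard_b": [0.35, 0.25, 0.25, 0.00, 0.02],
    "wing_a":  [0.05, 0.05, 0.60, 0.02, 0.03],
    "wing_b":  [0.08, 0.04, 0.55, 0.03, 0.02],
    "big_a":   [0.00, 0.02, 0.10, 0.35, 0.30],
    "big_b":   [0.00, 0.03, 0.08, 0.40, 0.25],
}


def sq_dist(a, b):
    """Squared Euclidean distance between two frequency profiles."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans(rows, k, max_iter=100):
    """Plain k-means: assign each row to its nearest centroid, then re-average."""
    # Deterministic start (an assumption of this sketch): take the first row,
    # then repeatedly add the row farthest from the centroids chosen so far.
    centroids = [rows[0]]
    while len(centroids) < k:
        centroids.append(
            max(rows, key=lambda r: min(sq_dist(r, c) for c in centroids))
        )

    labels = None
    for _ in range(max_iter):
        new_labels = [
            min(range(k), key=lambda c: sq_dist(r, centroids[c])) for r in rows
        ]
        if new_labels == labels:  # assignments stopped changing: converged
            break
        labels = new_labels
        for c in range(k):
            members = [r for r, lab in zip(rows, labels) if lab == c]
            if members:  # keep the old centroid if a cluster ends up empty
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels


names = list(players)
labels = kmeans([players[n] for n in names], k=3)
clusters = dict(zip(names, labels))
print(clusters)
```

With toy data this cleanly separated, the three clusters line up with the ball-handler, wing, and big split. Real play-type data is messier, and the cluster numbers themselves carry no meaning beyond "same group or not".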

The downside is that the steps of the clustering process can be a bit opaque and hard to generalize. By design, you’re not left with bright-line distinctions between roles or even rules of thumb to explain how specific play-type frequencies translated into role definitions. So, this time around, I used the clustering algorithm as more of a loose guide with the goal of creating my own explicit role-sorting formulas. You can find all the gory details at the end of this post, but for now, I’ll just mention that I didn’t use the ‘transition’ or ‘miscellaneous’ play types in my role definitions (at least not explicitly) and I treated off-ball screen and handoff plays as a single combined category. In the end, I came up with 11 offensive roles: four types of ball handlers, three types of wings, and three types of bigs, with one jack-of-all-trades grouping stuck somewhere in the middle.
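The bookkeeping described above (drop two play types, merge two others) can be sketched as follows. The key names and the sample frequencies are hypothetical, and whether the remaining shares should be rescaled to sum to one is not stated in the post, so this sketch leaves them as they are.

```python
# Hypothetical key names for the 11 play types; the values are made up.
sample_player = {
    "pnr_ball_handler": 0.05,
    "isolation": 0.05,
    "spot_up": 0.30,
    "off_screen": 0.10,
    "handoff": 0.05,
    "pnr_roll_man": 0.05,
    "cut": 0.10,
    "putback": 0.05,
    "post_up": 0.05,
    "transition": 0.15,
    "misc": 0.05,
}

DROPPED = {"transition", "misc"}      # not used in the role definitions
MERGED = {"off_screen", "handoff"}    # treated as one combined category


def role_features(freq):
    """Reduce the 11 play-type frequencies to the 8 used for role sorting."""
    features = {
        play: share
        for play, share in freq.items()
        if play not in DROPPED and play not in MERGED
    }
    features["off_screen_or_handoff"] = sum(freq[play] for play in MERGED)
    return features


features = role_features(sample_player)
print(len(features), "categories:", sorted(features))
```

Eleven play types minus the two dropped ones, with two more folded into one, leaves eight categories.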

To visualize which types of scoring chances were used by players in each of the 11 offensive roles, I took inspiration from the Apple ‘Photos’ icon of a multi-colored, eight-petaled flower. Each petal represented one type of play, with its opacity showing how frequently that play was used. Specifically, each petal’s opacity was linked to the percentile rank of the corresponding play-type frequency, so that if a player used a particular type of play less often than the rest of his peers, that petal was completely transparent. If a player used a particular type of play more often than the rest of his peers, then the petal was completely opaque. The result is that the ball handlers have the brightest pinks and purples, the wings have the brightest greens and blues, and the big men have the brightest yellows and oranges. In the image above, there’s a column with five players to serve as examples for each role.
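The percentile-to-opacity mapping can be sketched in a few lines. The exact percentile formula is not given in the post, so this version assumes one simple definition: the share of a player's peers who used the play type less often. The player names and frequencies are made up.

```python
def petal_opacity(freqs):
    """Map each player's frequency for one play type to an opacity in [0, 1].

    Assumed definition: the fraction of the other players with a strictly
    lower frequency, so the lowest value maps to 0.0 (fully transparent)
    and a unique highest value maps to 1.0 (fully opaque).
    """
    n = len(freqs)
    if n < 2:
        return {name: 1.0 for name in freqs}
    return {
        name: sum(other < f for other in freqs.values()) / (n - 1)
        for name, f in freqs.items()
    }


# Hypothetical spot-up frequencies for five players.
spot_up_freq = {
    "player_a": 0.10,
    "player_b": 0.25,
    "player_c": 0.40,
    "player_d": 0.55,
    "player_e": 0.30,
}
opacity = petal_opacity(spot_up_freq)
print(opacity)
```

Under this definition tied players share the same opacity, and a tie for the top spot lands slightly below fully opaque. A different percentile convention would shift the middle values but keep the same ordering.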

Because the NBA has now made available four seasons’ worth of play-type data, we can evaluate how players have changed roles over that time. I looked at the 164 players who have attempted to score on 250+ possessions during each of the last four years and found that 72 of them (44 percent) kept the same role during each year of that stretch, 75 (46 percent) filled two different roles in that time, and 17 (10 percent) bounced around between three distinct roles since the 2015-16 season. The ten players highlighted below recently began to create more of their own offense. That is, over the past four seasons, they notched some of the biggest increases in the frequency with which they tried to score as the ball handler in pick-and-roll (pink), in isolation (purple), or posting up (orange).
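The tally above boils down to counting distinct role labels per player. Here is a minimal sketch, assuming a hypothetical table of one role label per season; the four players and their label sequences are invented for illustration. The last few lines recompute the percentages from the counts reported in the text.

```python
from collections import Counter

# Hypothetical role labels for four seasons per player (illustrative only).
roles_by_player = {
    "player_a": ["spot-up wing"] * 4,
    "player_b": ["spot-up wing", "tall spot-up wing", "spot-up wing", "spot-up wing"],
    "player_c": ["glue guy", "versatile big", "tall ball handler", "glue guy"],
    "player_d": ["roll-and-cut big"] * 4,
}

# Number of distinct roles each player filled across the four seasons.
distinct_roles = {name: len(set(seasons)) for name, seasons in roles_by_player.items()}

# How many players filled one, two, or three roles.
tally = Counter(distinct_roles.values())
print(dict(tally))

# Counts reported in the text, keyed by number of distinct roles.
reported = {1: 72, 2: 75, 3: 17}
total = sum(reported.values())
shares = {n: round(100 * count / total) for n, count in reported.items()}
print(total, shares)
```

The reported counts add up to the 164 qualifying players, and the rounded shares come out to 44, 46, and 10 percent.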

There are some common archetypes here: young studs who have filled creative voids on lottery teams (Devin Booker, Tim Hardaway Jr., Aaron Gordon), players who stepped out from the shadow of a departing, high-usage teammate (Blake Griffin, Karl-Anthony Towns), and guys who took on more responsibility after changing teams (Zach LaVine, D’Angelo Russell, Harrison Barnes). Barnes is one of those rare three-role players in our dataset. He was a glue guy with the high-powered Warriors in 2016 who became a primary option with the struggling Mavericks in 2017 and 2018 (earning labels as a ‘versatile big’ and a ‘tall ball handler’ in our system). And then last season, with rookie Luka Doncic assuming much of the creative burden in Dallas, Barnes returned to being a glue guy once again.

On the other side of the coin, below are ten players who recently began to create LESS of their own offense. Over the past four seasons, these players experienced some of the biggest declines in the frequency with which they tried to score as the pick-and-roll ball handler, in isolation, or posting up (fewer pinks, purples, and oranges). These are the league’s ‘drooping’ flowers: aging stars like Dirk Nowitzki, Vince Carter, and Dwyane Wade, who were able to find diminished, late-career roles for themselves. Last season, they were forced to work off the ball more often than they had in the past, relying on their younger teammates to set them up to score for a change.

Likewise, there is a group of very capable ball-handlers who have accepted reduced on-ball roles to accommodate high-scoring teammates: Stephen Curry has made room for Kevin Durant in Golden State, Kyle Lowry has yielded to (first DeMar DeRozan and now) Kawhi Leonard in Toronto, and Jrue Holiday let Anthony Davis shine as the star in New Orleans.

In addition to highlighting these shifting teammate dynamics, four years of play-type data can help illustrate league-wide trends in the style of play. This can be done using the team-level data, but we can also find evidence of basketball’s evolution in the individual-level play-type data. For example, here are five players who have become increasingly focused on spotting up to find scoring opportunities (more green) over the past four seasons. Houston wing P.J. Tucker, in particular, is a study in modern shot specialization, as he’s jettisoned all of his scoring chances created as a ball-handler, in isolation, or on cuts in favor of taking additional spot-up corner 3s.

So, certainly, if we look carefully, we do see that players can change roles, growing, shifting, and diminishing over time. However, the year-to-year differences in play-type usage are generally very subtle, mostly showing the movement of edge-case players between adjacent role categories (from ‘spot-up wing’ to ‘tall spot-up wing’, for example). Understanding that most NBA players “are who they are” has important implications for setting player-development expectations and for team building. It seems that, in most cases, it may be more realistic to expect that a player can become more efficient in his current role than to hope that he can take on a new one.

If you’re interested in recreating these player groupings yourself, you can find the role definitions below. Again, these rules were informed by the clustering algorithm, but there’s nothing magical here. Other cut-off values could yield different (equally valid) groups.