Nylon Calculus: Bravely staring down the radar chart backlash

May 17, 2017; Boston, MA, USA; Cleveland Cavaliers forward LeBron James (23) reacts to the official over a call during the first half in game one of the Eastern conference finals of the NBA Playoffs against the Boston Celtics at TD Garden. Mandatory Credit: Bob DeChiara-USA TODAY Sports

Controversy has rocked the NBA stats world, and it may be difficult to explain to outsiders: radar charts were viciously attacked, and by implication so were the people who use them. The inciting incident is shown in the tweet below. You can see how that’s relevant to the Nylon Calculus collective, as radar charts have been featured a few times before in our work.

Radar charts are an eye-grabbing way to display multiple variables, and they’re usually well received when they’re presented, but they’re also derided for how misleading they can be. Radar charts aren’t the only novel way to illustrate stats, but the strong arguments they incited are emblematic of the world of sports statistics, where everything was built on the bedrock of entertainment and it’s difficult enough to get any new type of statistic recognized.

What are radar charts?

Quite simply, a radar chart is a way to summarize several variables in one chart using a spider web design where each variable is measured by its distance from the center. The name (obviously) was derived from the similarity to old-fashioned radar displays; it’s also been called a spider chart or a star chart.

They’re popular in the sports world because players usually have multiple attributes — scoring, rebounding, passing, blocking shots, etc. — and it’s more digestible to put everything into one snazzy-looking graphic. You can see an example of one below from Todd Whitehead, where Harrison Barnes’ strengths in youth and 3PT-shooting stand out easily.

That plot was used in an article about the replacement for Barnes and who was the closest match. Radar charts are useful there because the variety of attributes creates a shape, and we can then mentally compare that shape to others. That visual aspect is primarily why they’re so popular across the sports world with people who would never even think about plotting variables. People don’t want to read numbers; they want to see a personification.

Unfortunately, they have a fatal flaw: the charts create an area, the “webbing,” between each pair of adjacent variables, but the variable ordering is mostly arbitrary. Thus, as the inciting tweet showed, you can shuffle the ordering and the interpretation can change. That is not what you want from a statistical graphic; you want clarity and consistency. There are perhaps a few circumstances where a radar chart is appropriate, like plotting hourly data over a 24-hour cycle, but those are limited. Radar charts have been applied to data that’s not cyclical and where adjacent variables (the ones that create the webbing) are not related.
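The ordering problem can be checked with a little arithmetic. The sketch below is not from the original tweet or any article; it assumes equally spaced spokes and uses made-up ratings for a hypothetical player. The helper name radar_area is my own. Each pair of neighboring spokes spans a triangle, so the total webbing area depends on which variables happen to sit next to each other.

```r
# Area of the radar polygon for values placed at equally spaced angles.
# Neighboring spokes r_i and r_j span a triangle of area
# 0.5 * r_i * r_j * sin(2 * pi / n).
radar_area <- function(values) {
  n <- length(values)
  neighbor <- c(values[-1], values[1])  # next spoke, wrapping around
  0.5 * sin(2 * pi / n) * sum(values * neighbor)
}

# Hypothetical ratings on a 0-10 scale (made-up numbers, not real stats)
ratings <- c(scoring = 9, rebounding = 2, passing = 8,
             blocks = 1, shooting = 9, defense = 2)

# Same six numbers, two orderings
alternating <- radar_area(ratings)  # strengths and weaknesses interleaved
grouped <- radar_area(ratings[c("scoring", "passing", "shooting",
                                "rebounding", "blocks", "defense")])

cat("alternating order:", round(alternating, 1), "\n")
cat("grouped order:    ", round(grouped, 1), "\n")
```

With these particular numbers, grouping the three strengths together more than doubles the webbing area compared with the interleaved ordering, even though no rating has changed.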

In the academic world, radar charts are frowned upon for the reasons stated above. People are supposed to compare the variables by radial length, not by the area of the webbing, but comparing areas is what most readers instinctively do. Additionally, area is difficult to judge and compare; there’s a lot of research about that fact, mostly concerned with how terrible pie charts are. In the influential textbook Graphics for Statistics and Data Analysis with R, the principles of effective statistical graphics are summarized in the acronym ACCENT: Apprehension. Clarity. Consistency. Efficiency. Necessity. Truthfulness. Radar charts fail the A and the second C because you can reorder the variables. They’re not that efficient either, and they’re arguably not necessary or truthful because people judge the results by the webbing area and not the length.

However, elsewhere in the book, radar charts (or star plots, as they’re called there) have their own section without a repudiation. There, the author, Kevin J. Keen, states that the star plot is a tool for data analysis and uses it to analyze health care spending among different countries. It’s curious that the ACCENT philosophy is not applied here; perhaps it was a lack of research.

The far-reaching world of alternatives

While more basic charts like a bar graph are more effective and accurate, you can understand that some people are more concerned with grabbing a reader’s attention or presenting statistics in a way that’ll bring in more readers. There’s an endless well of inventive graphics out there if you care more about aesthetics than academic rigor, from using emoji to using a player’s own body to display statistics. One of the oldest, and oddest, is Chernoff’s faces, where it’s reasoned that since people are wired to read faces, it should surely be best to display statistics with facial features. The inspiration is sane, but the results are a little bizarre.

To build one in R, all you need is the aplpack package and its faces function, along with a few columns of data. There’s one important point I’d like to emphasize though: each face has 15 features, and if you don’t provide 15 variables the plot will simply reuse your data columns until there are 15. Here is a full list of the features and the order in which they appear, which you can find in the documentation too.

"1-height of face, 2-width of face, 3-shape of face, 4-height of mouth, 5-width of mouth, 6-curve of smile, 7-height of eyes, 8-width of eyes, 9-height of hair, 10-width of hair, 11-styling of hair, 12-height of nose, 13-width of nose, 14-width of ears, 15-height of ears. For painting elements of a face the colors of are found by averaging of sets of variables: (7,8)- eyes:iris, (1,2,3)-lips, (14,15)-ears, (12,13)-nose, (9,10,11)-hair, (1,2)-face."

If you don’t want to use all 15 features, you can simply add constant vectors to your data set. For an example, see the attached figure below showing team stats for the 2017 season. (For stats where a lower number is better, like defensive rating, I subtracted the league average and added a negative sign. I did not do this with the second graph, however.)

That graphic is probably overwhelming, but it does communicate a few things well. Firstly, the Golden State Warriors stand out from the crowd. Facial color is determined by the first two features, which measure offensive rating and defensive rating; hence, only the Warriors are white. The Spurs are the closest match, but they differ in a few key ways, like their “superior” eye size, which is indicative of their rebounding prowess over Golden State. You can also see how a few poor teams are quite similar, like the Nets, Lakers, and the Magic.

Without a deep and weird knowledge of how the features are applied, it’s tough to pick up on meaning without referring to the legend. Compare Utah and Toronto, for example. They’re similar, except that Utah is wider in many aspects, and you’d have to read that legend to pick up on why: facial width is defensive rating, eye width is defensive rebound percentage, and hair width is opponent 2-point percentage. Toronto’s happier smile and “taller” mouth are a result of their higher shooting percentages. Maybe with some time and training one can use these faces to quickly compare teams, but some features are smaller and more difficult to read, like the nose and the ears.
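The data preparation described above (flipping lower-is-better stats and padding with constants) can be sketched roughly as follows. This is not the code behind the team figure. The team names and numbers are made up, the sample mean stands in for the league average, and the column names are my own. The call to faces is guarded so the sketch still runs if aplpack is not installed.

```r
# Made-up team stats for illustration only (not real 2017 numbers)
teams <- data.frame(
  off_rating = c(113.2, 108.5, 104.1, 101.9),
  def_rating = c(104.0, 106.3, 109.8, 110.5),  # lower is better
  reb_pct    = c(50.6, 51.4, 49.0, 48.3),
  row.names  = c("Team A", "Team B", "Team C", "Team D")
)

# Flip the lower-is-better stat: subtract the average, then negate,
# so that a larger value always means "better".
# Assumption: the sample mean stands in for the league average.
teams$def_rating <- -(teams$def_rating - mean(teams$def_rating))

# Pad with constant columns so the unused features stay fixed
# instead of reusing the real columns.
n_features <- 15
padding <- matrix(0, nrow = nrow(teams), ncol = n_features - ncol(teams))
colnames(padding) <- paste0("const", seq_len(ncol(padding)))
face_data <- cbind(as.matrix(teams), padding)

# Columns map to features in order: 1 = face height, 2 = face width, ...
if (requireNamespace("aplpack", quietly = TRUE)) {
  aplpack::faces(face_data, labels = rownames(teams), face.type = 1)
}
```

Column order matters here: whatever sits in the first two columns drives face height and width, and with painted faces it also drives the face color.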

That’s not all the fun that can be had, however. You can change the facial type with the “face.type” argument. In a heightened level of uselessness, type 2 is a Santa face. Yes, Santa Claus; this does not refer to some obscure bit of statistical graphing.

For a hands-on example of this, you can see the graph below and the R code here to reproduce it. This is showing the best games on Christmas since 1984 (when full game logs were available on Basketball-Reference). And yes, this is absurdly superfluous, and no, this is probably not a good idea. The code itself should at least be useful enough to show anyone how to do something creative with Chernoff’s faces, and feel free to change the type to 1 for a less festive and saner plot. By the way, that red star and the beard shape mean nothing. They’re random and they’ll change each time you plot.

Again, you may have to refer to the legend frequently, but maybe it’s starting to make sense now. The guys with the larger eyes are the ones with more rebounds because obviously their eyes are wide open enough to grab all the missed shots. The guys with the bigger ears are ones who played better defense because they can hear the opponents coming. Gunners are showy, so of course the guys with the most field goals will have the biggest hair. Great passers are ones who can sniff out the best opportunities. And the players with the highest percentages have the biggest mouths so they can drink in all the efficiency.
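For anyone who wants to compare the face types side by side without the linked script, here is a small sketch. It is not the code behind the Christmas graphic; the data is random filler standing in for 15 box score stats, and the row and column names are placeholders.

```r
# Made-up data: 5 hypothetical games by 15 stats (not real box scores)
set.seed(1225)  # keeps the made-up numbers reproducible
games <- matrix(runif(5 * 15), nrow = 5,
                dimnames = list(paste("Game", 1:5), paste0("stat", 1:15)))

if (requireNamespace("aplpack", quietly = TRUE)) {
  # 0 = line drawings, 1 = painted faces, 2 = Santa Claus faces
  for (type in 0:2) {
    aplpack::faces(games, face.type = type,
                   main = paste("face.type =", type))
  }
}
```

Each pass through the loop draws the same five faces in a different style, which makes it easy to judge how much the Santa decorations get in the way of reading the features.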


Chernoff faces have many issues as an effective graphing device. It’s difficult to read some of the smaller features, like nose and ear sizes, and judging the size or length of a component in general is challenging too. Radar charts are easier to comprehend, but they’re not perfect either. But if you’re going to use a chart that violates some of the basic tenets of statistical graphing just to goad the interest of the reader, you may as well have some fun with it.