Team USA’s performance and the pursuit of lineup fit: Part 2


While the Olympics have been over for a few weeks, the dominant Team USA is approaching a moment of change. Coach Krzyzewski is stepping down, and even though he's being replaced by Gregg Popovich, it would be hard for anyone to replicate a near-pristine track record of 81-1, including exhibition games. (Counting only major tournament games, i.e. the Olympics and the World Championship, that record is 50-1.)

There were a few hiccups during group play at the Olympics, but the US ultimately obliterated its competition, to the point where the next team could get too confident and underestimate its opponents. This recent team, as I figured, wasn't one of the strongest since professionals were allowed, but it wasn't one of the weakest either. I recently ranked them among the other US teams since 1992, which led to an important set of questions: How do you rate teams in international tournaments with so few games? And how do you predict the performance of a super-team like the US ones, where stars come together and drastically change their roles?

Team strength

For Nylon Calculus readers, I shouldn't need to explain why win/loss records are not the best measures of team performance, but it's especially pertinent in short tournaments, where a couple of close losses can distort the numbers and schedule strengths vary widely. Thus, there's a fairly quick but rough way of measuring strength: adjusted point differential, computed with ridge regression in R. However, we need an adjustment: Olympic tournaments feature 12 participants, while the World Championship has grown from 16 to 24 teams.
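For the curious, here's a minimal sketch of what that calculation can look like, assuming a hypothetical `games` data frame with one row per game and columns `team`, `opponent`, and `margin` (the names are mine, not from my actual scripts):

```r
# Minimal sketch: adjusted point differential via ridge regression.
# Assumes a hypothetical data frame `games` with columns team, opponent,
# and margin (team's score minus opponent's score).
library(glmnet)

teams <- sort(unique(c(as.character(games$team), as.character(games$opponent))))
X <- matrix(0, nrow = nrow(games), ncol = length(teams),
            dimnames = list(NULL, teams))
for (i in seq_len(nrow(games))) {
  X[i, as.character(games$team[i])]     <-  1   # +1 for the team
  X[i, as.character(games$opponent[i])] <- -1   # -1 for its opponent
}

# alpha = 0 gives a pure ridge penalty; shrinking the coefficients toward
# zero regresses small-sample tournament results toward the mean
fit <- cv.glmnet(X, games$margin, alpha = 0, intercept = FALSE)
ratings <- setNames(as.numeric(as.matrix(coef(fit, s = "lambda.min")))[-1], teams)
sort(ratings, decreasing = TRUE)  # points-per-game rating for each country
```

The ridge penalty doubles as regression to the mean, which matters when a team only plays a handful of games.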


To re-calibrate the numbers, I looked at teams that competed in both a World Championship and an Olympics in adjacent years to get a simple one-number adjustment for every type of tournament. The Olympics are the standard, so the adjustment there is zero, but the World Championships got an adjustment of 2.6 points when there were 16 teams and 3.5 points for 24 teams. (The Tournament of the Americas, by the way, has an adjustment of 5.4 points for the years where pro players were used, for what it's worth.)
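As a sketch of that calibration step (with made-up column names), you could pair each country's raw ratings from neighboring tournaments and take the average gap as the offset:

```r
# Sketch of the re-calibration idea, with hypothetical column names:
# `ratings` holds one row per country per tournament, with columns
# country, tournament ("olympics" or "worlds"), year, and raw_rating.
oly <- subset(ratings, tournament == "olympics")
wc  <- subset(ratings, tournament == "worlds")
paired <- merge(oly, wc, by = "country", suffixes = c("_oly", "_wc"))

# keep only pairs from neighboring editions (the window is an assumption)
paired <- subset(paired, abs(year_oly - year_wc) <= 2)

# the average rating gap for the same countries puts World Championship
# numbers on the Olympic scale; repeat per format (16 vs. 24 teams)
adjustment <- mean(paired$raw_rating_oly - paired$raw_rating_wc)
```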

The numbers below are point differential ratings for every team since 1992 in the two major tournaments, the World Cup/Championship and the Olympics, as well as the Pan American qualifying tourneys in which NBA players participated. Besides the US teams, the strongest one is the 1996 Yugoslavia squad featuring Vlade Divac. They had a few blowout victories, including a rare “double-up” over China, 128 to 61. Spain in 2006, by the way, was a close second among non-US teams. Other elite international teams include Croatia in 1994 featuring Dino Radja and Toni Kukoc, Lithuania in 1996 with Arvydas Sabonis, Spain in 2014, and Argentina with their own “golden generation” in 2006. Strangely, Spain in 2012 did not rate well, as they had a number of narrow victories before an epic game with the US.

Of course, we can improve upon those results using a universal guideline in statistics – if you have useful information, use it. Going into tournaments, FIBA has a set of rankings for every relevant team based on how they performed in previous tournaments. The scale runs from 0 to 1000, and I experimented with it during the first few games of the last Olympics, figuring that the long-term strengths of various nations could be weighted significantly against a paltry handful of games. Not only do you have a better understanding of how strong, say, Australia is when they start outperforming expectations, you have a better understanding of how good their opponents are, creating a more accurate estimate for every game.

While that reasoning is sound in theory, for best practices one should do some testing. For that, I calculated the same type of ratings as in the table above, but with the FIBA rankings included as an additional variable, and I only used group play games. Additionally, because I found that African and Asian teams were being systematically overrated by the rankings, I added a dummy variable for countries from those continents. Then I used those ratings to predict the elimination stage games. The improvement wasn't dramatic – the average absolute error fell from 12.3 to 11.8 points per game – but it is evidence that prior information (the rankings) is useful. Those were also rough, unadjusted numbers with few tweaks, and the rankings are by no means ideal – there's a lot of room on the margins for growth in predictive power.
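Here's a rough sketch of that test, where `fiba_rank` and `africa_asia` are hypothetical lookup vectors keyed by country, and the stage labels are assumptions as well:

```r
# Sketch of the out-of-sample test: fit on group play with the FIBA
# ranking (0-1000 scale) and an Africa/Asia dummy as extra predictors,
# then score the elimination games. All names are assumptions.
library(glmnet)

teams <- sort(unique(c(as.character(games$team), as.character(games$opponent))))

make_X <- function(df) {
  X <- matrix(0, nrow(df), length(teams) + 2,
              dimnames = list(NULL, c(teams, "fiba_diff", "aa_diff")))
  for (i in seq_len(nrow(df))) {
    t <- as.character(df$team[i]); o <- as.character(df$opponent[i])
    X[i, t] <- 1
    X[i, o] <- -1
    X[i, "fiba_diff"] <- fiba_rank[t] - fiba_rank[o]      # prior strength gap
    X[i, "aa_diff"]   <- africa_asia[t] - africa_asia[o]  # corrects systematic overrating
  }
  X
}

group <- subset(games, stage == "group")
elim  <- subset(games, stage == "elimination")

fit  <- cv.glmnet(make_X(group), group$margin, alpha = 0)
pred <- predict(fit, newx = make_X(elim), s = "lambda.min")
mean(abs(pred - elim$margin))  # average absolute error on knockout games
```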

For some real numbers, the three tables below have adjusted ratings for 2012, 2014, and 2016, respectively. The FIBA rankings don't change the results by any great magnitude, but they help improve accuracy by providing information for every match-up and, consequently, the strength of every team. Sometimes a traditionally strong team, like Spain in 2012, will have a surprisingly low margin of victory, and the rankings are there to inform every game with the team's historical strength.

[Table: 2012 Olympics adjusted ratings]
[Table: 2014 World Cup adjusted ratings]
[Table: 2016 Olympics adjusted ratings]

From the parts to the whole

Lastly, there's another method to determine team strength, and it's one I've already covered in detail: cobbling together individual player ratings from the NBA and translating them to FIBA tournaments. After some more experimentation and another year of Olympic games to consider, along with the addition of the qualifying tournaments from 2003 and 2007, I have an improved model. This is also where the team ratings above are useful. I decided to rein in the overfitting, so this version is more conservative, but it should be more accurate in future years.

Predicted SRS = 3.3 + 1.0*(BPM) − 0.33*(BPM usage adjustment) + 8.6*(team avg. adjusted 3PA per MP)

I’ve explained the variables in more detail in the linked article above, but here’s a summary: you take a team’s collective BPM, weighted by minutes. Then you adjust for high usage rates, since superstars won’t have the same value as they do in the NBA when they’re not taking as many shots. Finally, there’s an additional three-point “spacing” adjustment (based on everyone’s NBA three-point stats) because the most successful teams have a lot of shooters, and vice versa. If one does simple linear regression, three-pointers are weighted absurdly high. The two most disappointing teams in Team USA history (2004 and 2006) were actually below average for NBA players in three-point rate.
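To make the formula concrete, here's a sketch of applying it to a roster; the column names are my own stand-ins, not from the actual model:

```r
# Applying the model above to a roster: a sketch with hypothetical
# columns mp (minutes), bpm, usage_adj (the BPM usage adjustment),
# and adj_3pa_per_mp (adjusted three-point attempts per minute).
predict_srs <- function(roster) {
  w <- roster$mp / sum(roster$mp)               # minute weights
  team_bpm  <- sum(w * roster$bpm)              # collective, minute-weighted BPM
  usage_adj <- sum(w * roster$usage_adj)        # stars give back value at lower usage
  spacing   <- sum(w * roster$adj_3pa_per_mp)   # shooters create spacing
  3.3 + 1.0 * team_bpm - 0.33 * usage_adj + 8.6 * spacing
}
```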

[Chart: Team USA three-point rate]

Furthermore, I tested an adjustment for assists, because they're highly valued in the BPM metric but assist rates fall across the board for stars who play on Team USA. Assist rate, however, was actually positively correlated with team strength – you can't have too much passing, apparently. I also looked at rebounding, but the trends weren't significant.

International competition flux

Finally, there's one more adjustment I can't ignore without a note: strength of schedule by year. I would not want to assume that the international competition was the same in 1992 as it was in 2006 or 2016. I did not take the time to do this – that would be its own large post, because you need to wrangle data from several leagues for accuracy – but I do want to critique one attempt, because I think it's sending the wrong message. This FiveThirtyEight article also used BPM to evaluate US rosters, but instead of investigating why that BPM talent didn't translate to on-court performance, the article attributed the changes to varying levels of competition. Essentially, it used a residual to measure strength of schedule without any adjustments.

I feel that this may be an error, as I find it suspicious that world competition would jump by 15-20 points for just one year. The tournaments are too short to make such grand judgments – it's why I started blending and regressing to the mean so heavily for more accuracy. The numbers also ignore improved coaching tactics and preparation, which is why I opted for more conservative estimates. Finally, I'm not sure if the numbers were adjusted for the change in the number of World Cup teams, which obviously has an effect. I'm pointing this out not to nitpick the good people at 538, but because a negative appraisal of the international basketball community feels inaccurate – we've seen a few recent draft classes with a lot of strong non-US talent, and those players have more influence in the NBA than they did in 1992.

Conclusion

Performing a complicated retrodiction for tournaments long since completed may seem like overkill, but there is a noble goal here: the Olympics and the World Cup, and related exhibitions, are basketball experiments that are hard to find in the wild, outside of a few super-team alignments. It’s a place to study how the game responds to extreme conditions and, consequently, it provides everyone a way to see how things function at a basic level. This is why science experiments are done in a lab – you can’t find everything naturally – and the Olympics are one lab I use.

Much of my own basketball philosophy was born from the chaos when Team USA lost. We all analyzed what happened and drew our own conclusions. I'm still learning from those failures, and measuring the successes now, to see what matters. From all that we've seen – and I extend this to how teams are constructed in the NBA – to build the best teams you can't ignore the "one ball" problem and how certain players hinge a lot of their value on usage. That's a key factor. However, there don't appear to be diminishing returns with respect to passing and outside shooting, as the most successful teams loaded up in both areas. As Team USA moves into a new phase of the program, we'll see how they build future super-teams, because we should understand how to construct them by now – more guys like Kyle Lowry and Jimmy Butler, and fewer guys like DeMar DeRozan.

The next tournament will be the 2019 World Cup in China, and since the media tends to ignore the tournament, the players do so as well. It’ll likely be a weaker team, and yet we could still struggle with overconfidence issues, forgetting the close calls in recent years. I don’t know exactly what will happen – the abrupt nature of the tournament leads to unpredictability – but I hope Team USA has learned from its own history.