Guest Post: Jamal Crawford, an Unsurprising 6MOY Winner
By Guest Post
This guest post comes to us from Bo Schwartz Madsen. Bo writes about the NBA and is co-host of the Danish NBA podcast “Under Kurven” when he’s not studying climate science as part of his PhD. He has previously written about how and why certain players can “fool” ESPN’s Real Plus-Minus metric. Follow him on Twitter @BoSchwartz.
Jamal Crawford won his third Sixth Man of the Year award on Tuesday, much to the chagrin of parts of Basketball Twitter. Well, as much chagrin as people can muster up about the Sixth Man of the Year Award, which for me was 3-4 ranting tweets.
But we should have seen this coming. Crawford winning the award fell perfectly in line with how voters have voted in previous years. I have looked at some very simple models for predicting votes, and Crawford was the likeliest winner this year.
Jacob Rosen broke down the Sixth Man of the Year field on HP last week. He identified two factors in the voting for 6MOY: points per game and playoff status for the team.
Using data from basketball-reference.com, I looked at all Sixth Man of the Year votes from the 2002-03 season until now. Today, voters rank three players; before 2002-03, they had only one vote to give. I wanted to include all eligible players, not just the ones that received votes, so I have included every player who played more than 900 minutes and 50 games in a season while appearing more frequently off the bench than as a starter.
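The eligibility filter above can be sketched in a few lines of pandas. The table and its column names here are purely illustrative, not the actual basketball-reference export:

```python
import pandas as pd

# Hypothetical player-season table; column names are illustrative,
# not the actual basketball-reference field names.
seasons = pd.DataFrame({
    "player":  ["A", "B", "C", "D"],
    "minutes": [1500, 730, 2100, 950],
    "games":   [70, 62, 82, 48],
    "starts":  [5, 0, 57, 2],
})

# Eligibility: more than 900 minutes, more than 50 games,
# and more bench appearances than starts.
bench_games = seasons["games"] - seasons["starts"]
eligible = seasons[
    (seasons["minutes"] > 900)
    & (seasons["games"] > 50)
    & (bench_games > seasons["starts"])
]
print(eligible["player"].tolist())  # only player A passes all three filters
```

Note how players B and C mirror the two excluded cases discussed below: too few minutes, and more starts than bench appearances.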
This actually leaves out two player-seasons that received votes.
- Luke Babbitt received a third-place vote in 2013 coming off the bench for 11.8 minutes per game in 62 games for a total of 730 minutes. He was by far the vote recipient with the fewest minutes played.[1]
- Kyle Korver received a second-place vote in 2005 even though he started 57 of the 82 games he played.[1. Minutes and games were adjusted for the shortened 2011-12 season. ]
The goal was to build a simple model. Besides points per game, it turned out that number of games played was also highly significant. Voters reward the players that contribute to the team the entire season. This year, Andre Iguodala’s candidacy has also been questioned because injury caused him to miss 21 games.
Vote share can, of course, not be less than zero. I have used a Tobit model to predict each player’s share of the vote, with the model returning “0” whenever it predicts a negative share. In this case, however, I have kept the raw negative predictions to illustrate just how unusual Iguodala’s candidacy this season is predicted to be. The more negative the value, the more confident the model is that the player will receive no votes at all. So based on his points per game and number of games played, Iguodala received a lot more votes than can usually be expected from players with his stats.
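For readers unfamiliar with Tobit regression, here is a minimal sketch of the idea on made-up data: the observed vote share is a latent linear value censored at zero, and the likelihood treats censored and uncensored observations differently. This is an illustration of the technique, not the actual model fit in the piece:

```python
import numpy as np
from scipy.optimize import minimize
from scipy.stats import norm

rng = np.random.default_rng(0)

# Synthetic data: latent vote share rises with points per game,
# but observed share is censored at zero (most players get no votes).
n = 500
ppg = rng.uniform(5, 25, n)
latent = -1.5 + 0.1 * ppg + rng.normal(0, 0.5, n)
share = np.maximum(latent, 0.0)

X = np.column_stack([np.ones(n), ppg])

def neg_loglik(params):
    b, log_sigma = params[:2], params[2]
    sigma = np.exp(log_sigma)          # keeps sigma positive
    mu = X @ b
    ll = np.where(
        share > 0,
        norm.logpdf(share, mu, sigma),  # uncensored observations
        norm.logcdf(-mu / sigma),       # probability mass at zero
    )
    return -ll.sum()

res = minimize(neg_loglik, x0=np.zeros(3), method="BFGS")
intercept, slope = res.x[:2]
# The latent prediction X @ b can go negative; how far below zero it
# sits reflects the model's confidence in a zero-vote outcome.
```

The fitted coefficients recover the true latent relationship even though roughly half the observations are censored, which is exactly why a Tobit fit is preferable to ordinary least squares here.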
In the graph above, I have used actual model predictions.[2. Thus, no more negative vote estimates. The model shares are not as large as the actual vote shares, because the model does not know that there are other players in the same year and only a finite number of votes. In the end, this turns out not to be a problem.] Highlighted by the black ellipse are some of the players that the model thinks should get votes (mostly because they scored a lot), but who did not get much 6MOY attention in real life. The reason is the other factor that Jacob noted: playoffs. Voters want the 6MOY to be on a successful team. Listed below are the players from that upper-left sliver who got the fewest votes relative to expectation based on their bench scoring:
To predict the winner for each year, I include whether the team made the playoffs or not.[3. If the player was traded during the season, his playoff status is taken from his last team.] That gives me three predictors: points per game, games played, and playoffs (y/n). With these, I predict vote share and use it as a score for each year.[4. All stats from the year being predicted have been left out of fitting the model. So the model is trained 14 times, once for each year.] The predicted vote share ranks all players in every year. This analysis produces the following results:
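The leave-one-year-out procedure described in the footnote can be sketched as follows. The data here are randomly generated just to show the mechanics, and plain least squares stands in for the Tobit fit:

```python
import numpy as np

# Toy data: 14 seasons of 20 players each, with made-up
# (ppg, games, playoffs, vote share) values.
rng = np.random.default_rng(1)
years = np.repeat(np.arange(2003, 2017), 20)
ppg = rng.uniform(5, 22, years.size)
games = rng.integers(51, 83, years.size)
playoffs = rng.integers(0, 2, years.size)
share = np.maximum(
    0.01 * ppg + 0.002 * games + 0.05 * playoffs - 0.25
    + rng.normal(0, 0.02, years.size),
    0,
)

X = np.column_stack([np.ones(years.size), ppg, games, playoffs])

predicted_winners = {}
for y in np.unique(years):
    train, test = years != y, years == y
    # Fit on all other seasons (plain OLS as a stand-in for Tobit),
    # then score and rank only the held-out season.
    beta, *_ = np.linalg.lstsq(X[train], share[train], rcond=None)
    scores = X[test] @ beta
    predicted_winners[y] = int(np.argmax(scores))  # top-ranked player that year
```

Because each season is scored by a model that never saw that season’s votes, the hit rate reported below is an honest out-of-sample result rather than a fit to the data.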
Green means that the model hit the right Sixth Man of the Year. Bold means that the runner-up is also correct.
The model does really well. In the three years where it misses the Sixth Man of the Year, it predicts the actual runner-up as the winner and the actual winner as the runner-up. I also tried limiting the training set: training the model on the first 6 seasons and testing on the last 8 did not change which winner the model chose in any year.
There are all kinds of variables I could think of adding to the model: PER, VORP, and so on. I could also use more indirect variables, like the stats of the player he usually subs in for, or whether a player has received votes before. The model structure is also very simple and could probably be improved. But I stopped at this point, because the simple model works, and I think that shows how simple the voting decision for Sixth Man of the Year is for many voters. The 6MOY has become a type, and it is hard to break the mold. Iggy tried to buck the trend this year and came very close. The graph below shows the model score for this year. The legend on the right is sorted from highest score to lowest:
Iguodala had by far the highest actual vote share of any player the model predicted to get zero votes. So while the Sixth Man of the Year Award still rewards scoring for playoff teams above all else, maybe the voters are becoming aware of other criteria.