Nylon Calculus: The Art of the Finals MVP prediction

Three years ago, I created a rough model that predicts the Finals MVP for a series. It could use an update and a new base metric to judge players (hello, DRE), but at this point there’s an opportunity to take a step back and look at how the numbers are functioning. Because the vote is subjective and controlled by humans, no purely quantitative model will be perfect, but the aim of this project is to explore the trends and dig into history through a new prism.

Modeling the vote

For the model itself, I didn’t want to focus solely on prediction, where I could have used a deep wealth of stats against past voting trends. Instead I wanted to see how production translated to a Finals MVP with something simple: Game Score, modified in a way that can (sorta) translate across eras (you can see the calculation details here at the bottom of the page). Additionally, you need to know whether the player was on the winning or losing team, which has a huge impact on the odds. And that’s it: just two predictors, although one is more involved.

You can see the formula below. It’s a basic logistic formula (a function that does well at predicting odds between 0 and 1), and it is based on data from 1969 to 2017. You take the likelihood, which states how good of a case an individual player has, and then divide that by the total likelihood of every player in the series to see how each player compares. The “likelihood” variable is roughly set to the same scale as Finals MVP odds, but the final odds also depend on how the other players in the series do.

Likelihood = 1 / (1 + exp(-(-13.6 + 0.376*GameScore + 5.56*SeriesWin)))

Odds = Likelihood / sum(Likelihood of every player in the series)

You can mess around with that formula and figure out that if a player has an earth-shaking performance with a Game Score around 30, like Kevin Durant did, his likelihood is about 96 percent and he is all but guaranteed to win the MVP; but if his team loses, his likelihood drops to 9 percent. It is now an unwritten rule that players on the losing side are ineligible. I’ve ranted before about the logical inconsistencies and just how much more boring that makes the award.
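For anyone who wants to mess around with the formula, here is a minimal Python sketch of it. The coefficients come straight from the formula above; the function names, the True/False encoding of the series result, and the three-player series at the end are assumptions made purely for illustration, not the code behind the actual model.

```python
import math

# Coefficients copied from the article's formula.
INTERCEPT = -13.6
GAME_SCORE_COEF = 0.376
SERIES_WIN_COEF = 5.56


def likelihood(game_score, won_series):
    """Logistic likelihood for one player; won_series is True or False."""
    win_flag = 1 if won_series else 0
    z = INTERCEPT + GAME_SCORE_COEF * game_score + SERIES_WIN_COEF * win_flag
    return 1.0 / (1.0 + math.exp(-z))


def series_odds(players):
    """Divide each likelihood by the series total so the odds sum to 1.

    `players` maps a name to a (game_score, won_series) tuple.
    """
    raw = {name: likelihood(gs, won) for name, (gs, won) in players.items()}
    total = sum(raw.values())
    return {name: value / total for name, value in raw.items()}


# A modified Game Score of 30 on the winning side versus the losing side.
print(round(likelihood(30, True), 3))   # about 0.962
print(round(likelihood(30, False), 3))  # about 0.089

# Hypothetical series: these names and Game Scores are invented.
example = {
    "Winner A": (25.0, True),
    "Winner B": (20.0, True),
    "Loser C": (30.0, False),
}
for name, odds in series_odds(example).items():
    print(f"{name}: {odds:.1%}")
```

The normalization step is why a player’s final odds move with everyone else’s production: a second strong candidate on the winning team eats into the total even if the first player’s own likelihood stays the same.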

Finals MVP: 2017

Kevin Durant’s performance in the NBA Finals should make this an easy call, given the historic level of production. But Stephen Curry’s influence on the court (opposing defenses have to be wary of his ridiculous range and ability to shoot quickly off the dribble) at least earns him consideration, and his value is difficult to pin down with basic stats. Then there’s LeBron James being LeBron; at this point his excellence in the Finals is taken for granted, which is a far cry from his perception just six years ago. (Some sportswriters are still asking him to undergo a Sisyphean task of lifting his team to a title against virtually impossible odds, but I think a triple-double while averaging over 30 points per game with high efficiency should excuse him; it’s not his fault his team failed to add a Hall-of-Famer in his prime in the off-season.)

However, Curry’s series was marvelous in its own right, and there are some arguments for him. In that way, it was a strange year for the Finals MVP because Curry had the resume of an above-average MVP, and still Kevin Durant had a significant lead. Yet neither player had the best modified Game Score of the series; that belonged to LeBron James, who had no chance at winning because of some unwritten rules. Overall, Kevin Durant is most likely the best choice, because he was such a large part of his team’s offense, played adept defense (he does have the reach of a center), and scored at scary levels of efficiency. I don’t mind his win there, though the model isn’t too great at dealing with multiple high-caliber MVP candidates; it’s hard to separate them.

Model details

I first developed this model in 2014, and while I do need to make updates and more intelligent adjustments, this is a good opportunity to assess how it has performed on seasons that are truly out of sample (only using pre-2014 data). The model has been fairly accurate every year since then, but it hasn’t been a particularly difficult stretch, besides the old argument over whether or not a great performance on the losing side matters. You can see below that LeBron is listed in every season, but he won only one title in that span. Given the harsh penalty for losing, he’s trying his best to break the model. He actually has gotten a little traction here and there for a Finals MVP despite losing, so perhaps those odds are quite accurate. Otherwise, the results are pretty spot-on (scorers and playmakers get all the credit) except for Andre Iguodala. That miss makes sense given the methodology, because defense is harder to quantify, and Iggy’s MVP was certainly a bit of a surprise. Still, he at least does show up as a candidate; it’s not a total whiff.

Out of sample results

Player	Team	Season	Odds%	MVP
Kevin Durant	GSW	2017	43.1	1
Stephen Curry	GSW	2017	39.4	0
LeBron James	CLE	2017	15.6	0
LeBron James	CLE	2016	64.7	1
Kyrie Irving	CLE	2016	33.8	0
Stephen Curry	GSW	2015	62.4	0
Andre Iguodala	GSW	2015	9.0	1
LeBron James	CLE	2015	21.0	0
Tony Parker	SAS	2014	12.1	0
Kawhi Leonard	SAS	2014	46.1	1
Tim Duncan	SAS	2014	18.0	0
Manu Ginobili	SAS	2014	8.0	0
LeBron James	MIA	2014	8.2	0
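One rough way to put a number on “fairly accurate” is to check how often the model’s top pick in each season matches the actual winner. The sketch below does that with the rows of the table above; the variable names and the choice of “highest odds wins” as the scoring rule are assumptions for illustration, and four seasons is far too small a sample to read much into.

```python
from collections import defaultdict

# (player, season, model odds in percent, actual MVP flag), copied from the table.
rows = [
    ("Kevin Durant", 2017, 43.1, 1),
    ("Stephen Curry", 2017, 39.4, 0),
    ("LeBron James", 2017, 15.6, 0),
    ("LeBron James", 2016, 64.7, 1),
    ("Kyrie Irving", 2016, 33.8, 0),
    ("Stephen Curry", 2015, 62.4, 0),
    ("Andre Iguodala", 2015, 9.0, 1),
    ("LeBron James", 2015, 21.0, 0),
    ("Tony Parker", 2014, 12.1, 0),
    ("Kawhi Leonard", 2014, 46.1, 1),
    ("Tim Duncan", 2014, 18.0, 0),
    ("Manu Ginobili", 2014, 8.0, 0),
    ("LeBron James", 2014, 8.2, 0),
]

# Group the candidates by season.
by_season = defaultdict(list)
for player, season, odds, mvp in rows:
    by_season[season].append((odds, player, mvp))

# The model's pick is the candidate with the highest odds in each season.
hits = 0
for season in sorted(by_season):
    odds, player, mvp = max(by_season[season])
    hits += mvp
    print(season, player, odds, "hit" if mvp else "miss")

print(f"Top pick matched the MVP in {hits} of {len(by_season)} seasons")
```

The only miss is 2015, where the top pick is Stephen Curry and the award went to Andre Iguodala.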

Historically, you can see the leaders below from the model results, which go back to 1969. LeBron James has the most Finals MVP shares, but that changes if you use the model with the win adjustment or if you go by a per-series metric; at eight appearances, few players have had more time to rack up the stats. You can see a few “competitive losers” too, like Dr. J, who had several appearances but only won in 1983, when he took a step back in the pecking order of his own team. You can also see the effect of the win adjustment, boosting guys like Scottie Pippen.

Table: career leaders (win adjustment gives boost to title winners)

Player	Finals	MVP Shares (win adj.)	MVP Shares (no win adj.)
LeBron James	8	3.03	4.29
Michael Jordan	6	5.07	3.92
Shaquille O’Neal	6	2.97	3.19
Magic Johnson	9	3.51	3.10
Kareem Abdul-Jabbar	10	2.45	2.38
Jerry West	4	0.25	1.52
Larry Bird	5	1.78	1.51
Kobe Bryant	7	2.37	1.51
Julius Erving	4	0.33	1.48
Dwyane Wade	5	1.40	1.43
Tim Duncan	6	2.08	1.29
Hakeem Olajuwon	3	1.59	1.22
Kevin Durant	2	0.99	1.18
Clyde Drexler	3	0.55	0.85
Stephen Curry	3	1.31	0.83
Scottie Pippen	6	1.40	0.64
Rick Barry	1	0.87	0.59
Isiah Thomas	3	0.91	0.59
James Worthy	6	0.70	0.55
John Havlicek	3	1.13	0.54
Charles Barkley	1	0.02	0.52
Kevin McHale	5	0.48	0.52
Joe Dumars	3	0.85	0.50
Walt Frazier	3	0.44	0.50

While those numbers are a fun little exercise, I believe there’s a lot of room for improvement, not just for pure prediction (using more advanced techniques, machine learning, and throwing in other stats) but for better measures of production, like DRE. I’m approaching this from two angles: studying how people gravitate toward supporting certain players over others (why Wes Unseld over his teammates in 1978, for example, or Chauncey Billups in 2004), and using a more accurate yardstick to grade performances.

I don’t expect every decision will be perfect, nor would I expect a perfect model. But it’s the pursuit that’s worth the effort: quantifying something like how severe the penalty is for losing the series (think of it like this: according to my numbers, if you and a player on the opposing team have the same Game Score and your team loses, you would need to raise your Game Score by 14.8 points to keep your odds even with his, and few players can even average that for a series), or digging up gems about the NBA and its storied past. (The award is named after Bill Russell, who never actually won it; it was first handed out in 1969, his last season, and went to Jerry West of the losing Lakers, though according to my research Russell probably would have won five. If you ever want to dive into NBA history, Bill’s Finals series are well worth the time.)
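The 14.8-point figure falls out of the formula’s two coefficients. Two players have even odds when their likelihoods match, which means the terms inside the exponential must match. As a short worked derivation, with G_win and G_lose as labels introduced here for the Game Scores of the player on the winning and losing side:

```latex
\begin{aligned}
-13.6 + 0.376\,G_{\text{lose}} &= -13.6 + 0.376\,G_{\text{win}} + 5.56 \\
G_{\text{lose}} - G_{\text{win}} &= \frac{5.56}{0.376} \approx 14.8
\end{aligned}
```

In other words, the win coefficient is worth roughly 14.8 points of modified Game Score, regardless of how good the two players are to begin with.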

Kevin Durant may have won and the NBA season has closed, but the curiosity lingers.
