
Win projections are my specialty, not because I'm especially good at them[1. Though I have done extremely well the last two years thanks to Real Plus-Minus] but because I think they're especially fun[2. Puts on nerdy glasses]. If you're familiar with my style of win projections, you know that I am a big fan of Real Plus-Minus for its out-of-sample prediction ability. So without any further ado, here's how I projected win totals this year.
Step 1: Player Ratings: Blend RPM and BPM
First, a quick overview of why RPM and BPM can be used to predict wins:
RPM's strengths:
-Most accurate prediction of future possessions of any publicly available all-in-one statistic.
That seems like a good summary.
BPM's strengths:
-Still very good out-of-sample prediction[3. About the link: BPM used to be called ASPM]
-Passes the eye test a little more strongly[4. Somewhat because people use per-game info to make judgments, and BPM really only includes "visible" things like scoring, assists, etc.].
-Much more data on player progression and regression (36 years' worth).
-On par with RPM in terms of predicting offense.
The last two points are the salient ones here. While RPM is the heralded champion at predicting future possessions and wins, BPM has way more seasons under its belt to help us predict player development/regression. So my assumption is that BPM has some information to give us regarding offensive development that RPM might not.
The weights are relatively simple: Box Plus-Minus gets between 0 and 25% weight, based on how much of a player's RPM rating last year came from his offense. For a Russell Westbrook type, BPM's projection is weighted higher (22.5%, say) than for an Andrew Bogut (12.4%)[5. Notice that despite Bogut's low RPM rating on offense, he still gets a share much higher than zero. That's because his offense is particularly negative according to RPM, so I assume there's significant information there not to be ignored] or a Dwight Howard (9.4%).
Translation: I blend RPM and BPM ratings, RPM for its known predictive ability and BPM for its larger dataset of player regression.
Note: For rookies, I used Kevin Ferrigan's RAPM projections.
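For concreteness, here's a minimal sketch of the blend in Python. The function names, the magnitude-based offensive share, and the example numbers are my own illustrative assumptions, not the exact implementation:

```python
# A minimal sketch of the Step 1 blend. The magnitude-based "offensive share"
# and all names/numbers here are illustrative, not the exact model.

def blended_rating(rpm_off, rpm_def, bpm_projection):
    """Blend a player's RPM with his BPM projection.

    BPM's weight runs from 0 to 25%, scaled by how much of the player's
    RPM value comes from offense.
    """
    rpm_total = rpm_off + rpm_def
    # Use magnitudes so a strongly negative offensive rating (a Bogut type)
    # still counts as offensive "information," per footnote 5.
    off_share = abs(rpm_off) / (abs(rpm_off) + abs(rpm_def) + 1e-9)
    bpm_weight = 0.25 * off_share  # capped at 25%
    return (1 - bpm_weight) * rpm_total + bpm_weight * bpm_projection

# A score-first guard sits near the 25% cap; a defense-first big sits lower.
print(blended_rating(rpm_off=4.0, rpm_def=0.5, bpm_projection=5.5))
```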
Step 2: Adjust player ratings for usage
This is the big one. After a whole summer of not being able to come up with good "fit" adjustments, I was at last able to build some semblance of a "usage" adjuster. This is what my analysis found:
- The BPMs of players on teams with too many high-usage players typically regress
- They especially regress when their value is scoring-based (e.g. Russell Westbrook would regress more than Bogut in a scenario where they gain lots of high-usage teammates)
And here's how it impacted my projections[6. The reason it isn't perfectly linear is #2: individual players regress at different rates based on their prior-season scoring.].

It may seem extremely counterintuitive, but what I found from 36 years of player-season data is that offensive players regress significantly from their projections when newly paired with other offensive players. Think 2011 DWade/Bosh, or 2015 Kyrie & KLove. OK, so this makes sense: players will score less if they have to share the ball. I also found the reverse to be true: players newly paired with low-usage guys tend to fill the gap more (i.e. Russell Westbrook without KD last year used, I think, 14,000% of possessions).
Unfortunately, while my analysis found quite a lot of "signal" (that is, team usage variables that were almost certainly impacting player BPM), there was also a significant amount of noise. I therefore added in some extra regression to prevent my results from being too strange[7. I multiplied the initial adjustment by about 1/3.].
Translation: While high-usage guys provide extra value on their own, regression[8. Or diminishing returns] occurs when you team them up, *AND* vice versa. So I adjust accordingly.
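As a sketch of how such an adjuster might look, here's one possible shape in Python. The functional form and both inputs are my assumptions; only the roughly 1/3 damping factor comes from footnote 7:

```python
# A rough sketch of the Step 2 usage adjuster. The functional form and inputs
# are illustrative assumptions; only the ~1/3 damping factor is from the post.

DAMPING = 1 / 3  # extra regression to keep the noisy usage signal in check

def usage_adjusted_rating(rating, team_usage_surplus, scoring_share):
    """Regress a player's rating based on his new team's usage context.

    team_usage_surplus: how far the roster's combined usage sits above a
                        league-average roster (negative = usage-starved).
    scoring_share:      fraction of the player's value that comes from scoring;
                        scoring-heavy players regress harder (finding #2).
    """
    # Usage-rich rosters push the rating down; usage-poor rosters let the
    # player soak up extra possessions (the Westbrook-without-KD effect).
    raw_adjustment = -team_usage_surplus * scoring_share
    return rating + DAMPING * raw_adjustment
```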
Step 3: Project Minutes
I spent most of my time last season working on minutes projections. Minutes are extremely important, as they can completely skew your results. For example, if you have Clint Capela playing 990 minutes per game, the Rockets might not be a top-5 team[9. ;)]. Rather than spend all my time working on this, I "outsourced" by using the venerable Kevin Pelton's minutes projections and combining them with a fantasy sports site's projections. After averaging, I multiplied each player's minutes by a constant, forcing the team's total to equal the magic number, 19,680 minutes.
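In code, that rescaling step looks something like this (the input dictionaries stand in for the two outside projections):

```python
# Sketch of the Step 3 rescaling: average two outside minutes projections,
# then scale each roster so its total hits 19,680 (82 games x 240 team minutes).

TEAM_MINUTES = 82 * 5 * 48  # the "magic number": 19,680

def normalize_minutes(projection_a, projection_b):
    """projection_a, projection_b: dicts of player -> projected season minutes."""
    averaged = {p: (projection_a[p] + projection_b[p]) / 2 for p in projection_a}
    scale = TEAM_MINUTES / sum(averaged.values())
    return {player: minutes * scale for player, minutes in averaged.items()}
```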
Step 4: Project pace
WARNING: This method is lazy.
First, I created a "True Pace Estimate" for 2015. This assumes each team played opponents with average pace last season, and removes that "average team" from their value[10. True Pace Estimate = 2 x Pace - League Avg Pace]. Then, I regress each team to the mean based on how much roster turnover they had, to come up with a 2016 True Pace Estimate[11. 2016 True Pace Estimate = (Expected Roster Minutes Continuity% x 2015 True Pace Estimate) + (1 - Expected Roster Minutes Continuity%) x 2015 League Average Pace]. Finally, I project each game's pace by taking the average of both teams' 2016 True Pace Estimates.
This is important because higher pace means lower variance: more possessions give the better team more chances for its edge to show, so better teams win more often in these scenarios, and vice versa.
Translation: Good fast teams win more games than good slow teams, so I adjust accordingly.
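Written out as code, footnotes 10 and 11 amount to the following; league average pace and the continuity share are inputs you'd supply:

```python
# The Step 4 pace math from footnotes 10-11, written out directly.

def true_pace_2015(raw_pace, league_avg_pace):
    # Remove the "average opponent" baked into last season's raw pace.
    return 2 * raw_pace - league_avg_pace

def true_pace_2016(tpe_2015, continuity, league_avg_pace):
    # Regress toward the league mean in proportion to roster turnover.
    return continuity * tpe_2015 + (1 - continuity) * league_avg_pace

def projected_game_pace(tpe_team_a, tpe_team_b):
    # A game's pace is the average of the two teams' 2016 estimates.
    return (tpe_team_a + tpe_team_b) / 2
```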
Step 5: Adjust for Variance and Project Win probabilities for each game
The first four steps give us enough information to project a team's chance of winning.
Team A Win% ~
(Team A Rating - Team B Rating) * 0.5 * (Team A True Pace + Team B True Pace) / 100 + Home-Court-Advantage
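Reading the formula as code: the ratings are per-100-possession values, so multiplying by the game's expected pace over 100 converts the rating gap into a per-game point margin. The home-court value below is a common rule-of-thumb figure, not one stated in the post:

```python
# The expected-margin piece of the formula above. The 3-point home-court
# value is a common rule of thumb, not a number given in the post.

def expected_margin(rating_a, rating_b, true_pace_a, true_pace_b, hca=3.0):
    game_pace = 0.5 * (true_pace_a + true_pace_b)
    return (rating_a - rating_b) * game_pace / 100 + hca
```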
But there is one mystery left to solve, which I eyeballed this year: how much should I trust my ratings?

If we take my player ratings, minutes projections, and pace at face value, we can calculate win probability as usual, but these team ratings are different from mid-season numbers.
One of the biggest reasons I didn't win the APBR contest last year[12. Though in terms of Average Error, I actually did win :)] was that my data wasn't regressed enough towards the mean. Simply put, you can't use the Pythagorean formula (or in my case, the normal distribution function) as-is to project out-of-sample wins. These win-percentage formulas are based on in-sample Efficiency Differential, which is much more highly correlated with win percentage than this preseason data is.
So I took my win projection totals and added in a variance factor, increasing the formula's[13. In Excel, the formula is =NORMDIST(Expected Point Differential, 0, Expected Variance (Standard Deviation), 1)] variance by 14 percent[14. Exactly half of the 28% variance increase which I found most closely matched my win totals to Vegas' predictions].
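Translated out of Excel, the adjusted win-probability call looks roughly like this. I'm using scipy's normal CDF as the NORMDIST equivalent; the baseline standard deviation is a placeholder, and since footnote 13 mixes "variance" and "standard deviation," I inflate the variance (SD times the square root of 1.14); if the 14% applies to the SD itself, drop the square root:

```python
# Excel's NORMDIST from footnote 13, with the 14% variance inflation applied.
# base_sd is a placeholder; you'd fit it to historical game margins.
from scipy.stats import norm

def win_probability(exp_margin, base_sd=12.0, variance_inflation=1.14):
    # A 14% variance increase scales the standard deviation by sqrt(1.14).
    # (If the 14% applies to the SD itself, use base_sd * 1.14 instead.)
    sd = base_sd * variance_inflation ** 0.5
    return norm.cdf(exp_margin, loc=0, scale=sd)
```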
Note: This is the biggest change I made since we put out the great Nylon team previews (other than adjusting for Tyreke Evans' injury today). Sorry! No team changed by more than a couple of wins.
Translation: We are less able to predict a whole season's worth of wins before it starts than we are to predict a win three-fourths of the way through the season, so I adjust accordingly.
That's enough words for now. *Here's* part 2 of 2: The Results!