A Shooting Foul Rate Model


In the middle of expanding the defensive side of my Player Tracking Plus Minus metric (PT-PM), I added more detail on fouling types, specifically shooting fouls and offensive fouls drawn, using data from NBA Miner. Given that the data set already had the SportVU estimates of closest-defender shot defense locations, I decided to throw in a model to predict a player’s shooting foul rate, even though there are some interpretive issues with this data.

It is important to note that this model is trained to predict a player’s shooting foul rate, not to tell us precisely where the shooting fouls occurred. That is data we hope to have at some point soon. But from the model we can draw some inferences about where shooting fouls are taking place, along with a player profile of the offenders and a list of outliers from last year. In some ways it is the mirror image of a study I did on players who draw shooting fouls, which used where the offensive player shoots and the percent of unassisted shots.

For this study, I ran a number of cross-validations along with boosting, bagging and bootstrapping models, just to make sure I had a consistent story. Across the board, the models indicated that defending more shots leads to more shooting fouls [1. Science!]. Defending shots at the rim has the strongest effect, mid-range shot defense the weakest, and three-pointers, interestingly, fall in the middle. Drawing offensive fouls is also associated with committing more shooting fouls.
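One way to check for a "consistent story" is to refit on bootstrap resamples and see whether a coefficient keeps its sign and rough size. The sketch below is only an illustration under stated assumptions: it uses a single predictor, the data is synthetic, and the names `rim_defended` and `foul_rate` along with the per-36 scale are hypothetical, not the actual variables or procedure used in the study.

```python
import random

def slope(xs, ys):
    """Least squares slope of ys on xs (single predictor, with intercept)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    return sxy / sxx

def bootstrap_slopes(xs, ys, n_boot=500, seed=0):
    """Refit the slope on n_boot resamples of players drawn with replacement."""
    rng = random.Random(seed)
    n = len(xs)
    slopes = []
    for _ in range(n_boot):
        idx = [rng.randrange(n) for _ in range(n)]
        bx = [xs[i] for i in idx]
        by = [ys[i] for i in idx]
        if len(set(bx)) > 1:  # skip degenerate resamples with no spread in x
            slopes.append(slope(bx, by))
    return slopes

# Synthetic, made-up data: rim shots defended and shooting fouls, both per 36 minutes.
rng = random.Random(42)
rim_defended = [rng.uniform(1.0, 10.0) for _ in range(200)]
foul_rate = [0.8 + 0.15 * x + rng.gauss(0.0, 0.5) for x in rim_defended]

slopes = sorted(bootstrap_slopes(rim_defended, foul_rate))
lo = slopes[int(0.025 * len(slopes))]
hi = slopes[int(0.975 * len(slopes))]
share_positive = sum(s > 0 for s in slopes) / len(slopes)
print(f"95% bootstrap interval for slope: [{lo:.3f}, {hi:.3f}]")
print(f"share of resamples with positive slope: {share_positive:.2f}")
```

A predictor whose interval stays on one side of zero across resamples is the kind of "stable" relationship described above; one that flips sign is not.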

Below are the coefficients from a simple least squares model that is representative of the most stable predictors, and explains about 50% of the variation in the training set.
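For readers who want to see the mechanics, here is a minimal sketch of an ordinary least squares fit with an R² calculation, written in plain Python via the normal equations. Everything in it is an assumption for illustration: the predictor names, the per-36 scale, and especially the `TRUE` coefficients, which are invented to generate synthetic data and are not the coefficients from this model.

```python
import random

def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        x[r] = (m[r][n] - sum(m[r][c] * x[c] for c in range(r + 1, n))) / m[r][r]
    return x

def fit_ols(rows, y):
    """Return [intercept, b1, b2, ...] minimizing squared error (normal equations)."""
    X = [[1.0] + list(r) for r in rows]
    k = len(X[0])
    xtx = [[sum(row[i] * row[j] for row in X) for j in range(k)] for i in range(k)]
    xty = [sum(row[i] * yi for row, yi in zip(X, y)) for i in range(k)]
    return solve(xtx, xty)

def predict(coefs, row):
    """Intercept plus the weighted sum of one player's predictors."""
    return coefs[0] + sum(c * v for c, v in zip(coefs[1:], row))

def r_squared(coefs, rows, y):
    """Share of the variance in y explained by the fitted model."""
    mean_y = sum(y) / len(y)
    ss_res = sum((yi - predict(coefs, r)) ** 2 for r, yi in zip(rows, y))
    ss_tot = sum((yi - mean_y) ** 2 for yi in y)
    return 1.0 - ss_res / ss_tot

# Made-up intercept and slopes used only to simulate data, NOT the article's coefficients.
TRUE = [1.0, 0.25, 0.04, 0.12, 0.5]
rng = random.Random(7)
rows = [
    (rng.uniform(0, 8), rng.uniform(0, 6), rng.uniform(0, 5), rng.uniform(0, 1))
    for _ in range(300)
]
y = [predict(TRUE, r) + rng.gauss(0.0, 0.3) for r in rows]

coefs = fit_ols(rows, y)
names = ["intercept", "rim_defended", "mid_defended", "three_defended", "off_fouls_drawn"]
for name, c in zip(names, coefs):
    print(f"{name:>16}: {c:+.3f}")
r2 = r_squared(coefs, rows, y)
print(f"R^2 on training data: {r2:.2f}")
```

The R² printed here reflects the noise level chosen for the synthetic data and should not be compared with the roughly 50% reported for the real training set.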

The contrast between mid-range shots defended and threes is kind of interesting, and there are a couple of things to note about the data. One is that three-point shots are mostly open, as Seth Partnow found here, so there can be few fouls in three-point territory overall but a relatively higher rate of fouls per shot actually defended. Another is that shots defended in the SportVU data are grouped, or ‘binned’, as defended if the player is within a few feet of the shooter, so the data may be picking up ‘defenders’ who are not defending all that intently.
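The difference between fouls per attempt and fouls per defended shot is easy to see with a little arithmetic. The counts below are invented purely for illustration and are not taken from SportVU or NBA Miner.

```python
# All counts are invented for illustration; they are not real tracking data.
zones = {
    "three": {"attempts": 100, "defended": 30, "fouls": 3},
    "mid-range": {"attempts": 100, "defended": 70, "fouls": 4},
}

rates = {}
for zone, c in zones.items():
    rates[zone] = {
        "per_attempt": c["fouls"] / c["attempts"],    # fouls per shot taken in the zone
        "per_defended": c["fouls"] / c["defended"],   # fouls per shot with a nearby defender
    }
    print(
        f"{zone:>9}: {rates[zone]['per_attempt']:.1%} per attempt, "
        f"{rates[zone]['per_defended']:.1%} per defended shot"
    )
```

With these hypothetical numbers, threes produce fewer fouls per attempt than mid-range shots, yet more fouls per defended shot, simply because far fewer threes are defended at all.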

Again, this model is about players’ foul rates, and the coefficients are only meaningful in relation to the overall model. That means they are influenced by the contrast between players who never defend outside the rim area and more mobile defenders who more often venture out to defend mid-range shots.

The offensive fouls drawn relationship deserves mention too; it is a measure that has scored well in my new testing of the defensive PT-PM model. In the shooting foul model, it seems to be something of a proxy for physical defense, and it also reflects the fact that many offensive foul/shooting foul calls could reasonably go either way. Steal rate and the plus-minus effect on two-point shooting measured by SportVU enter the model much more reliably if offensive fouls drawn is excluded.

Finally, the model also gave me a list of outliers: players who fouled more or less last year than the model predicted.

The more foul-prone outliers tended to be players with fewer minutes, both because of sample size/variance issues and because fouling has a negative correlation with minutes played. The group also includes a couple of rookies in Kelly Olynyk and Anthony Bennett. The under-foulers were led by none other than the much-maligned defender Kevin Love.
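An outlier list like this comes from residuals, meaning the observed foul rate minus the model's prediction. The sketch below shows one simple way to rank them. The players, minutes, and rates are placeholders, and the per-36 scale is an assumption rather than something taken from the model.

```python
from typing import NamedTuple

class PlayerSeason(NamedTuple):
    name: str
    minutes: int
    actual: float      # observed shooting fouls per 36 minutes (hypothetical scale)
    predicted: float   # model's predicted shooting fouls per 36 minutes

def rank_by_residual(players):
    """Sort players from most over-fouling to most under-fouling versus the model."""
    return sorted(players, key=lambda p: p.actual - p.predicted, reverse=True)

# Placeholder rows, not real player seasons.
players = [
    PlayerSeason("Player A", 600, 3.1, 2.0),
    PlayerSeason("Player B", 2800, 1.0, 1.9),
    PlayerSeason("Player C", 1500, 2.2, 2.1),
    PlayerSeason("Player D", 450, 2.9, 2.2),
]

ranked = rank_by_residual(players)
for p in ranked:
    print(f"{p.name:<9} {p.minutes:>5} min  residual {p.actual - p.predicted:+.2f}")
```

Keeping minutes next to each residual makes it easier to spot when a large positive residual belongs to a low-minute player, where sample size is the more likely explanation.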

Needless to say, actually charting shot foul locations, and then comparing them directly to shot volumes and defensive contests, will represent a further step forward in understanding the game. But this is, I think, an interesting first take on the association between players’ shooting foul rates and where and how they defend.