Industry Q&A: Ryan Warkins of STATS, Inc.


Flickr | kwarz

One of the bigger pieces of NBA analytics industry news over the offseason was Brian Kopp leaving STATS, Inc. to head up Catapult Sports’ North American operations. Kopp had spearheaded the effort to get the SportVU system into all 29 NBA arenas. More important, from a public perspective, was his role in ensuring that a subset of SportVU-derived data was available on NBA.com.

Entering the second season of the SportVU era, I spoke with Ryan Warkins, Assistant Vice President of STATS and Kopp’s primary replacement, about the transition, changes and improvements in the underlying technology, adoption within the league and a hint of what’s going to be new for this year.

Q:  Are you the new Mr. SportVU?

Ryan Warkins: I prefer not to go by that term! It’s been a transition from me being the day-to-day support for [NBA] teams to me being out there to grow the business, both by supporting the teams and by putting together our own [internal] team that can fulfill that support. We’ve added a new Basketball Manager. He [Charlie Rohlf] was actually a four-year manager at Duke under Coach K. He’s a world-class programmer as well. So his ability to know the game and then decipher and make sense of the data has been awesome in terms of our growth. Adding him to the team has definitely made my job and the transition [after Kopp left for Catapult] easier.

 

Having someone who “speaks basketball” to interface with teams?

The things he can do to create new data points, instead of me trying to explain how something should look, he can just go ahead and do it.

Can you tell me more about your current role as well as what is entailed by the “basketball manager” role? How much of the league is actively taking advantage of that support?

My day-to-day role is overseeing the application development: our web platform that people [inside the league] are using on a day-to-day basis to gain access to the data.

And then on the SportVU side, the underlying technology to push the product forward. How do we make this better? How do we push the product forward in terms of the live environment and get access to the data sooner? I oversee that and a team of three data analysts and Charlie.

Charlie’s role is then to work with the developers on both [internal] teams, application and SportVU development, to make sure he’s bridging the gap between client [team/franchise] need and what’s actually possible from a development standpoint. Charlie and the analysts also support the daily calls from the teams. So if a team calls in and says “hey we want to run this kind of query,” Charlie or the analysts will field that call, understand what they [the NBA team] are looking for and work to get that data into the team’s hands.

In terms of use, I’d say about two-thirds of the league is actively using our web platform. And there is also a small subset of teams that are taking in our raw [XML] files and are doing the coding and analysis in-house. We don’t have a lot of daily interaction with them, so we don’t really know what they are doing with the data. Maybe they are doing a lot, maybe not so much and they just don’t have questions for us, but it’s hard for us to pinpoint.

 

What are some of the improvements that have been made in the system itself in the last few months?

We’ve looked hard at our algorithms and how we can use more physics within the data to catch more ‘character flaws’ when reading the jerseys.[1. SportVU identifies players on the floor by reading jersey numbers.] Say the system reads a “6” on a jersey that looks a lot like an “8” when it’s crumpled up. By figuring out how that happens, and by using patterns of movement, we can go ahead and fix some of those within the technology and rely less on the operators.[2. Two individuals present at each game who assist the system by responding to prompts, such as when the system is unable to identify players on the floor.] We’ve made improvements to understanding what happens when players run by each other or bump into each other on the floor, just making sure to understand that when a player is running full speed he’s not going to [suddenly] change direction. So understanding that when players cross paths, keeping those paths consistent, things like that.
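
[Editor's illustration, not STATS' actual algorithm: a minimal sketch of the path-consistency idea described above. It assumes each tracked player has a last known position and velocity, and that a new frame yields positions whose identities are unknown. Because a player at full speed cannot instantly reverse direction, matching new positions to where each player was heading keeps identities straight when two paths cross. The function names, the units (feet and seconds) and the sample numbers are all hypothetical.]

```python
from itertools import permutations
from math import hypot


def predict(position, velocity, dt):
    """Constant-velocity guess at where a player will be after dt seconds."""
    return (position[0] + velocity[0] * dt, position[1] + velocity[1] * dt)


def assign_detections(tracks, detections, dt):
    """Match each track to one new detection, minimizing the total distance
    between predicted and observed positions.

    tracks: dict of player id -> (position, velocity), each an (x, y) tuple
    detections: list of (x, y) positions with unknown identity
    Returns a dict of player id -> detection.
    """
    ids = list(tracks)
    if len(detections) < len(ids):
        raise ValueError("need at least one detection per track")
    predicted = [predict(*tracks[pid], dt) for pid in ids]
    best_cost, best_order = None, None
    # Brute force is fine for the two or three players involved in a crossing.
    for order in permutations(range(len(detections)), len(ids)):
        cost = sum(
            hypot(predicted[i][0] - detections[j][0],
                  predicted[i][1] - detections[j][1])
            for i, j in enumerate(order)
        )
        if best_cost is None or cost < best_cost:
            best_cost, best_order = cost, order
    return {pid: detections[j] for pid, j in zip(ids, best_order)}


# Hypothetical crossing: A runs right, B runs left, and they pass each other.
tracks = {
    "A": ((0.0, 0.0), (20.0, 0.0)),
    "B": ((2.0, 0.0), (-20.0, 0.0)),
}
new_positions = [(0.1, 0.0), (1.9, 0.0)]
print(assign_detections(tracks, new_positions, dt=0.1))
# {'A': (1.9, 0.0), 'B': (0.1, 0.0)}
```

In this made-up example, matching on last known positions alone would swap the two players; matching on predicted positions keeps "A" moving right and "B" moving left.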

We’ve also been working with the league on workflow. We’re hoping by January 1st to be able to produce everything by the end of a game. Right now we’re delivering stuff the next morning to teams in terms of some of our more intensive algorithms. We’re looking to move that forward to allow for teams to have access to our whole suite of data at the end of games. A lot of that will move downstream and will allow our live data to be that much more powerful.

Does this mean better real-time data to the fan or are you focusing primarily on the data to the teams?

Both. We have daily interactions with the team at NBA.com, working with their group to decide which data to expose to the public. Some of it is sensitive information. Certain [NBA] teams don’t want certain data points out there, so that’s always a balance. There’s also the danger of too much information for groups of fans. There is the die-hard group that really wants to dive into this data, but how do we make this data interesting to someone like my father who might not even be a box-score guy? How do you make that jump and make it applicable? We could be watching a game and I could say “the Bulls ran that ICE to perfection, really downed that pick-and-roll effectively, Hinrich did a great job fighting over the screen,” and that’s already too much information for my dad, who wants to complain about people traveling. So how do you bridge that gap?

 

How much of a priority is integration with the media, having SportVU-based metrics become part of the game presentation?

That’s part of the overall goal. Again, we’re walking that tightrope both of what’s sensitive to the teams and what isn’t and also what’s applicable to the broadcaster. Each broadcaster is different. Certain guys you’ll get them the data and they’ll want to use it in their presentation. Some just want to talk about the game. Both can be interesting and compelling to the fan, it’s just different styles.

What can we bring now that we have a full year of data? It’s all about context because some of these stats are so new. What if I told you this player had 54 touches in a game, what does that mean? Well now we have a full season of tracked data and have a better understanding of what’s good and what’s not. We use this as a baseline. Using team ranks or player rankings makes it much more digestible. That’s how the media is going to use it and it’s just a matter of finding out the proper workflow. I would say at the start of the season, broadcasters will probably be not as engaged, but it will grow more and more as the season goes along.
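
[Editor's illustration: a minimal sketch of the "baseline" idea above, turning a raw count into a rank against a season of data. The baseline numbers below are invented for the example and are not SportVU figures; `percentile_rank` is a hypothetical helper name.]

```python
from bisect import bisect_right


def percentile_rank(value, baseline):
    """Percentage of baseline values that are less than or equal to value."""
    ordered = sorted(baseline)
    return 100.0 * bisect_right(ordered, value) / len(ordered)


# Made-up per-game touch counts standing in for a season-long baseline.
season_touches = [20, 30, 40, 50, 54, 60, 70, 80, 90, 100]
print(percentile_rank(54, season_touches))  # 50.0
```

With a baseline in hand, "54 touches" becomes "about the middle of the pack" in this invented sample, which is the kind of ranking a broadcaster can use in a sentence.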

 

Certainly some broadcast teams are much more open to dropping nuggets of quote-unquote “analytics” into their commentary than others?

When I’m talking to talent and coaches, I try not to use the term “analytics.” It’s data reporting. It’s stuff they’ve been manually tracking for years. It’s as simple as something like “how many ball-screens did a player use in the course of a game?” You start using a term like “analytics” and coaches think “you’re trying to push an agenda on me, you’re trying to make a point using numbers and numbers only.”

What I like to communicate is “we’ve had this reporting structure for a long time. Let us help you evaluate what you’re already doing. If you then want to throw it in a regression on what’s important and how it impacts the game, we can do that. But you tell me the information you want, and we’ll generate a report for that.” That’s how I try to communicate with coaches and broadcasters.

When it comes to communicating with front offices, it’s a more strictly analytical play. It’s trying to communicate it in a way so that no one gets scared or threatened.

The A-word, “basketball people” sometimes recoil if they think you’re trying to convince them of something with a spreadsheet rather than a tape cut-down, while what you’re offering is the best of both?

We do some research on the stuff to make sure it’s statistically valid and we’re not just throwing statistics out there, but you have to know your audience. You mentioned people getting scared, but Zach Lowe says this all the time, when people rag on these “analytic guys”, these guys all watch a ton of basketball. They’re watching, they’re learning, they’re listening to coaches and they are committed to learning the game the right way and then supplying an additional tool to that toolbox. All they [analysts] are trying to do is present more information.

I’m never going to go sit down next to Tom Thibodeau and tell him how to run his offense. I want to learn what he’s doing, why he’s doing it and how I can help evaluate how effective it’s been. That’s the best way for it to work and that’s where I see us beginning to make ground and not be seen as “a threat” but rather be seen as someone who can help them make better decisions.

 

So in the way Zach likes to put it, how do the “geeks” talk to the “jocks?” The “geeks” have to speak “basketball,” not the other way around; is that the only way it will work?

That’s definitely the case. You’ve got changes in front offices, Ben Falk going from Portland to Philadelphia, so obviously they [analysts] are becoming more influential.

I wouldn’t want a team of all data analysts who never played the game. There’s great value in that “old school” mentality. But maybe there’s things you miss by just watching tape, and maybe we can help with analytics and data to help you watch tape differently, make it more efficient. Maybe you only need to watch a 6 or 7 minute clip of what Carmelo is doing on the floor. If you’re trying to key on his tendencies maybe we can point you towards those tendencies and make your workflow much better.

 

Having been knee deep in this data for several years now, what are some things that most surprised you and most changed the way you thought about things?

That’s a good question. Let me think about it some and we’ll come back to it.

 

In terms of teams “pushing back” on the public data, what goes into that balancing act? Is it teams being very circumspect or teams being protective of the questions they are asking so as to not give their hands away?

We are very protective of teams’ data requests. If for example [two teams] came to us with the same request, we’d never say “well we have this query we just ran for Team X available.” We’ll wait for them to ask for it specifically.

Now there is a laundry list of things that general basketball minds would agree are some of the next steps they’d like to see, and those we’ll just go ahead and develop. If seven different teams have asked us about a particular type of data, we’re going to develop that and make it available for everybody. We walk that tightrope all the time.

From day one, the team business has driven all the data creation. So we’ve done it from a team standpoint and then we look at it and see what can we adjust and what can we water down from the fan perspective. You might want to see everything the same way a team sees it, but not everybody does. How do we make it digestible for the casual fan?

If you give a broadcaster, now that we know about various pick-and-roll coverages (hard hedge, soft hedge, blitz, “down,” under, over), now that we know that, that’s awesome information. But that’s a three-minute segment to explain all of it just to integrate it into a broadcast. So how do we break that down into maybe a 15-second bit that might last a whole possession? If you can’t get it in in 10-15 seconds and the casual fan doesn’t already know it, they probably aren’t going to use it. So it’s a matter of building up that portfolio of knowledge for things that can be used on a daily basis.

If I can go back to the question you asked before

About things that have surprised you?

Yes.

One of the things we looked at a lot last year was pick-and-roll coverages. At that point, we were only able to identify the four players involved: the ball-handler, the screener, the on-ball defender and the screen defender. What was really interesting is that when we looked at Noah and Hinrich [compared to] Hinrich and Boozer, Hinrich and Boozer were a “better” combination in terms of defending the pick-and-roll on a points per possession basis. We thought “how can this be?” Joakim Noah is a much better defender; you only have to watch 10 seconds of basketball to know that.

So when we looked at it, who is the other big on the floor? When it was Boozer [defending the screen] it was Joakim Noah [as the other big on the floor]. Noah was more impactful when he’s loading up the strong side on “ICE” coverage than when he’s put in a ball-screen. [In ICE] he can defend those next two passes, so he’s more important than when Boozer is just trying to force [the ball-handler] to the sideline and keep him out of the middle.

So it was really interesting to think about, how does that work for other teams? Do you want to put the best [defensive] big man in the ball-screen so you weaken the help-side defense? How do you do that strategically? And we went from thinking about just two players to thinking about how the whole team worked together defensively.

The other cool thing is that, to come up with these algorithms we create, we watch tons and tons of tape [to check accuracy]. So we’re asking, do coaches play that big a role? So when you’re grading algorithms[3. A process where the team manually charts a number of games to test the accuracy of a given query, to identify if it is capturing all the desired information while not picking up extraneous “false positive” data], you watch how the Bulls play pick-and-roll, they play “down.” They go over the screen. So it’s a lot of “ICE” combinations. We were watching a Bulls-Wizards game and the Wizards were doing the same thing in pick-and-roll coverage, it was always “down”, “down”, “down,” pushing [the ball-handler] sideline.

Middle of the third quarter, I noticed something weird. [Washington] was switching all the time now. There was an offensive adjustment, because [the Bulls] weren’t getting a good relationship between the two players in a traditional little-big pick-and-roll, they started going wing-wing pick-and-roll. The interesting thing was, when those combinations changed, they [the Wizards] changed how they were defending pick-and-roll. They went from down and ICE everything to switching everything just in the course of the game. And you see that when you’re grading it and watching it at that level of detail, you start to see the impact that coaches can have on the game, the nuance.
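
[Editor's illustration: a minimal sketch of what "grading" a query against manual charting could look like, based only on the footnote's description (capture all the desired events, avoid false positives). This is not STATS' process; the function name and the possession numbers are hypothetical.]

```python
def grade(detected, charted):
    """Compare events flagged by a query with events charted by hand.

    detected, charted: iterables of event identifiers (here, possession numbers).
    precision: share of flagged events that the manual chart confirms.
    recall: share of charted events that the query found.
    """
    detected, charted = set(detected), set(charted)
    confirmed = len(detected & charted)
    return {
        "precision": confirmed / len(detected) if detected else 0.0,
        "recall": confirmed / len(charted) if charted else 0.0,
        "false_positives": sorted(detected - charted),
        "missed": sorted(charted - detected),
    }


# Made-up possessions: the query flags 1, 2, 3 and 5; the chart has 1 through 4.
report = grade(detected=[1, 2, 3, 5], charted=[1, 2, 3, 4])
print(report)
# {'precision': 0.75, 'recall': 0.75, 'false_positives': [5], 'missed': [4]}
```

The two lists are the useful part for a grader: each false positive or missed event is a clip to go back and watch.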

 

There has been a fair amount of academic work using SportVU data, is that something STATS is still involved with?

There are still some groups out there using the data. We’re working more closely with them this year to know how they are using the data. It’s still a big part of the community and it’s great exposure for the data and the system to be used in that way. To give people a chance to play with the data and advance the game of basketball.

But we’re more focused on what is applicable. A lot of that stuff [academic research] is really great, but we have to make sure it’s actionable intelligence in a decision-making process. How can people use it and actually make the game better rather than just being a theory?

Charlie is a lot more dialed into that community, he wants to get his PhD, he has that academic background. Personally, that’s less interesting to me, I get more excited about pushing the team business forward and integrating with the media, whereas Charlie is really interested in the pure research side of it, using machine learning techniques and data mining technologies.

 

So you’re less interested in “solving” basketball than in more “ad hoc useful” analysis?

More or less.

 

What can you say about the research going on out there?

I wouldn’t want to steal their thunder. But think about defense and the stuff Kirk [Goldsberry] is working on, those are the big ones.

 

Is the defense research along the lines of the “ghost player” stuff Zach wrote about in Toronto a few years ago?

Probably not to that level. What you have to remember was that wasn’t done as pure research. If you were doing it as research, you wouldn’t have access to the coaches. That’s what gets a little lost even to this day. People think it was just computer programmers that went and did that, but what’s actually the case is those were really smart guys fortunate enough to have a front office that was really into data and analytics so they were able to get access to coaches to program that around their schemes.

For us [STATS] to do that, that gets back to our earlier conversation about me telling you how to play defense instead of letting me help evaluate how well you’re playing defense.

 

Is the SportVU system progressing into the college game much this season?

We’re trying to approach the problem from the conference level, so you get both teams involved in every game. The value becomes exponential when you add a larger sample of data and I think that’s what we saw with the NBA this past year. So the question is, how do you get the entire ACC or SEC to jump on board so we get more data and it applies more places than just Duke’s 20 home games?

 

Are there conferences coming online this year, or is that an in-progress initiative?

In-progress.

 

I would imagine that would have implications for predraft rankings and scouting.

Yeah, that’s when you know people are using the data is when they start asking “hey what about college data” then you know they are measuring player performance based on that data and they think they’ve found something so they want to measure incoming players on those metrics.

 

How frequently do you get those types of questions?

Quite a bit.

 

Is that the next thing people say: “This is cool. When do I see it for college players?”

Yup. And then you ask “but are you willing to pay for it?” and the answer is usually “of course not, but I really want it.”

 

So you’d say two-thirds of the teams are fairly actively using the system?

That other third breaks down like this: about half of that third is doing everything in-house. As for the other half, everyone has access to the web platform and a lot of teams are heavy users, but some teams just want custom Excel-based reports that we send to them after every game, and we work with them that way. They don’t want to log in to look at something, but they are happy to look at it in a Word or PDF format.

 

Brian and I talked about the partnership with Catapult last spring and now that he’s working there is that progressing?

Obviously, I have a good relationship with Brian and we continue to work together on a daily basis. It’s really merging best-of-breeds in terms of practice tracking and in-game performance metrics. It’s measuring athlete load and athlete performance, and how do we manage that over the course of the season? LeBron going to Cleveland, how do they manage his minutes? More importantly, how do they manage his athletic load? If you compare a guy like LeBron with Luol Deng, they could do the exact same things on the floor, and that’s still going to be harder on LeBron because of his sheer size.

Unless you’re Gregg Popovich and you just do whatever you want, you’re probably not going to be able to influence LeBron’s playing time; coaches are putting their lineups together to win every game. So how do trainers and strength and conditioning coaches use practice to learn how to best prepare players for the next game? And now with Catapult and SportVU you’re able to manage that across the full infrastructure.

So what can you tell me about stuff that’s new this year? 

I can’t talk too much about it because we haven’t finalized our plans for the new things coming on NBA.com this season. Hopefully we’ll be doing more and more over the course of the season and not just releasing it all at the start of the season and having it be static over the course of the season.

 

Can I make a date to talk again closer to the season, once that becomes finalized? From a fan perspective, a walk through the new goodies is something people would be interested in!

We could do something closer to the season. I can tell you the biggest differences are that it’s going to be more visual and more interactive. Quick snippets that people can share easily. Making it much easier to go to a player page and look at player tendencies and play types. Making it easier for player comparisons and stuff like that.