Beginner’s guide to baseball analytics

May 27, 2016; Seattle, WA, USA; Seattle Mariners fans cheer for a strikeout by starting pitcher Felix Hernandez (34) during the sixth inning against the Minnesota Twins at Safeco Field. Mandatory Credit: Jennifer Buchanan-USA TODAY Sports

For fans interested in learning a little more than the basic statistics of the game, we offer a beginner’s guide to baseball analytics.

I’m kind of old, but not really old. A child of the ‘80s and an Atlanta native, I grew up watching the Braves on TBS and Baseball Tonight on ESPN. I collected cards, played RBI Baseball and Major League Manager, and read newspaper box scores and preseason magazines. I played youth baseball and played in high school (as a decent defensive catcher without much of an arm). I also coached some high school baseball, and later spent three years working as a front office executive for a minor league club.

However, despite all my baseball experience, analytics and Sabermetrics weren’t something I thought about often. I was old school. Not because I rejected new age stats, but because I never sought them out. Somehow, Bill James and Strat-O-Matic, both of which helped many of today’s great baseball writers and thinkers develop a love for the game, never entered my world. I was comfortable with the simplicity of hits, runs and errors, though in truth, when I saw stats like WAR, UZR, BABIP, wOBA, DRS and wRC+, I got intimidated.

I was a pretty good math student in school, and had watched, played, coached or worked in baseball all my life, but for some reason I simply didn’t understand baseball analytics at first. For years, I didn’t really feel like I needed to. After all, I hadn’t needed them at any point in my baseball journey.

However, when I began writing about baseball, I slowly began to pick up some new tricks. And, the more I learned, the more I wanted to learn. I began visiting and subscribing to stathead websites like Baseball Prospectus and FanGraphs. I bought books, including The Book: Playing the Percentages in Baseball by Tom Tango and Mitchel Lichtman, and checked out others from the library.

Learning about baseball analytics made me a better baseball writer and a better fan of the game itself. I’m far from perfect, and still have a lot to learn. But if you’re like me, you may find this beginner’s guide to baseball analytics helpful.

The idea is to provide you with the most important and useful Sabermetric terms, and briefly explain them in a way traditional fans and beginners alike can understand. When more information is needed, links are available.

Mandatory Credit: Charles LeClaire-USA TODAY Sports

An introduction to baseball analytics

Just about every baseball box score in history features runs, hits and errors. They are the basic building blocks of the game’s numbers-based nomenclature, and helped lead to simple, common calculations like batting average and fielding percentage, as well as other stats like RBI. Fans and statheads also began tallying home runs, stolen bases, walks and strikeouts. Eventually, on-base percentage, slugging percentage and OPS (on-base plus slugging) rose to fame for their ability to add context to a player’s value as a hitter and run producer.

For pitchers, wins and losses, earned runs and ERA, games and innings pitched, hits and runs allowed, walks and strikeouts generally graced the back of baseball cards. Complete games, shutouts and saves eventually followed. WHIP (walks and hits per inning pitched) emerged as a more advanced look at the number of base runners a pitcher allowed, and it fit the shorthand acronym mold that has long been popular in baseball statistics.

However, over the past two decades, a wider range of statistics made their way into barroom debates, online discussion groups, and baseball front offices. The book Moneyball by Michael Lewis is often credited with bringing Sabermetrics to the masses, as it followed Oakland Athletics general manager Billy Beane’s quest to field a winning team with fewer resources than richer franchises with more assets at their disposal.

Beane was a devoted reader of James, who is widely considered the most influential baseball statistician in history, and used the knowledge he picked up as a player and front office executive to build the A’s into a surprisingly consistent winner. Too simply put, he focused on statistics that were largely overlooked by traditional scouts and front office decision makers. Most famously, Beane emphasized on-base percentage over batting average.

Despite cries of a Moneyball-era “scouts vs. stats” debate, statistics have been used to measure baseball performance for well over a century, though their meanings and uses have changed over time. Some, including plate appearances and batters faced, have often been overlooked. Others, such as RBI, were overvalued. The baseball community has also largely dismissed pitcher wins as a meaningful statistic, and for good reason given the shorter workload for starting pitchers and the increased role of relievers in the modern game.

Also, many newer stats have been created in an effort to quantify nearly every aspect of baseball. While some of the new stats were easy enough to understand, others weren’t. Still, many have offered a better understanding of the actual effectiveness of a particular player.

Mandatory Credit: Jayne Kamin-Oncea-USA TODAY Sports

Baseball analytics glossary

This is in no way a complete list of baseball statistics. For that, check here. Instead, these are some of the analytics that traditional baseball fans may be unfamiliar with and that are worth learning.

BABIP: Batting Average on Balls in Play. Developed following the historic research by Voros McCracken, which attempted to measure how much a pitcher can control. Among its uses, BABIP helps determine whether a hitter or pitcher is “lucky” or “unlucky” as a result of the defense. The calculation for BABIP does not include home runs.

A league-average BABIP is generally about .300. Hitters with a BABIP above .300 are typically either more talented than average (for example, a hitter with a high BABIP may hit the ball harder than most) or the beneficiaries of some type of luck (though fast runners capable of beating out ground balls can also be considered an exception). Read more about BABIP here.
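For readers who like to see the arithmetic, here is a minimal Python sketch of the commonly published BABIP formula: hits minus home runs, divided by at-bats minus strikeouts minus home runs plus sacrifice flies. The function name and the sample stat line are made up for illustration and do not belong to any real player.

```python
def babip(hits, home_runs, at_bats, strikeouts, sac_flies):
    """Batting average on balls in play: (H - HR) / (AB - K - HR + SF)."""
    # Home runs and strikeouts are removed because fielders never get a chance at them.
    balls_in_play = at_bats - strikeouts - home_runs + sac_flies
    if balls_in_play <= 0:
        raise ValueError("no balls in play to evaluate")
    return (hits - home_runs) / balls_in_play


# Hypothetical season line (assumed numbers, not a real player).
example = babip(hits=160, home_runs=20, at_bats=550, strikeouts=120, sac_flies=5)
print(f"{example:.3f}")  # prints 0.337
```

A .337 mark sits well above the roughly .300 league norm, which is the kind of gap that prompts the "skill or luck?" question described above.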

BB% or BBr: Base on balls percentage or walk rate. Similar to strikeout rate, walk rate is the percentage of plate appearances that results in a hitter reaching base via a base on balls.

BB9/W_IP: Walks per nine innings pitched.

BF or TBF: Batters faced or total batters faced. A pitcher’s version of plate appearances (see below) that includes literally every hitter a pitcher faces. BF is important for calculating walk and strikeout rates, and is more telling of a pitcher’s workload than games or innings pitched.

DRA: Deserved Run Average. Described as Baseball Prospectus’ core pitching metric. DRA uses runs allowed per nine innings as its base (not ERA), and uses all runs allowed in an attempt to show how many runs a pitcher “deserves.” DRA helps to distinguish which pitchers are over- or underrated by ERA, including groundball pitchers.

DRS: Defensive Runs Saved. A plus/minus fielding metric developed by The Fielding Bible used to determine whether or not a player is below or above average at his position.

FIP: Fielding Independent Pitching. Based on McCracken’s research, which showed that pitchers lack total control over limiting base runners. Pitchers have control over walks, strikeouts and hit by pitches, as well as home runs, but all other balls in play are impacted by defense.

Specifically, a slow or poor defender who lets a ball drop for a hit that a better fielder would catch can penalize a good pitcher. A related metric, cFIP, takes FIP one step further by making adjustments for park factors, hitter, catcher and more.
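As a rough illustration, the widely published FIP formula weights only the outcomes a pitcher controls: 13 times home runs, plus 3 times walks and hit batters, minus 2 times strikeouts, all divided by innings pitched, plus a league constant. The sketch below assumes a constant of 3.10. The real constant changes every season so that league FIP matches league ERA, so treat the default here as a placeholder rather than an official value.

```python
def fip(home_runs, walks, hit_by_pitch, strikeouts, innings, constant=3.10):
    """Fielding Independent Pitching.

    `constant` is an assumed placeholder; the published value varies by season.
    `innings` should be a true decimal (6 1/3 innings = 6.333..., not 6.1).
    """
    if innings <= 0:
        raise ValueError("innings must be positive")
    numerator = 13 * home_runs + 3 * (walks + hit_by_pitch) - 2 * strikeouts
    return numerator / innings + constant


# Hypothetical pitcher: 20 HR, 50 BB, 5 HBP, 200 K over 200 innings.
print(round(fip(20, 50, 5, 200, 200.0), 3))
```

Notice that hits allowed on balls in play never enter the calculation, which is the whole point of the metric.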

FRAA: Fielding Runs Above Average. A Baseball Prospectus defensive metric that uses play-by-play data to determine how well a player fields his position compared to others.

ISO: Isolated Power. Also can be described as raw power, and helps show how often a batter hits for extra bases. Calculated by subtracting a player’s batting average from his slugging percentage (SLG-AVG). Useful for separating singles hitters with high batting averages from more valuable sluggers.
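To see why ISO separates singles hitters from sluggers, here is a small Python sketch that applies SLG minus AVG to two invented hitters who both bat .300. The helper names and both stat lines are assumptions for illustration only.

```python
def batting_average(hits, at_bats):
    return hits / at_bats


def slugging(singles, doubles, triples, home_runs, at_bats):
    # Total bases divided by at-bats.
    total_bases = singles + 2 * doubles + 3 * triples + 4 * home_runs
    return total_bases / at_bats


def isolated_power(singles, doubles, triples, home_runs, at_bats):
    """ISO = SLG - AVG, i.e. extra bases per at-bat."""
    hits = singles + doubles + triples + home_runs
    return (slugging(singles, doubles, triples, home_runs, at_bats)
            - batting_average(hits, at_bats))


# Two hypothetical .300 hitters over 500 at-bats.
slap_hitter = isolated_power(singles=150, doubles=0, triples=0, home_runs=0, at_bats=500)
slugger = isolated_power(singles=100, doubles=30, triples=0, home_runs=20, at_bats=500)
print(f"{slap_hitter:.3f} {slugger:.3f}")
```

Both hitters have identical batting averages, yet the second one's ISO is far higher because 50 of his 150 hits went for extra bases.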

K% or SOr: Strikeout percentage or strikeout rate. Explains the percentage of plate appearances in which a hitter strikes out. Calculated by dividing strikeouts by plate appearances. The league average strikeout rate is roughly 21 percent.

PA: Plate appearances. Every time a player comes up to bat. Plate appearances is a more valuable tool than at bats (which do not account for walks, hit by pitches or sacrifices). Plate appearances are used when calculating OBP, BB%, K% and other valuable measures used to rate effectiveness.
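The relationship between at-bats, plate appearances and the rate stats built on them can be sketched in a few lines of Python. This version assumes plate appearances are the sum of at-bats, walks, hit by pitches, sacrifice flies and sacrifice bunts, and it ignores rare events such as catcher's interference. All names and numbers below are hypothetical.

```python
def plate_appearances(at_bats, walks, hit_by_pitch, sac_flies, sac_bunts):
    # Simplified: leaves out rare cases such as catcher's interference.
    return at_bats + walks + hit_by_pitch + sac_flies + sac_bunts


def rate_per_pa(events, pa):
    """Share of plate appearances ending in a given event (used for BB% and K%)."""
    if pa <= 0:
        raise ValueError("plate appearances must be positive")
    return events / pa


# Hypothetical hitter.
pa = plate_appearances(at_bats=500, walks=60, hit_by_pitch=5, sac_flies=4, sac_bunts=1)
bb_pct = rate_per_pa(60, pa)
k_pct = rate_per_pa(120, pa)
print(pa, f"{bb_pct:.1%}", f"{k_pct:.1%}")
```

Dividing by at-bats instead of plate appearances would overstate both rates, because the 70 trips to the plate that ended in a walk, hit by pitch or sacrifice would be left out of the denominator.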

PECOTA: Developed by Nate Silver, formerly of Baseball Prospectus and currently of FiveThirtyEight, PECOTA is a player projection system that also helps to determine win totals. PECOTA is generally considered the gold standard of preseason projections, though FanGraphs offers similar systems in ZiPS and Steamer.

No projection system is foolproof, of course. Famously, PECOTA has underestimated the Kansas City Royals and Baltimore Orioles. Read more about PECOTA here.

Pythagorean Winning Percentage: Derived from Bill James’ research, Pythagorean won-loss record and winning percentage uses the number of runs a team scores compared to the runs it allows to show whether a team played better or worse than its final record – or whether a team is somewhat lucky or unlucky.
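Here is a minimal Python sketch of the Pythagorean idea, assuming James’ original exponent of 2: runs scored squared, divided by the sum of runs scored squared and runs allowed squared. Some sites use a slightly smaller exponent (1.83 is a common choice), so the exponent is left as a parameter. The team totals are invented.

```python
def pythagorean_win_pct(runs_scored, runs_allowed, exponent=2.0):
    """Expected winning percentage from run totals."""
    if runs_scored < 0 or runs_allowed < 0 or runs_scored + runs_allowed == 0:
        raise ValueError("run totals must be non-negative and not both zero")
    rs = runs_scored ** exponent
    ra = runs_allowed ** exponent
    return rs / (rs + ra)


# Hypothetical team: 800 runs scored, 700 allowed over a 162-game season.
pct = pythagorean_win_pct(800, 700)
expected_wins = pct * 162
print(f"{pct:.3f}", round(expected_wins))
```

If this imaginary team actually finished with 85 wins, the gap between that record and its expected total would be the "unlucky" label described above.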

SIERA: Skill-Interactive Earned Run Average. A complicated ERA-related formula intended to determine the skill level of pitchers, or how and why good pitchers are good, and others aren’t. Read more about SIERA here.

SO9/SO_IP: Strikeouts per nine innings pitched.

SO/BB: Ratio of strikeouts to walks.
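The per-nine-innings stats and the strikeout-to-walk ratio are simple enough to sketch together in Python. One wrinkle worth knowing: box scores write innings as 6.1 or 6.2 to mean six innings plus one or two outs, so the sketch includes a small helper (a hypothetical name, like everything else here) to convert that notation before dividing.

```python
def innings_to_float(ip_text):
    """Convert box-score innings ('6.1' = 6 1/3, '6.2' = 6 2/3) to a decimal."""
    whole, _, outs_text = ip_text.partition(".")
    outs = int(outs_text) if outs_text else 0
    if outs not in (0, 1, 2):
        raise ValueError("fractional innings must be .0, .1 or .2")
    return int(whole) + outs / 3


def per_nine(count, innings):
    """Scale a counting stat to nine innings (used for SO9 and BB9)."""
    if innings <= 0:
        raise ValueError("innings must be positive")
    return 9 * count / innings


def strikeout_to_walk(strikeouts, walks):
    if walks == 0:
        raise ValueError("ratio is undefined with zero walks")
    return strikeouts / walks


# Hypothetical pitcher: 200 strikeouts and 40 walks in 180 innings.
ip = innings_to_float("180.0")
print(per_nine(200, ip), per_nine(40, ip), strikeout_to_walk(200, 40))
```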

WAR: Wins Above Replacement. Sometimes the simplest solutions are the most difficult to understand. WAR is designed to be an all-inclusive, total value look at a player’s value to his team compared to an abstract “replacement level” player (a replacement player is often described as a below average MLB player, or a Triple-A player that would be called up to replace the player in question, perhaps if he were injured or traded to another team).

There is no one equation used to calculate WAR, though the FanGraphs (fWAR) and Baseball-Reference (rWAR or bWAR) versions are commonly used. Baseball Prospectus uses WARP (Wins Above Replacement Player). BP’s VORP (Value Over Replacement Player) metric is similar. For a detailed explanation of WAR and how it is calculated, read more here.

wOBA: Weighted On-Base Average. An overall statistic created by Tom Tango designed to measure a hitter’s value through the number of runs he creates. FanGraphs calls wOBA “the key to everything” in understanding Sabermetrics. Read more about wOBA here.

wRC+: Weighted Runs Created Plus. An evolved statistic with roots dating back to Bill James, which also builds upon wOBA and adjusts results based on the ballpark in which a hitter plays. Designed to explain how many runs a specific player is worth to his team in a given year. Read more on wRC+ here.

UZR: Ultimate Zone Rating. UZR is a defensive metric that uses zone data to determine how good a fielder a particular player is. Because they use a player’s range instead of simply the balls he has an opportunity to field, metrics like UZR, DRS and FRAA are far more reliable and useful than traditional statistics like errors and fielding percentage.

Mandatory Credit: Tommy Gilligan-USA TODAY Sports

Throughout baseball history, some catchers earned reputations as strong defenders, while others were labeled poor pitch-framers. Some of the distinction relied on statistics related to the number of attempted base stealers catchers threw out, but most was simple observation.

Part of that observation centered on pitch framing, and which catchers were soft-handed receivers capable of “stealing” strikes for their pitchers. In the early 2010s, pitch framing data became available, which changed the way we look at catchers defensively.

However, as data evolves, there is concern that “pitch framing was doomed from the start,” as Jeff Sullivan wrote for The Hardball Times.

Statcast: Tracking technology used by Major League Baseball to gather data on everything from pitcher velocity, spin rate, and release to hitters’ exit velocity and launch angle, the acceleration of baserunners and the max speed and route efficiency of outfielders.

Statcast data is a treasure trove of information that could change the way we look at the game. There is even recent discussion as to how it will help calculate WAR.

Defensive Shifts: Though they date back at least as far as the 1920s, and were used often against legendary hitter Ted Williams in the 1940s, defensive shifts are often employed today in baseball to take advantage of data that indicates a batter is most likely to pull the ball. Here is a good primer from Baseball-Reference.

Fly-Ball Revolution: A very recent topic of discussion centered on the shift in batted-ball data toward more fly balls. In short, more players appear to be hitting fly balls in hopes of hitting more home runs. So far, it appears to be working for players like Josh Donaldson, J.D. Martinez and Brian Dozier.

Park Factors/Park Adjustments: Unlike other sports, no two baseball ballparks are the same. Coors Field in Denver has a reputation as a hitter’s paradise for good reason: it’s easier to hit home runs there than any other park in the big leagues.

Because of unique aspects of each major league stadium, be it the Green Monster in Fenway Park or Tal’s Hill (may it rest in peace) in Houston, many modern statistics attempt to adjust their numbers to properly account for which park a player was in when an event occurred.

For example, a member of the Colorado Rockies pitching staff may have an unfairly inflated ERA compared to his NL West rival that pitches in Dodger Stadium. You can read more about park factors here and keep up with 2017 park factors here.
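One simple way to put a number on a ballpark, sketched below in Python, is to compare the total runs per game (scored plus allowed) in a team's home games with the same figure in its road games. A result above 1.0 suggests a hitter-friendly park. Published park factors typically use several seasons of data and further adjustments, so this is only a bare-bones illustration, and the team totals are invented.

```python
def runs_park_factor(home_rs, home_ra, home_games, road_rs, road_ra, road_games):
    """Ratio of runs per game at home to runs per game on the road."""
    if home_games <= 0 or road_games <= 0:
        raise ValueError("game counts must be positive")
    home_rate = (home_rs + home_ra) / home_games
    road_rate = (road_rs + road_ra) / road_games
    if road_rate == 0:
        raise ValueError("no road runs to compare against")
    return home_rate / road_rate


# Hypothetical team in a hitter's park: 880 total runs at home, 720 on the road.
factor = runs_park_factor(450, 430, 81, 350, 370, 81)
print(f"{factor:.2f}")  # prints 1.22
```

Using the same team for both halves of the comparison keeps roster quality roughly constant, so most of what remains is the park itself.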

Platoon Splits: A pretty simple idea that most traditional baseball fans already understand. Generally speaking, left-handed hitters struggle to hit left-handed pitchers and right-handed pitchers are more successful against right-handed hitters.

Managers and front office personnel use the data from past matchups between pitchers and hitters to make moves in-game and when building a roster, respectively. That’s why some players are switch hitters, and why teams keep picking up Pat Venditte.

Mandatory Credit: Jon Durr-USA TODAY Sports

Further reading and baseball analytics resources

Because of Moneyball, the Oakland Athletics have long been considered one of the most analytically inclined teams in Major League Baseball. The Tampa Bay Rays have also done more with less, as Jonah Keri explains in The Extra 2%: How Wall Street Strategies Took a Major League Baseball Team From Worst to First.

According to a 2015 feature by Ben Baumer in ESPN the Magazine, in addition to the A’s and Rays, the Boston Red Sox, Chicago Cubs, Cleveland Indians, Houston Astros, New York Yankees, Pittsburgh Pirates and St. Louis Cardinals are “all-in” on sabermetrics.

The Los Angeles Dodgers, who now have former Tampa Bay GM Andrew Friedman as their President of Baseball Operations, should be included as well. The Milwaukee Brewers have made a sabermetric shift under GM David Stearns, who took over in the fall of 2015, and the new front office team of the Minnesota Twins also has sabermetric tendencies.

Among those slow to adapt to the analytics craze are the Atlanta Braves (though new front office duo John Coppolella and John Hart embrace new methods more than the ultra-successful team of John Schuerholz and Bobby Cox), as well as the Philadelphia Phillies, Colorado Rockies and Miami Marlins.

Mandatory Credit: Geoff Burke-USA TODAY Sports

In addition to Baseball Prospectus and FanGraphs, which were often cited above, and anything written by Bill James, those looking for additional sabermetric and baseball analytics-related reading material should check out:

Again, this is not an exhaustive list. There are tons of great baseball thinkers and writers out there who have gone through painstaking effort to better understand why some teams and players succeed while others fail. However, this beginner’s guide to baseball analytics has helped a relative baseball lifer like me understand the game better, and can do the same for any fan.