I. Historical Impact: WOWY Score Update

How valuable is a player? How many points per game is he worth? In sports, these are Holy Grail questions that play-by-play data has helped estimate. But how do we compare Magic and Bird when they played before play-by-play was available? How do we compare Russell and Chamberlain when they don’t even have a complete box score?

A few years ago, I circulated a method that takes a stab at these questions by using injuries, trades and free agent signings to compare teams with and without a given player. The result is an historical, (mostly) apples-to-apples comparison of value between players, called WOWY. (There’s a full primer on WOWY attached to the end of this post.) The lineup data — not from play-by-play, but from game-by-game — gives us the same insight into players for the last 60 years.

Take Bill Walton’s legendary rise and fall in Portland. All other things being equal, how did the team fare with and without him in the lineup? It turns out, Walton’s missed time from those years produces the best WOWY score in NBA history. (See the third tab, titled “Top WOWY Runs.”) In other words, among qualifying players (i.e. players who were injured or traded), Walton had the biggest observable impact on any team ever, his presence improving the Blazers by more than eight points per game.

In researching Thinking Basketball, I examined hundreds of these WOWY runs. For those familiar with the method, I also cleaned up the data, adding controls and incorporating postseason games for over 1,500 instances since the inception of the shot clock in 1954-55. And if we combine those instances for each player — focusing only on what I’ve liberally called his “prime” — we can see who left a large impact moving in and out of the lineup over an entire career.

Below are the top 10 prime WOWY scores of all-time, with a minimum sample of 20 games missed:

Top 10 WOWY Scores All Time

Indeed, the best combined numbers are from players often found in all-time top-10 or top-20 lists. You can see all the results in this spreadsheet of over 200 qualifying players.

The two outliers — Robertson and West — make most top-20 or top-15 lists. (ESPN had them at 11 and 13, respectively, in their recent top-100 rankings.) While Oscar is largely revered, most people don’t know that his impact was quantifiably enormous, dragging an otherwise inept team in Cincinnati to respectability, then later catapulting Kareem’s Bucks into the upper stratosphere.

Meanwhile, when West was healthy, many of his teams were elite, only overshadowed in history by the dynastic Celtics. Amazingly, West’s teams performed better with him in all 12 of the lineups from which he missed time. Oscar did the same for 11 consecutive lineups. (Note that about one third of the WOWY scores on that list are negative.)

538’s Benjamin Morris ran a limited version of this analysis years ago to argue for the greatness of Dennis Rodman, although he only used injury absences of at least 15 games. Rodman’s good, but he clocks in at 16th here. And yes, Kobe (26th) beats Jordan (32nd), but MJ’s number comes largely from 1986, when he broke his foot and missed most of the season. (His 22 missed games from ’92, ’93 and ’95 would elevate him to 25th on the list.)

While this is all valuable data, it’s still limited. It doesn’t help answer our original question for players who don’t miss much time, like Chamberlain and Russell (and even Jordan). We’ll address that issue in Part II of this series on historical impact. For now, I’ll leave you with a WOWY primer below…

What’s WOWY?

It stands for “With or Without You,” and compares the performance of a roster with and without a given player, measured over entire games. It is an attempt to isolate a player’s impact on that given roster.

I almost always control for players who played at least 25 minutes per game (noted in the control column of this spreadsheet). This typically yields five to seven-man rotations for most teams, depending on how they distribute the minutes. There are some instances where I’ll control for the entire starting 5, even if someone is below the 25-minute mark. Similarly, there are situations that call for including two players at around 23 to 24 minutes per game because there is no clear-cut fifth man on a team.

How is WOWY different from On/Off?

On/Off captures changes within a game; WOWY captures changes from game to game. One strength of WOWY is that multicollinearity is not a problem — in other words, player values cannot be confounded by players moving in and out of the game together. In that sense, it is an incredibly pure representation of a player’s value to a given roster, although it is troubled by issues like sample size (major issue), team growth (minor issue), unhealthy opponent lineups (minor) and valuable bench cohorts (minor).

(Note that some lineups have synergistic effects where the whole is greater than the sum of the parts, and removing any player from that equation can disrupt the synergy.)

What’s a WOWY Score?

It’s an attempt to quantify how impressive a given WOWY run is. It takes into account sample size, the distribution of SRS scores in a given era and the quality of the player’s team.

What is “95% +/-?”

It is a confidence interval, based on the SRS variance of a typical NBA team. For example, from 1977-78 the Blazers were a -1.2 team in 26 controlled games without Bill Walton. A 95% +/- value of “3.5” means that 95% of the time, the actual full-season SRS of such a team will fall within 3.5 points of that value, or somewhere between -4.7 and +2.3 SRS. (Note: More consistent teams will be slightly penalized by this and more inconsistent teams will benefit from it.)
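As a back-of-the-envelope sketch, the interval behaves like a standard normal confidence interval. In the snippet below, the single-game margin deviation (`game_sigma`) is an assumed parameter of mine, not a figure from this post; a value near 9 happens to reproduce the quoted 3.5:

```python
import math

def wowy_interval(observed_srs, game_sigma, n_games, z=1.96):
    """Confidence interval for a full-season SRS estimated from n games.

    game_sigma is an assumed standard deviation of single-game margins;
    z=1.96 corresponds to a 95% interval under a normal model.
    """
    half_width = z * game_sigma / math.sqrt(n_games)
    return observed_srs - half_width, observed_srs + half_width

# The Walton example: -1.2 SRS over 26 games with a quoted +/- of 3.5
# implies an interval of roughly (-4.7, +2.3).
lo, hi = wowy_interval(-1.2, 9.1, 26)
```

Note that the half-width shrinks with the square root of the sample, which is why small WOWY samples carry such wide error bars.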

What is SIO?

It stands for “simple in/out,” and is a basic curving of impact based on the quality of a team. It means that taking a -10 team to -5 is given less value than taking a +5 team to +10.

When combining runs for a “prime score,” why is SIO different than WOWY score?

Uneven samples can produce extremely warped results due to some basic math illusions. Take Michael Jordan, who missed the majority of his games in 1986. His team’s “out” totals will then largely reflect the 1986 Bulls (who were below .500), but his “in” totals will be weighted heavily by the Bulls’ dynastic teams. So, even if his team performed the same with or without him, his out sample would largely be from a -3 SRS team, while his in sample would be from teams closer to +9 SRS.

WOWY score was designed to correct this problem — for multiple seasons, it takes the impact (SIO) from a given sample and weights it accordingly. For instance, if a player makes a team 10 points better in a five-game sample, and then two points better in a 20-game sample, his weighted impact is 3.6 points (because 80% of the sample comes from the two-point change).
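That weighting is just a games-weighted average of the per-sample impacts. A minimal sketch (this reproduces only the weighting step, not the full WOWY score, which also adjusts for era and team quality):

```python
def weighted_impact(runs):
    """Games-weighted average impact across multiple WOWY samples.

    Each run is a (points_of_impact, games_in_sample) pair.
    """
    total_games = sum(games for _, games in runs)
    return sum(impact * games for impact, games in runs) / total_games

# The example from the text: +10 over 5 games, +2 over 20 games.
weighted_impact([(10, 5), (2, 20)])   # → 3.6
```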

The actual in and out values are included for posterity, but unless a player played on relatively consistent teams, the numbers won’t reflect the actual impact the player had on his lineups.

Why are there multiple entries for the same player-season?

The controls are different. Players might miss games from one lineup and then, following a team trade, might play with and without a different lineup.

From the Vault: Are Role Players Worse on the Road?

This post was originally published on June 3, 2012. 

On the ESPN pregame show before Game 4 of the Eastern Conference Finals between Boston and Miami, there was a long discussion about why peripheral players tend to struggle more on the road than at home. Which, of course, raises the question…do peripheral players really struggle more on the road than at home?

If we break down player importance by minutes played, we can stratify everyone in the NBA into six categories, ranging from guys who play under 15 minutes per game to those who play more than 35. This is what the results look like for free throw shooting:

Free Throw% Away Home Diff Away % of Home
Players Over 35 mpg .763 .757 -.006 100.8%
30-35 .796 .795 -.001 100.1%
25-30 .774 .768 -.005 100.7%
20-25 .712 .722 .011 98.5%
15-20 .707 .689 -.018 102.6%
Under 15 .677 .675 -.002 100.3%

This confirms another myth-buster we learned at the MIT Sloan Sports Analytics Conference this year: teams shoot free throws better on the road, not at home. Players over 35 minutes per game see a small improvement on the road, with home-cooking reserved only for players in the 20 to 25 minute bracket. Those players see a 1.1 percentage point decrease in free throw shooting on the road.

Obviously, free throw shooting alone doesn’t tell us much. Instead, let’s look at composite statistics that ballpark overall play from the basic box score: first Game Score (similar to PER), then an Expected Value (EV) metric run only on the box score stats. Points per game and True Shooting% are also included.

Game Score Away Home Diff Away % of Home
Players Over 35 mpg 14.6 15.7 1.1 93.2%
30-35 11.1 12.0 0.9 92.5%
25-30 7.7 8.7 0.9 89.4%
20-25 5.9 6.1 0.2 96.8%
15-20 4.0 4.0 0.0 99.8%
Under 15 2.2 2.4 0.2 90.9%
Box EV Away Home Diff Away % of Home
Players Over 35 mpg 4.5 5.4 0.8 84.3%
30-35 3.8 4.4 0.6 87.4%
25-30 2.9 3.4 0.5 84.0%
20-25 2.3 2.5 0.2 92.6%
15-20 1.5 1.5 -0.0 102.2%
Under 15 0.6 0.7 0.1 85.9%
Points Per Game Away Home Diff Away % of Home
Players Over 35 mpg 18.95 19.62 0.7 96.6%
30-35 14.75 15.36 0.6 96.0%
25-30 10.79 11.17 0.4 96.6%
20-25 7.81 7.81 0.0 100.0%
15-20 5.43 5.31 -0.1 102.2%
Under 15 3.18 3.33 0.2 95.5%
True Shooting% Away Home Diff Away % of Home
Players Over 35 mpg .535 .556 .021 96.2%
30-35 .534 .554 .020 96.4%
25-30 .522 .534 .013 97.6%
20-25 .520 .519 -.001 100.2%
15-20 .506 .506 .000 100.0%
Under 15 .471 .485 .013 97.2%

Based on these measurements, lower-minute players actually hold up better on the road, relative to their home performance, than the stars do. Consider:

  • 25-30 minute players decline the most on the road by Game Score and box-based EV
  • 20-25 minute players see no decline in points per game and TS%
  • Under-15 minute players see a road decline comparable to high-minute players in points and efficiency

In fact, the players with the biggest drop-off in road performance are the high-minute players! Whether it’s composite box metrics or scoring and shooting, the biggest difference between road and home is typically seen in the key players. In the composite metrics especially, the high-minute players see a significantly larger decline on the road than the low-minute players do.

Keep in mind this is not a definitive study, but a broad examination. We could change the criteria to examine “only All-Stars” or “only All-Stars in the playoffs,” and it’s possible the results would look different. But it’s important to note that, in general, it is not the role players who decline more on the road, but the stars.

The Best “Healthy” Offenses of All-Time

In the last post, we looked at the best point-differentials of all-time posted by teams that were “healthy” (when all 25-minute per game players were in action, minimum 41 games played together). But what about isolating the offensive side of the ball?

Since box scores aren’t readily available before 1984, we are limited to teams from the last 32 seasons. But that’s fine — 99 of the top 100 most efficient teams in NBA history played in 1984 or later. (The 1982 Nuggets are the exception.)

Before analyzing the list, a quick disclaimer to keep in mind: Offensive rating is not a perfect representation of offensive quality. Teams can choose offensive-centric lineups at the expense of defense for a net boost, such as crashing the offensive boards instead of retreating in transition defense. Below are the top offenses, relative to the defenses faced, based on these healthy lineup standards:

Healthy Offense Relative ORtg

The 2006 Suns played the second half of the season without Kurt Thomas after running out of big men. As a result, they played one of the most lopsided lineups in NBA history, starting Boris Diaw at center and three wings alongside Steve Nash, with one big man (Brian Grant) off the bench. While they feasted on defenses, they were defensively compromised, posting an abysmal relative defensive rating of +6.6 (6.6 points worse than average) during these games. Given that, I wouldn’t rush to crown them the GOAT offense. Some of the Dallas teams on the list suffer from the same kind of lineup tradeoff, slotting Dirk Nowitzki at center next to an offensively-leaning forward.

Offensive rebounding has declined drastically over the last two decades as teams have sacrificed crashing the glass in order to defend transition. Some teams still crash the glass hard, but sometimes individuals are just great offensive rebounders. Dennis Rodman is the ultimate example of this — he’ll jump a team two tiers by himself. For example, in 37 games without Rodman, the ’96 and ’97 Bulls — with Longley, Kukoc, Jordan and Pippen playing — saw a 5.1% drop in their offensive rebounding rate, about the difference between the best offensive rebounding team in the league and an average one.

So let’s expand the list to include components that help place a team’s offensive rating in perspective, such as offensive rebounding rate and turnover percentage. Let’s also include the raw offensive rating and true shooting percentage (TS%) of the team:

Healthy Offense Expanded

First, the 2004 Kings’ number is shocking because they posted it without Chris Webber; they smoked the league with Brad Miller starting in his place, a phenomenon I discuss in Thinking Basketball.

The ’96 Magic and ’16 Warriors certainly jump out as candidates for the best offense ever. If your eyes told you they’d never seen anything like Golden State this year, they were right; the Warriors’ shooting efficiency yielded an unheard-of 1.19 points per scoring attempt. The ’96 Magic were amazing too, but aided by a shortened 3-point line.

There’s a dark horse in there: the 2016 Cavs, the team that beat the Warriors. Injuries have masked an all-time level offense, led by LeBron James, Kyrie Irving and even Kevin Love. They are not in the upper stratosphere of shooting efficiency, but they are a low-turnover offense (11.5%) that benefits from the player-specific offensive rebounding of Tristan Thompson.

But how good is an offense that can only take advantage of a fundamentally poor defense? While most offenses perform better against weaker defenses, Cleveland has no correlation between an opponent’s defensive strength and its own offensive production. A linear regression predicts that the Cavs offense will actually perform better against elite defenses than almost every team on this list, including Golden State. Given the small samples, I wouldn’t put too much stock in this, but it is worth noting nonetheless.

Of course, there’s an elephant in the room. Where will Kevin Durant, Steph Curry and the 2017 Warriors place on this list?



From the Vault: Observations from a season of Stat-Tracking

This post was originally published on April 13, 2011. It is a summary of findings after one year of stat-tracking basketball games in an attempt to extend the box score. SportVU now captures similar data.

If one spends enough time watching NBA games with a DVR, trends start to jump out. Unfortunately, there’s no way the human brain can accurately catalog all that information. Perhaps Data from Star Trek should be assigned to finding trends in basketball games. In the meantime, here are some statistical observations from roughly 23,000 possessions of stat-tracking in 2011:


  • 16% of all field goals came off of an Opportunity Created (OC).
  • 46% of 3-pointers came off of an OC.
  • The average player shot 40% on 3-point shots off of an OC.
  • The Spurs led the league in OC’s (23.5 per 100 possessions), with the Hawks second at 23.4.
  • The Jazz needed the most help on defense — meaning their opponents created the most opportunities (23.7 OC per 100).
    • The Jazz also had the worst Defensive Rating in the sample by far (118.7).


  • The Lakers committed the fewest shooting fouls in the league (16.5 free throws/100).
  • Someone takes an offensive foul every 88 possessions…or a little more than once per game.
  • Phoenix takes more offensive fouls than any other team – 2.2 per 100 possessions.


  • In guarded situations, the most successful teams are the best defensive teams: the four leaders in guarded field goal percentage are all in the top-5 in defensive rating (Milwaukee’s sample was too small to include):
    1. Miami (36.6%)
    2. Chicago (36.6%)
    3. Boston (36.9%)
    4. LA Lakers (37.7%)
  • The Lakers make the most defensive errors…but give up the fewest points per error (1.50 points/error), a credit to Andrew Bynum and Pau Gasol protecting the paint.
  • Every 172 possessions there is a forced turnover not counted as a steal (e.g. a ball slapped off a leg out of bounds). That means the NBA doesn’t track about 1,700 “steals” during the season.
  • Teams shoot the worst in unguarded situations against the Lakers (56.6% eFG%), which suggests that LA does well closing out shots and fighting through screens…or they’re just lucky.

Top “Healthy” Teams in NBA History

Who are the best teams in NBA history? We often answer this question by looking at a team’s entire body of work, lumping in the good, the bad and the injured. Most teams have key players miss games and some even trade for key players, changing the chemistry of a given lineup. So who were the best teams when all of the key actors were on stage?

Below I’ve indexed the top “healthy” teams — games in which all 25-minute per game players were in action — since the shot clock (1955), by SRS (adjusted margin of victory). Using these criteria, 51 teams have posted at least an 8.0 SRS when healthy. Just 29 teams have eclipsed the 9.0 mark. (10 of those teams failed to win a title — well in line with what is predicted by the variability of a 7-game series.) The best are below, playoffs included:

Top Healthy Teams

Disclaimers: SRS, while a better predictor of results than win percentage, is not a de facto team-ranker. First, it’s subject to the usual variance seen in the NBA (detailed in Chapter 4 of Thinking Basketball), so it’s not a perfect representation of team strength. Second, some teams are more resilient in makeup — they are better equipped to handle a variety of opponents while still remaining efficient, boosting their odds of winning from series to series. Finally, SRS is a measure of within-season dominance, so it cannot allow for perfect comparisons across seasons. A 10 SRS in 1986 is probably more impressive than one in 1972.

With that said, it is by far the single best metric for evaluating the performance of a team against its competition. The teams listed above were manhandling opponents, which is why many went on to win a title.

While this year’s Warriors were the most dominant single-season team ever, their SRS is influenced by a league that was incredibly top-heavy. Four of the top-40 healthy teams ever played in 2016 (Golden State, San Antonio, Oklahoma City and Cleveland), which is either an unlikely coincidence, or a reflection of inflated numbers from a lopsided league.

The other top-four seasons are from expansion eras, when teams could pick up an additional point or two by facing expansion squads a few times a year and padding their numbers with blowouts. All of those teams are in the conversation for “greatest ever,” but their statistical dominance here should be slightly discounted.

As mentioned, we see the usual suspects: Jordan’s first three-peat Bulls. Jordan’s second three-peat Bulls. Kareem’s Bucks and the early 70’s Lakers. This is all in line with in-depth analysis of the greatest teams ever.

So who are the most impressive teams of all-time that you probably didn’t know about?

  1. 2014 Spurs. When healthy, they posted an amazing 11.8 SRS. That team is basketball’s Sistine Chapel and Gregg Popovich its Michelangelo.
  2. 2004 Pistons. Absolutely impregnable after the Rasheed Wallace trade in ways that reminded everyone it was time for a rule change.
  3. 2008-09 Lakers and Celtics. These teams were fantastic in an incredibly competitive league. The Celtics were +8.8 and +9.3 when healthy, and the Lakers +9.7 and +9.0 once Pau Gasol joined. Kevin Garnett’s injury robbed us of possibly the NBA’s greatest trilogy.
  4. 1996 Magic. Yes, they were worthy of a documentary.

Amazingly, of the top 40 healthy teams of all-time, seven are Pop’s Spurs teams. Five are Jordan’s Bulls. Four are Laker teams with Kobe Bryant.

Remember this list the next time you construct an all-time ranking or look ahead to the 2017 season.

Edit: This post was updated to include the postseason totals for the 2016 Warriors, and 96-97 Bulls. 

From the Vault: Exploring the Spacing Effect

This post was originally published on November 26, 2011. It examines a concept mentioned in my new book, Thinking Basketball.

One of the more dominant themes of this summer’s Online Hoops Summit of Nerdness was the “Spacing Effect” that good shooters provide for an offense. By being a threat to score from all over the floor, shooters pull out defenders who could otherwise help on penetration or flood the paint for defense and rebounding. For example, in the last post we combed over five years of raw on/off data — how well a team performed with a player in the lineup versus when he was on the bench — and some of the biggest impacts were made by great shooters.

Of the 21 players who added at least six points of efficiency to a 107 offense (teams averaging 107 points or more per 100 possessions without the player), seven are on the all-time top-100 list of 3-point percentage leaders (minimum 500 attempts). 17 of the 21 (81%) used the 3-point shot regularly, with only Brad Miller (2004), Shaquille O’Neal (2005), Kevin Garnett (2008) and Tyson Chandler (2008) operating primarily inside the arc. The average 3-point percentage from that group was a whopping 38.2%. (League average was 35.7% over that time.)

Below are the 21 player seasons, with their 3-point percentage:

Player Year Net Change Ortg On Court Ortg Off Court Season 3 pt %
Josh Howard 2004 6 117.6 111.6 .303
Radmanovic 2008 8.6 119.5 110.9 .406
Williams 2008 6.1 116 109.9 .395
Nowitzki 2004 6.2 115.6 109.4 .341
Bryant 2008 6.5 115.4 108.9 .361
Lewis 2005 7.3 116 108.7 .400
Joe Johnson 2005 8.4 117 108.6 .478
Josh Howard 2007 6.5 114.9 108.4 .385
Allen 2005 7 115.2 108.2 .376
Marion 2007 8.6 116.8 108.2 .317
Radmanovic 2005 11.7 119.8 108.1 .389
Chandler 2008 6.9 114.5 107.6
O’Neal 2005 7.6 114.9 107.3
Christie 2004 6.7 114 107.3 .345
Finley 2005 6.8 114 107.2 .407
Posey 2006 6.2 113.4 107.2 .403
B. Miller 2004 7.5 114.6 107.1
Billups 2008 8 115.1 107.1 .401
Terry 2006 8.5 115.5 107 .411
Garnett 2008 8 115 107

Also from that five-year chunk of data, there were 55 instances of players boasting an on/off of 9.0 or better on offense (minimum 1,000 minutes played). Again, this means their team’s offense scored at least nine more points per 100 possessions with them on the court that year. Only ten of those seasons saw a player attempt fewer than one 3-point shot per game. We see the same results: the other 45 (82% of the group) averaged 38.4% from behind the arc.

Of particular interest are the shooting specialists. Who we classify as a one-dimensional shooter is somewhat subjective, but it’s a mighty coincidence that Vladimir Radmanovic appears on the above list twice, with two different teams. And that Peja Stojakovic does the same, in two different situations, in his two best 3-point shooting seasons (43.3% in 2004, 44.1% in 2008). And that Damon Jones seemed to help Miami so much in 2005 with a career-best 43.2% from downtown. And that Fred Hoiberg led the league in 3-point percentage in 2005 at a staggering 48.3% and boosted Minnesota’s offense while on the court.

Of course, making so many 3s is also part of the reason these players help so much, but perhaps not quite as much as one would think. In Hoiberg’s case, he attempted 4.1 threes every 36 minutes, which means the difference between 48.3% and league average was roughly 1.6 points per 36 minutes, or about 2.3 points per 100 possessions at Minnesota’s 2005 pace. Radmanovic launched 5.7 threes every 36 minutes in 2008, and had he converted at league average, the Lakers would have scored about 1.8 fewer points in his games.
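Hoiberg’s back-of-the-envelope math can be sketched directly (the 35.7% league-average mark is the figure cited earlier in this post; the function name is mine):

```python
def extra_points_per_36(attempts_per_36, three_pct, league_pct=0.357):
    """Points per 36 minutes gained over a league-average shooter
    taking the same number of 3-point attempts."""
    return attempts_per_36 * 3 * (three_pct - league_pct)

# Hoiberg 2005: 4.1 attempts per 36 minutes at 48.3%
extra_points_per_36(4.1, 0.483)   # ≈ 1.55, the "roughly 1.6 points" above
```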

So while greater accuracy translates directly to more points, something else is happening here indirectly. It’s possible these shooters repeatedly benefit from moving in and out of the lineup with their team’s superstars. Although that seems unlikely, we can look at long-term adjusted plus-minus (APM) data and see the same pattern.

In Steve Ilardi’s 2003-2009 APM model, the best offensive players in the league are names we’d expect: Steve Nash, LeBron James, Kobe Bryant, Chris Paul and Dwyane Wade. It’s also littered with resident shooters, like Antawn Jamison (“stretch” power forward) at No. 7, Michael Redd (12th), Ray Allen (13th), Jason Terry (19th), Anthony Morrow (21st), Peja Stojakovic (22nd), Rashard Lewis (23rd), Danilo Gallinari (26th), Anthony Parker (40th), Mike Bibby (45th) and Sasha Vujacic (48th). Below is how the top-50 3-point shooters by percentage (minimum 500 attempts) scored in Ilardi’s APM study:

Player 3P% Off APM
Jason Kapono .454 -1.36
Steve Nash .439 8.84
Anthony Parker .424 2.55
Ben Gordon .415 2.37
Raja Bell .414 -1.22
Daniel Gibson .412 1.40
Bobby Simmons .410 -0.61
Brent Barry .409 0.32
Matt Bonner .409 1.00
Peja Stojakovic .409 4.15
Bruce Bowen .408 -4.99
Wally Szczerbiak .406 1.08
Leandro Barbosa .404 0.51
Kyle Korver .404 -0.20
Eddie House .403 -1.88
Mike Miller .402 1.21
Chauncey Billups .401 5.32
Matt Carroll .400 0.39
Troy Murphy .398 0.41
Roger Mason .395 -0.58
Brian Cook .394 -2.41
Danny Granger .393 1.40
James Jones .393 1.80
Ray Allen .392 5.33
Steve Blake .392 -0.08
Luther Head .392 -1.21
Shane Battier .391 0.33
Rashard Lewis .390 3.91
Michael Finley .389 -1.58
Kevin Martin .389 1.10
Jameer Nelson .389 0.14
Hedo Turkoglu .389 1.89
Jason Terry .387 4.41
Mo Williams .386 1.39
Tyronn Lue .384 -0.47
Jose Calderon .383 0.71
Vladimir Radmanovic .381 1.84
Michael Redd .381 5.46
Kirk Hinrich .380 -0.88
Mike Bibby .379 2.31
Joe Johnson .379 1.40
Dirk Nowitzki .379 4.71
Mike James .378 -0.80
Delonte West .378 -0.39
Andrea Bargnani .377 -1.24
Maurice Evans .377 0.26
Mehmet Okur .377 -0.48
Sasha Vujacic .377 2.24
Manu Ginobili .376 4.94
J.R. Smith .376 1.98
Derek Fisher .375 -1.60

The average Offensive APM in the entire study was -0.45. The average Offensive APM of the top-50 3-point shooters on the list is +1.08, and 32 of the 50 were positive-impact players. The glaring outlier, Bruce Bowen, can be explained away quite nicely. We’re using the 3-point shot to approximate outside shooting ability (or the threat of outside shooting), and Bowen isn’t a very good outside shooter. Using available data, he took about one deep jumper a game from 2007-2009, converting at 38%. He shot 57.5% from the free throw line during that period, the worst of anyone on the list by nearly 8%.

We could further refine “good outside shooters” by looking at floor data on shooting from 16-23 feet. But despite the presence of someone like Bowen, 3-point shooting is sufficient for now to demonstrate the presence of the Spacing Effect.

Thinking Basketball Now Available on Amazon

Excited to announce that my new book, Thinking Basketball, is now available on Amazon in paperback.

The book is largely a culmination of the ideas on this blog over the years, using our own cognition to explore misconceptions about the NBA. It’s built on the concepts that have been presented on this blog (some of which I’ll try and re-upload this summer), as well as new research that was developed specifically for the book.

It would not exist without you, the reader, supporting this blog over the years and constantly participating to improve the ideas shared in this space. Thanks for reading and I hope you enjoy it!

Some core topics:

  • Averaging 50 points per game is rarely better than averaging 20
  • Why “Chokers” aren’t always chokers
  • How winning warps our memories, and thus our narratives about players and teams
  • The value of clutch play and closers
  • Building championship teams and the value of one-on-one play

Half-Court Math: Hack-a-Whoever, Isolation and Long 2’s

In my upcoming book, Thinking Basketball, I allude to certain instances where “low efficiency” isolation offense provides value for teams. Most of us compare a player’s efficiency to the overall team or league average, but that’s not quite how the math works, because the average half-court possession is worth less than the average overall possession.

In 2016, the typical NBA possession was worth about 1.06 points. That’s a sample that includes half-court possessions against a set defense, but also scoring attempts from:

  • transition
  • loose-ball fouls
  • intentional fouls
  • technical fouls

Transition is by far the largest subset of that group, accounting for 15% of team possessions, per Synergy Sports play-tracking estimates. Not surprisingly, transition chances, when the defense is not set, are worth far more than half-court chances, as are the free-throw shooting possessions that occur outside of the half-court offense.

Strip away those premium opportunities from transition and miscellaneous free throws and the 2016 league averaged 95 points per 100 half-court possessions. (All teams were between 7 and 14 points worse in the half-court than their overall efficiency.) Golden State, the best half-court offense in the league this year, tallied an offensive rating around 105, far off its overall number of 115 that analysts are used to seeing.

Transition vs Half Court Efficiency

This has major implications for the math behind “Hack-A-Whoever.” If the defense is set, then, all things being equal, fouling someone who shoots over 50% from the free throw line is doing them a favor. One might think that a 53% free throw shooter (1.06 points per two-shot trip) is below league average on offense, since 1.06 merely matches the overall offensive efficiency. But it’s actually well above average against a set, half-court defense. (Other factors, like offensive rebounding and allowing the free-throw shooter’s team to set up on defense, complicate the equation.)

Said another way — fouling a 53% free throw shooter is similar to giving up a 53% 2-point attempt…which is woeful half-court defense.
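A minimal sketch of that comparison, using the league-wide figures cited above and deliberately ignoring the complicating factors (offensive rebounds on missed free throws, the fouled team getting to set its defense); the function names are mine:

```python
HALF_COURT_PPP = 0.95  # 2016 league average points per half-court possession

def ft_trip_value(ft_pct, shots=2):
    """Expected points conceded by sending a player to the line."""
    return ft_pct * shots

def hack_is_a_favor(ft_pct, half_court_ppp=HALF_COURT_PPP):
    """True when intentional fouling concedes more points than a typical
    possession against a set half-court defense."""
    return ft_trip_value(ft_pct) > half_court_ppp

hack_is_a_favor(0.53)   # True: 1.06 points per trip beats 0.95
hack_is_a_favor(0.45)   # False: 0.90 points per trip is a good trade
```

The break-even point under these assumptions is a 47.5% foul shooter, not 50%, which only strengthens the post's conclusion.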

There could be other viable reasons to “Hack-A-Whoever,” such as breaking up an opponent’s rhythm or psychologically disrupting the fouled player. (These would be good strategic reasons to keep the rule, in my opinion.) But assuming the player is a 50-60% foul shooter, coaches are still making a short-term tradeoff, exchanging an inefficient defensive possession for other strategic gains.

This also has ramifications for isolation scorers and long 2-point shots. Isolation matchups that create around a point per possession in the half court — or “only” 50% true shooting — are indeed excellent possessions. If defenses don’t react accordingly, they will be burned by such efficiency in the half-court. As an example, San Antonio registered about 103 points per 100 half-court possessions this year, and combined it with a below-average transition attack to still finish with an offensive rating of 110, fourth-best in the league.

The same goes for the dreaded mid-range or long 2-pointer — giving these shots to excellent shooters from that range (around 50% conversion) is a subpar defensive strategy. And even a 35% 3-point shooter (1.05 points per shot) yields elite half-court offense.

So, when we talk about the Expected Value of certain strategies, mixing transition possessions together with half-court ones will warp the numbers. Sometimes, seemingly below-average efficiency is actually quite good.


How 2016 NBA Teams Differentiated Themselves on Offense

Dean Oliver’s Four Factors uses box score data to determine how teams are successful in key elemental areas. Instead of looking at box stats like turnovers and rebounding, what if we used different types of plays to determine a team’s offensive strengths? Synergy tracks a number of play types, but not all have a large impact on the game. Based on the 2016 data on nba.com, the following were the most common play types this year:

  • 25% were pick-n-roll plays
  • 20% were spot-ups
  • 15% were in transition

Naturally, teams differentiate themselves from the pack based on the plays they run the most and how efficiently they run them; the Lakers led the league in isolation plays, but their efficiency on those plays was below average, so they lost lots of ground on the average offense. The five categories from Synergy with the largest degree of differentiation were:*

  1. Pick-n-Roll (PnR)
  2. Spot Up
  3. Transition
  4. Post Up
  5. Off Screen

Below is a visual of how every team in the NBA this year fared in these five factors.

2016 Differentiation by Play Type

The y-axis represents the per-game differentiation based on efficiency of a given play type (relative to league average). For instance, if a team ran 820 post ups (10 per game) and averaged 0.10 points per play more than league average, they would generate an extra point per game.
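In other words, the y-axis is just volume times efficiency edge, spread across the season. A minimal sketch of that calculation, using the hypothetical post-up team above:

```python
# Per-game differentiation for a play type, as described above:
# (plays run) x (points-per-play edge over league average) / (games).

def per_game_differentiation(plays, team_ppp, league_ppp, games=82):
    """Extra points per game a team generates from one play type."""
    return plays * (team_ppp - league_ppp) / games

# 820 post-ups, 0.10 points per play better than league average:
print(round(per_game_differentiation(820, 1.00, 0.90), 2))  # 1.0 extra point per game
```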

Not surprisingly, the most differentiating play type during the 2016 season was a Golden State spot-up shot. Of the 203 players with at least 100 spot-ups, Steph Curry was 2nd in efficiency at 1.49 points per play and splash brother Klay Thompson 15th at 1.18 points per play. (League average was 0.97 points per spot-up.) Let’s simplify the above visual and just focus on the final eight teams left in this year’s playoff field:

2016 Differentiation Final 8

Now it’s easier to see how the remaining teams stack up. The Warriors don’t really have a post-up game, but so what? They excel at everything else and created the most differentiation of any team in the league in three major categories (PnR, Spot Up and Off Screen). On the other hand, the Spurs were dominant in the post and excellent in their own right at spot-up plays, but they don’t do damage in transition. (San Antonio also led the league in “put backs” by a large degree, generating over a point of separation in that category alone.) The East’s best team, Cleveland, was above-average at everything.

*Isolation plays would be the 6th major play type. However, no team in 2016 created a point of positive or negative differentiation from isolation plays, which accounted for 8% of all plays tracked during the season. 

Goodell’s Illogical and False Deflategate Statements

It turns out that Roger Goodell, Exponent and Ted Wells just aren’t very good at logic. Whether that’s due to severe defensiveness and a major confirmation bias or something else is irrelevant. I’m not going to go into legal details or CBA issues, but I will discuss the scientific and logical errors and inconsistencies from Goodell’s appeal ruling and the hearing itself in deflategate.

Falsehood No. 1 — Timing was accounted for in the statistical test

On pg 6 of his ruling, Goodell writes:

“In reaching this conclusion, I took into account Dean Snyder’s opinion that the Exponent analysis had ignored timing…however, both [Dr. Caligiuri and Dr. Steffey] explained how timing was, in fact, taken into account in both their experimental and statistical analysis.”

This is patently false. It is not an opinion of Dr. Snyder’s. It is a fact. And it is a fact agreed upon by Dr. Caligiuri and Dr. Steffey after much runaround and refusal to answer this question. In Dr. Caligiuri’s testimony on pg 361 of the hearing:

“So the reason you don’t see a timing effect that we concluded in the statistical analysis is because it’s being masked out by the variability in the data due to these other effects.”

And then later on pg 380:

Kessler: So the initial test you did to determine whether there was anything to study did not have a timing variable?

Caligiuri: Not specifically, no.

Steffey echoes this fact on pages 429 and 430:

Kessler: This one-structured model that you chose to present as your only structured model in this appendix and in the entire report, okay, has no timing variable in it, correct?

Steffey: There’s no term in there that says time effect.

Goodell is either misrepresenting the truth or he is very, very confused and was not able to understand this issue at the hearing. Either way, once and for all, timing is not accounted for in Exponent’s statistical analysis. It is a major confound, and it does change the results when timing is indeed accounted for.

(By the way, the Exponent scientists were attempting to claim that an ordering effect is the same thing as accounting for timing, but that is also wrong. First, an ordering effect can have different increments of time (as the Patriot and Colt balls do) and second, an ordering effect is independent of time, which is relevant in an instance where another variable, like wetness, would completely mitigate the presence of an ordering effect but not undo the effect of time.)

Falsehood No. 2 — Brady’s “extraordinary volume” of communication for ball prep

On pg 8 of his ruling, as part of discrediting Brady’s testimony, Goodell reasons:

“After having virtually no communications by cellphone for the entire regular season, on January 19, the day following the AFC Championship Game, Mr. Brady and Mr. Jastremski had four cellphone conversations, totaling more than 25 minutes, exchanged 12 text messages, and, at Mr. Brady’s direction, met in the ‘QB room,’ which Mr. Jastremski had never visited before…the need for such frequent communication beginning on January 19 is difficult to square with the fact that there apparently was no need to communicate by cellphone with Mr. Jastremski or to meet personally with him in the ‘QB room’ during the preceding twenty weeks.”

This is a serious mischaracterization of facts. Let’s ignore the basic fact that there wasn’t a media frenzy surrounding Jastremski’s domain in any of the previous 20 weeks. During the hearing, Brady explained that, for the Super Bowl, Jastremski needs to prepare approximately one hundred footballs, at least eight times his normal volume.

Furthermore, Brady testified that deflategate allegations surfaced on days when he was not at the stadium because of the Super Bowl break. Frankly, it would have been stranger if he didn’t call Jastremski. The hoopla over the visit to the QB room is also bizarre, since Brady said he simply didn’t want to look for him in the stadium. There is no justification for how Goodell ignores this evidence, even taking it further and writing on pg 9:

“The sharp contrast between the almost complete absence of communication through the AFC Championship Game and the extraordinary volume of communication during the three days following the AFC Championship Game undermines any suggestion that the communication addressed only preparation of footballs for the Super Bowl.”

Yet Brady testified, in front of Goodell, that they were discussing Super Bowl preparation (of 100 balls, not 12) and the issue of alleged tampering.

Logic Error No. 1 — It has never happened…but it has happened…but that doesn’t matter

On page 3 of his ruling, Goodell writes that:

“Mr. McNally’s unannounced removal of the footballs from the locker room was a substantial breach of protocol, one that Mr. Anderson had never before experienced. Other referees interviewed said…that [McNally] had not engaged in similar conduct in the games that they had worked at Gillette Stadium.”

So Goodell is saying that McNally grabbing the balls is a huge deal and in fact, it has never even happened before! Which would then make it impossible for this to have been a regular practice.

Thus, when analyzing text messages, Goodell ignores this information and believes that McNally’s references to “Deflator” (in May) and “needles” in October of 2014 are signs of a tampering scheme, but when trying to establish the severity of the situation he believes nothing like this has ever happened before.

Similarly, during the hearing (pg 307) Ted Wells admitted that he ignored the testimony of Rita Calendar and Paul Galanis — game day employees — who claimed that McNally took the balls to the field about half of the time without the officials. Wells doesn’t even think this issue is relevant, explaining that:

“I didn’t need to drill down and decide when he walked down the hall 50 percent of the time by himself or was this person right or that person right.”

Got all that? This has been happening since at least 2014, but this is the first time something like this has ever happened. And Wells thinks it doesn’t matter whether this ever happened before or not.

Logic Error No. 2 — Jastremski expects a 13 PSI ball despite a tampering scheme

On pg 278 of the hearing, Wells acknowledges that John Jastremski texted his significant other about the Jets game and said that he expected the footballs to be 13 PSI. Amazingly, Wells believes he is telling the truth. Which creates yet another Wellsian logical impossibility.

How can Wells believe Jastremski expected the balls to be at 13 PSI for the Jets game and believe that there was a scheme to deflate the balls below 12.5? It is a completely contradictory thought. (This is similar to Jastremski’s text message that he sent to McNally about the ref causing the balls to be 16 PSI in that game, and not a message to McNally about why the balls weren’t properly deflated.)

This makes it logically impossible for there to have been a tampering scheme for that home game against the Jets. It means one of the following:

  1. There was no tampering scheme ever
  2. There was a tampering scheme, but only after October, 2014
  3. The tampering was carried out inconsistently at home

The third explanation borders on preposterous, if only because the text would then have said something like “we should deflate every week from now on to avoid this!” The other two explanations make it impossible for the comments from May, 2014 to be about deflating footballs. Yet Goodell follows suit and cites such messages as evidence of a tampering scheme (pg 10 of his ruling):

“Equally, if not more telling, is a text message earlier in 2014, in which Mr. McNally referred to himself as “the deflator.”

Goodell, like Wells before him, omits that McNally claimed the reference was about weight loss, which may sound crazy until you consider that other people use the term for weight loss, including the NFL’s own network in 2009, and that McNally himself appears to make a reference to weight loss using the term “deflate” during the 2014 Patriots-Packers game in Green Bay. (McNally was watching the game on TV from his living room, and after seeing a picture of Jastremski in a puffy jacket, texted him a message to “deflate and give someone that jacket.”)

Logic Error No. 3 — For the Colts, the Logo gauge matters. For the Patriots, it is impossible.

On pg 3 of Goodell’s ruling, he writes:

“Eleven of New England’s footballs were tested at halftime; all were below the prescribed air pressure range as measured on each of two gauges. Four of Indianapolis’s footballs were tested at halftime; all were within the prescribed air pressure range on at least one of the two gauges.”

First, this is bizarre, because it’s clear both sets of footballs lost pressure due to environmental factors. The Colts being “within the prescribed air pressure range” is simply due to their balls starting higher — Goodell knows it, you know it, every C-minus physics student in America knows it.

But what’s more problematic, and yet another assault on common sense, is that Goodell later rules that Anderson had to have used the non-logo gauge at halftime due to unassailable logic, while here he cites the Colts being “within the prescribed air pressure range” on the very gauge he deems impossible to have been used.

Logic Error No. 4 — The balls were the same wetness

Wetness or moisture is a huge issue in the science. Yet here’s what Exponent scientist Dr. Caligiuri had to say about it as an alternative explanatory factor to tampering on pg 385 of the hearing:

“It is a possibility [that the Patriots’ balls could have been much wetter than the Colts’ balls because of the fact that the Patriots were on offense all the time with the balls], but there is no evidence that that occurred. The ball boys themselves said they tried to keep them as dry as possible.”

Brady’s attorney Jeffrey Kessler then asks him to confirm:

Kessler: Well, if you are on offense and you playing with the ball, can you keep it dry when it’s out there on the field?

Caligiuri: No

Kessler: Okay. So if the Patriots have those balls out there on the field, it’s plausible those balls were wetter, sir, right? You are under oath.

Caligiuri: Sure.

Kessler: Okay. And you didn’t test of that plausible assumption, right? Did you test for it?

Caligiuri: No…

Later Caligiuri states:

“We did not test for that because there was no basis to test for that.”

Yet, there is indisputable evidence that the Patriot balls were wetter. Namely, it was raining during the game and the Patriots possessed the ball for essentially 17 consecutive minutes in real time, during the rain, to end the first half. Saying that there is no basis to test for that directly contradicts the publicly available and undisputed information. Yet, on pages 383-384 of the hearing, Caligiuri says:

“Did we look at wetness as a variability…in the beginning, no we didn’t.”

Instead, he says they looked at “extremes.” This makes plenty of sense, except there are two giant problems. First, misting a football every 15 minutes with a hand spray and then immediately toweling it off is a nonsensical proxy for constant exposure to rain. Second, it does no good to create a range of possibilities and then not test the most likely possibility, namely that one set of footballs is on the wetter end of that range and the other is on the drier end.

Logic Error No. 5 — Evidence that inflation matters = evidence of a preference for deflation

Goodell has another breakdown in logic on pg 11, footnote 9:

“Even accepting Mr. Brady’s testimony that his focus with respect to game balls is on a ball’s ‘feel’ rather than its inflation level, there is ample evidence that the inflation level of the ball does matter to him.”

Yes, there is evidence that it matters if the ball is grossly overinflated. There’s no evidence that he wants it underinflated, or that reasonable inflation levels actually matter to him. None. It is a logical fallacy to think otherwise. It’s like saying “Mr. Brady complained about his food being too salty last night, therefore there is evidence that Mr. Brady really cares about having under-salted food.”

Logic Error No. 6 — Practical Significance

Finally, lost in all the discussion of statistical significance is the issue of practical significance. This is the area that I really wish the NFLPA had attacked at the hearing, but they did not broach it at all. Ironically, it’s probably the easiest part of the science for a layperson to understand.

Let’s assume that we were 99.9999% certain that the Patriot balls were all 0.3 PSI below where they should have been at halftime based on temperature alone — right around the actual gap the projections suggest. That certainly does not mean that “tampering” is the only remaining explanation, and more importantly, tampering is not very likely if it offers no practical benefit.

What benefit would someone actually gain from a completely undetectable change in PSI? Remember, players never even knew that PSI changed with temperature in the past.

In other words, even if there is statistical significance on data that incorporates measurement time (which there isn’t), what would that data be suggesting? That Brady can magically detect differences in footballs that others can’t (and yet despite this, does not care if balls on the road are not a few tenths below 12.5), or that some other factor, like wetness, wind, temperature difference, gauge variability, inaccurate memory, etc., is a more practical explanation?

For those who missed it, Exponent themselves discovered on the order of a few tenths of a PSI difference between the actual Patriot halftime measurements and where they projected those measurements to be.
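For reference, those temperature projections come straight from the Ideal Gas Law. Here is a minimal sketch; the 12.5 PSI starting point is from the case, but the specific pre-game and field temperatures are my illustrative assumptions (the Wells Report considered a range of values):

```python
# Gay-Lussac's law: a ball's absolute pressure scales with absolute temperature.
# The temperatures below are assumptions for illustration only; the actual
# report considered a range of pre-game and on-field values.

ATM_PSI = 14.7  # approximate atmospheric pressure at sea level

def f_to_kelvin(temp_f):
    """Convert Fahrenheit to Kelvin (absolute temperature)."""
    return (temp_f - 32) * 5 / 9 + 273.15

def projected_gauge_psi(start_gauge_psi, start_temp_f, end_temp_f):
    """Project a dry ball's gauge pressure after a temperature change."""
    start_abs = start_gauge_psi + ATM_PSI
    end_abs = start_abs * f_to_kelvin(end_temp_f) / f_to_kelvin(start_temp_f)
    return end_abs - ATM_PSI

# A 12.5 PSI ball set indoors at ~71F, then measured on a ~48F field:
print(round(projected_gauge_psi(12.5, 71, 48), 2))  # ~11.32 PSI
```

Even before accounting for wetness (which drops pressure further), temperature alone erases most of the gap the league attributed to tampering.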

Bonus Logic Error — It had to be the Non-Logo gauge

I’m hesitant to discuss this Red Herring, because the difference between the Logo and Non-Logo gauge is negligible when comparing the Colt and Patriot measurements. And this makes total sense — shifting the Patriot balls down a few tenths should (and does) also shift the Colt balls down a few tenths. But let’s pause to appreciate the absurdity of this logic, and of doubling down on it by calling it “unassailable.”

On pg 7, footnote 1, Goodell writes:

“I find unassailable the logic of the Wells Report and Mr. Wells’s testimony that the non-logo gauge was used because otherwise neither the Colt’s balls nor Patriots’ balls, when tested by Mr. Anderson prior to the game, would have measured consistently with the pressure at which each team had set their footballs prior to delivery to the game officials.”

Here’s what he’s referring to, echoed by Dr. Caligiuri on pg 364 of the hearing:

“Yes, he calculated, I rounded it up. 12.17, correct, okay. And then if you look at the Colts’ balls, if the same logo gauge was used, it’s reading 12.6, 12.7. We were told that the Patriots and the Colts were insistent that they delivered balls at 12 and a half and 13, which means, geez, looks like the logo gauge wasn’t used pre-game.”

OK, now let me assail it quickly — something that was already done at the hearing Goodell presided over. The Logo gauge is inaccurate (it reads too high), and the Non-Logo gauge is much closer to the “true” reading (Exponent tested a bunch of new gauges to establish this). Based on these two facts alone, Wells and Exponent have concluded that it’s improbable the Logo gauge was used, because then the Colts’ and Patriots’ own gauges would also have to be off by a similar amount, and that’s just, I mean, geez, that’s just insane.


Except for the pesky little problem that according to the Wells Report, Exponent tested one model only, Wilson model CJ-01. A model they describe as being “similar” to the Non-Logo gauge! So their sample size to make these “unassailable” conclusions is really one.

But there’s more: Exponent discovered gauges can “drift,” or grow more inaccurate with use. It’s quite possible that the Patriot and Colt ballboys both have older gauges that have “drifted” to a similar degree. At the hearing, this was scoffed at because it would be coincidental that they were off by the same amount. Again, this doesn’t actually matter — it’s a Red Herring — but it demonstrates how poor these people are at basic logic. On pg 295, Wells said:

“Maybe lightning could strike and both the Colts and Patriots also had a gauge that just happened to be out of whack like the logo gauge. I rejected that.”

The Patriots claim to set balls at 12.6 PSI, but Anderson did not remember gauging them all at 12.6 in the pre-game. (He remembered 12.5, and had to re-inflate two balls that were under 12.5.) There are two likely explanations for this:

  1. The gloving procedure created some variability in the Patriot balls. This would make it more likely the Logo gauge was used, based on Exponent’s logic.
  2. The Patriot gauge and Anderson’s pre-game gauge are off by about 0.15 PSI.

Either way, it’s impossible for the “lightning striking” concept to even apply (that the gauges were off by an identical amount). Using Wellsian logic — which means we ignore things like gloving or temperature changes from ball to ball — the very fact that the balls weren’t 12.6, as the Patriots say, but some were under 12.5 for Anderson tells you that the two gauges are not identical. So there’s no need for “lightning to strike.”

Bonus Question: How closely did Roger Goodell read the Wells Report?

In his ruling, Goodell states that he relied on the factual and evidentiary findings of the Wells Report (pg 1) — but there are times during the appeal hearing when Goodell does not seem to know the basic facts of the case:

  • pg 49, he asks “John who?” when Brady is talking about John breaking the balls in. It’s possible this is Goodell’s way of confirming he is talking about John Jastremski; however, given the context of Brady’s explanation and Jastremski’s status as one of a handful of central figures in the case, it’s bizarre that he has to ask who John is. Does he know about Jim and Dave too?
  • pg 61, in reference to the October Jets game, he says, “Just so I’m clear, the Jets game is in New York.” This is a huge detail to not understand as it relates to the 13 PSI text mentioned above.
  • pg 177, while Edward Snyder is discussing the halftime period, he interjects “Just so I’m clear, you are saying it would take 4 minutes for 11 balls to be properly inflated? That’s your analysis or what analysis is that?” Here, Goodell is saying that he is completely unaware that the witnesses in the room at halftime provided those estimates to Wells, who relayed them to Exponent, and that those estimates are central to the scientific and statistical analysis in the case.
  • pg 180, in the discussion about “dry time” (vs moisture), Goodell asks in regards to moisture, “that’s a what-if, right?” How can the person ruling on the case, after reading a report that was designed to determine if environmental factors could explain the halftime measurements, ask if “rain” is a “what if” when it rained during the game?
  • pg 396, perhaps the clearest indication that Goodell either did not read or did not properly retain the information in the report is that he has no idea what the “gloving” issue is. This is the gloving referenced by Bill Belichick in his press conference and given an entire section by Exponent in their report.