Deflategate: Exponent’s Bias and the Master Error

With all of the publicized corrections to the science section of the Wells Report, I’ve been asked by more than one person whether Exponent, the author of said section, was simply incompetent, or whether they were biased. It’s a question that might have legal ramifications in the near future for Tom Brady.

As I’ll detail below, there is a body of evidence suggesting that Exponent’s report was not merely the result of bad science, but was conducted with a clear anti-Patriot bias. They repeatedly made errors, or only looked at possibilities, that weakened the Patriots’ position, without ever making an error in the Patriots’ favor. The nature and frequency of these errors make coincidence unlikely. Furthermore, Exponent committed a major error in one of their key figures, an error that allowed them to report, incorrectly, an anti-Patriot conclusion back to Ted Wells. What exactly am I referring to?

Not accounting for time of halftime measurements

At a high level, the biggest methodological error Exponent commits is not properly accounting for differences in when the balls were measured at halftime. This leads to a nonsensical statistical test that they publish to establish “statistical significance.” The problem is, they knew about this factor. They too considered it salient. They made multiple transient curves mapping how ball pressure changes depending on when the balls were measured at halftime.

They didn’t stop there.

They dedicated an entire section (Table 13, page 58) to perform a mini-version of the analysis I present here, using periods of “average measurement time” to compare the difference between expected PSI and observed PSI at a given time.

Wells writes, on page 122:

“According to Exponent, the environmental conditions with the most significant impact on the halftime measurements were the temperature in the Officials Locker Room when the game balls were tested prior to the game and at halftime, the temperature on the field during the first half of the game, the amount of time elapsed between when the game balls were brought back to the Officials Locker Room at halftime and when they were tested, and whether the game balls were wet or dry when they were tested.”

So they thought a lot about the impact of the timing of halftime measurements. On page 57, in one of many mentions of this:

“A similar effect is seen in the game day simulation data; the average pressure rises as the average measurement time is increased.”

Again on page 62:

“Based on the transient curves explained above, one would expect that if the Patriots footballs were set to a consistent or relatively consistent starting pressure, the pressure would rise relatively consistently as they were tested later in the Locker Room Period.”

Yet they still published their p-values on page 11 and conducted analyses in the opening pages without considering time! This cannot be due to incompetence since they are keenly aware of and explicitly call out the importance of time on multiple occasions. On page 64, in their concluding statement, their second point cites these statistical tests as critical pieces of evidence supporting their conclusion. Unless different people prepared different parts of the report, this is evidence of a clear bias against the Patriots. But it’s also just the beginning.

Switching Fig. 26 to the extreme low temperature of 67 degrees

The transient curve used in Figure 24 to project Non-Logo gauge results uses a pre-game room temperature of 71 degrees. The HVAC on the day of the game was set between 71 and 74 degrees. But Exponent measured the temperature in the room where the balls were gauged by officials in the pre-game to range from 67-71 degrees. It was a good 30 degrees colder outside on the day Exponent measured, and there wasn’t the same game day activity where numerous people give off extra heat in the room.

When they project the Logo gauge results on the transient curve used in Fig. 26, they switch the pre-game temperature to 67 degrees, the extreme end of the plausible spectrum and the one that produces the lowest Patriot reading. Their explanation for using 67 degrees is that it makes the Colt measurements align with the projections. This is a reasonable approach, given that the Colt balls “should” obey the laws of physics, but (a) it should not be the only scenario examined and (b) they did not need to drop the pre-game temp all the way to 67 degrees to achieve this! Doing so only increased the appearance of guilt for the Patriots. The Colt readings are still viable and within Exponent’s “range” of what is predicted by physics even with a 69 degree pre-game temperature.

Misting the footballs to simulate rain

When accounting for water, as described on page 42 (footnote 36), footballs were sprayed every 15 minutes with a handheld spray bottle and then toweled off immediately. As has been demonstrated, this is a minimal attempt at simulating rain. This is critical to interpreting the results (which are discussed below and which reflect those presented here); Exponent’s wet curves between Figure 24 and Figure 26 show an additional effect of about 0.25 PSI due to wetness simply from running the simulation again. Yet, as we’ll see in a second, they cannot imagine how the Patriot footballs could be a few tenths below where they were expected to be based on temperature-only projections.

Not calculating the actual PSI differences from expected

The mini experiment Exponent runs in Table 13 produces the following results: at the earliest plausible time (let’s use the 4:17 reading), Patriot averages should have been 11.54 PSI on the Non-Logo gauge. The actual Master-adjusted halftime average on the Non-Logo gauge was 11.09 PSI. So the Patriots are -0.45 PSI from expected. The Colts Non-Logo average was 12.29 according to Table 11 on pg 45. (This is because Exponent uses the “switch” option to correct for the anomalous 3rd Colt ball.) Therefore, the Patriot balls are about 0.4 PSI below the Colt balls relative to expected. Is that clear from Table 13?
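The Patriots half of that arithmetic can be sketched in a few lines of Python. The two inputs are the values quoted above; the helper name `psi_gap` is my own and purely illustrative. The Colts' expected value is not quoted in the text, so it is left out here rather than guessed.

```python
def psi_gap(observed_psi: float, expected_psi: float) -> float:
    """Observed minus expected pressure; negative means below the projection."""
    return round(observed_psi - expected_psi, 2)

# Values quoted in the text: Non-Logo gauge, Master-adjusted, 4:17 reading.
patriots_expected = 11.54
patriots_observed = 11.09

patriots_gap = psi_gap(patriots_observed, patriots_expected)
print(f"Patriots vs. expected: {patriots_gap:+.2f} PSI")
```

Running the same subtraction for the Colts row, and then subtracting one gap from the other, gives the relative difference that Table 13 never states.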

Exponent Table 13

Not only is it unclear, Exponent never even publishes the differences. They fail to calculate or discuss perhaps the most specific and important detail of all of their experimentation, instead simply noting that the Colt readings are in line with these simulations and the Patriot readings are not. This is not incompetence; it is a bias of omission. More importantly, are the Colt measurement times in Table 13 even plausible?

Assuming the Colt balls are measured before the Patriot balls

Exponent assumes, contrary to the evidence, that the Colt balls were gauged before the 11 Patriot balls were reinflated. This is yet another anti-Patriot “error” or instance where they refuse to examine other plausible scenarios. The repeated and consistent manner in which this happens is hard to chalk up to coincidental incompetence.

Wells does not explicitly state that the Colt balls were gauged before the Patriot balls were re-inflated. Exponent should have asked about this and should have clearly stated it if it was provided such information. If not, they should have, “to be fair,” at least considered the possibility that the Colt balls were gauged later in the locker room period as an explanation for the differences of a few tenths of a PSI.

Burying the Logo and Non-Logo average PSI results

So, what happens if they explicitly note the PSI differences in their table and include Colt measurements at 11 or 12 minutes, the times at which the Colt balls were likely gauged?

Table 13 Updated

An updated version of Exponent’s Table 13, showing Non-Logo Gauge Master-Adjusted results with a 71-degree pre-game temperature. This table includes a later measurement time for the Colts and explicitly calls out the differences between the expected and observed halftime values.

Now, for example, it’s crystal clear that an approximate 4-and-a-half minute measure time for New England and 11-minute measure time for Indianapolis result in a difference of 0.3 PSI on the Non-Logo gauge between the Patriot and Colt balls. This is similar to what has been observed in more detailed analyses.

Forget the inclusion of a later Colt measurement though. Why doesn’t Exponent call out that differential since it’s perhaps the single most salient data point in their entire report? Without any corrections, it would reveal differences of a few tenths of PSI between the control (Colts) and Patriot Non-Logo readings. Would publishing that number have impacted people’s reactions to their conclusions?

What about the Logo gauge experiment in Table 14? The Patriot Master-adjusted Logo halftime average value was 11.21 PSI, hidden in the paragraph on the following page, meaning that their experiment again found Patriot balls 0.3-0.4 PSI below expected on the Logo gauge, with the pre-game temperature at 67 degrees.

Table 14 Updated

An updated version of Exponent’s Table 14, showing Logo Gauge Master-Adjusted results with a 67-degree pre-game temperature. This table includes a later measurement time for the Colts and explicitly calls out the differences between the expected and observed halftime values.

Could water account for that small difference? Or a different temperature? Placing the pre-game temperature at something like 69 degrees will bring the Patriot balls about 0.1 PSI closer to expected. Again, this is something Exponent conveniently does not even consider, despite providing a plausible temperature range of 67-74 degrees and running misting tests that demonstrate an effect of wetness.

The Master Error — failing to use master projections for master results

And then there’s this enormous error.

In Figure 26 (a figure recycled in Figure 30), Exponent claims to use a Master-adjusted transient curve to demonstrate where the footballs are projected to be as they warm up at halftime. Only they fail to present an adjusted curve! Figure 26 is simply wrong.

The curve shows a dry starting halftime value of over 11.5 PSI for the expected Patriot values. But according to Exponent, a Master-adjusted Patriot ball would actually be 12.17 PSI in the pre-game. A dry football set to 12.17 PSI in a 67-degree pre-game environment, as Exponent is attempting to model, is expected to be 11.20 PSI at 48 degrees. The graph is not Master-adjusted, even though Exponent claims it is. It is a clear error and needs to be corrected.
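The 11.20 PSI figure can be checked with the ideal gas law at constant volume, where absolute pressure scales with absolute temperature. The sketch below is my own back-of-the-envelope check, not Exponent's model. It assumes a standard atmosphere of 14.7 PSI to convert gauge readings to absolute pressure, and the function names are hypothetical.

```python
ATMOSPHERIC_PSI = 14.7  # assumption: standard sea-level atmosphere


def fahrenheit_to_rankine(temp_f: float) -> float:
    """Convert Fahrenheit to an absolute temperature scale (Rankine)."""
    return temp_f + 459.67


def dry_gauge_psi(start_psi: float, start_temp_f: float, end_temp_f: float,
                  atm: float = ATMOSPHERIC_PSI) -> float:
    """Gauge pressure of a dry, fixed-volume ball after a temperature change."""
    start_abs = start_psi + atm  # gauge -> absolute
    ratio = fahrenheit_to_rankine(end_temp_f) / fahrenheit_to_rankine(start_temp_f)
    return start_abs * ratio - atm  # absolute -> gauge


# 12.17 PSI set in a 67-degree room, then cooled to 48 degrees on the field.
halftime_start = dry_gauge_psi(12.17, 67, 48)
print(f"Expected dry starting value: {halftime_start:.2f} PSI")
```

Under those assumptions the result rounds to 11.20 PSI, well below the 11.5-plus starting point drawn in Figure 26.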

What happens when it is corrected?

The Logo scenario that Exponent presents to support its case suddenly contradicts it. It makes their primary conclusion on page 55 simply wrong:

“Based on the above conclusions, although the relative ‘explainability’ of the results from Game Day are dependent on which gauge was used by Walt Anderson prior to the game, given the most likely timing of events during halftime, the Patriots halftime measurements do not appear to be explained by the environmental factors tested, regardless of the gauge used.”

Correcting this huge error would fundamentally alter this conclusion.

Incorrectly claiming that the pre-game temperature is set to help the Patriots

They continue to write, on page 54, that

“it is important again to note that values for the pre-game and halftime locker room temperatures shown in Figure 27 put the Patriots transient curves at their lowest possible positions.”

But this is completely backwards — yet another anti-Patriot error. In order to generate the lowest starting transient curve within the HVAC parameters, the pre-game temperature would be 74 degrees, producing a starting halftime value of 10.86 PSI. 67 degrees is actually the worst starting value for the Patriot differentials.
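The direction of this effect is easy to check with a constant-volume ideal gas calculation. The sketch below sweeps the pre-game temperature across the 67 to 74 degree range, using the 12.17 PSI Master-adjusted starting pressure and 48 degree field temperature quoted earlier. The 14.7 PSI atmosphere is my assumption, not a number from the report, and the function name is hypothetical.

```python
ATMOSPHERIC_PSI = 14.7   # assumption: standard sea-level atmosphere
START_PSI = 12.17        # Master-adjusted pre-game pressure quoted above
FIELD_TEMP_F = 48        # field temperature quoted above


def expected_halftime_psi(pregame_temp_f: float) -> float:
    """Dry gauge pressure at field temperature for a given pre-game room temperature."""
    start_abs = START_PSI + ATMOSPHERIC_PSI
    ratio = (FIELD_TEMP_F + 459.67) / (pregame_temp_f + 459.67)
    return start_abs * ratio - ATMOSPHERIC_PSI


# Sweep the plausible pre-game range, one degree at a time.
sweep = {temp: expected_halftime_psi(temp) for temp in range(67, 75)}

for temp, psi in sweep.items():
    print(f"{temp} F pre-game -> {psi:.2f} PSI expected at halftime start")

lowest_temp = min(sweep, key=sweep.get)
highest_temp = max(sweep, key=sweep.get)
print(f"Lowest curve: {lowest_temp} F, highest curve: {highest_temp} F")
```

A warmer pre-game room means a bigger temperature drop and therefore a lower expected halftime pressure. So 74 degrees gives the lowest curve, and 67 degrees gives the highest one, which leaves the largest gap to the Patriot measurements.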

Inability to conceive of wetness as the explainable natural factor

The icing on the cake is that the differences in the Colt and Patriot measurements are in all likelihood the difference in their exposure to rain. For the uninitiated, this can be clearly seen in the gradient of differences among the Patriot balls that suggests some Patriot balls were exposed to more rain, and in particular those balls on the final drive of the half.

Yet on page 55, when discussing wetness as a factor, they write:

“According to Paul, Weiss, [a majority of wet balls] were most likely not present on Game Day.”

How can they say that, given the factors around wetness? They mention nothing of the Patriot balls being used more, and being in play at the end of the half. This is yet another anti-Patriot oversight. Remember, they presented back-to-back graphics in which water made differences on the order of 0.2-0.4 PSI from the “dry” condition, based on their own misting procedure. Despite the game being played in rain, Exponent concludes that results of the exact same magnitude cannot be explained by rain.


All told, the only time they seem to do something that isn’t anti-Patriot is when they create a row in Tables 13 and 14 for average measurement times that are improbably early in the locker room period. Otherwise, every misstep, omission and blatant error is decidedly anti-Patriot, and often committed in inexplicable fashion. In summary, Exponent demonstrates the following biases by:

  • Failing to account for the timing of halftime measurements when publishing p-values, despite knowing time of measurement is critical
  • Switching to an (unnecessarily) extremely low temperature projection for the Patriot Logo gauge
  • Misting footballs to simulate rain (and immediately toweling them off)
  • Not publishing the actual PSI differences between halftime measurements and expected measurements
  • Assuming the Colt balls are measured improbably early in the locker room period, and not considering later measurement times
  • Presenting Figures 26 and 30 with completely false transient curves, thereby altering their conclusions vis-a-vis the Logo gauge
  • Incorrectly claiming the pre-game indoor temperature of 67 degrees is a best-case for the Patriots
  • Not considering wetness as an explanation for the few tenths difference despite finding a few tenths difference from wetness