It turns out that Roger Goodell, Exponent and Ted Wells just aren’t very good at logic. Whether that’s due to severe defensiveness and a major confirmation bias or something else is irrelevant. I’m not going to go into legal details or CBA issues, but I will discuss the scientific and logical errors and inconsistencies from Goodell’s appeal ruling and the hearing itself in deflategate.
Falsehood No. 1 — Timing was accounted for in the statistical test
On pg 6 of his ruling, Goodell writes:
“In reaching this conclusion, I took into account Dean Snyder’s opinion that the Exponent analysis had ignored timing…however, both [Dr. Caligiuri and Dr. Steffey] explained how timing was, in fact, taken into account in both their experimental and statistical analysis.”
This is patently false. It is not an opinion of Dr. Snyder’s. It is a fact. And it is a fact agreed upon by Dr. Caligiuri and Dr. Steffey after much runaround and refusal to answer this question. In Dr. Caligiuri’s testimony on pg 361 of the hearing:
“So the reason you don’t see a timing effect that we concluded in the statistical analysis is because it’s being masked out by the variability in the data due to these other effects.”
And then later on pg 380:
Kessler: So the initial test you did to determine whether there was anything to study did not have a timing variable?
Caligiuri: Not specifically, no.
Steffey echoes this fact on pages 429 and 430:
Kessler: This one-structured model that you chose to present as your only structured model in this appendix and in the entire report, okay, has no timing variable in it, correct?
Steffey: There’s no term in there that says time effect.
Goodell is either misrepresenting the truth or he is very, very confused and was not able to understand this issue at the hearing. Either way, once and for all, timing is not accounted for in Exponent’s statistical analysis. It is a major confound, and it does change the results when timing is indeed accounted for.
(By the way, the Exponent scientists were attempting to claim that an ordering effect is the same thing as accounting for timing, but that is also wrong. First, an ordering effect can have different increments of time (as the Patriot and Colt balls do) and second, an ordering effect is independent of time, which is relevant in an instance where another variable, like wetness, would completely mitigate the presence of an ordering effect but not undo the effect of time.)
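To make the timing confound concrete, here is a minimal sketch, not Exponent's model. It assumes a ball brought indoors warms back toward its equilibrium pressure along a simple exponential curve. The function name, the pressures, the time constant and the measurement schedule are all made-up illustrative values. The point is only that two physically identical, untampered sets of balls can show different average readings purely because one set was gauged later.

```python
import math

def gauge_pressure_at(minutes_indoors, p_field=11.3, p_equilibrium=12.5, tau_minutes=10.0):
    """Gauge pressure (PSI) of a ball that has been warming indoors.

    Exponential approach to equilibrium. Every default here is a
    hypothetical illustrative value, not a measured one.
    """
    return p_equilibrium - (p_equilibrium - p_field) * math.exp(-minutes_indoors / tau_minutes)

# Two sets of balls that are physically identical and untampered with.
# Only the (hypothetical) measurement schedule differs, in minutes indoors.
early_times = [2.0 + 0.4 * i for i in range(11)]  # 11 balls gauged first
late_times = [9.0 + 0.4 * i for i in range(4)]    # 4 balls gauged later

early_mean = sum(gauge_pressure_at(t) for t in early_times) / len(early_times)
late_mean = sum(gauge_pressure_at(t) for t in late_times) / len(late_times)

# With no elapsed-time term in the model, this gap has nowhere to go
# except into the "which team" comparison.
timing_only_gap = late_mean - early_mean

print(f"early set mean: {early_mean:.2f} PSI")
print(f"late set mean:  {late_mean:.2f} PSI")
print(f"gap from timing alone: {timing_only_gap:.2f} PSI")
```

Note that the ball's position in the measurement order (1st, 2nd, 3rd...) says nothing about how many minutes separate the two sets, which is why an ordering term is not a substitute for a time term.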
Falsehood No. 2 — Brady’s “extraordinary volume” of communication for ball prep
On pg 8 of his ruling, as part of discrediting Brady’s testimony, Goodell reasons:
“After having virtually no communications by cellphone for the entire regular season, on January 19, the day following the AFC Championship Game, Mr. Brady and Mr. Jastremski had four cellphone conversations, totaling more than 25 minutes, exchanged 12 text messages, and, at Mr. Brady’s direction, met in the ‘QB room,’ which Mr. Jastremski had never visited before…the need for such frequent communication beginning on January 19 is difficult to square with the fact that there apparently was no need to communicate by cellphone with Mr. Jastremski or to meet personally with him in the ‘QB room’ during the preceding twenty weeks.”
This is a serious mischaracterization of facts. Let’s ignore the basic fact that there wasn’t a media frenzy surrounding Jastremski’s domain in any of the previous 20 weeks. During the hearing, Brady explained that, for the Super Bowl, Jastremski needs to prepare approximately one hundred footballs, at least eight times his normal volume.
Furthermore, Brady testified that deflategate allegations surfaced on days when he was not at the stadium because of the Super Bowl break. Frankly, it would have been stranger if he didn’t call Jastremski. The hoopla over the visit to the QB room is also bizarre, since Brady said he simply didn’t want to look for him in the stadium. There is no justification for how Goodell ignores this evidence, even taking it further and writing on pg 9:
“The sharp contrast between the almost complete absence of communication through the AFC Championship Game and the extraordinary volume of communication during the three days following the AFC Championship Game undermines any suggestion that the communication addressed only preparation of footballs for the Super Bowl.”
Yet Brady testified, in front of Goodell, that they were discussing Super Bowl preparation (of 100 balls, not 12) and the issue of alleged tampering.
Logic Error No. 1 — It has never happened…but it has happened…but that doesn’t matter
On page 3 of his ruling, Goodell writes that:
“Mr. McNally’s unannounced removal of the footballs from the locker room was a substantial breach of protocol, one that Mr. Anderson had never before experienced. Other referees interviewed said…that [McNally] had not engaged in similar conduct in the games that they had worked at Gillette Stadium.”
So Goodell is saying that McNally grabbing the balls is a huge deal and in fact, it has never even happened before! Which would then make it impossible for this to have been a regular practice.
Thus, when analyzing text messages, Goodell ignores this information and believes that McNally’s references to “Deflator” (in May) and “needles” in October of 2014 are signs of a tampering scheme, but when trying to establish the severity of the situation he believes nothing like this has ever happened before.
Similarly, during the hearing (pg 307) Ted Wells admitted that he ignored the testimony of Rita Calendar and Paul Galanis — game day employees — who claimed that McNally took the balls to the field about half of the time without the officials. Wells doesn’t even think this issue is relevant, explaining that:
“I didn’t need to drill down and decide when he walked down the hall 50 percent of the time by himself or was this person right or that person right.”
Got all that? This has been happening since at least 2014, but this is the first time something like this has ever happened. And Wells thinks it doesn’t matter whether this ever happened before or not.
Logic Error No. 2 — Jastremski expects a 13 PSI ball despite a tampering scheme
On pg 278 of the hearing, Wells acknowledges that John Jastremski texted his significant other about the Jets game and said that he expected the footballs to be 13 PSI. Amazingly, Wells believes he is telling the truth, which creates yet another Wellsian logical impossibility.
How can Wells believe Jastremski expected the balls to be at 13 PSI for the Jets game and believe that there was a scheme to deflate the balls below 12.5? It is a completely contradictory thought. (This is similar to Jastremski’s text message that he sent to McNally about the ref causing the balls to be 16 PSI in that game, and not a message to McNally about why the balls weren’t properly deflated.)
This makes it logically impossible for there to have been a tampering scheme for that home game against the Jets. This either means that:
- There was no tampering scheme ever
- There was a tampering scheme, but only after October, 2014
- There was a tampering scheme, but it was carried out inconsistently at home
The third explanation borders on preposterous, namely because the text still would have said something like “we should deflate every week from now on to avoid this!” The other two explanations make it impossible for the comments from May, 2014 to be about deflating footballs. Yet Goodell follows suit and cites such messages as evidence of a tampering scheme (pg 10 of his ruling):
“Equally, if not more telling, is a text message earlier in 2014, in which Mr. McNally referred to himself as ‘the deflator.’”
Goodell, like Wells before him, omits that McNally claimed the reference was about weight loss, which may sound crazy until you consider that other people use the term for weight loss, including the NFL’s own network in 2009, and that McNally himself appears to make a reference to weight loss using the term “deflate” during the Patriots-Packers game in 2014 in Green Bay. (McNally was watching the game on TV from his living room, and after seeing a picture of Jastremski, suddenly in a large, puffy jacket, texted him a message to “deflate and give someone that jacket.”)
Logic Error No. 3 — For the Colts, the Logo gauge matters. For the Patriots, it is impossible.
On pg 3 of Goodell’s ruling, he writes:
“Eleven of New England’s footballs were tested at halftime; all were below the prescribed air pressure range as measured on each of two gauges. Four of Indianapolis’s footballs were tested at halftime; all were within the prescribed air pressure range on at least one of the two gauges.”
First, this is bizarre, because it’s clear both sets of footballs lost pressure due to environmental factors. The Colts being “within the prescribed air pressure range” is simply due to their balls starting higher — Goodell knows it, you know it, every c-minus physics student in America knows it.
But what’s more problematic, and yet another assault on common sense, is that Goodell later rules that Anderson had to have used the non-logo gauge at halftime due to unassailable logic, but here he references the Colts being “within the prescribed air pressure range” on a gauge he considers to have been impossible to have been used.
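The claim that both sets of footballs lose about the same pressure from temperature alone can be checked with a back-of-the-envelope calculation. This is a minimal sketch using Gay-Lussac's law (pressure proportional to absolute temperature at fixed volume), which ignores wetness and the warm-up before measurement. The starting pressures of 12.5 and 13 PSI are the ones the teams said they delivered. The 71°F locker room and 48°F field temperatures are assumptions for illustration, as are all the function and variable names.

```python
PSI_ATMOSPHERE = 14.7  # standard atmospheric pressure in PSI

def fahrenheit_to_kelvin(temp_f):
    return (temp_f - 32.0) * 5.0 / 9.0 + 273.15

def gauge_after_temp_change(start_gauge_psi, start_temp_f, end_temp_f):
    """Gauge pressure after a temperature change at fixed volume."""
    # The gas law applies to absolute pressure, so add atmospheric first.
    start_abs = start_gauge_psi + PSI_ATMOSPHERE
    ratio = fahrenheit_to_kelvin(end_temp_f) / fahrenheit_to_kelvin(start_temp_f)
    return start_abs * ratio - PSI_ATMOSPHERE

LOCKER_ROOM_F = 71.0  # assumed pre-game temperature
FIELD_F = 48.0        # assumed on-field temperature

patriots_field = gauge_after_temp_change(12.5, LOCKER_ROOM_F, FIELD_F)
colts_field = gauge_after_temp_change(13.0, LOCKER_ROOM_F, FIELD_F)

patriots_drop = 12.5 - patriots_field
colts_drop = 13.0 - colts_field

print(f"Patriots: {patriots_field:.2f} PSI on the field, drop of {patriots_drop:.2f}")
print(f"Colts:    {colts_field:.2f} PSI on the field, drop of {colts_drop:.2f}")
```

Under these assumptions both balls lose well over 1 PSI, and the drops differ by only a couple of hundredths. The ball that started half a PSI higher simply ends half a PSI higher.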
Logic Error No. 4 — The balls were the same wetness
Wetness or moisture is a huge issue in the science. Yet here’s what Exponent scientist Dr. Caligiuri had to say about it as an alternative explanatory factor to tampering on pg 385 of the hearing:
“It is a possibility [that the Patriots’ balls could have been much wetter than the Colts’ balls because of the fact that the Patriots were on offense all the time with the balls], but there is no evidence that that occurred. The ball boys themselves said they tried to keep them as dry as possible.”
Brady’s attorney Jeffrey Kessler then asks him to confirm:
Kessler: Well, if you are on offense and you playing with the ball, can you keep it dry when it’s out there on the field?
Kessler: Okay. So if the Patriots have those balls out there on the field, it’s plausible those balls were wetter, sir, right? You are under oath.
Kessler: Okay. And you didn’t test of that plausible assumption, right? Did you test for it?
Later Caligiuri states:
“We did not test for that because there was no basis to test for that.”
Yet there is indisputable evidence that the Patriot balls were wetter. Namely, it was raining during the game and the Patriots possessed the ball for essentially 17 consecutive minutes in real time, during the rain, to end the first half. Saying that there is no basis to test for that is a direct contradiction of the publicly available and undisputed information. Yet, on pages 383-384 of the hearing, Caligiuri says:
“Did we look at wetness as a variability…in the beginning, no we didn’t.”
Instead, he says they looked at “extremes.” This makes plenty of sense, except there are two giant problems. First, misting a football every 15 minutes with a hand spray and then immediately toweling it off is a nonsensical proxy for constant exposure to rain. Second, it does no good to create a range of possibilities and then not test the most likely possibility, namely that one set of footballs is on the wetter end of that range and the other is on the drier end.
Logic Error No. 5 — Evidence that inflation mattering = deflation mattering/preference for deflation
Goodell has another breakdown in logic on pg 11, footnote 9:
“Even accepting Mr. Brady’s testimony that his focus with respect to game balls is on a ball’s ‘feel’ rather than its inflation level, there is ample evidence that the inflation level of the ball does matter to him.”
Yes, there is evidence that it matters if the ball is grossly overinflated. There’s no evidence that he wants it underinflated, or that reasonable inflation levels actually matter to him. None. It is a logical fallacy to think otherwise. It’s like saying “Mr. Brady complained about his food being too salty last night, therefore there is evidence that Mr. Brady really cares about having under-salted food.”
Logic Error No. 6 — Practical Significance
Finally, lost in all the discussion of the statistical significance is the issue of practical significance. This is the area that I really wish the NFLPA would have attacked at the hearing, but they did not broach it at all. Ironically, it’s probably the easiest part of the science for the lay person to understand.
Let’s assume that we were 99.9999% certain that the Patriot balls were all 0.3 PSI below where they should have been at halftime based on temperature alone — right around the actual number we think they are based on projections. That certainly does not mean that “tampering” is the only alternative explanation, and more importantly, it’s not very likely if the real-world explanation is not practical.
What benefit would someone actually gain from a completely undetectable change in PSI? Remember, players have never even known there were PSI changes from temperature in the past.
In other words, even if there is statistical significance on data that incorporates measurement time (which there isn’t), what would that data be suggesting? That Brady can magically detect differences in footballs that others can’t (and yet despite this, does not care if balls on the road are not a few tenths below 12.5), or that some other factor, like wetness, wind, temperature difference, gauge variability, inaccurate memory, etc., is a more practical explanation?
For those who missed it, Exponent themselves discovered a difference on the order of a few tenths of a PSI between the Patriots’ actual halftime measurements and where they projected their measurements to be.
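To get a feel for how small 0.3 PSI is, it helps to convert it into the temperature change that would produce the same shift on its own. This is a rough sketch using the same fixed-volume gas law. The function name, the 11.5 PSI gauge reading and the 60°F ball temperature are assumed values for illustration only.

```python
PSI_ATMOSPHERE = 14.7  # standard atmospheric pressure in PSI

def equivalent_temp_change_f(delta_psi, gauge_psi, temp_f):
    """Temperature change (in °F) that alone would shift pressure by delta_psi."""
    absolute_psi = gauge_psi + PSI_ATMOSPHERE
    temp_kelvin = (temp_f - 32.0) * 5.0 / 9.0 + 273.15
    # At fixed volume, dP / P_abs = dT / T.
    delta_kelvin = temp_kelvin * delta_psi / absolute_psi
    return delta_kelvin * 9.0 / 5.0  # a 1 K change is a 1.8 °F change

shortfall_psi = 0.3  # the hypothetical shortfall discussed above
degrees_f = equivalent_temp_change_f(shortfall_psi, gauge_psi=11.5, temp_f=60.0)

print(f"{shortfall_psi} PSI is equivalent to about {degrees_f:.1f} °F of temperature change")
```

Under these assumptions the answer comes out to roughly six degrees Fahrenheit, which is the scale of effect the whole case rests on.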
Bonus Logic Error — It had to be the Non-Logo gauge
I’m hesitant to discuss this Red Herring, because the difference is negligible between the Logo and Non-Logo gauge when comparing the Colt and Patriot measurements. And this makes total sense — shifting the Patriot balls down a few tenths should (and does) also shift the Colt balls down a few tenths. But let’s pause to appreciate the absurdity of this logic, and then doubling-down to call it “unassailable.”
On pg 7, footnote 1, Goodell writes:
“I find unassailable the logic of the Wells Report and Mr. Wells’s testimony that the non-logo gauge was used because otherwise neither the Colt’s balls nor Patriots’ balls, when tested by Mr. Anderson prior to the game, would have measured consistently with the pressure at which each team had set their footballs prior to delivery to the game officials.”
Here’s what he’s referring to, echoed by Dr. Caligiuri on pg 364 of the hearing:
“Yes, he calculated, I rounded it up. 12.17, correct, okay. And then if you look at the Colts’ balls, if the same logo gauge was used, it’s reading 12.6, 12.7. We were told that the Patriots and the Colts were insistent that they delivered balls at 12 and a half and 13, which means, geez, looks like the logo gauge wasn’t used pre-game.”
OK, now let me assail it quickly, something that was already done at the hearing which Goodell presided over. The Logo gauge is inaccurate (reads too high) and the Non-Logo gauge is much closer to the “true” reading. Exponent tested a bunch of new gauges. Based on these two facts alone, Wells and Exponent have concluded that it’s improbable the Logo gauge was used because then the Colt and Patriot gauges would also have to be off by a similar amount, and that’s just, I mean, geez, that’s just insane.
Except for the pesky little problem that according to the Wells Report, Exponent tested one model only, Wilson model CJ-01. A model they describe as being “similar” to the Non-Logo gauge! So their sample size to make these “unassailable” conclusions is really one.
But there’s more: Exponent discovered gauges can “drift,” or grow more inaccurate with use. It’s quite possible that the Patriot and Colt ballboys both have older gauges that have “drifted” to a similar degree. At the hearing, this was scoffed at because it would be coincidental that they were off by the same amount. Again, this doesn’t actually matter — it’s a Red Herring — but it demonstrates how poor these people are at basic logic. On pg 295, Wells said:
“Maybe lightning could strike and both the Colts and Patriots also had a gauge that just happened to be out of whack like the logo gauge. I rejected that.”
The Patriots claim to set balls at 12.6 PSI, but Anderson did not remember gauging them all at 12.6 in the pre-game. (He remembered 12.5, and had to re-inflate two balls that were under 12.5.) There are two likely explanations for this:
- The gloving procedure created some variability in the Patriot balls. This would make it more likely that the Logo gauge was used, based on Exponent’s logic.
- The Patriot gauge and Anderson’s pre-game gauge are off by about 0.15 PSI.
Either way, it’s impossible for the “lightning striking” concept to even apply (that the gauges were off by an identical amount). Using Wellsian logic (which means we ignore things like gloving or temperature changes from ball to ball), the very fact that the balls weren’t 12.6 as the Patriots say but some were under 12.5 for Anderson tells you that the two gauges are not identical. So there’s no need for “lightning to strike.”
Bonus Question: How closely did Roger Goodell read the Wells Report?
In his ruling, Goodell states that he relied on the factual and evidentiary findings (pg 1) of the Wells Report, but there are times during the appeal hearing when Goodell does not seem to know the basic case facts:
- pg 49, he asks “John who?” when Brady is talking about John breaking the balls in. It’s possible this is Goodell’s way of confirming he is talking about John Jastremski. However, given the context of Brady’s explanation and Jastremski being one of a handful of central figures in the case, it’s bizarre that he has to ask who John is. Does he know about Jim and Dave too?
- pg 61, in reference to the October Jets game, he says, “Just so I’m clear, the Jets game is in New York.” This is a huge detail to not understand as it relates to the 13 PSI text mentioned above.
- pg 177, while Edward Snyder is discussing the halftime period, he interjects “Just so I’m clear, you are saying it would take 4 minutes for 11 balls to be properly inflated? That’s your analysis or what analysis is that?” Here, Goodell is saying that he is completely unaware that the witnesses in the room at halftime provided those estimates to Wells, who relayed them to Exponent, and that those estimates are central to the scientific and statistical analysis in the case.
- pg 180, in the discussion about “dry time” (vs moisture), Goodell asks in regards to moisture, “that’s a what-if, right?” How can the person ruling on the case, after reading a report that was designed to determine if environmental factors could explain the halftime measurements, ask if “rain” is a “what if” when it rained during the game?
- pg 396, perhaps the clearest indication that Goodell either did not read or did not properly retain the information in the report is that he has no idea what the “gloving” issue is. This is the gloving referenced by Bill Belichick in his press conference and given an entire section by Exponent in their report.