Epstein Audit I: Notable Reactions
Alex Epstein recorded this reaction to Audit I. I don’t think we even disagree, but decide for yourself.
Meanwhile, Matthew Lilley, Australian National University Lecturer and Visiting Research Scholar at Duke University, tells me that I went much too easy on Hausfather et al.’s 2019 “Evaluating the Performance of Past Climate Model Projections,” in Geophysical Research Letters:
What do the numbers in Table 1 mean? Consider the following procedure.
Take the observed temperature series, regress it on a linear time trend, and obtain beta_t_observed.
Take the model's temperature series, regress it on a linear time trend, and obtain beta_t_model.
Then their statistic appears to be equivalent to 1 - abs(beta_t_model - beta_t_observed) / abs(beta_t_observed)
Strictly speaking, they repeat this entire procedure for 5 different observed temperature series, use a simulation procedure, and take the median of the resulting statistics.
So it's a useful number for "how accurate is our model" but not particularly for "did the models overestimate the increase in temperature", because overestimates and underestimates of the observed trend affect the statistic symmetrically. What it does show is that the models consistently beat "predict no increase at all" by a decent margin.
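To make the statistic concrete, here is a minimal Python sketch of the procedure as Lilley describes it. The series are made-up numbers, not data from the paper, and the helper names trend and skill are hypothetical. It also skips the five observational datasets, the simulation step, and the median.

```python
import numpy as np

def trend(years, temps):
    """OLS slope of temps on a linear time trend (degrees per year)."""
    # Centering the years does not change the slope; it only helps numerically.
    slope, _intercept = np.polyfit(years - np.mean(years), temps, 1)
    return slope

def skill(beta_model, beta_observed):
    """1 - |beta_model - beta_observed| / |beta_observed|, as described above."""
    return 1 - abs(beta_model - beta_observed) / abs(beta_observed)

# Synthetic, illustrative series only (assumed values, not from the paper).
years = np.arange(1990, 2020)
observed = 0.020 * (years - 1990)    # "reality" warms 0.020 per year
model_hot = 0.025 * (years - 1990)   # overestimates the trend by 0.005
model_cold = 0.015 * (years - 1990)  # underestimates the trend by 0.005
model_flat = np.zeros(len(years))    # "predict no increase at all"

beta_obs = trend(years, observed)
skill_hot = skill(trend(years, model_hot), beta_obs)
skill_cold = skill(trend(years, model_cold), beta_obs)
skill_flat = skill(trend(years, model_flat), beta_obs)

print(skill_hot, skill_cold, skill_flat)
```

In this toy setup the model that runs hot and the model that runs cold get the same score (about 0.75), while the flat "no increase" model scores about zero. That is the symmetry Lilley points out: the number tells you how close a model's trend is, not which direction it missed in.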
Separately, you note the abstract claims "most models examined showing warming consistent with observations". They evaluate this in a way that is very strange, and frankly, statistically incoherent. If "the confidence interval of the difference between the model prediction and the observed reality" overlaps with zero, they deem the model "consistent". However, the way they calculate the confidence interval for the observed reality is wrong - their calculation yields a confidence interval that is erroneously extremely wide. As a result, they deem most of the models consistent with observation. If you (incorrectly) say that the observed reality is very uncertain (e.g. you measured the temperature today, but your thermometer is on the fritz and you really just aren't sure whether it got to 80 or maxed out at 50), then just about any model - even models that say contradictory things - might have been right.
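A small Python sketch can illustrate the thermometer point. It is not the paper's actual calculation: the trends and standard errors are invented, the function name consistent is hypothetical, and the interval is a simple normal approximation that assumes the model and observational errors are independent.

```python
import math

def consistent(beta_model, se_model, beta_obs, se_obs, z=1.96):
    """True if the ~95% interval for (model - observed) contains zero.

    Assumes independent, roughly normal errors (an assumption made for
    this illustration, not the paper's method).
    """
    diff = beta_model - beta_obs
    se_diff = math.sqrt(se_model ** 2 + se_obs ** 2)
    return diff - z * se_diff <= 0 <= diff + z * se_diff

beta_obs = 0.020                         # observed trend (made-up value)
models = {"low": 0.010, "high": 0.030}   # two models that contradict each other
se_model = 0.001

# With a reasonably tight observational uncertainty, neither model passes.
tight = {name: consistent(b, se_model, beta_obs, se_obs=0.002)
         for name, b in models.items()}

# Inflate the observational uncertainty fivefold and both models pass.
wide = {name: consistent(b, se_model, beta_obs, se_obs=0.010)
        for name, b in models.items()}

print("tight:", tight)
print("wide: ", wide)
```

With the tight observational error, both the low and the high model are rejected. With the inflated one, both are "consistent", even though one predicts three times the warming of the other. The wider the claimed uncertainty about reality, the less the test can rule out.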