The model of potential COVID-19 pandemic outcomes prepared by the University of Washington Institute for Health Metrics and Evaluation has taken on a vastly outsized role across the United States. The numbers it produced may have had value, along with the earlier study from the Imperial College COVID-19 Response Team, in finally shocking Donald Trump into doing something other than dismissing the whole pandemic as nothing but "the regular flu." But there has been an unfortunate tendency to treat the UW IHME model as something far more than its very simplistic parts can support.
The update to this model that appeared on Monday dropped its projection of overall national deaths from COVID-19 by more than 10,000, setting a new central estimate of 81,800 deaths, and cut the top of its overall range by nearly 40,000, to a maximum of 136,000 through the beginning of August. These are still hideous numbers, but the shift in the good direction was enough to have Donald Trump shouting about "light at the end of the tunnel" on Monday morning. However, there are very, very big reasons the UW IHME model should be getting much less attention, not least of all the fact that it changed its projection by a huge amount on just four days of new data.
Right off the bat, any model whose top line went from 175,000 to 136,000 on the basis of only four days' data is not a good predictive model. But then, the IHME model isn't much more than a quadratic equation with a few inputs. The whole nature of the UW IHME model appears to make it intensely sensitive to very small changes in the early data set, so much so that at any given point it is extremely likely to reflect outcomes that are little more than statistical noise.
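To make that sensitivity concrete, here is a minimal, purely illustrative Python sketch. It is not the institute's actual code, data, or parameters; it simply assumes, per the description above, that the model amounts to fitting a Gaussian-shaped epidemic curve (a quadratic in log space) to a handful of early daily death counts. The two made-up data series differ by only a handful of daily deaths over their last three days, yet the projected totals land more than an order of magnitude apart.

```python
# Illustrative only: a toy curve fit in the spirit of "a quadratic equation
# with a few inputs." Not the institute's actual model, data, or parameters.
import numpy as np

def projected_total(daily_deaths, horizon=120):
    """Fit log(daily deaths) with a quadratic (a Gaussian-shaped epidemic
    curve) and return the implied death total over `horizon` days."""
    days = np.arange(len(daily_deaths))
    coeffs = np.polyfit(days, np.log(daily_deaths), deg=2)  # quadratic in log space
    future = np.arange(horizon)
    return np.exp(np.polyval(coeffs, future)).sum()

# Two made-up early series that differ only slightly over the last three days.
series_a = [2, 3, 5, 8, 13, 21, 30, 41]
series_b = [2, 3, 5, 8, 13, 19, 25, 31]

print(f"projected total, series A: {projected_total(series_a):,.0f}")
print(f"projected total, series B: {projected_total(series_b):,.0f}")
```

With these invented inputs, the first series projects a total in the tens of thousands while the second comes in below a thousand, the same kind of whiplash the Alabama charts below display.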
For example, this is how the model showed state results for Alabama on April 4.
This is what the chart shows for the same state following the April 5 update.
Those two charts may seem similar, unless you check the scales on the left. The first chart, based on data through April 1, showed cases peaking at 32,000. The second, based on data through April 5, shows a maximum of 2,050 cases: a difference of more than fifteenfold, well over an order of magnitude, from the input of just four days' data. That is, to put it kindly, less than useful.
The practical upshot of this massive swing: the April 1 version predicted 7,300 deaths in the state of Alabama, while the April 5 version calls for 920.
That would be amazingly reassuring, except that it's not, because it shows a model so damnably sensitive to its inputs that it's simply incapable of modeling outcomes. The results so far demonstrate as much: even allowing for the massive error ranges included in the projections, actual figures from just a day or two beyond the data set have repeatedly fallen completely outside the range.
While Alabama suddenly got a 6,000-person reprieve, not every state got a pat on the back from the revised model. Here's Kentucky as it was presented using the April 1 data set.
Kentucky was one of several states where the model showed a gradual rise in cases that stayed within the limits of the state’s available healthcare resources. As a result, the peak wasn’t so much a peak as a gradual rise and fall that eventually left the state with a projected 800 deaths.
But on the basis of a few days' data, the model now shows this.
The new version shows a ragged, sharper peak that brings maximum usage forward by a month and triples the demand for ICU beds and ventilators. As a result, the projected deaths in Kentucky went from 800 to almost 1,800, again on the basis of a handful of days' data.
Looking back along the curves for both states based on the April 1 data shows multiple days in the last week during which actual deaths fell not just off the midline of the model but completely outside its (extremely generous) error range. The differences between the actual and projected results may seem small, but they appeared only a few days out on a model that projects the next four months, and whose totals are being pointed to as "goals" by the White House team.
Another concern is the basic data the model uses to define the parameters of its equations. Looking at the top line for Kentucky, the model insists that schools were closed on March 20; they were closed on March 11. It indicates that nonessential government services were closed on March 26; they were closed on March 16. And most amazingly, it puts Kentucky down as having no "stay-at-home" order, when in fact Gov. Andy Beshear was among the first to act, issuing a series of "stay healthy" executive orders beginning on March 7 that tightened twice before culminating in one of the most comprehensive stay-at-home orders on March 25. As of April 7, the UW model still insists that Kentucky has "not implemented" a stay-at-home order.
On the other hand, the model credits Tennessee with having issued a stay-at-home order on April 2, despite Republican Gov. Bill Lee making it clear that his "safer at home" order was explicitly not a stay-at-home order and was purposely more porous than cheesecloth.
The University of Washington Institute for Health Metrics and Evaluation model is getting an inordinate amount of attention, but very little of that attention appears to be deserved. There's no doubt that, in the end, it will be "right," but only in the sense that it will be continually updated so that it reflects the conditions on any given day.
However, the evidence so far is that:
- The model is hugely sensitive to small changes in its inputs, so much so that actual results often fall outside its own extraordinarily generous error ranges almost immediately after an update.
- The basic information on social distancing used in the model does not withstand even a cursory examination.
- The model has, to date, been neither precise nor accurate nor robust.
This is the equivalent of trying to take a simplistic weather model and extend it into a climate model. It's so sensitive to small changes in early data that long-term outcomes swing enormously from day to day. That's the kind of behavior chaos theory describes as sensitive dependence on initial conditions.
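For a picture of what that kind of sensitivity looks like in the abstract, here's the standard textbook illustration (the logistic map; nothing here is drawn from the IHME model itself): two starting values that differ by just one millionth follow nearly identical paths for a while, then end up in completely different places.

```python
# A textbook illustration of sensitive dependence on initial conditions
# (the logistic map); nothing here comes from the IHME model itself.
def trajectory(x0, r=4.0, steps=40):
    """Iterate x -> r * x * (1 - x) and return every value along the way."""
    xs = [x0]
    for _ in range(steps):
        xs.append(r * xs[-1] * (1 - xs[-1]))
    return xs

a = trajectory(0.200000)
b = trajectory(0.200001)  # starts a mere one-millionth higher

for step in (0, 10, 20, 30, 40):
    print(f"step {step:2d}: {a[step]:.6f} vs {b[step]:.6f}")
```

The point is not that the epidemic is literally chaotic, only that a projection this sensitive to its earliest inputs can't be trusted to say much about August.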
It’s not that people shouldn’t try to model the course of the epidemic; of course they should. Modeling is absolutely vital for predicting resource demands, for looking at which measures are proving effective, and for any planning. But the model that’s getting the most attention may be the least deserving.