Last Thursday, I wrote an article stating that the rate of deaths from COVID-19 among white Americans now exceeded the rate among Black or Latino Americans. The article drew from a report in The New York Times by David Leonhardt, which extrapolated this information from recent reports on the CDC’s COVID Data Tracker. Not only did the numbers appear correct, but the results also fit a story that felt right: White Republicans had been so casual in their treatment of COVID-19 that their actions had outweighed the considerable advantages white people have in the American health care system.
However, that article was wrong. So was the source article. Both fell prey to a statistical phenomenon known as Simpson’s Paradox, in which “an association between two variables in a population emerges, disappears, or reverses when the population is divided into subpopulations.”
I was wrong, both in my interpretation of the data, and in the commentary I drew from this conclusion. And in both cases, I forgot one of the most serious dictums of any form of journalism: Beware the story that is too friendly to your own beliefs.
This is not the first time this paradox has appeared, even within coverage of COVID-19 alone. This paper from Cornell University researchers showed how public health officials compared results from a large-scale study in China to early results from the outbreak in Italy, and came up with what seemed to be utterly impossible results: Case fatality rates in Italy were lower for every age group, but higher overall. How is that possible?
It’s possible because the first set of results (deaths as a percentage of each age group) doesn’t account for the fact that Italy had a much larger share of elderly patients. So even though a smaller percentage of each age group was dying, Italy’s patient population was more heavily weighted toward older people, who die at far higher rates, and that produced a higher rate of overall deaths.
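The arithmetic is easier to see with made-up numbers. The short Python sketch below uses entirely hypothetical case and death counts (the names `country_a` and `country_b`, the two age groups, and every figure are assumptions chosen for illustration, not the Chinese or Italian data) to show how one country can have a lower fatality rate in every age group and still have a higher rate overall.

```python
# Hypothetical counts for illustration only; these are NOT real COVID-19 data.
# Each entry maps an age group to (cases, deaths).
country_a = {"under 60": (8_000, 80), "60 and over": (2_000, 400)}    # younger case mix
country_b = {"under 60": (2_000, 10), "60 and over": (8_000, 1_200)}  # older case mix


def group_rates(groups):
    """Case fatality rate within each age group."""
    return {age: deaths / cases for age, (cases, deaths) in groups.items()}


def overall_rate(groups):
    """Case fatality rate with all age groups pooled together."""
    total_cases = sum(cases for cases, _ in groups.values())
    total_deaths = sum(deaths for _, deaths in groups.values())
    return total_deaths / total_cases


if __name__ == "__main__":
    rates_a, rates_b = group_rates(country_a), group_rates(country_b)
    for age in country_a:
        print(f"{age}: A = {rates_a[age]:.1%}, B = {rates_b[age]:.1%}")
    print(f"overall: A = {overall_rate(country_a):.1%}, B = {overall_rate(country_b):.1%}")
```

With these invented numbers, country B does better than country A in both age groups (0.5% versus 1.0%, and 15.0% versus 20.0%), yet its pooled rate is 12.1% against A's 4.8%, simply because four out of five of B's cases fall in the older, higher-risk group.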
What happened with the article that Leonhardt wrote, and which I blindly imitated, on June 9 was the inverse of this issue. The overall rate of deaths for white Americans is higher specifically because a larger share of white Americans are older. In every age bracket, the death rate among Black Americans was, and is, higher than that of white Americans, because Black Americans consistently have less access to health care and get lower-quality care. Even vaccine denial and mask rejection, primarily by Republicans, who are overwhelmingly white, have not overcome that difference.
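One standard way to guard against this trap is direct age standardization, in which each age bracket's death rate is weighted by the same reference population before the groups are compared. The sketch below is a minimal illustration of that idea, not the method used by the CDC, the Times, or the original article. The group names, the population and death counts, and the `standard_weights` shares are all hypothetical.

```python
# Hypothetical numbers for illustration only; these are NOT CDC figures.
# Each entry maps an age bracket to (population, deaths).
group_older = {"under 65": (400_000, 40), "65 and over": (600_000, 1_800)}
group_younger = {"under 65": (800_000, 160), "65 and over": (200_000, 1_000)}

# Assumed share of a reference ("standard") population in each bracket.
standard_weights = {"under 65": 0.6, "65 and over": 0.4}


def crude_rate(groups):
    """Deaths per 100,000 with all age brackets pooled together."""
    population = sum(pop for pop, _ in groups.values())
    deaths = sum(d for _, d in groups.values())
    return deaths / population * 100_000


def age_adjusted_rate(groups, weights):
    """Direct standardization: weight each bracket's rate by the standard share."""
    return sum(
        weights[bracket] * (deaths / pop * 100_000)
        for bracket, (pop, deaths) in groups.items()
    )


if __name__ == "__main__":
    for name, groups in [("older group", group_older), ("younger group", group_younger)]:
        print(
            f"{name}: crude = {crude_rate(groups):.0f}, "
            f"age-adjusted = {age_adjusted_rate(groups, standard_weights):.0f} per 100,000"
        )
```

In this invented example the older group has the higher crude rate (184 versus 116 per 100,000) even though the younger group fares worse in both brackets. Once both are weighted by the same age mix, the comparison flips to 126 versus 212, which matches what the bracket-by-bracket numbers were saying all along.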
To be fair, several Daily Kos readers left comments on that original article suggesting exactly this: that age or other demographic factors were responsible for this supposed “shift.” But in spite of those warnings, and in spite of having taken biostatistics, geostatistics, and two years of plain vanilla statistics before taking a job where my primary task for over a decade was sniffing out the critical statistics for job processes, I charged right in and … was absolutely wrong.
Dividing any statistic into groups by an arbitrary factor like race may reveal valuable correlations and insights. It may also generate false narratives, especially when the numbers are split up without regard to other demographics, such as age.
This article was incorrect on its core assumptions. You have a right to expect better. Apologies.