Welcome back statisticians, masochists, and/or pollercoaster enthusiasts. When we last met, I wrote a series of diaries dissecting how (and why) the media called the 2022 midterm election so wrong. For background, you can read those diaries here:
Data journalism utterly botched the 2022 midterms, and it is getting things wrong again in 2024. It isn’t hard to get this right. After all, how could I basically call the 2022 midterms correctly weeks in advance?
A WARNING
Before we begin, I’d like to start with a warning about the effort we’re about to embark upon. (Don’t worry, the conclusion is still that the polls are wrong. Very wrong.)
Pilots are told to always trust their instruments. Pilots who trust their physical sensations more than an aircraft's instruments are headed for trouble. A plane may feel like it is in level flight, when in fact, due to a miraculous combination of forces resulting in sensory illusion, the plane is actually in a steep bank. If there are no visual cues, the pilot has no way to verify their attitude other than their instruments.
There’s been more than one high profile civil aviation accident where a pilot failed to trust their instruments. Data journalism insists that polls are our instruments, and we cannot trust our feelings. And they would historically be right. Nate Silver is famous for calling the 2012 election correctly based on data, while Republicans insisted Mitt Romney had an equal chance right up until Karl Rove famously melted down on FOX News as Ohio went for Barack Obama.
But there are also a smaller number of aviation accidents where instruments malfunctioned, and the pilots, either unable or unwilling to verify what they were observing, followed their instruments into a horizonless dark sea. See Air India Flight 855, or Birgenair Flight 301, or Air France 447, etc.
So it is not without great trepidation that I state we are in one of those rare instances where our instruments (i.e., the polls) are malfunctioning.
WE THANKFULLY DO HAVE A HORIZON
Fortunately, we can look out our window, and see the (political) horizon. It may not be perfectly clear, and it doesn’t mean that Joe Biden will win the Presidency in 2024. It just means what we are seeing out the window, however imperfect, in no way matches what our instruments are telling us.
What we see, especially since Dobbs, is that Joe Biden and the Democrats have not only won elections, but consistently overperformed.
This is in direct conflict with polls showing Joe Biden being supposedly “unpopular” and Trump winning the Presidency.
There is simply no way for both of these to be true at the same time.
“But people could like Democrats and not Joe Biden!” you say. True, that could be the case. It would be ahistorical, but it is plausible. America would have to be increasingly progressive and, at the same time, dislike Joe Biden for a variety of reasons, real or imaginary. But the 2024 election is not between Biden and some unknown quantity; it is the ultra-rare occurrence of a presidential election between a sitting president and a former one. The same polls show Biden “losing” to a flawed and always deeply disliked ultra-right-wing former president who was ousted after one term.
Again, there is no way for both of these to be true at the same time.
WHAT THE DATA JOURNALISTS SAY
As you can imagine, those who became millionaires based on the argument that rote data aggregation can predict political elections have a vested interest in telling us to trust the polls and not look out the window when it seems those polls may have gotten a little wonky. Typically, their arguments start with the presumption that the polls are correct, and then they reason from there to tell us not to trust our lying eyes.
The problem is, if you start with the faulty assumption that the polls are correct, then your conclusion will be even more faulty. Look at how Steve Kornacki is on the right trail there, and then just loses the plot because he assumes the polls must be right.
The most egregious example of this faulty reasoning came from Nate Cohn in the New York Times.
Nate basically argues that special elections and midterms have turnout patterns that favor Democrats this cycle, so it is no surprise that Democrats did well. Moreover, the general election will have a different turnout pattern, which is why Trump could win. There’s just one huge problem with this argument: Nate Cohn’s own poll, the New York Times/Siena poll with its vaunted “A+” rating from Nate Silver, is a likely voter poll. This means turnout is already factored in. You can’t argue that polls are inherently correct and then argue that your poll was incorrect based on turnout your poll accounted for! This sophistry made it all the way to the New York Times.
SO WHAT’S GOING ON?
The problems with the polls are:
- Probably many things,
- We don’t know which, and
- It doesn’t really matter
So let’s start by going over everything that could be going wrong.
A PAST LOOK OUT THE WINDOW
For our first explanation, we're going to enter the Wayback Machine and travel back in time to the last presidential election with an incumbent Democrat.
Way back in 2012, Gallup used to publish a seven-day rolling average of the head-to-head matchup. While it may not have been accurate, it was very precise, and a good way to test the state of the race on a day-to-day basis.
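If you've never worked with a rolling average, here is a minimal Python sketch of how one behaves. To be clear, this is my own illustration, not Gallup's actual methodology: the daily margins are made-up numbers, and the `rolling_average` helper is a hypothetical name. It just shows that a trailing seven-day window smooths out daily noise and registers a sudden shift gradually, roughly one-seventh at a time.

```python
def rolling_average(daily_margins, window=7):
    """Return the trailing `window`-day mean for each day that has a full window."""
    averages = []
    for end in range(window, len(daily_margins) + 1):
        chunk = daily_margins[end - window:end]
        averages.append(sum(chunk) / window)
    return averages

# Hypothetical daily margins (positive = Democratic lead). NOT Gallup data.
# Ten days at D+5, then an abrupt shift to R+3 that persists for a week.
daily = [5.0] * 10 + [-3.0] * 7

tracker = rolling_average(daily)

# The first full window ends on day 7, so label the output from there.
for day, value in enumerate(tracker, start=7):
    label = "D" if value >= 0 else "R"
    print(f"day {day:2d}: {label}+{abs(value):.1f}")
```

In this toy series, the tracker sits at D+5 through day 10, starts drifting on day 11, and only fully reflects the new R+3 level on day 17, once the window contains nothing but post-shift days.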
2012 was shaping up to be a real bore of a Presidential election (polling-wise). Mitt Romney had eventually managed to limp to the nomination, but consistently trailed President Obama in head to head matchups. Base conservatives weren't that thrilled with Rmoney and we rounded Labor Day with it looking like the media (who had garnered clicks off of terrified liberals for years with predictions of Obama’s doom) would have to choke down an Obama victory come November.
Then there was the first debate.
You remember the first debate, right? You know, that thing that historically never matters?
I listened to the debate on the radio and thought Obama was fine. But Andrew Sullivan ran to the press room screaming that the black man had blown it, and it was a full blown media feeding frenzy.
Now watch what happens in the Gallup tracker.
The debate was on October 3. For six whole days, there’s no real change in the Gallup tracker. Then, only after a week of hysteria in the media, does movement begin to materialize on the seventh day.
The race goes from D+5 on October 9 to R+3 in a matter of days. That's an 8-point swing in the electorate!
Did the electorate really become R+3 for a few panic-stricken days?
No.
One thing we know about our electorate is that it is highly polarized and there are very few persuadable swing voters.
And watch what happens as the media storm dies down: the Gallup tracker reverts to D+3 by Election Day, pretty much what it was before the flip out. Obama would win the race by 4 points and with 332 EVs.
This is an example of how the media environment can affect who responds to polls, and it is called nonresponse bias. In short, one form of nonresponse bias (often called differential partisan nonresponse) occurs when partisans are more likely to respond to pollsters when the environment is good for their candidate, and less likely to respond when it is unfavorable.
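To make the mechanism concrete, here is a rough Python sketch. Everything in it is an assumption for illustration: the 52/48 electorate, the response rates, and the `observed_margin` helper are all hypothetical, and it ignores the weighting that real pollsters apply. The point is only that the electorate can stay perfectly still while the poll moves, simply because one side picks up the phone a little less often.

```python
def observed_margin(dem_share, rep_share, dem_response_rate, rep_response_rate):
    """Poll margin (D minus R, in points) among the people who actually respond."""
    dem_respondents = dem_share * dem_response_rate
    rep_respondents = rep_share * rep_response_rate
    total = dem_respondents + rep_respondents
    return 100.0 * (dem_respondents - rep_respondents) / total

# Hypothetical electorate that never changes: 52% D, 48% R (a true D+4).
TRUE_DEM, TRUE_REP = 0.52, 0.48

# Calm news cycle: both sides respond at the same (assumed) 10% rate.
calm = observed_margin(TRUE_DEM, TRUE_REP, 0.10, 0.10)

# Bad news cycle for Democrats: their (assumed) response rate dips to 9%.
bad_news_for_dems = observed_margin(TRUE_DEM, TRUE_REP, 0.09, 0.10)

print(f"calm environment:      {calm:+.1f}")
print(f"bad news for Democrats: {bad_news_for_dems:+.1f}")
```

Under these made-up numbers, a one-point dip in Democratic response rate is enough to turn a true D+4 electorate into a poll showing a narrow Republican lead, with not a single voter changing their mind.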
How’s the media environment for Joe Biden been?
BUT WAIT, THERE’S MORE
Next time, we’ll talk about the challenges of reaching Biden voters.
It certainly doesn’t help that the most Democratic demographics have become the most expensive for pollsters to reach, while at the same time, conducting polls has also become more costly for cash-strapped news organizations.
We’ll save that for Part 2.