With control of the Senate at stake in the next midterm elections, candidates face some major strategic decisions. Should they go after the swing voter, or go for turnout of the partisan base? Our conventional wisdom here is that base turnout is more important, and that's supported by direct data on individual voter choices. (See Lynn Vavreck's op-ed.) On the other hand, Harry Enten at the new 538 site has argued that it's not who turns out that matters, but who they choose to vote for. My point in this diary is that Enten has made a serious rookie statistical error, a version of the famous "ecological fallacy".
The polling data cited by Vavreck show that few voters switched allegiance between 2008 and 2010, and that these switchers were closely balanced between the two directions. On the other hand, many more individual Democrats failed to show up than individual Republicans. So there's a very strong case that the most variable feature, and thus probably the feature most easily changed, is turnout.
How then does Enten reach the conclusion that "What really mattered was that voters changed their minds about which party they wanted to vote for"? He does it by considering not individual voters but only a few broad demographic categories of voters: race/ethnicity and age. Some of these categories have fairly strong correlations with voting patterns. Thus if you see relatively more old whites showing up to vote, that by itself is reason to suspect a swing toward Republicans. Just from the shifts in these broad categories alone, Enten finds that changing voter demographics would account for a shift of more than 2.6% in the margin between 2010 and 2012. (For some reason he omits 2008, perhaps because he's trying to pick data to minimize the effect.) Although that's a very significant shift, it wouldn't by itself have changed control of the House.
The problem with any such calculation, however, is that it's utterly dependent on how the electorate is divided up into groups. Why not include different income groups? Why not include religious categories? Why not include combinations of those various features: Hispanics age 18-24 with income under $35k/yr, etc.? Any difference in turnout inside one of the broad groups is invisible to the calculation, so it gets misread as that group changing its mind. Generally speaking (though with possible exceptions) you expect that the finer you divide up the population into groups with more distinct voting patterns, the larger the effect you will find from changing turnout. So when an extremely crude, broad division turns up a 2.7% effect, the best guess is that turnout is the dominant factor. And that's exactly what the data on the relevant "groups", namely individual voters, say.
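To make the grouping problem concrete, here is a minimal sketch in Python. Every number in it is invented for illustration and none of it comes from Enten's or Vavreck's data; the group names and the `turnout_effect` helper are my own hypothetical constructions. In this toy electorate nobody changes their mind: each fine-grained group votes with exactly the same margin in both years, and only turnout changes.

```python
from collections import defaultdict

# Hypothetical toy electorate: all numbers are made up for illustration.
# Row: (coarse group, fine group, Dem-minus-Rep margin, voters year 1, voters year 2)
# Margins are identical in both years, so no voter "changes their mind".
CELLS = [
    ("young", "young, low income",  0.40, 200, 100),
    ("young", "young, high income", -0.10, 100, 100),
    ("old",   "old, low income",    0.10, 300, 250),
    ("old",   "old, high income",  -0.30, 400, 400),
]

def overall_margin(cells, year_index):
    """Electorate-wide margin for one year (3 = year 1 counts, 4 = year 2 counts)."""
    net = sum(row[2] * row[year_index] for row in cells)
    total = sum(row[year_index] for row in cells)
    return net / total

def turnout_effect(cells, key):
    """Margin shift attributed to turnout when voters are grouped by `key`.

    Each group's year-1 margin is held fixed while the year-1 group shares
    are replaced by the year-2 shares.
    """
    votes1, votes2, net1 = defaultdict(float), defaultdict(float), defaultdict(float)
    for row in cells:
        group = key(row)
        _, _, margin, n1, n2 = row
        votes1[group] += n1
        votes2[group] += n2
        net1[group] += margin * n1
    total1, total2 = sum(votes1.values()), sum(votes2.values())
    effect = 0.0
    for group in votes1:
        group_margin = net1[group] / votes1[group]
        effect += group_margin * (votes2[group] / total2 - votes1[group] / total1)
    return effect

actual = overall_margin(CELLS, 4) - overall_margin(CELLS, 3)
coarse = turnout_effect(CELLS, key=lambda row: row[0])  # age only
fine = turnout_effect(CELLS, key=lambda row: row[1])    # age x income

print(f"actual shift in margin:          {actual:+.4f}")
print(f"turnout effect, coarse grouping: {coarse:+.4f}")
print(f"turnout effect, fine grouping:   {fine:+.4f}")
```

With these made-up numbers, the age-only grouping attributes less than half of the shift to turnout and leaves the rest looking like changed minds, while the age-by-income grouping correctly attributes all of it to turnout. It is only a toy, but it shows how the answer depends on where you draw the group boundaries.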
So our bottom line with regard to strategy is that the conventional wisdom here looks right. Midterms are mainly base turnout elections.
The bottom line with regard to 538 is that its quality control is lousy. This sort of article has neither the formal sophistication of some of Sam Wang's stuff, nor the Bayesian common sense of Nate Silver's seat-of-the-pants poll analysis methods. 538 is not living up to its promise of being a high-quality, accessible, statistically smart site. This article looked more like a textbook example of what not to do to the data.