Hi everyone. Here we go with another update on what I've been working on for DRA and what I plan to do next.
Before I get to that, a quick comment about 'found votes' in my DRA fixes. I've had several inquiries about why my fixes of data sets previously on DRA always seem to turn up a disproportionate number of Obama votes. The answer is quite simple: the errors that result in missing votes overwhelmingly involve urban precincts, so fixing the data invariably turns up more Democratic votes than Republican ones. If it needs saying, my only goal is to provide the most complete and accurate possible translation of the 2008 election results into the 2010 census voting districts.
On another note, a few people have asked if I plan to put 2012 election data on DRA after Nov 6. Here's my provisional non-answer. If the 2012 presidential results are similar to the 2008 presidential results, then the precinct distribution of votes should also be quite similar, so 2012 data would be all but redundant in my view. If the 2012 presidential results end up being substantially different from the 2008 results, then I agree there would be some value to having 2012 data, but then the problem is that most states would be a nightmare to translate because they've reprecincted based on the 2010 census. It may very well be the case that it makes sense for select states, so the shorter non-answer is that I'll cross that bridge if I come to it (and assuming Dave thinks it's worth doing, of course).
That said, if any state plans to do another round of redistricting, for whatever reason, then I'll almost surely try to get 2012 election data up for that state, regardless.
Now, with that out of the way, here's the actual update after the jump!
Iowa: I've forwarded a file with 2008 presidential data and Dave expects to get that uploaded by the weekend. Iowa was drastically more tedious and time-consuming than I had hoped. The main problem was that rural election precincts frequently cover multiple census voting districts (VTDs), which in Iowa basically follow township lines plus select cities/towns. In short, I had to create a precinct-to-VTD conversion table for each county, which meant comparing 2008 precinct maps with 2010 VTDs for each of 99 counties. Then I used my standard distribution formula to translate precinct results to VTDs based on voting-age population (VAP).
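For anyone curious what a VAP-based split looks like in practice, here's a minimal sketch in Python. To be clear, this is an illustration and not my actual worksheet formula: the function name, the sample numbers, and the way leftover votes are rounded (largest remainder) are all assumptions made up for the example.

```python
from typing import Dict


def allocate_by_vap(
    precinct_votes: Dict[str, int], vtd_vap: Dict[str, int]
) -> Dict[str, Dict[str, int]]:
    """Split one precinct's votes across its VTDs in proportion to VAP.

    Hypothetical helper for illustration only. Rounding uses the
    largest-remainder method (an assumption) so that whole votes
    always add back up to the precinct total.
    """
    total_vap = sum(vtd_vap.values())
    if total_vap <= 0:
        raise ValueError("precinct has no voting-age population to allocate by")

    result: Dict[str, Dict[str, int]] = {vtd: {} for vtd in vtd_vap}
    for candidate, votes in precinct_votes.items():
        whole: Dict[str, int] = {}
        remainder: Dict[str, int] = {}
        for vtd, vap in vtd_vap.items():
            # Integer arithmetic avoids floating-point rounding surprises.
            whole[vtd], remainder[vtd] = divmod(votes * vap, total_vap)
        leftover = votes - sum(whole.values())
        # Hand the remaining whole votes to the largest fractional shares.
        ranked = sorted(remainder, key=lambda v: remainder[v], reverse=True)
        for vtd in ranked[:leftover]:
            whole[vtd] += 1
        for vtd, count in whole.items():
            result[vtd][candidate] = count
    return result


# Made-up example: one precinct that covers a town and a rural township.
precinct = {"Obama": 600, "McCain": 400}
vaps = {"Town": 750, "Rural township": 250}
split = allocate_by_vap(precinct, vaps)
```

With these made-up numbers, the town and the rural township both come out at exactly the precinct's 60% Obama share, since every candidate's votes are divided by the same VAP ratio.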
What this means is that the DRA data will appear slightly deceptive.
On one hand, there are a number of precincts that are dominated by a town but are subdivided on DRA to include one or more rural areas. This means that if the overall precinct was, say, 60% Obama, all the constituent VTDs will be 60% Obama. In reality, the rural areas surely were less Obama than the given town, but there's no way to know by how much since the votes were cast in one location.
On the flipside, there are quite a large number of Iowa VTDs where a small town is separated from the rural parts of its precinct. In these cases, if the overall precinct was, say, 60% McCain, the town will be 60% McCain on DRA. Again, while the given town almost surely voted less McCain than the rural parts of its precinct, there's no way to know by how much.
To be sure, this is an issue that almost entirely involves rural precincts with relatively small numbers of voters. For Iowa's more urban/suburban counties, the precincts and VTDs match up near perfectly. In other words, I don't think it will be of any consequence when aggregating VTDs into election districts, even on the legislative level.
Virginia: I'll be starting on a revision of the 2008 election data tonight. Based on my preliminary assessment thus far, I don't think Virginia will have drastic errors, so I suspect (hope) that I'll mainly just be adding complete precinct counts. That said, Maryland looked much the same in my preliminary assessment and it turned out that the majority of the VTD figures on DRA were completely wrong once I delved deeper. So, I obviously can't be sure yet what exactly is going on with the Virginia election data.
[Update]: I've completed the Virginia revisions. See comment for details.
In the meantime, I've forwarded a correction of the demographic data for the Zion Grace and Taylor Elementary precincts in Norfolk. The Census Bureau evidently thought that four aircraft carriers ported at the Norfolk Naval Station were actually three miles away in a random field between an electrical relay station and an abandoned railyard... So, 19,279 sailors ended up being counted in the Taylor Elementary precinct when they should've been in the Zion Grace precinct.
The Census Bureau issued a partial correction here. However, they didn't bother to reissue the non-Hispanic race figures, which is what's actually on DRA. To make a long story short, I was able to triangulate the data between the above memo, the original PL 94-171 data, and the Virginia Legislative Services data, which simply omitted the Hispanic breakdown from the adjusted census blocks. I was therefore able to identify the primary race for all but six Hispanics. By my calculation, there is a 97.6% probability that those six Hispanics designated their primary race as four whites and two 'others' (with a 2.3% probability of 3 whites, 1 black, and 2 'others'), so that's what I rolled with.
I rather doubt any redistricting schemes will hinge on these six Hispanics, but if someone really wants my worksheet I'll be happy to forward it along. In any event, I'm projecting that I can get a replacement file for the Virginia election data completed by Monday.
New Jersey & Arkansas: After I get Virginia done, the pace of my DRA work will slow down considerably as I have a couple of very busy months ahead. My main priorities are to redo the New Jersey data that I screwed up (as I noted in my previous update) and to put together a file for Arkansas. Unless I'm particularly inspired and have a lot more free time than I anticipate, I doubt those will come along before late October.
And onward! As far as 2008 presidential data goes, the main task for sometime after Election Day will be to tackle the four states that only have block groups on DRA, not voting districts. These are Alaska, Montana, Oregon, and Rhode Island. It's tough to say how that'll go until I really get into doing them. Rhode Island should be fairly easy (I can layer shapefiles), but if I have to get 2008 precinct maps for every county in Montana & Oregon, that could take quite a while. As for Alaska, VTD files now seem to be available on the Census Bureau FTP download site, so perhaps the issue will become moot. (Hi Dave!)
Besides all of that, at some point I plan to add third-party votes to the Florida data (which was a nightmare to sort out) and to complete the California data sets. But, being realistic, I don't see that happening until 3-4 months from now.
I guess that's it. If interested, just keep an eye out for the Iowa data and for the Virginia fixes. I'll post a brief note in open thread comments to let people know once each state is actually uploaded to DRA.