In 2006 and 2008, a nonprofit executive named Donna Edwards ran against incumbent Congressman Al Wynn in Maryland's Fourth District, which includes portions of Prince George's County and Montgomery County. In 2006, she narrowly lost, but in 2008, she easily won, in what is considered something of a landmark of "netroots" influence on elections.
Back at Swingstateproject, we were talking about the effects of the VRA, and I mentioned a book, David Canon's Race, Redistricting, and Representation. Canon claimed that the white populations of majority-minority or VRA districts could exert a (sometimes decisive) influence by siding with one of several minority candidates in a contested primary. Someone mentioned that the Edwards-Wynn primaries would be interesting case studies. I decided I would study them.
My approach was pretty simple--I downloaded the precinct-by-precinct election results for the two primaries from the Maryland Board of Elections and (laboriously) added in the 2010 precinct-by-precinct racial percentages for each of the 169 or so precincts. (I say "or so" because, for some reason, the 2008 results include some precincts absent from the 2006 results.) I then ran a series of multiple regressions, with Edwards' percentage of the vote in each precinct as the dependent variable and one or more of the racial variables as predictors. Note that this approach cannot tell us whether a candidate did better or worse with voters of a particular racial group--it only describes areas.
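For anyone who wants to reproduce the data-assembly step, here is a minimal sketch in Python. The precinct IDs, vote counts, and percentages are invented, and `build_rows` is a hypothetical helper, not anything from the Board of Elections files. I'm assuming the vote share is stored as a fraction (0 to 1) and the racial variables as percentage points (0 to 100), which seems to match the scale of the coefficients in the printouts.

```python
# Hypothetical precinct records; IDs and counts are invented for illustration.
results = {
    "P01": {"Edwards": 120, "Wynn": 180},
    "P02": {"Edwards": 210, "Wynn": 90},
    "P03": {"Edwards": 75, "Wynn": 75},  # has no demographic match below
}
demographics = {
    "P01": {"White": 12.0, "Black": 80.0},
    "P02": {"White": 55.0, "Black": 25.0},
}

def build_rows(results, demographics):
    """Join vote totals to racial percentages, one row per matched precinct."""
    rows = []
    for precinct, votes in sorted(results.items()):
        demo = demographics.get(precinct)
        total = sum(votes.values())
        if demo is None or total == 0:
            continue  # skip precincts missing from one source or with no votes
        rows.append({
            "precinct": precinct,
            "Edwards": votes["Edwards"] / total,  # share as a fraction, 0 to 1
            "White": demo["White"],               # percentage points, 0 to 100
            "Black": demo["Black"],
        })
    return rows

rows = build_rows(results, demographics)
for row in rows:
    print(row)
```

Dropping unmatched precincts is one way to handle the "or so" problem; it keeps the regression honest at the cost of a slightly smaller sample.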
Let's start with 2006. Here are the results of the regression analysis program (I used Winpepi, which is apparently for epidemiologists, because it's free):
Regression equation:
Edwards = 0.460 + 0.003(White) - 0.001(Black)
Variable   Coefficient   SE        Two-tailed P
White      0.003         < 0.001   0.000
Black      -0.001        0.000     0.013
Standard error of estimate = 0.077
R-squared (coefficient of determination) = 0.636
Adjusted coefficient of determination = 0.631
Now, if you've had no formal training in statistics...that makes two of us. However, here's what I believe this means. On average, Edwards did better the whiter the precinct's population, and, holding the white share fixed, she did worse the blacker the precinct's population (the two percentages can vary separately because several precincts also had significant Asian or Hispanic populations). The two-tailed P values are both below .05, which means that both coefficients are statistically significant. However, the R-squared value is only .636, which means the two racial variables account for a bit under two-thirds of the precinct-to-precinct variation in Edwards' share, leaving a lot unexplained.
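To make the mechanics concrete, here is a rough sketch of an ordinary least-squares fit with two predictors, written in plain Python. This is not what Winpepi does internally (I don't know its implementation); it is just the textbook normal-equations approach, and the seven "precincts" are invented. The functions `solve` and `ols` are names I made up for the illustration.

```python
def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):  # back substitution
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x

def ols(xs, y):
    """Fit y = b0 + b1*x1 + b2*x2 + ... ; return (coefficients, r_squared)."""
    design = [[1.0] + list(row) for row in xs]  # leading 1.0 is the intercept column
    k = len(design[0])
    xtx = [[sum(r[i] * r[j] for r in design) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * yi for r, yi in zip(design, y)) for i in range(k)]
    beta = solve(xtx, xty)
    fitted = [sum(b * v for b, v in zip(beta, r)) for r in design]
    mean_y = sum(y) / len(y)
    ss_res = sum((yi - fi) ** 2 for yi, fi in zip(y, fitted))
    ss_tot = sum((yi - mean_y) ** 2 for yi in y)
    return beta, 1.0 - ss_res / ss_tot

# Invented precincts: (White %, Black %) and Edwards' share of the vote.
xs = [(5, 90), (10, 80), (20, 70), (35, 50), (50, 30), (60, 25), (70, 15)]
y = [0.40, 0.42, 0.47, 0.55, 0.58, 0.61, 0.60]
beta, r2 = ols(xs, y)
print("intercept, White, Black:", beta)
print("R-squared:", r2)
```

Each coefficient is the expected change in Edwards' share per percentage point of that group with the other variable held fixed, which is how I'm reading the White and Black rows in the printout.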
Now, it's a bit harder to graph three variables than two, and I'd like to include a picture of what's going on. So let's run the same analysis using just Edwards' share in each precinct and the white population, since the two-tailed P is lower in that case:
Simple linear regression:
Equation: Edwards = a + (b x White)
Edwards = 0.379 + (0.004 x White)
a = 0.379 (S.E.: 0.008)
b = 0.004 (S.E.: 0.000; 95% C.I.: 0.004 to 0.005)
P = 0.000 (for difference from zero)
Coefficient of determination (r-squared) = 0.622
Adjusted coefficient of determination = 0.620
Standard error of estimate = 0.078
Note that the R-squared has only decreased a little bit. My impression is that this means that we haven't lost too much accuracy by looking at just two variables. Here is the scatterplot: Each dot is a precinct, Edwards' 2006 share of the vote is on the Y-axis, and the white % of the population is on the X-axis.
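A quick sanity check on the "adjusted" figures in these printouts: the usual textbook formula penalizes R-squared for each extra predictor. The sketch below assumes n = 169 precincts (the 2006 count may differ slightly, hence "or so") and that Winpepi uses this standard formula, which I have not confirmed.

```python
def adjusted_r2(r2, n, p):
    """Adjusted R-squared for n observations and p predictors (intercept not counted)."""
    return 1.0 - (1.0 - r2) * (n - 1) / (n - p - 1)

n = 169  # assumption: the exact 2006 precinct count is not stated
two_predictors = adjusted_r2(0.636, n, 2)  # White and Black
one_predictor = adjusted_r2(0.622, n, 1)   # White only
print(two_predictors, one_predictor)
```

Both values land close to the reported 0.631 and 0.620, allowing for rounding in the printed R-squared. Since the adjusted numbers differ by only about 0.01, dropping the Black variable really doesn't cost much.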
Note why a linear regression might not have the best fit--one line with a pretty sharp slope seems to fit the cluster of heavily-nonwhite precincts on the left, while after that, Edwards hovers around 60% throughout. Even the equation Edwards=.379+(.004 x White) would predict Edwards got a majority of the vote in any precinct with more than, say, a 30% White population.
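The 30% figure comes straight from solving the fitted line for a 50% share. A tiny sketch, using the rounded coefficients from the printout (so the result is only good to a few points; `predicted_share` and `break_even` are names I made up):

```python
A, B = 0.379, 0.004  # intercept and slope from the pooled 2006 fit, as printed

def predicted_share(white_pct):
    """Edwards' predicted share (0 to 1) for a precinct with the given White %."""
    return A + B * white_pct

def break_even(target=0.5, a=A, b=B):
    """White % at which the fitted line reaches the target share."""
    return (target - a) / b

print(break_even())         # roughly 30
print(predicted_share(10))  # a heavily nonwhite precinct
print(predicted_share(80))  # a heavily white precinct
```

At 80% White the line predicts about 70%, noticeably above the roughly 60% plateau in the scatterplot, which is another sign that one straight line is a poor description of the whole district.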
Another way to look at this is by county: Just looking at the numbers, Edwards seemed to pretty much romp in Montgomery County (whiter overall) while her performance in Prince George's County was much more varied. Here's the above graph, just in Montgomery County:
The linear regression says:
Simple linear regression:
Equation: Edwards = a + (b x White)
Edwards = 0.478 + (0.002 x White)
a = 0.478 (S.E.: 0.027)
b = 0.002 (S.E.: 0.001; 95% C.I.: 0.001 to 0.003)
P = 0.000 [ 3.1E-5 ] (for difference from zero)
Coefficient of determination (r-squared) = 0.226
Adjusted coefficient of determination = 0.214
And here it is in Prince George's County:
To try to get a better fit, I looked at PG with both white and black percentages thrown in, but no luck:
Regression equation:
Edwards = 0.316 + 0.008(White) + 0.000(Black)
Variable   Coefficient   SE        Two-tailed P
White      0.008         < 0.001   0.000
Black      0.000         0.000     0.339
Standard error of estimate = 0.064
R-squared (coefficient of determination) = 0.467
Adjusted coefficient of determination = 0.456
You can see that the graphs are pretty well-separated, and the R-squared--the quality of the fit--for both is quite poor. So the relatively decent R-squared from before might have just been an artifact of a county-level divide: the pooled fit may mostly be picking up the gap between the two counties rather than a pattern within either one. However, note that "White" remains statistically significant in both counties.
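Here is a toy illustration of how that kind of artifact can arise. The numbers are invented and deliberately exaggerated (within each made-up "county" there is almost no trend), so this shows the mechanism only, not the real Fourth District data. `simple_fit` is a hypothetical helper.

```python
def simple_fit(x, y):
    """Least-squares line y = a + b*x; returns (a, b, r_squared)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    syy = sum((yi - my) ** 2 for yi in y)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    b = sxy / sxx
    return my - b * mx, b, (sxy * sxy) / (sxx * syy)

# Invented data: (White %, Edwards share) for two made-up counties.
county_a = ([5, 10, 15, 20, 25], [0.40, 0.34, 0.44, 0.36, 0.42])
county_b = ([50, 55, 60, 65, 70], [0.60, 0.56, 0.64, 0.58, 0.62])

r2_a = simple_fit(*county_a)[2]
r2_b = simple_fit(*county_b)[2]
r2_pooled = simple_fit(county_a[0] + county_b[0], county_a[1] + county_b[1])[2]
print("county A:", r2_a)
print("county B:", r2_b)
print("pooled:  ", r2_pooled)
```

The pooled R-squared comes out far higher than either county's, purely because the two clusters sit at different levels. Adding a county indicator variable to the regression would be one way to test how much of the real fit is explained this way.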
I was going to talk about 2008 too, but it's not very interesting.
Edwards won nearly every precinct in the district on her way to a landslide victory.
Now for some questions: First, does anyone know a good way to estimate income or education levels by precinct? Is there one, or would the error be too high for something like that? I was thinking of using ACS estimates by census tract, but tracts and precincts don't share the same boundaries. Overall, this area is one of the wealthiest and best-educated in the country, and I'd be very interested to know if education or income levels provided better fits, especially within the counties--or if racial percentages were still statistically significant after accounting for education and income levels.
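One standard way around mismatched boundaries is areal interpolation: estimate each precinct's value as a weighted average of the tracts it overlaps. The sketch below assumes you already have an overlap table (for instance, the population of each tract-precinct intersection), which would have to come from GIS work not shown here. The tract values, weights, and the `interpolate` helper are all hypothetical.

```python
def interpolate(tract_values, overlaps):
    """Estimate a precinct-level value as a weighted average of tract values.

    overlaps maps precinct -> {tract: weight}, where the weight is, for example,
    the population living in the overlap of that tract and precinct.
    """
    estimates = {}
    for precinct, weights in overlaps.items():
        total = sum(weights.values())
        if total == 0:
            continue  # no usable overlap information for this precinct
        estimates[precinct] = sum(
            tract_values[tract] * w for tract, w in weights.items()
        ) / total
    return estimates

# Invented example: percent of adults with a bachelor's degree, by tract.
tract_ba_pct = {"T1": 30.0, "T2": 70.0}
overlaps = {
    "P01": {"T1": 300, "T2": 100},  # mostly inside tract T1
    "P02": {"T2": 500},             # entirely inside tract T2
}
estimates = interpolate(tract_ba_pct, overlaps)
print(estimates)
```

This treats each tract as internally uniform, so the error grows where a precinct takes a small and unrepresentative slice of a tract. It works more cleanly for percentages than for medians such as median income, which do not average this way.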
Second, of course, I'd be interested to hear the reaction from people who know this election well--I deliberately kept this post entirely quantitative.
Finally, I'm not sure if anyone else has done this analysis--a quick Google of "Edwards Wynn precinct regression" didn't seem to show anything, nor did a more detailed googling just now. This fellow has done some similar stuff with 2000 census data for other MontCo races: http://maryland-politics.blogspot.com/
It was as much an exercise for me to learn how to use the software and such as anything else, but I thought it might have some interest. If nothing else, it gave me a chance to run a regression analysis, like an actual scientist instead of the mathematician I am. Anyway, looking forward to your reactions.