• non-math explanation of first table(5+ / 0-)

We are trying to determine which numbers are predictive.  Imagine a variable that means nothing at all, say electric vs. gas stoves (no idea whether that is actually true).  I could tell you the number for every district, and the math would find no correlation.  It would give you a zero.

Now, imagine a variable that has everything to do with it and is perfectly predictive.  Let's call it the percentage of voters for the Democratic congressional contestant in the two-way vote.  This variable would accurately predict which districts go D or R every time: the greater the percentage, the more likely the D wins, until you fall below 50%, when Republicans win.  This variable gives you a 1.  The Republican opposite (percent of voters for the R congressional candidate, etc.) would yield a -1.  This number perfectly predicts voting because, surprise, it is voting.

Statisticians look for the variables where the math yields a value as close to 1 or -1 as possible.  I'm not 100% sure I remember this right, but I think values between .1 and -.1 are effectively treated as zero.  So that chart shows the five strongest variables each way, rated from strongest to weakest.  Renters is therefore the strongest variable from the census data, beating out even white voters, despite being imperfect.
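The idea above is just the Pearson correlation coefficient.  Here's a minimal sketch with completely made-up district numbers (the variable names and values are illustrative, not the diary's actual data), showing the three cases described: a meaningless variable near zero, a strong census variable, and a variable that is the vote itself, which scores exactly 1.

```python
import numpy as np

# Hypothetical data for six districts -- invented for illustration only.
dem_share = np.array([0.62, 0.48, 0.55, 0.71, 0.40, 0.58])    # D two-way vote share
pct_renters = np.array([0.45, 0.30, 0.38, 0.60, 0.25, 0.41])  # census % renters
stove_type = np.array([1, 1, 0, 0, 0, 1])                     # electric=1, gas=0 (meaningless)

# np.corrcoef returns the correlation matrix; [0, 1] is the off-diagonal r.
r_renters = np.corrcoef(dem_share, pct_renters)[0, 1]  # strong: close to 1
r_stoves = np.corrcoef(dem_share, stove_type)[0, 1]    # noise: close to 0
r_self = np.corrcoef(dem_share, dem_share)[0, 1]       # the vote predicts itself

print(round(r_renters, 2))
print(round(r_stoves, 2))
print(r_self)  # exactly 1.0
```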

• Look at linear combinations not single variates(0+ / 0-)

Variables may be relatively more highly correlated than others, but not necessarily predictive.  Yes, it seems counter-intuitive, but keep in mind that all such variables are likely to be proxies for the latent, underlying variable that you seek.

From your data, it is clear that you really have NO highly correlated variables (an r value of .8 or higher would generally be regarded as strong in most studies dealing with organisms).

You would do much better to understand the associations among combinations of your variables and then target those populations that best fit that particular combination of co-variates.

What you are saying you want to do is perform discriminant function analysis, or alternatively principal components analysis, to better understand the latent variables of most interest, which in this case are far more relevant to predicting Democrats than any of your variables alone in a series of univariate analyses.  This will be especially true if you fail to correct for experiment-wise error: the more tests you perform, the greater the probability that you will reject a true null hypothesis by chance alone.  If you set alpha at 0.05, then 5 out of 100 such tests should be expected to reject the null hypothesis of no difference by chance alone.

Use Dunn-Bonferroni or Scheffé tests when multiple comparisons are being made, unless you can a priori identify a smaller number of tests, in which case use Fisher's Protected Least Significant Difference method or the Bryant-Paulson modification of Tukey's Honestly Significant Difference method; all of these can be used for multivariate designs.
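The experiment-wise error problem, and the simple Bonferroni fix, can be shown with a few lines of arithmetic.  The numbers here are generic (alpha = 0.05 over 100 independent tests, as in the 5-in-100 example above), not anything computed from the diary's data.

```python
alpha = 0.05
m = 100  # number of independent significance tests

# Probability of at least one false rejection somewhere in the family:
# the complement of getting zero false rejections in all m tests.
fwer = 1 - (1 - alpha) ** m
print(round(fwer, 3))  # ~0.994: a "significant" fluke is nearly guaranteed

# Bonferroni-style correction: run each test at alpha/m so the
# family-wise error rate stays capped near the original alpha.
alpha_bonf = alpha / m
fwer_corrected = 1 - (1 - alpha_bonf) ** m
print(round(fwer_corrected, 3))  # ~0.049, back under 0.05
```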

Discriminant function analysis provides you with the linear combination of variables that gives the greatest prediction.  Principal components analysis provides the linear combination of variables that explains the most variance in the sample.
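For the PCA half of that, here is a minimal sketch using only numpy: the principal components are the eigenvectors of the covariance matrix, sorted by how much variance they explain.  The data is simulated (three fake standardized covariates sharing one latent factor), purely to show the mechanics, not to model any real census variables.

```python
import numpy as np

rng = np.random.default_rng(0)

# Simulated covariates for 200 districts: two are noisy proxies for the
# same latent factor (with opposite sign), one is pure noise.
latent = rng.normal(size=200)
X = np.column_stack([
    latent + 0.5 * rng.normal(size=200),
    -latent + 0.5 * rng.normal(size=200),
    rng.normal(size=200),
])

# Eigendecomposition of the covariance matrix, largest eigenvalue first.
cov = np.cov(X, rowvar=False)
eigvals, eigvecs = np.linalg.eigh(cov)
order = np.argsort(eigvals)[::-1]
eigvals, eigvecs = eigvals[order], eigvecs[:, order]

pc1 = eigvecs[:, 0]               # loadings: the best linear combination
explained = eigvals / eigvals.sum()  # share of variance per component
print(pc1)
print(round(explained[0], 2))     # PC1 captures most of the variance
```

Here the first component loads heavily (with opposite signs) on the two proxy variables, recovering the latent factor that neither variable measures cleanly on its own, which is exactly the point about proxies above.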
