I'm seeing a big potential problem in the polls right now. We're talking potentially catastrophic.
Maybe I'm overreacting, but let me give you two bits of data I ran across today:
First, a Research 2000 poll from Daily Kos, NC-President. Parenthetical numbers are from the R2K poll taken two weeks ago.
All voters:
McCain (R) 45 (44)
Obama (D) 47 (46)
Early voters (19%):
McCain (R) 40
Obama (D) 52
Okay, nothing too odd there. But I was looking at early voting statistics earlier in the day, to see if I could get a handle on what was up with turnout this year vs. 2004. My hypothesis is that the increased turnout and the increased Democratic share in early voting may reflect stable Republican numbers from 2004 to 2008, but significantly different Democratic numbers. This would undermine the claim by Bill McInturff (McCain's pollster) that we'll be seeing higher turnout from all parties and, concomitantly, an electorate much like 2004 (just with higher absolute magnitudes).
Anyway, that got me nowhere pretty fast. I could only turn up party shares of early voting for 2004 in one state: North Carolina. The NC electorate is tremendously heavy on early voting, and they have great data on their early voters. Here's the relevant bit of what I learned:
2004 Presidential Election
Early Voters 1,094,154
Total Voters 3,552,449
Early Share of Total Electorate 30.8%
2008 Presidential Election (as of 5:25am, 10/31)
Early Voters 2,078,050
Early Share of 2004 Electorate 58.5%
Kudos to anyone who sees the problem already.
In 2004, 30.8% of North Carolinians voted early. In 2008, so far, almost twice as many North Carolinians have voted in early balloting. But the R2K survey for dKos is only showing a 19% share of their polled sample as early voters; fully 81% of the people they surveyed have yet to vote. Something is going on here.
The easiest explanation is that turnout this year will just be 2x or 3x what it was in 2004 for North Carolina. That would explain the data in the R2K poll. Except that the North Carolina board of elections only reports a total of 6,191,768 registered voters. That's less than twice the 2004 turnout, and even in an historic election year like this year, turnout is unlikely to approach 100%.
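To make the arithmetic concrete, here's a quick back-of-the-envelope check in Python. It only uses the figures quoted above; the variable names are mine, and it assumes the poll's 19% early-voter share would have to describe the same pool of people as the early ballots already counted.

```python
# Figures quoted in the post; the variable names are illustrative.
EARLY_2004 = 1_094_154
TOTAL_2004 = 3_552_449
EARLY_2008 = 2_078_050        # early ballots as of 5:25am, 10/31
REGISTERED_2008 = 6_191_768   # registered voters per the board of elections
POLL_EARLY_SHARE = 0.19       # share of the R2K sample that had already voted

# If early voters really were 19% of the final electorate,
# this is how many people would have to vote in total.
implied_turnout = EARLY_2008 / POLL_EARLY_SHARE

# Compare that implied turnout with 2004 turnout and with registration.
turnout_vs_2004 = implied_turnout / TOTAL_2004
turnout_vs_registered = implied_turnout / REGISTERED_2008

# Even if every registered voter turned out, the ballots already cast
# would make up at least this share of the electorate.
min_early_share = EARLY_2008 / REGISTERED_2008

print(f"2004 early share:           {EARLY_2004 / TOTAL_2004:.1%}")
print(f"Implied 2008 turnout:       {implied_turnout:,.0f}")
print(f"Implied turnout vs 2004:    {turnout_vs_2004:.2f}x")
print(f"Implied turnout vs reg.:    {turnout_vs_registered:.2f}x")
print(f"Minimum 2008 early share:   {min_early_share:.1%}")
```

On these numbers, a 19% early share implies roughly 10.9 million voters, about 3.1 times 2004 turnout and well above the registration total. Even at 100% turnout, the ballots already cast would be about a third of the electorate.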
This corroborates my fears here. It appears that early voters are being significantly undersampled in the R2K poll. Very significantly undersampled. And since the Obama/McCain split is so large among the early voters who were sampled, this suggests that the actual state of the race in North Carolina may be very, very different from the 47/45 split R2K reports among all likely voters. This is exactly the sort of undersampling problem I've been discussing regarding stratified sampling and demographic weighting.
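As a rough illustration of why the undersampling matters, here's a sketch that re-weights the R2K topline under different early-voter shares. It rests on several assumptions of mine, not anything R2K has published: that the topline is a simple weighted mix of early voters and everyone else, that the rounded poll numbers are exact, and that people who haven't voted yet hold their preferences. The early shares other than 19% are hypothetical.

```python
# Poll figures quoted in the post (percentages).
OBAMA_ALL, MCCAIN_ALL = 47.0, 45.0
OBAMA_EARLY, MCCAIN_EARLY = 52.0, 40.0
POLL_EARLY_SHARE = 0.19


def back_out_rest(topline, early, early_share):
    """Solve topline = share * early + (1 - share) * rest for rest."""
    return (topline - early_share * early) / (1 - early_share)


def reweight(early, rest, new_share):
    """Recombine the two groups using a different early-voter share."""
    return new_share * early + (1 - new_share) * rest


# Implied support among respondents who had not yet voted (assumption:
# the topline is a plain weighted average of the two groups).
OBAMA_REST = back_out_rest(OBAMA_ALL, OBAMA_EARLY, POLL_EARLY_SHARE)
MCCAIN_REST = back_out_rest(MCCAIN_ALL, MCCAIN_EARLY, POLL_EARLY_SHARE)


def margin(early_share):
    """Obama minus McCain, in points, for a given early-voter share."""
    obama = reweight(OBAMA_EARLY, OBAMA_REST, early_share)
    mccain = reweight(MCCAIN_EARLY, MCCAIN_REST, early_share)
    return obama - mccain


# 0.19 reproduces the published topline; the other shares are hypothetical.
for share in (0.19, 0.35, 0.50):
    print(f"early share {share:.0%}: Obama lead {margin(share):+.1f}")
```

The point isn't the specific output, which depends entirely on the assumed shares, but the direction: the larger the true early-voter share, the further the weighted result drifts from 47/45 toward the early-voter split.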
I can't say a lot more right now. This needs further investigation. I expect I'll be returning to this topic very soon.
Comments
Mon, 11/03/2008 - 12:11 - stacie
I can haz update??/