Overdetermined

The Plebeian's Guide to Polls - From Sentiment to Substance

 

 
This week proves that you should never leave your project outline on someone else's computer.  In last week's issue, "Margin of Error", we discussed how polling provides a stable estimate of public opinion.  Today's issue, "From Sentiment to Substance", actually should have been LAST WEEK'S issue.  Oh well.  I'll fix it in the archive.  Anyway, today's issue will give a layman's view of how polls go from idea to implementation: how they are actually conducted.
 
 
 
-----------------------------------
 
From Sentiment to Substance
 
I talked before about four types of polls: political, policy, parse, and push polls.  While I consider these to be largely distinct categories, they unequivocally share one thing: polling methodology.  The easiest way to understand how polls are conducted is to follow an example from inception to completion.  So I propose that we take a hypothetical political poll for the 2048 presidential race (between Malia Obama and Piper Palin).
 
In framing our demo-poll, let's start with the fundamental questions Dirty D lays out in his Building a Poll series:
  • WHAT questions do we want to ask?
  • Of WHOM do we want to ask them?
  • WHEN do we want to frame our questions?
  • WHY are we asking them?
  • And HOW do we go about doing this "polling" thing?
So for our hypothetical example, let's break these down.
 
WHAT do we want to ask?  Since this is a political poll, we're primarily interested in gauging the relative strength of the candidates in the electorate.  So we want to ask who people plan to vote for, Malia or Piper?  Let's take this a step further, though.  It's all well and good to know how many of the people we poll plan to vote for Piper, and how many for Malia.  But wouldn't it be even better if we knew more about the kinds of people we talked to?  Let's ask some more questions - how about the race, religion, gender, and educational attainment of the people we're going to poll.
 
Of WHOM do we want to ask?  Well, in 2048, the biggest swing state and the center of the US population is Rhode Island.  So let's confine our poll to Rhode Island voters.
 
WHEN do we want to frame our questions?  If we want to know about the 2048 presidential election, we should tailor our questions to ask who people plan to vote for on election day, not who they might vote for if the election were held today.  Our demographic questions, on the other hand, should probably be phrased in present tense.  We don't have a whole lot of interest in what race voters will be on election day.  It's a pretty safe bet that they'll be the same race they were when the polls were taken.
 

WHY do we want to ask these questions?  That's mostly summed up by the fact that this is a political poll.  We're interested in understanding the support dynamics for the two candidates - who is ahead overall, who leads with certain demographic groups, etc.

And lastly, HOW are we going to ask people these questions?  This is what I referred to before as "polling methodology": what all polls have in common, however divergent their goals might be.  HOW is our key question in this issue.  We want to understand what happens in the space between sentiment (what people think on a subject) and substance (the raw data of a completed poll).  We're going to run into a lot of potential sources of methodological error along the way - ways poll results may become skewed because of decisions made by pollsters in the polling process.  These are important to remember, because they stand at the heart of why not all polls are created equal.

So our first step in polling Obama vs. Palin '48 should be to decide on the wording of our questions.  This is actually a very big deal, counterintuitive as it might seem.  If you ask someone, "Which candidate do you think would make the best president?", you'll almost always hear the same answer as if you had asked, "Which candidate do you plan to vote for in November?"  But you won't always hear the same answer, and as I said in my earlier article, accuracy is of paramount importance in political polling.  If just 1% of people respond to those two questions differently, you've introduced 1% of methodological error into your estimates.  That may not sound like a lot, but remember that close political races can come down to the wire, close to a 50/50 split.  Even little things like how questions are worded can skew a poll toward one candidate or another.
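To put numbers on that, here is a small worked example in Python.  Everything in it is made up for illustration: the 50.5% "true" support figure, the assumption that the wording nudges answers away from Malia rather than toward her, and the helper name shifted_shares.  It is only a sketch of the arithmetic, not a model of how real respondents behave.

```python
def shifted_shares(true_share_a, shift):
    """Return (share_a, share_b) in a two-candidate race after `shift`
    (a fraction of ALL respondents) answers B instead of A."""
    share_a = true_share_a - shift
    return share_a, 1.0 - share_a


# Hypothetical: Malia truly leads 50.5% to 49.5%.
true_malia = 0.505
true_lead = true_malia - (1.0 - true_malia)

# Hypothetical: question wording moves 1% of respondents from Malia to Piper.
observed_malia, observed_piper = shifted_shares(true_malia, 0.01)
observed_lead = observed_malia - observed_piper

print(f"true lead for Malia:     {true_lead:+.1%}")
print(f"observed lead for Malia: {observed_lead:+.1%}")
```

Under these assumptions a one-point shift moves the margin by two points, because every respondent Malia loses is one Piper gains.  In a race this close, that is enough to make the poll report the wrong leader.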

All right, we've thought about our questions, so let's posit that we've decided what to ask.  Now we have to figure out how to ask it.  There are a number of different ways polls can be taken, and as an everyday citizen who gets exposed to polling data in the newspaper or on TV, it's important to understand that no methodology is perfect.  Every methodology has its own problems.  The two critical concerns in deciding how to take a poll are:

  1. Gathering a random sample from all the people you're interested in
  2. Ensuring an adequate response rate (ensuring people don't just ignore your questions)
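For readers who like to see things concretely, the two concerns above can be sketched in a few lines of Python.  The voter list, the sample size, the 25% response rate, and the function names draw_sample and contacts_needed are all hypothetical; a real pollster's sampling frame and procedures are far more involved.

```python
import math
import random


def draw_sample(voter_ids, sample_size, seed=None):
    """Simple random sample without replacement: every voter on the
    list has the same chance of being picked."""
    rng = random.Random(seed)
    return rng.sample(voter_ids, sample_size)


def contacts_needed(target_completes, expected_response_rate):
    """How many people to contact to expect `target_completes` answers."""
    return math.ceil(target_completes / expected_response_rate)


# Hypothetical list of 100,000 Rhode Island voter IDs.
voter_ids = list(range(1, 100_001))

# Assume we want 1,000 completed interviews and expect 1 in 4 people to answer.
to_contact = contacts_needed(1000, 0.25)
sample = draw_sample(voter_ids, to_contact, seed=2048)

print(f"contacting {len(sample)} voters to get about 1000 responses")
```

Note that contacting more people only fixes the quantity of responses.  It does nothing about who chooses to respond, which is where the trouble usually starts.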

We can think about a few ways to poll people here.  Let's start with something most people are familiar with - online polls.  (For polling nerds, I'm not talking about secure-server login polls here, I'm talking about your everyday CNN.com or FoxNews.com sidebar poll.)  You may have noticed that news organizations that put up online polls always go out of their way to mention that the results are unscientific.  Why is that?  Well, these kinds of polls are very prone to response pattern problems.  First of all, people decide voluntarily whether or not to participate.  This leads to a situation where people who are interested in the poll topic vote overwhelmingly, whereas people who aren't terribly interested don't really vote.  On some issues, this can be a huge problem.  Say you put up an online poll asking if people approve of the WTO (World Trade Organization).  There's a vocal minority who actively oppose the WTO, but there's not really an equally vocal counterpart on the other side.  So if you put up an online poll, you're much more likely to draw respondents who vociferously disapprove than those who - more ambivalently - approve.  Your poll won't give you a good cross-section of all people.
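Here is a rough sketch of how lopsided a voluntary poll can get.  The numbers are invented purely for illustration: suppose only 10% of people disapprove of the WTO, but half of them bother to click on the poll, while only 5% of the mildly approving majority do.  The function name respondent_mix is likewise just a label for this example.

```python
def respondent_mix(population_share, participation_rate):
    """Expected share of each group among the people who actually answer.

    Both arguments are dicts keyed by group name."""
    weights = {
        group: population_share[group] * participation_rate[group]
        for group in population_share
    }
    total = sum(weights.values())
    return {group: weight / total for group, weight in weights.items()}


# Hypothetical population: a small vocal minority and a quiet majority.
population_share = {"disapprove": 0.10, "approve": 0.90}
# Hypothetical chance that a member of each group votes in the online poll.
participation_rate = {"disapprove": 0.50, "approve": 0.05}

mix = respondent_mix(population_share, participation_rate)
for group, share in mix.items():
    print(f"{group}: {population_share[group]:.0%} of the public, "
          f"{share:.0%} of poll respondents")
```

With these made-up rates, a group that is one tenth of the public ends up as slightly more than half of the respondents, so the sidebar poll would show a majority disapproving.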

Polls-by-mail, where you fill out a series of questions and send them back to a pollster, often have a good random sample to start.  Packets are mailed out to a truly random sample of households.  But polls-by-mail also lack the immediacy of, say, a phone call.  And they require extra work on the part of the person polled.  Response rate tends to be very low for polls-by-mail, since it's so easy for individuals to ignore the packet of materials.  Now, a low response rate in itself may not be a big problem - if you know you won't get many packets back, you just send more out to compensate.  The problem comes in if there's a pattern to who does and doesn't respond, as with the online poll example above.  To give an example of how this can skew things, imagine that 1% of the people receiving a poll-by-mail packet are functionally illiterate.  Obviously, these people will be less likely to return the completed packet.  But say we're using poll-by-mail for a political poll.  Do we expect functionally illiterate people to be equally balanced between Democrats and Republicans?  Unlikely.  People who are illiterate will have a better than average chance of falling in a few other categories as well - low-income, low-education, etc.  We find ourselves undersampling from one portion of the population just because of the polling methodology we chose.
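The difference between a low response rate and a patterned one is easy to check with a little arithmetic.  In the sketch below, every figure is an assumption chosen for illustration, including the direction of the lean: the 1% group is assumed to favor Malia 70/30, everyone else splits 50/50, and the function name estimated_support is hypothetical.

```python
def estimated_support(groups):
    """Expected poll result for Malia among people who return the packet.

    `groups` is a list of (population_share, response_rate, support) tuples."""
    responders = sum(share * rate for share, rate, _ in groups)
    supporters = sum(share * rate * support for share, rate, support in groups)
    return supporters / responders


# Case 1: response is low (10%) but the same for everyone.
uniform = [
    (0.01, 0.10, 0.70),  # hypothetical 1% group, assumed to lean toward Malia
    (0.99, 0.10, 0.50),  # everyone else, assumed evenly split
]

# Case 2: same population, but the 1% group never returns the packet.
patterned = [
    (0.01, 0.00, 0.70),
    (0.99, 0.10, 0.50),
]

print(f"uniform low response: {estimated_support(uniform):.1%} for Malia")
print(f"patterned response:   {estimated_support(patterned):.1%} for Malia")
```

In the first case the poll still lands on the true figure of 50.2%, even though nine in ten packets go in the trash.  In the second it reports an even 50.0%.  The gap is small because the group is small, but in a race near 50/50 small gaps matter.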

Phone polls are perhaps the most well-known polling method, at least for political polling.  One advantage of phone polls, which I alluded to earlier, is their immediacy - once you have someone on the line, it's not that hard to get them to answer a few questions.  Not compared to making them find time to sit down and fill out a packet (and then mail it back in), anyway.  But even phone polls can run into sampling problems, as was widely discussed during the 2008 presidential campaign.  More and more young people are moving toward using only cell phones, but phone polls are traditionally restricted to landlines.  So there is a segment of the population that you're likely to miss simply by choosing the phone poll methodology.

What's left once you have your data is analysis.  And I'll leave that for another time, because that's a whole 'nother can of worms.

As I said, no system is perfect.  It's up to pollsters to weigh the pros and cons of each system and choose the one they think will give them the most accurate representation of the population they're interested in.  But it's up to us, the public, to understand that these differences in methodology can produce real differences in results as well.  Not all polls are equal, and we should all be cognizant of the fact that poll results are always shaped, to some extent, by the choices of the people conducting the polls.

Join me again next Wednesday, and we'll get back into tackling some of the mechanics of poll results themselves and what polls really tell us.