Overdetermined

Data

Data Mining for Journalists

Via Slashdot, investigative journalist John Mecklin lays out a way that the Internet revolution is actually helping journalism (crazy, I know):

Now, in the post-Google Age, Allison sees the possibility that computer algorithms can sort through the huge amounts of databased information available on the Internet, providing public interest reporters with sets of potential story leads they otherwise might never have found. The programs could only enhance, not replace, the reporter, who would still have to cultivate the human sources and provide the context and verification needed for quality journalism. But the data-mining programs could make the reporters more efficient — and, perhaps, a less appealing target for media company bean counters looking for someone to lay off.

IMHO, the part about investigative reporters not getting laid off seems increasingly far-fetched.  There are problems in the news business that a few new reporting techniques won't solve.  But still, increasing the efficiency with which the public can gain from its own data is something worth cheering.  As I've tried to stress throughout my posts, the ability to search through massive databases of material like this is still in its infancy.  Our ability to collect information has outstripped our ability to make sense of it, and we're still growing into all the things we can do with this data.

Building a Voter File: Address Standardization

In my last entry, I discussed matching lists when they did not share a common, persistent and unique identifier.  Basic conclusion: challenging! In this week's entry, I'll share a common technique for making the job a little easier--one which has a number of uses beyond list-matching.  Read more...
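For readers who want a feel for the idea before clicking through, here is a minimal sketch of address standardization in Python. It is not the method from the full entry: the `ABBREVIATIONS` table and the `standardize_address` function are hypothetical names invented for illustration, and the table is far smaller than anything you would use on a real voter file. The point is only that two differently typed versions of the same address can be reduced to one comparable string before matching.

```python
import re

# Hypothetical, deliberately tiny lookup table (an assumption for illustration).
# A production table would cover many more suffixes, directions and unit types.
ABBREVIATIONS = {
    "STREET": "ST", "AVENUE": "AVE", "BOULEVARD": "BLVD",
    "ROAD": "RD", "DRIVE": "DR", "LANE": "LN",
    "NORTH": "N", "SOUTH": "S", "EAST": "E", "WEST": "W",
    "APARTMENT": "APT", "SUITE": "STE",
}

def standardize_address(raw: str) -> str:
    """Return an upper-case, punctuation-free, abbreviated form of an address."""
    text = raw.upper()
    # Treat "#4" as a unit designator so it matches "Apt. 4".
    text = text.replace("#", " APT ")
    # Replace anything that is not a letter, digit or space with a space.
    text = re.sub(r"[^A-Z0-9 ]", " ", text)
    # Swap each long form for its abbreviation; leave other tokens alone.
    tokens = [ABBREVIATIONS.get(tok, tok) for tok in text.split()]
    return " ".join(tokens)

a = standardize_address("123 North Main Street #4")
b = standardize_address("123 n. main st., apt. 4")
print(a)
print(a == b)
```

A token-by-token lookup like this is naive: it would also shorten the street name in "12 North Street" to "N", so treat it as a starting point rather than a finished standardizer.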

"We don't live in a Christian nation, at least not the way they mean it." - Quantified

Hat tip to Secular Right.

I normally make it a point to avoid anything to do with John Derbyshire and Heather MacDonald, who are otherwise quite repellent, but even while on vacation, I can't quite seem to get away from them. A friend of mine IM'd me their reading of the Pew Study, and I thought that it might be interesting to take a look at it.  They surveyed 2,905 American adults from July 31, 2008 to August 10, 2008.  Anyone self-identifying as atheist, agnostic or unaffiliated with any religion was left out of the poll. The previous waves of the study were in 2002 and 2007.

There's more...

Cell-phone Polling and Obama Leads

Via the always-excellent Future Majority, Pew Research has released a new report about the impact of missing cell-phone-only voters in telephone polling.

An analysis of six Pew surveys conducted from September through the weekend before the election shows that estimates based only on landline interviews were likely to have a pro-McCain tilt compared with estimates that included cell phone interviews. But the difference, while statistically significant, was small in absolute terms

There's more below the fold...

An Alarming Use of Personal Data

It's not political, but the International Business Times has an interesting story about the rise of "spear phishing": personalized spam.

A new study by Cisco Systems Inc. found an alarming increase in the amount of personalized spam, which online identity thieves create using stolen lists of e-mail addresses or other poached data about their victims, such as where they went to school or which bank they use.

Unlike traditional spam, most of which is blocked by e-mail filters, personalized spam, known as "spear phishing" messages, often sail through unmolested. They're sent in smaller chunks, and often come from accounts the criminals have set up at reputable Web-based e-mail services. Some of the messages are expertly crafted, linking to beautifully designed Web sites that are bogus or immediately install malicious programs.

I suppose this will mark me as a naif, but I had never thought of this particular bit of criminality.  On hearing of it, though, it strikes me as obvious--and bound to grow.  According to the article, personalized spam only represents four-tenths of a percent of all spam, but if it's much more effective and less likely to be caught by a spam filter, then smart criminals should quickly add it to their arsenals.  And as immense databases full of personally-identifying information proliferate ever-wider, there will be more targets.  We live in interesting times.

Using census data as a pollster

I'd like to follow up on Pluribus' brilliant post on downloading and using census data with an explanation of how these data can be useful to you from a polling perspective. You're starting to see some of this happen with voter file vendors, but, hey, remember, the point of this site is to help smaller groups who can't afford a Catalist subscription do all the crazy fun data things that we do. 

There's more.

Are polling databases a good solution to the sample size problem?

Working in strategic polling is a lot like being in college: you're constantly scrambling to meet crazy deadlines, dealing with too many obligations at once, and barely finding enough time, money and energy to put into any single one of them. So, similar to writing papers and cramming for exams, you limit how many resources you put towards each poll. Instead of calling 5,000 people over the course of one week, hiring out the entire phone bank, you call 500 people, hire out only part of the phone bank's hours, and do it over the course of two days.  You get the data faster, it's way cheaper and your client is happy with what he has. And, really, it's only the difference of a few points on the MoE, so who's really hurt?

This is, of course, completely wrong.  Think about it for a minute, and then click the full entry to see why.
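For reference, the "few points on the MoE" can be worked out with the textbook margin-of-error formula. The sketch below is an illustration, not the argument from the full entry, and it rests on assumptions that do not come from the post: simple random sampling, a 95% confidence level (z = 1.96) and the worst-case proportion p = 0.5. The function name `margin_of_error` is made up for this example.

```python
import math

def margin_of_error(n: int, p: float = 0.5, z: float = 1.96) -> float:
    """Margin of error for a proportion under simple random sampling.

    Assumptions (not from the post): 95% confidence (z = 1.96) and
    the worst-case proportion p = 0.5 unless told otherwise.
    """
    return z * math.sqrt(p * (1 - p) / n)

for n in (500, 5000):
    # Express the result in percentage points, rounded to one decimal.
    print(f"n={n}: +/-{100 * margin_of_error(n):.1f} points")
```

Under those assumptions the topline margin is roughly 4.4 points at 500 interviews and roughly 1.4 points at 5,000. The same function can be pointed at any subgroup by passing the number of respondents in that subgroup as `n`.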

Am I missing something?

Far be it from me to ever, ever think that I could be better at reading data than Steven Levitt. I've been a big fan of his since back when he was putting out studies on crime rates at The Harvard Society of Fellows, and I think that there are few people who are as capable of looking at data without predisposition as he is, and, let's be honest, Freakonomics was the book in 2005. That being said, there are times when I read his NYT blog and wonder what the hell is going on there. Today's guest post from Eric Oliver was one of those times.

There's more.

Quick Hit: Blacks cannot be the reason Prop 8 passed

Normally, we like to provide our own analysis, but there is absolutely nothing we can add to this.

Before we start making ridiculous generalizations about black people, it helps to do a little research.

DD
