Building a Voter File Part 6: Tutorial the Second

So, we've decided to try to build a voter file for Adams County, Ohio.  We've downloaded the information we need from the Secretary of State and figured out what's in the file.  But all we have now is a massive text file.  What's the next step?

 

Before we do anything else, we should load the data into a database so that it's easier to manipulate.  I'm going to use MySQL for this example, mainly because I know how (and if you don't, and you're going to be doing a lot of work with voter data, it's well worth your time to learn--there are any number of good books out there, plus web tutorials). 

 

First, let's take a look at the file.  Notice that the various fields are separated by commas, i.e. the file is comma-delimited.  This makes our lives easier.  Some files separate fields based on their widths (e.g. "Columns 1-5 are ID, columns 6-19 are name and so on"), which, in my expert opinion, is a weapons-grade headache.  Notice also that the comma (the delimiter) does not actually appear inside any of the fields (e.g. the address column is not "1 Main St., Sandusky, OH").  This is helpful as well: a stray comma inside a field would split it in two and shift every later column in that row, and some states that maintain less-clean files will screw this up.
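To make the difference concrete, here's a rough sketch in Python of parsing the two kinds of files.  The sample rows and the column positions are made up for illustration (they are not from the real Adams County file), and the function names are my own.

```python
import csv
import io

# Hypothetical sample data, invented for this example.
delimited = "00042,SMITH JOHN,1 MAIN ST\n"
fixed_width = "00042SMITH JOHN    "  # columns 1-5 are ID, columns 6-19 are name


def parse_delimited(text):
    """Split comma-delimited text into a list of rows, each a list of fields."""
    return list(csv.reader(io.StringIO(text)))


def parse_fixed_width(line, widths):
    """Slice one line into fields.

    widths is a list of (start, end) column positions, 1-based and
    inclusive, exactly as a file layout document would describe them.
    """
    return [line[start - 1:end].strip() for start, end in widths]


print(parse_delimited(delimited))
print(parse_fixed_width(fixed_width, [(1, 5), (6, 19)]))
```

With the delimited file, the file itself tells you where each field ends.  With the fixed-width file, you have to carry the layout around separately and hope it matches the data.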

 

Second, let's go through a couple of rows.  We're looking here for anything that could break the load: missing fields in some rows, delimiters within fields, or any other irregularity that could cause the file to load improperly.  This will also give us a sense of what the values in each column mean.  For example, looking at the first few rows of the Adams file, I see no obvious problems; I'll also note that regular vote history is recorded with an "X" in that column, while primary vote history is recorded with an "R" or "D".  These sorts of details will come in handy later.
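Eyeballing the first few rows is a good start, but the same checks can be run over the whole file.  Here's a minimal sketch, assuming the file has a header row (I haven't confirmed that for every state's file) and using invented column names and sample rows.  It flags any row whose field count differs from the header and tallies the values that show up in each column.

```python
import csv
import io
from collections import Counter


def inspect_rows(lines, delimiter=","):
    """Return (bad_rows, value_counts) for an iterable of text lines.

    bad_rows: list of (row_number, field_count) for rows whose field
        count differs from the header's.
    value_counts: dict mapping column name -> Counter of values seen
        in well-formed rows.
    """
    reader = csv.reader(lines, delimiter=delimiter)
    header = next(reader)  # assumption: the first row names the columns
    bad_rows = []
    value_counts = {name: Counter() for name in header}
    for row_number, row in enumerate(reader, start=2):
        if len(row) != len(header):
            bad_rows.append((row_number, len(row)))
            continue
        for name, value in zip(header, row):
            value_counts[name][value] += 1
    return bad_rows, value_counts


# Hypothetical sample; the last row has a stray comma inside the name.
sample = io.StringIO(
    "VOTER_ID,NAME,GENERAL_2008,PRIMARY_2008\n"
    "1,SMITH JOHN,X,R\n"
    "2,DOE JANE,X,D\n"
    "3,ROE RICHARD,,\n"
    "4,POE,EDGAR,X,R\n"
)
bad_rows, value_counts = inspect_rows(sample)
print(bad_rows)
print(value_counts["PRIMARY_2008"])
```

For a real file you would pass an open file object instead of the sample.  Any row that lands in bad_rows deserves a look by hand before loading.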

 

As long as we don't see anything amiss, it's time to actually load the data.  We can create an empty table with a column for each field in the voter data, and then populate it using the MySQL command LOAD DATA INFILE '[file]' INTO TABLE [tablename].  Because MySQL expects tab-separated fields by default, we also need to add FIELDS TERMINATED BY ',' for a comma-delimited file.  And voila! We've turned a completely unreadable file into a tech-person-readable SQL table.  Later in the series, we can discuss how to get this data out to other users.
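As a rough illustration of that load step, here's a small Python helper that writes out the two MySQL statements for a given list of columns.  The table name, column names, and file path are placeholders I made up, and typing every column as VARCHAR(255) is a lazy starting point rather than a recommendation.  The sketch also assumes the file has a header row to skip and Unix line endings.

```python
def build_load_statements(table, columns, path):
    """Return (create_sql, load_sql) as strings of MySQL.

    This is a sketch: names and the path are pasted in without
    escaping, so only use it with values you typed yourself.
    """
    column_defs = ",\n".join(f"  `{name}` VARCHAR(255)" for name in columns)
    create_sql = f"CREATE TABLE `{table}` (\n{column_defs}\n);"
    load_sql = (
        f"LOAD DATA INFILE '{path}'\n"
        f"INTO TABLE `{table}`\n"
        "FIELDS TERMINATED BY ','\n"    # MySQL's default is a tab
        "LINES TERMINATED BY '\\n'\n"   # assumption: Unix line endings
        "IGNORE 1 LINES;"               # assumption: skip a header row
    )
    return create_sql, load_sql


# Hypothetical names, for illustration only.
create_sql, load_sql = build_load_statements(
    "adams_voters",
    ["voter_id", "name", "general_2008", "primary_2008"],
    "/tmp/adams.txt",
)
print(create_sql)
print(load_sql)
```

Plain LOAD DATA INFILE reads the file from the database server's own disk.  If the file lives on your machine instead, MySQL's LOAD DATA LOCAL INFILE variant is the one to look at, provided the server allows it.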

 

Editor's note: I'm aware that this installment elides a lot of technical detail.  Over the course of the series, I will post some additional material consisting of the code that you could actually use to do what we're talking about here, along with my annotations; however, that's a longer-term project than merely writing up the post.  When I do get around to actually doing stuff, rather than telling you, the reader, what to do, I'll note it.  I can probably use the practice, if nothing else.
