I'm in the business. I consult for an agency of New York State. One of our main activities is to match datasets to either permit or disallow duplicate insurance coverage. We work with sources from all over state government. So let me talk about feasibility.
Presumably, each county board of elections will need to verify its new registrations. Ohio has 88 counties, with registrations distributed unevenly - more in urban and suburban areas than in the rural counties - and the targeted registrations (from the GOP perspective) are heavy in those Democratic areas. Let's say a typical urban BOE has to verify 25-50K registrations against DMV and SSA records.
Here's the problem. Name and address standardization for matching is a very difficult thing. It requires good tools, some patience, and lots of tries to get right. We're good at this, and it still took us several weeks to build out high-quality standardization routines. How are these individual BOEs supposed to build out that expertise in the little time available?
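To give a feel for what "standardization" means in practice, here is a deliberately minimal sketch. The abbreviation table and rules are hypothetical and illustrative - real routines run dozens of rules like these, tuned for weeks against the actual data - but even this toy version shows why two entries for the same address fail a naive comparison:

```python
import re

# Hypothetical, tiny abbreviation table -- production routines carry
# hundreds of entries (USPS suffixes, directionals, unit designators).
ABBREVIATIONS = {"ST": "STREET", "AVE": "AVENUE", "RD": "ROAD", "N": "NORTH"}

def standardize(text: str) -> str:
    """Uppercase, strip punctuation, collapse whitespace, expand abbreviations."""
    text = re.sub(r"[^\w\s]", " ", text.upper())          # drop punctuation
    tokens = [ABBREVIATIONS.get(t, t) for t in text.split()]
    return " ".join(tokens)

# These two strings fail an exact comparison until both are standardized:
print(standardize("123 N. Main St."))        # 123 NORTH MAIN STREET
print(standardize("123 North Main Street"))  # 123 NORTH MAIN STREET
```

The hard part isn't the code - it's discovering, case by case, which rules your particular data needs.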
The next problem is that the matching routines operate on a form of fuzzy logic. In our environment, matches are "scored" ... that is, the system estimates the likelihood that two rows refer to the same person. This is another place where experience and experimentation mix. When we built out our matching routines, it took a week or two of tuning to get acceptable results.
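A sketch of what such a scorer looks like, with hypothetical field names and weights (the weights are exactly the thing that takes weeks to tune; this uses Python's stdlib `difflib.SequenceMatcher` as a stand-in for a real string comparator):

```python
from difflib import SequenceMatcher

# Illustrative weights -- choosing these, per field, is the tuning work.
WEIGHTS = {"name": 0.5, "dob": 0.3, "address": 0.2}

def field_score(a: str, b: str) -> float:
    """String similarity in [0, 1] via difflib's ratio."""
    return SequenceMatcher(None, a, b).ratio()

def match_score(rec_a: dict, rec_b: dict) -> float:
    """Weighted similarity across fields: 0.0 = no match, 1.0 = identical."""
    return sum(w * field_score(rec_a[f], rec_b[f]) for f, w in WEIGHTS.items())

voter = {"name": "JON SMITH",  "dob": "1980-03-12", "address": "44 OAK AVENUE"}
dmv   = {"name": "JOHN SMITH", "dob": "1980-03-12", "address": "44 OAK AVE"}

# Neither 1.0 nor 0.0 -- a high-but-uncertain score, which is the whole point.
print(round(match_score(voter, dmv), 3))
```

Nearly every interesting pair lands somewhere strictly between 0 and 1, which is what forces the cutoff decisions described next.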
Of course, when the system runs, it returns sets with a range of scores. The highest scores indicate near-certain matches, and the lowest indicate really shaky ones. The systems people decide on cutoff values for "good" matches and "non-matches," and allow for clerical review of records landing in the middle. Where those cutoffs are set effectively encodes a bias toward matches or non-matches. In a BOE where Democrats are in charge, you might see a low cutoff for good matches, while a Republican might set the bar higher. And this is obviously a place where yet another lawsuit could challenge the methodology.
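The cutoff logic itself is trivial - which is exactly why it's so easy to tilt. A sketch, with made-up threshold values:

```python
# Hypothetical cutoffs. The code is trivial; the politics live in the numbers.
GOOD_MATCH = 0.90   # at or above: accept the match automatically
NON_MATCH  = 0.60   # below: reject automatically

def triage(score: float) -> str:
    """Route a scored pair to match / non-match / human review."""
    if score >= GOOD_MATCH:
        return "match"
    if score < NON_MATCH:
        return "non-match"
    return "clerical review"

# Raise GOOD_MATCH to 0.97 and borderline registrants flood clerical review;
# drop it to 0.80 and they sail through. Same data, different outcomes.
for s in (0.95, 0.75, 0.40):
    print(s, triage(s))
```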
Finally, matching of records is an imperfect science because it depends on human data entry in the first place. I cannot tell you how many cases we see every day of commercial and government agencies transposing digits in SSNs, swapping SSNs between people as they enter them, garbling dates, misspelling names, swapping first and last names - especially when those names aren't European names. The quality of data being offered as a verification base is pretty good, but it ain't perfect. And the data entry of 600K new names into the voter registration database is almost certain to be rife with errors. After all, all of these names are brand new to the system and will not yet have been verified by the individual at the polling place.
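A toy illustration of two of those error patterns - one transposed SSN digit and a first/last name swap - and why exact comparison calls them different people while even a slightly tolerant rule (hypothetical, written just for this example) does not:

```python
# Invented example records -- same person, mangled by ordinary data entry.
voter_row = {"ssn": "123-45-6789", "name": "GARCIA MARIA"}   # names swapped
dmv_row   = {"ssn": "123-45-6798", "name": "MARIA GARCIA"}   # digits transposed

def exact_match(a: dict, b: dict) -> bool:
    return a["ssn"] == b["ssn"] and a["name"] == b["name"]

def tolerant_match(a: dict, b: dict) -> bool:
    """Allow one adjacent-digit transposition in the SSN and reordered name tokens."""
    def one_transposition(x: str, y: str) -> bool:
        if len(x) != len(y):
            return False
        diffs = [i for i in range(len(x)) if x[i] != y[i]]
        return x == y or (len(diffs) == 2 and diffs[1] == diffs[0] + 1
                          and x[diffs[0]] == y[diffs[1]]
                          and x[diffs[1]] == y[diffs[0]])
    return (one_transposition(a["ssn"], b["ssn"])
            and sorted(a["name"].split()) == sorted(b["name"].split()))

print(exact_match(voter_row, dmv_row))     # False
print(tolerant_match(voter_row, dmv_row))  # True
```

Every tolerance rule like this is one more judgment call someone has to make, defend, and apply consistently across 88 county boards.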
Here is my point ... or points:
- this work is hard and cannot be done well, even in an automated way, in the less than three weeks remaining before election day.
- nobody has the manpower to do it manually.
- this work leaves open lots of doors for manipulation of the results.
Of course, that's their point, and by "their" I mean both the GOP and the 6th District Court. I don't know how stuffed that court is with Federalist Society members, but I do smell a stinker. We need to fight this hard. Since when did it become a crime for American citizens to vote? (don't answer that ... )