As some of you might know, the New York State Senate recently released a document with all of its pork projects for the years 2003-2004 and 2004-2005.
One little problem: it was a password-protected PDF made up of scanned images (i.e., non-searchable). Our Republican (and dick) Senate Majority Leader deliberately released the information in the most unreadable way possible.
In fact, virtually all of the news stories posted the day after the dump (Thanksgiving, of course) talked about nothing other than how unreadable the data was.
Our State Assembly, which is run by (slightly less corrupt) Democrats, released its information as a searchable PDF with no password.
To make a long story short, I cracked the password and ran the document through some sophisticated OCR (optical character recognition) software. I then ran the output through a small parsing script I wrote to make it more readable. The results for 2004-2005 are now available on The Albany Project's website for all to see (see the diary).
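For the curious, a parsing pass of this kind might look something like the Python sketch below. It is not the actual script used here. It assumes, purely for illustration, that each useful line of OCR output is a description followed by a dollar amount, and the names `parse_line` and `parse_text` are made up for this example.

```python
import re

# Assumed (hypothetical) line shape: free-text description, then "$" and an amount.
# Lines without a dollar sign (page numbers, headers, OCR noise) are skipped.
AMOUNT_RE = re.compile(
    r"^(?P<desc>.+?)\s+\$\s*(?P<amount>\d[\d,]*(?:\.\d{2})?)\s*$"
)


def parse_line(line):
    """Return (description, amount) for one OCR line, or None if it does not match."""
    cleaned = " ".join(line.split())  # collapse the stray whitespace OCR tends to leave
    match = AMOUNT_RE.match(cleaned)
    if match is None:
        return None
    amount = float(match.group("amount").replace(",", ""))
    return match.group("desc"), amount


def parse_text(text):
    """Parse every line of OCR output, keeping only the lines that look like records."""
    records = []
    for line in text.splitlines():
        record = parse_line(line)
        if record is not None:
            records.append(record)
    return records
```

Real OCR output is messier than this (misread digits, amounts split across lines), so anything along these lines would need hand-checking against the original pages.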
We've already uncovered some interesting scandal-like behavior (including some from the guy NYBri ran against).
Enjoy!