Recently SusanG and georgia10 posted a quote by Valdis Krebs that states, "If you're looking for a needle, making the haystack bigger is counterintuitive. It just doesn't make sense." This is meant to indicate that the NSA's apparent program of creating a database of all domestic phone calls would be ineffective at capturing terrorists.
In the interest of helping us all be well-informed on this issue, I'd like to offer a different perspective, one that disagrees with Mr. Krebs. The NSA's approach could actually be effective.
Note: I'm not defending the NSA's program in this diary! No no no. I'm adding some knowledge I have to the mix. Please read my whole post before assuming I'm an idiot. :)
The needle in a haystack analogy that Krebs uses is misleading because he doesn't paint the whole picture. For one thing, suppose you were looking for a needle in a haystack, but you were only allowed to look at part of the hay. That might make your task impossible, if the needle happens to be where you can't look.
But even that doesn't capture very accurately what the NSA tries to do, I believe. It'd be better to imagine that you know very little about this "needle" that you're looking for. Perhaps you don't even know it's a needle, only that it's sharper than all the other bits of hay. In this case one way you could possibly find the elusive object would be to test the "sharpness" of all the pieces of hay, compare them, and select the sharpest.
That would be incredibly difficult, so another strategy would be to try to study random bits of hay to get a better understanding of the properties of hay. Gather as much information about the hay as possible, and then look for anomalies. If your data-collection is accurate enough, or if you "cast a big enough net," you might notice that there's something in the hay that, in addition to being rather sharp, is also silver and made of metal, unlike all the rest of the hay. Sure enough, this will be your needle.
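To make the hay-sampling idea concrete, here is a minimal sketch in Python. It is only an illustration of the general "learn what's normal, then flag what isn't" approach, not a description of anything the NSA actually does. The "sharpness" scores, the function name find_anomalies, and the cutoff of 3 standard deviations are all made up for the example, and it assumes the sample you study is ordinary hay.

```python
from statistics import mean, stdev

def find_anomalies(items, baseline, threshold=3.0):
    """Return (index, value) pairs for items far from the baseline sample.

    'baseline' is the sample of hay we studied to learn what normal hay
    looks like. 'threshold' is a hypothetical cutoff, measured in
    standard deviations from the baseline mean.
    """
    mu = mean(baseline)
    sigma = stdev(baseline)
    if sigma == 0:
        # Every baseline item is identical, so anything different stands out.
        return [(i, x) for i, x in enumerate(items) if x != mu]
    return [(i, x) for i, x in enumerate(items)
            if abs(x - mu) / sigma > threshold]

# Hypothetical sharpness scores for ordinary pieces of hay.
hay = [1.0, 1.2, 0.9, 1.1, 1.0, 0.8, 1.3, 1.1, 0.9, 1.0]
needle = 9.5  # far sharper than any piece of hay

# The full haystack: lots of hay with one needle buried in the middle.
haystack = hay + [needle] + hay

# Learn "normal" from the sample, then scan the whole haystack.
suspects = find_anomalies(haystack, baseline=hay)
print(suspects)
```

Notice that the scan never needs to be told what a needle is. It only needs a good picture of ordinary hay, plus access to the part of the haystack where the needle happens to be.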
This is the kind of analysis that people in data mining do. They use computer algorithms to look for similarities and differences in the data, and the more data they have about a given population, the more likely they are to find what they're looking for.
We have to note, however, that no one here at dKos actually knows what kind of algorithms or data mining the NSA does. (Or, if anyone here does know, they're not going to tell us. :) Even the mathematicians who work for the NSA don't typically know what the algorithms they're designing are going to be used for. (As a mathematician myself, I know people who work for the NSA, and they tell me that they're often given abstract problems to solve without knowing what the applications will be.) So I could be totally wrong.
But outside the NSA, people who study this kind of pattern-matching from large amounts of data will always tell you that the more data they have, the better.
By no means am I condoning the actions of the NSA and the Bush administration on this issue, however! I find the whole thing as repulsive and scary as everyone else. But we'll be better at combating the evil if we have greater knowledge about what's going on. Taking potentially fallacious sound bites, like the one Valdis Krebs uttered, at face value can be misleading.