As could be expected, the revelation of NSA interception of electronic communications involving US persons absent warrants, and as directed by the President, has caused a furore. The number of related diaries here on dKos, the articles and pundit opinions in the MSM, and the blogs and rages devoted to that broad subject all testify to it.
Now, why should the topic of data mining be of interest here? Numerous reasons perhaps, but two are paramount.
- There has been a near-hysteria associated with the subject, i.e., "I have been raped. But, oh my God, I have been data-mined, too!" - indicating a misunderstanding of data mining.
- There has been widespread misinformation concerning data mining and its relevance to the NSA program of electronic communications intercepts ordered by President Bush. It must be countered.
So, for a different view about data mining, read on.
Before proceeding, I apologize in advance to those IT professionals, computer scientists, statisticians, mathematicians, and others who may be involved in data mining, either in theory or practice. I know I am using neither precise terminology nor the definitions that you might accept. The purpose here, though, rather than a technical or academic discussion, is to provide information that all can use in forming their own judgements.
Like most diarists, I have a point of view; in providing information, I also seek to advance my viewpoint ... which you do not have to accept, and for which I offer no apology. But, even if you do not subscribe to my viewpoint, at least take away the information, please?
What views? Well, frankly, the question of data mining with respect to President Bush's NSA program is a non-starter, a red herring that detracts from the central questions. By focusing on data mining, people lose sight of the forest for the trees, so to speak. Data mining is not the issue at all; illegal conduct in obtaining the data, and illegal use of the data mining results are the issues. In this case, data mining is merely a tool that has been used, benign in itself, and from a legal point of view, I think neutral.
That said, let's get on with data mining!
For starters, what is data mining? Well, there is no one accepted definition. Here is what Wikipedia has to offer as one definition. Data mining
"is the practice of automatically searching large stores of data for patterns. To do this, data mining uses computational techniques from statistics and pattern recognition."
In this limited discussion, I think this is perhaps the best definition, as others tend to emphasize the more technical aspects in varying degrees. Still, I would add a few things to it. Aside from statistics and pattern recognition, data mining techniques also rely on other mathematical tools, logical relationships, and artificial intelligence tools (such as expert system inference, artificial neural networks, hidden Markov models, and support vector machines). You can, of course, find any number of more technical definitions. In any case, I recommend that you read the entire Wikipedia article; it gives a frame of reference and lays the groundwork for future discussion.
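For the curious, the quoted definition ("automatically searching large stores of data for patterns") can be made concrete with a toy sketch. This is only an illustration under my own assumptions: the function name `frequent_pairs`, the `min_count` threshold, and the shopping-basket data are all made up, and real data mining uses far more sophisticated techniques on far larger data. It simply counts which pairs of items show up together often.

```python
from collections import Counter
from itertools import combinations


def frequent_pairs(records, min_count=2):
    """Return pairs of items that appear together in at least min_count records.

    Hypothetical helper for illustration only; not taken from any real system.
    """
    counts = Counter()
    for record in records:
        # sorted(set(...)) drops duplicates and gives each pair a stable order
        for pair in combinations(sorted(set(record)), 2):
            counts[pair] += 1
    return {pair: n for pair, n in counts.items() if n >= min_count}


# Made-up shopping baskets, purely for illustration
baskets = [
    ["bread", "milk", "eggs"],
    ["bread", "milk"],
    ["milk", "eggs"],
    ["bread", "butter"],
]

patterns = frequent_pairs(baskets)
for pair, n in sorted(patterns.items()):
    print(pair, n)
```

The point of the sketch is that the technique itself knows nothing about where the records came from or what the resulting "patterns" will be used for; it only counts.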
So, here is my take on the situation. The NSA program may have been illegal, since it involved warrantless interceptions of electronic communications to which at least one of the parties was a US person. The collection take, i.e., the interceptions, was processed using data mining as a tool, a collection of techniques that can be used for a myriad of purposes, techniques that are not unique to national security intelligence production. Now, the uses of the end results of the data mining may have been illegal. But data mining contributed nothing to the legality/illegality of the collection under the NSA program, and it contributed nothing to any illegality in the use of its results. The data mining, in and of itself, was in this case completely neutral with respect to any questions of legality or Constitutionality - it was only a tool, even if the tool was abused or misused.
Any focus on data mining as an aspect of the NSA program misses the point, the entire set of points actually, related to the legality and Constitutionality of the program itself and the President's role. Further, if you recognize the data mining aspect for what it is (and is not), you must surely see that to focus on it to any degree, however slight, diverts energy and direction from the broader issues, the real issues - those surrounding the President's actions.
As I discovered during an exchange of comments relative to another diary, my position is easily misunderstood. Do I believe data mining is relevant to the NSA problem? No - I believe it to be a distraction, a red herring. Do I believe data mining to be always neutral and benign? No, absolutely not!
If there is enough interest in the topic, I will post another diary showing why, in a broader sense, data mining is a great concern, or should be, to everyone. Data mining does present a very real danger (à la Total Information Awareness) to the public at large. There should be an open and large-scale public debate in the interests of all, possibly with circumscribing legislation as an end result.
But in this case, the case of Bush's little NSA problem, forget data mining. Concentrate on the real issues. Do not be led astray. (I might add, there are other red herrings swimming in the bath, as well - data mining is only one among the lot.)
For now, I have taxed my little fingers far too much, so I will get off my soapbox. I hope you will all discuss this pro and con. And of course, if appropriate, flame away! (I have just donned my fireproof long-johns!)