CAN DATA MINING HELP IN RESOLVING POLITICAL ISSUES?
by Inderpal Bhandari, executive editor at large
Business Intelligence or Technologies of Global Importance?
Too often, we see only the profit-making justification for investing in data mining and related technologies while losing sight of the fact that they can also help resolve issues of global or national importance. In this column, I recount two such issues that have been making the news in the last few weeks.
Center stage was taken by the terrorist attacks on the U.S. embassies overseas and the U.S. retaliation. In the wings, a special panel of judges ruled that the use of statistical sampling in the Census violated Federal Law. (There was one other matter in the spotlight, but that has already been covered in the press ad nauseam.)
Let's start with center stage. Terrorism is clearly a threat that must be dealt with firmly. In the furor of all the reporting on the subject in the past few weeks, two points have stuck in my mind. The first was as clear an explanation as any I have seen of why retaliation is essential. The reasoning was as follows. Terrorism is psychological warfare. When terrorists target a particular country, it is essential to keep up the morale of its people. Retaliation serves that purpose.
The second point had to do with the importance of gathering intelligence electronically to combat terrorism. Good intelligence enables one to retaliate and, even more significantly, to prevent terrorist attacks. In today's high-tech times, terrorists leave electronic fingerprints all over the place as they communicate, travel, and spend money. However, the National Security Agency is battling a tough data mining problem. The NSA is overwhelmed by all the electronic data that it must process, managing to cover only 20% of it, according to a recent article on the Op-Ed page of the New York Times. Finding the electronic imprints of a terrorist is like looking for the proverbial needle in a haystack.
Now to the wings. Walter Lippmann once said, "You and I are forever at the mercy of the census-taker and the census-maker. The impertinent fellow who goes from house to house is one of the real masters of the statistical situation. The other is the man who organizes the results." Evidently, there are people who manage to elude the census takers, and the census makers want to rectify that situation. They argue that the census undercounts the rural and urban poor, and they propose sampling as the remedy. The issue has become politically charged because the undercounted live in predominantly Democratic areas, and counting them might therefore help the Democratic party when district lines are drawn. The latest round was won by the Republicans when a special panel of judges ruled that sampling violated Federal Law.
Politics aside, there is a serious question here. When is it better to count less than to count more?
The conventional view is that it is better to use as much data as possible, but this issue illustrates that it eventually comes down to the soundness of the underlying process that is used to do the counting. The advocates of sampling in the census debate believe that the sampler is going to be more successful than the census taker in tracking down a segment of the population. If so, less is more. The opponents believe otherwise; for them, more is more. Depending on which way this debate turns, you and I may soon be at the mercy of the sampler, in addition to the census taker and census maker.
---
Inderpal Bhandari can be reached via http://www.virtualgold.com