WHEN SHOULD WE BE AFRAID OF DATA MINING?
by Michael J.A. Berry


In an earlier column, HOW I LEARNED TO STOP WORRYING AND LOVE DATA MINING, I tried to justify my lack of concern at having personal data such as my shopping habits, magazine subscriptions, web site visits and even telephone calls mined and analyzed by enterprising database marketers. The gist of my argument was that I have nothing to fear from these people because

  1. they have no personal interest in me beyond whether or not I am likely to buy their product, and
  2. they do not want to do anything that might upset me because if I were upset with them, I might not buy their product.

I ended that column with a promise that in a later column I would explore the circumstances where data mining really does pose a threat. This is that column.

The major impetus for the current flowering of interest in data mining is the move towards one-to-one marketing. In order to implement a true one-to-one marketing program, a company must be able to learn from its interactions with its customers and put what it has learned to use in such a way that it is easier and more pleasant for the customer to stay put than to shop around. One-to-one marketing companies employ data mining as a tool to help them learn how to serve their customers better so they will be happier and more profitable. It is in this context that I examined data mining and judged it to be benign.

But of course, the same techniques that are used to analyze catalog orders in order to target a mailing could easily be put to more nefarious purposes. For instance, a Big Brother government might find data mining techniques very helpful in compiling an "enemies list". If we put data mining applications on a continuum where a direct mailer deciding not to send you a sweepstakes entry is at one end and a repressive state identifying you as a target for special persecution is at the other, many of them will fall somewhere in the middle. How can we draw a line between the applications that ought to be tolerated or even welcomed and those that should be feared and outlawed? My answer is to evaluate each application on two scales:

  1. How close is the alignment between the people doing the data mining and the people whose data is being mined?
  2. What is the balance of power between the miners and the mined?

Let's look at a few potential applications of data mining with these two scales in mind.

In the case of a consumer direct marketing organization trying to reach the right customers, the interests of the miners and their targets are actually very closely aligned. Consumers do not want to get junk mail advertising products and services in which they have no interest. Similarly, the mailer has no interest in wasting postage on people who are unlikely to respond. Conversely, if the offer is one that the consumer considers to be valuable, both the vendor and the consumer are pleased. As for power, it is all in the hands of the consumer, who is free to respond to the offer or not.

A more troubling prospect is the mining of medical records or lifestyle data in order to assign risks for various ailments to individuals or subpopulations. Whether the interests of the miner and the mined are in alignment will depend greatly on the nature of the health care system. In most of the rich world, it is generally accepted that society as a whole benefits from having a healthier population and that the health and well-being of all members of society is the responsibility of society as a whole. In most rich countries, this understanding has led to the creation of universal health care systems, many of them single-payer, in which every citizen is automatically covered.

In such a system, the interests of the health care system and of the individual are reasonably well aligned. The health care system saves money by preventing people from becoming ill and by getting them early treatment when they are in need. Since people tend to prefer being healthy to being sick, they have no particular reason to withhold information that may help in their diagnosis or treatment. The balance of power is not one-sided. The health care system has the power to decide what treatments ought to be pursued, but it does not have the power to refuse coverage.

Here in the United States, the situation is quite different. Health care is paid for largely through a myriad of for-profit insurance companies. These companies can save money and increase their profitability by refusing to cover people who are at greater risk of becoming ill. Here the interests of the individual and the provider are at odds. The sicker I am, the more I want health care and the less inclined the insurer is to provide it. Furthermore, in the US system, the power is all on the side of the insurer, which can approve or deny coverage. So, while I might look with indifference on a project to use data mining for medical risk assessment in Canada or Europe, I would regard a similar program with alarm here in the United States.

In fact, medical records are already accorded a higher level of protection than most data, but what if non-medical data were to be used for the same purpose? While I have no objection to the supermarket using my purchasing patterns to determine which coupons to issue me, I would feel very differently about the supermarket data being used by an insurance company to determine my risk for heart disease. And yet, premiums are already higher for cigarette smokers, so why not for people who purchase a lot of beef and sour cream?

Similar questions about misuse of data come up with automatic toll payment systems (Who is interested in where you are and when?), telephone records (Why do they want to know who your friends and family are?) and even magazine subscriptions, catalog orders or internet site visits (What do they want the information for?). After all, data mining is simply a tool. Like any tool, it can be put to evil use.

As the information society matures, society will have to evolve new guidelines, laws and regulations to cope with the new ways that information can be put to use. We should not rush to regulate data mining where no harm is being caused, but where regulation is necessary, the best regulations will be those that work to keep the interests of data miners aligned with those of individual members of society and the balance of power in favor of those individuals.
---
For more information, see http://www.data-miners.com/
+1 617 666-2836