
SCORN FROM THOSE THAT HAVE GONE BEFORE US
by John K. Thompson


I like to talk, always have and probably always will. I like to talk to all sorts of people, but mostly I like to talk to people who are smarter than I am. Yes, I can envision, and almost hear, the snickers and wisecracks from the peanut gallery, but it's true: talking to people who are very bright is invigorating, at least for me.

I had two conversations this week that have stuck with me. In this article, I'll explain the interactions, the participants and what I drew from them.

One was with an old friend. I have known this fellow for about 10 years. He has held executive positions with some of the leading syndicated data providers, started two firms based on performing advanced analyses for Fortune 500 firms, and is probably one of the most talented analytical thinkers currently performing analysis on day-to-day problems. If the Harvard MBA you hired to run the brand or brand group can't figure out why the share is tanking, this is the man you turn to for help. I sat down in his office and started to chat about the current state of the data mining market.

After a couple of sentences, he replied, "These people have no idea of what they're doing."

I asked, "What people?"

He shot back, "The vendors, the users, any of them."

I thought for a second and inquired, "You mean everybody in the DM market is clueless?" At least that got a smile and stopped a tirade before it began. Hanging out with smart people has a downside, too. They're usually quite taken with their own abilities, and rightly so in some cases, but not many.

He went on to explain that he was working with a Fortune 500 firm on a brand analysis. He and the client's staff had built a database on one of the leading multidimensional database systems and performed an analysis. The analysis was good, and the client was quite happy with the insights that had been delivered.

He went away and worked on other projects. Some time passed, and the client called back to ask whether he would consult with them and a data-mining vendor on using the database they had previously built. This time the database was to be used for data mining operations. Of course, he agreed to work with them. After working with the client staff, the data mining vendor, and his colleagues, he reached three conclusions: the algorithms they were trying to apply were all wrong for the phenomenon to be examined; neither the client's staff nor the data mining vendor had any clue about the business issue; and the results were horribly wrong and could lead to some very bad decisions.

So here we are: smart people, advanced technology and a high probability of making some serious mistakes. What this story illustrates to me is:

  1. You must know your business and how to manage it. Not just by gut feel or intuition, but quantitatively. I am a big proponent of gut feel, but data mining is not about impressions or intuition. Data mining is a quantitative exercise that demands knowledge of the business domain by the numbers.
  2. You must know the technological or mathematical philosophy that underpins the algorithms or approaches that you intend to apply to the data set that you have built.
  3. You must be able to quantitatively verify what you discover. It's like taking a test in math. If you can't prove your results through some test, then it's more than likely that you have arrived at an erroneous conclusion.

These points, for many, may seem self-evident, but at least one Fortune 500 firm and one data-mining vendor did not feel that it was important to understand these three points. It is my impression that there are many more firms and vendors out there trying to force-fit data mining into the analytical portfolio of client organizations. This can only hurt the market in general.

The other conversation was much shorter and quite a bit less fun, but I still believe that it was interesting. I subscribe to many e-mail publications. Some are very regular, like DS*, and others are so sporadic that when I receive a note from them I have to think to remember whether I signed up for this or have been spammed. I received a note telling me that a study had been done on attitudes toward buying life insurance and how that fits in with the financial needs of American families. Scintillating reading? Hardly, but interesting because we market our software to insurance companies.

I sent an e-mail, and the author of the report called me back. During the conversation I detected a slight tone of disdain for my request. So I probed, and the point of irritation was that people in the general business population perform analysis in a one-off manner, without due consideration for the techniques to be employed upon the data, and are generally not rigorous enough in their analytical methods. The author went on to explain that he had been working with a superset of data for over a decade and that this study was a spin-off of the larger work that he and his group were performing.

I thought, this guy really has some pride of ownership in the care and feeding of this data set, and the analysis that comes out of the data. I have not heard of or met anyone in a business setting who exhibits comparable zeal for the data that they are working with. Maybe this is the crux of the issue.

If client personnel and vendor personnel are simply trying to "get this done", then we might be in for some rude awakenings. The computer industry press will like that. They always love to report on the latest and most gruesome failure of technology, along with the staff firings and humiliation that go with it.

Data mining tools need to mature and mature quickly. The evolution needs to occur in the data access and transformation functions, as well as in the functions of applying algorithms or approaches correctly. Perhaps the latter need is more in the area of training in the short term, but for long-term success, data mining products will need intelligence built into them to keep people from making the most rudimentary of mistakes.

Not even a small minority of business or IS professionals will spend their time learning about the appropriate data transformations required for data mining. Even fewer of them will spend the time to learn the appropriate application of all the algorithms that are being marketed under the data mining moniker. With that said, it is up to the data mining vendors to build these functions into their products. In the short term they will most likely fill the gap through services offerings, and that's not a perfect solution either.

The bottom line is, if users and vendor staffs are not exercising the appropriate level of rigor, then the products must provide safeguards against misapplication. I haven't seen all of the product plans, but I wonder whether the vendors are thinking along these lines yet, or are they still enamoured with adding more algorithms?
---
John Thompson, Vice President - Marketing, Magnify, Inc. I'd like to hear your thoughts. You can reach me at jkt@magnify.com

