DATA MINING AND KNOWLEDGE DISCOVERY FROM THE EXECUTIVE PERSPECTIVE:
AN INTERVIEW WITH ROGER STEIN 11.04.97 by Alan Beck, editor in chief D S *
D S * : What are the most common serious mistakes made by CIOs implementing data mining and knowledge discovery technologies?
STEIN: "At the CIO level, it's hard to characterize something as a mistake after only a short period of time, which is typically how long most firms have been undertaking data mining programs. This is particularly so given the amount of infrastructure development and learning that must sometimes take place. Having said that, certain patterns of behavior seem to emerge in many data mining projects.
"There is, first, a tendency to focus closely on the tools, getting excited about one or another, as opposed to really looking at how the business problems are structured. Yet it is the structure of the problems that allows the solutions to fall out. You can use any tool to solve almost any problem, if you're willing to work hard enough on it. It then becomes a question of whether you're doing things efficiently and what your likelihood of success will be.
"People sometimes think of these technologies as magic bullets. Much early commercial work in neural networks, for example, took the position that you didn't have to understand either your data or statistics -- just dump the data in, and the technology would find the relationships. But, it turns out that, you must have a very firm background in modeling, statistics and the business domain in order to structure, validate and justify any model, whether it's a decision tree, neural network, discriminant analysis or whatever. So the tool wasn't the answer.
"Because most of these methods involve extensive searches for patterns through very large data spaces, to the extent that you make the process more difficult by a problem-poor formulation you're far less likely to find interesting information.
"There is another side to that coin: the problem of overfitting. It's human nature to see patterns in things. When people look at the output of, say, a particular data mining algorithm -- a rule, perhaps that bubbles to the surface of the data -- there's a desire to explain that rule using intuition and background knowledge. And people are very good at figuring out such explanations, even if they're wrong.
"Most people who work in this field have had the experience of finding an interesting (but false) rule, only to later realize that they made some error in problem formulation or picked up some spurious relationship in the data. What is interesting is that people can usually generate a very good explanation for a rule even when the rule is wrong! So it is very easy to fool yourself: that's another mistake you run into frequently. This points out the need for more rigor in some data mining approaches."
D S * : Specifically, how can you guard against the perception of spurious patterns or relationships?
STEIN: "It is a tricky problem. Rigorous testing procedures are important: this breaks with some of the more common statistical approaches to problem-solving that concentrate on evaluating the significance of the parameters of the model itself. I, like most people who work in this field, typically favor out-of-sample testing. But even this can be tricky! People tend to make technical mistakes during testing. There are lots of stories of developers who thought they were performing exceptionally rigorous testing when, in fact, they were missing a fundamental assumption in their whole approach."
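[Editor's illustration, not part of the interview.] The point about spurious rules and out-of-sample testing can be sketched in a few lines of Python. This is only a toy under stated assumptions: the data are pure coin flips, so no real relationship exists, and the names make_noise_data, rule_accuracy, best_rule and demo are hypothetical helpers invented for this sketch. Searching many candidate rules on a small sample will usually turn up one that looks impressive in-sample, while fresh data should show it performing at roughly chance.

```python
import random


def make_noise_data(n_rows, n_features, rng):
    """Binary features and a binary label, all independent coin flips."""
    X = [[rng.randint(0, 1) for _ in range(n_features)] for _ in range(n_rows)]
    y = [rng.randint(0, 1) for _ in range(n_rows)]
    return X, y


def rule_accuracy(X, y, feature, polarity):
    """Accuracy of the rule 'predict 1 exactly when row[feature] == polarity'."""
    hits = sum(
        1 for row, label in zip(X, y)
        if (row[feature] == polarity) == (label == 1)
    )
    return hits / len(y)


def best_rule(X, y):
    """Exhaustively search every single-feature rule; keep the best in-sample one."""
    candidates = [(f, p) for f in range(len(X[0])) for p in (0, 1)]
    return max(candidates, key=lambda c: rule_accuracy(X, y, *c))


def demo(seed=0, n_train=40, n_test=2000, n_features=200):
    """Return (in-sample accuracy, out-of-sample accuracy) of the 'discovered' rule."""
    rng = random.Random(seed)
    X_train, y_train = make_noise_data(n_train, n_features, rng)
    X_test, y_test = make_noise_data(n_test, n_features, rng)  # held-out data
    rule = best_rule(X_train, y_train)
    return (
        rule_accuracy(X_train, y_train, *rule),
        rule_accuracy(X_test, y_test, *rule),
    )


if __name__ == "__main__":
    in_acc, out_acc = demo()
    print(f"in-sample accuracy:     {in_acc:.2f}")
    print(f"out-of-sample accuracy: {out_acc:.2f}")
```

The held-out rows are generated after the rule is chosen and are never consulted during the search; reusing them to pick the rule would be one of the "technical mistakes during testing" mentioned above.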
D S * : Should a CIO work closely with a statistician?
STEIN: "Since the CIO sets the vision for the organization, it behooves him or her to have a firm grasp of data mining technology at least at the intuitive level. But this doesn't necessarily mean that the CIO needs to work intimately with a statistician or programmer. That level of in-depth technical familiarity may not be warranted.
"However, the team responsible for development of data mining in a particular domain should be made up of domain experts (business folks), as well as specialists with strong backgrounds in mathematics, statistics, and database programming. This is how I usually structure projects, and I've seen it work quite well and deliver remarkable results. The key is understanding how the technologies will fit into larger business solutions.
"Unfortunately, though, what happens in many organizations is that you get a lopsided team: people with strong business backgrounds who don't understand the technology well or technical people who do all kinds of things to obtain "interesting" information from data, but sometimes end up solving the wrong problems or solving the right problem the wrong way from a business needs perspective.
"There must be a good partnership between the two types of people, not merely a superficial one."
D S * : How can two groups with such different perspectives be brought together effectively?
STEIN: "You need a way to map problems onto business solutions and vice-versa. But often all you get, say in vendor literature, is someone pushing a particular tool.
"It is vital that a person who is business-savvy in a particular area understand a little about the technology, at least at a basic level. By the same token, technologists must actively work to understand how business people will utilize the output of their technologies. Both sides need to talk to one another in a sort of middle-ground language, a language that is both technical and business-focused, a language that does not require either a Ph.D. in mathematics or an MBA. This language is what Vasant Dhar and I attempted to provide in our book."
D S * : What kind of resources must be utilized to implement this?
STEIN: "It depends upon the structure and culture of the organization. Some firms work well with project teams. Here, a business team comes together to solve a particular problem, and the technologists act in the capacity of a consulting service, going from team to team and problem to problem. Other firms form a specific group to solve a specific business problem, so a given business unit will 'own' that process and take full responsibility for its solution. It really depends on the scope and structure of the project.
"Typically I favor finding a single strategic business need where a moderate increase in data understanding can potentially produce a fairly large impact. This is what I call a "quick kill." If all goes well, more challenging problems can be attacked next. This lets organizations get familiar with the culture of leveraging data and fitting business problems to different technologies. It also allows the technologies to build a track record within the firm."
"The key is that the organization's data and the expertise of its people must be deployed and managed as any other asset, as opposed to being thought of as a support-type function that is drawn upon by the business, like the opening of a faucet. The very concept of using data strategically must be important to the organization! Otherwise, what you end up with is a bunch of technical people trying to get business people excited about a particular idea or a group of business people getting excited about a new toy and trying to figure out how to apply it to their problems."
D S * : What specific role does education play here?
STEIN: "Of course, people should be well-trained, and the assumption is that teams are made up of the types of people I described earlier: specialists in statistical modeling and business strategy, etc. I'm also a strong advocate of self-education in this context: finding out what others have done, going to conferences, reading, etc., and trying technologies out to understand how they can fit into a firm's business strategy. This is especially useful on the management side, where people are not familiar with the technology and may well feel intimidated by the literature. However, I don't think attendance at, say, a three-day course will teach people how to solve these problems in one shot. The learning experience must be iterative.
"But I want to re-emphasize that there is no way you can get either the business side or the technology side out of the picture. I would never think of developing this type of business process and system without an interdisciplinary group -- period."
D S * : How can an executive realistically benchmark the results obtained from these technologies? How do you judge whether outcomes are marginal, adequate or exceptional?
STEIN: "Technologies cannot provide an answer here. This depends heavily on which problems are to be solved within a business domain and the quality of the data and expertise that are available.
"There is a tendency to say 'This group improved their profitability by x percent...' or 'That project failed by y percent.' But ultimately these are all domain-specific criteria. Accuracy is only one dimension. From a business perspective, things like business flexibility or decision explainability may be far more important, depending on the intended use.
"For example, the Chicago Bulls coaching staff reported that data mining had improved performance by a 2-3 points per game. That's very different from US WEST saying they've saved millions of dollars using OLAP and a data warehouse to improve internal processes by highlighting lapses in transaction processing."
"Results depend upon the goals of an organization, the structure of the entire problem and the structure of the data. And optimal results always reflect where the organization is currently. An organization that's already been using data efficiently for a certain application may obtain only moderate added benefits from data mining. But, as it turns out, most organizations don't take very good advantage of their data, so most can realize large improvements from even simple projects.
"Evaluating results is intimately dependent upon the particular problem itself; it cannot be generalized. It's a chicken-and-egg situation: the problem defines the solution, and the solution defines the results, given the business context."
D S * : How should executives prioritize involvement with data mining technologies?
STEIN: "Again, let business needs drive development. Find a problem that the line cares a lot about. If you then concentrate on the dynamics of that problem -- and what a solution can provide -- you can intelligently evaluate the function of various tools: neural networks, genetic algorithms, recursive partitioning, etc. Each of these has a specific footprint or characteristic in terms of what sort of solution can be provided. Once the business needs are explicitly understood, certain tools and approaches will rule themselves out while others will suggest themselves.
"A classic example is neural networks, where the structure of the final decision model is often considered hard to interpret as compared to, say, a pattern coming out of CART, a rule-tree generating algorithm. On the other hand, neural networks give better fitting of complex surfaces, whereas a CART tree would have to be extremely complex to map such surfaces. There are trade-offs that must be made, and the key is deciding what is important for a particular business application. Note that two people could attack the same problem but nevertheless need very different things on the business side! The ultimate solutions required will dictate different approaches."
"Above all, though, firms need to begin to explore these technologies, if they haven't already, and understand how they fit into their business strategies. It can be very hard to play catch-up in this arena."
---
Roger M. Stein, steinr@moodys.com , is vice president, senior credit officer, Quantitative Analytics, Moody's Investors Service.
Online orders for "Seven Methods for Transforming Corporate Data into Business Intelligence" may be tendered via http://www.viamall.com/softpro , http://www.amazon.com , http://www.prenhall.com , Phone: 1-(800) 643-5506. FAX: 1-(800) 835-5327.
---
Alan Beck is editor in chief of D S * and vice president of publications for Tabor Griffin Communications. Comments are always welcome and should be emailed to alan@tgc.com