
REAL WORLD EXPERIENCES WITH DATA MINING AND KNOWLEDGE DISCOVERY: PART I
by Jill Dyche & Evan Levy


These days, there are as many different definitions of data mining as there are tools in the marketplace. Vendors of such diverse technologies as OLAP, Knowledge Discovery, behavior modeling, customer segmentation software, and affinity analysis are all taking shelter under the data mining umbrella. The byproduct of this all-for-one mentality is general confusion in the marketplace.

IT organizations and business users alike are unsure not only of what data mining is, but also of how it can help them with their specific business problems. This paper, while not attempting to provide the definitive taxonomy of data mining, provides some examples of how data mining and knowledge discovery have been deployed in actual business environments. Moreover, it addresses how this deployment has helped businesses discover new knowledge about their customers and products, and how it has further strengthened the argument for the strategic value of large data warehouses.

A Decision Support Taxonomy

In our extensive work implementing large data warehouses and decision support systems, we have seen an evolution emerge that is fairly standard and predictable across industries. This evolution is represented by the pyramid below.

User sophistication
  | High               /   KD   \            No Hypotheses
  |                 /  Exploration  \        Low Hypotheses
  |              /      Drill Down     \     Moderate Hypotheses
  | Low       /          Queries          \  High Hypotheses

Note the time factor here. In our experience, it is both easier and more straightforward for a company to begin its data warehousing functionality with canned reports and work its way up to knowledge discovery tools. Of course, depending on a host of factors, the evolution of data warehouse development could be different at your site. These factors include:

Funding availability for other DSS application tools

As a company moves "up the pyramid" with its decision support capabilities, its data infrastructure also grows and matures. This means that, as a company adds new query tools and additional users to its data warehouse, it should ideally also supplement its database(s) with additional data elements, metadata, and objects.

Note also that, as the user community for the data warehouse grows, the "level" of user inevitably evolves as well. The basic business user, such as the financial analyst or even the CEO, may be content with canned reports for the long term. However, as the data becomes more robust, "knowledge workers" such as marketing analysts and strategic planners require more sophisticated analytical capabilities so that their companies can respond more nimbly to market pressures and ultimately gain market share.

Knowledge Discovery at a Telecommunications Company

Deregulation, new competitors, privatization, and product diversification are all taking their toll on the old-guard behemoth RBOCs (Regional Bell Operating Companies) in the U.S. These pressures are also very much on the minds of international phone companies as they struggle for brand recognition and ever-faster product-to-market cycles.

According to the CIO of one such telecommunications company, the company had "hit the wall" with its use of the strategic data in its warehouse. True, all the paper reports had been replaced with on-line access. True, its business people were leveraging the warehouse cross-functionally and introducing new products and services faster than ever before. However, the number of users logging on had leveled off, and no new applications had been requested for six months.

Baseline Consulting Group was retained to introduce data mining technology, and charged with finding "new market information" by introducing a different set of tools. The objective was simple: "Use our existing data to find NEW information. Information we could not have found before, or would not have known to look for."

Note that the pyramid shown above depicts the changing hypothesis level as decision support functionality evolves. This CIO was correct in his assertion that with data mining technology his organization could assume NO hypothesis, unlike the data warehouse queries his users were running at the time, each of which started from a question someone already knew to ask.
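
To make the contrast between the bottom and the top of the pyramid concrete, here is a minimal sketch. It is not drawn from the engagement described in this article: the customer records, the product names, and both function names are hypothetical. The first function answers a question the analyst already thought to ask (a high-hypothesis query). The second simply counts every pair of products that customers hold together, a crude form of affinity analysis, and lets the frequent pairs surface without anyone naming a product in advance.

```python
from collections import Counter
from itertools import combinations

# Hypothetical data: the set of products each customer subscribes to.
CUSTOMERS = {
    "c1": {"call_waiting", "voice_mail", "caller_id"},
    "c2": {"call_waiting", "voice_mail"},
    "c3": {"caller_id", "second_line"},
    "c4": {"call_waiting", "voice_mail", "second_line"},
}


def customers_with(customers, product_a, product_b):
    """High hypothesis: the analyst names both products up front."""
    return sorted(
        cid
        for cid, products in customers.items()
        if product_a in products and product_b in products
    )


def frequent_pairs(customers, min_count=2):
    """No hypothesis: count every co-occurring pair, keep the common ones."""
    counts = Counter()
    for products in customers.values():
        # Sort so each pair is always counted under the same key.
        counts.update(combinations(sorted(products), 2))
    return {pair: n for pair, n in counts.items() if n >= min_count}


if __name__ == "__main__":
    print(customers_with(CUSTOMERS, "caller_id", "second_line"))
    print(frequent_pairs(CUSTOMERS))
```

The query can only confirm or refute what the analyst suspected. The pair count may turn up a relationship nobody asked about, which is the kind of result the CIO was after. Real knowledge discovery tools go well beyond pair counting, so treat this only as an illustration of the difference in starting point.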

Still, this was a tall order by any standard. Baseline consultants, who had guided the company on the construction of its corporate data warehouse and hence were already familiar with the data, were nevertheless apprehensive about finding any new information, let alone valuable new knowledge the business could use.
---
Part II of this article will appear in next week's edition of D S *.

