DATA MINING WITH THE EXPLORATION WAREHOUSE: PART I
by W H Inmon


Data warehousing has taken the world by storm. In 1990 data warehousing was a theory uniformly despised by the database theoreticians. In 1998 data warehousing is conventional wisdom, practiced by thousands of corporations around the world.

THE MATURING OF THE WAREHOUSE

As data warehousing has grown, it has matured. In the early days there was simply a data warehouse. Today we have many different forms of data warehouse -- operational data stores, enterprise data warehouses, and data marts. But now there is an exciting new data warehouse construct on the scene called an "exploration warehouse". An exploration warehouse is a structure devoted solely to data exploration and data mining. It is through the exploration warehouse that many of the promises of corporate competitive advantage start to come true.

THE EXPLORATION WAREHOUSE

The exploration warehouse is a subset, possibly all, of the data found in the enterprise data warehouse. Masses of detailed data are periodically moved directly from the enterprise data warehouse to create the exploration warehouse. Once the exploration warehouse is created, the data miner has a convenient and isolated place in which to test hypotheses and carry out other analytical activities.

The first place that data miners start is the enterprise data warehouse. The enterprise data warehouse contains a wealth of history and low-level detailed data. As such, the enterprise data warehouse forms a wonderful foundation for data mining. However, the explorer soon finds out that doing his/her exploration and analysis in the enterprise data warehouse is a real imposition on its other users. The queries submitted by the data miner are simply too large for the enterprise data warehouse to handle gracefully. Other users of the enterprise data warehouse complain as performance degrades whenever the data miner shows up.

The data warehouse administrator trying to manage this problem has a very fundamental conflict on his/her hands. If the data warehouse administrator allows the data miner to do processing on the enterprise data warehouse, the regular users suffer. If the data warehouse administrator banishes the data miner to the wee hours of the morning when there are spare machine cycles, the data warehouse administrator greatly limits the effectiveness of the data miner. What is the data warehouse administrator to do?

WHAT IS AN EXPLORATION WAREHOUSE?

The best alternative the data warehouse administrator has is to create a structure, physically separate from the enterprise data warehouse, called an exploration warehouse. And what exactly is an exploration warehouse? An exploration warehouse is a copy of some or all of the enterprise data warehouse designed specifically for exploration. It contains detailed data and historical data, and it is created directly from the enterprise data warehouse.

The exploration warehouse can be created and recreated very quickly should the data miner decide that data is needed in a different form or that different data is needed. The data miner can use the exploration warehouse as he/she sees fit, with no consideration for the performance impact on other users. The data miner is the sole user of the exploration warehouse, so there is no conflict over resources with other warehouse analysts.
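
To make the idea of creating and recreating an exploration warehouse concrete, here is a minimal sketch in Python. It is an illustration only, not a description of any particular product: it assumes both warehouses are SQLite database files, and the function name rebuild_exploration, the alias edw, the table sales and its columns are all hypothetical. Each call drops the previous exploration copy of a table and rebuilds it from the enterprise data warehouse, optionally restricted to a subset of rows.

```python
import os
import sqlite3
import tempfile


def rebuild_exploration(enterprise_path, exploration_path, table, where="1=1"):
    """Drop and recreate one exploration table from the enterprise warehouse.

    `table` and `where` are pasted into the SQL text, so they must come from
    a trusted caller; this is a sketch, not hardened code.
    Returns the number of rows copied.
    """
    con = sqlite3.connect(exploration_path)
    try:
        # Make the enterprise warehouse visible under the (assumed) alias "edw".
        con.execute("ATTACH DATABASE ? AS edw", (enterprise_path,))
        # Qualify with "main" so only the exploration copy is ever dropped.
        con.execute(f"DROP TABLE IF EXISTS main.{table}")
        con.execute(
            f"CREATE TABLE main.{table} AS "
            f"SELECT * FROM edw.{table} WHERE {where}"
        )
        con.commit()
        return con.execute(f"SELECT COUNT(*) FROM main.{table}").fetchone()[0]
    finally:
        con.close()


if __name__ == "__main__":
    workdir = tempfile.mkdtemp()
    edw = os.path.join(workdir, "enterprise.db")
    exp = os.path.join(workdir, "exploration.db")

    # Hypothetical enterprise data: a tiny table of detailed sales history.
    con = sqlite3.connect(edw)
    con.execute(
        "CREATE TABLE sales (sale_id INTEGER, sale_year INTEGER, amount REAL)"
    )
    con.executemany(
        "INSERT INTO sales VALUES (?, ?, ?)",
        [(1, 1996, 10.0), (2, 1997, 20.0), (3, 1998, 30.0)],
    )
    con.commit()
    con.close()

    # First build: only recent history. Second build: the miner changes
    # his/her mind and asks for everything, so the copy is simply recreated.
    print(rebuild_exploration(edw, exp, "sales", "sale_year >= 1997"))
    print(rebuild_exploration(edw, exp, "sales"))
```

Because the copy lives in its own file, the heavy queries that follow run against the exploration warehouse alone. The enterprise data warehouse is touched only during the copy itself.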
---
Part II of this article will appear in the next edition of D S *
---
For more information, see http://www.pine-cone.com

