
ARCHITECTURAL EVOLUTION IN DATAWAREHOUSING
by Joseph M. Firestone, Ph.D.

Architectural Evolution in DataWarehousing and Distributed Knowledge Management Architecture

Introduction

The Dynamic Integration Problem (DIP) is the problem of proactively and automatically monitoring and managing evolutionary change in data warehousing systems without imposing a traditional and constraining "Top-Down" architecture. It is the problem of giving managers of both data warehouses and data marts the capability to innovate while still maintaining the integration and consistency of the system. Increasing recognition of the need for this capability is what drives architectural evolution in the DW system. No data warehousing vendor currently offers a solution to the DIP, though some offer change management through intentional DBA action backed by extensive monitoring and reporting capabilities.

The DIP is now more difficult to solve because data warehousing is increasingly a complex systems integration problem. A full-blown Data Warehousing System may encompass the following database servers:

  • The data warehouse;
  • Various data marts (department, function, or application-specific DSSs, using ROLAP, Multi-dimensional, or Column-based Servers);
  • One or more Operational Data Stores (ODSs);
  • One or more Data Staging Areas;

and the following application servers:

  • Web Servers;
  • ETML Servers;
  • Data Mining servers;
  • Stateless Transaction Servers (e.g., MTS, Jaguar CTS, etc.);
  • Business Process engines (e.g., Persistence Power-Tier);
  • Document Servers;
  • ROLAP Support Servers (e.g., Microstrategy, Information Advantage);
  • Report Servers; and
  • Various front-end OLAP and reporting tools.

The Dynamic Integration problem in this context is three-fold:

  • First, an integrated view of all server-based assets is needed;
  • Second, flows of data, information, and knowledge throughout this system need to be managed to maintain the common view in the face of change in form and content, and to distribute the system's data, information, and knowledge bases as required; and
  • Third, such management needs to occur automatically and without centralizing the system so that the authority and responsibility for adding new data and information to the system is distributed.

This paper is concerned with DSS/data warehouse system architectural evolution in response to the growing complexity of the enterprise DSS environment, and with the relationship of new architectures to a developing capability to handle the DIP. The paper briefly describes and analyzes the following architectures:

  • Top-Down Architecture;
  • Bottom-Up Architecture;
  • Enterprise Data Mart Architecture;
  • Data Stage/Data Mart Architecture (DS/DMA);
  • Distributed Data Warehouse/Data Mart Architecture (DDW/DMA);
  • Distributed Knowledge Management Architecture (DKMA);
  • Variations with introduction of the ODS.

In addition, it comments on the relationship between DKM architecture and data mining, and provides some brief comments on software tools for implementing DKMA.

Top-Down Architecture

Introduced by Bill Inmon [1], this is the first data warehousing architecture.

The interaction associated with the architecture begins with an Extraction, Transformation, Migration, and Loading (ETML) process working from legacy and/or external data sources. Extraction, transformation, and migration process data from these sources and output it to a centralized Data Staging Area. Following this, data and metadata are loaded into the Enterprise Data Warehouse and the centralized metadata repository. Once these are constituted, Data Marts are created from summarized data warehouse data and metadata. The data warehouse has an atomic data layer and also contains detailed historical data. In contrast, the data marts contain lightly and highly summarized data and also metadata.
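
The ETML sequence just described can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the StagingArea class, the record fields (customer, amount), and the "sales" metadata entry are hypothetical names invented for the example, and the warehouse and metadata repository are stood in for by an ordinary list and dictionary.

```python
from dataclasses import dataclass, field


@dataclass
class StagingArea:
    """Hypothetical centralized data staging area holding cleaned rows."""
    rows: list = field(default_factory=list)


def extract(source_rows):
    # Extraction: pull raw records from a legacy or external source.
    return list(source_rows)


def transform(raw_rows):
    # Transformation: drop records with no customer and normalize the fields.
    cleaned = []
    for row in raw_rows:
        if not row.get("customer"):
            continue
        cleaned.append({
            "customer": row["customer"].strip().upper(),
            "amount": float(row["amount"]),
        })
    return cleaned


def migrate(cleaned_rows, staging):
    # Migration: move the cleaned rows into the staging area.
    staging.rows.extend(cleaned_rows)


def load(staging, warehouse, metadata_repository):
    # Loading: copy staged data into the warehouse and record its metadata.
    warehouse.extend(staging.rows)
    metadata_repository["sales"] = {
        "columns": ["customer", "amount"],
        "row_count": len(warehouse),
    }


def run_etml(source_rows):
    """Run the four ETML steps in order and return warehouse plus metadata."""
    staging, warehouse, metadata = StagingArea(), [], {}
    migrate(transform(extract(source_rows)), staging)
    load(staging, warehouse, metadata)
    return warehouse, metadata
```

In this sketch both data and metadata are produced by the same run, which mirrors the point that the warehouse and the metadata repository are constituted together before any data mart is derived.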

In the top-down model, data warehouses use Normalized E-R Data Models. In contrast, data marts use star schema data models to improve understandability and performance [2].
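
To make the star schema idea concrete, here is a minimal sketch using Python's built-in sqlite3 module. The table and column names (sales_fact, product_dim, date_dim) and the sample rows are hypothetical and are not taken from the paper or from reference [2]; the sketch only shows the general shape of one fact table joined to its dimension tables.

```python
import sqlite3


def build_star_schema(conn):
    """Create a tiny star schema: one fact table keyed to two dimensions."""
    conn.executescript("""
        CREATE TABLE product_dim (
            product_key INTEGER PRIMARY KEY, name TEXT, category TEXT);
        CREATE TABLE date_dim (
            date_key INTEGER PRIMARY KEY, year INTEGER, quarter INTEGER);
        CREATE TABLE sales_fact (
            product_key INTEGER REFERENCES product_dim(product_key),
            date_key INTEGER REFERENCES date_dim(date_key),
            revenue REAL);
    """)
    conn.executemany(
        "INSERT INTO product_dim VALUES (?, ?, ?)",
        [(1, "Widget", "Hardware"), (2, "Gadget", "Hardware"), (3, "Manual", "Books")],
    )
    conn.executemany(
        "INSERT INTO date_dim VALUES (?, ?, ?)",
        [(10, 1998, 1), (11, 1998, 2)],
    )
    conn.executemany(
        "INSERT INTO sales_fact VALUES (?, ?, ?)",
        [(1, 10, 100.0), (2, 10, 50.0), (3, 11, 25.0), (1, 11, 75.0)],
    )


def revenue_by_category(conn):
    """A typical dimensional query: one join from the fact table per dimension."""
    return conn.execute("""
        SELECT p.category, SUM(f.revenue)
        FROM sales_fact AS f
        JOIN product_dim AS p ON f.product_key = p.product_key
        GROUP BY p.category
        ORDER BY p.category
    """).fetchall()
```

The query reaches any descriptive attribute with a single join from the fact table, which is one common explanation for why the dimensional form is considered easier to understand and faster to query than a fully normalized model.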

In the top-down model, integration between the data warehouse and the data marts is automatic as long as the discipline of constituting data marts as subsets of the data warehouse is maintained. Tools (such as Microstrategy's DSS Server) exist to generate data marts from the data warehouse "by pushing a button."

Bottom-Up Architecture

The second data warehousing systems architecture, the "Bottom-up" architecture, became popular because the Top-down architecture took too long to implement, was often politically unacceptable, and was too expensive. The Bottom-up architecture also provided a rationalization for department heads and others with budgets to use the new technology of data warehousing to produce application specific DSSs relevant to their organizational roles.

The central idea in Bottom-up architecture is to construct the data warehouse incrementally over time from independently developed data marts. The process begins with ETML for one or more data marts. No common data staging area is required. There is generally a separate area for each data mart. There may not even be standardization on the ETML tool.

The data marts generally do not use normalized E-R data models. It is generally recognized that when data marts use relational databases, they should employ dimensional star schema data models, or variants of them, to achieve better performance and understandability. Many data marts don't even use relational technology, but may employ multidimensional database servers or column-based databases (Sybase IQ and Broadbase).

In Top-down architecture, data marts use lightly and highly summarized data. In Bottom-up architecture, they also use atomic and detailed data, including historical data. Since the data marts are to be the building blocks of the data warehouse, they must contain all of the data that will appear in the projected data warehouse.

Bottom-up architecture also differs from Top-down architecture in that it provides no common metadata components across data marts. This is the most important difference between the two architectures from the standpoint of integration. In fact, the evolution of architecture beyond the basic Top-down and Bottom-up patterns is largely the evolution of increasingly sophisticated metadata and meta-object structures in an effort to achieve integration.

Data warehousing architectures tend to be associated with specific ETML tool vendors, more than they are with other classes of data warehousing/DSS tool vendors. Top-down architecture has been associated with Prism Solutions, Evolutionary Technologies (ETI), and Carleton. Bottom-up architecture was initially associated with "second generation" ETML tool vendors Informatica, Sagent, and Ardent Software (formerly VMark). These vendors have always emphasized the importance of metadata, but initially, at least, their tools did not provide metadata tools for integration across data marts.

While Bottom-up architecture was quite successful in meeting initial expectations in building data marts, it very soon was widely perceived as unacceptable for the long term for the very reason that it failed to provide a common metadata component. Without shared metadata, it is difficult to construct the data warehouse from data marts. So the Bottom-up architecture, in its pure form, fails to fulfill its promise of an incremental approach to the data warehouse. This failure also leads to new "stovepipes" or "legamarts" over time.

Enterprise Data Mart Architecture (EDMA)

Though the "legamart" critique of the pure Bottom-up architecture was decisive, the idea of an incremental approach to data warehouse construction, through application specific data marts that deliver value along the road to the comprehensive data warehouse, has real legs. So, Bottom-up supporters quickly modified their approach and their technology to save the idea.

All of the remaining architectures to be discussed here are recent reactions to the need to make an incremental, relatively inexpensive, value-driven approach to data warehousing work [3]. The Enterprise Data Mart Architecture (EDMA), to be discussed in this section [4] is one evolutionary response of Bottom-up supporters to the "legamart" argument.

EDMA supports an incremental approach to the data warehouse through data mart development by creating a shared framework for development. The EDMA framework [5] includes enterprise subject areas, common dimensions, metrics, business rules, and data sources, all represented in a logically common (but not necessarily physically centralized) Global Metadata Repository (GMR). This common framework is established before the EDMA-guided incremental process of data mart/data warehouse development occurs. As development occurs, EDMA is incrementally modified as the development process gradually evolves the foundation for the data warehouse.
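
One way to picture the role of the shared framework is as a registry that each new data mart is checked against before it is built. The sketch below is an assumption-laden illustration, not a description of any vendor's repository: the GlobalMetadataRepository class, its method names, and the sample dimensions are all hypothetical.

```python
class GlobalMetadataRepository:
    """Hypothetical registry of dimension definitions shared across data marts."""

    def __init__(self):
        self._dimensions = {}

    def register_dimension(self, name, attributes):
        # Record the enterprise-wide definition of a common dimension.
        self._dimensions[name] = tuple(attributes)

    def check_mart(self, mart_name, mart_dimensions):
        """Return a list of conflicts between a mart and the shared framework."""
        conflicts = []
        for name, attributes in mart_dimensions.items():
            shared = self._dimensions.get(name)
            if shared is None:
                conflicts.append(
                    f"{mart_name}: dimension '{name}' is not in the repository")
            elif tuple(attributes) != shared:
                conflicts.append(
                    f"{mart_name}: dimension '{name}' differs from the shared definition")
        return conflicts
```

A mart whose dimensions match the registry passes with an empty conflict list; one that redefines a common dimension, or introduces a dimension the repository has never seen, is flagged. Note that the check only reports conflicts. Deciding what to do about them is still left to a person, which is the limitation this paper returns to below.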

Also central to the architecture is a common data staging area called a Dynamic Data Store (DDS) [6] for extraction, transformation, and migration results. A DDS stores, cleans, and transforms data extracted from operational systems, and also prepares the data for loading into DSS data stores. A DDS is not the same as an Operational Data Store (ODS), as it prohibits DSS processing. The DDS is dynamic in the sense that it is frequently changing as new data is added. The process of establishing and maintaining the DDS also contributes metadata to the GMR. The fact that the architecture incorporates a logically unified data staging area is instrumental in supporting integration across data marts and with the eventual data warehouse, because the unified data staging area, along with the GMR and the local data mart metadata repositories, helps to create and maintain semantic consistency in data.

As in the Bottom-up architecture, Data Marts may use multiple data storage technologies as appropriate, but if relational technology is used, star schema modeling is the preferred choice.

EDMA is best described by Douglas Hackney in his Understanding and Implementing Successful Data Marts [7], and he is the data warehousing practitioner most closely associated with the architecture. Informatica [8] was the first vendor to implement this architecture through its PowerCenter Tool, which implements a DDS, and also a GMR, which it refers to as the Global Data Mart Repository (GDR).

EDMA as implemented through Informatica PowerCenter supports metadata management through extensive monitoring and reporting mechanisms; but not through an automated process. Thus, EDMA does not yet provide a solution to the Dynamic Integration Problem (DIP), though it certainly makes substantial progress towards that goal. In addition to Informatica, EDMA (though the name is not used) is also supported by Carleton through its combined Enterprise Integrator and Passport products, www.carleton.com.

Both Informatica and Carleton make use of Object Technology in their metadata repositories. Like Informatica, Carleton provides a framework for beginning to deal with the DIP. But it still provides no automatic mechanisms for synchronizing incremental data marts and the data warehouse.

Data Stage/Data Mart Architecture (DS/DMA)

It seems worthwhile to distinguish DS/DMA from EDMA on one side, and DDW/DMA on the other, since its central idea seems to be spreading. DS/DMA is the same as EDMA with the important exception that no physical enterprise-wide data warehouse is implemented. Instead, the data warehouse is viewed as the conjunction of the data marts in the context of an EDMA-like metadata repository.

The repository provides a common view of DSS resources across data marts, but not necessarily any data or tables that represent a view of global enterprise characteristics. There is no guarantee that the conjunction of subject matter, department, and/or application specific data marts will provide access to such global attributes, as would a data warehouse.

In other words, the view that the data warehouse is just the logical conjunction of the data marts, ignores the fact that data marts are constructed to satisfy the interests of actors whose departmental locations, or less than global responsibilities, bias them toward warehousing data that may not measure or describe global enterprise properties. That is, their subject matter, departmental, and application specificity will not deal with enterprise wide interests.

The conjunction of such data marts may allow one to derive aggregate properties that are at the enterprise level of analysis, and even structural properties describing relationships among departments or groups or individuals in the enterprise. But they will not allow one to derive global properties of the enterprise, because these are emergent characteristics of organizational interaction. They cannot be derived from aggregations or structural measurements on enterprise components. They must be measured through data stores that directly address global enterprise matters.

This argument assumes that data marts are never concerned with global enterprise matters, and this assumption may seem unreasonable since such things as sales data marts may clearly involve such properties as total enterprise sales revenue. But the terms data warehouse and data mart are often used loosely, and just as there are sales data marts, there are also sales data warehouses. So what's the distinction between these two concepts?

I think there is none, and that this looseness of language is the real explanation for the emergence of DS/DMA. If we accept a definition of data mart that allows us to apply the term to enterprise wide, application specific DSS data stores; then the central idea of DS/DMA, that the global data warehouse is the conjunction of all data marts, is much more plausible. Application specific, enterprise wide data marts will contain global characteristics of the enterprise, and the conjunction of them may contain all such characteristics.

But, in the end this definition of data mart, even though it sometimes accords with common usage, is not reasonable. There is really no need to stretch the data mart concept to encompass global phenomena. It is much easier to maintain that the essence of the data warehouse/data mart distinction is the global/local distinction, and therefore that sales data marts, if global in nature, are really application (or at least subject matter) specific, sales data warehouses. That is, we should recognize data warehouses and data marts, and within the data warehouse category "galactic" data warehouses, and application or subject matter specific ones.

Primary tool providers for DS/DMA are again Informatica and Carleton. Both tools emphasize the importance of shared metadata arising from every stage of the data warehousing process. Both tools emphasize the importance of a GMR, and lastly, both tools emphasize the importance of a centralized data staging area in producing data consistency and integrity.

Distributed Data Warehouse/Data Mart Architecture (DDW/DMA)

DDW/DMA is also similar to EDMA (see Figure Five). Like EDMA, it provides a dynamic data staging area and a common view of metadata across the enterprise in the form of a shared metadata repository. DDW/DMA has two distinctive characteristics.

First, it provides a logical database layer mapping a unified logical data model to physical tables in the various data marts. Second, it provides transparent querying of the unified logical database across data marts and data warehouses along with caching and joining services. Thus, the distributed character of the data warehouse/data mart system is made transparent to users.
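
The two characteristics can be illustrated with a small sketch in which two SQLite databases stand in for physically separate data marts. Everything named here is hypothetical: the east_mart and west_mart databases, the LOGICAL_MODEL mapping, and the query_logical helper are assumptions made for the example and do not describe how the Sybase, Informatica, or Carleton products work internally.

```python
import sqlite3


def build_marts():
    """Create two physical 'data marts' as separate databases on one connection."""
    conn = sqlite3.connect(":memory:")
    conn.execute("ATTACH DATABASE ':memory:' AS east_mart")
    conn.execute("ATTACH DATABASE ':memory:' AS west_mart")
    conn.execute("CREATE TABLE east_mart.sales (region TEXT, revenue REAL)")
    conn.execute("CREATE TABLE west_mart.sales (region TEXT, revenue REAL)")
    conn.executemany(
        "INSERT INTO east_mart.sales VALUES (?, ?)",
        [("east", 100.0), ("east", 40.0)],
    )
    conn.executemany(
        "INSERT INTO west_mart.sales VALUES (?, ?)",
        [("west", 60.0)],
    )
    return conn


# Logical layer: one logical table name mapped to the physical tables behind it.
LOGICAL_MODEL = {"sales": ["east_mart.sales", "west_mart.sales"]}


def query_logical(conn, logical_table, columns):
    """Query a logical table without the caller naming any physical mart.

    Table and column names are interpolated directly, so this sketch is only
    safe with trusted, hard-coded identifiers.
    """
    column_list = ", ".join(columns)
    union = " UNION ALL ".join(
        f"SELECT {column_list} FROM {physical}"
        for physical in LOGICAL_MODEL[logical_table]
    )
    return conn.execute(union).fetchall()


def total_revenue(conn):
    """Aggregate across all marts through the logical layer."""
    return sum(row[0] for row in query_logical(conn, "sales", ["revenue"]))
```

The caller asks only for "sales"; which marts hold the rows, and how many there are, is decided by the mapping. Caching and cross-mart joins, which the architecture also provides, are left out of this sketch.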

Leading tool providers supporting this architecture are Informatica, Carleton, and Sybase Adaptive Server. These tools are all offered as part of Sybase's Warehouse Studio [9]. Informatica and Carleton provide the unified logical view of the data warehouse, and Sybase Adaptive Server provides the ability to query, cache, and join across data marts and data warehouses as necessary. HP Intelligent Warehouse is also an example of this architecture. But it is a product in transition following its recent purchase by Platinum Technologies.

This is the most adaptable of the architectures discussed to this point, but it still reflects the limitations of the relational viewpoint when it comes to handling objects and processes, and it still doesn't support distributed and automated change capture and management.

Distributed Knowledge Management Architecture (DKMA)

DKM architecture is an evolving O-O/Component-based architecture applicable to enterprise wide systems incorporating multiple processing styles including DSS, OLTP, and Batch processing. These systems are called Distributed Knowledge Management Systems (DKMS), a concept I've introduced in previous work [10]. Here DKM architecture is applied to data warehouse/data mart-based DSS systems.

Top-Down and Bottom-Up architectures may be viewed as two-tier architectures utilizing clients and local or remote databases. EDMA, DS/DMA, and DDW/DMA may be viewed as adding middleware and tuple [11] layers to earlier architectures to provide the capability to manage warehouse systems integration through unified logical views, monitoring, reporting, and intentional DBA maintenance activity. But tuple-layer based management still doesn't provide automatic feedback of changes in one component of a data warehousing system to others.

DKM architecture may be viewed as adding an object layer to EDMA or to DDW/DMA to provide integration through automated change capture and management. Figure Six depicts DKM architecture.

The object layer contains an architectural component called an Active Knowledge Manager (AKM) [12]. The AKM provides process control/distribution services, an in-memory active object model accompanied by a persistent object store, and connectivity to a variety of data store and application types.

Process Control Services include:

  • in-memory proactive object state management and synchronization across distributed objects;
  • component management;
  • workflow management;
  • transactional multithreading.

The in-memory Active Object Model and Persistent Object Store Model components of the AKM include:

  • Event-driven behavior;
  • DKMS-wide model with shared representation;
  • Declarative business rules;
  • Caching through partial instantiation; and
  • A Persistent Object Store for the AKM.

Connectivity Services of the AKM include:

  • Language APIs: C, C++, Java, CORBA, COM;
  • Databases: Relational, ODBC, OODBMS, hierarchical, network, flat file, etc.;
  • Wrapper connectivity for application software: custom, CORBA, or COM-based;
  • Applications including all categories mentioned in the earlier discussion of the Dynamic Integration problem, whether these are mainframe, server, or desktop-based.

The DKM Architecture and the AKM provide the solution to the Dynamic Integration Problem, because only the DKMA, among the preceding architectures, supports distributed, proactive monitoring and management of change in the web of data warehouse, data mart, web information servers, component transaction servers, data mining servers, ETML servers, other application servers, and front-end applications comprising today's Enterprise DSS/Data Warehousing System.
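
The event-driven state synchronization attributed to the AKM can be illustrated with a toy observer pattern. This is a sketch under heavy assumptions and not the AKM itself: the ActiveObject and ActiveKnowledgeManagerSketch classes are hypothetical, everything runs in one process, and persistence, workflow, and transactions are omitted.

```python
class ActiveObject:
    """Hypothetical in-memory object that announces changes to its state."""

    def __init__(self, name):
        self.name = name
        self._state = {}
        self._subscribers = []

    def subscribe(self, callback):
        self._subscribers.append(callback)

    def set(self, key, value):
        if self._state.get(key) == value:
            return  # No change, so no event; this also stops propagation loops.
        self._state[key] = value
        for callback in self._subscribers:
            callback(self.name, key, value)

    def get(self, key):
        return self._state.get(key)


class ActiveKnowledgeManagerSketch:
    """Toy stand-in for an AKM: pushes each change to every other component."""

    def __init__(self):
        self.components = {}
        self.change_log = []

    def register(self, component):
        self.components[component.name] = component
        component.subscribe(self._on_change)

    def _on_change(self, source, key, value):
        # Record the event, then synchronize all other registered components.
        self.change_log.append((source, key, value))
        for name, component in self.components.items():
            if name != source:
                component.set(key, value)
```

When a data mart object changes a shared definition, the manager applies the same change to the warehouse object without any DBA action. That automatic feedback is the difference between this layer and the monitoring and reporting approach of the tuple-layer architectures.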

Variations With Introduction of the ODS

Each of the architectures covered may vary with the addition of an Operational Data Store (ODS). According to Inmon, Imhoff, and Sousa, [13]: "An ODS is a collection of data containing detailed data for the purpose of satisfying the collective, integrated operational needs of the corporation . . . The ODS is: subject-oriented, integrated, volatile, current-valued, detailed." The ODS is like a data warehouse in its first two characteristics, but it is like an OLTP system in its last three characteristics. Its purpose is to support operational, tactical decisions. The workload of an ODS involves four kinds of processing: loading data, updating, access processing, and DSS-style analysis across many records [14].

The four types of ODS processing are the source of the difficulty in optimizing an ODS: it is hard to optimize performance over all four types at once.

Look at the above architectures in relation to the ODS. It is clear that an architecture that will support both DSS and OLTP-style processing is needed in order to optimally integrate the ODS into the broader data warehousing architecture. In particular, process control services will be very important for the OLTP-style of processing we find in the ODS. Also, distribution of ODS objects across multiple servers will help ODS performance. Finally, in-memory processing in distributed AKMs can do much to upgrade performance in a distributed ODS. Of course, only one of the above architectures can provide these capabilities for the ODS: the DKM Architecture.

DKM Architecture and Data Mining

A key emerging capability in DKMS and data warehousing systems is Knowledge Discovery in Databases (KDD) or Data Mining [15][16]. The key mechanism for KDD is the data mining server.

Here are some difficulties with current data mining server products:

  • It's difficult to incorporate new data mining algorithms, and therefore to keep pace with new developments coming out of the research world;
  • Many products require that data be transported to proprietary data stores before data mining can occur;
  • Models produced by the data mining algorithms are not freely available to power users unless they use the data mining tool itself;
  • It is difficult to incorporate validation criteria not initially incorporated in the data mining tool into the KDD process;
  • There are few "open architecture" commercial data mining tools.

To solve these problems a product class called an Analytical Data Mining Workbench (ADMW) should be developed. The ADMW needs:

  • Easy and convenient encapsulation of new algorithms into object model classes;
  • Capability to mine data from any data source in the enterprise;
  • Incorporation of analytical models into an object model repository;
  • A modifiable validation model;
  • Integration of legacy data mining applications with the ADMW.

An ADMW with these capabilities would meet all of the difficulties specified above.

There are a number of reasons why DKM Architecture can help in developing an ADMW product. First, new algorithms can easily be encapsulated in objects through the wrapping capabilities of the AKM [17]. These wrapping capabilities can be used to create standard object interfaces for objects incorporating new data mining algorithms developed in universities and research centers. In turn, the standard interfaces can be used to plug the new objects including their new methods into the analytical workbench. The time to market of new data mining algorithms could be substantially reduced.
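
The wrapping idea in this first point can be sketched as an adapter behind a standard interface. All names here are hypothetical: MiningAlgorithm stands for whatever standard object interface a workbench might define, legacy_threshold_trainer stands for externally developed code with its own calling convention, and Workbench is a toy plug-in registry rather than an actual ADMW.

```python
from abc import ABC, abstractmethod
from statistics import mean


class MiningAlgorithm(ABC):
    """Hypothetical standard interface every wrapped algorithm must expose."""

    @abstractmethod
    def fit(self, rows):
        ...

    @abstractmethod
    def predict(self, row):
        ...


def legacy_threshold_trainer(values):
    """Stand-in for third-party code that knows nothing about the interface."""
    return mean(values)


class ThresholdWrapper(MiningAlgorithm):
    """Wraps the legacy trainer so it can be plugged into the workbench."""

    def __init__(self, column):
        self.column = column
        self.threshold = None

    def fit(self, rows):
        self.threshold = legacy_threshold_trainer([r[self.column] for r in rows])
        return self

    def predict(self, row):
        return row[self.column] > self.threshold


class Workbench:
    """Toy workbench that accepts any object honoring the standard interface."""

    def __init__(self):
        self._algorithms = {}

    def plug_in(self, name, algorithm):
        if not isinstance(algorithm, MiningAlgorithm):
            raise TypeError("algorithm must implement MiningAlgorithm")
        self._algorithms[name] = algorithm

    def run(self, name, training_rows, new_row):
        return self._algorithms[name].fit(training_rows).predict(new_row)
```

The workbench never sees the legacy function, only the wrapper. Adding another algorithm means writing another small wrapper class, with no change to the workbench.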

Second, persistent data can be brought into the AKM's in-memory object model for data mining without first relocating it from its current data mining data mart relational data store [18]. AKM's connectivity allows it to access relational tables to read data in "chunks" for high speed processing. To do this efficiently it is necessary to maintain relational production data mining tables side-by-side with the dimensionally modeled relational tables of the data mart. The function of the data mining tables is to maintain data in optimal form for input into an in-memory data mining engine that would be integrated with the AKM.
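
Reading relational data in "chunks" for in-memory processing might look like the following sketch, which uses the standard fetchmany cursor call from Python's sqlite3 module. The table and column names passed in are up to the caller, and the running mean is only a placeholder for a real in-memory data mining engine.

```python
import sqlite3


def iter_chunks(conn, query, chunk_size):
    """Yield lists of at most chunk_size rows from the query result."""
    cursor = conn.execute(query)
    while True:
        chunk = cursor.fetchmany(chunk_size)
        if not chunk:
            break
        yield chunk


def chunked_mean(conn, table, column, chunk_size=1000):
    """Compute a mean one chunk at a time instead of loading the whole table.

    The table and column names are interpolated directly, so this sketch is
    only safe with trusted identifiers.
    """
    total, count = 0.0, 0
    for chunk in iter_chunks(conn, f"SELECT {column} FROM {table}", chunk_size):
        total += sum(row[0] for row in chunk)
        count += len(chunk)
    return total / count if count else None
```

Only one chunk is held in memory at a time, so the size of the mining table does not bound the size of the analysis.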

Third, data mining can be performed by executing the analytical models in memory. The AKM's connectivity allows it to integrate the data mining engine with its process control services and in-memory object model.

Fourth, analytical models produced by an AKM-based application would be placed in an object model repository. Any model generated by the data mining engine can be expressed as an object with the model as one of its methods. So analytical models entering the data mining process can be saved and placed in the object model repository.
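
The statement that a model can be expressed as an object with the model as one of its methods can be shown with a very small sketch. The AnalyticalModel and ObjectModelRepository classes are hypothetical, the linear form is chosen only for brevity, and the repository here is an in-memory dictionary rather than a persistent object store.

```python
class AnalyticalModel:
    """Hypothetical wrapper: the fitted model is exposed as the predict method."""

    def __init__(self, name, slope, intercept):
        self.name = name
        self.slope = slope
        self.intercept = intercept

    def predict(self, x):
        return self.slope * x + self.intercept


class ObjectModelRepository:
    """Toy repository from which any user can fetch and run a saved model."""

    def __init__(self):
        self._models = {}

    def save(self, model):
        self._models[model.name] = model

    def get(self, name):
        return self._models[name]

    def names(self):
        return sorted(self._models)
```

Because the model travels with its own method, a power user who retrieves it from the repository can apply it without going back to the tool that produced it.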

Fifth, customized validity criteria could be added by modifying the validation model [19] in the repository. A validation model is nothing but another object in the object model. As such, it can be accessed by users of the repository, and its content can be changed, using the tools made available by the system to edit and reformulate analytical models of other kinds.
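
A validation model that is "nothing but another object" might be sketched as a container of named criteria that users can add to or remove from. The ValidationModel class and the two criterion factories below are hypothetical examples; the error measures are ordinary absolute-error checks chosen only for illustration.

```python
class ValidationModel:
    """Hypothetical validation model whose criteria can be edited at run time."""

    def __init__(self):
        self._criteria = {}

    def add_criterion(self, name, check):
        self._criteria[name] = check

    def remove_criterion(self, name):
        self._criteria.pop(name, None)

    def validate(self, predictions, actuals):
        """Apply every current criterion and report pass/fail by name."""
        return {name: check(predictions, actuals)
                for name, check in self._criteria.items()}


def max_error_below(limit):
    """Criterion: the largest absolute error must not exceed limit."""
    def check(predictions, actuals):
        return max(abs(p - a) for p, a in zip(predictions, actuals)) <= limit
    return check


def mean_error_below(limit):
    """Criterion: the mean absolute error must not exceed limit."""
    def check(predictions, actuals):
        errors = [abs(p - a) for p, a in zip(predictions, actuals)]
        return sum(errors) / len(errors) <= limit
    return check
```

A criterion that was not built into the tool is added with one call, which is the kind of after-the-fact customization this fifth point has in mind.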

Sixth, legacy data mining applications could be integrated using AKM connectivity services. If a language API (C/C++, Java, CORBA, DCOM) exists, AKM will be able to access the legacy application's functions once these are described in the AKM object model. If an API is absent, the AKM component can be used to "wrap" the data mining application and make its functions available to the AKM.

When considering the above points, keep in mind that there is no ADMW with all of the above capabilities at present. Data Mining is a rapidly growing field, but the market niche represented by the ADMW is empty.

DKM Architecture and Software Tools

To implement DKM Architecture in a DKMS you need the full range of tools now used to create data warehousing systems. In addition you need tools specifically for the AKM component. These include:

  • An object modeling RAD environment providing extensive process control services and connectivity (e.g., Template Software's Enterprise Integration Template; Forte; DAMAN's InfoManager; a combination of Ibex's DAWN workflow product along with its Itasca Active Object Database; a combination of Rational Rose, Persistence Power-Tier, and Iona's Orbix);
  • Technology for constructing software agents to proactively monitor components of the DKMS (e.g., CA Unicenter TNG, ObjectSpace's Voyager, DAMAN's InfoManager, and Persistence Power-Tier);
  • An OODBMS to serve as a persistent object repository for the AKM component (ObjectStore, Objectivity/DB, Jasmine, Versant, Itasca).

Conclusion: Coordinated Evolution of the Enterprise DKMS

An enterprise DKMS, or for that matter, an enterprise data warehousing system, is not a static construct in which the whole is the sum of the parts. It is, or should be, an intelligent system that learns and adapts over time. In such a system, the parts, or data marts, should not be subsets of the data warehouse. Nor should the data warehouse be the logical union of the data marts. Data marts are, at best, fuzzy subsets of data warehouses; and data warehouses represent the fuzzy union of data marts.

The DKM architecture provides for coordination of data mart development in enterprises from the get-go. The vision of the DKM architecture is not top-down coordination, however, but the coordination of components having fuzzy relations to one another.

There are a number of reasons why the relationship between data warehouses and data marts should be fuzzy, but they all stem from consideration of the role of business users in the dynamics of the developing relationships between data marts and the data warehouse within an enterprise. Whether a primarily top down, bottom up, or metadata guided architecture is used, there will always be continuous user feedback in response to data mart and data warehouse activity, because such feedback is part of the learning process in an enterprise.

User requirements are not static. They tend to evolve on exposure to new applications and new technologies as the users learn. Changes in requirements are not limited only to faster hardware, or better techniques for data mining, or improved database software, or GUI interfaces. They're also going to include changes in information, knowledge, and data requirements. New attributes and tables might need to be added to data marts. Old tables might need to be reorganized. New requirements may therefore impact either data or object models at both the data mart and data warehouse levels. New causal dimensions and attributes will be conceived and created by users responding to their new technology. New ETML processes may be needed to process these causal dimensions.

How will the demands for new information and knowledge be handled? Will users be told to wait until a central coordinating team adds new dimensions to a centralized data model? Or will departments supplement their data marts with new data; go through a new, if limited ETML process; and constitute a revised data mart (now a fuzzy subset of the data warehouse) that will solve their specific analysis problem?

If the central coordinating team disapproves adding new dimensions because they've "taken the pledge" to guard the integrity of the set of "master conformed dimensions," will they be able to stop a vice-president of Marketing from doing his job by adding the new causal dimensions? Should they be able to? And if departments are allowed to build their fuzzy subset data marts, would this again lead to stovepipes?

Not if the enterprise implements dynamic integration through continuous feedback from data marts to the data warehouse, and continuous feedback integration of changes that seem necessary at either the data mart or data warehouse levels. If a continuous pattern of adjustment to data mart changes is adopted as policy, a pattern of gradual evolution of data warehouses and data marts will occur, and stovepipes will be avoided.
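
The feedback from data mart to data warehouse described here can be pictured as a comparison of schemas followed by a merge. The sketch below is a deliberately simplified assumption: schemas are plain dictionaries mapping table names to column lists, diff_schema and feed_back are hypothetical helper names, and real feedback would also have to cover data, business rules, and ETML processes.

```python
def diff_schema(mart_schema, warehouse_schema):
    """Return, per table, the mart columns the warehouse does not yet know."""
    missing = {}
    for table, columns in mart_schema.items():
        known = set(warehouse_schema.get(table, ()))
        new_columns = [c for c in columns if c not in known]
        if new_columns:
            missing[table] = new_columns
    return missing


def feed_back(mart_schema, warehouse_schema):
    """Return a copy of the warehouse schema extended with the mart's additions."""
    updated = {table: list(cols) for table, cols in warehouse_schema.items()}
    for table, new_columns in diff_schema(mart_schema, warehouse_schema).items():
        updated.setdefault(table, []).extend(new_columns)
    return updated
```

Run after every data mart change, a step like this keeps the center adjusting to the periphery: the marketing department adds its causal dimension first, and the warehouse catches up instead of blocking the change.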

But data marts will not generally be logical subsets of data warehouses, nor will data warehouses be merely the union of data marts. The pattern of development will involve continuous feedback from the periphery to the center, and continual adjustment of both the periphery and the center to each other. The enterprise data warehouse will not bring a once-and-for-all decision support nirvana, but a much healthier process of continuous conflict, learning, and growth of business intelligence.

The means to reach this state of coordinated evolution is the DKMS. And to implement the DKMS we need an architecture consistent with its purposes and with the requirement that the dynamic integration problem give way to the pattern of coordinated evolution. The DKMA is that architecture.

White Paper No. Eleven

References

[1] W. H. Inmon, Building the Data Warehouse, 2nd ed. (New York, NY: John Wiley & Sons, 1996)

[2] The clearest statements on dimensional modeling and star schemas are Ralph Kimball's. See The Data Warehouse Toolkit (New York, NY: John Wiley & Sons, 1996), and "A Dimensional Modeling Manifesto," DBMS, August 1997. There is considerable controversy over whether dimensional modeling or E-R modeling should be used for the data warehouse. But everyone seems to agree that dimensional modeling is the preferred technique for relational data marts. See W. H. Inmon, Claudia Imhoff, and Ryan Sousa, Corporate Information Factory (New York, NY: John Wiley & Sons, 1998), 76-78.

[3] I'd like to thank my fellow participants in the dwlist server group (dwlist@datawarehousing.com) for the continuing architectural discussions carried on in the group. Our discussions motivated me to think through this paper on architectural evolution in data warehousing, and I'm sure your friendly response to this paper will contribute to the evolution of my ideas on architectural evolution.

[4] See Douglas Hackney, Understanding and Implementing Successful Data Marts (Reading, MA: Addison-Wesley, 1997), Pp. 52-54, 183-84, 257, 307-309, for clear and more detailed treatment of EDMA. See also Informatica's treatment of the EDMA concept in its "PowerCenter Technical Overview" White Paper, available at www.informatica.com.

[5] Hackney, Ibid.

[6] Informatica, Op. Cit., defines DDS as ". . . itself a data mart." This is carrying things a bit too far. If data mart means anything, it means a DSS processing platform. The DDS is not that. Informatica also says that "the DDS maintains a data model that closely resembles that of the operational systems." This may be true in cases where both the operational data and its DSS platform target are in relational form. But data stage processing, as Ralph Kimball has made clear, is overwhelmingly sequential in nature. So except in cases such as the above where it is inconvenient to take the data out of its relational form, the DDS should actually be in flat file form.

[7] Hackney, op. cit.

[8] Informatica, op. cit.

[9] www.sybase.com/products/dataware

[10] I introduced the DKMS concept in two previous White Papers "Object-Oriented Data Warehouse," and "Distributed Knowledge Management Systems: The Next Wave in DSS." Both are available at www.dkms.com/White_Papers.htm.

[11] Wolfgang Keller, Christian Mitterbauer, and Klaus Wagner, "Object-Oriented Data Integration," in Mary E. S. Loomis and Akmal B. Chaudhri (eds.), Object Databases in Practice (Upper Saddle River, NJ: Prentice-Hall, 1998), Pp. 7-11.

[12] The ideas for the AKM owe much to the following White Papers: Template Software, "Integration Solutions for the Real-Time Enterprise: EIT - Enterprise Integration Template," Dulles, VA, White Paper, May 8, 1998 (see also www.template.com); Persistence Software, "The PowerTier Server: A Technical Overview," at www.persistence.com/products/tech_overview.html; and John Rymer, "Business Process Engines, A New Category of Server Software, Will Burst the Barriers in Distributed Application Performance Engines," Emeryville, CA, Upstream Consulting White Paper, April 7, 1998, at www.persistence.com/products/wp_rymer.html. Two other products that could be used to develop the AKM component are DAMAN's InfoManager (inquire at www.damanconsulting.com), and Ibex's DAWN workflow product along with its ITASCA active database (at www.ibex.ch).

[13] W. H. Inmon, Claudia Imhoff, and Ryan Sousa, Op. Cit., Pp. 87-88.

[14] Ibid., Pp. 95-97.

[15] See Usama M. Fayyad, Gregory Piatetsky-Shapiro, Padhraic Smyth, and Ramasamy Uthurusamy (eds.), Advances in Knowledge Discovery and Data Mining (Cambridge, MA: M.I.T. Press, 1996).

[16] A recent white paper of mine, "Knowledge Management Metrics Development: A Technical Approach," treats KDD as a use case in the DKMS. See it at www.dkms.com/White_Papers.htm.

[17] A good summary of wrapping is in Thomas J. Mowbray and Ron Zahavi, The Essential CORBA: Systems Integration Using Distributed Objects (New York, NY: John Wiley & Sons, 1995), Pp. 232-238.

[18] These architectural views on data mining owe much to the architecture of the Datasage production data mining tool (www.datasage.com).

[19] See the development of this idea in my "Knowledge Management Metrics Development . . ." Op. Cit., Pp. 15-18.


Biography

Joseph M. Firestone, Ph.D.
CEO, Chief Scientist
Executive Information Systems Inc (EIS)
703-461-8823, eisai@home.com

Joseph M. Firestone, Ph.D., is CEO and Chief Scientist of Executive Information Systems (EIS) Inc. Joe has varied experience in consulting, management, information technology, decision support, and social systems analysis. Currently, he focuses on product, methodology, architecture, and solutions development in Enterprise Information and Knowledge Portals, where he performs knowledge and knowledge management audits, training, and facilitative systems planning, requirements capture, analysis, and design. Joe was the first to define and specify the Enterprise Knowledge Portal Concept. He is widely published in the areas of Decision Support (especially Enterprise Information and Knowledge Portals, Data Warehouses/Data Marts, and Data Mining) and Knowledge Management, and has recently completed a full-length industry report entitled "Approaching Enterprise Information Portals." Joe is a founding member of the Knowledge Management Consortium International (KMCI), Editor of the new KMCI Journal, Chairperson of the KMCI’s Artificial Knowledge Management Systems SIG, and a member of its Executive Committee, its Metaprise Project, and the KMCI Institute Governing Council. Joe is a frequent speaker at national conferences on KM and Portals. He is also developer of the Web site www.dkms.com, one of the most widely visited Web sites in the Portal and KM fields. DKMS.com has now reached a visitation rate of 83,000 visits annually.

Executive Information Systems Inc

The Executive Information Systems (EIS) Enterprise Knowledge Portal (EKP) is the only portal solution that provides the assurance that enterprise decision making will be based on validated knowledge. EIS’s EKP lets enterprises avoid the risk involved in Enterprise Information Portals (EIPs), which claim to offer increases in competitive advantage, ROI, speed of innovation, productivity, effectiveness, and profitability, but have as a central vulnerability the fact that they are only capable of managing data and information, not knowledge.

Enterprises using EIP-based solutions when they could be using EKP-based ones are gambling that unvalidated information can produce promised EIP benefits. The central value proposition of the EIS EKP is that it replaces gambling on unvalidated information with knowledge-based decision making. That is why it is much more likely to achieve the promised benefits of EIP-based solutions than its EIP competitors.

For more information, see www.dkms.com
