DKMA AND THE DATA WAREHOUSE BUS ARCHITECTURE
by Joseph M. Firestone, Ph.D.
Introduction
The Data Warehouse Bus Architecture is composed of "a master suite of
conformed dimensions" and standardized definitions of facts. [1, P. 156]
Business process data marts throughout an enterprise can "plug into" this bus
to receive the dimension and fact tables they need. The Bus thus supports the
various processes and associated data marts that measure key aspects of the
processes.
The logical union of these data marts is said to be the data warehouse, and
each data mart is said to be a subset of that data warehouse. This paper
describes the Data Warehouse Bus Architecture offered by Kimball, Reeves,
Ross, and Thornthwaite, [1, Pp. 153-157, 266-277, and 346-347] and then
contrasts it with DKM Architecture, an object-oriented alternative. [2]
The Data Warehouse Bus (DWB) Architecture
The DWB architecture pictures an enterprise composed of a set of business
processes. In many businesses the processes may be components of supply chains
and value chains. In other businesses they may represent "value circles."
Either way, taken as a whole they represent the activities that produce the
value stream of the enterprise. To describe the properties of the processes
and ultimately the value stream, we develop a set of data marts, at least one
for each process, the logical union of which is the data warehouse. The data
marts are composed of dimension tables and fact tables. Some of these tables
may be needed by more than one data mart, and more than one business process.
The tables that need to be shared must be "conformed" across data marts. The
conformed dimensions are viewed by Kimball, Reeves, Ross, and Thornthwaite as
constituting a bus that the business process data marts can "plug into" to
receive the shared dimensions they need. The metaphor of the bus can be most
clearly seen in two contexts. First, an analytical matrix [1, P. 271] whose
rows are business process data marts and whose columns are dimensions is
specified in the early stages of data warehouse development to guide warehouse
evolution. Checkmarks in the cells of the matrix indicate where a process is
"plugging into" a conformed dimension.
Second, a more visual metaphor for the DWB is provided by the system bus in a
PC. [1, P. 347] Thus:
"In a way, the matrix is like the system bus in your computer. Each dimension
is like a connector, or wire, on the bus. It carries a standard (i.e.,
conformed) 'signal' like product or customer. Each business process is an
expansion card that plugs into the data connectors as appropriate." [1, P. 346]
Each business process uses only those conformed dimensions that are useful to
it. As new
business processes are added, their associated data marts can "plug" into the
existing dimensional bus.
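The analytical matrix described above can be pictured with a short Python
sketch. It is an illustration only: the data mart and dimension names are
hypothetical, and the helper functions (all_dimensions, shared_dimensions,
plug_in, render) are assumptions introduced here, not part of the DWB
literature.

```python
# Hypothetical bus matrix: rows are business process data marts,
# columns are dimensions. All names are illustrative only.
BUS_MATRIX = {
    "orders":    {"date", "product", "customer", "promotion"},
    "inventory": {"date", "product", "warehouse"},
    "shipments": {"date", "product", "customer", "warehouse"},
}


def all_dimensions(matrix):
    """Return every dimension that appears anywhere in the matrix."""
    return set().union(*matrix.values())


def shared_dimensions(matrix):
    """Return the dimensions used by more than one data mart.

    These are the ones that must be conformed across marts.
    """
    counts = {}
    for dims in matrix.values():
        for dim in dims:
            counts[dim] = counts.get(dim, 0) + 1
    return {dim for dim, n in counts.items() if n > 1}


def plug_in(matrix, mart, dims):
    """Add a data mart to the matrix and return the dimensions it reuses."""
    reused = set(dims) & all_dimensions(matrix)
    matrix[mart] = set(dims)
    return reused


def render(matrix):
    """Render the matrix as text; X marks a mart plugging into a dimension."""
    columns = sorted(all_dimensions(matrix))
    lines = ["mart".ljust(12) + " ".join(c.ljust(10) for c in columns)]
    for mart in sorted(matrix):
        cells = [("X" if c in matrix[mart] else ".").ljust(10) for c in columns]
        lines.append(mart.ljust(12) + " ".join(cells))
    return "\n".join(lines)
```

In this sketch, plugging a hypothetical "returns" mart with the dimensions
date, customer, and reason into the matrix would report that it reuses the
existing date and customer dimensions, while reason would be new to the bus.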
Distributed Knowledge Management (DKM) Architecture
The DKM architecture has been outlined elsewhere. [2] [3] [4] DKM architecture
is the characteristic architectural pattern of the Distributed Knowledge
Management System. [5] It is an evolving O-O/Component-based architecture
applicable to enterprise wide systems incorporating multiple processing styles
including DSS, OLTP, and Batch processing.
Top-Down and Bottom-Up architectures may be viewed as two-tier architectures
utilizing clients and local or remote databases. [2] Enterprise Data Mart
Architecture (EDMA), Data Stage/Data Mart Architecture (DS/DMA), Distributed
Data Warehouse/Data Mart Architecture (DDW/DMA), and the Data Warehouse Bus
architecture may all be viewed as adding middleware and tuple layers to
earlier architectures. The layers are added to provide the capability to
manage warehouse systems integration through unified logical views,
monitoring, reporting, and intentional DBA maintenance activity. But
tuple-layer based management still doesn't provide automatic feedback of
changes in one component of a data warehousing system to others.
DKM architecture may be viewed as adding an object layer to EDMA, DS/DMA,
DDW/DMA, or DWB, to provide integration through automated change capture and
management. The object layer contains process distribution services, in-memory
and persistent object models, and connectivity to a variety of data store and
application types. The layer requires an architectural component called an
Active Knowledge Manager (AKM). [2] [3] [4]
The Active Knowledge Manager
An AKM provides process control and distribution services, an object model of
the DKMS, and connectivity to all enterprise information, data stores, and
applications. Process Control and Distribution services are the activities
that manage communication among and change in the objects and components of
the DKMS, as well as work flow in the system. The in-memory active object
model provides the behavioral capabilities and attributes needed for the
process control and distribution services. And the connectivity services
provide the communications access needed to perform the process control and
distribution services.
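The idea of managing change among the objects of the system can be sketched
with a simple publish/subscribe arrangement. This is a minimal sketch under
stated assumptions, not a description of an actual AKM: the class names
(ActiveKnowledgeManager, DataMart), the methods (register, subscribe, update,
on_change), and the customer_dim object are all hypothetical.

```python
from collections import defaultdict


class ActiveKnowledgeManager:
    """Toy stand-in for an AKM: keeps an in-memory object model and
    notifies subscribers whenever a shared object changes."""

    def __init__(self):
        self._objects = {}                     # in-memory object model
        self._subscribers = defaultdict(list)  # object name -> callbacks

    def register(self, name, state):
        """Add an object, with its initial state, to the object model."""
        self._objects[name] = dict(state)

    def subscribe(self, name, callback):
        """Ask to be called back whenever the named object changes."""
        self._subscribers[name].append(callback)

    def update(self, name, **changes):
        """Apply changes, then propagate the new state to every subscriber."""
        self._objects[name].update(changes)
        for callback in self._subscribers[name]:
            callback(name, dict(self._objects[name]))


class DataMart:
    """Toy data mart that keeps local copies of the objects it uses."""

    def __init__(self, name):
        self.name = name
        self.local_copies = {}

    def on_change(self, obj_name, state):
        # Refresh the local copy when the AKM reports a change.
        self.local_copies[obj_name] = state


# Usage: two hypothetical marts share one object and both see the change.
akm = ActiveKnowledgeManager()
akm.register("customer_dim", {"version": 1})
sales, support = DataMart("sales"), DataMart("support")
akm.subscribe("customer_dim", sales.on_change)
akm.subscribe("customer_dim", support.on_change)
akm.update("customer_dim", version=2)
```

The point of the sketch is that neither data mart polls for changes; the
change made in one place is fed back to the others automatically.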
Relating the Architectures
The DKM architecture is much broader in scope than the DWB architecture. It
addresses a broader set of issues, and is comparable to a data warehouse
systems architecture, [See 1, P. 508] such as the "Big Shopper Data Warehouse
Technical Architecture Model," rather than the more limited in scope data
architecture of the DWB. But within DKM architecture there is a persistent
object store. It is a repository catalog of all of the objects within the
DKMS.
Among these objects are entity objects encapsulating all of the data
attributes in the DKMS. Among these entity objects are dimensional objects and
fact objects. [6] Moreover, process control and distribution services in the
DKM architecture guarantee both conformed dimensional and fact objects as the
system evolves, and transparent object sharing across data marts when object
sharing is desirable.
In short, DKM architecture embeds an Object Warehouse Bus (OWB) of conformed
fact and dimensional objects. Indeed, a fundamental idea of object
orientation is the notion that objects support business processes and use
cases, and that it is through objects that business processes are performed.
[7] This idea is analogous to the emphasis of Kimball, Reeves, Ross, and
Thornthwaite on data marts supporting business processes and the DWB providing
a "bus" that the processes can "plug into."
So the OWB, like the DWB, provides a systematic foundation for business
process data marts, and supports their construction and maintenance. Entity
objects (and other types of objects as well) are permanently staged in the
persistent object catalog. They may be used whenever a new data mart needs to
be constructed, and they may be freely distributed across the diverse data
stores and application platforms of the DKMS.
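A persistent object catalog that stages conformed entity objects might be
sketched as follows. The sketch keeps everything in memory and makes several
assumptions that are not in the source: the class names (DimensionObject,
FactObject, ObjectCatalog), the build_data_mart helper, and the rule that a
second, different definition under an existing name is rejected.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DimensionObject:
    """A dimensional entity object, identified by name."""
    name: str
    attributes: tuple


@dataclass
class FactObject:
    """A fact entity object that refers to shared dimension objects."""
    name: str
    dimensions: tuple  # DimensionObject instances
    measures: tuple


class ObjectCatalog:
    """Hypothetical object catalog: exactly one definition per name."""

    def __init__(self):
        self._entries = {}

    def stage(self, obj):
        """Stage an object; refuse a conflicting definition of the same name."""
        existing = self._entries.get(obj.name)
        if existing is not None and existing != obj:
            raise ValueError(f"{obj.name!r} conflicts with the staged definition")
        return self._entries.setdefault(obj.name, obj)

    def get(self, name):
        return self._entries[name]


def build_data_mart(catalog, fact_name, dimension_names, measures):
    """Assemble a fact object from dimensions already staged in the catalog."""
    dims = tuple(catalog.get(n) for n in dimension_names)
    return FactObject(fact_name, dims, tuple(measures))
```

Because every data mart obtains its dimensions from the same catalog, two
marts that both name the product dimension receive the very same object,
which is the object-layer counterpart of a conformed dimension table.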
The metaphor of an electronic or PC-bus not only fits tuple layer structures
such as the tables in a relational database, but also fits object layer
structures such as the entity objects in the AKM's persistent data store. In
addition to the data warehouse bus, there is also an object warehouse bus, for
those who wish to take advantage of it.
Beyond the DWB and the OWB
The DWB architecture supports sharing dimension and fact tables, and provides
consistency of data and semantics across data marts. The OWB within DKM
architecture supports sharing dimensional and fact objects, consistency of
data and semantics, and sharing of methods of data access across data marts.
So it provides a somewhat more comprehensive shared architecture encompassing
not only data aspects but access methods as well.
But DKM architecture, as I stated earlier, is much broader in scope than
either DWB architecture, or the OWB, and therefore it provides a much greater
breadth of architectural sharing across data marts. Specifically, in addition
to sharing entity objects and access methods for those objects, DKM
architecture provides for sharing control objects. And through them: (a)
declarative business rules encapsulated in the objects, (b) procedural
networks of business rules including computational algorithms, and analytical
models, and (c) reports and query formats and results.
It also provides for tracking what is shared, so that change in any shared
object or component at any point in the distributed system is automatically
tracked, managed, and integrated into the system. In addition, DKM
architecture provides a variety of other capabilities to all data marts in the
data warehousing system.
In the area of process control and distribution services, these include
in-memory proactive object state management and synchronization across
distributed objects (including business rule management and processing, and
metadata management); component management; workflow management; transactional
multithreading; and CORBA and/or COM messaging. These services provide an
integrative "glue," and comprehensive management capability to the data
warehousing system that is not available from either the DWB or the OWB alone.
The in-memory Active Object Model and Persistent Object Store Model components
of the AKM include event-driven behavior; a DKMS-wide model with shared
representation; declarative business rules; caching through partial
instantiation; and a persistent object store for the AKM. The role of the
object model is to provide the AKM with a
shared representation of system state across all of the data marts. This state
is available to all of the data marts, and when changes occur in any, the
reflexive objects in the AKM specific to that data mart alert related objects
in the system. While a shared state representation of entity objects is
available from the OWB, and a shared representation of data content and logic
is available from the DWB, these will not meet the needs of the various
process control and distribution services. Only the architectural component of
a shared object model including control objects providing for event-driven
behavior, business rules, and partial instantiation of objects can provide the
foundation needed for these.
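The brief does not define "caching through partial instantiation." One
plausible reading, assumed here, is that an in-memory object loads only the
attributes that are actually requested and keeps them cached until a change
event invalidates them. The PartialObject class, its loader callback, and the
invalidate method below are hypothetical names used only to illustrate that
reading.

```python
class PartialObject:
    """Loads attributes from a backing store only when first requested."""

    def __init__(self, key, loader):
        self._key = key
        self._loader = loader  # callable(key, attribute) -> value
        self._cache = {}       # attributes instantiated so far

    def get(self, attribute):
        """Return an attribute, loading and caching it on first access."""
        if attribute not in self._cache:
            self._cache[attribute] = self._loader(self._key, attribute)
        return self._cache[attribute]

    def invalidate(self, attribute=None):
        """Drop cached state, for example in response to a change event."""
        if attribute is None:
            self._cache.clear()
        else:
            self._cache.pop(attribute, None)

    @property
    def instantiated(self):
        """Names of the attributes currently held in memory."""
        return sorted(self._cache)
```

Under this reading, an object that represents a large entity costs little
memory until it is used, and an event-driven invalidation keeps the cached
part consistent with the persistent store.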
The Connectivity Services of the AKM include:
Language APIs: C, C++, Java, CORBA, COM.
Databases: relational, ODBC, OODBMS, hierarchical, network, flat file, etc.
Wrapper connectivity for application software: custom, CORBA, or COM-based.
All legacy, relational, and object/component-based applications.
The DWB and the OWB are not broad enough
in scope to provide shared connectivity to the data marts. They don't deal
with the architecture of connectivity. So here is another area where DKM
architecture exceeds their bounds.
In data warehousing systems based on a metadata layer, various products
provide connectivity. The architecture of connectivity is specified
accordingly. In the AKM though, connectivity comes integrated. It is provided
by its control objects which provide access to sources of data mapped to the
object model.
This broad range of connectivity services of the AKM provides it with the
ability to map and connect its abstract object model to the corporate reality
of diverse and chaotic data, information, and knowledge resources. The AKM,
however elegant and systematic its internal organization, and however
powerful its methods for process distribution and control, could not operate
without the data necessary to populate its object and component-based
abstractions. The extent to which the AKM's connectivity is universal places
an outer boundary on its capabilities for full data warehouse systems
integration. If you can't connect to a data resource, you can't integrate it
into your system.
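Connectivity provided by objects that map data sources to the object model
can be sketched with a small adapter hierarchy. This is an illustration under
stated assumptions: the Connector classes, the column mappings, and the table
and column names are hypothetical, and SQLite and CSV text merely stand in
for the relational and flat-file stores mentioned above.

```python
import csv
import io
import sqlite3
from abc import ABC, abstractmethod


class Connector(ABC):
    """Maps one data source onto attributes of the shared object model."""

    @abstractmethod
    def fetch(self):
        """Return rows as dicts keyed by object-model attribute names."""


class SqlConnector(Connector):
    def __init__(self, connection, query, mapping):
        self._connection = connection
        self._query = query
        self._mapping = mapping  # source column -> object-model attribute

    def fetch(self):
        cursor = self._connection.execute(self._query)
        columns = [d[0] for d in cursor.description]
        return [
            {self._mapping[c]: v for c, v in zip(columns, row) if c in self._mapping}
            for row in cursor.fetchall()
        ]


class FlatFileConnector(Connector):
    def __init__(self, text, mapping):
        self._text = text
        self._mapping = mapping

    def fetch(self):
        reader = csv.DictReader(io.StringIO(self._text))
        return [
            {self._mapping[c]: v for c, v in row.items() if c in self._mapping}
            for row in reader
        ]


def load_customers(connectors):
    """Collect customer rows from every connected source, in order."""
    rows = []
    for connector in connectors:
        rows.extend(connector.fetch())
    return rows


# Usage with two hypothetical sources that name their columns differently.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE cust (cust_id INTEGER, cust_nm TEXT)")
conn.execute("INSERT INTO cust VALUES (1, 'Acme')")
sql_source = SqlConnector(
    conn, "SELECT cust_id, cust_nm FROM cust", {"cust_id": "id", "cust_nm": "name"}
)
flat_source = FlatFileConnector("ID,NAME\n2,Globex\n", {"ID": "id", "NAME": "name"})
customers = load_customers([sql_source, flat_source])
```

Both sources end up expressed in the same attribute names, although the flat
file still yields its id as text; a fuller mapping would also reconcile data
types. A source for which no connector exists simply cannot contribute rows,
which is the outer boundary described above.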
Conclusion
The DWB provides an architecture for integrating data across data marts. It is
not true, though, that the architectural framework provided by the DWB alone is
sufficient to provide the foundation for an incremental data warehouse. That
is, the union of business process data marts is not a data warehouse. [1, Pp.
19, 200-203, 266-271] There are at least two reasons why this is not so.
First, the union of business process data marts doesn't necessarily
provide the data needed for management decision support for departments, or
for departmental interactions among themselves and with the external world. So
unless the term business processes includes processes internal to departments,
the statement won't hold even at the level of data attributes.
Second, and more importantly, the DWB architecture alone does not
address process control and distribution issues, shared methods and behavior
of data marts becoming integrated into a data warehouse, and shared aspects of
connectivity. So if you put together a set of data marts sharing a DWB
architecture in hopes of getting a data warehouse in the course of time,
you'll still have to contend with conflicting processes, methods, behavior,
and connectivity needs when you attempt to perform the integration. You won't
have solved the islands of information problem, but only the islands of data
problem.
A data warehouse is not just a data store. It is a data store capable of
providing decision support. This means it must be integrated with the
environment around it. And this, in turn, means that the DWB must be
supplemented with a process and metadata architecture that the various data
marts can hold in common. And finally, it means that this process and metadata
architecture must be scalable to the enterprise level.
But these process and metadata requirements go beyond the conceptual bounds of
the DWB, as they do those of the OWB. To handle these requirements you
need a data warehouse systems architecture, and not just a data warehouse bus
architecture. Such an architecture is provided by DKM architecture and its
Active Knowledge Manager component. It is the DKM architecture that provides
both the data and process integration necessary for incremental, evolutionary,
data warehouse construction from data marts.
DKMS Brief No. Seven
References
[1] Ralph Kimball, Laura Reeves, Margy Ross, and Warren Thornthwaite, The Data
Warehouse Life Cycle Toolkit (New York: John Wiley & Sons, 1998).
[2] Joseph M. Firestone, "Architectural Evolution in Data Warehousing,"
available at www.dkms.com/White_Papers.htm.
[3] Joseph M. Firestone, "DKMS Brief No. Four: Business Process Engines in
Distributed Knowledge Management Systems," available at
www.dkms.com/White_Papers.htm.
[4] Joseph M. Firestone, "DKMS Brief No. Six: Data Warehouses, Data Marts,
and Data Warehousing: New Definitions and New Conceptions," available at
www.dkms.com/White_Papers.htm.
[5] Joseph M. Firestone, "Distributed Knowledge Management Systems: The Next
Wave in DSS," available at www.dkms.com/White_Papers.htm.
[6] Joseph M. Firestone, "Dimensional Object Modeling," available at
www.dkms.com/White_Papers.htm.
[7] Ivar Jacobson, Maria Ericsson, and Agneta Jacobson, The Object Advantage:
Business Process Reengineering with Object Technology (Reading, MA:
Addison-Wesley, 1995).
Biography
Joseph M. Firestone, Ph.D.
CEO, Chief Scientist
Executive Information Systems Inc (EIS)
703-461-8823, eisai@home.com
Joseph M. Firestone, Ph.D. is CEO and Chief Scientist of Executive Information Systems (EIS)
Inc. Joe has varied experience in consulting, management, information
technology, decision support, and social systems analysis. Currently, he
focuses on product, methodology, architecture, and solutions development in
Enterprise Information and Knowledge Portals, where he performs knowledge and
knowledge management audits, training, and facilitative systems planning,
requirements capture, analysis, and design. Joe was the first to define and
specify the Enterprise Knowledge Portal Concept. He is widely published in the
areas of Decision Support (especially Enterprise Information and Knowledge
Portals, Data Warehouses/Data Marts, and Data Mining), and Knowledge
Management, and has recently completed a full-length industry report entitled
"Approaching Enterprise Information Portals." Joe is a founding member of the
Knowledge Management
Consortium International (KMCI), Editor of the new KMCI Journal, Chairperson
of the KMCI’s Artificial Knowledge Management Systems SIG, a member of its
Executive Committee, its Metaprise Project, and the KMCI Institute Governing
Council. Joe is a frequent speaker at national conferences on KM and Portals.
He is also developer of the Web site www.dkms.com, one of the most widely visited
Web sites in the Portal and KM fields. DKMS.com has now reached a visitation
rate of 83,000 visits annually.
Executive Information Systems Inc
The Executive Information Systems (EIS) Enterprise Knowledge Portal (EKP) is
the only portal solution that provides the assurance that enterprise decision
making will be based on validated knowledge. EIS’s EKP lets enterprises avoid
the risk involved in Enterprise Information Portals which claim to offer
increases in competitive advantage, ROI, speed of innovation, productivity,
effectiveness and profitability, but have as a central vulnerability the fact
that they are only capable of managing data and information, not
knowledge.
Enterprises using EIP-based solutions when they could be using EKP-based ones,
are gambling that unvalidated information can produce promised EIP benefits.
The central value proposition of the EIS EKP is that it replaces gambling on
unvalidated information with knowledge-based decision making. That is why it
is much more likely to achieve the promised benefits of EIP-based solutions
than its EIP competitors.
For more information, see
www.dkms.com