DATA WAREHOUSING:
A CATALYST FOR OPERATIONAL INFORMATION SYSTEM IMPROVEMENT: PART I
by Kathy Long
Data warehousing is the enterprise-wide transformation of historical raw data into business intelligence. To compete and grow in today's dynamic business market, decision makers need to have their hands directly on the pulse of the business. Business strategy is formed. Plans are put into action. Actions are measured. Analysts need flexible, immediate access to high quality, actionable information that can be used to measure today's performance and formulate tomorrow's winning strategies. Data warehousing provides the information critical to business survival.
The warehousing process requires standardization, transformation and integration of shared data extracted from heterogeneous operational systems. As we go through the data acquisition process, we gain an enormous amount of knowledge about the extent of integration and the level of quality of source system data. To gain full benefit from data warehousing initiatives, we need to leverage that knowledge to improve the architecture of the operational systems. We need to establish a "closed loop" IT architecture assessment process, where the information that the warehousing team gains in the data acquisition phase is used as input to the operational systems assessment and improvement process. This requires that we think of the systems -- operational and warehouse -- as one enterprise information systems architecture.
All uses of enterprise information -- planning, analysis and control -- require high quality data. Data has high quality when it serves its business purpose, satisfies quality characteristics, is shareable and is cost effective to manage. The operational systems may, but do not necessarily, manage data quality or enforce all required business rules.
Continuous improvements to operational systems data will improve the cost effectiveness and value of data across all enterprise applications. This goal requires the management of data as an information asset where its definition, maintenance and use are controlled to ensure that the data meets business needs and quality standards. A closed loop IT architecture assessment process will:
Poor data quality has many possible causes: a short-term, reactive systems management outlook, application teams with different methodologies, diverse toolsets that make integration difficult, weak security enforcement, and a focus on hardware and software technology instead of architecture. It may not be feasible or cost effective to correct all instances of poor quality data, e.g. anomalies in historical data. However, the causes must be examined, and policies and projects must be put in place to support the warehouse objective of providing business intelligence. The warehousing team understands the extent of data integration and data quality in the operational sources, and is in an ideal position to support quality management objectives at the enterprise level. There are several ways the warehousing team can support enterprise information quality objectives:
1) Establish a data quality management process
Many organizations have a quality management policy that is proudly advertised in corporate literature but is not woven into the information management process. Data standards can be defined and state-of-the-art information technology employed, but if the integrity of the data is not continuously managed, the data in the warehouse cannot add business value to the organization.
The team must assess the existing data quality policies and procedures that support the warehouse data sources. If there is no quality management process, one should be defined within the context of the warehouse environment. Once established and proven, the process can be "marketed" and migrated upstream to the source systems.
The quality management process should define end user expectations for data quality based on quality characteristics including accuracy, validity, completeness, timeliness, consistency, relatability, and uniqueness. The data to be loaded into the warehouse must be measured against the quality expectations. Statistics on data quality measures should be captured and published. Data quality improvement projects should be defined and prioritized based on the severity of the errors and the cost/benefit of improvement. The improvement efforts should be continuously monitored, capturing quality trends.
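The measurement step described above can be illustrated with a minimal sketch. It assumes three of the listed characteristics (completeness, validity and uniqueness) are scored as simple ratios over the records staged for loading; the record layout, field names, business rule and thresholds are all hypothetical and stand in for whatever the end users actually define.

```python
# Hypothetical sample of source records staged for loading into the warehouse.
records = [
    {"customer_id": "C001", "country": "US", "order_total": 120.50},
    {"customer_id": "C002", "country": "", "order_total": 75.00},
    {"customer_id": "C002", "country": "DE", "order_total": -10.00},
    {"customer_id": "C004", "country": "FR", "order_total": 42.00},
]


def completeness(rows, field):
    """Share of rows where the field is present and non-empty."""
    if not rows:
        return 1.0
    filled = sum(1 for r in rows if r.get(field) not in (None, ""))
    return filled / len(rows)


def validity(rows, field, is_valid):
    """Share of rows whose field value passes a business rule."""
    if not rows:
        return 1.0
    return sum(1 for r in rows if is_valid(r.get(field))) / len(rows)


def uniqueness(rows, field):
    """Ratio of distinct field values to total rows."""
    if not rows:
        return 1.0
    return len({r.get(field) for r in rows}) / len(rows)


def quality_report(measures):
    """Compare each measured score with its expected threshold."""
    return [
        {"measure": name, "score": score, "expected": expected,
         "met": score >= expected}
        for name, (score, expected) in measures.items()
    ]


# Hypothetical expectations agreed with end users: (measured score, threshold).
measures = {
    "country completeness": (completeness(records, "country"), 0.95),
    "order_total validity": (
        validity(records, "order_total",
                 lambda v: v is not None and v >= 0),
        0.99,
    ),
    "customer_id uniqueness": (uniqueness(records, "customer_id"), 1.00),
}

report = quality_report(measures)
for line in report:
    print(line)
```

Publishing a report like this for every load, and keeping the history, is one way to obtain the quality statistics and trends the process calls for. The measures that miss their thresholds become candidates for improvement projects.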
2) Establish business data stewardship
The business analysts who make decisions based on key performance indicators, data relationships and trends are strong advocates for high quality data, and specific individuals among them should be selected as business data stewards. The stewards apply their knowledge of the information and its use in the decision making process to help define the quality management process described above and to prioritize improvement projects.
The training of business analysts in the use of the warehouse should include their role as data consumers in the quality management process. All knowledge workers have a responsibility in their use of information to assess its quality and provide feedback.
---
The concluding part of this article will appear in the next edition of D S * .
---
For more information, see http://www.spectrumtech.com