
META DATA IS THE ANSWER; NOW WHAT WAS THE QUESTION? PART I
by Robert S. Seiner - Spectrum Technology Group


Some people talk about meta data all the time as though it holds the answer to ALL questions about managing data. I may even be considered one of those people. If YOU are already getting tired of talking about meta data, you are probably in the wrong business.

Meta data is not really new. It was around long before you and I arrived in this industry. Meta data will be around long after we go to that great repository in the sky. But it was not until the past ten years, maybe much less, that the importance of managing meta data became apparent. This increase in importance has been a result of companies integrating the vast number of information systems and databases, creating decision support environments, and paying attention to managing data as a valued corporate asset.

Meta data (as data about data; or documentation about data) has now become an integral part of EVERYTHING having to do with information technology. If you do not believe me, consider the following questions.

1. Should meta data be a consideration in the evaluation of IT tools?

Yes. Meta data should always be a consideration when evaluating new and old IT tools. The data (or information) that is found in all IT tools is meta data. The meta data is, more often than not, used ONLY by the tool to perform its function (extraction, movement, security, modeling, reporting, ...). Each function is very important, but the meta data is typically never viewed or used by people who are not directly associated with the use of that tool.

The truth is that the meta data has the potential for value far beyond how it is used by the IT tool in which it originates. But since no person can view it, the meta data never provides additional value. Examples of under-used meta data include business rules in data models, allowable values in auditing and transformation tools, user ids and privileges in security packages, and on and on. Business rules are valuable information, but often nobody sees them, and they are not actively being used to control processes. Allowable values are often audited when data is moved to the warehouse; however, the lists of accepted values and their descriptions cannot be viewed by the processes that are generating the data. User ids and privileges from security packages could be the basis of information stewardship if only that information, the meta data, were available. ... And these are just three examples.

If one is to uncover the hidden value in meta data, it becomes important to pay closer attention to the meta data sources. Some IT tool vendors store meta data in proprietary formats while others store meta data in open formats. When using tools that store meta data in a proprietary format, it becomes difficult (if not impossible) to use that meta data for any reason other than to operate the tool. Tools that provide open (and documented) meta data structures provide the opportunity to extract that meta data for use in a centralized meta data repository or data asset catalog. This, in turn, makes the meta data available to individuals who may benefit from its use.
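As a rough illustration of extracting meta data from a tool with an open, documented structure, the sketch below uses SQLite as a stand-in source, since its catalog (the sqlite_master table and PRAGMA table_info) is documented. The function names, the "orders_db" source label, the demo table, and the shape of the catalog records are all assumptions made for this example; they are not prescribed by any particular repository product.

```python
import sqlite3


def build_demo_source():
    """Create a small in-memory database to act as the 'IT tool' (hypothetical)."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE customer ("
        " customer_id INTEGER PRIMARY KEY,"
        " status TEXT NOT NULL,"
        " region TEXT)"
    )
    return conn


def extract_column_metadata(conn, source_name):
    """Read table and column definitions from SQLite's documented catalog.

    Returns a list of plain dictionaries that could be loaded into a
    central meta data repository or data asset catalog.
    """
    catalog = []
    tables = conn.execute(
        "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
    ).fetchall()
    for (table,) in tables:
        # PRAGMA table_info yields: cid, name, type, notnull, dflt_value, pk
        for _cid, column, col_type, notnull, _default, pk in conn.execute(
            f"PRAGMA table_info('{table}')"
        ):
            catalog.append(
                {
                    "source": source_name,
                    "table": table,
                    "column": column,
                    "type": col_type,
                    "required": bool(notnull),
                    "primary_key": bool(pk),
                }
            )
    return catalog


if __name__ == "__main__":
    for entry in extract_column_metadata(build_demo_source(), "orders_db"):
        print(entry)
```

The point of the sketch is only that an open structure can be read by something other than the tool itself; a proprietary format would offer no equivalent of the catalog queries used here.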

Meta data, in all of its shapes and forms in the dozens of IT tools we all use, has increasingly become the knowledge bank of what companies are doing with their data. Therefore, it is better to evaluate and understand the meta data availability and capabilities of an IT tool before spending the money than to find out too late that the meta data will serve one purpose only and stay hidden.

2. Is meta data a factor in achieving a high Return on Investment (ROI) in Data Warehousing?

Simply stated, if meta data is not provided to the knowledge workers, the chances of data warehouse / mart under-use or mis-use are significantly increased.

Most individuals want to know simple information about the warehouse data, such as what types of data exist, how the data is named, defined, and referenced, and how the data can be selected. There are also end-users who want to know more complex information about warehouse data, such as what business rules were used to select the data, how the data was mapped, transformed, cleansed, and moved to the data warehouse, and how the data is audited and balanced.

Warehouse end-users are less likely to use the data warehouse if they do not understand the data. A client of mine said it succinctly when he stated that "if they are confused, they will not use it".

In many organizations, researching, selecting, extracting, and verifying data consumes more time than the data analysis itself. "Our goal is to flip the percentages from 70% of the time preparing data and 30% of the time doing data analysis, to 30% and 70%," stated one vice president of a large financial institution. This type of statement is fairly common and is often used to cost-justify data mart development. One way to "flip" the figures is to improve understanding of the warehouse data through meta data.
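To put the "flip" in concrete terms, the short calculation below works through what the quoted percentages would mean for one analyst. The 40-hour week and the function name are assumptions chosen purely for illustration; only the 70/30 and 30/70 splits come from the quotation.

```python
def analysis_hours(total_hours, analysis_percent):
    """Hours left for analysis, given the share of time spent on analysis."""
    return total_hours * analysis_percent / 100


WEEK_HOURS = 40  # assumed working week, for illustration only

before = analysis_hours(WEEK_HOURS, 30)  # 70% preparing, 30% analyzing
after = analysis_hours(WEEK_HOURS, 70)   # 30% preparing, 70% analyzing

print(f"Analysis hours before the flip: {before:.0f}")
print(f"Analysis hours after the flip:  {after:.0f}")
print(f"Increase in analysis time:      {after / before:.2f}x")
```

Under that assumption, analysis time goes from 12 to 28 hours a week, a little more than double, without adding any staff.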

Warehouse mis-use is also a risk when meta data is not available. When several individuals spend their time researching the same data and reach different conclusions from different results, or results from one data source do not match results from another data source, the cost of data preparation is multiplied (as is the frustration of the decision makers). In this type of situation, the company is often required to select the "best" answer to the original question, as opposed to THE answer. More often than not, the discrepancy in the results comes from a lack of data understanding or from inconsistencies created through less-than-adequate data management practices. Both can be addressed through better meta data management practices.
---
The second part of this article will appear in the next edition of D S * .
---
For more information, see http://www.spectrumtech.com


