
DATA MINING, MODELING, & PREDICTIVE-SERVICES: ORGANIZATIONAL CONTEXT
by John Thompson, Vice President - Marketing, Magnify, Inc.


The previous two articles have highlighted how breaking down the functionality of traditional data mining systems can benefit firms that are interested in augmenting their existing decision making systems. This article details the organizational and human resources framework in which a decomposed and distributed data mining system would operate.

Let's review for a moment. The data mining algorithms and data access elements of traditional data mining systems are contained in a component referred to as the data-mining engine. The engine is optimized for accessing and analyzing large amounts of data and is designed to formulate and deliver the newly derived knowledge to analysts. This function represents the "finding the gold nugget" metaphor that is so often associated with data mining. Because of the volume of data and depth of its searches, the engine runs only on high performance platforms as it interrogates raw, semi-refined and refined information. Mr. Wm. Inmon has described how this type of data is loaded and managed within the exploration warehouse in his previous DS* articles.

The second main function of the data-mining engine is to provide a robust platform upon which to build either predictive models or entire applications.

Many vendors who, just a few months ago, were known as data mining vendors are now building applications on top of data-mining algorithms. In so doing, these vendors are now being referred to as application and solutions providers in the various vertical markets that they have chosen. You, as a buyer of a data mining system, could choose to bring in such an application and solution provider, and their data-mining engine would underpin the application construction efforts. Another option would be to buy the engine standalone and use the expertise and talent of in-house staff, such as a modeling group, to build predictive models or applications on top of the data-mining engine.

The personnel involved in making the data-mining engine function are typically the same people that are involved in either the construction of the data warehouse or in the development and operation of advanced analytical systems. Data derived from the data mining engine is generally reviewed and distributed by those directly involved with data warehouse access. These would be people such as the data access architect I spoke with recently at the DCI Database World Conference in San Francisco, or the outside consultant I met with yesterday who focuses on how to build quality data warehouses.

There are two primary internal organizations that use the data-mining engine. The first group is Information Systems (IS), which is concerned with providing the data that will be accessed and analyzed. The second group is the modeling group, which is responsible for building predictive models and/or whole applications, or for maintaining an application purchased from an outside vendor. The modeling group typically reports to the line-of-business management structure that has sponsored the construction of the system.

Up until now, this has been the extent of the human resource impact for data mining. The organizational impact has been directly tied to the functionality and operation of the data-mining engine. However, a new area of impact is emerging and it is embodied in the functionality that lies outside the data-mining engine -- the execution environment for the application or predictive models. This functionality is built into something called the predictive-services engine. As mentioned in previous articles, this is the component that lends ubiquity and invisibility to the technology of data mining.

The predictive-services engine is built to be installed into existing production flows, and it requires implementation either by vendor staff or by the client's technical personnel (typically IS). The implementation centers on installing the predictive-services engine, setting up the input and output mappings, and testing that the engine itself is benign within the existing production environment. Without a predictive model or application installed, the predictive-services engine is simply a pass-through environment that has no processing capability or intelligence. In that sense there really are no new users of the predictive-services engine other than IS staff, who are likely already involved with the data-mining engine.
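
To make the pass-through idea concrete, here is a minimal sketch in Python. It is an illustration only, not any vendor's actual product: the class name PredictiveServicesEngine, its methods, and the field names are all hypothetical. It shows an engine that returns records unchanged until a model is installed, and that uses an input mapping and an output field once one is.

```python
from typing import Callable, Dict, Optional

Record = Dict[str, object]


class PredictiveServicesEngine:
    """Hypothetical engine sitting inside an existing production flow."""

    def __init__(self, input_map: Dict[str, str], output_field: str) -> None:
        # input_map: production field name -> model input name (assumed layout).
        self.input_map = input_map
        # output_field: name of the new data element added to each record.
        self.output_field = output_field
        self.model: Optional[Callable[[Record], object]] = None

    def install(self, model: Callable[[Record], object]) -> None:
        """Install a predictive model built elsewhere."""
        self.model = model

    def process(self, record: Record) -> Record:
        """Return a copy of the record, enriched only if a model is installed."""
        result = dict(record)
        if self.model is None:
            # No model installed: pure pass-through, nothing is added.
            return result
        # Translate production field names into the names the model expects.
        inputs = {
            model_name: record[prod_name]
            for prod_name, model_name in self.input_map.items()
        }
        result[self.output_field] = self.model(inputs)
        return result
```

With no model installed, process() hands back the same data it was given, which is the "benign" behavior that IS staff would test for before any model is brought in.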

What happens next is what brings in a part of the organization that has not previously been impacted by data mining. A predictive model or complete application is exported from the data-mining engine and put into a lightweight format that enables its portability. The model or application is taken to the physical location(s) where the predictive-services engine has been installed and the model and/or application is implemented into the predictive-services engine. Thus, we've moved the intelligence derived from the data-mining engine out to production areas.
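
The export step can be sketched in the same hedged way. The article does not say what the lightweight format is, so the example below assumes JSON and a very simple weighted-score model; the function names export_model and load_model, the field names, and the "approve"/"refer" labels are all hypothetical.

```python
import json
from typing import Callable, Dict


def export_model(weights: Dict[str, float], intercept: float,
                 threshold: float) -> str:
    """Serialize a simple scoring model into a small, portable text payload."""
    spec = {
        "type": "linear_score",  # assumed model type, for illustration only
        "weights": weights,
        "intercept": intercept,
        "threshold": threshold,
    }
    return json.dumps(spec, sort_keys=True)


def load_model(payload: str) -> Callable[[Dict[str, float]], str]:
    """Rebuild a callable model from the payload at the production site."""
    spec = json.loads(payload)
    if spec.get("type") != "linear_score":
        raise ValueError("unsupported model type")
    weights = spec["weights"]
    intercept = spec["intercept"]
    threshold = spec["threshold"]

    def score(inputs: Dict[str, float]) -> str:
        # Weighted sum of the inputs, compared against the stored threshold.
        total = intercept + sum(
            weight * float(inputs[name]) for name, weight in weights.items()
        )
        return "refer" if total >= threshold else "approve"

    return score
```

The point of the sketch is that the payload carries only what is needed to score a record. The large data stores and the mining algorithms stay behind with the data-mining engine.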

The IS personnel or the outside vendor personnel transport and install the application or predictive models into the predictive-services engine, and test its operation. The information or derived knowledge produced by the contents of the predictive-services engine is then passed into a system that serves a line of business department such as credit risk, database marketing, or point of sale station. Derived knowledge is now presented to the end users, in context, with the information that they already use in their day to day operations. These users simply need to be trained on the value and appropriate interpretation of the new data element(s) appearing on their screen.

What has happened? We have utilized the personnel, skills, and resources of the IS and modeling organizations to build a mechanism that distributes derived knowledge, based on expansive data stores containing transaction history and other information, directly to the personnel who make decisions on how to deal with customers and partners. Real-time evaluation of requests based on historical data is now a reality. End users are impacted by data mining, even though they might not realize it.

With this type of distributed data mining/predictive services environment, clients can have the much-discussed but rarely achieved "closed loop decision support system". Organizations can continue to leverage the investments that have been made in personnel and data warehouse resources. They can also extend their strategic and production-level decision making systems to extract the maximum value from the data that they currently possess.

No new departments, no new technologies - just an extension of existing systems for better management of the business that you have chosen to be in. This allows more of the organization to utilize data mining without requiring any major business or process alterations.

Frankly, I had started out to write an article on the technology that links the data-mining and predictive-services engines together, but I kept hearing from people that they didn't understand the people implications as these systems are built, used, and maintained. In response to those requests, I changed course. Hopefully, I've added some clarity about who can use data mining. I'll try to write the article on the linking technology in the next issue.

---

John Thompson, Vice President - Marketing, Magnify, Inc. You can reach me at jkt@magnify.com

