Leading Edge R&D:
IBM SPILLS BEANS ON 'XPERANTO' INITIATIVE
by Tom Sullivan and Ed Scannell
XML has been causing quite a splash in the database world, particularly in the
last few weeks, and IBM is the latest vendor to detail plans for the
standard.
In IBM's research labs, the company is working on a project, code-named
"Xperanto," which is to be a native XML database that acts as a subset of DB2,
said Janet Perna, general manager of IBM's data management solutions group
based in Armonk, New York. By using XML and relying on the XML query language
XQuery, Xperanto should be a critical piece of IBM's long-term vision to marry
structured and unstructured data.
"The value of this is it's the next step beyond a federated database," Perna
said.
That step, Perna added, is information integration. IBM has application
integration via its WebSphere products and business process integration from
its recent CrossWorlds acquisition; Xperanto acts as a dedicated server for
data or information integration. "We have a new class of software that really
is about information integration," Perna said.
Nelson Mattos, a distinguished engineer and director of information
integration at IBM's Silicon Valley Labs, said that the customer pain point
Xperanto is aimed at is how to tie together all the systems in an
organization.
"Xperanto is the technology that allows customers to integrate all their
systems," Mattos said.
Mattos said that Xperanto will be the materialization of IBM's work on a
number of Web services-related standards, including XQuery, XML Schema, UDDI
(Universal Description, Discovery, and Integration), SOAP (Simple Object
Access Protocol), WSDL (Web Services Description Language) and WSFL (Web
Services Flow Language).
The end goal of IBM's integration strategy is to be able to combine structured
and unstructured data, thereby enabling access to a broader array of data sets
within an organization, such as Office files. So organizations would be able
to access the content in the Word files that reside on individual employees'
desktop systems.
Both Microsoft and Oracle said they are working to enhance XML support in
their databases and are pursuing the same goal of giving users more insight
into all of the intelligence within an organization.
"The ability to search against XML data is going to be key," said Peter Urban,
an analyst at AMR Research in Boston.
Useful information also can be found in historically unorthodox data mining
locales. Perna pointed to audio files with recorded conversations between
customer service representatives and prospective customers as an example of
data sources that potentially can be mined to glean nuggets of gold.
"The XML approach will provide the lingua franca for getting at various types
of data; it is providing a sort of structure for unstructured data," said
Henry Morris, an analyst at IDC in Framingham, Massachusetts. "The question is
how much of this unstructured data is going to be in XML. It will be a small
part relative to the total amount of unstructured data that is in a company."
Within IBM's strategy, DB2 handles structured data, OLTP (Online Transaction
Processing), BI (business intelligence), and Web applications, while the
Content Manager software takes care of unstructured information, such as rich
media and flat files.
Perna said that the widespread adoption of XML has made the idea of combining
structured and unstructured data come alive.
"Will the native XML database support replace the relational database? The
answer is no," Perna said. She added, however, that XML will work for certain
applications. "Think of this as an application of the database," she said.
IBM plans to post a pre-beta version of Xperanto technology to its developer
Web site in the first half of next year, and the technology will be finalized
toward the end of next year.
Perna said that IBM has yet to decide how Xperanto will be sold, but the
options Big Blue is considering include offering it as a standalone product or
folding it into either WebSphere or DB2.