[ PREVIOUS ARTICLE | Table of Contents | NEXT ARTICLE ]

DATA WAREHOUSING SYSTEM ADMINISTRATION, A BUSINESS-CRITICAL SYSTEM, PART II
by Nigel Chang


RDBMS Expertise

Unless you are a die-hard multidimensional database (MDD) fan, you can only choose to hug RDBMS. As I have mentioned before, data is everything. Your systems administrator is just one of its guardian angels, and in order to guard the data, a systems administrator must know the database management system. For the MIS department, DBMS is the only software people care about and is the only critical software running on the machine. Systems administrators ought to be trained in database systems in order to be able to communicate with DBAs and applications developers and in order to better serve their needs.

The MIS UNIX system administrator not only has to build the kernel, reboot the machine, and replace the disk, but also needs to know the data partitioning, parallel data loading, and hot backup. Furthermore, the systems administrator must be able to tell the fundamental difference between OLTP and data warehouse so he/she can be more focused on Full Table Scan and Batch Jobs. The Very Large Data Warehouse (VLDW) includes both a data warehouse and OLTP. VLDW is completely different from Very Large OLTP.

Hardware Platforms

A frequently asked question is, "Which hardware platform is most reliable?" As an administrator of many different UNIX platforms over many years, I still don't have a quick answer. The answer could depend on the business environment and the application at various points in time.

Some MIS departments simply maintain a hate list of hardware platforms. If there are negative stories about any particular hardware platform, that hardware platform is registered on the hate list and has no chance of being selected. This approach assumes that the final selection has maintained a perfect image and has no complaints whatsoever in technical support, availability, maintenance, reliability, scalability, and performance. The final selection may turn out to be an unknown hardware platform, and such a machine will require training and test driving.

Another popular method in many MIS departments is to select one or two major hardware vendors as enterprise vendors. This approach will eliminate the need to evaluate and select hardware platforms for each and every MIS project. The drawback to this method is that, at times, the project cannot find the suitable hardware platform for its application.

Others play the benchmark game. They either buy the TPC-C/TPC-D benchmark results or conduct their own on-site benchmark tests. Some benchmark reports are available on the Web at http://www.ideasinternational.com/benchmark

The most troublesome and time-consuming method is to set up your own rules and invite all interested vendors to generate figures according to your rules. By regulating the queries and configuration based on your data warehouse data and business requirements, the vendors do not have the luxury of playing tricks. This method is not used often, but does guarantee the success of the data warehouse project, if time and budget permit.

As to the constant debate over SMP vs. MPP, many articles have been published on the Web in the past few years. It 's not difficult to find such discussions on Web sites, such as that of the Data Warehouse Information Center, and get a better idea as to which platform suits your data warehouse.

Operating Systems

Of course, there is more than one type of UNIX operating system (SVR4, OSF, BSD, etc.). SVR4 is by far the most commonly used type, but there are others that are also popular on the market. The compatibility between SVR4 and OSF is not 100%. Porting would be an issue, as one tries to share scripts between different UNIX systems.

The 64-bit UNIX operating system is yet to come. The data warehouse, like VLDB, is very sensitive about the limitations set by the 32-bit UNIX operating system. Things like file size, file system size, memory size, I/O bandwidth, and microprocessor speed are commonly capped by the 32-bit architecture. Some UNIX vendors have offered a 64-bit solution in recent years. Others are expected to catch up within the next six to 12 months.

Data warehouses love to take advantage of the 64-bit UNIX operating system because almost all limitations disappear. However, the 32-bit to 64-bit conversion will be one of the troublemakers to watch over the next couple of years. All of the data warehouse software, including RDBMS, OLAP Tools, and data extraction tools are expected to be buggy as they try to port from 32-bit to 64-bit.

When tuning data warehouse performance, UNIX kernel improvement is often the first step people want to take. It may take several rounds of trial and error to build a good kernel. Your RDBMS vendors should be able to provide kernel-tuning guidelines for every supported UNIX system.
---
Part III of this series will appear in the next edition of D S * .
---
For more information, see http://www.metaedge.com


[ PREVIOUS ARTICLE | Table of Contents | NEXT ARTICLE ]