DATA WAREHOUSING SYSTEM ADMINISTRATION,
A BUSINESS-CRITICAL SYSTEM, PART III
by Nigel Chang
RAID Devices
Data warehouses have more online data than any other system. They easily consume gigabytes to terabytes of disk space. Today, the only conventional technology to provide sufficient disk space with acceptable availability and performance is RAID. A data warehouse systems administrator must do his/her homework to understand just what RAID has to offer. Many hardware platform vendors, such as IBM and DEC, offer their own RAID solutions, while others, such as HP, team up with RAID-specialized vendors like EMC.
Here, we only want to list some basic knowledge that is required by any data warehouse systems administrator. I, personally, require that my systems administrator be able to explain RAID 0 (or striping), RAID 1 (or mirroring), RAID 0+1 (for write intensive data), RAID 5 (or parity disk). I also want explanations on the mean time between failure, dual path, hot spare, floating parity, hardware striping, OS striping, and database striping. He/she should know how to calculate I/O throughput and usable space, based on the vendor's data.
Tape Devices
Yes, data warehouses are business critical. We need the largest and fastest tape library and devices. Data warehouses requires end-to-end parallel processing. Backup and recovery is very important and can only be achieved with multiple quick tape drives.
The capacity of a tape library must be accounted for, based on the online data and retention requirements.
CPU and Memory Administration
Data Warehouses are both CPU and memory hungry. MPP architecture with parallel server DBMS has no problem in supplying both CPU and memory with its scaleable architecture.
SMP, on the other hand, is a different story. It is extremely unpleasant when you realize that there is nothing to tune unless more CPU or memory is installed. It is a problem that money can sometimes resolve. However, there may not be a physical slot in your SMP machine in which to install them. It is far better to spend the extra money to upgrade to the next generation model. By that time, your data warehouse may have grown to a larger scale and want a bigger machine.
Memory must be partitioned for both UNIX sorting and RDBMS shared memory. There should be a balanced configuration so neither side will be short of memory.
RAID Device Administration
A new data warehouse systems administrator who asks why the raw device is used, rather than the cooked device, should be sent to RDBMS class immediately. By the same token, a systems administrator must be equipped with a volume manager (such as HP LVM, or Veritas) knowledge and experience. Not even the best systems administrator will be able to handle hundreds of disks without this type of software.
In addition to purchasing volume management software, the systems administrator must be able to program or script to automate the disk or volume allocation and achieve load balancing, reduce I/O contention, and exercise RAID hardware evenly. If this is too much to ask, MetaEdge does offer a top-notch database management tool (i.e., DBA2GO) and consulting service in this area.
Some first-time data warehouse systems administrators want to configure the entire disk farm using RAID 5. Once this type of configuration is done and in production, it is very difficult to convert to others. As both experienced data warehouse systems administrators and database administrators, we do not appreciate this all-RAID 5 configuration. There are critical data, such as the Oracle data dictionary, rollback segments, and online redo logs, that need to be saved in dedicated mirrored disks.
The mirrored RAID 5 device is a possibility but is not favorable for two major reasons: it is more difficult to locate and has higher usage. Preconfigured and dedicated mirroring disks are easier to map. RAID 1 provides 50 per cent of usable space and RAID 5, given 1 parity disk for every 4 data disks, reduces 20 per cent of usable space. RAID 1 on top of RAID 5 will introduce a 60 per cent loss of disk space. This is not a good configuration.
Therefore, the systems administrator needs to call the DBA for requirements and count the number of mirrored disks before the RAID device installation and configuration. The systems administrator could then provides the RAID vendor with the proper cabling plan, RAID 1 or RAID 0+1 count, and disk grouping method.
Some RAID device vendors want the systems administrator to believe that their RAID can be run like a black box. This is absolutely a bloody trap. The system administrator wants every possible way to see through the box. He/she should be able to acquire RAID device status and statistics down to the lowest level at anytime. RAID monitoring with the phone home feature doesn't work, due to frequent modem failure. Internet browser implementation is a much better idea.
In our past experience, black boxes have failed to phone home for several weeks and have downgraded disk performance and availability. A lot of data was solely written to and read from the parity disk. Luckily, it wasn't a long enough period to fail the parity disk and cause severe data loss.
RAID cache for the data warehouse application is critical. Before you buy into the vendor's "large cache in RAID box", there are some realities to think about. Although read-ahead is a good and commonly-used feature, it is not the complete solution. Dynamic cache allocation is a more important feature.
One of the major RAID device vendors offers up to 4 GB in each RAID box
which houses more than 100 disks. The cache was programmed for static
allocation. Three GB was left for all 100 disks when the directory used 1 GB
cache. Each disk is assigned with a 30-MB cache physically. This 30-MB cache
space is not shared between disks. When one queries full-table scan a huge
table on a 4-disk stripe set, there are only 120 MB (4 * 30 MB) cache
allocated to serve the query. The rest of the 2.88 GB cache is simply idle
during the query. Thus, before you select a RAID vendor, you had better make
sure your RAID vendor designed its RAID for data warehouse applications.
---
Part IV of this series will appear in the next edition of D S * .
---
For more information, see http://www.metaedge.com