THE ROLE OF TPC-D BENCHMARK RESULTS
IN SELECTING A SERVER FOR BUSINESS INTELLIGENCE: PART III
by Daniel Graham
THE TPC-D BENCHMARK "POWER"
Now for the metrics in TPC-D benchmarks. First on the list comes power, defined as the ability to give a single user the best response time. The trouble is that this is not the real world. It is a bit like comparing Formula One with rush-hour city driving. Nobody in the data mining and data warehousing industry runs only one user at a time. A lot of customers are caught out: they have been supporting several thousand users on a five or ten million dollar machine, then they put a data warehouse on it, only to discover that the first parallel user consumes 50% of the computing resources, leaving only 50% for everyone else. That is when the customer pages you in the shower and says: "Hey, we've got people hung up with ten-second problems taking 34 minutes. Do something!" This concurrency management problem is severe.
Well, let me briefly toot a horn for IBM here. We developed something called Workload Manager. The key to its success is that we put it at the query level, not the session level. The query comes in, Workload Manager assesses it and assigns the necessary machine time. Then it distributes the query to any of the processing nodes in a parallel machine. Nothing new, you say: we have been setting performance groups on 390s since time out of mind. What is different with Workload Manager is that it works like a set of hour-glasses set up side by side. It guarantees that a certain amount of workload will go through the fast query section, so the ten-second query people go away happy. Meanwhile, the user who wants to read every available database record and do every possible permutation on warehouse data moves to the bottom of the priority list. His query may take four hours, but he expected that anyway.
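To make the hour-glass picture concrete, here is a minimal sketch of query-level scheduling with guaranteed shares. It is an illustration only, not how Workload Manager is implemented: the class names, the share values, the cost thresholds and the QueryLevelScheduler class are all assumptions, and shares are counted in dispatch slots rather than in machine time.

```python
from collections import deque

# Hypothetical service classes and guaranteed shares of dispatch slots.
# These values are assumptions for illustration, not IBM's settings.
SHARES = {"fast": 0.7, "medium": 0.2, "long": 0.1}


def classify(estimated_seconds):
    """Assign a query to a class from its estimated run time (assumed thresholds)."""
    if estimated_seconds <= 10:
        return "fast"
    if estimated_seconds <= 600:
        return "medium"
    return "long"


class QueryLevelScheduler:
    """Each class is one 'hour-glass': its own queue and its own guaranteed share."""

    def __init__(self, shares=SHARES):
        self.shares = dict(shares)
        self.queues = {name: deque() for name in self.shares}
        self.served = {name: 0 for name in self.shares}

    def submit(self, query_id, estimated_seconds):
        # The decision is made per query, not per user session.
        cls = classify(estimated_seconds)
        self.queues[cls].append(query_id)
        return cls

    def next_query(self):
        # Among classes with waiting work, pick the one furthest below its share.
        waiting = [c for c in self.queues if self.queues[c]]
        if not waiting:
            return None
        total = sum(self.served.values()) + 1
        chosen = min(waiting, key=lambda c: self.served[c] / total - self.shares[c])
        self.served[chosen] += 1
        return self.queues[chosen].popleft()


# Usage: a four-hour scan arrives first, followed by ten short queries.
scheduler = QueryLevelScheduler()
scheduler.submit("scan", 4 * 3600)
for i in range(10):
    scheduler.submit(f"q{i}", 3)
order = [scheduler.next_query() for _ in range(11)]
print(order)
```

In this sketch the short queries are dispatched ahead of the scan even though the scan arrived first, and the scan still gets a turn before the short queue drains, because its class falls below its own small share. A real workload manager would also have to estimate query cost and meter actual resource consumption, which this sketch leaves out.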
THE TPC-D BENCHMARK "THROUGHPUT"
That brings us back to our meetings with prospective clients. Question Number One the vendor must ask is: how important is concurrent user management? That's when you explain to them that the TPC-D tests on power and throughput relate to Formula One, not rush hour. If you are really going to give your customers added value, you have to tell them: TPC-D gives great metrics, but they do not reflect multi-stream workloads. The TPC rules and regulations provide for multi-stream tests, but so far the end-users have not demanded them. Maybe they should, because when you put real stress on the machine in multi-stream tests, everybody will know more than they did before. Only then will potential customers looking to buy servers for complex tasks like data mining be able to make truly valid comparative judgments of the various offerings on a level playing field.
THE TPC-D BENCHMARK "PRICE/PERFORMANCE"
This takes us forwards (and backwards) to "price/performance" ratios. Everyone wants a fair price. And I don't think there is a vendor in the business who wants a customer to go away feeling beaten on price. But, as I said before, when you move into serious study of a client's requirements, price slips down the list a little. As a rule, it is not in a client's interest to spend several months shaving half a million dollars off the price of a suitable server if that unit is going to return its price many times over in profits.
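The arithmetic behind that point can be sketched in a few lines. Only the half-million-dollar discount comes from the text; the server price, the payback multiple, the service life and the length of the delay are hypothetical figures chosen for illustration, and the model assumes that profit simply starts later by the length of the negotiation.

```python
def net_gain_from_haggling(discount, months_of_delay, monthly_profit):
    """Discount won, minus the profit forgone while the purchase is delayed."""
    return discount - months_of_delay * monthly_profit


discount = 500_000          # from the text: half a million dollars
server_price = 5_000_000    # hypothetical purchase price
payback_multiple = 3        # hypothetical: server returns 3x its price
lifetime_months = 36        # hypothetical service life
months_of_delay = 3         # hypothetical: "several months" of negotiation

# Average monthly profit implied by the assumptions above.
monthly_profit = server_price * payback_multiple / lifetime_months

net = net_gain_from_haggling(discount, months_of_delay, monthly_profit)
print(f"Net effect of holding out for the discount: {net:,.0f} dollars")
```

Under these assumed figures the net effect is negative: the profit forgone during the delay outweighs the discount. With a lower payback multiple or a shorter delay the balance can tip the other way, which is why the client's own figures matter more than any rule of thumb.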
That pretty much takes us full circle. After working through a good many negotiations for data warehousing projects -- admittedly on the vendor's side of the table -- I have no hesitation in setting my own priorities. Skills come Number One. In that respect I would add another metric: a skills/performance ratio. If a purchaser lacks access to a ready supply of skills at a competitive cost to make things happen, it doesn't matter what purchase choice he makes. As a vendor, it is my job to see that the client is fully aware of the importance of skills/performance in his purchase decision. He must establish the degree of support that he can call on to back up that decision. And a responsible vendor has to help him do that. Only then can he secure the success of his business intelligence project in years to come.
---
For more information, see http://www.ibm.com/bi