The global publication of record for High Performance Computing / July 9, 2004: Vol. 13, No. 27
Features:

LETTER TO THE EDITOR: BENCHMARKS ARE LIKE PLACEBOS

Willi Schoenauer, professor at the Rechenzentrum der Universitaet Karlsruhe in Germany, responds to a recent article from the anonymous High-End Crusader concerning benchmarks and their place in HPC. [107896, http://www.tgc.com/hpcwire/hpcwireWWW/04/0625/107896.html]

Dear High-End Crusader,

I agree with all your arguments concerning benchmarks. However, I think benchmarks are like placebos: they have an effect only if you believe in them. The reason I do not believe in any benchmark whatsoever is that a supercomputer is far too complicated a system to be evaluated by a benchmark or a series of benchmarks.

So what else can one do? The placebo I believe in is pure kernel measurement. You should by now have received my lecture notes "Scientific supercomputing: Architecture and Use of Shared and Distributed Memory Parallel Computers" that I sent to you (readers who do not know them should enter 'scientific supercomputing architecture use' in Google). There you will find examples of how to evaluate a supercomputer by kernel measurements: addition, linked triad, and vector triad (the most important operation for engineering applications), for different vector lengths, with different strides, and with different kinds of indirect addressing. These measurements must be combined with an estimate of the maximal expected performance imposed by the cache or memory bottleneck. The scalar performance should be measured as well. For communication, simple and double ping-pong measurements on an SMP node and between nodes of a parallel computer, for different message sizes, give the startup time and the transfer rate. It is essential to measure the overlap factor, which shows whether latency hiding for communication is possible.

These kernel measurements, if appropriately analyzed, give you insight into the possibilities of the supercomputer. If, for example, you get an architectural efficiency of 3.2% for the vector triad with data from memory on the Power4 processor, this means that you get only 3.2% of the theoretical peak and, what is much more important, that you lose 96.8% of the possible performance of your processor; put differently, the floating-point units wait for data 96.8% of the time. For comparison, the Power2 Wide 77 achieved 15% architectural efficiency in the same case, which shows the "progress". Or if the transfer rate for double ping-pong is half of that for simple ping-pong, you have only unidirectional communication although the manufacturer offers bidirectional. Or if the overlap factor is zero, you cannot hide communication behind computation.

All these properties are surely reflected in a benchmark, but the ultimate causes of the benchmark's behavior are not visible. What the user finally gets out of his computer unfortunately depends largely on the quality of his programmer. From a detailed analysis of the kernel measurements the programmer gets hints for the optimal design of his program, hints that he gets only in a very restricted way from a benchmark.

So what we need is not a 'superbenchmark' but a clear prescription of HOW to evaluate a supercomputer from kernel measurements. That is why my placebo is kernel measurements, and why I believe in this placebo.

Best regards,
Willi

Willi Schoenauer
Rechenzentrum der Universitaet Karlsruhe
D-76128
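The bookkeeping behind the kernel analysis described in the letter can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the procedure from the lecture notes. The vector triad is taken in its usual form a[i] = b[i] + c[i] * d[i]. Architectural efficiency is taken as measured rate divided by theoretical peak, as the letter's 3.2% example implies. The ping-pong model time = startup + size / rate is a common linear assumption. All function names and the tolerance parameter are hypothetical. An interpreted language does not measure hardware performance; a real measurement would use compiled code and, for communication, a message-passing library.

```python
import time


def vector_triad(b, c, d):
    """Vector triad a[i] = b[i] + c[i] * d[i] (two flops per element)."""
    return [bi + ci * di for bi, ci, di in zip(b, c, d)]


def measure_mflops(n, repeats=5):
    """Time the triad for vector length n; return the best MFLOP/s observed."""
    b, c, d = [1.0] * n, [2.0] * n, [3.0] * n
    best = float("inf")
    for _ in range(repeats):
        start = time.perf_counter()
        vector_triad(b, c, d)
        best = min(best, time.perf_counter() - start)
    # Guard against a zero interval on timers with coarse resolution.
    return 2.0 * n / max(best, 1e-12) / 1e6


def architectural_efficiency(measured_rate, peak_rate):
    """Fraction of the theoretical peak actually reached (same units)."""
    return measured_rate / peak_rate


def fit_ping_pong(sizes, times):
    """Least-squares fit of time = startup + size / rate.

    Returns (startup time, transfer rate) in the units of the inputs.
    """
    n = len(sizes)
    mean_x = sum(sizes) / n
    mean_y = sum(times) / n
    sxx = sum((x - mean_x) ** 2 for x in sizes)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(sizes, times))
    slope = sxy / sxx            # seconds per byte
    startup = mean_y - slope * mean_x
    return startup, 1.0 / slope


def looks_unidirectional(simple_rate, double_rate, tolerance=0.1):
    """True if the double ping-pong rate is about half the simple rate."""
    return abs(double_rate / simple_rate - 0.5) <= tolerance
```

With made-up numbers, a measured 166.4 MFLOP/s against a peak of 5200 MFLOP/s gives `architectural_efficiency(166.4, 5200.0)` of about 0.032, the 3.2% case in the letter. Running `measure_mflops` over a range of vector lengths would show where the rate drops as the data falls out of cache, which is the kind of cause a single benchmark figure hides.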