
Features:
INTERVIEW WITH BILL BLAKE, SVP, PRODUCT DVPT., NETEZZA
by Alan Beck, Editor-in-Chief, HPCwire
HPCwire: Thus far, the race between COTS-based cluster supercomputers and
those based upon proprietary processors has resembled that of Achilles and the
tortoise: the clusters approach ever nearer but never quite succeed in
surpassing -- or indeed drawing even with -- their elite contenders. Will this
situation ever change? Why or why not? And given the speeds involved, will it
still matter?
BLAKE: Lessons learned from the development of proprietary processors have
tended to flow down to the industry standard parts, much like the lessons
learned on Formula 1 racing cars have influenced "commodity cars." And today's
crop of 64-bit industry standard processors is certainly catching up to where
proprietary processors such as Alpha were just a few years ago. But the key
factor may well be the economics, since innovation in areas such as new
approaches to on-chip parallelism in microprocessors can be a billion-dollar
proposition if the result is to significantly exceed the industry standard parts.
Does it matter? Yes: if the industry standard parts do not support the
dramatically higher memory bandwidth requirements of supercomputing, then it
matters a lot. Architectural approaches such as HyperTransport are very
important to opening up the memory system of the processor to high performance
system interconnects that are the lifeblood of scalable parallel systems. At
the system level, clusters will clearly dominate as all the hardware and
software building blocks are there to deliver significant application
performance with very good price/performance characteristics.
Linux clusters are the mainstream, and their elite contenders, as you call them,
are relegated to those specialized cases where highest capability is required.
As for capacity computing, the important load sharing software, be it from
Platform Computing Inc. or many home-grown varieties, is in place to support
excellent system utilization and is in use everywhere. The key enabling
technology for COTS-based cluster supercomputers running parallel,
compute-intensive applications has been tools like MPI for coarse-grained
message passing, plus a lot of work by parallel application developers to deal
with explicit parallelism in their codes.
HPCwire: What kinds of networking architectures will provide the principal
support for the simplified supercomputers of the future? Will such networks
ultimately prove as unwieldy, in their own way, as traditional HPC vector
processors? Why or why not?
BLAKE: Myricom and Quadrics are both setting the bar for all others to meet in
terms of bandwidth and latency. And I expect them to continue in that mode for
the foreseeable future, especially if they can continue to exploit new high
bandwidth memory interfaces such as HyperTransport. But there are a number of
new startups, such as Alacritech and Ammasso, that are trying to improve
bandwidth and lower the latency of the standard Ethernet stack, and if they are
successful then they will cannibalize the proprietary schemes. By mid to late
decade, the horse race at the high end will be between proprietary all-optical
switches in conventional topologies like the fat tree and highly optimized
mesh architectures built into the microprocessor itself (but without the
overhead of maintaining cache coherence in large meshes).
HPCwire: Will the new strategy of simplified computing truly provide HPC power
for general use -- or will security concerns eventually eclipse the enormous
potential that appears to lie just ahead?
BLAKE: I expect the grid forces to solve the security issues needed to make
simplified computing truly available for general use. For example, the success
Avaki has had with secure data grids for the pharmaceutical companies is very
encouraging.
HPCwire: Any other surprises at the system architectural level?
BLAKE: Absolutely. As cluster supercomputers mature, we are seeing significant
innovation at the system architectural level, especially in the specialization
of node function. New systems such as the Cray/Sandia Red Storm show an
interesting approach to both scalability and serviceability with cluster nodes
optimized for compute (very lightweight kernel) versus file system (full
Linux) versus service and maintenance. By solving the software challenges of
heterogeneous and asymmetric node configurations, system performance and
functionality will improve. At Netezza we are pursuing that path for the
hardware needed to support analytic terascale databases: we couple a front-end
SMP machine to a highly parallel database engine whose nodes are optimized for
database operations, with processing as close as possible to the disk where
the data resides.