
Features:
INTERVIEW WITH EYAL WALDMAN, CEO MELLANOX TECHNOLOGIES
by Alan Beck, Editor-in-Chief
InfiniBand is making news in the HPC market. Impressive InfiniBand performance
results have been released and a new high port count InfiniBand switch was
announced. HPCwire interviewed Eyal Waldman, CEO of Mellanox Technologies,
Ltd. to discuss his plans for InfiniBand in the HPC market.
HPCwire: Eyal, Mellanox is fairly new to the HPCwire readers. Why don't we
start with you telling us a little about Mellanox and yourself?
Waldman: Mellanox was founded in 1999 with the express purpose of developing
InfiniBand silicon. The technology did not have an official name at the time.
With the hard work of many top companies in the server industry, we published
the 1.0 specification and named the technology in 2000. Mellanox has led the
market since then with our silicon solutions and now offers 2nd generation
10Gb/sec switch and HCA (Host Channel Adapter) silicon in the market. Our R&D
team is located in Israel and our Business team is based in Santa Clara.
Mellanox has an exceptional engineering and management team. Mellanox
employs some of the best engineering talent available and our managers know
how to lead and how to ship a product that meets the needs of the customer. To
date, Mellanox has delivered three silicon devices, each of which worked the
first time, and we have many excellent products based on these devices. Before
starting Mellanox, I co-founded Galileo Technology and ran engineering there.
Prior to that I was at Intel as a microprocessor and cache architect as well
as an engineering manager.
HPCwire: Just what is InfiniBand?
Waldman: In short, the InfiniBand architecture is a low-latency, high-speed
channel interconnect designed to carry clustering, communication and storage
traffic all over a single wire. It is an industry standard that was developed
by all the major server and storage companies including Dell, EMC, IBM, Intel,
HP (and Compaq), NetApp, Microsoft, Sun and many others. The specification
was released three years ago by the IBTA (InfiniBand Trade Association) and
today there are many 10Gb/sec products in the market. Also, the IBTA has
already defined 30Gb/sec as the next step in bandwidth. Together, InfiniBand
and industry standard servers offer near limitless compute power, whether for
the HPC market or enterprise
applications. Many technology white papers on the specification and its
applications are available on our web site, http://www.mellanox.com.
HPCwire: What benefits do InfiniBand and Mellanox provide for the HPC
market?
Waldman: Mellanox offers many benefits for the HPC market. First, InfiniBand
is an open standard. Standards provide many positive attributes for any
market. InfiniBand offers a range of vendor choice, economies of scale,
competition, multiple market solutions and a faster pace of innovation.
Second, the specification defines the technical characteristics that HPC
desires. These include low latencies, high bandwidth of 10 Gb/sec TODAY with
30 Gb/sec in the future, RDMA and near limitless scalability. In addition,
InfiniBand provides a rich, open protocol environment that opens the door to
integrating fast storage solutions and real-time traffic protocols to
clustering.
Third, InfiniBand is available at a low cost. The cost of clustered computing
will drop thanks to the combined engineering efforts of an industry standard,
increased competition and range of choice. Fourth, InfiniBand offers record
MPI and application performance. These assertions are proven by the record
performance delivered today over MPI and HPC applications.
Finally, Mellanox is delivering all the interconnect solution elements that
HPC OEMs or VARs need to deliver this high performance.
HPCwire: Doesn't InfiniBand replace PCI? And aren't PCI-Express and InfiniBand
competitors?
Waldman: Some misconceptions exist that InfiniBand will replace PCI; this
isn't the case. InfiniBand is designed to be a system-to-system or box-to-box
interconnect that enables low latency RDMA (remote direct memory access)
system or storage communication. PCI is designed as a local chip-to-chip or
chip-to-upgrade card interconnect that resides inside a system. All of the
InfiniBand HCAs in the market today use PCI-X as the system interface.
Mellanox is developing 3rd generation HCAs that will connect to servers via
either PCI-X DDR or PCI-Express. Both of these HCAs will deliver improved HPC
bandwidth at lower latencies.
HPCwire: I've heard InfiniBand is an enterprise data center technology. What
is Mellanox's commitment to the HPC market?
Waldman: InfiniBand has been designed by the IBTA as a highly flexible and
robust architecture able to serve many markets. InfiniBand attributes can be
exploited by servers, storage, communications, HPC, embedded,
telecommunications, back planes, blades and other applications. First and
foremost, InfiniBand is a clustering technology with HPC leading the
technology innovation. Major server OEMs initially participated in developing
the specification to replace the internal proprietary fabrics used to cluster
their proprietary high-end server and mainframe designs. Therefore, the design
robustness of the IBTA specification for RDMA clustering can be counted on.
Mellanox's two focused priorities are HPC and enterprise data base clustering.
Even though we have design wins in all the applications I just mentioned,
these two markets receive the majority of the company's resources and will
continue to get them in the foreseeable future. HPC is where our latest state-
of-the-art products can be introduced, deployed and utilized quickly. HPC is a
market that Mellanox takes seriously and it will be a permanent focus for us.
HPCwire: You mentioned record performance. Can you back it up?
Waldman: Certainly, InfiniHost's user-level latency performance is tremendous,
currently in the 5-microsecond range. MPI bandwidth is now over 850 MB/sec and
MPI latency is superb at less than 7 microseconds.
Ohio State University and others have published a number of MPI performance
comparisons. In almost every bandwidth- or latency-intensive benchmark we have run,
InfiniBand demonstrated the best performance. In the OSU tests, our
InfiniHost latency was lower than Myrinet at every message size. At larger
message sizes we have about 1/3 the latency thanks to our 10Gb/sec link
capabilities.
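The link between bandwidth and large-message latency can be illustrated with a first-order model in which message time is startup latency plus size divided by bandwidth. The sketch below is an assumption-laden illustration, not a reproduction of the OSU measurements: the 7 microsecond and 850 MB/sec figures are the ones quoted in this interview, while the baseline interconnect is hypothetical (same startup latency, one third the bandwidth) and the function names are invented for the example.

```python
# First-order model: time to deliver one message = startup latency + size / bandwidth.
# The 7 us and 850 MB/sec figures are quoted in the interview; the baseline
# interconnect is hypothetical and does not describe any measured product.


def transfer_time_us(size_bytes, latency_us, bandwidth_mb_per_s):
    """Estimated one-way message time in microseconds."""
    # 1 MB/sec equals 1 byte per microsecond when MB means 10**6 bytes.
    return latency_us + size_bytes / bandwidth_mb_per_s


def time_ratio(size_bytes, fast, slow):
    """Ratio of fast-link time to slow-link time for one message size."""
    return transfer_time_us(size_bytes, *fast) / transfer_time_us(size_bytes, *slow)


INFINIBAND = (7.0, 850.0)      # (latency in us, bandwidth in MB/sec), from the interview
BASELINE = (7.0, 850.0 / 3.0)  # hypothetical: same latency, one third the bandwidth

if __name__ == "__main__":
    for size in (64, 4096, 65536, 1048576):
        ratio = time_ratio(size, INFINIBAND, BASELINE)
        print(f"{size:>8} bytes: time ratio {ratio:.2f}")
```

Under these assumptions the two links are nearly indistinguishable for small messages, where startup latency dominates, and the ratio approaches one third as messages grow and the bandwidth term takes over.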
Actual application performance is determined by a number of different factors
including bandwidth, latency, CPU utilization, hardware parallelism and
interrupt response time. Different applications depend more or less on each of
these individual factors and ultimately application level benchmarking is
critical to get a real measure of HPC interconnect performance. InfiniBand
performance shines here. For example, on the NAS IS parallel benchmark
InfiniBand outperforms the proprietary clustering technologies.
Mellanox is very pleased with these results and very optimistic they will get
better. We showed our initial results last November at SC2002. At that time we
had bandwidth of about 750 MB/sec and just over 10 microsecond MPI latencies.
Huge improvements have been made in the last 6 months. We'll continue to advance
these results with both improved software and hardware products. Many
performance references are available on our web site; please look at our HPC
web page at http://www.mellanox.com.
HPCwire: Great, but does the performance scale?
Waldman: Absolutely! We are seeing great results with mid-sized clusters with
dozens of nodes. While these results are not yet public, you can expect to
see them in the next few weeks.
HPCwire: What products does Mellanox offer for the market?
Waldman: Mellanox is a complete HPC interconnect solutions company offering
products through OEMs or VAR channels. Silicon products are the core of the
company, available to OEM partners who build their own designs. HCA cards,
switches and software that can enable a complete clustered HPC interconnect
solution are also available. A complete solution can be deployed through OEMs
or VARs that provide all the hardware, integration services and support.
Our HCA is called InfiniHost, which is a dual port 10Gb/sec PCI-X device also
offered on a PCI-X add-in card. This second generation device has an
architecture designed to lead in performance and throughput at the lowest
latency.
Our switch is called InfiniScale. It has eight full wire speed, non-blocking
10Gb/sec ports. Mellanox has created a high port count 96-Port InfiniScale
switch from this device. This high port count modular InfiniBand switch
platform is designed for the flexible deployment of high performance computing
clusters. One chassis can support from 12 to 96 InfiniBand ports in a CBB fat
tree topology with extremely low latencies. We also offer 8 and 16 port CBB
switch designs. Very large two stage CBB clusters can be created from these
switches.
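To give a sense of how large a two stage CBB (constant bisectional bandwidth) cluster can get, the sketch below applies the textbook fat-tree sizing rule: each leaf switch devotes half its ports to nodes and half to uplinks, with one link from every leaf to every spine. This is a general topology calculation under that assumption, not a description of Mellanox's actual chassis wiring, and the function name is hypothetical.

```python
def two_stage_cbb(radix):
    """Size a two-stage constant-bisectional-bandwidth fat tree.

    Assumes identical switches with `radix` ports each, where every leaf
    splits its ports evenly between nodes and spine uplinks.
    """
    if radix < 2 or radix % 2:
        raise ValueError("radix must be an even number of ports, at least 2")
    down = radix // 2    # node-facing ports per leaf switch
    leaves = radix       # each spine port feeds exactly one leaf
    spines = radix // 2  # one spine per leaf uplink
    return {"leaves": leaves, "spines": spines, "node_ports": leaves * down}


if __name__ == "__main__":
    # Switch sizes mentioned in the interview: 8, 16 and 96 ports.
    for radix in (8, 16, 96):
        fabric = two_stage_cbb(radix)
        print(f"{radix:>2}-port switches: {fabric['node_ports']} node ports "
              f"({fabric['leaves']} leaves, {fabric['spines']} spines)")
```

The node count grows with the square of the switch port count, which is why a higher port count building block matters so much for large clusters.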
Mellanox has demonstrated two InfiniBand Server Blade designs. These designs
come complete with dual redundant switches offering 10Gb/sec InfiniBand
communication between all the server blades. The designs we have shown enable
a 12-node HPC cluster in just 5U, or up to 60 nodes in just 25U and it is
accomplished without an external switch. InfiniBand is the ideal interconnect
for server blades and the best way to maximize many positive attributes of new
modular designs, including diskless booting.
HPCwire: What about MPI support?
Waldman: We have experienced great MPI support from many major providers.
Professor DK Panda and his OSU team, with funding and support from Intel, have
been great at improving their open source software for InfiniBand. They have
supported InfiniBand semantics and our silicon in releases since last October.
They have also published many performance comparisons and a number of papers
on their work. Their MVAPICH (MPI for InfiniBand on VAPI Layer) open source
software is available today.
(See http://nowlab.cis.ohio-state.edu/projects/mpi-iba/index.html.)
Last November MPI SoftTech Technology announced their support of InfiniBand
and in February released an InfiniBand enabled version of MPI/PRO. This robust
commercially hardened version of MPI has bandwidth of over 800 MB/sec with
excellent latencies and a low CPU overhead mode. MPI solutions are also
available from Scali and NCSA.
HPCwire: The InfiniBand message sounds very positive: great latencies with 3X
the bandwidth. So isn't it much more expensive?
Waldman: InfiniBand is not expensive. Standards-based technologies provide the
opportunity to lower costs through the economies of combined engineering
efforts and scale. Today our partners have quoted $995 for an HCA card and
less than $600 per 10Gb/sec switch port. We are very cost competitive with
current proprietary interconnects while offering next generation technology
that scales into the future. I also expect that the combined InfiniBand
volume across all markets will drive lower costs for the HPC market.
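The quoted prices allow a rough per-cluster estimate. The sketch below is only a ceiling built from the two figures given in the answer ($995 per HCA card, less than $600 per switch port); cables, software and integration are not included, and the `switch_ports_per_node` parameter is an assumption added for illustration (1 for a single switch, more for a multi-stage fabric).

```python
HCA_PRICE_USD = 995          # per HCA card, quoted in the interview
SWITCH_PORT_PRICE_USD = 600  # interview says "less than $600", so this is a ceiling


def interconnect_cost_ceiling(nodes, switch_ports_per_node=1):
    """Upper bound on HCA plus switch-port cost for a cluster, in US dollars.

    switch_ports_per_node is an illustrative assumption: 1 when every node
    plugs into a single switch, higher when traffic crosses several stages.
    """
    if nodes < 0 or switch_ports_per_node < 0:
        raise ValueError("counts must be non-negative")
    per_node = HCA_PRICE_USD + switch_ports_per_node * SWITCH_PORT_PRICE_USD
    return nodes * per_node


if __name__ == "__main__":
    for nodes in (12, 60, 96):
        print(f"{nodes:>3} nodes: at most ${interconnect_cost_ceiling(nodes):,}")
```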
HPCwire: What can you tell us about your future product plans?
Waldman: Of the many significant disclosures made on our products the most
important is our InfiniHost HCA plans. To date, our dual 10Gb/sec device and
PCI-X board is the basis for all our results. These dual ports are capable of
supporting a total of 40 Gb/sec of bi-directional bandwidth. Even though we
are delivering near 3X the bandwidth of the proprietary solutions, we are not
wire speed limited; InfiniBand needs faster local buses within the server. Our
third generation core will have the capability to scale well beyond the 850
MB/sec of MPI bandwidth being delivered today. To enable this, we have
announced that Mellanox will produce 3rd generation PCI-Express and PCI-X 2.0
HCA devices that should achieve close to 2GB/sec of delivered MPI bandwidth and do
it with even lower latencies.
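The arithmetic behind the local bus bottleneck can be sketched as follows. The 10 Gb/sec port rate, the dual ports and the 850 MB/sec MPI figure come from the interview; two details are assumptions added here: that the 10 Gb/sec rate is a signalling rate carrying 8b/10b encoded data, and that the host slot is a 64-bit PCI-X bus at a nominal 133 MHz. The function names are hypothetical.

```python
LINK_SIGNAL_GBIT = 10.0  # per port, per direction (from the interview)
PORTS = 2                # dual-port HCA (from the interview)
DIRECTIONS = 2           # each link carries traffic both ways
MPI_MBYTE_PER_S = 850.0  # delivered MPI bandwidth quoted in the interview


def aggregate_signal_gbit():
    """Total signalling rate across both ports and both directions."""
    return LINK_SIGNAL_GBIT * PORTS * DIRECTIONS


def link_data_mbyte_per_s():
    """Payload rate of one port in one direction.

    Assumption: 8b/10b encoding, so 8 of every 10 signalled bits carry data.
    """
    data_gbit = LINK_SIGNAL_GBIT * 8 / 10
    return data_gbit * 1000 / 8  # Gbit/sec to MB/sec (MB = 10**6 bytes)


def pcix_peak_mbyte_per_s(width_bits=64, clock_mhz=133):
    """Theoretical peak of a PCI-X bus (assumed 64-bit, nominal 133 MHz)."""
    return width_bits * clock_mhz / 8


if __name__ == "__main__":
    print(f"Aggregate signalling rate: {aggregate_signal_gbit():.0f} Gb/sec")
    print(f"One port, one direction:   {link_data_mbyte_per_s():.0f} MB/sec")
    print(f"PCI-X theoretical peak:    {pcix_peak_mbyte_per_s():.0f} MB/sec")
    share = MPI_MBYTE_PER_S / pcix_peak_mbyte_per_s()
    print(f"MPI bandwidth as a share of the bus peak: {share:.0%}")
```

Under these assumptions a single port in a single direction already roughly matches the theoretical peak of the bus, which is consistent with the point that faster local buses, rather than faster links, are what the HCA needs next.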
Mellanox has an additional third generation switch device under design with
disclosure of details pending.
HPCwire: What goals does Mellanox have in this market?
Waldman: Our first goal is to offer HPC customers a complete InfiniBand
solution for clustering with multiple sources to obtain these solutions. We
will continue to push HPC performance to new levels with each new release of
our products. And, of course, we want to generate revenue in the process.
HPCwire: Where can interested parties buy InfiniBand solutions?
Waldman: Mellanox offers silicon and InfiniBand products through both OEMs and
VARs giving many options for purchase of our products. If interested in
finding a Mellanox solution, email Dave Sheffler at daves@mellanox.com or
contact your preferred HPC OEM or VAR.
HPCwire: Anything else?
Waldman: Mellanox has shown a lot of progress since SC2002 and we plan
accelerated progress over the next year. I fully expect within a year many of
the Top 500 supercomputers will be based on InfiniBand technology. I encourage
your readers to follow upcoming advancements of the InfiniBand community that
will compel the establishment of InfiniBand as the preferred interconnect
standard in HPCC.