
Features:
EXCLUSIVE - PITT SC DIRECTORS SPEAK OUT
by Alan Beck, Editor-in-Chief
Interview with Mike Levine and Ralph Roskies, co-scientific directors of the
Pittsburgh Supercomputing Center...
HPCwire: LeMieux, your 3,000-processor HP AlphaServerSC terascale system, has
been at Pittsburgh Supercomputing Center (PSC) since October 2001 and in full
production mode, through the NSF PACI allocation process, since April 2002. It
remains, as it was when installed, the most powerful NSF system and that also
makes it the most powerful U.S. system available to all science and
engineering disciplines for unclassified research. What are the noteworthy
scientific gains?
LEVINE: There have been excellent results across a range of application areas.
There's significant work in protein structure and dynamics. There's major work
in computational fluid dynamics of blood flow. In CFD, we've also enabled
simulations of new designs for power turbines and, related to that,
simulations of industrial-scale coal gasification. There's new work in solar
turbulence and magneto-convection. In other areas of physics, LeMieux has made
possible significant progress in quantum chromodynamics, also some
breakthrough work in photonics. There's also important work in cosmology and
major new findings in planet formation.
There's a lot going on, at many levels and with many approaches to using this
machine. LeMieux caught on rapidly. We regularly solicit feedback from
researchers, and a recurring theme has been "We couldn't have done these
computations without this machine."
HPCwire: Could you elaborate with a few examples?
ROSKIES: One of the earliest notable results came from Klaus Schulten and his
colleagues at the University of Illinois. They simulated a protein called
aquaporin, an important channel protein that regulates the passage of water
into and out of cells. The simulations solved a mystery that experiments
couldn't resolve.
These were very detailed simulations, more than 100,000 atoms, and they gave a
precise picture of how the water molecules line up and move through this
channel. What they showed - which you couldn't see in experiments - is that
each water molecule does a 180 degree flip-flop at the midstream point of this
channel. Because of that flip-flop, protons can't come through, only the
water.
Free passage of protons through this channel, which is ubiquitous in many
organisms, would totally disrupt the electrochemistry that drives metabolism.
Until these simulations, no one knew how the channel managed to keep the
protons out while letting water through. It was perplexing. Because he had
access to LeMieux, Schulten was able to get the answer and publish the results
in SCIENCE.
Rita Colwell [director of NSF] highlighted this work, by the way, in her
keynote address at SC2002 in Baltimore.
Schulten's team was quick to take advantage of LeMieux because they developed
a molecular dynamics package, called NAMD, that's well adapted to large-scale
parallel systems. Using all 750 of LeMieux's nodes [each node has four
processors], they've sustained performance of over a teraflop. This
is for a real problem, not artificial benchmarking. It's outstanding and
probably sets a record for molecular dynamics, an application that generally
speaking doesn't parallelize well.
HPCwire: Can you mention one or two other notable results?
ROSKIES: Sure. Another paper in SCIENCE came from simulations of planet
formation -- by a team led by Thomas Quinn, an astrophysicist at the
University of Washington. In this case, the question was how so-called "gas
giant" planets form. These are huge planets, made mostly of gas, like Saturn
and Jupiter.
Quinn's objective was to model a new theory, based on gravitational
instability, of how these planets coalesce from the gaseous halo that swirls
around a young star. Because he had access to LeMieux, he was able to test
this model with high resolution, about 10 times better than anyone before. As
a result, he found that a gas giant planet can form in hundreds of years,
rather than millions as had been the standard thinking.
This was big news for many people who are now interested in planet formation.
In the past eight years or so, astronomers have for the first time found
planets outside our solar system, over 100 so far, and most of them are gas
giants.
Another fascinating project is the simulation of anti-microbial polymers by
Mike Klein and colleagues at the University of Pennsylvania. They used LeMieux
to show the feasibility of using a polymer to mimic anti-microbial peptides,
such as were first found on the skin of a frog. These natural peptides are an
important part of our defense against pathogens. If they can be synthesized,
we could use them to help defeat antibiotic-resistant bacteria, among other
things. They could also be used to create antibacterial clothing, bandages,
surgical instruments, a whole list of potential applications to address the
problem that hospitals are crawling with germs. One report last year said that
hospital-acquired infection is the fourth leading cause of death in the United
States.
The big problem is that these natural peptides are extremely difficult and
expensive to synthesize. As Mike Klein says, you might as well be grinding up
diamonds. His group showed that a polymer structure, much easier to synthesize
than the natural peptides, would have virtually the same anti-microbial
properties. Lab work by his colleagues proved that this prediction was
correct. These results were published in the PROCEEDINGS OF THE NATIONAL
ACADEMY OF SCIENCES. This project was also one of six finalists in the 2003
Computerworld Honors program in Science. Because of this work, the University
of Pennsylvania filed for patents and created a company to exploit the medical
potential of these molecules.
LEVINE: We've also had some very large-scale simulations of earthquakes that
we should mention. This is a team of people led by Jacobo Bielak and Omar
Ghattas at Carnegie Mellon. They've used 2,048 processors with nearly 90
percent parallel efficiency.
They've sustained close to a teraflop over 12 hours.
This is an application that simulates soil conditions for the Los Angeles
earthquake basin. It's a very large-scale simulation that solves hundreds of
millions of differential equations for each time step in an unstructured mesh,
which subdivides the basin into more than 100 million grid points. This code
performs at nearly 25 percent of peak performance for the Alpha processor,
which is exceptional for an application like this, contrasted with synthetic
benchmark results.
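
As a rough check on how the figures quoted above fit together, here is a minimal sketch in Python. The processor count, the parallel efficiency, and the fraction of peak come from the interview. The per-processor peak of 2 gigaflops is an assumption that the interview does not state, and the function names are hypothetical, chosen only for illustration.

```python
# Figures quoted in the interview.
PROCESSORS_USED = 2048
PARALLEL_EFFICIENCY = 0.90   # "nearly 90 percent parallel efficiency"
FRACTION_OF_PEAK = 0.25      # "nearly 25 percent of peak performance"

# ASSUMPTION (not stated in the interview): a peak of about 2 gigaflops
# per Alpha processor.
ASSUMED_PEAK_GFLOPS_PER_PROCESSOR = 2.0


def parallel_efficiency(speedup: float, processors: int) -> float:
    """Parallel efficiency is the speedup divided by the processor count."""
    return speedup / processors


def sustained_teraflops(processors: int, peak_gflops: float, fraction: float) -> float:
    """Sustained rate in teraflops for a run at a given fraction of peak."""
    return processors * peak_gflops * fraction / 1000.0


# Speedup implied by the quoted efficiency on the quoted processor count.
implied_speedup = PARALLEL_EFFICIENCY * PROCESSORS_USED

# Sustained rate implied by the quoted fraction of peak, under the assumed peak.
sustained = sustained_teraflops(
    PROCESSORS_USED, ASSUMED_PEAK_GFLOPS_PER_PROCESSOR, FRACTION_OF_PEAK
)

print(f"implied speedup on {PROCESSORS_USED} processors: about {implied_speedup:.0f}x")
print(f"sustained rate under the assumed peak: about {sustained:.2f} teraflops")
```

Under that assumed peak, 25 percent of peak on 2,048 processors works out to roughly one teraflop, which is in line with the sustained rate described for this run.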
This project is directed toward getting better data on ground vibration so
that it's possible to create more rational building codes in earthquake-prone
areas, such as Los Angeles and Mexico City. And these calculations, which are
a major step forward toward that goal, run at a scale -- 2,048 processors --
that simply is not feasible without a system like LeMieux.
Because of LeMieux, this group has also made significant progress in a
direction with wide implications for computational science -- a problem called
"the inverse problem." It's also sometimes called "blind deconvolution." Omar
Ghattas won the best paper award last year at SC2002 for his parallel
inverse-problem algorithm. It's a sophisticated mathematical approach to
recover subsurface soil data from the observations on the surface.
This approach is extremely computationally intensive and ultimately will
require petaflop computing. But the potential is very significant, not only
for earthquake safety, but also for global climate change, and many other
areas -- ocean mapping, geological mapping. It's no surprise that DOE is
funding some of this work. In their most recent round of simulations, Ghattas
and Bielak have proven the concept for a model test case, in which they've
recovered subsurface soil parameters starting from surface seismic
observations. You need a LeMieux-class system to do this.
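
To make the idea of an inverse problem concrete, here is a toy sketch in Python. It is not the parallel algorithm described in the interview. It is a small linear least-squares inversion solved by plain gradient descent, and the matrix, the parameter values, and the function names are all hypothetical. The forward model maps "subsurface" parameters to "surface" observations, and the inversion recovers the parameters from the observations alone.

```python
def forward(params, model):
    """Forward problem: predict surface observations from subsurface parameters."""
    return [sum(g * p for g, p in zip(row, params)) for row in model]


def invert(model, observed, steps=2000, step_size=0.1):
    """Inverse problem: recover parameters by gradient descent on the squared misfit."""
    params = [0.0] * len(model[0])
    for _ in range(steps):
        # Misfit between what the current guess predicts and what was observed.
        residual = [p - o for p, o in zip(forward(params, model), observed)]
        # Gradient of 0.5 * sum(residual**2) with respect to each parameter.
        gradient = [
            sum(model[i][j] * residual[i] for i in range(len(model)))
            for j in range(len(params))
        ]
        params = [p - step_size * g for p, g in zip(params, gradient)]
    return params


# Hypothetical toy setup: 3 surface sensors, 2 unknown subsurface parameters.
MODEL = [[1.0, 0.5], [0.2, 1.0], [0.7, 0.3]]
TRUE_PARAMS = [2.0, 3.0]

observations = forward(TRUE_PARAMS, MODEL)   # noise-free synthetic "surface data"
recovered = invert(MODEL, observations)      # uses only the model and the data
print("recovered parameters:", [round(p, 4) for p in recovered])
```

A realistic inversion of the kind described here differs in every respect that makes it expensive: the forward model is a wave-propagation simulation on an unstructured mesh rather than a small matrix, the unknowns number in the millions, and the data are noisy, which is why the approach is so computationally demanding.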
Another application that just blazes on LeMieux is solid-state physics
software called LSMS, which we're using here in a couple of different areas. This
software was originally developed at Oak Ridge National Lab, and one of our
staff scientists, Yang Wang, has worked extensively on adapting it to parallel
systems.
On LeMieux, using all 3,000 processors, LSMS has achieved 4.6 teraflops.
Obviously, this is an application that's very well adapted to parallel
processing.
Yang is collaborating with Mike Widom, a physicist at Carnegie Mellon, using
LSMS to simulate a class of solid-state materials called amorphous metals --
to answer questions about the unusual behavior of these materials. They are
used, for instance, for high-performance golf clubs, because they have unusual
springiness combined with strength and hardness.
HPCwire: What capabilities of LeMieux have been important in enabling this
work?
LEVINE: Well of course right off there's the scale of the machine and the very
powerful Alpha processor. You have 3,000 of these processors all connected
together and that makes possible many large-scale simulations -- QCD is
another example -- that simply wouldn't happen without LeMieux.
Another key factor is the way these powerful processors are tied together. The
Quadrics interconnect network that we used gives us good latency and a very
high aggregate internal bandwidth. These are important factors in many
applications.
ROSKIES: It's worth noting also that it's relatively easy to port code from
other systems to LeMieux. Many researchers have told us this.
HPCwire: What have you learned from your experience with LeMieux that might
influence development of future systems?
ROSKIES: The main thing is the importance of memory bandwidth of the
processor. It's what we felt from the beginning. One of the reasons we chose
this processor is its superior memory bandwidth. I think we were right in that
choice.
LEVINE: Right. The bandwidth between nodes and to I/O devices is also
absolutely crucial. We made a very conscious effort to get the best out of
available technology, and we played an active role with Compaq in putting this
system together. This was not in any way a standard configuration we could
have just ordered from Compaq. It's also a machine of much larger scale than
any of this nature that had been fielded by Compaq. Scale matters and,
realizing this, we went to a great deal of difficulty to do it in the way that
made the scale our friend.
Likewise with the software -- I would say that it's a lesson for the future
that it pays to understand what your target is, what your target applications
are, what your target style of computing is, and to make changes to the
hardware configuration that are crucial to serve those goals. It's a lot of
effort, but important.
HPCwire: Compaq is now merged with HP, which happened after you installed
LeMieux. Does that have any impact on your future plans at PSC?
ROSKIES: If anything, the corporate commitment to high-end computing is
strengthened with this merger. HP's CEO, Carly Fiorina, made it a priority to
visit our site last year to take a first-hand look at LeMieux. She's very
interested in the scientific progress that's fostered through state-of-the-art
processing. She talked about this in her keynote speech at Comdex last
November. LeMieux is an accomplishment that HP rightfully takes some pride in,
that their technology contributes directly to solving problems like disease
and earthquakes.
LEVINE: We've been very impressed in our interaction with the people of HP
labs on both technical and scientific issues, and we think, in that sense, our
involvement with HP is going to help us go forward. The labs have excellent
experience with planning and layout of large data centers, for instance, and
they do detailed computational fluid dynamics simulations of airflow and
temperature in the machine room.
ROSKIES: They have a room that they use to reconfigure and experiment with
different layouts.
HPCwire: What impact has LeMieux had within PACI and NSF planning for
cyberinfrastructure?
LEVINE: The delivery of this machine on time, on budget, and at the strength
and capability promised, has helped to put PACI on a solid footing from a
computational infrastructure point of view. From this foundation it has
continued to build. For NSF, the PACI program has represented a massive
scaling up in cyberinfrastructure. The first large piece of that puzzle, in
terms of a high-end system, was the Terascale Computing System -- LeMieux. It
has delivered the expected performance. In this sense, it's a keystone of this
program and it provides a firm footing from which the program can continue to
grow.
HPCwire: What are the challenges of integrating LeMieux into the TeraGrid?
LEVINE: The TeraGrid initially was the PACI response to the NSF Distributed
Terascale Facility (DTF) solicitation. It was based on a homogeneous hardware
and software environment, that is Intel and Linux. The current challenge has
to do with the different operating system and software base that LeMieux
brings into the TeraGrid environment. LeMieux is the test case for
interoperability. This is a first pass at a grid environment incorporating
heterogeneity of system architectures. The challenge is to expand the software
suite to allow interoperability between different kinds of equipment.
The TeraGrid has established an Interoperability Working Group to coordinate
this effort, which initially is the effort of bringing LeMieux on board. Derek
Simmel of our staff is leading this TeraGrid working group.
The point is not just to incorporate PSC and LeMieux, but also to establish
the necessary framework of interoperability so that the TeraGrid can expand
and other sites, other resources, can be seamlessly woven into this fabric.
What we're after is to create a sustained national cyberinfrastructure to
empower research and education.
ROSKIES: As part of this effort, we've established working relationships among
key staff people at PSC and their counterparts at the other TeraGrid major
resource centers, Illinois and San Diego, and also with Argonne and Caltech,
which are also TeraGrid sites. There are many technically challenging problems
that need to be solved to attain the goal -- to create the overarching look
and feel of homogeneity and ease of use of computation, data, visualization
and other services from the researcher point of view. This requires that the
partners work closely together. All of us are contributing our energies toward
this vision of national cyberinfrastructure, and it's coming along well.
HPCwire: In January, HP announced another server product line, code-named
Marvel, that extends the Alpha processor technology. PSC has installed some of
the first of these servers. How does this new system fit into PSC's plans?
ROSKIES: In February, we installed two separate, 16-processor Marvel systems,
each with 32 gigabytes of shared memory. The official HP product
designation for these servers is the GS1280. Each system represents an initial
phase of what will become two larger GS1280 systems, each composed of 128
processors and 512 gigabytes of shared memory.
There are two separate systems because one of them is funded by NIH and will
support biomedical work and the other is funded by NSF to support NSF science
and engineering. The NSF system will also be part of the TeraGrid.
These systems have a very fast processor, the EV7, 20 percent faster than the
EV68s of LeMieux. What's even more important, though, for these systems is the
memory structure. They're shared memory systems with an exceptional memory
bandwidth, the importance of which we've mentioned. The Marvel's memory
bandwidth is about six times greater than LeMieux's.
The Marvels allow a large shared-memory programming style, which is
complementary to the capabilities that we have in LeMieux, and they can
support a different and very important class of applications. We've identified
applications in genomics and structural biology and also visualization that
will benefit strongly from this architecture. Quantum chemistry is another
large application area that will run very well on the Marvels.
LEVINE: We've come up quickly with these systems -- they're already doing
production research by the way -- because they didn't come to us out of a
vacuum. We worked closely with HP in this process, and we prepared for the
February arrival of the first-phase production systems with extensive previous
work on early engineering test units and field-test machines. We and members
of the research community knew what we were getting well ahead of time.
HPCwire: HP, along with other vendors, appears to be committed to the Itanium
processor. How does this affect PSC?
LEVINE: I'm often asked about this, and the implication usually is that it
creates some sort of problem to change processors. But you have to keep in
mind that in high-performance computing we change processors all the time.
These transitions happen regularly, and if there's one thing we are expert in
at PSC it's in clearing out the underbrush to make a pathway with new systems.
Keep in mind that in about fifteen years we've debuted at least five
generations of new systems. We've evolved through the transition from Cray
vector systems to massive parallelism. With our last three lead systems, we
laid entirely new track, introducing systems to the research community.
There's no reason to think that switching from Alpha processors to something
else down the road represents something out of the ordinary. This is what we
do.
ROSKIES: The Alpha EV7 and the HP Marvel server are the best technology for at
least the next several years. Eventually that will change. In the meantime,
we're serving the research community in the best way we can. As new technology
emerges, we'll adopt it.
In this respect, we've already brought in, through our alliance with HP, a
32-processor Itanium-2 cluster, which among other things helps in development and
testing of TeraGrid interoperability.
As with the Marvel, we're familiarizing ourselves with a new technology, with
significant lead-time for when it becomes a production system.
LEVINE: What we value from Alpha are things that have to do with its
electronic design plus the software suite. The design groups at HP that worked
on EV7 are now working on IA64, so you can expect a carryover of excellence,
with the difference that the Itanium is much more of a mass-market processor,
which brings the cost down.
The bright view of the future is that there will be carryover from the lessons
we've learned technologically from the Alpha plus the advantage of a broader
economic base.
Of course hardware is only part of the picture. We're deeply invested in all
the grid-enabling efforts -- networking, storage, interoperability -- that are
essential parts of creating national cyberinfrastructure through TeraGrid.
This is a community effort. It's vital work and we're pleased to play a role.