HPCwire
 The global publication of record for High Performance Computing / June 13, 2003: Vol. 12, No. 23


Features:

EXCLUSIVE - PITT SC DIRECTORS SPEAK OUT
by Alan Beck, Editor-in-Chief

Interview with Mike Levine and Ralph Roskies, co-scientific directors of the Pittsburgh Supercomputing Center...

HPCwire: LeMieux, your 3,000-processor HP AlphaServer SC terascale system, has been at Pittsburgh Supercomputing Center (PSC) since October 2001 and in full production mode, through the NSF PACI allocation process, since April 2002. It remains, as it was when installed, the most powerful NSF system, which also makes it the most powerful U.S. system available to all science and engineering disciplines for unclassified research. What are the noteworthy scientific gains?

LEVINE: There have been excellent results across a range of application areas. There's significant work in protein structure and dynamics. There's major work in computational fluid dynamics of blood flow. In CFD, we've also enabled simulations of new designs for power turbines and, related to that, simulations of industrial-scale coal gasification. There's new work in solar turbulence and magneto-convection. In other areas of physics, LeMieux has made possible significant progress in quantum chromodynamics, also some breakthrough work in photonics. There's also important work in cosmology and major new findings in planet formation.

There's a lot going on, at many levels and with many approaches to using this machine. LeMieux caught on rapidly. We regularly solicit feedback from researchers, and a recurring theme has been "We couldn't have done these computations without this machine."

HPCwire: Could you elaborate with a few examples?

ROSKIES: One of the earliest notable results came from Klaus Schulten and his colleagues at the University of Illinois. They simulated a protein called aquaporin, an important channel protein that regulates the passage of water into and out of cells. The simulations solved a mystery that experiments couldn't resolve.

These were very detailed simulations, more than 100,000 atoms, and they gave a precise picture of how the water molecules line up and move through this channel. What they showed -- which you couldn't see in experiments -- is that each water molecule does a 180-degree flip-flop at the midstream point of this channel. Because of that flip-flop, protons can't come through, only the water.

Free passage of protons through this channel, which is ubiquitous in many organisms, would totally disrupt the electrochemistry that drives metabolism. Until these simulations, no one knew how the channel managed to keep the protons out while letting water through. It was perplexing. Because he had access to LeMieux, Schulten was able to get the answer and publish the results in SCIENCE.

Rita Colwell [director of NSF] highlighted this work, by the way, in her keynote address at SC2002 in Baltimore.

Schulten's team was quick to take advantage of LeMieux because they developed a molecular dynamics package, called NAMD, that's well adapted to large-scale parallel systems. Using all 750 of LeMieux's nodes [each node has four processors], they've sustained performance of over a teraflop per second. This is for a real problem, not artificial benchmarking. It's outstanding and probably sets a record for molecular dynamics, an application that, generally speaking, doesn't parallelize well.

HPCwire: Can you mention one or two other notable results?

ROSKIES: Sure. Another paper in SCIENCE came from simulations of planet formation -- by a team led by Thomas Quinn, an astrophysicist at the University of Washington. In this case, the question was how so-called "gas giant" planets form. These are huge planets, made mostly of gas, like Saturn and Jupiter.

Quinn's objective was to model a new theory, based on gravitational instability, of how these planets coalesce from the gaseous halo that swirls around a young star. Because he had access to LeMieux, he was able to test this model with high resolution, about 10 times better than anyone before. As a result, he found that a gas giant planet can form in hundreds of years, rather than millions as had been the standard thinking.

This was big news for many people who are now interested in planet formation. In the past eight years or so, astronomers have for the first time found planets outside our solar system, over 100 so far, and most of them are gas giants.

Another fascinating project is the simulation of anti-microbial polymers by Mike Klein and colleagues at the University of Pennsylvania. They used LeMieux to show the feasibility of using a polymer to mimic anti-microbial peptides, such as those first found on the skin of a frog. These natural peptides are an important part of our defense against pathogens. If they can be synthesized, we could use them to help defeat antibiotic-resistant bacteria, among other things. They could also be used to create antibacterial clothing, bandages and surgical instruments -- a whole list of potential applications to address the problem that hospitals are crawling with germs. One report last year said that hospital-acquired infection is the fourth leading cause of death in the United States.

The big problem is that these natural peptides are extremely difficult and expensive to synthesize. As Mike Klein says, you might as well be grinding up diamonds. His group showed that a polymer structure, much easier to synthesize than the natural peptides, would have virtually the same anti-microbial properties. Lab work by his colleagues proved that this prediction was correct. These results were published in the PROCEEDINGS OF THE NATIONAL ACADEMY OF SCIENCES. This project was also one of six finalists in the 2003 Computerworld Honors program in Science. Because of this work, the University of Pennsylvania filed for patents and created a company to exploit the medical potential of these molecules.

LEVINE: We've also had some very large-scale simulations of earthquakes that we should mention. This is a team of people led by Jacobo Bielak and Omar Ghattas at Carnegie Mellon. They've used 2,048 processors with nearly 90 percent parallel efficiency.

They've sustained close to a teraflop per second over 12 hours. This is an application that simulates earthquake ground motion, accounting for soil conditions, in the Los Angeles basin. It's a very large-scale simulation that solves hundreds of millions of differential equations for each time step on an unstructured mesh, which subdivides the basin into more than 100 million grid points. This code performs at nearly 25 percent of the Alpha processor's peak, which is exceptional for a real application of this kind, as opposed to a synthetic benchmark.
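
To give a sense of what solving that many equations per time step looks like, here is a minimal, purely illustrative sketch (in Python, and not the CMU group's code) of explicit time stepping for a one-dimensional scalar wave equation; the basin simulations repeat the same basic pattern -- one sparse operator application per step -- over a 3-D unstructured mesh with more than 100 million points spread across thousands of processors.

    # Illustrative only: explicit second-order time stepping of the 1-D scalar
    # wave equation. A basin-scale code applies the same idea on a 3-D
    # unstructured mesh, one sparse operator application per time step.
    import numpy as np

    n = 1000                      # grid points (toy size; the real runs use >100 million)
    dx, c, dt = 1.0, 3.0, 0.1     # spacing, wave speed, time step (c*dt/dx = 0.3, stable)
    x = np.arange(n)
    u = np.exp(-0.01 * (x - n // 2) ** 2)   # initial displacement pulse
    u_prev = u.copy()                       # zero initial velocity

    for step in range(500):
        lap = np.zeros(n)
        lap[1:-1] = (u[:-2] - 2.0 * u[1:-1] + u[2:]) / dx**2   # discrete Laplacian
        u_next = 2.0 * u - u_prev + (c * dt) ** 2 * lap        # leapfrog update
        u_prev, u = u, u_next

    print("peak displacement after 500 steps:", float(u.max()))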

This project is directed toward getting better data on ground vibration so that it's possible to create more rational building codes in earthquake-prone areas, such as Los Angeles and Mexico City. And these calculations, which are a major step toward that goal, run at a scale -- 2,048 processors -- that simply is not feasible without a system like LeMieux.

Because of LeMieux, this group has also made significant progress in a direction with wide implications for computational science -- a problem called "the inverse problem." It's also sometimes called "blind deconvolution." Omar Ghattas won the best paper award last year at SC2002 for his parallel inverse-problem algorithm. It's a sophisticated mathematical approach to recovering subsurface soil data from observations at the surface.

This approach is extremely computationally intensive and ultimately will require petaflop computing. But the potential is very significant, not only for earthquake safety, but also for global climate change and many other areas -- ocean mapping, geological mapping. It's no surprise that DOE is funding some of this work. In their most recent round of simulations, Ghattas and Bielak have proven the concept on a model test case, in which they've recovered subsurface soil parameters starting from surface seismic observations. You need a LeMieux-class system to do this.
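
To make "the inverse problem" concrete, here is a toy sketch (illustration only, in Python; the group's actual seismic inversion is nonlinear, vastly larger, and runs in parallel): hidden "subsurface" parameters m are recovered from noisy "surface" observations d = A m by regularized least squares.

    # Toy inverse problem: recover hidden parameters m from observations
    # d = A m + noise using Tikhonov-regularized least squares. In the seismic
    # setting the forward operator is a full wave-propagation simulation.
    import numpy as np

    rng = np.random.default_rng(seed=0)
    n_obs, n_param = 60, 20
    A = rng.standard_normal((n_obs, n_param))            # stand-in forward operator
    m_true = np.sin(np.linspace(0.0, np.pi, n_param))    # "subsurface" profile to recover
    d = A @ m_true + 0.01 * rng.standard_normal(n_obs)   # noisy "surface" observations

    alpha = 1e-2                                          # regularization weight
    m_est = np.linalg.solve(A.T @ A + alpha * np.eye(n_param), A.T @ d)

    print("relative recovery error:",
          float(np.linalg.norm(m_est - m_true) / np.linalg.norm(m_true)))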

Another application that just blazes on LeMieux is solid-state physics software called LSMS, which we're using here in a couple of different areas. This software was originally developed at Oak Ridge National Lab, and one of our staff scientists, Yang Wang, has worked extensively on adapting it to parallel systems.

On LeMieux, using all 3,000 processors, LSMS has achieved 4.6 teraflops. Obviously, this is an application that's very well adapted to parallel processing.
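
As a back-of-envelope check (assuming roughly 2 Gflop/s of theoretical peak per EV68 processor, a figure not stated in the interview), that works out to a very high fraction of the machine's peak:

    # Rough arithmetic only; the ~2 Gflop/s per-processor peak is an assumption.
    sustained_tflops = 4.6
    peak_tflops = 3000 * 2e9 / 1e12          # 3,000 processors x ~2 Gflop/s each
    print(f"{sustained_tflops / peak_tflops:.0%} of theoretical peak")   # ~77%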

Yang is collaborating with Mike Widom, a physicist at Carnegie Mellon, using LSMS to simulate amorphous metals -- a class of solid-state materials -- and to answer questions about their unusual behavior. They are used, for instance, in high-performance golf clubs, because they combine unusual springiness with strength and hardness.

HPCwire: What capabilities of LeMieux have been important in enabling this work?

LEVINE: Well, of course, right off there's the scale of the machine and the very powerful Alpha processor. You have 3,000 of these processors all connected together, and that makes possible many large-scale simulations -- QCD is another example -- that simply wouldn't happen without LeMieux.

Another key factor is the way these powerful processors are tied together. The Quadrics interconnect network that we used gives us good latency and a very high aggregate internal bandwidth. These are important factors in many applications.

ROSKIES: It's worth noting also that it's relatively easy to port code from other systems to LeMieux. Many researchers have told us this.

HPCwire: What have you learned from your experience with LeMieux that might influence development of future systems?

ROSKIES: The main thing is the importance of the processor's memory bandwidth. It's what we felt from the beginning. One of the reasons we chose this processor is its superior memory bandwidth, and I think we were right in that choice.

LEVINE: Right. The bandwidth between nodes and to I/O devices is also absolutely crucial. We made a very conscious effort to get the best out of available technology, and we played an active role with Compaq in putting this system together. This was not in any way a standard configuration we could have just ordered off the shelf. It's also a machine of much larger scale than anything of this nature Compaq had previously fielded. Scale matters and, realizing this, we went to a great deal of trouble to do it in a way that made the scale our friend.

Likewise with the software. I would say the lesson for the future is that it pays to understand what your target is -- what your target applications are, what your target style of computing is -- and to make the changes to the hardware configuration that are crucial to serving those goals. It's a lot of effort, but important.

HPCwire: Compaq has now merged with HP, a merger that happened after you installed LeMieux. Does that have any impact on your future plans at PSC?

ROSKIES: If anything, the corporate commitment to high-end computing has been strengthened by this merger. HP's CEO, Carly Fiorina, made it a priority to visit our site last year to take a first-hand look at LeMieux. She's very interested in the scientific progress that's fostered through state-of-the-art computing, and she talked about this in her keynote speech at Comdex last November. LeMieux is an accomplishment that HP rightfully takes some pride in: their technology contributes directly to solving problems like disease and earthquakes.

LEVINE: We've been very impressed in our interactions with the people at HP Labs on both technical and scientific issues, and we think, in that sense, our involvement with HP is going to help us go forward. The labs have excellent experience with the planning and layout of large data centers, for instance, and they do detailed computational fluid dynamics simulations of airflow and temperature in the machine room.

ROSKIES: They have a room that they use to reconfigure and experiment with different layouts.

HPCwire: What impact has LeMieux had within PACI and NSF planning for cyberinfrastructure?

LEVINE: The delivery of this machine on time, on budget, and at the strength and capability promised has helped to put PACI on a solid footing from a computational-infrastructure point of view. From this foundation, the program has continued to build. For NSF, the PACI program has represented a massive scaling up of cyberinfrastructure. The first large piece of that puzzle, in terms of a high-end system, was the Terascale Computing System -- LeMieux. It has delivered the expected performance. In this sense, it's a keystone of the program, and it provides a firm footing from which the program can continue to grow.

HPCwire: What are the challenges of integrating LeMieux into the TeraGrid?

LEVINE: The TeraGrid initially was the PACI response to the NSF Distributed Terascale Facility (DTF) solicitation. It was based on a homogeneous hardware and software environment, that is, Intel and Linux. The current challenge has to do with the different operating system and software base that LeMieux brings into the TeraGrid environment. LeMieux is the test case for interoperability. This is a first pass at a grid environment incorporating heterogeneous system architectures. The challenge is to expand the software suite to allow interoperability between different kinds of equipment.

The TeraGrid has established an Interoperability Working Group to coordinate this effort, which initially means bringing LeMieux on board. Derek Simmel of our staff is leading this TeraGrid working group.

The point is not just to incorporate PSC and LeMieux, but also to establish the necessary framework of interoperability so that the TeraGrid can expand and other sites, other resources, can be seamlessly woven into this fabric. What we're after is to create a sustained national cyberinfrastructure to empower research and education.

ROSKIES: As part of this effort, we've established working relationships between key staff at PSC and their counterparts at the other TeraGrid major resource centers, Illinois and San Diego, as well as at Argonne and Caltech, which are also TeraGrid sites. There are many technically challenging problems that need to be solved to attain the goal -- to create an overarching look and feel of homogeneity and ease of use for computation, data, visualization and other services from the researcher's point of view. This requires that the partners work closely together. All of us are contributing our energies toward this vision of national cyberinfrastructure, and it's coming along well.

HPCwire: In January, HP announced another server product line, code-named Marvel, that extends the Alpha processor technology. PSC has installed some of the first of these servers. How does this new system fit into PSC's plans?

ROSKIES: In February, we installed two separate 16-processor Marvel systems, each with 32 gigabytes of shared memory. HP's official product designation for these servers is the GS1280. Each system represents the initial phase of what will become two larger GS1280 systems, each composed of 128 processors and 512 gigabytes of shared memory.

There are two separate systems because one of them is funded by NIH and will support biomedical work, and the other is funded by NSF to support NSF science and engineering. The NSF system will also be part of the TeraGrid.

These systems have a very fast processor, the EV7, 20 percent faster than LeMieux's EV68s. What's even more important for these systems, though, is the memory structure. They're shared-memory systems with exceptional memory bandwidth, the importance of which we've mentioned. The Marvel's memory bandwidth is about six times greater than LeMieux's.

The Marvels allow a large shared-memory programming style, which is complementary to the capabilities we have in LeMieux, and they can support a different and very important class of applications. We've identified applications in genomics, structural biology and visualization that will benefit strongly from this architecture. Quantum chemistry is another large application area that will run very well on the Marvels.
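
For illustration of that programming style (a minimal sketch in Python, not PSC software): in a shared-memory model, many workers read and update one large array in a single address space directly, rather than exchanging messages between distributed nodes the way a LeMieux-style code does.

    # Minimal sketch: several processes updating one large array that lives in
    # shared memory -- the style a large shared-memory machine makes natural.
    from multiprocessing import Process, shared_memory
    import numpy as np

    def scale_chunk(shm_name, length, start, stop, factor):
        shm = shared_memory.SharedMemory(name=shm_name)     # attach to the shared block
        data = np.ndarray((length,), dtype=np.float64, buffer=shm.buf)
        data[start:stop] *= factor                          # update shared data in place
        shm.close()

    if __name__ == "__main__":
        n = 1_000_000
        shm = shared_memory.SharedMemory(create=True, size=n * 8)
        data = np.ndarray((n,), dtype=np.float64, buffer=shm.buf)
        data[:] = 1.0
        workers = [Process(target=scale_chunk,
                           args=(shm.name, n, i * n // 4, (i + 1) * n // 4, 2.0))
                   for i in range(4)]
        for w in workers:
            w.start()
        for w in workers:
            w.join()
        print("all entries doubled:", bool(np.all(data == 2.0)))
        shm.close()
        shm.unlink()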

LEVINE: We've come up quickly with these systems -- they're already doing production research, by the way -- because they didn't come to us out of a vacuum. We worked closely with HP in this process, and we prepared for the February arrival of the first-phase production systems with extensive prior work on early engineering test units and field-test machines. We and members of the research community knew what we were getting well ahead of time.

HPCwire: HP, along with other vendors, appears to be committed to the Itanium processor. How does this affect PSC?

LEVINE: I'm often asked about this, and the implication usually is that it creates some sort of problem to change processors. But you have to keep in mind that in high-performance computing we change processors all the time. These transitions happen regularly, and if there's one thing we are expert in at PSC it's in clearing out the underbrush to make a pathway with new systems.

Keep in mind that in about fifteen years we've debuted at least five generations of new systems. We've evolved through the transition from Cray vector systems to massive parallelism. With our last three lead systems, we laid entirely new track, introducing systems to the research community. There's no reason to think that switching from Alpha processors to something else down the road represents something out of the ordinary. This is what we do.

ROSKIES: The Alpha EV7 and the HP Marvel server are the best technology for at least the next several years. Eventually that will change. In the meantime, we're serving the research community in the best way we can. As new technology emerges, we'll adopt it.

In this respect, we've already brought in, through our alliance with HP, a 32-processor Itanium 2 cluster, which among other things helps in the development and testing of TeraGrid interoperability.

As with the Marvel, we're familiarizing ourselves with a new technology, with significant lead time before it becomes a production system.

LEVINE: What we value from Alpha are things that have to do with its electronic design plus the software suite. The design groups at HP that worked on EV7 are now working on IA64, so you can expect a carryover of excellence, with the difference that the Itanium is much more of a mass-market processor, which brings the cost down.

The bright view of the future is that there will be carryover from the lessons we've learned technologically from the Alpha plus the advantage of a broader economic base.

Of course hardware is only part of the picture. We're deeply invested in all the grid-enabling efforts -- networking, storage, interoperability -- that are essential parts of creating national cyberinfrastructure through TeraGrid. This is a community effort. It's vital work and we're pleased to play a role.

