
Features:
PSC: THE POLITICS AND POTENTIAL OF SUPERCOMPUTING
by Alan Beck, Editor-in-Chief
Following is Alan Beck's interview of Sergiu Sanielevici, assistant director
at Pittsburgh Supercomputing Center, regarding his views on the future of
supercomputing.
HPCwire: The path ahead as HPC looks toward the 2010 horizon calls for
Petaflop systems, which most likely will entail massive parallelism on the
order of thousands of processors. How is this likely to affect software?
SS: You're right. All the petascale system architectures currently on the
drawing board are massively parallel. These systems will bring to bear many
thousands of processors on each computation in order to exploit the scale of
the system. There's more than scale alone, however, that's involved.
The processors will be fed by complex, multi-stage memory hierarchies, and for
these systems to be productive -- to deliver sustained application performance
in an acceptable relation to peak -- they will require programs that can
efficiently manipulate this kind of complex memory hierarchy.
At PSC, as we've worked with research teams to gain high levels of
productivity with LeMieux, the NSF terascale system, we've begun to learn how
to think about software at these scales. Among the concerns, for instance, are
the stringent performance requirements that will be involved with petascale
I/O. With so many fallible components in the system, application codes need to
do their own checkpointing. Saving terabytes of data over a complex network in
minutes requires careful attention at the application level, something that
traditional programming models do not facilitate. The system solutions devised
by architects to meet these I/O
requirements will need to be efficiently yet portably exploited by petascale
programming models.
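As an editorial illustration of the application-level checkpointing described
above, here is a minimal single-process sketch in Python. It is not PSC's
tooling and not how any particular petascale code does it: the function names
(save_checkpoint, load_checkpoint, run), the pickle file format, and the toy
"computation" are all assumptions made for this example. It shows only the
basic pattern of saving state periodically, writing it atomically, and
resuming from the last good checkpoint after a failure.

```python
import os
import pickle
import tempfile


def save_checkpoint(path, state):
    """Write state to path atomically so a crash never leaves a torn file."""
    directory = os.path.dirname(os.path.abspath(path))
    fd, tmp_path = tempfile.mkstemp(dir=directory)
    with os.fdopen(fd, "wb") as f:
        pickle.dump(state, f)
        f.flush()
        os.fsync(f.fileno())
    # Rename is atomic when source and target are on the same filesystem.
    os.replace(tmp_path, path)


def load_checkpoint(path):
    """Return the saved state, or None if no checkpoint exists yet."""
    if not os.path.exists(path):
        return None
    with open(path, "rb") as f:
        return pickle.load(f)


def run(path, total_steps, checkpoint_every, fail_at=None):
    """Toy time-stepping loop that resumes from the last checkpoint.

    fail_at is a hypothetical knob that simulates a component failure.
    """
    state = load_checkpoint(path) or {"step": 0, "value": 0.0}
    while state["step"] < total_steps:
        if fail_at is not None and state["step"] == fail_at:
            raise RuntimeError("simulated component failure")
        state["value"] += 1.0  # stand-in for the real computation
        state["step"] += 1
        if state["step"] % checkpoint_every == 0:
            save_checkpoint(path, state)
    return state
```

In a real parallel code every process would write its own portion of the
state, and the hard part is moving that volume of data through the I/O system
quickly; the sketch leaves that out entirely.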
HPCwire: We're hearing from various places, such as DARPA's High Productivity
Computing Systems (HPCS) program, that the present moment represents a
"critical juncture" for U.S. supercomputing. Do you agree?
SS: Yes, I do, and it's certainly just as true for software as hardware. The
two go hand-in-hand, and the necessity of coordination between the two becomes
much more important at the petascale, which is why we're at a critical
juncture and why we should address it now and, hopefully, get a jump on
thinking clearly about the challenges involved. At PSC, our experience with
the current crop of systems capable of terascale performance suggests that
almost all leading-edge scientific codes will have to be re-engineered to some
extent. The sooner we as a community begin to pay attention to this,
obviously, the farther ahead we'll be.
The problem is that most parallel codes follow the message-passing and/or the
shared-memory programming model, both of which have drawbacks that become
critical at the scale and complexity we're now reaching. Shared memory doesn't
scale, is expensive even at small scales, and doesn't encourage data locality.
Message passing is too low-level, too burdensome to the programmer, and too
hard to optimize when the number of communicating objects grows into the
thousands. Efficient, portable parallel I/O and checkpointing (as well as
visualization and steering) are seldom incorporated into existing codes.
In the 30 years or so since supercomputing emerged, it has been possible
for physicists, chemists, biologists, engineers and others to make progress
mainly using the numerical and programming expertise within their own groups
(including computer-savvy graduate students and postdocs). They could more or
less get by with a minimalist reaction to changes in the architectures they
have been presented with -- from the CDC-7600 via the vector era to the 1990s
style of more or less "massive" parallel systems. But now, the complexity of
the system is such that you're much more likely to succeed if you can adapt
the advanced methods developed by numerical mathematicians and computer
scientists to your specific needs.
HPCwire: What new software approaches specifically do you envision?
SS: I think we need to consider innovations at several levels: languages and
compilers, runtime systems, frameworks and tools, for both computation and
I/O. There has been solid progress by computer scientists and numerical
mathematicians over the past few years, with the explicit goal of facilitating
development of application codes that will perform efficiently at any scale.
Many of these methods have already been successfully demonstrated on real
codes achieving excellent performance on today's terascale systems.
For example, certain dialects of Fortran, C and Java provide a global memory
space abstraction whereby all data has a user-controllable processor affinity,
but parallel processes may directly reference each other's memory. I'm
thinking of, respectively, Co-Array Fortran (CAF), Unified Parallel C (UPC),
and U.C. Berkeley's Titanium.
At the runtime level, UIUC's Charm++, a parallel C++ library, and AMPI, an
adaptive MPI implementation, provide processor virtualization. This technique
allows the programmer to divide the computation into a large number of
entities that are mapped to the available processors by an intelligent runtime
system, enabling a separation of concerns that leads to both improved
productivity and higher performance. As another example, UCSD's KeLP
programming system enables the programmer to express complicated dependence
patterns in geometric terms. On top of such a programming system one can then
build domain-specific solver libraries, such as SCALLOP for elliptic partial
differential equations.
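To make the idea of processor virtualization concrete, here is a small
editorial sketch in Python. It does not use or imitate the Charm++ or AMPI
APIs. The function name map_objects_to_processors and the greedy
heaviest-first placement rule are assumptions chosen for illustration; a real
runtime system would also measure loads and migrate work while the program
runs. The point is only that the programmer describes many work objects, and
a separate layer decides which processor each one runs on.

```python
import heapq


def map_objects_to_processors(loads, num_processors):
    """Greedy mapping of work objects to processors.

    loads[i] is the estimated cost of work object i. Objects are placed
    heaviest first, each on the processor with the smallest total so far.
    Returns (assignment, totals): assignment maps object index to
    processor index, and totals[p] is the summed load on processor p.
    """
    heap = [(0.0, p) for p in range(num_processors)]
    heapq.heapify(heap)
    assignment = {}
    for obj in sorted(range(len(loads)), key=lambda i: -loads[i]):
        total, proc = heapq.heappop(heap)
        assignment[obj] = proc
        heapq.heappush(heap, (total + loads[obj], proc))
    totals = [0.0] * num_processors
    for obj, proc in assignment.items():
        totals[proc] += loads[obj]
    return assignment, totals
```

For example, calling map_objects_to_processors([5, 3, 3, 2, 2, 1], 2) spreads
six unequal work objects over two processors without the application code
ever naming a processor.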
At the highest level, there are collections of tools available to the
applications programmer. For example, the DOE Advanced CompuTational Software
(ACTS) collection offers direct and iterative methods for the solution of
linear and non-linear systems of equations; partial differential equation
solvers and multi-level methods; structured and unstructured meshes
(generation, manipulation and computation); as well as performance monitoring
and tuning. PSC has developed tools for I/O and checkpointing on large-scale
parallel and distributed systems.
HPCwire: Will this require a radical reprogramming effort in applications?
SS: It's going to depend. As I've said, these new approaches are available at
several levels: source, link, and executable components. So the scientific
applications teams will need to study and evaluate various options depending
on their needs and means, the fit with their own existing codes, the
performance, reliability and maintainability of the new external methods and
components, etc. Clearly everyone will want to get the maximum benefit with
the minimum investment of time and resources.
HPCwire: What kind of applications will be affected by these changes?
SS: It's easier to think of types of applications that will probably *not*
need to be changed. "Pleasingly parallel" and parameter sweep (ensemble
simulation) applications should be fine regardless of the complexity of the
communications fabric and the number of processors -- assuming the single-
processor or low-parallelism "elementary" code is kept well tuned as new nodes
are deployed. Also, there certainly exist beautifully scalable and efficient
terascale codes painstakingly crafted using "traditional" programming
paradigms and tools, which may remain satisfactory into the petascale regime.
But, in general, any scientist who plans to work at the petascale should
critically examine the possibility that a break with the past may be needed.
HPCwire: What role will computer scientists play in these changes?
SS: That's a good question, because the problems we're talking about aren't
purely technical. We also need to be thinking about the sociological
situation. As a community, we need to do a better job of bridging the gap
between academic computer scientists and the computational scientists whose
application codes need to be re-engineered to petascale requirements. Computer
scientists need to understand, in detail, the concerns and the outlook of the
application developers. Sometimes, these are not necessarily the "coolest"
issues a computer scientist would want to work on: things like long-term
maintenance and user support, documentation written in intelligent layperson's
language, or quickly adding functionality that would not be a priority from a
pure computer science viewpoint. Work that helps a biophysicist publish
breakthrough papers may not produce any significant publications for her
computer scientist partner, and vice versa.
We also need these same computer scientists to collaborate with the commercial
vendors who will design and implement the systems that will reach the
petascale, and with the supercomputing centers that will deploy and operate
them. This gap-bridging approach is being pioneered by several initiatives
including the NSF TeraGrid project and DARPA's HPCS program.
HPCwire: These seem to be daunting problems. How do you propose to attack
them?
SS: I think the first step is to engage the computational science community in
a process of critically examining how they will operate at the petascale, and
in a close and mutually beneficial collaboration with the computer scientists
who are interested in petascale systems. At PSC, together with the other NSF
PACI centers and the DOE centers at ORNL and NERSC, we made a start in 2002 by
organizing the workshop "Scaling to New Heights," where members of both
communities discussed their experiences and ideas.
This year, NSF, DOE and DoD are sponsoring a tightly focused workshop that
aims to introduce computational scientists and engineers to the new software
approaches we've discussed, and to their developers. This will take place at
PSC on May 3 and 4, 2004. We certainly suggest that everyone who plans to do
computational science on the upcoming generation of platforms consider
participation.
Details and the registration form can be found at:
http://www.psc.edu/training/PPS_May04/.
What I'd like to see is that this workshop and others like it will produce a
series of "grassroots" collaborations, in which people can generate some sense
of shared purpose in making petascale computing work. I've found that the
funding agencies are keenly interested in this topic and enthusiastically
support our plans for this workshop. If we in the U.S. computational science
and HPC community hope to convince the nation to invest the considerable
dollars needed to keep us moving forward, we should demonstrate that we are
doing all we can to maximize the returns in scientific and technological
breakthroughs.