
Features:
IBM P4: HANDICAP FOR HIGH RESOLUTION EARTH SYSTEM MODELING?
by Christopher Lazou
Some 90 meteorologists and HPC experts from 15 countries and 4 continents
attended the bi-annual CAS2K3 workshop on the use of HPC in meteorology, held
at the idyllic Imperial Palace Hotel, Annecy, France, organised by the
National Center for Atmospheric Research (NCAR), USA. This excellent,
relatively small and friendly workshop provided a tour de force in
meteorological and computing techniques by active practitioners striving to
exploit the latest HPC technology to refine and improve their climate
prediction models. Most presenters came from sites in the USA with large IBM
P3/4 systems, while the European contingent included a strong representation
from sites with large NEC SX-6 systems. This article highlights a few of the
many issues raised by presentations given at this workshop.
There were 43 presentations in 4 days and a live demonstration of the Grid's
potential to enable international collaboration within the Community Climate
System Model (CCSM) community. The talks were crammed with technical
information on how to use parallel supercomputers to run the mathematical
models that describe climate and weather patterns over time. They were
interspersed with performance figures, weather maps and video pictures from
simulations, and these were compared with satellite pictures of the behaviour
of actual weather events.
Why are meteorologists doing all this Earth System Modelling and what is the
urgency? Dramatic reports of flooding and other climate change events now
appear frequently in the press and on television (Hurricane Isabel hitting
North Carolina as I write). These images are injecting a political dimension
into the proceedings.
Climate simulations show that intensely hot summers and increased rainfall,
causing flooding, are likely to become more common. The reduction of snow in
the north, the melting of glaciers, the projection of no snow in the north in
year 2100 (even at the North Pole) and the implied rise in sea level raise
questions on the state of the atmosphere, the ocean, sea-ice, the land
surfaces and mankind. In short, there is a perceived pending catastrophe,
because of global warming exacerbated by greenhouse gases and other pollutants
from human activities.
A study at the Max Planck Institute, Germany, comparing CO2 concentration in
parts per million (ppm) with temperature change, shows that over the 400,000
years prior to 1850 the temperature changed from -8°C to 0°C (a rise of 8°C)
while CO2 changed from 200 ppm to 280 ppm. From 1850 to today the temperature
rose by 1.4°C and CO2 rose to 550 ppm. From today to the year 2100, CO2 is
expected to rise from 550 ppm to 960 ppm and the temperature rise from 1.4°C
to 5.8°C. This is the so-called Vostok curve and the gap indicates the
severity of the situation.
Some scenarios show that sea level rise alone could deprive two billion people
of food in the next hundred years. Insurance companies cannot protect against
consequences of this magnitude. Thus the stakes are high and finding answers
to the socio-economic effects of climate change has climbed to the top of the
political agenda.
The key goal of the climate change efforts is to develop and enhance our
capability to monitor and predict how the Earth System is evolving.
On seasonal and inter-annual time scales, weather forecasting and climate
change predictions are dominated by the initial conditions of the atmosphere
and the oceans, and by forcing factors (naturally occurring and human-induced).
Dr. William Collins, from NCAR and Chair of the scientific steering committee
for the Community Climate System Model (CCSM), in a keynote address explained
that CCSM is a comprehensive system for simulating the past, present and
future climates of the Earth. It currently consists of four major components
representing the atmosphere, ocean, sea ice, and land surface. The exchange of
energy, water and other constituents at the interfaces among these components
is simulated using a flux coupler.
The CCSM resulted from a collaborative development effort involving NCAR,
university investigators and scientists from several US federal agencies. One
of the distinguishing features of CCSM is that the complete source code,
documentation and simulation data sets are freely distributed to the
international climate research community. A new version of the model, CCSM3,
has been developed to facilitate work on a wide variety of scientific
problems. These include the interactions between aerosols and climate, the
relative importance of natural and anthropogenic forcing from the last
millennium, and the nature of abrupt climate change. Results from CCSM3 will
form the basis for NCAR's contribution to forthcoming international (IPCC and
WMO) climate assessments. This talk focused on major new features and
improvements in CCSM3 relative to its predecessors.
These include new radiation and cloud parameterisations in the atmosphere;
heating of the ocean surface by chlorophyll and detailed vegetation ecology.
The improvements in simulations of present-day climate produced by the new
model physics were illustrated with recent coupled experiments.
In the next few years, the CCSM will be expanded to include reactive
tropospheric chemistry, detailed aerosol physics and microphysics,
comprehensive biogeochemistry and ecosystem dynamics, and the effects of
urbanization and land use change. These new capabilities will considerably
expand the scope of earth system science that can be studied with CCSM and
other climate models of similar complexity. The computer requirements, for the
next generation of comprehensive climate models, can only be satisfied by
major advances in computer hardware, software, and storage.
The major atmospheric research centres now have systems consisting of several
hundred NEC SX-6 processors or a thousand or more IBM P3/4 processors. In
either case they can achieve about half a Teraflop/s sustained, and even a
Teraflop/s on certain application codes. The exception to this is the Earth
Simulator in Japan, based on NEC SX-6 technology, which delivers over
12 Teraflop/s sustained performance.
Thus with Teraflop/s sustained computing on the horizon and occasionally on
stream, meteorologists are moving from Climate to Earth System Modelling
(ESM). This is because the feedback loops between the climate system and other
relevant systems, such as ecology and the socio-economy, are not negligible.
Climate modelling is not possible without proper representation of these
systems; hence ESM.
Earth System Modelling is multi-scale (in time and space), multi-process and
multi-topical (physics, chemistry, biology, geology, economy…). It is both very
compute and data intensive. Some people claim it requires several orders of
magnitude more computing power to tackle the problem. Petaflop/s and
Exaflop/s systems are therefore eagerly awaited.
Bill Collins said: "A factor of 150 times the present NCAR computing resources
is needed to accommodate CCSM requirements over the next 5 years, i.e. by year
2008. Moore's Law will only deliver an eightfold increase. How this deficiency
is to be remedied is a great challenge. Although special architectures, like
the IBM Blue Gene for protein folding, are in the pipeline, they have limited
instruction sets. This is because protein folding deals with very simple
equations. This architecture is not suited to ESM, which needs a small number
of fat nodes, rather than the thousands of processors as in the Blue Gene." He
went on to say that CCSM is forty times slower on the 5.2 Teraflop/s IBM P4
NCAR system than on the Japanese Earth Simulator.
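The size of the shortfall Collins describes can be checked with a few lines of arithmetic. The sketch below is illustrative only: the 150-fold requirement, the eightfold Moore's Law gain and the five-year window are the figures quoted from the talk, while the function names and the assumption of steady exponential growth are mine.

```python
import math


def shortfall_factor(required_growth: float, delivered_growth: float) -> float:
    """How many times the delivered growth falls short of the requirement."""
    return required_growth / delivered_growth


def implied_doubling_months(growth: float, years: float) -> float:
    """Doubling period in months implied by a growth factor over a span of
    years, assuming steady exponential growth (an assumption, not a quote)."""
    return years * 12 / math.log2(growth)


required = 150.0  # growth in NCAR resources needed by CCSM (quoted)
moore = 8.0       # growth Moore's Law is expected to deliver (quoted)
years = 5.0       # planning window, i.e. to 2008 (quoted)

gap = shortfall_factor(required, moore)
moore_doubling = implied_doubling_months(moore, years)
needed_doubling = implied_doubling_months(required, years)

print(f"Shortfall beyond Moore's Law: {gap:.2f}x")
print(f"Doubling period implied by 8x in 5 years: {moore_doubling:.1f} months")
print(f"Doubling period needed for 150x in 5 years: {needed_doubling:.1f} months")
```

On these figures the requirement outstrips Moore's Law by a factor of nearly nineteen, which is why architecture and software, not just faster chips, dominated the discussion.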
The message that capacity computers such as the IBM P4+ systems are
unsuitable for high resolution Earth System Modelling was reinforced by many
of the speakers. For example, Dr. Albert Semtner, Naval Postgraduate School,
Monterey, California, described in his talk ocean and ice models that are
capable of reproducing the observed mean states and variability of the global
ocean and its sea ice. A comparison of simulated model results with
observational statistics indicates that a horizontal grid spacing of less
than 10 km is needed for both ocean and ice. As a result the most advanced
computing systems are required to run these models.
Specific results were shown from running the Parallel Ocean Program (POP) and
the Sea Ice Model developed at Los Alamos National Laboratory. The output from
a number of simulations conducted by investigators at the Naval Postgraduate
School and their collaborators was evaluated against observations.
The simulations were conducted on large IBM, NEC and Cray machines. His
findings are stark. "Only systems that deliver multiple teraflop/s of
sustained performance can be used to project climatic conditions out for many
centuries, with highly realistic ocean and ice interactions in terms of
spatial and temporal evolution. On sub-teraflop/s systems, ensemble forecasting
of ocean and ice for optimal ship routing and other marine applications can be
done for time-scales of months."
He illustrated this by showing results obtained from the IBM P3 and an NEC
SX-6. Using a model with ~6.5 km spacing over the ocean, the simulation on a
500-processor IBM P3 took eight days to simulate fifteen years. This same
model simulated 300 years in just eight hours on 960 processors of the Earth
Simulator (NEC SX-6).
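To put the two runs on a common footing, the figures quoted above can be converted into simulated years per wall-clock day. This is a rough sketch: the run lengths and processor counts are those reported in the talk, the function name is my own, and the per-processor figure is a crude normalisation that ignores any differences in configuration between the two runs.

```python
def simulated_years_per_day(simulated_years: float, wallclock_hours: float) -> float:
    """Model throughput: simulated years completed per 24 hours of wall-clock time."""
    return simulated_years * 24.0 / wallclock_hours


# IBM P3: fifteen simulated years in eight days on 500 processors (quoted).
ibm_p3 = simulated_years_per_day(15, 8 * 24)
# Earth Simulator: 300 simulated years in eight hours on 960 processors (quoted).
earth_sim = simulated_years_per_day(300, 8)

speedup = earth_sim / ibm_p3
# Crude per-processor comparison: throughput divided by processor count.
per_processor_ratio = (earth_sim / 960) / (ibm_p3 / 500)

print(f"IBM P3:          {ibm_p3:.3f} simulated years/day")
print(f"Earth Simulator: {earth_sim:.0f} simulated years/day")
print(f"Overall speedup: {speedup:.0f}x, per processor: {per_processor_ratio:.0f}x")
```

Taken at face value, the Earth Simulator run is several hundred times faster overall, and the gap remains more than two orders of magnitude after allowing for its larger processor count.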
Utilizing thousands of IBM type processors would not help, according to
results from the presentation by Patrick Worley, Oak Ridge National
Laboratory. Scaling would act as a major limiting constraint. This is where
commodity capacity chip systems are becoming problematic. As Walter
Zwieflhofer, from ECMWF said: "Power, cooling and space requirements of large
systems built out of commercial servers have been growing steadily - this is
not sustainable, but this is not the place to write an RFP."
As an aside, the sustained performance on the NCAR workload is around 4.1% of
peak, delivering 213 Gflop/s sustained out of a 5.2 Teraflop/s system. More
revealing is the statistic derived from the NCAR sustained performance
results. It shows that the vector-based Earth Simulator (NEC SX-6) is twice as
cost-efficient (dollars per Gigaflop/s) in both price and electrical power
usage as the IBM P690 P4, when using sustained performance as a measure. Thus,
the myth that commodity chip computers are cheaper has been debunked. (See
details in my next article from CAS2K3).
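The sustained-performance percentage is simply delivered throughput divided by peak. The short sketch below reproduces it from the figures quoted in this article and compares the sustained rates of the two systems; the variable and function names are my own, and the Earth Simulator figure is the "over 12 Teraflop/s" quoted earlier, so the ratio is a lower bound and not a measured result.

```python
def sustained_fraction(sustained_gflops: float, peak_gflops: float) -> float:
    """Fraction of peak performance actually delivered on a workload."""
    return sustained_gflops / peak_gflops


ncar_sustained = 213.0          # Gflop/s sustained on the NCAR workload (quoted)
ncar_peak = 5200.0              # 5.2 Teraflop/s peak, expressed in Gflop/s (quoted)
earth_sim_sustained = 12000.0   # "over 12 Teraflop/s" sustained (quoted, lower bound)

ncar_fraction = sustained_fraction(ncar_sustained, ncar_peak)
sustained_ratio = earth_sim_sustained / ncar_sustained

print(f"NCAR sustained fraction of peak: {ncar_fraction:.1%}")
print(f"Earth Simulator vs NCAR sustained rate: at least {sustained_ratio:.0f}x")
```

Note that this ratio of sustained rates is a different measure from the forty-fold CCSM slowdown Collins quoted, which refers to one application code.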
Dr. Tetsuya Sato, Director of the Earth Simulator, described work on ESM in
Japan, including international collaborations. The Earth Simulator delivers
around 30% sustained performance, i.e. over 12 Teraflop/s. Tetsuya Sato is
already thinking about how to develop new models with a radically different
approach, emulating natural processes using a holistic model. He then
illustrated how nature does not discriminate between macroscopic and
microscopic events and also how natural structures require at least ten
million times more computer power than is currently available on the Earth
Simulator. His vision is to install a new Earth Simulator with much more power
than Moore's Law predicts.
Although Japanese scientists with the NEC SX product line of systems are well
provided with high productivity computers, scientists in the USA are poorly
served by commodity chip based systems. The U.S. is, however, waking up to this
strategic deficiency and is now pursuing the DARPA High Productivity Computing
Systems (HPCS) programme to deliver a Petaflop/s by 2009-10. The White
House has an inter-agency effort underway, the High End Computing
Revitalization Task Force (HEC-RTF), for enabling agencies to submit
coordinated budget requests in this area for fiscal year 2005. The IBM system
to be offered is expected to be at least two generations later than the Power5
technology being proposed for the Blue Planet system. In my view, the
imbalance between processor and memory subsystem, as currently manifested in
the IBM Power 4+ series, would not deliver a Petaflop/s. The Cray system, based
on new generations of their Cray X1 line, is likely to be more promising.
During this workshop a strong emphasis was placed on data management and the
challenges this entails. To illustrate the kind of resources required, data
assimilation in real time often requires more resources than the weather
forecast models. In order to analyse historical observations, data sets that
are as consistent as possible are needed and this can only be done with
international collaborations to incorporate the maximum of the available data.
The most recent effort by ECMWF was the ERA-40 (1957 to 2002) project, using
conventional observations from 1957 and satellite data from 1973. The analysis
system used a 125 km grid and a coupled wave model. The validation and
production phase took 3 years on ECMWF's HPC systems. These projects need to
be completed within the lifetime of one HPC system to avoid the overheads
caused by migration. The resulting data set is close to 40 TB in size.
In the data management area, space-based instruments and high-resolution
models produce huge volumes of data; to use this data effectively, it needs to
be carefully managed. The archives held by centres such as NCAR (~1,000+ TB)
and ECMWF (~800+ TB) count as some of their most valuable assets. Both NCAR
and ECMWF run dedicated data management systems clearly separated from the HPC
resources. Metadata-based access and increasingly faster wide-area network
links open these archives to the wider research community. The data problem is
not insurmountable, but it does require attention and dedicated human
resources.
Next week, I'll summarise the performance issues raised at CAS2K3 and how IBM
users in the Earth System Modelling field are being "short-changed", so watch
this space.
(Brands and names are the property of their respective owners) Copyright:
Christopher Lazou, HiPerCom Consultants, Ltd., UK. Email:
Chris@lazou.demon.co.uk September 2003.
The opinions expressed in this feature are those of the author and do not
necessarily reflect the views of HPCwire.