
Features:
NEC USER GROUP TALKS PETAFLOP/S AND BRAZILIAN SAMBA
By Christopher Lazou, HiPerCom Consultants, Ltd.
Around eighty experts from twelve mostly European countries braved air travel
and SARS to attend the NEC User Group meeting, NUG-XV, in an idyllic coastal
location near Angra dos Reis, Brazil. This four-day meeting was hosted by the
Brazilian weather and climate centre, the Centro de Previsão de Tempo e
Estudos Climáticos (CPTEC). It enabled experts in computing, meteorology and
technical engineering
fields to share their latest research results, crystallise future hardware and
software needs and collectively press NEC to take these on board in its
development plans for new systems. These needs are not only for faster, more
powerful (Petaflop/s) systems for the scientific and technical market, but
increasingly also for data management and storage handling of Petabyte-size
file systems.
The CPTEC mission is to provide Brazil with state-of-the-art weather forecasts
and climate predictions for the benefit of civil society. The centre recently
upgraded its computer system to an NEC SX-6. The final configuration, to be
installed by the end of 2003, will consist of a 12-node (96-processor) NEC
SX-6 with a peak performance of 768Gflop/s, the most powerful supercomputer
system sold by NEC in South America.
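The quoted configuration can be checked with simple arithmetic. The sketch
below uses only the three figures given above; the per-node and per-CPU values
are derived from them and are not stated in the presentation.

```python
# Figures quoted in the article for the final CPTEC configuration.
NODES = 12
PROCESSORS = 96
PEAK_GFLOPS = 768.0

# Derived values (not stated in the article): CPUs per node and peak per CPU.
cpus_per_node = PROCESSORS // NODES
peak_per_cpu = PEAK_GFLOPS / PROCESSORS
peak_per_node = peak_per_cpu * cpus_per_node

print(f"{cpus_per_node} CPUs per node")
print(f"{peak_per_cpu:.1f} Gflop/s peak per CPU")
print(f"{peak_per_node:.1f} Gflop/s peak per node")
```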
Over the last ten years, using the power of NEC SX systems, CPTEC has
modernised weather forecasting in Brazil. It runs the global atmospheric
models developed in the USA and Europe at comparably high resolution. It has
coupled atmospheric, ocean, hydrological and wave models, presently using
100km grids for global calculations, reducing to 40km for regional and 20km
for local predictions. It provides weather forecasts, severe storm and flood
warnings, biomass-burning monitoring and fire-risk mapping, and issues early
warnings to civil defence and government. For example, hydrological monitoring
was used to good effect during the recent energy crisis, as 95% of Brazil's
electricity is hydro-generated.
Listening to the CPTEC presentation, it became clear that they have done a
great job: their daily predictions are published in the national press,
carried on national TV and radio, posted on the Web and widely accepted by
the Brazilian public. As Dr. Paulo Castello Branco, President of NEC Brazil,
said in his welcome address: "NEC Brazil is proud to be working with CPTEC,
recognizing the huge symbiotic power which the joint technology matrix can
produce, from NEC, the state of the art scientific high performance computers,
and from CPTEC the wide human knowledge in atmospheric sciences translated
into the sophisticated computer language codes which describe and foresee the
future of meteorological phenomena based on measurements and probes taken from
the past."
He continued: "Facing this disturbed moment the world is going through, it is
very comfortable and relieving to work with customers who are committed to
social welfare, concerned with bringing social and economic benefits to this
huge and promising country, contributing to the safety and welfare of people
and providing data for the development of the most promising industry in
Brazil, namely, the agriculture business. I look forward to the time when
supercomputing will help us predict the development of societies and the
impact our present actions and investments will bring to future generations.
This will certainly help our governments to focus their priorities to
meaningful programmes for societal development".
With the opening ceremonies over, Jack Dongarra reviewed high performance
computer developments and successes over the years. Using Moore's law and the
Top500 ranking list as templates, he wove a fascinating story of computer
device growth and computer power delivered to the user. For example, an
application which required a whole year to complete its calculations on the
fastest system available in 1980 took 10 hours in 1992, 16 minutes in 1996,
27 seconds on the ASCI White system in 2001 and a mere 5.4 seconds on the
Earth Simulator in 2002. On Linpack the Earth Simulator achieves an
impressive 88% efficiency. One worrying trend is that "the processor-memory
gap" has been widening by 50% per year, causing a deteriorating imbalance
reflected in the Sustained/Peak performance ratio, particularly on scalar
systems.
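To put these run times in perspective, the short sketch below converts them
into speedup factors relative to 1980. The run times are the ones quoted
above; the 365-day year and the reading of "50% per year" as a compounding
rate are assumptions made here for illustration.

```python
SECONDS_PER_YEAR = 365 * 24 * 3600  # assumption: a 365-day year

# Run times quoted in the talk, converted to seconds.
runtimes = {
    1980: SECONDS_PER_YEAR,
    1992: 10 * 3600,
    1996: 16 * 60,
    2001: 27,
    2002: 5.4,
}

def speedups(times, baseline_year):
    """Speedup of each year's run time relative to the baseline year."""
    base = times[baseline_year]
    return {year: base / t for year, t in times.items()}

def gap_growth(rate_per_year, years):
    """Compound widening of a gap that grows by a fixed rate each year."""
    return (1.0 + rate_per_year) ** years

factors = speedups(runtimes, 1980)
for year in sorted(factors):
    print(f"{year}: {factors[year]:,.0f}x faster than 1980")

# Assumption: the 50% per year figure compounds annually.
print(f"Gap after 10 years: {gap_growth(0.5, 10):.1f}x")
```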
He went on to describe projects recently announced in the USA and planned for
completion in the next three years. These included Red Storm at Sandia, a
40Tflop/s peak performance special system using 10,368 2GHz AMD compute
processors with a fast memory and communication switch developed by Cray Inc.;
ASCI Purple, with a projected 100Tflop/s peak, followed by 160Tflop/s; and
Blue Gene/L, with 360Tflop/s peak, to be built by IBM in the 2005/6 time
frame. Although the ASCI Purple and Blue Gene/L systems were announced by the
DOE at SC2002 last November, Jack Dongarra raised doubts as to whether they
will be built on time, as no money has been earmarked to date.
Then he touched on the DARPA High Productivity Computing Initiative, which, as
its name implies, aims to produce high efficiency systems delivering a large
portion, 30% to 40%, of their peak performance to the user application, rather
than the pitiful 2% to 5% of present scalar based MPP systems. This inevitably
leads to an NEC-style parallel "vector" architecture. IBM is already moving in
this direction with the Power5+ chip, which introduces Virtual Vector
Architecture (ViVA). The Cray X1 already has a vector variant of this.
Several vendors, SGI, HP, IBM, Sun and Cray, have been funded to carry out
feasibility studies, from which DARPA will choose the most promising design.
The target is a Petaflop/s system by 2009. The IBM proposal consists of the
Blue Planet system.
One often focuses on scientific and technical applications, but a number of
other fields also use impressive amounts of computing. The Internet search
engine Google is one such example. It currently deals with 150 million
queries per day from over 100 countries. It has 3 billion documents in its
index and over 15,000 Linux systems in 6 data centres, with a peak
performance greater than 15Tflop/s and a capacity of over 1 Petabyte.
Data handling is becoming a headache, but the growth in processor counts is an
even bigger problem in scientific/technical applications. In the Google
processing environment, if a processor fails, the user tries again a few
seconds later. In scientific and technical work, if a processor fails,
recovery is impossible. Fault tolerant systems are essential to enable the
130,000 processors of Blue Planet to function and produce results. This is no
mean task. Both hardware and algorithms need to be reworked. Checkpoint
restart is no solution if saving and restoring calculations takes longer than
the Mean-Time-Between-Failure (MTBF) of an element in the total system.
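The checkpoint argument can be made concrete with a back-of-envelope model. The
sketch below assumes independent processor failures, so that the time between
failures anywhere in the system is roughly the per-processor MTBF divided by
the processor count. The 130,000 figure comes from the talk; the 5-year
per-processor MTBF and the checkpoint times are hypothetical values chosen
only for illustration.

```python
HOURS_PER_YEAR = 365 * 24

def system_mtbf_hours(component_mtbf_hours, n_components):
    """Approximate time between failures anywhere in the system,
    assuming independent, identically reliable components."""
    return component_mtbf_hours / n_components

def checkpointing_is_viable(save_restore_hours, system_mtbf):
    """Checkpoint/restart only helps if a save-and-restore cycle
    fits inside the expected time between failures."""
    return save_restore_hours < system_mtbf

PROCESSORS = 130_000                       # figure quoted for Blue Planet
CPU_MTBF_HOURS = 5 * HOURS_PER_YEAR        # hypothetical: 5 years per CPU

mtbf = system_mtbf_hours(CPU_MTBF_HOURS, PROCESSORS)
print(f"System-wide time between failures: {mtbf * 60:.1f} minutes")

for cycle_hours in (0.1, 0.5):             # hypothetical checkpoint costs
    ok = checkpointing_is_viable(cycle_hours, mtbf)
    print(f"{cycle_hours} h save+restore viable: {ok}")
```

Under these assumed numbers a failure would be expected somewhere in the
machine roughly every twenty minutes, which is why a half-hour checkpoint
cycle would never complete.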
Jack Dongarra concluded by describing how the GRID and automated library
software, in dynamic encapsulation mode, would aid the continued exponential
growth of computing. He said: "We know the concepts of how to improve things,
capture insights, use experience to do what humans do well and automate the
dull stuff. Numerical software will be adaptive, exploratory and intelligent.
Determinism in numerical computing will be gone. After all, it is not
reasonable to ask for exactness in numerical computing. Audit of computation
and reproducibility are things of the past. Stochastic models and adaptability
are the new buzzwords."
Tadashi Watanabe, NEC Vice President and designer of the NEC parallel vector
SX series systems, presented his vision, "Towards Petaflop/s Computing".
Drawing on his long experience, he documented the growth of supercomputing
performance since the start of the SX project twenty years ago. The
achievements, from device fabrication densities to total system performance,
are staggering. Whereas in 1983 the highest peak system performance,
1.3Gflop/s, was achieved by the NEC SX-2, today this has reached 40Tflop/s in
the Earth Simulator, an increase by a factor of thirty thousand. Similarly,
memory size has increased by a factor of forty thousand, and CPU performance
has increased six-fold while its size has decreased by a factor of 6,750.
Today one chip suffices to build a CPU, whereas in 1983 one needed 2500 chips
to build a processor.
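The factor of thirty thousand follows directly from the two peak figures
quoted. The sketch below reproduces it and, as an illustrative extra not taken
from the talk, converts it into an equivalent yearly growth rate, assuming the
twenty-year span mentioned above.

```python
def growth_factor(start, end):
    """Ratio of the final value to the initial value."""
    return end / start

def annual_rate(factor, years):
    """Constant yearly multiplier that compounds to `factor` over `years`."""
    return factor ** (1.0 / years)

PEAK_1983_GFLOPS = 1.3          # quoted peak in 1983
PEAK_TODAY_GFLOPS = 40_000.0    # 40Tflop/s, Earth Simulator
YEARS = 20                      # assumption: the "twenty years" of the SX project

factor = growth_factor(PEAK_1983_GFLOPS, PEAK_TODAY_GFLOPS)
rate = annual_rate(factor, YEARS)

print(f"Peak performance grew by a factor of about {factor:,.0f}")
print(f"Equivalent to roughly {rate:.2f}x per year over {YEARS} years")
```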
The question of whether silicon technology will continue to advance at this
exponential rate was then addressed. Watanabe illustrated that system
performance has outpaced Moore's law, thanks to parallelism and software
improvements. He then set out the many outstanding problems that hardware
designers of future systems must resolve, from on/off chip I/O pads and
optical connections for memory interfaces to the extraction of dissipated
power, expected to rise to about 300 Watts in next generation systems.
He then listed Grand Challenge applications which potentially need Petaflop/s
performance systems. These included biotechnology, protein folding, medical
treatments, automotive, aerospace, environment/climate, energy,
nanotechnology and new material designs using 1,000 to 10,000 atoms.
Using an evolutionary approach and allowing for projected silicon technology
improvements, he outlined a detailed system based on the NEC SX series,
potentially capable of delivering one Petaflop/s of peak performance with only
8,192 CPUs by 2009. This is a much more manageable proposition than the
130,000-plus CPUs envisaged in the IBM Blue Planet solution. He stressed that
high memory bandwidth, with direct optical connections between chips, and
high-speed interfaces between nodes are essential to achieve high system
efficiency, not merely peak performance. Above all, what is needed is money to
fund it and, I would add, a truly free global market in which to sell it.
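The difference between the two approaches shows up in the peak performance
each CPU must supply. The sketch below assumes both designs target exactly one
Petaflop/s of peak and treats 130,000 as the Blue Planet CPU count, although
the figure quoted is a lower bound; the per-CPU numbers are derived here and
were not given in the talk.

```python
PETAFLOP_IN_GFLOPS = 1_000_000.0  # 1 Petaflop/s expressed in Gflop/s

def required_per_cpu_gflops(target_gflops, n_cpus):
    """Peak each CPU must deliver for the system to reach the target."""
    return target_gflops / n_cpus

SX_CPUS = 8_192             # figure quoted for the SX-based design
BLUE_PLANET_CPUS = 130_000  # quoted lower bound for Blue Planet

sx_cpu = required_per_cpu_gflops(PETAFLOP_IN_GFLOPS, SX_CPUS)
bp_cpu = required_per_cpu_gflops(PETAFLOP_IN_GFLOPS, BLUE_PLANET_CPUS)

print(f"SX-based design: {sx_cpu:.1f} Gflop/s per CPU")
print(f"Blue Planet:     {bp_cpu:.1f} Gflop/s per CPU")
print(f"CPU count ratio: {BLUE_PLANET_CPUS / SX_CPUS:.1f}x")
```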
Watanabe went on to talk about post-silicon technology and quantum
entanglement, but that will have to wait for another article.
Apart from technology overviews, many interesting user talks were given,
illustrating fascinating results from the Earth Simulator and other climate
centres. Other talks recounted how computer centres struggle to solve the
ever-burgeoning data handling problem. Both the NEC/Legato scalable parallel
Data Management System at DKRZ and the "scalable global parallel file system"
at HLRS, Germany, were presented. Dr. Sell, computer director of DKRZ, said:
"There are many technical challenges in running an HPC centre for climate
research. I am very happy in using NEC expertise and dedication to attain
effective solutions". This was the overall sentiment of the users I talked to,
and they were even more positive after listening to NEC executives talk, under
non-disclosure conditions, about NEC's future products.
All in all, NUG provided a very fruitful meeting and an enjoyable stay in
Brazil. As Djordge Maric, its President, said: "The user community not only
shared their own research and experiences but also had the opportunity to
leverage technology from all of NEC's businesses for their own benefit. The
fact that NEC committed to continue the SX Series for at least two more
upgrades is reassuring and shows that parallel vector systems have a bright
life".
The next article will report on a very interesting user presentation on an
international study of the Amazon Basin, titled: "Biosphere-Atmosphere
Interactions in Amazonia".