HPCwire
 The global publication of record for High Performance Computing / May 30, 2003: Vol. 12, No. 21


Features:

NEC USER GROUP TALKS PETAFLOP/S AND BRAZILIAN SAMBA
By Christopher Lazou, HiPerCom Consultants, Ltd.

Around eighty experts from twelve, mostly European, countries braved air travel and SARS to attend the NEC User Group meeting, NUG-XV, at an idyllic coastal location near Angra dos Reis, Brazil. This four-day meeting was hosted by CPTEC, the Brazilian weather and climate centre (Centro de Previsão de Tempo e Estudos Climáticos). It enabled experts in computing, meteorology and technical engineering to share their latest research results, crystallise future hardware and software needs, and collectively press NEC to take these on board in its development plans for new systems. These needs are not only for faster, more powerful (Petaflop/s) systems for the scientific and technical market, but increasingly also for data management and storage handling of Petabyte-size file systems.

The CPTEC mission is to provide Brazil with state-of-the-art weather forecasts and climate predictions for the benefit of civil society. The centre recently upgraded its computer system to an NEC SX-6. The final configuration, to be installed by the end of 2003, will consist of a 12-node (96-processor) NEC SX-6 with a peak performance of 768 Gflop/s, making it the most powerful supercomputer system sold by NEC in South America.

Over the last ten years, using the power of NEC SX systems, CPTEC has modernised weather forecasting in Brazil. It runs the global atmospheric models developed in the USA and Europe at comparably high resolution. It has coupled the atmospheric, ocean, hydrological and wave models, presently using 100 km grids for global calculations, reducing to 40 km for regional and 20 km for local predictions. Its services include forecasts of weather, severe storms and floods, biomass-burning monitoring, fire-risk mapping, and early warnings to civil defence and government. For example, hydrological monitoring was used to good effect during the recent energy crisis, as 95% of Brazil's electric power is hydro-driven.

Listening to the CPTEC presentation, it became clear that they have done a great job: their daily predictions are published in the national press, carried on national TV and radio, posted on the Web and widely accepted by the Brazilian public. As Dr. Paulo Castello Branco, President of NEC Brazil, said in his welcome address: "NEC Brazil is proud to be working with CPTEC, recognizing the huge symbiotic power which the joint technology matrix can produce, from NEC, the state of the art scientific high performance computers, and from CPTEC the wide human knowledge in atmospheric sciences translated into the sophisticated computer language codes which describe and foresee the future of meteorological phenomena based on measurements and probes taken from the past."

He continued: "Facing this disturbed moment the world is going through, it is very comfortable and relieving to work with customers who are committed to social welfare, concerned with bringing social and economic benefits to this huge and promising country, contributing to the safety and welfare of people and providing data for the development of the most promising industry in Brazil, namely, the agriculture business. I look forward to the time when supercomputing will help us predict the development of societies and the impact our present actions and investments will bring to future generations. This will certainly help our governments to focus their priorities to meaningful programmes for societal development".

With the opening ceremonies over, Jack Dongarra reviewed high performance computing developments and successes over the years. Using Moore's law and the Top500 ranking list as templates, he wove a fascinating story of the growth in computer devices and in the computing power delivered to the user. For example, an application which required a whole year to complete its calculations on the fastest system available in 1980 took 10 hours in 1992, 16 minutes in 1996, 27 seconds on the ASCI White system in 2001, and a mere 5.4 seconds on the Earth Simulator in 2002. On Linpack the Earth Simulator achieves an impressive 88% efficiency. One worrying trend is that "the processor memory gap" increases by 50% per year, causing a deteriorating imbalance that is reflected in the sustained/peak performance ratio, particularly on scalar systems.
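As a rough check on the run times quoted in the talk, they can be converted into speed-up factors relative to 1980. The sketch below is illustrative only: it assumes "a whole year" means 365 days, and the names `run_times` and `speedups` are invented for this example.

```python
# Run times quoted in the talk, converted to seconds.
# Assumption: "a whole year" is taken as 365 days.
YEAR_SECONDS = 365 * 24 * 3600

run_times = {
    1980: float(YEAR_SECONDS),  # one year
    1992: 10 * 3600.0,          # 10 hours
    1996: 16 * 60.0,            # 16 minutes
    2001: 27.0,                 # ASCI White
    2002: 5.4,                  # Earth Simulator
}


def speedups(times, baseline_year=1980):
    """Return the speed-up of each year's run time over the baseline year."""
    base = times[baseline_year]
    return {year: base / seconds for year, seconds in times.items()}


if __name__ == "__main__":
    for year, factor in sorted(speedups(run_times).items()):
        print(f"{year}: about {factor:,.0f} times faster than 1980")
```

On these figures the 2002 run is several million times faster than the 1980 one, which is the point of the anecdote.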

He went on to describe projects recently announced in the USA and planned for completion in the next three years. These included Red Storm at Sandia, a special system with 40 Tflop/s peak performance using 10,368 2 GHz AMD compute processors, a fast memory and communication switch developed by Cray Inc.; ASCI Purple, with a projected 100 Tflop/s peak, followed by 160 Tflop/s; and Blue Gene/L, with 360 Tflop/s peak, to be built by IBM in the 2005/6 time frame. Although the ASCI Purple and Blue Gene/L systems were announced by the DOE at SC2002 last November, Jack Dongarra raised doubts as to whether they will be built on time, as no money has been earmarked to date.

Then he touched on the DARPA High Productivity Computing Initiative, which, as its name implies, aims to produce high efficiency systems that deliver a large portion of their peak performance, 30% to 40%, to the user application, rather than the pitiful 2% to 5% of present scalar-based MPP systems. This inevitably leads to a parallel "vector" type architecture of the kind NEC builds. IBM is already moving in this direction with the Power5+ chip, which introduces Virtual Vector Architecture (ViVA). The Cray X1 already has a vector variant of this. Several vendors, SGI, HP, IBM, Sun and Cray, have been funded to carry out feasibility studies, from which DARPA will choose the most promising design. The target is a Petaflop/s system by 2009. The IBM proposal consists of the Blue Planet system.

Attention often focuses on scientific and technical applications, but a number of other fields also use impressive amounts of computing. The Internet search engine from Google is one such example. It currently deals with 150 million queries per day from over 100 countries. It has 3 billion documents in its index and runs over 15,000 Linux systems in 6 data centres, with a peak performance greater than 15 Tflop/s and a capacity of over 1 Petabyte.

Data handling is becoming a headache, but the growth in the number of processors is an even bigger problem in scientific and technical applications. In the Google processing environment, if a processor fails, the user tries again a few seconds later. In scientific and technical work, if a processor fails, recovery is impossible. Fault tolerant systems are essential if the 130,000 processors of Blue Planet are to function and produce results. This is no mean task. Both hardware and algorithms need to be reworked. Checkpoint restart is no solution if saving and restoring calculations takes longer than the Mean Time Between Failures (MTBF) of an element in the total system.
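A minimal sketch shows why checkpoint restart struggles at this scale. It assumes independent failures, so that the MTBF of the whole system is roughly the per-processor MTBF divided by the processor count, and it uses Young's first-order approximation for the checkpoint interval. The five-year per-processor MTBF and the ten-minute checkpoint cost are hypothetical values chosen for illustration, not figures from the talk.

```python
import math


def system_mtbf_hours(component_mtbf_hours, n_components):
    """MTBF of the whole system, assuming independent failures
    with the same MTBF for every component (a simplifying assumption)."""
    return component_mtbf_hours / n_components


def young_interval_hours(checkpoint_cost_hours, mtbf_hours):
    """Young's first-order approximation of the checkpoint interval
    that minimises lost time: sqrt(2 * C * MTBF)."""
    return math.sqrt(2.0 * checkpoint_cost_hours * mtbf_hours)


def checkpointing_is_viable(checkpoint_cost_hours, mtbf_hours):
    """A checkpoint that takes at least as long as the expected time
    between failures will rarely complete, so it cannot help."""
    return checkpoint_cost_hours < mtbf_hours


if __name__ == "__main__":
    processors = 130_000           # Blue Planet figure quoted above
    cpu_mtbf = 5 * 365 * 24.0      # hypothetical: 5 years per processor
    checkpoint_cost = 10 / 60.0    # hypothetical: 10 minutes per checkpoint

    mtbf = system_mtbf_hours(cpu_mtbf, processors)
    print(f"System MTBF: {mtbf * 60:.1f} minutes")
    print(f"Young interval: {young_interval_hours(checkpoint_cost, mtbf) * 60:.1f} minutes")
    print(f"Checkpointing viable: {checkpointing_is_viable(checkpoint_cost, mtbf)}")
```

Under these assumed numbers the system as a whole fails every few tens of minutes, so a large share of the run time would go to writing and restoring checkpoints rather than to useful work.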

Jack Dongarra concluded by describing how the GRID and automated library software, in dynamic encapsulation mode, would aid the continued exponential growth of computing. He said: "We know the concepts of how to improve things, capture insights, use experience to do what humans do well and automate the dull stuff. Numerical software will be adaptive, exploratory and intelligent. Determinism in numerical computing will be gone. After all, it is not reasonable to ask for exactness in numerical computing. Audit of computation and reproducibility are things of the past. Stochastic models and adaptability are the new buzzwords."

Tadashi Watanabe, NEC Vice President and designer of the NEC parallel vector SX series systems, presented his vision, "Towards Petaflop/s Computing". Drawing on his long experience, he documented the growth of supercomputing performance since the start of the SX project twenty years ago. The achievements, from device fabrication densities to total system performance, are staggering. Where in 1983 the highest peak system performance, 1.3 Gflop/s, was achieved by the NEC SX-2, today this has reached 40 Tflop/s in the Earth Simulator, an increase by a factor of thirty thousand. Similarly, memory size has increased by a factor of forty thousand, and CPU performance has increased six-fold while CPU size has decreased by a factor of 6,750. Today one chip suffices to build a CPU, where in 1983 one needed 2,500 chips to build a processor.

The question of whether silicon technology will continue to improve at this exponential rate was then addressed. Watanabe went on to illustrate that system performance has outpaced Moore's law, due to parallelism and software improvements. He then set out the many outstanding problems that hardware designers of future systems must resolve, from on/off chip I/O pads and optical connections for memory interfaces to the removal of dissipated heat, with power dissipation expected to rise to about 300 Watts in next generation systems. He then listed Grand Challenge applications which potentially need Petaflop/s performance systems. These included biotechnology, protein folding, medical treatments, automotive, aerospace, environment/climate, energy, nanotechnology, and new material designs using 1,000 to 10,000 atoms.

Using an evolutionary approach and allowing for projected silicon technology improvements, he outlined a detailed system based on the NEC SX series, potentially capable of delivering one Petaflop/s of peak performance with only 8,192 CPUs by 2009. This is a much more manageable proposition than the 130,000-plus CPUs envisaged in the IBM Blue Planet solution. He stressed that high memory bandwidth, with direct optical connections between chips and a high-speed interface between nodes, is essential to achieve high system efficiency, not merely peak performance. Above all, what is needed is money to fund it and, I would add, a truly free global market in which to sell it.
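The CPU counts quoted above imply very different per-processor speeds. The short sketch below works this out, assuming both designs target the same one Petaflop/s peak; the function name `per_cpu_gflops` is invented for this illustration.

```python
PETAFLOP = 1.0e15  # floating-point operations per second


def per_cpu_gflops(total_flops, n_cpus):
    """Peak performance each CPU must supply, in Gflop/s."""
    return total_flops / n_cpus / 1.0e9


if __name__ == "__main__":
    # Assumption: both proposals aim at 1 Petaflop/s peak.
    for label, cpus in (("NEC SX-based design", 8_192),
                        ("IBM Blue Planet", 130_000)):
        print(f"{label}: {cpus:,} CPUs, "
              f"{per_cpu_gflops(PETAFLOP, cpus):.1f} Gflop/s per CPU")
```

On this reading, each CPU in the vector design would need roughly 122 Gflop/s of peak performance, against under 8 Gflop/s per processor in the 130,000-processor design, which is why the memory and interconnect bandwidth stressed in the talk matter so much.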

Watanabe went on to talk about post-silicon technology and quantum entanglement, but that will have to wait for another article.

Apart from technology overviews, many interesting user talks were given, illustrating fascinating results from the Earth Simulator and other climate centres. Other talks recounted how computer centres struggle to solve the ever-burgeoning data handling problem. Both the NEC/Legato scalable parallel Data Management System at DKRZ and the "scalable global parallel file system" at HLRS, Germany, were presented. Dr. Sell, computer director of DKRZ, said: "There are many technical challenges in running an HPC centre for climate research. I am very happy in using NEC expertise and dedication to attain effective solutions". This was the overall sentiment of the users I talked to, and they were even more positive after hearing NEC executives talk, under non-disclosure conditions, about NEC's future products.

All in all, NUG provided a very fruitful meeting and an enjoyable stay in Brazil. As Djordge Maric, its President, said: "The user community not only shared their own research and experiences but also had the opportunity to leverage technology from all of NEC's businesses for their own benefit. The fact that NEC committed to continue the SX Series for at least two more upgrades is reassuring and shows that parallel vector systems have a bright life".

The next article will report on a very interesting user presentation on an international study of the Amazon Basin, titled: "Biosphere-Atmosphere Interactions in Amazonia".

