HPCwire
 The global publication of record for High Performance Computing / June 4, 2004: Vol. 13, No. 22


Features:

PRODUCTIVITY - A TRUE MEASURE OF SUPERCOMPUTING
by Christopher Lazou

May 24-27, 2004, Kiel, Germany: Seventy-five experts from thirteen mostly European countries attended the NEC User Group meeting, NUG-XVI, at the Maritim Hotel Bellevue, which overlooks the Baltic Sea in the northern city of Kiel, Germany. The meeting was hosted by the German Climate Computing Centre, the Deutsches Klimarechenzentrum GmbH (DKRZ), Hamburg. The attendees, and consequently the talks, came mainly from computer centre directors explaining how they manage their computing facilities to support their core business, plus NEC support staff describing new hardware and software developments. Below are a few extracts from these talks, which should be of interest to the supercomputer community.

This get-together enabled experts in computing, meteorology and other technical engineering fields to share their latest research results, crystallise future hardware and software needs, and collectively press NEC to take these on board in its development plans for new systems. These needs are not only for faster, more powerful (petaflop/s) systems for the scientific and technical market, but increasingly also for data management and storage handling of petabyte-size file systems. In short, users want an infrastructure that delivers a total solution.

In recent years meteorology has evolved from its esoteric weather prediction role into a high-profile e-business with enormous commercial potential. The UK Met Office, for example, sells hundreds of different products to a diverse set of clients. Because of global warming the frequency of extreme events has increased, so the Met Office wants to refine its local weather prediction model from a 12 km grid down to 2 km to improve the accuracy of its predictions. The importance of meteorology was reflected in the NUG programme, where the whole of Monday was allocated to the Special Interest Group for Meteorology Applications (SIG-MA). Speakers came from weather and climate centres across the globe: from Australia, Japan and Brazil, and from most of the centres in Europe.
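To give a feel for why a refinement from a 12 km grid to a 2 km grid calls for far more computing power, here is a rough back-of-envelope sketch. It is not taken from the Met Office or from any talk at the meeting. It assumes that the number of vertical levels stays fixed, that the time step must shrink in proportion to the grid spacing (a common stability constraint), and that cost is proportional to grid points times time steps. The function name and parameters are illustrative only.

```python
def refinement_cost_factor(old_spacing_km: float,
                           new_spacing_km: float,
                           refine_time_step: bool = True) -> float:
    """Rough multiplier on compute cost when a horizontal grid is refined.

    Assumptions (illustrative, not from the article):
      * vertical resolution is unchanged,
      * cost scales with (horizontal grid points) x (time steps),
      * the time step shrinks in proportion to the grid spacing.
    """
    ratio = old_spacing_km / new_spacing_km
    horizontal_points = ratio ** 2          # finer in both x and y
    time_steps = ratio if refine_time_step else 1.0
    return horizontal_points * time_steps


# Grid spacings quoted in the article: 12 km refined to 2 km.
print(refinement_cost_factor(12.0, 2.0))                          # points and time steps
print(refinement_cost_factor(12.0, 2.0, refine_time_step=False))  # grid points only
```

Under these simplified assumptions the cost grows with the cube of the refinement ratio, which is why a modest-looking change in grid spacing translates into a demand for a much larger machine.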

The host site, DKRZ, was founded in 1987 with the mission to provide state-of-the-art supercomputing, data handling and associated services, including high-level visualization, to the German scientific community, in order to conduct large-scale earth system and climate modelling.

Three years ago DKRZ upgraded its computer systems, focusing on productivity, and became one of Europe's fastest supercomputer facilities in production at that time for climate research. It purchased the then latest supercomputers from NEC, the SX-6 series, with half a teraflop/s sustained performance, together with a unified data management system based on the Intel IA-64 (Itanium) architecture and Linux.

With the advent of new technology, one trend in high performance computing is the fusion of computation, simulation and data analysis. With advanced satellite technology delivering massive data streams in the earth systems and climate area, the challenges and opportunities for fusing observational and/or experimental data with classical simulation have increased enormously.

To address this new reality, DKRZ developed a unified concept capable of delivering a total solution with transparent access for the climate user community. In addition to the high-performance compute servers, an integrated distributed data management system was specified as an essential part of this upgrade. To achieve this, new hardware and software had to be put in place to support the intensive numerical calculations, the high networking demands and a scalable, unified shared file system and archive to handle the massive volume of newly generated data. As Dr. Wolfgang Sell, director of DKRZ, said: "Vector architecture machines deliver data to the application on time."

Another speaker, Dr. Francois Mescam, director of computing and networking at the French aerospace research agency (ONERA), whose mission is equivalent to that of NASA in the USA, explained why, for aerospace applications, productivity could only be achieved by using vector parallel supercomputer systems.

ONERA is the scientific and technical government agency reporting to the French Ministry of Defence and employs some 2,000 people. Established over 50 years ago, ONERA has been actively involved in all major French and European aerospace programs, including Mirage, Rafale, Concorde and Airbus (including the latest super jumbo-jet, the A380, which it is claimed can carry up to 800 passengers, 555 in its normal configuration), space vehicles such as Ariane, and many more. Its partners include large companies such as SNECMA (aircraft engines), THALES, Airbus France, Eurocopter and others. Headquartered in Châtillon, in the Paris suburbs, the agency has eight sites throughout France, including Palaiseau (Paris region), the Toulouse Research Centre and the Modane Wind Tunnels in the French Alps.

With ONERA in France, NLR in Holland and DLR in Germany currently using NEC SX vector supercomputers, it appears that, in Europe at least, the aerospace industry has decided that the NEC SX series, with its vector/parallel architecture and large shared memory, provides the extra power and delivers unparalleled productivity, the true measure of supercomputing. This has enabled engineers to achieve efficiencies in design and gain a competitive edge. The use of the same system across the European consortium made integration much easier. As Dr. Gerard Hameetman, the director of computing at NLR, told me over dinner: "Using the NEC SX-6, NLR developed new light antiglare material, used between the cockpit and the wings of the Airbus A380 plane and this saved 1,000 kg, the equivalent of 10 passengers."

To digress a little, cost savings achieved by the use of supercomputers, and the integration of all design functions and production using a virtual enterprise model, enabled European companies to consolidate their position in the market. For example, Airbus is now the largest manufacturer of civilian aircraft.

The competition between Boeing and Airbus is putting enormous pressure on both companies to cut costs. As 80% of costs are fixed by the preliminary design chosen, it is important to cut costs at the margin. Today the building of the Airbus involves more than 50,000 people across Europe. The design teams are merging their work, sharing designs and risks, whilst people are working in different countries.

The dream in Europe is for the virtually designed aircraft to be a realistic one. For example, in the last five years there was a drive to reduce design times for the aircraft wing from two years to less than one month, but achieving this required more than a hundredfold improvement in human efficiency. This is why high performance supercomputing moved centre stage and became a critical and essential element in the design process.

In fact the aerospace industry has benefited enormously from using parallel vector supercomputers and is one of the industries that provide the financial impetus for sustaining the development of shared memory vector machines, such as the NEC SX series. It was also a pressure point for the recent resuscitation of Cray Inc. in the USA.

In most high technology industries the margin between success and failure is very narrow. In the past three decades very few commercial aircraft were successful enough to make their manufacturer a profit. The Boeing aircraft and the European Airbus are notable exceptions. The economics of aircraft operation are such that even a small improvement in efficiency can translate into substantial savings in operating costs. Therefore, the operating efficiency of an aeroplane is a major attraction to potential buyers, and aircraft manufacturers have a compelling incentive to design the most efficient aircraft to operate. That Boeing lost its primacy in civilian aircraft sales and has now been overtaken by the European Airbus is probably due in part to design constraints arising from the unavailability of modern vector parallel systems in the USA during the last eight years, until the recent arrival of the Cray X1.

Prof. Dr.-Ing. Michael Resch, director of the High Performance Computing Centre in Stuttgart (HLRS), gave a presentation titled: "Vector Systems: The key to sustained Teraflop/s". This is not wishful thinking, since Stuttgart recently completed a competitive procurement and purchased a new NEC SX series system with four teraflop/s sustained performance. HLRS supports users from R&D in the use of leading edge supercomputer technology and its applications. The mission of HLRS is to provide its users with the tools and expertise to achieve top international positions in their research fields. Capabilities and economies of scale are possible at the high end through joint operation of the supercomputer systems with T-Systems Solutions for Research GmbH and Porsche AG, who provide 25% of the HLRS income. For Porsche the benefit of using supercomputers is quantifiable, namely as the shortest time to deliver a solution for crash analysis.

Michael Resch set the tone on the state of supercomputing by showing a slide with the following two quotes:

"Computational scientists have seen a frustrating trend of stagnating application performance despite dramatic increases in the claimed peak capability of HPC Systems". Oliker et al, 2004, LBNL, USA

"The fact that there has been no fundamental advance in high-performance capability computers in the last eight years has forced these communities to adapt less qualified commercial offerings to the solution of their problems". Vincent Scarafino Manager, Numerically Intensive Computing, Ford Motor Company, Before the Committee on Science, US House of Representatives, July 16, 2003.

Professor Resch then shared findings from the recent HLRS procurement exercise. HLRS surveyed many supercomputer sites and analysed performance results for three types of systems: super-scalar microprocessors, clusters and parallel vector processors. The price/performance graphs showed what some of us have been saying for years, namely that: "Using a price/peak performance metric, vector systems appear to be more expensive, but are much better performing when using a price/sustained performance measure". When the hidden costs of reliability, operating staff, electrical power, cooling and machine room space are taken into account, the Total Cost of Ownership (TCO) measure was very clear: "The cost for achieving one Teraflop/s is much more expensive for microprocessor systems, a factor of two compared to the NEC SX series, parallel vector systems".
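The difference between the two metrics in the first quote can be made concrete with a small sketch. The figures below are invented purely for illustration and are not the HLRS survey data. The class name, the fields and the sustained fractions are all assumptions, and hidden TCO costs such as power, cooling and staff are deliberately left out.

```python
from dataclasses import dataclass


@dataclass
class System:
    name: str
    price: float               # purchase price, arbitrary units (hypothetical)
    peak_tflops: float         # vendor-quoted peak performance
    sustained_fraction: float  # share of peak reached on real applications (hypothetical)

    def sustained_tflops(self) -> float:
        return self.peak_tflops * self.sustained_fraction

    def price_per_peak(self) -> float:
        return self.price / self.peak_tflops

    def price_per_sustained(self) -> float:
        return self.price / self.sustained_tflops()


# Invented figures, chosen only to show how the ranking can flip.
vector = System("vector", price=10.0, peak_tflops=1.0, sustained_fraction=0.5)
micro = System("microprocessor", price=4.0, peak_tflops=1.0, sustained_fraction=0.125)

for s in (vector, micro):
    print(f"{s.name}: {s.price_per_peak():.1f} per peak Tflop/s, "
          f"{s.price_per_sustained():.1f} per sustained Tflop/s")
```

With these made-up numbers the microprocessor system looks cheaper per peak teraflop/s but more expensive per sustained teraflop/s. That reversal is what the quoted statement describes, although the real size of the gap depends entirely on measured application performance.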

The recent development of putting two cores on a chip increases CPU speed, but the memory gap gets worse. The survey found that most users in HPC use 512=>1000 CPUs, that capability is the reserve of vector systems, and that throughput can be achieved using clusters. The message is: "Use the right architecture for the right task".

At this point I could not help but muse: "USA microprocessor and cluster vendors promise a system to climb Everest and deliver a flight ticket to Kathmandu".

Apart from technology overviews, including a fascinating one from Mr. T. Kondo, CEO of NEC Solutions, many interesting user talks were given, illustrating impressive results from supercomputing centres. Other talks recounted how computer centres struggle to solve the ever-burgeoning data handling problem. Peter Haas of HLRS gave a talk titled: "Advanced Network Technology and Parallel File Systems", describing the latest thinking on how the international community is tackling this problem.

Dr. Sell, computer director of DKRZ, summed up by saying: "There are many technical challenges and unresolved IT-Issues in Earth Systems Modelling. I am very happy in using NEC expertise and dedication to attain effective solutions". This sentiment was even more positive after users heard NEC executives speak, under non-disclosure conditions, about NEC's future products.

All in all, NUG provided a fruitful meeting and an enjoyable stay in Kiel. The user community not only shared their own research and operational experiences, but also had the opportunity to leverage technology from all of NEC's businesses for their own benefit.

The next article will report on an exclusive interview I had with Mr. Tadashi Watanabe, NEC Vice President and designer of the NEC parallel vector SX series systems. The interview explores his views on the state of the industry and his vision for achieving petaflop/s computing.

(Brands and names are the property of their respective owners) Copyright: Christopher Lazou, HiPerCom Consultants, Ltd., UK. June 2004.

