HPCwire
 The global publication of record for High Performance Computing - LIVEwire Edition / November 9, 2004: Vol. 13, No. 45A


LIVEwire News Briefs:

Microway Announces Diagnostic Software For MPI Clusters

Microway showcased MPI Link-Checker at SC2004. The software is a diagnostic tool that finds underperforming nodes in MPI-based clusters. Formerly available as a beta release and free download on Microway's website, the new commercial version contains enhancements that make it possible to detect hard-to-find problems in large clusters, as well as to look for bad cables, motherboards, NICs, switches, BIOSes and OSes in real time. A new data collection facility makes it possible to probe the cluster offline and then analyze the collected data at a later date.

Finding a bad node in a large cluster is not a trivial problem. MPI Link-Checker can collect hundreds of megabytes of performance data on a cluster over a four-day validation burn-in, and sifting through this information quickly requires the right tool. The new release simplifies the task. It makes it possible to drill down into the analysis grids generated by large clusters; dynamically view plots of transfer time and bandwidth versus packet size for all the nodes in an analysis matrix; reduce analysis time by breaking large clusters into groups of nodes; and select the statistical method used to view the data. This last feature, when combined with offline collection, makes it possible to isolate intermittent problems that were previously impossible to find.
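To illustrate the kind of analysis described above, here is a minimal Python sketch of flagging an underperforming node from a node-to-node bandwidth grid. It is not Microway's implementation: the function name `flag_slow_nodes`, the `threshold` parameter, and the choice of a median-based outlier test are all assumptions made for illustration, and the sample grid is synthetic.

```python
from statistics import median


def flag_slow_nodes(bandwidth, threshold=3.0):
    """Return indices of nodes whose typical link bandwidth is unusually low.

    bandwidth[i][j] is the measured bandwidth (for example in MB/s) from
    node i to node j. Diagonal entries are ignored.
    Hypothetical helper for illustration only.
    """
    n = len(bandwidth)
    scores = []
    for i in range(n):
        # Gather every link that touches node i, in both directions.
        links = [bandwidth[i][j] for j in range(n) if j != i]
        links += [bandwidth[j][i] for j in range(n) if j != i]
        scores.append(median(links))

    center = median(scores)
    # Median absolute deviation: a spread estimate that one bad node
    # cannot distort the way it would distort a standard deviation.
    mad = median(abs(s - center) for s in scores)
    if mad == 0:
        # At least half the nodes score identically; flag anything below them.
        return [i for i, s in enumerate(scores) if s < center]
    return [i for i, s in enumerate(scores) if (center - s) / mad > threshold]


# Synthetic 5-node grid: healthy links run near 100 MB/s, while every
# link touching node 3 runs at 40 MB/s.
n, bad = 5, 3
bw = [[40.0 if bad in (i, j) else 100.0 + (i - j) for j in range(n)]
      for i in range(n)]
print(flag_slow_nodes(bw))  # [3]
```

The median is used here because a single slow node drags down the links of every node it talks to; a robust statistic keeps those healthy nodes from being flagged along with it. Swapping in a different statistic is a one-line change, which is the sort of choice the selectable statistical method in the release appears to address.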

"MPI Link-Checker," commented Stephen Fried, president of Microway, "is the first HPC product that makes it possible to diagnose really hard-to-find cluster failures, like intermittent cables that are not properly seated, while at the same time being able to spot problems in MPI itself that are the result of inefficient device drivers or the wrong choice of parameters, such as the transition point between the Eager and Rendezvous protocols. MPI Link-Checker will become an essential tool for all MPI-based Linux clusters."

