HPCwire
 The global publication of record for High Performance Computing - LIVEwire Edition / November 20, 2003: Vol. 10, No. 3


Features:

QUESTIONS AND ANSWERS FROM ANNAPOLIS MICRO SYSTEMS INC
By Alan Beck, Editor-in-Chief, HPCwire

Q1: We are beginning to see a lot of interest in using FPGAs for high performance applications. Just what is an FPGA, and what can it do for us?

A1: FPGA stands for Field Programmable Gate Array. FPGA-based processing is also known as Adaptive Computing or Reconfigurable Computing. An FPGA is basically a chip that contains a large number of very low-level hardware components. The chip is loaded on the fly, in real time, with a design file produced off-line. That file turns the chip into a customized parallel processor crafted for a particular application, so the application runs in hardware at hardware speeds, far faster than software on a generic processor could achieve. FPGA design files are very similar in concept and in development to Application Specific Integrated Circuit (ASIC) files. The FPGA acts as a custom-designed chip that performs the exact processing needed at a particular point in the application. Because the chip is reconfigurable, a completely different design for a completely different part of the application can be downloaded in real time. FPGA-based processing can outperform conventional processors, sometimes by as much as 50 times in a board-for-board comparison, resulting in significant improvements in processing speed, size, weight, power and cost.

Data can be processed in real time, on site, saving the time and money involved in data collection and off-site processing. The processing can be modified simply by reconfiguring the chip (downloading a different FPGA file) to fix bugs, to adapt to a new set of interface requirements, or to change the processing in response to things found in the data itself. New applications can be delivered in place, with no on-site human intervention, by any means of file transfer, including network, hard drive storage, smart card or wireless modem.

Q2: How does this relate to supercomputers?

A2: Many high performance applications that in the past could only be run on very expensive supercomputers are now moving out into clusters, high-end general purpose computers and PCs. This will increase as processor technology moves forward, and as users seek cheaper, faster, better solutions to their problems. PCI-based FPGA processing boards fit into these systems, and are increasingly being used to speed up repetitious parts of the processing.

The FPGAs themselves are now big enough (up to 8 million gates per chip) and fast enough (up to 150 MHz) to perform real meaningful work. There is room in the FPGAs for floating point and matrix calculations and corner turns. Users are finally able to attack true supercomputer type applications.

The Air Force Research Lab at Rome, New York, took delivery in February 2003 of a 48-node Heterogeneous High Performance Beowulf Cluster. Each node has dual Xeon processors and one Annapolis WILDSTAR II Adaptive Computing board with 2 Xilinx Virtex II 6000-4 FPGAs and an LVDS I/O card for interconnecting the cluster nodes. The acceptance test was to demonstrate a 34 Teraops FIR Filter. The majority of the processing for this application took place on the FPGAs.
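To put the cluster figures in perspective, the quoted numbers can be divided out per chip. The sketch below is back-of-the-envelope arithmetic only: it assumes the full 34 Teraops is spread evenly across every FPGA (the answer says only that the majority of the work ran there) and that each chip runs at the 150 MHz upper bound mentioned earlier. Neither assumption comes from the acceptance test itself.

```python
# Figures quoted in the answer above.
NODES = 48
FPGAS_PER_NODE = 2
TOTAL_TERAOPS = 34.0
ASSUMED_CLOCK_MHZ = 150.0  # assumption: the "up to 150 MHz" upper bound


def per_fpga_gigaops(total_teraops, nodes, fpgas_per_node):
    """Average rate per FPGA, assuming an even split across all chips."""
    return total_teraops * 1000.0 / (nodes * fpgas_per_node)


def ops_per_clock(gigaops, clock_mhz):
    """Operations that must complete in parallel on each clock cycle."""
    return gigaops * 1e9 / (clock_mhz * 1e6)


total_fpgas = NODES * FPGAS_PER_NODE
rate = per_fpga_gigaops(TOTAL_TERAOPS, NODES, FPGAS_PER_NODE)
parallel_ops = ops_per_clock(rate, ASSUMED_CLOCK_MHZ)

print(f"{total_fpgas} FPGAs, about {rate:.0f} Gigaops each")
print(f"roughly {parallel_ops:.0f} operations per clock cycle per chip")
```

Under those assumptions each chip would have to complete a couple of thousand operations on every clock tick, which shows why the speed comes from parallelism on the chip rather than from clock rate.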

Q3: How do you design applications for and use an FPGA based Processing Board?

A3: Think of the FPGA as an Attached Processor. The main part of the application runs on the host computer, with C or Java Programming API calls to configure, load and use the FPGA for processing intensive operations. FPGA configurations are files produced at design time and stored in Flash on the board, in host memory, on disk, or elsewhere on the network.
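The attached-processor pattern described above can be sketched in a few lines. This is not the Annapolis API. `MockFpgaBoard`, `configure` and `run` are hypothetical names, and plain Python functions stand in for FPGA design files so that the sketch runs without any hardware.

```python
class MockFpgaBoard:
    """Software stand-in for an FPGA board. Every name here is hypothetical."""

    def __init__(self):
        self._design = None

    def configure(self, design):
        # A real board would load a design file produced at design time.
        # Here a Python callable plays the role of that file.
        self._design = design

    def run(self, data):
        if self._design is None:
            raise RuntimeError("board has not been configured")
        return self._design(data)


def scale_design(samples):
    """Stand-in for one FPGA design: double every sample."""
    return [2 * s for s in samples]


def offset_design(samples):
    """Stand-in for a second design: add one to every sample."""
    return [s + 1 for s in samples]


# The host program owns the control flow and hands heavy steps to the board.
board = MockFpgaBoard()
board.configure(scale_design)
scaled = board.run([1, 2, 3])

# Reconfigure the same board for a different stage of the application.
board.configure(offset_design)
shifted = board.run(scaled)

print(scaled, shifted)
```

The point of the sketch is the shape of the host code: configure, run, reconfigure, run again, with the main application logic staying on the host.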

Previously, FPGAs have been designed using the type of tools used by ASIC designers, including schematics, VHDL and Verilog. With the current gate count at upward of 32 million gates per PCI slot, there are simply not enough top-notch ASIC designers in the world to do all the FPGA designs that industry requires, and most users cannot afford to wait the 10 to 12 calendar months that these methodologies require. In addition, most high performance processing is first designed by algorithm experts. What is needed is a tool that can be used by the algorithm designer to very quickly implement and test his ideas directly on the FPGAs.

Annapolis developed CoreFire to meet this need. The designer drags and drops icons to add and connect our very high performance optimized modules into his FPGA design, runs a design check, and then automatically goes into the Xilinx place and route tools. The resulting FPGA design file, with its associated Java and C files, can then be run directly on the FPGA itself, for Hardware in the Loop Debugging. The tool manages all the chip-level hardware automatically and correctly. The combination of a very quick and easy to use GUI, a data flow approach to the problem, a very large number of high performance IP modules, the automatic generation of the logic necessary to control the interfaces between the modules, and Hardware in the Loop Debugging provides an exceedingly convenient and fast methodology for developing FPGA application files.

The CoreFire Design Suite makes it possible for users to develop FPGA Application files at the rate of 50,000 to 2 million gates per hour. This is in stark contrast to the average 1,000 to 20,000 for a VHDL user.
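The difference between those rates is easier to see as elapsed design time. The sketch below applies the quoted ranges to an 8 million gate design, the largest chip size mentioned earlier. It assumes the VHDL figure is also in gates per hour, which the answer does not state explicitly, so the VHDL rows are illustrative only.

```python
DESIGN_GATES = 8_000_000  # largest per-chip gate count quoted in the article


def design_hours(gates, gates_per_hour):
    """Hours needed to produce a design of the given size at a given rate."""
    return gates / gates_per_hour


# Quoted CoreFire range: 50,000 to 2 million gates per hour.
corefire_best = design_hours(DESIGN_GATES, 2_000_000)
corefire_worst = design_hours(DESIGN_GATES, 50_000)

# Quoted VHDL range: 1,000 to 20,000 (assumed here to be gates per hour).
vhdl_best = design_hours(DESIGN_GATES, 20_000)
vhdl_worst = design_hours(DESIGN_GATES, 1_000)

print(f"CoreFire: {corefire_best:.0f} to {corefire_worst:.0f} hours")
print(f"VHDL:     {vhdl_best:.0f} to {vhdl_worst:.0f} hours")
```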

Q4: What kinds of applications can benefit from FPGA performance?

A4: The area in which the most work has been done to date is probably digital signal processing with FFTs and FIR filters, including radar, COMINT, SIGINT, communications, various compression and decompression schemes, software radio and image processing. FPGAs are also being used in many other applications, including encryption, various biotechnology initiatives, fingerprint and facial recognition, etc.
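For readers unfamiliar with the FIR filters mentioned above, here is a minimal software reference. It is a plain direct-form filter written for clarity, not a description of how any Annapolis core is built. The function name and the sample values are chosen purely for illustration.

```python
def fir_filter(samples, taps):
    """Direct-form FIR: each output is a weighted sum of recent inputs."""
    out = []
    for n in range(len(samples)):
        acc = 0.0
        for k, tap in enumerate(taps):
            # Inputs before the start of the stream are treated as zero.
            if n - k >= 0:
                acc += tap * samples[n - k]
        out.append(acc)
    return out


# A two-tap moving average smooths a rising ramp.
smoothed = fir_filter([2.0, 4.0, 6.0, 8.0], [0.5, 0.5])
print(smoothed)  # [1.0, 3.0, 5.0, 7.0]
```

The multiplications in the inner loop do not depend on one another. Software performs them one at a time, while hardware can lay out one multiplier per tap and compute them all on the same clock, which is why this kind of workload maps well to an FPGA.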

The latest release of CoreFire includes cores particularly appropriate for supercomputer applications, such as matrix computations, corner turns, array types and variable precision floating point.
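A corner turn, in signal processing terms, is a matrix transpose: data that arrived row by row is rearranged so the next stage can read it column by column. The sketch below shows the operation in software only. It says nothing about how the CoreFire core is implemented, and the function name is illustrative.

```python
def corner_turn(matrix):
    """Transpose a list-of-rows matrix so rows become columns."""
    return [list(column) for column in zip(*matrix)]


# Two rows of three samples (for example, two pulses of three range bins)
# become three rows of two samples.
pulses = [
    [1, 2, 3],
    [4, 5, 6],
]
turned = corner_turn(pulses)
print(turned)  # [[1, 4], [2, 5], [3, 6]]
```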

Q5: With all this high speed processing, how do you get the data into the processing elements?

A5: We support a rapidly growing number of I/O standards, with COTS daughter cards that fit onto our COTS main boards. Most of these daughter cards also have Virtex II or Virtex II Pro FPGAs, to add more processing power. Current offerings include Quad Fibre Channel 2 to transfer data between cluster nodes and disks, RAIDs, JBODs and sensors, GBit Ethernet I/O, LVDS with up to 1.5 GBytes/sec per I/O card, and our WILDSTAR Data Port (WSDP), which has up to 6 ports per I/O card, with 1.2 GBytes/port.
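The quoted I/O figures can be turned into rough peak rates and transfer times. The sketch below assumes the WSDP figure means 1.2 GBytes/sec per port and that all six ports can run at full rate at once. Neither assumption is stated in the answer, and the results are peak numbers, not measured throughput.

```python
# Figures quoted in the answer above.
WSDP_PORTS_PER_CARD = 6
WSDP_GBYTES_PER_PORT = 1.2  # assumption: interpreted as GBytes/sec per port
LVDS_GBYTES_PER_SEC = 1.5


def aggregate_rate(ports, rate_per_port):
    """Peak card rate if every port runs at full speed simultaneously."""
    return ports * rate_per_port


def transfer_seconds(gbytes, rate_gbytes_per_sec):
    """Time to move a block of data at a sustained rate."""
    return gbytes / rate_gbytes_per_sec


wsdp_rate = aggregate_rate(WSDP_PORTS_PER_CARD, WSDP_GBYTES_PER_PORT)
lvds_time = transfer_seconds(3.0, LVDS_GBYTES_PER_SEC)

print(f"WSDP card peak: {wsdp_rate:.1f} GBytes/sec")
print(f"3 GBytes over one LVDS card: {lvds_time:.1f} s")
```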

Q6: Tell us a little bit about Annapolis -- what is your history and where are you now?

A6: We like to call ourselves "The World Leaders in FPGA Based High Performance Processing."

We are currently delivering our 8th Generation of FPGA-based COTS hardware and software solutions for VME, PCI, PMC and CardBus, with a large, growing list of I/O options, including 1.5 GHz A/D, 105 MHz A/D, Fiber Optic G-Link, Fibre Channel 2, Ethernet, WSDP and FPDP, all using the latest Xilinx Virtex II and/or Virtex E FPGAs. We host our boards on a large number of operating systems, including Win NT, Win2000, Linux, DEC Alpha, Solaris, QNX and VxWorks. We support our board products with a standardized set of drivers, APIs, and VHDL simulation models. Our CoreFire FPGA Application Design Suite makes it easy for users to develop high performance applications. We also offer training, including customized application development and customer support.

We are a privately held woman-owned small business, located in Annapolis, Maryland. All our products are U.S. made, including board assembly, which we do in house.

We began in 1982 as a contract design house, and worked for 10 years with an IBM fellow, designing many aspects of touch screen technology implementations, including ASIC and FPGA application design. Other customers during that time included Schlumberger, Comsat, Intel, ARINC and many others. In 1994 we licensed the technology for SPLASH II (an FPGA based processing system) from NSA, and since that time have moved away from "design for hire" to become a product company.

Q7: What will we see out of Annapolis in the future?

A7: We have a number of initiatives we are pursuing in the Short Term.

a.) We plan to work with the Rome Lab users and others to characterize and improve the ease of use of our newly released CoreFire modules for "supercomputer type" applications.

b.) You will see our cards in a growing number of clusters and larger machines.

c.) You will also see our cards move out into the field in more applications, providing cheaper, faster, more useful alternatives for processing data in real time in the field.

d.) Our 9th Generation of boards, using Xilinx's latest FPGA, the Virtex II Pro, will become available in the spring.

e.) We will enhance the functionality of our Quad FC2 cards, by adding file capability.

f.) We will release more and faster I/O Cards, including a Dual 1.5 GHz A/D, a Quad 105 MHz A/D, a Synchronization Board for A/Ds, and up to 8 GBytes of Synchronous DRAM on a daughter card.

For the Longer Term, we will continue to pursue bigger, faster, cheaper FPGAs, more and faster memory, more and faster I/O, and new application areas.

