HPCwire
 The global publication of record for High Performance Computing / February 27, 2004: Vol. 13, No. 8

Features:

U.S. MOVING AHEAD ON DATA-INTENSIVE INTELLIGENCE

The U.S. government is pressing ahead with research to create ultrapowerful tools to mine millions of public and private records for information about terrorists. Although Congress eliminated a Pentagon office that had been developing this terrorist-tracking technology, key projects from retired Adm. John Poindexter's Total Information Awareness effort were transferred to U.S. intelligence offices, congressional, federal and research officials told The Associated Press.

In addition, Congress left undisturbed a separate but similar $64 million research program run by the Advanced Research and Development Activity, or ARDA, that has used some of the same researchers as Poindexter's program.

Poindexter aimed to predict terrorist attacks by identifying telltale patterns of activity in arrests, passport applications, visas, work permits, driver's licenses, car rentals and airline ticket buys as well as credit transactions and education, medical and housing records.

Disturbed by the program's privacy implications, Congress last fall closed Poindexter's office, part of the Defense Advanced Research Projects Agency, and barred the agency from continuing most of his research. Poindexter quit the government, saying that his work had been misunderstood.

But the work did not die: Congress quietly agreed to continue paying to develop highly specialized software to gather foreign intelligence on terrorists.

In a classified section summarized publicly, Congress added money for this software research to the "National Foreign Intelligence Program," without identifying openly which intelligence agency would do the work. It said, for the time being, products of this research could only be used overseas or against non-U.S. citizens in this country.

Congressional officials would not say which Poindexter programs were killed and which were transferred. People with direct knowledge of the contracts told the AP that the surviving programs included some of 18 data-mining projects known in Poindexter's research as Evidence Extraction and Link Discovery.

Poindexter's office described that research as "technology not only for 'connecting the dots' that enable the U.S. to predict and pre-empt attacks but also for deciding which dots to connect." Ted Senator, who managed that research for Poindexter, told government contractors that mining data to identify terrorists "is much harder than simply finding needles in a haystack."

"Our task is akin to finding dangerous groups of needles hidden in stacks of needle pieces," he said. "We must track all the needle pieces all of the time."

Among Senator's 18 projects, the work by researcher Jensen shows how flexible such powerful software can be. Jensen used two online databases, the Physics Preprint Archive and the Internet Movie Database, to develop tools that would identify authoritative physics authors and would predict whether a movie would gross more than $2 million in its opening weekend.

Jensen said in an interview that Poindexter's staff liked his research because the data involved "people and organizations and events ... like the data in counterterrorism."

At the University of Southern California, professor Craig Knoblock said he developed software that automatically extracted information from travel Web sites and telephone books and tracked changes over time.

ARDA, the research and development office, sponsors corporate and university research on information technology for U.S. intelligence agencies. It is developing computer software that can extract information from databases as well as text, voices, other audio, video, graphs, images, maps, equations and chemical formulas. It calls its effort "Novel Intelligence from Massive Data."

The office said it has given researchers no government or private data and obeys privacy laws.

The project is part of its effort "to help the nation avoid strategic surprise ... events critical to national security ... such as those of Sept. 11, 2001," the office said.

Poindexter had envisioned software that could quickly analyze "multiple petabytes" of data. The Library of Congress has space for 18 million books, and one petabyte of data would fill it more than 50 times. One petabyte could hold 40 pages of text for each of the world's more than 6.2 billion people.
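The comparisons above can be checked with simple arithmetic. The sketch below assumes a decimal petabyte (10^15 bytes), which the article does not specify; the function names are illustrative only. It works out how many bytes each page and each book would have to occupy for the stated figures to hold.

```python
# Rough check of the petabyte comparisons quoted in the article.
# Assumption (not stated in the article): 1 petabyte = 10**15 bytes.
PETABYTE_BYTES = 10**15

WORLD_POPULATION = 6.2e9      # "more than 6.2 billion people"
PAGES_PER_PERSON = 40         # "40 pages of text" per person
LIBRARY_BOOKS = 18_000_000    # "space for 18 million books"
LIBRARY_FILLS = 50            # "fill it more than 50 times"


def bytes_per_page(total_bytes=PETABYTE_BYTES,
                   people=WORLD_POPULATION,
                   pages=PAGES_PER_PERSON):
    """Bytes available per page if one petabyte is split evenly."""
    return total_bytes / (people * pages)


def bytes_per_book(total_bytes=PETABYTE_BYTES,
                   books=LIBRARY_BOOKS,
                   fills=LIBRARY_FILLS):
    """Bytes available per book if one petabyte fills the library 50 times."""
    return total_bytes / (books * fills)


if __name__ == "__main__":
    print(f"bytes per page: {bytes_per_page():,.0f}")
    print(f"bytes per book: {bytes_per_book():,.0f}")
```

Under that assumption the figures imply roughly 4 kilobytes per page and a little over 1 megabyte per book, both plausible sizes for plain text.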

ARDA said its software would have to deal with "typically a petabyte or more" of data. It noted that some intelligence data sources "grow at the rate of four petabytes per month." Experts said those probably are files with satellite surveillance images and electronic eavesdropping results.
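To give a sense of scale, a growth rate of four petabytes per month can be converted into a sustained data rate. The sketch below assumes a decimal petabyte (10^15 bytes) and a 30-day month, neither of which is stated by ARDA; the function name is hypothetical.

```python
# Convert a monthly data growth figure into an average sustained rate.
# Assumptions (not from ARDA): decimal petabytes and a 30-day month.
PETABYTE_BYTES = 10**15
SECONDS_PER_DAY = 24 * 60 * 60


def sustained_rate_bytes_per_sec(petabytes_per_month, days_per_month=30):
    """Average bytes per second needed to accumulate the given monthly volume."""
    seconds = days_per_month * SECONDS_PER_DAY
    return petabytes_per_month * PETABYTE_BYTES / seconds


if __name__ == "__main__":
    rate = sustained_rate_bytes_per_sec(4)
    print(f"{rate / 1e9:.2f} gigabytes per second")
```

Under those assumptions, four petabytes per month works out to roughly 1.5 gigabytes arriving every second, around the clock.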

The Poindexter and ARDA projects are vastly more powerful than other data-mining projects such as the Homeland Security Department's CAPPS II program to classify air travelers or the six-state Matrix anti-crime system financed by the Justice Department.

In September 2002, ARDA awarded $64 million in contracts covering 3 1/2 years. The contracts went to more than a dozen companies and university researchers, including at least six who also had worked on Poindexter's program.

Congress had thrown these researchers into turmoil. Doug Lenat, the president of Cycorp in Austin, Texas, would not discuss his work but said he had an "enormous seven-figure deficit in our budget" because Congress shut down Poindexter's office.

Even critics like James Dempsey of the Center for Democracy and Technology see a vital role for data-mining technology in evaluating the vast, underanalyzed data the government already collects.

On the Net:

DARPA: http://www.darpa.mil

ARDA: http://www.ic-arda.org

