
MICROARRAY TECHNOLOGY FOR GENETIC ENGINEERS / BI
by Cara Goldenberg

"Hi-tech Microarray Technology Holds Future Breakthroughs for Genetic Engineers, and Future Earnings For Business Strategists"

Thanks to the enduring efforts of committed educational institutions like Stanford University and MIT, the complete genomes of many species have already been mapped. Now, with a working draft of the human genome in place and seemingly infinite avenues for experimentation, it is the private sector's turn to cash in. Aspiring multi-billion dollar companies like Affymetrix and Incyte Technologies are competing head-to-head to develop, update and market instrumentation that will let cutting-edge researchers gather their data in a manner commensurate with the sophistication of their experimental designs (reported price tag: $175,000 for microarray slides and the tools that collect, store, and analyze the data). Included are microarray chips, typically glass slides, which cost between $500 and $2000 apiece, are not reusable, and contain the entire genome of the species being examined. But what good is data generated by these hi-tech, high-priced techniques if no reliable method of analysis exists? Data is only as valuable as its interpreter, and given the volume of data that microarray techniques generate, no single researcher or team of researchers could be expected to make heads or tails of it without the aid of a computer. Yet it does not appear that any computer-based analysis of microarray data has so far produced a significant breakthrough. The potential for gain seems infinite, yet these invaluable data sources remain untapped.

A void has existed for some time in microarray data analysis. Efforts to create more diverse and cost-effective alternatives have so far yielded only a wider choice of data collection methods, not better analysis. Those chiefly responsible for the effort include geneticist Patrick Brown, his former graduate student Joseph DeRisi, and bioinformatics expert Michael Eisen, all from Stanford University. In the mid-1990s Brown, along with engineering student Dari Shalon, devised a substantially less expensive way of generating microarrays, geared toward studying gene expression patterns in yeast.

While advances in data collection have been rapid, data analysis methods remain limited and lag behind. The primary existing method is termed clustering. The principle: given a set of data points (for genes, each point records expression levels across a set of environmental conditions), each with a set of attributes, and a similarity measure among them, clustering groups the data points so that points in the same cluster are more similar to one another and points in different clusters are less similar to one another. Similarity measures include Euclidean distance, if the attributes are continuous, and other problem-specific measures.
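The clustering principle described above can be illustrated with a minimal sketch. It assumes bottom-up (agglomerative) clustering with average linkage and Euclidean distance, which is only one of many possible clustering schemes. The gene names, the expression values, and the function names are hypothetical, chosen purely for illustration.

```python
import math


def euclidean(a, b):
    """Euclidean distance between two expression profiles of equal length."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def average_linkage(c1, c2, profiles):
    """Mean pairwise distance between the genes of two clusters."""
    total = sum(euclidean(profiles[g], profiles[h]) for g in c1 for h in c2)
    return total / (len(c1) * len(c2))


def cluster_genes(profiles, n_clusters):
    """Merge the two closest clusters repeatedly until n_clusters remain."""
    if n_clusters < 1:
        raise ValueError("n_clusters must be at least 1")
    # Start with every gene in its own cluster.
    clusters = [[name] for name in profiles]
    while len(clusters) > n_clusters:
        best = None
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                d = average_linkage(clusters[i], clusters[j], profiles)
                if best is None or d < best[0]:
                    best = (d, i, j)
        _, i, j = best
        clusters[i] = clusters[i] + clusters[j]
        del clusters[j]
    return [sorted(c) for c in clusters]


# Hypothetical expression levels measured under four conditions.
profiles = {
    "geneA": [2.0, 2.1, 0.1, 0.0],
    "geneB": [1.9, 2.2, 0.2, 0.1],
    "geneC": [0.1, 0.0, 1.8, 2.0],
    "geneD": [0.0, 0.2, 2.1, 1.9],
}

clusters = cluster_genes(profiles, 2)
print(clusters)
```

With these made-up values, geneA and geneB rise and fall together, as do geneC and geneD, so the two pairs end up in separate clusters. A different similarity measure would slot in by replacing the `euclidean` function.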

While clustering methods are not without value, a more intuitive, insightful model for data analysis is required. According to Eisen, "What is needed instead is a holistic approach to analysis of genomic data that focuses on illuminating order in the entire set of observations, allowing biologists to develop an integrated understanding of the process being studied." Eisen also notes, "An important test of the value of this approach comes when we examine the identity of the clustered genes at varying levels of identity." Thus, if expression patterns were explicitly sought among genes known to be coexpressed (inter-dependent), the results would be more intuitive and readily interpretable. Moreover, genes not yet identified whose observed behavior resembles that of one or several identified genes could be inferred to be involved in the same process, and thus labeled and classified - the ultimate goal of any experiment based on gene-expression data.
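The idea of labeling an unidentified gene by its resemblance to identified ones can be sketched as a nearest-neighbor lookup. This is an illustrative simplification under stated assumptions, not the method Eisen describes: it assumes Euclidean distance as the similarity measure and an arbitrary distance cutoff, and every gene name, annotation, and value below is hypothetical.

```python
import math


def euclidean(a, b):
    """Euclidean distance between two expression profiles of equal length."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def infer_annotation(unknown_profile, known_profiles, annotations, max_distance):
    """Return the annotation of the closest known gene.

    Returns None when no known gene lies within max_distance, so that a
    dissimilar gene is left unlabeled rather than given a forced label.
    """
    nearest = min(
        known_profiles,
        key=lambda gene: euclidean(unknown_profile, known_profiles[gene]),
    )
    if euclidean(unknown_profile, known_profiles[nearest]) > max_distance:
        return None
    return annotations[nearest]


# Hypothetical known genes, their expression profiles, and their roles.
known_profiles = {
    "geneA": [2.0, 2.1, 0.1, 0.0],
    "geneC": [0.1, 0.0, 1.8, 2.0],
}
annotations = {
    "geneA": "process 1",
    "geneC": "process 2",
}

# A hypothetical unidentified gene whose profile tracks geneC closely.
unknown = [0.2, 0.1, 1.9, 2.1]
print(infer_annotation(unknown, known_profiles, annotations, max_distance=0.5))

# A profile unlike any known gene stays unlabeled.
outlier = [5.0, 5.0, 5.0, 5.0]
print(infer_annotation(outlier, known_profiles, annotations, max_distance=0.5))
```

The cutoff reflects a design choice: an inferred label is only as trustworthy as the similarity behind it, so the sketch declines to label genes that resemble nothing already known.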

The upshot: two lessons for two very different types of professionals. For the genomic scientist, a more intuitive, sophisticated method of microarray data analysis is in the works, with the potential to yield clear-cut, definitive results, easily interpreted even by the business leader. And for the business leader, remember this: if your data analysis software yields results that are not readily interpretable, the impact is fleeting at best. Time and resources invested in developing a more easily interpretable analysis will pay dividends, rendering today's purely quantitative methods obsolete and paving the way for significantly more meaningful, qualitative data analysis.


Cara Goldenberg is a member of the Research and Development team at Virtual Gold, a data mining and business intelligence company in Hartsdale, NY. Cara is currently working on projects related to data mining of bioinformatics data as she prepares to enter her junior year as a chemistry major at Yale University, where she is pursuing a Bachelor of Science/ Master's Degree. Her previous areas of research include Canine Von Willebrand's Disease, a fatal bleeding disorder in dogs (at Michigan State University); and the development of Nuclear Magnetic Resonance-based experimental techniques (at Yale). Cara also has previous work experience at Chase Manhattan Bank and the New York City Charter Revision Commission under Mayor Rudolph Giuliani. Her e-mail address is cara.goldenberg@yale.edu.

For more information, see www.virtualgold.com
