QUANTUM COMPUTING AND DATA MINING
by Ed Colet
Occasionally ruminating about ideas that exist beyond a horizon of
plausibility comes with the territory of Research. Over time, basic
research typically finds its way into applied research that in turn becomes
implemented into practical and meaningful applications and solutions. In
this column, I spend some time on the notion of quantum computing -- which
promises to radically revolutionize computing power. As we'll see in this
column, the topics that we've been concerned about in data mining, such as
automated search and scalability, will diminish to insignificance in a
quantum computer.
Quantum computing is the application of the principles of quantum physics or
quantum mechanics to build a quantum computer. At this point, the work has
been mostly theoretical, and no one has actually been able to build or
implement a quantum computer for practical purposes. Although the
mathematical theory is for the most part well specified, it's practical
implications when described in non-technical ways shake up many of our
common sense notions of how things work.
The basics of quantum principles are illustrated by the parable of
"Schrodinger's cat" invented by the 1933 Nobel Prize winning physicist Erwin
Schrodinger. The parable describes a cat placed in a black box. The box
also contains a vial of cyanide that if stepped on by the cat will break,
release the gas, and kill the cat. While the lid is closed we don't know
whether the cat is dead or alive. Since we can't see or measure the cat, the
state of things is ambiguous. In an ambiguous situation, all possible
states (dead, alive) are equally valid, and all possible states are
considered to exist. Thus, we say that during this unobserved period, the
cat is both dead and alive! (Rather than dead or alive -- which implies one
but not the other state). Until the period of ambiguity ends (i.e. we open
the lid), superposition (the simultaneous existence of multiple states)
reigns. An alternative and more disturbing interpretation to superposition
is the notion that during the period of ambiguity, the universe splits into
multiple universes - one for each possible state.
The implementation of a quantum computer would make questions of response
time and scalability irrelevant. This is because a quantum computer will
operate in fundamentally different ways than current non-quantum computers
do today. In our current world, computers perform operations sequentially
and response times increase with the number of computations that are
required. Imagine that a computer would require 1 second to perform a
single computation, or to evaluate a single search and imagine that response
time increases linearly (i.e. it always takes 1 second to perform a
computation).
The operating principles of a quantum computer as outlined by British
physicist David Deutsch proceeds as follows. As is known, computers can
only process information represented as strings of 1's and 0's, each digit
represented by a bit. In a quantum computer, each bit can be represented
instead by a single particle. Each of these particles can be made to spin
westwards or eastwards thus representing a "1" or "0" by the direction of
their spin. Thus, a string of 1's and 0's in a traditional computer can be
represented as a set of spinning particles in a quantum computer, referred
to as qubits. Thus for every original string consisting of a series of 1's
and 0's, we could have a set of spinning particles instead.
Firing a sufficiently powerful pulse of energy can change each particle's
spin. But if instead, one fired a weaker pulse of energy so that it is not
known for sure whether the pulse had changed the direction of the particle's
spin, and we are in an unobserved situation, then we would be in an
ambiguous situation. There would thus be one distinct state for every
possible particle sting, and/or a distinct universe for each distinct set of
particles. Regardless of the number of states, each computation is
performed simultaneously. Therefore, in our scenario it would take a
response time of one second to complete all computations or searches.
Scalability also becomes a non-issue in a quantum computer. Data mining's
task is to search for patterns hidden in large amounts of data. As data
grows, the search space also grows. In most cases, it's not feasible to
search for all possible combinations in a brute force manner. Current
algorithms typically have to utilize some form of automated sampling and
pruning of this search space before they return results that are determined
to be interesting based on some criteria. A quantum computer used for data
mining can truly search the entire space for all possible interesting
patterns -- there would be one state or one universe for each possibility --
and the search is done instantaneously!
A current problem of course, is that although quantum theory is
mathematically plausible, implementation or building a quantum computer that
can function in any meaningful way has not been achieved. But as we've seen
in the past, research horizons have a way of becoming reachable eventually.
Ed Colet is the Acting Director of Research at Virtual Gold Inc.,
responsible for developing analytical methods for data mining and for
investigating human factors and usability issues of business intelligence
systems. At present, he is in the final stage of completing a doctoral
dissertation in the Cognition and Perception program at New York
University's Department of Psychology. Ed has also worked for IBM Research
at the T.J. Watson Research Center. At IBM, Ed was a member of the group
that developed Advanced Scout, the data mining application for NBA teams.
His research interests focus on statistical methods and human factors.
For more information, see www.virtualgold.com.
|