FORMALLY MODELING CAUSE AND EFFECT: PART II
by Ed Colet
Many data mining algorithms for detecting patterns rely upon finding
associations and dependencies among the many attributes that can exist in a
large dataset. But because a correlation does not imply causation, it can be
difficult to determine the nature of a causal relationship (or if one even
exists) between the attributes. Last week's column addressed the formal
computations for a correlation, and current approaches to investigate
causality. In this week's column, I describe some new research outlining the
limitations of current formalisms in modeling causality -- and progress
towards modeling causality with a better degree of mathematical precision.
Given a correlation between two attributes, it is difficult to determine if
one causes the other or vice versa. Current approaches such as computing
partial or semi-partial correlations (described last week) can effectively
determine if another factor may be influencing the relationship, but they
can't really model the causal relationship that exists. A more involved
approach is to have the domain expert or analyst propose causal mechanisms to
account for an observed relationship. The predictions and implications that
follow from the proposed mechanism are then investigated and tested. This may
require carefully controlled research studies that are not always practical
or possible to do. What may be useful is a framework in which causality can
be studied and easily manipulated.
Some interesting work in this area is being conducted by Professor Judea
Pearl of UCLA, and described in his book, "Causality: Models, Reasoning, and
Inference", Cambridge University Press. What follows is an abbreviated
description of his development of a "causal calculus" for formally modeling
causality.
Why current formalisms don't support causality:
Pearl argues that currently, formalisms expressed as equations don't
adequately support causal notions. Despite this, we have always interpreted
and thought of equations in terms of causality. Pearl provides the following
argument as an example. Take Newton's law that force equals mass times
acceleration, or "f = ma". We know that force causes acceleration, but we do
not also say that force causes mass. There's nothing in the formal expression
to allow us to derive causality. With algebra, we can move terms around and
derive that "f/a = m", but note that we interpret this expression as "f/a"
determines the mass, not that it causes the mass. In other words, the
predominant expressions in our current formal language, and the rules for
manipulating them, do not capture causal notions that we know exist. One
reason for this is that algebraic equations are bi-directional, i.e. the
rules for re-writing equations allow us to shuffle any term back and forth.
Causality, on the other hand, is directional, but the language and rules of
existing algebras don't retain the directionality that is an important
feature of causality.
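To make the contrast concrete, here is a minimal Python sketch of the directional reading. It is my own illustration, not code from Pearl's book: the `solve` helper, the `model` dictionary and the numbers are all hypothetical. Each variable is computed from its causes, and an intervention simply overrides one variable's mechanism.

```python
def solve(mechanisms, interventions=None):
    """Evaluate each variable from its causes, in the order listed.

    An intervention replaces a variable's mechanism with a fixed value,
    so the variable's usual causes no longer influence it.
    """
    interventions = interventions or {}
    values = {}
    for name, mechanism in mechanisms.items():
        if name in interventions:
            values[name] = interventions[name]
        else:
            values[name] = mechanism(values)
    return values


# Hypothetical model: force and mass are set from outside the system,
# and acceleration is caused by both (a := f / m).
model = {
    "force": lambda v: 10.0,
    "mass": lambda v: 2.0,
    "acceleration": lambda v: v["force"] / v["mass"],
}

baseline = solve(model)                                 # nothing overridden
push_harder = solve(model, {"force": 20.0})             # act on the cause
pin_acceleration = solve(model, {"acceleration": 1.0})  # act on the effect

print(baseline)
print(push_harder)
print(pin_acceleration)
```

Acting on force changes acceleration, but acting on acceleration leaves force and mass where they were. The algebraic equation "f = ma" has no way to say this, since under the last intervention it no longer even holds.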
Probability theory is the basic language of data mining and data analysis.
Here too, Pearl points out how the formal language of probability theory also
doesn't capture causality -- but it can be extended to do so. He illustrates
his approach via the following argument: If we wish to find the chance it
rained, given that we see the grass wet, we can express our question in a
formal sentence written like this: P(rain | wet), to be read "the
probability of rain, given wet". But suppose we ask a different question:
"What is the chance it rained if we MAKE the grass wet?" We cannot even
express our query in the syntax of probability, because the vertical bar in
the probability expression, P(rain | wet) is already taken to mean, "given
that I see". We can invent a new symbol "DO", and each time we see a DO
after the bar we read it "GIVEN THAT WE DO" -- but this does not help us
compute the answer to our question, because the rules of probability do not
apply to this new reading. We know intuitively that the answer should be equal
to just P(rain), because making the grass wet does not change the chance of
rain. But can this intuitive answer, and others like it, be derived
mechanically?
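The difference between seeing and doing can be checked on a toy model. The sketch below is only an illustration under assumptions of my own: the grass gets wet if it rains or if a sprinkler runs, and the two probabilities are invented. It enumerates every case exactly, once leaving the grass to its usual causes and once forcing it wet.

```python
from itertools import product

P_RAIN = 0.3       # assumed chance of rain
P_SPRINKLER = 0.2  # assumed chance the sprinkler ran


def joint(make_wet=False):
    """Yield (rain, wet, probability) for every combination of causes.

    With make_wet=True the grass is wet no matter what, which models the
    action "we MAKE the grass wet"; rain keeps its own distribution.
    """
    for rain, sprinkler in product([True, False], repeat=2):
        p_rain = P_RAIN if rain else 1 - P_RAIN
        p_sprinkler = P_SPRINKLER if sprinkler else 1 - P_SPRINKLER
        wet = True if make_wet else (rain or sprinkler)
        yield rain, wet, p_rain * p_sprinkler


def prob_rain_when_wet(make_wet=False):
    """Probability of rain among the cases in which the grass is wet."""
    wet_cases = [(rain, p) for rain, wet, p in joint(make_wet) if wet]
    total = sum(p for _, p in wet_cases)
    return sum(p for rain, p in wet_cases if rain) / total


seeing = prob_rain_when_wet()              # P(rain | wet)
doing = prob_rain_when_wet(make_wet=True)  # P(rain | DO wet)

print(round(seeing, 3), round(doing, 3))
```

Under these assumed numbers, seeing wet grass raises the chance of rain well above 0.3, while making the grass wet leaves it at P(rain), matching the intuition in the text. The simulation only confirms the answer for one hand-built model; deriving it mechanically is the job of the calculus.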
Towards a formal language for causality:
According to Pearl, the answer is "yes", and his research outlines the
development of a new algebra, a causal calculus. The calculus basically
consists of 3 rules that permit us to transform expressions involving
actions and observations into other expressions of this type. The first
rule allows us to ignore an irrelevant observation, the second allows us to
exchange an action with an observation of the same fact, and the third
allows us to ignore an irrelevant action.
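For readers who want to see the rules written out, the block below gives them in the notation commonly used for Pearl's calculus. The notation goes beyond what this column covers: it assumes a causal diagram G over the attributes, and the side conditions are independence statements read off modified copies of that diagram.

```latex
% G_{\overline{X}}  : G with all arrows pointing into X removed
% G_{\underline{Z}} : G with all arrows pointing out of Z removed
% Z(W) : the nodes of Z that are not ancestors of any node of W in G_{\overline{X}}
\begin{align*}
\text{Rule 1 (ignore an observation):}\quad
  & P(y \mid do(x), z, w) = P(y \mid do(x), w)
  && \text{if } (Y \perp Z \mid X, W) \text{ in } G_{\overline{X}} \\
\text{Rule 2 (exchange action and observation):}\quad
  & P(y \mid do(x), do(z), w) = P(y \mid do(x), z, w)
  && \text{if } (Y \perp Z \mid X, W) \text{ in } G_{\overline{X}\,\underline{Z}} \\
\text{Rule 3 (ignore an action):}\quad
  & P(y \mid do(x), do(z), w) = P(y \mid do(x), w)
  && \text{if } (Y \perp Z \mid X, W) \text{ in } G_{\overline{X}\,\overline{Z(W)}}
\end{align*}
```

As a check against the earlier example, assume the diagram contains an arrow from rain to wet. Take Y = rain and Z = wet, with X and W empty. Removing the arrows into wet disconnects it from rain, so Rule 3 applies and gives P(rain | do(wet)) = P(rain).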
But rather than go into the technical details of this language and its
grammar, it is more useful to note ways that it can be used in the context of
data mining and decision-making. The existence of a formal mathematical
language means that it can be easily processed by computer analysis. Given an
association among attributes in a large data set, it becomes possible to
determine the nature of a causal relationship with mathematical precision. It
also becomes possible to assess the effects of various interventions, making
"what if" analyses more powerful (e.g. "if I raise prices, what might happen
to sales?"). Currently many "what-if" analyses are based on extrapolations of
existing data coupled with certain assumptions defined by domain experts. As
such, the result can be characterized as an educated guess. But a "what-if"
analysis modeled within a context of causality can determine the effect of
an intervention with far more precision -- and this can support better
decision-making.
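The pricing question can be sketched the same way. Everything in the block below is invented for illustration: it assumes that a busy season is the only common cause of price and sales, that it is recorded in the data, and that all probabilities are known. Under those assumptions the effect of setting the price is obtained by averaging over seasons (the adjustment formula from Pearl's framework) rather than by reading the raw association.

```python
# All numbers are hypothetical.
P_BUSY = 0.5                            # P(busy season)
P_HIGH_PRICE = {True: 0.8, False: 0.2}  # P(high price | busy season?)
P_STRONG_SALES = {                      # P(strong sales | high price?, busy?)
    (True, True): 0.7, (False, True): 0.9,
    (True, False): 0.3, (False, False): 0.5,
}
SEASONS = (True, False)


def p_season(busy):
    return P_BUSY if busy else 1 - P_BUSY


def p_price(high, busy):
    return P_HIGH_PRICE[busy] if high else 1 - P_HIGH_PRICE[busy]


def observed(high):
    """P(strong sales | price): the association seen in historical data."""
    num = sum(p_season(b) * p_price(high, b) * P_STRONG_SALES[(high, b)]
              for b in SEASONS)
    den = sum(p_season(b) * p_price(high, b) for b in SEASONS)
    return num / den


def intervened(high):
    """P(strong sales | DO price): average over seasons as they naturally occur."""
    return sum(p_season(b) * P_STRONG_SALES[(high, b)] for b in SEASONS)


naive_change = observed(True) - observed(False)
causal_change = intervened(True) - intervened(False)

print(round(naive_change, 2), round(causal_change, 2))
```

With these made-up numbers the raw data suggest that higher prices go with slightly stronger sales, because prices tend to be raised in busy seasons. The adjusted figure shows that raising prices lowers the chance of strong sales. The result is only as good as the assumed causal structure, which is still the domain expert's contribution.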
Ed Colet is the Acting Director of Research at Virtual Gold Inc.,
responsible for developing analytical methods for data mining and for
investigating human factors and usability issues of business intelligence
systems. At present, he is in the final stage of completing a doctoral
dissertation in the Cognition and Perception program at New York
University's Department of Psychology. Ed has also worked for IBM Research
at the T.J. Watson Research Center. At IBM, Ed was a member of the group
that developed Advanced Scout, the data mining application for NBA teams.
His research interests focus on statistical methods and human factors.
For more information, see www.virtualgold.com.