Confounding and Directed Acyclic Graphs
DESCRIPTION
An introduction for master's students in public health
TRANSCRIPT
Confounding
LEARNING OBJECTIVES: Introduction
The student will be able to:
1. Define confounding.
2. Discuss the implications of confounding for
epidemiological research.
3. Describe the nature and uses of a directed acyclic
graph.
4. Create a directed acyclic graph based on a real
research question, and use it to identify potential
confounders.
5. Compare and contrast different methods to deal with
confounding.
MOTIVATION: Introduction
Confounding is the most important topic in epidemiology.
Epidemiology is...
[Diagram: X and Y]
What do we mean when we say one thing causes
another?
Why are causes so important in epidemiology?
What is the gold standard study design for testing
causal hypotheses?
Why?
When is it not possible to use a RCT?
[Diagram: X and Y, labelled "Cause"]
[Diagram: X and Y, labelled "Statistical Association"]
Statistical Association
When variables vary together.
Correlated; Covary; Dependent
Draw Inferences
Statistical inferences
Causal inferences
“Correlation does not imply
causation.”
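A minimal sketch of why the quote matters, using simulated data that are entirely hypothetical. A third variable Z (an assumption introduced here for illustration) causes both X and Y, while X has no effect on Y at all. X and Y still end up clearly correlated.

```python
import random

def pearson(xs, ys):
    """Pearson correlation, written out by hand to avoid extra dependencies."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / (sxx * syy) ** 0.5

def simulate_common_cause(n=20000, seed=1):
    """Z causes both X and Y; X has no effect on Y (hypothetical data)."""
    rng = random.Random(seed)
    xs, ys = [], []
    for _ in range(n):
        z = rng.gauss(0, 1)
        xs.append(z + rng.gauss(0, 1))  # X depends on Z only
        ys.append(z + rng.gauss(0, 1))  # Y depends on Z only, never on X
    return pearson(xs, ys)

r_xy = simulate_common_cause()
print(f"correlation between X and Y: {r_xy:.2f}")
```

Under these assumptions the correlation should land near 0.5, even though intervening on X would change nothing about Y.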
BREAK
QUESTIONS?
DEFINITIONS: Confounding
Synonym:
Spurious association
Confounding is...
“...the problem of confusing or mixing of exposure effects with other
"extraneous" effects...”
Greenland S, Robins JM. Identifiability, exchangeability, and epidemiological confounding revisited. Epidemiol Perspect Innov. 2009; 6: 4. doi: 10.1186/1742-5573-6-4
Early definitions were based on notions of...
Comparability
or
Collapsibility
Comparability
Inherent difference in risk between exposed and unexposed groups.
Greenland S, Robins JM. Identifiability, exchangeability, and epidemiological confounding. Int J Epidemiol. 1986;15:413–419. doi: 10.1093/ije/15.3.413.
Collapsibility
Apparent differences between the crude estimate of a statistical association and
strata-specific estimates.
Greenland S, Robins JM. Identifiability, exchangeability, and epidemiological confounding. Int J Epidemiol. 1986;15:413–419. doi: 10.1093/ije/15.3.413.
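To make the collapsibility idea concrete, here is a small sketch with invented counts (hypothetical, chosen only so the arithmetic is easy to follow). Within each stratum of a third variable Z the risk ratio is 1, yet the crude risk ratio from the collapsed table is well above 1.

```python
# Hypothetical counts, invented purely for illustration.
# Per stratum of Z: (exposed cases, exposed total, unexposed cases, unexposed total)
strata = {
    "Z=0": (10, 100, 40, 400),
    "Z=1": (160, 400, 40, 100),
}

def risk_ratio(cases_exp, n_exp, cases_unexp, n_unexp):
    """Risk in the exposed divided by risk in the unexposed."""
    return (cases_exp / n_exp) / (cases_unexp / n_unexp)

# Stratum-specific estimates.
stratum_rr = {z: risk_ratio(*counts) for z, counts in strata.items()}

# Crude estimate: collapse the strata by summing each cell, then recompute.
crude_counts = [sum(counts[i] for counts in strata.values()) for i in range(4)]
crude_rr = risk_ratio(*crude_counts)

for z, rr in stratum_rr.items():
    print(z, "risk ratio:", round(rr, 3))
print("crude risk ratio:", round(crude_rr, 3))
```

In this made-up table, exposure is far more common in the high-risk stratum, which is what drives the gap between the crude and the stratum-specific estimates.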
Problems with collapsibility:
1. Parameter estimates can change when controlling for
mediators, when controlling for variables that introduce
new biases, or because of measurement error.
2. There are situations where controlling for a “true”
confounder leads to no change in the estimate.
Greenland S, Robins JM. Identifiability, exchangeability, and epidemiological confounding revisited. Epidemiol Perspect Innov. 2009; 6: 4. doi: 10.1186/1742-5573-6-4
Comparability
Inherent difference in risk between exposed and unexposed groups.
Greenland S, Robins JM. Identifiability, exchangeability, and epidemiological confounding. Int J Epidemiol. 1986;15:413–419. doi: 10.1093/ije/15.3.413.
Imagine that individuals can be classified based on their inherent risk of the outcome, prior to any exposure.
These classifications incorporate the entirety of causal mechanisms operating, known or unknown.
Rothman KJ. Causes. Am J Epidemiol. 1995;141(2):90-95.
Is exposure a cause of disease?
Is there an assumption we can make that allows us to infer that
the exposure causes the disease?
Counterfactuals, or Potential Outcomes
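$Y_i^{x=1}$: the outcome individual $i$ would have if exposed.
$Y_i^{x=0}$: the outcome individual $i$ would have if unexposed.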
Exchangeability
The strong assumption that the exposed and unexposed are of the same type.
Comparability assumption (or partial exchangeability):
The proportion who would fall ill in the absence of exposure is the same in both groups.
(p1 + p3) = (q1 + q3)
(pk and qk: the proportion of type k among the exposed and unexposed, respectively.)
Alternatively, the baseline risk (prior to any possibility of exposure) is the same in both groups.
Thus an observed difference in risk between the exposed and unexposed is due to the relative proportion of types 2 and 3 in the exposed.
If IPD > 0, then p2 > p3.
If IPD < 0, then p3 > p2.
If IPD = 0, then p3 = p2.
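The three implications above follow from one short derivation, sketched here under stated assumptions. It takes the four response types of Greenland and Robins (1986): type 1 falls ill whether exposed or not, type 2 falls ill only if exposed, type 3 falls ill only if unexposed, and type 4 never falls ill. It also takes IPD to mean the incidence proportion difference, with $R_1$ and $R_0$ (symbols introduced here) as the observed risks in the exposed and unexposed.

```latex
\begin{align}
R_1 &= p_1 + p_2 && \text{(exposed: types 1 and 2 fall ill)} \\
R_0 &= q_1 + q_3 && \text{(unexposed: types 1 and 3 fall ill)} \\
\mathrm{IPD} &= R_1 - R_0 = (p_1 + p_2) - (q_1 + q_3) \\
             &= (p_1 + p_2) - (p_1 + p_3) && \text{(comparability: } p_1 + p_3 = q_1 + q_3\text{)} \\
             &= p_2 - p_3
\end{align}
```

So, when comparability holds, the sign of the IPD is the sign of $p_2 - p_3$.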
We can, if we wish, further assume that p3 (or p2) is equal to zero.
For example, we might assume smoking is never good for anyone.
If p3 = 0 and IPD = 0, then p2 = 0.
Causal inferences, which we must make, rely on a strong assumption – the comparability (or
exchangeability) of exposed and unexposed groups.
Comparability is synonymous with “no confounding”.
(p1 + p3) = (q1 + q3)
Greenland S, Robins JM. Identifiability, exchangeability, and epidemiological confounding. Int J Epidemiol. 1986;15:413–419. doi: 10.1093/ije/15.3.413.
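A small numeric check of the claim that comparability means "no confounding". The type proportions below are hypothetical, and the function names are assumptions made for this sketch. When (p1 + p3) = (q1 + q3), the observed IPD equals p2 - p3, the net effect of exposure among the exposed. When the condition fails, the two numbers part ways.

```python
def observed_ipd(p, q):
    """Risk in the exposed (types 1, 2) minus risk in the unexposed (types 1, 3)."""
    return (p[1] + p[2]) - (q[1] + q[3])

def effect_in_exposed(p):
    """Net effect of exposure among the exposed: harmed (type 2) minus protected (type 3)."""
    return p[2] - p[3]

def is_comparable(p, q, tol=1e-9):
    """Comparability: same risk in the absence of exposure in both groups."""
    return abs((p[1] + p[3]) - (q[1] + q[3])) < tol

# Hypothetical proportions of types 1 to 4; each set sums to 1.
p = {1: 0.10, 2: 0.20, 3: 0.05, 4: 0.65}             # exposed group
q_comparable = {1: 0.05, 2: 0.30, 3: 0.10, 4: 0.55}  # q1 + q3 equals p1 + p3
q_confounded = {1: 0.02, 2: 0.30, 3: 0.03, 4: 0.65}  # lower baseline risk

for label, q in [("comparable", q_comparable), ("confounded", q_confounded)]:
    print(label, is_comparable(p, q),
          round(observed_ipd(p, q), 3), round(effect_in_exposed(p), 3))
```

In the confounded case the observed difference overstates the effect by exactly the gap in baseline risk between the groups.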
How seriously do “epidemiologists”
take this?
“We adjusted for appropriate confounders.”
...
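Corollary 4:
“The greater the flexibility in designs, definitions, outcomes, and analytical modes in a scientific
field, the less likely the research findings are to be true.”
Ioannidis JPA. Why most published research findings are false. PLoS Med. 2005;2(8):e124. doi: 10.1371/journal.pmed.0020124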
“Implicit in these pressures* is a growing
dissatisfaction outside the field of
epidemiology with epidemiologic
description and correlation...”
Galea S. An Argument for a Consequentialist Epidemiology. AJE. 2013; doi: 10.1093/aje/kwt172
Good for science
vs.
Good for a scientist
BREAK
QUESTIONS?
OVERVIEW: Directed Acyclic Graphs
1. DAGs are a tool.
2. They help clarify causal thinking.
3. They guide the modelling process by helping to identify
potential confounding.
4. They have been used to identify many of the problems
with earlier approaches to confounding.
5. They are a great complement to comparability-based
definitions of confounding.
Traditional rules of thumb for identifying confounders:
1. It must be predictive of risk among the unexposed.
2. It must be associated with the exposure in the population
under study.
3. It must not fall on the causal path from exposure to outcome,
or be a consequence of the outcome.
Greenland S, Robins JM. Identifiability, exchangeability, and epidemiological confounding. Int J Epidemiol. 1986;15:413–419. doi: 10.1093/ije/15.3.413.
These don’t always work. Is there something better?
Directed Acyclic Graphs
This is a graph. It is directed. It is acyclic.
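One possible way to hold such a graph in code is a dictionary mapping each node to the nodes its arrows point to. The sketch below uses that representation (an assumption made here, not something from the slides) and checks the "acyclic" part with Kahn's algorithm: keep removing nodes that have no incoming arrows, and if every node can be removed there is no cycle.

```python
def is_acyclic(graph):
    """Return True if the directed graph (dict: node -> list of children) has no cycle."""
    nodes = set(graph) | {c for children in graph.values() for c in children}
    indegree = {n: 0 for n in nodes}
    for children in graph.values():
        for c in children:
            indegree[c] += 1
    ready = [n for n in nodes if indegree[n] == 0]  # nodes with no incoming arrows
    removed = 0
    while ready:
        n = ready.pop()
        removed += 1
        for c in graph.get(n, []):
            indegree[c] -= 1
            if indegree[c] == 0:
                ready.append(c)
    return removed == len(nodes)  # leftover nodes can only be stuck in a cycle

# Hypothetical examples.
dag = {"Z": ["X", "Y"], "X": ["Y"], "Y": []}  # Z -> X, Z -> Y, X -> Y
loop = {"X": ["Y"], "Y": ["X"]}               # X -> Y -> X
print(is_acyclic(dag), is_acyclic(loop))
```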
In a DAG, any unblocked path between two nodes implies a
marginal (unadjusted) association.
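With nothing adjusted for, a path is blocked exactly when it contains a collider, that is, a node with both neighbouring arrows pointing into it. The sketch below applies that rule to a hypothetical four-node DAG; the graph and function names are assumptions made for illustration.

```python
# Hypothetical DAG: Z -> X, Z -> Y, X -> C, Y -> C  (dict: node -> children)
dag = {"Z": ["X", "Y"], "X": ["C"], "Y": ["C"], "C": []}

def has_edge(dag, parent, child):
    return child in dag.get(parent, [])

def is_collider(dag, prev, node, nxt):
    """A node is a collider on a path when both neighbouring arrows point into it."""
    return has_edge(dag, prev, node) and has_edge(dag, nxt, node)

def path_is_unblocked(dag, path):
    """With no adjustment, a path is unblocked exactly when it has no collider."""
    return not any(
        is_collider(dag, path[i - 1], path[i], path[i + 1])
        for i in range(1, len(path) - 1)
    )

print(path_is_unblocked(dag, ["X", "Z", "Y"]))  # X <- Z -> Y, Z is a common cause
print(path_is_unblocked(dag, ["X", "C", "Y"]))  # X -> C <- Y, C is a collider
```

In this example the path through Z is open and implies a marginal association between X and Y, while the path through C is blocked.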
Algorithm for identifying confounders.
1. Erase all directed edges emanating from the exposure.
2. Identify all unblocked, backdoor paths between the
exposure and outcome.
Each of these paths implies confounding.
Confounding is removed by controlling for a variable along that path.
Adjustment for one variable can address confounding due to multiple paths.
Erase all directed edges emanating from the exposure.
Identify unblocked, backdoor paths.
This means we can identify an optimal* sufficient set for adjustment.
* The optimal sufficient set might be the smallest possible set, or the set that is easiest to collect, or the least expensive, etc.
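The two steps can be sketched in a few lines of Python. This is one illustrative implementation on a hypothetical DAG, not a reference algorithm: it erases the arrows leaving the exposure, walks every remaining path to the outcome while ignoring arrow direction, and keeps the paths that contain no collider.

```python
def backdoor_paths(dag, exposure, outcome):
    """List unblocked backdoor paths (dag is a dict: node -> list of children)."""
    # Step 1: erase all directed edges emanating from the exposure.
    pruned = {n: ([] if n == exposure else list(cs)) for n, cs in dag.items()}

    # Undirected neighbours, used only to walk along paths.
    nodes = set(pruned) | {c for cs in pruned.values() for c in cs}
    neighbours = {n: set() for n in nodes}
    for parent, children in pruned.items():
        for child in children:
            neighbours[parent].add(child)
            neighbours[child].add(parent)

    def is_collider(prev, node, nxt):
        return node in pruned.get(prev, []) and node in pruned.get(nxt, [])

    # Step 2: collect every path to the outcome that has no collider on it.
    unblocked = []

    def walk(path):
        if path[-1] == outcome:
            if not any(is_collider(path[i - 1], path[i], path[i + 1])
                       for i in range(1, len(path) - 1)):
                unblocked.append(path)
            return
        for n in sorted(neighbours[path[-1]]):
            if n not in path:
                walk(path + [n])

    walk([exposure])
    return unblocked

# Hypothetical DAG: W -> Z, W -> Y, Z -> X, Z -> Y, X -> M -> Y, X -> Y
dag = {"W": ["Z", "Y"], "Z": ["X", "Y"], "X": ["Y", "M"], "M": ["Y"], "Y": []}
paths = backdoor_paths(dag, "X", "Y")
for p in paths:
    print(" - ".join(p))
```

Here both unblocked backdoor paths pass through Z, so adjusting for Z alone addresses both of them. The mediated path through M never appears, because it starts with an arrow leaving the exposure.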
At what stage in the research process should we employ a DAG?
But we aren’t done yet.
Controlling for a collider has the effect of inserting a new edge between its
parents.
[Diagram: "Fast and agile" and "Tough and strong", each with an arrow into "Rugby Ability"]
Glymour M. Using causal diagrams to understand common problems in social epidemiology. In: Methods in Social Epidemiology.
[Diagram: the same graph, "Fast and agile" and "Tough and strong" with "Rugby Ability"]
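The rugby example can be simulated. In this sketch all numbers are hypothetical: speed and strength are generated independently, rugby ability depends on both, and "controlling for" ability is approximated by restricting to players above an arbitrary ability cut-off.

```python
import random

def pearson(xs, ys):
    """Pearson correlation, written out by hand to avoid extra dependencies."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / (sxx * syy) ** 0.5

rng = random.Random(7)
players = []
for _ in range(20000):
    speed = rng.gauss(0, 1)      # "fast and agile"
    strength = rng.gauss(0, 1)   # "tough and strong", independent of speed
    ability = speed + strength + rng.gauss(0, 0.5)  # collider: caused by both
    players.append((speed, strength, ability))

def speed_strength_correlation(rows):
    return pearson([r[0] for r in rows], [r[1] for r in rows])

r_everyone = speed_strength_correlation(players)
elite = [p for p in players if p[2] > 1.5]  # restrict on the collider
r_elite = speed_strength_correlation(elite)

print(f"everyone: {r_everyone:.2f}, elite only: {r_elite:.2f}")
```

Among everyone the correlation should be close to zero. Among elite players it turns clearly negative: a good player who is not fast must be strong, and vice versa.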
Algorithm for identifying confounders.
1. Erase all directed edges emanating from the exposure.
2. Identify all unblocked, backdoor paths between the
exposure and outcome.
3. Define S, your sufficient set of variables needed to adjust
for confounding.
4. Draw an edge to connect all pairs of variables with a child
in S, or a child with a descendant in S.
5. Identify any new unblocked, backdoor paths, and update S.
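Checking a candidate set S by hand gets tedious on larger graphs. The sketch below automates the check using a standard graph technique that is swapped in here, not the slide's exact procedure: after erasing the arrows out of the exposure, keep only the exposure, the outcome, S, and their ancestors; join every pair of parents that share a child; drop arrow directions; delete S; and see whether the exposure can still reach the outcome. The DAG is hypothetical. The sketch does not check that S excludes descendants of the exposure.

```python
def blocks_backdoor_paths(dag, exposure, outcome, S):
    """True if adjusting for S leaves no unblocked backdoor path (dag: node -> children)."""
    S = set(S)
    # Step 1: erase all directed edges emanating from the exposure.
    pruned = {n: ([] if n == exposure else list(cs)) for n, cs in dag.items()}
    nodes = set(pruned) | {c for cs in pruned.values() for c in cs}
    parents = {n: set() for n in nodes}
    for parent, children in pruned.items():
        for child in children:
            parents[child].add(parent)

    # Keep only the exposure, the outcome, S, and all of their ancestors.
    keep, stack = set(), [exposure, outcome, *S]
    while stack:
        n = stack.pop()
        if n not in keep:
            keep.add(n)
            stack.extend(parents[n])

    # Link each kept node to its parents, and link parents that share a child.
    # The second kind of link is what adjusting for a collider creates.
    links = {n: set() for n in keep}
    for child in keep:
        ps = sorted(parents[child])
        for p in ps:
            links[p].add(child)
            links[child].add(p)
        for i, a in enumerate(ps):
            for b in ps[i + 1:]:
                links[a].add(b)
                links[b].add(a)

    # Remove S, then search for any remaining route from exposure to outcome.
    seen, stack = set(), [exposure]
    while stack:
        n = stack.pop()
        if n == outcome:
            return False
        if n in seen or n in S:
            continue
        seen.add(n)
        stack.extend(links[n] - seen)
    return True

# Hypothetical DAG: Z confounds X and Y; C is a collider between U1 and U2.
dag = {"U1": ["X", "C"], "U2": ["C", "Y"], "Z": ["X", "Y"],
       "X": ["Y"], "C": [], "Y": []}
for S in [set(), {"Z"}, {"Z", "C"}, {"Z", "C", "U1"}]:
    print(sorted(S), blocks_backdoor_paths(dag, "X", "Y", S))
```

In this example {Z} is sufficient, adding the collider C opens a new path through U1 and U2, and adding U1 closes it again.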
MAKING A DAG: Directed Acyclic Graphs
1. Identify an important health outcome, and a modifiable
exposure. Draw an arrow from the latter to the former.
2. Think about what other variables might be related to
these. Brainstorm, use existing literature, etc.
3. Draw in any hypothesized causal paths.
4. Follow the steps previously outlined.
5. Explore choices, and consider how these affect your
optimal sufficient set for adjustment (S).
6. Draw your DAG so it flows in the same direction you
read (as best as possible).
7. Use colour, notes, etc.
8. There are programs available, but pencil and lots of
paper work best at first.
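Once a pencil draft exists, one option is to write the DAG down as a list of arrows and export it as Graphviz DOT text for drawing. The sketch below is an assumption about workflow, not a recommendation from the slides. The variables and arrows are hypothetical placeholders, `rankdir=LR` makes the graph flow left to right, and fill colours mark the exposure and outcome.

```python
def to_dot(edges, exposure, outcome):
    """Render a list of (cause, effect) pairs as Graphviz DOT text."""
    lines = ["digraph dag {", "  rankdir=LR;"]  # left-to-right layout
    lines.append(f'  "{exposure}" [style=filled, fillcolor=lightblue];')
    lines.append(f'  "{outcome}" [style=filled, fillcolor=lightyellow];')
    for cause, effect in edges:
        lines.append(f'  "{cause}" -> "{effect}";')
    lines.append("}")
    return "\n".join(lines)

# Hypothetical research question and hypothesized causal paths.
edges = [
    ("Physical activity", "Heart disease"),
    ("Age", "Physical activity"),
    ("Age", "Heart disease"),
    ("Income", "Physical activity"),
    ("Income", "Heart disease"),
]
dot = to_dot(edges, exposure="Physical activity", outcome="Heart disease")
print(dot)
```

The printed text can be pasted into any tool that reads DOT.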
Learning Task
• Based on the topic of your research thesis, create a DAG.
• It should include a preventable exposure, an important outcome, and at least 3 other potentially important covariates.
• Send me an image of your DAG before 17:00, next Wednesday ([email protected]).
SUMMARY: Confounders
1. Epidemiologists should be preoccupied with causes.
2. Confounding is the single greatest threat to our causal inferences –
which we must make, or risk irrelevance.
3. Definitions and rules of thumb based on collapsibility are not
sufficient to identify many commonly encountered confounders.
4. Comparability-based definitions are better, but don’t lend themselves
to simple rules of thumb.
5. Epidemiologists do not consistently use the same level of rigour when
trying to address confounding.
6. As a field, this limits our ability to effect positive change.
SUMMARY: Directed Acyclic Graphs
1. DAGs are a useful tool.
2. They help clarify causal thinking.
3. They guide the modelling process by helping to identify
potential confounding.
PREVIEW: Next week
1. More DAG examples.
2. Critiques of DAGs, and my responses to these.
3. Methods for dealing with confounding, once you
suspect it.