Dealing with confounding in drug studies
Loreto Carmona
Instituto de Salud Musculoesquelética, Madrid, Spain
London, circa 1800…
§ Some time after the Industrial Revolution, an amazing finding:
§ The more storks were sighted in the city in a given year, the greater the number of human births that year.
§ Data pointed to a clear relationship between these two phenomena, a significant positive association.
[1] Sies H. A new parameter for sex education. Nature 1988;332:495.
[2] Höfer T, et al. New evidence for the theory of the stork. Paediatr Perinat Epidemiol. 2004;18(1):88-92.
[3] Matthews R. Storks deliver babies (p = 0.008). Teaching Statistics. 2000;22:36-38.
Storks → Babies!
Confounding
§ A confounding variable is an extraneous variable in a statistical model that correlates (positively or negatively) with both the dependent variable and the independent variable.
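The definition can be illustrated with a small simulation. This is a minimal sketch with made-up numbers, not data from the stork papers cited above: a hypothetical `weather` score drives both stork sightings and births, and the two never influence each other. The names `pearson`, `residuals`, `storks` and `births` are illustrative only.

```python
import random

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length lists."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / (sxx * syy) ** 0.5

def residuals(ys, xs):
    """Residuals of a simple least-squares regression of ys on xs."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    slope = (sum((x - mx) * (y - my) for x, y in zip(xs, ys))
             / sum((x - mx) ** 2 for x in xs))
    return [(y - my) - slope * (x - mx) for x, y in zip(xs, ys)]

rng = random.Random(1800)  # fixed seed so the sketch is reproducible
weather = [rng.gauss(0, 1) for _ in range(2000)]  # hypothetical confounder
storks = [w + rng.gauss(0, 1) for w in weather]   # depends on weather only
births = [w + rng.gauss(0, 1) for w in weather]   # depends on weather only

crude_r = pearson(storks, births)
# Remove the part of each variable explained by weather, then correlate again.
adjusted_r = pearson(residuals(storks, weather), residuals(births, weather))
print(f"crude r = {crude_r:.2f}, after removing weather r = {adjusted_r:.2f}")
```

The crude correlation is clearly positive, while the correlation that remains after removing the weather component is close to zero, which is what one would expect when the association is due to the confounder alone.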
[Diagram: Storks (cause, independent variable) → Babies (effect / outcome, dependent variable); good weather conditions (other factors, confounding variable) act on both.]
Confounding in drug studies
Drug X (cause, independent variable) → Response (effect / outcome, dependent variable)
Drug Y (cause, independent variable) → Adverse event (effect / outcome, dependent variable)
Confounding in drug studies
Drug X (cause, independent variable) → Response (effect / outcome, dependent variable)
Other factors: other treatments, disease activity, calendar year, tight monitoring
Drug Y (cause, independent variable) → Adverse event (effect / outcome, dependent variable)
Other factors: other treatments, comorbidity, tight monitoring
Plus many other unmeasured things…
The best way to deal with confounding in drug studies is…
Randomised Controlled Trials!!!
Control
Blinding
Random
BUT…
§ Too homogeneous populations
§ Too “healthy”
§ Too active
§ Too short follow-up
§ Too perfect
§ …
…You are so obsessed with real life!!!
Extreme confounding: case series
Wada T, et al. [A case of rheumatoid arthritis complicated with deteriorated interstitial pneumonia after the administration of abatacept]. Nihon Rinsho Meneki Gakkai Kaishi. 2012;35(5):433-8.
Arabshahi B, et al. Abatacept and sodium thiosulfate for treatment of recalcitrant juvenile dermatomyositis complicated by ulceration and calcinosis. J Pediatr. 2012 Mar;160(3):520-2.
Confounding in observational drug studies
Response
Adverse events
What’s the problem when you compare groups that were not randomised at start?
Randomised: Eligible → randomisation → Treated (Rt) vs Controls (Rc); H0: Rt = Rc
Not randomised: Treated (Rt) vs Controls (Rc); H0: Rt = Rc ???
Confounding by indication
§ A type of selection bias
§ When physicians assign different treatments, they account for…
§ different diagnoses,
§ severity of illness,
§ or comorbid conditions
§ A common problem in pharmacoepidemiological studies comparing benefits.
§ Difficult to adjust for.
Channelling bias
§ Drugs are preferentially prescribed to patients with baseline characteristics that place them at differential risk for the outcome of interest.
§ → Differences may be due to the patients’ baseline profile, and not the drugs they received.
Drug X is believed to be associated with a given complication A
→ Patients at high risk for A are preferentially given drug Y
→ Drug X looks safer OR Y less safe
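The chain above can be put into numbers. This is a sketch with entirely hypothetical figures: neither drug changes the risk of complication A, yet because high-risk patients are channelled towards drug Y, the crude comparison makes Y look less safe.

```python
# Hypothetical numbers: neither drug affects the risk of complication A;
# only the patient's baseline risk group does.
risk = {"high": 0.20, "low": 0.05}

# Channelling: high-risk patients are preferentially given drug Y.
patients = {
    ("high", "X"): 200, ("high", "Y"): 800,
    ("low", "X"): 800, ("low", "Y"): 200,
}

def crude_risk(drug):
    """Expected risk of A among everyone on `drug`, ignoring baseline risk."""
    events = sum(n * risk[group] for (group, d), n in patients.items() if d == drug)
    total = sum(n for (group, d), n in patients.items() if d == drug)
    return events / total

crude_rr = crude_risk("Y") / crude_risk("X")
print(f"risk on X = {crude_risk('X'):.3f}, risk on Y = {crude_risk('Y'):.3f}")
print(f"crude risk ratio Y vs X = {crude_rr:.3f} (true ratio within each risk group: 1)")
```

Within each risk group the two drugs have identical risks by construction, so the whole of the crude difference comes from who received which drug.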
How to tackle confounding
§ First of all, you have to think about it
§ Brainstorming → Read → Causal diagrams (directed acyclic graphs, DAG)
Hernán MA, et al. Causal knowledge as a prerequisite for confounding evaluation: an application to birth defects epidemiology. Am J Epidemiol. 2002 Jan 15;155(2):176-84.
What to include when you have no pre-defined hypothesis?
§ With respect to the treated condition, diagnosis, disease duration, and phenotype data as available (e.g., rheumatoid factor status, erosive disease) will be important not only for the identification of predictors of risk, but also for the assessment of channelling bias.
§ The same argument applies to co-morbidities. Again, however, a transparent definition of, e.g., “cardiovascular disease” is needed in order to avoid lumping uncomplicated hypertension together with serious ischemic heart disease.
§ Non-clinical data items such as educational level or socio-economic status may also provide useful information, particularly in settings where there is not universal subsidised health care.
§ Since response to prior treatments is often a predictor of the outcome of future treatments, the treatment history (for the disorder under observation) may be an important source of information to determine channelling bias.
How to tackle confounding
§ First of all, you have to think about it
§ Brainstorming → Read → Causal diagrams (directed acyclic graphs, DAG)
§ You can deal with confounding during the design phase:
§ Randomised sampling
§ Restrict inclusion criteria (e.g. treatment started after year 2006)
§ Case-control matching
§ You can deal with it in the analysis:
§ Stratified analysis
§ Multivariate analysis
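As an illustration of the stratified option, the sketch below pools stratum-specific risk ratios with the Mantel-Haenszel estimator and compares the result with the crude ratio. The counts are hypothetical and the choice of Mantel-Haenszel is an assumption for illustration; the slides do not name a specific stratified method.

```python
# Hypothetical cohort split into two strata of a confounder.
# Each stratum: (events_exposed, n_exposed, events_unexposed, n_unexposed)
strata = [
    (60, 400, 10, 100),  # stratum 1: risks 0.15 vs 0.10
    (6, 100, 16, 400),   # stratum 2: risks 0.06 vs 0.04
]

def crude_risk_ratio(strata):
    """Risk ratio after collapsing all strata into one table."""
    a = sum(s[0] for s in strata)
    n1 = sum(s[1] for s in strata)
    c = sum(s[2] for s in strata)
    n0 = sum(s[3] for s in strata)
    return (a / n1) / (c / n0)

def mantel_haenszel_risk_ratio(strata):
    """Mantel-Haenszel pooled risk ratio across strata."""
    num = sum(a * n0 / (n1 + n0) for a, n1, c, n0 in strata)
    den = sum(c * n1 / (n1 + n0) for a, n1, c, n0 in strata)
    return num / den

crude = crude_risk_ratio(strata)
pooled = mantel_haenszel_risk_ratio(strata)
print(f"crude RR = {crude:.2f}, Mantel-Haenszel RR = {pooled:.2f}")
```

In this made-up example the risk ratio is 1.5 in both strata, and the pooled estimate recovers it, whereas the crude ratio is inflated because exposure is more common in the higher-risk stratum.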
Propensity score
§ Propensity scores are an alternative method to estimate the effect of receiving treatment when random assignment of treatments to subjects is not feasible.
1. Identify variables that may interfere with the decision to assign a treatment
2. Calculate each individual’s propensity score based on a regression model
3. Use that propensity score to balance treatment / exposure groups
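Steps 1 and 2 can be sketched without any statistics package. The example below fits a logistic regression by Newton-Raphson on hypothetical data with a single binary covariate `x` and reads off the fitted probability of treatment as the propensity score. The data, the function names and the single-covariate model are assumptions made to keep the sketch short; a real analysis would use several covariates and an established routine.

```python
import math

# Hypothetical data as (x, treated) pairs: 200 patients with x=0, of whom
# 40 are treated, and 300 patients with x=1, of whom 120 are treated.
data = [(0, 1)] * 40 + [(0, 0)] * 160 + [(1, 1)] * 120 + [(1, 0)] * 180

def fit_logistic(data, iterations=25):
    """Newton-Raphson fit of logit P(treated=1) = b0 + b1*x."""
    b0 = b1 = 0.0
    for _ in range(iterations):
        g0 = g1 = h00 = h01 = h11 = 0.0
        for x, t in data:
            p = 1 / (1 + math.exp(-(b0 + b1 * x)))
            w = p * (1 - p)
            g0 += t - p            # score for the intercept
            g1 += (t - p) * x      # score for the slope
            h00 += w               # information matrix entries
            h01 += w * x
            h11 += w * x * x
        det = h00 * h11 - h01 * h01
        b0 += (h11 * g0 - h01 * g1) / det
        b1 += (h00 * g1 - h01 * g0) / det
    return b0, b1

def propensity(x, b0, b1):
    """Fitted probability of treatment for covariate value x."""
    return 1 / (1 + math.exp(-(b0 + b1 * x)))

b0, b1 = fit_logistic(data)
ps0, ps1 = propensity(0, b0, b1), propensity(1, b0, b1)
print(f"propensity score: x=0 -> {ps0:.3f}, x=1 -> {ps1:.3f}")
```

With one binary covariate the model is saturated, so the fitted scores equal the observed proportions treated in each group (40/200 and 120/300), which gives an easy check that the fit behaves as intended.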
How can the propensity score balance confounding?
Weighting
• It creates a "pseudopopulation" in which the distribution of confounding factors is the same in exposed and unexposed.
Stratification
• It divides the sample into subgroups with similar risks: similar distributions of confounders.
Matching
• Extreme stratification: each layer contains one exposed and one unexposed.
Propensity score: 1
§ How big a problem do we have?
§ Compare the groups at baseline.

. tabstat t_remision sdai ccp fr erosive_RA, by(optimization) stat(mean sd)

Summary statistics: mean, sd
  by categories of: optimization

optimization |  t_remi~n      sdai       ccp        fr  eros~_RA
-------------+--------------------------------------------------
           0 |   29.9964  4.694656   .654386  .6486014  .5164179
             |  27.25977  6.071395  .4759858  .4775464  .4998796
-------------+--------------------------------------------------
           1 |  24.19804  4.631893  .7771739  .5769231  .4927066
             |  20.78664  5.566589   .417278  .4944438  .5003524
-------------+--------------------------------------------------
       Total |  28.43836   4.68038  .6843501  .6294872  .5100349
             |  25.80433  5.959015  .4650827  .4830453  .5000084
----------------------------------------------------------------
2. Then compare standardised differences between the groups

. xi: pbalchk optimization t_remision sdai i.ccp i.fr i.erosive_RA
i.ccp             _Iccp_0-1           (naturally coded; _Iccp_0 omitted)
i.fr              _Ifr_0-1            (naturally coded; _Ifr_0 omitted)
i.erosive_RA      _IAR_Erosiv_0-1     (naturally coded; _IAR_Erosiv_0 omitted)

             | Mean in treated  Mean in Untreated  Standardised diff.
----------------------------------------------------------------------
  t_remision |      21.95             34.15             -0.397
        sdai |       4.51              4.67             -0.033
-------------+--------------------------------------------------------
     _Cccp_0 |     23.9 %            48.2 %             -0.523
     _Cccp_1 |     76.1 %            51.8 %              0.523
             |
      _Cfr_0 |     28.3 %            30.7 %             -0.053
      _Cfr_1 |     71.7 %            69.3 %              0.053
             |
_CAR_Erosi~0 |     26.1 %            44.0 %             -0.383
_CAR_Erosi~1 |     73.9 %            56.0 %              0.383
             |
----------------------------------------------------------------------
Warning: Significant imbalance exists in the following variables:
t_remision ccp AR_Erosiv
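For reference, a common definition of the standardised difference divides the difference in means (or proportions) by the square root of the average of the two group variances. The sketch below applies that definition to the percentages in the table. Whether `pbalchk` uses exactly this formula is an assumption; the function names are illustrative.

```python
import math

def std_diff_binary(p_treated, p_untreated):
    """Standardised difference for a proportion (inputs as fractions, 0 to 1)."""
    pooled_var = (p_treated * (1 - p_treated)
                  + p_untreated * (1 - p_untreated)) / 2
    return (p_treated - p_untreated) / math.sqrt(pooled_var)

def std_diff_continuous(mean_t, sd_t, mean_u, sd_u):
    """Standardised difference for a continuous variable."""
    pooled_sd = math.sqrt((sd_t ** 2 + sd_u ** 2) / 2)
    return (mean_t - mean_u) / pooled_sd

# Proportions positive in treated vs untreated, read from the table above.
table = [("ccp", 0.761, 0.518), ("fr", 0.717, 0.693), ("erosive_RA", 0.739, 0.560)]
for name, p_t, p_u in table:
    print(f"{name}: {std_diff_binary(p_t, p_u):.3f}")
```

The results agree with the reported column to within the rounding of the percentages shown on the slide. A widely used rule of thumb treats absolute values above roughly 0.1 as meaningful imbalance, which is consistent with the variables flagged in the warning.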
3. Calculate propensity score from logistic regression

. logistic optimization t_remision i.ccp i.erosive_RA

Logistic regression                               Number of obs   =        527
                                                  LR chi2(3)      =      33.58
                                                  Prob > chi2     =     0.0000
Log likelihood = -285.19076                       Pseudo R2       =     0.0556

------------------------------------------------------------------------------
optimization | Odds Ratio   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
  t_remision |   .9860856   .0043929    -3.15   0.002     .9775132    .9947332
       1.ccp |   2.068701    .465636     3.23   0.001     1.330773    3.215818
1.erosive_RA |   1.183667   .2678439     0.75   0.456     .7596561    1.844345
       _cons |   .2845514   .0776425    -4.61   0.000     .1666874    .4857568
------------------------------------------------------------------------------

. predict propensity
(option pr assumed; Pr(optimization))
(1834 missing values generated)
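What `predict` returns here is the fitted probability, which can be reproduced by hand from the odds ratios: multiply the baseline odds by each odds ratio raised to the covariate value, then convert odds to a probability. The sketch below does this with the coefficients from the output above; the example patients are hypothetical.

```python
# Odds ratios copied from the logistic output above.
OR_CONS = 0.2845514        # baseline odds (_cons)
OR_T_REMISION = 0.9860856  # per unit of t_remision
OR_CCP = 2.068701          # ccp = 1 vs ccp = 0
OR_EROSIVE = 1.183667      # erosive_RA = 1 vs erosive_RA = 0

def propensity(t_remision, ccp, erosive):
    """Predicted probability of optimization for one covariate pattern."""
    odds = (OR_CONS
            * OR_T_REMISION ** t_remision
            * OR_CCP ** ccp
            * OR_EROSIVE ** erosive)
    return odds / (1 + odds)

# Hypothetical patients, for illustration only.
print(f"{propensity(0, 0, 0):.3f}")   # all covariates at zero
print(f"{propensity(24, 1, 1):.3f}")  # t_remision = 24, ccp positive, erosive
print(f"{propensity(24, 0, 0):.3f}")  # t_remision = 24, ccp negative, non-erosive
```

Because the odds ratio for `ccp` is above 1, a ccp-positive patient gets a higher propensity score than an otherwise identical ccp-negative patient.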
4. Check goodness of fit.

. estat gof, group(10) table

Logistic model for optimization, goodness-of-fit test

  (Table collapsed on quantiles of estimated probabilities)
  +--------------------------------------------------------+
  | Group |   Prob | Obs_1 | Exp_1 | Obs_0 | Exp_0 | Total |
  |-------+--------+-------+-------+-------+-------+-------|
  |     1 | 0.0638 |     5 |   4.0 |    57 |  58.0 |    62 |
  |     2 | 0.2005 |    10 |  10.0 |    46 |  46.0 |    56 |
  |     3 | 0.2006 |     4 |  10.8 |    50 |  43.2 |    54 |
  |     4 | 0.2048 |     7 |   8.2 |    33 |  31.8 |    40 |
  |     5 | 0.2789 |    22 |  24.1 |    73 |  70.9 |    95 |
  |-------+--------+-------+-------+-------+-------+-------|
  |     6 | 0.2875 |    15 |  10.9 |    23 |  27.1 |    38 |
  |     7 | 0.3608 |    28 |  16.5 |    19 |  30.5 |    47 |
  |     8 | 0.3739 |    14 |  16.3 |    30 |  27.7 |    44 |
  |     9 | 0.3972 |    21 |  16.5 |    21 |  25.5 |    42 |
  |    10 | 0.4039 |    11 |  19.8 |    38 |  29.2 |    49 |
  +--------------------------------------------------------+

       number of observations =       527
             number of groups =        10
      Hosmer-Lemeshow chi2(8) =     29.78
                  Prob > chi2 =    0.0002
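The Hosmer-Lemeshow statistic is the sum, over the groups, of (observed - expected)^2 / expected for both outcome columns, compared against a chi-squared distribution with (number of groups - 2) degrees of freedom. A quick sketch recomputing it from the table above:

```python
# (Obs_1, Exp_1, Obs_0, Exp_0) for the ten groups in the table above.
groups = [
    (5, 4.0, 57, 58.0), (10, 10.0, 46, 46.0), (4, 10.8, 50, 43.2),
    (7, 8.2, 33, 31.8), (22, 24.1, 73, 70.9), (15, 10.9, 23, 27.1),
    (28, 16.5, 19, 30.5), (14, 16.3, 30, 27.7), (21, 16.5, 21, 25.5),
    (11, 19.8, 38, 29.2),
]

# Pearson-type sum over both outcome columns of every group.
chi2 = sum((o1 - e1) ** 2 / e1 + (o0 - e0) ** 2 / e0
           for o1, e1, o0, e0 in groups)
df = len(groups) - 2
print(f"Hosmer-Lemeshow chi2({df}) = {chi2:.2f}")
```

The value comes out slightly below the reported 29.78 because the expected counts in the table are rounded to one decimal. A small p-value, as here, indicates that predicted and observed frequencies disagree, so the fit of this propensity model is questionable.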
Option A. Stratify by propensity score

. xtile qps = propensity, n(5)
. tab qps optimization

         5 |
 quantiles |
        of |     optimization
propensity |         0          1 |     Total
-----------+----------------------+----------
         1 |       103         15 |       118
         2 |        83         11 |        94
         3 |        96         37 |       133
         4 |        49         42 |        91
         5 |        59         32 |        91
-----------+----------------------+----------
     Total |       390        137 |       527
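The idea behind the stratification step can be sketched as follows: compute cut points from the sorted scores and assign each patient to the group its score falls in. This is a simplified illustration with hypothetical scores; it is not guaranteed to reproduce the exact cut points that `xtile` computes, since percentile conventions differ between implementations.

```python
import math

def quantile_groups(scores, n_groups=5):
    """Assign each score to a group 1..n_groups using empirical cut points.

    Tied scores always land in the same group, so group sizes can be unequal.
    """
    ordered = sorted(scores)
    n = len(ordered)
    # Cut point k: smallest score with at least k/n_groups of the data at or below it.
    cuts = [ordered[math.ceil(n * k / n_groups) - 1] for k in range(1, n_groups)]
    return [1 + sum(s > c for c in cuts) for s in scores]

# Hypothetical propensity scores for ten patients (note the ties).
scores = [0.06, 0.20, 0.20, 0.21, 0.28, 0.29, 0.36, 0.37, 0.40, 0.40]
groups = quantile_groups(scores)
print(groups)
```

Ties are one reason the quintiles in the table above do not contain equal numbers of patients.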
A.1. Use the stratification variable as an adjustment covariate in models

. regress sdai optimization
------------------------------------------------------------------------------
        sdai |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
optimization |  -.0627626   .3200338    -0.20   0.845    -.6904025    .5648772
       _cons |   4.694656   .1526319    30.76   0.000     4.395319    4.993993
------------------------------------------------------------------------------

. xi: regress sdai optimization i.qps
i.qps             _Iqps_1-5           (naturally coded; _Iqps_1 omitted)
------------------------------------------------------------------------------
        sdai |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
optimization |  -1.481545   .6488591    -2.28   0.023    -2.756943   -.2061463
     _Iqps_2 |  -2.115092   .7884186    -2.68   0.008    -3.664809   -.5653753
     _Iqps_3 |    .688498   .7756742     0.89   0.375    -.8361684    2.213164
     _Iqps_4 |   4.753318   .8855135     5.37   0.000     3.012751    6.493884
     _Iqps_5 |   .4929259   .8187014     0.60   0.547    -1.116315    2.102167
       _cons |   4.414459   .5652331     7.81   0.000     3.303436    5.525482
------------------------------------------------------------------------------
A.2. Use the stratification variable to select subgroups for analysis

. regress sdai optimization
------------------------------------------------------------------------------
        sdai |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
optimization |  -.0627626   .3200338    -0.20   0.845    -.6904025    .5648772
       _cons |   4.694656   .1526319    30.76   0.000     4.395319    4.993993
------------------------------------------------------------------------------

. xi: regress sdai optimization if inlist(qps, 4, 5)
------------------------------------------------------------------------------
        sdai |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
optimization |  -2.588866   1.277299    -2.03   0.045    -5.113251   -.0644814
       _cons |   7.188866   .7498024     9.59   0.000     5.706997    8.670735
------------------------------------------------------------------------------
Option B. Weighting by the propensity score
§ Inverse probability of treatment (IPT) weights
§ IPT weights change the distribution of confounders in exposed and unexposed → the same distribution as in the entire sample.
§ SMR weights
§ They do not change the distribution in the exposed, but they do in the unexposed → appropriate matching.
§ Assuming treatment has the same effect on all, SMR = IPT.
§ But if we assume that those who receive the treatment will benefit most, then SMR > IPT
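The two weighting schemes are usually defined as follows: with propensity score ps, IPT weights are 1/ps for the treated and 1/(1 - ps) for the untreated; SMR weights are 1 for the treated and ps/(1 - ps) for the untreated. The sketch below applies both to a hypothetical population with one binary confounder `x` and checks what happens to the share of patients with x = 1 in each group. The counts and function names are assumptions for illustration.

```python
def ipt_weight(treated, ps):
    """Inverse probability of treatment weight."""
    return 1 / ps if treated else 1 / (1 - ps)

def smr_weight(treated, ps):
    """SMR weight: treated keep weight 1, untreated get the odds ps / (1 - ps)."""
    return 1.0 if treated else ps / (1 - ps)

# Hypothetical cells: (x, treated, count, propensity score).
# 500 patients in total, 300 of them (60%) with x = 1.
cells = [
    (0, True, 40, 0.2), (0, False, 160, 0.2),
    (1, True, 120, 0.4), (1, False, 180, 0.4),
]

def weighted_share_x1(treated, weight_fn):
    """Weighted proportion with x = 1 within one treatment group."""
    rows = [(x, n * weight_fn(t, ps)) for x, t, n, ps in cells if t == treated]
    return sum(w for x, w in rows if x == 1) / sum(w for _, w in rows)

for label, fn in [("IPT", ipt_weight), ("SMR", smr_weight)]:
    print(label,
          f"treated: {weighted_share_x1(True, fn):.2f}",
          f"untreated: {weighted_share_x1(False, fn):.2f}")
```

Under IPT weighting both groups take on the distribution of the entire sample (60% with x = 1); under SMR weighting the treated are left as they are (75%) and the untreated are reweighted to resemble them.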
. propwt optimization propensity, ipt smr

Example of weighting

. regress y t
------------------------------------------------------------------------------
           y |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
           t |  -.5873868   .0495175   -11.86   0.000    -.6844982   -.4902754
       _cons |   .0448315   .0341635     1.31   0.190    -.0221684    .1118313
------------------------------------------------------------------------------
Inverse probability weights
SMR weights
Option C. Propensity score matching

. gmatch optimization propensity, set(set1) diff(diff1)

§ It generates 2 variables
§ a case-control pair identifier, set1
§ the difference in propensity score between case and control, diff1
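To make the two generated variables concrete, here is a minimal sketch of greedy nearest-neighbour matching on the propensity score. It is not a description of how `gmatch` works internally, which the slides do not cover; the `caliper` parameter, the patient identifiers and the scores are all hypothetical.

```python
def greedy_match(cases, controls, caliper=0.05):
    """Pair each case with the nearest unused control within `caliper`.

    `cases` and `controls` map an id to a propensity score.
    Returns a list of (case_id, control_id, difference).
    """
    available = dict(controls)
    pairs = []
    for case_id, ps in sorted(cases.items(), key=lambda kv: kv[1]):
        if not available:
            break
        # Closest remaining control by absolute difference in score.
        best = min(available, key=lambda cid: abs(available[cid] - ps))
        diff = abs(available[best] - ps)
        if diff <= caliper:
            pairs.append((case_id, best, diff))
            del available[best]  # each control is used at most once
    return pairs

# Hypothetical treated patients (cases) and untreated patients (controls).
cases = {"t1": 0.20, "t2": 0.36, "t3": 0.40}
controls = {"c1": 0.06, "c2": 0.21, "c3": 0.29, "c4": 0.37, "c5": 0.39}

pairs = greedy_match(cases, controls)
for set_id, (case_id, control_id, diff) in enumerate(pairs, start=1):
    print(f"set {set_id}: {case_id} matched to {control_id}, diff = {diff:.2f}")
```

The set number plays the role of the pair identifier and `diff` the role of the score difference; controls that are too far from every case stay unmatched and drop out of the matched analysis.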
Confounding is not effect modification
Summary
§ When you want to compare groups that were not randomly assigned to a treatment, you can:
§ forget about it
§ adjust for confounders
§ select only those that are comparable
§ Always take confounding into account in observational studies.
§ think about it, read, follow others
§ do something about it
o Measure => collect the variables!
o Propensity score
o Other confounders in multivariate analysis
A final cautious message
§ It has to make sense; data are only data.
§ Beware of strong conclusions; in science there’s little black or white and lots of grey.
Thank you!