dealing with confounding in drug studies - eular carmona.pdfconfounding by indication a type of...

Dealing with confounding in drug studies
Loreto Carmona
Instituto de Salud Musculoesquelética, Madrid, Spain

33 slides

London, circa 1800…
§ Sometime after the Industrial Revolution, an amazing finding:
  § The more storks were sighted in the city in a given year, the greater the number of human births that year.
§ Data pointed to a clear relationship between these two phenomena, a significant positive association.

3

[1] Sies H. A new parameter for sex education. Nature 1988;332:495.

[2] Höfer T, et al. New evidence for the theory of the stork. Paediatr Perinat Epidemiol. 2004;18(1):88-92.

[3] Matthews R. Storks deliver babies (p = 0.008). Teaching Statistics 2000;22:36-38.

Storks → Babies!

Confounding
§ A confounding variable is an extraneous variable in a statistical model that correlates (positively or negatively) with both the dependent variable and the independent variable.

4

Storks → Babies!

[Diagram: storks are the cause (independent variable) and babies the effect / outcome (dependent variable); good weather conditions are the other factors (confounding variable), linked to both.]

Confounding in drug studies

Drug X → Response

Drug Y → Adverse event

6

Cause (independent variable) → Effect / outcome (dependent variable)

Confounding in drug studies

Drug X → Response

Drug Y → Adverse event

7

Cause (independent variable) → Effect / outcome (dependent variable)

Confounders of Drug X → Response: other treatments, disease activity, calendar year, tight monitoring

Confounders of Drug Y → Adverse event: other treatments, comorbidity, tight monitoring

Plus many other unmeasured things…

The best way to deal with confounding in drug studies is…

Randomised Controlled Trials!!!

Control

Blinding

Random

8

BUT…

§ Too homogeneous populations

§ Too “healthy”

§ Too active

§ Too short follow-up

§ Too perfect

§ …

9

…You are so obsessed with real life!!!

Extreme confounding: case series

10

Wada T, et al. [A case of rheumatoid arthritis complicated with deteriorated interstitial pneumonia after the administration of abatacept]. Nihon Rinsho Meneki Gakkai Kaishi. 2012;35(5):433-8.

Arabshahi B, et al. Abatacept and sodium thiosulfate for treatment of recalcitrant juvenile dermatomyositis complicated by ulceration and calcinosis. J Pediatr. 2012 Mar;160(3):520-2.

Confounding in observational drug studies

Response

Adverse events

What’s the problem when you compare groups that were not randomised at start?

12

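[Diagram: in a randomised study, eligible patients are allocated by randomisation to Treated (Rt) or Controls (Rc), and the comparison tests H0: Rt = Rc. Without randomisation, the Treated (Rt) and Controls (Rc) groups are formed some other way, so the same comparison becomes H0: Rt = Rc ???]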

Confounding by indication
§ A type of selection bias
§ When physicians assign different treatments, they account for…
  § different diagnoses,
  § severity of illness,
  § or comorbid conditions
§ A common problem in pharmacoepidemiological studies comparing benefits.
§ Difficult to adjust for.

13

Channelling bias
§ Drugs are preferentially prescribed to patients with baseline characteristics that place them at differential risk for the outcome of interest.
§ → Differences may be due to the patients’ baseline profile, and not the drugs they received.

14

Drug X is believed to be associated with a given complication A

→ Patients at high risk for A are preferentially given drug Y

→ Drug X looks safer OR Y less safe
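The channelling mechanism above can be made concrete with a small worked example. The sketch below uses made-up counts (not data from any study) in which the drug has no effect on the complication within each baseline-risk stratum, yet the crude comparison makes drug Y look far less safe because high-risk patients were channelled to it. The names `counts`, `risk` and `crude_risk` are illustrative assumptions.

```python
# Hypothetical counts, invented purely for illustration: (events, patients)
# per drug and baseline-risk stratum. Within each stratum both drugs carry
# the same risk of the complication (5% low risk, 20% high risk).
counts = {
    "X": {"low": (45, 900), "high": (20, 100)},
    "Y": {"low": (5, 100), "high": (180, 900)},  # high-risk patients channelled to Y
}


def risk(events, patients):
    """Proportion of patients who developed the complication."""
    return events / patients


def crude_risk(drug):
    """Risk ignoring baseline risk: pool both strata."""
    events = sum(e for e, _ in counts[drug].values())
    patients = sum(n for _, n in counts[drug].values())
    return risk(events, patients)


crude_rr = crude_risk("Y") / crude_risk("X")
stratum_rr = {
    s: risk(*counts["Y"][s]) / risk(*counts["X"][s]) for s in ("low", "high")
}

print(f"Crude risk ratio Y vs X: {crude_rr:.2f}")
for s, rr in stratum_rr.items():
    print(f"Risk ratio in {s}-risk stratum: {rr:.2f}")
```

In this toy setting the whole crude difference comes from who received which drug, which is why the baseline risk profile has to be measured before the two drugs can be compared.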

How to tackle confounding
§ First of all, you have to think about it
§ Brainstorming → Read → Causal diagrams (directed acyclic graphs, DAGs)

15

Hernán MA, et al. Causal knowledge as a prerequisite for confounding evaluation: an application to birth defects epidemiology. Am J Epidemiol. 2002 Jan 15;155(2):176-84.

What to include when you have no pre-defined hypothesis?
§ With respect to the treated condition, diagnosis, disease duration, and phenotype data as available (e.g., rheumatoid factor status, erosive disease) will be important not only for the identification of predictors of risk, but also for the assessment of channelling bias.
§ The same argument applies to co-morbidities. Again, however, a transparent definition of, e.g., “cardiovascular disease” is needed in order to avoid lumping uncomplicated hypertension together with serious ischemic heart disease.
§ Non-clinical data items such as educational level or socio-economic status may also provide useful information, particularly in settings where there is no universal subsidised health care.
§ Since response to prior treatments is often a predictor of the outcome of future treatments, the treatment history (for the disorder under observation) may be an important source of information to determine channelling bias.

16


§ You can deal with confounding during the design phase:
  § Randomised sampling
  § Restrict inclusion criteria (e.g. Tx started after year 2006)
  § Case-control matching

17


§ You can deal with confounding during the design phase:
  § Randomised sampling
  § Restrict inclusion criteria
  § Case-control matching
§ You can deal with it in the analysis:
  § Stratified analysis
  § Multivariate analysis

18

Propensity score
§ Propensity scores are an alternative method to estimate the effect of receiving treatment when random assignment of treatments to subjects is not feasible.

1. Identify the variables that may influence the decision to assign a treatment

2. Calculate each individual’s propensity score based on a regression model

3. Use that propensity score to balance treatment / exposure groups

19

How can the propensity score balance confounding?

Weighting
• It creates a "pseudopopulation" in which the distribution of confounding factors is the same in exposed and unexposed.

Stratification
• It divides the sample into subgroups with similar risks: similar distributions of confounders.

Matching
• Extreme stratification: each stratum contains one exposed and one unexposed subject.

20

Propensity score
1. How big a problem do we have?
§ Compare the groups at baseline.

. tabstat t_remision sdai ccp fr erosive_RA, by(optimization) stat(mean sd)

Summary statistics: mean, sd
  by categories of: optimization

optimization |  t_remi~n      sdai       ccp        fr  eros~_RA
-------------+--------------------------------------------------
           0 |   29.9964  4.694656   .654386  .6486014  .5164179
             |  27.25977  6.071395  .4759858  .4775464  .4998796
-------------+--------------------------------------------------
           1 |  24.19804  4.631893  .7771739  .5769231  .4927066
             |  20.78664  5.566589   .417278  .4944438  .5003524
-------------+--------------------------------------------------
       Total |  28.43836   4.68038  .6843501  .6294872  .5100349
             |  25.80433  5.959015  .4650827  .4830453  .5000084
----------------------------------------------------------------

21

2. Then use standardised differences to compare the groups

22

. xi: pbalchk optimization t_remision sdai i.ccp i.fr i.erosive_RA
i.ccp             _Iccp_0-1           (naturally coded; _Iccp_0 omitted)
i.fr              _Ifr_0-1            (naturally coded; _Ifr_0 omitted)
i.erosive_RA      _IAR_Erosiv_0-1     (naturally coded; _IAR_Erosiv_0 omitted)

             | Mean in treated   Mean in Untreated   Standardised diff.
----------------------------------------------------------------------
  t_remision |      21.95              34.15               -0.397
        sdai |       4.51               4.67               -0.033
-------------+--------------------------------------------------------
     _Cccp_0 |     23.9 %             48.2 %               -0.523
     _Cccp_1 |     76.1 %             51.8 %                0.523
             |
      _Cfr_0 |     28.3 %             30.7 %               -0.053
      _Cfr_1 |     71.7 %             69.3 %                0.053
             |
_CAR_Erosi~0 |     26.1 %             44.0 %               -0.383
_CAR_Erosi~1 |     73.9 %             56.0 %                0.383
             |
----------------------------------------------------------------------
Warning: Significant imbalance exists in the following variables:
t_remision ccp AR_Erosiv
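As a hedged sketch of what the standardised differences in the balance table represent, the code below uses the common pooled-variance definition: the difference in means (or proportions) divided by the square root of the average of the two group variances, with variance p(1 - p) for a binary variable. Whether pbalchk uses exactly this formula is an assumption, and the function names are illustrative. Applied to the rounded percentages shown for ccp, fr and erosive_RA, it gives values close to those in the table.

```python
import math


def std_diff_binary(p_treated, p_untreated):
    """Standardised difference for a binary covariate (proportions in 0-1)."""
    pooled_var = (p_treated * (1 - p_treated) + p_untreated * (1 - p_untreated)) / 2
    return (p_treated - p_untreated) / math.sqrt(pooled_var)


def std_diff_continuous(mean_t, sd_t, mean_u, sd_u):
    """Standardised difference for a continuous covariate."""
    pooled_sd = math.sqrt((sd_t ** 2 + sd_u ** 2) / 2)
    return (mean_t - mean_u) / pooled_sd


# Proportions with the characteristic (level 1), read off the balance table:
# (treated, untreated).
balance = {
    "ccp": (0.761, 0.518),
    "fr": (0.717, 0.693),
    "erosive_RA": (0.739, 0.560),
}
diffs = {name: std_diff_binary(pt, pu) for name, (pt, pu) in balance.items()}
for name, d in diffs.items():
    # 0.1 is a commonly quoted rule of thumb, not necessarily pbalchk's criterion.
    flag = "imbalanced" if abs(d) > 0.1 else "balanced"
    print(f"{name}: {d:.3f} ({flag})")
```

Unlike a p-value, the standardised difference does not depend on sample size, which is one reason it is often preferred for checking balance.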

3. Calculate propensity score from logistic regression

. logistic optimization t_remision i.ccp i.erosive_RA

Logistic regression                               Number of obs   =        527
                                                  LR chi2(3)      =      33.58
                                                  Prob > chi2     =     0.0000
Log likelihood = -285.19076                       Pseudo R2       =     0.0556

------------------------------------------------------------------------------
optimization | Odds Ratio   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
  t_remision |   .9860856   .0043929    -3.15   0.002     .9775132    .9947332
       1.ccp |   2.068701    .465636     3.23   0.001     1.330773    3.215818
1.erosive_RA |   1.183667   .2678439     0.75   0.456     .7596561    1.844345
       _cons |   .2845514   .0776425    -4.61   0.000     .1666874    .4857568
------------------------------------------------------------------------------

. predict propensity
(option pr assumed; Pr(optimization))
(1834 missing values generated)
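To show what the predicted propensity amounts to, the sketch below rebuilds the predicted probability by hand from the odds ratios in the regression output: the odds are the baseline odds (_cons) multiplied by each odds ratio raised to the covariate value, and the probability is odds / (1 + odds). The function name and the example patient are assumptions made for illustration; the coefficients are copied from the table.

```python
# Odds ratios copied from the logistic regression output.
ODDS_RATIOS = {
    "t_remision": 0.9860856,  # per one-unit increase in t_remision
    "ccp": 2.068701,          # 1.ccp vs 0
    "erosive_RA": 1.183667,   # 1.erosive_RA vs 0
}
BASELINE_ODDS = 0.2845514     # _cons: odds when every covariate is 0


def propensity(t_remision, ccp, erosive_RA):
    """Predicted probability of optimization for one patient."""
    odds = (
        BASELINE_ODDS
        * ODDS_RATIOS["t_remision"] ** t_remision
        * ODDS_RATIOS["ccp"] ** ccp
        * ODDS_RATIOS["erosive_RA"] ** erosive_RA
    )
    return odds / (1 + odds)


# Hypothetical patient (values invented for illustration).
p = propensity(t_remision=24, ccp=1, erosive_RA=1)
print(f"Estimated propensity score: {p:.3f}")
```

Patients with a missing covariate get no prediction, which is consistent with the missing values reported by predict.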

23

4. Check goodness of fit

. estat gof, group(10) table

Logistic model for optimization, goodness-of-fit test

(Table collapsed on quantiles of estimated probabilities)
+--------------------------------------------------------+
| Group |   Prob | Obs_1 | Exp_1 | Obs_0 | Exp_0 | Total |
|-------+--------+-------+-------+-------+-------+-------|
|     1 | 0.0638 |     5 |   4.0 |    57 |  58.0 |    62 |
|     2 | 0.2005 |    10 |  10.0 |    46 |  46.0 |    56 |
|     3 | 0.2006 |     4 |  10.8 |    50 |  43.2 |    54 |
|     4 | 0.2048 |     7 |   8.2 |    33 |  31.8 |    40 |
|     5 | 0.2789 |    22 |  24.1 |    73 |  70.9 |    95 |
|-------+--------+-------+-------+-------+-------+-------|
|     6 | 0.2875 |    15 |  10.9 |    23 |  27.1 |    38 |
|     7 | 0.3608 |    28 |  16.5 |    19 |  30.5 |    47 |
|     8 | 0.3739 |    14 |  16.3 |    30 |  27.7 |    44 |
|     9 | 0.3972 |    21 |  16.5 |    21 |  25.5 |    42 |
|    10 | 0.4039 |    11 |  19.8 |    38 |  29.2 |    49 |
+--------------------------------------------------------+

      number of observations =       527
            number of groups =        10
     Hosmer-Lemeshow chi2(8) =     29.78
                 Prob > chi2 =    0.0002

24
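The Hosmer-Lemeshow statistic can be recomputed from the grouped table as a check on how it is built: for every group, add (observed - expected)² / expected for both the outcome = 1 and outcome = 0 columns. The sketch below does that with the counts copied from the table; the variable and function names are illustrative. The result lands close to the reported 29.78, with a small gap expected because the expected counts in the table are rounded to one decimal. A small p-value for this test is usually read as evidence that the model fits poorly.

```python
# (observed_1, expected_1, observed_0, expected_0) for each of the 10 groups,
# copied from the goodness-of-fit table.
groups = [
    (5, 4.0, 57, 58.0),
    (10, 10.0, 46, 46.0),
    (4, 10.8, 50, 43.2),
    (7, 8.2, 33, 31.8),
    (22, 24.1, 73, 70.9),
    (15, 10.9, 23, 27.1),
    (28, 16.5, 19, 30.5),
    (14, 16.3, 30, 27.7),
    (21, 16.5, 21, 25.5),
    (11, 19.8, 38, 29.2),
]


def hosmer_lemeshow(rows):
    """Sum of (observed - expected)^2 / expected over both outcome columns."""
    return sum(
        (o1 - e1) ** 2 / e1 + (o0 - e0) ** 2 / e0
        for o1, e1, o0, e0 in rows
    )


chi2 = hosmer_lemeshow(groups)
dof = len(groups) - 2  # usual degrees of freedom: number of groups minus 2
print(f"Hosmer-Lemeshow chi2({dof}) = {chi2:.2f}")
```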

Option A. Stratify by propensity score

. xtile qps = propensity, n(5)
. tab qps optimization

         5 |
 quantiles |
        of |     optimization
propensity |         0          1 |     Total
-----------+----------------------+----------
         1 |       103         15 |       118
         2 |        83         11 |        94
         3 |        96         37 |       133
         4 |        49         42 |        91
         5 |        59         32 |        91
-----------+----------------------+----------
     Total |       390        137 |       527

25

A.1. Use the stratification variable as an adjustment variable in models

. regress sdai optimization
------------------------------------------------------------------------------
        sdai |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
optimization |  -.0627626   .3200338    -0.20   0.845    -.6904025    .5648772
       _cons |   4.694656   .1526319    30.76   0.000     4.395319    4.993993
------------------------------------------------------------------------------

. xi: regress sdai optimization i.qps
i.qps             _Iqps_1-5           (naturally coded; _Iqps_1 omitted)
------------------------------------------------------------------------------
        sdai |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
optimization |  -1.481545   .6488591    -2.28   0.023    -2.756943   -.2061463
     _Iqps_2 |  -2.115092   .7884186    -2.68   0.008    -3.664809   -.5653753
     _Iqps_3 |    .688498   .7756742     0.89   0.375    -.8361684    2.213164
     _Iqps_4 |   4.753318   .8855135     5.37   0.000     3.012751    6.493884
     _Iqps_5 |   .4929259   .8187014     0.60   0.547    -1.116315    2.102167
       _cons |   4.414459   .5652331     7.81   0.000     3.303436    5.525482
------------------------------------------------------------------------------
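The idea behind adjusting for propensity strata can be sketched without a regression: compare treated and untreated within each stratum, then average the within-stratum differences. The code below does this with a stratum-size-weighted average, which is a simpler estimator than the regression with stratum indicators used above and will not reproduce its coefficients. The records are invented for illustration, and all names are assumptions.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical records, invented for illustration:
# (propensity stratum, treated 0/1, outcome such as a disease activity score).
records = [
    (1, 0, 3.0), (1, 0, 4.0), (1, 1, 2.0), (1, 1, 3.0),
    (2, 0, 8.0), (2, 1, 6.0), (2, 1, 7.0), (2, 1, 8.0),
]


def stratified_difference(rows):
    """Stratum-size-weighted average of within-stratum mean differences."""
    by_stratum = defaultdict(lambda: {0: [], 1: []})
    for stratum, treated, outcome in rows:
        by_stratum[stratum][treated].append(outcome)

    total, weighted_sum = 0, 0.0
    for arms in by_stratum.values():
        if not arms[0] or not arms[1]:
            continue  # a stratum without both arms carries no comparison
        n = len(arms[0]) + len(arms[1])
        weighted_sum += n * (mean(arms[1]) - mean(arms[0]))
        total += n
    return weighted_sum / total


crude = mean(y for _, t, y in records if t == 1) - mean(
    y for _, t, y in records if t == 0
)
adjusted = stratified_difference(records)
print(f"Crude difference: {crude:.2f}")
print(f"Stratified difference: {adjusted:.2f}")
```

In this toy data the crude difference is slightly positive while the stratified difference is negative, the same kind of shift seen between the unadjusted and stratum-adjusted coefficients for optimization.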

26

A.2. Use the stratification variable to select subgroups for analysis

. regress sdai optimization
------------------------------------------------------------------------------
        sdai |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
optimization |  -.0627626   .3200338    -0.20   0.845    -.6904025    .5648772
       _cons |   4.694656   .1526319    30.76   0.000     4.395319    4.993993
------------------------------------------------------------------------------

. xi: regress sdai optimization if inlist(qps, 4, 5)
------------------------------------------------------------------------------
        sdai |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
optimization |  -2.588866   1.277299    -2.03   0.045    -5.113251   -.0644814
       _cons |   7.188866   .7498024     9.59   0.000     5.706997    8.670735
------------------------------------------------------------------------------

27

Option B. Weighting by the propensity score
§ Inverse probability of treatment (IPT) weights
  § IPT weights change the distribution of confounders in both exposed and unexposed → the same distribution as in the entire sample.
§ SMR weights
  § They do not change the distribution in the exposed, but they do change it in the unexposed, so that it matches the exposed.
§ Assuming treatment has the same effect on all, SMR = IPT.
§ But if we assume that those who receive the treatment will benefit most, then SMR > IPT.
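The two weighting schemes can be written down in a few lines. The sketch below uses the standard textbook definitions: IPT weights are 1/ps for the treated and 1/(1 - ps) for the untreated, while SMR weights leave the treated at 1 and give the untreated the odds ps/(1 - ps). Whether any particular Stata command computes exactly these is an assumption; the function names and the example patients are invented for illustration.

```python
def ipt_weight(treated, ps):
    """Inverse probability of treatment weight for one subject."""
    return 1 / ps if treated else 1 / (1 - ps)


def smr_weight(treated, ps):
    """SMR weight: treated keep weight 1, untreated get the odds ps / (1 - ps)."""
    return 1.0 if treated else ps / (1 - ps)


# Hypothetical patients, invented for illustration: (treated 0/1, propensity score).
patients = [(1, 0.8), (1, 0.4), (0, 0.4), (0, 0.2)]
for treated, ps in patients:
    print(
        f"treated={treated} ps={ps:.1f} "
        f"IPT={ipt_weight(treated, ps):.2f} SMR={smr_weight(treated, ps):.2f}"
    )
```

An untreated patient with a low propensity score receives a small SMR weight because few treated patients resemble them, whereas under IPT weighting every patient is reweighted towards the whole sample.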

28

Example of weighting

. propwt optimization propensity, ipt smr

. regress y t
------------------------------------------------------------------------------
           y |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
           t |  -.5873868   .0495175   -11.86   0.000    -.6844982   -.4902754
       _cons |   .0448315   .0341635     1.31   0.190    -.0221684    .1118313
------------------------------------------------------------------------------

29

Inverse probability weights

SMR weights

Option C. Propensity score matching

. gmatch optimization propensity, set(set1) diff(diff1)

§ It generates 2 variables:
  § a case-control pair identifier, set1
  § the difference in propensity score between case and control, diff1
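One simple way to form matched pairs is greedy nearest-neighbour matching without replacement, optionally with a caliper on the allowed difference in propensity score. The sketch below illustrates that idea only; it is not claimed to be the algorithm behind the matching command above. The function name, the caliper value and the example scores are assumptions. Each returned tuple plays the role of a pair identifier together with the within-pair difference in propensity score.

```python
def greedy_match(treated, untreated, caliper=0.05):
    """Pair each treated subject with the closest unused untreated subject.

    treated, untreated: dicts mapping subject id -> propensity score.
    Returns a list of (treated_id, untreated_id, difference) tuples.
    """
    available = dict(untreated)
    pairs = []
    for t_id, t_ps in sorted(treated.items(), key=lambda kv: kv[1]):
        if not available:
            break
        u_id = min(available, key=lambda k: abs(available[k] - t_ps))
        diff = abs(available[u_id] - t_ps)
        if diff <= caliper:
            pairs.append((t_id, u_id, diff))
            del available[u_id]  # match without replacement
    return pairs


# Hypothetical propensity scores, invented for illustration.
treated = {"t1": 0.30, "t2": 0.38, "t3": 0.90}
untreated = {"u1": 0.29, "u2": 0.40, "u3": 0.10}
pairs = greedy_match(treated, untreated)
for t_id, u_id, diff in pairs:
    print(f"{t_id} matched to {u_id} (difference {diff:.2f})")
```

Treated subjects with no untreated counterpart inside the caliper stay unmatched and drop out of the analysis, which is the price of selecting only comparable patients.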

30

Confounding is not effect modification

31

Summary
§ When you want to compare groups that were not randomly assigned to a treatment, you can:
  § forget about it
  § adjust for confounders
  § select only those that are comparable
§ Always take confounding into account in observational studies.
  § think about it, read, follow others
  § do something about it
    o Measure => collect the variables!
    o Propensity score
    o Other confounders in multivariate analysis

32

A final cautious message

§ It has to make sense; data are only data.

§ Beware of strong conclusions; in science there’s little black or white and lots of grey.

Thank you!

33
