1
EPI235: Epi Methods in HSR
April 19, 2005 L6
Program Evaluation with Longitudinal Data 2: Basic Techniques (Dr. Schneeweiss)
Basic techniques and SAS code for interrupted linear regression for the evaluation of health programs. Time-varying confounding vs. constant confounders. Adjusting for correlated data. Handling overdispersion and skewed distributions of outcomes. Data structure and SAS code will be discussed.
Background reading:
• Wagner AK, Soumerai SB, Zhang F, Ross-Degnan D: Segmented regression analysis of interrupted time series studies in medication use research. J Clin Pharm Ther 2002;27:299-309.
• Schneeweiss S, Maclure M, Walker AM, Grootendorst P, Soumerai SB: On the evaluation of drug policy changes with longitudinal claims data: The policy maker's versus the clinician's perspective. Health Policy 2001;55:97-109.
• Schneeweiss S, Maclure M, Soumerai SB, Walker AM, Glynn RJ: Quasi-experimental longitudinal designs to evaluate drug benefit policy changes with low policy compliance. J Clin Epidemiol 2002;55:833-841.
2
Segmented (linear) regression
Series of measurements of a single characteristic in different time intervals
- Health facility utilization rates
- Drug consumption
- Prescribing indicators
Reasons for collection
- Description of levels and trends
- Prediction of future values
3
Segmented (linear) regression
Segments
- A specific event causes a change in the series, dividing it into distinct segments
- Estimating the change in the series allows you to assess the impact of the event
Validity
- Strongest non-experimental research design
- Pre-event level and trend serve as a built-in "control"
- Can have historical control time trend
- Can have concurrent control time trend
4
Options
Aggregate level time series
- Easy to do, simple data structure
Individual level time series (repeated measures)
- Can adjust for individual level time-varying covariates
- Multiplies the data
- Needs more sophisticated statistics, though this is no longer a problem
5
Assumption: Extrapolating the pre-intervention level and trend correctly reflects the (counterfactual) outcome that would have occurred had the intervention not happened.
Analysis of an intervention effect using segmented linear regression
[Figure: utilization rate over time, with the intervention marked. Labels: "immediate level change", "projected", "change slope", "slope=0".]
6
It’s the model AND the data structure
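$$Y_t = \beta_0 + \beta_1 \cdot \text{time}_t + \beta_2 \cdot \text{policy1}_t + \beta_3 \cdot \text{time after policy1}_t + e_t$$
[Figure: utilization rate over time, with the intervention marked. Labels: "immediate level change", "projected", "change slope", "slope=0".]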
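A short worked reading of the coefficients may help here. It assumes the coding in which "time after policy1" is 0 in the first post-intervention period and increases by 1 in each later period; the symbol k (number of periods since the start) is introduced only for this illustration.

```latex
\begin{align*}
\text{Counterfactual (no intervention):}\quad
  & \hat{Y}^{cf}_t = \beta_0 + \beta_1\,\text{time}_t \\
\text{Fitted value, } k \text{ periods after the start:}\quad
  & \hat{Y}_t = \beta_0 + \beta_1\,\text{time}_t + \beta_2 + \beta_3\,k \\
\text{Estimated effect at that time:}\quad
  & \hat{Y}_t - \hat{Y}^{cf}_t = \beta_2 + \beta_3\,k
\end{align*}
```

Under this coding, β2 is the immediate level change (k = 0) and β3 is the change in slope relative to the pre-intervention trend β1.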
7
It’s the model AND the data structure
[Figure: time axis. Labels: "Time", "to", "Program start date".]
8
Two segments example
Average number of constant-size prescriptions per continuously eligible Medicaid patient per month among multiple drug recipients
[Figure: mean number of prescriptions per patient (y-axis, 0 to 8) by study month (x-axis, Dec-79 to Apr-84), with the Cap and Copay policies marked.]
$$Y_t = \beta_0 + \beta_1 \cdot \text{time}_t + \beta_2 \cdot \text{policy1}_t + \beta_3 \cdot \text{time after policy1}_t + \beta_4 \cdot \text{policy2}_t + \beta_5 \cdot \text{time after policy2}_t + e_t$$
9
Two segments example
Time (Months)  Mean # of Rx  3-Drug Cap  Time after Cap  Copay  Time after Copay
...
17             5.10          0           0               0      0
18             5.10          0           0               0      0
19             5.00          0           0               0      0
20             6.20          0           0               0      0
21             2.60          1           0               0      0
22             2.80          1           1               0      0
23             2.75          1           2               0      0
24             2.75          1           3               0      0
25             2.95          1           4               0      0
...
32             3.85          1           11              1      0
33             3.65          1           12              1      1
34             3.55          1           13              1      2
...
Cap begins in month 21; Copay begins in month 32.
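The coding in the table can be generated mechanically. The following is a minimal Python sketch (not the lecture's SAS) that rebuilds the five regressors for each study month. The start months 21 and 32 are read off the table; the function name `design_row` and the 48-month range are assumptions made for illustration.

```python
def design_row(month, cap_start=21, copay_start=32):
    """Return (time, cap, t_cap, copay, t_copay) for one study month.

    Each policy indicator switches to 1 in the policy's first month, and the
    matching "time after" counter is 0 in that month and then counts up.
    """
    cap = 1 if month >= cap_start else 0
    t_cap = month - cap_start if cap else 0
    copay = 1 if month >= copay_start else 0
    t_copay = month - copay_start if copay else 0
    return (month, cap, t_cap, copay, t_copay)


# Assumed series length of 48 months, for illustration only.
rows = [design_row(m) for m in range(1, 49)]

for row in rows[19:22]:  # months 20, 21, 22
    print(row)
```

Each tuple corresponds to one line of the aggregate data set, to which the observed mean number of prescriptions would be attached as the outcome.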
10
Proc autoreg to evaluate aggregate time trend data in SAS:
PROC AUTOREG DATA=temp;
  * NLAG=1 requests a first-order autoregressive error term ;
  * DWPROB prints p-values for the Durbin-Watson statistic ;
  MODEL Rx = time cap t_cap copay t_copay
        / METHOD=ML NLAG=1 DWPROB COVB;
  OUTPUT OUT=ESTOUT
         PREDICTED=PRED LCL=PRED_L UCL=PRED_U RESIDUAL=RESID
         PREDICTEDM=PREDM LCLM=PREDM_L UCLM=PREDM_U RESIDUALM=RESIDM;
RUN;

PROC PRINT DATA=ESTOUT NOOBS;
  VAR PRED RESID PRED_L PRED_U PREDM_L PREDM_U PREDM RESIDM
      TIME CAP T_CAP COPAY T_COPAY;
RUN;
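The reason for using an autoregressive error model is that residuals of a monthly series tend to be correlated with their neighbours. As a rough illustration of what the Durbin-Watson check measures, here is a small Python sketch (not part of the SAS run) that computes the statistic from a list of residuals. The example residual vectors are made up.

```python
def durbin_watson(residuals):
    """Durbin-Watson statistic: sum of squared successive differences of the
    residuals divided by the sum of squared residuals.

    Values near 2 suggest no first-order autocorrelation, values well below 2
    suggest positive autocorrelation, values well above 2 negative.
    """
    if len(residuals) < 2:
        raise ValueError("need at least two residuals")
    numerator = sum(
        (residuals[i] - residuals[i - 1]) ** 2 for i in range(1, len(residuals))
    )
    denominator = sum(e * e for e in residuals)
    if denominator == 0:
        raise ValueError("all residuals are zero")
    return numerator / denominator


# Made-up residuals for illustration.
drifting = [1.0, 1.0, 1.0, -1.0, -1.0, -1.0]   # neighbours tend to agree
alternating = [1.0, -1.0, 1.0, -1.0]           # neighbours tend to disagree

print(durbin_watson(drifting), durbin_watson(alternating))
```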
11
Questions:
How to choose segments?
12
Repeated measurements
Modeling of repeated events:
Divide observation period into equal time periods, e.g. months, quarters.
Visits, hospitalizations, or ER visits are repeated events.
Within each quarter individuals can have multiple events (=count data).
Individual level time series
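Turning dated events into counts per time period can be sketched as follows. This is an illustrative Python sketch rather than the lecture's SAS code; it assumes quarters that start in April 1996 (period 1 = Apr/May/Jun 1996) and eight periods, and the function names and example events are hypothetical.

```python
from datetime import date


def quarter_index(d, start_year=1996, start_month=4):
    """1-based index of the 3-month period containing date d."""
    months = (d.year - start_year) * 12 + (d.month - start_month)
    if months < 0:
        raise ValueError("date precedes the observation period")
    return months // 3 + 1


def count_events(events, n_periods=8):
    """events: iterable of (patient_id, date) pairs.

    Returns {patient_id: [count in period 1, ..., count in period n_periods]}.
    Patients without any event in the window do not appear in the result and
    would need zero rows added from the cohort list.
    """
    counts = {}
    for patient, d in events:
        q = quarter_index(d)
        if q > n_periods:
            continue  # event falls after the observation window
        counts.setdefault(patient, [0] * n_periods)[q - 1] += 1
    return counts


# Made-up events for two patients.
events = [
    (1, date(1996, 4, 15)),
    (1, date(1996, 6, 30)),
    (1, date(1997, 1, 2)),
    (2, date(1998, 3, 31)),
]
visits = count_events(events)
print(visits)
```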
13
Data structure for repeated events modelling
One time period (T) equals 1 quarter from Q2/96 to Q1/98.
PSC1, PSC2: income status. ALL_CLAIM, ALL_VIS, ALL_HOS: outcome variables.

PATIENT     T  POST  T_POST  AGE    CSEX  CDS  PSC1  PSC2  ALL_CLAIM  ALL_VIS  ALL_HOS
7100000081  1  0     0       73.23  0     4    0     0     0          0        0
7100000081  2  0     0       73.48  0     4    0     0     0          0        0
7100000081  3  0     0       73.73  0     5    0     0     10         1        0
7100000081  4  1     1       73.90  0     5    0     0     21         7        0
7100000081  5  1     2       74.23  0     5    0     0     14         4        0
7100000081  6  1     3       74.48  0     5    0     0     5          4        0
7100000081  7  1     4       74.73  0     5    0     0     17         8        0
7100000081  8  1     5       74.90  0     5    0     0     0          0        0

7100000704  1  0     0       70.23  0     10   0     0     0          0        0
7100000704  2  0     0       70.48  0     10   0     0     0          0        0
7100000704  3  0     0       70.73  0     11   0     0     1          1        0
7100000704  4  1     1       70.90  0     11   0     0     2          1        0
7100000704  5  1     2       71.23  0     .    0     0     .          .        .
7100000704  6  1     3       71.48  0     .    0     0     .          .        .
7100000704  7  1     4       71.73  0     .    0     0     .          .        .
7100000704  8  1     5       71.90  0     .    0     0     .          .        .

7100000729  1  0     0       67.93  1     8    0     1     0          0        0
7100000729  2  0     0       68.18  1     8    0     1     38         15       0
7100000729  3  0     0       68.43  1     8    0     1     12         5        0
7100000729  4  1     1       68.60  1     12   0     1     2          1        0
7100000729  5  1     2       68.93  1     10   0     1     0          0        0
7100000729  6  1     3       69.18  1     10   0     1     2          2        0
7100000729  7  1     4       69.43  1     10   0     1     4          4        0
7100000729  8  1     5       69.60  1     10   0     1     0          0        0

7100001186  1  0     0       70.43  0     3    0     0     8          6        0
7100001186  2  0     0       70.68  0     3    0     0     2          2        0
7100001186  3  0     0       70.93  0     6    0     0     2          1        0
7100001186  4  1     1       71.10  0     6    0     0     10         5        1
7100001186  5  1     2       71.43  0     7    0     0     19         7        0
7100001186  6  1     3       71.68  0     6    0     0     0          0        0
7100001186  7  1     4       71.93  0     6    0     0     5          1        0
7100001186  8  1     5       72.10  0     6    0     0     0          0        0
14
=> Fit a generalized linear model using generalized estimating equations (GEE)
- modeling the events as a function of time and selected covariates (age, income status, sex, chronic disease score [CDS])
- with a Poisson distribution and a log link function
- allowing repeated events per subject
- assuming an autoregressive working correlation structure
- reporting parameter estimates with empirical (robust) standard errors
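Written out, the model described above could look like the following. This is a sketch: the covariate names follow the data structure slide, while the symbols φ (overdispersion parameter) and α (correlation between adjacent periods) are notation introduced here, not taken from the slides.

```latex
\begin{align*}
\log E[Y_{it}] &= \beta_0 + \beta_1\,\text{age}_{it} + \beta_2\,\text{csex}_{it}
  + \beta_3\,\text{CDS}_{it} + \beta_4\,\text{psc1}_{it} + \beta_5\,\text{psc2}_{it} \\
  &\quad + \beta_6\,t + \beta_7\,\text{post}_{t} + \beta_8\,\text{t\_post}_{t} \\
\operatorname{Var}(Y_{it}) &= \phi\,E[Y_{it}] \\
\operatorname{Corr}(Y_{is}, Y_{it}) &= \alpha^{|s-t|}
\end{align*}
```

Here Y_it is the event count of patient i in period t. The first line is the Poisson mean model on the log scale, the variance line allows for overdispersion (φ > 1), and the last line is the first-order autoregressive working correlation.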
15
Build data structure (1)
Usually have pre-processed data:
Lists of patient IDs with dates and services
ID  Date  Service  ICD1 … ICD16  ICPM1…10  Discharge date
1         Amb
1         Amb
1         ER
1         Hosp
1         Amb
2
n
Usually a separate Pharmacy File with similar structure
16
Build data structure (2)
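* merge the patient-level files and recode sex and age ;
data outcomes.temp_amb;
  merge outcomes.id orgdata.MSP_CV d_open.DMorASTH
        orgdata.drugs (keep = patient age clntsex);
  by patient;
  * drop patients with DM >= 3 or asthma >= 3 ;
  if ((DM>=3) or (asthma>=3)) then delete;
  * numeric sex code, unknown set to missing ;
  if clntsex='U' then csex=.;
  if clntsex='M' then csex=0;
  if clntsex='F' then csex=1;
  * implausible ages set to missing ;
  if age<0 then age=.;
  if age > 120 then age=.;
run;
proc print data=outcomes.temp_amb (obs=60); run;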
17
Build data structure (3)
* combine 3 months to one time unit starting Apr/May/Jun 1996 as t=1 ;
data outcomes.temp_amb;
  set outcomes.temp_amb;
  t1=age+1+(4/12);   * Apr/May/Jun 96;
  t2=age+1+(7/12);   * Jul/Aug/Sep 96;
  t3=age+1+(10/12);  * Oct/Nov/Dec 96;
  t4=age+2;          * Jan/Feb/Mar 97;
  t5=age+2+(4/12);   * Apr/May/Jun 97;
  t6=age+2+(7/12);   * Jul/Aug/Sep 97;
  t7=age+2+(10/12);  * Oct/Nov/Dec 97;
  t8=age+3;          * Jan/Feb/Mar 98;
run;
proc print data=outcomes.temp_amb (obs=60); run;

ID  ...  csex  age  T1     T2     T3     …  T8
1        1     69   70.33  70.58  70.83
2
n
18
Build data structure (4)
proc transpose data=outcomes.temp_amb out=outcomes.matrix3;
var t1 - t8 ;
by patient csex;
run;
ID  ...  csex  _name_  var1   t
1        1     T1      70.33  1
1        1     T2      70.58  2
1        1     T3      70.83  3
n

data outcomes.matrix3;
  set outcomes.matrix3;
  * code time;
  if _name_ = 'T1' then t=1;
  if _name_ = 'T2' then t=2;
  if _name_ = 'T3' then t=3;
  ...
  if _name_ = 'T8' then t=8;
  * DROP removes the variable, DELETE would remove the observation ;
  drop _name_ ;
run;
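The wide-to-long step can also be pictured outside SAS. The Python sketch below is an illustration only: it rebuilds the eight age columns with the same offsets as the SAS data step and then stacks them into one row per patient and period. The names `make_wide` and `wide_to_long` are hypothetical, and the value column is called `age` here for readability.

```python
# Offsets (in years) added to the recorded age for t1..t8, as in the SAS step.
QUARTER_OFFSETS = [
    1 + 4 / 12, 1 + 7 / 12, 1 + 10 / 12, 2,
    2 + 4 / 12, 2 + 7 / 12, 2 + 10 / 12, 3,
]


def make_wide(patient, csex, age):
    """One row per patient with columns t1..t8 (age in each quarter)."""
    row = {"patient": patient, "csex": csex}
    for k, offset in enumerate(QUARTER_OFFSETS, start=1):
        row[f"t{k}"] = age + offset
    return row


def wide_to_long(wide_rows):
    """Stack t1..t8 into one row per patient and period, with a numeric t."""
    long_rows = []
    for row in wide_rows:
        for k in range(1, len(QUARTER_OFFSETS) + 1):
            long_rows.append(
                {
                    "patient": row["patient"],
                    "csex": row["csex"],
                    "t": k,
                    "age": row[f"t{k}"],
                }
            )
    return long_rows


long_rows = wide_to_long([make_wide(patient=1, csex=1, age=69)])
for r in long_rows[:3]:
    print(r["patient"], r["csex"], r["t"], round(r["age"], 2))
```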
19
Build data structure (5)
Coding of exposure ( = time):
  if t>=4 then post=1; else post=0;
  * code time trend, 1 interruption ;
  if t>=4 then t_post=t-3; else t_post=0;
Patients leaving the cohort:
- Set outcomes to missing for that time period and all following periods
- SAS will not use those observations
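The same coding rules, written as a small Python sketch for illustration. The interruption at t = 4 is taken from the SAS statements above; the function names and the example outcome vector are hypothetical.

```python
def code_exposure(t, first_post_period=4):
    """Return (post, t_post) for period t.

    post is 1 from the first post-policy period onward; t_post counts
    1, 2, 3, ... from that period and is 0 before it (t_post = t - 3
    when the first post-policy period is 4).
    """
    post = 1 if t >= first_post_period else 0
    t_post = t - (first_post_period - 1) if post else 0
    return post, t_post


def censor_after_exit(outcomes, exit_period):
    """Set outcomes to None (missing) from exit_period onward (1-based)."""
    return [y if t < exit_period else None for t, y in enumerate(outcomes, start=1)]


print([code_exposure(t) for t in range(1, 9)])

# Made-up visit counts for a patient who leaves the cohort in period 5.
print(censor_after_exit([0, 0, 1, 1, 3, 2, 0, 0], exit_period=5))
```

Note that this individual-level coding starts t_post at 1 in the first post-policy period, whereas the aggregate example earlier starts its "time after" counter at 0.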
20
Build data structure (6)
How to assess a time-varying chronic disease score?
CDS is a measure of health status derived from the utilization of prescription medications.
- Advantage: well recorded
- Disadvantage: measures a mix of diseases and prescribing behavior
Usually CDS is assessed over 6 months.
Algorithm:
1. Identify the 6-month time period preceding t1
2. Calculate CDS and store the result in the line of t1
3. Repeat for all time periods
(cut and paste, or SAS macro)
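The loop over time periods can be sketched in Python as follows. This is only an illustration of the windowing logic: it assumes the score is a sum of weights over the medication classes dispensed in the look-back window, and the class names and weights below are hypothetical placeholders, not the published chronic disease score weights.

```python
from datetime import date

# Hypothetical weights for illustration only (NOT the real CDS weights).
EXAMPLE_WEIGHTS = {"diabetes": 2, "asthma": 2, "hypertension": 1}


def shift_months(d, n):
    """First day of the month that lies n months from d's month (n may be negative)."""
    idx = d.year * 12 + (d.month - 1) + n
    return date(idx // 12, idx % 12 + 1, 1)


def cds_for_period(dispensings, period_start, weights=EXAMPLE_WEIGHTS, lookback_months=6):
    """Score from the drug classes dispensed in the 6 months before period_start.

    dispensings: iterable of (dispensing_date, drug_class) pairs.
    """
    window_start = shift_months(period_start, -lookback_months)
    classes = {cls for d, cls in dispensings if window_start <= d < period_start}
    return sum(weights.get(cls, 0) for cls in classes)


def cds_by_period(dispensings, first_start=date(1996, 4, 1), n_periods=8):
    """Repeat the calculation for each quarterly period t1..t8."""
    return [
        cds_for_period(dispensings, shift_months(first_start, 3 * k))
        for k in range(n_periods)
    ]


# Made-up dispensing history for one patient.
history = [
    (date(1995, 11, 10), "hypertension"),
    (date(1996, 5, 2), "diabetes"),
    (date(1996, 5, 20), "diabetes"),
]
scores = cds_by_period(history)
print(scores)
```

Each element of `scores` would be stored on the data line of the corresponding period, which is what makes the score time-varying.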
21
Poisson regression with repeated events in SAS:
proc genmod data=temp;
class patient;
output out=outcomes.pred1 predicted=pred lower=lower upper=upper;
model allvis = age csex cds psc1 psc2 t post t_post
/dist = poisson scale=deviance;
repeated subject = patient/sorted type=AR ;
run;
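The `scale=deviance` option asks for an overdispersion adjustment based on the model deviance. As a rough illustration of the quantity involved, the Python sketch below computes the Poisson deviance from observed and fitted counts and divides it by the residual degrees of freedom. It is not a reimplementation of PROC GENMOD; the function names and the example numbers are made up.

```python
import math


def poisson_deviance(observed, fitted):
    """Poisson deviance: 2 * sum( y*log(y/mu) - (y - mu) ), with y*log(y/mu) = 0 when y = 0."""
    deviance = 0.0
    for y, mu in zip(observed, fitted):
        if mu <= 0:
            raise ValueError("fitted means must be positive")
        log_term = y * math.log(y / mu) if y > 0 else 0.0
        deviance += 2.0 * (log_term - (y - mu))
    return deviance


def dispersion(observed, fitted, n_params):
    """Deviance divided by residual degrees of freedom (observations minus parameters)."""
    df = len(observed) - n_params
    if df <= 0:
        raise ValueError("not enough observations for this many parameters")
    return poisson_deviance(observed, fitted) / df


# Made-up counts and fitted means for illustration.
y = [0, 1, 7, 0, 15, 2, 0, 4]
mu = [2.0, 2.5, 3.0, 3.0, 4.0, 4.0, 4.5, 5.0]
print(round(dispersion(y, mu, n_params=2), 2))
```

A ratio well above 1 suggests that the counts vary more than a Poisson model expects, which is the situation the scale adjustment is meant to handle.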
22
Monthly physician visit days per patient in policy and historical control cohorts of ACE
inhibitor recipients. The vertical line indicates the implementation of reference pricing.
Differences in rates between both cohorts are plotted at the right hand scale. Trends
are adjusted for deaths, emigrations and length of month.
[Figure: x-axis: months, Jan-96 to Apr-98. Left y-axis: number of physician visit days per patient (0 to 2.5). Right y-axis: difference in physician encounter days per patient between policy cohort and historical controls (-0.5 to 2). Series: physician visit days per patient; physician visit days per patient (control cohort); difference between policy cohort and historical controls (right-hand scale).]
23
Estimation 1:
M1: allvis = age csex CDS1 psc1 psc2 t post t_post, 8 quarters
n=30,000
PARM       ESTIMATE  STDR   LOWER   UPPER  PZSCR
INTERCEPT  0.948     0.044  0.863   1.034  0.000
AGE        0.007     0.001  0.006   0.008  0.000
CSEX       0.008     0.009  -0.010  0.025  0.387
CDS        0.397     0.008  0.381   0.413  0.000
PSC1       0.018     0.012  -0.005  0.042  0.121
PSC2       -0.015    0.017  -0.048  0.019  0.390
T          0.012     0.003  0.006   0.019  0.000
POST       0.405     0.007  0.392   0.418  0.000
Scale      2.416
Computing power (1999):
PC pentium 450, 380 MB memory, lots of hard-drive space
2:30 hours for n=30,000, >6 hours for n=48,000
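Because the model uses a log link, the estimates are on the log scale and are easier to read after exponentiation, as rate ratios. A minimal Python sketch using the POST row of the table; the helper name `rate_ratio` is hypothetical.

```python
import math


def rate_ratio(estimate, lower, upper):
    """Exponentiate a log-scale estimate and its confidence limits."""
    return math.exp(estimate), math.exp(lower), math.exp(upper)


# POST row from the table: ESTIMATE, LOWER, UPPER.
rr, rr_low, rr_high = rate_ratio(0.405, 0.392, 0.418)
print(f"POST: rate ratio {rr:.2f} (limits {rr_low:.2f} to {rr_high:.2f})")
```

Read this way, the POST coefficient corresponds to a visit rate after the policy start that is about 1.5 times the pre-policy rate, holding the other covariates fixed.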
24
Estimation 2:
M1: allvis = age csex CDS1 psc1 psc2 t post cohort post*cohort, 8 quarters
n=30,000
PARM         ESTIMATE  STDR   LOWER   UPPER   PZSCR
INTERCEPT    0.948     0.044  0.863   1.034   0.000
AGE          0.007     0.001  0.006   0.008   0.000
CSEX         0.008     0.009  -0.010  0.025   0.387
CDS          0.397     0.008  0.381   0.413   0.000
PSC1         0.018     0.012  -0.005  0.042   0.121
PSC2         -0.015    0.017  -0.048  0.019   0.390
T            0.012     0.003  0.006   0.019   0.000
POST         0.405     0.007  0.392   0.418   0.000
COHORT       0.004     0.003  -0.013  -0.020  0.546
POST_cohort  0.010     0.003  -0.006  -0.028  0.231
Scale        2.416
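One way to read the post*cohort term, sketched under the assumption that cohort is a 0/1 indicator (which cohort is coded 1 is not stated on the slide) and with the remaining covariates abbreviated by dots:

```latex
\begin{align*}
\log E[Y] &= \dots + \beta_{post}\,\text{post} + \beta_{cohort}\,\text{cohort}
  + \beta_{int}\,(\text{post}\times\text{cohort}) \\[4pt]
\frac{\text{post/pre rate ratio when cohort} = 1}{\text{post/pre rate ratio when cohort} = 0}
  &= \frac{\exp(\beta_{post} + \beta_{int})}{\exp(\beta_{post})} = \exp(\beta_{int})
\end{align*}
```

The exponentiated interaction coefficient is therefore a ratio of rate ratios: it compares the pre-to-post change in one cohort with the pre-to-post change in the other.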