
1

EPI235: Epi Methods in HSR

April 19, 2005 L6

Program Evaluation with Longitudinal Data 2: Basic Techniques (Dr. Schneeweiss)

Basic techniques and SAS code for interrupted linear regression for the evaluation of health programs. Time-varying confounding vs. constant confounders. Adjusting for correlated data. Handling overdispersion and skewed distributions of outcomes. Data structure and SAS code will be discussed.

Background reading:
• Wagner AK, Soumerai SB, Zhang F, Ross-Degnan D: Segmented regression analysis of interrupted time series studies in medication use research. J Clin Pharm Ther 2002;27:299-309.
• Schneeweiss S, Maclure M, Walker AM, Grootendorst P, Soumerai SB: On the evaluation of drug policy changes with longitudinal claims data: The Policy maker’s versus the clinician’s perspective. Health Policy 2001;55:97-109.
• Schneeweiss S, Maclure M, Soumerai SB, Walker AM, Glynn RJ: Quasi-experimental longitudinal designs to evaluate drug benefit policy changes with low policy compliance. J Clin Epidemiol 2002;55:833-841.

2

Segmented (linear) regression

Series of measurements of a single characteristic in different time intervals
- Health facility utilization rates
- Drug consumption
- Prescribing indicators

Reasons for collection
- Description of levels and trends
- Prediction of future values

3

Segmented (linear) regression

Segments
- A specific event causes a change in the series, dividing it into distinct segments
- Estimating the change in the series allows you to assess the impact of the event

Validity
- Strongest non-experimental research design
- Pre-event level and trend serve as a built-in “control”
- Can have historical control time trend
- Can have concurrent control time trend

4

Options

Aggregate level time series
- Easy to do, simple data structure

Individual level time series (repeated measures)
- Can adjust for individual level time-varying covariates
- Multiplies the data
- Needs more sophisticated statistics, though this is no longer a problem

5

Assumption: Extrapolating the pre-intervention level and trend correctly reflects the (counterfactual) outcome that would have occurred had the intervention not happened.

Analysis of an intervention effect using segmented linear regression

[Figure: utilization rate over time with the intervention marked. Labels: immediate level change; projected; change in slope; slope = 0.]

6

It’s the model AND the data structure

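$$Y_t = \beta_0 + \beta_1 \cdot \text{time}_t + \beta_2 \cdot \text{policy1}_t + \beta_3 \cdot \text{time after policy1}_t + e_t$$

[Figure: utilization rate over time with the intervention marked. Labels: immediate level change; projected; change in slope; slope = 0.]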
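The role of each coefficient can be sketched in a few lines of Python (not the SAS used in this lecture). The function names and the coefficient values in the usage note are hypothetical. The sketch assumes that "time after policy" is 0 in the first policy month and then counts up, and that the counterfactual is the extrapolated pre-intervention line.

```python
def fitted(t, t0, b0, b1, b2, b3):
    """Fitted value of Y_t = b0 + b1*time + b2*policy + b3*time_after_policy.

    t0 is the first time point under the policy (an assumption of this sketch).
    """
    policy = 1 if t >= t0 else 0
    time_after = (t - t0) if policy else 0
    return b0 + b1 * t + b2 * policy + b3 * time_after


def counterfactual(t, b0, b1):
    """Extrapolated pre-intervention level and trend (no policy terms)."""
    return b0 + b1 * t


def policy_effect(t, t0, b2, b3):
    """Fitted minus counterfactual: level change plus accumulated slope change."""
    if t < t0:
        return 0.0
    return b2 + b3 * (t - t0)
```

With made-up values b0=5, b1=0, b2=-2.5, b3=0.1 and t0=21, `policy_effect(25, 21, -2.5, 0.1)` is the immediate level change plus four periods of slope change, and it equals `fitted(25, ...) - counterfactual(25, ...)`.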

7

It’s the model AND the data structure

[Figure. Labels: Time; to; Program start date.]

8

Two segments example

[Figure: Average number of constant-size prescriptions per continuously eligible Medicaid patient per month among multiple drug recipients. x-axis: Study Month (Dec-79 to Apr-84). y-axis: Mean number of prescriptions per patient (0 to 8). Interventions marked: Cap, Copay.]

$$Y_t = \beta_0 + \beta_1 \cdot \text{time}_t + \beta_2 \cdot \text{policy1}_t + \beta_3 \cdot \text{time after policy1}_t + \beta_4 \cdot \text{policy2}_t + \beta_5 \cdot \text{time after policy2}_t + e_t$$

9

Two segments example

Time (Months)  Mean # of Rx  3-Drug Cap  Time after Cap  Copay  Time after Copay
***
17             5.10          0           0               0      0
18             5.10          0           0               0      0
19             5.00          0           0               0      0
20             6.20          0           0               0      0
21             2.60          1           0               0      0
22             2.80          1           1               0      0
23             2.75          1           2               0      0
24             2.75          1           3               0      0
25             2.95          1           4               0      0
***
32             3.85          1           11              1      0
33             3.65          1           12              1      1
34             3.55          1           13              1      2
***

Annotations: Cap, Copay (the rows where each indicator switches from 0 to 1).
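The indicator and time-after columns follow mechanically from the month number. A minimal Python sketch of that coding is below (not the lecture's SAS). The function name is hypothetical, and the start months 21 and 32 are read off the rows shown, assuming that each "time after" counter is 0 in the first month of its policy.

```python
def segment_columns(month, cap_start=21, copay_start=32):
    """Return the regressors for one study month as a dict.

    cap_start and copay_start are assumptions taken from the rows shown:
    the first month in which each indicator equals 1.
    """
    cap = 1 if month >= cap_start else 0
    copay = 1 if month >= copay_start else 0
    return {
        "time": month,
        "cap": cap,
        "t_cap": (month - cap_start) if cap else 0,
        "copay": copay,
        "t_copay": (month - copay_start) if copay else 0,
    }
```

Calling `segment_columns(33)` reproduces the row for month 33: cap 1, time after cap 12, copay 1, time after copay 1.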

10

Proc autoreg to evaluate aggregate time trend data in SAS:

PROC AUTOREG DATA=temp;
  MODEL Rx = time cap t_cap copay t_copay / METHOD=ML NLAG=1 DWPROB COVB;
  OUTPUT OUT=ESTOUT PREDICTED=PRED LCL=PRED_L UCL=PRED_U RESIDUAL=RESID
         PREDICTEDM=PREDM LCLM=PREDM_L UCLM=PREDM_U RESIDUALM=RESIDM;
RUN;

PROC PRINT DATA=ESTOUT NOOBS;
  VAR PRED RESID PRED_L PRED_U PREDM_L PREDM_U PREDM RESIDM TIME CAP T_CAP COPAY T_COPAY;
RUN;
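For readers without SAS, the structural part of this model can be illustrated with ordinary least squares in plain Python. This is a simplified sketch, not an equivalent of PROC AUTOREG: it ignores the first-order autoregressive error term requested by NLAG=1 and only computes a Durbin-Watson statistic on the residuals. All function names, the start months, and the synthetic coefficients are assumptions made for the illustration.

```python
def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        s = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - s) / m[r][r]
    return x


def ols(X, y):
    """Least-squares coefficients from the normal equations (X'X) b = X'y."""
    p = len(X[0])
    xtx = [[sum(row[i] * row[j] for row in X) for j in range(p)] for i in range(p)]
    xty = [sum(row[i] * yi for row, yi in zip(X, y)) for i in range(p)]
    return solve(xtx, xty)


def durbin_watson(resid):
    """Durbin-Watson statistic; values near 2 suggest little lag-1 autocorrelation."""
    num = sum((resid[i] - resid[i - 1]) ** 2 for i in range(1, len(resid)))
    return num / sum(e * e for e in resid)


def design_row(month, cap_start=21, copay_start=32):
    """Intercept, time, cap, t_cap, copay, t_copay (start months are assumptions)."""
    cap = 1 if month >= cap_start else 0
    copay = 1 if month >= copay_start else 0
    return [1.0, month, cap, cap * (month - cap_start),
            copay, copay * (month - copay_start)]


# Synthetic, noise-free series built from made-up coefficients (not study data).
true_beta = [5.0, 0.01, -2.5, 0.05, 0.3, -0.02]
X = [design_row(m) for m in range(1, 49)]
y = [sum(b * x for b, x in zip(true_beta, row)) for row in X]
beta_hat = ols(X, y)
```

On real data the residuals would be `y[i]` minus the fitted value for row `i`; passing them to `durbin_watson` gives a first check on whether an autoregressive error model is needed.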

11

Questions:

How to choose segments?

12

Repeated measurements

Modeling of repeated events:

Divide observation period into equal time periods, e.g. months, quarters.

Visits, hospitalizations, or ER visits are repeated events.

Within each quarter individuals can have multiple events (=count data).

Individual level time series

13

Data structure for repeated events modelling

One time period (T) equals 1 quarter from Q2/96 to Q1/98.
Income status: PSC1, PSC2. Outcome variables: ALL_CLAIM, ALL_VIS, ALL_HOS.

PATIENT     T  POST  T_POST  AGE    CSEX  CDS  PSC1  PSC2  ALL_CLAIM  ALL_VIS  ALL_HOS
7100000081  1  0     0       73.23  0     4    0     0     0          0        0
7100000081  2  0     0       73.48  0     4    0     0     0          0        0
7100000081  3  0     0       73.73  0     5    0     0     10         1        0
7100000081  4  1     1       73.90  0     5    0     0     21         7        0
7100000081  5  1     2       74.23  0     5    0     0     14         4        0
7100000081  6  1     3       74.48  0     5    0     0     5          4        0
7100000081  7  1     4       74.73  0     5    0     0     17         8        0
7100000081  8  1     5       74.90  0     5    0     0     0          0        0

7100000704  1  0     0       70.23  0     10   0     0     0          0        0
7100000704  2  0     0       70.48  0     10   0     0     0          0        0
7100000704  3  0     0       70.73  0     11   0     0     1          1        0
7100000704  4  1     1       70.90  0     11   0     0     2          1        0
7100000704  5  1     2       71.23  0     .    0     0     .          .        .
7100000704  6  1     3       71.48  0     .    0     0     .          .        .
7100000704  7  1     4       71.73  0     .    0     0     .          .        .
7100000704  8  1     5       71.90  0     .    0     0     .          .        .

7100000729  1  0     0       67.93  1     8    0     1     0          0        0
7100000729  2  0     0       68.18  1     8    0     1     38         15       0
7100000729  3  0     0       68.43  1     8    0     1     12         5        0
7100000729  4  1     1       68.60  1     12   0     1     2          1        0
7100000729  5  1     2       68.93  1     10   0     1     0          0        0
7100000729  6  1     3       69.18  1     10   0     1     2          2        0
7100000729  7  1     4       69.43  1     10   0     1     4          4        0
7100000729  8  1     5       69.60  1     10   0     1     0          0        0

7100001186  1  0     0       70.43  0     3    0     0     8          6        0
7100001186  2  0     0       70.68  0     3    0     0     2          2        0
7100001186  3  0     0       70.93  0     6    0     0     2          1        0
7100001186  4  1     1       71.10  0     6    0     0     10         5        1
7100001186  5  1     2       71.43  0     7    0     0     19         7        0
7100001186  6  1     3       71.68  0     6    0     0     0          0        0
7100001186  7  1     4       71.93  0     6    0     0     5          1        0
7100001186  8  1     5       72.10  0     6    0     0     0          0        0
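Outcome columns of this kind are counts of dated events per patient and quarter. The following Python sketch (not the lecture's SAS) shows one way to derive them, assuming t=1 is the quarter starting April 1996 and that events arrive as (patient, date) pairs. The function names and the input layout are hypothetical.

```python
from datetime import date


def quarter_index(d, start_year=1996, start_month=4):
    """1-based index of the 3-month period containing d (t=1 starts April 1996)."""
    months = (d.year - start_year) * 12 + (d.month - start_month)
    return months // 3 + 1


def count_events(events, n_periods=8):
    """Count events per patient and period.

    events: iterable of (patient_id, event_date).
    Returns {patient_id: [count for t=1, ..., count for t=n_periods]}.
    Events outside the observation window are ignored.
    """
    counts = {}
    for patient, d in events:
        row = counts.setdefault(patient, [0] * n_periods)
        t = quarter_index(d)
        if 1 <= t <= n_periods:
            row[t - 1] += 1
    return counts


example = count_events([
    (1, date(1996, 4, 15)),
    (1, date(1996, 5, 1)),
    (1, date(1997, 1, 10)),
    (2, date(1998, 4, 1)),   # after Q1/98, so not counted
])
```

Each list in the result corresponds to one patient's eight rows in the long data set, one count per quarter.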

14

=> Fit a generalized linear model using generalized estimating equations (GEE)

- modeling the events as a function of time and selected covariates (age, income status, sex, chronic disease score [CDS])

- with a Poisson distribution and log link function

- allowing repeated events per subject

- assuming an autoregressive covariance structure

- reporting empirical (robust) standard error estimates.

15

Build data structure (1)

Usually have pre-processed data:

Lists of patient IDs with dates and services

ID   Date   Service   ICD1 … ICD16   ICPM1…10   Dischar. date
1           Amb
1           Amb
1           ER
1           Hosp
1           Amb
2
n

Usually a separate Pharmacy File with similar structure

16


data outcomes.temp_amb;
  merge outcomes.id orgdata.MSP_CV d_open.DMorASTH
        orgdata.drugs (keep = patient age clntsex);
  by patient;
  * exclusion based on the DM and asthma variables ;
  if ((DM>=3) or (asthma>=3)) then delete;
  * recode sex: M=0, F=1, unknown=missing ;
  if clntsex='U' then csex=.;
  if clntsex='M' then csex=0;
  if clntsex='F' then csex=1;
  * set implausible ages to missing ;
  if age<0 then age=.;
  if age > 120 then age=.;
run;

proc print data=outcomes.temp_amb (obs=60); run;

17


* combine 3 months to one time unit starting Apr/May/Jun 1996 as t=1 ;

data outcomes.temp_amb; set outcomes.temp_amb;
  t1=age+1+(4/12);   * Apr/May/Jun 96;
  t2=age+1+(7/12);   * Jul/Aug/Sep 96;
  t3=age+1+(10/12);  * Oct/Nov/Dec 96;
  t4=age+2;          * Jan/Feb/Mar 97;
  t5=age+2+(4/12);   * Apr/May/Jun 97;
  t6=age+2+(7/12);   * Jul/Aug/Sep 97;
  t7=age+2+(10/12);  * Oct/Nov/Dec 97;
  t8=age+3;          * Jan/Feb/Mar 98;
run;

proc print data=outcomes.temp_amb (obs=60); run;

ID   ...  csex  age  T1     T2     T3     …  T8
1         1     69   70.33  70.58  70.83
2
n

18


proc transpose data=outcomes.temp_amb out=outcomes.matrix3;

var t1 - t8 ;

by patient csex;

run;

ID   ...  csex  _name_  var1   t
1         1     T1      70.33  1
1         1     T2      70.58  2
1         1     T3      70.83  3
n

data outcomes.matrix3; set outcomes.matrix3;
  * code time;
  if _name_ = 'T1' then t=1;
  if _name_ = 'T2' then t=2;
  if _name_ = 'T3' then t=3;
  ...
  if _name_ = 'T8' then t=8;
  drop _name_ ;
run;
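The same wide-to-long step can be sketched in Python for comparison (this is not the lecture's SAS). The age offsets are copied from the t1 to t8 assignments in the SAS data step; the function name, the dict layout, and the key `age_t` are hypothetical.

```python
# Age offsets for t=1..8, copied from the t1-t8 assignments in the SAS step.
OFFSETS = [1 + 4 / 12, 1 + 7 / 12, 1 + 10 / 12, 2,
           2 + 4 / 12, 2 + 7 / 12, 2 + 10 / 12, 3]


def wide_to_long(patients):
    """Turn one record per patient into one record per patient and period.

    patients: list of dicts with keys 'patient', 'csex', 'age'.
    Returns a list of dicts with keys 'patient', 'csex', 't', 'age_t'.
    """
    long_rows = []
    for p in patients:
        for t, offset in enumerate(OFFSETS, start=1):
            long_rows.append({
                "patient": p["patient"],
                "csex": p["csex"],
                "t": t,
                "age_t": p["age"] + offset,
            })
    return long_rows


rows = wide_to_long([{"patient": 1, "csex": 1, "age": 69}])
```

For the patient shown on the slide (age 69), the first three long rows carry ages that round to 70.33, 70.58 and 70.83.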

19


Coding of exposure ( = time):

if t>=4 then post=1; else post=0;
* code time trend, 1 interruption ;
if t>=4 then t_post=t-3; else t_post=0;

Patients leaving the cohort:
- Set outcomes missing in that month (or following)
- SAS will not use those observations
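A small Python sketch of the same two rules (not the lecture's SAS): the exposure coding for one interruption at t=4, and blanking outcomes once a patient has left the cohort. The function names and the `last_observed_t` argument are hypothetical; `None` stands in for a SAS missing value.

```python
def code_exposure(t, first_post_period=4):
    """Return (post, t_post): post=1 from the first policy period on,
    t_post counts 1, 2, 3, ... starting in that period."""
    post = 1 if t >= first_post_period else 0
    t_post = t - (first_post_period - 1) if post else 0
    return post, t_post


def censor_after(outcomes, last_observed_t):
    """Set outcomes to None for all periods after the patient left the cohort.

    outcomes: list of values for t=1, 2, ...; last_observed_t: last period
    in which the patient was still in the cohort (hypothetical input).
    """
    return [value if t <= last_observed_t else None
            for t, value in enumerate(outcomes, start=1)]
```

For example, `code_exposure(4)` gives `(1, 1)` and `code_exposure(8)` gives `(1, 5)`, matching the POST and T_POST columns in the data structure slide.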

20


How to assess a time-varying chronic disease score?

CDS is a measure of health status derived from the utilization of prescription medications.
- Advantage: well recorded
- Disadvantage: measures a mix of diseases and prescribing behavior

Usually CDS is assessed over 6 months. Algorithm:
1. Identify the 6-month time period preceding t1
2. Calculate CDS and store the result in the line of t1
3. Repeat for all time periods
(cut and paste, or SAS macro)
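The three steps can be sketched as a loop in Python (not the lecture's SAS macro). The sketch assumes t=1 starts on 1 April 1996 with 3-month periods, and that pharmacy fills arrive as (date, drug class) pairs. The scoring function is a labelled placeholder: the actual CDS weights are not given here, so `placeholder_score` simply counts distinct drug classes.

```python
from datetime import date


def add_months(first_of_month, n):
    """Shift a first-of-month date by n months (n may be negative)."""
    index = first_of_month.year * 12 + (first_of_month.month - 1) + n
    return date(index // 12, index % 12 + 1, 1)


def period_start(t, origin=date(1996, 4, 1)):
    """First day of period t, with 3-month periods (origin is an assumption)."""
    return add_months(origin, 3 * (t - 1))


def lookback_window(t, months=6):
    """Step 1: half-open window [start, end) covering the months before period t."""
    end = period_start(t)
    return add_months(end, -months), end


def time_varying_cds(fills, score_fn, n_periods=8):
    """Steps 2 and 3: score the fills in each window and store the result under t."""
    scores = {}
    for t in range(1, n_periods + 1):
        start, end = lookback_window(t)
        drugs = [drug for d, drug in fills if start <= d < end]
        scores[t] = score_fn(drugs)
    return scores


def placeholder_score(drugs):
    """Hypothetical stand-in for the real CDS weights: number of distinct classes."""
    return len(set(drugs))
```

A patient with fills on 1995-11-15 (class "A") and 1996-02-03 (class "B") gets a placeholder score of 2 at t=1, and 0 from t=4 onward once both fills fall outside the 6-month window.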

21

Poisson regression with repeated events in SAS:

proc genmod data=temp;

class patient;

output out=outcomes.pred1 predicted=pred lower=lower upper=upper;

model allvis = age csex cds psc1 psc2 t post t_post

/dist = poisson scale=deviance;

repeated subject = patient/sorted type=AR ;

run;
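The scale=deviance option relates to overdispersion, which can be gauged from the Poisson deviance of the fitted model. A minimal Python sketch of that calculation follows, using the textbook definition of the Poisson deviance and dividing by the residual degrees of freedom. How PROC GENMOD reports the resulting scale value is not shown here, and the function names are hypothetical.

```python
import math


def poisson_deviance(y, mu):
    """Poisson deviance: 2 * sum( y*log(y/mu) - (y - mu) ), with 0*log(0) taken as 0."""
    total = 0.0
    for yi, mi in zip(y, mu):
        log_term = yi * math.log(yi / mi) if yi > 0 else 0.0
        total += log_term - (yi - mi)
    return 2.0 * total


def dispersion(y, mu, n_params):
    """Deviance divided by residual degrees of freedom (n observations - n parameters).

    Values well above 1 suggest more variability than a Poisson model implies.
    """
    return poisson_deviance(y, mu) / (len(y) - n_params)
```

Here `y` holds the observed counts and `mu` the fitted means, for example the predicted values written by the output statement.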

22

Monthly physician visit days per patient in policy and historical control cohorts of ACE inhibitor recipients. The vertical line indicates the implementation of reference pricing. Differences in rates between both cohorts are plotted on the right-hand scale. Trends are adjusted for deaths, emigrations and length of month.

[Figure: x-axis: Months (Jan-96 to Apr-98). Left y-axis: Number of physician visit days per patient (0 to 2.5). Right y-axis: Difference in physician encounter days per patient between policy cohort and historical controls (-0.5 to 2). Series: Physician visit days per patient; Physician visit days per patient (Control cohort); Difference between policy cohort and histor. controls (right hand scale).]

23

Estimation 1:

M1: allvis = age csex CDS1 psc1 psc2 t post t_post, 8 quarters

n=30,000

PARM        ESTIMATE  STDR   LOWER   UPPER   PZSCR
INTERCEPT   0.948     0.044  0.863   1.034   0.000
AGE         0.007     0.001  0.006   0.008   0.000
CSEX        0.008     0.009  -0.010  0.025   0.387
CDS         0.397     0.008  0.381   0.413   0.000
PSC1        0.018     0.012  -0.005  0.042   0.121
PSC2        -0.015    0.017  -0.048  0.019   0.390
T           0.012     0.003  0.006   0.019   0.000
POST        0.405     0.007  0.392   0.418   0.000
Scale       2.416

Computing power (1999):

PC pentium 450, 380 MB memory, lots of hard-drive space

2:30 hours for n=30,000, >6 hours for n=48,000
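If the model uses the log link that is standard for Poisson regression, each estimate is on the log scale and can be exponentiated to a rate ratio. The Python sketch below does this for one row of the table. The function names are hypothetical, and the reading as a rate ratio rests on the log-link assumption.

```python
import math


def rate_ratio(estimate, lower, upper):
    """Exponentiate a log-scale estimate and its confidence limits."""
    return math.exp(estimate), math.exp(lower), math.exp(upper)


def percent_change(ratio):
    """Relative change in the event rate implied by a rate ratio."""
    return (ratio - 1.0) * 100.0


# POST row from the table: estimate 0.405, limits 0.392 and 0.418.
post_rr, post_lo, post_hi = rate_ratio(0.405, 0.392, 0.418)
```

Under that assumption, `post_rr` is the ratio of the visit rate after the policy start to the rate before it, holding the other covariates fixed, and `percent_change(post_rr)` expresses it as a percentage.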

24

Estimation 2:

M1: allvis = age csex CDS1 psc1 psc2 t post cohort post*cohort, 8 quarters

n=30,000

PARM         ESTIMATE  STDR   LOWER   UPPER   PZSCR
INTERCEPT    0.948     0.044  0.863   1.034   0.000
AGE          0.007     0.001  0.006   0.008   0.000
CSEX         0.008     0.009  -0.010  0.025   0.387
CDS          0.397     0.008  0.381   0.413   0.000
PSC1         0.018     0.012  -0.005  0.042   0.121
PSC2         -0.015    0.017  -0.048  0.019   0.390
T            0.012     0.003  0.006   0.019   0.000
POST         0.405     0.007  0.392   0.418   0.000
COHORT       0.004     0.003  -0.013  -0.020  0.546
POST_cohort  0.010     0.003  -0.006  -0.028  0.231
Scale        2.416
