Methods to Correct Measures of Effect for Bias due to Exposure Measurement Error

Donna Spiegelman, ScD
Departments of Epidemiology and Biostatistics

Harvard School of Public Health, Boston, MA

Statistical Society of Canada 2013 Introductory Overview Lecture
May 28, 2013


In this talk, I will give a brief overview of several problems that my colleagues and I at HSPH have addressed, motivated by ongoing environmental and occupational epidemiologic research at HSPH and elsewhere:

• Regression calibration (two versions)

• Regression calibration for multiple surrogates for the same exposure

• Regression calibration with heteroscedastic error

• Regression calibration for main study/internal validation study designs

• Regression calibration for Cox models with time-varying functions of a mis-measured exposure history

All methods will be illustrated by motivating examples in environmental and occupational epidemiology


Notation

$n_1$: Number of participants in main study

$n_2$: Number of participants in validation study

$D$: Binary health outcome

$X$: “True” exposure

$Z$: Surrogate exposure

$U$: $s$ perfectly measured covariates (e.g. age, race, smoking status), measured on all participants in main and validation study

Main study: $(D_i, Z_i, U_i),\ i = 1, \ldots, n_1$

External validation study: $(X_i, Z_i, U_i),\ i = n_1 + 1, \ldots, n_1 + n_2$

Internal validation study: $(D_i, X_i, Z_i, U_i),\ i = n_1 + 1, \ldots, n_1 + n_2$

Rosner et al. regression calibration method for MS/EVS

The (Rosner et al., 1989; Rosner et al., 1990; Rosner et al., 1992) version of regression calibration for the MS/EVS design:

3-step algorithm

1. In the main study, regress Y on Z and U to obtain $\hat{\beta}_0^*, \hat{\beta}_1^*, \hat{\beta}_2^*$, where $\hat{\beta}_1^*$ is $1 \times s$ and $\hat{\beta}_2^*$ is $1 \times t$. Here Z is a vector of mis-measured continuous covariates and U is a vector of perfectly measured covariates.

2. In the validation study, regress X on Z and U to obtain $\hat{\gamma}_0, \hat{\Gamma}_1, \hat{\Gamma}_2$, where $\hat{\gamma}_0$ is a vector of $s$ regression intercepts, $\hat{\Gamma}_1$ is an $s \times s$ matrix of slopes for the regression of X on Z, adjusted for U, and $\hat{\Gamma}_2$ is an $s \times t$ matrix of slopes for the regression of X on U, adjusted for Z.

3. Correct estimates of effect for measurement error, by

$$\hat{\beta}_1 = \frac{\hat{\beta}_1^*}{\hat{\gamma}_1}, \qquad \hat{\beta}_0 = \hat{\beta}_0^* - \hat{\beta}_1 \hat{\gamma}_0, \qquad \hat{\beta}_2 = \hat{\beta}_2^* - \hat{\beta}_1 \hat{\gamma}_2$$

or

$$\begin{pmatrix} \hat{\beta}_1^T \\ \hat{\beta}_2^T \end{pmatrix} = \begin{pmatrix} \hat{\Gamma}_1^T & 0 \\ \hat{\Gamma}_2^T & I \end{pmatrix}^{-1} \begin{pmatrix} \hat{\beta}_1^{*T} \\ \hat{\beta}_2^{*T} \end{pmatrix}$$

where 0 is an $s \times t$ matrix of 0’s and I is a $t \times t$ identity matrix.

4. Use multivariate delta method to derive variance, e.g.

$$\widehat{\mathrm{Var}}(\hat{\beta}_1) = \frac{\widehat{\mathrm{Var}}(\hat{\beta}_1^*)}{\hat{\gamma}_1^2} + \frac{\hat{\beta}_1^{*2}\, \widehat{\mathrm{Var}}(\hat{\gamma}_1)}{\hat{\gamma}_1^4}$$

See Appendices 2 and 3 of (Rosner et al., 1990) for a derivation of the variance of $(\hat{\beta}_1, \hat{\beta}_2)$, again using the multivariate delta method.
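The scalar version of steps 3 and 4 of the Rosner et al. algorithm can be sketched in a few lines. This is only an illustration under stated assumptions: a single mis-measured exposure, independent main-study and validation-study estimates, and a Wald interval. The function name `rosner_correct` and all input values are hypothetical, not taken from the lecture.

```python
import math

def rosner_correct(beta_star, var_beta_star, gamma1, var_gamma1, z=1.96):
    """Scalar regression calibration: divide the uncorrected slope by the calibration slope."""
    beta = beta_star / gamma1
    # Delta-method variance, treating the main-study and validation-study
    # estimates as independent (an assumption of this sketch).
    var_beta = var_beta_star / gamma1 ** 2 + beta_star ** 2 * var_gamma1 / gamma1 ** 4
    se = math.sqrt(var_beta)
    return beta, se, (beta - z * se, beta + z * se)

# Hypothetical inputs, chosen only for illustration.
beta_hat, se_hat, ci = rosner_correct(beta_star=0.20, var_beta_star=0.05 ** 2,
                                      gamma1=0.50, var_gamma1=0.10 ** 2)
print(beta_hat, se_hat, ci)
```

The second variance term carries the uncertainty in the calibration slope, so the corrected interval is wider than simply rescaling the uncorrected one.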

Regression calibration (Carroll et al.)

Given validation or reliability data, the Carroll et al. version of the regression calibration estimator follows (when $r_i = 2$, $i = 1, \ldots, n_R$):

Sketch of Algorithm

1. Estimate $\gamma_0$ and $\gamma_1$ in the validation study from the regression of $X_i$ on $Z_i$, $i = 1, \ldots, n_V$, or in the reliability study from the regression of $Z_{1i}$ on $Z_{2i}$, $i = 1, \ldots, n_R$.

2. Estimate $\hat{X}_i = \hat{\gamma}_0 + \hat{\gamma}_1 Z_i$, $i = 1, \ldots, n_M$, in the main study.

3. Run the usual regression model for Y on $\hat{X}$ in the main study to obtain estimates of effect adjusted for measurement error, i.e. fit the model $g[E(Y_i \mid \hat{X}_i)] = \beta_0 + \beta_1 \hat{X}_i$ in the main study, where $g[\cdot]$ is a link function (e.g. identity for linear regression, log for Poisson and log-binomial regression, logit for logistic regression, probit for probit regression), to obtain estimates of $\beta_0$ and $\beta_1$ that are corrected for measurement error, at least ‘approximately’.

4. The variance must be adjusted as well and cannot be obtained from standard regression software.

It remains to show the theoretical justification for what is thus far an ad hoc procedure, and to derive the measurement-error corrected variance.
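For a linear outcome model with a single surrogate, the imputation form of regression calibration (regress Y on the predicted exposure) and the division form (divide the uncorrected slope by the calibration slope) give the same point estimate. The sketch below checks this on small made-up data sets; the helper `ols` and every data value are hypothetical, and the variance correction of step 4 is not addressed.

```python
def ols(x, y):
    """Return (intercept, slope) of the least-squares line of y on x."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((a - mx) ** 2 for a in x)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    slope = sxy / sxx
    return my - slope * mx, slope

# Hypothetical validation data: true exposure X and surrogate Z.
z_val = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
x_val = [1.4, 1.9, 2.7, 2.9, 3.8, 4.1]
g0, g1 = ols(z_val, x_val)              # step 1: regress X on Z

# Hypothetical main-study data: outcome Y and surrogate Z only.
z_main = [0.5, 1.5, 2.5, 3.5, 4.5, 5.5, 6.5, 7.5]
y_main = [2.1, 2.4, 3.6, 3.9, 5.2, 5.1, 6.6, 7.0]

x_hat = [g0 + g1 * z for z in z_main]   # step 2: predicted exposure
_, beta_impute = ols(x_hat, y_main)     # step 3: regress Y on predicted exposure

_, beta_naive = ols(z_main, y_main)     # uncorrected slope of Y on Z
beta_divide = beta_naive / g1           # division form of the correction
print(beta_naive, beta_impute, beta_divide)
```

The two corrected slopes agree up to rounding error because the predicted exposure is a linear function of Z; with nonlinear links the agreement is only approximate.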

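Motivation for regression calibration estimator (logistic regression model)

Small measurement error justification for regression calibration estimator

We use a Taylor series expansion for the likelihood of the main study data for this derivation.

From the Taylor expansion,

$$\frac{e^{\beta_0 + \beta_1 X}}{1 + e^{\beta_0 + \beta_1 X}} \approx \frac{e^{\beta_0 + \beta_1 E(X|Z)}}{1 + e^{\beta_0 + \beta_1 E(X|Z)}} + \left. \frac{\partial}{\partial X} \left[ \frac{e^{\beta_0 + \beta_1 X}}{1 + e^{\beta_0 + \beta_1 X}} \right] \right|_{X = E(X|Z)} [X - E(X|Z)] + \frac{1}{2} \left. \frac{\partial^2}{\partial X^2} \left[ \frac{e^{\beta_0 + \beta_1 X}}{1 + e^{\beta_0 + \beta_1 X}} \right] \right|_{X = E(X|Z)} [X - E(X|Z)]^2$$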

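$$\Pr(D = 1 \mid Z) = \int \Pr(D = 1 \mid x) \Pr(x \mid Z)\, dx = E_{X|Z} \left[ \left. \frac{\exp\{\beta_0 + \beta_1 X\}}{1 + \exp\{\beta_0 + \beta_1 X\}} \right| Z \right]$$

$$\approx \frac{\exp\{\beta_0 + \beta_1 E(X|Z)\}}{1 + \exp\{\beta_0 + \beta_1 E(X|Z)\}} + E_{X|Z} \left[ \left. \frac{\partial}{\partial X} \left( \frac{\exp\{\beta_0 + \beta_1 X\}}{1 + \exp\{\beta_0 + \beta_1 X\}} \right) \right|_{X = E(X|Z)} [X - E(X|Z)] \,\Big|\, Z \right] + E_{X|Z} \left[ \frac{1}{2} \left. \frac{\partial^2}{\partial X^2} \left( \frac{\exp\{\beta_0 + \beta_1 X\}}{1 + \exp\{\beta_0 + \beta_1 X\}} \right) \right|_{X = E(X|Z)} [X - E(X|Z)]^2 \,\Big|\, Z \right]$$

$$= \frac{\exp\{\beta_0 + \beta_1 E(X|Z)\}}{1 + \exp\{\beta_0 + \beta_1 E(X|Z)\}} + 0 + \frac{\sigma^2_{X|Z}}{2} \left. \frac{\partial^2}{\partial X^2} \left( \frac{\exp\{\beta_0 + \beta_1 X\}}{1 + \exp\{\beta_0 + \beta_1 X\}} \right) \right|_{X = E(X|Z)}$$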

and, writing $\eta = \beta_0 + \beta_1 E(X|Z)$,

$$\lim_{\beta_1^2 \sigma^2_{X|Z} \to 0} \frac{\sigma^2_{X|Z}}{2} \left. \frac{\partial^2}{\partial X^2} \left( \frac{e^{\beta_0 + \beta_1 X}}{1 + e^{\beta_0 + \beta_1 X}} \right) \right|_{X = E(X|Z)} = \lim_{\beta_1^2 \sigma^2_{X|Z} \to 0} \frac{\beta_1^2 \sigma^2_{X|Z}}{2} \cdot \frac{e^{\eta} (1 - e^{\eta})}{(1 + e^{\eta})^3} = 0$$

Hence, if $\beta_1^2 \sigma^2_{X|Z}$ is small (i.e. small measurement error), this approximation should work well. This justification was first suggested by Armstrong (Armstrong, 1985) for the more general setting of generalized linear models, of which logistic regression is one example.

Armstrong B. (1985) Measurement error in the generalized linear-model. Communications in Statistics-Simulation and Computation 14:529-544.

Note: This approximation is not one that improves as the sample size increases, as is typically the case in statistics.

Small measurement error

What is small? Carroll and Wand (1991), Kuha (1994), Neaton and Bartsch (1992) and Rosner et al. (1989) all reported, based upon simulation studies of regression calibration for logistic regression, that the approximation works remarkably well when $\beta_1^2 \sigma^2_{X|Z}$ is small, and Kuha suggested the value of 0.5.

A multivariate version of this is given in (Carroll et al., 2006) (see 4.7.1.1 and section B.3.3).

Carroll R.J., Wand M.P. (1991) Semiparametric estimation in logistic measurement error models. Journal of the Royal Statistical Society Series B-Methodological 53:573-585.

Kuha J. (1994) Corrections for exposure measurement error in logistic regression models with an application to nutritional data. Stat Med 13:1135-48.

Neaton J.D., Bartsch G.E. (1992) Impact of measurement error and temporal variability on the estimation of event probabilities for risk factor intervention trials. Statistics in Medicine 11:1719-1729.
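A quick numerical check can show how the small measurement error approximation behaves as $\beta_1^2 \sigma^2_{X|Z}$ grows. The sketch below assumes X given Z is normal, integrates the logistic function against that density with a simple trapezoid rule, and compares the result with the logistic function evaluated at E(X|Z). All parameter values and function names are hypothetical and chosen only for illustration.

```python
import math

def expit(t):
    return 1.0 / (1.0 + math.exp(-t))

def prob_given_z(beta0, beta1, mu, sigma2, n=4001, width=8.0):
    """Integrate expit(beta0 + beta1*x) against a N(mu, sigma2) density (trapezoid rule)."""
    sd = math.sqrt(sigma2)
    lo, hi = mu - width * sd, mu + width * sd
    h = (hi - lo) / (n - 1)
    total = 0.0
    for k in range(n):
        x = lo + k * h
        dens = math.exp(-0.5 * ((x - mu) / sd) ** 2) / (sd * math.sqrt(2 * math.pi))
        w = 0.5 if k in (0, n - 1) else 1.0
        total += w * expit(beta0 + beta1 * x) * dens
    return total * h

beta0, beta1, mu = -2.0, 1.0, 0.0   # hypothetical values; mu plays the role of E(X|Z)
approx = expit(beta0 + beta1 * mu)  # regression calibration approximation
errors = {}
for sigma2 in (0.1, 0.5, 2.0):      # beta1^2 * sigma2 = 0.1, 0.5, 2.0
    exact = prob_given_z(beta0, beta1, mu, sigma2)
    errors[sigma2] = abs(exact - approx)
    print(sigma2, round(exact, 4), round(approx, 4))
```

In this setup the absolute error of the approximation grows with $\beta_1^2 \sigma^2_{X|Z}$, in line with the Taylor-series argument; it does not shrink with sample size.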

Likelihood-based justification

For the multivariate normal measurement error model, a.k.a. linear regression, first given by Fuller for the classical measurement error model; later given by Spiegelman et al. for the linear measurement error model

$$X = \gamma_0 + \gamma_1 Z + e_{X|Z} \quad \text{with} \quad e_{X|Z} \sim N(0, \sigma^2_{X|Z})$$

and its multivariate extensions.

Fuller, W.A. (1987) Measurement Error Models, New York, Wiley.

Spiegelman, D., McDermott, A. and Rosner, B. (1997). Regression calibration method for correcting measurement-error bias in nutritional epidemiology. American Journal of Clinical Nutrition 65, 1179S-1186S.

The main study likelihood function

The critical identity, which depends entirely on the surrogacy assumption $f(D \mid x, Z) = f(D \mid x)$, is used to obtain the main study likelihood in the observed data, as follows:

$$\int f(D \mid x)\, f(x \mid Z)\, dx = \frac{1}{f(Z)} \int f(D, x, Z)\, dx = \frac{f(D, Z)}{f(Z)} = f(D \mid Z)$$

Below, the following notation is used:

| This Lecture | (Spiegelman et al., 1997) |
|---|---|
| $X$ | $x$ |
| $Z$ | $X$ |
| $U$ | $U$ |

We will first consider a primary regression model where now the outcome, $Y$, is continuous (rather than the binary outcome, $D$, previously considered).

Derivation of $\hat{\beta}_{RC}$ for multivariate linear models.

Assume

$$Y \mid x, U \sim MVN(\beta_0 + \beta_1 x + \beta_2 U,\ \Sigma_y) \qquad \text{(A1)}$$

and

$$x \mid X, U \sim MVN(\gamma_0 + \Gamma_1 X + \Gamma_2 U,\ \Sigma_x) \qquad \text{(A2)}$$

Key result

In our notation:

$$E(Y \mid Z, U) = \beta_0^* + \beta_1 \Gamma_1 Z + (\beta_2 + \beta_1 \Gamma_2) U$$

and, with Y, Z, X, U all scalar,

$$E(Y \mid Z, U) = \beta_0 + \beta_1 \gamma_0 + \beta_1 \gamma_1 Z + (\beta_2 + \beta_1 \gamma_2) U = \beta_0^* + \beta_1^* Z + \beta_2^* U$$

$$\mathrm{Var}(Y \mid Z, U) = \Sigma_Y + \beta_1 \Sigma_{X|Z} \beta_1^T$$

The variance is still constant but larger than in the original model without measurement error.
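The key result follows from iterated expectations and the law of total variance. The short derivation below is a sketch in the lecture's notation, assuming surrogacy (Y depends on Z only through X and U) and the linear measurement error model $E(X \mid Z, U) = \gamma_0 + \Gamma_1 Z + \Gamma_2 U$ with constant conditional variance $\Sigma_{X|Z}$.

```latex
\begin{aligned}
E(Y \mid Z, U)
  &= E\{ E(Y \mid X, Z, U) \mid Z, U \}
   = E(\beta_0 + \beta_1 X + \beta_2 U \mid Z, U) \\
  &= \beta_0 + \beta_1 (\gamma_0 + \Gamma_1 Z + \Gamma_2 U) + \beta_2 U \\
  &= (\beta_0 + \beta_1 \gamma_0) + \beta_1 \Gamma_1 Z + (\beta_2 + \beta_1 \Gamma_2) U, \\[4pt]
\mathrm{Var}(Y \mid Z, U)
  &= E\{ \mathrm{Var}(Y \mid X, U) \mid Z, U \} + \mathrm{Var}\{ E(Y \mid X, U) \mid Z, U \} \\
  &= \Sigma_Y + \beta_1 \Sigma_{X|Z} \beta_1^T .
\end{aligned}
```

Reading off the coefficient of Z gives $\beta_1^* = \beta_1 \Gamma_1$, which is why dividing the uncorrected slope by the calibration slope recovers $\beta_1$.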

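Key result (continued)

Now that we know that the likelihood of Y given Z is

$$f(Y \mid Z) = \frac{\exp\left\{ -\dfrac{(Y - \beta_0^* - \beta_1^* Z)^2}{2 (\sigma_Y^2 + \beta_1^2 \sigma^2_{X|Z})} \right\}}{\sqrt{2 \pi (\sigma_Y^2 + \beta_1^2 \sigma^2_{X|Z})}} = \frac{\exp\left\{ -\dfrac{(Y - \beta_0^* - \beta_1 \gamma_1 Z)^2}{2 (\sigma_Y^2 + \beta_1^2 \sigma^2_{X|Z})} \right\}}{\sqrt{2 \pi (\sigma_Y^2 + \beta_1^2 \sigma^2_{X|Z})}}$$

we can see that replacing $X$ with $\hat{X} = \hat{\gamma}_0 + \hat{\gamma}_1 Z$ gives the likelihood function

$$f(Y \mid \hat{X}) = \frac{\exp\left\{ -\dfrac{(Y - \beta_0 - \beta_1 \hat{X})^2}{2 (\sigma_Y^2 + \beta_1^2 \sigma^2_{X|Z})} \right\}}{\sqrt{2 \pi (\sigma_Y^2 + \beta_1^2 \sigma^2_{X|Z})}} = \frac{\exp\left\{ -\dfrac{(Y - \beta_0 - \beta_1 \hat{\gamma}_0 - \beta_1 \hat{\gamma}_1 Z)^2}{2 (\sigma_Y^2 + \beta_1^2 \sigma^2_{X|Z})} \right\}}{\sqrt{2 \pi (\sigma_Y^2 + \beta_1^2 \sigma^2_{X|Z})}}$$

so the estimate of the regression slope will be consistent for $\beta_1$, as desired.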

The logistic regression model (rare disease)

Recall the logistic regression model

$$\Pr(D = 1 \mid X) = \frac{e^{\beta_0 + \beta_1 X}}{1 + e^{\beta_0 + \beta_1 X}}$$

Under the rare disease assumption,

$$\Pr(D = 1 \mid X) = \frac{e^{\beta_0 + \beta_1 X}}{1 + e^{\beta_0 + \beta_1 X}} \approx e^{\beta_0 + \beta_1 X}$$

We need the likelihood of the main study data in terms of the surrogate exposure, Z. Similarly to the multivariate normal setting above, we integrate over x as follows:

$$\Pr(D = 1 \mid Z) \approx \int e^{\beta_0 + \beta_1 x} f(x \mid Z)\, dx = \int e^{\beta_0 + \beta_1 x} \frac{1}{\sqrt{2 \pi \sigma^2_{X|Z}}} \exp\left\{ -\frac{(x - \gamma_0 - \gamma_1 Z)^2}{2 \sigma^2_{X|Z}} \right\} dx = \exp\{ \beta_0 + \beta_1 \gamma_0 + \beta_1^2 \sigma^2_{X|Z} / 2 + \beta_1 \gamma_1 Z \}$$

Hence, the same regression calibration estimator is obtained as in the multivariate normal case, approximately. This justification was given by (Rosner et al., 1989) and generalized to the multivariate case later (Rosner et al., 1990).

Rosner, B., Spiegelman, D., and Willett, W.C. (1989) Correction of logistic regression relative risk estimates and confidence intervals for systematic within-person measurement error. Statistics in Medicine 8:9, 1051-69.

Rosner, B., Spiegelman, D., and Willett, W.C. (1990) Correction of logistic regression relative risk estimates and confidence intervals for measurement error: the case of multiple covariates measured with error. Am J Epidemiol 132, 734-745.
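The closed form of the rare disease integral is the moment generating function of a normal distribution. The worked step below is a sketch, assuming $X \mid Z \sim N(\gamma_0 + \gamma_1 Z,\ \sigma^2_{X|Z})$ with constant variance; the starred symbols for the coefficients of the model in Z are introduced here for illustration.

```latex
\begin{aligned}
E\left( e^{\beta_1 X} \mid Z \right)
  &= \exp\left\{ \beta_1 (\gamma_0 + \gamma_1 Z) + \tfrac{1}{2} \beta_1^2 \sigma^2_{X|Z} \right\}, \\
\Pr(D = 1 \mid Z)
  &\approx e^{\beta_0}\, E\left( e^{\beta_1 X} \mid Z \right)
   = \exp\Big\{ \underbrace{\beta_0 + \beta_1 \gamma_0 + \tfrac{1}{2} \beta_1^2 \sigma^2_{X|Z}}_{\beta_0^*}
     + \underbrace{\beta_1 \gamma_1}_{\beta_1^*} Z \Big\}.
\end{aligned}
```

Because $\sigma^2_{X|Z}$ does not depend on Z, the extra term only shifts the intercept, and $\beta_1 = \beta_1^* / \gamma_1$ as in the linear case.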

An example

Home Endotoxin Exposure and Wheeze in Infants: Correction for Bias Due to Exposure Measurement Error

Nora Horick, Edie Weller, Donald K. Milton, Diane R. Gold, Ruifeng Li, and Donna Spiegelman

Department of Biostatistics and Department of Environmental Health, Harvard School of Public Health, Boston, Massachusetts, USA; Channing Laboratory, Harvard Medical School, Boston, Massachusetts, USA; Department of Epidemiology, Harvard School of Public Health, Boston, Massachusetts, USA

Environmental Health Perspectives Volume 114, Number 1, January 2006

Regression calibration for logistic regression with multiple surrogates for one exposure

Edie A. Weller, Donna Spiegelman, Don Milton, Ellen Eisen
Departments of Biostatistics, Epidemiology, and Environmental Health
Harvard School of Public Health and Dana Farber Cancer Institute

Journal of Statistical Planning and Inference, 2007; 137:449-461

Occupational exposures are often characterized by numerous factors of the workplace and work duration in a particular area ==> multiple surrogates describe one exposure.

Validation study: Personal exposure is commonly measured on a subset of the subjects, and these values are then used to estimate average exposure by job or exposure zone.

Typically, no adjustment is made for bias or uncertainty in the exposure estimates.

Current methods typically assume that there is one surrogate for each exposure (for example, Rosner et al, 1989, 1990).

We propose an adjustment method which allows for multiple surrogates for one exposure, using a regression calibration approach.

Main Study

• To assess the relationship between exposure to metal working fluids (MWF) and respiratory function (United Automobile Workers Union and General Motors Corporation sponsored study, Greaves et al, 1997).

• Outcome here is prevalence of wheeze.

• Job characteristics include metal working fluid (MWF) type, plant and machine operation (grinding or not).

• Assembly workers are considered the non-exposed group.

• Possible confounders include age, smoking status and race.

Exposure Assessment Study (generically, the validation study)

• Exposure was measured in various job zones (Woskie et al, 1994).

• Intensity of exposure to MWF aerosol was measured by the thoracic aerosol fraction (i.e. the sum of the two smallest size fractions measured with the personal monitors).

• Full shift (8 hour) personal samples of aerosol exposure in the breathing zone of automobile workers were collected in various job zones.

Assumptions

• True exposure $X$ and the s-vector of covariates $U$ are related to the probability of binary outcome $D$ by the logistic function:

$$\mathrm{logit}[\Pr(D = 1)] = \beta_0 + \beta_1 X + U' \beta_2$$

where $\beta_2 = (\beta_{21}, \beta_{22}, \ldots, \beta_{2s})$.

• A linear regression model is appropriate to relate the r surrogates $W$ and the s covariates $U$ to the true exposure $X$:

$$X = \gamma_0 + W \gamma_1 + U' \gamma_2 + \varepsilon$$

where $E(\varepsilon) = 0$, $\mathrm{Var}(\varepsilon) = \sigma^2_{X|W,U}$.

• $W$ is a surrogate if $\Pr(D \mid X, W, U) = \Pr(D \mid X, U)$, that is, knowledge of the surrogates provides no additional information if the true exposure is known.

• $\varepsilon \sim N(0, \sigma^2_{X|W,U})$ and $\Pr(D)$ small, or $\beta_1^2 \sigma^2_{X|W,U}$ small.

Problem

• A quantitative measure of exposure, $X$, is not measured on all subjects.

– $W$ is measured on all $n_1$ of the subjects

– $X$ and $W$ are measured on $n_2$ subjects.

• Multiple surrogates, $W$, describe exposure $X$.

Goal: To obtain point and interval estimates of $\beta_1$ and $e^{\beta_1}$ relating exposure $X$ to outcome $D$, adjusting for the covariates $U$.

Solution: An extension to two closely related approaches

• Rosner, Spiegelman and Willett (RSW, 1989, 1990)

• Carroll, Ruppert and Stefanski (CRS, 1995)

Procedure:

We propose the following approach, which follows RSW and assumes normality of $\varepsilon$ and rare disease, or small $\beta_1^2 \sigma^2_{X|W,U}$ (parameter of the small ME approximation):

1. Estimate $\hat{\alpha}$ from a logistic regression model of $D$ on $W$ and $U$ in the $n_1$ subjects in the main study:

$$\mathrm{logit}[\widehat{\Pr}(D = 1)] = \hat{\alpha}_0 + W \hat{\alpha}_1 + U' \hat{\alpha}_2$$

2. Estimate $\hat{\gamma}$ from a measurement error model among the $n_2$ validation study subjects using ordinary least squares regression:

$$\hat{X} = \hat{\gamma}_0 + W \hat{\gamma}_1 + U' \hat{\gamma}_2$$

SAS PROC GENMOD or PROC LOGISTIC for step 1, PROC REG for step 2

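3. Optimally combine the adjusted estimates for each surrogate:

$$\hat{\beta}_1 = \hat{\tau}' \hat{\beta}_W$$

where

$$\hat{\beta}_W = \hat{\Gamma}_1^{-1} \hat{\alpha}_1, \qquad \hat{\Gamma}_1 = \mathrm{diag}(\hat{\gamma}_1),$$

$$\hat{\tau}' = \left[ 1' \hat{\Sigma}_{\hat{\beta}_W}^{-1} 1 \right]^{-1} 1' \hat{\Sigma}_{\hat{\beta}_W}^{-1}, \qquad 1' = (1, 1, \ldots, 1),$$

and $\hat{\Sigma}_{\hat{\beta}_W}$ is the estimated variance-covariance matrix of $\hat{\beta}_W$:

$$\hat{\Sigma}_{\hat{\beta}_W} = \left[ \frac{\partial \beta_W}{\partial (\alpha_1, \gamma_1)} \right]_{\hat{\alpha}_1, \hat{\gamma}_1} \begin{pmatrix} \hat{\Sigma}_{\hat{\alpha}_1} & 0 \\ 0 & \hat{\Sigma}_{\hat{\gamma}_1} \end{pmatrix} \left[ \frac{\partial \beta_W}{\partial (\alpha_1, \gamma_1)} \right]'_{\hat{\alpha}_1, \hat{\gamma}_1}.$$

A SAS macro to accomplish step 3 is downloadable from my website; input to the macro is the output from PROC LOGISTIC and PROC REG.

http://www.hsph.harvard.edu/faculty/spiegelman/multsurr.html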
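The weighting in step 3 can be sketched without any matrix library. The code below assumes the surrogate-specific corrected estimates and their covariance matrix are already available; it is not the SAS macro, the function names `solve` and `combine` are hypothetical, and the numbers are made up for illustration.

```python
def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        s = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - s) / m[r][r]
    return x

def combine(beta_w, sigma):
    """Weights tau = Sigma^{-1} 1 / (1' Sigma^{-1} 1) and the combined estimate tau' beta_w."""
    ones = [1.0] * len(beta_w)
    s_inv_one = solve(sigma, ones)       # Sigma^{-1} 1 (Sigma is symmetric)
    denom = sum(s_inv_one)               # 1' Sigma^{-1} 1
    tau = [v / denom for v in s_inv_one]
    beta1 = sum(t * b for t, b in zip(tau, beta_w))
    var_beta1 = 1.0 / denom              # variance of the combined estimate
    return tau, beta1, var_beta1

# Hypothetical surrogate-specific corrected estimates and covariance matrix.
beta_w = [0.9, 1.3, 1.1]
sigma = [[0.04, 0.01, 0.00],
         [0.01, 0.09, 0.02],
         [0.00, 0.02, 0.16]]
tau, beta1, var_beta1 = combine(beta_w, sigma)
print(tau, beta1, var_beta1)
```

When the covariance matrix is diagonal, the weights reduce to ordinary inverse-variance weights; off-diagonal terms shift weight away from estimates that are strongly correlated with more precise ones.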

Results from logistic regression model for wheeze. GM/UAW main study ($n_1$ = 1040). “True” exposure (X) is thoracic aerosol fraction (mg/m³) measured on $n_2$ = 83 workers

| Variable | Uncorrected OR (95% CI) | P-value | Corrected OR (95% CI) | P-value |
|---|---|---|---|---|
| Exposure¹ (mg/m³) | | | 2.875 (1.353, 6.108) | 0.006 |
| Surrogates (W) | | | | |
| Plant 2 | 2.109 (1.391, 3.198) | < 0.001 | | |
| Grinding | 0.706 (0.374, 1.332) | 0.282 | | |
| Straight | 1.641 (1.119, 2.407) | 0.011 | | |
| Synthetic | 1.851 (1.200, 2.854) | 0.005 | | |
| Covariates (Z) | | | | |
| Age 30-39 | 0.897 (0.615, 1.307) | 0.571 | 0.965 (0.648, 1.437) | 0.861 |
| Age 40-49 | 0.834 (0.512, 1.358) | 0.465 | 0.853 (0.513, 1.418) | 0.540 |
| Age 50+ | 0.912 (0.544, 1.528) | 0.726 | 0.914 (0.535, 1.561) | 0.741 |
| Race | 1.173 (0.796, 1.728) | 0.420 | 1.166 (0.782, 1.740) | 0.451 |
| Current Smoker | 3.042 (2.210, 4.188) | < 0.001 | 2.978 (2.144, 4.137) | < 0.001 |

¹ Estimated GLS weights are 0.857 for straight, 0.127 for synthetic, 0.15 for grinding, and 0.0001 for plant

[Figure: ARE of optimal method compared to Carroll method]

Regression Calibration With Heteroscedastic Variance

Donna Spiegelman, Roger Logan, Douglas Grove

International Journal of Biostatistics: 2011 Vol. 7, Issue 1, Article 4. PMCID: PMC3404553

Derivation of estimator

• Let

$$f(D \mid x) = \frac{e^{(\beta_0 + \beta_1 x) D}}{1 + e^{\beta_0 + \beta_1 x}} \approx e^{(\beta_0 + \beta_1 x) D}$$

under the rare disease assumption, and

$$f(x \mid X) \sim N(\gamma_1 X,\ \sigma^2 g(X))$$

• Then,

$$f(D = 1 \mid X) = \int f(D = 1 \mid x)\, f(x \mid X)\, dx \approx \int \frac{e^{\beta_0 + \beta_1 x - \frac{1}{2 \sigma^2 g(X)} (x - \gamma_1 X)^2}}{\sqrt{2 \pi \sigma^2 g(X)}}\, dx = e^{\beta_0 + \beta_1 \gamma_1 X + \frac{1}{2} \beta_1^2 \sigma^2 g(X)} = e^{\beta_0^* + \beta_{11}^* X + \beta_{12}^* g(X)}.$$

The Procedure

1. A logistic regression model of D on X and g(X) is run in the main study to obtain $\hat{\beta}_{11}^*$ and $\hat{\beta}_{12}^*$ and their estimated variances.

2. A weighted linear regression is run in the validation study, with weights 1/g(X), to obtain $\hat{\gamma}_1$ and $\hat{\sigma}^2$.

3. $\hat{\beta}_{11}$ and $\hat{\beta}_{12}$ are calculated as a function of $\hat{\beta}_{11}^*$, $\hat{\beta}_{12}^*$, $\hat{\gamma}_1$ and $\hat{\sigma}^2$, and efficiently combined to produce a single estimate $\hat{\beta}_{RCH}$.

4. The asymptotically minimum variance weights and their derivation, as well as the formula for the variance of $\hat{\beta}_{RCH}$, are given in the Appendix of the manuscript.

Example: ACE study (Valanis et al., 1993)

$Y$: prevalence of fever

$Z$: average weekly chemotherapeutics exposure, self-reported on questionnaire

$X$: same, from on-site diary for 1-2 weeks

$U$: control for age (years), shift work (yes/no)

$n_1 = 619$, $n_2 = 56$; 104 cases, 6 in validation study

$$\mathrm{logit}[\Pr(Y = 1)] = \beta_0 + \beta_1 X + \beta_2 U$$

$\mathrm{Corr}(X, Z) = 0.70$

Examples

ACE study

$\mathrm{Corr}(|\hat{e}|, X) = 0.21$ (0.26 with outliers removed)

OR for 52 drugs mixed/day (90th-10th percentile), controlling for age, shift, community hospital:

| Estimator | OR (95% CI) |
|---|---|
| uncorrected | 1.13 (1.03 - 1.23) |
| $\hat{\beta}_{RSW}$ | 1.22 (1.04 - 1.43) |
| $\hat{\beta}_{HRC}$ | 1.24 (1.05 - 1.48) |

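Simulation study of estimators under heteroscedastic measurement error variance

| $p$ | $\rho_{x,X}$ | $\rho_{e^2,x}$ | Measure | $n_2 = 173$: Uncorrected | $n_2 = 173$: $\hat{\beta}_{RC}$ | $n_2 = 173$: $\hat{\beta}_{RCH}$ | $n_2 = 346$: Uncorrected | $n_2 = 346$: $\hat{\beta}_{RC}$ | $n_2 = 346$: $\hat{\beta}_{RCH}$ |
|---|---|---|---|---|---|---|---|---|---|
| 0.5 | 0.4 | 0.4 | % bias | -93 | -42 | -73 | -91 | -12 | -32 |
| | | | MSE ×10⁻³ | 0.10 | 2.7 | 24 | 0.096 | 1.5 | 26 |
| | | | Asym CP (%) | 30 | 96 | 66 | 29 | 97 | 64 |
| 0.5 | 0.6 | 0.4 | % bias | -100 | -110 | -180 | -99 | -65 | -130 |
| | | | MSE ×10⁻³ | 0.12 | 18 | 160 | 0.11 | 16 | 130 |
| | | | Asym CP (%) | 23 | 95 | 82 | 22 | 96 | 87 |
| 0.5 | 0.8 | 0.5 | % bias | -81 | -14 | 20 | -79 | -3.6 | 26 |
| | | | MSE ×10⁻³ | 0.080 | 0.31 | 2.0 | 0.075 | 0.26 | 2.2 |
| | | | Asym CP (%) | 42 | 94 | 70 | 42 | 97 | 67 |
| 1 | 0.6 | 0.5 | % bias | -61 | -0.43 | -260 | -58 | 5.4 | -200 |
| | | | MSE ×10⁻³ | 0.052 | 0.10 | 4.9 | 0.046 | 0.078 | 3.8 |
| | | | Asym CP (%) | 62 | 95 | 81 | 66 | 95 | 84 |
| 2 | 0.6 | 0.6 | % bias | 6.0 | 30 | 2.9 | 6.7 | 25 | -6.1 |
| | | | MSE ×10⁻³ | 0.12 | 0.054 | 0.14 | 0.0092 | 0.034 | 0.18 |
| | | | Asym CP (%) | 92 | 89 | 83 | 97 | 91 | 91 |
| 2 | 0.8 | 0.6 | % bias | -55 | -2.5 | 4.2 | -53 | 3.8 | 14 |
| | | | MSE ×10⁻³ | 0.044 | 0.068 | 0.17 | 0.039 | 0.056 | 0.17 |
| | | | Asym CP (%) | 66 | 94 | 90 | 71 | 96 | 91 |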

A comparison of regression calibration approaches for designs with internal validation data

Sally W. Thurston, Paige L. Williams, Russ Hauser, Howard Hu, Mauricio Hernandez-Avila, and Donna Spiegelman

Department of Biostatistics and Computational Biology, University of Rochester Medical Center, 601 Elmwood Avenue, P.O. Box 630, Rochester, NY 14642, USA
Department of Biostatistics, Harvard School of Public Health, USA
Department of Environmental Health, Harvard School of Public Health, USA
Centro de Investigaciones en Salud Poblacional, Instituto Nacional de Salud Publica, Cuernavaca, Morelos, Mexico
Department of Epidemiology, Harvard School of Public Health, USA
Channing Laboratory, Department of Medicine, Brigham and Women's Hospital, Harvard Medical School, USA

Journal of Statistical Planning and Inference, 2005; 131:175-190.

We compare the asymptotic relative efficiency of several regression calibration methods of correcting for measurement error in studies with internal validation data, when a single covariate is measured with error.

The estimators we consider are appropriate in main study/hybrid validation study designs, where the latter study includes internal validation data and may include external validation data. Although all of the methods we consider produce consistent estimates, the method proposed by Spiegelman et al. (Statistics in Medicine, 2001; 20:139-160) has an asymptotically smaller variance than the other methods.

The methods for measurement error correction are illustrated using a study of the effect of in utero lead exposure on infant birth weight.


Internal validation

Methods to compare:

1. “As external”: Treat internal validation (IV) data as external validation data, i.e. ignore $Y$ in the IV study.

2. “Same intercept”: Regress $Y$ on $X$ for the IV study, and on $\hat{X}$ for the main study.

3. “Different intercept”: Same as (2), but allow the IV study and the main study to have different intercepts.

4. “Weighted” (Spiegelman, Carroll, Kipnis, SIM, 2001): Calculate $\hat{\beta}$ from the IV study, and $\hat{\beta}$ from the bias-corrected main study. Combine by weighting each estimate of $\beta$ by its inverse variance.

One can obtain closed form estimators for $\hat{\beta}$, $\mathrm{Var}(\hat{\beta})$.

The different intercept method (CRS, 1st edition, p. 46)

$$Y_i = \alpha_{IV} I_{IV,i} + \beta_0 + \beta_1 \left( I_{IV,i} X_i + I_{M,i} \hat{x}_i \right) + \beta_u U_i + \varepsilon_i.$$

$I_{IV,i} = 1$ when participant $i$ is in the internal validation study, 0 otherwise

$I_{M,i} = 1$ when participant $i$ is in the main study, 0 otherwise

Since $E(\hat{\alpha}_{IV}) = 0$ when sampling into the internal validation study is independent of $X$ given $Z$ and $U$, estimation of this additional parameter, if correlated with $\hat{\beta}_1$, could only increase the variance of the different intercept method relative to the same intercept method.

This estimator is not considered any further.

Asymptotic relative efficiencies

• “As external” does the same as or worse than the other 2 methods.

• “Weighted” (SCK) method does much better than “same intercept” method when:

- $\mathrm{Corr}(X, U)$ is small.

- $\mathrm{Corr}(Y, X)$ is large.

- $n_v / n_m$ is small.

• Based on a grid search, the “weighted” method never does worse than the “same intercept” method.

Fig. 1. A comparison of the asymptotic standard error of $\hat{\beta}_1$ as a function of the correlation between the true exposure, x, and the exposure measured with error, w, for two values of the percentage of subjects in the internal validation study, and two values of the correlation between the outcome, y, and the true exposure, x. Plots were constructed assuming no additional covariates and equal variances of the true exposure and the “proxy” exposure.

Fig. 2. A comparison of the asymptotic standard error of $\hat{\beta}_1$ as a function of the correlation between the outcome, y, and the true exposure, x, for two values of the percentage of subjects in the internal validation study, and two values of the correlation between the true exposure, x, and the “proxy” exposure, w. Plots were constructed assuming no additional covariates and equal variances of the true exposure and the “proxy” exposure.

| Method | Point estimate (95% CI) |
|---|---|
| Uncorrected (g/µg Pb/dL) | -4.3 (-12.7, 4.2) |
| Validation study alone (g/µg Pb/g bone) | -3.8 (-6.8, -0.7) |
| As external (RSW) (g/µg Pb/g bone) | -7.9 (-23.8, 8.1) |
| Weighted (g/µg Pb/g bone) | -3.9 (-6.9, -0.9) |

Effect of bone lead on birth weight ($n_1$ = 577, $n_2$ = 485) (Gonzalez-Cossio, 1997).

Corr(X,W) = 0.19
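The inverse-variance weighting behind the “weighted” method can be sketched as follows. This is not the computation in Thurston et al.: purely for illustration, it takes two rows of the birth weight table as inputs, backs standard errors out of the reported intervals, treats the “as external” row as a stand-in for the bias-corrected main-study estimate, and treats the two inputs as independent. All function names are hypothetical.

```python
import math

def se_from_ci(lower, upper, z=1.96):
    """Back out a standard error from a symmetric Wald 95% confidence interval."""
    return (upper - lower) / (2 * z)

def inverse_variance_combine(estimates, ses, z=1.96):
    """Combine independent estimates of the same parameter, weighting by 1/variance."""
    weights = [1.0 / s ** 2 for s in ses]
    total = sum(weights)
    est = sum(w * b for w, b in zip(weights, estimates)) / total
    se = math.sqrt(1.0 / total)
    return est, se, (est - z * se, est + z * se)

# Illustration only: inputs read off the table and assumed independent.
se_iv = se_from_ci(-6.8, -0.7)     # "Validation study alone"
se_ext = se_from_ci(-23.8, 8.1)    # "As external", standing in for the corrected main study
est, se, ci = inverse_variance_combine([-3.8, -7.9], [se_iv, se_ext])
print(round(est, 2), round(se, 2), [round(c, 2) for c in ci])
```

Because the validation-study estimate is far more precise here, it receives almost all of the weight, which is the situation in which the weighted method gains most over treating the validation data as external.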

Summary and conclusions

• With an internal validation (IV) study, 3 methods were compared:

1. “As external”: ignores $y$ in IV data.

2. “Same intercept”: uses $x$ in IV data, $\hat{x}$ otherwise.

3. “Weighted” (SCK): combines $\hat{\beta}$ from IV, $\hat{\beta}$ from corrected main study, weighting each by its inverse variance.

• (1) same/worse than (2), (2) same/worse than (3).

• Especially important to use (3) when $\mathrm{Corr}(X, U)$ is small, $\mathrm{Corr}(Y, X)$ is large, and/or the validation sample is small relative to the main study.

A risk set regression calibration method for measurement error correction for time-varying functions of the exposure history

Donna Spiegelman, Xiaomei Liao, and Samuela Pollack

Departments of Epidemiology and Biostatistics, Harvard School of Public Health


The geographical location for all the nurses in NHS between 1986 and 2008


(n2=505)


Risk set Regression Calibration

[Figure: estimate relative to that from ORC]

[Figure: values shown are 0.49, 0.07, 0.33, 0.67]

$CV_B$: Estimate divided by square root of between-studies variance of estimate

Results

NHS 2000-2006, 12-month moving average PM2.5 of ambient origin in relation to all-cause mortality.

n/N = 8604 deaths / 108,765 participants during 7,538,537 person-months of follow-up, adjusted for age (months), calendar year, region, and season.

Surrogate exposure: Yanosky spatio-temporal model predictions (Need to consider effect modification of calibration factor by season)

Surrogate exposure: EPA nearest monitor exposure

Overall results

| Model type | RR (95% CI) | p-value | groupnum | # of risk sets |
|---|---|---|---|---|
| Uncorrected | 1.13 (1.05, 1.21) | 0.001 | NA | NA |
| RRC | 1.20 (1.04, 1.39) | 0.01 | 30 | 5 |
| RRC | 1.24 (1.05, 1.46) | 0.01 | 40 | 4 |

Results by season: Summer (Apr–Sep), Winter (Oct–Mar)

| Model type | RR summer (95% CI) | RR winter (95% CI) | RR (95% CI) | p-value | groupnum | # of risk sets |
|---|---|---|---|---|---|---|
| Uncorrected | 1.19 (1.08, 1.31) | 1.14 (1.02, 1.27) | 1.17 (1.08, 1.26) | 0.00004 | NA | NA |
| RRC | 1.31 (1.09, 1.57) | 1.29 (1.02, 1.62) | 1.30 (1.12, 1.50) | 0.0004 | 30 | 2 (2) |
| RRC | 1.27 (1.09, 1.48) | 1.31 (1.03, 1.68) | 1.28 (1.13, 1.46) | 0.0002 | 40 | 2 (2) |

• We can accommodate the following situations:

• multiple surrogates for a single mis-measured exposure

• heteroscedastic measurement error

• internal and hybrid validation study designs

• cumulative exposure variables and other functions of the exposure history in cohort studies

• User-friendly SAS macros are available to implement many of these procedures

• http://www.hsph.harvard.edu/faculty/spiegelman/blinplus.html

• http://www.hsph.harvard.edu/faculty/spiegelman/multsurr.htm

• http://www.mep.ki.se/%7Emarrei/software/ (for optimal main study / validation study design)

• Papers have been published applying these methods to the analysis of occupational and environmental studies: you won’t be the first!

• Just as we routinely adjust for confounding, we can routinely adjust for measurement error

Conclusions

• Bias due to exposure measurement error is a major limitation to the validity of occupational and environmental studies

• Methods have been developed which accommodate the features of study design and data distributions found in such studies

• These methods implement explicit adjustment for this source of bias, using the exposure validation study to characterize the magnitude and other features of the measurement error

• Point and interval estimates of effect are adjusted

Page 76: Methods  to Correct Measures of Effect for Bias  due  to Exposure Measurement  Error

76

Acknowledgements:

• Edie Weller, Ruifeng Li, Don Milton, Ellen Eisen, Barbara Valanis, Sally Thurston, Jon Samet, Paige Williams, Russ Hauser, Roger Logan, Doug Grove, Doug Dockery, Lucas Neas, Nora Horrick, Diane Gold, Mauricio Hernandez, Howard Hu, Aparna Keshaviah

• Xiaomei Liao, Molin Wang, Biling Hong

• Francine Laden, Helen Suh, Jaime E. Hart, Joel Kaufman, Ronald Williams, Robin C. Puett, Marianthi-Anna Kioumourtzoglou

• Alan Berkeley

Thank You!

Page 81: Methods  to Correct Measures of Effect for Bias  due  to Exposure Measurement  Error

Association of indoor nitrogen dioxide with respiratory symptoms in children: Application of measurement error correction techniques to utilize data from multiple surrogates

Li R, Weller EA, Dockery DW, Neas LM, Spiegelman D.

Journal of Exposure Analysis and Environmental Epidemiology, 2006; 16:342-350.

Page 82: Methods  to Correct Measures of Effect for Bias  due  to Exposure Measurement  Error

Table 4: NO2 and surrogates in relation to annual prevalence of respiratory symptoms (1)

Estimated odds ratio (OR) [95% CI]

Variable                             Uncorrected         Measurement error   Validation study    Combined
                                     analysis            corrected (4)       alone               analysis
                                     (n1=1754)           (n1=1754)           (n2=1137)           (n1+n2=2891)

NO2 (per 15 ppb increment)                               1.60 [1.10, 2.32]   1.41 [1.13, 1.75]   1.45 [1.20, 1.75]
Surrogates (2) (W)
  Gas stove, no pilot                0.68 [0.42, 1.10]   0.37 [0.11, 1.29]
  Gas stove, pilot                   1.54 [0.94, 2.52]   1.91 [0.91, 3.99]
  Stove heater                       1.61 [1.05, 2.47]   435 [0.01, >1000]
  Fan                                0.93 [0.81, 1.07]   4.69 [0.21, 104]
  Wood stove                         0.91 [0.66, 1.25]   2.17 [0.17, 27.1]
  Number of rooms in the home (3)    0.99 [0.92, 1.06]   1.84 [0.09, 37.0]
  Kerosene heater                    1.41 [0.96, 2.07]   2.77 [0.86, 8.97]

(1) Adjusted for cities, single marital status, higher education status, parental history of bronchitis or emphysema, parental history of asthma, gender, age, and the total packs of cigarette smoking inside the child’s home

(2) Odds ratios & their 95% CIs are given for the effect of each surrogate, adjusted for all others and for the model covariates

(3) Odds ratios & their 95% CIs are given for the effect of a one room increase, adjusted for all other surrogates and for the model covariates

(4) Measurement error-corrected odds ratios are per 15 ppb NO2

Page 83: Methods  to Correct Measures of Effect for Bias  due  to Exposure Measurement  Error

 

MEASUREMENT ERROR MODEL (n2 = 1137)

Variable name                             Coefficient            SE      P-value   Weights (1)
                                          (ppb/unit increase)

Intercept                                 18.38                  1.14    <0.001

Surrogates
  Gas without pilot light                 6.01                   0.69    <0.001    .183
  Gas with pilot light                    10.05                  0.73    <0.001    .510
  Stove heater                            1.17                   0.95    0.218     .002
  Fan                                     -0.72                  0.24    0.003     .029
  Wood stove                              -1.94                  0.53    <0.001    .044
  Number of rooms in the home             -0.35                  0.12    0.004     .031
  Kerosene heater                         5.07                   0.76    <0.001    .202

Confounders
  Watertown, Massachusetts                -3.50                  0.85    <0.001
  Kingston and Harriman, Tennessee        -8.97                  0.87    <0.001
  Steubenville, Ohio                      -4.41                  0.76    <0.001
  Portage and surrounding communities     -9.81                  0.72    <0.001
  Topeka, Kansas                          -9.46                  0.76    <0.001
  Single                                  1.01                   0.60    0.089
  Parents with higher education           -0.34                  0.43    0.427
  Parental history of bronchitis          0.72                   0.46    0.116
  Parental history of asthma              0.54                   0.61    0.375
  Girls                                   0.70                   0.41    0.086
  Age >= 10 at first questionnaire        0.17                   0.48    0.719
  Parental cigarette smoking (packs/day)  1.07                   0.33    0.001

(1) These are the inverse of the variance of the corresponding measurement error model coefficient
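The arithmetic linking the NO2 tables can be sketched in a few lines of Python. This is an illustrative sketch, not the authors' SAS macro: it assumes each surrogate's corrected log odds ratio is its uncorrected log odds ratio divided by its measurement error model slope (ppb per unit of surrogate), rescaled to a 15 ppb increment, and that the overall corrected estimate is the weighted average of these using the tabulated weights. The variable and function names are made up for the example, and because the inputs are rounded published values, the results only approximately match the reported corrected odds ratios.

```python
import math

INCREMENT_PPB = 15.0  # corrected odds ratios are reported per 15 ppb NO2

# Uncorrected odds ratios for each surrogate (Table 4, uncorrected analysis).
or_uncorrected = {
    "gas_no_pilot": 0.68, "gas_pilot": 1.54, "stove_heater": 1.61,
    "fan": 0.93, "wood_stove": 0.91, "rooms": 0.99, "kerosene_heater": 1.41,
}
# Measurement error model slopes: ppb NO2 per unit increase in the surrogate.
gamma = {
    "gas_no_pilot": 6.01, "gas_pilot": 10.05, "stove_heater": 1.17,
    "fan": -0.72, "wood_stove": -1.94, "rooms": -0.35, "kerosene_heater": 5.07,
}
# Weights as tabulated in the measurement error model slide.
weights = {
    "gas_no_pilot": 0.183, "gas_pilot": 0.510, "stove_heater": 0.002,
    "fan": 0.029, "wood_stove": 0.044, "rooms": 0.031, "kerosene_heater": 0.202,
}

def corrected_log_or(or_w, gamma_w, increment=INCREMENT_PPB):
    """Log odds ratio per `increment` ppb implied by one surrogate alone."""
    return math.log(or_w) / gamma_w * increment

per_surrogate = {
    name: corrected_log_or(or_uncorrected[name], gamma[name])
    for name in or_uncorrected
}

# Weighted average of the per-surrogate log odds ratios.
total_weight = sum(weights.values())
combined_log_or = sum(
    weights[name] * per_surrogate[name] for name in per_surrogate
) / total_weight
combined_or = math.exp(combined_log_or)

for name, log_or in per_surrogate.items():
    print(f"{name:16s} corrected OR per 15 ppb: {math.exp(log_or):8.2f}")
print(f"weighted combination: {combined_or:.2f}")
```

Surrogates with a weak or imprecise slope (stove heater, for example) produce very unstable single-surrogate estimates, which is why they receive almost no weight in the combination.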

Page 84: Methods  to Correct Measures of Effect for Bias  due  to Exposure Measurement  Error

Occupational exposure to methyl tertiary butyl ether in relation to key health symptom prevalence: the effect of measurement error correction

Aparna P. Keshaviah, Edie Weller and Donna Spiegelman

Department of Biostatistics, Harvard School of Public Health, Boston, MA 02115, U.S.A.
Department of Epidemiology, Harvard School of Public Health, Boston, MA 02115, U.S.A.

Environmetrics 2003; 14: 573–582 (DOI: 10.1002/env.604)

Page 85: Methods  to Correct Measures of Effect for Bias  due  to Exposure Measurement  Error

Table 2. Association of prevalence of key symptoms in relation to serum MTBE levels and occupation

Odds ratios, OR [95% CI]

Variable                  Standard             Measurement error    Validation study     Combined
                          analysis             corrected            alone                analysis
                          (n1 = 328)           (n1 = 328)           (n2 = 81)            (n1 + n2 = 409)

log10(MTBE) (X) (mg/L)                         1.1 [0.67, 1.77]     3.1 [1.48, 6.33]     1.5 [1.00, 2.25]
Surrogates (W)
  Commuters               1.2 [0.66, 2.05]     1.56 [0.29, 8.45]
  Car repair              1.2 [0.36, 4.06]     1.16 [0.44, 3.10]
  Other                   6.3 [0.71, 56.3]     >100 [0.00, >1000]
  Pump gas                1.1 [0.35, 13.54]    1.05 [0.61, 1.81]

Reference group is Albany study participants. Adjusted for age, smoking and gender
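The combined-analysis odds ratio for log10(MTBE) appears consistent with an inverse-variance weighted average of the measurement error corrected estimate and the validation-study estimate. The sketch below checks this from the published figures alone. It assumes the confidence intervals are symmetric Wald intervals on the log scale, so that standard errors can be recovered from the interval limits; the function names are hypothetical, and the authors' actual procedure may differ in detail.

```python
import math

Z95 = 1.959964  # standard normal quantile for a two-sided 95% interval

def log_or_and_se(odds_ratio, lower, upper):
    """Recover the log odds ratio and its standard error from a 95% CI."""
    se = (math.log(upper) - math.log(lower)) / (2.0 * Z95)
    return math.log(odds_ratio), se

def pool_inverse_variance(estimates):
    """Inverse-variance weighted mean of (estimate, se) pairs and its SE."""
    weights = [1.0 / se ** 2 for _, se in estimates]
    pooled = sum(w * b for w, (b, _) in zip(weights, estimates)) / sum(weights)
    return pooled, math.sqrt(1.0 / sum(weights))

# log10(MTBE) row of Table 2.
corrected = log_or_and_se(1.1, 0.67, 1.77)   # measurement error corrected
validation = log_or_and_se(3.1, 1.48, 6.33)  # validation study alone

pooled_log_or, pooled_se = pool_inverse_variance([corrected, validation])
combined_or = math.exp(pooled_log_or)
combined_ci = (
    math.exp(pooled_log_or - Z95 * pooled_se),
    math.exp(pooled_log_or + Z95 * pooled_se),
)
print(f"combined OR {combined_or:.2f} "
      f"[{combined_ci[0]:.2f}, {combined_ci[1]:.2f}]")
```

The result is close to the reported 1.5 [1.00, 2.25]; small differences are expected because the inputs are rounded.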

Page 86: Methods  to Correct Measures of Effect for Bias  due  to Exposure Measurement  Error

Table 3. Measurement error model for log10(blood MTBE levels) (mg/L) (n2 = 81)

Variable      Estimate   SE      p-value    Weight
Intercept     0.96       0.121   <0.0001
Commuter      0.34       0.159   0.034      0.042
Car repair    1.24       0.148   <0.0001    0.21
Other         0.14       0.230   0.54       0.00
Pump gas      2.13       0.316   <0.0001    0.75
Smoke         0.062      0.130   0.64
Age#          0.070      0.057   0.23
Gender§§      0.27       0.139   0.056

# Grouped as <35, 36–41, 42–52, 53+.
§§ Relative to males.

Page 87: Methods  to Correct Measures of Effect for Bias  due  to Exposure Measurement  Error

Correcting for Measurement Error Bias In Cumulative Exposure Variables:

A Cox Model For Lung Cancer Mortality In Relation To Radon Progeny Exposure.

R. Logan, D. Spiegelman
Departments of Epidemiology and Biostatistics
Harvard School of Public Health
Boston, MA, USA

J. Samet
Johns Hopkins School of Public Health
Baltimore, MD, USA

Page 88: Methods  to Correct Measures of Effect for Bias  due  to Exposure Measurement  Error

Original Study:

Lung Cancer Mortality and Exposure to Radon Progeny In A Cohort of New Mexico Underground Uranium Miners.

J. Samet, D. Pathak, M. Morgan, C. Key, A. Valdivia, J. Lubin

Health Physics, vol 61, No. 6, 1991, pp 745-752

Page 89: Methods  to Correct Measures of Effect for Bias  due  to Exposure Measurement  Error

Original study consisted of:
• 3469 males with at least one year of underground uranium mining experience.
• Follow-up through December 31, 1985
• 70 cases of lung cancer
• 408 deaths

Present study:
• 3469 males with at least one year of underground uranium mining experience.
• Follow-up through December 31, 1993
• 120 cases of lung cancer
• 686 deaths

Page 90: Methods  to Correct Measures of Effect for Bias  due  to Exposure Measurement  Error

Types of Measurements

Source of exposure data                  Years spanned
1  Individual estimates (c)              1967 – 1985
2  Company-section measurements (C)      1956 – 1976
3  Grants clinic                         1942 – 1979
4  Colorado Plateau                      1967 – 1985
5  Overrides                             1959 – 1974

Hierarchy of data quality: 1 > 2 > 3.

Assume that the individual estimates are the gold standard, in the sense that the ‘ideal’ study would have used these measurements for everyone.

There are 8 years of overlap between c and C, between 1967 – 1976.

Page 91: Methods  to Correct Measures of Effect for Bias  due  to Exposure Measurement  Error

Validation Study

The validation study consists of 2833 pairs of individual annual measurements and annual section/company samples (c and C point exposure measurements):

$\mathrm{Corr}(c, C) = 0.33$, $n_2 = 2833$.

$\mathrm{Corr}\big(x(t), X(t)\big) = 0.64$, $n = 862$,

where $x(t)$ and $X(t)$ are the longest available cumulative exposure measurements for the 862 miners in the validation study.

Page 93: Methods  to Correct Measures of Effect for Bias  due  to Exposure Measurement  Error

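Error Model for $f(c \mid C, U)$

Possible Choices:

Gaussian (includes normal and other power transforms; log-normal also considered):

$$c^{p_1} \mid C, U \sim N\big(\mu(C^{p_2}, U),\ \sigma^2\big)$$

Gamma:

$$c \mid C, U \sim \mathrm{Gamma}\big(a(C, U),\ b(C, U)\big)$$

In the above models, $U$ is a collection of perfectly measured covariates.

Model Used: A combination of gamma distribution and a point mass at 0.

$$f(c \mid C, U) = \frac{e^{\eta(C, U)}}{1 + e^{\eta(C, U)}}\, I(c = 0) + \frac{1}{1 + e^{\eta(C, U)}}\, I(c > 0)\, \mathrm{Gamma}\big(a(C, U),\ b(C, U)\big)$$

where $I(A)$ is 1 when $A$ holds and 0 otherwise. The symbols $\mu$, $\sigma^2$, $a$, $b$ and $\eta$ stand in for the model's parameter functions of $(C, U)$, whose original notation did not survive in this transcript.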

Page 94: Methods  to Correct Measures of Effect for Bias  due  to Exposure Measurement  Error

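Extended Partial Likelihood function.

$$EPL = PL + \sum_{i=n_1+1}^{n_1+n_2} \log f\big(x_i(t_i) \mid X_i(t_i), U_i(t_i)\big)$$

where

$$PL = \sum_{i=1}^{n_1+n_2} D_i \log \frac{RR^*\big(X_i(T_i), U_i(T_i)\big)}{\sum_{j=1}^{n_1+n_2} I(T_j \ge T_i)\, RR^*\big(X_j(T_i), U_j(T_i)\big)}$$

and $RR^*(t)$ is defined piecewise over three cases, according to where $t$ falls relative to $t_{0i}$ and $t_{wi}$. One of the cases averages the relative risk over the error model:

$$RR^*_3(t) = \int_{x(t)} e^{\beta x(t)}\, f\big(x(t) \mid X(t), U(t)\big)\, dx(t)$$

The parameter arguments of $EPL$, $PL$ and $f$, and the exact case conditions for $RR^*(t)$, did not survive in this transcript.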

Page 95: Methods  to Correct Measures of Effect for Bias  due  to Exposure Measurement  Error

Results

Here $\beta$ is the log rate ratio corresponding to cumulative exposure.

• There is about a 37% attenuation in $\beta$ due to exposure measurement error (0.003524 uncorrected versus 0.005632 corrected).

• Results could have policy implications, since the original study played an important role in setting permissible exposure levels for radon

Model         beta        Rate ratio (95% CI)    Rate ratio (95% CI)
                          at 100 WLM             at 500 WLM
Uncorrected   0.003524    1.42 (1.25, 1.62)      5.82 (3.06, 11.1)
Corrected     0.005632    1.76 (1.39, 2.22)      16.7 (5.18, 53.9)
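The tabulated rate ratios follow directly from the coefficients, since the rate ratio at a cumulative exposure of w WLM is exp(beta × w). The short sketch below reproduces the point estimates and the ratio of the two coefficients. The names are illustrative, and the confidence intervals are not reproduced because the standard errors are not given on the slide.

```python
import math

BETA_UNCORRECTED = 0.003524  # log rate ratio per WLM, uncorrected
BETA_CORRECTED = 0.005632    # log rate ratio per WLM, measurement error corrected

def rate_ratio(beta, cumulative_wlm):
    """Rate ratio at a given cumulative exposure under a log-linear model."""
    return math.exp(beta * cumulative_wlm)

for label, beta in (("uncorrected", BETA_UNCORRECTED),
                    ("corrected", BETA_CORRECTED)):
    rr_100 = rate_ratio(beta, 100)
    rr_500 = rate_ratio(beta, 500)
    print(f"{label:12s} RR(100 WLM) = {rr_100:.2f}, RR(500 WLM) = {rr_500:.1f}")

# Proportional attenuation of the uncorrected coefficient.
attenuation = 1.0 - BETA_UNCORRECTED / BETA_CORRECTED
print(f"attenuation of beta: {attenuation:.0%}")
```

Because the model is log-linear in cumulative exposure, a modest change in beta has a large effect at high exposures: the corrected rate ratio at 500 WLM is nearly three times the uncorrected one.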

Page 96: Methods  to Correct Measures of Effect for Bias  due  to Exposure Measurement  Error

Assumptions

• True exposure $X$ and the s-vector of covariates $Z$ are related to the probability of binary outcome $D$ by the logistic function:

$$\mathrm{logit}\, \Pr(D = 1 \mid X, Z) = \beta_0 + \beta_1 X + \beta_2' Z,$$

where $\beta_2 = (\beta_{21}, \beta_{22}, \ldots, \beta_{2s})$.

• Linear regression model is appropriate to relate the r surrogates (W) and the s covariates (Z) to the true exposure:

$$X = \gamma_0 + \gamma_1' W + \gamma_2' Z + \varepsilon,$$

where $E(\varepsilon) = 0$ and $\mathrm{Var}(\varepsilon) = \sigma^2_{X \mid Z, W}$.

• $W$ is a surrogate if $\Pr(D \mid X, W, Z) = \Pr(D \mid X, Z)$, that is, knowledge of the surrogates provides no additional information if the true exposure is known.

• $\varepsilon \sim N(0, \sigma^2_{X \mid W, Z})$ and $\beta_1^2 \sigma^2_{X \mid W, Z}$ small, or $\Pr(D)$ small
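A small simulation can illustrate what these assumptions buy. The sketch below is a toy example, not the authors' software: it assumes a single surrogate W, no covariates Z, an external validation sample, and made-up parameter values. It fits the naive logistic model of D on W in the main study, fits the linear measurement error model of X on W in the validation study, and divides the naive slope by the calibration slope, which is the standard regression calibration correction for this simple case. The helper functions are written out with the standard library only so the block is self-contained.

```python
import math
import random

def fit_logistic(x, y, iterations=15):
    """Newton-Raphson fit of logit Pr(y = 1) = b0 + b1 * x; returns (b0, b1)."""
    b0 = b1 = 0.0
    for _ in range(iterations):
        g0 = g1 = h00 = h01 = h11 = 0.0
        for xi, yi in zip(x, y):
            p = 1.0 / (1.0 + math.exp(-(b0 + b1 * xi)))
            v = p * (1.0 - p)
            g0 += yi - p            # score for the intercept
            g1 += (yi - p) * xi     # score for the slope
            h00 += v                # information matrix entries
            h01 += v * xi
            h11 += v * xi * xi
        det = h00 * h11 - h01 * h01
        b0 += (h11 * g0 - h01 * g1) / det
        b1 += (h00 * g1 - h01 * g0) / det
    return b0, b1

def fit_linear(w, x):
    """Ordinary least squares fit of x = g0 + g1 * w; returns (g0, g1)."""
    n = len(w)
    mean_w, mean_x = sum(w) / n, sum(x) / n
    sww = sum((wi - mean_w) ** 2 for wi in w)
    swx = sum((wi - mean_w) * (xi - mean_x) for wi, xi in zip(w, x))
    g1 = swx / sww
    return mean_x - g1 * mean_w, g1

random.seed(2013)
# Hypothetical parameter values chosen for illustration only.
BETA0, BETA1 = -3.0, 0.5                  # outcome model: logit Pr(D=1|X)
GAMMA0, GAMMA1, SIGMA = 1.0, 0.6, 0.5     # error model: X = g0 + g1*W + eps

def draw_exposure():
    """Draw a surrogate W and the true exposure X from the error model."""
    w = random.gauss(0.0, 1.0)
    x = GAMMA0 + GAMMA1 * w + random.gauss(0.0, SIGMA)
    return w, x

# Main study: only the surrogate W and the outcome D are observed.
main_w, main_d = [], []
for _ in range(20000):
    w, x = draw_exposure()
    p = 1.0 / (1.0 + math.exp(-(BETA0 + BETA1 * x)))
    main_w.append(w)
    main_d.append(1 if random.random() < p else 0)

# Validation study: both W and the true exposure X are observed.
validation = [draw_exposure() for _ in range(1000)]
val_w = [w for w, _ in validation]
val_x = [x for _, x in validation]

_, beta1_naive = fit_logistic(main_w, main_d)
_, gamma1_hat = fit_linear(val_w, val_x)
beta1_corrected = beta1_naive / gamma1_hat

print(f"true beta1      = {BETA1}")
print(f"naive beta1     = {beta1_naive:.3f}")
print(f"corrected beta1 = {beta1_corrected:.3f}")
```

With these settings the outcome is fairly rare and $\beta_1^2 \sigma^2$ is small, so the approximation behind the correction should hold well; the naive slope is expected to be near $\beta_1 \gamma_1$ and the corrected slope near $\beta_1$, up to sampling error.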