Longitudinal analyses in diabetes epidemiology

biostat.au.dk/researchseminars/longitudinal...

OCTOBER 23, 2015

RESEARCH SEMINAR

POSTDOCTORAL RESEARCHER

ADAM HULMAN

AARHUS UNIVERSITY

DEPARTMENT OF PUBLIC HEALTH, SECTION FOR EPIDEMIOLOGY

LONGITUDINAL ANALYSES IN DIABETES EPIDEMIOLOGY

Methodological overview

OUTLINE

▪ Multilevel models

• Motivation

• Definition, model formulation

• Practical example

▪ Missing data

• Types, consequences

▪ Extensions of the multilevel framework

• Joint modeling

• Latent class trajectory analysis

MULTILEVEL MODELS

▪ The part on multilevel models is based on ALDA* by Singer & Willett

▪ Notation and examples are from ALDA

http://www.ats.ucla.edu/stat/examples/alda

*Applied Longitudinal Data Analysis

MOTIVATION

▪ Is there a need for longitudinal studies?

▪ If yes, do we need special statistical models to analyze the data?

QUESTIONS

▪ How does the outcome change over time within individuals?

▪ Which factors explain between-individual differences?

MULTILEVEL MODEL FORMULATION

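Level 1

$$Y_{ij} = \pi_{0i} + \pi_{1i} \cdot TIME_{ij} + \varepsilon_{ij}, \qquad \varepsilon_{ij} \sim N(0, \sigma_\varepsilon^2)$$

Growth parameters for individual i: $\pi_{0i}$ (intercept) and $\pi_{1i}$ (slope)

▪ We assume linear change over time t

Level 2

$$\pi_{0i} = \gamma_{00} + \zeta_{0i}$$

$$\pi_{1i} = \gamma_{10} + \zeta_{1i}$$

$$\begin{pmatrix} \zeta_{0i} \\ \zeta_{1i} \end{pmatrix} \sim N\left( \begin{pmatrix} 0 \\ 0 \end{pmatrix}, \begin{pmatrix} \sigma_0^2 & \sigma_{01} \\ \sigma_{10} & \sigma_1^2 \end{pmatrix} \right)$$

Fixed effects: population average intercept & slope

Random effects: deviations from the population average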
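One way to get a feel for the two-level formulation is to simulate data from it. The sketch below is only an illustration: the function name, the parameter values and the use of three waves are assumptions, not taken from the slides or from ALDA. It draws correlated random effects for each individual (level 2) and then generates the outcome along the individual's own line (level 1).

```python
import math
import random

def simulate_growth_data(n_individuals, times, gamma00, gamma10,
                         sd0, sd1, corr01, sd_eps, seed=1):
    """Draw outcomes Y_ij from the two-level linear growth model (hypothetical helper)."""
    rng = random.Random(seed)
    rows = []
    for i in range(n_individuals):
        # Level 2: correlated random effects (zeta_0i, zeta_1i) via a 2x2 Cholesky factor
        z0, z1 = rng.gauss(0, 1), rng.gauss(0, 1)
        zeta0 = sd0 * z0
        zeta1 = sd1 * (corr01 * z0 + math.sqrt(1 - corr01 ** 2) * z1)
        pi0 = gamma00 + zeta0  # individual intercept
        pi1 = gamma10 + zeta1  # individual slope
        for t in times:
            # Level 1: the individual's line plus within-person residual
            rows.append((i, t, pi0 + pi1 * t + rng.gauss(0, sd_eps)))
    return rows

# Made-up parameter values, chosen only for illustration
n = 2000
data = simulate_growth_data(n, [0, 1, 2], gamma00=5.5, gamma10=0.1,
                            sd0=0.5, sd1=0.05, corr01=0.3, sd_eps=0.3)
mean_at_0 = sum(y for _, t, y in data if t == 0) / n
mean_at_2 = sum(y for _, t, y in data if t == 2) / n
avg_slope = (mean_at_2 - mean_at_0) / 2
print(round(mean_at_0, 2), round(avg_slope, 2))
```

With a large sample, the average outcome at time 0 and the average change per unit of time should land close to the fixed effects used in the simulation, while individual lines scatter around them.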

VARIANCE COMPONENTS

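$\sigma_\varepsilon^2$: unexplained within-person residual variance

$\sigma_0^2$: between-person residual variance in initial status

$\sigma_1^2$: between-person residual variance in rate of change

$\sigma_{01}, \sigma_{10}$: residual covariance between initial status and rate of change

$$Y_{ij} = \pi_{0i} + \pi_{1i} \cdot TIME_{ij} + \varepsilon_{ij}, \qquad \varepsilon_{ij} \sim N(0, \sigma_\varepsilon^2)$$

$$\pi_{0i} = \gamma_{00} + \zeta_{0i}$$

$$\pi_{1i} = \gamma_{10} + \zeta_{1i}$$

$$\begin{pmatrix} \zeta_{0i} \\ \zeta_{1i} \end{pmatrix} \sim N\left( \begin{pmatrix} 0 \\ 0 \end{pmatrix}, \begin{pmatrix} \sigma_0^2 & \sigma_{01} \\ \sigma_{10} & \sigma_1^2 \end{pmatrix} \right)$$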

ESTIMATION

▪ Full maximum likelihood (ML) vs. restricted maximum likelihood (REML)

• Focus on fixed effects (ML) or variance components (REML)
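For intuition about the ML / REML distinction, the simplest special case is a model with only a mean and a single variance. There, the ML variance estimate divides the sum of squares by n, while the REML-type estimate divides by n - 1 to account for the estimated mean. The sketch below uses made-up glucose values and the Python standard library; it illustrates this special case only, not a multilevel fit.

```python
import statistics

# Hypothetical fasting glucose values (mmol/l) for five people
y = [5.1, 5.6, 6.0, 5.4, 5.9]

# ML estimate: divides the sum of squares by n, ignoring that the mean was estimated
var_ml = statistics.pvariance(y)

# REML-type estimate: divides by n - 1, accounting for the estimated mean
var_reml = statistics.variance(y)

print(var_ml, var_reml)
```

The ML estimate is always the smaller of the two, and the gap shrinks as the sample grows relative to the number of fixed effects.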

COMPOSITE MODEL

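▪ Substitute level-2 equations into level-1

$$Y_{ij} = (\gamma_{00} + \zeta_{0i}) + (\gamma_{10} + \zeta_{1i}) \cdot TIME_{ij} + \varepsilon_{ij}$$

▪ Rearrange terms of the composite model

$$Y_{ij} = \gamma_{00} + \gamma_{10} \cdot TIME_{ij} + (\zeta_{0i} + \zeta_{1i} \cdot TIME_{ij} + \varepsilon_{ij})$$

Composite residual: $r_{ij} = \zeta_{0i} + \zeta_{1i} \cdot TIME_{ij} + \varepsilon_{ij}$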

EXAMINE COMPOSITE RESIDUALS

▪ We assume a balanced data set with three waves for simplicity

▪ OLS?

OLS NOT CORRECT

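▪ Composite residuals between individuals

• Independent (as OLS assumes)

▪ Composite residuals within individuals

• Heteroscedastic (variance changes over time)

• Correlated (across waves)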

COVARIANCE STRUCTURE

▪ Block diagonal matrix (one block per individual, zero covariance between individuals)

COVARIANCE STRUCTURE

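▪ Unstructured

$$\Sigma_r = \begin{pmatrix} \sigma_{r_1}^2 & \sigma_{r_1 r_2} & \sigma_{r_1 r_3} \\ \sigma_{r_2 r_1} & \sigma_{r_2}^2 & \sigma_{r_2 r_3} \\ \sigma_{r_3 r_1} & \sigma_{r_3 r_2} & \sigma_{r_3}^2 \end{pmatrix}$$

▪ Compound symmetry (if slopes do not differ much)

$$\Sigma_r = \begin{pmatrix} \sigma^2 + \sigma_1^2 & \sigma_1^2 & \sigma_1^2 \\ \sigma_1^2 & \sigma^2 + \sigma_1^2 & \sigma_1^2 \\ \sigma_1^2 & \sigma_1^2 & \sigma^2 + \sigma_1^2 \end{pmatrix}$$

▪ Autoregressive (band diagonal)

$$\Sigma_r = \begin{pmatrix} \sigma^2 & \sigma^2 \rho & \sigma^2 \rho^2 \\ \sigma^2 \rho & \sigma^2 & \sigma^2 \rho \\ \sigma^2 \rho^2 & \sigma^2 \rho & \sigma^2 \end{pmatrix}$$

▪ …

Variance of the composite residual at wave j:

$$\sigma_{r_j}^2 = Var(\zeta_{0i} + \zeta_{1i} \cdot t_{ij} + \varepsilon_{ij}) = \sigma_\varepsilon^2 + \sigma_0^2 + 2 \sigma_{01} t_{ij} + \sigma_1^2 t_{ij}^2$$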
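The variance formula above extends to the covariance between two waves of the same individual: Cov(r_ij, r_ik) = σ0² + σ01 (t_ij + t_ik) + σ1² t_ij t_ik, with σε² added on the diagonal. A minimal sketch, with a hypothetical function name and made-up variance components, builds one individual's block for three waves and shows how it reduces to a compound symmetry pattern when the slopes do not vary.

```python
def composite_residual_cov(times, var_eps, var0, var1, cov01):
    """Covariance matrix of r_ij = zeta_0i + zeta_1i * t_ij + eps_ij for one individual."""
    cov = []
    for j, tj in enumerate(times):
        row = []
        for k, tk in enumerate(times):
            value = var0 + cov01 * (tj + tk) + var1 * tj * tk
            if j == k:
                value += var_eps  # within-person residual enters only on the diagonal
            row.append(value)
        cov.append(row)
    return cov

times = [0, 1, 2]  # balanced design with three waves

# Random slopes present: the variance differs by wave (heteroscedastic)
with_slopes = composite_residual_cov(times, var_eps=0.3, var0=0.5, var1=0.2, cov01=0.05)

# No slope variation: constant variance and constant covariance
no_slopes = composite_residual_cov(times, var_eps=0.3, var0=0.5, var1=0.0, cov01=0.0)

for row in with_slopes:
    print([round(v, 2) for v in row])
```

Stacking one such block per individual along the diagonal gives the block diagonal covariance matrix of the full data set.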

GENERALIZED LEAST SQUARES

▪ Allows more complex assumptions about the residuals than OLS (heteroscedasticity, correlation)

▪ Minimizes a weighted sum of squared residuals

▪ Some programs use iterative GLS (IGLS)
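For reference, the GLS criterion and estimator can be written in matrix form. The notation here is an assumption and does not appear on the slides: Y stacks all outcomes, X is the design matrix with rows (1, TIME_ij), γ = (γ00, γ10)ᵀ, and Σ is the block diagonal covariance matrix of the composite residuals.

```latex
\begin{aligned}
\hat{\gamma}_{GLS} &= \arg\min_{\gamma} \; (Y - X\gamma)^\top \Sigma^{-1} (Y - X\gamma) \\
                   &= (X^\top \Sigma^{-1} X)^{-1} X^\top \Sigma^{-1} Y
\end{aligned}
```

With Σ = σ²I this reduces to the OLS estimator. In practice Σ is unknown and has to be estimated, which is where iterative schemes alternate between updating Σ and updating γ.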

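EXAMPLE FROM SINGER & WILLETT

▪ Alcohol use during adolescence (data and code at the ALDA webpage)

• Unconditional means (Model A)

$$Y_{ij} = \gamma_{00} + \zeta_{0i} + \varepsilon_{ij}$$

• Unconditional growth (Model B)

$$Y_{ij} = (\gamma_{00} + \zeta_{0i}) + (\gamma_{10} + \zeta_{1i}) \cdot TIME_{ij} + \varepsilon_{ij}$$

• Model B + a time-invariant predictor (Model C)

$$Y_{ij} = (\gamma_{00} + \gamma_{01} COA_i + \zeta_{0i}) + (\gamma_{10} + \gamma_{11} COA_i + \zeta_{1i}) \cdot TIME_{ij} + \varepsilon_{ij}$$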
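Model C can be unpacked into its two levels and then rearranged in the same way as the composite model earlier. The lines below follow directly from the Model C equation and use the same symbols; in ALDA, COA indicates being the child of an alcoholic parent.

```latex
\begin{aligned}
\text{Level 1:} \quad Y_{ij} &= \pi_{0i} + \pi_{1i} \cdot TIME_{ij} + \varepsilon_{ij} \\
\text{Level 2:} \quad \pi_{0i} &= \gamma_{00} + \gamma_{01} COA_i + \zeta_{0i} \\
                      \pi_{1i} &= \gamma_{10} + \gamma_{11} COA_i + \zeta_{1i} \\
\text{Composite:} \quad Y_{ij} &= \gamma_{00} + \gamma_{01} COA_i + \gamma_{10} \cdot TIME_{ij}
      + \gamma_{11} COA_i \cdot TIME_{ij} + (\zeta_{0i} + \zeta_{1i} \cdot TIME_{ij} + \varepsilon_{ij})
\end{aligned}
```

Written this way, γ01 is the difference in initial status between the COA groups and γ11 is the COA by TIME interaction, that is, the difference in rate of change.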

PRACTICAL

MISSING DATA

▪ Different reasons: late entry, intermittent missingness, loss to follow-up

▪ Can lead to unbalanced data

▪ Effect depends on the missing data mechanism (type of missingness)

• Missing completely at random (MCAR)

• Missing at random (MAR)

• Missing not at random (MNAR)

MCAR

▪ Missingness is completely unrelated to both the history and the current value of the outcome

▪ Examples:

• Lab equipment did not work

• Samples were lost

• Participants were mistakenly not invited for examination

▪ Less data collected (loss of efficiency, but no bias)

MAR

▪ Missingness depends on the history of the outcome, but not on the current value

▪ Example:

• Participants are no longer invited once they have reached a threshold (diabetes diagnosis based on fasting plasma glucose, FPG)

▪ Some methods might give biased estimates

[Figure: FPG over time t, with the 7.0 mmol/l diagnostic threshold marked]
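A small simulation can illustrate why the mechanism matters. The sketch below is an assumption-laden toy example (two waves, made-up means and standard deviations, a hypothetical `simulate` function). It compares the mean FPG at the second wave in the full data with the mean among those still observed, once under MCAR (30% lost regardless of any value) and once under a MAR-type rule (no second visit if the first observed FPG was at or above 7.0 mmol/l). The naive observed-data mean stands in for one of the methods that can be biased under MAR.

```python
import random
from statistics import fmean

def simulate(n, seed=2015):
    """Return second-wave mean FPG: full data, MCAR-observed, MAR-observed."""
    rng = random.Random(seed)
    full, mcar, mar = [], [], []
    for _ in range(n):
        level = rng.gauss(6.2, 0.7)           # person-specific FPG level (mmol/l)
        y0 = level + rng.gauss(0, 0.3)         # wave 0, always observed
        y1 = level + 0.2 + rng.gauss(0, 0.3)   # wave 1
        full.append(y1)
        if rng.random() > 0.3:                 # MCAR: 30% lost regardless of any value
            mcar.append(y1)
        if y0 < 7.0:                           # MAR: not invited once observed FPG >= 7.0
            mar.append(y1)
    return fmean(full), fmean(mcar), fmean(mar)

mean_full, mean_mcar, mean_mar = simulate(20000)
print(round(mean_full, 2), round(mean_mcar, 2), round(mean_mar, 2))
```

Under MCAR the observed mean stays close to the full-data mean, only with fewer observations behind it. Under the threshold rule the observed mean is pulled downward, because people with high earlier values are systematically absent.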

MAR

▪ ML-based methods give valid estimates under MAR

Figure from Rizopoulos D, CRC Press, 2012

MNAR

▪ Missingness depends on the current value of the outcome

▪ E.g. participant doesn’t show up because she

▪ MAR vs. MNAR?

▪ Dropout event should be modelled simultaneously (joint modeling, JM)

[Figure: FPG over time t, with the 7.0 mmol/l threshold marked]

EXTENSIONS OF THE MULTILEVEL MODEL

▪ Joint models for longitudinal and survival data

▪ Latent class trajectory analyses

JOINT MODELING

▪ Combines trajectory and survival analyses (continuous biomarker closely related to an event)

▪ Survival analysis, but taking the entire history of a biomarker into account (also for endogenous variables)

IN RELATION TO MULTILEVEL MODELS

▪ JM gives valid estimates in case of MNAR

▪ There is no statistical test to decide between MAR and MNAR…

IN RELATION TO COX REGRESSION

▪ A common approach is to use the baseline or last available value in a Cox model

▪ Time-varying Cox regression

• Problematic for endogenous (internal) variables

• Does not take measurement error into account (unrealistic)

• Underestimates the true association (theoretical + simulation results)

Figure from Rizopoulos D, CRC Press, 2012

PARAMETERIZATION OF JM

▪ Link between longitudinal and survival model

▪ Association between biomarker and event hazard

• Value (1)

• Lagged (2)

• Slope (3)

• Cumulative (4)

• …

• Any function of m(t)

[Figure: four panels, numbered 1 to 4, illustrating the parameterizations listed above]

Figure from Rizopoulos D, CRC Press, 2012
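The four parameterizations can be written as hazard models. This is a sketch in notation commonly used in the joint modeling literature and is not copied from the slides: h_0(t) is the baseline hazard, w_i are baseline covariates with coefficients θ, m_i(t) is the underlying biomarker value of individual i at time t from the longitudinal submodel, c is a fixed lag, and the α terms are association parameters.

```latex
\begin{aligned}
\text{(1) Value:} \quad      h_i(t) &= h_0(t) \exp\{\theta^\top w_i + \alpha \, m_i(t)\} \\
\text{(2) Lagged:} \quad     h_i(t) &= h_0(t) \exp\{\theta^\top w_i + \alpha \, m_i(\max(t - c, 0))\} \\
\text{(3) Slope:} \quad      h_i(t) &= h_0(t) \exp\{\theta^\top w_i + \alpha_1 m_i(t) + \alpha_2 \, m_i'(t)\} \\
\text{(4) Cumulative:} \quad h_i(t) &= h_0(t) \exp\Big\{\theta^\top w_i + \alpha \int_0^t m_i(s) \, ds\Big\}
\end{aligned}
```

In each case exp(α) is interpreted as a hazard ratio per unit increase in the chosen function of the biomarker trajectory.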

DYNAMIC PREDICTION

Figure from Rizopoulos D, CRC Press, 2012

LATENT CLASS ANALYSES

▪ Is it sufficient to look at only mean trajectories?

LATENT CLASS ANALYSES

▪ Just one mean trajectory does not always fit the data well

▪ Potential underlying heterogeneity, but not based on predefined groups

▪ “Cluster-type” analysis

▪ Model formulation is similar to what we saw previously for multilevel models, but we have to specify a priori

• which effects might vary between classes (not necessarily all)

• the number of classes

MODELING STEPS

▪ Common strategy: fit 1, 2, 3 classes…

▪ Choose the model with the lowest BIC that has meaningful patterns and sufficient class size (e.g. >5%)

▪ This results in coefficient estimates for each class (some coefficients might be shared)

▪ We also get class membership probabilities, so that we can assign individuals in our sample to a pattern

▪ Comparison of class characteristics
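The selection and assignment steps above can be sketched in a few lines. Everything in this example is hypothetical: the function names, the BIC values, the class shares and the membership probabilities are made up, and in a real analysis they would come from the fitted latent class models. Whether the patterns are meaningful remains a judgment call that the code does not capture.

```python
def choose_model(fits, min_share=0.05):
    """Pick the lowest-BIC model whose smallest class holds more than min_share of the sample."""
    eligible = [f for f in fits if min(f["class_shares"]) > min_share]
    return min(eligible, key=lambda f: f["bic"])

def assign_classes(posteriors):
    """Assign each individual to the class with the highest membership probability."""
    return [max(range(len(p)), key=p.__getitem__) for p in posteriors]

# Hypothetical results from fitting 1 to 4 classes
fits = [
    {"n_classes": 1, "bic": 5120.4, "class_shares": [1.00]},
    {"n_classes": 2, "bic": 5043.9, "class_shares": [0.71, 0.29]},
    {"n_classes": 3, "bic": 5010.2, "class_shares": [0.62, 0.27, 0.11]},
    {"n_classes": 4, "bic": 5004.7, "class_shares": [0.60, 0.26, 0.11, 0.03]},
]
best = choose_model(fits)

# Hypothetical membership probabilities for three individuals under the chosen model
posteriors = [[0.90, 0.08, 0.02], [0.15, 0.25, 0.60], [0.30, 0.65, 0.05]]
labels = assign_classes(posteriors)
print(best["n_classes"], labels)
```

Here the four-class model has the lowest BIC but is excluded because its smallest class covers only 3% of the sample, so the three-class model is retained.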

EXAMPLE

▪ Oral glucose tolerance test (OGTT) with multiple glucose measurements

Study                 Inter99      CPH          Hoorn

N (participants)      118          238          185

N (measurements)      9            5            6

Men (%)               61           64           48

Age, median (Q1-Q3)   56 (46-61)   56 (38-66)   54 (48-59)

▪ Plasma glucose peaks between 30 and 75 min, at 7-12.5 mmol/l

Thank you for your attention!
