Modelling Longitudinal Data • General Points • Single Event histories (survival analysis) • Multiple Event histories

Upload: brendan-mccormick

Post on 19-Jan-2018



Modelling Longitudinal Data

• General Points

• Single Event histories (survival analysis)

• Multiple Event histories


Motivation

• Attempt to go beyond more simple material in the first workshop.

• Begin to develop an appreciation of the notation associated with these techniques.

• Gain a little “hands-on” experience.

Statistical Modelling Framework: Generalized Linear Models

An interest in generalized linear models is richly rewarded. Not only does it bring together a wealth of interesting theoretical problems but it also encourages an ease of data analysis sadly lacking from traditional statistics… an added bonus of the glm approach is the insight provided by embedding a problem in a wider context. This in itself encourages a more critical approach to data analysis.

Gilchrist, R. (1985) ‘Introduction: GLIM and Generalized Linear Models’, Springer Verlag Lecture Notes in Statistics, 32, pp.1-5.

Statistical Modelling

• Know your data.

• Start and be guided by ‘substantive theory’.

• Start with simple techniques (these might suffice).

• Remember John Tukey!

• Practice.

Willett and Singer (1995) conclude that discrete-time methods are generally considered to be simpler and more comprehensible, and that mastery of discrete-time methods facilitates a transition to continuous-time approaches should that be required.

Willett, J. and Singer, J. (1995) ‘Investigating Onset, Cessation, Relapse, and Recovery: Using Discrete-Time Survival Analysis to Examine the Occurrence and Timing of Critical Events’, in J. Gottman (ed.) The Analysis of Change (Hove: Lawrence Erlbaum Associates).


As social scientists we are often substantively interested in whether a specific event has occurred.

Survival Data – Time to an event

In the medical area…

• Time from diagnosis to death.

• Duration from treatment to full health.

• Time to return of pain after taking a pain killer.

Survival Data – Time to an event

Social Sciences…

• Duration of unemployment.

• Duration of housing tenure.

• Duration of marriage.

• Time to conception.

• Time to orgasm.

Consider a binary outcome or two-state event

0 = Event has not occurred

1 = Event has occurred

[Figure: event-history timelines for subjects A, B and C between Start of Study and End of Study, showing states 0 and 1 and event times t1, t2, t3.]

These durations are a continuous Y so why can’t we use standard regression techniques?

We can. It might be better, however, to model the log of Y, because durations are positive and usually right-skewed. These models are sometimes known as ‘accelerated life models’.

[Figure: 1946 Birth Cohort Study. Timelines for subjects A, B and C from the Start of Study in 1946, showing states 0 and 1 (1 = Death) and event times t1, t2, t3, t4. Annotation: Research Project 2060 (1st August 2032 VG retires!).]


Breast Feeding Study –

Data Collection Strategy

1. Retrospective questioning of mothers

2. Data collected by Midwives

3. Health Visitor and G.P. Record

[Figure: Breast Feeding Study timeline, from Birth in 1995 (Start of Study) to Age 6 in 2001.]

[Figure: Breast Feeding Study timeline, from Birth in 1995 (Start of Study) to Age 6 in 2001, showing states 0 and 1 and event times t1, t2, t3.]

Accelerated Life Model

$\log_e t_i = \beta_0 + \beta_1 x_{1i} + e_i$

Accelerated Life Model

$\log_e t_i = \beta_0 + \beta_1 x_{1i} + e_i$

where $\beta_0$ is the constant, $x_{1i}$ is the explanatory variable and $e_i$ is the error term.

Beware: this is log t.
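As a rough illustration of the accelerated life model, the sketch below fits the log of the durations by ordinary least squares with a single explanatory variable. The data are simulated and hypothetical, the function name `fit_accelerated_life` is invented for this example, and the fit assumes every duration is fully observed (no censoring).

```python
import math
import random


def fit_accelerated_life(durations, x):
    """Least-squares fit of log(t_i) = b0 + b1 * x_i + e_i.

    Assumes every duration is fully observed (no censored cases).
    """
    log_t = [math.log(t) for t in durations]
    n = len(log_t)
    mean_x = sum(x) / n
    mean_y = sum(log_t) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, log_t))
    b1 = sxy / sxx                 # slope on the log-duration scale
    b0 = mean_y - b1 * mean_x      # constant
    return b0, b1


# Hypothetical simulated data: true constant 2.0, true slope 0.5.
rng = random.Random(0)
x = [rng.randint(0, 1) for _ in range(500)]
durations = [math.exp(2.0 + 0.5 * xi + rng.gauss(0.0, 0.3)) for xi in x]

b0, b1 = fit_accelerated_life(durations, x)
print(f"b0 = {b0:.2f}, b1 = {b1:.2f}, exp(b1) = {math.exp(b1):.2f}")
```

Because the left-hand side is log t, exp(b1) is read as a multiplier on the duration itself: a one-unit change in x stretches or shrinks the typical duration by that factor.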


At this point something should dawn on you – like fish scales falling from your eyes – like pennies from Heaven.

Think about the l.h.s.

• $Y_i$ - Standard linear model

• $\log_e(\text{odds})\ Y_i$ - Standard logistic model

• $\log_e t_i$ - Accelerated life model

$\beta_0 + \beta_1 x_{1i} + e_i$ is the r.h.s.

We can think of these as a single ‘class’ of models and (with a little care) can interpret them in a similar fashion (as Ian Diamond of the ESRC would say “this is phenomenally groovy”).

[Figure: event-history timelines between Start of Study and End of Study, showing states 0 and 1; some cases are labelled CENSORED OBSERVATIONS.]

[Figure: timelines for subjects A and B between Start of Study and End of Study, labelled CENSORED OBSERVATIONS.]

These durations are a continuous Y so why can’t we use standard regression techniques?

What should be the value of Y for person A and person B at the end of our study (when we fit the model)?
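One way to see why there is no good answer for persons A and B is a small simulation. The sketch below assumes exponentially distributed durations with a mean of 20 and a study that stops observing at 24 (all hypothetical values), and compares two naive summaries with one that keeps the censored cases as partial information.

```python
import random


def simulate_study(n, mean_duration, study_length, seed=0):
    """Simulate durations and censor any that outlast the study."""
    rng = random.Random(seed)
    observed, event = [], []
    for _ in range(n):
        t = rng.expovariate(1.0 / mean_duration)  # true (unseen) duration
        if t <= study_length:
            observed.append(t)             # event seen during the study
            event.append(1)
        else:
            observed.append(study_length)  # still waiting at the end
            event.append(0)
    return observed, event


observed, event = simulate_study(n=5000, mean_duration=20.0, study_length=24.0)

# Naive option 1: pretend censored cases ended at the end of the study.
naive_all = sum(observed) / len(observed)

# Naive option 2: drop the censored cases altogether.
completed = [t for t, e in zip(observed, event) if e == 1]
naive_completed = sum(completed) / len(completed)

# Censoring-aware estimate for exponential durations:
# total time at risk divided by the number of events.
censor_aware = sum(observed) / sum(event)

print(f"treat censored as ended: {naive_all:.1f}")
print(f"drop censored cases:     {naive_completed:.1f}")
print(f"censoring-aware:         {censor_aware:.1f}")
```

Both naive choices understate the true mean duration, which is the practical reason a method that handles censoring explicitly is needed.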

Cox Regression (proportional hazard model)

is a method for modelling time-to-event data in the presence of censored cases.

• Explanatory variables in your model (continuous and categorical).

• Estimated coefficients for each of the covariates.

• Handles the censored cases correctly.


Cox, D.R. (1972) ‘Regression models and life tables’ JRSS,B, 34 pp.187-220.

Childcare Study –

Studying a cohort of women who returned to work after having their first child.

• 24 month study

• The focus of the study was childcare spell #2

• 341 Mothers (and babies)

Variables

• ID

• Start of childcare spell #2 (month)

• End of childcare spell #2 (month)

• Gender of baby (male; female)

• Type of care spell #2 (a relative; childminder; nursery)

• Family income (crude measure)

Survival Function (or survival curve)

Describes the decline in the size of the risk set over time.

Survival Function

$S(t) = 1 - F(t) = \mathrm{Prob}(T > t)$

also

$S(t_1) \geq S(t_2)$ for all $t_2 > t_1$

Survival Function

$S(t) = 1 - F(t) = \mathrm{Prob}(T > t)$

where $S(t)$ is the survival probability, $1 - F(t)$ is the complement of the cumulative probability $F(t)$, $T$ is the time of the event and $t$ is time.

Survival Function

$S(t_1) \geq S(t_2)$ for all $t_2 > t_1$

All this means is… once you’ve left the risk set you can’t return!!!
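A common way to estimate S(t) when some cases are censored is the Kaplan-Meier (product-limit) estimator. The slides do not say which estimator produced their curves, so the sketch below is an assumption, and the ten durations are hypothetical. At each event time the running survival probability is multiplied by the share of the risk set that did not experience the event.

```python
def kaplan_meier(durations, events):
    """Return [(time, S(time))] at each distinct event time.

    events[i] is 1 if the event occurred at durations[i], 0 if censored.
    """
    at_risk = len(durations)
    survival = 1.0
    curve = []
    for t in sorted(set(durations)):
        deaths = sum(1 for d, e in zip(durations, events) if d == t and e == 1)
        leaving = sum(1 for d in durations if d == t)  # events plus censored
        if deaths:
            survival *= 1.0 - deaths / at_risk
            curve.append((t, survival))
        at_risk -= leaving  # once out of the risk set, never back in
    return curve


def median_survival(curve):
    """Smallest event time at which estimated survival is 0.5 or lower."""
    for t, s in curve:
        if s <= 0.5:
            return t
    return None  # the curve never reaches 0.5


# Hypothetical durations in months; 0 marks a censored case.
durations = [3, 5, 5, 8, 12, 12, 15, 24, 24, 24]
events = [1, 1, 0, 1, 1, 1, 0, 0, 0, 0]

curve = kaplan_meier(durations, events)
for t, s in curve:
    print(f"t = {t:2d}  S(t) = {s:.3f}")
print("median survival time:", median_survival(curve))
```

The estimated curve can only step downwards, which is the property S(t1) ≥ S(t2) stated above; censored cases shrink the risk set without moving the curve.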

[Figure: Survival Functions. Cumulative survival plotted against TIME (0 to 30), by family income: Up to £30K; £30K +; censored cases marked.]

[Figure: Survival Functions. Cumulative survival plotted against TIME (0 to 30), by family income: Up to £30K; £30K +; censored cases marked. Annotated with Median Survival Times.]

[Figure: One Minus Survival Functions. One minus cumulative survival plotted against TIME (0 to 30), by family income: Up to £30K; £30K +; censored cases marked.]

[Figure: Log Survival Function. Log survival plotted against TIME (0 to 30), by family income: Up to £30K; £30K +; censored cases marked.]

Too hard to interpret except for the Rain Man

HAZARD

In advanced analyses researchers sometimes examine the shape of the hazard. Unlike the survival function, which can never increase over time, the shape of the hazard is not constrained. It can therefore potentially tell us something about the social process that is taking place.

For the very keen…

Hazard – the rate at which events occur

Or

the risk of an event occurring at a particular time, given that it has not happened before t

For the even more keen…

Hazard –

The conditional probability (strictly, in continuous time, the instantaneous rate) of an event occurring at time t given that it has not happened before. If we call the hazard function h(t) and the pdf for the duration f(t), then

$h(t) = f(t)/S(t)$
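To make h(t) = f(t)/S(t) concrete, the sketch below evaluates it for a Weibull distribution with scale 1. The Weibull is chosen here purely for illustration and is not taken from the slides; its shape parameter lets the hazard fall, stay flat or rise over time, while S(t) always declines.

```python
import math


def weibull_pdf(t, shape):
    """f(t) for a Weibull duration with scale 1."""
    return shape * t ** (shape - 1) * math.exp(-(t ** shape))


def weibull_survival(t, shape):
    """S(t) = Prob(T > t) for a Weibull duration with scale 1."""
    return math.exp(-(t ** shape))


def hazard(t, shape):
    """h(t) = f(t) / S(t)."""
    return weibull_pdf(t, shape) / weibull_survival(t, shape)


for shape in (0.5, 1.0, 2.0):
    values = [round(hazard(t, shape), 3) for t in (0.5, 1.0, 2.0)]
    print(f"shape = {shape}: hazard at t = 0.5, 1, 2 -> {values}")
```

A shape below 1 gives a falling hazard, a shape of exactly 1 gives a constant hazard, and a shape above 1 gives a rising hazard.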

[Figure: Hazard Function. Cumulative hazard plotted against TIME (0 to 30), by family income: Up to £30K; £30K +; censored cases marked.]

A Statistical Model

[Diagram: explanatory variables X1, X2 and X3 linked to the Y variable = duration with censored observations.]

A Statistical Model

[Diagram: Family income, Gender of baby, Mother’s age (a continuous covariate) and Type of childcare linked to the Y variable = duration with censored observations.]

For the keen..

Cox Proportional Hazard Model

$h(t) = h_0(t)\exp(bx)$

Cox Proportional Hazard Model

$h(t) = h_0(t)\exp(bx)$

where $h(t)$ is the hazard, $h_0(t)$ is the baseline hazard (unknown), $\exp$ is the exponential, $b$ is the estimate and $x$ is the X var.

For the very keen..

Cox Proportional Hazard Model can be transformed into an additive model

$\log h(t) = a(t) + bx$

where $a(t) = \log h_0(t)$.

Therefore…

For the very keen..

Cox Proportional Hazard Model

$\log h(t) = \beta_0(t) + \beta_1 x_1$

This should look distressingly familiar!
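The coefficient can be estimated without ever specifying the baseline hazard, because only the ordering of events within each risk set matters. The sketch below maximises the Cox partial likelihood for a single covariate by Newton-Raphson. The function name `cox_fit_single`, the simulated data and the Breslow treatment of ties are assumptions made for this illustration; a real analysis would use a statistics package.

```python
import math
import random


def cox_fit_single(durations, events, x, iterations=25):
    """Estimate b in h(t) = h0(t) * exp(b * x) from the partial likelihood.

    Returns (b, standard error). Ties are handled in the Breslow manner.
    """
    b = 0.0
    for _ in range(iterations):
        gradient = 0.0
        information = 0.0
        for t_i, e_i, x_i in zip(durations, events, x):
            if e_i != 1:
                continue  # censored cases contribute only via risk sets
            # Risk set: everyone still under observation at time t_i.
            risk = [x_j for t_j, x_j in zip(durations, x) if t_j >= t_i]
            weights = [math.exp(b * x_j) for x_j in risk]
            total = sum(weights)
            mean_x = sum(w * x_j for w, x_j in zip(weights, risk)) / total
            mean_x2 = sum(w * x_j * x_j for w, x_j in zip(weights, risk)) / total
            gradient += x_i - mean_x                 # score contribution
            information += mean_x2 - mean_x ** 2     # weighted variance of x
        step = gradient / information
        b += step
        if abs(step) < 1e-10:
            break
    return b, 1.0 / math.sqrt(information)


# Hypothetical simulated data: baseline hazard 0.05, true b = 0.7,
# observation stops at 24 (later events are censored).
rng = random.Random(1)
x = [rng.randint(0, 1) for _ in range(300)]
durations, events = [], []
for x_i in x:
    t = rng.expovariate(0.05 * math.exp(0.7 * x_i))
    events.append(1 if t <= 24.0 else 0)
    durations.append(min(t, 24.0))

b_hat, se = cox_fit_single(durations, events, x)
print(f"b = {b_hat:.3f}, SE = {se:.3f}, exp(b) = {math.exp(b_hat):.3f}")
```

The baseline hazard never appears in the code: it cancels out of each risk-set ratio, which is why it can be left unknown.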


Define the code for the event

(i.e. 1 if occurred – 0 if censored)
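As a minimal sketch of this coding step, the example below turns a start month and an end month into a duration and an event code. The records are hypothetical, and recording a spell that was still in progress at the end of the 24-month study as `None` is an assumption of this example, not something the slides specify.

```python
STUDY_END_MONTH = 24  # the childcare study ran for 24 months


def code_spell(start_month, end_month):
    """Return (duration, event) for one childcare spell.

    end_month is None when the spell had not ended by the end of the study.
    """
    if end_month is None:
        return STUDY_END_MONTH - start_month, 0  # censored
    return end_month - start_month, 1            # event occurred


# Hypothetical records: (ID, start month, end month or None).
records = [(1, 2, 10), (2, 5, None), (3, 1, 24)]
coded = {pid: code_spell(start, end) for pid, start, end in records}

for pid, (duration, event) in coded.items():
    print(f"ID {pid}: duration = {duration} months, event = {event}")
```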


Enter explanatory variables

(dummies and continuous)
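A categorical variable with three levels, such as type of care, enters the model as two 0/1 dummies. The sketch below uses the names CHILDM and NURSERY from the output table; treating care by a relative as the reference category is inferred from the absence of a dummy for it, and the function name is hypothetical. A continuous variable such as mother's age is entered as it is.

```python
CARE_TYPES = ("relative", "childminder", "nursery")


def dummy_code(care_type):
    """Two 0/1 dummies for type of care; 'relative' is the reference."""
    if care_type not in CARE_TYPES:
        raise ValueError(f"unknown type of care: {care_type!r}")
    return {
        "CHILDM": int(care_type == "childminder"),
        "NURSERY": int(care_type == "nursery"),
    }


for care_type in CARE_TYPES:
    print(care_type, dummy_code(care_type))
```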

Variables in the Equation

            B       SE      Wald      df   Sig.   Exp(B)
INC         1.282   .140    83.594    1    .000   3.605
GENDER      -.046   .118    .153      1    .696   .955
MUMAGE      .012    .010    1.258     1    .262   1.012
CHILDM      1.165   .151    59.157    1    .000   3.206
NURSERY     1.887   .157    144.903   1    .000   6.598

Key: each row is an X var; B = estimate; SE = standard error; Wald = chi-square related; Exp(B) = un-logged estimate.
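The Wald and Exp(B) columns can be recomputed from B and SE, which is a useful check on how to read the output. The sketch below uses the B and SE values as printed in the table. The 95% interval for Exp(B) is an addition that is not in the original output and relies on the usual normal approximation.

```python
import math

# (B, SE) as printed in the table.
coefficients = {
    "INC": (1.282, 0.140),
    "GENDER": (-0.046, 0.118),
    "MUMAGE": (0.012, 0.010),
    "CHILDM": (1.165, 0.151),
    "NURSERY": (1.887, 0.157),
}


def summarise(b, se):
    """Return (Wald statistic, Exp(B), 95% interval for Exp(B))."""
    wald = (b / se) ** 2
    exp_b = math.exp(b)  # multiplicative effect on the hazard
    interval = (math.exp(b - 1.96 * se), math.exp(b + 1.96 * se))
    return wald, exp_b, interval


results = {name: summarise(b, se) for name, (b, se) in coefficients.items()}
for name, (wald, exp_b, (low, high)) in results.items():
    print(f"{name:8s} Wald = {wald:8.2f}  Exp(B) = {exp_b:.3f}  "
          f"95% interval = ({low:.2f}, {high:.2f})")
```

The recomputed Wald values differ slightly from the printed ones because B and SE are rounded to three decimal places in the table. An Exp(B) above 1 means a higher hazard for that group; an interval that includes 1, as for GENDER and MUMAGE, matches a non-significant Sig. value.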

What does this mean?

Our Y is the duration of childcare spell #2.

Note we are modelling the hazard! A positive coefficient therefore means a higher hazard of the spell ending, i.e. a shorter expected duration.

Significant Variables

• Family income p<.001

• Gender of baby p=.696 (not significant at the 5% level)

• Mother’s age p=.262 (not significant at the 5% level)

• Childminder p<.001

• Nursery p<.001

Effects on the hazard

• Family income p<.001 (£30K +; Up to £30K)

• Childminder p<.001

• Nursery p<.001