The Chicago Guide to Writing about Multivariate Analysis, 2nd Edition. Data structure for a discrete-time event history analysis Jane E. Miller, PhD

Post on 03-Jan-2016


The Chicago Guide to Writing about Multivariate Analysis, 2nd Edition. Event history analysis: discrete time data

Overview

• Structure of most survey data: One record per respondent
• Discrete-time event history analysis requires separate records for each person-time unit at risk of the event
• Review: How to create one record per spell
• How to create one record per person-time unit
  – Components of the dependent variable
  – Fixed characteristics
  – Time-varying characteristics


Data preparation for an event history

• Survey data often contain one record per respondent

• Continuous-time event history data contain one record per spell

• Discrete-time event history analysis requires one record per person-time unit within each spell
  – E.g., one record for each person-month at risk of divorce, within each spell at risk of divorce


Source data from survey: 1 record per respondent

ID  Date of   Date of 1st  Date of 1st  Date of 2nd  Date of 2nd  Date of   Date 1st  Date last  Gender  Date of 1st    Date of 2nd
    birth     marriage     divorce      marriage     divorce      death     observed  observed           child's birth  child's birth
1   2/1/52    .            .            .            .            .         7/15/85   10/1/10    F       .              .
2   7/15/69   6/22/10      .            .            .            .         9/21/85   11/5/10    M       .              .
3   3/1/65    8/1/90       1/1/97       10/1/04      .            .         10/8/85   5/1/05     M       12/5/95        .
4   3/1/42    6/1/63       .            .            .            10/1/02   12/2/85   10/2/02    F       9/21/64        5/11/67


Example timelines for study of divorce

[Figure: timelines for four example cases, each running up to the end of the observation period]

Legend: M = Married; D = Divorced; L = Lost to follow-up; O = Censored by end of study; X = Died

Case 1: Never married -> no spells

Case 2: Married once, censored by end of survey

Case 3: Married twice, lost to follow-up before end of survey

Case 4: Married once, died before end of survey

Not married -> not at risk of divorce -> not part of a spell


Continuous-time event history data

• One record for each period at risk (spell)
  – Duration of overall spell
  – Event indicator at end of spell

ID  Spell #       Date spell  Duration of   Status at     Divorce event  Age first       Age at start    Age last        Gender  # kids at
    (marriage #)  started     spell (mos.)  end of spell  indicator      observed (yrs)  of spell (yrs)  observed (yrs)          start of spell
2   1             6/22/10     3.5           0             0              16              40              41              male    0
3   1             8/1/90      76.5          1             1              20              25              45              male    0
3   2             10/1/04     6.5           2             0              20              39              45              male    1
4   1             6/1/63      474.5         3             0              43              21              60              female  0
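
As an illustration of how one of the per-spell fields can be derived from the survey dates, the sketch below computes age at start of spell in completed years. The function name is hypothetical, and the four-digit years are an assumption, since the slides show two-digit years only.

```python
from datetime import date

def completed_years(birth: date, when: date) -> int:
    """Age in completed years on the date `when`."""
    years = when.year - birth.year
    # Subtract one year if the birthday has not yet occurred in `when`'s year.
    if (when.month, when.day) < (birth.month, birth.day):
        years -= 1
    return years

# Dates of birth and spell start dates from the tables above.
# The century for each two-digit year is assumed.
birth = {2: date(1969, 7, 15), 3: date(1965, 3, 1), 4: date(1942, 3, 1)}
spell_starts = [
    (2, 1, date(2010, 6, 22)),
    (3, 1, date(1990, 8, 1)),
    (3, 2, date(2004, 10, 1)),
    (4, 1, date(1963, 6, 1)),
]

# Map (ID, spell #) to age at start of spell.
ages = {(pid, spell): completed_years(birth[pid], start)
        for pid, spell, start in spell_starts}
print(ages)
```

Under these assumptions the computed ages agree with the "Age at start of spell (yrs)" column: 40, 25, 39, and 21.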


Event history timeline: Discrete time specification

Case 2, Continuous time version: One four-month spell (married 6/22/2010, last surveyed 11/5/2010)

Case 2, Discrete-time version: Four person-month units. Each person-month unit becomes one record -> unit of analysis. All records for each spell include respondent ID and other characteristics.

[Figure: the spell divided into the 1st through 4th person-months, from marriage to end of survey, each person-month ending in censoring]

O = Censored


Discrete-time data set: ID codes on person-time records

• Each person-month record carries the respondent ID
• Each record within a given spell also includes the spell # for that respondent

One record per person-month

ID  Spell #       Record #
    (marriage #)  w/in spell
2   1             1
2   1             2
2   1             3
2   1             4
3   1             1
3   1             2
3   1             3
3   1             …
3   1             77
3   2             1
3   2             2
3   2             …
3   2             7

One record per spell

ID  Spell #       Duration of   Status at     Divorce
    (marriage #)  spell (mos.)  end of spell  indicator
2   1             4             0             0
3   1             77            1             1
3   2             7             2             0


Record number within spell

• Each month in a spell will generate one person-month record, e.g.,
  – respondent #2 is observed for 4 months -> 4 person-month records
  – respondent #3 contributes a total of 84 records
    • 77 in his first spell
    • 7 in his second spell
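
The expansion from spells to person-month records can be sketched as follows. This is a minimal illustration using the spell durations from the slides; the function and variable names are hypothetical.

```python
# One record per spell: (ID, spell #, duration in months), as shown on the slides.
spells = [(2, 1, 4), (3, 1, 77), (3, 2, 7)]

def expand_to_person_months(spells):
    """Return one (ID, spell #, record # within spell) tuple per person-month."""
    records = []
    for pid, spell_no, months in spells:
        # Record numbers within a spell run from 1 to the spell's duration.
        for record_no in range(1, months + 1):
            records.append((pid, spell_no, record_no))
    return records

person_months = expand_to_person_months(spells)

# Count person-month records contributed by each respondent.
per_person = {}
for pid, _, _ in person_months:
    per_person[pid] = per_person.get(pid, 0) + 1
print(per_person)  # {2: 4, 3: 84}
```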


Month counter within spell

One record per person-month

ID  Spell #       Record #    Month #
    (marriage #)  w/in spell  within spell
2   1             1           0
2   1             2           1
2   1             3           2
2   1             4           3
3   1             1           0
3   1             2           1
3   1             3           2
3   1             …           …
3   1             77          76
3   2             1           0
3   2             2           1
3   2             …           …
3   2             7           6


The “month # within spell” counter indicates the start time of the person-month at risk for that record. E.g., the first record for a given spell starts at baseline (time point 0).
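
A minimal sketch of the counter, assuming records within a spell are numbered from 1 as in the table, so that month # is record # minus 1. The function name is illustrative only.

```python
def month_counters(n_months):
    """Pair each record # within a spell with the start time of its person-month."""
    # The first record starts at baseline (time point 0), the second at 1, and so on.
    return [(record_no, record_no - 1) for record_no in range(1, n_months + 1)]

case2 = month_counters(4)
print(case2)  # [(1, 0), (2, 1), (3, 2), (4, 3)]

# Last record of respondent #3's first spell (77 months long).
print(month_counters(77)[-1])  # (77, 76)
```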


Duration measure for each record within spell

The duration measure equals 1 time unit for every person-time record within a given spell except the last, which equals 0.5 for the last month in the spell

One record per person-month

ID  Spell #       Record #    Month #       Person-months
    (marriage #)  w/in spell  within spell  w/in record
2   1             1           0             1
2   1             2           1             1
2   1             3           2             1
2   1             4           3             0.5
3   1             1           0             1
3   1             2           1             1
3   1             3           2             1
3   1             …           …             1
3   1             77          76            0.5
3   2             1           0             1
3   2             2           1             1
3   2             …           …             1
3   2             7           6             0.5
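
The duration rule can be sketched as below; the function name is hypothetical. Summing the per-record durations within each spell gives 3.5, 76.5, and 6.5 months, which matches the spell durations shown in the continuous-time table.

```python
def record_durations(n_months):
    """Person-months within each record: 1 for every record except the last, which is 0.5."""
    return [1.0] * (n_months - 1) + [0.5]

# Spell lengths in person-month records, keyed by (ID, spell #), from the slides.
spell_months = {(2, 1): 4, (3, 1): 77, (3, 2): 7}

# Total person-months at risk within each spell.
totals = {key: sum(record_durations(n)) for key, n in spell_months.items()}
print(totals)  # {(2, 1): 3.5, (3, 1): 76.5, (3, 2): 6.5}
```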


Status indicator for each record within spell

The indicator for status at end of record equals 0 for all person-time records within a given spell except the last one, because by definition those records end in censoring (the spell is not yet complete)

One record per person-month

ID  Spell #       Record #    Month #       Person-months  Status at
    (marriage #)  w/in spell  within spell  w/in record    end of record
2   1             1           0             1              0
2   1             2           1             1              0
2   1             3           2             1              0
2   1             4           3             0.5            0
3   1             1           0             1              0
3   1             2           1             1              0
3   1             3           2             1              0
3   1             …           …             1              0
3   1             77          76            0.5            1
3   2             1           0             1              0
3   2             2           1             1              0
3   2             …           …             1              0
3   2             7           6             0.5            2


Status indicator for last record within spell

The indicator for status at end of record for the last person-time record within each spell will take on the value of the status indicator for the overall spell
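
A small sketch of this rule follows. The function name is hypothetical, and the meaning of the status codes is inferred from the example cases rather than defined on the slides: 0 appears to mean censored and 2 lost to follow-up.

```python
def record_statuses(n_months, spell_status):
    """Status at end of each record within a spell.

    Every record except the last ends in censoring (code 0, as inferred from
    the tables); the last record takes the status of the overall spell.
    """
    return [0] * (n_months - 1) + [spell_status]

# Respondent #3, second spell: 7 person-months, spell status 2.
statuses = record_statuses(7, 2)
print(statuses)  # [0, 0, 0, 0, 0, 0, 2]
```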


Event indicator for each record within spell

One record per person-month

ID  Spell #       Record #    Month #       Divorce indicator
    (marriage #)  w/in spell  within spell  for record
2   1             1           0             0
2   1             2           1             0
2   1             3           2             0
2   1             4           3             0
3   1             1           0             0
3   1             2           1             0
3   1             3           2             0
3   1             …           …             0
3   1             77          76            1
3   2             1           0             0
3   2             2           1             0
3   2             …           …             0
3   2             7           6             0
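
One way to derive the per-record divorce indicator is sketched below. It assumes that status code 1 denotes divorce, which is inferred from the example tables and not stated on the slides; the names are illustrative.

```python
DIVORCED = 1  # assumed status code for divorce, inferred from the tables

def divorce_indicators(n_months, spell_status):
    """Divorce indicator for each person-month record within a spell."""
    # Only the last record can carry the event, and only if the spell ended in divorce.
    last = 1 if spell_status == DIVORCED else 0
    return [0] * (n_months - 1) + [last]

# (ID, spell #) -> (number of person-months, status at end of spell), from the slides.
spells = {(2, 1): (4, 0), (3, 1): (77, 1), (3, 2): (7, 2)}

# Number of divorce events recorded within each spell.
events = {key: sum(divorce_indicators(n, status)) for key, (n, status) in spells.items()}
print(events)  # {(2, 1): 0, (3, 1): 1, (3, 2): 0}
```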


Fixed covariates for each person-time record

ID  Spell #       Record #    Month #       Divorce indicator  Age at start    Gender  # children at
    (marriage #)  w/in spell  within spell  for record         of spell (yrs)          start of spell
2   1             1           0             0                  40              male    0
2   1             2           1             0                  40              male    0
2   1             3           2             0                  40              male    0
2   1             4           3             0                  40              male    0
3   1             1           0             0                  25              male    0
3   1             2           1             0                  25              male    0
3   1             3           2             0                  25              male    0
3   1             …           …             0                  25              male    0
3   1             77          76            1                  25              male    0
3   2             1           0             0                  39              male    1
3   2             2           1             0                  39              male    1
3   2             …           …             0                  39              male    1
3   2             7           6             0                  39              male    1

Age at start of spell, number of children at start of spell, and gender do not change during the course of a spell, so they have the same value for each person-time record within a given spell
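
Copying fixed covariates onto every person-time record might look like the sketch below. The dictionary keys and the function name are hypothetical; the values come from the table.

```python
# One record per spell, with the fixed covariates from the slides.
spells = [
    {"id": 2, "spell": 1, "months": 4, "age_at_start": 40, "gender": "male", "kids_at_start": 0},
    {"id": 3, "spell": 1, "months": 77, "age_at_start": 25, "gender": "male", "kids_at_start": 0},
    {"id": 3, "spell": 2, "months": 7, "age_at_start": 39, "gender": "male", "kids_at_start": 1},
]

def person_month_records(spell):
    """Expand one spell into person-month records that all share its fixed covariates."""
    fixed = {key: value for key, value in spell.items() if key != "months"}
    return [{**fixed, "record": n, "month": n - 1} for n in range(1, spell["months"] + 1)]

records = [rec for spell in spells for rec in person_month_records(spell)]

# All records from respondent #3's second marriage carry the same fixed values.
second_marriage = [r for r in records if r["id"] == 3 and r["spell"] == 2]
print(second_marriage[0])
```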


Example timelines for number of children as time-varying covariate in study of divorce

[Figure: timelines for Case 3 and Case 4 with child births marked and segments labeled "No kids", "One kid", and "Two kids"]

Legend: M = Married; D = Divorced; C = Child born; L = Lost to follow-up; O = Censored by end of study; X = Died

ID  Date of   Date 1st  Date of 1st  Date of 1st    Date of 2nd    Date of 1st  Date of 2nd  Date of 2nd  Date of   Date last
    birth     observed  marriage     child's birth  child's birth  divorce      marriage     divorce      death     observed
3   3/1/65    10/8/85   8/1/90       12/5/95        .              1/1/97       10/1/04      .            .         5/1/05
4   3/1/42    12/2/85   6/1/63       9/21/64        5/11/67        .            .            .            10/1/02   10/2/02

Columns reordered into chronological order


Discrete time with time-varying covariates

• Case 3 has his first child 64 months into his first marriage, and no additional children while observed. # kids at start of record is
  – 0 for his first 63 records of spell 1
  – 1 for records 64 through 77 of spell 1
  – 1 for all records in spell 2

ID  Spell #  Month #     Divorce indicator  # kids at       # kids at
             w/in spell  for record         start of spell  start of record
3   1        0           0                  0               0
3   1        1           0                  0               0
3   1        …           0                  0               0
3   1        64          0                  0               1
3   1        77          1                  0               1
3   2        0           0                  1               1
3   2        …           0                  1               1
3   2        6           0                  1               1


Discrete time with time-varying covariates

• Case 4 has her first child 15 months into her marriage and a second child in month 47 after marriage. For her, # kids at start of record is
  – 0 for her first 15 records (months 0 through 14)
  – 1 for months 15 through 46
  – 2 for months 47 or higher,
  all in spell 1

ID  Spell #  Month #     Divorce indicator  # kids at       # kids at
             w/in spell  for record         start of spell  start of record
4   1        0           0                  0               0
4   1        …           0                  0               0
4   1        15          0                  0               1
4   1        …           0                  0               1
4   1        47          0                  0               2
4   1        …           0                  0               2
4   1        474         0                  0               2
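
A time-varying covariate of this kind can be sketched as a lookup on the month counter. The function name and the `change_months` parameter are hypothetical: `change_months` lists the month # of the first record that shows each additional child, taken here from the Case 4 table (15 and 47).

```python
from bisect import bisect_right

def kids_at_start_of_record(month, kids_at_start_of_spell, change_months):
    """Number of children at the start of the person-month with counter `month`.

    `change_months` must be sorted; each entry is the first month # whose
    record reflects one additional child.
    """
    # bisect_right counts how many change points are at or before this month.
    return kids_at_start_of_spell + bisect_right(change_months, month)

# Case 4: no children at start of spell; the count changes in months 15 and 47.
case4_changes = [15, 47]
case4 = {m: kids_at_start_of_record(m, 0, case4_changes) for m in (0, 14, 15, 46, 47, 474)}
print(case4)  # {0: 0, 14: 0, 15: 1, 46: 1, 47: 2, 474: 2}
```

Keeping the change points in a sorted list means the same function works for any number of births within a spell, including none.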


Presenting information on event history construction: Background work

• Most of the gory details of creating an event history are part of behind-the-scenes work
  – Important to do consistency checks to make sure event histories were created correctly given
    • Original data source of information for timeline construction
    • Type of event under study
    • Fixed covariates
    • Time-varying covariates
  – E.g., correct
    • Number of spells per respondent
    • Number of person-time records for each spell
    • Duration and event indicators for each person-time record
    • Values of fixed- and time-varying covariates for each person-time record
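
A few of these consistency checks could be automated along the lines of the sketch below. It is only an illustration: the record layout, the field names, and the particular checks chosen are assumptions, not a procedure given on the slides.

```python
def check_spell(spell, records):
    """Return a list of problems found in one spell's person-month records."""
    problems = []
    if len(records) != spell["months"]:
        problems.append("wrong number of person-month records")
    if [r["record"] for r in records] != list(range(1, len(records) + 1)):
        problems.append("record numbers are not consecutive from 1")
    if any(r["divorce"] for r in records[:-1]):
        problems.append("event indicator set before the last record")
    if records and records[-1]["divorce"] != spell["divorce"]:
        problems.append("last record does not carry the spell's event indicator")
    if len({r["gender"] for r in records}) > 1:
        problems.append("fixed covariate (gender) varies within the spell")
    return problems

# Respondent #3, first spell: 77 person-months, ending in divorce.
spell = {"id": 3, "spell": 1, "months": 77, "divorce": 1}
good = [{"record": n, "divorce": int(n == 77), "gender": "male"} for n in range(1, 78)]
bad = good[:-1]  # drop the last person-month, which carries the divorce

good_problems = check_spell(spell, good)
bad_problems = check_spell(spell, bad)
print(good_problems)  # []
print(bad_problems)
```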


Presenting information on event history construction

• In the data and methods section, describe:
  – Original data source of information for timeline construction
    • Dates, status, duration of events
  – Type of event under study
  – Unit of person-time (e.g., person-years, person-months)
  – What constitutes censoring
  – Fixed covariates
  – Time-varying covariates
    • Source(s) of information for determining timing of changes in those variables
• See checklist in chapter 17 of Writing about Multivariate Analysis, 2nd Edition for more detail on what to report


Summary

• A discrete-time event history analysis requires a separate record for each person-time unit at risk of the event
• For each respondent, create correct number of spells
• For each spell, calculate
  – Correct number of person-time units
  – Components of the dependent variable
    • Duration measure
    • Event indicator
  – Fixed characteristics
  – Time-varying characteristics
• In data and methods section, describe data sources and variables for the event history


Suggested resources

• Allison, P. D. 2010. Survival Analysis Using the SAS System: A Practical Guide, 2nd Edition. Cary, NC: SAS Institute.

• Miller, J. E. 2013. The Chicago Guide to Writing about Multivariate Analysis, 2nd Edition. University of Chicago Press, chapter 17.


Suggested online resources

• Podcast on data structure for a continuous-time event history analysis


Suggested exercises

• Study guide to The Chicago Guide to Writing about Multivariate Analysis, 2nd Edition.
  – Question #3a in the problem set for chapter 17
  – Suggested course extensions for chapter 17
    • “Reviewing” exercises #2a through 2h
    • “Applying statistics and writing” exercises #1 and 2a


Contact information

Jane E. Miller, [email protected]

Online materials available at http://press.uchicago.edu/books/miller/multivariate/index.html
