jean-marie le goff pavie-unil - home -...
Post on 16-Sep-2018
Duration models
Jean-Marie Le Goff, Pavie-Unil
Other terms for duration models
• Duration models: econometrics
• Survival analysis: medical sciences, demography
• Event history analysis, transition analysis: social sciences
Outline
• Aim of event history analysis
• How to prepare data?
– Rule of precautions
– Censoring
• To investigate data
– The life table command in SPSS
• Elements of hazard regression models
– The choice of a model (Cox model, discrete-time logit model)
– The person-period file for discrete-time models
– Estimations
– Interpretation of results
• Example of Internet adoption on panel data
Aims of event history analysis
[Figure: state (Y), 0 or 1, plotted against time; the duration at risk of experiencing the event ends at time t_l, when the state switches from 0 to 1]
Event = a transition or switch from a state 0 to a state 1; « a change in a variable » (Tuma & Hannan 1984)
- Estimation of the distribution of the risk (hazard) of experiencing the event over time
- Influence of individual and contextual characteristics on the hazard
Notations
• Discrete time: $P(t_l) = P(T = t_l \mid T \geq t_l)$
• Continuous time: $h(t) = \lim_{\Delta t \to 0} \frac{P(t \leq T < t + \Delta t \mid T \geq t)}{\Delta t}$
The hazard is the conditional probability of experiencing the event: the probability of experiencing it at time t, given that it has not been experienced before t.
Before running an event history analysis
• The risk of occurrence of an event depends on the past and the present of individuals, not on their future
• A second marriage cannot explain a first divorce
• The rule is to follow individuals through time
• Do not condition on the future of individuals
To operationalize an analysis
• Definition of the analysis. Example: first union formation
• Definition of the population submitted to the risk. Example: individuals who have never cohabited with a partner
• Time t0 and clock. Example: age, from 16 years until entry into a first union
• Independent covariates (fixed covariates and time-dependent covariates). Example: religion, religious practice, etc.
• Definition of censors (having experienced the event or not). Example: people who did not get into a union
Censoring
[Figure: timeline from t0 to tmax showing left censors and right censors]
A pair of dependent variables
• Age from 16 years old to:
– If a first union: age at first union formation
– If not: age at the time of the survey
• Everyone is considered at risk until they leave the population at risk (because of union formation, or because of the time of the survey)
• Censors:
– 0: people who did not get into a union
– 1: people who got into a cohabiting union
– 2: people who got into a direct marriage
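The coding of the two dependent variables can be sketched as follows. This is only an illustration: the function name `code_first_union`, its arguments, and the use of calendar years to derive ages are assumptions, not part of the original slides.

```python
def code_first_union(birth_year, survey_year, union_year=None,
                     union_type=None, start_age=16):
    """Return (end_age, duration, censor) for one respondent.

    censor: 0 = no union by the survey, 1 = cohabiting union,
            2 = direct marriage (codes taken from the slide).
    All argument names are hypothetical.
    """
    codes = {"cohabitation": 1, "marriage": 2}
    if union_year is None:
        # No first union observed: the clock stops at the survey.
        end_age = survey_year - birth_year
        censor = 0
    else:
        # First union observed: the clock stops at union formation.
        end_age = union_year - birth_year
        censor = codes[union_type]
    duration = end_age - start_age  # years at risk since age 16
    return end_age, duration, censor


print(code_first_union(1970, 2000))
print(code_first_union(1970, 2000, 1993, "cohabitation"))
```

The first call describes someone still without a union at the survey (censor 0); the second, someone who entered a cohabiting union at age 23 (censor 1).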
How to investigate data: the life table method
$P(t_l) = \frac{\text{number of events that occurred between time } l \text{ and time } l+1}{\text{number of persons who had not experienced the event at time } l}$
- $E_l$ = number of events in $t_l$
- $C_l$ = number of censored spells during the interval $[t_l, t_{l+1}]$
- $R_l$ = number of persons who had not experienced the event at $t_l$ = number of persons with an observed or censored duration greater than or equal to $t_l$
$P(t_l) = \frac{E_l}{R_l - \frac{1}{2} C_l}$
Hazard of experiencing the event in discrete time: the conditional probability of experiencing the event.
Probability of survival (probability of not having experienced the event)
$S(t) = (1 - P(0))(1 - P(1)) \cdots (1 - P(t)) = \prod_{u=0}^{t} (1 - P(u))$
Hazard rate (hazard in the case of continuous time)
$h_t = \frac{\text{number of events during an interval of time}}{\text{number of person-years exposed to the risk of experiencing the event}}$
$h(t_l) = \frac{E_l}{R_l - \frac{1}{2}(E_l + C_l)}$
- $E_l$ = number of events in $t_l$
- $C_l$ = number of censored spells during the interval $[t_l, t_{l+1}]$
- $R_l$ = number of persons who had not experienced the event at $t_l$ = number of persons with an observed or censored duration greater than or equal to $t_l$
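A minimal sketch of the life table computation, assuming intervals of width one and the usual actuarial convention that censored cases (and, for the hazard rate, events) count for half an interval. The function name `life_table` and the toy counts are invented for illustration.

```python
def life_table(events, censored, n_at_start):
    """Actuarial life table for intervals of width 1.

    events[l]   : E_l, number of events in interval l
    censored[l] : C_l, number of censored spells in interval l
    n_at_start  : R_0, persons at risk at the start of observation
    """
    rows = []
    at_risk = n_at_start
    survival = 1.0
    for E, C in zip(events, censored):
        p = E / (at_risk - C / 2)        # conditional probability P(t_l)
        h = E / (at_risk - (E + C) / 2)  # hazard rate per person-year
        survival *= 1 - p                # cumulative product of (1 - P)
        rows.append({"R": at_risk, "E": E, "C": C,
                     "P": p, "S": survival, "h": h})
        at_risk -= E + C                 # R_{l+1} = R_l - E_l - C_l
    return rows


# Toy counts, not taken from the slides.
for row in life_table(events=[10, 20], censored=[0, 10], n_at_start=100):
    print(row)
```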
How to choose between a Cox model and a discrete-time logistic model
• Cox
– Continuous time
– A small unit of time
– Less than 5% of individuals experience the event during a time interval
• Logistic
– Discrete time
– A large unit of time
– More than 5% of individuals experience the event during a time interval
Continuous time vs discrete time
• Classic models (Cox models) are based on continuous time (time in days in medical research)
• Two kinds of discrete time:
– « True » discrete process of data generation (access to a higher degree for a population of students)
– Continuous process of data generation but long intervals between measures
Cox model
$h(t, x_t) = h_0(t) \exp(x_t \beta)$, where $\beta = (\beta_1, \beta_2, \beta_3, \ldots, \beta_n)^\top$
- $h_0(t)$: non-parametric component of the model: the risk for individuals whose characteristics are all $x_t = 0$ (the reference individual)
$h(t, x_t) = h_0(t) \exp\left(\sum_i \beta_i x_{it}\right)$
- $\beta_i$: coefficients to be estimated
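As a short worked step (not on the original slide), here is why the baseline $h_0(t)$ need not be specified to interpret a coefficient. Take two individuals, written $x_t$ and $x_t'$ (the prime is notation introduced here), who differ by one unit on a single covariate $k$ and are identical otherwise:

```latex
\frac{h(t, x_t')}{h(t, x_t)}
  = \frac{h_0(t)\,\exp\left(\sum_i \beta_i x_{it}'\right)}
         {h_0(t)\,\exp\left(\sum_i \beta_i x_{it}\right)}
  = \exp\left(\sum_i \beta_i \,(x_{it}' - x_{it})\right)
  = \exp(\beta_k)
```

The baseline cancels, so $\exp(\beta_k)$ can be read as a hazard ratio that does not depend on $t$.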
Discrete-time logistic model
$P(t, x_t) = \frac{1}{1 + \exp[-(\alpha_t + x_t \beta)]}$
$\log\left[\frac{P(t, x_t)}{1 - P(t, x_t)}\right] = \alpha_t + x_t \beta$
• $\alpha_t$: function of time
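The logistic hazard can be evaluated numerically as in the sketch below. The function names and the parameter values are made up for illustration. The survival function simply chains the period-specific hazards, as in the life table.

```python
import math


def discrete_hazard(alpha_t, x, beta):
    """P(t, x_t) = 1 / (1 + exp(-(alpha_t + x_t . beta)))."""
    linear = alpha_t + sum(b * v for b, v in zip(beta, x))
    return 1.0 / (1.0 + math.exp(-linear))


def survival(alphas, xs, beta):
    """Probability of not having experienced the event after all periods."""
    s = 1.0
    for alpha_t, x in zip(alphas, xs):
        s *= 1.0 - discrete_hazard(alpha_t, x, beta)
    return s


# Hypothetical values: three periods, one time-constant dummy covariate.
alphas = [-2.0, -1.5, -1.0]
beta = [0.5]
xs = [[1], [1], [1]]
for alpha_t, x in zip(alphas, xs):
    print(round(discrete_hazard(alpha_t, x, beta), 3))
print(round(survival(alphas, xs, beta), 3))
```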
Alternatives
• Discrete-time logit models should be used in the case of a « true » discrete process of data generation
• Alternative 1: discrete-time probit models
• Alternative 2: discrete-time complementary log-log models
– Theoretically more adequate if the process of data generation is continuous, as in the case of the panel
• The logit model remains the most widespread and developed in the literature because of its simplicity
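To make the three alternatives concrete, the sketch below maps the same linear predictor (written `eta` here, standing for α_t + x_t β) to a hazard under each link. The function names are illustrative. The formulas are the standard inverse logit, the standard normal CDF, and the inverse complementary log-log.

```python
import math


def logit_hazard(eta):
    """Inverse logit link."""
    return 1.0 / (1.0 + math.exp(-eta))


def probit_hazard(eta):
    """Inverse probit link: standard normal CDF."""
    return 0.5 * (1.0 + math.erf(eta / math.sqrt(2.0)))


def cloglog_hazard(eta):
    """Inverse complementary log-log link."""
    return 1.0 - math.exp(-math.exp(eta))


for eta in (-3.0, -1.0, 0.0):
    print(eta,
          round(logit_hazard(eta), 4),
          round(probit_hazard(eta), 4),
          round(cloglog_hazard(eta), 4))
```

When the hazard is small, the logit and complementary log-log values are close, which is one reason the simpler logit model is often used in practice.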
Preparation of a database (Allison, 1982)
• The log-likelihood of a discrete-time model can be rewritten as the log-likelihood of a model for a dichotomous dependent variable
• This means that discrete-time models can be estimated on a « person-wave » database (person-period database)
– An individual is represented by a number of lines equal to the number of waves in which he or she is present before experiencing the event or leaving the observation
– The dichotomous dependent variable is equal to 0 in all lines except the last one, where it is 0 or 1 (censored or event)
• This remains true when there are several levels in the data (Barber et al., 2000)
File person-year
ID      Age  Censor
55102   16   0
55102   17   0
55102   18   0
55102   19   1
88102   16   0
88102   17   0
88102   18   0
88102   19   0
88102   20   2
91102   16   0
91102   17   0
91102   18   0
91102   19   0
91102   20   0
91102   21   0
91102   22   2
112102  16   0
112102  17   0
112102  18   2
218101  16   0
218101  17   0
218101  18   0
218101  19   0
218101  20   0
218101  21   0
218101  22   0
218101  23   2
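A person-year file of this shape can be generated from one record per person, as in the following sketch. The function name `to_person_period` and the tuple layout are assumptions. The censor codes follow the slide on dependent variables (0 = no union, 1 = cohabiting union, 2 = direct marriage).

```python
def to_person_period(person_id, start_age, end_age, censor):
    """Expand one person into one row per year of age at risk.

    The status is 0 on every row except the last, which carries the
    person's censor code (0 if still without a union at the survey).
    """
    rows = []
    for age in range(start_age, end_age + 1):
        status = censor if age == end_age else 0
        rows.append((person_id, age, status))
    return rows


# First person of the file: at risk from 16, cohabiting union at 19.
for row in to_person_period(55102, 16, 19, 1):
    print(row)
```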
Notes on covariates x_t
• Fixed covariates (non-time-dependent)
– Status at birth
– Status reached before the beginning of the observation
• Time-dependent covariates
– Predefined covariates (clocks)
– Auxiliary covariates (context)
– Internal covariates (interdependencies, linked lives)
Example of internet adoption
• Event history analysis can also be used to analyse processes of diffusion
– of an innovation, a behavior, a rumor… (Dieckmann, 1989; Strang and Tuma, 1993)
• Influence between persons in a household
– Does a first user in a household influence the others?
– If yes, the probability of adopting the internet increases once a first person has adopted it
• Limited here to partners
– Does the adoption of the internet by the man (the woman) increase the risk of adoption by his (her) partner?
[Figure: Use of Internet (%), 1997 to 2008; series: restrained circle, extended circle, panel. Source: BFS (MA_net, Net-Matrix base) and Swisspanel]
In the present case
• Selection of couples first interviewed in 1999 who did not report using the internet in 1999
• Couples with no missing values on the use of the internet
• Control covariates (education, language, etc.)
File
ID      IDHOUS  NUMBER  INTERNET (t+1)  COVARIATES  IDPART  PARTNER INTERNET (t)
4102    41      1       0                           4101    1
4102    41      2       0                           4101    1
4102    41      3       1                           4101    1
…       …       …       …               …           …       …
74101   741     1       0                           74102   0
74101   741     2       0                           74102   0
74101   741     3       0                           74102   0
74101   741     4       0                           74102   0
74101   741     5       0                           74102   0
74101   741     6       0                           74102   0
74101   741     7       0                           74102   0
74101   741     8       1                           74102   0
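One possible way to build such rows is sketched below. It assumes that the person's own use is taken at wave t+1 and the partner's use at wave t, and that a person leaves the risk set once he or she has adopted. The function name `build_rows` and the input lists are hypothetical.

```python
def build_rows(person_id, partner_id, own_use, partner_use):
    """own_use[k] and partner_use[k] are 0/1 internet use at wave k.

    One row per interval (t, t+1) during which the person is still a
    non-user at t; the outcome is the person's use at t+1 and the
    covariate is the partner's use at t.
    """
    rows = []
    for t in range(len(own_use) - 1):
        if own_use[t] == 1:
            break  # already adopted: no longer at risk
        rows.append({
            "id": person_id,
            "number": t + 1,
            "internet": own_use[t + 1],
            "idpart": partner_id,
            "partner_internet": partner_use[t],
        })
    return rows


# Pattern of the first person in the file: adoption in the third interval,
# partner already a user throughout.
for row in build_rows(4102, 4101, [0, 0, 0, 1], [1, 1, 1, 1]):
    print(row)
```

Lagging the partner's use by one wave keeps the covariate in the past of the outcome, in line with the rule of not using the future to explain the present.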
RESULTS (estimated coefficients)
                      Men                                 Women
                      Model 1     Model 2     Model 3     Model 1     Model 2     Model 3
Constant              -0.55 ***   -1.79 ***   -2.88 ***   -1.00 ***   -2.27 ***   -3.42 ***
1999-2000             0           0           0           0           0           0
2000-01               -0.60 ***   -0.56 ***   -0.46 ***   -0.46 ***   -0.59 ***   -0.46 ***
2001-02               -0.81 ***   -0.74 ***   -0.59 ***   -0.41 ***   -0.58 ***   -0.40 *
2002-03               -1.16 ***   -1.06 ***   -0.91 ***   -0.83 ***   -0.99 ***   -0.75 ***
2003-04               -1.74 ***   -1.59 ***   -1.44 ***   -0.96 ***   -1.13 ***   -0.88 **
2004-05               -1.89 ***   -1.79 ***   -1.62 ***   -0.96 ***   -1.11 ***   -0.83 **
2005-06               -1.70 ***   -1.47 ***   -1.37 ***   -0.94 ***   -1.04 ***   -0.76 **
2006-07               -1.51 ***   -1.21 ***   -1.07 ***   -1.01 ***   -1.02 ***   -0.71 *
Children                          0.72 ***    0.12                    0.58 ***    -0.22
Computer                          1.34 ***    1.13 ***                0.95 ***    0.79 ***
Partner already uses              -0.09       -0.11                   0.62 ***    0.53 ***
Before 1940                       0           0                       0           0
1940-49                                       1.05 ***                            0.95 ***
1950-59                                       1.31 ***                            1.84 ***
1960-69                                       1.53 ***                            1.90 ***
1970 and after                                1.50 ***                            2.32 ***
Level 1                                       0                                   0
Level 2                                       0.14                                0.27
Level 3                                       0.70 ***                            0.98 ***
German                                        0                                   0
French                                        -0.07                               -0.29 *
Italian                                       -0.53                               -0.35
Other                                         -0.44                               -1.18 ***
Non-Swiss                                     0.23                                -0.01
*: 5%, **: 1%, ***: 0.1%
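Because the models are discrete-time logit models, a coefficient can be read as a log odds ratio, so exponentiating it gives the multiplicative effect on the odds of adopting in a given year. The sketch below does this for the « Partner already use » row. It assumes that the four values of that row belong to Models 2 and 3 for men and then for women. The dictionary layout and function name are illustrative.

```python
import math

# "Partner already use" coefficients as read from the results table
# (assignment to Models 2 and 3 is an assumption).
partner_effect = {
    ("men", "model 2"): -0.09,
    ("men", "model 3"): -0.11,
    ("women", "model 2"): 0.62,
    ("women", "model 3"): 0.53,
}


def odds_ratio(coef):
    """Convert a logit coefficient into an odds ratio."""
    return math.exp(coef)


for key, coef in partner_effect.items():
    print(key, round(odds_ratio(coef), 2))
```

On this reading, a woman whose partner already uses the internet has odds of adopting that are roughly 1.7 to 1.9 times higher, while the corresponding coefficients for men are close to zero and not significant.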