
    Discrete-Time Methods for the Analysis of Event Histories
    Author(s): Paul D. Allison
    Source: Sociological Methodology, Vol. 13 (1982), pp. 61-98
    Published by: American Sociological Association
    Stable URL: http://www.jstor.org/stable/270718
    Accessed: 15/08/2008 10:13


  • DISCRETE-TIME METHODS FOR THE ANALYSIS OF EVENT HISTORIES

    Paul D. Allison UNIVERSITY OF PENNSYLVANIA

    The history of an individual or group can always be characterized as a sequence of events. People finish school, enter the labor force, marry, give birth, get promoted, change employers, retire, and ultimately die. Formal organizations merge, adopt innovations, and go bankrupt. Nations experience wars, revolutions, and peaceful changes of government. It is surely the business of sociology to explain and predict the occurrence of such events. Why is it, for example, that some individuals try marijuana while others do not? Why do some people marry early while others marry late? Do educational enrichment programs reduce the likelihood of dropping out of school? What distinguishes firms that have adopted computerized accounting systems from those that have not? What are the causes of revolutions?

    For helpful suggestions, I am indebted to Charles Brown, Rachel Rosenfeld, Thomas Santner, Nancy Tuma, and several anonymous referees.

    Perhaps the best form of data for answering questions like these is an event history. Quite simply, an event history is a record of when events occurred to a sample of individuals (Tuma and Hannan, 1978). If the sample consists of women of childbearing age, for example, each woman's event history might consist of the birthdates of her children, if any. If one is interested in the causes of events, the event history should also include data on relevant explanatory variables. Some of these, like race, may be constant over time while others, like income, may vary.

    Although event histories are almost ideal for studying the causes of events, they also typically possess two features, censoring and time-varying explanatory variables, that create major difficulties for standard statistical procedures. In fact, the attempt to apply standard methods to such data can lead to serious bias or loss of information. These difficulties are discussed in some detail in the following pages. In the last decade, however, several innovative methods for the analysis of event histories have been proposed. Sociologists will be most familiar with the maximum-likelihood methods of Tuma and her colleagues (Tuma, 1976; Tuma and Hannan, 1978; Tuma, Hannan, and Groeneveld, 1979). Similar procedures have been developed by biostatisticians interested in the analysis of survival data (Gross and Clark, 1975; Elandt-Johnson and Johnson, 1980; Kalbfleisch and Prentice, 1980). A related approach, known as partial likelihood, offers important advantages over maximum-likelihood methods and is now in widespread use in the biomedical sciences (Cox, 1972; Kalbfleisch and Prentice, 1980; Tuma, present volume, Chapter 1).

    Most methods for analyzing event histories assume that time is measured as a continuous variable; that is, it can take on any nonnegative value. Under some circumstances, discrete-time models and methods may be more appropriate or, if less appropriate, highly useful.


    First, in some situations events can only occur at regular, discrete points in time. For example, in the United States a change in party controlling the presidency only occurs quadrennially in the month of January. In such cases a discrete-time model is clearly more appropriate than a continuous-time model.

    Second, in other situations events can occur at any point in time, but available data record only the particular interval of time in which each event occurs. For example, most surveys ask only for the year of a person's marriage rather than the exact date. It would clearly be inappropriate to treat such data as though they were continuous. Two alternative approaches are available, however. One is to assume that there is an underlying continuous-time model and then estimate the model's parameters by methods that take into account the discrete character of the data. The other approach is simply to assume that events can occur only at the discrete time points measured in the data and then apply discrete-time models and methods. In practice, these two approaches lead to very similar estimation procedures and, hence, both may be described as discrete-time methods.

    Discrete-time methods have several desirable features. It is easy, for example, to incorporate time-varying explanatory variables into a discrete-time analysis. Moreover, when the explanatory variables are categorical (or can be treated as such), discrete-time models can be estimated by using log-linear methods for analyzing contingency tables. With this approach one can analyze large samples at very low cost. When explanatory variables are not categorical, the estimation procedures can often be well approximated by using ordinary least-squares regression. Finally, discrete-time methods are more readily understood by the methodologically unsophisticated.
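    One way to see why time-varying explanatory variables fit naturally into a discrete-time analysis is to lay the data out with one record per person per time unit. The sketch below is only an illustration under assumed names and values: the function `person_periods`, the field names, and the income figures are all hypothetical and are not taken from the text.

```python
def person_periods(person_id, last_period, event_observed, covariate_by_period):
    """Expand one individual's history into one record per discrete time unit.

    All names here are hypothetical. `last_period` is the last period in which
    the person was observed; `event_observed` says whether the event occurred
    in that last period (False means the history was censored there).
    """
    rows = []
    for t in range(1, last_period + 1):
        rows.append({
            "id": person_id,
            "period": t,
            # The event indicator is 1 only in the period where the event occurs.
            "event": 1 if (event_observed and t == last_period) else 0,
            # The explanatory variable is free to take a new value each period.
            "x": covariate_by_period[t - 1],
        })
    return rows


# Hypothetical person "A": event in period 3, income changing every period.
rows = person_periods("A", 3, True, [10, 12, 15])
for r in rows:
    print(r)
```

    Because each record carries its own value of the explanatory variable, a variable such as income can change from one period to the next without any special treatment.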

    For all these reasons, discrete-time methods for the analysis of event histories are often well suited to the sorts of data, computational resources, and quantitative skills possessed by social scientists. The aim of this chapter is to examine the discrete-time approach closely and compare it with continuous-time methods. Before undertaking this task, I shall first discuss the problems that arise in the analysis of event histories and then summarize the continuous-time approach.

    PROBLEMS IN ANALYZING EVENT HISTORIES

    Whether time is measured on a continuous or discrete scale, standard analytic techniques are not well suited to the analysis of event-history data. As an example of these difficulties, consider the study of criminal recidivism reported by Rossi, Berk, and Lenihan (1980). Approximately 430 inmates released from Maryland state prisons were followed up for one year after their release. The events of interest were arrests; the aim was to determine how the likelihood of an arrest depended on various explanatory variables.

    Although the date of each arrest was known, Rossi and colleagues simply created a dummy variable indicating whether or not a person was arrested during the 12-month follow-up period. They then regressed this dummy variable on possible explanatory variables including age at release, race, education, and prior work experience. While this is not an unreasonable exploratory method, it is far from ideal. Aside from the well-known limitations of using a dummy dependent variable in a multiple regression (Goldberger, 1964), the dichotomization of the dependent variable is both arbitrary and wasteful of information. It is arbitrary because there was nothing special about the 12-month interval except that the study ended at that point. Using the same data, one might just as well compare those arrested before and after a 6-month dividing line. It is wasteful of information because it ignores the variation on either side of the cutoff point. One might suspect, for example, that a person arrested immediately after release had a higher propensity toward recidivism than one arrested 11 months later.
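    The two objections to dichotomization can be made concrete with a small sketch. The arrest times below are invented for illustration and are not data from the Rossi, Berk, and Lenihan study; the function name `arrested_by` is likewise hypothetical.

```python
# Hypothetical months from release to first arrest; None means no arrest
# was observed during the 12-month follow-up.
months_to_arrest = [1, 3, 5, 7, 11, None, None]


def arrested_by(cutoff, times):
    """Return 1 for each person arrested within `cutoff` months, else 0."""
    return [1 if t is not None and t <= cutoff else 0 for t in times]


dummy_12 = arrested_by(12, months_to_arrest)
dummy_6 = arrested_by(6, months_to_arrest)

print(dummy_12)  # [1, 1, 1, 1, 1, 0, 0]
print(dummy_6)   # [1, 1, 1, 0, 0, 0, 0]
```

    Moving the cutoff from 12 months to 6 changes the dependent variable for two of the seven hypothetical persons, which is the sense in which the choice is arbitrary. Under the 12-month cutoff, the persons arrested after 1 month and after 11 months receive the same score, which is the sense in which information is wasted.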

    To avoid these difficulties, it is tempting to use the length of time from release to first arrest as the dependent variable in a multiple regression. But this strategy poses two new problems. First, the value of the dependent variable is unknown or "censored" for persons who experienced no arrests during the one-year period. An ad hoc solution to this dilemma might be to exclude all censored observations and just look at those cases for whom an arrest is observed. But the number of censored cases may be large (47 percent were censored in this sample), and it has been shown that their exclusion can lead to large biases (Sørensen, 1977; Tuma and Hannan, 1978). An alternative ad hoc approach is to assign the maximum length of time observed, in this case one year, as the value of the dependent variable for the censored cases. Obviously this strategy underestimates the true value, and again substantial biases may re