Occurrence and timing of events depend on exposure to the risk of an event; risk depends on exposure

Upload: geraldine-robertson

Post on 25-Dec-2015



ID  Age  Educ  MS  Exposure
1   13   1     0   1
1   14   1     0   1
1   15   1     0   1
1   16   1     0   1
1   17   1     0   1
1   18   1     0   1
1   19   2     0   1
1   20   2     0   1
1   21   2     0   1
1   22   2     0   1
1   23   2     0   0.5   Censoring
2   13   1     0   1
2   14   1     0   1
2   15   1     0   1
2   16   1     0   1
2   17   1     0   1
2   18   0     0   1
2   19   0     1   0.5   Event
3   13   1     0   1
3   14   1     0   1
3   15   1     0   1
3   16   1     0   1
3   17   1     0   1
3   18   0     0   1
3   19   0     0   1
3   20   2     0   1
3   21   2     1   0.5   Event
4   13   1     0   1
4   14   1     0   1
4   15   0     0   1
4   16   0     0   1
4   17   0     0   1
4   18   0     0   1
4   19   0     0   1
4   20   0     1   0.5   Event

Person-age record file
Time-varying covariates

Age at first marriage and age at change in education: Person-years file

Educ: 0 = not in school full-time; 1 = secondary education; 2 = postsecondary education

Marriage [MS]: 0 = not married; 1 = married

Source: Yamaguchi, 1991, p. 22

EDUC    Events (marriages)   Exposure
1       0                    18
2       1                    6
0       2                    9
Total   3                    33

O/E rate: 3/33 = 0.0909

Events and exposures

All age periods prior to marriage and age at marriage are included.
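The events and exposures table can be rebuilt from the person-age records. The following is a minimal sketch, assuming the records are exactly those in the person-years file above (final year counted as half a year of exposure, as in the table); the helper `person_rows` and the spell notation are illustrative names, not part of the original material.

```python
from collections import defaultdict


def person_rows(pid, spells, event):
    """Expand (first_age, last_age, educ) spells into one row per age.

    Each row is [ID, age, educ, MS, exposure]. The final age gets half a
    year of exposure and carries the event indicator (0 if censored).
    """
    rows = [[pid, age, educ, 0, 1.0]
            for first, last, educ in spells
            for age in range(first, last + 1)]
    rows[-1][3] = int(event)
    rows[-1][4] = 0.5
    return rows


# Transcribed from the person-age record file (Yamaguchi, 1991, p. 22).
records = (
    person_rows(1, [(13, 18, 1), (19, 23, 2)], event=False)   # censored at 23
    + person_rows(2, [(13, 17, 1), (18, 19, 0)], event=True)
    + person_rows(3, [(13, 17, 1), (18, 19, 0), (20, 21, 2)], event=True)
    + person_rows(4, [(13, 14, 1), (15, 20, 0)], event=True)
)

# Sum events (marriages) and exposure (person-years) by education level.
events = defaultdict(int)
exposure = defaultdict(float)
for _pid, _age, educ, ms, expo in records:
    events[educ] += ms
    exposure[educ] += expo

for educ in (1, 2, 0):
    print(educ, events[educ], exposure[educ])

# Overall occurrence/exposure rate: marriages per person-year at risk.
rate = sum(events.values()) / sum(exposure.values())
print(f"O/E rate: {rate:.4f}")
```

Because education is a time-varying covariate, each person contributes exposure to more than one education level; only the level held in the year of marriage receives the event.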

Exposure: examples

• To risk of conception

• To risk of infection (e.g. malaria, HIV)

• To marriage

• To risk of divorce

• To risk of dying

• Health risk

Exposure to risk

Whenever an event or act gives rise to gain or loss that cannot be predicted

Risk of the unexpected

Williams et al., 1995, Risk management and insurance, McGraw-Hill, New York, p. 16

Exposure analysis
• Being exposed or not
• If exposed, level of exposure (intensity)
• Factors affecting level of exposure (e.g. age, contacts, etc.)
• Interventions may affect level of exposure
– Contraceptives and sterilisation are used to prevent unwanted pregnancies
– Breastfeeding prolongs postpartum amenorrhoea (PPA)
– Immunisation prevents (reduces) risk of infectious disease
– Lifestyle reduces/increases risk of lung cancer
• Which mechanism(s) determines level of exposure
– e.g. Breastfeeding stimulates production of prolactin hormone, which inhibits ovulation

Hobcraft and Little, ??

Risk levels and differentials

Risk measures
Prediction of risk levels

Determinants of differential risk levels

Risk = potential variation in outcome

• Count: number of events during given period (observation window)
– Count data
• Probability: probability of an outcome: proportion of risk set experiencing a given outcome (event) at least once
– Basis = risk set
– Risk set = all persons at risk at given point in time
• Rate: number of events per time unit of exposure (person-time)
– Basis: duration of exposure (duration at risk)

• Rate (general) = change in one quantity per unit change in another quantity (usually time; other possible measures include space, miles travelled)

(Objective) risk measures

• Difference of probabilities: p1 - p2 (risk difference)

• Relative risk: ratio of probabilities (focus: risk factor)
– prob. of event in presence of risk factor / prob. of event in absence of risk factor (control group; reference category): p1 / p2
• Odds: odds on an outcome: ratio of favourable outcomes to unfavourable outcomes. Chance of one outcome rather than another: p1 / (1-p1)

The odds are what matter when placing a bet on a given outcome, i.e. when something is at stake. Odds reflect the degree of belief in a given outcome.

Relation odds and relative risk: Agresti, 1996, p. 25
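The three objective measures above can be compared on one pair of probabilities. This is a minimal sketch, assuming p1 and p2 are the proportions leaving home before age 20 among females (135 of 278) and males (74 of 252) from the leaving-home example in this deck; the function names are illustrative.

```python
def risk_difference(p1, p2):
    """Difference of probabilities: p1 - p2."""
    return p1 - p2


def relative_risk(p1, p2):
    """Ratio of probabilities: p1 / p2 (p2 is the reference category)."""
    return p1 / p2


def odds(p):
    """Odds on an outcome: p / (1 - p)."""
    return p / (1.0 - p)


p1 = 135 / 278   # females leaving home early
p2 = 74 / 252    # males leaving home early (reference)

rd = risk_difference(p1, p2)
rr = relative_risk(p1, p2)
print(f"risk difference = {rd:.3f}")
print(f"relative risk   = {rr:.3f}")
print(f"odds (females)  = {odds(p1):.3f}")
print(f"odds (males)    = {odds(p2):.3f}")
```

The relative risk and the ratio of the two odds differ here because the outcome is common; they only approximate each other when both probabilities are small.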

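Risk measures

Parameters of logistic regression: ln(odds) and ln(odds ratio)

• Odds: two categories (binary data)

$\text{Odds} = \frac{p}{1-p}$ (range on the odds scale: $[0, \infty)$)

$p = \frac{\text{Odds}}{1 + \text{Odds}} = \frac{1}{1 + \text{Odds}^{-1}}$

$\operatorname{logit}(p) = \ln\frac{p}{1-p} = \ln(\text{odds}) = \eta$ (range: $(-\infty, +\infty)$)

$p = \frac{\exp[\eta]}{1 + \exp[\eta]} = \frac{1}{1 + \exp[-\eta]}$

In regression analysis, $\eta$ is the linear predictor: $\eta = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \dots$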
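The mapping between probability, odds and logit, and the role of the linear predictor, can be sketched in a few lines. The coefficient and covariate values below are hypothetical and chosen only to make the arithmetic easy to follow; the function names are illustrative.

```python
import math


def odds(p):
    """Odds = p / (1 - p)."""
    return p / (1.0 - p)


def logit(p):
    """logit(p) = ln(odds)."""
    return math.log(odds(p))


def inv_logit(eta):
    """p = 1 / (1 + exp(-eta)), the inverse of the logit."""
    return 1.0 / (1.0 + math.exp(-eta))


def linear_predictor(betas, xs):
    """eta = beta_0 + beta_1*x_1 + ...; betas[0] is the intercept."""
    return betas[0] + sum(b * x for b, x in zip(betas[1:], xs))


# Hypothetical coefficients and covariate values, for illustration only.
eta = linear_predictor([-1.0, 0.5, 0.25], [2.0, 4.0])
p = inv_logit(eta)
print(f"eta = {eta}, p = {p:.4f}, odds = {odds(p):.4f}")
```

Working on the logit scale keeps the linear predictor unbounded while the implied probability stays between 0 and 1.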

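• Odds: multiple categories (polytomous data)

Select category 3 as reference category

$\eta_1 = \operatorname{logit}(p_1) = \ln(\text{odds}_1) = \ln\frac{p_1}{p_3}$

$\eta_2 = \operatorname{logit}(p_2) = \ln(\text{odds}_2) = \ln\frac{p_2}{p_3}$

$\text{Odds}_1 = \frac{p_1}{p_3}$, $\text{Odds}_2 = \frac{p_2}{p_3}$, $\text{Odds}_3 = \frac{p_3}{p_3} = 1$

$\ln p_1 - \ln p_3 = \eta_1$, so $p_1 = p_3 \exp[\eta_1] = \exp[\eta_1]\,[1 - p_1 - p_2]$

$p_1 = \frac{\exp[\eta_1]}{1 + \exp[\eta_1] + \exp[\eta_2]} = \frac{\exp[\eta_1]}{1 + \sum_{j=1}^{2} \exp[\eta_j]}$

$p_i = \frac{\exp[\eta_i]}{\sum_{j} \exp[\eta_j]}$, with $\eta_3 = 0$ for the reference category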
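With a reference category, the probabilities of all categories follow from the linear predictors of the non-reference categories. This is a minimal sketch, assuming the reference category is listed last and has linear predictor 0; the input values are hypothetical.

```python
import math


def multinomial_probs(etas):
    """Probabilities for all categories given the log-odds of each
    non-reference category relative to the reference category.

    The reference category is appended last with exp(0) = 1.
    """
    exps = [math.exp(e) for e in etas] + [1.0]
    total = sum(exps)
    return [e / total for e in exps]


# Hypothetical log-odds: category 1 is twice, category 2 three times
# as likely as reference category 3.
probs = multinomial_probs([math.log(2.0), math.log(3.0)])
print([round(p, 4) for p in probs])

# The odds relative to the reference category are recovered as p_i / p_3.
print(probs[0] / probs[2], probs[1] / probs[2])
```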

• Odds ratio: ratio of odds (focus: risk indicator, covariate)
– odds in target group / odds in control group [reference category]: the ratio of favourable to unfavourable outcomes in the target group divided by the same ratio in the control group. The odds ratio measures the ‘belief’ in a given outcome in two different populations or under two different conditions. If the odds ratio is one, the two populations or conditions are similar.

Target group: k=1; Control group: k=2

$\theta_{12} = \frac{\text{Odds}_{k=1}}{\text{Odds}_{k=2}} = \frac{p_{1|k=1} / p_{2|k=1}}{p_{1|k=2} / p_{2|k=2}}$

where $p_{j|k}$ is the probability of outcome $j$ in group $k$.

Risk measures in epidemiology
• Prevalence: proportion (refers to status)
• Incidence rate: rate at which events (new cases) occur over a defined time period [events per person-time]. Incidence rate is also referred to as incidence density (e.g. Young, 1998, p. 25; Goldhaber and Fireman, 1991).
• Case-fatality ratio: proportion of sick people who die of a disease (measure of severity of disease). It is not a rate (Young, 1998, p. 27).

Confusion:
Birth defect prevalence: proportion of live births having defects
Birth defect incidence: rate of development of defects among all embryos over the period of gestation (Young, 1998, p. 48)

• Attributable risk (among the exposed): proportion of events (diseases) attributable to being exposed: [p1-p2]/p1 (since non-exposed can also develop disease)

• Subjective probability: degree of belief about the outcome of a trial or process, or about the future. It is the perception of the probability of an outcome or event. ‘It is highly dependent on judgment’ (Keynes, 1921, A treatise on probability, Macmillan, London). Keynes regarded probability as a subjective concept: our judgment (intuition, gut feeling) about the likelihood of the outcome.

– See also Value-expectancy theory: attractiveness of an alternative (option) depends on the subjective probability of an outcome and the value or utility of the outcome (Fishbein and Ajzen, 1975).

(Subjective) risk measures

In case of multiple categories, select a reference category

Reference category is coded 0

Various coding schemes!

Coding schemes

• Contrast coding: one category is reference category (simple contrast coding; dummy coding). Model parameters are deviations from reference category.

– Indicator variable coding: indicator (0,1) variables
– Cornered effect coding (Wrigley, 1985, pp. 132-136) [0,1]

• Effect coding: the mean is the reference. Model parameters are deviations from the mean.

• Centred effect coding (Wrigley, 1985, pp. 132-136) [-1,+1]

• Other types of coding: see e.g. SPSS Advanced Statistics, Appendix A

Vermunt, 1997, p. 10

Coding schemes

• Categories are coded:
– Binary: [0,1], [-1,+1], [1,2]
– Multiple: [0,1,2,3,..], [set of binary]

e.g. 3 categories:

1 0 0
0 1 0
0 0 0
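The difference between dummy (cornered) coding and effect (centred) coding is easiest to see in the design columns they generate. The sketch below assumes one common convention: one column per non-reference category, with the reference category coded 0 on every column under dummy coding and -1 on every column under effect coding. The function names are illustrative.

```python
def dummy_coding(k, reference):
    """Indicator (0,1) columns; the reference category is all zeros.

    Rows are categories 0..k-1, columns are the non-reference categories.
    Parameters are deviations from the reference category.
    """
    cols = [c for c in range(k) if c != reference]
    return [[1 if cat == c else 0 for c in cols] for cat in range(k)]


def effect_coding(k, reference):
    """(-1, 0, +1) columns; the reference category is coded -1 throughout.

    Parameters are deviations from the (unweighted) mean.
    """
    cols = [c for c in range(k) if c != reference]
    return [[-1 if cat == reference else (1 if cat == c else 0) for c in cols]
            for cat in range(k)]


# Three categories, with the last one (index 2) as reference.
for row in dummy_coding(3, reference=2):
    print(row)
for row in effect_coding(3, reference=2):
    print(row)
```

Both schemes fit the same model; only the interpretation of the parameters changes, which is why the choice of reference category should follow the research question.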

Coding schemes

Selection of reference category depends on research question

Example

Number of young adults leaving home by age and sex, Netherlands, 1961 birth cohort

                        Sex
Age                     Females   Males   Total
Early (LT 20)           135       74      209
Late (GE 20)            143       178     321
Total                   278       252     530
Censored at interview   13        40      53
TOTAL                   291       292     583

The survey (Sept. 1987 - Febr. 1988):
Sample of 583 young adults born in 1961
530 left home before survey
53 censored cases

Young adults leaving home by age and sex, Netherlands, 1961 birth cohort

Descriptive statistics

A. Counts
Age             Females   Males   Total
Early (LT 20)   135       74      209
Late (GE 20)    143       178     321
Total           278       252     530

B. Probabilities
Age             Females   Males   F+M
Early (LT 20)   0.49      0.29    0.39
Late (GE 20)    0.51      0.71    0.61
Total           1.00      1.00    1.00

C. ODDS and LOGIT
                    Females   Males    F+M
ODDS: Early/Late    0.94      0.42     0.65
LOGIT: Early/Late   -0.058    -0.878   -0.429
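Panels B and C follow directly from the counts in panel A. A minimal sketch, assuming the counts as given (the dictionary layout and variable names are illustrative):

```python
import math

# Panel A: counts of young adults leaving home early (LT 20) and late (GE 20).
counts = {
    "Females": {"early": 135, "late": 143},
    "Males": {"early": 74, "late": 178},
}
counts["F+M"] = {
    "early": counts["Females"]["early"] + counts["Males"]["early"],
    "late": counts["Females"]["late"] + counts["Males"]["late"],
}

summary = {}
for group, c in counts.items():
    total = c["early"] + c["late"]
    p_early = c["early"] / total          # panel B: probability
    odds_early = c["early"] / c["late"]   # panel C: odds early/late
    summary[group] = {
        "p_early": p_early,
        "p_late": 1.0 - p_early,
        "odds": odds_early,
        "logit": math.log(odds_early),    # panel C: logit
    }

for group, s in summary.items():
    print(group, round(s["p_early"], 2), round(s["odds"], 2), round(s["logit"], 3))
```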

Reference categories: Late [GE 20], Males

Odds on leaving home early (rather than late), with logit:

- Males: 74/178 = 0.416; logit = -0.878

- Females: 135/143 = 0.944; logit = -0.058

Odds ratio ($\theta$): 0.944/0.416 = 2.27; $\ln \theta$ = 0.820
(If we bet that a person leaves home early, we should bet on females; they are the ‘winners’: they leave home early.)

$\operatorname{Var}(\theta) = \theta^2 \, [1/135 + 1/143 + 1/74 + 1/178] = 0.173$

$\operatorname{Var}(\ln \theta) = 1/135 + 1/143 + 1/74 + 1/178 = 0.0335$ (Selvin, 1991, p. 345)
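The odds ratio and its variance can be checked numerically from the four cell counts. This sketch follows the variance formulas cited from Selvin; the 95% confidence interval at the end is an added illustration based on the usual normal approximation on the log scale and is not part of the original slides.

```python
import math

# Cell counts: females early/late, males early/late (males = reference).
a, b = 135, 143
c, d = 74, 178

odds_ratio = (a / b) / (c / d)
log_or = math.log(odds_ratio)

# Variance of ln(odds ratio): sum of reciprocal cell counts.
var_log_or = 1 / a + 1 / b + 1 / c + 1 / d

# Approximate variance of the odds ratio itself (delta method).
var_or = odds_ratio ** 2 * var_log_or

# Assumption: normal approximation on the log scale, 95% level.
se = math.sqrt(var_log_or)
ci = (math.exp(log_or - 1.96 * se), math.exp(log_or + 1.96 * se))

print(f"odds ratio = {odds_ratio:.3f}, ln = {log_or:.3f}")
print(f"Var(ln OR) = {var_log_or:.4f}, Var(OR) = {var_or:.3f}")
print(f"95% CI = ({ci[0]:.2f}, {ci[1]:.2f})")
```

Building the interval on the log scale and exponentiating keeps both limits positive, which an interval built directly on the odds ratio would not guarantee.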


Table: Number of young adults leaving home by age and sex

        Females   Males   Total
< 20    135       74      209
≥ 20    143       178     321
Total   278       252     530

Dummy coding: reference category: (i) females; (ii) leaving home late

Logit model: $\operatorname{logit}(p_i) = \ln\frac{p_i}{1-p_i}$, where $p_i$ is the probability of leaving home early and $i$ is sex ($i = 1$ for females and 2 for males)

ODDS
Females (reference): 135/143 = 0.9440
Males: 74/178 = 0.4157

ODDS RATIO
ODDS males / ODDS females = 0.4157/0.9440 = 0.4404

LOGIT
logit(p) is ln(0.9440) = -0.05757 for females and ln(0.4157) = -0.8777 for males

Ln odds ratio = -0.8201
NOTE that -0.8777 = -0.05757 - 0.8201

Are males more likely to leave home early than females?
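With dummy coding and females as the reference category, the logit model for this table has two parameters that can be read off the odds. This is a minimal sketch, assuming a saturated model logit(p) = alpha + beta * male, where `male` is a 0/1 indicator; the parameter names are illustrative.

```python
import math

# Intercept: logit of leaving home early for the reference category (females).
alpha = math.log(135 / 143)

# Slope: ln(odds ratio) of males relative to females.
beta = math.log(74 / 178) - alpha


def p_early(male):
    """Probability of leaving home early; male = 0 for females, 1 for males."""
    eta = alpha + beta * male
    return 1.0 / (1.0 + math.exp(-eta))


print(f"alpha = {alpha:.5f}, beta = {beta:.4f}")
print(f"p(early | female) = {p_early(0):.4f}")
print(f"p(early | male)   = {p_early(1):.4f}")
```

The negative slope answers the question on the slide: in these data males have lower odds of leaving home early than females.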

Leaving home

Odds and probabilities

[Figure: Relation between probabilities, odds and logit. Horizontal axis: probability (0.0 to 1.0). Vertical axes: odds (0.0 to 10.0) and logit (-2.5 to 2.5). Series: odds, logit.]

Risk analysis: models
Prediction of risk levels and differential risk levels

Probability models and regression models

– Counts → Poisson r.v. → Poisson distribution → Poisson regression / log-linear model

– Probabilities → binomial and multinomial r.v. → binomial and multinomial distribution → logistic regression / logit model
(parameter p, probability of occurrence, is also called risk; e.g. Clayton and Hills, 1993, p. 7)

– Rates → occurrences/exposure → Poisson r.v. → log-rate model