multi-population mortality modelling danish data and …andrewc/ohp/cairns_riskminds2015.pdf ·...
TRANSCRIPT
MULTI-POPULATION MORTALITY MODELLING
Danish data and Basis Risk
Andrew Cairns
Heriot-Watt University, Scotland
and
The Maxwell Institute, Edinburgh
Joint work with
David Blake, Kevin Dowd, Malene Kallestrup-Lamb and Carsten Rosenskjold
1
Plan
• Motivation and challenges
• Danish males data
10 sub-populations grouped by affluence
• An extended CBD multi-population model
• Bayesian implementation and results
2
Motivation for multi-population modelling
A: Risk assessment
• Multi-country (e.g. consistent demographic projections)
• Males/Females (e.g. consistent demographic projections)
• Socio-economic subgroups (e.g. blue or white collar)
• Smokers/Non-smokers
• Annuities/Life insurance
• Limited data ⇒ learn from other populations
3
Motivation for multi-population modellingB: Risk management for pension plans and insurers
• Retain systematic mortality risk; versus:
• ‘Over-the-counter’ deals (e.g. longevity swap)
• Standardised mortality-linked securities
– linked to national mortality index
– < 100% risk reduction: basis risk
4
Challenges
• Data availability
• Data quality and depth
• Model complexity
– single population models can be complex
– 2-population versions are more complex
– multi-pop ......
• Multi-population modelling requires
– (fairly) simple single-population models
– simple dependencies between populations
5
A New Case Study and a New Model
• Sub-populations differ from national population
– socio-economic factors
– geographical variation
– other factors
• Denmark
– High quality data on ALL residents
– 1981-2005 available
– Can subdivide population using covariates on the
database
6
Danish Data
• What can we learn from Danish data that will inform us
about other populations?
• Key covariates
– Wealth
– Income
• Affluence = Wealth+15×Income
7
Problem
• High income ⇒ “affluent” and healthy BUT
• Low income ⇒/ not affluent, poor health
• High wealth ⇒ “affluent” and healthy BUT
• Low wealth ⇒/ not affluent, poor health
Solution: use a combination
• Affluence, A = wealth +K× income
•K = 15 seems to work well statistically as a predictor
• Low affluence, A, predicts poor mortality
8
Subdividing Data
• Males resident in Denmark for the previous 12 months
• Divide population in year t
– into 10 equal sized Groups (approx)
– using affluence, A
• Individuals can change groups up to age 67
• Group allocations are locked down at age 67
(better than not locking down at age 67)
9
Subdivided Data
• Exposures E(i)(t, x) for groups i = 1, . . . , 10
range from over 4000 down to 20
• Deaths D(i)(t, x)
range from 150 down to 6
• Crude death rates m̂(i)(t, x) = D(i)(t, x)/E(i)(t, x)
• Small groups ⇒ Poisson risk is important
10
Crude death rates 2005
60 70 80 90
0.00
20.
010
0.05
00.
200
Group 1Group 2Group 3Group 4Group 5Group 6Group 7Group 8Group 9Group 10
Males Crude m(t,x); 2005
Age
m(t
,x)
(log
scal
e)
11
Modelling the death rates, mk(t, x)
m(k)(t, x) = pop. k death rate in year t at age x
Population k, year t, age x
logm(k)(t, x) = β(k)(x) + κ(k)1 (t) + κ
(k)2 (t)(x− x̄)
(Extended CBD with a non-parametric base table, β(k)(x))
• 10 groups, k = 1, . . . , 10 (low to high affluence)
• 21 years, t = 1985, . . . , 2005
• 40 ages, x = 55, . . . , 94
12
Model-Inferred Underlying Death Rates 2005
60 70 80 90
0.00
20.
010
0.05
00.
200
Group 1Group 2Group 3Group 4Group 5Group 6Group 7Group 8Group 9Group 10
Males Crude m(t,x); 2005
Age
m(t
,x)
(log
scal
e)
60 70 80 90
0.00
20.
010
0.05
00.
200
Males CBD−X Fitted m(t,x); 2005
Age
m(t
,x)
(log
scal
e)
13
Modelling the death rates, mk(t, x)
logm(k)(t, x) = β(k)(x) + κ(k)1 (t) + κ
(k)2 (t)(x− x̄)
• Model fits the 10 groups well without a cohort effect
• Non-parametric β(k)(x) is essential to preserve group
rankings
– Rankings are evident in crude data
– “Bio-demographical reasonableness”:
more affluent ⇒ healthier
14
Time series modelling
• t→ t + 1: Allow for correlation
– between κ(k)1 (t + 1) and κ
(k)2 (t + 1)
– between groups k = 1, . . . , 10
• Bio-demographical reasonableness
⇒ key hypothesis: groups should not diverge
⇒ group specific period effects gravitate towards the
national trend
15
Life Expectancy for Groups 1 to 10
1985 1990 1995 2000 2005
1416
1820
2224
26Males Period LE: Age 55
1985 1990 1995 2000 2005
46
810
1214
16
Males Period LE: Age 67
Group 10Group 9Group 8Group 7Group 6Group 5Group 4Group 3Group 2Group 1
16
Mortality Fan Charts Including Parameter Uncertainty
1990 2000 2010 2020 2030 2040 2050
0.01
0.02
0.05
0.10
Group 10
Group 1
Mortality Rates: Age 75
Year
q(t,x
)
17
Simulated Group versus Population Mortality
0.02 0.04 0.06 0.10
0.02
0.04
0.06
0.08
Group 2T=2006Corr = 0.58
Tota
l q(t
,x)
Group q(t,x)
0.02 0.04 0.06 0.10
0.02
0.04
0.06
0.08
Group 2T=2010Corr = 0.72
Tota
l q(t
,x)
Group q(t,x)
0.02 0.04 0.06 0.10
0.02
0.04
0.06
0.08
Group 2T=2030Corr = 0.86
Tota
l q(t
,x)
Group q(t,x)
As T increases
• Scatterplots become more dispersed
• Shift down and to the left
• Correlation increasess
18
Forecast Correlations
• Deciles are quite narrow subgroups
• Blue collar pension plan
⇒ equal proportions of groups 2, 3, 4
• White collar pension plan
⇒ equal proportions of groups 8, 9, 10
19
Forecast Correlations
2010 2015 2020 2025 2030
0.0
0.2
0.4
0.6
0.8
1.0
Year
Cor
rela
tion Blue Collar Plan
White Collar PlanGroup 1Group 2Group 3Group 4Group 5Group 6Group 7Group 8Group 9Group 10
Age 75
Correlation Between Group q(t,x) and Total q(t,x)
20
Conclusions
• Development of a new multi-population dataset for Denmarkstrong bio-demographically reasonable group rankingsbased on a new measure of affluence
• Unlike multi-country dataa priori ranking of affluence-related groups
• Proposal for a simple new multi-population model
• Strong correlations over medium to long term
• Correlations depend strongly on diversity of sub-population
21
Bonus Slides
22
A specific model
κ(i)1 (t) = κ
(i)1 (t− 1) + µ1 + Z1i(t) (random walk)
−ψ(κ(i)1 (t− 1)− κ̄1(t− 1)
)(gravity between groups)
κ(i)2 (t) = κ
(i)2 (t− 1) + µ2 + Z2i(t)
−ψ(κ(i)2 (t− 1)− κ̄2(t− 1)
)where
κ̄1(t) =1
n
n∑i=1
κ(i)1 (t) and κ̄2(t) =
1
n
n∑i=1
κ(i)2 (t)
23
A specific model
κ(i)1 (t) = κ
(i)1 (t− 1) + µ1 + Z1i(t)− ψ
(κ(i)1 (t− 1)− κ̄1(t− 1)
)κ(i)2 (t) = κ
(i)2 (t− 1) + µ2 + Z2i(t)− ψ
(κ(i)2 (t− 1)− κ̄2(t− 1)
)Model structure ⇒
• (κ̄1(t), κ̄2(t)) ∼ bivariate random walk
• Each κ(i)1 (t)− κ̄1(t) ∼ AR(1) reverting to 0
• Each κ(i)2 (t)− κ̄2(t) ∼ AR(1) reverting to 0
• β(i)(x) vs β(j)(x)⇒ intrinsic group differences
24
Non-trivial correlation structure:between different ages and groups
κ(i)1 (t) = κ
(i)1 (t− 1) + µ1+Z1i(t)− ψ
(κ(i)1 (t− 1)− κ̄1(t− 1)
)κ(i)2 (t) = κ
(i)2 (t− 1) + µ2+Z2i(t)− ψ
(κ(i)2 (t− 1)− κ̄2(t− 1)
)The Zki are multivariate normal, mean 0 and
Cov(Zki, Zlj) =
vkl for i = j
ρvkl for i ̸= j
ρ = cond. correlation between κ(i)1 (t) and κ
(j)1 (t) etc.
25
Comments
• Model is very simple
– One gravity parameter, 0 < ψ < 1
– One between-group correlation parameter,
0 < ρ < 1
• Many generalisations are possible
• But more parameters + more complex computing
• This simple model seems to fit quite well.
• Nevertheless ⇒ work in progress
26
Prior distributions
• As uninformative as possible
• µ1, µ2 ∼ improper uniform prior
• {vij} ∼ Inverse Wishart
• ρ ∼ Beta(2, 2)
• ψ ∼ Beta(2, 2)
State variables and process parameters estimated using
MCMC (Gibbs + Metropolis-Hastings)
27
Posterior Distributions and 95% Credibility Intervals
−0.030 −0.020 −0.010 0.000
0.0
0.2
0.4
0.6
0.8
1.0
Kappa_1 Drift, mu_1
mu_1
Cum
ulat
ive
Pos
terio
r P
roba
bilit
y
0.0000 0.0004 0.0008 0.00120.
00.
20.
40.
60.
81.
0
Kappa_2 Drift, mu_2
mu_2
Cum
ulat
ive
Pos
terio
r P
roba
bilit
y
Note: −µ1 = global improvement rate28
Posterior Distributions and 95% Credibility Intervals
0.0 0.2 0.4 0.6 0.8 1.0
0.0
0.2
0.4
0.6
0.8
1.0
Between Group Correlation, rho
rho
Cum
ulat
ive
Pos
terio
r P
roba
bilit
y
0.00 0.02 0.04 0.06 0.08 0.10 0.12
0.0
0.2
0.4
0.6
0.8
1.0
Gravity Parameter, psi
psi
Cum
ulat
ive
Pos
terio
r P
roba
bilit
y
29