On Predictive Modeling for Claim Severity Glenn Meyers ISO Innovative Analytics CARe Seminar June 6-7, 2005


Page 1:

On Predictive Modeling for Claim Severity

Glenn Meyers, ISO Innovative Analytics

CARe Seminar, June 6-7, 2005

Page 2:

Problems with Experience Rating for Excess of Loss Reinsurance

• Use submission claim severity data
  – Relevant, but
  – Not credible
  – Not developed

• Use industry distributions
  – Credible, but
  – Not relevant (???)

Page 3:

General Problems with Fitting Claim Severity Distributions

• Parameter uncertainty
  – Fitted parameters of chosen model are estimates subject to sampling error.

• Model uncertainty
  – We might choose the wrong model. There is no particular reason that the models we choose are appropriate.

• Loss development
  – Complete claim settlement data is not always available.

Page 4:

Outline of Remainder of Talk

• Quantifying Parameter Uncertainty
  – Likelihood ratio test

• Incorporating Model Uncertainty
  – Use Bayesian estimation with likelihood functions
  – Uncertainty in excess layer loss estimates

• Bayesian estimation with prior models based on data reported to a statistical agent
  – Reflect insurer heterogeneity
  – Develop losses

Page 5:

How Paper is Organized

• Start with classical hypothesis testing.
  – Likelihood ratio test

• Calculate a confidence region for parameters.

• Calculate a confidence interval for a function of the parameters.
  – For example, the expected loss in a layer

• Introduce a prior distribution of parameters.

• Calculate predictive mean for a function of parameters.

Page 6:

The Likelihood Ratio Test

Let $\mathbf{x} = (x_1, \ldots, x_n)$ be a set of observed losses.

Let $\mathbf{p} = (p_1, \ldots, p_k)$ be a parameter vector for your chosen loss model.

Let $\hat{\mathbf{p}}$ be the maximum likelihood estimate of $\mathbf{p}$ given $\mathbf{x}$.

Page 7:

The Likelihood Ratio Test

Test $H_0: \mathbf{p} = \mathbf{p}^*$ against $H_1: \mathbf{p} \neq \mathbf{p}^*$.

Theorem 2.10 in Klugman, Panjer & Willmot

If $H_0$ is true then:

$$\ln LR \equiv 2\left(\ln L(\hat{\mathbf{p}}; \mathbf{x}) - \ln L(\mathbf{p}^*; \mathbf{x})\right)$$

has a $\chi^2$ distribution with $k$ degrees of freedom.

Use the $\chi^2$ distribution to find critical values.

Page 8:

An Example – The Pareto Distribution

$$F(x) = 1 - \left(\frac{\theta}{x + \theta}\right)^{\alpha}$$

• Simulate random sample of size 1000 with $\alpha$ = 2.000, $\theta$ = 10,000

Maximum log-likelihood = -10034.660 with $\hat{\theta}$ = 8723.04, $\hat{\alpha}$ = 1.80792
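The simulate-and-fit step on this slide can be sketched roughly as follows. This is a minimal illustration, not the author's code: the function names, the seed, the search bracket for theta, and the use of a golden-section search on the profile log-likelihood are all assumptions made for the example, so the fitted values will not match the slide's.

```python
import math
import random

def simulate_pareto(n, theta, alpha, seed=0):
    """Draw n losses by inverting F(x) = 1 - (theta / (x + theta))**alpha."""
    rng = random.Random(seed)
    return [theta * ((1.0 - rng.random()) ** (-1.0 / alpha) - 1.0) for _ in range(n)]

def log_likelihood(x, theta, alpha):
    """Pareto log-likelihood; density is alpha * theta**alpha / (x + theta)**(alpha + 1)."""
    n = len(x)
    return (n * math.log(alpha) + n * alpha * math.log(theta)
            - (alpha + 1.0) * sum(math.log(xi + theta) for xi in x))

def alpha_given_theta(x, theta):
    """For a fixed theta, the maximizing alpha has a closed form."""
    return len(x) / sum(math.log(1.0 + xi / theta) for xi in x)

def fit_pareto(x, lo=1e2, hi=1e7, iters=60):
    """Golden-section search over log(theta) on the profile log-likelihood.

    The bracket [lo, hi] is an assumption chosen for this example.
    """
    def profile(log_theta):
        theta = math.exp(log_theta)
        return log_likelihood(x, theta, alpha_given_theta(x, theta))

    a, b = math.log(lo), math.log(hi)
    g = (math.sqrt(5.0) - 1.0) / 2.0
    c, d = b - g * (b - a), a + g * (b - a)
    for _ in range(iters):
        if profile(c) > profile(d):
            b, d = d, c
            c = b - g * (b - a)
        else:
            a, c = c, d
            d = a + g * (b - a)
    theta_hat = math.exp((a + b) / 2.0)
    return theta_hat, alpha_given_theta(x, theta_hat)

sample = simulate_pareto(1000, theta=10_000.0, alpha=2.0, seed=1)
theta_hat, alpha_hat = fit_pareto(sample)
print(theta_hat, alpha_hat, log_likelihood(sample, theta_hat, alpha_hat))
```

Profiling out alpha reduces the fit to a one-dimensional search, which keeps the sketch dependency-free; a general-purpose optimizer would serve equally well.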

Page 9:

Hypothesis Testing Example

• Significance level = 5%

• $\chi^2$ critical value = 5.991 (2 degrees of freedom)

• H0: ($\theta$, $\alpha$) = (10000, 2)

• H1: ($\theta$, $\alpha$) ≠ (10000, 2)

• lnLR = 2(-10034.660 + 10035.623) = 1.926

• Accept H0

Page 10:

Hypothesis Testing Example

• Significance level = 5%

• $\chi^2$ critical value = 5.991 (2 degrees of freedom)

• H0: ($\theta$, $\alpha$) = (10000, 1.7)

• H1: ($\theta$, $\alpha$) ≠ (10000, 1.7)

• lnLR = 2(-10034.660 + 10045.975) =22.631

• Reject H0
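The two tests can be replayed with a few lines of Python. This is a sketch under stated assumptions: the function names are invented for the example, the log-likelihoods are the ones quoted in the lnLR lines of the two slides, and the closed-form critical value applies only to the two-parameter case (a chi-square with 2 degrees of freedom has survival function exp(-x/2)).

```python
import math

def chi2_critical_value_2df(significance):
    """Critical value of a chi-square with 2 degrees of freedom.

    With 2 degrees of freedom the survival function is exp(-x / 2),
    so the critical value is -2 * ln(significance).
    """
    return -2.0 * math.log(significance)

def likelihood_ratio_test(loglik_mle, loglik_h0, significance=0.05):
    """Return (statistic, critical value, reject H0?) for a two-parameter model."""
    statistic = 2.0 * (loglik_mle - loglik_h0)
    critical = chi2_critical_value_2df(significance)
    return statistic, critical, statistic > critical

# Log-likelihood values quoted on the slides.
LOGLIK_MLE = -10034.660
for label, loglik_h0 in [("(10000, 2)", -10035.623), ("(10000, 1.7)", -10045.975)]:
    stat, crit, reject = likelihood_ratio_test(LOGLIK_MLE, loglik_h0)
    print(f"H0 {label}: lnLR = {stat:.3f}, critical = {crit:.3f}, reject = {reject}")
```

For models with a different number of parameters, the critical value would have to come from a general chi-square quantile function instead.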

Page 11:

Confidence Region

• An X% confidence region corresponds to the hypothesis test at the (100 − X)% significance level.

• It is the set of all parameters ($\theta$, $\alpha$) for which the corresponding H0 is not rejected.

• For the 95% confidence region:
  – (10000, 2.0) is in.
  – (10000, 1.7) is out.

Page 12:

Confidence Region

[Figure: confidence region in the (Theta, Alpha) plane. Outer ring 95%, inner ring 50%. Horizontal axis: Theta, 0 to 15000. Vertical axis: Alpha, 0.0 to 2.5.]

Page 13:

Grouped Data

• Data grouped into four intervals
  – 562 under 5000
  – 181 between 5000 and 10000
  – 134 between 10000 and 20000
  – 123 over 20000

• Same data as before, only less information is given.
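With grouped data the likelihood is built from interval probabilities rather than densities, and membership in a confidence region follows from the same likelihood ratio test. The sketch below uses the four counts from this slide; the grid search (and its ranges) is an assumption standing in for a proper optimizer, and the function names are invented for the example.

```python
import math

BOUNDS = [0.0, 5000.0, 10000.0, 20000.0, math.inf]   # interval end points
COUNTS = [562, 181, 134, 123]                         # claims per interval

def pareto_sf(x, theta, alpha):
    """Survival function 1 - F(x) of the two-parameter Pareto."""
    if math.isinf(x):
        return 0.0
    return (theta / (x + theta)) ** alpha

def grouped_loglik(theta, alpha, bounds=BOUNDS, counts=COUNTS):
    """Multinomial log-likelihood of the interval counts (up to a constant)."""
    total = 0.0
    for lo, hi, n in zip(bounds[:-1], bounds[1:], counts):
        prob = pareto_sf(lo, theta, alpha) - pareto_sf(hi, theta, alpha)
        total += n * math.log(prob)
    return total

def approx_max_loglik():
    """Crude grid search; the grid ranges are assumptions for illustration."""
    thetas = [500.0 * i for i in range(1, 61)]        # 500 .. 30000
    alphas = [0.05 * j for j in range(1, 101)]        # 0.05 .. 5.00
    return max(grouped_loglik(t, a) for t in thetas for a in alphas)

def in_confidence_region(theta, alpha, max_loglik, level=0.95):
    """True if H0: (theta, alpha) is not rejected at significance 1 - level."""
    critical = -2.0 * math.log(1.0 - level)           # chi-square, 2 degrees of freedom
    return 2.0 * (max_loglik - grouped_loglik(theta, alpha)) <= critical

best = approx_max_loglik()
print(in_confidence_region(10000.0, 2.0, best))
```

Evaluating `in_confidence_region` over a fine (theta, alpha) grid and plotting the boundary would trace out a ring like the ones shown in the confidence region figures.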

Page 14:

Confidence Region for Grouped Data

[Figure: confidence region in the (Theta, Alpha) plane for the grouped data. Outer ring 95%, inner ring 50%. Horizontal axis: Theta, 0 to 15000. Vertical axis: Alpha, 0.0 to 2.5.]

Page 15:

Confidence Region for Ungrouped Data

[Figure: confidence region in the (Theta, Alpha) plane for the ungrouped data. Outer ring 95%, inner ring 50%. Horizontal axis: Theta, 0 to 15000. Vertical axis: Alpha, 0.0 to 2.5.]

Page 16:

Estimation with Model Uncertainty
COTOR Challenge – November 2004

• COTOR published 250 claims
  – Distributional form not revealed to participants

• Participants were challenged to estimate the cost of a $5M x $5M layer.

• Estimate confidence interval for pure premium
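A "$5M x $5M" layer covers the part of each claim between $5M and $10M, so its expected loss per claim is the limited expected value at $10M minus that at $5M. The sketch below illustrates the calculation with the Pareto distribution from the earlier example, purely as an assumption: the COTOR data were not Pareto-distributed by any statement in this talk, and the parameter values are illustrative only.

```python
def pareto_lev(d, theta, alpha):
    """Limited expected value E[min(X, d)] for the two-parameter Pareto (alpha != 1)."""
    return theta / (alpha - 1.0) * (1.0 - (theta / (d + theta)) ** (alpha - 1.0))

def layer_severity(limit, retention, theta, alpha):
    """Expected loss per claim in the layer `limit` excess of `retention`."""
    return pareto_lev(retention + limit, theta, alpha) - pareto_lev(retention, theta, alpha)

def layer_severity_numeric(limit, retention, theta, alpha, steps=10_000):
    """Cross-check: integrate the survival function over the layer (midpoint rule)."""
    h = limit / steps
    return h * sum((theta / (retention + (i + 0.5) * h + theta)) ** alpha
                   for i in range(steps))

# $5M x $5M layer under illustrative Pareto parameters (not the COTOR answer).
print(layer_severity(5e6, 5e6, theta=10_000.0, alpha=2.0))
print(layer_severity_numeric(5e6, 5e6, theta=10_000.0, alpha=2.0))
```

Multiplying the per-claim layer severity by an expected claim count would turn it into a layer pure premium.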

Page 17:

You want to fit a distribution to 250 Claims

• Knee-jerk first reaction: plot a histogram.

[Figure: Histogram of COTOR Data. Horizontal axis: Claim Amount (×10^6), 0 to 7. Vertical axis: Count, 0 to 250.]

Page 18:

This will not do! Take logs

• And fit some standard distributions.

[Figure: density of the log of claim amounts (lcotor data) with fitted lognormal, gamma and Weibull curves. Horizontal axis: Log of Claim Amounts, 6 to 16. Vertical axis: Density, 0 to 0.35.]

Page 19:

Still looks skewed. Take double logs.

• And fit some standard distributions.

[Figure: density of the log log of claim amounts (llcotor data) with fitted lognormal, gamma and Weibull curves. Horizontal axis: log log of Claim Amounts, 1.8 to 2.8. Vertical axis: Density, 0 to 2.5.]

Page 20:

Still looks skewed. Take triple logs.

• Still some skewness.

• Lognormal and gamma fits look somewhat better.

[Figure: density of the triple log of claim amounts (lllcotor data) with fitted lognormal, gamma and normal curves. Horizontal axis: Triple log of Claim Amounts, 0.55 to 1. Vertical axis: Density, 0 to 5.]

Page 21:

Candidate #1: Quadruple lognormal

Distribution: Lognormal
Log likelihood: 283.496
Domain: 0 < y < Inf
Mean: 0.738351
Variance: 0.006189

Parameter   Estimate    Std. Err.
mu          -0.30898    0.00672
sigma       0.106252    0.004766

Estimated covariance of parameter estimates:
            mu          sigma
mu          4.52E-05    1.31E-19
sigma       1.31E-19    2.27E-05

Page 22:

Candidate #2: Triple loggamma

Distribution: Gamma
Log likelihood: 282.621
Domain: 0 < y < Inf
Mean: 0.738355
Variance: 0.00615

Parameter   Estimate    Std. Err.
a           88.6454     7.91382
b           0.008329    0.000746

Estimated covariance of parameter estimates:
            a           b
a           62.6286     -0.00588
b           -0.00588    5.56E-07

Page 23:

Candidate #3: Triple lognormal

Page 24:

All three cdf’s are within confidence interval for the quadruple lognormal.

[Figure: empirical cdf of the triple log of claim amounts (lllcotor data) with the Lognormal fit, confidence bounds (Lognormal), Gamma fit and Normal fit. Horizontal axis: Triple log of Claim Amounts, 0.55 to 1. Vertical axis: Cumulative probability, 0 to 1.]

Page 25:

Elements of Solution

• Three candidate models
  – Quadruple lognormal
  – Triple loggamma
  – Triple lognormal

• Parameter uncertainty within each model

• Construct a series of models consisting of
  – One of the three models.
  – Parameters within a broad confidence interval for each model.
  – 7803 possible models

Page 26:

Steps in Solution

• Calculate likelihood (given the data) for each model.

• Use Bayes’ Theorem to calculate posterior probability for each model
  – Each model has equal prior probability.

Posterior{model|data} ∝ Likelihood{data|model} × Prior{model}
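The proportionality above becomes a set of probabilities once the products are normalized to sum to one. A minimal sketch, assuming the likelihoods are held as log-likelihoods (the function name and the three example values are hypothetical, not figures from the talk):

```python
import math

def posterior_probabilities(log_likelihoods, priors=None):
    """Posterior model probabilities from log-likelihoods and prior weights."""
    n = len(log_likelihoods)
    if priors is None:
        priors = [1.0 / n] * n          # equal prior probability for each model
    # Subtract the largest log-likelihood before exponentiating so that
    # exp() neither overflows nor underflows to zero for every model.
    top = max(log_likelihoods)
    weights = [math.exp(ll - top) * p for ll, p in zip(log_likelihoods, priors)]
    total = sum(weights)
    return [w / total for w in weights]

# Hypothetical log-likelihoods for three candidate models.
print(posterior_probabilities([283.5, 282.6, 281.9]))
```

Working on the log scale matters once there are thousands of candidate models, because raw likelihoods of a 250-claim sample can differ by many orders of magnitude.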

Page 27:

Steps in Solution

• Calculate layer pure premium for 5 x 5 layer for each model.

• Expected pure premium is the posterior probability weighted average of the model layer pure premiums.

• Second moment of pure premium is the posterior probability weighted average of the model layer pure premiums squared.

Page 28:

CDF of Layer Pure Premium

Probability that layer pure premium ≤ x equals the sum of posterior probabilities of the models for which the model layer pure premium is ≤ x.
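The rule above defines a discrete predictive distribution over the candidate models, from which percentiles such as the 2.5% and 97.5% points can be read. A small sketch, with hypothetical premiums and posterior probabilities standing in for the real model set:

```python
def predictive_cdf(x, pure_premiums, posteriors):
    """P(layer pure premium <= x): total posterior weight of models at or below x."""
    return sum(p for pp, p in zip(pure_premiums, posteriors) if pp <= x)

def predictive_percentile(q, pure_premiums, posteriors):
    """Smallest model pure premium whose cumulative posterior weight reaches q."""
    cumulative = 0.0
    for pp, p in sorted(zip(pure_premiums, posteriors)):
        cumulative += p
        if cumulative >= q:
            return pp
    return max(pure_premiums)

# Hypothetical layer pure premiums and posterior probabilities for four models.
premiums = [2000.0, 5000.0, 8000.0, 15000.0]
posteriors = [0.1, 0.4, 0.3, 0.2]
print(predictive_cdf(5000.0, premiums, posteriors))
print(predictive_percentile(0.95, premiums, posteriors))
```

With only four models the percentiles are coarse; with thousands of models the step function approaches a smooth curve.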

Page 29:

Numerical Results

Mean                 6,430
Standard Deviation   3,370
Median               5,780
Range
  Low at 2.5%        1,760
  High at 97.5%     14,710

Page 30:

Histogram of Predictive Pure Premium

[Figure: Predictive Distribution of the Layer Pure Premium. Horizontal axis: Low End of Amount (000), 0 to 25. Vertical axis: Density, 0.00 to 0.16.]

Page 31:

Example with Insurance Data

• Continue with Bayesian Estimation

• Liability insurance claim severity data

• Prior distributions derived from models based on individual insurer data

• Prior models reflect the maturity of claim data used in the estimation

Page 32:

Initial Insurer Models

• Selected 20 insurers
  – Claim count in the thousands

• Fit mixed exponential distribution to the data of each insurer

• Initial fits had volatile tails

• Truncation issues
  – Do small claims predict likelihood of large claims?

Page 33:

Initial Insurer Models

[Figure: limited average severity curves for the initial insurer models. Horizontal axis: Loss Amount - x, 1,000 to 10,000,000 (ticks at powers of ten). Vertical axis: Limited Average Severity, 0 to 45,000.]

Page 34:

Low Truncation Point

[Figure: Horizontal axis: Probability That Loss is Over 5,000, 0.00 to 0.40. Vertical axis: 500 x 500 Layer Average Severity, 0 to 5,000.]

Page 35:

High Truncation Point

[Figure: Horizontal axis: Probability That Loss is Over 100,000, 0.00 to 0.07. Vertical axis: 500 x 500 Layer Average Severity, 0 to 5,000.]

Page 36:

Selections Made

• Truncation point = $100,000

• Family of cdf’s that has “correct” behavior
  – Admittedly the definition of “correct” is debatable, but
  – The choices are transparent!

Page 37:

Selected Insurer Models

[Figure: limited average severity curves for the selected insurer models. Horizontal axis: Loss Amount - x, 100,000 to 10,000,000. Vertical axis: Limited Average Severity, 0 to 45,000.]

Page 38:

Selected Insurer Models

[Figure: Horizontal axis: Probability That Loss is Over 100,000, 0.00 to 0.05. Vertical axis: 500 x 500 Layer Average Severity, 0 to 6,000.]

Page 39:

Each model consists of

1. The claim severity distribution for all claims settled within 1 year

2. The claim severity distribution for all claims settled within 2 years

3. The claim severity distribution for all claims settled within 3 years

4. The ultimate claim severity distribution for all claims

5. The ultimate limited average severity curve

Page 40:

Three Sample Insurers: Small, Medium and Large

• Each has three years of data

• Calculate likelihood functions
  – Most recent year with #1 on prior slide
  – 2nd most recent year with #2 on prior slide
  – 3rd most recent year with #3 on prior slide

• Use Bayes’ Theorem to calculate posterior probability of each model

Page 41:

Formulas for Posterior Probabilities

Model (m) Cell Probabilities

$$P_{i,AY,m} = \frac{F_{AY,m}(x_{i+1}) - F_{AY,m}(x_i)}{1 - F_{AY,m}(x_1)}$$

Likelihood (m)

$$l_m = \prod_{i=1}^{9} \prod_{AY=1}^{3} \left(P_{i,AY,m}\right)^{n_{i,AY}}$$

where $n_{i,AY}$ is the number of claims in interval $i$ for accident year $AY$.

Using Bayes’ Theorem

$$\text{Posterior}(m) \propto l_m \times \text{Prior}(m)$$
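The cell probabilities and likelihood can be sketched as below. The interval lower bounds and the claim counts are the ones listed in Exhibit 1 for the small insurer; everything else is an assumption for illustration, in particular the mixed exponential weights and means, which are made up and are not any insurer's fitted model.

```python
import math

# Interval lower bounds x_1..x_9 (from Exhibit 1); the last interval is open-ended.
LOWER_BOUNDS = [100_000, 200_000, 300_000, 400_000, 500_000,
                750_000, 1_000_000, 1_500_000, 2_000_000]

def mixed_exponential_cdf(weights, means):
    """Build F(x) = 1 - sum_j w_j * exp(-x / mean_j); parameters here are made up."""
    def cdf(x):
        return 1.0 - sum(w * math.exp(-x / m) for w, m in zip(weights, means))
    return cdf

def cell_probabilities(cdf, bounds=LOWER_BOUNDS):
    """P_i = (F(x_{i+1}) - F(x_i)) / (1 - F(x_1)), conditioning on a loss above x_1."""
    edges = [cdf(b) for b in bounds] + [1.0]          # F at +infinity is 1
    tail = 1.0 - edges[0]
    return [(edges[i + 1] - edges[i]) / tail for i in range(len(bounds))]

def log_likelihood(cdfs_by_year, counts_by_year):
    """ln l_m = sum over accident years and intervals of n_{i,AY} * ln P_{i,AY,m}."""
    total = 0.0
    for cdf, counts in zip(cdfs_by_year, counts_by_year):
        for p, n in zip(cell_probabilities(cdf), counts):
            if n > 0:
                total += n * math.log(p)
    return total

# Claim counts by settlement lag (1, 1-2, 1-3) from Exhibit 1, small insurer.
counts = [[15, 2, 1, 2, 0, 0, 0, 0, 0],
          [40, 10, 1, 0, 2, 0, 2, 0, 0],
          [76, 26, 11, 3, 8, 0, 0, 0, 0]]
# One hypothetical severity cdf per settlement lag for a single candidate model m.
model_cdfs = [mixed_exponential_cdf([0.9, 0.1], [50_000.0, 300_000.0]),
              mixed_exponential_cdf([0.9, 0.1], [50_000.0, 400_000.0]),
              mixed_exponential_cdf([0.9, 0.1], [50_000.0, 500_000.0])]
print(log_likelihood(model_cdfs, counts))
```

Repeating the last step for each of the 20 prior models gives the log-likelihoods that feed Bayes’ Theorem.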

Page 42:

Results (taken from paper)

Exhibit 1 – Small Insurer

Claim data:

Lags   Interval Lower Bound   Claim Count
1      100,000                15
1      200,000                2
1      300,000                1
1      400,000                2
1      500,000                0
1      750,000                0
1      1,000,000              0
1      1,500,000              0
1      2,000,000              0
1-2    100,000                40
1-2    200,000                10
1-2    300,000                1
1-2    400,000                0
1-2    500,000                2
1-2    750,000                0
1-2    1,000,000              2
1-2    1,500,000              0
1-2    2,000,000              0
1-3    100,000                76
1-3    200,000                26
1-3    300,000                11
1-3    400,000                3
1-3    500,000                8
1-3    750,000                0
1-3    1,000,000              0
1-3    1,500,000              0
1-3    2,000,000              0

Prior models and layer pure premiums:

Prior Model #         Posterior Probability   $500K x $500K   $1M x $1M
1                     0.016406                763             541
2                     0.041658                911             645
3                     0.089063                1,153           682
4                     0.130281                1,224           796
5                     0.157593                1,281           912
6                     0.110614                1,390           978
7                     0.075702                1,494           1,040
8                     0.053226                1,587           1,095
9                     0.080525                1,849           1,328
10                    0.104056                2,069           1,523
11                    0.129925                2,417           1,828
12                    0.010896                2,598           1,916
13                    0.000007                2,788           1,922
14                    0.000009                3,004           2,124
15                    0.000011                3,202           2,309
16                    0.000013                3,382           2,477
17                    0.000014                3,543           2,628
18                    0                       4,058           3,211
19                    0                       4,663           3,784
20                    0                       5,354           4,440
Posterior Mean                                1,572           1,113
Posterior Std. Dev.                           463             385

Page 43:

Formulas for Ultimate Layer Pure Premium

• Use #5 on the “Each model consists of” slide (Page 39) to calculate ultimate layer pure premium

$$\text{Posterior Mean} = \sum_{m=1}^{20} \text{Layer Pure Premium}(m) \times \text{Posterior}(m)$$

$$\text{Posterior Standard Deviation} = \sqrt{\sum_{m=1}^{20} \text{Layer Pure Premium}(m)^2 \times \text{Posterior}(m) - \text{Posterior Mean}^2}$$
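As a check on the two formulas, the sketch below applies them to the posterior probabilities and $500K x $500K layer pure premiums transcribed from Exhibit 1 (small insurer). The function name is invented for the example. Because the exhibit shows rounded inputs, the result should land close to, but not necessarily exactly on, the exhibit's posterior mean and standard deviation.

```python
import math

# Posterior probabilities and $500K x $500K layer pure premiums for the
# 20 prior models, transcribed from Exhibit 1 (Small Insurer).
POSTERIOR = [0.016406, 0.041658, 0.089063, 0.130281, 0.157593,
             0.110614, 0.075702, 0.053226, 0.080525, 0.104056,
             0.129925, 0.010896, 0.000007, 0.000009, 0.000011,
             0.000013, 0.000014, 0.0, 0.0, 0.0]
LAYER_PP = [763, 911, 1153, 1224, 1281, 1390, 1494, 1587, 1849, 2069,
            2417, 2598, 2788, 3004, 3202, 3382, 3543, 4058, 4663, 5354]

def posterior_mean_and_sd(values, probs):
    """Posterior-weighted mean and standard deviation across the models."""
    mean = sum(v * p for v, p in zip(values, probs))
    second_moment = sum(v * v * p for v, p in zip(values, probs))
    return mean, math.sqrt(second_moment - mean * mean)

mean, sd = posterior_mean_and_sd(LAYER_PP, POSTERIOR)
print(round(mean), round(sd))
```

The same function applies unchanged to the $1M x $1M column or to any other statistic computed per model.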

Page 44:

Results

• All insurers were simulated from the same population.

• Posterior standard deviation decreases with insurer size.

Layer Pure Premium:

                      Small Insurer               Medium Insurer              Large Insurer
                      $500K x $500K   $1M x $1M   $500K x $500K   $1M x $1M   $500K x $500K   $1M x $1M
Posterior Mean        1,572           1,113       1,344           909         1,360           966
Posterior Std. Dev.   463             385         278             245         234             188

Page 45:

Possible Extensions

• Obtain model for individual insurers

• Obtain data for insurer of interest

• Calculate likelihood, Pr{data|model}, for each insurer’s model.

• Use Bayes’ Theorem to calculate posterior probability of each model

• Calculate the statistic of choice using models and posterior probabilities
  – e.g. Loss reserves