katrien antonio - chaire damichaire-dami.fr/files/2016/09/antonio-katrien.pdf · 2016-09-19 ·...

Actuaries and predictive modeling: past, present and future

Katrien Antonio

Faculty of Economics and Business, LRisk Research Center, KU Leuven & UvA

3rd European Actuarial Journal Conference, Lyon

September 6, 2016


Goals of this talk

Focus on two case-studies using a blend of analytic techniques.

(1) Using risk factors in P&C pricing: a data driven strategy with GAMs, regression trees and GLMs.

(2) Unraveling the predictive power of telematics data in car insurance pricing.

A blend of techniques/learning outcomes/buzz words from

(recent) past, present and future?

K. Antonio, KU Leuven & UvA Goals of this talk 2 / 35

Data science and predictive modeling

(1) Schutt & O’Neil (2013), Doing Data Science: Straight Talk from the Frontline.

What is eyebrow-raising about big data and data science?

‘The hype is crazy.’

Getting past the hype?

‘There might be some meat in the data science sandwich’;

‘Data science, as it’s practiced, is a blend of Red-Bull-fueled hacking and espresso-inspired statistics.’

(2) Prof. David Donoho (2015), 50 years of data science.


Actuarial pricing models in P&C insurance

• (Past)

One-way and two-way analysis, minimum bias (Bailey & Simon, 1960).

• (Present)

Risk classification in competitive markets using Generalized Linear Models for frequency and severity.

• (Future) Challenges?

- high-dimensional variables (e.g. territory, vehicle groups);

- (structured and unstructured) telematics data;

- keep the model explainable to clients, regulators, ICT, ...;

- be aware of actuarial features!


Actuarial pricing models in P&C insurance: a blend of?

de Jong & Heller; Ohlsson & Johansson; Denuit et al.

Hastie, Tibshirani & Friedman; James et al.; Kuhn & Johnson


Using risk factors in P&C pricing

a data driven strategy with GAMs, regression trees and GLMs.

Katrien Antonio, KU Leuven & UvA

Maxime Clijsters, AG Insurance

Roel Henckaerts, KU Leuven

Roel Verbelen, KU Leuven

GAM claim frequency model as starting point

• Our solution starts with an exhaustive search using GAMs.

• Best GAM according to AIC/BIC:

$$
\log(E(\text{nclaims})) = \log(\text{exposure}) + \beta_0 + \beta_1\,\text{coverage}_{PO} + \beta_2\,\text{coverage}_{FO} + \beta_3\,\text{fuel}_{diesel} + f_1(\text{ageph}) + f_2(\text{bm}) + f_3(\text{power}) + f_4(\text{ageph}, \text{power}) + f_5(\text{long}, \text{lat}),
$$

which combines an offset with categorical, continuous, interaction and spatial risk factors.
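Because the link is logarithmic, the offset turns exposure into a multiplier of the expected claim count. A minimal Python sketch of that arithmetic follows; the function name and all numeric values are hypothetical illustrations, not estimates from the talk.

```python
import math

def expected_claims(exposure, intercept, effects):
    """Expected number of claims under a log link with log(exposure) as offset.

    `effects` holds the already-evaluated terms of the predictor for one
    policy (categorical coefficients and smooth-function values).
    """
    linear_predictor = math.log(exposure) + intercept + sum(effects)
    return math.exp(linear_predictor)

# Hypothetical policy: illustrative numbers only, not fitted values.
full_year = expected_claims(1.0, -2.0, [0.10, -0.05, 0.20])
half_year = expected_claims(0.5, -2.0, [0.10, -0.05, 0.20])
```

With identical risk factors, a policy observed for half a year has half the expected number of claims of a policy observed for a full year.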

[Figure: fitted GAM effects. Panels: single effect of ageph, single effect of bm, single effect of power, interaction effect of ageph and power, spatial effect.]

From GAMs to GLMs

We choose the number of geo-classes by optimizing BIC for the GAM with binned spatial effect.

[Figure: spatial effect binned into seven geo-classes by six binning methods. Legend intervals per method:]

Equal intervals: [−0.437, −0.328), [−0.328, −0.219), [−0.219, −0.109), [−0.109, 0.000104), [0.000104, 0.11), [0.11, 0.219), [0.219, 0.328]

Quantile binning: [−0.437, −0.201), [−0.201, −0.136), [−0.136, −0.0846), [−0.0846, −0.0258), [−0.0258, 0.0207), [0.0207, 0.119), [0.119, 0.328]

Complete linkage: [−0.437, −0.382), [−0.382, −0.278), [−0.278, −0.121), [−0.121, −0.0278), [−0.0278, 0.0475), [0.0475, 0.169), [0.169, 0.328]

Single linkage: [−0.437, −0.415), [−0.415, −0.382), [−0.382, −0.359), [−0.359, −0.328), [−0.328, −0.318), [−0.318, 0.325), [0.325, 0.328]

K-means clustering: [−0.437, −0.318), [−0.318, −0.218), [−0.218, −0.134), [−0.134, −0.0448), [−0.0448, 0.0553), [0.0553, 0.169), [0.169, 0.328]

Fisher-Jenks: [−0.437, −0.255), [−0.255, −0.146), [−0.146, −0.0618), [−0.0618, 0.015), [0.015, 0.103), [0.103, 0.214), [0.214, 0.328]
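The two simplest rules in this comparison, equal intervals and quantile binning, can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names are hypothetical, the quantile rule uses a simple order-statistic definition, and the clustering-based methods (complete and single linkage, K-means, Fisher-Jenks) are not covered.

```python
from bisect import bisect_right

def equal_interval_breaks(values, k):
    """k + 1 break points splitting [min, max] into k bins of equal width."""
    lo, hi = min(values), max(values)
    width = (hi - lo) / k
    return [lo + i * width for i in range(k + 1)]

def quantile_breaks(values, k):
    """k + 1 break points taken from order statistics (simple quantile rule)."""
    s = sorted(values)
    n = len(s)
    return [s[(i * (n - 1)) // k] for i in range(k + 1)]

def assign_class(value, breaks):
    """Index of the bin containing value; bins are [b0, b1), ..., [b_{k-1}, b_k]."""
    k = len(breaks) - 1
    idx = bisect_right(breaks, value) - 1
    return min(max(idx, 0), k - 1)

# Range of the spatial effect as read from the legends above.
breaks = equal_interval_breaks([-0.437, 0.328], 7)
```

With the range read from the legends, the equal-interval break points agree with the first legend up to rounding of the end points.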


From GAMs to GLMs using evolutionary trees

• We fit evolutionary trees to the single and interaction effects:

$$\hat f_1(\text{ageph}), \quad \hat f_2(\text{bm}), \quad \hat f_3(\text{power}), \quad \hat f_4(\text{ageph}, \text{power}).$$

• We bin these effects by picking the best tree according to the evaluation criterion

$$N \cdot \log(\text{wMSE}) + 4 \cdot \alpha \cdot (M + 1) \cdot \log(N).$$

• The evaluation criterion balances goodness-of-fit (wMSE) and complexity of the tree (M), while accounting for portfolio composition as weights.

• We tune α and then use the optimal tree according to this evaluation criterion.
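The evaluation criterion is straightforward to compute once a tree's fitted values are available. The sketch below is an illustration only: it assumes that wMSE is the weight-normalized mean squared error and that M counts the terminal nodes of the tree, neither of which is spelled out on the slide, and the function names are hypothetical.

```python
import math

def weighted_mse(y, y_hat, w):
    """Weighted mean squared error, with weights normalized to sum to one."""
    total_weight = sum(w)
    return sum(wi * (yi - fi) ** 2 for yi, fi, wi in zip(y, y_hat, w)) / total_weight

def tree_criterion(wmse, n_obs, m, alpha):
    """N * log(wMSE) + 4 * alpha * (M + 1) * log(N); lower is better."""
    return n_obs * math.log(wmse) + 4.0 * alpha * (m + 1) * math.log(n_obs)

# For a fixed fit, a larger tree is penalized more heavily.
small_tree = tree_criterion(0.5, 100, 3, 1.0)
large_tree = tree_criterion(0.5, 100, 10, 1.0)
```

A larger α favors coarser binnings, since each additional unit of M must buy a larger reduction in wMSE.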



[Figure: single effects of ageph, bm and power; interaction effect of ageph and power; residual over ageph and power.]



• Hence, we obtain a fully data-driven binning procedure.

• We use a blend of techniques:

- trees, genetic algorithms (machine learning);

- GAMs (flexible statistical modeling);

- GLMs (the actuarial comfort zone).


Unraveling the predictive power of telematics data in car insurance pricing.

Roel Verbelen, KU Leuven

Katrien Antonio, KU Leuven & UvA

Gerda Claeskens, KU Leuven

Telematics insurance: the future?

• The Economist, February 23, 2013, How’s my driving?

• “Underwriters have traditionally used crude demographic data such as age, location and sex to separate the testosterone-fuelled boy racers from their often tamer female counterparts. [...] By monitoring their customers’ motoring habits, underwriters can increasingly distinguish between drivers who are safe on the road from those who merely seem safe on paper. Many think that telematics insurance will become the industry norm.”


New rating variables due to telematics technology

Telematics data collected in each trip: driving habits and driving style

• the distance driven;

• the time of day;

• how long you have been driving;

• the location;

• the speed/speeding;

• harsh or smooth braking;

• aggressive acceleration or deceleration;

• your cornering and parking skills.

Possibly combined with:

• road maps;

• weather information;

• traffic information.




Unique telematics data set from a Belgian insurer

• Telematics data collected between 2010 and 2014.

• Belgian MTPL product with telematics black box targeted at young drivers.

• Daily CSV files with trip info, aggregated on a daily basis:

  - number of trips;

  - meters traveled (in total) and

    • divided by time slot: 6:00-9:30, 9:30-16:00, 16:00-19:00, 19:00-22:00, 22:00-6:00;

    • divided by road type: motorways, urban area, abroad, any other type.
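To illustrate how such daily records can be turned into policy-level predictors, the sketch below sums trips and meters over days and expresses the distance per time slot as shares of the total. The record layout, the slot labels and the sample values are hypothetical; the actual file format is not described in the slides.

```python
from collections import defaultdict

# Time slots as listed on the slide; the label strings are an assumption.
TIME_SLOTS = ("06:00-09:30", "09:30-16:00", "16:00-19:00", "19:00-22:00", "22:00-06:00")

def summarize(daily_rows):
    """Aggregate daily records into total trips, total km and time-slot shares."""
    trips = 0
    meters = defaultdict(float)
    for row in daily_rows:
        trips += row["trips"]
        for slot in TIME_SLOTS:
            meters[slot] += row["meters"].get(slot, 0.0)
    total = sum(meters[slot] for slot in TIME_SLOTS)
    shares = {slot: (meters[slot] / total if total > 0 else 0.0) for slot in TIME_SLOTS}
    return {"trips": trips, "total_km": total / 1000.0, "shares": shares}

# Two hypothetical days for one policyholder (illustrative values only).
days = [
    {"trips": 2, "meters": {"06:00-09:30": 12000.0, "16:00-19:00": 12000.0}},
    {"trips": 1, "meters": {"22:00-06:00": 8000.0}},
]
summary = summarize(days)
```

The shares sum to one by construction, which is what makes them compositional rather than ordinary continuous predictors.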



[Diagram: data flow between the insured, the insurer and the data provider, involving policy information, raw telematics information and aggregated telematics information.]



[Figure: distance (in km) against date, 2010 to 2015.]


Description of the data

The resulting data set has 33 259 observations:

• 10 406 unique policyholders;

• 17 681 years of insured periods;

• 0.0838 claims per insured year;

• 1481 MTPL claims at fault;

• 297 million kilometers driven;

• 0.0499 claims per 10 000 km.
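The two claim frequencies follow directly from the totals in this list; a short Python check of the arithmetic (variable names are illustrative, and the kilometer total is the rounded figure from the slide):

```python
claims = 1481                 # MTPL claims at fault
insured_years = 17681         # years of insured periods
kilometers = 297_000_000      # kilometers driven, rounded as on the slide

claims_per_insured_year = claims / insured_years
claims_per_10000_km = claims / kilometers * 10_000
```

Rounded to four decimals these give 0.0838 and 0.0499: the same claims are related to two different exposure measures, time and distance.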

What is the best measure of exposure to risk?

[Figure: densities of the policy period (days) and of the distance (1000 km).]


Policy information

[Figure: policy information. Densities of age, experience, age of the vehicle and Kwatt; proportions by bonus-malus level, gender (male, female), fuel (diesel, petrol) and material damage cover (yes, no); proportion per km2, with legend classes from 3.69e−07 to 0.000789.]


Telematics information


Predictor sets

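[Diagram: the four predictor sets (Classic, Time hybrid, Meter hybrid, Telematics), arranged by the information used (policy information, telematics information) and by rating basis (time based rating, meter based rating).]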


Generalized additive models

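We use GAMs (Wood, 2006):

$$N_{it} \sim \text{POI}(\mu_{it} = \exp(\eta_{it}))$$

$$\eta_{it} = \text{offset} + \eta^{cat}_{it} + \eta^{cont}_{it} + \eta^{spatial}_{it} + \eta^{re}_{it} + \eta^{comp}_{it}$$

$$\eta^{cat}_{it} + \eta^{cont}_{it} + \eta^{spatial}_{it} = Z_{it}\beta + \sum_{j=1}^{J} f_j(x_{jit}) + f_{spatial}(\text{lat}_{it}, \text{long}_{it}).$$

We combine categorical + continuous + spatial + compositional (new!) risk factors.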


Compositional data

• Satellite talk!

• Roel Verbelen’s talk, today, in Parallel Session 1 (Analytics), 11:00-11:30!

• More on our telematics paper, with a focus on the methodological contribution with respect to compositional data as predictors.

Roel Verbelen, KU Leuven


Model selection and assessment

• Exhaustive search with AIC as a global goodness-of-fit measure:

$$\text{AIC} = -2 \cdot \log L + 2 \cdot \text{EDF},$$

where EDF is the effective degrees of freedom.

• Predictive performance is assessed using proper scoring rules for count data (Czado et al., 2009) with 10-fold cross-validation:

$$S = \frac{1}{\sum_{i=1}^{I} T_i} \sum_{i=1}^{I} \sum_{t=1}^{T_i} s\left(\hat P^{-\kappa_{it}}_{it}, n_{it}\right),$$

where $\hat P^{-\kappa_{it}}_{it}$ is the predictive count distribution for observation $n_{it}$, estimated with the $\kappa_{it}$th part of the data removed.
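As an illustration of these two assessment tools, the sketch below computes AIC and three scores for a Poisson predictive distribution. It assumes the usual definitions of the logarithmic, quadratic and spherical scores for a predictive probability mass function p and observed count n (−log p_n, −2 p_n + ‖p‖² and −p_n / ‖p‖, lower being better), and it truncates the infinite sum in ‖p‖² at a hypothetical cut-off `k_max`. The cross-validation loop itself is not shown: the predicted means are taken as given.

```python
import math

def aic(log_likelihood, edf):
    """AIC = -2 log L + 2 EDF."""
    return -2.0 * log_likelihood + 2.0 * edf

def poisson_pmf(k, mu):
    """Poisson probability of count k for mean mu > 0."""
    return math.exp(-mu + k * math.log(mu) - math.lgamma(k + 1))

def squared_norm(mu, k_max=200):
    """Sum of squared Poisson probabilities, truncated at k_max (small means assumed)."""
    return sum(poisson_pmf(k, mu) ** 2 for k in range(k_max + 1))

def log_score(mu, n):
    """Logarithmic score: minus the log predictive probability of n."""
    return mu - n * math.log(mu) + math.lgamma(n + 1)

def quadratic_score(mu, n):
    """Quadratic score: -2 p_n + ||p||^2."""
    return -2.0 * poisson_pmf(n, mu) + squared_norm(mu)

def spherical_score(mu, n):
    """Spherical score: -p_n / ||p||."""
    return -poisson_pmf(n, mu) / math.sqrt(squared_norm(mu))

def mean_score(score, predicted_means, observed_counts):
    """Average score over all observations (out-of-fold means assumed given)."""
    scores = [score(mu, n) for mu, n in zip(predicted_means, observed_counts)]
    return sum(scores) / len(scores)
```

For example, `mean_score(log_score, mus, counts)` averages the logarithmic score over all policy periods, where `mus` would hold the out-of-fold predicted means.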


Results: discussion

• Telematics information improves predictive power.

- Gender no longer plays a role in models incorporating telematics information (cf. Gender Directive).

- Spatial heterogeneity decreases.

- The time hybrid model, which incorporates telematics through additional risk factors, is optimal.

- Experience is preferred over age of the driver.

- Compositional driving habits have a significant impact on riskiness.

- The classic approach performs worst.

• Similar results using negative binomial regression and using exposure as offset.


Results: model assessment

Predictor set   EDF     AIC            logS           QS                SphS
                        value   rank   value   rank   value      rank   value      rank
Classic         32.15   11 896  4      0.1790  4      −0.918 58  4      −0.958 22  4
Time hybrid     39.66   11 727  1      0.1764  1      −0.919 10  1      −0.958 37  1
Meter hybrid    41.47   11 736  2      0.1766  2      −0.919 08  2      −0.958 36  2
Telematics      18.05   11 890  3      0.1787  3      −0.918 60  3      −0.958 22  3

• Significant impact of the use of telematics data;

• Time hybrid is the best model according to AIC and all proper scoring rules;

• Using only telematics predictors is even better than the use of traditional rating variables.


Time hybrid - Policy information

[Figure: predictors in the time hybrid model, policy information: Time, Age, Experience, Sex, Material, Postal code, Bonus-malus, Age vehicle, Kwatt, Fuel.]


Time hybrid - Telematics information

[Figure: predictors in the time hybrid model, telematics information: Distance, Yearly distance, Average distance, Road type 1111, Road type 0111, Time slot, Week/weekend.]


Outlook

I encourage the blending idea ...

- of techniques (from machine learning, statistical modeling, actuarial science);

- of disciplines (from computer science, statistics, actuarial science, but also law);

- of people from practice and academia;

... to tackle the challenges imposed by structured and unstructured data in order to create insurance analytics, products and risk management of the future.
