predictive modeling in life insurance · 2018-09-13 · predictive modeling in life insurance...

SOA Predictive Analytics Seminar – Hong Kong 29 Aug. 2018 | Hong Kong

Session 5

Predictive Modeling in Life Insurance

Jingyi Zhang, Ph.D

9/12/2018

Data Scientist

Global Research & Data Analytics

Agenda

Overview of Predictive Modelling Techniques

o What actuaries already know - OLS

o Generalized Linear Model

o Decision Tree Model

o Data Clustering

Sharing - Predictive Modelling Projects

Introduction - RGA Data Science Team

What Is Predictive Modeling & Analytics?

1. Data: high-quality data

2. Modeling: statistical model

3. Prediction: business decisions

Predictive modeling & analytics is about driving business outcomes

What actuaries already know

• Are you familiar with the following terms?

○ Linear Regression

○ Ordinary Least Squares (OLS)

• Linear regression model: $Y = \beta_0 + \beta_1 X_1 + \cdots + \beta_p X_p + \varepsilon$

• $Y$: target variable; $X_i$: predictor variables; $\varepsilon$: error term/noise

• $\beta_i$: parameters to be estimated

• Underlying assumptions for a valid linear regression model

• Normality, $\varepsilon \sim N(0, \sigma^2)$

• Homogeneity, Y representative of population, independence between observations

• Linearity

⋯

Ordinary Least Squares (OLS)

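• For a simple regression $y_i = \beta_0 + \beta_1 x_i + \varepsilon_i$ with $n$ observations:

$$\hat\beta = \arg\min_{\beta} \sum_i \varepsilon_i^2 = \arg\min_{\beta} \sum_i (y_i - \hat y_i)^2 = \arg\min_{\beta} \sum_i (y_i - \beta_0 - \beta_1 x_i)^2$$

$$\hat\beta_1 = \frac{\sum_i (x_i - \bar x)(y_i - \bar y)}{\sum_i (x_i - \bar x)^2}, \qquad \hat\beta_0 = \bar y - \hat\beta_1 \bar x$$

• Identical to the maximum likelihood estimator if the errors follow a normal distribution:

$$\hat\beta_{ML} = \arg\max_{\beta} L(\beta, \sigma; y) = \arg\min_{\beta} \{-\ln L(\beta, \sigma; y)\} = \arg\min_{\beta} \sum_i (y_i - \beta_0 - \beta_1 x_i)^2$$

• More robust and consistent approach

• Use adjusted $R^2$ to compare the fit of models ($p$ = number of predictors)

$$R^2 = 1 - \frac{RSS}{TSS} = 1 - \frac{\sum_i (y_i - \hat y_i)^2}{\sum_i (y_i - \bar y)^2}, \text{ but it is biased}$$

$$\text{Adjusted } R^2 = 1 - (1 - R^2) \cdot \frac{n - 1}{n - p - 1}$$

• $R^2$: portion of TSS that has been explained by the OLS model

• $RSS/TSS$: portion of TSS attributable to the error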
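The closed-form estimates and the two fit measures can be checked numerically. The following is a minimal sketch in plain Python for a single predictor; the function names and the data values are illustrative assumptions, not taken from the slides.

```python
from statistics import mean

def ols_simple(x, y):
    """Closed-form OLS estimates (b0, b1) for y = b0 + b1 * x."""
    x_bar, y_bar = mean(x), mean(y)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    b1 = sxy / sxx
    b0 = y_bar - b1 * x_bar
    return b0, b1

def r_squared(x, y, b0, b1, p=1):
    """Return (R^2, adjusted R^2) for a fitted line; p = number of predictors."""
    n = len(y)
    y_bar = mean(y)
    rss = sum((yi - (b0 + b1 * xi)) ** 2 for xi, yi in zip(x, y))  # error part
    tss = sum((yi - y_bar) ** 2 for yi in y)                        # total
    r2 = 1 - rss / tss
    adj_r2 = 1 - (1 - r2) * (n - 1) / (n - p - 1)
    return r2, adj_r2

# Illustrative data only
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.1, 3.9, 6.2, 7.8, 10.1]
b0, b1 = ols_simple(x, y)
r2, adj_r2 = r_squared(x, y, b0, b1)
print(f"b0={b0:.3f}, b1={b1:.3f}, R2={r2:.4f}, adjusted R2={adj_r2:.4f}")
```

Adjusted R² is always at most R² because it penalizes the number of predictors, which is why it is the fairer measure when comparing models of different size.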

Why actuaries did not use OLS

• Processes are inherently linear, or can be well approximated by a linear model (LM)

• Effectiveness & completeness: OLS makes very efficient use of the data and gives good results with relatively small datasets

• Identical to maximum likelihood estimation

• Easy to understand and communicate: the theory is well understood, and results are easy to communicate

Great! But wait …

• There are several issues with OLS

o Validation of assumptions: normal with constant $\sigma^2$, independent, homogeneous

o Assumes unbounded data, whereas many insurance quantities are non-negative

• How about insurance applications? Distribution of data and variance structure

o Binomial for rates (mortality/lapse/UW, etc.): $\sigma^2 \propto r(1-r)$

o Poisson for claim counts: $\sigma^2 \propto$ mean

OLS is not applicable in insurance, but you already know a lot about modeling

What actuaries may not know: Machine Learning & Statistical Techniques

Decision Trees (CART)

Neural Networks / Deep Learning

Bayesian Analysis

Classification/Association

Analysis of Variance

Mixed Models

Survival Analysis

Cluster Analysis (e.g. K-Means)

Random Forest

XGBoost machine

Gradient Boosting

Ada Boosting

Support vector machine

Ensemble method

Survey Data Analysis

Feature engineering

Non-Parametric Analysis

Supervised vs. Unsupervised Learning

Supervised: estimate expected value of Y given values of X.

• GLM, Cox, CART, MARS, Random Forests, SVM, NN, etc.

Unsupervised: find interesting patterns amongst X; no target variable Y

• Clustering, Correlation / Principal Components / Factor Analysis

Classification vs. Regression

Classification: to segment observations into 2 or more categories

• fraud vs. legitimate, lapsed vs. retained, UW class

Regression: to predict a continuous amount.

• Dollars of loss for a policy, ultimate size of claim

Parametric vs. Non-Parametric

Parametric Statistics: probabilistic model of data

• Poisson Regression (claims count), Gamma (claim amount)

Non-Parametric Statistics: no probability model specified

• classification trees, NN

PM terminology

Generalized Linear Model (GLM)

• Major focus of PM in insurance industry

o Includes most distributions related to insurance

o Great flexibility in variance structure

o OLS model is a special case of GLM

• (Relatively) easy to understand and communicate

o Multiplicative model is intuitive & consistent with insurance practice

• 3 components

o Random component

o Systematic component

o Link function


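Random component

• Observations $Y_1, \ldots, Y_n$ are independent w/ density from the exponential family

$$f(y_i; \theta_i, \phi) = \exp\left\{ \frac{y_i \theta_i - b(\theta_i)}{a_i(\phi)} + c(y_i, \phi) \right\}$$

• From maximum likelihood theory,

$$E[Y_i] = \mu_i = b'(\theta_i), \qquad Var[Y_i] = b''(\theta_i)\, a_i(\phi) = V(\mu_i)\, a_i(\phi)$$

• Each distribution is specified in terms of mean & variance

• Variance is a function of mean

| | Normal | Poisson | Binomial | Gamma | Inverse Gaussian |
|---|---|---|---|---|---|
| Name | $N(\mu, \sigma^2)$ | $P(\mu)$ | $B(m, \pi)/m$ | $G(\mu, \nu)$ | $IG(\mu, \sigma^2)$ |
| Range | $(-\infty, +\infty)$ | $(0, +\infty)$ | $(0, 1)$ | $(0, +\infty)$ | $(0, +\infty)$ |
| $b(\theta)$ | $\theta^2/2$ | $e^{\theta}$ | $\ln(1 + e^{\theta})$ | $-\ln(-\theta)$ | $-(-2\theta)^{1/2}$ |
| $\mu(\theta) = b'(\theta)$ | $\theta$ | $e^{\theta}$ | $e^{\theta}/(1 + e^{\theta})$ | $-1/\theta$ | $(-2\theta)^{-1/2}$ |
| $V(\mu)$ | $1$ | $\mu$ | $\mu(1 - \mu)$ | $\mu^2$ | $\mu^3$ |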
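As a worked check of one column, the Poisson distribution can be written in exponential-family form, assuming the standard parameterization $f(y; \theta, \phi) = \exp\{(y\theta - b(\theta))/a(\phi) + c(y, \phi)\}$. The mean and variance then follow by differentiating $b(\theta)$:

```latex
\begin{aligned}
f(y; \mu) &= \frac{\mu^{y} e^{-\mu}}{y!} = \exp\{\, y \ln\mu - \mu - \ln y! \,\} \\
\theta &= \ln\mu, \qquad b(\theta) = e^{\theta}, \qquad a(\phi) = 1, \qquad c(y, \phi) = -\ln y! \\
E[Y] &= b'(\theta) = e^{\theta} = \mu, \qquad
Var[Y] = b''(\theta)\, a(\phi) = e^{\theta} = \mu
\end{aligned}
```

So for the Poisson case the variance function is $V(\mu) = \mu$: the variance is proportional to the mean.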

Why distribution will affect results

Variance of different distributions

• Gaussian: constant

• Poisson: ~ mean

• Gamma: ~ mean^2

[Figure: "GLM Different Distributions". Four panels plot the same data (x from 0.5 to 3, y from 0 to 9) against fitted GLMs: Normal, Poisson, Gamma, and all three overlaid. Points A, B and C are marked on the charts.]
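One way to see why the assumed distribution changes the fit is a one-parameter model with mean beta * x (identity link, no intercept). This is a simplification chosen for illustration, not necessarily the model behind the charts. The likelihood equation sum of x_i (y_i - mu_i) / V(mu_i) = 0 then has a closed-form solution for each variance function. The function name and data below are illustrative assumptions.

```python
def fit_slope(x, y, family):
    """Estimate beta in mu_i = beta * x_i (x_i > 0) under a given variance function."""
    if family == "normal":    # V(mu) = 1: ordinary least squares through the origin
        return sum(xi * yi for xi, yi in zip(x, y)) / sum(xi * xi for xi in x)
    if family == "poisson":   # V(mu) = mu: ratio of totals
        return sum(y) / sum(x)
    if family == "gamma":     # V(mu) = mu^2: average of the individual ratios y_i / x_i
        return sum(yi / xi for xi, yi in zip(x, y)) / len(x)
    raise ValueError(f"unknown family: {family}")

# Synthetic data for illustration only
x = [0.5, 1.0, 1.5, 2.0, 2.5, 3.0]
y = [1.4, 2.1, 4.8, 5.2, 8.6, 7.9]

for family in ("normal", "poisson", "gamma"):
    print(family, round(fit_slope(x, y, family), 4))
```

The three estimates differ because each variance assumption weights the observations differently: the Normal fit lets points with large x dominate, while the Gamma fit gives every observation the same relative weight.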


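Systematic component: a linear predictor $\eta_i = \sum_j x_{ij} \beta_j$ for observation i

Link function $g$: random & systematic components are connected by a smooth & invertible function, $\eta_i = g(\mu_i)$

Log link is unique in insurance applications: all parameters are multiplicative

$$\mu_i = \exp(\eta_i) = \exp\Big(\sum_j x_{ij} \beta_j\Big) = \prod_j \exp(x_{ij} \beta_j) = \prod_j \big(e^{\beta_j}\big)^{x_{ij}}$$

• Consistent with most insurance practices

• Intuitively easy to understand and communicate

| | Identity | Log | Logit | Reciprocal |
|---|---|---|---|---|
| $g(\mu)$ | $\mu$ | $\ln \mu$ | $\ln\frac{\mu}{1 - \mu}$ | $1/\mu$ |
| $g^{-1}(\eta)$ | $\eta$ | $e^{\eta}$ | $\frac{e^{\eta}}{1 + e^{\eta}}$ | $1/\eta$ |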
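The multiplicative reading of the log link can be verified directly: adding coefficients on the log scale is the same as multiplying rating factors. The coefficient values and factor names below are hypothetical, chosen only to make the arithmetic visible.

```python
import math

# Hypothetical coefficients on the log scale (not from any fitted model)
coefficients = {
    "base": math.log(0.002),   # base rate
    "smoker": math.log(2.0),   # smoker relativity
    "male": math.log(1.3),     # male relativity
}

def rate_from_linear_predictor(risk_flags):
    """Sum the coefficients (linear predictor), then apply the inverse log link."""
    eta = coefficients["base"] + sum(coefficients[flag] for flag in risk_flags)
    return math.exp(eta)

def rate_from_factors(risk_flags):
    """Multiply the base rate by each relativity exp(beta_j)."""
    rate = math.exp(coefficients["base"])
    for flag in risk_flags:
        rate *= math.exp(coefficients[flag])
    return rate

flags = ["smoker", "male"]
print(rate_from_linear_predictor(flags), rate_from_factors(flags))
```

Both functions return the same rate, which is why a log-link GLM reads naturally as a base rate times a set of relativities.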

Solve for parameters ($\beta$) by maximum likelihood

Closed form for small data and simple model

Iterative numerical techniques for large data set & complex model

Use statistical analysis application, such as R
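The iterative technique most commonly used for GLMs is iteratively reweighted least squares (IRLS). The sketch below fits a Poisson model with log link and a single predictor in plain Python. It is an illustration under simplifying assumptions (one predictor plus intercept, synthetic data, hypothetical function name), not a substitute for a statistical package.

```python
import math

def fit_poisson_log_glm(x, y, max_iter=50, tol=1e-10):
    """Fit ln(mu_i) = b0 + b1 * x_i by iteratively reweighted least squares."""
    b0, b1 = math.log(sum(y) / len(y)), 0.0   # start from the overall mean
    for _ in range(max_iter):
        eta = [b0 + b1 * xi for xi in x]
        mu = [math.exp(e) for e in eta]
        # Poisson with log link: weight = mu, working response = eta + (y - mu) / mu
        w = mu
        z = [e + (yi - m) / m for e, yi, m in zip(eta, y, mu)]
        # Weighted least squares of z on (1, x): solve the 2x2 normal equations
        sw = sum(w)
        swx = sum(wi * xi for wi, xi in zip(w, x))
        swxx = sum(wi * xi * xi for wi, xi in zip(w, x))
        swz = sum(wi * zi for wi, zi in zip(w, z))
        swxz = sum(wi * xi * zi for wi, xi, zi in zip(w, x, z))
        det = sw * swxx - swx * swx
        new_b0 = (swz * swxx - swx * swxz) / det
        new_b1 = (sw * swxz - swx * swz) / det
        converged = abs(new_b0 - b0) + abs(new_b1 - b1) < tol
        b0, b1 = new_b0, new_b1
        if converged:
            break
    return b0, b1

# Synthetic claim counts by a hypothetical rating variable x
x = [0, 1, 2, 3, 4, 5]
y = [1, 3, 2, 6, 9, 14]
b0, b1 = fit_poisson_log_glm(x, y)
print(f"intercept={b0:.4f}, slope={b1:.4f}, relativity per unit of x={math.exp(b1):.4f}")
```

Each pass is an ordinary weighted least-squares fit, so the OLS machinery is reused; only the weights and the working response change between iterations.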

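Compare OLS and GLM

• Great flexibility: various distributions and variance structures

• Prior weight and the credibility of data

| | Random | Systematic | Link |
|---|---|---|---|
| OLS | Normal only | Linear predictor | Identity |
| GLM | Various distributions | Linear predictor | Various link functions |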


Decision Tree Model

Decision Tree - Classification And Regression Tree (CART)

• Both classification and regression

• Non-parametric approach (no insight in data structure)

CART tree is generated by repeated partitioning of data set

• Data is split into two partitions (binary partition)

o Consider all possible values of all variables.

o Select the variable/value (X=t1) that produces the greatest “separation” in the target.

• Partitions can also be split into sub-partitions (recursive)

• Until data in end node (leaf) is homogeneous (more or less)

Results are very intuitive

• Identify specific groups that deviate in target variable

• Yet, algorithm is very sophisticated

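Splitting Point

• “Separation” defined in many ways; different for regression & classification

• Regression Trees: use sum of squared errors

– Before the split (parent node P): $SSE_P = \sum_{i \in P} (y_i - \bar y_P)^2$

– After the split (left and right nodes L, R): $SSE_L + SSE_R = \sum_{i \in L} (y_i - \bar y_L)^2 + \sum_{i \in R} (y_i - \bar y_R)^2$

– Select X=t1 such that $\max_{X, t_1} \{ SSE_P - (SSE_L + SSE_R) \}$

• Classification Trees: use measures of purity/impurity

– Intuition: an ideal tree model would produce nodes with only either class A or class B - completely pure nodes

– Gini Index - purity of a node: $\sum_i p_i (1 - p_i) = 1 - \sum_i p_i^2$, where $p_i$ = probability of class i

– Entropy – information index: $-\sum_i p_i \log p_i$, which for two classes is $-p \log p - (1 - p) \log(1 - p)$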
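The split criteria above can be sketched in a few lines of Python. The helper names are illustrative, the entropy uses base-2 logarithms (an assumption; any base gives the same ranking of splits), and the search covers a single numeric variable only.

```python
import math
from collections import Counter

def gini(labels):
    """Gini index of a node: 1 - sum of squared class proportions (0 = pure)."""
    n = len(labels)
    return 1.0 - sum((count / n) ** 2 for count in Counter(labels).values())

def entropy(labels):
    """Entropy of a node in bits: -sum p_i * log2(p_i) (0 = pure)."""
    n = len(labels)
    return -sum((count / n) * math.log2(count / n) for count in Counter(labels).values())

def sse(values):
    """Sum of squared errors around the node mean."""
    node_mean = sum(values) / len(values)
    return sum((v - node_mean) ** 2 for v in values)

def best_regression_split(x, y):
    """Return (threshold, SSE reduction) for the best binary split 'x <= threshold'."""
    parent = sse(y)
    best = (None, 0.0)
    for t in sorted(set(x))[:-1]:          # the largest value cannot split the data
        left = [yi for xi, yi in zip(x, y) if xi <= t]
        right = [yi for xi, yi in zip(x, y) if xi > t]
        reduction = parent - sse(left) - sse(right)
        if reduction > best[1]:
            best = (t, reduction)
    return best

print(gini(["A", "A", "B", "B"]), entropy(["A", "A", "B", "B"]))
print(best_regression_split([1, 2, 3, 4], [1.0, 1.0, 5.0, 5.0]))
```

A full CART implementation repeats this search over every variable and then recurses into each partition; the sketch shows only the scoring of one candidate variable.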

Decision Tree Model

Data Clustering

Clustering algorithm

• Find similarities in data according to features in data & group similar objects into clusters

• Unsupervised (no pre-defined classes), classification, non-parametric

• How to measure similarities/dissimilarities, e.g. distance

• Numeric, categorical, and ordinal variables

• Partitioning (k-means), Hierarchical, etc.

Data Clustering

Algorithm

Partitioning algorithms - K-means/k-medoids

• Maintain k clusters with k known; place points into their “nearest” cluster

Hierarchical

• Objects are more related to nearby objects than to objects farther away; objects are connected by distance; how to define “nearby” object

K-Means Algorithm

1. Select K points as initial centroids, with a given k

2. Repeat

3. Form K clusters by assigning each point to its nearest centroid

4. Re-compute the centroids of each cluster

5. Until centroids do not change
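The five steps translate almost line by line into code. This is a minimal sketch using Euclidean distance and random initial centroids drawn from the data; the function name, the seed parameter and the sample points are assumptions made for illustration.

```python
import math
import random

def kmeans(points, k, seed=0, max_iter=100):
    """Basic k-means on a list of numeric tuples; returns (centroids, clusters)."""
    rng = random.Random(seed)
    centroids = rng.sample(points, k)              # step 1: pick K initial centroids
    clusters = [[] for _ in range(k)]
    for _ in range(max_iter):                      # step 2: repeat
        clusters = [[] for _ in range(k)]
        for p in points:                           # step 3: assign to nearest centroid
            nearest = min(range(k), key=lambda j: math.dist(p, centroids[j]))
            clusters[nearest].append(p)
        new_centroids = [                          # step 4: re-compute the centroids
            tuple(sum(coord) / len(cluster) for coord in zip(*cluster))
            if cluster else centroids[j]           # keep the old centroid if a cluster is empty
            for j, cluster in enumerate(clusters)
        ]
        if new_centroids == centroids:             # step 5: stop when nothing changes
            break
        centroids = new_centroids
    return centroids, clusters

# Synthetic points forming two well-separated groups
points = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
centroids, clusters = kmeans(points, k=2)
print(centroids)
```

Because the starting centroids are random, different seeds can lead to different solutions on less clearly separated data; running the algorithm several times and keeping the best result is a common safeguard.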

Data Clustering

Standardization / Normalization

Values of variables may have different units

Variable with high variability/range will dominate metric & lead to bias
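A common remedy is to convert each variable to z-scores before computing distances. The sketch below uses the population standard deviation; the variable names and values are synthetic, and other scalings (for example min-max) are equally valid choices.

```python
from statistics import mean, pstdev

def standardize(column):
    """Rescale a numeric column to mean 0 and standard deviation 1 (z-scores)."""
    m, s = mean(column), pstdev(column)
    return [(v - m) / s for v in column]

# Synthetic example: age in years vs. sum assured in currency units
age = [25, 40, 55, 70]
sum_assured = [100_000, 150_000, 120_000, 500_000]

age_z = standardize(age)
sum_assured_z = standardize(sum_assured)
print(age_z)
print(sum_assured_z)
```

Without this step, a Euclidean distance between two policies would be driven almost entirely by sum assured, simply because its numeric range is thousands of times larger than that of age.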

How to determine K

Business reasons could dictate k

Try different values of k and look at the change in the average distance to centroid as k increases; the error falls rapidly until the right k, then changes little

Data Clustering

Comments on K-Means

• Strength:

• simple, very efficient & fast

• Weakness

• Applicable only when a mean is defined (what about categorical data?)

• Need to know k in advance

• Unable to handle noisy data; sensitive to outliers

• May be sensitive to initialization

Hierarchical clustering

• Bottom up or top down; produces a dendrogram

• Important questions - how to represent a cluster of more than one point, & how to determine the “nearness” of clusters?

– Single Link: smallest distance between points

– Complete Link: largest distance between points

– Average Link: average distance between points

– Centroid: distance between centroids
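The four definitions of “nearness” differ only in how the point-to-point distances between two clusters are summarized. A minimal sketch, assuming Euclidean distance and clusters given as lists of numeric tuples (function names are illustrative):

```python
import math
from itertools import product

def single_link(a, b):
    """Smallest distance between any point of a and any point of b."""
    return min(math.dist(p, q) for p, q in product(a, b))

def complete_link(a, b):
    """Largest distance between any point of a and any point of b."""
    return max(math.dist(p, q) for p, q in product(a, b))

def average_link(a, b):
    """Average of all pairwise distances between a and b."""
    return sum(math.dist(p, q) for p, q in product(a, b)) / (len(a) * len(b))

def centroid_link(a, b):
    """Distance between the centroids (coordinate-wise means) of a and b."""
    centroid_a = tuple(sum(c) / len(a) for c in zip(*a))
    centroid_b = tuple(sum(c) / len(b) for c in zip(*b))
    return math.dist(centroid_a, centroid_b)

# Two small synthetic clusters
cluster_a = [(0, 0), (0, 2)]
cluster_b = [(3, 0), (3, 2)]
for link in (single_link, complete_link, average_link, centroid_link):
    print(link.__name__, round(link(cluster_a, cluster_b), 4))
```

In bottom-up clustering, the two clusters with the smallest linkage value are merged at each step, so the choice of linkage determines the shape of the resulting dendrogram.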

Conclusion

Advantage of actuaries

• Industry knowledge - domain knowledge is key in the modeling process

• Expertise in data processing - data is always the #1 issue in data-driven applications

• Unique position in data analytics

Opportunity

• Solid foundation in statistics

• Educational experience in modeling (OLS)

• Need to pick up new skills & thinking through education, training, and experience

Actuaries cannot miss it

• Data analytics is here to stay; it is changing the insurance industry, and will fundamentally change how we run the insurance business

• Actuaries could and should be on top of it and lead the change

Sharing - Predictive Modelling Projects

Considerations

Business Goals

• Objective is to support profitable growth of business

• Resources available & strong support from executives

Data

• Sufficient quantity & high quality to support analytics

• Satisfactory data depth & width

• Able to obtain & capable of understanding / cleaning data

Environment

• Regulatory & privacy laws allow such data analytics

• Distribution channel can support data-driven solutions

Across the Value Chain

As long as there is data, there is potential to capitalize on it

[Figure: use cases plotted by level of client demand (High / Medium / Low) across the value chain stages: Pre-sale, Underwriting, In-force management, Claims]

Use cases shown:

• New rating factors

• Multivariate analysis

• Cross-sell/upsell

• Fraud/non-disclosure

• Preferred risk selection

• Predictive underwriting

• Propensity to apply & triggers

• Distributor quality control

• Propensity to complete purchase

• Underwriting triage

• Determine underwriting ratings

• Proactive lapse management

• Claims triage

• Customer lifetime value

• Competitive pricing strategy

Customer Risk Scoring – China

Client would like to build a customer risk score for their cancer product, which can predict the claim risk of the customer.

Objectives

• To predict claim risk of customer

• To improve customer experience for best risks with reduced UW & sales process

• To improve claim experience of existing customers, by identifying high risks

Data

• Two data sources combined

o Policy data

o Claim data

• Modelled claim risk using wide range of rating factors & compared to pricing assumption

Modeling & Lift Plot

• 6 statistically significant variables in model

• Claim risk of the best group is less than half of the pricing assumption; the risk of the worst group is about double the pricing assumption
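A lift comparison of this kind can be sketched as follows: policies are sorted by model score, split into equal-sized groups, and the actual claims in each group are divided by the claims expected under the pricing assumption. The function name, inputs and numbers below are hypothetical and purely synthetic; they do not reproduce the client data or results.

```python
def lift_by_group(scores, actual_claims, expected_claims, n_groups):
    """Actual-to-expected claim ratio per score group, from lowest to highest score."""
    order = sorted(range(len(scores)), key=lambda i: scores[i])
    n = len(order)
    ratios = []
    for g in range(n_groups):
        # Integer arithmetic keeps the groups as equal in size as possible
        members = order[g * n // n_groups:(g + 1) * n // n_groups]
        actual = sum(actual_claims[i] for i in members)
        expected = sum(expected_claims[i] for i in members)
        ratios.append(actual / expected)
    return ratios

# Synthetic illustration: six policies, each with one expected claim under pricing
scores = [0.1, 0.9, 0.4, 0.7, 0.2, 0.8]
actual_claims = [0, 2, 1, 1, 1, 3]
expected_claims = [1.0, 1.0, 1.0, 1.0, 1.0, 1.0]
print(lift_by_group(scores, actual_claims, expected_claims, n_groups=3))
```

A model with useful lift shows ratios below 1 for the lowest-score groups and above 1 for the highest-score groups; a flat line near 1 would mean the score adds nothing beyond the pricing basis.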

Bancassurance Predictive Underwriting – SEA

A bank with a large customer base expressed a strong desire to increase sales penetration of their life product, while streamlining the underwriting process

Objectives

• Simplified underwriting and sales process with high take-up for the best risks

• Reduce acquisition costs

• Increase protection sales and product penetration

Data

• Two data sources combined:

o Bank customer information at time of issue

o Underwriting decision

• About 80 variables available for modeling:

o Demographic data, bank and insurance product data, banking transaction data etc.

Business Application and Lift Plot

• 11 statistically significant variables in model:

o Branch, AUM, customer segment, credit card

• GIO for the best 20% risks; SIO for next best 20%

Introduction - RGA Data Science Team

RGA Data Science Team – Global Presence, Local Focus

• Data Science team includes data scientists, actuaries and IT experts

• More than 50% of the team have a Ph.D. and the rest have master’s degrees

• Work closely with UW, actuarial, admin and IT

• The DS team collaborates with regional/local offices to focus on regional initiatives and local market projects

• We leverage local market knowledge to maximize data value & drive business outcomes

Global Research & Data Analytics

[Organization chart labels: Research Experience Analytics; Data Strategy; Data Science, with Global (15) and Asia (6); Regional Data Strategy; Local Office]

RGA Data Science Team – Who are We?

Thank You!
