SOA Predictive Analytics Seminar – Hong Kong 29 Aug. 2018 | Hong Kong
Session 5
Predictive Modeling in Life Insurance
Jingyi Zhang, Ph.D.
Data Scientist, Global Research & Data Analytics
9/12/2018
Agenda
Overview of Predictive Modelling Techniques
o What actuaries already know - OLS
o Generalized Linear Model
o Decision Tree Model
o Data Clustering
Sharing - Predictive Modelling Projects
Introduction - RGA Data Science Team
What Is Predictive Modeling & Analytics?
1. Data: high-quality data
2. Modeling: statistical model
3. Prediction: business decisions
Predictive modeling & analytics is about driving business outcomes
What actuaries already know
• Are you familiar with the following terms?
○ Linear Regression
○ Ordinary Least Square (OLS)
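• Linear regression model: $Y = \beta_0 + \beta_1 X_1 + \cdots + \beta_n X_n + \varepsilon$
• $Y$ target variable, $X_i$ predictor variables, $\varepsilon$ error term/noise
• $\beta_i$ parameters to be estimated
• Underlying assumptions for a valid linear regression model
• Normality, $\varepsilon \sim N(0, \sigma^2)$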
• Homogeneity, Y representative of population, Independence between observations
• Linearity
⋯
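Ordinary Least Squares (OLS)
• For a simple regression $y_i = \beta_0 + \beta_1 x_i + \varepsilon_i$:
  $\hat\beta = \arg\min_\beta \sum_i \varepsilon_i^2 = \arg\min_\beta \sum_i (y_i - \hat y_i)^2 = \arg\min_\beta \sum_i (y_i - \beta_0 - \beta_1 x_i)^2$
  $\hat\beta_1 = \frac{\sum_i (x_i - \bar x)(y_i - \bar y)}{\sum_i (x_i - \bar x)^2}$, $\hat\beta_0 = \bar y - \hat\beta_1 \bar x$
• Identical to the maximum likelihood estimator if the errors follow a normal distribution:
  $\hat\beta = \arg\max_\beta L(\beta, \sigma; y) = \arg\min_\beta \{-\ln L(\beta, \sigma; y)\} = \arg\min_\beta \sum_i (y_i - \hat y_i)^2$
• More robust and consistent approach
• Use adjusted $R^2$ to compare the fit of models
  Define $R^2 = 1 - \frac{RSS}{TSS} = 1 - \frac{\sum_i (y_i - \hat y_i)^2}{\sum_i (y_i - \bar y)^2}$, but it is biased
  Adjusted $R^2 = 1 - (1 - R^2) \cdot \frac{n - 1}{n - p - 1}$, with $n$ observations and $p$ predictors
• $TSS = ESS + RSS$: ESS is the portion that has been explained by the OLS model; RSS is the portion of TSS for the error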
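The closed-form estimates and the two fit measures can be checked numerically. The following is a minimal sketch in plain Python, assuming a single predictor; the function names and the five data points are made up for illustration and are not from the presentation.

```python
# Simple-regression OLS in closed form, with R^2 and adjusted R^2.
# The data points below are illustrative only.

def ols_simple(x, y):
    """Return (b0, b1) minimising sum((y_i - b0 - b1 * x_i) ** 2)."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    b1 = sxy / sxx
    b0 = y_bar - b1 * x_bar
    return b0, b1


def r_squared(x, y, b0, b1, p=1):
    """Return (R^2, adjusted R^2) for a fit with p predictors."""
    n = len(y)
    y_bar = sum(y) / n
    rss = sum((yi - (b0 + b1 * xi)) ** 2 for xi, yi in zip(x, y))  # error part
    tss = sum((yi - y_bar) ** 2 for yi in y)                       # total
    r2 = 1 - rss / tss
    adj_r2 = 1 - (1 - r2) * (n - 1) / (n - p - 1)
    return r2, adj_r2


x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.1, 3.9, 6.2, 7.8, 10.1]
b0, b1 = ols_simple(x, y)
r2, adj_r2 = r_squared(x, y, b0, b1)
print(f"b0={b0:.3f}, b1={b1:.3f}, R2={r2:.4f}, adjusted R2={adj_r2:.4f}")
```

Adjusted R² is always below R² here because it penalises the number of predictors, which is why it is the safer measure when comparing models of different size.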
Why actuaries did not use OLS
• Processes are inherently linear, or can be well approximated by a linear model (LM)
• Effectiveness & completeness: OLS makes very efficient use of the data; good results with relatively small datasets
• Identical to maximum likelihood estimation
• Easy to understand and communicate: the theory is well understood; results are easy to communicate
Great! But wait …
There are several issues with OLS
• Validation of assumptions: normal with constant $\sigma^2$, independent, homogeneous
• Data assumed unbounded, yet many values are non-negative
How about insurance applications?
• Distribution of data, variance structure
• Binomial for rates (mortality/lapse/UW, etc.), $\sigma^2 \sim r(1-r)$
• Poisson for claim counts, $\sigma^2 \sim$ mean
OLS is not applicable in insurance, but you already know a lot about modeling
What actuaries may not know
Machine Learning & Statistical Techniques
Decision Trees (CART)
Neural Networks / Deep Learning
Bayesian Analysis
Classification/Association
Analysis of Variance
Mixed Models
Survival Analysis
Cluster Analysis (e.g. K-Means)
Random Forest
XGBoost machine
Gradient Boosting
Ada Boosting
Support vector machine
Ensemble method
Survey Data Analysis
Feature engineering
Non-Parametric Analysis
PM terminology
Supervised vs. Unsupervised Learning
• Supervised: estimate the expected value of Y given values of X
  ○ GLM, Cox, CART, MARS, Random Forests, SVM, NN, etc.
• Unsupervised: find interesting patterns amongst X; no target variable Y
  ○ Clustering, Correlation / Principal Components / Factor Analysis
Classification vs. Regression
• Classification: to segment observations into 2 or more categories
  ○ Fraud vs. legitimate, lapsed vs. retained, UW class
• Regression: to predict a continuous amount
  ○ Dollars of loss for a policy, ultimate size of claim
Parametric vs. Non-Parametric
• Parametric Statistics: probabilistic model of data
  ○ Poisson regression (claim count), Gamma (claim amount)
• Non-Parametric Statistics: no probability model specified
  ○ Classification trees, NN
Generalized Linear Model (GLM)
• Major focus of PM in the insurance industry
  ○ Includes most distributions related to insurance
  ○ Great flexibility in variance structure
  ○ OLS model is a special case of GLM
• (Relatively) easy to understand and communicate
  ○ Multiplicative model is intuitive & consistent with insurance practice
• 3 components
  ○ Random component
  ○ Systematic component
  ○ Link function
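Generalized Linear Model
Random component
• Observations $Y_1, \ldots, Y_n$ are independent, with density from the exponential family:
  $f(y; \theta, \phi) = \exp\left\{ \frac{y\theta - b(\theta)}{a(\phi)} + c(y, \phi) \right\}$
• From maximum likelihood theory,
  $E[Y] = \mu = b'(\theta)$, $Var[Y] = b''(\theta)\, a(\phi) = V(\mu)\, a(\phi)$
• Each distribution is specified in terms of mean & variance
• Variance is a function of mean

            | Normal                | Poisson       | Binomial                          | Gamma             | Inverse Gaussian
Name        | $N(\mu, \sigma^2)$    | $P(\mu)$      | $B(m, \pi)/m$                     | $G(\mu, \nu)$     | $IG(\mu, \sigma^2)$
Range       | $(-\infty, +\infty)$  | $(0, +\infty)$| $(0, 1)$                          | $(0, +\infty)$    | $(0, +\infty)$
$b(\theta)$ | $\theta^2/2$          | $e^\theta$    | $\ln(1 + e^\theta)$               | $-\ln(-\theta)$   | $-(-2\theta)^{1/2}$
$\mu(\theta)$ | $\theta$            | $e^\theta$    | $e^\theta/(1 + e^\theta)$         | $-1/\theta$       | $(-2\theta)^{-1/2}$
$V(\mu)$    | $1$                   | $\mu$         | $\mu(1 - \mu)$                    | $\mu^2$           | $\mu^3$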
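As a worked check of one column, the Poisson case can be derived by hand. This is a sketch assuming the usual exponential-family form $f(y; \theta, \phi) = \exp\{(y\theta - b(\theta))/a(\phi) + c(y, \phi)\}$; the symbols $a$, $b$, $c$ follow that convention and are not spelled out on the slide.

```latex
\begin{align*}
f(y; \mu) &= \frac{\mu^{y} e^{-\mu}}{y!} = \exp\{\, y \ln\mu - \mu - \ln y! \,\} \\
\theta &= \ln\mu, \quad b(\theta) = e^{\theta}, \quad a(\phi) = 1, \quad c(y, \phi) = -\ln y! \\
E[Y] &= b'(\theta) = e^{\theta} = \mu \\
\mathrm{Var}[Y] &= b''(\theta)\, a(\phi) = e^{\theta} = \mu
\end{align*}
```

The variance equals the mean, which is the "Poisson, ~ mean" variance structure used for claim counts.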
Why the distribution affects results
Variance of different distributions
• Gaussian: constant
• Poisson: ~ mean
• Gamma: ~ mean^2
[Figure: four panels, each titled "GLM Different Distributions", plotting the data (labels A, B, C; horizontal axis 0.5 to 3, vertical axis 0 to 9) against the fitted Normal, Poisson, and Gamma models, first one at a time and then all three together]
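Generalized Linear Model
Systematic component
• A linear predictor $\eta_i = \sum_j \beta_j x_{ij}$ for observation $i$
Link function
• $g(\mu_i) = \eta_i$: the random & systematic components are connected by a smooth & invertible function
• Log is unique in insurance applications - all parameters are multiplicative
  $\mu_i = \exp\left(\sum_j \beta_j x_{ij}\right) = \prod_j \exp(\beta_j x_{ij}) = \prod_j \left(e^{\beta_j}\right)^{x_{ij}}$
  ○ Consistent with most insurance practices
  ○ Intuitively easy to understand and communicate

            | Identity | Log     | Logit            | Reciprocal
$g(x)$      | $x$      | $\ln x$ | $\ln\frac{x}{1-x}$ | $1/x$
$g^{-1}(x)$ | $x$      | $e^x$   | $\frac{e^x}{1+e^x}$ | $1/x$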
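The multiplicative reading of the log link can be shown with a few lines of Python. This is a minimal sketch: the coefficient names and values (a base rate of 0.002, a smoker factor of 2.0, a male factor of 1.3) are hypothetical and chosen only to make the arithmetic easy to follow.

```python
import math

# Hypothetical fitted coefficients on the log scale (illustrative values only).
coefficients = {
    "intercept": math.log(0.002),  # base rate
    "smoker": math.log(2.0),       # smoker factor
    "male": math.log(1.3),         # male factor
}


def predict_log_link(features):
    """Mean under a log link: mu = exp(sum_j beta_j * x_j)."""
    eta = sum(coefficients[name] * value for name, value in features.items())
    return math.exp(eta)


def predict_as_product(features):
    """Same mean written as a product of factors exp(beta_j) ** x_j."""
    mu = 1.0
    for name, value in features.items():
        mu *= math.exp(coefficients[name]) ** value
    return mu


male_smoker = {"intercept": 1, "smoker": 1, "male": 1}
mu_sum = predict_log_link(male_smoker)
mu_prod = predict_as_product(male_smoker)
print(mu_sum, mu_prod)
```

Both routes give base rate × smoker factor × male factor, which is the same shape as a table of rating factors applied to a base rate.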
Generalized Linear Model
• Solve for parameters ($\beta$) by maximum likelihood
  ○ Closed form for small data and simple model
  ○ Iterative numerical techniques for large data sets & complex models
  ○ Use a statistical analysis application, such as R
• Compare OLS and GLM

      | Random                | Systematic       | Link
OLS   | Normal only           | Linear predictor | Identity
GLM   | Various distributions | Linear predictor | Various (identity, log, logit, reciprocal)

• Great flexibility
  ○ Various distributions, variance structures
  ○ Prior weight and the credibility of data
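To make "iterative numerical techniques" concrete, here is a minimal sketch of iteratively reweighted least squares (IRLS) for a Poisson GLM with a log link and one predictor. IRLS is one common way to maximise a GLM likelihood; the presentation does not say which algorithm was used, and the function name and claim counts below are illustrative assumptions. In practice a statistical package would do this.

```python
import math


def fit_poisson_log(x, y, iterations=50, tol=1e-10):
    """Fit ln(mu_i) = b0 + b1 * x_i by iteratively reweighted least squares."""
    # Start from the intercept-only fit.
    b0, b1 = math.log(sum(y) / len(y)), 0.0
    for _ in range(iterations):
        mu = [math.exp(b0 + b1 * xi) for xi in x]
        # Working response z; the weights for Poisson variance with a log link are mu.
        z = [b0 + b1 * xi + (yi - mi) / mi for xi, yi, mi in zip(x, y, mu)]
        w = mu
        # Solve the 2x2 weighted least-squares normal equations.
        sw = sum(w)
        swx = sum(wi * xi for wi, xi in zip(w, x))
        swxx = sum(wi * xi * xi for wi, xi in zip(w, x))
        swz = sum(wi * zi for wi, zi in zip(w, z))
        swxz = sum(wi * xi * zi for wi, xi, zi in zip(w, x, z))
        det = sw * swxx - swx * swx
        new_b0 = (swxx * swz - swx * swxz) / det
        new_b1 = (sw * swxz - swx * swz) / det
        converged = abs(new_b0 - b0) + abs(new_b1 - b1) < tol
        b0, b1 = new_b0, new_b1
        if converged:
            break
    return b0, b1


# Made-up claim counts by age band (0..4), for illustration only.
age_band = [0, 1, 2, 3, 4]
claims = [1, 3, 4, 9, 14]
b0, b1 = fit_poisson_log(age_band, claims)
fitted = [math.exp(b0 + b1 * xi) for xi in age_band]
print(f"b0={b0:.4f}, b1={b1:.4f}, multiplicative factor per band={math.exp(b1):.3f}")
```

At the maximum likelihood solution the fitted means reproduce the observed totals, so the sum of the fitted values equals the total number of claims.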
Decision Tree Model
Decision Tree - Classification And Regression Tree (CART)
• Both classification and regression
• Non-parametric approach (no insight into data structure)
CART tree is generated by repeated partitioning of the data set
• Data is split into two partitions (binary partition)
  ○ Consider all possible values of all variables
  ○ Select the variable/value (X=t1) that produces the greatest “separation” in the target
• Partitions can also be split into sub-partitions (recursive)
• Until data in the end node (leaf) is homogeneous (more or less)
Results are very intuitive
• Identify specific groups that deviate in the target variable
• Yet, the algorithm is very sophisticated
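Decision Tree Model
Splitting Point
• “Separation” is defined in many ways; different for regression & classification
• Regression Trees: use sum of squared errors, for a parent node $P$ split into child nodes $L$ and $R$
  – Before the split: $SSE_P = \sum_{i \in P} (y_i - \bar y_P)^2$
  – After the split: $SSE_L + SSE_R = \sum_{i \in L} (y_i - \bar y_L)^2 + \sum_{i \in R} (y_i - \bar y_R)^2$
  – Select X=t1 such that $\max_{X,\, t_1} \left[ SSE_P - (SSE_L + SSE_R) \right]$
• Classification Trees: use measures of purity/impurity
  – Intuition: an ideal tree model would produce nodes with only either class A or class B - completely pure nodes
  – Gini Index - impurity of a node (0 for a completely pure node): $\sum_i p_i (1 - p_i) = 1 - \sum_i p_i^2$, where $p_i$ = probability of class $i$
  – Entropy – information index: $-\sum_i p_i \log p_i$; for two classes, $-p \log p - (1 - p) \log(1 - p)$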
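The splitting criteria can be sketched in a few lines of Python. This is an illustrative sketch, not the full CART algorithm: it searches a single variable for the threshold with the largest reduction in sum of squared errors, and it assumes base-2 logarithms for entropy. Function names and data are made up.

```python
import math


def gini(labels):
    """Gini impurity 1 - sum_i p_i^2 of a list of class labels."""
    n = len(labels)
    return 1.0 - sum((labels.count(c) / n) ** 2 for c in set(labels))


def entropy(labels):
    """Entropy -sum_i p_i * log2(p_i) of a list of class labels (base 2 assumed)."""
    n = len(labels)
    return -sum(
        (labels.count(c) / n) * math.log2(labels.count(c) / n) for c in set(labels)
    )


def sse(values):
    """Sum of squared errors around the mean."""
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values)


def best_regression_split(x, y):
    """Return (threshold, SSE reduction) of the best binary split x <= t."""
    parent = sse(y)
    best_t, best_gain = None, 0.0
    for t in sorted(set(x))[:-1]:  # the largest value would leave the right side empty
        left = [yi for xi, yi in zip(x, y) if xi <= t]
        right = [yi for xi, yi in zip(x, y) if xi > t]
        gain = parent - (sse(left) + sse(right))
        if gain > best_gain:
            best_t, best_gain = t, gain
    return best_t, best_gain


mixed = ["A", "A", "B", "B"]
pure = ["A", "A", "A", "A"]
print(gini(mixed), entropy(mixed), gini(pure))

x = [1, 2, 3, 4, 5, 6]
y = [1.0, 1.2, 0.9, 5.0, 5.2, 4.9]
threshold, gain = best_regression_split(x, y)
print(threshold, gain)
```

A full tree would repeat this search over every variable and then recurse into each partition until the leaves are homogeneous enough.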
Data Clustering
Clustering algorithm
• Find similarities in data according to features in the data & group similar objects into clusters
• Unsupervised (no pre-defined classes), classification, non-parametric
• How to measure similarities/dissimilarities, e.g. distance
• Numeric, categorical, and ordinal variables
• Partitioning (k-means), Hierarchical, etc.
Data Clustering
Algorithm
Partitioning algorithms - K-means/K-medoids
• Maintain k clusters with k known; place points into their “nearest” cluster
Hierarchical
• Objects are more related to nearby objects than to objects farther away; objects are connected by distance; how to define a “nearby” object
K-Means Algorithm
1. Select K points as initial centroids, with a given K
2. Repeat
3. Form K clusters by assigning each point to its nearest centroid
4. Re-compute the centroid of each cluster
5. Until centroids do not change
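The five steps above map directly onto code. The following is a minimal sketch in plain Python using Euclidean distance; the random seed, the iteration cap, the handling of empty clusters, and the sample points are assumptions added for the illustration.

```python
import math
import random


def k_means(points, k, seed=0, max_iter=100):
    """Return (centroids, clusters) for a list of equal-length numeric tuples."""
    rng = random.Random(seed)
    centroids = rng.sample(points, k)              # 1. pick K initial centroids
    clusters = [[] for _ in range(k)]
    for _ in range(max_iter):                      # 2. repeat
        clusters = [[] for _ in range(k)]
        for p in points:                           # 3. assign to nearest centroid
            nearest = min(range(k), key=lambda j: math.dist(p, centroids[j]))
            clusters[nearest].append(p)
        new_centroids = []
        for j, members in enumerate(clusters):     # 4. re-compute centroids
            if members:
                new_centroids.append(
                    tuple(sum(coord) / len(members) for coord in zip(*members))
                )
            else:
                new_centroids.append(centroids[j])  # keep an empty cluster's centroid
        if new_centroids == centroids:             # 5. stop when nothing changes
            break
        centroids = new_centroids
    return centroids, clusters


# Two well-separated groups of made-up points.
points = [(1.0, 1.0), (1.2, 0.8), (0.8, 1.1), (5.0, 5.0), (5.2, 4.9), (4.8, 5.1)]
centroids, clusters = k_means(points, 2)
print(centroids)
print(clusters)
```

Because the result can depend on the initial centroids, it is common to run the algorithm from several seeds and keep the solution with the smallest total distance to centroids.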
Data Clustering
Standardization / Normalization
Values of variables may have different units
Variable with high variability/range will dominate metric & lead to bias
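The effect of units on distance is easy to demonstrate. This sketch standardizes each variable to z-scores (mean 0, standard deviation 1), which is one common choice among several; the policy records below are hypothetical.

```python
import math


def standardize(values):
    """Return z-scores: (value - mean) / population standard deviation."""
    n = len(values)
    mean = sum(values) / n
    std = math.sqrt(sum((v - mean) ** 2 for v in values) / n)
    return [(v - mean) / std for v in values]


# Hypothetical policy records: age in years, sum assured in dollars.
ages = [25.0, 35.0, 45.0, 55.0]
sums_assured = [100_000.0, 500_000.0, 250_000.0, 1_000_000.0]

raw = list(zip(ages, sums_assured))
scaled = list(zip(standardize(ages), standardize(sums_assured)))

# On the raw scale the dollar amounts swamp the age difference entirely.
print(math.dist(raw[0], raw[1]))
print(math.dist(scaled[0], scaled[1]))
```

After standardization both variables contribute on a comparable scale, so neither dominates the distance metric purely because of its units.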
How to determine K
Business reasons could dictate k
Try different k, looking at the change in the average distance to centroid as k increases; the error falls rapidly until the right k, then changes little
Data Clustering
Comments on K-Means
• Strength:
• simple, very efficient & fast
• Weakness
• Applicable only when a mean is defined (what about categorical data?)
• Need to know k in advance
• Sensitive to noisy data & outliers
• May be sensitive to initialization
Hierarchical clustering
• Bottom up or top down; produces a dendrogram
• Important questions - how to represent a cluster of more than one point, & how to determine the “nearness” of clusters?
  – Single Link: smallest distance between points
  – Complete Link: largest distance between points
  – Average Link: average distance between points
  – Centroid: distance between centroids
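The four notions of "nearness" between clusters can be compared side by side. This is a small sketch assuming Euclidean distance between points; the two clusters are made up for illustration.

```python
import math
from itertools import product


def pairwise(cluster_a, cluster_b):
    """All point-to-point distances between two clusters."""
    return [math.dist(p, q) for p, q in product(cluster_a, cluster_b)]


def single_link(a, b):
    """Smallest distance between points of the two clusters."""
    return min(pairwise(a, b))


def complete_link(a, b):
    """Largest distance between points of the two clusters."""
    return max(pairwise(a, b))


def average_link(a, b):
    """Average distance between points of the two clusters."""
    distances = pairwise(a, b)
    return sum(distances) / len(distances)


def centroid(cluster):
    """Coordinate-wise mean of a cluster."""
    return tuple(sum(coord) / len(cluster) for coord in zip(*cluster))


def centroid_link(a, b):
    """Distance between the two cluster centroids."""
    return math.dist(centroid(a), centroid(b))


cluster_a = [(0.0, 0.0), (0.0, 1.0)]
cluster_b = [(3.0, 0.0), (4.0, 0.0)]
for name, link in [
    ("single", single_link),
    ("complete", complete_link),
    ("average", average_link),
    ("centroid", centroid_link),
]:
    print(name, round(link(cluster_a, cluster_b), 4))
```

A bottom-up procedure would merge, at each step, the pair of clusters with the smallest value of the chosen linkage, which is what produces the dendrogram.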
Conclusion
Advantages of actuaries
• Industry knowledge - domain knowledge is key in the modeling process
• Expertise in data processing - data is always the #1 issue in data-driven applications
• Unique position in data analytics
Opportunity
• Solid foundation in statistics
• Educational experience in modeling (OLS)
• Need to pick up new skills & thinking through education, training, and experience
Actuaries cannot miss it
• Data analytics is here to stay; it is changing the insurance industry, and will fundamentally change how we run the insurance business
• Actuaries could and should be on top of it and lead the change
Sharing - Predictive Modelling Projects
Considerations
Business Goals
• Objective is to support profitable growth of business
• Resources available & strong support from executives
Data
• Sufficient quantity & high quality to support analytics
• Satisfactory data depth & width
• Able to obtain & capable of understanding / cleaning data
Environment
• Regulatory & privacy laws allow such data analytics
• Distribution channel can support data-driven solutions
Across the Value Chain
As long as there is data, there is potential to capitalize on it
[Figure: applications plotted by value-chain stage (Pre-sale, Underwriting, In-force management, Claims) and by level of client demand (High, Medium, Low)]
Applications shown:
• New rating factors
• Multivariate analysis
• Cross-sell/upsell
• Fraud/non-disclosure
• Preferred risk selection
• Predictive underwriting
• Propensity to apply & triggers
• Distributor quality control
• Propensity to complete purchase
• Underwriting triage
• Determine underwriting ratings
• Proactive lapse management
• Claims triage
• Customer lifetime value
• Competitive pricing strategy
Customer Risk Scoring – China
Client would like to build a customer risk score for their cancer product, which can predict the claim risk of the customer.
Objectives
• To predict claim risk of customer
• To improve customer experience for best risks with reduced UW & sales process
• To improve claim experience of existing customers, by identifying high risks
Data
• Two data sources combined
  o Policy data
  o Claim data
• Modelled claim risk using a wide range of rating factors & compared to pricing assumption
Modeling & Lift Plot
• 6 statistically significant variables in model
• Claim risk of the best group is less than half of the pricing assumption; the risk of the worst group is about double the pricing assumption
Bancassurance Predictive Underwriting - SEA
A bank with a large customer base expressed a strong desire to increase sales penetration of their life product, while streamlining the underwriting process
Objectives
• Simplified underwriting and sales process with high take-up for the best risks
• Reduce acquisition costs
• Increase protection sales and product penetration
Data
• Two data sources combined:
  o Bank customer information at time of issue
  o Underwriting decision
• About 80 variables available for modeling:
  o Demographic data, bank and insurance product data, banking transaction data, etc.
Business Application and Lift Plot
• 11 statistically significant variables in model:
  o Branch, AUM, customer segment, credit card
• GIO for the best 20% risks; SIO for next best 20%
Introduction - RGA Data Science Team
RGA Data Science Team – Global Presence, Local Focus
• Data Science team includes data scientists, actuaries and IT experts
• More than 50% of the team have a Ph.D. and the rest have master’s degrees
• Work closely with UW, actuarial, admin and IT
• The DS team collaborates with regional/local offices to focus on regional initiatives and local market projects
• We leverage local market knowledge to maximize data value & drive business outcomes
RGA Data Science Team – Who are We?
[Organization chart. Labels: Global Research & Data Analytics; Research; Experience Analytics; Data Strategy; Data Science; Global (15); Asia (6); Regional Data Strategy; Local Office]
Thank You!