Predictive Analytics: An Overview with an Application to WC Claims, by Chris Coleianne
Chris Coleianne of Aon Risk Solutions presented "Predictive Analytics: An Overview With An Application to WC Claims" to the 68th Annual F. Addison Fowler Fall Seminar on October 17, 2014.
Predictive Analytics: Overview with an Application to WC Claims
68th Annual F. Addison Fowler Fall Seminar
October 17, 2014
Chris Coleianne
Aon Risk Solutions, Global Risk Consulting
Agenda
When people ask actuaries about their models…
The Basics
Predictive analytics
– Using statistical techniques to anticipate future outcomes
Predictive analytics are used in many applications, not just insurance
– Moneyball is predictive analytics
  • Using statistical analysis to anticipate baseball players’ performance
  • Build a better roster
– Credit scoring (default risk)
– Traditional marketing applications
  • Defining a class of target consumers and ranking these targets based on anticipated retention.
  • Allocating marketing resources to pursue the insureds who will stay the longest.
  • But be careful; perhaps these consumers are retained because they are higher cost and cannot find alternatives in the marketplace.
Predictive Modeling in the Insurance Marketplace
We are going to provide workers compensation insurance to Hunt Valley insurance agencies.
We rely on NCCI loss costs for our estimate of losses, and we add on our expenses and profit load.
The market is competitive because this is a low hazard occupation, and each insurer has approximately the same expense load and pricing.
The agencies value loss control services and company reputation. But price is pretty important too!
Our competitors sometimes charge more and sometimes less after applying credits and debits. The difference is perhaps not enough for customers to move from us, and not enough for us to steal business based on price.
Cost Versus Price (chart)
Segment   A      B      C      D      E
Cost      $80    $100   $120   $160   $190
Price     $130   $130   $130   $130   $130
Profit/Loss (chart)
Segment       A     B     C     D       E
Profit/Loss   $50   $30   $10   ($30)   ($60)
Total profit/loss across segments A to E: $0
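The profit or loss for each segment is the flat price minus that segment's cost. A minimal Python sketch of that arithmetic follows, using the slide's figures. The names `cost`, `FLAT_PRICE`, and `profit_by_segment` are illustrative, and treating each segment as one equally sized policy is an assumption.

```python
# Cost per segment and the single flat price, taken from the slides.
cost = {"A": 80, "B": 100, "C": 120, "D": 160, "E": 190}
FLAT_PRICE = 130


def profit_by_segment(cost, price):
    """Return price minus cost for each segment (negative means a loss)."""
    return {segment: price - c for segment, c in cost.items()}


profit = profit_by_segment(cost, FLAT_PRICE)
# Assumption: one equally sized policy per segment, so the total is a plain sum.
total = sum(profit.values())

print(profit)
print(total)
```

The gains on A, B, and C exactly offset the losses on D and E, which is why the flat price looks adequate in aggregate even though no single segment is priced at its cost.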
Cost Versus Enhanced Price (chart)
Segment            A      B      C      D      E
Cost               $80    $100   $120   $160   $190
Price              $130   $130   $130   $130   $130
Enhanced Pricing   $105   $105   $120   $160   $160
Profit/Loss (chart)
Segment                        A     B     C     D       E
Original Pricing Profit/Loss   $50   $30   $10   ($30)   ($60)
Enhanced Pricing Profit/Loss   $25   $25   $10   ($30)   ($30)
Total profit/loss: $0
Cost Versus Enhanced Price (chart)
Segment       A      B      C      D      E
Cost          $80    $100   $120   $160   $190
Competitors   $130   $130   $130   $130   $130
Our Pricing   $105   $105   $120   $160   $160
Profit/Loss (chart)
Segment                        A     B     C     D       E
Original Pricing Profit/Loss   $50   $30   $10   ($30)   ($60)
Enhanced Pricing Profit/Loss   $25   $25   $10   ($30)   -
No enhanced-pricing figure is shown for segment E.
Total profit/loss: $0 under original pricing, $30 under enhanced pricing
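The two totals on this slide follow from summing the per-segment figures, with segment E absent from the enhanced-pricing row. The sketch below reproduces those sums in Python from the slide's profit/loss values. The helper name `book_total` is hypothetical, and reading the blank E entry as "segment E is no longer in our book" is an interpretation rather than something the slide states outright.

```python
# Per-segment profit/loss figures as printed on the slides.
original_pl = {"A": 50, "B": 30, "C": 10, "D": -30, "E": -60}
enhanced_pl = {"A": 25, "B": 25, "C": 10, "D": -30, "E": -30}


def book_total(pl_by_segment, retained):
    """Sum profit/loss over the segments still in the book."""
    return sum(pl_by_segment[segment] for segment in retained)


all_segments = ["A", "B", "C", "D", "E"]
# Assumption: the blank E entry means segment E has left the book.
retained_after_repricing = ["A", "B", "C", "D"]

original_total = book_total(original_pl, all_segments)
enhanced_total = book_total(enhanced_pl, retained_after_repricing)
print(original_total, enhanced_total)
```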
Evolution and Refinement
[Chart: Cost, Our Pricing, and Competitor's Pricing for sub-segments A1 to A5, B1 to B5, C1 to C5, D1 to D5, and E1 to E5; vertical axis from $50 to $210.]
Complications
• Regulatory concerns
  – Approval
  – Limiting premium increases
• Customer concerns
  – Unexplainable premium changes
• Variability of loss process
  – Will actual experience fall into these bands
  – Was historical experience predictive of the future
• Competition
  – How quickly can we implement the model without disrupting our book
  – What models are our competitors using
Traditional Analytics versus Predictive Analytics
Aspect | Traditional Approach | Predictive Analytic Approach
Main Input | Claims data | Claims data
Additional Input | None/Limited | Claimant: Personal Data, Employment; Environmental: Economic, Census, Location
Loss Driver | Several, viewed separately | Many (100s considered)
Analysis | At most several data elements at once | Forward-looking, simultaneous consideration of the relationship between loss drivers and claims
Correlations / Double Counting | Unnoticed, uncorrected | Scaled or eliminated
Prediction of Expected Losses | Unrefined (A, B, C, D) | Scoring engine model that is versatile
Consistency | Settlement practices can vary by adjuster | Scoring model applied the same way across entire portfolio
Updates | Training, UW bulletins, Claims Bulletins | Model adjustment
Implementation | Relatively simple | Challenging
When is PM Most Useful?
– High frequency coverages
  • Lots of data to work with
– Multi year
  • Helps with segmentation
  • Reduces adjustments to data
– Coverages where external data can be utilized
  • Need the external data to enhance the current models
  • Can be tied to geographic information and validated
– Good candidates
  • Personal Auto
  • Workers Compensation
  • Business Auto
– Not so good candidates
  • D&O?
  • Professional Liability?
Cutting the Data
[Diagram: claims (Claim 1, Claim 2, Claim 3, Claim 4, …) with fields DOL, Age, Sex, Incurred, etc., laid out on a timeline from … 2001 to 2013 and partitioned into Training Data, Test Data, Validation Data, and a "Too Green" portion.]
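One way to read the diagram is that the most recent, still-immature claims are set aside as "too green" and the remaining claims are divided into training, test, and validation sets. The Python sketch below illustrates that idea only. The cutoff year, the 60/20/20 proportions, the random assignment, and the names `split_claims` and `GREEN_CUTOFF_YEAR` are all assumptions, not details from the presentation.

```python
import random

# Hypothetical cutoff: claims with a loss year from here on are "too green".
GREEN_CUTOFF_YEAR = 2012


def split_claims(claims, seed=0):
    """Partition claims into training/test/validation, setting aside green ones."""
    rng = random.Random(seed)
    parts = {"training": [], "test": [], "validation": [], "too_green": []}
    for claim in claims:
        if claim["loss_year"] >= GREEN_CUTOFF_YEAR:
            parts["too_green"].append(claim)
            continue
        draw = rng.random()
        # Assumed 60/20/20 split among the mature claims.
        if draw < 0.6:
            parts["training"].append(claim)
        elif draw < 0.8:
            parts["test"].append(claim)
        else:
            parts["validation"].append(claim)
    return parts


# Hypothetical claims spread evenly over loss years 2001 to 2013.
claims = [{"id": i, "loss_year": 2001 + i % 13} for i in range(130)]
parts = split_claims(claims)
print({name: len(rows) for name, rows in parts.items()})
```

Keeping the test and validation sets out of model fitting is what lets them serve as a check on whether historical experience is predictive of new claims.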
Variable Types
– Primary data
  • Claims, locations, employee
  • Categorical data versus numerical data
– Derivative data
  • Socio-economic data by zip code
– Compound variables
  • Transformed variables
  • Credit score
  • Text concepts
[Diagram labels: "arm" or "lacer-"; concept word list; department; sex; location; age]
Sample Variables
[Chart: Average Loss by Distance. Horizontal axis: Distance Between Claimant Zip Code and Zip Code of Loss, in bands 0; 0.01 - 0.99; 1.00 - 4.99; 5.00 - 9.99; 10.00 - 24.99; 25.00 - 49.99; 50 or More. Vertical axis: Average Loss. All-claim average: $11,939. Bar labels as extracted: $9,103; $1,944; $12,671; $12,439; $15,304; $15,435; $17,937.]
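A chart like this comes from bucketing each claim by a variable and averaging losses within each bucket. The Python sketch below shows that calculation with the distance bands from the chart's axis. The claim records are hypothetical, the helper names `distance_band` and `average_loss_by_band` are illustrative, and the distance unit is not stated on the slide.

```python
from collections import defaultdict

# Upper bounds (exclusive) for the non-zero bands on the chart's axis.
UPPER_BOUNDS = [
    (1.0, "0.01 - 0.99"),
    (5.0, "1.00 - 4.99"),
    (10.0, "5.00 - 9.99"),
    (25.0, "10.00 - 24.99"),
    (50.0, "25.00 - 49.99"),
]


def distance_band(distance):
    """Return the chart band label for a non-negative distance."""
    if distance == 0:
        return "0"
    for upper, label in UPPER_BOUNDS:
        if distance < upper:
            return label
    return "50 or More"


def average_loss_by_band(claims):
    """Average the loss amounts of (distance, loss) pairs within each band."""
    totals = defaultdict(lambda: [0.0, 0])
    for distance, loss in claims:
        entry = totals[distance_band(distance)]
        entry[0] += loss
        entry[1] += 1
    return {band: total / count for band, (total, count) in totals.items()}


# Hypothetical (distance, loss) pairs, for illustration only.
claims = [(0, 1000), (0, 3000), (3.2, 5000), (60, 20000), (12.5, 8000), (4.99, 7000)]
averages = average_loss_by_band(claims)
print(averages)
```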
[Chart: Average Loss by Loss Day of the Week. Horizontal axis: Day of the Week for the Claim Date of Loss (Sunday to Saturday). Vertical axis: Average Loss. All-claim average: $11,939. Bar labels as extracted: $13,113; $13,321; $13,249; $11,929; $9,365; $11,285; $11,457.]
[Chart: Average Loss by Report Day of the Week. Horizontal axis: Day of the Week for the Claim Date Reported (Sunday to Saturday). Vertical axis: Average Loss. All-claim average: $11,939. Bar labels as extracted: $24,346; $13,281; $13,261; $12,297; $10,369; $10,199; $7,935.]
[Chart: Average Loss by Length of Employment with Client B. Horizontal axis: Years Between Date of Hire with Client B and Date of Loss, in bands 0.0 - 0.9; 1.0 - 1.9; 2.0 - 4.9; 5.0 - 9.9; 10.0 - 24.9; 25 or More. Vertical axis: Average Loss. All-claim average: $12,116. Bar labels as extracted: $8,671; $9,513; $14,784; $16,926; $20,794; $27,346.]
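The day-of-week and length-of-employment variables in these charts are derived from dates already on the claim record. A minimal Python sketch of those derivations follows. The function names, the sample dates, and the use of 365.25 days per year are assumptions made for illustration.

```python
from datetime import date

DAY_NAMES = ["Monday", "Tuesday", "Wednesday", "Thursday", "Friday", "Saturday", "Sunday"]

# Upper bounds (exclusive, in years) for the employment-length bands on the chart.
TENURE_BOUNDS = [
    (1.0, "0.0 - 0.9"),
    (2.0, "1.0 - 1.9"),
    (5.0, "2.0 - 4.9"),
    (10.0, "5.0 - 9.9"),
    (25.0, "10.0 - 24.9"),
]


def weekday_name(d):
    """Day-of-week name for a date; date.weekday() returns 0 for Monday."""
    return DAY_NAMES[d.weekday()]


def years_employed(hire_date, loss_date):
    """Approximate years between hire and loss (assumes 365.25 days per year)."""
    return (loss_date - hire_date).days / 365.25


def tenure_band(years):
    """Return the chart band label for a length of employment in years."""
    for upper, label in TENURE_BOUNDS:
        if years < upper:
            return label
    return "25 or More"


# Hypothetical claim dates.
hire = date(2010, 4, 1)
loss = date(2014, 10, 17)
reported = date(2014, 10, 20)

print(weekday_name(loss), weekday_name(reported), tenure_band(years_employed(hire, loss)))
```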
First 38 of 16,000 Claims
Document | Length | Number of words | Words (Word 1 to Word 10)
1 | 70 | 4 | iw trip foot fell
2 | 70 | 5 | ie fell walk build caus
3 | 58 | 4 | iw pull left shoulder
4 | 64 | 8 | iw numb left arm due repetit motion comput
5 | 70 | 4 | iw go slip ice
6 | 70 | 5 | offic walk door injur hit
7 | 43 | 5 | iw injur right wrist use
8 | 70 | 2 | iw way
9 | 70 | 2 | iw way
10 | 70 | 4 | iw hurt right finger
11 | 70 | 7 | iw feel pain lower back area felt
12 | 70 | 2 | ie walk
13 | 70 | 7 | iw step car trip fell park lot
14 | 70 | 7 | iw pain rt wrist numb hand came
15 | 69 | 5 | iw experienc pain shoulder iw
16 | 70 | 5 | complain pain numb arm around
17 | 51 | 3 | ie feel fell
18 | 70 | 3 | slip door fell
19 | 70 | 3 | type come build
20 | 70 | 4 | type come build pain
21 | 69 | 4 | iw walk stair caus
22 | 70 | 4 | state felt pain wrist
23 | 70 | 5 | outsid ie come slip fell
24 | 70 | 4 | walk slip stair fall
25 | 69 | 6 | went park lot way back step
26 | 70 | 7 | iw experienc pain numb right elbow way
27 | 70 | 4 | iw restroom build floor
28 | 70 | 2 | ie fell
29 | 70 | 7 | walk break room slip fell water land
30 | 69 | 4 | go step fell knee
31 | 70 | 7 | iw fell injur lower back iw state
32 | 70 | 3 | iw vehicl come
33 | 70 | 6 | iw complain pain rt arm shoulder
34 | 70 | 3 | come side slip
35 | 70 | 4 | ie go stair hand
36 | 69 | 9 | iw slip fell floor fell onto right knee caus
37 | 69 | 5 | iw pain rt wrist pain
38 | 70 | 5 | right wrist possibl repetit comput
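Once loss descriptions are reduced to word fragments like these, a simple first step is to count how many descriptions contain each fragment. The Python sketch below does that for five of the documents listed on this slide. The variable names are illustrative, and this counting step is only an example of working with the fragments, not a description of how the presentation's text concepts were built.

```python
from collections import Counter

# Five stemmed loss descriptions copied from the slide (documents 1, 5, 17, 18, 28).
documents = {
    1: ["iw", "trip", "foot", "fell"],
    5: ["iw", "go", "slip", "ice"],
    17: ["ie", "feel", "fell"],
    18: ["slip", "door", "fell"],
    28: ["ie", "fell"],
}

# Document frequency: in how many descriptions does each fragment appear?
document_frequency = Counter()
for fragments in documents.values():
    # set() so a fragment repeated within one description counts once.
    document_frequency.update(set(fragments))

print(document_frequency["fell"], document_frequency["slip"])
```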
Word fragments by concept (Concepts 4 to 49 not shown)
Concept 1 | Concept 2 | Concept 3 | Concept 50
ee | iw | fell | ee
iw | fell | ie | fell
fell | slip | slip | iw
slip | caus | walk | slip
walk | trip | park | walk
pain | knee | lot | park
hand | hit | ice | lot
left | ie | stair | floor
ie | stair | knee | ice
right | lot | injur | stair
back | rt | floor | trip
wrist | park | trip | step
caus | ice | step | wet
rt | walk | ankl | build
Most significant 14 words for each Concept group shown. Over 100 total words are identified for each Concept.
[Chart: Average Limited Loss By Concept 1 of 50. Horizontal axis: Loss Description Concept Index Value (1 to 10). All-claim average: $7,626. Bar labels as extracted: 1,205; 2,441; 4,680; 6,051; 6,854; 8,090; 7,923; 7,621; 8,158; 10,698.]
Testing Client A
[Chart: Average Severity of Score Bands Using Aon's Predictive Model. Horizontal axis: Predicted Loss Size Percentile-Based Scoring Group, in bands 0-10; 11-20; 21-30; 31-40; 41-50; 51-60; 61-70; 71-80; 81-85; 86-90; 91-95; 96-100. Vertical axis: Average Loss. All-claim average: $11,480. Bar labels as extracted: $44,583; $665; $78,871; $586; $839; $1,059; $925; $1,488; $2,092; $6,174; $16,214; $53,706.]
Variable Types
[Chart: Low, Mid, and High: Using Aon Predictive Model. Horizontal axis: Predicted Loss Size Percentile-Based Scoring Group, in groups Low: 0-80; Mid: 81-95; High: 96-100. Vertical axis: Average Loss. All-claim average: $11,480. Bar labels as extracted: $39,367; $1,709; $78,871.]
Chart annotations:
– Knowledge before Predictive Modeling
– Standard Mitigation Effort: The Best 80% of Scored Claims (Lowest Expected Loss)
– Strong Mitigation Effort: The Mid/High 15% of Scored Claims
– Aggressive Mitigation Effort: The Worst 5% of Scored Claims (Highest Expected Loss)
– Mid or High Claims, Strong Accuracy: 78% of new claims predicted in these groups were valued in the top 20% 3 years later.
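The grouping on this slide amounts to a lookup from a claim's score percentile to a mitigation effort. A minimal Python sketch of that lookup follows, using the slide's cut points of 80 and 95. The function name `mitigation_tier` and the treatment of the score as a number from 0 to 100 are assumptions.

```python
def mitigation_tier(score_percentile):
    """Map a predicted-loss percentile (0 to 100) to the slide's group and effort."""
    if not 0 <= score_percentile <= 100:
        raise ValueError("percentile must be between 0 and 100")
    if score_percentile <= 80:
        return "Low", "Standard Mitigation Effort"
    if score_percentile <= 95:
        return "Mid", "Strong Mitigation Effort"
    return "High", "Aggressive Mitigation Effort"


# Hypothetical scores for three newly reported claims.
for score in (50, 90, 99):
    print(score, mitigation_tier(score))
```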
Expected Outcome
– Sort claims and allocate resources to those claims with the most potential for cost increases
– Active identification and management of these claims will result in better outcomes
– And next year we will recalibrate…
Contact Information
Christian Coleianne, FCAS, MAAA
Associate Director and Actuary
Actuarial & Analytics
+1.410.309.0741
[email protected]