data mining in health insurance


Upload: guy

Post on 06-Jan-2016


Page 1: Data mining in Health Insurance

Data mining in Health Insurance

Page 2: Data mining in Health Insurance

Introduction

• Rob Konijn, [email protected]
  – VU University Amsterdam
  – Leiden Institute of Advanced Computer Science (LIACS)
  – Achmea Health Insurance
    • Currently working here
    • Delivering leads for other departments to follow up
      – Fraud, abuse
• Research topic keywords: data mining / unsupervised learning / fraud detection

Page 3: Data mining in Health Insurance

Outline

• Intro Application
  – Health Insurance
  – Fraud detection
• Part 1: Subgroup discovery
• Part 2: Anomaly detection (slides partly by Z. Slavik, VU)

Page 4: Data mining in Health Insurance

Intro Application

• Health Insurance Data
• Health Insurance in NL
  – Obligatory
  – Only private insurance companies
  – About 100 euro/month (everyone) + 170 euro (income)
  – Premium increase of 5-12% each year
• Achmea: about 6 million customers

Page 5: Data mining in Health Insurance

Funding of Health Insurance Costs in the Netherlands

[Diagram: money flows between the insured (verzekerde), the health insurer (zorgverzekeraar) and the equalisation fund (vereveningsfonds)]

• Government contribution for insured under 18: 2 bln
• Income-dependent contribution via employers: 17 bln
• Equalisation contribution (vereveningsbijdrage): 18 bln
• Nominal premium, insured aged 18+:
  – Calculation premium (~€ 947 per insured): 12 bln
  – Surcharge (~€ 150 per insured): 2 bln
• Health care expenditure: 30 bln

Page 6: Data mining in Health Insurance

Risk equalisation model (vereveningsmodel)

• By population characteristics
  – Age
  – Gender
  – Income, social class
  – Type of work
• Calculation afterwards
  – High costs compensation (> 15,000 euro)

[Figure: amount per age group, from 0-4 years to 90 years and older, shown separately for men (Mannen) and women (Vrouwen)]

Page 7: Data mining in Health Insurance

Fraud in health care

Page 8: Data mining in Health Insurance

Introduction Application: The Data

• Transactional data
  – Records of an event
  – Visit to a medical practitioner
• Charged directly by medical practitioner
• Patient is not involved
• Risk of fraud

Page 9: Data mining in Health Insurance

Transactional Data

• Transactions: facts
  – Achmea: about 200 mln transactions per year
• Info of customers and practitioners: dimensions

Page 10: Data mining in Health Insurance

Different levels of hierarchy

• Records represent events
• However, for fraud detection, for example, we are interested in customers or medical practitioners
• See examples next pages
• Groups of records: Subgroup Discovery
• Individual patients/practitioners: outlier detection

Page 11: Data mining in Health Insurance

Different types of fraud hierarchy

• On a patient level, or on a hospital level:

Page 12: Data mining in Health Insurance

Handling different hierarchy

• Creating profiles from transactional data
• Aggregating costs over a time period
  – Each record: patient
    • Each attribute i = 1 to n: cost spent on treatment i
• Feature construction, for example
  – The ratio of long/short consults (G.P.)
  – The ratio of 3-way and 2-way fillings (Dentist)
  – Usually used for one-way analysis
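The profile construction above can be sketched in a few lines of Python. This is only an illustration: the transaction layout, the treatment codes `consult_short` and `consult_long`, and the helper names are assumptions, not the format used at Achmea. The same grouping could be done per practitioner instead of per patient.

```python
from collections import defaultdict

# Hypothetical transactions: (patient_id, treatment_code, cost_in_euro).
transactions = [
    ("p1", "consult_short", 9.0),
    ("p1", "consult_long", 18.0),
    ("p1", "consult_long", 18.0),
    ("p2", "consult_short", 9.0),
    ("p2", "consult_short", 9.0),
    ("p2", "consult_long", 18.0),
]

def build_profiles(rows):
    """Aggregate total cost and number of claims per patient and treatment."""
    cost = defaultdict(lambda: defaultdict(float))
    count = defaultdict(lambda: defaultdict(int))
    for patient, code, amount in rows:
        cost[patient][code] += amount
        count[patient][code] += 1
    return cost, count

def ratio_feature(count, numerator, denominator):
    """Constructed feature: ratio of two claim counts (None if undefined)."""
    result = {}
    for patient, codes in count.items():
        denom = codes.get(denominator, 0)
        result[patient] = codes.get(numerator, 0) / denom if denom else None
    return result

cost, count = build_profiles(transactions)
long_short = ratio_feature(count, "consult_long", "consult_short")
print(dict(cost["p1"]))
print(long_short)
```

Each patient becomes one record whose attributes are the aggregated costs per treatment, and the ratio is one extra constructed attribute.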

Page 13: Data mining in Health Insurance

Different types of fraud detection

• Supervised
  – A labeled fraud set
  – A labeled non-fraud set
  – Credit cards, debit cards
• Unsupervised
  – No labels
  – Health Insurance, Cargo, telecom, tax etc.

Page 14: Data mining in Health Insurance

Unsupervised learning in Health Insurance Data

• Anomaly Detection (outlier detection)
  – Finding individual deviating points
• Subgroup Discovery
  – Finding (descriptions of) deviating groups
• Focus on differences and uncommon behavior
  – In contrast to other unsupervised learning methods
    • Clustering
    • Frequent Pattern mining

Page 15: Data mining in Health Insurance

Subgroup Discovery

• Goal: Find differences in claim behavior of medical practitioners
• To detect inefficient claim behavior
  – Actions:
    • A visit from the account manager
    • To include in contract negotiations
  – In the extreme case: fraud
    • Investigation by the fraud detection department
• By describing deviations of a practitioner from its peers
  – Subgroups

Page 16: Data mining in Health Insurance

Patient-level, Subgroup Discovery

• Subgroup (orange): group of patients
• Target (red)
  – Indicates whether a patient visited a practitioner (1), or not (0)

Page 17: Data mining in Health Insurance

Subgroup Discovery: Quality Measures

• Target dentist: 1672 patients
  – Compare with peer group, 100,000 patients in total
• Subgroup V11 > 42 euro: 10347 patients
  – V11: one-sided filling
• Cross table

              target dentist     rest     total
  V11 >= 42              871     9476     10347
  rest                   801    88852     89653
  total                 1672    98328    100000

Page 18: Data mining in Health Insurance

The cross table

• Cross table in data:

              target dentist     rest     total
  V11 >= 42              871     9476     10347
  rest                   801    88852     89653
  total                 1672    98328    100000

• Cross table expected, assuming independence:

              target dentist     rest     total
  V11 >= 42              173    10174     10347
  rest                  1499    88154     89653
  total                 1672    98328    100000

Page 19: Data mining in Health Insurance

Calculating Wracc and Lift

• Size subgroup = P(S) = 0.10347, size target dentist = P(T) = 0.01672
• Weighted Relative ACCuracy (WRAcc) = P(ST) – P(S)P(T) = (871 – 173)/100000 = 698/100000
• Lift = P(ST)/(P(S)P(T)) = 871/173 = 5.03

Cross table expected (assuming independence):

              target dentist     rest     total
  V11 >= 42              173    10174     10347
  rest                  1499    88154     89653
  total                 1672    98328    100000

Cross table in data:

              target dentist     rest     total
  V11 >= 42              871     9476     10347
  rest                   801    88852     89653
  total                 1672    98328    100000
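The WRAcc and Lift calculation can be checked with a short Python sketch. The function name and argument names are made up for illustration; the counts are the ones from the cross table on this slide.

```python
def wracc_and_lift(n_subgroup_and_target, n_subgroup, n_target, n_total):
    """Return the expected count under independence, WRAcc and lift."""
    p_st = n_subgroup_and_target / n_total   # P(ST)
    p_s = n_subgroup / n_total               # P(S)
    p_t = n_target / n_total                 # P(T)
    expected = p_s * p_t * n_total           # expected count in cell (S, T)
    wracc = p_st - p_s * p_t
    lift = p_st / (p_s * p_t)
    return expected, wracc, lift

# Counts from the dentistry example: subgroup V11 >= 42, one target dentist.
expected, wracc, lift = wracc_and_lift(871, 10347, 1672, 100000)
print(round(expected), round(wracc, 5), round(lift, 2))
```

The expected count rounds to 173, so WRAcc is about 698/100000 and the lift is about 5.03.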

Page 20: Data mining in Health Insurance

Example dentistry, at depth 1, one target dentist

Page 21: Data mining in Health Insurance

ROC analysis, target dentist

Page 22: Data mining in Health Insurance

Making SD more useful: adding prior knowledge

• Adding prior knowledge
  – Background variables patient (age, gender, etc.)
  – Specialism practitioner
  – For dentistry: choice of insurance
• Adding already known differences
  – Already detected by domain experts themselves
  – Already detected during a previous data mining run

Page 23: Data mining in Health Insurance

Prior Knowledge, Motivation

Page 24: Data mining in Health Insurance

Example, influence of prior knowledge

Page 25: Data mining in Health Insurance

The idea: create an expected cross table using prior knowledge

Page 26: Data mining in Health Insurance

Quality Measures

• Ratio (Lift)
• Difference (WRAcc)
• Squared sum (Chi-square statistic)
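The three measures all compare an observed cross table with an expected one. A minimal sketch, assuming 2x2 tables given as nested lists (a layout chosen here for illustration) and using the dentistry tables from the earlier slides as input. The expected table could equally come from prior knowledge instead of the independence assumption.

```python
def quality_measures(observed, expected):
    """Compare observed and expected 2x2 tables.

    Rows: subgroup / rest. Columns: target / rest.
    Returns (ratio, difference, chi_square).
    """
    n = sum(sum(row) for row in observed)
    o, e = observed[0][0], expected[0][0]
    ratio = o / e                      # lift-style measure
    difference = (o - e) / n           # WRAcc-style measure
    chi_square = sum(
        (observed[i][j] - expected[i][j]) ** 2 / expected[i][j]
        for i in range(2)
        for j in range(2)
    )
    return ratio, difference, chi_square

observed = [[871, 9476], [801, 88852]]
expected = [[173, 10174], [1499, 88154]]
ratio, difference, chi_square = quality_measures(observed, expected)
print(round(ratio, 2), round(difference, 5), round(chi_square, 1))
```

The ratio and difference only look at the subgroup-and-target cell, while the chi-square statistic sums the squared deviations over all four cells.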

Page 27: Data mining in Health Insurance

Example, iterative approach

• Idea: add subgroup to prior knowledge iteratively
• Target = single pharmacy
• Patients that visited the hospital in last 3 years removed from data
• Compare with peer group (400,000 patients), 2929 patients of target pharmacy
• Top subgroup: “B03XA01 (Erythropoietin) > 0 euro”

                 target pharmacy       rest
  B03XA01 > 0               1297        224
  rest                      1632    396,847

Page 28: Data mining in Health Insurance

Next iteration

• Add “B03XA01 (EPO) > 0 euro” to prior knowledge
• Next best subgroup: “N05AX08 (Risperdal) >= 500 euro”

Page 29: Data mining in Health Insurance

Figure describing subgroup: N05AX08 > 500

Left: target pharmacy, right: other pharmacies

Page 30: Data mining in Health Insurance

Addition: adding costs to quality measure

– M55: dental cleaning
– V11: 1-way filling
– V21: polishing

• Cost of treatments in subgroup: 370 euro (average)
• 791 more patients than expected
• Total quality: 791 × 370 ≈ 292,469 euro (using the unrounded average cost)

Page 31: Data mining in Health Insurance

Iterative approach, top 3 subgroups

• V12: 2-sided filling
• V21: polishing
• V60: indirect pulp capping

V21 and V60 are not allowed on the same day.
Claim back (from all dentists): 1.3 million euro

Page 32: Data mining in Health Insurance

3d isometrics, cost based QM

Page 33: Data mining in Health Insurance
Page 34: Data mining in Health Insurance

Other target types: double binary target

• Target 1: year: 2009 or 2008
• Target 2: target practitioner
• Pattern:
  – M59: extensive (expensive) dental cleaning
  – C12: second consult in one year
• Cross table:

Page 35: Data mining in Health Insurance

Other target types: Multiclass target

• Subgroup (orange): group of patients
• Target (red), now is a multi-value column, one value per dentist

Page 36: Data mining in Health Insurance

Multiclass target, in ROC Space

Page 37: Data mining in Health Insurance

Anomaly Detection

The example above contains a contextual anomaly...

Page 38: Data mining in Health Insurance

Outline Anomaly Detection

• Anomalies
  – Definition
  – Types
  – Technique categories
  – Examples
• Lecture based on
  – Chandola et al. (2009). Anomaly Detection: A Survey
  – Paper in BB

Page 39: Data mining in Health Insurance

Definition

• “Anomaly detection refers to the problem of finding patterns in data that do not conform to expected behavior”
• Anomalies, a.k.a.
  – Outliers
  – Discordant observations
  – Exceptions
  – Aberrations
  – Surprises
  – Peculiarities
  – Contaminants

Page 40: Data mining in Health Insurance

Anomaly types

• Point anomalies
  – A data point is anomalous with respect to the rest of the data

Page 41: Data mining in Health Insurance

Not covered today

• Other types of anomalies:
  – Collective anomalies
  – Contextual anomalies
• Other detection approaches:
  – Supervised learning
  – Semi-supervised
    • Assume training data is from normal class
    • Use to detect anomalies in the future

Page 42: Data mining in Health Insurance

We focus on outlier scores

• Scores
  – You get a ranked list of anomalies
  – “We investigate the top 10”
  – “An anomaly has a score of at least 134”
  – Leads followed by fraud investigators
• Labels

[Figure: ANOMALY label]

Page 43: Data mining in Health Insurance

Detection method categorisation

1. Model based
2. Depth based
3. Distance based
4. Information theory related (not covered)
5. Spectral theory related (not covered)

Page 44: Data mining in Health Insurance

Model based

• Build a (statistical) model of the data

• Data instances occur in high probability regions of a stochastic model, while anomalies occur in low probability regions

• Or: data instances that have a high distance to the model are outliers

• Or: data instances that have a high influence on the model are outliers

Page 45: Data mining in Health Insurance

Example: one way outlier detection

• Pharmacy records
• Records represent patients
• One attribute at a time:
  – This example: attribute describing the costs spent on fertility medication (gonadotropin) in a year
• We could use such one-way detection for each attribute in the data
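A one-way, model-based score can be sketched as follows. The cost values are invented, and the normal distribution is only an assumed model choice; the slides do not say which parametric density was fitted.

```python
import math
from statistics import NormalDist

# Hypothetical yearly costs (euro) per patient for a single attribute.
costs = {"p1": 120, "p2": 150, "p3": 130, "p4": 160, "p5": 140,
         "p6": 155, "p7": 135, "p8": 145, "p9": 2500}

# Fit a normal distribution to the attribute (assumed model).
model = NormalDist.from_samples(list(costs.values()))

# Outlier score: negative log density, so low-probability values score high.
scores = {p: -math.log(model.pdf(x)) for p, x in costs.items()}

# Ranked list of anomalies, highest score first.
ranked = sorted(scores.items(), key=lambda item: item[1], reverse=True)
print(ranked[:3])
```

Repeating this per attribute gives one ranked list per attribute. Note that the extreme value itself inflates the fitted standard deviation, which is one reason to also look at non-parametric alternatives.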

Page 46: Data mining in Health Insurance

Example, model = parametric probability density function

Page 47: Data mining in Health Insurance

Example, model = non-parametric distribution

• Left: kernel density estimate
• Right: boxplot
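The boxplot view corresponds to a simple rule: points beyond the whiskers are flagged. A minimal sketch, assuming the common convention of 1.5 times the interquartile range for the whiskers (the slide does not state the whisker length) and invented cost values.

```python
from statistics import quantiles

def boxplot_outliers(values, whisker=1.5):
    """Return the values outside the boxplot whiskers (Tukey fences)."""
    q1, _, q3 = quantiles(values, n=4)   # quartiles
    iqr = q3 - q1                        # interquartile range
    lower = q1 - whisker * iqr
    upper = q3 + whisker * iqr
    return [v for v in values if v < lower or v > upper]

# Hypothetical yearly costs (euro) for one attribute.
costs = [120, 150, 130, 160, 140, 155, 135, 145, 2500]
outliers = boxplot_outliers(costs)
print(outliers)
```

Because quartiles are hardly affected by a few extreme values, this rule is less sensitive to the outliers themselves than a fitted mean and standard deviation.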

Page 48: Data mining in Health Insurance

Example: regression model

Page 49: Data mining in Health Insurance

Other models possible

• Probabilistic
  – Bayesian networks
• Regression models
  – Regression trees / random forests
  – Neural networks
• Outlier score = prediction error (residual)
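The residual-as-score idea can be sketched with the simplest regression model, a least-squares line. The attributes (number of visits, total cost) and the data are hypothetical; any of the models listed above could take the place of the line.

```python
def fit_line(xs, ys):
    """Ordinary least squares for y = intercept + slope * x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope

# Hypothetical patients: number of visits and total cost in euro.
visits = [1, 2, 3, 4, 5, 6]
cost = [50, 100, 150, 200, 600, 300]

intercept, slope = fit_line(visits, cost)

# Outlier score = absolute prediction error (residual).
scores = [abs(y - (intercept + slope * x)) for x, y in zip(visits, cost)]
most_outlying = scores.index(max(scores))
print(round(slope, 2), most_outlying)
```

The patient with 5 visits and 600 euro gets the largest residual, since the fitted line predicts a much lower cost for that number of visits.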

Page 50: Data mining in Health Insurance

Depth based methods

• Applied on 1-4 dimensional datasets
  – Or 1-4 attributes at a time
• Objects that have a high distance to the “center of the data” are considered outliers
• Example Pharmacy:
  – Records represent patients
  – 2 attributes:
    • Costs spent on diabetes medication
    • Costs spent on diabetes testing material
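One depth notion for two attributes is the halfspace (Tukey) depth: the smallest number of points in any closed halfplane whose boundary passes through the point. The sketch below approximates it by trying a fixed number of directions, so it returns an upper bound rather than the exact depth. The data and the number of directions are assumptions for illustration.

```python
import math

def halfspace_depth(point, data, n_directions=180):
    """Approximate halfspace depth of `point` within `data` (2D).

    For each sampled direction, count the points in the closed halfplane
    on that side of `point`; the depth is the smallest count found.
    """
    px, py = point
    depth = len(data)
    for step in range(n_directions):
        angle = 2 * math.pi * step / n_directions
        ux, uy = math.cos(angle), math.sin(angle)
        count = sum(
            1 for x, y in data
            if (x - px) * ux + (y - py) * uy >= -1e-12
        )
        depth = min(depth, count)
    return depth

# Hypothetical patients: (cost of medication, cost of testing material).
patients = [(100, 50), (120, 60), (110, 55), (130, 65), (115, 70),
            (125, 45), (105, 62), (118, 58), (10, 300)]

depths = [halfspace_depth(p, patients) for p in patients]
print(depths)
```

Low depth means the point lies at the edge of the data cloud. The patient with low medication cost and very high testing-material cost gets depth 1, while points in the middle of the cloud get a higher depth.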

Page 51: Data mining in Health Insurance

Example: bagplot, halfspace depth

Page 52: Data mining in Health Insurance

Distance based (nearest neighbor based)

• Assumption:
  – Normal data instances occur in dense neighbourhoods, while anomalies occur far from their closest neighbours

Page 53: Data mining in Health Insurance

Similarity/distance

• You need a similarity measure between two data points
  – Numeric attributes: Euclidean, etc.
  – Nominal: simple match often enough
  – Multivariate:
    • Distance using all attributes
    • Distance between attribute values, then combine
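A distance that combines numeric and nominal attributes could look like the sketch below. The record layout, the attribute names and the scaling constants are hypothetical, and adding the squared terms under one square root is just one possible way to combine per-attribute distances.

```python
import math

def mixed_distance(a, b, numeric_keys, nominal_keys, scale):
    """Combine scaled Euclidean distance with simple matching.

    Numeric attributes: squared difference after dividing by a scale.
    Nominal attributes: 0 if the values match, 1 otherwise.
    """
    numeric = sum(((a[k] - b[k]) / scale[k]) ** 2 for k in numeric_keys)
    nominal = sum(1 for k in nominal_keys if a[k] != b[k])
    return math.sqrt(numeric + nominal)

# Hypothetical patient records.
a = {"cost": 100, "age": 40, "gender": "f"}
b = {"cost": 300, "age": 50, "gender": "m"}
scale = {"cost": 100, "age": 10}   # assumed scaling per numeric attribute

d = mixed_distance(a, b, ["cost", "age"], ["gender"], scale)
print(round(d, 3))
```

Scaling matters here: without it, the attribute with the largest numeric range (cost) would dominate the distance.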

Page 54: Data mining in Health Insurance

Example, dentistry data

• Records represent dentists
• Attributes are 14 cost categories
  – Denote the percentage of patients that received a claim from the category

Page 55: Data mining in Health Insurance

Option 1: Distance to kth neighbour as anomaly score
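A minimal sketch of this score, using Euclidean distance and a brute-force neighbour search. The dentist profiles (two cost categories instead of 14) and the choice k = 2 are made up for illustration.

```python
import math

def kth_neighbour_distance(data, k):
    """Anomaly score per instance: distance to its kth nearest neighbour."""
    scores = []
    for i, p in enumerate(data):
        dists = sorted(math.dist(p, q) for j, q in enumerate(data) if j != i)
        scores.append(dists[k - 1])
    return scores

# Hypothetical dentists: fraction of patients with a claim in 2 categories.
dentists = [(0.80, 0.10), (0.82, 0.12), (0.78, 0.11),
            (0.81, 0.09), (0.20, 0.90)]

scores = kth_neighbour_distance(dentists, k=2)
most_outlying = scores.index(max(scores))
print(most_outlying, [round(s, 3) for s in scores])
```

The brute-force search compares every pair of instances, which is fine for a few thousand practitioners but would need an index structure for data at the patient level.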

Page 56: Data mining in Health Insurance

Option 2: Use relative densities of neighbourhoods

• Density of neighbourhood estimated for each instance
• Instances in the low density neighbourhoods are anomalous, others normal
• Note:
  – Distance to kth neighbour is an estimate for the inverse of density (large distance means low density)
  – But this estimates outliers in varying density neighbourhoods badly

Page 57: Data mining in Health Insurance

LOF

• Local Outlier Factor:
  LOF = (average local density of k nearest neighbours) / (local density of instance)
• Local density:
  – k divided by the volume of the smallest hyper-sphere centred around the instance, containing k neighbours
• Anomalous instance:
  – Local density will be lower than that of the k nearest neighbours
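The ratio on this slide can be sketched directly. Note that this follows the simplified definition given here (density from the kth-neighbour hypersphere); the original LOF algorithm uses reachability distances instead. The data, k = 2, and the assumption that all points are distinct (otherwise the radius can be zero) are choices made for this example.

```python
import math

def knn(data, i, k):
    """(distance, index) pairs of the k nearest neighbours of data[i]."""
    others = sorted(
        (math.dist(data[i], q), j) for j, q in enumerate(data) if j != i
    )
    return others[:k]

def local_density(data, i, k):
    """k divided by the volume of the hypersphere reaching the kth neighbour."""
    radius = knn(data, i, k)[-1][0]
    d = len(data[i])
    volume = math.pi ** (d / 2) / math.gamma(d / 2 + 1) * radius ** d
    return k / volume

def simplified_lof(data, k):
    """Average local density of the k neighbours / local density of instance."""
    density = [local_density(data, i, k) for i in range(len(data))]
    scores = []
    for i in range(len(data)):
        neighbours = [j for _, j in knn(data, i, k)]
        average = sum(density[j] for j in neighbours) / k
        scores.append(average / density[i])
    return scores

# A dense cluster, a sparse cluster, and one point just outside the dense one.
data = [(0.0, 0.0), (0.1, 0.0), (0.0, 0.1), (0.1, 0.1),
        (5.0, 5.0), (6.0, 5.0), (5.0, 6.0), (6.0, 6.0),
        (0.6, 0.6)]
scores = simplified_lof(data, k=2)
print([round(s, 2) for s in scores])
```

Points inside either cluster score about 1. The last point scores far higher, even though its distance to its 2nd neighbour is smaller than that of the points in the sparse cluster, which is exactly the case the plain kth-neighbour distance handles badly.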

Page 58: Data mining in Health Insurance

Example LOF outlier, dentistry

Page 59: Data mining in Health Insurance

3. Clustering-based anomaly detection techniques

• 3 possibilities:
  1. Normal data instances belong to a cluster in the data, while anomalies do not belong to any cluster
     – Use clustering methods that do not force all instances to belong to a cluster
       • DBSCAN, ROCK, SNN
  2. Distance to the cluster center = outlier score
  3. Clusters with too few points are outlying clusters
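Possibility 2 can be sketched with a small k-means implementation. The data, the two fixed starting centres and the number of iterations are assumptions made to keep the example short and deterministic.

```python
import math

def kmeans(data, centres, n_iter=20):
    """Plain k-means starting from the given centres."""
    for _ in range(n_iter):
        groups = [[] for _ in centres]
        for p in data:
            nearest = min(range(len(centres)),
                          key=lambda c: math.dist(p, centres[c]))
            groups[nearest].append(p)
        # New centre = mean of the assigned points (keep old centre if empty).
        centres = [
            tuple(sum(col) / len(g) for col in zip(*g)) if g else c
            for g, c in zip(groups, centres)
        ]
    return centres

def centre_distance_scores(data, centres):
    """Outlier score: distance to the nearest cluster centre."""
    return [min(math.dist(p, c) for c in centres) for p in data]

data = [(0, 0), (0, 1), (1, 0), (1, 1),
        (10, 10), (10, 11), (11, 10), (11, 11),
        (5, 9)]
centres = kmeans(data, centres=[(0, 0), (10, 10)])
scores = centre_distance_scores(data, centres)
print(scores.index(max(scores)), [round(s, 2) for s in scores])
```

K-means forces every instance into a cluster, so the outlying point also pulls its cluster centre towards itself. The methods listed under possibility 1 avoid this by allowing instances to stay unassigned.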

Page 60: Data mining in Health Insurance

K-means with 6 clusters, centers of the dentistry data set

• Attributes: percentage of patients that received a claim from the cost category
• Clusters correspond to specialism
  1. Dentist
  2. Orthodontist
  3. Orthodontist (charged by dentist)
  4. Dentist
  5. Dentist
  6. Dental hygienist

Page 61: Data mining in Health Insurance

Combining Subgroup Discovery and Outlier Detection

• Describe regions with outliers using SD
• Identify suspicious medical practitioners
• 2 or 3 step approach to describe outliers:
  1. Calculate outlier score
  2. Use subgroup discovery to describe regions with outliers
  3. (optional) Identify the involved medical practitioners
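The three steps can be sketched end to end. Everything in this example is hypothetical: the records, the outlier scores (assumed to come from step 1), the score threshold of 3.0, and the two candidate subgroup descriptions. WRAcc is used as the quality measure, with the outlier flag as target.

```python
from collections import Counter

def wracc(covered, target):
    """WRAcc = P(ST) - P(S)P(T) for two boolean lists."""
    n = len(target)
    p_s = sum(covered) / n
    p_t = sum(target) / n
    p_st = sum(1 for c, t in zip(covered, target) if c and t) / n
    return p_st - p_s * p_t

# Hypothetical patient records with a precomputed outlier score (step 1).
records = [
    {"P30": 1200, "M55": 40, "practitioner": 221, "score": 4.1},
    {"P30": 1100, "M55": 35, "practitioner": 221, "score": 3.8},
    {"P30": 0, "M55": 45, "practitioner": 17, "score": 0.9},
    {"P30": 50, "M55": 30, "practitioner": 17, "score": 1.1},
    {"P30": 0, "M55": 90, "practitioner": 35, "score": 1.0},
    {"P30": 1300, "M55": 20, "practitioner": 221, "score": 3.5},
]
target = [r["score"] >= 3.0 for r in records]   # assumed threshold

# Step 2: rate candidate subgroup descriptions by how well they cover outliers.
candidates = {
    "P30 > 1050": lambda r: r["P30"] > 1050,
    "M55 > 40": lambda r: r["M55"] > 40,
}
quality = {
    name: wracc([cond(r) for r in records], target)
    for name, cond in candidates.items()
}
best = max(quality, key=quality.get)

# Step 3 (optional): which practitioners do the covered patients belong to?
involved = Counter(r["practitioner"] for r in records if candidates[best](r))
print(best, quality[best], involved.most_common(1))
```

A real subgroup discovery run would search over many descriptions instead of rating two fixed ones, but the target construction and the final count per practitioner would stay the same.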

Page 62: Data mining in Health Insurance

Example output:

• Look at patients with ‘P30>1050 euro’ for practitioner number 221

• Left: all data, right: practitioner 221

Page 63: Data mining in Health Insurance

Descriptions of outliers: LOCI outlier score

• 1. Calculate outlier score
  – LOCI is a density based outlier score
• 2. Describe outlying regions
• Result top subgroup:
  – Orthodontics (dentist) 0.044 ^ Orthodontics 0.78
  – Group of 9 dentists with an average score of 3.9

Page 64: Data mining in Health Insurance

Conclusions

• Health insurance: interesting application domain
  – Very relevant
• Outlier Detection and Subgroup Discovery are useful