measuring the quality of credit scoring models

30
Measuring the Quality of Credit Scoring Models Martin Řezáč Dept. of Mathematics and Statistics, Faculty of Science, Masaryk University CSCC XI, Edinburgh August 2009

Upload: others

Post on 26-Oct-2021

4 views

Category:

Documents


0 download

TRANSCRIPT

Page 1: Measuring the Quality of Credit Scoring Models

Measuring the Quality of Credit Scoring Models

Martin ŘezáčDept. of Mathematics and Statistics, Faculty of Science,

Masaryk University

CSCC XI, EdinburghAugust 2009

Page 2: Measuring the Quality of Credit Scoring Models

2/30

1. Introduction 3

2. Good/bad client definition 4

3. Measuring the quality 6

4. Indexes based on distribution function 75. Indexes based on density function 17

6. Some results for normally distributed scores 24

7. Conclusions 30

Content

Page 3: Measuring the Quality of Credit Scoring Models

3/30

Introductionq It is impossible to use scoring model effectively without knowing how good it is.

q Usually one has several scoring models and needs to select just one. The best one.

q Before measuring the quality of models one should know (among other things):Ø good/bad definitionØ expected reject rate

Page 4: Measuring the Quality of Credit Scoring Models

4/30

Good/bad client definition

Ø GoodØ BadØ IndeterminateØ InsufficientØ ExcludedØ Rejected.

q Good definition is the basic condition of effective scoring model.q The definition usually depends on:

q Generally we consider following types of client:

Ø days past due (DPD)Ø amount past dueØ time horizon

Page 5: Measuring the Quality of Credit Scoring Models

5/30

Customer

Default(60 or 90 DPD)

Not default

Fraud (first delayed payment, 90 DPD)

Early default(2-4 delayed payment, 60 DPD)

Late default (5+ delayed payment, 60 DPD)

Good/bad client definition

Rejected

Accepted

Insufficient

GOOD

BAD

INDETERMINATE

Page 6: Measuring the Quality of Credit Scoring Models

6/30

Measuring the qualityq Once the definition of good / bad client and client's score is available, it is possible to evaluate the quality of this score. If the score is an output of a predictive model (scoring function), then we evaluate the quality of this model. We can consider two basic types of quality indexes. First, indexes based on cumulative distribution function like

The second, indexes based on likelihood density function like

Ø Kolmogorov-Smirnov statistics (KS)Ø Gini indexØ C-statisticsØ Lift.

Ø Mean difference (Mahalanobis distance)Ø Informational statistics/value (IVal).

Page 7: Measuring the Quality of Credit Scoring Models

7/30

Indexes based on distribution function

=.,0

,1otherwise

goodisclientDK

( )11)(1

. =∧≤= ∑=

Ki

n

iGOODn DasI

naF

( )01)(1

. =∧≤= ∑=

Ki

m

iBADm DasI

maF

[ ]HLa ,∈( )asIN

aF i

N

iALLN ≤= ∑

=1.

1)(

Ø Empirical distribution functions:

( )

=otherwise

trueisAAI

01

mnmpB +

=,mn

npG +=

Number of good clients:Number of bad clients:Proportions of good/bad clients:

nm

Ø Kolmogorov-Smirnov statistics (KS)

[ ])()(max ,,,

aFaFKS GOODnBADmHLa−=

Page 8: Measuring the Quality of Credit Scoring Models

8/30

Ø Lorenz curve (LC)

Ø Gini index

ABA

AGini 2=+

=

[ ].,),()(

.

.

HLaaFyaFx

GOODn

BADm

∈==

( ) ( )∑+

=−− +⋅−−=

mn

kkGOODnGOODnkBADmkBADm FFFFGini

k2

1..1..1

kBADmF . kGOODnF .where ( ) is k-th vector value of empirical distribution function of bad (good) clients

Indexes based on distribution function

Page 9: Measuring the Quality of Credit Scoring Models

9/30

21 GiniCAstatc +

=+=−

( )012121 =∧=≥=− KK DDssPstatc

It represents the likelihood that randomly selected good client has higher score than randomly selected bad client, i.e.

c

Indexes based on distribution function

Ø C-statistics:

Page 10: Measuring the Quality of Credit Scoring Models

10/30

q Another possible indicator of the quality of scoring model can be cumulative Lift, which says, how many times, at a given level of rejection, is the scoring model better than random selection (random model). More precisely, the ratio indicates the proportion of bad clients with less than a score a, , to the proportion of bad clients in the general population. Formally, it can be expressed by:

[ ]HLa ,∈

( )

( )

( )

( )

( )

( )

Nn

asI

YasI

YYI

YI

asI

YasI

BadRateaCumBadRateaLift

i

mn

i

i

mn

i

mn

i

mn

i

i

mn

i

i

mn

i

=∧≤

=

=∨=

=

=∧≤

==∑

∑+

=

+

=

+

=

+

=

+

=

+

=

1

1

1

1

1

10

10

0

0

)()(

Indexes based on distribution function

BadRateaBadRateaabsLift )()( =

Page 11: Measuring the Quality of Credit Scoring Models

11/30

# bad clients Bad rate abs. Lift # bad clients Bad rate cum. Lift1 100 16 16,0% 3,20 16 16,0% 3,20 2 100 12 12,0% 2,40 28 14,0% 2,80 3 100 8 8,0% 1,60 36 12,0% 2,40 4 100 5 5,0% 1,00 41 10,3% 2,05 5 100 3 3,0% 0,60 44 8,8% 1,76 6 100 2 2,0% 0,40 46 7,7% 1,53 7 100 1 1,0% 0,20 47 6,7% 1,34 8 100 1 1,0% 0,20 48 6,0% 1,20 9 100 1 1,0% 0,20 49 5,4% 1,09 10 100 1 1,0% 0,20 50 5,0% 1,00 All 1000 50 5,0%

absolutely cumulativelydecile # cleints

Indexes based on distribution function

-

0,50

1,00

1,50

2,00

2,50

3,00

3,50

1 2 3 4 5 6 7 8 9 10

decile

Lift

valu

e

abs. Liftcum. Lift

Gini=0,55

0

0,2

0,4

0,6

0,8

1

0 0,2 0,4 0,6 0,8 1

Lornz curveBase line

q Usually it is computed using table with numbers of all and bad clients in some bands (deciles).

Page 12: Measuring the Quality of Credit Scoring Models

12/30

# bad clients Bad rate abs. Lift # bad clients Bad rate cum. Lift1 100 8 8,0% 1,60 8 8,0% 1,60 2 100 12 12,0% 2,40 20 10,0% 2,00 3 100 16 16,0% 3,20 36 12,0% 2,40 4 100 5 5,0% 1,00 41 10,3% 2,05 5 100 3 3,0% 0,60 44 8,8% 1,76 6 100 2 2,0% 0,40 46 7,7% 1,53 7 100 1 1,0% 0,20 47 6,7% 1,34 8 100 1 1,0% 0,20 48 6,0% 1,20 9 100 1 1,0% 0,20 49 5,4% 1,09 10 100 1 1,0% 0,20 50 5,0% 1,00 All 1000 50 5,0%

decile # cleintsabsolutely cumulatively

Indexes based on distribution function

-

0,50

1,00

1,50

2,00

2,50

3,00

3,50

1 2 3 4 5 6 7 8 9 10

decile

Lift

valu

e

abs. Liftcum. LiftGini=0,48

0

0,2

0,4

0,6

0,8

1

0 0,2 0,4 0,6 0,8 1

Lornz curveBase line

q When bad rates are not monotone:

Ø LC looks fineØ Gini is slightly loweredØ Lift looks strange

Page 13: Measuring the Quality of Credit Scoring Models

13/30

# bad clients Bad rate abs. Lift # bad clients Bad rate cum. Lift1 100 16 16,0% 3,20 16 16,0% 3,20 2 100 12 12,0% 2,40 28 14,0% 2,80 3 100 8 8,0% 1,60 36 12,0% 2,40 4 100 5 5,0% 1,00 41 10,3% 2,05 5 100 3 3,0% 0,60 44 8,8% 1,76 6 100 2 2,0% 0,40 46 7,7% 1,53 7 100 1 1,0% 0,20 47 6,7% 1,34 8 100 1 1,0% 0,20 48 6,0% 1,20 9 100 1 1,0% 0,20 49 5,4% 1,09 10 100 1 1,0% 0,20 50 5,0% 1,00 All 1000 50 5,0%

absolutely cumulativelydecile # cleints

Indexes based on distribution function

# bad clients Bad rate abs. Lift # bad clients Bad rate cum. Lift1 100 1 1,0% 0,20 1 1,0% 0,20 2 100 1 1,0% 0,20 2 1,0% 0,20 3 100 1 1,0% 0,20 3 1,0% 0,20 4 100 1 1,0% 0,20 4 1,0% 0,20 5 100 2 2,0% 0,40 6 1,2% 0,24 6 100 3 3,0% 0,60 9 1,5% 0,30 7 100 5 5,0% 1,00 14 2,0% 0,40 8 100 8 8,0% 1,60 22 2,8% 0,55 9 100 12 12,0% 2,40 34 3,8% 0,76 10 100 16 16,0% 3,20 50 5,0% 1,00 All 1000 50 5,0%

absolutely cumulativelydecile # cleints

Gini= - 0,55

0

0,2

0,4

0,6

0,8

1

0 0,2 0,4 0,6 0,8 1

Lornz curveBase line

-

0,50

1,00

1,50

2,00

2,50

3,00

3,50

1 2 3 4 5 6 7 8 9 10

decile

Lift

valu

e

abs. Liftcum. Lift

q When score is reversed, we obtain reversed figures.

Page 14: Measuring the Quality of Credit Scoring Models

14/30

Gini = 0.42

0

0,2

0,4

0,6

0,8

1

0 0,2 0,4 0,6 0,8 1

Lornz curveBase line

Indexes based on distribution function

# bad clients Bad rate1 100 35 35,0%2 100 16 16,0%3 100 8 8,0%4 100 8 8,0%5 100 7 7,0%6 100 6 6,0%7 100 6 6,0%8 100 5 5,0%9 100 5 5,0%10 100 4 4,0%All 1000 100 10,0%

decile # cleints

# bad clients Bad rate1 100 20 20,0%2 100 18 18,0%3 100 17 17,0%4 100 15 15,0%5 100 12 12,0%6 100 6 6,0%7 100 4 4,0%8 100 3 3,0%9 100 3 3,0%10 100 2 2,0%All 1000 100 10,0%

decile # cleints

Ø SC 1:

K-S = 0.36

0

0,1

0,2

0,3

0,4

0,5

0,6

0,7

0,8

0,9

1

0 0,1 0,2 0,3 0,4 0,5 0,6 0,7 0,8 0,9 1

goodbad

K-S = 0.34

0

0,1

0,2

0,3

0,4

0,5

0,6

0,7

0,8

0,9

1

0 0,1 0,2 0,3 0,4 0,5 0,6 0,7 0,8 0,9 1

goodbad

Gini= 0,42

0

0,2

0,4

0,6

0,8

1

0 0,2 0,4 0,6 0,8 1

Lornz curveBase line

Ø SC 2:

q The Gini is not enough!!!

Page 15: Measuring the Quality of Credit Scoring Models

15/30

Indexes based on distribution function

-

0,50

1,00

1,50

2,00

2,50

3,00

3,50

4,00

1 2 3 4 5 6 7 8 9 10

decile

Lift

valu

e

abs. Liftcum. Lift

-

0,50

1,00

1,50

2,00

2,50

1 2 3 4 5 6 7 8 9 10

decile

Lift

valu

e

abs. Liftcum. Lift

Ø SC 1: Ø SC 2:

Lift20% = 2.55 >Lift50% = 1.48 <

Lift20% = 1.90Lift50% = 1.64

SC 2 is better if reject rate is expected around 50%.SC 1 is much more better if reject rate is expected by 20%.

Page 16: Measuring the Quality of Credit Scoring Models

16/30

)()(

)(.

.

aFaF

aLiftALLN

BADn= [ ]HLa ,∈

( ))(1))(())(( 1

..1..

1.. qFF

qqFFqFFLift ALLNBADn

ALLNALLN

ALLNBADnq

−−

==

{ }qaFHLaqF ALLNALLN ≥∈=− )(],,[min)( .1.

( ).)1.0(10 1..%10

−⋅= ALLNBADn FFLift

Indexes based on distribution function

q Lift can be expressed and computed by formulae:

Page 17: Measuring the Quality of Credit Scoring Models

17/30

Indexes based on density function

)(xfGOOD )(xf BAD

( )∑=

=

−=n

Di

ihGOOD

K

sxKn

hxf1,1

1),(~ ( )∑=

=

−=m

Di

ihBAD

k

sxKm

hxf0

1

1),(~

12112

1

23

,~

)!32()52()!12( +

++

⋅⋅

+++

= kkk

kOS nk

kkkh σ

Ø Likelihood density functions:

Ø Kernel estimates:

Ø Optimal bandwidth (maximal smoothing):

where: is the order of kernel function(e.g. 2 for Epanechnikov kernel)is number of actual casesis an estimate of standard deviation

k

nσ~

Page 18: Measuring the Quality of Credit Scoring Models

18/30

Ø Mean difference (Mahalanobis distance):

21

22

+

+=

mnmSnS

S bg

SMM

D bg −=

Indexes based on density function

where S is pooled standard deviation:

gM bM

gS bS, are standard deviations of good (bad) clients

, are means of good (bad) clients

Page 19: Measuring the Quality of Credit Scoring Models

19/30

Ø Information value (Ival) – continuous case (Divergence):

( ) dxxfxfxfxfI

BAD

GOODBADGOODval

−= ∫

∞− )()(

ln)()(

)()()( xfxfxf BADGOODdiff −=

=

)()(ln)(

xfxfxf

BAD

GOODLR

Indexes based on density function

Page 20: Measuring the Quality of Credit Scoring Models

20/30

• Replace density functions by their kernel estimates and compute integral numerically (e.g. by composite trapezoidal rule).

• Using Epanechnikov kernel, given byand optimal bandwidth we have

• For given M+1 points we obtain

Ø Information value (Ival) – discretized continuous case:

( ) [ ]( )1,1143)( 2 −∈⋅−= xIxxK

++

−= ∑

=

1

10

0 )(~)(~2)(~2

M

iMIViIVIV

Mval xfxfxf

MxxI

( )

−=

),(~),(~

ln),(~),(~)(~

2,

2,2,2,

OSBAD

OSGOODOSBADOSGOODIV hxf

hxfhxfhxfxf

Indexes based on density function

kOSh ,

Mxx ,,0 K

0x Mx

Page 21: Measuring the Quality of Credit Scoring Models

21/30

• Create intervals of score – typically deciles. Number of goods (bads) in i-th interval is marked by .

• It must holds

• Then we have ∑

−=

i i

iiival nb

mgmb

ngI ln

Indexes based on density function

score int. # bad clients #good clients % bad [1] % good [2] [3] = [2] - [1] [4] = [2] / [1] [5] = ln[4] [6] = [3] * [5]1 1 10 2,0% 1,1% -0,01 0,53 -0,64 0,012 2 15 4,0% 1,6% -0,02 0,39 -0,93 0,023 8 52 16,0% 5,5% -0,11 0,34 -1,07 0,114 14 93 28,0% 9,8% -0,18 0,35 -1,05 0,195 10 146 20,0% 15,4% -0,05 0,77 -0,26 0,016 6 247 12,0% 26,0% 0,14 2,17 0,77 0,117 4 137 8,0% 14,4% 0,06 1,80 0,59 0,048 3 105 6,0% 11,1% 0,05 1,84 0,61 0,039 1 97 2,0% 10,2% 0,08 5,11 1,63 0,1310 1 48 2,0% 5,1% 0,03 2,53 0,93 0,03All 50 950 Info. Value 0,68

Ø Information statistics/value (Ival) – discrete case:

( )ii bg

ibg ii ∀>> 0,0

Page 22: Measuring the Quality of Credit Scoring Models

22/30

q Information value for our example of two scorecards:

Indexes based on density function

Ø SC 1:

ØSC 2:

decile # cleints # bad clients #good % bad [1] % good [2] [3] = [2] - [1] [4] = [2] / [1] [5] = ln[4] [6] = [3] * [5] cum. [6]1 100 35 65 35,0% 7,2% -0,28 0,21 -1,58 0,44 0,442 100 16 84 16,0% 9,3% -0,07 0,58 -0,54 0,04 0,473 100 8 92 8,0% 10,2% 0,02 1,28 0,25 0,01 0,484 100 8 92 8,0% 10,2% 0,02 1,28 0,25 0,01 0,495 100 7 93 7,0% 10,3% 0,03 1,48 0,39 0,01 0,506 100 6 94 6,0% 10,4% 0,04 1,74 0,55 0,02 0,527 100 6 94 6,0% 10,4% 0,04 1,74 0,55 0,02 0,558 100 5 95 5,0% 10,6% 0,06 2,11 0,75 0,04 0,599 100 5 95 5,0% 10,6% 0,06 2,11 0,75 0,04 0,63

10 100 4 96 4,0% 10,7% 0,07 2,67 0,98 0,07 0,70All 1000 100 900 Info. Value 0,70

decile # cleints # bad clients #good % bad [1] % good [2] [3] = [2] - [1] [4] = [2] / [1] [5] = ln[4] [6] = [3] * [5] cum. [6]1 100 20 80 20,0% 8,9% -0,11 0,44 -0,81 0,09 0,092 100 18 82 18,0% 9,1% -0,09 0,51 -0,68 0,06 0,153 100 17 83 17,0% 9,2% -0,08 0,54 -0,61 0,05 0,204 100 15 85 15,0% 9,4% -0,06 0,63 -0,46 0,03 0,225 100 12 88 12,0% 9,8% -0,02 0,81 -0,20 0,00 0,236 100 6 94 6,0% 10,4% 0,04 1,74 0,55 0,02 0,257 100 4 96 4,0% 10,7% 0,07 2,67 0,98 0,07 0,328 100 3 97 3,0% 10,8% 0,08 3,59 1,28 0,10 0,429 100 3 97 3,0% 10,8% 0,08 3,59 1,28 0,10 0,52

10 100 2 98 2,0% 10,9% 0,09 5,44 1,69 0,15 0,67All 1000 100 900 Info. Value 0,67

Page 23: Measuring the Quality of Credit Scoring Models

23/30

q Using markings we have:

−=

mb

ngI ii

diffi

Indexes based on density function

=

nbmgI

i

iLRi

ln

Ø SC 1:

ØSC 2:

-0,30

-0,25

-0,20

-0,15

-0,10

-0,05

0,00

0,05

0,10

1 2 3 4 5 6 7 8 9 10

-2,00

-1,50

-1,00

-0,50

0,00

0,50

1,00

1,50I_diffI_LR

0,00

0,05

0,10

0,15

0,20

0,25

0,30

0,35

0,40

0,45

0,50

1 2 3 4 5 6 7 8 9 100,00

0,10

0,20

0,30

0,40

0,50

0,60

0,70

0,80I_diif * I_LRcum. I_diff * I_LR

-0,15

-0,10

-0,05

0,00

0,05

0,10

1 2 3 4 5 6 7 8 9 10

-1,00

-0,50

0,00

0,50

1,00

1,50

2,00I_diffI_LR

0,00

0,02

0,04

0,06

0,08

0,10

0,12

0,14

0,16

1 2 3 4 5 6 7 8 9 100,00

0,10

0,20

0,30

0,40

0,50

0,60

0,70

0,80I_diif * I_LR

cum. I_diff * I_LR

K-S = 0.34Gini = 0.42Lift20% = 2.55Lift50% = 1.48Ival = 0.70Ival20% = 0.47Ival50% = 0.50

K-S = 0.36Gini = 0.42Lift20% = 1.90Lift50% = 1.64Ival = 0.67Ival20% = 0.15Ival50% = 0.23

Page 24: Measuring the Quality of Credit Scoring Models

24/30

Ø Assume that the scores of good and bad clients are normally distributed, i.e. we can write their densities as

Ø Estimates of parameters and :

Ø Pooled standard deviation:

Ø Estimates of mean and standard dev. of scores for all clients :

Some results for normally distributed scores

( )2

2

2

21)( g

gx

gGOOD exf σ

µ

πσ

−−

=

( )2

2

2

21)( b

bx

bBAD exf σ

µ

πσ

−−

=

gbb σµµ ,, bσ

,

.

gM bM

gS bS, are standard deviations of good (bad) clients

, are means of good (bad) clients

21

22

+

+=

mnmSnS

S bg

mnmMnM

MM bgALL +

+==

( )21

2

2222

+

+=

mnSmSn

S bgALL

ALLALL σµ ,

Page 25: Measuring the Quality of Credit Scoring Models

25/30

12

222

Φ⋅=

Φ−

Φ=

DDDKS

Where is the standardized normal distribution function, the normal distribution function with parameters , and is the standard quantile function.

σµµ bgD

−=

( )

⋅+Φ⋅Φ= − Dpq

qLift G

ALLq

11σ

σ

2DIval =

12

2 −

Φ⋅=DGini

SMM

D bg −=

Some results for normally distributed scores

Ø Assume that standard deviations are equal to a common value :σ

( )⋅Φ)(1 ⋅Φ −

)(2,⋅Φ

σµµ 2σ

( )

⋅+ΦΦ= − Dpq

SS

qLift G

ALLq

11

Page 26: Measuring the Quality of Credit Scoring Models

26/30

Ø Generally (i.e. without assumption of equality of standard deviations):

⋅+−⋅Φ−

⋅+−⋅Φ= cbDa

bD

bacbDa

bD

baKS bggb 2121 22 *2**2* σσσσ

Some results for normally distributed scores

,22gba σσ +=

22

*

bg

bgDσσ

µµ

+

−= 22

*

bg

bg

SS

MMD

+

−=

where

=

b

gcσσ

ln,22gbb σσ −=

( ) ( )

( ) ( )

−⋅++

−−⋅

+Φ−

−⋅++

−−⋅

+Φ=

b

ggbgbb

gbg

gb

gb

b

ggbgbg

gbb

gb

gb

SS

SSDSSSSS

DSSSSS

SS

SSDSSSSS

DSSSSS

KS

ln21

ln21

22*2222

*22

22

22*2222

*22

22

2

2

Page 27: Measuring the Quality of Credit Scoring Models

27/30

Ø Generally (i.e. without assumption of equality of standard deviations):

( ) 12 * −Φ⋅= DGini

Some results for normally distributed scores

( )( ) ( )

−+Φ⋅Φ=Φ⋅+Φ=

−−

b

bALLALLALLALLq

qq

qq

Liftbb σ

µµσσµ

σµ

11

,

112

+=−++= 2

2

2

22*

21,1)1(

b

g

g

bval AADAI

σσ

σσ

+=−++= 2

2

2

22*

21,1)1(

b

g

g

bval S

SSSAADAI

( )

−+Φ⋅Φ=

b

bALLq S

MMqSq

Lift11

Page 28: Measuring the Quality of Credit Scoring Models

28/30

q KS and the Gini react much more to change of

and are almost unchanged in the direction of .

Ø Gini ,

Some results for normally distributed scores

0=bµ 12 =bσØ KS: ,

0=bµ 12 =bσ

• Gini > KS

2gσ

Page 29: Measuring the Quality of Credit Scoring Models

29/30

Ø Lift10%: ,

Some results for normally distributed scores

0=bµ 12 =bσ

Ø Ival: ,0=bµ 12 =bσ

q In case of Lift10%it is evident strong dependence on and significantly higher dependence on than in case of KS and Gini.

q Again strong dependence on . Furthermore value of Ival rises very quickly to infinity when tends to zero.

2gσ

2gσ

Page 30: Measuring the Quality of Credit Scoring Models

30/30

Conclusions

q It is impossible to use scoring model effectively without knowing how good it is.q It is necessary to judge scoring models according to their strength in score range where cutoff is expected. q The Gini is not enough!q Results concerning Lift and Information value can be used to obtain the best available scoring model.q Results for normally distributed scores can help with computation of referred indexes. Furthermore they can help to understanding how those indexes behave.