Using Rule Ensembles to Predict Credit Risk #GHC15
![Page 1: Using Rule Ensembles to Predict Credit Risk #GHC15](https://reader031.vdocuments.us/reader031/viewer/2022030303/587ba0bd1a28ab81758b4bdf/html5/thumbnails/1.jpg)
2015
Using Rule Ensembles to Predict Credit Risk
Diane Chang, Intuit
October 16, 2015
#GHC15
![Page 2: Using Rule Ensembles to Predict Credit Risk #GHC15](https://reader031.vdocuments.us/reader031/viewer/2022030303/587ba0bd1a28ab81758b4bdf/html5/thumbnails/2.jpg)
Trade-off: speed vs. rates
![Page 3: Using Rule Ensembles to Predict Credit Risk #GHC15](https://reader031.vdocuments.us/reader031/viewer/2022030303/587ba0bd1a28ab81758b4bdf/html5/thumbnails/3.jpg)
Can accounting data help?
![Page 4: Using Rule Ensembles to Predict Credit Risk #GHC15](https://reader031.vdocuments.us/reader031/viewer/2022030303/587ba0bd1a28ab81758b4bdf/html5/thumbnails/4.jpg)
Low credit risk?
![Page 5: Using Rule Ensembles to Predict Credit Risk #GHC15](https://reader031.vdocuments.us/reader031/viewer/2022030303/587ba0bd1a28ab81758b4bdf/html5/thumbnails/5.jpg)
Traditionally, one big model
Annual Revenue
  ≤ $100K: split on State (In TX | Not in TX) …
  > $100K: split on Industry (Retail | Non-retail) …
![Page 6: Using Rule Ensembles to Predict Credit Risk #GHC15](https://reader031.vdocuments.us/reader031/viewer/2022030303/587ba0bd1a28ab81758b4bdf/html5/thumbnails/6.jpg)
Ensemble methods
![Page 7: Using Rule Ensembles to Predict Credit Risk #GHC15](https://reader031.vdocuments.us/reader031/viewer/2022030303/587ba0bd1a28ab81758b4bdf/html5/thumbnails/7.jpg)
Lots of little trees
![Page 8: Using Rule Ensembles to Predict Credit Risk #GHC15](https://reader031.vdocuments.us/reader031/viewer/2022030303/587ba0bd1a28ab81758b4bdf/html5/thumbnails/8.jpg)
Trees → Rules
Time in Business
  ≤ 3 yrs: split on # Invoices (≤ 8 | > 8)
  > 3 yrs: split on Sales Growth (≤ 10% | > 10%)
![Page 9: Using Rule Ensembles to Predict Credit Risk #GHC15](https://reader031.vdocuments.us/reader031/viewer/2022030303/587ba0bd1a28ab81758b4bdf/html5/thumbnails/9.jpg)
Trees → Rules
Time in Business
  ≤ 3 yrs (node 1): split on # Invoices (≤ 8: node 3 | > 8: node 4)
  > 3 yrs (node 2): split on Sales Growth (≤ 10%: node 5 | > 10%: node 6)
Rule 1: Time in Business ≤ 3 yrs
Rule 2: Time in Business > 3 yrs
Rule 3: Time in Business ≤ 3 yrs and # Invoices ≤ 8
Rule 4: Time in Business ≤ 3 yrs and # Invoices > 8
Rule 5: Time in Business > 3 yrs and Sales Growth ≤ 10%
Rule 6: Time in Business > 3 yrs and Sales Growth > 10%
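The tree-to-rules step on this slide can be sketched in a few lines of Python. This is only an illustration: the nested-dict tree encoding and the `extract_rules` helper are assumptions made for this sketch, not part of RuleFit or rego. Each non-root node becomes one rule, formed by joining the conditions on the path from the root to that node.

```python
from collections import deque

# Hypothetical encoding of the slide's tree: each node carries the
# condition on the edge leading into it, plus its child nodes.
tree = {
    "children": [
        {"cond": "Time in Business <= 3 yrs", "children": [
            {"cond": "# Invoices <= 8", "children": []},
            {"cond": "# Invoices > 8", "children": []},
        ]},
        {"cond": "Time in Business > 3 yrs", "children": [
            {"cond": "Sales Growth <= 10%", "children": []},
            {"cond": "Sales Growth > 10%", "children": []},
        ]},
    ],
}


def extract_rules(root):
    """Return one rule per non-root node, visiting nodes level by level."""
    rules = []
    queue = deque((child, []) for child in root["children"])
    while queue:
        node, path = queue.popleft()
        conds = path + [node["cond"]]
        # A rule is the conjunction of every condition from root to node.
        rules.append(" and ".join(conds))
        queue.extend((child, conds) for child in node["children"])
    return rules


rules = extract_rules(tree)
for i, rule in enumerate(rules, start=1):
    print(f"Rule {i}: {rule}")
```

Visiting the nodes level by level yields the same numbering as the slide: the two depth-one rules first, then the four depth-two rules.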
![Page 10: Using Rule Ensembles to Predict Credit Risk #GHC15](https://reader031.vdocuments.us/reader031/viewer/2022030303/587ba0bd1a28ab81758b4bdf/html5/thumbnails/10.jpg)
A lot of rules!
Rule 1 Rule 2 Rule 3 Rule 4 Rule 5 Rule 6
…
Rule 1 Rule 2 Rule 3 Rule 4 … Rule n
![Page 11: Using Rule Ensembles to Predict Credit Risk #GHC15](https://reader031.vdocuments.us/reader031/viewer/2022030303/587ba0bd1a28ab81758b4bdf/html5/thumbnails/11.jpg)
Which rule is most predictive?
$$w_1 R_1 + w_2 R_2 + w_3 R_3 + \dots + w_n R_n$$
Weights $w_i$ computed via logistic regression
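To make the weighting step concrete, here is a minimal sketch of fitting the weights by logistic regression, with each rule treated as a 0/1 feature. Everything in it is an assumption for illustration: the toy data is made up, the `fit_logistic` helper is hypothetical, and plain gradient descent without regularization stands in for whatever fitting procedure the actual tooling uses.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def fit_logistic(X, y, lr=0.5, epochs=2000):
    """Fit [w0, w1, ..., wn] by batch gradient descent on the log loss."""
    w = [0.0] * (len(X[0]) + 1)  # w[0] is the intercept w0
    for _ in range(epochs):
        grad = [0.0] * len(w)
        for row, label in zip(X, y):
            p = sigmoid(w[0] + sum(wi * r for wi, r in zip(w[1:], row)))
            err = p - label
            grad[0] += err
            for j, r in enumerate(row):
                grad[j + 1] += err * r
        w = [wi - lr * g / len(X) for wi, g in zip(w, grad)]
    return w


# Made-up toy data: each row holds 0/1 outcomes for two hypothetical
# rules; each label is the 0/1 outcome being predicted.
X = [[1, 0], [1, 0], [1, 0], [0, 1], [0, 1], [0, 1], [1, 0], [0, 1]]
y = [1, 1, 0, 0, 0, 1, 1, 0]

weights = fit_logistic(X, y)
print(weights)
```

In this toy set the first rule fires mostly on positive cases and the second mostly on negative ones, so the first rule ends up with the larger weight. A rule whose weight is far from zero moves the score more, which is the sense in which it is "more predictive".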
![Page 12: Using Rule Ensembles to Predict Credit Risk #GHC15](https://reader031.vdocuments.us/reader031/viewer/2022030303/587ba0bd1a28ab81758b4bdf/html5/thumbnails/12.jpg)
Scoring
https://en.wikipedia.org/wiki/Logistic_regression
$$p = \frac{1}{1 + e^{-\text{score}}}$$
$$\text{score} = \log_e \frac{p}{1 - p} = w_0 + w_1 R_1 + w_2 R_2 + \dots + w_n R_n$$
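The two scoring formulas are inverses of each other, which a short Python sketch can show. The function names and the sample weights below are assumptions made for illustration only; nothing here comes from RuleFit or rego.

```python
import math


def score(weights, rules):
    """Linear score: weights[0] is w0, rules holds the 0/1 values R_i."""
    return weights[0] + sum(w * r for w, r in zip(weights[1:], rules))


def probability(s):
    """Logistic function: maps a score to a probability p."""
    return 1.0 / (1.0 + math.exp(-s))


def log_odds(p):
    """Inverse of the logistic function: recovers the score from p."""
    return math.log(p / (1.0 - p))


# Hypothetical weights [w0, w1, w2] and rule outcomes [R1, R2].
s = score([-1.0, 0.5, 2.0], [1, 0])
p = probability(s)
print(s, p, log_odds(p))
```

Only rules that fire (R_i = 1) contribute their weight, so the score is w0 plus the weights of the rules the applicant satisfies.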
![Page 13: Using Rule Ensembles to Predict Credit Risk #GHC15](https://reader031.vdocuments.us/reader031/viewer/2022030303/587ba0bd1a28ab81758b4bdf/html5/thumbnails/13.jpg)
Example
Applicant:
• 2 years in business
• 10 invoices per month
• Sales growth: 12%
R1: Time in Business ≤ 3 yrs ✓
R2: Time in Business > 3 yrs ✗
R3: Time in Business ≤ 3 yrs & Invoices ≤ 8 ✗
R4: Time in Business ≤ 3 yrs & Invoices > 8 ✓
R5: Time in Business > 3 yrs & Sales Growth ≤ 10% ✗
R6: Time in Business > 3 yrs & Sales Growth > 10% ✗
$$\text{score} = -2 + 0.1 R_1 - 0.25 R_2 + 0.14 R_3 - 0.6 R_4 + 0.01 R_5 - 0.07 R_6$$
$$\text{score} = -2 + 0.1 - 0.6 = -2.5$$
$$p = \frac{1}{1 + e^{2.5}} \approx 0.08$$
Make a loan offer!
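The arithmetic in this example can be checked with a short Python sketch. The weights, intercept, and applicant values are taken from the slide; the dictionary keys and the lambda-based rule encoding are assumptions made for this illustration.

```python
import math

# Applicant from the slide (key names are hypothetical).
applicant = {
    "years_in_business": 2,
    "invoices_per_month": 10,
    "sales_growth_pct": 12,
}

# Rules R1..R6 from the slide, each returning True or False.
rules = [
    lambda a: a["years_in_business"] <= 3,
    lambda a: a["years_in_business"] > 3,
    lambda a: a["years_in_business"] <= 3 and a["invoices_per_month"] <= 8,
    lambda a: a["years_in_business"] <= 3 and a["invoices_per_month"] > 8,
    lambda a: a["years_in_business"] > 3 and a["sales_growth_pct"] <= 10,
    lambda a: a["years_in_business"] > 3 and a["sales_growth_pct"] > 10,
]

intercept = -2.0                                 # w0 from the slide
weights = [0.1, -0.25, 0.14, -0.6, 0.01, -0.07]  # w1..w6 from the slide

# 1 where the rule holds for this applicant, 0 otherwise.
fired = [int(rule(applicant)) for rule in rules]
score = intercept + sum(w * r for w, r in zip(weights, fired))
p = 1.0 / (1.0 + math.exp(-score))

print(fired, round(score, 2), round(p, 2))
```

Only R1 and R4 hold for this applicant, so only the weights 0.1 and -0.6 are added to the intercept, which gives the slide's score of -2.5 and p of roughly 0.08.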
![Page 14: Using Rule Ensembles to Predict Credit Risk #GHC15](https://reader031.vdocuments.us/reader031/viewer/2022030303/587ba0bd1a28ab81758b4bdf/html5/thumbnails/14.jpg)
Rule ensembles RULE
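Rule Ensemble vs. Logistic Regression
Missing data: Rule Ensemble ✔ | Logistic Regression ✖
High predictive power: Rule Ensemble ✔ | Logistic Regression ✖
Large number of variables: Rule Ensemble ✔ | Logistic Regression ✖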
![Page 15: Using Rule Ensembles to Predict Credit Risk #GHC15](https://reader031.vdocuments.us/reader031/viewer/2022030303/587ba0bd1a28ab81758b4bdf/html5/thumbnails/15.jpg)
Try it yourself!
Professor Friedman’s website:
− http://statweb.stanford.edu/~jhf/R_RuleFit.html
Open source “wrapper” to RuleFit
− https://github.com/intuit/rego
![Page 16: Using Rule Ensembles to Predict Credit Risk #GHC15](https://reader031.vdocuments.us/reader031/viewer/2022030303/587ba0bd1a28ab81758b4bdf/html5/thumbnails/16.jpg)
Got Feedback?
Rate and Review the session using the GHC Mobile App
To download visit www.gracehopper.org