Survival Data Mining using Enterprise Miner
and Proportional Hazard Cox Model
25th June 2015, Manchester – UK
Professor Jorge Ribeiro, Patrick Ribeiro
Survival Analysis Node
Enterprise Miner 13.2
Simulation Studio 13.2
SAS/OR Operational Research
SAS/ETS Econometrics
Time Series
PROC ARIMA
PROC AUTOREG
1.2 - Model 1 - Time to Next Purchase
“People are much more
likely to get on a bus if
they know where it is
going”.
Steps Plan
SAS/ETS – Econometrics Time Series
The Cross-Correlation Function
[Figure: cross-correlation between two monthly series (labelled L and H) at lags 1, 2 and 4]
PROC ARIMA / PROC AUTOREG
SAS/ETS – Econometrics Time Series
Primary Event Variables
Royal Wedding
Bank Holiday
Price
Marketing Campaign
Point/Pulse
Step
Ramp
[Figure: point/pulse, step and ramp event variables around the event time t_event]
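The three event-variable shapes named above can be sketched as simple dummy series. This is a minimal Python illustration, not the SAS/ETS implementation: the function name `event_dummies`, the zero-based time index, and the convention that the ramp is 0 at the event and grows by 1 per period afterwards are all assumptions.

```python
def event_dummies(n_periods, t_event):
    """Build pulse, step and ramp dummies for an event at index t_event (hypothetical helper)."""
    # Point/pulse: 1 only in the event period
    pulse = [1 if t == t_event else 0 for t in range(n_periods)]
    # Step: 1 from the event period onwards
    step = [1 if t >= t_event else 0 for t in range(n_periods)]
    # Ramp: 0 up to the event, then increasing by 1 each period (assumed convention)
    ramp = [max(0, t - t_event) for t in range(n_periods)]
    return pulse, step, ramp


pulse, step, ramp = event_dummies(n_periods=6, t_event=2)
print(pulse)
print(step)
print(ramp)
```

A one-off event such as a bank holiday would typically use the pulse, while a lasting change such as a new price would use the step or ramp.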
Step 1 – Economic variables
Economic Variables
Unemployment
GDP
Inflation
Cash rate
Credit availability
House prices
Commercial property prices
Commodity prices
Swap rates
Equity prices
Cox Proportional Hazards Model
$$h_i(t) = h_0(t)\, e^{\beta_1 X_{i1} + \dots + \beta_k X_{ik}}$$
$h_0(t)$: Baseline Hazard function - involves time but not predictor variables
$\beta_1 X_{i1} + \dots + \beta_k X_{ik}$: Linear function of a set of predictor variables - does not involve time
...
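A minimal Python sketch of the Cox formula, to show that the baseline hazard cancels when two customers are compared. The function names `cox_hazard` and `hazard_ratio`, the coefficients, and the baseline value are hypothetical and chosen only for illustration.

```python
import math


def cox_hazard(baseline_hazard, betas, x):
    """h_i(t) = h_0(t) * exp(beta_1*x_1 + ... + beta_k*x_k), evaluated at one time point."""
    linear_predictor = sum(b * xi for b, xi in zip(betas, x))
    return baseline_hazard * math.exp(linear_predictor)


def hazard_ratio(betas, x_a, x_b):
    """Hazard of profile a relative to profile b; h_0(t) is not needed."""
    return math.exp(sum(b * (a - c) for b, a, c in zip(betas, x_a, x_b)))


betas = [0.5, -0.2]   # hypothetical coefficients
h0_t = 0.01           # hypothetical baseline hazard at some time t
x_a, x_b = [1.0, 3.0], [0.0, 3.0]

ratio_direct = cox_hazard(h0_t, betas, x_a) / cox_hazard(h0_t, betas, x_b)
print(ratio_direct, hazard_ratio(betas, x_a, x_b))
```

Both numbers agree whatever value is used for `h0_t`, which is the "proportional hazards" property the model is named after.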
Step 3 – Model
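/* Cox model in counting-process form; robust (sandwich) variance aggregated by customer */
PROC PHREG DATA = MODEL COVSANDWICH(AGGREGATE);
   CLASS Risk;
   /* (START,END) is the risk interval; DEFAULT = 0 marks a censored interval */
   MODEL (START,END)*DEFAULT(0) = Risk P1GDP UNEMPLOYMENT;
   ID CUSTOMER_ID;
   HAZARDRATIO Risk / DIFF=REF;
   HAZARDRATIO P1GDP / UNITS = 1 2 3 5;
   HAZARDRATIO UNEMPLOYMENT / UNITS = 1 2 3 5;
RUN;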
PD_Band     Risk
1 to 5      01
6 to 11     05
12 to 16    09
17 to 18    12
19 to 20    15
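The banding table can be read as a lookup from PD band to Risk class. A minimal Python sketch, assuming the Risk labels are two-character, zero-padded strings (as "09" and the later "Band 01" suggest); the names `RISK_BY_PD_BAND` and `risk_class` are hypothetical.

```python
# PD band ranges (inclusive) mapped to the Risk class label used in the model
RISK_BY_PD_BAND = [
    (range(1, 6), "01"),
    (range(6, 12), "05"),
    (range(12, 17), "09"),
    (range(17, 19), "12"),
    (range(19, 21), "15"),
]


def risk_class(pd_band):
    """Return the Risk label for an integer PD band between 1 and 20."""
    for bands, risk in RISK_BY_PD_BAND:
        if pd_band in bands:
            return risk
    raise ValueError(f"PD band out of range: {pd_band}")


print(risk_class(3), risk_class(14), risk_class(20))
```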
SAS Results
For each 1 unit increase in the GDP, the Hazard of Default goes down by an estimated 16.7%.
$e^{-0.18257} = 0.833$
$100 \times (0.833 - 1) = -16.7\%$
SAS Results
For each 1 unit increase in the Unemployment, the Hazard of Default increases by an estimated 25.5%.
$e^{0.22684} = 1.255$
$100 \times (1.255 - 1) = 25.5\%$
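The percentages for GDP and Unemployment follow from the Cox model by comparing the hazard at $x_j + 1$ with the hazard at $x_j$, holding the other covariates fixed. A short derivation, where $\beta_j$ (a symbol introduced here for illustration) is the fitted coefficient of covariate $j$:

```latex
\frac{h(t \mid x_j + 1)}{h(t \mid x_j)}
  = \frac{h_0(t)\, e^{\beta_j (x_j + 1)}}{h_0(t)\, e^{\beta_j x_j}}
  = e^{\beta_j},
\qquad
\text{percentage change in hazard} = 100 \times \left(e^{\beta_j} - 1\right)
```

A negative coefficient therefore gives a hazard ratio below 1 (a decrease in hazard), and a positive coefficient gives a ratio above 1 (an increase).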
Risk
SAS Results
A customer in Band 01 has only 8.7% of the risk of Default (that is, 91.3% lower) compared with a customer in Band 15 (the reference Band).
$e^{-2.44279} = 0.087$
$100 \times (0.087 - 1) = -91.3\%$
$\frac{\text{HAZARD RATIO (BAND 01)}}{\text{HAZARD RATIO (BAND 15)}} = 0.087$
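For a class variable the same logic applies to each level against the reference level. A brief derivation, assuming reference coding with Band 15 as the reference (its dummy variables are all zero) and writing $\beta_{01}$, a symbol introduced here, for the Band 01 coefficient:

```latex
\frac{h(t \mid \text{Band 01})}{h(t \mid \text{Band 15})}
  = \frac{h_0(t)\, e^{\beta_{01}}}{h_0(t)\, e^{0}}
  = e^{\beta_{01}}
  = e^{-2.44279}
  \approx 0.087
```

The reference band has a hazard ratio of 1 by construction, so every other band is read relative to it.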
HAZARDRATIO Risk / DIFF=REF;
Risk
SAS Results
Output 7
PROC PHREG DATA = MODEL COVSANDWICH(AGGREGATE);
   CLASS Risk (PARAM=REF REF='15');
   MODEL (START,END)*DEFAULT(0) = Risk P1GDP UNEMPLOYMENT;
   ID CUSTOMER_ID;
   HAZARDRATIO P1GDP / UNITS = 1 2 3 5;
   HAZARDRATIO UNEMPLOYMENT / UNITS = 1 2 3 5;
RUN;
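P1GDP, 2 unit increase: $100 \times (0.694 - 1) = -30.6\%$
P1GDP, 3 unit increase: $100 \times (0.578 - 1) = -42.2\%$
P1GDP, 5 unit increase: $100 \times (0.401 - 1) = -59.9\%$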
SAS Results
Output 8
PROC PHREG DATA = MODEL COVSANDWICH(AGGREGATE);
   CLASS Risk (PARAM=REF REF='15');
   MODEL (START,END)*DEFAULT(0) = Risk P1GDP UNEMPLOYMENT;
   ID CUSTOMER_ID;
   HAZARDRATIO P1GDP / UNITS = 1 2 3 5;
   HAZARDRATIO UNEMPLOYMENT / UNITS = 1 2 3 5;
RUN;
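UNEMPLOYMENT, 2 unit increase: $100 \times (1.574 - 1) = 57.4\%$
UNEMPLOYMENT, 3 unit increase: $100 \times (1.975 - 1) = 97.5\%$
UNEMPLOYMENT, 5 unit increase: $100 \times (3.109 - 1) = 210.9\%$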
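The hazard ratios requested with UNITS = 1 2 3 5 can be checked by hand, since a c-unit change multiplies the hazard by exp(c × coefficient). A minimal Python sketch using the two coefficients quoted on the earlier results slides; the negative sign on the GDP coefficient is inferred from its hazard ratio of 0.833, and the function name `hazard_ratio_for_units` is hypothetical.

```python
import math

# Coefficients as quoted on the results slides (GDP sign inferred from HR < 1)
COEFFICIENTS = {"P1GDP": -0.18257, "UNEMPLOYMENT": 0.22684}


def hazard_ratio_for_units(beta, units):
    """Hazard ratio for a change of `units` in a covariate with coefficient `beta`."""
    return math.exp(beta * units)


for name, beta in COEFFICIENTS.items():
    for units in (1, 2, 3, 5):
        hr = hazard_ratio_for_units(beta, units)
        print(f"{name} +{units}: hazard ratio {hr:.3f}, change {100 * (hr - 1):+.1f}%")
```

Note that the effect compounds multiplicatively: the 5-unit hazard ratio is the 1-unit ratio raised to the fifth power, not five times the 1-unit percentage change.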
Survival Function
Scenario Analysis 1
P1GDP=1.1;
Unemployment=6;
Scenario Analysis 2
P1GDP=0.8;
Unemployment=10;
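The two scenarios can be compared directly through the proportional hazards relationship S(t | x) = S_0(t) ^ exp(x'β). A minimal Python sketch, assuming the coefficients quoted on the results slides (GDP sign inferred), the same Risk band in both scenarios, and a purely hypothetical baseline survival value `s0` defined at all covariates equal to zero; it is not the output of the SAS BASELINE statement.

```python
import math

# Coefficients as quoted on the results slides (GDP sign inferred from HR < 1)
BETAS = {"P1GDP": -0.18257, "UNEMPLOYMENT": 0.22684}


def linear_predictor(scenario):
    """x'beta for the economic variables only (Risk band held fixed)."""
    return sum(BETAS[name] * value for name, value in scenario.items())


def survival(baseline_survival, scenario):
    """S(t | x) = S_0(t) ** exp(x'beta) under proportional hazards."""
    return baseline_survival ** math.exp(linear_predictor(scenario))


scenario_1 = {"P1GDP": 1.1, "UNEMPLOYMENT": 6}
scenario_2 = {"P1GDP": 0.8, "UNEMPLOYMENT": 10}

# Hazard in scenario 2 relative to scenario 1 (baseline cancels)
relative_hazard = math.exp(linear_predictor(scenario_2) - linear_predictor(scenario_1))
print(f"Scenario 2 vs scenario 1 hazard ratio: {relative_hazard:.2f}")

s0 = 0.999  # hypothetical baseline survival probability at some horizon t
print(survival(s0, scenario_1), survival(s0, scenario_2))
```

Lower GDP growth and higher unemployment both raise the hazard, so the survival curve for Scenario 2 lies below the curve for Scenario 1 at every horizon.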
Go Further
Introduction to Survival Analysis using PH Cox Models
Applying Survival Analysis for Business
Go Further
Survival Data Mining Programming Approach
Survival Data Mining Using Enterprise Miner
Questions
www.modellingtraining.com
- SAS code
- Results
Email:
Web page:
Tel: 01943 430241
07880 474564