Bridging the Academic–Practitioner Divide in Credit Risk Modeling
Vadim Melnitchouk, Metropolitan State University, Saint Paul, MN, US
Agenda
1. Academic model selection by a practitioner and organizational issues
2. ‘Optimal complexity model’: stochastic parametric method with macroeconomic variables and unobserved consumer heterogeneity
3. Data access, collaboration & prototype development
Who is a practitioner?
1. Ph.D. in applied math, former academic, teaches ‘Data Mining’ part-time
2. Ph.D. in physics, former academic
3. M.S. in OR, former ‘Fed’ examiner
4. M.S. in Econometrics
A practitioner’s search for the right academic paper/model
Paper/Methodology | Potential Business Impact | Organizational issue
Andreeva, Ansell & Crook 'Modeling Profitability using Survival Combination Scores' | Increase profitability | How to get CRO & CMO to agree on the same KPIs?
Bellotti & Crook 'Forecasting and Stress Testing Credit Card Default..' | More accurate estimation of unexpected losses, economic capital reduction | US banks are getting a stress test scenario from regulators
Fader & Hardie 'Customer-Base Analysis with Discrete-time …' | Increase sales, prevent customer attrition | Was implemented at GE Money in 2008-2009
Fader & Hardie 'Customer-Base Analysis with Discrete-time …' | Reduce losses | Cultural resistance
Leow & Crook 'Intensity Models and Transition Probabilities' | Reduce losses | Feasible, but optimal complexity model is required
Time to Default: Optimal complexity model
1. According to Bellotti & Crook (2007), survival (hazard) modeling is a competitive alternative to logistic regression when predicting default events.
2. The method has become a model of choice in recent publications, but its complexity makes the technique infeasible for practitioners.
3. It also has some limitations. Bellotti (2010) believes that ‘any credit risk model with macroeconomic variables can’t be expected to capture the direct reason for default like a loss of job, negative equity or a sudden personal crisis such as sickness or divorce’.
Methodology
The goal of this paper is to present a more practical method that can also take unobserved obligor heterogeneity into account.
The stochastic parametric time-to-event method is well known in marketing (Hardie & Fader, 2001).
It was also applied by Brusilovskiy (2005) to predict the time of the first home purchase by immigrants.
As far as we know, the method has not been used in credit risk by academics or practitioners.
Assumptions & inputs
Assumptions:
1. Time to default: Weibull distribution (Appendix).
2. Default density across obligors: Gamma distribution (to include unobserved consumer heterogeneity).
3. Vintage aggregate-level modeling to avoid so-called aggregation bias when unemployment is used.
Inputs:
1. Monthly number of defaults.
2. Time-varying covariates: unemployment and Home Price Index (HPI). Macroeconomic factors are incorporated into the hazard rate function.
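The Appendix formula is not reproduced in this section. As a hedged sketch only, a Weibull time to default mixed over a Gamma-distributed rate, with covariates entering the hazard in the way commonly used in the marketing literature, would take the form below. The symbols r, \alpha, c, \beta, x(i) and N are notation assumed here for illustration and are not taken from the slides.

```latex
% One obligor with latent default rate \lambda (Weibull baseline with shape c):
S(t \mid \lambda) = \exp\{-\lambda B(t)\}, \qquad
B(t) = \sum_{i=1}^{t} \left[ i^{c} - (i-1)^{c} \right] \exp\{\beta^{\top} x(i)\}

% Mixing over \lambda \sim \mathrm{Gamma}(r, \alpha) (shape r, rate \alpha) across obligors:
S(t) = \left( \frac{\alpha}{\alpha + B(t)} \right)^{r}

% Expected number of defaults in month t for a vintage of N loans:
E[n_t] = N \left[ S(t-1) - S(t) \right]
```

Without covariates, B(t) reduces to t^{c}. With two covariates (unemployment and HPI), this form has five parameters (r, \alpha, c and two coefficients), which would be consistent with the parameter count reported for the model, although the slides do not list the parameters.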
Recent trends in mortgage default rate & data
1. Default rates spiked above historical trends in 2005 and, more significantly, in 2006 and 2007, beginning almost immediately after origination.
2. The average time to reach the maximum default rate decreased from 5-6 years (vintages 2001-2004) to 2-3 years (vintages 2005-2007).
3. LPS data on prime, first-lien, fixed-rate 30-year mortgages originated in 2006 were used to build the model (Schelkle, 2011).
Model training and out-of-time validation
1. The model training period for the 2006 vintage was June 2006 to March 2009.
2. April 2009 to March 2010 was selected for ‘out-of-time’ validation because unemployment increased from 8.5% to 10.1% during this period.
3. The model was implemented in MS Excel (using Solver) and in SAS/IML. The five model parameters were estimated by maximum likelihood.
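To make the estimation step concrete, here is a minimal Python sketch of a grouped-data log-likelihood for monthly default counts of a single vintage. It is not the Excel or SAS/IML implementation: it assumes a Weibull-gamma survival form with covariates in the hazard, and the function names and the parameters `r`, `alpha`, `c` and `betas` are hypothetical.

```python
import math

def survival_curve(covariates, betas, r, alpha, c):
    """Return [S(0), S(1), ..., S(T)] under the ASSUMED Weibull-gamma form.

    covariates: one tuple per month, e.g. (unemployment, hpi), suitably scaled.
    betas: one coefficient per covariate (hypothetical parameter names).
    """
    surv, b = [1.0], 0.0
    for t, x in enumerate(covariates, start=1):
        effect = math.exp(sum(bk * xk for bk, xk in zip(betas, x)))
        # Weibull increment for month t, scaled by that month's covariate effect
        b += (t ** c - (t - 1) ** c) * effect
        surv.append((alpha / (alpha + b)) ** r)
    return surv

def log_likelihood(defaults, n_loans, covariates, betas, r, alpha, c):
    """Log-likelihood of monthly default counts for a vintage of n_loans.

    covariates must cover at least len(defaults) months.
    """
    surv = survival_curve(covariates, betas, r, alpha, c)
    ll = 0.0
    for t, n_t in enumerate(defaults, start=1):
        # Probability that a loan defaults during month t
        ll += n_t * math.log(surv[t - 1] - surv[t])
    # Loans still performing at the end of the training window
    survivors = n_loans - sum(defaults)
    ll += survivors * math.log(surv[len(defaults)])
    return ll
```

A numerical optimizer (Solver in the Excel version) would then search for the parameter values that maximize `log_likelihood` over the training months, and `survival_curve` evaluated on the validation-period covariates would give the out-of-time forecast.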
Forecast vs. actual monthly number of defaults
Weibull/Gamma model for 2006 mortgage origination year (LPS data, vintage 2006).
Results & Discussion
The forecast accuracy for the ‘out-of-time’ period is at an acceptable level (low forecast error and a conservative estimate for regulators).
Issues with the one-segment model:
1. The time-varying covariates formula is taken from a marketing application and is not flexible enough for credit risk modeling (Appendix).
2. The impact of unemployment and HPI can be double counted.
Next steps in collaboration with academics
1. Bayesian parameter estimation was applied in collaboration with Prof. Shemyakin (St. Thomas University, St. Paul, MN) and his students to improve numerical stability.
2. A two-segment latent class Weibull model (Appendix) was also used to estimate the parameters of the consumer segment with default hazard increasing over time. Unemployment and HPI were not included, to avoid double counting (the academic’s preference).
Data access and three levels of collaboration
Collaboration level | Execution by | Academic's Motivation | Practitioner's Motivation | Data Access | Academic Partner
Looking over your shoulder | Practitioner | Marketing and validation | Apply new method (professional growth) | N/A | Prof. Fader & Prof. Hardie
Joint supervision | Student | Real-life project for a student | Additional validation & enhancement | Vintage aggregated data only | Prof. Shemyakin, June 2012
Bridging the Academic–Practitioner Divide | Academic and practitioner | ? | Resolve a real issue, like wrong signs in multinomial regression coefficients | Aggregated by delinquency status | ?
Data access
1. It is very difficult to obtain loan-level data from financial firms for joint projects.
2. Aggregate-level delinquency and default data for mortgages, credit cards, installment loans and commercial lending can be extracted from public websites.
3. However, fully aggregated data such as the Federal Reserve series (Appendix) must first be decomposed before vintage-based modeling can be applied.
From a prototype to production: possible collaboration
Model description | Model Category | Scope | Major Issue | Possible solution
Non-stationary Markov Chain model with hazard functions and macroeconomic variables | Production | Consumer & Commercial | Zero values for some transition coefficients | Bayesian estimator / Gibbs sampling?
Non-stationary Markov Chain model with multinomial transition functions and macroeconomic variables | Production | Consumer & Commercial | Wrong signs in some transition coefficients | ?
Experiment with a second-order Markov Chain | Research | Commercial | Too many parameters, small sample size for some transitions | MCMC
Forecasting Time to Delinquency using Stochastic Parametric Model | Benchmarking | Consumer | Numerical stability of MLE estimation | Bayesian estimation
Predicting delinquent loans’ recovery using Stochastic 'Choice' Model | Benchmarking | Consumer | Not included in SAS, R, etc., no standard tests | Alternative to Markov model
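As a small illustration of why a Bayesian estimator is listed against the zero-value issue, the sketch below smooths the rows of a transition count matrix with a symmetric Dirichlet prior and returns the posterior mean. This is a deliberately simple stand-in for a stationary chain, not Gibbs sampling and not the non-stationary production models in the table. The function name and the prior weight are hypothetical.

```python
def posterior_mean_transitions(counts, prior=1.0):
    """Posterior-mean transition matrix under a symmetric Dirichlet prior.

    counts[i][j] is the observed number of moves from state i to state j.
    prior is the pseudo-count added to every cell (an assumed value).
    """
    matrix = []
    for row in counts:
        total = sum(row) + prior * len(row)
        # Each cell gets its count plus the pseudo-count, so none is exactly zero
        matrix.append([(n + prior) / total for n in row])
    return matrix
```

With raw frequencies, a transition that was never observed gets probability zero; with the pseudo-count it gets a small positive value, and every row still sums to one.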
Next search for optimal complexity model: Combined Markov Chain and Survival Analysis
Model description | Macroeconomic variables | Objective function | Major Issue | Possible solution
Leow & Crook 'Intensity Models and Transition Probabilities' | Next step | Partial MLE | ? | N/A
Louis, Laere, Baesens ‘Predicting bank rating transitions..’ | Yes | Partial MLE | Correlated event times | Clustering
Jones ‘Estimating Markov Transition’ | Yes | Least Sq. | Migration underestimation | Bayesian MCI (Christodoulakis)
Kunovac ‘Estimating Credit Migration…– Bayesian Approach’ | No | MPLE | Zero values in some transition coefficients | Gibbs sampling
Grimshaw & Alexander ‘Markov Chain model for delinquency..’ | No | MLE | Statistical significance for some transitions | Bayesian estimator
Conclusions
The stochastic parametric method with macroeconomic variables and unobserved consumer heterogeneity can be used by practitioners as an alternative to survival modeling.
The optimal complexity model can provide an incentive to try to bridge the Academic–Practitioner Divide.
Appendix
Latent class Weibull model with two segments
Assumptions:
1. All obligors can be divided into two segments with their own fixed but unknown values of shape and scale parameters.
2. The large segment has a decreasing default hazard.
3. A relatively small consumer segment exists with default hazard increasing over time. The segment size (percentage) is a latent variable which must be estimated for each vintage.
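The assumptions above can be sketched in Python as a two-component Weibull mixture. This is an illustration under the standard Weibull parameterization S(t) = exp(-(t/scale)^shape), where a shape below 1 gives a decreasing hazard and a shape above 1 an increasing one. The function names and the argument `weight_small` (the latent share of the small segment) are hypothetical.

```python
import math

def weibull_survival(t, shape, scale):
    """Probability that an obligor has not defaulted by time t."""
    return math.exp(-((t / scale) ** shape))

def weibull_hazard(t, shape, scale):
    """Default hazard at time t > 0; rises with t if shape > 1, falls if shape < 1."""
    return (shape / scale) * (t / scale) ** (shape - 1)

def mixture_survival(t, weight_small, small, large):
    """Vintage-level survival for two latent segments.

    small and large are (shape, scale) tuples; weight_small is the
    estimated share of obligors in the small, increasing-hazard segment.
    """
    return (weight_small * weibull_survival(t, *small)
            + (1.0 - weight_small) * weibull_survival(t, *large))
```

Because segment membership is unobserved, the share and both (shape, scale) pairs would have to be estimated jointly from each vintage's default counts.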