ensemble models: theory and applications, figini, vezzoli. september, 3 2013

Ensemble models: theory and applications

SYstemic Risk TOmography:

Signals, Measurements, Transmission Channels, and Policy Interventions

Silvia Figini University of Pavia, Italy

Marika Vezzoli University of Brescia, Italy

Royal Statistical Society Conference 2013, September 3 – 5, 2013, Newcastle, UK

SYRTO Project

This study is part of the SYRTO Project, which is funded by the European Union (EU) under the 7th Framework Programme (FP7-SSH/2007-2013)

Focusing on the European Union, the project explores the relationships between (and among):

Sovereigns

Banks and other Financial Intermediaries (BFIs)

Corporations




SYRTO Project: Two main objectives

1. EWS

Identify the common and the sector-specific (idiosyncratic) risks, and assemble a web-based Early Warning System (EWS) to be used as:

a Risk Barometer for each sector and country, in order to identify potential threats to financial stability

a system of Rules of Thumb that monitors a series of leading indicators so as to minimise the possible negative impacts of systemic crises

2. SYRTO Code

Realize the SYRTO Code in order to derive a series of recommendations, also expressed in terms of EWS prescriptions, on:

the appropriate governance structures for the EU to prevent and minimise systemic risks

the best mechanisms for ensuring an effective interplay between, and coordination of, macro- and micro-prudential responsibilities



SYRTO Project: Who we are

Consortium

University of Brescia

Centre National de la Recherche Scientifique (CNRS)

Athens University of Economics and Business – Research Center

University Cà Foscari Venice

University of Amsterdam Stichting VU-VUMC (VUA)

Advisory Board

1. Scientific Division

Research Unit (among others: P. Balduzzi, A. W. Lo)

Supervisory Unit (among others: R. Engle, Y. Aït-Sahalia, D. Duffie, P. Embrechts)

2. Policy Division

ECB, ESRB, IMF, BIS, D. Bundesbank, EBA, EC, OECD, Sveriges Riksbank



Introduction

In this study we investigate ensemble learning and classical model averaging in order to obtain a credit risk model that is well calibrated in terms of predictive accuracy

We compare ensemble learning approaches, such as Random Forest (Breiman, 2001), with Bayesian Model Averaging (BMA) (e.g. Steel, 2011). The final aim is to improve the predictive performance of the models

In credit risk applications, few papers have compared single selected models with model averaging. In the parametric framework, we recall Hayden et al. (2009), who compare stepwise selection in logistic regression with BMA (Madigan et al. 1999), and Tsai et al. (2010), who use a statistical criterion and a financial market measure to compare the forecasting accuracy of different model selection approaches

In the non-parametric framework, we recall the papers of Figini and Fantazzini (2009) and Zhang et al. (2010)



Main objectives

Non-parametric framework: comparing a single model based on a classification tree with Random Forest

Parametric framework: comparing a single model based on logistic regression with BMA

Proposing some ideas on which models should be included in the pool of models, so that the averaging is coherent in terms of predictive capability, discriminatory power and stability of the results



Non Parametric methods based on Random Forests

In the non-parametric framework, ensemble learning techniques combine weak predictors, such as trees, in order to obtain robust forecasts

Schapire (1990) showed that a weak learner can always improve its performance by training two additional predictors on filtered versions of the input data, while Breiman (2001) generated multiple predictors and combined them by simple averaging (regression) or voting (classification)

In this study, we focus our attention on Random Forest (RF), where every weak learner is obtained by growing an unpruned tree on a training set which is a different bootstrap sample drawn from the data

We have chosen Random Forest because it provides an accuracy level in line with that of the Boosting algorithm, with better performance in terms of computational time (Breiman, 2001)
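The bootstrap-and-vote idea can be illustrated with a minimal sketch. It is not the implementation used in this study: to keep the example short, each weak learner is a one-split decision stump rather than an unpruned tree, and the toy data, function names and parameter values are all hypothetical.

```python
import random
from collections import Counter

def fit_stump(rows, labels, features):
    """Return (feature, threshold, left_label, right_label) with the lowest training error."""
    best = None
    for f in features:
        for t in sorted({r[f] for r in rows}):
            left = [l for r, l in zip(rows, labels) if r[f] <= t]
            right = [l for r, l in zip(rows, labels) if r[f] > t]
            if not left or not right:
                continue  # not a real split
            l_lab = Counter(left).most_common(1)[0][0]
            r_lab = Counter(right).most_common(1)[0][0]
            errors = sum(l != l_lab for l in left) + sum(l != r_lab for l in right)
            if best is None or errors < best[0]:
                best = (errors, f, t, l_lab, r_lab)
    if best is None:
        # No valid split in this sample: fall back to a constant predictor.
        majority = Counter(labels).most_common(1)[0][0]
        return (features[0], float("inf"), majority, majority)
    return best[1:]

def predict_stump(stump, row):
    f, t, l_lab, r_lab = stump
    return l_lab if row[f] <= t else r_lab

def fit_forest(rows, labels, n_trees=25, n_features=1, seed=0):
    """Grow each weak learner on its own bootstrap sample and random feature subset."""
    rng = random.Random(seed)
    n, p = len(rows), len(rows[0])
    forest = []
    for _ in range(n_trees):
        idx = [rng.randrange(n) for _ in range(n)]   # bootstrap sample (with replacement)
        feats = rng.sample(range(p), n_features)     # random subset of the variables
        forest.append(fit_stump([rows[i] for i in idx], [labels[i] for i in idx], feats))
    return forest

def predict_forest(forest, row):
    """Combine the weak learners by majority vote (classification)."""
    votes = Counter(predict_stump(s, row) for s in forest)
    return votes.most_common(1)[0][0]

# Hypothetical toy data: two variables, binary target (1 = default).
rows = [(0.1, 0.2), (0.2, 0.1), (0.3, 0.3), (0.8, 0.9), (0.9, 0.7), (0.7, 0.8)]
labels = [0, 0, 0, 1, 1, 1]
forest = fit_forest(rows, labels)
```

An odd number of weak learners avoids tied votes in the binary case. In practice a library implementation that grows full trees would be used instead of stumps.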



Parametric methods based on Bayesian Model Averaging



Prior selection



Bayesian Model Averaging

BMA can be summarized in the following steps:

1. Given q variables, we fit all the possible combinations of variables and obtain the model space M of dimension $2^q$

2. For each model we compute its marginal likelihood

3. We assume a prior on the model space, as in Ley and Steel (2009), with a specific setting of the hyperparameters involved

4. For each model we obtain the posterior model probability

5. We fit each model on the data at hand; the final forecast for a specific observation is the average of the predictions made by each model, weighted by the corresponding posterior model probability
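The last three steps can be sketched numerically as follows. This is only an illustration: the log marginal likelihoods, the uniform prior and the per-model default probabilities are hypothetical numbers, and the function names are not taken from any library.

```python
import math

def posterior_model_probs(log_marglik, log_prior):
    """Posterior model probabilities, proportional to marginal likelihood times prior."""
    log_post = [ll + lp for ll, lp in zip(log_marglik, log_prior)]
    top = max(log_post)                       # subtract the maximum for numerical stability
    weights = [math.exp(v - top) for v in log_post]
    total = sum(weights)
    return [w / total for w in weights]

def bma_forecast(model_predictions, post_probs):
    """Average of the single-model predictions, weighted by posterior model probability."""
    return sum(p * w for p, w in zip(model_predictions, post_probs))

# Hypothetical inputs for three candidate models.
log_marglik = [-120.4, -118.9, -119.6]        # log marginal likelihood of each model
log_prior = [math.log(1 / 3)] * 3             # uniform prior on the model space (assumption)
pd_by_model = [0.08, 0.15, 0.11]              # each model's predicted default probability

probs = posterior_model_probs(log_marglik, log_prior)
pd_bma = bma_forecast(pd_by_model, probs)
```

Working on the log scale avoids underflow, since marginal likelihoods are usually extremely small numbers.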



Model space in the parametric framework: an example with 4 variables

Model space M: $2^4 = 16$ models
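As a small illustration, the model space for 4 variables can be enumerated by listing every subset of the variables, from the empty (intercept-only) model to the full model. The variable names below are placeholders.

```python
from itertools import combinations

variables = ["x1", "x2", "x3", "x4"]  # placeholder variable names

# Every subset of the variables defines one candidate model.
model_space = [
    subset
    for size in range(len(variables) + 1)
    for subset in combinations(variables, size)
]
```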



Predictive measures of performance

In order to assess the predictive capability of a single model with respect to averaged models based on BMA or RF, we consider the Receiver Operating Characteristic (ROC) curve, the area under it (AUC) and the H measure (e.g. Hand et al. 2010)

The discriminatory power of a predictive model can be measured by a confusion matrix (Kohavi and Provost, 1998), which compares actual and predicted classifications for a fixed cut-off

We have derived different cut-offs by minimising the difference between sensitivity and specificity (P fair in Schröder and Richter 1999) or by maximising the correct classification rate (P opt, calculated from the ROC as described in Zweig and Campbell (1993), taking into account different costs of false positive or false negative predictions). We have also used a cut-off of 0.5
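A minimal sketch of these quantities is given below. It makes several assumptions: the scores and outcomes are hypothetical, candidate cut-offs are restricted to the observed scores, and the P opt search uses equal misclassification costs, whereas the criterion described above allows different costs. The H measure is not covered.

```python
def auc(y, scores):
    """Area under the ROC curve as the share of (default, non-default) pairs ranked correctly."""
    pos = [s for t, s in zip(y, scores) if t == 1]
    neg = [s for t, s in zip(y, scores) if t == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

def confusion_rates(y, scores, cutoff):
    """Sensitivity, specificity and correct classification rate at a fixed cut-off."""
    tp = sum(1 for t, s in zip(y, scores) if t == 1 and s >= cutoff)
    fn = sum(1 for t, s in zip(y, scores) if t == 1 and s < cutoff)
    tn = sum(1 for t, s in zip(y, scores) if t == 0 and s < cutoff)
    fp = sum(1 for t, s in zip(y, scores) if t == 0 and s >= cutoff)
    return tp / (tp + fn), tn / (tn + fp), (tp + tn) / len(y)

def choose_cutoffs(y, scores):
    """Return (P fair, P opt) searched over the observed scores."""
    candidates = sorted(set(scores))
    def gap(c):
        sens, spec, _ = confusion_rates(y, scores, c)
        return abs(sens - spec)
    def ccr(c):
        return confusion_rates(y, scores, c)[2]
    return min(candidates, key=gap), max(candidates, key=ccr)

# Hypothetical outcomes (1 = default) and predicted default probabilities.
y = [0, 0, 0, 0, 0, 0, 1, 1]
scores = [0.05, 0.10, 0.15, 0.20, 0.30, 0.60, 0.40, 0.80]

area = auc(y, scores)
p_fair, p_opt = choose_cutoffs(y, scores)
sens, spec, rate = confusion_rates(y, scores, 0.5)
```

When defaults are rare, the fixed 0.5 cut-off tends to classify few observations as defaults, which is one reason for considering data-driven cut-offs.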



The data

In this study we focus on a real database provided by Creditreform and previously analysed in Figini and Fantazzini (2009)

The data set is composed of about 800 SMEs, 9 quantitative independent variables and a binary target variable (default)

The a priori probability of default is equal to 12.5%



Assessment of Single and Averaged Models



Selection of Single and Averaged Models based on AUC

Following DeLong et al. (1998), we compare the AUCs between pairs of models. We obtain that:

AUC_Tree ≠ AUC_Random Forest (p-value < 0.05)

AUC_Tree ≠ AUC_BMA (p-value < 0.05)

while none of the remaining comparisons is statistically significant
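A pairwise comparison of two correlated AUCs in the spirit of the DeLong test can be sketched as below. This is a plain, unoptimised illustration on hypothetical scores, not the code used to obtain the results above; the function names are illustrative.

```python
import math

def _psi(x, y):
    return 1.0 if x > y else 0.5 if x == y else 0.0

def _placements(pos, neg):
    """AUC plus the per-observation components used in the variance estimate."""
    v10 = [sum(_psi(x, y) for y in neg) / len(neg) for x in pos]
    v01 = [sum(_psi(x, y) for x in pos) / len(pos) for y in neg]
    return sum(v10) / len(v10), v10, v01

def _cov(a, b):
    ma, mb = sum(a) / len(a), sum(b) / len(b)
    return sum((x - ma) * (y - mb) for x, y in zip(a, b)) / (len(a) - 1)

def delong_test(y, scores_a, scores_b):
    """Return (auc_a, auc_b, z, two-sided p-value) for two models scored on the same data."""
    pos_idx = [i for i, t in enumerate(y) if t == 1]
    neg_idx = [i for i, t in enumerate(y) if t == 0]
    auc_a, v10a, v01a = _placements([scores_a[i] for i in pos_idx],
                                    [scores_a[i] for i in neg_idx])
    auc_b, v10b, v01b = _placements([scores_b[i] for i in pos_idx],
                                    [scores_b[i] for i in neg_idx])
    # Variance of the AUC difference, allowing for correlation between the two models.
    var = ((_cov(v10a, v10a) + _cov(v10b, v10b) - 2 * _cov(v10a, v10b)) / len(pos_idx)
           + (_cov(v01a, v01a) + _cov(v01b, v01b) - 2 * _cov(v01a, v01b)) / len(neg_idx))
    if var <= 0:
        return auc_a, auc_b, 0.0, 1.0   # no measurable difference between the models
    z = (auc_a - auc_b) / math.sqrt(var)
    p_value = math.erfc(abs(z) / math.sqrt(2))   # two-sided normal p-value
    return auc_a, auc_b, z, p_value

# Hypothetical outcomes and scores from two competing models.
y = [1, 1, 1, 1, 0, 0, 0, 0]
scores_a = [0.9, 0.8, 0.7, 0.6, 0.4, 0.3, 0.2, 0.1]
scores_b = [0.9, 0.3, 0.7, 0.2, 0.8, 0.4, 0.6, 0.1]

auc_a, auc_b, z, p_value = delong_test(y, scores_a, scores_b)
```

Because both models are scored on the same observations, the covariance terms matter: ignoring them would treat the two AUCs as independent and generally give a different variance for their difference.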



Prior on the model space and BMA



Discriminatory Power



Remarks and Conclusions

Bayesian Model Averaging

Both the Binomial and the Binomial-Beta priors share the implicit assumption that the probability that one regressor appears in the model is independent of the inclusion of the others, whereas regressors are typically correlated (e.g. Durlauf et al. 2008)

It is interesting to examine how different prior settings affect the predictive performance of the averaged models
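To make the two priors concrete, the sketch below computes the prior probability of a model with k of q regressors under each of them, and the implied prior on model size. The hyperparameter values (theta = 0.5, a = b = 1) are illustrative assumptions, not the settings used in this study.

```python
import math

def log_beta(a, b):
    """Logarithm of the Beta function."""
    return math.lgamma(a) + math.lgamma(b) - math.lgamma(a + b)

def binomial_model_prior(k, q, theta=0.5):
    """Prior of one specific model with k of q regressors, each included with probability theta."""
    return theta ** k * (1 - theta) ** (q - k)

def beta_binomial_model_prior(k, q, a=1.0, b=1.0):
    """Same prior with theta integrated out under a Beta(a, b) hyperprior."""
    return math.exp(log_beta(a + k, b + q - k) - log_beta(a, b))

def model_size_prior(model_prior, q):
    """Implied prior on the number of included regressors."""
    return [math.comb(q, k) * model_prior(k, q) for k in range(q + 1)]

q = 4
size_binomial = model_size_prior(binomial_model_prior, q)
size_beta_binomial = model_size_prior(beta_binomial_model_prior, q)
```

With these settings the Binomial prior puts equal mass on every model and therefore favours mid-sized models, while the Binomial-Beta prior with a = b = 1 is uniform over model sizes. In both cases the prior depends only on how many regressors are included, not on which ones.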

Random Forest

On the basis of the results at hand, we underline that in the non-parametric framework, too, averaged models perform better than single models

It is interesting to compare the results at hand with those of other ensemble methods, in order to optimise the accuracy of the averaged model

This project has received funding from the European Union’s Seventh Framework Programme for research, technological development and demonstration under grant agreement n° 320270

www.syrtoproject.eu

This document reflects only the author’s views.

The European Union is not liable for any use that may be made of the information contained therein.
