Role of Data Science in ERM @ Nashville Analytics Summit, Sep 2014
DESCRIPTION
An overview of how organizations can leverage data science and predictive analytics to improve enterprise risk management. Applications for risk identification, mitigation and management will be discussed, as well as methods to facilitate strategic integration across an organization.
TRANSCRIPT
The Role of Data Science in Enterprise Risk Management By John Liu, PhD, CFA
Question of the Day
• How do you tell the difference between a Bayesian Statistician and a Data Scientist?
• Answer: What’s the p-value?
Big Data: Big Risks
• Healthcare
• Financial Services
• Insurance
• Transportation
• National Security
• Dating
Key Takeaways
• What is Enterprise Risk Management (ERM)?
• What is the Role of Data Science in ERM?
• What Data Analytics are used for ERM?
What is Enterprise Risk Management?
What is Risk Management?
• A structured approach to managing uncertainty
• Management strategies:
  • Risk Avoidance
  • Risk Transfer
  • Risk Mitigation
Risk Management - Defense: Insurance Approach
[Figure: Reward vs. Probability of Success; labeled option: Do Nothing]
Risk Management - Offense: Opportunistic Approach
[Figure: Reward vs. Probability of Success; labeled options: Carpe Diem, Do Nothing]
What is ERM?
• Risk-based approach to managing an enterprise
• Risk-aware: considers every major tangible and intangible factor that contributes to failure, in every process and at every level of the enterprise
• Enterprise value is maximized by an optimal balance between profitability/growth and the related risks
• Management is better prepared to seize opportunities for growth and value creation
ERM Components: Comprehensive Approach to Managing Uncertainty
• Identify: Identify/Assess Internal and External Risks
• Quantify: Risk Scoring & Modeling
• Respond: Respond and Control
• Monitor: Monitor & Report Effectiveness
ERM Goals
• Provide a holistic view across an organization, leveraging firm experience and knowledge
• Provide greater transparency into factors that can impair value preservation and business profitability
• Understand and test the assumptions and interpretations behind business decision-making
ERM Risk Types
• Operational: Resource Capital Management; Business Disruption, IT
• Financial: Credit Exposure; Exchange Rate, Cash flow, Funding
• Compliance: Privacy, Security, Safety; Regulatory and Statutory Compliance
• Reporting: Financial Reporting; Regulatory Reporting
• Hazard: Natural Catastrophe; Market Panics
• Strategic: Business Planning; Marketing, Reputation
RM vs ERM
• HQ: EUR exposure (Sells EUR, Buys USD)
• Subsidiary: USD exposure (Sells USD, Buys EUR)
• RM: Subsidiaries/Business Units manage risks separately
• ERM: Manage NET exposure across the entire enterprise
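The netting idea can be sketched in a few lines of Python. The amounts below are hypothetical (the slide gives none) and are expressed in EUR terms, with a negative number meaning "sell EUR" and a positive number meaning "buy EUR". The function name `gross_and_net` is likewise only illustrative.

```python
def gross_and_net(hedges_eur):
    """Compare hedging each unit separately (gross) with hedging the net position."""
    gross = sum(abs(amount) for amount in hedges_eur.values())  # RM: every unit trades
    net = sum(hedges_eur.values())                              # ERM: offsetting trades cancel
    return gross, net

# Hypothetical hedge needs in EUR millions (not from the slides).
hedges_eur = {
    "HQ": -10.0,         # HQ sells EUR, buys USD
    "Subsidiary": 8.0,   # Subsidiary sells USD, buys EUR
}

gross, net = gross_and_net(hedges_eur)
print(f"RM trades {gross:.1f}m in total; ERM trades only {abs(net):.1f}m net")
```

Under these assumed numbers the two units would trade 18m between them if they hedged separately, while the enterprise as a whole only needs to sell 2m EUR.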
Data Science and ERM
ERM Framework
[Figure: ERM framework cube. Risk types: Compliance, Financial, Reporting, Hazard, Strategic. Enterprise levels: Entity Wide, Division, Business Unit.]
• Enterprise Structure, Risks Objectives & Components
• Comprehensive Approach
• Leverage Data & Analytic Resources
• Predictive Modeling
Common Challenges
• Data warehousing & sharing across entity
• Prioritization methodology
• Consolidated reporting
• Timeliness
• Data security
• The risk management process itself!
Role of Data Science
• Data science methods provide:
  • Enterprise Data Management
    • Comprehensive warehousing
    • Data quality and abundance
  • Risk Analytics
    • Predictive Modeling
    • Loss Distributions
  • Reporting
    • Real-time visualization, dashboards
    • Regulatory requirements
Typical Corporate EDW
• Big data warehouse ≠ useful data (quite the opposite)
Data Management
• Comprehensive data warehouse
• Coherent data collection (maybe)
• Facilitate data sharing across entity
• No useful analytics without abundant, high quality data

Data       | Big Data
Excel      | BigTable
PostgreSQL | Cassandra, Hive, HBase
MongoDB    | Vertica, KDB
Risk Analytics
• Benefits beyond Business Intelligence
• Newest: cognitive analytics = What is the best answer?

Descriptive Analytics                  | Predictive Analytics                      | Prescriptive Analytics
What happened?                         | What’s likely to occur?                   | Why would it occur?
Hindsight                              | Foresight                                 | Insight
Summary Statistics                     | Data mining                               | Heuristic Optimization
web analytics, BI, inventory reporting | credit scoring, trend analysis, sentiment | operations planning, stochastic methods
Rich Set of Visualization & Reporting Tools
• Aggregate Risk Dashboards
• Continuous & Comprehensive Risk Monitors
Source: IBM Cognos
Data Analytics Applications for ERM
• Operational: Scenario Analysis & Stress Testing
• Financial: Credit Scoring
• Compliance: IT Security Anomaly Detection
• Reporting: Risk Dashboard
• Hazard: Catastrophe & Market Risk Hedging
• Strategic: Marketing Analytics
Data Analytics for ERM
Definition of Risk
• Risk = Frequency of Loss × Severity of Loss
• Loss Distribution
[Figure: loss distribution with the Unexpected Loss region marked]
Traditional ERM
• Analytic Methods
  • Closed-form solutions (…just like most things in life)
• Historical
  • Estimate risk using internal and external loss data
• Monte Carlo
  • Estimate distribution parameters from real data
  • Sample from the fitted distribution by Monte Carlo
  • Calculate ensemble measures to estimate overall risk
• Simple to implement and to aggregate across the entity, but these methods make complex assumptions and are not robust to outliers
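A minimal sketch of the Monte Carlo approach, assuming a Poisson loss frequency and a lognormal loss severity. Both distribution choices and all parameter values (`lam`, `mu`, `sigma`) are illustrative assumptions, not taken from the slides. Only the Python standard library is used, so the Poisson draw is implemented by hand with Knuth's method.

```python
import math
import random

def poisson(rng, lam):
    """Draw a Poisson-distributed event count using Knuth's multiplication method."""
    threshold = math.exp(-lam)
    count, product = 0, rng.random()
    while product > threshold:
        count += 1
        product *= rng.random()
    return count

def simulate_annual_losses(n_years, lam, mu, sigma, seed=0):
    """Simulate total loss per year: Poisson frequency, lognormal severity."""
    rng = random.Random(seed)
    totals = []
    for _ in range(n_years):
        n_events = poisson(rng, lam)
        totals.append(sum(rng.lognormvariate(mu, sigma) for _ in range(n_events)))
    return totals

# Assumed parameters: 3 loss events per year on average, lognormal(0, 1) severities.
losses = sorted(simulate_annual_losses(10_000, lam=3.0, mu=0.0, sigma=1.0))

# Ensemble measures across the simulated years.
expected_loss = sum(losses) / len(losses)
var_95 = losses[round(0.95 * len(losses)) - 1]   # 95th percentile of annual loss
unexpected_loss = var_95 - expected_loss

print(f"expected {expected_loss:.2f}, 95% VaR {var_95:.2f}, unexpected {unexpected_loss:.2f}")
```

The sketch inherits the weaknesses listed above: the results depend entirely on the assumed distributions, and independence between events is taken for granted.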
Modern ERM
• Data analytics driven
• Inference based methods
• KRI scoring
• Parallelization
• Natural applications
  • Credit risk scoring
  • Anti-money laundering
  • Fraud
Prediction Methods
• Tail: Extreme-Value, Expected Deficit
• Bayesian: Naïve Bayes, HMMs, Bayes Nets
• Frequentist: Regression, Decision Trees, SVM
• Transduction
• Ensemble Methods: Bagging, Boosting, Voting
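Of the ensemble methods listed, voting is the simplest to illustrate. The sketch below combines three rule-based classifiers by majority vote. The rules, thresholds, and field names are hypothetical stand-ins for trained models and do not come from the slides.

```python
from collections import Counter

# Three hypothetical base classifiers; in practice these would be trained models.
def flag_large_amount(txn):
    return "fraud" if txn["amount"] > 10_000 else "ok"

def flag_new_account(txn):
    return "fraud" if txn["account_age_days"] < 30 else "ok"

def flag_foreign(txn):
    return "fraud" if txn["country"] != txn["home_country"] else "ok"

MODELS = [flag_large_amount, flag_new_account, flag_foreign]

def majority_vote(txn, models=MODELS):
    """Return the label that most base classifiers agree on."""
    votes = Counter(model(txn) for model in models)
    return votes.most_common(1)[0][0]

txn = {"amount": 25_000, "account_age_days": 400,
       "country": "FR", "home_country": "US"}
label = majority_vote(txn)
print(label)
```

With an odd number of binary classifiers there are no ties, which keeps the voting rule unambiguous.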
Outliers, Inliers, and Just Plain Liars
• Prediction problems fall into two classes: Inliers and Outliers
• Inherently different problems with different quirks
Main Problems with Inlier Prediction
• Parametric model choice
• Estimation error for lower moments (mean, s.d.)
• Incorrectly conjugating priors
• Normal/Gaussian distributions don’t really occur in real life
• I.I.D.? Really?
Main Problem with Outlier Prediction
• Data Quality and Abundance
  • To estimate low probability events, big data may not be big enough
• Data: 150 years of daily data
• Predictor: 100 year flood severity
• Relevant Data: 1 or 2 data points
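The "1 or 2 data points" figure follows from simple arithmetic, sketched below. The calculation assumes each year is independent and has a 1-in-100 chance of a flood at or beyond the 100-year level. That independence is exactly the kind of assumption the inlier slide questions.

```python
years_of_data = 150
return_period_years = 100

daily_observations = years_of_data * 365         # ignoring leap days
annual_probability = 1 / return_period_years     # chance of a 100-year flood in any one year

# Expected number of 100-year events contained in the record.
expected_events = years_of_data * annual_probability

# Probability that the record contains no such event at all (assumes independent years).
prob_no_event = (1 - annual_probability) ** years_of_data

print(f"{daily_observations} daily observations, "
      f"{expected_events:.1f} expected tail events, "
      f"P(no tail event) = {prob_no_event:.2f}")
```

So tens of thousands of daily observations yield on average 1.5 relevant events, and there is roughly a 22% chance that the record holds none.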
Value-at-Risk (VaR)
• Loss severity measure for a given probability and time horizon
• Estimate potential losses (or historical losses)
• Rank losses based on severity
• 95% Value-at-Risk is equal to the 95th percentile loss
• Interpretation = Losses won’t exceed 65.2m 95% of the time
• Underestimates losses during the other 5% of the time

Rank | Loss
1    | -0.1
2    | -0.1
3    | -0.3
4    | -0.6
5    | -0.7
6    | -0.9
7    | -1.1
…    | …
91   | -59.5
92   | -63.2
93   | -64.9
94   | -65.0
95   | -65.2  ← VaR
96   | -66.5
97   | -67.8
98   | -93.9
99   | -110.0
100  | -273.1
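The rank-based VaR lookup can be sketched as follows, using only the ranks printed on the slide. Ranks 8 to 90 are elided there, so only the tail (ranks 91 to 100) is reproduced. The function name and the dictionary layout are illustrative choices.

```python
# Ranks 91-100 as printed on the slide, in $m, shown as negative P&L.
# Ranks 8-90 are elided on the slide and are deliberately not filled in here.
tail_pnl = {91: -59.5, 92: -63.2, 93: -64.9, 94: -65.0, 95: -65.2,
            96: -66.5, 97: -67.8, 98: -93.9, 99: -110.0, 100: -273.1}

def value_at_risk(ranked_pnl, n_obs, confidence):
    """Return the loss at the rank matching the confidence level (rank 95 of 100 for 95%)."""
    rank = round(confidence * n_obs)
    return -ranked_pnl[rank]   # flip the sign so the loss is reported as a positive number

var_95 = value_at_risk(tail_pnl, n_obs=100, confidence=0.95)
print(f"95% VaR = {var_95}m")
```

The result matches the 65.2m quoted in the interpretation bullet. Note that the function never looks at ranks 96 to 100, which is why VaR says nothing about how bad the worst 5% can be.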
Value-at-Risk
• Loss severity measure for a given probability and time horizon
• 1-day 95% VaR of $1m
• Expect to lose no more than $1m in 95 out of every 100 days
• Says nothing about the other 5 days out of 100. Not very reassuring, is it?
Tail Value-at-Risk (TVaR)
• Loss severity measure for a given probability and time horizon
• Estimate potential losses (or historical losses)
• Rank losses based on severity
• 95% Tail Value-at-Risk is equal to the average of all losses beyond the 95th percentile loss
• Expect to lose on average $122m if losses exceed the 95th percentile

Rank | Loss
1    | -0.1
2    | -0.1
3    | -0.3
4    | -0.6
5    | -0.7
6    | -0.9
7    | -1.1
…    | …
91   | -59.5
92   | -63.2
93   | -64.9
94   | -65.0
95   | -65.2
96   | -66.5  ← TVaR (average of ranks 96-100)
97   | -67.8
98   | -93.9
99   | -110.0
100  | -273.1
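The $122m figure can be reproduced from the five losses beyond rank 95 in the table. The sketch below is one straightforward way to compute it. The function name is illustrative, and only the ranks printed on the slide are used.

```python
# Ranks 91-100 as printed on the slide, in $m, shown as negative P&L.
tail_pnl = {91: -59.5, 92: -63.2, 93: -64.9, 94: -65.0, 95: -65.2,
            96: -66.5, 97: -67.8, 98: -93.9, 99: -110.0, 100: -273.1}

def tail_value_at_risk(ranked_pnl, n_obs, confidence):
    """Average the losses ranked beyond the VaR cutoff (ranks 96-100 for 95% of 100)."""
    cutoff = round(confidence * n_obs)
    beyond = [-pnl for rank, pnl in ranked_pnl.items() if rank > cutoff]
    return sum(beyond) / len(beyond)

tvar_95 = tail_value_at_risk(tail_pnl, n_obs=100, confidence=0.95)
print(f"95% TVaR = {tvar_95:.2f}m")
```

The average of 66.5, 67.8, 93.9, 110.0 and 273.1 is 122.26, which rounds to the $122m on the slide and is nearly double the 65.2m VaR for the same data.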
Tail Value-at-Risk (TVaR)
• Loss severity measure for a given probability and time horizon
• 1-day 95% TVaR of $122m
• Better Measure of Risk
• Also known as Expected Shortfall, CVaR
Application: Operational Risk Management
• Definition: The risk of direct and indirect loss resulting from inadequate or failed:
  • Internal processes
  • People
  • IT systems
  • External events
Source: NYFed
Operational Risk
• External Criminal Activity
• Information Security Failure
• Internal Criminal Activity
• Unauthorized Activity
• Processing Failure
• System Failure
• Control Failure
• Business Disruption
• Workplace Safety
• Malpractice
Managing OpRisk
• One Approach
[Figure: OpRisk management flow. Labels: Assess, Scorecard, Identify Weakness, Internal Loss Data, Risk Scenarios, Risk Model, OpVar, Risk Capital.]
Source: NYFed
Methods
• Scorecard
  • KRI scoring models
  • Useful where no severity data exists
• Loss Distribution
  • Estimation of severity distribution parameters
  • MLE is not robust: data are not i.i.d., biased upwards, and subject to paucity and sparsity
  • Leads to biased loss exposures and correlation assumptions
• Huge opportunity for inference-based analytics
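To make the robustness concern concrete, here is a small sketch of maximum-likelihood estimation for a lognormal severity distribution. The lognormal choice and the five loss amounts are assumptions for illustration only. For a lognormal, the MLE is simply the mean and the population standard deviation of the log-losses.

```python
import math
import statistics

def fit_lognormal_mle(losses):
    """MLE for a lognormal: mean and population (divide-by-n) std of the log-losses."""
    logs = [math.log(x) for x in losses]
    mu = statistics.fmean(logs)
    sigma = statistics.pstdev(logs, mu)
    return mu, sigma

def implied_mean_severity(mu, sigma):
    """Mean of a lognormal distribution with parameters mu and sigma."""
    return math.exp(mu + sigma ** 2 / 2)

# Hypothetical loss amounts in $m (not from the slides); the last one is a large outlier.
losses = [1.2, 0.8, 2.5, 0.4, 15.0]

mu_all, sigma_all = fit_lognormal_mle(losses)
mu_trim, sigma_trim = fit_lognormal_mle(losses[:-1])

print(f"all data:        mean severity {implied_mean_severity(mu_all, sigma_all):.2f}")
print(f"without outlier: mean severity {implied_mean_severity(mu_trim, sigma_trim):.2f}")
```

With so few observations, a single large loss moves both fitted parameters substantially, which is the sparsity problem the bullets above describe.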
[Figure: risk scorecard matrix of Probability vs. Impact, with cell scores 1 2 3 / 2 3 5 / 3 5 9]
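The scorecard matrix above can be used as a simple lookup table. In the sketch below the cell scores come from the slide, while the level names (low, medium, high), the assumption that 1 is the low/low corner, and the example risks are all hypothetical. Because the matrix is symmetric, the row and column orientation does not affect the result.

```python
# Cell scores from the slide's Probability vs. Impact matrix.
SCORES = [[1, 2, 3],
          [2, 3, 5],
          [3, 5, 9]]

# Assumed level names; the slide does not label the rows and columns.
LEVELS = {"low": 0, "medium": 1, "high": 2}

def risk_score(probability, impact):
    """Look up the scorecard value for a probability level and an impact level."""
    return SCORES[LEVELS[probability]][LEVELS[impact]]

# Hypothetical risk register entries: (name, probability, impact).
register = [
    ("System Failure", "medium", "high"),
    ("Processing Failure", "high", "low"),
    ("Workplace Safety", "low", "medium"),
]

ranked = sorted(register, key=lambda r: risk_score(r[1], r[2]), reverse=True)
for name, probability, impact in ranked:
    print(f"{name}: {risk_score(probability, impact)}")
```

A lookup like this needs no severity data at all, which is why the scorecard approach is useful where loss history is missing.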
Looking Forward
ERM Trends
• Increasing adoption of ERM
Source: NCSU
Forensic Data Analytics
• Fraud Detection Top Concern, But Low Adoption
Source: Ernst & Young
Promise of Data Analytics
• EDW remains a huge issue for most corporations
  • Legacy zombie systems
  • IT reporting lines
• Increased understanding by senior managers and C-suite
• Analytics as a Service: growing competition within consulting industry
• Talent Gap – same for anything Data Science
Thank you