big data, big brother, and st irikm t dsystemic risk...
TRANSCRIPT
Big Data, Big Brother, andS t i Ri k M t dSystemic Risk Measurement and ManagementManagementAndrew W. Lo, MITYale CPSC 458a Guest Lecture
MITLaboratory foryFinancial Engineering
Technology and the Financial SystemTechnology and the Financial System
Buttonwood AgreementMay 17, 1792
© 2015 by Andrew W. Lo All Rights Reserved
12 Oct 2015 Slide 2
Technology and the Financial SystemTechnology and the Financial SystemWall Street Before the
Invention of the TelegraphWall Street After the
Invention of the Telegraphg p g p
Source: Leinweber, 2009, Nerds on Wall Street.
© 2015 by Andrew W. Lo All Rights Reserved
12 Oct 2015 Slide 3
Source: Leinweber, 2009, Nerds on Wall Street.
Technology and the Financial SystemTechnology and the Financial SystemNew Challenges to Financial Regulation
C it l i t h d t i l t Capital requirements are harder to implement New risks to financial stability have been created Fairness and privacy issues have emerged Speed of financial innovation has increased; speed ofSpeed of financial innovation has increased; speed of
regulatory innovation has not kept pace, e.g., HFT We need better “regulatory technology”: (1) risk We need better regulatory technology : (1) risk
transparency; (2) adaptive regulations; (3) framework for financial regulation (“regulation science”)for financial regulation ( regulation science )
Financial System 2.0 (Big Data Is Key)
© 2015 by Andrew W. Lo All Rights Reserved
12 Oct 2015 Slide 4
The Need forThe Need forThe Need for The Need for Big DataBig DataBig DataBig Data
“Nobody Knows Anything”“Nobody Knows Anything”y y gy y gThe May 6, 2010 “Flash Crash”
© 2015 by Andrew W. Lo All Rights Reserved
12 Oct 2015 Slide 6
“Nobody Knows Anything”“Nobody Knows Anything”y y gy y gAccenture plc, Market Depth,
Aggressive Buys, and PriceAggressive Buys, and Price
© 2015 by Andrew W. Lo All Rights Reserved
12 Oct 2015 Slide 7
“Nobody Knows Anything”“Nobody Knows Anything”y y gy y g
September 30, 2010October 6, 2010April 21, 2015
p ,
April 3, 2013
http://sec gov/news/studies/2010/marketevents-report pdfhttp://sec.gov/news/studies/2010/marketevents-report.pdf
© 2015 by Andrew W. Lo All Rights Reserved
Slide 812 Oct 2015
“Nobody Knows Anything”“Nobody Knows Anything”y y gy y g
October 15 2014October 15, 2014
© 2015 by Andrew W. Lo All Rights Reserved
Slide 912 Oct 2015
The Promise ofThe Promise ofThe Promise of The Promise of Big DataBig DataBig DataBig Data
Big Data for Consumer CreditBig Data for Consumer Creditgg $3.4T of consumer credit outstanding as of Mar 2015 $889B of revolving consumer credit outstanding as of $889B of revolving consumer credit outstanding as of
Mar 201538% f h h ld i i di d b l i 38% of households carry positive credit card balance in 2013 ($5,700)
© 2015 by Andrew W. Lo All Rights Reserved
12 Oct 2015 Slide 11
Big Data for Consumer CreditBig Data for Consumer CreditggStandard Credit Scores Are Too Insensitive
© 2015 by Andrew W. Lo All Rights Reserved
12 Oct 2015 Slide 12
Big Data for Consumer CreditBig Data for Consumer Creditgg
Transaction Data
Anonymized Data from Large U.S. Commercial Bank
Transaction Count Mortgage payment Hotel expenses Bar Expenses Total Inflow Credit car payment Travel expenses Fast Food Expenses Total Outflow Auto loan payment Recreation (golf Total Rest/Bars/Fast Food
By CategoryTransaction Data
Total Outflow Auto loan payment Recreation (golf Total Rest/Bars/Fast-Food Student loan payment Department Stores Expenses Healthcare related expenses
By Channel: All other types of loan payment Retail Stores Expenses Health insurance Other line of credit payments Clothing expenses Gas stations expenses
ACH (Count, Inflow and Outflow) Brokerage net flow Discount Store Expenses Vehicle expenses ATM (Count Inflow and Outflow) Dividends net flow Big Box Store Expenses Car and other insuranceATM (Count, Inflow and Outflow) Dividends net flow Big Box Store Expenses Car and other insurance BPY (Count, Inflow and Outflow) Utilities Payments Education Expenses Drug stores expenses CC (Count, Inflow and Outflow) TV Total Food Expenses Government DC (Count, Inflow and Outflow) Phone Grocery Expenses Treasury (eg. tax refunds) INT (Count, Inflow and Outflow) Internet Restaurant Expenses Pension Inflow WIR (Count, Inflow and Outflow) Collection Agencies Unemployment Inflow Collection Agencies ( , ) g p y g
Balance DataChecking Account Balance
k l il ( )
Credit Bureau Data1% Sample =Brokerage Account Balance File Age Type (CC, MTG, AUT, etc)
Saving Account Balance Credit Score Age of AccountCD Account Balance Open/Closed Flag & Date of Closure BalanceIRA Account Balance Bankruptcy (Date & Code) Limit if applicable
MSA & Zip Payment Status48 Month Payment Status History
1% Sample =10 Tb!
© 2015 by Andrew W. Lo All Rights Reserved
12 Oct 2015 Slide 13
48-Month Payment Status History
Big Data for Consumer CreditBig Data for Consumer Creditgg
© 2015 by Andrew W. Lo All Rights Reserved
12 Oct 2015 Slide 14
Big Data for Consumer CreditBig Data for Consumer Creditgg Decision tree Logistic regression
– Outputs a probability of delinquency
Random forest Clustering/segmentationg/ g
– Can be combined with other models
Software: Software: – WEKA (machine-learning suite – University of
Waikato NZ) http://www cs waikato ac nz/ml/weka/Waikato, NZ) http://www.cs.waikato.ac.nz/ml/weka/– LIBLINEAR (National Taiwan University)
http://www.csie.ntu.edu.tw/~cjlin/liblinear/p // / j / /
© 2015 by Andrew W. Lo All Rights Reserved
12 Oct 2015 Slide 15
Big Data for Consumer CreditBig Data for Consumer Creditgg
© 2015 by Andrew W. Lo All Rights Reserved
12 Oct 2015 Slide 16
Big Data for Consumer CreditBig Data for Consumer Credit Khandani, Kim, and Lo (2010)
600 000 dit d th 40 h ti
gg
600,000 credit cards per month; 40-hour runtime
© 2015 by Andrew W. Lo All Rights Reserved
Slide 1712 Oct 2015
Big Data for Consumer CreditBig Data for Consumer Creditgg
Credit Forecasts Over TimeCredit Forecasts Over Time
© 2015 by Andrew W. Lo All Rights Reserved
12 Oct 2015 Slide 18
Current ResearchCurrent Research 6 largest banks from
Jan 2008 to Dec 2014 Macro and institution-
specific factors (137) 25 Tb of data25 Tb of data Used to gauge quality
of risk managementof risk management across institutions
d l l Models vary greatly But you need data!
© 2015 by Andrew W. Lo All Rights Reserved
12 Oct 2015 Slide 19
Law As CodeLaw As Code
© 2015 by Andrew W. Lo All Rights Reserved
12 Oct 2015 Slide 20
Law As CodeLaw As CodeCharacteristic United States Code Software Code
Structure Titles, Chapters, Sections, Sub-sections
Libraries, Classes, Sub-classes, Functions
Execution Businesses regulators Computers interpretExecution Businesses, regulators, police, courts, interpret and execute laws
Computers interpret and execute software
Evolution Congress passes laws that edit the U.S. Code
Programmers edit the software system’s codebasecodebase
Language English (specialized “legal
Computer-interpretable language
language”) (e.g. C, Java)
© 2015 by Andrew W. Lo All Rights Reserved
Slide 2112 Oct 2015
Law As CodeLaw As Code “All general and permanent laws…” HeineOnline, U.S. Code 1925–2006 (does not include
all regulations, CFR); Office of the Law Revision Counsel U.S. Code size:
– 1925–2006: 36 gigabytesg g y– For 2014 (from U.S. OLRC): 1.8 million sentences (41.4 million
words, >200,000 pages)
Comparison with other software:– Mozilla Firefox: 14.4 million lines of code– Linux kernel: 23.5 million lines of code
© 2015 by Andrew W. Lo All Rights Reserved
Slide 2212 Oct 2015
Law As CodeLaw As Code
© 2015 by Andrew W. Lo All Rights Reserved
12 Oct 2015 Slide 23
Law As CodeLaw As Code
© 2015 by Andrew W. Lo All Rights Reserved
12 Oct 2015 Slide 24
Law As CodeLaw As Code
Principle Proposed Metric
The Five C’s of Software DevelopmentPrinciple Proposed Metric
1. Conciseness: Good code should be as long as it needs to be, but no longer.
Number of words
2. Cohesion: Modules in code should do one thing well, not multiple things badly.
Language perplexity
3. Change: Code that exhibits large or frequent Number of3. Change: Code that exhibits large or frequent change may suggest defects.
Number of sections/subsections affected
4 Coupling: Modular code is more robust and easier Size of cross-reference4. Coupling: Modular code is more robust and easier to maintain than code with unnecessary cross-dependencies.
Size of cross-reference network core versus periphery
5 C l it C d ith l b f N b f diti5. Complexity: Code with a large number of conditions, cases, and exceptions is difficult to understand and prone to error.
Number of condition statements in code (McCabe’s complexity)
© 2015 by Andrew W. Lo All Rights Reserved
Slide 2512 Oct 2015
Law As CodeLaw As Code4. Coupling:Principle: Modular code is more robust and easier to maintain than code with unnecessary cross-dependencies (reduces complexity)Metric: Cross-reference network analysis, e.g., Baldwin, y , g , ,MacCormack, and Rusnak, (2013) Map the network of interdependencies ofMap the network of interdependencies of
subcomponents of large software systems Develop measures of complexity robustness etc Develop measures of complexity, robustness, etc. Begin with notion of “strongly connected” nodes
© 2015 by Andrew W. Lo All Rights Reserved
Slide 2612 Oct 2015
Law As CodeLaw As Code
EEBB EEBB
AA DD GG
CoreFFHHCCII
Core
U S C d t k h 36 670 ti ( d ) U.S. Code network has 36,670 sections (nodes) Core is 6,947 sections (18.9%)
© 2015 by Andrew W. Lo All Rights Reserved
Slide 2712 Oct 2015
Law As CodeLaw As CodeLaws Passed in 111th Congress with Highest Coupling
P bli L Total Number of Public Law Number Title of Act Affected Core
SectionsAmerican Recovery and111-5 American Recovery and Reinvestment Act 293
111-148 Patient Protection and Aff d bl C A t 251Affordable Care Act
111-203 Dodd-Frank Wall Street Reform and Consumer Protection Act 232
111-312Tax Relief, Unemployment Insurance Reauthorization, and J b C ti A t
199Job Creation Act
111-84 National Defense Authorization Act for F.Y. 2010 143
© 2015 by Andrew W. Lo All Rights Reserved
Slide 2812 Oct 2015
Law As CodeLaw As CodeOmnibus Appropriations Act (111–8)
© 2015 by Andrew W. Lo All Rights Reserved
12 Oct 2015 Slide 29
Law As CodeLaw As CodeTitle 35 (Patents)
© 2015 by Andrew W. Lo All Rights Reserved
12 Oct 2015 Slide 30
Law As CodeLaw As CodeTitle 12 (Banks and Banking)
© 2015 by Andrew W. Lo All Rights Reserved
12 Oct 2015 Slide 31
Law As CodeLaw As CodeTitle 26 (Internal Revenue Code)
© 2015 by Andrew W. Lo All Rights Reserved
12 Oct 2015 Slide 32
Law As CodeLaw As CodeQuality Assurance?? For Linux Kernel “Patches”: Designated individuals (“Maintainers”) oversee
different versions Benevolent dictatorship: “Linus Torvals is the
fi l bit f ll h t d i t thfinal arbiter of all changes accepted into the Linux kernel.” –www.kernel.org/doc/Documentation/SubmittingPatches g What’s the equivalent for financial regulation?
© 2015 by Andrew W. Lo All Rights Reserved
Slide 3312 Oct 2015
The Threat ofThe Threat ofThe Threat ofThe Threat ofBig DataBig DataBig DataBig Data
Privacy vs. TransparencyPrivacy vs. Transparencyy p yy p y
PNAS 106 (July 2009)
© 2015 by Andrew W. Lo All Rights Reserved
12 Oct 2015 Slide 35
I Th A C i B tI Th A C i B tIs There A Compromise Between Is There A Compromise Between DataData Privacy and Transparency?Privacy and Transparency?DataData Privacy and Transparency?Privacy and Transparency?
© 2015 by Andrew W. Lo All Rights Reserved
12 Oct 2015 Slide 36
Secure MultiSecure Multi--Party ComputationParty Computationy py p
Y Y YY1 Y2 Yn-1…Y1 = S1+X1
Y2 = Y1+S2+X2
Yn = Yn-1+Sn+Xn
Y Z ZAndrew Alice Bob
Yn Z1 Zn-1…Z1 = Yn−X1
Z2 = Z1−X2
Zn = Zn-1−Xn
Andrew Alice Bob
© 2015 by Andrew W. Lo All Rights Reserved
12 Oct 2015 Slide 37
Privacy Privacy andand TransparencyTransparencyyy p yp yTransparency and Privacy Can Both Be Achieved Abbe, Khandani, and Lo (2012, 2015) Individual data is kept private e g RSA Individual data is kept private, e.g., RSA Encryption algorithms are “collusion-robust” Aggregate risk statistics can be computed using
encrypted dataencrypted data– Means, variances, correlations, percentiles,
Herfindahl indexes VaR CoVaR MES etcHerfindahl indexes, VaR, CoVaR, MES, etc.
Privacy is preserved, no need for raw data!
© 2015 by Andrew W. Lo All Rights Reserved
12 Oct 2015 Slide 38
Privacy Privacy andand TransparencyTransparencyyy p yp yReal Estate Loans Outstanding
© 2015 by Andrew W. Lo All Rights Reserved
12 Oct 2015 Slide 39
Privacy Privacy andand TransparencyTransparencyyy p yp yReal Estate Loans Outstanding
© 2015 by Andrew W. Lo All Rights Reserved
12 Oct 2015 Slide 40
ConclusionConclusion Technology has transformed everything! Financial markets are vastly better off But new challenges have emergedBut new challenges have emerged We can do better We have to do better Regulation has to account for technology andRegulation has to account for technology and
how it interacts with human behaviorl d f Computer science, law, and finance must
collaborate to create the Financial System 2.0© 2015 by Andrew W. Lo
All Rights ReservedSlide 4112 Oct 2015
Th k Y !Th k Y !Thank You!Thank You!
Additional ReferencesAdditional References Abbe, E., Khandani, A. and A. Lo, 2012, “Privacy-Preserving Methods for Sharing Financial Risk Exposures”,
American Economic Review: Papers and Proceedings 102, 65–70. Bisias, D., Flood, M., Lo, A., and S. Valavanis, 2012, “A Survey of Systemic Risk Analytics”, Annual Review of
Financial Economics 4 255–296Financial Economics 4, 255 296. Brennan, T. and A. Lo, 2012, “Do Labyrinthine Legal Limits on Leverage Lessen the Likelihood of Losses? An
Analytical Framework”, Texas Law Review 90, 1775–1810. Brennan, T. and A. Lo, 2014, “Dynamic Loss Probabilities and Implications for Financial Regulation”, to appear in
Yale Journal on RegulationYale Journal on Regulation. Butaru, F., Chen, Q., Clark, B., Das, S., Lo, A., and A. Siddique, 2015, “Risk and Risk Management in the Credit
Card Industry” (June 14, 2015). Available at SSRN: http://ssrn.com/abstract=2618746 or http://dx.doi.org/10.2139/ssrn.2618746
Campbell J Lo A and C MacKinlay 1997 The Econometrics of Financial Markets Princeton NJ: PrincetonCampbell, J., Lo, A. and C. MacKinlay, 1997, The Econometrics of Financial Markets. Princeton, NJ: Princeton University Press.
Chan, N., Getmansky, M., Haas, S. and A. Lo, 2007, “Systemic Risk and Hedge Funds”, in M. Carey and R. Stulz, eds., The Risks of Financial Institutions and the Financial Sector. Chicago, IL: University of Chicago Press.
Getmansky M Lo A and I Makarov 2004 “An Econometric Analysis of Serial Correlation and Illiquidity inGetmansky, M., Lo, A. and I. Makarov, 2004, An Econometric Analysis of Serial Correlation and Illiquidity in Hedge-Fund Returns”, Journal of Financial Economics 74, 529–609.
Getmansky, M., Lo, A. and S. Mei, 2004, “Sifting Through the Wreckage: Lessons from Recent Hedge-Fund Liquidations”, Journal of Investment Management 2, 6–38.
Khandani A Kim A and A Lo 2010 “Consumer Credit Risk Models via Machine-Learning Algorithms” JournalKhandani, A., Kim, A. and A. Lo, 2010, Consumer Credit Risk Models via Machine Learning Algorithms , Journal of Banking & Finance 34, 2767–2787.
Khandani, A. and A. Lo, 2007, “What Happened to the Quants In August 2007?”, Journal of Investment Management 5, 29–78.
Khandani A and A Lo 2009 “What Happened to the Quants in August 2007?: Evidence from Factors and
© 2015 by Andrew W. Lo All Rights Reserved
12 Oct 2015 Slide 43
Khandani, A. and A. Lo, 2009, What Happened to the Quants in August 2007?: Evidence from Factors and Transactions Data”, to appear in Journal of Financial Markets.
Additional ReferencesAdditional References Khandani, A., Lo, A., and R. Merton, 2013, “Systemic Risk and the Refinancing Ratchet Effect”, Journal of
Financial Economics 108, 29–45. Kirilenko, A., and A. Lo, 2013, “Moore's Law vs. Murphy's Law: Algorithmic Trading and Its Discontents”, Journal
of Economic Perspecti es 27 51 72of Economic Perspectives 27, 51–72. Li, W., Azar, Larochelle, D., Hill, P. and A. Lo, 2015, “Law Is Code: A Software Engineering Approach to Analyzing
the United States Code”, Journal of Business & Technology Law 10, 297–374. Lo, A., 2001, “Risk Management for Hedge Funds: Introduction and Overview”, Financial Analysts Journal 57, 16–
33 Lo A 2002 “The Statistics of Sharpe Ratios” Financial Analysts Journal 58 36 5033 Lo, A., 2002, The Statistics of Sharpe Ratios , Financial Analysts Journal 58, 36–50. Lo, A., 2002, “The Statistics of Sharpe Ratios”, Financial Analysts Journal 58, 36–50. Lo, A., 2004, “The Adaptive Markets Hypothesis: Market Efficiency from an Evolutionary Perspective”, Journal of
Portfolio Management 30, 15–29. Lo, A., 2005, The Dynamics of the Hedge Fund Industry. Charlotte, NC: CFA Institute. Lo, A., 2005, “Reconciling Efficient Markets with Behavioral Finance: The Adaptive Markets Hypothesis”, Journal
of Investment Consulting 7, 21–44. Lo, A., 2007, “Where Do Alphas Come From?: Measuring the Value of Active Investment Management”, to , , , p g g ,
appear in Journal of Investment Management. Lo, A., 2012, “Reading About the Financial Crisis: A 21-Book Review”, Journal of Economic Literature 50, 151–
178. Lo, A., 2013, “Fear, Greed, and Financial Crises: A Cognitive Neurosciences Perspective”, in J.P. Fouque and J. , , , , , g p , q
Langsam, eds., Handbook of Systemic Risk, Cambridge University Press. Lo, A. and C. MacKinlay, 1999, A Non-Random Walk Down Wall Street. Princeton, NJ: Princeton University Press. Lo, A., Petrov, C. and M. Wierzbicki, 2003, “It's 11pm—Do You Know Where Your Liquidity Is? The Mean-
Variance-Liquidity Frontier”, Journal of Investment Management 1, 55–93.
© 2015 by Andrew W. Lo All Rights Reserved
12 Oct 2015 Slide 44
q y , f g ,