financial markets signal detection with bayesian networks - phd dreamt - workshop 17th march 2016
FINANCIAL MARKETS SIGNALS DETECTION WITH BAYESIAN NETWORKS
Alessandro Greppi
Università degli Studi di Pavia
The Outline
Bayesian Networks (BN) Definition
Our Approach: Novel Methodology for Studying the Financial Markets
The Market BN Learnt from the Data
Concluding Remarks
What is a Bayesian Network
A BN is a graphical model (Pearl, 1988; Neapolitan, 1990; Jensen, 1996) that uses a Directed Acyclic Graph (DAG) to represent the relationships and interactions among a set of variables (Jensen, 1996). It also provides an inferential engine that allows alternative scenarios to be simulated in real time.
The variables are represented as nodes in the BN, and their dependencies are indicated as directed edges between variables. Each variable has a finite set of states (Lauritzen, 2003).
Directed because the arrows have a direction
Acyclic because following the arrows never leads back to the starting node (no directed loops are allowed in a DAG)
An Example: Oil Stocks (1)
A fund manager holds two oil stocks in his portfolio: ExxonMobil (XOM) and Petrobras (PBR).
At the opening bell…
PBR: -1.5%   XOM: -1.5%
… Maybe it’s because oil dropped to $30 per barrel.
An Example: Oil Stocks (2)
02:00 p.m.
PBR: -6%   XOM: -1.5%
The fund manager checks the oil price and observes that it is stable around $30 per barrel.
An Example: Oil Stocks (3)
Knowing that oil is stable at $30 per barrel, the fund manager concludes that PBR's losses are not related to oil price movements.
Now that he knows that PBR's daily performance is not connected to an oil price drop…
… he believes that the XOM price will not collapse on that trading day either.
PBR: -6%   XOM: -1.5%
Oil Price Down?
We consider the following three variables:
Oil price goes down (OD)
Petrobras stock goes down (PD)
ExxonMobil stock goes down (ED).
Each variable is represented by a node and has two states: YES / NO
A different level of certainty is associated with each of them.
OD has the effect of increasing the level of certainty associated with both PD and ED.
The arrows that connect the nodes model the direct impact, while the other (black) arrows indicate the direction of the impact on certainty.
When the fund manager observes that the PBR price is down by 6%, he is reasoning against the direction of the arrows, from effect back to cause.
Conditional Independence (1)
Conditional Independence (2)
Finally, he observes that the oil price is stable at $30 per barrel; consequently, he knows that PBR being down by 6% has no influence on XOM's stock performance.
This example shows how dependence/independence changes according to the information gathered.
PBR: -6%   XOM: -1.5%
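The change from dependence to independence can be checked numerically. The sketch below builds the full joint distribution of the three variables by brute-force enumeration, using the numbers from the probabilities slide of this talk (P(OD=y) = 0.7, P(stock down | OD=y) = 0.8, P(stock down | OD=n) = 0.1). The helper names `cond` and `prob` are hypothetical, introduced only for illustration.

```python
from itertools import product

P_OD = {"y": 0.7, "n": 0.3}               # P(OD)
P_DOWN_GIVEN_OD = {"y": 0.8, "n": 0.1}    # P(stock down = y | OD), same for PBR and XOM

def cond(stock_state, od_state):
    """P(stock = stock_state | OD = od_state)."""
    p_down = P_DOWN_GIVEN_OD[od_state]
    return p_down if stock_state == "y" else 1.0 - p_down

# Joint distribution implied by the DAG: P(OD) P(PD | OD) P(ED | OD)
joint = {
    (od, pd, ed): P_OD[od] * cond(pd, od) * cond(ed, od)
    for od, pd, ed in product("yn", repeat=3)
}

def prob(**fixed):
    """Sum the joint over all entries matching the fixed states, e.g. prob(pd='y')."""
    return sum(
        p for (od, pd, ed), p in joint.items()
        if all({"od": od, "pd": pd, "ed": ed}[k] == v for k, v in fixed.items())
    )

# Oil price unknown: seeing PBR down raises the probability that XOM goes down.
p_ed = prob(ed="y")                                     # about 0.59
p_ed_given_pd = prob(ed="y", pd="y") / prob(pd="y")     # about 0.76

# Oil price known to be stable: the PBR observation adds nothing.
p_ed_given_od = prob(ed="y", od="n") / prob(od="n")
p_ed_given_od_pd = prob(ed="y", od="n", pd="y") / prob(od="n", pd="y")

print(round(p_ed, 3), round(p_ed_given_pd, 3))
print(round(p_ed_given_od, 3), round(p_ed_given_od_pd, 3))
```

The last two values coincide, which is exactly the statement that PD and ED are conditionally independent given OD.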
Introducing Probabilities
Only the oil price level is relevant for PBR and XOM (oil stocks).
We need the following probabilities:
P(PD|OD)
P(ED|OD)
P(OD)
We assume that the probability of the oil price going down is 70%. Since both PBR and XOM are oil stocks, they suffer if the oil price plunges:
Probability that PBR (or XOM) goes down if the oil price drops: 80%
Probability that PBR (or XOM) goes down if the oil price does not drop: 10%
The Fundamental Rule
In order to obtain the initial probabilities for PD and ED we can use the so-called “fundamental rule” (Jensen, 1996): P(A|B) P(B) = P(A,B)
To calculate P(PD, OD) and P(ED, OD) we have (the values are the same for PD and ED, since both stocks share the same conditional probabilities):
P(PD=y, OD=y) = P(PD=y | OD=y) P(OD=y) = 0.8 × 0.7 = 0.56
P(PD=n, OD=y) = P(PD=n | OD=y) P(OD=y) = 0.2 × 0.7 = 0.14
P(ED=y, OD=n) = P(ED=y | OD=n) P(OD=n) = 0.1 × 0.3 = 0.03
P(ED=n, OD=n) = P(ED=n | OD=n) P(OD=n) = 0.9 × 0.3 = 0.27
Calculating P(PD) and P(ED)
In order to get the probabilities for PD and ED we marginalize out OD.
The joint probability table for P(PD, OD), which is identical for P(ED, OD), is:
          OD = y   OD = n   P(PD)
PD = y    0.56     0.03     0.59
PD = n    0.14     0.27     0.41
P(PD) = P(ED) = (0.59, 0.41)
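The joint table and its margins can be reproduced in a few lines. This is only a sketch of the arithmetic on the slide; the dictionary layout and variable names are assumptions made for illustration.

```python
P_OD = {"y": 0.7, "n": 0.3}
P_PD_GIVEN_OD = {  # P(PD | OD), indexed as [pd][od]
    "y": {"y": 0.8, "n": 0.1},
    "n": {"y": 0.2, "n": 0.9},
}

# Fundamental rule: P(PD, OD) = P(PD | OD) P(OD)
joint = {
    pd: {od: P_PD_GIVEN_OD[pd][od] * P_OD[od] for od in P_OD}
    for pd in P_PD_GIVEN_OD
}

# Marginalization: sum each row over the states of OD
p_pd = {pd: sum(row.values()) for pd, row in joint.items()}

for pd, row in joint.items():
    print(pd, [round(v, 2) for v in row.values()], round(p_pd[pd], 2))
# y [0.56, 0.03] 0.59
# n [0.14, 0.27] 0.41
```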
The Bayes Rule
Next, we use the information that PBR stock is down by 6% at 2 p.m. to update the probability of OD.
To do that we use the Bayes rule: P(B|A) = [ P(A|B) P(B) ] / P(A)
P(OD | PD=y) = P(PD=y | OD) P(OD) / P(PD=y) = (1/0.59) × (0.8 × 0.7, 0.1 × 0.3) = (0.95, 0.05)
To update the probability of ED, we use the fundamental rule to calculate P(ED, OD) with the updated distribution of OD.
Finally, we calculate P(ED) by marginalizing OD out of P(ED, OD).
The result is P(ED) = (0.765, 0.235)
This represents the quantitative effect of the information that Petrobras stock crashed.
Finally, when the fund manager observes that the oil price is stable at $30 per barrel, the probability of ED no longer depends on PD:
P(ED | OD=n) = (0.1, 0.9)
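The three updating steps can be followed in code. The sketch below mirrors the hand calculation under the same assumed numbers; the variable names are illustrative only.

```python
P_OD = {"y": 0.7, "n": 0.3}
P_DOWN_GIVEN_OD = {"y": 0.8, "n": 0.1}   # P(PD=y | OD) = P(ED=y | OD)

# Step 1: Bayes rule. P(OD | PD=y) is proportional to P(PD=y | OD) P(OD).
unnormalized = {od: P_DOWN_GIVEN_OD[od] * P_OD[od] for od in P_OD}
p_pd_down = sum(unnormalized.values())   # P(PD=y) = 0.59
p_od_given_pd = {od: v / p_pd_down for od, v in unnormalized.items()}

# Step 2: P(ED=y | PD=y) = sum over OD of P(ED=y | OD) P(OD | PD=y)
p_ed_down = sum(P_DOWN_GIVEN_OD[od] * p_od_given_pd[od] for od in P_OD)

# Step 3: once OD=n is observed, the PBR observation no longer matters.
p_ed_down_oil_stable = P_DOWN_GIVEN_OD["n"]

print(round(p_od_given_pd["y"], 3))   # 0.949
print(round(p_ed_down, 3))            # 0.764
print(p_ed_down_oil_stable)           # 0.1
```

Carrying full precision gives about 0.764 for the updated P(ED=y); the 0.765 on the slide is what one obtains by first rounding the posterior of OD to (0.95, 0.05), since 0.8 × 0.95 + 0.1 × 0.05 = 0.765.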
We Used Bayesian Networks for…
… conducting an analysis of S&P 500 buy/sell signals.
The variables were chosen according to research conducted by Credit Suisse (Patel et al., 2011):
• Growth variables
• Technical Analysis and Momentum variables
• Sentiment variables
• Valuation variables
• Profitability variables
These variables provide a complete view of the market:
Fundamental analysis + Quantitative approach + Behavioral finance
The information available
Market Available Data
Qualitative:
• Newspaper articles, TV news, specialized websites, market rumors
• Info generated inside the financial community, e.g. brokers' reports, studies on a specific country or sector
Quantitative:
• Microeconomic data (e.g. company data)
• Macroeconomic data (e.g. inflation, GDP)
Market Sentiment / Behavioral Indications
How a Fund Manager Collects Information…
Financial information is available on electronic platforms such as Bloomberg or FactSet.
• It is not easy to integrate the different pieces of information.
• Behavioral variables are often neglected because they are difficult to include in a model.
A New Tool for Fund Managers
The Current Situation:
Common tools (e.g. regressions, basic statistics) do not make it possible to interpret the existing relations among variables belonging to different areas: quantitative, qualitative and behavioral.
A New Approach:
By using BNs we integrate in the same framework variables belonging to different areas, in order to capture aspects (e.g. non-linear interactions) often neglected in the most common models.
Our Model
We learned the Bayesian network directly from data downloaded from Bloomberg (on a weekly basis), using the Hugin software.
The periods considered are 1994-2003 (several financial crises and bubbles) and 2004-2015.
The variables involved are:
• Value
• Growth
• Profitability
• Sentiment
• Momentum and Technical Analysis
In addition, we built two contrarian variables: B_S_CRB (on the commodities index) and B_S_SPX (on the S&P 500).
Data Preprocessing
Common practice: investors reason in terms of a discretized version of the variables used.
We consider three states:
1 (high value)
2 (low value)
0 (neutral value)
The market behavior is influenced by the two extreme situations: states 1 and 2.
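A discretization of this kind could be sketched as follows. The talk does not state which thresholds were used, so the tercile cut-offs, the function name `discretize` and the sample RSI readings below are all assumptions made purely for illustration.

```python
import statistics

def discretize(values, low_cut, high_cut):
    """Map each numeric value to a state: 1 (high), 2 (low) or 0 (neutral)."""
    states = []
    for value in values:
        if value >= high_cut:
            states.append(1)
        elif value <= low_cut:
            states.append(2)
        else:
            states.append(0)
    return states

# Hypothetical weekly RSI readings (not data from the talk).
rsi = [28.0, 35.5, 47.2, 52.1, 58.9, 63.4, 71.0, 44.8, 39.9]

# Assumed rule: empirical terciles split the sample into low / neutral / high.
low_cut, high_cut = statistics.quantiles(rsi, n=3)
states = discretize(rsi, low_cut, high_cut)
print(states)
```

Fixed domain thresholds (for example the conventional overbought and oversold levels of an indicator) would be an equally plausible choice; only the three-state coding itself comes from the slide.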
Learning the BN from the Data
For our application we used the Hugin software.
1. We ran the Chow-Liu algorithm (Chow and Liu, 1968) to draw an initial draft of the network.
2. Then we applied a constraint-based algorithm, the NPC (Necessary Path Condition) algorithm (Steck, 2001). It carries out a series of independence tests and builds a graph that satisfies the discovered independence statements.
As constraints we used those suggested by the Chow-Liu algorithm plus others deriving from our knowledge of financial markets.
The conditional distributions were estimated from the data using the EM algorithm, whose version for BNs was proposed by Lauritzen (1995).
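To give an idea of what the first step does, here is a minimal pure-Python sketch of the Chow-Liu idea: weight every pair of variables by its empirical mutual information and keep a maximum-weight spanning tree. It is not the Hugin implementation; the function names and the toy data are hypothetical, and the sketch returns only the undirected skeleton (choosing a root to orient the edges is left out).

```python
import math
from collections import Counter
from itertools import combinations

def mutual_information(xs, ys):
    """Empirical mutual information (in nats) between two discrete columns."""
    n = len(xs)
    count_x, count_y, count_xy = Counter(xs), Counter(ys), Counter(zip(xs, ys))
    return sum(
        (c / n) * math.log(c * n / (count_x[x] * count_y[y]))
        for (x, y), c in count_xy.items()
    )

def chow_liu_edges(data):
    """Return the undirected edges of the Chow-Liu tree for a dict of columns."""
    # Weight every pair of variables by mutual information, strongest first.
    weighted = sorted(
        ((mutual_information(data[a], data[b]), a, b)
         for a, b in combinations(data, 2)),
        reverse=True,
    )
    # Kruskal's algorithm with a tiny union-find:
    # keep an edge unless it would close a cycle.
    parent = {name: name for name in data}

    def find(name):
        while parent[name] != name:
            name = parent[name]
        return name

    edges = []
    for _, a, b in weighted:
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            parent[root_a] = root_b
            edges.append((a, b))
    return edges

# Hypothetical toy data in the 0/1/2 coding: B copies A, C is only weakly related.
toy = {
    "A": [1, 1, 2, 2, 0, 0, 1, 2, 0, 1, 2, 0],
    "B": [1, 1, 2, 2, 0, 0, 1, 2, 0, 1, 2, 0],
    "C": [0, 1, 2, 0, 1, 2, 1, 0, 2, 2, 1, 0],
}
print(chow_liu_edges(toy))
```

On the toy data the strongest edge is A-B, and the tree has two edges for three variables, as any spanning tree must.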
The 1994-2003 Network for S&P 500
This screenshot gives us a picture of the starting point before simulating alternative scenarios for the period 1994-2003.
The 2004-2015 Network for S&P 500
This screenshot gives us a picture of the starting point before simulating alternative scenarios for the period 2004-2015.
Examination of Different Scenarios
Once the model has been estimated we can address a number of queries.
Different scenarios can be examined by inserting new evidence and propagating it throughout the network. For lack of space, we report only the results referring to a volatility shock and the role of P/E.
Volatility has recently been dominating the media headlines, while P/E is considered by practitioners the key metric for conducting fundamental analysis.
Simulations can be performed in real time (with a mouse click) by using the evidence propagation algorithm.
For reasons of time, we present in detail only the results for the period 2004-2015.
Low/High Vola (2004-2015)
LOW VOLA
• High RSI: from 32.59% to 60.97%
• High ROC: from 25.45% to 41.13%
• High P_UP_DOWN: from 67% to 76.72%
• High PC_RATIO: from 21.92% to 20.93%, but still the highest
• High BB YLD: from 30.97% to 29.68%; Low BB YLD: from 28.59% to 23.13%
• EARN_GR: no clear indication
• High PE RATIO (state 1): increases from 29.01% to 29.76%
• B_S_CRB: SELL from 26.14% to 34.18%
• B_S_SPX: SELL decreases from 23.67% to 16.36%, but still the highest
HIGH VOLA
• Low RSI: from 32.59% to 61.44%
• Low ROC: from 23.71% to 43.99%
• Low P_UP_DOWN: from 43% to 69.35%
• Low PC_RATIO: from 22.89% to 24.48%
• High BB YLD: from 30.97% to 43.65%
• Low EARN GR: increases from 31.36% to 50.79%
• Low PE RATIO: from 29.35% to 32.64%
• B_S_CRB: SELL from 26.14% to 34.18%
• B_S_SPX: BUY from 24.22% to 39.62%
The most interesting findings involve the following neighboring variables: PE_RATIO, BB_YLD, RSI, ROC, P_UP_DOWN, EARN_GR and B_S_SPX.
Low/High P/E (2004-2015)
LOW P/E
• Low BB YLD: from 30.97% to 31.61%
HIGH P/E
• Low BB YLD: from 28.59% to 43.39%
Contrary to common financial belief, a change in the PE RATIO has an appreciable impact only on the profitability variable BB YLD. The effect of PE RATIO on BB YLD confirms that companies repurchase their own shares according to their valuation.
The Innovativeness of the Approach
The framework developed is innovative and useful for fund managers because:
• The tools currently available (e.g. rankings, scorecards) do not consider and model all the available information at the same time
• It follows a rigorous approach
• Its results can be easily interpreted
• BNs appear to be an ideal tool in uncertain situations
Concluding Remarks
Market efficiency depends not only on financial news but also on information coming from other areas.
By using the BN we uncover, in the time of a mouse click, new information and dynamics that would otherwise not be revealed by the common tools used by financial practitioners every day.
Some results differ from common financial knowledge. We propose a few examples:
1994-2003
• P/E does not affect RSI and ROC -> evidence of market irrationality during “bubbles”
• High volatility -> high P/E (low volatility -> low P/E)
2004-2015
• P/E does not provide buy/sell signals on SPX, contrary to what the financial community generally believes
This is evidence that the market equilibrium and its drivers change over time.
Thanks to BNs we can update financial knowledge, because markets are continuously evolving.
References
1. Chow, C. K., and Liu, C. N.: Approximating Discrete Probability Distributions with Dependence Trees. IEEE Transactions on Information Theory, 14, 462–467 (1968).
2. Cowell, R. G., Dawid, A. P., Lauritzen, S. L., and Spiegelhalter, D. J.: Probabilistic Networks and Expert Systems. Springer, New York (1999).
3. Fama, E.: Efficient Capital Markets: A Review of Theory and Empirical Work. Journal of Finance, 25, 383–417 (1970).
4. Jensen, F. V.: Bayesian networks. UCL Press, London (1996).
5. Lauritzen, S. L.: The EM Algorithm for Graphical Association Models with Missing Data. Computational Statistics & Data Analysis, 19, 191–201 (1995).
6. Nielsen, A. E.: Goal - Global Strategy Paper No. 1. Goldman Sachs Global Economics - Commodities and Strategy Research (2011).
7. Patel, P. N., Yao, S., Carlson, R., Banerji, A., and Handelman, J.: Quantitative Research - A Disciplined Approach. Credit Suisse Equity Research (2011).
8. Steck, H.: Constraint-Based Structural Learning in Bayesian Networks using Finite Data. PhD thesis, Institut für Informatik der Technischen Universität München (2001).