FE 582 - Project Presentation
TRANSCRIPT
![Page 1: FE 582 - Project Presentation](https://reader035.vdocuments.us/reader035/viewer/2022081520/589c5a111a28abc4358b5641/html5/thumbnails/1.jpg)
STATISTICAL ARBITRAGE PAIRS FOR THE UNIVERSE OF SECTORAL ETFS USING CO-INTEGRATION
Manoj Shenoy, Zenghui Liu, Yangxi Leng
![Page 2: FE 582 - Project Presentation](https://reader035.vdocuments.us/reader035/viewer/2022081520/589c5a111a28abc4358b5641/html5/thumbnails/2.jpg)
Central Theme
Pair trading is a statistical strategy that takes advantage of mispricings between assets.
It deserves attention and study because it is widely used among hedge funds, owing to its low market and sector-specific risk.
Why co-integration: Using the R-squared statistic to check a regression can give misleading results, because time series with trends tend to produce what has come to be known as ‘spurious regression’. Hence the need for co-integration.
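The spurious regression effect can be illustrated with a small simulation. The sketch below is in Python rather than the R used for the project, and everything in it (function names, sample sizes, the seed) is an assumption chosen for illustration. It regresses pairs of independent random walks on each other and compares the average R-squared in levels with the average R-squared of their period-to-period changes.

```python
import random

def r_squared(x, y):
    """R-squared of an OLS regression of y on x (with intercept)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    return (sxy * sxy) / (sxx * syy)

def random_walk(rng, n):
    """Cumulative sum of independent standard normal steps."""
    level, path = 0.0, []
    for _ in range(n):
        level += rng.gauss(0.0, 1.0)
        path.append(level)
    return path

def diff(series):
    """First differences of a series."""
    return [b - a for a, b in zip(series, series[1:])]

def average_r2(n_trials=200, n_obs=250, seed=0):
    """Average R-squared in levels and in changes over many unrelated pairs."""
    rng = random.Random(seed)
    levels, changes = 0.0, 0.0
    for _ in range(n_trials):
        x, y = random_walk(rng, n_obs), random_walk(rng, n_obs)
        levels += r_squared(x, y)              # trending series, unrelated by construction
        changes += r_squared(diff(x), diff(y)) # stationary changes of the same series
    return levels / n_trials, changes / n_trials

mean_r2_levels, mean_r2_changes = average_r2()
print("mean R^2 in levels: ", mean_r2_levels)
print("mean R^2 in changes:", mean_r2_changes)
```

The two walks in each pair are unrelated, so any sizeable R-squared in levels comes from the trends alone. This is why a stationarity test on the spread is used instead of R-squared.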
![Page 3: FE 582 - Project Presentation](https://reader035.vdocuments.us/reader035/viewer/2022081520/589c5a111a28abc4358b5641/html5/thumbnails/3.jpg)
Project Focus and Objective
To give a brief overview of a pair trading strategy for the universe of sectoral ETFs.
The main aim is to thoroughly assess the sectoral ETFs, bucket them into sectors using an already defined industry-wide classification, and outline trading strategies for the ETF pairs in each sector, based on whether or not the pair is co-integrated.
To use machine learning algorithms to train on the data and predict the ETF spread, using the co-integrated Natural Resources pair CRBQ-GRES as an example.
![Page 4: FE 582 - Project Presentation](https://reader035.vdocuments.us/reader035/viewer/2022081520/589c5a111a28abc4358b5641/html5/thumbnails/4.jpg)
Methodology & Technology
R for writing the code for the statistical model.
fUnitRoots and tseries packages for determining whether ETFs are co-integrated.
quantmod and PerformanceAnalytics packages for portfolio statistics.
R for generating the visualizations, using ggplot and quantmod.
The machine learning tool Weka for ETF spread prediction. A classifier model called Multi-Layer Perceptron is used for training on the data and predicting the spread.
![Page 5: FE 582 - Project Presentation](https://reader035.vdocuments.us/reader035/viewer/2022081520/589c5a111a28abc4358b5641/html5/thumbnails/5.jpg)
Current Work
Developed a statistical model for pair trading using co-integration, back-tested over a period of 5 years.
Back-tested the entire universe of sectoral ETFs to arrive at the optimal portfolio of ETF pairs within the same sector.
Chose one co-integrated pair from the Natural Resources industry, CRBQ-GRES, to show visualizations of spreads, equity curves, scatterplots, etc.
Determined the optimal buy and sell threshold levels based on P&L optimization.
Used the machine learning tool Weka to train on part of the data and predict the future spread with supervised learning methods.
![Page 6: FE 582 - Project Presentation](https://reader035.vdocuments.us/reader035/viewer/2022081520/589c5a111a28abc4358b5641/html5/thumbnails/6.jpg)
Defining the Co-integration Model
Two ETFs, A and B, are co-integrated, and the price time series corresponding to each of them is non-stationary.
Two equations, one per ETF, equate the return of the ETF in the current time period to the scaled difference of log prices.
In these equations, ϒ is the co-integration coefficient, and each equation has its own error correction term. The scaled difference of log prices is termed the spread in our model.
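The equations on this slide are not present in the transcript. A possible reconstruction is given below, assuming the slide followed the error-correction representation in Vidyamurthy (listed in the bibliography). The symbols $p^A_t$, $p^B_t$, $\alpha_A$, $\alpha_B$, $\varepsilon^A_t$ and $\varepsilon^B_t$ are assumptions, and $\gamma$ stands for the co-integration coefficient written ϒ on the slide.

```latex
\begin{aligned}
\log p^A_t - \log p^A_{t-1} &= \alpha_A \left( \log p^A_{t-1} - \gamma \log p^B_{t-1} \right) + \varepsilon^A_t \\
\log p^B_t - \log p^B_{t-1} &= \alpha_B \left( \log p^A_{t-1} - \gamma \log p^B_{t-1} \right) + \varepsilon^B_t \\
\text{spread}_t &= \log p^A_t - \gamma \log p^B_t
\end{aligned}
```

Under this reading, the left-hand sides are the one-period log returns of each ETF, and the bracketed term is the previous period's spread.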
![Page 7: FE 582 - Project Presentation](https://reader035.vdocuments.us/reader035/viewer/2022081520/589c5a111a28abc4358b5641/html5/thumbnails/7.jpg)
Defining the Co-integration Model
Consider a portfolio that is long one share of ETF A and short ϒ shares of ETF B. The return of the portfolio over a given time period is given by the change in the spread over that period.
Consider the trading strategy where trades are put on and unwound on a deviation of Δ in either direction from the spread mean. Buy the portfolio (long ETF A and short ETF B) when the current spread is Δ below the mean. Similarly, sell the portfolio (short ETF A and long ETF B) when the current spread is Δ above the mean.
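The trading rule can be sketched in a few lines. This is a Python illustration, not the project's R code. It assumes that a position is held until the opposite threshold is reached and that the portfolio return over a period is the change in the spread. The function names and the sample spread values are made up.

```python
def positions(spread, mean, delta):
    """Position in the pair portfolio for each period.

    +1 = long the portfolio (long A, short B), -1 = short the portfolio,
    0 = no position yet. Assumption: a position is held until the spread
    reaches the opposite threshold.
    """
    held, current = [], 0
    for s in spread:
        if s <= mean - delta:
            current = 1    # spread is delta (or more) below the mean: buy
        elif s >= mean + delta:
            current = -1   # spread is delta (or more) above the mean: sell
        held.append(current)
    return held

def strategy_pnl(spread, mean, delta):
    """Sum of spread changes, each weighted by the position held over that change."""
    held = positions(spread, mean, delta)
    return sum(p * (nxt - cur) for p, cur, nxt in zip(held, spread, spread[1:]))

# Hypothetical spread series, for illustration only
spread = [0.00, -0.02, -0.01, 0.00, 0.02, 0.01, -0.02, 0.00]
held = positions(spread, mean=0.0, delta=0.02)
pnl = strategy_pnl(spread, mean=0.0, delta=0.02)
print(held)  # [0, 1, 1, 1, -1, -1, 1, 1]
print(pnl)   # approximately 0.10 (in spread units)
```

In this example the portfolio is bought at -0.02, reversed at 0.02, and reversed again at -0.02, so each completed swing between the thresholds earns 2Δ = 0.04.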
![Page 8: FE 582 - Project Presentation](https://reader035.vdocuments.us/reader035/viewer/2022081520/589c5a111a28abc4358b5641/html5/thumbnails/8.jpg)
Road Map for Strategy Design & Implementation
Data is downloaded directly from Yahoo using R code.
ETF pairs from the same sector are tested for co-integration using the Augmented Dickey-Fuller (ADF) test. This involves determining the co-integration coefficient and examining the spread time series to ensure that it is stationary and mean-reverting.
The co-integration coefficient is obtained by regressing the log price series of one ETF against the other; the resulting regression coefficient is also known as the hedge ratio.
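The regression step can be sketched as follows. This is a Python illustration under stated assumptions, not the project's R code. The function names and the price values are hypothetical, and the sketch stops at the spread. The ADF test itself is not implemented here; in R it could be run on the resulting spread with, for example, `adf.test` from the tseries package.

```python
import math

def ols(x, y):
    """Slope and intercept of the least-squares line y = intercept + slope * x."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    slope = (sum((a - mx) * (b - my) for a, b in zip(x, y))
             / sum((a - mx) ** 2 for a in x))
    return slope, my - slope * mx

def hedge_ratio_and_spread(prices_a, prices_b):
    """Regress log(A) on log(B); return the hedge ratio and the spread series."""
    log_a = [math.log(p) for p in prices_a]
    log_b = [math.log(p) for p in prices_b]
    gamma, _ = ols(log_b, log_a)                          # regression coefficient
    spread = [a - gamma * b for a, b in zip(log_a, log_b)]
    return gamma, spread

# Hypothetical prices built so that log(A) = 0.5 + 2 * log(B) exactly
prices_b = [10.0, 11.0, 12.0, 13.0]
prices_a = [math.exp(0.5) * p ** 2 for p in prices_b]
gamma, spread = hedge_ratio_and_spread(prices_a, prices_b)
print(gamma)   # close to 2
print(spread)  # each value close to 0.5
```

With real prices the spread is not constant; the test then asks whether it fluctuates around a stable mean.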
![Page 9: FE 582 - Project Presentation](https://reader035.vdocuments.us/reader035/viewer/2022081520/589c5a111a28abc4358b5641/html5/thumbnails/9.jpg)
Road Map for Strategy Design & Implementation
If the p-value obtained from the ADF test is less than or equal to 0.01, we conclude that the series is stationary.
The entire universe of ETF pairs is run through the code to determine the co-integrated pairs.
The data is then used to train for the value of delta that optimizes the profit function.
Delta is the threshold value at which the pair is bought or sold; the optimal delta is the one that maximizes the profit.
Visualizations are generated for the co-integrated ETF pair from Natural Resources: CRBQ-GRES.
ETF spread prediction for the pair is implemented through supervised machine learning, using the classifier algorithm ‘Multi-Layer Perceptron’ in the tool Weka.
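One simple way to choose delta is a grid search over candidate thresholds on the training data. The sketch below is in Python and is only an assumption about how such a search could look; the project's actual profit function is not shown in the slides. The function names, the candidate grid, and the spread values are hypothetical.

```python
def pnl_for_delta(spread, mean, delta):
    """Profit (in spread units) of the threshold rule for one value of delta.

    Assumption: go long the pair when the spread is delta below the mean,
    go short when it is delta above, and hold until the opposite signal.
    """
    position, pnl = 0, 0.0
    for cur, nxt in zip(spread, spread[1:]):
        if cur <= mean - delta:
            position = 1
        elif cur >= mean + delta:
            position = -1
        pnl += position * (nxt - cur)
    return pnl

def best_delta(spread, candidates):
    """Candidate delta with the highest training-sample profit."""
    mean = sum(spread) / len(spread)
    return max(candidates, key=lambda d: pnl_for_delta(spread, mean, d))

# Hypothetical training spread, for illustration only
train_spread = [0.00, 0.01, 0.03, -0.01, -0.03, 0.01, 0.03, -0.01, -0.03, 0.00]
candidates = [0.005, 0.02, 0.04]
chosen = best_delta(train_spread, candidates)
print(chosen)  # 0.02
```

In this made-up series the smallest threshold trades too early, the largest never trades, and the middle value captures most of each swing. A delta chosen this way is fitted to the training sample and would need checking on data held out from the search.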
![Page 10: FE 582 - Project Presentation](https://reader035.vdocuments.us/reader035/viewer/2022081520/589c5a111a28abc4358b5641/html5/thumbnails/10.jpg)
Visualizations for the pair from Natural Resources sector CRBQ - GRES
![Page 11: FE 582 - Project Presentation](https://reader035.vdocuments.us/reader035/viewer/2022081520/589c5a111a28abc4358b5641/html5/thumbnails/11.jpg)
Visualizations : CRBQ-GRES Pair
![Page 12: FE 582 - Project Presentation](https://reader035.vdocuments.us/reader035/viewer/2022081520/589c5a111a28abc4358b5641/html5/thumbnails/12.jpg)
Visualizations : CRBQ-GRES Pair
![Page 13: FE 582 - Project Presentation](https://reader035.vdocuments.us/reader035/viewer/2022081520/589c5a111a28abc4358b5641/html5/thumbnails/13.jpg)
Visualizations : CRBQ-GRES Pair
![Page 14: FE 582 - Project Presentation](https://reader035.vdocuments.us/reader035/viewer/2022081520/589c5a111a28abc4358b5641/html5/thumbnails/14.jpg)
Training Data to get optimal level of Delta for Max Profit
[Chart: cumulative profit (Cum.Profit); axis ticks run from 90 to 230 in steps of 20]
![Page 15: FE 582 - Project Presentation](https://reader035.vdocuments.us/reader035/viewer/2022081520/589c5a111a28abc4358b5641/html5/thumbnails/15.jpg)
Performance Analysis – Co-integration portfolio
X1 9% X23 11% X45 6% X67 7% X89 4% X111 10% X133 3% X155 8%
X2 13% X24 6% X46 0% X68 7% X90 7% X112 8% X134 5% X156 4%
X3 8% X25 5% X47 3% X69 6% X91 4% X113 5% X135 3% X157 7%
X4 3% X26 6% X48 5% X70 8% X92 7% X114 3% X136 5% X158 9%
X5 6% X27 4% X49 4% X71 6% X93 6% X115 6% X137 1% X159 6%
X6 8% X28 1% X50 15% X72 3% X94 6% X116 9% X138 3% X160 7%
X7 5% X29 4% X51 10% X73 8% X95 7% X117 9% X139 8% X161 3%
X8 3% X30 6% X52 6% X74 9% X96 9% X118 3% X140 4% X162 4%
X9 3% X31 15% X53 8% X75 7% X97 9% X119 8% X141 3% X163 10%
X10 9% X32 5% X54 8% X76 7% X98 10% X120 4% X142 3% X164 7%
X11 2% X33 6% X55 9% X77 6% X99 10% X121 4% X143 3% X165 3%
X12 6% X34 6% X56 7% X78 5% X100 9% X122 4% X144 6% X166 3%
X13 8% X35 6% X57 5% X79 5% X101 6% X123 6% X145 7% X167 2%
X14 6% X36 11% X58 10% X80 8% X102 9% X124 6% X146 8% X168 6%
X15 10% X37 7% X59 34% X81 8% X103 7% X125 3% X147 4% X169 4%
X16 5% X38 2% X60 6% X82 9% X104 8% X126 6% X148 8% X170 3%
X17 5% X39 2% X61 5% X83 8% X105 12% X127 3% X149 5% X171 7%
X18 8% X40 5% X62 5% X84 9% X106 10% X128 5% X150 4% X172 5%
X19 4% X41 2% X63 8% X85 5% X107 4% X129 4% X151 10% X173 5%
X20 4% X42 6% X64 8% X86 3% X108 4% X130 2% X152 9% X174 4%
X21 9% X43 7% X65 6% X87 4% X109 8% X131 3% X153 8% X175 6%
X22 3% X44 7% X66 9% X88 7% X110 4% X132 6% X154 3% X176 7%
Average 2%
CO-INTEGRATED PAIRS PORTFOLIO
Worst Drawdowns for Co-integrated Pairs Portfolio
![Page 16: FE 582 - Project Presentation](https://reader035.vdocuments.us/reader035/viewer/2022081520/589c5a111a28abc4358b5641/html5/thumbnails/16.jpg)
Performance Analysis – Co-integration portfolio
Particulars Portfolio
Observations 1206
NAs 0
Minimum 99.9348
Quartile 1 104.78
Median 108.41
Arithmetic Mean 112.46
Geometric Mean 111.99
Quartile 3 118.09
Maximum 141.89
SE Mean 0.3043
LCL Mean 111.86
UCL Mean 113.05
Variance 111.67
Stdev 10.56
Skewness 1.0552
Kurtosis 0.1696
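As a quick consistency check, the standard error and confidence limits in the table can be recomputed from the reported mean, variance, and number of observations. This Python sketch assumes that the limits are a 95% interval and uses the normal value 1.96; the slides do not state either, so small differences in the last digit are expected.

```python
import math

# Values taken from the summary table above
n_obs = 1206
mean = 112.46
variance = 111.67

stdev = math.sqrt(variance)
se_mean = stdev / math.sqrt(n_obs)   # standard error of the mean

z = 1.96                             # assumption: 95% interval, normal approximation
lcl = mean - z * se_mean             # lower confidence limit of the mean
ucl = mean + z * se_mean             # upper confidence limit of the mean

print("SE mean:", round(se_mean, 4))
print("LCL:", round(lcl, 2), "UCL:", round(ucl, 2))
```

The recomputed values agree with the table's SE Mean, LCL Mean, and UCL Mean to within rounding.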
![Page 17: FE 582 - Project Presentation](https://reader035.vdocuments.us/reader035/viewer/2022081520/589c5a111a28abc4358b5641/html5/thumbnails/17.jpg)
ETF SPREAD PREDICTION THROUGH SUPERVISED MACHINE LEARNING
A machine learning tool called Weka has been used for spread prediction. The objective is to show an application of supervised machine learning.
The same Natural Resources pair, CRBQ-GRES, is used as the example. Two sample data sets are used: one uses 5-period lagged variables (an embedding dimension of 5) and the other uses 10-period lagged variables.
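The lagged (embedded) data sets can be built by sliding a window over the spread series. The sketch below is a Python illustration. The function name and the sample values are hypothetical, and the slides do not describe the exact file layout that was fed to Weka.

```python
def make_lagged(series, n_lags):
    """Build a supervised data set from a time series.

    Each feature row holds the n_lags previous values (oldest first);
    the target is the value that follows them.
    """
    features, targets = [], []
    for t in range(n_lags, len(series)):
        features.append(series[t - n_lags:t])
        targets.append(series[t])
    return features, targets

# Hypothetical spread values, for illustration only
sample_spread = [0.01, 0.02, 0.00, -0.01, 0.03, 0.02, 0.01]
X, y = make_lagged(sample_spread, n_lags=5)
print(X)  # [[0.01, 0.02, 0.0, -0.01, 0.03], [0.02, 0.0, -0.01, 0.03, 0.02]]
print(y)  # [0.02, 0.01]
```

Calling the same function with `n_lags=10` would give the second data set; a series of length N yields N - n_lags rows.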
![Page 18: FE 582 - Project Presentation](https://reader035.vdocuments.us/reader035/viewer/2022081520/589c5a111a28abc4358b5641/html5/thumbnails/18.jpg)
ETF SPREAD PREDICTION THROUGH SUPERVISED MACHINE LEARNING
Without repetitive data, the algorithm cannot be trained effectively enough to minimize the error between the actual and the predicted variable.
A classifier algorithm called Multi-Layer Perceptron is used for training on the data and developing the model.
![Page 19: FE 582 - Project Presentation](https://reader035.vdocuments.us/reader035/viewer/2022081520/589c5a111a28abc4358b5641/html5/thumbnails/19.jpg)
Weka Analysis for the CRBQ – GRES Pair with lag - 5
[Chart: actual spread (SPREAD) and predicted spread (Predicted); axis ticks from -0.03 to 0.05]
![Page 20: FE 582 - Project Presentation](https://reader035.vdocuments.us/reader035/viewer/2022081520/589c5a111a28abc4358b5641/html5/thumbnails/20.jpg)
Weka Analysis for the CRBQ – GRES Pair with lag - 5
=== Evaluation on test split ===
=== Summary ===
Lagged Variables 5
Correlation coefficient 0.9347
Mean absolute error 0.0028
Root mean squared error 0.0041
Relative absolute error 22.64%
Root relative squared error 27.12%
Total Number of Instances 412
Summary Statistics for 5 Lagged Variables
![Page 21: FE 582 - Project Presentation](https://reader035.vdocuments.us/reader035/viewer/2022081520/589c5a111a28abc4358b5641/html5/thumbnails/21.jpg)
Weka Analysis for the CRBQ – GRES Pair with lag - 10
[Chart: actual spread (SPREAD) and predicted spread (Predicted); axis ticks from -0.03 to 0.05]
![Page 22: FE 582 - Project Presentation](https://reader035.vdocuments.us/reader035/viewer/2022081520/589c5a111a28abc4358b5641/html5/thumbnails/22.jpg)
Weka Analysis for the CRBQ – GRES Pair with lag - 10
=== Evaluation on test split ===
=== Summary ===
Lagged Variables 10
Correlation coefficient 0.9328
Mean absolute error 0.0041
Root mean squared error 0.0053
Relative absolute error 32.89%
Root relative squared error 35.32%
Total Number of Instances 412
Summary Statistics for 10 Lagged Variables
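For readers unfamiliar with the metrics in these summaries, the sketch below shows how they are commonly defined. It is a Python illustration with made-up numbers, not Weka's code. In particular, it measures the relative errors against a baseline that always predicts the mean of the actual values, which is an assumption; Weka may compute its baseline differently.

```python
import math

def regression_metrics(actual, predicted):
    """Common regression error metrics for paired actual/predicted values."""
    n = len(actual)
    mean_a = sum(actual) / n
    mean_p = sum(predicted) / n
    abs_err = [abs(p - a) for a, p in zip(actual, predicted)]
    sq_err = [(p - a) ** 2 for a, p in zip(actual, predicted)]
    dev_a = [a - mean_a for a in actual]      # deviations from the baseline
    dev_p = [p - mean_p for p in predicted]
    return {
        "correlation": sum(da * dp for da, dp in zip(dev_a, dev_p))
        / math.sqrt(sum(da * da for da in dev_a) * sum(dp * dp for dp in dev_p)),
        "mae": sum(abs_err) / n,
        "rmse": math.sqrt(sum(sq_err) / n),
        # relative errors: model error divided by the baseline's error
        "rae": sum(abs_err) / sum(abs(da) for da in dev_a),
        "rrse": math.sqrt(sum(sq_err) / sum(da * da for da in dev_a)),
    }

# Made-up values, for illustration only
metrics = regression_metrics([1.0, 2.0, 3.0, 4.0], [1.5, 2.0, 2.5, 4.0])
for name, value in metrics.items():
    print(name, round(value, 4))
```

On this reading, a relative absolute error of 22.64% means the model's total absolute error is roughly a quarter of what the constant baseline would produce. Both relative errors are lower with 5 lags than with 10 in the summaries above.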
![Page 23: FE 582 - Project Presentation](https://reader035.vdocuments.us/reader035/viewer/2022081520/589c5a111a28abc4358b5641/html5/thumbnails/23.jpg)
BIBLIOGRAPHY AND REFERENCES
Ganapathy Vidyamurthy. Pairs Trading: Quantitative Methods and Analysis. (New York: John Wiley & Sons, Inc., 2004).
Elton, Edwin J. and Martin J. Gruber. Modern Portfolio Theory and Investment Analysis, 4th Edition. (New York: John Wiley & Sons, Inc., 1991).
Robert H. Shumway and David S. Stoffer. Time Series Analysis and its Applications - with R Examples, 3rd Edition. (New York: Springer, 2010).