tls orthogonal regression vba

Total Least-squares (Orthogonal) Regression vs Ordinary Least Squares (Normal)

Slope      3.2867
Intercept  0.3381

(z = Intercept + Slope * x, the Total Least-squares fitted value)

x       y       z
0.0118  0.0549  0.3769
0.0648  0.4990  0.5511
0.1365  1.2477  0.7867
0.1509  1.3439  0.8341
0.1730  0.9849  0.9067
0.1934  1.5304  0.9737
0.1991  0.9769  0.9925
0.2523  1.2821  1.1673
0.2714  1.6606  1.2301
0.2844  1.5627  1.2728
0.2897  1.7861  1.2903
0.2987  1.7280  1.3198
0.3028  1.5151  1.3333
0.3093  1.2807  1.3547
0.3412  1.4339  1.4595
0.3420  1.9614  1.4621
0.3704  1.2501  1.5555
0.3784  1.5916  1.5818
0.4449  1.9384  1.8003
0.4692  1.8366  1.8802
0.4966  2.1051  1.9703
0.5226  2.0129  2.0557
0.5341  2.4959  2.0935
0.5417  2.1110  2.1185
0.5466  1.8384  2.1346
0.5681  1.7141  2.2053
0.5936  1.5000  2.2891
0.6213  2.0627  2.3801
0.6449  2.6729  2.4577
0.6602  2.3864  2.5080
0.6614  2.4871  2.5119
0.6822  2.2778  2.5803
0.6946  2.3559  2.6210
0.6979  2.8558  2.6319
0.7027  2.3110  2.6476
0.7271  2.2392  2.7278
0.7373  2.8841  2.7614
0.7948  2.3997  2.9504
0.8180  2.6302  3.0266
0.8216  3.3867  3.0384
0.8385  3.3287  3.0940
0.8537  3.3824  3.1439
0.8600  2.5985  3.1646
0.8757  2.8299  3.2162
0.8801  3.5722  3.2307
0.8939  3.3630  3.2761
0.8998  3.4912  3.2955
0.9568  3.6173  3.4828
0.9797  3.3579  3.5581
0.9883  3.1547  3.5863


I'm looking to trade a pair of equities, and attempting to keep them market neutral. A simple example is:

ABC, Last Price = 50
XYZ, Last Price = 10

For every 1 share of ABC sold, I would buy (ABC/XYZ) = 5 shares of XYZ.

That works well for two stocks that move around at a similar pace, but let's say I'm looking at AAPL vs SPY. Let's say that for every 1% SPY moves, AAPL moves 2%. In this case the ratio needs to take into account this difference in movement.

AAPL, Last Price = 260
SPY, Last Price = 113

For every 1 share of AAPL sold, I would buy (AAPL/SPY)*2 = 4.6 shares of SPY. In this case AAPL's beta to SPY is 2.

Another way to look at that AAPL vs SPY pair is to say for every share of SPY sold, I would buy (SPY/AAPL)*.5 = .2173 shares of AAPL. In this case SPY's beta to AAPL is .5.
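The share arithmetic in this example can be sketched in a few lines of Python. The function name `hedge_shares` and its parameters are illustrative only (they are not from the thread), and the sketch assumes beta is measured on percentage returns.

```python
def hedge_shares(price_traded, price_hedge, beta):
    """Shares of the hedge to hold per share of the traded stock.

    The price ratio matches dollar exposure; beta (hypothetical input) scales
    it by the traded stock's % move per 1% move in the hedge.
    """
    return price_traded / price_hedge * beta

# Prices and betas as quoted in the post.
spy_per_aapl = hedge_shares(260, 113, 2.0)  # AAPL's beta to SPY taken as 2
aapl_per_spy = hedge_shares(113, 260, 0.5)  # SPY's beta to AAPL taken as 0.5

print(round(spy_per_aapl, 4), round(aapl_per_spy, 4))
# The two ratios describe the same position only if their product is 1,
# which requires the two betas to be exact reciprocals.
print(spy_per_aapl * aapl_per_spy)
```

With betas that are not reciprocal, the product differs from 1 and the two ways of sizing the pair disagree, which is the problem described in the rest of the question.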

As in the example above, I'd expect to find that the calculation of AAPL's beta to SPY is simply the inverse of SPY's beta to AAPL. Here's where things stop making sense to me. Using Bloomberg to calculate the beta of AAPL to SPY using daily 2010 returns results in 1.065. Doing the same to calculate the beta of SPY to AAPL results in 0.475. I've tried manually calculating these betas in Excel and came up with the same answers.

Most examples I try are like this where the beta of ABC to XYZ does not equal the inverse of the beta of XYZ to ABC. When the beta of ABC to XYZ does not equal the inverse of XYZ to ABC, there is a problem when deciding what ratio to use. The resulting ratio should be the same either way. I think I'm either calculating something wrong or I'm approaching this pair neutrality from the wrong angle and have incorrect expectations for Beta.

What am I missing?

Use an orthogonal regression and the beta will be consistent both ways: the beta of ABC to XYZ is exactly the reciprocal of the beta of XYZ to ABC.
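A minimal sketch of why the OLS betas are not reciprocal while the orthogonal ones are. The product of the two OLS slopes equals r^2 (for the betas quoted in the question, 1.065 * 0.475 is about 0.506), so they are reciprocal only when the correlation is perfect. The helper names are illustrative, and the sample points are simply the first ten (x, y) rows of the worksheet data.

```python
import math

def sums_of_squares(x, y):
    """Centred sums of squares and cross products: SSxx, SSxy, SSyy."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    return sxx, sxy, syy

def ols_slope(x, y):
    """Slope of the ordinary least squares regression of y on x."""
    sxx, sxy, _ = sums_of_squares(x, y)
    return sxy / sxx

def tls_slope(x, y):
    """Slope of the total least squares (orthogonal) line of y against x."""
    sxx, sxy, syy = sums_of_squares(x, y)
    w = (syy - sxx) / (2 * sxy)
    # Take the root whose sign matches the sign of the covariance.
    return w + math.copysign(math.sqrt(w * w + 1), sxy)

x = [0.0118, 0.0648, 0.1365, 0.1509, 0.1730, 0.1934, 0.1991, 0.2523, 0.2714, 0.2844]
y = [0.0549, 0.4990, 1.2477, 1.3439, 0.9849, 1.5304, 0.9769, 1.2821, 1.6606, 1.5627]

sxx, sxy, syy = sums_of_squares(x, y)
r2 = sxy ** 2 / (sxx * syy)
ols_product = ols_slope(x, y) * ols_slope(y, x)  # equals r^2, below 1
tls_product = tls_slope(x, y) * tls_slope(y, x)  # equals 1

print(ols_product, r2, tls_product)
```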

Yes, orthogonal regression is better. But if switching to it changes the hedge ratio much, you should reconsider the trade.

Think of it this way. OLS regression tells you to hold the portfolio $1 of asset Y and -Beta of asset X. It will compute Beta for you to minimize the standard deviation of this portfolio. Orthogonal regression tells you to hold:

The standard deviation of return of asset Y times -Rho divided by (1 - Rho^2)^0.5 of asset X
The standard deviation of return of asset X divided by (1 - Rho^2)^0.5 of asset Y

where the standard deviations are either estimated from the sample or derived independently. Rho is estimated to minimize the standard deviation of that portfolio.

If these portfolios are very different then the correlation between the two assets is small and asset X has a larger standard deviation than asset Y. In this case, X is not a good hedge for Y: it will not reduce risk much but it will add a lot of volatility. If you held X and used OLS to find out how much Y to use for a hedge, it would give you a small number.

The same t and F stats apply to orthogonal regression as OLS.

Can't help with the first part.

On the second part, dollar-neutral is for funds/people trying to eliminate the need to put up dollars. Delta-neutral is for funds/people trying to eliminate market risk. Choose delta-neutral ...

Volatility enters the calculation by attempting to form the minimum variance portfolio. Write down an expression for the variance of your portfolio and then minimise the variance with respect to the weights of each asset. In the two asset case you will end up with something like p.V1/V2 where p is the correlation-coefficient between the two assets and V1, V2 are the volatilities. This expression is just good-old-fashioned beta. Note that this expression can also be derived directly from OLS.
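The minimisation described here can be checked numerically. This is a sketch under assumed inputs: the volatilities and correlation below are hypothetical, and `portfolio_variance` and `min_variance_hedge` are illustrative names. For a position of one unit of asset 1 against h units of asset 2, the variance is V1^2 - 2*h*p*V1*V2 + h^2*V2^2, and setting its derivative with respect to h to zero gives h = p*V1/V2.

```python
def portfolio_variance(h, v1, v2, rho):
    """Variance of (asset 1 minus h units of asset 2)."""
    return v1 ** 2 - 2 * h * rho * v1 * v2 + (h * v2) ** 2

def min_variance_hedge(v1, v2, rho):
    """Hedge ratio that minimises the variance: h = rho * v1 / v2."""
    return rho * v1 / v2

# Hypothetical annualised volatilities and correlation.
v1, v2, rho = 0.30, 0.15, 0.7

h = min_variance_hedge(v1, v2, rho)
var_at_h = portfolio_variance(h, v1, v2, rho)

# Moving the hedge ratio either way from h increases the variance.
var_above = portfolio_variance(h + 0.01, v1, v2, rho)
var_below = portfolio_variance(h - 0.01, v1, v2, rho)

print(h, var_at_h, var_above, var_below)
```

The variance left at the optimum is V1^2 * (1 - p^2), so a low correlation leaves most of the risk unhedged whatever ratio is chosen.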

There are two separate questions regarding pair trading:

1. Is orthogonal regression a superior way of identifying "correlated" pairs than least square regression? Also, what are the statistical tests for testing the significance of parameters obtained from orthogonal regression (equivalent of t-stat and F-stat)?

2. Once a pair has been identified, how do we ratio the pairs: shall we strive for dollar-neutrality or delta-neutrality? According to McMillan, volatility also enters the calculation of the ratio. Can somebody explain to me why?

I'm trying to calculate the variance of the beta estimate.

Once you have estimated the orthogonal regression parameters, you can simply measure the perpendicular distance between the data points and the estimated regression line.
http://mathworld.wolfram.com/Point-LineDistance2-Dimensional.html
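A small sketch of the distance calculation from the linked page. For the line y = intercept + slope * x, the perpendicular distance from a point (x, y) is |slope * x - y + intercept| / sqrt(slope^2 + 1). The function name is illustrative; the slope and intercept are the Total Least Squares values from the worksheet, and the points are its first three data rows.

```python
import math

def perpendicular_distance(x, y, slope, intercept):
    """Shortest distance from the point (x, y) to y = intercept + slope * x."""
    return abs(slope * x - y + intercept) / math.sqrt(slope ** 2 + 1)

# Total Least Squares line reported in the worksheet.
slope, intercept = 3.28668, 0.33810

# First three (x, y) rows of the worksheet data.
points = [(0.0118, 0.0549), (0.0648, 0.4990), (0.1365, 1.2477)]

distances = [perpendicular_distance(px, py, slope, intercept) for px, py in points]
sum_sq = sum(d * d for d in distances)  # orthogonal residual sum of squares

print(distances, sum_sq)
```

Summing the squared distances over all points gives the quantity that orthogonal regression minimises, which is the natural starting point for a residual variance estimate.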

I still end up with two different betas, as with ordinary OLS... I thought the purpose of orthogonal regression was to give you one beta only? The two betas should be reciprocal, no?

R^2 is just ss(xy)^2 / (ss(x) * ss(y)), which does not change on interchanging x and y. Similarly, if you look at the t-stat for the slope coefficient (not the intercept) you will end up with a formula which is unchanged on interchanging x and y.
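The symmetry can be checked with the summary statistics from the worksheet (SSxx, SSxy, SSyy and n = 50). This is a sketch, with `slope_t_stat` as an illustrative helper name; it computes the slope t-statistic for y on x and for x on y from the usual OLS formulas.

```python
import math

# Summary statistics as reported in the worksheet.
n = 50
ss_xx, ss_xy, ss_yy = 3.57017, 10.21956, 34.04917

def slope_t_stat(ss_pred, ss_resp, ss_cross, n):
    """t-statistic of the OLS slope when regressing the response on the predictor."""
    slope = ss_cross / ss_pred
    resid_ss = ss_resp - slope * ss_cross          # residual sum of squares
    std_err = math.sqrt(resid_ss / (n - 2) / ss_pred)
    return slope / std_err

r2 = ss_xy ** 2 / (ss_xx * ss_yy)                  # symmetric in x and y
t_y_on_x = slope_t_stat(ss_xx, ss_yy, ss_xy, n)
t_x_on_y = slope_t_stat(ss_yy, ss_xx, ss_xy, n)

print(r2, t_y_on_x, t_x_on_y)
# Both t-statistics equal r * sqrt(n - 2) / sqrt(1 - r^2).
print(math.sqrt(r2 * (n - 2) / (1 - r2)))
```

The square of this t-statistic is the F statistic of the regression, which the worksheet reports as about 292.79.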

As Athletico wrote, this is orthogonal regression and is very closely related to PCA. For N observations of k-dimensional data, usually you need to process a k x k matrix (Gauss-Jordan inversion, Cholesky, eigen-decomposition, all O(k^3)). Check out the Karhunen-Loeve transform, which enables you to process an N x N matrix instead. Useful for N<<k. HTH.

I'm looking for a linear regression that will minimize the ("closest distance to the line")^2 rather than the ("vertical distance to the line")^2 as with standard linear least squares. This is sometimes called orthogonal regression or total least squares regression. http://en.wikipedia.org/wiki/Total_least_squares
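One way to sketch such a fit in two dimensions, in line with the PCA remark in this thread: the orthogonal regression line passes through the centroid along the major principal axis of the point cloud, whose angle satisfies tan(2*theta) = 2*SSxy / (SSxx - SSyy). The function names are illustrative and the sample is the first ten (x, y) rows of the worksheet data. The sketch does not handle an exactly vertical line.

```python
import math

def tls_line(x, y):
    """Slope and intercept of the total least squares line via the principal axis."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    # Angle of the major axis of the scatter matrix [[sxx, sxy], [sxy, syy]].
    theta = 0.5 * math.atan2(2 * sxy, sxx - syy)
    slope = math.tan(theta)
    return slope, my - slope * mx

def perp_sse(x, y, slope, intercept):
    """Sum of squared perpendicular distances to y = intercept + slope * x."""
    return sum((intercept + slope * a - b) ** 2 for a, b in zip(x, y)) / (1 + slope ** 2)

x = [0.0118, 0.0648, 0.1365, 0.1509, 0.1730, 0.1934, 0.1991, 0.2523, 0.2714, 0.2844]
y = [0.0549, 0.4990, 1.2477, 1.3439, 0.9849, 1.5304, 0.9769, 1.2821, 1.6606, 1.5627]

slope, intercept = tls_line(x, y)
mx, my = sum(x) / len(x), sum(y) / len(y)

best = perp_sse(x, y, slope, intercept)
# Lines through the centroid with a slightly different slope fit worse.
steeper = perp_sse(x, y, slope + 0.1, my - (slope + 0.1) * mx)
flatter = perp_sse(x, y, slope - 0.1, my - (slope - 0.1) * mx)

print(slope, intercept, best, steeper, flatter)
```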











OLS x'y: 2.8625 (slope), 0.5704 (intercept); the other cells of this block show #VALUE!

TLS x'y: all cells show #VALUE!

TLS or Orthogonal regression gives 'correct' reciprocal Beta


A Comparison of a Simple Linear Regression and a Total Least Squares Fit

Simple Linear Regression: minimizes the sum of squared y deviations from the line of best fit.

Total Least Squares: minimizes the sum of squared distances of the data points from the line of best fit.

Worksheet columns: a constant 1, the index i (1 to 50), xi and yi. The 50 (xi, yi) pairs are the x and y values of the data table at the top.

Sum    xi: 27.3780    yi: 106.8877

x bar                        0.54756
y bar                        2.13775
SSxx                         3.57017
SSxy                         10.21956
SSyy                         34.04917
B^1 = SSxy/SSxx              2.862486
B^0 = y bar - B^1 * x bar    0.570371
Se                           0.316091
Standard Err Slope           0.167289
Standard Err Intercept       0.101926
X0

Total Least Squares
(SSyy-SSxx)/(2SSxy)          1.49121
Slope +                      3.28668
Slope -                      -0.30426
Intercept                    0.33810

r = SSxy/SQRT(SSxx*SSyy)     0.926903
r2                           0.859149

ANOVA Table     Sum of Squares
Linear Model    29.25333
Error           4.79585
Total           34.0491749042

Note: The Two Lines Intersect at (x bar, y bar)

1 2.3147

[Figure: Residuals of Simple Linear Regression]

[Figure: Scatter Plot; series: Data, Sim Lin R, Total LS]


xi * yi   xi^2     yi^2      y^i       e^i      (e^i)^2
0.0006    0.0001   0.0030    0.6041    -0.5492  0.3017
0.0323    0.0042   0.2490    0.7559    -0.2569  0.0660
0.1703    0.0186   1.5568    0.9611    0.2866   0.0821
0.2028    0.0228   1.8061    1.0023    0.3416   0.1167
0.1704    0.0299   0.9700    1.0656    -0.0807  0.0065
0.2960    0.0374   2.3421    1.1240    0.4064   0.1652
0.1945    0.0396   0.9543    1.1403    -0.1634  0.0267
0.3235    0.0637   1.6438    1.2926    -0.0105  0.0001
0.4507    0.0737   2.7576    1.3472    0.3134   0.0982
0.4444    0.0809   2.4420    1.3845    0.1782   0.0318
0.5174    0.0839   3.1902    1.3996    0.3865   0.1494
0.5162    0.0892   2.9860    1.4254    0.3026   0.0916
0.4588    0.0917   2.2955    1.4371    0.0780   0.0061
0.3961    0.0957   1.6402    1.4557    -0.1750  0.0306
0.4892    0.1164   2.0561    1.5471    -0.1132  0.0128
0.6708    0.1170   3.8471    1.5493    0.4121   0.1698
0.4630    0.1372   1.5628    1.6306    -0.3805  0.1448
0.6023    0.1432   2.5332    1.6535    -0.0619  0.0038
0.8624    0.1979   3.7574    1.8439    0.0945   0.0089
0.8617    0.2201   3.3731    1.9134    -0.0768  0.0059
1.0454    0.2466   4.4314    1.9919    0.1132   0.0128
1.0519    0.2731   4.0518    2.0663    -0.0534  0.0029
1.3331    0.2853   6.2295    2.0992    0.3967   0.1574
1.1435    0.2934   4.4563    2.1210    -0.0100  0.0001
1.0049    0.2988   3.3797    2.1350    -0.2966  0.0880
0.9738    0.3227   2.9381    2.1965    -0.4824  0.2328
0.8904    0.3524   2.2500    2.2695    -0.7695  0.5922
1.2816    0.3860   4.2547    2.3488    -0.2861  0.0819
1.7238    0.4159   7.1444    2.4164    0.2565   0.0658
1.5755    0.4359   5.6949    2.4602    -0.0738  0.0054
1.6450    0.4374   6.1857    2.4636    0.0235   0.0006
1.5539    0.4654   5.1884    2.5232    -0.2454  0.0602
1.6364    0.4825   5.5503    2.5587    -0.2028  0.0411
1.9931    0.4871   8.1556    2.5681    0.2877   0.0828
1.6239    0.4938   5.3407    2.5818    -0.2708  0.0734
1.6281    0.5287   5.0140    2.6517    -0.4125  0.1701
2.1264    0.5436   8.3180    2.6809    0.2032   0.0413
1.9073    0.6317   5.7586    2.8455    -0.4458  0.1987
2.1515    0.6691   6.9180    2.9119    -0.2817  0.0793
2.7825    0.6750   11.4697   2.9222    0.4645   0.2158
2.7911    0.7031   11.0802   2.9706    0.3581   0.1283
2.8876    0.7288   11.4406   3.0141    0.3683   0.1357
2.2347    0.7396   6.7522    3.0321    -0.4336  0.1880
2.4781    0.7669   8.0083    3.0771    -0.2472  0.0611
3.1439    0.7746   12.7606   3.0896    0.4826   0.2329
3.0062    0.7991   11.3098   3.1291    0.2339   0.0547
3.1414    0.8096   12.1885   3.1460    0.3452   0.1191
3.4610    0.9155   13.0849   3.3092    0.3081   0.0949
3.2897    0.9598   11.2755   3.3747    -0.0168  0.0003
3.1178    0.9767   9.9521    3.3994    -0.2447  0.0599

Sum
68.7470   18.5613  262.5488  106.8877  0.0000   4.79585

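Sum of xi * yi: 68.74698423

Population Values

Linest(Yrange,Xrange) output:
                      slope         intercept
coefficient           2.86248555    0.570371414
standard error        0.1672891     0.101926341
r2, Se                0.85914943    0.316090895
F, D. F.              292.786686    48
SS model, SS error    29.2533291    4.795845791

TAN(ATAN[0.5*Linest(Yrange,Xrange)] + 0.5*ATAN[1/Linest(Xrange,Yrange)])
Values listed beside this formula in the sheet: 2.862485547, 0.617351796, 3.08103583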

2.07084784

Y=MX+C

ANOVA Table     D. F.    Mean Sq
Linear Model    1        29.25333
Error           48       0.09991
Total           49       0.69488

Fobs = (n-2)r2/(1 - r2) = 292.78668634406
Fcritical = #VALUE!
P-Value = 2.627950267E-22

W = (SSyy-SSxx)/(2SSxy)
Slope + = W + SQRT(W^2 + 1)
Slope - = W - SQRT(W^2 + 1)
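The Total Least Squares figures in the worksheet can be reproduced from its summary statistics with the W formulas above. This is only a sketch using the rounded values printed in the sheet, so the last digit may differ.

```python
import math

# Summary statistics as printed in the worksheet.
x_bar, y_bar = 0.54756, 2.13775
ss_xx, ss_xy, ss_yy = 3.57017, 10.21956, 34.04917

w = (ss_yy - ss_xx) / (2 * ss_xy)
slope_plus = w + math.sqrt(w * w + 1)    # the fitted TLS slope (SSxy > 0 here)
slope_minus = w - math.sqrt(w * w + 1)   # slope of the perpendicular direction
intercept = y_bar - slope_plus * x_bar   # the line passes through (x bar, y bar)

print(round(w, 5), round(slope_plus, 5), round(slope_minus, 5), round(intercept, 5))
# The two roots multiply to -1, so they describe perpendicular lines.
print(slope_plus * slope_minus)
```

Because the two roots are perpendicular, only one of them minimises the orthogonal distances; with a positive SSxy it is Slope +.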

Actually I think there's a typo in there. The left inner term should probably be 0.5*ATAN[Linest(Y,X)] because then the intuition clearly becomes an equally weighted average of the angles - not slopes - of the two regressions (y vs. x and x vs. y). I don't know of a rigorous derivation of that expression and Aaron Brown didn't give a proof.
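A quick check of the corrected expression against the worksheet's own summary statistics, as a sketch. It assumes Linest(Y,X) is the OLS slope SSxy/SSxx and Linest(X,Y) is SSxy/SSyy. On this data the average of the two angles gives a slope of about 3.081, while the W formula gives about 3.287, so the two are close but not identical here.

```python
import math

# Summary statistics as printed in the worksheet.
ss_xx, ss_xy, ss_yy = 3.57017, 10.21956, 34.04917

b_yx = ss_xy / ss_xx  # OLS slope of y on x, assumed to be Linest(Yrange, Xrange)
b_xy = ss_xy / ss_yy  # OLS slope of x on y, assumed to be Linest(Xrange, Yrange)

# Equally weighted average of the angles of the two regression lines.
angle_avg_slope = math.tan(0.5 * math.atan(b_yx) + 0.5 * math.atan(1 / b_xy))

# Total least squares slope from the W formula, for comparison.
w = (ss_yy - ss_xx) / (2 * ss_xy)
tls_slope = w + math.sqrt(w * w + 1)

print(round(b_yx, 4), round(1 / b_xy, 4))
print(round(angle_avg_slope, 4), round(tls_slope, 4))
```

Both results lie between the two OLS lines (slopes of about 2.862 and 3.332 in the y-against-x plane), which is why the angle average can look like a reasonable stand-in.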

-1.69062163


y^i Total (fitted values of the Total Least Squares line; the same 50 values as column z of the data table at the top): sum 106.88770

1/Linest(Xrange,Yrange) = 3.331766811556

0.5*ATAN[1/Linest(Xrange,Yrange)] = 0.639605065346
