comparison of high-frequency covolatility forecasts for high … · 2018-08-05 · comparison of...

Comparison of high-frequency covolatilityforecasts for high-dimensional portfolio

allocation

with N. Hautsch

My visit at the University of Pennsylvania was sponsored by the Austrian Ministry forDigital and Economic Affairs and by the Slovak Ministry of education, science, research

and sport.

Large-scale portfolio allocation based on H-F data

Goal:construction of global minimum variance portfolios for S&P 500.Crucial input:forecasts of high-dimensional covariance matrices.Hautsch, Kyj and Malec (JAE, 2015):→ H-F forecasts beat L-F in terms of portfolio volatility,→ H-F methods lead to higher portfolio turnover,→ H-F methods are worth up to 199 bp for risk-averse investor,→ H-F methods perform better for moderate-size portfolios

(e.g., 30-100 most traded stocks).Next move:would the efficient H-F approach further increase investor’s utility?How to solve practical issues with H-F data . . . as the portfolio grows:→ different sample sizes,→ asynchronous observations. . . loss of information,→ ill-conditioned/in-definite covariance matrix estimators,→ lack of computational memory,→ computation takes too long for short-term portfolio management.

2 Motivation

Content

FrameworkGlobal minimum variance portfolioPortfolio characteristicsEconomic criterion

Local method of moments estimator (LMME)Spectral approach and efficiencyClustering and regularizationQuantifying information loss

Other high-frequency estimatorsRegularization and blocking estimator (RnB)Cholesky factorization estimator (CHOLCOV)

Empirical comparison resultsLOBSTER dataPortfolio charateristicsEconomic criterion

Global minimum variance portfolioFinding optimal weights at time t for portfolio of m stocks at time t + 1

ω∗t+1 = arg minω∈Rm

ω>Σt+1ω, s.t. ω>ι = 1,

we obtain the solution

ω∗t+1 =Ξt+1ι

ι>Ξt+1ι, with Ξ = Σ−1.

Criterion for cov-mat forecasts, Patton and Sheppard (2008):pluggin Σt+1|t 6= Σt+1, we get ωt+1|t and conditional portfolio variance

ω>t+1|t Σt+1ωt+1|t > ω∗Tt+1Σt+1ω∗t+1.

→ Focus on day-ahead prediction of cov-mat,. . . we do not predict conditional mean,

→ evaluation depends on choice of ex-post measure for Σt+1,. . . sample portfolio variance is biased criterion, see Voev (2009),

→ comparison based on. . . conditional portfolio variance, turnover and shortselling,. . . performance fees a risk-averse investor would be willing to pay.

4 Framework

Portfolio characteristics

→ Conditional portfolio variance

ω>t+1|tMRKt+1ωt+1|t ,

→ portfolio turnover according to de Pooter, Martens and Dijk (2008),

m∑l=1

∣∣∣∣∣ωlt+1|t − ωl

t|t−11 + r l

1 + prt

∣∣∣∣∣ ,→ amount of short-selling

m∑l=1

ωlt+1|tI

t+1|t < 0©.

5 Framework

Example: Assume investor invests 10$ into 2 stocks:Why is the turnover from day t to day t + 1 computed as

m∑l=1

∣∣∣∣∣ωlt+1|t − ωl

t|t−11 + r l

1 + prt

∣∣∣∣∣?→ on day t at 9:30 am, investor has/knows

- 10$ capital,- optimal weights ωt|t−1 = (0.6,0.4) for day t obtained on day

t − 1 (after 4 pm),- opening prices of the 2 stocks (1.5$,0.5$).

→ so investor has 4 shares of stock a and 8 shares of stock b.→ on day t after 4 pm, investor knows/has

- closing prices of the 2 stocks are (2$,0.1$).- rt = ( 1

3 ,−45 ) and prt = − 3

25 .- 8.8$ of capital in stocks.

→ on t + 1 just before rebalancing according to ωt+1|t (9:30 amsharp)- prices of the 2 stocks are still the same (2$,0.1$).- but the weights of stocks a,b are not (0.6,0.4) but

( 4·28.8 ,

8·0.18.8 ) = (0.6,0.4)� (

1− 325,

1− 45

1− 325

).6 Framework

Economic utility criterion

We assess economic significance of a lower (conditional) portfoliovariance by the approach of Fleming, Kirby and Ostdiek (JoF, 2001).→ If investor has quadratic utility

U(prt+1) = 1 + prt+1 −γ

2(1 + γ)

(1 + prt+1

→ ∆γ1 is defined by the equality

T−1∑t=1

E[U(prIt+1)|Ft

T−1∑t=1

E[U(prIIt+1 −∆γ)|Ft

→ Interpret ∆γ as fee the investor is willing to pay in order to switchfrom allocation implied by ΣI

t+1|t to ΣIIt+1|t .

→ assuming E [rt+1|Ft ] = const., ∆γ > 0 iff pv2,I > pv2,II .

1γ = 1, 10 as in Fleming et al. (2003).7 Framework

Spectral approach and efficiencyAssume a continuous latent d-dimensional true log-price process:

y∗t = y∗0 +

0Σ1/2(s)dbs, t ∈ [0,1] .

We are estimating integrated covolatility

ICov =

0Σ(t)dt ,

Observations are non-synchronous and poluted by iid noise:

y (l)ti = y∗(l)

ti + ε(l)i , ε

(l)i ∼ N(0, η2

l ), i = 1, . . . ,nl , l = 1, . . . ,m.

Efficient LMM estimator of Bibinger, Hautsch, Malec, Reiss (2014):→ partition the trading day into K intervals of length h = 1/K ,→ assume Σk (t) = Σk and define Sjk ∼ Nm(0,Σk + bias),→ compute S(l)

jk =√

2/h∑nl

i=1 ∆y (l)i sin

Äjπh−1(t (l)

i − kh)ä.

LMME =B∑

Wjkvec(S⊗2jk − ˆbias).

8 Local method of moments estimator (LMME)

Clustering and regularization→ Computational memory and speed issues2

. . . Wjk = I−1k Ijk ∈ Rm2×m2

require J × K × 123 GB for S&P 500,. . . Wjk can not be decomposed and estimated piecewise,. . . if m > 50 LMME takes days to obtain (if feasible).

→ Application to portfolio allocation related issues. . . Σ has to be positive-definite and well-conditioned,. . . adjust LMME to be feasible and well-conditioned.

→ Divide and conquer by Hsieh et al. (2012) and Tan et al. (2015):. . . Hierarchical correlation clustering & regularization

→ GLASSO estimator for the Σ−1 denoted Ξ:

Ξλ = argmin︸︷︷︸Ξ>0

ÄTr(ΞR)− log det(Ξ) + ‖L ◦ (Ξ− diag(Ξ))‖1

→ SEC estimator by Cui, Leng and Yu (CSDA, 2016):

Rλ = argmin︸︷︷︸R−εE>0

∥∥∥R − R∥∥∥2

F+ ‖L ◦ R‖1 ,

Lij = λ > 0 should assets i and j be in the same cluster or λ→∞.2LMME impl. in Matlab is feasible for m ≤ 20 on Mac, m ≤ 150 on VSC 256 GB.

GLASSO

GLASSO - average absolute correlation

SEC - average absolute correlation

GLASSO - conditonal numbers

SEC - conditonal numbers

Divide and conquer step-by-step

On every intra-day interval 1, . . .K :→ Estimate PILOT spectral estimator, Bibinger and Reiss (2014),→ use the scaled estimator as adjacency matrix 1− |R| for HCC,

. . . Step 0, each asset is a cluster, . . . Steps 1, . . . ,m − 1 mergetwo nearest (linkage) clusters ,. . . search dendrogram top-down and find “height” λ, such thateach cluster c = 1, . . . ,C has ≤ 50 assets,

→ GLASSO/SEC on each cluster of assets with parameter λ.→ with Ξk,c

λ or Rk,cλ , compute local-cluster-LMME (now feasible),

→ put all clusters back together & replace 0’s by respective PILOTentries,

LMME-GLASSO =B∑

h · (Ξkλ)−1

LMME-SEC =B∑

h · (Σkλ)−1

Linkage

linkage description noteaverage 1

‖C1‖0‖C2‖0

∑c1∈C1,c2∈C2

corr(c1, c2) Tan et al. (2015)Tumminello et al. (2010)complete max {corr(c1, c2); c1 ∈ C1, c2 ∈ C2}

single min {corr(c1, c2); c1 ∈ C1, c2 ∈ C2} leads to trailing clusters

cluster m

asset m

cluster m − 1

assetm − 1

cluster m − 2

assetm − 2

clusterm − 3

cluster m

cluster m − 2

clusterm − 6

clusterm − 5

cluster m − 1

clusterm − 4

clusterm − 3

Quantifying information loss

Clustering and regularization lead to loss of information:→ information distance of GLASSO and SEC from efficient LMME,→ Tumminello et al. (Phys. Rev. 2007) use Kulback-Leibler distance:

K (R1,R2) =12

ÅlogÅ

det R2

det R1

ã+ trÄR−1

ä−mã.

and showed that

EîKÄR,R

(m log

n∑t=n−m+1

ψ(t/2)

Data cluster glasso sec identitySP 100 21.23 12.21 30.75 65.12SP 500 3.13 2.76 6.07 6.04

Regularization and blocking estimator (RnB)

→ start: multivariate realized kernel estimator proposed byBarndorf-Nielsen et al. (2011),

MRK =H∑

h=−H

hH + 1

ãΓ, with bandwidth H,

. . . refresh-time sampling for synchronization of prices,

. . . sample if all assets have been (re-)traded at least once,

. . . large loss of data - frequency is driven by least liquid asset,

. . . it gets worse as portfolio grows (heterogeneity).→ Hautsch et al. (2012): blocking (clustering) based on liquidity,

. . . set fixed number of liquidity groups (e.g., 4), apply refresh-timeon each group,. . . estimate G(G + 1)/2 blocks by MRK,. . . put blocks back together in hierarchical order.

→ Need regularization (beside smoothing). . . Laloux et al. (1999) propose eigenvalue cleaning.. . . idea: eigenvalues which are close to 0 are noisy and can beinflated without loosing information.

19 Other high-frequency estimators

RnB - average absolute correlation

RnB - conditonal numbers

Cholesky factorization estimator (CHOLCOV)→ Boudt et al. (2017): reorder assets according to their liquidity,

. . . estimate Σ hierachically element-by-element,

. . . hierarchical Cholesky factorization Σk = BVB>,

1 0 . . . 0

b21 1 . . . 0...

.... . .

...bm1 bm2 . . . 1

and V =

v11 0 . . . 00 v22 . . . 0...

.... . .

...0 0 . . . vmm

.. . . factorization of the returns

fτi = B−1xτi ∼ N(0, (τi − τi−1)V ),

. . . where τ1, . . . , τn are synchronized observation times,

. . . convenient hierarchical regression representation

xl,τ = fl,τbl,1 + . . . ,+fl−1,τbl,l−1 + fl,τ

. . . regression coefficients bij = E [ri , fj ]/E [f 2j ], j < i and vii = E [f 2

i ],→ We: uni/bi- local spectral estimator of Bibinger and Reiss (2014).

CHOLCOV - average absolute correlation

CHOLCOV - conditonal numbers

LOBSTER data from NASDAQ data feed

→ mid-quote data on 100 and 426 most liquid assets of S&P 500,. . . sample period 1/1/2012− 31/12/2015, i.e. 994 trading days,. . . raw quotes cleaned as in Barndorf-Nielsen et al. (2009),. . . quotes revised for 0 returns, reduction of comp. burden.. . . e.g., for AAPL on 1/Jan/2012: 104 711→ 39 843, i.e. 62%,. . . the least liquid asset on this day was XRX with only 436observations after revisions,

→ compute PILOT, GLASSO, SEC, RnB and CHOLCOV for allt = 1 . . . ,993, . . . estimators are smoothed across 1,2,5,20 days,. . . day t estimates are forecasts for t + 1

→ performance depends on choice of nuisance parameters,. . . for spectral approach, B, J according to Bibinger et al. (2018),. . . for RnB, G,H and others according to Hautsch et al. (2015),. . . for clustering and regularization: number of clusters, size,. . . arbitrary choice: smoothing after regularization, MRK asex-post approximation.

25 Empirical comparison results

Portfolio charateristics

10 PILOTSECGLASSO

CHOLCOVRnB

1 5 20

PV: A) SP 100 - PV, B) SP 500 PT C) SP 100, D) SP 50026 Empirical comparison results

Economic criterion

PILOTGLASSO

1 5 20

CHOLCOV: A) SP 100 B) SP 500 RV5min: C) SP 100 D) SP 50027 Empirical comparison results

Economic criterion

PILOTGLASSO

1 5 20

CHOLCOV: A) SP 100 B) SP 500 PILOT: C) SP 100 D) SP 50028 Empirical comparison results

Thank you!

comparison of high-frequency covolatility forecasts for high … · 2018-08-05 · comparison of...

Documents

technical explanation for rfid systemsmedium frequency hf...

rainfall frequency and extreme forecasts wet days and wet...

high frequency amplifiers

high (?) frequency receivers. high (?) frequency rxs

p1.11 high impact gridded weather forecasts steve …

high frequency phrases

stochastic forecasts achieve high throughput and low …

jk,...

very high frequency bipolar junction transistor frequency

do high-frequency measures of volatility improve forecasts...

lmx2531 high performance frequency synthesizer system...

high frequency words

for high frequency sensor module pmsm - e-matic · high...

electromagnetic spectrum low energy high energy low...

producing high-skill probabilistic forecasts using ... ·...

high-frequency phrases

rfid high frequency

surface-mounting high-frequency relay g6z ·...

low-frequency stimulation cancels the high-frequency

high frequency trading