Realized Networks
Christian Brownlees† Eulalia Nualart† Yucheng Sun†
October 2014
Abstract
In this work we introduce a lasso based regularization procedure for large dimensional realized covariance estimators of log-prices. The procedure consists of shrinking the off-diagonal entries of the inverse realized covariance matrix towards zero. This technique produces covariance estimators that are positive definite and have a sparse inverse. We name the regularized estimator realized network, since estimating a sparse inverse covariance matrix is equivalent to detecting the partial correlation network structure of the log-prices. We focus in particular on applying this technique to the Multivariate Realized Kernel and the Two–Scales Realized Covariance estimators based on refresh time sampling. These are consistent covariance estimators that allow for market microstructure effects and asynchronous trading. The paper studies the large sample properties of the regularized estimators and establishes conditions for consistent estimation of the integrated covariance and for consistent selection of the partial correlation network. As a by-product of the theory, we also establish novel concentration inequalities for the Multivariate Realized Kernel estimator. The methodology is illustrated with an application to a panel of US bluechips throughout 2009. Results show that the realized network estimator outperforms its unregularized counterpart in an out-of-sample global minimum variance portfolio prediction exercise.
Keywords: Networks, Realized Covariance, Lasso
JEL: C13, C33, C52, C58
† Department of Economics and Business, Universitat Pompeu Fabra and Barcelona GSE, e-mail: [email protected], [email protected], [email protected]. We are grateful for useful comments to Peter Hansen, Nour Meddahi and Kevin Sheppard and seminar participants at the “Measuring and Modeling Financial Risk with High Frequency Data” conference (Florence, 19–20 June 2014) and the “Barcelona GSE Summer Forum: Time Series Analysis in Macro and Finance” workshop (Barcelona, 19–20 June 2014). Christian Brownlees acknowledges financial support from the Spanish Ministry of Science and Technology (Grant MTM2012-37195) and from the Spanish Ministry of Economy and Competitiveness, through the Severo Ochoa Programme for Centres of Excellence in R&D (SEV-2011-0075). Eulalia Nualart’s research is supported in part by the European Union programme FP7-PEOPLE-2012-CIG under grant agreement 333938.
1 Introduction
The covariance matrix of the log–prices of financial assets is a fundamental ingredient in
many applications ranging from asset allocation to risk management. For more than a
decade now the econometric literature has made a number of significant steps forward in
the estimation of covariance matrices using financial high frequency data. This new gener-
ation of estimators, commonly known as realized covariance estimators, measure precisely
the daily covariance of log–prices using intra-daily price information. The literature has
proposed an extensive number of procedures that make it possible to estimate the covariance efficiently under general assumptions, such as the presence of market microstructure effects
and asynchronous trading in the price data generating process.
Despite the significant leaps forward, carrying out inference on large covariance ma-
trices is inherently challenging. As forcefully put forward by, inter alia, Ledoit and Wolf
(2004), it is hard to estimate precisely the covariance matrix when the number of as-
sets is large. Another hurdle that arises in the analysis of large multivariate systems is
synthesizing effectively the information contained in the covariance matrix, in order to
unveil the dependence structure of the assets. In this work we propose a realized co-
variance estimation strategy that tackles simultaneously both of these challenges. The
estimation approach consists of using lasso–type shrinkage to regularize a realized co-
variance estimator. The lasso procedure identifies nonzero partial correlation relations
among the assets and therefore allows us to represent the dependence structure of the assets as a network. If the partial correlation structure of the assets is sufficiently sparse,
then the regularized estimator can be substantially more precise than its unregularized
counterpart.
The framework we work in makes a number of fairly common assumptions on the
dynamics of the asset prices (cf Bandi and Russell, 2006; Ait-Sahalia, Mykland, and
Zhang, 2005; Fan, Li, and Yu, 2012). We assume that observed log prices are equal to
the efficient log prices, which are Brownian martingales, plus a noise term that is due
to market microstructure frictions. Prices are observed according to the realization of
a counting process driving the arrival of trades/quotes of each asset and are allowed to
be asynchronous. The target estimand of interest is the integrated covariance matrix of the
efficient log prices.
We introduce a network definition for a collection of assets induced by the integrated
covariance, which we call the integrated partial correlation network. Assets i and j are
connected in the integrated partial correlation network iff the partial correlation between
i and j implied by the integrated covariance is nonzero. As is well known, the network
is entirely characterized by the inverse of the integrated covariance matrix, the integrated
concentration matrix. In fact, it has been known since at least Dempster (1972) that
if the (i, j)–th entry of the inverse covariance matrix is zero, then variables i and j are
partially uncorrelated, that is, uncorrelated conditional on all other assets. Thus, the
sparsity structure of the integrated concentration matrix determines the partial correlation
network dependence structure among assets.
We propose a lasso–type procedure to obtain sparse integrated concentration ma-
trix estimators. The procedure is based on regularizing a consistent realized covariance
estimator. Several realized covariance estimators have been introduced in the literature
in the presence of market microstructure effects and asynchronous trading. In this work
we focus in particular on the Multivariate Realized Kernel (MRK) and the Two–Scales
Realized Covariance estimators (TSRC) based on pairwise refresh sampling (Ait-Sahalia
et al., 2005; Barndorff-Nielsen, Hansen, Lunde, and Shephard, 2011; Fan et al., 2012).
These estimators are regularized using the glasso (Friedman, Hastie, and Tibshirani,
2011), which shrinks the off–diagonal elements of the inverse of the covariance estimators
to zero. The procedure allows us to detect the nonzero linkages of the integrated partial
correlation network. Moreover, the sparse integrated concentration matrix estimator can
be inverted to obtain an estimator of the integrated covariance.
We study the large sample properties of the realized network estimator, and estab-
lish conditions for consistent estimation of the integrated concentration and consistent
selection of the integrated partial correlation network. We develop the theory for the
MRK and TSRC estimators based on pairwise refresh sampling, building on the general asymptotic theory developed by Ravikumar, Wainwright, Raskutti, and Yu (2011).
The MRK estimator results are obtained by developing a novel concentration inequality,
while for the TSRC estimator we apply a concentration inequality derived in Fan et al.
(2012). Results are established in a high–dimensional setting, that is, we allow the total number of parameters to be larger than the number of observations available, provided that the proportion of nonzero parameters is small relative to the total. We
conjecture that other sufficiently well behaved realized covariance estimators should share
analogous properties. We emphasize that the theory is developed under essentially the
same assumptions for both estimators. In particular, we assume the market microstruc-
ture noise that contaminates prices is iid. It is important to stress however that the MRK
can also handle more complex types of market microstructure dependence but that we do
not take advantage of this in this current work. A simulation study used to investigate
the finite sample properties of the procedure shows that when the underlying network is
sparse, the realized network provides substantial gains in terms of estimation precision.
We apply the realized network methodology to analyse the network structure of a
panel of US bluechips throughout 2009 using the MRK, TSRC as well as the classic Realized
Covariance (RC) estimators. More precisely, we use the realized network to regularize the
so-called idiosyncratic realized covariance matrix, that is, the residual covariance matrix
of the assets after netting out the influence of the market factor. Results show that after
controlling for the market factor, assets still exhibit a significant amount of cross–sectional
dependence. A number of empirical findings emerge from the analysis of the estimated
networks. Interestingly, several network characteristics are fairly stable throughout the
year. The number of links in the network is around 5% of the total possible number of
linkages. The distribution of the connections of the assets exhibits power law behavior, that is, the number of connections is heterogeneous and the most interconnected stocks have a large number of connections relative to the total number of links. The stocks in the industrials and energy sectors show a high degree of sectoral clustering, that is, there is a large number of connections among firms in these industry groups. Technology companies
and Google in particular are the most highly interconnected firms throughout the year.
We conjecture that this could be driven by the fact that, starting from March 2009 with the beginning of the post-crisis market rally, technology stocks and Google in particular have
been the most rapidly recovering stocks of the year. We investigate the usefulness of
our procedure from a forecasting perspective by carrying out a Markowitz type Global
Minimum Variance (GMV) portfolio prediction exercise. We run a horse race among
different covariance estimators to assess which estimator produces GMV portfolio weights
that deliver the minimum out–of–sample GMV portfolio variance. Results show that the
realized network significantly improves prediction accuracy irrespective of the covariance
estimator used.
Our contribution is related to different strands of the literature. First we contribute to
the literature on realized volatility and realized covariance estimation (Andersen, Boller-
slev, Diebold, and Labys, 2003; Barndorff-Nielsen and Shephard, 2004) in the presence of
market microstructure effects and asynchronous trading. Contributions to this strand of the literature include Ait-Sahalia et al. (2005); Bandi and Russell (2006); Barndorff-
Nielsen, Hansen, Lunde, and Shephard (2008); Barndorff-Nielsen et al. (2011); Zhang
(2011); Fan et al. (2012). More specifically, this work builds on the literature concerned with the estimation and regularization of possibly large realized covariances. One
of the first significant contributions in the area is Hautsch, Kyj, and Oomen (2012). Im-
portant research in this field also includes Hautsch, Kyj, and Malec (2013), Corsi, Peluso,
and Audrino (2015) and Lunde, Shephard, and Sheppard (2011). Last, this paper re-
lates to network modelling in statistics and econometrics, which includes Meinshausen
and Buhlmann (2006), Billio, Getmansky, Lo, and Pellizzon (2012), Diebold and Yilmaz
(2014), Hautsch, Schaumburg, and Schienle (2014a,b), Barigozzi and Brownlees (2013),
Banerjee and Ghaoui (2008).
The rest of the paper is structured as follows. In Section 2 we present the model
and the realized network estimator. The theoretical properties of the estimator are anal-
ysed in Section 3. Section 4 contains an illustration of the realized network estimator
using simulated data, while Section 5 presents an application to a panel of US bluechips.
Concluding remarks follow in Section 6.
2 Realized Networks
2.1 The Efficient Price
Let (y(t), t ≥ 0) denote the n-dimensional log–price process of n assets. We assume that
y(t) is a Brownian martingale whose dynamics are given by
y(t) = ∫_0^t Θ(u) dB(u) ,   (1)
where B(u) is an n-dimensional Brownian motion with independent components defined
on a filtered probability space (Ω,F ,Ft,P). The spot covolatility process Θ(u) is an n×n
positive definite random matrix whose elements are almost surely continuous, and adapted
to the filtration Ft. We consider our process over a fixed time interval of length h > 0,
which typically represents a day. To simplify notation we assume h equals one, and we set y = y(1). Also, we define the i–th entry of y as y_i, for i = 1, . . . , n. The ex–post
covariance matrix of y is given by
Var(y) = E(yy′) = ∫_0^1 Σ(t) dt ,

where Σ(t) = Θ(t)Θ(t)′ = (σ_ij(t)) is the spot covariance matrix. We define the integrated covariance as Σ⋆ = ∫_0^1 Σ(t) dt = (σ⋆_ij).
We impose the following regularity assumption on the volatility dynamics in order to
derive the main results of this paper.
Assumption 1. For any i, j ∈ {1, . . . , n} and t ∈ [0, 1], σ_ij(t) is a continuous stochastic process which (1a) either is bounded on [0, 1]; (1b) or satisfies, for all x > 0,

P( sup_{0≤t≤1} |σ_ij(t)| ≥ x ) ≤ kσ exp(−a x^b) ,
for some positive constants kσ, a and b.
That is, the spot volatilities are assumed to be uniformly bounded on [0, 1], or to
satisfy an exponential tail behavior.
2.2 The Integrated Partial Correlation Network
We introduce a network representation of y that is based on the partial linear dependence
structure induced by the integrated covariance matrix. We call this network representation
the integrated partial correlation network. A network is defined as an undirected graph G = (V, E), where V = {1, 2, . . . , n} is the set of vertices and E ⊂ V × V is the set of edges. In the integrated partial correlation network the set of vertices corresponds to the set of assets, and a pair of assets is connected by an edge iff their daily returns are partially correlated given all other daily returns in the panel, that is,

E = { (i, j) ∈ V × V : ρ_ij ≠ 0, i ≠ j } ,
where ρ_ij is the integrated partial correlation, a measure of linear cross-sectional conditional dependence defined as

ρ_ij = Cor(y_i, y_j | y_k, k ≠ i, j) .
The integrated partial correlations can be characterized by the inverse integrated covariance matrix K⋆ = (Σ⋆)^{−1} = (k⋆_ij). We call K⋆ the integrated concentration matrix. The integrated partial correlations can be expressed as

ρ_ij = −k⋆_ij / √(k⋆_ii k⋆_jj) .

Thus, the integrated partial correlation network can be equivalently defined as

E = { (i, j) ∈ V × V : k⋆_ij ≠ 0, i ≠ j } .
This characterization of the integrated partial correlation network is convenient for
estimation. The detection of the network structure among the assets is equivalent to
recovering the zeros of the integrated concentration matrix from the data.
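As an illustration of this characterization, the following sketch (a hypothetical numerical example, not taken from the paper) recovers the partial correlations and the edge set from a given concentration matrix using ρ_ij = −k_ij/√(k_ii k_jj):

```python
import numpy as np

def partial_correlation_network(K, tol=1e-10):
    """Edges and partial correlations implied by a concentration matrix K.

    rho_ij = -k_ij / sqrt(k_ii * k_jj); an edge (i, j) is present iff k_ij != 0.
    """
    K = np.asarray(K, dtype=float)
    d = np.sqrt(np.diag(K))
    rho = -K / np.outer(d, d)      # rho_ij = -k_ij / sqrt(k_ii k_jj)
    np.fill_diagonal(rho, 1.0)     # diagonal set to 1 by convention
    n = K.shape[0]
    edges = {(i, j) for i in range(n) for j in range(i + 1, n)
             if abs(K[i, j]) > tol}
    return rho, edges

# Hypothetical 3-asset concentration matrix: k_02 = 0, so assets 0 and 2
# are partially uncorrelated and the edge (0, 2) is absent.
K = np.array([[ 2.0, -0.8,  0.0],
              [-0.8,  2.0, -0.5],
              [ 0.0, -0.5,  1.0]])
rho, edges = partial_correlation_network(K)   # edges -> {(0, 1), (1, 2)}
```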
Also, notice that if the spot volatility is deterministic then absence of partial correlation implies conditional independence between two assets.
2.3 Market Microstructure Noise & Asynchronous Trading
It is customary to assume that rather than the efficient price, the econometrician observes
the transaction (or midquote) price. This differs from the efficient price because trades
(quotes) (i) are affected by an array of market frictions that go under the umbrella term
of market microstructure and, (ii) are observed at discrete time points. To this extent, we
assume that for each asset i, i = 1, . . . , n, the econometrician observes at timestamps T_i = {t_i1, t_i2, . . . , t_im_i} the transaction (or midquote) prices x_i(t_ik) defined as

x_i(t_ik) = y_i(t_ik) + u_i(t_ik) ,

where u_i(t_ik) denotes the microstructure noise associated with the k–th trade. We also use the shorthand notation x_ik, y_ik and u_ik to denote respectively x_i(t_ik), y_i(t_ik), and u_i(t_ik).
We assume that u_ik is a discrete process, as we model it only when a new observed price for stock i occurs, and it is such that

u_ik ~ i.i.d. N(0, η_i^2) ,

where η_i^2 is some positive constant. The noise process is independent of the efficient price
process, and in order to simplify the exposition, we use the same notation (Ω,F ,Ft,P) for
the product filtered probability space of the two processes. Moreover the noise processes
of different assets are independent of each other.
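A minimal simulation sketch of this observation model for a single asset, under simplifying assumptions not made in the paper (constant spot volatility and uniformly scattered observation times):

```python
import numpy as np

rng = np.random.default_rng(0)

def simulate_observed_prices(n_obs, sigma=0.2, eta=0.001):
    """Simulate observed log-prices x_ik = y_i(t_ik) + u_ik for one asset on [0, 1].

    Simplifying assumptions (not made in the paper): constant spot volatility
    sigma, and observation times drawn uniformly on [0, 1]. The noise u_ik is
    i.i.d. N(0, eta^2) and attached only to the discrete observation times.
    """
    t = np.sort(rng.uniform(0.0, 1.0, size=n_obs))       # irregular timestamps
    dt = np.diff(np.concatenate(([0.0], t)))
    y = np.cumsum(sigma * np.sqrt(dt) * rng.standard_normal(n_obs))  # efficient price
    u = eta * rng.standard_normal(n_obs)                 # microstructure noise
    return t, y + u

t, x = simulate_observed_prices(1000)
```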
2.4 Factor Structure
Classical asset pricing theory models like the capm or apt imply that the unexpected
rate of return of risky assets can be expressed as a linear function of few common factors
and an idiosyncratic component. Factors induce a fully interconnected partial correlation
network structure. In these cases, it is more interesting to carry out network analysis on
the partial correlation structure of the assets after netting out the influence of common
sources of variation.
We augment the y process with k additional components representing factors. The
dynamics of the augmented system are assumed to be the same as the one described in (1).
Moreover, the factors are assumed to be observed, as it is commonly done in the empirical
finance literature. The integrated covariance of the system can then be partitioned as an
(n + k) × (n + k) matrix

Σ⋆ = [ Σ⋆_AA  Σ⋆_AF ; Σ⋆_FA  Σ⋆_FF ] ,   (2)
where A and F denote, respectively, the blocks of assets and factors.
We define the idiosyncratic integrated covariance as the covariance of the residuals
obtained by projecting the factors onto the assets, which is given by
Σ⋆_I = Σ⋆_AA − Σ⋆_AF [Σ⋆_FF]^{−1} Σ⋆_FA = (σ⋆_I ij) .
Then we define the idiosyncratic integrated partial correlation network as the network
whose set of edges is given by
E_I = { (i, j) ∈ V × V : k⋆_I ij ≠ 0, i ≠ j } ,

where k⋆_I ij is the (i, j)–entry of the matrix K⋆_I = (Σ⋆_I)^{−1}.
2.5 Realized Covariance Estimation
Our strategy for the estimation of the integrated concentration matrix K⋆ is based on regularizing a consistent estimator of the integrated covariance Σ⋆. In this section we introduce two well known estimators of the integrated covariance that are robust to microstructure noise: the Multivariate Realized Kernel and the Two–Scales Realized Covariance estimators based on pairwise refresh time.
2.5.1 Pairwise Refresh Time
A challenge in the estimation of realized covariation is dealing with the asynchronicity of the observed prices. A standard technique used to overcome this hurdle is refresh time,
introduced by Barndorff-Nielsen et al. (2011). Several variants of this technique exist,
like the pairwise and groupwise refresh time approaches, used in Fan et al. (2012), Lunde
et al. (2011) and Hautsch et al. (2012). In this work we focus on pairwise refresh time.
Pairwise refresh time based covariance estimation consists of estimating each entry of
the covariance separately. The i, j–entry of the matrix is computed by first synchronizing
the observations of assets i and j using refresh time and then estimating the covariance
between assets i and j using the synchronized data. Notice that this approach does not
guarantee that the covariance estimator is positive definite, as each covariance entry is
estimated using different subsets of observations.
The refresh time method works as follows. Consider two stocks i and j observed at times T_i = {t_i1, . . . , t_im_i} and T_j = {t_j1, . . . , t_jm_j} on [0, 1]. The set of pairwise refresh time points is T_ij = {τ_0, τ_1, . . . , τ_{m−1}, τ_m, τ_{m+1}}, where 0 = τ_0 < τ_1 < . . . < τ_{m−1} < τ_m ≤ τ_{m+1} = 1 and m is the number of pairwise refresh time observations for stocks i and j on [0, 1]. For 0 < k ≤ m, τ_k is determined by

τ_k = max{ min{t ∈ T_i : t > τ_{k−1}} , min{t ∈ T_j : t > τ_{k−1}} } .

The timestamps for assets i and j that correspond to the refresh time τ_k are respectively

t^r_{i,k} = max{t ∈ T_i : t ≤ τ_k}   and   t^r_{j,k} = max{t ∈ T_j : t ≤ τ_k} .

Accordingly, we obtain the refresh sample prices, which are x^r_i = {x_i(t^r_{i,1}), . . . , x_i(t^r_{i,m})} for stock i and x^r_j = {x_j(t^r_{j,1}), . . . , x_j(t^r_{j,m})} for stock j. We will use the shorthand notation x^r_{ℓ,k}, y^r_{ℓ,k} and u^r_{ℓ,k} to denote x_ℓ(t^r_{ℓ,k}), y_ℓ(t^r_{ℓ,k}) and u_ℓ(t^r_{ℓ,k}), respectively.
2.5.2 Realized Kernel Estimator
The Multivariate Realized Kernel (MRK) is the multivariate extension of the Realized
Kernel estimator proposed in Barndorff-Nielsen et al. (2008). The MRK estimator is
denoted by
Σ_K = (σ_K ij) ,

and its (i, j)–entry is defined as

σ_K ij = γ_0(x^r_i, x^r_j) + (1/2) ∑_{h=1}^{H} k((h−1)/H) [ γ_h(x^r_i, x^r_j) + γ_{−h}(x^r_i, x^r_j) + γ_h(x^r_j, x^r_i) + γ_{−h}(x^r_j, x^r_i) ] ,   (3)

where for each h ∈ {−H, −H+1, . . . , −1, 0, 1, . . . , H−1, H},

γ_h(x^r_i, x^r_j) = ∑_{p=1}^{m−1} (x^r_{i,p} − x^r_{i,p−1})(x^r_{j,p−h} − x^r_{j,p−h−1}) .
The kernel k(x) is a weighting function satisfying the following assumption.
Assumption 2. The function k : [0, 1] → R is bounded and twice differentiable with bounded derivatives, and satisfies k(1) = k′(0) = k′(1) = 0 and k(0) = 1.
A typical choice of the kernel weighting function which we also use in this work is the
Parzen kernel, defined as,
k(x) =
1− 6x2 + 6x3 0 ≤ x ≤ 12
2(1− x)3 12< x ≤ 1.
Notice that when i = j, the estimator in (3) is equivalent to the univariate realized kernel
estimator of Barndorff-Nielsen et al. (2008). For a choice of H of order O_P(m^{1/2}), this estimator is convergent at rate m^{1/4}.
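A sketch of the Parzen weight function and of one entry of estimator (3) follows (illustrative only, ignoring the end effects and bandwidth selection discussed below); it uses the identity γ_{−h}(x_i, x_j) = γ_h(x_j, x_i) to collapse the four lagged terms:

```python
import numpy as np

def parzen(x):
    """Parzen weight function: k(0) = 1 and k(1) = k'(0) = k'(1) = 0."""
    x = abs(x)
    if x <= 0.5:
        return 1.0 - 6.0 * x**2 + 6.0 * x**3
    if x <= 1.0:
        return 2.0 * (1.0 - x)**3
    return 0.0

def gamma_h(ri, rj, h):
    """Realized autocovariance gamma_h from return vectors ri, rj, for h >= 0."""
    if h == 0:
        return float(np.dot(ri, rj))
    return float(np.dot(ri[h:], rj[:-h]))

def mrk_entry(xi, xj, H):
    """One entry of estimator (3) from refresh-sampled log-price arrays xi, xj.

    Since gamma_{-h}(x_i, x_j) = gamma_h(x_j, x_i), the four lagged terms in (3)
    collapse to 2 * (gamma_h(ri, rj) + gamma_h(rj, ri)), which cancels the 1/2.
    """
    ri, rj = np.diff(xi), np.diff(xj)
    s = gamma_h(ri, rj, 0)
    for h in range(1, H + 1):
        s += parzen((h - 1) / H) * (gamma_h(ri, rj, h) + gamma_h(rj, ri, h))
    return s
```

By construction the estimator is symmetric in the two assets, and degenerates to zero for constant price paths.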
A number of remarks on the estimator are in order. First, MRK estimation based on pairwise refresh sampling with a pair-specific optimal bandwidth has already been put forward in the literature by Lunde et al. (2011). Second, we note that the properties of the estimator depend on the family of kernel functions adopted. There are two main
classes put forward in the literature: flat–top and non-flat-top kernels. In this paper we consider the former, as flat–top kernels have better convergence properties. Non-flat-top kernels have a slower rate of convergence but can handle dependent market microstructure noise, which we do not consider in this work. See also the discussion of this point in Barndorff-Nielsen et al. (2011).
For simplicity, in this work we do not consider jittering. Thus, in order to estimate the realized variance or covariance on [0, 1], we use some prices that occur before time 0 and after time 1. In the empirical implementation, for each pair we choose H as the
(rounded) average of the optimal bandwidth for the realized kernel volatility estimator
of the two assets following the procedure detailed in Barndorff-Nielsen et al. (2008) and
Barndorff-Nielsen, Hansen, Lunde, and Shephard (2009).
Of particular importance for this work is a concentration inequality for the MRK estimator, which establishes the rate at which the entries of Σ_K converge to their counterparts in Σ⋆ as the sample size increases. In order to obtain the expression of the inequality, we make the following assumptions:
Assumption 3. The observation times are independent of the y and u processes. For any assets i, j ∈ V, the pairwise refresh times satisfy sup_{1<k≤m+1} m(τ_k − τ_{k−1}) ≤ c_∆ < ∞, where c_∆ is some positive constant, and m is the number of refresh times on [0, 1].
Assumption 4. The parameter H of the MRK equals c_1 m^{1/2}, for some c_1 > 0.
Theorem 1. Assume Assumptions 1a, 2, 3 and 4 hold. Then there are positive constants a1, a2 and a3, such that for large m, x ∈ [0, a1], and i, j ∈ V,

P( |σ_K ij − σ⋆_ij| > x ) ≤ a2 m e^{−a3 m^{1/4} x} .
Observe that the function f(m) = a2 m e^{−a3 m^{1/4} x} is decreasing on ((4/(a3 x))^4, ∞). Hence, if we denote by M the minimum pairwise refresh sample size across all pairs of stocks, then for all i, j ∈ V and x ∈ (4/(a3 M^{1/4}), a1], we have

P( |σ_K ij − σ⋆_ij| > x ) ≤ a2 M e^{−a3 M^{1/4} x} .

If x ∈ [0, 4/(a3 M^{1/4})], then a2 M e^{−a3 M^{1/4} x} ≥ a2 M e^{−4} ≥ 1 when M is large enough. Consequently, for large M, x ∈ [0, a1], and i, j ∈ V, it holds that

P( |σ_K ij − σ⋆_ij| > x ) ≤ a2 M e^{−a3 M^{1/4} x} .   (4)
2.5.3 Two–Scale Realized Covariance Estimator
The Two–Scale Realized Covariance (TSRC) is the multivariate extension of the Two–Scales estimator proposed by Ait-Sahalia et al. (2005). The TSRC estimator is denoted by

Σ_TS = (σ_TS ij) ,
and its generic (i, j)–entry is defined as

σ_TS ij = (1/K) ∑_{k=K+1}^{m} (x^r_{i,k} − x^r_{i,k−K})(x^r_{j,k} − x^r_{j,k−K}) − (m_K / m_J)(1/J) ∑_{k=J+1}^{m} (x^r_{i,k} − x^r_{i,k−J})(x^r_{j,k} − x^r_{j,k−J}) ,

where m_K = (m − K + 1)/K and m_J = (m − J + 1)/J.
Zhang (2011) shows that the optimal choice of K has order K = O_P(m^{2/3}), and J can be taken to be a constant such as 1. Under this choice, the concentration inequality derived by Fan et al. (2012) shows that the TSRC estimator converges at rate m^{1/6}. In
the empirical implementation, for each pair, we choose J as 1 and K as the (rounded)
average of the optimal bandwidth for the realized two scale volatility estimator of the two
assets, following the procedure detailed in Ait-Sahalia et al. (2005).
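One entry of the TSRC estimator can be sketched as follows (illustrative only; the input arrays are assumed to hold the m refresh-sampled log-prices of the pair):

```python
import numpy as np

def tsrc_entry(xi, xj, K, J=1):
    """One entry of the TSRC estimator from the m refresh-sampled log-prices.

    The slow K-lag scale estimates covariance plus a noise-induced bias; the
    fast J-lag scale, rescaled by m_K / m_J, subtracts that bias out.
    """
    xi, xj = np.asarray(xi, dtype=float), np.asarray(xj, dtype=float)
    m = len(xi)                         # prices indexed k = 1, ..., m in the text
    m_K = (m - K + 1) / K
    m_J = (m - J + 1) / J
    slow = np.dot(xi[K:] - xi[:-K], xj[K:] - xj[:-K]) / K
    fast = np.dot(xi[J:] - xi[:-J], xj[J:] - xj[:-J]) / J
    return slow - (m_K / m_J) * fast

# Hand-checkable example: prices [0, 1, 2, 3] with K = 2, J = 1 give
# slow = 4, fast = 3, m_K = 1.5, m_J = 4, so 4 - (1.5 / 4) * 3 = 2.875.
val = tsrc_entry([0.0, 1.0, 2.0, 3.0], [0.0, 1.0, 2.0, 3.0], K=2)
```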
We next state the concentration inequality proved in Fan et al. (2012), which requires
the following assumption:
Assumption 5. The TSRC parameters are J = 1 and m_K is such that (1/2) m^{1/3} ≤ m_K ≤ 2 m^{1/3}.
Theorem 2. Assume Assumptions 1, 3 and 5 hold. Then there are positive constants
a1, a2 and a3 such that for all i, j ∈ V:
(i) If Assumption 1a is true, then for large m and x ∈ [0, a1 m^{1/6}],

P( m^{1/6} |σ_TS ij − σ⋆_ij| > x ) ≤ a2 exp(−a3 x^2) .

(ii) If Assumption 1b is true, then for large m and x ∈ [0, a1 m^{(4+b)/(6b)}],

P( m^{1/6} |σ_TS ij − σ⋆_ij| > x ) ≤ a2 exp(−a3 x^{2b/(4+b)}) ,

where b > 0 is the constant from Assumption 1b.
Following the same reasoning as in the previous section, if we define M as the minimum pairwise refresh sample size across all pairs of stocks, then under Assumption 1a, for large M, x ∈ [0, a1] and i, j ∈ V,

P( |σ_TS ij − σ⋆_ij| > x ) ≤ a2 e^{−a3 M^{1/3} x^2} ,   (5)
and a similar inequality holds under Assumption 1b.
2.5.4 Idiosyncratic Realized Covariance Estimation
The MRK and TSRC estimators of the integrated covariance can also be used to estimate
the idiosyncratic integrated covariance. Let Σ be either the MRK or TSRC estimator and partition the covariance matrix analogously to equation (2) as

Σ = [ Σ_AA  Σ_AF ; Σ_FA  Σ_FF ] .
We then define the idiosyncratic realized covariance estimator Σ_I = (σ_I ij) as

Σ_I = (σ_I ij) = Σ_AA − Σ_AF [Σ_FF]^{−1} Σ_FA .   (6)
The following corollary establishes the concentration properties of the estimator ΣI
building upon Theorems 1 and 2.
Corollary 1. When the assumptions of Theorem 1 hold and Σ is the MRK estimator, there are positive constants a1, a2 and a3 such that for large M, x ∈ [0, a1], and i, j ∈ V,

P( |σ_I ij − σ⋆_I ij| > x ) ≤ a2 M e^{−a3 M^{1/4} x} .   (7)

When the assumptions of Theorem 2(i) hold and Σ is the TSRC estimator, there are positive constants a1, a2 and a3, such that for large M, x ∈ [0, a1], and i, j ∈ V,

P( |σ_I ij − σ⋆_I ij| > x ) ≤ a2 e^{−a3 M^{1/3} x^2} ,

and a similar inequality holds under the assumptions of Theorem 2(ii).
2.6 Realized Network Estimator
We use the graphical lasso procedure (glasso) to estimate the integrated concentration matrix K⋆ and the idiosyncratic integrated concentration matrix K⋆_I. Here, we focus on the integrated concentration estimation only, since the idiosyncratic integrated concentration case is analogous. Let Σ denote either the MRK or TSRC covariance estimator.
The realized network estimator is defined as

K = argmin_{K ∈ S_n} { tr(Σ K) − log det(K) + λ ∑_{i≠j} |k_ij| } ,   (8)

where λ ≥ 0 and S_n is the set of n × n symmetric positive definite matrices. We denote by
(kij) the entries of the realized network estimator K. Notice that any consistent estimator
of the integrated covariance Σ? that satisfies an appropriate concentration inequality can
be used.
The realized network estimator is a shrinkage type estimator. If we set λ = 0 in (8), we obtain the normal log-likelihood function of the covariance matrix, which is minimized by the inverse realized covariance estimator $\Sigma^{-1}$. If λ is positive, (8) becomes a penalized likelihood function with a penalty equal to the sum of the absolute values of the off-diagonal entries of the estimator. The important feature of the absolute value penalty is that for λ > 0 some of the entries of the realized network estimator are set to exactly zero. A key feature of this type of estimator is that it simultaneously estimates and selects the nonzero entries of K⋆.
Banerjee and Ghaoui (2008) show that the optimization problem in (8) can be solved
through a series of lasso regressions. Define
$$\Sigma = \begin{pmatrix} \Sigma_{11} & \Sigma_{12} \\ \Sigma_{12}' & \Sigma_{22} \end{pmatrix} \quad \text{and} \quad K^{-1} = W = \begin{pmatrix} W_{11} & w_{12} \\ w_{12}' & w_{22} \end{pmatrix}, \tag{9}$$
where Σ11 and W11 are (n−1)×(n−1) matrices, and Σ12 and w12 are (n−1)×1 vectors.
Suppose that the values of $W_{11}$ and $w_{22}$ are known and that we are concerned with finding the $w_{12}$ that minimizes (8). Banerjee and Ghaoui (2008) consider the following minimization problem
$$\min_{\beta}\left\{\frac{1}{2}\left(W_{11}^{1/2}\beta - b\right)'\left(W_{11}^{1/2}\beta - b\right) + \lambda\|\beta\|_{1}\right\}, \tag{10}$$
where $b = W_{11}^{-1/2}\Sigma_{12}$ and $\|\beta\|_1$ is the sum of the absolute values of the entries of β. The authors then show that (10) is the dual problem of (8), in the sense that if β solves (10), then $w_{12} = W_{11}\beta$ solves (8). Note that the optimization problem in (10) can be cast as a lasso regression estimation problem.
These results motivate the following iterative algorithm to solve (8) (Friedman, Hastie, Höfling, and Tibshirani, 2007). Let W denote an initial estimate of $K^{-1}$, for example W = Σ + λI_n, where I_n is the identity matrix of order n. In each iteration of the algorithm, for each asset i we permute the rows and columns of W so that the n–th column/row of the matrix contains the covariances with respect to i. We then partition W as in (9) and update the values of $w_{12}$ and $w_{12}'$ by solving equation (10). The algorithm is iterated until convergence. Finally, we obtain the estimator K by taking the inverse of W.
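As an illustration, the following self-contained numpy sketch implements this block-coordinate scheme. It is a simplified version for exposition (we slice out column j directly instead of permuting rows and columns, which is equivalent, and we use fixed iteration counts instead of a convergence test); it is not the authors' implementation.

```python
import numpy as np

def soft_threshold(x, t):
    return np.sign(x) * np.maximum(np.abs(x) - t, 0.0)

def glasso(Sigma, lam, n_sweeps=50, n_cd=100):
    """Graphical lasso via block-coordinate descent: cycle over columns,
    solve the lasso sub-problem by coordinate descent, update W = K^{-1},
    and finally recover K from W and the regression coefficients."""
    n = Sigma.shape[0]
    W = Sigma + lam * np.eye(n)            # initial estimate of W = K^{-1}
    B = np.zeros((n, n))                   # lasso coefficients, one column per asset
    for _ in range(n_sweeps):
        for j in range(n):
            idx = np.arange(n) != j        # all rows/columns except j
            W11 = W[np.ix_(idx, idx)]
            s12 = Sigma[idx, j]
            beta = B[idx, j].copy()
            for _ in range(n_cd):          # coordinate descent on the lasso
                for k in range(n - 1):
                    r = s12[k] - W11[k] @ beta + W11[k, k] * beta[k]
                    beta[k] = soft_threshold(r, lam) / W11[k, k]
            B[idx, j] = beta
            W[idx, j] = W11 @ beta         # update w12 = W11 beta
            W[j, idx] = W[idx, j]
    K = np.zeros((n, n))                   # invert W block by block
    for j in range(n):
        idx = np.arange(n) != j
        k_jj = 1.0 / (W[j, j] - W[idx, j] @ B[idx, j])
        K[j, j] = k_jj
        K[idx, j] = -k_jj * B[idx, j]
    return 0.5 * (K + K.T)

# toy covariance matrix (ours): a mild penalty already shrinks weak entries
Sigma_demo = np.array([[2.0, 0.5, 0.3],
                       [0.5, 1.5, 0.2],
                       [0.3, 0.2, 1.0]])
K_hat = glasso(Sigma_demo, lam=0.1)
```

For λ = 0 the procedure returns (numerically) the unpenalized inverse $\Sigma^{-1}$, while for a large enough λ every off-diagonal entry of K is set to an exact zero.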
3 Theory
In this section, we establish the large sample properties of the realized network estimator
defined in (8). In particular we obtain results on the rate of convergence of the estimator
as well as the probability of correct selection of the network.
In this section we adopt a more detailed notation. Given a matrix $U = (u_{ij}) \in \mathbb{R}^{\ell\times m}$, we set $\|U\|_{\infty} = \max_{i,j}|u_{ij}|$, $\|U\|_{1} = \sum_{i,j}|u_{ij}|$, $\|U\|_{1,o} = \sum_{i\neq j}|u_{ij}|$, and $|||U|||_{\infty} = \max_{j}\sum_{k=1}^{m}|u_{jk}|$, where $i \in \{1, 2, \ldots, \ell\}$ and $j \in \{1, 2, \ldots, m\}$. If $A = (a_{ij})$ is a $p\times q$ matrix and B is an $m\times n$ matrix, the Kronecker product of the matrices A and B is the $pm\times qn$ matrix given by
$$A \otimes B = \begin{pmatrix} a_{11}B & \cdots & a_{1q}B \\ \vdots & \ddots & \vdots \\ a_{p1}B & \cdots & a_{pq}B \end{pmatrix}.$$
We index the pm rows of $A \otimes B$ by
$$\mathcal{R} = \{(1,1), (2,1), \ldots, (m,1), (1,2), (2,2), \ldots, (m,2), \ldots, (1,p), \ldots, (m,p)\},$$
and the qn columns by
$$\mathcal{C} = \{(1,1), (2,1), \ldots, (n,1), (1,2), (2,2), \ldots, (n,2), \ldots, (1,q), \ldots, (n,q)\}.$$
For any two subsets $R \subset \mathcal{R}$ and $C \subset \mathcal{C}$, we denote by $(A \otimes B)_{R\times C}$ the matrix such that $(A \otimes B)_{(i,j)(c,d)}$ is an entry of $(A \otimes B)_{R\times C}$ iff $(i,j) \in R$ and $(c,d) \in C$.
Assumption 6. Consider the $n^2\times n^2$ matrix $\Gamma^{\star} = \Sigma^{\star} \otimes \Sigma^{\star}$. There exists some $\alpha \in (0, 1]$ such that
$$\max_{e \in S^c} \left\|\Gamma^{\star}_{eS}\,(\Gamma^{\star}_{SS})^{-1}\right\|_{1} \le 1 - \alpha,$$
where $S = E \cup \{(i, i) \,|\, i \in V\}$ and $S^c = \{(i, j) \in V\times V : k^{\star}_{ij} = 0\}$.
Assumption 6 limits the amount of dependence between the non-edge terms (indexed by $S^c$) and the edge-based terms (indexed by S). The limit is controlled by α: the bigger the α, the smaller the dependence. In other words, if we set
$$X_{(j,k)} = y_j y_k - E\left(y_j y_k\right), \quad \text{for all } j, k \in V,$$
then the correlation between $X_{(j,k)}$ and $X_{(\ell,m)}$ is low for any $(j,k) \in S$ and $(\ell,m) \in S^c$.
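For a given Σ⋆ and support set S, Assumption 6 can be checked numerically. The sketch below is a hypothetical illustration of ours: it builds Γ⋆ = Σ⋆ ⊗ Σ⋆ with np.kron, under which the pair (i, j) maps to the flat index i·n + j, and computes α as one minus the largest row ℓ1-norm.

```python
import numpy as np

def irrepresentability_alpha(Sigma_star, K_star):
    """alpha = 1 - max_{e in S^c} ||Gamma*_{eS} (Gamma*_{SS})^{-1}||_1,
    with Gamma* = Sigma* kron Sigma*; pair (i, j) maps to flat index i*n + j."""
    n = Sigma_star.shape[0]
    Gamma = np.kron(Sigma_star, Sigma_star)
    support = K_star != 0
    S = [i * n + j for i in range(n) for j in range(n) if support[i, j]]
    Sc = [i * n + j for i in range(n) for j in range(n) if not support[i, j]]
    if not Sc:                                  # fully connected: nothing to check
        return 1.0
    rows = np.abs(Gamma[np.ix_(Sc, S)] @ np.linalg.inv(Gamma[np.ix_(S, S)]))
    return 1.0 - rows.sum(axis=1).max()

# diagonal Sigma* and K*: Gamma* is diagonal and the condition holds with alpha = 1
alpha = irrepresentability_alpha(np.diag([1.0, 2.0, 0.5]),
                                 np.diag([1.0, 0.5, 2.0]))
```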
The following theorems establish the properties of the realized network estimator (cf. Ravikumar et al., 2011). Theorems 3 and 4 are for the MRK estimator, and Theorems
5 and 6 are for the TSRC estimator. Theorems 3 and 5 establish the rate at which
the estimator converges to the true value as the sample size M increases. Theorems 4
and 6 establish a lower bound on the probability of correctly detecting the nonzero partial
correlations (as well as their signs) as a function of the sample size M . In particular, these
theorems show that the estimators are model selection consistent with high probability,
when n is large.
Theorem 3. Assume that the assumptions of Theorem 1 and Assumption 6 hold, and choose
$$\lambda = \frac{8}{\alpha}\,\frac{\log(a_2 M n^{\tau})}{a_3 M^{1/4}}$$
in (8), where τ > 2 is arbitrary, the $a_i$ are the constants in (4), and α is from Assumption 6. Assume also that
$$M > \left(\frac{\log(c_1 n^{-\tau})}{c_2}\right)^{4},$$
where $c_1$ and $c_2$ are two positive constants that will be given in the Appendix. Then
$$P\left(\|K - K^{\star}\|_{\infty} \le 2\,C_{\Gamma^{\star}}\left(1 + 8\alpha^{-1}\right)\frac{\log(a_2 M n^{\tau})}{a_3 M^{1/4}}\right) \ge 1 - \frac{1}{n^{\tau-2}},$$
where $C_{\Gamma^{\star}} = |||(\Gamma^{\star}_{SS})^{-1}|||_{\infty}$.
Observe that Theorem 3 states how large the number of observations needs to be in order to obtain sufficiently precise estimates of the entries of K⋆. The theorem allows the number of observations M to be smaller than the total number of entries of the concentration matrix. The rate of growth of M depends on the constants $c_1$ and $c_2$, which are explicit in the proof and depend on a number of population parameters, in particular the maximum degree of the network and coefficients proportional to the magnitude of the entries of K⋆ and its inverse. The bound is such that the sparser the network, the weaker the requirement on M.
Observe that if the assumptions of Theorem 3 hold and if, for all $i, j \in V$, $|k^{\star}_{ij}| > O_P\left(M^{-1/4}\log(Mn^{\tau})\right)$, then the sign of $k_{ij}$ is the same as that of $k^{\star}_{ij}$ with large probability when n is large enough. In order to avoid ambiguity, we adopt the convention that when $k^{\star}_{ij} = 0$, $\operatorname{sign}(k_{ij}) = \operatorname{sign}(k^{\star}_{ij})$ means $k_{ij} = 0$.
Theorem 4. Under the assumptions of Theorem 3, if
$$M > \left(\frac{\log(c_3 n^{-\tau})}{c_4}\right)^{4},$$
where $c_3$ and $c_4$ are two positive constants that will be given in the Appendix, then
$$P\left(\operatorname{sign}(k_{ij}) = \operatorname{sign}(k^{\star}_{ij}),\ \forall\, i, j \in V\right) \ge 1 - \frac{1}{n^{\tau-2}}.$$
From Theorems 3 and 4, we get that if $M > O_P\left(\log^{4}(n)\right)$, the events $\|K - K^{\star}\|_{\infty} \le O_P\left(\log(Mn^{\tau})/M^{1/4}\right)$ and $\operatorname{sign}(k_{ij}) = \operatorname{sign}(k^{\star}_{ij})$, for all $i, j \in V$, hold with large probability when n is large enough.
Theorem 5. Assume that the assumptions of Theorem 2 and Assumption 6 hold, and choose
$$\lambda = \frac{8}{\alpha}\, M^{-1/6} \left(\frac{\log(a_2 n^{\tau})}{a_3}\right)^{1/2}$$
in (8), where τ > 2 is arbitrary, the $a_i$ are the constants in (5), and α is from Assumption 6. Assume also that
$$M > \left(\frac{\log(a_2 n^{\tau})}{a_3}\right)^{3} c_0^{6},$$
where $c_0$ is a positive constant that will be given in the Appendix. Then
$$P\left(\|K - K^{\star}\|_{\infty} \le 2\,C_{\Gamma^{\star}}\left(1 + 8\alpha^{-1}\right) M^{-1/6} \left(\frac{\log(a_2 n^{\tau})}{a_3}\right)^{1/2}\right) \ge 1 - \frac{1}{n^{\tau-2}},$$
where $C_{\Gamma^{\star}} = |||(\Gamma^{\star}_{SS})^{-1}|||_{\infty}$.
Theorem 6. Under the assumptions of Theorem 5, if
$$M > \left(\frac{\log(a_2 n^{\tau})}{a_3}\right)^{3} c_1^{6},$$
where $c_1$ is a positive constant that will be given in the Appendix, then
$$P\left(\operatorname{sign}(k_{ij}) = \operatorname{sign}(k^{\star}_{ij}),\ \forall\, i, j \in V\right) \ge 1 - \frac{1}{n^{\tau-2}}.$$
From Theorems 5 and 6, we have that if $M > O_P\left(\log^{3}(n)\right)$, the events $\|K - K^{\star}\|_{\infty} \le O_P\left(\log^{1/2}(n)/M^{1/6}\right)$ and $\operatorname{sign}(k_{ij}) = \operatorname{sign}(k^{\star}_{ij})$, for all $i, j \in V$, hold with large probability when n is large enough.
Detailed proofs of the theorems are given in the Appendix. In what follows, we outline the arguments used to establish the results. Note first that we can rewrite the objective function in (8) as
$$\operatorname{tr}(K\Sigma) - \log\det(K) + \lambda\|K\|_{1,o}.$$
Recall that the sub-differential of the absolute value function $f(x) = |x|$, $x \in \mathbb{R}$, is given by the function
$$f^{0}(x) = \begin{cases} 1 & \text{if } x > 0, \\ -1 & \text{if } x < 0, \\ \in [-1, 1] & \text{if } x = 0. \end{cases}$$
Then, given the definition of the norm $\|\cdot\|_{1,o}$, the sub-differential of this norm evaluated at some matrix $K = (k_{ij}) \in \mathbb{R}^{n\times n}$ includes the set of all symmetric matrices $Z = (z_{ij}) \in \mathbb{R}^{n\times n}$ such that
$$z_{ij} = \begin{cases} 0 & \text{if } i = j, \\ \operatorname{sign}(k_{ij}) & \text{if } i \neq j \text{ and } k_{ij} \neq 0, \\ \in [-1, 1] & \text{if } i \neq j \text{ and } k_{ij} = 0. \end{cases}$$
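The scalar case gives the intuition behind this characterization: the minimizer of $\frac{1}{2}(x-b)^2 + \lambda|x|$ is the soft-threshold of b, and at the minimum the quantity $b - \hat{x}$ lies in $\lambda f^{0}(\hat{x})$. A quick numerical check (our illustration, not from the paper):

```python
import numpy as np

def soft_threshold(b, lam):
    """Minimizer of 0.5*(x - b)**2 + lam*|x| (the prox of lam*|.|)."""
    return np.sign(b) * max(abs(b) - lam, 0.0)

# optimality check: b - x_hat must equal lam*z for some z in f0(x_hat)
for b in (3.0, -0.4, 1.0):
    lam = 1.0
    x_hat = soft_threshold(b, lam)
    g = b - x_hat
    if x_hat != 0:
        assert np.isclose(g, lam * np.sign(x_hat))   # z = sign(x_hat)
    else:
        assert abs(g) <= lam + 1e-12                 # z in [-1, 1]
```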
We can now state the following result.
Lemma 1. Assume that Σ has positive diagonal entries. Then for any λ > 0, the ℓ1-regularized log-determinant problem
$$K = \operatorname*{arg\,min}_{K \in \mathcal{S}^n} \left\{\operatorname{tr}(K\Sigma) - \log\det(K) + \lambda\|K\|_{1,o}\right\} \tag{11}$$
has a unique solution K characterized by
$$\Sigma - K^{-1} + \lambda Z = 0, \tag{12}$$
where Z is an element of the sub-differential of the norm $\|\cdot\|_{1,o}$ evaluated at K.
The proof of Lemma 1 will be provided in Appendix A. Based on this lemma, we follow the following procedure:

(a) We consider the matrix $\tilde{K} = (\tilde{k}_{ij})$ that solves the restricted log-determinant problem
$$\tilde{K} = \operatorname*{arg\,min}_{K \in \mathcal{S}^n,\, K_{S^c} = 0} \left\{\operatorname{tr}(K\Sigma) - \log\det(K) + \lambda\|K\|_{1,o}\right\}, \tag{13}$$
where λ is some positive constant and $K_{S^c} = \{k_{ij} \,|\, (i, j) \in S^c\}$.

(b) We define the matrix $\tilde{Z} = (\tilde{z}_{ij})$ by
$$\tilde{z}_{ij} = \begin{cases} 0 & \text{if } i = j, \\ \operatorname{sign}(\tilde{k}_{ij}) & \text{if } i \neq j \text{ and } k^{\star}_{ij} \neq 0, \\ \frac{1}{\lambda}(\tilde{\sigma}_{ij} - \sigma_{ij}) & \text{if } i \neq j \text{ and } k^{\star}_{ij} = 0, \end{cases}$$
where $(\tilde{K})^{-1} = \tilde{\Sigma} = (\tilde{\sigma}_{ij})$.
(c) We verify that $|\tilde{z}_{ij}| \le 1$ for all $(i, j) \in S^c$.

This procedure is called the construction of the primal-dual witness solution $(\tilde{K}, \tilde{Z})$. Observe that in step (a), we do not care about the specific algorithm used to obtain the minimizer. Notice that the objective function in (13) is convex and has a unique minimum in $\mathcal{S}^n$, which we denote by $\tilde{K}$. Moreover, since $\tilde{K}$ is the minimizer, the sub-differential of (13) evaluated at each entry of $\tilde{K}$ contains zero. Thus we have that
$$\sigma_{ij} - \tilde{\sigma}_{ij} + \operatorname{sign}(\tilde{k}_{ij})\lambda = 0, \quad (i, j) \in E,$$
$$\sigma_{ij} - \tilde{\sigma}_{ij} = 0, \quad (i, j) \in \{(\ell, \ell) : \ell \in V\}.$$
Finally, according to the definition of $\tilde{Z}$, we obtain that
$$\Sigma - \tilde{K}^{-1} + \lambda \tilde{Z} = 0. \tag{14}$$
Consequently, based on Lemma 1, if $\tilde{Z}$ is an element of the sub-differential of the norm $\|\cdot\|_{1,o}$ evaluated at $\tilde{K}$, which will be checked in step (c), we get that $\tilde{K} = K$.
Now we return to the logic of the proof of the theorems. First we obtain a lower bound on the probability that condition (c) holds. As explained before, this condition implies that $\tilde{K} = K$. Then we seek a bound on the ℓ∞ norm of $\tilde{K} - K^{\star}$; conditioned on $\tilde{K} = K$, this bound also applies to $\|K - K^{\star}\|_{\infty}$. Thus we obtain an upper bound for $\|K - K^{\star}\|_{\infty}$ in probability; denote this bound by x for a given probability level. For a non-zero entry $k^{\star}_{ij}$ of $K^{\star}$, if its absolute value is bigger than x, we have $\operatorname{sign}(k_{ij}) = \operatorname{sign}(k^{\star}_{ij})$ with the same probability. Thus, to guarantee sign consistency, we consider the minimum absolute value of the non-zero entries of $K^{\star}$ and show that, with the required probability, the maximum of $|k_{ij} - k^{\star}_{ij}|$ over all $i, j \in V$ is still smaller than this minimum.
4 Simulation Study
In this section we carry out a simulation study to assess the finite sample properties of the
realized network estimator. The simulation exercise consists of simulating one day of data
and to apply the realized network estimator to estimate the integrated covariance, the
integrated concentration, as well as detecting the integrated partial correlation network.
In our numerical implementation, a trading day is 7.5 hours and the simulation of the
continuous time process is carried out using the Euler scheme with a discretization step
of 5 seconds.
The efficient price follows an n–dimensional zero drift diffusion with time varying spot covariance equal to
$$\Sigma(t) = V(t)\, R\, V(t),$$
where V(t) is a diagonal matrix with positive diagonal entries and R is a symmetric positive definite matrix. In this model V(t) drives the time varying volatility of each asset, while R determines the (partial) correlation structure of the process. The diagonal entries of V(t) follow a CIR process
$$dv_{ii}(t) = \kappa\left(v - v_{ii}(t)\right)dt + \tau \sqrt{v_{ii}(t)}\, dW_i(t),$$
where v is the long run mean of the process, τ > 0 is its volatility, κ > 0 is the mean reversion parameter, and $W_i(t)$ is a Brownian motion independent of B(t) in (1). It is also assumed that the $W_i(t)$, i = 1, ..., n, are independent. The matrix R is a function of a realization of an Erdős–Rényi random graph. The Erdős–Rényi random graph G = (V, E) is an undirected graph defined over a fixed set of vertices V = {1, ..., n} and a random set of edges, where the existence of an edge between any pair of vertices is determined by an independent Bernoulli trial with probability p. We generate R by first simulating an Erdős–Rényi random graph G and then setting R equal to
$$R = \left[I_n + D - A\right]^{-1},$$
where $I_n$ is the n–dimensional identity matrix, and D and A are respectively the degree matrix and the adjacency matrix of the random graph G. The model for R is such that the underlying random graph structure determines the sparsity structure of the spot concentration matrix. Also, note that R is symmetric positive definite by construction.
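A minimal sketch of this construction, assuming an independent Bernoulli(p) draw for each unordered pair of vertices:

```python
import numpy as np

def make_R(n, p, rng):
    """R = (I_n + D - A)^{-1} from a draw of an Erdos-Renyi graph G(n, p)."""
    upper = rng.random((n, n)) < p
    A = np.triu(upper, k=1).astype(float)   # keep the strict upper triangle
    A = A + A.T                             # symmetric adjacency matrix, zero diagonal
    D = np.diag(A.sum(axis=1))              # degree matrix
    return np.linalg.inv(np.eye(n) + D - A)

R = make_R(10, 0.2, np.random.default_rng(0))
```

Since $I_n + D - A$ is symmetric, has positive diagonal and is strictly diagonally dominant, it is positive definite, and so is its inverse R.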
As far as the microstructure effects are concerned, we assume that the noise component $u_{i\,k}$ is an independent zero mean normal random variable with variance $\eta^2$ for all stocks. Finally, prices of each stock are observed according to a Poisson process with a constant intensity µ. Table 1 contains detailed information on the parameter choices made in the simulation exercise. We consider three scenarios in the simulation study that differ in the
amount of sparsity of the underlying network, which is governed by p. In the first scenario np is smaller than one, and in this case the Erdős–Rényi graph will not have components larger than O(log(n)), almost surely. In the second scenario np is exactly equal to one, and in this case the random graph will have a largest component of order $n^{2/3}$, almost surely. Finally, in the last scenario np is greater than one, and in this case the graph will have a giant component.
INSERT TABLE 1 ABOUT HERE
The integrated covariance and integrated partial correlation network are estimated
using the realized network estimator introduced in the previous section. As far as the
estimation of the integrated covariance is concerned, we consider three estimators: the
MRK and TSRC based on pairwise refresh time sampling as well as the “classic” 5–minute
realized covariance (RC) estimator. The realized covariance $\Sigma_V = (\sigma_{V\,ij})$ is defined as
$$\sigma_{V\,ij} = \sum_{k=2}^{m}\left(x'_{i\,k} - x'_{i\,k-1}\right)\left(x'_{j\,k} - x'_{j\,k-1}\right),$$
where $x'_{i\,k}$, $x'_{i\,k-1}$ and $x'_{j\,k}$, $x'_{j\,k-1}$ are respectively the 5–minute frequency prices of assets i and j. Each estimator is computed using different values of the shrinkage parameter λ.
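The RC estimator amounts to summing outer products of the synchronized return vectors. A sketch, with simulated Brownian log-prices standing in for the 5-minute data:

```python
import numpy as np

def realized_covariance(X):
    """'Classic' realized covariance from an (m x n) matrix of log-prices on a
    common (e.g. 5-minute) grid: sum of outer products of the return vectors."""
    dX = np.diff(X, axis=0)      # (m-1) x n matrix of within-day returns
    return dX.T @ dX

# toy check on simulated Brownian log-prices (our stand-in for real data)
rng = np.random.default_rng(1)
m, n = 90, 4                      # e.g. 90 five-minute observations, 4 assets
X = np.cumsum(rng.normal(scale=0.01, size=(m, n)), axis=0)
RC = realized_covariance(X)
```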
We use different metrics to evaluate the performance of the realized network estimator. We use a Root Mean Square Error (RMSE) type loss based on the scaled Frobenius norm of the covariance and concentration matrices to evaluate the estimation precision (Hautsch et al., 2012). The selection performance of the estimator is measured with the AUROC. The scaled Frobenius norm of the realized concentration estimator K is defined as
$$\mathrm{RMSE}(K) = \sqrt{\frac{1}{n}\sum_{i,j=1}^{n}\left(k_{ij} - k^{\star}_{ij}\right)^{2}},$$
and, analogously, the scaled Frobenius norm of the realized covariance estimator Σ is
$$\mathrm{RMSE}(\Sigma) = \sqrt{\frac{1}{n}\sum_{i,j=1}^{n}\left(\sigma_{ij} - \sigma^{\star}_{ij}\right)^{2}}.$$
The ROC curve and the AUROC index are standard measures of classification performance. The ROC (Receiver Operating Characteristic) curve is a graphical plot which illustrates the performance of a binary classifier as its discrimination threshold is varied. In our context, the binary classifier is whether or not $k_{ij} = 0$ for $i \neq j$ and $i, j \in V$, and the discrimination threshold is the value of λ. The ROC consists of the plot of the true positive rate (TPR) against the false positive rate (FPR) of the classifier. These two rates are obtained from the comparison of K⋆ with K. The TPR equals the percentage of pairs correctly identified as connected among the truly connected ones, and the FPR equals the percentage of pairs falsely identified as linked among the unlinked ones. Clearly, a larger λ shrinks $k_{ij}$ more strongly towards zero, while for λ = 0 the estimate $k_{ij}$ is hardly ever exactly zero. For a given estimator of K⋆, we take its TPR value as the vertical coordinate and its FPR value as the horizontal coordinate. As the value of λ changes we obtain a series of points corresponding to a series of estimators, and the ROC curve is generated by connecting those points. Generally, decreasing λ enlarges both the TPR and the FPR, which is why the slope of the ROC curve is positive. The area underneath the ROC curve (AUROC) is then used as an overall normalized index of accuracy of the classifier: the closer the index is to one, the more accurate the classifier.
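The TPR/FPR computation and the trapezoidal AUROC can be sketched as follows (our helper functions; in the study the points are generated by re-estimating K over a grid of λ values):

```python
import numpy as np

def tpr_fpr(K_hat, K_star):
    """True/false positive rates for off-diagonal edge detection."""
    n = K_star.shape[0]
    off = ~np.eye(n, dtype=bool)
    truth = (K_star != 0) & off
    pred = (K_hat != 0) & off
    tpr = (pred & truth).sum() / max(truth.sum(), 1)
    fpr = (pred & ~truth).sum() / max((~truth & off).sum(), 1)
    return tpr, fpr

def auroc(points):
    """Trapezoidal area under (FPR, TPR) points, closed at (0,0) and (1,1)."""
    pts = [(0.0, 0.0)] + sorted(points) + [(1.0, 1.0)]
    return sum(0.5 * (y0 + y1) * (x1 - x0)
               for (x0, y0), (x1, y1) in zip(pts[:-1], pts[1:]))

# toy usage: a perfectly selected support gives the corner point (0, 1)
K_star = np.array([[1.0, 0.5, 0.0],
                   [0.5, 1.0, 0.0],
                   [0.0, 0.0, 1.0]])
rates = tpr_fpr(K_star, K_star)
```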
For each parameter setting of the model, we perform 100 Monte Carlo replications
and report here summary statistics.
Table 2 reports the summary results of the simulation study. For each estimator and
simulation setting, the table reports the root mean square error of the plain estimator
as well as the “oracle” network estimator. Oracle refers to the network estimator based
on the best possible (yet unfeasible) choice of the shrinkage parameter. The table also
reports the AUROC of the network estimator. The table shows that typically the sparser
the integrated covariance matrix, the larger the gains from using the realized network
estimator. Notice that the performance of the TSRC and MRK plain estimators as estimators of K is sometimes poorer than that of the RC. This is a consequence of the fact that, since the entries of these estimators are computed pairwise, the estimators are ill–posed when the networks are not sufficiently sparse. On the other hand, the realized
that in this specific simulation setting, the results based on the TSRC estimator are almost
systematically more accurate than the ones based on the MRK and RC.
INSERT TABLE 2 ABOUT HERE
We now give more detailed insights into the performance of the network estimator for Design 3. Figure 1 displays the RMSE of the realized network estimator as an estimator of Σ⋆ as a function of the tuning parameter λ. The picture displays the typical profile of shrinkage type estimators: as the shrinkage parameter increases, the estimator rapidly becomes more efficient. We note that, in this simulation setting, the efficiency gains are quite substantial in relative terms.
INSERT FIGURE 1 ABOUT HERE
Overall, the results convey that the gains from using the realized network estimator when the partial correlation structure is sparse can be substantial.
5 Empirical Application
We use the realized network estimator to analyse the dependence structure of a panel
of US bluechips from the NYSE throughout 2009. We then engage in a Markowitz style
Global Minimum Variance portfolio forecasting exercise to showcase the advantages of
our proposed methodology.
5.1 Data & Estimation
We consider a sample of 100 liquid US bluechips that have been part of the S&P100 index
for most of the 2000’s. We also include in the panel the SPY ETF, the ETF tracking the
S&P 500 index. We work with tick–by–tick transaction prices obtained from the NYSE–
TAQ database. Before proceeding with the econometric analysis, the data are filtered
using standard techniques described in Brownlees and Gallo (2006) and Barndorff-Nielsen
et al. (2009). The full list of tickers, company names and industry groups is reported in
table 3.
INSERT TABLE 3 ABOUT HERE
We estimate the integrated covariance of the assets throughout 2009. More precisely,
for each of the 52 weeks of 2009, we use the data on the last weekday available of each
week to construct the realized covariance estimators. On each of these days, we only
consider the tickers that have at least 1000 trades.
Exploratory analysis (not reported in the paper) confirms the well documented evidence of a CAPM–type factor structure in the panel. To this end, our realized covariance estimation strategy consists of first decomposing the realized covariance into systematic
and idiosyncratic covariance components and then regularizing the idiosyncratic part with
the realized network. More precisely, we compute the realized covariance of the assets
in the panel together with the SPY ticker (the proxy of the market), and then obtain
the systematic and idiosyncratic realized covariance of the assets on the basis of formula
(6). Finally, we apply the realized network regularization procedure to the idiosyncratic
realized covariance. The regularization parameter λ is chosen by minimizing a BIC–type
criterion defined as
$$\mathrm{BIC}(\lambda) = M\times\left[-\log\det K_{I\,\lambda} + \operatorname{tr}\left(K_{I\,\lambda}\,\Sigma_{I}\right)\right] + \log M \times \#\{(i,j) : 1\le i\le j\le n,\ k_{I\,ij\,\lambda}\neq 0\},$$
as suggested in, among others, Yuan and Lin (2007). On each week of 2009, we
estimate the realized network using three (idiosyncratic) realized covariance estimators:
MRK, TSRC as well as the “classic” RC using data sampled at a 5 minute frequency.
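The criterion can be sketched as a small numpy function; the estimator $K_{I\,\lambda}$ would be produced by the glasso at each candidate λ (the function names and the toy input here are ours):

```python
import numpy as np

def bic(K_hat, Sigma_I, M):
    """BIC-type criterion: Gaussian likelihood term plus log(M) times the
    number of distinct nonzero entries on or above the diagonal."""
    sign, logdet = np.linalg.slogdet(K_hat)
    assert sign > 0, "K_hat must be positive definite"
    n_par = np.count_nonzero(np.triu(K_hat))
    return M * (-logdet + np.trace(K_hat @ Sigma_I)) + np.log(M) * n_par

# selection over a grid would keep the minimizer, e.g.
#   lam_opt = min(grid, key=lambda lam: bic(estimate_K(lam), Sigma_I, M))
value = bic(np.eye(3), np.eye(3), M=100)   # toy input
```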
5.2 Realized Network Estimates
In this section we present the realized network estimation results. We first provide details
for one specific date only, which roughly corresponds to the middle of the sample (June 26, 2009), and then report summaries for all estimated networks in 2009. In the interest of
space we report the TSRC estimator results only. The MRK and RC provide essentially
analogous conclusions.
We begin by showing in figure 2 the heatmap of the idiosyncratic correlation matrix
associated with the idiosyncratic realized covariance estimator on June 26. Notice that
the heatmap is constructed by sorting stocks by industry group and then by alphabetical
order. The picture clearly shows that after netting out the influence of the market factor,
a fair amount of cross–sectional dependence is still present across stocks. Inspection of the
heatmap reveals that the majority of estimated correlation coefficients are positive. The
correlation matrix exhibits a block diagonal structure hinting that correlation is stronger
among firms in the same industry. On this date, the intra-industry group correlation is
particularly strong for energy companies.
INSERT FIGURE 2 ABOUT HERE
We estimate the realized network using the glasso and use the BIC to choose the optimal amount of shrinkage. To give insight into the sensitivity of the estimated network to the shrinkage parameter λ, the top panel of figure 3 reports the so called trace, which is the graph of the estimated partial correlations as a function of the shrinkage parameter. The bottom panel of the same figure also shows the value of the BIC criterion as a function
of the shrinkage parameter. The plot highlights how the sparsity of the estimated network
varies substantially over the range of lambda values considered and that the majority of
estimated partial correlations are positive irrespective of the value of shrinkage imposed
on the estimator.
INSERT FIGURE 3 ABOUT HERE
Figure 4 contains the idiosyncratic partial correlation network associated with the
optimal λ. The estimated network has 188 edges, which corresponds to approximately 5% of the total number of possible edges in the network on this date. The number of companies with interconnections is 66 (roughly 2/3 of the total) and, interestingly, they are all connected in a unique giant component.
A number of comments on the empirical characteristics of the network are in order.
First, on this date, Google (GOOG) emerges as a particularly highly interconnected firm,
with linkages spreading to several other industry groups. In order to investigate this further, we carry out a centrality analysis of the network using the PageRank algorithm, and report the top ten most central companies in table 4. The PageRank algorithm conveys
that Google is indeed the most central stock on this date. The estimated network also
shows some degree of industry clustering, that is, linkages are more frequent among firms in the same industry group. In order to get better insight into the industry linkages, in table 5 we report the total number of links across industry groups. The table shows
that firms in the industrials, energy, technology and financials groups are particularly
interconnected among each other. Last, in figure 5 we report the degree distribution
of the estimated network and the distribution of the nonzero partial correlations. As
far as the degree distribution is concerned, the network exhibits the typical features of
Power Law networks, that is, the number of connections is heterogeneous and the most interconnected stocks have a large number of connections relative to the total number of links. The histogram of the partial correlations shows that the majority of the partial
correlations are positive and that positive partial correlations are on average larger than
the negative ones.
INSERT FIGURES 4, 5 AND TABLES 4, 5 ABOUT HERE
We report a number of summary statistics for the sequence of networks estimated in 2009. First, in figure 6 we report the proportion of links in the network throughout the year. The picture shows that sparsity is stable at a value slightly below 5% of the total possible number of linkages. The plot shows the sparsity rate vis–à–vis the VIX volatility index to document that no particular time series pattern emerges in the network sparsity and that, in particular, the sparsity is unrelated to the level of volatility of the market.
INSERT FIGURE 6 ABOUT HERE
Figure 7 shows the total number of links of each industry group divided by the total number of possible edges. The plot omits the series for materials, telecom and utilities due to their small size. Technology, energy, financials and industrials are the most interconnected sectors also throughout 2009, and the level and cross sectional rankings are fairly stable across the sample. In order to give more insight into the degree of concentration within each group, in figure 8 we report the concentration of links in each industry group measured using the Herfindahl index. Again, materials, telecom and utilities are omitted from the graph. Once again, no particular time series pattern emerges from the plot and the cross sectional concentration rankings are quite stable. The picture shows that the most interlinked sectors have quite different concentration characteristics. Technology is one of the most highly concentrated sectors. Detailed inspection of the results reveals that this is driven by the fact that in 2009 Google is essentially the most interconnected ticker in the sample. On the other hand, industrials have the smallest average concentration, in that the number of links is quite uniformly distributed across firms and no specific “hub” emerges among these tickers.
INSERT FIGURES 7 AND 8 ABOUT HERE
Overall, the results convey that, after conditioning on a one factor structure, the data still display a fair amount of cross–sectional dependence and that networks provide a useful device to synthesize such information. The main empirical features of the network are stable throughout 2009. Firms in the energy and industrials sectors are strongly interconnected. Technology companies, and Google in particular, are the most highly interconnected firms throughout the year. We conjecture that this could be driven by the fact that, starting from March 2009 with the beginning of the post-crisis market rally, technology stocks and Google in particular have been the most rapidly recovering stocks of the year.
5.3 Predictive Analysis
In order to assess the ability of the regularized network methodology to provide more
precise estimates of the integrated covariance, we carry out an asset allocation prediction
exercise (Hautsch et al., 2012, 2013; Engle and Colacito, 2006).
The forecasting exercise is designed as follows. For each week of 2009, we construct the Markowitz Global Minimum Variance (GMV) portfolio weights using the formula
$$w = \frac{\Sigma^{-1}\mathbf{1}}{\mathbf{1}'\,\Sigma^{-1}\mathbf{1}},$$
where $\mathbf{1}$ is an n–dimensional vector of ones, and
$$\Sigma = \sigma_m^2\,\beta\beta' + \Sigma_I.$$
The resulting GMV portfolio weights are used to construct a portfolio of the assets which
is held for the following week. At the end of the week, we compute the daily sample
variance of the portfolio return for that week. We repeat the exercise for all the weeks
of 2009. The performance of the covariance estimators is evaluated by assessing which
estimator delivers the smallest average out–of–sample GMV portfolio variance.
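The GMV weight formula above is straightforward to compute; a minimal numerical sketch (the toy covariance matrix is ours, purely for illustration):

```python
import numpy as np

def gmv_weights(cov):
    """Global Minimum Variance weights: w = Sigma^{-1} 1 / (1' Sigma^{-1} 1)."""
    ones = np.ones(cov.shape[0])
    w = np.linalg.solve(cov, ones)  # Sigma^{-1} 1 without forming the inverse
    return w / w.sum()

# Toy three-asset covariance matrix (hypothetical numbers)
Sigma = np.array([[0.04, 0.01, 0.00],
                  [0.01, 0.09, 0.02],
                  [0.00, 0.02, 0.16]])
w = gmv_weights(Sigma)
# w sums to one by construction; w' Sigma w is minimal among
# fully invested portfolios
```

Solving the linear system rather than inverting $\Sigma$ is numerically preferable, which matters precisely when the estimated covariance is close to singular.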
Different estimation strategies for the covariance matrix Σ are employed. First, we esti-
mate the covariance on the basis of the MRK, TSRC and RC estimators. For each
of these estimators we consider three approaches to compute Σ: the simple unconstrained
covariance estimator; the constrained estimator obtained by setting all the off–diagonal
elements of ΣI to zero; and the realized network estimator, where ΣI is estimated by the
realized network. Notice that the unconstrained realized covariance estimator is not guar-
anteed to be positive definite. When the estimator is close to being singular we regularize
it by adding to the matrix the identity times a positive scalar.
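The identity-shift fix just described can be sketched as follows; the size of the shift (`eps`) is our choice for illustration, not a value taken from the paper:

```python
import numpy as np

def regularize_if_near_singular(cov, eps=1e-6):
    """Shift cov by a multiple of the identity so its smallest
    eigenvalue is at least eps (ridge-style regularization)."""
    min_eig = np.linalg.eigvalsh(cov).min()
    if min_eig < eps:
        cov = cov + (eps - min_eig) * np.eye(cov.shape[0])
    return cov

# A rank-one (singular) matrix: GMV weights would be undefined here
S = np.array([[1.0, 1.0],
              [1.0, 1.0]])
S_reg = regularize_if_near_singular(S)
# S_reg is now positive definite and safely invertible
```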
In Figure 9 we report the time series of the weekly out–of–sample GMV annualized
volatility of the constrained, unconstrained and realized network procedures for the TSRC.
The figure shows that the unconstrained approach produces fairly high out–of–sample GMV
volatilities. Constraining the idiosyncratic covariance to be diagonal improves average
performance but still produces rather noisy estimates. The realized network approach
produces stable and low GMV variability that is almost uniformly more accurate than
the unconstrained approach and is beaten by the constrained procedure only in a small
number of instances.
INSERT FIGURE 9 ABOUT HERE
We report summary results on the forecasting exercise in Table 6. The table shows
the average annualized volatility of the GMV portfolios. The asterisks in the table de-
note the significance of a test for variance equality between the realized network GMV
variance and the other GMV variances. The three different covariance estimators deliver
analogous inference. The unconstrained procedure delivers large out–of–sample variances.
In line with theory, the RC estimator is the one that performs the worst. Constrain-
ing the idiosyncratic covariance to be diagonal does not generally improve out–of–sample
performance. The realized network always performs best and reduces the GMV variance
by roughly a third with respect to the unconstrained estimator. Moreover, the variance
reduction of the realized network is significant in all instances. Interestingly, the differ-
ence in the forecasting performance across the different estimators is substantially smaller
after carrying out the network regularization.
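The paper does not spell out which variance-equality test underlies the asterisks; as an illustration only, a simple large-sample test based on the asymptotic normality of the log sample variance can be sketched as follows (function name and test choice are ours):

```python
import math
import numpy as np

def variance_equality_z(r1, r2):
    """Large-sample z-test for equality of the variances of two return
    series, using Var(log s^2) ~ 2/(n-1). Illustrative only: this is
    not necessarily the test used in the paper."""
    n1, n2 = len(r1), len(r2)
    log_ratio = math.log(np.var(r1, ddof=1) / np.var(r2, ddof=1))
    se = math.sqrt(2.0 / (n1 - 1) + 2.0 / (n2 - 1))
    z = log_ratio / se
    p = math.erfc(abs(z) / math.sqrt(2.0))  # two-sided p-value
    return z, p

rng = np.random.default_rng(0)
z, p = variance_equality_z(rng.normal(0, 1.0, 250), rng.normal(0, 2.0, 250))
# with true variances 1 vs 4, the test rejects equality decisively
```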
INSERT TABLE 6 ABOUT HERE
6 Conclusions
In this work we propose a regularization procedure for realized covariance estimators.
The regularization consists of shrinking the off–diagonal elements of the inverse realized
covariance matrix to zero using a Lasso–type penalty. Since estimating a sparse inverse
covariance matrix is equivalent to detecting the partial correlation network structure of the
log-prices we call our procedure realized network. The technique is specifically designed
for the Multivariate Realized Kernel and the Two–Scales Realized Covariance estimators
based on refresh time sampling, which are state–of–the–art consistent covariance estima-
tors that allow for market microstructure effects and asynchronous trading. We establish
the large sample properties of the procedure and show that the realized network
consistently estimates the integrated covariance and consistently selects
the partial correlation network. We also establish novel concentration inequalities for
the Multivariate Realized Kernel estimator. An empirical exercise is used to highlight
the usefulness of the procedure, and an out–of–sample GMV portfolio asset allocation
exercise is carried out to compare our procedure against a number of benchmarks. Re-
sults convey that the realized network enhances the prediction properties of classic realized
covariance estimators.
References
Aı̈t-Sahalia, Y., Mykland, P. A., and Zhang, L. (2005). How often to sample a continuous-time process in the presence of market microstructure noise. Review of Financial Studies, 18(2), 351–416.
Andersen, T. G., Bollerslev, T., Diebold, F. X., and Labys, P. (2003). Modeling and forecasting realized volatility. Econometrica, 71, 579–625.
Bandi, F. M. and Russell, J. R. (2006). Separating microstructure noise from volatility. Journal of Financial Economics, 79, 655–692.
Banerjee, O. and El Ghaoui, L. (2008). Model selection through sparse maximum likelihood estimation for multivariate Gaussian or binary data. Journal of Machine Learning Research, 9, 485–516.
Barigozzi, M. and Brownlees, C. (2013). NETS: Network Estimation for Time Series. Technical report, Barcelona GSE.
Barndorff-Nielsen, O. and Shephard, N. (2004). Econometric analysis of realized covariation: High frequency based covariance, regression, and correlation in financial economics. Econometrica, 72(3), 885–925.
Barndorff-Nielsen, O. E., Hansen, P. R., Lunde, A., and Shephard, N. (2008). Designing realized kernels to measure the ex post variation of equity prices in the presence of noise. Econometrica, 76, 1481–1536.
Barndorff-Nielsen, O. E., Hansen, P. R., Lunde, A., and Shephard, N. (2009). Realised kernels in practice: Trades and quotes. Econometrics Journal, 12, 1–32.
Barndorff-Nielsen, O. E., Hansen, P. R., Lunde, A., and Shephard, N. (2011). Multivariate realised kernels: Consistent positive semi-definite estimators of the covariation of equity prices with noise and non-synchronous trading. Journal of Econometrics, 162(2), 149–169.
Billio, M., Getmansky, M., Lo, A., and Pelizzon, L. (2012). Econometric measures of connectedness and systemic risk in the finance and insurance sectors. Journal of Financial Economics, 104, 535–559.
Brownlees, C. T. and Gallo, G. M. (2006). Financial econometric analysis at ultra-high frequency: Data handling concerns. Computational Statistics and Data Analysis, 51, 2232–2245.
Corsi, F., Peluso, S., and Audrino, F. (2015). Missing in asynchronicity: A Kalman-EM approach for multivariate realized covariance estimation. Journal of Applied Econometrics.
Dempster, A. P. (1972). Covariance selection. Biometrics, 28, 157–175.
Diebold, F. and Yilmaz, K. (2014). On the network topology of variance decompositions: Measuring the connectedness of financial firms. Journal of Econometrics, 182, 119–134.
Engle, R. and Colacito, R. (2006). Testing and valuing dynamic correlations for asset allocation. Journal of Business and Economic Statistics, 24, 238–253.
Fan, J., Li, Y., and Yu, K. (2012). Vast volatility matrix estimation using high-frequency data for portfolio selection. Journal of the American Statistical Association, 107(497), 412–428.
Friedman, J., Hastie, T., Höfling, H., and Tibshirani, R. (2007). Pathwise coordinate optimization. The Annals of Applied Statistics, 1, 302–332.
Friedman, J., Hastie, T., and Tibshirani, R. (2011). Sparse inverse covariance estimation with the graphical lasso. Biostatistics, 9, 432–441.
Hautsch, N., Kyj, L. M., and Oomen, R. C. A. (2012). A blocking and regularization approach to high dimensional realized covariance estimation. Journal of Applied Econometrics, 27(4), 625–645.
Hautsch, N., Kyj, L. M., and Malec, P. (2013). Do high-frequency data improve high-dimensional portfolio allocations? Journal of Applied Econometrics, forthcoming.
Hautsch, N., Schaumburg, J., and Schienle, M. (2014a). Financial network systemic risk contributions. Review of Finance, forthcoming.
Hautsch, N., Schaumburg, J., and Schienle, M. (2014b). Forecasting systemic impact in financial networks. International Journal of Forecasting, 30(3), 781–794.
Ledoit, O. and Wolf, M. (2004). A well-conditioned estimator for large-dimensional covariance matrices. Journal of Multivariate Analysis, 88, 365–411.
Lunde, A., Shephard, N., and Sheppard, K. (2011). Econometric analysis of vast covariance matrices using composite realized kernels. Technical report.
Meinshausen, N. and Bühlmann, P. (2006). High dimensional graphs and variable selection with the lasso. The Annals of Statistics, 34, 1436–1462.
Ravikumar, P., Wainwright, M. J., Raskutti, G., and Yu, B. (2011). High-dimensional covariance estimation by minimizing ℓ1-penalized log-determinant divergence. Electronic Journal of Statistics, 5, 935–980.
Yuan, M. and Lin, Y. (2007). Model selection and estimation in the Gaussian graphical model. Biometrika, 94(1), 19–35.
Zhang, L. (2011). Estimating covariation: Epps effect, microstructure noise. Journal of Econometrics, 160(1), 33–47.
Table 1: Simulation Settings
Mnemonic   Dimension n   Volatility v, κ, τ   Network p   Microstructure η², µ
Design 1   50            50, 0.1, 2.75        0.9/50      0.25, 0.5
Design 2   50            50, 0.1, 2.75        1.0/50      0.25, 0.5
Design 3   50            50, 0.1, 2.75        1.1/50      0.25, 0.5

The table reports the parameter settings used in the simulation study: n is the number of stocks; v, κ and τ are the parameters of the volatility process; p is the probability that the partial correlation of any two stocks is not zero; η² is the variance of the microstructure noise and µ is the Poisson intensity driving the trading process.
Table 2: Realized Network Simulation Study
                     Σ RMSE             K RMSE          AUROC
                   plain     net      plain     net       net

Realized Covariance
Design 1          36.978  22.844      0.076   0.027     0.708
Design 2          37.757  23.536      0.071   0.028     0.644
Design 3          35.700  22.603      0.077   0.031     0.667

Multivariate Realized Kernel
Design 1          34.068  15.899      1.831   0.023     0.764
Design 2          34.535  16.654      1.090   0.022     0.697
Design 3          32.786  16.022      1.065   0.025     0.727

Two–Scales Realized Covariance
Design 1          30.326  12.885      1.137   0.021     0.806
Design 2          31.065  13.167      1.900   0.022     0.719
Design 3          29.110  12.840      0.828   0.024     0.746

The table reports the results of the simulation study on the selection and estimation precision of the realized network estimator. “Plain” stands for the unregularized estimators and “Net” refers to the regularized ones.
Table 3: Tickers, Company Names and Sectors

Ticker   Company Name               GICS Sector
AMZN     Amazon.com                 Consumer Discretionary
CMCSA    Comcast                    Consumer Discretionary
DIS      Walt Disney                Consumer Discretionary
F        Ford Motor                 Consumer Discretionary
FOXA     Twenty-First Century Fox   Consumer Discretionary
GM       General Motors             Consumer Discretionary
HD       Home Depot                 Consumer Discretionary
LOW      Lowes                      Consumer Discretionary
MCD      McDonalds                  Consumer Discretionary
NKE      NIKE                       Consumer Discretionary
SBUX     Starbucks                  Consumer Discretionary
TGT      Target                     Consumer Discretionary
TWX      Time Warner Inc.           Consumer Discretionary
CL       Colgate-Palmolive          Consumer Staples
COST     Costco Co.                 Consumer Staples
CVS      CVS Caremark               Consumer Staples
KO       The Coca Cola Company      Consumer Staples
MDLZ     Mondelez International     Consumer Staples
MO       Altria Group Inc           Consumer Staples
PEP      PepsiCo Inc.               Consumer Staples
PG       Procter & Gamble           Consumer Staples
PM       Philip Morris              Consumer Staples
WAG      Walgreen                   Consumer Staples
WMT      Wal-Mart Stores            Consumer Staples
APA      Apache                     Energy
APC      Anadarko Petroleum         Energy
COP      ConocoPhillips             Energy
CVX      Chevron                    Energy
DVN      Devon Energy               Energy
HAL      Halliburton Co.            Energy
NOV      National Oilwell Varco     Energy
OXY      Occidental Petroleum       Energy
SLB      Schlumberger Ltd.          Energy
XOM      Exxon Mobil                Energy
AIG      AIG                        Financials
ALL      Allstate                   Financials
AXP      American Express Co        Financials
BAC      Bank of America            Financials
BK       Bank of New York           Financials
BRK.B    Berkshire Hathaway         Financials
C        Citigroup Inc.             Financials
COF      Capital One Financial      Financials
GS       Goldman Sachs Group        Financials
JPM      JPMorgan Chase & Co.       Financials
MET      MetLife Inc.               Financials
MS       Morgan Stanley             Financials
SPG      Simon Property Group       Financials
USB      U.S. Bancorp               Financials
WFC      Wells Fargo                Financials
ABBV     AbbVie                     Health Care
ABT      Abbott Laboratories        Health Care
AMGN     Amgen                      Health Care
BAX      Baxter International       Health Care
BMY      Bristol-Myers Squibb       Health Care
GILD     Gilead Sciences            Health Care
JNJ      Johnson & Johnson          Health Care
LLY      Lilly & Co.                Health Care
MDT      Medtronic                  Health Care
MRK      Merck                      Health Care
PFE      Pfizer                     Health Care
UNH      UnitedHealth Group         Health Care
BA       Boeing Company             Industrials
CAT      Caterpillar                Industrials
EMR      Emerson Electric           Industrials
FDX      FedEx                      Industrials
GD       General Dynamics           Industrials
GE       General Electric           Industrials
HON      Honeywell Intl             Industrials
LMT      Lockheed Martin            Industrials
MMM      3M Company                 Industrials
NSC      Norfolk Southern           Industrials
RTN      Raytheon Co.               Industrials
UNP      Union Pacific              Industrials
UPS      United Parcel Service      Industrials
UTX      United Technologies        Industrials
AAPL     Apple                      Information Technology
ACN      Accenture                  Information Technology
CSCO     Cisco Systems              Information Technology
EBAY     eBay                       Information Technology
EMC      EMC                        Information Technology
FB       Facebook                   Information Technology
GOOG     Google Inc.                Information Technology
HPQ      Hewlett-Packard            Information Technology
IBM      IBM                        Information Technology
INTC     Intel                      Information Technology
MA       Mastercard                 Information Technology
MSFT     Microsoft                  Information Technology
ORCL     Oracle                     Information Technology
QCOM     QUALCOMM Inc.              Information Technology
TXN      Texas Instruments          Information Technology
V        Visa Inc.                  Information Technology
DD       DuPont                     Materials
DOW      Dow Chemical               Materials
FCX      Freeport-McMoran           Materials
MON      Monsanto Co.               Materials
T        AT&T Inc                   Telecommunications
VZ       Verizon Communications     Telecommunications
AEP      American Electric Power    Utilities
EXC      Exelon                     Utilities
SO       Southern Co.               Utilities

The table reports the list of company tickers, company names and industry sectors.
Table 4: Centrality on 2009-06-26
Rank   Ticker   Sector
1      GOOG     Information Technology
2      MA       Information Technology
3      SLB      Energy
4      FCX      Materials
5      APA      Energy
6      COP      Energy
7      OXY      Energy
8      APC      Energy
9      DVN      Energy
10     CVX      Energy
The table reports the top tickers by eigenvector centrality on June 26, 2009.
Table 5: Links on 2009-06-26
       Disc  Stap  Ener  Fin   Heal  Ind   Tech  Mat   Tel   Util
Disc      1     2                      4      8     1
Stap      2     1     1          1     1      6     1
Ener            1    35    2           2     12     6
Fin                   2   11                 14     3
Heal            1                3     2      7     3
Ind       4     1     2          2    12     20     4
Tech      8     6    12   14     7    20     14     6            2
Mat       1     1     6    3     3     4      6     3
Tel
Util                                          2
The table reports the number of estimated links among industry groups on June 26, 2009.
Table 6: GMV Forecasting
        Realized Network   Unconstrained   Constrained
MRK     24.66              31.83***        29.51**
TSRC    23.78              37.43***        41.41***
RV      23.82              39.99***        40.65***

The table reports the results of the GMV forecasting comparison exercise. “Realized Network” means ΣI is the realized network estimator, “Unconstrained” means ΣI is not regularized, and “Constrained” means ΣI is obtained by setting all of its off-diagonal entries to zero.
Figure 1: Realized Network Estimator Precision
The figure displays the Monte Carlo average of the scaled Frobenius norm of the realized network estimator based on the Multivariate Realized Kernel (triangles), Two–Scales Realized Covariance (circles) and Realized Covariance (squares) as a function of the tuning parameter λ.
Figure 2: Idiosyncratic Correlation Heatmap on 2009-06-26
The figure shows the heatmap of the idiosyncratic realized correlation matrix on June 26, 2009estimated using the TSRC estimator. Darker colors indicate higher correlations in absolute value.
Figure 3: Trace on 2009-06-26
The figure shows the trace and BIC of the realized network estimator on June 26, 2009.
Figure 4: Idiosyncratic Partial Correlation Network on 2009-06-26
The figure shows the optimal realized network estimated on June 26, 2009.
Figure 5: Degree and Partial Correlation Distribution on 2009-06-26
The figure shows the degree distribution and the distribution of partial correlations on June 26, 2009.
Figure 6: Sparsity vs. Volatility
The figure shows the sparsity of the estimated network (square) vis–a–vis the level of volatilitymeasured by the VIX (circle) for each week of 2009.
Figure 7: Sectorial Links
The figure shows the number of linkages of the different industry groups over the total number ofpossible linkages for each week of 2009.
Figure 8: Sectorial Concentration
The figure shows the link concentration (measured using the Herfindahl index) of the different industry groups for each week of 2009.
Figure 9: GMV Forecasting
The figure shows the weekly annualized volatilities of the GMV portfolios for each week of 2009. The plot shows the GMV weekly volatility of the realized network based weights (circle), the unrestricted covariance weights (square) and the restricted covariance weights (triangle).
A Technical Appendix
In this section we provide the proofs of Theorems 1, 3, 4, 5 and 6, Lemma 1 and Corollary 1.
Proof of Theorem 1. We start by proving the concentration inequality for a generic asset $i$.
By (1), the process of the efficient log-price of this asset is given by
$$ y_i(t) = \sum_{j=1}^{n} \int_0^t \theta_{ij}(u)\,dB_j(u), $$
where $\theta_{ij}(u)$ is the $(i,j)$ entry of $\Theta(u)$, and the $B_j$ are the iid one-dimensional components
of the $n$-dimensional Brownian motion $B$. Recall that asset $i$ is observed at times
$T_i = \{t_{i1}, \dots, t_{im_i}\}$ on $[0,1]$, which are assumed to satisfy $\sup_{2 \le j \le m_i} m_i (t_{ij} - t_{i,j-1}) \le c_\Delta$
(Assumption 3). We use the notation $x_j$, $y_j$, $u_j$ to denote $x(t_{ij})$, $y(t_{ij})$, $u(t_{ij})$, respectively,
for $j \in \{1, \dots, m_i\}$, and $\eta^2$ denotes the variance of the centered Gaussian random variables
$u_j$. We also denote $m_i$ by $m$. In order to simplify the notation we will also write $\theta(u)\,dB(u)$
to denote the scalar product $\sum_{j=1}^{n} \theta_{ij}(u)\,dB_j(u)$. In all what follows, $c > 0$ will denote a
generic constant that may change from line to line.
By definition (3), the realized kernel estimator of $\sigma^*_{ii}$ equals
$$ \hat\sigma^K_{ii} = \gamma_0(y,y) + A + B + C, \qquad (15) $$
where
$$ A = \sum_{h=1}^{H} k\Big(\frac{h-1}{H}\Big)\big(\gamma_h(y,y) + \gamma_{-h}(y,y)\big), $$
$$ B = 2\gamma_0(y,u) + \sum_{h=1}^{H} k\Big(\frac{h-1}{H}\Big)\big(\gamma_h(y,u) + \gamma_h(u,y) + \gamma_{-h}(y,u) + \gamma_{-h}(u,y)\big), $$
$$ C = \gamma_0(u,u) + \sum_{h=1}^{H} k\Big(\frac{h-1}{H}\Big)\big(\gamma_h(u,u) + \gamma_{-h}(u,u)\big), $$
and for each $h \in \{-H, \dots, H\}$, $\gamma_h(y,u) = \sum_{j=1}^{m-1} (y_j - y_{j-1})(u_{j-h} - u_{j-h-1})$. By Assumption
4, $H = c\, m^{1/2}$, where $c$ is some positive constant.
By Lemma 3 in Fan et al. (2012), there exist constants $c_1, c_2 > 0$ such that for large
$m$ and $x \in [0, c_1]$,
$$ P\big( |\gamma_0(y,y) - \sigma^*_{ii}| \ge x \big) \le 2\exp(-c_2 m x^2). \qquad (16) $$
Therefore, by (15) and (16), it suffices to bound the terms $A$, $B$ and $C$ in probability. For
this, we will use the following well-known exponential martingale inequality.
Lemma 2. Let $(\gamma(t),\, t \in [0,1])$ be an adapted process such that there exists a constant
$c > 0$ for which
$$ \sum_{j=1}^{K} \int_{t_{j-1}}^{t_j} |\gamma(u)|^2\, du \le c \quad \text{a.s.}, $$
where $0 \le t_0 < t_1 < \cdots < t_K \le 1$ is a partition of $[0,1]$. Then for any $x > 0$,
$$ P\left( \bigg| \sum_{j=1}^{K} \int_{t_{j-1}}^{t_j} \gamma(u)\, dB(u) \bigg| \ge x \right) \le 2\exp\Big( -\frac{x^2}{2c} \Big). $$
Bound in probability of the term $A$ in (15). We decompose $A$ as
$$ A = \sum_{j=1}^{m-1} \int_{t_{j-1}}^{t_j} \theta(u)\,dB(u) \sum_{h=1}^{H} k\Big(\frac{h-1}{H}\Big) \left( \int_{t_{j-h-1}}^{t_{j-h}} \theta(u)\,dB(u) + \int_{t_{j+h-1}}^{t_{j+h}} \theta(u)\,dB(u) \right) = A_1 + A_2. $$
We start bounding $A_1$. For any $x > 0$ and $\alpha > 0$, we have
$$ P\Big( |A_1| \ge \frac{x}{2} \Big) \le P\left( \bigg| \sum_{j=1}^{m-1} \int_{t_{j-1}}^{t_j} \theta(u)\,dB(u)\, D_j\, \mathbf{1}_{\{\sup_{j \in \{1,\dots,m-1\}} |D_j| \le m^{-1/4+\alpha}\}} \bigg| \ge \frac{x}{4} \right) + P\Big( \sup_{j \in \{1,\dots,m-1\}} |D_j| > m^{-1/4+\alpha} \Big), $$
where
$$ D_j = \sum_{h=1}^{H} k\Big(\frac{h-1}{H}\Big) \int_{t_{j-h-1}}^{t_{j-h}} \theta(u)\,dB(u). $$
By Assumptions 2 and 3,
$$ \sum_{h=1}^{H} k^2\Big(\frac{h-1}{H}\Big) \int_{t_{j-h-1}}^{t_{j-h}} |\theta(u)|^2\, du \le c\,\frac{H}{m} = c\, m^{-1/2}. $$
Therefore, appealing to Lemma 2, we get that
$$ P\Big( \sup_{j \in \{1,\dots,m-1\}} |D_j| > m^{-1/4+\alpha} \Big) \le 2m \exp\big( -c\, m^{2\alpha} \big). $$
On the other hand, since
$$ \sum_{j=1}^{m-1} \int_{t_{j-1}}^{t_j} |\theta(u)|^2\, du\; D_j^2\, \mathbf{1}_{\{\sup_{j} |D_j| \le m^{-1/4+\alpha}\}} \le c\, m^{2\alpha - 1/2}, $$
again using Lemma 2, we obtain that for all $x > 0$,
$$ P\left( \bigg| \sum_{j=1}^{m-1} \int_{t_{j-1}}^{t_j} \theta(u)\,dB(u)\, D_j\, \mathbf{1}_{\{\sup_{j} |D_j| \le m^{-1/4+\alpha}\}} \bigg| \ge \frac{x}{4} \right) \le 2\exp\big( -c\, m^{1/2-2\alpha} x^2 \big). $$
Therefore, choosing $\alpha > 0$ such that $m^{2\alpha} = m^{1/4} x$, we conclude that
$$ P\Big( |A_1| \ge \frac{x}{2} \Big) \le 2m \exp\big( -c\, m^{1/4} x \big). \qquad (17) $$
We next bound $A_2$. We write $A_2 = A_{21} + A_{22} + A_{23}$, where
$$ A_{21} = \sum_{i=1}^{H-1} \int_{t_i}^{t_{i+1}} \theta(u)\,dB(u) \sum_{j=0}^{i-1} k\Big(\frac{i-j}{H}\Big) \int_{t_j}^{t_{j+1}} \theta(u)\,dB(u), $$
$$ A_{22} = \sum_{i=H}^{m-1} \int_{t_i}^{t_{i+1}} \theta(u)\,dB(u) \sum_{j=i-H}^{i-1} k\Big(\frac{i-j}{H}\Big) \int_{t_j}^{t_{j+1}} \theta(u)\,dB(u), $$
$$ A_{23} = \sum_{i=m}^{m+H-1} \int_{t_i}^{t_{i+1}} \theta(u)\,dB(u) \sum_{j=i-H}^{m-2} k\Big(\frac{i-j}{H}\Big) \int_{t_j}^{t_{j+1}} \theta(u)\,dB(u). $$
We start bounding $A_{22}$. For a generic $i$, using Assumptions 2, 3 and 4, we get that
$$ \sum_{j=i-H}^{i-1} \int_{t_j}^{t_{j+1}} k^2\Big(\frac{i-j}{H}\Big) |\theta(u)|^2\, du \le c\, m^{-1/2}. $$
Thus, according to Lemma 2, for any $\alpha > 0$, we obtain that
$$ P\Big( \sup_{i \in \{H,\dots,m-1\}} |F_i| > m^{\alpha - 1/4} \Big) \le 2m \exp\big( -c\, m^{2\alpha} \big), $$
where $F_i = \sum_{j=i-H}^{i-1} k\big(\frac{i-j}{H}\big) \int_{t_j}^{t_{j+1}} \theta(u)\,dB(u)$. Therefore, using again Lemma 2, we con-
clude that for any $x > 0$ and $\alpha > 0$,
$$ P\Big( |A_{22}| \ge \frac{x}{6} \Big) \le P\left( \bigg| \sum_{i=H}^{m-1} \int_{t_i}^{t_{i+1}} \theta(u)\, F_i\, \mathbf{1}_{\{\sup_{i} |F_i| \le m^{\alpha-1/4}\}}\, dB(u) \bigg| \ge \frac{x}{12} \right) + P\Big( \sup_{i \in \{H,\dots,m-1\}} |F_i| > m^{\alpha-1/4} \Big) \le 2\exp\big( -c\, m^{1/2-2\alpha} x^2 \big) + 2m \exp\big( -c\, m^{2\alpha} \big), $$
since
$$ \sum_{i=H}^{m-1} \int_{t_i}^{t_{i+1}} |\theta(u)|^2\, F_i^2\, \mathbf{1}_{\{\sup_{i} |F_i| \le m^{\alpha-1/4}\}}\, du \le c\, m^{2\alpha - 1/2}. $$
Thus, choosing $\alpha$ as in (17), we conclude that $A_{22}$ also satisfies (17). Similarly, so do
$A_{21}$, $A_{23}$, and hence $A$.
Bound in probability of the term $B$ in (15). We decompose $B$ as $B = B_1 + B_2$, where
$$ B_1 = \gamma_0(y,u) + \sum_{h=1}^{H} k\Big(\frac{h-1}{H}\Big)\big( \gamma_h(y,u) + \gamma_{-h}(y,u) \big), $$
$$ B_2 = \gamma_0(u,y) + \sum_{h=1}^{H} k\Big(\frac{h-1}{H}\Big)\big( \gamma_h(u,y) + \gamma_{-h}(u,y) \big). $$
We start bounding $B_1$. We write
$$ B_1 = \sum_{j=1}^{m-1} \int_{t_{j-1}}^{t_j} \theta(u)\,dB(u)\; G_j, $$
where
$$ G_j = \sum_{h=1}^{H} k\Big(\frac{h-1}{H}\Big)\big( u_{j-h} - u_{j-h-1} + u_{j+h} - u_{j+h-1} \big) + u_j - u_{j-1}. \qquad (18) $$
Observe that since $k(0) = 1$ and $k(1) = 0$, we have
$$ G_j \sim N\left( 0,\; 2\eta^2 \sum_{h=1}^{H} \Big( k\Big(\frac{h}{H}\Big) - k\Big(\frac{h-1}{H}\Big) \Big)^2 \right). $$
Moreover, there exists $\beta_h \in \big[ \frac{h-1}{H}, \frac{h}{H} \big]$ such that
$$ \sum_{h=1}^{H} \Big( k\Big(\frac{h}{H}\Big) - k\Big(\frac{h-1}{H}\Big) \Big)^2 = \frac{1}{H} \sum_{h=1}^{H} \frac{1}{H}\, k'(\beta_h)^2 \le c\, m^{-1/2}. $$
Therefore, for all $\alpha > 0$,
$$ P\Big( \sup_{j \in \{1,\dots,m-1\}} |G_j| > m^{\alpha - 1/4} \Big) \le 2m \exp\big( -c\, m^{2\alpha} \big). $$
Consequently, by Lemma 2, for any $x > 0$,
$$ P(|B_1| \ge x) \le P\left( \bigg| \sum_{j=1}^{m-1} \int_{t_{j-1}}^{t_j} \theta(u)\,dB(u)\, G_j\, \mathbf{1}_{\{\sup_{j} |G_j| \le m^{\alpha-1/4}\}} \bigg| \ge x \right) + P\Big( \sup_{j \in \{1,\dots,m-1\}} |G_j| > m^{\alpha-1/4} \Big) \le 2\exp\big( -c\, m^{1/2-2\alpha} x^2 \big) + 2m \exp\big( -c\, m^{2\alpha} \big), $$
since
$$ \sum_{j=1}^{m-1} \int_{t_{j-1}}^{t_j} |\theta(u)|^2\, G_j^2\, \mathbf{1}_{\{\sup_{j} |G_j| \le m^{\alpha-1/4}\}}\, du \le c\, m^{2\alpha - 1/2}. $$
Thus, choosing $\alpha$ as in (17), we conclude that $B_1$ also satisfies (17).

We next bound $B_2$. We write
$$ B_2 = \sum_{j=1}^{m-1} (u_j - u_{j-1})\, S_j = -S_1 u_0 + \sum_{j=2}^{m-1} (S_{j-1} - S_j)\, u_{j-1} + S_{m-1}\, u_{m-1}, $$
where
$$ S_j = \sum_{h=1}^{H} k\Big(\frac{h-1}{H}\Big) \left( \int_{t_{j-h-1}}^{t_{j-h}} \theta(u)\,dB(u) + \int_{t_{j+h-1}}^{t_{j+h}} \theta(u)\,dB(u) \right) + \int_{t_{j-1}}^{t_j} \theta(u)\,dB(u). $$
Now, for all $x > 0$ and $\alpha > 0$, we write
$$ P(|B_2| \ge x) \le P\Big( \big| B_2\, \mathbf{1}_{\{E \le m^{2\alpha - 1/2}\}} \big| \ge \frac{x}{2} \Big) + P\big( E > m^{2\alpha - 1/2} \big), $$
where $E = S_1^2 + \sum_{j=2}^{m-1} (S_{j-1} - S_j)^2 + S_{m-1}^2$.
Since the random variables $u_0, u_1, \dots, u_m$ are iid centered Gaussian with variance $\eta^2$,
$$ P\Big( \big| B_2\, \mathbf{1}_{\{E \le m^{2\alpha - 1/2}\}} \big| \ge \frac{x}{2} \Big) \le 2\exp\big( -c\, m^{1/2 - 2\alpha} x^2 \big). $$
In order to bound the second term, we observe that
$$ E = S_1(2S_1 - S_2) + \sum_{j=2}^{m-2} S_j(2S_j - S_{j-1} - S_{j+1}) + S_{m-1}(2S_{m-1} - S_{m-2}). $$
Thus,
$$ P\big( E > 7 m^{2\alpha - 1/2} \big) \le P\Big( \sup_{j \in \{2,\dots,m-2\}} |2S_j - S_{j-1} - S_{j+1}| > m^{\alpha - 5/4} \Big) + P\Big( \sup_{j \in \{1,\dots,m-1\}} |S_j| > m^{\alpha - 1/4} \Big). $$
Straightforward computations show that both terms are bounded by $c\, m \exp(-c\, m^{2\alpha})$.
Therefore, choosing $\alpha$ as in (17), we conclude that $B_2$, and thus $B$, also satisfies (17).
Bound in probability of the term $C$ in (15). We write
$$ C = -u_0 e_1 + \sum_{j=1}^{m-2} (e_j - e_{j+1})\, u_j + e_{m-1}\, u_{m-1}, $$
where $e_j = \sum_{h=2}^{H} k\big(\frac{h-1}{H}\big)\big( u_{j-h} - u_{j-h-1} + u_{j+h} - u_{j+h-1} \big) + u_{j+1} - u_{j-2}$.

For all $x > 0$ and $\alpha > 0$, we write
$$ P(|C| \ge x) \le P\Big( \big| C\, \mathbf{1}_{\{F \le m^{2\alpha - 1/2}\}} \big| \ge \frac{x}{2} \Big) + P\big( F > m^{2\alpha - 1/2} \big), $$
where $F = e_1^2 + \sum_{j=2}^{m-1} (e_j - e_{j-1})^2 + e_{m-1}^2$.

Again, straightforward computations show that
$$ P\big( F > 3 m^{2\alpha - 1/2} \big) \le P\Big( \sup_{j \in \{2,\dots,m-1\}} (e_j - e_{j-1})^2 > m^{2\alpha - 3/2} \Big) + P\big( e_1^2 > m^{2\alpha - 1/2} \big) + P\big( e_{m-1}^2 > m^{2\alpha - 1/2} \big) \le c\, m \exp\big( -c\, m^{2\alpha} \big). $$
Hence, the rest of the proof follows as for $B_2$, which shows that $C$ also satisfies (17).

Conclusion. The bounds obtained for $A$, $B$, and $C$, together with (15) and (16), show
that there exist constants $c_1, c_2, c_3, c_4 > 0$ such that for large $m$ and $x \in [0, c_1]$,
$$ P\big( |\hat\sigma^K_{ii} - \sigma^*_{ii}| \ge x \big) \le 2\exp(-c_2 m x^2) + c_3\, m \exp\big( -c_4\, m^{1/4} x \big). $$
Now, when $m^{-3/4} < x \le c_1$, we have $x^2 m \ge x\, m^{1/4}$, and thus we obtain that
$$ P\big( |\hat\sigma^K_{ii} - \sigma^*_{ii}| \ge x \big) \le c_5\, m \exp\big( -c_6\, m^{1/4} x \big). $$
On the other hand, when $0 \le x \le m^{-3/4}$, we have that
$$ m \exp\big( -c_4\, x\, m^{1/4} \big) \ge m \exp\big( -c_4\, m^{-1/2} \big) > 1, $$
and thus the same bound holds. This concludes the proof of the theorem for the diagonal
terms.
We are now going to prove the concentration inequality for two generic assets $i$ and $j$.
By (1), the integrated covariance of the log-returns of assets $i$ and $j$ on $[0,1]$ is given by
$$ \int_0^1 \Theta_i(u)\Theta_j(u)'\,du = \frac{1}{4} \int_0^1 \big( \Theta_i(u) + \Theta_j(u) \big)\big( \Theta_i(u) + \Theta_j(u) \big)'\,du - \frac{1}{4} \int_0^1 \big( \Theta_i(u) - \Theta_j(u) \big)\big( \Theta_i(u) - \Theta_j(u) \big)'\,du, \qquad (19) $$
where $\Theta_k(u)$, $k \in \{1,\dots,n\}$, denote the rows of the matrix $\Theta(u)$. Therefore, in order to
estimate the integrated covariance, it suffices to estimate the two terms in (19).

The MRK estimator of $s^*_{ij} = \int_0^1 ( \Theta_i(u) + \Theta_j(u) )( \Theta_i(u) + \Theta_j(u) )'\,du$ is given by
$$ \hat s^K_{ij} = \gamma_0(x^r_i + x^r_j) + \sum_{h=1}^{H} k\Big(\frac{h-1}{H}\Big)\big( \gamma_h(x^r_i + x^r_j) + \gamma_{-h}(x^r_i + x^r_j) \big), $$
where $\gamma_h(x^r_i + x^r_j) = \gamma_h(x^r_i + x^r_j,\, x^r_i + x^r_j)$ for all $h \in \{-H,\dots,H\}$.

Observe that if there is no asynchronicity among the observations, then we can derive
the concentration inequality for $\hat s^K_{ij}$ as we did for the case of a single asset. Therefore, it
suffices to split the estimator as sK ij = sK ij +∑10
q=1 Fq, where sK ij is the analogous of sK ij
but replacing xri + xrj by xi + xj, where x`k := y`(τk) + ur`k, for ` = i, j and k ∈ 1, ...,m.
The terms Fq, are those that contain the points 4y` k := y`(τk)− yr`k, that is,
F1 = γ0(4yi, xi + xj) +H∑h=1
k
(h− 1
H
)(γh(4yi, xi + xj) + γ−h(4yi, xi + xj)),
F3 = γ0(xi + xj,4yi) +H∑h=1
k
(h− 1
H
)(γh(x1 + x2,4yi) + γ−h(xi + xj,4yi)),
F5 = γ0(4yi,4yj) +H∑h=1
k
(h− 1
H
)(γh(4yi,4yj) + γ−h(4yi,4yj)),
F7 = γ0(4yi, uri + urj) +H∑h=1
k
(h− 1
H
)(γh(4yi, uri ) + γ−h(4yi, uri + urj)),
F9 = γ0(uri + urj ,4yi) +H∑h=1
k
(h− 1
H
)(γh(u
ri + urj ,4yi) + γ−h(u
ri + urj ,4yi)).
F2, F4, F6, F8 and F10 are equal to F1, F3, F5, F7 and F9, respectively, by replacing 4yi by
4yj and viceversa.
By the previous proof of the concentration inequality for a single asset, there exist constants $c_1, c_2, c_3 > 0$ such that for large $m$ and $x \in [0, c_1]$,
\[
P\big(\big|\tilde s^K_{ij} - s^*_{ij}\big| \ge x\big) \le c_2\, m \exp\big(-c_3\, m^{1/4} x\big).
\]
On the other hand, straightforward computations show that for each $q \in \{1, \dots, 10\}$, there exist constants $c_1, c_2, c_3 > 0$ such that for large $m$ and $x \in [0, c_1]$,
\[
P\big(|F_q| \ge x\big) \le c_2\, m \exp\big(-c_3\, m^{1/4} x\big).
\]
Proceeding similarly, we obtain the same estimate for the MRK estimator of the second term in (19), which concludes the proof.
Proof of Corollary 1. We prove the corollary when $\hat\Sigma$ is the TSCR estimator. The proof follows similarly for the MRK estimator.
Notice that $\Sigma^*_{FF} = \sigma^*_{n+1,n+1}$ and $\hat\Sigma_{FF} = \hat\sigma_{n+1,n+1}$. Then for all $i, j \in V$,
\[
\sigma^*_{I,ij} = \sigma^*_{ij} - \frac{\sigma^*_{n+1,i}\,\sigma^*_{j,n+1}}{\sigma^*_{n+1,n+1}}
\quad\text{and}\quad
\hat\sigma_{I,ij} = \hat\sigma_{ij} - \frac{\hat\sigma_{n+1,i}\,\hat\sigma_{j,n+1}}{\hat\sigma_{n+1,n+1}}.
\]
Therefore,
\[
P\big(\big|\hat\sigma_{I,ij} - \sigma^*_{I,ij}\big| > x\big)
\le P\Big(\big|\hat\sigma_{ij} - \sigma^*_{ij}\big| \ge \frac{x}{2}\Big)
+ P\Big(\Big|\frac{\hat\sigma_{n+1,i}\,\hat\sigma_{j,n+1}}{\hat\sigma_{n+1,n+1}} - \frac{\sigma^*_{n+1,i}\,\sigma^*_{j,n+1}}{\sigma^*_{n+1,n+1}}\Big| \ge \frac{x}{2}\Big).
\]
According to (5), for large $M$, $x \in [0, a_1]$ and $i, j \in V$, the first term is bounded by $a_2 e^{-a_3 M^{1/3} x^2}$. In order to bound the second term, set $b_1 = \hat\sigma_{n+1,n+1} - \sigma^*_{n+1,n+1}$, $b_2 = \hat\sigma_{n+1,i} - \sigma^*_{n+1,i}$, and $b_3 = \hat\sigma_{j,n+1} - \sigma^*_{j,n+1}$. Then,
\[
\Big|\frac{\hat\sigma_{n+1,i}\,\hat\sigma_{j,n+1}}{\hat\sigma_{n+1,n+1}} - \frac{\sigma^*_{n+1,i}\,\sigma^*_{j,n+1}}{\sigma^*_{n+1,n+1}}\Big|
= \Big|\frac{b_3\,\sigma^*_{n+1,n+1}\,\sigma^*_{n+1,i} + b_2\,\sigma^*_{n+1,n+1}\,\sigma^*_{j,n+1} + b_2 b_3\,\sigma^*_{n+1,n+1} - b_1\,\sigma^*_{n+1,i}\,\sigma^*_{j,n+1}}{\sigma^*_{n+1,n+1}\big(\sigma^*_{n+1,n+1} + b_1\big)}\Big|. \tag{20}
\]
Without loss of generality, we assume that $\sigma^*_{j,n+1}$ and $\sigma^*_{n+1,i}$ are non-zero; otherwise, the proof follows similarly. Then,
\[
P\Big(\Big|\frac{\hat\sigma_{n+1,i}\,\hat\sigma_{j,n+1}}{\hat\sigma_{n+1,n+1}} - \frac{\sigma^*_{n+1,i}\,\sigma^*_{j,n+1}}{\sigma^*_{n+1,n+1}}\Big| \ge 8x\Big)
\le P\Big(|b_1| \ge \max\Big(\frac{\sigma^*_{n+1,n+1}}{2},\, \frac{\sigma^{*\,2}_{n+1,n+1}\,x}{|\sigma^*_{n+1,i}\,\sigma^*_{j,n+1}|}\Big)\Big)
+ P\Big(|b_2| \ge \max\Big(\frac{\sigma^*_{n+1,n+1}\,x}{|\sigma^*_{j,n+1}|},\, 1\Big)\Big)
+ P\Big(|b_3| \ge \frac{\sigma^*_{n+1,n+1}\,x}{|\sigma^*_{n+1,i}|}\Big).
\]
Observe that in the last inequality we have used the fact that if $|b_1| \le \frac{\sigma^*_{n+1,n+1}}{2}$, then the denominator in (20) is bounded below by $\frac{\sigma^{*\,2}_{n+1,n+1}}{2}$. Then, appealing again to the concentration inequality (5), for large $M$ and $x \in [0, c_1]$, this term is also bounded by $c_2 e^{-c_3 M^{1/3} x^2}$. Thus, the result follows.
Proof of Lemma 1. By Lagrangian duality, for each positive $\lambda$ there exists a positive constant $c(\lambda)$ such that the minimization problem (11) can be written in the equivalent constrained form
\[
\min_{K \in S_n,\; ||K||_{1,o} \le c(\lambda)} \big[\mathrm{tr}\big(K\hat\Sigma\big) - \log\det(K)\big].
\]
When $||K||_{1,o} \le c(\lambda)$, we can find a positive constant $c$ such that
\[
h(K) := \mathrm{tr}\big(K\hat\Sigma\big) - \log\det(K) \ge \sum_{i=1}^n k_{ii}\hat\sigma_{ii} - \log\det(K) - c.
\]
Set $S_1 = \{K \in S_n : ||K||_{1,o} \le c(\lambda)\}$. For a generic $i$, the function $f(x) = x\hat\sigma_{ii} - \log x$ is increasing on $(1/\hat\sigma_{ii}, +\infty)$, decreasing on $(0, 1/\hat\sigma_{ii})$, and tends to infinity as $x \to \infty$ or $x \to 0^+$. Thus we can find positive constants $a_i, b_i$ such that for any $K_0 = (k^0_{ij}) \in S_1 \setminus S_2$, where $S_2 = \{K \in S_n : ||K||_{1,o} \le c(\lambda),\; a_i \le k_{ii} \le b_i \;\forall i\}$, and any $K_1 \in S_2$, we have
\[
\sum_{i=1}^n \big(k^0_{ii}\hat\sigma_{ii} - \log k^0_{ii}\big) > c + h(K_1).
\]
According to Hadamard's inequality for positive definite matrices, $\det(K_0) \le \prod_{i=1}^n k^0_{ii}$. Therefore,
\[
h(K_0) \ge \sum_{i=1}^n k^0_{ii}\hat\sigma_{ii} - \log\det(K_0) - c \ge \sum_{i=1}^n \big(k^0_{ii}\hat\sigma_{ii} - \log k^0_{ii}\big) - c > h(K_1).
\]
Consequently, the minimum of $h(K)$ over $S_2$ is also its minimum over $S_1$. Because $h(K)$ is a convex function and $S_2$ is compact, the minimum is unique. By standard optimality conditions for convex programs, the minimizer $\hat K$ satisfies (12).
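Hadamard's inequality, $\det(K) \le \prod_{i=1}^n k_{ii}$ for positive definite $K$, is the key step in the argument above; a quick numerical illustration (not part of the proof):

```python
import numpy as np

rng = np.random.default_rng(1)
A = rng.normal(size=(5, 5))
K = A @ A.T + 5.0 * np.eye(5)  # a random symmetric positive definite matrix

# Hadamard's inequality: the determinant never exceeds the product of diagonals
assert np.linalg.det(K) <= np.prod(np.diag(K))
```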
In order to prove Theorems 3, 4, 5 and 6, we need the following preliminary lemmas. Set $W = (w_{ij}) = \hat\Sigma - \Sigma^*$, $\Delta = \tilde K - K^*$, where $\tilde K$ denotes the primal-dual witness solution, and $R(\Delta) = (K^* + \Delta)^{-1} - K^{*-1} + K^{*-1}\Delta K^{*-1}$.
Lemma 3. Suppose that for some $\lambda > 0$,
\[
\max\big\{||W||_\infty,\, ||R(\Delta)||_\infty\big\} \le \frac{\alpha\lambda}{8},
\]
where $\alpha \in (0, 1]$ is from Assumption 6. Then for all $(i, j) \in S^c$, $|\tilde z_{ij}| < 1$. That is, step (c) in the construction of the primal-dual witness solution succeeds.
Lemma 4. Suppose that $||\Delta||_\infty \le \frac{1}{3 C_{\Sigma^*} d}$, where $d$ is the maximum degree of the network, that is, the maximum number of edges that include a vertex, and $C_{\Sigma^*} = |||\Sigma^*|||_\infty$. Then
\[
||R(\Delta)||_\infty \le \frac{3}{2}\, d\, ||\Delta||_\infty^2\, C_{\Sigma^*}^3.
\]
Lemma 5. Suppose that
\[
r = 2C_{\Gamma^*}\big(||W||_\infty + \lambda\big) \le \big(3d \max\big(C_{\Sigma^*},\, C_{\Sigma^*}^3 C_{\Gamma^*}\big)\big)^{-1}, \tag{21}
\]
where $C_{\Gamma^*} = |||(\Gamma^*_{SS})^{-1}|||_\infty$. Then $||\Delta||_\infty \le r$.
Lemma 6. Assume that (21) holds. Assume also that $k_m > 2C_{\Gamma^*}(||W||_\infty + \lambda)$, where $k_m$ is the minimum absolute value of the non-zero entries of $K^*$. Then $\mathrm{sign}(\tilde k_{ij}) = \mathrm{sign}(k^*_{ij})$ for any $(i, j) \in S$.
In the following, Lemma 7 covers the case when $\hat\Sigma$ is the TSCR estimator, and Lemma 8 the case of the MRK estimator.
Lemma 7. Assume that for some $\tau > 2$, $M^{-1/6}\big[\frac{\log(a_2 n^\tau)}{a_3}\big]^{1/2} \le a_1$, where the $a_i$ are the constants in (5). Then
\[
P\Big(||W||_\infty > M^{-1/6}\Big[\frac{\log(a_2 n^\tau)}{a_3}\Big]^{1/2}\Big) \le \frac{1}{n^{\tau-2}}.
\]
Lemma 8. Assume that
\[
M \ge \frac{16\big[\log\big(n^{-\tau} a_3^4 x^4 / a_2\big)\big]^4}{a_3^4 x^4}
\]
for some $\tau > 2$, where the $a_i$ are the constants in (4). Then there exists $c_0$ depending on the $a_i$ such that for all $x \in [0, c_0]$,
\[
P\big(||W||_\infty \ge x\big) \le \frac{1}{n^{\tau-2}}.
\]
Proof of Lemma 3. We first rewrite condition (14) in the equivalent form
\[
K^{*-1}\Delta K^{*-1} + W - R(\Delta) + \lambda \tilde Z = 0. \tag{22}
\]
In order to transform this matrix equation into a vector equation, we use the following notation: for any matrix $A$, we write $\bar A$ for the vector obtained by stacking up the rows of $A$ into a single column vector. In particular,
\[
\overline{K^{*-1}\Delta K^{*-1}} = \big(K^{*-1} \otimes K^{*-1}\big)\bar\Delta = \Gamma^*\bar\Delta.
\]
For any matrix $A = (a_{ij}) \in \mathbb{R}^{n\times n}$ and any subset $T$ of $V \times V$, we define the vector $\bar A_T$ collecting the entries $a_{ij}$ with $(i, j) \in T$, the relative order of any two entries in $\bar A_T$ being the same as their order in $\bar A$. Given that $\bar\Delta_{S^c} = 0$ by construction, equation (22) can be rewritten as two blocks of linear equations:
\[
\Gamma^*_{SS}\bar\Delta_S + \bar W_S - \overline{R(\Delta)}_S + \lambda\bar{\tilde Z}_S = 0, \tag{23}
\]
\[
\Gamma^*_{S^cS}\bar\Delta_S + \bar W_{S^c} - \overline{R(\Delta)}_{S^c} + \lambda\bar{\tilde Z}_{S^c} = 0. \tag{24}
\]
Since $\Gamma^*_{SS}$ is invertible, we obtain from (23) that
\[
\bar\Delta_S = (\Gamma^*_{SS})^{-1}\big[-\bar W_S + \overline{R(\Delta)}_S - \lambda\bar{\tilde Z}_S\big].
\]
Substituting this expression into equation (24), we get that
\[
\bar{\tilde Z}_{S^c} = \frac{1}{\lambda}\Gamma^*_{S^cS}(\Gamma^*_{SS})^{-1}\big(\bar W_S - \overline{R(\Delta)}_S\big) + \Gamma^*_{S^cS}(\Gamma^*_{SS})^{-1}\bar{\tilde Z}_S - \frac{1}{\lambda}\big(\bar W_{S^c} - \overline{R(\Delta)}_{S^c}\big).
\]
Recalling the definitions of $||\cdot||_\infty$ and $|||\cdot|||_\infty$, we obtain that
\[
||\bar{\tilde Z}_{S^c}||_\infty \le \frac{1}{\lambda}|||\Gamma^*_{S^cS}(\Gamma^*_{SS})^{-1}|||_\infty\big(||\bar W_S||_\infty + ||\overline{R(\Delta)}_S||_\infty\big) + ||\Gamma^*_{S^cS}(\Gamma^*_{SS})^{-1}\bar{\tilde Z}_S||_\infty + \frac{1}{\lambda}\big(||\bar W_{S^c}||_\infty + ||\overline{R(\Delta)}_{S^c}||_\infty\big).
\]
According to Assumption 6 and the fact that $||\bar{\tilde Z}_S||_\infty \le 1$, we have that
\[
||\Gamma^*_{S^cS}(\Gamma^*_{SS})^{-1}\bar{\tilde Z}_S||_\infty \le 1 - \alpha,
\]
so that
\[
||\bar{\tilde Z}_{S^c}||_\infty \le \frac{2-\alpha}{\lambda}\big(||W||_\infty + ||R(\Delta)||_\infty\big) + 1 - \alpha.
\]
Therefore, if $\max\{||W||_\infty, ||R(\Delta)||_\infty\} \le \frac{\alpha\lambda}{8}$, we conclude that $||\bar{\tilde Z}_{S^c}||_\infty < 1$.
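The vectorization step above rests on the identity $\overline{AXA} = (A \otimes A)\bar X$ for symmetric $A$, with $\bar{\cdot}$ the row-stacking operator; a quick numerical check (illustrative only):

```python
import numpy as np

rng = np.random.default_rng(2)
B = rng.normal(size=(4, 4))
A = B @ B.T + 4.0 * np.eye(4)  # symmetric matrix, playing the role of K*^{-1}
X = rng.normal(size=(4, 4))    # playing the role of Delta

def vec(M):
    # Row-stacking a matrix into a column vector (the bar operator above)
    return M.flatten()  # C order = row-major stacking

# vec(A X A) = (A kron A) vec(X) when A is symmetric
assert np.allclose(vec(A @ X @ A), np.kron(A, A) @ vec(X))
```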
Proof of Lemma 4. We write $R(\Delta)$ in the form
\[
R(\Delta) = (K^* + \Delta)^{-1} - K^{*-1} + K^{*-1}\Delta K^{*-1}.
\]
Using the fact that $|||\Delta|||_\infty \le d||\Delta||_\infty$ and the assumption of the lemma, we get that
\[
|||K^{*-1}\Delta|||_\infty \le |||K^{*-1}|||_\infty|||\Delta|||_\infty \le C_{\Sigma^*}\, d\, ||\Delta||_\infty < \frac{1}{3}. \tag{25}
\]
Consequently, we have the matrix expansion
\begin{align*}
(K^* + \Delta)^{-1} &= \big(K^*(I_n + K^{*-1}\Delta)\big)^{-1} = (I_n + K^{*-1}\Delta)^{-1}(K^*)^{-1} \\
&= \sum_{k=0}^\infty (-1)^k (K^{*-1}\Delta)^k (K^*)^{-1} \\
&= K^{*-1} - K^{*-1}\Delta K^{*-1} + \sum_{k=2}^\infty (-1)^k (K^{*-1}\Delta)^k (K^*)^{-1} \\
&= K^{*-1} - K^{*-1}\Delta K^{*-1} + K^{*-1}\Delta K^{*-1}\Delta J K^{*-1},
\end{align*}
where $J = \sum_{k=0}^\infty (-1)^k (K^{*-1}\Delta)^k$. Therefore, we conclude that
\[
R(\Delta) = K^{*-1}\Delta K^{*-1}\Delta J K^{*-1}. \tag{26}
\]
Let $(e_i)_{i=1}^n$ denote the Euclidean basis of $\mathbb{R}^n$. Given the fact that for any vectors $a, b \in \mathbb{R}^n$, $|a'b| \le ||a||_\infty ||b||_1$, it holds that
\begin{align*}
||R(\Delta)||_\infty &= \max_{i,j}\big|e_i' K^{*-1}\Delta K^{*-1}\Delta J K^{*-1} e_j\big| \\
&\le \max_i ||e_i'K^{*-1}\Delta||_\infty \max_j ||K^{*-1}\Delta J K^{*-1} e_j||_1.
\end{align*}
Since for any vector $u \in \mathbb{R}^n$, $||u'\Delta||_\infty \le ||u||_1||\Delta||_\infty$, we get that
\[
||R(\Delta)||_\infty \le \max_i ||e_i'K^{*-1}||_1\, ||\Delta||_\infty\, \max_j ||K^{*-1}\Delta J K^{*-1} e_j||_1.
\]
Continuing on, we obtain that
\[
||R(\Delta)||_\infty \le |||K^{*-1}|||_\infty\, ||\Delta||_\infty\, |||K^{*-1}J'\Delta K^{*-1}|||_\infty
\le C_{\Sigma^*}\, ||\Delta||_\infty\, |||K^{*-1}|||_\infty^2\, |||J'|||_\infty\, |||\Delta|||_\infty. \tag{27}
\]
Given the definition of $J$ and (25), we have
\[
|||J'|||_\infty \le \sum_{k=0}^\infty |||\Delta K^{*-1}|||_\infty^k \le \frac{1}{1 - |||K^{*-1}|||_\infty|||\Delta|||_\infty} \le \frac{3}{2}.
\]
Substituting this in (27) and using the fact that $|||\Delta|||_\infty \le d||\Delta||_\infty$, we conclude that
\[
||R(\Delta)||_\infty \le \frac{3}{2}\,C_{\Sigma^*}\,||\Delta||_\infty\,|||K^{*-1}|||_\infty^2\,|||\Delta|||_\infty \le \frac{3}{2}\,C_{\Sigma^*}^3\, d\,||\Delta||_\infty^2.
\]
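The expansion used in the proof, $(K^* + \Delta)^{-1} = K^{*-1} - K^{*-1}\Delta K^{*-1} + R(\Delta)$ with $R(\Delta)$ of second order in $\Delta$, can be checked numerically for a small perturbation (an illustrative sketch):

```python
import numpy as np

rng = np.random.default_rng(3)
B = rng.normal(size=(4, 4))
K = B @ B.T + 4.0 * np.eye(4)        # playing K*, symmetric positive definite
D = 1e-3 * rng.normal(size=(4, 4))
D = (D + D.T) / 2.0                  # small symmetric perturbation (Delta)

Kinv = np.linalg.inv(K)
# Remainder R(Delta) = (K + Delta)^{-1} - K^{-1} + K^{-1} Delta K^{-1}
R = np.linalg.inv(K + D) - Kinv + Kinv @ D @ Kinv

# R(Delta) is second order: its entries are O(||Delta||^2), far below ||Delta||
assert np.max(np.abs(R)) < 100.0 * np.max(np.abs(D)) ** 2
```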
Proof of Lemma 5. For any matrix $X = (x_{ij}) \in \mathbb{R}^{n\times n}$, we define $X_S = (x^{(s)}_{ij})$ by $x^{(s)}_{ij} = x_{ij}$ if $(i, j) \in S$, and $x^{(s)}_{ij} = 0$ if $(i, j) \in S^c$. Let $X = (x_{ij}) \in \mathbb{R}^{n\times n}$ be such that $X_S = X$. Set
\[
G(\bar X_S) = \bar{\hat\Sigma}_S - \overline{(X^{-1})}_S + \lambda\bar{\tilde Z}_S
\quad\text{and}\quad
F(\bar X_S) = -(\Gamma^*_{SS})^{-1}G\big(\bar{K^*}_S + \bar X_S\big) + \bar X_S.
\]
If we take partial derivatives of the Lagrangian of the restricted problem (13) with respect to the unconstrained entries in $K$, the partial derivatives must vanish at the optimum, so we obtain the zero-gradient condition
\[
\bar{\hat\Sigma}_S - \overline{(\tilde K^{-1})}_S + \lambda\bar{\tilde Z}_S = 0.
\]
Thus $G(\bar X_S) = 0$ holds if and only if $\bar X_S = \bar{\tilde K}_S$. Consequently, $F(\bar X_S) = \bar X_S$ holds if and only if $\bar X_S = \bar\Delta_S$, with $\Delta = \tilde K - K^*$. Notice that $\Delta_S = \Delta$, so our goal is to prove that $\bar\Delta_S \in B(r)$, where $B(r)$ denotes the ball
\[
B(r) = \big\{\bar X_S : ||\bar X_S||_\infty \le r,\; X \in \mathbb{R}^{n\times n}\big\}, \quad\text{with } r = 2C_{\Gamma^*}\big(||W||_\infty + \lambda\big).
\]
We have already shown that $\bar\Delta_S$ is the only fixed point of $F$, which is continuous on $B(r)$, a convex and compact set. Therefore, if we show that $F(\bar X_S) \in B(r)$ for any $\bar X_S \in B(r)$, then $\bar\Delta_S \in B(r)$ by Brouwer's fixed point theorem.
By definition,
\begin{align*}
G\big(\bar{K^*}_S + \bar X_S\big) &= \bar{\hat\Sigma}_S - \overline{\big((K^* + X)^{-1}\big)}_S + \lambda\bar{\tilde Z}_S \\
&= \big(\bar{\hat\Sigma}_S - \overline{(K^{*-1})}_S\big) + \big(\overline{(K^{*-1})}_S - \overline{\big((K^* + X)^{-1}\big)}_S\big) + \lambda\bar{\tilde Z}_S \\
&= \bar W_S + \big(\overline{(K^{*-1})}_S - \overline{\big((K^* + X)^{-1}\big)}_S\big) + \lambda\bar{\tilde Z}_S. \tag{28}
\end{align*}
Note that $||X||_\infty \le r \le \frac{1}{3C_{\Sigma^*}d}$, as $\bar X_S \in B(r)$ and the entries of $X$ on $S^c$ are zero. Then, according to (26),
\[
R(X) = (K^* + X)^{-1} - K^{*-1} + K^{*-1}XK^{*-1} = (K^{*-1}X)^2 J(X) K^{*-1},
\]
where $J(X) = \sum_{k=0}^\infty (-1)^k (K^{*-1}X)^k$. If we take the vectorized form and restrict to the entries in $S$, we obtain
\[
\overline{\big((K^* + X)^{-1}\big)}_S - \overline{(K^{*-1})}_S + \Gamma^*_{SS}\bar X_S = \overline{\big((K^{*-1}X)^2 J(X)K^{*-1}\big)}_S. \tag{29}
\]
Based on (28) and (29), we have
\begin{align*}
F(\bar X_S) &= -(\Gamma^*_{SS})^{-1}G\big(\bar{K^*}_S + \bar X_S\big) + \bar X_S \\
&= (\Gamma^*_{SS})^{-1}\big(\overline{\big((K^* + X)^{-1}\big)}_S - \overline{(K^{*-1})}_S - \bar W_S - \lambda\bar{\tilde Z}_S\big) + \bar X_S.
\end{align*}
Therefore,
\[
||F(\bar X_S)||_\infty \le ||T_1||_\infty + ||T_2||_\infty,
\]
where $T_1 = (\Gamma^*_{SS})^{-1}\overline{\big((K^{*-1}X)^2J(X)K^{*-1}\big)}_S$ and $T_2 = -(\Gamma^*_{SS})^{-1}\big(\bar W_S + \lambda\bar{\tilde Z}_S\big)$.
Using the definition of $C_{\Gamma^*}$, we get that
\[
||T_2||_\infty \le C_{\Gamma^*}\big(||W||_\infty + \lambda\big) = \frac{r}{2}.
\]
It now remains to show that $||T_1||_\infty \le \frac{r}{2}$. We have
\[
||T_1||_\infty \le C_{\Gamma^*}\,\big|\big|\overline{\big((K^{*-1}X)^2J(X)K^{*-1}\big)}_S\big|\big|_\infty \le C_{\Gamma^*}\,||R(X)||_\infty.
\]
As the entries of $X$ on $S^c$ are zero, we get that $||X||_\infty = ||\bar X_S||_\infty \le r \le \frac{1}{3C_{\Sigma^*}d}$. Then, using Lemma 4, we obtain that
\[
||T_1||_\infty \le \frac{3}{2}\,d\,C_{\Sigma^*}^3\,C_{\Gamma^*}\,||X||_\infty^2 \le \frac{3}{2}\,d\,C_{\Sigma^*}^3\,C_{\Gamma^*}\,r^2 \le \frac{r}{2}.
\]
Proof of Lemma 6. By Lemma 5, we get that $|\tilde k_{ij} - k^*_{ij}| \le r < k_m$ for all $(i, j) \in S$. Thus, the desired result follows.
Proof of Lemma 7. According to (5), for large $M$, $x \in [0, a_1]$ and $i, j \in V$,
\[
P\big(|w_{ij}| > x\big) \le a_2 e^{-a_3 M^{1/3} x^2}.
\]
Using this inequality over all $n^2$ entries of $W$, we obtain that
\[
P\Big(\max_{i,j}|w_{ij}| > x\Big) \le n^2 a_2 e^{-a_3 M^{1/3} x^2}.
\]
Setting $x = M^{-1/6}\big[\frac{\log(a_2n^\tau)}{a_3}\big]^{1/2}$ yields the desired result.
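The choice of threshold in Lemma 7 makes the union bound collapse to exactly $n^{2-\tau}$; the arithmetic can be verified directly (the constants below are hypothetical placeholders, since the $a_i$ are not specified in closed form):

```python
import numpy as np

# Hypothetical values for the constants in (5) and the problem dimensions
a2, a3, n, tau, M = 2.0, 0.5, 100, 3.0, 10_000

# Threshold from Lemma 7
x = M ** (-1 / 6) * (np.log(a2 * n**tau) / a3) ** 0.5

# Union bound over the n^2 entries of W, evaluated at that threshold
bound = n**2 * a2 * np.exp(-a3 * M ** (1 / 3) * x**2)

# By construction this equals n^{2 - tau} = 1 / n^{tau - 2}
assert np.isclose(bound, n ** (2 - tau))
```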
Proof of Lemma 8. According to (4), for large $M$, $x \in [0, a_1]$, and $i, j \in V$,
\[
P\big(|w_{ij}| \ge x\big) \le a_2 M e^{-a_3M^{1/4}x}
= \big(a_3M^{1/4}x\big)^4\exp\Big(-\frac{a_3M^{1/4}x}{2}\Big)\,\frac{a_2}{a_3^4x^4}\,\exp\Big(-\frac{a_3M^{1/4}x}{2}\Big).
\]
We first claim that if $M \ge \frac{16\big[\log\big(n^{-\tau}a_3^4x^4/a_2\big)\big]^4}{a_3^4x^4}$ and $x \le \frac{a_2^{1/4}}{a_3}e^{-27/8}$, then
\[
\big(a_3M^{1/4}x\big)^4\exp\Big(-\frac{a_3M^{1/4}x}{2}\Big) \le 1.
\]
Indeed, this follows from the fact that $f(y) := y^4 e^{-y/2} \le 1$ when $y \ge 27$, and the fact that if $x \le \frac{a_2^{1/4}}{a_3}e^{-27/8}$, then
\[
\log\frac{n^{-\tau}a_3^4x^4}{a_2} \le \log\frac{a_3^4x^4}{a_2} \le -\frac{27}{2}.
\]
On the other hand, if $M \ge \frac{16\big[\log\big(n^{-\tau}a_3^4x^4/a_2\big)\big]^4}{a_3^4x^4}$, then
\[
\frac{a_2}{a_3^4x^4}\exp\Big(-\frac{a_3M^{1/4}x}{2}\Big) \le n^{-\tau}.
\]
Thus, the desired result follows choosing $c_0 = \min\big(a_1,\, \frac{a_2^{1/4}}{a_3}e^{-27/8}\big)$ and summing over all the entries of $W$.
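The elementary fact invoked above, $f(y) = y^4 e^{-y/2} \le 1$ for $y \ge 27$, is easy to confirm numerically: $f$ peaks at $y = 8$, where $f'(y) = y^3e^{-y/2}(4 - y/2)$ vanishes, and decreases afterwards, and $f(27)$ is already below one.

```python
import numpy as np

# f(y) = y^4 e^{-y/2} is decreasing past its peak at y = 8,
# so a grid check beyond 27 illustrates the claim f(y) <= 1 for y >= 27
y = np.linspace(27.0, 500.0, 100_000)
assert np.all(y**4 * np.exp(-y / 2) <= 1.0)

# f(27) is already below 1
assert 27**4 * np.exp(-13.5) < 1.0
```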
Proof of Theorem 3. In the proof of Lemma 8 we have shown that if $M \ge \frac{16\big[\log\big(n^{-\tau}a_3^4c_0^4/a_2\big)\big]^4}{a_3^4c_0^4}$, then
\[
f(c_0) := a_2Me^{-a_3M^{1/4}c_0} \le n^{-\tau}.
\]
On the other hand, $f\big(\frac{\alpha\lambda}{8}\big) = n^{-\tau}$. Because the function $f$ is decreasing, we conclude that $\frac{\alpha\lambda}{8} \le c_0$. Therefore, by Lemma 8, we conclude that
\[
1 - \frac{1}{n^{\tau-2}} \le P\Big(||W||_\infty \le \frac{\alpha\lambda}{8} = M^{-1/4}\frac{\log(a_2Mn^\tau)}{a_3}\Big).
\]
Now, set $\sigma_m = \min_i \sigma^*_{ii}$ (observe that $\sigma^*_{ii} > 0$ for all $i \in V$). By the same argument as above, if $M \ge \frac{16\big[\log\big(n^{-\tau}a_3^4\sigma_m^4/a_2\big)\big]^4}{a_3^4\sigma_m^4}$ and $\sigma_m \le c_0$, then $f(\sigma_m) \le n^{-\tau}$ and thus $\frac{\alpha\lambda}{8} \le \sigma_m$. When $\sigma_m > c_0$, as $\frac{\alpha\lambda}{8} \le c_0$, we also get that $\frac{\alpha\lambda}{8} \le \sigma_m$. Therefore,
\[
1 - \frac{1}{n^{\tau-2}} \le P\Big(||W||_\infty \le \frac{\alpha\lambda}{8}\Big) \le P\big(||W||_\infty \le \sigma_m\big).
\]
Thus, on the event $A := \big\{||W||_\infty \le M^{-1/4}\frac{\log(a_2Mn^\tau)}{a_3}\big\}$, the estimator $\hat\Sigma$ has positive diagonal entries, and by Lemma 1, $\hat K$ is unique.
Proceeding as above, if
\[
M \ge \frac{16\big[\log\big(n^{-\tau}a_3^4c^4/a_2\big)\big]^4}{a_3^4c^4},
\]
where $c = \big(6\big(1 + 8\alpha^{-1}\big)^2 d\max\big(C_{\Sigma^*}C_{\Gamma^*},\, C_{\Sigma^*}^3C_{\Gamma^*}^2\big)\big)^{-1}$, then $c \ge \frac{\alpha\lambda}{8}$.
Thus, we get that, if $A$ holds, then
\[
2C_{\Gamma^*}\big(||W||_\infty + \lambda\big) \le 2C_{\Gamma^*}\big(1 + 8\alpha^{-1}\big)M^{-1/4}\frac{\log(a_2Mn^\tau)}{a_3} \le \big(3d\max\big(C_{\Sigma^*},\, C_{\Sigma^*}^3C_{\Gamma^*}\big)\big)^{-1}.
\]
Therefore, by Lemma 5,
\[
||\Delta||_\infty \le 2C_{\Gamma^*}\big(1 + 8\alpha^{-1}\big)M^{-1/4}\frac{\log(a_2n^\tau M)}{a_3}.
\]
Hence, applying Lemma 4, we obtain that
\begin{align*}
||R(\Delta)||_\infty &\le \frac{3}{2}d||\Delta||_\infty^2C_{\Sigma^*}^3 \le 6C_{\Sigma^*}^3C_{\Gamma^*}^2d\big(1 + 8\alpha^{-1}\big)^2\Big(M^{-1/4}\frac{\log(a_2n^\tau M)}{a_3}\Big)^2 \\
&= 6C_{\Sigma^*}^3C_{\Gamma^*}^2d\big(1 + 8\alpha^{-1}\big)^2M^{-1/4}\frac{\log(a_2n^\tau M)}{a_3}\,\frac{\alpha\lambda}{8} \le \frac{\alpha\lambda}{8}.
\end{align*}
By Lemma 3, we conclude that $\tilde K = \hat K$. Thus, $||\hat K - K^*||_\infty$ satisfies the same bound as $||\Delta||_\infty$ if $c_1 = \frac{a_3^4}{a_2}\min(c_0, \sigma_m, c)^4$ and $c_2 = \frac{a_3^4}{16}\min(c_0, \sigma_m, c)^4$.
Proof of Theorem 4. If $M \ge \frac{16\big[\log\big(n^{-\tau}a_3^4k^4/a_2\big)\big]^4}{a_3^4k^4}$, where $k = \frac{k_m}{2C_{\Gamma^*}(1+8\alpha^{-1})}$, then
\[
k_m \ge 2C_{\Gamma^*}\big(1 + 8\alpha^{-1}\big)M^{-1/4}\frac{\log(a_2Mn^\tau)}{a_3} \ge 2C_{\Gamma^*}\big(||W||_\infty + \lambda\big).
\]
Thus, choosing $c_3 = \frac{a_3^4}{a_2}\min(c_0, \sigma_m, c, k)^4$ and $c_4 = \frac{a_3^4}{16}\min(c_0, \sigma_m, c, k)^4$, following the same steps as in the proof of Theorem 3, and appealing to Lemma 6, we conclude the desired result.
Proof of Theorem 5. If $M \ge \big(\frac{\log(a_2n^\tau)}{a_3}\big)^3\min(a_1, \sigma_m)^{-6}$, then
\[
\min(a_1, \sigma_m) \ge M^{-1/6}\Big(\frac{\log(a_2n^\tau)}{a_3}\Big)^{1/2},
\]
and by Lemma 7,
\[
1 - \frac{1}{n^{\tau-2}} \le P\Big(||W||_\infty \le M^{-1/6}\Big(\frac{\log(a_2n^\tau)}{a_3}\Big)^{1/2}\Big) \le P\big(||W||_\infty \le \sigma_m\big).
\]
Thus, on the event $A := \big\{||W||_\infty \le M^{-1/6}\big(\frac{\log(a_2n^\tau)}{a_3}\big)^{1/2}\big\}$, the estimator $\hat\Sigma$ has positive diagonal entries, and by Lemma 1, $\hat K$ is unique.
Now, if
\[
M \ge \Big(\frac{\log(a_2n^\tau)}{a_3}\Big)^3 c^{-6},
\]
where $c = \big(6\big(1 + 8\alpha^{-1}\big)^2d\max\big(C_{\Sigma^*}C_{\Gamma^*},\, C_{\Sigma^*}^3C_{\Gamma^*}^2\big)\big)^{-1}$, then
\[
c \ge M^{-1/6}\Big(\frac{\log(a_2n^\tau)}{a_3}\Big)^{1/2}.
\]
Thus, choosing $\lambda = 8\alpha^{-1}M^{-1/6}\big(\frac{\log(a_2n^\tau)}{a_3}\big)^{1/2}$, we get that, if $A$ holds, then
\[
2C_{\Gamma^*}\big(||W||_\infty + \lambda\big) \le \big(3d\max\big(C_{\Sigma^*},\, C_{\Sigma^*}^3C_{\Gamma^*}\big)\big)^{-1}.
\]
Therefore, by Lemma 5,
\[
||\Delta||_\infty \le 2C_{\Gamma^*}\big(1 + 8\alpha^{-1}\big)M^{-1/6}\Big(\frac{\log(a_2n^\tau)}{a_3}\Big)^{1/2}.
\]
Hence, applying Lemma 4, we obtain that
\begin{align*}
||R(\Delta)||_\infty &\le \frac{3}{2}d||\Delta||_\infty^2C_{\Sigma^*}^3 \le 6C_{\Sigma^*}^3C_{\Gamma^*}^2d\big(1 + 8\alpha^{-1}\big)^2\Big(M^{-1/6}\Big(\frac{\log(a_2n^\tau)}{a_3}\Big)^{1/2}\Big)^2 \\
&= 6C_{\Sigma^*}^3C_{\Gamma^*}^2d\big(1 + 8\alpha^{-1}\big)^2M^{-1/6}\Big(\frac{\log(a_2n^\tau)}{a_3}\Big)^{1/2}\frac{\alpha\lambda}{8} \le \frac{\alpha\lambda}{8}.
\end{align*}
As $||W||_\infty \le M^{-1/6}\big(\frac{\log(a_2n^\tau)}{a_3}\big)^{1/2} = \frac{\alpha\lambda}{8}$, by Lemma 3, we conclude that $\tilde K = \hat K$. Thus, $||\hat K - K^*||_\infty$ satisfies the same bound as $||\Delta||_\infty$ if $c_0 = \min(a_1, \sigma_m, c)^{-1}$.
Proof of Theorem 6. If $M > \big(\frac{\log(a_2n^\tau)}{a_3}\big)^3\big(\frac{2C_{\Gamma^*}(1+8\alpha^{-1})}{k_m}\big)^6$, then
\[
k_m > 2C_{\Gamma^*}\big(1 + 8\alpha^{-1}\big)M^{-1/6}\Big(\frac{\log(a_2n^\tau)}{a_3}\Big)^{1/2} \ge 2C_{\Gamma^*}\big(||W||_\infty + \lambda\big).
\]
Thus, choosing
\[
c_1 = \max\Big(c_0,\, \frac{2C_{\Gamma^*}(1 + 8\alpha^{-1})}{k_m}\Big),
\]
following the same steps as in the proof of Theorem 5, and appealing to Lemma 6, we conclude the desired result.