Article
Transfer Entropy for Nonparametric Granger Causality Detection: An Evaluation of Different Resampling Methods
Cees Diks 1,2, Hao Fang 1,2,†,*
1 CeNDEF, Amsterdam School of Economics, University of Amsterdam, The Netherlands
2 Tinbergen Institute, Amsterdam, The Netherlands
* Correspondence: [email protected]; Tel.: +31-06-84124179
† Current address: E. 3.28, Faculty of Economics and Business, University of Amsterdam, Roetersstraat 11, 1018 WB Amsterdam, The Netherlands
Academic Editor: name
Version May 16, 2017 submitted to Entropy
Abstract: The information-theoretical concept of transfer entropy is an ideal measure for detecting conditional independence, or Granger causality, in a time series setting. The recent literature indeed witnesses an increased interest in applications of entropy-based tests in this direction. However, those tests are typically based on nonparametric entropy estimates for which the development of formal asymptotic theory turns out to be challenging. In this paper, we provide numerical comparisons for simulation-based tests to gain some insights into the statistical behavior of nonparametric transfer entropy-based tests. In particular, surrogate algorithms and smoothed bootstrap procedures are described and compared. We conclude this paper with a financial application to the detection of spillover effects in the global equity market.

Keywords: Transfer entropy; Granger causality; Nonparametric test; Bootstrap; Surrogate data

MSC: 62G09; 62G10; 91B84

JEL Classification: C12; C14; C15
1. Introduction
Entropy, introduced by [1,2], is an information-theoretical concept with several appealing properties, and therefore wide applications in information theory, thermodynamics and time series analysis. Based on this classical measure, transfer entropy has become the most popular information measure. This concept, coined by [3], was applied to distinguish a possible asymmetric information exchange between the variables of a bivariate system. When based on appropriate non-parametric density estimates, transfer entropy is a flexible non-parametric measure for conditional dependence, coupling structure, or Granger causality in a general sense.

The notion of Granger causality was developed in the pioneering work of [4] to capture causal interactions in a linear system. In a more general, model-free world, the Granger causal effect can be interpreted as the impact of incorporating the history of another variable on the conditional distribution of a future variable, in addition to its own history. Recently, various nonparametric measures have been developed to capture such differences between conditional distributions in more complex, and typically nonlinear, systems. There is a growing list of such methods, based on, among others, correlation-integral methods [5], kernel density estimation [6], Hellinger distances [7], copula functions [8] and empirical likelihood [9].
Submitted to Entropy, pages 1 – 24 www.mdpi.com/journal/entropy
In contrast with the above-mentioned methods, transfer entropy-based causality tests do not attempt to capture the difference between two conditional distributions explicitly. Instead, with its information-theoretical interpretation, transfer entropy is a natural tool for measuring directional information transfer and Granger causality. We refer to [10] and [11] for detailed reviews of the relation between Granger causality and directed information theory. However, the direct application of entropy and its variants, though attractive, turns out to be difficult, if not impossible, due to the lack of asymptotic distribution theory for the test statistics. For example, [12] normalize the entropy to detect serial dependence, with critical values obtained from simulations. [13] provide the asymptotic distribution for the [12] statistic with a specific kernel function. [14] derive a χ2 distribution for transfer entropy at the cost of the model-free property.
On the other hand, to obviate the asymptotic problem, several resampling methods for transfer entropy have been developed to provide empirical distributions of the test statistics. Two popular techniques are bootstrapping and surrogate data. Bootstrapping is a random resampling technique proposed by [15] to estimate the properties of an estimator by measuring those properties from approximating distributions. The 'surrogate' approach developed by [16] is another randomization method, initially employing Fourier transforms to provide a benchmark for detecting nonlinearity in a time series. It is worth mentioning that the two methods differ with respect to the statistical properties of the resampled data. For the surrogate method, the null hypothesis is maintained, while the bootstrap method does not seek to impose the null hypothesis on the bootstrapped samples. We refer to [17] and [18] for detailed applications of the two methods.
However, not all resampling methods are suitable for entropy-based dependence measures. As [13] put it, a standard bootstrap fails to deliver a consistent entropy-based statistic because it does not preserve the statistical properties of a degenerate U-statistic. Similarly, with respect to traditional surrogates based on phase randomization of the Fourier transform, [19] criticize the particularly restrictive assumption of a linear Gaussian process, and [20] point out that it cannot preserve the whole statistical structure of the original time series.
As far as we are aware, there are several applications of both methods in entropy-based tests. For example, [7] propose a smoothed local bootstrap for an entropy-based test for serial dependence, [21] apply the stationary bootstrap in partial transfer entropy estimation, [22] use time-shifted surrogates to test the significance of the asymmetry of directional measures of coupling, and [23] come up with the effective transfer entropy, which relies on random shuffling surrogates in estimation. [24] and [21] provide some comparisons between bootstrap and surrogate methods for entropy-based tests.
In this paper, we adopt transfer entropy as a test statistic for measuring conditional independence (Granger non-causality). Aware of the fact that the null distribution may not always be accurate or available in analytical closed form, we resort to resampling techniques for constructing the empirical null distribution. The techniques under consideration include the smoothed local bootstrap, the stationary bootstrap and time-shifted surrogates, all of which have been applied in the literature to entropy-based test statistics. Using different dependence structures, the size and power performance of all methods are examined in simulations.
The remainder of the paper is organized as follows. Section 2 first provides the transfer entropy-based testing framework and a short introduction to kernel density estimation; bandwidth selection rules are then discussed, followed by the resampling methods, including the smoothed local bootstrap and time-shifted surrogates for different dependence structure settings. Section 3 examines the empirical performance of the different resampling methods; the testing size and power are presented. Section 4 considers a financial application of the transfer entropy-based nonparametric test, and Section 5 summarizes.
2. Methodology

2.1. Transfer Entropy and the Estimator
Information theory is a branch of the applied mathematical theory of probability and statistics. The central problem of classical information theory is to measure the transmission of information over a noisy channel. Entropy, also referred to as Shannon entropy, is one key measure in the field of information theory, introduced by [1,2]. Entropy measures the uncertainty and randomness associated with a random variable. Supposing that S is a random vector with density f_S(s), its Shannon entropy is defined as
H(S) = -\int f_S(s) \log f_S(s) \, ds.   (1)
There is a long history of applying information measures in time series modeling. For example, [25] applies the [26] information criterion to construct a one-sided test for serial independence. Since then, nonparametric tests using entropy measures for dependence between two time series have become prevalent. [12] normalize the entropy measure to identify the lags in a nonlinear bivariate model. [27] study dependence with a transformed metric entropy, which turns out to be a proper measure of distance. [13] provide a new entropy-based test for serial dependence, and the test statistic follows a standard normal distribution asymptotically.
Although those heuristic approaches work for entropy-based measures of dependence, these methodologies do not carry over directly to measures of conditional dependence, i.e., Granger causality. Transfer entropy (TE), a term coined by [3], although it appeared earlier in the literature under different names, is a suitable measure to serve this purpose. The TE quantifies the amount of information explained in one series at k steps ahead from the state of another series, given the current and past state of itself. Suppose we have two series {Xt} and {Yt}; for brevity, put X = {Xt}, Y = {Yt} and Z = {Yt+k}. Further, we define a trivariate vector Wt as Wt = (Xt, Yt, Zt), where Zt = Yt+k, and W = (X, Y, Z) is used when there is no danger of confusion. Within this bivariate setting, W is a three-dimensional continuous vector. In this paper, we limit ourselves to k = 1 for simplicity, but the method can easily be generalized to multiple steps. The transfer entropy TE_{X→Y} is a nonlinear measure for the amount of information explained in Z by X, accounting for the contribution of Y. Although the TE defined by [3] applies to discrete variables, it is easily generalized to continuous variables. Conditional on Y, TE_{X→Y} is defined as
TE_{X \to Y} = E_W\left[ \log \frac{f_{Z,X|Y}(Z,X|Y)}{f_{X|Y}(X|Y)\, f_{Z|Y}(Z|Y)} \right]
            = \iiint f_{X,Y,Z}(x,y,z) \log \frac{f_{Z,X|Y}(z,x|y)}{f_{X|Y}(x|y)\, f_{Z|Y}(z|y)} \, dx\, dy\, dz
            = E_W\left[ \log \frac{f_{X,Y,Z}(X,Y,Z)}{f_Y(Y)} - \log \frac{f_{X,Y}(X,Y)}{f_Y(Y)} - \log \frac{f_{Y,Z}(Y,Z)}{f_Y(Y)} \right]
            = E_W\left[ \log f_{X,Y,Z}(X,Y,Z) + \log f_Y(Y) - \log f_{X,Y}(X,Y) - \log f_{Y,Z}(Y,Z) \right].   (2)
Using conditional mutual information I(Z, X|Y = y), the TE can be equivalently formulated with fourShannon entropy terms:
TEX→Y = I(Z, X|Y)= H(Z|Y)− H(Z|X, Y)
= H(Z, Y)− H(Y)− H(Z, X, Y) + H(X, Y).
(3)
In order to construct a test for Granger causality based on the TE measure, one first needs to show quantitatively that TE is a proper basis for detecting whether the null hypothesis is satisfied. The following theorem, a direct application of the Kullback-Leibler criterion, lays the quantitative foundation for testing based on the TE.
Theorem 1. TE_{X→Y} ≥ 0, with equality if and only if f_{Z,X|Y}(Z, X|Y) = f_{X|Y}(X|Y) f_{Z|Y}(Z|Y).

Proof. The proof of Thm. 1 is given by [28].
It is not difficult to verify that the condition for TE_{X→Y} = 0 coincides with the null hypothesis of Granger non-causality defined in Equation (4), also referred to as conditional independence or no coupling. Mathematically speaking, the null hypothesis of Granger non-causality, H0: {Xt} is not a Granger cause of {Yt}, can be phrased as

H_0: \frac{f_{X,Y,Z}(x,y,z)}{f_Y(y)} = \frac{f_{Y,Z}(y,z)}{f_Y(y)} \cdot \frac{f_{X,Y}(x,y)}{f_Y(y)}   (4)
for (x, y, z) in the support of W. A nonparametric test for Granger non-causality seeks to find statistical evidence of violation of Equation (4). There are many nonparametric measures available for this purpose, some of which were mentioned previously. Equation (4) lays the foundation for a model-free test without imposing any parametric assumptions about the data-generating process or underlying distributions for {Xt} and {Yt}. We only assume two things here. First, {Xt, Yt} is a strictly stationary bivariate process. Second, the process has finite memory, i.e., the lags lX, lY < ∞. The second (finite Markov order) assumption is needed in this nonparametric setting to allow conditioning on the past.
As far as we are aware, the direct use of TE to test Granger non-causality in a nonparametric setting is difficult, if not impossible, due to the lack of asymptotic theory for the test statistic. As [12] put it, very few asymptotic distribution results for entropy-based estimators are available. Although over the years several breakthroughs have been made in the application of entropy to testing serial independence, the limiting distribution of the TE statistic is still unknown. One may wish to use simulation techniques to overcome the lack of asymptotic distributions. However, as noted by [7], there are estimation biases in TE statistics for non-parametric dependence measures under the smoothed bootstrap procedure. Even for the parametric test statistic used by [14], the authors noticed that the TE-based estimator is generally biased.
2.2. Density Estimation and Bandwidth Selection
The non-negativity property in Thm. 1 makes TE_{X→Y} a desirable statistic for constructing a one-sided test of conditional independence; any positive divergence from zero is a sign of conditional dependence of Y on X. To estimate the TE, there are several different approaches, such as histogram-based estimators [29], correlation sums [30] and nearest-neighbor estimators [31]. However, the optimal rule for the number of neighboring points is unclear, and as [31] comment, a small number of neighboring points may lead to large statistical errors. A more natural method, kernel density estimation, the properties of which have been well studied, is applied in this paper. With the plug-in kernel estimates of densities, we may replace the expectation in Equation (2) by a sample average to get an estimate of TE_{X→Y}.
A local density estimator of a dW-variate random vector W at Wi is given by

\hat f_W(W_i) = \frac{1}{(n-1)\, h^{d_W}} \sum_{j \neq i} K\left( \frac{W_i - W_j}{h} \right),   (5)

where K is a kernel function and h is the bandwidth. We take K(·) to be a product kernel function defined as K(W) = \prod_{s=1}^{d_W} \kappa(w^s), where w^s is the sth element of W. Using a standard univariate Gaussian kernel, \kappa(w^s) = (2\pi)^{-1/2} e^{-(w^s)^2/2}, K(·) is the standard multivariate Gaussian kernel as described by
[32] and [33]. Using Equation (5) as the plug-in density estimator, and replacing the expectation by thesample mean, we obtain the estimator for the transfer entropy given by
I(Z, X|Y) = 1n
n
∑i=1
(log f (xi, yi, zi) + log f (yi)− log f (xi, yi)− log f (yi, zi)
). (6a)
If we estimate the Shannon entropy in Equation (1) based on a sample of size n from the dW-dimensionalrandom vector W, by the sample average of the plug-in density estimates, we obtain
H(W) =1n
n
∑i=1
(log( fW(Wi))
), (6b)
then Equation (6a) can be equivalently expressed in terms of four entropy estimators, that is,
I(Z, X|Y) = −H(xi, yi, zi)− H(yi) + H(xi, yi) + H(yi, zi). (6c)
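To make the plug-in estimator concrete, the following sketch estimates the transfer entropy from Equations (5) and (6a) using a product Gaussian kernel with a common bandwidth h. This is an illustration, not the authors' code; the function names `loo_kde` and `transfer_entropy` are ours.

```python
import numpy as np

def loo_kde(data, h):
    # Leave-one-out product-Gaussian kernel density estimate at the
    # sample points themselves (Equation (5)); data has shape (n, d).
    n, d = data.shape
    diff = (data[:, None, :] - data[None, :, :]) / h
    k = np.exp(-0.5 * (diff ** 2).sum(axis=2)) / (2.0 * np.pi) ** (d / 2)
    np.fill_diagonal(k, 0.0)                     # exclude j = i
    return k.sum(axis=1) / ((n - 1) * h ** d)

def transfer_entropy(x, y, z, h):
    # Plug-in estimate of Equation (6a): the sample average of
    # log f(x,y,z) + log f(y) - log f(x,y) - log f(y,z).
    w = np.column_stack([x, y, z])
    return np.mean(np.log(loo_kde(w, h))
                   + np.log(loo_kde(w[:, [1]], h))
                   - np.log(loo_kde(w[:, [0, 1]], h))
                   - np.log(loo_kde(w[:, [1, 2]], h)))

# For independent Gaussian series the estimate should be close to zero.
rng = np.random.default_rng(0)
n = 300
y = rng.standard_normal(n + 1)
x = rng.standard_normal(n)
te = transfer_entropy(x, y[:-1], y[1:], h=4.8 * n ** (-2 / 7))
```

Under the null of conditional independence the estimate hovers around zero up to estimation bias; resampling is then used to judge whether an observed positive value is significant.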
To construct a statistical test, we develop the asymptotic properties of \hat I(Z, X|Y), defined in Equations (6a) and (6c), in two steps. In the first step, given the density estimates, the consistency of the entropy estimates is established; the linear combination of the four entropy estimates then converges in probability to the true value. Formally, the following two theorems ensure the consistency of \hat I(Z, X|Y).
Theorem 2. Given the kernel density estimate \hat f_W(W_i) for f_W(W_i), where W is a dW-dimensional random vector with sample size n, let \hat H(W) be the plug-in estimate of the Shannon entropy as defined in Equation (6b). Then \hat H(W) \xrightarrow{P} H(W).
Proof. The proof of Thm. 2 is initially investigated in [34] and also discussed in [12].

The basic idea of the proof is to take the Taylor series expansion of log(\hat f_W(W_i)) around its mean and use the fact that \hat f_W(W_i), given an appropriate bandwidth sequence, converges to f_W(W_i) pointwise, to obtain consistency. In the next step, the consistency of \hat I(Z, X|Y) is provided by the continuous mapping theorem.
Theorem 3. Given \hat H(W) \xrightarrow{P} H(W), with the definition of \hat I(Z, X|Y) as in Equation (6c), \hat I(Z, X|Y) \xrightarrow{P} I(Z, X|Y).

Proof. The proof is straightforward if one applies the Continuous Mapping Theorem; see Theorem 2.3 in [35].
Before we move to the next section, it is worth having a careful look at the issue of bandwidth selection. The bandwidth h for kernel estimation determines how smooth the density estimate is; a smaller bandwidth reveals more structure of the data, whereas a larger bandwidth delivers a smoother density estimate. Bandwidth selection is essentially a trade-off between the bias and variance in density estimation. A very small value of h reduces estimation bias, but the variance will increase. On the other hand, a large bandwidth reduces estimation variance at the expense of incorporating more bias. See Chapter 3 in [32] and [36] for details.
However, bandwidth selection for the TE statistic is more involved. To the best of our knowledge, there is a gap in the literature regarding optimal bandwidth selection for kernel-based transfer entropy estimators. As [37] show, when estimating an entropy functional, two types of error arise, one from entropy estimation and the other from density estimation, and the optimal bandwidth for density estimation may not coincide with the optimal one for entropy estimation. Thus, instead of providing the optimal multivariate density estimator, i.e., the rule-of-thumb in [33], the bandwidth in our study should provide an accurate estimator for I(Z, X|Y), say in the minimal-MSE sense. [28] developed such a bandwidth rule for their transfer entropy-based estimator. In that paper, rather than directly developing asymptotic properties for transfer entropy, we study the properties of the first-order Taylor expansion of transfer entropy under the null hypothesis. The suggested bandwidth is shown to be MSE-optimal and allows us to obtain asymptotic normality for the test statistic. In principle, the convergence rate of the transfer entropy estimator should be the same as that of the leading term of its Taylor approximation. We therefore propose to use the same rate here as well, giving
h = C n^{-2/7},   (7)
where C is an unknown parameter. This bandwidth delivers a consistent test, since the variance of the local estimate of I(Z_i, X_i|Y_i) will dominate the MSE. In [28] we suggest using C = 4.8 based on simulations, while [6] suggest setting C ≈ 8 for ARCH processes. Our simulations here may also favor a larger value of C, because the squared bias is of higher order and hence of less concern for the transfer entropy-based statistic. A larger bandwidth can better control the estimation variance and deliver a more powerful test. As a robustness check, we adopt C = 8 as well as C = 4.8, as suggested by [28], in our simulation study. To match the Gaussian kernel, we standardize the data before estimating Equations (6a) to (6c), such that the transformed series are centered around zero with unit variance.1
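The bandwidth rule of Equation (7) and the standardization step can be sketched as follows (the function names are ours):

```python
import numpy as np

def bandwidth(n, C=8.0):
    # Bandwidth rule of Equation (7): h = C * n^(-2/7).
    return C * n ** (-2.0 / 7.0)

def standardize(series):
    # Center and scale to zero mean and unit variance, to match the
    # standard Gaussian kernel used in estimation.
    s = np.asarray(series, dtype=float)
    return (s - s.mean()) / s.std()

h = bandwidth(1000, C=4.8)   # about 0.67 for n = 1000
```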
2.3. Resampling Methods
To develop simulation-based tests for the null hypothesis, given in Equation (4), of no Granger causality from X to Y, or equivalently, for conditional independence, we consider three resampling techniques: (1) the time-shifted surrogates developed by [22], (2) the smoothed bootstrap of [7] and (3) the stationary bootstrap introduced by [38]. The first technique is widely applied for coupling measures, as for example by [39] and [40], while the latter two have already been used for detecting conditional independence for decades. It is worth mentioning that the surrogate and bootstrap methods treat the null quite differently. Surrogate data are supposed to preserve the dependence structure imposed by H0, while bootstrap data are not restricted to H0. It is possible to bootstrap the dataset without restoring the dependence structure of {X, Y, Z} under the null; see [41] for more details. To avoid resampling errors and to make the different methods more comparable, we limit ourselves to methods that impose the null hypothesis on the resampled data. The following three resampling methods are implemented, with different sampling details.
1. Time-shifted Surrogates

• (TS.a) The first resampling method only deals with the driving variable X. Suppose we have observations {x1, ..., xn}; the time-shifted surrogates are generated by cyclically time-shifting the components of the time series. Specifically, an integer d is randomly generated within the interval ([0.05n], [0.95n]), and then the first d values of {x1, ..., xn} are moved to the end of the series, to deliver the surrogate sample X∗ = {x_{d+1}, ..., x_n, x_1, ..., x_d}. Compared with the traditional surrogates based on phase randomization of the Fourier transform, the time-shifted surrogates preserve the whole statistical structure of X. The couplings between X and Y are destroyed, so that the null hypothesis of X not causing Y is imposed.
• (TS.b) The second scheme resamples both the driving variable X and the response variable Y separately. Similar to (TS.a), Y∗ = {y_{c+1}, ..., y_n, y_1, ..., y_c} is created given another random integer c from the range ([0.05n], [0.95n]). In contrast with the standard time-shifted surrogates described in (TS.a), in this setting we add more noise to the coupling between X and Y.

1 Very similar results are obtained by matching the mean absolute deviation instead of the variance of the standard Gaussian kernel for transfer entropy estimation.
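The cyclic shift underlying (TS.a) and (TS.b) can be sketched in a few lines (the function name is ours):

```python
import numpy as np

def time_shift_surrogate(x, rng):
    # Scheme (TS.a): cyclically shift the series by a random offset d
    # drawn from [0.05n, 0.95n]; the internal dynamics of x are
    # preserved, while any coupling with another series is destroyed.
    n = len(x)
    d = int(rng.integers(int(0.05 * n), int(0.95 * n) + 1))
    return np.concatenate([x[d:], x[:d]])

rng = np.random.default_rng(1)
x = np.arange(100)
x_star = time_shift_surrogate(x, rng)   # same values, shifted start
```

Scheme (TS.b) simply applies the same operation to Y with an independent offset c.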
2. Smoothed Local Bootstrap

The smoothed bootstrap selects samples from a smoothed distribution instead of drawing observations from the empirical distribution directly; see [42] for a discussion of the smoothed bootstrap procedure. Under rather mild assumptions, [43] show that there is no need to reproduce the whole dependence structure of the stochastic process to get an asymptotically correct nonparametric dependence estimator. Hence a smoothed bootstrap from the estimated conditional density is able to deliver a consistent statistic. Specifically, we consider two versions of the smoothed bootstrap that differ somewhat in the dependence structure they preserve.
• (SMB.a) In the first setting, Y∗ is resampled without replacement via the smoothed local bootstrap. Given the sample Y = {y1, ..., yn}, the bootstrap sample is generated by adding a smoothing noise term ε^Y_i such that y∗_i = y_i + h_b ε^Y_i, where h_b > 0 is the bandwidth used in the bootstrap procedure and ε^Y_i represents a sequence of i.i.d. N(0, 1) random variables. Without random replacement from the original time series, this procedure does not disturb the original dynamics of Y = {y1, ..., yn} at all. After Y∗ is resampled, both X∗ and Z∗ are drawn from the smoothed conditional densities f(x|Y∗) and f(z|Y∗), as described in [44].
• (SMB.b) Secondly, we implement the smoothed local bootstrap as in [7]. The only difference between this setting and (SMB.a) is that the bootstrap sample Y∗ is drawn with replacement from the smoothed kernel density.
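A sketch of scheme (SMB.b) is given below. The draws from the smoothed conditional densities are implemented here by selecting a donor index with Gaussian weights around each y∗_i and adding smoothing noise; this particular implementation of the conditional draw is our own reading of the local bootstrap, not necessarily the exact code used in the paper.

```python
import numpy as np

def smoothed_local_bootstrap(x, y, z, hb, rng):
    # Scheme (SMB.b): Y* is drawn with replacement from the smoothed
    # kernel density of Y; X* and Z* are then drawn from the smoothed
    # conditional densities f(x|Y*) and f(z|Y*), here via locally
    # reweighted donor selection (an assumed implementation detail).
    n = len(y)
    y_star = y[rng.integers(0, n, size=n)] + hb * rng.standard_normal(n)
    w = np.exp(-0.5 * ((y_star[:, None] - y[None, :]) / hb) ** 2)
    w /= w.sum(axis=1, keepdims=True)            # normalize kernel weights
    donors = np.array([rng.choice(n, p=w[i]) for i in range(n)])
    x_star = x[donors] + hb * rng.standard_normal(n)
    z_star = z[donors] + hb * rng.standard_normal(n)
    return x_star, y_star, z_star

rng = np.random.default_rng(2)
y = rng.standard_normal(201)
x, yc, z = rng.standard_normal(200), y[:-1], y[1:]
xs, ys, zs = smoothed_local_bootstrap(x, yc, z, hb=yc.std(), rng=rng)
```

Scheme (SMB.a) differs only in the first step: y∗_i = y_i + h_b ε^Y_i without resampling the indices.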
3. Stationary Bootstrap

[38] propose the stationary bootstrap to overcome the drawback of non-stationary bootstrap samples in the moving block bootstrap. This method replicates the time dependence of the original data by resampling blocks of data with randomly varying block lengths. The lengths of the bootstrap blocks follow a geometric distribution: given a fixed probability p, the length Li of block i satisfies P(Li = k) = (1 − p)^{k−1} p for k = 1, 2, ..., and the starting point of block i is randomly drawn from the original n observations. To restore the dependence structure exactly under the null, we combine the stationary bootstrap with the smoothed local bootstrap for our simulations.
• (STB) In short, first y∗_1 is picked randomly from the original n observations of Y = {y1, ..., yn}, denoted y∗_1 = y_s where s ∈ [1, n]. With probability p, y∗_2 is picked at random from the data set; and with probability 1 − p, y∗_2 = y_{s+1}, so that y∗_2 is the observation following y_s in the original series Y = {y1, ..., yn}. Proceeding in this way, {y∗_1, ..., y∗_n} can be generated. If y∗_i = y_s and s = n, the 'circular boundary condition' kicks in, such that y∗_{i+1} = y_1. After Y∗ = {y∗_1, ..., y∗_n} is generated, both X∗ and Z∗ are randomly drawn from the smoothed conditional densities f(x|Y∗) and f(z|Y∗), as in (SMB.b).
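The index construction in (STB) can be sketched as follows (the function name is ours):

```python
import numpy as np

def stationary_bootstrap_indices(n, p, rng):
    # Scheme (STB): block lengths are geometric with parameter p; with
    # probability p a new block starts at a random position, otherwise
    # the next observation continues the current block, with a circular
    # boundary condition wrapping the end of the series to its start.
    idx = np.empty(n, dtype=int)
    idx[0] = rng.integers(0, n)
    for i in range(1, n):
        if rng.random() < p:
            idx[i] = rng.integers(0, n)          # start a new block
        else:
            idx[i] = (idx[i - 1] + 1) % n        # continue the block
    return idx

rng = np.random.default_rng(3)
y = np.sin(np.arange(200) / 10.0)
y_star = y[stationary_bootstrap_indices(len(y), p=0.1, rng=rng)]
```

With p = 0.1, blocks are on average ten observations long, so the short-run dependence of Y is largely preserved in Y∗.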
A final remark concerns the difference between this paper and [21]; there, both the time-shifted surrogate and the stationary bootstrap are implemented for an entropy-based causality test. However, our paper provides additional insights in several respects. Firstly, the smoothed bootstrap, which has been shown in the literature to work for nonparametric kernel estimators under general dependence structures, is applied in our paper. Secondly, they treat the bootstrap and surrogate samples in a similar way, but as we noted before, the bootstrap method is not designed to impose the null hypothesis but to keep the dependence structure present in the original data. The stationary bootstrap procedure in [21] might be incompatible with the null hypothesis of conditional independence, since they destroy the dependence completely. Because they restore independence between X and Y rather than conditional independence between X|Y and Z|Y during resampling, the estimated statistics from the resampled data may not necessarily correspond to the statistic defined under the null. Thirdly, we provide rigorous size and power results in our simulations, which are missing in their paper.
3. Simulation Study
In this section, we investigate the performance of the five resampling methods in detecting conditional dependence for several data-generating processes, including a linear vector autoregressive (VAR) process, a nonlinear VAR process and a bivariate ARCH process. In Equations (8) to (10), we use a single parameter a to control the strength of the conditional dependence. The size assessment is based on testing Granger non-causality from {Xt} to {Yt}, and for the power we use the same process but test Granger non-causality from {Yt} to {Xt}. Specifically, for each data-generating process (DGP), we run 500 simulations for sample sizes n = {200, 500, 1000, 2000}. The number of surrogate and bootstrap replications is 999. We set a = 0.4 to represent moderate dependence in the size investigation and a = 0.1 to evaluate the test power. For fair comparisons between (TS.a) and (TS.b), as well as between (SMB.a) and (SMB.b), we fix the seeds of the random number generator in the resampling functions to eliminate the potential effect of randomness. In addition, we use the empirical standard deviation of {Yt} as the bootstrap bandwidth and C = {4.8, 8} in Equation (7) for kernel estimation. Further, the power performance for a two-way causal linkage is investigated in Equation (11), where b = −0.2 and c = 0.1. All empirical rejection rates are summarized in Tables 1 to 4.
1. Linear vector autoregressive process (VAR):

X_t = a Y_{t-1} + \varepsilon_{x,t}, \quad \varepsilon_{x,t} \sim N(0, 1),
Y_t = a Y_{t-1} + \varepsilon_{y,t}, \quad \varepsilon_{y,t} \sim N(0, 1).   (8)

2. Nonlinear VAR. This process is considered in [45] to show the failure of the linear Granger causality test:

X_t = a X_{t-1} Y_{t-1} + \varepsilon_{x,t}, \quad \varepsilon_{x,t} \sim N(0, 1),
Y_t = 0.6 Y_{t-1} + \varepsilon_{y,t}, \quad \varepsilon_{y,t} \sim N(0, 1).   (9)

3. Bivariate ARCH process:

X_t \sim N(0, 1 + a Y_{t-1}^2),
Y_t \sim N(0, 1 + a Y_{t-1}^2).   (10)

4. Two-way VAR process:

X_t = 0.7 X_{t-1} + b Y_{t-1} + \varepsilon_{x,t}, \quad \varepsilon_{x,t} \sim N(0, 1),
Y_t = c X_{t-1} + 0.5 Y_{t-1} + \varepsilon_{y,t}, \quad \varepsilon_{y,t} \sim N(0, 1).   (11)
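As an illustration, the linear VAR of Equation (8) can be simulated as follows (a sketch; the burn-in length is our own choice):

```python
import numpy as np

def simulate_var(n, a, rng, burn=100):
    # Linear VAR of Equation (8): X_t = a*Y_{t-1} + eps_{x,t} and
    # Y_t = a*Y_{t-1} + eps_{y,t}; a burn-in period is discarded so the
    # retained sample is approximately from the stationary distribution.
    x = np.zeros(n + burn)
    y = np.zeros(n + burn)
    ex = rng.standard_normal(n + burn)
    ey = rng.standard_normal(n + burn)
    for t in range(1, n + burn):
        x[t] = a * y[t - 1] + ex[t]
        y[t] = a * y[t - 1] + ey[t]
    return x[burn:], y[burn:]

rng = np.random.default_rng(4)
x, y = simulate_var(2000, a=0.4, rng=rng)
```

In this DGP, Y Granger-causes X but not the other way around, which is why testing X → Y gauges size and testing Y → X gauges power.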
The top panels in Tables 1 to 3 summarize the empirical rejection rates obtained for the 5% and 10% nominal significance levels for the processes in Equations (8) to (10) under the null hypotheses, and the bottom panels report the corresponding empirical power under the alternatives. Generally speaking, the size and power are quite satisfactory for almost all combinations of the constant C, sample size n and nominal significance level. The performance differences between the resampling methods are not substantial.

With respect to the size performance, most of the time the realized rejection rates stay in line with the nominal size. Besides, the bootstrap methods outperform the time-shifted surrogate methods in that their empirical size is slightly closer to the nominal size. Lastly, the size of the tests is not very sensitive to the choice of the constant C, apart from the cases of the ARCH process given in Equation (10), where all tests are too conservative for the smaller constant C = 4.8.

From the point of view of power, (TS.a) and (SMB.a) seem to outperform their counterparts, yet the differences are very subtle. Along the dimension of sample size, we clearly see that the empirical power increases with the sample size. Furthermore, the results are very robust with respect to different
choices of the constant C in the kernel density estimation bandwidth. For the VAR and nonlinear processes given by Equations (8) and (9), a smaller C seems to give more powerful tests, while a larger C is more beneficial for detecting the conditional dependence structure in the ARCH process of Equation (10).

Next, Table 4 presents the empirical power for the two-way VAR process, where the two variables {X} and {Y} are intertwined with each other. Due to the setting b = −0.2 and c = 0.1 in Equation (11), {Y} is obviously a stronger Granger cause of {X} than the other way around. As a consequence, the reported rejection rates in Table 4 are overall higher when testing Y → X than X → Y.

To visualize the simulation results, Figures 1 to 3 plot the empirical size and power against the nominal size. Since the performance of the five different resampling methods is quite similar, we only show the results for (SMB.a) for simplicity. In each figure, the left (right) panels show the realized size (power), and we choose C = 4.8 (C = 8) for the top (bottom) two panels. We can see from the figures that the empirical performance of the transfer entropy test is overall satisfactory, apart from panel (a) in Figure 3, where the testing size is too conservative for large sample sizes. The under-rejection is caused by the inappropriate choice C = 4.8, which makes the bandwidth for kernel estimation too small. Further, the empirical power against the nominal size for the two-way VAR of Equation (11) is reported in Figure 4.
4. Application
In this section, we apply the transfer entropy-based nonparametric test on detecting financial268
market interdependence, in terms of both return and volatility. [46] performed a variance269
decomposition of the covariance matrix of the error terms from a reduced-form VAR model to270
investigate the spillover effect in the global equity market. More recently, [47] extended the framework271
and considered the time-varying feature in global volatility spillovers. Their research, although272
providing simple and intuitive methods for measuring directional linkages between global stock273
markets, may suffer from the limitation of the linear parametric modeling, as discussed earlier. We274
revisit the topic of spillovers in the global equity market by the nonparametric method.275
For our analysis, we use daily nominal stock market indexes from January 1992 to March 2017, obtained from Datastream, for six developed countries: the U.S. (DJIA), Japan (Nikkei 225), Hong Kong (Hangseng), the U.K. (FTSE 100), Germany (DAX 30) and France (CAC 40). The target series are the weekly return and volatility of each index. The weekly returns are calculated as differenced log prices multiplied by 100, from Friday to Friday. Where the price for Friday is not available due to a public holiday, we use the Thursday price instead.
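The Friday-to-Friday return construction, with the Thursday fallback, can be sketched as follows; a minimal illustration using pandas, where the price path is hypothetical:

```python
import numpy as np
import pandas as pd

def weekly_returns(daily_close: pd.Series) -> pd.Series:
    """Friday-to-Friday log returns in percent. Resampling to
    Friday-anchored weeks and taking .last() automatically falls back
    to the latest available price (e.g. Thursday) when the Friday
    observation is missing due to a holiday."""
    weekly = daily_close.resample("W-FRI").last()
    return 100 * np.log(weekly).diff().dropna()

# Toy example: two business weeks of a hypothetical price path.
idx = pd.bdate_range("2017-01-02", periods=10)
prices = pd.Series(100 * np.exp(np.linspace(0.0, 0.02, 10)), index=idx)
r = weekly_returns(prices)   # one Friday-to-Friday return
```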
The weekly volatility series are generated following [46], making use of the weekly high, low, opening and closing prices obtained from the underlying daily high, low, opening and closing data. The estimate of the volatility σ_t² for week t is as follows:

σ_t² = 0.511(H_t − L_t)² − 0.019[(C_t − O_t)(H_t + L_t − 2O_t) − 2(H_t − O_t)(L_t − O_t)] − 0.383(C_t − O_t)²,    (12)

where H_t is the Monday-Friday high, L_t is the Monday-Friday low, O_t is the Monday-Friday open and
C_t is the Monday-Friday close (in natural logarithms multiplied by 100). Further, after deleting the volatility estimates for the New Year weeks in 2002, 2008 and 2013, due to the lack of observations for the Nikkei 225 index, we have 1313 observations in total for both weekly returns and volatilities. The descriptive statistics, LB-test statistics and ADF-test statistics for both series are summarized in Table 5. From the ADF test results, it is clear that all time series are stationary, allowing further analysis.
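The range-based estimator of Equation (12) can be implemented directly once the weekly high, low, open and close are at hand; a minimal sketch (the numerical inputs in the usage line are hypothetical):

```python
def weekly_volatility(H, L, O, C):
    """Range-based variance estimate of Equation (12), where H, L, O
    and C are the weekly high, low, open and close in natural
    logarithms multiplied by 100, as in the construction of [46]."""
    return (0.511 * (H - L) ** 2
            - 0.019 * ((C - O) * (H + L - 2 * O)
                       - 2 * (H - O) * (L - O))
            - 0.383 * (C - O) ** 2)

# Hypothetical weekly log-price levels (x100) for one index-week.
sigma2 = weekly_volatility(H=462.0, L=459.0, O=460.0, C=461.0)
```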
We first provide a full-sample analysis of global stock market return and volatility spillovers over the period from January 1992 to March 2017, summarized in Tables 6 and 7. The two tables report the pairwise test statistics for conditional independence between index X and index Y, with the constant C in the bandwidth for kernel estimation set to 4.8 or 8, based on 999 resampled time series. In other words, we test the one-week-ahead directional linkage from index X to Y using the five resampling methods described in Section 2. For example, the first line in the top panel in
Figure 1. Size-size and size-power plots of Granger non-causality tests, based on 500 replications and smoothed local bootstrap (SMB.a). The DGP is the bivariate VAR specified in Equation (8), with Y affecting X. The left (right) column shows observed rejection rates under the null (alternative) hypothesis. The sample size varies from n = 200 to n = 2000. Panels: (a) size-size plot, C = 4.8; (b) size-power plot, C = 4.8; (c) size-size plot, C = 8; (d) size-power plot, C = 8; each with the 45-degree line for reference.
Table 6 reports the one-week-ahead influence of DJIA returns upon other indexes using the first time-shifting surrogate test. Given C = 8, the DJIA return is shown to be a strong Granger cause for the Nikkei, FTSE and CAC at the 1% level, and for the DAX at the 5% level.
Based on Tables 6 and 7, we may draw several conclusions. Firstly, the US and German indexes are the most important return transmitters, and Hong Kong is the largest source of volatility spillover, judging by the numbers of significant linkages. Note that this finding is similar to the result in [46], where the total return (volatility) spillovers from the US (Hong Kong) to others are much higher than those from any other country. Figure 5 provides a graphic illustration of the global spillover network based on the results of (SMB.a) from Tables 6 and 7. Apart from the main transmitters, we can clearly see that the Nikkei and CAC are the main receivers in the global return spillover network, while the DAX is the main receiver of global volatility transmission.
Secondly, the results obtained are very robust, no matter which resampling method is applied. The differences between the five resampling methods are small, although (TS.a) is seen to be slightly more powerful than (TS.b) in Table 6. The three different bootstrap methods are consistent almost all the time, similar to what we observe in Section 3.
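The logic shared by the five resampling tests can be sketched generically: compute the statistic on the original pair, recompute it on series in which the candidate causal link has been destroyed, and compare. A minimal illustration in which the transfer entropy estimator is abstracted into a generic `statistic` and a time-shifted surrogate stands in for the resampling scheme (the function names and the simple lagged-correlation statistic in the usage example are hypothetical, not the paper's estimator):

```python
import numpy as np

def surrogate_pvalue(x, y, statistic, n_surrogates=999, min_shift=20, seed=0):
    """One-sided resampling p-value: destroy any X -> Y link by
    circularly time-shifting the candidate cause x while keeping y
    fixed; `statistic` is any scalar measure of information flow."""
    rng = np.random.default_rng(seed)
    t_obs = statistic(x, y)
    n = len(x)
    t_null = np.empty(n_surrogates)
    for b in range(n_surrogates):
        shift = int(rng.integers(min_shift, n - min_shift))
        t_null[b] = statistic(np.roll(x, shift), y)
    # Add-one rule guards against exactly-zero p-values.
    return (1 + np.sum(t_null >= t_obs)) / (n_surrogates + 1)

# Usage with a toy lag-1 correlation statistic on a coupled pair.
rng = np.random.default_rng(1)
x = rng.standard_normal(500)
eps = rng.standard_normal(500)
y = np.empty(500)
y[0] = eps[0]
y[1:] = 0.8 * x[:-1] + eps[1:]          # x Granger-causes y
stat = lambda a, b: abs(np.corrcoef(a[:-1], b[1:])[0, 1])
p = surrogate_pvalue(x, y, stat, n_surrogates=199)
```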
Figure 2. Size-size and size-power plots of Granger non-causality tests, based on 500 replications and smoothed local bootstrap (SMB.a). The DGP is the bivariate non-linear VAR specified in Equation (9), with Y affecting X. The left (right) column shows observed rejection rates under the null (alternative) hypothesis. The sample size varies from n = 200 to n = 2000. Panels: (a) size-size plot, C = 4.8; (b) size-power plot, C = 4.8; (c) size-size plot, C = 8; (d) size-power plot, C = 8.
However, the summary results in Tables 6 and 7 are static. The statistics measure directional linkages averaged out over the whole period from 1992 to 2017. The conditional dependence structure of the time series at any given point in time can be very different. Hence, the full-sample analysis is very likely to overlook the cyclical dynamics between each pair of stock indices. To investigate the dynamics in the global stock market, we now move from the full-sample analysis to a rolling-window study. Using a 200-week rolling window that starts at the beginning of the sample and advances in 5-week steps for the iterative evaluation of conditional dependence, we can assess the variation of spillovers in the global equity market over time.
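The rolling-window scheme (200-week windows advancing in 5-week steps) can be sketched as a simple loop; `test` abstracts whichever resampling-based causality test is applied to each window, so the sketch itself is method-agnostic:

```python
import numpy as np

def rolling_pvalues(x, y, test, window=200, step=5):
    """Re-run a bivariate causality test on fixed-length windows,
    advancing `step` observations at a time; `test` maps two aligned
    windows to a p-value."""
    pvals = []
    for s in range(0, len(x) - window + 1, step):
        pvals.append(test(x[s:s + window], y[s:s + window]))
    return np.array(pvals)

# With the paper's 1313 weekly observations this yields
# (1313 - 200) // 5 + 1 = 223 window evaluations.
x = np.arange(1313.0)
y = np.arange(1313.0)
dummy_test = lambda xw, yw: 0.5   # placeholder for the real test
p_path = rolling_pvalues(x, y, dummy_test)
```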
Taking the return series of the DJIA as an illustration 2, we iteratively apply the smoothed local bootstrap method to detect Granger causality from and to the DJIA return in Figure 6. The red line
2 All volatility series are extremely skewed; see Table 5. In a small-sample analysis, the test statistics turn out to be sensitive to clustered outliers, which typically occur during market turmoil. As a result, the volatility dynamics are more erratic and less informative than those of the returns.
Figure 3. Size-size and size-power plots of Granger non-causality tests, based on 500 replications and smoothed local bootstrap (SMB.a). The DGP is the bivariate ARCH specified in Equation (10), with Y affecting X. The left (right) column shows observed rejection rates under the null (alternative) hypothesis. The sample size varies from n = 200 to n = 2000. Panels: (a) size-size plot, C = 4.8; (b) size-power plot, C = 4.8; (c) size-size plot, C = 8; (d) size-power plot, C = 8.
represents the p-values for the transfer entropy-based test with the DJIA weekly return as the information transmitter, while the blue line shows the p-values associated with testing the DJIA as a receiver of weekly information spillover from the others. The plots display an event-dependent pattern, particularly around the recent financial crisis; from early 2009 until the end of 2012, all pairwise tests show a strong bi-directional linkage. Besides, the DJIA strongly leads the Nikkei, Hangseng and CAC during the first decade of this century.
Further, we see that the influences from other indexes on the DJIA differ, typically responding to economic events. For example, the blue line in the second panel of Figure 6 plunges below the 5% level twice before the recent financial crisis, meaning that the Hong Kong Hangseng index causes fluctuations in the DJIA during those two periods: first in the late 1990s and again by the end of 2004. The timing of the first fall matches the 1997 Asian currency crisis, while the latter was in fact caused by China's austerity policy in October 2004.
Finally, the dynamic plots provide additional insights into the sample period that the full-sample analysis may omit. The DAX and CAC are found to be less relevant for future fluctuations in the weekly return of the DJIA, according to Table 6 and Figure 5; however, one may clearly see that since
Figure 4. Size-power plots of Granger non-causality tests, based on 500 replications and smoothed local bootstrap (SMB.a). The DGP is the two-way VAR specified in Equation (11), with X affecting Y and Y affecting X. The left (right) column shows observed rejection rates for testing X (Y) causing Y (X). The sample size varies from n = 200 to n = 2000. Panels: (a) X → Y, C = 4.8; (b) Y → X, C = 4.8; (c) X → Y, C = 8; (d) Y → X, C = 8.
2001, the p-values for DAX→DJIA and CAC→DJIA are consistently below 5% most of the time, suggesting increasing integration of the global financial markets.
5. Conclusions
This paper provides guidelines for the practical application of transfer entropy in detecting conditional dependence, i.e. Granger causality in a more general sense, between two time series. Although a sizeable literature has already applied transfer entropy for this purpose, the asymptotics of the statistic and the performance of resampling-based measures are still not well understood. We have considered five different resampling-based tests, all of which were shown in the literature to be suitable for entropy-related tests, and investigated their size and power comprehensively. Two time-shifted surrogates and three smoothed bootstrap methods are tested on simulated data from linear VAR, nonlinear VAR and ARCH processes. The simulation results in this controlled environment suggest that all five methods achieve reasonable rejection rates under the null as well as the alternative hypothesis. Our results are very robust with respect to the density estimation method, including the procedure used for standardizing the location and scale of
Figure 5. Graphic representation of pairwise causalities for global stock returns (panel a) and volatilities (panel b), among the DJIA, Nikkei, Hangseng, FTSE, DAX and CAC. All arrows ("→") in the graph indicate a significant directional causality at the 5% level.
the data and the choice of the bandwidth parameter, as long as the convergence rate of the kernel estimator of transfer entropy is consistent with its first-order Taylor expansion.
In the empirical application, we have shown how the proposed resampling techniques can be used on real-world data to detect conditional dependence. We use global equity data to detect pairwise causalities in the return and volatility series among the world's leading stock indexes. Our work is a nonparametric extension of the spillover measures considered by [46]. In accordance with their findings, we found evidence that the US and German indexes are the most important return transmitters and that Hong Kong is the largest source of volatility spillover. Furthermore, the rolling window-based test for Granger causality in pairwise return series demonstrated that the causal linkages in the global equity market are time-varying rather than static. The overall dependence is tighter during the most recent financial crisis, and the fluctuations of the p-values are shown to be event-dependent.
As for future work, there are several directions for potential extension. On the theoretical side, it would be practically meaningful to consider causal linkage detection beyond a single-period lag. Further nonparametric techniques need to be developed to play a role similar to that of information criteria for order selection in the parametric world. On the empirical side, it will be interesting to further exploit entropy-based statistics for testing conditional dependence in the presence of a so-called common factor, i.e., looking at multivariate systems with more than two variables. One potential candidate for this type of test, the partial transfer entropy, has been proposed by [48], but its statistical properties have yet to be thoroughly studied.
Acknowledgments: We would like to thank seminar participants at the University of Amsterdam and the Tinbergen Institute. The authors also wish to extend particular gratitude to Simon A. Broda and Chen Zhou for their suggestions on an early version of this paper. We also acknowledge SURFsara for providing the computational environment of the LISA cluster.
References
1. Shannon, C.E. A mathematical theory of communication. Bell System Technical Journal 1948, 27, 379–423; 623–656.
2. Shannon, C.E. Prediction and entropy of printed English. Bell System Technical Journal 1951, 30, 50–64.
3. Schreiber, T. Measuring information transfer. Physical Review Letters 2000, 85, 461.
4. Granger, C.W. Investigating causal relations by econometric models and cross-spectral methods. Econometrica 1969, pp. 424–438.
5. Hiemstra, C.; Jones, J.D. Testing for linear and nonlinear Granger causality in the stock price-volume relation. The Journal of Finance 1994, 49, 1639–1664.
6. Diks, C.; Panchenko, V. A new statistic and practical guidelines for nonparametric Granger causality testing. Journal of Economic Dynamics and Control 2006, 30, 1647–1669.
7. Su, L.; White, H. A nonparametric Hellinger metric test for conditional independence. Econometric Theory 2008, 24, 829–864.
8. Bouezmarni, T.; Rombouts, J.V.; Taamouti, A. Nonparametric copula-based test for conditional independence with applications to Granger causality. Journal of Business & Economic Statistics 2012, 30, 275–287.
9. Su, L.; White, H. Testing conditional independence via empirical likelihood. Journal of Econometrics 2014, 182, 27–44.
10. Hlavácková-Schindler, K.; Paluš, M.; Vejmelka, M.; Bhattacharya, J. Causality detection based on information-theoretic approaches in time series analysis. Physics Reports 2007, 441, 1–46.
11. Amblard, P.O.; Michel, O.J. The relation between Granger causality and directed information theory: A review. Entropy 2012, 15, 113–143.
12. Granger, C.; Lin, J.L. Using the mutual information coefficient to identify lags in nonlinear models. Journal of Time Series Analysis 1994, 15, 371–384.
13. Hong, Y.; White, H. Asymptotic distribution theory for nonparametric entropy measures of serial dependence. Econometrica 2005, 73, 837–901.
14. Barnett, L.; Bossomaier, T. Transfer entropy as a log-likelihood ratio. Physical Review Letters 2012, 109, 138105.
15. Efron, B. Computers and the theory of statistics: Thinking the unthinkable. SIAM Review 1979, 21, 460–480.
16. Theiler, J.; Eubank, S.; Longtin, A.; Galdrikian, B.; Farmer, J.D. Testing for nonlinearity in time series: The method of surrogate data. Physica D: Nonlinear Phenomena 1992, 58, 77–94.
17. Schreiber, T.; Schmitz, A. Surrogate time series. Physica D: Nonlinear Phenomena 2000, 142, 346–382.
18. Horowitz, J.L. The bootstrap. Handbook of Econometrics 2001, 5, 3159–3228.
19. Hinich, M.J.; Mendes, E.M.; Stone, L.; et al. Detecting nonlinearity in time series: Surrogate and bootstrap approaches. Studies in Nonlinear Dynamics & Econometrics 2005, 9.
20. Faes, L.; Porta, A.; Nollo, G. Mutual nonlinear prediction as a tool to evaluate coupling strength and directionality in bivariate time series: Comparison among different strategies based on k nearest neighbors. Physical Review E 2008, 78, 026201.
21. Papana, A.; Kyrtsou, K.; Kugiumtzis, D.; Diks, C.; et al. Assessment of resampling methods for causality testing. Technical report, CeNDEF Working Paper No. 14-08, University of Amsterdam, 2014.
22. Quiroga, R.Q.; Kraskov, A.; Kreuz, T.; Grassberger, P. Performance of different synchronization measures in real data: A case study on electroencephalographic signals. Physical Review E 2002, 65, 041903.
23. Marschinski, R.; Kantz, H. Analysing the information flow between financial time series. The European Physical Journal B - Condensed Matter and Complex Systems 2002, 30, 275–281.
24. Kugiumtzis, D. Evaluation of surrogate and bootstrap tests for nonlinearity in time series. Studies in Nonlinear Dynamics & Econometrics 2008.
25. Robinson, P.M. Consistent nonparametric entropy-based testing. The Review of Economic Studies 1991, 58, 437–453.
26. Kullback, S.; Leibler, R.A. On information and sufficiency. The Annals of Mathematical Statistics 1951, 22, 79–86.
27. Granger, C.; Maasoumi, E.; Racine, J. A dependence metric for possibly nonlinear processes. Journal of Time Series Analysis 2004, 25, 649–669.
28. Diks, C.; Fang, H. Detecting Granger causality with a nonparametric information-based statistic. CeNDEF Working Paper No. 17-03, University of Amsterdam, 2017.
29. Moddemeijer, R. On estimation of entropy and mutual information of continuous distributions. Signal Processing 1989, 16, 233–248.
30. Diks, C.; Manzan, S. Tests for serial independence and linearity based on correlation integrals. Studies in Nonlinear Dynamics and Econometrics 2002, 6-2, 1–20.
31. Kraskov, A.; Stögbauer, H.; Grassberger, P. Estimating mutual information. Physical Review E 2004, 69, 066138.
32. Wand, M.P.; Jones, M.C. Kernel Smoothing; Vol. 60, CRC Press, 1994.
33. Silverman, B.W. Density Estimation for Statistics and Data Analysis; Vol. 26, CRC Press, 1986.
34. Joe, H. Estimation of entropy and other functionals of a multivariate density. Annals of the Institute of Statistical Mathematics 1989, 41, 683–697.
35. Van der Vaart, A.W. Asymptotic Statistics; Vol. 3, Cambridge University Press, 2000.
36. Jones, M.C.; Marron, J.S.; Sheather, S.J. A brief survey of bandwidth selection for density estimation. Journal of the American Statistical Association 1996, 91, 401–407.
37. He, Y.L.; Liu, J.N.; Wang, X.Z.; Hu, Y.X. Optimal bandwidth selection for re-substitution entropy estimation. Applied Mathematics and Computation 2012, 219, 3425–3460.
38. Politis, D.N.; Romano, J.P. The stationary bootstrap. Journal of the American Statistical Association 1994, 89, 1303–1313.
39. Kugiumtzis, D. Direct-coupling information measure from nonuniform embedding. Physical Review E 2013, 87, 062918.
40. Papana, A.; Kyrtsou, C.; Kugiumtzis, D.; Diks, C. Simulation study of direct causality measures in multivariate time series. Entropy 2013, 15, 2635–2661.
41. Efron, B.; Tibshirani, R.J. An Introduction to the Bootstrap; CRC Press, 1994.
42. Shao, J.; Tu, D. The Jackknife and Bootstrap; Springer Science & Business Media, 2012.
43. Neumann, M.H.; Paparoditis, E. On bootstrapping L2-type statistics in density testing. Statistics & Probability Letters 2000, 50, 137–147.
44. Paparoditis, E.; Politis, D.N. The local bootstrap for kernel estimators under general dependence conditions. Annals of the Institute of Statistical Mathematics 2000, 52, 139–159.
45. Baek, E.G.; Brock, W.A. A general test for nonlinear Granger causality: Bivariate model. Working paper, Iowa State University and University of Wisconsin, Madison, 1992.
46. Diebold, F.X.; Yilmaz, K. Measuring financial asset return and volatility spillovers, with application to global equity markets. The Economic Journal 2009, 119, 158–171.
47. Gamba-Santamaria, S.; Gomez-Gonzalez, J.E.; Hurtado-Guarin, J.L.; Melo-Velandia, L.F.; et al. Volatility spillovers among global stock markets: Measuring total and directional effects. Technical Report No. 983, Banco de la Republica de Colombia, 2017.
48. Vakorin, V.A.; Krakovska, O.A.; McIntosh, A.R. Confounding effects of indirect connections on causality estimation. Journal of Neuroscience Methods 2009, 184, 152–160.
© 2017 by the authors. Submitted to Entropy for possible open access publication under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).
Table 1. Observed Size and Power of the Transfer Entropy-based Test for the Linear VAR Process in Equation (8)

Size
                       α = 0.05                                     α = 0.10
n        TS.a    TS.b    SMB.a   SMB.b   STB        TS.a    TS.b    SMB.a   SMB.b   STB
C = 4.8
200      0.0560  0.0500  0.0460  0.0420  0.0440     0.0820  0.0740  0.0740  0.0780  0.0780
500      0.0740  0.0680  0.0660  0.0700  0.0660     0.1160  0.1120  0.1200  0.1160  0.1220
1000     0.0620  0.0560  0.0600  0.0560  0.0560     0.0940  0.0920  0.0980  0.0960  0.0980
2000     0.0380  0.0340  0.0380  0.0460  0.0460     0.0940  0.0920  0.0960  0.0980  0.0980
C = 8
200      0.0500  0.0440  0.0460  0.0460  0.0440     0.1120  0.1000  0.0960  0.0960  0.0920
500      0.0840  0.0760  0.0760  0.0720  0.0660     0.1360  0.1300  0.1060  0.1160  0.1140
1000     0.0720  0.0680  0.0620  0.0560  0.0580     0.1280  0.1280  0.1160  0.1260  0.1200
2000     0.0880  0.0780  0.0820  0.0760  0.0820     0.1420  0.1380  0.1320  0.1340  0.1440

Power
                       α = 0.05                                     α = 0.10
n        TS.a    TS.b    SMB.a   SMB.b   STB        TS.a    TS.b    SMB.a   SMB.b   STB
C = 4.8
200      0.1880  0.1980  0.1900  0.1920  0.1900     0.2780  0.2780  0.2880  0.2860  0.2920
500      0.3460  0.3460  0.3400  0.3480  0.3420     0.4520  0.4460  0.4580  0.4500  0.4500
1000     0.5440  0.5340  0.5320  0.5340  0.5280     0.6400  0.6520  0.6460  0.6480  0.6460
2000     0.7500  0.7420  0.7500  0.7460  0.7460     0.8160  0.8080  0.8100  0.8120  0.8120
C = 8
200      0.1660  0.1680  0.1640  0.1680  0.1700     0.2660  0.2640  0.2680  0.2740  0.2740
500      0.2900  0.2900  0.3020  0.3040  0.3020     0.4020  0.3960  0.3940  0.4020  0.3980
1000     0.4980  0.4980  0.5000  0.4900  0.4980     0.6040  0.6120  0.6120  0.6140  0.6120
2000     0.8420  0.8380  0.8400  0.8340  0.8460     0.8960  0.8880  0.8900  0.8900  0.8900

Note: Table 1 presents the empirical size and power of the transfer entropy-based test at 5% and 10% significance levels for the process in Equation (8) for different resampling methods. The values represent observed rejection rates over 500 realizations at nominal sizes 0.05 and 0.10. Sample sizes go from 200 to 2,000. The control parameter is a = 0.4 for size evaluation and a = 0.1 for establishing power. For this simulation study we consider C = 4.8 and C = 8.
Table 2. Observed Size and Power of the Transfer Entropy-based Test for the Nonlinear VAR Process in Equation (9)

Size
                       α = 0.05                                     α = 0.10
n        TS.a    TS.b    SMB.a   SMB.b   STB        TS.a    TS.b    SMB.a   SMB.b   STB
C = 4.8
200      0.0500  0.0360  0.0340  0.0380  0.0360     0.0760  0.0720  0.0780  0.0760  0.0720
500      0.0580  0.0580  0.0580  0.0580  0.0580     0.0960  0.0980  0.1040  0.1040  0.0980
1000     0.0340  0.0360  0.0380  0.0360  0.0380     0.0620  0.0580  0.0740  0.0700  0.0720
2000     0.0380  0.0320  0.0440  0.0460  0.0420     0.0780  0.0700  0.0960  0.0920  0.0880
C = 8
200      0.0560  0.0480  0.0460  0.0520  0.0480     0.1000  0.0800  0.0940  0.0940  0.0940
500      0.0640  0.0620  0.0500  0.0480  0.0540     0.1120  0.1100  0.1080  0.1040  0.1040
1000     0.0440  0.0360  0.0320  0.0300  0.0280     0.0900  0.0860  0.0820  0.0780  0.0800
2000     0.0260  0.0300  0.0280  0.0280  0.0260     0.0700  0.0640  0.0740  0.0660  0.0640

Power
                       α = 0.05                                     α = 0.10
n        TS.a    TS.b    SMB.a   SMB.b   STB        TS.a    TS.b    SMB.a   SMB.b   STB
C = 4.8
200      0.1400  0.1360  0.1380  0.1400  0.1440     0.2480  0.2420  0.2300  0.2360  0.2300
500      0.3380  0.3360  0.3340  0.3360  0.3320     0.4400  0.4400  0.4440  0.4300  0.4360
1000     0.6060  0.6040  0.6240  0.6220  0.6220     0.7180  0.7260  0.7140  0.7180  0.7160
2000     0.8760  0.8760  0.8780  0.8740  0.8780     0.9440  0.9300  0.9320  0.9300  0.9280
C = 8
200      0.0940  0.0900  0.0900  0.0900  0.0880     0.1800  0.1740  0.1700  0.1680  0.1760
500      0.1880  0.1800  0.1800  0.1760  0.1780     0.2960  0.2960  0.3000  0.2980  0.3020
1000     0.3800  0.3940  0.3900  0.3840  0.3820     0.5520  0.5440  0.5480  0.5520  0.5480
2000     0.8340  0.8300  0.8280  0.8220  0.8300     0.9040  0.9040  0.9060  0.8980  0.9040

Note: Table 2 presents the empirical size and power of the transfer entropy-based test at 5% and 10% significance levels for the process in Equation (9) for different resampling methods. The values represent observed rejection rates over 500 realizations at nominal sizes 0.05 and 0.10. Sample sizes go from 200 to 2,000. The control parameter is a = 0.4 for size evaluation and a = 0.1 for establishing power. For this simulation study, we consider C = 4.8 and C = 8.
Table 3. Observed Size and Power of the Transfer Entropy-based Test for the Bivariate ARCH Process in Equation (10)

Size
                       α = 0.05                                     α = 0.10
n        TS.a    TS.b    SMB.a   SMB.b   STB        TS.a    TS.b    SMB.a   SMB.b   STB
C = 4.8
200      0.0660  0.0580  0.0620  0.0640  0.0620     0.1420  0.1280  0.1260  0.1340  0.1280
500      0.0640  0.0560  0.0560  0.0560  0.0580     0.1120  0.1060  0.1080  0.1020  0.1000
1000     0.0480  0.0460  0.0400  0.0420  0.0340     0.0740  0.0700  0.0700  0.0660  0.0620
2000     0.0220  0.0200  0.0080  0.0080  0.0080     0.0460  0.0500  0.0220  0.0260  0.0180
C = 8
200      0.0920  0.0760  0.0760  0.0720  0.0840     0.1540  0.1360  0.1280  0.1260  0.1320
500      0.1100  0.0900  0.0720  0.0760  0.0800     0.1980  0.1860  0.1620  0.1480  0.1600
1000     0.0920  0.0960  0.0740  0.0720  0.0800     0.1500  0.1500  0.1220  0.1180  0.1240
2000     0.0780  0.0740  0.0660  0.0620  0.0600     0.1260  0.1180  0.1140  0.1160  0.1120

Power
                       α = 0.05                                     α = 0.10
n        TS.a    TS.b    SMB.a   SMB.b   STB        TS.a    TS.b    SMB.a   SMB.b   STB
C = 4.8
200      0.2320  0.2240  0.2180  0.2240  0.2160     0.3320  0.3340  0.3320  0.3520  0.3460
500      0.3520  0.3420  0.3520  0.3540  0.3540     0.4800  0.4800  0.4740  0.4780  0.4780
1000     0.5020  0.5060  0.5120  0.5120  0.5000     0.6340  0.6320  0.6340  0.6280  0.6300
2000     0.6020  0.6060  0.5940  0.5920  0.5960     0.7340  0.7280  0.7360  0.7300  0.7280
C = 8
200      0.2940  0.2880  0.3220  0.3180  0.3080     0.4360  0.4320  0.4380  0.4400  0.4340
500      0.5520  0.5500  0.5620  0.5720  0.5680     0.6880  0.6940  0.6880  0.6900  0.6920
1000     0.7720  0.7720  0.7820  0.7800  0.7740     0.8600  0.8560  0.8640  0.8600  0.8640
2000     0.9780  0.9720  0.9780  0.9740  0.9760     0.9900  0.9900  0.9920  0.9920  0.9920

Note: Table 3 presents the empirical size and power of the transfer entropy-based test at 5% and 10% significance levels for the process in Equation (10) for different resampling methods. The values represent observed rejection rates over 500 realizations at nominal sizes 0.05 and 0.10. Sample sizes go from 200 to 2,000. The control parameter is a = 0.4 for size evaluation and a = 0.1 for establishing power. For this simulation study, we consider C = 4.8 and C = 8.
Table 4. Observed Power of the Transfer Entropy-based Test for the Two-Way VAR Process in Equation (11)

X → Y
                       α = 0.05                                     α = 0.10
n        TS.a    TS.b    SMB.a   SMB.b   STB        TS.a    TS.b    SMB.a   SMB.b   STB
C = 4.8
200      0.2780  0.2600  0.3540  0.3520  0.3560     0.3980  0.3760  0.4560  0.4560  0.4660
500      0.5640  0.5360  0.6260  0.6240  0.6260     0.6780  0.6680  0.7420  0.7400  0.7460
1000     0.8380  0.8320  0.8800  0.8740  0.8780     0.9040  0.8980  0.9200  0.9240  0.9200
2000     0.9900  0.9900  0.9980  0.9960  1.0000     1.0000  1.0000  1.0000  1.0000  1.0000
C = 8
200      0.2480  0.2240  0.3080  0.3180  0.3100     0.3540  0.3360  0.3900  0.3940  0.3920
500      0.4180  0.4020  0.4740  0.4700  0.4700     0.5400  0.5320  0.5880  0.5920  0.5800
1000     0.7960  0.7840  0.8140  0.8140  0.8160     0.8560  0.8540  0.8740  0.8720  0.8720
2000     0.9800  0.9780  0.9820  0.9800  0.9800     0.9900  0.9880  0.9900  0.9900  0.9920

Y → X
                       α = 0.05                                     α = 0.10
n        TS.a    TS.b    SMB.a   SMB.b   STB        TS.a    TS.b    SMB.a   SMB.b   STB
C = 4.8
200      0.5140  0.5060  0.5480  0.5560  0.5480     0.6120  0.5960  0.6920  0.6920  0.6940
500      0.9400  0.9340  0.9480  0.9440  0.9460     0.9620  0.9580  0.9720  0.9700  0.9740
1000     1.0000  1.0000  1.0000  1.0000  1.0000     1.0000  1.0000  1.0000  1.0000  1.0000
2000     1.0000  1.0000  1.0000  1.0000  1.0000     1.0000  1.0000  1.0000  1.0000  1.0000
C = 8
200      0.4040  0.3700  0.4420  0.4400  0.4280     0.5060  0.4900  0.5400  0.5400  0.5320
500      0.8260  0.8220  0.8220  0.8200  0.8160     0.8740  0.8740  0.8700  0.8720  0.8680
1000     0.9820  0.9840  0.9820  0.9800  0.9800     0.9940  0.9940  0.9940  0.9920  0.9940
2000     1.0000  1.0000  1.0000  1.0000  1.0000     1.0000  1.0000  1.0000  1.0000  1.0000

Note: Table 4 presents the empirical power of the transfer entropy-based test at 5% and 10% significance levels for the process in Equation (11) for different resampling methods. The values represent observed rejection rates over 500 realizations at nominal sizes 0.05 and 0.10. Sample sizes go from 200 to 2,000. The control parameters are b = −0.2 and c = 0.1 for establishing power. For this simulation study, we consider C = 4.8 and C = 8.
Table 5. Descriptive Statistics for Global Stock Market Return and Volatility

Return      DJIA        Nikkei      Hangseng    FTSE        DAX         CAC
Mean        0.1433      -0.0126     0.1294      0.0823      0.1535      0.0790
Median      0.2911      0.1472      0.2627      0.2121      0.4029      0.1984
Maximum     10.6977     11.4496     13.9169     12.5845     14.9421     12.4321
Minimum     -20.0298    -27.8844    -19.9212    -23.6317    -24.3470    -25.0504
Std. Dev.   2.2321      3.0521      3.3819      2.3367      3.0972      2.9376
Skewness    -0.8851     -0.6978     -0.3773     -0.8643     -0.6398     -0.6803
Kurtosis    10.8430     8.9250      5.9522      13.2777     7.9186      8.0780
LB Test     49.9368∗∗   15.4577     28.7922     61.0916∗∗   28.0474     43.5004∗∗
ADF Test    -38.9512∗∗  -37.1989∗∗  -35.4160∗∗  -38.8850∗∗  -37.2015∗∗  -38.7114∗∗

Volatility  DJIA        Nikkei      Hangseng    FTSE        DAX         CAC
Mean        4.5983      7.7167      9.6164      5.5306      8.5698      8.2629
Median      2.2155      4.6208      4.5827      2.7161      4.0596      4.7122
Maximum     208.2227    265.9300    379.4385    149.1572    175.0968    179.8414
Minimum     0.0636      0.1882      0.1554      0.1154      0.1263      0.2904
Std. Dev.   9.9961      13.5154     21.3838     10.1167     15.2845     12.7872
Skewness    10.9980     9.6361      10.2868     6.7179      5.3602      6.0357
Kurtosis    180.0844    140.7606    152.4263    67.0128     42.0810     58.3263
LB Test     1924.0870∗∗ 933.3972∗∗  1198.6872∗∗ 1970.7366∗∗ 2770.5973∗∗ 1982.4141∗∗
ADF Test    -16.0378∗∗  -17.9044∗∗  -17.4896∗∗  -14.0329∗∗  -14.1928∗∗  -13.6136∗∗

Note: Table 5 presents the descriptive statistics for six globally leading indexes. The sample size is 1313 for both returns and volatilities. The nominal returns are measured by weekly Friday-to-Friday log price differences multiplied by 100 and the Monday-to-Friday volatilities are calculated following [46]. For the LB-test and ADF-test statistics, the asterisks indicate significance of the corresponding p-value at the 1% (∗∗) level.
Table 6. Detection of Conditional Dependence in Global Stock Returns

From\To     DJIA              Nikkei            Hangseng          FTSE              DAX               CAC
            C=4.8    C=8      C=4.8    C=8      C=4.8    C=8      C=4.8    C=8      C=4.8    C=8      C=4.8    C=8
TS.a
DJIA        -        -        0.367    0.002∗∗  0.972    0.273    0.609    0.009∗∗  0.533    0.012∗   0.277    0.001∗∗
Nikkei      0.231    0.081    -        -        0.997    0.951    0.883    0.059    0.868    0.242    0.004∗∗  0.001∗∗
Hangseng    0.898    0.197    0.963    0.407    -        -        0.701    0.035∗   0.969    0.174    0.640    0.005∗∗
FTSE        0.483    0.004∗∗  0.004∗∗  0.001∗∗  0.419    0.001∗∗  -        -        0.839    0.185    0.917    0.615
DAX         0.977    0.149    0.027∗   0.001∗∗  0.009∗∗  0.001∗∗  0.004∗∗  0.001∗∗  -        -        0.695    0.025∗
CAC         0.995    0.713    0.001∗∗  0.001∗∗  0.294    0.001∗∗  0.918    0.741    0.639    0.203    -        -

TS.b
DJIA        -        -        0.402    0.004∗∗  0.967    0.341    0.595    0.020∗   0.569    0.049∗   0.288    0.012∗
Nikkei      0.219    0.110    -        -        0.999    0.956    0.853    0.103    0.855    0.255    0.009∗∗  0.006∗∗
Hangseng    0.899    0.269    0.975    0.485    -        -        0.696    0.088    0.971    0.231    0.664    0.023∗
FTSE        0.477    0.016∗   0.006∗∗  0.003∗∗  0.404    0.018∗   -        -        0.847    0.245    0.896    0.642
DAX         0.976    0.221    0.027∗   0.002∗∗  0.009∗∗  0.005∗∗  0.011∗   0.010∗∗  -        -        0.692    0.065
CAC         0.993    0.729    0.002∗∗  0.002∗∗  0.346    0.016∗   0.907    0.763    0.650    0.244    -        -

SMB.a
DJIA        -        -        0.564    0.001∗∗  0.999    0.349    0.796    0.003∗∗  0.817    0.014∗   0.425    0.001∗∗
Nikkei      0.321    0.147    -        -        0.988    0.946    0.957    0.085    0.944    0.321    0.006∗∗  0.002∗∗
Hangseng    0.946    0.273    0.967    0.483    -        -        0.860    0.016∗   0.994    0.188    0.793    0.004∗∗
FTSE        0.579    0.005∗∗  0.021∗   0.001∗∗  0.701    0.003∗∗  -        -        0.947    0.297    0.943    0.739
DAX         0.988    0.240    0.044∗   0.001∗∗  0.012∗   0.001∗∗  0.006∗∗  0.001∗∗  -        -        0.788    0.020∗
CAC         0.993    0.762    0.006∗∗  0.001∗∗  0.594    0.001∗∗  0.946    0.861    0.842    0.270    -        -

SMB.b
DJIA        -        -        0.583    0.002∗∗  0.994    0.334    0.797    0.001∗∗  0.855    0.014∗   0.413    0.001∗∗
Nikkei      0.351    0.155    -        -        0.992    0.940    0.952    0.088    0.965    0.322    0.002∗∗  0.003∗∗
Hangseng    0.945    0.276    0.970    0.506    -        -        0.866    0.015∗   0.997    0.215    0.829    0.004∗∗
FTSE        0.637    0.006∗∗  0.008∗∗  0.001∗∗  0.714    0.003∗∗  -        -        0.954    0.345    0.953    0.789
DAX         0.980    0.256    0.034∗   0.001∗∗  0.011∗   0.001∗∗  0.009∗∗  0.001∗∗  -        -        0.831    0.042∗
CAC         0.993    0.825    0.003∗∗  0.001∗∗  0.591    0.002∗∗  0.980    0.898    0.862    0.304    -        -

STB
DJIA        -        -        0.645    0.001∗∗  0.996    0.334    0.787    0.002∗∗  0.823    0.015∗   0.430    0.001∗∗
Nikkei      0.363    0.114    -        -        0.989    0.944    0.946    0.079    0.955    0.296    0.003∗∗  0.002∗∗
Hangseng    0.964    0.272    0.984    0.491    -        -        0.859    0.013∗   0.987    0.197    0.786    0.004∗∗
FTSE        0.652    0.005∗∗  0.016∗   0.001∗∗  0.688    0.003∗∗  -        -        0.940    0.262    0.965    0.761
DAX         0.982    0.234    0.048∗   0.001∗∗  0.017∗   0.001∗∗  0.006∗∗  0.001∗∗  -        -        0.828    0.029∗
CAC         0.996    0.814    0.007∗∗  0.001∗∗  0.578    0.001∗∗  0.967    0.835    0.799    0.265    -        -

Note: Table 6 presents the statistics for the pairwise transfer entropy-based test on returns of the global stock indexes for one-week-ahead conditional non-independence. The results are shown for each of the five resampling methods in Section 2.3. The constant C takes the values 4.8 and 8 as a robustness check. The asterisks indicate the significance of the corresponding p-value at the 5% (∗) and 1% (∗∗) levels.
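The entries in Table 6 come from comparing an estimated transfer entropy with its distribution under resampled data. A minimal illustration of the time-shifted surrogate idea (TS) with a crude binned plug-in estimator; the paper itself uses kernel estimators with bandwidth constant C, so the histogram estimator, the quartile bins, and the function names below are simplifying assumptions for illustration only:

```python
import numpy as np

def transfer_entropy(x, y):
    # Crude plug-in estimate of transfer entropy X -> Y with one lag,
    # discretizing both series into quartile bins.
    xb = np.digitize(x, np.quantile(x, [0.25, 0.5, 0.75]))
    yb = np.digitize(y, np.quantile(y, [0.25, 0.5, 0.75]))
    trip = np.stack([yb[1:], yb[:-1], xb[:-1]], axis=1)  # (Y_t, Y_{t-1}, X_{t-1})
    def H(cols):
        _, counts = np.unique(cols, axis=0, return_counts=True)
        p = counts / counts.sum()
        return -(p * np.log(p)).sum()
    # TE = H(Y_t,Y_{t-1}) - H(Y_{t-1}) - H(Y_t,Y_{t-1},X_{t-1}) + H(Y_{t-1},X_{t-1})
    return H(trip[:, :2]) - H(trip[:, 1:2]) - H(trip) + H(trip[:, 1:])

def ts_pvalue(x, y, n_sur=99, seed=0):
    # Time-shifted surrogate p-value: rotating x destroys any X -> Y
    # coupling while preserving the marginal dynamics of x.
    rng = np.random.default_rng(seed)
    t0 = transfer_entropy(x, y)
    null = [transfer_entropy(np.roll(x, rng.integers(20, len(x) - 20)), y)
            for _ in range(n_sur)]
    return (1 + sum(t >= t0 for t in null)) / (1 + n_sur)

# demo: y is driven by lagged x, so the surrogate p-value should be small
rng = np.random.default_rng(2)
x = rng.standard_normal(1000)
y = 0.8 * np.roll(x, 1) + 0.6 * rng.standard_normal(1000)
print(ts_pvalue(x, y))
```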
Table 7. Detection of Conditional Dependence in Global Stock Volatilities

From\To     DJIA              Nikkei            Hangseng          FTSE              DAX               CAC
            C=4.8    C=8      C=4.8    C=8      C=4.8    C=8      C=4.8    C=8      C=4.8    C=8      C=4.8    C=8
TS.a
DJIA        -        -        0.997    0.998    0.998    0.994    0.974    0.853    0.828    0.001∗∗  0.005∗∗  0.001∗∗
Nikkei      0.998    0.995    -        -        0.996    1.000    0.997    0.994    0.971    0.950    1.000    0.999
Hangseng    0.943    0.003∗∗  0.989    0.992    -        -        0.822    0.001∗∗  0.001∗∗  0.001∗∗  0.973    0.953
FTSE        0.010∗∗  0.001∗∗  0.955    0.944    0.997    1.000    -        -        0.806    0.001∗∗  0.985    0.946
DAX         0.975    0.931    0.934    0.898    0.999    0.995    0.001∗∗  0.001∗∗  -        -        0.072    0.001∗∗
CAC         0.996    0.944    0.988    0.987    0.999    0.997    0.003∗∗  0.001∗∗  0.054    0.001∗∗  -        -

TS.b
DJIA        -        -        0.993    0.994    0.996    0.992    0.958    0.857    0.806    0.010∗∗  0.018∗   0.005∗∗
Nikkei      0.986    0.994    -        -        0.996    0.997    0.988    0.988    0.942    0.926    0.998    0.996
Hangseng    0.919    0.011∗   0.976    0.966    -        -        0.819    0.002∗∗  0.001∗∗  0.001∗∗  0.965    0.941
FTSE        0.019∗   0.003∗∗  0.950    0.927    0.994    0.997    -        -        0.785    0.002∗∗  0.982    0.932
DAX         0.976    0.904    0.943    0.890    0.997    0.999    0.001∗∗  0.001∗∗  -        -        0.113    0.009∗∗
CAC         0.998    0.951    0.979    0.983    0.999    0.998    0.009∗∗  0.002∗∗  0.077    0.003∗∗  -        -

SMB.a
DJIA        -        -        0.823    0.786    0.984    0.968    0.468    0.189    0.210    0.004∗∗  0.027∗   0.002∗∗
Nikkei      0.957    0.941    -        -        1.000    1.000    0.843    0.800    0.743    0.614    0.998    0.993
Hangseng    0.458    0.030∗   0.888    0.808    -        -        0.165    0.007∗∗  0.001∗∗  0.001∗∗  0.839    0.736
FTSE        0.034∗   0.001∗∗  0.557    0.417    1.000    1.000    -        -        0.358    0.001∗∗  0.967    0.793
DAX         0.702    0.255    0.448    0.327    1.000    1.000    0.001∗∗  0.001∗∗  -        -        0.030∗   0.001∗∗
CAC         0.887    0.241    0.673    0.643    1.000    1.000    0.005∗∗  0.001∗∗  0.012∗   0.001∗∗  -        -

SMB.b
DJIA        -        -        0.851    0.806    0.973    0.969    0.563    0.269    0.226    0.004∗∗  0.032∗   0.001∗∗
Nikkei      0.950    0.940    -        -        1.000    1.000    0.888    0.849    0.792    0.675    0.999    0.993
Hangseng    0.451    0.022∗   0.919    0.869    -        -        0.209    0.005∗∗  0.002∗∗  0.001∗∗  0.868    0.793
FTSE        0.023∗   0.001∗∗  0.561    0.457    1.000    1.000    -        -        0.434    0.001∗∗  0.974    0.840
DAX         0.819    0.442    0.491    0.356    1.000    1.000    0.001∗∗  0.001∗∗  -        -        0.030∗   0.001∗∗
CAC         0.958    0.554    0.781    0.696    1.000    1.000    0.009∗∗  0.001∗∗  0.014∗   0.001∗∗  -        -

STB
DJIA        -        -        0.847    0.813    0.972    0.964    0.602    0.278    0.221    0.009∗∗  0.045∗   0.004∗∗
Nikkei      0.967    0.956    -        -        1.000    1.000    0.835    0.789    0.777    0.624    0.998    0.997
Hangseng    0.527    0.033∗   0.933    0.887    -        -        0.222    0.010∗∗  0.003∗∗  0.001∗∗  0.860    0.762
FTSE        0.024∗   0.002∗∗  0.635    0.497    1.000    1.000    -        -        0.405    0.001∗∗  0.977    0.826
DAX         0.795    0.392    0.561    0.410    1.000    1.000    0.004∗∗  0.001∗∗  -        -        0.027∗   0.001∗∗
CAC         0.903    0.561    0.809    0.748    1.000    1.000    0.020∗   0.001∗∗  0.010∗∗  0.001∗∗  -        -

Note: Table 7 presents the statistics for the pairwise transfer entropy-based test on volatilities of the global stock indexes for one-week-ahead conditional non-independence. The results are shown for each of the five resampling methods in Section 2.3. The constant C takes the values 4.8 and 8 as a robustness check. The asterisks indicate the significance of the corresponding p-values at the 5% (∗) and 1% (∗∗) levels.
[Figure 6 appears here: five panels of time-varying p-values ("Return Spillover Dynamics") over Nov 1995–Dec 2016, plotting DJIA→Nikkei and Nikkei→DJIA, DJIA→HKHS and HKHS→DJIA, DJIA→FTSE and FTSE→DJIA, DJIA→DAX and DAX→DJIA, and DJIA→CAC and CAC→DJIA, each against the 5% significance level.]
Figure 6. Time-varying p-values for the transfer entropy-based Granger causality test on the return series. The causal linkages from the DJIA to the other markets, as well as those from the other markets to the DJIA, are tested.
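Time-varying p-value curves of this kind are typically produced with a rolling window: fix a window length, slide it through the sample, and re-run the causality test on each window. A schematic sketch; the window length, step size, and the stand-in correlation-based test are illustrative assumptions (the paper applies its transfer entropy test instead):

```python
import math
import numpy as np

def rolling_pvalues(x, y, test, window=260, step=4):
    # Re-run `test` on each rolling window; return (end index, p-value)
    # pairs, mimicking the time-varying p-value curves of Figure 6.
    out = []
    for start in range(0, len(x) - window + 1, step):
        sl = slice(start, start + window)
        out.append((start + window - 1, test(x[sl], y[sl])))
    return out

def corr_test(x, y):
    # Stand-in bivariate test: two-sided normal-approximation p-value
    # for zero correlation.
    z = abs(np.corrcoef(x, y)[0, 1]) * math.sqrt(len(x) - 3)
    return 2.0 * (1.0 - 0.5 * (1.0 + math.erf(z / math.sqrt(2.0))))

# toy run on independent weekly series: ~21 years of weekly data,
# 5-year (260-week) windows advanced one year (52 weeks) at a time
rng = np.random.default_rng(3)
x, y = rng.standard_normal(1100), rng.standard_normal(1100)
curve = rolling_pvalues(x, y, corr_test, window=260, step=52)
print(len(curve))
```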