
Robust Estimation and Inference for Extremal Dependence in Time Series*

Jonathan B. Hill†

Dept. of Economics
University of North Carolina - Chapel Hill

January 24, 2009

Abstract

Dependence between extreme values is predominantly measured by first assuming a parametric joint distribution function, and almost always for otherwise marginally iid processes. We develop semi-nonparametric and nonparametric measures, estimators and tests of bivariate tail dependence for non-iid data based on tail exceedances and events. The measures and estimators capture extremal dependence decay over time and can be re-scaled to provide robust estimators of canonical conditional tail probability and tail copula notions of tail dependence. Unlike extant offerings, the tests obtain asymptotic power of one against infinitesimal deviations from tail independence. Further, the estimators apply to dependent, heterogeneous processes with or without extremal dependence and irrespective of non-extremal properties and joint distribution specifications. Finally, we study the extremal associations within and between equity returns in the U.S., U.K. and Japan.

*Portions of this paper, previously circulated under the title "Robust Non-Parametric Tests of Extremal Volatility Spillover with a Study of Extremal Asset Market Contagion," were presented at the Tinbergen Institute-Amsterdam, the University of Mannheim Dept. of Economics, Maastricht University Dept. of Quantitative Economics, Lancaster University Dept. of Mathematics and Statistics, the University of North Carolina-Chapel Hill Dept. of Statistics and Operations Research, the Statistical and Applied Mathematical Sciences Institute Workshop on Risk Analysis, University of Toronto Dept. of Economics, and Duke University Dept. of Economics Brown Bag Econometrics workshop, London School of Economics Dept. of Economics, and Oxford University Dept. of Economics. All participants are gratefully acknowledged, in particular Tim Bollerslev, Bjorn Eraker, Chuan Goh, Ross Leadbetter, Anthony Ledford, John Mahue, Enno Mammen, Angelo Molino, Bent Neilson, Marius Ooms, Franz Palm, Vladas Pipiras, Myung Seo, Neil Sheppard, George Tauchen, Jonathan Tawn, Pierre Urbain and Qiwei Yao.

†Dept. of Economics, University of North Carolina-Chapel Hill, www.unc.edu/~jbhill, [email protected]. AMS 2000 subject classifications: 62G32, 62G35. Keywords: tail dependence; heavy-tails; non-parametric inference.


1. INTRODUCTION Interest in extremal co-movements in financial and insurance markets has risen dramatically due to significant increases in market volatility and evidence for market linkages. See Engle et al (1990), Embrechts et al (2001), Longin and Solnik (2001), Forbes and Rigobon (2002), Embrechts et al (2003), Finkenstädt and Rootzén (2003), Hartmann et al (2004) to name a few. Clearly bivariate tail dependence in this framework is a natural measure of risk spillover.

The tail dependence literature for a bivariate pair $\{X_t, Y_t\}$ is split unevenly into parametric and nonparametric methods. Parametric methods, arguably the most widely used, are typically grounded on power tail indices, including the bivariate tail index (Ledford and Tawn 1996, 1997, 2003) and the extremal index (Leadbetter et al 1983). Unit Fréchet transformations and copula structures are often exploited to abstract from the marginal distributions, yet both joint and marginal distributions, or just the tail support, are assumed to satisfy parametric forms, including Pareto, Weibull and Generalized Extreme Value. In this literature either the individual time series $\{X_t\}$ and $\{Y_t\}$ are assumed to be iid (Ledford and Tawn 1997; Heffernan and Tawn 2004; Schmidt and Stadtmüller 2006), or dependence is allowed in a univariate setting but either ignored for point estimation, or not supported with rigorous theory for interval estimation (Ledford and Tawn 2003). See, also, Coles et al (1999) and Stărică (1999). Further, Ledford and Tawn's (1996, 1997) bivariate tail index is incapable of detecting tail dependence in stochastic volatility, and in general cannot model tail dependence decay for the classes of distributions most frequently cited in this literature (e.g. Bivariate Extreme Value, Clayton, Gumbel). See Section 2.3 below.

Nonparametric and semi-nonparametric methods have attracted substantial attention recently due to flexibility and robustness properties. See, for example, Drees and Huang (1998), Einmahl et al (2001), Einmahl et al (2006), Schmidt and Stadtmüller (2006), Klüppelberg et al (2007), Einmahl et al (2008), Klüppelberg et al (2008) and Zhang et al (2008). In all of these cases $\{X_t\}$ and $\{Y_t\}$ are assumed to be marginally iid and tail dependence decay is neglected, and in many cases very small or degenerate forms of tail dependence cannot be detected (Section 2.3).

In general, few results permit unrestricted non-extremes, or substantial persistence and heterogeneity in extremes under the null hypothesis of no tail dependence of a particular form, and none that we know of tackle joint tail dependence for time series over multiple time horizons. It is clear that persistence, decay and heterogeneity must be allowed due to evidence for clustering in temperature fluctuations, rainfall, wave size, insurance claims, equity trade volume, stock options, asset returns, etc. (Embrechts et al 2003; Ledford and Tawn 2003; Heffernan and Tawn 2004). Figure 1 depicts daily log returns of the NASDAQ composite stock index for the period Jan. 2001-Dec. 2004 (roughly $n = 1000$ trading days), where the upper 5th percentile of absolute returns $|X_t|$ is denoted. Figure 2 presents scatter plots of $|X_{t-h}|$ against $|X_t|$ for displacements $h = 1$, 2 and 3 days; it depicts the upper 5th percentile of the sample $\{|X_t|\}_{t=1}^{n}$, and lists the percent of joint 5th percentile threshold exceedances for $|X_{t-h}|$ and $|X_t|$. Apparently extremes cluster with decaying memory, although a fixed percentile like 5% does not reveal tail dependence (cf. Sibuya 1961, Loynes 1965, Smith 1984).

FIGURE 1: NASDAQ Daily Log Returns and Upper 5th Percentile. [Figure omitted; vertical axis from -0.10 to 0.10, horizontal axis labelled Jan-01 through Jul-03.]

FIGURE 2: NASDAQ(t-h) against NASDAQ(t) Daily Absolute Log Returns. Y-Axis = NASDAQ(t), Dotted Lines = upper 5th quantile. [Figure omitted; three scatter panels for h = 1, 2, 3.] Panel annotations: % nas(t) and nas(t-1) > Upper 5th = .007; % nas(t) and nas(t-2) > Upper 5th = .006; % nas(t) and nas(t-3) > Upper 5th = .0047.

In this paper we propose nonparametric and semi-nonparametric measures, estimators and Wald tests of joint tail dependence over multiple time horizons based on tail exceedances and events. Let $I(A)$ denote the indicator function: $I(A) = 1$ if $A$ is true, 0 otherwise. Let $n \ge 1$ be the sample size; $\{m_n\}_{n \ge 1}$ a sequence of intermediate tail fractiles: $m_n \to \infty$ and $m_n/n \to 0$ as $n \to \infty$; and $\{b_{X,m_n}, b_{Y,m_n}\}_{n \ge 1}$ the corresponding sequences of asymptotic upper $m_n/n$ quantiles of the two-tailed sample path $\{|X_t|, |Y_t|\}_{t=1}^{n}$: $(n/m_n)P(|Z_t| > b_{Z,m_n}) \to 1$ as $n \to \infty$ for $Z_t \in \{X_t, Y_t\}$. The tail exceedance, defined as a log peak-over-threshold (Leadbetter et al 1983, Smith 1984, Davison and Smith 1990),

$$\left(\ln\frac{|Z_t|}{b_{Z,m_n}}\right)_+ := \max\left\{\ln\frac{|Z_t|}{b_{Z,m_n}},\ 0\right\},$$

records zero if an observation is not an extreme, $|Z_t| \le b_{Z,m_n}$, or the positive distance (i.e. "peak") of $\ln|Z_t|$ above the threshold $\ln b_{Z,m_n}$. The tail event $I(|Z_t| > b_{Z,m_n})$, by comparison, minimally records 1 or 0 based on whether $Z_t$ is extreme, $|Z_t| > b_{Z,m_n}$, or not, $|Z_t| \le b_{Z,m_n}$.
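The two tail arrays defined above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names `tail_exceedance` and `tail_event` are hypothetical, and the threshold `b` is supplied by the user rather than derived from a fractile sequence.

```python
import numpy as np

def tail_exceedance(z, b):
    """Log peak-over-threshold max(ln(|z|/b), 0), computed elementwise."""
    z = np.abs(np.asarray(z, dtype=float))
    # Clipping the ratio at 1 maps every non-exceedance to ln(1) = 0.
    return np.log(np.maximum(z / b, 1.0))

def tail_event(z, b):
    """Tail event indicator I(|z| > b), returned as a 0/1 float array."""
    z = np.abs(np.asarray(z, dtype=float))
    return (z > b).astype(float)

# Hypothetical data: only the last observation exceeds the threshold b = 2.
z = np.array([0.5, -1.0, 2.0, -2.0 * np.e])
print(tail_exceedance(z, b=2.0))
print(tail_event(z, b=2.0))
```

The exceedance keeps the size of each peak, while the event keeps only its occurrence, which is the distinction the two proposed measures build on.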

High threshold exceedances in financial and insurance markets are important measures for risk and cost analysis (Embrechts et al 2003, Finkenstädt and Rootzén 2003). The news of a financial scandal, presidential election or expected large profits may affect multiple stock markets over several trading days: information associated with one market return's distance above a high threshold, $(\ln(|X_{t-h}|/b_{X,m_n}))_+ > 0$, may be used by traders in another financial market, effectively spilling over into $(\ln(|Y_t|/b_{Y,m_n}))_+ > 0$ with a substantial degree of persistence $h$. Log returns to the NASDAQ, for example, have apparently long memory exceedances: see Section 6.

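Now, define the log exceedance process

$$\{E_{Z,n,t}\} := \left\{\left(\ln\frac{|Z_t|}{b_{Z,m_n}}\right)_+ - E\left[\left(\ln\frac{|Z_t|}{b_{Z,m_n}}\right)_+\right]\right\}.$$

We propose a semi-nonparametric measure and estimator of the correlation between $E_{X,n,t-h}$ and $E_{Y,n,t}$, and a joint test of no correlation as $n \to \infty$ over a window of displacements $h = 1, \dots, H$, where the horizon $H \ge 1$ is finite. This reduces to testing

$$H_0 : \left(\frac{n}{m_n}\right)^2 E\left[E_{X,n,t-h}E_{Y,n,t}\right] \to 0, \quad h = 1, \dots, H, \text{ as } n \to \infty \tag{1}$$

where the outer scale $n/m_n \to \infty$ controls for degeneracy. If $(n/m_n)^2 E[E_{X,n,t-h}E_{Y,n,t}] \to \rho > 0$, say, as $n \to \infty$, then the peaks $(\ln(|Y_t|/b_{Y,m_n}))_+ > 0$ and $(\ln(|X_{t-h}|/b_{X,m_n}))_+ > 0$ are associated, and strongly associated when $\rho = 1$. One-tailed and cross-tailed versions are also possible. See Section 2.1 for details.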

Second, we propose a nonparametric measure and estimator of the correlation between tail events $I(|X_{t-h}| > b_{X,m_n})$ and $I(|Y_t| > b_{Y,m_n})$, and a joint test of tail independence over displacements $h = 1, \dots, H$. Once degeneracy is controlled for, this reduces to testing

$$H_0 : \frac{P\left(|X_{t-h}| > b_{X,m_n},\ |Y_t| > b_{Y,m_n}\right)}{P\left(|X_{t-h}| > b_{X,m_n}\right) \times P\left(|Y_t| > b_{Y,m_n}\right)} \to 1, \quad h = 1, \dots, H, \text{ as } n \to \infty \tag{2}$$

The ratio in (2) is intuitive since it naturally measures extremal dependence, it allows for infinitesimal forms of tail dependence and, as we detail in Sections 2.2 and 2.3, deviates from extant constructions in a simple but very useful way.

The two proposed Wald tests exploit nonparametric kernel covariance matrix estimators for a vector of tail dependence estimators over multiple displacements. The latter allows for robust inference under extraordinarily general conditions, and appears to be unique in the tail dependence literature (cf. Hill 2005).

Non-extremes are left unrestricted, and it is irrelevant whether a parametric model is assumed, including GARCH (Hong 2001), Factor VAR (King et al 1994), or bivariate tail form (Ledford and Tawn 2003; Heffernan and Tawn 2004; Stărică 1999); or whether the tests are applied to levels (Longin and Solnik 2001); or model residuals (Hong 2001). The tests are only sensitive to deviations from their specified forms of tail independence, and obtain asymptotic power of one against infinitesimal deviations from (1) or (2). In this combined sense the estimators and test statistics are robust.

Asymptotic theory covers mixing and geometrically ergodic data in general, but also possibly non-mixing data like Near Epoch Dependent or mixingale data (Ibragimov and Linnik 1971, McLeish 1975), and possibly non-mixing and non-NED data like explosive GARCH with non-smoothly distributed errors (Hill 2008b). This covers linear and nonlinear distributed lags like ARFIMA, regime switching and random coefficient autoregressions; and random volatility processes with short or long memory, including GARCH with unit or explosive roots, nonlinear ARMA-GARCH and stochastic volatility. See Sections 3 and 4, and consult Section 5 for a simulation study.

Finally, in Section 6 we apply the estimators and tests to the daily log-returns of equity markets in the U.S., U.K. and Japan. Extreme returns are highly persistent, casting serious doubt on extant methods that rely on independence of the marginal processes, while significant, persistent and decaying tail dependence between markets is evident.

Proofs of the main results are in Appendix A, Appendix B contains supporting lemmas, and all remaining tables and figures are placed at the end.

Throughout $\xrightarrow{p}$ and $\Rightarrow$ denote convergence in probability and finite dimensional distributions; $a.s.$ denotes "almost surely"; $a_n \to a$ implies $\lim_{n\to\infty} a_n = a$; and $a_n \sim b_n$ implies $a_n/b_n \to 1$. $I_k$ denotes the $k$-dimensional identity matrix. $[z]$ denotes the integer part of $z$. $|\cdot|_p$ and $\|\cdot\|_p$ denote the $l_p$ and $L_p$ vector-norms respectively; $\|\cdot\| = \|\cdot\|_2$ and $|\cdot| = |\cdot|_2$ are the corresponding Euclidean norms.

2. MEASURES OF EXTREMAL DEPENDENCE Let $\{X_t, Y_t\} = \{X_t, Y_t : -\infty < t < \infty\}$ be a bivariate stochastic process. We write $Z_t$ to represent either $X_t$ or $Y_t$, and assume $Z_t \ge 0$ with probability one in order to simplify notation. Intuitively this covers two-tailed $|Z_t|$, left-tailed $-Z_t I(Z_t < 0)$ and right-tailed $Z_t I(Z_t \ge 0)$ data. Although we allow $Z_t$ to be non-stationary, we assume $F_{Z,t}(z) := P(Z_t \le z)$ have for all $t$ the same regularly varying tail:

$$\bar F_Z(z) := 1 - F_{Z,t}(z) = z^{-\kappa_Z}L_Z(z) \text{ for some } \kappa_Z > 0 \text{ as } z \to \infty \tag{3}$$

where $L_Z$ is a slowly varying function. Class (3) with $\kappa_Z \in (0,2)$ belongs to the normal domain of attraction of the stable laws, for any $\kappa_Z > 0$ includes the maximum domain of attraction of the Type II extreme value distribution $\exp\{-z^{-\kappa_Z}\}$, and characterizes stochastic recurrence equations (e.g. strong-GARCH). See Bingham et al (1987), Resnick (1987) and Basrak et al (2002).

2.1 The Exceedance Correlation Coefficient

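Assume $\lim_{z\to\infty} \bar F_Z(z)/\bar F_Z(z-) = 1$ so we can find an intermediate order sequence $\{m_n\}_{n\ge1}$ and a tail threshold sequence $\{b_{Z,m_n}\}_{n\ge1}$, $m_n \in \mathbb{N}$, $b_{Z,m_n} \in \mathbb{R}_+$, satisfying $m_n = o(n)$, $m_n \to \infty$ and $b_{Z,m_n} \to \infty$ as $n \to \infty$, and (see Leadbetter et al 1983: Theorem 1.7.13)

$$\frac{n}{m_n} P\left(Z_t > b_{Z,m_n}\right) \to 1 : \ b_{Z,m_n} \to \infty \text{ is the } m_n/n \to 0 \text{ quantile of } Z_t. \tag{4}$$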

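Properties (3)-(4) imply simple formulas for any moment of the tail exceedance $(\ln(Z_t/b_{Z,m_n}))_+$ and event $I(Z_t > b_{Z,m_n})$ (see, e.g., Hsing 1991: p. 1548):

$$\frac{n}{m_n} E\left[\left(\ln (Z_t/b_{Z,m_n})\right)_+^r\right] \to r! \times \kappa_Z^{-r} \quad\text{and}\quad \frac{n}{m_n} E\left[I(Z_t > b_{Z,m_n})^r\right] \to 1 \quad \forall r \in \mathbb{N} \tag{5}$$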
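As a worked check of the moment formulas in (5), consider the special case of an exact Pareto tail, $P(Z_t > z) = z^{-\kappa}$ for $z \ge 1$. This case is an illustrative assumption, not a requirement of the text. Writing $b$ for the threshold and choosing $b = (n/m_n)^{1/\kappa}$ gives $P(Z_t > b) = m_n/n$ exactly, and integrating the tail probability yields the limits in (5) without any remainder:

```latex
\begin{aligned}
E\left[\left(\ln\frac{Z_t}{b}\right)_+^{r}\right]
  &= \int_0^\infty r u^{r-1}\, P\left(Z_t > b e^{u}\right) du
   = b^{-\kappa} \int_0^\infty r u^{r-1} e^{-\kappa u}\, du \\
  &= \frac{m_n}{n}\, r\, \Gamma(r)\, \kappa^{-r}
   = \frac{m_n}{n}\, r!\, \kappa^{-r},
\qquad
E\left[I(Z_t > b)^{r}\right] = P(Z_t > b) = \frac{m_n}{n}.
\end{aligned}
```

For a general regularly varying tail the slowly varying component enters only through a $1 + o(1)$ factor, which is why (5) is stated as a limit.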

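We define tail dependence as the correlation between tail arrays. The first is based on the tail exceedance process

$$\{E_{Z,n,t}\} := \left\{\left(\ln (Z_t/b_{Z,m_n})\right)_+ - \frac{m_n}{n}\kappa_Z^{-1}\right\} \tag{6}$$

Notice (6) is parametrically re-defined from Section 1 by exploiting $E[(\ln(Z_t/b_{Z,m_n}))_+] \sim (m_n/n)\kappa_Z^{-1}$ under (5).

Since property (5) implies $E[E_{Z,n,t}^2] = E[(\ln(Z_t/b_{Z,m_n}))_+^2] - (E[(\ln(Z_t/b_{Z,m_n}))_+])^2 \sim (m_n/n)2\kappa_Z^{-2} - (m_n/n)^2\kappa_Z^{-2} = (m_n/n)2\kappa_Z^{-2} \times (1 + o(1))$, the exceedance correlation at displacement $h \ge 1$ is

$$\frac{E\left[E_{X,n,t-h}E_{Y,n,t}\right]}{\left(E\left[E_{X,n,t-h}^2\right]\right)^{1/2}\left(E\left[E_{Y,n,t}^2\right]\right)^{1/2}} \sim \frac{E\left[E_{X,n,t-h}E_{Y,n,t}\right]}{\left[(m_n/n)2\kappa_X^{-2}(1+o(1))\right]^{1/2} \times \left[(m_n/n)2\kappa_Y^{-2}(1+o(1))\right]^{1/2}} \sim \frac{E\left[E_{X,n,t-h}E_{Y,n,t}\right]}{(m_n/n)\,2\kappa_X^{-1}\kappa_Y^{-1}} = \frac{n}{m_n}\frac{E\left[E_{X,n,t-h}E_{Y,n,t}\right]}{2\kappa_X^{-1}\kappa_Y^{-1}}$$

Thus, define the (asymptotic) exceedance correlation

$$\rho_{E,n}(h) = \rho_n\left(E_{X,n,t-h}, E_{Y,n,t}\right) := \frac{n}{m_n}\frac{E\left[E_{X,n,t-h}E_{Y,n,t}\right]}{2\kappa_X^{-1}\kappa_Y^{-1}}$$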

The Cauchy-Schwarz inequality reveals $\lim_{n\to\infty}|\rho_{E,n}(h)| \le 1$, since (5) implies

$$\lim_{n\to\infty}\left|\frac{n}{m_n}\frac{E\left[E_{X,n,t-h}E_{Y,n,t}\right]}{2\kappa_X^{-1}\kappa_Y^{-1}}\right| \le \lim_{n\to\infty}\left|\frac{n}{m_n}\frac{\|E_{X,n,t-h}\|_2\,\|E_{Y,n,t}\|_2}{2\kappa_X^{-1}\kappa_Y^{-1}}\right| = \lim_{n\to\infty}\left|\frac{n}{m_n}\frac{\left[(m_n/n)2\kappa_X^{-2}(1+o(1))\right]^{1/2}\left[(m_n/n)2\kappa_Y^{-2}(1+o(1))\right]^{1/2}}{2\kappa_X^{-1}\kappa_Y^{-1}}\right| = \lim_{n\to\infty}(1+o(1)) = 1$$

We say extremal exceedance dependence occurs at displacement $h$ when

$$\lim_{n\to\infty}\frac{n}{m_n}\rho_{E,n}(h) = \lim_{n\to\infty}\left(\frac{n}{m_n}\right)^2\frac{E\left[E_{X,n,t-h}E_{Y,n,t}\right]}{2\kappa_X^{-1}\kappa_Y^{-1}} \ne 0$$

The scale $n/m_n \to \infty$ controls for degeneracy and permits detection of infinitesimal deviations from tail independence. See Sections 2.3 and 2.4 for examples involving Gumbel joint tails and stochastic volatility.

Notice we abuse notation since the limit need not exist, for example if $\lim_{n\to\infty}\rho_{E,n}(h) = \rho$ for some $\rho \in (0,1)$. This case covers popularly exploited distribution classes like Bivariate Extreme Value and Clayton. See Section 2.3.

A negative value $\lim_{n\to\infty}(n/m_n)\rho_{E,n}(h) < 0$ signifies either sub-average exceedances or non-exceedances of $Y_t$ (i.e. $E_{Y,n,t} < 0$) are associated with above average exceedances of $X_{t-h}$ (i.e. $E_{X,n,t-h} > 0$), or vice versa. While this may be interesting, clearly $E_{Z,n,t} < 0$ can occur even when $Z_t$ is very large. We leave disentangling the negative case for future consideration. In any case, various daily equity market returns series appear to have significantly positively associated extremes over a long horizon $h \ge 100$ days. See Section 6.

2.2 The Tail Event Correlation

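Our second measure is based on the tail event process $\{\bar I_{Z,n,t}\} := \{I(Z_t > b_{Z,m_n}) - P(Z_t > b_{Z,m_n})\}$.

Since $E[\bar I_{Z,n,t}^2] = P(Z_t > b_{Z,m_n}) - P(Z_t > b_{Z,m_n})^2 \sim (m_n/n) \times (1 + o(1))$ by (4) and (5), the tail event correlation at displacement $h \ge 1$ reduces to

$$\frac{E\left[\bar I_{X,n,t-h}\bar I_{Y,n,t}\right]}{\left(E\left[\bar I_{X,n,t-h}^2\right]\right)^{1/2}\left(E\left[\bar I_{Y,n,t}^2\right]\right)^{1/2}} \sim \frac{P\left(X_{t-h} > b_{X,m_n}, Y_t > b_{Y,m_n}\right) - P\left(X_{t-h} > b_{X,m_n}\right)P\left(Y_t > b_{Y,m_n}\right)}{\left[(m_n/n)(1+o(1))\right]^{1/2}\left[(m_n/n)(1+o(1))\right]^{1/2}} = \frac{n}{m_n}\left[P\left(X_{t-h} > b_{X,m_n}, Y_t > b_{Y,m_n}\right) - P\left(X_{t-h} > b_{X,m_n}\right)P\left(Y_t > b_{Y,m_n}\right)\right]$$

Thus, the (asymptotic) tail event correlation is

$$\rho_{I,n}(h) = \rho_n\left(\bar I_{X,n,t-h}, \bar I_{Y,n,t}\right) := \frac{n}{m_n}\left[P\left(X_{t-h} > b_{X,m_n}, Y_t > b_{Y,m_n}\right) - P\left(X_{t-h} > b_{X,m_n}\right)P\left(Y_t > b_{Y,m_n}\right)\right]$$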

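A Cauchy-Schwarz argument and (4) and (5) again reveal $\lim_{n\to\infty}|\rho_{I,n}(h)| \le 1$. If we use an outer scale $n/m_n$ to control for degenerate cases since $P(Z_t > b_{Z,m_n}) \sim m_n/n$, tail dependence can be modeled as the ratio of joint and marginal tail probabilities

$$\lim_{n\to\infty}\frac{n}{m_n}\rho_{I,n}(h) = \lim_{n\to\infty}\left(\frac{n}{m_n}\right)^2 P\left(X_{t-h} > b_{X,m_n}\right)P\left(Y_t > b_{Y,m_n}\right)\left[\frac{P\left(X_{t-h} > b_{X,m_n}, Y_t > b_{Y,m_n}\right)}{P\left(X_{t-h} > b_{X,m_n}\right)P\left(Y_t > b_{Y,m_n}\right)} - 1\right] = \lim_{n\to\infty}\frac{P\left(X_{t-h} > b_{X,m_n}, Y_t > b_{Y,m_n}\right)}{P\left(X_{t-h} > b_{X,m_n}\right)P\left(Y_t > b_{Y,m_n}\right)} - 1 \ne 0 \tag{7}$$

Since the right-hand-side of (7) may be unbounded, use $P(Z_t > b_{Z,m_n}) \sim m_n/n$ to deduce

$$\lim_{n\to\infty}\rho_{I,n}(h) = \lim_{n\to\infty}\frac{m_n}{n}\left(\frac{P\left(X_{t-h} > b_{X,m_n}, Y_t > b_{Y,m_n}\right)}{P\left(X_{t-h} > b_{X,m_n}\right)P\left(Y_t > b_{Y,m_n}\right)} - 1\right) = \lim_{n\to\infty}\frac{n}{m_n}P\left(X_{t-h} > b_{X,m_n}, Y_t > b_{Y,m_n}\right) > 0 \tag{8}$$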

This is a non-trivial distinction since, as we review below, (7) captures importantcases that many models in the literature cannot detect.

2.3 Extant Representations

The use of (7) or (8) as a characterization of tail dependence has a variety of practical and theoretical advantages. We discuss how both relate to canonical, copula and bivariate index models of tail dependence. For the sake of brevity we focus entirely on $\rho_{I,n}(h)$; however, each result carries over to $\rho_{E,n}(h)$.

Conditional Tail and Asymptotic Independence Define

$$\chi(h) := \lim_{n\to\infty} P\left(Y_t > b_{Y,m_n} \mid X_{t-h} > b_{X,m_n}\right) \tag{9}$$

In one of the earliest treatments of tail dependence, Sibuya (1960) defines $X_{t-h}$ and $Y_t$ as asymptotically independent when $\chi(h) = 0$. See also Mardia (1964) and Loynes (1965). Shortcomings associated with (9) for families of distributions are now extensively documented (Ledford and Tawn 1996, 1997; de Haan and de Ronde 1998, Heffernan 2000, Ramos and Ledford 2008), and in this and nearly every other related literature $h = 0$: tail dependence lags and decay are ignored.

An overlooked problem with (9), however, is the following. Since $(n/m_n)P(X_{t-h} > b_{X,m_n}) \to 1$,

$$\lim_{n\to\infty} P\left(Y_t > b_{Y,m_n} \mid X_{t-h} > b_{X,m_n}\right) = \lim_{n\to\infty}\frac{n}{m_n}P\left(X_{t-h} > b_{X,m_n}, Y_t > b_{Y,m_n}\right) = \lim_{n\to\infty}\left(\rho_{I,n}(h) + \frac{m_n}{n}\right) = \lim_{n\to\infty}\rho_{I,n}(h)$$

thus $\chi(h) = \lim_{n\to\infty}\rho_{I,n}(h)$.

The property $\chi(h) = 0$, therefore, cannot always be associated with a total absence of tail dependence because (7) holds in non-trivial cases when $\rho_{I,n}(h) \to 0$, including joint Gumbel tails and stochastic volatility. See below. That said, any estimator of $\rho_{I,n}(h)$ asymptotically estimates $\chi(h)$. See Section 3.

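Tail Copula The above also reveals a major shortcoming of tail copulae. Following Schmidt and Stadtmüller (2006: p. 4) the tail copula $\Lambda(x,y)$ is

$$\Lambda(x,y) := \lim_{s\to\infty} s \times P\left(X_{t-h} > \bar F_X^{-1}(x/s),\ Y_t > \bar F_Y^{-1}(y/s)\right)$$

where $x, y \in [0,1]$, $\bar F_Z(z) := P(Z_t > z)$, and $\bar F_Z^{-1}(u) := \inf\{z \in \mathbb{R} \mid \bar F_Z(z) \le u\}$. Tail independence is assumed to be captured by the case

$$\Lambda(h) := \Lambda(1,1) = 0,$$

displacement is ignored ($h = 0$), and theory is only offered for iid marginals $X_t$ and $Y_t$. See, also, Capéraà, Fougères and Genest (1997), Chen and Fan (2006) and Klüppelberg et al (2007).

Now, use the fractile property $m_n = o(n)$ to redefine $\Lambda(x,y)$,

$$\Lambda(x,y) := \lim_{n\to\infty}\frac{n}{m_n}P\left(X_{t-h} > \bar F_X^{-1}((m_n/n)x),\ Y_t > \bar F_Y^{-1}((m_n/n)y)\right)$$

and exploit $b_{Z,m_n}^{-1}\bar F_Z^{-1}((m_n/n)) \to 1$ by construction to obtain from (8)

$$\Lambda(1,1) = \lim_{n\to\infty}\frac{n}{m_n}P\left(X_{t-h} > \bar F_X^{-1}(m_n/n),\ Y_t > \bar F_Y^{-1}(m_n/n)\right) = \lim_{n\to\infty}\frac{n}{m_n}P\left(X_{t-h} > b_{X,m_n}, Y_t > b_{Y,m_n}\right) = \lim_{n\to\infty}\rho_{I,n}(h)$$

Hence $\chi(h) = \Lambda(h) = \lim_{n\to\infty}\rho_{I,n}(h)$.

The copula $\Lambda(h)$ therefore suffers the same problems as $\chi(h)$, and any estimator of $\rho_{I,n}(h)$ asymptotically estimates $\Lambda(h)$. Consult Section 3.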

Stable Tail Dependence Function A nearly identical concept is the so-called stable tail dependence function $l(x,y)$ (Huang 1992, Drees and Huang 1998, Einmahl et al 2008). The function is

$$l(x,y) := \lim_{s\to\infty} s \times P\left(X_{t-h} > \bar F_X^{-1}(x/s) \ \text{or}\ Y_t > \bar F_Y^{-1}(y/s)\right)$$

which trivially reduces to (e.g. Einmahl et al 2008)

$$l(x,y) = x + y - \lim_{s\to\infty} s \times P\left(X_{t-h} > \bar F_X^{-1}(x/s),\ Y_t > \bar F_Y^{-1}(y/s)\right) = x + y - \Lambda(x,y)$$

Thus, $l(1,1) = 2 - \Lambda(h) = 2 - \chi(h) = 2 - \lim_{n\to\infty}\rho_{I,n}(h)$.

Bivariate Tail Index Perhaps the most popular parametric model of tail dependence is developed in the seminal contributions of Ledford and Tawn (1996, 1997). Assume $\{X_t, Y_t\}$ have unit Fréchet marginal distributions and a bivariate tail

$$P\left(X_{t-h} > z, Y_t > z\right) = z^{-1/\eta}\mathcal{L}(z), \quad \eta \in (0,1] \text{ and slowly varying } \mathcal{L}(z) \tag{10}$$

See Resnick (1987) for background theory on bivariate regular variation. Since perfect negative dependence, independence and perfect positive dependence imply the bivariate index $\eta \in [0,1/2)$, $\eta = 1/2$ and $\eta \in (1/2,1]$ respectively, $\eta$ is simply assumed to represent tail dependence. As far as we know time displacement is always ignored in the bivariate case ($h = 0$). See Ledford and Tawn (2003) for the serial dependence case.

The index $\eta$ is repeatedly solved in the literature for Bivariate Extreme Value, Clayton, and Gumbel distributions (e.g. Ledford and Tawn 1996, 1997). In the examples below we show $\eta$ is either redundant, or misrepresents tail dependence, while $\rho_{I,n}(h)$ captures all important nuances of tail dependence. We require the following easily proven limit for unit Fréchets:

$$(m_n/n)\,b_{Z,m_n} \to 1$$

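Logistic Bivariate Extreme Value: If we assume each $Z_t$ is symmetrically distributed for convenience, and allow for displacement, the LBEV lower tail is $P(X_{t-h} \le x, Y_t \le y) = \exp\{-(x^{-1/\theta_h} + y^{-1/\theta_h})^{\theta_h}\}$ for some $\theta_h \in (0,1]$. The case $\theta_h = 1$ corresponds to independence, and the joint upper tail $P(X_{t-h} > z, Y_t > z)$ satisfies (10) with $\eta = 1$ for every $\theta_h \in (0,1)$ and $\mathcal{L}(z) = 2 - 2^{\theta_h}$, the case of maximal positive tail dependence (Ledford and Tawn 1996, 1997). Notice $\eta = 1$ for all $h$, and only the scale $\mathcal{L}(z) = 2 - 2^{\theta_h}$ captures tail dependence decay: $\theta_h \nearrow 1$ as $h \to \infty$ implies $\mathcal{L}(z) \searrow 0$. This is nontrivial given the popular use of $\eta$, and not $\mathcal{L}(z)$, as an expression of tail dependence (e.g. Stărică 1999, Schmidt and Stadtmüller 2006). Further, we are not aware of a theory for joint estimation of $\eta$ and $\mathcal{L}(z)$ over multiple displacements $h = 1, \dots, H$. That said, since $\mathcal{L}(z)$ reveals tail dependence decay, $\eta = 1$ is redundant information and meaningless as $h \to \infty$ when decay exists.

Now consider $\rho_{I,n}(h)$, write $b = b_{Z,m_n}$, and use (8) and the mean-value-theorem twice to deduce

$$\begin{aligned}
\rho_{I,n}(h) &\sim \frac{m_n}{n}\left[\frac{P\left(X_{t-h} > b, Y_t > b\right)}{P\left(X_{t-h} > b\right) \times P\left(Y_t > b\right)} - 1\right] \\
&\sim \frac{n}{m_n}\left[1 - F_X(b) - F_Y(b) + \exp\left\{-2^{\theta_h}/b\right\}\right] - \frac{m_n}{n} \\
&\sim \frac{n}{m_n}\left[1 - 2\left(1 - \frac{m_n}{n}\right) + \left(1 - 2^{\theta_h}\frac{m_n}{n} \times (1 + o(1))\right)\right] - \frac{m_n}{n} \\
&\sim 2 - 2^{\theta_h}
\end{aligned}$$

hence $\lim_{n\to\infty}\rho_{I,n}(h) = 2 - 2^{\theta_h}$ captures all we need to know about tail dependence.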
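The LBEV limit can be checked numerically. The following is a minimal sketch, assuming unit Fréchet margins $F(z) = \exp(-1/z)$ and the logistic joint distribution stated above; the function name `lbev_tail_event_corr` and the argument `t` (standing in for $n/m_n$) are hypothetical labels introduced only for this illustration.

```python
import math

def lbev_tail_event_corr(theta, t):
    """(n/m_n) * [P(X > b, Y > b) - P(X > b) P(Y > b)] for a logistic BEV pair.

    t plays the role of n/m_n, and the threshold b is chosen so that the
    unit Frechet tail probability P(Z > b) equals 1/t exactly.
    """
    inv_b = -math.log1p(-1.0 / t)                     # 1/b, solved from exp(-1/b) = 1 - 1/t
    joint_upper = 1.0 - 2.0 * math.exp(-inv_b) + math.exp(-(2.0 ** theta) * inv_b)
    return t * joint_upper - 1.0 / t                  # the product of marginals contributes 1/t

# Compare the finite-t value with the stated limit 2 - 2**theta.
for theta in (0.25, 0.5, 0.75, 1.0):
    print(theta, lbev_tail_event_corr(theta, t=1e6), 2.0 - 2.0 ** theta)
```

Values of `theta` close to one give a coefficient close to zero, which is the decay pattern described in the text.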

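Clayton Distribution: The joint lower tail is $P(X_{t-h} \le x, Y_t \le y) = F_X(x) + F_Y(y) - 1 + [\bar F_X(x)^{-1/\theta_h} + \bar F_Y(y)^{-1/\theta_h} - 1]^{-\theta_h}$ for some $\theta_h > 0$. Again (10) is satisfied with $\eta = 1$ $\forall h$, and $\mathcal{L}(z) = 2^{-\theta_h}$ (Ledford and Tawn 1996, 1997), hence only $\mathcal{L}(z)$ reveals decay if $\theta_h \to \infty$ as $h \to \infty$. Since $F_Z(b_{Z,m_n}) \sim 1 - m_n/n$ the coefficient $\rho_{I,n}(h)$ satisfies, writing $b = b_{Z,m_n}$,

$$\begin{aligned}
\rho_{I,n}(h) &\sim \frac{m_n}{n}\left[\frac{P\left(X_{t-h} > b, Y_t > b\right)}{P\left(X_{t-h} > b\right) \times P\left(Y_t > b\right)} - 1\right] \\
&\sim \frac{n}{m_n}\left[\bar F_X(b)^{-1/\theta_h} + \bar F_Y(b)^{-1/\theta_h} - 1\right]^{-\theta_h} - \frac{m_n}{n} \\
&\sim \frac{n}{m_n}\left[2\left(\frac{m_n}{n}\right)^{-1/\theta_h} - 1\right]^{-\theta_h} - \frac{m_n}{n} \\
&\sim 2^{-\theta_h}
\end{aligned}$$

Thus $\lim_{n\to\infty}\rho_{I,n}(h) = 2^{-\theta_h}$ again captures the fundamentals of tail dependence.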
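The Clayton limit admits the same kind of numerical check. This sketch assumes the joint upper tail takes the form $[\bar F(x)^{-1/\theta} + \bar F(y)^{-1/\theta} - 1]^{-\theta}$ implied by the lower tail above; the function name and the argument `t` (standing in for $n/m_n$) are hypothetical.

```python
def clayton_tail_event_corr(theta, t):
    """(n/m_n) * [P(X > b, Y > b) - P(X > b) P(Y > b)] with P(Z > b) = 1/t."""
    sf = 1.0 / t                                       # marginal tail probability at the threshold
    joint_upper = (2.0 * sf ** (-1.0 / theta) - 1.0) ** (-theta)
    return t * joint_upper - 1.0 / t

# Compare the finite-t value with the stated limit 2**(-theta).
for theta in (0.5, 1.0, 2.0):
    print(theta, clayton_tail_event_corr(theta, t=1e8), 2.0 ** (-theta))
```

Convergence slows as `theta` grows, because the $-1$ inside the bracket is then less negligible relative to $2(m_n/n)^{-1/\theta}$, so larger values of `t` are needed for large `theta`.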

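Gumbel: In this case $P(X_{t-h} \le x, Y_t \le y) = F_X(x)F_Y(y) \times \exp\{-\theta_h/(xy)\}$ for some $\theta_h \in [0,1)$, where $\theta_h = 0$ represents independence. The joint upper tail satisfies (10) with $\eta = 1/2$ for all $h$, the tail independence case, and an asymptotic expansion of $\mathcal{L}(z)$ reveals weak negative dependence (see Ledford and Tawn 1997). Now use $F_Z(b_{Z,m_n}) \sim 1 - m_n/n$ and the mean-value-theorem to deduce, writing $b = b_{Z,m_n}$,

$$\begin{aligned}
\frac{n}{m_n}\rho_{I,n}(h) &\sim \frac{P\left(X_{t-h} > b, Y_t > b\right)}{P\left(X_{t-h} > b\right)P\left(Y_t > b\right)} - 1 \\
&\sim \left(\frac{n}{m_n}\right)^2\left[1 - 2\left(1 - \frac{m_n}{n}\right) + \left(1 - \frac{m_n}{n}\right)^2\exp\left\{-\theta_h\left(\frac{m_n}{n}\right)^2\right\}\right] - 1 \\
&\sim \left(\frac{n}{m_n}\right)^2\left[-1 + 2\frac{m_n}{n} + \left(1 - 2\frac{m_n}{n} + \left(\frac{m_n}{n}\right)^2\right)\left(1 - \theta_h\left(\frac{m_n}{n}\right)^2(1 + o(1))\right)\right] - 1 \\
&= -\theta_h(1 + o(1)) \to -\theta_h \in (-1, 0]
\end{aligned}$$

indicating negative tail dependence. This result applies without an expansion of the bivariate tail, and gives an exact expression of tail dependence decay $\theta_h \searrow 0$ as $h \to \infty$.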

2.4 Example: Bivariate Stochastic Volatility

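An important example of a process that defies conventional representations of tail dependence is stochastic volatility. Considering the rise in prominence of such models in the financial econometrics literature (e.g. Ghysels, Harvey and Renault 1995; Davis and Mikosch 2006), a concise depiction of tail dependence in this class is desirable. Let $\epsilon_t = [\epsilon_{X,t}, \epsilon_{Y,t}]' \in \mathbb{R}^2_+$ be iid with symmetric marginal tails (3) with $\kappa > 0$, and consider a bivariate random variable $[X_t, Y_t]'$ with stochastic volatility marginals

$$X_t = \sigma_{X,t}\epsilon_{X,t} \quad\text{and}\quad Y_t = \sigma_{Y,t}\epsilon_{Y,t}$$

where $\sigma_t = [\sigma_{X,t}, \sigma_{Y,t}]' \in \mathbb{R}^2_+$, $\{E[\sigma_{X,t}^{2\kappa}], E[\sigma_{Y,t}^{2\kappa}]\} < \infty$, $\{\sigma_t\}$ is independent of $\{\epsilon_t\}$, and

$$\ln\sigma_t = \Theta + \Phi\ln\sigma_{t-1} + \nu_t, \quad \Theta \in \mathbb{R}^2_+,\ \Phi \in \mathbb{R}^{2\times2}_+,\ \nu_t \in \mathbb{R}^2,\ \nu_{Z,t} \overset{iid}{\sim} (0,1)$$

Assume the slope matrix satisfies

$$\Phi \in \left\{\begin{bmatrix}\phi_{XX} & 0 \\ \phi_{YX} & \phi_{YY}\end{bmatrix} \text{ or } \begin{bmatrix}\phi_{XX} & 0 \\ 0 & \phi_{YY}\end{bmatrix}\right\} \quad\text{where } \phi_{\cdot\cdot} \in (0,1)$$

The moment bound holds for any $\kappa > 0$ if, for example, $\nu_{Z,t} \sim N(0,1)$. Notice $P(Z_t > z) \sim E[\sigma_{Z,t}^{\kappa}] \times P(\epsilon_{Z,t} > z)$ hence $Z_t$ has tail (3) with index $\kappa$ (see Breiman 1965). In the first case $X_{t-h}$ and $Y_t$ are tail dependent, in particular (Hill 2008a: Section 4)

$$\lim_{n\to\infty}\frac{n}{m_n}\rho_{I,n}(h) = \frac{E\left[\sigma_{X,t-h}^{\kappa}\sigma_{Y,t}^{\kappa}\right]}{E\left[\sigma_{X,t-h}^{\kappa}\right]E\left[\sigma_{Y,t}^{\kappa}\right]} - 1 \to 0 \text{ as } h \to \infty \tag{11}$$

while in the second case $\phi_{XY} = \phi_{YX} = 0$ hence $\lim_{n\to\infty}(n/m_n)\rho_{I,n}(h) = 0$ $\forall h$.

Since $X_{t-h}$ and $Y_t$ naturally satisfy $\lim_{n\to\infty}\rho_{I,n}(h) = 0$ in the first case, they are asymptotically independent a la $\chi(h) = \Lambda(h) = 0$, yet clearly they are tail dependent in the sense of (7) for all $h \ge 1$ whenever $\phi_{XX} \ne 0$, $\phi_{YX} \ne 0$, and $\phi_{YY} \ne 0$. Further, Hill (2008a) proves the bivariate tail index in (10) satisfies $\eta = 1/2$ for all $h \ge 1$, the tail independence case. Similarly, Davis and Mikosch (2006) prove the extremal index satisfies $\theta = 1$, again the tail independence case. Thus, $\lim_{n\to\infty}(n/m_n)\rho_{I,n}(h)$ provides a very different, and arguably correct, portrait of tail dependence for stochastic volatility.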
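A data generating process of this kind can be simulated as follows. This is a rough sketch under loud assumptions that are not taken from the text: Gaussian volatility shocks, iid Pareto shocks for the heavy-tailed innovations, a burn-in period, and the hypothetical function name `simulate_bivariate_sv`. The lower triangular slope matrix in the usage line lets lagged volatility of the first series feed into the second.

```python
import numpy as np

def simulate_bivariate_sv(n, phi, theta=(0.0, 0.0), kappa=3.0, burn=500, seed=0):
    """Simulate X_t = sigma_X,t * eps_X,t and Y_t = sigma_Y,t * eps_Y,t.

    Log volatility follows ln(sigma_t) = theta + phi @ ln(sigma_{t-1}) + nu_t.
    Assumptions for this sketch: nu_t is standard normal and eps_t is iid
    Pareto with tail index kappa (both choices are illustrative).
    """
    rng = np.random.default_rng(seed)
    phi = np.asarray(phi, dtype=float)                 # 2x2 slope matrix
    theta = np.asarray(theta, dtype=float)             # intercept vector
    log_sigma = np.zeros(2)
    out = np.empty((n, 2))
    for t in range(n + burn):
        nu = rng.standard_normal(2)                    # volatility shocks
        log_sigma = theta + phi @ log_sigma + nu
        eps = rng.pareto(kappa, size=2) + 1.0          # positive shocks with P(eps > z) = z**(-kappa)
        if t >= burn:                                  # discard the burn-in draws
            out[t - burn] = np.exp(log_sigma) * eps
    return out[:, 0], out[:, 1]

# Lower triangular slopes: volatility of X at t-1 enters volatility of Y at t.
x, y = simulate_bivariate_sv(5000, phi=[[0.5, 0.0], [0.4, 0.5]])
print(x[:3], y[:3])
```

Setting the off-diagonal slope to zero gives the second, diagonal case, in which the two series are generated independently.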

3. EXTREMAL DEPENDENCE ESTIMATION For the sake of notational simplicity assume the sample starts at $t = -H + 1$ for the chosen horizon $H$. Let $Z_{(i)}$ denote the sample order statistic of $Z_t$: $Z_{(1)} \ge Z_{(2)} \ge \cdots \ge Z_{(n)}$.

3.1 Estimation

The sample exceedances and events are

$$\hat E_{Z,n,t} = \left(\ln\frac{Z_t}{Z_{(m_n+1)}}\right)_+ - \frac{m_n}{n}\hat\kappa_{Z,m_n}^{-1} \quad\text{where}\quad \hat\kappa_{Z,m_n}^{-1} = \frac{1}{m_n}\sum_{i=1}^{m_n}\ln\frac{Z_{(i)}}{Z_{(m_n+1)}}$$

$$\hat I_{Z,n,t} = I\left(Z_t > Z_{(m_n+1)}\right) - \frac{m_n}{n}$$

and the coefficient estimators are

$$\hat\rho_{E,n}(h) = \frac{1}{m_n}\sum_{t=1}^{n}\frac{\hat E_{X,n,t-h}\hat E_{Y,n,t}}{2 \times \hat\kappa_{X,m_n}^{-1}\hat\kappa_{Y,m_n}^{-1}} \quad\text{and}\quad \hat\rho_{I,n}(h) = \frac{1}{m_n}\sum_{t=1}^{n}\hat I_{X,n,t-h}\hat I_{Y,n,t}$$

Notice $\hat\kappa_{Z,m_n}^{-1}$ is the so-called B. Hill (1975) estimator. We consider Hill's (1975) $\hat\kappa_{Z,m_n}^{-1}$ because we require a consistent estimator under the general conditions outlined below, cf. J.B. Hill (2005, 2009c).
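The estimators of this section can be sketched in Python as follows. The sketch rests on a few assumptions that differ from the text and should be read as such: the function names are hypothetical, the inputs are taken to be non-negative arrays of equal length $n$, and instead of starting the sample at $t = -H + 1$ the code simply pairs the available `x[t-h]` with `y[t]`, so the sum at displacement $h$ runs over $n - h$ products.

```python
import numpy as np

def hill_inverse(z, m):
    """Hill (1975) inverse tail index estimate and the threshold Z_(m+1)."""
    zs = np.sort(np.asarray(z, dtype=float))[::-1]     # descending order statistics
    return float(np.mean(np.log(zs[:m] / zs[m]))), float(zs[m])

def tail_event_corr(x, y, m, h=0):
    """Sample tail event correlation at displacement h."""
    x, y = np.asarray(x, dtype=float), np.asarray(y, dtype=float)
    n = len(x)
    _, bx = hill_inverse(x, m)
    _, by = hill_inverse(y, m)
    ix = (x > bx) - m / n                              # centred tail events
    iy = (y > by) - m / n
    return float(np.sum(ix[: n - h] * iy[h:]) / m)

def exceedance_corr(x, y, m, h=0):
    """Sample exceedance correlation at displacement h."""
    x, y = np.asarray(x, dtype=float), np.asarray(y, dtype=float)
    n = len(x)
    ax, bx = hill_inverse(x, m)
    ay, by = hill_inverse(y, m)
    ex = np.log(np.maximum(x / bx, 1.0)) - (m / n) * ax   # centred log peaks
    ey = np.log(np.maximum(y / by, 1.0)) - (m / n) * ay
    return float(np.sum(ex[: n - h] * ey[h:]) / (m * 2.0 * ax * ay))

# Illustrative data: two independent Pareto samples with tail index 2.
rng = np.random.default_rng(0)
x = rng.pareto(2.0, 20000) + 1.0
y = rng.pareto(2.0, 20000) + 1.0
print(hill_inverse(x, 500)[0], tail_event_corr(x, y, 500, h=1), exceedance_corr(x, y, 500, h=1))
```

Applying `tail_event_corr` to a series paired with itself at `h=0` returns $1 - m/n$ when there are no ties, which is a quick sanity check on the centring and scaling.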

3.2 Assumptions: Tail Memory and Tail Decay

We require abstractions that permit unrestricted non-extremes and substantial dependence and heterogeneity under any hypothesis. We therefore exploit the Extremal Near Epoch Dependence [E-NED] property developed in Hill (2005, 2008b, 2009c). E-NED is simply Near Epoch Dependence [NED] enforced solely on tail events $I(Z_t > b_{Z,m_n}e^u)$, while NED dates in various forms to Ibragimov (1962), Billingsley (1968), Ibragimov and Linnik (1971), McLeish (1975) and Gallant and White (1988). Consult Davidson (1994) for historical details. Let $\{\mathcal{F}_t\}$ be a sequence of $\sigma$-fields induced by some possibly vector-valued stochastic process $\{\epsilon_t\}$,

$$\mathcal{F}_t := \sigma(\epsilon_\tau : \tau \le t) \quad\text{where}\quad \mathcal{F}_s^t := \sigma(\epsilon_\tau : s \le \tau \le t)$$

We say $\{Z_t\}$ is $L_p$-NED on $\{\mathcal{F}_t\}$ or on $\{\epsilon_t\}$ with size $\lambda > 0$ if there exist sequences $\{d_t, \psi_q\}$, $d_t \ge 0$, $\sup_{q \ge 0}\psi_q \in [0,1)$ and $\psi_q = o(q^{-\lambda})$ such that

$$\left\|Z_t - E\left[Z_t \mid \mathcal{F}_{t-q}^{t+q}\right]\right\|_p \le d_t \times \psi_q$$

The "constants" $\{d_t\}$ capture time dependence of the $L_p$-norm and control for scale, while the "coefficients" $\psi_q \in [0,1)$ gauge the rate of approximation, hence persistence, according to the size $\lambda$. The property characterizes mixing, infinite order distributed lags of mixing, and non-mixing data, including linear and nonlinear distributed lags with long or short memory (e.g. ARFIMA, regime switching), GARCH, nonlinear GARCH, nonlinear ARMA-nonlinear GARCH, bilinear, and stochastic volatility (Gallant and White 1988; Davidson 1994, 2004; Hill 2008a).

Now let $\{q_n\}_{n\ge1}$ be a sequence of integer time displacements where $q_n \to \infty$.

$L_p$-E-NED $\{Z_t\}$ is $L_p$-Extremal-Near-Epoch-Dependent on $\{\mathcal{F}_t\}$ or on $\{\epsilon_t\}$ with size $\lambda > 0$ if $\{I(Z_t > b_{Z,m_n}e^u)\}$ is $L_p$-NED on $\{\mathcal{F}_t\}$ with size $\lambda$. In particular, for some $\{q_n\}$, $q_n \to \infty$, and all $u \in \mathbb{R}$

$$\left\|I\left(Z_t > b_{Z,m_n}e^u\right) - P\left(Z_t > b_{Z,m_n}e^u \mid \mathcal{F}_{t-q_n}^{t+q_n}\right)\right\|_p \le d_{n,t}(u) \times \psi_{q_n}$$

where $d_{n,t} : \mathbb{R} \to \mathbb{R}_+$ is Lebesgue measurable, $\sup_{u \ge 0,\, 1 \le t \le n} d_{n,t}(u) = O((m_n/n)^{1/p})$, and $\psi_{q_n} = o((m_n/n)^{1/r - 1/p}q_n^{-\lambda})$ for some $r > 2$.

Remark 1: Saying $\{Z_t\}$ is $L_p$-E-NED is synonymous to saying $\{I(Z_t > b_{Z,m_n}e^u)\}$ is $L_p$-NED.

Remark 2: The E-NED property is a marginal tail memory property that says nothing about joint tail memory nor non-extremal properties. If $\{I(Z_t > b_{Z,m_n}e^u)\}$ is $L_2$-NED on $\{\mathcal{F}_t\}$ with size $\lambda$ then so is the exceedance $\{(\ln(Z_t/b_{Z,m_n}))_+\}$, and $\{I(Z_t > b_{Z,m_n}e^u)\}$ is $L_p$-NED with size $\lambda$ if and only if it is $L_r$-NED for any $r \ne p$, with a size that depends on $\lambda$, $p$ and $r$ (Hill 2008b: Lemmas 2.1 and 2.2).

Remark 3: Any process is E-NED on itself (i.e. when $\mathcal{F}_t$ is adapted to $Z_t$) with constants $d_{n,t}(u) = 0$ and coefficients $\psi_{q_n}$ with arbitrary size $\lambda$.

Remark 4: If memory is geometric, $\psi_{q_n} = O(\varrho^{q_n})$, $\varrho \in (0,1)$, then size $\lambda > 0$ is irrelevant (i.e. arbitrarily large).

Remark 5: In general mixing, $L_p$-NED, and GARCH processes with unit or explosive roots are $L_p$-E-NED for any $p > 2$ (Hill 2008b).

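In order to exploit central limit theory for NED tail arrays $\{I(Z_t > b_{Z,m_n}e^u)\}$ we require the base $\{\epsilon_t\}$ to satisfy a mixing condition. Recall we say $\epsilon_t$ is uniform mixing with size $\lambda > 0$ when (Ibragimov and Linnik 1971)

$$\varphi_q := \sup_{t \in \mathbb{Z}}\ \sup_{A \in \mathcal{F}_{-\infty}^{t},\, B \in \mathcal{F}_{t+q}^{\infty}} |P(B \mid A) - P(B)| = o(q^{-\lambda})$$

and strong mixing with size $\lambda > 0$ when

$$\alpha_q := \sup_{t \in \mathbb{Z}}\ \sup_{A \in \mathcal{F}_{-\infty}^{t},\, B \in \mathcal{F}_{t+q}^{\infty}} |P(A \cap B) - P(A)P(B)| = o(q^{-\lambda})$$

Although when $\{I(Z_t > b_{Z,m_n}e^u)\}$ is $L_p$-NED it is $L_r$-NED, we ultimately require convolutions $\{I(X_{t-h} > b_{X,m_n}e^u) \times I(Y_t > b_{Y,m_n}e^u)\}$ and $\{(\ln(X_{t-h}/b_{X,m_n}))_+ \times (\ln(Y_t/b_{Y,m_n}))_+\}$ to be $L_2$-NED such that a Gaussian central limiting theory applies to $\hat\rho_{E,n}(h)$ and $\hat\rho_{I,n}(h)$. A simple sufficient condition is for $\{Z_t\}$ to be $L_4$-E-NED.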

Assumption A fg is 4-E-NED on fzg with coe¢cients of size = 12, andLebesgue integrable constants () where sup1··s10 ()=(()1)for some 2. The base fg is uniform mixing where ()

[2(¡1)] !

0 for some 2 and some ! 1; or strong mixing where ()(¡2)

! 0 for some 2 and some ! 1.

Remark 1: The strong mixing condition ()(¡2) ! 0, for example,

implies the mixing size is larger than (¡ 2). For example if = (()) forsome 0 then (¡2)+1 ! 0, and since is otherwise arbitrary can be madearbitrarily large such that the mixing size is arbitrarily close to (¡ 2).

Remark 2: If the process is geometrically strong mixing then it is $L_4$-E-NED on itself as the mixing base (i.e. the $\sigma$-field is generated by the process itself) with trivial constants equal to zero and coefficients of any size. Since geometric memory implies an arbitrary mixing size, Assumption A is easily satisfied. This covers many nonlinear processes, including random coefficient autoregressions, neural nets, contraction mappings and nonlinear autoregressions, and linear and nonlinear ARMA-GARCH, all under appropriate density smoothness conditions (e.g. An and Huang 1996; Carrasco and Chen 2002; Meitz and Saikkonen 2008). The advantages of E-NED are that it places no restrictions on non-extremes nor density smoothness, and therefore covers non-mixing and non-NED data, including linear and nonlinear infinite order distributed lags of mixing data and explosive GARCH processes. See Hill (2008b).

In order to expedite asymptotic normality we impose a second order regular variation condition. Recall the slowly varying component $L(x)$ in (3).

Assumption B For some positive measurable function : R+ ! R+, 2 fg,

()()¡ 1 = (()) as ! 1 81. (12)

Assume has bounded increase: there exists 0 0 1 and · 0 suchthat ()() · some for ¸ 1 and 0Finally = () and satisfy

12 £ () ! 0 (13)

Remark: Property (12) represents slow variation with remainder (Goldie andSmith 1987). Tails satisfying properties (3), (4), (12) and (13) include ¹() =¡(1 + ((ln)¡)) and ¹() = ¡(1 + (¡)), 0, ¸ 0, and (13)respectively requires = ((ln)2) and =(f2(2+)g). See Haeusler andTeugels (1985).

Finally, we require sufficiently many tail observations $k_n$ to ensure the $(k_n+1)$-th order statistic, which enters nonlinearly in every sample tail exceedance and tail event, does not affect the limit distributions of the two estimators.

ASSUMPTION C 1. $k_n/n^{1/2} \to \infty$; or 2. $k_n/n^{2/3} \to \infty$.

Remark: Any restriction like ! 1 for 0 implies some tail structurescharacterized by (12)-(13) are not covered including ¹() = ¡(1 +((ln)¡))since = ((ln)2) is required.

3.3 Preliminary Asymptotic Theory

Let () denote either () or (), construct vectors

= [()]=1 and = [()]

=1 for any 2 N,

and let §denote the associated covariance matrix

§:= £ £(¡ ) (¡ )

THEOREM 3.1 Let Assumptions A and B hold.

. Under Assumption C.1 j¡ j! 0. Further, if lim!1= 0

then there exists a zero mean £ 1 multivariate Gaussian law such that

12 (¡ )

! and j§j = (1)


. Under Assumption C.2 j¡ j! 0. Further, there exists an £ 1

multivariate Gaussian law such that

12 (¡ )

! and j§j =(1)

. Let 2 Rdenote the vector (12)[¡¡ (¡)]=1or [ ¹¡¹¡ (¹¡¹)]=1. Then under the conditions of () or ()

¯¯¯§¡ 1

X

=1

£0

¤¯¯¯

! 0

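Remark 1: In general the exceedance correlation estimator is consistent under only Assumptions A, B and C.1, but the limit law of the normalized estimation error is substantially complicated by the presence of sample tail statistics in the numerator and denominator of the estimator. We therefore focus on the case where the population coefficient vanishes asymptotically to simplify the proof.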

Remark 2: More tail observations for are required for a key approxima-tion result ¡ ¼ ¹¡

¹ due to the highly nonlinear imbedding of(+1). See Lemma A.1 in Appendix B.

Remark 3: Claim () is useful for proving a nonparametric estimator of §

is consistent. See Section 4.2.

Theorem 3.1 permits an "autocorrelogram" type analysis of serial and bivariatetail dependence and decay a la the -asymptotic con…dence band, 2 [01],

() § 2 £h§

i12

12

withthe 1¡ quantile of a standard normal distribution, and §any consistentestimator of §.
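As a rough numerical illustration of the confidence band just described, the following Python sketch computes a band half-length and checks whether an estimate falls inside it. It assumes the half-length equals a standard normal quantile times the square root of the relevant diagonal entry of the covariance estimate, divided by the square root of the sample size. That scaling is an assumption made here for illustration, and the names `band_half_length`, `inside_band`, `sigma_hh` and `n` are hypothetical.

```python
import math
from statistics import NormalDist


def band_half_length(sigma_hh, n, level=0.95):
    """Half-length of a two-sided asymptotic band.

    sigma_hh : diagonal entry of a covariance estimate (assumed given)
    n        : sample size; the 1/sqrt(n) scaling is an assumption
    level    : two-sided coverage level, e.g. 0.95
    """
    if sigma_hh < 0 or n <= 0:
        raise ValueError("sigma_hh must be non-negative and n positive")
    # Two-sided band: use the (1 + level)/2 standard normal quantile.
    z = NormalDist().inv_cdf(0.5 + level / 2.0)
    return z * math.sqrt(sigma_hh / n)


def inside_band(estimate, sigma_hh, n, level=0.95):
    """True when the estimate is insignificant at the given level."""
    return abs(estimate) <= band_half_length(sigma_hh, n, level)
```

Plotting the estimate against displacement together with plus and minus this half-length gives the "autocorrelogram" style display the text refers to.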

4. WALD TESTS Now consider a joint test of no tail dependence over multiple lags.

4.1 Null, Local and Distant Hypotheses

The null and alternative hypotheses for either 2 fg are

0 : lim!1

= 0 and 1 : lim!1

() 6= 0 for some 2 f1g

Since we have shown processes like stochastic volatility and joint tails like Gumbelsatisfy lim!1= 0 and lim!1() 6= 0, and joint distribution tailslike Logistic BEV and Clayton satisfy lim!1 6= 0, we distinguish between two


types of alternatives. The …rst represents a ()-local alternative: there exists anon-stochastic vector = [()]=1 such that

1 :

=+(112 ) where jj 1

Under Assumption B the stochastic volatility process in Section 2.4 satis…es1 where

() = () is de…ned in (11). See Hill (2008a). Clearly 0 is a special case of 1 ,

capturing trivial cases like independence where () = 0 for all and .The second represents a distant alternative:

1 : lim!1

() = () 6= 0 for at least one 2 f1g

Notice under 1 divergence ()j()j ! 1 for some 2 f1g. Thus, since

() = ¤() = lim!1()the canonical () and tail copula ¤() are only sensitiveto large deviations from no tail dependence

1 , and cannot detect in…nitessimaldeviations of the local form

1 .

4.2 Test StatisticsNow construct a Wald statistic for either 2 fg,

= £ 0§¡1

where § is any consistent estimator of § = [( ¡ )0 ¡)]. The following ensures a well de…ned test statistic asymptotically.

ASSUMPTION D $\Sigma$ is positive definite uniformly in $n$.

Since we allow for a wide variety of dependence properties under the null, in general a parametric estimator of $\Sigma$ is not available. We therefore suggest a nonparametric estimator

§=1

X

=1

£ £ 0

where = ((¡ )) denotes a standard kernel function with bandwidth ! 1 as ! 1. For example, the Bartlett kernel is = (1 ¡ j¡ j)+. If= then

=1

2 £ ¡1¡1

i=1

¡

2 R

and if = then

=h¡

i=1

¡

2 R

The following details the required properties of the kernel and bandwidth .
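The kernel weighting described above can be sketched in a few lines of Python. This is a minimal illustration for a scalar, already centered series, assuming the Bartlett weight is one minus the absolute time gap divided by the bandwidth, truncated at zero. The function names and the scalar simplification are assumptions made here; the paper's estimator is matrix valued.

```python
def bartlett_kernel(x):
    """Bartlett kernel (1 - |x|)_+ ."""
    return max(0.0, 1.0 - abs(x))


def kernel_variance(z, bandwidth):
    """Kernel-weighted long-run variance of a centered scalar series.

    Computes (1/n) * sum_s sum_t w((s - t) / bandwidth) * z[s] * z[t],
    the scalar analogue of a kernel covariance matrix estimator.
    """
    if bandwidth <= 0:
        raise ValueError("bandwidth must be positive")
    n = len(z)
    total = 0.0
    for s in range(n):
        for t in range(n):
            total += bartlett_kernel((s - t) / bandwidth) * z[s] * z[t]
    return total / n
```

With a bandwidth of one only the diagonal terms receive weight, so the estimate collapses to the sample second moment; larger bandwidths bring in cross products at longer gaps with linearly declining weights.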


ASSUMPTION E

1. is a member of class de…ned by

= f: R ! [¡11] j (0) = 1() =(¡) 82 RZ 1

¡1j()j1

Z 1

¡1j()j1() is

continuous at = 0 and all but a …nite number of points g

where () = (2)¡1R 1¡1()1.

2. Let ! 1 as ! 1 = (),P

=1 jj = (12), andsup1··

P=1 jj = (12

14).

Remark 1: The class includes the Bartlett, Parzen, Quadratic Spectral and Tukey-Hanning kernels, and every kernel in the class ensures the estimator $\hat{\Sigma}$ is uniformly positive definite with probability one. See de Jong and Davidson (2000).

Remark 2: In general the bandwidth rate ! 1 depends on the choice ofkernel under E.2. The Bartlett kernel, for example, satis…es

P=1 jj = ()

and sup1··P

=1 jj= (12 12). Thus, =(1) and E.2 hold simultane-ously if = (12) = (12). In practice some rule for will reveal a boundfor . For example » , 2 (231), implies = (¡12), 2 (231).

The kernel covariance matrix is in general consistent.

THEOREM 4.1 Let Assumptions A, B, and E hold, and let Assumption C.1 holdfor and C.2 for . Then j§¡ §j

! 0.

Intuitively, if f¡g are not tail dependent with respect to tail exceedancesor events over displacements = 1, then Theorem 3.1 implies will be ap-proximately chi-squared distributed for large . If there exists tail dependence then12

jj will be large for large as long as ! 1 su¢ciently fast, hence

! 1 with probability one since § is consistent and uniformly positive de…niteunder any hypothesis.

THEOREM 4.2 Let Assumptions A, B, D and E hold, and let Assumption C.1hold for and C.2 for .

Under 1 ,

! 2() a non-central chi-squared law with -degrees offreedom and non-centrality parameter . In particular = 0 if and only if jj= 0, and = 1 if and only if jj 6= 0 and 23 ! 1.

Under 1 for if » 23 then = 0§

¡1 , and = 0 when

= (23), where §= lim!1§ is 2-bounded.

Under 1 , ! 1 with probability one.


Remark 1: Theorem 4.2 implies under 0 : ()! = 0 the Waldstatistic

! 2(), a chi-squared law with -degrees of freedom.Remark 2: Asymptotic power against the local

1 depends intimately on therate ! 1 of tail observations used. Even though ! 0 under

1 , as long as23 ! 1 and ()! 6= 0, the Wald test has non-negligible powerfor either dependence measure, in particular ! 1 so asymptotic poweris one. On the other hand, if » 23 then the exceedance-based

! 2()under

1 , with non-centrality = 0§¡1 0 if and only if jj 6= 0. Asymptotic

power now depends on how "local" the alternative is. As a rule of thumb, therefore,23 ! 1 should always be enforced.

Remark 3: Although Theorem 3.1 requires only $k_n/n^{1/2} \to \infty$ for the exceedance-based estimator to be asymptotically normal, Theorem 4.2.ii,iii show the number of tail observations must satisfy at least $k_n/n^{2/3} \nrightarrow 0$ to ensure any power: if $k_n/n^{2/3} \to 0$ then the non-centrality parameter is zero and asymptotic power is equal to the nominal size of the test.

How the two estimators and the Wald statistic work in practice is the subject of the next section.
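The mechanics of the Wald test in this section can be illustrated with a short, self-contained Python sketch. It forms a scaled quadratic form in a vector of estimated coefficients and converts it to an asymptotic chi-squared p-value. The scale factor (for example the sample size) is passed in explicitly because the exact normalization is an assumption here, and all function names are hypothetical.

```python
import math


def solve(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [list(row) + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("covariance matrix is numerically singular")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        s = m[i][n] - sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = s / m[i][i]
    return x


def wald_statistic(coefs, sigma, scale):
    """scale * coefs' sigma^{-1} coefs, with sigma a covariance estimate."""
    y = solve(sigma, coefs)
    return scale * sum(c * v for c, v in zip(coefs, y))


def chi2_sf(x, dof):
    """Upper tail probability of a chi-squared law with integer dof.

    Uses Q(a + 1, y) = Q(a, y) + y^a e^{-y} / Gamma(a + 1), starting from
    Q(1/2, y) = erfc(sqrt(y)) for odd dof or Q(1, y) = e^{-y} for even dof.
    """
    if dof < 1:
        raise ValueError("dof must be a positive integer")
    if x <= 0:
        return 1.0
    y = x / 2.0
    if dof % 2 == 1:
        a, q = 0.5, math.erfc(math.sqrt(y))
    else:
        a, q = 1.0, math.exp(-y)
    while a < dof / 2.0:
        q += y ** a * math.exp(-y) / math.gamma(a + 1.0)
        a += 1.0
    return q
```

A test at the 5% level over a given number of lags would reject when `chi2_sf(wald_statistic(coefs, sigma, scale), len(coefs))` falls below 0.05.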

5. MONTE CARLO STUDY We now analyze the extremal dependence properties of bivariate Extremal Threshold VAR(1) [E-VAR] and Stochastic Autoregressive Volatility [SAV] processes. The E-VAR process allows us to investigate how well the proposed estimators perform when non-extremes have properties that differ from extremes.

5.1 Simulated Processes

We draw random samples of the two iid zero mean innovation series from a symmetric Pareto distribution with density $f(\epsilon) = \kappa|\epsilon|^{-\kappa-1}$ if $|\epsilon| \geq c$, $f(\epsilon) = \kappa c^{-\kappa-1}$ if $|\epsilon| \leq c = [2(1+\kappa)]^{1/\kappa}$, and index $\kappa = 1.7$. Notice $\int_{-\infty}^{\infty} f(\epsilon)d\epsilon = 1$, and both innovations have the same index for simplicity of interpretation. The sample size is $n \in \{500, 1000\}$, a total of $3n$ observations are generated for each process, we retain the last $n$ to reduce the impact of randomized starting conditions, we generate 1000 series and report results only for $n = 500$. Consult Hill (2009b) for the case $n = 1000$.
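One way to draw innovations of this kind is sketched below in Python. The sketch assumes the density is flat on the central interval from minus the threshold to the threshold and has Pareto tails outside it, with the threshold chosen so that the density integrates to one, which is the reading consistent with the normalization noted above. It also reads the index as 1.7. The function names and the mixture sampling scheme are assumptions made for illustration.

```python
import random


def pareto_threshold(tail_index):
    """Threshold c = [2 (1 + tail_index)]^(1 / tail_index)."""
    return (2.0 * (1.0 + tail_index)) ** (1.0 / tail_index)


def symmetric_pareto_sample(size, tail_index=1.7, seed=None):
    """Draw from a symmetric density with a flat center and Pareto tails.

    With probability tail_index / (1 + tail_index) the draw is uniform on
    [-c, c]; otherwise it is a Pareto draw with scale c and a random sign.
    """
    rng = random.Random(seed)
    c = pareto_threshold(tail_index)
    p_center = tail_index / (1.0 + tail_index)
    draws = []
    for _ in range(size):
        if rng.random() < p_center:
            draws.append(rng.uniform(-c, c))
        else:
            u = 1.0 - rng.random()  # lies in (0, 1], avoids division by zero
            magnitude = c * u ** (-1.0 / tail_index)
            draws.append(magnitude if rng.random() < 0.5 else -magnitude)
    return draws
```

To mimic the burn-in described in the text, one would generate three times the target sample size for each simulated process and keep only the final third.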

Apparently a method for selecting an "optimal" sample fractile $k_n$ for dependent, heterogeneous data does not exist in the extreme value theory literature. All methods known to this author concern tail index and tail dependence estimation for iid pairs. Since the topic supersedes the focus of the present paper we simply compute median two-tailed dependence estimators and Wald statistics over a fractile window $k_n \in \{[.03 \times n^{5/6}], \ldots, [2 \times n^{5/6}]\}$, in this case $\{5, \ldots, 355\}$ for $n = 500$ and $\{9, \ldots, 632\}$ for $n = 1000$. Our choice of window is guided by the requirement $k_n/n^{2/3} \to \infty$, the facts that the estimates for $k_n$ close to 1 are extremely volatile and for $k_n \geq (2/3)n$ for a particular sample no longer reflect tail memory well, and of course $k_n/n \to 0$ must hold asymptotically. See Hill (2009b: Figures 1-2) for plots of the estimates over $k_n$ and displacement $h \in \{1, \ldots, 4\}$. It is clear that the median over a window covering no more than roughly two thirds of a particular sample of size $n \in \{500, 1000\}$ captures essential tail dependence characteristics. Clearly other specifications of the window will also satisfy these criteria. See Hill (2009b: Figure 4) for fractile window robustness checks.
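The fractile window used in the simulations can be reproduced with a short Python sketch. It assumes the window runs from 0.03 to 2 times the sample size raised to the power 5/6, and that the bracket denotes rounding to the nearest integer, which is the reading that matches the reported endpoints 5 and 355 for samples of 500 and 9 and 632 for samples of 1000. The helper names are hypothetical.

```python
from statistics import median


def fractile_window(n, lower=0.03, upper=2.0, power=5.0 / 6.0):
    """Range of tail fractiles from [lower * n^power] to [upper * n^power]."""
    base = n ** power
    # Nearest-integer rounding is an assumption about the bracket notation.
    return range(round(lower * base), round(upper * base) + 1)


def median_over_window(statistic, n):
    """Median of statistic(k) over every fractile k in the window."""
    return median(statistic(k) for k in fractile_window(n))
```

Here `statistic` stands for any function that returns a dependence estimate or Wald statistic computed with a given number of tail observations.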

E-VAR(1) Let fg 1 denote a sequence of positive real numbers, ! 1 as! 1. In the E-VAR(1) case we simulate a triangular array = []0 where

= ©1¡1(j¡1j · ) + ©2¡1(j¡1j ) +and = 15

©1 2½©()1 =

·2 43 2

¸©()1 =

·9 00 9

¸¾

©2 2½©()2 =

·0 00 9

¸ ©()1 =

·8 90 8

¸¾

It is not di¢cult to show each 2 fjjjjg satis…es () = ¡(1 +(1)), so Assumption B is easily satis…ed (cf. Haeusler and Teugels 1985). Extremesare governed by the data generating process ~:= ©2¡1 + , and it is straight-forward to show f ~g is geometrically 1-NED on iid fg. Thus fg itself isgeometrically 4-E-NED with arbitrary size (Hill 2008b: Lemmas 2.1 and 2.2), soAssumption A is satis…ed.

In each case ©()1 is matched with ©()2 , etc. The cases are () extremal-iid ,extremal-serially dependent , and bi-directional extremal independence at all hori-

zons ((1)= ); () extremal-serially dependent and , and ”weak” extremal

association from ¡ to at every horizon (()! ); and () extremal-serially dependent and , and ”strong” extremal association from ¡to for all (

()! ).

Stochastic Autoregressive Volatility [SAV] The bivariate stochastic volatil-ity process = []0 detailed in Section 2.4 is simulated. The volatility shocks 2 R2 are mutually and serially independent bivariate standard normally distributed,and the intercept is £ = [11] 0. The no-tail dependence case parameters f()()gare independent random draws from a uniform distribution on [1395]and thetail-dependence case parameters are © = [26j06]Each 2 fg satis…es As-sumptions A and B (see Hill 2008a).

5.2 Computation

For each simulated SAV sample fg=1 or E-VAR sample fg=1 the coe¢cients() and () are computed from two-tailed [jjjj] 0 or [jjjj]0 for each2 and displacements = 1100. We then compute Wald statistics foreach 2 and = 14 where the kernel covariance matrix §is computedwith a Bartlett kernel = (1 ¡ j¡ j)+ and bandwidth = [

33]. Since


» 56 and 12 » 13 the Assumption E.2 implication = (12)is satis…ed. Since Assumptions A, B and C.2 hold in each case, Theorem 4.2 dictates! 1 with probability one under 1 : ()()9 0 for some 2 f1g,and

! 2() under 0 : ()() ! 0 8 1.We report simulation averages of the median () and Wald statistic over 2

, and the associated asymptotic p-value. In general () out-performs () inevery case, likely due to the dual numerator and denominator estimators in ().See Hill (2009b: Figure 3) for plots related to ().

For ease of inference under the null, we plot () and § () for E-VAR datawhere () is the 95% asymptotic con…dence band half length () = 196 £[§]

12

12 . Since SAV data exhibit -local tail dependence, for these data

we report the correspondingly scaled estimates and confidence bands. Note nothing is gained with respect to inference since the latter interval is merely a scalar multiple of the former. The advantage of the estimators developed here is that the Wald statistic has non-negligible power even for such degenerate tail dependence cases and without using a different scale, as long as $k_n/n^{2/3} \to \infty$.

5.3 Estimation and Test Results

Figure 3 contains plots of the estimated coefficients, and Tables 1 and 2 report the median estimates, Wald test p-values and test rejection frequencies for tests at the 5% level. In general the median estimate works quite well. Considering we are working with extreme values, if $n = 500$ the median sample statistics are computed from only $k_n = 5$ to $k_n = 355$ observations. It is not surprising, then, that the dependence estimates in the strongest case (e.g. strong E-VAR) are insignificant at the 5% level for displacements $h \geq 7$ when $n = 500$ (i.e. the estimate lies inside the 95% confidence band). If $n = 1000$ or $n = 2000$ (not shown) the coefficient is insignificant for displacements beyond 12 and 20, respectively (see Hill 2009b: Tables 3-4). Further, the median appears to be robust to the choice of window for both E-VAR and SAV data (see Hill 2009b: Figure 4).

The two-tailed median Wald test works exceptionally well in all cases. Empirical power is 100% for the strong E-VAR case at each horizon $h = 1, \ldots, 4$, and 78% for a one-step ahead test in the SAV case.

One-tailed tests performed reasonably well in separate simulations not reported here. By the very nature of one-tailed computations, however, the resulting number of usable observations can be quite small (about half the number of two-tailed tests when the data generating process is symmetric), hence larger sample sizes are required to obtain empirical power on par with two-tailed tests.

6. TAIL DEPENDENCE IN EQUITY MARKET RETURNS Finally, we study tail dependence within and between daily log returns to the NASDAQ and SP500 composite indices, the London Stock Exchange FTSE-100, and the Nikkei-225 index over Jan. 1, 2001 - Dec. 31, 2004. Equity returns are computed as daily open/close averages, holiday and unscheduled market closures are treated as missing values, and only dates for which each bivariate pair has an available observation are used to compute that pair's bivariate tail dependence estimates. We filter each series for day effects based on a standard daily-dummy regression, and each net sample size is roughly 1000 days.

Tail index estimates , serial and bivariate two-tailed () and associatedWald statistics are all computed from two–tailed data1. We only plot the unscaledf()§()g where () = 196 £ [§]

12

12 , over displacements =

1100, since scaling by does not alter inference. See Figures 4 and 5. Median() and are reported in Table 3.

We again use a Bartlett kernel with bandwidth $[n^{.33}]$ for computation of the covariance matrix estimator $\hat{\Sigma}$, and $\{[.03 \times n^{5/6}], \ldots, [2 \times n^{5/6}]\}$ is the fractile window.

6.1 Distribution Tail Thickness

Table 3 contains 95% con…dence bands of the Hill-estimator . In particular,we report the median § 196

12 over 2 where 2

= 42

, and2

= 1P

=1is Hill’s (2005) kernel estimator of [(12 (¡1

¡¡1))2]. Under Assumptions A, B and E 12

( ¡ )! (01) follows by

Theorems 5 and 6 of Hill (2005) and a standard mean-value-theorem argument. Each equity returns series exhibits heavy tails: tail index values of 3 or more do not occur in any interval, and in each case except the Nikkei we cannot reject the hypothesis that the tail index is at most 2 at the 5% level.
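For readers who want to see the tail index computation concretely, the following Python sketch implements the standard Hill estimator on two-tailed data, meaning absolute values. It covers the point estimate only. The kernel-based variance used for the confidence bands in Table 3 is not reproduced, and the function name and argument names are hypothetical.

```python
import math


def hill_estimator(data, k):
    """Hill estimate of the tail index from the k largest absolute values.

    The inverse of the estimate is the average log ratio of the k largest
    order statistics of |data| to the (k + 1)-th largest one.
    """
    ordered = sorted((abs(x) for x in data), reverse=True)
    if not 1 <= k < len(ordered):
        raise ValueError("k must satisfy 1 <= k < number of observations")
    threshold = ordered[k]  # the (k + 1)-th largest absolute value
    if threshold <= 0.0:
        raise ValueError("threshold order statistic must be positive")
    mean_log = sum(math.log(x / threshold) for x in ordered[:k]) / k
    if mean_log <= 0.0:
        raise ValueError("no variation above the threshold")
    return 1.0 / mean_log
```

In the spirit of the fractile window used elsewhere in the paper, the estimate would be computed for each number of tail observations in the window and the median reported.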

6.2 Extremal Serial Dependence

NASDAQ, SP500 and LSE returns exhibit significant, highly persistent, positive serial extremal dependence, while the Nikkei exhibits rather shallow, delayed serial extremal dependence. We regressed the estimated coefficients on functions of the displacement $h$ capturing geometric and hyperbolic decay (not shown). The NASDAQ and SP500 coefficients decay significantly hyperbolically, and the LSE coefficients decay significantly geometrically. In this limited study, then, U.S. markets evidently exhibit long memory extreme returns, U.K. extreme returns exhibit short memory, and Japanese extreme returns exhibit very brief and delayed dependence.

Recall the median estimate may be precipitously less sharp as displacement $h$ increases even when $n = 1000$ for strongly dependent processes. Thus, the NASDAQ, SP500 and LSE cases, in particular U.S. markets, suggest highly persistent data generating processes underlie some equity market extremes. This adds yet another dimension to the long standing debate for and against long memory in stock returns (e.g. Granger 1966, Lo 1991, Bandi and Perron 2006).
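The decay regressions mentioned in this subsection are not shown in the paper, so the following Python sketch is only one plausible way to set them up. It assumes strictly positive coefficient estimates, regresses their logarithm on the displacement (geometric decay) and on the log displacement (hyperbolic decay), and compares the fits by R-squared. The specification and all names are assumptions made for illustration.

```python
import math


def ols_slope_r2(x, y):
    """Slope and R-squared from a simple least squares fit of y on x."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = my - slope * mx
    ss_res = sum((yi - intercept - slope * xi) ** 2 for xi, yi in zip(x, y))
    ss_tot = sum((yi - my) ** 2 for yi in y)
    return slope, 1.0 - ss_res / ss_tot


def compare_decay(coefs):
    """Fit geometric and hyperbolic decay to coefficients at lags 1, 2, ..."""
    if any(c <= 0 for c in coefs):
        raise ValueError("log-linear fits need strictly positive coefficients")
    lags = range(1, len(coefs) + 1)
    log_c = [math.log(c) for c in coefs]
    geometric = ols_slope_r2([float(h) for h in lags], log_c)
    hyperbolic = ols_slope_r2([math.log(h) for h in lags], log_c)
    return {"geometric": geometric, "hyperbolic": hyperbolic}
```

A clearly better hyperbolic fit would point toward long memory in extremes, while a better geometric fit would point toward short memory, in line with the interpretation given in the text.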

6.3 Extremal Bivariate Dependence

See Figures 5-6 and Table 4 for the two-tailed median coefficient estimates and Wald statistics for bivariate pairs. Two-tailed tail dependence between each equity market pair is significant and positive with varying degrees of persistence. In this study only the extremal associations between pairs of NASDAQ, SP500 and LSE display any degree of persistence: an extreme spike in one market spills over into the other market over multiple lags $h$.

The Nikkei index is the decided outsider in this study. Extremes from the remaining indices spill over to the Nikkei index only one day ahead, and past Nikkei extremes are associated with the remaining indices' contemporary returns very briefly and after a potentially long delay.

There is no evidence of one-day ahead cross-tailed dependence, and in this case the first order coefficients are significantly negative in each case (e.g. if an above average positive extreme return to the NASDAQ is followed by a negative return to the LSE, the LSE return is predominantly below average extreme or non-extreme). See Hill (2009b: Figure 7).

1 Due to space constraints all exceedance correlation plots, and all one-tailed and cross-tailed dependence plots, are available upon request. In every case the exceedance correlation is qualitatively similar to the coefficient reported here.

7. CONCLUSION We measure and estimate tail dependence as either a high threshold exceedance correlation or a joint-marginal tail probability discrepancy. We permit tail dependence decay and cross-tailed dependence, marginal and joint non-extremes are left unrestricted, and our estimators and test statistics are sensitive to infinitesimal deviations from tail independence (e.g. when the coefficient vanishes asymptotically yet its rescaled limit does not). The two coefficients are capable of detecting infinitesimal forms of tail dependence that extant measures, including canonical conditional tail probabilities and tail copulae, cannot detect. In particular, they can capture tail dependence decay for joint tails where the bivariate index alone is either lacking or simply wrong.

The resulting sample statistics are easy to compute and have joint asymptotic Gaussian limits over an arbitrary displacement window under very general conditions, covering mixing, NED and E-NED data. The estimators and tests work well for simulated VAR and stochastic volatility data, and reveal a variety of extremal dependence properties within and between equity markets. Given the state of the art in the tail dependence literature, the next challenge is to generalize our methods to multivariate time series models of extremes, and hybrid models of extremes and non-extremes.


APPENDIX A: Proofs of Main Results

We make repeated use of the following results under Assumptions A-C:©(ln ())+ and ()

ª=

¡1+(112

)¢ (A.1)

°°°°°1

12

X

=1

f() ¡ ()g°°°°°2

=(1)

°°°°°1

12

X

=1

©(ln ())+ ¡

£(ln ())+

¤ª°°°°°2

=(1)

°°ln¡(+1)

¢°°2 = (112

) and ¡1= ¡1 +(112

)

The first line follows from (3), (4), (12) and (13) (see Hsing 1991: p. 1553, cf. Goldie and Smith 1987); the last three lines from Hill (2009c: Corollary 3.3 and Theorem 5.1).

Proof of Theorem 3.1. Consider the exceedance correlation , the proof forbeing similar. By the Cramér-Wold theorem we need only prove 82 R, 0= 1, there exists a …nite variance, zero mean Gaussian law () satisfying

12

X

=1

(()¡ ())! ().

Write := (12) and := (12). Then for any conformable 0= 1

X

=1

(()¡ ()) (A.2)

= £ 1

X

=1

X

=1

(¡¡ [¡])

+ £ 1

X

=1

X

=1

³¡¡ ¡

´

+³¡

´£ 1

X

=1

X

=1

(¡¡ [¡])

+³¡

´£

X

=1

()

The …rst three terms are respectively (112 ), (1

12 ) and (1) re-

spectively under Lemmas A1 and A2 in Appendix B, and (A.1) since ¡ =


(112 ). The last term is(1

12 ) under (A.1) and the construction lim!1 j()j

· 1.Further, if () ! 0 for each = 1, then (A.1), (A.2), and Lemmas A.1 and

A.2 imply the last three terms in (A.2) are (112 ). Thus, by Lemma A.2 there

exists a …nite variance, zero mean Gaussian law () that satis…es

12

X

=1

(()¡ ()) (A.3)

=£ 1

12

X

=1

X

=1

(¡¡ [¡]) +(1)! ()

Finally, de…ne := [¡¡ (¡)]=1 and use (A.3) andthe Helly-Bray Theorem to deducej§¡ 1

P=10j

! 0

Proof of Theorem 4.1. We will prove the claim for §constructed from ,the proof in the case of being similar.

De…ne := (12), := (12), ¤:= ¡, ¤:=¡and

(12) := 21

X

=1

[(¤1 ¡ [¤¡1]) (¤2 ¡ [¤¡2])] (A.4)

~(12) :=1

X

=1

³¤1 ¡

(1)

´³¤2 ¡

(2)

´

·(12) :=1

X

=1

³¤1 ¡

(1)

´³¤2 ¡

(2)

´

(12) :=1

X

=1

³¤1 ¡

(1)

´³¤2 ¡

(2)

´

Note under Assumptions A, B and C.1 Theorem 3.1.iii translates to¯¯§¡ [(12)]

12=1

¯¯ ! 0

and by construction§= [(12)]

12=1

It therefore su¢ces to prove

0³[ (12)¡ (12)]

12=1

´! 0 82 R

The limit follows from Lemmas A.4-A.6 and the triangle inequality.


Proof of Theorem 4.2. We will prove claims () and () for the exceedance-based= 0§

¡1, the proof for =0§

¡1being similar.

Claim () can be proven as an easy extension of the following argument.Assume the local alternative holds

1 : lim!1

=+(112 ), jj 1.

Uniform positive definiteness under Assumption D and $|\Sigma| = O(1)$ by Theorem 3.1 imply there exists a sequence of uniformly bounded square matrices, and conformable limit matrices, that satisfy

§¡1=0

! 0= §¡1 (A.5)

De…ne

:=12 0(¡ ) , := 0

¡12

¡ ¢

, := 0

and decompose

0§¡1= ( +)

0 (+) +0+ 2 ( +)

0

We will derive the limit of each term on the right-hand-side.

Step 1 ( +): By Theorem 3.1 there exists a multivariate standard normallaw 2 Rsuch that

!

The Slutsky Theorem and (A.5) therefore imply

+

! +0

where jj = (1) follows from j§j= (1) and jj 1.Now apply the continuous mapping theorem to deduce

(+)0 (+)

! 2()

a non-central chi-squared law with non-centrality parameter

:=0=0§¡1 1

Positive de…niteness implies = 0 if and only if jj = 0, and otherwise 2(01).


Step 2 (): We can always write

0 =¡12

¡ ¢0§¡1

¡12

¡ ¢

¡

¶0§¡1

µ

¡

+2¡1 ¡ 32

¢µ

¡

¶0§¡1

µ

+¡1 ¡ 32

¢2µ

¶0§¡1

µ

= 2£O¡112

¡ ¢0§¡1

¡+(112

+¡1 ¡ 32

¢2 ¡

+(112 )

¢0§¡1£ ¡

+(112 )

¢+(1)

where O(¢) denotes a conformable vector of (¢)-terms. The …rst term vanishes forany …nite because 112

¡ ! 0, and j§j = (1) by Theorem 3.1.Therefore

0 ! := (1 ¡ )2 £

say, where 32 ! 2 [01]. By cases2:

= 0 if and only if = 0 or = 1;

2 (01) if and only if 6= 0 and 6= 1;= 1 if and only if 6= 0 and = 1.

Therefore, if 23 ! 1 then 0 ! 0 if and only if jj = 0and 0

! 1 if and only if jj 6= 0. Further, if » 23 then 0 ! 0.

Step 3 ( + ): Imitating Step 2

(+)0 = 12

(¡ )0 §¡1£

¡32

¡ 1¢

+12 (¡ )

0 §¡1£¡32

¡ 1¢

£ (112 )

+12 (¡ )

0 §¡1£ (112 )

+0§¡1£ ¡

32 ¡ 1¢

+0§¡1£

¡32

¡ 1¢

£ (112 )

+0§¡1£ (112

)

= 12 (¡ )

0 §¡1£¡32

¡ 1¢

+0§¡1£¡32

¡ 1¢+(1)

2In the proof for= 0§

¡1we require23 != 1 under Assumption

C.2, hence = 0 if and only if = 0, and = 1 if and only if 6= 0.


Theorem 3.1 and Cramér’s Theorem therefore imply j( + )0j! , say,

where = 0 if and only if jj = 0 or » 23; and = 1 if and only if jj6= 0 and 23 ! 1.

Step 4 (): Steps 1-3, Theorem 4.1 and the mapping theorem imply under1

= (++)0 (++)

+0

h§¡1¡§¡1

i) 2()

a non-central chi-squared law with non-centrality parameter := + + ,where = 0 if and only if jj = 0 or » 23, and =1 if and only if jj 6= 0and 23 ! 1.

APPENDIX B: Supporting Lemmata

Write

2©¡¹¡¹

ªand 2

n¡¡

o

Throughout, the weighting vector of unit length is arbitrary. Lemma A.1 proves each feasible tail variable converges to its infeasible counterpart sufficiently fast that we may work with the latter in all asymptotic arguments.

Lemma A.2 justi…es a joint Gaussian distribution limit for [1]0 by aCramér-Wold device.

LEMMA A.1 Let fg satisfy Assumptions A and B, and let Assumption C.1or C.2 hold respectively for () or (). Then 112

P

=1P

=1(¡ ) = (1)

LEMMA A.2 Under Assumptions A and B there exist a zero mean Gaussian law() that satis…es 112

P

=1P

=1(¡ [])! () where

(112

P=1

P=1(¡ []))2 = (1).

Let () denote either

1

12

X

=1

©¹¡¹¡

£¹¡¹¤ª

or1

12

X

=1

f¡¡ [¡]g

Recall the de…nition of a mixingale (cf. McLeish 1975). Using the same -…eld z

de…ned in Section 3.2 we say fzg forms an 2-mixingale array with size 0 ifjj[jz¡]jj2 · and jj¡ [jz+]jj2 · +1 for some positive non-stochasticconstants fg and coe¢cients fg that satisfy 0 · = (¡).


LEMMA A.3 (Hill 2009a) Under Assumption A the triangular tail array f()gis 2-NED on fzg with constants ()= (1¡12

¡1) uniformly in 1and coe¢cients () = (()12¡1

¡12 ) for some integer sequence fg,

! 1 as ! 1. Moreover, f()zg forms an 2-mixingale sequencewith size = 12 and constant = (¡12) uniformly in 1.

In the next three claims let Assumptions C.1 and C.2 respectively hold for the exceedance-based and event-based coefficients, and let Assumptions A, B, and E hold.

LEMMA A.4 For all 2 R, 0= 1, 0([(12) ¡ ·(12]12=1)! 0.

LEMMA A.5 For all 2 R, 0= 1, 0([·(12) ¡ ~(12)]12=1)! 0.

LEMMA A.6 For all 2 R, 0= 1, 0([~(12) ¡ (12)]12=1)! 0.

Proof of Lemma A.1.Step 1 (): De…ne := (ln((+1)))+ ¡ (ln())+ and write

1

12

X

=1

¡ (B.1)

=1

12

X

=1

¡ln

¡¡(+1)

¢¢+

¡ln

¡(+1)

¢¢+ ¡ 32

¡1

¡1

=1

12

X

=1

(ln (¡))+ (ln ())+

+1

12

X

=1

¡£ (ln ())+

+1

12

X

=1

£ (ln (¡))+

+1

12

X

=1

¡¡32

¡1

¡1

It is straightforward to show by cases

jj =¯¯¡ln

¡(+1)

¢¢+¡ (ln ())+

¯¯ ·

¯ln

¡(+1)

¢¯(B.2)

andX

=1

jj ·¯ln

¡(+1)

¢¯£

X

=1

¡¸ (+1)

¢

·¯ln

¡(+1)

¢¯£

X

=1

(¸ )


so that from (A.1) and the Cauchy-Schwarz inequality

sup 1

kk2 =(112 ) (B.3)

°°°°°1

12

X

=1

jj°°°°°1

· 1

12

°°ln¡(+1)

¢°°2 £ £ ( )

12

=³¡112

¢2 £ £³()

12´´=(()

12)

Exploiting (A.1), (B.2), (B.3) and the Minkowski and Cauchy-Schwarz inequalities, the second (and third) term on the right-hand-side of (B.1) satisfies

°°°°°1

12

X

=1

¡£ (ln ())+

°°°°°1

·°°12

ln¡(+1)

¢°°2 £

°°°°°1

X

=1

£(ln ())+ ¡ (ln ())+

¤°°°°°2

+(ln ())+ £°°°°°1

12

X

=1

¡

°°°°°1

= (1)£ (112 ) +()£ (()

12) =(1)

For the fourth term in (B.1), use (A.1), (B.2) and (B.3) to deduce°°°°°1

12

X

=1

¡

°°°°°1

·°°°°°1

12

X

=1

j¡j £ jj°°°°°1

·°°ln

¡(+1)

¢°°2 £

°°°°°1

12

X

=1

jj°°°°°1

= ³¡112

¢£ ()

12´= (12) = (1)

since $k_n/n^{1/2} \to \infty$ under Assumption C.1. Together, (B.1) reduces to

1

12

X

=1

X

=1

¡

=1

12

X

=1

X

=1

h(ln (¡))+ ((ln))+ ¡

¡1 ¡1

i+(1)

=1

12

X

=1

X

=1

¡+(1)


Step 2 (): Write

1

12

X

=1

¡

=1

12

X

=1

¡¡(+1)

¢

¡(+1)

¢ ¡ 32

=1

12

X

=1

(¡)()¡32

+1+2+3

where

1 =1

12

X

=1

£

¡¡(+1)

¢¡ (¡)

¤()

2 =1

12

X

=1

£

¡(+1)

¢¡ ()

¤(¡)

3 =1

12

X

=1

£

¡¡(+1)

¢ ¡ (¡)¤

£ £

¡(+1)

¢ ¡ ()¤

We need to show each = (1). We can always …nd some 0 such that¯

¡(+1)

¢¡ ()

¯

¡ln ¸ lnln(+1)

¢+

¡ln(+1) ¸ ln ln

¢¤

· ¡ln(+1)

¢+

¡ln(+1)

¢

Now, use (A.1) to bound 1 (and 2)

j1j ·£

¡ln

¡(+1)

¢¸

¢+

¡ln

¡(+1)

¢¸

¢¤

£¯¯¯1

12

X

=1

[() ¡()]

¯¯¯

¡ln

¡(+1)

¢¸

¢+

¡ln

¡(+1)

¢¸

¢¤

£ (12 ) £ ()

Thus, by Chebyshev's and the Cauchy-Schwarz inequalities, and (A.1), the first remainder term (and likewise the second) is $L_1$-bounded:

k1k1 ·£

¡¯ln(+1)

¯¸

¢+

¡¯ln(+1)

¯¸

¢¤12

£°°°°°1

12

X

=1

[()¡ ()]

°°°°°2

¡¯ln

¡(+1)

¢¯¸

¢+

¡¯ln

¡(+1)

¢¯¸

¢¤

£¡12

¢£ ()

· £°°ln

¡(+1)

¢°°2 £ (1)

+£ h¡ln

¡(+1)

¢¢2i¡12

¢£ ()(1 +(112

))

= (112 )

Finally, use (A.1) to deduce

k3k1 · 112

X

=1

°°¡ln

¡(+1)

¢

¢+

¡ln

¡(+1)

¢

¢°°2

£°°

¡ln

¡(+1)

¢

¢+

¡ln

¡(+1)

¢

¢°°2

·

12

¡ln

¡(+1)

¢

¢12 £ ¡ln

¡(+1)

¢

¢12

·

12

°°ln¡(+1)

¢°°2 £

°°ln¡(+1)

¢°°2

· £¡32

¢=(1)

since 23 → ∞ under Assumption C.2.

Proof of Lemma A.2. We prove the claim for −, the proof for ¹−¹ being similar. Define for any conformable 0 = 1

() :=1

12

X

=1

(¡¡ [¡])

and 2() := (Σ=1())2. By Lemma A.3, {()} is $L_2$-NED on {z} with constants () = (1−12 −1) uniformly in 1 and coefficients () = (()12−1−12 ), and z is induced by a uniform or strong mixing process under Assumption A with coefficients that respectively satisfy ()[2(−1)] → 0 or ()(−2) → 0. The claim therefore follows from Corollary 3.3 of Hill (2009c).

Proof of Lemma A.4. For the sake of brevity we will prove |() − ·()| → 0 for arbitrary ∈ {1}. It is straightforward to show the following arguments hold if −− ()() and −− () () are replaced with Σ=1{−− ()()} and Σ=1{−− () ()} for any ∈ R. Simply multiply out () from (A.4)

() = ·(12)

+2³

´2( ()¡ ()) £ 1

X

=1

µ

¡¡ ()¶

´2( () ¡ ())

2 £ 1

X

=1

= ·() +

µ³

´2 1

12

2

32

¶+

µ³

´2 1

12¶

= ·() +(1)

The second equality follows from () − () = (112 ) by Theorem 3.1, 1 Σ=1 = (12) under Assumption E, 1 Σ=1 {() −− ()} = (2 32 ) by Lemma B.4 of Hill (2009a), and = ().

Proof of Lemma A.5. By the same argument as above we prove |·() − ~()| → 0 for arbitrary ∈ {1}. Compactly write := ((− )) and

(12) :=¯¯¡1¡2¡ ¡1¡2

¯¯

It can be shown, with some work and the triangle inequality, that

j·()¡ ~()j

· 2£ 1

X

=1

njj £ ()

o+

¯¯2

¡ 2¯¯ £ ()

+¯¯2

¡ 2¯¯³

´j ()j £

¯¯¯1

X

=1

³¡¡ ¡

´¯¯¯

+¯¯¡

¯¯³

´ ¯¯¯1

X

=1

(¡¡ [¡])

¯¯¯

+¯¯¡

¯¯ 1

X

=1

+¯¯2¡ 2

¯¯³2

´³

´2 ()

2

Observe | () | = (1) by the construction; − = (112 ) and 2 − 2 = (112 ) by (A.1); () = (1) is implied by Theorem 3.1.i,iii; 1 Σ=1 = (12) under Assumption E; 1 Σ=1 | | () = (1) by Lemma B.2 in Hill (2009a); and 1 Σ=1 (−− −) = (12 ) and 1 Σ=1 (−− [−]) = (12 ) by Lemma B.3 in Hill (2009a), and = (). Therefore

j·()¡ ~()j · (1) +¡112

¢+

¡12

¢

£ (12 )

+¡12

¢

£ (12 ) +

³()

12´+

¡32

= (1)

Proof of Lemma A.6. By the same argument as above we prove |~() − ()| → 0 for arbitrary ∈ {1}. We need only verify Assumptions 1-3 of Theorem 2.1 of de Jong and Davidson (2000) [JD] to prove

j~()¡ ()j =

¯¯¯~()¡

Ã1

12

X

=1

f¡¡ [¡]g!2

¯¯¯

! 0

JD's Assumption 1 holds by the statement of the lemma. Moreover, {−12 z} forms an $L_2$-mixingale array with size 1/2 and constants = (−12) by Lemma A.3. Thus JD's Assumption 2 is satisfied. Finally, JD's Assumption 3 is satisfied by max 1≤≤ 2 = (1) given = ().
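The kind of kernel variance estimator whose consistency is invoked in the proof of Lemma A.6 can be illustrated with a minimal sketch. It assumes a Bartlett kernel, a user-chosen bandwidth, sample-mean centering, and normalization by the sample size. None of these choices is taken from the paper, whose estimator is built from tail arrays with its own scaling, and the names `bartlett_kernel` and `kernel_variance` are hypothetical.

```python
from typing import Sequence


def bartlett_kernel(x: float) -> float:
    """Bartlett (triangular) kernel: 1 - |x| on [-1, 1], zero elsewhere."""
    ax = abs(x)
    return 1.0 - ax if ax <= 1.0 else 0.0


def kernel_variance(z: Sequence[float], bandwidth: float) -> float:
    """Kernel-weighted long-run variance estimate of a scalar series.

    Computes gamma_0 + 2 * sum_l k(l / bandwidth) * gamma_l, where gamma_l
    is the lag-l sample autocovariance (divided by n) of the demeaned series.
    The kernel, centering and 1/n scaling are illustrative assumptions.
    """
    n = len(z)
    if n == 0 or bandwidth <= 0:
        raise ValueError("need a non-empty series and a positive bandwidth")
    mean = sum(z) / n
    u = [v - mean for v in z]
    total = sum(v * v for v in u)  # lag-0 term
    for lag in range(1, n):
        weight = bartlett_kernel(lag / bandwidth)
        if weight == 0.0:
            break  # Bartlett weights vanish once lag >= bandwidth
        cross = sum(u[t] * u[t - lag] for t in range(lag, n))
        total += 2.0 * weight * cross
    return total / n
```

With `bandwidth = 1.0` only the lag-0 term survives, so the function returns the ordinary sample variance (with divisor n). Larger bandwidths add down-weighted autocovariances.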

REFERENCES

[1] An, H.Z. and F.C. Huang (1996). The Geometrical Ergodicity of Nonlinear Autoregressive Models. Statist. Sin. 6, 943-956.
[2] Bandi, F.M., and B. Perron (2006). Long Memory and the Relation Between Implied and Realized Volatility. J. Finan. Econometrics 4, 636-670.
[3] Basrak, B., R.A. Davis, and T. Mikosch (2002). A Characterization of Multivariate Regular Variation. Ann. Probab. 12, 908-920.
[4] Billingsley, P. (1968). Convergence of Probability Measures. John Wiley & Sons: New York.
[5] Bingham, N.H., C.M. Goldie and J.L. Teugels (1987). Regular Variation. Cambridge Univ. Press: Great Britain.
[6] Breiman, L. (1965). On Some Limit Theorems Similar to the Arc-Sine Law. Theory Probab. Appl. 10, 351-360.
[7] Carrasco and Chen (2002). Mixing and Moment Properties of Various GARCH and Stochastic Volatility Models. Econometric Theory 18, 17-39.
[8] Chen, X. and Y. Fan (2006). Estimation of Copula-Based Semiparametric Time Series Models. J. Econometrics 130, 307-35.
[9] Coles, S., J. Heffernan, and J. Tawn (1999). Dependence Measures for Extreme Value Analysis. Extremes 2, 339-365.
[10] Davis, R.A. and T. Mikosch (2006). The Probabilistic Properties of Stochastic Volatility Models, mimeo, Dept. of Statistics, Colorado State University.
[11] Davidson, J. (1994). Stochastic Limit Theory. Oxford Univ. Press: Oxford.
[12] Davidson, J. (2004). Moment and Memory Properties of Linear Conditional Heteroscedasticity Models, and a New Model. J. Bus. Econ. Statist. 22, 16-29.
[13] Davison, A.C. and R.L. Smith (1990). Models for Exceedances Over High Thresholds. J. R. Statist. Soc. Ser. B 52, 393-442.
[14] de Haan, L. and J. de Ronde (1998). Sea and Wind: Multivariate Extremes at Work. Extremes 1, 7-45.
[15] de Jong, R.M., and J. Davidson (2000). Consistency of Kernel Estimators of Heteroscedastic and Autocorrelated Covariance Matrices. Econometrica 68, 407-423.
[16] Drees, H. and X. Huang (1998). Best Attainable Rates of Convergence for Estimates of the Stable Tail Dependence Functions. J. Mult. Anal. 64, 25-47.
[17] Einmahl, J., L. de Haan, and V. Piterbarg (2001). Nonparametric Estimation of the Spectral Measure of an Extreme Value Distribution. Ann. Statist. 29, 1401-1423.
[18] Einmahl, J., L. de Haan and D. Li (2006). Weighted Approximations of Tail Copula Processes with Application to Testing the Multivariate Extreme Condition. Ann. Statist. 34, 1987-2014.
[19] Einmahl, J., A. Krajina and J. Segers (2008). A Method of Moments Estimator of Tail Dependence. Bernoulli 44, 1003-1006.
[20] Embrechts, P., A. McNeil, and D. Strauman (2001). Correlation and Dependence Properties in Risk Management: Properties and Pitfalls, in M. Dempster (ed.), Risk Management: Value at Risk and Beyond, p. 176-223. Cambridge University Press: Cambridge.
[21] Embrechts, P., C. Klüppelberg and T. Mikosch (2003). Modelling Extremal Events for Insurance and Finance. Springer-Verlag.
[22] Engle, R., T. Ito, and W. Lin (1990). Meteor Showers or Heat Waves? Heteroscedastic Intra-Day Volatility in the Foreign Exchange Rate Market. Econometrica 58, 525-542.
[23] Finkenstadt, B., and H. Rootzén (2003). Extreme Values in Finance, Telecommunications and the Environment. Chapman and Hall: New York.
[24] Forbes, K. and R. Rigobon (2002). No Contagion, Only Interdependence: Measuring Stock Market Co-Movements. J. Finan. 57, 2223-2261.
[25] Gallant, A.R. and H. White (1988). A Unified Theory of Estimation and Inference for Nonlinear Dynamic Models. Basil Blackwell: Oxford.
[26] Ghysels, E., A. Harvey and E. Renault (1995). Stochastic Volatility, in: Handbook of Statistics 15, Statistical Methods in Finance, G.S. Maddala and C.R. Rao (eds.). North Holland: Amsterdam.
[27] Goldie, C.M., and R.L. Smith (1987). Slow Variation with Remainder: Theory and Applications. Q. J. Math. 38, 45-71.
[28] Granger, C. (1966). The Typical Spectral Shape of an Economic Variable. Econometrica 34, 150-161.
[29] Haeusler, E., and J.L. Teugels (1985). On Asymptotic Normality of Hill's Estimator for the Exponent of Regular Variation. Ann. Statist. 13, 743-756.
[30] Hartmann, P., S. Straetmans, and C.G. de Vries (2004). Asset Market Linkages in Crisis Periods. Rev. Econ. Statist. 86, 313-326.
[31] Heffernan, J.E. (2000). A Directory of Coefficients of Tail Dependence. Extremes 3, 279-290.
[32] Heffernan, J.E. and J.A. Tawn (2004). A Conditional Approach to Multivariate Extreme Values. J. R. Statist. Soc. Ser. B 66, 497-546.
[33] Hill, B.M. (1975). A Simple General Approach to Inference about the Tail of a Distribution. Ann. Math. Statist. 3, 1163-1174.
[34] Hill, J.B. (2005). On Tail Index Estimation for Dependent, Heterogenous Data, Dept. of Economics, University of North Carolina - Chapel Hill; revised for Econometric Theory; http://www.unc.edu/~jbhill/hill_het_dep.pdf.
[35] Hill, J.B. (2008a). Extremal Memory of Stochastic Volatility with Applications to Tail Shape and Tail Dependence Inference, Dept. of Economics, University of North Carolina - Chapel Hill, submitted; www.unc.edu/~jbhill/stoch_vol_tails.pdf.
[36] Hill, J.B. (2008b). Tail and Non-Tail Memory with Applications to Extreme Value and Robust Statistics, Dept. of Economics, University of North Carolina - Chapel Hill, submitted; www.unc.edu/~jbhill/tail_nontail_garch.pdf.
[37] Hill, J.B. (2009a). Appendix C: Omitted Proofs and Supporting Lemmata for "Robust Estimation and Inference for Extremal Dependence in Time Series", Dept. of Economics, University of North Carolina - Chapel Hill, www.unc.edu/~jbhill/tech_append_biv_dep_test.pdf.
[38] Hill, J.B. (2009b). Appendix D: Omitted Figures and Tables for "Robust Estimation and Inference for Extremal Dependence in Time Series", Dept. of Economics, University of North Carolina - Chapel Hill, www.unc.edu/~jbhill/tech_append_biv_dep_test.pdf.
[39] Hill, J.B. (2009c). Functional Central Limit Theorems for Dependent, Heterogenous Tail Arrays with Applications. J. Statist. Plan. Infer.: in press.
[40] Hong, Y. (2001). A Test for Volatility Spillover with Application to Exchange Rates. J. Econometrics 103, 183-224.
[41] Hsing, T. (1991). On Tail Index Estimation Using Dependent Data. Ann. Statist. 19, 1547-1569.
[42] Huang, X. (1992). Statistics of Bivariate Extreme Values. Unpublished dissertation, Erasmus Univ. Rotterdam, Tinbergen Institute Research Series 22.
[43] Ibragimov, I.A. (1962). Some Limit Theorems for Stationary Processes. Theory Probab. Appl. 7, 349-382.
[44] Ibragimov, I.A. and Y.V. Linnik (1971). Independent and Stationary Sequences of Random Variables. Wolters-Noordhof: Berlin.
[45] King, M., E. Sentana, and S. Wadhwani (1994). Volatility and Links between National Stock Markets. Econometrica 62, 901-933.
[46] Klüppelberg, C., G. Kuhn and L. Peng (2007). Estimating the Tail Dependence Function of an Elliptical Distribution. Bernoulli 13, 229-251.
[47] Klüppelberg, C., G. Kuhn and L. Peng (2008). Semi-Parametric Models for the Multivariate Tail Dependence Function - the Asymptotically Dependent Case. Scan. J. Statist. 35, 701-718.
[48] Leadbetter, M.R., G. Lindgren and H. Rootzén (1983). Extremes and Related Properties of Random Sequences and Processes. Springer-Verlag: New York.
[49] Ledford, A.W. and J.A. Tawn (1996). Statistics for Near Independence in Multivariate Extreme Values. Biometrika 83, 169-187.
[50] Ledford, A.W. and J.A. Tawn (1997). Modeling Dependence within Joint Tail Regions. J. R. Statist. Soc. Ser. B 59, 475-499.
[51] Ledford, A.W. and J.A. Tawn (2003). Diagnostics for Dependence within Time Series Extremes. J. R. Statist. Soc. Ser. B 65, 521-543.
[52] Lo, A.W. (1991). Long Memory in Stock Market Prices. Econometrica 59, 1279-1313.
[53] Longin, F. and B. Solnik (2001). Extreme Correlation of International Equity Markets. J. Finan. 56, 649-676.
[54] Loynes, R.M. (1965). Extreme Values in Uniformly Mixing Stationary Stochastic Processes. Ann. Math. Statist. 36, 993-999.
[55] Mardia, K.V. (1964). Asymptotic Independence of Bivariate Extremes. Calcutta Statist. Assoc. Bull. 13, 172-178.
[56] McLeish, D.L. (1975). A Maximal Inequality and Dependent Strong Law. Ann. Probab. 3, 329-339.
[57] Meitz, M. and P. Saikkonen (2008). Stability of Nonlinear AR-GARCH Models. J. Time Ser. Anal. 29, 453-475.
[58] Newey, W., and K. West (1987). A Simple, Positive-Definite, Heteroscedasticity and Autocorrelation Consistent Covariance Matrix. Econometrica 55, 703-708.
[59] Ramos, A. and A. Ledford (2008). A New Class of Models for Bivariate Joint Tails. J. R. Statist. Soc. Ser. B: forthcoming.
[60] Resnick, S. (1987). Extreme Values, Regular Variation and Point Processes. Springer-Verlag: New York.
[61] Schmidt, R. and U. Stadtmüller (2006). Non-Parametric Estimation of Tail Dependence. Scan. J. Statist. 33, 307-335.
[62] Sibuya, M. (1961). Bivariate Extreme Statistics. Ann. Math. Statist. 11, 195-210.
[63] Smith, R.L. (1984). Threshold Methods for Sample Extremes, in J. Tiago de Oliveira (ed.), Statistical Extremes and Applications, 621-638. Reidel: Dordrecht.
[64] Stărică, C. (1999). Multivariate Extremes for Models with Constant Conditional Correlations. J. Empir. Finan. 6, 515-553.
[65] Zhang, D., M.T. Wells and L. Peng (2008). Nonparametric Estimation of the Dependence Function for a Multivariate Extreme Value Distribution. J. Mult. Anal. 99, 577-588.


TABLE 1: E-VAR Two-Tailed Serial Median Estimate ± Band (n = 500)

     No spillover                        Weak spillover                      Strong spillover
h    est. ± band (a)  p-val (b)  %rej (c)    est. ± band (a)  p-val (b)  %rej (c)    est. ± band (a)  p-val (b)  %rej (c)
1    -.014 ± .048     .58        .03         .170 ± .072      .00        1.0         .237 ± .069      .00        1.0
2    -.002 ± .047     .50        .03         .157 ± .071      .00        .98         .189 ± .071      .00        1.0
3    -.000 ± .041     .42        .04         .119 ± .065      .01        .93         .132 ± .070      .00        1.0
4    -.001 ± .052     .35        .06         .086 ± .068      .02        .88         .091 ± .067      .00        1.0

a. Median 95% bands over 2 .
b. P-value of median Wald statistic under the null → χ²(·).
c. Rejection frequency at the 5%-level.

TABLE 2: SAV Two-Tailed Serial Median (n/m)[Estimate ± Band] (n = 500)

     No spillover                             Spillover
h    (n/m)[est. ± band]  p-val  %rej          (n/m)[est. ± band]  p-val  %rej
1    -.004 ± .161        .67    .02           .320 ± .203         .02    .78
2    -.009 ± .164        .72    .02           .263 ± .205         .04    .73
3     .000 ± .169        .81    .03           .214 ± .187         .06    .56
4     .008 ± .182        .88    .03           .155 ± .178         .08    .38

TABLE 3: Two-Tailed Serial Median Estimate ± Band (n = 500)

                 NASDAQ                   LSE                      Nikkei                   SP500
h                est. ± band  p-val (a)   est. ± band  p-val (a)   est. ± band  p-val (a)   est. ± band  p-val (a)
1                .137 ± .036  .000        .176 ± .053  .000        .020 ± .036  .512        .140 ± .042  .000
2                .109 ± .041  .000        .114 ± .051  .000        .027 ± .041  .475        .089 ± .041  .000
3                .133 ± .043  .000        .143 ± .047  .000        .041 ± .043  .335        .091 ± .043  .000
4                .158 ± .037  .000        .148 ± .046  .000        .050 ± .039  .262        .109 ± .039  .000
Tail index (b)   1.93 ± .350              1.91 ± .309              2.40 ± .433              1.96 ± .297

a. P-value of median Wald statistic.
b. Median tail index with 95% band, computed for the absolute values of the series.
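Footnote b of Table 3 reports a median tail index. As a hedged illustration of how such an index is commonly computed, the sketch below implements the Hill (1975) estimator [33] on absolute values. Whether the paper uses exactly this form, and how its 95% band is constructed, is not recoverable from the text here, so the function name `hill_tail_index` and the choice of the number of tail observations `m` are assumptions.

```python
import math
from typing import Sequence


def hill_tail_index(x: Sequence[float], m: int) -> float:
    """Hill (1975) tail index estimate from the m largest absolute values.

    The threshold is the (m+1)-th largest |x|. The estimate is the inverse
    of the mean log-exceedance over that threshold.
    """
    a = sorted((abs(v) for v in x), reverse=True)
    if not 1 <= m < len(a):
        raise ValueError("m must satisfy 1 <= m < len(x)")
    threshold = a[m]  # (m+1)-th largest absolute value
    if threshold <= 0.0:
        raise ValueError("threshold must be positive")
    inv_alpha = sum(math.log(a[i] / threshold) for i in range(m)) / m
    if inv_alpha == 0.0:
        raise ValueError("all tail observations equal the threshold")
    return 1.0 / inv_alpha
```

In practice the estimate is sensitive to `m`, which is one reason a median over a range of tail fractions may be reported rather than a single value.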


TABLE 4: Two-Tailed Bivariate Median Estimate ± Band (n = 500)

     NASDAQ                 LSE                    NIKKEI                 SP500

h    NAS → SP               LSE → NAS              NIK → NAS              SP → NAS
     est. ± band  p-val     est. ± band  p-val     est. ± band  p-val     est. ± band  p-val
1    .140 ± .037  .00       .087 ± .042  .02       .013 ± .042  .68       .049 ± .038  .15
2    .058 ± .038  .00       .055 ± .039  .05       .028 ± .043  .56       .059 ± .042  .14
3    .057 ± .036  .01       .075 ± .040  .04       .016 ± .039  .64       .078 ± .039  .09
4    .083 ± .041  .01       .069 ± .041  .06       .040 ± .043  .65       .066 ± .039  .12

h    NAS → LSE              LSE → SP               NIK → SP               SP → LSE
1    .117 ± .048  .00       .096 ± .035  .01       .012 ± .040  .60       .137 ± .048  .00
2    .052 ± .037  .01       .069 ± .036  .03       .014 ± .042  .68       .101 ± .051  .00
3    .070 ± .043  .02       .108 ± .041  .01       -.004 ± .037 .82       .093 ± .051  .00
4    .081 ± .041  .02       .123 ± .040  .01       .035 ± .039  .75       .126 ± .049  .00

h    NAS → NIK              LSE → NIK              NIK → LSE              SP → NIK
1    .074 ± .036  .02       .052 ± .041  .15       .019 ± .041  .71       .076 ± .039  .02
2    .001 ± .037  .06       .027 ± .039  .28       .015 ± .040  .76       -.004 ± .044 .07
3    .012 ± .041  .12       .041 ± .039  .27       .026 ± .041  .74       .009 ± .042  .14
4    .040 ± .041  .18       .048 ± .042  .31       .020 ± .039  .74       .040 ± .042  .19


FIGURE 3: Two-Tailed Serial Median Estimate ± Band (n = 500)

[Figure: four panels titled "E-VAR: No Spillover", "E-VAR: Strong Spillover", "SAV: No Spillover" and "SAV: Spillover". Horizontal axis: h = 1, ..., 19. Plotted series: r(h) with bands -k and k in the E-VAR panels; (n/m)r(h) with bands -k and k in the SAV panels.]

FIGURE 4: Equity Returns Two-Tailed Serial Median Estimate ± Band

[Figure: four panels titled "NASDAQ", "SP500", "LSE" and "NIKKEI". Horizontal axis: h = 1, ..., 91. Plotted series: r(h) with bands -k and k.]


FIGURE 5: Equity-to-Equity Two-Tailed Bivariate Median Estimate ± Band

[Figure: six panels titled "NASDAQ → SP500", "SP500 → NASDAQ", "NASDAQ → LSE", "SP500 → LSE", "NASDAQ → NIKKEI" and "SP500 → NIKKEI". Horizontal axis: h = 1, ..., 91. Plotted series: r(h) with bands -k and k.]

c. "NASDAQ → LSE" indicates that tail dependence between past NASDAQ_{t-h} and contemporary LSE_t is measured.


FIGURE 5 Cont.: Equity-to-Equity Two-Tailed Bivariate Median Estimate ± Band

[Figure: six panels titled "LSE → NASDAQ", "NIKKEI → NASDAQ", "LSE → SP500", "NIKKEI → SP500", "LSE → NIKKEI" and "NIKKEI → LSE". Horizontal axis: h = 1, ..., 91. Plotted series: r(h) with bands -k and k.]
