TOPICS ON MAX-STABLE PROCESSES AND
THE CENTRAL LIMIT THEOREM
by
Yizao Wang
A dissertation submitted in partial fulfillment
of the requirements for the degree of
Doctor of Philosophy
(Statistics)
in The University of Michigan
2012
Doctoral Committee:
Associate Professor Stilian A. Stoev, Chair
Professor Tailen Hsing
Professor Robert W. Keener
Professor Roman Vershynin
Professor Emeritus Michael B. Woodroofe
ACKNOWLEDGEMENTS
First of all, I am indebted to my thesis advisor Professor Stilian A. Stoev for his
help and support since 2008. He has been a great mentor for me in my research
career. At the same time, he has also provided me with much help and advice in daily
life. This dissertation would not have been possible without him. In particular, the
first part of this dissertation is under his supervision.
Second, I am grateful to Professor Emeritus Michael Woodroofe. He sets a
very high standard for scholars, and as a young researcher I have been deeply influenced by
him in many aspects. The second part of this dissertation is under his supervision.
I would also like to thank Professor Yves Atchade, Professor Tailen Hsing, Pro-
fessor Bob Keener and Professor Parthanil Roy (from Michigan State University)
for many insightful and inspiring discussions on research. I also appreciate Profes-
sor Tailen Hsing, Professor Bob Keener, Professor Roman Vershynin and Professor
Michael Woodroofe for serving on my thesis committee.
I owe many thanks to all the faculty members and students in the Department
of Statistics at the University of Michigan. I have really enjoyed my last five years as a
graduate student in Ann Arbor.
Finally, I am greatly indebted to my parents for their unconditional support of
my pursuit of an academic career abroad during the past years. Without their support I
could have achieved no success. I am also grateful to my wife, Fei Xu, for her companionship
full of encouragement, support and consideration.
ii
TABLE OF CONTENTS
ACKNOWLEDGEMENTS . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . ii
LIST OF FIGURES . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . v
LIST OF TABLES . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . vi
CHAPTER
I. Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1
1.1 Max-stable Processes . . . . . . . . . . 2
1.2 Central Limit Theorems for Random Fields . . . . . . . . . . 5
II. Preliminaries on Max-stable Processes . . . . . . . . . . . . . . . . . . . . . . 8
2.1 Spectral Representation and Extremal Integrals . . . . . . . . . . 9
2.2 Spectrally Continuous and Discrete α-Frechet processes . . . . . . . . . . 14
III. Association of Sum- and Max-stable Processes . . . . . . . . . . . . . . . . . 16
3.1 Preliminaries . . . . . . . . . . 19
3.2 Identification of Max-linear and Positive-linear Isometries . . . . . . . . . . 22
3.3 Association of Sum- and Max-stable Processes . . . . . . . . . . 24
3.4 Association of Classifications . . . . . . . . . . 29
3.5 Proofs of Auxiliary Results . . . . . . . . . . 31
IV. Decomposability of Sum- and Max-stable Processes . . . . . . . . . . . . . . 35
4.1 SαS Components . . . . . . . . . . 37
4.2 Stationary SαS Components and Flows . . . . . . . . . . 41
4.3 Decomposability of Max-stable Processes . . . . . . . . . . 49
4.4 Proof of Theorem IV.1 . . . . . . . . . . 52
V. Conditional Sampling for Max-stable Processes . . . . . . . . . . . . . . . . . 58
5.1 Overview . . . . . . . . . . 59
5.2 Conditional Probability in Max-linear Models . . . . . . . . . . 62
5.3 Conditional Sampling: Computational Efficiency . . . . . . . . . . 68
5.4 MARMA Processes . . . . . . . . . . 73
5.5 Discrete Smith Model . . . . . . . . . . 80
5.6 Proofs of Theorems V.4 and V.9 . . . . . . . . . . 82
VI. Central Limit Theorems for Stationary Random Fields . . . . . . . . . . . . 89
6.1 Main Result . . . . . . . . . . 91
6.2 m-Dependent Approximation . . . . . . . . . . 93
6.3 A Central Limit Theorem . . . . . . . . . . 95
6.4 An Invariance Principle . . . . . . . . . . 98
6.5 Orthomartingales . . . . . . . . . . 101
6.6 Stationary Causal Linear Random Fields . . . . . . . . . . 105
6.7 A Moment Inequality . . . . . . . . . . 109
6.8 Auxiliary Proofs . . . . . . . . . . 112
VII. Asymptotic Normality of Kernel Density Estimators for Stationary Random Fields . . . . . . . . . . 115
7.1 Assumptions and Main Result . . . . . . . . . . 118
7.2 Examples and Discussions . . . . . . . . . . 121
7.3 A Central Limit Theorem for m-Dependent Random Fields . . . . . . . . . . 123
7.4 Asymptotic Normality by m-Approximation . . . . . . . . . . 125
7.5 Proofs . . . . . . . . . . 128
BIBLIOGRAPHY . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 134
LIST OF FIGURES
Figure
5.1 Four samples from the conditional distribution of the discrete Smith model (see Section 5.5), given the observed values (all equal to 5) at the locations marked by crosses. . . . . . 61
5.2 Prediction of a MARMA(3,0) process with φ1 = 0.7, φ2 = 0.5 and φ3 = 0.3, based on the observation of the first 100 values of the process. . . . . . 77
5.3 Conditional medians (left) and 0.95-th conditional marginal quantiles (right). Each cross indicates an observed location of the random field, with the observed value at right. . . . . . 81
LIST OF TABLES
Table
5.1 Means and standard deviations (in parentheses) of the running times (in seconds) for the decomposition of the hitting matrix H, based on 100 independent observations X = A ⊙ Z, where A is an (n × p) matrix corresponding to a discretized Smith model. . . . . . 73
5.2 Cumulative probabilities that the projection predictors correspond to at time 100+t, based on 1000 simulations. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 78
5.3 Coverage rates (CR) and the widths of the upper 95% confidence intervals at time100 + t, based on 1000 simulations. . . . . . . . . . . . . . . . . . . . . . . . . . . 78
CHAPTER I
Introduction
This dissertation consists of results in two distinct areas of probability theory.
One is extreme value theory; the other is the central limit theorem.
In extreme value theory, the focus is on max-stable processes. Such processes
play an increasingly important role in characterizing and modeling extremal phenom-
ena in finance, environmental sciences and statistical mechanics. Several structural
and ergodic properties of max-stable processes are investigated via their spectral representations. In addition, the conditional distributions of max-stable processes are also
studied, and a computationally efficient algorithm is developed. This algorithm has
many potential applications in the prediction of extremal phenomena.
In the central limit theorem, the asymptotic normality for partial sums of sta-
tionary random fields is studied, with a focus on the projective conditions on the
dependence. Such conditions, easy to check for many stochastic processes and random fields, have recently drawn much attention for (one-dimensional) time series
models in statistics and econometrics. Here, the focus is on (high-dimensional) sta-
tionary random fields. In particular, a general central limit theorem for stationary
random fields and orthomartingales is established. The method is then extended to
establish the asymptotic normality for the kernel density estimator of linear random
fields.
Below is an overview of the following chapters of this dissertation.
1.1 Max-stable Processes
Max-stable processes arise in the limit of maxima of independent and identically
distributed processes. It is well known that all max-stable processes can be trans-
formed to α-Frechet processes. A random variable Y is α-Frechet with α > 0, if
P(Y ≤ y) = exp(−σ^α y^{−α}),  y > 0.
A stochastic process {Y_t}_{t∈T} is α-Frechet, if all its max-linear combinations, of the form max_{i=1,...,n} a_i Y_{t_i} ≡ ⋁_{i=1}^n a_i Y_{t_i}, a_i > 0, t_i ∈ T, i = 1, . . . , n, n ∈ N, are α-Frechet.
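An α-Frechet variable is easy to simulate by inverting its distribution function: if U is uniform on (0, 1), then Y = σ(−log U)^{−1/α} satisfies P(Y ≤ y) = exp(−σ^α y^{−α}). The following Python sketch (our own illustration, not part of the dissertation; the helper name `rfrechet` is ours) also checks max-stability empirically: the maximum of n i.i.d. standard α-Frechet variables, rescaled by n^{−1/α}, is again standard α-Frechet.

```python
import math
import random

def rfrechet(alpha, sigma=1.0, rng=random):
    """Sample an alpha-Frechet variable by inversion:
    P(Y <= y) = exp(-sigma**alpha * y**(-alpha))."""
    u = rng.random()
    return sigma * (-math.log(u)) ** (-1.0 / alpha)

random.seed(0)
alpha, n, reps = 2.0, 50, 20000

# Max-stability: n^(-1/alpha) times the maximum of n i.i.d. standard
# alpha-Frechet variables is again standard alpha-Frechet.  Compare the
# empirical CDF of the rescaled maxima with exp(-y^(-alpha)).
maxima = [n ** (-1.0 / alpha) * max(rfrechet(alpha) for _ in range(n))
          for _ in range(reps)]
for y in (0.5, 1.0, 2.0):
    emp = sum(m <= y for m in maxima) / reps
    assert abs(emp - math.exp(-y ** (-alpha))) < 0.02, (y, emp)
```

Here the equality is exact in distribution (not merely asymptotic), which is precisely the max-stability property.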
It is known since de Haan [23] that under mild regularity conditions, for every α-Frechet process {Y_t}_{t∈T}, there exists a class of non-negative, L^α-integrable functions {f_t}_{t∈T} ⊂ L^α_+(S, B_S, µ), such that

(1.1)  P(Y_{t_1} ≤ y_1, . . . , Y_{t_n} ≤ y_n) = exp{ − ∫_S ⋁_{i=1}^n ( f_{t_i}(s)/y_i )^α µ(ds) }.
Indeed, every such process has an extremal integral representation

(1.2)  {Y_t}_{t∈T} d= { ∫^e_S f_t(s) M^∨_α(ds) }_{t∈T},

where ‘∫^e’ denotes the extremal integral and M^∨_α is an α-Frechet random
sup-measure (see Stoev and Taqqu [101]).
Preliminary results on max-stable processes can be found in Chapter II. Then,
starting with such representation results, structural properties of max-stable processes are investigated. Moreover, a careful investigation of their conditional distributions
also yields an exact conditional sampling algorithm, which has potential applications
in spatial extremes.
Association of max-stable processes to sum-stable processes
The association of α-Frechet processes to the symmetric α-stable (SαS) processes
is established in Chapter III. Namely, under mild assumptions, every α-Frechet process can be associated to an SαS process via spectral representations. This provides
theoretical support for the longstanding folklore that the two classes of processes share
many similar structural results. However, the converse is not true: roughly speaking,
the class of SαS processes has a richer structure than the class of α-Frechet processes.
The association method has become a convenient tool to translate results on
SαS processes (e.g. Rosinski [83] and Samorodnitsky [92]) to α-Frechet processes.
By the association method, many structural results on SαS processes have natural
counterparts for α-Frechet processes. See also Kabluchko [50] for an independent
treatment with different tools.
Decomposability of max-stable processes
Decomposability properties have been extensively studied for probability distributions. The notion of decomposability can be generalized to α-Frechet processes.
Namely, letting Y = {Y_t}_{t∈T} be an α-Frechet process as in (1.2), a natural question
is, when can we write

(1.3)  {Y_t}_{t∈T} d= { Y_t^{(1)} ∨ Y_t^{(2)} }_{t∈T},

where Y^{(i)} = {Y_t^{(i)}}_{t∈T}, i = 1, 2, are two independent α-Frechet processes? If such
processes Y^{(1)}, Y^{(2)} exist, what kind of α-Frechet processes can they be? To what
extent are their structures determined by Y?
A characterization of all possible α-Frechet components Y^{(i)} is established in
Chapter IV. Furthermore, when Y is stationary, a necessary and sufficient condition
for its α-Frechet components to be stationary is established. In some cases, Y may
have only trivial stationary α-Frechet components (scaled copies cY with c ∈
(0, 1)), and such a process is said to be indecomposable. These indecomposable
processes can be viewed as the elementary building blocks for all stationary α-Frechet
processes. Therefore, to study stationary α-Frechet processes, it suffices to focus on
the indecomposable ones. The decomposability of stationary α-Frechet processes
also provides a different point of view on the classification problem for stationary
α-Frechet processes.
Similar decomposability results also hold for sum-stable processes. This is clear
from the association point of view. In fact, we first establish the decomposability
result for sum-stable processes, and then obtain results for max-stable processes by
the association method.
Conditional sampling for max-stable random fields
Given an α-Frechet random field {Y_t}_{t∈Z^d}, what is the conditional distribution

(1.4)  P( (Y_{s_1}, . . . , Y_{s_m}) ∈ · | Y_{t_1}, . . . , Y_{t_n} ) = ?
The conditional distribution formula is established for a dense class of α-Frechet
random fields (the spectrally discrete ones). For such random fields, an explicit exact formula for the conditional distribution (1.4) is obtained. The hard part of the problem
is to provide an efficient algorithm applicable in practice. Such an algorithm is developed, thanks to a certain conditional independence structure of spectrally discrete
max-stable random fields.
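The spectrally discrete random fields in question reduce to max-linear models X = A ⊙ Z with X_i = ⋁_j A_{ij} Z_j and i.i.d. standard α-Frechet Z_j (the notation X = A ⊙ Z also appears in Table 5.1). The minimal Python sketch below is our own toy illustration, not the dissertation's algorithm; it shows the forward model and the hard constraints that conditioning imposes — each observation X_i forces Z_j ≤ X_i / A_{ij} for every j, which is what makes exact conditional sampling tractable.

```python
import math
import random

def rfrechet(alpha, rng):
    """Standard alpha-Frechet variable via inversion."""
    return (-math.log(rng.random())) ** (-1.0 / alpha)

def max_linear(A, Z):
    """X = A (*) Z, i.e. X_i = max_j A[i][j] * Z[j]."""
    return [max(a_ij * z_j for a_ij, z_j in zip(row, Z)) for row in A]

rng = random.Random(1)
alpha = 1.0
A = [[1.0, 0.5, 0.2],     # a 2 x 3 "spectral" matrix (arbitrary values)
     [0.3, 1.0, 0.4]]
Z = [rfrechet(alpha, rng) for _ in range(3)]
X = max_linear(A, Z)

# Conditioning on X constrains each latent Z_j from above:
# Z_j <= upper_j := min_i X_i / A[i][j], and in each row i at least
# one index j attains A[i][j] * Z_j = X_i.
upper = [min(X[i] / A[i][j] for i in range(len(A))) for j in range(3)]
assert all(z <= u + 1e-12 for z, u in zip(Z, upper))
```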
As a potential application, such an algorithm would play an important role in the
prediction problem. The prediction problem arises in many scenarios from different
areas. For example, suppose observations of heavy rainfalls are available in an area
at certain locations. Engineers often need estimates (predictions) of the rainfall
over the entire area, and this information is useful for building flood-protection
infrastructure. Max-stable random fields are natural models for such problems
focusing on extremal phenomena.
Remark I.1. The main results in Chapters II, III, IV and V have already been
published in peer-reviewed journals ([111], [110], [112] and [113] respectively).
1.2 Central Limit Theorems for Random Fields
In probability theory, the central limit theorem is one of the problems with the longest
history: when does
(1.5)  (1/√n) ∑_{k=1}^n (X_k − E X_k) ⇒ N(0, σ²)
occur? While the case where {X_k}_{k=1,...,n} are independent has been completely solved
for more than half a century, establishing central limit theorems in the dependent case
is still an active area of research. Such limit results are of fundamental importance in
various areas, particularly in statistics, where it is important to characterize
the cumulative behavior of large numbers of individuals.
This dissertation investigates two problems, focusing on central limit theorems for
random fields. Namely, given a stationary random field {X_{i,j}}_{(i,j)∈Z²}, we establish
conditions on the dependence such that
(1.6)  (1/n) ∑_{i=1}^n ∑_{j=1}^n (X_{i,j} − E X_{i,j}) ⇒ N(0, σ²).
This problem has been investigated by many researchers. Many results in the lit-
erature are based on mixing-type conditions (see e.g. Bradley [8]). These conditions
are sometimes difficult to check in applications. Here, the focus is on projective-type
conditions that are easy to check. Such conditions have recently drawn much atten-
tion in the study of central limit theorems for (one-dimensional) stochastic processes,
with applications in statistics and econometrics. See for example Dedecker et al. [30]
and Wu [118, 119], among others. The extension of the aforementioned results to
high dimensions (i.e. random fields) is not trivial, as the main technical tool used
there, the martingale approximation method, is in general not applicable in the
multiparameter setting. Instead, we take an m-approximation approach.
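To illustrate (1.6) numerically, one can take a toy 1-dependent (hence m-dependent) moving-average field X_{i,j} = 0.5(ε_{i,j} + ε_{i+1,j} + ε_{i,j+1} + ε_{i+1,j+1}) with i.i.d. standard normal innovations, for which the limiting variance is σ² = (sum of the coefficients)² = 4. The Monte Carlo sketch below is our own illustration with our own parameter choices, not from the dissertation:

```python
import random

def field_sum(n, rng):
    """S_n = sum_{i,j} X_{i,j} over an n x n block for the 1-dependent field
    X_{i,j} = 0.5*(e_{i,j} + e_{i+1,j} + e_{i,j+1} + e_{i+1,j+1})."""
    e = [[rng.gauss(0.0, 1.0) for _ in range(n + 1)] for _ in range(n + 1)]
    s = 0.0
    for i in range(n):
        for j in range(n):
            s += 0.5 * (e[i][j] + e[i + 1][j] + e[i][j + 1] + e[i + 1][j + 1])
    return s

rng = random.Random(7)
n, reps = 40, 300
norm_sums = [field_sum(n, rng) / n for _ in range(reps)]   # (1/n) S_n as in (1.6)
mean = sum(norm_sums) / reps
var = sum((s - mean) ** 2 for s in norm_sums) / (reps - 1)
# Limiting variance: (4 * 0.5)^2 = 4; for n = 40 boundary effects make
# the exact finite-n value about 3.9.
assert 2.8 < var < 5.2, var
```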
A general central limit theorem
A central limit theorem for stationary random fields (1.6) is established in Chap-
ter VI. A particular example is given by functionals of linear random fields

(1.7)  X_{i,j} = g( ∑_{k=0}^∞ ∑_{l=0}^∞ a_{k,l} ε_{i−k,j−l} ),  (i, j) ∈ Z²,
where {ε_{i,j}}_{(i,j)∈Z²} are i.i.d. random variables, and g is often a Lipschitz function. Such models from statistics have recently attracted much attention (see
e.g. [15]). Another example is when {X_{i,j}}_{(i,j)∈Z²} are orthomartingale differences (see
e.g. Khoshnevisan [53]). In this case, a new central limit theorem for orthomartin-
gales follows from the previous result, generalizing known results [3, 62, 63, 73] in
the literature.
Asymptotic normality of kernel density estimators
Consider a causal linear random field {X_{i,j}}_{(i,j)∈Z²} (as in (1.7) with g(x) = x).
The kernel density estimator

f_n(x) = (1 / (n² b_n)) ∑_{i=1}^n ∑_{j=1}^n K( (x − X_{i,j}) / b_n )
is said to be asymptotically normal, if
(1.8)  √(n² b_n) ( f_n(x) − E f_n(x) ) ⇒ N(0, σ_x²),

where σ_x² = p(x) ∫ K²(s) ds and p(x) is the density of X_{0,0} at x. Such estimators
were first considered for i.i.d. sequences by Rosenblatt [82] and Parzen [65], and have
been widely studied since then (see e.g. Wu and Mielniczuk [120] for a treatment
for stationary sequences). Sufficient conditions for (1.8) to hold are provided in
Chapter VII.
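A minimal implementation of the estimator f_n(x) is straightforward. The sketch below is our own illustration — a Gaussian kernel and an arbitrary bandwidth, neither prescribed by the dissertation — evaluating f_n(0) on a simulated causal linear field whose marginal density p is standard normal:

```python
import math
import random

def kde_at(x, data, b):
    """f_n(x) = (1 / (N * b)) * sum_v K((x - v) / b) with Gaussian kernel K;
    N = len(data) plays the role of n^2 in (1.8)."""
    c = 1.0 / math.sqrt(2.0 * math.pi)
    return sum(c * math.exp(-0.5 * ((x - v) / b) ** 2) for v in data) / (len(data) * b)

rng = random.Random(3)
n = 60
# Causal linear field X_{i,j} = 0.6*e_{i,j} + 0.8*e_{i-1,j}: dependent
# across rows, but marginally standard normal (0.36 + 0.64 = 1).
e = [[rng.gauss(0.0, 1.0) for _ in range(n)] for _ in range(n + 1)]
data = [0.6 * e[i][j] + 0.8 * e[i - 1][j]
        for i in range(1, n + 1) for j in range(n)]

fh = kde_at(0.0, data, b=0.3)
p0 = 1.0 / math.sqrt(2.0 * math.pi)   # true density p(0) of X_{0,0}
assert abs(fh - p0) < 0.06, (fh, p0)
```

The residual gap between `fh` and `p0` combines the usual smoothing bias and the sampling variance that (1.8) quantifies.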
Remark I.2. The results in Chapters VI and VII can be found in [114] and [115],
which have been submitted to peer-reviewed journals at the time of writing this
dissertation.
CHAPTER II
Preliminaries on Max-stable Processes
Max-stable processes have been studied extensively in the past 30 years. The
works of Balkema and Resnick [2], de Haan [22, 23], de Haan and Pickands [26],
Gine et al. [38] and Resnick and Roy [78], among many others, have led to a wealth
of knowledge on max-stable processes. The seminal works of de Haan [23] and de
Haan and Pickands [26] laid the foundations of the spectral representations of max-
stable processes and established important structural results for stationary max-
stable processes. Since then, however, while many authors focused on various im-
portant aspects of max-stable processes, the general theory of their representation
and structural properties had not been thoroughly explored. At the same time, the
structure and the classification of sum-stable processes have been vigorously studied. Rosinski [83], building on the seminal works of Hardin [45, 46] about minimal
representations, developed the important connection between stationary sum-stable
processes and flows. This led to a number of important contributions on the struc-
ture of sum-stable processes (see, e.g. [86, 84, 70, 71, 92]). There are relatively few
results of this nature about the structure of max-stable processes, with the notable
exceptions of de Haan and Pickands [26], Davis and Resnick [20] and the very recent
works of Kabluchko et al. [51] and Kabluchko [50].
This chapter collects preliminary results on max-stable processes and their
(stochastic) extremal integral representation introduced by Stoev and Taqqu [101]
(see also Wang and Stoev [111]). This representation, essentially equivalent to the
one by de Haan [23], provides a natural connection to sum-stable processes (see
e.g. Samorodnitsky and Taqqu [93]). This connection is explored in Chapters III
and IV.
2.1 Spectral Representation and Extremal Integrals
It is well known that the univariate marginals of a max-stable process are nec-
essarily extreme value distributions, i.e. up to rescaling and shift they are either
Frechet, Gumbel or negative Frechet. The extreme value distributions arise as limits
of normalized maxima of independent and identically distributed random variables:
( ⋁_{i=1}^n X_i − b_n ) / a_n ⇒ Z.
If the weak convergence holds and Z is non-degenerate, then it must have one of
the above mentioned distributions (see e.g. [76], Proposition 0.3). Similarly, given
independent and identically distributed stochastic processes {X_t^{(i)}}_{t∈T}, i ∈ N, if

{ ( ⋁_{i=1}^n X_t^{(i)} − b_n(t) ) / a_n(t) }_{t∈T} ⇒ {Z_t}_{t∈T},

for some {a_n(t)}_{t∈T} ∈ R^T_+, {b_n(t)}_{t∈T} ∈ R^T, then the limiting process is necessarily a
max-stable process.
We focus on a special class of max-stable processes: the α-Frechet processes.
Recall that a positive random variable Z ≥ 0 has α-Frechet distribution, α > 0, if
P(Z ≤ x) = exp{−σ^α x^{−α}},  x ∈ (0, ∞).

Here ‖Z‖_α := σ > 0 stands for the scale coefficient of Z. A stochastic process
{X_t}_{t∈T} is α-Frechet, if all max-linear combinations:
(2.1)  max_{1≤j≤n} a_j X_{t_j} ≡ ⋁_{j=1}^n a_j X_{t_j},  for all a_j > 0, t_j ∈ T, j = 1, . . . , n,
are α-Frechet random variables. Any max-stable process can be transformed into an
α-Frechet process by simply transforming its one-dimensional distributions into
α-Frechet ones (see e.g. [76], Chapter 5.4).
The seminal work of de Haan [23] provides convenient spectral representations for
stochastically continuous α-Frechet processes in terms of functionals of Poisson point
processes on (0, 1)× (0,∞). Here, we adopt the slightly more general, but essentially
equivalent, approach of representing max-stable processes through extremal integrals
with respect to random sup-measures (see Stoev and Taqqu [101]). We do so in order
to emphasize the analogies with the well-developed theory of sum-stable processes
(see e.g. Samorodnitsky and Taqqu [93] and Chapter III below).
Given a measure space (S, B_S, µ) and α > 0, {M_α(A)}_{A∈B_S} is said to be an α-Frechet random sup-measure with control measure µ, if:

(i) the M_α(A_i)'s are independent random variables for disjoint A_i ∈ B_S, 1 ≤ i ≤ n,

(ii) M_α(A) is α-Frechet with scale coefficient ‖M_α(A)‖_α = µ(A)^{1/α}, and

(iii) for all disjoint A_i's, i ∈ N, we have M_α(⋃_{i∈N} A_i) = ⋁_{i∈N} M_α(A_i), almost surely.
One can then define the extremal integral of a non-negative simple function f(u) := ⋁_{i=1}^n a_i 1_{A_i}(u) ≥ 0 with disjoint A_1, . . . , A_n ∈ B_S:

∫^e_S f dM_α ≡ ∫^e_S f(u) M_α(du) := ⋁_{1≤i≤n} a_i M_α(A_i).
One can show that ∫^e_S f dM_α is an α-Frechet random variable with scale coefficient (∫_S f^α dµ)^{1/α}. The definition of ∫^e_S f dM_α can, by continuity in probability,
be extended to integrands f in the space of nonnegative, L^α-integrable measurable
functions L^α_+(S, µ) := {f ∈ L^α(S, µ) : f ≥ 0}. Here and in the sequel, we may write
(S, µ) = (S, B_S, µ) for simplicity.
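For a simple function the extremal integral can be simulated directly from the defining properties of the sup-measure: the M_α(A_i) are drawn as independent α-Frechet variables with scales µ(A_i)^{1/α}, and the scale coefficient of the resulting maximum should be (∑_i a_i^α µ(A_i))^{1/α} = (∫_S f^α dµ)^{1/α}. A Python sketch (our own illustration, with arbitrary a_i and µ(A_i)):

```python
import math
import random

rng = random.Random(5)
alpha = 1.5
a = [2.0, 1.0, 0.5]    # values a_i of the simple function f
mu = [0.3, 1.0, 0.7]   # mu(A_i) for the disjoint sets A_i

def frechet(scale):
    """alpha-Frechet variable with the given scale, via inversion."""
    return scale * (-math.log(rng.random())) ** (-1.0 / alpha)

def extremal_integral():
    """e-integral of f dM_alpha = max_i a_i * M_alpha(A_i), where the
    M_alpha(A_i) are independent alpha-Frechet with scale mu(A_i)^(1/alpha)."""
    return max(ai * frechet(m ** (1.0 / alpha)) for ai, m in zip(a, mu))

# Scale coefficient of the integral: (integral of f^alpha dmu)^(1/alpha).
sigma = sum(ai ** alpha * m for ai, m in zip(a, mu)) ** (1.0 / alpha)

reps = 20000
samples = [extremal_integral() for _ in range(reps)]
# For an alpha-Frechet variable with scale sigma, P(Y <= sigma) = exp(-1).
emp = sum(s <= sigma for s in samples) / reps
assert abs(emp - math.exp(-1.0)) < 0.02, emp
```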
Extremal integrals are sometimes referred to as stochastic extremal integrals to
emphasize that they are random variables. We omit the term ‘stochastic’ for the sake
of simplicity. Extremal integrals parallel the notion of stochastic integrals
based on SαS random measures ([101]). In particular, two important properties of
extremal integrals are:

(i) the random variables ∫^e_S f_j dM_α, j = 1, . . . , n, are independent if and only if the f_j's have pairwise disjoint supports (mod µ), and

(ii) the extremal integral is max-linear: ∫^e_S (af ∨ bg) dM_α = a ∫^e_S f dM_α ∨ b ∫^e_S g dM_α, for all a, b > 0 and f, g ∈ L^α_+(S, µ).
For more details, see Stoev and Taqqu [101].
Now, for any collection of deterministic functions {f_t}_{t∈T} ⊂ L^α_+(S, µ), one can
construct the stochastic process:

(2.2)  X_t = ∫^e_S f_t(u) M_α(du),  for all t ∈ T.
In view of the max-linearity of the extremal integrals and (2.1), the resulting process
X = {X_t}_{t∈T} is α-Frechet. Furthermore, for any n ∈ N, x_i > 0, t_i ∈ T, i = 1, . . . , n:

(2.3)  P(X_{t_1} ≤ x_1, . . . , X_{t_n} ≤ x_n) = exp{ − ∫_S ⋁_{i=1}^n ( x_i^{−1} f_{t_i}(u) )^α µ(du) }.
This shows that the deterministic functions {f_t}_{t∈T} completely characterize the finite-dimensional distributions of the process X. In general, if

(2.4)  {X_t}_{t∈T} d= { ∫^e_S f_t dM_α }_{t∈T},

for some {f_t}_{t∈T} ⊂ L^α_+(S, µ), we shall say that the process X has the extremal integral
representation or spectral representation {f_t}_{t∈T} over the space L^α_+(S, µ). The f_t's in
(2.4) are also referred to as spectral functions of X. In this dissertation, we let ‘d=’
denote ‘equal in finite-dimensional distributions’.
Many α-Frechet processes of practical interest have tractable spectral representations, with (S, B_S, µ) being a standard Lebesgue space. A measurable space (S, 𝒮, ν)
is a standard Lebesgue space, if (S, 𝒮) is a standard Borel space and ν is a σ-finite
measure. A standard Borel space is a measurable space measurably isomorphic (i.e.,
there exists a one-to-one, onto and bi-measurable map) to a Borel subset of a Polish
space. For example, a Polish space with σ-finite measure on its Borel sets is stan-
dard Lebesgue, and one often chooses (S,BS, µ) = ([0, 1],B[0,1], Leb) in (2.4). (For
more discussions on standard Lebesgue spaces and stationary sum-stable processes,
see Appendix A in [71].)
As shown in Proposition 3.2 in [101], an α-Frechet process X has a representation
(2.4) with (S, B_S, µ) being standard Lebesgue, if and only if X satisfies Condition S.
Condition S. There exists a countable subset T_0 ⊆ T such that for every t ∈ T, we
have X_{t_n} →^P X_t for some {t_n}_{n∈N} ⊂ T_0.
Note that without Condition S, every max-stable process X can still have a
spectral representation as in (2.4), but the space (S, µ) may not be standard Lebesgue
(see Theorem 1 in [50]).
Remark II.1. The assumption that (S, µ) is a standard Lebesgue space implies that
the space of integrands L^α_+(S, µ) is a complete and separable metric space with respect
to the metric

(2.5)  ρ_{µ,α}(f, g) = ∫_S |f^α − g^α| dµ.
This metric is natural to use when handling extremal integrals, since as n → ∞,

(2.6)  ∫^e_S f_n dM_α →^P ξ  if and only if  ρ_{µ,α}(f_n, f) = ∫_S |f_n^α − f^α| dµ → 0,

where ξ = ∫^e_S f dM_α (see e.g. [101]). (Such a metric naturally induces a metric on the
space of jointly α-Frechet random variables.) By default, we equip the space L^α_+(S, µ)
with the metric ρ_{µ,α} and often write ‖f‖_{L^α_+(S,µ)} for (∫_S f^α dµ)^{1/α}. Here ‖·‖_{L^α_+(S,µ)} is
not a norm unless α ≥ 1.
We focus only on the rich class of α-Frechet processes that satisfy Condition S. In
particular, we want f_t(s) to be jointly measurable as a function on T × S taking values in R_+.
Here, we suppose 𝒯 is a σ-algebra on T and the measurability is w.r.t. the product
σ-algebra 𝒯 ⊗ B_S := σ(𝒯 × B_S). The following result clarifies the connection between
the joint measurability of the spectral functions f_t(s) and the measurability of the
corresponding α-Frechet process. The proof can be found in [111].
Proposition II.2. Let (S, µ) be a standard Lebesgue space and M_α (α > 0) be an
α-Frechet random sup-measure on S with control measure µ. Suppose (T, ρ_T) is a
separable metric space and 𝒯 is the Borel σ-algebra.

(i) Let X = {X_t}_{t∈T} have a spectral representation {f_t}_{t∈T} ⊂ L^α_+(S, µ) as in (2.4). Then, X has a measurable modification if and only if {f_t(s)}_{t∈T} has a jointly measurable modification, i.e., there exists a 𝒯 ⊗ B_S-measurable mapping (t, s) → g_t(s), such that f_t(s) = g_t(s) µ-a.e. for all t ∈ T.

(ii) If an α-Frechet process {X_t}_{t∈T} has a measurable modification, then it satisfies Condition S, and hence it has a representation as in (2.4).
We always assume (T, ρ_T) is a separable metric space and 𝒯 is the Borel σ-algebra.
By Proposition II.2, any measurable α-Frechet process {X_t}_{t∈T} has a jointly
measurable spectral representation and satisfies Condition S.
2.2 Spectrally Continuous and Discrete α-Frechet processes
Definition II.3. Consider an α-Frechet process X = {X_t}_{t∈T}. We say X is spectrally
discrete, if X can be represented as

{X_t}_{t∈T} d= { ⋁_{i∈Z} f_t(i) Z_i }_{t∈T},

where {Z_i}_{i∈Z} are i.i.d. standard α-Frechet random variables, and for each t ∈ T,
the map f_t : Z → R_+ satisfies ∑_i f_t(i)^α < ∞. The α-Frechet process X is spectrally
continuous, if X cannot be represented as

{X_t}_{t∈T} d= { X_t^{(1)} ∨ X_t^{(2)} }_{t∈T}

with two independent non-degenerate α-Frechet processes {X_t^{(1)}}_{t∈T}, {X_t^{(2)}}_{t∈T}, such
that one of them is spectrally discrete.
Theorem II.4. Let {X_t}_{t∈T} be an α-Frechet process with jointly measurable representation {f_t}_{t∈T} ⊂ L^α_+(S, µ). Then, there exist spectrally continuous and spectrally discrete
α-Frechet processes {X_t^{cont}}_{t∈T} and {X_t^{disc}}_{t∈T}, such that the two processes are independent, and

{X_t}_{t∈T} d= { X_t^{cont} ∨ X_t^{disc} }_{t∈T}.

Furthermore, this decomposition is unique in distribution.
The proof can be found in [111]. The processes {X_t^{cont}}_{t∈T} and {X_t^{disc}}_{t∈T} are
referred to as the spectrally continuous and spectrally discrete components of X,
respectively.
Example II.5. Let Z_i, i ∈ N, be independent standard α-Frechet variables and let
f_t(i) ≥ 0, t ∈ T, be such that ∑_{i∈N} f_t(i)^α < ∞, for all t ∈ T. Spectrally discrete
α-Frechet processes have a stochastic extremal integral representation

X_t := ⋁_{i∈N} f_t(i) Z_i ≡ ∫^e_N f_t dM_α,  t ∈ T,

where M_α is an α-Frechet random sup-measure on N with the counting measure as its control measure.
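For counting control measure, the finite-dimensional distribution formula (2.3) specializes to P(X_{t_1} ≤ x_1, . . . , X_{t_n} ≤ x_n) = exp{−∑_i ⋁_t (f_t(i)/x_t)^α}. A Python sketch (with our own toy spectral functions) that simulates such a spectrally discrete process and compares the empirical joint CDF with the formula:

```python
import math
import random

rng = random.Random(11)
alpha = 1.0
# Toy spectral functions f_t(i) on S = {0, 1, 2} for two time points.
f = {0: [1.0, 0.6, 0.1], 1: [0.2, 0.8, 0.9]}

def sample_path():
    """X_t = max_i f_t(i) * Z_i with i.i.d. standard alpha-Frechet Z_i."""
    Z = [(-math.log(rng.random())) ** (-1.0 / alpha) for _ in range(3)]
    return {t: max(ft_i * z for ft_i, z in zip(f[t], Z)) for t in f}

# Formula (2.3) with counting control measure:
x0, x1 = 2.0, 1.5
theo = math.exp(-sum(max(f[0][i] / x0, f[1][i] / x1) ** alpha
                     for i in range(3)))

reps = 20000
hits = 0
for _ in range(reps):
    p = sample_path()
    hits += (p[0] <= x0 and p[1] <= x1)
emp = hits / reps
assert abs(emp - theo) < 0.02, (emp, theo)
```

Note that X_0 and X_1 are dependent because they share the same Z_i's, and the formula captures this dependence exactly.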
Spectrally discrete max-stable processes have a simple structure of conditional distributions, which will be explored in Chapter V.
Example II.6. Consider the well-known α-Frechet extremal process (α > 0):

(2.7)  {X_t}_{t∈R_+} d= { ∫^e_{R_+} 1_{(0,t]}(u) M_α(du) }_{t∈R_+},

where M_α has Lebesgue control measure on R_+ (see e.g. [76], Chapter 4). The
α-Frechet extremal process X is spectrally continuous.
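By property (iii) of the sup-measure, X in (2.7) can be simulated on a grid as a running maximum of independent increments M_α((t_{k−1}, t_k]), each α-Frechet with scale (t_k − t_{k−1})^{1/α}; the marginal X_t is then α-Frechet with scale t^{1/α}. A Python sketch (our own illustration):

```python
import math
import random

rng = random.Random(17)
alpha = 2.0

def extremal_process(times):
    """X_t = e-integral of 1_{(0,t]} dM_alpha on a grid: a running maximum
    of independent alpha-Frechet increments M_alpha((t_{k-1}, t_k]),
    each with scale (t_k - t_{k-1})^(1/alpha)."""
    x, prev, path = 0.0, 0.0, []
    for t in times:
        scale = (t - prev) ** (1.0 / alpha)
        x = max(x, scale * (-math.log(rng.random())) ** (-1.0 / alpha))
        path.append(x)
        prev = t
    return path

times = [k / 10.0 for k in range(1, 11)]   # grid on (0, 1]
reps = 20000
emp = sum(extremal_process(times)[-1] <= 1.0 for _ in range(reps)) / reps
# X_1 is alpha-Frechet with scale 1^(1/alpha) = 1, so P(X_1 <= 1) = exp(-1).
assert abs(emp - math.exp(-1.0)) < 0.02, emp
```

The sampled paths are nondecreasing step functions, as expected for an extremal process.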
CHAPTER III
Association of Sum- and Max-stable Processes
The deep connection between sum- and max-stable processes has long been sus-
pected. As observed, for example, in [19] the moving maxima and the moving aver-
ages are statistically indistinguishable in the extremes. Also, the maxima of indepen-
dent copies of a sum-stable process (appropriately rescaled) converge in distribution
to a max-stable process and the two processes have very similar spectral represen-
tations (see e.g. [101], Theorem 5.1). In [100], the ergodic properties of max-stable
processes were characterized by borrowing ideas and drawing parallels to existing
work in the sum-stable domain.
In this chapter, we introduce the notion of association of sum- and max-stable
processes, by relating their spectral functions. It provides a theoretical support for
the long-standing folklore that the two classes of processes share similar structures.
Furthermore, we will see that the association method also helps ‘translate’ structural
properties of sum-stable processes to max-stable processes.
We focus on infinite variance symmetric α-stable (SαS, α ∈ (0, 2)) sum-stable
processes and α-Frechet max-stable processes. Recall that an infinite variance SαS
variable X has characteristic function ϕ_X(t) = E exp{itX} = exp{−σ^α |t|^α}, ∀t ∈
R, where α ∈ (0, 2). On the other hand, Y has an α-Frechet distribution if F_Y(y) =
P(Y ≤ y) = exp{−σ^α y^{−α}}, ∀y ∈ (0, ∞), where now α is in (0, ∞). The σ's in both
cases are positive parameters referred to as scale coefficients.
Recall that X = {X_t}_{t∈T} is an SαS stochastic process if all its finite linear combinations ∑_{i=1}^n a_i X_{t_i}, a_i ∈ R, t_i ∈ T, are SαS. These processes have convenient integral
(or spectral) representations:

(3.1)  {X_t}_{t∈T} d= { ∫_S f_t(s) M_{α,+}(ds) }_{t∈T}.

Here {f_t}_{t∈T} ⊂ L^α(S, µ), ‘∫’ stands for the stable integral, and M_{α,+} is an SαS random
measure on the measure space (S, µ) with control measure µ (see [93], Chapters 3 and
13). The representation (3.1) implies that
(3.2)  E exp{ i ∑_{j=1}^n a_j X_{t_j} } = exp{ − ∫_S | ∑_{j=1}^n a_j f_{t_j}(s) |^α µ(ds) },  a_j ∈ R, t_j ∈ T,

which determines the finite-dimensional distributions (f.d.d.) of the SαS process
{X_t}_{t∈T}.
On the other hand, we have seen in Chapter II that every α-Frechet process has
an extremal integral representation

(3.3)  {Y_t}_{t∈T} d= { ∫^e_S f_t(s) M_{α,∨}(ds) }_{t∈T},

with {f_t}_{t∈T} ⊂ L^α_+(S, µ), and

(3.4)  P(Y_{t_1} ≤ a_1, . . . , Y_{t_n} ≤ a_n) = exp{ − ∫_S ⋁_{j=1}^n ( f_{t_j}(s)/a_j )^α µ(ds) },  a_j ≥ 0, t_j ∈ T.
The ft’s in (3.1) and (3.3) are called the spectral functions of the sum- or max-
stable processes, respectively. Based on the spectral representations above, we define
association as follows:
Definition III.1 (Associated SαS and α-Frechet processes). We say that an SαS
process {X_t}_{t∈T} and an α-Frechet process {Y_t}_{t∈T} are associated, if there exist
{f_t}_{t∈T} ⊂ L^α_+(S, µ) such that:

{X_t}_{t∈T} d= { ∫_S f_t dM_{α,+} }_{t∈T}  and  {Y_t}_{t∈T} d= { ∫^e_S f_t dM_{α,∨} }_{t∈T}.

In this case, we say {X_t}_{t∈T} and {Y_t}_{t∈T} are associated by {f_t}_{t∈T}.
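On a finite set S = {1, . . . , m} with counting measure, the two sides of Definition III.1 can be written down explicitly: X = ∑_j f(j) ξ_j with i.i.d. standard SαS ξ_j, and Y = ⋁_j f(j) Z_j with i.i.d. standard α-Frechet Z_j. The sketch below is our own illustration (the dissertation gives no code); it draws the SαS variables with the classical Chambers–Mallows–Stuck method and checks two distributional facts: X is symmetric, and Y has scale coefficient (∑_j f(j)^α)^{1/α}.

```python
import math
import random

rng = random.Random(23)
alpha = 1.5
f = [1.0, 0.7, 0.4]    # one spectral function on S = {1, 2, 3}

def rsas():
    """Standard SaS variable via the Chambers-Mallows-Stuck method (beta = 0)."""
    v = rng.uniform(-math.pi / 2.0, math.pi / 2.0)
    w = -math.log(rng.random())                     # Exp(1)
    return (math.sin(alpha * v) / math.cos(v) ** (1.0 / alpha)
            * (math.cos((1.0 - alpha) * v) / w) ** ((1.0 - alpha) / alpha))

def rfrechet():
    """Standard alpha-Frechet variable via inversion."""
    return (-math.log(rng.random())) ** (-1.0 / alpha)

# The associated pair built from the same spectral function f.
reps = 20000
pairs = [(sum(fj * rsas() for fj in f), max(fj * rfrechet() for fj in f))
         for _ in range(reps)]

sigma = sum(fj ** alpha for fj in f) ** (1.0 / alpha)
assert abs(sum(x > 0 for x, _ in pairs) / reps - 0.5) < 0.02
assert abs(sum(y <= sigma for _, y in pairs) / reps - math.exp(-1.0)) < 0.02
```

Association is a statement about distributions, so X and Y need not be built from the same randomness; only the shared spectral function matters.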
We need to show that this definition is consistent. That is, if {f_t^{(1)}}_{t∈T} and
{f_t^{(2)}}_{t∈T} are two different spectral representations of a certain SαS (α-Frechet, resp.)
process, then the associated α-Frechet (SαS, resp.) processes are equal in finite-dimensional distributions. This is ensured by the following theorem, which is proved
in Section 3.2 below.
Theorem III.2. Consider two arbitrary collections of functions f_1^{(i)}, . . . , f_n^{(i)} ∈
L^α_+(S_i, µ_i), i = 1, 2, 0 < α < 2. Then,

(3.5)  ‖ ∑_{j=1}^n a_j f_j^{(1)} ‖_{L^α(S_1,µ_1)} = ‖ ∑_{j=1}^n a_j f_j^{(2)} ‖_{L^α(S_2,µ_2)},  for all a_j ∈ R,

if and only if

(3.6)  ‖ ⋁_{j=1}^n a_j f_j^{(1)} ‖_{L^α_+(S_1,µ_1)} = ‖ ⋁_{j=1}^n a_j f_j^{(2)} ‖_{L^α_+(S_2,µ_2)},  for all a_j ≥ 0.
Furthermore, Theorem III.2 entails that our notion of association is not merely
formal. For example, stationary or self-similar max-stable processes are associated
with stationary or self-similar sum-stable ones, respectively (see Corollary III.12).
We will also see, however, that there are SαS processes that cannot be associ-
ated to any α-Frechet process (see Theorem III.13). In particular, we provide a
practical characterization of the max-associable SαS processes {X_t}_{t∈T} with station-
ary increments generated by dissipative flows, indexed by T = R or T = Z (see
Proposition III.16).
This chapter is organized as follows. In Section 3.1, some preliminaries are pro-
vided. In Section 3.2, we prove Theorem III.2. In Section 3.3, we establish the asso-
ciation of SαS and α-Frechet processes and give examples of both max-associable and
non max-associable SαS processes. In Section 3.4, we show how the association can
serve as a tool to translate available structural results for SαS processes to α-Frechet
processes, and vice versa.
3.1 Preliminaries
We draw a connection between the linear isometries and max-linear isometries,
which play important roles in relating two representations of a given SαS or an α-
Frechet process, respectively. The notion of a linear isometry is well known. To
define a max-linear isometry, we say that a subset F ⊂ Lα
+(S, µ) is a max-linear
space if for all n ∈ N, fi ∈ F , ai > 0,
n
i=1 aifi ∈ F and if F is closed w.r.t. the
metric ρµ,α defined by ρµ,α(f, g) =S|fα − gα|dµ (and recall Remark II.1).
Definition III.3 (Max-linear isometry). Let α > 0 and consider two measure
spaces (S_1, µ_1) and (S_2, µ_2) with positive and σ-finite measures µ_1 and µ_2. Let
F_1 ⊂ L^α_+(S_1, µ_1) be a max-linear space. A mapping U : F_1 → L^α_+(S_2, µ_2) is said to
be a max-linear isometry, if:

(i) for all f_1, f_2 ∈ F_1 and a_1, a_2 ≥ 0, U(a_1 f_1 ∨ a_2 f_2) = a_1(Uf_1) ∨ a_2(Uf_2), µ_2-a.e.,
and

(ii) for all f ∈ F_1, ‖Uf‖_{L^α_+(S_2,µ_2)} = ‖f‖_{L^α_+(S_1,µ_1)}.
A linear (max-linear resp.) isometry may be defined only on a small linear (max-
linear resp.) subspace of L^α(S, µ) (L^α_+(S, µ) resp.). However, this linear (max-linear
resp.) isometry can be extended uniquely to the extended ratio space (extended
positive ratio space resp.), which will turn out to be closed w.r.t. both linear and
max-linear combinations.
Definition III.4. Let F be a collection of functions in L^α(S, µ).

(i) The ratio σ-field of F, written ρ(F) := σ(f_1/f_2 : f_1, f_2 ∈ F), is defined as the
σ-field generated by the ratios of functions in F, with the conventions ±1/0 = ±∞
and 0/0 = 0;

(ii) The extended ratio space of F, written R_e(F), is defined as:

(3.7)    R_e(F) := { rf : rf ∈ L^α(S, µ), r ∼ ρ(F), f ∈ F },

where r ∼ ρ(F) means that r is ρ(F)-measurable. Similarly, we define the extended positive ratio space:

(3.8)    R_{e,+}(F) := { rf : rf ∈ L^α_+(S, µ), r ∼ ρ(F), r ≥ 0, f ∈ F }.
The following result is due to [45] and [109].
Theorem III.5. Let F be a linear (max-linear resp.) subspace of L^α(S_1, µ_1) with
0 < α < 2 (of L^α_+(S_1, µ_1) with 0 < α < ∞ resp.). If U is a linear (max-linear resp.)
isometry from F to U(F), then U can be uniquely extended to a linear (max-linear
resp.) isometry U : R_e(F) → R_e(U(F)) (U : R_{e,+}(F) → R_{e,+}(U(F)) resp.), with
the form

(3.9)    U(rf) = T̄(r) U(f),

for all rf ∈ R_e(F) as in (3.7) (rf ∈ R_{e,+}(F) as in (3.8) resp.). Here T̄ is the mapping
from L^α(S_1, ρ(F), µ_1) to L^α(S_2, ρ(U(F)), µ_2), induced by a regular set isomorphism
T from ρ(F) to ρ(U(F)).
For the precise definition of a regular set isomorphism T and the induced mapping
T̄, see [56], [45] or [109]. The following remark provides some intuition. Part (iii) is
especially important since it shows that the two types of isometries can be identified.
Remark III.6. (i) U is well defined in the sense that for any r_i f_i ∈ R_e(F), i = 1, 2,
as in (3.7), if r_1 f_1 = r_2 f_2, µ_1-a.e., then U(r_1 f_1) = U(r_2 f_2), µ_2-a.e. A similar result
holds for r_i f_i ∈ R_{e,+}(F) as in (3.8).

(ii) T maps any two almost disjoint sets to almost disjoint sets. See [56].

(iii) The mapping T̄ is both linear and max-linear, i.e., for a, b ≥ 0,

(3.10)    T̄(af + bg) = a T̄f + b T̄g   and   T̄(af ∨ bg) = a T̄f ∨ b T̄g.

This follows from the definition T̄ 1_A = 1_{T(A)} for measurable A ⊂ S_1 and the
construction of T̄ via simple functions. It is via T̄ that linearity and max-
linearity are identified.
To make good use of (iii) in Remark III.6, we introduce the notion of positive-
linearity. We say a linear isometry U is positive-linear, if U maps all nonnegative
functions to nonnegative functions. Accordingly, we say that F ⊂ L^α_+(S, µ) is a
positive-linear space, if it is closed w.r.t. the metric ρ_{µ,α} and all positive-linear com-
binations, i.e., for all n ∈ N, f_i ∈ F, a_i ≥ 0, we have g := ∑_{i=1}^n a_i f_i ∈ F. Note that
the metric (f, g) → ‖f − g‖^{1∧α}_{L^α(S,µ)}, restricted to L^α_+(S, µ), generates the same topology
as the metric ρ_{µ,α}. Clearly, Theorem III.5 holds if F is a positive-linear (instead of
a linear) subspace of L^α_+(S, µ). In this case, U is also positive-linear. We conclude
this section with the following refinement of statement (iii) in Remark III.6.
Proposition III.7. Let U be as in Theorem III.5. If F is a positive-linear subspace
of L^α_+(S_1, µ_1), then the linear isometry U in (3.9) is also a max-linear isometry
from R_{e,+}(F) to R_{e,+}(U(F)). If F is a max-linear subspace of L^α_+(S_1, µ_1), then
the max-linear isometry U in (3.9) is also a positive-linear isometry from R_e(F) to
R_e(U(F)).
Proof. Suppose F is max-linear and U is a max-linear isometry. We show U
is also positive-linear. First, if U in (3.9) is max-linear, then the mapping T̄
from L^α_+(S_1, ρ(F), µ_1) to L^α_+(S_2, ρ(U(F)), µ_2) is both max-linear and linear, by Re-
mark III.6 (iii). Moreover, it is easily seen that T̄ is positive-linear. Now, for
r_1 f_1, r_2 f_2 ∈ R_{e,+}(F) as in (3.8), we have

    U(a_1 r_1 f_1 + a_2 r_2 f_2) = U( [ a_1 r_1 f_1/(f_1 ∨ f_2) + a_2 r_2 f_2/(f_1 ∨ f_2) ] (f_1 ∨ f_2) )
    = T̄( a_1 r_1 f_1/(f_1 ∨ f_2) + a_2 r_2 f_2/(f_1 ∨ f_2) ) U(f_1 ∨ f_2) = a_1 U(r_1 f_1) + a_2 U(r_2 f_2).

That is, U is positive-linear. The proof of the other case is similar, except that
we need the existence of a function with full support in F, guaranteed by Lemma 3.2
in [45].
3.2 Identification of Max-linear and Positive-linear Isometries
In this section we prove Theorem III.2. It will be used to relate SαS and α-
Frechet processes in the next section. To do so, we need to introduce subspaces of
L^α_+(S, µ) which are closed w.r.t. the max-linear and positive-linear combinations. For
any F ⊂ L^α_+(S, µ), let

(3.11)    F_+ := span_+ F   and   F_∨ := ∨-span F

denote the smallest positive-linear and the smallest max-linear subspace of L^α_+(S, µ) containing
the collection of functions F, respectively. We call them the positive-linear and max-
linear spaces generated by F, respectively. (We also write span F for the
smallest linear subspace of L^α(S, µ) containing F.) In general, we have F_+ ≠ F_∨.
This means both F_+ and F_∨ are too small to be closed w.r.t. both the '+' and '∨'
operators. However, these two subspaces generate the same extended positive ratio
space, on which the two types of isometries are identical. The following fact is proved
in Section 3.5.
Proposition III.8. Suppose F ⊂ L^α_+(S, µ). Then R_{e,+}(F_+) = R_{e,+}(F_∨).
Proof of Theorem III.2. Let F^{(i)} := {f^{(i)}_1, …, f^{(i)}_n} ⊂ L^α_+(S_i, µ_i). We prove the 'if'
part. Suppose Relation (3.6) holds; we will show (3.5). Relation (3.6) implies that
there exists a unique max-linear isometry U from F^{(1)}_∨ onto F^{(2)}_∨, such that Uf^{(1)}_j =
f^{(2)}_j, 1 ≤ j ≤ n. Thus, Theorem III.5 implies that the mapping

    U : R_{e,+}(F^{(1)}_∨) → R_{e,+}(U(F^{(1)}_∨))

with form (3.9) is a max-linear isometry. By Proposition III.7, U is
also a positive-linear isometry. By Proposition III.8, U is a positive-linear isometry
defined on R_{e,+}(F^{(1)}_+), which implies (3.5). The proof of the 'only if' part is similar.
To conclude this section, we address the following question: for f^{(1)}_1, …, f^{(1)}_n ∈
L^α(S_1, µ_1), do there always exist nonnegative f^{(2)}_1, …, f^{(2)}_n ∈ L^α_+(S_2, µ_2) such that
Relation (3.5) holds for all a_j ∈ R? The answer is 'No'. As a consequence, in the
next section we will see that there are SαS processes which cannot be associated to
any α-Frechet process.
Proposition III.9. Consider f^{(1)}_j ∈ L^α(S_1, µ_1), 1 ≤ j ≤ n. Then, there exist some
f^{(2)}_j ∈ L^α_+(S_2, µ_2), 1 ≤ j ≤ n, such that (3.5) holds, if and only if

(3.12)    f^{(1)}_i(s) f^{(1)}_j(s) ≥ 0, µ_1-a.e., for all 1 ≤ i, j ≤ n.

When (3.12) is true, one can take f^{(2)}_i(s) := |f^{(1)}_i(s)|, 1 ≤ i ≤ n, and (S_2, µ_2) ≡
(S_1, µ_1) for (3.5) to hold.

The proof is given in Section 3.5. We will call (3.12) the associability condition.
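On a discretized spectral space, the associability condition (3.12) is a pointwise sign check: at each point s, the values f^{(1)}_1(s), …, f^{(1)}_n(s) may not take strictly opposite signs, and when the check passes, |f^{(1)}_i| gives the nonnegative choice from the proposition. A small illustrative sketch (the grid, function names, and values are hypothetical):

```python
import numpy as np

def associable(F):
    """Check (3.12) on a grid: F[j, s] = f_j(s). Returns True iff at every
    grid point s no two functions take strictly opposite signs."""
    has_pos = (F > 0).any(axis=0)
    has_neg = (F < 0).any(axis=0)
    return not bool((has_pos & has_neg).any())

def associated_frechet_spectral(F):
    """When (3.12) holds, |f_j| is a valid nonnegative spectral choice."""
    assert associable(F)
    return np.abs(F)
```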
3.3 Association of Sum- and Max-stable Processes
In this section, by essentially applying Theorem III.2, we associate an SαS process
to every α-Frechet process by Definition III.1. The associated processes will be shown
to have similar properties. However, we will also see that not all SαS processes
can be associated to α-Frechet processes. We conclude with several examples.
Remark III.10. In Definition III.1, the associated SαS and α-Frechet processes have
the same α ∈ (0, 2). It is easy to see that, for any α-Frechet process {Y_t}_{t∈T} with
spectral functions {f_t}_{t∈T}, the process {Y_t^β}_{t∈T} is α/β-Frechet with spectral functions {f_t^β}_{t∈T},
for all 0 < α, β < ∞. This transformation shows that the parameter α plays essen-
tially no role in characterizing the dependence structure of the α-Frechet process.
Given an SαS process with nonnegative spectral functions, one could associate it
to the 1-Frechet process with spectral functions {f_t^α}_{t∈T}. This leads to no loss of
generality. Here, we chose to pair up the two α's for technical convenience.
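The power transformation in Remark III.10 can be checked directly on the one-dimensional marginals: if Y is α-Frechet with P(Y ≤ y) = exp(−(σ/y)^α), then Y^β is α/β-Frechet with scale σ^β. A quick numerical sanity check (parameter values are arbitrary):

```python
import math

def frechet_cdf(y, sigma, alpha):
    """CDF of an alpha-Frechet variable with scale sigma, evaluated at y > 0."""
    return math.exp(-((sigma / y) ** alpha))

def power_transform_cdf(y, sigma, alpha, beta):
    """P(Y^beta <= y) = P(Y <= y^(1/beta)) for Y alpha-Frechet with scale sigma."""
    return frechet_cdf(y ** (1.0 / beta), sigma, alpha)
```

For every y > 0, power_transform_cdf(y, σ, α, β) agrees with frechet_cdf(y, σ^β, α/β), which is the marginal version of the spectral-function statement.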
The following result, a simple application of Theorem III.2, shows the consistency
of Definition III.1, i.e., the notion of association is independent of the choice of the
spectral functions.
Theorem III.11. Suppose an SαS process {X_t}_{t∈T} and an α-Frechet process {Y_t}_{t∈T}
are associated by {f^{(1)}_t}_{t∈T} ⊂ L^α_+(S_1, µ_1). Then, {f^{(2)}_t}_{t∈T} ⊂ L^α_+(S_2, µ_2) is a spectral
representation of {X_t}_{t∈T}, if and only if it is a spectral representation of {Y_t}_{t∈T}.
Namely,

    { ∫_{S_1} f^{(1)}_t dM^{(1)}_{α,+} }_{t∈T} =^d { ∫_{S_2} f^{(2)}_t dM^{(2)}_{α,+} }_{t∈T},

if and only if

    { ∫^e_{S_1} f^{(1)}_t dM^{(1)}_{α,∨} }_{t∈T} =^d { ∫^e_{S_2} f^{(2)}_t dM^{(2)}_{α,∨} }_{t∈T},

where M^{(i)}_{α,+} and M^{(i)}_{α,∨} are SαS random measures and α-Frechet random sup-
measures, respectively, on S_i with control measure µ_i, i = 1, 2.
As an immediate consequence, stationarity and self-similarity are preserved under
association. Here we assume T = R^d or Z^d.

Corollary III.12. Suppose an SαS process {X_t}_{t∈T} and an α-Frechet process
{Y_t}_{t∈T} are associated. Then,

(i) {X_t}_{t∈T} is stationary if and only if {Y_t}_{t∈T} is stationary.

(ii) {X_t}_{t∈T} is self-similar with exponent H, if and only if {Y_t}_{t∈T} is self-similar
with exponent H.

Proof. Suppose {X_t}_{t∈T} and {Y_t}_{t∈T} are associated by {f_t}_{t∈T} ⊂ L^α_+(S, µ). (i) For
any h ∈ T, letting g_t = f_{t+h}, ∀t ∈ T, by stationarity of {X_t}_{t∈T} we obtain {g_t}_{t∈T}
as another spectral representation. Namely, { ∫_S f_t dM_{α,+} }_{t∈T} =^d { ∫_S g_t dM_{α,+} }_{t∈T}.
By Theorem III.11, the previous statement is equivalent to { ∫^e_S f_t dM_{α,∨} }_{t∈T} =^d
{ ∫^e_S g_t dM_{α,∨} }_{t∈T}, which is equivalent to the fact that {Y_t}_{t∈T} is stationary. The
proof of part (ii) is similar and thus omitted.
Observe that not all SαS processes can be associated to α-Frechet processes, since
not all SαS processes have nonnegative spectral representations. For an SαS process
{X_t}_{t∈T} with spectral representation {f_t}_{t∈T} to have an associated α-Frechet process,
a necessary and sufficient condition is that for all t_1, …, t_n ∈ T, the functions f_{t_1}, …, f_{t_n} satisfy
the associability condition (3.12). We say such SαS processes are max-associable.
Now, Proposition III.9 becomes:

Theorem III.13. An SαS process {X_t}_{t∈T} with representation (3.1) is max-
associable, if and only if for all t_1, t_2 ∈ T,

(3.13)    f_{t_1}(s) f_{t_2}(s) ≥ 0, µ-a.e.

Indeed, by Theorem III.13, for any max-associable spectral representation {f_t}_{t∈T},
{|f_t|}_{t∈T} is also a spectral representation of the same process. Clearly, if the spectral
functions are nonnegative, then the SαS process is max-associable. We give two
simple examples next.
Example III.14 (Association of mixed fractional motions). Consider the self-similar
SαS processes {X_t}_{t∈R_+} with the representations

(3.14)    {X_t}_{t∈R_+} =^d { ∫_E ∫_0^∞ t^{H−1/α} g(x, u/t) M_{α,+}(dx, du) }_{t∈R_+},  H ∈ (0, ∞),

where (E, E, ν) is a standard Lebesgue space, M_{α,+} is an SαS random measure on
E × R_+ with control measure m(dx, du) = ν(dx)du, and g ∈ L^α(E × R_+, m). Such
processes are called mixed fractional motions (see [10]). When g ≥ 0 a.e., the process
{X_t}_{t∈R_+} is max-associable, and Corollary III.12 implies that the associated α-Frechet
process is H-self-similar.
Example III.15 (Association of Chentzov SαS random fields). Recall that {X_t}_{t∈R^n}
is a Chentzov SαS random field, if

    {X_t}_{t∈R^n} ≡ {M_{α,+}(V_t)}_{t∈R^n} =^d { ∫_S 1_{V_t}(u) M_{α,+}(du) }_{t∈R^n}.

Here, 0 < α < 2, (S, µ) is a measure space and {V_t, t ∈ R^n} is a family of measurable
sets such that µ(V_t) < ∞ for all t ∈ R^n (see Ch. 8 in [93]). Since 1_{V_t}(u) ≥ 0, all
Chentzov SαS random fields are max-associable.
We conclude this section with some examples of SαS processes that are not max-
associable. In particular, recall that the SαS processes with stationary increments
(zero at t = 0) characterized by dissipative flows were shown in [103] to have the repre-
sentation

(3.15)    {X_t}_{t∈R} =^d { ∫_E ∫_R (G(x, t + u) − G(x, u)) M_{α,+}(dx, du) }_{t∈R}.

Here, (E, E, ν) is a standard Lebesgue space, M_{α,+}, α ∈ (0, 2), is an SαS random
measure with control measure m(dx, du) = ν(dx)du, and G : E × R → R is a
measurable function such that, for all t ∈ R,

    G_t(x, u) := G(x, t + u) − G(x, u), x ∈ E, u ∈ R,

belongs to L^α(E × R, m). The process {X_t}_{t∈R} in (3.15) is called a mixed moving
average with stationary increments. The following result provides a partial character-
ization of the max-associable SαS processes {X_t}_{t∈T} which have the representation
(3.15). We shall suppose that E is equipped with a metric ρ and endow E × R with
the product topology.
Proposition III.16. Consider an SαS process {X_t}_{t∈R} with representation (3.15).
Suppose there exists a closed set N ⊂ E × R, such that m(N) = 0 and the function
G is continuous at all (x, u) ∈ N^c := E × R \ N, w.r.t. the product topology. Then,
{X_t}_{t∈R} is max-associable, if and only if

(3.16)    G(x, u) = f(x) 1_{A_x}(u) + c(x)  on N^c.

Namely, for all x ∈ E, G(x, u) can take at most two values on N^c.

Proof. By Theorem III.13, {X_t}_{t∈R} is max-associable, if and only if for all t_1, t_2 ∈ R,

(3.17)    G_{t_1}(x, u) G_{t_2}(x, u) = (G(x, t_1 + u) − G(x, u))(G(x, t_2 + u) − G(x, u)) ≥ 0,
          m-a.e. (x, u) ∈ E × R.
First, we show the 'if' part. Define G̃(x, u) := G(x, u) (given by (3.16)) on N^c
and G̃(x, u) := f(x) 1_{A_x}(u) + c(x) on N (if A_x and c(x) are not defined, then set
G̃(x, u) = 0). Set G̃_t(x, u) = G̃(x, u + t) − G̃(x, u). Note that {G̃_t}_{t∈R} is another
spectral representation of {X_t}_{t∈R} and, for all (x, u), {1_{A_x}(u + t) − 1_{A_x}(u)}_{t∈R} can
take at most 2 values, one of which is 0. This observation implies (3.17) with G_t(x, u)
replaced by G̃_t(x, u), whence {X_t}_{t∈R} is max-associable.

Next, we prove the 'only if' part. We show that (3.17) is violated, if G(x, u) takes
more than 2 different values on ({x} × R) ∩ N^c for some x ∈ E. Suppose there exist
x ∈ E, u_i ∈ R such that (x, u_i) ∈ N^c and the g^x_i := G(x, u_i) are mutually different, for
i = 1, 2, 3. Without loss of generality we may suppose that g^x_1 < g^x_2 < g^x_3.
Then, by the continuity of G, there exists ε > 0 such that the sets B_i := B(x, ε) × (u_i − ε, u_i +
ε), i = 1, 2, 3, are disjoint, where B(x, ε) := {y ∈ E : ρ(x, y) < ε} and ρ is the metric
on E, and

(3.18)    sup_{B_1 ∩ N^c} G(x, u) < inf_{B_2 ∩ N^c} G(x, u) ≤ sup_{B_2 ∩ N^c} G(x, u) < inf_{B_3 ∩ N^c} G(x, u).

Put t_1 = u_1 − u_2 and t_2 = u_3 − u_2. Inequality (3.18) implies that G_{t_1}(x, u) G_{t_2}(x, u) < 0
on B_2 ∩ N^c. This, in view of Theorem III.13, contradicts the max-associability. We
have thus shown (3.16).
We now give two classes of SαS processes which, by Proposition III.16, cannot be
associated to any α-Frechet process.
Example III.17 (Non-associability of linear fractional stable motions). The linear
fractional stable motions (see Ch. 7.4 in [93]) have the following spectral representa-
tions:

    {X_t}_{t∈R} =^d { ∫_R [ a( (t + u)_+^{H−1/α} − u_+^{H−1/α} ) + b( (t + u)_−^{H−1/α} − u_−^{H−1/α} ) ] M_{α,+}(du) }_{t∈R}.

Here H ∈ (0, 1), α ∈ (0, 2), H ≠ 1/α, a, b ∈ R and |a| + |b| > 0. By Proposition III.16,
these processes are not max-associable.
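The failure of (3.13) for a linear fractional stable motion can be seen numerically: with b = 0, the increment kernel g_t(u) = (t + u)_+^{H−1/α} − u_+^{H−1/α} is nonnegative for t > 0 but takes negative values for t < 0, so g_{t_1} g_{t_2} < 0 somewhere once t_1 > 0 > t_2. A sketch (parameter values are chosen for illustration, with H − 1/α > 0):

```python
import numpy as np

H, alpha = 0.9, 1.2
kappa = H - 1.0 / alpha          # ~ 0.067 > 0 for these illustrative values

def g(t, u):
    """LFSM increment kernel with a = 1, b = 0."""
    return np.maximum(t + u, 0.0) ** kappa - np.maximum(u, 0.0) ** kappa

u = 0.5
# opposite signs at the same u: condition (3.13) fails on T = R
assert g(1.0, u) > 0 > g(-1.0, u)
```

Restricted to t ∈ R_+ this kernel is nonnegative for every u, consistent with Remark III.19.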
Example III.18 (Non-associability of Telecom processes). The Telecom process
offers an extension of fractional Brownian motion consistent with heavy-tailed fluc-
tuations. It arises as a large-scale limit of renewal reward processes, and it can be obtained
by choosing the distribution of the rewards accordingly (see [57] and [72]). A Telecom
process {X_t}_{t∈R} has the following representation

    {X_t}_{t∈R} =^d { ∫_R ∫_R e^{(H−1)s/α} ( F(e^s(t + u)) − F(e^s u) ) M_{α,+}(ds, du) }_{t∈R},

where 1 < α < 2, 1/α < H < 1, F(z) = (z ∧ 0 + 1)_+, z ∈ R, and the SαS random
measure M_{α,+} has control measure m_α(ds, du) = ds du. By Proposition III.16,
the Telecom process is not max-associable.
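The same sign obstruction appears for the Telecom kernel: F is nondecreasing, so F(e^s(t + u)) − F(e^s u) is nonnegative for t > 0 and nonpositive for t < 0, and both signs occur on sets of positive measure. A small check (only the kernel F from the example is used; the sample points are arbitrary):

```python
import math

def F(z):
    """Kernel from the example: F(z) = (min(z, 0) + 1)_+,
    i.e. 0 for z <= -1, 1 + z on [-1, 0], and 1 for z >= 0."""
    return max(min(z, 0.0) + 1.0, 0.0)

def increment(t, s, u):
    """F(e^s (t + u)) - F(e^s u); since F is nondecreasing, its sign matches the sign of t."""
    return F(math.exp(s) * (t + u)) - F(math.exp(s) * u)

# both signs occur, so the product condition (3.13) fails on T = R
assert increment(1.0, 0.0, -0.5) > 0 > increment(-1.0, 0.0, -0.5)
```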
Remark III.19. It is important that the index set T in Proposition III.16 is the entire real
line R. Indeed, in both Examples III.17 and III.18, when the time index is restricted
to the half-line T = R_+ (or T = R_−), the processes {X_t}_{t∈T} satisfy condition (3.13)
and are therefore max-associable.
3.4 Association of Classifications
In this section, we show how to apply the association technique to relate various
classification results for SαS and α-Frechet processes. Note that many classifications
of SαS (as well as α-Frechet) processes are induced by suitable decompositions of the
measure space (S, µ). The following theorem provides an essential tool for translating
classification results from SαS to α-Frechet processes, and vice versa.
Theorem III.20. Suppose an SαS process {X_t}_{t∈T} and an α-Frechet process {Y_t}_{t∈T}
are associated by two spectral representations {f^{(i)}_t}_{t∈T} ⊂ L^α_+(S_i, µ_i), i = 1, 2. That
is,

    {X_t}_{t∈T} =^d { ∫_{S_i} f^{(i)}_t dM^{(i)}_{α,+} }_{t∈T}   and   {Y_t}_{t∈T} =^d { ∫^e_{S_i} f^{(i)}_t dM^{(i)}_{α,∨} }_{t∈T},  i = 1, 2.
Then, for any measurable subsets A_i ⊂ S_i, i = 1, 2, we have

    { ∫_{A_1} f^{(1)}_t dM^{(1)}_{α,+} }_{t∈T} =^d { ∫_{A_2} f^{(2)}_t dM^{(2)}_{α,+} }_{t∈T}

if and only if

    { ∫^e_{A_1} f^{(1)}_t dM^{(1)}_{α,∨} }_{t∈T} =^d { ∫^e_{A_2} f^{(2)}_t dM^{(2)}_{α,∨} }_{t∈T}.

The proof follows from Theorem III.2 by restricting the measures µ_i onto the sets A_i, i = 1, 2.
For an SαS process {X_t}_{t∈T} with spectral functions {f_t}_{t∈T} ⊂ L^α(S, µ), a de-
composition typically takes the form {X_t}_{t∈T} =^d { ∑_{j=1}^n X^{(j)}_t }_{t∈T}, where X^{(j)}_t =
∫_{A^{(j)}} f_t dM_{α,+} for all t ∈ T, and A^{(j)}, 1 ≤ j ≤ n, are disjoint subsets of S = ∪_{j=1}^n A^{(j)}.
The components {X^{(j)}_t}_{t∈T}, 1 ≤ j ≤ n, are independent SαS processes. When
{X_t}_{t∈T} is max-associable, Theorem III.20 enables us to define the associated de-
composition for the α-Frechet process {Y_t}_{t∈T} associated with {X_t}_{t∈T}. Namely,
we have {Y_t}_{t∈T} =^d { ∨_{j=1}^n Y^{(j)}_t }_{t∈T}, where Y^{(j)}_t = ∫^e_{A^{(j)}} |f_t| dM_{α,∨} for all t ∈ T. Con-
versely, given a decomposition for α-Frechet processes, we can define a corresponding
decomposition for the associated SαS processes.
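On a discrete space, restricting the integrals to a partition {A^{(j)}} splits the exponent in both the SαS characteristic function and the Frechet distribution function additively, which is exactly what makes the sum- and max-decompositions parallel. A hedged numerical illustration (the partition and all values are hypothetical):

```python
import numpy as np

alpha = 1.5
mu = np.array([0.5, 1.0, 2.0, 0.25])          # weights of a 4-point spectral space S
f = np.array([1.0, -2.0, 0.5, 3.0])           # values f_t(s) of one spectral function
parts = [np.array([0, 1]), np.array([2, 3])]  # disjoint sets A^(1), A^(2) with union S

# exponent of the full integral vs. the sum of the exponents over the parts
total = np.sum(np.abs(f) ** alpha * mu)
by_parts = sum(np.sum(np.abs(f[A]) ** alpha * mu[A]) for A in parts)
assert abs(total - by_parts) < 1e-12

# hence exp(-total) factors into a product over the partition: the components
# defined on the A^(j) are independent, for the sum- and the max-stable side alike
prod = np.prod([np.exp(-np.sum(np.abs(f[A]) ** alpha * mu[A])) for A in parts])
assert abs(np.exp(-total) - prod) < 1e-12
```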
Example III.21 (Conservative-dissipative decomposition). In seminal work, [83]
established the conservative-dissipative decomposition for SαS processes. Namely, for
any {X_t}_{t∈T} with representation (3.1), one has

    {X_t}_{t∈T} =^d { X^C_t + X^D_t }_{t∈T},

where X^C_t = ∫_C f_t dM_{α,+} and X^D_t = ∫_D f_t dM_{α,+} for all t ∈ T, with C and D defined
by

(3.19)    C := { s ∈ S : ∫_T |f_t(s)|^α λ(dt) = ∞ }   and   D := S \ C.

When {X_t}_{t∈T} is stationary, the sets C and D correspond to the Hopf decom-
position S = C ∪ D of the non-singular flow associated with {X_t}_{t∈T} (see [83]
for details). Therefore, {X^C_t}_{t∈T} and {X^D_t}_{t∈T} are referred to as the conservative
and dissipative components of {X_t}_{t∈T}, respectively. Theorem III.20 enables us to
use (3.19) to establish the parallel decomposition of the associated α-Frechet process
{Y_t}_{t∈T}. Namely, for the associated {Y_t}_{t∈T}, we have {Y_t}_{t∈T} =^d { Y^C_t ∨ Y^D_t }_{t∈T},
where Y^C_t = ∫^e_C |f_t| dM_{α,∨} and Y^D_t = ∫^e_D |f_t| dM_{α,∨} for all t ∈ T. This decomposition
was established in [109] by using different tools.
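The defining integral in (3.19) can be probed numerically: for a given kernel, the profile T ↦ ∫_{−T}^{T} |f_t(s)|^α dt stays bounded at points s ∈ D and diverges at points s ∈ C. A sketch with two toy kernels (both kernels, the grid, and the classification heuristic are illustrative, not from the text):

```python
import numpy as np

def profile(kernel, s, T, alpha, dt=0.01):
    """Approximate int_{-T}^{T} |f_t(s)|^alpha dt on a regular grid in t."""
    t = np.arange(-T, T, dt)
    return float(np.sum(np.abs(kernel(t, s)) ** alpha) * dt)

# moving-average kernel f_t(s) = 1_{[0,1]}(s - t): profile bounded in T, so s lies in D
ma = lambda t, s: ((s - t >= 0) & (s - t <= 1)).astype(float)
# kernel constant in t, f_t(s) = 1: profile grows linearly in T, so s lies in C
const = lambda t, s: np.ones_like(t)
```

Comparing the profile at T and 2T separates the two regimes: it is essentially flat for the moving average and doubles for the constant kernel.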
Remark III.22. Similar associations can be established for other decompositions,
including the positive-null decomposition (see [92] and [109]), and the decompositions of
the above two types for random fields (T = Z^d or R^d; see [91] and [108]). A more
specific decomposition for SαS processes with representation (3.15) was developed
in [70], and one can obtain the corresponding decomposition for the associated α-
Frechet process by Theorem III.20.
3.5 Proofs of Auxiliary Results
We first need the following lemma.
Lemma III.23. If F ⊂ L^α_+(S, µ), then

(i) ρ(F) = ρ(span_+(F)) = ρ(∨-span(F)), and

(ii) for any f^{(1)} ∈ span_+(F) and f^{(2)} ∈ ∨-span(F), f^{(1)}/f^{(2)} ∈ ρ(F), i.e., f^{(1)}/f^{(2)} is ρ(F)-measurable.
Proof. (i) First, for any f_i, g_i ∈ F, a_i ≥ 0, b_i ≥ 0, i ∈ N, we have

    { (∨_{i∈N} a_i f_i) / (∨_{j∈N} b_j g_j) ≤ x } = ∩_{i∈N} { a_i f_i / (∨_{j∈N} b_j g_j) ≤ x }
    = ∩_{i∈N} ∩_{k∈N} ∪_{j∈N} { a_i f_i / (b_j g_j) < x + 1/k },

hence ρ(∨-span(F)) ⊂ ρ(span_+(F)).
To show ρ(span_+(F)) ⊂ ρ(∨-span(F)), we shall first prove that ρ(span°_+(F)) ⊂
ρ(∨-span(F)), where span°_+(F) denotes the set of finite positive linear combinations of
functions in F. For all f_1, f_2, g_1 ∈ F and a_1, a_2, b_1 ≥ 0, we have

    { (a_1 f_1 + a_2 f_2) / (b_1 g_1) ≤ x } = ∪_{q∈Q} ( { a_1 f_1 / (b_1 g_1) ≤ q } ∩ { a_2 f_2 / (b_1 g_1) ≤ x − q } ).

This shows that (a_1 f_1 + a_2 f_2)/(b_1 g_1) is ρ(∨-span(F))-measurable. By using the fact
that F contains only nonnegative functions and since

    { b_1 g_1 / (a_1 f_1 + a_2 f_2) ≤ x } = { (a_1 f_1 + a_2 f_2) / (b_1 g_1) ≥ 1/x },  x > 0,

we similarly obtain that (a_1 f_1 + a_2 f_2)/(b_1 g_1 + b_2 g_2) is ρ(∨-span(F))-measurable. Sim-
ilar arguments can be used to show that (∑_{i=1}^n a_i f_i)/(∑_{i=1}^n b_i g_i) is ρ(∨-span(F))-
measurable for all a_i, b_i ≥ 0, f_i, g_i ∈ F, 1 ≤ i ≤ n.

We have thus shown that ρ(span°_+(F)) ⊂ ρ(∨-span(F)). If now f, g ∈ span_+(F),
then there exist two sequences f_n, g_n ∈ span°_+(F), such that f_n → f and g_n → g a.e.
Thus, h_n := f_n/g_n → h := f/g as n → ∞, a.e. Since the h_n are ρ(span°_+(F))-measurable
for all n ∈ N, so is h. Hence ρ(span_+(F)) = ρ(span°_+(F)) ⊂ ρ(∨-span(F)).
(ii) By the previous argument, it is enough to focus on finite linear and max-
linear combinations. Suppose f^{(1)} = ∑_{i=1}^n a_i f_i and f^{(2)} = ∨_{j=1}^p b_j g_j for some f_i, g_j ∈
F, a_i, b_j ≥ 0, 1 ≤ i ≤ n, 1 ≤ j ≤ p. Then, for all x > 0,

    { (∑_{i=1}^n a_i f_i) / (∨_{j=1}^p b_j g_j) < x } = ∪_{j=1}^p { ∑_{i=1}^n a_i f_i / g_j < x b_j } ∈ ρ(F).

It follows that f^{(1)}/f^{(2)} ∈ ρ(F).
Proof of Proposition III.8. First we show R_{e,+}(F_∨) ⊃ R_{e,+}(F_+), where F_∨ and F_+
are defined in (3.11). By (3.8), it suffices to show that, for any r_2 ∼ ρ(F_+) and f^{(2)} ∈ F_+,
there exist r_1 ∼ ρ(F_∨) and f^{(1)} ∈ F_∨, such that

(3.20)    r_1 f^{(1)} = r_2 f^{(2)}.

To obtain (3.20), we need the concept of full support. We say a function g has full
support in F (an arbitrary collection of functions defined on (S, µ)), if g ∈ F and for
all f ∈ F, µ(supp(f) \ supp(g)) = 0. Here supp(f) := {s ∈ S : f(s) ≠ 0}. By Lemma
3.2 in [109], there exists a function f^{(1)} ∈ F_∨ which has full support in F_∨. One can
show that this function also has full support in F_+. Indeed, let g ∈ F_+ be arbitrary.
Then, there exist g_n = ∑_{i=1}^{k_n} a_{ni} g_{ni}, a_{ni} ≥ 0, g_{ni} ∈ F ⊂ F_∨, such that g_n →_µ g as
n → ∞. Note that µ(supp(g_n) \ supp(f^{(1)})) = 0 for all n. Thus, for all ε > 0, we have
µ(|g_n − g| > ε) ≥ µ({|g| > ε} \ supp(f^{(1)})). Since µ(|g_n − g| > ε) → 0 as n → ∞, it
follows that µ({|g| > ε} \ supp(f^{(1)})) = 0 for all ε > 0, i.e., µ(supp(g) \ supp(f^{(1)})) = 0.
We have thus shown that f^{(1)} has full support in F_+.

Now, setting r_1 := r_2 f^{(2)}/f^{(1)}, we have (3.20). (Note that f^{(2)} = 0, µ-a.e. on
S \ supp(f^{(1)}). By setting 0/0 = 0, f^{(2)}/f^{(1)} is well defined.) Lemma III.23 (ii)
implies that f^{(2)}/f^{(1)} ∈ ρ(F), whence r_1 ∼ ρ(F) = ρ(F_∨). We have thus shown
R_{e,+}(F_∨) ⊃ R_{e,+}(F_+). In a similar way one can show R_{e,+}(F_∨) ⊂ R_{e,+}(F_+).
Proof of Proposition III.9. First, suppose (3.12) does not hold but (3.5) holds. Then,
without loss of generality, we can assume that there exists S^{(1)}_0 ⊂ S_1 such that
f^{(1)}_1(s) > 0, f^{(1)}_2(s) < 0 for all s ∈ S^{(1)}_0 and µ_1(S^{(1)}_0) > 0. It follows from (3.5)
that there exists a linear isometry U such that, by Theorem III.5, Uf^{(1)}_i = f^{(2)}_i =
T̄(r_i)U(f), with a certain f and r_i = f^{(1)}_i/f, for i = 1, 2. In particular, f can be taken
with full support. Note that sign(r_1) ≠ sign(r_2) on S^{(1)}_0. It follows that f^{(2)}_1 and f^{(2)}_2
have different signs on a set of positive measure (indeed, this set is the image of
S^{(1)}_0 under the regular set isomorphism T). This contradicts the fact that f^{(2)}_1 and
f^{(2)}_2 are both nonnegative on S_2.

On the other hand, suppose (3.12) is true. Define Uf^{(1)}_i := |f^{(1)}_i|. It follows
from (3.12) that U can be extended to a positive-linear isometry from L^α(S_1, µ_1) to
L^α_+(S_2, µ_2), which implies (3.5).
CHAPTER IV
Decomposability of Sum- and Max-stable Processes
In this chapter, we investigate the general decomposability problem for both SαS
and α-Frechet processes with 0 < α < 2. We first focus on SαS processes. Then,
by the association method introduced in Chapter III, the counterpart results for
α-Frechet processes are proved with little extra effort in Section 4.3.
Let X = {X_t}_{t∈T} be an SαS process. We are interested in the case when X can
be written as

(4.1)    {X_t}_{t∈T} =^d { X^{(1)}_t + ⋯ + X^{(n)}_t }_{t∈T},

where '=^d' stands for equality in finite-dimensional distributions, and X^{(k)} =
{X^{(k)}_t}_{t∈T}, k = 1, …, n, are independent SαS processes. We will write X =^d X^{(1)} +
⋯ + X^{(n)} in short, and each X^{(k)} will be referred to as a component of X. The sta-
bility property readily implies that (4.1) holds with X^{(k)} =^d n^{−1/α} X ≡ {n^{−1/α} X_t}_{t∈T}.
Components equal in finite-dimensional distributions to a constant multiple of X
will be referred to as trivial. We are interested in the general structure of non-trivial
SαS components of X.
Many important decompositions (4.1) of SαS processes (with non-trivial compo-
nents) are already available in the literature: see for example Cambanis et al. [12],
Rosinski [83], Rosinski and Samorodnitsky [86], Surgailis et al. [103], Pipiras and
Taqqu [70, 71], and Samorodnitsky [92], to name a few. These results were mo-
tivated by studies of various probabilistic and structural aspects of the underlying
SαS processes such as ergodicity, mixing, stationarity, self-similarity, etc. Notably,
Rosinski [83] established a fundamental connection between stationary SαS processes
and non-singular flows. He developed important tools based on minimal represen-
tations of SαS processes and inspired multiple decomposition results motivated by
connections to ergodic theory.
In this chapter, we adopt a different perspective. Our main goal is to charac-
terize all possible SαS decompositions (4.1). Our results show how the dependence
structure of an SαS process determines the structure of its components.
Consider SαS processes {X_t}_{t∈T} indexed by a complete separable metric space T,
with an integral representation

(4.2)    {X_t}_{t∈T} =^d { ∫_S f_t(s) M_α(ds) }_{t∈T},

with spectral functions {f_t}_{t∈T} ⊂ L^α(S, B_S, µ). Recall that for all n ∈ N, t_j ∈ T, a_j ∈ R,

(4.3)    E exp{ −i ∑_{j=1}^n a_j X_{t_j} } = exp{ −∫_S | ∑_{j=1}^n a_j f_{t_j} |^α dµ }.

Without loss of generality, we always assume that the spectral functions {f_t}_{t∈T} ⊂
L^α(S, B_S, µ) have full support, i.e., S = supp{f_t, t ∈ T}.

We first state the main result of this chapter. To this end, we recall that the ratio
σ-algebra of a spectral representation F = {f_t}_{t∈T} (of {X_t}) is defined as

(4.4)    ρ(F) ≡ ρ{f_t, t ∈ T} := σ( f_{t_1}/f_{t_2} : t_1, t_2 ∈ T ).
The following result characterizes the structure of all SαS decompositions.
Theorem IV.1. Suppose {X_t}_{t∈T} is an SαS process (0 < α < 2) with spectral
representation

    {X_t}_{t∈T} =^d { ∫_S f_t(s) M_α(ds) }_{t∈T},

with {f_t}_{t∈T} ⊂ L^α(S, B_S, µ). Let {X^{(k)}_t}_{t∈T}, k = 1, …, n, be independent SαS pro-
cesses.

(i) The decomposition

(4.5)    {X_t}_{t∈T} =^d { X^{(1)}_t + ⋯ + X^{(n)}_t }_{t∈T}

holds, if and only if there exist measurable functions r_k : S → [−1, 1], k =
1, …, n, such that

(4.6)    {X^{(k)}_t}_{t∈T} =^d { ∫_S r_k(s) f_t(s) M_α(ds) }_{t∈T},  k = 1, …, n.

In this case, necessarily ∑_{k=1}^n |r_k(s)|^α = 1, µ-almost everywhere on S.

(ii) If (4.5) holds, then the r_k's in (4.6) can be chosen to be non-negative and ρ(F)-
measurable. Such r_k's are unique modulo µ.
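The necessity of ∑_k |r_k|^α = 1 in part (i) can be illustrated on a discrete space: if the r_k satisfy this identity pointwise, the α-th powers of the component scales in (4.6) sum back to the scale of X for every linear combination, matching (4.3). A numerical sketch (all functions and values are hypothetical):

```python
import numpy as np

alpha = 1.3
mu = np.array([0.5, 1.0, 2.0])
F = np.array([[1.0, -2.0, 0.5],
              [0.0, 1.0, 3.0]])                  # rows f_{t_1}, f_{t_2} on S = {0, 1, 2}
r1 = np.array([0.2, 0.9, 0.5])
r2 = (1.0 - r1 ** alpha) ** (1.0 / alpha)        # so that r1^alpha + r2^alpha = 1 pointwise

def scale_alpha(a, r):
    """alpha-th power of the scale of sum_j a_j * (component with modulation r)."""
    lin = a @ F
    return float(np.sum(np.abs(r * lin) ** alpha * mu))

for a in (np.array([1.0, 0.0]), np.array([-2.0, 3.0])):
    full = scale_alpha(a, np.ones(3))            # scale^alpha of sum_j a_j X_{t_j}
    assert abs(full - (scale_alpha(a, r1) + scale_alpha(a, r2))) < 1e-10
```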
The rest of the chapter is structured as follows. In Section 4.1, we provide some
consequences of Theorem IV.1 for general SαS processes. The stationary case is
discussed in Section 4.2. Parallel results on max-stable processes are presented in
Section 4.3. The proof of Theorem IV.1 is given in Section 4.4.
4.1 SαS Components
In this section, we provide a few examples to illustrate the consequences of our
main result Theorem IV.1. The first one is about SαS processes with independent
increments. Recall that we always assume 0 < α < 2.
Corollary IV.2. Let X = {X_t}_{t∈R_+} be an arbitrary SαS process with independent
increments and X_0 = 0. Then all SαS components of X also have independent
increments.

Proof. Write m(t) = ‖X_t‖_α^α, where ‖X_t‖_α denotes the scale coefficient of the SαS
random variable X_t. By the independence of the increments of X, it follows that
m is a non-decreasing function with m(0) = 0. First, we consider the simple case
when m(t) is right-continuous. Consider the Borel measure µ on [0, ∞) determined
by µ([0, t]) := m(t). The independence of the increments of X readily implies that
X has the representation:

(4.7)    {X_t}_{t∈R_+} =^d { ∫_0^∞ 1_{[0,t]}(s) M_α(ds) }_{t∈R_+},

where M_α is an SαS random measure with control measure µ.

Now, for any SαS component Y (≡ X^{(k)}) of X, we have that (4.6) holds with
f_t(s) = 1_{[0,t]}(s) and some function r(s) (≡ r_k(s)). This implies that the increments of
Y are also independent since, for example, for any 0 ≤ t_1 < t_2, the spectral functions
r(s) f_{t_1}(s) = r(s) 1_{[0,t_1]}(s) and r(s) f_{t_2}(s) − r(s) f_{t_1}(s) = r(s) 1_{(t_1,t_2]}(s) have disjoint
supports.

It remains to prove the general case. The difficulty is that m(t) may have (at
most countably many) discontinuities, and a representation as in (4.7) is not always
possible. Nevertheless, introduce the right-continuous functions t → m_i(t), i = 0, 1,

    m_0(t) := m(t+) − ∑_{τ≤t} (m(τ) − m(τ−))   and   m_1(t) := ∑_{τ≤t} (m(τ) − m(τ−)),

and let M_α be an SαS random measure on R_+ × {0, 1} with control measure µ([0, t] ×
{i}) := m_i(t), i = 0, 1, t ∈ R_+. In this way, as in (4.7) one can show that

    {X_t}_{t∈T} =^d { ∫_{R_+×{0,1}} ( 1_{[0,t)×{0}}(s, v) + 1_{[0,t]×{1}}(s, v) ) M_α(ds, dv) }_{t∈T}.

The rest of the proof remains similar and is omitted.
Remark IV.3. Theorem IV.1 and Corollary IV.2 do not apply to the Gaussian case
(α = 2). For the sake of simplicity, take T = {1, 2} and n = 2 (two components)
in (4.1). In this case, all the (in)dependence information of the mean-zero Gaussian
process {X_t}_{t∈T} is characterized by the covariance matrix Σ of the Gaussian vector
(X^{(1)}_1, X^{(2)}_1, X^{(1)}_2, X^{(2)}_2). A counterexample can be easily constructed by choosing
Σ appropriately. This reflects the drastic difference between the geometries of L^α spaces
for α < 2 and α = 2.
The next natural question to ask is whether two SαS processes have common
components. Namely, the SαS process Z is a common component of the SαS processes
X and Y, if X =^d Z + X^{(1)} and Y =^d Z + Y^{(1)}, where X^{(1)} and Y^{(1)} are both SαS
processes independent of Z.

To study the common components, the co-spectral point of view introduced in
Wang and Stoev [111] is helpful. Consider a measurable SαS process {X_t}_{t∈T} with
spectral representation (4.2), where the index set T is equipped with a measure λ
defined on the σ-algebra B_T. Without loss of generality, we take f(·, ·) : (S × T, B_S ×
B_T) → (R, B_R) to be jointly measurable (see Theorems 9.4.2 and 11.1.1 in [93]).
The co-spectral functions, f_·(s) ≡ f(s, ·), are elements of L^0(T) ≡ L^0(T, B_T, λ), the
space of B_T-measurable functions modulo λ-null sets. The co-spectral functions are
indexed by s ∈ S, in contrast to the spectral functions f_t(·) indexed by t ∈ T. Recall
also that a set P ⊂ L^0(T) is a cone, if cP = P for all c ∈ R \ {0} and 0 ∈ P. We
write {f_·(s)}_{s∈S} ⊂ P modulo µ, if for µ-almost all s ∈ S, f_·(s) ∈ P.
Proposition IV.4. Let X^{(i)} = {X^{(i)}_t}_{t∈T} be SαS processes with measurable represen-
tations {f^{(i)}_t}_{t∈T} ⊂ L^α(S_i, B_{S_i}, µ_i), i = 1, 2. If there exist two cones P_i ⊂ L^0(T), i =
1, 2, such that {f^{(i)}_·(s)}_{s∈S_i} ⊂ P_i modulo µ_i, for i = 1, 2, and P_1 ∩ P_2 = {0}, then
the two processes have no common component.

Proof. Suppose Z is a component of X^{(1)}. Then, by Theorem IV.1, Z has a spectral
representation {r^{(1)} f^{(1)}_t}_{t∈T}, for some B_{S_1}-measurable function r^{(1)}. By the definition
of cones, the co-spectral functions of Z are included in P_1, i.e., {r^{(1)}(s) f^{(1)}_·(s)}_{s∈S_1} ⊂
P_1 modulo µ_1. If Z is also a component of X^{(2)}, then by the same argument,
{r^{(2)}(s) f^{(2)}_·(s)}_{s∈S_2} ⊂ P_2 modulo µ_2, for some B_{S_2}-measurable function r^{(2)}. Since
P_1 ∩ P_2 = {0}, it then follows that µ_i(supp(r^{(i)})) = 0, i = 1, 2, or equivalently Z = 0,
the degenerate case.
We conclude this section with an application to SαS moving averages.
Corollary IV.5. Let X^(1) and X^(2) be two SαS moving averages

{X_t^(i)}_{t∈R^d} d= { ∫_{R^d} f^(i)(t+s) M_α^(i)(ds) }_{t∈R^d}

with kernel functions f^(i) ∈ L^α(R^d, B_{R^d}, λ), i = 1, 2. Then, either

(4.8) X^(1) d= c X^(2) for some c > 0,

or X^(1) and X^(2) have no common component. Moreover, (4.8) holds, if and only if, for some τ ∈ R^d and ϵ ∈ {±1},

(4.9) f^(1)(s) = c ϵ f^(2)(s + τ), for λ-almost all s ∈ R^d.
Proof. Clearly (4.9) implies (4.8). Conversely, if (4.8) holds, then (4.9) follows as in the proof of Corollary 4.2 in [111], with slight modification (the proof therein was for positive cones). When (4.8) (or equivalently (4.9)) does not hold, consider the smallest cones containing {f^(i)(s+·)}_{s∈R^d}, i = 1, 2, respectively. Since these two cones have trivial intersection {0}, Proposition IV.4 implies that X^(1) and X^(2) have no common component.
4.2 Stationary SαS Components and Flows

Let X = {X_t}_{t∈T} be a stationary SαS process with representation (4.2), where now T = R^d or T = Z^d, d ∈ N. The seminal work of Rosinski [83] established an important connection between stationary SαS processes and flows. A family of functions {φ_t}_{t∈T} is said to be a flow on (S, B_S, µ), if for all t_1, t_2 ∈ T, φ_{t_1+t_2}(s) = φ_{t_1}(φ_{t_2}(s)) for all s ∈ S, and φ_0(s) = s for all s ∈ S. We say that a flow is non-singular, if µ(φ_t(A)) = 0 is equivalent to µ(A) = 0, for all A ∈ B_S, t ∈ T. Given a flow {φ_t}_{t∈T}, {c_t}_{t∈T} is said to be a cocycle if c_{t+τ}(s) = c_t(s) c_τ(φ_t(s)) µ-almost surely for all t, τ ∈ T, and c_t ∈ {±1} for all t ∈ T.
To understand the relation between the structure of stationary SαS processes
and flows, it is necessary to work with minimal representations of SαS processes,
introduced by Hardin [45, 46]. The minimality assumption is crucial in many results
on the structure of SαS processes, although it is in general difficult to check (see
e.g. Rosinski [85] and Pipiras [69]).
Definition IV.6. The spectral functions F ≡ {f_t}_{t∈T} (and the corresponding spectral representation (4.2)) are said to be minimal, if the ratio σ-algebra ρ(F) in (4.4) is equivalent to B_S, i.e., for all A ∈ B_S, there exists B ∈ ρ(F) such that µ(A∆B) = 0, where A∆B = (A \ B) ∪ (B \ A).
Rosinski ([83], Theorem 3.1) proved that if {f_t}_{t∈T} is minimal, then there exist a modulo µ unique non-singular flow {φ_t}_{t∈T} and a corresponding cocycle {c_t}_{t∈T}, such that for all t ∈ T,

(4.10) f_t(s) = c_t(s) ( d(µ∘φ_t)/dµ (s) )^{1/α} f_0(φ_t(s)), µ-almost everywhere.

Conversely, suppose that (4.10) holds for some non-singular flow {φ_t}_{t∈T}, a corresponding cocycle {c_t}_{t∈T}, and a function f_0 ∈ L^α(S, µ) ({f_t}_{t∈T} not necessarily minimal). Then, clearly the SαS process X in (4.2) is stationary. In this case, we shall say that X is generated by the flow {φ_t}_{t∈T}.
Consider now an SαS decomposition (4.1) of X, where the independent components {X_t^(k)}_{t∈T} are stationary. This will be referred to as a stationary SαS decomposition, and the {X_t^(k)}_{t∈T}'s as stationary components of X. Our goal in this section is to characterize the structure of all possible stationary components. This characterization involves the invariant σ-algebra with respect to the flow {φ_t}_{t∈T}:

(4.11) F_φ = { A ∈ B_S : µ(φ_τ(A)∆A) = 0, for all τ ∈ T }.

Given a function g and a σ-algebra G, we write g ∈ G, if g is measurable with respect to G.
Theorem IV.7. Let {X_t}_{t∈T} be a stationary and measurable SαS process with spectral functions {f_t}_{t∈T} as in (4.10), i.e.,

X_t = ∫_S c_t(s) ( d(µ∘φ_t)/dµ (s) )^{1/α} f_0(φ_t(s)) M_α(ds), t ∈ T.

(i) Suppose that {X_t}_{t∈T} has a stationary SαS decomposition

(4.12) {X_t}_{t∈T} d= { X_t^(1) + ··· + X_t^(n) }_{t∈T}.

Then, each component {X_t^(k)}_{t∈T} has a representation

(4.13) {X_t^(k)}_{t∈T} d= { ∫_S r_k(s) f_t(s) M_α(ds) }_{t∈T}, k = 1, ..., n,

where the r_k's can be chosen to be non-negative and ρ(F)-measurable. This choice is unique modulo µ and these r_k's are φ-invariant, i.e. r_k ∈ F_φ.

(ii) Conversely, for any φ-invariant r_k's such that Σ_{k=1}^n |r_k(s)|^α = 1, µ-almost everywhere on S, decomposition (4.12) holds with the X^(k)'s as in (4.13).
Proof. By using (4.10), a change of variables, and the φ-invariance of the functions r_k, one can show that the X^(k)'s in (4.13) are stationary. This fact and Theorem IV.1 yield part (ii).

We now show (i). Suppose that X^(k) is a stationary (SαS) component of X. Theorem IV.1 implies that there exists a unique modulo µ, non-negative and ρ(F)-measurable function r_k for which (4.13) holds. By the stationarity of X^(k), it also follows that for all τ ∈ T, {r_k(s) f_{t+τ}(s)}_{t∈T} is also a spectral representation of X^(k). By the flow representation (4.10), it follows that for all t, τ ∈ T,

(4.14) f_{t+τ}(s) = c_τ(s) f_t(φ_τ(s)) ( d(µ∘φ_τ)/dµ )^{1/α}(s), µ-almost everywhere,

and we obtain that for all τ, t_j ∈ T, a_j ∈ R, j = 1, ..., n:

∫_S |Σ_{j=1}^n a_j r_k(s) f_{t_j+τ}(s)|^α µ(ds) = ∫_S |Σ_{j=1}^n a_j r_k(φ_{−τ}(s)) f_{t_j}(s)|^α µ(ds),

which shows that {r_k(φ_{−τ}(s)) f_t(s)}_{t∈T} is also a representation of X^(k), for all τ ∈ T.

Observe that from (4.14), for all t_1, t_2, τ ∈ T and λ ∈ R,

{ f_{t_1+τ}/f_{t_2+τ} ≤ λ } = φ_τ^{−1}( { f_{t_1}/f_{t_2} ≤ λ } ) modulo µ.

It then follows that for all τ ∈ T, the σ-algebra φ_{−τ}(ρ(F)) ≡ (φ_τ)^{−1}(ρ(F)) is equivalent to ρ(F). This, by the uniqueness of r_k ∈ ρ(F) (Theorem IV.1), implies that r_k∘φ_τ = r_k modulo µ, for all τ. Then, r_k ∈ F_φ follows from a standard measure-theoretic argument. The proof is complete.
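The normalization in Theorem IV.7(ii) can be illustrated on a discrete space: if Σ_k |r_k(s)|^α = 1 pointwise, the component scale integrals ∫|r_k f|^α dµ add up exactly to ∫|f|^α dµ, which is the computation that makes (4.12) hold for the associated characteristic functions (φ-invariance additionally makes each component stationary). The following sketch uses an arbitrary three-point S; all numeric values are hypothetical.

```python
# Discrete illustration of the normalization sum_k |r_k|^alpha = 1 in
# Theorem IV.7(ii): component scale integrals add up to the total one.
alpha = 1.2
mu = [0.4, 0.1, 0.5]   # weights of a three-point space S (hypothetical)
f = [1.0, 3.0, 2.0]    # a spectral function evaluated on S (hypothetical)

w = [0.3, 0.6, 0.5]    # arbitrary values in (0, 1)
r1 = [ws ** (1 / alpha) for ws in w]          # r1(s)^alpha = w(s)
r2 = [(1 - ws) ** (1 / alpha) for ws in w]    # r2(s)^alpha = 1 - w(s)

total = sum(abs(fs) ** alpha * ms for fs, ms in zip(f, mu))
components = sum(
    sum(abs(rs * fs) ** alpha * ms for rs, fs, ms in zip(rk, f, mu))
    for rk in (r1, r2)
)
assert abs(total - components) < 1e-9  # scale integrals match
```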
Remark IV.8. The structure of the stationary SαS components of stationary SαS processes (including random fields) has attracted much interest since the seminal work of Rosinski [83, 84]. See, for example, Pipiras and Taqqu [71], Samorodnitsky [92], Roy [87, 88], Roy and Samorodnitsky [91], Roy [89, 90], and Wang et al. [108]. In view of Theorem IV.7, the components considered in these works correspond to indicator functions r_k(s) = 1_{A_k}(s) of certain disjoint flow-invariant sets A_k arising from ergodic theory (see e.g. Krengel [55] and Aaronson [1]).
Theorem IV.7 can be applied to check indecomposability of stationary SαS pro-
cesses. Recall that a stationary SαS process is said to be indecomposable, if all its
stationary SαS components are trivial (i.e. constant multiples of the original process).
Corollary IV.9. Consider {X_t}_{t∈T} as in Theorem IV.7. If F_φ is trivial, then {X_t}_{t∈T} is indecomposable. The converse is true when, in addition, {f_t}_{t∈T} is minimal.
Proof. If F_φ is trivial, the result follows from Theorem IV.7. Conversely, let {f_t}_{t∈T} be minimal and X indecomposable, and suppose, toward a contradiction, that F_φ is not trivial. Then one can choose A ∈ F_φ such that µ(A) > 0 and µ(S \ A) > 0. Consider

{X_t^A}_{t∈T} d= { ∫_S 1_A(s) f_t(s) M_α(ds) }_{t∈T}.

By Theorem IV.7, X^A is a stationary component of X. It suffices to show that X^A is a non-trivial component of X, which contradicts the indecomposability.

Suppose that X^A is trivial, i.e., cX^A d= X for some c > 0. Then, by Theorem IV.7, cX^A has a representation as in (4.13), with r_k := c1_A. On the other hand, since cX^A d= X, we also have the trivial representation with r_k := 1. Since A ∈ ρ(F), the uniqueness of r_k implies that 1 = c1_A modulo µ, which contradicts µ(A^c) > 0. Therefore, X^A is non-trivial.
The indecomposable stationary SαS processes can be seen as the elementary build-
ing blocks for the construction of general stationary SαS processes. We conclude this
section with two examples.
Example IV.10 (Mixed moving averages). Consider a mixed moving average in the sense of [102]:

(4.15) {X_t}_{t∈R^d} d= { ∫_{R^d×V} f(t+s, v) M_α(ds, dv) }_{t∈R^d}.

Here, M_α is an SαS random measure on R^d × V with the control measure λ × ν, where λ is the Lebesgue measure on (R^d, B_{R^d}), ν is a probability measure on (V, B_V), and f(s, v) ∈ L^α(R^d × V, B_{R^d×V}, λ × ν). Given a disjoint union V = ∪_{j=1}^n A_j, where the A_j's are measurable subsets of V, the mixed moving average can clearly be decomposed as in (4.12) with

{X_t^(k)}_{t∈R^d} d= { ∫_{R^d×A_k} f(t+s, v) M_α(ds, dv) }_{t∈R^d}, for all k = 1, ..., n.
Any moving average process

(4.16) {X_t}_{t∈R^d} d= { ∫_{R^d} f(t+s) M_α(ds) }_{t∈R^d}

trivially has a mixed moving average representation. The next result shows when the converse is true.
Corollary IV.11. The mixed moving average X in (4.15) is indecomposable, if and
only if it has a moving average representation as in (4.16).
Proof. By Corollary IV.9, the moving average process (4.16) is indecomposable, since
in this case φt(s) = t + s, t, s ∈ Rd and therefore Fφ is trivial. This proves the ‘if’
part.
Suppose now that X in (4.15) is indecomposable. In Section 5 of Pipiras [69] it was
shown that SαS processes with mixed moving average representations and stationary
increments also have minimal representations of the mixed moving average type. By
using similar arguments, one can show that this is also true for the class of stationary
mixed moving average processes.
Thus, without loss of generality, we assume that the representation in (4.15) is
minimal. Suppose now that there exists a set A ∈ BV with ν(A) > 0 and ν(Ac) > 0.
Since R^d × A and R^d × A^c are flow-invariant, we have the stationary decomposition {X_t}_{t∈R^d} d= {X_t^A + X_t^{A^c}}_{t∈R^d}, where

X_t^B := ∫_{R^d×V} 1_B(v) f(t+s, v) M_α(ds, dv), B ∈ {A, A^c}.

Note that both components X^A = {X_t^A}_{t∈R^d} and X^{A^c} = {X_t^{A^c}}_{t∈R^d} are non-zero because the representation of X has full support.

Now, since X is indecomposable, there exist positive constants c_1 and c_2, such that X d= c_1 X^A d= c_2 X^{A^c}. The minimality of the representation and Theorem IV.7 imply that c_1 1_A = c_2 1_{A^c} modulo ν, which is impossible. This contradiction shows that the set V cannot be partitioned into two disjoint sets of positive measure. That is, V is a singleton and the mixed moving average is in fact a moving average.
Example IV.12 (Doubly stationary processes). Consider a stationary process ξ = {ξ_t}_{t∈T} (T = Z^d) supported on the probability space (E, E, µ) with ξ_t ∈ L^α(E, E, µ). Without loss of generality, we may suppose that ξ_t(u) = ξ_0(φ_t(u)), where {φ_t}_{t∈T} is a µ-measure-preserving flow.

Let M_α be an SαS random measure on (E, E, µ) with control measure µ. The stationary SαS process X = {X_t}_{t∈T},

(4.17) X_t := ∫_E ξ_t(u) M_α(du), t ∈ T,

is said to be doubly stationary (see Cambanis et al. [11]). By Corollary IV.9, if ξ is ergodic, then X is indecomposable.
A natural and interesting question raised by a referee is: what happens when X is decomposable and hence ξ is non-ergodic? Can we have a direct integral decomposition of the process X into indecomposable components? The following remark partly addresses this question.
Remark IV.13. The doubly stationary SαS processes are a special case of stationary SαS processes generated by positively recurrent flows (actions). As shown in Samorodnitsky [92], Remark 2.6, each such stationary SαS process X = {X_t}_{t∈T} can be expressed through a measure-preserving flow (action) on a finite measure space. Namely,

(4.18) {X_t}_{t∈T} d= { ∫_E f_t(u) M_α^(µ)(du) }_{t∈T}, with f_t(u) := c_t(u) f_0(φ_t(u)),

where M_α^(µ) is an SαS random measure with a finite control measure µ on (E, E), φ = {φ_t}_{t∈T} is a µ-preserving flow (action), and {c_t}_{t∈T} is a cocycle with respect to φ. In the case when the cocycle is trivial (c_t ≡ 1) and µ(E) = 1, the process X is doubly stationary.
For simplicity, suppose that T = Z^d and without loss of generality let (E, E, µ) be a standard Lebesgue space with µ(E) = 1. The ergodic decomposition theorem (see e.g. Keller [52], Theorem 2.3.3) implies that there exist conditional probability distributions {µ_u}_{u∈E} with respect to the invariant σ-algebra such that φ is measure-preserving and ergodic with respect to the measures µ_u for µ-almost all u ∈ E. Let ν be another φ-invariant measure on (E, E) dominating the conditional probabilities µ_u, so that the Radon–Nikodym derivatives p(x, u) = (dµ_u/dν)(x) are jointly measurable on (E × E, E ⊗ E, ν × µ). Consider

g_t(x, u) = f_t(x) p(φ_t(x), u)^{1/α}.

Recall that ν and the µ_u are φ-invariant, whence

p(φ_t(x), u) = (dµ_u/dν)(φ_t(x)) = (dµ_u/dν)(x) = p(x, u), modulo ν × µ.
Thus, g_t(x, u) = f_t(x)(dµ_u/dν)^{1/α}(x), and for all a_j ∈ R, t_j ∈ T, j = 1, ..., n, we have

∫_{E×E} |Σ_{j=1}^n a_j g_{t_j}(x, u)|^α ν(dx)µ(du) = ∫_{E×E} |Σ_{j=1}^n a_j f_{t_j}(x)|^α (dµ_u/dν)(x) ν(dx)µ(du)
= ∫_{E×E} |Σ_{j=1}^n a_j f_{t_j}(x)|^α µ_u(dx)µ(du)
= ∫_E |Σ_{j=1}^n a_j f_{t_j}(x)|^α µ(dx),

where the last equality follows from the identity

∫_E h(x)µ(dx) = ∫_{E×E} h(x)µ_u(dx)µ(du), for all h ∈ L^1(E, E, µ).
We have thus shown that {X_t}_{t∈T} defined by (4.18) has another spectral representation

(4.19) {X_t}_{t∈T} d= { ∫_{E×E} g_t(x, u) M_α^(ν×µ)(dx, du) }_{t∈T},

where M_α^(ν×µ) is an SαS random measure on E × E with control measure ν × µ. It also follows that for µ-almost all u ∈ E, the process defined by

X_t^(u) := ∫_E g_t(x, u) M_α^(ν)(dx), t ∈ T,

is indecomposable, where M_α^(ν) has control measure ν. Indeed, as above, one can show that

{X_t^(u)}_{t∈T} d= { ∫_E f_t(x) M_α^(µ_u)(dx) }_{t∈T},

where M_α^(µ_u) has control measure µ_u. The ergodic decomposition theorem implies that the flow (action) φ is ergodic with respect to µ_u, which by Corollary IV.9 implies the indecomposability of X^(u) = {X_t^(u)}_{t∈T}. In this way, (4.19) parallels the mixed moving average representation for stationary SαS processes generated by dissipative flows (see e.g. Rosinski [83]).
Remark IV.14. The above construction of the decomposition (4.19) assumes the existence of a φ-invariant measure ν dominating all conditional probabilities µ_u, u ∈ E. If the measure µ, restricted to the invariant σ-algebra F_φ, is discrete, i.e. F_φ consists of countably many atoms under µ, then one can take ν ≡ µ. In this case, the process X is decomposed into a (possibly infinite) sum of its indecomposable components:

X_t = Σ_k ∫_{E_k} f_t(x) M_α^(µ)(dx),

where the E_k's are disjoint φ-invariant measurable sets, such that E = ∪_k E_k and φ|_{E_k} is ergodic, for each k. In this case, the E_k's are the atoms of F_φ.

In general, when µ|_{F_φ} is not discrete, the dominating measure ν, if it exists, may not be σ-finite. Indeed, since the φ_t's are ergodic for each µ_u, it follows that either µ_u = µ_{u′}, or µ_u and µ_{u′} are singular, for µ-almost all u, u′ ∈ E. Thus, if F_φ is "too rich", this singularity feature implies that the measure ν may not be chosen to be σ-finite.
4.3 Decomposability of Max-stable Processes

In this section, we state and prove some results on the (max-)decomposability of max-stable processes. Again, we focus on α-Frechet processes.

Let Y = {Y_t}_{t∈T} be an α-Frechet process. If

(4.20) {Y_t}_{t∈T} d= { Y_t^(1) ∨ ··· ∨ Y_t^(n) }_{t∈T},

for some independent α-Frechet processes Y^(k) = {Y_t^(k)}_{t∈T}, k = 1, ..., n, then we say that the Y^(k)'s are components of Y. By the max-stability of Y, (4.20) trivially holds if the Y^(k)'s are independent copies of {n^{−1/α} Y_t}_{t∈T}. The constant multiples of Y are referred to as trivial components of Y and, as in the SαS case, we are interested in the structure of the non-trivial ones.
The association method can be readily applied to transfer decomposability results for SαS processes to the max-stable setting. Let Y = {Y_t}_{t∈T} be an α-Frechet (α ∈ (0, 2)) process with extremal representation

(4.21) {Y_t}_{t∈T} d= { ∫^e_S f_t(s) M_α^∨(ds) }_{t∈T},

where {f_t}_{t∈T} ⊂ L^α_+(S, B_S, µ) are spectral functions, and recall that

(4.22) P(Y_{t_i} ≤ y_i, i = 1, ..., n) = exp( −∫_S max_{1≤i≤n} ( f_{t_i}(s)/y_i )^α µ(ds) ),

for all y_i > 0, t_i ∈ T, i = 1, ..., n.
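When S is finite, the integral in (4.22) becomes a weighted sum and the joint distribution function can be evaluated directly. The sketch below (our illustration; the spectral values, weights, and α are hypothetical) does this and checks the one-dimensional marginal against the closed form P(Y_t ≤ y) = exp(−y^{−α} ∫_S f_t^α dµ).

```python
import math

def frechet_cdf(F, mu, y, alpha):
    """Discrete-S version of (4.22): F[i][s] holds f_{t_i}(s), mu the weights.

    Returns P(Y_{t_1} <= y[0], ..., Y_{t_n} <= y[n-1])."""
    n = len(y)
    integral = sum(
        max(F[i][s] / y[i] for i in range(n)) ** alpha * mu[s]
        for s in range(len(mu))
    )
    return math.exp(-integral)

# Hypothetical example: alpha = 1.5, three-point S, two spectral functions.
alpha = 1.5
mu = [0.2, 0.5, 0.3]
F = [[1.0, 2.0, 0.5],
     [0.3, 1.0, 2.0]]

# The one-dimensional marginal is alpha-Frechet with scale
# sigma^alpha = sum_s f_t(s)^alpha mu(s).
y = 2.0
sigma_alpha = sum(f ** alpha * m for f, m in zip(F[0], mu))
closed_form = math.exp(-sigma_alpha / y ** alpha)
assert abs(frechet_cdf(F[:1], mu, [y], alpha) - closed_form) < 1e-12
```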
Assume 0 < α < 2. Recall that an SαS process X and an α-Frechet process Y are said to be associated if they have a common spectral representation. That is, if for some non-negative {f_t}_{t∈T} ⊂ L^α_+(S, B_S, µ), Relations (4.2) and (4.21) hold.
To illustrate the association method in Chapter III, we prove the max-stable
counterpart of our main result Theorem IV.1. From the proof, we can see that the
other results in the sum-stable setting have their natural max-stable counterparts by
association. We briefly state some of these results at the end of this section.
Theorem IV.15. Suppose {Y_t}_{t∈T} is an α-Frechet process with spectral representation (4.21), where F ≡ {f_t}_{t∈T} ⊂ L^α_+(S, B_S, µ). Let {Y_t^(k)}_{t∈T}, k = 1, ..., n, be independent α-Frechet processes. Then the decomposition (4.20) holds, if and only if there exist measurable functions r_k : S → [0, 1], k = 1, ..., n, such that

(4.23) {Y_t^(k)}_{t∈T} d= { ∫^e_S r_k(s) f_t(s) M_α^∨(ds) }_{t∈T}, k = 1, ..., n.

In this case, Σ_{k=1}^n r_k(s)^α = 1, µ-almost everywhere on S, and the r_k's in (4.23) can be chosen to be ρ(F)-measurable, uniquely modulo µ.
Proof. The 'if' part follows from a straightforward calculation of the cumulative distribution functions (4.22). To show the 'only if' part, suppose (4.20) holds and Y^(k) has spectral functions {g_t^(k)}_{t∈T} ⊂ L^α_+(V_k, B_{V_k}, ν_k), k = 1, ..., n. Without loss of generality, assume the {V_k}_{k=1,...,n} to be mutually disjoint and define g_t(v) = Σ_{k=1}^n g_t^(k)(v) 1_{V_k}(v) ∈ L^α_+(V, B_V, ν) for an appropriately defined (V, B_V, ν) (see the proof of Theorem IV.1).

Now, consider the SαS process X associated to Y. It has spectral functions {f_t}_{t∈T} and {g_t}_{t∈T}. Consider the SαS processes X^(k) associated to the Y^(k) via spectral functions {g_t^(k)}_{t∈T}, k = 1, ..., n. By checking the characteristic functions, one can show that the {X^(k)}_{k=1,...,n} form a decomposition of X as in (4.1). Then, by Theorem IV.1, each SαS component X^(k) has a spectral representation (4.6) with spectral functions {r_k f_t}_{t∈T}. But we introduced X^(k) as the SαS process associated to Y^(k) via spectral representation {g_t^(k)}_{t∈T}. Hence, X^(k) has spectral functions {g_t^(k)}_{t∈T} and {r_k f_t}_{t∈T}, and so does Y^(k) by Theorem III.11. Therefore, (4.23) holds and the rest of the desired results follow.
Further parallel results can be established by the association method. Consider a stationary α-Frechet process Y. If Y^(k), k = 1, ..., n, are independent stationary α-Frechet processes such that (4.20) holds, then we say each Y^(k) is a stationary α-Frechet component of Y. The process Y is said to be indecomposable, if it has no non-trivial stationary component. The following results on (mixed) moving maxima (see e.g. [101] and [50] for more details) follow from Theorem IV.15 and the association method, in parallel to Corollary IV.11 on (mixed) moving averages in the sum-stable setting.
Corollary IV.16. The mixed moving maxima process

{Y_t}_{t∈R^d} d= { ∫^e_{R^d×V} f(t+s, v) M_α^∨(ds, dv) }_{t∈R^d}

is indecomposable, if and only if it has a moving maxima representation

{Y_t}_{t∈R^d} d= { ∫^e_{R^d} f(t+s) M_α^∨(ds) }_{t∈R^d}.
4.4 Proof of Theorem IV.1

We will first show that Theorem IV.1 is true when {f_t}_{t∈T} is minimal (Proposition IV.18), and then we complete the proof by relating a general spectral representation to a minimal one. This technique is standard in the literature on representations of SαS processes (see e.g. Rosinski [83], Remark 2.3). We start with a useful lemma.
Lemma IV.17. Let {f_t}_{t∈T} ⊂ L^α(S, B_S, µ) be a minimal representation of an SαS process. For any two bounded B_S-measurable functions r^(1) and r^(2), we have

{ ∫_S r^(1) f_t dM_α }_{t∈T} d= { ∫_S r^(2) f_t dM_α }_{t∈T},

if and only if |r^(1)| = |r^(2)| modulo µ.
Proof. The 'if' part is trivial. We shall now prove the 'only if' part. Let S^(k) := supp(r^(k)), k = 1, 2, and note that since {f_t}_{t∈T} is minimal, the {r^(k) f_t}_{t∈T} are minimal representations, restricted to S^(k), k = 1, 2, respectively. Since the latter two representations correspond to the same process, by Theorem 2.2 in [83], there exist a bi-measurable, one-to-one and onto point mapping Ψ : S^(1) → S^(2) and a function h : S^(1) → R \ {0}, such that, for all t ∈ T,

(4.24) r^(1)(s) f_t(s) = r^(2)(Ψ(s)) f_t(Ψ(s)) h(s), for almost all s ∈ S^(1),

and

(4.25) d(µ∘Ψ)/dµ = |h|^α, µ-almost everywhere.

It then follows that, for almost all s ∈ S^(1),

(4.26) f_{t_1}(s)/f_{t_2}(s) = ( r^(1)(s) f_{t_1}(s) )/( r^(1)(s) f_{t_2}(s) ) = f_{t_1}(Ψ(s))/f_{t_2}(Ψ(s)).

Define R_λ(t_1, t_2) = { s : f_{t_1}(s)/f_{t_2}(s) ≤ λ } and note that by (4.26), for all A ≡ R_λ(t_1, t_2),

(4.27) µ( Ψ(A ∩ S^(1)) ∆ (A ∩ S^(2)) ) = 0.

In fact, one can show that Relation (4.27) is also valid for all A ∈ ρ(F) ≡ σ( { R_λ(t_1, t_2) : λ ∈ R, t_1, t_2 ∈ T } ). Then, by minimality, (4.27) holds for all A ∈ B_S. In particular, taking A equal to S^(1) and S^(2), respectively, it follows that µ(S^(1) ∆ S^(2)) = 0. Therefore, writing S̃ := S^(1) ∩ S^(2), we have

(4.28) µ( Ψ(A ∩ S̃) ∆ (A ∩ S̃) ) = 0, for all A ∈ B_S.

This implies that Ψ(s) = s, for µ-almost all s ∈ S̃. To see this, let B_{S̃} = B_S ∩ S̃ denote the σ-algebra B_S restricted to S̃. Observe that for all A ∈ B_{S̃}, we have 1_A = 1_A∘Ψ, for µ-almost all s ∈ S̃, and trivially σ(1_A : A ∈ B_{S̃}) = B_{S̃}. Thus, by the second part of Proposition 5.1 in [85], it follows that Ψ(s) = s modulo µ on S̃. This and (4.25) imply that h(s) ∈ {±1}, almost everywhere. Plugging Ψ and h into (4.24) yields the desired result.
Proposition IV.18. Theorem IV.1 is true when {f_t}_{t∈T} is minimal.

Proof. We first prove the 'if' part. The result follows readily by using characteristic functions. Indeed, suppose that the X^(k) = {X_t^(k)}_{t∈T}, k = 1, ..., n, are independent and have representations as in (4.6). Then, for all a_j ∈ R, t_j ∈ T, j = 1, ..., m, we have

(4.29) E exp( i Σ_{j=1}^m a_j X_{t_j} ) = exp( −∫_S |Σ_{j=1}^m a_j f_{t_j}|^α dµ )
= Π_{k=1}^n exp( −∫_S |Σ_{j=1}^m a_j r_k f_{t_j}|^α dµ ) = Π_{k=1}^n E exp( i Σ_{j=1}^m a_j X_{t_j}^(k) ),

where the second equality follows from the fact that Σ_{k=1}^n |r_k(s)|^α = 1, for µ-almost all s ∈ S. Relation (4.29) implies the decomposition (4.1).
We now prove the 'only if' part. Suppose that (4.1) holds and let {f_t^(k)}_{t∈T} ⊂ L^α(V_k, B_{V_k}, ν_k), k = 1, ..., n, be representations of the independent components {X_t^(k)}_{t∈T}, k = 1, ..., n, respectively, and without loss of generality assume that the {V_k}_{k=1,...,n} are mutually disjoint. Introduce the measure space (V, B_V, ν), where V := ∪_{k=1}^n V_k, B_V := { ∪_{k=1}^n A_k : A_k ∈ B_{V_k}, k = 1, ..., n }, and ν(A) := Σ_{k=1}^n ν_k(A ∩ V_k) for all A ∈ B_V.

By decomposition (4.1), it follows that {X_t}_{t∈T} d= { ∫_V g_t dM_α }_{t∈T}, with g_t(u) := Σ_{k=1}^n f_t^(k)(u) 1_{V_k}(u) and M_α an SαS random measure on (V, B_V) with control measure ν.

Thus, {f_t}_{t∈T} ⊂ L^α(S, B_S, µ) and {g_t}_{t∈T} ⊂ L^α(V, B_V, ν) are two representations of the same process X, and by assumption the former is minimal. Therefore, by Remark 2.5 in [83], there exist modulo ν unique functions Φ : V → S and h : V → R \ {0}, such that, for all t ∈ T,

(4.30) g_t(u) = h(u) f_t(Φ(u)), for almost all u ∈ V,

where moreover µ = ν_h∘Φ^{−1} with dν_h = |h|^α dν.

Recall that V is the union of the mutually disjoint sets {V_k}_{k=1,...,n}. For each k = 1, ..., n, let Φ_k : V_k → S_k := Φ(V_k) be the restriction of Φ to V_k, and define the measure µ_k(·) := ν_{h,k}∘Φ_k^{−1}(· ∩ S_k) on (S, B_S) with dν_{h,k} := |h|^α dν_k. Note that µ_k has support S_k, and the Radon–Nikodym derivative dµ_k/dµ exists. We claim that (4.6) holds with r_k := (dµ_k/dµ)^{1/α}. To see this, observe that for all m ∈ N, a_1, ..., a_m ∈ R, t_1, ..., t_m ∈ T,

∫_S |Σ_{j=1}^m a_j r_k f_{t_j}|^α dµ = ∫_{S_k} |Σ_{j=1}^m a_j f_{t_j}|^α dµ_k = ∫_{V_k} |Σ_{j=1}^m a_j h f_{t_j}∘Φ_k|^α dν_k,

which, combined with (4.30), yields (4.6) because g_t|_{V_k} = f_t^(k).

Note also that Σ_{k=1}^n µ_k = µ and thus Σ_{k=1}^n r_k^α = 1. This completes the proof of part (i) of Theorem IV.1 in the case when {f_t}_{t∈T} is minimal.

To prove part (ii), note that the r_k's above are in fact non-negative and B_S-measurable. Note also that, by minimality, the r_k's have versions r̃_k that are ρ(F)-measurable, i.e. r_k = r̃_k modulo µ. Their uniqueness follows from Lemma IV.17.
Proof of Theorem IV.1. (i) The 'if' part follows by using characteristic functions as in the proof of Proposition IV.18 above.

Now, we prove the 'only if' part. Let {f̃_t}_{t∈T} ⊂ L^α(S̃, B_{S̃}, µ̃) be a minimal representation of X. As in the proof of Proposition IV.18, by Remark 2.5 in [83], there exist modulo µ unique functions Φ : S → S̃ and h : S → R \ {0}, such that, for all t ∈ T,

(4.31) f_t(s) = h(s) f̃_t(Φ(s)), for almost all s ∈ S,

where µ̃ = µ_h∘Φ^{−1} with dµ_h = |h|^α dµ.

Now, by Proposition IV.18, if the decomposition (4.1) holds, then there exist unique non-negative functions r̃_k, k = 1, ..., n, such that

(4.32) {X_t^(k)}_{t∈T} d= { ∫_{S̃} r̃_k f̃_t dM̃_α }_{t∈T}, k = 1, ..., n,

and Σ_{k=1}^n r̃_k^α = 1 modulo µ̃. Here M̃_α is an SαS random measure on (S̃, B_{S̃}) with control measure µ̃. Let r_k(s) := r̃_k(Φ(s)) and note that, by using (4.31) and a change of variables, for all a_j ∈ R, t_j ∈ T, j = 1, ..., m, we obtain

(4.33) ∫_S |Σ_{j=1}^m a_j r_k(s) f_{t_j}(s)|^α µ(ds) = ∫_{S̃} |Σ_{j=1}^m a_j r̃_k(s) f̃_{t_j}(s)|^α µ̃(ds).

This, in view of Relation (4.32), implies (4.6). Further, the fact that Σ_{k=1}^n r̃_k^α = 1 implies Σ_{k=1}^n r_k^α = 1, modulo µ, because the mapping Φ is non-singular, i.e. µ∘Φ^{−1} ∼ µ̃. This completes the proof of part (i).
We now focus on proving part (ii). Suppose that (4.6) holds for two choices of r_k, namely r_k and r′_k, both non-negative and measurable with respect to ρ(F). We claim that

(4.34) ρ(F) ∼ Φ^{−1}(ρ(F̃)),

and defer the proof to the end. Then, since minimality implies that B_{S̃} ∼ ρ(F̃), both r_k and r′_k are measurable with respect to ρ(F) ∼ Φ^{−1}(B_{S̃}). Now, the Doob–Dynkin lemma (see e.g. Rao [75], p. 30) implies that

(4.35) r_k(s) = r̃_k(Φ(s)) and r′_k(s) = r̃′_k(Φ(s)), for µ-almost all s,

where r̃_k and r̃′_k are two B_{S̃}-measurable functions. By using the last relation and a change of variables, we obtain that (4.33) holds with (r_k, r̃_k) replaced by (r′_k, r̃′_k) as well. Thus both {r̃_k f̃_t}_{t∈T} and {r̃′_k f̃_t}_{t∈T} are representations of the k-th component of X. Since {f̃_t}_{t∈T} is a minimal representation of X, Lemma IV.17 implies that r̃_k = r̃′_k modulo µ̃. This, by (4.35) and the non-singularity of Φ, yields r_k = r′_k modulo µ.

It remains to prove (4.34). Relation (4.31) and the fact that h(s) ≠ 0 imply that for all λ and t_1, t_2 ∈ T, { f_{t_1}/f_{t_2} ≤ λ } = Φ^{−1}( { f̃_{t_1}/f̃_{t_2} ≤ λ } ) modulo µ. Thus the classes of sets C := { { f_{t_1}/f_{t_2} ≤ λ } : t_1, t_2 ∈ T, λ ∈ R } and C̃ := { Φ^{−1}( { f̃_{t_1}/f̃_{t_2} ≤ λ } ) : t_1, t_2 ∈ T, λ ∈ R } are equivalent. That is, for all A ∈ C, there exists Ã ∈ C̃ with µ(A ∆ Ã) = 0, and vice versa.

Define

G = { Φ^{−1}(A) : A ∈ ρ(F̃) such that µ(Φ^{−1}(A) ∆ B) = 0 for some B ∈ σ(C) }.

Notice that G is a σ-algebra and, since C̃ ⊂ G ⊂ Φ^{−1}(ρ(F̃)), we obtain that σ(C̃) = Φ^{−1}(ρ(F̃)) ≡ G. This, in view of the definition of G, shows that for every Ã ∈ σ(C̃) there exists A ∈ σ(C) with µ(A ∆ Ã) = 0. In a similar way one can show that each element of σ(C) is equivalent to an element of σ(C̃), which completes the proof of the desired equivalence of the σ-algebras.
CHAPTER V
Conditional Sampling for Max-stable Processes
The modeling and parameter estimation of the univariate marginal distributions
of the extremes have been studied extensively (see e.g. Davison and Smith [21], de
Haan and Ferreira [24], Resnick [77] and the references therein). Many of the recent
developments of statistical inference in extreme value theory focus on the character-
ization, modeling and estimation of the dependence for multivariate extremes. In
this context, building adequate max-stable processes and random fields plays a key
role. See for example de Haan and Pereira [25], Buishand et al. [9], Schlather [94],
Schlather and Tawn [95], Cooley et al. [17], and Naveau et al. [64].
This chapter is motivated by an important and long-standing challenge, namely, prediction for max-stable random processes and fields. Suppose that one already has a suitable max-stable model for the dependence structure of a random field {X_t}_{t∈T}. The field is observed at several locations t_1, ..., t_n ∈ T and one wants to predict the values X_{s_1}, ..., X_{s_m} of the field at some other locations s_1, ..., s_m ∈ T. The optimal predictors involve the conditional distribution of {X_t}_{t∈T}, given the data. Even if the finite-dimensional distributions of the field {X_t}_{t∈T} are available in analytic form, it is typically impossible to obtain a closed-form solution for the conditional distribution. Naive Monte Carlo approximations are not practical either, since they involve conditioning on events of infinitesimal probability, which leads to mounting errors and computational costs.
Prior studies of Davis and Resnick [19, 20] and Cooley et al. [17], among others,
have shown that the prediction problem in the max-stable context is challenging,
and it does not have an elegant analytical solution. On the other hand, the growing
popularity and the use of max-stable processes in various applications, make this an
important problem. This motivated us to seek a computational solution.
5.1 Overview
In this chapter, we develop theory and methodology for sampling from the conditional distributions of spectrally discrete max-stable models. More precisely, we provide an algorithm that can efficiently generate exact independent samples from the regular conditional probability of (X_{s_1}, ..., X_{s_m}), given the values (X_{t_1}, ..., X_{t_n}). For the sake of simplicity, we write X = (X_1, ..., X_n) ≡ (X_{t_1}, ..., X_{t_n}). The algorithm applies to the general max-linear model:

(5.1) X_i = max_{j=1,...,p} a_{i,j} Z_j ≡ ⋁_{j=1}^p a_{i,j} Z_j, i = 1, ..., n,

where the a_{i,j}'s are known non-negative constants and the Z_j's are independent continuous non-negative random variables. Any multivariate max-stable distribution can be approximated arbitrarily well via a max-linear model with sufficiently large p (see e.g. Remark II.1).
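The max-linear map itself is simple to evaluate: it is a matrix–vector product with sums replaced by maxima. A minimal sketch (the toy matrix A and factor values z are hypothetical, ours for illustration):

```python
def max_linear(A, z):
    """Max-linear map X_i = max_j A[i][j] * z[j], as in (5.1)."""
    return [max(a * zj for a, zj in zip(row, z)) for row in A]

# Toy model: n = 2 observed values driven by p = 3 independent factors.
A = [[1.0, 0.5, 0.0],
     [0.0, 1.0, 2.0]]
z = [3.0, 4.0, 1.0]
print(max_linear(A, z))  # [3.0, 4.0]
```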
The main idea is to first generate samples from the regular conditional probability distribution of Z | X = x, where Z = (Z_j)_{j=1,...,p}. Then, the conditional distributions of

X_{s_k} = ⋁_{j=1}^p b_{k,j} Z_j, k = 1, ..., m,

given X = x can be readily obtained, for any given b_{k,j}'s. In this chapter, we assume that the model is completely known, i.e., the parameters a_{i,j} and b_{k,j} are given. The statistical inference for these parameters is beyond the scope of this chapter.
Observe that if X = x, then (5.1) implies natural equality and inequality con-
straints on the Zj’s. More precisely, (5.1) gives rise to a set of so-called hitting
scenarios. In each hitting scenario, a subset of the Zj’s equal, in other words hit,
their upper bounds and the rest of the Zj’s can take arbitrary values in certain open
intervals. We will show that the regular conditional probability of Z | X = x is
a weighted mixture of the various distributions of the vector Z, under all possible
hitting scenarios corresponding to X = x.
The resulting formula, however, involves determining all hitting scenarios, which
becomes computationally prohibitive for large and even moderate values of p. This
issue is closely related to the NP-hard set-covering problem in computer science (see
e.g. [13]).
Fortunately, further detailed analysis of the probabilistic structure of the max-
linear models allows us to obtain a different formula of the regular conditional prob-
ability (Theorem V.9). It yields an exact and computationally efficient algorithm,
which in practice can handle complex max-linear models with p in the order of thou-
sands, on a conventional desktop computer. The algorithm is implemented in the R
([74]) package maxLinear [107], with the core part written in C/C++. We also used
the R package fields ([37]) to generate some of the figures in this chapter.
We illustrate the performance of our algorithm over two classes of processes: the
max-autoregressive moving average (MARMA) time series (Davis and Resnick [19]),
and the Smith model (Smith [98]) for spatial extremes. The MARMA processes
are spectrally discrete max-stable processes, and our algorithm applies directly. In
61
Section 5.4, we demonstrate the prediction of MARMA processes by conditional
sampling and compare our result to the projection predictors proposed in [19]. To
apply our algorithm to the Smith model, on the other hand, we first need to discretize
the (spectrally continuous) model. Section 5.5 is devoted to conditional sampling
for the discretized Smith model. Thanks to the computational efficiency of our
algorithm, we can choose a mesh fine enough to obtain a satisfactory discretization.
Figure 5.1 shows four realizations from such a discretized Smith model, conditioning
only on 7 observations (with assumed value 5). The algorithm applies in the same
way to more complex models.
[Contour plots omitted.]

Figure 5.1: Four samples from the conditional distribution of the discrete Smith model (see Section 5.5), given the observed values (all equal to 5) at the locations marked by crosses. Parameters: ρ = 0, β_1 = 1, β_2 = 1.
Remark V.1. We shall focus on spectrally discrete max-stable processes (see Chap-
ter II):
Xt := ⋁_{j=1}^p φj(t) Zj , t ∈ T,
where the φj(t)’s are non-negative deterministic functions. By taking sufficiently
large p’s and with judicious φj(t)’s, one can build flexible models that can replicate
the behavior of an arbitrary max-stable process (recall the metric (2.5) characterizing
the convergence of stochastic extremal integrals). From this point of view, a satisfac-
tory computational solution must be able to deal with max-linear models with large
p’s.
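In plain terms, the operation ⊙ replaces the sums of ordinary matrix–vector multiplication by maxima: Xi = max_{1≤j≤p} ai,j Zj. A minimal sketch in Python (the helper name max_linear is ours, not part of the maxLinear package):

```python
import numpy as np

def max_linear(A, Z):
    """Max-linear product: (A ⊙ Z)_i = max_j a_{i,j} * Z_j."""
    A = np.asarray(A, dtype=float)
    Z = np.asarray(Z, dtype=float)
    return (A * Z[np.newaxis, :]).max(axis=1)

# A triangular coefficient matrix; here X1 <= X2 <= X3 always holds.
A = np.array([[1.0, 0.0, 0.0],
              [1.0, 1.0, 0.0],
              [1.0, 1.0, 1.0]])
print(max_linear(A, np.array([1.0, 2.0, 3.0])))  # [1. 2. 3.]
```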
Remark V.2. After our work [112] was published, the exact conditional distributions
of spectrally continuous max-stable processes were addressed by Dombry and Eyi–
Minko [31] via a different approach. Nevertheless, they also arrive at a notion similar to the hitting scenarios introduced below.
5.2 Conditional Probability in Max-linear Models
Consider the max-linear model in (5.1). We shall denote this model by:
(5.2) X = A⊙ Z,
where A = (ai,j)n×p is a matrix with non-negative entries, X = (X1, . . . , Xn) and
Z = (Z1, . . . , Zp) are column vectors. We assume that the Zj’s, j = 1, . . . , p, are
independent non-negative random variables having probability densities.
In this section, we provide an explicit formula for the regular conditional probability
of Z with respect to X (see Theorem V.4 below). We start with some intuition and
notation. Throughout this chapter, we assume that the matrix A has at least one
nonzero entry in each of its rows and columns. This will be referred to as Assumption
A.
Observe that if x = A ⊙ z with x ∈ R^n_+, z ∈ R^p_+, then

(5.3) 0 ≤ zj ≤ z̄j ≡ z̄j(A,x) := min_{1≤i≤n} xi/ai,j , j = 1, . . . , p.

That is, the max-linear model (5.2) imposes certain inequality and equality constraints on the Zj's, given a set of observed Xi's. Namely, some of the upper bounds z̄j(A,x) in (5.3) must be attained, or hit, i.e., zj = z̄j(A,x), in such a way that

xi = ai,j(i) z̄j(i) , i = 1, . . . , n,

with judicious j(i) ∈ {1, . . . , p}. The next example helps to understand the inequality and equality constraints.
Example V.3. Suppose that n = p = 3 and

A = [ 1 0 0 ]
    [ 1 1 0 ]
    [ 1 1 1 ] .

Let x = A ⊙ z for some z ∈ R^3_+. In this case, it necessarily follows that x1 ≤ x2 ≤ x3. Moreover, (5.3) yields z̄ = x.

(i) If x = (1, 2, 3), then it trivially follows that z = z̄ = (1, 2, 3), which is an equality constraint on z.

(ii) If x = (1, 1, 3), then it follows that z1 = z̄1 = 1, z2 ≤ z̄2 = 1 and z3 = z̄3 = 3. Here, the "equality constraints" must hold for z1 = z̄1 and z3 = z̄3, while z2 only needs to satisfy the "inequality constraint" 0 ≤ z2 ≤ z̄2.
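The upper bounds (5.3) are immediate to compute; entries with ai,j = 0 impose no constraint (formally xi/ai,j = ∞). A sketch, with the hypothetical helper name z_bar:

```python
import numpy as np

def z_bar(A, x):
    """Upper bounds z̄_j(A, x) = min_{i : a_{i,j} > 0} x_i / a_{i,j}, as in (5.3)."""
    A = np.asarray(A, dtype=float)
    x = np.asarray(x, dtype=float)
    # Entries a_{i,j} = 0 contribute no constraint, encoded as +infinity.
    ratios = np.where(A > 0, x[:, None] / np.where(A > 0, A, 1.0), np.inf)
    return ratios.min(axis=0)

A = np.array([[1., 0., 0.], [1., 1., 0.], [1., 1., 1.]])
print(z_bar(A, np.array([1., 1., 3.])))  # [1. 1. 3.] -- i.e. z̄ = x, as in case (ii)
```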
Write

C(A,x) := {z ∈ R^p_+ : x = A ⊙ z} ,
and note that the conditional distribution of Z | X = x concentrates on the set
C(A,x). The observation in Example V.3 can be generalized and formulated as
follows.
• Every z ∈ C(A,x) corresponds to a set of active (equality) constraints J ⊂ {1, . . . , p}, which we refer to as a hitting scenario of (A,x), such that

(5.4) zj = z̄j(A,x), j ∈ J and zj < z̄j(A,x), j ∈ J^c := {1, . . . , p} \ J.

Observe that if j ∈ J^c, then there are no further constraints and zj can take any value in [0, z̄j), regardless of the values of the other components of the vector z ∈ C(A,x).

• Every value x may give rise to many different hitting scenarios J ⊂ {1, . . . , p}. Let J(A,x) denote the collection of all such J's. We refer to J(A,x) as the hitting distribution of x w.r.t. A:

J(A,x) ≡ {J ⊂ {1, . . . , p} : there exists z ∈ C(A,x) such that (5.4) holds} .

To illustrate the notions of hitting scenario and hitting distribution, consider again Example V.3. Therein, we have J(A,x) = {{1, 2, 3}} in case (i), and J(A,x) = {{1, 3}, {1, 2, 3}} in case (ii).
The hitting distribution J (A,x) is a finite set and thus can always be identified.
However, the identification procedure is the key difficulty in providing an efficient
algorithm for conditional sampling in practice. This issue is addressed in Section 5.3.
In the rest of this section, suppose that J(A,x) is given. Then, we can partition C(A,x) as follows:

C(A,x) = ⋃_{J ∈ J(A,x)} CJ(A,x) ,

where

CJ(A,x) = {z ∈ R^p_+ : zj = z̄j, j ∈ J and zj < z̄j, j ∈ J^c} .
The sets CJ(A,x), J ∈ J (A,x) are disjoint since they correspond to different hitting
scenarios in J (A,x). Let
(5.5) r(J(A,x)) = min_{J ∈ J(A,x)} |J| ,

where |J| is the number of elements in J. We call r(J(A,x)) the rank of the hitting distribution J(A,x). It equals the minimal number of equality constraints among the hitting scenarios in J(A,x). It will turn out that the hitting scenarios J ∈ J(A,x) with |J| > r(J(A,x)) occur with (conditional) probability zero and can be ignored. We therefore focus on the set of all relevant hitting scenarios:

Jr(A,x) = {J ∈ J(A,x) : |J| = r(J(A,x))} .
Theorem V.4. Consider the max-linear model in (5.2), where the Zj's are independent random variables with densities fZj and distribution functions FZj, j = 1, . . . , p. Let A = (ai,j)n×p have non-negative entries satisfying Assumption A, and let R_{R^p_+} be the class of all rectangles (e, f], e, f ∈ R^p_+, in R^p_+.

For all J ∈ J(A,x), E ∈ R_{R^p_+}, and x ∈ R^n_+, define

(5.6) νJ(x, E) := ∏_{j∈J} δ_{z̄j}(πj(E)) · ∏_{j∈J^c} P(Zj ∈ πj(E) | Zj < z̄j) ,

where πj(z1, . . . , zp) = zj and δa is a unit point mass at a.

Then, the regular conditional probability ν(x, E) of Z w.r.t. X equals:

(5.7) ν(x, E) = Σ_{J ∈ Jr(A,x)} pJ(A,x) νJ(x, E) , E ∈ R_{R^p_+} ,

for PX-almost all x ∈ A ⊙ (R^p_+), where for all J ∈ Jr(A,x),

(5.8) pJ(A,x) = wJ / Σ_{K ∈ Jr(A,x)} wK with wJ = ∏_{j∈J} z̄j fZj(z̄j) · ∏_{j∈J^c} FZj(z̄j) .

In the special case when the Zj's are α-Fréchet with scale coefficient 1, we have wJ = ∏_{j∈J} (z̄j)^{−α}.
Remark V.5. We state (5.7) only for rectangle sets E because the projections πj(B) of an arbitrary Borel set B ⊂ R^p_+ are not always Borel (see e.g. [99]). Nevertheless, the measure extension theorem ensures that Formula (5.7) completely specifies the regular conditional probability.
We do not provide a proof of Theorem V.4 directly. Instead, we will first provide
an equivalent formula for ν(x, E) in Theorem V.9 in Section 5.3, and then prove that
ν(x, E) is the desired regular conditional probability. All the proofs are deferred to
Section 5.6. The next example gives the intuition behind Formula (5.7).
Example V.6. Continue with Example V.3.

(i) If X = x = (1, 2, 3), then z̄ = x and J(A,x) = {{1, 2, 3}}. Therefore, r(J(A,x)) = 3 and Formula (5.7) yields

ν(x, E) = νJ(x, E) = δ_{z̄1}(π1(E)) δ_{z̄2}(π2(E)) δ_{z̄3}(π3(E)) ≡ δ_{z̄}(E) ,

a degenerate distribution with a single unit point mass at z̄.

(ii) If X = x = (1, 1, 3), then z̄ = x, J(A,x) = {{1, 3}, {1, 2, 3}}, and r(J(A,x)) = 2. Therefore, Jr(A,x) = {{1, 3}} and Formula (5.7) yields:

ν(x, E) = ν_{1,3}(x, E) = δ_{z̄1}(π1(E)) P(Z2 ∈ π2(E) | Z2 < z̄2) δ_{z̄3}(π3(E)) .

In this case, the conditional distribution concentrates on the one-dimensional set {1} × (0, 1) × {3}.

(iii) Finally, if X = x = (1, 1, 1), then z̄ = x and J(A,x) = {{1}, {1, 2}, {1, 2, 3}}. Then, Jr(A,x) = {{1}} and

ν(x, E) = ν_{1}(x, E) = δ_{z̄1}(π1(E)) ∏_{j=2}^{3} P(Zj ∈ πj(E) | Zj < z̄j) .

The conditional distribution concentrates on the set {1} × (0, 1) × (0, 1).
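For small models, the hitting scenarios can be enumerated by brute force: J ∈ J(A,x) exactly when, for every row i, some column j ∈ J attains ai,j z̄j = xi (this is the set-covering view made precise in Section 5.3). A sketch, exponential in p and for intuition only (names ours, 0-based indices):

```python
import itertools
import numpy as np

def hitting_scenarios(A, x, tol=1e-12):
    """All hitting scenarios J(A, x) and the relevant ones Jr(A, x),
    by exhaustive search over subsets of columns (0-based indices)."""
    A, x = np.asarray(A, float), np.asarray(x, float)
    n, p = A.shape
    zb = np.min(np.where(A > 0, x[:, None] / np.where(A > 0, A, 1.0), np.inf), axis=0)
    H = np.abs(A * zb[None, :] - x[:, None]) < tol   # "column j hits row i"
    scenarios = [set(J) for r in range(1, p + 1)
                 for J in itertools.combinations(range(p), r)
                 if H[:, list(J)].any(axis=1).all()]  # every row hit by some j in J
    rank = min(len(J) for J in scenarios)
    return scenarios, [J for J in scenarios if len(J) == rank]

A = np.array([[1., 0., 0.], [1., 1., 0.], [1., 1., 1.]])
scen, Jr = hitting_scenarios(A, np.array([1., 1., 3.]))
print(scen, Jr)  # [{0, 2}, {0, 1, 2}] [{0, 2}] -- {1,3} and {1,2,3} in 1-based notation
```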
We conclude this section by showing that the conditional distributions (5.7) arise as suitable limits. This result can be viewed as a heuristic justification of Theorem V.4. Let ε > 0 and consider

(5.9) C^ε_J(A,x) := {z ∈ R^p_+ : zj ∈ [z̄j(1−ε), z̄j(1+ε)], j ∈ J, zk < z̄k(1−ε), k ∈ J^c} ,

and set

(5.10) C^ε(A,x) := ⋃_{J ∈ J(A,x)} C^ε_J(A,x) .

Note that the sets A ⊙ (C^ε(A,x)) shrink to the point x, as ε ↓ 0.

Proposition V.7. Under the assumptions of Theorem V.4, for all x ∈ A ⊙ (R^p_+), we have, as ε ↓ 0,

(5.11) P(Z ∈ E | Z ∈ C^ε(A,x)) −→ ν(x, E) , E ∈ R_{R^p_+} .
Proof. Recall the definition of C^ε_J in (5.9). Observe that for all ε > 0, the sets {C^ε_J(A,x)}_{J ∈ J(A,x)} are mutually disjoint. Thus, writing C^ε ≡ C^ε(A,x) and C^ε_J ≡ C^ε_J(A,x), by (5.10) we have

(5.12) P(Z ∈ E | Z ∈ C^ε) = Σ_{J ∈ J(A,x)} P(Z ∈ E | Z ∈ C^ε_J) P(Z ∈ C^ε_J | Z ∈ C^ε)
    = Σ_{J ∈ J(A,x)} P(Z ∈ E | Z ∈ C^ε_J) · P(Z ∈ C^ε_J) / Σ_{K ∈ J(A,x)} P(Z ∈ C^ε_K) ,

where the terms with P(Z ∈ C^ε_J) = 0 are ignored. One can see that P(Z ∈ E | Z ∈ C^ε_J) converges to νJ(x, E) in (5.6), as ε ↓ 0. The independence of the Zj's also implies that

(5.13) P(Z ∈ C^ε_J) = ∏_{j∈J} P(Zj ∈ [z̄j(1−ε), z̄j(1+ε)]) · ∏_{k∈J^c} P(Zk ≤ z̄k(1−ε))
    = ∏_{j∈J} ( z̄j fZj(z̄j) · 2ε + o(ε) ) · ∏_{k∈J^c} ( FZk(z̄k) + o(1) ) .

Observe that for J ∈ Jr(A,x), the latter expression equals (2ε)^{|J|} wJ (1 + o(1)) as ε ↓ 0, and the terms with |J| > r become negligible since they are of smaller order. Therefore, Relation (5.13) yields (5.7), and the proof is thus complete.
The proof of Proposition V.7 provides insight into the expressions of the weights wJ in (5.8) and the components νJ in (5.6). In particular, it explains why only hitting scenarios of rank r are involved in the expression of the conditional probability. The formal proof of Theorem V.4, however, requires a different argument.
5.3 Conditional Sampling: Computational Efficiency
We discuss here important computational issues related to sampling from the reg-
ular conditional probability in (5.7). It turns out that identifying all hitting scenarios
amounts to solving the set covering problem, which is NP-hard (see e.g. [13]). The
probabilistic structure of the max-linear models, however, will lead us to an alter-
native efficient solution, valid with probability one. In particular, we will provide
a new formula for the regular conditional probability, showing that Z can be de-
composed into conditionally independent vectors, given X = x. As a consequence,
with probability one we are not in the ‘bad’ situation that the corresponding set
covering problem requires exponential time to solve. Indeed, this will lead us to an
efficient and linearly-scalable algorithm for conditional sampling, which works well
for max-linear models with large dimensions n× p arising in applications.
To fix ideas, observe that Theorem V.4 implies the following simple algorithm.
Algorithm I:
1. Compute z̄j for j = 1, . . . , p.
2. Identify J (A,x), compute r = r(J (A,x)) and focus on the set of relevant
hitting scenarios Jr = Jr(A,x).
3. Compute {wJ}_{J∈Jr} and {pJ}_{J∈Jr}.
4. Sample Z ∼ ν(x, ·) according to (5.7).
Step 1 is immediate. Provided that Step 2 is done, Step 3 is trivial, and Step 4 can be carried out by first picking a hitting scenario J ∈ Jr(A,x) (with probability pJ(A,x)), setting Zj = z̄j for j ∈ J, and then resampling independently the remaining Zj's from the truncated distributions Zj | Zj < z̄j, for all j ∈ {1, . . . , p} \ J.
The most computationally intensive aspect of this algorithm is to identify the
set of all relevant hitting scenarios Jr(A,x) in Step 2. This is closely related to
the NP-hard set covering problem in theoretical computer science (see e.g. [13]),
which is formulated next. Let H = (hi,j)n×p be a matrix of 0’s and 1’s, and let
c = (cj)_{j=1}^p ∈ Z^p_+ be a p-dimensional cost vector. For simplicity, introduce the notation

[m] ≡ {1, 2, . . . , m} , m ∈ N.

For the matrix H, we say that the column j ∈ [p] covers the row i ∈ [n] if hi,j = 1. The goal of the set-covering problem is to find a minimum-cost subset J ⊂ [p] such that every row is covered by at least one column j ∈ J. This is equivalent to solving

(5.14) min_{δj ∈ {0,1}, j ∈ [p]} Σ_{j∈[p]} cj δj , subject to Σ_{j∈[p]} hi,j δj ≥ 1 , i ∈ [n] .
We can relate the problem of identifying Jr(A,x) to the set covering problem by defining

(5.15) hi,j = 1_{{ai,j z̄j = xi}} ,

where A = (ai,j)n×p and x = (xi)_{i=1}^n are as in (5.2), and cj = 1, j ∈ [p]. It is easy to see that every J ∈ Jr(A,x) corresponds to a solution of (5.14), and vice versa. Namely, for {δj}_{j∈[p]} minimizing (5.14), we have J = {j ∈ [p] : δj = 1} ∈ Jr(A,x). The set Jr(A,x) corresponds to the set of all solutions of (5.14), which depends only on the matrix H. Therefore, in the sequel we write Jr(H) for Jr(A,x), and

(5.16) H = (hi,j)n×p ≡ H(A,x) ,

with hi,j as in (5.15), will be referred to as the hitting matrix.
Example V.8. Recall Example V.6. The following hitting matrices correspond to the three cases of x discussed therein:

H(i) = [ 1 0 0 ]    H(ii) = [ 1 0 0 ]    H(iii) = [ 1 0 0 ]
       [ 0 1 0 ]            [ 1 1 0 ]             [ 1 1 0 ]
       [ 0 0 1 ] ,          [ 0 0 1 ]             [ 1 1 1 ] .
Observe that solving for Jr(H) is even more challenging than solving the set
covering problem (5.14), where only one minimum-cost subset J is needed, and
often an approximation of the optimal solution is acceptable. Here, we need to
identify exhaustively all J ’s such that (5.14) holds. Fortunately, this problem can
be substantially simplified, thanks to the probabilistic structure of the max-linear
model.
We first study the distribution of H. In view of (5.16), we have that H = H(A,X),
with X = A⊙Z, is a random matrix. It will turn out that, with probability one, H
has a nice structure, leading to an efficient conditional sampling algorithm.
For any hitting matrix H, we will decompose the set [p] ≡ {1, . . . , p} into a certain disjoint union [p] = ⋃_{s=1}^r J̄(s). The vectors (Zj)_{j ∈ J̄(s)} , s = 1, . . . , r, will turn out to be conditionally independent (in s), given X = x. Therefore, ν(x, E) will be expressed as a product of (conditional) probabilities.
We start by decomposing the set [n] ≡ {1, . . . , n}. First, for all i1, i2 ∈ [n] and j ∈ [p], we write i1 ~j i2 if hi1,j = hi2,j = 1. Then, we define an equivalence relation on [n]:

(5.17) i1 ∼ i2 , if i1 = k0 ~j1 k1 ~j2 · · · ~jm km = i2 ,

for some m ≤ n, k0, k1, . . . , km ∈ [n] and j1, . . . , jm ∈ [p]. That is, '∼' is the transitive closure of the relations '~j'. Consequently, we obtain a partition of [n], denoted by

(5.18) [n] = ⋃_{s=1}^r Is ,

where Is, s = 1, . . . , r, are the equivalence classes w.r.t. (5.17). Based on (5.18), we define further

(5.19) J(s) = {j ∈ [p] : hi,j = 1 for all i ∈ Is} ,
(5.20) J̄(s) = {j ∈ [p] : hi,j = 1 for some i ∈ Is} .

The sets {J(s), J̄(s)}_{s ∈ [r]} will determine the factorization form of ν(x, E).
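Computing the partition (5.18) is a connected-components problem: rows i1, i2 are linked whenever some column covers both. A small union–find sketch (the name decompose is ours):

```python
import numpy as np

def decompose(H):
    """Equivalence classes I_s of (5.17)-(5.18), together with the column
    sets J(s) ('for all') and J̄(s) ('for some') of (5.19)-(5.20)."""
    H = np.asarray(H, dtype=bool)
    n, p = H.shape
    parent = list(range(n))
    def find(i):                              # union-find with path compression
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i
    for j in range(p):                        # rows sharing column j are merged
        rows = np.flatnonzero(H[:, j])
        for i in rows[1:]:
            parent[find(i)] = find(rows[0])
    classes = {}
    for i in range(n):
        classes.setdefault(find(i), []).append(i)
    return [(I, set(np.flatnonzero(H[I, :].all(axis=0))),   # J(s)
                 set(np.flatnonzero(H[I, :].any(axis=0))))  # J̄(s)
            for I in classes.values()]

# H(iii) of Example V.8: a single class with J(s) = {0}, J̄(s) = {0, 1, 2}.
print(decompose([[1, 0, 0], [1, 1, 0], [1, 1, 1]]))
```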
Theorem V.9. Let Z be as in Theorem V.4. Let also H be the hitting matrix corresponding to (A,X) with X = A ⊙ Z, and let {J(s), J̄(s)}_{s ∈ [r]} be the sets defined in (5.19) and (5.20). Then, with probability one, we have

(i) r = r(J(A,X)),

(ii) for all J ⊂ [p], J ∈ Jr(A, A ⊙ Z) if and only if J can be written as

(5.21) J = {j1, . . . , jr} with js ∈ J(s) , s ∈ [r] ,

(iii) for ν(x, E) defined in (5.7),

(5.22) ν(X, E) = ∏_{s=1}^r ν(s)(X, E) with ν(s)(X, E) = Σ_{j ∈ J(s)} w(s)_j(X) ν(s)_j(X, E) / Σ_{j ∈ J(s)} w(s)_j(X) ,
where for all j ∈ J(s),

(5.23) w(s)_j(x) := z̄j fZj(z̄j) · ∏_{k ∈ J̄(s) \ {j}} FZk(z̄k) ,
(5.24) ν(s)_j(x, E) := δ_{z̄j}(πj(E)) · ∏_{k ∈ J̄(s) \ {j}} P(Zk ∈ πk(E) | Zk < z̄k) ,

with z̄j = z̄j(x) as in (5.3).
The proof of Theorem V.9 is given in Section 5.6.
Remark V.10. Note that this result does not claim that ν(x, E) in (5.22) is the regular
conditional probability. It merely provides an equivalent expression for (5.7), which
is valid with probability one. We still need to show that (5.7), or equivalently (5.22),
is indeed the regular conditional probability.
From (5.23) and (5.24), one can see that ν(s) is the conditional distribution of (Zj)_{j ∈ J̄(s)}. Therefore, Relation (5.22) implies that the vectors (Zj)_{j ∈ J̄(s)} , s ∈ [r], are conditionally independent (in s), given X = x. This leads to the following improved conditional sampling algorithm:
Algorithm II:
1. Compute z̄j for j = 1, . . . , p and the hitting matrix H = H(A,x).

2. Identify {J(s), J̄(s)}_{s ∈ [r]} by (5.19) and (5.20).

3. Compute {w(s)_j}_{j ∈ J(s)} for all s ∈ [r] by (5.23).

4. Sample (Zj)_{j ∈ J̄(s)} | X = x ∼ ν(s)(x, ·) independently for s = 1, . . . , r.

5. Combine the sampled (Zj)_{j ∈ J̄(s)} , s = 1, . . . , r, to obtain a sample Z.
This algorithm identifies all hitting scenarios in an efficient way. To illustrate its efficiency compared to Algorithm I, suppose that r = 10 and |J(s)| = 10 for all s ∈ [10]. Then, applying Formula (5.7) in Algorithm I requires storing in memory the weights of all 10^10 hitting scenarios. In contrast, the implementation of (5.22) requires saving only 10 × 10 weights. This improvement is critical in practice since it allows us to handle large, realistic models.

Table 5.1: Means and standard deviations (in parentheses) of the running times (in seconds) for the decomposition of the hitting matrix H, based on 100 independent observations X = A ⊙ Z, where A is an (n × p) matrix corresponding to a discretized Smith model.

  p \ n      1            5            10           50
  2500       0.03 (0.02)  0.13 (0.03)  0.24 (0.04)  1.25 (0.09)
  10000      0.11 (0.04)  0.50 (0.05)  1.00 (0.08)  4.98 (0.33)
Table 5.1 demonstrates the running times of Algorithm II as a function of the
dimensions n × p of the matrix A. It is based on a discretized 2-d Smith model
(Section 5.5) and measured on an Intel(R) Core(TM)2 Duo CPU E4400 2.00GHz
with 2GB RAM. It is remarkable that the times scale linearly in both n and p.
5.4 MARMA Processes
In this section, we apply our result to the max-autoregressive moving average
(MARMA) processes studied by Davis and Resnick [19]. A stationary process
{Xt}_{t∈Z} is a MARMA(m, q) process if it satisfies the MARMA recursion:

(5.25) Xt = φ1Xt−1 ∨ · · · ∨ φmXt−m ∨ Zt ∨ θ1Zt−1 ∨ · · · ∨ θqZt−q ,

for all t ∈ Z, where φi ≥ 0, θj ≥ 0, i = 1, . . . , m, j = 1, . . . , q, are the parameters, and {Zt}_{t∈Z} are i.i.d. 1-Fréchet random variables. Proposition 2.2 in [19] shows that (5.25) has a unique solution of the form

(5.26) Xt = ⋁_{j=0}^∞ ψj Zt−j < ∞ , almost surely,
with ψj ≥ 0, j ≥ 0, Σ_{j=0}^∞ ψj < ∞, if and only if φ* = ⋁_{i=1}^m φi < 1. In this case,

ψj = ⋁_{k=0}^{j∧q} αj−k θk ,

where {αj}_{j∈Z} are determined recursively by αj = 0 for all j < 0, α0 = 1 and

(5.27) αj = φ1αj−1 ∨ φ2αj−2 ∨ · · · ∨ φmαj−m , ∀ j ≥ 1 .

In the sequel, we will focus on the MARMA process (5.25) with unique stationary solution (5.26). In this case, the MARMA process is a spectrally discrete max-stable process. Without loss of generality, we also assume {Zk}_{k∈Z} to be standard 1-Fréchet.
We consider the prediction of the MARMA process in the following framework: suppose at each time t ∈ {1, . . . , n} we observe the value Xt of the process, and the goal is to predict {Xs}_{n<s≤n+N}. We do so by generating i.i.d. samples from the conditional distribution of {Xs}_{n<s≤n+N} | {Xt}_{t=1,...,n}. To apply our result, it suffices to provide a max-linear representation of this model. We will truncate (5.26) to obtain

(5.28) X̂t = ⋁_{j=0}^p ψj Zt−j , ∀ t = 1, . . . , n + N .

The truncated process can approximate the original one arbitrarily well, if we take p large enough. Indeed, by using the independence and max-stability of the Zt's, one can show that
(5.29) P(X̂t = Xt) = P( ⋁_{j=0}^p ψj Zt−j ≥ ⋁_{j=p+1}^∞ ψj Zt−j ) = 1 − ( Σ_{j=p+1}^∞ ψj ) / ( Σ_{j=0}^∞ ψj ) −→ 1 ,

as p → ∞. Moreover, by induction on αj in (5.27), one can show that αj ≤ (φ*)^{j/m} for all j ∈ N, and thus the convergence in (5.29) above is geometrically fast.
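For instance, the agreement probability (5.29) can be tabulated for the MAR(3) process with φ = (0.7, 0.5, 0.3) used later in this section (here q = 0, so ψj = αj):

```python
import numpy as np

phi = [0.7, 0.5, 0.3]
alpha = [1.0]
for j in range(1, 2001):                      # effectively the full series
    alpha.append(max(phi[i] * alpha[j - 1 - i] for i in range(min(3, j))))
alpha = np.array(alpha)
sigma = alpha.sum()                           # total mass Σψ_j ≈ 3.4
match = {p: alpha[: p + 1].sum() / sigma for p in (5, 10, 20, 50)}
print({p: round(v, 4) for p, v in match.items()})
# {5: 0.875, 10: 0.9779, 20: 0.9993, 50: 1.0} -- geometric convergence to 1
```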
Now, we reformulate the prediction problem with the model (5.28) as follows:

observe X̂[1,n] = A ⊙ Z , and predict Ŷ[1,N] = B ⊙ Z | X̂[1,n] ,

with the notation X̂[1,n] = (X̂1, . . . , X̂n), Ŷ[1,N] = (X̂n+1, . . . , X̂n+N) and Z = (Z1−p, Z2−p, . . . , Zn+N). Here, A ∈ R^{n×(p+n+N)}_+ and B ∈ R^{N×(p+n+N)}_+ are determined by (5.28). In particular,

(5.30)
[ A ]   [ ψp  ψp−1  · · ·  ψ0   0    0   · · ·  0  ]
[   ] = [ 0   ψp   ψp−1  · · ·  ψ0   0   · · ·  0  ]
[ B ]   [ . . .      . . .       . . .      . . .  ]
        [ 0   · · ·  0    ψp  ψp−1  · · ·  ψ0   0  ]
        [ 0   · · ·  0    0    ψp  ψp−1  · · ·  ψ0 ] ,

that is, each row contains the weights ψp, . . . , ψ0, shifted one column to the right relative to the row above. In practice, given the observations X̂[1,n], we use our algorithm to sample from the conditional distribution Z | X̂[1,n]. Therefore, we can sample

(5.31) Ŷ[1,N] | X̂[1,n] =_d B ⊙ Z | X̂[1,n] .
Our approach is different from the prediction considered in [19], which we now briefly review. Davis and Resnick took the classic time series point of view and investigated how to approximate Xs by a max-linear combination of {Xt}_{t=1,...,n}, w.r.t. a certain metric d. Namely, for all Y ∈ H with

H = { ⋁_{j=−∞}^∞ αj Zj : αj ≥ 0, Σ_{j=−∞}^∞ αj < ∞ } ,

they considered a projection of Y onto the space Fn, max-linearly spanned by {Xt}_{t=1,...,n}: Fn = { ⋁_{j=0}^∞ bj Xn−j : bj ≥ 0, Σ_{j=0}^∞ bj < ∞ }. That is, consider the projection P̂nY defined by

(5.32) P̂nY = argmin_{Ŷ ∈ Fn} d(Ŷ, Y) ,

with the metric d induced by d(⋁_j αj Zj, ⋁_j βj Zj) = Σ_j |αj − βj|. For specific MARMA processes, [19] provided predictors based on the projection (5.32). We will refer to these predictors as the projection predictors.
In general, the conditional samples reflect the conditional distribution (5.31), and they provide more information than the projection predictors. Sampling multiple times from (5.31), we can calculate, e.g., conditional medians, conditional means, and quantiles, which are optimal predictors with respect to various loss functions.
Example V.11 (MAR(m) processes). Consider the MAR(m) ≡ MARMA(m, 0) process with

(5.33) Xt = φ1Xt−1 ∨ · · · ∨ φmXt−m ∨ Zt .

The projection predictor for this model can be obtained recursively by

(5.34) X̂t+k = φ1X̂t+k−1 ∨ · · · ∨ φm X̂t+k−m ,

with X̂t = Xt, t = 1, . . . , n (see [19], p. 799).
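The recursion (5.34) is easy to implement; a sketch (names ours):

```python
import numpy as np

def projection_predictor(x_obs, phi, N):
    """Recursive projection predictor (5.34) for a MAR(m) process:
    X̂_{t+k} = max_i φ_i X̂_{t+k-i}, seeded with the observed values."""
    m = len(phi)
    path = list(x_obs)                 # observed X_1, ..., X_n
    for _ in range(N):
        path.append(max(phi[i] * path[-1 - i] for i in range(m)))
    return np.array(path[len(x_obs):])

# With a single large observation, the forecast decays by the max-recursion:
pred = projection_predictor([0.1, 0.2, 10.0], [0.7, 0.5, 0.3], 3)
print(pred)  # [7.  5.  3.5]
```

Note that the predictor can only decay from the last observed maxima; it never anticipates a new arrival Zt, which is the source of the underestimation discussed next.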
Figure 5.2 illustrates an application of our conditional sampling algorithm in this case. Consider an MAR(3) process {Xt}_{t=1}^{150} with φ1 = 0.7, φ2 = 0.5 and φ3 = 0.3. In effect, we use the truncated model {X̂t}_{t∈N} in (5.28) with p = 500, but we still write Xt for the sake of simplicity. Treating the first 100 values as observed, we plot the projection predictor, the conditional upper 95%-quantiles and the conditional medians of {Xs}_{s=101}^{150}, based on 500 independent samples from the conditional distribution.
Observe that the value of the projection predictor in Figure 5.2 is always below the conditional median. This "underestimation" phenomenon was typical in all the simulations we performed. It can be explained by the fact that the projection predictor in (5.34) does not account for the jumps of the process caused by new arrivals {Zt}_{t>100}. Indeed, a large new arrival Zt will cause the process to jump immediately to Zt at time t, but this will never occur for the projection predictor X̂t.
[Figure 5.2 here: sample path of the MARMA process over t = 1, . . . , 150, together with the conditional 95% quantile, the conditional median, and the projection predictor.]

Figure 5.2: Prediction of a MARMA(3,0) process with φ1 = 0.7, φ2 = 0.5 and φ3 = 0.3, based on the observation of the first 100 values of the process.
Next, we apply our algorithm to examine the bias of the projection predictor. To do this, for each generated MARMA process, we calculated the conditional cumulative probability attained by the projection predictor at each location s = 101, . . . , 150. Namely, using 500 independent samples {X(k)_s}_{s=101}^{150}, k = 1, . . . , 500, from the conditional distribution, we calculated

(5.35) P(Xs ≤ X̂s | {Xt}_{t=1}^{100}) ≈ (1/500) Σ_{k=1}^{500} 1_{{X(k)_s ≤ X̂s}} , ∀ s > 100 ,

where X̂s is the projection predictor in (5.34). This procedure was repeated 1000 times for independent realizations of {Xt}_{t=1}^{100}, and the means of the (estimated) probabilities in (5.35) are reported in Table 5.2. Note that as the time lag increases, the conditional cumulative probabilities attained by the projection predictors decrease. In this way, our conditional sampling algorithm helps quantify numerically the underestimation phenomenon observed in Figure 5.2.
Table 5.2: Cumulative probabilities attained by the projection predictors at time 100 + t, based on 1000 simulations.

  t      1      2      3      4      5      10    20    30   40
  mean   70.6%  50.3%  35.6%  25.3%  17.8%  2.9%  0.1%  0%   0%
Finally, we compare the generated conditional samples to the true process values at times s = 101, . . . , 150. Our goal is to demonstrate the validity of our conditional sampling algorithm. The idea is that, at each location s = 101, . . . , 150, the true process should lie below the predicted 95% upper confidence bound of Xs | {Xt}_{t=1}^{100} with probability at least 95%. (Note that due to the presence of atoms in the conditional distributions, the coverage probability may in principle be higher than 95%.) Motivated by this, we repeat the procedure in the previous paragraph and record the proportion of times that Xs is below the predicted confidence quantile, for each s. We refer to these values as the coverage rates. As discussed, the coverage rates should be close to 95%. This is supported by our simulation results, shown in Table 5.3.
Table 5.3: Coverage rates (CR) and the widths of the upper 95% confidence intervals at time 100 + t, based on 1000 simulations.

  t       1      2     3     4     5     10    20    30    40
  CR      0.956  0.952 0.954 0.957 0.966 0.947 0.943 0.951 0.955
  width   13.06  26.6  37.8  45.6  51.2  62.8  66.0  66.2  65.4
Table 5.3 also shows the widths of the upper 95%-confidence intervals. Note that these widths are not equal to the upper confidence bounds, given by the conditional 95%-quantiles, since the left end-points of the conditional distributions are greater than zero. When the time lag is small, the left end-point is large and the widths are small, due to the strong influence of the past of the process {Xt}_{t=1}^{100}. On the other hand, because of the weak temporal dependence of the MAR(3) process, this influence decreases fast as the lag increases. Consequently, the conditional distribution converges to the unconditional one, and the conditional quantile to the unconditional one. Note that the (unconditional) 95%-quantile of Xs for the MARMA process (5.26) can be calculated via the formula 0.95 = P(σZ ≤ u) = exp(−σ/u), with σ = Σ_{j=0}^p ψj. For the MAR(3) process we chose, we have σ ≈ 3.4, and the 95%-quantile of Xs equals 66.29. This is consistent with the widths in Table 5.3 for large lags.
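The closing computation can be checked directly:

```python
import math

# Unconditional 95%-quantile of σZ for standard 1-Fréchet Z:
# solve 0.95 = P(σZ ≤ u) = exp(-σ/u), i.e. u = σ / (-log 0.95).
sigma = 3.4
u = sigma / (-math.log(0.95))
print(round(u, 2))  # 66.29
```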
Remark V.12. As pointed out by an anonymous referee, in this case one can directly generate samples from {Xs}_{s=n+1}^{n+N} | {Xt}_{t=1}^{n} by generating independent Fréchet random variables and iterating (5.33). We selected this example only for illustrative purposes and to be able to compare with the projection predictors in [19]. One can modify the prediction problem slightly, so that our algorithm still applies by adjusting (5.30) accordingly, while both the projection predictor and the direct method based on (5.33) do not apply. For example, consider the prediction problem with respect to the conditional distribution P({Xs}_{s=2n+1}^{2n+N} ∈ · | Xt : t = 1, 3, . . . , 2n − 1) (prediction with only partial history observed) or P({Xs}_{s=2}^{n−1} ∈ · | X1, Xn) (prediction of the middle path with the beginning and the end-point (in the future) given). In other words, our algorithm has no restriction on the locations of the observations. This feature is of great importance in spatial prediction problems.
5.5 Discrete Smith Model
Consider the following moving maxima random field model in R²:

(5.36) Xt = ∫e_{R²} φ(t − u) Mα(du) , t = (t1, t2) ∈ R² ,

where ∫e denotes the extremal integral and Mα is an α-Fréchet random sup-measure on R² with the Lebesgue control measure. Smith [98] proposed to use for φ the bivariate Gaussian density:

(5.37) φ(t1, t2) := β1β2 / (2π √(1 − ρ²)) · exp( − (β1² t1² − 2ρ β1β2 t1t2 + β2² t2²) / (2(1 − ρ²)) ) ,

with correlation ρ ∈ (−1, 1) and variances σi² = 1/βi², i = 1, 2. Consistent and
asymptotically normal estimators for the parameters ρ, β1 and β2 were obtained by
de Haan and Pereira [25]. Here, we will assume that these parameters are known
and will illustrate the conditional sampling methodology over a discretized version
of the random field (5.36). Namely, we truncate the extremal integral in (5.36) to
the square region [−M, M]² and consider a uniform mesh of size h := M/q, q ∈ N. We then set

(5.38) Xt := ⋁_{−q ≤ j1,j2 ≤ q−1} h^{2/α} φ(t − u_{j1 j2}) Z_{j1 j2} ,

where u_{j1 j2} = ((j1 + 1/2)h, (j2 + 1/2)h) and h^{2/α} Z_{j1 j2} =_d Mα((j1h, (j1 + 1)h] × (j2h, (j2 + 1)h]). This discretized model (5.38) can be made arbitrarily close to the spectrally continuous one in (5.36) by taking a fine mesh h and a sufficiently large M (see e.g. [101]).
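A sketch of how the coefficient matrix A of the resulting max-linear model can be assembled from (5.37)–(5.38) (names ours; here q = 25, so that the mesh has 50 × 50 = 2500 cells, matching p = 2500 used in this section):

```python
import numpy as np

def smith_density(t1, t2, beta1=1.0, beta2=1.0, rho=0.0):
    """Bivariate Gaussian density (5.37) with variances 1/β_i²."""
    norm = beta1 * beta2 / (2 * np.pi * np.sqrt(1 - rho ** 2))
    quad = beta1**2 * t1**2 - 2 * rho * beta1 * beta2 * t1 * t2 + beta2**2 * t2**2
    return norm * np.exp(-quad / (2 * (1 - rho ** 2)))

def smith_matrix(obs, M=4.0, q=25, alpha=1.0):
    """Row i: observed location t_i; column (j1, j2): mesh cell of (5.38),
    with entry h^{2/α} φ(t_i - u_{j1 j2}), h = M/q."""
    h = M / q
    centers = (np.arange(-q, q) + 0.5) * h           # cell centers u_{j1 j2}
    u1, u2 = np.meshgrid(centers, centers, indexing="ij")
    return np.array([(h ** (2 / alpha) * smith_density(t[0] - u1, t[1] - u2)).ravel()
                     for t in obs])

A = smith_matrix([(0.0, 0.0), (1.0, -1.0)])
print(A.shape)  # (2, 2500)
```

Feeding this A, together with the observed values x, into the Algorithm II sampler of Section 5.3 yields the conditional realizations shown in Figure 5.1.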
Suppose that the random field X in (5.38) is observed at n locations: Xti = xi, ti ∈ [−M, M]², i = 1, . . . , n. In view of (5.38), we have the max-linear model X = A ⊙ Z, with X = (Xti)_{i=1}^n and Z = (Zj)_{j=1}^p, p = (2q)². By sampling from the conditional distribution of Z | X = x, we can predict the random field Xs at arbitrary locations s ∈ R².
To illustrate our algorithm, we used the model (5.38) with parameter values ρ = 0, β1 = β2 = 1, M = 4, p = (2q)² = 2500, and n = 7 observed locations. We generated N = 500 independent samples from the conditional distribution of the random field Xs, where s takes values on a uniform 100 × 100 grid in the region [−2, 2] × [−2, 2].
We have already seen four of these realizations in Figure 5.1. Figure 5.3 illustrates
the median and 0.95-th quantile of the conditional distribution. The former provides
the optimal predictor for the values of the random field given the observed data, with
respect to the absolute deviation loss. The marginal quantiles, on the other hand,
provide important confidence regions for the random field, given the data.
Certainly, conditional sampling may be used to address more complex functional
prediction problems. In particular, given a two-dimensional threshold surface, one
can readily obtain the correct probability that the random field exceeds or stays
below this surface, conditionally on the observed values. This is much more than
what marginal conditional distributions can provide.
[Figure 5.3 here: two contour plots, "Conditional Median of the Smith model" (left) and "Conditional Marginal Quantile of the Smith model" (right), with parameters ρ = 0, β1 = 1, β2 = 1, q = 0.95.]

Figure 5.3: Conditional medians (left) and 0.95-th conditional marginal quantiles (right). Each cross indicates an observed location of the random field, with the observed value at right.
5.6 Proofs of Theorems V.4 and V.9
In this section, we prove Theorems V.4 and V.9. We will first prove Theorem V.9,
which simplifies the regular conditional probability formula (5.7) in Theorem V.4.
Then, we show the simplified new formula is the desired regular conditional probabil-
ity, which completes the proof of Theorem V.4. The key step to prove Theorem V.9
is the following lemma. Write H·j = {i ∈ [n] : hi,j = 1}.

Lemma V.13. Under the assumptions of Theorem V.9, with probability one,

(i) J(s) is nonempty for all s ∈ [r], and

(ii) for all j ∈ [p], H·j ∩ Is ≠ ∅ implies H·j ⊂ Is.
Proof. Note that to show part (ii) of Lemma V.13, it suffices to observe that, since Is is an equivalence class w.r.t. Relation (5.17), H·j \ Is and H·j ∩ Is cannot both be nonempty. Thus, it remains to show part (i). We proceed by excluding several P-measure zero sets, on which the desired results may not hold.

First, observe that for all i ∈ [n], the maximum value of {ai,j Zj}_{j∈[p]} is achieved for a unique j ∈ [p] with probability one, since the Zj's are independent and have continuous distributions. Thus, the set

N1 := ⋃_{i∈[n], j1,j2∈[p], j1≠j2} { ai,j1 Zj1 = ai,j2 Zj2 = max_{j∈[p]} ai,j Zj }

has P-measure zero. From now on, we focus on the event N1^c and set j(i) = argmax_{j∈[p]} ai,j Zj for all i ∈ [n].

Next, we show that with probability one, i1 ~j i2 implies j(i1) = j(i2). That is, the set

N2 := ⋃_{j∈[p], i1,i2∈[n], i1≠i2} Nj,i1,i2 with Nj,i1,i2 := { j(i1) ≠ j(i2), i1 ~j i2 }
has P-measure 0. It suffices to show P(Nj,i1,i2) = 0 for all i1 ≠ i2. If not, since [p] and [n] are finite sets, there exists N0 ⊂ Nj,i1,i2, such that j(i1) = j1 ≠ j2 = j(i2) on N0, and P(N0) > 0. At the same time, however, observe that i1 ~j i2 implies hi1,j = hi2,j = 1, which yields

a_{ik,j} z̄j = x_{ik} = a_{ik,j(ik)} Z_{j(ik)} = a_{ik,jk} Z_{jk} , k = 1, 2 .

It then follows that on N0, Zj1/Zj2 = a_{i1,j} a_{i2,j2} / (a_{i2,j} a_{i1,j1}), which is a constant. This constant is strictly positive and finite. Indeed, this is because on N1^c, ai,j(i) > 0 by Assumption A, and hi,j = 1 implies ai,j > 0. Since Zj1 and Zj2 are independent continuous random variables, it then follows that P(N0) = 0.

Finally, we focus on the event (N1 ∪ N2)^c. Then, for any i1, i2 ∈ Is, we have i1 ∼ i2; let k0, . . . , km be as in (5.17). It then follows that j(i1) = j(k0) = j(k1) = · · · = j(km) = j(i2). Note that for all i ∈ [n], hi,j(i) = 1 by the definition of j(i). Hence, j(i1) = j(i2) ∈ J(s). We have thus completed the proof.
Proof of Theorem V.9. Since the Is, s ∈ [r], are disjoint with ⋃_{s∈[r]} Is = [n], in the language of the set-covering problem, to cover [n] we need to cover each Is. By part (ii) of Lemma V.13, two different classes Is1 and Is2 cannot be covered by a single set H·j. Thus, we need at least r sets to cover [n]. On the other hand, with probability one we can select one js from each J(s) (by part (i) of Lemma V.13), which yields a valid cover. That is, with probability one, r = r(J(H)), and any valid minimum-cost cover of [n] must be as in (5.21), and vice versa. We have thus proved parts (i) and (ii).
To show (iii), by straightforward calculation, we have, with probability one,
\begin{align*}
\sum_{J\in\mathcal J_r(A,x)} w_J
&= \sum_{j_1\in \mathcal J(1)}\cdots\sum_{j_r\in \mathcal J(r)} w_{j_1,\dots,j_r}\\
&= \sum_{j_1\in \mathcal J(1)}\cdots\sum_{j_{r-1}\in \mathcal J(r-1)} \prod_{s=1}^{r-1} z_{j_s} f_{Z_{j_s}}(z_{j_s}) \prod_{\substack{j\notin J^{(r)}\\ j\neq j_1,\dots,j_{r-1}}} F_{Z_j}(z_j)
\times \sum_{j\in \mathcal J(r)} z_j f_{Z_j}(z_j) \prod_{k\in J^{(r)}\setminus\{j\}} F_{Z_k}(z_k)\\
&= \prod_{s=1}^{r} \sum_{j\in \mathcal J(s)} z_j f_{Z_j}(z_j) \prod_{k\in J^{(s)}\setminus\{j\}} F_{Z_k}(z_k)
= \prod_{s=1}^{r} \sum_{j\in \mathcal J(s)} w^{(s)}_j\,. \tag{5.39}
\end{align*}
Similarly, we have
\[
\sum_{J\in\mathcal J_r(A,x)} w_J \nu_J(x, E) = \prod_{s=1}^{r} \sum_{j\in \mathcal J(s)} w^{(s)}_j \nu^{(s)}_j(x, E)\,. \tag{5.40}
\]
By plugging (5.39) and (5.40) into (5.7), we obtain the desired result and complete the proof.
Proof of Theorem V.4. To prove that ν in (5.7) yields the regular conditional probability of Z given X, it is enough to show that
\[
P(X \in D,\ Z \in E) = \int_D \nu(x, E)\, P_X(dx)\,, \tag{5.41}
\]
for all rectangles $D \subset \mathbb R^n_+$ and $E \subset \mathbb R^p_+$. In view of Theorem V.9, it is enough to work with $\nu(x, E)$ given by (5.22).

We shall prove (5.41) by breaking the integration into a suitable sum of integrals over regions corresponding to all hitting matrices H for the max-linear model $X = A \odot Z$. We say such a hitting matrix H is nice if $\mathcal J(s)$ defined in (5.19) is nonempty for all $s \in r$. In view of Lemma V.13, it suffices to focus on the set $\mathcal H(A)$ of nice hitting matrices H. Notice that the set $\mathcal H(A)$ is finite, since the elements of the hitting matrices are 0's and 1's.
For all rectangles $D \subset \mathbb R^n_+$, let
\[
D_H = \big\{ x = A \odot z\,:\, H(A,x) = H,\ x \in D \big\}
\]
be the set of all $x \in \mathbb R^n_+$ that give rise to the hitting matrix H. By Lemma V.13 (i), for the random vector $X = A \odot Z$, with probability one, we have
\[
X = \sum_{H\in\mathcal H(A)} X\, \mathbf 1_{D_H}(X)
\]
and hence
\[
\int_D \nu(x, E)\, P_X(dx) = \sum_{H\in\mathcal H(A)} \int_{D_H} \nu(x, E)\, P_X(dx)\,. \tag{5.42}
\]
Now fix an arbitrary and non-random nice hitting matrix $H \in \mathcal H(A)$. Let $\{I_s\}_{s\in r}$ denote the partition of n determined by (5.17), and let $\mathcal J(s)$, $J^{(s)}$, $s = 1, \dots, r$ be as in (5.19). Recall that $\mathcal J(s) \subset J^{(s)}$ and the sets $J^{(s)}$, $s = 1, \dots, r$ are disjoint.

Focus on the set $D_H \subset \mathbb R^n_+$. Without loss of generality, and for notational convenience, suppose that $s \in I_s$ for all $s = 1, \dots, r$. That is,
\[
I_1 = \{1, i_{1,2}, \dots, i_{1,k_1}\},\ I_2 = \{2, i_{2,2}, \dots, i_{2,k_2}\},\ \cdots,\ I_r = \{r, i_{r,2}, \dots, i_{r,k_r}\}\,.
\]
Define the projection mapping $P_H : D_H \to \mathbb R^r_+$ onto the first r coordinates:
\[
P_H(x_1, \dots, x_n) = (x_1, \dots, x_r) \equiv x_r\,.
\]
Note that $P_H$, restricted to $D_H$, is one-to-one. Indeed, for all $i \in I_s$, we have $x_i = a_{i,j}z_j$ and $x_s = a_{s,j}z_j$ for all $j \in \mathcal J(s)$ (recall (5.19)). This implies $x_i = (a_{i,j}/a_{s,j})x_s$ for all $i \in I_s$ and all $s = 1, \dots, r$. Hence, $P_H(x) = P_H(x')$ implies $x = x'$.

Consequently, we can write $x = P_H^{-1}(x_r)$, $x_r \in P_H(D_H)$, and
\[
\int_{D_H} \nu(x, E)\, P_X(dx) = \int_{P_H(D_H)} \nu(x, E)\, Q^{X_r}_H(dx_1 \dots dx_r)\,,
\]
where $Q^{X_r}_H := P_X \circ P_H^{-1}$ is the induced measure on the set $P_H(D_H)$.
Lemma V.14. The measure $Q^{X_r}_H$ has a density with respect to the Lebesgue measure on the set $P_H(D_H)$. The density is given by
\[
Q^{X_r}_H(dx_r) = \mathbf 1_{P_H(D_H)}(x_r)\, \prod_{s=1}^{r} \Big( \sum_{j\in \mathcal J(s)} w^{(s)}_j(x) \Big)\, \frac{dx_1}{x_1}\cdots\frac{dx_r}{x_r}\,. \tag{5.43}
\]
The proof of this result is given below. In view of (5.43) and (5.22), we obtain
\begin{align*}
\int_{P_H(D_H)} \nu(x, E)\, Q^{X_r}_H(dx_r)
&= \int_{P_H(D_H)} \underbrace{\prod_{s=1}^{r} \frac{\sum_{j\in \mathcal J(s)} w^{(s)}_j(x)\, \nu^{(s)}_j(x, E)}{\sum_{k\in \mathcal J(s)} w^{(s)}_k(x)}}_{=\,\nu(x,E)} \times \underbrace{\prod_{s=1}^{r} \Big( \sum_{j\in \mathcal J(s)} w^{(s)}_j(x) \Big) \frac{dx_1}{x_1}\cdots\frac{dx_r}{x_r}}_{=\,Q^{X_r}_H(dx_r)}\\
&= \int_{P_H(D_H)} \prod_{s=1}^{r} \Big( \sum_{j\in \mathcal J(s)} w^{(s)}_j(x)\, \nu^{(s)}_j(x, E) \Big) \frac{dx_1}{x_1}\cdots\frac{dx_r}{x_r}\,,
\end{align*}
which equals
\[
\sum_{j_1\in \mathcal J(1),\,\cdots,\, j_r\in \mathcal J(r)}\ \underbrace{\int_{P_H(D_H)} \prod_{s=1}^{r} w^{(s)}_{j_s}(x)\, \nu^{(s)}_{j_s}(x, E)\, \frac{dx_1}{x_1}\cdots\frac{dx_r}{x_r}}_{=:\, I(j_1,\dots,j_r)}\,. \tag{5.44}
\]
Fix $j_1 \in \mathcal J(1), \cdots, j_r \in \mathcal J(r)$ and focus on the integral $I(j_1, \cdots, j_r)$. Define
\[
\Omega^r_H(D_H) := \big\{ (z_{j_1}, \dots, z_{j_r})\,:\, z_{j_s} = x_s/a_{s,j_s},\ s = 1, \dots, r,\ x_r = (x_s)_{s=1}^{r} \in P_H(D_H) \big\}\,.
\]
We have, by (5.23), (5.24), and replacing $x_s$ with $a_{s,j_s}z_{j_s}$, $s = 1, \dots, r$ (simple change of variables),
\begin{align*}
I(j_1, \cdots, j_r)
&= \int_{\Omega^r_H(D_H)} \prod_{s=1}^{r} \Big( z_{j_s} f_{Z_{j_s}}(z_{j_s}) \prod_{k\in J^{(s)}\setminus\{j_s\}} F_{Z_k}(z_k) \times \delta_{\pi_{j_s}(E)}(z_{j_s}) \prod_{k\in J^{(s)}\setminus\{j_s\}} P(Z_k \in \pi_k(E) \mid Z_k < z_k) \Big) \frac{dz_{j_1}}{z_{j_1}}\cdots\frac{dz_{j_r}}{z_{j_r}}\\
&= \int_{\Omega^r_H(D_H)} \prod_{s=1}^{r} f_{Z_{j_s}}(z_{j_s})\, \delta_{\pi_{j_s}(E)}(z_{j_s}) \times \prod_{k\in p\setminus\{j_1,\dots,j_r\}} P(Z_k \in \pi_k(E),\ Z_k < z_k)\, dz_{j_1}\cdots dz_{j_r}\,. \tag{5.45}
\end{align*}
Define
\[
\Omega_{H;j_1,\dots,j_r}(D_H) = \big\{ z \in \mathbb R^p_+\,:\, x = A \odot z \in D_H,\ z_{j_s} = x_s/a_{s,j_s},\ s = 1, \dots, r,\ z_k < z_k(x),\ k \in p \setminus \{j_1, \dots, j_r\} \big\}\,.
\]
By the independence of the $Z_k$'s, (5.45) becomes
\[
I(j_1, \dots, j_r) = P\big( Z \in \Omega_{H;j_1,\dots,j_r}(D_H) \cap E \big)\,. \tag{5.46}
\]
By plugging (5.46) into (5.44), we obtain
\[
\int_{D_H} \nu(x, E)\, P_X(dx) = \int_{P_H(D_H)} \nu(x, E)\, Q^{X_r}_H(dx_r)
= \sum_{j_1\in \mathcal J(1),\,\cdots,\, j_r\in \mathcal J(r)} P\big( Z \in \Omega_{H;j_1,\cdots,j_r}(D_H) \cap E \big) = P(A \odot Z \in D_H,\ Z \in E)\,, \tag{5.47}
\]
because the summation over $(j_1, \dots, j_r)$ accounts for all relevant hitting scenarios corresponding to the matrix H. Plugging (5.47) into (5.42), we have
\[
\int_D \nu(x, E)\, P_X(dx) = \sum_{H\in\mathcal H(A)} P(X \equiv A \odot Z \in D_H,\ Z \in E) = P(X \in D,\ Z \in E)\,.
\]
This completes the proof of Theorem V.4.
Proof of Lemma V.14. Consider the random vector $X_r = (X_1, \dots, X_r)$. Observe that by the definition of the set $P_H(D_H)$, on the event $\{X_r \in P_H(D_H)\}$, we have
\[
X_r = \sum_{j_1\in \mathcal J(1),\,\cdots,\, j_r\in \mathcal J(r)} \begin{pmatrix} a_{1,j_1}Z_{j_1}\\ \vdots\\ a_{r,j_r}Z_{j_r} \end{pmatrix} \prod_{s=1}^{r} \underbrace{\mathbf 1\Big\{ \bigcap_{k\in J^{(s)}\setminus\{j_s\}} \big\{ a_{s,k}Z_k < a_{s,j_s}Z_{j_s} \big\} \Big\}}_{=:\,\mathbf 1_{C_{s,j_s}}}\,. \tag{5.48}
\]
Note that since $\mathcal J(s) \subset J^{(s)}$, $s = 1, \dots, r$, the events $\bigcap_{s=1}^{r} C_{s,j_s}$ are disjoint for all r-tuples $(j_1, \dots, j_r) \in \mathcal J(1) \times \cdots \times \mathcal J(r)$.
Recall that our goal is to establish (5.43). By the fact that the sum in (5.48) involves, with probability one, only one non-zero term (the one corresponding to some $(j_1, \dots, j_r)$), we have that for all measurable sets $\Delta \subset P_H(D_H)$, writing $\xi_{j_s} = a_{s,j_s}Z_{j_s}$,
\[
Q^{X_r}_H(\Delta) \equiv P(X_r \in \Delta) = \sum_{j_1\in \mathcal J(1),\,\cdots,\, j_r\in \mathcal J(r)} P\Big( \big\{ (\xi_{j_1}, \cdots, \xi_{j_r}) \in \Delta \big\} \cap \bigcap_{s=1}^{r} C_{s,j_s} \Big)\,. \tag{5.49}
\]
Now, consider the last probability, for fixed $(j_1, \dots, j_r)$. The random variables $\xi_{j_s}$, $s = 1, \dots, r$ are independent and have densities $f_{Z_{j_s}}(x_s/a_{s,j_s})/a_{s,j_s}$, $x_s \in \mathbb R_+$. We also have that the events $C_{s,j_s}$, $s = 1, \dots, r$ are mutually independent, since their definitions involve $Z_k$'s indexed by the disjoint sets $J^{(s)}$, $s = 1, \dots, r$. By conditioning on the $\xi_{j_s}$'s, we obtain that the probability on the right-hand side of (5.49) equals
\begin{align*}
&\int_\Delta \prod_{s=1}^{r} \frac{1}{a_{s,j_s}} f_{Z_{j_s}}(x_s/a_{s,j_s}) \times \prod_{s=1}^{r} P\Big( \bigcap_{k\in J^{(s)}\setminus\{j_s\}} \big\{ a_{s,k}Z_k < x_s \big\} \Big)\, dx_1\cdots dx_r\\
&= \int_\Delta \prod_{s=1}^{r} \Big( \frac{1}{a_{s,j_s}} f_{Z_{j_s}}(x_s/a_{s,j_s}) \prod_{k\in J^{(s)}\setminus\{j_s\}} F_{Z_k}(x_s/a_{s,k}) \Big)\, dx_1\cdots dx_r\,.
\end{align*}
In view of (5.48) and (5.23), replacing $\sum_{j_1\in \mathcal J(1),\,\cdots,\, j_r\in \mathcal J(r)} \prod_{s=1}^{r}$ by $\prod_{s=1}^{r} \sum_{j\in \mathcal J(s)}$, we obtain that the measure $Q^{X_r}_H$ has a density on $P_H(D_H)$, given by (5.43).
CHAPTER VI
Central Limit Theorems for Stationary Random Fields
The central limit theorem studies the asymptotic behavior of partial sums of random variables $S_n = X_1 + \cdots + X_n$. In the case that the random variables are independent, it is well understood when the normalized partial sums converge to a normal distribution:
\[
\frac{S_n - \mathbb E S_n}{\sqrt n} \Rightarrow \mathcal N(0, \sigma^2)\,.
\]
The case that $\{X_i\}_{i\in\mathbb N}$ are dependent also has a long history. This dates back to at least 1910, when Markov [58] proved a central limit theorem for a two-state Markov chain. Since then, the central limit theorem for stationary processes has been an active research area in probability theory.
In this chapter, our focus is on establishing central limit theorems for stationary random fields. That is, for stationary random variables $\{X_{i,j}\}_{(i,j)\in\mathbb N^2}$, when do we have
\[
\frac{\sum_{i=1}^{n}\sum_{j=1}^{n} (X_{i,j} - \mathbb E X_{i,j})}{n} \Rightarrow \mathcal N(0, \sigma^2)\,? \tag{6.1}
\]
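The simplest instance of (6.1) can be checked by simulation. The sketch below (illustrative only; all names and parameter choices are ours) draws replications of an $n\times n$ field of independent standard normals, for which $S_n/n$ is exactly $\mathcal N(0,1)$ for every n:

```python
import random
import statistics

def normalized_sum(field):
    """(sum_{i<=n} sum_{j<=n} X_{i,j}) / n for an n-by-n field, as in (6.1)."""
    n = len(field)
    return sum(sum(row) for row in field) / n

random.seed(42)
n, reps = 30, 2000
samples = [normalized_sum([[random.gauss(0.0, 1.0) for _ in range(n)]
                           for _ in range(n)])
           for _ in range(reps)]
# For i.i.d. N(0,1) entries the normalization n = sqrt(n^2) = |V_n|^{1/2} is
# exact: S_n/n is N(0,1), so the sample mean is near 0 and the variance near 1.
mean = statistics.fmean(samples)
var = statistics.variance(samples)
```

The interesting question, addressed below, is when the same normalization works for dependent fields.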
This problem has already been considered by many researchers. For example,
Bolthausen [5], Goldie and Morrow [40] and Bradley [7] studied this problem under
suitable mixing conditions. Basu and Dorea [3], Nahapetian [62], and Poghosyan and Roelly [73] considered the problem for multiparameter martingales. Another important result is due to Dedecker [27, 28], whose approach was based on an adaptation of
the Lindeberg method. As a particular case, Cheng and Ho [15] established a central
limit theorem for functionals of linear random fields, based on a lexicographically
ordered martingale approximation.
Here, we aim at establishing so-called projective-type conditions under which the central limit theorem (6.1) holds. Such conditions have recently drawn much attention in central limit theorems for stationary sequences, as they are easy to verify when applying such results to stochastic processes from statistics and econometrics (see e.g. Wu [119]). However, central limit theorems for stationary random fields based on projective conditions have been much less explored.
This problem is not a simple extension of a one-dimensional problem to a high-dimensional one. An important reason is that the main technique for establishing central limit theorems with projective conditions in one dimension, the martingale approximation approach, does not apply to (high-dimensional) random fields as successfully as to (one-dimensional) stochastic processes. This obstacle has been known among researchers for more than 30 years. For example, Bolthausen [5] remarked that 'Gordin uses an approximation by martingales, but his method appears difficult to generalize to dimensions ≥ 2.' (For the literature on martingale approximation, see e.g. Gordin and Lifsic [43], Kipnis and Varadhan [54], Woodroofe [116], Maxwell and Woodroofe [59], Wu and Woodroofe [121], Dedecker et al. [30], Peligrad et al. [68], among others, and Merlevede et al. [60] for a survey.)

In this chapter, we establish a central limit theorem and an invariance principle for stationary multiparameter random fields, using an m-dependent approximation approach. We first state the main result in the next section.
6.1 Main Result
We start with some notation. We consider a product probability space $(\Omega, \mathcal A, P)$, i.e., a $\mathbb Z^d$-indexed product of i.i.d. probability spaces of the form
\[
(\Omega, \mathcal A, P) \equiv \big( \mathbb R^{\mathbb Z^d}, \mathcal B^{\mathbb Z^d}, P^{\mathbb Z^d} \big)\,.
\]
Write $\epsilon_k(\omega) = \omega_k$ for all $\omega \in \mathbb R^{\mathbb Z^d}$ and $k \in \mathbb Z^d$. Then, $\{\epsilon_k\}_{k\in\mathbb Z^d}$ are i.i.d. random variables with distribution P. On such a space, we define the natural filtration $\{\mathcal F_k\}_{k\in\mathbb Z^d}$ by
\[
\mathcal F_k := \sigma\big( \epsilon_l : l \preceq k,\ l \in \mathbb Z^d \big)\,, \quad \mbox{for all } k \in \mathbb Z^d\,. \tag{6.2}
\]
Here and in the sequel, for all vectors $x \in \mathbb R^d$, we write $x = (x_1, \dots, x_d)$, and for all $l, k \in \mathbb R^d$, we let $l \preceq k$ stand for $l_i \leq k_i$, $i = 1, \dots, d$.
We focus on mean-zero stationary random fields defined on a product probability space. Let $\{T_k\}_{k\in\mathbb Z^d}$ denote the group of shift operators on $\mathbb R^{\mathbb Z^d}$ with $(T_k\omega)_l = \omega_{k+l}$, for all $k, l \in \mathbb Z^d$, $\omega \in \mathbb R^{\mathbb Z^d}$. Then, we consider random fields of the form
\[
\big\{ f \circ T_k \big\}_{k\in\mathbb Z^d}\,,
\]
where f is in the class $L^p_0 = \{ f \in L^p(\mathcal F_\infty) : \int f\, dP = 0 \}$, $p \geq 2$, with $\mathcal F_\infty = \bigvee_{k\in\mathbb Z^d} \mathcal F_k$.
Throughout this chapter, we consider a sequence $\{V_n\}_{n\in\mathbb N}$ of finite rectangular subsets of $\mathbb Z^d$, of the form
\[
V_n = \prod_{i=1}^{d} \big\{ 1, \dots, m^{(n)}_i \big\} \subset \mathbb N^d\,, \quad \mbox{for all } n \in \mathbb N\,, \tag{6.3}
\]
with $m^{(n)}_i$ increasing to infinity as $n \to \infty$ for all $i = 1, \dots, d$. Let
\[
S_n(f) \equiv S(V_n, f) = \sum_{k\in V_n} f \circ T_k \tag{6.4}
\]
denote the partial sums with respect to $V_n$. Moreover, write, for $t \in [0, 1]^d$, $V_n(t) = \prod_{i=1}^{d} [0, m^{(n)}_i t_i] \subset \mathbb R^d$ and $R_k = \prod_{i=1}^{d} (k_i - 1, k_i] \subset \mathbb R^d$ for all $k \in \mathbb Z^d$.
We also write
\[
B_{n,t}(f) \equiv B_{V_n,t}(f) = \sum_{k\in\mathbb N^d} \lambda\big( V_n(t) \cap R_k \big)\, f \circ T_k\,, \tag{6.5}
\]
where λ is the Lebesgue measure on $\mathbb R^d$, and consider weak convergence in the space $C[0, 1]^d$, the space of continuous functions on $[0, 1]^d$ equipped with the uniform metric. Recall that the standard d-parameter Brownian sheet on $[0, 1]^d$, denoted by $\{B(t)\}_{t\in[0,1]^d}$, is a mean-zero Gaussian random field with covariance $\mathbb E(B(s)B(t)) = \prod_{i=1}^{d} \min(s_i, t_i)$, $s, t \in [0, 1]^d$. Write $0 = (0, \dots, 0)$, $1 = (1, \dots, 1) \in \mathbb Z^d$. Our condition involves the following term:
\[
\Delta_{d,p}(f) := \sum_{k\in\mathbb N^d} \frac{\big\| \mathbb E(f \circ T_k \mid \mathcal F_1) \big\|_p}{\prod_{i=1}^{d} k_i^{1/2}}\,. \tag{6.6}
\]
Our main result is the following.
Theorem VI.1. Consider a product probability space described above. If $f \in L^2_0$, $f \in \mathcal F_0$ and $\Delta_{d,2}(f) < \infty$, then
\[
\sigma^2 = \lim_{n\to\infty} \frac{\mathbb E(S_n(f)^2)}{|V_n|} < \infty
\]
exists and
\[
\frac{S_n(f)}{|V_n|^{1/2}} \Rightarrow \mathcal N(0, \sigma^2)\,.
\]
In addition, if $f \in L^p_0$ and $\Delta_{d,p}(f) < \infty$ for some $p > 2$, then
\[
\frac{B_{n,\cdot}(f)}{|V_n|^{1/2}} \Rightarrow \sigma B(\cdot) \tag{6.7}
\]
in $C[0, 1]^d$.
For the sake of simplicity, we will prove Theorem VI.1 in the case d = 2 in
Sections 6.3 and 6.4.
Remark VI.2. Conditions involving conditional expectations, such as $\Delta_{d,p}(f) < \infty$ here, are referred to as projective conditions. Compared to mixing-type conditions (see e.g. Bradley [8]), projective ones are often easy to check in practice. One such example is given in Section 6.6. See for example [30] for comparisons of projective conditions.
The rest of the chapter is organized as follows. In Section 6.2 we provide preliminary
results on m-dependent approximation. We establish the central limit theorem in
Section 6.3 and then the invariance principle in Section 6.4. Sections 6.5 and 6.6 are
devoted to the applications to orthomartingales and functionals of stationary linear
random fields, respectively. In Section 6.7, we prove a moment inequality, which
plays a crucial role in proving our limit results. Some other auxiliary proofs are
given in Section 6.8.
6.2 m-Dependent Approximation
We describe the general procedure of m-dependent approximation in this section. Here, we assume neither a product structure on the underlying probability space nor any filtration structure. Instead, we simply assume $f \in L^2_0 = \{ f \in L^2(\Omega, \mathcal A, P) : \int f\, dP = 0 \}$, and that $\{T_k\}_{k\in\mathbb Z^d}$ is an Abelian group of bimeasurable, measure-preserving, one-to-one and onto maps on $(\Omega, \mathcal A, P)$.

The notion of m-dependence was first introduced by Hoeffding and Robbins [49]. We say a random variable f is m-dependent if $f \circ T_k$ and $f \circ T_l$ are independent whenever $|k - l|_\infty := \max_{i=1,\dots,d} |k_i - l_i| > m$. The following result on the asymptotic normality of sums of m-dependent random variables is due to Bolthausen [5] (see also Rosen [81]). Recall $\{V_n\}_{n\in\mathbb N}$ given in (6.3).
Theorem VI.3. Suppose $f_m \in L^2_0$ is m-dependent. Write
\[
\sigma_m^2 = \sum_{k\in\mathbb Z^d} \mathbb E\big[ f_m\, (f_m \circ T_k) \big]\,. \tag{6.8}
\]
Then,
\[
\frac{S_n(f_m)}{|V_n|^{1/2}} \Rightarrow \mathcal N(0, \sigma_m^2)\,.
\]
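For a concrete m-dependent $f_m$, the series (6.8) can be evaluated exactly. The sketch below (a hypothetical example, not from the text) takes $f_m$ to be a finite moving average of i.i.d. mean-zero, unit-variance innovations with coefficient window c; each lag covariance $\mathbb E[f_m (f_m\circ T_k)]$ is then a sum of overlapping coefficient products, and summing over all lags telescopes to $(\sum_u c_u)^2$:

```python
def sigma_m_squared(c):
    """Evaluate (6.8): sigma_m^2 = sum_k E[f_m (f_m o T_k)] for
    f_m = sum_u c[u] * eps_u, with {eps_u} i.i.d., mean zero, unit variance,
    and c a finite coefficient window (so f_m is m-dependent)."""
    m1, m2 = len(c), len(c[0])
    total = 0.0
    for k1 in range(-(m1 - 1), m1):          # all lags with nonzero covariance
        for k2 in range(-(m2 - 1), m2):
            total += sum(c[u1][u2] * c[u1 + k1][u2 + k2]
                         for u1 in range(m1) for u2 in range(m2)
                         if 0 <= u1 + k1 < m1 and 0 <= u2 + k2 < m2)
    return total

c = [[0.5, -0.25], [1.0, 0.75]]
val = sigma_m_squared(c)
expected = sum(map(sum, c)) ** 2   # summing over all lags gives (sum_u c_u)^2
```

This also shows that $\sigma_m^2$ can vanish (take coefficients summing to zero), in which case the limit in Theorem VI.3 is degenerate.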
Now, consider the function $f \in L^2_0(P)$ and define
\[
\| f \|_{V,+} = \limsup_{n\to\infty} \frac{\| S_n(f) \|_2}{|V_n|^{1/2}}\,. \tag{6.9}
\]
We refer to the pseudo-norm $\|\cdot\|_{V,+}$ as the plus-norm.

Lemma VI.4. Suppose $f, f_1, f_2, \cdots \in L^2_0(P)$ and $f_m$ is m-dependent for all $m \in \mathbb N$. If
\[
\lim_{m\to\infty} \| f - f_m \|_{V,+} = 0\,, \tag{6.10}
\]
then
\[
\lim_{m\to\infty} \sigma_m = \lim_{m\to\infty} \| f_m \|_{V,+} =: \sigma < \infty \tag{6.11}
\]
exists, and
\[
\frac{S_n(f)}{|V_n|^{1/2}} \Rightarrow \mathcal N(0, \sigma^2)\,. \tag{6.12}
\]
Proof. It suffices to prove (6.11). We will show that $\{\sigma_m\}_{m\in\mathbb N}$ forms a Cauchy sequence in $\mathbb R_+$. Observe that since $f_m$ is m-dependent with zero mean,
\[
\sigma_m = \lim_{n\to\infty} \frac{\| S_n(f_m) \|_2}{|V_n|^{1/2}}\,.
\]
It then follows that
\[
|\sigma_{m_1} - \sigma_{m_2}| \leq \limsup_{n\to\infty} \frac{\| S_n(f_{m_1} - f_{m_2}) \|_2}{|V_n|^{1/2}} \leq \| f_{m_1} - f \|_{V,+} + \| f_{m_2} - f \|_{V,+}\,,
\]
which can be made arbitrarily small by taking $m_1, m_2$ large enough. We have thus shown that $\{\sigma_m\}_{m\in\mathbb N}$ is a Cauchy sequence in $\mathbb R_+$.
Remark VI.5. The idea of establishing the central limit theorem by controlling the quantity $\| f - f_m \|_{V,+}$ dates back to Gordin [42], where $f_m$ was selected from a different subspace. In the one-dimensional case, when $V_n = \{1, \dots, n\}$, Zhao and Woodroofe [123] named $\|\cdot\|_{V,+}$ the plus-norm, and established a necessary and sufficient condition for the martingale approximation in terms of the plus-norm. See Gordin and Peligrad [41] and Peligrad [66] for improvements and more discussions of such conditions.
In the next section, we will establish conditions under which (6.10) holds.
6.3 A Central Limit Theorem
From this section on, we will focus on stationary multiparameter random fields defined on product probability spaces. On such a space, any integrable function has a natural $L^2$-approximation by m-dependent functions, and there is a natural commuting filtration.

For the sake of simplicity, we consider only 2-parameter random fields in the sequel, and simply say 'random fields' for short. We will prove a central limit theorem here and then an invariance principle in the next section. The argument, however, can be generalized easily to d-parameter random fields, and the result has been stated in Theorem VI.1.
We start with a product probability space with i.i.d. random variables $\{\epsilon_{i,j}\}_{(i,j)\in\mathbb Z^2}$. Recall that $\{T_{i,j}\}_{(i,j)\in\mathbb Z^2}$ is the group of shift operators on $\mathbb R^{\mathbb Z^2}$, and write $\mathcal F_{\infty,\infty} = \sigma(\epsilon_{i,j} : (i,j) \in \mathbb Z^2)$. We focus on the class of functions $L^p_0 = \{ f \in L^p(\mathcal F_{\infty,\infty}) : \mathbb E f = 0 \}$, $p \geq 2$. For all measurable functions $f \in L^2_0$, define, for all $m \in \mathbb N$,
\[
f_m := \mathbb E(f \mid \mathcal F_m) \quad \mbox{ with } \quad \mathcal F_m = \sigma\big( \epsilon_j : j \in \{-m, \dots, m\}^2 \big)\,. \tag{6.13}
\]
Clearly, $f_m \in L^2_0$, $\| f - f_m \|_2 \to 0$ as $m \to \infty$, and $\{ f_m \circ T_{i,j} \}_{(i,j)\in\mathbb Z^2}$ are m-dependent functions.
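For a linear functional, the approximation error $\| f - f_m \|_2$ in (6.13) is explicit: conditioning on $\mathcal F_m$ simply truncates the series to the window, so the error is the $\ell^2$-norm of the tail coefficients. A small sketch (the coefficients are a hypothetical square-summable choice of ours):

```python
def l2_approx_error(a, m):
    """||f - f_m||_2 for f = sum_{i,j>=0} a[i][j] * eps_{-i,-j} with i.i.d.
    unit-variance innovations: f_m = E(f | F_m) keeps exactly the terms with
    max(i, j) <= m, so the error is the l2-norm of the remaining tail."""
    return sum(a[i][j] ** 2
               for i in range(len(a)) for j in range(len(a[0]))
               if max(i, j) > m) ** 0.5

N = 60   # finite truncation of the coefficient array, for illustration
a = [[(i + 1) ** -2 * (j + 1) ** -2 for j in range(N)] for i in range(N)]
errors = [l2_approx_error(a, m) for m in range(6)]   # decreasing toward 0
```

The strictly decreasing sequence `errors` is a concrete instance of $\| f - f_m \|_2 \to 0$.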
Now, recall the natural filtration $\{\mathcal F_{i,j}\}_{(i,j)\in\mathbb Z^2}$ defined by $\mathcal F_{k,l} = \sigma(\epsilon_{i,j} : i \leq k,\ j \leq l)$. This is a 2-parameter filtration, i.e.,
\[
\mathcal F_{i,j} \subset \mathcal F_{k,l} \quad \mbox{ if } i \leq k,\ j \leq l\,. \tag{6.14}
\]
Also,
\[
T_{-i,-j}\mathcal F_{k,l} = \mathcal F_{k+i,l+j}\,, \quad \forall (i,j), (k,l) \in \mathbb Z^2\,. \tag{6.15}
\]
Moreover, the notion of commuting filtration is of importance to us.

Definition VI.6. A filtration $\{\mathcal F_{i,j}\}_{(i,j)\in\mathbb Z^2}$ is commuting if, for all $\mathcal F_{k,l}$-measurable bounded random variables Y, $\mathbb E(Y \mid \mathcal F_{i,j}) = \mathbb E(Y \mid \mathcal F_{i\wedge k,\, j\wedge l})$.

Since $\{\epsilon_{k,l}\}_{(k,l)\in\mathbb Z^2}$ are independent random variables, $\{\mathcal F_{i,j}\}_{(i,j)\in\mathbb Z^2}$ is commuting (see Proposition VI.22 in Section 6.8). This implies that the marginal filtrations
\[
\mathcal F_{i,\infty} = \bigvee_{j\geq 0} \mathcal F_{i,j} \quad \mbox{ and } \quad \mathcal F_{\infty,j} = \bigvee_{i\geq 0} \mathcal F_{i,j} \tag{6.16}
\]
are commuting, in the sense that for all $Y \in L^1(P)$,
\[
\mathbb E\big[ \mathbb E(Y \mid \mathcal F_{i,\infty}) \mid \mathcal F_{\infty,j} \big] = \mathbb E\big[ \mathbb E(Y \mid \mathcal F_{\infty,j}) \mid \mathcal F_{i,\infty} \big] = \mathbb E(Y \mid \mathcal F_{i,j})\,. \tag{6.17}
\]
For more details on commuting filtrations, see Khoshnevisan [53].
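The commuting property (6.17) can be verified exactly on a toy product space. In the sketch below (all names are ours), four independent fair bits play the role of $\epsilon_{i,j}$, $(i,j) \in \{1,2\}^2$, and conditional expectations are computed by averaging over the unobserved coordinates:

```python
from itertools import product

P = [(1, 1), (1, 2), (2, 1), (2, 2)]      # sites of the four i.i.d. bits eps_{i,j}
OMEGA = list(product([0, 1], repeat=4))   # the (uniform) product sample space

def cond_exp(Y, observed):
    """E(Y | sigma(eps_p : p in observed)): average Y over the fiber of
    sample points agreeing with omega on the observed coordinates."""
    idx = [P.index(p) for p in observed]
    def EY(w):
        fiber = [v for v in OMEGA if all(v[i] == w[i] for i in idx)]
        return sum(Y(v) for v in fiber) / len(fiber)
    return EY

Y = lambda w: w[0] + 2 * w[1] * w[3] - 3 * w[2]   # some bounded random variable
F_1_inf = [(1, 1), (1, 2)]   # marginal field: eps_{i,j} with i <= 1
F_inf_1 = [(1, 1), (2, 1)]   # marginal field: eps_{i,j} with j <= 1
F_1_1 = [(1, 1)]             # F_{1,1}: eps_{i,j} with i <= 1 and j <= 1
lhs = cond_exp(cond_exp(Y, F_1_inf), F_inf_1)   # E[ E(Y | F_{1,oo}) | F_{oo,1} ]
rhs = cond_exp(Y, F_1_1)                        # E(Y | F_{1,1})
gap = max(abs(lhs(w) - rhs(w)) for w in OMEGA)  # (6.17): the gap vanishes
```

The identity holds pointwise here precisely because the coordinates are independent; with dependent coordinates the iterated conditioning generally does not collapse to $\mathbb E(Y \mid \mathcal F_{1,1})$.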
For all $\mathcal F_{0,0}$-measurable functions $f \in L^2_0$, write
\[
S_{m,n}(f) = \sum_{i=1}^{m} \sum_{j=1}^{n} f \circ T_{i,j}\,. \tag{6.18}
\]
Thanks to the commuting structure of the filtration, applying the maximal inequality in [68] twice, we can prove the following moment inequality with $p \geq 2$:
\[
\| S_{m,n}(f) \|_p \leq C\, m^{1/2} n^{1/2}\, \Delta_{(m,n),p}(f) \tag{6.19}
\]
with
\[
\Delta_{(m,n),p}(f) = \sum_{k=1}^{m} \sum_{l=1}^{n} \frac{\big\| \mathbb E(S_{k,l}(f) \mid \mathcal F_{1,1}) \big\|_p}{k^{3/2} l^{3/2}}\,.
\]
In fact, we will prove a stronger inequality without the assumptions of a product probability space and the $\mathcal F_{0,0}$-measurability of f. See Section 6.7, Proposition VI.20 and Corollary VI.21.
Recall that
\[
\Delta_{2,p}(f) = \sum_{k=1}^{\infty} \sum_{l=1}^{\infty} \frac{\big\| \mathbb E(f \circ T_{k,l} \mid \mathcal F_{1,1}) \big\|_p}{k^{1/2} l^{1/2}}\,. \tag{6.20}
\]
Now, we can prove the following central limit theorem for adapted stationary random fields.
Theorem VI.7. Consider the product probability space discussed above. Let $\{V_n\}_{n\in\mathbb N}$ be as in (6.3) with $d = 2$. Suppose $f \in L^2_0$, $f \in \mathcal F_{0,0}$, and define $f_m$ as in (6.13). If $\Delta_{2,2}(f) < \infty$, then
\[
\lim_{m\to\infty} \| f - f_m \|_{V,+} = 0\,.
\]
Therefore, $\sigma := \lim_{m\to\infty} \| f_m \|_{V,+} < \infty$ exists and $S_n(f)/|V_n|^{1/2} \Rightarrow \mathcal N(0, \sigma^2)$.
Proof. The second part follows immediately from Lemma VI.4. It suffices to prove $\| f - f_m \|_{V,+} \to 0$ as $m \to \infty$. First, by the fact that
\[
\big\| \mathbb E(S_{k,l}(f) \mid \mathcal F_{1,1}) \big\|_2 \leq \sum_{i=1}^{k} \sum_{j=1}^{l} \big\| \mathbb E(f \circ T_{i,j} \mid \mathcal F_{1,1}) \big\|_2
\]
and Fubini's theorem, we have $\Delta_{(\infty,\infty),2}(f) \leq 9\Delta_{2,2}(f)$. So, by (6.9) and (6.19), it suffices to show
\[
\Delta_{2,2}(f - f_m) = \sum_{k=1}^{\infty} \sum_{l=1}^{\infty} \frac{\big\| \mathbb E[(f - f_m) \circ T_{k,l} \mid \mathcal F_{1,1}] \big\|_2}{k^{1/2} l^{1/2}} \to 0 \tag{6.21}
\]
as $m \to \infty$. Clearly, the summand in (6.21) converges to 0 for each fixed k, l, since (6.13) implies $\| f - f_m \|_2 \to 0$ as $m \to \infty$ and $\| \mathbb E[(f - f_m) \circ T_{k,l} \mid \mathcal F_{1,1}] \|_2 \leq \| f - f_m \|_2$. Moreover, observe that
\begin{align*}
\mathbb E(f_m \circ T_{k,l} \mid \mathcal F_{1,1}) &= \mathbb E\big[ \mathbb E(f \circ T_{k,l} \mid T_{-k,-l}(\mathcal F_m)) \mid \mathcal F_{1,1} \big]\\
&= \mathbb E\big[ \mathbb E(f \circ T_{k,l} \mid \mathcal F_{1,1}) \mid T_{-k,-l}(\mathcal F_m) \big]\,,
\end{align*}
where in the second equality we can exchange the order of conditional expectations by the definitions of $\mathcal F_{1,1}$ and $T_{-k,-l}(\mathcal F_m)$ (see Proposition VI.22 in Section 6.8 for a detailed treatment). Therefore,
\begin{align*}
\big\| \mathbb E[(f - f_m) \circ T_{k,l} \mid \mathcal F_{1,1}] \big\|_2
&\leq \big\| \mathbb E(f \circ T_{k,l} \mid \mathcal F_{1,1}) \big\|_2 + \big\| \mathbb E(f_m \circ T_{k,l} \mid \mathcal F_{1,1}) \big\|_2\\
&\leq 2 \big\| \mathbb E(f \circ T_{k,l} \mid \mathcal F_{1,1}) \big\|_2\,.
\end{align*}
Then, the condition $\Delta_{2,2}(f) < \infty$ combined with the dominated convergence theorem yields (6.21). The proof is thus completed.
6.4 An Invariance Principle
Recall the space $C[0, 1]^2$ and the 2-parameter Brownian sheet $\{B(t)\}_{t\in[0,1]^2}$.
Theorem VI.8. Under the assumptions of Theorem VI.7, suppose in addition that $f \in L^p_0$ and $\Delta_{2,p}(f) < \infty$ for some $p > 2$. Write $B_{n,t}(f)$ as in (6.5) with $d = 2$. Then,
\[
\frac{B_{n,\cdot}(f)}{|V_n|^{1/2}} \Rightarrow \sigma B(\cdot)\,,
\]
where '$\Rightarrow$' stands for weak convergence in $C[0, 1]^2$.
Proof. It suffices to show that the finite-dimensional distributions converge and that $\{ B_{n,t}(f)/|V_n|^{1/2} \}_{t\in[0,1]^2}$ is tight.

We first show that, for all $t = (t^{(1)}, \dots, t^{(k)}) \subset [0, 1]^2$,
\[
\Big( \frac{B_{n,t^{(1)}}(f)}{|V_n|^{1/2}}, \cdots, \frac{B_{n,t^{(k)}}(f)}{|V_n|^{1/2}} \Big) \Rightarrow \sigma\big( B(t^{(1)}), \cdots, B(t^{(k)}) \big) =: \sigma B_t\,. \tag{6.22}
\]
Consider the m-dependent function $f_m$ defined in (6.13). Then, the convergence of the finite-dimensional distributions (6.22) with f replaced by $f_m$ follows from the invariance principle for m-dependent random fields (see e.g. [96]). Furthermore, by Theorem VI.7, $\Delta_{2,2}(f) \leq \Delta_{2,p}(f) < \infty$, so that $\| f - f_m \|_{V,+} \to 0$ as $m \to \infty$, and therefore, letting $B_{n,t}(f)/|V_n|^{1/2}$ denote the left-hand side of (6.22), $B_{n,t}(f_m - f)/|V_n|^{1/2} \to (0, \dots, 0) \in \mathbb R^k$ in probability. The convergence of the finite-dimensional distributions (6.22) follows.
Now, we prove the tightness of $\{B_{n,t}(f)\}_{t\in[0,1]^2}$. Fix n and consider
\[
V_n = \{1, \dots, n_1\} \times \{1, \dots, n_2\}\,.
\]
Write $B_{n,t} \equiv B_{n,t}(f)$ and $S_{m,n} \equiv S_{m,n}(f)$ for short. For all $0 \leq r_1 < s_1 \leq 1$, $0 \leq r_2 < s_2 \leq 1$, set
\[
B_n\big( (r_1, s_1] \times (r_2, s_2] \big) := B_{n,(s_1,s_2)} - B_{n,(r_1,s_2)} - B_{n,(s_1,r_2)} + B_{n,(r_1,r_2)}\,.
\]
We will show that there exists a constant C, independent of $n, r_1, r_2, s_1$ and $s_2$, such that
\[
(n_1 n_2)^{-1/2} \big\| B_n\big( (r_1, s_1] \times (r_2, s_2] \big) \big\|_p \leq C \big[ (s_1 - r_1)(s_2 - r_2) \big]^{1/2} \Delta_{2,p}(f)\,. \tag{6.23}
\]
Inequality (6.23) implies tightness, by Nagai [61], Theorem 1.

Now, we prove (6.23) to complete the proof. From now on, the constant C may change from line to line. Write $m_i = \lfloor n_i s_i \rfloor - \lfloor n_i r_i \rfloor$, $i = 1, 2$. If $m_i \geq 2$, $i = 1, 2$, then
\begin{align*}
\big\| B_n\big( (r_1, s_1] \times (r_2, s_2] \big) \big\|_p
&\leq \| S_{m_1,m_2} \|_p + 2 \| S_{m_1,1} \|_p + 2 \| S_{1,m_2} \|_p + 4 \| S_{1,1} \|_p\\
&\leq C (m_1 m_2)^{1/2}\, \Delta_{2,p}(f) \tag{6.24}
\end{align*}
for some constant C, by (6.19). Note that $m_i \geq 2$ also implies $n_i(s_i - r_i) > 1$. Therefore, $m_i \leq n_i(s_i - r_i) + 1 < 2 n_i(s_i - r_i)$, and (6.24) can be bounded by $C (n_1 n_2)^{1/2} [(s_1 - r_1)(s_2 - r_2)]^{1/2}\, \Delta_{2,p}(f)$, which yields (6.23).
In the case $m_1 < 2$ or $m_2 < 2$, obtaining (6.23) requires more careful analysis. We only show the case $m_1 = 1$, $m_2 \geq 2$, as the proofs for the other cases are similar. Observe that $m_1 = 1$ implies $\lfloor n_1 r_1 \rfloor \leq n_1 r_1 < \lfloor n_1 s_1 \rfloor \leq n_1 s_1$. Then,
\[
\big\| B_n\big( (r_1, s_1] \times (r_2, s_2] \big) \big\|_p \leq n_1(s_1 - r_1) \big( \| S_{1,m_2} \|_p + 2 \| S_{1,1} \|_p \big) \leq C\, n_1(s_1 - r_1)\, m_2^{1/2}\, \Delta_{2,p}(f)\,.
\]
Observe that $m_1 = 1$ also implies $n_1(s_1 - r_1) \in (0, 2)$. If $n_1(s_1 - r_1) \leq 1$, then $n_1(s_1 - r_1) \leq [n_1(s_1 - r_1)]^{1/2}$. If $n_1(s_1 - r_1) \in (1, 2)$, then $n_1(s_1 - r_1) < \sqrt 2\, [n_1(s_1 - r_1)]^{1/2}$. It then follows that (6.23) still holds.
Remark VI.9. To prove the invariance principle for stationary random fields, most results require finite moments of order strictly larger than 2. See for example Berkes and Morrow [4], Goldie and Greenwood [39] and Dedecker [28]. This is in contrast to the one-dimensional case, where the invariance principle can be established under a finite second moment assumption.

To the best of our knowledge, there are two invariance principles for stationary random fields requiring only finite second moments. One is due to Sashkin [96], who assumed the field to be BL(θ)-dependent (a class including m-dependent stationary random fields). In general, BL(θ)-dependence is difficult to check. The other is due to Basu and Dorea [3], who proved an invariance principle for martingale-difference random fields under a finite second moment assumption. However, they have stringent conditions on the filtration (see Remark VI.13 below). In our case, it remains an open problem whether $\Delta_{2,2}(f) < \infty$ implies the invariance principle. See also a similar conjecture by Dedecker in [28], Remark 1.
6.5 Orthomartingales
The central limit theorems and invariance principles for multiparameter martingales are more difficult to establish than in the one-dimensional case. This is due to the complex structure of multiparameter martingales. We will focus on orthomartingales first and establish an invariance principle, and then compare the results with other types of multiparameter martingales.

The idea of orthomartingales is due to R. Cairoli and J. B. Walsh. See e.g. the references in Khoshnevisan [53], which also provides a nice introduction to the material. For the sake of simplicity, we suppose $d = 2$. Consider a probability space $(\Omega, \mathcal A, P)$ and recall the definition of a 2-parameter filtration (6.14). We restrict ourselves to filtrations indexed by $\mathbb N^2$.
Definition VI.10. Given a commuting 2-parameter filtration $\{\mathcal F_{i,j}\}_{(i,j)\in\mathbb N^2}$ on $(\Omega, \mathcal A, P)$, we say a family of random variables $\{M_{i,j}\}_{(i,j)\in\mathbb N^2}$ is a 2-parameter orthomartingale on $(\Omega, \mathcal A, P)$, with respect to $\{\mathcal F_{i,j}\}_{(i,j)\in\mathbb N^2}$, if for all $(i,j) \in \mathbb N^2$, $M_{i,j}$ is $\mathcal F_{i,j}$-measurable, and $\mathbb E(M_{i+1,j} \mid \mathcal F_{i,\infty}) = \mathbb E(M_{i,j+1} \mid \mathcal F_{\infty,j}) = M_{i,j}$, almost surely.
In our case, for $\mathcal F_{0,0}$-measurable $f \in L^2_0$, $M_{m,n} = S_{m,n}(f)$ as in (6.18) yields a 2-parameter orthomartingale if
\[
\mathbb E(f \circ T_{i+1,j} \mid \mathcal F_{i,\infty}) = \mathbb E(f \circ T_{i,j+1} \mid \mathcal F_{\infty,j}) = 0 \quad \mbox{almost surely}, \tag{6.25}
\]
for all $(i,j) \in \mathbb N^2$. In this case, we say $\{ f \circ T_{i,j} \}_{(i,j)\in\mathbb N^2}$ are 2-parameter orthomartingale differences.
Remark VI.11. In our case, $\{M_{i,j}\}_{(i,j)\in\mathbb N^2}$ is also a 2-parameter martingale in the usual sense, i.e., $\mathbb E(M_{i,j} \mid \mathcal F_{k,l}) = M_{i\wedge k,\, j\wedge l}$, almost surely. Indeed,
\[
\mathbb E(M_{i,j} \mid \mathcal F_{k,l}) = \mathbb E\big[ \mathbb E(M_{i,j} \mid \mathcal F_{k,\infty}) \mid \mathcal F_{\infty,l} \big] = \mathbb E(M_{i\wedge k,\,j} \mid \mathcal F_{\infty,l}) = M_{i\wedge k,\, j\wedge l}\,.
\]
In general, however, the converse is not true, i.e., multiparameter martingales are not necessarily orthomartingales (see e.g. [53], p. 33). The two notions are equivalent when the filtration is commuting (see e.g. [53], Chapter I, Theorem 3.5.1).
Theorem VI.12. Consider a product probability space $(\Omega, \mathcal A, P)$ with the natural filtration $\{\mathcal F_{i,j}\}_{(i,j)\in\mathbb N^2}$. Suppose $f \in L^2_0$ and $f \in \mathcal F_{0,0}$. If $\{ f \circ T_{i,j} \}_{(i,j)\in\mathbb N^2}$ are 2-parameter orthomartingale differences, i.e., (6.25) holds, then $\sigma^2 = \lim_{n\to\infty} \mathbb E(S_n(f)^2)/|V_n| < \infty$ exists, and
\[
\frac{S_n(f)}{|V_n|^{1/2}} \Rightarrow \sigma \mathcal N(0, 1)\,.
\]
In addition, if $f \in L^p_0$ for some $p > 2$, then the invariance principle (6.7) holds.

Proof. Observe that (6.25) implies $\mathbb E(f \circ T_{i,j} \mid \mathcal F_{1,1}) = 0$ if $i > 1$ or $j > 1$. Then, for $f \in L^p_0$, $p \geq 2$,
\[
\Delta_{d,p}(f) = \big\| \mathbb E(f \circ T_{1,1} \mid \mathcal F_{1,1}) \big\|_p = \| f \|_p < \infty\,.
\]
The result then follows immediately from Theorem VI.1. Note that the argument holds for general d-parameter orthomartingales ($d \geq 2$) defined in [53].
Remark VI.13. Our result is more general than [3], [62] and [73] in the following way. Let $\{\epsilon_{i,j}\}_{(i,j)\in\mathbb Z^2}$ be i.i.d. random variables. In [62], the central limit theorem was established for the so-called martingale-difference random fields $\{M_{i,j}\}_{(i,j)\in\mathbb N^2}$ with $M_{i,j} = \sum_{k=1}^{i} \sum_{l=1}^{j} D_{k,l}$, such that
\[
\mathbb E\big[ D_{i,j} \mid \sigma(\epsilon_{k,l} : (k,l) \in \mathbb Z^2,\ (k,l) \neq (i,j)) \big] = 0\,, \quad \mbox{for all } (i,j) \in \mathbb N^2\,.
\]
In [3] and [73], the authors considered multiparameter martingales $\{M_{i,j}\}_{(i,j)\in\mathbb N^2}$ with respect to the filtration defined by
\[
\mathcal F_{i,j} = \sigma\big( \epsilon_{k,l} : k \leq i \mbox{ or } l \leq j \big)\,.
\]
It is easy to see that, in both cases above, their assumptions are stronger, in the sense that they imply that $\{M_{i,j}\}_{(i,j)\in\mathbb N^2}$ is an orthomartingale with respect to the natural filtration $\{\mathcal F_{i,j}\}_{(i,j)\in\mathbb N^2}$ in (6.2). On the other hand, however, the results mentioned above only assume that $\{\epsilon_{i,j}\}_{(i,j)\in\mathbb Z^2}$ is a stationary random field, which is weaker than our assumption.
At last, we point out that the product structure of the probability space plays an important role. We provide an example of an orthomartingale with a different underlying probability structure. In this case, the limit behavior is quite different from the cases that we have studied so far.

Example VI.14. Suppose $\{\epsilon_k\}_{k\in\mathbb Z}$ and $\{\eta_k\}_{k\in\mathbb Z}$ are two families of i.i.d. random variables. Define $\mathcal G_i = \sigma(\epsilon_j : j \leq i)$ and $\mathcal H_i = \sigma(\eta_j : j \leq i)$ for all $i \in \mathbb N$. Then, $\mathcal G = \{\mathcal G_n\}_{n\in\mathbb N}$ and $\mathcal H = \{\mathcal H_n\}_{n\in\mathbb N}$ are two filtrations.

Now, let $\{Y_n\}_{n\in\mathbb N}$ and $\{Z_n\}_{n\in\mathbb N}$ be two arbitrary martingales with stationary increments with respect to the filtrations $\mathcal G$ and $\mathcal H$, respectively. Suppose $Y_n = \sum_{i=1}^{n} D_i$, $Z_n = \sum_{i=1}^{n} E_i$, where $\{D_n\}_{n\in\mathbb N}$ and $\{E_n\}_{n\in\mathbb N}$ are stationary martingale differences. Then, $\{D_i E_j\}_{(i,j)\in\mathbb N^2}$ is a stationary random field and
\[
M_{m,n} := \sum_{i=1}^{m} \sum_{j=1}^{n} D_i E_j = Y_m Z_n
\]
is an orthomartingale with respect to the filtration $\{\mathcal G_i \vee \mathcal H_j\}_{(i,j)\in\mathbb N^2}$. Clearly,
\[
\frac{M_{n,n}}{n} = \frac{Y_n}{\sqrt n} \cdot \frac{Z_n}{\sqrt n} \Rightarrow \mathcal N(0, \sigma_Y^2) \times \mathcal N(0, \sigma_Z^2)\,,
\]
where the limit is the distribution of the product of two independent normal random variables (a Gaussian chaos). That is, $S_n(f)/n$ has an asymptotically non-normal distribution.

One can also define $M_{m,n} = Y_m + Z_n$, which again gives an orthomartingale, and $\{D_i + E_j\}_{(i,j)\in\mathbb N^2}$ is the corresponding stationary random field. This time, one can show that
\[
\frac{M_{n,n}}{\sqrt n} = \frac{Y_n}{\sqrt n} + \frac{Z_n}{\sqrt n} \Rightarrow \mathcal N(0, \sigma_Y^2 + \sigma_Z^2)\,.
\]
Here, the limit is a normal distribution, but the normalizing sequence is $\sqrt n$ instead of n.
This example demonstrates that for general orthomartingales, to obtain a central limit theorem one must assume extra conditions on the structure of the underlying probability space. For the structure mentioned above, there is no m-dependent approximation for the random field. Indeed, the example corresponds to the sample space $\Omega = \mathbb R^{\mathbb Z} \times \mathbb R^{\mathbb Z}$ with $[T_{k,l}(\epsilon, \eta)]_{i,j} = (\epsilon_{i+k}, \eta_{j+l})$, and if we define $f_m$ similarly as in (6.13) with
\[
\mathcal F_m := \sigma\big( \epsilon_i, \eta_j : -m \leq i, j \leq m \big)\,,
\]
then f and $f \circ T_{k,l}$ are independent if and only if $\min(k, l) > m$. That is, the dependence can be very strong along the horizontal (resp. vertical) direction of the random field.
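The non-normal limit in Example VI.14 is easy to see by simulation. The sketch below (illustrative; the Gaussian increments and all parameter choices are ours) takes $D_i, E_j$ i.i.d. standard normal, so each factor $Y_n/\sqrt n$, $Z_n/\sqrt n$ is exactly $\mathcal N(0,1)$; the fourth moment of the product $M_{n,n}/n$ is then near $\mathbb E[Y^4]\mathbb E[Z^4] = 9$ rather than the Gaussian value 3:

```python
import random
import statistics

random.seed(7)
n, reps = 128, 8000

def normalized_martingale(k):
    """Y_k / sqrt(k) for Y_k a sum of k i.i.d. N(0,1) increments."""
    return sum(random.gauss(0.0, 1.0) for _ in range(k)) / k ** 0.5

# M_{n,n}/n = (Y_n/sqrt(n)) * (Z_n/sqrt(n)): a product of two independent
# factors, each exactly N(0,1) here -- a Gaussian chaos, not a normal law.
samples = [normalized_martingale(n) * normalized_martingale(n)
           for _ in range(reps)]
var = statistics.variance(samples)               # theoretical value: 1
m4 = statistics.fmean(x ** 4 for x in samples)   # theoretical value: 9, not 3
```

The inflated fourth moment (heavy tails) is the numerical fingerprint of the non-Gaussian chaos limit.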
6.6 Stationary Causal Linear Random Fields
We establish a central limit theorem for functionals of stationary causal linear random fields. We focus on $d = 2$. Consider a stationary linear random field $\{Z_{i,j}\}_{(i,j)\in\mathbb Z^2}$ defined by
\[
Z_{i,j} = \sum_{r\in\mathbb Z} \sum_{s\in\mathbb Z} a_{r,s}\, \epsilon_{i-r,j-s} = \sum_{r\in\mathbb Z} \sum_{s\in\mathbb Z} a_{i-r,j-s}\, \epsilon_{r,s}\,, \tag{6.26}
\]
with coefficients $\{a_{i,j}\}_{(i,j)\in\mathbb Z^2}$ satisfying $\sum_{(i,j)\in\mathbb Z^2} a_{i,j}^2 < \infty$. We restrict ourselves to causal linear random fields, i.e., $a_{i,j} = 0$ unless $i \geq 0$ and $j \geq 0$. They are also referred to as adapted to the filtration $\{\mathcal F_{i,j}\}_{(i,j)\in\mathbb Z^2}$.
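A truncated version of (6.26) is straightforward to simulate. The sketch below (hypothetical, finitely supported coefficients, standing in for the general square-summable case) generates a causal field by discrete convolution and checks the stationary variance $\sum_{r,s} a_{r,s}^2$ away from the boundary:

```python
import random
import statistics

def linear_field(a, eps):
    """Z_{i,j} = sum_{r,s>=0} a[r][s] * eps[i-r][j-s]: a causal linear
    random field as in (6.26), with finitely supported coefficients a."""
    h, n = len(a), len(eps)
    return [[sum(a[r][s] * eps[i - r][j - s]
                 for r in range(h) for s in range(h)
                 if i - r >= 0 and j - s >= 0)
             for j in range(n)] for i in range(n)]

random.seed(1)
h, n = 4, 120
a = [[0.8 ** (r + s) for s in range(h)] for r in range(h)]   # square-summable
eps = [[random.gauss(0.0, 1.0) for _ in range(n)] for _ in range(n)]
Z = linear_field(a, eps)
# In the "bulk" (indices >= h) every Z_{i,j} has the full stationary law,
# with Var(Z_{i,j}) = sum_{r,s} a_{r,s}^2 (unit-variance innovations).
bulk = [Z[i][j] for i in range(h, n) for j in range(h, n)]
emp_var = statistics.variance(bulk)
true_var = sum(a[r][s] ** 2 for r in range(h) for s in range(h))
```

Applying a function K to h × h blocks of such a field gives exactly the functionals studied next.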
Now, consider the random fields $\{ f \circ T_{k,l} \}_{(k,l)\in\mathbb Z^2}$ with the more specific form $f = K(\{Z_{i,j}\}^{0,0}_h)$, where h is a fixed strictly positive integer, K is a measurable function from $\mathbb R^{h^2}$ to $\mathbb R$, and for all $(k,l) \in \mathbb Z^2$,
\[
\{Z_{i,j}\}^{k,l}_h := \big\{ Z_{i,j} : k - h + 1 \leq i \leq k,\ l - h + 1 \leq j \leq l \big\}
\]
is viewed as a random vector in $\mathbb R^{h^2}$ with covariates lexicographically ordered. In the sequel, the same definition applies similarly to $\{x_{i,j}\}^{k,l}_h$, given $\{x_{i,j}\}_{(i,j)\in\mathbb Z^2}$. Assume that
\[
\mathbb E K(\{Z_{i,j}\}^{0,0}_h) = 0 \quad \mbox{ and } \quad \mathbb E \big| K(\{Z_{i,j}\}^{0,0}_h) \big|^p < \infty \tag{6.27}
\]
for some $p \geq 2$. In this way,
\[
f \circ T_{k,l} = K(\{Z_{i,j}\}^{k,l}_h)\,. \tag{6.28}
\]
The model (6.28) is a natural extension of the functionals of causal linear processes considered by Wu [117].
Next, we introduce a few notations similarly as in [48] and [117]. Here, our ultimate goal is to translate Condition (6.20) into a condition on the regularity of K and the summability of $\{a_{i,j}\}_{(i,j)\in\mathbb Z^2}$. For all $(i,j) \in \mathbb Z^2$, let
\[
\Gamma(i,j) = \big\{ (r,s) \in \mathbb Z^2 : r \leq i,\ s \leq j \big\}\,, \tag{6.29}
\]
and write
\begin{align*}
Z_{i,j} &= \sum_{(r,s)\in\Gamma(i,j)} a_{i-r,j-s}\, \epsilon_{r,s}\\
&= \sum_{(r,s)\in\Gamma(i,j)\setminus\Gamma(1,1)} a_{i-r,j-s}\, \epsilon_{r,s} + \sum_{(r,s)\in\Gamma(1,1)} a_{i-r,j-s}\, \epsilon_{r,s}\\
&=: Z_{i,j,+} + Z_{i,j,-}\,. \tag{6.30}
\end{align*}
Write $W_{k,l,-} = \{Z_{i,j,-}\}^{k,l}_h$ and define, for all $(k,l) \in \mathbb Z^2$,
\[
K_{k,l}(\{x_{i,j}\}^{k,l}_h) = \mathbb E K(\{Z_{i,j,+} + x_{i,j}\}^{k,l}_h)\,.
\]
In this way,
\[
\mathbb E(f \circ T_{k,l} \mid \mathcal F_{1,1}) = K_{k,l}(\{Z_{i,j,-}\}^{k,l}_h) =: K_{k,l}(W_{k,l,-})\,. \tag{6.31}
\]
Plugging (6.31) into (6.20), we obtain a central limit theorem for functionals of stationary causal linear random fields.

Theorem VI.15. Consider the functionals of stationary causal linear random fields (6.28). If Condition (6.27) holds and
\[
\sum_{k=1}^{\infty} \sum_{l=1}^{\infty} \frac{\| K_{k,l}(W_{k,l,-}) \|_p}{k^{1/2} l^{1/2}} < \infty\,, \tag{6.32}
\]
for $p = 2$, then $\sigma^2 = \lim_{n\to\infty} \mathbb E(S_n^2)/n^2 < \infty$ exists and $S_n/|V_n|^{1/2} \Rightarrow \mathcal N(0, \sigma^2)$. If the two conditions hold with $p > 2$, then the invariance principle (6.7) holds.
Next, we provide conditions on K and $\{a_{i,j}\}_{(i,j)\in\mathbb Z^2}$ such that (6.32) holds. For all $\Lambda \subset \mathbb Z^2$, write
\[
Z_\Lambda = \sum_{(i,j)\in\Lambda} a_{i,j}\, \epsilon_{-i,-j} \quad \mbox{ and } \quad A_\Lambda = \sum_{(i,j)\in\Lambda} a_{i,j}^2\,. \tag{6.33}
\]
In particular, our conditions involve summations of $a_{i,j}$ over the following type of regions:
\[
\Lambda(k,l) := \big\{ (i,j) \in \mathbb Z^2 : i \geq k,\ j \geq l \big\}\,, \quad (k,l) \in \mathbb Z^2\,.
\]
For the sake of simplicity, we write $A_{k,l} \equiv A_{\Lambda(k,l)}$. The following lemma is a simple extension of Lemma 2, part (b) in [117].
Lemma VI.16. Suppose that there exist $\alpha, \beta \in \mathbb R$ such that $0 < \alpha \leq 1 \leq \beta < \infty$ and $\mathbb E(|\epsilon|^{2\beta}) < \infty$. If
\[
\mathbb E M^2_{\alpha,\beta}(W_{1,1}) < \infty \quad \mbox{ with } \quad M_{\alpha,\beta}(x) = \sup_{y\in\mathbb R^{h^2},\ y\neq x} \frac{|K(x) - K(y)|}{|x - y|^\alpha + |x - y|^\beta}\,, \tag{6.34}
\]
then, for all $p \geq 2$,
\[
\| K_{k,l}(W_{k,l,-}) \|_p = O\big( A^{\alpha/2}_{k+1-h,\,l+1-h} \big)\,. \tag{6.35}
\]
Consequently, Condition (6.32) can be replaced by specific conditions on $A_{k,l}$.

Corollary VI.17. Assume there exist $\alpha, \beta \in \mathbb R$ as in Lemma VI.16. Consider the functionals of stationary linear random fields of the form (6.28). Suppose Condition (6.34) holds and
\[
\sum_{k=1}^{\infty} \sum_{l=1}^{\infty} \frac{A^{\alpha/2}_{k+1-h,\,l+1-h}}{k^{1/2} l^{1/2}} < \infty\,. \tag{6.36}
\]
If $\mathbb E(|\epsilon|^p) < \infty$ and (6.27) hold with $p = 2$, then $S_n/n \Rightarrow \mathcal N(0, \sigma^2)$ for some $\sigma < \infty$. If $\mathbb E(|\epsilon|^p) < \infty$ and (6.27) hold with $p > 2$, then the invariance principle (6.7) holds.
Next, we compare our Condition (6.36) on the summability of $\{a_{i,j}\}_{(i,j)\in\mathbb Z^2}$ with the one considered by Cheng and Ho [15]. They only established central limit theorems for functionals of stationary linear random fields, so we restrict to the case $p = 2$. Cheng and Ho [15] assumed
\[
\sum_{i=0}^{\infty} \sum_{j=0}^{\infty} |a_{i,j}|^{1/2} < \infty\,, \tag{6.37}
\]
and provided different regularity conditions on K. Namely,
\[
\sup_{\Lambda\subset\mathbb Z^2} \mathbb E K^2(x + Z_\Lambda) < \infty
\]
for all $x \in \mathbb R$, with $Z_\Lambda$ defined in (6.33), and that for any two independent random variables X and Y with $\mathbb E(K^2(X) + K^2(Y) + K^2(X+Y)) < M < \infty$,
\[
\mathbb E\big[ (K(X+Y) - K(X))^2 \big] \leq C \big[ \mathbb E(Y^2) \big]^\gamma \tag{6.38}
\]
for some $\gamma \geq 1/2$. In general, Cheng and Ho [15]'s condition and ours on the regularity of K are not comparable, and thus they have different ranges of application. Below, we focus on the simple case that $h = 1$ and K is Lipschitz, which is covered by both conditions. This corresponds to $\alpha = \beta = 1$ in (6.34) and $\gamma = 1$ in (6.38). In the following two examples, our Condition (6.36) turns out to be weaker than Condition (6.37).
Example VI.18. Consider $a_{i,j} = (i+j+1)^{-q}$ for all $i, j \geq 0$ and some $q > 1$. Then,
$A = \sum_{i=0}^{\infty} \sum_{j=0}^{\infty} a_{i,j}^2 < \infty$ and
$A_{k,l} = \sum_{j=1}^{\infty} j (k+l+j)^{-2q} = O\big((k+l)^{2-2q}\big)$.
Then (6.36) is bounded, up to a multiplicative constant, by
$\sum_{k=1}^{\infty} \sum_{l=1}^{\infty} \dfrac{(k+l)^{1-q}}{k^{1/2} l^{1/2}} < \Big( \sum_{k=1}^{\infty} \dfrac{k^{(1-q)/2}}{k^{1/2}} \Big) \Big( \sum_{l=1}^{\infty} \dfrac{l^{(1-q)/2}}{l^{1/2}} \Big) \leq \Big( \sum_{k=1}^{\infty} k^{-q/2} \Big)^2$.
Therefore, Condition (6.36) requires $q > 2$. In this case, Condition (6.37) requires
$q > 4$.
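As a quick numerical illustration (ours, not part of the text), one can check the decay rate $A_{k,l} = O((k+l)^{2-2q})$ for these coefficients; the exponent $q = 1.5$ and the truncation level are arbitrary choices.

```python
q = 1.5  # illustrative exponent, q > 1

def A_kl(k, l, N=100000):
    # Tail sum A_{k,l} = sum_{i>=k, j>=l} a_{i,j}^2 for a_{i,j} = (i+j+1)^{-q};
    # grouping the tail along diagonals i + j = const gives sum_j j*(k+l+j)^{-2q}.
    return sum(j * (k + l + j) ** (-2 * q) for j in range(1, N))

# The ratio A_{k,l} / (k+l)^{2-2q} should stay bounded as k + l grows.
for s in (10, 20, 40, 80):
    print(s, A_kl(s, s) / (2 * s) ** (2 - 2 * q))
```

The printed ratios stabilize near a constant, consistent with the stated order of decay.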
Example VI.19. Consider $a_{i,j} = (i+1)^{-q}(j+1)^{-q}$ for all $i, j \geq 0$ and some $q > 1$.
Then, $A = \sum_{i=0}^{\infty} \sum_{j=0}^{\infty} a_{i,j}^2 < \infty$ and
(6.39)   $A_{k,l} = \sum_{i=k}^{\infty} \sum_{j=l}^{\infty} a_{i,j}^2 = O\big(k^{-(2q-1)} l^{-(2q-1)}\big)$.
One can thus check that Condition (6.36) requires $q > 3/2$, while Condition (6.37)
requires $q > 2$.
6.7 A Moment Inequality

We establish a moment inequality for stationary 2-parameter random fields on
general probability spaces, without assuming the product structure. We first review
the Peligrad–Utev inequality, a maximal $L^p$-inequality in dimension one, with $p \geq 2$.
Let $\{X_k\}_{k\in\mathbb{Z}}$ be a stationary process with $X_k = f \circ T^k$ for all $k \in \mathbb{Z}$, where $f$ is a
measurable function from a probability space $(\Omega, \mathcal{A}, \mathbb{P})$ to $\mathbb{R}$, and $T$ is a bimeasurable,
measure-preserving, one-to-one and onto map on $(\Omega, \mathcal{A}, \mathbb{P})$. Consider
(6.40)   $S_n(f) = \sum_{k=1}^{n} f \circ T^k$.
Let $\{\mathcal{F}_k\}_{k\in\mathbb{Z}}$ be a filtration on $(\Omega, \mathcal{A}, \mathbb{P})$ such that $T^{-1}\mathcal{F}_k = \mathcal{F}_{k+1}$ for all $k \in \mathbb{Z}$.
Suppose $\int f^2 d\mathbb{P} < \infty$, $\int f d\mathbb{P} = 0$, $f \in \mathcal{F}_0$ (i.e., the sequence is adapted) and
$f \in L^2(\mathcal{F}_\infty) \ominus L^2(\mathcal{F}_{-\infty})$, with $\mathcal{F}_\infty = \bigvee_{k\in\mathbb{Z}} \mathcal{F}_k$ and $\mathcal{F}_{-\infty} = \bigcap_{k\in\mathbb{Z}} \mathcal{F}_k$.
Let $C$ denote a constant that may change from line to line. It is known that for
all $f \in L^p(\mathcal{F}_\infty)$ with $E(f \mid \mathcal{F}_{-\infty}) = 0$,
(6.41)   $\Big\| \max_{1 \leq k \leq n} |S_k(f)| \Big\|_p \leq C n^{1/2} \Big( \|E(f \mid \mathcal{F}_0)\|_p + \|f - E(f \mid \mathcal{F}_0)\|_p + \sum_{k=1}^{n} \dfrac{\|E(S_k(f) \mid \mathcal{F}_0)\|_p}{k^{3/2}} + \sum_{k=1}^{n} \dfrac{\|S_k(f) - E(S_k(f) \mid \mathcal{F}_k)\|_p}{k^{3/2}} \Big)$.
The inequality above was first established for adapted stationary sequences in
Peligrad and Utev [67], and then extended to an $L^p$-inequality for $p \geq 2$ in Peligrad et
al. [68]. The case $p \in (1, 2)$ was addressed by Wu and Zhao [122], and the non-adapted
case for $p \geq 2$ by Volny [105].
For the sake of simplicity, we simplify the bound in (6.41) by regrouping the
summations. Observe that $\|E(S_k(f) \mid \mathcal{F}_0)\|_p \leq \|E(S_k(f) \mid \mathcal{F}_1)\|_p$, $\|E(f \mid \mathcal{F}_0)\|_p = \|E(S_1(f) \mid \mathcal{F}_1)\|_p$ and $\|f - E(f \mid \mathcal{F}_0)\|_p = \|S_1(f) - E(S_1(f) \mid \mathcal{F}_1)\|_p$. Thus, we obtain
(6.42)   $\Big\| \max_{1 \leq k \leq n} |S_k(f)| \Big\|_p \leq C n^{1/2} \Big( \sum_{k=1}^{n} \dfrac{\|E(S_k(f) \mid \mathcal{F}_1)\|_p}{k^{3/2}} + \sum_{k=1}^{n} \dfrac{\|S_k(f) - E(S_k(f) \mid \mathcal{F}_k)\|_p}{k^{3/2}} \Big)$.
Now, consider a general probability space $(\Omega, \mathcal{A}, \mathbb{P})$, and suppose there exist
a commuting 2-parameter filtration $\{\mathcal{F}_{i,j}\}_{(i,j)\in\mathbb{Z}^2}$ and an Abelian group of bimea-
surable, measure-preserving, one-to-one and onto maps $\{T_{i,j}\}_{(i,j)\in\mathbb{Z}^2}$ on $(\Omega, \mathcal{A}, \mathbb{P})$,
such that (6.15) holds. Define $\mathcal{F}_{\infty,\infty} = \bigvee_{(i,j)\in\mathbb{Z}^2} \mathcal{F}_{i,j}$, $\mathcal{F}_{-\infty,\infty} = \bigcap_{i\in\mathbb{Z}} \mathcal{F}_{i,\infty}$ and
$\mathcal{F}_{\infty,-\infty} = \bigcap_{j\in\mathbb{Z}} \mathcal{F}_{\infty,j}$. Note that when $(\Omega, \mathcal{A}, \mathbb{P})$ is a product probability space,
$\mathcal{F}_{-\infty,\infty}$ and $\mathcal{F}_{\infty,-\infty}$ are trivial, by Kolmogorov's zero-one law.
Recall the definition of $S_{m,n}(f)$ in (6.18). Given $f$, write $S_{m,n} \equiv S_{m,n}(f)$ for the
sake of simplicity.
Proposition VI.20. Consider $(\Omega, \mathcal{A}, \mathbb{P})$, $\{T_{i,j}\}_{(i,j)\in\mathbb{Z}^2}$ and $\{\mathcal{F}_{i,j}\}_{(i,j)\in\mathbb{Z}^2}$ as described
above. Suppose $p \geq 2$, $f \in L^p(\mathcal{F}_{\infty,\infty})$ and $E(f \mid \mathcal{F}_{-\infty,\infty}) = E(f \mid \mathcal{F}_{\infty,-\infty}) = 0$. Then,
$\|S_{m,n}\|_p \leq C m^{1/2} n^{1/2} \sum_{k=1}^{m} \sum_{l=1}^{n} \dfrac{d_{k,l}(f)}{k^{3/2} l^{3/2}}$
with
$d_{k,l}(f) = \|E(S_{k,l} \mid \mathcal{F}_{1,1})\|_p + \|E(S_{k,l} \mid \mathcal{F}_{1,\infty}) - E(S_{k,l} \mid \mathcal{F}_{1,l})\|_p + \|E(S_{k,l} \mid \mathcal{F}_{\infty,1}) - E(S_{k,l} \mid \mathcal{F}_{k,1})\|_p + \|S_{k,l} - E(S_{k,l} \mid \mathcal{F}_{k,\infty}) - E(S_{k,l} \mid \mathcal{F}_{\infty,l}) + E(S_{k,l} \mid \mathcal{F}_{k,l})\|_p$.
Corollary VI.21. Suppose the assumptions in Proposition VI.20 hold.
(i) If $f \in \mathcal{F}_{0,0}$, then
$\|S_{m,n}(f)\|_p \leq C m^{1/2} n^{1/2} \sum_{k=1}^{m} \sum_{l=1}^{n} \dfrac{\|E(S_{k,l}(f) \mid \mathcal{F}_{1,1})\|_p}{k^{3/2} l^{3/2}}$.
(ii) If $\{f \circ T_{i,j}\}_{(i,j)\in\mathbb{Z}^2}$ are two-dimensional martingale differences, in the sense that
$f \in L^p(\mathcal{F}_{0,0})$ and $E(f \mid \mathcal{F}_{0,-1}) = E(f \mid \mathcal{F}_{-1,0}) = 0$, then
$\|S_{m,n}(f)\|_p \leq C m^{1/2} n^{1/2} \|f\|_p$.
The proof of Corollary VI.21 is trivial. We only remark that the second case recov-
ers Burkholder's inequality for multiparameter martingale differences established
in [36].
Proof of Proposition VI.20. Fix $f$. Define $S_{0,n} = \sum_{j=1}^{n} f \circ T_{0,j}$. Clearly,
(6.43)   $S_{m,n} = \sum_{i=1}^{m} \sum_{j=1}^{n} f \circ T_{i,j} = \sum_{i=1}^{m} \Big( \sum_{j=1}^{n} f \circ T_{0,j} \Big) \circ T_{i,0} = \sum_{i=1}^{m} S_{0,n} \circ T_{i,0}$.
Fix $n$. Observe that $E S_{0,n} = 0$ and $\{S_{0,n} \circ T_{i,0}\}_{i\in\mathbb{Z}}$ is a stationary sequence. Furthermore,
$\{\mathcal{F}_{i,\infty}\}_{i\in\mathbb{Z}}$ is a filtration, $T^{-1}_{i,0}\mathcal{F}_{j,\infty} = T_{-i,0}\mathcal{F}_{j,\infty} = \mathcal{F}_{i+j,\infty}$ and $E(S_{0,n} \mid \mathcal{F}_{-\infty,\infty}) = 0$.
Therefore, we can apply the Peligrad–Utev inequality (6.42) and obtain
(6.44)   $\|S_{m,n}\|_p \leq C m^{1/2} \Big( \underbrace{\sum_{k=1}^{m} k^{-3/2} \|E(S_{k,n} \mid \mathcal{F}_{1,\infty})\|_p}_{\Lambda_1} + \underbrace{\sum_{k=1}^{m} k^{-3/2} \|S_{k,n} - E(S_{k,n} \mid \mathcal{F}_{k,\infty})\|_p}_{\Lambda_2} \Big)$.
We first deal with $\Lambda_1$. Define $S_{m,0} = \sum_{i=1}^{m} f \circ T_{i,0}$. Similarly as in (6.43), $S_{k,n} = \sum_{j=1}^{n} S_{k,0} \circ T_{0,j}$, and
$E(S_{k,n} \mid \mathcal{F}_{1,\infty}) = \sum_{j=1}^{n} E(S_{k,0} \circ T_{0,j} \mid \mathcal{F}_{1,\infty}) = \sum_{j=1}^{n} E\big(S_{k,0} \circ T_{0,j} \mid T_{0,-j}(\mathcal{F}_{1,\infty})\big)$,
where in the last equality we used the fact that $T_{0,j}(\mathcal{F}_{i,\infty}) = \mathcal{F}_{i,\infty}$ for all $i, j \in \mathbb{Z}$.
Now, by the identity $E(f \mid \mathcal{F}) \circ T = E(f \circ T \mid T^{-1}(\mathcal{F}))$, we have
(6.45)   $E(S_{k,n} \mid \mathcal{F}_{1,\infty}) = \sum_{j=1}^{n} E(S_{k,0} \mid \mathcal{F}_{1,\infty}) \circ T_{0,j}$.
Observe that (6.45) is again a summation in the form of (6.40). Then, applying the
Peligrad–Utev inequality (6.42) again, we obtain
$\Lambda_1 \leq C n^{1/2} \Big( \sum_{l=1}^{n} l^{-3/2} \big\|E[E(S_{k,l} \mid \mathcal{F}_{1,\infty}) \mid \mathcal{F}_{\infty,1}]\big\|_p + \sum_{l=1}^{n} l^{-3/2} \big\|E(S_{k,l} \mid \mathcal{F}_{1,\infty}) - E[E(S_{k,l} \mid \mathcal{F}_{1,\infty}) \mid \mathcal{F}_{\infty,l}]\big\|_p \Big)$.
By the commuting property of the marginal filtrations (6.17), the above inequality
becomes
(6.46)   $\Lambda_1 \leq C n^{1/2} \Big( \sum_{l=1}^{n} l^{-3/2} \|E(S_{k,l} \mid \mathcal{F}_{1,1})\|_p + \sum_{l=1}^{n} l^{-3/2} \|E(S_{k,l} \mid \mathcal{F}_{1,\infty}) - E(S_{k,l} \mid \mathcal{F}_{1,l})\|_p \Big)$.
Similarly, one can show
$\Lambda_2 = \Big\| \sum_{j=1}^{n} [S_{k,0} - E(S_{k,0} \mid \mathcal{F}_{k,\infty})] \circ T_{0,j} \Big\|_p$
$\leq C n^{1/2} \Big( \sum_{l=1}^{n} l^{-3/2} \|E(S_{k,l} \mid \mathcal{F}_{\infty,1}) - E(S_{k,l} \mid \mathcal{F}_{k,1})\|_p + \sum_{l=1}^{n} l^{-3/2} \|S_{k,l} - E(S_{k,l} \mid \mathcal{F}_{k,\infty}) - E(S_{k,l} \mid \mathcal{F}_{\infty,l}) + E(S_{k,l} \mid \mathcal{F}_{k,l})\|_p \Big)$.   (6.47)
Combining (6.44), (6.46) and (6.47), we have thus proved Proposition VI.20.
6.8 Auxiliary Proofs

For arbitrary $\sigma$-fields $\mathcal{F}, \mathcal{G}$, let $\mathcal{F} \vee \mathcal{G}$ denote the smallest $\sigma$-field that contains $\mathcal{F}$
and $\mathcal{G}$.
Proposition VI.22. Let $(\Omega, \mathcal{B}, \mathbb{P})$ be a probability space and let $\mathcal{F}, \mathcal{G}, \mathcal{H}$ be mutually
independent sub-$\sigma$-fields of $\mathcal{B}$. Then, for every $\mathcal{B}$-measurable random variable $X$ with $E|X| < \infty$, we
have
(6.48)   $E[E(X \mid \mathcal{F} \vee \mathcal{G}) \mid \mathcal{G} \vee \mathcal{H}] = E(X \mid \mathcal{G})$  a.s.
Proposition VI.22 is closely related to the notion of conditional independence (see
e.g. [16], Chapter 7.3). Namely, given a probability space $(\Omega, \mathcal{F}, \mathbb{P})$ and sub-$\sigma$-
fields $\mathcal{G}_1, \mathcal{G}_2$ and $\mathcal{G}_3$ of $\mathcal{F}$, $\mathcal{G}_1$ and $\mathcal{G}_2$ are said to be conditionally independent given
$\mathcal{G}_3$ if, for all $A_1 \in \mathcal{G}_1$, $A_2 \in \mathcal{G}_2$, $\mathbb{P}(A_1 \cap A_2 \mid \mathcal{G}_3) = \mathbb{P}(A_1 \mid \mathcal{G}_3)\mathbb{P}(A_2 \mid \mathcal{G}_3)$ almost surely.
Proof of Proposition VI.22. First, we show that $\mathcal{F} \vee \mathcal{G}$ and $\mathcal{G} \vee \mathcal{H}$ are conditionally
independent given $\mathcal{G}$. By Theorem 7.3.1 (ii) in [16], it is equivalent to show that, for all
$F \in \mathcal{F}$, $G \in \mathcal{G}$, $\mathbb{P}(F \cap G \mid \mathcal{G} \vee \mathcal{H}) = \mathbb{P}(F \cap G \mid \mathcal{G})$ almost surely. This is true since
$\mathbb{P}(F \cap G \mid \mathcal{G} \vee \mathcal{H}) = \mathbf{1}_G E(\mathbf{1}_F \mid \mathcal{G} \vee \mathcal{H}) = \mathbf{1}_G E(\mathbf{1}_F \mid \mathcal{G}) = \mathbb{P}(F \cap G \mid \mathcal{G})$  a.s.
Next, by Theorem 7.3.1 (iv) in [16], the conditional independence obtained above
yields $E(X \mid \mathcal{G} \vee \mathcal{H}) = E(X \mid \mathcal{G})$ almost surely, for every $\mathcal{F} \vee \mathcal{G}$-measurable $X$ with $E|X| < \infty$.
Replacing $X$ by $E(X \mid \mathcal{F} \vee \mathcal{G})$, we have thus proved (6.48).
Proof of Lemma VI.16. Write $W_{k,l} = (Z_{i,j})_{k,l}^{h}$. Define (and recall) $W_{k,l,\pm} = (Z_{i,j,\pm})_{k,l}^{h}$. Let $\widetilde{W}_{k,l,-}$ be a copy of $W_{k,l,-}$, independent of $W_{k,l,\pm}$. Set $\widetilde{W}_{k,l} := W_{k,l,+} + \widetilde{W}_{k,l,-}$.
Recall that $K_{k,l}(W_{k,l,-}) = E(K(W_{k,l}) \mid \mathcal{F}_{1,1})$ in (6.31). Observe that by (6.30), $W_{k,l,-} \in \mathcal{F}_{1,1}$, and $W_{k,l,+}, \widetilde{W}_{k,l,-}$ are independent of $\mathcal{F}_{1,1}$. Therefore, $E(K(\widetilde{W}_{k,l}) \mid \mathcal{F}_{1,1}) = E(K(\widetilde{W}_{k,l})) = 0$, and
$|K_{k,l}(W_{k,l,-})| = |E(K(W_{k,l}) - K(\widetilde{W}_{k,l}) \mid \mathcal{F}_{1,1})| \leq E\big(|K(W_{k,l}) - K(\widetilde{W}_{k,l})| \,\big|\, \mathcal{F}_{1,1}\big)$.
Observe that by (6.34),
$|K(W_{k,l}) - K(\widetilde{W}_{k,l})| \leq M_{\alpha,\beta}(W_{k,l}) \big( |W_{k,l,-} - \widetilde{W}_{k,l,-}|^{\alpha} + |W_{k,l,-} - \widetilde{W}_{k,l,-}|^{\beta} \big)$.
Write $U_{k,l} = W_{k,l,-} - \widetilde{W}_{k,l,-}$. By the Cauchy–Schwarz inequality, and noting that
$E(|M_{\alpha,\beta}(W_{k,l})|^2 \mid \mathcal{F}_{1,1}) = \|M_{\alpha,\beta}(W_{k,l})\|_2^2 = \|M_{\alpha,\beta}(W_{1,1})\|_2^2$, we have
$|K_{k,l}(W_{k,l,-})| \leq \|M_{\alpha,\beta}(W_{1,1})\|_2 \, E\big[ (|U_{k,l}|^{\alpha} + |U_{k,l}|^{\beta})^2 \,\big|\, \mathcal{F}_{1,1} \big]^{1/2}$,
whence, for $p \geq 2$,
(6.49)   $\|K_{k,l}(W_{k,l,-})\|_p \leq \|M_{\alpha,\beta}(W_{1,1})\|_2 \, \big\| |U_{k,l}|^{\alpha} + |U_{k,l}|^{\beta} \big\|_p \leq \|M_{\alpha,\beta}(W_{1,1})\|_2 \big( \| |U_{k,l}|^{\alpha} \|_p + \| |U_{k,l}|^{\beta} \|_p \big)$.
Finally, since for all $\gamma > 0$ and $n \in \mathbb{N}$ there exists a constant $C(\gamma, n) > 0$ such that,
for every vector $w = (w_1, \ldots, w_n) \in \mathbb{R}^n$,
$|w|^{2\gamma} = \Big( \sum_{i=1}^{n} w_i^2 \Big)^{\gamma} \leq C(\gamma, n) \sum_{i=1}^{n} w_i^{2\gamma}$,
it follows that for all $\gamma > 0$,
$E(|U_{k,l}|^{2\gamma}) = E\big(|W_{k,l,-} - \widetilde{W}_{k,l,-}|^{2\gamma}\big) = E\big( \big| (Z_{i,j,-} - \widetilde{Z}_{i,j,-})_{k,l}^{h} \big|^{2\gamma} \big) = O\Big( E \sum_{k-h < i \leq k} \sum_{l-h < j \leq l} (Z_{i,j,-} - \widetilde{Z}_{i,j,-})^{2\gamma} \Big)$.
By Wu [117], Lemma 4, under the notation (6.33), $E(|\epsilon|^{2 \vee 2\gamma}) < \infty$ implies that for
all $\Lambda \subset \mathbb{Z}^2$, $E(|Z_\Lambda|^{2\gamma}) \leq C A_\Lambda^{\gamma}$ for some universal constant $C$. It then follows that
$E(|U_{k,l}|^{2\gamma}) = O(A^{\gamma}_{k+1-h,\,l+1-h})$. Consequently, (6.49) yields
$\|K_{k,l}(W_{k,l,-})\|_p \leq \|M_{\alpha,\beta}(W_{1,1})\|_2 \big( O(A^{\alpha/2}_{k+1-h,\,l+1-h}) + O(A^{\beta/2}_{k+1-h,\,l+1-h}) \big) = O(A^{\alpha/2}_{k+1-h,\,l+1-h})$.
The proof is thus completed.
CHAPTER VII

Asymptotic Normality of Kernel Density Estimators for Stationary Random Fields

Let $\{X_i\}_{i\in\mathbb{Z}^d}$, $d \in \mathbb{N}$, be a stationary zero-mean random field such that the
marginal probability density function $p(\cdot)$ exists. We are interested in the Parzen–
Rosenblatt kernel density estimator of $p(x)$, in the form of
(7.1)   $f_n(x) = \dfrac{1}{n^d b_n} \sum_{i \in [1,n]^d} K\Big( \dfrac{x - X_i}{b_n} \Big)$,  $x \in \mathbb{R}$.
Throughout this chapter, we assume that the kernel $K : \mathbb{R} \to \mathbb{R}$ is a bounded
Lipschitz-continuous density function, and that the bandwidth $b_n$ satisfies
(7.2)   $b_n \to 0$ and $n^d b_n \to \infty$ as $n \to \infty$.
We also write, for $a, b \in \mathbb{Z}$, $[a, b] \equiv \{a, a+1, \ldots, b\}$.
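To make (7.1) concrete, here is a minimal sketch for $d = 2$ (our illustration, not from the dissertation); the Gaussian kernel and the specific bandwidth are illustrative choices satisfying the boundedness and Lipschitz assumptions.

```python
import numpy as np

def parzen_rosenblatt(field, x, b_n):
    """Kernel density estimate f_n(x) from an n x n observation grid (d = 2).

    field : 2-D array of observations X_i on the grid [1, n]^2
    x     : point at which to estimate the marginal density p(x)
    b_n   : bandwidth
    """
    K = lambda u: np.exp(-u**2 / 2) / np.sqrt(2 * np.pi)  # Gaussian kernel
    n_d = field.size  # n^d grid points
    return K((x - field) / b_n).sum() / (n_d * b_n)

# sanity check on i.i.d. standard normal data: f_n(0) should be close to
# the true density p(0) = 1/sqrt(2*pi) ~ 0.3989
rng = np.random.default_rng(0)
X = rng.standard_normal((200, 200))
print(parzen_rosenblatt(X, 0.0, b_n=0.1))
```

In the dependent case studied in this chapter, the same estimator is used; only the asymptotic analysis changes.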
This problem was first considered by Rosenblatt [82] and Parzen [65] in the case
where the $X_i$'s are independent and identically distributed (i.i.d.) random variables: in
particular, one can show the consistency
$\lim_{n\to\infty} E[(f_n(x) - p(x))^2] = 0$,
and the asymptotic normality
(7.3)   $(n^d b_n)^{1/2} (f_n(x) - E f_n(x)) \Rightarrow \mathcal{N}(0, \sigma_x^2)$ as $n \to \infty$,
where $\sigma_x^2 = p(x) \int K^2(s)\, ds$. See for example Silverman [97] for more references on
density estimation problems with i.i.d. data.
The case where the $X_i$'s are dependent, however, presents more challenges, and we
focus on establishing the asymptotic normality (7.3) in this chapter. The dependent
one-dimensional case has been considered by Robinson [80], Castellana and Leadbet-
ter [14], Bosq et al. [6], Wu and Mielniczuk [120] and Dedecker and Merlevede [29],
among others. In particular, Wu and Mielniczuk [120] investigated thoroughly the
case where $\{X_i\}_{i\in\mathbb{Z}}$ is a linear process, that is,
$X_i = \sum_{k=-\infty}^{\infty} a_k \epsilon_{i-k}$,  $i \in \mathbb{Z}$,
where $\sum_k a_k^2 < \infty$ and the innovations $\{\epsilon_i\}_{i\in\mathbb{Z}}$ are i.i.d. random variables. Linear
processes are important in the study of stationary processes, as any stationary process
can be represented as a linear combination of linear processes (the so-called superlinear
processes) with martingale-difference innovations (Volny et al. [106]).
The asymptotic normality of kernel density estimators for random fields has been
considered by Tran [104], Hallin et al. [44] and El Machkouri [33, 34], among others.
The extension of results from one dimension to higher dimensions, however, is not trivial.
As summarized in Hallin et al. [44], 'the points of $\mathbb{Z}^d$ do not have a natural ordering.
As a result, most techniques available for one-dimensional processes do not extend
to random fields.' See [44] for more references and related discussions.
In particular, a notorious difficulty for kernel density estimation of random fields
is that one often needs more assumptions on the bandwidth $b_n$ than the minimal
condition (7.2). This condition is minimal in the sense that it is the natural condition
for the asymptotic normality (7.3) to hold when the $X_i$'s are i.i.d. To the best of our
knowledge, only the recent results of El Machkouri [33, 34] assume no condition on $b_n$
beyond the minimal one (7.2) for dependent random fields.
We focus on kernel density estimation for causal linear random fields $\{X_i\}_{i\in\mathbb{Z}^d}$
($d \in \mathbb{N}$) in the form of
(7.4)   $X_i = \sum_{k \in \mathbb{Z}^d,\, k \succeq 0} a_k \epsilon_{i-k}$,  $i \in \mathbb{Z}^d$,
where $\sum_{i \succeq 0} a_i^2 < \infty$ and $\{\epsilon_i\}_{i\in\mathbb{Z}^d}$ are i.i.d. zero-mean random variables with finite
second moments. Throughout this chapter, we let '$i \succeq k$' denote '$i_\tau \geq k_\tau$ for all
$\tau = 1, \ldots, d$' for $i, k \in \mathbb{Z}^d$, and write $\mathbf{0} = (0, \ldots, 0)$, $\mathbf{1} = (1, \ldots, 1) \in \mathbb{Z}^d$.
We provide new conditions on the coefficients $\{a_i\}_{i\in\mathbb{Z}^d}$ such that the asymptotic
normality (7.3) holds (see Theorem VII.4 below), and compare them with the results obtained
by Hallin et al. [44] and El Machkouri [34]. In both cases, our conditions on the
coefficients $\{a_i\}_{i\in\mathbb{Z}^d}$ are weaker. On the other hand, our condition on the bandwidth
improves the one in [44], but it is still stronger than the minimal one (7.2) assumed
in [34].
Our proof is based on the $m$-approximation approach. As we will see, to address
this problem one has to establish an $m$-approximation with unbounded $m$ ($m_n \to \infty$
as $n \to \infty$). As a key step of our approach, we establish a central limit theo-
rem for triangular arrays of stationary $m$-dependent random fields with unbounded
$m$ (Theorem VII.12). This result improves a central limit theorem established by
Heinrich [47]. Our $m$-approximation method also involves certain moment
inequalities for stationary random fields (Lemma VII.16), which are based on Propo-
sition VI.20. A different proof of the asymptotic normality of kernel density esti-
mators for stationary random fields is due to El Machkouri [33, 34], who also es-
tablished $m$-approximations with unbounded $m$, combined with Lindeberg's method
(see e.g. Rio [79] and Dedecker [27]).
Finally, we point out that when the asymptotic normality (7.3) holds, the ran-
dom variables are often said to be weakly dependent, in the sense that they behave
asymptotically like i.i.d. random variables. On the other hand, when the dependence
is strong enough, the normalization needed to obtain a limiting distribution is of a different
order from $n^d b_n$ in (7.3), and the limit may no longer be Gaussian (see
e.g. Csorgo and Mielniczuk [18] for the one-dimensional case). These two regimes are
sometimes referred to as short-range dependence and long-range dependence, respec-
tively. For linear processes, Wu and Mielniczuk [120] addressed both the short-range
and the long-range dependence cases. For linear random fields, however, to the best
of our knowledge, the long-range dependence case remains open. It seems that the
$m$-approximation method is limited to the short-range dependence case; therefore,
the long-range dependence case is beyond the scope of this chapter.
The chapter is organized as follows. Our assumptions and main results are pre-
sented in Section 7.1. Examples and comparison with other results are provided in
Section 7.2. Section 7.3 is devoted to the central limit theorem for triangular ar-
rays of m-dependent random fields. Section 7.4 establishes asymptotic normality by
m-approximation. Auxiliary proofs are given in Section 7.5.
7.1 Assumptions and Main Result
We first introduce our conditions. For each $m \in \mathbb{N}$, $i \in \mathbb{Z}^d$, write
(7.5)   $X_{i,m} = \sum_{k \in [0,m-1]^d} a_k \epsilon_{i-k}$  and  $\widetilde{X}_{i,m} = X_i - X_{i,m}$.
Let $p$, $p_m$ and $\widetilde{p}_m$ denote the probability density functions of $X_0$, $X_{0,m}$ and $\widetilde{X}_{0,m}$,
respectively. Let $p_i$ and $p_{i,m}$ denote the joint density functions of $(X_0, X_i)$ and
$(X_{0,m}, X_{i,m})$, respectively. Our first condition is on the regularity of the density
functions. Define the suprema $\bar{p} = \sup_x p(x)$, $\bar{p}_i = \sup_{x,y} p_i(x,y)$, and similarly $\bar{p}_m$
and $\bar{p}_{i,m}$.
Condition VII.1. (i) The density functions $p$ and $\{p_m\}_{m\in\mathbb{N}}$ exist. They are
$c_0$-Lipschitz continuous with some constant $c_0 < \infty$ independent of $m$ (i.e.,
$\max(|p(x) - p(y)|, |p_m(x) - p_m(y)|) \leq c_0 |x - y|$). Furthermore,
(7.6)   $\bar{p} < \infty$  and  $\sup_m \bar{p}_m < \infty$.
(ii) The density functions $p_i$ and $p_{i,m}$ exist for all $i \neq 0$, $m \in \mathbb{N}$. Furthermore,
(7.7)   $\sup_{i \neq 0} \bar{p}_i < \infty$  and  $\sup_m \sup_{i \neq 0} \bar{p}_{i,m} < \infty$.
Condition VII.1 can be satisfied, for example, by simply assuming that the prob-
ability density function $p_\epsilon$ of $\epsilon_0$ exists and is Lipschitz. This was also assumed in Wu
and Mielniczuk [120].
Lemma VII.2. If $p_\epsilon$ exists and is Lipschitz, then Condition VII.1 holds.
The proof is deferred to Section 7.5.
Our second condition concerns the decay of the coefficients and the bandwidth $b_n$. Define
$A_k = \Big( \sum_{i \succeq k} a_i^2 \Big)^{1/2}$, $k \in \mathbb{Z}^d$,  and  $B_m = \Big( \sum_{i \in [0,\infty]^d,\, |i|_\infty \geq m} a_i^2 \Big)^{1/2}$, $m \in \mathbb{N}$,
with $|i|_\infty = \max_{\tau=1,\ldots,d} |i_\tau|$. Write
$\Delta_n = \sum_{k \in [1,n]^d} \dfrac{A_{k-\mathbf{1}}}{\prod_{\tau=1}^{d} k_\tau^{1/2}}$.
Condition VII.3. There exists a sequence of integers $\{m_n\}_{n\in\mathbb{N}}$ such that $m_n \to \infty$
as $n \to \infty$, and the following limits hold:
(7.8)   $\lim_{n\to\infty} b_n^{1/2} \Delta_n = 0$,
(7.9)   $\lim_{n\to\infty} B_{m_n}/b_n = 0$,
(7.10)   $\lim_{n\to\infty} m_n^d b_n = 0$,
(7.11)   $\lim_{n\to\infty} \dfrac{m_n^d \log^d n}{n^d b_n} = 0$.
Theorem VII.4. If Conditions VII.1 and VII.3 hold and $E(|\epsilon_0|^\alpha) < \infty$ for some
$\alpha > 2$, then the asymptotic normality (7.3) holds.
We will prove Theorem VII.4 in Section 7.4. We conclude this section with a few
remarks.
Remark VII.5. We briefly comment on each part of Condition VII.3.
(i) Condition (7.8) is slightly weaker than
$\Delta_\infty \equiv \sum_{k \in [1,\infty]^d} \dfrac{A_{k-\mathbf{1}}}{\prod_{\tau=1}^{d} k_\tau^{1/2}} < \infty$.
It was shown in Corollary VI.17 that the above condition implies the asymptotic
normality of $\sum_{i \in [1,n]^d} [f(X_i) - E f(X_0)]/n^{d/2}$ for Lipschitz continuous functions
$f$ such that $E f^2(X_0) < \infty$.
(ii) Condition (7.9) implies that
(7.12)   $\lim_{n\to\infty} E|\widetilde{X}_{0,m_n}|/b_n = 0$.
Indeed, Wu [117], Lemma 4, showed that for i.i.d. zero-mean random variables
$\{\epsilon_i\}_{i\in\mathbb{Z}}$ with $E(|\epsilon_0|^{2 \vee 2p}) < \infty$, $p > 0$,
(7.13)   $E\Big| \sum_i a_i \epsilon_i \Big|^{2p} \leq C \Big( \sum_i a_i^2 \Big)^{p}$.
Intuitively, $\widetilde{X}_{0,m_n}$ can be viewed as the remainder of $X_0$ after the $m_n$-truncation.
Condition (7.12) says that $m_n$ needs to tend to infinity fast enough for the
central limit theorem to hold.
(iii) Conditions (7.10) and (7.11) are used when we apply a central limit theorem
for $m$-dependent random variables with unbounded $m$ in Proposition VII.14
below.
Throughout this chapter, let C denote constants that do not depend on
i, k,m, n, x, y. The value of C may change from line to line.
7.2 Examples and Discussions

Theorem VII.4, and particularly Condition VII.3, is not convenient to apply to
concrete models. Instead, we provide a corollary for practical purposes. Write
$A_{[n]} = \max\{A_{n,1,\ldots,1}, A_{1,n,1,\ldots,1}, \ldots, A_{1,\ldots,1,n}\}$.
Corollary VII.6. Suppose $A_{[n]} \leq c_1 n^{-\beta}$ with $\beta > 0$, and $b_n = c_2 n^{-\gamma}$. Then a
sufficient condition for Condition VII.3 to hold is
(7.14)   $\gamma < \dfrac{d\beta}{d+\beta}$  and  $\beta > d$.
Consequently, if $E(|\epsilon_0|^\alpha) < \infty$ for some $\alpha > 2$, and Condition VII.1 and (7.14) hold,
then the asymptotic normality (7.3) follows.
Proof. Assume that $m_n$ takes the form $n^\delta$. Observe that $B_{m_n}$ is of the same
order as $A_{[m_n]}$ as $n \to \infty$. Then, the limit conditions (7.9), (7.10) and (7.11) are
implied by
$\lim_{n\to\infty} \big( n^{-\beta\delta+\gamma} + n^{d\delta-\gamma} + n^{\delta-1+\gamma/d} \big) = 0$,
which is equivalent to $\gamma/\beta < \delta < \min\{\gamma/d,\, 1 - \gamma/d\}$. Since $\beta > d$ implies that
$\Delta_\infty < \infty$, the desired result follows.
Remark VII.7. Under the assumptions of Corollary VII.6, Condition (7.14) is very
close to necessary for Condition VII.3 to hold. Indeed, if $A_{[n]} = l(n) n^{-\beta}$ with
$\lim_{n\to\infty} l(n) = c_2 > 0$, then the same argument as above yields that Condition VII.3 is
equivalent to (7.14).
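As a quick sanity check (ours, not the author's), one can verify numerically that a valid truncation exponent $\delta$ exists exactly when (7.14) holds:

```python
def delta_exists(beta, gamma, d):
    # a valid delta must satisfy gamma/beta < delta < min(gamma/d, 1 - gamma/d)
    return gamma / beta < min(gamma / d, 1 - gamma / d)

def condition_714(beta, gamma, d):
    # the sufficient condition (7.14): gamma < d*beta/(d+beta) and beta > d
    return gamma < d * beta / (d + beta) and beta > d

d = 2
for beta in (1.5, 2.5, 4.0):
    for gamma in (0.3, 1.0, 1.6):
        assert delta_exists(beta, gamma, d) == condition_714(beta, gamma, d)
print("the two conditions agree on all tested (beta, gamma) pairs")
```

This matches the algebra in the proof: $\gamma/\beta < \gamma/d$ iff $\beta > d$, and $\gamma/\beta < 1 - \gamma/d$ iff $\gamma < d\beta/(d+\beta)$.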
Below, we provide examples of coefficients for which Condition VII.3 holds. We
assume that $b_n = n^{-\gamma}$ for some $\gamma \in (0, d)$.
Example VII.8. We compare our conditions with the ones of Hallin et al. [44].
They considered the case $|a_i| \leq C|i|_\infty^{-q}$, $i \succeq 0$, and they require
(7.15)   $q > \max(d+3,\, 2d+1/2)$  and  $\lim_{n\to\infty} n^d b_n^{(2q-1+6d)/(2q-1-4d)} = \infty$.
Our condition (7.14) imposes weaker assumptions in this case (with $b_n = n^{-\gamma}$). First,
observe that
$A^2_{n,1,\ldots,1} \leq B^2_n \leq C \sum_{i=n}^{\infty} i^{d-1} i^{-2q} \leq C n^{d-2q}$.
We can apply Corollary VII.6 with $\beta = q - d/2$. Then, (7.14) becomes
(7.16)   $q > \dfrac{3d}{2}$  and  $\gamma < d\, \dfrac{q - d/2}{q + d/2}$.
Thus, to establish the asymptotic normality (7.3), our condition (7.16) is less restric-
tive than (7.15) on both $q$ and $\gamma$.
Example VII.9. We compare our conditions with the ones of El Machkouri [34].
Note that his results apply to general stationary random fields, of which linear random
fields are a special case. In particular, he showed that for causal linear random fields,
if
(7.17)   $\sum_{i \in \mathbb{Z}^d} |i|_\infty^{q} |a_i| < \infty$
with $q = 5d/2$, then the asymptotic normality follows.
In this case, our condition on the coefficients is weaker, requiring only $q > d$.
Indeed, suppose (7.17) holds with some $q > 0$. Then, to apply Corollary VII.6, it
suffices to observe
$A^2_{n,1,\ldots,1} = \sum_{i_1=n}^{\infty} \sum_{i_2,\ldots,i_d \in \mathbb{N}} |a_i|^2 \leq n^{-2q} \sum_{i_1=n}^{\infty} \sum_{i_2,\ldots,i_d \in \mathbb{N}} |i|_\infty^{2q} |a_i|^2 < C n^{-2q}$,
and take $\beta = q$.
At the same time, our result requires $\gamma < dq/(q+d)$ for the bandwidth, in addition
to the minimal condition (7.2) assumed in [34]. Recall also that we assume $E(|\epsilon_0|^\alpha) < \infty$ for
some $\alpha > 2$, while El Machkouri's result needs only a finite-second-moment assumption
on $\epsilon_0$.
Remark VII.10. Finally, we compare our result to Wu and Mielniczuk [120]. In the
one-dimensional case, to obtain asymptotic normality they assume only finite variance
of $\epsilon_0$ and a weaker assumption on the coefficients:
(7.18)   $\sum_{i=0}^{\infty} |a_i| < \infty$.
This is weaker than our condition in one dimension (with $q > d = 1$ in (7.17)).
Wu and Mielniczuk followed a martingale approximation approach. It remains an
open question whether, in higher dimensions, the condition $q > d$ in (7.17) can be
improved to match (7.18) in dimension one.
7.3 A Central Limit Theorem for m-Dependent Random Fields

In this section, we prove a central limit theorem for stationary triangular arrays of
$m$-dependent random fields. Throughout this section, let $\{\{Y_{n,i} : i \in \mathbb{N}^d\}\}_{n\in\mathbb{N}}$ denote
stationary zero-mean triangular arrays. That is, for each $n$, $\{Y_{n,i}\}_{i\in\mathbb{N}^d}$ is stationary
and $Y_{n,i}$ has zero mean. Furthermore, we assume that $\{Y_{n,i}\}_{i\in\mathbb{N}^d}$ is $m_n$-dependent, in
the sense that $Y_{n,i}$ and $Y_{n,j}$ are independent if $|i - j|_\infty \geq m_n$. We provide conditions
such that
(7.19)   $\dfrac{S_n(Y)}{n^{d/2}} \equiv \dfrac{\sum_{i \in [1,n]^d} Y_{n,i}}{n^{d/2}} \Rightarrow \mathcal{N}(0, \sigma^2)$  as $n \to \infty$.
A key condition is the following:
(7.20)   $\Big\| \sum_{i \in \mathbb{N}^d,\, \mathbf{1} \preceq i \preceq j} Y_{n,i} \Big\|_2 \leq C (j_1 \cdots j_d)^{1/2}$  for all $n \in \mathbb{N}$, $j \in \mathbb{N}^d$.
Remark VII.11. Observe that Proposition VI.20 provides conditions under which (7.20)
holds. In fact, inequality (7.20) has been established, under various conditions on
the dependence of stationary random fields, by Dedecker [28] and El Machkouri et
al. [35], among others.
Theorem VII.12. Suppose that there exists a constant $C$ such that (7.20) holds. If
there exists a sequence $\{l_n\}_{n\in\mathbb{N}} \subset \mathbb{N}$ with $m_n/l_n \to 0$ and $l_n/n \to 0$ as $n \to \infty$, such that
(7.21)   $\lim_{n\to\infty} \dfrac{1}{l_n^d}\, E\Big( \sum_{k \in [1,l_n]^d} Y_{n,k} \Big)^2 = \sigma^2$,
(7.22)   $\lim_{n\to\infty} \dfrac{1}{l_n^d}\, E\Big[ \Big( \sum_{k \in [1,l_n]^d} Y_{n,k} \Big)^2 \mathbf{1}\Big\{ \Big| \sum_{k \in [1,l_n]^d} Y_{n,k} \Big| > \epsilon n^{d/2} \Big\} \Big] = 0$
for all $\epsilon > 0$, then (7.19) holds.
Proof. Consider partial sums over big blocks of size $l_n^d$, denoted by
$\eta_{n,k} = \sum_{i \in [1,l_n]^d} Y_{n,\, i + k(l_n + m_n)}$,  $k \in \mathbb{N}^d$.
In this way, for each $n \in \mathbb{N}$, $\{\eta_{n,k}\}_{k\in\mathbb{N}^d}$ are i.i.d., as we separate neighboring blocks
by distance $m_n$, and $\{Y_{n,i}\}_{i\in\mathbb{Z}^d}$ are $m_n$-dependent. Set
$S_n(\eta) = \sum_{k \in [0,\, \lfloor n/(l_n+m_n) \rfloor - 1]^d} \eta_{n,k}$,  $n \in \mathbb{N}$.
Then, (7.20) implies that
$\Big\| \dfrac{S_n(Y)}{n^{d/2}} - \dfrac{S_n(\eta)}{n^{d/2}} \Big\|_2 \to 0$  as $n \to \infty$.
To see this, for the sake of simplicity, we consider the case $n/(l_n + m_n) = \lfloor n/(l_n + m_n) \rfloor$. Indeed, by the triangle inequality, the left-hand side above can
be bounded by sums of the form $\| \sum_{i \in B} Y_{n,i} \|_2 / n^{d/2}$, where $B$ can be a rectangle of
size $n^{d-r} m_n^r$ with $r \in \{1, \ldots, d-1\}$. Focusing on the dominant term with $r = 1$,
we then bound the left-hand side above by $C (n/(l_n + m_n))^{1/2} (n^{d-1} m_n)^{1/2}/n^{d/2} = C m_n^{1/2}/(l_n + m_n)^{1/2} \to 0$ as $n \to \infty$.
As a consequence, it suffices to show $S_n(\eta)/n^{d/2} \Rightarrow \mathcal{N}(0, \sigma^2)$. This, under condi-
tions (7.21) and (7.22), follows from the standard central limit theorem for triangular
arrays of independent random variables (see e.g. [32], Chapter 2, Theorem 4.5).
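The big-block/small-block bookkeeping in the proof can be illustrated for $d = 1$ (our illustration; the block sizes are arbitrary): blocks of length $l_n$ separated by gaps of length $m_n$ involve disjoint, non-adjacent index ranges, so the corresponding block sums of an $m_n$-dependent sequence are independent.

```python
def big_blocks(n, l, m):
    """Index ranges of the big blocks eta_k, k = 0, 1, ..., for d = 1.

    Block k covers indices k*(l+m)+1, ..., k*(l+m)+l, so consecutive blocks
    are separated by a gap of exactly m indices."""
    count = n // (l + m)
    return [range(k * (l + m) + 1, k * (l + m) + 1 + l) for k in range(count)]

blocks = big_blocks(n=100, l=20, m=3)
gaps = [b2.start - b1.stop for b1, b2 in zip(blocks, blocks[1:])]
print(len(blocks), gaps)
```

The discarded indices (the gaps, plus a remainder near $n$) are what the negligible-remainder estimate in the proof controls.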
Remark VII.13. Central limit theorems for $m_n$-dependent random fields have been
considered by Heinrich [47]. His result has recently been applied, with $m_n = m$
fixed, by El Machkouri et al. [35] to establish a central limit theorem for stationary
random fields.
Our application requires us to take $m_n \to \infty$. In this case our condition in
Theorem VII.12 is weaker than Heinrich's. In particular, he assumed
(7.23)   $\lim_{n\to\infty} \dfrac{m_n^{2d}}{n^d} \sum_{i \in [1,n]^d} E\big[ Y_{n,i}^2 \mathbf{1}\{|Y_{n,i}| > \epsilon n^{d/2} m_n^{-2d}\} \big] = 0$  for all $\epsilon > 0$.
This is stronger than (7.22).
7.4 Asymptotic Normality by m-Approximation

In this section, we prove Theorem VII.4 by an $m$-approximation argument. Fix
$x \in \mathbb{R}$ and write
$Z_{n,i} = \dfrac{1}{\sqrt{b_n}} K\Big( \dfrac{x - X_i}{b_n} \Big)$  and  $\zeta_{n,i} = \dfrac{1}{\sqrt{b_n}} K\Big( \dfrac{x - X_{i,m_n}}{b_n} \Big)$,  $i \in \mathbb{Z}^d$.
In this way, $\{\zeta_{n,i}\}_{i\in\mathbb{Z}^d}$ are $m_n$-dependent. We will use $\{\{\zeta_{n,i} : i \in \mathbb{Z}^d\}\}_{n\in\mathbb{N}}$ to approx-
imate $\{\{Z_{n,i} : i \in \mathbb{Z}^d\}\}_{n\in\mathbb{N}}$. We also write $\bar{Z}_{n,i} = Z_{n,i} - E Z_{n,i}$ and $\bar{\zeta}_{n,i} = \zeta_{n,i} - E \zeta_{n,i}$.
Setting
$S_n(\bar\zeta) = \sum_{i \in [1,n]^d} \bar{\zeta}_{n,i}$  and  $S_n(\bar Z - \bar\zeta) = \sum_{i \in [1,n]^d} (\bar{Z}_{n,i} - \bar{\zeta}_{n,i})$,
we decompose
(7.24)   $(n^d b_n)^{1/2} (f_n(x) - E f_n(x)) = \dfrac{S_n(\bar\zeta)}{n^{d/2}} + \dfrac{S_n(\bar Z - \bar\zeta)}{n^{d/2}}$.
To prove Theorem VII.4, it suffices to establish the following two results.
Proposition VII.14. Under Condition VII.1 and (7.8), (7.10), (7.11) of Condi-
tion VII.3,
(7.25)   $\dfrac{S_n(\bar\zeta)}{n^{d/2}} \Rightarrow \mathcal{N}(0, \sigma_x^2)$.
Proposition VII.15. Under Condition VII.1 and (7.8), (7.9) of Condition VII.3,
(7.26)   $\dfrac{S_n(\bar Z - \bar\zeta)}{n^{d/2}} \stackrel{P}{\longrightarrow} 0$.
To prove the above two propositions, a key step is to establish the following
moment inequalities.
Lemma VII.16. There exists a constant $C > 0$ such that for all $n \in \mathbb{N}$,
(7.27)   $\|S_n(\bar Z - \bar\zeta)\|_2 \leq C n^{d/2} \big( \|\bar{Z}_{n,0} - \bar{\zeta}_{n,0}\|_2 + b_n^{1/2} \Delta_n \big)$.
In addition, if $E(|\epsilon_0|^\alpha) < \infty$ for some $\alpha \geq 2$, then
(7.28)   $\Big\| \sum_{i \in \mathbb{N}^d,\, \mathbf{1} \preceq i \preceq j} \bar{\zeta}_{n,i} \Big\|_\alpha \leq C (j_1 \cdots j_d)^{1/2} \big( \|\bar{\zeta}_{n,0}\|_\alpha + b_n^{1/2} \Delta_n \big)$  for all $j \in \mathbb{N}^d$.
The proof is deferred to Section 7.5.
Proof of Proposition VII.14. Observing that $S_n(\bar\zeta)/n^{d/2}$ is a partial sum of $m_n$-
dependent random fields, we apply Theorem VII.12. Observe that since $\|\bar\zeta_{n,0}\|_2 \to \sigma_x$
as $n \to \infty$, (7.28) with $\alpha = 2$ and assumption (7.8) entail (7.20). Thus, to
prove (7.25), it suffices to show, for $l_n = m_n \log n$,
(7.29)   $\lim_{n\to\infty} \dfrac{1}{l_n^d}\, E\Big( \sum_{i \in [1,l_n]^d} \bar\zeta_{n,i} \Big)^2 = \sigma_x^2$,
and, writing $\xi_n = \sum_{i \in [1,l_n]^d} \bar\zeta_{n,i}$,
(7.30)   $\lim_{n\to\infty} \dfrac{1}{l_n^d}\, E\big( \xi_n^2 \mathbf{1}\{|\xi_n| > \epsilon n^{d/2}\} \big) = 0$  for all $\epsilon > 0$.
By a standard calculation, under (7.7) of Condition VII.1, for all $n \in \mathbb{N}$ and $i \neq 0$,
$|E(\bar\zeta_{n,0}\bar\zeta_{n,i})| \leq C \bar{p}_{i,m_n} b_n \leq C b_n$.
Therefore,
$\Big| \dfrac{1}{l_n^d}\, E\Big( \sum_{i \in [1,l_n]^d} \bar\zeta_{n,i} \Big)^2 - E\bar\zeta_{n,0}^2 \Big| \leq 2 \sum_{i \in [-m_n,m_n]^d} |E(\bar\zeta_{n,0}\bar\zeta_{n,i})| \mathbf{1}\{i \neq 0\} \leq C m_n^d b_n$.
Thus, assumption (7.10) entails (7.29). To prove (7.30), observe that
$E(\xi_n^2 \mathbf{1}\{|\xi_n| > \epsilon n^{d/2}\}) \leq \|\xi_n\|_\alpha^2\, \mathbb{P}(|\xi_n| > \epsilon n^{d/2})^{(\alpha-2)/\alpha} \leq \|\xi_n\|_\alpha^2 \Big( \dfrac{\|\xi_n\|_2^2}{\epsilon^2 n^d} \Big)^{(\alpha-2)/\alpha}$.
This time, (7.28) and (7.8) yield $\|\xi_n\|_2 \leq C l_n^{d/2}$. For $\alpha > 2$, observe that, since $K$ is
bounded,
$\|\bar\zeta_{n,0}\|_\alpha = (E|\bar\zeta_{n,0}|^\alpha)^{1/\alpha} \leq \Big( \dfrac{C}{b_n^{(\alpha-2)/2}} \|\bar\zeta_{n,0}\|_2^2 \Big)^{1/\alpha} \leq C b_n^{-(\alpha-2)/(2\alpha)}$.
So, $\|\xi_n\|_\alpha^2 \leq C l_n^d b_n^{-(\alpha-2)/\alpha}$. To sum up, we have obtained that
$\dfrac{1}{l_n^d}\, E(\xi_n^2 \mathbf{1}\{|\xi_n| > \epsilon n^{d/2}\}) \leq C \Big( \dfrac{l_n^d}{n^d b_n} \Big)^{(\alpha-2)/\alpha}$.
Now, (7.11) entails (7.30).
Proof of Proposition VII.15. To obtain the desired result, it suffices to com-
bine (7.27), assumptions (7.8) and (7.9), and Lemma VII.17 below.
Lemma VII.17. Under the assumptions of Condition VII.1, there exists a constant
$C$ such that for all $n \in \mathbb{N}$,
(7.31)   $\|\bar\zeta_{n,0} - \bar{Z}_{n,0}\|_2 \leq C \Big( \Big( \dfrac{B_{m_n}}{b_n} \Big)^{1/2} + b_n^{1/2} \Big)$.
The proof is deferred to Section 7.5.
7.5 Proofs

Proof of Lemma VII.2. (i) The existence and Lipschitz continuity of $p$ and $p_m$ have
been proved by Wu and Mielniczuk [120], Lemma 1. To prove (7.6), observe that
(7.32)   $|p_m(y) - p(y)| \leq \int |p_m(y) - p_m(y - x)|\, \widetilde{p}_m(x)\, dx \leq C \int |x|\, \widetilde{p}_m(x)\, dx = C E|\widetilde{X}_{0,m}|$.
This entails that $p_m(x) \to p(x)$ uniformly in $x \in \mathbb{R}$ as $m \to \infty$. Therefore, (7.6)
holds.
(ii) Fix $i \in \mathbb{Z}^d \setminus \{0\}$ and let $F_i$ denote the joint distribution function of $(X_0, X_i)$.
For the sake of simplicity, we treat the case $a_0 = 1$. Write $R = X_0 - \epsilon_0$ and
$R_i = X_i - \epsilon_i - a_i \epsilon_0$. Now, $R$ and $R_i$ are dependent random variables. First, we show
that
(7.33)   $p_i(x, y) \equiv \dfrac{\partial^2}{\partial x \partial y} F_i(x, y) = E[p_\epsilon(x - R)\, p_\epsilon(y - R_i - a_i x)]$.
Indeed,
(7.34)   $F_i(x, y) = \mathbb{P}(X_0 \leq x, X_i \leq y) = \mathbb{P}(\epsilon_0 + R \leq x,\ \epsilon_i + a_i \epsilon_0 + R_i \leq y) = E\Phi_i(x - R, y - R_i)$,
with, letting $F_\epsilon$ denote the cumulative distribution function of $\epsilon_0$,
$\Phi_i(x, y) = \int_{-\infty}^{x} F_\epsilon(y - a_i x')\, F_\epsilon(dx')$.
Differentiating (7.34) yields (7.33) (see e.g. [32], Appendix A.9 on the validity of the
exchange of differentiation and expectation).
Next, we prove (7.7) by establishing the following two steps:
(7.35)   $\lim_{|i|_\infty \to \infty} \sup_{x,y} |p_i(x, y) - p(x)\, p(y - a_i x)| = 0$,
and
(7.36)   $\lim_{m\to\infty} \sup_{x,y,i} |p_i(x, y) - p_{i,m}(x, y)| = 0$.
Then, (7.35) implies the first part of (7.7), and the two limits together imply the second part.
To prove (7.35), set
$D_i = E\big(R_i \mid \sigma(\epsilon_k : k \not\preceq 0)\big)$  and  $\widetilde{D}_i = R_i - D_i$,  $i \in \mathbb{Z}^d$.
By definition, $D_i$ and $R$ are independent. Introducing the intermediate term $E[p_\epsilon(x - R)\, p_\epsilon(y - D_i - a_i x)] = p(x)\, E p_\epsilon(y - D_i - a_i x)$, we then bound $|p_i(x,y) - p(x)p(y - a_i x)| \leq \Psi_1 + \Psi_2$ with, under the assumption that $p_\epsilon$ is bounded and Lipschitz,
$\Psi_1 = |p_i(x, y) - E[p_\epsilon(x - R)\, p_\epsilon(y - D_i - a_i x)]| \leq C E[p_\epsilon(x - R)\, |R_i - D_i|] \leq C E|\widetilde{D}_i|$,
and
$\Psi_2 = |p(x)\, p(y - a_i x) - E[p_\epsilon(x - R)\, p_\epsilon(y - D_i - a_i x)]| \leq p(x)\, E|p_\epsilon(y - a_i x - R_i - a_i \epsilon_0) - p_\epsilon(y - D_i - a_i x)| \leq C(E|\widetilde{D}_i| + |a_i|)$.
By (7.13), $|p_i(x, y) - p(x)\, p(y - a_i x)| \to 0$ as $|i|_\infty \to \infty$.
To prove (7.36), define $R_m = X_{0,m} - \epsilon_0$ and $R_{i,m} = X_{i,m} - \epsilon_i - a_i \epsilon_0 \mathbf{1}\{|i|_\infty < m\}$.
Then, similarly to (7.33), one has
$p_{i,m}(x, y) = E[p_\epsilon(x - R_m)\, p_\epsilon(y - a_i x \mathbf{1}\{|i|_\infty < m\} - R_{i,m})]$.
Introducing the intermediate term $E[p_\epsilon(x - R)\, p_\epsilon(y - a_i x \mathbf{1}\{|i|_\infty < m\} - R_{i,m})]$, we obtain
that
$|p_{i,m}(x, y) - p_i(x, y)| \leq E[p_\epsilon(x - R)(|a_i x| \mathbf{1}\{|i|_\infty \geq m\} + |R_i - R_{i,m}|)] + C E|R - R_m| \leq C\big( |x|\, p(x)\, |a_i| \mathbf{1}\{|i|_\infty \geq m\} + E|R - R_m| + E|R_i - R_{i,m}| \big)$.
Since $X_0$ has a finite second moment and $p$ is bounded and Lipschitz, $\sup_x |x|\, p(x) < \infty$. The summability assumption on $\{a_i\}$ implies that $\lim_{m\to\infty} \sup_{|i|_\infty \geq m} |a_i| = 0$, and
$\sup_i (E|R - R_m| + E|R_i - R_{i,m}|) \to 0$ as $m \to \infty$ (recall (7.13)). Therefore, we have
proved (7.36).
Proof of Lemma VII.16. We only prove (7.27); the proof of (7.28) is similar. By
Proposition VI.20, there exists a constant $C$ such that
(7.37)   $\dfrac{\|S_n(\bar Z - \bar\zeta)\|_2}{n^{d/2}} \leq C \sum_{k \in [1,n]^d} \dfrac{\|E(\bar{Z}_{n,k} - \bar{\zeta}_{n,k} \mid \mathcal{F}_{\mathbf{1}})\|_2}{\prod_{\tau=1}^{d} k_\tau^{1/2}}$,
where $\mathcal{F}_{\mathbf{1}} = \sigma(\epsilon_k : k \in \mathbb{Z}^d, k \preceq \mathbf{1})$. By the definition of $\bar\zeta_{n,i}$, (7.37) is bounded (up to the
multiplicative constant $C$) by
$\sum_{k \in [1,n]^d \setminus [1,m_n]^d} \dfrac{\|E(\bar{Z}_{n,k} \mid \mathcal{F}_{\mathbf{1}})\|_2}{\prod_{\tau} k_\tau^{1/2}} + \sum_{k \in [1,m_n]^d} \dfrac{\|E(\bar{Z}_{n,k} - \bar{\zeta}_{n,k} \mid \mathcal{F}_{\mathbf{1}})\|_2}{\prod_{\tau} k_\tau^{1/2}}$
$\leq \|\bar{Z}_{n,0} - \bar{\zeta}_{n,0}\|_2 + \sum_{k \in [1,n]^d} \dfrac{\|E(\bar{Z}_{n,k} \mid \mathcal{F}_{\mathbf{1}})\|_2}{\prod_{\tau} k_\tau^{1/2}} + \sum_{k \in [1,m_n]^d} \dfrac{\|E(\bar{\zeta}_{n,k} \mid \mathcal{F}_{\mathbf{1}})\|_2}{\prod_{\tau} k_\tau^{1/2}}$
$\leq C \Big( \|\bar{Z}_{n,0} - \bar{\zeta}_{n,0}\|_2 + b_n^{1/2} \sum_{k \in [1,n]^d} \dfrac{A_{k-\mathbf{1}}}{\prod_{\tau=1}^{d} k_\tau^{1/2}} \Big)$,
where the last inequality follows from Lemma VII.18 below.
where the last inequality follows from Lemma VII.18 below.
Lemma VII.18. Suppose that, in addition to Condition VII.1, $E(|\epsilon_0|^\alpha) < \infty$ for
some $\alpha \geq 2$. Then, for all $k \in \mathbb{N}^d$, $k \neq \mathbf{1}$,
(7.38)   $\|E(\bar{Z}_{n,k} \mid \mathcal{F}_{\mathbf{1}})\|_\alpha \leq C b_n^{1/2} A_{k-\mathbf{1}}$,
(7.39)   $\|E(\bar{\zeta}_{n,k} \mid \mathcal{F}_{\mathbf{1}})\|_\alpha \leq C b_n^{1/2} A_{k-\mathbf{1}}$.
Proof of Lemma VII.18. First, we control $\|E(\bar{Z}_{n,k} \mid \mathcal{F}_{\mathbf{1}})\|_\alpha$. For each $k \in \mathbb{Z}^d$, intro-
duce the notation
(7.40)   $\Gamma(k) := \{i \in \mathbb{Z}^d : i \preceq k\}$,
and write
$X_k = \sum_{i \in \Gamma(k)} a_{k-i} \epsilon_i = \Big( \sum_{i \in \Gamma(\mathbf{1})} + \sum_{i \in \Gamma(k)\setminus\Gamma(\mathbf{1})} \Big) a_{k-i} \epsilon_i =: D_k + T_k$.
For the sake of simplicity, write $D \equiv D_k$, $T \equiv T_k$, and, given a random variable $Y$, let
$E_Y(\cdot) \equiv E(\cdot \mid Y)$ denote the conditional expectation given the $\sigma$-algebra generated
by $Y$. Since $k \succeq \mathbf{1}$, $k \neq \mathbf{1}$, $T_k$ is a non-degenerate random variable. Then,
$E(\bar{Z}_{n,k} \mid \mathcal{F}_{\mathbf{1}}) = \dfrac{1}{\sqrt{b_n}} \Big( E_D K\Big( \dfrac{x - D - T}{b_n} \Big) - E K\Big( \dfrac{x - D - T}{b_n} \Big) \Big)$.
Let $\widetilde{D}$ be a copy of $D$, independent of $D$ and $T$. Then, letting $p_T$ denote the density of $T$, the above identity becomes
$\dfrac{1}{\sqrt{b_n}}\, E_D E_{D,\widetilde{D}} \Big( K\Big( \dfrac{x - D - T}{b_n} \Big) - K\Big( \dfrac{x - \widetilde{D} - T}{b_n} \Big) \Big) = b_n^{1/2}\, E_D \int K(t) \big( p_T(x - b_n t - D) - p_T(x - b_n t - \widetilde{D}) \big)\, dt$.
Since $p_T$ is Lipschitz, the absolute value of the above term is bounded by
$C \int |K(s)|\, ds\; b_n^{1/2}\, E_D|D - \widetilde{D}|$, almost surely. (Here $p_T$ depends on $k, n$, but one
can show that the Lipschitz constant can be chosen independently of $k, n$; see
e.g. [117], Lemma 1.) To sum up, we have
$\|E(\bar{Z}_{n,k} \mid \mathcal{F}_{\mathbf{1}})\|_\alpha \leq C b_n^{1/2} \big\| E_D|D - \widetilde{D}| \big\|_\alpha \leq C b_n^{1/2} \|D\|_\alpha \leq C b_n^{1/2} A_{k-\mathbf{1}}$,
where the last inequality follows from (7.13). We have thus proved (7.38). To
prove (7.39), a similar argument yields $\|E(\bar{\zeta}_{n,k} \mid \mathcal{F}_{\mathbf{1}})\|_\alpha \leq C b_n^{1/2} A_{k,m_n}$ with $A_{k,m_n} = \big( \sum_{i \in [0,m_n-1]^d,\, i \succeq k - \mathbf{1}} a_i^2 \big)^{1/2} \leq A_{k-\mathbf{1}}$.
Proof of Lemma VII.17. For the random variables $Z_{n,0}, \bar{Z}_{n,0}, \zeta_{n,0}, \bar\zeta_{n,0}$, we drop the
index '$n,0$' and write $Z_n, \bar{Z}_n, \zeta_n, \bar\zeta_n$ for the sake of simplicity. First observe that
$(E Z_n)^2 + (E \zeta_n)^2 \leq C(\bar{p}^2 b_n + \bar{p}_{m_n}^2 b_n) \leq C b_n$,
where in the last step we applied (7.6). Then,
(7.41)   $|E(\bar\zeta_n^2 - \bar{Z}_n^2)| \leq \int K^2(y) |p_{m_n}(x - b_n y) - p(x - b_n y)|\, dy + C b_n \leq \sup_y |p_{m_n}(y) - p(y)| \int K^2(s)\, ds + C b_n \leq C(B_{m_n} + b_n)$,
where the last inequality follows from (7.32). Next, write
(7.42)   $\|\bar\zeta_n - \bar{Z}_n\|_2^2 = E\bar{Z}_n^2 - E\bar\zeta_n^2 + 2\big( E\bar\zeta_n^2 - E(\bar{Z}_n\bar\zeta_n) \big)$.
For the last term on the right-hand side of (7.42), observe that $E(\bar{Z}_n\bar\zeta_n) = E(Z_n\zeta_n) - E Z_n E\zeta_n = E(Z_n\zeta_n) + O(\bar{p}\,\bar{p}_{m_n} b_n)$. We claim that $E(Z_n\zeta_n)$ is very close to $E\zeta_n^2$ under
our restriction on the choice of $m_n$. Indeed,
(7.43)   $|E(Z_n\zeta_n) - E\zeta_n^2| \equiv \Big| E(Z_n\zeta_n) - \int K^2(y)\, p_{m_n}(x - b_n y)\, dy \Big|$,
and
$E(Z_n\zeta_n) = \int\!\!\int \dfrac{1}{b_n} K\Big( \dfrac{x - y - z}{b_n} \Big) K\Big( \dfrac{x - y}{b_n} \Big) p_{m_n}(y)\, \widetilde{p}_{m_n}(z)\, dy\, dz = \int K(y)\, E K\Big( y - \dfrac{\widetilde{X}_{0,m_n}}{b_n} \Big) p_{m_n}(x - b_n y)\, dy$.
Therefore, since $K$ is Lipschitz, (7.43) can be bounded by
$\int |K(y)|\, E\Big| K\Big( y - \dfrac{\widetilde{X}_{0,m_n}}{b_n} \Big) - K(y) \Big|\, p_{m_n}(x - b_n y)\, dy \leq \dfrac{E|\widetilde{X}_{0,m_n}|}{b_n} \int |K(y)|\, p_{m_n}(x - b_n y)\, dy$,
and $E|\widetilde{X}_{0,m_n}| \leq C B_{m_n}$ by (7.13). To sum up, we have thus shown that, under (7.6) (recall
that $b_n \downarrow 0$, whence $B_{m_n}$ is dominated by $B_{m_n}/b_n$),
$\|\bar\zeta_n - \bar{Z}_n\|_2^2 \leq C\Big( \dfrac{B_{m_n}}{b_n} + b_n \Big)$.
BIBLIOGRAPHY
[1] J. Aaronson. An introduction to infinite ergodic theory, volume 50 of Mathematical Surveys and Monographs. American Mathematical Society, Providence, RI, 1997.
[2] A. A. Balkema and S. I. Resnick. Max-infinite divisibility. J. Appl. Probability, 14(2):309–319, 1977.
[3] A. K. Basu and C. C. Y. Dorea. On functional central limit theorem for stationary martingale random fields. Acta Math. Acad. Sci. Hungar., 33(3-4):307–316, 1979.
[4] I. Berkes and G. J. Morrow. Strong invariance principles for mixing random fields. Z. Wahrsch. Verw. Gebiete, 57(1):15–37, 1981.
[5] E. Bolthausen. On the central limit theorem for stationary mixing random fields. Ann. Probab., 10(4):1047–1050, 1982.
[6] D. Bosq, F. Merlevede, and M. Peligrad. Asymptotic normality for density kernel estimators in discrete and continuous time. J. Multivariate Anal., 68(1):78–95, 1999.
[7] R. C. Bradley. A caution on mixing conditions for random fields. Statist. Probab. Lett., 8(5):489–491, 1989.
[8] R. C. Bradley. Introduction to strong mixing conditions. Vol. 1. Kendrick Press, Heber City, UT, 2007.
[9] T. Buishand, L. de Haan, and C. Zhou. On spatial extremes: With application to a rainfall problem. Ann. Appl. Stat., 2(2):624–642, 2008.
[10] K. Burnecki, J. Rosinski, and A. Weron. Spectral representation and structure of stable self-similar processes. In Stochastic processes and related topics, Trends Math., pages 1–14. Birkhauser Boston, Boston, MA, 1998.
[11] S. Cambanis, C. D. Hardin, Jr., and A. Weron. Ergodic properties of stationary stable processes. Stochastic Process. Appl., 24(1):1–18, 1987.
[12] S. Cambanis, M. Maejima, and G. Samorodnitsky. Characterization of linear and harmonizable fractional stable motions. Stochastic Process. Appl., 42(1):91–110, 1992.
[13] A. Caprara, P. Toth, and M. Fischetti. Algorithms for the set covering problem. Ann. Oper. Res., 98:353–371 (2001), 2000. Optimization theory and its application (Perth, 1998).
[14] J. V. Castellana and M. R. Leadbetter. On smoothed probability density estimation for stationary processes. Stochastic Process. Appl., 21(2):179–193, 1986.
[15] T.-L. Cheng and H.-C. Ho. Central limit theorems for instantaneous filters of linear random fields on Z^2. In Random walk, sequential analysis and related topics, pages 71–84. World Sci. Publ., Hackensack, NJ, 2006.
[16] Y. S. Chow and H. Teicher. Probability theory. Springer-Verlag, New York, 1978. Independence, interchangeability, martingales.
[17] D. Cooley, D. Nychka, and P. Naveau. Bayesian spatial modeling of extreme precipitation return levels. J. Amer. Statist. Assoc., 102(479):824–840, 2007.
[18] S. Csorgo and J. Mielniczuk. Density estimation under long-range dependence. Ann. Statist., 23(3):990–999, 1995.
[19] R. A. Davis and S. I. Resnick. Basic properties and prediction of max-ARMA processes. Adv. in Appl. Probab., 21(4):781–803, 1989.
[20] R. A. Davis and S. I. Resnick. Prediction of stationary max-stable processes. Ann. Appl. Probab., 3(2):497–525, 1993.
[21] A. C. Davison and R. L. Smith. Models for exceedances over high thresholds. J. Roy. Statist. Soc. Ser. B, 52(3):393–442, 1990. With discussion and a reply by the authors.
[22] L. de Haan. A characterization of multidimensional extreme-value distributions. Sankhya Ser. A, 40(1):85–88, 1978.
[23] L. de Haan. A spectral representation for max-stable processes. Ann. Probab., 12(4):1194–1204, 1984.
[24] L. de Haan and A. Ferreira. Extreme value theory. Springer Series in Operations Research and Financial Engineering. Springer, New York, 2006. An introduction.
[25] L. de Haan and T. T. Pereira. Spatial extremes: Models for the stationary case. The Annals of Statistics, 34:146–168, 2006.
[26] L. de Haan and J. Pickands, III. Stationary min-stable stochastic processes. Probab. Theory Relat. Fields, 72(4):477–492, 1986.
[27] J. Dedecker. A central limit theorem for stationary random fields. Probab. Theory RelatedFields, 110(3):397–426, 1998.
[28] J. Dedecker. Exponential inequalities and functional central limit theorems for a randomfields. ESAIM Probab. Statist., 5:77–104 (electronic), 2001.
[29] J. Dedecker and F. Merlevede. Necessary and sufficient conditions for the conditional centrallimit theorem. Ann. Probab., 30(3):1044–1081, 2002.
[30] J. Dedecker, F. Merlevede, and D. Volny. On the weak invariance principle for non-adaptedsequences under projective criteria. J. Theoret. Probab., 20(4):971–1004, 2007.
[31] C. Dombry and F. Eyi-Minko. Regular conditional distributions of max infinitely divisibleprocesses. Submitted, available at http://arxiv.org/abs/1109.6492, 2011.
[32] R. Durrett. Probability: theory and examples. Duxbury Press, Belmont, CA, second edition,1996.
[33] M. El Machkouri. Asymptotic normality of the parzen-rosenblatt density estimator forstrongly mixing random fields. Stat. Inference Stoch. Process., 14(1):73–84, 2011.
[34] M. El Machkouri. Kernel density estimation for stationary random fields. preprint, availableat http://arxiv.org/abs/1109.2694, 2011.
[35] M. El Machkouri, D. Volny, and W. B. Wu. A central limit theorem for stationary randomfields. Submitted, available at http://arxiv.org/abs/1109.0838, 2011.
137
[36] I. Fazekas. Burkholder’s inequality for multiindex martingales. Ann. Math. Inform., 32:45–51,2005.
[37] R. Furrer, D. Nychka, and S. Sain. fields: Tools for spatial data, 2009. R package version6.01.
[38] E. Gine, M. G. Hahn, and P. Vatan. Max-infinitely divisible and max-stable sample continu-ous processes. Probab. Theory Related Fields, 87(2):139–165, 1990.
[39] C. M. Goldie and P. E. Greenwood. Variance of set-indexed sums of mixing random variablesand weak convergence of set-indexed processes. Ann. Probab., 14(3):817–839, 1986.
[40] C. M. Goldie and G. J. Morrow. Central limit questions for random fields. In Dependencein probability and statistics (Oberwolfach, 1985), volume 11 of Progr. Probab. Statist., pages275–289. Birkhauser Boston, Boston, MA, 1986.
[41] M. Gordin and M. Peligrad. On the functional CLT via martingale approximation. preprint,http://arxiv.org/abs/0910.3448, 2009.
[42] M. I. Gordin. The central limit theorem for stationary processes. Dokl. Akad. Nauk SSSR,188:739–741, 1969.
[43] M. I. Gordin and B. A. Lifsic. Central limit theorem for stationary Markov processes. Dokl.Akad. Nauk SSSR, 239(4):766–767, 1978.
[44] M. Hallin, Z. Lu, and L. T. Tran. Density estimation for spatial linear processes. Bernoulli,7(4):657–668, 2001.
[45] C. D. Hardin, Jr. Isometries on subspaces of Lp. Indiana Univ. Math. J., 30(3):449–465,1981.
[46] C. D. Hardin, Jr. On the spectral representation of symmetric stable processes. J. Multivari-ate Anal., 12(3):385–401, 1982.
[47] L. Heinrich. Asymptotic behaviour of an empirical nearest-neighbour distance function forstationary Poisson cluster processes. Math. Nachr., 136:131–148, 1988.
[48] H.-C. Ho and T. Hsing. Limit theorems for functionals of moving averages. Ann. Probab.,25(4):1636–1669, 1997.
[49] W. Hoeffding and H. Robbins. The central limit theorem for dependent random variables.Duke Math. J., 15:773–780, 1948.
[50] Z. Kabluchko. Spectral representations of sum- and max-stable processes. Extremes,12(4):401–424, 2009.
[51] Z. Kabluchko, M. Schlather, and L. de Haan. Stationary max-stable fields associated tonegative definite functions. Ann. Probab., 37(5):2042–2065, 2009.
[52] G. Keller. Equilibrium states in ergodic theory, volume 42 of London Mathematical SocietyStudent Texts. Cambridge University Press, Cambridge, 1998.
[53] D. Khoshnevisan. Multiparameter processes. Springer Monographs in Mathematics. Springer-Verlag, New York, 2002. An introduction to random fields.
[54] C. Kipnis and S. R. S. Varadhan. Central limit theorem for additive functionals of reversibleMarkov processes and applications to simple exclusions. Comm. Math. Phys., 104(1):1–19,1986.
[55] U. Krengel. Ergodic theorems, volume 6 of de Gruyter Studies in Mathematics. Walter deGruyter & Co., Berlin, 1985. With a supplement by Antoine Brunel.
138
[56] J. Lamperti. On the isometries of certain function-spaces. Pacific J. Math., 8:459–466, 1958.
[57] J. B. Levy and M. S. Taqqu. Renewal reward processes with heavy-tailed inter-renewal timesand heavy-tailed rewards. Bernoulli, 6(1):23–44, 2000.
[58] A. Markov. Recherches sur un cas remarquable d’epreuves dependantes. Acta Math.,33(1):87–104, 1910.
[59] M. Maxwell and M. Woodroofe. Central limit theorems for additive functionals of Markovchains. Ann. Probab., 28(2):713–724, 2000.
[60] F. Merlevede, M. Peligrad, and S. Utev. Recent advances in invariance principles for station-ary sequences. Probab. Surv., 3:1–36 (electronic), 2006.
[61] T. Nagai. A simple tightness condition for random elements on C([0, 1]2). Bull. Math.Statist., 16(1-2):67–70, 1974/75.
[62] B. Nahapetian. Billingsley-Ibragimov theorem for martingale-difference random fields andits applications to some models of classical statistical physics. C. R. Acad. Sci. Paris Ser. IMath., 320(12):1539–1544, 1995.
[63] B. S. Nahapetian and A. N. Petrosian. Martingale-difference Gibbs random fields and centrallimit theorem. Ann. Acad. Sci. Fenn. Ser. A I Math., 17(1):105–110, 1992.
[64] P. Naveau, A. Guillou, D. Cooley, and J. Diebolt. Modelling pairwise dependence of maximain space. Biometrika, 96(1):1–17, 2009.
[65] E. Parzen. On estimation of a probability density function and mode. Ann. Math. Statist.,33:1065–1076, 1962.
[66] M. Peligrad. Conditional central limit theorem via martingale approximation. In Berkes,Bradley, Dehling, Peligrad, and Tichy, editors, Dependence in Probability, Analysis and Num-ber Theory, pages 295–309. Kendrick Press, 2010.
[67] M. Peligrad and S. Utev. A new maximal inequality and invariance principle for stationarysequences. Ann. Probab., 33(2):798–815, 2005.
[68] M. Peligrad, S. Utev, and W. B. Wu. A maximal Lp-inequality for stationary sequences andits applications. Proc. Amer. Math. Soc., 135(2):541–550 (electronic), 2007.
[69] V. Pipiras. Nonminimal sets, their projections and integral representations of stable processes.Stochastic Process. Appl., 117(9):1285–1302, 2007.
[70] V. Pipiras and M. S. Taqqu. The structure of self–similar stable mixed moving averages.Ann. Probab., 30(2):898–932, 2002.
[71] V. Pipiras and M. S. Taqqu. Stable stationary processes related to cyclic flows. Ann. Probab.,32(3A):2222–2260, 2004.
[72] V. Pipiras, M. S. Taqqu, and J. B. Levy. Slow, fast and arbitrary growth conditions forrenewal-reward processes when both the renewals and the rewards are heavy-tailed. Bernoulli,10(1):121–163, 2004.
[73] S. Poghosyan and S. Rœlly. Invariance principle for martingale-difference random fields.Statist. Probab. Lett., 38(3):235–245, 1998.
[74] R Development Core Team. R: A Language and Environment for Statistical Computing. RFoundation for Statistical Computing, Vienna, Austria, 2009. ISBN 3-900051-07-0.
[75] M. M. Rao. Conditional measures and applications, volume 271 of Pure and Applied Mathe-matics (Boca Raton). Chapman & Hall/CRC, Boca Raton, FL, second edition, 2005.
139
[76] S. I. Resnick. Extreme values, regular variation, and point processes, volume 4 of AppliedProbability. A Series of the Applied Probability Trust. Springer-Verlag, New York, 1987.
[77] S. I. Resnick. Heavy-tail phenomena. Springer Series in Operations Research and FinancialEngineering. Springer, New York, 2007. Probabilistic and statistical modeling.
[78] S. I. Resnick and R. Roy. Random usc functions, max-stable processes and continuous choice.Ann. Appl. Probab., 1(2):267–292, 1991.
[79] E. Rio. About the Lindeberg method for strongly mixing sequences. ESAIM Probab. Statist.,1:35–61 (electronic), 1995/97.
[80] P. M. Robinson. Nonparametric estimators for time series. J. Time Ser. Anal., 4(3):185–207,1983.
[81] B. Rosen. A note on asymptotic normality of sums of higher-dimensionally indexed randomvariables. Ark. Mat., 8:33–43, 1969.
[82] M. Rosenblatt. Remarks on some nonparametric estimates of a density function. Ann. Math.Statist., 27:832–837, 1956.
[83] J. Rosinski. On the structure of stationary stable processes. Ann. Probab., 23(3):1163–1187,1995.
[84] J. Rosinski. Decomposition of stationary α-stable random fields. Ann. Probab., 28(4):1797–1813, 2000.
[85] J. Rosinski. Minimal integral representations of stable processes. Probab. Math. Statist.,26(1):121–142, 2006.
[86] J. Rosinski and G. Samorodnitsky. Classes of mixing stable processes. Bernoulli, 2(4):365–377, 1996.
[87] E. Roy. Ergodic properties of Poissonian ID processes. Ann. Probab., 35(2):551–576, 2007.
[88] E. Roy. Poisson suspensions and infinite ergodic theory. Ergodic Theory Dynam. Systems,29(2):667–683, 2009.
[89] P. Roy. Ergodic theory, abelian groups and point processes induced by stable random fields.Ann. Probab., 38(2):770–793, 2010.
[90] P. Roy. Nonsingular group actions and stationary SαS random fields. Proc. Amer. Math.Soc., 138(6):2195–2202, 2010.
[91] P. Roy and G. Samorodnitsky. Stationary symmetric α-stable discrete parameter randomfields. J. Theoret. Probab., 21(1):212–233, 2008.
[92] G. Samorodnitsky. Null flows, positive flows and the structure of stationary symmetric stableprocesses. Ann. Probab., 33(5):1782–1803, 2005.
[93] G. Samorodnitsky and M. S. Taqqu. Stable non-Gaussian random processes. StochasticModeling. Chapman & Hall, New York, 1994. Stochastic models with infinite variance.
[94] M. Schlather. Models for stationary max–stable random fields. Extremes, 5:33–44, 2002.
[95] M. Schlather and J. A. Tawn. A dependence measure for multivariate and spatial extremevalues: Properties and inference. Biometrika, 90:139–156, 2003.
[96] A. P. Shashkin. The invariance principle for a (BL, θ)-dependent random field. Uspekhi Mat.Nauk, 58(3(351)):193–194, 2003.
140
[97] B. W. Silverman. Density estimation for statistics and data analysis. Monographs on Statis-tics and Applied Probability. Chapman & Hall, London, 1986.
[98] R. L. Smith. Max–stable processes and spatial extremes. unpublished manuscript, 1990.
[99] S. M. Srivastava. A course on Borel sets, volume 180 of Graduate Texts in Mathematics.Springer-Verlag, New York, 1998.
[100] S. A. Stoev. On the ergodicity and mixing of max-stable processes. Stochastic Process. Appl.,118(9):1679–1705, 2008.
[101] S. A. Stoev and M. S. Taqqu. Extremal stochastic integrals: a parallel between max-stableprocesses and α-stable processes. Extremes, 8(4):237–266 (2006), 2005.
[102] D. Surgailis, J. Rosinski, V. Mandrekar, and S. Cambanis. Stable mixed moving averages.Probab. Theory Related Fields, 97(4):543–558, 1993.
[103] D. Surgailis, J. Rosinski, V. Mandrekar, and S. Cambanis. On the mixing structure of sta-tionary increment and self–similar SαS processes. Unpublished results., 1998.
[104] L. T. Tran. Kernel density estimation on random fields. J. Multivariate Anal., 34(1):37–53,1990.
[105] D. Volny. A nonadapted version of the invariance principle of Peligrad and Utev. C. R.Math. Acad. Sci. Paris, 345(3):167–169, 2007.
[106] D. Volny, M. Woodroofe, and O. Zhao. Central limit theorems for superlinear processes.Stoch. Dyn., 11(1):71–80, 2011.
[107] Y. Wang. maxLinear: Conditional sampling for max-linear models, 2010. R package version1.0.
[108] Y. Wang, P. Roy, and S. A. Stoev. Ergodic properties of sum– and max–stable stationaryrandom fields via null and positive group actions. To appear in Ann. Probab., available athttp://arxiv.org/abs/0911.0610, 2012.
[109] Y. Wang and S. A. Stoev. On the structure and representations of max–stableprocesses. Technical Report 487, Department of Statistics, University of Michigan,http://arxiv.org/abs/0903.3594, 2009.
[110] Y. Wang and S. A. Stoev. On the association of sum- and max-stable processes. Statist.Probab. Lett., 80(5-6):480–488, 2010.
[111] Y. Wang and S. A. Stoev. On the structure and representations of max–stable processes.Adv. in Appl. Probab., 42(3):855–877, 2010.
[112] Y. Wang and S. A. Stoev. Conditional sampling for spectrally-discrete max-stable randomfields. Adv. in Appl. Probab., 43(2):463–481, 2011.
[113] Y. Wang, S. A. Stoev, and P. Roy. Decomposability for stable processes. Stochastic Process.Appl., 122(3):1093–1109, 2012.
[114] Y. Wang and M. Woodroofe. A new condition on invariance principles for stationary randomfields. Submitted, available at http://arxiv.org/abs/1101.5195, 2011.
[115] Y. Wang and M. Woodroofe. On the asymptotic normality of kernel density estimators forlinear random fields. Submitted, available at http://arxiv.org/abs/1201.0238, 2012.
[116] M. Woodroofe. A central limit theorem for functions of a Markov chain with applications toshifts. Stochastic Process. Appl., 41(1):33–44, 1992.
141
[117] W. B. Wu. Central limit theorems for functionals of linear processes and their applications.Statist. Sinica, 12(2):635–649, 2002.
[118] W. B. Wu. Nonlinear system theory: another look at dependence. Proc. Natl. Acad. Sci.USA, 102(40):14150–14154 (electronic), 2005.
[119] W. B. Wu. Asymptotic theory for stationary processes. Stat. Interface, 4(2):207–226, 2011.
[120] W. B. Wu and J. Mielniczuk. Kernel density estimation for linear processes. Ann. Statist.,30(5):1441–1459, 2002.
[121] W. B. Wu and M. Woodroofe. Martingale approximations for sums of stationary processes.Ann. Probab., 32(2):1674–1690, 2004.
[122] W. B. Wu and Z. Zhao. Moderate deviations for stationary processes. Statist. Sinica,18(2):769–782, 2008.
[123] O. Zhao and M. Woodroofe. On martingale approximations. Ann. Appl. Probab., 18(5):1831–1847, 2008.
ABSTRACT
Topics on Max-stable Processes and the Central Limit Theorem
by
Yizao Wang
Chair: Stilian A. Stoev
This dissertation consists of results in two distinct areas of probability theory: extreme value theory and the central limit theorem.

In extreme value theory, the focus is on max-stable processes. Such processes play an increasingly important role in characterizing and modeling extremal phenomena in finance, the environmental sciences and statistical mechanics. In particular, the association and the decomposability of sum- and max-stable processes are investigated. In addition, the conditional distributions of max-stable processes are studied, and a computationally efficient conditional sampling algorithm is developed. This algorithm has many potential applications in the prediction of extremal phenomena.
For the central limit theorem, the asymptotic normality of partial sums of stationary random fields is studied, with a focus on projective conditions on the dependence. Such conditions, which are easy to check for many stochastic processes and random fields, have recently drawn much attention for (one-dimensional) time series models in statistics and econometrics. Here, the focus is on (high-dimensional) stationary random fields. In particular, a general central limit theorem for stationary random fields and orthomartingales is established. The method is then extended to establish the asymptotic normality of the kernel density estimator for linear random fields.
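A minimal numerical illustration of the random-field central limit theorem, under assumptions far stronger than the projective conditions studied here (namely i.i.d. entries, the simplest orthomartingale-difference field): partial sums over an $n \times n$ block, normalized by $n$, should be approximately standard normal.

```python
import numpy as np

def block_sums(n, reps, rng):
    """Normalized partial sums S_n / n over n x n blocks of an i.i.d.
    Rademacher field; Var(S_n) = n^2, so dividing by n gives variance 1."""
    field = rng.integers(0, 2, size=(reps, n, n)) * 2 - 1
    return field.sum(axis=(1, 2)) / n

rng = np.random.default_rng(7)
S = block_sums(n=40, reps=4000, rng=rng)
# S should have mean near 0 and variance near 1
```

The content of the dissertation's results is that the same Gaussian limit survives under much weaker projective conditions on the dependence, where no independence is available.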