Springer Series in Operations Research and Financial Engineering Series Editors: Thomas V. Mikosch Sidney I. Resnick Stephen M. Robinson




Springer Series in Operations Research and Financial Engineering

Altiok: Performance Analysis of Manufacturing Systems
Birge and Louveaux: Introduction to Stochastic Programming
Bonnans and Shapiro: Perturbation Analysis of Optimization Problems
Bramel, Chen, and Simchi-Levi: The Logic of Logistics: Theory, Algorithms, and Applications for Logistics and Supply Chain Management (second edition)
Dantzig and Thapa: Linear Programming 1: Introduction
Dantzig and Thapa: Linear Programming 2: Theory and Extensions
de Haan and Ferreira: Extreme Value Theory: An Introduction
Drezner (Editor): Facility Location: A Survey of Applications and Methods
Facchinei and Pang: Finite-Dimensional Variational Inequalities and Complementarity Problems, Volume I
Facchinei and Pang: Finite-Dimensional Variational Inequalities and Complementarity Problems, Volume II
Fishman: Discrete-Event Simulation: Modeling, Programming, and Analysis
Fishman: Monte Carlo: Concepts, Algorithms, and Applications
Haas: Stochastic Petri Nets: Modeling, Stability, Simulation
Klamroth: Single-Facility Location Problems with Barriers
Muckstadt: Analysis and Algorithms for Service Parts Supply Chains
Nocedal and Wright: Numerical Optimization
Olson: Decision Aids for Selection Problems
Pinedo: Planning and Scheduling in Manufacturing and Services
Pochet and Wolsey: Production Planning by Mixed Integer Programming
Whitt: Stochastic-Process Limits: An Introduction to Stochastic-Process Limits and Their Application to Queues
Yao (Editor): Stochastic Modeling and Analysis of Manufacturing Systems
Yao and Zheng: Dynamic Control of Quality in Production-Inventory Systems: Coordination and Optimization
Yeung and Petrosyan: Cooperative Stochastic Differential Games

Forthcoming
Resnick: Heavy Tail Phenomena: Probabilistic and Statistical Modeling
Muckstadt and Sapra: Models and Solutions in Inventory Management


Laurens de Haan Ana Ferreira

Extreme Value Theory An Introduction

Springer


Laurens de Haan Erasmus University School of Economics P.O. Box 1738 3000 DR Rotterdam The Netherlands [email protected]

Ana Ferreira Instituto Superior de Agronomia Departamento de Matematica Tapada da Ajuda 1349-017 Lisboa Portugal [email protected]

Series Editors: Thomas V. Mikosch University of Copenhagen Laboratory of Actuarial Mathematics DK-1017 Copenhagen Denmark [email protected]

Stephen M. Robinson University of Wisconsin-Madison Department of Industrial Engineering Madison, WI 53706 U.S.A. [email protected]

Sidney I. Resnick Cornell University School of Operations Research and Industrial Engineering Ithaca, NY 14853 U.S.A. [email protected]

Mathematics Subject Classification (2000): 60G70, 60G99, 60A99

Library of Congress Control Number: 2006925909

ISBN-10: 0-387-23946-4 e-ISBN: 0-387-34471-3

ISBN-13: 978-0-387-23946-0

Printed on acid-free paper.

© 2006 Springer Science+Business Media, LLC All rights reserved. This work may not be translated or copied in whole or in part without the written permission of the publisher (Springer Science+Business Media LLC, 233 Spring Street, New York, NY 10013, U.S.A.), except for brief excerpts in connection with reviews or scholarly analysis. Use in connection with any form of information storage and retrieval, electronic adaptation, computer software, or by similar or dissimilar methodology now known or hereafter developed is forbidden. The use in this publication of trade names, trademarks, service marks and similar terms, even if they are not identified as such, is not to be taken as an expression of opinion as to whether or not they are subject to proprietary rights. Printed in the United States of America. (TXQ/MP)

9 8 7 6 5 4 3 2 1

springer.com


In cauda venenum


Preface

Approximately 40% of the Netherlands is below sea level. Much of it has to be protected against the sea by dikes. These dikes have to withstand storm surges that drive the seawater level up along the coast. The government, balancing considerations of cost and safety, has determined that the dikes should be so high that the probability of a flood (i.e., the seawater level exceeding the top of the dike) in a given year is 10^{-4}. The question is then how high the dikes should be built to meet this requirement. Storm data have been collected for more than 100 years. In this period, at the town of Delfzijl, in the northeast of the Netherlands, 1877 severe storm surges have been identified. The collection of high-tide water levels during those storms forms approximately a set of independent observations, taken under similar conditions (i.e., we may assume that they are independent and identically distributed). No flood has occurred during these 100 years.

At first it looks as if this is an impossible problem: in order to estimate the probability of a once-in-10000 years event one needs more than observations over just 100 years. The empirical distribution function carries all the information acquired, and going beyond its range is impossible.
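The size of the gap can be made concrete with a little arithmetic. The sketch below is not from the book's text: it assumes the observation period is exactly 100 years and that, for rare events, the yearly flood probability is roughly the number of storms per year times the per-storm exceedance probability. The function name `per_storm_probability` is a hypothetical helper.

```python
def per_storm_probability(p_year, n_storms, n_years):
    """Per-storm exceedance probability implied by a yearly target.

    ASSUMPTION: for rare events, p_year ~ (storms per year) * p_storm.
    """
    storms_per_year = n_storms / n_years
    return p_year / storms_per_year


# Figures quoted in the preface; n_years = 100 is a rounding ASSUMPTION
# ("more than 100 years" in the text).
p_storm = per_storm_probability(p_year=1e-4, n_storms=1877, n_years=100)

# The smallest nonzero tail probability the empirical distribution
# function can resolve with 1877 observations.
empirical_floor = 1 / 1877

print(f"required per-storm probability: {p_storm:.2e}")
print(f"empirical resolution:           {empirical_floor:.2e}")
print(f"gap factor:                     {empirical_floor / p_storm:.0f}")
```

Under these assumptions the required probability lies about two orders of magnitude below what the sample can resolve directly, which is why extrapolation is unavoidable.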

Yet it is easy to see that some information can be gained. For example, one can check whether the spacings (i.e., the difference between consecutive ordered observations) increase or decrease in size as one moves to the extreme observations. A decrease would point at a short tail and an increase at a long tail of the distribution.
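A minimal sketch of the spacings check, under stated assumptions: the two quantile functions are illustrative choices (a bounded distribution on [0, 1] and a Pareto-type distribution), picked so that the effect is clearly visible, and `ideal_sample` is a hypothetical helper that replaces random data by evenly spaced quantiles. Real data are noisier, so the pattern is only a rough guide.

```python
def top_spacings(sample, k):
    """Differences between consecutive values among the k+1 largest
    observations, listed in order toward the maximum."""
    top = sorted(sample)[-(k + 1):]
    return [b - a for a, b in zip(top, top[1:])]


def ideal_sample(quantile, n):
    """Hypothetical helper: a noise-free 'sample' obtained by evaluating
    a quantile function at i/(n+1), i = 1..n."""
    return [quantile(i / (n + 1)) for i in range(1, n + 1)]


# ASSUMPTION: illustrative distributions, not taken from the book.
short_tailed = ideal_sample(lambda p: 1 - (1 - p) ** 2, 1000)   # bounded at 1
long_tailed = ideal_sample(lambda p: (1 - p) ** -0.5, 1000)     # Pareto-type

short_spacings = top_spacings(short_tailed, 5)
long_spacings = top_spacings(long_tailed, 5)

print("short tail:", [f"{s:.2e}" for s in short_spacings])
print("long tail: ", [f"{s:.2f}" for s in long_spacings])
```

In this idealized setting the spacings of the bounded distribution shrink toward the maximum, while those of the Pareto-type distribution grow.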

Alternatively, one could try to estimate the first and second derivatives of the empirical distribution function near the boundary of the sample and extrapolate using these estimates.

The second option is where extreme value theory leads us. But instead of proceeding in a heuristic way, extreme value theory provides a solid theoretical basis and framework for extrapolation. It leads to natural estimators for the relevant quantities, e.g., those for extreme quantiles as in our example, and allows us to assess the accuracy of these estimators.

Extreme value theory restricts the behavior of the distribution function in the tail basically to resemble a limited class of functions that can be fitted to the tail of the distribution function. The two parameters that play a role, scale and shape, are based roughly on derivatives of the distribution function.

In order to be able to apply this theory some conditions have to be imposed. They are quite broad and natural and basically of a qualitative nature. It will become clear that the so-called extreme value condition is on the one hand quite general (it is not easy to find distribution functions that do not satisfy it) but on the other hand is sufficiently precise to serve as a basis for extrapolation.

Since we do not know the tail, the conditions cannot be checked (however, see Section 5.2). But this is a common feature in more traditional branches of statistics. For example, when estimating the median one has to assume that it is uniquely defined. And for assessing the accuracy one needs a positive density. Also, for estimating a mean one has to assume that it exists and for assessing the accuracy one usually assumes the existence of a second moment.

In these two cases it is easy to see what the natural conditions should be. This is not the case in our extrapolation problem. Nevertheless, some reflection shows that the "extreme value condition" is the natural one. For example (cf. Section 1.1.4), one way of expressing this condition is that it requires that a high quantile (beyond the scope of the available data) be asymptotically related in a linear way to an intermediate quantile (which can be estimated using the empirical distribution function).
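One common way to write the relation just described is sketched below. It assumes that U denotes the (usually left-continuous) inverse of 1/(1 − F) and γ the extreme value index, as in the list of symbols; the positive scale function a is a label introduced here for illustration.

```latex
\[
\lim_{t\to\infty}\frac{U(tx)-U(t)}{a(t)}=\frac{x^{\gamma}-1}{\gamma},
\qquad x>0,
\]
\[
\text{so that, for large } t,\qquad
U(tx)\approx U(t)+a(t)\,\frac{x^{\gamma}-1}{\gamma}.
\]
```

Here U(t) plays the role of the intermediate quantile and U(tx) that of the high quantile; for γ = 0 the right-hand side is read as log x.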

The theory described in this book is quite recent: only in the 1980s did the contours of the statistical theory take shape. One-dimensional probabilistic extreme value theory was developed by M. Fréchet (1927), R. Fisher and L. Tippett (1928), and R. von Mises (1936), and culminated in the work of B. Gnedenko (1943). The statistical theory was initiated by J. Pickands III (1975).

The aim of this book is to give a thorough account of the basic theory of extreme values, probabilistic and statistical, theoretical and applied. It leads up to the current state of affairs. However, the account is by no means exhaustive for this field has become too vast. For these two reasons, the book is called an introduction.

The outline of the book is as follows. Chapters 1 and 2 discuss the extreme value condition. They are of a mathematical and probabilistic nature. Section 2.4 is important in itself and essential for understanding Sections 3.4, 3.6 and Chapter 5, but not for understanding the rest of the book. Chapter 3 discusses how to estimate the main (shape) parameter involved in the extrapolation and Chapter 4 explains the extrapolation itself. Examples are given.

In Chapter 5 some interesting but more advanced topics are discussed in a one-dimensional setting.

The higher-dimensional version of extreme value theory offers challenges of a new type. The model is explained in Chapter 6, the estimation of the main parameters (which are infinite-dimensional in this case) in Chapter 7, and the extrapolation in Chapter 8.

Chapter 9 (probabilistic) and Chapter 10 (statistical) treat the infinite-dimensional case.

Appendix B offers an introduction to the theory of regularly varying functions, which is basic for our approach. This text is partly based on the book Regular Variation, Extensions and Tauberian Theorems, by J.L. Geluk and L. de Haan, which is out of print. The authors wish to thank Jaap Geluk for his permission to use the text.

In a book of this extent it is possible that some errors may have escaped our attention. We are very grateful for feedback on any corrections, suggestions or comments ([email protected], [email protected]). We intend to publish possible corrections at Ana's webpage, http://www.isa.utl.pt/matemati/~anafh/anafh.html.

We wish to thank the statistical research unit of the University of Lisbon (CEAUL) for offering an environment conducive to writing this book. We acknowledge the support of FCT/POCTI/FEDER as well as the Gulbenkian foundation. We thank Holger Drees and the editors, Thomas Mikosch and Sidney Resnick, for their efforts to go through substantial parts of the book, which resulted in constructive criticism. We thank John Einmahl for sharing his notes on the material of Sections 7.3 and 10.4.2. The first author thanks the Université de Saint Louis (Senegal) for the opportunity to present some of the material in a course. We are very grateful to Maria de Fatima Correia de Haan, who learned LaTeX for the purpose of typing a substantial part of the text. Laurens de Haan also thanks Maria de Fatima for her unconditional support during these years. Ana Ferreira is greatly indebted to those who propitiated and encouraged her learning on the subject, especially to Laurens de Haan. Ana also thanks the long-enduring and unconditional support of her parents as well as her husband, Bernardo, and son, Pedro.


Lisbon, 2006

Laurens de Haan Ana Ferreira


Contents

Preface

List of Abbreviations and Symbols

Part I One-Dimensional Observations

1 Limit Distributions and Domains of Attraction
  1.1 Extreme Value Theory: Basics
    1.1.1 Introduction
    1.1.2 Alternative Formulations of the Limit Relation
    1.1.3 Extreme Value Distributions
    1.1.4 Interpretation of the Alternative Conditions; Case Studies
    1.1.5 Domains of Attraction: A First Approach
  1.2 Domains of Attraction
  Exercises

2 Extreme and Intermediate Order Statistics
  2.1 Extreme Order Statistics and Poisson Point Processes
  2.2 Intermediate Order Statistics
  2.3 Second-Order Condition
  2.4 Intermediate Order Statistics and Brownian Motion
  Exercises

3 Estimation of the Extreme Value Index and Testing
  3.1 Introduction
  3.2 A Simple Estimator for the Tail Index (γ > 0): The Hill Estimator
  3.3 General Case γ ∈ ℝ: The Pickands Estimator
  3.4 The Maximum Likelihood Estimator (γ > −1/2)
  3.5 A Moment Estimator (γ ∈ ℝ)
  3.6 Other Estimators
    3.6.1 The Probability-Weighted Moment Estimator (γ < 1)
    3.6.2 The Negative Hill Estimator (γ < −1/2)
  3.7 Simulations and Applications
    3.7.1 Asymptotic Properties
    3.7.2 Simulations
    3.7.3 Case Studies
  Exercises

4 Extreme Quantile and Tail Estimation
  4.1 Introduction
  4.2 Scale Estimation
  4.3 Quantile Estimation
    4.3.1 Maximum Likelihood Estimators
    4.3.2 Moment Estimators
  4.4 Tail Probability Estimation
    4.4.1 Maximum Likelihood Estimators
    4.4.2 Moment Estimators
  4.5 Endpoint Estimation
    4.5.1 Maximum Likelihood Estimators
    4.5.2 Moment Estimators
  4.6 Simulations and Applications
    4.6.1 Simulations
    4.6.2 Case Studies
  Exercises

5 Advanced Topics
  5.1 Expansion of the Tail Distribution Function and Tail Empirical Process
  5.2 Checking the Extreme Value Condition
  5.3 Convergence of Moments, Speed of Convergence, and Large Deviations
    5.3.1 Convergence of Moments
    5.3.2 Speed of Convergence; Large Deviations
  5.4 Weak and Strong Laws of Large Numbers and Law of the Iterated Logarithm
  5.5 Weak "Temporal" Dependence
  5.6 Mejzler's Theorem
  Exercises

Part II Finite-Dimensional Observations

6 Basic Theory
  6.1 Limit Laws
    6.1.1 Introduction: An Example
    6.1.2 The Limit Distribution; Standardization
    6.1.3 The Exponent Measure
    6.1.4 The Spectral Measure
    6.1.5 The Sets Q_c and the Functions L, χ, and A
  6.2 Domains of Attraction; Asymptotic Independence
  Exercises

7 Estimation of the Dependence Structure
  7.1 Introduction
  7.2 Estimation of the Function L and the Sets Q_c
  7.3 Estimation of the Spectral Measure (and L)
  7.4 A Dependence Coefficient
  7.5 Tail Probability Estimation and Asymptotic Independence: A Simple Case
  7.6 Estimation of the Residual Dependence Index η
  Exercises

8 Estimation of the Probability of a Failure Set
  8.1 Introduction
  8.2 Failure Set with Positive Exponent Measure
    8.2.1 First Approach: c_n Known
    8.2.2 Alternative Approach: Estimate c_n
    8.2.3 Proofs
  8.3 Failure Set Contained in an Upper Quadrant; Asymptotically Independent Components
  8.4 Sea Level Case Study
  Exercises

Part III Observations That Are Stochastic Processes

9 Basic Theory in C[0,1]
  9.1 Introduction: An Example
  9.2 The Limit Distribution; Standardization
  9.3 The Exponent Measure
  9.4 The Spectral Measure
  9.5 Domain of Attraction
  9.6 Spectral Representation and Stationarity
    9.6.1 Spectral Representation
    9.6.2 Stationarity
  9.7 Special Cases
  9.8 Two Examples
  Exercises

10 Estimation in C[0,1]
  10.1 Introduction: An Example
  10.2 Estimation of the Exponent Measure: A Simple Case
  10.3 Estimation of the Exponent Measure
  10.4 Estimation of the Index Function, Scale and Location
    10.4.1 Consistency
    10.4.2 Asymptotic Normality
  10.5 Estimation of the Probability of a Failure Set

Part IV Appendix

A Skorohod Theorem and Vervaat's Lemma

B Regular Variation and Extensions
  B.1 Regularly Varying (RV) Functions
  B.2 Extended Regular Variation (ERV); The class Π
  B.3 Second-Order Extended Regular Variation (2ERV)
  B.4 ERV with an Extra Parameter

References

Index


List of Abbreviations and Symbols

Notation that is largely confined to sections or chapters is mostly excluded from the list below.

=d                equality in distribution
→d                convergence in distribution
→p                convergence in probability
a(t) ~ b(t)       lim_t a(t)/b(t) = 1
α                 tail index
η                 residual dependence index
γ                 extreme value index
Γ                 gamma function
ν                 exponent measure
ϱ                 metric |1/x − 1/y|
1{p}              indicator function: equals 1 if p is true and 0 otherwise
1 − F^            left-continuous empirical distribution function
2ERV              second-order extended regular variation
a+                max(a, 0)
a−                min(a, 0)
a ∨ b             max(a, b)
a ∧ b             min(a, b)
[a]               largest integer less than or equal to a
⌈a⌉               smallest integer greater than or equal to a
a.s.              almost surely
C[0, 1]           space of continuous functions on [0, 1] equipped with the supremum norm
C^+[0, 1]         {f ∈ C[0, 1] : f > 0}
C̄_1^+[0, 1]       {f ∈ C[0, 1] : f ≥ 0, |f|_∞ = 1}
C_1^+[0, 1]       {f ∈ C[0, 1] : f > 0, |f|_∞ = 1}
C_ϱ^+[0, 1]       (0, ∞] × C_1^+[0, 1], with the lower index ϱ meaning that the space (0, ∞] is equipped with the metric ϱ
CSMS              complete separable metric space
D and D'          dependence conditions
D[0, T]           space of functions on [0, T] that are right-continuous and have left-hand limits
D(G_γ)            domain of attraction of G_γ
ERV               extended regular variation
f^−               left-continuous version of the function f
f^+               right-continuous version of the function f
f^←               generalized inverse function of f; (usually left-continuous) inverse function of f
|f|_∞             sup_s |f(s)|
F                 distribution function
F_n               right-continuous empirical distribution function
G_γ               extreme value distribution function
GP                generalized Pareto
i.i.d.            independent and identically distributed
L                 dependence function
R_+               [0, ∞)
R_+^{2*}          R_+^2 \ {(0, 0)}
R(X_i)            rank of X_i among (X_1, X_2, ..., X_n)
RV_α              regularly varying with index α
U                 (usually left-continuous) inverse of 1/(1 − F)
x^*               sup{x : F(x) < 1} = U(∞)
^*x               inf{x : F(x) > 0}


Extreme Value Theory