Eigenvalues, Inequalities and Ergodic Theory Mu-Fa Chen (Beijing Normal University) May 2, 2003


Eigenvalues, Inequalities and Ergodic Theory

Mu-Fa Chen

(Beijing Normal University)

May 2, 2003


Contents

Preface

Acknowledgements

Chapter 1 An Overview of the Book
1.1 Introduction
1.2 New variational formula for the first eigenvalue
1.3 Basic inequalities and new forms of Cheeger's constants
1.4 New picture of ergodic theory and explicit criteria

Chapter 2 Optimal Markovian Couplings
2.1 Couplings and Markovian couplings
2.2 Optimality with respect to distances
2.3 Optimality with respect to lower semi-continuous functions
2.4 Applications of the coupling methods

Chapter 3 New Variational Formulas for the First Eigenvalue
3.1 Background
3.2 Partial proof in discrete case
3.3 Three steps of the proof in geometric case
3.4 Two difficulties
3.5 Final step of the proof of the formula
3.6 Comments about different methods
3.7 Proof in the discrete case (continued)
3.8 The first Dirichlet eigenvalue

Chapter 4 Generalized Cheeger's Method
4.1 Cheeger's method
4.2 A generalization
4.3 New results
4.4 Splitting technique and existence criterion
4.5 Sketch of the proof of Theorem 4.4
4.6 Logarithmic Sobolev inequality
4.7 Upper bounds
4.8 Nash inequality
4.9 Birth-death processes

Chapter 5 Ten Explicit Criteria for One-dimensional Processes
5.1 Three traditional types of ergodicity
5.2 The first (non-trivial) eigenvalue (spectral gap)
5.3 Results about the first eigenvalues and the exponentially ergodic rate
5.4 Explicit criteria
5.5 Exponential ergodicity for single birth processes
5.6 Strong ergodicity

Chapter 6 Variational Formulas and Explicit Bounds of Poincaré-type Inequalities in Dimension One
6.1 Introduction
6.2 Ordinary Poincaré inequalities
6.3 Extension. Banach spaces.
6.4 Neumann case. Orlicz spaces.
6.5 Nash inequality and Sobolev-type inequality
6.6 Logarithmic Sobolev inequality
6.7 Partial proofs of Theorem 6.1

Chapter 7 Functional Inequalities
7.1 Statement of the results
7.2 Sketch of the proofs
7.3 Comparison with Cheeger's method
7.4 General convergence speed
7.5 Two functional inequalities
7.6 Algebraic convergence
7.7 General (irreversible) case

Chapter 8 A Diagram of Nine Types of Ergodicity
8.1 Statements of the results
8.2 Applications and comments
8.3 Proof of Theorem 1.9

Chapter 9 Reaction-Diffusion Processes
9.1 The models
9.2 Finite-dimensional case
9.3 Construction of the processes
9.4 Ergodicity and phase transitions
9.5 Hydrodynamic limits

Chapter 10 Stochastic Models of Economic Optimization
10.1 Input-output method
10.2 L. K. Hua's fundamental theorem
10.3 Stochastic model without consumption
10.4 Stochastic model with consumption
10.5 Proof of Theorem 10.4

Appendix A Some Elementary Lemmas

Bibliography

Index

Preface

First, let us explain the precise meaning of the compressed title. The word “eigenvalues” means the first non-trivial Neumann or Dirichlet eigenvalues, or the principal eigenvalues. The word “inequalities” means the Poincaré inequalities, the logarithmic Sobolev inequalities, the Nash inequalities and so on. Actually, the first eigenvalues can be described by some Poincaré inequalities, and so the second topic has a wider range than the first one. Next, for a Markov process, corresponding to its operator, each inequality describes a type of ergodicity. Thus, the study of the inequalities and their relations provides a way to develop the ergodic theory for Markov processes. Due to these facts, from a probabilistic point of view, the book can also be regarded as a study of “ergodic convergence rates of Markov processes”, which is an alternative title of the book. However, this book is aimed at a larger class of readers, not only probabilists.

The importance of these topics should be obvious. On the one hand, the first eigenvalue is the leading term in the spectrum, which plays a great role in almost every branch of mathematics. On the other hand, the ergodic convergence rates constitute a recent research area in the theory of Markov processes. This study has a very wide range of applications. In particular, it provides a tool to describe phase transitions and the effectiveness of random algorithms, which are now very fashionable research areas.

This book surveys, in a popular way, the main progress made in the field by our group. It consists of ten chapters plus an appendix. The first chapter is an overview of the second to the eighth. Mainly, we study several different inequalities or different types of convergence, using three mathematical tools: a probabilistic tool, the coupling method (Chapters 2 and 3); a generalized Cheeger's method originating in Riemannian geometry (Chapter 4); and an approach coming from potential theory and harmonic analysis (Chapters 6 and 7). The explicit criteria for different types of convergence and the explicit estimates of the convergence rates (or the optimal constants in the inequalities) are given in Chapters 5 and 6; some generalizations are given in Chapter 7. A diagram of nine types of ergodicity is presented in Chapter 8. The topics of the last two chapters (9 and 10) are different but closely related. Chapter 9 introduces the source of the problems and illustrates some applications. In the last chapter, one can see an interesting application of the first eigenvalue, its eigenfunctions and an ergodic theorem to a stochastic model of the economy. Some related open problems are included in each chapter. Moreover, an effort has been made to make each chapter, except the first one, more or less self-contained.

This book serves as an introduction to a developing field. We emphasize the ideas through simple examples rather than technical proofs, most of which are only sketched. It is hoped that the book will be readable for non-specialists. Honestly, over the past ten years or more, the author has tried rather hard to give acceptable lectures; the present book is based on the lecture notes: Chen (1994b; 1997a; 1998a; 1999c; 2002b; 2002c; 2003a; 2003b; 2003c). After presenting eleven lectures in Japan in 2002, the author realized that it would be worthwhile to publish a short book, and so the job was started.

Acknowledgements

As mentioned before, this book is based on lecture notes presented over the past ten years or so. Thus, the book is dedicated, with the author's deep gratitude, to those mathematicians and their universities and institutes for their kind invitations, financial support and warm hospitality. Without their encouragement and effort, the book would never have existed. With the readers' permission, the author would like to list some of their names below (since 1993), with apologies for any omissions:

• Prof. D. J. Chen, Prof. J. D. Chen, Prof. M. P. Qian and Prof. G. Q. Zhang at Beijing (Peking) University.

• Prof. D. A. Dawson, Dr. S. Feng [McMaster University] and Dr. Y. D. Wu at Carleton University.

• Prof. G. O’Brien, Prof. N. Madras and Dr. J. M. Sun at York University.

• Prof. D. McDonald and Dr. K. Qian at University of Ottawa.

• Prof. M. Barlow, Prof. E. A. Perkins, Dr. S. J. Luo and Dr. L. W. Zhang [Hong Kong Univ. Sci. Tech.] at University of British Columbia.

• Prof. E. Scacciatelli, Prof. G. Nappo and Prof. A. Pellegrinotti at University of Roma I.

• Prof. V. Capasso and Prof. Y. G. Lu at University of Bari.

• Prof. L. Accardi at University of Roma II.

• Prof. C. Boldrighini at University of Camerino [University of Roma I].

• Prof. B. Grigelionis at Akademijios, Lithuania.

• Prof. L. Stettner and Prof. J. Zabczyk at Polish Academy of Sciences.

• Prof. W.Th.F. den Hollander at Utrecht University [Universiteit Leiden].

• Prof. Louis H. Y. Chen, Prof. K. P. Choi and Prof. J. H. Lou at Singapore University.

• Prof. C. Heyde, Prof. K. Sigman and Prof. Y. Z. Shao at Columbia University.

• Prof. R. Durrett, Prof. L. Gross and Prof. Z. Q. Chen [University of Washington Seattle] at Cornell University.

• Prof. D. L. Burkholder at University of Illinois.

• Prof. Z. M. Ma and Prof. J. A. Yan at Institute of Applied Mathematics, Chinese Academy.

• Prof. D. Elworthy at Warwick University.

• Prof. F. Götze and Prof. M. Röckner at University of Bielefeld.

• Prof. S. Albeverio and Prof. J. L. Wu [University of Wales Swansea] at Bochum University.

• Prof. K. J. Hochberg at Bar-Ilan University.

• Prof. B. Granovsky at Technion-Israel Institute of Technology.

• Prof. B. Yart at Grenoble University [University, Paris5].

• Prof. R. A. Minlos, Prof. E. Pechersky and Prof. E. Zizhina at the Information Transmission Problems, Russian Academy of Sciences.

• Prof. T. S. Chiang, Prof. C. R. Hwang, Prof. Y. S. Chow and Prof. S. J. Sheu at Institute of Mathematics, Academia Sinica, Taipei.

• Prof. C. H. Chen, Prof. Y. S. Chow, Prof. A. C. Hsiung, Prof. W. T. Huang, Prof. W. Q. Liang and Prof. C. Z. Wei at Institute of Statistical Science, Academia Sinica, Taipei.

• Prof. H. Chen at National Taiwan University.

• Prof. T. F. Lin at Soochow University.

• Prof. Y. J. Lee and Prof. W. J. Huang at National University of Kaohsiung.

• Prof. C. L. Wang at National Dong Hwa University.

• Prof. A. H. Xia at University of New South Wales [Melbourne University].

• Prof. L. M. Wu at Blaise Pascal University and Wuhan University.

• Prof. C. Heyde, Prof. J. Gani and Dr. W. Dai at Australian National University.

• Prof. E. Seneta at University of Sydney.

• Prof. F. C. Flebaner at University of Melbourne.

• Prof. Y. X. Lin at Wollongong University.

• Prof. I. Shigekawa, Prof. Y. Takahashi, Prof. T. Kumagai, Prof. N. Yosida, Prof. S. Watanabe and Dr. Q. P. Liu at Kyoto University.

• Prof. M. Fukushima, Prof. S. Kotani, Prof. S. Aida and Prof. N. Ikeda at Osaka University.

• Prof. H. Osada, Prof. S. Liang and Prof. K. Sato at Nagoya University.

• Prof. T. Funaki and Prof. S. Kusuoka at Tokyo University.

The author also thanks the organizers of the following conferences (since 1993) for their invitations and financial support.

• The Sixth International Vilnius Conference on Probability and Mathematical Statistics (June 1993, Vilnius).

• The International Conference on Dirichlet Forms and Stochastic Processes (October 1993, Beijing).

• The 60th Anniversary Conference of Chinese Mathematical Society (May 1995, Beijing).

• The 23rd Conference on Stochastic Processes and Their Applications (June 1995, Singapore).

• The Symposium on Probability Towards the Year 2000 (October 1995, New York).

• Stochastic Differential Geometry and Infinite-Dimensional Analysis (April 1996, Hangzhou).

• Workshop on Interacting Particle Systems and Their Applications (June 1996, Haifa).

• The 25th Conference on Stochastic Processes and Their Applications (July 1998, Oregon).

• IMS Workshop on Applied Probability (June 1999, Hong Kong).

• The Second Sino-French Colloquium in Probability and Applications (April 2001, Wuhan).

• The Conference on Stochastic Analysis on Large Scale Interacting Systems (July 2002, Japan).

• Stochastic Analysis and Statistical Mechanics, Yukawa Institute (July 2002, Kyoto).

• International Congress of Mathematicians (August 2002, Beijing).

• The First Sino-German Conference on Stochastic Analysis: A Satellite Conference of ICM 2002 (September 2002, Beijing).

• Stochastic Analysis in Infinite Dimensional Spaces (November 2002, Kyoto).

• Japanese National Conference on Stochastic Processes and Related Fields (December 2002, Tokyo).

The continued support of the National Natural Science Foundation of China, the Foundation and the Ministry of Education of China, as well as the Qiu Shi Science and Technology Foundation and the 973 Project, is also gratefully acknowledged.

Finally, the author is grateful to the colleagues in our group, Prof. F. Y. Wang, Prof. Y. H. Zhang, Prof. Y. H. Mao and Prof. Y. Z. Wang, for their fruitful cooperation. The help of students over many years is also appreciated. Moreover, I would like to take this chance to thank Prof. S. J. Yan, Prof. Z. T. Hou and Prof. Z. K. Wang for their teaching and advice.

Mu-Fa Chen, May 1, 2003


Chapter 1

An Overview of the Book

This chapter is an overview of the book, especially of the first eight chapters. It consists of four sections. In the first section, we explain which eigenvalues we are interested in and show the difficulties of studying the first (non-trivial) eigenvalue through elementary examples. The second section presents some new (dual) variational formulas and explicit bounds for the first eigenvalue of the Laplacian on Riemannian manifolds or of Jacobi matrices (Markov chains), and explains the main idea of the proof, which is a probabilistic approach: the coupling method. In the third section, we introduce some recent lower bounds for several basic inequalities, based on a generalization of Cheeger's approach, which comes from Riemannian geometry. In the last section, a diagram of nine different types of ergodicity and a table of explicit criteria for them are presented. The criteria are motivated by the weighted Hardy inequality, which comes from harmonic analysis.

1.1 Introduction

Let us first explain which eigenvalue we are talking about.

Definition. The first (non-trivial) eigenvalue

Consider a tridiagonal matrix (or, in probabilistic language, a birth–death process with state space $E = \{0, 1, 2, \ldots\}$ and Q-matrix)

$$
Q = (q_{ij}) =
\begin{pmatrix}
-b_0 & b_0 & 0 & 0 & \cdots \\
a_1 & -(a_1 + b_1) & b_1 & 0 & \cdots \\
0 & a_2 & -(a_2 + b_2) & b_2 & \cdots \\
\vdots & \ddots & \ddots & \ddots &
\end{pmatrix},
$$

where $a_k, b_k > 0$. Since the sum of each row equals 0, we have $Q\mathbf{1} = 0 = 0 \cdot \mathbf{1}$. This means that the Q-matrix has an eigenvalue 0 with eigenvector $\mathbf{1}$. Next, consider the finite case, $E_n = \{0, 1, \ldots, n\}$. Then the eigenvalues of $-Q$ are discrete: $0 = \lambda_0 < \lambda_1 \le \cdots \le \lambda_n$. We are interested in the first (non-trivial) eigenvalue $\lambda_1 = \lambda_1 - \lambda_0 =: \mathrm{gap}(Q)$ (also called the spectral gap of $Q$). In the infinite case, $\lambda_1$ can be 0. Certainly, one can consider a self-adjoint elliptic operator in $\mathbb{R}^d$, or the Laplacian $\Delta$ on manifolds, or an infinite-dimensional operator as in the study of interacting particle systems.

Since spectral theory is a central part of each branch of mathematics and the first non-trivial eigenvalue is the leading term of the spectrum, it should not be surprising that the study of $\lambda_1$ has a very wide range of applications.

Difficulties

To get some concrete feeling for the difficulties of the topic, let us look at the following examples with a finite state space.

When $E = \{0, 1\}$, it is trivial that $\lambda_1 = a_1 + b_0$. Everyone is happy to see this result, since $\lambda_1$ increases whenever $a_1$ or $b_0$ does. If we go one step further, $E = \{0, 1, 2\}$, then we have four parameters, $b_0, b_1$ and $a_1, a_2$. In this case,

$$
\lambda_1 = 2^{-1}\Big[a_1 + a_2 + b_0 + b_1 - \sqrt{(a_1 - a_2 + b_0 - b_1)^2 + 4 a_1 b_1}\,\Big].
$$
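The three-state formula can be checked numerically. The following is a minimal sketch, not part of the original text: the function names and the sample rates are illustrative assumptions. It evaluates the closed form and confirms that it is a root of the characteristic polynomial of $-Q$ and the smaller of the two non-zero roots.

```python
import math

def lambda1_three_states(a1, a2, b0, b1):
    """Closed form for the spectral gap on E = {0, 1, 2}, as quoted in the text."""
    s = a1 + a2 + b0 + b1
    return 0.5 * (s - math.sqrt((a1 - a2 + b0 - b1) ** 2 + 4.0 * a1 * b1))

def det3(m):
    """Determinant of a 3x3 matrix by cofactor expansion along the first row."""
    return (m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
            - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
            + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0]))

def char_poly(lam, a1, a2, b0, b1):
    """det(-Q - lam * I) for the three-state birth-death Q-matrix."""
    m = [[b0 - lam, -b0, 0.0],
         [-a1, a1 + b1 - lam, -b1],
         [0.0, -a2, a2 - lam]]
    return det3(m)

# Illustrative rates (an assumption; any positive values should work).
a1, a2, b0, b1 = 1.0, 2.0, 3.0, 4.0
lam1 = lambda1_three_states(a1, a2, b0, b1)
# The two non-zero eigenvalues sum to the trace of -Q, which gives the other one.
lam2 = (a1 + a2 + b0 + b1) - lam1
```

For these rates the two non-zero eigenvalues are $5 \mp \sqrt{5}$, which follows from the quadratic $\lambda^2 - 10\lambda + 20 = 0$.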

It is disappointing to see this result, since the role played by each parameter in $\lambda_1$ is not clear at all. When $E = \{0, 1, 2, 3\}$, we have six parameters: $b_0, b_1, b_2, a_1, a_2, a_3$. The solution is expressed by three quantities $B$, $C$ and $D$:

$$
\lambda_1 = \frac{D}{3} - \frac{C}{3 \cdot 2^{1/3}} + \frac{2^{1/3}\,(3B - D^2)}{3C},
$$

where the quantities $D$, $B$ and $C$ are not too complicated:

$$
\begin{aligned}
D &= a_1 + a_2 + a_3 + b_0 + b_1 + b_2, \\
B &= a_3 b_0 + a_2 (a_3 + b_0) + a_3 b_1 + b_0 b_1 + b_0 b_2 + b_1 b_2 + a_1 (a_2 + a_3 + b_2), \\
C &= \Big(A + \sqrt{4 (3B - D^2)^3 + A^2}\,\Big)^{1/3}.
\end{aligned}
$$

However, in the last expression, another quantity $A$ is involved. Then, what is $A$?

$$
\begin{aligned}
A ={}& -2 a_1^3 - 2 a_2^3 - 2 a_3^3 + 3 a_3^2 b_0 + 3 a_3 b_0^2 - 2 b_0^3 + 3 a_3^2 b_1 - 12 a_3 b_0 b_1 + 3 b_0^2 b_1 \\
& + 3 a_3 b_1^2 + 3 b_0 b_1^2 - 2 b_1^3 - 6 a_3^2 b_2 + 6 a_3 b_0 b_2 + 3 b_0^2 b_2 + 6 a_3 b_1 b_2 - 12 b_0 b_1 b_2 \\
& + 3 b_1^2 b_2 - 6 a_3 b_2^2 + 3 b_0 b_2^2 + 3 b_1 b_2^2 - 2 b_2^3 + 3 a_1^2 (a_2 + a_3 - 2 b_0 - 2 b_1 + b_2) \\
& + 3 a_2^2 \big[a_3 + b_0 - 2 (b_1 + b_2)\big] \\
& + 3 a_2 \big[a_3^2 + b_0^2 - 2 b_1^2 - b_1 b_2 - 2 b_2^2 - a_3 (4 b_0 - 2 b_1 + b_2) + 2 b_0 (b_1 + b_2)\big] \\
& + 3 a_1 \big[a_2^2 + a_3^2 - 2 b_0^2 - b_0 b_1 - 2 b_1^2 - a_2 (4 a_3 - 2 b_0 + b_1 - 2 b_2) \\
& \qquad\quad + 2 b_0 b_2 + 2 b_1 b_2 + b_2^2 + 2 a_3 (b_0 + b_1 + b_2)\big].
\end{aligned}
$$


It is shocking, at least to me, to see this example, since the roles of the parameters are completely obscured! Of course, everyone understands that it is impossible to compute $\lambda_1$ explicitly when the size of the matrix is greater than five!

Now, how about estimation? To see this, let us consider perturbations of the eigenvalues and eigenfunctions. We consider the infinite state space $E = \{0, 1, 2, \ldots\}$. Denote by $g$ and $\mathrm{Degree}(g)$, respectively, the eigenfunction of $\lambda_1$ and the degree of $g$ when $g$ is a polynomial. Three examples of the perturbation of $\lambda_1$ and $\mathrm{Degree}(g)$ are listed in Table 1.1.

| $b_i$ ($i \ge 0$) | $a_i$ ($i \ge 1$) | $\lambda_1$ | $\mathrm{Degree}(g)$ |
|---|---|---|---|
| $i + c$ ($c > 0$) | $2i$ | 1 | 1 |
| $i + 1$ | $2i + 3$ | 2 | 2 |
| $i + 1$ | $2i + (4 + \sqrt{2})$ | 3 | 3 |

Table 1.1 Three examples of the perturbation of $\lambda_1$ and $\mathrm{Degree}(g)$

The first line is the well-known linear model, for which $\lambda_1 = 1$, independent of the constant $c > 0$, and $g$ is linear. Next, keeping the same birth rate $b_i = i + 1$ and changing the death rate $a_i$ from $2i$ to $2i + 3$ (resp., $2i + 4 + \sqrt{2}$) changes $\lambda_1$ from one to two (resp., three). More surprisingly, the eigenfunction $g$ changes from linear to quadratic (resp., cubic). For the other values of $a_i$ between $2i$, $2i + 3$ and $2i + 4 + \sqrt{2}$, $\lambda_1$ is unknown since $g$ is non-polynomial. As seen from these examples, the first eigenvalue is very sensitive. Hence, in general, it is very hard to estimate $\lambda_1$.
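The eigen-equation $Qg = -\lambda g$ behind Table 1.1 can be checked directly. The sketch below is an illustration only. The polynomials $g$ are not given in the text; they were worked out here by matching coefficients and should be treated as assumptions. The check confirms only that each $\lambda$ is an eigenvalue with a polynomial eigenfunction of the stated degree, not that it is the first eigenvalue.

```python
import math

def generator_applied(g, b, a, i):
    """(Qg)(i) = b_i [g(i+1) - g(i)] + a_i [g(i-1) - g(i)], with no death term at i = 0."""
    up = b(i) * (g(i + 1) - g(i))
    down = a(i) * (g(i - 1) - g(i)) if i >= 1 else 0.0
    return up + down

def max_residual(g, b, a, lam, n=50):
    """Largest |Qg(i) + lam * g(i)| over the states i = 0, ..., n - 1."""
    return max(abs(generator_applied(g, b, a, i) + lam * g(i)) for i in range(n))

c = 0.7                   # any c > 0 (illustrative choice)
s = 4.0 + math.sqrt(2.0)  # constant in the third death rate
p = 3.0 * math.sqrt(2.0)  # coefficient derived for the cubic candidate

# Each row: (birth rate, death rate, lambda, candidate eigenfunction g).
rows = [
    (lambda i: i + c, lambda i: 2 * i, 1, lambda i: i - c),
    (lambda i: i + 1, lambda i: 2 * i + 3, 2, lambda i: i * i + i - 1),
    (lambda i: i + 1, lambda i: 2 * i + s, 3,
     lambda i: i ** 3 + p * i ** 2 + (p - 1.0) * i - 2.0 * math.sqrt(2.0)),
]
residuals = [max_residual(g, b, a, lam) for b, a, lam, g in rows]
```

All three residuals should vanish up to rounding, which matches the degrees 1, 2 and 3 listed in the table.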

Hopefully, we have presented enough examples to show the extreme difficulty of the topic. Very fortunately, we are at last able to present a complete solution to this problem in the present context. Please be patient: the result will be given only later.

For a long period, we were empty-handed. So we visited several branches of mathematics. In the end, we found that the topic is well studied in Riemannian geometry.

1.2 New variational formula for the first eigenvalue

Story of estimating λ1 in geometry

Here is a short story of the study of $\lambda_1$ in geometry.

Consider the Laplacian $\Delta$ on a connected compact Riemannian manifold $(M, g)$, where $g$ is the Riemannian metric. The spectrum of $\Delta$ is discrete: $\cdots \le -\lambda_2 \le -\lambda_1 < -\lambda_0 = 0$ (eigenvalues may be repeated). Estimating these eigenvalues $\lambda_k$ (especially $\lambda_1$) constitutes an important chapter of modern geometry. As far as we know, five books, excluding those on general spectral theory, have been devoted to this topic: I. Chavel (1984), P. H. Bérard (1986), R. Schoen and S. T. Yau (1988), P. Li (1993) and C. Y. Ma (1993). About 2000 references are collected in the second of these books. Thus, it is impossible for us to give an overview of what has been done in geometry. Instead, we would like to show ten of the most beautiful lower bounds. For a manifold $M$, denote its dimension, its diameter and the lower bound of its Ricci curvature by $d$, $D$ and $K$ ($\mathrm{Ricci}_M \ge K g$), respectively. The simplest example is the unit sphere $S^d$ in $\mathbb{R}^{d+1}$, for which $D = \pi$ and $K = d - 1$. We are interested in estimating $\lambda_1$ in terms of these three geometric quantities. It is relatively easy to obtain an upper bound by applying a test function $f \in C^1(M)$ to the classical variational formula:

$$
\lambda_1 = \inf\left\{ \int_M \|\nabla f\|^2 \, dx : f \in C^1(M),\ \int f \, dx = 0,\ \int f^2 \, dx = 1 \right\}, \tag{1.0}
$$

where “$dx$” is the Riemannian volume element. To obtain a lower bound, however, is much harder. In Table 1.2, we list ten of the strongest lower bounds that have been derived in the past, using various sophisticated methods.

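| Author(s) | Lower bound | Condition | No. |
|---|---|---|---|
| A. Lichnerowicz (1958) | $\dfrac{d}{d-1} K$ | $K \ge 0$ | (1.1) |
| P. H. Bérard, G. Besson & S. Gallot (1985) | $d \left\{ \dfrac{\int_0^{\pi/2} \cos^{d-1} t \, dt}{\int_0^{D/2} \cos^{d-1} t \, dt} \right\}^{2/d}$ | $K = d - 1 > 0$ | (1.2) |
| P. Li & S. T. Yau (1980) | $\dfrac{\pi^2}{2 D^2}$ | $K \ge 0$ | (1.3) |
| J. Q. Zhong & H. C. Yang (1984) | $\dfrac{\pi^2}{D^2}$ | $K \ge 0$ | (1.4) |
| D. G. Yang (1999) | $\dfrac{\pi^2}{D^2} + \dfrac{K}{4}$ | $K \ge 0$ | (1.5) |
| P. Li & S. T. Yau (1980) | $\dfrac{1}{D^2 (d-1) \exp\big[1 + \sqrt{1 + 16 \alpha^2}\,\big]}$ | $K \le 0$ | (1.6) |
| K. R. Cai (1991) | $\dfrac{\pi^2}{D^2} + K$ | $K \le 0$ | (1.7) |
| D. Zhao (1999) | $\dfrac{\pi^2}{D^2} + 0.52 K$ | $K \le 0$ | (1.8) |
| H. C. Yang (1989) & F. Jia (1991) | $\dfrac{\pi^2}{D^2} e^{-\alpha}$ | $d \ge 5$, $K \le 0$ | (1.9) |
| H. C. Yang (1989) & F. Jia (1991) | $\dfrac{\pi^2}{2 D^2} e^{-\alpha'}$ | $2 \le d \le 4$, $K \le 0$ | (1.10) |

Table 1.2 Ten lower bounds of $\lambda_1$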

In Table 1.2, the two parameters $\alpha$ and $\alpha'$ are defined as

$$
\alpha = D \sqrt{|K| (d - 1)}\,/2 \quad \text{and} \quad \alpha' = D \sqrt{|K| \big((d - 1) \vee 2\big)}\,/2.
$$


The first estimate is due to A. Lichnerowicz, 45 years ago. It is very good, since it is indeed sharp for the unit sphere in two or higher dimensions. After 27 years, this result was improved by three French mathematicians, as given in (1.2). The problem here is that these two estimates become trivial for zero curvature, the unit circle for instance. It is well known that the zero-curvature case is harder than the positive-curvature case. The first progress was made by Li and Yau (1.3), and this was improved by Zhong and Yang (1.4), who removed the factor two from (1.3). A non-expert may think that this is not essential. However, it is regarded as one of the deepest results in geometry, since it is sharp for the unit circle. The fifth estimate is a mixture of the first and the fourth sharp estimates.

We now go to the case of negative curvature. The first result (1.6) is again due to Li and Yau, in the same paper quoted above. Comparing the two results (1.3) and (1.6), it should be clear that the negative case is much harder than the positive one. Li and Yau's results were improved step by step by many people, as listed in Table 1.2.

Among these estimates, seven, namely (1.1), (1.2), (1.4), (1.5) and (1.7)–(1.9), are sharp. The first two are sharp for the unit sphere in two or higher dimensions but fail for the unit circle; the fourth, the fifth, and the seventh to the ninth are all sharp for the unit circle. The above authors include several famous geometers, and the results have received several awards. As seen from the table, the picture is now very complete, due to the efforts of the geometers over the past 40 years or more. For such a well-developed field, what can we do now? Our original starting point was to learn from the geometers and to study their methods, especially the recent developments. It is surprising that we actually went in the opposite direction, that is, studying the first eigenvalue by using a probabilistic method. At last, we found a general formula for $\lambda_1$.

New variational formula

Before stating our new variational formula, we introduce some notation:

$$
C(r) = \cosh^{d-1}\left[\frac{r}{2} \sqrt{\frac{-K}{d-1}}\,\right], \quad r \in (0, D); \qquad \mathcal{F} = \{ f \in C[0, D] : f > 0 \text{ on } (0, D) \}.
$$

Here, we have used all three quantities: the dimension $d$, the diameter $D$ and the lower bound $K$ of the Ricci curvature. Note that $C(r)$ is always real for any $K$.

Theorem 1.1 (General formula [Chen and F. Y. Wang, 1997a]).

$$
\lambda_1 \ge \sup_{f \in \mathcal{F}} \inf_{r \in (0, D)} \frac{4 f(r)}{\int_0^r C(s)^{-1} \, ds \int_s^D C(u) f(u) \, du} =: \xi_1. \tag{1.11}
$$

The new variational formula (1.11) has its essential value in estimating the lower bound. It is dual to the classical variational formula (1.0) in the sense that “inf” in (1.0) is replaced by “sup” in (1.11). The classical formula goes back to Lord S. J. W. Rayleigh (1877) or E. Fischer (1905). The two formulas (1.0) and (1.11) have nothing in common, which explains why such a formula never appeared before. Certainly, the new formula can produce many new lower bounds. For instance, the one corresponding to the trivial function $f \equiv 1$ is still non-trivial in geometry. It also has a nice probabilistic meaning: the convergence rate of strong ergodicity (cf., §5.6). Clearly, in order to obtain a better estimate, one needs to be more careful in choosing the test functions. Applying the general formula (1.11) to the elementary test functions $\sin(\alpha r)$ and $\cosh^{d-1}(\alpha r) \sin(\beta r)$ with $\alpha = D \sqrt{|K| (d - 1)}\,/2$ and $\beta = \pi/(2D)$, we obtain the following:

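Corollary 1.2 (Chen and F. Y. Wang, 1997a).

$$
\lambda_1 \ge \frac{dK}{d-1} \left\{ 1 - \cos^d\left[\frac{D}{2} \sqrt{\frac{K}{d-1}}\,\right] \right\}^{-1}, \qquad d > 1,\ K \ge 0, \tag{1.12}
$$

$$
\lambda_1 \ge \frac{\pi^2}{D^2} \sqrt{1 - \frac{2 D^2 K}{\pi^4}}\ \cosh^{1-d}\left[\frac{D}{2} \sqrt{\frac{-K}{d-1}}\,\right], \qquad d > 1,\ K \le 0. \tag{1.13}
$$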

Applying the formula (1.11) to some very complicated test functions, we can prove, assisted by a computer, the following result.

Corollary 1.3 (Chen, E. Scacciatelli and L. Yao, 2002).

$$
\lambda_1 \ge \pi^2/D^2 + K/2, \qquad K \in \mathbb{R}. \tag{1.14}
$$

Surprisingly, the corollaries improve all the estimates (1.1)–(1.10): (1.12) improves (1.1) and (1.2), (1.13) improves (1.9) and (1.10), and (1.14) improves (1.4), (1.5), (1.7) and (1.8). Moreover, the linear approximation in (1.14) is optimal in the sense that the coefficient 1/2 of $K$ is exact.
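To make formula (1.11) concrete, here is a minimal numerical sketch that evaluates the quantity inside the supremum for a given test function $f$. It is an illustration under stated assumptions, not the method of the cited papers: the function names, the uniform grid, the trapezoidal rule and the restriction to $K \le 0$ with $d \ge 2$ are all choices made here.

```python
import math

def c_weight(s, d, K):
    """C(s) = cosh^{d-1}[(s/2) sqrt(-K/(d-1))]; this sketch assumes K <= 0 and d >= 2."""
    return math.cosh(0.5 * s * math.sqrt(-K / (d - 1))) ** (d - 1)

def xi_for_test_function(f, D, d=2, K=0.0, n=2000):
    """Grid approximation of
        inf_r 4 f(r) / int_0^r C(s)^{-1} [ int_s^D C(u) f(u) du ] ds
    over the grid points r_k = k D / n, k = 1, ..., n."""
    h = D / n
    r = [k * h for k in range(n + 1)]
    c = [c_weight(x, d, K) for x in r]
    cf = [c[k] * f(r[k]) for k in range(n + 1)]
    # inner[k] approximates int_{r_k}^{D} C(u) f(u) du, accumulated from the right.
    inner = [0.0] * (n + 1)
    for k in range(n - 1, -1, -1):
        inner[k] = inner[k + 1] + 0.5 * h * (cf[k] + cf[k + 1])
    # outer approximates int_0^{r_k} inner(s) / C(s) ds, accumulated from the left.
    outer = 0.0
    best = math.inf
    for k in range(1, n + 1):
        outer += 0.5 * h * (inner[k - 1] / c[k - 1] + inner[k] / c[k])
        best = min(best, 4.0 * f(r[k]) / outer)
    return best

D = 1.0
xi_sin = xi_for_test_function(lambda x: math.sin(math.pi * x / (2.0 * D)), D)
xi_one = xi_for_test_function(lambda x: 1.0, D)
```

In the flat case $K = 0$ we have $C \equiv 1$, and a direct computation shows that the ratio is constant in $r$ for $f(r) = \sin(\pi r/(2D))$, with value $\pi^2/D^2$, the bound in (1.4). The trivial choice $f \equiv 1$ gives $8/D^2$, which is weaker but still positive.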

A test function is indeed a mimic of the eigenfunction, so it should be chosen appropriately in order to obtain good estimates. A question arises naturally: does there exist a single representative test function, so that we can avoid the task of choosing a different test function each time? The answer is seemingly negative, since we have already seen that the eigenvalue and the eigenfunction are both very sensitive. Surprisingly, the answer is affirmative. The representative test function, though very tricky to find, has a rather simple form: $f(r) = \big( \int_0^r C(s)^{-1} \, ds \big)^{\gamma}$ ($\gamma \ge 0$). This is motivated by the study of the weighted Hardy inequality, a powerful tool in harmonic analysis [cf., B. Muckenhoupt (1972), B. Opic and A. Kufner (1990)]. The lower and the upper bounds correspond to $\gamma = 1/2$ and $\gamma = 1$, respectively.

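Corollary 1.4 (Chen, 2000c). For the lower bound $\xi_1$ of $\lambda_1$ given in Theorem 1.1, we have

$$
4 \delta^{-1} \ge \xi_1 \ge \delta^{-1}, \tag{1.15}
$$

where

$$
\delta = \sup_{r \in (0, D)} \left( \int_0^r C(s)^{-1} \, ds \right) \left( \int_r^D C(s) \, ds \right), \qquad C(s) = \cosh^{d-1}\left[\frac{s}{2} \sqrt{\frac{-K}{d-1}}\,\right].
$$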
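As a worked instance of (1.15), consider the flat case $K = 0$, which is the simplest one and is chosen here purely for illustration. Then $C \equiv 1$ and $\delta$ can be computed by hand:

```latex
\[
\delta = \sup_{r \in (0, D)} \left( \int_0^r ds \right) \left( \int_r^D ds \right)
       = \sup_{r \in (0, D)} r (D - r) = \frac{D^2}{4},
\qquad\text{so}\qquad
\frac{4}{D^2} \le \xi_1 \le \frac{16}{D^2}.
\]
```

The value $\pi^2/D^2 \approx 9.87/D^2$ from (1.4) lies inside this interval, so the two-sided estimate already locates $\xi_1$ to within a factor of four without any choice of test function.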


Theorem 1.1 and its corollaries are also valid for manifolds with a convex boundary endowed with the Neumann boundary condition. In this case, the estimates (1.1)–(1.10) are conjectured by geometers to be correct. However, only Lichnerowicz's estimate (1.1) has been proven, by J. F. Escobar in 1990. The others in (1.2)–(1.10), and furthermore those in (1.12)–(1.15), are all new in geometry.

Sketch of the main proof (Chen and F. Y. Wang, 1993b)

Here we adopt the analytic language. Our main tool is the coupling method. Given a self-adjoint second-order elliptic operator $L$ in $\mathbb{R}^d$,

$$L = \sum_{i,j=1}^d a_{ij}(x)\frac{\partial^2}{\partial x_i\,\partial x_j} + \sum_{i=1}^d b_i(x)\frac{\partial}{\partial x_i},$$

an elliptic (usually degenerate) operator $\widetilde L$ on the product space $\mathbb{R}^d\times\mathbb{R}^d$ is called a coupling of $L$ if it satisfies the following marginality (Chen and S. F. Li, 1989):

$$\widetilde L f(x,y) = Lf(x) \quad \big(\text{resp., } \widetilde L f(x,y) = Lf(y)\big), \qquad f\in C_b^2(\mathbb{R}^d),\ x\ne y,$$

where on the left-hand side, $f$ is regarded as a bivariate function.

Denote by $\{P_t\}_{t\ge 0}$ the semigroup determined by $L$: $P_t = e^{tL}$. Corresponding to a coupling operator $\widetilde L$, we have $\{\widetilde P_t\}_{t\ge 0}$. The coupling simply means that

$$\widetilde P_t f(x,y) = P_t f(x) \quad \big(\text{resp., } \widetilde P_t f(x,y) = P_t f(y)\big) \tag{1.20}$$

for all $f\in C_b^2(\mathbb{R}^d)$ and all $(x,y)$ ($x\ne y$), where on the left-hand side, $f$ is again regarded as a bivariate function. With this preparation in mind, we can now start our proof.

Step 1. Let $g$ be an eigenfunction of $-L$ corresponding to $\lambda_1$. That is, $-Lg = \lambda_1 g$. By the standard differential equation (the forward Kolmogorov equation) of the semigroup, we have

$$\frac{d}{dt}P_t g(x) = P_t L g(x) = -\lambda_1 P_t g(x).$$

Solving this ordinary differential equation in $P_t g(x)$ for fixed $g$ and $x$, we obtain

$$P_t g(x) = g(x)e^{-\lambda_1 t}. \tag{1.21}$$

Step 2. Consider the case of a compact space. Then $g$ is Lipschitz with respect to the distance $\rho$. Denote by $c_g$ the Lipschitz constant. Now, the main condition we need is the following:

$$\widetilde P_t\rho(x,y) \le \rho(x,y)e^{-\alpha t}. \tag{1.22}$$


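This condition is more or less equivalent to

$$\widetilde L\rho(x,y) \le -\alpha\rho(x,y), \qquad x\ne y \tag{1.23}$$

(cf. Lemma A.6 in the Appendix). Setting $g_1(x,y) = g(x)$ and $g_2(x,y) = g(y)$, we obtain

$$\begin{aligned}
e^{-\lambda_1 t}|g(x)-g(y)| &= |P_t g(x) - P_t g(y)| && \text{(by (1.21))}\\
&= |\widetilde P_t g_1(x,y) - \widetilde P_t g_2(x,y)| && \text{(by (1.20))}\\
&= |\widetilde P_t(g_1-g_2)(x,y)| \le \widetilde P_t|g_1-g_2|(x,y)\\
&\le c_g \widetilde P_t\rho(x,y) && \text{(Lipschitz property)}\\
&\le c_g\rho(x,y)e^{-\alpha t} && \text{(by (1.22))}.
\end{aligned}$$

Since $g$ is not a constant, there exist $x\ne y$ such that $g(x)\ne g(y)$. Letting $t\to\infty$, we must have $\lambda_1\ge\alpha$.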
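The identity (1.21) of Step 1 can be checked numerically on a toy example. The sketch below is an illustration of ours, not taken from the text: it uses a hypothetical three-state birth–death chain in place of the elliptic operator, for which $-Q$ has eigenvalues $0, 1, 3$ and $g = (1, 0, -1)$ is an eigenfunction for $\lambda_1 = 1$. The semigroup $P_t = e^{tQ}$ is computed by a truncated Taylor series, which is adequate only for small matrices and moderate $t$.

```python
import math

# Hypothetical 3-state birth-death chain with b_0 = b_1 = a_1 = a_2 = 1.
Q = [[-1.0, 1.0, 0.0],
     [1.0, -2.0, 1.0],
     [0.0, 1.0, -1.0]]
g = [1.0, 0.0, -1.0]   # eigenfunction: -Q g = lam1 * g
lam1 = 1.0

def mat_mul(A, B):
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def semigroup(Q, t, terms=60):
    """P_t = exp(tQ) via a truncated Taylor series."""
    n = len(Q)
    tQ = [[t * q for q in row] for row in Q]
    P = [[float(i == j) for j in range(n)] for i in range(n)]
    term = [row[:] for row in P]
    for k in range(1, terms):
        # term becomes (tQ)^k / k!
        term = [[x / k for x in row] for row in mat_mul(term, tQ)]
        P = [[P[i][j] + term[i][j] for j in range(n)] for i in range(n)]
    return P

def apply(P, f):
    return [sum(P[i][j] * f[j] for j in range(len(f))) for i in range(len(f))]

t = 0.7
Ptg = apply(semigroup(Q, t), g)                      # left-hand side of (1.21)
expected = [math.exp(-lam1 * t) * gi for gi in g]    # right-hand side of (1.21)
```

Comparing `Ptg` with `expected` componentwise confirms $P_t g = e^{-\lambda_1 t} g$ for this chain up to rounding error.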

The proof is unbelievably straightforward. A good point in the proof is the use of the eigenfunction, so that we can achieve sharp estimates. On the other hand, it is crucial that we do not need much knowledge about the eigenfunction; otherwise, there would be no hope of working in such a general setting, since the eigenvalue and its eigenfunction are either both known or both unknown. Apart from the Lipschitz property of $g$ with respect to the distance, which can be avoided by using a localizing procedure in the non-compact case, the key to the proof is clearly condition (1.23). For this, one needs not only a good coupling but also a good choice of the distance. It is a long journey to solve these two problems. The details will be explained in the next two chapters.

Our proof is universal in the sense that it works for general Markov processes. We also obtain variational formulas for non-compact manifolds, elliptic operators in $\mathbb{R}^d$ (Chen and F. Y. Wang, 1997b), and Markov chains (Chen, 1996). It is more difficult to derive the variational formulas for elliptic operators and Markov chains, due to the presence of infinitely many parameters in these cases. In contrast, there are only three parameters ($d$, $D$, and $K$) in the geometric case. In fact, with the coupling methods at hand, formula (1.11) is a particular consequence of our general formula (which is complete in dimension one) for elliptic operators. The general formulas have recently been extended to the Dirichlet eigenvalues by Chen, Y. H. Zhang and X. L. Zhao (2003).

To conclude this section, we return to the matrix case introduced at the beginning of the chapter.

Triangle matrices (Birth–death processes)

To answer the question just mentioned, we need some notation. Define

$$\mu_0 = 1, \qquad \mu_i = \frac{b_0\cdots b_{i-1}}{a_1\cdots a_i}, \quad i\ge 1.$$


Assume that the process is non-explosive:

$$\sum_{k=0}^\infty \frac{1}{b_k\mu_k}\sum_{i=0}^k \mu_i = \infty \quad\text{and moreover}\quad Z = \sum_i \mu_i < \infty. \tag{1.24}$$

Then, the process is ergodic (positive recurrent). The corresponding Dirichlet form is

$$D(f) = \sum_i \pi_i b_i (f_{i+1}-f_i)^2, \qquad \mathscr{D}(D) = \{f\in L^2(\pi): D(f)<\infty\}.$$

Here and in what follows, only the diagonal elements $D(f)$ are written, but the non-diagonal elements can be computed from the diagonal ones by using the quadrilateral rule. We then have the classical variational formula

$$\lambda_1 = \inf\{D(f): \pi(f) = 0,\ \pi(f^2) = 1\},$$

where $\pi(f) = \int f\,d\pi$. Define

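$$\begin{aligned}
\mathscr{F} &= \{f: f_0 = 0,\ f \text{ is strictly increasing}\},\\
\widetilde{\mathscr{F}} &= \{f: f_0 = 0, \text{ there exists } k: 1\le k\le\infty \text{ so that } f_i = f_{i\wedge k} \text{ and } f \text{ is strictly increasing in } [0,k]\},\\
I_i(f) &= \frac{1}{\mu_i b_i (f_{i+1}-f_i)}\sum_{j\ge i+1}\mu_j f_j.
\end{aligned}$$

Note that $\widetilde{\mathscr{F}}$ is simply a modification of $\mathscr{F}$. Hence, only the two notations $\mathscr{F}$ and $I(f)$ are essential.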

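Theorem 1.5 (Chen (1996; 2000c; 2001b)). Let $\bar f = f - \pi(f)$. For ergodic birth–death processes (i.e., (1.24) holds), we have

(1) Dual variational formulas: $\inf_{f\in\widetilde{\mathscr{F}}}\sup_{i\ge 1} I_i(\bar f)^{-1} = \lambda_1 = \sup_{f\in\mathscr{F}}\inf_{i\ge 0} I_i(\bar f)^{-1}$.

(2) Explicit bounds and approximating procedure: Two explicit sequences $\{\widetilde\eta_n\}$ and $\{\eta_n\}$ are constructed such that

$$Z\delta^{-1} \ge \widetilde\eta_n^{-1} \ge \lambda_1 \ge \eta_n^{-1} \ge (4\delta)^{-1},$$

where $\delta = \sup_{i\ge 1}\sum_{j\le i-1}(\mu_j b_j)^{-1}\sum_{j\ge i}\mu_j$.

(3) Explicit criterion: $\lambda_1 > 0$ iff $\delta < \infty$.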

Here the word “dual” means that the upper and lower bounds in part (1) of the theorem are interchangeable if one exchanges “sup” and “inf”. With slight modifications, this result is also valid for finite matrices; refer to Chen (1999a). Comparing with the examples given in Section 1.1, could you have expected such a short and complete answer?
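The explicit bounds in part (2) of Theorem 1.5 can be illustrated numerically. The following sketch is ours and rests on stated assumptions: it takes constant rates $b_i \equiv 1$, $a_i \equiv 2$ (a hypothetical choice), truncates all infinite sums at level $N$, and compares the resulting bounds with the value $(\sqrt{a}-\sqrt{b})^2$, which is the classical spectral gap for constant rates $a > b$. The function names are hypothetical.

```python
import math

def mu_sequence(b, a, N):
    """mu_0 = 1, mu_i = b_0...b_{i-1} / (a_1...a_i) for i = 0..N."""
    mu = [1.0]
    for i in range(1, N + 1):
        mu.append(mu[-1] * b(i - 1) / a(i))
    return mu

def delta_truncated(b, a, N):
    """Return (delta, Z) with every infinite sum truncated at level N."""
    mu = mu_sequence(b, a, N)
    tail = [0.0] * (N + 2)             # tail[i] ~ sum_{j >= i} mu_j
    for i in range(N, -1, -1):
        tail[i] = tail[i + 1] + mu[i]
    best, head = 0.0, 0.0
    for i in range(1, N + 1):
        head += 1.0 / (mu[i - 1] * b(i - 1))   # sum_{j <= i-1} 1/(mu_j b_j)
        best = max(best, head * tail[i])
    return best, tail[0]

b = lambda i: 1.0   # birth rates (assumed constant)
a = lambda i: 2.0   # death rates (assumed constant)
delta, Z = delta_truncated(b, a, 200)
lam1 = (math.sqrt(2.0) - math.sqrt(1.0)) ** 2   # classical gap for constant rates
lower, upper = 1.0 / (4.0 * delta), Z / delta
```

For these rates a direct computation gives $\delta = a/(a-b)^2 = 2$ and $Z = a/(a-b) = 2$, so the bounds read $0.125 \le \lambda_1 \le 1$, consistent with $\lambda_1 = (\sqrt2 - 1)^2 \approx 0.17$.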


1.3 Basic inequalities and new forms of Cheeger’s constants

Basic inequalities

We now go to a more general setup. Let $(E,\mathscr{E},\pi)$ be a probability space satisfying $\{(x,x): x\in E\}\in\mathscr{E}\times\mathscr{E}$. Denote by $L^p(\pi)$ the usual real $L^p$-space with norm $\|\cdot\|_p$. Write $\|\cdot\| = \|\cdot\|_2$.

For a given Dirichlet form $(D,\mathscr{D}(D))$, the classical variational formula for the first eigenvalue $\lambda_1$ can be rewritten in the form (1.25) below with optimal constant $C = \lambda_1^{-1}$. From this point of view, it is natural to study other inequalities. Here are two more basic inequalities, (1.26) and (1.27):

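$$\text{Poincaré inequality}:\quad \mathrm{Var}(f) \le C D(f), \qquad f\in L^2(\pi), \tag{1.25}$$

$$\text{Logarithmic Sobolev inequality}:\quad \int f^2\log\frac{f^2}{\|f\|^2}\,d\pi \le C D(f), \qquad f\in L^2(\pi), \tag{1.26}$$

$$\text{Nash inequality}:\quad \mathrm{Var}(f) \le C D(f)^{1/p}\|f\|_1^{2/q}, \qquad f\in L^2(\pi), \tag{1.27}$$

where $\mathrm{Var}(f) = \pi(f^2)-\pi(f)^2$, $\pi(f) = \int f\,d\pi$, $p\in(1,\infty)$ and $1/p+1/q = 1$. The last two inequalities are due to L. Gross (1976) and J. Nash (1958), respectively.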

Our main object is a symmetric (not necessarily Dirichlet) form $(D,\mathscr{D}(D))$ on $L^2(\pi)$, corresponding to an integral operator (or symmetric kernel) on $(E,\mathscr{E})$:

$$D(f) = \frac12\int_{E\times E} J(dx,dy)[f(y)-f(x)]^2, \qquad \mathscr{D}(D) = \{f\in L^2(\pi): D(f)<\infty\}, \tag{1.28}$$

where $J$ is a non-negative, symmetric measure having no charge on the diagonal set $\{(x,x): x\in E\}$. A typical example we have in mind is the reversible jump process with $q$-pair $(q(x), q(x,dy))$ and reversible measure $\pi$. Then $J(dx,dy) = \pi(dx)q(x,dy)$.

For the remainder of this section, we restrict our discussion to the symmetric form (1.28).

Status of the research

An important topic in this research area is to study under what conditions on the symmetric measure $J$ the above inequalities hold. In contrast with the probabilistic method used in Section 1.2, here we adopt a generalization of Cheeger’s method (1970), which comes from Riemannian geometry. Naturally, we define $\lambda_1 := \inf\{D(f): \pi(f) = 0,\ \|f\| = 1\}$. For bounded jump processes, the fundamental known result is the following:


Theorem 1.6 (G. F. Lawler and A. D. Sokal, 1988). $\lambda_1 \ge \dfrac{k^2}{2M}$, where

$$k = \inf_{\pi(A)\in(0,1)}\frac{\int_A \pi(dx)\,q(x,A^c)}{\pi(A)\wedge\pi(A^c)} \quad\text{and}\quad M = \sup_{x\in E} q(x) < \infty.$$
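On a small finite state space the constant $k$ of Theorem 1.6 can be computed by brute force over all subsets $A$. The sketch below is an illustration of ours, using a hypothetical three-state chain with symmetric $Q$-matrix (so that the uniform distribution is reversible). The first non-trivial eigenvalue of this chain is $\lambda_1 = 1$, which can be compared with the bound $k^2/(2M)$.

```python
from fractions import Fraction
from itertools import combinations

# Hypothetical example: symmetric Q-matrix, hence uniform reversible measure.
Q = [[-1, 1, 0],
     [1, -2, 1],
     [0, 1, -1]]
pi = [Fraction(1, 3)] * 3

def cheeger_constant(Q, pi):
    """k = min over proper subsets A of (flux from A to A^c) / min(pi(A), pi(A^c))."""
    n = len(Q)
    best = None
    for size in range(1, n):
        for A in combinations(range(n), size):
            Ac = [y for y in range(n) if y not in A]
            flux = sum(pi[x] * Q[x][y] for x in A for y in Ac)
            pA = sum(pi[x] for x in A)
            ratio = flux / min(pA, 1 - pA)
            best = ratio if best is None else min(best, ratio)
    return best

k = cheeger_constant(Q, pi)
M = max(-Q[x][x] for x in range(len(Q)))   # M = sup_x q(x)
lower_bound = k ** 2 / (2 * M)             # Lawler-Sokal lower bound for lambda_1
```

Here the enumeration is exponential in the number of states, so this is only meant for toy examples.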

In the past years, the theorem has been collected into six books: Chen (1992a), A. Sinclair (1993), F. R. K. Chung (1997), L. Saloff-Coste (1997), Y. Colin de Verdière (1998), D. G. Aldous and J. A. Fill (1994–). From the titles of the books, one can see the wide range of applications. However, this result fails for unbounded operators (i.e., $\sup_x q(x) = \infty$). Thus, it has been a challenging open problem in the past ten years to handle the unbounded case.

As for the logarithmic Sobolev inequality, there have been a large number of publications in the past twenty years for differential operators. For a survey, see D. Bakry (1992), L. Gross (1993) or A. Guionnet and B. Zegarlinski (2003). Still, there are very limited results for integral operators.

New results

Since the symmetric measure can be very unbounded, we choose a symmetric, non-negative function $r(x,y)$ such that

$$J^{(\alpha)}(dx,dy) := I_{\{r(x,y)^\alpha > 0\}}\frac{J(dx,dy)}{r(x,y)^\alpha}, \qquad \alpha > 0,$$

satisfies

$$\frac{J^{(1)}(dx,E)}{\pi(dx)} \le 1, \qquad \pi\text{-a.s.}$$

For convenience, we use the convention $J^{(0)} = J$. Corresponding to the three inequalities above, we introduce the following new forms of Cheeger’s constants.

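Inequality      Constant $k^{(\alpha)}$      Reference

Poincaré      $\inf_{\pi(A)\in(0,1)} \dfrac{J^{(\alpha)}(A\times A^c)}{\pi(A)\wedge\pi(A^c)}$      (Chen and F. Y. Wang, 1998)

Log. Sobolev      $\lim_{r\to 0}\,\inf_{\pi(A)\in(0,r]} \dfrac{J^{(\alpha)}(A\times A^c)}{\pi(A)\sqrt{\log[e+\pi(A)^{-1}]}}$      (F. Y. Wang, 2001a)

Log. Sobolev      $\lim_{\delta\to\infty}\,\inf_{\pi(A)>0} \dfrac{J^{(\alpha)}(A\times A^c)+\delta\pi(A)}{\pi(A)\sqrt{1-\log\pi(A)}}$      (Chen, 2000b)

Nash      $\inf_{\pi(A)\in(0,1)} \dfrac{J^{(\alpha)}(A\times A^c)}{[\pi(A)\wedge\pi(A^c)]^{(2q-3)/(2q-2)}}$      (Chen, 1999b)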

Table 1.3 New forms of Cheeger’s constants

Now, our main result can be easily stated as follows.

Theorem 1.7. $k^{(1/2)} > 0 \Longrightarrow$ the corresponding inequality holds.


In short, we use $J^{(1/2)}$ and $J^{(1)}$ to handle the unbounded $J$. The use of the first two kernels comes from the Schwarz inequality. The result is proven in the four papers quoted in Table 1.3. In these papers, some estimates for the upper or lower bounds, which can be sharp or qualitatively sharp, are also presented.

1.4 New picture of ergodic theory and explicit criteria

Importance of the inequalities

Let $(P_t)_{t\ge 0}$ be the semigroup determined by a Dirichlet form $(D,\mathscr{D}(D))$. Then, various applications of the inequalities are based on the following results:

Theorem 1.8 (T. M. Liggett (1989), L. Gross (1976) and Chen (1999b)).

(1) Poincaré inequality $\Longleftrightarrow$ $\|P_t f - \pi(f)\|^2 = \mathrm{Var}(P_t f) \le \mathrm{Var}(f)\exp[-2\lambda_1 t]$.

(2) Logarithmic Sobolev inequality $\Longrightarrow$ exponential convergence in entropy: $\mathrm{Ent}(P_t f) \le \mathrm{Ent}(f)\exp[-2\sigma t]$, where $\mathrm{Ent}(f) = \pi(f\log f) - \pi(f)\log\|f\|_1$.

(3) Nash inequality $\Longleftrightarrow$ $\mathrm{Var}(P_t f) \le C\|f\|_1^2/t^{q-1}$.

In the context of diffusions, one can replace “$\Longrightarrow$” by “$\Longleftrightarrow$” in part (2). Therefore, the above inequalities describe some type of $L^2$-ergodicity for the semigroup $(P_t)_{t\ge 0}$. These inequalities have become powerful tools in the study of infinite-dimensional mathematics (phase transitions, for instance) and the effectiveness of random algorithms.
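Part (1) of Theorem 1.8 can be seen in closed form on the simplest possible example. The sketch below is ours and assumes a two-state chain with hypothetical jump rates $a$ (from 0 to 1) and $b$ (from 1 to 0). For this chain $P_t f = \pi(f) + e^{-(a+b)t}(f - \pi(f))$ and $\lambda_1 = a + b$, so the variance decays exactly at rate $2\lambda_1$.

```python
import math

a, b = 2.0, 3.0                     # hypothetical rates 0 -> 1 and 1 -> 0
pi = (b / (a + b), a / (a + b))     # stationary (reversible) distribution
lam1 = a + b                        # spectral gap of the two-state chain

def mean(f):
    return pi[0] * f[0] + pi[1] * f[1]

def var(f):
    m = mean(f)
    return pi[0] * (f[0] - m) ** 2 + pi[1] * (f[1] - m) ** 2

def P_t(f, t):
    """Closed-form semigroup of the two-state chain applied to f."""
    m = mean(f)
    decay = math.exp(-lam1 * t)
    return [m + decay * (fx - m) for fx in f]

f = [1.0, -4.0]
t = 0.3
lhs = var(P_t(f, t))                        # Var(P_t f)
rhs = var(f) * math.exp(-2 * lam1 * t)      # Var(f) exp[-2 lambda_1 t]
```

In this example the inequality in part (1) holds with equality, since the only non-constant eigenfunction direction is $f - \pi(f)$ itself.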

Three traditional types of ergodicity

The following three types of ergodicity are well known for Markov processes.

$$\begin{aligned}
&\text{Ordinary ergodicity}: && \lim_{t\to\infty}\|p_t(x,\cdot)-\pi\|_{\mathrm{Var}} = 0\\
&\text{Exponential ergodicity}: && \|p_t(x,\cdot)-\pi\|_{\mathrm{Var}} \le C(x)e^{-\alpha t} \text{ for some } \alpha > 0\\
&\text{Strong ergodicity}: && \lim_{t\to\infty}\sup_x\|p_t(x,\cdot)-\pi\|_{\mathrm{Var}} = 0\\
& && \Longleftrightarrow \lim_{t\to\infty}e^{\beta t}\sup_x\|p_t(x,\cdot)-\pi\|_{\mathrm{Var}} = 0 \text{ for some } \beta > 0
\end{aligned}$$

where $p_t(x,dy)$ is the transition function of the Markov process and $\|\cdot\|_{\mathrm{Var}}$ is the total variation norm. They obey the following implications:

Strong ergodicity $\Longrightarrow$ Exponential ergodicity $\Longrightarrow$ Ordinary ergodicity.

It is natural to ask the following question: does there exist any relation between the above inequalities and the three traditional types of ergodicity?


New picture of ergodic theory

Theorem 1.9 (Chen (1999c), et al.). Let $(E,\mathscr{E})$ be a measurable space with countably generated $\mathscr{E}$. Then, for a Markov process with state space $(E,\mathscr{E})$, reversible and having transition probability densities with respect to a probability measure, we have the diagram shown in Figure 1.4.

                          Nash inequality
                     ⇙                        ⇘
Logarithmic Sobolev inequality          $L^1$-exponential convergence
              ⇓                                     ‖
Exponential convergence in entropy      π-a.s. Strong ergodicity
              ⇓                                     ⇓
      Poincaré inequality      ⟺      π-a.s. Exp. ergodicity
                                ⇓
                    $L^2$-algebraic ergodicity
                                ⇓
                       Ordinary ergodicity

Figure 1.4 Diagram of nine types of ergodicity

In Figure 1.4, the $L^2$-algebraic ergodicity means that $\mathrm{Var}(P_t f) \le C V(f) t^{1-q}$ ($t > 0$) holds for some $V$ having the following properties: $V$ is homogeneous of degree two in the sense that

$$V(cf+d) = c^2 V(f)$$

for any constants $c$ and $d$, and $V(f) < \infty$ for all functions $f$ with finite support.

The diagram is complete in the following sense. No one-sided implication can be replaced by a two-sided one. Moreover, strong ergodicity and the logarithmic Sobolev inequality (resp., exponential convergence in entropy) are not comparable. With the exception of the equivalences, all the implications in the diagram hold for more general Markov processes. Clearly, the diagram extends the ergodic theory of Markov processes.

The application of the diagram is obvious. For instance, one immediately obtains some criteria (which are indeed new) for the Poincaré inequality to hold from the well-known criteria for exponential ergodicity. On the other hand, by using the estimates obtained from the study of the Poincaré inequality, one may estimate the exponentially ergodic convergence rate (about which knowledge is still very limited).

The diagram was presented in Chen (1999c), stated mainly for Markov chains. Recently, the equivalence of $L^1$-exponential convergence and strong ergodicity was proved by Y. H. Mao (2002c). A counterexample of a diffusion, showing that strong ergodicity does not imply exponential convergence in entropy, was constructed by F. Y. Wang (2001b). For $L^2$-algebraic convergence, refer to T. M. Liggett (1991), J. D. Deuschel (1994), Chen and Y. Z. Wang (2000) and references therein. We will come back to this topic in §7.6. The detailed proofs of the diagram are presented in Chapter 8.


Explicit criteria for several types of ergodicity

As an application of the diagram in Figure 1.4, we obtain a criterion for the exponential ergodicity of birth–death processes, as listed in Table 1.5. To achieve this, we use the equivalence of exponential ergodicity and the Poincaré inequality, as well as the explicit criterion for the Poincaré inequality given in part (3) of Theorem 1.5. This solves a long-standing open problem in the study of Markov chains [cf. W. J. Anderson (1991, §6.6), Chen (1992a, §4.4)].

Recall that the sequence $(\mu_n)$ was defined above (1.24), and set $\mu[i,k] = \sum_{i\le j\le k}\mu_j$.

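Property      Criterion

Uniqueness      $\sum_{n\ge 0}\dfrac{1}{\mu_n b_n}\,\mu[0,n] = \infty$   $(*)$

Recurrence      $\sum_{n\ge 0}\dfrac{1}{\mu_n b_n} = \infty$

Ergodicity      $(*)$ & $\mu[0,\infty) < \infty$

Exponential ergodicity, $L^2$-exp. convergence      $(*)$ & $\sup_{n\ge 1}\mu[n,\infty)\sum_{j\le n-1}\dfrac{1}{\mu_j b_j} < \infty$

Discrete spectrum      $(*)$ & $\lim_{n\to\infty}\sup_{k\ge n+1}\mu[k,\infty)\sum_{n\le j\le k-1}\dfrac{1}{\mu_j b_j} = 0$

Log. Sobolev inequality      $(*)$ & $\sup_{n\ge 1}\mu[n,\infty)\log[\mu[n,\infty)^{-1}]\sum_{j\le n-1}\dfrac{1}{\mu_j b_j} < \infty$

Strong ergodicity, $L^1$-exp. convergence      $(*)$ & $\sum_{n\ge 0}\dfrac{1}{\mu_n b_n}\,\mu[n+1,\infty) = \sum_{n\ge 1}\mu_n\sum_{j\le n-1}\dfrac{1}{\mu_j b_j} < \infty$

Nash inequality      $(*)$ & $\sup_{n\ge 1}\mu[n,\infty)^{(q-2)/(q-1)}\sum_{j\le n-1}\dfrac{1}{\mu_j b_j} < \infty$   $(\varepsilon)$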

Table 1.5 Ten criteria for birth–death processes
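The criteria in Table 1.5 are explicit enough to be explored numerically. The sketch below is ours and only evaluates partial sums of the strong ergodicity series $\sum_{n\ge 1}\mu_n\sum_{j\le n-1}(\mu_j b_j)^{-1}$ for two hypothetical choices of rates. Partial sums can suggest, but of course not prove, convergence or divergence. For constant rates $b_i = 1$, $a_i = 2$ the $n$-th term equals $1 - 2^{-n}$, so the series diverges, while for $b_i = 1$, $a_i = i^2$ the terms behave like $1/n^2$.

```python
def strong_ergodicity_partial_sum(b, a, N):
    """Partial sum  sum_{n=1}^{N} mu_n * sum_{j <= n-1} 1/(mu_j b_j)."""
    mu_prev = 1.0    # mu_{n-1}, starting from mu_0 = 1
    inner = 0.0      # running value of sum_{j <= n-1} 1/(mu_j b_j)
    total = 0.0
    for n in range(1, N + 1):
        inner += 1.0 / (mu_prev * b(n - 1))
        mu_n = mu_prev * b(n - 1) / a(n)
        total += mu_n * inner
        mu_prev = mu_n
    return total

# Hypothetical rates: constant versus quadratically growing death rates.
const_30 = strong_ergodicity_partial_sum(lambda i: 1.0, lambda i: 2.0, 30)
const_60 = strong_ergodicity_partial_sum(lambda i: 1.0, lambda i: 2.0, 60)
quad_30 = strong_ergodicity_partial_sum(lambda i: 1.0, lambda i: float(i * i), 30)
quad_60 = strong_ergodicity_partial_sum(lambda i: 1.0, lambda i: float(i * i), 60)
```

The increment `const_60 - const_30` is close to 30, whereas `quad_60 - quad_30` is small. The level $N$ is kept moderate here because $1/\mu_j$ grows like $(j!)^2$ in the second example and would overflow floating-point arithmetic for large $N$.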

Next, it is natural to look for criteria for other types of ergodicity. To do so, we consider only the one-dimensional case. Here we focus on birth–death processes, since one-dimensional diffusion processes are parallel. The criterion for strong ergodicity was obtained recently by H. J. Zhang, X. Lin and Z. T. Hou (2000), and extended by Y. H. Zhang (2001), using a different approach, to a larger class of Markov chains. The criteria for the logarithmic Sobolev and Nash inequalities and for the discrete spectrum (no continuous spectrum, and all eigenvalues have finite multiplicity) were obtained by S. G. Bobkov and F. Götze (1999a; 1999b) and Y. H. Mao (2000, 2002a,b), respectively, based on the weighted Hardy inequality [see also L. Miclo (1999a,b), F. Y. Wang (2000a,b), F. Z. Gong and F. Y. Wang (2002)]. It is now understood that the results can also be deduced from generalizations of the variational formulas


discussed in this chapter [cf. Chen (2002a,d, 2003a) and Chapter 6]. Finally, we summarize these results in Theorem 1.10 and Table 1.5. The table is arranged in such an order that the property in each line is stronger than the property in the preceding line. The only exception is that, even though strong ergodicity is often stronger than the logarithmic Sobolev inequality, they are not comparable in general, as mentioned in Section 1.3.

Theorem 1.10 (Chen, 2001a). For birth–death processes with birth rates $b_i$ ($i\ge 0$) and death rates $a_i$ ($i\ge 1$), ten criteria are listed in Table 1.5, in which the notation “$(*)$ & $\cdots$” means that one requires the uniqueness condition in the first line plus the condition “$\cdots$”. The notation “$(\varepsilon)$” in the last line means that there is still a small gap ($1 < q \le 2$) left from completeness.

In conclusion, we have discussed in this chapter three levels of problems, three methods and mainly three results. The problems range over the principal eigenvalue, the basic inequalities and the ergodic theory, each having a wider range than the previous one. We have used the coupling method, which comes from probability theory, Cheeger’s approach, which comes from Riemannian geometry, and the weighted Hardy inequality, which comes from harmonic analysis. Finally, we have presented some variational formulas for the exponential ergodic rates, a comparison diagram and a table of explicit criteria for several types of ergodicity.


Chapter 2

Optimal Markovian Couplings

This chapter introduces our first mathematical tool, the coupling method, which will be used many times in the subsequent chapters. We introduce couplings, Markovian couplings and optimal Markovian couplings, mainly for time-continuous Markov processes. The study emphasizes the analysis of the coupling operators rather than the processes. Some constructions of optimal Markovian couplings for Markov chains and diffusions are presented, which are often unexpected. Some typical applications of the methods are illustrated through simple examples. Two general results on applications to the estimation of the first eigenvalue are proved in §2.4.

2.1 Couplings and Markovian couplings

Let us recall the simple definition of couplings.

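Definition 2.1. Let $\mu_k$ be a probability on a measurable space $(E_k,\mathscr{E}_k)$, $k = 1,2$. A probability measure $\widetilde\mu$ on the product measurable space $(E_1\times E_2, \mathscr{E}_1\times\mathscr{E}_2)$ is called a coupling of $\mu_1$ and $\mu_2$ if the following marginality holds:

$$\begin{aligned}
\widetilde\mu(A_1\times E_2) &= \mu_1(A_1),\\
\widetilde\mu(E_1\times A_2) &= \mu_2(A_2), \qquad A_k\in\mathscr{E}_k,\ k = 1,2.
\end{aligned}\tag{M}$$

Example 2.2 (Independent coupling $\widetilde\mu_0$). $\widetilde\mu_0 = \mu_1\times\mu_2$. That is, $\widetilde\mu_0$ is the independent product of $\mu_1$ and $\mu_2$.

This trivial coupling already has a non-trivial application. Let $\mu_k = \mu$ on $\mathbb{R}$, $k = 1,2$. We say that $\mu$ satisfies the FKG-inequality if

$$\int fg\,d\mu \ge \int f\,d\mu\int g\,d\mu, \qquad f,g\in\mathscr{M}, \tag{2.1}$$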


where $\mathscr{M}$ is the set of bounded monotone functions on $\mathbb{R}$. Here is the one-line proof based on the independent coupling:

$$\iint\widetilde\mu_0(dx,dy)[f(x)-f(y)][g(x)-g(y)] \ge 0, \qquad f,g\in\mathscr{M}.$$
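The one-line proof rests on the identity $\iint\widetilde\mu_0(dx,dy)[f(x)-f(y)][g(x)-g(y)] = 2\big(\int fg\,d\mu - \int f\,d\mu\int g\,d\mu\big)$, whose left-hand side has a non-negative integrand when $f$ and $g$ are both increasing. The following sketch of ours checks this with exact rational arithmetic on a hypothetical three-point measure, with hypothetical increasing functions `f` and `g`.

```python
from fractions import Fraction

points = [0, 1, 2]
mu = [Fraction(1, 2), Fraction(1, 3), Fraction(1, 6)]   # hypothetical measure

def f(x):
    return x * x        # increasing on the support

def g(x):
    return 2 * x + 1    # increasing

def integral(h):
    return sum(w * h(x) for w, x in zip(mu, points))

# Double integral against the independent coupling mu x mu.
double_integral = sum(
    wx * wy * (f(x) - f(y)) * (g(x) - g(y))
    for wx, x in zip(mu, points)
    for wy, y in zip(mu, points)
)
covariance = integral(lambda x: f(x) * g(x)) - integral(f) * integral(g)
```

Here `double_integral` equals `2 * covariance`, and both are non-negative, which is exactly (2.1) for this example.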

We mention that a criterion for the FKG-inequality for higher-dimensional measures on $\mathbb{R}^d$ (more precisely, for diffusions) was obtained by Chen and F. Y. Wang (1993a). However, a criterion is still unknown for Markov chains.

Open Problem 2.3. What is the criterion for the FKG-inequality for Markov jump processes?

We will explain the term “Markov jump processes” soon. The next example is non-trivial.

Example 2.4 (Basic coupling $\widetilde\mu_b$). Let $E_k = E$, $k = 1,2$. Denote by $\Delta$ the diagonal in $E\times E$: $\Delta = \{(x,x): x\in E\}$. Take

$$\widetilde\mu_b(dx_1,dx_2) = (\mu_1\wedge\mu_2)(dx_1)I_\Delta + \frac{(\mu_1-\mu_2)^+(dx_1)\,(\mu_1-\mu_2)^-(dx_2)}{(\mu_1-\mu_2)^+(E)}I_{\Delta^c},$$

where $\nu^\pm$ is the Jordan–Hahn decomposition of a signed measure $\nu$ and $\nu_1\wedge\nu_2 = \nu_1-(\nu_1-\nu_2)^+$.

Note that one may ignore $I_{\Delta^c}$ in the above formula since $(\mu_1-\mu_2)^+$ and $(\mu_1-\mu_2)^-$ have different supports.

Actually, the basic coupling is optimal in the following sense. Let $\rho$ be the discrete distance: $\rho(x,y) = 1$ if $x\ne y$, and $= 0$ if $x = y$. Then, a simple computation shows that

$$\widetilde\mu_b(\rho) = \frac12\|\mu_1-\mu_2\|_{\mathrm{Var}}.$$

Thus, by Dobrushin’s Theorem (see Theorem 2.23 below), we have

$$\widetilde\mu_b(\rho) = \inf_{\widetilde\mu}\widetilde\mu(\rho),$$

where $\widetilde\mu$ varies over all couplings of $\mu_1$ and $\mu_2$. In other words, $\widetilde\mu_b$ is a $\rho$-optimal coupling. This also indicates the kind of optimality for couplings that we are going to study in this chapter.
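For measures on a finite set the basic coupling of Example 2.4 can be written down directly. The sketch below is ours: it represents $\mu_1$, $\mu_2$ as lists of exact rationals, builds the coupling as a matrix, and the test measures are hypothetical. The function name `basic_coupling` is our own.

```python
from fractions import Fraction

def basic_coupling(mu1, mu2):
    """Matrix of the basic coupling of two distributions on {0, ..., n-1}."""
    n = len(mu1)
    plus = [max(p - q, 0) for p, q in zip(mu1, mu2)]    # (mu1 - mu2)^+
    minus = [max(q - p, 0) for p, q in zip(mu1, mu2)]   # (mu1 - mu2)^-
    mass = sum(plus)                                    # (mu1 - mu2)^+(E)
    coupling = [[Fraction(0)] * n for _ in range(n)]
    for i in range(n):
        coupling[i][i] = min(mu1[i], mu2[i])            # diagonal part
        for j in range(n):
            if i != j and mass > 0:
                coupling[i][j] = plus[i] * minus[j] / mass
    return coupling

mu1 = [Fraction(1, 2), Fraction(1, 2), Fraction(0)]
mu2 = [Fraction(1, 4), Fraction(1, 4), Fraction(1, 2)]
c = basic_coupling(mu1, mu2)

row_marginal = [sum(row) for row in c]
col_marginal = [sum(c[i][j] for i in range(3)) for j in range(3)]
off_diagonal_mass = sum(c[i][j] for i in range(3) for j in range(3) if i != j)
half_total_variation = sum(abs(p - q) for p, q in zip(mu1, mu2)) / 2
```

The row and column sums recover $\mu_1$ and $\mu_2$, and the off-diagonal mass, which is $\widetilde\mu_b(\rho)$ for the discrete distance, equals $\frac12\|\mu_1-\mu_2\|_{\mathrm{Var}}$.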

Similarly, we can define a coupling process of two stochastic processes in terms of their distributions at each time $t$ for fixed initial points. Of course, for given marginal Markov processes, the resulting coupled process may not be Markovian. Non-Markovian couplings are useful, especially in the time-discrete situation. However, in the time-continuous case, they are not practical. Hence, we now restrict ourselves to Markovian couplings.


Definition 2.5. Given two Markov processes with semigroups $P_k(t)$ or transition probabilities $P_k(t,x_k,\cdot)$ on $(E_k,\mathscr{E}_k)$, $k = 1,2$, a Markovian coupling is a Markov process with semigroup $\widetilde P(t)$ or transition probability $\widetilde P(t;x_1,x_2;\cdot)$ on the product space $(E_1\times E_2,\mathscr{E}_1\times\mathscr{E}_2)$ having the marginality:

$$\begin{aligned}
\widetilde P(t;x_1,x_2;A_1\times E_2) &= P_1(t,x_1,A_1),\\
\widetilde P(t;x_1,x_2;E_1\times A_2) &= P_2(t,x_2,A_2), \qquad t\ge 0,\ x_k\in E_k,\ A_k\in\mathscr{E}_k,\ k=1,2.
\end{aligned}\tag{MP}$$

Equivalently,

$$\begin{aligned}
\widetilde P(t)f(x_1,x_2) &= P_1(t)f(x_1),\\
\widetilde P(t)f(x_1,x_2) &= P_2(t)f(x_2), \qquad t\ge 0,\ x_k\in E_k,\ f\in b\mathscr{E}_k,\ k = 1,2,
\end{aligned}\tag{MP}$$

where $b\mathscr{E}$ is the set of all bounded $\mathscr{E}$-measurable functions. Here, on the left-hand side, $f$ is regarded as a bivariate function.

We now consider Markov jump processes. For this, we need some notation. Let $(E,\mathscr{E})$ be a measurable space such that $\{(x,x): x\in E\}\in\mathscr{E}\times\mathscr{E}$ and $\{x\}\in\mathscr{E}$ for all $x\in E$. It is well known that for a given sub-Markovian transition function $P(t,x,A)$ ($t\ge 0$, $x\in E$, $A\in\mathscr{E}$), if it satisfies the jump condition

$$\lim_{t\to 0}P(t,x,\{x\}) = 1, \qquad x\in E, \tag{2.2}$$

then the limits

$$q(x) := \lim_{t\to 0}\frac{1-P(t,x,\{x\})}{t} \quad\text{and}\quad q(x,A) := \lim_{t\to 0}\frac{P(t,x,A\setminus\{x\})}{t} \tag{2.3}$$

exist for all $x\in E$ and $A\in\mathscr{R}$, where

$$\mathscr{R} = \Big\{A\in\mathscr{E}: \lim_{t\to 0}\sup_{x\in A}\big[1-P(t,x,\{x\})\big] = 0\Big\}.$$

Moreover, for each $A\in\mathscr{R}$, $q(\cdot), q(\cdot,A)\in\mathscr{E}$; for each $x\in E$, $q(x,\cdot)$ is a finite measure on $(E,\mathscr{R})$; and $0\le q(x,A)\le q(x)\le\infty$ for all $x\in E$ and $A\in\mathscr{R}$. The pair $(q(x),q(x,A))$ ($x\in E$, $A\in\mathscr{R}$) is called a $q$-pair (also called the transition intensity or transition rate). The $q$-pair is said to be totally stable if $q(x)<\infty$ for all $x\in E$. Then $q(x,\cdot)$ can be uniquely extended to the whole space $\mathscr{E}$ as a finite measure. Next, the $q$-pair $(q(x),q(x,A))$ is called conservative if $q(x,E) = q(x) < \infty$ for all $x\in E$ (note that conservativity here is different from the notion often used in the context of diffusions). Because of the above facts, we often call the sub-Markovian transition function $P(t,x,A)$ satisfying (2.3) a jump process or a $q$-process. Finally, a $q$-pair is called regular if it is not only totally stable and conservative but also determines uniquely a jump process (non-explosive).

When $E$ is countable, conventionally we use the matrix $Q = (q_{ij}: i,j\in E)$ (called the $Q$-matrix) and $P(t) = (p_{ij}(t): i,j\in E)$,

$$p_{ij}'(t)\big|_{t=0} = q_{ij},$$


instead of the $q$-pair and the jump process, respectively. Here $q_{ii} = -q_i$, $i\in E$. We also call $P(t) = (p_{ij}(t))$ a Markov chain (a term used, throughout this book, only for discrete state spaces) or a $Q$-process.

In practice, what we know in advance is the $q$-pair $(q(x),q(x,dy))$ but not $P(t,x,dy)$. Hence, our real interest goes in the opposite direction: how does a $q$-pair determine the properties of $P(t,x,dy)$? A large part of the book (Chen, 1992) is devoted to the theory of jump processes. Here, we would like to mention that the theory now has some very nice applications to quantum physics, which were missed in the book. Refer to the survey article by A. A. Konstantinov, U. P. Maslov and A. M. Chebotarev (1990) and references therein.

Clearly, there is a one-to-one correspondence between a $q$-pair and the operator

$$\Omega f(x) = \int q(x,dy)[f(y)-f(x)] - [q(x)-q(x,E)]f(x), \qquad f\in b\mathscr{E}.$$

Because of this one-to-one correspondence between a $q$-pair and its operator $\Omega$, we will use both according to our convenience. Corresponding to a coupled Markov jump process, we have a $q$-pair $(\widetilde q(x_1,x_2), \widetilde q(x_1,x_2;dy_1,dy_2))$ as follows:

$$\begin{aligned}
\widetilde q(x_1,x_2) &= \lim_{t\to 0}\frac{1-\widetilde P(t;x_1,x_2;\{x_1\}\times\{x_2\})}{t}, && (x_1,x_2)\in E_1\times E_2,\\
\widetilde q(x_1,x_2;\widetilde A) &= \lim_{t\to 0}\frac{\widetilde P(t;x_1,x_2;\widetilde A)}{t}, && (x_1,x_2)\notin\widetilde A\in\widetilde{\mathscr{R}},\\
\widetilde{\mathscr{R}} &:= \Big\{\widetilde A\in\mathscr{E}_1\times\mathscr{E}_2: \lim_{t\to 0}\sup_{(x_1,x_2)\in\widetilde A}\big[1-\widetilde P(t;x_1,x_2;\{(x_1,x_2)\})\big] = 0\Big\}.
\end{aligned}$$

Concerning the total stability and conservativity of the $q$-pair of a coupling (or coupled) process, we have the following result.

Theorem 2.6. The following assertions hold.

(1) [Chen, 1994a]. A (equivalently, any) coupling $q$-pair is totally stable iff so are the marginals.

(2) [Y. H. Zhang, 1994]. A (equivalently, any) coupling $q$-pair is conservative iff so are the marginals.

Proof of part (1). To get a feeling for the proof, we prove here the easier part of the theorem. Note that we do not assume the uniqueness of the processes here.

Denote by $P_k(t,x_k,dy_k)$ a marginal jump process with $q$-pair $(q_k(x_k),q_k(x_k,dy_k))$ on $(E_k,\mathscr{R}_k)$, $k = 1,2$, respectively. Next, let $\widetilde P(t;x_1,x_2;dy_1,dy_2)$ be a coupled jump process with $q$-pair $(\widetilde q(x_1,x_2),\widetilde q(x_1,x_2;dy_1,dy_2))$ on $(E_1\times E_2,\widetilde{\mathscr{R}})$. Clearly, we need only show that

$$q_1(x_1)\vee q_2(x_2) \le \widetilde q(x_1,x_2) \le q_1(x_1)+q_2(x_2).$$


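By the marginality for processes (MP), we have

$$\begin{aligned}
\widetilde P(t;x_1,x_2;\{x_1\}\times\{x_2\}) &\ge \widetilde P(t;x_1,x_2;\{x_1\}\times E_2) - \widetilde P(t;x_1,x_2;E_1\times(E_2\setminus\{x_2\}))\\
&\ge \widetilde P(t;x_1,x_2;\{x_1\}\times E_2) - 1 + \widetilde P(t;x_1,x_2;E_1\times\{x_2\})\\
&= P_1(t,x_1,\{x_1\}) - 1 + P_2(t,x_2,\{x_2\}).
\end{aligned}$$

By the first part of (2.3), this gives us $\widetilde q(x_1,x_2)\le q_1(x_1)+q_2(x_2)$. On the other hand, since

$$\widetilde P(t;x_1,x_2;\{x_1\}\times\{x_2\}) \le \widetilde P(t;x_1,x_2;\{x_1\}\times E_2) = P_1(t,x_1,\{x_1\}),$$

we obtain $\widetilde q(x_1,x_2)\ge q_1(x_1)$ by (2.3) again. The bound $\widetilde q(x_1,x_2)\ge q_2(x_2)$ follows in the same way by exchanging the roles of the two components.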

Due to Theorem 2.6, from now on, assume that all coupling operators considered below are conservative. Then, we have

$$\begin{aligned}
\widetilde q(x_1,x_2) &= \lim_{t\to 0}\frac{1-\widetilde P(t;x_1,x_2;\{x_1\}\times\{x_2\})}{t}, && (x_1,x_2)\in E_1\times E_2,\\
\widetilde q(x_1,x_2;\widetilde A) &= \lim_{t\to 0}\frac{\widetilde P(t;x_1,x_2;\widetilde A)}{t}, && (x_1,x_2)\notin\widetilde A\in\mathscr{E}_1\times\mathscr{E}_2.
\end{aligned}$$

Note that in the second line, the original set $\widetilde{\mathscr{R}}$ is replaced by $\mathscr{E}_1\times\mathscr{E}_2$. Define

$$\Omega_1 f(x_1) = \int q_1(x_1,dy_1)[f(y_1)-f(x_1)], \qquad f\in b\mathscr{E}_1.$$

Similarly, we can define $\Omega_2$. Corresponding to a coupling process $\widetilde P(t)$, we also have an operator $\widetilde\Omega$. Now, since the marginal $q$-pairs and the coupling $q$-pairs are all conservative, it is not difficult to prove that (MP) implies the following:

$$\begin{aligned}
\widetilde\Omega f(x_1,x_2) &= \Omega_1 f(x_1), && f\in b\mathscr{E}_1,\\
\widetilde\Omega f(x_1,x_2) &= \Omega_2 f(x_2), && f\in b\mathscr{E}_2,\ x_k\in E_k,\ k = 1,2.
\end{aligned}\tag{MO}$$

Again, on the left-hand side, $f$ is regarded as a bivariate function. Refer to Chen (1986a) or Chen (1992a, Chapter 5). Here, “MO” means the marginality for operators.

Definition 2.7. Any operator $\widetilde\Omega$ satisfying (MO) is called a coupling operator.

Does there exist any coupling operator?

Examples of coupling operators for jump processes

The simplest example to answer the above question is the following.


Example 2.8 (Independent coupling $\widetilde\Omega_0$).

$$\widetilde\Omega_0 f(x_1,x_2) = [\Omega_1 f(\cdot,x_2)](x_1) + [\Omega_2 f(x_1,\cdot)](x_2), \qquad x_k\in E_k,\ k = 1,2.$$

This coupling is trivial but it does show that a coupling operator always exists.

To simplify our notation, in what follows, instead of writing down a coupling operator, we will use tables. For instance, a $q$-pair can be expressed as follows:

$$x \to dy\setminus\{x\} \quad\text{at rate } q(x,dy).$$

In particular, in the discrete case, a $Q$-matrix can be expressed as

$$i \to j\ne i \quad\text{at rate } q_{ij}.$$

Example 2.9 (Classical coupling $\widetilde\Omega_c$). Take $E_1 = E_2 = E$ and let $\Omega_1 = \Omega_2 = \Omega$. If $x_1\ne x_2$, then take

$$\begin{aligned}
(x_1,x_2) &\to (y_1,x_2) && \text{at rate } q(x_1,dy_1)\\
&\to (x_1,y_2) && \text{at rate } q(x_2,dy_2).
\end{aligned}$$

Otherwise,

$$(x,x) \to (y,y) \quad\text{at rate } q(x,dy).$$

Each coupling has its own character. The classical coupling means that the marginals evolve independently until they meet. Then, they move together. A nice way to interpret this coupling is to use a Chinese idiom: fall in love at first sight. That is, a boy and a girl had independent paths of their lives before the first time they met each other. Once they met, they fell in love at once and will share the same path of their lives forever. When the marginal $Q$-matrices are the same, all couplings considered below will have the property listed in the last line, and hence we will not mention it again.

Example 2.10 (Basic coupling $\widetilde\Omega_b$). For $x_1,x_2\in E$, take

$$\begin{aligned}
(x_1,x_2) &\to (y,y) && \text{at rate } \big[q_1(x_1,\cdot)\wedge q_2(x_2,\cdot)\big](dy)\\
&\to (y_1,x_2) && \text{at rate } \big[q_1(x_1,\cdot)-q_2(x_2,\cdot)\big]^+(dy_1)\\
&\to (x_1,y_2) && \text{at rate } \big[q_2(x_2,\cdot)-q_1(x_1,\cdot)\big]^+(dy_2).
\end{aligned}$$

The basic coupling means that the components jump to the same place at the biggest possible rate. This explains where the term $q_1(x_1,dy_1)\wedge q_2(x_2,dy_2)$ comes from: it is the biggest one that guarantees the marginality. This term is the key to the coupling. Note that whenever we have a term $A\wedge B$, we should automatically have the other two terms $(A-B)^+$ and $(B-A)^+$, again due to the marginality. Thus, in what follows, we will write down the term $A\wedge B$ only, for simplicity.
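For a finite state space the basic coupling and its marginality (MO) can be checked mechanically. The following sketch is ours: the two $Q$-matrices are hypothetical, the helper names are our own, and the three families of jumps of Example 2.10 are accumulated into a dictionary keyed by the target state.

```python
from collections import defaultdict

def off_diag(Q, x, y):
    """Jump rate q(x, {y}); the diagonal carries no jump mass."""
    return Q[x][y] if x != y else 0

def basic_coupling_rates(Q1, Q2, x1, x2):
    """Jump rates of the basic coupling out of the state (x1, x2)."""
    rates = defaultdict(int)
    for y in range(len(Q1)):
        r1, r2 = off_diag(Q1, x1, y), off_diag(Q2, x2, y)
        rates[(y, y)] += min(r1, r2)          # both components jump to y
        rates[(y, x2)] += max(r1 - r2, 0)     # only the first component jumps
        rates[(x1, y)] += max(r2 - r1, 0)     # only the second component jumps
    return {target: r for target, r in rates.items() if r > 0}

def generator(Q, f, x):
    """Marginal operator: (Omega f)(x) = sum_y q(x, y) [f(y) - f(x)]."""
    return sum(off_diag(Q, x, y) * (f(y) - f(x)) for y in range(len(Q)))

def coupled_generator(Q1, Q2, F, x1, x2):
    """Coupling operator applied to a bivariate function F at (x1, x2)."""
    return sum(r * (F(y1, y2) - F(x1, x2))
               for (y1, y2), r in basic_coupling_rates(Q1, Q2, x1, x2).items())

# Hypothetical conservative Q-matrices on {0, 1, 2}.
Q1 = [[-1, 1, 0], [2, -3, 1], [0, 2, -2]]
Q2 = [[-2, 2, 0], [1, -4, 3], [0, 1, -1]]
f = lambda x: x * x
```

Applying `coupled_generator` to `F(y1, y2) = f(y1)` returns `generator(Q1, f, x1)` at every state, and similarly for the second coordinate, which is the marginality (MO) for this example.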

Example 2.11 (Coupling of marching soldiers $\widetilde\Omega_m$). Assume that $E$ is an additive group. Take

$$(x_1,x_2) \to (x_1+y,x_2+y) \quad\text{at rate } q_1(x_1,x_1+dy)\wedge q_2(x_2,x_2+dy).$$


The word “marching” is a Chinese name, which is the command to soldiers to start marching. Thus, this coupling means that at each step, the components maintain the same length of jumps with the biggest possible rate.

In the time-discrete case, the classical coupling and the basic coupling are due to W. Doeblin (1938) and L. N. Wasserstein (1969), respectively. The coupling of marching soldiers is due to Chen (1986b). The original purpose of the last coupling is mainly to preserve the order.

Let us now consider a birth–death process with regular Q-matrix:

q_{i,i+1} = b_i, i ≥ 0; q_{i,i−1} = a_i, i ≥ 1.

Then for two copies of the process starting from i1 and i2, respectively, we have the following two examples taken from (Chen, 1990):

Example 2.12 (Modified coupling of marching soldiers Ωcm). Take Ωcm = Ωc if |i1 − i2| ≤ 1 and Ωcm = Ωm if |i1 − i2| ≥ 2.

Example 2.13 (Coupling by inner reflection Ωir). Again, take Ωir = Ωc if |i1 − i2| ≤ 1. For i2 ≥ i1 + 2, take

(i1, i2) → (i1 + 1, i2 − 1) at rate b_{i1} ∧ a_{i2}
         → (i1 − 1, i2) at rate a_{i1}
         → (i1, i2 + 1) at rate b_{i2}.

By exchanging i1 and i2, we can get the expression of Ωir for the case that i1 > i2.

This coupling lets the components move closer to each other (not necessarily to the same place, as required by the basic coupling) with the biggest possible rate.

From these examples one sees that there are many choices of a coupling operator Ω. Indeed, there are infinitely many choices! Thus, in order to use the coupling technique, a basic problem we should study is the regularity (non-explosion) of coupling operators. Fortunately, for this problem we have a complete answer [Chen (1986a) or Chen (1992a, Chapter 5)]. The following result can be regarded as a fundamental theorem for couplings of jump processes.

Theorem 2.14 (Chen, 1986a).

(1) If a coupling operator is non-explosive, then so are its marginals.

(2) If the marginals are both non-explosive, then so is every coupling operator.

(3) If so, then (MP) and (MO) are equivalent.

Clearly, Theorem 2.14 greatly simplifies our study of couplings for general jump processes, since the marginality (MP) of a coupling process is reduced to the much simpler marginality (MO) of the corresponding operators.


Markovian couplings for diffusions

We now turn to the couplings for diffusion processes in R^d with second-order differential operator

\[
L = \frac{1}{2}\sum_{i,j=1}^{d} a_{ij}(x)\frac{\partial^2}{\partial x_i\,\partial x_j} + \sum_{i=1}^{d} b_i(x)\frac{\partial}{\partial x_i}.
\]

For simplicity, we write L ∼ (a(x), b(x)). Given two diffusions with operators

L_k ∼ (a_k(x), b_k(x)), k = 1, 2,

respectively, an elliptic (possibly degenerate) operator $\widetilde L$ on the product space R^d × R^d is called a coupling of L_1 and L_2 if it satisfies the following marginality:

\[
\widetilde L f(x, y) = L_1 f(x) \quad \big(\text{resp., } \widetilde L f(x, y) = L_2 f(y)\big), \qquad f \in C_b^2(\mathbb{R}^d),\ x \ne y. \tag{MO}
\]

Again, on the left-hand side, f is regarded as a bivariate function. From this, it is clear that the coefficients of any coupling operator should be of the form

\[
a(x, y) = \begin{pmatrix} a_1(x) & c(x, y) \\ c(x, y)^* & a_2(y) \end{pmatrix}, \qquad
b(x, y) = \begin{pmatrix} b_1(x) \\ b_2(y) \end{pmatrix},
\]

where the matrix c(x, y)* is the conjugate of c(x, y). This condition and the non-negative definiteness of a(x, y) constitute the marginality in the context of diffusions. Obviously, the only freedom is the choice of c(x, y).

In analogy with jump processes, we have the following examples:

Example 2.15 (Classical coupling). c(x, y) ≡ 0 for all x ≠ y.

Example 2.16 (Coupling of marching soldiers [Chen and S. F. Li, 1989]). Let a_k(x) = σ_k(x)σ_k(x)*, k = 1, 2. Take c(x, y) = σ_1(x)σ_2(y)*.

The couplings given below are due to T. Lindvall and L. C. G. Rogers (1986) and to Chen and S. F. Li (1989), respectively.

Example 2.17 (Coupling by reflection). Let L_1 = L_2 and a(x) = σ(x)σ(x)*. Take

\[
c(x, y) = \sigma(x)\left[\sigma(y)^* - \frac{2\,\sigma(y)^{-1}uu^*}{|\sigma(y)^{-1}u|^2}\right], \qquad \det\sigma(y) \ne 0,\ x \ne y,
\]
\[
c(x, y) = \sigma(x)\,[I - 2uu^*]\,\sigma(y)^*, \qquad x \ne y,
\]

where u = (x − y)/|x − y|.


This coupling was generalized to Riemannian manifolds by W. S. Kendall (1986) and M. Cranston (1991).

In the case that x = y, the first and the third couplings here are defined to be the same as the second one.

In probabilistic language, suppose that the original process is given by the stochastic differential equation

\[ dX_t = \sqrt{2}\,\sigma(X_t)\,dB_t + b(X_t)\,dt, \]

where (B_t) is a Brownian motion. We want to construct a new process (X′_t):

\[ dX'_t = \sqrt{2}\,\sigma(X'_t)\,dB'_t + b(X'_t)\,dt \]

on the same probability space, having the same distribution as (X_t). Then all we need is to choose a suitable Brownian motion (B′_t). Corresponding to the above three examples, we have

(1) Classical coupling: B′_t is a new Brownian motion, independent of B_t.

(2) Coupling of marching soldiers: B′_t = B_t.

(3) Coupling by reflection: B′_t = [I − 2uu*](X_t, X′_t) B_t, where u is given in Example 2.17.

It is important to remark that in the constructions, we need only consider the time t < T, where T is the coupling time:

T = inf{t ≥ 0 : X_t = X′_t},

since X_t = X′_t for all t ≥ T. This avoids the degeneration of the coupling operators.

Before moving further, let us mention a conjecture as follows.

Conjecture 2.18. The fundamental theorem (Theorem 2.14) holds for diffusions.

The following facts strongly support the conjecture.

(a) A well-known sufficient condition says that the operator L_k (k = 1, 2) is well-posed if there exists a function ϕ_k such that lim_{|x|→∞} ϕ_k(x) = ∞ and L_k ϕ_k ≤ cϕ_k for some constant c. Then the conclusion holds for the coupling operators, simply by taking

ϕ(x1, x2) = ϕ_1(x1) + ϕ_2(x2).

(b) Let τ_{n,k} be the first time the k-th process (k = 1, 2) leaves the cube with side length n, and let τ_n be the first time the coupled process leaves the product cube. Then we have

τ_{n,1} ∨ τ_{n,2} ≤ τ_n ≤ τ_{n,1} + τ_{n,2}.

Besides, a process, the k-th one for instance, is well-posed iff

lim_{n→∞} P_k[τ_{n,k} < t] = 0.


Open Problem 2.19. What should be the form of Markovian coupling operators for Lévy processes?

2.2 Optimality with respect to distances

Since there are infinitely many Markovian couplings, we asked ourselves several times in the past years: Does there exist an optimal one? Let us now explain how we obtained a reasonable notion of optimal Markovian couplings. The first time we touched this problem was in Chen and S. F. Li (1989). It was proved there that for Brownian motion, the coupling by reflection is optimal with respect to the total variation and, moreover, that for different probability metrics the effective couplings can be different. The second time, in Chen (1990), it was proved that for birth–death processes, we have an order as follows:

Ωir ≻ Ωb ≻ Ωc ≻ Ωcm ≻ Ωm,

where A ≻ B means that A is better than B in some sense. However, only in 1992 did it become clear to the author how to optimize couplings.

To study optimal couplings, we need more preparation. As was mentioned several times in previous publications [Chen (1989a; 1989b; 1992a) and Chen and S. F. Li (1989)], it is helpful to keep in mind the relation between couplings and probability metrics. It will be clear soon that this is actually one of the key ideas of the study. As far as we know, there are more than 16 different probability distances, including the total variation, the Lévy-Prohorov distance for the weak convergence, and so on. But we are often concerned with another distance. We now explain our understanding of how to introduce this distance.

As we know, in probability theory, we usually consider the following types of convergence for real random variables on a probability space:

[Figure 2.1. Typical types of convergence in probability theory: a.s. convergence, convergence in L^p (p ≥ 1), convergence in P, weak convergence and vague convergence.]

The L^p-convergence, the a.s. convergence and the convergence in P all depend on the reference frame: our probability space (Ω, F, P). But the vague (weak) convergence does not. By a result of Skorohod [cf. N. Ikeda and S. Watanabe (1988, p. 9, Theorem 2.7)], if P_n converges weakly to P, then we can choose a suitable reference frame (Ω, F, P) such that ξ_n ∼ P_n, ξ ∼ P and ξ_n → ξ a.s., where ξ ∼ P means that ξ has distribution P. Thus, all these types of convergence above are intrinsically the same except the L^p-convergence. In other words, if we want to find another intrinsic metric on the space of all probabilities, we should consider an analogue of the L^p-convergence.

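Let ξ1, ξ2: (Ω, F, P) → (E, ρ, ℰ). The usual L^p-metric is defined by

\[ \|\xi_1 - \xi_2\|_p = \big\{\mathbb{E}\big[\rho(\xi_1, \xi_2)^p\big]\big\}^{1/p}. \]

Suppose that ξ_i ∼ P_i, i = 1, 2, and (ξ1, ξ2) ∼ $\widetilde P$. Then

\[ \|\xi_1 - \xi_2\|_p = \left\{\int \rho(x_1, x_2)^p\,\widetilde P(dx_1, dx_2)\right\}^{1/p}. \]

Certainly, $\widetilde P$ is a coupling of P_1 and P_2. However, if we ignore our reference frame (Ω, F, P), then there are a lot of choices of $\widetilde P$ for given P_1 and P_2. Thus, the intrinsic metric should be defined as follows:

\[ W_p(P_1, P_2) = \inf_{\widetilde P}\left\{\int \rho(x_1, x_2)^p\,\widetilde P(dx_1, dx_2)\right\}^{1/p}, \qquad p \ge 1. \]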

Definition 2.20. The metric defined above is called the W_p-distance or the p-th Wasserstein distance. Briefly, we write W = W_1.

From the probabilistic point of view, the W_p-metrics have an intrinsic property which makes them more suitable for certain applications. For example, if (E, ρ) is the Euclidean space and P_2 is obtained from P_1 by a translation, then W_p(P_1, P_2) is equal to the length of the translation vector.

In general, it is quite hard to compute the exact W_p-distance. Here are the main known results.

Theorem 2.21 (S. S. Vallender, 1973). Let P_k be a probability on the real line with distribution function F_k(x), k = 1, 2. Then

\[ W(P_1, P_2) = \int_{-\infty}^{+\infty} |F_1(x) - F_2(x)|\,dx. \]
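To make Theorem 2.21 concrete, here is a minimal sketch for two empirical distributions on the real line. The function names and the sample points are hypothetical, chosen only for illustration. The first function integrates |F_1 − F_2| directly; the second evaluates the cost of the monotone coupling (pair the i-th smallest point of one sample with the i-th smallest point of the other), which is the standard optimal coupling on the line for equal sample sizes.

```python
from bisect import bisect_right

def w1_via_cdf(xs, ys):
    """W between the empirical laws of xs and ys, as the integral of |F1 - F2|."""
    xs, ys = sorted(xs), sorted(ys)
    grid = sorted(set(xs) | set(ys))
    total = 0.0
    for left, right in zip(grid, grid[1:]):
        # Both distribution functions are constant on [left, right).
        f1 = bisect_right(xs, left) / len(xs)
        f2 = bisect_right(ys, left) / len(ys)
        total += abs(f1 - f2) * (right - left)
    return total

def w1_via_sorted_coupling(xs, ys):
    """Cost of the monotone coupling; assumes equal sample sizes."""
    assert len(xs) == len(ys)
    return sum(abs(a - b) for a, b in zip(sorted(xs), sorted(ys))) / len(xs)

# Hypothetical sample points.
xs = [0.0, 1.0, 3.0, 7.0]
ys = [2.0, 2.5, 4.0, 5.0]
shifted = [x + 1.5 for x in xs]   # translation of xs by 1.5

print(w1_via_cdf(xs, ys), w1_via_sorted_coupling(xs, ys))
print(w1_via_cdf(xs, shifted))    # should equal the length of the translation
```

The last line illustrates the translation property mentioned before the theorem: shifting a distribution by 1.5 gives W-distance 1.5.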

Theorem 2.22 (D. C. Dowson and B. V. Landau (1982), C. R. Givens and R. M. Shortt (1984), I. Olkin and R. Pukelsheim (1982)). Let P_k be the normal distribution on (R^d, B(R^d)) (d ≥ 1) with mean value m_k and covariance matrix M_k, k = 1, 2. Then

\[
W_2(P_1, P_2) = \Big[\,|m_1 - m_2|^2 + \operatorname{Trace} M_1 + \operatorname{Trace} M_2 - 2\operatorname{Trace}\big(\sqrt{M_1}\,M_2\sqrt{M_1}\big)^{1/2}\Big]^{1/2},
\]

where Trace M denotes the trace of M.
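The formula of Theorem 2.22 can be evaluated without any matrix library when d = 2. The sketch below is an illustration under that assumption, and its function names are hypothetical. It uses two elementary facts: for a 2×2 non-negative definite matrix A with eigenvalues l1, l2, the trace of its square root satisfies (√l1 + √l2)² = Trace A + 2√(det A); and √M_1 M_2 √M_1 has the same trace as M_1 M_2 and determinant det M_1 · det M_2.

```python
import math

def trace_of_sqrt_2x2(trace, det):
    """Trace of the square root of a 2x2 non-negative definite matrix."""
    return math.sqrt(trace + 2.0 * math.sqrt(det))

def w2_gaussian_2d(m1, M1, m2, M2):
    """W_2 between N(m1, M1) and N(m2, M2) on R^2, following Theorem 2.22."""
    mean_part = sum((a - b) ** 2 for a, b in zip(m1, m2))
    tr1 = M1[0][0] + M1[1][1]
    tr2 = M2[0][0] + M2[1][1]
    det1 = M1[0][0] * M1[1][1] - M1[0][1] * M1[1][0]
    det2 = M2[0][0] * M2[1][1] - M2[0][1] * M2[1][0]
    # Trace(M1 M2) equals Trace(sqrt(M1) M2 sqrt(M1)).
    tr12 = sum(M1[i][j] * M2[j][i] for i in range(2) for j in range(2))
    cross = trace_of_sqrt_2x2(tr12, det1 * det2)
    return math.sqrt(max(mean_part + tr1 + tr2 - 2.0 * cross, 0.0))

# Hypothetical parameters: same covariance, means differing by (3, 4).
M = [[2.0, 0.5], [0.5, 1.0]]
print(w2_gaussian_2d((0.0, 0.0), M, (3.0, 4.0), M))
```

With equal covariance matrices the covariance terms cancel, so the distance reduces to |m_1 − m_2|, in agreement with the translation property of the W_p-metrics.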

Theorem 2.23 (R. L. Dobrushin, 1970). (1) For bounded ρ, W is equivalent to the Lévy-Prohorov distance.


(2) For discrete distance ρ, W = ‖ · ‖Var/2.
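Part (2) can be checked by hand on a finite state space. The sketch below builds a coupling of two probability vectors that puts mass p ∧ q on the diagonal and distributes the remaining mass as a product of the normalized remainders (p − q)^+ and (q − p)^+, the same recipe as in the basic coupling. The function name and the two distributions are hypothetical; exact fractions are used so that the identities hold without rounding.

```python
from fractions import Fraction as Fr

def maximal_coupling(p, q):
    """Coupling of p and q with the largest possible mass on the diagonal."""
    states = sorted(set(p) | set(q))
    common = {s: min(p.get(s, 0), q.get(s, 0)) for s in states}
    overlap = sum(common.values())
    joint = {(s, s): common[s] for s in states if common[s] > 0}
    if overlap < 1:
        for s in states:
            for t in states:
                mass = (p.get(s, 0) - common[s]) * (q.get(t, 0) - common[t]) / (1 - overlap)
                if mass > 0:
                    joint[(s, t)] = joint.get((s, t), 0) + mass
    return joint

# Hypothetical distributions on {0, 1, 2}.
p = {0: Fr(1, 2), 1: Fr(1, 3), 2: Fr(1, 6)}
q = {0: Fr(1, 4), 1: Fr(1, 4), 2: Fr(1, 2)}
joint = maximal_coupling(p, q)

off_diagonal = sum(m for (s, t), m in joint.items() if s != t)
half_variation = sum(abs(p[s] - q[s]) for s in p) / 2
print(off_diagonal, half_variation)
```

For the discrete distance, the integral of ρ against a coupling is the mass off the diagonal, so the two printed numbers agreeing is exactly W = ‖ · ‖Var/2 for this pair.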

Fortunately, in most cases, what we need is only certain estimates of upper bounds. Clearly, any coupling provides an upper bound of W(P1, P2). Thus, it is very natural to introduce the following notion.

Definition 2.24. A coupling P of P1 and P2 is called ρ-optimal if

\[ \int \rho(x_1, x_2)\,P(dx_1, dx_2) = W(P_1, P_2). \]

Now, it is natural to define the optimal coupling for time-discrete Markov processes without restricting to the Markovian class. In the special case of ρ being the discrete metric (or equivalently, restricted to the total variation), it is just the maximal coupling, introduced by D. Griffeath (1978). However, it is well known that the maximal couplings are usually non-Markovian. Even though the maximal couplings, as well as other non-Markovian couplings, now constitute an important part of the theory and have been widely studied in the literature (refer to T. Lindvall (1992) and references therein), they are difficult to handle, especially when we come to the time-continuous situation. Moreover, it will be clear soon that in the context of diffusions, when dealing with the optimal Markovian coupling in terms of operators, the discrete metric loses its meaning. Thus, our optimal Markovian couplings are essentially different from the maximal ones. It should also be pointed out that the sharp estimates introduced in Chapter 1 are obtained from the exponential rate in the W-metric with respect to some much more refined metric ρ rather than the discrete one. Replacing P_k and P with P_k(t) and P(t), respectively, and then going to the operators, it is not far to arrive at the following notion (cf. Chen (1996) for details):

Definition 2.25. A coupling operator $\overline\Omega$ is called ρ-optimal if

\[ \overline\Omega\,\rho(x_1, x_2) = \inf_{\widetilde\Omega}\widetilde\Omega\,\rho(x_1, x_2) \]

for all x1 ≠ x2, where $\widetilde\Omega$ varies over all coupling operators.

To see the notion is useful, let us introduce one more coupling.

Example 2.26 (Coupling by reflection Ωr). Given a birth–death process with birth rates b_i and death rates a_i, this coupling evolves in the following way. If i2 = i1 + 1, then

(i1, i2) → (i1 − 1, i2 + 1) at rate a_{i1} ∧ b_{i2}
         → (i1 + 1, i2) at rate b_{i1}
         → (i1, i2 − 1) at rate a_{i2}.

If i2 ≥ i1 + 2, then

(i1, i2) → (i1 − 1, i2 + 1) at rate a_{i1} ∧ b_{i2}
         → (i1 + 1, i2 − 1) at rate b_{i1} ∧ a_{i2}.

By symmetry, we can write down the rates for the other case that i1 > i2.


Intuitively, the reflection in the outside direction is quite strange, since it moves the components apart by distance 2 rather than 1. For this reason, even though the coupling came to our attention years ago, we never believed that it could be better than the coupling by inner reflection. But the next result changed our mind.

Theorem 2.27 (Chen, 1994b). For birth–death processes, the coupling by reflection is ρ-optimal for any translation-invariant metric ρ on Z_+ having the property:

u_k := ρ(0, k + 1) − ρ(0, k), k ≥ 0,

is non-increasing in k.

To see that the optimal coupling depends heavily on the metric ρ, note that the above metric ρ can be rewritten as

\[ \rho(i, j) = \sum_{k < |i - j|} u_k \]

for some positive non-increasing sequence (u_k). In this way, for any positive sequence (u_k), we can introduce another metric as follows:

\[ \rho(i, j) = \Big|\sum_{k < i} u_k - \sum_{k < j} u_k\Big|. \]

Because (u_k > 0) is arbitrary, this class of metrics is still quite large. Now, among the couplings listed above, which one is ρ-optimal?

Theorem 2.28 (Chen, 1994b). For birth–death processes, every coupling mentioned above except the trivial one is ρ-optimal.

This result is again quite surprising, far from our probabilistic intuition. Thus, our optimality does produce some unexpected results.
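Theorems 2.27 and 2.28 can be explored numerically by applying each coupling operator to ρ at a point (i1, i2) with i1 < i2. The sketch below is an illustration, not the author's computation: the rates a_i, b_i and the weights u_k are hypothetical, and each coupling is encoded only through its simultaneous jumps, with the single-component jumps filled in by marginality (the (A − B)^+ terms).

```python
def rates(i, a, b):
    """Marginal birth-death rates out of state i, keyed by jump size."""
    return {+1: b(i), -1: a(i) if i >= 1 else 0.0}

def joint_moves(name, i1, i2, a, b):
    """Rates of simultaneous jumps (d1, d2) from (i1, i2), i1 < i2."""
    q1, q2 = rates(i1, a, b), rates(i2, a, b)
    far = i2 >= i1 + 2
    inward = {(+1, -1): min(q1[+1], q2[-1])}
    outward = {(-1, +1): min(q1[-1], q2[+1])}
    together = {(+1, +1): min(q1[+1], q2[+1]), (-1, -1): min(q1[-1], q2[-1])}
    table = {
        "classical": {},
        "basic": inward if i2 == i1 + 2 else {},   # same target only if i2 = i1 + 2
        "marching": together,
        "modified_marching": together if far else {},
        "inner_reflection": inward if far else {},
        "reflection": {**outward, **(inward if far else {})},
    }
    return table[name]

def apply_coupling(rho, name, i1, i2, a, b):
    """Value of the coupling operator applied to rho at (i1, i2)."""
    q1, q2 = rates(i1, a, b), rates(i2, a, b)
    joint = joint_moves(name, i1, i2, a, b)
    moves = dict(joint)
    for d in (+1, -1):   # leftover single-component jumps
        moves[(d, 0)] = q1[d] - sum(r for (d1, _), r in joint.items() if d1 == d)
        moves[(0, d)] = q2[d] - sum(r for (_, d2), r in joint.items() if d2 == d)
    here = rho(i1, i2)
    return sum(r * (rho(i1 + d1, i2 + d2) - here) for (d1, d2), r in moves.items())

NAMES = ["classical", "basic", "marching", "modified_marching",
         "inner_reflection", "reflection"]

# Hypothetical rates and weights.
a = lambda i: 2.0 * i
b = lambda i: 1.0 + 0.5 * i
u = lambda k: 1.0 + (k % 3)          # arbitrary positive weights
v = lambda k: 1.0 / (k + 1)          # positive, non-increasing weights

def rho_28(i, j):                    # metric of Theorem 2.28
    return abs(sum(u(k) for k in range(i)) - sum(u(k) for k in range(j)))

def rho_27(i, j):                    # translation-invariant metric of Theorem 2.27
    return sum(v(k) for k in range(abs(i - j)))

for name in NAMES:
    print(name, apply_coupling(rho_28, name, 1, 2, a, b),
          apply_coupling(rho_27, name, 1, 4, a, b))
```

In this experiment all six couplings give the same value for the metric of Theorem 2.28, while for the translation-invariant metric the coupling by reflection gives the smallest value, consistent with the ordering of couplings discussed above.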

We are now ready to study the optimal couplings for diffusion processes.

Definition 2.29. Given ρ ∈ C²(R^d × R^d \ {(x, x) : x ∈ R^d}), a coupling operator $\overline L$ is called ρ-optimal if

\[ \overline L\rho(x, y) = \inf_{\widetilde L}\widetilde L\rho(x, y), \qquad x \ne y, \]

where $\widetilde L$ varies over all coupling operators.

For the underlying Euclidean distance | · | in R^d, we introduce a family of distances as follows:

ρ(x, y) = f(|x − y|), where f(0) = 0, f′ > 0 and f″ ≤ 0. (2.4)

For ρ to be a distance, the first two conditions on f are necessary, and the third condition guarantees the triangle inequality.


Theorem 2.30 (Chen, 1994b). Let ρ(x, y) = f(|x − y|) for some f ∈ C²(R_+; R_+) satisfying (2.4). Then the ρ-optimal solution c(x, y) is given as follows.

(1) If d = 1, then c(x, y) = −√(a_1(x)a_2(y)) and moreover,

\[
\overline L f(|x - y|) = \frac{1}{2}\Big(\sqrt{a_1(x)} + \sqrt{a_2(y)}\Big)^2 f''(|x - y|) + \frac{(x - y)(b_1(x) - b_2(y))}{|x - y|}\,f'(|x - y|).
\]

Next, suppose that a_k = σ_k² (k = 1, 2) is non-degenerate and write

c(x, y) = σ_1(x) H*(x, y) σ_2(y).

(2) If f″(r) < 0 for all r > 0, then H(x, y) = U(γ)^{−1}[U(γ)U(γ)*]^{1/2}, where

\[
\gamma = 1 - \frac{|x - y|\,f''(|x - y|)}{f'(|x - y|)}, \qquad U(\gamma) = \sigma_1(x)(I - \gamma uu^*)\sigma_2(y).
\]

(3) If f(r) = r, then H(x, y) is a solution to the equation

\[ U(1)H = \big(U(1)U(1)^*\big)^{1/2}. \]

In particular, suppose that a_k(x) = ϕ_k(x)σ² for some positive function ϕ_k (k = 1, 2), where σ is independent of x and det σ > 0. Then

(4) H(x, y) = I − 2σ^{−1}uu*σ^{−1}/|σ^{−1}u|² if ρ(x, y) = |x − y|. Moreover,

\[
\overline L f(|x - y|) = \frac{1}{2|x - y|}\Big\{\Big(\sqrt{\varphi_1(x)} - \sqrt{\varphi_2(y)}\Big)^2\big[\operatorname{Trace}\sigma^2 - |\sigma u|^2\big] + 2\big\langle x - y,\ b_1(x) - b_2(y)\big\rangle\Big\}.
\]

(5) The last assertion also holds if ρ(x, y) = |x − y| is replaced by ρ(x, y) = f(|σ^{−1}(x − y)|). Furthermore,

\[
\begin{aligned}
\overline L\rho(x, y) ={}& \frac{1}{2}\Big(\sqrt{\varphi_1(x)} + \sqrt{\varphi_2(y)}\Big)^2 f''(|\sigma^{-1}(x - y)|)\\
&+ \Big\{(d - 1)\Big(\sqrt{\varphi_1(x)} - \sqrt{\varphi_2(y)}\Big)^2 + 2\big\langle\sigma^{-1}(x - y),\ \sigma^{-1}(b_1(x) - b_2(y))\big\rangle\Big\}
\times\frac{f'(|\sigma^{-1}(x - y)|)}{2|\sigma^{-1}(x - y)|}.
\end{aligned}
\]
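As a sanity check of part (1), the one-dimensional case can be worked out by hand. The following is only a sketch, under the assumption that for d = 1 a coupling operator reads $\widetilde L = \frac12(a_1\partial_{xx} + 2c\,\partial_{xy} + a_2\partial_{yy}) + b_1\partial_x + b_2\partial_y$, as suggested by the coefficient matrix a(x, y) of Section 2.1; the symbols F and s are introduced here for illustration.

```latex
% F(x,y) = f(|x-y|) for x \ne y, and s = \operatorname{sgn}(x-y).
\begin{align*}
\partial_x F &= s\,f'(|x-y|), & \partial_y F &= -s\,f'(|x-y|),\\
\partial_{xx} F &= \partial_{yy} F = f''(|x-y|), & \partial_{xy} F &= -f''(|x-y|),
\end{align*}
so that
\[
\widetilde L F(x,y)
 = \tfrac12\big(a_1(x) + a_2(y) - 2c(x,y)\big) f''(|x-y|)
 + s\,\big(b_1(x) - b_2(y)\big) f'(|x-y|).
\]
% Non-negative definiteness of a(x,y) forces c^2 \le a_1 a_2. Since f'' \le 0,
% the right-hand side is smallest when c is as negative as possible:
\[
c(x,y) = -\sqrt{a_1(x)a_2(y)}, \qquad
a_1 + a_2 + 2\sqrt{a_1 a_2} = \big(\sqrt{a_1} + \sqrt{a_2}\big)^2 .
\]
```

Since s = (x − y)/|x − y|, this reproduces the two terms displayed in part (1).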


2.3 Optimality with respect to lower semi-continuous functions

As an extension of the optimal couplings with respect to distances, we can consider the optimal couplings with respect to a more general, nonnegative, lower semi-continuous function ϕ.

Definition 2.31. Given a metric space (E, ρ, ℰ), let ϕ be a nonnegative, lower semi-continuous function on (E, ρ, ℰ). A coupling is called a ϕ-optimal (Markovian) coupling if, in the definitions given in the last section, the distance function ρ is replaced by ϕ.

Here are some typical examples of ϕ.

Example 2.32. (1) ϕ is a distance of the form f ∘ ρ for some f having the properties f(0) = 0, f′ > 0 and f″ ≤ 0.

(2) ϕ is the discrete distance: ϕ(x, y) = 1 iff x ≠ y; otherwise, ϕ(x, y) = 0.

(3) Let E be endowed with a measurable semi-order “≺” and set F = {(x, y) : x ≺ y}. Then F is a closed set. Take ϕ = I_{F^c}.

Before moving further, let us recall the stochastic comparability.

Definition 2.33. Let M be the set of bounded monotone functions f: x ≺ y ⟹ f(x) ≤ f(y).

(1) We write µ1 ≺ µ2 if µ1(f) ≤ µ2(f) for all f ∈ M.

(2) Let P1 and P2 be transition probabilities. We write P1 ≺ P2 if P1(f)(x1) ≤ P2(f)(x2) for all x1 ≺ x2 and f ∈ M.

(3) Let P1(t) and P2(t) be transition semigroups. We write P1(t) ≺ P2(t) if P1(t)(f)(x1) ≤ P2(t)(f)(x2) for all t ≥ 0, x1 ≺ x2 and f ∈ M.

Here is a famous result about the stochastic comparability.

Theorem 2.34 (V. Strassen, 1965). For a Polish space, µ1 ≺ µ2 iff there exists a coupling measure µ such that µ(F^c) = 0.

Usually, in practice, it is not easy to compare two measures directly. For this reason, one introduces the stochastic comparability for processes. First, one constructs two processes with the stationary distributions µ1 and µ2, respectively. Then the stochastic comparability of the two measures can be reduced to that of the processes. The advantage of the latter comparison comes from the intuition of the stochastic dynamics. One can even see the answer from the coefficients of the operators. See Examples 2.44–2.46 below.

As a typical application of the ϕ-optimal coupling, we have the following result.


Theorem 2.35 (S. Y. Zhang, 2000a). Let (E, ρ, ℰ) be Polish and ϕ ≥ 0 be a lower semi-continuous function.

(1) Given P_k(x_k, dy_k), k = 1, 2, there exists a coupling $\overline P(x_1, x_2; dy_1, dy_2)$ such that $\overline P\varphi = \inf_{\widetilde P}\widetilde P\varphi$.

(2) Given operators Ω_k of regular jump processes, k = 1, 2, there exists a coupling operator $\overline\Omega$ of jump process such that $\overline\Omega\varphi = \inf_{\widetilde\Omega}\widetilde\Omega\varphi$.

According to Theorem 2.35 (1), Strassen's theorem can be re-expressed as the existence of an I_{F^c}-optimal Markovian coupling satisfying µ(F^c) = 0. Even though the proof of Theorem 2.35 is quite technical, the main route is still clear. Consider first a finite state space; then the conclusion follows from an existence theorem of linear programming, regarding the marginality as a constraint. Next, pass to the general Polish space by using a tightness argument plus the first moment condition with respect to ρ.

Concerning the stochastic comparability, we have

Theorem 2.36 (Chen (1992a), Zhang (2000b)). For jump processes on a Polish space, under a mild assumption, P1(t) ≺ P2(t) iff

Ω1 I_B(x1) ≤ Ω2 I_B(x2)

for all x1 ≺ x2 and B ∈ M.

Here, we mention an additional result, which provides the optimal solutions within the class of order-preserving couplings.

Theorem 2.37 (T. Lindvall, 1999). Again, let ∆ denote the diagonal.

(1) Let µ1 ≺ µ2. Then

\[ \inf_{\mu(F^c) = 0}\mu(\Delta^c) = \frac{1}{2}\|\mu_1 - \mu_2\|_{\mathrm{Var}}. \]

(2) Let P1 and P2 be transition probabilities satisfying P1 ≺ P2. Then

\[ \inf_{P(x_1, x_2;\, F^c) = 0} P(x_1, x_2;\, \Delta^c) = \frac{1}{2}\|P_1(x_1, \cdot) - P_2(x_2, \cdot)\|_{\mathrm{Var}} \]

for all x1 ≺ x2.

Open Problem 2.38. Let ϕ ∈ C²(R^{2d} \ ∆). Prove the existence of ϕ-optimal Markovian couplings for diffusions under some reasonable hypotheses.

We remark that in general a ϕ-optimal Markovian coupling may not exist in the context of diffusions. Refer to F. Y. Wang and M. P. Xu (1997).

Open Problem 2.39. Construct ϕ-optimal Markovian couplings.


2.4 Applications of the coupling methods

It should be helpful for the readers, especially for the newcomers, to see some applications of couplings. Of course, the applications discussed below cannot be complete; additional applications will be presented in Chapters 3, 5 and 9. One may refer to T. M. Liggett (1985) and T. Lindvall (1992) for much more information.

Spectral gap. Exponential L2-convergence

We introduce two general results, due to Chen and F. Y. Wang (1993b) [see also Chen (1994a)], on the estimation of the first non-trivial eigenvalue (spectral gap) by couplings.

Definition 2.40. Let L be an operator of a Markov process (X_t)_{t≥0}. We say that a function f is in the weak domain of L, denoted by D_w(L), if f satisfies the forward Kolmogorov equation

\[ \mathbb{E}^x f(X_t) = f(x) + \int_0^t \mathbb{E}^x Lf(X_s)\,ds, \]

or equivalently,

\[ f(X_t) - \int_0^t Lf(X_s)\,ds \]

is a P^x-martingale with respect to the natural flow of σ-algebras {F_t := σ{X_s : s ≤ t}}_{t≥0}.

Definition 2.41. We say that g is an eigenfunction of L corresponding to λ in the weak sense if g satisfies the eigen-equation Lg = −λg pointwise.

Note that the eigenfunction defined above may not belong to L2(π).

Theorem 2.42. Let (E, ρ) be a metric space and let {X_t}_{t≥0} be a reversible Markov process with operator L. Denote by g the eigenfunction corresponding to λ1 in the weak sense. Next, let (X_t, Y_t) be the coupled process, starting from (x, y), with coupling operator $\widetilde L$, and let γ : E × E → [0, ∞) satisfy γ(x, y) = 0 iff x = y. Suppose that

(1) g ∈ D_w(L).

(2) γ ∈ D_w($\widetilde L$).

(3) $\widetilde L$γ(x, y) ≤ −αγ(x, y) for all x ≠ y.

(4) g is Lipschitz with respect to γ in the sense that

\[ c_{g,\gamma} := \sup_{y \ne x}\frac{|g(y) - g(x)|}{\gamma(y, x)} < \infty. \]


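Then, we have λ1 ≥ α.

Proof. By conditions (2), (3) and Lemma A.6, we have

\[ \mathbb{E}^{x,y}\gamma(X_t, Y_t) \le \gamma(x, y)\,e^{-\alpha t}, \qquad t \ge 0. \]

Next, by condition (1) and the definition of g,

\[ g(X_t) - \int_0^t Lg(X_s)\,ds = g(X_t) + \lambda_1\int_0^t g(X_s)\,ds \]

is a P^x-martingale with respect to the natural flow of σ-algebras {F_t}_{t≥0}. In particular,

\[ g(x) = \mathbb{E}^x\Big[g(X_t) + \lambda_1\int_0^t g(X_s)\,ds\Big]. \]

Because of the coupling property,

\[ \mathbb{E}^x\Big[g(X_t) + \lambda_1\int_0^t g(X_s)\,ds\Big] = \mathbb{E}^{x,y}\Big[g(X_t) + \lambda_1\int_0^t g(X_s)\,ds\Big]. \]

Thus, we obtain

\[ g(x) - g(y) = \mathbb{E}^{x,y}\Big[g(X_t) - g(Y_t) + \lambda_1\int_0^t \big[g(X_s) - g(Y_s)\big]\,ds\Big]. \]

Therefore,

\[
\begin{aligned}
|g(x) - g(y)| &\le \mathbb{E}^{x,y}\big|g(X_t) - g(Y_t)\big| + \lambda_1\,\mathbb{E}^{x,y}\int_0^t |g(X_s) - g(Y_s)|\,ds\\
&\le c_{g,\gamma}\,\mathbb{E}^{x,y}\gamma(X_t, Y_t) + \lambda_1 c_{g,\gamma}\,\mathbb{E}^{x,y}\int_0^t \gamma(X_s, Y_s)\,ds\\
&\le c_{g,\gamma}\,\gamma(x, y)\,e^{-\alpha t} + \lambda_1 c_{g,\gamma}\,\gamma(x, y)\int_0^t e^{-\alpha s}\,ds.
\end{aligned}
\]

Noting that g is not a constant, we have c_{g,γ} ≠ 0. Dividing both sides by γ(x, y) and choosing a sequence (x_n, y_n) so that |g(y_n) − g(x_n)|/γ(y_n, x_n) → c_{g,γ}, we obtain

\[ 1 \le e^{-\alpha t} + \lambda_1\int_0^t e^{-\alpha s}\,ds = e^{-\alpha t} + \lambda_1\big(1 - e^{-\alpha t}\big)/\alpha \]

for all t. This implies that λ1 ≥ α, as required.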
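The last implication is a one-line rearrangement. A sketch, assuming α > 0 (for α ≤ 0 the conclusion λ1 ≥ α is trivial, since λ1 ≥ 0):

```latex
\[
1 \le e^{-\alpha t} + \frac{\lambda_1}{\alpha}\big(1 - e^{-\alpha t}\big)
\iff
1 - e^{-\alpha t} \le \frac{\lambda_1}{\alpha}\big(1 - e^{-\alpha t}\big).
\]
% For t > 0 and \alpha > 0 we have 1 - e^{-\alpha t} > 0, so dividing by it gives
\[
1 \le \frac{\lambda_1}{\alpha}, \qquad\text{that is,}\qquad \lambda_1 \ge \alpha .
\]
```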

Condition (3) in Theorem 2.42 is essential. The other conditions can often be relaxed or avoided by using a localizing procedure. The next, weaker result is also useful; it has a different meaning, as will be explained in Chapter 5. Define the coupling time T = inf{t ≥ 0 : X_t = Y_t}.

Theorem 2.43. Let {X_t}_{t≥0}, L, λ1 and g be the same as in the last theorem and set f(x, y) = g(x) − g(y). Suppose that


(1) g ∈ D_w(L).

(2) sup_{x≠y} |g(x) − g(y)| < ∞.

Then for every coupling P^{x,y}, we have λ1 ≥ (sup_{x≠y} E^{x,y} T)^{−1}.

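Proof. By the martingale formulation as we did in the last proof, and since $\widetilde L f(x, y) = Lg(x) - Lg(y) = -\lambda_1 f(x, y)$, we have

\[
\begin{aligned}
f(x, y) &= \mathbb{E}^{x,y} f\big(X_{t\wedge T}, Y_{t\wedge T}\big) - \mathbb{E}^{x,y}\int_0^{t\wedge T}\widetilde L f\big(X_s, Y_s\big)\,ds\\
&= \mathbb{E}^{x,y} f\big(X_{t\wedge T}, Y_{t\wedge T}\big) + \lambda_1\,\mathbb{E}^{x,y}\int_0^{t\wedge T} f\big(X_s, Y_s\big)\,ds.
\end{aligned}
\]

Hence

\[ |g(x) - g(y)| \le \mathbb{E}^{x,y}\big|g(X_{t\wedge T}) - g(Y_{t\wedge T})\big| + \lambda_1\,\mathbb{E}^{x,y}\int_0^{t\wedge T}\big|g(X_s) - g(Y_s)\big|\,ds. \]

Assume that sup_{x≠y} E^{x,y} T < ∞, and so P^{x,y}[T < ∞] = 1. Letting t ↑ ∞, we obtain

\[ |g(x) - g(y)| \le \lambda_1\,\mathbb{E}^{x,y}\int_0^{T}\big|g(X_s) - g(Y_s)\big|\,ds. \]

Choose x_n and y_n such that

\[ \lim_{n\to\infty}|g(x_n) - g(y_n)| = \sup_{x, y}|g(x) - g(y)|. \]

Without loss of generality, assume that sup_{x,y} |g(x) − g(y)| = 1. Then

\[ 1 \le \lambda_1\lim_{n\to\infty}\mathbb{E}^{x_n, y_n}(T). \]

Therefore, 1 ≤ λ1 sup_{x≠y} E^{x,y} T.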

For the remainder of this section, we emphasize the main ideas by using some simple examples. In particular, from now on, the metric is taken to be ρ(x, y) = |x − y|. That is, f(r) = r. In view of Theorem 2.30, this metric may not be optimal since f″ = 0. Thus, in practice, additional work is often needed in order to figure out an effective metric ρ. The details will be discussed in the next chapter.

To conclude this subsection, let us consider the Ornstein-Uhlenbeck process in R^d. By Theorem 2.30 (4), we have Lρ(x, y) ≤ −ρ(x, y) and so

E^{x,y} ρ(X_t, Y_t) ≤ ρ(x, y) e^{−t}. (2.5)

By using Theorem 2.42 with the help of a localizing procedure, this gives us λ1 ≥ 1, which is indeed exact!
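The decay in (2.5) can be observed in a simulation. The sketch below is an illustration under explicit assumptions: it treats the one-dimensional Ornstein-Uhlenbeck process with drift b(x) = −x, uses an Euler scheme, and couples the two copies by the coupling of marching soldiers (both copies driven by the same Brownian increments) rather than by the optimal coupling of Theorem 2.30. For the linear metric the shared noise cancels in the difference, so only the drift acts on X_t − Y_t. The function name and all parameters are hypothetical.

```python
import math
import random

def coupled_ou(x, y, t_end=1.0, h=1e-3, seed=0):
    """Euler scheme for two OU copies driven by the same Brownian increments."""
    rng = random.Random(seed)
    for _ in range(round(t_end / h)):
        noise = math.sqrt(2.0 * h) * rng.gauss(0.0, 1.0)   # shared increment
        x += -x * h + noise
        y += -y * h + noise
    return x, y

x0, y0, t_end = 2.0, -1.0, 1.0
x1, y1 = coupled_ou(x0, y0, t_end)
print(abs(x1 - y1), abs(x0 - y0) * math.exp(-t_end))
```

The two printed numbers should nearly agree: in the Euler scheme the gap is multiplied by (1 − h) at each step, which approximates the factor e^{−t} in (2.5).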


Ergodicity

The coupling methods are often used to study the ergodicity of Markov processes. For instance, for the Ornstein-Uhlenbeck process, from (2.5), it follows that

W(P(t, x, ·), π) ≤ C(x) e^{−t}, t ≥ 0, (2.6)

where π is the stationary distribution of the process. The estimate (2.6) simply means that the process is exponentially ergodic with respect to W.

Recall that the coupling time is T = inf{t ≥ 0 : X_t = Y_t}. Starting from time T, we can adopt the coupling of marching soldiers so that the two components will move together. Then, we have

‖P(t, x, ·) − P(t, y, ·)‖Var ≤ 2 E^{x,y} I_{[X_t ≠ Y_t]} = 2 P^{x,y}[T > t]. (2.7)

Thus, if P^{x,y}[T > t] → 0 as t → ∞, then the existence of a stationary distribution plus (2.7) gives us the ergodicity with respect to the total variation. See T. Lindvall (1992) for details and references on this topic. Actually, for Brownian motion, as pointed out in Chen and S. F. Li (1989), the coupling by reflection provides the sharp estimate for the total variation. We will come back to this topic in Chapter 5.
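The coupling inequality (2.7) can be verified exactly in a small example. The sketch below is a time-discrete analogue, which is an assumption made for simplicity: it takes a hypothetical 3-state transition matrix, runs the classical coupling (independent moves until the copies meet, then together), and compares the total variation norm of the difference of the two laws with twice the probability that the copies have not met yet.

```python
# Hypothetical 3-state transition matrix, chosen only for illustration.
P = [[0.5, 0.5, 0.0],
     [0.25, 0.5, 0.25],
     [0.0, 0.5, 0.5]]
N = len(P)

def step(law):
    """One step of the chain applied to a row vector."""
    return [sum(law[i] * P[i][j] for i in range(N)) for j in range(N)]

def variation_and_tail(x, y, t):
    """Return (||P^t(x,.) - P^t(y,.)||_Var, P[T > t]) for the classical coupling."""
    law_x = [float(i == x) for i in range(N)]
    law_y = [float(i == y) for i in range(N)]
    unmet = {(x, y): 1.0} if x != y else {}   # mass of pairs that have not met
    for _ in range(t):
        law_x, law_y = step(law_x), step(law_y)
        nxt = {}
        for (i, j), mass in unmet.items():
            for k in range(N):
                for l in range(N):
                    if k != l:                # independent moves, still apart
                        nxt[(k, l)] = nxt.get((k, l), 0.0) + mass * P[i][k] * P[j][l]
        unmet = nxt
    variation = sum(abs(p - q) for p, q in zip(law_x, law_y))
    return variation, sum(unmet.values())

for t in (0, 1, 2, 5, 10):
    print(t, variation_and_tail(0, 2, t))
```

In each printed pair the first entry is bounded by twice the second, and both tend to zero, which is the mechanism behind ergodicity in total variation described above.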

Gradient estimate

Recall that for every suitable function f, we have

\[ f(x) - f(y) = \mathbb{E}^{x,y}\big[f(X_{t\wedge T}) - f(Y_{t\wedge T})\big] - \mathbb{E}^{x,y}\int_0^{t\wedge T}\big[Lf(X_s) - Lf(Y_s)\big]\,ds. \]

Thus, if f is L-harmonic, i.e., Lf = 0, then we have

\[ f(x) - f(y) = \mathbb{E}^{x,y}\big[f(X_{t\wedge T}) - f(Y_{t\wedge T})\big]. \]

Hence

|f(x) − f(y)| ≤ 2 ‖f‖∞ P^{x,y}[T > t].

Letting t → ∞, we obtain

|f(x) − f(y)| ≤ 2 ‖f‖∞ P^{x,y}[T = ∞].

Now, if f is bounded and P^{x,y}[T = ∞] = 0, then f = constant. Otherwise, if P^{x,y}[T = ∞] ≤ constant · ρ(x, y), then we get

‖∇f‖∞ ≤ constant · ‖f‖∞,

which is the gradient estimate we are looking for [cf. M. Cranston (1991; 1992) and F. Y. Wang (1994a; 1994b)]. For Brownian motion in R^d, the optimal coupling gives us P^{x,y}[T < ∞] = 1, and so f = constant. We have thus proved a well-known result: every bounded harmonic function is constant.


Comparison results

The stochastic order occupies a critical position in the study of probability theory, just as the usual order relation is a fundamental structure in mathematics.

The coupling method provides a natural way to study the order-preserving property (i.e., the stochastic comparability). Refer to Chen [1992, Chapter 5] for the study on jump processes. Here is an example for diffusions.

Example 2.44. Consider two diffusions in R with

a1(x) = a2(x) = a(x), b1(x) ≤ b2(x). (2.8)

Then, we have P1(t) ≺ P2(t).

The conclusion was proved in N. Ikeda and S. Watanabe [1981, Section 6.1] by using stochastic differential equations. The same proof with a slight modification works if we adopt the coupling of marching soldiers.

Actually, a criterion for the order-preservation for multidimensional diffusion processes was presented in Chen and F. Y. Wang (1993a). From it, we see that the condition (2.8) is not only sufficient but also necessary. A related topic, the preservation of positive correlations for diffusions, was also solved in the same paper, as mentioned at the beginning of this chapter.

To illustrate an application of the study, let us introduce a simple example.

Example 2.45. Let µ^λ be the Poisson measure on Z_+ with parameter λ:

\[ \mu^\lambda(k) = \frac{\lambda^k}{k!}e^{-\lambda}, \qquad k \ge 0. \]

Then, we have µ^λ ≺ µ^{λ′} whenever λ ≤ λ′.

In some publications, one proves this kind of result by constructing a coupling measure µ so that µ{(x, y) : x ≺ y} = 1. Of course, such a proof is lengthy. So we now introduce a very short proof based on the coupling argument.

Consider a birth–death process with rates

\[ a(k) \equiv 1, \qquad b^\lambda(k) = \frac{\mu^\lambda(k + 1)}{\mu^\lambda(k)} = \frac{\lambda}{k + 1} \uparrow \ \text{as } \lambda \uparrow. \]

Denote by P^λ(t) the corresponding process. It should be clear that

P^λ(t) ≺ P^{λ′}(t) whenever λ ≤ λ′

[cf. Chen (1992, Theorem 5.26)]. Then, by the ergodic theorem,

\[ \mu^\lambda(f) = \lim_{t\to\infty}P^\lambda(t)f \le \lim_{t\to\infty}P^{\lambda'}(t)f = \mu^{\lambda'}(f) \]

for all f ∈ M. Clearly, the technique of using stochastic processes [which goes back to R. Holley (1974)] provides an intrinsic insight into the order-preservation for probability measures.
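The conclusion of Example 2.45 is easy to check numerically. On Z_+ with the usual order, µ1 ≺ µ2 holds iff every upper tail µ1[k, ∞) is at most µ2[k, ∞), since the indicators of the sets [k, ∞) generate the bounded monotone functions. The sketch below compares truncated tails; the function names, the truncation level and the tolerance are assumptions made for illustration.

```python
import math

def poisson_pmf(lam, n_terms):
    """First n_terms Poisson probabilities, computed iteratively."""
    pmf, term = [], math.exp(-lam)
    for k in range(n_terms):
        pmf.append(term)
        term *= lam / (k + 1)     # mu(k+1) / mu(k) = lam / (k+1)
    return pmf

def upper_tails(pmf):
    """tails[k] = sum of pmf[k:], accumulated from the right."""
    tails, acc = [], 0.0
    for p in reversed(pmf):
        acc += p
        tails.append(acc)
    return tails[::-1]

def dominated(lam, lam_prime, n_terms=60):
    """Check mu^lam[k, oo) <= mu^lam'[k, oo) for all k below the truncation."""
    t1 = upper_tails(poisson_pmf(lam, n_terms))
    t2 = upper_tails(poisson_pmf(lam_prime, n_terms))
    return all(s <= t + 1e-12 for s, t in zip(t1, t2))

print(dominated(1.0, 2.5), dominated(2.5, 1.0))
```

The iterative step in `poisson_pmf` uses the same ratio λ/(k + 1) that defines the birth rate b^λ(k) in the example.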


One aspect of the applications of the coupling method is to compare a rather complicated process with a simpler one. To give some impression, we introduce an example which was used by Chen and Y. G. Lu (1990) in the study of large deviations for Markov chains.

Example 2.46. Consider a single birth Q-matrix Q = (q_{ij}), which means that

q_{i,i+1} > 0 and q_{ij} = 0 for all j > i + 1,

and a birth–death Q-matrix $\overline Q = (\overline q_{ij})$ with $\overline q_{i,i-1} = \sum_{j<i} q_{ij}$. If $\overline q_{i,i+1} \ge q_{i,i+1}$ for all i ≥ 0, then $P(t) \prec \overline P(t)$.

The conclusion can be easily deduced by the following coupling:

\[
\begin{aligned}
(i_1, i_2) &\to (i_1 - k,\ i_2 - k) &&\text{at rate } q_{i_1, i_1-k} \wedge \overline q_{i_2, i_2-k}\\
&\to (i_1 - k,\ i_2) &&\text{at rate } \big(q_{i_1, i_1-k} - \overline q_{i_2, i_2-k}\big)^+\\
&\to (i_1,\ i_2 - k) &&\text{at rate } \big(\overline q_{i_2, i_2-k} - q_{i_1, i_1-k}\big)^+\\
&\to (i_1 + 1,\ i_2 + 1) &&\text{at rate } q_{i_1, i_1+1} \wedge \overline q_{i_2, i_2+1}\\
&\to (i_1 + 1,\ i_2) &&\text{at rate } \big(q_{i_1, i_1+1} - \overline q_{i_2, i_2+1}\big)^+\\
&\to (i_1,\ i_2 + 1) &&\text{at rate } \big(\overline q_{i_2, i_2+1} - q_{i_1, i_1+1}\big)^+.
\end{aligned}
\]

Here we have used the convention q_{ij} = 0 if j < 0. Refer to Chen (1992a, Theorem 8.24) for details. This example illustrates the flexibility in the application of couplings.


Chapter 3

New Variational Formulas for the First Eigenvalue

This chapter is devoted to the proofs of the main variational formulas introduced in §1.1 and §1.2. Two quick proofs for the discrete case are given in §3.2. Then, three sections are used to explain in detail the ideas of the proof in the geometric case. In §3.6, we compare the coupling methods with other techniques. The last two sections are more technical. In §3.7 we show that the new variational formulas are indeed complete in dimension one. The results for the first Dirichlet eigenvalue are similar and are presented in §3.8 for the discrete case.

Let us begin with the background of the study on this topic.

3.1 Background

As mentioned in the first chapter, since spectral theory is a central part of each branch of mathematics and the first non-trivial eigenvalue is the leading term of the spectrum, it should not be surprising that the study of $\lambda_1$ has a very wide range of applications. Here we mention only two fashionable applications.

[Figure 3.1 The first eigenvalue and phase transitions. Horizontal axis: $\beta = 1/\text{temperature}$, with the critical value $\beta_c$ marked; the region $\lambda_1 > 0$ is indicated.]

Phase transitions

In the study of interacting particle systems, a physical model is described by a Markov process with semigroup $\{P_t\}_{t \ge 0}$ (depending on the temperature $1/\beta$) having stationary distribution $\pi$. Let $L^2(\pi)$ be the usual real $L^2$-space with norm $\|\cdot\|$.

Figure 3.1 means that at higher temperature (small $\beta$), the corresponding semigroup $\{P_t\}_{t \ge 0}$ is exponentially ergodic in the $L^2$-sense:

$$\|P_t f - \pi(f)\| \le \|f - \pi(f)\|\, e^{-\lambda_1 t},$$

where $\pi(f) = \int f \, d\pi$, with the largest rate $\lambda_1$, and when the temperature goes to the critical value $1/\beta_c$, the rate goes to zero. This provides a way to describe phase transitions, and it is now an active research field. Further remarks are given at the end of Chapter 9. The next application we would like to mention is

Markov chain Monte Carlo (MCMC)

Consider a function with several local minima. The usual algorithms move at each step to a place that decreases the value of the function. The problem is that one may fall into a local trap (Figure 3.2).

The MCMC algorithm avoids this by allowing some possibility of visiting other places, not only moving towards a local minimum. The random algorithm consists of two steps.

• Construct a distribution according to the local minima, staying at lower places with bigger probability, in terms of the Gibbs principle.

• Construct a Markov chain with the stationary distribution given above, and with a fast convergence rate (i.e., $\lambda_1$).

The idea is attractive since, for some NP-hard problems in computer science, it yields good approximate solutions in polynomial time. The effectiveness of a random algorithm is determined by $\lambda_1$ of the Markov chain. Refer to M. R. Jerrum and A. J. Sinclair (1989) and A. J. Sinclair (1993) for further information.
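The two steps above can be sketched on a toy problem. The code below builds a Gibbs distribution on a small one-dimensional landscape and a Metropolis chain that has it as stationary distribution. The landscape `energy`, the inverse temperature `beta` and the nearest-neighbour proposal are all hypothetical choices made for illustration; the Metropolis rule is one standard way to realize the second step, not necessarily the construction used in the references cited here.

```python
import math

# Hypothetical 1-D landscape with two local minima (indices 2 and 7).
energy = [3.0, 1.5, 0.5, 1.8, 2.5, 2.0, 1.0, 0.0, 1.2, 2.8]
beta = 2.0  # inverse temperature (assumed value)
n = len(energy)

# Step 1: Gibbs distribution, pi_i proportional to exp(-beta * energy_i).
weights = [math.exp(-beta * e) for e in energy]
z = sum(weights)
pi = [w / z for w in weights]

# Step 2: Metropolis chain with nearest-neighbour proposals.
P = [[0.0] * n for _ in range(n)]
for i in range(n):
    for j in (i - 1, i + 1):
        if 0 <= j < n:
            # Propose each neighbour with probability 1/2,
            # accept with probability min(1, pi_j / pi_i).
            P[i][j] = 0.5 * min(1.0, pi[j] / pi[i])
    P[i][i] = 1.0 - sum(P[i])  # rejected or out-of-range proposals stay put

# Detailed balance pi_i P_ij = pi_j P_ji implies that pi is stationary.
max_db_error = max(abs(pi[i] * P[i][j] - pi[j] * P[j][i])
                   for i in range(n) for j in range(n))
pi_P = [sum(pi[i] * P[i][j] for i in range(n)) for j in range(n)]
max_stationarity_error = max(abs(pi_P[j] - pi[j]) for j in range(n))
```

The chain puts the largest stationary mass on the global minimum while still being able to climb out of the local one. How fast it does so is governed by $\lambda_1$, which this sketch does not compute.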

[Figure 3.2 The first eigenvalue and random algorithm. Labels: local trap, MCMC.]

Here is a practical example, called the Travelling Salesman Problem: find the shortest closed path (without loops) through 144 cities in China. For a computer that evaluates $10^9$ paths per second, it requires

$$\frac{143!}{10^9 \times 365 \times 24 \times 60 \times 60} \approx 10^{231}$$

years for the computation. This is a typical NP-problem. However, by using MCMC, it can be done quickly, as was done in L. S. Kang et al (1994). The resulting best path has length 30421 kilometers, only about 40 kilometers different from the best known length: 30380 kilometers.
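The order of magnitude quoted for the exhaustive search can be reproduced with exact integer arithmetic. This is a small sketch using only the figures given in the text (143! paths, $10^9$ paths per second, 365-day years); the variable names are illustrative.

```python
import math

paths = math.factorial(143)                    # number of paths used in the text
paths_per_year = 10**9 * 365 * 24 * 60 * 60    # 10^9 paths per second
years = paths // paths_per_year
exponent = len(str(years)) - 1                 # order of magnitude of `years`
```

Here `exponent` is the power of ten in the leading term of `years`, which is the quantity written as $\approx 10^{231}$ above.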

3.2 Partial proof in discrete case

In this section, we introduce a short proof for the lower bounds of $\lambda_1$ in the discrete case. Even though the proof is very elementary, it does illustrate a good use of the Cauchy-Schwarz inequality. Recall that

$$\mu_0 = 1, \qquad \mu_i = \frac{b_0 \cdots b_{i-1}}{a_1 \cdots a_i}, \quad i \ge 1.$$

For an infinite matrix, we need the assumption

$$\sum_{k=0}^{\infty} \frac{1}{b_k \mu_k} \sum_{i=0}^{k} \mu_i = \infty \quad \text{and} \quad Z = \sum_{i=0}^{\infty} \mu_i < \infty. \tag{3.1}$$

Let

$$\pi_i = \mu_i / Z, \qquad L^2(\pi) = \{ f : \pi(f^2) < \infty \},$$

where $\pi(f) = \sum_{i=0}^{\infty} \pi_i f_i$. The first eigenvalue is defined by the classical variational formula as follows:

$$\lambda_1 = \inf\{ D(f) : \pi(f) = 0,\ \pi(f^2) = 1 \}, \tag{3.2}$$

where $D(f) = \sum_{i=0}^{\infty} \pi_i b_i (f_{i+1} - f_i)^2$. We are now going to prove the variational formula for the lower bounds (cf. Theorem 1.5):

$$\lambda_1 \ge \sup_{w \in \widetilde{\mathscr W}} \inf_{i \ge 0} I_i(w)^{-1} \ge \sup_{w \in \mathscr W} \inf_{i \ge 0} I_i(\bar w)^{-1}, \tag{3.3}$$

where

$$\mathscr W = \{ w : w_0 = 0,\ w_i \uparrow\uparrow \}, \qquad \widetilde{\mathscr W} = \{ w : w_i \uparrow\uparrow,\ \pi(w) \ge 0 \}, \qquad \bar w_i = w_i - \pi(w), \quad i \ge 0,$$

$$I_i(w) = \frac{1}{\mu_i b_i (w_{i+1} - w_i)} \sum_{j=i+1}^{\infty} \mu_j w_j, \qquad i \ge 0,\ w \in \widetilde{\mathscr W},$$

and “↑↑” means strictly increasing.

Analytic proof of (3.3)

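Clearly, it suffices to prove the first estimate in (3.3), since $\{ \bar w : w \in \mathscr W \} \subset \widetilde{\mathscr W}$.

(a) First, we prove that $I_i(w) > 0$ for each $w \in \widetilde{\mathscr W}$ and all $i \ge 0$. Equivalently, $\sum_{j=i+1}^{\infty} \mu_j w_j > 0$ for all $i \ge 0$. Otherwise, let $i_0$ satisfy $\sum_{j=i_0+1}^{\infty} \mu_j w_j \le 0$. Then, since $w_j$ is strictly increasing, it follows that $w_{i_0} < 0$, and furthermore

$$0 \le \sum_{j=0}^{\infty} \mu_j w_j = \sum_{j=0}^{i_0} \mu_j w_j + \sum_{j=i_0+1}^{\infty} \mu_j w_j \le \sum_{j=0}^{i_0} \mu_j w_j \le w_{i_0} \sum_{j=0}^{i_0} \mu_j < 0.$$

This is a contradiction.

(b) For each $i \ge 0$, define a bond $e_i := \langle i, i+1 \rangle$. Next, for each pair $i, j$ ($i < j$), define $\gamma_{ij}$ to be the (unique) path consisting of the bonds $e_i, e_{i+1}, \cdots, e_{j-1}$. Given $w \in \widetilde{\mathscr W}$, choose a positive weight function $(w(e))$ on the bonds: $w(e_i) = w_{i+1} - w_i$, and define the length of the path to be $|\gamma_{ij}|_w = \sum_{e \in \gamma_{ij}} w(e)$. Set

$$J(w)(e) = \frac{1}{a(e) w(e)} \sum_{\{i,j\}:\, \gamma_{ij} \ni e} |\gamma_{ij}|_w \pi_i \pi_j,$$

where $a(e_i) = \pi_i b_i$. At the same time, we write $f(e_i) = f_{i+1} - f_i$.

(c) As a good application of the Cauchy-Schwarz inequality, we have

$$(f_i - f_j)^2 = \Big( \sum_{e \in \gamma_{ij}} f(e) \Big)^2 = \Big( \sum_{e \in \gamma_{ij}} \frac{f(e)}{\sqrt{w(e)}} \cdot \sqrt{w(e)} \Big)^2 \le \Big( \sum_{e \in \gamma_{ij}} \frac{f(e)^2}{w(e)} \Big) |\gamma_{ij}|_w.$$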

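Thus, for every $f$ with $\pi(f) = 0$ and $\pi(f^2) = 1$, we obtain

$$\begin{aligned}
1 &= \frac{1}{2} \sum_{i,j} \pi_i \pi_j (f_i - f_j)^2 = \sum_{\{i,j\}} \pi_i \pi_j \Big( \sum_{e \in \gamma_{ij}} f(e) \Big)^2 \\
&\le \sum_{\{i,j\}} \pi_i \pi_j \Big( \sum_{e \in \gamma_{ij}} \frac{f(e)^2}{w(e)} \Big) |\gamma_{ij}|_w \\
&= \sum_{e} a(e) f(e)^2 \, \frac{1}{a(e) w(e)} \sum_{\{i,j\}:\, \gamma_{ij} \ni e} |\gamma_{ij}|_w \pi_i \pi_j \\
&\le D(f) \sup_{e} J(w)(e),
\end{aligned}$$

where $\{i, j\}$ denotes the unordered pair of $i$ and $j$. The first equality follows by expanding the sum on the right-hand side of the equality; the last equality follows by exchanging the order of the sums. Clearly, the proof in this paragraph works for a general Markov chain on a graph.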

Note that for every $\ell > k$,

$$|\gamma_{k\ell}|_w = (w_{k+1} - w_k) + \cdots + (w_\ell - w_{\ell-1}) = w_\ell - w_k.$$

If a path $\gamma_{k\ell}$ ($k < \ell$) contains $e_i = \langle i, i+1 \rangle$, then $k \in \{0, 1, \cdots, i\}$ and $\ell \in \{i+1, i+2, \cdots\}$. Hence, once $\pi(w) \ge 0$, we have

$$\begin{aligned}
\sum_{\{k,\ell\}:\, \gamma_{k\ell} \ni e_i} |\gamma_{k\ell}|_w \pi_k \pi_\ell
&= \sum_{k=0}^{i} \sum_{\ell=i+1}^{\infty} \pi_k \pi_\ell (w_\ell - w_k)
 = \sum_{k=0}^{i} \pi_k \sum_{\ell=i+1}^{\infty} \pi_\ell w_\ell - \sum_{k=0}^{i} \pi_k w_k \sum_{\ell=i+1}^{\infty} \pi_\ell \\
&= \sum_{\ell=i+1}^{\infty} \pi_\ell w_\ell - \Big( \sum_{k=i+1}^{\infty} \pi_k \Big) \sum_{\ell=i+1}^{\infty} \pi_\ell w_\ell - \sum_{k=0}^{i} \pi_k w_k \sum_{\ell=i+1}^{\infty} \pi_\ell \\
&= \sum_{\ell=i+1}^{\infty} \pi_\ell w_\ell - \Big( \sum_{k=i+1}^{\infty} \pi_k \Big) \Big( \sum_{\ell=0}^{\infty} \pi_\ell w_\ell \Big)
 \le \sum_{\ell=i+1}^{\infty} \pi_\ell w_\ell, \qquad i \ge 0.
\end{aligned}$$

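Collecting the above two inequalities together, whenever $\pi(w) \ge 0$, we have

$$\begin{aligned}
1 &\le D(f) \sup_{i \ge 0} J(w)(e_i) \le D(f) \sup_{i \ge 0} \frac{1}{a(e_i) w(e_i)} \sum_{j=i+1}^{\infty} \pi_j w_j \\
&= D(f) \sup_{i \ge 0} \frac{1}{\pi_i b_i (w_{i+1} - w_i)} \sum_{j=i+1}^{\infty} \pi_j w_j \\
&= D(f) \sup_{i \ge 0} \frac{1}{\mu_i b_i (w_{i+1} - w_i)} \sum_{j=i+1}^{\infty} \mu_j w_j.
\end{aligned}$$

Combining this with (a), it follows that $D(f) \ge \inf_{i \ge 0} I_i(w)^{-1}$. Now, by the conditions $\pi(f) = 0$, $\pi(f^2) = 1$ and (3.2), we get $\lambda_1 \ge \inf_{i \ge 0} I_i(w)^{-1}$. Since $w \in \widetilde{\mathscr W}$ is arbitrary, we obtain $\lambda_1 \ge \sup_{w \in \widetilde{\mathscr W}} \inf_{i \ge 0} I_i(w)^{-1}$. That is what we required.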
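The lower bound (3.3) can be evaluated numerically for a concrete chain. The sketch below uses the rates $a_i = i$, $b_i \equiv b$ (a hypothetical example, not from the text), for which $\mu_i = b^i/i!$ and one checks directly that $g_i = i - b$ satisfies $\Omega g = -g$. It computes $\inf_i I_i(\bar w)^{-1}$ with $\bar w_i = w_i - \pi(w)$ for two strictly increasing test sequences $w$ with $w_0 = 0$; the truncation levels `n_max` and `n_check` are assumptions justified by the factorial decay of $\mu_i$.

```python
import math

def lower_bound(w, b=2.0, n_max=150, n_check=40):
    """Minimum over i < n_check of I_i(w_bar)^{-1} for rates a_i = i, b_i = b.

    The infinite sums are truncated at n_max (an assumption that is
    harmless here because mu_i = b**i / i! decays factorially).
    """
    mu = [1.0]
    for i in range(1, n_max + 1):
        mu.append(mu[-1] * b / i)          # mu_i = b_0...b_{i-1} / (a_1...a_i)
    z = sum(mu)
    pi_w = sum(m * w(i) for i, m in enumerate(mu)) / z
    w_bar = [w(i) - pi_w for i in range(n_max + 1)]

    # tail[i] = sum_{j >= i+1} mu_j * w_bar_j, accumulated from the far end.
    tail = [0.0] * (n_max + 1)
    for i in range(n_max - 1, -1, -1):
        tail[i] = tail[i + 1] + mu[i + 1] * w_bar[i + 1]

    inverses = []
    for i in range(n_check):
        I_i = tail[i] / (mu[i] * b * (w_bar[i + 1] - w_bar[i]))
        inverses.append(1.0 / I_i)
    return min(inverses)

bound_linear = lower_bound(lambda i: float(i))        # w_i = i
bound_square = lower_bound(lambda i: float(i * i))    # w_i = i^2
```

With $w_i = i$ (the eigenfunction up to a constant) every $I_i(\bar w)$ equals 1, so the bound is attained, whereas $w_i = i^2$ gives the weaker bound $1/(1+b)$. This illustrates how the quality of the estimate depends on the choice of the test sequence.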

Coupling proof of (3.3)

We use the same notations $\widetilde{\mathscr W}$ and $I(w)$ as in the last proof.

Fix $w \in \widetilde{\mathscr W}$ and define

$$u_i = \frac{1}{b_i \mu_i} \sum_{j \ge i+1} \mu_j w_j, \quad i \ge 0, \qquad g_i = \sum_{k<i} u_k, \quad i \ge 0, \qquad \rho(i, j) = |g_i - g_j|, \quad i, j \ge 0.$$

From part (a) in the last proof, it follows that $u_i > 0$ for all $i \ge 0$. Hence $g$ is strictly increasing and so $\rho$ is a distance.

Next, we adopt the classical coupling. Because of the symmetry $\mu_i b_i = \mu_{i+1} a_{i+1}$ ($i \ge 0$), we have

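$$\begin{aligned}
\Omega_c \rho(i, i+1) &= [\Omega \rho(\cdot, i+1)](i) + [\Omega \rho(i, \cdot)](i+1) \\
&= [\Omega (g_{i+1} - g_\bullet)](i) + [\Omega (g_\bullet - g_i)](i+1) \\
&= \Omega g(i+1) - \Omega g(i) \\
&= b_{i+1} u_{i+1} - a_{i+1} u_i - b_i u_i + a_i u_{i-1} \\
&= \Big[ \frac{1}{\mu_{i+1}} \sum_{j \ge i+2} \mu_j w_j - \frac{a_{i+1}}{b_i \mu_i} \sum_{j \ge i+1} \mu_j w_j \Big]
 - \Big[ \frac{1}{\mu_i} \sum_{j \ge i+1} \mu_j w_j - \frac{a_i}{b_{i-1} \mu_{i-1}} \sum_{j \ge i} \mu_j w_j \Big] \\
&= -w_{i+1} + w_i \\
&= -\frac{(w_{i+1} - w_i) b_i \mu_i}{\sum_{j \ge i+1} \mu_j w_j} \cdot \frac{\sum_{j \ge i+1} \mu_j w_j}{b_i \mu_i} \\
&= -I_i(w)^{-1} \rho(i, i+1) \\
&\le -\Big[ \inf_{i \ge 0} I_i(w)^{-1} \Big] \rho(i, i+1), \qquad i \ge 1.
\end{aligned}$$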

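On the other hand, since $\sum_j \mu_j w_j \ge 0$, we have

$$\begin{aligned}
\Omega_c \rho(0, 1) &= \Omega g(1) - \Omega g(0) = b_1 u_1 - a_1 u_0 - b_0 u_0 \\
&= \frac{1}{\mu_1} \sum_{j \ge 2} \mu_j w_j - \frac{a_1 + b_0}{b_0 \mu_0} \sum_{j \ge 1} \mu_j w_j \\
&= -w_1 - \frac{1}{\mu_0} \sum_{j \ge 1} \mu_j w_j \le -(w_1 - w_0) \\
&\le -\Big[ \inf_{i \ge 0} I_i(w)^{-1} \Big] \rho(0, 1).
\end{aligned}$$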

Collecting these two inequalities together, we obtain

$$\Omega_c \rho(i, j) = \Omega_c \rho(i, i+1) + \cdots + \Omega_c \rho(j-1, j) \le -\Big[ \inf_{i \ge 0} I_i(w)^{-1} \Big] \rho(i, j), \qquad i < j.$$

This proves the key condition (1.23), from which the conclusion (3.3) follows by a localizing procedure. Refer to Theorem 2.42.

Clearly, the key point in the last proof is the choice of the distance $\rho$, which is not obvious at all. We will explain this point in detail in §3.4. Actually, equality holds at each sign in (3.3), and so we have complete variational formulas for the lower bound of the first eigenvalue. Since the proof is more technical, we delay it to §3.7.

In the next three sections, we explain the ideas of the proof in the geometric case for the variational formula (1.11).

3.3 Three steps of the proof in geometric case

In this section, we explain in more detail the ideas of the proof of the variational formula given in Theorem 1.1.

Choosing a coupling

Let $(B_t)$ be the standard Brownian motion (abbrev. BM) in $\mathbb{R}^d$ and let $(X_t)$ be the solution to the stochastic differential equation (abbrev. SDE):

$$dX_t = \sqrt{2}\, dB_t, \qquad X_0 = x. \tag{3.4}$$

The process corresponds to the operator $\Delta$ (half of it corresponds to the BM). Certainly, we can define a process $(Y_t)$ in the same way:

$$dY_t = \sqrt{2}\, dB_t, \qquad Y_0 = y. \tag{3.5}$$

Now, because the processes $(X_t)$ and $(Y_t)$ are defined on the same probability space, we obtain a coupling, namely the coupling of marching soldiers $(X_t, Y_t)$. However, in what follows, we will use another process $(Y_t)$ which is defined by

$$dY_t = \sqrt{2}\, H(X_t, Y_t)\, dB_t, \qquad Y_0 = y, \tag{3.6}$$

where $H(x, y) = I - 2(x - y)(x - y)^*/|x - y|^2$. Note that $H(x, y)$ has no meaning when $x = y$, so the process $(Y_t)$ given in (3.6) is meaningful only up to the coupling time

$$T := \inf\{ t \ge 0 : X_t = Y_t \}.$$

Starting from the time $T$, we define $Y_t = X_t$. We have thus constructed a process $(Y_t)$. Clearly, this $(Y_t)$ strongly depends on $(X_t)$. Of course, the solutions of Eq. (3.5) and Eq. (3.6) are different, but they do have the same distribution, due to the invariance of BM under orthogonal transformations and the fact that $H(x, y)$ is a reflection matrix. The last couple $(X_t, Y_t)$ is the coupling by reflection discussed in the last chapter.
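The two properties of $H(x, y)$ used here, orthogonality and reflection across the hyperplane perpendicular to $x - y$, can be checked directly. The following is a minimal sketch in plain Python; the points `x`, `y` and the helper names are hypothetical choices for illustration.

```python
def reflection_matrix(x, y):
    """H(x, y) = I - 2 (x - y)(x - y)^T / |x - y|^2, defined for x != y."""
    v = [xi - yi for xi, yi in zip(x, y)]
    norm2 = sum(vi * vi for vi in v)
    if norm2 == 0.0:
        raise ValueError("H(x, y) is undefined when x == y")
    d = len(v)
    return [[(1.0 if i == j else 0.0) - 2.0 * v[i] * v[j] / norm2
             for j in range(d)] for i in range(d)]

def matvec(H, u):
    return [sum(H[i][j] * u[j] for j in range(len(u))) for i in range(len(H))]

def matmul(A, B):
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

x, y = [1.0, 2.0, 0.5], [-1.0, 0.0, 1.5]   # hypothetical starting points in R^3
H = reflection_matrix(x, y)
HH = matmul(H, H)                 # H is symmetric, so H H^T = H H
v = [xi - yi for xi, yi in zip(x, y)]
Hv = matvec(H, v)                 # the direction x - y is flipped
u = [1.0, -1.0, 0.0]              # a vector orthogonal to v = (2, 2, -1)
Hu = matvec(H, u)                 # directions orthogonal to x - y are fixed
```

Since $H$ is orthogonal, $H\,dB_t$ is again a Brownian increment, which is why (3.5) and (3.6) have the same law. The sign flip along $x - y$ is what makes the two components move towards or away from each other at twice the speed.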

Intuitively, the construction of $(Y_t)$ can be completed in two steps. Let $y \ne x$.

• Transport $X_t$ from $x$ to $y$ in parallel along the line $(x, y)$.

• Make the mirror reflection of the transported image of $X_t$ in the hyperplane which is perpendicular to the line $(x, y)$ at $y$.

Then, the mirror image gives us the process $(Y_t)$.

For the diffusion $(X_t)$ on a manifold $M$ with generator $\Delta$, a process $(Y_t)$ can be constructed in a similar way. Roughly speaking, one simply replaces the phrase “the line $(x, y)$” in the above construction by “the unique shortest geodesic $\gamma$ between $x$ and $y$”. Certainly, there are some technical details and geometric difficulties (the cut locus, for instance) in the construction; refer to Kendall (1986) and Cranston (1992). An account of this coupling is now contained in Hsu (2002).

The appearance of the coupling by reflection is a critical step in the development of the coupling theory. For a long period, one knew mainly the classical coupling, which is successful (i.e., $\mathbb{P}[T < \infty] = 1$) for BM in $\mathbb{R}^d$ iff $d = 1$ [cf. Chen and Li (1989)]. Thus, one may have the impression that a process having a successful coupling ought to be recurrent. But the coupling by reflection shows that success can be much weaker than recurrence, since this coupling is successful in any dimension [cf. Lindvall and Rogers (1986) and Chen and Li (1989)]. The key point is that the strong dependence of $(Y_t)$ on $(X_t)$ enables us to reduce the higher-dimensional case to dimension one.

Computing the distance

Throughout the chapter, we consider a connected, complete Riemannian manifold $M$ with $\mathrm{Ricc} \ge K$ for some $K \in \mathbb{R}$. In most cases, we consider here compact $M$ only. Denote by $\rho$ the Riemannian distance on $M$. For the distance of the coupled process $(X_t, Y_t)$, the following formula was proved by Kendall (1986) and Cranston (1992):

$$d\rho(X_t, Y_t) = 2\sqrt{2}\, dB_t + \Big[ \int_{X_t}^{Y_t} \sum_{i=2}^{d} \big( |\nabla_U W^i|^2 - \langle R(W^i, U)U, W^i \rangle \big) \Big] dt - dL_t, \qquad t < T, \tag{3.7}$$

where $W^i$, $i = 2, \cdots, d$, are Jacobi fields along the unique shortest geodesic $\gamma$ between $X_t$ and $Y_t$, $U$ is the unit tangent vector to $\gamma$, and the integral in $[\cdots]$ is along $\gamma$. Here $(B_t)$ is a BM in $\mathbb{R}$ and $(L_t)$ is an increasing process with support contained in $\{ t \ge 0 : (X_t, Y_t) \in C \}$, where $C := \{ (x, y) : x \text{ is in the cut locus of } y \}$. When $(X_t, Y_t) \in C$, the coefficient of $dt$ is taken to be 0.

The formula is a finer version of the deterministic situation. The second term on the right-hand side of (3.7) is more or less familiar and comes from the second variation of arclength. The first and the last terms are new in the stochastic case. Since the measure of the cut locus equals zero, the last term is not essential. Next, because the mean of the first term is zero, it will be ignored once we take the expectation, as we will see in the next step. However, the condition “$t < T$” is critical to avoid the singularity at $t = T$. This is the main place at which the present proof is probabilistic.

To estimate $\rho(X_t, Y_t)$, we need only handle the second term on the right-hand side of (3.7). By comparing $M$ with a manifold with constant sectional curvature, M. Cranston (1992) proved that when $K < 0$, this term is controlled by

$$2\sqrt{-K(d-1)}\, \tanh\Big( \frac{\rho_t}{2} \sqrt{\frac{-K}{d-1}} \Big), \qquad \rho_t := \rho(X_t, Y_t). \tag{3.8}$$

It was then proved by Chen and F. Y. Wang (1993b) that the same conclusion remains true when $K \ge 0$, in which case (3.8) can be rewritten as

$$-2\sqrt{K(d-1)}\, \tan\Big( \frac{1}{2} \sqrt{\frac{K}{d-1}}\, \rho_t \Big).$$

Set

$$\gamma(r) = 2\sqrt{-K(d-1)}\, \tanh\Big( \frac{1}{2} \sqrt{\frac{-K}{d-1}}\, r \Big).$$

Then, we obtain

$$d\rho_t \le 2\sqrt{2}\, dB_t + \gamma(\rho_t)\, dt - dL_t \le 2\sqrt{2}\, dB_t + \gamma(\rho_t)\, dt, \qquad t < T. \tag{3.9}$$

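Equivalently,

$$\rho_{t \wedge T} - \rho_0 \le 2\sqrt{2} \int_0^{t \wedge T} dB_s + \int_0^{t \wedge T} \gamma(\rho_s)\, ds.$$

Taking expectations, we get

$$\mathbb{E}^{x,y} \rho_{t \wedge T} \le \rho_0 + \mathbb{E}^{x,y} \int_0^{t \wedge T} \gamma(\rho_s)\, ds. \tag{3.10}$$

In order to get an exponential rate, we need the condition

$$\gamma(r) \le -\alpha r \quad \text{for some } \alpha > 0. \tag{3.11}$$

When $K > 0$, since $\tan\theta \ge \theta$ on $[0, \pi/2)$, we have $\alpha = K$. Under (3.11), we have

$$\mathbb{E}^{x,y} \int_0^{t \wedge T} \gamma(\rho_s)\, ds \le -\alpha\, \mathbb{E}^{x,y} \int_0^{t \wedge T} \rho_s\, ds = -\alpha\, \mathbb{E}^{x,y} \int_0^{t} \rho_{s \wedge T}\, ds = -\alpha \int_0^{t} \mathbb{E}^{x,y} \rho_{s \wedge T}\, ds,$$

since $\rho_{t \wedge T} = 0$ for all $t \ge T$. Combining this with (3.10), we obtain

$$\mathbb{E}^{x,y} \rho_{t \wedge T} \le \rho_0 e^{-\alpha t}.$$

Equivalently,

$$\mathbb{E}^{x,y} \rho_t \le \rho_0 e^{-\alpha t}, \qquad t \ge 0. \tag{3.12}$$

This is the key estimate of our method.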
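The passage from the $\tanh$ form of $\gamma$ to the $\tan$ form for $K > 0$, and the resulting bound $\gamma(r) \le -Kr$ behind $\alpha = K$, can be checked numerically. In the sketch below, the values of `K` and `d` are hypothetical, and the range of `r` is restricted so that the argument of $\tan$ stays below $\pi/2$ (an assumption on the diameter).

```python
import cmath
import math

def gamma_tanh(r, K, d):
    """gamma(r) = 2 sqrt(-K(d-1)) tanh(r/2 * sqrt(-K/(d-1))), evaluated in C."""
    return 2 * cmath.sqrt(-K * (d - 1)) * cmath.tanh(0.5 * r * cmath.sqrt(-K / (d - 1)))

def gamma_tan(r, K, d):
    """The K > 0 form: -2 sqrt(K(d-1)) tan(r/2 * sqrt(K/(d-1)))."""
    return -2 * math.sqrt(K * (d - 1)) * math.tan(0.5 * r * math.sqrt(K / (d - 1)))

K, d = 1.0, 3                              # hypothetical curvature bound and dimension
r_max = math.pi * math.sqrt((d - 1) / K)   # keeps the tan argument below pi/2
rs = [r_max * k / 20 for k in range(1, 20)]

# For K > 0 both square roots are imaginary and tanh(ix) = i tan(x),
# so the product of the two imaginary factors is real and negative.
forms_agree = all(abs(gamma_tanh(r, K, d) - gamma_tan(r, K, d)) < 1e-9 for r in rs)

# tan(theta) >= theta gives gamma(r) <= -K r, i.e. (3.11) with alpha = K.
linear_bound_holds = all(gamma_tan(r, K, d) <= -K * r for r in rs)
```

Both flags evaluate to `True` for the values chosen, which is consistent with the rewriting of (3.8) and with the choice $\alpha = K$ in (3.11).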


Estimating λ1

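Let $g$ be an eigenfunction of $\lambda_1$: $-\Delta g = \lambda_1 g$, $g \ne$ constant. Then $\mathbb{E}^x g(X_t) = g(x) e^{-\lambda_1 t}$ for all $t \ge 0$. This gives us a relation between $\lambda_1$, $g$ and the process $(X_t)$. The same relation holds for $(Y_t)$. Note that the coupling property gives us $\mathbb{E}^{x,y} g(X_t) = \mathbb{E}^x g(X_t)$. By (3.12), we have

$$\begin{aligned}
e^{-\lambda_1 t} |g(x) - g(y)| &= |\mathbb{E}^x g(X_t) - \mathbb{E}^y g(Y_t)| = \big| \mathbb{E}^{x,y} [ g(X_t) - g(Y_t) ] \big| \\
&\le L(g)\, \mathbb{E}^{x,y} \rho_t \le L(g) \rho_0 e^{-\alpha t} = L(g) \rho(x, y) e^{-\alpha t}, \qquad t \ge 0,
\end{aligned}$$

where $L(g)$ denotes the Lipschitz constant of $g$ with respect to $\rho$. Choosing $x, y$ with $g(x) \ne g(y)$ and letting $t \to \infty$ gives immediately $\lambda_1 \ge \alpha$, and hence our proof is complete.

The last step is rather simple but may not be so easy to find. This is indeed characteristic of various applications of the coupling method: once the idea is understood, the proof often becomes quite straightforward.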

3.4 Two difficulties

Roughly speaking, we have explained half of the first version of the paper by Chen and F. Y. Wang (1993b). The problem is that the above arguments are still not enough to obtain the sharp estimate. For instance, when K > 0, we get the lower bound α = K only, as mentioned right after (3.11). The best we can get (when K > 0) is $8/D^2$ rather than the sharp one $\pi^2/D^2$, where $D$ is the diameter of the compact manifold $M$. Even for the bound $8/D^2$, we still need to estimate $\mathbb{E}^{x,y} T$ (cf. Theorem 2.43), which we are not going to discuss here.

We now return to analyze the proof discussed in the last section. In the last step, we need the Lipschitz property of $g$. The non-compact case can often be reduced to the compact one [cf. Chen and Wang (1995)], and in the latter case $g$ is smooth, hence the Lipschitz property is automatic. Thus, in the whole proof, the key is the estimate (3.12), for which we require not only a good coupling but also a good distance. This is not surprising: the convergence rate is not a topological concept, and it certainly depends heavily on the choice of a distance. There is no reason why the underlying Riemannian distance should always be a correct choice.

Optimal Markovian coupling

The first question concerns the effectiveness of the coupling used above. Is there an optimal choice? This problem is quite hard, as explained in §2.2. However, the aim of the optimality becomes clear now: choose the coupling to make the rate $\alpha$ as large as possible, or in a slightly wider sense, to make

$$\mathbb{E}^{x,y} \rho(X_t, Y_t)$$

as small as possible for all $t \ge 0$ and for every fixed pair $(x, y)$ and fixed $\rho$. Because we are dealing with Markovian couplings, we can use the language of coupling operators, studied in the last chapter. Of course, one can translate the discussions here into stochastic differential equations. Note that under mild assumptions, the last statement is equivalent to requiring that $\widetilde L \rho(x, y)$ be as small as possible for every pair $(x, y)$, $x \ne y$. This leads to the definition of the $\rho$-optimal coupling operator $\overline L$:

$$\overline L \rho(x, y) = \inf_{\widetilde L} \widetilde L \rho(x, y), \qquad x \ne y,$$

where $\widetilde L$ varies over all coupling operators (cf. §2.2).

Some constructions of optimal Markovian couplings are presented in the last chapter. In particular, Theorem 2.30 (4) tells us that the coupling by reflection is already good enough even for the BM on manifolds. Furthermore, it suggests using $f \circ \rho$ instead of the original Riemannian distance $\rho$. The construction of a new distance is the second main difficulty of the study, and this is the content of the remainder of this section.

Modification of Riemannian distance

To illustrate the use of the above idea, assume that $K \ge 0$ and take $\bar\rho = \sin \frac{\pi \rho}{2D}$. Since $\rho \le D$, $\bar\rho$ is a distance. To compute $d\bar\rho_t$, noting that $d\rho_t \le 2\sqrt{2}\, dB_t$, apply Itô's formula plus a comparison argument:

$$d\bar\rho_t \le \frac{\pi}{2D} \cos\frac{\pi \rho_t}{2D} \cdot 2\sqrt{2}\, dB_t - \frac{1}{2} \cdot \frac{\pi^2}{4D^2} \cdot \sin\frac{\pi \rho_t}{2D} \cdot 8\, dt, \qquad t < T.$$

The first term is a martingale, denoted by $M_t$. We then obtain

$$d\bar\rho_t \le dM_t - \frac{\pi^2}{D^2} \bar\rho_t\, dt, \qquad t < T.$$

Repeating the proof given in the last section, we get

$$\mathbb{E}^{x,y} \bar\rho_t \le \bar\rho_0 \exp\Big[ -\frac{\pi^2}{D^2} t \Big].$$

Thus, we obtain $\lambda_1 \ge \pi^2/D^2$, which is optimal in the case of zero curvature. By using the same function $\sin$ with slight modifications (which come from some controlling equations of (3.9) with constant coefficients), we can obtain the other two optimal lower bounds [i.e., (1.1) and (1.8)], as shown in the final version of Chen and F. Y. Wang (1993b, Theorem 1.8). Finally, it is interesting to remark that $2\theta/\pi \le \sin\theta \le \theta$ on $[0, \pi/2]$, and so the distances $\rho$ and $\bar\rho$ used above are actually equivalent. However, the resulting rates are essentially different.

Redesignated distances

Is there any other choice of the distance? The question is again easy to state but not so easy to answer. Indeed, we did not know for a long time where to start. This problem becomes more serious when one goes to the non-compact situation. Intuitively, a distance cannot be good if the eigenfunction $g$ is too far from being Lipschitz with respect to it. As usual, we are taught by simple examples. Consider the diffusion on the half-line $[0, \infty)$ with operator

$$L = a\, d^2/dx^2 - b\, d/dx$$

for some constants $a, b > 0$ and with the Neumann boundary condition at the origin. If one adopts the Euclidean distance, then it gives nothing. So what distance should we take? Our approach is to look at the eigenfunction of $\lambda_1 = b^2/4$ (without loss of generality, set $a = 1$):

$$g(x) = (1 - bx/2) \exp[bx/2] \in L^1(\pi) \setminus L^2(\pi).$$

This suggests constructing a new distance $\rho$ from the leading part of $g$:

$$\rho(x, y) = | \exp[\gamma x] - \exp[\gamma y] |$$

for suitable $\gamma > 0$. Surprisingly, it gives us the exact estimate of $\lambda_1$ even though the eigenfunction $g$ is still not Lipschitz with respect to this distance [cf. Chen and F. Y. Wang (1995)]. Furthermore, once $g$ is strictly monotone (this is indeed the case in dimension one, but the proof is rather technical, cf. §3.7 below), we can always take $|g(x) - g(y)|$ as the required distance. This provides a way to construct and to classify the distances according to different classes of elementary functions [cf. Chen (1996) and Chen and F. Y. Wang (1997b)].
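That $g(x) = (1 - bx/2)\exp[bx/2]$ is an eigenfunction of $L = d^2/dx^2 - b\, d/dx$ with eigenvalue $b^2/4$ and Neumann condition at the origin can be confirmed by a finite-difference check. This is a minimal sketch; the value of `b`, the step `h` and the sample points are arbitrary illustrative choices.

```python
import math

b = 1.5            # hypothetical drift parameter; a = 1 as in the text
lam1 = b * b / 4   # the eigenvalue stated in the text

def g(x):
    return (1.0 - b * x / 2.0) * math.exp(b * x / 2.0)

def L_g(x, h=1e-3):
    """Central-difference approximation of (d^2/dx^2 - b d/dx) g at x."""
    d1 = (g(x + h) - g(x - h)) / (2 * h)
    d2 = (g(x + h) - 2 * g(x) + g(x - h)) / (h * h)
    return d2 - b * d1

xs = [0.5, 1.0, 2.0, 3.0]
residuals = [abs(L_g(x) + lam1 * g(x)) for x in xs]   # should be near zero
neumann_slope = (g(1e-6) - g(-1e-6)) / 2e-6           # approximates g'(0)
```

The residuals of $Lg + \lambda_1 g$ are at the level of the discretization error, and the slope at the origin vanishes, as the Neumann condition requires.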

However, there is still a serious difficulty in the construction of the new distance, since the eigenvalue $\lambda_1$ and its eigenfunctions $g$ are either known or unknown simultaneously. To see this, consider another example on the half-line with operator $L = a(x)\, d^2/dx^2$. A beautiful estimate due to I. S. Kac and M. G. Krein (1958), S. Kotani and S. Watanabe (1982) says that

$$\frac{1}{4} \Big( \sup_{x > 0} x \int_x^{\infty} \frac{du}{a(u)} \Big)^{-1} \le \lambda_1 \le \Big( \sup_{x > 0} x \int_x^{\infty} \frac{du}{a(u)} \Big)^{-1}.$$
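To see what this two-sided estimate looks like in a concrete case, take $a(x) = (1 + x)^2$, a hypothetical example not from the text, for which $\int_x^\infty du/a(u) = 1/(1 + x)$ in closed form. The sketch below evaluates the supremum on a finite grid, so the computed value slightly underestimates the true supremum 1.

```python
def tail_integral(x):
    """Closed form of the integral of du / a(u) over [x, inf) for a(u) = (1 + u)^2."""
    return 1.0 / (1.0 + x)

xs = [10.0 ** k for k in range(-3, 7)]             # grid on (0, 10^6]
sup_value = max(x * tail_integral(x) for x in xs)  # approaches 1 as x grows
lower, upper = 0.25 / sup_value, 1.0 / sup_value
```

For this choice the estimate reads, up to the grid error, $1/4 \le \lambda_1 \le 1$. The factor 4 between the two sides is the price paid for not knowing the eigenfunction.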

Now, in order to recover this estimate by using our method, according to what was discussed above, we have to know some information about the eigenfunction $g$. Even in such a simple situation, there is still no hope of solving for $g$ explicitly from $a(x)$. What can we do now? Once again, we examine the eigen-equation:

$$\begin{aligned}
a(x) g'' = -\lambda_1 g &\iff g'(s) = \int_s^{\infty} \frac{\lambda_1 g(u)}{a(u)}\, du \quad (\text{since } g'(\infty) = 0) \\
&\iff g(x) = g(0) + \int_0^x ds \int_s^{\infty} \frac{\lambda_1 g(u)}{a(u)}\, du.
\end{aligned} \tag{3.13}$$

Since we are dealing with the ergodic case, one can regard $\infty$ as a Neumann boundary and so $g'(\infty) = 0$. What we have done is just to rewrite the differential equation as the corresponding integral equation. Is the last equation helpful? The answer is affirmative. We now move step by step as follows.

• Regard $\lambda_1 g$ as a new function $f$.

• Regard the right-hand side of (3.13) as an approximation of the left-hand side $g$.

• Ignore the constant $g(0)$ on the right-hand side since we are interested only in $g(x) - g(y)$.

In other words, these considerations suggest taking

$$\tilde g(x) = \int_0^x ds \int_s^{\infty} \frac{f(u)}{a(u)}\, du \tag{3.14}$$

as an approximation of $g$ (up to a constant) and then taking $\rho(x, y) = |\tilde g(x) - \tilde g(y)|$. The function $f$ used above is called a test function. A slightly different explanation of the construction goes as follows. Even though equation (3.13) cannot be solved explicitly, as usual we do have a successive approximation procedure. Thus, one may regard (3.14) as the first step of the approximation and go further step by step. However, further approximations are not really necessary since, on the one hand, they become too complicated and, on the other hand, they are not as effective as modifying the test function $f$ directly.

Next, we consider the general operator on the half-line:

$$L = a(x)\, d^2/dx^2 + b(x)\, d/dx.$$

By standard ODE techniques, it can be reduced to the above simple case. The approximation function now becomes [cf. Chen and Wang (1997b)]

$$\tilde g(r) = \int_0^r e^{-C(s)}\, ds \int_s^{\infty} \frac{f(u) e^{C(u)}}{a(u)}\, du, \qquad C(r) := \int_0^r \frac{b}{a}. \tag{3.15}$$

We have thus obtained a general construction of the mimic eigenfunctions and furthermore of the required distances. It should not be surprising that the reconstruction of the distances is a powerful tool in many situations. This will be illustrated in the next section.

Optimizing the distances

Before moving further, let us mention that a method of optimizing the distance induced from (3.15), as well as some comparison methods, is developed in Chen and F. Y. Wang (1995). In short, the condition “$\widetilde L \rho(x, y) \le -\alpha \rho(x, y)$ [which is equivalent to (3.12)] holds for all large enough $\rho(x, y)$”, but not necessarily “for all $x \ne y$”, is enough to guarantee a positive lower bound of $\lambda_1$.

3.5 Final step of the proof of the formula

Up to now, we have discussed only the construction of the mimic eigenfunctions $\tilde g$ in the case of the half-line. But how do we go to the whole line and further to $\mathbb{R}^d$ and a manifold $M$? This seems quite difficult. However, the answer is still rather simple once the idea is figured out. As we have seen in §3.3, the coupling methods reduce the higher-dimensional case to computing the distance of the coupled process, and the distance itself is a process valued in the half-line $[0, \infty)$. We have thus returned to what was treated in the last section.

Recall that

$$\gamma(r) = 2\sqrt{-K(d-1)}\, \tanh\Big( \frac{1}{2} \sqrt{\frac{-K}{d-1}}\, r \Big)$$

and $\rho_t = \rho(X_t, Y_t)$. From (3.9), it is known that

$$d\rho_t \le 2\sqrt{2}\, dB_t + \gamma(\rho_t)\, dt, \qquad t < T. \tag{3.16}$$

The operator corresponding to (3.16) with equality is

$$L = 4\, d^2/dx^2 + \gamma(x)\, d/dx$$

on $[0, D]$ with absorbing boundary at 0 and reflecting boundary at $D$. This is indeed simpler than what we discussed in the last section ($a(x) \equiv 4$). Redefine

$$C(r) = \exp\Big[ \frac{1}{4} \int_0^r \gamma(s)\, ds \Big].$$

Then the approximation function defined by (3.15) becomes

$$\tilde g(r) = \int_0^r C(s)^{-1}\, ds \int_s^{D} C(u) f(u)\, du,$$

up to a constant factor. Now the same proof as given in §3.3 implies rather easily the formula (1.11).

The tool to derive Corollary 1.2 from Theorem 1.1 is the FKG-inequality (2.1).

We have thus completed the proof in the geometric case. Our proof is universal in the sense that it works for general Markov processes, as shown by Theorems 2.42 and 2.43. We also obtain variational formulas for non-compact manifolds, elliptic operators in $\mathbb{R}^d$ [Chen and Wang (1997b)], and Markov chains [Chen (1996)]. It is more difficult to derive the variational formulas for elliptic operators and Markov chains due to the presence of infinitely many parameters in these cases. In contrast, there are only three parameters ($d$, $D$, and $K$) in the geometric case. In fact, formula (1.11) is a particular consequence of our general formula (which is complete in dimension one) for elliptic operators.

[Figure 3.3 Intrinsic difference of the methods. Dirichlet: $f(\partial D) = 0$, $f > 0$ inside $D$; equivalent to the maximum principle. Neumann: $\partial f/\partial n = 0$, $\int f\, d\pi = 0$; the surface $f = 0$ depends on $L$ and $D$.]

Finally, we mention that the same method is used by F. W. Wang (1999b) and Y. H. Mao (2002d; 2002e) to show that for diffusions on a compact manifold, the rate of strong ergodicity is bounded below by

$$4 \Big[ \int_0^D C(s)^{-1}\, ds \int_s^D C(u)\, du \Big]^{-1}. \tag{3.17}$$

Note that this lower bound coincides with (1.11) by setting the test function $f \equiv 1$. We will come back to this topic in Chapter 5.
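As a sanity check of (3.17), consider the flat case $K = 0$ (an illustrative choice). Then $\gamma \equiv 0$ and $C \equiv 1$ by the definitions of this section, and the bound can be computed by hand. The second display assumes that (1.11) has the form $\lambda_1 \ge 4 \sup_f \inf_r f(r)/\tilde g(r)$, as the comparison with (3.17) suggests; the test function $\sin(\pi u/2D)$ is the one used for the modified distance in §3.4.

```latex
% Zero curvature: K = 0, hence gamma \equiv 0 and C \equiv 1.
\[
\int_0^D C(s)^{-1}\,ds \int_s^D C(u)\,du
  = \int_0^D (D - s)\,ds = \frac{D^2}{2},
\qquad\text{so (3.17) reads}\qquad
4\Big[\frac{D^2}{2}\Big]^{-1} = \frac{8}{D^2}.
\]
% Test function f(u) = sin(pi u / 2D) instead of f \equiv 1 (assumed form of (1.11)).
\[
\int_0^r ds \int_s^D \sin\frac{\pi u}{2D}\,du
  = \int_0^r \frac{2D}{\pi}\cos\frac{\pi s}{2D}\,ds
  = \frac{4D^2}{\pi^2}\sin\frac{\pi r}{2D},
\qquad
\frac{4 f(r)}{\tilde g(r)} = \frac{\pi^2}{D^2}.
\]
```

The first value is the bound $8/D^2$ mentioned in §3.4, and the second is the sharp bound $\pi^2/D^2$, which shows how much is gained by optimizing over the test function.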

3.6 Comments about different methods

First, we would like to make some remarks here on the Dirichlet eigenvalue (called the D-problem for short). Similarly, we have the N-problem. It is interesting to note that historically most of the publications in this field are devoted to the D-problem rather than the N-problem. The main reason is that the D-problem is equivalent to the maximum principle. Let $B(p, n)$ be the ball centered at $p$ with radius $n$. It is well known [going back to J. Barta (1937); refer to H. Berestycki, L. Nirenberg and S. R. S. Varadhan (1994) and references within] that

$$\lambda_1 \ge \sup_f \inf_{B(p,n)} (-Lf)/f, \tag{3.18}$$

where $f$ varies over all $C^2(B(p, n))$ functions with $f|_{\partial B(p,n)} = 0$ and $f > 0$ on $B(p, n)$. In other words, we do have a variational formula for the lower bound for the D-problem. Note that the maximum principle is a powerful tool in PDE. It should not be surprising that one can do a lot for the D-problem. However, this formula does not work for the N-problem (or the closed eigenvalue problem). The reason is simply that the eigenfunction $g$ in the Neumann case must cross zero, and so does $Lg$ (because the mean of $g$ equals zero). See Figure 3.3. Hence, there is a singularity of $(-Lg)/g$ around the zero point, which causes serious difficulty when the eigenfunction $g$ is replaced by its perturbation $f$. Traditionally, one transfers the N-problem to the D-problem, as will be studied in the next chapter. This explains why one often thinks that the N-problem is more difficult than the D-problem. It seems that the N-problem is also more difficult than the closed problem. For instance, for the Neumann eigenvalue $\lambda_1$ with convex boundary, the best known lower bound is Lichnerowicz's estimate obtained by J. F. Escobar (1990) in the case of $K > 0$, and up to now we have not seen in the literature a proof of “$\lambda_1 \ge \pi^2/D^2$” for general $K \ge 0$. The known estimates of $\lambda_1$ for the N-problem in the case of $K < 0$ are all less than the known estimates for the closed eigenvalue (refer to the books quoted in Chapter 1). However, as we mentioned above, Theorem 1.1 and its corollaries are all suitable for the Neumann eigenvalue $\lambda_1$ with convex boundary. These discussions also show that the use of coupling enables us to avoid the singularity, just as mentioned above. The degeneracy of the coupled process appears at time $T$ only, and before time $T$, the process is quite regular. This is somehow similar to the D-problem, for which the degeneracy appears at the boundary only. In other words, the coupling method plays in our proof the role that the maximum principle plays for the D-problem.

Geometric proof

We now recall Li–Yau's method (1980). Let $g$ be the eigenfunction corresponding to $\lambda_1$. By using a normalizing procedure, assume that $1 = \sup g > \inf g =: -k$. Here is Li–Yau's key estimate:

\[
|\nabla g|^2 \le \frac{2\lambda_1}{1+k}\,(1-g)(k+g).
\]

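This is often called the method of gradient estimation. To improve Li–Yau's estimate of $\lambda_1$, a key result is Zhong–Yang's estimate (1984):

\[
|\nabla\theta|^2 \le \lambda_1\bigl(1 + a_\varepsilon\psi(\theta)\bigr),
\]

where

\[
\theta = \arcsin(\text{a linear function of } g), \qquad a_\varepsilon = \frac{1-k}{(1+k)(1+\varepsilon)},
\]

\[
\psi(\theta) =
\begin{cases}
\dfrac{2[2\theta + \sin(2\theta)]/\pi - 2\sin\theta}{\cos^2\theta}, & \theta \in \left(-\dfrac{\pi}{2}, \dfrac{\pi}{2}\right),\\[2mm]
1, & \theta = \dfrac{\pi}{2},\\[2mm]
-1, & \theta = -\dfrac{\pi}{2}.
\end{cases}
\]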

This estimate has been improved step by step by Yang et al. All of the proofs are based on the maximum principle. From this, it should be clear that Zhong–Yang's proof cannot be simple. Moreover, our proof is completely different from theirs.

No doubt, our method should be useful for complex manifolds. However, much work remains to be done.

Open Problem 3.1. Study the first eigenvalue for complex manifolds by couplings.


3.7 Proof in the discrete case (continued)

The main purpose of this section is to prove that equality holds at each place in (3.3). We use the notation given in §3.2 and restate the result as follows.

Theorem 3.2. We have

\[
\lambda_1 = \sup_{w\in W}\,\inf_{i\ge 0} I_i(w)^{-1} = \sup_{w\in W}\,\inf_{i\ge 0} I_i(w)^{-1}. \tag{3.19}
\]

To prove these two equalities, we need some properties of the corresponding eigenfunction, so the proof is much more technical.

Proposition 3.3. Let $\lambda > 0$ and $g \not\equiv 0$ be a solution to the equation $\Omega g = -\lambda g$. Then $g_0 \ne 0$ and

\[
\pi_n b_n (g_{n+1} - g_n) = -\lambda \sum_{i=0}^{n} \pi_i g_i, \qquad n \ge 0. \tag{3.20}
\]

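Proof. The formula (3.20) follows from

\[
\begin{aligned}
-\lambda \sum_{i=0}^{n} \pi_i g_i
&= \sum_{i=0}^{n} \pi_i \Omega g(i)
 = \sum_{i=0}^{n} \bigl[\pi_i a_i (g_{i-1} - g_i) + \pi_i b_i (g_{i+1} - g_i)\bigr] \\
&= \sum_{i=0}^{n} \bigl[-\pi_i a_i (g_i - g_{i-1}) + \pi_{i+1} a_{i+1} (g_{i+1} - g_i)\bigr] \\
&= -\pi_0 a_0 (g_0 - g_{-1}) + \pi_{n+1} a_{n+1} (g_{n+1} - g_n) \\
&= \pi_n b_n (g_{n+1} - g_n).
\end{aligned}
\]

Here the additional term $g_{-1}$ can be ignored since $a_0 = 0$.

If $g_0 = 0$, then by induction and (3.20), it follows that $g_i \equiv 0$. This is a contradiction.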
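The telescoping identity (3.20) can be checked exactly on a small example. The following is a minimal sketch, not part of the original text: the rates `a`, `b`, the value `lam` and the helper name `check_3_20` are illustrative assumptions. It builds $g$ forward from the equation $\Omega g = -\lambda g$ for an arbitrary $\lambda$ and compares both sides of (3.20), using the weights $\mu_n = b_0\cdots b_{n-1}/(a_1\cdots a_n)$, which are proportional to $\pi_n$.

```python
from fractions import Fraction


def check_3_20(a, b, lam, g0):
    """Return the pairs (lhs, rhs) of identity (3.20) for n = 0, ..., len(b) - 1.

    a[i], b[i] are the death and birth rates of state i (a[0] must be 0).
    Exact rational arithmetic is used so that both sides can be compared with ==.
    """
    n_states = len(b)
    # mu_0 = 1, mu_n = b_0 ... b_{n-1} / (a_1 ... a_n); proportional to pi_n.
    mu = [Fraction(1)]
    for n in range(1, n_states):
        mu.append(mu[-1] * b[n - 1] / a[n])
    # Solve Omega g = -lam * g forward:
    # b_i (g_{i+1} - g_i) = a_i (g_i - g_{i-1}) - lam * g_i.
    g = [Fraction(g0)]
    for i in range(n_states):
        prev = g[i - 1] if i > 0 else Fraction(0)  # g_{-1} is irrelevant since a_0 = 0
        g.append(g[i] + (a[i] * (g[i] - prev) - lam * g[i]) / b[i])
    pairs = []
    partial = Fraction(0)
    for n in range(n_states):
        partial += mu[n] * g[n]
        lhs = mu[n] * b[n] * (g[n + 1] - g[n])
        rhs = -lam * partial
        pairs.append((lhs, rhs))
    return pairs


# Hypothetical rates, chosen only for illustration.
a = [0, 2, 3, 1, 4]
b = [1, 1, 2, 5, 3]
pairs = check_3_20(a, b, Fraction(1, 2), -1)
print(all(lhs == rhs for lhs, rhs in pairs))
```

Because the identity holds for every $\lambda$ and every starting value $g_0$, the check does not require $\lambda$ to be an eigenvalue.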

Proposition 3.4. Let $\lambda_1 > 0$ and $g$ be a solution to the equation $\Omega g = -\lambda_1 g$ with $g_0 < 0$. Then $g_i$ is strictly increasing.

Proof. Since $g_0 < 0$, by (3.20), we have $g_1 > g_0$. If $g_i$ is not strictly increasing, then there would exist an $n \ge 1$ such that

\[
g_0 < g_1 < \cdots < g_{n-1} < g_n \ge g_{n+1}. \tag{3.21}
\]

We are going to prove that this is impossible.

By (3.20), we have

\[
g_k < (\text{resp.}, =)\; g_{k+1} \iff \sum_{i=0}^{k} \pi_i g_i < (\text{resp.}, =)\; 0. \tag{3.22}
\]


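Define $\bar g_n = -\sum_{i=0}^{n-1} \pi_i g_i/\pi_n$ and $\tilde g_i = g_i I_{[i<n]} + g_n I_{[i\ge n]}$. Then, from (3.20)–(3.22), it follows that

\[
\sum_{i\le n-1} \pi_i g_i + \pi_n \bar g_n = 0, \tag{3.23}
\]

\[
g_n \ge \bar g_n = [\pi_{n-1} b_{n-1} (g_n - g_{n-1})]/(\lambda_1 \pi_n) = [a_n (g_n - g_{n-1})]/\lambda_1 > 0. \tag{3.24}
\]

With $\tilde g_i = g_i I_{[i<n]} + g_n I_{[i\ge n]}$ as above, we have

\[
\sum_i \pi_i \tilde g_i^2 = \sum_{i\le n-1} \pi_i g_i^2 + g_n^2 \sum_{i\ge n} \pi_i,
\]

\[
\sum_i \pi_i \tilde g_i = \sum_{i\le n-1} \pi_i g_i + g_n \sum_{i\ge n} \pi_i = g_n \sum_{i\ge n} \pi_i - \pi_n \bar g_n \quad (\text{by (3.23)}).
\]

Hence

\[
\sum_i \pi_i \tilde g_i^2 - \Bigl(\sum_i \pi_i \tilde g_i\Bigr)^2
= \sum_{i\le n-1} \pi_i g_i^2 + g_n^2 \sum_{i\ge n} \pi_i - \Bigl(g_n \sum_{i\ge n} \pi_i - \pi_n \bar g_n\Bigr)^2. \tag{3.25}
\]

Next,

\[
-\sum_i \pi_i \bigl(\tilde g\, \Omega \tilde g\bigr)(i)
= \lambda_1 \sum_{i\le n-1} \pi_i g_i^2 + \pi_n a_n g_n (g_n - g_{n-1})
= \lambda_1 \sum_{i\le n-1} \pi_i g_i^2 + \lambda_1 \pi_n g_n \bar g_n \quad (\text{by (3.24)}). \tag{3.26}
\]

We now prove that

\[
\pi_n g_n \bar g_n < g_n^2 \sum_{i\ge n} \pi_i - \Bigl(g_n \sum_{i\ge n} \pi_i - \pi_n \bar g_n\Bigr)^2. \tag{3.27}
\]

By (3.24), $g_n > 0$. Thus, (3.27) is equivalent to

\[
\frac{\pi_n \bar g_n}{g_n} < \sum_{i\ge n} \pi_i - \Bigl(\sum_{i\ge n} \pi_i - \pi_n \bar g_n/g_n\Bigr)^2.
\]

That is,

\[
\Bigl(\sum_{i\ge n} \pi_i - \frac{\pi_n \bar g_n}{g_n}\Bigr)^2 < \sum_{i\ge n} \pi_i - \frac{\pi_n \bar g_n}{g_n}.
\]

This clearly holds since $0 < \bar g_n \le g_n$ and

\[
0 < \sum_{i\ge n} \pi_i - \pi_n \bar g_n/g_n = \sum_{i\ge n+1} \pi_i + \pi_n (1 - \bar g_n/g_n) < 1.
\]

We have thus proved (3.27). Collecting (3.25)–(3.27) together, it follows that

\[
\lambda_1 \le \frac{-\sum_i \pi_i \bigl(\tilde g\, \Omega \tilde g\bigr)(i)}{\sum_i \pi_i \tilde g_i^2 - \bigl(\sum_i \pi_i \tilde g_i\bigr)^2}
= \frac{\lambda_1 \sum_{i\le n-1} \pi_i g_i^2 + \lambda_1 \pi_n g_n \bar g_n}{\sum_{i\le n-1} \pi_i g_i^2 + g_n^2 \sum_{i\ge n} \pi_i - \bigl(g_n \sum_{i\ge n} \pi_i - \pi_n \bar g_n\bigr)^2} < \lambda_1,
\]

which is a contradiction.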


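Proposition 3.5. Let $\lambda_1 > 0$ and $g$ be the function given by Proposition 3.4. Then $g \in L^1(\pi)$ and $\pi(g) = 0$.

Proof. By Proposition 3.4, we can define a positive sequence $u_i = g_{i+1} - g_i$, $i \ge 0$. From the eigen-equation $\Omega g = -\lambda_1 g$, it follows that

\[
b_i u_i - a_i u_{i-1} = -\lambda_1 g_i \quad (a_0 := 0), \qquad i \ge 0. \tag{3.28}
\]

Replacing $i$ with $i+1$, we obtain another equation. Taking the difference of these two equations, we get

\[
R_i(u) := (a_{i+1} u_i - b_{i+1} u_{i+1} - a_i u_{i-1} + b_i u_i)/u_i = \lambda_1 > 0, \qquad i \ge 0.
\]

By Propositions 3.3 and 3.4, we know that $\mu_n b_n u_n$ is strictly decreasing and so there is a limit $c := \lim_{n\to\infty} \mu_n b_n u_n \ge 0$. Set $u_{-1} = 0$. Define

\[
w_i = a_i u_{i-1} - b_i u_i + c/(Z - \mu_0) = \lambda_1 g_i + c/(Z - \mu_0) \quad (\text{by (3.28)}).
\]

Then

\[
(w_{i+1} - w_i)/u_i = R_i(u) = \lambda_1 > 0, \qquad i \ge 0. \tag{3.29}
\]

It follows that $w_i$ is strictly increasing. On the other hand, we have

\[
\begin{aligned}
\sum_{j\ge i+1} \mu_j w_j
&= \sum_{j\ge i+1} \bigl[\mu_j a_j u_{j-1} - \mu_j b_j u_j + c\mu_j/(Z - \mu_0)\bigr] \\
&= \sum_{j\ge i+1} \bigl[\mu_{j-1} b_{j-1} u_{j-1} - \mu_j b_j u_j + c\mu_j/(Z - \mu_0)\bigr] \\
&= \sum_{j\ge i+1} \bigl[\mu_{j-1} b_{j-1} u_{j-1} - \mu_j b_j u_j\bigr] + \frac{c}{Z - \mu_0} \sum_{j\ge i+1} \mu_j \\
&= b_i \mu_i u_i - c + \frac{c}{Z - \mu_0} \sum_{j\ge i+1} \mu_j \\
&= b_i \mu_i u_i - \frac{c}{Z - \mu_0} \sum_{1\le j\le i} \mu_j \le b_i \mu_i u_i, \qquad i \ge 0.
\end{aligned}
\tag{3.30}
\]

In particular, $\sum_{j\ge 1} \mu_j w_j = \mu_0 b_0 u_0 > 0$ and so $w \in L^1(\pi)$.

Next, because $w_0 = -b_0 u_0 + c/(Z - \mu_0)$, we see that

\[
\sum_j \mu_j w_j = w_0 + \sum_{j\ge 1} \mu_j w_j = c/(Z - \mu_0) \ge 0.
\]

This fact plus $w_i \uparrow\uparrow$ implies that $\sum_{j\ge i+1} \mu_j w_j > 0$ for all $i \ge 0$, as proved in part (a) of the analytic proof of (3.3) (§3.2).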


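Collecting the above facts together, we obtain

\[
\begin{aligned}
I_i(w)^{-1}
&= b_i \mu_i (w_{i+1} - w_i) \Big/ \sum_{j\ge i+1} \mu_j w_j \\
&= b_i \mu_i R_i(u) u_i \Big/ \Bigl[b_i \mu_i u_i - \frac{c}{Z - \mu_0} \sum_{1\le j\le i} \mu_j\Bigr] \\
&= \lambda_1 \Bigl[1 - \frac{c}{(Z - \mu_0) b_i \mu_i u_i} \sum_{1\le j\le i} \mu_j\Bigr]^{-1} \ge \lambda_1, \qquad i \ge 0.
\end{aligned}
\tag{3.31}
\]

Thus, $\inf_{i\ge 0} I_i(w)^{-1} \ge \lambda_1$. Combining this with (3.3), we get $\inf_{i\ge 0} I_i(w)^{-1} = \lambda_1$.

Noting that $\mu_n b_n u_n$ is strictly decreasing in $n$, equality must hold in the last inequality of (3.31), and hence in the last one of (3.30). Therefore $c = 0$. Then $w = \lambda_1 g$ and $\sum_j \mu_j w_j = c/(Z - \mu_0) = 0$. In other words, we must have $\pi(g) = 0$.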

Having these preparations at hand, the proof of Theorem 3.2 is quite easy.

Proof of Theorem 3.2. Since $\inf_{i\ge 0} I_i(w)^{-1} \ge 0$, by (3.3), the equalities in (3.19) become trivial when $\lambda_1 = 0$.

We now assume that $\lambda_1 > 0$. By (3.3), we have $\lambda_1 \ge \sup_{w\in W} \inf_{i\ge 0} I_i(w)^{-1}$. Combining this with Proposition 3.5 and its proof (by setting $w = \lambda_1 g$), we see that equality holds:

\[
\lambda_1 = \sup_{w\in W}\,\inf_{i\ge 0} I_i(w)^{-1}.
\]

One can replace the right-hand side by $\sup_{w\in W} \inf_{i\ge 0} I_i(w)^{-1}$ since $I_i(w)$ is invariant under the transform $w_i \to \alpha w_i + \beta$ for all $\alpha > 0$.

3.8 The first Dirichlet eigenvalue

We now turn to the first Dirichlet eigenvalue. This is a more traditional topic than the Neumann one, as explained in §3.6, and it will be studied from time to time in what follows. Here we consider Markov chains only. The results given in this section are quite similar to those of the last section.

Fix a point, say $0 \in E$. Then the first Dirichlet eigenvalue is defined by

\[
\lambda_0 = \inf\{D(f) : f(0) = 0 \text{ and } \pi(f^2) = 1\}.
\]

For each $i \in E$, choose a path $\gamma_i$ from $0$ to $i$ (without loop). Again, choose a positive weight function $w(e)$ on the edges and define $|\gamma_i|_w = \sum_{e\in\gamma_i} w(e)$,

\[
I(w)(e) = \frac{1}{a(e) w(e)} \sum_{i\ne 0:\ \gamma_i \ni e} |\gamma_i|_w \pi_i.
\]


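Theorem 3.6. We have $\lambda_0 \ge \sup_w \inf_e I(w)(e)^{-1}$.

Proof.

\[
\begin{aligned}
1 &= \sum_{i\ne 0} \pi_i f_i^2 = \sum_{i\ne 0} \pi_i (f_i - f_0)^2 = \sum_{i\ne 0} \pi_i \Bigl(\sum_{e\in\gamma_i} f(e)\Bigr)^2 \\
&\le \sum_{i\ne 0} \pi_i \sum_{e\in\gamma_i} \frac{f(e)^2}{w(e)}\, |\gamma_i|_w = \sum_e a(e) f(e)^2 I(w)(e) \\
&\le D(f, f)\, \sup_e I(w)(e).
\end{aligned}
\]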

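Theorem 3.7. Let $E = \{0, 1, 2, \cdots, N\}$, $N \le \infty$, $q_{i,i+1} = b_i > 0$ $(0 \le i \le N-1)$, $q_{i,i-1} = a_i > 0$ $(1 \le i \le N)$ and $q_{ij} = 0$ for other $i \ne j$. Define

\[
\mu_0 = 1, \qquad \mu_n = \frac{b_0 \cdots b_{n-1}}{a_1 \cdots a_n}, \quad 1 \le n \le N,
\]

\[
Z = \sum_{n=0}^{N} \mu_n, \qquad W = \{w : w_0 = 0,\ w_i \uparrow\uparrow\},
\]

\[
I_i(w) = \frac{1}{b_i \mu_i (w_{i+1} - w_i)} \sum_{j=i+1}^{N} \mu_j w_j, \qquad 0 \le i \le N-1,\ w \in W.
\]

Then we have

\[
\lambda_0 = \sup_{w\in W}\ \inf_{0\le i\le N-1} I_i(w)^{-1}.
\]

When $b_0 = 0$, the conclusion remains true if we redefine

\[
\mu_1 = 1, \qquad \mu_n = \frac{b_1 \cdots b_{n-1}}{a_2 \cdots a_n}, \quad 2 \le n \le N,
\]

\[
I_i(w) = \frac{1}{a_{i+1} \mu_{i+1} (w_{i+1} - w_i)} \sum_{j=i+1}^{N} \mu_j w_j, \qquad 0 \le i \le N-1,\ w \in W,
\]

\[
D(f) = \sum_{1\le i\le N-1} \pi_i b_i (f_{i+1} - f_i)^2 + \pi_1 a_1 f_1^2,
\]

\[
\lambda_0 = \inf\{D(f) : f_0 = 0,\ \pi(f^2) = 1\}.
\]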
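To see how the variational formula is used in practice, the sketch below evaluates $\inf_i I_i(w)^{-1}$ for one test sequence $w \in W$ (a lower bound for $\lambda_0$ by the theorem) and the Rayleigh quotient $D(w)/\pi(w^2)$ of the same sequence (an upper bound, directly from the definition of $\lambda_0$). It covers only the case $b_0 > 0$ and finite $N$; the function names and the chain with $a_i = b_i = 1$ are illustrative assumptions, not taken from the text.

```python
def mu_weights(a, b):
    """mu_0 = 1, mu_n = b_0 ... b_{n-1} / (a_1 ... a_n) for the states 0..N."""
    mu = [1.0]
    for n in range(1, len(a)):
        mu.append(mu[-1] * b[n - 1] / a[n])
    return mu


def lower_bound(a, b, w):
    """inf over i of I_i(w)^{-1}, with I_i(w) as in Theorem 3.7 (case b_0 > 0)."""
    mu = mu_weights(a, b)
    N = len(a) - 1
    values = []
    for i in range(N):
        tail = sum(mu[j] * w[j] for j in range(i + 1, N + 1))
        values.append(b[i] * mu[i] * (w[i + 1] - w[i]) / tail)
    return min(values)


def rayleigh_quotient(a, b, f):
    """D(f) / pi(f^2) for f with f_0 = 0; the normalizing constant Z cancels."""
    mu = mu_weights(a, b)
    N = len(a) - 1
    energy = sum(mu[i] * b[i] * (f[i + 1] - f[i]) ** 2 for i in range(N))
    return energy / sum(mu[i] * f[i] ** 2 for i in range(N + 1))


# Hypothetical example: a_i = b_i = 1 on {0, ..., 6}, absorbing side at 0.
N = 6
a = [0.0] + [1.0] * N      # a_0 = 0
b = [1.0] * N + [0.0]      # b_N = 0
w = [float(i) for i in range(N + 1)]   # w_0 = 0, strictly increasing

lo = lower_bound(a, b, w)
hi = rayleigh_quotient(a, b, w)
print(lo, hi)   # lambda_0 lies between these two numbers
```

A better choice of $w$ narrows the gap between the two bounds; by the theorem, the lower bound becomes exact when $w$ is proportional to the eigenfunction.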

Proof. (a) Again, let $e_i$ be the edge $\langle i, i+1\rangle$. For each $i \ge 1$, there is a path consisting of $e_0, e_1, \cdots, e_{i-1}$. Take $w(e_i) = w_{i+1} - w_i$. Then

\[
\sum_{k:\ \gamma_k \ni e_i} |\gamma_k|_w \pi_k = \sum_{k=i+1}^{N} (w_k - w_0) \pi_k = \sum_{k=i+1}^{N} \pi_k w_k.
\]

Now, the inequality "$\lambda_0 \ge \cdots$" follows from Theorem 3.6.

(b) The remainder of the proof is similar to the proof of Theorem 3.2 given in §3.7. However, we still present the details here for completeness. Let $\lambda_0 > 0$


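and $g \not\equiv 0$ with $g_0 = 0$ be a solution to the equation $\Omega g(i) = -\lambda_0 g_i$, $1 \le i \le N$. Here, we adopt the convention that $a_0 = 0$ and $b_N = 0$. The key to proving the equality is to show the strict monotonicity of $(g_i)$. Once this is done, without loss of generality, assume that $g_i \uparrow\uparrow$; then we have

\[
I_i(g) = \frac{1}{a_{i+1} \mu_{i+1} (g_{i+1} - g_i)} \sum_{j=i+1}^{N} \mu_j g_j \equiv \frac{1}{\lambda_0} \tag{3.32}
\]

for all $0 \le i \le N-1$ and hence the required assertion follows.

(c) To see that (3.32) holds, first, we show that

\[
-\lambda_0 \sum_{i=1}^{n} \pi_i g_i = \pi_{n+1} a_{n+1} (g_{n+1} - g_n) - \pi_1 a_1 g_1, \qquad 1 \le n \le N. \tag{3.33}
\]

Here, we use the convention $a_{N+1} = 0$ provided $N < \infty$. The proof is easy:

\[
\begin{aligned}
-\lambda_0 \sum_{i=1}^{n} \pi_i g_i
&= \sum_{i=1}^{n} \pi_i \Omega g(i) = \sum_{i=1}^{n} \bigl[\pi_i a_i (g_{i-1} - g_i) + \pi_i b_i (g_{i+1} - g_i)\bigr] \\
&= \sum_{i=1}^{n} \bigl[-\pi_i a_i (g_i - g_{i-1}) + \pi_{i+1} a_{i+1} (g_{i+1} - g_i)\bigr] \\
&= \pi_{n+1} a_{n+1} (g_{n+1} - g_n) - \pi_1 a_1 g_1.
\end{aligned}
\]

Let $u_i = g_{i+1} - g_i$, $0 \le i \le N-1$. Although it is not necessary, for definiteness we set $u_N = 1$ when $N < \infty$. By the eigen-equation, we have

\[
b_i u_i - a_i u_{i-1} = -\lambda_0 g_i, \qquad 1 \le i \le N.
\]

Then,

\[
R_i(u) := (a_{i+1} u_i - b_{i+1} u_{i+1} - a_i u_{i-1} + b_i u_i)/u_i = \lambda_0 > 0, \qquad 1 \le i \le N-1.
\]

By (3.33) and the assumption $g_i \uparrow\uparrow$, $g_0 = 0$, it follows that

\[
0 \le \mu_{n+1} a_{n+1} u_n = \mu_1 a_1 g_1 - \lambda_0 \sum_{i=1}^{n} \mu_i g_i \le \mu_1 a_1 g_1, \qquad 1 \le n \le N-1.
\]

Thus, $\mu_{n+1} a_{n+1} u_n$ is decreasing in $n$ and

\[
0 \le c := \lim_{n\to N} \mu_{n+1} a_{n+1} u_n \le \mu_1 a_1 g_1.
\]

Note that $c = 0$ when $N < \infty$. Next, let

\[
w_i = a_i u_{i-1} - b_i u_i + c/(Z - \mu_0) = \lambda_0 g_i + c/(Z - \mu_0) > 0, \qquad 1 \le i \le N.
\]

Then

\[
(w_{i+1} - w_i)/u_i = R_i(u) = \lambda_0 > 0, \qquad 1 \le i \le N-1.
\]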


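This implies that $w_i \uparrow\uparrow$. Therefore,

\[
\begin{aligned}
\sum_{j=i+1}^{N} \mu_j w_j
&= \sum_{j=i+1}^{N} (\mu_j a_j u_{j-1} - \mu_j b_j u_j) + \frac{c}{Z - \mu_0} \sum_{j=i+1}^{N} \mu_j \\
&= \sum_{j=i+1}^{N} (\mu_j a_j u_{j-1} - \mu_{j+1} a_{j+1} u_j) + \frac{c}{Z - \mu_0} \sum_{j=i+1}^{N} \mu_j \\
&= \mu_{i+1} a_{i+1} u_i - c + \frac{c}{Z - \mu_0} \sum_{j=i+1}^{N} \mu_j \\
&= \mu_{i+1} a_{i+1} u_i - \frac{c}{Z - \mu_0} \sum_{1\le j\le i} \mu_j, \qquad 0 \le i \le N-1.
\end{aligned}
\]

Define additionally $w_0 = 0$. Since $w_1 > 0$, it is clear that $w \in W$. We have

\[
\begin{aligned}
I_i(w)^{-1}
&= \mu_{i+1} a_{i+1} (w_{i+1} - w_i) \Big/ \sum_{j=i+1}^{N} \mu_j w_j \\
&= \mu_{i+1} a_{i+1} R_i(u) u_i \Big/ \Bigl[\mu_{i+1} a_{i+1} u_i - \frac{c}{Z - \mu_0} \sum_{1\le j\le i} \mu_j\Bigr] \\
&= \lambda_0 \Bigl[1 - \frac{c}{(Z - \mu_0) \mu_{i+1} a_{i+1} u_i} \sum_{1\le j\le i} \mu_j\Bigr]^{-1} \ge \lambda_0, \qquad 1 \le i \le N-1,
\end{aligned}
\]

\[
I_0(w)^{-1} = \mu_1 a_1 w_1 \Big/ \sum_{j=1}^{N} \mu_j w_j = \mu_1 a_1 \Bigl(\lambda_0 u_0 + \frac{c}{Z - \mu_0}\Bigr) \Big/ (a_1 \mu_1 u_0) = \lambda_0 + \frac{c}{(Z - \mu_0) u_0} \ge \lambda_0.
\]

Collecting these two estimates together, we get

\[
\sup_{w\in W}\ \inf_{0\le i\le N-1} I_i(w)^{-1} \ge \inf_{0\le i\le N-1} I_i(w)^{-1} \ge \lambda_0.
\]

Combining this with proof (a), we know that $\inf_{0\le i\le N-1} I_i(w)^{-1} = \lambda_0$. When $N < \infty$, we have $c = 0$ and so $w_i = \lambda_0 g_i$. Hence (3.32) holds. We now show that when $N = \infty$, we still have $c = 0$ and so (3.32) also holds. Otherwise, since $\mu_{i+1} a_{i+1} u_i$ is decreasing in $i$, we have $\inf_{1\le i\le N-1} I_i(w)^{-1} = I_1(w)^{-1}$. From this, we would get a contradiction with $\inf_{0\le i\le N-1} I_i(w)^{-1} = \lambda_0$ provided $c > 0$.

We have thus completed the proof of (3.32) under the assumption that $g_i \uparrow\uparrow$.

(d) We now prove the strict monotonicity of the eigenfunction $(g_i)$ of $\lambda_0$. By (3.33), we have $g_1 \ne 0$. Otherwise, by induction, we would have $g_i \equiv 0$ for all $i \ge 1$. Thus, we may assume that $g_1 > 0$. Suppose that there is an $n$ with $1 \le n \le N-1$ such that

\[
0 = g_0 < g_1 < \cdots < g_{n-1} < g_n \ge g_{n+1}.
\]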


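Define $\tilde g_i = g_i I_{[i<n]} + g_n I_{[i\ge n]}$. Then, we have

\[
\sum_i \pi_i \tilde g_i^2 = \sum_{i\le n-1} \pi_i g_i^2 + g_n^2 \sum_{i=n}^{N} \pi_i,
\]

\[
-\sum_i \pi_i (\tilde g\, \Omega \tilde g)(i) = \lambda_0 \sum_{i\le n-1} \pi_i g_i^2 + \pi_n a_n g_n (g_n - g_{n-1}).
\]

Note that

\[
\lambda_0 g_n = -\Omega g(n) = b_n (g_n - g_{n+1}) + a_n (g_n - g_{n-1}) \ge a_n (g_n - g_{n-1}).
\]

We have

\[
\pi_n a_n g_n (g_n - g_{n-1}) \le \lambda_0 \pi_n g_n^2 < \lambda_0 g_n^2 \sum_{i=n}^{N} \pi_i.
\]

Therefore,

\[
\lambda_0 \le \frac{-\sum_i \pi_i (\tilde g\, \Omega \tilde g)(i)}{\sum_i \pi_i \tilde g_i^2}
= \frac{\lambda_0 \sum_{i\le n-1} \pi_i g_i^2 + \pi_n a_n g_n (g_n - g_{n-1})}{\sum_{i\le n-1} \pi_i g_i^2 + g_n^2 \sum_{i=n}^{N} \pi_i} < \lambda_0,
\]

which is a contradiction.

(e) As for the last assertion of the theorem, simply note that in the above proofs (a)–(d), we make no use of $\pi_0$ (recall that $g_0 = 0$) and $b_0$. Moreover, the original $I_i(w)$ is homogeneous in $(\mu_i)$. Actually, when $b_0 > 0$,

\[
\begin{aligned}
\lambda_0 &= \inf_{f_0 = 0,\ f \ne 0}\ \sum_{0\le i\le N-1} \pi_i b_i (f_{i+1} - f_i)^2 \Big/ \sum_{0\le i\le N} \pi_i f_i^2 \\
&= \inf_{f \ne 0} \Bigl[\sum_{1\le i\le N-1} \pi_i b_i (f_{i+1} - f_i)^2 + \pi_1 a_1 f_1^2\Bigr] \Big/ \sum_{1\le i\le N} \pi_i f_i^2 \\
&= \inf_{f \ne 0} \Bigl[\sum_{1\le i\le N-1} \mu_i b_i (f_{i+1} - f_i)^2 + \mu_1 a_1 f_1^2\Bigr] \Big/ \sum_{1\le i\le N} \mu_i f_i^2 \\
&= \inf_{f \ne 0} \Bigl[\sum_{1\le i\le N-1} \mu_i b_i (f_{i+1} - f_i)^2 + \mu_1 a_1 f_1^2\Bigr] \Big/ \sum_{1\le i\le N} \mu_i f_i^2.
\end{aligned}
\]

Thus, we are studying the process with Dirichlet form $D(f)$ $\bigl(\pi_i := \mu_i \big/ \sum_{0\le j\le N} \mu_j\bigr)$ on the state space $\{1, 2, \cdots, N\}$ and with killing rate $a_1$. No role is played by $b_0$.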


Chapter 4

Generalized Cheeger's Method

From the previous chapters, we have seen an application of a probabilistic method to a problem in Riemannian geometry. This chapter goes in the opposite direction. We use Cheeger's method, which comes from Riemannian geometry, to study some probabilistic problems. We begin with a review of Cheeger's method in geometry. Then we move to a generalization and present our new results. In particular, we examine Cheeger's splitting technique and prove an existence criterion for the spectral gap. In Sections 4.5–4.8, we sketch the proofs of the main theorems. Applications to birth–death processes are collected in the last section (§4.9).

4.1 Cheeger's method

Let us recall Cheeger's inequality in geometry. Again, let $M$ be a connected compact Riemannian manifold. We consider the first non-trivial eigenvalue $\lambda_1$ of the Laplacian $\Delta$. We will also study the first Dirichlet eigenvalue, denoted by $\lambda_0$. Here is the geometric result.

Theorem 4.1 (Cheeger's inequality, 1970). We have

\[
k \ge \lambda_1 \ge \frac{1}{4} k^2,
\]

where

\[
k := \inf_{M_1, M_2:\ M_1 \cup M_2 = M} \frac{\mathrm{Area}(M_1 \cap M_2)}{\mathrm{Vol}(M_1) \wedge \mathrm{Vol}(M_2)},
\]

which is called Cheeger's constant.

As usual, $\mathrm{Vol}(M)$ and $\mathrm{Area}(S)$ denote the Riemannian volume of $M$ and the area of $S$, respectively.

The key ideas in establishing this inequality are the following.


• Splitting technique: $\lambda_1 \ge \inf_B [\lambda_0(B) \vee \lambda_0(B^c)]$. That is, splitting the space into two parts $B$ and $B^c$, and then estimating the first eigenvalue $\lambda_1$ in terms of the first Dirichlet eigenvalues $\lambda_0(B)$ and $\lambda_0(B^c)$.

• Estimate $\lambda_0(B)$ in terms of another Cheeger's constant

\[
h = \inf_{M_1 \subset M,\ \partial M_1 \cap \partial M = \emptyset} \frac{\mathrm{Area}(\partial M_1)}{\mathrm{Vol}(M_1)}.
\]

The last constant is closely related to the isoperimetric inequality:

\[
\frac{\mathrm{Area}(\partial A)}{\mathrm{Vol}(A)^{(d-1)/d}} \ge \frac{\mathrm{Area}(S^{d-1})}{\mathrm{Vol}(B^d)^{(d-1)/d}},
\]

where the right-hand side is called the isoperimetric constant. It was first observed by J. Cheeger (1970) that the proof of the classical isoperimetric inequality can also be used to study the first eigenvalue $\lambda_1$. Certainly, one can replace Lebesgue measure with others. The isoperimetric inequality with respect to Gaussian measure was studied by P. Lévy (1919) [see Lévy (1951, Chapter IV)] and extended by M. Gromov (1980; 1999), S. G. Bobkov (1996; 1997), S. G. Bobkov and F. Götze (1999a), D. Bakry and M. Ledoux (1996), even in the infinite-dimensional setting. Mainly, these studies are concerned with differential operators. However, in this chapter, we go in another direction, namely, studying integral operators.

4.2 A generalization

Let $(E, \mathscr{E}, \pi)$ be a probability space satisfying $\{(x, x) : x \in E\} \in \mathscr{E} \times \mathscr{E}$. Denote by $L^p(\pi)$ the usual real $L^p$-space with norm $\|\cdot\|_p$. Write $\|\cdot\| = \|\cdot\|_2$ for simplicity. In this chapter, we consider mainly a symmetric form $(D, \mathscr{D}(D))$ (not necessarily a Dirichlet form) on $L^2(\pi)$:

\[
\begin{aligned}
D(f) &= \frac{1}{2} \int_{E\times E} J(dx, dy) [f(y) - f(x)]^2, \\
\mathscr{D}(D) &= \{f \in L^2(\pi) : D(f) < \infty\},
\end{aligned}
\tag{4.1}
\]

where $J \ge 0$ is a symmetric measure having no charge on the diagonal set $\{(x, x) : x \in E\}$. A typical example is as follows. For a $q$-pair $(q(x), q(x, dy))$, reversible with respect to $\pi$ (i.e., $\pi(dx) q(x, dy) = \pi(dy) q(y, dx)$), we simply take

\[
J(dx, dy) = \pi(dx) q(x, dy).
\]

More specifically, for a $Q$-matrix $Q = (q_{ij})$, reversible with respect to $(\pi_i > 0)$ (i.e., $\pi_i q_{ij} = \pi_j q_{ji}$ for all $i, j$), we take

\[
J_{ij} = \pi_i q_{ij} \quad (j \ne i).
\]
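For a finite $Q$-matrix the form (4.1) is a finite double sum, which the following minimal sketch evaluates. The two-state chain and the names `dirichlet_form`, `variance`, `up`, `down` are illustrative assumptions. On two states the ratio $D(f)/\mathrm{Var}_\pi(f)$ is the same for every non-constant $f$ and equals the sum of the two rates, which gives a simple sanity check of the normalization $\tfrac12$ in (4.1).

```python
def dirichlet_form(pi, q, f):
    """D(f) = 1/2 * sum over i != j of J_ij (f_j - f_i)^2 with J_ij = pi_i q_ij."""
    n = len(pi)
    return 0.5 * sum(pi[i] * q[i][j] * (f[j] - f[i]) ** 2
                     for i in range(n) for j in range(n) if i != j)


def variance(pi, f):
    """pi(f^2) - pi(f)^2, computed as the pi-mean of (f - pi(f))^2."""
    mean = sum(p * x for p, x in zip(pi, f))
    return sum(p * (x - mean) ** 2 for p, x in zip(pi, f))


# Hypothetical two-state chain: q_01 = up, q_10 = down.
up, down = 2.0, 3.0
q = [[-up, up], [down, -down]]
pi = [down / (up + down), up / (up + down)]

# Reversibility: pi_i q_ij = pi_j q_ji, so J is symmetric.
symmetric = abs(pi[0] * q[0][1] - pi[1] * q[1][0]) < 1e-12

f = [1.0, -4.0]
ratio = dirichlet_form(pi, q, f) / variance(pi, f)
print(symmetric, ratio)   # the ratio equals up + down for every non-constant f
```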


Naturally, define

\[
\lambda_1 = \inf\{D(f) : \pi(f) = 0,\ \|f\| = 1\}, \qquad \pi(f) := \int f\, d\pi.
\]

We call $\lambda_1$ the spectral gap of $(D, \mathscr{D}(D))$.

For bounded jump processes, the fundamental known result is due to G. F. Lawler and A. D. Sokal (1988), stated in Theorem 1.6. As mentioned in Section 1.3, the last result has a very wide range of applications and has been collected in several books. Its main shortcoming is the restriction to bounded operators.

On the other hand, for differential operators, there are a great many publications on the logarithmic Sobolev inequality (1.26). Refer to D. Bakry (1992), L. Gross (1993), S. Aida and I. Shigekawa (1994), A. Guionnet and B. Zegarlinski (2003) and references within. However, the known results for integral operators are still rather limited. Here is a general result for Markov chains.

Theorem 4.2 (P. Diaconis and L. Saloff-Coste, 1996). Let the $Q$-matrix be reversible with respect to $(\pi_i > 0)$ and satisfy $\sum_j |q_{ij}| = 1$. Define

\[
\mathrm{Ent}(f) = \sum_i \pi_i f_i \log f_i - \sum_i \pi_i f_i \log \sum_i \pi_i f_i, \qquad f > 0,
\]

and $D(f) = \sum_i \pi_i q_{ij} (f_j - f_i)^2$. Then the optimal constant $\sigma$ in the logarithmic Sobolev inequality

\[
\mathrm{Ent}(f^2) \le \frac{2}{\sigma} D(f)
\]

satisfies

\[
\sigma \ge \frac{2(1 - 2\pi_*) \lambda_1}{\log[1/\pi_* - 1]}, \qquad \pi_* := \min_i \pi_i.
\]

This result is very good since it can be sharp. However, it works only for finite state spaces.

For the Nash inequality (1.27), the knowledge is more or less at the same level.

4.3 New results

To avoid unboundedness, our goal is to use a renormalizing procedure. Choose a nonnegative symmetric function $r$ such that

\[
J^{(1)}(dx, E)/\pi(dx) \le 1, \qquad \pi\text{-a.e.}, \tag{4.2}
\]

where

\[
J^{(\alpha)}(dx, dy) = I_{\{r(x,y)^\alpha > 0\}} \frac{J(dx, dy)}{r(x, y)^\alpha}, \qquad \alpha \in [0, 1].
\]

Then, corresponding to each inequality, define a new Cheeger's constant, as listed in Table 1.3. Finally, one of our main results can be stated as follows (Theorem 1.7).


Theorem 4.3. If $k^{(1/2)} > 0$, then the corresponding inequality holds.

Even though the result is very simple to state, its proof was completed in four papers: Chen and F. Y. Wang (1998), Chen (1999b; 2000b), F. Y. Wang (2001a). Of course, many more results were proved in these papers. For instance, here is a lower estimate of $\lambda_1$.

Theorem 4.4.

\[
\lambda_1 \ge \frac{\bigl(k^{(1/2)}\bigr)^2}{1 + \sqrt{1 - \bigl(k^{(1)}\bigr)^2}}. \tag{4.3}
\]

Surprisingly, this estimate can be sharp, which is rather unusual for Cheeger's approach.

In parallel, the lower estimates for the logarithmic Sobolev and the Nash inequalities are presented in Sections 4.6 and 4.8, respectively. Some upper estimates are given in Section 4.7.

The main advantage of Cheeger's approach is that it works in a very general setting. Here is an example.

Corollary 4.5. Let $(E, \mathscr{E}, \pi)$ be a probability space and let $j(x, y) \ge 0$ be a symmetric function satisfying $j(x, x) = 0$ and

\[
j(x) := \int_E j(x, y) \pi(dy) < \infty, \qquad x \in E.
\]

Then, for the symmetric form generated by

\[
J(dx, dy) = j(x, y) \pi(dx) \pi(dy),
\]

we have

\[
\lambda_1 \ge \frac{1}{8} \inf_{x \ne y} \frac{j(x, y)^2}{j(x) \vee j(y)}.
\]

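Proof. Note that

\[
\begin{aligned}
k^{(\alpha)} &= \inf_{\pi(A) \in (0, 1/2]} \frac{1}{\pi(A)} \int_{A \times A^c} \frac{j(x, y)}{[j(x) \vee j(y)]^\alpha}\, \pi(dx) \pi(dy) \\
&\ge \inf_{x \ne y} \frac{j(x, y)}{[j(x) \vee j(y)]^\alpha}\ \inf_{\pi(A) \in (0, 1/2]} \pi(A^c) \\
&\ge \frac{1}{2} \inf_{x \ne y} \frac{j(x, y)}{[j(x) \vee j(y)]^\alpha}.
\end{aligned}
\]

The conclusion now follows from (4.3) immediately.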
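The last step uses $1 + \sqrt{1 - (k^{(1)})^2} \le 2$, so that (4.3) gives $\lambda_1 \ge (k^{(1/2)})^2/2 \ge \tfrac18 \inf_{x\ne y} j(x,y)^2/[j(x)\vee j(y)]$. The sketch below checks this chain numerically on a finite example. It is an illustration under stated assumptions: the chain is the complete graph on four states with all off-diagonal rates equal to $c$ and uniform $\pi$ (for which $-Q = c(nI - \mathbf{1}\mathbf{1}^{T})$, so $\lambda_1 = nc$), the renormalizing function is taken as $r(i, j) = q_i \vee q_j$ with $q_i$ the total jump rate out of $i$, and the function names are hypothetical.

```python
from itertools import combinations
from math import sqrt


def cheeger_constant(pi, q, alpha):
    """Brute-force k^(alpha): inf over A with pi(A) in (0, 1/2] of
    J^(alpha)(A x A^c) / pi(A), where J_ij = pi_i q_ij and r(i, j) = q_i v q_j."""
    n = len(pi)
    rate = [sum(q[i][j] for j in range(n) if j != i) for i in range(n)]
    best = float("inf")
    for size in range(1, n):
        for A in combinations(range(n), size):
            mass = sum(pi[i] for i in A)
            if mass > 0.5 + 1e-12:
                continue
            flow = sum(pi[i] * q[i][j] / max(rate[i], rate[j]) ** alpha
                       for i in A for j in range(n) if j not in A)
            best = min(best, flow / mass)
    return best


def corollary_bound(pi, q):
    """(1/8) inf_{x != y} j(x, y)^2 / (j(x) v j(y)) with j(i, k) = q_ik / pi_k."""
    n = len(pi)
    rate = [sum(q[i][j] for j in range(n) if j != i) for i in range(n)]
    ratios = [(q[i][k] / pi[k]) ** 2 / max(rate[i], rate[k])
              for i in range(n) for k in range(n) if i != k]
    return min(ratios) / 8.0


n, c = 4, 1.0
pi = [1.0 / n] * n
q = [[c if i != j else -(n - 1) * c for j in range(n)] for i in range(n)]

k_half = cheeger_constant(pi, q, 0.5)
k_one = cheeger_constant(pi, q, 1.0)
bound_43 = k_half ** 2 / (1.0 + sqrt(max(0.0, 1.0 - k_one ** 2)))
bound_cor = corollary_bound(pi, q)
lam1 = n * c   # exact spectral gap of this particular chain
print(bound_cor, bound_43, lam1)
```

The subset enumeration is exponential in the number of states, so this is only meant for very small examples.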

Cheeger's approach and the isoperimetric method have also been applied to the $L^p$-setup for jump processes by F. Wang and Y. H. Zhang (2003), Y. H. Mao (2001a; 2001b), to diffusions by F. Y. Wang (2000a), and to the weaker Poincaré inequalities (which will be discussed in Chapter 7) by M. Röckner and F. Y. Wang (2001).


4.4 Splitting technique and existence criterion

Recall the definition:

\[
\begin{aligned}
\lambda_1 &= \inf\{D(f) : f \in \mathscr{D}(D),\ \pi(f) = 0,\ \pi(f^2) = 1\}, \\
\lambda_0(A) &= \inf\{D(f) : f \in \mathscr{D}(D),\ f|_{A^c} = 0,\ \pi(f^2) = 1\}.
\end{aligned}
\]

As mentioned in the last section, the reduction of the Neumann case to the Dirichlet one is based on Cheeger's splitting technique. Here is the result proved by G. F. Lawler and A. D. Sokal (1988) for bounded operators:

\[
\lambda_1 \ge \inf_B\, [\lambda_0(B) \vee \lambda_0(B^c)].
\]

However, we are unable to extend this result to unbounded symmetric forms. What we have instead is the following weaker version:

\[
\lambda_1 \ge \inf_{\pi(B) \in (0, 1/2]} \lambda_0(B).
\]

More precisely, we have

Theorem 4.6 (Chen and F. Y. Wang (1998), Chen (2000c)). For the above symmetric form or general Dirichlet form $(D, \mathscr{D}(D))$, we have

\[
\inf_{\pi(A) \in (0, 1/2]} \lambda_0(A) \le \lambda_1 \le 2 \inf_{\pi(A) \in (0, 1/2]} \lambda_0(A). \tag{4.4}
\]

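Proof. (a) Let $f \in \mathscr{D}(D)$ be such that $f|_{A^c} = 0$ and $\pi(f^2) = 1$. Then

\[
\pi(f^2) - \pi(f)^2 = 1 - \pi(f I_A)^2 \ge 1 - \pi(f^2) \pi(A) = 1 - \pi(A) = \pi(A^c).
\]

Hence

\[
\lambda_1 \le \frac{D(f)}{\pi(f^2) - \pi(f)^2} \le \frac{D(f)}{\pi(A^c)},
\]

which implies that

\[
\lambda_1 \le \frac{\lambda_0(A)}{\pi(A^c)}.
\]

Furthermore,

\[
\begin{aligned}
\lambda_1 &\le \inf_{\pi(A) \in (0, 1)} \min\Bigl\{\frac{\lambda_0(A)}{\pi(A^c)},\ \frac{\lambda_0(A^c)}{\pi(A)}\Bigr\} \\
&= \inf_{\pi(A) \in (0, 1/2]} \min\Bigl\{\frac{\lambda_0(A)}{\pi(A^c)},\ \frac{\lambda_0(A^c)}{\pi(A)}\Bigr\} \\
&\le \inf_{\pi(A) \in (0, 1/2]} \lambda_0(A)/\pi(A^c) \\
&\le 2 \inf_{\pi(A) \in (0, 1/2]} \lambda_0(A).
\end{aligned}
\]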


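This part of the proof works for general $D(f)$.

(b) Next, for $\varepsilon > 0$, choose $f_\varepsilon$ such that $\pi(f_\varepsilon) = 0$, $\pi(f_\varepsilon^2) = 1$ and $\lambda_1 + \varepsilon \ge D(f_\varepsilon)$. Choose $c_\varepsilon$ such that $\pi(f_\varepsilon < c_\varepsilon) \le 1/2$ and $\pi(f_\varepsilon > c_\varepsilon) \le 1/2$. Set $f_\varepsilon^\pm = (f_\varepsilon - c_\varepsilon)^\pm$, and define $B_\varepsilon^\pm = \{f_\varepsilon^\pm > 0\}$. Recall that

\[
D(f) = \frac{1}{2} \int J(dx, dy) [f(y) - f(x)]^2.
\]

We have

\[
\begin{aligned}
\lambda_1 + \varepsilon &\ge D(f_\varepsilon) = D(f_\varepsilon - c_\varepsilon) \\
&= \frac{1}{2} \int J(dx, dy) \Bigl[\bigl|f_\varepsilon^+(y) - f_\varepsilon^+(x)\bigr| + \bigl|f_\varepsilon^-(y) - f_\varepsilon^-(x)\bigr|\Bigr]^2.
\end{aligned}
\]

Therefore

\[
\begin{aligned}
\lambda_1 + \varepsilon &\ge \frac{1}{2} \int J(dx, dy) \bigl(f_\varepsilon^+(y) - f_\varepsilon^+(x)\bigr)^2 + \frac{1}{2} \int J(dx, dy) \bigl(f_\varepsilon^-(y) - f_\varepsilon^-(x)\bigr)^2 \\
&\ge \lambda_0\bigl(B_\varepsilon^+\bigr)\, \pi\bigl((f_\varepsilon^+)^2\bigr) + \lambda_0\bigl(B_\varepsilon^-\bigr)\, \pi\bigl((f_\varepsilon^-)^2\bigr) \\
&\ge \inf_{\pi(B) \in (0, 1/2]} \lambda_0(B)\ \pi\bigl((f_\varepsilon^+)^2 + (f_\varepsilon^-)^2\bigr) \\
&= (1 + c_\varepsilon^2) \inf_{\pi(B) \in (0, 1/2]} \lambda_0(B) \\
&\ge \inf_{\pi(B) \in (0, 1/2]} \lambda_0(B).
\end{aligned}
\]

Because $\varepsilon$ is arbitrary, the proof is done for the first case.

(c) Finally, for a general Dirichlet form, since

\[
D(f) = \lim_{t \downarrow 0} \frac{1}{2t} \int \pi(dx) P_t(x, dy) [f(y) - f(x)]^2, \tag{4.5}
\]

the proof needs only a little modification.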
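The only analytic input in part (a) is the Cauchy–Schwarz bound $\pi(f I_A)^2 \le \pi(f^2)\pi(A)$, which turns the normalization $\pi(f^2) = 1$ into the variance bound $\pi(f^2) - \pi(f)^2 \ge \pi(A^c)$. The sketch below checks that step on a four-point space; the measure `pi`, the set `A`, the test functions and the helper name `variance_gap` are illustrative assumptions.

```python
from math import sqrt


def variance_gap(pi, A, raw):
    """Restrict raw to A, normalize so that pi(f^2) = 1, and return the pair
    (pi(f^2) - pi(f)^2, pi(A^c)) appearing in part (a) of the proof."""
    f = [raw[i] if i in A else 0.0 for i in range(len(pi))]
    norm = sqrt(sum(p * x * x for p, x in zip(pi, f)))
    f = [x / norm for x in f]
    mean = sum(p * x for p, x in zip(pi, f))
    return 1.0 - mean ** 2, 1.0 - sum(pi[i] for i in A)


pi = [0.1, 0.2, 0.3, 0.4]
A = {0, 2}

# A non-constant function on A: the inequality is strict.
var1, comp1 = variance_gap(pi, A, [3.0, 0.0, 1.0, 0.0])
# A function constant on A: Cauchy-Schwarz is an equality.
var2, comp2 = variance_gap(pi, A, [1.0, 0.0, 1.0, 0.0])
print(var1, comp1, var2, comp2)
```

The equality case (indicator-type functions) shows that the factor $1/\pi(A^c)$ in $\lambda_1 \le \lambda_0(A)/\pi(A^c)$ cannot be improved by this argument alone.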

Applications to the Neumann eigenvalue

We need the following result. Let $E$ be a locally compact separable metric space with Borel $\sigma$-algebra $\mathscr{E}$, $\mu$ be an everywhere dense Radon measure on $E$ and $(D, \mathscr{D}(D))$ be a Dirichlet form on $L^2(\mu) = L^2(E; \mu)$.

The next result is due to V. G. Maz'ya (1973) (cf. Maz'ya (1985) and references within) in a particular case and to Z. Vondraček (1996) in general. Its proof was simplified by M. Fukushima and T. Uemura (2002).

Theorem 4.7. For a regular transient Dirichlet form on $A \in \mathscr{E}$,

\[
(4\Theta(A))^{-1} \le \lambda_0(A) \le \Theta(A)^{-1},
\]

where

\[
\Theta(A) = \sup_{\text{compact } K \subset A} \frac{\pi(K)}{\mathrm{Cap}(K)}.
\]


[Figure 4.1: Four choices of boundary condition. The space $E$ is split into $A$ and $A^c$; (D) marks a Dirichlet boundary and (N) a Neumann boundary.]

Combining the above two results together, we immediately obtain the following one.

Theorem 4.8. Let $\mu(E) < \infty$ and $\mu(\partial O) = 0$ for all open $O$. Then for a regular Dirichlet form, we have

\[
\inf_{\text{open } A:\ \pi(A) \in (0, 1/2]} (4\Theta(A))^{-1} \le \lambda_1 \le 2 \inf_{\text{open } A:\ \pi(A) \in (0, 1/2]} \Theta(A)^{-1}.
\]

In particular, $\lambda_1 > 0$ iff $\sup_{\text{open } A:\ \pi(A) \in (0, 1/2]} \Theta(A) < \infty$.

We now study the existence criterion in a different way. For a compact state space $(E, \mathscr{E})$, it is often true that $\lambda_1 > 0$. Thus, we need only consider the non-compact case. The idea is to use Cheeger's splitting technique. Split the space $E$ into two parts $A$ and $A^c$. Mainly, there are two boundary conditions: the Dirichlet or the Neumann boundary condition, that is, absorbing or reflecting at the boundary, respectively. See Figure 4.1. The corresponding eigenvalue problems are denoted by (D) and (N), respectively.

Next, let $A$ be compact for a moment. Then on $A^c$, one should consider the problem (D). Otherwise, since $A^c$ is non-compact, the solution to problem (N) is unknown, and it is indeed what we are also interested in. On $A$, we can use either of the boundary conditions. However, it is better to use the Neumann one since the corresponding $\lambda_1$ is closer to the original $\lambda_1$ when $A$ becomes larger. In other words, we want to describe the original $\lambda_1$ in terms of the local $\lambda_1(A)$ and $\lambda_0(A^c)$.

We now state our criterion informally, which is easier to remember.

Criterion (Informal description [Chen and F. Y. Wang, 1998]). $\lambda_1 > 0$ iff there exists a compact $A$ such that $\lambda_0(A^c) > 0$, where

\[
\lambda_0(A^c) = \inf\{D(f, f) : f|_A = 0,\ \pi(f^2) = 1\}.
\]


[Figure 4.2: Non-local, need intersection. Labels in the figure: $E$, $A$, $B$, $B^c$, (D), (N).]

Theorem 4.9 (Criterion [Chen and F. Y. Wang, 1998]). Let $A \subset B$ satisfy $0 < \pi(A), \pi(B) < 1$. Then

\[
\frac{\lambda_0(A^c)}{\pi(A)} \ge \lambda_1 \ge \frac{\lambda_1(B) [\lambda_0(A^c) \pi(B) - 2 M_A \pi(B^c)]}{2\lambda_1(B) + \pi(B)^2 [\lambda_0(A^c) + 2 M_A]}, \tag{4.6}
\]

where $M_A = \operatorname{ess\,sup}_{A,\, \pi} J(dx, A^c)/\pi(dx)$, and $\operatorname{ess\,sup}_{A,\, \pi}$ denotes the essential supremum over the set $A$ with respect to the measure $\pi$.

As mentioned before, usually $\lambda_1(B) > 0$ for all compact $B$. Hence the result means, as stated in the informal description, that $\lambda_1 > 0$ iff $\lambda_0(A^c) > 0$ for some compact $A$, because we can first fix such an $A$ and then make $B$ large enough so that the right-hand side of (4.6) becomes positive. The reason why we have to use two sets is that the operator is not local: there may exist an interaction with a very long range. The final choice of regions and the boundary conditions are shown in Figure 4.2.

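Proof of Theorem 4.9. Let $f$ satisfy $\pi(f) = 0$ and $\pi(f^2) = 1$. Our aim is to bound $D(f)$ in terms of $\lambda_0(A^c)$ and $\lambda_1(B)$.

(a) First, we use $\lambda_1(B)$.

\[
\begin{aligned}
D(f) \ge D_B(f I_B) &\ge \lambda_1(B) \pi(B)^{-1} \bigl[\pi(f^2 I_B) - \pi(B)^{-1} \pi(f I_B)^2\bigr] \\
&= \lambda_1(B) \pi(B)^{-1} \bigl[\pi(f^2 I_B) - \pi(B)^{-1} \pi(f I_{B^c})^2\bigr].
\end{aligned}
\tag{4.7}
\]

Here in the last step, we have used $\pi(f) = 0$.

(b) Next, we use $\lambda_0(A^c)$. We need the following elementary inequality:

\[
|(f I_{A^c})(x) - (f I_{A^c})(y)| \le |f(x) - f(y)| + I_{A \times A^c \cup A^c \times A}(x, y)\, |(f I_A)(x) - (f I_A)(y)|.
\]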


[Figure 4.3: Intersection of $\Gamma_1$ and $\Gamma_2$. Horizontal axis from $0$ to $1$ with $\gamma_0$ marked; curves $\Gamma_1$, $\Gamma_2$ and $\Gamma_1 \vee \Gamma_2$.]

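Then

\[
\begin{aligned}
\lambda_0(A^c)\, \pi(f^2 I_{A^c}) &\le D(f I_{A^c}) = \frac{1}{2} \int J(dx, dy) \bigl[(f I_{A^c})(y) - (f I_{A^c})(x)\bigr]^2 \\
&\le 2 D(f) + 2 \int_{A \times A^c} J(dx, dy) \bigl[(f I_A)(y) - (f I_A)(x)\bigr]^2 \\
&\le 2 D(f) + 2 M_A\, \pi(f^2 I_A).
\end{aligned}
\tag{4.8}
\]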

(c) Estimating the right-hand sides of (4.7) and (4.8) in terms of $\gamma := \pi(f^2 I_B)$, we obtain two inequalities $D(f) \ge c_1 \gamma + c_2$ and $D(f) \ge -c_3 \gamma + c_4$ for some constants $c_1, c_3 > 0$. Hence

\[
D(f) \ge \inf_{\gamma \in [0, 1]} \max\{c_1 \gamma + c_2,\ -c_3 \gamma + c_4\}.
\]

Clearly, the infimum is achieved at $\gamma_0$, which is the intersection of the two lines $\Gamma_1$ and $\Gamma_2$ in $\cdots$. See Figure 4.3. Then, the required lower bound of $\lambda_1$ is given by $c_1 \gamma_0 + c_2$.
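The minimax step in part (c) is elementary and can be sketched as follows. The function name and the sample constants are hypothetical; the constants $c_1, \ldots, c_4$ of the actual proof depend on $\lambda_1(B)$, $\lambda_0(A^c)$, $M_A$ and $\pi(B)$ and are not reproduced here. The two lines cross at $\gamma_0 = (c_4 - c_2)/(c_1 + c_3)$; the sketch also clamps $\gamma_0$ to $[0, 1]$, which covers the case where the crossing point falls outside the interval.

```python
def minimax_two_lines(c1, c2, c3, c4):
    """Return (gamma0, value) where value is the infimum over gamma in [0, 1] of
    max(c1 * gamma + c2, -c3 * gamma + c4), assuming c1 > 0 and c3 > 0."""
    gamma0 = (c4 - c2) / (c1 + c3)        # crossing point of the two lines
    gamma0 = min(1.0, max(0.0, gamma0))   # the maximum is V-shaped, so clamping suffices
    value = max(c1 * gamma0 + c2, -c3 * gamma0 + c4)
    return gamma0, value


# Hypothetical constants, for illustration only.
c1, c2, c3, c4 = 2.0, -0.5, 1.0, 1.0
gamma0, value = minimax_two_lines(c1, c2, c3, c4)

# Brute-force comparison on a grid of [0, 1].
grid_min = min(max(c1 * t / 1000 + c2, -c3 * t / 1000 + c4) for t in range(1001))
print(gamma0, value, grid_min)
```

When the crossing point lies in $[0, 1]$, the value returned equals $c_1\gamma_0 + c_2 = (c_1 c_4 + c_2 c_3)/(c_1 + c_3)$.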

4.5 Sketch of the proof of Theorem 4.4

The proof of Theorem 4.4 is based on Cheeger's splitting idea. That is, estimate $\lambda_1$ in terms of $\lambda_0$ for a more general symmetric form

\[
D(f) = \frac{1}{2} \int_{E \times E} J(dx, dy) [f(y) - f(x)]^2 + \int_E K(dx) f(x)^2, \tag{4.9}
\]

where $K$ is a non-negative measure on $(E, \mathscr{E})$. The study of $\lambda_0$ is meaningful since $D(1) \ne 0$ whenever $K \ne 0$. It is called the Dirichlet eigenvalue of $(D, \mathscr{D}(D))$. Thus, in what follows, when dealing with $\lambda_0$ (resp., $\lambda_1$), we consider only the symmetric form given by (4.9) (resp., (4.1)). Instead of (4.2), we now require that

\[
[J^{(1)}(dx, E) + K^{(1)}(dx)]/\pi(dx) \le 1, \qquad \pi\text{-a.s.}, \tag{4.10}
\]


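where $J^{(\alpha)}$ is the same as before and

\[
K^{(\alpha)}(dx) = I_{\{s(x)^\alpha > 0\}} \frac{K(dx)}{s(x)^\alpha}
\]

for some non-negative function $s(x)$. Corresponding to $(J^{(\alpha)}, K^{(\alpha)})$, we have a symmetric form $D^{(\alpha)}$, defined by (4.9). Next, define

\[
h^{(\alpha)} = \inf_{\pi(A) > 0} \frac{J^{(\alpha)}(A \times A^c) + K^{(\alpha)}(A)}{\pi(A)}.
\]

Theorem 4.10 (Chen and F. Y. Wang, 1998). For the symmetric form given by (4.9), under (4.10), we have

\[
\lambda_0 \ge \frac{\bigl(h^{(1/2)}\bigr)^2}{1 + \sqrt{1 - \bigl(h^{(1)}\bigr)^2}}.
\]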

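Proof. (a) First, we express $h^{(\alpha)}$ by the following functional form:

\[
h^{(\alpha)} = \inf\Bigl\{\frac{1}{2} \int J^{(\alpha)}(dx, dy) |f(x) - f(y)| + K^{(\alpha)}(f) : f \ge 0,\ \pi(f) = 1\Bigr\}.
\]

By setting $f = I_A/\pi(A)$, one returns to the original set form of $h^{(\alpha)}$. For the reverse assertion, simply consider the set $A_\gamma = \{f > \gamma\}$ for $\gamma \ge 0$. The proof is also not difficult:

\[
\begin{aligned}
&\int_{\{f(x) > f(y)\}} J^{(\alpha)}(dx, dy) [f(x) - f(y)] + K^{(\alpha)}(f) \\
&\qquad = \int_0^\infty d\gamma\, \Bigl\{J^{(\alpha)}\bigl(\{f(x) > \gamma \ge f(y)\}\bigr) + K^{(\alpha)}\bigl(\{f > \gamma\}\bigr)\Bigr\} \\
&\qquad = \int_0^\infty \bigl[J^{(\alpha)}\bigl(A_\gamma \times A_\gamma^c\bigr) + K^{(\alpha)}(A_\gamma)\bigr]\, d\gamma \qquad (\text{co-area formula}) \\
&\qquad \ge h^{(\alpha)} \int_0^\infty \pi(A_\gamma)\, d\gamma = h^{(\alpha)} \pi(f).
\end{aligned}
\]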
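On a finite state space the co-area formula used above reduces to a finite sum over the level sets of $f$, and both sides can be compared directly. The following sketch does this; the symmetric matrix `J`, the killing weights `K`, the function `f` and the helper name `coarea_sides` are illustrative assumptions, with $J$ and $K$ standing for $J^{(\alpha)}$ and $K^{(\alpha)}$ at some fixed $\alpha$.

```python
def coarea_sides(J, K, f):
    """Return (lhs, rhs) of the co-area identity for a finite state space:
    lhs = sum over f(x) > f(y) of J[x][y] (f(x) - f(y)) + sum_x K[x] f(x),
    rhs = integral over gamma >= 0 of J(A_g x A_g^c) + K(A_g), A_g = {f > gamma}."""
    n = len(f)
    lhs = sum(J[x][y] * (f[x] - f[y])
              for x in range(n) for y in range(n) if f[x] > f[y])
    lhs += sum(K[x] * f[x] for x in range(n))
    # The level set {f > gamma} is constant for gamma between consecutive values of f.
    levels = sorted(set([0.0] + list(f)))
    rhs = 0.0
    for lo, hi in zip(levels, levels[1:]):
        A = [x for x in range(n) if f[x] > lo]
        Ac = [y for y in range(n) if f[y] <= lo]
        flow = sum(J[x][y] for x in A for y in Ac) + sum(K[x] for x in A)
        rhs += flow * (hi - lo)
    return lhs, rhs


# Hypothetical data: a symmetric J with zero diagonal, a killing measure K, and f >= 0.
J = [[0.0, 1.0, 2.0],
     [1.0, 0.0, 3.0],
     [2.0, 3.0, 0.0]]
K = [0.5, 0.0, 1.0]
f = [2.0, 0.5, 1.0]
lhs, rhs = coarea_sides(J, K, f)
print(lhs, rhs)
```

Bounding each level-set term from below by $h^{(\alpha)}\pi(A_\gamma)$ and integrating in $\gamma$ then gives the last line of the display, since $\int_0^\infty \pi(f > \gamma)\, d\gamma = \pi(f)$ for $f \ge 0$.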

The appearance of $K$ makes the notation heavier. To avoid this, one can enlarge the state space to $E^* = E \cup \{\infty\}$. Regarding $K$ as a killing measure on $E^*$, the form $D(f, g)$ can be extended to the product space $E^* \times E^*$ but expressed by using a symmetric measure $J^*$ only. At the same time, one can extend $f$ to a function $f^*$ on $E^*$: $f^* = f I_E$. Then, we have

\[
h^{(\alpha)} = \inf\Bigl\{\frac{1}{2} \int_{E^* \times E^*} J^{*(\alpha)}(dx, dy) |f^*(x) - f^*(y)| : f \ge 0,\ \pi(f) = 1\Bigr\}.
\]


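(b) Take $f$ with $\pi(f^2) = 1$. By (a), the Cauchy–Schwarz inequality and condition (4.10), we have

\[
\begin{aligned}
\bigl(h^{(1)}\bigr)^2 &\le \Bigl\{\frac{1}{2} \int_{E^* \times E^*} J^{*(1)}(dx, dy) \bigl|f^*(y)^2 - f^*(x)^2\bigr|\Bigr\}^2 \\
&\le \frac{1}{2} D^{(1)}(f) \int_{E^* \times E^*} J^{*(1)}(dx, dy) \bigl[f^*(y) + f^*(x)\bigr]^2 \\
&= \frac{1}{2} D^{(1)}(f) \Bigl\{2 \int_{E^* \times E^*} J^{*(1)}(dx, dy) \bigl[f^*(y)^2 + f^*(x)^2\bigr] - \int_{E^* \times E^*} J^{*(1)}(dx, dy) \bigl[f^*(y) - f^*(x)\bigr]^2\Bigr\} \\
&\le D^{(1)}(f) \bigl[2 - D^{(1)}(f)\bigr].
\end{aligned}
\tag{4.11}
\]

Solving this quadratic inequality in $D^{(1)}(f)$, one obtains

\[
D^{(1)}(f) \ge 1 - \sqrt{1 - \bigl(h^{(1)}\bigr)^2}.
\]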

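(c) Repeating the above proof but with a more careful use of the Cauchy–Schwarz inequality, we obtain

\[
\begin{aligned}
\bigl(h^{(1/2)}\bigr)^2 &\le \Bigl\{\frac{1}{2} \int_{E^* \times E^*} J^{*(1/2)}(dx, dy) \bigl|f^*(y)^2 - f^*(x)^2\bigr|\Bigr\}^2 \\
&= \Bigl\{\frac{1}{2} \int_{E^* \times E^*} J^*(dx, dy) |f^*(y) - f^*(x)| \cdot I_{\{r(x,y) > 0\}} \frac{|f^*(y) + f^*(x)|}{\sqrt{r(x, y)}}\Bigr\}^2 \\
&\le \frac{1}{2} D(f) \int_{E^* \times E^*} J^{*(1)}(dx, dy) \bigl[f^*(y) + f^*(x)\bigr]^2 \\
&\le D(f) \bigl[2 - D^{(1)}(f)\bigr].
\end{aligned}
\]

From this and (b), the required assertion follows.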
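For the reader's convenience, the two closing algebraic steps can be spelled out. This is a sketch of the routine computation, written with the normalization $\pi(f^2) = 1$ used in (b) and (c); it adds nothing beyond what (4.11) and the last display already contain.

```latex
% Step 1: solving the quadratic inequality (4.11) in x = D^{(1)}(f), with h = h^{(1)}.
\[
h^2 \le x(2 - x)
\iff x^2 - 2x + h^2 \le 0
\iff 1 - \sqrt{1 - h^2} \le x \le 1 + \sqrt{1 - h^2}.
\]
% Step 2: inserting the lower bound for D^{(1)}(f) into the estimate of part (c).
\[
\bigl(h^{(1/2)}\bigr)^2 \le D(f)\bigl[2 - D^{(1)}(f)\bigr]
\le D(f)\Bigl[1 + \sqrt{1 - \bigl(h^{(1)}\bigr)^2}\Bigr],
\qquad\text{so}\qquad
D(f) \ge \frac{\bigl(h^{(1/2)}\bigr)^2}{1 + \sqrt{1 - \bigl(h^{(1)}\bigr)^2}}.
\]
% Step 3: taking the infimum over all f with \pi(f^2) = 1 gives the bound for \lambda_0.
```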

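Proof of Theorem 4.4. (a) For any $B \subset E$ with $\pi(B) > 0$, define a local form as follows:

\[
D_B^{(\alpha)}(f) = \frac{1}{2} \int_{B \times B} J^{(\alpha)}(dx, dy) [f(y) - f(x)]^2 + \int_B J^{(\alpha)}(dx, B^c) f(x)^2.
\]

Obviously, $D_B^{(\alpha)}(f) = D_B^{(\alpha)}(f I_B)$. Moreover,

\[
\lambda_0(B) := \inf\{D(f) : f|_{B^c} = 0,\ \|f\| = 1\} = \inf\bigl\{D_B(f) : \pi(f^2 I_B) = 1\bigr\}.
\]

Let $\pi_B = \pi(\cdot \cap B)/\pi(B)$ and set

\[
h_B^{(\alpha)} = \inf_{A \subset B,\ \pi(A) > 0} \frac{J^{(\alpha)}(A \times (B \setminus A)) + J^{(\alpha)}(A \times B^c)}{\pi(A)} = \inf_{A \subset B,\ \pi(A) > 0} \frac{J^{(\alpha)}(A \times A^c)}{\pi(A)}.
\]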


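Applying Theorem 1.10 to the local form on $L^2(B, \mathscr{E}\cap B, \pi_B)$ generated by $J_B = \pi(B)^{-1}J|_{B\times B}$ and $K_B = J(\cdot, B^c)|_B$, we obtain

$$
\lambda_0(B) \ge \frac{h_B^{(1/2)\,2}}{1+\sqrt{1-h_B^{(1)\,2}}}.
$$

(b) Note that $\inf_{\pi(B)\le 1/2} h_B^{(\alpha)} = k^{(\alpha)}$. By Theorem 1.6,

$$
\begin{aligned}
\lambda_1 &\ge \inf_{\pi(B)\le 1/2}\frac{h_B^{(1/2)\,2}}{1+\sqrt{1-h_B^{(1)\,2}}}
\ge \inf_{\pi(B)\le 1/2}\frac{\inf_{\pi(B)\le 1/2} h_B^{(1/2)\,2}}{1+\sqrt{1-h_B^{(1)\,2}}}\\
&\ge \frac{\inf_{\pi(B)\le 1/2} h_B^{(1/2)\,2}}{1+\sqrt{1-\inf_{\pi(B)\le 1/2} h_B^{(1)\,2}}}
= \frac{k^{(1/2)\,2}}{1+\sqrt{1-k^{(1)\,2}}}.
\end{aligned}
$$

We obtain the required conclusion.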

4.6 Logarithmic Sobolev inequality

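Theorem 4.11 (Chen, 2000b). Denote by $\sigma$ the optimal constant in the logarithmic Sobolev inequality:

$$
\operatorname{Ent}(f^2) \le \frac{2}{\sigma}D(f).
$$

We have

$$
2\kappa \ge \sigma \ge \frac{2\lambda_1\kappa^{(1/2)}}{\sqrt{\lambda_1\big(2-\lambda_1^{(1)}\big)} + 3\kappa^{(1/2)}} \ge \frac18\,\kappa^{(1/2)\,2},
$$

where

$$
\kappa^{(\alpha)} = \inf_{\pi(A)\in(0,1)}\frac{J^{(\alpha)}(A\times A^c)}{-\pi(A)\log\pi(A)},\qquad \kappa = \kappa^{(0)},
$$

and $\lambda_1^{(\alpha)} = \inf\{D^{(\alpha)}(f) : \pi(f)=0,\ \|f\|=1\}$.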

Proof. The proof is partially due to F. Y. Wang. To get the upper bound, simply apply the inequality to the test function $f = I_A/\sqrt{\pi(A)}$, $\pi(A)\in(0,1)$.

To prove the lower bound, let $\pi(f) = 0$ and $\|f\| = 1$.

(a) Set $\varepsilon = \sqrt{2-\lambda_1^{(1)}}\big/\big[2\kappa^{(1/2)}\big]$ and $E(f) = \pi\big(f^2\log f^2\big)$. Then, we claim that

$$
E(f) \le 2\varepsilon\sqrt{D(f)} + 1. \tag{4.12}
$$

Actually, one shows first that

$$
I := \frac12\int J^{(1/2)}(dx,dy)\,\big|f(y)^2-f(x)^2\big| \le \sqrt{\big(2-\lambda_1^{(1)}\big)D(f)}. \tag{4.13}
$$


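The proof is standard and has been used before (cf. part (c) of the proof of Theorem 4.10). Next, set $A_t = \{f^2 > t\}$ and prove that

$$
I \ge \kappa^{(1/2)}[E(f)-1]. \tag{4.14}
$$

The proof goes as follows. Note that $h_t := \pi(A_t) \le 1\wedge t^{-1}$ and $\int_0^\infty h_t\,dt = \pi(f^2) = 1$. We have

$$
\begin{aligned}
I &= \int_0^\infty J^{(1/2)}(A_t\times A_t^c)\,dt \ge \kappa^{(1/2)}\int_0^\infty(-h_t\log h_t)\,dt\\
&\ge \kappa^{(1/2)}\int_0^\infty h_t\log t\,dt = \kappa^{(1/2)}\int_0^\infty h_t(\log t+1)\,dt - \kappa^{(1/2)}\\
&= \kappa^{(1/2)}\int d\pi\int_0^{f^2}(\log t+1)\,dt - \kappa^{(1/2)} = \kappa^{(1/2)}[E(f)-1].
\end{aligned}
$$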

Combining (4.13) with (4.14), we get (4.12).

(b) By (4.12), we have

$$
E(f) \le 2\varepsilon\sqrt{D(f)} + 1 \le \gamma\varepsilon D(f) + \varepsilon/\gamma + 1,
$$

where $\gamma > 0$ is a constant to be specified below. On the other hand, by [Bakry (1992); Proposition 3.9], the inequality

$$
\pi\big(f^2\log f^2\big) \le C_1D(f) + C_2,\qquad \pi(f)=0,\ \|f\|=1 \tag{4.15}
$$

implies that

$$
\sigma \ge 2\big/\big[C_1 + (C_2+2)\lambda_1^{-1}\big]. \tag{4.16}
$$

In other words, if $\lambda_1 > 0$, then the weaker inequality (4.15) is indeed equivalent to the original logarithmic Sobolev inequality. We will prove this fact soon. Combining these facts together, it follows that

$$
\sigma \ge \frac{2}{\varepsilon\gamma + [\varepsilon/\gamma + 3]/\lambda_1}.
$$

Maximizing the right-hand side with respect to $\gamma$, we get

$$
\sigma \ge \frac{2\lambda_1\kappa^{(1/2)}}{\sqrt{\big(2-\lambda_1^{(1)}\big)\lambda_1} + 3\kappa^{(1/2)}}. \tag{4.17}
$$
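The maximization over $\gamma$ is elementary: the denominator $\varepsilon\gamma + \varepsilon/(\gamma\lambda_1) + 3/\lambda_1$ is smallest at $\gamma = \lambda_1^{-1/2}$, which gives $\sigma \ge 2\lambda_1/(2\varepsilon\sqrt{\lambda_1}+3)$, and inserting $\varepsilon = \sqrt{2-\lambda_1^{(1)}}/[2\kappa^{(1/2)}]$ yields (4.17). The sketch below compares this closed form with a brute-force search; the numerical values of `eps` and `lam1` and the function names are hypothetical.

```python
import math

def sigma_bound(gamma, eps, lam1):
    """Lower bound 2 / (eps*gamma + (eps/gamma + 3)/lam1) for a given gamma > 0."""
    return 2.0 / (eps * gamma + (eps / gamma + 3.0) / lam1)

def sigma_bound_optimal(eps, lam1):
    """Value of the bound at the maximiser gamma = lam1 ** -0.5, in closed form."""
    return 2.0 * lam1 / (2.0 * eps * math.sqrt(lam1) + 3.0)

eps, lam1 = 0.8, 0.5                      # illustrative values only
gamma_star = 1.0 / math.sqrt(lam1)
at_star = sigma_bound(gamma_star, eps, lam1)
closed_form = sigma_bound_optimal(eps, lam1)

# Brute-force search over gamma in (0, 10] as an independent check.
grid_best = max(sigma_bound(0.01 * k, eps, lam1) for k in range(1, 1001))
print(at_star, closed_form, grid_best)
```

No grid value exceeds the closed form, and the best grid value agrees with it up to the grid resolution.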

On the other hand, applying Theorem 4.4 to $J^{(1)}$, we have $k^{(1/2)} = k^{(1)}$ and hence $\lambda_1^{(1)} \ge 1-\sqrt{1-k^{(1)\,2}}$. Combining this with Theorem 4.4 and noting that $k^{(1/2)} \ge (\log 2)\kappa^{(1/2)}$, it follows that the right-hand side of (4.17) is bounded below by

$$
\frac{2(\log 2)^2\,\kappa^{(1/2)\,2}}{(\log 2+3)\Big[1+\sqrt{1-k^{(1)\,2}}\Big]} \ge \frac18\,\kappa^{(1/2)\,2}.
$$

We now introduce a more powerful result; its proof is rather technical and is omitted here.


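Theorem 4.12 (Chen, 2000b). Define

$$
\begin{aligned}
\xi^{(\delta)} &= \inf_{\pi(A)>0}\frac{J^{(1/2)}(A\times A^c) + \delta\pi(A)}{\pi(A)\sqrt{1-\log\pi(A)}},\\
\xi^{(\infty)} &= \lim_{\delta\to\infty}\xi^{(\delta)} = \sup_{\delta>0}\xi^{(\delta)},\\
A(\delta) &= \frac{(2+\delta)(\lambda_1+\delta)}{\big(\xi^{(\delta)}\big)^2},\qquad \delta>0.
\end{aligned}
$$

Then, we have

$$
2\kappa \ge \sigma \ge \frac{2\lambda_1}{1+16\inf_{\delta>0}A(\delta)}.
$$

To prove (4.16), we need the following result, which was proved by J. D. Deuschel and D. W. Stroock (1989, page 247) and goes back to O. S. Rothaus (1985, Lemma 9).

Lemma 4.13. We have

$$
\sup_{c\in\mathbb{R}}\operatorname{Ent}\big((f+c)^2\big) \le \operatorname{Ent}\big(f^2\big) + 2\pi(f^2),\qquad \pi(f)=0.
$$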

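Proof. Without loss of generality, assume that $\|f-\pi(f)\| = 1$ and set $h = f-\pi(f) =: f - 1/t$. Then we have $\pi(h) = 0$ and $\|h\| = 1$. It suffices to show that

$$
\int(1+th)^2\log\frac{(1+th)^2}{1+t^2}\,d\pi \le t^2\int h^2\log h^2\,d\pi + 2t^2,\qquad t\in\mathbb{R}.
$$

Define

$$
h_\delta(t) = \int(1+th)^2\log\frac{(1+th)^2+\delta}{1+t^2}\,d\pi - t^2\int h^2\log h^2\,d\pi.
$$

Then

$$
\begin{aligned}
h_\delta'(t) &= 2\int(1+th)h\log\big[(1+th)^2+\delta\big]\,d\pi + 2\int\frac{(1+th)^3h}{(1+th)^2+\delta}\,d\pi\\
&\quad - 2t\Big[1+\log(1+t^2)+\int h^2\log h^2\,d\pi\Big],\\
h_\delta''(t) &= 2\int h^2\log\frac{(1+th)^2+\delta}{(1+t^2)h^2}\,d\pi + 10\int h^2\frac{(1+th)^2}{(1+th)^2+\delta}\,d\pi\\
&\quad - 4\int h^2\frac{(1+th)^4}{\big[(1+th)^2+\delta\big]^2}\,d\pi - 2 - \frac{4t^2}{1+t^2}.
\end{aligned}
$$

By Jensen's inequality,

$$
\int h^2\log\frac{(1+th)^2+\delta}{(1+t^2)h^2}\,d\pi \le \log\int\frac{(1+th)^2+\delta}{1+t^2}\,d\pi = \log\Big(1+\frac{\delta}{1+t^2}\Big)
$$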


since $\pi(h) = 0$ and $\|h\| = 1$. On the other hand, by the Schwarz inequality,

$$
A(t,\delta)^2 \le \int h^2\frac{(1+th)^4}{\big[(1+th)^2+\delta\big]^2}\,d\pi,\qquad A(t,\delta) := \int h^2\frac{(1+th)^2}{(1+th)^2+\delta}\,d\pi \in [0,1].
$$

Hence

$$
h_\delta''(t) \le 2\log\Big(1+\frac{\delta}{1+t^2}\Big) - \big[4A(t,\delta)^2 - 10A(t,\delta)\big] - 2 \le 2\log(1+\delta) + 4.
$$

Noting that $h_\delta(0) = \log(1+\delta)$ and $h_\delta'(0) = 0$, by Taylor expansion, we get

$$
h_\delta(t) \le \log(1+\delta) + [2+\log(1+\delta)]\,t^2.
$$

The assertion now follows by letting $\delta\to 0$.
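Lemma 4.13 can be probed numerically on a finite probability space. The sketch below is a sanity check under illustrative assumptions, not part of the proof: the four-point measure `pi`, the random centred functions and the grid of shifts `cs` are all hypothetical choices, and $\operatorname{Ent}(g) = \pi(g\log g) - \pi(g)\log\pi(g)$ is used as the definition of entropy.

```python
import math
import random

def ent(g, pi):
    """Ent(g) = pi(g log g) - pi(g) log pi(g) for g >= 0, with 0 log 0 = 0."""
    mean = sum(p * v for p, v in zip(pi, g))
    glogg = sum(p * v * math.log(v) for p, v in zip(pi, g) if v > 0)
    return glogg - (mean * math.log(mean) if mean > 0 else 0.0)

def worst_gap(pi, f, cs):
    """Max over c in cs of Ent((f+c)^2) - Ent(f^2) - 2 pi(f^2); the lemma says <= 0."""
    f2 = [v * v for v in f]
    rhs = ent(f2, pi) + 2.0 * sum(p * v for p, v in zip(pi, f2))
    return max(ent([(v + c) ** 2 for v in f], pi) - rhs for c in cs)

random.seed(0)
pi = [0.1, 0.2, 0.3, 0.4]                       # hypothetical probability measure
cs = [0.1 * k for k in range(-100, 101)]        # shifts c in [-10, 10]
gaps = []
for _ in range(200):
    raw = [random.uniform(-1, 1) for _ in pi]
    mean = sum(p * v for p, v in zip(pi, raw))
    f = [v - mean for v in raw]                 # centre so that pi(f) = 0
    gaps.append(worst_gap(pi, f, cs))
print(max(gaps))
```

A non-positive maximum gap over all samples is what the lemma predicts; the check covers only the sampled functions and shifts.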

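Proof of (4.16). Let $\bar f = f-\pi(f)$. Then the assertion follows by Lemma 4.13, (4.15) and the Poincaré inequality:

$$
\operatorname{Ent}\big(f^2\big) \le \operatorname{Ent}\big(\bar f^{\,2}\big) + 2\big\|\bar f\big\|^2 \le C_1D(f) + (2+C_2)\big\|\bar f\big\|^2 \le \Big(C_1+\frac{2+C_2}{\lambda_1}\Big)D(f).
$$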

4.7 Upper bounds

The upper bound given by Theorem 4.11 is usually very rough. Here we introduce two results which are often rather effective. The results show that order one (resp., two) of exponential integrability is required for $\lambda_1 > 0$ (resp., $\sigma > 0$).

Theorem 4.14 (Chen and F. Y. Wang, 1998). Suppose that the function $r$ used in (4.2) is $J$-a.e. positive. If there exists $\varphi \ge 0$ such that

$$
\operatorname{ess\,sup}_J |\varphi(x)-\varphi(y)|^2\,r(x,y) \le 1, \tag{4.18}
$$

then

$$
\lambda_1 \le \inf\big\{\varepsilon^2/4 : \varepsilon\ge 0,\ \pi\big(e^{\varepsilon\varphi}\big)=\infty\big\}. \tag{4.19}
$$

Consequently, $\lambda_1 = 0$ if there exists $\varphi\ge 0$ satisfying (4.18) such that $\pi\big(e^{\varepsilon\varphi}\big) = \infty$ for all $\varepsilon > 0$.

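Proof. We need to show that if $\pi\big(e^{\varepsilon\varphi}\big) = \infty$, then $\lambda_1 \le \varepsilon^2/4$. For $n\ge 1$, define $f_n = \exp[\varepsilon(\varphi\wedge n)/2]$. Then, we have

$$
\lambda_1 \le D(f_n)\big/\big[\pi\big(f_n^2\big)-\pi(f_n)^2\big]. \tag{4.20}
$$

For every $m\ge 1$, choose $r_m > 0$ such that $\pi(\varphi\ge r_m) \le 1/m$. Then

$$
\pi\big(I_{[\varphi\ge r_m]}f_n^2\big)^{1/2} \ge \sqrt{m}\,\pi\big(I_{[\varphi\ge r_m]}f_n\big) \ge \sqrt{m}\,\pi(f_n) - \sqrt{m}\,e^{\varepsilon r_m/2}.
$$

Hence

$$
\pi(f_n)^2 \le \Big[\sqrt{\pi\big(f_n^2\big)}\big/\sqrt{m} + e^{\varepsilon r_m/2}\Big]^2. \tag{4.21}
$$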


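On the other hand, by the Mean Value Theorem,

$$
|e^A-e^B| \le |A-B|\,e^{A\vee B} = |A-B|\,(e^A\vee e^B)
$$

for all $A, B\ge 0$. Hence

$$
\begin{aligned}
D(f_n) &= \frac12\int J(dx,dy)\,[f_n(x)-f_n(y)]^2\\
&\le \frac{\varepsilon^2}{8}\int J^{(1)}(dx,dy)\,[\varphi(x)-\varphi(y)]^2\,r(x,y)\,\big[f_n(x)\vee f_n(y)\big]^2\\
&\le \frac{\varepsilon^2}{4}\,\pi\big(f_n^2\big).
\end{aligned}
\tag{4.22}
$$

Noticing that $\pi\big(f_n^2\big)\uparrow\infty$, combining (4.22) with (4.20) and (4.21) and then letting $n\uparrow\infty$, we obtain $\lambda_1 \le \varepsilon^2/\big[4(1-m^{-1})\big]$. The proof is completed by letting $m\uparrow\infty$.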

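Theorem 4.15 (F. Y. Wang, 2000a). Suppose that (4.18) holds. If $\sigma > 0$, then

$$
\pi\big(e^{\varepsilon\sigma\varphi^2}\big) \le \exp\Big[\frac{\varepsilon\sigma\pi(\varphi^2)}{1-2\varepsilon}\Big] < \infty,\qquad \varepsilon\in[0,1/2).
$$

Proof. (a) Given $n\ge 1$, let $\varphi_n = \varphi\wedge n$, $f_n = \exp\big[r\varphi_n^2/2\big]$ and $h_n(r) = \pi\big(e^{r\varphi_n^2}\big)$. Then, by (4.2), (4.18) and applying the Mean Value Theorem to the function $\exp[rx^2/2]$, we get

$$
\begin{aligned}
D(f_n) &= \frac12\int J(dx,dy)\,[f_n(x)-f_n(y)]^2\\
&\le \frac{r^2}{2}\int J^{(1)}(dx,dy)\,[\varphi(x)-\varphi(y)]^2\,r(x,y)\,\max\big\{\varphi_n(x)f_n(x),\ \varphi_n(y)f_n(y)\big\}^2\\
&\le r^2\int J^{(1)}(dx,dy)\,\varphi_n(x)^2f_n(x)^2\\
&\le r^2h_n'(r).
\end{aligned}
$$

(b) Next, applying the logarithmic Sobolev inequality to the function $f_n$ and using (a), it follows that

$$
rh_n'(r) \le h_n(r)\log h_n(r) + 2r^2h_n'(r)/\sigma,\qquad r\ge 0.
$$

That is,

$$
h_n'(r) \le \frac{1}{r(1-2r/\sigma)}\,h_n(r)\log h_n(r),\qquad r\in[0,\sigma/2).
$$

Now the required assertion follows from Corollary A.5.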


4.8 Nash inequality

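Theorem 4.16 (Chen, 1999b). Define the isoperimetric constant $I_\nu$ as follows:

$$
I_\nu = \inf_{0<\pi(A)\le 1/2}\frac{J^{(1/2)}(A\times A^c)}{\pi(A)^{(\nu-1)/\nu}} = \inf_{0<\pi(A)<1}\frac{J^{(1/2)}(A\times A^c)}{\big[\pi(A)\wedge\pi(A^c)\big]^{(\nu-1)/\nu}},\qquad \nu>1.
$$

Then

$$
\operatorname{Var}(f)^{1+2/\nu} \le 2I_\nu^{-2}D(f)\,\|f\|_1^{4/\nu},\qquad f\in L^2(\pi). \tag{4.23}
$$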

Proof. The proof below is quite close to L. Saloff-Coste (1997). Fix a bounded $g\in\mathscr{D}(D)$. Let $c$ be the median of $g$. Set $f = \operatorname{sgn}(g-c)|g-c|^2$. Then $f$ has median 0. Write $q = \nu/(\nu-1)$. By using the functional form of $I_\nu$:

$$
I_\nu = \inf\bigg\{\frac{\frac12\int J^{(1/2)}(dx,dy)\,|f(y)-f(x)|}{\inf_{c:\ c\text{ is a median of }f}\|f-c\|_{\nu/(\nu-1)}} : f\in L^1(\pi)\text{ is non-constant}\bigg\} \tag{4.24}
$$

which will be proved later, we obtain

$$
\|g-c\|_{2q}^2 = \|f\|_q \le \frac12 I_\nu^{-1}\int J^{(1/2)}(dx,dy)\,|f(y)-f(x)|. \tag{4.25}
$$

On the other hand, since

$$
|a-b|\,(|a|+|b|) = \begin{cases}|a^2-b^2|, & \text{if } ab\ge 0\\ (|a|+|b|)^2, & \text{if } ab<0,\end{cases}
$$

we have $|f(y)-f(x)| \le |g(y)-g(x)|\big(|g(y)-c|+|g(x)-c|\big)$. By using this inequality and following the last part of the proof of Theorem 4.10, we get

$$
\begin{aligned}
\int J^{(1/2)}(dx,dy)\,|f(y)-f(x)| &\le \sqrt{2D(g)}\,\bigg[\int J^{(1)}(dx,dy)\,\big[|g(y)-c|+|g(x)-c|\big]^2\bigg]^{1/2}\\
&\le 2\sqrt{2D(g)}\,\|g-c\|_2.
\end{aligned}
\tag{4.26}
$$

Combining (4.25) with (4.26), we get $\|g-c\|_{2q}^2 \le I_\nu^{-1}\sqrt{2D(g)}\,\|g-c\|_2$. On the other hand, writing $g^2 = g^{2/(\nu+1)}\cdot g^{2\nu/(\nu+1)}$ and applying the Hölder inequality with $p' = (\nu+1)/2$ and $q' = (\nu+1)/(\nu-1)$, we obtain

$$
\|g\|_2 \le \|g\|_1^{1/(\nu+1)}\,\|g\|_{2q}^{\nu/(\nu+1)}.
$$

From these facts, it follows that

$$
\|g-c\|_2 \le \Big[I_\nu^{-1}\sqrt{2D(g)}\,\|g-c\|_2\Big]^{\nu/[2(\nu+1)]}\,\|g-c\|_1^{1/(\nu+1)}.
$$


Thus,

$$
\|g-c\|_2^{2(1+2/\nu)} \le 2I_\nu^{-2}D(g)\,\|g-c\|_1^{4/\nu}
$$

and hence

$$
\operatorname{Var}(g)^{1+2/\nu} \le 2I_\nu^{-2}D(g)\,\|g\|_1^{4/\nu}.
$$

We now return to prove (4.24). Denote by $J_\nu$ the right-hand side of (4.24). Set $q = \nu/(\nu-1)$ and ignore the superscript "(1/2)" in $J^{(1/2)}$ everywhere for simplicity. Take $f = I_A$ with $0 < \pi(A)\le 1/2$. Then, $f$ has a median 0. Moreover,

$$
\int J(dx,dy)\,|f(y)-f(x)| = 2J(A\times A^c),\qquad \|f\|_q = \pi(A)^{1/q}.
$$

This proves that $I_\nu \ge J_\nu$.

Conversely, fix $f$ with median $c$. Set $f_\pm = (f-c)^\pm$. Then $f_+ + f_- = |f-c|$ and $|f(y)-f(x)| = |f_+(y)-f_+(x)| + |f_-(y)-f_-(x)|$. Put $F_t^\pm = \{f_\pm\ge t\}$. Then

$$
\begin{aligned}
\frac12\int J(dx,dy)\,|f(y)-f(x)| &= \frac12\int J(dx,dy)\,\big[|f_+(y)-f_+(x)| + |f_-(y)-f_-(x)|\big]\\
&= \int_0^{\|f\|_u}\Big[J\big(F_t^+\times(F_t^+)^c\big) + J\big(F_t^-\times(F_t^-)^c\big)\Big]\,dt \qquad\text{(by co-area formula)}\\
&\ge I_\nu\int_0^{\|f\|_u}\Big[\pi(F_t^+)^{1/q} + \pi(F_t^-)^{1/q}\Big]\,dt.
\end{aligned}
$$

Next, we need the following simple result:

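Claim. Let $p\ge 1$. Then $\|f\|_p \le F$ iff $\|fg\|_1 \le FG$ holds for all $g$ satisfying $\|g\|_q \le G$.

It follows that

$$
\pi(F_t^\pm)^{1/q} = \big\|I_{F_t^\pm}\big\|_q = \sup_{\|g\|_r\le 1}\big\langle I_{F_t^\pm},\,g\big\rangle,\qquad \frac1r+\frac1q = 1.
$$

Thus, for every $g$ with $\|g\|_r\le 1$, we have

$$
\begin{aligned}
\frac12\int J(dx,dy)\,|f(y)-f(x)| &\ge I_\nu\int_0^\infty\Big[\big\langle I_{F_t^+},g\big\rangle + \big\langle I_{F_t^-},g\big\rangle\Big]\,dt\\
&= I_\nu\big[\langle f_+,g\rangle + \langle f_-,g\rangle\big]\\
&= I_\nu\langle|f-c|,\,g\rangle.
\end{aligned}
$$

Taking the supremum with respect to $g$, we get

$$
\frac12\int J(dx,dy)\,|f(y)-f(x)| \ge I_\nu\|f-c\|_q.
$$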


4.9 Birth-death processes

Finally, we specialize the above results to birth–death processes to illustrate the power of Cheeger's approach. Consider a regular birth–death process on $\mathbb{Z}_+$ with birth rates $(b_i)$ and death rates $(a_i)$. Then $J_{ij} = \pi_ib_i$ if $j = i+1$, $J_{ij} = \pi_ia_i$ if $j = i-1$ and $J_{ij} = 0$ otherwise. We have the following result.

Theorem 4.17. For a birth–death process, take $r_{ij} = (a_i+b_i)\vee(a_j+b_j)$ $(i\ne j)$. Then the following assertions hold.

(1) For the Nash inequality, $I_\nu > 0$ for some $\nu > 1$ iff there exists a constant $c > 0$ such that

$$
\frac{\pi_ia_i}{\sqrt{r_{i,i-1}}} \ge c\,\Big[\sum_{j\ge i}\pi_j\Big]^{(\nu-1)/\nu},\qquad i\ge 1.
$$

If so, we indeed have $I_\nu \ge c$.

(2) For the logarithmic Sobolev inequality, $\xi^{(\infty)} > 0$ iff

$$
\inf_{i\ge 1}\ \frac{\pi_ia_i}{\sqrt{r_{i,i-1}}}\bigg/\bigg[\Big(\sum_{j\ge i}\pi_j\Big)\sqrt{1-\log\sum_{j\ge i}\pi_j}\,\bigg] > 0.
$$

$\kappa^{(\alpha)} > 0$ iff

$$
\inf_{i\ge 1}\ \frac{\pi_ia_i}{r_{i,i-1}^{\alpha}}\bigg/\bigg[-\Big(\sum_{j\ge i}\pi_j\Big)\log\sum_{j\ge i}\pi_j\bigg] > 0.
$$

(3) For the Poincaré inequality, $k^{(\alpha)} > 0$ iff there exists a constant $c > 0$ such that

$$
\frac{\pi_ia_i}{[(a_i+b_i)\vee(a_{i-1}+b_{i-1})]^{\alpha}} \ge c\sum_{j\ge i}\pi_j,\qquad i\ge 1.
$$

Then, we indeed have $k^{(\alpha)} \ge c$.
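As an illustration of part (3), the sketch below evaluates the ratio $\pi_ia_i/r_{i,i-1}^\alpha\big/\sum_{j\ge i}\pi_j$ for a birth–death chain with constant rates $a_i = 2$, $b_i = 1$. Everything here is an assumption made for the example: the rates, the truncation to finitely many states, the restriction to small $i$ (so that truncated tail sums are accurate) and the function name `poincare_constant`. Since the ratio is unchanged under normalization, the unnormalized weights $\mu_n = b_0\cdots b_{n-1}/(a_1\cdots a_n)$ are used in place of $\pi$.

```python
def poincare_constant(a, b, alpha, n_states=200, n_check=50):
    """Approximate inf over 1 <= i <= n_check of mu_i a_i / r_{i,i-1}^alpha / sum_{j>=i} mu_j."""
    mu = [1.0]
    for n in range(1, n_states):
        mu.append(mu[-1] * b(n - 1) / a(n))       # mu_n = b_0...b_{n-1} / (a_1...a_n)
    tail = [0.0] * (n_states + 1)
    for i in range(n_states - 1, -1, -1):
        tail[i] = tail[i + 1] + mu[i]             # tail[i] = sum_{j>=i} mu_j (truncated)
    total_rate = lambda i: (a(i) if i >= 1 else 0.0) + b(i)   # a_i + b_i with a_0 = 0
    ratios = []
    for i in range(1, n_check + 1):
        r = max(total_rate(i), total_rate(i - 1))  # r_{i,i-1}
        ratios.append(mu[i] * a(i) / r ** alpha / tail[i])
    return min(ratios)

# Hypothetical constant rates: death rate 2, birth rate 1.
c_half = poincare_constant(lambda i: 2.0, lambda i: 1.0, alpha=0.5)
c_one = poincare_constant(lambda i: 2.0, lambda i: 1.0, alpha=1.0)
print(c_half, c_one)
```

For these rates $\mu_i = 2^{-i}$ and $\sum_{j\ge i}\mu_j = 2^{1-i}$, so the ratio equals $3^{-\alpha}$ for every $i\ge 1$, and the criterion holds with $c = 3^{-\alpha}$.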


Chapter 5

Ten Explicit Criteria for One-dimensional Processes

Traditional ergodicity constitutes a crucial part of the theory of stochastic processes and plays a key role in practical applications. The notion of ergodicity has been much refined recently, due to the study of some inequalities, which are especially powerful in the infinite-dimensional situation. The explicit criteria for various types of ergodicity for birth–death processes and one-dimensional diffusions are collected in Tables 1.5 and 5.1, respectively. In particular, an interesting story about how to obtain one of the criteria for birth–death processes is explained in detail.

This chapter is organized as follows. First, we recall the study of exponential convergence from different points of view in different subjects: probability theory, spectral theory and harmonic analysis (§5.1 and §5.2). Then we introduce the explicit criterion for the convergence, the variational formulas and explicit estimates for the convergence rates. Some comparison with the known results and an application are included (§5.3). Next, we present ten (eleven) criteria for the two classes of processes, respectively (§5.4). The technical proofs are collected in the last two sections. Section 5.5 is devoted to the exponential ergodicity for the discrete case. Section 5.6 is devoted to the strong ergodicity, by using both the analytic and the coupling proofs.

Let us begin the chapter by recalling the three traditional types of ergodicity.

5.1 Three traditional types of ergodicity

Let $Q = (q_{ij})$ be a regular Q-matrix on a countable set $E = \{i, j, k, \cdots\}$. That is, $q_{ij}\ge 0$ for all $i\ne j$, $q_i := -q_{ii} = \sum_{j\ne i}q_{ij} < \infty$ for all $i\in E$ and $Q$ determines uniquely a transition probability matrix $P_t = (p_{ij}(t))$ (which is also called a Q-process or a Markov chain). Denote by $\pi = (\pi_i)$ a stationary distribution of $P_t$: $\pi P_t = \pi$ for all $t\ge 0$. From now on, assume that the Q-matrix is irreducible and hence the stationary distribution $\pi$ is unique. Then, the three types of ergodicity are defined, respectively, as follows.

$$
\begin{aligned}
&\text{Ordinary ergodicity:} && \lim_{t\to\infty}|p_{ij}(t)-\pi_j| = 0 && (5.1)\\
&\text{Exponential ergodicity:} && \lim_{t\to\infty}e^{\alpha t}|p_{ij}(t)-\pi_j| = 0 && (5.2)\\
&\text{Strong ergodicity:} && \lim_{t\to\infty}\sup_i|p_{ij}(t)-\pi_j| = 0 && (5.3)\\
& && \Longleftrightarrow\ \lim_{t\to\infty}e^{\beta t}\sup_i|p_{ij}(t)-\pi_j| = 0, && (5.4)
\end{aligned}
$$

where $\alpha$ and $\beta$ are (the largest) positive constants and $i, j$ vary over the whole of $E$. The equivalence in (5.4) is well known; one may refer to the second part of §5.6. These definitions are meaningful for general Markov processes once the pointwise convergence is replaced by the convergence in total variation norm. The three types of ergodicity were studied a great deal during 1953–1981. In particular, it was proved that

strong ergodicity $\Longrightarrow$ exponential ergodicity $\Longrightarrow$ ordinary ergodicity.

Refer to W. J. Anderson (1991), Chen (1992a, Chapter 4), and S. P. Meyn and R. L. Tweedie (1993b) for details and related references. We will come back to this topic in Chapter 8. The study is quite complete in the sense that we have the following criteria, which are described by the Q-matrix plus a test sequence $(y_i)$ only, except for the exponential ergodicity, for which one requires an additional parameter $\lambda$.

Theorem 5.1 (Criteria). Let $H\ne\emptyset$ be an arbitrary but fixed finite subset of $E$. Then the following conclusions hold.

(1) The process $P_t$ is ergodic iff the system of inequalities

$$
\begin{cases}\sum_j q_{ij}y_j \le -1, & i\notin H\\[2pt] \sum_{i\in H}\sum_{j\ne i}q_{ij}y_j < \infty\end{cases} \tag{5.5}
$$

has a nonnegative finite solution $(y_i)$.

(2) The process $P_t$ is exponentially ergodic iff for some $\lambda > 0$ with $\lambda < q_i$ for all $i$, the system of inequalities

$$
\begin{cases}\sum_j q_{ij}y_j \le -\lambda y_i - 1, & i\notin H\\[2pt] \sum_{i\in H}\sum_{j\ne i}q_{ij}y_j < \infty\end{cases} \tag{5.6}
$$

has a nonnegative finite solution $(y_i)$.

(3) The process $P_t$ is strongly ergodic iff the system (5.5) of inequalities has a bounded nonnegative solution $(y_i)$.


The probabilistic meaning of the criteria reads, respectively, as follows:

$$
\max_{i\in H}\mathbb{E}_i\sigma_H < \infty,\qquad \max_{i\in H}\mathbb{E}_ie^{\lambda\sigma_H} < \infty\qquad\text{and}\qquad \sup_{i\in E}\mathbb{E}_i\sigma_H < \infty,
$$

where $\sigma_H = \inf\{t\ge\text{the first jumping time} : X_t\in H\}$ and $\lambda$ is the same as in (5.6). The criteria are not completely explicit since they depend on the test sequences $(y_i)$, and in general it is often non-trivial to solve a system of infinitely many inequalities. Hence, one expects to find some explicit criteria for some specific processes. Clearly, for this, the first candidate should be the birth–death processes. Recall that for a birth–death process with state space $E = \mathbb{Z}_+ = \{0, 1, 2, \cdots\}$, its Q-matrix has the form: $q_{i,i+1} = b_i > 0$ for all $i\ge 0$, $q_{i,i-1} = a_i > 0$ for all $i\ge 1$ and $q_{ij} = 0$ for all other $i\ne j$. Along this line, it was proved by R. L. Tweedie (1981) [see also W. J. Anderson (1991) or Chen (1992)] that

$$
S := \sum_{n\ge 1}\mu_n\sum_{j\le n-1}\frac{1}{\mu_jb_j} < \infty \ \Longrightarrow\ \text{Exponential ergodicity}, \tag{5.7}
$$

where $\mu_0 = 1$ and $\mu_n = b_0\cdots b_{n-1}/(a_1\cdots a_n)$ for all $n\ge 1$. Refer to Z. K. Wang (1980), X. Q. Yang (1986) or Hou et al (2000) for the probabilistic meaning of $S$. The condition is explicit since it depends only on the rates $a_i$ and $b_i$. However, the condition is not necessary. A simple example is as follows. Let $a_i = b_i = i^\gamma$ $(i\ge 1)$ and $b_0 = 1$. Then the process is exponentially ergodic iff $\gamma\ge 2$ but $S < \infty$ iff $\gamma > 2$. See Chen (1996) or Examples 8.2 and 8.3. Surprisingly, the condition is correct for strong ergodicity.
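The example $a_i = b_i = i^\gamma$ can be explored numerically. With these rates $\mu_n = n^{-\gamma}$ and $\mu_jb_j = 1$, so $S = \sum_{n\ge 1}n^{1-\gamma}$. The sketch below computes partial sums of $S$ directly from the rates; the function names `partial_S` and `power_rates` and the truncation levels are hypothetical, and a finite partial sum can only suggest, not prove, convergence or divergence.

```python
def partial_S(a, b, N):
    """Partial sum over 1 <= n <= N of mu_n * sum_{j<=n-1} 1/(mu_j b_j), as in (5.7)."""
    mu_prev = 1.0                        # mu_0 = 1
    inner = 1.0 / (mu_prev * b(0))       # sum_{j<=0} 1/(mu_j b_j)
    total = 0.0
    for n in range(1, N + 1):
        mu_n = mu_prev * b(n - 1) / a(n) # mu_n = mu_{n-1} b_{n-1} / a_n
        total += mu_n * inner
        inner += 1.0 / (mu_n * b(n))     # extend the inner sum to j <= n
        mu_prev = mu_n
    return total

def power_rates(gamma):
    """Rates a_i = b_i = i**gamma for i >= 1 and b_0 = 1 (the example in the text)."""
    a = lambda i: float(i) ** gamma
    b = lambda i: 1.0 if i == 0 else float(i) ** gamma
    return a, b

a2, b2 = power_rates(2.0)
a3, b3 = power_rates(3.0)
S2_small, S2_big = partial_S(a2, b2, 100), partial_S(a2, b2, 10000)
S3_big = partial_S(a3, b3, 10000)
print(S2_small, S2_big, S3_big)
```

For $\gamma = 2$ the partial sums are harmonic numbers and keep growing like $\log N$, while for $\gamma = 3$ they stay below $\pi^2/6$, in line with "$S < \infty$ iff $\gamma > 2$".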

Theorem 5.2 (H. J. Zhang, X. Lin and Z. T. Hou, 2000). $S < \infty \Longleftrightarrow$ Strong ergodicity.

Refer to Z. T. Hou et al (2000). With a different proof, the result is extended by Y. H. Zhang (2001) to the single-birth processes with state space $\mathbb{Z}_+$ (the details are presented in §5.6 below). Here, the term "single birth" means that $q_{i,i+1} > 0$ for all $i\ge 0$ but $q_{ij}\ge 0$ can be arbitrary for $j < i$. Introducing this class of Q-processes is due to the following observation: If the first inequality in (5.5) is replaced by equality, then we get a recursion formula for $(y_i)$ with one parameter only. Hence, there should exist an explicit criterion for the ergodicity (resp., uniqueness, recurrence and strong ergodicity). For (5.6), there is also a recursion formula but now two parameters are involved and so it is unclear whether there exists an explicit criterion or not for the exponential ergodicity.

Note that the criteria are not enough to estimate the convergence rate $\alpha$ or $\beta$ [cf., Chen (2000a)]. This is the main reason why we have to come back to study the well-developed theory of Markov chains. For birth–death processes, the estimation of $\alpha$ was studied by E. A. van Doorn in a book (1981) and in a series of papers (1985; 1987; 1991; 2002). He proved, for instance, the following lower bound

$$
\alpha \ge \inf_{i\ge 0}\Big\{a_{i+1} + b_i - \sqrt{a_ib_i} - \sqrt{a_{i+1}b_{i+1}}\Big\},
$$

which is exact when $a_i$ and $b_i$ are constant. The following formula for the lower bounds was implicit in his papers and was rediscovered from a different point of view (in the study of the spectral gap) by Chen (1996):

$$
\alpha = \sup_{v>0}\inf_{i\ge 0}\big\{a_{i+1} + b_i - a_i/v_{i-1} - b_{i+1}v_i\big\}.
$$
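For constant rates both expressions can be evaluated by hand, which gives a quick consistency check. The sketch below assumes $a_i = 2$ for $i\ge 1$, $a_0 = 0$, $b_i = 1$ and the constant test sequence $v_i = \sqrt{a/b}$; these choices, the truncation to finitely many indices and the function names are illustrative assumptions. A single test sequence gives only a lower bound for the supremum in the variational formula.

```python
import math

def doorn_bound(a, b, n_terms):
    """inf over 0 <= i < n_terms of a_{i+1} + b_i - sqrt(a_i b_i) - sqrt(a_{i+1} b_{i+1})."""
    return min(a(i + 1) + b(i) - math.sqrt(a(i) * b(i)) - math.sqrt(a(i + 1) * b(i + 1))
               for i in range(n_terms))

def variational_bound(a, b, v, n_terms):
    """inf over 0 <= i < n_terms of a_{i+1} + b_i - a_i / v_{i-1} - b_{i+1} v_i (a_0 = 0)."""
    def term(i):
        back = a(i) / v(i - 1) if i >= 1 else 0.0   # the a_0 / v_{-1} term vanishes
        return a(i + 1) + b(i) - back - b(i + 1) * v(i)
    return min(term(i) for i in range(n_terms))

a = lambda i: 2.0 if i >= 1 else 0.0     # hypothetical constant death rate
b = lambda i: 1.0                        # hypothetical constant birth rate
v = lambda i: math.sqrt(2.0)             # test sequence v_i = sqrt(a / b)

lower = doorn_bound(a, b, 50)
with_v = variational_bound(a, b, v, 50)
print(lower, with_v, (math.sqrt(2.0) - 1.0) ** 2)
```

Both quantities equal $(\sqrt{a}-\sqrt{b})^2 = 3 - 2\sqrt{2}$ here, consistent with the remark that the lower bound is exact for constant rates.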

Besides, the precise $\alpha$ was determined by E. A. van Doorn for four practical models. The main tool used in van Doorn's study is the Karlin-McGregor representation theorem, a specific spectral representation, involving heavy techniques. No explicit criterion for $\alpha > 0$ had appeared so far.

5.2 The first (non-trivial) eigenvalue (spectral gap)

The birth–death processes have a nice property, symmetrizability: $\mu_ip_{ij}(t) = \mu_jp_{ji}(t)$ for all $i, j$ and $t\ge 0$. Then, the matrix $Q$ can be regarded as a self-adjoint operator on the real $L^2$-space $L^2(\mu)$ with norm $\|\cdot\|$. In other words, one can use the well-developed $L^2$-theory. For instance, one can study the $L^2$-exponential convergence given below. Assume that $Z = \sum_i\mu_i < \infty$ and set $\pi_i = \mu_i/Z$. Then, the convergence means that

$$
\|P_tf - \pi(f)\| \le \|f-\pi(f)\|\,e^{-\lambda_1t} \tag{5.8}
$$

for all $t\ge 0$, where $\pi(f) = \int f\,d\pi$ and $\lambda_1$ is the first non-trivial eigenvalue (more precisely, the spectral gap) of $(-Q)$ [cf., Chen (1992a, Chapter 9)].

The estimation of $\lambda_1$ for birth–death processes was studied by W. G. Sullivan (1984), Liggett (1989) and C. Landim, S. Sethuraman and S. R. S. Varadhan (1996) [see also C. Kipnis and C. Landim (1999)]. It was used as a comparison tool to handle the convergence rate for some interacting particle systems, which are infinite-dimensional Markov processes. Here we recall three results as follows.

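Theorem 5.3 (W. G. Sullivan, 1984). Let $c_1$ and $c_2$ be two constants satisfying

$$
c_1 \ge \sup_{i\ge 1}\frac{\sum_{j\ge i}\mu_j}{\mu_i},\qquad c_2 \ge \sup_{i\ge 1}\frac{\mu_i}{\mu_ia_i}.
$$

Then $\lambda_1 \ge 1/(4c_1^2c_2)$.

Theorem 5.4 (T. M. Liggett, 1989). Let $c_1$ and $c_2$ be two constants satisfying

$$
c_1 \ge \sup_{i\ge 1}\frac{\sum_{j\ge i}\mu_j}{\mu_ia_i},\qquad c_2 \ge \sup_{i\ge 1}\frac{\sum_{j\ge i}\mu_ja_j}{\mu_ia_i}.
$$

Then $\lambda_1 \ge 1/(4c_1c_2)$.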


Theorem 5.5 (T. M. Liggett, 1989). Let $\inf_{i\ge 1}a_i > 0$, $\inf_{i\ge 0}b_i > 0$ and $\sup_{i\ge 0}b_i < \infty$. Then $\lambda_1 > 0$ iff $(\mu_i)$ has an exponential tail: $\sum_{j\ge i}\mu_j \le C\mu_i$ for all $i$ and some constants $C < \infty$ and $c > 0$.

The reason we are mainly interested in the lower bounds is that, on the one hand, they are more useful in practice and, on the other hand, the upper bounds are usually easier to obtain from the following classical variational formula:

$$
\lambda_1 = \inf\big\{D(f) : \mu(f) = 0,\ \mu\big(f^2\big) = 1\big\},
$$

where

$$
D(f) = \frac12\sum_{i,j}\mu_iq_{ij}(f_j-f_i)^2,\qquad \mathscr{D}(D) = \{f\in L^2(\mu) : D(f) < \infty\}
$$

and $\mu(f) = \int f\,d\mu$.

Let us now leave Markov chains for a while and turn to diffusions.

One-dimensional diffusions

As a parallel to the birth–death process, we now consider an elliptic operator $L = a(x)\,d^2/dx^2 + b(x)\,d/dx$ on the half line $[0,\infty)$ with $a(x) > 0$ everywhere and with reflecting boundary at the origin. Again, we are interested in the estimation of the principal eigenvalues, which constitutes the typical, well-known Sturm-Liouville eigenvalue problem in spectral theory. Refer to Y. Egorov and V. Kondratiev (1996) for the present status of the study and references. Here, we mention two results, which are the most general ones previously known to us.

Theorem 5.6. Let $b(x)\equiv 0$ (which corresponds to the birth–death process with $a_i = b_i$ for all $i\ge 1$) and set $\delta = \sup_{x>0}x\int_x^\infty a^{-1}$. Here we omit the integration variable when it is integrated with respect to the Lebesgue measure. Then, we have

(1) I. S. Kac and M. G. Krein (1958): $\delta^{-1}\ge\lambda_0\ge(4\delta)^{-1}$, where $\lambda_0$ is the first eigenvalue corresponding to the Dirichlet boundary $f(0) = 0$.

(2) S. Kotani and S. Watanabe (1982): $\delta^{-1}\ge\lambda_1\ge(4\delta)^{-1}$.

It is a simple matter to rewrite the classical variational formula as (5.9) below. Similarly, we have (5.10) for $\lambda_0$.

Poincaré inequalities

$$
\begin{aligned}
&\lambda_1: && \|f-\pi(f)\|^2 \le \lambda_1^{-1}D(f) && (5.9)\\
&\lambda_0: && \|f\|^2 \le \lambda_0^{-1}D(f),\quad f(0)=0. && (5.10)
\end{aligned}
$$

It is interesting that inequality (5.10) is a special but typical case of the weighted Hardy inequality discussed in the next subsection.


Weighted Hardy inequality

The classical Hardy inequality goes back to G. H. Hardy (1920):

$$
\int_0^\infty\Big(\frac{f}{x}\Big)^p \le \Big(\frac{p}{p-1}\Big)^p\int_0^\infty f'^{\,p},\qquad f(0)=0,\ f'\ge 0,
$$

where the optimal constant was determined by Landau (1926). After a long period of efforts by analysts, the inequality was finally extended to the following form, called the weighted Hardy inequality, by B. Muckenhoupt (1972):

$$
\int_0^\infty f^2\,d\nu \le A\int_0^\infty f'^{\,2}\,d\lambda,\qquad f\in C^1,\ f(0)=0, \tag{5.11}
$$

where $\nu$ and $\lambda$ are nonnegative Borel measures.

The Hardy-type inequalities play a very important role in the study of harmonic analysis and have been treated in many publications. Refer to the books B. Opic and A. Kufner (1990), E. M. Dynkin (1990), V. G. Maz'ya (1985) and the survey article E. B. Davies (1999) for more details. We will come back to this inequality soon.

We have finished the overview of the study of exponential convergence (equivalently, the Poincaré inequalities) in the different subjects. The difficulties of the topic are illustrated in §1.1.

5.3 Results about the first eigenvalues and the exponentially ergodic rate

We are now in a position to state our results. To do so, define

$$
W = \{w : w_0 = 0,\ w_i\uparrow\uparrow\},\qquad Z = \sum_i\mu_i,\qquad \delta = \sup_{i\ge 1}\sum_{j\le i-1}\frac{1}{\mu_jb_j}\sum_{j\ge i}\mu_j,
$$

where "$\uparrow\uparrow$" means strictly increasing. Recall the notation $\bar w_i = w_i - \pi(w)$. By suitable modification, we can define $\widetilde W$ and explicit sequences $\delta_n$ and $\tilde\delta_n$. Refer to Chen (2001a) for details.

The next result provides a complete answer to the question proposed in §5.1.

Theorem 5.7. For birth–death processes, the following assertions hold.

(1) Dual variational formulas:

$$
\begin{aligned}
\lambda_1 &= \sup_{w\in W}\inf_{i\ge 0}\ \mu_ib_i(w_{i+1}-w_i)\Big/\sum_{j\ge i+1}\mu_j\bar w_j && \text{[Chen (1996)]} && (5.12)\\
&= \inf_{w\in\widetilde W}\sup_{i\ge 0}\ \mu_ib_i(w_{i+1}-w_i)\Big/\sum_{j\ge i+1}\mu_j\bar w_j && \text{[Chen (2001a)]} && (5.13)
\end{aligned}
$$

(2) Approximating procedure and explicit bounds:

$$
Z\delta^{-1} \ge \tilde\delta_n^{-1} \ge \lambda_1 \ge \delta_n^{-1} \ge (4\delta)^{-1}\quad\text{for all } n\quad\text{[Chen (2000b, 2001a)]}.
$$

(3) Explicit criterion: $\lambda_1 > 0$ iff $\delta < \infty$ [L. Miclo (1999b), Chen (2000b)].

(4) Relation: $\alpha = \lambda_1$ [Chen (1991b)].
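The explicit bounds in part (2) can be tried out on the simplest example. The sketch below assumes constant rates $a_i = 2$, $b_i = 1$, for which the exact value $\lambda_1 = \alpha = (\sqrt{2}-1)^2$ is taken from the exactness of van Doorn's bound for constant rates stated in §5.1; the truncation to finitely many states, the restriction of the supremum to small $i$ and the function name `delta_and_Z` are assumptions of the example.

```python
import math

def delta_and_Z(a, b, n_states=400, n_check=100):
    """Approximate delta = sup_{i>=1} sum_{j<=i-1} 1/(mu_j b_j) * sum_{j>=i} mu_j, and Z."""
    mu = [1.0]
    for n in range(1, n_states):
        mu.append(mu[-1] * b(n - 1) / a(n))       # mu_n = b_0...b_{n-1} / (a_1...a_n)
    tail = [0.0] * (n_states + 1)
    for i in range(n_states - 1, -1, -1):
        tail[i] = tail[i + 1] + mu[i]             # truncated tail sum_{j>=i} mu_j
    delta, head = 0.0, 0.0
    for i in range(1, n_check + 1):
        head += 1.0 / (mu[i - 1] * b(i - 1))      # sum_{j<=i-1} 1/(mu_j b_j)
        delta = max(delta, head * tail[i])
    return delta, tail[0]

# Hypothetical constant rates: death rate 2, birth rate 1.
delta, Z = delta_and_Z(lambda i: 2.0, lambda i: 1.0)
lam1 = (math.sqrt(2.0) - 1.0) ** 2                # exact rate for these rates (see lead-in)
print(1.0 / (4.0 * delta), lam1, Z / delta)
```

Here $\mu_i = 2^{-i}$, so the product in $\delta$ equals $2(1-2^{-i})$ and $\delta = 2$, $Z = 2$; the chain of bounds reads $1/8 \le 0.1716 \le 1$.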

The formula (5.12) is nothing new but Theorem 3.2 (§3.7). The proofs of parts (2) and (3) are similar to the continuous case (stated as Theorem 5.8 below), and will be presented in §6.7. In part (1) of Theorem 5.7, only two notations are used: the sets $W$ and $\widetilde W$ of test functions (sequences). Clearly, for each test function, (5.12) gives us a lower bound of $\lambda_1$. This explains the meaning of "variational". Because of (5.12), it is now easy to obtain some lower estimates of $\lambda_1$, and in particular, one obtains all the lower bounds given by Theorems 5.3–5.5. Next, by exchanging the order of "sup" and "inf", we get (5.13) from (5.12), ignoring a slight modification of $W$. In other words, (5.12) and (5.13) are dual to each other. For the explicit estimates "$\delta^{-1}\ge\lambda_0\ge(4\delta)^{-1}$" and in particular for the criterion, one needs to find a representative test function $w$ among all $w\in W$. This is certainly not obvious, because the test function $w$ used in the formula is indeed a mimic of the eigenfunction (eigenvector) of $\lambda_1$, and in general, the eigenvalues and the corresponding eigenfunctions can be very sensitive, as we have seen from the above examples. Fortunately, there exists such a representative function with a simple form. We will illustrate the function in the context of diffusions in the second to last paragraph of this section.

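In parallel, for diffusions on $[0,\infty)$, define

$$
\begin{aligned}
C(x) &= \int_0^x b/a,\qquad \delta = \sup_{x>0}\int_0^x e^{-C}\int_x^\infty e^C/a,\\
\mathscr{F} &= \big\{f\in C[0,\infty)\cap C^1(0,\infty) : f(0)=0\text{ and }f'|_{(0,\infty)}>0\big\}.
\end{aligned}
$$

Again, denote by $\widetilde{\mathscr{F}}$ a suitable modification of $\mathscr{F}$ [cf., (6.8) below].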

Theorem 5.8 (Chen (1999a, 2000b, 2001a)). For diffusion on $[0,\infty)$, the following assertions hold.

(1) Dual variational formulas:

$$
\begin{aligned}
\lambda_0 &\ge \sup_{f\in\mathscr{F}}\inf_{x>0}\ e^{C(x)}f'(x)\Big/\int_x^\infty fe^C/a && (5.14)\\
\lambda_0 &\le \inf_{f\in\widetilde{\mathscr{F}}}\sup_{x>0}\ e^{C(x)}f'(x)\Big/\int_x^\infty fe^C/a && (5.15)
\end{aligned}
$$

Furthermore, equality holds in (5.14) and (5.15) if both $a$ and $b$ are continuous on $[0,\infty)$.

(2) Approximating procedure and explicit bounds: A decreasing sequence $\{\delta_n\}$ and an increasing sequence $\{\tilde\delta_n\}$ are constructed explicitly such that

$$
\delta^{-1} \ge \tilde\delta_n^{-1} \ge \lambda_0 \ge \delta_n^{-1} \ge (4\delta)^{-1}\quad\text{for all } n.
$$

(3) Explicit criterion: $\lambda_0$ (resp., $\lambda_1$) $> 0$ iff $\delta < \infty$.

We mention that the above two results are also based on Chen and F. Y. Wang (1997a).

To see the power of the dual variational formulas, let us return to the weighted Hardy inequality.

Theorem 5.9 (B. Muckenhoupt, 1972). The optimal constant $A$ in the inequality

$$
\int_0^\infty f^2\,d\nu \le A\int_0^\infty f'^{\,2}\,d\lambda,\qquad f\in C^1,\ f(0)=0, \tag{5.16}
$$

satisfies $B\le A\le 4B$, where

$$
B = \sup_{x>0}\ \nu[x,\infty]\int_0^x\big(d\lambda_{\mathrm{abs}}/d\mathrm{Leb}\big)^{-1}
$$

and $d\lambda_{\mathrm{abs}}/d\mathrm{Leb}$ is the derivative of the absolutely continuous part of $\lambda$ with respect to the Lebesgue measure.

By setting $\nu = \pi$ and $\lambda = e^C\,dx$, it follows that the criterion in Theorem 5.8 is a consequence of Muckenhoupt's theorem. Along this line, the criteria in Theorems 5.7 and 5.8 for a typical class of the processes were also obtained by S. G. Bobkov and F. Götze (1999a; 1999b), in which the contribution of an earlier paper by J. H. Luo (1992) was noted.

We now point out that the explicit estimates "$\delta^{-1}\ge\lambda_0\ge(4\delta)^{-1}$" in Theorems 5.8 or 5.9 follow from our variational formulas immediately. Here we consider the lower bound "$(4\delta)^{-1}$" only; the proof for the upper bound "$\delta^{-1}$" is also easy, in terms of (5.15).

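Recall that $\delta = \sup_{x>0}\int_0^xe^{-C}\int_x^\infty e^C/a$. Set $\varphi(x) = \int_0^xe^{-C}$. By using the integration by parts formula, it follows that

$$
\int_x^\infty\frac{\sqrt{\varphi}\,e^C}{a} = -\int_x^\infty\sqrt{\varphi}\ d\Big(\int_\bullet^\infty\frac{e^C}{a}\Big) \le \frac{\delta}{\sqrt{\varphi(x)}} + \frac{\delta}{2}\int_x^\infty\frac{\varphi'}{\varphi^{3/2}} \le \frac{2\delta}{\sqrt{\varphi(x)}}.
$$

Hence

$$
I(\sqrt{\varphi})(x) = \frac{e^{-C(x)}}{(\sqrt{\varphi})'(x)}\int_x^\infty\frac{\sqrt{\varphi}\,e^C}{a} \le \frac{e^{-C(x)}\sqrt{\varphi(x)}}{(1/2)e^{-C(x)}}\cdot\frac{2\delta}{\sqrt{\varphi(x)}} = 4\delta.
$$

This gives us the required bound by (5.14).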


Theorem 5.8 can be immediately applied to the whole line or to the higher-dimensional situation. For instance, for the Laplacian on compact Riemannian manifolds, it was proved by Chen and F. Y. Wang (1997b) that

$$
\lambda_1 \ge \sup_{f\in\mathscr{F}}\inf_{r\in(0,D)}\frac{4f(r)}{\int_0^rC(s)^{-1}ds\int_s^DC(u)f(u)\,du} =: \xi_1,
$$

for some specific function $C(r)$ (Theorem 1.1). Thanks are given to the coupling technique, which reduces the higher-dimensional case to dimension one. We now have $\delta^{-1}\ge\tilde\delta_n^{-1}\downarrow\ \ge\xi_1\ge\delta_n^{-1}\uparrow\ \ge(4\delta)^{-1}$, similar to Theorem 5.8. Refer to Chen (2000b; 2001a) for details. As we mentioned before, the use of the test functions is necessary for producing sharp estimates. Actually, the variational formula enables us to improve a number of best known estimates obtained previously by geometers, but none of them can be deduced from the estimates "$\delta^{-1}\ge\xi_1\ge(4\delta)^{-1}$". Besides, the approximating procedure enables us to determine the optimal linear approximation of $\xi_1$ in $K$:

$$
\xi_1 \ge \frac{\pi^2}{D^2} + \frac{K}{2},
$$

where $D$ is the diameter of the manifold and $K$ is the lower bound of the Ricci curvature, as stated in Corollary 1.3 [cf., Chen, E. Scacciatelli and L. Yao (2002)]. We have thus shown the value of our dual variational formulas.

5.4 Explicit criteria

Three basic inequalities

Up to now, we have mainly studied the Poincaré inequality, i.e., (5.17) below. Naturally, one may study other inequalities, for instance, the logarithmic Sobolev inequality or the Nash inequality listed below.

$$
\begin{aligned}
&\text{Poincaré inequality:} && \|f-\pi(f)\|^2 \le \lambda_1^{-1}D(f) && (5.17)\\
&\text{Logarithmic Sobolev inequality:} && \int f^2\log(|f|/\|f\|)\,d\pi \le \sigma^{-1}D(f) && (5.18)\\
&\text{Nash inequality:} && \|f-\pi(f)\|^{2+4/\nu} \le \eta^{-1}D(f)\,\|f\|_1^{4/\nu}\quad(\text{for some }\nu>0). && (5.19)
\end{aligned}
$$

Here, to save notation, $\sigma$ (resp., $\eta$) denotes the largest constant so that (5.18) (resp., (5.19)) holds.

The next inequality is a generalization of the Nash one.

$$
\text{Liggett–Stroock inequality:}\qquad \|f-\pi(f)\|^2 \le CD(f)^{1/p}\,V(f)^{1/q}, \tag{5.20}
$$

where $V$ is homogeneous of degree two:

$$
V(cf+d) = c^2V(f),\qquad c, d\in\mathbb{R}. \tag{5.21}
$$


The importance of these inequalities is due to the fact that each inequality describes a type of ergodicity.

Theorem 5.10 ([Chen (1991b; 1992a), T. M. Liggett (1989; 1991)]). Let $V$ satisfy (5.21) and $(P_t)_{t\ge 0}$ be the semigroup determined by the Dirichlet form $(D, \mathscr D(D))$.

(1) Let $V(P_tf) \le V(f)$ for all $t \ge 0$ and $f\in L^2(\pi)$ (which is automatic when $V(f) = \|f\|_r^2$). Then the Liggett–Stroock inequality implies that

$$\mathrm{Var}(P_tf) = \|P_tf-\pi(f)\|^2 \le C\,V(f)/t^{q-1},\qquad t>0.$$

(2) Conversely, the last inequality implies the Liggett–Stroock inequality.

(3) Poincaré inequality $\Longleftrightarrow$ $\mathrm{Var}(P_tf) \le \mathrm{Var}(f)\exp[-2\lambda_1t]$.

Proof. Here we prove parts (1) and (2) only. The proof of part (3) is similar [cf., Chen (1992a, Theorem 9.1)].

a) Assume that (5.20) holds. Let $f\in\mathscr D(D)$ and $\pi(f) = 0$. Then $f_t := P(t)f\in\mathscr D(D)$. Set $F_t = \pi(f_t^2)$. Since $V(f_t)\le V(f)$, inequality (5.20) applied to $f_t$ gives

$$F_t' = -2D(f_t) \le -2C^{-p}V(f)^{-p/q}\|f_t\|^{2p} = -2C^{-p}F_t^{\,p}\,V(f)^{-p/q}.$$

Now, part (1) follows from Corollary A.4.

b) Conversely, since the process is reversible, the spectral representation theorem gives us

$$\frac1t\,(f-P(t)f,\ f)\ \uparrow\ D(f)\qquad\text{as } t\downarrow 0.$$

Hence

$$\|f\|^2 - tD(f) \le (P(t)f, f) \le \|P(t)f\|\,\|f\| \le \|f\|\sqrt{C\,V(f)\,t^{1-q}},\qquad \pi(f)=0.$$

Put $A = D(f)$, $B = \|f\|\sqrt{C\,V(f)}$ and $C_1 = \|f\|^2$. It follows that $C_1 - At \le Bt^{(1-q)/2}$. The function $h(t) := At + Bt^{(1-q)/2} - C_1$ ($\ge 0$ for all $t>0$) achieves its minimum

$$h(t_0) = \Big[\Big(\frac{q-1}{2}\Big)^{2/(q+1)} + \Big(\frac{2}{q-1}\Big)^{(q-1)/(q+1)}\Big]A^{(q-1)/(q+1)}B^{2/(q+1)} - C_1$$

at the point

$$t_0 = \Big[\frac{2A}{B(q-1)}\Big]^{-2/(q+1)} > 0.$$

Now, since $h(t_0)\ge 0$, it follows that $\|f\|^2 \le C_2D(f)^{1/p}V(f)^{1/q}$ for some constant $C_2>0$ and so we have proved part (2).
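The closed forms for $t_0$ and $h(t_0)$ used in part b) can be checked numerically. The following is only a sketch: the values of `A`, `B`, `C1` and `q` are arbitrary illustrative choices (they are not taken from the text), and the helper names are hypothetical.

```python
def h(t, A, B, C1, q):
    """h(t) = A t + B t^{(1-q)/2} - C1, the function minimized in part b)."""
    return A * t + B * t ** ((1.0 - q) / 2.0) - C1

def t0_closed_form(A, B, q):
    """Critical point t0 = [2A / (B (q - 1))]^{-2/(q+1)}."""
    return (2.0 * A / (B * (q - 1.0))) ** (-2.0 / (q + 1.0))

def h_min_closed_form(A, B, C1, q):
    """Closed form of h(t0) as displayed in the proof."""
    k = ((q - 1.0) / 2.0) ** (2.0 / (q + 1.0)) \
        + (2.0 / (q - 1.0)) ** ((q - 1.0) / (q + 1.0))
    return k * A ** ((q - 1.0) / (q + 1.0)) * B ** (2.0 / (q + 1.0)) - C1

# Illustrative (assumed) values; any A, B > 0 and q > 1 would do.
A, B, C1, q = 1.3, 0.7, 0.5, 3.0
t0 = t0_closed_form(A, B, q)
# Scan a neighbourhood of t0 to confirm that it is a minimizer.
grid_min = min(h(t0 * (0.5 + 0.001 * k), A, B, C1, q) for k in range(1001))
```

Since $h$ is convex on $(0,\infty)$ when $q>1$, the critical point is the global minimizer, which is what the grid scan illustrates.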


Criteria

Recently, criteria for the logarithmic Sobolev and the Nash inequalities, as well as for the discrete spectrum (which means that there is no continuous spectrum and, moreover, all eigenvalues have finite multiplicity), were obtained by Mao (2000, 2002a, b), based on the weighted Hardy's inequality. On the other hand, the main parts of Theorems 5.7 and 5.8 were extended to a general class of Banach spaces by Chen (2002a; 2002d; 2003a), which unifies a large class of inequalities and, in particular, provides a unified criterion. This is the aim of the next chapter. We can now summarize the results in Tables 1.5 and 5.1. The tables are arranged in such an order that the property in a later line is stronger than that in the former one. The only exception is that, even though strong ergodicity is often stronger than the logarithmic Sobolev inequality, they are not comparable in general (Chen, 2002b); refer to Chapter 8 for more details.

For birth–death processes, ten criteria are presented in Table 1.5. For two of the criteria, the proofs are given in the next two sections, respectively.

For diffusion processes on $[0,\infty)$ with reflecting boundary and operator

$$L = a(x)\frac{d^2}{dx^2} + b(x)\frac{d}{dx},$$

define

$$C(x) = \int_0^x b/a,\qquad \mu[x,y] = \int_x^y e^C/a.$$

Then we have criteria listed in Table 5.1.

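| Property | Criterion |
|---|---|
| Uniqueness | $\int_0^\infty \mu[0,x]\,e^{-C(x)} = \infty$ (∗) |
| Recurrence | $\int_0^\infty e^{-C(x)} = \infty$ |
| Ergodicity | (∗) & $\mu[0,\infty) < \infty$ |
| Exponential ergodicity; $L^2$-exp. convergence | (∗) & $\sup_{x>0}\mu[x,\infty)\int_0^x e^{-C} < \infty$ |
| Discrete spectrum | (∗) & $\lim_{n\to\infty}\sup_{x>n}\mu[x,\infty)\int_n^x e^{-C} = 0$ |
| Log. Sobolev inequality; Exp. convergence in entropy | (∗) & $\sup_{x>0}\mu[x,\infty)\log[\mu[x,\infty)^{-1}]\int_0^x e^{-C} < \infty$ |
| Strong ergodicity; $L^1$-exp. convergence | (∗) & $\int_0^\infty \mu[x,\infty)\,e^{-C(x)} < \infty$ |
| Nash inequality | (∗) & $\sup_{x>0}\mu[x,\infty)^{(\nu-2)/\nu}\int_0^x e^{-C} < \infty$ (ε) |

Table 5.1 Eleven criteria for one-dimensional diffusions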
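To see how a criterion from Table 5.1 is evaluated in practice, here is a minimal numerical sketch for the exponential ergodicity line. The example $a(x) = 1$, $b(x) = -x$ (so that $C(x) = -x^2/2$) is an assumption chosen for illustration and is not taken from the text; the function names, the truncation `x_max` and the grid sizes are likewise hypothetical choices.

```python
import math

def mu_tail(x):
    """mu[x, inf) = int_x^inf e^{C}/a for the assumed example a = 1, C(y) = -y^2/2."""
    return math.sqrt(math.pi / 2.0) * math.erfc(x / math.sqrt(2.0))

def scale(x, steps=400):
    """int_0^x e^{-C(y)} dy = int_0^x e^{y^2/2} dy, composite Simpson rule (steps even)."""
    if x == 0.0:
        return 0.0
    h = x / steps
    s = 1.0 + math.exp(x * x / 2.0)          # end-point values
    for k in range(1, steps):
        y = k * h
        s += (4.0 if k % 2 else 2.0) * math.exp(y * y / 2.0)
    return s * h / 3.0

def delta_estimate(x_max=8.0, n=400):
    """Grid approximation of sup_{x>0} mu[x, inf) * int_0^x e^{-C}."""
    xs = [x_max * k / n for k in range(1, n + 1)]
    return max(mu_tail(x) * scale(x) for x in xs)

delta = delta_estimate()
```

For this example the product tends to zero both as $x\to 0$ and as $x\to\infty$, so the supremum is attained at a moderate $x$ and is finite, which is the condition in the table. The grid search is only an approximation of the supremum.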


Here, "(∗) & · · ·" means that one requires the uniqueness condition in the first line plus the condition "· · ·". The "(ε)" in the last line means that there is still a small gap from being necessary. In other words, when ν ∈ (0, 2], there is still no criterion for the Nash inequality. We have one more criterion here than in Table 1.5 because of the equivalence of the logarithmic Sobolev inequality and the exponential convergence in entropy. However, this is no longer true in the discrete case. In general, the logarithmic Sobolev inequality is stronger than the exponential convergence in entropy. A criterion for the exponential convergence in entropy for birth–death processes remains open [cf., S. Y. Zhang and Y. H. Mao (2000) and Y. H. Mao and S. Y. Zhang (2000)]. The two equivalences in the tables come from the diagram, Figure 1.4.

5.5 Exponential ergodicity for single birth processes

In this section, we study the exponential ergodicity for single birth processes, which are in general irreversible. In particular, we prove the criterion for the ergodicity of birth–death processes, presented in Table 1.5. The strong ergodicity for this class of processes will be studied in the next section.

The Q-matrix of a single birth process $Q = (q_{ij}: i,j\in\mathbb Z_+)$ is as follows: $q_{i,i+1}>0$, $q_{i,i+j}=0$ for all $i\in\mathbb Z_+ := \{0,1,2,\cdots\}$ and $j\ge 2$. Throughout the chapter, we consider only totally stable and conservative Q-matrix: $q_i = -q_{ii} = \sum_{j\ne i}q_{ij} < \infty$ for all $i\in\mathbb Z_+$. Define $q_n^{(k)} = \sum_{j=0}^k q_{nj}$ for $0\le k<n$ ($k, n\in\mathbb Z_+$) and

$$m_0 = \frac{1}{q_{01}},\qquad m_n = \frac{1}{q_{n,n+1}}\Big(1+\sum_{k=0}^{n-1}q_n^{(k)}m_k\Big),\quad n\ge 1,$$

$$F_n^{(n)} = 1,\qquad F_n^{(i)} = \frac{1}{q_{n,n+1}}\sum_{k=i}^{n-1}q_n^{(k)}F_k^{(i)},\quad 0\le i<n,$$

$$d_0 = 0,\qquad d_n = \frac{1}{q_{n,n+1}}\Big(1+\sum_{k=0}^{n-1}q_n^{(k)}d_k\Big),\quad n\ge 1.$$
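The recursions for $m_n$, $F_n^{(0)}$ and $d_n$ translate directly into code. The sketch below evaluates them with exact rational arithmetic for a birth–death Q-matrix and compares the output with the closed forms $\mu[0,n]/(\mu_nb_n)$, $b_0/(\mu_nb_n)$ and $\mu[1,n]/(\mu_nb_n)$. Several things here are assumptions: the rates `birth(i) = i + 1` and `death(i) = 2i` are arbitrary, the function names are hypothetical, and the convention $\mu_0 = 1$, $\mu_n = b_0\cdots b_{n-1}/(a_1\cdots a_n)$ (with $a_i$ the death rates) is the usual one for birth–death chains but is not restated in this section.

```python
from fractions import Fraction

def single_birth_sequences(q, N):
    """Return (m, F0, d) for n = 0..N; q(i, j) is the rate q_ij for i != j."""
    def q_partial(n, k):
        # q_n^{(k)} = sum_{j=0}^{k} q_{nj}, used only for 0 <= k < n
        return sum(q(n, j) for j in range(k + 1))

    m = [Fraction(1) / q(0, 1)]
    F0 = [Fraction(1)]                 # F_0^{(0)} = 1
    d = [Fraction(0)]
    for n in range(1, N + 1):
        up = q(n, n + 1)               # the single upward rate q_{n,n+1}
        m.append((1 + sum(q_partial(n, k) * m[k] for k in range(n))) / up)
        F0.append(sum(q_partial(n, k) * F0[k] for k in range(n)) / up)
        d.append((1 + sum(q_partial(n, k) * d[k] for k in range(n))) / up)
    return m, F0, d

# Assumed birth-death rates, for illustration only.
def birth(i):
    return Fraction(i + 1)

def death(i):
    return Fraction(2 * i)

def q_bd(i, j):
    if j == i + 1:
        return birth(i)
    if j == i - 1:
        return death(i)
    return Fraction(0)

N = 8
m, F0, d = single_birth_sequences(q_bd, N)

# mu_0 = 1, mu_n = b_0...b_{n-1} / (a_1...a_n)  (assumed standard convention)
mu = [Fraction(1)]
for n in range(1, N + 1):
    mu.append(mu[-1] * birth(n - 1) / death(n))

def mu_sum(i, k):
    """mu[i, k] = sum of mu_j over i <= j <= k."""
    return sum(mu[i:k + 1], Fraction(0))
```

Because a birth–death chain has $q_n^{(k)} = a_n$ for $k = n-1$ and $0$ for $k<n-1$, each recursion collapses to a one-step recursion, which is why the closed forms are so simple.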

For birth–death processes, these quantities take simpler form:

$$m_n = \frac{\mu[0,n]}{\mu_nb_n},\qquad F_n^{(0)} = \frac{b_0}{\mu_nb_n},\qquad d_n = \frac{\mu[1,n]}{\mu_nb_n},\qquad n\ge 1,$$

where $\mu[i,k] = \sum_{i\le j\le k}\mu_j$. The main advantage of single birth processes is that the exit boundary consists of at most one single point and so explicit criteria are to be expected. Here are the criteria for the three classical problems.

• Uniqueness (regularity) $\Longleftrightarrow$ $\sum_{n=0}^\infty m_n = \infty$. Next, assume that the Q-matrix is irreducible, then

• recurrence $\Longleftrightarrow$ $\sum_{n=0}^\infty F_n^{(0)} = \infty$. In the regular case,

• ergodicity $\Longleftrightarrow$ $d := \sup_{k\in\mathbb Z_+}\big(\sum_{n=0}^k d_n\big)\big/\big(\sum_{n=0}^k F_n^{(0)}\big) < \infty$

[cf., Chen (1992a, Theorems 3.16 and 4.54)].

Unfortunately, a criterion for the exponential ergodicity of general single birth processes remains unknown. Here is a sufficient condition, due to Y. H. Mao and Y. H. Zhang with an addition (Proposition 5.13), which is a generalization of the criterion for birth–death processes.

Theorem 5.11. Let the single birth Q-matrix be regular and irreducible. If

$$\inf_i q_i > 0\quad\text{and}\quad M := \sup_{i>0}\ \sum_{j=0}^{i-1}F_j^{(0)}\sum_{j=i}^\infty\frac{1}{q_{j,j+1}F_j^{(0)}} < \infty, \tag{5.22}$$

then the process is exponentially ergodic. The condition (5.22) is necessary for the exponential ergodicity of birth–death processes. Equivalently,

$$\delta := \sup_{i>0}\ \sum_{j=0}^{i-1}\frac{1}{\mu_jb_j}\sum_{j=i}^\infty\mu_j < \infty.$$

Proof. In view of Theorem 5.1, the condition $\inf_i q_i > 0$ is indeed necessary.

(a) Let $H = \{0\}$. We need to construct a solution $(g_i)$ to the equation (5.6) for a fixed $\lambda$: $0<\lambda<\inf_i q_i$. First, define an operator

$$II_i(f) = \frac{1}{f_i}\sum_{j=0}^{i-1}F_j^{(0)}\sum_{k=j+1}^\infty\frac{f_k}{q_{k,k+1}F_k^{(0)}},\qquad i\ge 1.$$

This is an analog of the operator $I(f)$ used several times before and will be discussed in more detail in the next chapter. It indicates a key point in this proof, which comes from the study on the first eigenvalue. Next, define

$$\varphi_i = \frac{1}{q_{01}}\sum_{j=0}^{i-1}F_j^{(0)},\qquad i\ge 1.$$

Then $\varphi$ is increasing in $i$ and $\varphi_1 = q_{01}^{-1}$. Let $f = cq_{10}\sqrt{q_{01}\varphi}$ for some $c>1$. Then $f$ is increasing and $f_1 = cq_{10}$. Finally, define $g = f\,II(f)$. Then $g$ is increasing and

$$g_1 = \sum_{k=1}^\infty\frac{f_k}{q_{k,k+1}F_k^{(0)}} \ge \frac{f_1}{q_{12}F_1^{(0)}} = c > 1.$$

We now need a technical result, to be proved later, taken from Chen (2000c).

Lemma 5.12. Let $(m_i)$ and $(n_i)$ be non-negative sequences satisfying

$$\sum_{i=0}^\infty m_i < \infty\quad\text{and}\quad c := \sup_{i>0}\ \sum_{j=0}^{i-1}n_j\sum_{j=i}^\infty m_j < \infty.$$

Define $\varphi_k = \sum_{j=0}^{k-1}n_j$. Then for every $\gamma\in(0,1)$, we have

$$\sum_{j=i}^\infty\varphi_j^\gamma m_j \le c(1-\gamma)^{-1}\varphi_i^{\gamma-1}.$$

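By Lemma 5.12, it follows that

$$g_i = cq_{10}\sqrt{q_{01}}\sum_{j=0}^{i-1}F_j^{(0)}\sum_{k=j+1}^\infty\frac{\sqrt{\varphi_k}}{q_{k,k+1}F_k^{(0)}} \le \frac{2Mcq_{10}}{\sqrt{q_{01}}}\sum_{j=0}^{i-1}F_j^{(0)}\varphi_{j+1}^{-1/2} \le \frac{2Mcq_{10}}{\sqrt{q_{01}\varphi_1}}\sum_{j=0}^{i-1}F_j^{(0)} < \infty,\qquad i\ge 1.$$

Let $g_0 = 1$. Then $1\le g_i<\infty$ for all $i\ge 0$. We now determine $\lambda$ in terms of the equation (5.6). When $i=1$, we get $\lambda\le(c-1)c^{-1}II_1(f)^{-1}$. When $i\ge 2$, we should have

$$\lambda g_i \le \sum_{k=0}^{i-1}q_i^{(k)}F_k^{(0)}\sum_{j=k+1}^\infty\frac{f_j}{q_{j,j+1}F_j^{(0)}} - q_{i,i+1}F_i^{(0)}\sum_{k=i+1}^\infty\frac{f_k}{q_{k,k+1}F_k^{(0)}}.$$

For this, it suffices that

$$\lambda g_i \le \sum_{k=0}^{i-1}q_i^{(k)}F_k^{(0)}\sum_{j=i}^\infty\frac{f_j}{q_{j,j+1}F_j^{(0)}} - q_{i,i+1}F_i^{(0)}\sum_{k=i+1}^\infty\frac{f_k}{q_{k,k+1}F_k^{(0)}}$$

$$\phantom{\lambda g_i} = q_{i,i+1}F_i^{(0)}\sum_{k=i}^\infty\frac{f_k}{q_{k,k+1}F_k^{(0)}} - q_{i,i+1}F_i^{(0)}\sum_{k=i+1}^\infty\frac{f_k}{q_{k,k+1}F_k^{(0)}} = f_i.$$

In other words, for (5.6), we need only $\lambda\le f_i/g_i = II_i(f)^{-1}$ for all $i\ge 2$ and $\lambda\le(c-1)c^{-1}II_1(f)^{-1}$. Then we can take any $\lambda$:

$$0 < \lambda < \Big(\frac{c-1}{c}\,II_1(f)^{-1}\Big)\wedge\Big(\inf_{i\ge 2}II_i(f)^{-1}\Big)\wedge\Big(\inf_i q_i\Big), \tag{5.23}$$

provided the right-hand side of (5.23) is positive, or equivalently $\sup_{i\ge 2}II_i(f)<\infty$. To prove the last property, define another operator

$$I_i(f) = \frac{F_i^{(0)}}{f_{i+1}-f_i}\sum_{k=i+1}^\infty\frac{f_k}{q_{k,k+1}F_k^{(0)}},\qquad i\ge 1,$$

which is exactly the analog of the one we have used many times before. By the proportion property, we get

$$\sup_{i\ge 1}II_i(f) \le \sup_{i\ge 1}I_i(f).$$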


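By Lemma 5.12 and the condition $M<\infty$, it follows that

$$I_i(f) = \frac{F_i^{(0)}}{\sqrt{\varphi_{i+1}}-\sqrt{\varphi_i}}\sum_{k=i+1}^\infty\frac{\sqrt{\varphi_k}}{q_{k,k+1}F_k^{(0)}} \le \frac{2MF_i^{(0)}}{q_{01}\big(\sqrt{\varphi_{i+1}}-\sqrt{\varphi_i}\big)\sqrt{\varphi_{i+1}}} \le 4M$$

for all $i\ge 1$. Therefore, $\sup_{i\ge 1}II_i(f)\le 4M<\infty$ as required. We have thus constructed a solution $(g_i)$ to the equation (5.6) with $1\le g_i<\infty$ for all $i$. This implies the exponential ergodicity of the process.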

For the remainder of this section, we consider birth–death processes only.

(b) Denote $\sigma_H$ by $\sigma_0$. Suppose that the process is exponentially ergodic. As mentioned below Theorem 5.1, there exists a $\lambda$ with $0<\lambda<q_i$ for all $i$ such that $E_0e^{\lambda\sigma_0}<\infty$. Define

$$e_{i0}(\lambda) = \int_0^\infty e^{\lambda t}P_i[\sigma_0>t]\,dt,\qquad i\in\mathbb Z_+.$$

Then $E_ie^{\lambda\sigma_0} = \lambda e_{i0}(\lambda)+1$. By Chen (1992a, page 148), one gets $e_{i0}(\lambda)<\infty$ for all $i\ge 1$. Furthermore, $E_ie^{\lambda\sigma_0}<\infty$ for all $i\ge 1$. Note that if the starting point is not 0, then $\sigma_0$ is equal to the first hitting time:

$$\tau_0 = \inf\{t\ge 0: X(t)=0\}.$$

Hence $E_ie^{\lambda\tau_0}<\infty$ for all $i\ge 1$. Define $m_i^{(n)} = E_i\tau_0^n$. The Taylor expansion

$$\infty > E_ie^{\lambda\tau_0} = \sum_{n=0}^\infty\frac{\lambda^n}{n!}m_i^{(n)}, \tag{5.24}$$

leads us to estimate the moments $m_i^{(n)}$. By a result due to Z. K. Wang [cf., Wang (1980, Page 525), or Z. T. Hou and Q. F. Guo (1978), or Z. K. Wang and X. Q. Yang (1992)], we have

$$m_i^{(1)} = \sum_{j=0}^{i-1}\frac{1}{\mu_jb_j}\sum_{k=j+1}^\infty\mu_k,\qquad m_i^{(n)} = n\sum_{j=0}^{i-1}\frac{1}{\mu_jb_j}\sum_{k=j+1}^\infty\mu_km_k^{(n-1)},\quad n\ge 2. \tag{5.25}$$

Obviously, $m_k^{(n)}\ge m_i^{(n)}$ if $k\ge i$. By (5.25), it follows that

$$m_i^{(n)} \ge n\sum_{j=0}^{i-1}\frac{1}{\mu_jb_j}\sum_{k=i}^\infty\mu_km_k^{(n-1)} \ge n\Big(\sum_{j=0}^{i-1}\frac{1}{\mu_jb_j}\sum_{k=i}^\infty\mu_k\Big)m_i^{(n-1)},\qquad n\ge 2$$

and

$$m_i^{(1)} \ge \sum_{j=0}^{i-1}(\mu_jb_j)^{-1}\sum_{k=i}^\infty\mu_k.$$

Hence, by induction, one gets

$$m_i^{(n)} \ge n!\Big(\sum_{j=0}^{i-1}\frac{1}{\mu_jb_j}\sum_{k=i}^\infty\mu_k\Big)^n,\qquad n\ge 1.$$


Combining this with (5.24), we obtain

$$\sum_{n=1}^\infty\Big(\lambda\sum_{j=0}^{i-1}\frac{1}{\mu_jb_j}\sum_{k=i}^\infty\mu_k\Big)^n < \infty,$$

which implies that

$$\lambda\sum_{j=0}^{i-1}\frac{1}{\mu_jb_j}\sum_{k=i}^\infty\mu_k < 1.$$

Taking the supremum over $i$, we obtain $\delta\le\lambda^{-1}<\infty$. Hence, the necessity is proven.

(c) To complete the proof of the theorem, it suffices to show that

$$\inf_i q_i = 0 \Longrightarrow \delta = \infty. \tag{5.26}$$

To do so, we need the following result.

Proposition 5.13. For a general reversible Markov chain, if $\inf_i q_i = 0$, then $\lambda_1 = 0$.

Having the result at hand, the proof of (5.26) is trivial, because for birth–death processes, by Theorem 4.7 (2), we have $(4\delta)^{-1}\le\lambda_1\le Z\delta^{-1}$.
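As a concrete illustration of the quantity $\delta$ in Theorem 5.11, the sketch below evaluates it for a birth–death chain with constant rates $b_i = b$ and $a_i = a$ with $a > b$. This example is not taken from the text. For it, $\mu_j = (b/a)^j$ and a direct summation gives $\delta = a/(a-b)^2$ and $Z = \sum_j\mu_j = a/(a-b)$. The value $(\sqrt a-\sqrt b)^2$ used for $\lambda_1$ is the classical spectral gap of this chain, quoted here as an outside fact. The function name and truncation parameters are hypothetical.

```python
import math

def delta_truncated(birth, death, i_max, tail_terms):
    """Approximate sup_{1<=i<=i_max} sum_{j<i} 1/(mu_j b_j) * sum_{j>=i} mu_j.

    The infinite tail sum is truncated after i_max + tail_terms terms.
    """
    n = i_max + tail_terms
    mu = [1.0]                                   # mu_0 = 1 (assumed convention)
    for k in range(1, n + 1):
        mu.append(mu[-1] * birth(k - 1) / death(k))
    tail = [0.0] * (n + 2)                       # tail[k] = sum_{j=k}^{n} mu_j
    for k in range(n, -1, -1):
        tail[k] = tail[k + 1] + mu[k]
    best, left = 0.0, 0.0
    for i in range(1, i_max + 1):
        left += 1.0 / (mu[i - 1] * birth(i - 1)) # sum_{j<i} 1/(mu_j b_j)
        best = max(best, left * tail[i])
    return best

a, b = 2.0, 1.0                                  # assumed death and birth rates
delta = delta_truncated(lambda i: b, lambda i: a, i_max=40, tail_terms=200)
delta_closed = a / (a - b) ** 2
Z = a / (a - b)
gap = (math.sqrt(a) - math.sqrt(b)) ** 2         # classical value of lambda_1
```

With these rates the three numbers $(4\delta)^{-1}$, $\lambda_1$ and $Z\delta^{-1}$ come out ordered as in the bound quoted from Theorem 4.7 (2).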

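Proof of Lemma 5.12. Let $M_n = \sum_{j\ge n}m_j$. Fix $N>i$. Then by the summation by parts formula and $M_n\le c\varphi_n^{-1}$, we get

$$\sum_{j=i}^N\varphi_j^\gamma m_j \le \varphi_i^\gamma M_i + \sum_{j=i}^N\big[\varphi_{j+1}^\gamma-\varphi_j^\gamma\big]M_{j+1} \le c\Big\{\varphi_i^{\gamma-1} + \sum_{j=i}^N\big[\varphi_{j+1}^\gamma-\varphi_j^\gamma\big]\big/\varphi_{j+1}\Big\}.$$

By using the elementary inequality $\gamma(1-\gamma)^{-1}(x^{\gamma-1}-1)+x^\gamma\ge 1$ ($x>0$), it is easy to check that

$$\varphi_{j+1}^{\gamma-1} - \varphi_j^\gamma/\varphi_{j+1} \le \gamma(1-\gamma)^{-1}\big[\varphi_j^{\gamma-1}-\varphi_{j+1}^{\gamma-1}\big].$$

Combining this with the last estimate gives us the required assertion.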
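Lemma 5.12 can be sanity-checked numerically on truncated sequences. In the sketch below the sequences $n_j = 2^j$ and $m_j = 4^{-j}$ are assumptions chosen for illustration, cut off after 40 terms (a finite sequence padded with zeros still satisfies the hypotheses of the lemma); the function name is hypothetical.

```python
def check_lemma_5_12(n_seq, m_seq, gamma):
    """Return (c, worst) for finite sequences of equal length L.

    c     = max_{1<=i<L} phi_i * sum_{j>=i} m_j
    worst = max_i  LHS_i / RHS_i  with
            LHS_i = sum_{j>=i} phi_j^gamma m_j,
            RHS_i = c (1 - gamma)^{-1} phi_i^{gamma - 1}.
    Requires n_seq[0] > 0 so that phi_i > 0 for i >= 1.
    """
    L = len(m_seq)
    tail_m = [0.0] * (L + 1)                 # tail_m[i] = sum_{j=i}^{L-1} m_j
    for j in range(L - 1, -1, -1):
        tail_m[j] = tail_m[j + 1] + m_seq[j]
    phi = [0.0] * (L + 1)                    # phi_k = sum_{j<k} n_j
    for k in range(1, L + 1):
        phi[k] = phi[k - 1] + n_seq[k - 1]
    c = max(phi[i] * tail_m[i] for i in range(1, L))
    worst = 0.0
    for i in range(1, L):
        lhs = sum(phi[j] ** gamma * m_seq[j] for j in range(i, L))
        rhs = c / (1.0 - gamma) * phi[i] ** (gamma - 1.0)
        worst = max(worst, lhs / rhs)
    return c, worst

L = 40
n_seq = [2.0 ** j for j in range(L)]         # assumed example
m_seq = [4.0 ** (-j) for j in range(L)]      # assumed example
results = {g: check_lemma_5_12(n_seq, m_seq, g) for g in (0.25, 0.5, 0.75)}
```

A ratio `worst` not exceeding 1 means the inequality of the lemma holds for every index that was tested.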

Proof of Proposition 5.13. Without loss of generality, let the state space $E$ be $\{0,1,\cdots\}$. Consider the test function $f = c_1I_{\{k\}}+c_2$, where $c_1$ and $c_2$ are constants such that $\pi(f)=0$ and $\pi(f^2)=1$:

$$c_2 = -c_1\pi_k,\qquad c_1 = 1\big/\sqrt{\pi_k(1-\pi_k)}.$$

Then

$$D(f) = \sum_{(i,j):\,i<j}\pi_iq_{ij}(f_j-f_i)^2 = \sum_{j:\,k<j}\pi_kq_{kj}(f_j-f_k)^2 + \sum_{i:\,i<k}\pi_iq_{ik}(f_k-f_i)^2$$

$$\phantom{D(f)} = \sum_{j:\,k<j}\frac{q_{kj}}{1-\pi_k} + \sum_{i:\,i<k}\frac{q_{ki}}{1-\pi_k} = \frac{q_k}{1-\pi_k}.$$

Applying the classical variational formula (5.9) to this $f$, we obtain

$$\lambda_1 \le \frac{q_k}{1-\pi_k}.$$

The required assertion follows since for large enough $k$, we have $\pi_k\le 1/2$ and hence $\lambda_1\le 2q_k$.

5.6 Strong ergodicity

This section is devoted to the strong ergodicity for general Markov processes. We will adopt both the analytic method and the coupling method. Let us begin with the analytic method for single birth processes.

For birth–death processes, the next result was first obtained by Hou et al. (2000) with a different proof. The general case is due to Y. H. Zhang (2001). We adopt the notations introduced at the beginning of the last section.

Analytic method

Theorem 5.14. Let $Q = (q_{ij})$ be a regular, irreducible single birth Q-matrix. Then the Q-process is strongly ergodic iff

$$\sup_{k\in\mathbb Z_+}\sum_{n=0}^k\big(F_n^{(0)}d-d_n\big) < \infty. \tag{5.27}$$

For birth–death processes, the criterion becomes

$$S := \sum_{n\ge 0}\frac{1}{\mu_nb_n}\,\mu[n+1,\infty) = \sum_{n\ge 1}\mu_n\sum_{j\le n-1}\frac{1}{\mu_jb_j} < \infty$$

as stated in Table 1.5.
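The two expressions for $S$ are the same double sum over pairs $j<k$, written in the two possible orders. The sketch below confirms this with exact rational arithmetic on a truncation to states $0,\dots,N$. The rates $b_i = 1$, $a_i = i^2$ are an assumed example, not taken from the text; the function name and the convention $\mu_0 = 1$, $\mu_n = b_0\cdots b_{n-1}/(a_1\cdots a_n)$ are likewise assumptions.

```python
from fractions import Fraction

def S_two_ways(birth, death, N):
    """Truncated versions of the two expressions for S, on states 0..N."""
    mu = [Fraction(1)]
    for n in range(1, N + 1):
        mu.append(mu[-1] * birth(n - 1) / death(n))
    # sum_{n=0}^{N-1} mu[n+1, N] / (mu_n b_n)
    first = sum(
        sum(mu[n + 1:N + 1], Fraction(0)) / (mu[n] * birth(n))
        for n in range(N)
    )
    # sum_{n=1}^{N} mu_n * sum_{j<=n-1} 1 / (mu_j b_j)
    second = sum(
        mu[n] * sum((1 / (mu[j] * birth(j)) for j in range(n)), Fraction(0))
        for n in range(1, N + 1)
    )
    return first, second

# Assumed rates for illustration: b_i = 1, a_i = i^2.
first, second = S_two_ways(lambda i: Fraction(1), lambda i: Fraction(i * i), 25)
larger, _ = S_two_ways(lambda i: Fraction(1), lambda i: Fraction(i * i), 26)
```

For these rates the $n$-th term of the second expression behaves like $1/n^2$, so the partial sums increase to a finite limit, in line with the condition $S<\infty$.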

Proof. (a) First, we prove that the equation

$$y_i = \sum_{j\ne i}\frac{q_{ij}}{q_i}\,y_j + \frac{1}{q_i},\quad i\ge 1;\qquad y_0 = 0, \tag{5.28}$$

has a bounded non-negative solution iff (5.27) holds. If so,

$$d := \sup_{k\in\mathbb Z_+}\ \sum_{n=0}^kd_n\Big/\sum_{n=0}^kF_n^{(0)} = \lim_{k\to\infty}\ \sum_{n=0}^kd_n\Big/\sum_{n=0}^kF_n^{(0)}$$

and the unique solution to (5.28) is as follows:

$$y_0 = 0,\qquad y_1 = d,\qquad y_{n+1} = y_n+F_n^{(0)}y_1-d_n,\quad n\ge 1. \tag{5.29}$$

First, assume that (5.27) holds and define $(y_i)$ by (5.29). Then, it should be easy to verify that $(y_i)$ is a bounded non-negative solution of (5.28).

Next, let $(y_i)$ be a bounded non-negative solution of (5.28) and define $v_n = y_{n+1}-y_n$ for $n\ge 0$. From (5.28), it is not difficult to derive

$$v_n = \frac{1}{q_{n,n+1}}\Big(\sum_{k=0}^{n-1}q_n^{(k)}v_k-1\Big),\qquad n\ge 1.$$

By induction, we can easily prove that $v_n = F_n^{(0)}v_0-d_n$ for all $n\ge 0$. Note that $v_0 = y_1$. From these facts, it follows that

$$y_{k+1} = \sum_{n=0}^kv_n = \sum_{n=0}^k\big(F_n^{(0)}v_0-d_n\big),\qquad k\in\mathbb Z_+. \tag{5.30}$$

Now, on the one hand, by (5.30) and $y_{k+1}\ge 0$, it follows that

$$v_0 \ge \sum_{n=0}^kd_n\Big/\sum_{n=0}^kF_n^{(0)},\qquad k\in\mathbb Z_+.$$

Hence $v_0\ge d$. On the other hand, by (5.30) again,

$$\frac{y_{k+1}}{\sum_{n=0}^kF_n^{(0)}} = v_0 - \frac{\sum_{n=0}^kd_n}{\sum_{n=0}^kF_n^{(0)}},\qquad k\in\mathbb Z_+. \tag{5.31}$$

Note that $(y_i)$ is bounded and $\sum_{n=0}^kF_n^{(0)}\to+\infty$ as $k\to\infty$ (by recurrence). Letting $k\to\infty$ in (5.31), we see that the second part on the right-hand side of (5.31) tends to the limit $v_0$, and furthermore $v_0\le d$. Hence, we have

$$y_1 = v_0 = d = \lim_{k\to\infty}\ \sum_{n=0}^kd_n\Big/\sum_{n=0}^kF_n^{(0)}.$$

Combining this with (5.30), it follows that the solution $(y_i)$ must have the representation (5.29) and hence is unique. Finally, by the boundedness of $(y_i)$ and (5.30), condition (5.27) follows.

(b) By Theorem 5.1 (3), we know that the Q-process is strongly ergodic iff the following equation has a bounded non-negative solution:

$$\sum_jq_{ij}y_j \le -1,\quad i\notin H;\qquad \sum_{i\in H}\sum_{j\ne i}q_{ij}y_j < \infty,$$

where $H$ is a non-empty finite subset of $\mathbb Z_+$. Let $H = \{0\}$. For single birth processes, the last equation is reduced to

$$\sum_jq_{ij}y_j \le -1,\qquad i\ne 0, \tag{5.32}$$

since $\sum_{j\ne 0}q_{0j}y_j = q_{01}y_1<\infty$.

Assume that the single birth process is strongly ergodic. Then there exists a bounded non-negative solution $(u_i)$ of (5.32), i.e.,

$$u_i \ge \sum_{j\ne i}\frac{q_{ij}}{q_i}\,u_j + \frac{1}{q_i},\quad i\ge 1;\qquad u_0\ge 0.$$

Denote by $(u_i^*)$ the minimal non-negative solution of (5.28). By the Comparison Theorem below, we have $u_i\ge u_i^*$ for all $i\ge 0$. Thus, $(u_i^*)$ is bounded and (5.28) has a bounded non-negative solution. By (a), (5.27) holds.

Conversely, let (5.27) hold. Define $(y_i)$ by (5.29). By (a), $(y_i)$ is a bounded non-negative solution of (5.28). Clearly $(y_i)$ is also a bounded non-negative solution of (5.32). This implies strong ergodicity by the criterion quoted above.

To conclude this subsection, we introduce some elementary facts, taken from Z. T. Hou and Q. F. Guo (1978) or Chen (1992a) and also needed in Chapter 9, about the theory of the minimal nonnegative solutions for systems of equations with nonnegative coefficients. All the results below can be easily proved by using induction.

Theorem 5.15 (Existence and uniqueness theorem). Let $c_{ij}\ge 0$, $b_i\ge 0$. Then there exists uniquely the minimal solution $(x_i^*: i\in E)$ to the equations

$$x_i = \sum_{k\in E}c_{ik}x_k + b_i,\qquad i\in E.$$

More precisely, define

$$x_i^{(0)} = 0,\qquad x_i^{(n+1)} = \sum_{k\in E}c_{ik}x_k^{(n)} + b_i,\qquad i\in E,\ n\ge 0.$$

Then, $x_i^{(n)}\uparrow x_i^*$ for all $i\in E$ as $n\to\infty$.

Theorem 5.16 (Comparison theorem). Let $(x_i: i\in E)$ satisfy

$$x_i \ge \sum_{k\in E}c_{ik}x_k + b_i,\qquad i\in E.$$

Then $x_i\ge x_i^*$ for all $i\in E$.

Theorem 5.17 (Linear combination theorem). Let $G$ be a countable set and $c_\alpha\ge 0$ for all $\alpha\in G$. Let $(x_i^{*\alpha}: i\in E)$ be the minimal solution to

$$x_i = \sum_{k\in E}c_{ik}x_k + b_i^{(\alpha)},\qquad i\in E.$$

Then $\big(\sum_{\alpha\in G}c_\alpha x_i^{*\alpha}: i\in E\big)$ is the minimal solution to the equations

$$x_i = \sum_{k\in E}c_{ik}x_k + \sum_{\alpha\in G}c_\alpha b_i^{(\alpha)},\qquad i\in E.$$
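The iteration in Theorem 5.15 is easy to run on a finite index set. The sketch below is only an illustration: the coefficient matrix `c`, the vector `b` and the function name are assumptions, and a fixed number of iterations replaces the limit $n\to\infty$.

```python
def minimal_solution(c, b, iterations=200):
    """Iterate x^(0) = 0, x^(n+1) = c x^(n) + b on E = {0, ..., len(b) - 1}.

    Returns the last iterate and the list of all iterates.
    """
    size = len(b)
    x = [0.0] * size
    history = [x]
    for _ in range(iterations):
        x = [sum(c[i][k] * x[k] for k in range(size)) + b[i] for i in range(size)]
        history.append(x)
    return x, history

# Assumed example: x_0 = x_1 / 2 + 1, x_1 = x_0 / 2 + 1, with solution (2, 2).
c = [[0.0, 0.5], [0.5, 0.0]]
b = [1.0, 1.0]
x_star, history = minimal_solution(c, b)

# A supersolution in the sense of Theorem 5.16: 3 >= 0.5 * 3 + 1.
supersolution = [3.0, 3.0]

# A system with many non-negative solutions: x_0 = x_0. The iteration picks 0.
x_degenerate, _ = minimal_solution([[1.0]], [0.0])
```

The last example shows why minimality matters: every $x_0\ge 0$ solves $x_0 = x_0$, and the scheme started from zero selects the smallest one.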


Coupling method

For a general Markov process with transition probability $P(t,x,dy)$ on $(E,\mathscr E)$ and having a stationary distribution $\pi$, the strong ergodicity means that

$$\sup_{x\in E}\|P(t,x,\cdot)-\pi\|_{\rm Var} \to 0\qquad\text{as } t\to\infty.$$

Such convergence must be exponential. Actually, note that for a signed measure $\nu$, we have

$$\|\nu\|_{\rm Var} = \sup_{f:\,|f|\le 1}|\nu(f)|.$$

It is easy to check that

$$\sup_x\|P(t+s,x,\cdot)-\pi\|_{\rm Var} \le \frac12\,\sup_x\|P(t,x,\cdot)-\pi\|_{\rm Var}\cdot\sup_x\|P(s,x,\cdot)-\pi\|_{\rm Var}.$$

Hence $\frac12\sup_x\|P(t,x,\cdot)-\pi\|_{\rm Var}$ must have an exponential decay. The various applications of the coupling methods to the convergence rate in total variation are based on the following simple observation. Let $(X_t,Y_t)$ be a coupling of the Markov processes starting from $x$ and $y$, respectively, and define the coupling time as follows.

$$T = \inf\{t\ge 0: X_t = Y_t\}.$$

Then we have

$$\|P(t,x,\cdot)-\pi\|_{\rm Var} \le 2\int_E\pi(dy)\,E^{x,y}I_{[X_t\ne Y_t]} = 2\int_E\pi(dy)\,P^{x,y}[T>t],\qquad t\ge 0.$$

In particular, by Chebyshev's inequality, we have

$$\sup_x\|P(t,x,\cdot)-\pi\|_{\rm Var} \le \sup_{x\ne y}E^{x,y}(T^n)/t^n\qquad\text{and}$$

$$\sup_x\|P(t,x,\cdot)-\pi\|_{\rm Var} \le \sup_{x\ne y}E^{x,y}\big(e^{\lambda T}\big)e^{-\lambda t}.$$

Along this line, there is a large body of work. See for instance Chen (1992a), T. Lindvall (1992) and the references within, or more recent papers by Y. Z. Wang (1999b), Y. H. Mao (2002d; 2002e).

We now study the estimation of the moments of the coupling time [refer to Chen and S. F. Li (1989, Theorem 5.7) for a refined result].

Theorem 5.18. Let $(X_t,Y_t)$ be a Markovian coupling with operator $L$, $\varphi\in C^1[0,\infty)$, $F\in\mathscr D_w(L)$ and $\varphi, F\ge 0$, $F(x,x)=0$ for all $x$. Suppose that

$$LF(x,y) \le -1,\qquad x\ne y. \tag{5.33}$$

Then we have

$$E^{x,y}\int_0^{t\wedge T}\varphi(s)\,ds \le \varphi(0)F(x,y) + E^{x,y}\int_0^{t\wedge T}\varphi'(s)F(X_s,Y_s)\,ds. \tag{5.34}$$


Proof. Let $f(t;x,y) = \varphi(t)F(x,y)$. By using the martingale formulation,

$$f(t;X_t,Y_t) - \int_0^t\big[\partial_sf(s;X_s,Y_s)+Lf(s;X_s,Y_s)\big]ds$$

is a $P^{x,y}$-martingale with respect to the natural flow of σ-algebras. Hence

$$\varphi(0)F(x,y) = \varphi(t)E^{x,y}F(X_{t\wedge T},Y_{t\wedge T}) - E^{x,y}\int_0^{t\wedge T}\big[\varphi'(s)F(X_s,Y_s)+\varphi(s)LF(X_s,Y_s)\big]ds$$

$$\phantom{\varphi(0)F(x,y)} \ge -E^{x,y}\int_0^{t\wedge T}\varphi'(s)F(X_s,Y_s)\,ds + E^{x,y}\int_0^{t\wedge T}\varphi(s)\,ds,$$

by (5.33). This gives us the required assertion.

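Applying (5.34) to $\varphi = 1$, we get

$$E^{x,y}T \le F(x,y).$$

Next, applying (5.34) to $\varphi(t) = t^m$ ($m\ge 1$), we get

$$E^{x,y}T^{m+1} \le (m+1)\|F\|_\infty E^{x,y}T^m.$$

Hence, we have

$$E^{x,y}T^m \le m!\,\|F\|_\infty^{m-1}F(x,y) \le m!\,\|F\|_\infty^m,\qquad m\ge 1. \tag{5.35}$$

Finally, applying (5.34) to $\varphi(t) = e^{\lambda t}$ ($\lambda>0$), we obtain

$$\lambda^{-1}E^{x,y}\big[e^{\lambda(t\wedge T)}-1\big] \le F(x,y) + \|F\|_\infty E^{x,y}\big[e^{\lambda(t\wedge T)}-1\big].$$

Thus,

$$E^{x,y}e^{\lambda T} \le 1 + \frac{\|F\|_\infty}{1/\lambda-\|F\|_\infty} = \frac{1}{1-\lambda\|F\|_\infty},\qquad 0<\lambda<\|F\|_\infty^{-1},\ \ \|F\|_\infty<\infty. \tag{5.36}$$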
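The exponential moment bound (5.36) is consistent with the moment bounds (5.35). A short worked derivation, under the assumption $\lambda\|F\|_\infty<1$ and using only that $T\ge 0$ (so that expectation and series may be interchanged by monotone convergence), reads:

```latex
\begin{aligned}
E^{x,y} e^{\lambda T}
  &= \sum_{m=0}^{\infty} \frac{\lambda^m}{m!}\, E^{x,y} T^m
     && \text{(monotone convergence, } T \ge 0\text{)} \\
  &\le 1 + \sum_{m=1}^{\infty} \frac{\lambda^m}{m!}\, m!\, \|F\|_\infty^{m}
     && \text{(by (5.35))} \\
  &= \sum_{m=0}^{\infty} \bigl(\lambda \|F\|_\infty\bigr)^m
   = \frac{1}{1 - \lambda \|F\|_\infty}
     && \text{(geometric series, } \lambda \|F\|_\infty < 1\text{)}.
\end{aligned}
```

The factorials cancel exactly, which is why the two routes give the same constant.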

Certainly, one can also deduce (5.36) from (5.35).

For a compact Riemannian manifold, simply take $F = H\circ\rho$, where $\rho$ is the Riemannian distance and

$$H(r) = \int_0^rC(s)^{-1}ds\int_s^DC(u)\,du,\qquad C(r) = \cosh^{d-1}\Big[\frac r2\sqrt{\frac{-K}{d-1}}\Big],\qquad r\in(0,D),$$

and adopt the coupling by reflection. Then (5.36) holds. This is done by Y. Z. Wang (1999b), Y. H. Mao (2002d; 2002e). In the one-dimensional case, because of the linear order, one can simply use the classical coupling, which is order-preserving. Note that $T$ is controlled by the hitting time $\inf\{t\ge 0: X_t = 0\}$. One can study its moments in the same way as above, replacing the coupling operator by the marginal one. For birth–death processes, one takes

$$F_i = \sum_{j=0}^{i-1}\frac{1}{b_j\mu_j}\sum_{k=j+1}^\infty\mu_k.$$

Then (5.36) holds once $\|F\|_\infty = \lim_{i\to\infty}F_i<\infty$. Equivalently, $S<\infty$, which is indeed necessary by Theorem 5.14. This is done by Y. H. Mao (2002d).


Chapter 6

Variational Formulas and Explicit Bounds of Poincaré-type Inequalities in Dimension One

This chapter is a quick and elementary overview of the recent progress on a large class of Poincaré-type inequalities in dimension one. The higher-dimensional case will be discussed in the next chapter. The explicit criteria for the inequalities, the variational formulas and explicit bounds of the corresponding constants in the inequalities are presented. As typical applications, the Nash inequalities and the logarithmic Sobolev inequalities are examined. To illustrate the main ideas, some short proofs are included. In the last section (§6.7), partial proofs are given for the main dual variational formulas (Theorem 6.1).

6.1 Introduction

The one-dimensional case in this chapter means either the second order elliptic operators (one-dimensional diffusions) or the triangle matrices (birth–death Markov processes). Let us begin with diffusions.

Let $L = a(x)d^2/dx^2 + b(x)d/dx$ be an elliptic operator on an interval $(0,D)$ ($D\le\infty$) with Dirichlet boundary at 0 and Neumann boundary at $D$ when $D<\infty$, where $a$ and $b$ are Borel measurable functions and $a$ is positive everywhere. Set $C(x) = \int_0^xb/a$; here and in what follows, the Lebesgue measure $dx$ is often omitted. Throughout the chapter, assume that

$$Z := \int_0^De^C/a < \infty. \tag{6.1}$$


Hence, $d\mu := a^{-1}e^Cdx$ is a finite measure, which is crucial in the chapter. We are interested in the first Poincaré inequality

$$\|f\|^2 := \int_0^Df^2d\mu \le A\int_0^Df'^2e^C =: AD(f),\qquad f\in C_d[0,D],\ f(0)=0, \tag{6.2}$$

where $C_d$ is the set of all continuous functions, differentiable almost everywhere and having compact supports. When $D=\infty$, one should replace $[0,D]$ by $[0,D)$, but we will not mention this again in what follows. Next, we are also interested in the second Poincaré inequality

$$\|f-\pi(f)\|^2 := \int_0^D(f-\pi(f))^2d\mu \le \bar AD(f),\qquad f\in C_d[0,D], \tag{6.3}$$

where $\pi(f) = \mu(f)/Z = \int f\,d\mu/Z$. To save notation, we use the same $A$ (resp., $\bar A$) to denote the optimal constant in (6.2) (resp., (6.3)).

The aim of the study of these inequalities is to find a criterion under which (6.2) (resp., (6.3)) holds, i.e., the optimal constant $A<\infty$ (resp., $\bar A<\infty$), and to estimate $A$ (resp., $\bar A$). We restrict ourselves to dimension one because we are looking for explicit criteria and explicit estimates. Actually, we have dual variational formulas for the upper and lower bounds of these constants. Such an explicit story does not exist in the higher-dimensional situation.

Next, replacing the $L^2$-norm on the left-hand sides of (6.2) and (6.3) with a general norm $\|\cdot\|_{\mathbb B}$ in a suitable Banach space (the details are delayed to §3), respectively, we obtain the following Poincaré-type inequalities

$$\big\|f^2\big\|_{\mathbb B} \le A_{\mathbb B}D(f),\qquad f\in C_d[0,D],\ f(0)=0. \tag{6.4}$$

$$\big\|(f-\pi(f))^2\big\|_{\mathbb B} \le \bar A_{\mathbb B}D(f),\qquad f\in C_d[0,D]. \tag{6.5}$$

For these, it is natural to study the same problems as above. The main purpose of this chapter is to answer these problems. By using this general setup, we are able to handle the following Nash inequalities (J. Nash, 1958)

$$\|f-\pi(f)\|^{2+4/\nu} \le A_ND(f)\|f\|_1^{4/\nu} \tag{6.6}$$

in the case of $\nu>2$, and the logarithmic Sobolev inequality (L. Gross, 1976):

$$\mathrm{Ent}\big(f^2\big) := \int_0^Df^2\log\frac{f^2}{\pi(f^2)}\,d\mu \le A_{LS}D(f). \tag{6.7}$$

The remainder of the chapter is organized as follows. In the next section, we review the criteria for (6.2) and (6.3), the dual variational formulas and explicit estimates of $A$ and $\bar A$. Then, we extend partially these results to Banach spaces, first for the Dirichlet case and then for the Neumann one. For a very general setup of Banach spaces, the resulting conclusions are still rather satisfactory. Next, we specify the results to Orlicz spaces and finally apply them to the Nash inequalities and the logarithmic Sobolev inequality.

Since each topic discussed subsequently has a long history and contains a large number of publications, it is impossible to collect in the present chapter a complete list of references. We emphasize recent progress and related references only.

6.2 Ordinary Poincare inequalities

In this section, we introduce the criteria for (6.2) and (6.3), the dual variational formulas and explicit estimates of $A$ and $\bar A$, which strengthen Theorem 1.5 and the results listed in §5.3.

To state the main results, we need some notations. Write $x\wedge y = \min\{x,y\}$ and similarly, $x\vee y = \max\{x,y\}$. Define

$$\begin{aligned}
\mathscr F &= \{f\in C[0,D]\cap C^1(0,D): f(0)=0,\ f'|_{(0,D)}>0\},\\
\widetilde{\mathscr F} &= \{f\in C[0,D]: f(0)=0,\ \text{there exists } x_0\in(0,D] \text{ so that } f=f(\cdot\wedge x_0),\ f\in C^1(0,x_0) \text{ and } f'|_{(0,x_0)}>0\},\\
\mathscr F' &= \{f\in C[0,D]: f(0)=0,\ f|_{(0,D)}>0\},\\
\widetilde{\mathscr F}' &= \{f\in C[0,D]: f(0)=0,\ \text{there exists } x_0\in(0,D] \text{ so that } f=f(\cdot\wedge x_0) \text{ and } f|_{(0,x_0)}>0\}.
\end{aligned} \tag{6.8}$$

Here the sets $\mathscr F$ and $\mathscr F'$ are essential; they are used, respectively, to define below the operators of single and double integrals, and are used for the upper bounds. The sets $\widetilde{\mathscr F}$ and $\widetilde{\mathscr F}'$ are less essential, simply some modifications of $\mathscr F$ and $\mathscr F'$, respectively, to avoid the problem of integrability, and are used for the lower bounds. Define

$$I(f)(x) = \frac{e^{-C(x)}}{f'(x)}\int_x^D\big[fe^C/a\big](u)\,du,\qquad f\in\mathscr F, \tag{6.9}$$

$$II(f)(x) = \frac{1}{f(x)}\int_0^xdy\,e^{-C(y)}\int_y^D\big[fe^C/a\big](u)\,du,\qquad f\in\mathscr F'. \tag{6.10}$$

The next result is taken from Chen (2001b, Theorems 1.1 and 1.2). The word "dual" below means that the upper and lower bounds are interchangeable if one exchanges the orders of "sup" and "inf" with a slight modification of the set $\mathscr F$ (resp., $\mathscr F'$) of test functions.

Theorem 6.1 Let (6.1) hold. Define $\varphi(x) = \int_0^xe^{-C}$ and $B = \sup_{x\in(0,D)}\varphi(x)\int_x^D\frac{e^C}{a}$. Then, we have the following assertions.

(1) Explicit criterion: $A<\infty$ iff $B<\infty$.

(2) Dual variational formulas:

$$A \le \inf_{f\in\mathscr F'}\ \sup_{x\in(0,D)}II(f)(x) = \inf_{f\in\mathscr F}\ \sup_{x\in(0,D)}I(f)(x), \tag{6.11}$$

$$A \ge \sup_{f\in\widetilde{\mathscr F}'}\ \inf_{x\in(0,D)}II(f)(x) = \sup_{f\in\widetilde{\mathscr F}}\ \inf_{x\in(0,D)}I(f)(x). \tag{6.12}$$

The two inequalities all become equalities whenever both $a$ and $b$ are continuous on $[0,D]$.

(3) Approximating procedure and explicit bounds:

(a) Define $f_1 = \sqrt{\varphi}$, $f_n = f_{n-1}II(f_{n-1})$ and $D_n = \sup_{x\in(0,D)}II(f_n)(x)$. Then $D_n$ is decreasing in $n$ and $A\le D_n\le 4B$ for all $n\ge 1$.

(b) Fix $x_0\in(0,D)$. Define

$$f_1^{(x_0)} = \varphi(\cdot\wedge x_0),\qquad f_n^{(x_0)} = f_{n-1}^{(x_0)}(\cdot\wedge x_0)\,II\big(f_{n-1}^{(x_0)}(\cdot\wedge x_0)\big)$$

and $C_n = \sup_{x_0\in(0,D)}\inf_{x\in(0,D)}II\big(f_n^{(x_0)}(\cdot\wedge x_0)\big)(x)$. Then $C_n$ is increasing in $n$ and $A\ge C_n\ge B$ for all $n\ge 1$.
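The approximating procedure in part (3)(a) can be tried on the simplest case $a = 1$, $b = 0$ on $(0,D)$ with $D = 1$, so that $C = 0$, $\varphi(x) = x$ and $\mu$ is Lebesgue measure. This example is an assumption made for illustration. For it, $B = D^2/4$, and the optimal constant is $A = 4D^2/\pi^2$, the reciprocal of the classical first eigenvalue $\pi^2/(4D^2)$ of $-d^2/dx^2$ with Dirichlet condition at 0 and Neumann condition at $D$ (quoted as an outside fact). The grid size, the trapezoid rule and the function name are hypothetical choices.

```python
import math

def II_operator(f_vals, xs):
    """II(f)(x) = (1/f(x)) int_0^x dy int_y^D f(u) du for a = 1, b = 0 (C = 0).

    f_vals are the values of f on the uniform grid xs over [0, D]; the value
    at x = 0 (where f = 0) is skipped, so the result has len(xs) - 1 entries.
    """
    n = len(xs)
    h = xs[1] - xs[0]
    inner = [0.0] * n                  # inner[k] = int_{xs[k]}^{D} f(u) du
    for k in range(n - 2, -1, -1):
        inner[k] = inner[k + 1] + 0.5 * h * (f_vals[k] + f_vals[k + 1])
    outer = [0.0] * n                  # outer[k] = int_0^{xs[k]} inner(y) dy
    for k in range(1, n):
        outer[k] = outer[k - 1] + 0.5 * h * (inner[k - 1] + inner[k])
    return [outer[k] / f_vals[k] for k in range(1, n)]

D = 1.0
n = 4001
xs = [D * k / (n - 1) for k in range(n)]
B = max(x * (D - x) for x in xs)       # phi(x) * mu[x, D] with phi(x) = x
f1 = [math.sqrt(x) for x in xs]        # f_1 = sqrt(phi)
II_f1 = II_operator(f1, xs)
D1 = max(II_f1)
f2 = [0.0] + [f1[k] * II_f1[k - 1] for k in range(1, n)]   # f_2 = f_1 II(f_1)
D2 = max(II_operator(f2, xs))
A_exact = 4.0 * D * D / math.pi ** 2
```

In this example one step of the iteration already brings the upper bound from $4B = 1$ down to roughly $0.43$, and a second step brings it within about half a percent of $A\approx 0.405$, which illustrates why the test-function iteration is preferred to the crude bound $4B$.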

We mention that the explicit estimates "$B\le A\le 4B$" were obtained previously in the study on the weighted Hardy's inequality by B. Muckenhoupt (1972). A short proof of "$A\le 4B$" was presented in §5.3. The proof of "$A\ge B$" is also easy. Actually, fix $x\in(0,D)$ and take

$$f(y) = f_x(y) = \int_0^{x\wedge y}e^{-C(s)}ds,\qquad y\in(0,D).$$

Then $f'(y) = e^{-C(y)}$ if $y<x$ and $f'(y) = 0$ if $y\in(x,D)$. Furthermore,

$$\|f\|^2 = \int_0^xf(y)^2\pi(dy) + f(x)^2\pi[x,D],$$

$$D(f) = \int_0^xe^{-2C(y)}e^{C(y)}dy/Z = f(x)/Z,$$

where $\pi[p,q] = \int_p^qd\pi$ and $Z = \mu[0,D]$. Hence

$$A \ge \|f\|^2/D(f) = Zf(x)^{-1}\int_0^xf(y)^2\pi(dy) + Zf(x)\pi[x,D] \ge f(x)\mu[x,D].$$

Taking the supremum with respect to $x$, it follows that $A\ge B$.

The proofs of parts (3) and (4) of Theorem 6.1 are more technical, see §6.7 for details.

We now turn to study $\bar A$, for which it is necessary to assume that

$$\int_0^De^{-C(s)}ds\int_0^sa(u)^{-1}e^{C(u)}du = \infty, \tag{6.13}$$

since we are working in the ergodic situation.

Theorem 6.2 Let (6.1) and (6.13) hold and set f = f − π(f). Then, we havethe following assertions.

(1) Explicit criterion: A <∞ iff B <∞, where B is given by Theorem 6.1.

(2) Dual variational formulas:

supf∈F

infx∈(0,D)

I(f)(x) 6 A 6 inff∈F

supx∈(0,D)

I(f)(x). (6.14)

The two inequalities all become equalities whenever both a and b are contin-uous on [0, D].

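(3) Approximating procedure and explicit bounds:

(a) Define $f_1 = \sqrt{\varphi}$, $f_n = f_{n-1}\,II(f_{n-1})$ and $D_n = \sup_{x\in(0,D)} II(f_n)(x)$. Then $\bar A \le D_n \le 4B$ for all $n \ge 1$.

(b) Fix $x_0 \in (0, D)$. Define
\[
f_1^{(x_0)} = \varphi(\cdot\wedge x_0), \qquad
f_n^{(x_0)} = f_{n-1}^{(x_0)}(\cdot\wedge x_0)\, II\big(f_{n-1}^{(x_0)}(\cdot\wedge x_0)\big)
\]
and $C_n = \sup_{x_0\in(0,D)} \inf_{x\in(0,D)} II\big(f_n^{(x_0)}(\cdot\wedge x_0)\big)(x)$. Then $\bar A \ge C_n$ for all $n \ge 2$. By convention, $1/0 = \infty$.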

Part (1) of the theorem is taken from Chen (2000c, Theorem 3.7). The upper bound in (6.14) is due to Chen and F. Y. Wang (1997b). The other parts are taken from Chen (2001b, Theorems 1.3 and 1.4).

Finally, we consider inequality (6.3) on a general interval $(p, q)$ ($-\infty \le p < q \le \infty$). When $p$ (resp., $q$) is finite, the Neumann boundary condition is imposed there. We adopt a splitting technique. The intuitive idea goes as follows. Since the eigenfunction corresponding to $\bar A$, if it exists, must change sign, it should vanish somewhere in the present continuous situation, say at $\theta$. Thus, it is natural to divide the interval $(p, q)$ into two parts: $(p, \theta)$ and $(\theta, q)$. Then one compares $\bar A$ with the optimal constants in inequality (6.2), denoted by $A^{1\theta}$ and $A^{2\theta}$, on $(\theta, q)$ and $(p, \theta)$, respectively, which share the common Dirichlet boundary at $\theta$. Actually, we do not care about the existence of the vanishing point $\theta$; such a $\theta$ is unknown even if it exists. In practice, we regard $\theta$ as a reference point and then apply an optimization procedure with respect to $\theta$. We now redefine $C(x) = \int_\theta^x b/a$. Again, since we are in the ergodic situation, we assume the following (non-explosive) conditions:

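\[
\begin{gathered}
Z^{1\theta} := \int_\theta^q e^C/a < \infty, \qquad Z^{2\theta} := \int_p^\theta e^C/a < \infty, \\
\int_p^\theta e^{-C(s)}\,ds \int_s^\theta e^C/a = \infty \ \text{ if } p = -\infty
\quad\text{and}\quad
\int_\theta^q e^{-C(s)}\,ds \int_\theta^s e^C/a = \infty \ \text{ if } q = \infty
\end{gathered}
\tag{6.15}
\]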


for some (equivalently, all) $\theta \in (p, q)$. Corresponding to the intervals $(\theta, q)$ and $(p, \theta)$, respectively, we have constants $B^{1\theta}$ and $B^{2\theta}$, given by Theorem 6.1.

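Theorem 6.3 Let (6.15) hold. Then we have

(1) $\inf_{\theta\in(p,q)} \big(A^{1\theta} \wedge A^{2\theta}\big) \le \bar A \le \sup_{\theta\in(p,q)} \big(A^{1\theta} \vee A^{2\theta}\big)$.

(2) Let $\theta$ be the median of $\mu$. Then $\big(A^{1\theta} \vee A^{2\theta}\big)/2 \le \bar A \le A^{1\theta} \vee A^{2\theta}$.

In particular, $\bar A < \infty$ iff $B^{1\theta} \vee B^{2\theta} < \infty$.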

Comparing the variational formulas (6.11), (6.12) and (6.14) with the classical variational formulas
\[
\begin{aligned}
\lambda_0 &= \inf\big\{D(f) : f \in C^1(0, D) \cap C[0, D],\ f(0) = 0,\ \pi(f^2) = 1\big\}, \\
\lambda_1 &= \inf\big\{D(f) : f \in C^1(0, D) \cap C[0, D],\ \pi(f) = 0,\ \pi(f^2) = 1\big\},
\end{aligned}
\]
one sees that they have nothing in common. This explains why the new formulas (6.12) and (6.14) have not appeared before. The key here is the discovery of the formulas rather than their proofs, which are usually not hard, thanks to the advantage of dimension one. As an illustration, we present here a part of the proofs.

Proof of the upper bound in (6.14)

Originally, the assertion was proved in Chen and F. Y. Wang (1997b) by using the coupling methods. Here we adopt the analytic proof given in Chen (1999a).

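Let $g \in C[0, D] \cap C^1(0, D)$, $\pi(g) = 0$ and $\pi(g^2) = 1$. Then, for every $f \in \mathscr F$ with $\pi(f) \ge 0$, we have
\[
\begin{aligned}
1 &= \frac12 \int_0^D\!\!\int_0^D \pi(dx)\pi(dy)\,[g(y) - g(x)]^2 \\
&= \iint_{\{x\le y\}} \pi(dx)\pi(dy) \bigg(\int_x^y \frac{g'(u)\sqrt{f'(u)}}{\sqrt{f'(u)}}\,du\bigg)^2 \\
&\le \iint_{\{x\le y\}} \pi(dx)\pi(dy) \int_x^y \frac{g'(u)^2}{f'(u)}\,du \int_x^y f'(\xi)\,d\xi
\qquad\text{(by the Cauchy–Schwarz inequality)} \\
&= \iint_{\{x\le y\}} \pi(dx)\pi(dy) \int_x^y g'(u)^2 e^{C(u)}\,\frac{e^{-C(u)}}{f'(u)}\,du\,\big[f(y) - f(x)\big] \\
&= \int_0^D a(u) g'(u)^2\,\pi(du)\,\frac{Z e^{-C(u)}}{f'(u)} \int_0^u \pi(dx) \int_u^D \pi(dy)\,\big[f(y) - f(x)\big] \\
&\le D(g) \sup_{u\in(0,D)} \frac{Z e^{-C(u)}}{f'(u)} \int_0^u \pi(dx) \int_u^D \pi(dy)\,\big[f(y) - f(x)\big] \\
&\le D(g) \sup_{x\in(0,D)} I(f)(x) \qquad\text{(since $\pi(f) \ge 0$)}.
\end{aligned}
\]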


Thus, $D(g)^{-1} \le \sup_{x\in(0,D)} I(f)(x)$, and so
\[
\bar A = \sup_{g:\ \pi(g)=0,\ \pi(g^2)=1} D(g)^{-1} \le \sup_{x\in(0,D)} I(f)(x).
\]
This gives us the required assertion:
\[
\bar A \le \inf_{f\in\mathscr F}\,\sup_{x\in(0,D)} I(f)(x).
\]

The proof that equality holds for continuous $a$ and $b$ needs more work, since it requires more precise properties of the corresponding eigenfunctions, as shown in §3.7 and §3.8 for the discrete case.

6.3 Extension. Banach spaces.

Starting from this section, we introduce the recent results obtained in Chen (2002a; 2002d); we will not cite them each time in what follows.

In this section, we study the Poincaré-type inequality (6.4). Clearly, the Banach spaces used here cannot be completely arbitrary, since we are dealing with a topic of hard mathematics. From now on, let $(\mathbb B, \|\cdot\|_{\mathbb B}, \mu)$ be a Banach space of functions $f: [0, D] \to \mathbb R$ satisfying the following conditions:

(1) $1 \in \mathbb B$;

(2) $\mathbb B$ is ideal: if $h \in \mathbb B$ and $|f| \le |h|$, then $f \in \mathbb B$;

(3) $\|f\|_{\mathbb B} = \sup_{g\in\mathscr G} \int_0^D |f|\,g\,d\mu$;

(4) $\mathscr G \ni g_0$ with $\inf g_0 > 0$,

(6.16)

where $\mathscr G$ is a fixed set, to be specified case by case later, of non-negative functions on $[0, D]$. The first two conditions mean that $\mathbb B$ is rich enough, and the last one means that $\mathscr G$ is not trivial: it contains at least one strictly positive function. The third condition is essential in this chapter; it means that the norm $\|\cdot\|_{\mathbb B}$ has a “dual” representation. A typical example of such a Banach space is $\mathbb B = L^r(\mu)$, for which $\mathscr G$ is the unit ball in $L^{r'}_+(\mu)$, $1/r + 1/r' = 1$.

The optimal constant $A_{\mathbb B}$ in (6.4) can be expressed by a variational formula as follows:
\[
A_{\mathbb B} = \sup\bigg\{\frac{\|f^2\|_{\mathbb B}}{D(f)} : f \in C_d[0, D],\ f(0) = 0,\ 0 < D(f) < \infty\bigg\}. \tag{6.17}
\]
Clearly, this formula is powerful mainly for the lower bounds of $A_{\mathbb B}$. However, the upper bounds are more useful in practice but much harder to handle. Fortunately, for these we have quite complete results.


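Define $\varphi(x) = \int_0^x e^{-C}$ as before and let
\[
\begin{gathered}
B_{\mathbb B} = \sup_{x\in(0,D)} \varphi(x)\,\big\|I_{(x,D)}\big\|_{\mathbb B}, \qquad
C_{\mathbb B} = \sup_{x\in(0,D)} \frac{\big\|\varphi(x\wedge\cdot)^2\big\|_{\mathbb B}}{\varphi(x)}, \\
D_{\mathbb B} = \sup_{x\in(0,D)} \frac{\big\|\sqrt{\varphi}\,\varphi(x\wedge\cdot)\big\|_{\mathbb B}}{\sqrt{\varphi(x)}}.
\end{gathered}
\tag{6.18}
\]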

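Theorem 6.4 Let (6.1) and (6.16) hold. Then we have the following assertions.

(1) Explicit criterion: $A_{\mathbb B} < \infty$ iff $B_{\mathbb B} < \infty$.

(2) Variational formulas for the upper bounds:
\[
A_{\mathbb B} \le \inf_{f\in\mathscr F'}\,\sup_{x\in(0,D)} f(x)^{-1}\big\|f\varphi(x\wedge\cdot)\big\|_{\mathbb B}
\le \inf_{f\in\mathscr F}\,\sup_{x\in(0,D)} \frac{e^{-C(x)}}{f'(x)}\big\|fI_{(x,D)}\big\|_{\mathbb B}. \tag{6.19}
\]

(3) Approximating procedure and explicit bounds: Let $B_{\mathbb B} < \infty$. Define $f_0 = \sqrt{\varphi}$, $f_n(x) = \|f_{n-1}\varphi(x\wedge\cdot)\|_{\mathbb B}$ and $D_{\mathbb B}(n) = \sup_{x\in(0,D)} f_n/f_{n-1}$ for $n \ge 1$. Then $D_{\mathbb B}(n)$ is decreasing in $n$ and
\[
B_{\mathbb B} \le C_{\mathbb B} \le A_{\mathbb B} \le D_{\mathbb B}(n) \le D_{\mathbb B} \le 4B_{\mathbb B} \tag{6.20}
\]
for all $n \ge 1$.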

We are now going to sketch the proof of the second variational formula in (6.19), from which the explicit upper bound $A_{\mathbb B} \le 4B_{\mathbb B}$ follows immediately, as we did at the end of the last section. The explicit estimates “$B_{\mathbb B} \le A_{\mathbb B} \le 4B_{\mathbb B}$” were previously obtained in S. G. Bobkov and F. Götze (1999b) in terms of the weighted Hardy inequality [cf. B. Muckenhoupt (1972)]. The lower bound follows easily from (6.17).

Sketch of the proof of the second variational formula in (6.19)

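The starting point is the variational formula for $A$ (cf. (6.11)):
\[
A \le \inf_{f\in\mathscr F}\,\sup_{x\in(0,D)} \frac{e^{-C(x)}}{f'(x)} \int_x^D \frac{f e^C}{a}
= \inf_{f\in\mathscr F}\,\sup_{x\in(0,D)} \frac{e^{-C(x)}}{f'(x)} \int_x^D f\,d\mu.
\]
Fix $g > 0$ and introduce the following transform:
\[
b \to b/g, \qquad a \to a/g > 0. \tag{6.21}
\]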


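Under this transform, $C(x)$ is transformed into
\[
C_g(x) = \int_0^x \frac{b/g}{a/g} = C(x).
\]
This means that the function $C$ is invariant under the transform, and so is the Dirichlet form $D(f)$. The left-hand side of (6.2) is changed into
\[
\int_0^D f^2 g\,e^C/a = \int_0^D f^2 g\,d\mu.
\]
At the same time, the constant $A$ is changed into
\[
A_g \le \inf_{f\in\mathscr F}\,\sup_{x\in(0,D)} \frac{e^{-C(x)}}{f'(x)} \int_x^D f g\,d\mu.
\]
Taking the supremum with respect to $g \in \mathscr G$, the left-hand side becomes
\[
\sup_{g\in\mathscr G} \int_0^D f^2 g\,d\mu = \big\|f^2\big\|_{\mathbb B}
\]
and the constant becomes
\[
\begin{aligned}
A_{\mathbb B} = \sup_g A_g
&\le \sup_g\,\inf_f\,\sup_x \frac{e^{-C(x)}}{f'(x)} \int_x^D f g\,d\mu
\le \inf_f\,\sup_g\,\sup_x \frac{e^{-C(x)}}{f'(x)} \int_x^D f g\,d\mu \\
&= \inf_f\,\sup_x \frac{e^{-C(x)}}{f'(x)}\,\sup_g \int_0^D f I_{(x,D)}\,g\,d\mu \\
&= \inf_f\,\sup_x \frac{e^{-C(x)}}{f'(x)}\,\big\|fI_{(x,D)}\big\|_{\mathbb B}.
\end{aligned}
\]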

We are done! Of course, more details are required to complete the proof. For instance, one may use $g + 1/n$ instead of $g$ to avoid the condition “$g > 0$” and then pass to the limit.

The lucky point in the proof is that “$\sup\inf \le \inf\sup$”, which goes in the correct direction. However, we do not know at the moment how to generalize the dual variational formula for lower bounds, given in (6.12), to general Banach spaces, since the same procedure goes in the opposite direction.
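The elementary inequality “$\sup\inf \le \inf\sup$” holds for any doubly indexed family of numbers. A minimal finite sketch, with a hypothetical table whose rows play the role of $g$ and whose columns play the role of $f$:

```python
def sup_inf(table):
    """Maximum over rows (index g) of the minimum over columns (index f)."""
    return max(min(row) for row in table)


def inf_sup(table):
    """Minimum over columns (index f) of the maximum over rows (index g)."""
    return min(max(column) for column in zip(*table))


# Hypothetical values: table[g][f] stands for the quantity attached to the pair (g, f).
table = [
    [3.0, 1.0, 4.0],
    [1.0, 5.0, 9.0],
    [2.0, 6.0, 5.0],
]

print(sup_inf(table), inf_sup(table))
```

Here the row minima are 1, 1, 2 and the column maxima are 3, 6, 9, so the two quantities are 2 and 3. The gap between them shows why exchanging the order is harmless for upper bounds but cannot simply be reversed for lower bounds.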

6.4 Neumann case. Orlicz spaces.

In the Neumann case, the boundary condition becomes $f'(0) = 0$, rather than $f(0) = 0$. Then $\lambda_0 = 0$ is trivial. Hence, we study $\lambda_1$ (called the spectral gap of $L$), that is, inequality (6.3). We now consider its generalization (6.5). Naturally, one may play the same game as in the last section, extending (6.14) to Banach spaces. However, it does not work this time. Note that on the left-hand side of (6.5), the term $\pi(f)$ is not invariant under the transform


(6.21). Moreover, since $\pi(\bar f) = 0$, it is easy to check that for each fixed $f \in \mathscr F$, $I(\bar f)(x)$ is positive for all $x \in (0, D)$. But this property is no longer true when $d\mu$ is replaced by $g\,d\mu$. Our goal is to adopt the splitting technique explained in Section 6.2.

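Let $\theta \in (p, q)$ be a reference point and let $A^{k\theta}_{\mathbb B}$, $B^{k\theta}_{\mathbb B}$, $C^{k\theta}_{\mathbb B}$, $D^{k\theta}_{\mathbb B}$ ($k = 1, 2$) be the constants defined in (6.17) and (6.18) corresponding to the intervals $(\theta, q)$ and $(p, \theta)$, respectively. By Theorem 6.4, we have
\[
B^{k\theta}_{\mathbb B} \le C^{k\theta}_{\mathbb B} \le A^{k\theta}_{\mathbb B} \le D^{k\theta}_{\mathbb B} \le 4B^{k\theta}_{\mathbb B}, \qquad k = 1, 2.
\]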

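Theorem 6.5 Let (6.15) and (6.16) hold. Then we have the following assertions.

(1) Explicit criterion: $\bar A_{\mathbb B} < \infty$ iff $B^{1\theta}_{\mathbb B} \vee B^{2\theta}_{\mathbb B} < \infty$.

(2) Explicit estimates:
\[
\max\Big\{\tfrac12\big(A^{1\theta}_{\mathbb B} \wedge A^{2\theta}_{\mathbb B}\big),\ K_\theta\big(A^{1\theta}_{\mathbb B} \vee A^{2\theta}_{\mathbb B}\big)\Big\}
\le \bar A_{\mathbb B} \le A^{1\theta}_{\mathbb B} \vee A^{2\theta}_{\mathbb B},
\]
where $K_\theta$ is a constant.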

We are now in a position to consider briefly the discrete case, i.e., birth–death processes. Let $b_i$ ($i \ge 0$) be the birth rates and $a_i$ ($i \ge 1$) be the death rates of a birth–death process. Define
\[
\mu_0 = 1, \qquad \mu_n = \frac{b_0 \cdots b_{n-1}}{a_1 \cdots a_n}, \qquad Z = \sum_{n=0}^\infty \mu_n, \qquad \pi_n = \frac{\mu_n}{Z}, \qquad n \ge 1.
\]
Consider a Banach space $(\mathbb B, \|\cdot\|_{\mathbb B}, \mu)$ of functions $E := \{0, 1, 2, \ldots\} \to \mathbb R$ satisfying (6.16). Define
\[
\varphi_i = \sum_{j=1}^i \frac{1}{\mu_j a_j}, \quad i \ge 1; \qquad
B_{\mathbb B} = \sup_{i\ge 1} \varphi_i\,\big\|I_{\{i, i+1, \ldots\}}\big\|_{\mathbb B}.
\]
Clearly, the inequalities (6.4) and (6.5) are meaningful with a slight modification.

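Theorem 6.6 Consider birth–death processes with state space $E$. Assume that $Z < \infty$.

(1) Explicit criterion for (6.4): $A_{\mathbb B} < \infty$ iff $B_{\mathbb B} < \infty$.

(2) Explicit bounds for $A_{\mathbb B}$: $B_{\mathbb B} \le A_{\mathbb B} \le 4B_{\mathbb B}$.

(3) Explicit criterion for (6.5): Let the birth–death process be non-explosive:
\[
\sum_{i=0}^\infty \frac{1}{\mu_i b_i} \sum_{j=0}^i \mu_j = \infty. \tag{6.22}
\]
Then $\bar A_{\mathbb B} < \infty$ iff $B_{\mathbb B} < \infty$.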


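(4) Explicit estimates for $\bar A_{\mathbb B}$: Let $E_1 = \{1, 2, \ldots\}$ and let $c_1$ and $c_2$ be two constants such that $|\pi(f)| \le c_1\|f\|_{\mathbb B}$ and $|\pi(fI_{E_1})| \le c_2\|fI_{E_1}\|_{\mathbb B}$ for all $f \in \mathbb B$. Then
\[
\max\Big\{\|1\|_{\mathbb B}^{-1},\ \Big(1 - \sqrt{c_2(1 - \pi_0)\|1\|_{\mathbb B}}\Big)^2 A_{\mathbb B}\Big\}
\le \bar A_{\mathbb B} \le \Big(1 + \sqrt{c_1\|1\|_{\mathbb B}}\Big)^2 A_{\mathbb B}. \tag{6.23}
\]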

Similarly, one can handle birth–death processes on $\mathbb Z$.

An interesting point here is that the first lower bound in (6.23) is meaningful only in the discrete situation.

Orlicz spaces. The results obtained so far can be specialized to Orlicz spaces. The idea also goes back to S. G. Bobkov and F. Götze (1999b). A function $\Phi: \mathbb R \to \mathbb R$ is called an $N$-function if it is non-negative, continuous, convex, even (i.e., $\Phi(-x) = \Phi(x)$) and satisfies the following conditions:
\[
\Phi(x) = 0 \ \text{ iff } x = 0, \qquad \lim_{x\to 0} \Phi(x)/x = 0, \qquad \lim_{x\to\infty} \Phi(x)/x = \infty.
\]
In what follows, we assume the following growth condition (or $\Delta_2$-condition) for $\Phi$:
\[
\sup_{x\gg 1} \Phi(2x)/\Phi(x) < \infty \quad \Big(\Longleftrightarrow\ \sup_{x\gg 1} x\Phi'_-(x)/\Phi(x) < \infty\Big),
\]
where $\Phi'_-$ is the left derivative of $\Phi$. Corresponding to each $N$-function, we have a complementary $N$-function:
\[
\Phi_c(y) := \sup\{x|y| - \Phi(x) : x \ge 0\}, \qquad y \in \mathbb R.
\]
Alternatively, let $\varphi_c$ be the inverse function of $\Phi'_-$; then $\Phi_c(y) = \int_0^{|y|} \varphi_c$ [cf. M. M. Rao and Z. D. Ren (1991)].

Given an $N$-function and a finite measure $\mu$ on $E := (p, q) \subset \mathbb R$, define an Orlicz space as follows:
\[
L^\Phi(\mu) = \Big\{f\ (E \to \mathbb R) : \int_E \Phi(f)\,d\mu < \infty\Big\}, \qquad
\|f\|_\Phi = \sup_{g\in\mathscr G} \int_E |f|\,g\,d\mu, \tag{6.24}
\]
where $\mathscr G = \big\{g \ge 0 : \int_E \Phi_c(g)\,d\mu \le 1\big\}$, which is the set of non-negative functions in the unit ball of $L^{\Phi_c}(\mu)$. Under the $\Delta_2$-condition, $(L^\Phi(\mu), \|\cdot\|_\Phi, \mu)$ is a Banach space. For this, the $\Delta_2$-condition is indeed necessary. Clearly, $L^\Phi(\mu) \ni 1$ and is ideal. Obviously, $(L^\Phi(\mu), \|\cdot\|_\Phi, \mu)$ satisfies condition (6.16) and so we have the following result.

Corollary 6.7 For any $N$-function $\Phi$ satisfying the growth condition, if (6.1) (resp., (6.15)) holds, then Theorem 6.4 (resp., 6.5) is available for the Orlicz space $(L^\Phi(\mu), \|\cdot\|_\Phi, \mu)$.


6.5 Nash inequality and Sobolev-type inequality

It is known that when $\nu > 2$, the Nash inequality (6.6):
\[
\|f - \pi(f)\|^{2+4/\nu} \le A_N D(f)\,\|f\|_1^{4/\nu}
\]
is equivalent to the Sobolev-type inequality:
\[
\|f - \pi(f)\|^2_{2\nu/(\nu-2)} \le A_S D(f),
\]
where $\|\cdot\|_r$ is the $L^r(\mu)$-norm. Refer to D. Bakry et al. (1995), E. A. Carlen et al. (1987) and N. Varopoulos (1985). This leads to the use of the Orlicz space $L^\Phi(\mu)$ with $\Phi(x) = |x|^r/r$, $r = \nu/(\nu - 2)$:
\[
\big\|(f - \pi(f))^2\big\|_\Phi \le A_\nu D(f). \tag{6.25}
\]

The results in this section were obtained in Y. H. Mao (2002b), based on the weighted Hardy inequalities.

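Define $C(x) = \int_\theta^x b/a$, $\mu(m, n) = \int_m^n e^C/a$ and
\[
\begin{aligned}
\varphi^{1\theta}(x) &= \int_\theta^x e^{-C}, & B^{1\theta}_\nu &= \sup_{x>\theta} \varphi^{1\theta}(x)\,\mu(x, q)^{(\nu-2)/\nu}, \\
\varphi^{2\theta}(x) &= \int_x^\theta e^{-C}, & B^{2\theta}_\nu &= \sup_{x<\theta} \varphi^{2\theta}(x)\,\mu(p, x)^{(\nu-2)/\nu}.
\end{aligned}
\]
Here $B^{k\theta}_\nu$ ($k = 1, 2$) is specified from $B_{\mathbb B}$ given in (6.18) with $\mathbb B = L^\Phi((\theta, q), \mu)$ or $\mathbb B = L^\Phi((p, \theta), \mu)$, since $\|\cdot\|_\Phi = (r')^{1/r'}\|\cdot\|_r$, $1/r + 1/r' = 1$.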

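Theorem 6.8 Let (6.15) hold and $\nu > 2$.

(1) Explicit criterion: The Nash inequality (equivalently, (6.25)) holds on $(p, q)$ iff $B^{1\theta}_\nu \vee B^{2\theta}_\nu < \infty$.

(2) Explicit bounds:
\[
\max\bigg\{\frac12\big(B^{1\theta}_\nu \wedge B^{2\theta}_\nu\big),\
\bigg[1 - \bigg(\frac{Z^{1\theta} \vee Z^{2\theta}}{Z^{1\theta} + Z^{2\theta}}\bigg)^{1/2+1/\nu}\bigg]^2 \big(B^{1\theta}_\nu \vee B^{2\theta}_\nu\big)\bigg\}
\le A_\nu \le 4\big(B^{1\theta}_\nu \vee B^{2\theta}_\nu\big). \tag{6.26}
\]
In particular, if $\theta$ is the median of $\mu$, then
\[
\Big[1 - (1/2)^{1/2+1/\nu}\Big]^2 \big(B^{1\theta}_\nu \vee B^{2\theta}_\nu\big) \le A_\nu \le 4\big(B^{1\theta}_\nu \vee B^{2\theta}_\nu\big).
\]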

We now consider birth–death processes with state space $\{0, 1, 2, \ldots\}$. Define
\[
\varphi_i = \sum_{j=1}^i \frac{1}{\mu_j a_j}, \quad i \ge 1; \qquad
B_\nu = \sup_{i\ge 1} \varphi_i \bigg(\sum_{j=i}^\infty \mu_j\bigg)^{(\nu-2)/\nu}.
\]


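Theorem 6.9 For birth–death processes, let (6.22) hold and assume that $Z < \infty$. Then we have
\[
\max\bigg\{\bigg(\frac{2}{\nu Z^{\nu/2-1}}\bigg)^{2/\nu},\ \bigg[1 - \bigg(\frac{Z - 1}{Z}\bigg)^{1/2+1/\nu}\bigg]^2 B_\nu\bigg\}
\le A_\nu \le 16B_\nu. \tag{6.27}
\]
Hence, when $\nu > 2$, the Nash inequality holds iff $B_\nu < \infty$.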

6.6 Logarithmic Sobolev inequality

The starting point of the study is the following observation.

Lemma 6.10 Let $\Phi(x) = |x|\log(1 + |x|)$, $\mathscr L(f) = \sup_{c\in\mathbb R} \mathrm{Ent}\big((f + c)^2\big)$ and $\mathrm{Ent}(f) = \int_{\mathbb R} f \log\frac{f}{\pi(f)}\,d\mu$, $f \ge 0$. Then we have
\[
\frac25\,\big\|(f - \pi(f))^2\big\|_\Phi \le \mathscr L(f) \le \frac{51}{20}\,\big\|(f - \pi(f))^2\big\|_\Phi. \tag{6.28}
\]

The proof of Lemma 6.10 is given at the end of this section.

The observation leads to the use of the Orlicz space $\mathbb B = L^\Phi(\mu)$ with $\Phi(x) = |x|\log(1 + |x|)$. The results in this section were obtained by S. G. Bobkov and F. Götze (1999b) and Y. H. Mao (2002a), based again on the weighted Hardy inequalities. Refer also to L. Miclo (1999b) for a related study.

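Define
\[
\begin{gathered}
C(x) = \int_\theta^x \frac{b}{a}, \qquad \mu(m, n) = \int_m^n e^C/a; \\
\varphi^{1\theta}(x) = \int_\theta^x e^{-C}, \qquad \varphi^{2\theta}(x) = \int_x^\theta e^{-C}; \\
M(x) = x\bigg[\frac{2}{1 + \sqrt{1 + 4x}} + \log\bigg(1 + \frac{1 + \sqrt{1 + 4x}}{2x}\bigg)\bigg]; \\
B^{1\theta}_\Phi = \sup_{x\in(\theta,q)} \varphi^{1\theta}(x)\,M(\mu(x, q)), \qquad
B^{2\theta}_\Phi = \sup_{x\in(p,\theta)} \varphi^{2\theta}(x)\,M(\mu(p, x)).
\end{gathered}
\tag{6.29}
\]
Again, here $B^{k\theta}_\Phi$ ($k = 1, 2$) is specified from $B_{\mathbb B}$ given in (6.18), applied to the tail intervals $(x, q)$ and $(p, x)$.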

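Theorem 6.11 Let (6.15) hold.

(1) Explicit criterion: The logarithmic Sobolev inequality on $(p, q) \subset \mathbb R$ holds iff
\[
\sup_{x\in(\theta,q)} \mu(x, q)\log\frac{1}{\mu(x, q)} \int_\theta^x e^{-C} < \infty
\quad\text{and}\quad
\sup_{x\in(p,\theta)} \mu(p, x)\log\frac{1}{\mu(p, x)} \int_x^\theta e^{-C} < \infty \tag{6.30}
\]
hold for some (equivalently, all) $\theta \in (p, q)$.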


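(2) Explicit bounds: Let $\theta$ be the root of $B^{1\theta}_\Phi = B^{2\theta}_\Phi$, $\theta \in [p, q]$. Then we have
\[
\frac15\,B^{1\theta}_\Phi \le A_{LS} \le \frac{51}{5}\,B^{1\theta}_\Phi. \tag{6.31}
\]
By a translation if necessary, assume that $\theta = 0$ is the median of $\mu$. Then we have
\[
\frac{(\sqrt2 - 1)^2}{5}\,\big(B^{1\theta}_\Phi \vee B^{2\theta}_\Phi\big) \le A_{LS} \le \frac{51}{5}\,\big(B^{1\theta}_\Phi \vee B^{2\theta}_\Phi\big). \tag{6.32}
\]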

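We now consider birth–death processes with state space $\{0, 1, 2, \ldots\}$. Define
\[
\varphi_i = \sum_{j=1}^i \frac{1}{\mu_j a_j}, \quad i \ge 1; \qquad B_\Phi = \sup_{i\ge 1} \varphi_i\,M(\mu[i, \infty)),
\]
where $\mu[i, \infty) = \sum_{j\ge i} \mu_j$ and $M(x)$ is defined in (6.29).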

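Theorem 6.12 For birth–death processes, let (6.22) hold and assume that $Z < \infty$. Then we have
\[
\frac25 \max\bigg\{\frac{\sqrt{4Z + 1} - 1}{2},\ \bigg(1 - \frac{Z_1\Psi^{-1}\big(Z_1^{-1}\big)}{Z\Psi^{-1}\big(Z^{-1}\big)}\bigg)^2 B_\Phi\bigg\}
\le A_{LS} \le \frac{51}{5}\Big(1 + \Psi^{-1}\big(Z^{-1}\big)\Big)^2 B_\Phi,
\]
where $Z_1 = Z - 1$ and $\Psi^{-1}$ is the inverse function of $\Psi$: $\Psi(x) = x^2\log(1 + x^2)$. In particular, $A_{LS} < \infty$ iff
\[
\sup_{i\ge 1} \varphi_i\,\mu[i, \infty)\log\frac{1}{\mu[i, \infty)} < \infty.
\]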
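The two explicit criteria can be compared on a concrete chain. The sketch below rests on assumptions not taken from the text: constant rates $b_i = 1$, $a_i = 2$, infinite tails $\mu[i, \infty)$ approximated by truncated sums, and the Poincaré-type quantity taken as $\varphi_i\,\mu[i, \infty)$, which is $B_{\mathbb B}$ of Theorem 6.6 when $\mathbb B = L^1(\mu)$. The function name `birth_death_criteria` is hypothetical.

```python
import math


def birth_death_criteria(birth, death, n_terms):
    """Return the lists (phi_i * mu[i,oo)) and (phi_i * mu[i,oo) * log(1/mu[i,oo))).

    birth[i] = b_i for i >= 0; death[i] = a_i for i >= 1 (death[0] is unused).
    Tails are truncated at n_terms, so only indices i < n_terms // 2 are reported.
    """
    # mu_0 = 1, mu_n = b_0 ... b_{n-1} / (a_1 ... a_n)
    mu = [1.0]
    for n in range(1, n_terms):
        mu.append(mu[-1] * birth[n - 1] / death[n])

    # tail[i] approximates mu[i, infinity) by the sum over i <= j < n_terms
    tail = [0.0] * (n_terms + 1)
    for i in range(n_terms - 1, -1, -1):
        tail[i] = tail[i + 1] + mu[i]

    poincare, log_sobolev = [], []
    phi = 0.0
    for i in range(1, n_terms // 2):
        phi += 1.0 / (mu[i] * death[i])  # phi_i = sum_{j <= i} 1 / (mu_j a_j)
        poincare.append(phi * tail[i])
        log_sobolev.append(phi * tail[i] * math.log(1.0 / tail[i]))
    return poincare, log_sobolev


N = 80
poincare, log_sobolev = birth_death_criteria([1.0] * N, [2.0] * N, N)
print("max Poincare-type term:", max(poincare))
print("last log-Sobolev term: ", log_sobolev[-1])
```

For these rates $\mu_n = 2^{-n}$, $\varphi_i = 2^i - 1$ and $\mu[i, \infty) = 2^{1-i}$, so the first quantity stays below 2 while the second grows linearly in $i$. This suggests that the chain satisfies the Poincaré-type criterion but not the logarithmic Sobolev one.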

Proof of Lemma 6.10. We follow the proof given in S. G. Bobkov and F. Götze (1999b) and Chen (2001b).

Without loss of generality, we may replace $\mu$ with $\pi$ in the definitions of $\mathscr L(f)$, $\mathrm{Ent}(f)$ and $\|\cdot\|_\Phi$. In other words, we may assume that $\mu = \pi$.

For convenience, we adopt a more practical but equivalent norm as follows:
\[
\|f\|_{(\Phi)} = \inf\Big\{\alpha > 0 : \int_E \Phi(f/\alpha)\,d\mu \le 1\Big\}. \tag{6.33}
\]
The comparison of these two norms is as follows:
\[
\|f\|_{(\Phi)} \le \|f\|_\Phi \le 2\|f\|_{(\Phi)} \tag{6.34}
\]


[cf. M. M. Rao and Z. D. Ren (1991, §3.3, Proposition 4)]. Because $\|f^2\|_{(\Phi)} = \|f\|^2_{(\Psi)}$, it suffices to prove that
\[
\frac45\,\big\|f - \pi(f)\big\|^2_{(\Psi)} \le \mathscr L(f) \le \frac{51}{20}\,\big\|f - \pi(f)\big\|^2_{(\Psi)}. \tag{6.35}
\]
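The norm (6.33) and the identity $\|f^2\|_{(\Phi)} = \|f\|^2_{(\Psi)}$ are easy to test on a finite measure space. The following is a minimal sketch, assuming a discrete measure with hypothetical weights and values; the helper `luxemburg_norm` is an illustrative name, and bisection is used here only as one convenient way to compute the infimum.

```python
import math


def luxemburg_norm(values, weights, big_phi, iterations=200):
    """inf{alpha > 0 : sum_k weights[k] * big_phi(values[k] / alpha) <= 1}, by bisection."""
    if all(v == 0 for v in values):
        return 0.0

    def modular(alpha):
        return sum(w * big_phi(v / alpha) for v, w in zip(values, weights))

    lo, hi = 0.0, 1.0
    while modular(hi) > 1.0:  # enlarge hi until it is admissible
        hi *= 2.0
    for _ in range(iterations):
        mid = 0.5 * (lo + hi)
        if modular(mid) <= 1.0:
            hi = mid
        else:
            lo = mid
    return hi


def phi(x):
    """Phi(x) = |x| log(1 + |x|)."""
    return abs(x) * math.log1p(abs(x))


def psi(x):
    """Psi(x) = x^2 log(1 + x^2)."""
    return x * x * math.log1p(x * x)


# Hypothetical data: a probability measure on four points and a function f on them.
weights = [0.1, 0.2, 0.3, 0.4]
f = [0.5, -1.0, 2.0, 3.0]

norm_phi = luxemburg_norm([v * v for v in f], weights, phi)
norm_psi = luxemburg_norm(f, weights, psi)
print(norm_phi, norm_psi ** 2)
```

The two printed numbers agree up to rounding, because $\Phi(f^2/\alpha) = \Psi(f/\sqrt{\alpha})$ pointwise.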

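Let $\|f\|_{(\Psi)} = 1$ and $\pi(f) = 0$. By Lemma 4.13, we have
\[
\mathscr L(f) \le \mathrm{Ent}\big(f^2\big) + 2\pi\big(f^2\big). \tag{6.36}
\]
Express the right-hand side as
\[
\int f^2\big(\delta + \log f^2\big)\,d\pi + \pi\big(f^2\big)\big[2 - \delta - \log\pi\big(f^2\big)\big]
\]
for some $\delta \in [0, 2]$. Note that $x(2 - \delta - \log x) \le e^{1-\delta}$ for all $x > 0$. Let $c(\delta)$ be the bound such that $\delta + \log x \le c(\delta)\log(1 + x)$ for all $x > 0$. Then we have
\[
\mathscr L(f) \le c(\delta)\int f^2\log\big(1 + f^2\big)\,d\pi + e^{1-\delta} \le c(\delta) + e^{1-\delta}.
\]
Minimizing the right-hand side in $\delta$ and noting that $c(\delta)$ satisfies the equation
\[
c\log c - (c - 1)\log(c - 1) = \delta, \qquad c > 1
\]
(which comes from the equation $c'(\delta) = 0$), we obtain $\delta \approx 1.02118$, $c(\delta) \approx 1.56271$. Then $c(\delta) + e^{1-\delta} < 2.542 < 2.55 = 51/20$ and the required upper bound follows.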
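The numerical constants in the upper bound can be reproduced with a few lines of code. The sketch below assumes, as one reading of the argument above, that the optimal $\delta$ is the stationary point of $\delta \mapsto c(\delta) + e^{1-\delta}$ with $c$ and $\delta$ linked by $c\log c - (c-1)\log(c-1) = \delta$; since $d\delta/dc = \log\frac{c}{c-1}$, stationarity reads $1/\log\frac{c}{c-1} = e^{1-\delta}$. The function names are hypothetical.

```python
import math


def delta_of_c(c):
    """delta as a function of c from c log c - (c - 1) log(c - 1) = delta."""
    return c * math.log(c) - (c - 1.0) * math.log(c - 1.0)


def stationarity(c):
    """Derivative of c(delta) + e^{1 - delta} in delta, written as a function of c."""
    # d(delta)/dc = log(c / (c - 1)), hence c'(delta) = 1 / log(c / (c - 1)).
    return 1.0 / math.log(c / (c - 1.0)) - math.exp(1.0 - delta_of_c(c))


def solve_constants(lo=1.1, hi=3.0, iterations=100):
    """Bisection for the root of stationarity(c); the function is increasing in c."""
    for _ in range(iterations):
        mid = 0.5 * (lo + hi)
        if stationarity(mid) < 0.0:
            lo = mid
        else:
            hi = mid
    c = 0.5 * (lo + hi)
    delta = delta_of_c(c)
    return delta, c, c + math.exp(1.0 - delta)


delta, c, bound = solve_constants()
print(delta, c, bound)
```

The computed values agree with the constants $\delta \approx 1.02118$ and $c(\delta) \approx 1.56271$ quoted above, and the resulting bound stays below $51/20$.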

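For the lower bound, let $\pi(f) = 0$ and $\mathscr L(f) = 2$. Because
\[
\pi\big(f^2\big) - \pi(f)^2 = \frac12 \lim_{|a|\to\infty} \mathrm{Ent}\big((f + a)^2\big) \le \frac12\,\mathscr L(f),
\]
we get $\pi(f^2) \le 1$. Hence $\pi(f^2)\log\pi(f^2) \le 0$ and moreover
\[
\pi\big(f^2\log f^2\big) = \mathrm{Ent}\big(f^2\big) + \pi\big(f^2\big)\log\pi\big(f^2\big) \le \mathrm{Ent}\big(f^2\big) \le \mathscr L(f) = 2. \tag{6.37}
\]
The idea is to find the smallest constant $\delta \approx 0.4408$ such that
\[
x\log\big(1 + x/(2 + \delta)\big) \le \delta + x\log x
\]
for all $x \ge 0$. Then
\[
\int \frac{f^2}{2 + \delta}\log\Big(1 + \frac{f^2}{2 + \delta}\Big)\,d\pi \le \Big(\delta + \int f^2\log f^2\,d\pi\Big)\Big/(2 + \delta) \le 1,
\]
by (6.37). Thus, $\big\|f/\sqrt{2 + \delta}\big\|_{(\Psi)} \le 1$ and so $\|f\|^2_{(\Psi)} \le 2 + \delta < 5\mathscr L(f)/4$.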
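The constant in the lower bound can also be approximated numerically. This is a rough sketch under stated assumptions: the supremum over $x$ is replaced by a maximum over a uniform grid on $(0, 2]$ (the expression is negative for larger $x$ when $\delta \in [0, 1]$), and the smallest admissible $\delta$ is located by bisection on the condition $\sup_x \{x\log(1 + x/(2+\delta)) - x\log x\} \le \delta$. The function names are hypothetical.

```python
import math


def worst_gap(delta, grid=4000, x_max=2.0):
    """Grid maximum over x in (0, x_max] of x*log(1 + x/(2 + delta)) - x*log(x)."""
    k = 1.0 / (2.0 + delta)
    best = 0.0
    for i in range(1, grid + 1):
        x = x_max * i / grid
        best = max(best, x * math.log(1.0 + k * x) - x * math.log(x))
    return best


def smallest_delta(lo=0.0, hi=1.0, iterations=40):
    """Bisection for the smallest delta with worst_gap(delta) <= delta."""
    for _ in range(iterations):
        mid = 0.5 * (lo + hi)
        if worst_gap(mid) <= mid:
            hi = mid
        else:
            lo = mid
    return hi


delta = smallest_delta()
print(delta, 2.0 + delta)
```

The value found is close to the constant $0.4408$ quoted above, and $2 + \delta$ stays below $5\mathscr L(f)/4 = 2.5$.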


6.7 Partial proofs of Theorem 6.1

In this section, we prove Theorem 6.1 except for the conclusion that equality holds in (6.11) and (6.12) for continuous coefficients $a$ and $b$. The proof of the last assertion requires some finer properties of the eigenfunction in the weak sense, as seen from §3.7 and §3.8 in the discrete case.

Proof of (6.11)

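(a) First we prove that $A \le \inf_{f\in\mathscr F'} \sup_{x\in(0,D)} II(f)(x)$. Given $h$ with $h|_{(0,D)} > 0$, for every $g$ with $g(0) = 0$ and $\|g\| = 1$, by the Cauchy–Schwarz inequality, we have
\[
\begin{aligned}
1 &= \int_0^D g(x)^2\,\pi(dx) = \int_0^D \pi(dx)\bigg[\int_0^x g'(u)\,du\bigg]^2 \\
&\le \int_0^D \pi(dx) \int_0^x \big[g'^2 e^C h^{-1}\big](u)\,du \int_0^x \big[h e^{-C}\big](\xi)\,d\xi \\
&= \int_0^D a(u) g'(u)^2\,\pi(du)\,\frac{Z}{h(u)} \int_u^D \pi(dx) \int_0^x h e^{-C} \\
&\le D(g) \sup_{x\in(0,D)} \frac{1}{h(x)} \int_x^D \frac{e^{C(y)}}{a(y)}\,dy \int_0^y h e^{-C} \\
&=: D(g) \sup_{x\in(0,D)} H(x).
\end{aligned}
\tag{6.38}
\]
Now, let $f \in \mathscr F'$ satisfy $\sup_{x\in(0,D)} II(f)(x) < \infty$. Take $h(x) = \int_x^D f a^{-1} e^C$. By the Mean Value Theorem, we get
\[
\sup_{x\in(0,D)} H(x) \le \sup_{x\in(0,D)} \bigg[-\frac{e^C}{a h'}\bigg](x) \int_0^x h e^{-C} = \sup_{x\in(0,D)} II(f)(x). \tag{6.39}
\]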

Because $g$ is arbitrary, by (6.38) and (6.39), we obtain the required assertion. When $D = \infty$, however, there is a problem with the integrability of $h$. To avoid this, replace $D$ with a finite $M$ in the definitions of $h$ and $H$; the above proof remains valid. Then the conclusion follows by letting $M \uparrow \infty$.

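(b) Next, we prove that
\[
\inf_{f\in\mathscr F'}\,\sup_{x\in(0,D)} II(f)(x) = \inf_{f\in\mathscr F}\,\sup_{x\in(0,D)} I(f)(x).
\]
Given $f \in \mathscr F$, without loss of generality, assume that $\sup_{x\in(0,D)} I(f)(x) < \infty$. By the Mean Value Theorem, $\sup_{x\in(0,D)} II(f)(x) \le \sup_{x\in(0,D)} I(f)(x)$. But $\mathscr F' \supset \mathscr F$, so
\[
\inf_{f\in\mathscr F'}\,\sup_{x\in(0,D)} II(f)(x) \le \inf_{f\in\mathscr F}\,\sup_{x\in(0,D)} I(f)(x).
\]
Conversely, for a given $f \in \mathscr F'$ with $\sup_{x\in(0,D)} II(f)(x) < \infty$, let $g = f\,II(f)$. Then $g \in \mathscr F$. By using the Mean Value Theorem again, we obtain
\[
I(g)(x) = \bigg[\int_x^D f a^{-1} e^C\bigg]^{-1} \int_x^D g a^{-1} e^C \le \sup_{x\in(0,D)} (g/f)(x) = \sup_{x\in(0,D)} II(f)(x).
\]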


When $D = \infty$, there is again a problem with the integrability of $g$, which can be solved by the method mentioned in the last paragraph. Hence
\[
\sup_{x\in(0,D)} II(f)(x) \ge \sup_{x\in(0,D)} I(g)(x) \ge \inf_{f\in\mathscr F}\,\sup_{x\in(0,D)} I(f)(x).
\]
Taking the infimum with respect to $f \in \mathscr F'$, it follows that
\[
\inf_{f\in\mathscr F'}\,\sup_{x\in(0,D)} II(f)(x) \ge \inf_{f\in\mathscr F}\,\sup_{x\in(0,D)} I(f)(x).
\]
An alternative proof of this inequality can be given by using the identity
\[
\big(e^C g'\big)' = -f e^C/a. \tag{6.40}
\]
We have thus proved the required assertion.

Combining (a) with (b), we get (6.11).

Proof of (6.12)

(a) For $\sup_{f\in\mathscr F'} \inf_{x\in(0,D)} II(f)(x) = \sup_{f\in\mathscr F} \inf_{x\in(0,D)} I(f)(x)$, the proof is dual to the one above: exchange supremum and infimum, reverse the order of the inequalities and redefine $g = [f\,II(f)](\cdot\wedge x_0)$.

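(b) Let $f \in \mathscr F'$ satisfy $f = f(\cdot\wedge x_0)$ and $c := \sup_{x\in(0,D)} II(f)(x)^{-1} < \infty$, and let $g_0 = [f\,II(f)](\cdot\wedge x_0)$. Then $g_0$ is bounded and (6.40) holds on $(0, x_0)$. By the integration by parts formula, we get
\[
\begin{aligned}
\int_0^D g_0'^2 e^C &= \big[g_0 g_0' e^C\big](x_0-) - \int_0^{x_0} g_0\big(e^C g_0'\big)' \\
&= \int_0^{x_0} g_0 f e^C/a + g_0(x_0) \int_{x_0}^D f e^C/a \\
&= \int_0^D g_0 f e^C/a \le \int_0^D \big(g_0^2 e^C/a\big) \sup_{x\in(0,D)} f/g_0 \\
&= c \int_0^D g_0^2 e^C/a.
\end{aligned}
\]
Hence $A \ge c^{-1}$ and furthermore $A \ge \sup_{f\in\mathscr F'} \inf_{x\in(0,D)} II(f)(x)$.

Combining (a) with (b), we get (6.12).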

Proof of part (3) of Theorem 6.1

First, we consider the case (a). The condition $B < \infty$ implies that
\[
\int_0^D \sqrt{\varphi}\,e^C/a \le \sqrt{B} \int_0^D \bigg(\int_x^D e^C/a\bigg)^{-1/2} e^C/a = 2\sqrt{BZ} < \infty.
\]


Hence $\sqrt{\varphi} \in L^1(\pi)$ [these two conditions are needed for the initial function $\sqrt{\varphi}$; in practice, one can certainly choose some more convenient functions]. Furthermore, as in part (b) of the proof of (6.11) and by induction, it follows that $f_n \in L^1(\pi)$ for all $n$. By (6.11),
\[
A \le \inf_{f\in\mathscr F'}\,\sup_{x\in(0,D)} II(f)(x) \le D_n.
\]
Then by the Mean Value Theorem and the proof of the upper bound given in §5.3, we get $D_1 \le 4B$. On the other hand, by the definition of $f_n$ and (6.40), we have
\[
\big(-e^C f_n'\big)' = a^{-1} e^C f_{n-1} \ge a^{-1} f_n e^C D_{n-1}^{-1}. \tag{6.41}
\]
That is, $f_n e^C/a \le D_{n-1}\big(-e^C f_n'\big)'$. Hence
\[
f_{n+1}(x) \le D_{n-1} \int_0^x e^{-C(y)}\,dy \int_y^D \big(-e^C f_n'\big)'(u)\,du \le D_{n-1} f_n(x). \tag{6.42}
\]

From this, one deduces that $D_n \le D_{n-1}$.

We now consider the case (b). By the identity
\[
[f\,II(f)](x) = \int_0^D f\varphi(\cdot\wedge x)\,e^C/a = \int_0^x f\varphi\,e^C/a + \varphi(x) \int_x^D f e^C/a,
\]
we get
\[
f_2^{(x_0)}(x\wedge x_0) \ge \varphi(x\wedge x_0)\,\varphi(x_0) \int_{x_0}^D e^C/a
\]
and so
\[
\sup_{x\in(0,D)} \Big[f_1^{(x_0)}/f_2^{(x_0)}\Big](x\wedge x_0) = \sup_{x\in(0,x_0)} \Big[f_1^{(x_0)}/f_2^{(x_0)}\Big](x) \le \bigg[\varphi(x_0) \int_{x_0}^D e^C/a\bigg]^{-1}.
\]
This implies that $C_1 \ge B$. Here, the reason one needs the local procedure “stopping at $x_0$” is the possibility that $\varphi \notin L^1(\pi)$, which then implies that $D(\varphi) = \infty$.

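Finally, we prove the monotonicity of the $C_n$'s. Applying the Mean Value Theorem twice, we obtain
\[
\begin{aligned}
\sup_{x\in(0,D)} \Big[f_n^{(x_0)}/f_{n+1}^{(x_0)}\Big](x\wedge x_0)
&= \sup_{x\in(0,x_0)} \Big[f_n^{(x_0)}/f_{n+1}^{(x_0)}\Big](x)
\le \sup_{x\in(0,x_0)} \Big[f_n^{(x_0)\prime}/f_{n+1}^{(x_0)\prime}\Big](x) \\
&\le \sup_{x\in(0,x_0)} \int_x^D f_{n-1}^{(x_0)}(\cdot\wedge x_0)\,e^C a^{-1} \bigg/ \int_x^D f_n^{(x_0)}(\cdot\wedge x_0)\,e^C a^{-1} \\
&\le \sup_{x\in(0,D)} \Big[f_{n-1}^{(x_0)}/f_n^{(x_0)}\Big](x\wedge x_0).
\end{aligned}
\]
This implies that $C_n \ge C_{n-1}$. The inequality $A \ge C_n \ge B$ comes from (6.12).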


Chapter 7

Functional Inequalities

This chapter deals with some inequalities that are stronger or weaker than the Poincaré one, called functional inequalities for simplicity. Equivalently, we are studying some types of convergence that are stronger or weaker than the exponential one. Correspondingly, this chapter is divided into two parts.

In the first part, we discuss some types of stronger convergence. We show how to go to Banach (Orlicz) spaces, starting from Hilbert space ($L^2$-space), in the higher-dimensional situation. This part is an extension of the main results obtained in the last chapter. There are three sections. In §7.1, we state the results. Their proofs are sketched in §7.2. In §7.3, we compare the capacitary method used here with Cheeger's method.

In the next part, we discuss some weaker (slower) types of convergence. There are four sections. In §7.4, we examine the general convergence speed, and then two functional inequalities are introduced in §7.5. In §7.6, we discuss the algebraic convergence. The general (irreversible) case is discussed in the last section, §7.7.

7.1 Statement of the results

Let $E$ be a locally compact separable metric space with Borel $\sigma$-algebra $\mathscr E$, $\mu$ be an everywhere dense Radon measure on $E$ and $(D, \mathscr D(D))$ be a regular Dirichlet form on $L^2(\mu) = L^2(E; \mu)$. The starting point of our study is the following result, which is a copy of Theorem 4.7.

Theorem 7.1. For a regular transient Dirichlet form $(D, \mathscr D(D))$, the optimal constant $A$ in the Poincaré inequality
\[
\|f\|^2 = \int_E f^2\,d\mu \le A D(f), \qquad f \in \mathscr D(D) \cap C_0(E) \tag{7.1}
\]
satisfies $B \le A \le 4B$, where
\[
B = \sup_{\text{compact } K} \frac{\mu(K)}{\mathrm{Cap}(K)}. \tag{7.2}
\]


Recall that
\[
\mathrm{Cap}(K) = \inf\big\{D(f) : f \in \mathscr D(D) \cap C_0(E),\ f|_K \ge 1\big\},
\]
where $C_0(E)$ is the set of continuous functions with compact support. Certainly, in (7.1), one may replace “$\mathscr D(D) \cap C_0(E)$” by “$\mathscr D(D)$” or by the extended Dirichlet space “$\mathscr D_e(D)$”, which is the set of $\mathscr E$-measurable functions $f$ such that $|f| < \infty$, $\mu$-a.e., and there exists a sequence $\{f_n\} \subset \mathscr D(D)$ such that $D(f_n - f_m) \to 0$ as $n, m \to \infty$ and $\lim_{n\to\infty} f_n = f$, $\mu$-a.e.

Refer to the standard books by M. Fukushima, Y. Oshima and M. Takeda (1994) and by Ma and Röckner (1992) for some preliminary facts about the theory of Dirichlet forms.

Actually, the study of inequality (7.1) on the half-line ($E = \mathbb R_+$) was begun by G. H. Hardy in 1920 and completed by B. Muckenhoupt in 1972 [see also B. Opic and A. Kufner (1990)] with the explicit isoperimetric constant $B$.

The first goal of this chapter is to extend (7.2) to the Poincaré-type inequality
\[
\big\|f^2\big\|_{\mathbb B} \le A_{\mathbb B} D(f), \qquad f \in \mathscr D(D) \cap C_0(E) \tag{7.3}
\]

for a class of Banach spaces $(\mathbb B, \|\cdot\|_{\mathbb B}, \mu)$ of real functions on $E$. To do so, we need the following assumptions on $(\mathbb B, \|\cdot\|_{\mathbb B}, \mu)$:

(H1) Transient case: $I_K \in \mathbb B$ for all compact $K$. Ergodic case: $1 \in \mathbb B$.

(H2) If $h \in \mathbb B$ and $|f| \le h$, then $f \in \mathbb B$.

(H3) $\|f\|_{\mathbb B} = \sup_{g\in\mathscr G} \int_E |f|\,g\,d\mu$,

where $\mathscr G$, to be specified case by case, is a class of non-negative $\mathscr E$-measurable functions. Unless otherwise stated, these assumptions will be used throughout this and the next sections. Note that part (4) of condition (6.16) is ignored here.

We can now state our first result as follows.

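Theorem 7.2. For a regular transient Dirichlet form $(D, \mathscr D(D))$, the optimal constant $A_{\mathbb B}$ in (7.3) satisfies
\[
B_{\mathbb B} \le A_{\mathbb B} \le 4B_{\mathbb B}, \tag{7.4}
\]
where the isoperimetric constant $B_{\mathbb B}$ is given by
\[
B_{\mathbb B} = \sup_{\text{compact } K} \frac{\|I_K\|_{\mathbb B}}{\mathrm{Cap}(K)}. \tag{7.5}
\]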

Next, we go to the ergodic case. Assume that $\mu(E) < \infty$ and set $\pi = \mu/\mu(E)$. Throughout this chapter, we use a simplified notation: $\bar f = f - \pi(f)$, where $\pi(f) = \int f\,d\pi$. We adopt a splitting technique. Let $E_1 \subset E$ be open with $\pi(E_1) \in (0, 1)$ and write $E_2 = E_1^c \setminus \partial E_1$.


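The restriction of $\mathbb B$ to $E_i$ gives us $(\mathbb B^i, \|\cdot\|_{\mathbb B^i}, \mu^i)$:
\[
\begin{gathered}
\mathbb B^i = \{f|_{E_i} : f \in \mathbb B\}, \qquad \mu^i = \mu|_{E_i}, \qquad \mathscr G^i = \{g|_{E_i} : g \in \mathscr G\}, \\
\|f\|_{\mathbb B^i} = \sup_{g\in\mathscr G^i} \int_{E_i} |f|\,g\,d\mu^i = \sup_{g\in\mathscr G} \int_{E_i} |f|\,g\,d\mu, \qquad i = 1, 2.
\end{gathered}
\]
Correspondingly, we have a restricted Dirichlet form $\big(D, \mathscr D(D)|_{E_i}\big)$ on $E_i$ and the corresponding constants $A_{\mathbb B^i}$ and $B_{\mathbb B^i}$ ($i = 1, 2$) given by Theorem 7.2.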

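Denote by $c_1$ a constant such that
\[
|\pi(f)| \le c_1\|f\|_{\mathbb B}, \qquad f \in \mathbb B. \tag{7.6}
\]
For each $G \subset E$, denote by $c_2(G)$ a constant such that
\[
|\pi(fI_G)| \le c_2(G)\|fI_G\|_{\mathbb B}, \qquad f \in \mathbb B. \tag{7.7}
\]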

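Theorem 7.3. Let $(D, \mathscr D(D))$ be a regular Dirichlet form. Assume that $\mu(E) < \infty$ and there exists an open $E_1$ such that $\mu(\partial E_1) = 0$ and $c_2(E_i)\pi(E_i)\|1\|_{\mathbb B} < 1$, $i = 1, 2$, where $E_2 = E_1^c \setminus \partial E_1$. Then the optimal constant $\bar A_{\mathbb B}$ in the Poincaré-type inequality
\[
\big\|\bar f^{\,2}\big\|_{\mathbb B} \le \bar A_{\mathbb B} D(f), \qquad f \in \mathscr D(D) \cap C_0(E) \tag{7.8}
\]
satisfies
\[
\bar A_{\mathbb B} \ge (K_1 \wedge K_2)\big(A_{\mathbb B^1} \vee A_{\mathbb B^2}\big) \ge (K_1 \wedge K_2)\big(B_{\mathbb B^1} \vee B_{\mathbb B^2}\big), \tag{7.9}
\]
\[
\bar A_{\mathbb B} \le 4\Big(1 + \sqrt{c_1\|1\|_{\mathbb B}}\Big)^2 \big(A_{\mathbb B^1} \vee A_{\mathbb B^2}\big) \le 16\Big(1 + \sqrt{c_1\|1\|_{\mathbb B}}\Big)^2 \big(B_{\mathbb B^1} \vee B_{\mathbb B^2}\big), \tag{7.10}
\]
where $K_i = \Big(1 - \sqrt{c_2(E_i)\pi(E_i)\|1\|_{\mathbb B}}\Big)^2$, $i = 1, 2$.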

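For the logarithmic Sobolev inequality
\[
\int_E f^2\log\big[f^2/\pi\big(f^2\big)\big]\,d\mu \le A_{LS} D(f), \qquad f \in \mathscr D(D) \cap C_0(E), \tag{7.11}
\]
we have a more explicit result. To see this, let $\Phi(x) = |x|\log(1 + |x|)$. Define the Orlicz space $\big(\mathbb B = L^\Phi(\mu), \|\cdot\|_{\mathbb B} = \|\cdot\|_\Phi\big)$ with $N$-function $\Phi$ as follows:
\[
L^\Phi(\mu) = \Big\{f : \int_E \Phi(f)\,d\mu < \infty\Big\}, \tag{7.12}
\]
\[
\|f\|_\Phi = \sup\Big\{\int_E |f|\,g\,d\mu : \int_E \Phi_c(g)\,d\mu \le 1\Big\}, \tag{7.13}
\]
where $\Phi_c$ is the complementary function of $\Phi$.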


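Theorem 7.4. Let $(D, \mathscr D(D))$ be a regular Dirichlet form. Assume that $\mu(E) < \infty$ and there exists an open $E_1$ such that $\mu(\partial E_1) = 0$ and $\pi(E_i) \in (0, 1/2]$, $i = 1, 2$, where $E_2 = E_1^c \setminus \partial E_1$. Then the optimal constant $A_{LS}$ in (7.11) satisfies
\[
\frac{(\sqrt2 - 1)^2}{5}\big(B^1_\Phi \vee B^2_\Phi\big) \le A_{LS} \le \frac{204}{5}\Big[1 + \Phi^{-1}\big(\mu(E)^{-1}\big)\Big]^2 \big(B^1_\Phi \vee B^2_\Phi\big), \tag{7.14}
\]
where
\[
B^i_\Phi = \sup_{\text{compact } K\subset E_i} \frac{M(\mu(K))}{\mathrm{Cap}(K)}, \qquad i = 1, 2, \tag{7.15}
\]
\[
M(x) = \frac12\Big(\sqrt{1 + 4x} - 1\Big) + x\log\bigg(1 + \frac{1 + \sqrt{1 + 4x}}{2x}\bigg) \sim x\log x^{-1}
\]
as $x \to 0$.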

7.2 Sketch of the proofs

The key to proving Theorem 7.1 is the following result.

Theorem 7.5. For a regular transient Dirichlet form $(D, \mathscr D(D))$, we have
\[
\int_0^\infty \mathrm{Cap}\big(\{x \in E : |f(x)| \ge t\}\big)\,d\big(t^2\big) \le 4D(f), \qquad f \in \mathscr D(D) \cap C_0(E).
\]

Proof. The simplified proof given here is due to M. Fukushima and T. Uemura (2002, Theorem 2.1). This proof requires more knowledge about Dirichlet forms; refer to the books by Fukushima et al. (1994) and by Ma and Röckner (1992).

Let $f \in \mathscr D(D) \cap C_0(E)$ and set $N_t = \{|f| \ge t\}$. Then there exist $e(t) \in \mathscr D_e(D)$, with $e(t) = 1$ q.e. on $N_t$, and a measure $\mu_t$ such that
\[
\mathrm{Cap}(N_t) = \mu_t(N_t) = D(e(t)), \tag{7.16}
\]
\[
D(e(t), g) = \int_{N_t} g\,d\mu_t, \qquad g \in \mathscr D_e(D). \tag{7.17}
\]
Since $e(s) = 1$, $0 \le s \le t$, q.e. on $N_t$, by (7.17), we have
\[
D(e(t), e(s)) = \mu_t(N_t) = D(e(t)). \tag{7.18}
\]
Next, define $\|g\|_D = \sqrt{D(g)}$. By (7.18), for $s \le t$ we have
\[
\|e(s) - e(t)\|_D^2 = \mathrm{Cap}(N_s) - \mathrm{Cap}(N_t).
\]
Thus, $\|e(t)\|_D$ is measurable in $t$ since $\mathrm{Cap}$ is right-continuous. On the other hand, by (7.16), we have
\[
\int_0^\infty \|e(t)\|_D\,dt = \int_0^\infty \sqrt{\mathrm{Cap}(N_t)}\,dt \le \int_0^{\|f\|_\infty} \sqrt{\mathrm{Cap}(\mathrm{Supp}(f))}\,dt = \|f\|_\infty \sqrt{\mathrm{Cap}(\mathrm{Supp}(f))} < \infty.
\]


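We can define the Bochner integral $\psi = \int_0^\infty e(t)\, dt$. Moreover,

$$D(\psi, g) = \int_0^\infty D(e(t), g)\, dt, \qquad g \in \mathscr{D}_e(D). \tag{7.19}$$

Having these preparations, we can now prove our assertion.

$$\begin{aligned}
\int_0^\infty \mathrm{Cap}(N_t)\, d(t^2) &= 2\int_0^\infty t\, \mathrm{Cap}(N_t)\, dt = 2\int_0^\infty t\, \mu_t(N_t)\, dt && \text{(by (7.16))} \\
&\le 2\int_0^\infty t \cdot \frac{1}{t} \int_{N_t} |f|\, d\mu_t\, dt && \text{(since } |f|/t \ge 1 \text{ on } N_t\text{)} \\
&= 2\int_0^\infty D(e(t), |f|)\, dt && \text{(by (7.17))} \\
&= 2 D(\psi, |f|) && \text{(by (7.19))} \\
&\le 2\sqrt{D(\psi)\, D(|f|)} && \text{(by Schwarz inequality)} \\
&\le 2\sqrt{D(\psi)\, D(f)}.
\end{aligned}$$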

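But

$$\begin{aligned}
D(\psi) &= D\Big(\int_0^\infty e(t)\, dt, \int_0^\infty e(s)\, ds\Big) \\
&= \int_0^\infty\!\!\int_0^\infty D(e(t), e(s))\, dt\, ds = 2\int_0^\infty ds \int_0^s D(e(t), e(s))\, dt \\
&= 2\int_0^\infty ds \int_0^s D(e(s))\, dt && \text{(by (7.18))} \\
&= 2\int_0^\infty s\, D(e(s))\, ds = 2\int_0^\infty s\, \mathrm{Cap}(N_s)\, ds \\
&= \int_0^\infty \mathrm{Cap}(N_t)\, d(t^2),
\end{aligned}$$

and so the required assertion follows: writing $I = \int_0^\infty \mathrm{Cap}(N_t)\, d(t^2)$, the two displays give $I \le 2\sqrt{I\, D(f)}$, that is, $I \le 4 D(f)$.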

Having Theorem 7.5 in mind, the proof of Theorem 4.2 (which is more general than Theorem 4.1) is quite standard. Here we follow the proof of A. Kaimanovich (1992, Theorem 3.1).

Proof of Theorem 4.2. Let $f \in \mathscr{D}(D) \cap C_0(E)$ and set $N_t = \{|f| \ge t\}$. Since $N_t$ is compact, by (H1), $I_{N_t} \in \mathbb B$. Next, since $|f| \le \|f\|_\infty I_{\mathrm{Supp}(f)}$, by (H1) and (H2), $f^2 \in \mathbb B$. Note that

$$\int_0^\infty I_{N_t}\, d(t^2) = 2\int_0^\infty t\, I_{\{|f| \ge t\}}\, dt = 2\int_0^{|f|} t\, dt = f^2 \qquad \text{(Co-area formula)}.$$
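The co-area identity above is pointwise in $x$, so it can be illustrated for a single value of $f(x)$. The sketch below is an assumption-laden illustration only: the name `layer_cake_square`, the midpoint rule and the truncation of the integral at $|f(x)| + 1$ are all choices made here.

```python
def layer_cake_square(value: float, steps: int = 200_000) -> float:
    """Approximate 2 * integral_0^inf t * 1{|value| >= t} dt by the midpoint rule."""
    upper = abs(value) + 1.0      # the indicator vanishes beyond |value|
    h = upper / steps
    total = 0.0
    for k in range(steps):
        t = (k + 0.5) * h         # midpoint of the k-th cell
        if abs(value) >= t:
            total += 2.0 * t * h
    return total

if __name__ == "__main__":
    for v in (0.0, 0.3, -1.7, 4.0):
        print(f"f(x) = {v:5.2f}   integral = {layer_cake_square(v):.5f}   f(x)^2 = {v * v:.5f}")
```

Only the cell containing $|f(x)|$ contributes a discretization error, which is why a plain midpoint rule is enough here.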


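Since $N_t$ is compact, by the definition of $B_{\mathbb B}$ and Theorem 7.5, we obtain

$$\begin{aligned}
\big\|f^2\big\|_{\mathbb B} &= \sup_{g \in \mathscr{G}} \int_E f^2 g\, d\mu = \sup_{g \in \mathscr{G}} \int_E \Big(\int_0^\infty I_{N_t}\, d(t^2)\Big) g\, d\mu \\
&= \sup_{g \in \mathscr{G}} \int_0^\infty \Big(\int_E I_{N_t}\, g\, d\mu\Big)\, d(t^2) \\
&\le \int_0^\infty \big\|I_{N_t}\big\|_{\mathbb B}\, d(t^2) \\
&\le B_{\mathbb B} \int_0^\infty \mathrm{Cap}(N_t)\, d(t^2) \\
&\le 4 B_{\mathbb B}\, D(f).
\end{aligned}$$

This implies that $A_{\mathbb B} \le 4 B_{\mathbb B}$. Next, for every compact $K$ and any function $f \in \mathscr{D}(D) \cap C_0(E)$ with $f|_K \ge 1$, we have

$$\big\|I_K\big\|_{\mathbb B} \le \big\|f^2\big\|_{\mathbb B} \le A_{\mathbb B}\, D(f).$$

Thus,

$$\big\|I_K\big\|_{\mathbb B} \le A_{\mathbb B} \inf\{D(f) : f \in \mathscr{D}(D) \cap C_0(E),\ f|_K \ge 1\} = A_{\mathbb B}\, \mathrm{Cap}(K).$$

Taking the supremum with respect to $K$, it follows that $B_{\mathbb B} \le A_{\mathbb B}$ and the proof is completed.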

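The proof of Theorem 7.3 is based on the splitting technique and the following result.

Proposition 7.6. Let $(E, \mathscr{E}, \pi)$ be a probability space and $(\mathbb B, \|\cdot\|_{\mathbb B})$ be a Banach space, satisfying (H1) and (H2), of Borel measurable functions on $(E, \mathscr{E}, \pi)$. Write $\bar f = f - \pi(f)$.

(1) Let $c_1$ be given by (4.6). Then

$$\big\|\bar f^{\,2}\big\|_{\mathbb B} \le \Big(1 + \sqrt{c_1\|1\|_{\mathbb B}}\Big)^2 \big\|f^2\big\|_{\mathbb B}.$$

(2) Let $c_2(G)$ be given by (4.7). If $c_2(G)\pi(G)\|1\|_{\mathbb B} < 1$, then for every $f$ with $f|_{G^c} = 0$, we have

$$\big\|f^2\big\|_{\mathbb B} \le \big\|\bar f^{\,2}\big\|_{\mathbb B} \Big/ \Big[1 - \sqrt{c_2(G)\pi(G)\|1\|_{\mathbb B}}\Big]^2.$$

Proof. Note that $\pi(f)^2 \le \pi(f^2) \le c_1\big\|f^2\big\|_{\mathbb B}$. For all $p, q > 1$ with $(p - 1)(q - 1) = 1$, we have $(x + y)^2 \le p x^2 + q y^2$, and so

$$\big\|\bar f^{\,2}\big\|_{\mathbb B} \le p\big\|f^2\big\|_{\mathbb B} + q\,\pi(f)^2\|1\|_{\mathbb B} \le \big(p + c_1 q\|1\|_{\mathbb B}\big)\big\|f^2\big\|_{\mathbb B}.$$

Minimizing the right-hand side with respect to $p$ and $q$, we get the first assertion. The proof of the second one is similar.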
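The minimization in the proof of Proposition 7.6 can be made explicit: with $c = c_1\|1\|_{\mathbb B}$ and $q = 1 + 1/(p-1)$, the minimum of $p + cq$ over $p > 1$ is attained at $p = 1 + \sqrt{c}$ and equals $(1 + \sqrt{c})^2$. The sketch below compares this closed form with a crude grid search; the function names and the grid range are assumptions of the illustration, not part of the text.

```python
import math

def objective(p: float, c: float) -> float:
    """p + c*q, where q is determined by (p - 1)(q - 1) = 1 and p > 1."""
    q = 1.0 + 1.0 / (p - 1.0)
    return p + c * q

def grid_minimum(c: float, points: int = 100_000) -> float:
    """Crude grid search for the minimum over p in (1, 1 + 10*(1 + sqrt(c))]."""
    width = 10.0 * (1.0 + math.sqrt(c))
    return min(objective(1.0 + width * k / points, c) for k in range(1, points + 1))

def closed_form(c: float) -> float:
    """Claimed minimum (1 + sqrt(c))^2, attained at p = 1 + sqrt(c)."""
    return (1.0 + math.sqrt(c)) ** 2

if __name__ == "__main__":
    for c in (0.25, 1.0, 7.5):
        print(f"c = {c:4.2f}   grid = {grid_minimum(c):.6f}   closed form = {closed_form(c):.6f}")
```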

The proof of Theorem 7.4 needs some different consideration, based on Lemma 6.10.


7.3 Comparison with Cheeger’s method

A typical case for which one needs the Banach form of the Poincaré-type inequality is the $F$-Sobolev inequality [cf. F. Y. Wang (2000a), F. Z. Gong and F. Y. Wang (2002)]:

$$\int_E f^2 F\big(f^2\big)\, d\mu \le A_F\, D(f), \qquad f \in \mathscr{D}(D) \cap C_0(E). \tag{7.20}$$

Theorem 7.7. Let $F : \mathbb{R}_+ \to \mathbb{R}_+$ satisfy the following conditions:

(1) $2F' + xF'' \ge 0$ on $[0, \infty)$.

(2) $\lim_{x \to 0} F(x) = 0$ and $\lim_{x \to \infty} F(x) = \infty$.

(3) $\sup_{x \gg 1} x F'(x)/F(x) < \infty$.

Then Theorem 4.2 is valid for the Orlicz space with $N$-function $\Phi(x) = |x| F(|x|)$. Furthermore, the isoperimetric constant is given by

$$B_{\Phi} = \sup_{\text{compact } K} \frac{\alpha_*(K)^{-1} + \mu(K) F(\alpha_*(K))}{\mathrm{Cap}(K)}, \tag{7.21}$$

where $\alpha_*(K)$ is the minimal positive root of $\alpha^2 F'(\alpha) = \mu(K)$.

To compare this result with the generalized Cheeger's method, let us recall the symmetric form

$$D^{(\alpha)}(f) = \frac{1}{2}\int J^{(\alpha)}(dx, dy)\,[f(y) - f(x)]^2 + \int K^{(\alpha)}(dx)\, f(x)^2, \qquad \alpha \ge 0,$$

as defined in §4.5, satisfying the normalizing condition

$$\big[J^{(1)}(dx, E) + K^{(1)}(dx)\big]/\pi(dx) \le 1.$$

Next, define

$$\lambda_0^{(\alpha)} = \inf\big\{D^{(\alpha)}(f) : \|f\| = 1\big\}, \qquad c_1 = \sup_{x > 0} \frac{x F'_{\pm}(x)}{F(x)} < \infty.$$

The next result is taken from Chen (2000a).

Theorem 7.8. The optimal $A_F$ in (7.20) satisfies

$$A_F \ge \sup_{\pi(G) > 0} \frac{\pi(G)\, F\big(\pi(G)^{-1}\big)}{J(G \times G^c) + K(G)}, \tag{7.22}$$

$$A_F \le \sup_{\pi(G) > 0} \frac{4\big(1 + c_1^2\big)\big(2 - \lambda_0^{(1)}\big)\, \pi(G)^2\, F\big(\pi(G)^{-1}\big)}{\big[J^{(1/2)}(G \times G^c) + K^{(1/2)}(G)\big]^2}. \tag{7.23}$$

Comparing (7.23) with (7.21) and (4.4), it is clear that Cheeger's method is more explicit but less precise than the capacitary method.

Besides, it should also be clear that these two methods are much less explicit than the one-dimensional results studied in Chapter 6. The main reason is that in the latter case, our starting point given in §6.2 is much more precise than Theorem 4.1.


Comments on F-Sobolev inequalities

The F -Sobolev inequalities are important in the following sense.

(1) It was proved by F. Y. Wang (2000a) and extended by F. Z. Gong and F. Y. Wang (2002) that if the essential spectrum of the generator of the process is empty, then the $F$-Sobolev inequality holds for a suitable function $F$. Here and in what follows, we are talking about the ergodic case only. The converse assertion holds once there exists a transition probability density with respect to the reversible probability measure.

(2) The $F$-Sobolev inequalities are used by F. Y. Wang (1999a) to estimate the higher eigenvalues $\lambda_j$, $j \ge 1$, not only the first one.

(3) Recently, F. Y. Wang (2002) proved that the inequalities for suitable $F$ are equivalent to the Beckner-type inequalities, which are additive and hence are useful in the infinite-dimensional situation to study the perturbation of independent systems.

(4) Clearly, one may regard some $F$-Sobolev inequalities as interpolations between the logarithmic Sobolev inequality and the Poincaré inequality. This is meaningful, especially for Markov jump processes, to give some sufficient conditions for the exponential convergence in entropy.

7.4 General convergence speed

The aim of this section is to derive an inequality for general convergence speed for reversible Markov processes.

Let $\xi(t) \downarrow 0$ as $t \uparrow \infty$. Consider the general decay

$$\big\|\bar P_t f\big\|^2 = \|P_t f - \pi(f)\|^2 \le C V(f)\, \xi(t), \tag{7.24}$$

where $\bar P_t = P_t - \pi$, $C$ is a constant and $V$ is a suitable functional to be discussed below more carefully. On the right-hand side of (7.24), the variables $t$ and $f$ are separated. Of course, we are mainly looking for such a simple control, rather than a complicated expression. Now, a question arises: what is the analog of the Poincaré inequality for such decay?

Note that if we define

$$V(f) = \sup_{t > 0} \xi(t)^{-1} \big\|\bar P_t f\big\|^2,$$

then it is clear that the functional $V$ should be homogeneous of degree two:

$$V(\alpha f + \beta) = \alpha^2 V(f), \tag{7.25}$$

which is the main condition we need for the functional $V$. Next, if we define $V(f) = \|f\|^2$, then we will return to the exponential convergence $\xi(t) \sim e^{-\varepsilon t}$.


We now continue our search for an analog of the Poincaré inequality. It is a simple fact that

$$0 \le \frac{1}{t}(f - P_t f, f) = \int_{[0, \infty)} \frac{1 - e^{-\alpha t}}{t}\, d(E_\alpha f, f) \uparrow D(f), \qquad \text{as } t \downarrow 0,$$

where $\{E_\alpha\}_{\alpha \ge 0}$ is the spectral family of the semigroup, since $(1 - e^{-\alpha t})/t \uparrow$ as $t \downarrow$. Because $(P_t f, f)$ is decreasing in $t$, there exists a nonnegative, increasing function $\eta$ ($\eta(r) = r$, for example) such that $\eta(r)/r \uparrow 1$ as $r \downarrow 0$, and then for each $f$ with $\pi(f) = 0$, we have

$$\|f\|^2 - \eta(t) D(f) \le (P_t f, f) \le \|P_t f\|\, \|f\| \le \sqrt{C V(f)\, \xi(t)}\, \|f\|,$$

by assumption. Solving this inequality in $\|f\|$, we get

$$\|f\| \le \frac{1}{2}\Big[\sqrt{C V(f)\, \xi(t)} + \sqrt{C V(f)\, \xi(t) + 4\eta(t) D(f)}\Big].$$

Therefore, we obtain the required inequality

$$\|f - \pi(f)\|^2 \le \eta(t) D(f) + C' V(f)\, \xi(t), \tag{7.26}$$

where $C'$ is a constant.

In particular, if we set $\xi(t) = t^{1-q}$ ($q > 1$) and $\eta(t) = t$, then by optimizing the right-hand side of (7.26) with respect to $t$, we obtain the following Liggett–Stroock inequality:

$$\|f - \pi(f)\|^2 \le C'' D(f)^{1/p}\, V(f)^{1/q}, \tag{7.27}$$

where $1/p + 1/q = 1$. Refer to the proof of Theorem 5.10.
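The optimization leading to the Liggett–Stroock inequality can be carried out by hand: for $\eta(t) = t$ and $\xi(t) = t^{1-q}$, the function $t \mapsto tD + C'V t^{1-q}$ is minimized at $t_* = ((q-1)C'V/D)^{1/q}$, with minimum $p\,((q-1)C')^{1/q} D^{1/p} V^{1/q}$, where $p = q/(q-1)$. The sketch below checks this numerically; the function names and the sample values of $D$, $V$, $C'$ and $q$ are hypothetical and serve only as an illustration.

```python
def rhs(t: float, D: float, V: float, C1: float, q: float) -> float:
    """Right-hand side t*D + C1*V*t^(1-q), with eta(t) = t and xi(t) = t^(1-q)."""
    return t * D + C1 * V * t ** (1.0 - q)

def optimal_t(D: float, V: float, C1: float, q: float) -> float:
    """Stationary point of rhs in t, from D = (q-1)*C1*V*t^(-q)."""
    return ((q - 1.0) * C1 * V / D) ** (1.0 / q)

def liggett_stroock_bound(D: float, V: float, C1: float, q: float) -> float:
    """Closed form of the minimum: p*((q-1)*C1)^(1/q) * D^(1/p) * V^(1/q)."""
    p = q / (q - 1.0)
    return p * ((q - 1.0) * C1) ** (1.0 / q) * D ** (1.0 / p) * V ** (1.0 / q)

if __name__ == "__main__":
    D, V, C1, q = 0.3, 2.0, 1.5, 3.0   # hypothetical sample values
    t_star = optimal_t(D, V, C1, q)
    print("t* =", t_star)
    print("rhs(t*) =", rhs(t_star, D, V, C1, q))
    print("closed form =", liggett_stroock_bound(D, V, C1, q))
```

In particular, under these assumptions the constant in (7.27) can be taken as $C'' = p\,((q-1)C')^{1/q}$.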

7.5 Two functional inequalities

Let us return to (7.26). By a transform if necessary, without loss of generality, we may assume that $V(f) = 1$. Then the right-hand side of (7.26) becomes $\eta(t) D(f) + C' \xi(t)$. Define $\Phi(x) = \inf_{r > 0}[\eta(r) x + C' \xi(r)]$, $x \ge 0$. Then inequality (7.26) takes the following more compact form:

$$\|f - \pi(f)\|^2 \le \Phi(D(f)), \qquad V(f) = 1.$$

However, this inequality is not practical since $\Phi$ is not explicit. The trick now is to regard $t$ as a new parameter $r$. Then, we can rewrite (7.26) as follows.

$$\|f - \pi(f)\|^2 \le \eta(r) D(f) + C' V(f)\, \xi(r), \qquad r > 0 \tag{7.28}$$

Before moving further, we show that it is easy to go back to (7.24) from (7.28). Let $\pi(f) = 0$ and set $f_t = P_t f$, $F_t = \pi\big(f_t^2\big)$. Assuming that the semigroup is $V$-contractive in the sense that $V(f_t) \le V(f)$ for all $f$, then, by (7.28), we have

$$F_t' = -2D(f_t) \le \frac{2C' V(f_t)\, \xi(r)}{\eta(r)} - \frac{2}{\eta(r)} F_t \le \frac{2C' V(f)\, \xi(r)}{\eta(r)} - \frac{2}{\eta(r)} F_t, \qquad t \ge 0,\ r > 0.$$

By the Gronwall lemma,

$$\begin{aligned}
F_t &\le F_0\, e^{-2t/\eta(r)} + \frac{2C' V(f)\, \xi(r)}{\eta(r)} \int_0^t e^{-2(t-s)/\eta(r)}\, ds \\
&\le \|f\|^2 e^{-2t/\eta(r)} + C' V(f)\, \xi(r), \qquad t \ge 0,\ r > 0.
\end{aligned}$$

Define

$$r(t) = \inf\Big\{r > 0 : -\frac{1}{2}\eta(r) \log \xi(r) \le t\Big\}, \qquad \tilde\xi(t) = \xi(r(t)), \qquad \tilde V(f) = C' V(f) + \|f\|^2.$$

Then, as $t \uparrow \infty$, $r(t) \uparrow \infty$ and so $\tilde\xi(t) \downarrow 0$. Moreover,

$$\|P_t f\|^2 = F_t \le \tilde V(f)\, \tilde\xi(t), \qquad t \ge 0,$$

which gives us the required decay (7.24).

As an application of (7.28), by setting $\eta(r) = r$ and $V(f) = \pi(|f|)^2$, F. Y. Wang (2000a) introduced the so-called super-Poincaré inequality

$$\|f\|^2 \le r D(f) + \beta(r)\, \pi(|f|)^2, \qquad \forall r > 0, \tag{7.29}$$

where $\beta(r) \downarrow$ as $r \uparrow$. The reason for choosing $V(f) = \pi(|f|)^2$ comes from the fact that the ordinary Poincaré inequality is equivalent to

$$\|f\|^2 \le C D(f) + \pi(|f|)^2$$

for some constant $C > 0$. It was also proved in F. Y. Wang (2000a) and F. Z. Gong and F. Y. Wang (2002) that (7.29) is equivalent to the $F$-Sobolev inequality (ergodic case)

$$\int_E f^2 F\big(f^2\big)\, d\pi \le C D(f), \qquad \|f\| = 1, \tag{7.30}$$

where $F$ satisfies $\sup_{r \in (0,1]} |r F(r)| < \infty$ and $\lim_{r \to \infty} F(r) = \infty$. The equivalence of (7.29) and (7.30) provides us not only with a more intrinsic understanding of these two inequalities but also with convenience in practice: we can use whichever of them is more convenient.

Next, we are going to look for a slower convergence. Again, we start from (7.28). Note that the use of the new parameter $r$ is mainly for our convenience. Actually, the right-hand side of (7.28) plays a role only at a point $r_0$ at which the right-hand side of (7.28) achieves the infimum. Note also that when $V = \|\cdot\|_\infty^2$ and $\eta(r) = r$ for instance, we have $\lim_{r \to 0} \beta(r) = \infty$. This singularity makes (7.29) stronger for smaller $r$, and causes some difficulty in the applications. However, it is clear that using different pairs $(\eta, \xi)$ on the right-hand side of (7.28), one may obtain the same (or an equivalent) inequality. Based on these observations, by exchanging the functions $r$ and $\beta(r)$, M. Röckner and F. Y. Wang (2001) introduced the so-called weaker-Poincaré inequality (WPI) as follows.

$$\|f - \pi(f)\|^2 \le \alpha(r) D(f) + r V(f), \qquad \forall r > 0, \tag{7.31}$$

where $\alpha(r) \ge 0$, $\alpha(r) \downarrow$ as $r \uparrow$ on $(0, \infty)$. It was proved in the quoted paper that when $V = \|\cdot\|_\infty^2$, (7.31) is equivalent to the Kusuoka–Aida weak spectral gap property [cf. S. Aida (1998)]:

For every sequence $\{f_n\} \subset \mathscr{D}(D)$ with $\pi(f_n) = 0$, $\|f_n\| \le 1$ and $\lim_{n \to \infty} D(f_n) = 0$, we have $f_n \to 0$ in P.

Here is one of the main results about WPI.

Theorem 7.9 (M. Röckner and F. Y. Wang, 2001).

(1) If $\|P_t f - \pi(f)\|^2 \le \xi(t) V(f)$ for all $t > 0$ and $\xi(t) \downarrow 0$ as $t \uparrow \infty$, then WPI holds with the same $V$ and

$$\alpha(r) = 2r \inf_{s > 0} \frac{1}{s}\, \xi^{-1}\big(s e^{1 - s/r}\big), \qquad \xi^{-1}(t) := \inf\{r > 0 : \xi(r) \le t\}.$$

(2) If WPI holds and $V(P_t f) \le V(f)$ for all $t \ge 0$, then

$$\|P_t f - \pi(f)\|^2 \le \xi(t)\big[V(f) + \|f - \pi(f)\|^2\big], \qquad \forall t > 0,$$

where $\xi(t) = \inf\{r > 0 : -\alpha(r) \log r / 2 \le t\}$.

In order to establish WPI, the generalized Cheeger's (isoperimetric) method is also very powerful (cf. Röckner and Wang (2001)).

7.6 Algebraic convergence

We now return to the Liggett–Stroock inequality (7.27). Instead of stating some general but technical theorems, we introduce only two examples, from which one can see the role played by different functionals $V$. Examples are the leading light of our study. Every meaningful theorem should be supported by a good example.

Example 7.10 (Chen and Y. Z. Wang, 2000). Consider the birth–death process with rates $a_i = b_i = i^\gamma$ for large $i$ ($i \gg 1$) and $\gamma > 0$. The process is ergodic iff $\gamma > 1$.

(1) Let $\gamma > 1$. Then $\lambda_1 > 0$ iff $\gamma \ge 2$. In other words, with respect to $V(f) = \|f - \pi(f)\|^2$, the process has $L^2$-algebraic decay iff $\gamma \ge 2$.


(2) Let $\gamma \in (1, 2)$. Then with respect to $V^s$: $V^s(f) = \sup_{k \ge 0}\big[(k + 1)^s |f_{k+1} - f_k|\big]^2$, where $0 < s \le \gamma - 1$, the process has $L^2$-algebraic decay iff $\gamma \in (5/3, 2)$.

(3) Let $\gamma \in (1, 2)$. Then with respect to $V_0$: $V_0(f) = \sup_{i \ne j}(f_i - f_j)^2$, the process has $L^2$-algebraic decay for all $\gamma \in (1, 2)$.

Example 7.11 (Chen and Y. Z. Wang, 2000). Consider the birth–death process with rates $a_i = 1$, $b_i = 1 - \gamma/i$ for $i \gg 1$ and $\gamma > 0$. The process is ergodic iff $\gamma > 1$. We now let $\gamma > 1$.

(1) In general, we have $\lambda_1 = 0$ for all $\gamma > 1$.

(2) With respect to $V^0$: $V^0(f) = \sup_{k \ge 0} |f_{k+1} - f_k|^2$, the process has $L^2$-algebraic decay iff $\gamma > 3$.

(3) With respect to $V_0$: $V_0(f) = \sup_{i \ne j}(f_i - f_j)^2$, the process has $L^2$-algebraic decay for all $\gamma > 1$.

The beginning step of the proof

Note that the functionals $V^s$ and $V_0$ are all Lipschitz-type with respect to some distance $\rho$: $\mathrm{Lip}_\rho(f)^2 = \sup_{x \ne y} \Big|\frac{f(x) - f(y)}{\rho(x, y)}\Big|^2$. As we have shown before, the distances play a very important role in the study of the spectral gap. The same happens in the present situation. Here we show a few lines of the original proof. First, recall the Liggett–Stroock inequality:

$$\|f\|^2 \le C D(f)^{1/p}\, V(f)^{1/q}.$$

Clearly, one has to use the Hölder inequality:

$$\begin{aligned}
\mathrm{Var}(f) &= \frac{1}{2}\sum_{i,j} \pi_i \pi_j (f_j - f_i)^2 = \sum_{\{i,j\}} \pi_i \pi_j (f_j - f_i)^2 \\
&\le \Bigg\{\sum_{\{i,j\}} \pi_i \pi_j \Big(\frac{f_j - f_i}{\varphi_{ij}}\Big)^2\Bigg\}^{1/p} \Bigg\{\sum_{\{i,j\}} \pi_i \pi_j \Big(\frac{f_j - f_i}{\varphi_{ij}^{\delta}}\Big)^2 \varphi_{ij}^{2(q + \delta - 1)}\Bigg\}^{1/q}.
\end{aligned}$$

Roughly speaking, in the last line, $\varphi_{ij}$ represents a distance between $i$ and $j$. The last inequality indicates a good use of the Hölder inequality. One may continue the proof by estimating the right-hand side.

General ergodic Markov chains

Finally, we consider a general ergodic Markov chain with transition probability $(p_{ij}(t))$ on a countable set:

$$\pi_j := \lim_{t \to \infty} p_{ij}(t) > 0,$$

but looking for polynomial convergence only. Define

$$d_{ij}^{(n)} = \int_0^\infty t^n \big(p_{ij}(t) - \pi_j\big)\, dt, \qquad n \in \mathbb{Z}_+,$$

$$m_{ij}^{(n)} = \mathbb{E}_i \sigma_j^n, \qquad \sigma_j = \inf\{t \ge \tau_1 : X_t = j\},$$

where $\tau_1$ is the first jumping time of the chain.

Theorem 7.12 (Y. H. Mao, 2002f). For an irreducible and ergodic Markov chain, the following assertions hold.

(1) $\big|d_{ij}^{(n)}\big| < \infty$ for all $i, j$ iff $m_j^{(n)} < \infty$ for some (equivalently, all) $j$.

(2) If $m_j^{(n)} < \infty$, then $p_{ij}(t) - \pi_j = o\big(t^{-n+1}\big)$, as $t \to \infty$.

(3) $m_j^{(n)} < \infty$ iff the inequalities

$$\sum_k q_{ik} y_k \le -n\, m_{ij}^{(n-1)}, \quad i \ne j; \qquad \sum_{k \ne j} q_{jk} y_k < \infty$$

have a finite non-negative solution $(y_i)$.

It is remarkable that the equality sign in the last inequalities indeed holds for $\big(y_i = m_{ij}^{(n)}\big)$. This indicates the complexity of a criterion, in general, for the algebraic convergence, especially for irreversible processes. Recall that the study in the previous sections on general convergence speed depends heavily on the Dirichlet form (reversibility) and so is not available for the irreversible situation.

7.7 General (irreversible) case

The discussion at the end of the last section leads to the following

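Open Problem 7.13. What should be a criterion for slower convergence of a general time-continuous Markov process in terms of its operator?

As a reference, here we consider the time-discrete case. Let $(E, \mathscr{E})$ be a general measurable space and $(X_n)_{n \ge 0}$ be a Markov process with state space $(E, \mathscr{E})$. Define the return time $\sigma_B = \inf\{n \ge 1 : X_n \in B\}$. Next, define

$$\mathscr{R}_0 = \Big\{\{r(n)\}_{n \in \mathbb{Z}_+} : 2 \le r(n) \uparrow,\ \frac{\log r(n)}{n} \downarrow 0 \text{ as } n \uparrow \infty\Big\},$$

$$\mathscr{R} = \Big\{\{r(n)\}_{n \in \mathbb{Z}_+} : \exists\, r_0 \in \mathscr{R}_0 \text{ such that } \liminf_{n \to \infty} \frac{r(n)}{r_0(n)} > 0 \text{ and } \limsup_{n \to \infty} \frac{r(n)}{r_0(n)} < \infty\Big\}.$$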


Roughly speaking, $\mathscr{R}_0$ is the set of monotone speeds and $\mathscr{R}$ consists of the perturbations of the elements in $\mathscr{R}_0$. Here is the general answer to Problem 7.13 in the context of time-discrete Markov processes.

For simplicity, one may think of the petite set and $\mathscr{E}^+$ below, respectively, as a compact set $K$ and the Borel sets in $\mathbb{R}^d$ having positive Lebesgue measure.

Theorem 7.14 (P. Tuominen and R. L. Tweedie, 1994). Let the Markov process be irreducible and aperiodic. Given $r \in \mathscr{R}$. Then

$$\lim_{n \to \infty} r(n)\, \|P^n(x, \cdot) - \pi\|_{\mathrm{Var}} = 0$$

for all $x$ in the set

$$\Big\{x : \mathbb{E}_x \sum_{k=0}^{\sigma_B - 1} r(k) < \infty,\ \forall B \in \mathscr{E}^+\Big\},$$

provided one of the following equivalent conditions holds:

(1) There exists a petite set $K$ such that $\mathbb{E}_x \sum_{k=0}^{\sigma_K - 1} r(k) < \infty$ for all $x \in K$.

(2) There exist $(f_n)_{n \in \mathbb{Z}_+} : E \to [1, \infty]$, a petite set $K$ and a constant $b$ such that $\sup_{x \in K} f_0(x) < \infty$, $\{f_1 < \infty\} \subset \{f_0 < \infty\}$ and furthermore

$$P f_{n+1} \le f_n - r(n) + b\, r(n) I_K, \qquad n \in \mathbb{Z}_+.$$

(3) There exists $A \in \mathscr{E}^+$ such that

$$\sup_{x \in A} \mathbb{E}_x \sum_{k=0}^{\sigma_B - 1} r(k) < \infty, \qquad \forall B \in \mathscr{E}^+.$$

It is proved by S. F. Jarner and G. Roberts (2002) that, in the polynomial case, one can use a single $f$ instead of the sequence $(f_n)$ used above.

Finally, we mention that there are quite a number of publications devoted to the study of the convergence rates for time-discrete Markov processes, not included in this book. Refer to the papers quoted above, I. Kontoyiannis and S. P. Meyn (2003), G. O. Roberts and J. S. Rosenthal (1997), J. S. Rosenthal (2002), L. M. Wu (2002) and references within.


Chapter 8

A Diagram of Nine Types of Ergodicity

This chapter consists of three sections. In the first section, we recall three basic inequalities and their ergodic meaning. Then we recall three traditional types of ergodicity. We compute the exact optimal constants of the inequalities or exact ergodic rates for the traditional ergodicity in the simplest case that the state space consists of two points only. The diagram of nine types of ergodicity was presented in Theorem 1.9. In the second section, we discuss the value of the diagram, its powerful applications and make some comments about its completeness. The last section is devoted to the proof of the diagram.

8.1 Statements of the results

Ergodicity by means of three inequalities

Three basic inequalities. Let $(E, \mathscr{E}, \pi)$ be a probability space and $(D, \mathscr{D}(D))$ be a Dirichlet form. Denote by Var the variance: $\mathrm{Var}(f) = \|f\|^2 - \pi(f)^2$, where $\|\cdot\|_p$ is the $L^p$-norm and $\|\cdot\| = \|\cdot\|_2$. The three inequalities are as follows.

Poincaré inequality: $\mathrm{Var}(f) \le \lambda_1^{-1} D(f)$

Nash inequality: $\mathrm{Var}(f)^{1 + 2/\nu} \le \eta^{-1} D(f)\, \|f\|_1^{4/\nu}$, $\nu > 0$

Logarithmic Sobolev inequality: $\int f^2 \log\big(f^2/\|f\|^2\big)\, d\pi \le 2\sigma^{-1} D(f)$.

We remark that for the Nash inequality, it is equivalent if $\|f\|_1$ is replaced by $\|f\|_r$ for all $r \in (1, 2)$. To save notation, here $\lambda_1$, $\eta$ and $\sigma$ denote, respectively, the optimal constants in the inequalities.

Ergodicity by means of the inequalities. Let $(P_t)$ be the semigroup determined by the Dirichlet form $(D, \mathscr{D}(D))$: $P_t = e^{tL}$ formally. Then


• Poincaré inequality $\Longleftrightarrow$ $\mathrm{Var}(P_t f) \le \mathrm{Var}(f)\, e^{-2\lambda_1 t}$.

• Logarithmic Sobolev inequality $\Longrightarrow$ exponential convergence in entropy:

$$\mathrm{Ent}(P_t f) \le \mathrm{Ent}(f)\, e^{-2\sigma t},$$

where $\mathrm{Ent}(f) = \pi(f \log f) - \pi(f) \log \|f\|_1$. Actually, one can replace "$\Longrightarrow$" with "$\Longleftrightarrow$" in the context of diffusions.

• Nash inequality $\Longleftrightarrow$ $\mathrm{Var}(P_t f) \le \Big(\dfrac{\nu}{4\eta t}\Big)^{\nu/2} \|f\|_1^2$.

At first look, one may think that the Nash inequality is the weakest one since the convergence speed is slower. However, a result due to L. Gross (1976) says that

Logarithmic Sobolev inequality $\Longrightarrow$ Exponential convergence in entropy $\Longrightarrow$ Poincaré inequality
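The polynomial rate attached to the Nash inequality above can be motivated by a standard comparison argument, sketched here under stated assumptions rather than as the proof used in the text. With $u(t) = \mathrm{Var}(P_t f)$, the normalization $\|f\|_1 = 1$, the identity $u' = -2D(P_t f)$ and the contraction $\|P_t f\|_1 \le \|f\|_1$, the Nash inequality gives $u' \le -2\eta\, u^{1 + 2/\nu}$, and the envelope $(\nu/(4\eta t))^{\nu/2}$ solves this differential inequality with equality. The code checks that last claim by a central finite difference; the function names and the sample parameters are hypothetical.

```python
def nash_envelope(t: float, eta: float, nu: float) -> float:
    """Envelope (nu / (4*eta*t))^(nu/2), i.e. the Nash decay bound with ||f||_1 = 1."""
    return (nu / (4.0 * eta * t)) ** (nu / 2.0)

def ode_residual(t: float, eta: float, nu: float, h: float = 1e-6) -> float:
    """u'(t) + 2*eta*u(t)^(1 + 2/nu) for u = nash_envelope, via a central difference."""
    derivative = (nash_envelope(t + h, eta, nu) - nash_envelope(t - h, eta, nu)) / (2.0 * h)
    return derivative + 2.0 * eta * nash_envelope(t, eta, nu) ** (1.0 + 2.0 / nu)

if __name__ == "__main__":
    for t in (0.5, 1.3, 4.0):
        print(f"t = {t:3.1f}   residual = {ode_residual(t, eta=0.8, nu=3.0):.2e}")
```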

Three traditional types of ergodicity

Recall

$$\|\mu - \nu\|_{\mathrm{Var}} = 2 \sup_{A \in \mathscr{E}} |\mu(A) - \nu(A)|.$$

Here are the three traditional types of ergodicity.

Ordinary ergodicity: $\lim_{t \to \infty} \|P_t(x, \cdot) - \pi\|_{\mathrm{Var}} = 0$

Exponential ergodicity: $\lim_{t \to \infty} e^{\alpha t} \|P_t(x, \cdot) - \pi\|_{\mathrm{Var}} = 0$

Strong ergodicity: $\lim_{t \to \infty} \sup_x \|P_t(x, \cdot) - \pi\|_{\mathrm{Var}} = 0 \iff \lim_{t \to \infty} e^{\beta t} \sup_x \|P_t(x, \cdot) - \pi\|_{\mathrm{Var}} = 0$,

where, to save notation, $\alpha$ and $\beta$ denote the largest positive constants in the corresponding equalities. For these types of ergodicity, there is a classical theorem, which will be proved in the last section of this chapter.

Strong ergodicity =⇒ Exponential ergodicity =⇒ Ordinary ergodicity.
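On a finite state space the total variation norm recalled above reduces to an $\ell^1$ distance, $\|\mu - \nu\|_{\mathrm{Var}} = \sum_i |\mu_i - \nu_i|$, the supremum being attained at $A = \{i : \mu_i > \nu_i\}$. The following sketch compares the two expressions by brute force over all subsets; the function names and the sample distributions are hypothetical and only for illustration.

```python
from itertools import combinations

def tv_by_subsets(mu, nu):
    """2 * sup_A |mu(A) - nu(A)|, with the supremum taken over all subsets A."""
    n = len(mu)
    best = 0.0
    for size in range(n + 1):
        for subset in combinations(range(n), size):
            gap = abs(sum(mu[i] - nu[i] for i in subset))
            best = max(best, gap)
    return 2.0 * best

def tv_by_l1(mu, nu):
    """Sum of |mu_i - nu_i|, the l^1 form of the same norm on a finite set."""
    return sum(abs(m - v) for m, v in zip(mu, nu))

if __name__ == "__main__":
    mu = [0.5, 0.3, 0.2]
    nu = [0.2, 0.2, 0.6]
    print(tv_by_subsets(mu, nu), tv_by_l1(mu, nu))
```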

As an illustration, we compute the optimal constants in the inequalities and the ergodic convergence rates for the simplest example.

Example 8.1. Let $E = \{0, 1\}$ and consider the $Q$-matrix

$$Q = \begin{pmatrix} -b & b \\ a & -a \end{pmatrix}.$$

Then the Nash constant

$$\eta = (a + b)\Big(\frac{a \wedge b}{a \vee b}\Big)^{1/q},$$

the logarithmic Sobolev constant

$$\sigma = \frac{2(a \vee b - a \wedge b)}{\log[(a \vee b)/(a \wedge b)]},$$

and the rates of the $L^2$-exponential convergence, the exponential ergodicity and the strong ergodicity (which must be exponential) are all equal to $\lambda_1 = a + b$. The results are summarized in Table 8.1.

$\lambda_1 = \alpha = \beta$: $\quad a + b$

Log Sobolev $\sigma$: $\quad \dfrac{2(a \vee b - a \wedge b)}{\log a \vee b - \log a \wedge b}$ (and $a + b \ge \sigma$)

Nash $\eta$: $\quad (a + b)\Big(\dfrac{a \wedge b}{a \vee b}\Big)^{1 + 2/\nu}$

Table 8.1 The constants of the inequalities for two points

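Proof. (a) Note that

$$P(t) = (p_{ij}(t)) = e^{tQ} = \frac{1}{a + b}\begin{pmatrix} a + b e^{-\lambda_1 t} & b\big(1 - e^{-\lambda_1 t}\big) \\ a\big(1 - e^{-\lambda_1 t}\big) & b + a e^{-\lambda_1 t} \end{pmatrix}$$

and

$$\pi_0 = \frac{a}{a + b}, \qquad \pi_1 = \frac{b}{a + b}.$$

Hence

$$|p_{ij}(t) - \pi_j| \le \frac{a \vee b \vee 1}{a + b}\, e^{-\lambda_1 t}.$$

This proves the last assertion.

(b) Write

$$Q = (a + b)\begin{pmatrix} -\theta & \theta \\ 1 - \theta & \theta - 1 \end{pmatrix},$$

where $\theta = \frac{b}{a + b}$. Therefore, it suffices to consider the $Q$-matrix

$$Q_1 = \begin{pmatrix} -\theta & \theta \\ 1 - \theta & \theta - 1 \end{pmatrix}.$$

Without loss of generality, one may assume that $\theta \le 1/2$, i.e., $b \le a$.

(c) By P. Diaconis and L. Saloff-Coste (1996) or Chen (1997a), for $Q_1$, we have

$$\sigma = \frac{2(1 - 2\theta)}{\log(1/\theta - 1)}.$$

Actually, the computation of this constant is non-trivial. From this and (b), it is easy to obtain the second assertion.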


(d) We now show that the Nash inequality is equivalent to

$$\|f - \pi(f)\|^2 \le \eta^{-1} D(f)^{1/p}\, \|f - c\|_1^{2/q}, \qquad f \in L^2(\pi), \tag{8.1}$$

where $c$ is the median of $f$. To see this, replace $f$ with $f - c$ in the original Nash inequality; we get (8.1). The inverse implication follows from

$$\|f - c\|_1 = \inf_\alpha \|f - \alpha\|_1 \le \|f\|_1.$$

Consider $Q_1$. Given a function $f$ on $\{0, 1\}$, without loss of generality, assume that $f_0 > f_1$. Since $\theta \le 1/2$, the median of $f$ is $f_0$. Set $g = f - f_0$. Then

$$\|g\|_1 = \theta |g_1| = \theta (f_0 - f_1),$$

$$\mathrm{Var}(g) = \pi_1 g_1^2 - (\pi_1 g_1)^2 = \theta (1 - \theta)(f_0 - f_1)^2,$$

$$D(g) = \pi_0 q_{01} (g_1 - g_0)^2 = (1 - \theta)\theta (f_0 - f_1)^2.$$

Hence,

$$\eta = \inf_g \frac{D(g)^{1/p}\, \|g\|_1^{2/q}}{\mathrm{Var}(g)} = \Big(\frac{\theta}{1 - \theta}\Big)^{1/q}$$

for $Q_1$. Applying (b) again, we obtain the first assertion.
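Parts (a) to (c) of the proof can be cross-checked numerically. The sketch below is an illustration under stated assumptions, with all function names introduced here: it compares the closed form of $P(t)$ with a truncated power series for $e^{tQ}$, verifies that $\mathrm{Var}(P_t f) = \mathrm{Var}(f)\, e^{-2\lambda_1 t}$ on two points, and evaluates the logarithmic Sobolev constant from the statement of Example 8.1 (the value returned for $a = b$ is the limiting value, an assumption of this sketch).

```python
import math

def stationary(a, b):
    """Stationary distribution (pi_0, pi_1) of Q = [[-b, b], [a, -a]]."""
    return (a / (a + b), b / (a + b))

def transition_closed_form(a, b, t):
    """Closed form of P(t) = exp(tQ) from part (a), with lambda_1 = a + b."""
    lam = a + b
    e = math.exp(-lam * t)
    return [[(a + b * e) / lam, b * (1.0 - e) / lam],
            [a * (1.0 - e) / lam, (b + a * e) / lam]]

def transition_series(a, b, t, terms=80):
    """exp(tQ) from the truncated series sum_n (tQ)^n / n!, as an independent check."""
    tq = [[-b * t, b * t], [a * t, -a * t]]
    result = [[1.0, 0.0], [0.0, 1.0]]
    term = [[1.0, 0.0], [0.0, 1.0]]
    for n in range(1, terms):
        term = [[sum(term[i][k] * tq[k][j] for k in range(2)) / n
                 for j in range(2)] for i in range(2)]
        result = [[result[i][j] + term[i][j] for j in range(2)] for i in range(2)]
    return result

def apply(P, f):
    """(P f)_i = sum_j P_ij f_j."""
    return [P[i][0] * f[0] + P[i][1] * f[1] for i in range(2)]

def variance(pi, f):
    """Var(f) = pi(f^2) - pi(f)^2."""
    mean = pi[0] * f[0] + pi[1] * f[1]
    return pi[0] * f[0] ** 2 + pi[1] * f[1] ** 2 - mean ** 2

def log_sobolev_constant(a, b):
    """sigma = 2(a v b - a ^ b) / log[(a v b)/(a ^ b)] as stated in Example 8.1."""
    hi, lo = max(a, b), min(a, b)
    if hi == lo:
        return a + b  # limiting value as a -> b (assumption of this sketch)
    return 2.0 * (hi - lo) / math.log(hi / lo)

if __name__ == "__main__":
    a, b, t = 2.0, 1.0, 0.7
    print(transition_closed_form(a, b, t))
    print(transition_series(a, b, t))
    print("sigma =", log_sobolev_constant(a, b), " lambda_1 =", a + b)
```

Since $(x - y)/(\log x - \log y) \le (x + y)/2$ for positive $x \ne y$, the value of $\sigma$ computed this way never exceeds $\lambda_1 = a + b$.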

Even in such a simple situation, the exact rate of the exponential convergence in entropy is still unknown.

A diagram of nine types of ergodicity

The main topic of this chapter is the diagram of the different types of ergodicity presented in Theorem 1.9.

8.2 Applications and comments

Here are some remarks about Figure 1.4.

The importance of the diagram is obvious. For instance, by using the estimates obtained from the study of the Poincaré inequality, based on the advantage of the analytic approach (the $L^2$-theory) and the equivalence in the diagram, one can estimate exponentially ergodic convergence rates, about which our knowledge is still very limited. Actually, these two convergence rates often coincide (cf. the proofs given in §8.3). In particular, one obtains a criterion for the exponential ergodicity in dimension one, which has been open for a long period (cf. Tables 1.5 and 5.1). Conversely, from the well-known criteria for the exponential ergodicity one obtains immediately some criteria, which are indeed new, for the Poincaré inequality to hold. Here is a criterion for the last property: a $\psi$-irreducible, aperiodic Markov process with operator $L$ is exponentially ergodic if there exist a probability measure $\nu$, some functions $h : E \to (0, 1]$, $V : E \to [1, \infty]$, and some constants $\delta > 0$, $c < \infty$ such that

$$LV \le (-\delta + ch)V, \qquad R_1 \ge h \otimes \nu,$$

where $R_\lambda = \int_0^\infty e^{-\lambda t} P_t\, dt$, $\lambda > 0$, is the resolvent of the semigroup $\{P_t\}_{t \ge 0}$. Refer to S. P. Meyn and R. L. Tweedie (1993a), I. Kontoyiannis and S. P. Meyn (2003) for more details. Next, our knowledge about the $L^1$-spectrum is still very limited, due to the structure of the $L^1$-space, which is only a Banach space but not a Hilbert space. Based on the probabilistic advantage and the identity in the diagram, from the study of the strong ergodicity, one learns a lot about the $L^1$-spectral gap of the generator.

As explained in §7.6, the $L^2$-algebraic ergodicity means that

$$\mathrm{Var}(P_t f) \le C V(f)\, t^{1-q}, \qquad t > 0$$

holds for some $V$ having the properties: $V$ is homogeneous of degree two (in the sense that $V(cf + d) = c^2 V(f)$ for any constants $c$ and $d$) and $V(f) < \infty$ for a class of functions $f$ [continuous functions with compact support, for instance; cf. T. M. Liggett (1991)]. Refer also to J. D. Deuschel (1994), Chen and Y. Z. Wang (2000), M. Röckner and F. Y. Wang (2001) for the study of the $L^2$-algebraic convergence.

The reversibility is used in both the identity and the equivalence. Without the reversibility, the $L^2$-exponential convergence still implies $\pi$-a.s. exponentially ergodic convergence.

An important fact is that the condition "having densities" is used only in the identity of $L^1$-exponential convergence and $\pi$-a.s. strong ergodicity. Without this condition, $L^1$-exponential convergence still implies $\pi$-a.s. strong ergodicity, and so the diagram needs only a little change (however, the reversibility is still required here). Thus, it is a natural open problem to remove this "density" condition.

Except for the identity and the equivalence, all the implications in the diagram are suitable for general Markov processes, not necessarily reversible, even though the inequalities are mainly valuable in the reversible situation. Clearly, the diagram extends the ergodic theory of Markov processes.

The diagram is complete in the following sense: each single-directed implication cannot be replaced by a double-directed one. Moreover, the $L^1$-exponential convergence (resp., the strong ergodicity) and the logarithmic Sobolev inequality (resp., the exponential convergence in entropy) are not comparable.

Examples 8.2. Comparisons of the different types of ergodicity for diffusions on the half line with reflecting boundary at the origin. Here "√" means "always holds" and "×" means "never holds".

Coefficients                        Erg.     Exp. erg.   LogS     Strong erg.   Nash
b(x) = 0, a(x) = x^γ                γ > 1    γ ≥ 2       γ > 2    γ > 2         γ > 2
b(x) = 0, a(x) = x^2 log^γ x        √        γ ≥ 0       γ ≥ 1    γ > 1         ×
a(x) = 1, b(x) = −b                 √        √           ×        ×             ×

Table 8.2 Comparisons by diffusions on [0,∞)

Examples 8.3. Comparisons of the different types of ergodicity for birth–death processes.

Rates                               Erg.     Exp. erg.   LogS     Strong erg.   Nash
a_i = b_i = i^γ                     γ > 1    γ ≥ 2       γ > 2    γ > 2         γ > 2
a_i = b_i = i^2 log^γ i             √        γ ≥ 0       γ ≥ 1    γ > 1         ×
a_i = a > b_i = b                   √        √           ×        ×             ×

Table 8.3 Comparisons by birth–death processes

We have seen from the above tables that the strong ergodicity is usually stronger than the logarithmic Sobolev inequality. The next example goes in the opposite way.

Example 8.4 (Chen, 1999b). Let $(\pi_i > 0)$ and take $q_{ij} = \pi_j$ $(j \ne i)$. Then the process is strongly ergodic but the logarithmic Sobolev inequality does not hold.

Proof. (a) Since the $Q$-matrix is bounded, the logarithmic Sobolev inequality cannot hold (cf. Theorem 4.11).

(b) For the strong ergodicity, note that the sequence $y_0 = 0$, $y_i = 1/\pi_0$ $(i \ne 0)$ satisfies the inequalities' criterion [cf. Theorem 5.1 (3)]:

$$\sum_j q_{ij}(y_j - y_i) \le -1, \quad i \ne 0; \qquad \sum_{j \ne 0} q_{0j} y_j < \infty; \qquad (y_i) \text{ is nonnegative and bounded.}$$

Hence the process is strongly ergodic.

For one-dimensional diffusions, a counterexample was constructed by F. Y. Wang (2001) to show that the strong ergodicity does not imply the exponential convergence in entropy (equivalently, the logarithmic Sobolev inequality).

The diagram was presented in Chen (1999c; 2002b), originally stated mainly for Markov chains. Recently, the identity of $L^1$-exponential convergence and the $\pi$-a.s. strong ergodicity was proved by Y. H. Mao (2002c).

8.3 Proof of Theorem 1.9

The detailed proofs and some necessary counterexamples were presented in Chen (1999c; 2002b) for reversible Markov processes, except the identity of the $L^1$-exponential convergence and $\pi$-a.s. strong ergodicity. Note that for discrete state spaces, one can rule out "a.s." used in the diagram. Here, we collect the whole proofs of the diagram, with some more careful estimates for the general state spaces. The author would like to acknowledge Y. H. Mao for his nice ideas which are included in this section. The steps of the proofs are listed as follows.

(a) Nash inequality $\Longrightarrow$ $L^1$-exponential convergence and $\pi$-a.s. Strong ergodicity.

(b) $L^1$-exponential convergence $\Longleftrightarrow$ $\pi$-a.s. Strong ergodicity.

(c) Strong ergodicity $\Longrightarrow$ Exponential ergodicity $\Longrightarrow$ Ordinary ergodicity.

(d) $L^2$-exponential convergence $\Longrightarrow$ $L^2$-algebraic ergodicity.

(e) $L^2$-algebraic convergence $\Longrightarrow$ Ordinary ergodicity.

(f) Nash inequality $\Longrightarrow$ Logarithmic Sobolev inequality.

(g) Logarithmic Sobolev inequality $\Longrightarrow$ Poincaré inequality.

(h) $L^2$-exponential convergence $\Longrightarrow$ $\pi$-a.s. Exponential ergodicity.

(i) Exponential ergodicity $\Longrightarrow$ $L^2$-exponential convergence.

(a) Nash inequality =⇒ L1-exponential convergence and π-

a.s. Strong ergodicity [Chen, 1999b]

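Denote by $\|\cdot\|_{p\to q}$ the operator norm from $L^p(\pi)$ to $L^q(\pi)$. Note that

$$\begin{aligned}
\text{Nash inequality} &\iff \operatorname{Var}(P_t f) = \|P_t f - \pi(f)\|_2^2 \le C^2\|f\|_1^2 / t^{q-1}\\
&\iff \|(P_t - \pi)f\|_2 \le C\|f\|_1 / t^{(q-1)/2}\\
&\iff \|P_t - \pi\|_{1\to 2} \le C / t^{(q-1)/2} \qquad (q := \nu/2 + 1).
\end{aligned}$$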

Since $\|P_t - \pi\|_{1\to 1} \le \|P_t - \pi\|_{1\to 2}$, we have

Nash inequality $\Longrightarrow$ $L^1$-algebraic convergence.

Furthermore, because of the semigroup property, the convergence of $\|\cdot\|_{1\to 1}$ must be exponential, so we indeed have

Nash inequality $\Longrightarrow$ $L^1$-exponential convergence.
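The passage from algebraic to exponential decay rests on submultiplicativity. The following is a short sketch of that step, assuming (as is standard, though not spelled out here) that $\pi$ is identified with the operator $f \mapsto \pi(f)$, that $\pi$ is invariant for $(P_t)$, and that $P_t$ is a contraction on $L^1(\pi)$; the time $t_0$ is an auxiliary symbol introduced only for this sketch.

```latex
% pi P_s = P_t pi = pi and pi^2 = pi as operators, hence
\[
(P_t - \pi)(P_s - \pi) = P_{t+s} - \pi,
\qquad
\|P_{t+s} - \pi\|_{1\to 1} \le \|P_t - \pi\|_{1\to 1}\,\|P_s - \pi\|_{1\to 1}.
\]
% Algebraic decay gives some t_0 with \|P_{t_0} - \pi\|_{1\to 1} \le e^{-1}.
% Writing t = n t_0 + r with n = \lfloor t/t_0 \rfloor and 0 \le r < t_0,
\[
\|P_t - \pi\|_{1\to 1}
\le \|P_{t_0} - \pi\|_{1\to 1}^{\,n}\,\|P_r - \pi\|_{1\to 1}
\le 2\,e^{-n}
\le 2e\,e^{-t/t_0}.
\]
```

Here the factor $2$ comes from $\|P_r - \pi\|_{1\to 1} \le \|P_r\|_{1\to 1} + \|\pi\|_{1\to 1} \le 2$.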

In the symmetric case, $P_t - \pi = (P_t - \pi)^*$, and so

$$\|P_{2t} - \pi\|_{1\to\infty} \le \|P_t - \pi\|_{1\to 2}\,\|P_t - \pi\|_{2\to\infty} = \|P_t - \pi\|_{1\to 2}^2.$$


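Hence, $\|P_t - \pi\|_{1\to\infty} \le C/t^{q-1}$. Thus,

$$\begin{aligned}
\operatorname*{ess\,sup}_x \|P_t(x,\cdot) - \pi\|_{\mathrm{Var}}
&= \operatorname*{ess\,sup}_x \sup_{|f|\le 1} |(P_t(x,\cdot) - \pi)f|\\
&\le \operatorname*{ess\,sup}_x \sup_{\|f\|_1\le 1} |(P_t(x,\cdot) - \pi)f|\\
&= \sup_{\|f\|_1\le 1} \operatorname*{ess\,sup}_x |(P_t(x,\cdot) - \pi)f|\\
&= \|P_t - \pi\|_{1\to\infty}\\
&\le C/t^{q-1} \to 0, \qquad \text{as } t\to\infty.
\end{aligned}$$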

This gives us the π-a.s. strong ergodicity.

(b) $L^1$-exponential convergence $\Longleftrightarrow$ $\pi$-a.s. strong ergodicity [Y. H. Mao, 2002c]

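Since $(L^1)^* = L^\infty$, we have $\|P_t - \pi\|_{1\to 1} = \|P_t^* - \pi\|_{\infty\to\infty}$. If moreover $P_t^*(x,\cdot) \ll \pi$, then

$$\begin{aligned}
\|P_t^* - \pi\|_{\infty\to\infty}
&= \operatorname*{ess\,sup}_x \sup_{\|f\|_\infty = 1} |(P_t^* - \pi)f(x)|\\
&= \operatorname*{ess\,sup}_x \sup_{\sup|f| = 1} |(P_t^* - \pi)f(x)|\\
&= \operatorname*{ess\,sup}_x \|P_t^*(x,\cdot) - \pi\|_{\mathrm{Var}}.
\end{aligned}$$

Hence, $\pi$-a.s. strong ergodicity is exactly the same as $L^1$-exponential convergence. Without the condition "$P_t^*(x,\cdot) \ll \pi$", the second equality becomes "$\ge$", and so we have in the general reversible case that

$L^1$-exponential convergence $\Longrightarrow$ $\pi$-a.s. strong ergodicity.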

(c) Strong ergodicity $\Longrightarrow$ Exponential ergodicity $\Longrightarrow$ Ordinary ergodicity

If the Markov process corresponding to the semigroup $(P_t)$ is irreducible and aperiodic in the Harris sense, then the implications hold. To see this, note that by Chen (1992a, §4.4) and D. Down, S. P. Meyn and R. L. Tweedie (1995), the continuous-time case can be reduced to the discrete-time one, and then the conclusion follows from S. P. Meyn and R. L. Tweedie (1993b, Chapter 16).

(d) $L^2$-exponential convergence $\Longrightarrow$ $L^2$-algebraic ergodicity

Simply take V (f) = ‖f‖2 in (6.1) and apply Theorem 6.2.

(e) $L^2$-algebraic convergence $\Longrightarrow$ Ordinary ergodicity

The proof is very much the same as proof (h) below.


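(f) Nash inequality $\Longrightarrow$ Logarithmic Sobolev inequality [Chen, 1999b]

Because $\|f\|_1 \le \|f\|_p$ for all $p \ge 1$, we have

$$\|\cdot\|_{2\to 2} \le \|\cdot\|_{1\to 2} \le C/t^{(q-1)/2},$$

and so

Nash inequality $\Longrightarrow$ Poincaré inequality $\Longleftrightarrow$ $\lambda_1 > 0$.

Moreover,

$$\|P_t\|_{p\to 2} \le \|P_t\|_{1\to 2} \le \|P_t - \pi\|_{1\to 2} + \|\pi\|_{1\to 2} < \infty, \qquad p \in (1, 2).$$

The assertion now follows from D. Bakry (1992, Theorem 3.6 and Proposition 3.9) or (4.15) and (4.16).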

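(g) Logarithmic Sobolev inequality $\Longrightarrow$ Poincaré inequality [O. S. Rothaus, 1981]

Actually, we have a more precise result: $\lambda_1 \ge \sigma$. Consider $f = 1 + \varepsilon g$ for sufficiently small $\varepsilon$. Then $D(f) = \varepsilon^2 D(g)$. Next, expand $f^2\log f^2$ and $f^2 \log\|f\|^2$ in $\varepsilon$ up to order 2. Then, we get

$$\int f^2 \log\frac{f^2}{\|f\|^2}\, d\pi = 2\varepsilon^2 \operatorname{Var}(g) + O(\varepsilon^3).$$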
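The second-order expansion above can be checked numerically on a small example. The sketch below is only an illustration under stated assumptions: the three-point probability vector `pi`, the function `g`, and the helper names are hypothetical choices, and $\|f\|^2$ is read as $\pi(f^2)$.

```python
import math

def entropy_functional(pi, f):
    """Integral of f^2 * log(f^2 / ||f||^2) d(pi), with ||f||^2 = pi(f^2)."""
    norm_sq = sum(p * v * v for p, v in zip(pi, f))
    return sum(p * v * v * math.log(v * v / norm_sq) for p, v in zip(pi, f))

def variance(pi, g):
    """Variance of g under the probability vector pi."""
    mean = sum(p * v for p, v in zip(pi, g))
    return sum(p * (v - mean) ** 2 for p, v in zip(pi, g))

# Hypothetical three-point example (not from the text).
pi = [0.2, 0.5, 0.3]
g = [1.0, -0.4, 2.0]

def ratio(eps):
    """Left-hand side divided by the leading term 2 * eps^2 * Var(g)."""
    f = [1.0 + eps * v for v in g]
    return entropy_functional(pi, f) / (2.0 * eps ** 2 * variance(pi, g))

for eps in (1e-1, 1e-2, 1e-3):
    print(eps, ratio(eps))
```

If the expansion is right, the printed ratios should approach 1 as `eps` decreases.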

The proof can be done by using the definitions of $\lambda_1$ and $\sigma$ and letting $\varepsilon \to 0$.

The remainder of the section is devoted to the proof of the assertion:

$$L^2\text{-exponential convergence} \iff \pi\text{-a.s. exponential ergodicity}. \tag{8.2}$$

Actually, this is done by Chen (2000a). Indeed, by assumption, the process is reversible and $P_t(x,\cdot) \ll \pi$. Set

$$p_t(x,y) = \frac{dP_t(x,\cdot)}{d\pi}(y).$$

Then we have $p_t(x,y) = p_t(y,x)$, $\pi\times\pi$-a.s. $(x,y)$. Hence

$$\int p_s(x,y)^2\,\pi(dy) = \int p_s(x,y)\,p_s(y,x)\,\pi(dy) = p_{2s}(x,x) < \infty \tag{8.3}$$

(E. A. Carlen et al, 1987).

This means that $p_t(x,\cdot) \in L^2(\pi)$ for all $t > 0$ and $\pi$-a.s. $x \in E$. Thus, by Chen (2000a, Theorem 1.2) and the remarks right after the theorem, (8.2) holds.

The proof above is mainly based on the time-discrete analogue by G. O. Roberts and J. S. Rosenthal (1997). Here, we present a more direct proof of (8.3) as follows.


(h) $L^2$-exponential convergence $\Longrightarrow$ $\pi$-a.s. exponential ergodicity [Chen, 1991b; 1998b; 2000a]

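Let $\mu \ll \pi$. Then

$$\begin{aligned}
\|\mu P_t - \pi\|_{\mathrm{Var}}
&= \sup_{|f|\le 1} |(\mu P_t - \pi)f|
 = \sup_{|f|\le 1} \left|\pi\!\left(\frac{d\mu}{d\pi}P_t f - f\right)\right|\\
&= \sup_{|f|\le 1} \left|\pi\!\left(f P_t^*\!\left(\frac{d\mu}{d\pi}\right) - f\right)\right|
 = \sup_{|f|\le 1} \left|\pi\!\left[f\left(P_t^*\!\left(\frac{d\mu}{d\pi} - 1\right)\right)\right]\right|\\
&\le \left\|P_t^*\!\left(\frac{d\mu}{d\pi} - 1\right)\right\|_1
 \le \left\|\frac{d\mu}{d\pi} - 1\right\|_2 e^{-t\,\mathrm{gap}(L^*)}
 = \left\|\frac{d\mu}{d\pi} - 1\right\|_2 e^{-t\,\mathrm{gap}(L)}.
\end{aligned} \tag{8.4}$$

We now consider two cases separately.

In the reversible case with $P_t(x,\cdot) \ll \pi$, by (8.3), we have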

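$$\begin{aligned}
\|P_t(x,\cdot) - \pi\|_{\mathrm{Var}}
&\le \left\|P_{t-s}\!\left(\frac{dP_s(x,\cdot)}{d\pi} - 1\right)\right\|_1
 \le \|p_s(x,\cdot) - 1\|_2\, e^{-(t-s)\,\mathrm{gap}(L)}\\
&= \left[\sqrt{p_{2s}(x,x) - 1}\; e^{s\,\mathrm{gap}(L)}\right] e^{-t\,\mathrm{gap}(L)}, \qquad t \ge s.
\end{aligned} \tag{8.5}$$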

Therefore, there exists $C(x) < \infty$ such that

$$\|P_t(x,\cdot) - \pi\|_{\mathrm{Var}} \le C(x)\,e^{-t\,\mathrm{gap}(L)}, \qquad t \ge 0,\ \pi\text{-a.s. } (x).$$

Denote by $\varepsilon_1$ the largest $\varepsilon$ such that

$$\|P_t(x,\cdot) - \pi\|_{\mathrm{Var}} \le C(x)\,e^{-\varepsilon t}$$

for all $t$. Then $\varepsilon_1 \ge \mathrm{gap}(L) = \lambda_1$.

In the $\varphi$-irreducible case, without using the reversibility and transition density, from (8.4), one can still derive $\pi$-a.s. exponential ergodicity (but possibly with different rates). Refer to G. O. Roberts and R. L. Tweedie (2001) for a proof in the time-discrete situation (the title of the quoted paper is confusing: there the term "$L^1$-convergence" is used for the $\pi$-a.s. exponentially ergodic convergence, rather than the standard meaning of $L^1$-exponential convergence used in this paper. These two types of convergence are essentially different, as shown in Theorem 9.1). In other words, the reversibility and the existence of the transition density are not essential in this implication.

(i) $\pi$-a.s. exponential ergodicity $\Longrightarrow$ $L^2$-exponential convergence [Chen (2000a), Y. H. Mao (2002c)]

In the time-discrete case, a similar assertion was proved by G. O. Roberts and J. S. Rosenthal (1997) and so can be extended to the time-continuous case by using the standard technique [cf. Chen (1992a, §4.4)]. The proof given below provides more precise estimates. Let the $\sigma$-algebra $\mathscr{E}$ be countably generated. By E. Nummelin and P. Tuominen (1982) or E. Nummelin (1984, Theorem 6.14 (iii)), we have in the time-discrete case that

$$\pi\text{-a.s. geometrically ergodic convergence} \iff \big\|\,\|P^n(\bullet,\cdot) - \pi\|_{\mathrm{Var}}\big\|_1 \text{ geometric convergence};$$

here and in what follows, the $L^1$-norm is taken with respect to the variable "$\bullet$". This implies in the time-continuous case that

$$\pi\text{-a.s. exponentially ergodic convergence} \iff \big\|\,\|P_t(\bullet,\cdot) - \pi\|_{\mathrm{Var}}\big\|_1 \text{ exponential convergence}.$$

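Assume that $\big\|\,\|P_t(\bullet,\cdot) - \pi\|_{\mathrm{Var}}\big\|_1 \le Ce^{-\varepsilon_2 t}$ with largest $\varepsilon_2$.

We now prove that $\big\|\,\|P_t(\bullet,\cdot) - \pi\|_{\mathrm{Var}}\big\|_1 \ge \|P_t - \pi\|_{\infty\to 1}$. Let $\|f\|_\infty = 1$. Then

$$\begin{aligned}
\|(P_t - \pi)f\|_1
&= \int \pi(dx)\,\Big|\int \big[P_t(x,dy) - \pi(dy)\big] f(y)\Big|\\
&\le \int \pi(dx) \sup_{\|g\|_\infty\le 1} \Big|\int \big[P_t(x,dy) - \pi(dy)\big] g(y)\Big|\\
&= \big\|\,\|P_t(\bullet,\cdot) - \pi\|_{\mathrm{Var}}\big\|_1 \qquad (\text{this needs } P_t(x,\cdot)\ll\pi \text{ or reversibility!}).
\end{aligned}$$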

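Next, we prove that $\|P_{2t} - \pi\|_{\infty\to 1} = \|P_t - \pi\|_{\infty\to 2}^2$ in the reversible case. We have

$$\begin{aligned}
\|(P_t - \pi)f\|_2^2 &= \big((P_t - \pi)f,\ (P_t - \pi)f\big) = \big(f,\ (P_t - \pi)^2 f\big)\\
&= \big(f,\ (P_{2t} - \pi)f\big) \le \|f\|_\infty \|(P_{2t} - \pi)f\|_1\\
&\le \|f\|_\infty^2\, \|P_{2t} - \pi\|_{\infty\to 1}.
\end{aligned}$$

Hence $\|P_{2t} - \pi\|_{\infty\to 1} \ge \|P_t - \pi\|_{\infty\to 2}^2$. The inverse inequality is obvious by using the semigroup property and symmetry:

$$\|P_{2t} - \pi\|_{\infty\to 1} \le \|P_t - \pi\|_{\infty\to 2}\,\|P_t - \pi\|_{2\to 1} = \|P_t - \pi\|_{\infty\to 2}^2.$$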

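We remark that in the general case, without reversibility, we have $\|P_t - \pi\|_{\infty\to 1} \ge \|P_t - \pi\|_{\infty\to 2}^2/2$. Actually,

$$\begin{aligned}
\|(P_t - \pi)f\|_2^2 &\le \int |(P_t - \pi)f|^2\, d\pi \le 2\|f\|_\infty \int |(P_t - \pi)f|\, d\pi\\
&\le 2\|f\|_\infty^2\, \|P_t - \pi\|_{\infty\to 1}, \qquad f \in L^\infty(\pi).
\end{aligned}$$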

Finally, assume that the process is reversible. We prove that

$$\lambda_1 = \mathrm{gap}(L) \ge \varepsilon_2.$$

We have just proved that for every $f$ with $\pi(f) = 0$ and $\|f\|_2 = 1$,

$$\|P_t f\|_2^2 \le C\|f\|_\infty^2\, e^{-2\varepsilon_2 t}.$$


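Following F. Y. Wang (2000a, Lemma 2.2), or M. Röckner and F. Y. Wang (2001), by the spectral representation theorem, we have

$$\begin{aligned}
\|P_t f\|_2^2 &= \int_0^\infty e^{-2\lambda t}\, d(E_\lambda f, f)\\
&\ge \left[\int_0^\infty e^{-2\lambda s}\, d(E_\lambda f, f)\right]^{t/s} \qquad (\text{by Jensen's inequality})\\
&= \|P_s f\|_2^{2t/s}, \qquad t \ge s.
\end{aligned}$$

Thus, $\|P_s f\|_2^2 \le \big[C\|f\|_\infty^2\big]^{s/t} e^{-2\varepsilon_2 s}$. Letting $t \to \infty$, we get

$$\|P_s f\|_2^2 \le e^{-2\varepsilon_2 s}, \qquad \pi(f) = 0,\ \|f\|_2 = 1,\ f \in L^\infty(\pi).$$

Since $L^\infty(\pi)$ is dense in $L^2(\pi)$, we have

$$\|P_s f\|_2^2 \le e^{-2\varepsilon_2 s}, \qquad s \ge 0,\ \pi(f) = 0,\ \|f\|_2 = 1.$$

Therefore, $\lambda_1 \ge \varepsilon_2$.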

Remark 8.5. Note that when $p_{2s}(\cdot,\cdot) \in L^{1/2}(\pi)$ (in particular, when $p_{2s}(x,x)$ is bounded in $x$) for some $s > 0$, from (8.5), it follows that there exists a constant $C$ such that $\big\|\,\|P_t(\bullet,\cdot) - \pi\|_{\mathrm{Var}}\big\|_1 \le Ce^{-\lambda_1 t}$. Then, we have $\varepsilon_2 \ge \lambda_1$. Combining this with (e), we indeed have $\lambda_1 = \varepsilon_2$.

Remark 8.6. It is proved by Hwang et al (2002) that under mild conditions, in the reversible case, $\lambda_1 = \varepsilon_1$. Refer also to F. Y. Wang (2000a) for related estimates.


Chapter 9

Reaction-Diffusion Processes

This chapter surveys the main progress made in the past twenty years or so in the study of reaction–diffusion (abbrev. RD) processes. The processes are motivated by some typical models in modern non-equilibrium statistical physics and constitute an important class of interacting particle systems, which is currently an active research field in probability and mathematical physics. The models are concrete, but as a part of infinite-dimensional mathematics, the topic is quite hard. It is explained how new problems arise and how some new ideas and new mathematical tools are introduced. Surprisingly, the mathematical tools produced from the study of the simple models then turn out to have many powerful applications, not only in probability theory but also in other branches of mathematics. Nevertheless, the story is still far from finished; some important open problems are proposed for further study.

The chapter consists of five sections. We begin with an introduction of the models (Section 1). Then we turn to the finite-dimensional case, in which the processes are indeed Markov chains (Section 2). The infinite-dimensional processes are constructed in Section 3. The main tool for the construction is the coupling methods discussed in Chapter 2. The existence of the stationary distribution, the ergodicity and the phase transitions of the processes are discussed in Section 4. In the last section, the relation between the RD-processes and RD-equations is described.

9.1 The models

Let $S = \mathbb{Z}^d$, the $d$-dimensional lattice. Consider a chemical reaction in a container. Divide the container into small vessels. Imagine each $u \in S$ as a small vessel in which there is a reaction. The reaction is described by some Markov chains (abbrev. MCs) with $Q$-matrices $Q_u = (q_u(i,j) : i, j \in \mathbb{Z}_+)$, $u \in S$. That is, the rate of the MC in $u$ jumping from $i$ to $j \ne i$ is given by $q_u(i,j)$.

[Figure 9.1 The models of reaction–diffusion processes. Labels: container $S = \mathbb{Z}^d$; small vessels $u$, $v$: reactions; diffusions between vessels; state space $E = (\mathbb{Z}_+^m)^S$.]

Throughout the chapter, we consider only totally stable and conservative $Q$-matrices: $-q_u(i,i) = \sum_{j\ne i} q_u(i,j) < \infty$ for all $i \in \mathbb{Z}_+$. Thus, the reaction part of the formal generator of the process is as follows:

$$\Omega_r f(x) = \sum_{u\in S}\ \sum_{k\in\mathbb{Z}\setminus\{0\}} q_u(x_u, x_u + k)\big[f(x + k e_u) - f(x)\big],$$

where $e_u$ is the element in $E := \mathbb{Z}_+^S$ whose value at site $u$ is equal to one, and the values at other sites are zero. Moreover, we have used the following convention: $q_u(i,j) = 0$ for $i \in \mathbb{Z}_+$, $j \notin \mathbb{Z}_+$ and $u \in S$. Mathematically, one may regard $x_u$ as the $u$-th component of $x$ in the product space $\mathbb{Z}_+^S$. The other part of the generator of the process consists of diffusions between the vessels, which are described by a transition probability matrix $(p(u,v) : u, v \in S)$ and a function $c_u$ ($u \in S$) on $\mathbb{Z}_+$. For instance, if there are $k$ particles in $u$, then the rate function of the diffusion from $u$ to $v$ is $c_u(k)p(u,v)$, where $c_u$ satisfies

$$c_u \ge 0, \qquad c_u(0) = 0, \qquad u \in S. \tag{9.1}$$

Thus, the diffusion part of the formal generator becomes

$$\Omega_d f(x) = \sum_{u,v\in S} c_u(x_u)\,p(u,v)\big[f(x - e_u + e_v) - f(x)\big].$$

Finally, the whole formal generator of the process is $\Omega = \Omega_r + \Omega_d$. A simple description of the models is given by Figure 9.1. Sometimes, it is more convenient to lift the spin spaces ("fibers"); then, we have Figure 9.2.

Example 9.1 (Polynomial model). The diffusion rates are described by $c_u(k) = k$ and $p(u,v)$, which is the simple random walk on $\mathbb{Z}^d$. The reaction rates are of birth–death type:

$$q_u(k, k+1) = b_k = \sum_{j=0}^{m_0} \beta_j k^{(j)}, \qquad q_u(k, k-1) = a_k = \sum_{j=1}^{m_0+1} \delta_j k^{(j)},$$

where $k^{(j)} = k(k-1)\cdots(k-j+1)$, $\beta_j, \delta_j \ge 0$ and $\beta_0, \beta_{m_0}, \delta_1, \delta_{m_0+1} > 0$.

[Figure 9.2 Informal interpretation: dynamics of infinite-dimensional fiber bundle. Labels: base space $S = \mathbb{Z}^d$; fibers $\mathbb{Z}_+^m$: reactions; connections: diffusions.]

In particular, we have

Example 9.2. (1) Schlögl's first model: $m_0 = 1$. (2) Schlögl's second model: $m_0 = 2$ but $\beta_1 = \delta_2 = 0$.
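The birth and death rates of the polynomial model are easy to evaluate directly. The following is a minimal sketch; the function names and the numerical coefficients are hypothetical and chosen only so that the two Schlögl cases satisfy the constraints stated above.

```python
def falling_factorial(k, j):
    """k^{(j)} = k (k-1) ... (k-j+1), with the convention k^{(0)} = 1."""
    result = 1
    for r in range(j):
        result *= (k - r)
    return result

def birth_rate(k, beta):
    """b_k = sum_{j=0}^{m0} beta[j] * k^{(j)}; beta has length m0 + 1."""
    return sum(b * falling_factorial(k, j) for j, b in enumerate(beta))

def death_rate(k, delta):
    """a_k = sum_{j=1}^{m0+1} delta[j-1] * k^{(j)}; delta has length m0 + 1."""
    return sum(d * falling_factorial(k, j) for j, d in enumerate(delta, start=1))

# Hypothetical coefficients (assumptions, not values from the text).
schlogl_first = {"beta": [1.0, 2.0], "delta": [3.0, 0.5]}             # m0 = 1
schlogl_second = {"beta": [1.0, 0.0, 2.0], "delta": [3.0, 0.0, 0.5]}  # m0 = 2, beta_1 = delta_2 = 0

for k in range(4):
    print(k,
          birth_rate(k, schlogl_second["beta"]),
          death_rate(k, schlogl_second["delta"]))
```

Note that $a_0 = 0$ automatically, since every falling factorial $0^{(j)}$ with $j \ge 1$ vanishes, so the chain cannot leave $\mathbb{Z}_+$.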

All these examples have a single type of particles and so the number of the particles is valued in $\mathbb{Z}_+$. If we consider two types of particles, then the reaction part becomes an MC valued in $\mathbb{Z}_+^2$. Here is a typical example.

Example 9.3 (Brussel's model). For each type of the particles, the diffusion part of the formal generator is the same as in Example 9.1. As for the reaction part, the MC has the following transition behavior:

$$\mathbb{Z}_+^2 \ni (i,j) \to
\begin{cases}
(i+1,\ j) & \text{at rate } \lambda_1,\\
(i-1,\ j) & \text{at rate } \lambda_4 i,\\
(i-1,\ j+1) & \text{at rate } \lambda_2 i,\\
(i+1,\ j-1) & \text{at rate } \lambda_3 i(i-1)j/2,
\end{cases}$$

where the $\lambda_k$'s are positive constants.

These examples are typical models in non-equilibrium statistical physics. Refer to Chen (1986b) or Chen (1992a, Part IV) for more information about the background and references. About 15 models are treated in these books. The author learnt the models from Prof. S. J. Yan in 1980.

[Figure 9.3 Reduce higher-dimensions to dimension one. The level sets $E_n := \{x : \sum_{u\in S} x_u = n\}$ are collapsed to the points $n \in \mathbb{Z}_+$, with $q_{23} = \max_{y\in E_2}\sum_{z\in E_3}\mathrm{Rate}(y\to z)$ and $q_{21} = \min_{y\in E_2}\sum_{z\in E_1}\mathrm{Rate}(y\to z)$. Single birth processes: $q_{i,i+n} > 0$ iff $n = 1$.]

9.2 Finite-dimensional case

Replacing $S = \mathbb{Z}^d$ with a finite set $S$ (which is fixed in this section) in the above definitions of $\Omega_r$ and $\Omega_d$, the corresponding processes are simply MCs since the state space $E = \mathbb{Z}_+^S$ (or $(\mathbb{Z}_+^2)^S$) is countable. At first, one may think this step can be ignored because there is already a well-developed theory of MCs. However, the subject is not as easy as it looks. Indeed, we did not know how to prove the uniqueness of the MCs for several years. The usual criterion for the uniqueness says that the equations

$$(\lambda - \Omega)u(x) = 0, \qquad 0 \le u(x) \le 1, \qquad x \in \mathbb{Z}_+^S$$

have only the zero solution for some (equivalently, for all) $\lambda > 0$. It should be clear that the equations are quite hard to handle, especially in the higher-dimensional case. The criterion does not take the geometry of the MC into account.

To overcome this difficulty, we regard the set $\{x : \sum_{u\in S} x_u = n\}$ as a single point $n$ ($n \ge 0$). Construct a single birth process (i.e., when $k > 0$, $q_{i,i+k} > 0$ iff $k = 1$, but there is no restriction on the death rates $q_{ij}$ for $j < i$) on $\mathbb{Z}_+$ which dominates the original process. See Figure 9.3. For single birth processes, we do have a computable criterion for the uniqueness, and then we can prove the uniqueness of the original processes by a comparison argument. This and related results are presented in S. J. Yan and Chen (1986). See also Chen (1999d) for some improvements and Sections 5.4 and 5.5 for additional results. By using an approximation of the processes with bounded rates (in this case, the process is always unique), a more general uniqueness result (even for Markov jump processes on general state spaces) was proved in Chen (1986a). The following result is also included in Chen (1986b, 1991c, 1992a).

Theorem 9.4. Let $Q = (q_{ij})$ be a $Q$-matrix on a countable set $E$. Suppose that there exist a sequence $\{E_n\}_1^\infty$, a constant $c \in \mathbb{R}$, and a non-negative function $\varphi$ such that

(1) $E_n \uparrow E$, $\sup_{i\in E_n}(-q_{ii}) < \infty$ for all $n \ge 1$,

(2) $\lim_{n\to\infty} \inf_{i\notin E_n} \varphi_i = \infty$ and

(3) $\sum_j q_{ij}(\varphi_j - \varphi_i) \le c\,\varphi_i$ for all $i \in E$.

Then the process (MC) is unique.

To justify the power of the theorem, for the above examples, simply take $\varphi(x) = c\big[1 + \sum_{u\in S} x_u\big]$ and $E_n = \{x : \sum_{u\in S} x_u \le n\}$ for some suitable constant $c$. Indeed, it can be proved that the conditions of the theorem are also necessary for the single birth processes [see Chen (1986b) or Chen (1992a, Remark 3.20)], and up to now we do not know any counterexample for which the process is unique but the conditions of Theorem 9.4 fail. The theorem now has a very wide range of applications. For instance, it is a basic result used in the study of the RD-processes (here is a long list of publications:

C. Boldrighini et al (1987), Chen (1985, 1987, 1989c, 1991d, 1994c), Chen, W. D. Ding and D. J. Zhu (1994), Chen, L. P. Huang and X. J. Xu (1991), A. DeMasi and E. Presutti (1992), W. D. Ding, R. Durrett and T. M. Liggett (1990), W. D. Ding and X. G. Zheng (1989), D. Han (1990; 1991; 1992; 1995), L. P. Huang (1987), Y. Li (1991; 1995), J. S. Lu (1997), T. S. Mountford (1992), C. Neuhauser (1990), A. Perrut (2000), T. Shiga (1988), X. G. Zheng and W. D. Ding (1987), S. Z. Tang (1985))

and the mean field models (cf.,

D. A. Dawson and X. Zheng (1991), S. Feng (1994a; 1994b; 1995), S. Feng and X. G. Zheng (1992))

which will be discussed later. It was used in R. R. Chen (1997b) to study an extended class of branching processes, and it was actually a key in J. S. Song (1987) to study Markov decision programming in the unbounded case. The theorem is also included in the book by W. J. Anderson (1991, Corollary 2.2.16) and was followed with some extensions by

K. Hamza and F. C. Klebaner (1995), G. Kersting and F. C. Klebaner (1995), S. P. Meyn and R. L. Tweedie (1993a).

The generalization of Theorem 9.4 to general state spaces given in Chen (1986a) is also meaningful in quantum mechanics; refer to A. Konstantinov et al (1990) and references therein.
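The choice $\varphi(x) = 1 + \sum_{u\in S} x_u$ suggested above for condition (3) of Theorem 9.4 can be tried out numerically. The sketch below is only an illustration under stated assumptions: it uses Schlögl's first model with hypothetical coefficients, a ring of three sites with a hypothetical symmetric $p(u,v)$, diffusion rate $c_u(k) = k$, and a finite grid of states, so it is a sanity check rather than a proof.

```python
from itertools import product

def phi(x):
    """Test function phi(x) = 1 + total number of particles."""
    return 1.0 + sum(x)

def omega_phi(x, birth, death, p):
    """Apply Omega = Omega_r + Omega_d to phi at the state x (tuple of counts)."""
    total = 0.0
    sites = range(len(x))
    for u in sites:
        k = x[u]
        # Reaction part: birth k -> k+1 and death k -> k-1 at site u.
        up = x[:u] + (k + 1,) + x[u + 1:]
        total += birth(k) * (phi(up) - phi(x))
        if k > 0:
            down = x[:u] + (k - 1,) + x[u + 1:]
            total += death(k) * (phi(down) - phi(x))
        # Diffusion part with c_u(k) = k: one particle jumps from u to v.
        for v in sites:
            if v != u and k > 0:
                moved = list(x)
                moved[u] -= 1
                moved[v] += 1
                total += k * p[u][v] * (phi(tuple(moved)) - phi(x))
    return total

# Hypothetical Schlogl first model (m0 = 1) on a ring of three sites.
BETA0, BETA1, DELTA1, DELTA2 = 1.0, 2.0, 1.5, 0.5
birth = lambda k: BETA0 + BETA1 * k
death = lambda k: DELTA1 * k + DELTA2 * k * (k - 1)
p = [[0.0, 0.5, 0.5], [0.5, 0.0, 0.5], [0.5, 0.5, 0.0]]
c = max(BETA0 * len(p), BETA1)  # candidate constant in condition (3)

def condition_3_holds(max_count=6):
    """Check Omega phi <= c * phi on the truncated grid {0..max_count}^3."""
    return all(omega_phi(x, birth, death, p) <= c * phi(x) + 1e-12
               for x in product(range(max_count + 1), repeat=len(p)))

print(condition_3_holds())
```

The diffusion terms contribute nothing here because moving a particle does not change the total count, so only the reaction rates enter $\Omega\varphi$; this is why a linear $\varphi$ works for birth rates of at most linear growth.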

Sketch of the proof of Theorem 9.4. Instead of $(p_{ij}(t))$, we use its Laplace transform

$$p_{ij}(\lambda) = \int_0^\infty e^{-\lambda t}\, p_{ij}(t)\, dt.$$


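The advantage of this is that it reduces an integral equation to an algebraic one.

(a) Let

$$q^{(n)}_{ij} = q_{ij} I_{E_n}(i), \quad i \ne j; \qquad q^{(n)}_i = \sum_{j\ne i} q^{(n)}_{ij}.$$

Then, from condition (1), it follows that $\sup_i q^{(n)}_i < \infty$, and so there is a unique $Q$-process $P_n(\lambda) = \big(p^{(n)}_{ij}(\lambda)\big)$. Next, replacing $c$ with $c^+ = c \vee 0$, condition (3) also holds for $Q_n = \big(q^{(n)}_{ij}\big)$. Because $\big(p^{(n)}_{ij}(\lambda) : i \in E\big)$ is the minimal solution to the backward Kolmogorov equation

$$x_i = \sum_{k\ne i} \frac{q^{(n)}_{ik}}{\lambda + q^{(n)}_i}\, x_k + \frac{\delta_{ij}}{\lambda + q^{(n)}_i}, \qquad i \in E,$$

by the linear combination theorem (§5.6), $\big(\sum_j p^{(n)}_{ij}\varphi_j : i \in E\big)$ is the minimal solution to the equation

$$x_i = \sum_{k\ne i} \frac{q^{(n)}_{ik}}{\lambda + q^{(n)}_i}\, x_k + \frac{\varphi_i}{\lambda + q^{(n)}_i}.$$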

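By condition (3), we have

$$\frac{\varphi_i}{\lambda - c^+} \ge \sum_{k\ne i} \frac{q^{(n)}_{ik}}{\lambda + q^{(n)}_i}\cdot\frac{\varphi_k}{\lambda - c^+} + \frac{\varphi_i}{\lambda + q^{(n)}_i}, \qquad \lambda > c^+$$

$$\Big[\iff \big(\lambda + q^{(n)}_i\big)\varphi_i \ge \sum_{k\ne i} q^{(n)}_{ik}\varphi_k + \varphi_i(\lambda - c^+)\Big].$$

Then, the comparison theorem gives us

$$\sum_j p^{(n)}_{ij}(\lambda)\,\varphi_j \le \frac{\varphi_i}{\lambda - c^+} < \infty, \qquad \lambda > c^+.$$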

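(b) Denote by $\big(p^{\min}_{ij}(\lambda) : i \in E\big)$ the minimal solution to the backward Kolmogorov equation

$$x_i = \sum_{k\ne i} \frac{q_{ik}}{\lambda + q_i}\, x_k + \frac{\delta_{ij}}{\lambda + q_i}, \qquad i \in E.$$

By the linear combination theorem, $\big(p^{\min}_{iA}(\lambda) := \sum_{j\in A} p^{\min}_{ij}(\lambda) : i \in E\big)$ is the minimal solution to the equation

$$x_i = \sum_{k\ne i} \frac{q_{ik}}{\lambda + q_i}\, x_k + \frac{\delta_{iA}}{\lambda + q_i},$$

where $\delta_{iA} = 1$ if $i \in A$ and $= 0$ otherwise.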


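When $i \in E_n$, for all $A \subset E_n$, we have

$$p^{\min}_{iA}(\lambda) = \sum_{k\ne i} \frac{q_{ik}}{\lambda + q_i}\, p^{\min}_{kA}(\lambda) + \frac{\delta_{iA}}{\lambda + q_i}
= \sum_{k\ne i} \frac{q^{(n)}_{ik}}{\lambda + q^{(n)}_i}\, p^{\min}_{kA}(\lambda) + \frac{\delta_{iA}}{\lambda + q^{(n)}_i}.$$

On the other hand, for $i \notin E_n$, we have

$$p^{\min}_{iA}(\lambda) \ge 0 = \sum_{k\ne i} \frac{q^{(n)}_{ik}}{\lambda + q^{(n)}_i}\, p^{(n)}_{kA}(\lambda) + \frac{\delta_{iA}}{\lambda + q^{(n)}_i}
= \sum_{k\ne i} \frac{q^{(n)}_{ik}}{\lambda + q^{(n)}_i}\, p^{\min}_{kA}(\lambda) + \frac{\delta_{iA}}{\lambda + q^{(n)}_i}, \qquad A \subset E_n.$$

By the comparison theorem again, we get

$$p^{\min}_{iA}(\lambda) \ge p^{(n)}_{iA}(\lambda), \qquad i \in E,\ A \subset E_n.$$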

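(c) Finally,

$$\begin{aligned}
\lambda p^{\min}_{iE_n}(\lambda) &\ge \lambda p^{(n)}_{iE_n}(\lambda) = 1 - \lambda p^{(n)}_{iE_n^c}(\lambda) \qquad \big(\text{since } \big(p^{(n)}_{ij}(\lambda)\big) \text{ is non-explosive}\big)\\
&\ge 1 - \lambda \sum_{j\notin E_n} p^{(n)}_{ij}(\lambda)\,\varphi_j \Big/ \inf_{k\notin E_n}\varphi_k\\
&\ge 1 - \frac{\lambda\varphi_i}{\inf_{k\notin E_n}\varphi_k}\cdot\frac{1}{\lambda - c^+}, \qquad \lambda > c^+.
\end{aligned}$$

Letting $n \to \infty$, by condition (2), it follows that $\lambda p^{\min}_{iE}(\lambda) \ge 1$. This implies the uniqueness as required.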

We have seen how the models lead us to resolve one of the classical problems for MCs and produce some effective results. Some new solutions to the recurrence and positive recurrence problems are also given in S. J. Yan and Chen (1986), Chen (1986b) and Chen (1992a, Chapter 4). However, the positive recurrence for the Brussel's model was proved only in 1991 by D. Han in the case of $d = 1$ and by J. W. Chen (1995) for the general finite-dimensional situation [cf. Chen (1992a, Example 4.50)]. From the papers listed above, one can see again many applications of these results, but we will not go into the details here. In conclusion, the finite-dimensional Schlögl's and Brussel's models are all ergodic and so have no phase transitions. Thus, in order to study the phase transition phenomena for the systems, we have to go to the infinite-dimensional situation.

Before moving further, let us compare the above models with the famous Ising model.

• The state space $E = \{-1,+1\}^{\mathbb{Z}^d}$ for the Ising model is compact, but for Schlögl's models, the state space $E = \mathbb{Z}_+^{\mathbb{Z}^d}$ is neither compact nor locally compact.

• The Ising model is reversible and its local Gibbs distributions are explicit. But Schlögl's models have no such advantages except in a very special case.

• The Ising model has at least one stationary distribution since every Feller process with compact state space does. But in the non-compact case, the conclusion may not be true.

• The generator of the Ising model is locally bounded, but this is not so for Schlögl's models.

In summary, we have Table 9.4.

| Comparison | Ising model | Schlögl model |
|---|---|---|
| Space | $\{-1,+1\}^{\mathbb{Z}^d}$: compact | $\mathbb{Z}_+^{\mathbb{Z}^d}$: not locally compact |
| System | equilibrium, reversible | non-equilibrium, irreversible |
| Operator | locally bounded | not locally bounded and non-linear |
| Stationary distribution | always exists and locally explicit | ? locally no expression |

Table 9.4 Comparison of Ising model and Schlögl models

From these facts, it should be clear that the Ising and Schlögl's models are very different.

9.3 Construction of the processes

The diffusion part of the operator for RD-processes cannot be ignored; otherwise, there is no interaction and then the processes are simply the classical MCs. If we forget the reaction part, then the processes reduce to the well-known zero range processes. For these, the construction was completed step by step by several authors. In a special case, the process was constructed by R. Holley (1970); the general case was done by T. M. Liggett (1973). Then, E. D. Andjel (1982) and T. M. Liggett and F. Spitzer (1981) simplified the construction. For all the models considered in the last quoted paper, the coefficients of the operator are assumed to be locally bounded and linear. Thus, even in this simpler case, the construction is still not simple.

A standard tool in constructing Markov processes is semigroup theory, as was used by T. M. Liggett (1985) to construct a large class of interacting particle systems. However, the theory is not helpful in the present situation. Even if one has a semigroup at hand, it is still quite a distance to the construction of the process, since in our case we do not have the Riesz representation theorem for constructing the transition probability kernel. Moreover, to the author's knowledge, since the state space is so poor, the usual weak convergence (even on the path space) is not effective for the construction. What we adopt is a stronger convergence.

Recall that for two given probability measures $P_1$ and $P_2$ on a measurable state space $(E, \mathscr{E})$, a coupling of $P_1$ and $P_2$ is a probability measure $P$ on the product space $(E\times E, \mathscr{E}\times\mathscr{E})$ having the marginality: $P(A\times E) = P_1(A)$ and $P(E\times A) = P_2(A)$ for all $A \in \mathscr{E}$. Next, assume that $(E, \rho, \mathscr{E})$ is a metric space with distance $\rho$. The Wasserstein distance $W(P_1, P_2)$ of $P_1$ and $P_2$ is defined by

$$W(P_1, P_2) = \inf_P \int_{E^2} \rho(x_1, x_2)\, P(dx_1, dx_2), \tag{9.2}$$

where $P$ varies over all couplings of $P_1$ and $P_2$. Refer to §2.2 and Chen (1992a, Chapter 5) for further properties of the Wasserstein distance.
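As a small illustration of (9.2), the sketch below works on the finite state space $\{0, \dots, n-1\}$ with $\rho(x,y) = |x-y|$. It relies on the standard fact (not stated in the text) that on the line the infimum equals $\sum_k |F_1(k) - F_2(k)|$ for the two distribution functions, and it compares this with the cost of the independent coupling, which is one admissible but generally non-optimal $P$. All function names are hypothetical.

```python
def wasserstein_line(p1, p2):
    """W(P1, P2) on {0,...,n-1} with rho(x, y) = |x - y|, via sum_k |F1(k) - F2(k)|."""
    cdf1 = cdf2 = 0.0
    total = 0.0
    for a, b in zip(p1, p2):
        cdf1 += a
        cdf2 += b
        total += abs(cdf1 - cdf2)
    return total

def independent_coupling(p1, p2):
    """The product measure P(x1, x2) = P1(x1) * P2(x2); its marginals are P1 and P2."""
    return [[a * b for b in p2] for a in p1]

def coupling_cost(coupling):
    """Integral of |x1 - x2| against a coupling given as a matrix P[x1][x2]."""
    return sum(abs(i - j) * w
               for i, row in enumerate(coupling)
               for j, w in enumerate(row))

p1 = [0.5, 0.5]
p2 = [0.5, 0.5]
print(wasserstein_line(p1, p2))                      # identical laws: distance 0
print(coupling_cost(independent_coupling(p1, p2)))   # independent coupling costs 0.5
```

Every coupling gives an upper bound for $W$, but the example shows that a careless choice can be far from the infimum.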

Now, our strategy goes as follows. Take a sequence of finite subsets $\{\Lambda_n\}$ of $S = \mathbb{Z}^d$, $\Lambda_n \uparrow S$. Using $\Lambda_n$ instead of $S$, we obtain an MC $P_n(t, x, \cdot)$ as mentioned in the last section. For each $n < m$, one may regard $P_n(t,x,\cdot)$ as an MC on the space $E_m := \mathbb{Z}_+^{\Lambda_m}$, and hence for fixed $t \ge 0$ and $x \in E_m$, the distance $W(P_n(t,x,\cdot), P_m(t,x,\cdot))$ of $P_n(t,x,\cdot)$ and $P_m(t,x,\cdot)$ is well defined. Clearly, one key step in our construction is to prove that

$$W(P_n(t,x,\cdot),\ P_m(t,x,\cdot)) \longrightarrow 0 \qquad \text{as } m, n \to \infty. \tag{9.3}$$

Certainly, there is no hope of computing the $W$-distance exactly since $P_n(t,x,\cdot)$ is not explicitly known. In view of (9.3), we need only an upper bound of the distance, and moreover it follows from (9.2) that every coupling gives us such a bound. The problem is that a coupling measure of $P_n(t,x,\cdot)$ and $P_m(t,x,\cdot)$ for fixed $t$ and $x$ is still not easy to construct, again due to the fact that these marginal measures are not known explicitly. What we know is mainly the operators $\Omega_n$ obtained from $\Omega$ by replacing $\mathbb{Z}^d$ with $\Lambda_n$. Thus, in order to get some practical coupling, it is natural to restrict ourselves to the Markovian coupling, i.e., the coupling process itself is again an MC. This analysis leads us to explore a theory of couplings for time-continuous Markov processes, which dates back to Chen (1984).

Since then, we have gone a long way in the field: from MCs to general jump processes [Chen (1986a)], from discrete state spaces to continuous spaces [Chen and S. F. Li (1989)], from Markovian couplings to optimal Markovian couplings [Chen (1994a,b)], from exponential convergence to the estimation of the spectral gap [Chen and F. Y. Wang (1993b), Chen (1994a)], from compact manifolds to non-compact ones [Chen and F. Y. Wang (1995; 1997a; 1997b)] and from finite dimension to infinite dimension [Chen (1987, 1989c, 1991d, 1994c), F. Y. Wang (1994c; 1995; 1996)]. No doubt, the coupling method is now a powerful tool and has many applications. The story of our study on couplings is presented in Chapter 2.

We now return to our main construction. We will restrict ourselves to a single reactant for a while. Let $(k_u)$ be a positive summable sequence and set

$$E_0 = \Big\{x \in E : \|x\| := \sum_{u\in S} x_u k_u < \infty\Big\},$$

i.e., an $L^1$-subspace of $E$ with respect to $(k_u)$. Roughly speaking, the key of our construction (which is rather lengthy and technical) is to get the following estimates:

(E.1) $P_n(t)\|\cdot\|(x) \le (1 + \|x\|)e^{ct}$, $x \in E_0$, and

(E.2) $W_{\Lambda_n}(P_n(t,x,\cdot), P_m(t,x,\cdot)) \le c(t, \Lambda_n, x; n, m)$, $x \in E_0$,

where $c$ is a constant independent of $n$, the quantities $c(t, \Lambda_n, x; n, m) \in \mathbb{R}_+$ satisfy

$$\lim_{m > n \to \infty} c(t, \Lambda_n, x; n, m) = 0,$$

and $W_V$ is the Wasserstein distance restricted to $\mathbb{Z}_+^V$, with respect to the underlying distance $\sum_{u\in V} |x_u - y_u| k_u$. The second condition (E.2) shows that $\{P_n(t,x,\cdot) : n \ge 1\}$ is a Cauchy sequence in the $W_V$-distance (for fixed finite $V$). Note that our operators are not locally bounded and the particles from infinitely many sites may move to a single site, so the process may explode at some single site. This explains why we use $E_0$ instead of $E$. Then, the first moment condition (E.1) ensures that $E_0$ is a closed set of the process. Finally, in order to prove that the limiting process satisfies the Chapman–Kolmogorov equation, some kind of uniform control in the second condition is also needed.

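To state our main result, we need some assumptions.

$$\sup_v \sum_u p(u,v) < \infty, \tag{9.4}$$

$$\sum_{k\ne 0} q_u(i, i+k)\,|k| < \infty, \qquad u \in S, \tag{9.5}$$

$$\sup_{k,u} |c_u(k) - c_u(k+1)| < \infty, \tag{9.6}$$

$$\sup\big\{g_u(j_1, j_2) + h_u(j_1, j_2) : u \in S,\ j_2 > j_1 \ge 0\big\} < \infty, \tag{9.7}$$

where

$$g_u(j_1, j_2) = \frac{1}{j_2 - j_1} \sum_{k\ne 0} \big(q_u(j_2, j_2 + k) - q_u(j_1, j_1 + k)\big)k, \qquad j_2 > j_1 \ge 0,$$

$$h_u(j_1, j_2) = \frac{2}{j_2 - j_1} \sum_{k=0}^{\infty} \Big[\big(q_u(j_2, j_1 - k) - q_u(j_1, 2j_1 - j_2 - k)\big)^+ + \big(q_u(j_1, j_2 + k) - q_u(j_2, 2j_2 - j_1 + k)\big)^+\Big]k, \qquad j_2 > j_1 \ge 0.$$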

The conditions (9.1), (9.4) and (9.5) are natural. For instance, when $p(u,v)$ is the simple random walk, (9.4) becomes trivial. However, the conditions (9.6) and (9.7) are essential in this construction; they are the keys to the estimates (E.1) and (E.2) mentioned above and also to the study of the mean field models discussed below. To get some feeling for condition (9.7), let us explain the coupling adopted to deduce (E.2). For the diffusion part, in the box $\Lambda_n$, we use the coupling by marching soldiers. For each pair $u, v \in \Lambda_n$, let

$$(x, y) \to
\begin{cases}
(x - e_u + e_v,\ y - e_u + e_v) & \text{at rate } p(u,v)\big(c_u(x_u) \wedge c_u(y_u)\big),\\
(x - e_u + e_v,\ y) & \text{at rate } p(u,v)\big(c_u(x_u) - c_u(y_u)\big)^+,\\
(x,\ y - e_u + e_v) & \text{at rate } p(u,v)\big(c_u(y_u) - c_u(x_u)\big)^+.
\end{cases}$$

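In the box $\Lambda_m \setminus \Lambda_n$, since $(x_u : u \in \Lambda_m \setminus \Lambda_n)$ is absorbed, only the second process evolves: for each $u \in \Lambda_n$ and $v \in \Lambda_m \setminus \Lambda_n$, take

$$(x, y) \to (x,\ y - e_u + e_v) \qquad \text{at rate } p(u,v)\,c_u(y_u).$$

Conversely, for $u \in \Lambda_m \setminus \Lambda_n$ and $v \in \Lambda_m$, we have the same evolution as in the last line. Besides, for different pairs, the couplings are taken to be independent. In other words, we have the coupling operator $\Omega^d_{n,m}$ for the diffusion part as follows.

$$\begin{aligned}
\Omega^d_{n,m} f(x,y)
&= \sum_{u,v\in\Lambda_n} p(u,v)\big(c_u(x_u)\wedge c_u(y_u)\big)\big[f(x - e_u + e_v,\ y - e_u + e_v) - f(x,y)\big]\\
&\quad + \sum_{u,v\in\Lambda_n} p(u,v)\big(c_u(x_u) - c_u(y_u)\big)^+\big[f(x - e_u + e_v,\ y) - f(x,y)\big]\\
&\quad + \sum_{u,v\in\Lambda_n} p(u,v)\big(c_u(y_u) - c_u(x_u)\big)^+\big[f(x,\ y - e_u + e_v) - f(x,y)\big]\\
&\quad + \sum_{u\in\Lambda_n,\ v\in\Lambda_m\setminus\Lambda_n} p(u,v)\,c_u(y_u)\big[f(x,\ y - e_u + e_v) - f(x,y)\big]\\
&\quad + \sum_{u\in\Lambda_m\setminus\Lambda_n,\ v\in\Lambda_m} p(u,v)\,c_u(y_u)\big[f(x,\ y - e_u + e_v) - f(x,y)\big].
\end{aligned}$$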
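The three rates used by the coupling by marching soldiers inside $\Lambda_n$ can be sketched in a few lines. This is only an illustration: the function names are hypothetical, and the default $c_u(k) = k$ is the choice from Example 9.1, not a requirement of the coupling.

```python
def basic_coupling_rates(rate_x, rate_y):
    """Split two jump rates into (both jump together, only x jumps, only y jumps)."""
    together = min(rate_x, rate_y)
    only_x = max(rate_x - rate_y, 0.0)
    only_y = max(rate_y - rate_x, 0.0)
    return together, only_x, only_y

def diffusion_coupling_rates(x, y, u, p_uv, c=lambda k: k):
    """Coupled rates for moving one particle out of site u with weight p(u, v)."""
    return tuple(p_uv * r for r in basic_coupling_rates(c(x[u]), c(y[u])))

# Marginality check: each component jumps at its own original rate.
together, only_x, only_y = basic_coupling_rates(3.0, 5.0)
print(together + only_x, together + only_y)   # recovers 3.0 and 5.0

print(diffusion_coupling_rates((3, 1), (5, 0), u=0, p_uv=0.5))
```

The identity $a \wedge b + (a - b)^+ = a$ is what makes each marginal of the coupled process evolve according to its own operator, while the two components move together as often as possible.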

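For the reaction part, in the box $\Lambda_n$, we also use the coupling by marching soldiers. For each $u \in \Lambda_n$, let

$$
\begin{aligned}
(x, y) &\to (x + k e_u,\ y + k e_u) &&\text{at rate } q_u(x_u, x_u + k) \wedge q_u(y_u, y_u + k)\\
&\to (x + k e_u,\ y) &&\text{at rate } \big(q_u(x_u, x_u + k) - q_u(y_u, y_u + k)\big)^+\\
&\to (x,\ y + k e_u) &&\text{at rate } \big(q_u(y_u, y_u + k) - q_u(x_u, x_u + k)\big)^+.
\end{aligned}
$$

Again, for each $u \in \Lambda_m \setminus \Lambda_n$, let the second process evolve as follows:

$$(x, y) \to (x,\ y + k e_u) \quad\text{at rate } q_u(y_u, y_u + k).$$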

Finally, for the reaction part, let each component evolve independently. Thus, we have defined a coupling operator $\Omega^r_{n,m}$ for the reaction part. Then the whole coupling of the operators $\Omega_m$ and $\Omega_n$ is defined by $\Omega_{n,m} = \Omega^d_{n,m} + \Omega^r_{n,m}$.

Computing the action of the coupling operator on the distance in $\mathbb{Z}_+^{\Lambda_m}$, we get the condition (9.7) and an estimate for (E.2). The computations are rather long and technical; only a part of them is illustrated at the end of this chapter. Refer to Chen (1992a, Chapter 13) for details.

The next result is due to Chen (1985), first reported at the Second International Conference on Random Fields, Hungary, 1984. See also Chen (1987, 1986b, 1992a) for more general results.

Theorem 9.5. Denote by $\mathscr{E}_0$ the Borel σ-algebra generated by the distance $\|\cdot\|$ on $E_0$. Under (9.1) and (9.4)–(9.7), there exists a Markov process on $(E_0, \mathscr{E}_0)$ whose semigroup $(P_t)$ maps the set of Lipschitz functions on $E_0$ with respect to $\|\cdot\|$ into itself. Moreover, for every Lipschitz function $f$ on $E_0$, the derivative of $P_t f$ at the origin coincides with $\Omega f$ in a dense set of $E_0$.

It is now a simple matter to justify the assumptions of Theorem 9.5 for Examples 9.1 and 9.2. However, up to now, we do not know how to choose a distance so that our general theorem [Chen (1987, 1986b, 1992a)] can be applied to obtain a Lipschitz semigroup for Example 9.3. In the case where the diffusion rates are bounded or grow at most as fast as $\log x_u$, an infinite-dimensional process corresponding to Example 9.3 was constructed by S. Z. Tang (1985) [see also Chen (1992a, Example 13.38)] and D. Han (1990; 1992; 1995), respectively. For the mean field models, the problem was solved by S. Feng (1995). In the latter papers, the martingale approach was adopted rather than the analytic one used here.

Open Problem 9.6. Construct a Markov process for the Brussel’s model.

The next result is due to Y. Li (1991); it improves the author's earlier result in (1991d).

Theorem 9.7. Under the same assumptions as in Theorem 9.5, if additionally

$$\sup_u \sum_{k \ne 0} q_u(i, i + k)\big[(i + k)^m - i^m\big] \le \text{constant}\,(1 + i^m), \qquad i \in \mathbb{Z}_+ \tag{9.8}$$

for some $m > 1$, then the process constructed by Theorem 9.5 is also unique.

The proof of Theorem 9.7 is also non-trivial. It uses an infinite-dimensional version of the maximum principle, due to S. Z. Tang (1985) and Y. Li (1991). This is the third mathematical tool developed from the study of RD-processes.

9.4 Ergodicity and phase transitions

Existence of stationary distributions

When the state space is compact, it is known that every Feller process has a stationary distribution. In the non-compact case, however, there is no such general theorem, and so one has to work case by case. The next result is a particular case of Chen (1986b, 1992a). See also L. P. Huang (1987).

Theorem 9.8. There always exists at least one stationary distribution for the polynomial model.

The intuition for the result is quite clear: since the order of the death rate is higher than that of the birth rate, the number of particles at each site is kept nearly bounded, and so we may return to the compact situation. However, the proof depends heavily on the construction of the process. We will not go into the details here.

Ergodicity

There are two cases.

(a) General case. By using the coupling method again, some general sufficient conditions for the ergodicity of the processes were presented in Chen (1986b; 1989c). The result was then improved in C. Neuhauser (1990) and further improved in Chen (1990). In the case where the coefficients of the operator are translation invariant and there is an absorbing state, some refined results are given in Y. Li (1995). A particular result from Chen (1990) can be stated as follows.

Theorem 9.9. For the polynomial model, when $\beta_1, \cdots, \beta_{m_0}$ and $\delta_1, \cdots, \delta_{m_0+1}$ are fixed, the processes are ergodic for all large enough $\beta_0$.

We will come back to this topic at the end of this chapter.

(b) Reversible case. When the reaction part is a birth–death process with birth rates $b(k)$ and death rates $a(k)$, the RD-process is reversible iff $p(u, v) = p(v, u)$ and $(k+1)b(k)/a(k)$ is a constant independent of $k$ [cf. Chen, W. D. Ding and D. J. Zhu (1994)].

The next result is due to W. D. Ding, R. Durrett and T. M. Liggett (1990).

Theorem 9.10. For the reversible polynomial model, the processes are always ergodic.

The proof of the result is a nice illustration of the application of the free energy method. It also uses the power of the monotonicity of the processes. The result was then extended by Chen, W. D. Ding and D. J. Zhu (1994) to the non-polynomial case.

If we replace $\beta_0 > 0$ with $\beta_0 = 0$, then we obtain two stationary distributions: one is trivial and the other is non-trivial. The question is whether, starting from a non-trivial initial distribution, the process converges to the non-trivial stationary distribution (ergodic) or not. The affirmative assertion is called Shiga's conjecture, which was solved by T. S. Mountford (1992).

Theorem 9.11. For the reversible polynomial model with $\beta_0 = 0$, under mild assumptions, Shiga's conjecture is correct.


Phase transitions

(a) RD-processes with absorbing state. The following result was first proved by Y. Li and X. G. Zheng (1988) by using a color graph representation and then simplified by R. Durrett (1988) by using oriented percolation [see Chen (1992a, Theorem 15.8)].

Theorem 9.12. Take $S = \mathbb{Z}$. Consider the RD-process with birth rates $b(k) = \lambda k$, arbitrary death rates $a(k) > 0$ ($k \ge 1$) and the diffusion coefficient $x_u p(u, v)$, where $p(u, v)$ is the simple random walk. Then for the process $X^0(t)$ starting from $x^0$: $x^0_0 = 1$ and $x^0_u = 0$ for all $u \ne 0$, we have

$$\inf\big\{\lambda : \mathbb{P}\big[X^0(t) \not\equiv 0 \text{ for all } t > 0\big] > 0\big\} < \infty.$$

In other words, for some $\lambda > 0$, there exists a non-trivial stationary distribution besides the trivial one.

(b) Mean field models. In statistical physics, one often studies the mean field models as simplified approximations of the original ones. It is a common phenomenon that the mean field models exhibit phase transitions more easily. Roughly speaking, the mean field model of an RD-process is the time-inhomogeneous birth–death process on $\mathbb{Z}_+$ with death rates $a(k)$ as usual but with birth rates $b(k) + \mathbb{E}X(t)$, where $(X(t))_{t \ge 0}$ denotes the process. The term $\mathbb{E}X(t)$ represents the interaction of the particle at the present site with the particles at the other sites in the original models. The next result is due to S. Feng and X. G. Zheng (1992).

Theorem 9.13. For the mean field of the second Schlögl model, there always exists at least one stationary distribution. There is precisely one if δ1, δ3 1 and there are more than two if δ1 < δ21 < 1/2 + (2β2 + 1)/(3δ1 + 6δ3) and β0 is small enough.

For more information about the study of the mean field models, refer to D. A. Dawson and X. G. Zheng (1991), S. Feng (1994a; 1994b; 1995), S. Feng and X. G. Zheng (1992). In B. Djehiche and I. Kaj (1995), the models are treated as a measure-valued process. Here, we mention another model which exhibits phase transitions, the linear growth model; refer to W. D. Ding and X. G. Zheng (1989). However, we are still unable to answer the following problem.

Open Problem 9.14. Does there exist more than one stationary distribution for the polynomial model with no absorbing states?

The last phrase means that $\beta_0 > 0$. In physics, this represents an exchange of energy between inside and outside. From the mathematical point of view, there is an essential difference between $\beta_0 = 0$ and $\beta_0 > 0$. For instance, when $\beta_0 = 0$, the process restricted to $\{x : \sum_u x_u < \infty\}$ is simply an MC, but this is no longer true when $\beta_0 > 0$.

The RD-processes are quite involved, partially due to the non-compactness of the state space. Thus, one may construct some similar models with finite spin space to simplify the study. There are a lot of publications in this direction. Refer to R. Durrett and S. Levin (1994), R. Durrett (1995), R. Durrett and C. Neuhauser (1994) and references therein.

9.5 Hydrodynamic limits

Consider again the polynomial model. However, we now study the process with the rescaled operator $\Omega^\varepsilon = \varepsilon^{-2}\Omega^d + \Omega^r$. Our main purpose is to find the limiting behavior of the scaled processes as $\varepsilon \to 0$. To do so, let $\mu^\varepsilon$ ($\varepsilon > 0$) be the independent product of the Poisson measures for which $\mu^\varepsilon(x_u) = \rho(\varepsilon u)$, $u \in \mathbb{Z}^d$, where $\rho$ is a non-negative, bounded $C^2(\mathbb{R}^d)$-function with bounded first derivative.

Denote by $\mathbb{E}^\varepsilon_{\mu^\varepsilon}$ the expectation of the process with generator $\Omega^\varepsilon$ and initial distribution $\mu^\varepsilon$. The next result is due to C. Boldrighini, A. DeMasi, A. Pellegrinotti and E. Presutti (1987) [see Chen (1992a, Theorem 16.1)]. Refer also to T. Funaki (1997; 1999), J. F. Feng (1996) and A. Perrut (2000).

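Theorem 9.15. For all $r = (r^1, \cdots, r^d) \in \mathbb{R}^d$ and $t > 0$, the limit $f(t, r) := \lim_{\varepsilon \to 0} \mathbb{E}^\varepsilon_{\mu^\varepsilon} X_{[r/\varepsilon]}(t)$, where $[r/\varepsilon] = ([r^1/\varepsilon], \cdots, [r^d/\varepsilon]) \in \mathbb{Z}^d$, exists and satisfies the RD-equation:

$$
\begin{cases}
\dfrac{\partial f}{\partial t} = \dfrac{1}{2}\displaystyle\sum_{i=1}^{d} \dfrac{\partial^2 f}{\partial (r^i)^2} + \sum_{j=0}^{m_0} \beta_j f^j - \sum_{j=1}^{m_0+1} \delta_j f^j,\\[2mm]
f(0, r) = \rho(r).
\end{cases}
\tag{9.9}
$$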

This result explains the relation between the RD-process and the RD-equation, and it is indeed the original reason why the processes were named RD-processes in Chen (1985). Certainly, at that time, a result like Theorem 9.15 did not exist; we had only a rough impression that the RD-equations describe the macroscopic behavior of the physical systems, and our aim was to introduce the processes as the microscopic description of the same systems.

To give some insight into the relation between these two subjects, we need some notation. Let $\lambda > 0$ satisfy the algebraic equation

$$\sum_{j=0}^{m_0} \beta_j \lambda^j - \sum_{j=1}^{m_0+1} \delta_j \lambda^j = 0. \tag{9.10}$$

Such a constant $\lambda$ is the simplest solution to the first equation of (9.9). A (constant equilibrium) solution $\lambda$ is called asymptotically stable if there exists a $\delta > 0$ such that for any solution $f(t, r)$ to (9.9), whenever $|f(0, r) - \lambda| < \delta$, we have $\lim_{t \to \infty} |f(t, r) - \lambda| = 0$.

The following result is due to X. J. Xu (1991) [see Chen (1992a, Theorem16.2)].


Theorem 9.16. Denote by $\lambda_1 > \lambda_2 > \cdots > \lambda_k$ the non-negative roots of (9.10), where $\lambda_j$ has multiplicity $m_j$. Then, $f(t, r) \equiv \lambda_i$ is asymptotically stable iff $m_i$ is odd and $\sum_{j \le i-1} m_j$ is even.
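The parity rule of Theorem 9.16 is easy to mechanize. The following is a minimal sketch: the function name `stable_roots` is hypothetical, and the multiplicities are assumed to be listed from the largest root to the smallest, as in the theorem.

```python
def stable_roots(multiplicities):
    """Apply the parity rule of Theorem 9.16.

    multiplicities[i-1] is m_i, the multiplicity of the i-th largest root.
    Root i is flagged stable iff m_i is odd and m_1 + ... + m_{i-1} is even.
    """
    flags = []
    preceding = 0  # sum of the multiplicities of the strictly larger roots
    for m in multiplicities:
        flags.append(m % 2 == 1 and preceding % 2 == 0)
        preceding += m
    return flags


print(stable_roots([1, 1, 1]))  # three simple roots: [True, False, True]
print(stable_roots([2, 1]))     # a double root, then a simple one: [False, True]
```

For three simple roots the rule marks the largest and the smallest as stable and the middle one as unstable.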

All the known results are consistent with the assertion: a model has no phase transition iff every $\lambda_j$ is asymptotically stable, and this is the case for Schlögl's first model. This leads to the following conjecture.

Conjecture 9.17. (1) Schlögl's first model has no phase transition. (2) Schlögl's second model has phase transitions.

To conclude this chapter, we want to show a use of the RD-equation. Note that for Schlögl's second model, the role played by each of the parameters $\beta_k$ and $\delta_k$ is not clear at all. It seems too hard, and may not be necessary, to consider all the parameters. Based on the above observation and to keep the physical meaning, we fix $\beta_2 = 6\alpha$ ($\alpha > 0$), $\delta_1 = 9\alpha$ and $\delta_3 = \alpha$. Then, when $\beta_0 \in (0, 4\alpha)$, there are three roots $\lambda_1 > \lambda_2 > \lambda_3 > 0$; $\lambda_1$ and $\lambda_3$ are asymptotically stable but $\lambda_2$ is not. We have thus reduced the four parameters to only one. Now, we want to know for which region of $\alpha$ the process can be ergodic. The following result is based on the new progress on couplings [cf. Chen (1994a)]; it is a complement to Theorem 9.9 and is also the most precise information we have so far.

Theorem 9.18 (Chen, 1994c). Consider the second Schlögl model with $\beta_0 = 2\alpha$, $\beta_2 = 6\alpha$, $\delta_1 = 9\alpha$ and $\delta_3 = \alpha$. Then, the processes are exponentially ergodic for all $\alpha > 0.7303$.
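As a worked check of the root structure for this parameter choice (our own arithmetic, assuming that $\beta_0, \beta_2, \delta_1, \delta_3$ are the only non-zero coefficients of the second Schlögl model), equation (9.10) divided by $-\alpha$ becomes a cubic:

```latex
\beta_0 + 6\alpha\lambda^2 - 9\alpha\lambda - \alpha\lambda^3 = 0
\iff
g(\lambda) := \lambda(\lambda - 3)^2 = \lambda^3 - 6\lambda^2 + 9\lambda = \beta_0/\alpha .
% g has a local maximum g(1) = 4 and a local minimum g(3) = 0,
% so there are three distinct positive roots iff 0 < \beta_0/\alpha < 4.
% For \beta_0 = 2\alpha:
\lambda^3 - 6\lambda^2 + 9\lambda - 2 = (\lambda - 2)(\lambda^2 - 4\lambda + 1) = 0,
\qquad
\lambda_1 = 2 + \sqrt{3} \approx 3.732,\quad
\lambda_2 = 2,\quad
\lambda_3 = 2 - \sqrt{3} \approx 0.268 .
```

All three roots are simple, so by the parity rule of Theorem 9.16 the roots $\lambda_1$ and $\lambda_3$ are asymptotically stable and $\lambda_2$ is not.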

We now sketch the proof of the last theorem. Actually, we have a general result as follows.

Theorem 9.19. Consider the polynomial model. Let $(u_k)$ be a positive sequence on $\mathbb{Z}_+$ with $u_0 = 1$ and $\bar u := \sup_{k \ge 0} u_k < \infty$. Set $u^* = \sup_{j > i \ge 0}(u_j - u_i) \vee 0$. Suppose that there exists an $\varepsilon > 0$ such that

$$b_{k+1}u_{k+1} - (b_k + a_{k+1} + k + 1 - \varepsilon)u_k + (a_k + k)u_{k-1} + \bar u + k u^* \le 0, \qquad k \ge 0,$$

where $a_0 = 0$ and $u_{-1} = 1$. Then the reaction–diffusion processes are ergodic.

Sketch of the proof. (a) Define a distance in $\mathbb{Z}_+$ as follows:

$$\rho(k, \ell) = \bigg|\sum_{j<k} u_j - \sum_{j<\ell} u_j\bigg|, \qquad k, \ell \in \mathbb{Z}_+.$$

By Theorem 2.28, for birth–death processes, the couplings mentioned in Chapter 2, except the independent one, are all $\rho$-optimal. Thus, we now adopt the simplest one, the classical coupling. Denote by $\Omega_c$ the coupling operator of the reaction–diffusion processes; that is, we use the classical coupling for each component of the reaction part but, for the diffusion part, still use the coupling by marching soldiers mentioned in §9.3. Fix $x \le y$ and $u \in S$, and write $x_u = i \le j = y_u$. We have

$$
\begin{aligned}
\Omega_c \rho(i, j) ={}& \big\{-b_i u_i + a_i u_{i-1} + b_j u_j - a_j u_{j-1}\big\} I_{\{j-i \ge 1\}} - (j - i)u_{j-1}\\
& - i(u_{j-1} - u_{i-1}) + \sum_v (y_v - x_v)p(v, u)u_j + \sum_v x_v p(v, u)(u_j - u_i)\\
={}& \big\{b_j u_j - b_i u_i - (a_j + j)u_{j-1} + (a_i + i)u_{i-1}\big\} I_{\{j-i \ge 1\}}\\
& + \sum_v (y_v - x_v)p(v, u)u_j + \sum_v x_v p(v, u)(u_j - u_i).
\end{aligned}
$$

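The last term on the right-hand side appears since $\rho$ is not translation invariant. Now, by assumption, we have

$$
\begin{aligned}
&\big\{b_j u_j - b_i u_i - (a_j + j)u_{j-1} + (a_i + i)u_{i-1}\big\} I_{\{j-i \ge 1\}}\\
&\qquad = \sum_{\ell=i}^{j-1} \Big\{\big(b_{\ell+1}u_{\ell+1} - b_\ell u_\ell\big) - \big[(a_{\ell+1} + \ell + 1)u_\ell - (a_\ell + \ell)u_{\ell-1}\big]\Big\}\\
&\qquad \le -\varepsilon \sum_{\ell=i}^{j-1} u_\ell - (j - i)\bar u - (j - i)\,i\,u^*\\
&\qquad \le -\varepsilon \rho(i, j) - (j - i)\bar u - i u^*.
\end{aligned}
$$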

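On the other hand, by the order-preserving property of the coupling and the translation invariance of the processes, for every translation invariant $x$ and $y$ with $x \le y$, we have

$$
\begin{aligned}
&\sum_v \mathbb{E}^{x,y}\big(Y_v(t) - X_v(t)\big)p(v, u)\,u_{Y_u(t)} + \sum_v \mathbb{E}^{x,y}X_v(t)\,p(v, u)\big(u_{Y_u(t)} - u_{X_u(t)}\big)\\
&\qquad \le \bar u\,\mathbb{E}^{x,y}\big(Y_u(t) - X_u(t)\big) + u^*\,\mathbb{E}^{x,y}X_u(t).
\end{aligned}
$$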

Collecting the above estimates together and replacing $i$ and $j$ by $X_u(t)$ and $Y_u(t)$, respectively, we arrive at

$$\mathbb{E}^{x,y}\Omega_c\rho\big(X_u(t), Y_u(t)\big) \le -\varepsilon\,\mathbb{E}^{x,y}\rho\big(X_u(t), Y_u(t)\big), \qquad t \ge 0.$$

By Gronwall's lemma or Lemma A.6, this gives us

$$\mathbb{E}^{x,y}\rho\big(X_u(t), Y_u(t)\big) \le \mathbb{E}^{x,y}\rho\big(X_u(1), Y_u(1)\big)e^{-\varepsilon t}, \qquad t \ge 0$$

for every translation invariant $x$ and $y$.

(b) The reason we use the time $t = 1$ as the initial value rather than $t = 0$ is the moment estimate

$$\mathbb{E}^x\big[X_u(t)^m\big] \le \varphi_m(t) < \infty, \qquad t > 0,\ m \in \mathbb{N}$$

[cf. Chen (1992a, Lemma 14.12)]. Thus, we can extend the initial state to be $\infty$ everywhere. Let $(X^n_t)$ be the process starting from $(x_u = n,\ u \in \mathbb{Z}^d)$. Then, by (a), we obtain

$$\mathbb{E}\rho\big(X^0_u(t), Y^\infty_u(t)\big) \le \mathbb{E}\rho\big(X^0_u(1), Y^\infty_u(1)\big)e^{-\varepsilon t}, \qquad t \ge 0.$$

This certainly implies the ergodicity of the process, because of the translation invariance.

To complete the proof of Theorem 9.18, by Theorem 9.19, it remains to choose a suitable positive sequence $(u_i)$. Regarding the reaction–diffusion processes as perturbations of the birth–death processes, it is natural to choose the sequence by mimicking the eigenfunction which produces the explicit criterion for the exponential convergence (or equivalently, the spectral gap). More precisely, take

$$u_i = \frac{g_{i+1} - g_i}{g_1 - g_0}, \qquad i \ge 0,$$

where

$$g_i = \sum_{j=0}^{i-1} \frac{1}{\mu_j b_j} \sum_{k=j+1}^{\infty} \mu_k \sqrt{\varphi_k}, \qquad \varphi_i = \sum_{j=0}^{i-1} \frac{1}{\mu_j b_j}, \qquad i \ge 0.$$

Of course, this universal sequence produces a rougher result than Theorem 9.18. To obtain the latter, one needs more work to find a better sequence $(u_i)$. Refer to Chen (1994c).

A large number of publications on hydrodynamic limits are collected in the book by C. Kipnis and C. Landim (1999), from which one sees that the spectral gap and the logarithmic Sobolev inequalities play a critical role. The spectral gap for the Ising model in dimension one was computed explicitly by R. A. Minlos and A. Trisch (1994). For higher-dimensional results, refer to A. D. Sokal and L. E. Thomas (1988), R. H. Schonmann (1994) and R. A. Minlos (1996). For other equilibrium particle systems, some recent excellent explorations have been devoted to the topics studied in this book: A. Guionnet and B. Zegarlinski (2003), M. Ledoux (1999, 2001), F. Martinelli (1999) and so on.


Chapter 10

Stochastic Models of Economic Optimization

This chapter deals with some stochastic models of economic optimization. Due to their practical value, the models are quite attractive. But our knowledge of them is still very limited; some fundamental problems remain open.

We begin with a short review of the study of some global economic models (or economy on a large scale), the well-known input-output method and L. K. Hua's fundamental theorem for the stability of economy. Then, we show that it is necessary to study the stochastic models. A collapse theorem for a non-controlling stochastic economic system is introduced. In the analysis of the system, the products of random matrices play a critical role. In particular, the first eigenvalue, the corresponding eigenfunctions and an ergodic theorem for Markov chains play a nice role here. Partial proofs are included. Some challenging open problems are also mentioned.

10.1 Input-output method

First, we fix the unit of the quantity of each product: kilogram, kilovolt and so on. Denote by $x = \big(x^{(1)}, x^{(2)}, \ldots, x^{(d)}\big)$ the quantity of the main products we are interested in; it is called the vector of products. Throughout this chapter, all vectors are row ones.

To understand the present economy, we need to examine three things: the input, the output and the structure matrix. Suppose that the starting vector of products last year was

$$x_0 = \big(x_0^{(1)}, x_0^{(2)}, \ldots, x_0^{(d)}\big).$$

For reproduction, assume that an amount $x^{(0)}_{ij}$ of the $j$-th product is allocated to the $i$-th product, and the vector of the products this year becomes

$$x_1 = \big(x_1^{(1)}, x_1^{(2)}, \ldots, x_1^{(d)}\big).$$


Here, we suppose for a moment that all the products are used for the reproduction (idealized model). Next, set

$$a^{(0)}_{ij} = x^{(0)}_{ij}\big/x_1^{(i)}, \qquad 1 \le i, j \le d.$$

The matrix $A_0 = \big(a^{(0)}_{ij}\big)$ is called a structure matrix (or matrix of expending coefficients). This matrix is essential since it describes the efficiency of the current economy: to produce one unit of the $i$-th product, one needs $a^{(0)}_{ij}$ units of the $j$-th product. Clearly, $x_0 = x_1 A_0$, since the total amount $\sum_i x^{(0)}_{ij}$ distributed from the $j$-th product equals $x_0^{(j)}$. Similarly, we have $x_{n-1} = x_n A_{n-1}$ for all $n \ge 1$. Suppose that the structure matrices are time-homogeneous: $A_n = A$ for all $n \ge 0$ (this is reasonable if one considers a short time unit). Then we have a simple expression for the $n$-th output:

$$x_n = x_0 A^{-n}, \qquad n \ge 1. \tag{10.1}$$

Thus, once the structure matrix and the input $x_0$ are known, we may predict the future output; this is called the input-output method or Leontief's method (cf. Leontief (1936, 1951, 1986)). It is a well-known method. As far as I know, up to the 1960s, more than 100 countries had used this method in their national economy.

10.2 L. K. Hua’s fundamental theorem

Let us return to the original equation

$$x_1 = x_0 A^{-1}.$$

We now fix $A$; then $x_1$ is determined by $x_0$ only. The question is which choice of $x_0$ is the optimal one. Furthermore, what sense of optimality are we talking about? The first choice would be the "average". If someone tells you that the average age of the members of a group is twenty, you may think that everyone in the group is strong; it may be a volleyball team. However, the group may be a nursery, which consists of six babies and two older women who are over seventy. The average age in this group is still twenty. The misleading point is that the variance is too big, and so the average is not a good tool in the present situation. To avoid this, we adopt the minimax principle, i.e., finding the best solution among the worst cases. It is the safest strategy and is used widely in optimization theory and game theory. In other words, we want to find $x_0$ such that $\min_{1 \le j \le d} x_1^{(j)}/x_0^{(j)}$ attains the maximum below:

$$\max_{x_1 > 0,\ x_0 = x_1 A}\ \min_{1 \le j \le d} x_1^{(j)}\big/x_0^{(j)}.$$

By using the classical Frobenius theorem, L. K. Hua (1984, Part III) provedthe following result.


Theorem 10.1 (L. K. Hua (1984, Part III)). Given an irreducible non-negative matrix $A$, let $u$ be the (positive) left eigenvector of $A$ corresponding to the largest eigenvalue $\rho(A)$ of $A$. Then, up to a constant, the solution to the above problem is $x_0 = u$. In this case, we have

$$x_1^{(j)}\big/x_0^{(j)} = \rho(A)^{-1} \quad\text{for all } j.$$
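The conclusion of Theorem 10.1 can be checked numerically in the $2 \times 2$ case, where the largest eigenvalue follows from the quadratic formula. The sketch below is only an illustration under assumptions: the helper names `perron_left_2x2` and `times_inverse` are hypothetical, and the matrix is a sample positive matrix chosen for the demonstration.

```python
import math


def perron_left_2x2(a):
    """Largest eigenvalue rho and a positive left eigenvector u (u a = rho u)
    of a 2x2 matrix with positive entries, via the quadratic formula."""
    (a11, a12), (a21, a22) = a
    tr, det = a11 + a22, a11 * a22 - a12 * a21
    rho = (tr + math.sqrt(tr * tr - 4.0 * det)) / 2.0
    # Second column of u a = rho u:  u1 * a12 + u2 * a22 = rho * u2, with u2 = 1.
    u = ((rho - a22) / a12, 1.0)
    return rho, u


def times_inverse(x, a):
    """Row vector x times the inverse of the 2x2 matrix a."""
    (a11, a12), (a21, a22) = a
    det = a11 * a22 - a12 * a21
    return ((x[0] * a22 - x[1] * a21) / det, (-x[0] * a12 + x[1] * a11) / det)


A = ((0.25, 0.14), (0.40, 0.12))  # sample matrix (an assumption of this sketch)
rho, u = perron_left_2x2(A)
x1 = times_inverse(u, A)          # x_1 = x_0 A^{-1} with x_0 = u
ratios = [x1[j] / u[j] for j in range(2)]
print(rho, u)
print(ratios, 1.0 / rho)          # both ratios agree with 1 / rho
```

Starting from $x_0 = u$, every product grows by the same factor $\rho(A)^{-1}$, which is the balanced growth described in the theorem.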

In what follows, we call the above technique (i.e., setting $x_0 = u$) the eigenvector method.

Next, we go further to study the stability of the economy. From (10.1), we obtain the simple expression

$$x_n = x_0\,\rho(A)^{-n}$$

whenever $x_0 = u$. What happens if we take $x_0 \ne u$ (up to a constant)?

Stability of economy

For convenience, set

$$T^x = \inf\big\{n \ge 1 : x_0 = x \text{ and there is some } j \text{ such that } x_n^{(j)} \le 0\big\},$$

which is called the collapse time of the economic system.

We can now state Hua's important result as follows.

Theorem 10.2 (L. K. Hua (1984, Part III; 1985, Part IX)). Under some mild conditions, if $x_0 \ne u$, then $T^{x_0} < \infty$.

If the collapse time were bigger than 150 years, we would not need to worry about the stability of the economy, since none of us would still be alive. However, the next example shows that we are not in this situation.

Example 10.3 (L. K. Hua (1984, Part I)). Consider two products only: industry and agriculture. Let

$$A = \frac{1}{100}\begin{pmatrix} 25 & 14\\ 40 & 12 \end{pmatrix}.$$

Then $u = \big(5(\sqrt{2409} + 13)/7,\ 20\big) \approx (44.34397483, 20)$. We have

x_0                     T^{x_0}
(44, 20)                3
(44.344, 20)            8
(44.34397483, 20)       13

This shows that the economy is very sensitive! We point out that this theorem is essential. Recall that the Frobenius theorem and the Brouwer fixed point theorem, often used in the study of economics, do not provide any information about the collapse phenomena.
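The collapse times can be reproduced by iterating $x_n = x_{n-1}A^{-1}$ directly. The following is a minimal sketch with hypothetical helper names; it takes the structure matrix with entries 25, 14, 40, 12 (divided by 100), which are the values consistent with the stated eigenvector $u \approx (44.34397483, 20)$.

```python
def times_inverse(x, a):
    """Row vector x times the inverse of the 2x2 matrix a."""
    (a11, a12), (a21, a22) = a
    det = a11 * a22 - a12 * a21
    return ((x[0] * a22 - x[1] * a21) / det, (-x[0] * a12 + x[1] * a11) / det)


def collapse_time(x0, a, max_steps=100):
    """First n >= 1 at which some component of x_n = x_0 a^{-n} is <= 0.

    Returns None if no collapse is seen within max_steps iterations.
    """
    x = tuple(x0)
    for n in range(1, max_steps + 1):
        x = times_inverse(x, a)
        if min(x) <= 0.0:
            return n
    return None


A = ((0.25, 0.14), (0.40, 0.12))
print(collapse_time((44.0, 20.0), A))    # 3
print(collapse_time((44.344, 20.0), A))  # 8
```

These two values agree with the first two rows of the table. The third row depends on the ninth significant digit of $x_0$, so a floating-point run there is sensitive to rounding and is not attempted in this sketch.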


To understand Hua's theorem, for probabilists, it is very natural to consider the particular case $A = P$; that is, $A$ is a transition probability matrix. Then, from the ergodic theorem for Markov chains (irreducible and aperiodic), it follows that

$$P^n \to \Pi \quad\text{as } n \to \infty,$$

where $\Pi$ is the matrix having the same row $\big(\pi^{(1)}, \pi^{(2)}, \ldots, \pi^{(d)}\big)$, which is just the stationary distribution of the corresponding Markov chain. Since this distribution is the only stable solution for the chain, it should have some meaning in economics even though the latter goes in the converse direction:

$$x_n = x_0 P^{-n}, \qquad n \ge 1.$$

From the above facts, it is not difficult to prove, as shown in the next paragraph, that if

$$x_0 \ne u = \big(\pi^{(1)}, \pi^{(2)}, \ldots, \pi^{(d)}\big)$$

up to a positive constant, then $T^{x_0} < \infty$. Next, since the general case can be reduced to the above particular case, we think that this is a very natural way to understand Hua's theorem.
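The convergence $P^n \to \Pi$ used here is easy to observe numerically. The sketch below is only an illustration: the two-state chain and the helper names are hypothetical. It raises a transition matrix to a high power and compares the rows with the stationary distribution $\pi = (5/6, 1/6)$ of that chain.

```python
def mat_mul(a, b):
    """Product of two square matrices given as lists of rows."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def mat_pow(p, n):
    """n-th power of a square matrix by repeated multiplication."""
    size = len(p)
    result = [[1.0 if i == j else 0.0 for j in range(size)] for i in range(size)]
    for _ in range(n):
        result = mat_mul(result, p)
    return result


P = [[0.9, 0.1], [0.5, 0.5]]  # hypothetical irreducible, aperiodic chain
Pn = mat_pow(P, 60)
print(Pn)  # both rows are close to the stationary distribution (5/6, 1/6)
```

For this chain the second eigenvalue is 0.4, so the rows of $P^n$ approach $\pi$ geometrically fast, while the backward iteration $x_0 P^{-n}$ amplifies any deviation from $\pi$ at the reciprocal rate.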

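Proof of Theorem 10.2. We need to show that if $x_n > 0$ for all $n$, then $x_0 = \pi$.

Let $x_0 > 0$ be normalized such that $x_0\mathbf{1}^* = 1$, where $\mathbf{1}$ is the row vector having components 1 everywhere and $\mathbf{1}^*$ is its transpose. Then

$$1 = x_0\mathbf{1}^* = x_n P^n \mathbf{1}^* = x_n\mathbf{1}^*, \qquad n \ge 1.$$

Since the set $\{x : x \ge 0,\ x\mathbf{1}^* = 1\}$ is compact, there exist a subsequence $\{x_{n_k}\}_{k \ge 1}$ and a vector $\bar x$ such that

$$\lim_{k \to \infty} x_{n_k} = \bar x, \qquad \bar x \ge 0, \qquad \bar x\mathbf{1}^* = 1.$$

Therefore,

$$x_0 = \big(x_0 P^{-n_k}\big)P^{n_k} = x_{n_k}P^{n_k} \to \bar x\,\Pi = \bar x\mathbf{1}^*\pi = \pi.$$

Thus, we must have $x_0 = \pi$. As mentioned before, the general case can be reduced to the above particular case, and so we are done.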

We have seen the critical role played by the largest (or first) eigenvalue and its eigenvectors. Their computation is far from trivial, especially for large matrices, as we have seen in the previous chapters. In the numerical computation of the largest eigenvalue, it is important to have good initial data, which is just an application of the study of eigenvalue estimates. With the eigenvalue at hand, the computation of the eigenvectors is easier, since one needs only to solve a linear equation (in contrast, the equation for the eigenvalue is polynomial).


Economy in markets

In L. K. Hua's eleven reports (1984–1985), he also studied some more general models of economy. But the above two theorems are the key to his idea. The title of the reports (written in that specific period) may cause some misunderstanding, since one may think that the theory works only for planned economy. Actually, the economy in markets was also treated in (Hua, 1984, Part VII). The only difference is that in the latter case one needs to replace the structure matrix $A$ with $V^{-1}AV$, where $V$ is the diagonal matrix $\mathrm{diag}(v_i/p_i)$: $(p_i)$ is the vector of prices in the market and $(v_i)$ is the right eigenvector of $A$. Note that the eigenvalues of $V^{-1}AV$ are the same as those of $A$. Corresponding to the eigenvalue $\rho(V^{-1}AV) = \rho(A)$, the left eigenvector of $V^{-1}AV$ becomes $uV$. Therefore, for the economy in markets, we have a new structure matrix $V^{-1}AV$ and a new left eigenvector $uV$, which are all we need in Hua's model. Thus, from the mathematical point of view, the consideration of markets makes no essential difference in Hua's model.
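The two facts used in this paragraph follow from a one-line computation, written out here for convenience (a routine verification, with $u$ the left eigenvector of $A$ for $\rho(A)$ and $V$ any invertible diagonal matrix):

```latex
(uV)\,(V^{-1}AV) = uAV = \rho(A)\,uV,
\qquad
\det\big(\lambda I - V^{-1}AV\big) = \det\big(V^{-1}(\lambda I - A)V\big) = \det(\lambda I - A).
```

The first identity shows that $uV$ is a left eigenvector of $V^{-1}AV$ for the same eigenvalue, and the second shows that the two matrices have the same characteristic polynomial.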

10.3 Stochastic model without consumption

If the randomness did not play a critical role, one could simply ignore it and stick to the deterministic system. Thus, we started our study by examining the influence of a small random perturbation of Hua's example.

Consider the perturbation:

$$
\tilde a_{ij} =
\begin{cases}
a_{ij} & \text{with probability } 2/3,\\
a_{ij}(1 \pm 0.01) & \text{with probability } 1/6 \text{ each}.
\end{cases}
$$

Taking $(\tilde a_{ij})$ instead of $(a_{ij})$, we get a random matrix. Next, let $\{A_n;\ n \ge 1\}$ be a sequence of independent random matrices with the same distribution as above; then $x_n = x_0\prod_{k=1}^{n} A_k^{-1}$ gives us a stochastic model of an economy without consumption.

Again, starting from $x_0 = (44.344, 20)$ (remember that the collapse time is 8 in the deterministic case), the collapse probability in the above stochastic model is the following:

$$
\mathbb{P}\big[T^{x_0} = n\big] =
\begin{cases}
0, & \text{for } n = 1,\\
0.09, & \text{for } n = 2,\\
0.65, & \text{for } n = 3.
\end{cases}
$$

Surprisingly, we have $\mathbb{P}[T \le 3] \approx 0.74$. This observation tells us that the randomness plays a critical role in the economy. It also explains why the traditional input-output method is not very practicable, as people often think: the randomness has been ignored, and so the deterministic model is far away from real practice.

Now, what is the analog of Hua’s theorem for the stochastic case?


Theorem 10.4 (Chen (1992b, Part II)). Under some mild conditions, we have

$$\mathbb{P}\big[T^{x_0} < \infty\big] = 1, \qquad \forall x_0 > 0.$$

Note that the limit theory of products of random matrices is quite different from the deterministic case (cf. P. Bougerol and J. Lacroix (1985)), so the problem is non-trivial. We have to deal with the product of random matrices:

$$M_n = A_n A_{n-1} \cdots A_1.$$

The first result we learn from the limit theory of products of random matrices concerns the Lyapunov exponent; it is sometimes called the "strong law of large numbers". Let $\|A\|$ denote the operator norm of $A$. Then, the main known result is as follows.

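Theorem 10.5 (V. I. Oseledec, 1968). Let $\mathbb{E}\log^+\|A_1\| < \infty$. Then

$$\frac{1}{n}\log\|M_n\| \xrightarrow{\text{a.s.}} \gamma \in \{-\infty\} \cup \mathbb{R},$$

where

$$\gamma = \lim_{n \to \infty} \frac{1}{n}\,\mathbb{E}\log\|M_n\|.$$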

However, this result is still not enough for our purpose. What we adopt is a much stronger result. To state it, we need the following assumptions, which are analogues of the irreducibility and aperiodicity conditions.

(H1) $A_1 \ge 0$ a.s. and there exists an integer $m$ such that

$$\mathbb{P}[M_m \text{ is positive}] > 0,$$

where $M_n = A_1 \cdots A_n$.

(H2) $\mathbb{P}[A_1 \text{ has zero row or column}] = 0$.

Theorem 10.6 (H. Kesten and F. Spitzer, 1984). Under (H1) and (H2), $M_n > 0$ for large $n$ with probability one and $M_n/\|M_n\|$ converges in distribution to a positive matrix $M = L^* R$ with rank one, where $L$ and $R$ are independent, positive row vectors satisfying the normalizing condition:

$$\max_{1 \le i \le d} R^{(i)} = 1, \qquad \sum_{j=1}^{d} L^{(j)} = 1. \tag{10.2}$$

By a change of the probabilistic frame, one may replace "convergence in distribution" by "convergence almost surely" (Skorohod's theorem). In this sense, the last result is really a strong law of large numbers. Having these remarks in mind, the proof of Theorem 10.4 is not difficult and is given in §10.5.

One may refer to A. Mukherjea (1991), H. Hennion (1997) and references therein for more recent progress on the limit theory of products of random matrices.


10.4 Stochastic model with consumption

The model without consumption is idealized and so is not practical. A more practical model should include consumption; that is, it should allow a part of the products to be consumed rather than used for reproduction.

Suppose that every year we take $\theta^{(i)}$ times the increment of the $i$-th product to be consumed. Then in the first year, the vector of products which can be used for reproduction is

$$y_1 = x_0 + (x_1 - x_0)(I - \Theta),$$

where $I$ is the $d \times d$ unit matrix and $\Theta = \mathrm{diag}\big(\theta^{(1)}, \theta^{(2)}, \ldots, \theta^{(d)}\big)$, which is called a consumption matrix. Therefore,

$$y_1 = y_0\big[A_0^{-1}(I - \Theta) + \Theta\big], \qquad y_0 = x_0.$$
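The step from the first expression for $y_1$ to the second uses only $x_1 = x_0 A_0^{-1}$; spelled out as a routine verification:

```latex
y_1 = x_0 + (x_1 - x_0)(I - \Theta)
    = x_0\Theta + x_1(I - \Theta)
    = x_0\Theta + x_0 A_0^{-1}(I - \Theta)
    = x_0\big[A_0^{-1}(I - \Theta) + \Theta\big].
```

In particular, $\Theta = 0$ recovers the model without consumption, $y_1 = x_0 A_0^{-1}$.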

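Similarly, in the $n$-th year, the vector of the products which can be used for reproduction is

$$y_n = y_0\prod_{k=0}^{n-1}\big[A_{n-k-1}^{-1}(I - \Theta) + \Theta\big], \qquad n \ge 1.$$

Let

$$B_n = \big[A_{n-1}^{-1}(I - \Theta) + \Theta\big]^{-1}.$$

Then

$$y_n = y_0\prod_{k=1}^{n} B_{n-k+1}^{-1}, \qquad n \ge 1.$$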

We have thus obtained a stochastic model with consumption. In the deterministic case, a collapse theorem was obtained by L. K. Hua (1985, Part X) and L. K. Hua and S. Hua (1985). The conclusion is that the system becomes more stable than the idealized model. More precisely, the dimension of the set of $x_0$ for which the economy does not collapse can be greater than one. This is consistent with our practice.

To state our result in this general case, we need some notation. Denote by $Gl(d, \mathbb{R})$ the general linear group of real invertible $d \times d$ matrices and by $O(d, \mathbb{R})$ the orthogonal matrices in $Gl(d, \mathbb{R})$. Next, denote by $G_\mu$ the smallest closed semigroup of $Gl(d, \mathbb{R})$ containing the support of $\mu$.

Definition 10.7.

• $G$ is called strongly irreducible if there exist no proper linear subspaces $V_1, \cdots, V_k$ of $\mathbb{R}^d$ such that

$$\Bigl(\bigcup_{i=1}^{k} V_i\Bigr) B = \bigcup_{i=1}^{k} V_i, \qquad \forall B \in G.$$

• $G$ is said to be contractive if there exists $\{B_n\} \subset G$ such that $\|B_n\|^{-1} B_n$ converges to a matrix with rank one.


• We call $B = K \operatorname{diag}(a_i)\, U$ a polar decomposition if $K, U \in O(d, \mathbb{R})$ and $a_1 \ge a_2 \ge \cdots \ge a_d > 0$.

Theorem 10.8 (Chen and Y. Li, 1994). Let $\{B_n\}$ be an i.i.d. sequence of random matrices with common distribution $\mu$. Suppose that $G_\mu$ is strongly irreducible and contractive, and that the sequence $\{K_n\}$ in the polar decomposition satisfies a “tightness condition”. Then $\mathbb{P}[T^x < \infty] = 1$ for all $0 < x \in \mathbb{R}^d$.

Naturally, we have the following question.

Open Problem 10.9. How fast does the economy go to collapse?

As we have seen before, since the economy is very sensitive, one certainly expects the following large deviation result:

$$\mathbb{P}[T > n] \le C e^{-\alpha n}.$$

Clearly, Theorem 10.8 is still far from complete. Furthermore, in practice, a collapse result is not what one hopes for and is less useful. Now, another question arises.

Open Problem 10.10. How to control the economy and what is the optimal one?

Up to now, we have no idea how to handle this problem; we do not even understand what kind of optimality should be adopted here.

Finally, we mention that a probabilistic exploration of Hua’s model, closely related to the ergodic theorem as used in the proof of Theorem 7.2, was carried out by K. L. Chung (1995). The topic of this chapter is now explored, with much more extension and recent references, in the book by D. Han and X. J. Hu (2003).

10.5 Proof of Theorem 10.4

Given i.i.d. nonnegative random matrices $\{A_n\}_{n=1}^{\infty}$, since we are working on the economic model

$$x_n = x_0 A_1^{-1} \cdots A_n^{-1},$$

it is natural to assume that

$$\mathbb{P}[\det A_1 = 0] = 0. \tag{10.3}$$

We mainly study the collapse probability $\mathbb{P}[T < \infty]$, where $T$ is the same as before,

$$T = \inf\bigl\{n \ge 1 : \text{there exists some } 1 \le j \le d \text{ such that } x_n^{(j)} \le 0\bigr\}.$$
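The collapse time can be computed directly from the recursion $x_n = x_{n-1} A_n^{-1}$. The sketch below is only an illustration: the restriction to $2 \times 2$ matrices, the entry distribution (independent, uniform on $[1, 2]$), the seed, and the function names are all assumptions made for the example, not part of the model in the text.

```python
import random


def inverse_2x2(m):
    """Inverse of a 2x2 matrix; singular matrices are excluded a.s. by (10.3)."""
    (a, b), (c, d) = m
    det = a * d - b * c
    if det == 0:
        raise ValueError("singular matrix")
    return [[d / det, -b / det], [-c / det, a / det]]


def collapse_time(x0, matrices):
    """Return the first n >= 1 with some x_n^{(j)} <= 0, or None if none occurs."""
    x = list(x0)
    for n, a_matrix in enumerate(matrices, start=1):
        inv = inverse_2x2(a_matrix)
        x = [x[0] * inv[0][j] + x[1] * inv[1][j] for j in range(2)]  # x_n = x_{n-1} A_n^{-1}
        if min(x) <= 0:
            return n
    return None


def random_matrices(count, rng):
    """Hypothetical i.i.d. model: independent entries, uniform on [1, 2]."""
    return [[[rng.uniform(1, 2), rng.uniform(1, 2)],
             [rng.uniform(1, 2), rng.uniform(1, 2)]] for _ in range(count)]


# Deterministic check: the start (1, 1) is a positive left eigenvector of A.
A = [[2.0, 1.0], [1.0, 2.0]]
on_eigenvector = collapse_time([1.0, 1.0], [A] * 30)
perturbed = collapse_time([1.0, 0.9], [A] * 30)
print(on_eigenvector, perturbed)  # None 3

# Random case: count how many sampled paths collapse within 200 steps.
rng = random.Random(0)
times = [collapse_time([1.0, 1.0], random_matrices(200, rng)) for _ in range(100)]
print(sum(t is not None for t in times), "of 100 sampled paths collapsed within 200 steps")
```

The deterministic pair shows the sensitivity mentioned earlier: starting exactly on the eigenvector, no collapse is observed in 30 steps, while a 10% perturbation of one coordinate collapses at step 3.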

The following result is a more precise statement of Theorem 10.4.


Theorem 10.11 (Chen (1992b, Part II)). Let (H1), (H2) and (10.3) hold. Given a deterministic $x_0 > 0$ with $\max_i x_0^{(i)} = 1$, we have

$$\mathbb{P}[T = \infty] \le \mathbb{P}[R = x_0].$$

In particular, if $\mathbb{P}[R = x_0] = 0$, then $\mathbb{P}[T = \infty] = 0$.

Proof. (a) Write $M_n = A_n \cdots A_1$ and set $\overline{M}_n = M_n / \|M_n^*\|$. Note that the product $M_n$ is taken in a different order from that in Theorem 10.6. From that theorem, we know that $\overline{M}_n$ converges in distribution to $R^* L$, where $R$ and $L$ are independent, positive row vectors satisfying (10.2).

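(b) By condition (10.3), we have $\|M_n^*\| > 0$ a.s., and so

$$x_n > 0 \iff x_0 M_n^{-1} > 0 \iff x_0 \overline{M}_n^{-1} > 0, \qquad n \ge 1.$$

Hence

$$\mathbb{P}[T = \infty] = \mathbb{P}[x_n > 0,\ \forall n \ge 1] = \mathbb{P}\bigl[x_0 \overline{M}_n^{-1} > 0,\ \forall n \ge 1\bigr].$$

Thus, we can use $\overline{M}_n$ instead of $M_n$.

(c) By the Skorohod theorem (cf. N. Ikeda and S. Watanabe (1988, page 9)), there exists a probability space $(\tilde{\Omega}, \tilde{\mathscr{F}}, \tilde{\mathbb{P}})$ on which there are $\tilde{M}_n$ and $\tilde{M}$ such that

$$
\begin{aligned}
&\tilde{M}_n = \overline{M}_n \ \text{in distribution}, \quad \forall n \ge 1, \\
&\tilde{M} = R^* L =: M \ \text{in distribution}, \\
&\tilde{M}_n \to \tilde{M} \ \text{as } n \to \infty, \quad \tilde{\mathbb{P}}\text{-a.s.}
\end{aligned}
\tag{10.4}
$$

In particular,

$$\tilde{\mathbb{P}}\bigl[\tilde{M} \text{ has rank } 1\bigr] = \mathbb{P}\bigl[M \text{ has rank } 1\bigr] = 1.$$

From these facts and the normalizing condition, it is easy to see that there exist positive $\tilde{R}$ and $\tilde{L}$, $\tilde{\mathbb{P}}$-a.s. unique, such that $\tilde{M} = \tilde{R}^* \tilde{L}$ and

$$\max_i \tilde{R}^{(i)} = 1, \qquad \sum_j \tilde{L}^{(j)} = 1, \qquad \tilde{\mathbb{P}}\text{-a.s.}$$

Therefore, we must have

$$\tilde{\mathbb{P}}\bigl[x_0 \tilde{M}_n^{-1} > 0,\ \forall n \ge 1\bigr] = \mathbb{P}\bigl[x_0 \overline{M}_n^{-1} > 0,\ \forall n \ge 1\bigr].$$

Thus, we can drop the tilde, use the original $(\Omega, \mathscr{F}, \mathbb{P})$ instead of $(\tilde{\Omega}, \tilde{\mathscr{F}}, \tilde{\mathbb{P}})$, and assume that $\overline{M}_n$ converges to $R^* L$ almost everywhere, rather than in distribution.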

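(d) By (10.4), there exists a $\mathbb{P}$-zero set $\Lambda$ such that

$$\overline{M}_n^* \to R^* L \quad \text{as } n \to \infty \text{ on } \Lambda^c.$$

Write $x_n = x_0 \overline{M}_n^{-1}$. Fix $\omega \in \Lambda^c$. If

$$x_n(\omega) > 0, \qquad \forall n \ge 1,$$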


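because of the normalizing condition of $x_0$ and $\overline{M}_n$, there must exist a subsequence $\{n_k = n_k(\omega)\}$ such that

$$\lim_{k \to \infty} x_{n_k}(\omega) =: x(\omega) \in [0, \infty]^d.$$

But

$$
\begin{aligned}
x_0 &= \lim_{k \to \infty} \bigl[x_0 \overline{M}_{n_k}(\omega)^{-1}\, \overline{M}_{n_k}(\omega)\bigr] \\
    &= \lim_{k \to \infty} \bigl[x_{n_k}(\omega)\, \overline{M}_{n_k}(\omega)\bigr] \\
    &= x(\omega) L^*(\omega) R(\omega).
\end{aligned}
$$

Combining this with the positivity of $x_0$, $L$ and $R$, it follows that

$$c := x L^* \in (0, \infty), \quad \text{a.s.}$$

Furthermore, since $\max_i x_0^{(i)} = \max_i R^{(i)} = 1$, we know that $c = 1$, a.s. Therefore, we have

$$[T = \infty] \subset [R = x_0], \quad \text{a.s. on } \Lambda^c,$$

as required.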
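Step (a) of the proof relies on the normalized products $M_n / \|M_n\|$ approaching a rank-one matrix. This can be observed numerically. The sketch below assumes $2 \times 2$ matrices with independent entries uniform on $[1, 2]$ and uses the sum of absolute entries as the norm; both choices, the seed and the function names are illustrative assumptions, since the text does not fix them here.

```python
import random


def matmul_2x2(p, q):
    """Product of two 2x2 matrices given as nested lists."""
    return [[sum(p[i][k] * q[k][j] for k in range(2)) for j in range(2)] for i in range(2)]


def normalize(m):
    """Divide by the sum of absolute entries (one convenient choice of norm)."""
    total = sum(abs(v) for row in m for v in row)
    return [[v / total for v in row] for row in m]


def normalized_product(matrices):
    """Return M_n / ||M_n|| for M_n = A_n ... A_1, renormalizing at every step."""
    current = normalize(matrices[0])
    for a_matrix in matrices[1:]:
        current = normalize(matmul_2x2(a_matrix, current))  # left-multiply by the next factor
    return current


rng = random.Random(1)
sample = [[[rng.uniform(1, 2), rng.uniform(1, 2)],
           [rng.uniform(1, 2), rng.uniform(1, 2)]] for _ in range(100)]
m_bar = normalized_product(sample)
det = m_bar[0][0] * m_bar[1][1] - m_bar[0][1] * m_bar[1][0]
print(m_bar)
print(det)  # close to 0: the normalized product is numerically of rank one
```

For this particular entry distribution the decay is easy to justify by hand: $|\det A_i| \le 3$, while the entry sum of the product at least doubles with each factor, so the determinant of the normalized product is at most $(3/4)^n / 4$.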

Finally, we mention that the condition “$\mathbb{P}[R = x_0] = 0$” can be removed in some cases, as was proven in Chen and Y. Li (1994).

To conclude this chapter, let us make some remarks about the theory of random matrices. The theory is a traditional and important branch of mathematics and has a very wide range of applications including statistics, physics, number theory and even the Riemann hypothesis. Refer to M. Mehta (1991), V. L. Girko (1990) and J. B. Conrey (2003) and references within.

There are two main topics in the study of eigenvalues. The first one is the estimation of the first few eigenvalues, as dealt with in this book. The second one, not treated in the book, is the asymptotic behavior of the eigenvalues. In the context of random matrices, the second topic includes the famous and beautiful semicircle law of Wigner (1955). For its modern generalization to operator algebras, called free probability, see D. Voiculescu, K. Dykema and A. Nica (1992) and E. Haagerup (2002), for instance.


Appendix A

Some Elementary Lemmas

This appendix is taken from Chen (1999c); an earlier version of the next result appeared in Y. Li (1995).

Lemma A.1. Let $u$ and $v$ be two functions defined on $[a, b]$ ($b \le \infty$). Suppose that

(1) $u$ is non-negative and absolutely continuous with $u(a) > 0$.

(2) $v$ is locally integrable.

Next, let $[c, d] \supset \{u(t) : t \in [a, b]\}$ and suppose that

(3) $g : (c, d) \to (0, \infty)$ is non-decreasing.

(4) $G(u(a)) + \int_a^t v(s)\,ds \in [G(c), G(d)]$, where

$$G(u) = \int_{u_0}^{u} \frac{dx}{g(x)}, \qquad u, u_0 \in (c, d),$$

$$G(c) = \lim_{u \to c} G(u), \qquad G(d) = \lim_{u \to d} G(u).$$

If

(5) $u'(t) \le v(t)\, g(u(t))$, a.e. $t$,

then

$$u(t) \le G^{-1}\Bigl(G(u(a)) + \int_a^t v(s)\,ds\Bigr), \qquad t \in [a, b],$$

where $G^{-1}$ is the inverse function of $G$.

Remark A.2. (1) If $u(a) = 0$, one may replace $u$ by $u + M$ for some $M > 0$, so the condition $u(a) > 0$ is not really a restriction.


(2) Condition (5) is equivalent to the integral form

$$u(t_2) - u(t_1) \le \int_{t_1}^{t_2} v(s)\, g(u(s))\,ds, \qquad t_1, t_2 \in [a, b],\ t_2 \ge t_1.$$

Actually, since $g$ is locally bounded, $vg$ is locally integrable. Then condition (5) is deduced by using the absolute continuity of integration.

Proof of Lemma A.1. By condition (3), $G$ is continuous and increasing. By conditions (2) and (4), it suffices to prove that $G(u(t)) \le G(u(a)) + \int_a^t v(s)\,ds$. Set $F(t) = G(u(t)) - G(u(a)) - \int_a^t v(s)\,ds$. Then

$$g(u(t))\, F'(t) = u'(t) - v(t)\, g(u(t)), \quad \text{a.e. } t.$$

By conditions (5) and (3), it follows that $F'(t) \le 0$, a.e. $t$. Therefore $F(t) \le F(a) = 0$, $t \in [a, b]$.

Corollary A.3 (Exponential form). If a non-negative function $u$ satisfies $u(0) > 0$ and $u'(t) \le -\alpha u(t)$ on $[0, \infty)$ for some constant $\alpha > 0$, then $u(t) \le u(0) e^{-\alpha t}$ for all $t \ge 0$.

Proof. Take $[a, b) = [0, \infty) = [c, d)$, $v(t) \equiv -\alpha$ and $g(x) = x$. Then $G(u) = \int_1^u \frac{1}{x}\,dx = \log u$ and $G^{-1}(u) = e^u$. Hence by Lemma A.1, we have $u(t) \le \exp[\log u(0) - \alpha t] = u(0) e^{-\alpha t}$.

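Corollary A.4 (Algebraic form). If a non-negative function $u$ satisfies $u(0) > 0$ and $u'(t) \le -\alpha u(t)^p$ on $(0, \infty)$ for some constants $\alpha > 0$ and $p > 1$, then

$$u(t) \le \bigl(u(0)^{1-p} + (p-1)\alpha t\bigr)^{1-q}, \qquad \text{where } 1/p + 1/q = 1.$$

Proof. Take $(a, b) = (0, \infty) = (c, d)$, $v(t) \equiv -\alpha$ and $g(x) = x^p$. Then $G(u) = \int_1^u x^{-p}\,dx = \frac{1}{1-p}\bigl(u^{1-p} - 1\bigr)$ and $G^{-1}(u) = [1 + (1-p)u]^{1/(1-p)}$. Hence by Lemma A.1, we have

$$
\begin{aligned}
u(t) &\le \Bigl(1 + (1-p)\Bigl[\frac{1}{1-p}\bigl(u(0)^{1-p} - 1\bigr) - \alpha t\Bigr]\Bigr)^{1/(1-p)} \\
     &= \bigl(u(0)^{1-p} + (p-1)\alpha t\bigr)^{1-q}.
\end{aligned}
$$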
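The algebraic bound can be checked numerically. The sketch below integrates $u' = -\alpha u^p - \kappa u$ with the explicit Euler method, so that $u' \le -\alpha u^p$ holds, and compares the result with the right-hand side of Corollary A.4, using $1 - q = 1/(1-p)$. The extra decay rate $\kappa$ (`extra_decay`), the parameter values and the function names are assumptions chosen for illustration.

```python
def algebraic_bound(u0, alpha, p, t):
    """Right-hand side of Corollary A.4: (u0^{1-p} + (p-1) alpha t)^{1/(1-p)}."""
    return (u0 ** (1.0 - p) + (p - 1.0) * alpha * t) ** (1.0 / (1.0 - p))


def euler_path(u0, alpha, p, extra_decay, t_end, steps):
    """Explicit Euler for u' = -alpha u^p - extra_decay u, which satisfies u' <= -alpha u^p."""
    dt = t_end / steps
    u = u0
    path = [(0.0, u0)]
    for k in range(1, steps + 1):
        u += dt * (-alpha * u ** p - extra_decay * u)
        path.append((k * dt, u))
    return path


u0, alpha, p = 2.0, 1.0, 2.0
path = euler_path(u0, alpha, p, extra_decay=0.5, t_end=5.0, steps=5000)
violations = [(t, u) for t, u in path if u > algebraic_bound(u0, alpha, p, t) + 1e-12]
print(len(violations))                      # 0
print(algebraic_bound(u0, alpha, p, 1.0))   # (1/2 + 1)^(-1) = 2/3 for these parameters
```

Since the solutions here are convex and decreasing, each Euler step lies below the exact flow, so the discretization does not spoil the comparison.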

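Corollary A.5. If a non-negative function $u$ satisfies $u(0) > 0$ and

$$u'(t) \le \frac{1}{t(1 - 2t/\sigma)}\, u(t) \log u(t) \quad \text{on } [\varepsilon, \sigma/2),$$

then $u(t) \le u(\varepsilon)^{t(1 - 2\varepsilon/\sigma)/\varepsilon(1 - 2t/\sigma)}$.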

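Proof. Take $[a, b) = [\varepsilon, \sigma/2)$, $[c, d) = [1, \infty)$, $g(x) = x \log x$, $v(t) = [t(1 - 2t/\sigma)]^{-1}$ and $u_0 = u(\varepsilon)$. Then

$$G(u) = \int_{u_0}^{u} \frac{dx}{g(x)} = \int_{u_0}^{u} \frac{dx}{x \log x} = \int_{u_0}^{u} \frac{d(\log x)}{\log x} = \int_{\log u_0}^{\log u} \frac{dy}{y} = \log\log u - \log\log u_0.$$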


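Solving the equation $\log\log u - \log\log u_0 = x$, we get $G^{-1}(x) = u_0^{\exp[x]}$. Because

$$v(s) = \frac{1}{s(1 - 2s/\sigma)} = \frac{1}{s} + \frac{2/\sigma}{1 - 2s/\sigma},$$

we have

$$\int_{\varepsilon}^{t} v(s)\,ds = \log t - \log\varepsilon - \log(1 - 2t/\sigma) + \log(1 - 2\varepsilon/\sigma) = \log\frac{t(1 - 2\varepsilon/\sigma)}{\varepsilon(1 - 2t/\sigma)}.$$

By Lemma A.1, we obtain

$$
\begin{aligned}
u(t) &\le G^{-1}\Bigl(G(u(\varepsilon)) + \int_{\varepsilon}^{t} v(s)\,ds\Bigr) = G^{-1}\Bigl(\log\frac{t(1 - 2\varepsilon/\sigma)}{\varepsilon(1 - 2t/\sigma)}\Bigr) \\
     &= u_0^{\exp\bigl[\log[t(1 - 2\varepsilon/\sigma)/\varepsilon(1 - 2t/\sigma)]\bigr]} = u(\varepsilon)^{t(1 - 2\varepsilon/\sigma)/\varepsilon(1 - 2t/\sigma)}.
\end{aligned}
$$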
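As a sanity check on the algebra, one can verify numerically that the bounding function $U(t) = u(\varepsilon)^{t(1 - 2\varepsilon/\sigma)/\varepsilon(1 - 2t/\sigma)}$ solves the extremal equation $U' = v\, U \log U$. The sketch below does this with a central finite difference; the parameter values and function names are assumptions chosen for illustration.

```python
import math


def exponent(t, eps, sigma):
    """The exponent t(1 - 2 eps/sigma) / (eps (1 - 2t/sigma)) from Corollary A.5."""
    return t * (1 - 2 * eps / sigma) / (eps * (1 - 2 * t / sigma))


def bound(t, u_eps, eps, sigma):
    """Bounding function U(t) = u(eps)^exponent(t)."""
    return u_eps ** exponent(t, eps, sigma)


def v(t, sigma):
    """Coefficient v(t) = 1 / (t (1 - 2t/sigma))."""
    return 1.0 / (t * (1 - 2 * t / sigma))


eps, sigma, u_eps = 0.1, 2.0, 1.5   # assumed sample parameters with eps < sigma / 2
t, h = 0.5, 1e-6
derivative = (bound(t + h, u_eps, eps, sigma) - bound(t - h, u_eps, eps, sigma)) / (2 * h)
value = bound(t, u_eps, eps, sigma)
rhs = v(t, sigma) * value * math.log(value)
relative_error = abs(derivative - rhs) / abs(rhs)
print(relative_error)  # small: U satisfies U' = v U log U up to discretization error
```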

For the remainder of this part, we consider a Markov semigroup $\{P(t)\}_{t \ge 0}$ with weak operator $\Omega$ having domain

$$\mathscr{D}_w(\Omega) = \Bigl\{f : \frac{d}{dt} P(t) f(x) = P(t)\Omega f(x) \text{ for all } x \in E \text{ and } t \ge 0\Bigr\}.$$

The next two results describe the exponential or algebraic decay of the semigroup in terms of its operator.

Lemma A.6 (Exponential form). Let $f \in \mathscr{D}_w(\Omega)$ and $\alpha > 0$ be a constant. Then $P(t) f \le e^{-\alpha t} f$ iff $\Omega f \le -\alpha f$.

Proof. Let $f_t = P(t) f$. Then $f_t' = P(t)\Omega f \le -\alpha P(t) f = -\alpha f_t$. The sufficiency now follows from Corollary A.3. The necessity follows from

$$\Omega f = \lim_{t \to 0} \frac{P(t) f - f}{t} \le \lim_{t \to 0} \frac{e^{-\alpha t} - 1}{t}\, f = -\alpha f.$$

Lemma A.7 (Algebraic form). Fix $p > 1$. Let $f \in \mathscr{D}_w(\Omega)$, $f \ge 0$ and $C > 0$ be a constant. Then $P(t) f \le [f^{1-p} + (p-1)Ct]^{1-q}$ iff $\Omega f \le -C f^p$.

Proof. Again, let $f_t = P(t) f$. Then $f_t' = P(t)\Omega f \le -C P(t)(f^p)$. However, by the Hölder inequality, $P(t)(f^p) \ge (P(t) f)^p$. Hence $f_t' \le -C f_t^p$. The sufficiency now follows from Corollary A.4. Next, note that $p - 1 = p/q$ and $q - 1 = q/p$. The necessity follows from

$$
\begin{aligned}
\Omega f = \lim_{t \to 0} \frac{P(t) f - f}{t}
  &\le \lim_{t \to 0} \frac{\bigl[f^{1-p} + (p-1)Ct\bigr]^{1-q} - f}{t} \\
  &= \lim_{t \to 0} (1-q)(p-1)C \bigl[f^{1-p} + (p-1)Ct\bigr]^{-q} = -C f^{q(p-1)} = -C f^p.
\end{aligned}
$$


Bibliography

S. Aida. Uniform positivity improving property, Sobolev inequality and spectral gaps. J. Funct. Anal., 158, no.1:152–185, 1998.

S. Aida and I. Shigekawa. Logarithmic Sobolev inequalities and spectral gaps: Perturbation theory. J. Funct. Anal., 126, no.2:448–475, 1994.

D. G. Aldous and J. A. Fill. Reversible Markov Chains and Random Walks on Graphs. URL www.stat.Berkeley.edu/users/aldous/book.html, 1994–.

W. J. Anderson. Continuous-Time Markov Chains. Springer Series in Statistics, 1991.

E. D. Andjel. Invariant measures for zero range process. Ann. Prob., 10(3):527–547, 1982.

D. Bakry. L’hypercontractivité et son utilisation en théorie des semigroupes. Lecture Notes in Mathematics 1581, D. Bakry, R. D. Gill and S. A. Molchanov (Eds.), “Lectures on Probability Theory”, Springer-Verlag, 1992.

D. Bakry, T. Coulhon, M. Ledoux, and L. Saloff-Coste. Sobolev inequalities in disguise. Indiana Univ. Math. J., 44(4):1033–1074, 1995.

D. Bakry and M. Ledoux. Lévy–Gromov’s isoperimetric inequality for an infinite dimensional diffusion generator. Invent. Math., 123:259–281, 1996.

J. Barta. Sur la vibration fondamentale d’une membrane. C. R. Acad. Sci. Paris, 204:472–473, 1937.

P. H. Bérard. Spectral Geometry: Direct and Inverse Problem. LNM vol. 1207, Springer-Verlag, 1986.

P. H. Bérard, G. Besson, and S. Gallot. Sur une inégalité isopérimétrique qui généralise celle de Paul Lévy-Gromov. Invent. Math., 80:295–308, 1985.

H. Berestycki, L. Nirenberg, and S. R. S. Varadhan. The principal eigenvalue and maximum principle for second-order elliptic operators in general domains. Comm. Pure and Appl., XLVII:47–92, 1994.

S. G. Bobkov. A functional form of the isoperimetric inequality for the Gaussian measure. J. Funct. Anal., 135(1):39–49, 1996.


S. G. Bobkov. An isoperimetric inequality on the discrete cube, and an elementary proof of the isoperimetric inequality in Gauss space. Ann. Probab., 25(1):206–214, 1997.

S. G. Bobkov and F. Götze. Discrete isoperimetric and Poincaré inequalities. Prob. Th. Rel. Fields, 114:245–277, 1999a.

S. G. Bobkov and F. Götze. Exponential integrability and transportation cost related to logarithmic Sobolev inequalities. J. Funct. Anal., 163:1–28, 1999b.

C. Boldrighini, A. DeMasi, A. Pellegrinotti, and E. Presutti. Collective phenomena in interacting particle systems. Stoch. Proc. Appl., 25:137–152, 1987.

P. Bougerol and J. Lacroix. Products of Random Matrices with Applications to Schrödinger Operators. Birkhäuser Boston, Inc., 1985.

K. R. Cai. Estimate on lower bound of the first eigenvalue of a compact Riemannian manifold. Chin. Ann. of Math., 12(B)(3):267–271, 1991.

E. A. Carlen, S. Kusuoka, and D. W. Stroock. Upper bounds for symmetric Markov transition functions. Ann. Inst. Henri Poincaré, 2:245–287, 1987.

I. Chavel. Eigenvalues in Riemannian Geometry. Academic Press, 1984.

J. Cheeger. A lower bound for the smallest eigenvalue of the Laplacian. Problems in Analysis, a Symposium in Honor of S. Bochner, Princeton U. Press, Princeton, pages 195–199, 1970.

J. W. Chen. The positive recurrence of Brussel’s model. Acta Math. Sci., 15:121–125, 1995.

M. F. Chen. Couplings of Markov chains (In Chinese). J. Beijing Normal Univ., 4:3–10, 1984.

M. F. Chen. Infinite-dimensional reaction-diffusion processes. Acta Math. Sin. New Ser., 1(3):261–273, 1985.

M. F. Chen. Couplings of jump processes. Acta Math. Sinica, New Series, 2(2):123–136, 1986a.

M. F. Chen. Jump Processes and Interacting Particle Systems (In Chinese). Beijing Normal Univ. Press, 1986b.

M. F. Chen. Existence theorems for interacting particle systems with non-compact state spaces. Sci. Sin., 30(2):148–156, 1987.

M. F. Chen. Probability metrics and coupling methods. Pitman Research Notes in Math., 200:55–72, 1989a.

M. F. Chen. A survey on random fields (In Chinese). Advances in Math., 18(3):294–322, 1989b.

M. F. Chen. Stationary distributions of infinite particle systems with non-compact state spaces. Acta Math. Sci., 9(1):7–19, 1989c.


M. F. Chen. Ergodic theorems for reaction-diffusion processes. J. Statis. Phys., 58(5/6):939–966, 1990.

M. F. Chen. Exponential $L^2$-convergence and $L^2$-spectral gap for Markov processes. Acta Math. Sin. New Ser., 7(1):19–37, 1991b.

M. F. Chen. On three classical problems for Markov chains with continuous time parameters. J. Appl. Prob., 28:305–320, 1991c.

M. F. Chen. Uniqueness of reaction-diffusion processes. Chin. Sci. Bulletin, 36(12):969–973, 1991d.

M. F. Chen. From Markov Chains to Non-Equilibrium Particle Systems. World Scientific, Singapore, 1992a.

M. F. Chen. Stochastic model of economic optimization. Chin. J. Appl. Probab. and Statis., (I): 8(3), 289–294; (II): 8(4), 374–377 (In Chinese), 1992b.

M. F. Chen. Optimal Markovian couplings and applications. Technical Report, No.215, 1993, Carleton Univ. and No.147, 1993, C. V. Volterra, Univ. of Roma II. Acta Math. Sin. New Ser. 10(3), 1994, pages 260–275, 1994a.

M. F. Chen. Optimal Markovian couplings and application to Riemannian geometry. in Prob. Theory and Math. Stat., Eds. Grigelionis B et al, VPS/TEV, pages 121–142, 1994b.

M. F. Chen. On ergodic region of Schlögl’s model, Carr Reports in Math. and Phys. No.13, 1993. In M. Röckner, Z. M. Ma and J. A. Yan, editors, Proc. Intern. Conf. on Dirichlet Forms and Stoch. Proc., pages 87–102. Walter de Gruyter, 1994c.

M. F. Chen. Estimation of spectral gap for Markov chains. Acta Math. Sin. New Ser., 12(4):337–360, 1996.

M. F. Chen. Coupling, spectral gap and related topics. Chin. Sci. Bulletin, (I): 42(16), 1321–1327; (II): 42(17), 1409–1416; (III): 42(18), 1497–1505, 1997a.

M. F. Chen. Trilogy of couplings and general formulas for lower bound of spectral gap. in “Probability Towards 2000”, Edited by L. Accardi and C. Heyde, Lecture Notes in Statistics, Vol.128, 123–136, Springer-Verlag, 1998a.

M. F. Chen. Estimate of exponential convergence rate in total variation by spectral gap. Acta Math. Sin. Ser. (A), 41(1) (Chinese Ed.), 1–6; Acta Math. Sin. New Ser. 14(1):9–16, 1998b.

M. F. Chen. Analytic proof of dual variational formula for the first eigenvalue in dimension one. Sci. Sin. (A), 42(8):805–815, 1999a.

M. F. Chen. Nash inequalities for general symmetric forms. Acta Math. Sin. Eng. Ser., 15(3):353–370, 1999b.

M. F. Chen. Eigenvalues, inequalities and ergodic theory (II). Advances in Math., 28(6):481–505, 1999c.

M. F. Chen. Single birth processes. Chin. Ann. of Math., 20B(1):77–82, 1999d.


M. F. Chen. Equivalence of exponential ergodicity and $L^2$-exponential convergence for Markov chains. Stoch. Proc. Appl., 87:281–297, 2000a.

M. F. Chen. Logarithmic Sobolev inequality for symmetric forms. Sci. Sin. (A), 43(6):601–608, 2000b.

M. F. Chen. Explicit bounds of the first eigenvalue. Sci. China (A), 43(10):1051–1059, 2000c.

M. F. Chen. Explicit criteria for several types of ergodicity. Chin. J. Appl. Prob. Stat., 17(2):1–8, 2001a.

M. F. Chen. Variational formulas and approximation theorems for the first eigenvalue. Sci. China (A), 44(4):409–418, 2001b.

M. F. Chen. Variational formulas of Poincaré-type inequalities in Banach spaces of functions on the line. Acta Math. Sin. Eng. Ser., 18(3):417–436, 2002a.

M. F. Chen. A new story of ergodic theory. in “Applied Probability”, 25–34, eds. R. Chan et al., AMS/IP Studies in Advanced Mathematics, 26, 2002b.

M. F. Chen. Ergodic convergence rates of Markov processes: eigenvalues, inequalities and ergodic theory. in Proceedings of “ICM 2002”, Higher Education Press, Beijing, III:41–52, 2002c.

M. F. Chen. Variational formulas of Poincaré-type inequalities for birth-death processes. to appear in Acta Math. Sin. Eng. Ser., 2002d.

M. F. Chen. Variational formulas of Poincaré-type inequalities for one-dimensional processes. IMS Lecture Notes – Monograph Series, Volume 41, Probability, Statistics and their Applications: Papers in Honor of Rabi Bhattacharya, 2003a.

M. F. Chen. Ten explicit criteria of one-dimensional processes. Proceedings of the Conference on Stochastic Analysis on Large Scale Interacting Systems, Advanced Studies in Pure Mathematics, Mathematical Society of Japan, 2003b.

M. F. Chen. Stochastic model of economic optimization. to appear in Proceedings of the First Sino-German Conference on Stochastic Analysis: A Satellite Conference of ICM 2002, 2003c.

M. F. Chen, W. D. Ding, and D. J. Zhu. Ergodicity of reversible reaction-diffusion processes with general reaction rates. Acta Math. Sin. New Ser., 10(1):99–112, 1994.

M. F. Chen, L. P. Huang, and X. J. Xu. Hydrodynamic limit for reaction-diffusion processes with several species. in “Probability and Statistics, Nankai Series of Pure and Applied Mathematics”, edited by S. S. Chern and C. N. Yang, World Scientific, 1991.

M. F. Chen and S. F. Li. Coupling methods for multi-dimensional diffusion processes. Ann. Prob., 17(1):151–177, 1989.

M. F. Chen and Y. Li. Stochastic model of economic optimization. J. Beijing Normal Univ., 30(2):185–194, 1994.


M. F. Chen and Y. G. Lu. Large deviations for Markov chains. Acta Sci. Sin., 10(2):217–222, 1990.

M. F. Chen, E. Scacciatelli, and L. Yao. Linear approximation of the first eigenvalue on compact manifolds. Sci. Sin. (A), 45(4), 2002.

M. F. Chen and F. Y. Wang. On order-preservation and positive correlations for multidimensional diffusion processes. Prob. Th. Rel. Fields, 95:421–428, 1993a.

M. F. Chen and F. Y. Wang. Application of coupling method to the first eigenvalue on manifold. Sci. Sin. (A), 23(11), 1993 (Chinese Edition), 37(1), 1994 (English Edition), 37(1), 1993b.

M. F. Chen and F. Y. Wang. Estimation of the first eigenvalue of second order elliptic operators. J. Funct. Anal., 131(2):345–363, 1995.

M. F. Chen and F. Y. Wang. General formula for lower bound of the first eigenvalue on Riemannian manifolds. Sci. Sin., 40(4):384–394, 1997a.

M. F. Chen and F. Y. Wang. Estimation of spectral gap for elliptic operators. Trans. Amer. Math. Soc., 349:1239–1267, 1997b.

M. F. Chen and F. Y. Wang. Cheeger’s inequalities for general symmetric forms and existence criteria for spectral gap. Abstract: Chin. Sci. Bulletin, 43(18), 1516–1519. Ann. Prob. 2000, 28(1), 235–257, 1998.

M. F. Chen and Y. Z. Wang. Algebraic convergence of Markov chains. to appear in Ann. Appl. Prob., 2000.

M. F. Chen, Y. H. Zhang, and X. L. Zhao. Dual variational formulas for the first Dirichlet eigenvalue on half-line. to appear in Sci. China, 2003.

R. R. Chen. An extended class of time-continuous branching processes. J. Appl. Prob., 34(1):14–23, 1997b.

F. R. K. Chung. Spectral Graph Theory. CBMS, 92, AMS, Providence, Rhode Island, 1997.

K. L. Chung. A New Introduction to Stochastic Processes. World Scientific, Singapore, 1995.

Y. Colin de Verdière. Spectres de Graphes. Publ. Soc. Math. France, 1998.

J. B. Conrey. The Riemann hypothesis. Notices of AMS, pages 341–353, 2003.

M. Cranston. Gradient estimates on manifolds using coupling. J. Funct. Anal., 99:110–124, 1991.

M. Cranston. A probabilistic approach to gradient estimates. Canad. Math. Bull., 35:46–55, 1992.

E. B. Davies. A review of Hardy inequality. Operator Theory: Adv. and Appl., 110:55–67, 1999.


D. A. Dawson and X. Zheng. Law of large numbers and central limit theorem for unbounded jump mean field models. Adv. Appl. Math., 12:293–326, 1991.

A. DeMasi and E. Presutti. Lectures on the Collective Behavior of Particle Systems. LNM 1502, Springer, 1992.

J. D. Deuschel. Algebraic $L^2$ decay of attractive critical processes on the lattice. Ann. Prob., 22(1):264–283, 1994.

J. D. Deuschel and D. W. Stroock. Large Deviations. Academic Press, New York, 1989.

P. Diaconis and L. Saloff-Coste. Logarithmic Sobolev inequalities for finite Markov chains. Ann. Appl. Prob., 6(3):695–750, 1996.

W. D. Ding, R. Durrett, and T. M. Liggett. Ergodicity of reversible reaction-diffusion processes. Prob. Th. Rel. Fields, 85(1):13–26, 1990.

W. D. Ding and X. G. Zheng. Ergodic theorems for linear growth processes with diffusion. Chin. Ann. Math., 10(B)(3):386–402, 1989.

B. Djehiche and I. Kaj. The rate function for some measure-valued jump processes. Ann. Prob., 23(3):1414–1438, 1995.

R. L. Dobrushin. Prescribing a system of random variables by conditional distributions. Theory Prob. Appl., 15:458–486, 1970.

W. Doeblin. Exposé de la théorie des chaînes simples constantes de Markov à un nombre fini d’états. Rev. Math. Union Interbalkanique, 2:77–105, 1938.

D. Down, S. P. Meyn, and R. L. Tweedie. Exponential and uniform ergodicity of Markov processes. Ann. Prob., 23:1671–1691, 1995.

D. C. Dowson and B. V. Landau. The Fréchet distance between multivariate normal distributions. J. Multivariate Anal., 12:450–455, 1982.

R. Durrett. Ten lectures on particle systems, St. Flour Lecture Notes. LNM, 1608, 1995.

R. Durrett and S. Levin. The importance of being discrete (and spatial). Theoret. Pop. Biol., 46:363–394, 1994.

R. Durrett and C. Neuhauser. Particle systems and reaction-diffusion equations. Ann. Prob., 22(1):289–333, 1994.

E. M. Dynkin. EMS. Springer-Verlag, 1991, Berlin, 1990.

Y. Egorov and V. Kondratiev. On Spectral Theory of Elliptic Operators. Birkhäuser, Berlin, 1996.

J. F. Escobar. Uniqueness theorems on conformal deformation of metrics, Sobolev inequalities, and an eigenvalue estimate. Comm. Pure and Appl. Math., XLIII:857–883, 1990.

J. F. Feng. The hydrodynamic limit for the reaction diffusion equation: an approach in terms of the GRP method. J. Theor. Prob., 9(2):285–299, 1996.


S. Feng. Large deviations for empirical process of interacting particle system with unbounded jumps. Ann. Prob., 22(4):2122–2151, 1994a.

S. Feng. Large deviations for Markov processes with interaction and unbounded jumps. Prob. Th. Rel. Fields, 100:227–252, 1994b.

S. Feng. Nonlinear master equation of multitype particle systems. Stoch. Proc. Appl., 57:247–271, 1995.

S. Feng and X. Zheng. Solutions of a class of nonlinear master equations. Stoch. Proc. Appl., 43:65–84, 1992.

E. Fischer. Über quadratische Formen mit reellen Koeffizienten. Monatsh. Math. Phys., 16:234–249, 1905.

M. Fukushima, Y. Oshima, and M. Takeda. Dirichlet Forms and Symmetric Markov Processes. Walter de Gruyter, 1994.

M. Fukushima and T. Uemura. Capacitary bounds of measures and ultracontractivity of time changed processes. to appear in J. Math. Pure et Appliquees, 2002.

T. Funaki. Singular limit for reaction-diffusion equation with self-similar Gaussian noise. In S. Kusuoka, D. Elworthy and I. Shigekawa, editors, Proceedings of Taniguchi symposium, New Trends in Stochastic Analysis, pages 132–152. World Sci., 1997.

T. Funaki. Singular limit for stochastic reaction-diffusion equation and generation of random interfaces. Acta Math. Sin. Eng. Ser., 15, 1999.

V. L. Girko. Theory of Random Determinants. Kluwer Acad. Publ., 1990.

C. R. Givens and R. M. Shortt. A class of Wasserstein metrics for probability distributions. Michigan Math. J., 31:231–240, 1984.

F. Z. Gong and F. Y. Wang. Functional inequalities for uniformly integrable semigroups and application to essential spectrums. Forum Math., 14:293–313, 2002.

D. Griffeath. Coupling methods for Markov processes. in “Studies in Probability and Ergodic Theory, Adv. Math., Supplementary Studies”, 1978.

M. Gromov. Paul Lévy’s isoperimetric inequality. preprint I.H.E.S., 1980.

M. Gromov. Metric Structures for Riemannian and Non-Riemannian Spaces. Progress in Mathematics 152. Birkhäuser, Boston·Basel·Berlin, 1999.

L. Gross. Logarithmic Sobolev inequalities. Amer. J. Math., 97:1061–1083, 1976.

L. Gross. Logarithmic Sobolev inequalities and contractivity properties of semigroups. Lecture Notes in Mathematics 1563, E. Fabes et al (Eds.), “Dirichlet Forms”, Springer-Verlag, 1993.

A. Guionnet and B. Zegarlinski. Lectures on logarithmic Sobolev inequalities. In M. Ledoux, J. Azéma, M. Émery and M. Yor, editors, LNM 1801, pages 1–134. Springer-Verlag, 2003.


E. Haagerup. Random matrices, free probability and the invariant subspace problem relative to a von Neumann algebra. in Proceedings of “ICM 2002”, Higher Education Press, Beijing, I:273–290, 2002.

K. Hamza and F. C. Klebaner. Conditions for integrability of Markov chains. J. Appl. Prob., 32:541–547, 1995.

D. Han. Existence of solution to the martingale problem for multispecies infinite dimensional reaction-diffusion particle systems. Chin. J. Appl. Prob. Statis., 6(3):265–278, 1990.

D. Han. Ergodicity for one-dimensional Brussel’s model (In Chinese). J. Xingjiang Univ., 8(3):37, 1991.

D. Han. Uniqueness of solution to the martingale problem for infinite-dimensional reaction-diffusion particle systems with multispecies (In Chinese). Chin. Ann. Math., 13A(2):271–277, 1992.

D. Han. Uniqueness for reaction-diffusion particle systems with multispecies (In Chinese). Chin. Ann. Math., 16A(5):572–578, 1995.

D. Han and X. J. Hu. Mathematical Models of Economics and Finance: Theory and Practice. Shanghai Jiaotong U. Press, Shanghai (In Chinese), 2003.

G. H. Hardy. Note on a theorem of Hilbert. Math. Zeitschr., 6:314–317, 1920.

H. Hennion. Limit theorems for products of positive random matrices. Ann. Probab., 25(4):1545–1587, 1997.

R. Holley. A class of interaction in an infinite particle systems. Adv. Math., 5:291–309, 1970.

R. Holley. Recent results on the stochastic Ising model. Rocky Mountain J. Math., 4(3):479–496, 1974.

Z. T. Hou and Q. F. Guo. Time-Homogeneous Markov Processes with Countable State Space (In Chinese). Beijing Sci. Press (1978). English translation (1988): Beijing Sci. Press and Springer-Verlag, 1978.

Z. T. Hou, Z. M. Liu, J. P. Li, J. Z. Zhou, and C. G. Yuan. Birth-death Processes. Hunan Sci. Press, Hunan, 2000.

E. P. Hsu. Stochastic Analysis on Manifolds. Amer. Math. Soc., Providence, R. I., 2002.

L. K. Hua. The mathematical theory of global optimization on planned economy. Kexue Tongbao, (I): 1984, No.12, 705–709. (II): 1984, No.13, 769–772. (III): 1984, No.16, 961–965. (V), (VI) and (VII): 1984, No.18, 1089–1092. (VIII): 1984, No.21, 1281–1282. (IV): 1985, No.1, 1–2. (X): 1985, No.9, 641–646. (XI): 1985, No.24, 1841–1844 (In Chinese), 1984.

L. K. Hua and S. Hua. The study of the real square matrix with both left positive eigenvector and right positive eigenvector. Shuxue Tongbao, 8:30–32 (In Chinese), 1985.


L. P. Huang. The existence theorem of the stationary distributions of infinite particlesystems (In Chinese). Chin. J. Appl. Prob. Statis., 3(2):152–158, 1987.

C. R. Hwang, S. Y. Hwang-Ma, and S. J. Sheu. Accelerating diffusions. preprint,2002.

N. Ikeda and S. Watanabe. Stochastic Differential Equations and Diffusion Processes.2nd, Ed. North-Holland, addr Kodansha, Tokyo, 1988.

S. R. Jarner and G. Roberts. Polynomial convergence rates of Markov chains. Ann.Appl. Prob., 12:224–247, 2002.

M. R. Jerrum and A. J. Sinclair. Approximating the permanent. SIAM J. Comput.,18:1149–1178, 1989.

F. Jia. Estimate of the first eigenvalue of a compact Riemannian manifold with riccicurvature bounded below by a negative constant (In Chinese). Chin. Ann. Math.,12A(4):496–502, 1991.

I. S. Kac and M. G. Krein. Criteria for discreteness of the spectrum of a singularstring. Izv. Vyss. Ucebn. Zaved. Mat., 2:136–153 (In Russian), 1958.

A. Kaimanovich. Dirichlet norms, capacities and generalized isoperimetric inequalitiesfor Markov operators. Potential Analysis, 1:61–82, 1992.

L. S. Kang et al. Non-numerically Parallel Algorithms (In Chinese), volume 1. Pressof Sciences, Beijing, 1994.

W. Kendall. Nonnegative ricci curvature and the Brownian coupling property. Stochas-tics, 19:111–129, 1986.

G. Kersting and F. C. Klebaner. Sharp conditions for nonexplosions in Markov jumpprocesses. Ann. Prob., 23(1):268–272, 1995.

H. Kesten and F. Spitzer. Convergence in distribution of products of random matrices.Z. Wahrs., 67:363–386, 1984.

C. Kipnis and C. Landim. Scaling Limits of Interacting Particle Systems. Springer-Verlag, Berlin, 1999.

A. A. Konstantinov, V. P. Maslov, and A. M. Chebotarev. Probability representations of solutions of the Cauchy problem for quantum mechanical solutions. Russian Math. Surveys, 45(6):1–26, 1990.

I. Kontoyiannis and S. P. Meyn. Spectral theory and limit theory for geometrically ergodic Markov processes. Ann. Appl. Prob., 13:304–362, 2003.

S. Kotani and S. Watanabe. Krein’s spectral theory of strings and generalized diffusion processes. Lecture Notes in Math., 923:235–259, 1982.

C. Landim, S. Sethuraman, and S. R. S. Varadhan. Spectral gap for zero range dynamics. Ann. Prob., 24:1871–1902, 1996.

G. F. Lawler and A. D. Sokal. Bounds on the L2 spectrum for Markov chain and Markov processes: a generalization of Cheeger’s inequality. Trans. Amer. Math. Soc., 309:557–580, 1988.

M. Ledoux. Concentration of measure and logarithmic Sobolev inequalities. In Seminaire de Probabilites 33. LNM 1709, pages 120–216. Springer-Verlag, 1999.

M. Ledoux. Logarithmic Sobolev inequalities for unbounded spin systems revisited. In Seminaire de Probabilites 35. LNM 1755, pages 167–194. Springer-Verlag, 2001.

W. Leontief. Quantitative input–output relation in the economic system of the United States. Review of Economics and Statistics, XVIII(3):105–125, 1936.

W. Leontief. The structure of the American Economy 1919–1939. Oxford U. Press, New York, 1951.

W. Leontief. Input-Output Economics. 2nd Ed., Oxford U. Press, 1986.

P. Levy. Problemes Concrets d’Analyse Fonctionnelle. Gauthier–Villars, Paris, 1951.

P. Li. Lecture Notes on Geometric Analysis. Seoul National U., Korea, 1993.

P. Li and S. T. Yau. Estimates of eigenvalue of a compact Riemannian manifold. Amer. Math. Soc. Proc. Symp. Pure Math., 36:205–240, 1980.

Y. Li. Uniqueness for infinite-dimensional reaction-diffusion processes (In Chinese). Chin. Sci. Bulletin, 22:1681–1684, 1991.

Y. Li. Ergodicity of a class of reaction-diffusion processes with translation invariant coefficients (In Chinese). Chin. Ann. Math., 16A(2):223–229, 1995.

A. Lichnerowicz. Geometrie des Groupes de Transformations. Dunod, Paris, 1958.

T. M. Liggett. An infinite particle system with zero range interactions. Ann. Prob., 1:240–253, 1973.

T. M. Liggett. Interacting Particle Systems. Springer–Verlag, 1985.

T. M. Liggett. Exponential L2 convergence of attractive reversible nearest particle systems. Ann. Prob., 17:403–432, 1989.

T. M. Liggett. L2 rates of convergence for attractive reversible nearest particle systems: the critical case. Ann. Prob., 19(3):935–959, 1991.

T. M. Liggett and F. Spitzer. Ergodic theorems for coupled random walks and other systems with locally interacting components. Z. Wahrs., 56:443–468, 1981.

T. Lindvall. Lectures on the Coupling Method. Wiley, New York, 1992.

T. Lindvall. On Strassen’s theorem, on stochastic domination. Electr. Comm. Probab., 4:51–59, 1999.

T. Lindvall and L. C. G. Rogers. Coupling of multidimensional diffusion processes. Ann. Prob., 14(3):860–872, 1986.

J. S. Lu. The optimal coupling for single–birth reaction–diffusion processes and its applications (In Chinese). J. Beijing Normal Univ., 33(1):10–17, 1997.

J. H. Luo. On discrete analog of Poincare–type inequalities and density representation. Unpublished, 1992.

C. Y. Ma. The Spectrum of Riemannian Manifolds. Press of Nanjing U., Nanjing, (In Chinese), 1993.

Z. M. Ma and M. Rockner. Introduction to the Theory of (Non-symmetric) Dirichlet Forms. Springer-Verlag, 1992.

Y. H. Mao. On empty essential spectrum for Markov processes in dimension one. preprint, 2000.

Y. H. Mao. Lp-Poincare inequality for general symmetric forms. preprint, 2001a.

Y. H. Mao. General Sobolev type inequalities for symmetric forms. preprint, 2001b.

Y. H. Mao. The logarithmic Sobolev inequalities for birth-death process and diffusion process on the line. Chin. J. Appl. Prob. Statis., 18(1):94–100, 2002a.

Y. H. Mao. Nash inequalities for Markov processes in dimension one. Acta. Math. Sin. Eng. Ser., 18(1):147–156, 2002b.

Y. H. Mao. In preparation. 2002c.

Y. H. Mao. Strong ergodicity for Markov processes by coupling methods. J. Appl. Prob., 39:839–852, 2002d.

Y. H. Mao. Convergence rates in strong ergodicity for Markov processes. preprint, 2002e.

Y. H. Mao. Ergodic degree for continuous–time Markov chains. preprint, 2002f.

Y. H. Mao and S. Y. Zhang. Comparison of some convergence rates for Markov process (In Chinese). Acta. Math. Sin., 43(6):1019–1026, 2000.

F. Martinelli. Lectures on Glauber dynamics for discrete spin models. In LNM 1717, pages 93–191. Springer-Verlag, 1999.

V. G. Maz’ya. Sobolev Spaces. Springer-Verlag, 1985.

M. L. Mehta. Random Matrices. 2nd Ed., Academic Press, New York, 1991.

S. P. Meyn and R. L. Tweedie. Stability of Markovian processes (III): Foster-Lyapunov criteria for continuous-time processes. Adv. Appl. Prob., 25:518–548, 1993a.

S. P. Meyn and R. L. Tweedie. Markov Chains and Stochastic Stability. Springer–Verlag, London, 1993b.

L. Miclo. Relations entre isopérimétrie et trou spectral pour les chaînes de Markov finies. Prob. Th. Rel. Fields, 114:431–485, 1999a.

L. Miclo. An example of application of discrete Hardy’s inequalities. Markov Processes Relat. Fields, 5:319–330, 1999b.

R. A. Minlos. Invariant subspaces of the stochastic Ising high temperature dynamics. Markov Processes Relat. Fields, 2:263–284, 1996.

R. A. Minlos and A. Trisch. Complete spectral decomposition of the generator for one-dimensional Glauber dynamics (In Russian). Uspekhi Matem. Nauk, 49:209–211, 1994.

T. S. Mountford. The ergodicity of a class of reversible reaction-diffusion processes. Prob. Th. Rel. Fields, 92(2):259–274, 1992.

B. Muckenhoupt. Hardy’s inequality with weights. Studia Math., XLIV:31–38, 1972.

A. Mukherjea. Tightness of products of i.i.d. random matrices. Prob. Th. Rel. Fields, 87:389–401, 1991.

J. Nash. Continuity of solutions of parabolic and elliptic equations. Amer. J. Math., 80:931–954, 1958.

C. Neuhauser. An ergodic theorem for Schlogl models with small migration. Prob. Th. Rel. Fields, 85(1):27–32, 1990.

E. Nummelin. General Irreducible Markov Chains and Non-Negative Operators. Cambridge Univ. Press, 1984.

E. Nummelin and P. Tuominen. Geometric ergodicity of Harris recurrent chains with applications to renewal theory. Stoch. Proc. Appl., 12:187–202, 1982.

I. Olkin and R. Pukelsheim. The distance between two random vectors with given dispersion matrices. Linear Algebra Appl., 48:257–263, 1982.

B. Opic and A. Kufner. Hardy-type Inequalities. Longman, New York, 1990.

V. I. Oseledec. A multiplicative ergodic theorem. Lyapunov characteristic number for dynamical systems. Trans. Moscow Math. Soc., 19:197–231, 1968.

A. Perrut. Hydrodynamic limits for a two-species reaction-diffusion process. Ann. Appl. Prob., 10(1):163–191, 2000.

M. M. Rao and Z. D. Ren. Theory of Orlicz Spaces. Marcel Dekker, Inc., New York, 1991.

G. O. Roberts and J. S. Rosenthal. Geometric ergodicity and hybrid Markov chains. Electron. Comm. Probab., 2:13–25, 1997.

G. O. Roberts and R. L. Tweedie. Geometric L2 and L1 convergence are equivalent for reversible Markov chains. J. Appl. Probab., 38(A):37–41, 2001.

M. Rockner and F. Y. Wang. Weak Poincare inequalities and L2-convergence rates of Markov semigroups. J. Funct. Anal., 185(2):564–603, 2001.

J. S. Rosenthal. Quantitative convergence rates of Markov chains: A simple account. Elec. Comm. Prob., 7(13):123–128, 2002.

O. S. Rothaus. Diffusion on compact Riemannian manifolds and logarithmic Sobolev inequalities. J. Funct. Anal., 42:102–109, 1981.

O. S. Rothaus. Analytic inequalities, isoperimetric inequalities and logarithmic Sobolev inequalities. J. Funct. Anal., 64:296–313, 1985.

L. Saloff-Coste. Lectures on finite Markov chains. In Lectures on Probability Theory and Statistics (Saint-Flour, 1996), LNM 1665, pages 301–413. Springer-Verlag, 1997.

R. Schoen and S. T. Yau. Differential Geometry. Science Press, Beijing, China, (In Chinese), English Translation: Lectures on Differential Geometry, International Press (1994), 1988.

R. H. Schonmann. Slow droplet-driven relaxation of stochastic Ising models in the vicinity of the phase coexistence region. Commun. Math. Phys., 161:1–49, 1994.

T. Shiga. Stepping stone models in population, genetics and population dynamics. In “Stochastic Processes in Physics and Engineering”, edited by S. Albeverio et al., pages 345–355, 1988.

A. Sinclair. Algorithms for Random Generation and Counting: A Markov Chain Approach. Birkhauser, 1993.

A. D. Sokal and L. E. Thomas. Absence of mass gap for a class of stochastic contour models. J. Statis. Phys., 51(5/6):907–947, 1988.

J. S. Song. Time-continuous MDP with unbounded rate. Sci. Sin., 12:1258–1267, 1987.

V. Strassen. The existence of probability measures with given marginals. Ann. Math. Statist., 36:423–439, 1965.

D. W. Stroock. An Introduction to the Theory of Large Deviations. Springer–Verlag, 1984.

W. G. Sullivan. The L2 spectral gap of certain positive recurrent Markov chains and jump processes. Z. Wahrs., 67:387–398, 1984.

S. Z. Tang. The existence and uniqueness of reaction-diffusion processes with multi-species and bounded diffusion rates (In Chinese). J. Chin. Appl. Prob. Stat., 1:11–22, 1985.

P. Tuominen and R. L. Tweedie. Subgeometric rates of convergence of f-ergodic Markov chains. Adv. Appl. Prob., 26:775–798, 1994.

R. L. Tweedie. Criteria for ergodicity, exponential ergodicity and strong ergodicity of Markov processes. J. Appl. Prob., 18:122–130, 1981.

S. S. Vallender. Calculation of the Wasserstein distance between probability distributions on line. Theory Prob. Appl., 18:784–786, 1973.

E. van Doorn. Stochastic Monotonicity and Queuing Applications of Birth-Death Processes, volume 4. Lecture Notes in Statistics, Springer–Verlag, 1981.

E. van Doorn. Conditions for exponential ergodicity and bounds for the decay parameter of a birth-death process. Adv. Appl. Prob., 17:514–530, 1985.

E. van Doorn. Representations and bounds for zeros of orthogonal polynomials and eigenvalues of sign-symmetric tri-diagonal matrices. J. Approx. Th., 51:254–266, 1987.

E. van Doorn. Quasi-stationary distributions and convergence to quasi-stationarity of birth-death processes. Adv. Appl. Prob., 23:683–700, 1991.

E. van Doorn. Representations for the rate of convergence of birth-death processes. Theory Probab. Math. Statist., 65:36–42, 2002.

N. Varopoulos. Hardy-Littlewood theory for semigroups. J. Funct. Anal., 63:240–260, 1985.

D. Voiculescu, K. Dykema, and A. Nica. Free Random Variables. CRM Monograph Series, vol. 1, AMS, Providence, R. I., 1992.

Z. Vondracek. An estimate for the L2-norm of a quasi continuous function with respect to a smooth measure. Arch. Math., 67:408–414, 1996.

F. Wang and Y. H. Zhang. F-Sobolev inequality for general symmetric forms. J. North–eastern Math., In press, 2003.

F. Y. Wang. Gradient estimates for generalized harmonic functions on manifolds. Chinese Sci. Bull., 39(22):1849–1852, 1994a.

F. Y. Wang. Gradient estimates in Rd. Canad. Math. Bull., 37(4):560–570, 1994b.

F. Y. Wang. Ergodicity for infinite-dimensional diffusion processes on manifolds. Sci. Sin. Ser (A), 37(2):137–146, 1994c.

F. Y. Wang. Uniqueness of Gibbs states and the L2-convergence of infinite-dimensional reflecting diffusion processes. Sci. Sin. Ser (A), 32(8):908–917, 1995.

F. Y. Wang. Estimation of the first eigenvalue and the lattice Yang–Mills fields. Chin. J. Math. 1996, 17A(2) (In Chinese), 147–154; Chin. J. Contem. Math., 1996, 17(2) (In English), 119–126, 1996.

F. Y. Wang. Estimates of semigroups and eigenvalues using functional inequalities. preprint, 1999a.

F. Y. Wang. Functional inequalities for empty essential spectrum. J. Funct. Anal., 170:219–245, 2000a.

F. Y. Wang. Functional inequalities, semigroup properties and spectrum estimates. Infinite Dim. Anal., Quantum Probab. and related Topics, 3(2):263–295, 2000b.

F. Y. Wang. Sobolev type inequalities for general symmetric forms. Proc. Amer. Math. Soc., 128(12):3675–3682, 2001a.

F. Y. Wang. Convergence rates of Markov semigroups in probability distances. preprint, 2001b.

F. Y. Wang. A generalized Bochner–type inequality. preprint, 2002.

F. Y. Wang and M. P. Xu. On order-preservation for couplings of multidimensional diffusion processes. Chinese J. Appl. Probab. Stat., 13(2):142–148, 1997.

Y. Z. Wang. Convergence rate in total variation for diffusion processes. Chin. Ann. Math. Chinese Ed. 20(2), 261–266; English Ed., 20(2):323–329, 1999b.

Z. K. Wang. Birth-death Processes and Markov Chains. Science Press, Beijing (In Chinese), 1980.

Z. K. Wang and X. Q. Yang. Birth and Death Processes and Markov Chains. Springer-Verlag, Berlin, 1992.

L. N. Wasserstein. Markov processes on a countable product space, describing large systems of automata (In Russian). Problemy Peredachi Informatsii, 5:64–73, 1969.

E. Wigner. Characteristic vectors of bordered matrices with infinite dimensions. Ann. Math., 62:548–564, 1955.

L. M. Wu. Essential spectral radius for Markov semigroups (I): discrete time case. preprint, 2002.

S. J. Yan and M. F. Chen. Multidimensional q-processes. Chin. Ann. Math., 7B(1):90–110, 1986.

S. J. Yan and Z. B. Li. Probability models of non-equilibrium systems and the master equations (In Chinese). Acta Phys. Sin., 29:139–152, 1980.

D. G. Yang. Lower bound estimates of the first eigenvalue for compact manifolds with positive Ricci curvature. Pacific J. Math., 190(2):383–398, 1999.

H. C. Yang. Estimate of the first eigenvalue of a compact Riemannian manifold with Ricci curvature bounded below by a negative constant. Sci. Sin. (A), 32(7):698–700, (In Chinese), 1989.

X. Q. Yang. Constructions of Time-homogeneous Markov Processes with Denumerable States. Hunan Sci. Press, Hunan (In Chinese), 1986.

H. J. Zhang, X. Lin, and Z. T. Hou. Uniformly polynomial convergence for standard transition functions. In “Birth–death Processes” by Hou, Z. T. et al (2000), Hunan Sci. Press, Hunan, 2000.

S. Y. Zhang. Existence and application of optimal Markovian coupling with respect to non-negative lower semi-continuous functions. Acta Math. Sin. Eng. Ser., 16(2):261–270, 2000a.

S. Y. Zhang and Y. H. Mao. Exponential convergence rate in Boltzmann-Shannon entropy. Sci. Sin. (A), 44(3):280–285, 2000.

Y. H. Zhang. Conservativity of couplings for jump processes (In Chinese). J. Beijing Normal Univ., 30(3):305–307, 1994.

Y. H. Zhang. Sufficient and necessary conditions for stochastic comparability of jump processes. Acta Math. Sin. Eng. Ser., 16(1):99–102, 2000b.

Y. H. Zhang. Strong ergodicity for continuous-time Markov chains. J. Appl. Prob., 38:270–277, 2001.

D. Zhao. Lower estimate of the first eigenvalue on compact Riemannian manifolds. Science in China, Ser. A, 42(9):897–904, 1999.

X. G. Zheng and W. D. Ding. Existence theorems for linear growth processes with diffusion. Acta Math. Sin. New Ser., 7(1):25–42, 1987.

J. Q. Zhong and H. C. Yang. Estimates of the first eigenvalue of a compact Riemannian manifold. Sci. Sin., 27(12):1251–1265, 1984.

Index

F-Sobolev inequality, 129, 132
L2-algebraic ergodicity, 141
Q-matrix, 19, 150
Q-process, 20, 83
Wp-distance, 27
α, 88
λ0, 67, 72, 88, 89
λ1, 2, 3, 48, 63, 66, 67, 88
ρ-optimal coupling, 28, 29
ρ-optimal coupling operator, 49
ϕ-optimal coupling, 31
q-pair, 19
q-process, 19
(MO), 21
(MP), 19

algebraic convergence, 133
algebraic form, 178
analytic method, 42, 99
approximating procedure, 9, 108, 109, 112

Banach space, 111, 124
basic coupling, 18, 22
birth–death process, 8, 81, 85, 88, 104, 105, 114, 116, 118
Brussel’s model, 151

Cheeger’s constant, 10, 11, 63
classical coupling, 22, 24
classical variational formula, 4
collapse time, 169
conservative, 19
consumption matrix, 173
coupling, 17, 24, 43, 45, 157
coupling by inner reflection, 23
coupling by marching soldiers, 159
coupling by reflection, 24, 28
coupling method, 7, 43, 102
coupling of marching soldiers, 22, 24
coupling operator, 21
coupling time, 25, 45

Dirichlet eigenvalue, 71
dual variational formulas, 9, 88, 89, 108, 109

eigenfunction in weak sense, 33
eigenvector’s method, 169
ergodic criteria, 84, 93
expending coefficients, 168
explicit bound, 9, 108, 109, 112, 114, 116, 118
explicit criterion, 107, 109, 112, 114, 116, 117
explicit estimate, 114, 115
exponential ergodicity, 12, 84, 138
exponential form, 178

first eigenvalue, 1
FKG-inequality, 17
functional inequalities, 123

gradient estimate, 36

hydrodynamic limit, 163

idealized model, 168
independent coupling, 17, 22
input–output method, 168
isoperimetric constant, 64, 124, 129
isoperimetric inequality, 64

jump condition, 19
jump process, 19

Leontief’s method, 168
Lyapunov exponent, 172
Liggett–Stroock inequality, 91, 131
logarithmic Sobolev inequality, 10, 74, 91, 106, 117, 125, 137

marginality, 17, 19, 21, 24
Markov chain, 20, 83
Markov jump processes, 19
Markovian coupling, 19
modified coupling of marching soldiers, 23

Nash inequality, 10, 79, 91, 106, 137
new variational formula, 5

one-dimensional diffusion, 87, 89, 93, 105, 117
optimal Markovian coupling, 48
ordinary ergodicity, 12, 84, 138
Orlicz space, 115, 125

Poincare inequality, 10, 91, 106, 123, 137
Poincare–type inequality, 106, 124, 125
polynomial model, 150
probability distance, 26

reaction–diffusion process, 149
regular, 19

Schlogl’s first model, 151
Schlogl’s second model, 151
single birth process, 99, 152
Sobolev-type inequality, 116
spectral gap, 33, 65, 113
splitting technique, 64
stochastic comparability, 31, 37
stochastic model with consumption, 173
strong ergodicity, 12, 84, 138
structure matrix, 168
super-Poincare inequality, 132
symmetric form, 64

totally stable, 19

variational formula, 112
vector of products, 167

Wasserstein distance, 27, 157
weak domain, 33
weaker-Poincare inequality, 133
weighted Hardy inequality, 88