
Operator Theory: Advances and Applications, Vol. 160

Editor: I. Gohberg

Editorial Office:
School of Mathematical Sciences
Tel Aviv University
Ramat Aviv, Israel

Editorial Board:
D. Alpay (Beer-Sheva), J. Arazy (Haifa), A. Atzmon (Tel Aviv), J.A. Ball (Blacksburg), A. Ben-Artzi (Tel Aviv), H. Bercovici (Bloomington), A. Böttcher (Chemnitz), K. Clancey (Athens, USA), L.A. Coburn (Buffalo), K.R. Davidson (Waterloo, Ontario), R.G. Douglas (College Station), A. Dijksma (Groningen), H. Dym (Rehovot), P.A. Fuhrmann (Beer Sheva), B. Gramsch (Mainz), G. Heinig (Chemnitz), J.A. Helton (La Jolla), M.A. Kaashoek (Amsterdam), H.G. Kaper (Argonne), S.T. Kuroda (Tokyo), P. Lancaster (Calgary), L.E. Lerer (Haifa), B. Mityagin (Columbus), V.V. Peller (Manhattan, Kansas), L. Rodman (Williamsburg), J. Rovnyak (Charlottesville), D.E. Sarason (Berkeley), I.M. Spitkovsky (Williamsburg), S. Treil (Providence), H. Upmeier (Marburg), S.M. Verduyn Lunel (Leiden), D. Voiculescu (Berkeley), H. Widom (Santa Cruz), D. Xia (Nashville), D. Yafaev (Rennes)

Honorary and Advisory Editorial Board:
C. Foias (Bloomington), P.R. Halmos (Santa Clara), T. Kailath (Stanford), P.D. Lax (New York), M.S. Livsic (Beer Sheva)

Birkhäuser Verlag
Basel · Boston · Berlin

Recent Advances in Operator Theory and its Applications
The Israel Gohberg Anniversary Volume

International Workshop on Operator Theory and its Applications
IWOTA 2003, Cagliari, Italy

Marinus A. Kaashoek
Sebastiano Seatzu
Cornelis van der Mee
Editors

A CIP catalogue record for this book is available from the Library of Congress, Washington D.C., USA.

Bibliographic information published by Die Deutsche Bibliothek: Die Deutsche Bibliothek lists this publication in the Deutsche Nationalbibliografie; detailed bibliographic data is available on the Internet at <http://dnb.ddb.de>.

ISBN 3-7643-7290-7 Birkhäuser Verlag, Basel – Boston – Berlin

This work is subject to copyright. All rights are reserved, whether the whole or part of the material is concerned, specifically the rights of translation, reprinting, re-use of illustrations, recitation, broadcasting, reproduction on microfilms or in other ways, and storage in data banks. For any kind of use, permission of the copyright owner must be obtained.

© 2005 Birkhäuser Verlag, P.O. Box 133, CH-4010 Basel, Switzerland
Member of the BertelsmannSpringer Publishing Group
Printed on acid-free paper produced from chlorine-free pulp. TCF ∞
Cover design: Heinz Hiltbrunner, Basel
Printed in Germany
ISBN-10: 3-7643-7290-7
e-ISBN: 3-7643-7398-9
ISBN-13: 978-3-7643-7290-3

9 8 7 6 5 4 3 2 1    www.birkhauser.ch

Editors:

Marinus A. Kaashoek
Department of Mathematics, FEW
Vrije Universiteit
De Boelelaan 1081A
1081 HV Amsterdam
The Netherlands
e-mail: [email protected]

Sebastiano Seatzu
Cornelis van der Mee
Dipartimento di Matematica
Università di Cagliari
Viale Merello 92
09123 Cagliari
Italy
e-mail: [email protected], [email protected]

2000 Mathematics Subject Classification: 34, 35, 45, 47, 65, 93

Contents

Editorial Preface .............................................................. vii

T. Aktosun, M.H. Borkowski, A.J. Cramer and L.C. Pittman
Inverse Scattering with Rational Scattering Coefficients and
Wave Propagation in Nonhomogeneous Media ....................................... 1

T. Ando
Aluthge Transforms and the Convex Hull of the Spectrum
of a Hilbert Space Operator ................................................... 21

W. Bhosri, A.E. Frazho and B. Yagci
Maximal Nevanlinna-Pick Interpolation for Points
in the Open Unit Disc ......................................................... 41

M.R. Capobianco, G. Criscuolo and P. Junghanns
On the Numerical Solution of a Nonlinear Integral Equation
of Prandtl's Type ............................................................. 53

M. Cappiello
Fourier Integral Operators and Gelfand-Shilov Spaces .......................... 81

D.Z. Arov and H. Dym
Strongly Regular J-inner Matrix-valued Functions
and Inverse Problems for Canonical Systems ................................... 101

C. Estatico
Regularization Processes for Real Functions
and Ill-posed Toeplitz Problems .............................................. 161

K. Galkowski
Minimal State-space Realization for a Class of nD Systems .................... 179

G. Garello and A. Morando
Continuity in Weighted Besov Spaces for Pseudodifferential
Operators with Non-regular Symbols ........................................... 195

G.J. Groenewald and M.A. Kaashoek
A New Proof of an Ellis-Gohberg Theorem on Orthogonal
Matrix Functions Related to the Nehari Problem ............................... 217

G. Heinig and K. Rost
Schur-type Algorithms for the Solution of Hermitian
Toeplitz Systems via Factorization ........................................... 233

M. Kaltenback, H. Winkler and H. Woracek
Almost Pontryagin Spaces ..................................................... 253

D.S. Kalyuzhnyi-Verbovetzkii
Multivariable ρ-contractions ................................................. 273

V. Kostrykin and K.A. Makarov
The Singularly Continuous Spectrum and Non-Closed
Invariant Subspaces .......................................................... 299

G. Mastroianni, M.G. Russo and W. Themistoclakis
Numerical Methods for Cauchy Singular Integral Equations
in Spaces of Weighted Continuous Functions ................................... 311

A. Oliaro
On a Gevrey-Nonsolvable Partial Differential Operator ........................ 337

V. Olshevsky and L. Sakhnovich
Optimal Prediction of Generalized Stationary Processes ....................... 357

P. Rocha, P. Vettori and J.C. Willems
Symmetries of 2D Discrete-Time Linear Systems ................................ 367

G. Rodriguez, S. Seatzu and D. Theis
An Algorithm for Solving Toeplitz Systems by Embedding
in Infinite Systems .......................................................... 383

B. Silbermann
Fredholm Theory and Numerical Linear Algebra ................................. 403

C.V.M. van der Mee and A.C.M. Ran
Additive and Multiplicative Perturbations of Exponentially
Dichotomous Operators on General Banach Spaces ............................... 413

C.V.M. van der Mee, L. Rodman and I.M. Spitkovsky
Factorization of Block Triangular Matrix Functions
with Off-diagonal Binomials .................................................. 425

G. Wanjala
Closely Connected Unitary Realizations of the Solutions to the
Basic Interpolation Problem for Generalized Schur Functions .................. 441

M.W. Wong
Trace-Class Weyl Transforms .................................................. 469


Editorial Preface

This volume contains a selection of papers in modern operator theory and its applications. Most of them are directly related to lectures presented at the Fourteenth International Workshop on Operator Theory and its Applications (IWOTA 2003), held at the University of Cagliari, Italy, during June 24–27, 2003.

The workshop, which was attended by 108 mathematicians – including a number of PhD and postdoctoral students – from 22 countries, featured eight special sessions on

1) control theory,
2) interpolation theory,
3) inverse scattering,
4) numerical estimates for operators,
5) numerical treatment of integral equations,
6) pseudodifferential operators,
7) realizations and transformations of analytic functions and indefinite inner product spaces, and
8) structured matrices.

The program consisted of 19 plenary lectures of 45 minutes and 78 lectures of 30 minutes in four parallel sessions.

The present volume reflects the wide range and rich variety of topics presented and discussed at the workshop, both within and outside the special sessions. The papers deal with inverse scattering, numerical ranges, pseudodifferential operators, numerical analysis, interpolation theory, multidimensional system theory, indefinite inner products, spectral factorization, and stationary processes.

Since Israel Gohberg, the president of the IWOTA steering committee, reached the age of 75 while the proceedings of IWOTA 2003 were being prepared, we decided to dedicate these proceedings to him on the occasion of his 75th birthday. All of the authors have joined the editors in dedicating their papers to him as well.

The Editors

[Photograph] Israel Gohberg, the president of the IWOTA steering committee

Operator Theory: Advances and Applications, Vol. 160, 1–20
© 2005 Birkhäuser Verlag Basel/Switzerland

Inverse Scattering with Rational Scattering Coefficients and Wave Propagation in Nonhomogeneous Media

Tuncay Aktosun, Michael H. Borkowski, Alyssa J. Cramer and Lance C. Pittman

Dedicated to Israel Gohberg on the occasion of his 75th birthday

Abstract. The inverse scattering problem for the one-dimensional Schrödinger equation is considered when the potential is real valued and integrable, has a finite first moment, and has no bound states. Corresponding to such potentials, for rational reflection coefficients with only simple poles in the upper half complex plane, a method is presented to recover the potential and the scattering solutions explicitly. A numerical implementation of the method is developed. For such rational reflection coefficients, the scattering wave solutions to the plasma-wave equation are constructed explicitly. The discontinuities in these wave solutions and in their spatial derivatives are expressed explicitly in terms of the potential.

Mathematics Subject Classification (2000). Primary 34A55; Secondary 34L40, 35L05, 47E05, 81U40.

Keywords. Inverse scattering, Schrödinger equation, Rational scattering coefficients, Wave propagation, Plasma-wave equation.

1. Introduction

Consider the Schrödinger equation
\[
\psi''(k,x) + k^2\,\psi(k,x) = V(x)\,\psi(k,x), \qquad x \in \mathbb{R}, \tag{1.1}
\]

Footnote: The research leading to this paper was supported by the National Science Foundation under grants DMS-0243673 and DMS-0204436 and by the Department of Energy under grant DE-FG02-01ER45951.

where the prime denotes the x-derivative, and the potential V is assumed to have no bound states and to belong to the Faddeev class. The bound states of (1.1) correspond to its square-integrable solutions. By the Faddeev class we mean the set of real-valued and measurable potentials for which $\int_{-\infty}^{\infty} dx\,(1+|x|)\,|V(x)|$ is finite.

Via the Fourier transformation
\[
u(x,t) = \frac{1}{2\pi}\int_{-\infty}^{\infty} dk\,\psi(k,x)\,e^{-ikt},
\]
we can transform (1.1) into the plasma-wave equation
\[
\frac{\partial^2 u(x,t)}{\partial x^2} - \frac{\partial^2 u(x,t)}{\partial t^2} = V(x)\,u(x,t), \qquad x, t \in \mathbb{R}. \tag{1.2}
\]

In the absence of bound states, (1.1) does not have any bounded solutions for k^2 < 0. The solutions for k^2 > 0 are known as the scattering solutions. Each scattering solution can be expressed as a linear combination of the two (linearly independent) Jost solutions from the left and the right, denoted by f_l and f_r, respectively, satisfying the respective asymptotic conditions
\[
f_l(k,x) = e^{ikx}\,[1 + o(1)], \qquad f_l'(k,x) = ik\,e^{ikx}\,[1 + o(1)], \qquad x \to +\infty,
\]
\[
f_r(k,x) = e^{-ikx}\,[1 + o(1)], \qquad f_r'(k,x) = -ik\,e^{-ikx}\,[1 + o(1)], \qquad x \to -\infty.
\]

We have
\[
f_l(k,x) = \frac{1}{T(k)}\,e^{ikx} + \frac{L(k)}{T(k)}\,e^{-ikx} + o(1), \qquad x \to -\infty,
\]
\[
f_r(k,x) = \frac{1}{T(k)}\,e^{-ikx} + \frac{R(k)}{T(k)}\,e^{ikx} + o(1), \qquad x \to +\infty,
\]
where L and R are the left and right reflection coefficients, respectively, and T is the transmission coefficient.

The solutions to (1.1) for k = 0 require special attention. Generically, f_l(0,x) and f_r(0,x) are linearly independent on R, and we have
\[
T(0) = 0, \qquad R(0) = L(0) = -1.
\]
In the exceptional case, f_l(0,x) and f_r(0,x) are linearly dependent on R and we have
\[
T(0) = \sqrt{1 - R(0)^2} > 0, \qquad -1 < R(0) = -L(0) < 1.
\]
When V belongs to the Faddeev class and has no bound states, it is known [1–5] that either one of the reflection coefficients R and L contains the appropriate information to construct the other reflection coefficient, the transmission coefficient T, the potential V, and the Jost solutions f_l and f_r. Our aim in this paper is to present explicit formulas for such a construction when the reflection coefficients are rational functions of k with simple poles in the upper half complex plane C^+. We will use C^- to denote the lower half complex plane and let $\overline{\mathbb{C}^+} := \mathbb{C}^+ \cup \mathbb{R}$ and $\overline{\mathbb{C}^-} := \mathbb{C}^- \cup \mathbb{R}$.

The recovery of V from a reflection coefficient constitutes the inverse scattering problem for (1.1). There has been a substantial amount of previous work [2, 6–14] done on the inverse scattering problem with rational reflection coefficients. The solution to this inverse problem can, for example, be obtained by solving the Marchenko integral equation [1–5]. Another way to solve this inverse problem is to use Sabatier's method [2, 12–14], utilizing transformations resembling Darboux transformations [1–3]. Dolveck-Guilpart developed [7] a numerical implementation of Sabatier's method. Yet another method is based on the Wiener-Hopf factorization of a 2×2 matrix [6] related to the scattering matrix for (1.1). It is also possible to use [15, 16] a minimal realization of a rational reflection coefficient and to recover the potential explicitly. The method discussed in our paper is closely related to that given in [6]. Here, we are able to write down the Jost solutions explicitly in terms of the poles in C^+ and the corresponding residues of the reflection coefficients. This also enables us to construct explicitly certain solutions to (1.2), which we call the Jost waves.

Our paper is organized as follows. In Section 2 we present the preliminary material needed for later sections, including an outline of the construction of T and L from the right reflection coefficient R. In Section 3 we present the explicit construction of the potential and the Jost solutions for x > 0 in terms of the poles in C^+ and the corresponding residues of R. Having constructed the left reflection coefficient L in terms of R, in Section 4 we present the explicit construction of the potential and the Jost solutions for x < 0 in terms of the poles in C^+ and the corresponding residues of L. In Section 5 we turn our attention to (1.2) and explicitly construct its solutions by using the Fourier transforms of the Jost solutions to (1.1). In Section 6 we analyze the discontinuities in such wave solutions and in their x-derivatives at each fixed t. Finally, in Section 7 we remark on the numerical implementation of our method.

2. Preliminaries

For convenience, we introduce the Faddeev functions from the left and right, denoted by m_l and m_r, respectively, defined as
\[
m_l(k,x) := e^{-ikx}\,f_l(k,x), \qquad m_r(k,x) := e^{ikx}\,f_r(k,x). \tag{2.1}
\]

From (1.1) and (2.1) it follows that

\[
m_l''(k,x) + 2ik\,m_l'(k,x) = V(x)\,m_l(k,x), \qquad x \in \mathbb{R}, \tag{2.2}
\]
\[
m_r''(k,x) - 2ik\,m_r'(k,x) = V(x)\,m_r(k,x), \qquad x \in \mathbb{R}.
\]

It is known [1–5] that

\[
f_l(-k,x) = -R(k)\,f_l(k,x) + T(k)\,f_r(k,x), \qquad k \in \mathbb{R}, \tag{2.3}
\]
\[
f_r(-k,x) = T(k)\,f_l(k,x) - L(k)\,f_r(k,x), \qquad k \in \mathbb{R}, \tag{2.4}
\]
or equivalently
\[
m_l(-k,x) = -R(k)\,e^{2ikx}\,m_l(k,x) + T(k)\,m_r(k,x), \qquad k \in \mathbb{R}, \tag{2.5}
\]
\[
m_r(-k,x) = T(k)\,m_l(k,x) - L(k)\,e^{-2ikx}\,m_r(k,x), \qquad k \in \mathbb{R}. \tag{2.6}
\]

When R and L are rational functions of k, their domains can be extended meromorphically from R to the entire complex plane C. Similarly, the Jost solutions and the Faddeev functions have extensions that are analytic in C^+ and meromorphic in C^-. We have
\[
f_l(-k^*,x) = f_l(k,x)^*, \qquad f_r(-k^*,x) = f_r(k,x)^*, \qquad k \in \mathbb{C},
\]
\[
T(-k^*) = T(k)^*, \qquad R(-k^*) = R(k)^*, \qquad L(-k^*) = L(k)^*, \qquad k \in \mathbb{C},
\]
where the asterisk denotes complex conjugation. Note, in particular, that
\[
|R(k)|^2 = R(k)\,R(-k), \qquad k \in \mathbb{R}. \tag{2.7}
\]

The scattering coefficients satisfy
\[
T(k)\,T(-k) + R(k)\,R(-k) = 1, \qquad k \in \mathbb{R}, \tag{2.8}
\]
\[
T(k)\,T(-k) + L(k)\,L(-k) = 1, \qquad k \in \mathbb{R},
\]
\[
L(k)\,T(-k) + R(-k)\,T(k) = 0, \qquad k \in \mathbb{R}, \tag{2.9}
\]
with appropriate meromorphic extensions to C.

In the rest of this section we outline the construction of T and L from R. From (2.7) and (2.8) we get
\[
T(k) = \big[1 - |R(k)|^2\big]\,\frac{1}{T(-k)}, \qquad k \in \mathbb{R}. \tag{2.10}
\]
If R(k) is a rational function of k ∈ R, so is 1 − |R(k)|^2. When V belongs to the Faddeev class and has no bound states, it is known [1–5] that T(k) is analytic in C^+, continuous in $\overline{\mathbb{C}^+}$, nonzero in $\overline{\mathbb{C}^+} \setminus \{0\}$, and 1 + O(1/k) as k → ∞ in $\overline{\mathbb{C}^+}$. Generically T(k) has a simple zero at k = 0, and T(0) ≠ 0 in the exceptional case. We have R(k) = o(1/k) as k → ±∞ in R, and hence the rationality of R implies that R(k) = O(1/k^2) as k → ∞ in C. With the help of (2.10), by factoring both the numerator and the denominator of 1 − |R(k)|^2, we can obtain T(k) by separating the zeros and poles of 1 − |R(k)|^2 in C^+ and in C^-.

In the exceptional case it is known [1–5] that |R(k)| < 1 for k ∈ R. Thus, in that case we get
\[
1 - |R(k)|^2 = \frac{\prod_a (k - k_a^+)}{\prod_m (k - k_m^+)}\,\frac{\prod_b (k - k_b^-)}{\prod_n (k - k_n^-)}, \qquad k \in \mathbb{R}, \tag{2.11}
\]
where $k_a^+, k_m^+ \in \mathbb{C}^+$ and $k_b^-, k_n^- \in \mathbb{C}^-$. Note that the left-hand side in (2.11) is an even function of k and it converges to 1 as k → ±∞. As a result we find that the degrees of the four polynomials $\prod_a (k - k_a^+)$, $\prod_b (k - k_b^-)$, $\prod_m (k - k_m^+)$, and $\prod_n (k - k_n^-)$ are all the same. Hence, from (2.10) and (2.11) we get

\[
T(k)\,\frac{\prod_n (k - k_n^-)}{\prod_b (k - k_b^-)} = \frac{1}{T(-k)}\,\frac{\prod_a (k - k_a^+)}{\prod_m (k - k_m^+)}, \tag{2.12}
\]
where the left-hand side is analytic in C^+, continuous in $\overline{\mathbb{C}^+}$, and 1 + O(1/k) as k → ∞ in $\overline{\mathbb{C}^+}$. Similarly, the right-hand side of (2.12) is analytic in C^-, continuous in $\overline{\mathbb{C}^-}$, and 1 + O(1/k) as k → ∞ in $\overline{\mathbb{C}^-}$. With the help of Morera's theorem, we conclude that each side of (2.12) must be equal to an entire function on C that converges to 1 at ∞. By Liouville's theorem both sides must then be equal to 1. Therefore, we obtain
\[
T(k) = \frac{\prod_b (k - k_b^-)}{\prod_n (k - k_n^-)}, \qquad k \in \mathbb{C}. \tag{2.13}
\]
The argument given above can easily be adapted to the generic case. In the generic case, it is known [1–5] that T(k) = O(k) as k → 0, and the construction of T from R is similarly obtained by replacing exactly one of the $k_b^-$ with zero and exactly one of the $k_a^+$ with zero in the above argument.

Let us write the expression in (2.13) for T(k) in a slightly different notation which will be useful in Section 5:

\[
T(k) =
\begin{cases}
\dfrac{k\,\prod_{j=1}^{n_z}(k + z_j)}{\prod_{j=1}^{n_z+1}(k + p_j)}, & T(0) = 0,\\[3mm]
\dfrac{\prod_{j=1}^{n_z}(k + z_j)}{\prod_{j=1}^{n_z}(k + p_j)}, & T(0) \neq 0,
\end{cases} \tag{2.14}
\]
where the z_j for 1 ≤ j ≤ n_z correspond to the zeros of T(−k) in C^+ and the p_j correspond to the poles there. Thus, the poles of 1/T(−k) in C^+ occur at k = z_j, and let us use τ_j to denote the residues there:
\[
\tau_j := \operatorname{Res}\!\left(\frac{1}{T(-k)},\, z_j\right), \qquad j = 1, \dots, n_z. \tag{2.15}
\]

Once T(k) is constructed, with the help of (2.9) we obtain
\[
L(k) = -\,\frac{R(-k)\,T(k)}{T(-k)}.
\]
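The construction just outlined is essentially a root-finding and sorting procedure, and a rough numerical rendering of it may help fix ideas. The following Python sketch (not from the paper; the function name, the numpy-based representation of R = p/q, and the tolerance are illustrative assumptions) factors 1 − R(k)R(−k) and keeps the lower-half-plane roots to form T(k), then obtains L(k) from the relation above; it covers only the exceptional case T(0) ≠ 0.

```python
import numpy as np

def scattering_from_R(p_coeffs, q_coeffs, tol=1e-8):
    """Sketch of the Section 2 construction, exceptional case only (T(0) != 0).
    R(k) = p(k)/q(k); coefficients are listed highest degree first."""
    p, q = np.poly1d(p_coeffs), np.poly1d(q_coeffs)

    def reflect(poly):
        # poly(-k): flip the sign of every odd-degree coefficient
        deg = len(poly.coeffs) - 1
        return np.poly1d([c * (-1.0) ** (deg - i) for i, c in enumerate(poly.coeffs)])

    pm, qm = reflect(p), reflect(q)
    num = q * qm - p * pm        # numerator of 1 - R(k)R(-k), cf. (2.7)-(2.10)
    den = q * qm                 # denominator of 1 - R(k)R(-k)

    kb = [r for r in num.roots if r.imag < -tol]   # the k_b^- of (2.11)
    kn = [r for r in den.roots if r.imag < -tol]   # the k_n^- of (2.11)

    def R(k):
        return p(k) / q(k)

    def T(k):                    # (2.13)
        return np.prod([k - r for r in kb]) / np.prod([k - r for r in kn])

    def L(k):                    # from (2.9)
        return -R(-k) * T(k) / T(-k)

    return T, L
```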

3. Construction on the positive half-line

In this section, when x > 0, we explicitly construct the Jost solutions and the potential in terms of the poles and residues of R(k) in C^+. We use n_l to denote the number of poles of R(k) in C^+, assume that such poles are simple and occur at k = k_{lj} in C^+, and use ρ_{lj} to denote the corresponding residues.

We define
\[
B_l(x,\alpha) := \frac{1}{2\pi}\int_{-\infty}^{\infty} dk\,[m_l(k,x) - 1]\,e^{-ik\alpha}. \tag{3.1}
\]

When α < 0, we have B_l(x,α) = 0 due to, for each fixed x, the analyticity of m_l(k,x) in C^+, the continuity of m_l(k,x) in $\overline{\mathbb{C}^+}$, and the fact that m_l(k,x) = 1 + O(1/k) as k → ∞ in $\overline{\mathbb{C}^+}$. From (2.5) we get

\[
\frac{1}{2\pi}\int_{-\infty}^{\infty} dk\,[m_l(-k,x) - 1]\,e^{ik\alpha}
= -\frac{1}{2\pi}\int_{-\infty}^{\infty} dk\,R(k)\,m_l(k,x)\,e^{ik(2x+\alpha)}
+ \frac{1}{2\pi}\int_{-\infty}^{\infty} dk\,[T(k)\,m_r(k,x) - 1]\,e^{ik\alpha}. \tag{3.2}
\]

The second integral on the right-hand side of (3.2) vanishes when α > 0 due to the fact that T(k) and m_r(k,x) are analytic for k ∈ C^+ and continuous for k ∈ $\overline{\mathbb{C}^+}$, and T(k)\,m_r(k,x) = 1 + O(1/k) as k → ∞ in $\overline{\mathbb{C}^+}$. Thus, from (3.1) and (3.2) we obtain
\[
B_l(x,\alpha) = -\frac{1}{2\pi}\int_{-\infty}^{\infty} dk\,R(k)\,e^{2ikx + ik\alpha}\,m_l(k,x), \qquad \alpha > 0. \tag{3.3}
\]

From (3.1) and the fact that B_l(x,α) = 0 for α < 0, we have
\[
m_l(k,x) = 1 + \int_0^{\infty} d\alpha\,B_l(x,\alpha)\,e^{ik\alpha}. \tag{3.4}
\]

When 2x + α > 0, the integral in (3.3) can be evaluated as a contour integral along the boundary of C^+, to which the only contribution comes from the poles of R(k) in C^+. Since such poles are assumed to be simple, we get
\[
B_l(x,\alpha) = -i\sum_{j=1}^{n_l} \rho_{lj}\,e^{2ik_{lj}x + ik_{lj}\alpha}\,m_l(k_{lj},x), \qquad 2x + \alpha > 0,\ \alpha > 0. \tag{3.5}
\]

Using (3.5) in (3.4), with the help of
\[
\int_0^{\infty} d\alpha\,e^{i(k + k_{lj})\alpha} = \frac{i}{k + k_{lj}}, \qquad k \in \mathbb{C}^+,
\]
we get
\[
m_l(k,x) = 1 + \sum_{j=1}^{n_l} \frac{\rho_{lj}\,e^{2ik_{lj}x}}{k + k_{lj}}\,m_l(k_{lj},x), \qquad x \ge 0. \tag{3.6}
\]

We are interested in determining m_l(k_{lj},x) appearing in (3.5) and (3.6). To do so, we put k = k_{lp} in (3.6) for 1 ≤ p ≤ n_l. Then, for x ≥ 0 we obtain
\[
m_l(k_{lp},x) = 1 + \sum_{j=1}^{n_l} \frac{\rho_{lj}\,e^{2ik_{lj}x}}{k_{lp} + k_{lj}}\,m_l(k_{lj},x), \qquad p = 1, \dots, n_l. \tag{3.7}
\]

Notice that (3.7) is a linear algebraic system and can be written as
\[
M_l(x)\begin{bmatrix} m_l(k_{l1},x)\\ \vdots\\ m_l(k_{l n_l},x)\end{bmatrix} = \begin{bmatrix} 1\\ \vdots\\ 1\end{bmatrix}, \tag{3.8}
\]
where M_l(x) is the n_l × n_l matrix-valued function whose (p,q)-entry is given by
\[
[M_l(x)]_{pq} := \delta_{pq} - \frac{\rho_{lq}\,e^{2ik_{lq}x}}{k_{lp} + k_{lq}}, \tag{3.9}
\]

with δ_{pq} denoting the Kronecker delta. The unique solvability of the linear system in (3.8), and hence the invertibility of M_l(x), follow from Corollary 4.2 of [6]. Using (3.8) in (3.6) we get, for x ≥ 0,
\[
m_l(k,x) = 1 + \begin{bmatrix} \dfrac{\rho_{l1}\,e^{2ik_{l1}x}}{k + k_{l1}} & \cdots & \dfrac{\rho_{l n_l}\,e^{2ik_{l n_l}x}}{k + k_{l n_l}} \end{bmatrix} M_l(x)^{-1}\begin{bmatrix} 1\\ \vdots\\ 1\end{bmatrix}. \tag{3.10}
\]

We can simplify (cf. p. 12 of [17]) the bilinear form in (3.10) and obtain, for x ≥ 0,
\[
m_l(k,x) = 1 - \frac{1}{\det M_l(x)}
\begin{vmatrix}
0 & \dfrac{\rho_{l1}\,e^{2ik_{l1}x}}{k + k_{l1}} & \cdots & \dfrac{\rho_{l n_l}\,e^{2ik_{l n_l}x}}{k + k_{l n_l}}\\
1 & & &\\
\vdots & & M_l(x) &\\
1 & & &
\end{vmatrix}, \tag{3.11}
\]
and hence we have written m_l(k,x) − 1 as the ratio of two determinants that are constructed solely in terms of the k_{lj} and ρ_{lj} with 1 ≤ j ≤ n_l.

Similarly, from (3.5) and (3.8) we get
\[
B_l(x,\alpha) = -i\begin{bmatrix} \rho_{l1}\,e^{ik_{l1}(2x+\alpha)} & \cdots & \rho_{l n_l}\,e^{ik_{l n_l}(2x+\alpha)} \end{bmatrix} M_l(x)^{-1}\begin{bmatrix} 1\\ \vdots\\ 1\end{bmatrix},
\]
or equivalently
\[
B_l(x,\alpha) = \frac{\det \Gamma_l(x,\alpha)}{\det M_l(x)}, \qquad x \ge 0,\ \alpha > 0, \tag{3.12}
\]
where Γ_l(x,α) is the (n_l + 1) × (n_l + 1) matrix defined as
\[
\Gamma_l(x,\alpha) := \begin{bmatrix}
0 & i\rho_{l1}\,e^{2ik_{l1}x + ik_{l1}\alpha} & \cdots & i\rho_{l n_l}\,e^{2ik_{l n_l}x + ik_{l n_l}\alpha}\\
1 & & &\\
\vdots & & M_l(x) &\\
1 & & &
\end{bmatrix}. \tag{3.13}
\]

It is pleasantly surprising that we have
\[
\det \Gamma_l(x,0^+) = \frac{d}{dx}\det M_l(x). \tag{3.14}
\]
The proof of (3.14) is somewhat involved, and we briefly describe its basic steps. First, in the matrix Γ_l(x,0^+), multiply the (j+1)st column by e^{-ik_{lj}x} and the (j+1)st row by e^{ik_{lj}x} for all 1 ≤ j ≤ n_l. The determinant remains unchanged. Then, use the cofactor expansion of the resulting determinant with respect to the first column and get
\[
\det \Gamma_l(x,0^+) = 0\,|\cdot| - e^{ik_{l1}x}\,|\cdot| + e^{ik_{l2}x}\,|\cdot| - \cdots + (-1)^{n_l}\,e^{ik_{l n_l}x}\,|\cdot|, \tag{3.15}
\]

where |·| denotes the appropriate subdeterminant. Next, put each coefficient e^{ik_{lj}x} on the right-hand side of (3.15) into the first row of the corresponding subdeterminant. We need to show that the resulting quantity is equal to the x-derivative of det M_l(x). In order to do so, in the matrix M_l(x) multiply the jth row by e^{ik_{lj}x} and the jth column by e^{-ik_{lj}x} for 1 ≤ j ≤ n_l, which results in no change in det M_l(x). Now take the x-derivative of the resulting determinant and write it as a sum where the jth term is the determinant of a matrix obtained by taking the x-derivative of the jth row of M_l(x). Then rewrite each term in the summation so that the row whose derivative has been evaluated is moved to the first row while the remaining rows are left in the same order. By comparison, we then conclude (3.14).

Using (3.14) in (3.12) we get
\[
B_l(x,0^+) = \frac{\frac{d}{dx}\det M_l(x)}{\det M_l(x)}, \qquad x \ge 0. \tag{3.16}
\]

It is known [1–5] that

\[
V(x) = -2\,\frac{d}{dx} B_l(x,0^+), \qquad x \in \mathbb{R}. \tag{3.17}
\]

Therefore, using (3.16) in (3.17) we get
\[
V(x) = -2\,\frac{d}{dx}\left[\frac{\frac{d}{dx}\det M_l(x)}{\det M_l(x)}\right], \qquad x > 0. \tag{3.18}
\]
From (3.9) we see that M_l(x) is uniquely constructed in terms of the poles and residues of R(k) in C^+. Thus, in (3.18) we have expressed V(x) for x > 0 in terms of the 2n_l constants k_{lj} and ρ_{lj} alone.

Alternatively, having constructed m_l(k,x) for x ≥ 0 as in (3.11), we can use (2.2) and obtain the potential for x > 0 as
\[
V(x) = \frac{m_l''(k,x) + 2ik\,m_l'(k,x)}{m_l(k,x)}. \tag{3.19}
\]
Note that even though the parameter k appears in the individual terms on the right-hand side of (3.19), it is absent from the right-hand side as a whole. In particular, using k = 0 in (3.19) we can evaluate V(x) for x > 0 as
\[
V(x) = \frac{m_l''(0,x)}{m_l(0,x)}. \tag{3.20}
\]
We have shown that, starting with a rational right reflection coefficient R, one can explicitly construct m_l(k,x) for x ≥ 0 and V(x) for x > 0. We can then obtain f_l(k,x) via (2.1). This means that we also have f_l(−k,x) in hand. Then, using (2.3) we also get f_r(k,x) for x ≥ 0 as
\[
f_r(k,x) = \frac{f_l(-k,x) + R(k)\,f_l(k,x)}{T(k)},
\]
with T(k) as in (2.14).

4. Construction on the negative half-line

In this section we present the explicit construction of the Jost solutions and the potential when x < 0. Finding m_r(k,x) and V(x) for x < 0 in terms of L(k) is similar to the construction of m_l(k,x) and V(x) for x > 0 from R as outlined in Section 3. As we shall see in (4.6) and (4.10), the explicit formulas for m_r(k,x) and V(x) for x < 0 are written in terms of the poles and residues of L(k) in C^+. We let n_r denote the number of poles of L(k) in C^+, and use k_{rj} and ρ_{rj} to denote those (simple) poles and the corresponding residues, respectively.

Let
\[
B_r(x,\alpha) := \frac{1}{2\pi}\int_{-\infty}^{\infty} dk\,[m_r(k,x) - 1]\,e^{-ik\alpha}. \tag{4.1}
\]

When α < 0, we get B_r(x,α) = 0 because, for each fixed x, m_r(k,x) is analytic in C^+, continuous in $\overline{\mathbb{C}^+}$, and 1 + O(1/k) as k → ∞ in $\overline{\mathbb{C}^+}$. Starting with (2.6) we show [cf. (3.3)] that
\[
B_r(x,\alpha) = -\frac{1}{2\pi}\int_{-\infty}^{\infty} dk\,L(k)\,e^{-2ikx + ik\alpha}\,m_r(k,x), \qquad \alpha > 0, \tag{4.2}
\]

and thus [cf. (3.4)] we obtain
\[
m_r(k,x) = 1 + \int_0^{\infty} d\alpha\,B_r(x,\alpha)\,e^{ik\alpha}. \tag{4.3}
\]

In order to evaluate the integral in (4.2), we use a contour integration along the infinite semicircle which is the boundary of C^+. Since the poles of L(k) in C^+ are assumed to be simple, we obtain [cf. (3.5)]
\[
B_r(x,\alpha) = -i\sum_{j=1}^{n_r} \rho_{rj}\,e^{-2ik_{rj}x + ik_{rj}\alpha}\,m_r(k_{rj},x), \qquad -2x + \alpha > 0,\ \alpha > 0. \tag{4.4}
\]

Using (4.4) in (4.3) we get [cf. (3.6)]
\[
m_r(k,x) = 1 + \sum_{j=1}^{n_r} \frac{\rho_{rj}\,e^{-2ik_{rj}x}}{k + k_{rj}}\,m_r(k_{rj},x), \qquad x \le 0. \tag{4.5}
\]

Proceeding as in Section 3 leading to (3.10), we get, for x ≤ 0,
\[
m_r(k,x) = 1 + \begin{bmatrix} \dfrac{\rho_{r1}\,e^{-2ik_{r1}x}}{k + k_{r1}} & \cdots & \dfrac{\rho_{r n_r}\,e^{-2ik_{r n_r}x}}{k + k_{r n_r}} \end{bmatrix} M_r(x)^{-1}\begin{bmatrix} 1\\ \vdots\\ 1\end{bmatrix},
\]
or equivalently
\[
m_r(k,x) = 1 - \frac{1}{\det M_r(x)}
\begin{vmatrix}
0 & \dfrac{\rho_{r1}\,e^{-2ik_{r1}x}}{k + k_{r1}} & \cdots & \dfrac{\rho_{r n_r}\,e^{-2ik_{r n_r}x}}{k + k_{r n_r}}\\
1 & & &\\
\vdots & & M_r(x) &\\
1 & & &
\end{vmatrix}, \tag{4.6}
\]

where M_r(x) is the n_r × n_r matrix-valued function whose (p,q)-entry is given by
\[
[M_r(x)]_{pq} := \delta_{pq} - \frac{\rho_{rq}\,e^{-2ik_{rq}x}}{k_{rp} + k_{rq}}. \tag{4.7}
\]

Let us remark that the invertibility of M_r(x) follows from Corollary 4.2 of [6]. Then [cf. (3.12)]
\[
B_r(x,\alpha) = \frac{\det \Gamma_r(x,\alpha)}{\det M_r(x)}, \qquad x \le 0,\ \alpha > 0, \tag{4.8}
\]
where Γ_r(x,α) is the (n_r + 1) × (n_r + 1) matrix defined as
\[
\Gamma_r(x,\alpha) := \begin{bmatrix}
0 & i\rho_{r1}\,e^{-2ik_{r1}x + ik_{r1}\alpha} & \cdots & i\rho_{r n_r}\,e^{-2ik_{r n_r}x + ik_{r n_r}\alpha}\\
1 & & &\\
\vdots & & M_r(x) &\\
1 & & &
\end{bmatrix}. \tag{4.9}
\]

Similarly as in the proof of (3.14), it can be shown that
\[
\det \Gamma_r(x,0^+) = -\,\frac{d}{dx}\det M_r(x),
\]
and hence (4.8) implies
\[
B_r(x,0^+) = \frac{-\frac{d}{dx}\det M_r(x)}{\det M_r(x)}, \qquad x \le 0.
\]

It is known [1–5] that
\[
V(x) = 2\,\frac{d}{dx} B_r(x,0^+), \qquad x \in \mathbb{R},
\]
and hence we obtain
\[
V(x) = -2\,\frac{d}{dx}\left[\frac{\frac{d}{dx}\det M_r(x)}{\det M_r(x)}\right], \qquad x < 0. \tag{4.10}
\]

Alternatively, having constructed m_r(k,x) as in (4.6) for x ≤ 0, we can evaluate V(x) for x < 0 via [cf. (3.19) and (3.20)]
\[
V(x) = \frac{m_r''(k,x) - 2ik\,m_r'(k,x)}{m_r(k,x)}, \tag{4.11}
\]
\[
V(x) = \frac{m_r''(0,x)}{m_r(0,x)}. \tag{4.12}
\]
Note that the right-hand side in (4.11) is independent of k as a whole.

If we start with R, we can construct T and L as in Section 2. Then, using the poles and residues of L(k) in C^+, we can construct m_r(k,x) for x ≤ 0 and V(x) for x < 0. We then obtain f_r(k,x) with the help of (2.1) and (4.6). Having f_r(k,x) and f_r(−k,x) in hand, via (2.4) we construct f_l(k,x) for x ≤ 0 by using
\[
f_l(k,x) = \frac{f_r(-k,x) + L(k)\,f_r(k,x)}{T(k)}.
\]

5. Wave propagation and Jost waves

We now wish to analyze certain solutions to the plasma-wave equation (1.2) when V belongs to the Faddeev class, there are no bound states, and the corresponding reflection coefficients are rational functions of k with simple poles in C^+.

We define the Jost wave from the left as
\[
J_l(x,t) := \frac{1}{2\pi}\int_{-\infty}^{\infty} dk\,f_l(k,x)\,e^{-ikt}. \tag{5.1}
\]

Using (2.1) in (5.1), we get
\[
J_l(x,t) = \frac{1}{2\pi}\int_{-\infty}^{\infty} dk\,e^{ik(x-t)} + \frac{1}{2\pi}\int_{-\infty}^{\infty} dk\,[m_l(k,x) - 1]\,e^{ik(x-t)}. \tag{5.2}
\]

From (3.1), (5.2), and the fact that $\int_{-\infty}^{\infty} dk\,e^{ik\alpha} = 2\pi\,\delta(\alpha)$, we obtain
\[
J_l(x,t) = \delta(x-t) + B_l(x,t-x), \qquad x, t \in \mathbb{R}, \tag{5.3}
\]
where δ(x) denotes the Dirac delta distribution. Note that B_l(x,t−x) = 0 if t − x < 0 because B_l(x,α) = 0 for α < 0, as we have seen in Section 3. Thus,
\[
J_l(x,t) = 0, \qquad x - t > 0. \tag{5.4}
\]
Comparing (5.3) with (3.12), we see that when x ≥ 0 and t − x > 0, we can express B_l(x,t−x) as the ratio of two determinants that are constructed explicitly with the help of (3.9) and (3.13). Hence, we have
\[
J_l(x,t) = \frac{\det \Gamma_l(x,t-x)}{\det M_l(x)}, \qquad x \ge 0,\ t - x > 0. \tag{5.5}
\]

We also need J_l(x,t) in the region with x − t < 0 and x + t < 0. Towards that goal, we can use (2.6) in (5.2) and get
\[
J_l(x,t) = \frac{1}{2\pi}\int_{-\infty}^{\infty} dk\,e^{ik(x-t)}
+ \frac{1}{2\pi}\int_{-\infty}^{\infty} dk\left[\frac{m_r(-k,x)}{T(k)} - 1\right] e^{ik(x-t)}
+ \frac{1}{2\pi}\int_{-\infty}^{\infty} dk\,\frac{L(k)}{T(k)}\,m_r(k,x)\,e^{-ik(x+t)},
\]
or equivalently
\[
J_l(x,t) = \delta(x-t)
+ \frac{1}{2\pi}\int_{-\infty}^{\infty} dk\left[\frac{m_r(k,x)}{T(-k)} - 1\right] e^{-ik(x-t)}
+ \frac{1}{2\pi}\int_{-\infty}^{\infty} dk\,\frac{L(k)}{T(k)}\,m_r(k,x)\,e^{-ik(x+t)}. \tag{5.6}
\]

This separation causes each of the two integrands in (5.6) to have a simple pole at k = 0 in the generic case. The zeros of T(−k) in C^+ contribute to the first integral in (5.6). The poles of L(k) in C^+ contribute to the second integral in (5.6). We find that the contribution from k = 0 to the two integrals on the right-hand side of (5.6) is given by
\[
i\,\theta(-x)\,[\theta(x+t) - \theta(t-x)]\,m_r(0,x)\,\operatorname{Res}(1/T(k),\,0), \tag{5.7}
\]
where θ(x) is the Heaviside function defined as
\[
\theta(x) := \begin{cases} 1, & x > 0,\\ 0, & x < 0.\end{cases}
\]
Note that (5.7) can be evaluated by using the fact that L(0) = −1 in the generic case and that T(k) is analytic and nonzero in C^+.

As in Section 4, let us use k_{rj} to denote the poles of L(k) in C^+ and ρ_{rj} the residues there. Similarly, as in (2.14) and (2.15), let us use z_j for the poles of 1/T(−k) in C^+ and τ_j for the corresponding residues for 1 ≤ j ≤ n_z. The contributions to the right-hand side of (5.6) from the zeros of T(−k) and the poles of L(k) in C^+ can be evaluated by using a contour integration along the infinite semicircle enclosing C^+. Hence, in the region with t − x > 0 and x + t < 0, that contribution is given by
\[
i\sum_{j=1}^{n_z} \tau_j\,m_r(z_j,x)\,e^{-iz_j(x-t)} + i\sum_{j=1}^{n_r} \frac{\rho_{rj}\,e^{-ik_{rj}(x+t)}}{T(k_{rj})}\,m_r(k_{rj},x). \tag{5.8}
\]

From (5.7) and (5.8) we see that, in the region with t − x > 0 and x + t < 0, the Jost wave from the left is given by
\[
J_l(x,t) = -i\,m_r(0,x)\,\operatorname{Res}(1/T(k),\,0)
+ i\sum_{j=1}^{n_z} \tau_j\,m_r(z_j,x)\,e^{-iz_j(x-t)}
+ i\sum_{j=1}^{n_r} \frac{\rho_{rj}\,e^{-ik_{rj}(x+t)}}{T(k_{rj})}\,m_r(k_{rj},x). \tag{5.9}
\]
Note that m_r(z_j,x) and m_r(k_{rj},x) can be evaluated explicitly by using (4.6), and T(k_{rj}) by using (2.14).

Finally, we will evaluate J_l(x,t) in the region with x < 0 and x + t > 0. In that region, the contribution to the first integral in (5.6) from the zeros of T(−k) in C^+ is evaluated with the help of a contour integration along the boundary of C^+, and we get the first summation term in (5.8). However, the second integral in (5.6) needs to be evaluated as a contour integral along the boundary of C^- due to the presence of the exponential term e^{-ik(x+t)} in the integrand. With the help of (2.9) we write that integral as
\[
\frac{1}{2\pi}\int_{-\infty}^{\infty} dk\,\frac{L(k)}{T(k)}\,m_r(k,x)\,e^{-ik(x+t)} = -\frac{1}{2\pi}\int_{-\infty}^{\infty} dk\,\frac{R(k)}{T(k)}\,m_r(-k,x)\,e^{ik(x+t)}, \tag{5.10}
\]
where the right-hand side can now be evaluated as a contour integral along the boundary of C^+. Let us now evaluate the contribution to that integral coming from the poles of R(k) in C^+ and also the poles of m_r(−k,x) in C^+. From Section 3

and (4.5) we see that the former poles occur at k = k_{lj} and the latter occur at k = k_{rj}. As a result, such contributions to the right-hand side of (5.10) can be explicitly evaluated. For example, when the sets $\{k_{lj}\}_{j=1}^{n_l}$ and $\{k_{rj}\}_{j=1}^{n_r}$ do not intersect, we get
\[
-i\sum_{j=1}^{n_l} \frac{\rho_{lj}\,e^{ik_{lj}(x+t)}}{T(k_{lj})}\,m_r(-k_{lj},x)
- i\sum_{j=1}^{n_r} \frac{R(k_{rj})\,e^{ik_{rj}(x+t)}}{T(k_{rj})}\,\operatorname{Res}(m_r(-k,x),\,k_{rj}). \tag{5.11}
\]
From (4.5) we see that
\[
\operatorname{Res}(m_r(-k,x),\,k_{rj}) = -\rho_{rj}\,e^{-2ik_{rj}x}\,m_r(k_{rj},x),
\]
and thus, in the region with x < 0 and x + t > 0, with the help of (5.6)–(5.11) we obtain

\[
J_l(x,t) = i\sum_{j=1}^{n_z} \tau_j\,m_r(z_j,x)\,e^{-iz_j(x-t)}
- i\sum_{j=1}^{n_l} \frac{\rho_{lj}\,e^{ik_{lj}(x+t)}}{T(k_{lj})}\,m_r(-k_{lj},x)
+ i\sum_{j=1}^{n_r} \frac{\rho_{rj}\,R(k_{rj})\,e^{-ik_{rj}(x-t)}}{T(k_{rj})}\,m_r(k_{rj},x), \tag{5.12}
\]
where we note that there is no contribution to J_l(x,t) from the poles at k = 0 [cf. (5.7)] in the region with x < 0 and x + t > 0. In case the sets $\{k_{lj}\}_{j=1}^{n_l}$ and $\{k_{rj}\}_{j=1}^{n_r}$ partially or wholly overlap, the integral on the right-hand side of (5.10) can be evaluated explicitly in a similar way as a contour integral along the boundary of C^+, and the result in (5.12) can be modified appropriately.

We can write the Jost wave J_l(x,t) by combining (5.3)–(5.5), (5.9), and (5.12) as
\[
\begin{aligned}
J_l(x,t) = {}& \delta(x-t) + \theta(x)\,\theta(t-x)\,\frac{\det \Gamma_l(x,t-x)}{\det M_l(x)}\\
&+ i\,\theta(-x)\,\theta(t-x)\sum_{j=1}^{n_z} \tau_j\,m_r(z_j,x)\,e^{-iz_j(x-t)}\\
&- i\,\theta(t-x)\,\theta(-x-t)\,m_r(0,x)\,\operatorname{Res}(1/T(k),\,0)\\
&+ i\,\theta(t-x)\,\theta(-x-t)\sum_{j=1}^{n_r} \frac{\rho_{rj}\,e^{-ik_{rj}(x+t)}}{T(k_{rj})}\,m_r(k_{rj},x)\\
&- i\,\theta(-x)\,\theta(x+t)\sum_{j=1}^{n_l} \frac{\rho_{lj}\,e^{ik_{lj}(x+t)}}{T(k_{lj})}\,m_r(-k_{lj},x)\\
&+ i\,\theta(-x)\,\theta(x+t)\sum_{j=1}^{n_r} \frac{\rho_{rj}\,R(k_{rj})\,e^{-ik_{rj}(x-t)}}{T(k_{rj})}\,m_r(k_{rj},x),
\end{aligned}
\]
and hence, in terms of the quantities that have been constructed explicitly once the rational reflection coefficient R is known, we have constructed J_l(x,t) for all x, t ∈ R except when x = t and also when x = −t with x < 0. In Section 6 we will see that J_l(x,t) may have jump discontinuities when x = t and also when x = −t with x < 0, and we will evaluate those discontinuities.

We define the Jost wave from the right in a similar manner, by letting
\[
J_r(x,t) := \frac{1}{2\pi}\int_{-\infty}^{\infty} dk\,f_r(k,x)\,e^{-ikt}. \tag{5.13}
\]

Using (2.1) in (5.13), we get [cf. (5.2)]
\[
J_r(x,t) = \frac{1}{2\pi}\int_{-\infty}^{\infty} dk\,e^{-ik(x+t)} + \frac{1}{2\pi}\int_{-\infty}^{\infty} dk\,[m_r(k,x) - 1]\,e^{-ik(x+t)}, \tag{5.14}
\]
which can be written as [cf. (4.1)]
\[
J_r(x,t) = \delta(x+t) + B_r(x,x+t). \tag{5.15}
\]
Note that B_r(x,x+t) = 0 if x + t < 0 due to, for each fixed x, the analyticity in C^+, the continuity in $\overline{\mathbb{C}^+}$, and the O(1/k)-behavior as k → ∞ in $\overline{\mathbb{C}^+}$ of m_r(k,x) − 1. Thus,
\[
J_r(x,t) = 0, \qquad x + t < 0. \tag{5.16}
\]

Comparing (4.8) and (5.15), when x < 0 and x + t > 0, we see that we can write B_r(x,x+t) as the ratio of two determinants and obtain [cf. (5.5)]
\[
J_r(x,t) = \frac{\det \Gamma_r(x,x+t)}{\det M_r(x)}, \qquad x \le 0,\ x + t > 0, \tag{5.17}
\]
where M_r(x) and Γ_r(x,α) are as in (4.7) and (4.9), respectively.

Next, we will obtain J_r(x,t) in the region with x + t > 0 and x − t > 0. Using (2.5) in (5.14), we get [cf. (5.6)]
\[
J_r(x,t) = \delta(x+t)
+ \frac{1}{2\pi}\int_{-\infty}^{\infty} dk\left[\frac{m_l(k,x)}{T(-k)} - 1\right] e^{ik(x+t)}
+ \frac{1}{2\pi}\int_{-\infty}^{\infty} dk\,\frac{R(k)}{T(k)}\,m_l(k,x)\,e^{ik(x-t)}. \tag{5.18}
\]

The zeros of T(−k) in C^+ contribute to the first integral on the right-hand side of (5.18). As in (5.6), we evaluate that integral as a contour integral along the boundary of C^+. The poles of R(k) in C^+ contribute to the second integral in (5.18). In the generic case, each of the two integrands in (5.18) has a simple pole at k = 0 because of the simple zero of T(k) there; the contribution from k = 0 in the two integrals in (5.18) can be evaluated as in (5.7), and we get
\[
i\,\theta(x)\,[\theta(t-x) - \theta(x+t)]\,m_l(0,x)\,\operatorname{Res}(1/T(k),\,0). \tag{5.19}
\]

Recalling that $\{k_{lj}\}$ is the set of poles of R(k) in C^+ and $\{z_j\}$ is the set of zeros of T(−k) in C^+, in the region with x + t > 0 and x − t > 0 we get [cf. (5.9)]
\[
J_r(x,t) = -i\,m_l(0,x)\,\operatorname{Res}(1/T(k),\,0)
+ i\sum_{j=1}^{n_z} \tau_j\,m_l(z_j,x)\,e^{iz_j(x+t)}
+ i\sum_{j=1}^{n_l} \frac{\rho_{lj}\,e^{ik_{lj}(x-t)}}{T(k_{lj})}\,m_l(k_{lj},x), \tag{5.20}
\]
where the first term on the right-hand side is the contribution from (5.19). Note that m_l(z_j,x) and m_l(k_{lj},x) can be evaluated explicitly from (3.11), and T(k_{lj}) from (2.14).

Finally, let us evaluate J_r(x,t) in the region with x > 0 and t − x > 0. From (5.19) we see that the contribution from the pole at k = 0 to J_r(x,t) is nil when x > 0 and t − x > 0. To obtain J_r(x,t) when x > 0 and t − x > 0, we can use (5.18) and evaluate the first integral there via a contour integration along the boundary of C^+ to get the first summation term in (5.20). To evaluate the second integral in (5.18), since t − x > 0, with the help of (2.9) we get
\[
\frac{1}{2\pi}\int_{-\infty}^{\infty} dk\,\frac{R(k)}{T(k)}\,m_l(k,x)\,e^{ik(x-t)} = -\frac{1}{2\pi}\int_{-\infty}^{\infty} dk\,\frac{L(k)}{T(k)}\,m_l(-k,x)\,e^{-ik(x-t)},
\]
where the right-hand side is to be evaluated as a contour integral along the boundary of C^+, with the contributions coming from the poles k_{rj} of L(k) in C^+ and the poles of m_l(−k,x) in C^+. From (3.6) or (3.10) we see that the latter poles occur at exactly k = k_{lj}, which are the poles of R(k) in C^+. If the sets $\{k_{lj}\}_{j=1}^{n_l}$ and $\{k_{rj}\}_{j=1}^{n_r}$ do not intersect, we get [cf. (5.11)]
\[
-\frac{1}{2\pi}\int_{-\infty}^{\infty} dk\,\frac{L(k)}{T(k)}\,m_l(-k,x)\,e^{-ik(x-t)}
= -i\sum_{j=1}^{n_r} \frac{\rho_{rj}\,e^{-ik_{rj}(x-t)}}{T(k_{rj})}\,m_l(-k_{rj},x)
- i\sum_{j=1}^{n_l} \frac{L(k_{lj})\,e^{-ik_{lj}(x-t)}}{T(k_{lj})}\,\operatorname{Res}(m_l(-k,x),\,k_{lj}). \tag{5.21}
\]
From (3.6) we see that
\[
\operatorname{Res}(m_l(-k,x),\,k_{lj}) = -\rho_{lj}\,e^{2ik_{lj}x}\,m_l(k_{lj},x). \tag{5.22}
\]
In case the sets $\{k_{lj}\}_{j=1}^{n_l}$ and $\{k_{rj}\}_{j=1}^{n_r}$ overlap, the result in (5.21) can appropriately be modified.

By combining (5.15)–(5.22), we can write the Jost wave J_r(x,t) for any x, t ∈ R as
\[
\begin{aligned}
J_r(x,t) = {}& \delta(x+t) + \theta(-x)\,\theta(x+t)\,\frac{\det \Gamma_r(x,x+t)}{\det M_r(x)}\\
&+ i\,\theta(x)\,\theta(x+t)\sum_{j=1}^{n_z} \tau_j\,m_l(z_j,x)\,e^{iz_j(x+t)}\\
&- i\,\theta(x+t)\,\theta(x-t)\,m_l(0,x)\,\operatorname{Res}(1/T(k),\,0)\\
&+ i\,\theta(x+t)\,\theta(x-t)\sum_{j=1}^{n_l} \frac{\rho_{lj}\,e^{ik_{lj}(x-t)}}{T(k_{lj})}\,m_l(k_{lj},x)\\
&- i\,\theta(x)\,\theta(t-x)\sum_{j=1}^{n_r} \frac{\rho_{rj}\,e^{-ik_{rj}(x-t)}}{T(k_{rj})}\,m_l(-k_{rj},x)\\
&+ i\,\theta(x)\,\theta(t-x)\sum_{j=1}^{n_l} \frac{\rho_{lj}\,L(k_{lj})\,e^{ik_{lj}(x+t)}}{T(k_{lj})}\,m_l(k_{lj},x).
\end{aligned}
\]
Thus, in terms of the quantities that have been constructed explicitly once the rational reflection coefficient R(k) is known, we have obtained J_r(x,t) for all x, t ∈ R except when x = −t and also when x = t with x > 0. In the next section we will see that J_r(x,t) may have jump discontinuities when x = −t and also when x = t with x > 0, and we will evaluate those discontinuities.

6. Discontinuities in the Jost waves

For each fixed t ∈ R, let us now analyze the discontinuities in the Jost waves J_l(x,t) and J_r(x,t) and in their x-derivatives. From (3.9) and (3.18), we see that for x > 0 the potential V is the ratio of linear combinations of various exponential functions of x, and that ratio decays exponentially as x → +∞. Similarly, for x < 0, from (4.7) and (4.10) we see that V is the ratio of linear combinations of various exponential functions, and that ratio decays exponentially as x → −∞. In the absence of bound states, it is known [3] that m_l(0,x) > 0 and m_r(0,x) > 0 for x ∈ R. Hence, from (3.20) and (4.12) we see that the only discontinuities in V and its derivatives can occur at x = 0.

From (7.5)–(7.7) of [18], as k → ∞ in C^+ we have
\[
m_l(k,x) = 1 - \frac{\gamma_l(x)}{2ik} - \frac{1}{8k^2}\big[\gamma_l(x)^2 - 2\,q_l(k,x)\big] + O(1/k^3), \tag{6.1}
\]
\[
m_r(k,x) = 1 - \frac{\gamma_r(x)}{2ik} - \frac{1}{8k^2}\big[\gamma_r(x)^2 + 2\,q_r(k,x)\big] + O(1/k^3), \tag{6.2}
\]
\[
m_l'(k,x) = \frac{\gamma_l(x)}{2ik} + O(1/k^2), \qquad m_r'(k,x) = \frac{\gamma_r(x)}{2ik} + O(1/k^2), \tag{6.3}
\]

with
\[
\gamma_l(x) := \int_x^{\infty} dy\,V(y), \qquad \gamma_r(x) := \int_{-\infty}^{x} dy\,V(y),
\]
\[
q_l(k,x) := V(x) + \theta(-x)\,\big[V(0^+) - V(0^-)\big]\,e^{-2ikx},
\]
\[
q_r(k,x) := -V(x) + \theta(x)\,\big[V(0^+) - V(0^-)\big]\,e^{2ikx}.
\]
Note that q_l(k,x) and q_r(k,x) are continuous at x = 0, and we have
\[
q_l(k,0) = V(0^+), \qquad q_r(k,0) = -V(0^-).
\]

With the help of (5.2) and (5.3), let us define
\[
U_l(x,t) := J_l(x,t) - \delta(x-t). \tag{6.4}
\]
We will refer to U_l(x,t) as the tail of the Jost wave J_l(x,t). From (5.2) we see that the discontinuity in the tail U_l(x,t) is caused by the (1/k)-term in the expansion of m_l(k,x) − 1 as k → ∞ in C^+. By using
\[
\frac{1}{2\pi i}\int_{-\infty}^{\infty} dk\,\frac{e^{ik\xi}}{k + i0^+} = -\theta(-\xi), \tag{6.5}
\]

we evaluate that contribution as (1/2) γ_l(x) θ(t−x), and hence the only discontinuity in the tail U_l(x,t) occurs at the wavefront x = t and is given by
\[
U_l(t+0^+,t) - U_l(t-0^+,t) = -\frac{1}{2}\int_t^{\infty} dy\,V(y).
\]
In other words,
\[
U_l(x,t) = \begin{cases} 0, & x > t,\\[2mm] \dfrac{1}{2}\displaystyle\int_t^{\infty} dy\,V(y), & x = t - 0^+.\end{cases}
\]

Next, let us analyze the discontinuities in ∂U_l(x,t)/∂x. From (5.2) and (6.4), we see that
\[
\frac{\partial U_l(x,t)}{\partial x} = \frac{1}{2\pi}\int_{-\infty}^{\infty} dk\,\big\{ m_l'(k,x) + ik\,[m_l(k,x) - 1]\big\}\,e^{ik(x-t)}, \tag{6.6}
\]

and for each fixed t ∈ R, the discontinuity in ∂U_l(x,t)/∂x is caused by the (1/k)-term in the expansion of the integrand of (6.6) as k → ∞ in C^+. Using (6.1), (6.3), and (6.5) in (6.6), we see that there are exactly two such discontinuities. The first discontinuity occurs at the wavefront x = t, and the second occurs at x = −t. At the wavefront, we get
\[
\frac{\partial U_l(x,t)}{\partial x} = \begin{cases} 0, & x > t,\\[2mm] \dfrac{V(x)}{4} - \dfrac{1}{2}\displaystyle\int_t^{\infty} dy\,V(y) - \dfrac{1}{8}\left[\displaystyle\int_t^{\infty} dy\,V(y)\right]^2, & x = t - 0^+.\end{cases}
\]
The contribution to the discontinuity at x = −t is obtained as
\[
\frac{\partial U_l(-t+0^+,t)}{\partial x} - \frac{\partial U_l(-t-0^+,t)}{\partial x} = -\frac{1}{4}\,\theta(-x)\,\big[V(0^+) - V(0^-)\big].
\]

In analogy to (6.4), with the help of (5.14) let us define
\[
U_r(x,t) := J_r(x,t) - \delta(x+t). \tag{6.7}
\]
We will refer to U_r(x,t) as the tail of the Jost wave J_r(x,t). Using (6.2) and (6.5) in (6.7), we see that for each fixed t ∈ R, the only discontinuity in U_r(x,t) occurs at the wavefront x = −t and is described by
\[
U_r(x,t) = \begin{cases} 0, & x < -t,\\[2mm] \dfrac{1}{2}\displaystyle\int_{-\infty}^{t} dy\,V(y), & x = -t + 0^+.\end{cases}
\]

To determine the discontinuities in ∂U_r(x,t)/∂x, first from (5.14) we obtain
\[
\frac{\partial U_r(x,t)}{\partial x} = \frac{1}{2\pi}\int_{-\infty}^{\infty} dk\,\big\{ m_r'(k,x) - ik\,[m_r(k,x) - 1]\big\}\,e^{-ik(x+t)}. \tag{6.8}
\]

Using (6.2), (6.3), and (6.5) in (6.8), for each fixed t ∈ R, we see that the discontinuities in ∂U_r(x,t)/∂x may occur only at x = −t and at x = t. The former occurs at the wavefront and is described by
\[
\frac{\partial U_r(x,t)}{\partial x} = \begin{cases} 0, & x < -t,\\[2mm] -\dfrac{V(x)}{4} - \dfrac{1}{2}\displaystyle\int_{-\infty}^{t} dy\,V(y) + \dfrac{1}{8}\left[\displaystyle\int_{-\infty}^{t} dy\,V(y)\right]^2, & x = -t + 0^+.\end{cases}
\]

Finally, the discontinuity at x = t is given by
\[
\frac{\partial U_r(t+0^+,t)}{\partial x} - \frac{\partial U_r(t-0^+,t)}{\partial x} = \frac{1}{4}\,\theta(x)\,\big[V(0^+) - V(0^-)\big].
\]

7. Numerical implementation

One of the authors (Borkowski) has implemented the theoretical method described in this paper as a Mathematica 4.2 notebook. The user inputs a rational function for R(k) and instructs Mathematica to evaluate the notebook. Mathematica then calculates all of the quantities relevant to (1.1); namely, the Faddeev functions m_l(k,x) and m_r(k,x), the Jost solutions f_l(k,x) and f_r(k,x), the potential V(x), the scattering coefficients T(k) and L(k), and the quantities B_l(x,α) and B_r(x,α) given in (3.1) and (4.1), respectively.

The implemented program first reduces R(k), then calculates T(k) and L(k) as described in Section 2, and then reduces those to cancel common factors appearing both in the numerator and in the denominator. The reduction in each scattering coefficient is achieved by computing all the zeros and poles in C^+, comparing them within a chosen numerical precision, and cancelling the common factors appearing in the numerator and in the denominator. This reduction is necessary because Mathematica cannot usually cancel the terms by itself or cannot simplify enough in certain circumstances. Finally, the Faddeev functions and the Jost solutions, along with the potential, are determined for x > 0, and then for x < 0.
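The same pipeline can be sketched outside Mathematica. The helper below is an illustrative Python sketch, not the authors' notebook; it extracts the simple poles of a rational function and their residues in C^+, which is the input needed by the Section 3 and Section 4 constructions (for a simple pole k0 of p/q the residue is p(k0)/q'(k0)).

```python
import numpy as np

def poles_and_residues_upper(p_coeffs, q_coeffs, tol=1e-8):
    """Simple poles of p(k)/q(k) in C^+ and the corresponding residues.
    Assumes the poles in C^+ are simple, as in the paper; the residue at a
    simple pole k0 of p/q equals p(k0)/q'(k0)."""
    p, q = np.poly1d(p_coeffs), np.poly1d(q_coeffs)
    dq = q.deriv()
    poles = [r for r in q.roots if r.imag > tol]
    residues = [p(r) / dq(r) for r in poles]
    return poles, residues

# Sketch of the overall flow: starting from R = p/q one would
#   1. build T and L as in the Section 2 sketch (scattering_from_R),
#   2. feed the poles/residues of R in C^+ to the Section 3 sketch to obtain
#      m_l and V on the positive half-line,
#   3. repeat step 2 with the poles/residues of L in C^+ (Section 4) to obtain
#      m_r and V on the negative half-line.
```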

Our program has been able to duplicate the numerical results in [9]. However, we have not been able to duplicate the two numerical examples in [7] given by Dolveck-Guilpart. We have also verified that the result of our program agrees with the analytical example of Sabatier [14]. Prof. Paul Sacks of Iowa State University has used his Matlab program for the solution of the inverse scattering problem, based on transforming the relevant inverse problem into an equivalent time-domain problem and solving it by a time-domain method [19]; he also was able to duplicate the results in [9], but not those in [7], and he has confirmed to us that our results are in complete agreement with his as far as the two examples in [7] are concerned. Prof. Sabatier has later informed us that the two numerical examples given in [7] by Dolveck-Guilpart were indeed incorrect, and that the rational reflection coefficients used in those two examples were outside the domain of applicability of the method of [14].

References

[1] T. Aktosun and M. Klaus, Chapter 2.2.4, Inverse theory: problem on the line, in: E.R. Pike and P.C. Sabatier (eds.), Scattering, Academic Press, London, 2001, pp. 770–785.

[2] K. Chadan and P.C. Sabatier, Inverse problems in quantum scattering theory, 2nd ed., Springer, New York, 1989.

[3] P. Deift and E. Trubowitz, Inverse scattering on the line, Comm. Pure Appl. Math. 32 (1979), 121–251.

[4] L.D. Faddeev, Properties of the S-matrix of the one-dimensional Schrödinger equation, Am. Math. Soc. Transl. (ser. 2) 65 (1967), 139–166.

[5] V.A. Marchenko, Sturm-Liouville operators and applications, Birkhäuser, Basel, 1986.

[6] T. Aktosun, M. Klaus, and C. van der Mee, Explicit Wiener-Hopf factorization for certain nonrational matrix functions, Integral Equations Operator Theory 15 (1992), 879–900.

[7] B. Dolveck-Guilpart, Practical construction of potentials corresponding to exact rational reflection coefficients, in: P.C. Sabatier (ed.), Some topics on inverse problems, World Sci. Publ., Singapore, 1988, pp. 341–368.

[8] I. Kay, The inverse scattering problem when the reflection coefficient is a rational function, Comm. Pure Appl. Math. 13 (1960), 371–393.

[9] K.R. Pechenick and J.M. Cohen, Inverse scattering – exact solution of the Gel'fand-Levitan equation, J. Math. Phys. 22 (1981), 1513–1516.

[10] K.R. Pechenick and J.M. Cohen, Exact solutions to the valley problem in inverse scattering, J. Math. Phys. 24 (1983), 406–409.

[11] R.T. Prosser, On the solutions of the Gel'fand-Levitan equation, J. Math. Phys. 25 (1984), 1924–1929.

[12] P.C. Sabatier, Rational reflection coefficients in one-dimensional inverse scattering and applications, in: J.B. Bednar et al. (eds.), Conference on inverse scattering: theory and application, SIAM, Philadelphia, 1983, pp. 75–99.

[13] P.C. Sabatier, Rational reflection coefficients and inverse scattering on the line, Nuovo Cimento B 78 (1983), 235–248.

[14] P.C. Sabatier, Critical analysis of the mathematical methods used in electromagnetic inverse theories: a quest for new routes in the space of parameters, in: W.M. Boerner et al. (eds.), Inverse methods in electromagnetic imaging, Reidel Publ., Dordrecht, Netherlands, 1985, pp. 43–64.

[15] D. Alpay and I. Gohberg, Inverse problem for Sturm-Liouville operators with rational reflection coefficient, Integral Equations Operator Theory 30 (1998), 317–325.

[16] C. van der Mee, Exact solution of the Marchenko equation relevant to inverse scattering on the line, in: V.M. Adamyan et al. (eds.), Differential operators and related topics, Vol. I, Birkhäuser, Basel, 2000, pp. 239–259.

[17] R. Courant and D. Hilbert, Methods of mathematical physics, Vol. I, Interscience Publ., New York, 1953.

[18] T. Aktosun and J.H. Rose, Wave focusing on the line, J. Math. Phys. 43 (2002), 3717–3745.

[19] P.E. Sacks, Reconstruction of steplike potentials, Wave Motion 18 (1993), 21–30.

Acknowledgment

We have benefited from discussions with Profs. Pierre C. Sabatier, Paul Sacks, and Robert C. Smith.

Tuncay Aktosun, Michael H. Borkowski, Alyssa J. Cramer and Lance C. Pittman
Department of Mathematics and Statistics
Mississippi State University
Mississippi State, MS 39762, USA
e-mail: [email protected]

Operator Theory: Advances and Applications, Vol. 160, 21–39
© 2005 Birkhäuser Verlag Basel/Switzerland

Aluthge Transforms and the Convex Hull of the Spectrum of a Hilbert Space Operator

Tsuyoshi Ando

Dedicated to Professor Israel Gohberg on the occasion of his 75th birthday

Abstract. For a bounded linear operator T on a Hilbert space, its Aluthge transform ∆(T) is defined as ∆(T) = |T|^{1/2} U |T|^{1/2} with the help of a polar representation T = U|T|. In recent years the usefulness of the Aluthge transform has been shown in several directions. In this paper we will use the Aluthge transform to study when the closure of the numerical range W(T) of T coincides with the convex hull of its spectrum. In fact, we will prove that this is the case if and only if the closure of W(T) coincides with that of W(∆(T)). As a consequence we will show also that for any operator T the convex hull of its spectrum can be written as the intersection of the closures of the numerical ranges of all iterated Aluthge transforms ∆^n(T).

Mathematics Subject Classification (2000). Primary 47A12; Secondary 47A10.

Keywords. Aluthge transform; numerical range; convex hull of spectrum; convexoid operator; norm inequalities.

1. Introduction

Among familiar quantities related to a (bounded linear) operator T on a Hilbert space are the (operator) norm ‖T‖ and the spectral radius r(T). Also among familiar sets related to T are the spectrum σ(T) and the numerical range W(T), defined as
\[
W(T) \stackrel{\mathrm{def}}{=} \{\langle Tx, x\rangle : \|x\| = 1\}.
\]
The quantity
\[
w(T) \stackrel{\mathrm{def}}{=} \sup\{|\langle Tx, x\rangle| : \|x\| = 1\}
\]
is called the numerical radius. Obviously
\[
\|T\| \ge w(T) \ge r(T).
\]

Consider a polar representation T = U|T|, where |T| is the positive semi-definite square root of T*T and U is a partial isometry with U*U|T| = |T|. Aluthge [2] assigned to T an operator T̃ defined by
\[
\widetilde{T} \equiv |T|^{1/2}\,U\,|T|^{1/2}.
\]
It is easy to see that T̃ does not depend on the choice of a partial isometry U with U*U|T| = |T|. We will call the correspondence T → T̃ the Aluthge transform and use the notation
\[
T \longrightarrow \Delta(T) \equiv |T|^{1/2}\,U\,|T|^{1/2}. \tag{1.1}
\]
With ∆^0(T) ≡ T, the nth iterate of the Aluthge transform will be denoted by ∆^n(T):
\[
\Delta^n(T) \stackrel{\mathrm{def}}{=} \Delta^{n-1}\big(\Delta(T)\big) \qquad (n = 1, 2, \dots).
\]
If T = U|T| is normal, U can be chosen to commute with |T| (hence with |T|^{1/2}), so that T = ∆(T). But this relation does not characterize normality.
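For a finite matrix the Aluthge transform can be computed directly from an SVD-based polar decomposition. The short Python sketch below is only a finite-dimensional illustration of definition (1.1), not part of this paper; it uses the unitary polar factor, which is harmless since ∆(T) does not depend on the choice of U.

```python
import numpy as np

def aluthge(T):
    """Delta(T) = |T|^{1/2} U |T|^{1/2} for a square matrix T, with T = U|T|
    obtained from an SVD (U is taken unitary; the transform is independent
    of this choice)."""
    W, s, Vh = np.linalg.svd(T)
    U = W @ Vh                                          # polar factor
    sqrt_absT = Vh.conj().T @ np.diag(np.sqrt(s)) @ Vh  # |T|^{1/2}
    return sqrt_absT @ U @ sqrt_absT

def iterated_aluthge(T, n):
    """The n-th iterate Delta^n(T)."""
    for _ in range(n):
        T = aluthge(T)
    return T
```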

Since ‖T‖ = ‖|T|‖ = ‖|T|^{1/2}‖^2, the following inequality is obvious:
\[
\|T\| \ge \|\Delta(T)\|. \tag{1.2}
\]

According to the Toeplitz-Hausdorff theorem (see [8] p. 113), W(T) is a convex set of the complex plane. Using the representation theorem of a closed convex set (of the complex plane) as the intersection of all closed half-planes containing it, Hildebrandt [9] proved
\[
\overline{W(T)} = \bigcap_{\zeta \in \mathbb{C}} \{\xi : |\xi - \zeta| \le \|T - \zeta I\|\}. \tag{1.3}
\]
As a corollary we have
\[
\overline{W(T)} = \bigcap_{\zeta \in \mathbb{C}} \{\xi : |\xi - \zeta| \le w(T - \zeta I)\}. \tag{1.4}
\]

In Section 2, generalizing (1.2), we will establish the inequality (Theorem 2)
\[
\|T - \zeta I\| \ge \|\Delta(T) - \zeta I\| \qquad (\zeta \in \mathbb{C}),
\]
which proves via (1.3) a known result (Corollary 3):
\[
\overline{W(T)} \supset \overline{W\big(\Delta(T)\big)}. \tag{1.5}
\]

It is well known (see [8] p. 43) that for any pair of operators A, B
\[
\sigma(AB) \cup \{0\} = \sigma(BA) \cup \{0\}.
\]
Since 0 ∈ σ(T) ⟺ 0 ∈ σ(|T|^{1/2}) ∪ σ(U), we can see that 0 ∈ σ(T) is equivalent to 0 ∈ σ(∆(T)). Therefore by (1.1) we arrive at the following known relation:
\[
\sigma(T) = \sigma\big(\Delta(T)\big). \tag{1.6}
\]

It is well known (see [8] p. 115) that σ(T) is contained in the closure $\overline{W(T)}$, so that by convexity of W(T) we have
\[
\overline{W(T)} \supset \operatorname{conv}\big(\sigma(T)\big),
\]
where conv(·) denotes the convex hull. When combined with (1.5) and (1.6), this yields
\[
\overline{W(T)} \supset \overline{W\big(\Delta(T)\big)} \supset \overline{W\big(\Delta^2(T)\big)} \supset \cdots \supset \operatorname{conv}\big(\sigma(T)\big). \tag{1.7}
\]

As a consequence of the spectral representation, it is known (see [8] p. 116) that if T is normal, the convex hull of σ(T) coincides with the closure of W(T). But the converse is not true in general.

Let us call an operator T convexoid if
\[
\overline{W(T)} = \operatorname{conv}\big(\sigma(T)\big).
\]
It follows from (1.7) that if T is convexoid then
\[
(*) \qquad \overline{W(T)} = \overline{W\big(\Delta(T)\big)}.
\]
In Section 3 we will prove (Theorem 6) that the condition (∗) completely characterizes convexoidity. Then by (1.4) we can show (Corollary 7) that T is convexoid if and only if
\[
w\big(\Delta(T) - \zeta I\big) = w(T - \zeta I) \qquad (\zeta \in \mathbb{C}).
\]
In Section 4 we will establish a representation result (Theorem 9) for the convex hull of σ(T):
\[
\operatorname{conv}\big(\sigma(T)\big) = \bigcap_{n=1}^{\infty} \overline{W\big(\Delta^n(T)\big)},
\]
and derive, as a related result, a known result (Theorem 10):
\[
\lim_{n \to \infty} \|\Delta^n(T)\| = r(T).
\]
Theorem 6 and Theorem 9 for a (finite-dimensional) matrix were established in an earlier paper [3].
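For matrices, the shrinking chain (1.7) and the limit in Theorem 10 are easy to observe numerically. The snippet below is an illustrative check reusing the aluthge sketch above; the example matrix and the angle-discretized numerical radius are assumptions, not from the paper. It prints ‖∆^n(T)‖ and w(∆^n(T)) for a non-normal 2×2 matrix with r(T) = 2.

```python
import numpy as np
# reuses aluthge() and iterated_aluthge() from the earlier sketch

def numerical_radius(T, n_angles=720):
    """w(T) = max over theta of the largest eigenvalue of Re(e^{-i theta} T),
    discretized over angles (a standard finite-dimensional formula)."""
    w = 0.0
    for theta in np.linspace(0.0, 2.0 * np.pi, n_angles, endpoint=False):
        A = np.exp(-1j * theta) * T
        w = max(w, np.linalg.eigvalsh(0.5 * (A + A.conj().T)).max())
    return w

T = np.array([[1.0, 1.0], [0.0, 2.0]], dtype=complex)   # r(T) = 2, T not normal
for n in range(8):
    Tn = iterated_aluthge(T, n)
    print(n, np.linalg.norm(Tn, 2), numerical_radius(Tn))
# Both columns are expected to decrease toward r(T) = 2, consistent with (1.7)
# and with Theorems 9 and 10.
```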

2. Norm inequalities

Let T be an operator on a Hilbert space H with polar representation T = U|T|. For notational simplicity, write P ≡ |T|, so that
\[
T = UP \quad\text{and}\quad \Delta(T) = P^{1/2}\,U\,P^{1/2}.
\]
When T is not invertible, the polar part U may not be unitary. But we can reduce many problems to the case with unitary U by the following lemma.

Lemma 1. Suppose that T is a bounded linear operator on a Hilbert space H with polar representation T = UP. When T is not invertible, define an operator T̃ on the direct sum H ⊕ H as T̃ = ŨP̃, where
\[
\widetilde{U} \stackrel{\mathrm{def}}{=} \begin{bmatrix} U & (I - UU^*)^{1/2}\\[1mm] (I - U^*U)^{1/2} & -U^* \end{bmatrix}
\quad\text{and}\quad
\widetilde{P} \stackrel{\mathrm{def}}{=} \begin{bmatrix} P & 0\\ 0 & 0 \end{bmatrix}.
\]
Then Ũ is unitary and P̃ ≥ 0, and T̃ satisfies the following properties:

(a) σ(T̃) = σ(T) and σ(∆(T̃)) = σ(∆(T)),

(b) W(T) ⊂ W(T̃) ⊂ $\overline{W(T)}$ and W(∆(T)) ⊂ W(∆(T̃)) ⊂ $\overline{W(\Delta(T))}$,

(c) ‖T̃ − ζĨ‖ = ‖T − ζI‖ and ‖∆(T̃) − ζĨ‖ = ‖∆(T) − ζI‖ (ζ ∈ C),

(d) ‖(T̃ − ζĨ)^{-1}‖ = ‖(T − ζI)^{-1}‖ and ‖(∆(T̃) − ζĨ)^{-1}‖ = ‖(∆(T) − ζI)^{-1}‖ (ζ ∉ σ(T)),

where Ĩ is the identity operator on H ⊕ H. In particular, if W(T) (resp. W(∆(T))) is closed, so is W(T̃) (resp. W(∆(T̃))).

Proof. The operator Ũ is the so-called unitary dilation of U. Since U*UP = P by definition, we have
\[
\widetilde{T} = \begin{bmatrix} UP & 0\\ 0 & 0 \end{bmatrix} = \begin{bmatrix} T & 0\\ 0 & 0 \end{bmatrix}
\quad\text{and}\quad
\Delta(\widetilde{T}) = \begin{bmatrix} \Delta(T) & 0\\ 0 & 0 \end{bmatrix}. \tag{2.1}
\]

By (2.1) we have

σ(T ) = σ(T ) ∪ 0 and σ(∆(T )

)= σ

(∆(T )

)∪ 0.

But since neither T or ∆(T ) is invertible, by (1.6) we have

0 ∈ σ(T ) = σ(∆(T )

), (2.2)

and we can conclude (a).Again by (2.1) we can see by definition of numerical range

W (T ) = conv(W (T ), 0

)and W

(∆(T )

)= conv

(W(∆(T )

), 0).

On the other hand, since by (2.2)

0 ∈ σ(T ) ⊂W (T ) and 0 ∈ σ(∆(T )

)⊂W

(∆(T )

),

we can conclude (b) by convexity of W (T ) and W(∆(T )

).


Again it follows from (2.1) that

||T − ζI || = max||T − ζI||, |ζ|

and ||∆(T )− ζI || = max

||∆(T )− ζI||, |ζ|

.

Therefore for (c) it suffices to prove that

||T − ζI|| ≥ |ζ| and ||∆(T )− ζI|| ≥ |ζ| (ζ ∈ C). (2.3)

By (2.2) there is a sequence of unit vectors xn (n = 1, 2, . . .) such that

limn→∞Txn = 0 or lim

n→∞T ∗xn = 0. (2.4)

In the former case of (2.4), we have

||T − ζI|| ≥ limn→∞ ||(T − ζI)xn|| = |ζ|

while in the latter case

||T − ζI|| = ||T ∗ − ζI|| ≥ limn→∞ ||(T ∗ − ζI)xn|| = |ζ|,

which proves the first part of (2.3). By (2.2) the same arguments prove the second part of (2.3). These establish (c). The proof of (d) is quite similar and is omitted.

Theorem 2. Let T be a Hilbert space operator with polar representation T = UP. Then

||T − ζI|| ≥ ||∆(T )− ζI|| ( ζ ∈ C ), (2.5)

and

||(T − ζI)−1|| ≥ ||(∆(T )− ζI

)−1

|| ( ζ ∈ σ(T ) ). (2.6)

Proof. If U is not unitary, consider T in Lemma 1. Therefore we may assume thatU is unitary. For ε > 0 define

Tεdef= U(P + εI).

Then all Tε are invertible, and

limε↓0

||Tε − ζI|| = ||T − ζI|| (ζ ∈ C),

andlimε↓0

||∆(Tε)− ζI|| = ||∆(T )− ζI|| (ζ ∈ C).

In a similar way

limε↓0

||(Tε − ζI)−1|| = ||(T − ζI)−1||(ζ ∈ σ(T )

),

and

limε↓0

||(∆(Tε)− ζI

)−1

|| = ||(∆(T )− ζI

)−1

||(ζ ∈ σ(T )

).

Therefore to prove (2.5) and (2.6) we may assume further that T is invertible.


First we claim a general assertion that if an operator S commutes with T =UP then

() ||P 12SUP

12 || ≤ ||SUP ||.

In fact, since for a selfadjoint operator the norm coincides with the spectralradius, we can see

||P 12SUP

12 ||2 = ||P 1

2SUPU∗S∗P12 ||

= r(P12SUPU∗S∗P

12 ) = r(PSUPU∗S∗)

≤ ||U∗ · UPS · U · (SUP )∗|| ≤ ||UPS|| · ||(SUP )∗||= ||SUP ||2,

proving ().Now, to see (2.5), take in ()

S ≡ (T − ζI)T−1 = (UP − ζI)(UP )−1.

Then since

P12SUP

12 = P

12 (UP − ζI)P−1U∗UP

12

= (P12UP

12 − ζI)P− 1

2U∗UP12 = ∆(T )− ζI

andSUP = T − ζI,

it follows from () that

||∆(T )− ζI|| ≤ ||T − ζI||,

proving (2.5).To see (2.6), take

S ≡ (T − ζI)−1T−1 = (UP − ζI)−1(UP )−1.

Then since

P12SUP

12 = P

12

(UP − ζI

)−1

P−1U∗UP12

= (P12UP

12 − ζI)−1P− 1

2U∗UP12 =

(∆(T )− ζI

)−1

andSUP = (T − ζI)−1,

it follows from () that

||(∆(T )− ζI

)−1

|| ≤ ||(T − ζI)−1||,

proving (2.6). This completes the proof.


It should be mentioned that in a recent paper [7] (see also [10]) the general inequalities

||f(∆(T))|| ≤ ||f(T)||    (f(ζ) analytic on σ(T))

were established.

Corollary 3. (Yamazaki [14], Wu [13]) For a Hilbert space operator T and its Aluthge transform ∆(T) the following holds:

\overline{W(T)} ⊃ \overline{W(∆(T))}.

Proof. According to the formula (1.3) it follows from Theorem 2

W (T ) =⋂ζ∈C

ξ : |ξ − ζ| ≤ ||T − ζI||

⊃⋂ζ∈C

ξ : |ξ − ζ| ≤ ||∆(T )− ζI|| = W(∆(T )

).

This completes the proof.

3. Convexoid operators

Let T be an operator on a Hilbert space H with polar representation T = UP ,and let ∆(T ) ≡ P

12UP

12 be its Aluthge transform.

The numerical range W (T ) is not closed in general. But for the problemsunder discussion we may assume the closedness of W (T ) as seen in Lemma 4 below.For this, let us start with a general setup following the idea of Berberian [4].

Consider the Banach space l∞ of bounded complex sequences (ζn). Remarkthat the indexing is from 0 to ∞. For a sequence

(ζn) = (ζ0, ζ1, ζ2, . . .)

the (backward) shifted sequence (ζn+1) is defined as

(ζn+1) ≡ (ζ1, ζ2, ζ3, . . .).

We consider a Banach limit Lim (ζn) (see [6] pp. 84–86) as a unital positive linearfunctional on the commutative C∗-algebra l∞ (see [6], p. 256). Here unitalnessmeans

Lim(ζn) = 1 when ζn = 1 ∀ n

while positivity means

Lim(ζn) ≥ 0 whenever ζn ≥ 0 ∀ n.

The characteristic property of a Banach limit is

(shift invariance) Lim (ζn) = Lim (ζn+1).


By linearity, positivity, unitalness and shift invariance we can see that

liminf_{n→∞} |ζ_n| ≤ |Lim (ζ_n)| ≤ limsup_{n→∞} |ζ_n|    (3.1)

and

Lim (ζ_n) = lim_{n→∞} ζ_n  (if ζ_n converges).    (3.2)

Next consider the space l∞(H) of bounded sequences x = (xn) of a Hilbertspace H, and define a sesquilinear form

〈x,y〉 def= Lim(〈xn, yn〉

)and the corresponding Hilbertian seminorm

||x|| def=√〈x,x〉 =

√Lim (||xn||2).

l∞(H) becomes a pre-Hilbert space. The associated Hilbert space, that is, thecompletion of l∞(H)/x : ||x|| = 0 will be called the ultra-sum (based on theBanach limit) of H and denoted by K.

The mapx −→ x = (xn) (with xn = x ∀ n)

gives a canonical isometric embedding of H into K. This embedding will be simplywritten as

x = (x).

When (An) is a bounded sequence of operators on H, define a linear map Aon l∞(H) by

Ax def= (Anxn) for x = (xn).The operator A can be canonically lifted to an operator on K, which will bedenoted by the same A. We will denote this relation by

A = (An).

It is immediate from definition that for A = (An), B = (Bn) and α, β ∈ C

A ·B = (AnBn), αA + βB = (αAn + βBn) and A∗ = (A∗n). (3.3)

Therefore there is a canonical C∗-embedding of B(H) into B(K):

A −→ A = (An) (with An = A ∀ n).

This A will be written asA = (A).

Just as in [4] we can prove the following Lemma.

Lemma 4. For any operator T on a Hilbert space H the operator \mathbf{T} = (T) on the ultra-sum K of H possesses the following properties:

σ(\mathbf{T}) = σ(T),

and

W(\mathbf{T}) = \overline{W(T)}  and  W(∆(\mathbf{T})) = \overline{W(∆(T))}.


Though the proof of Lemma 4 does not use shift-invariance of the Banach limit, the proof of the following lemma, which gives generalizations of (3.1) and (3.2), is based on the shift invariance.

Lemma 5. For any bounded sequence (A_n) of operators on a Hilbert space H the norm of the operator \mathbf{A} = (A_n) on the ultra-sum K satisfies the following inequalities:

liminf_{n→∞} ||A_n|| ≤ ||\mathbf{A}|| ≤ limsup_{n→∞} ||A_n||,

so that

||\mathbf{A}|| = lim_{n→∞} ||A_n||  (if ||A_n|| converges).

Proof. For any bounded sequence x = (xn) in H and k = 1, 2, . . ., by shift invari-ance and positivity of the Banach limit

||Ax|| = Lim ||An+kxn+k|| ≤ Lim ||An+k|| · ||xn+k||≤ sup

n||An+k|| · Lim ||xn+k|| = sup

n||An+k||||x||,

which implies||Ax|| ≤ lim

n→∞||An|| · ||x||,

so that||A|| ≤ lim

n→∞||An||.

Conversely, for any ε > 0 and n = 0, 1, 2, . . . take xn ∈ H such that

||xn|| = 1 and ||Anxn|| ≥ ||An|| − ε.

Then again by shift invariance and positivity we have for any k = 1, 2, . . .

||Ax|| = Lim ||(An+kxn+k)|| ≥ Lim (||An+k|| − ε)≥ inf

n||An+k|| − ε,

which implies||A|| ≥ ||Ax|| ≥ lim

n→∞||An|| − ε.

Since ε > 0 is arbitrary, we can conclude

||A|| ≥ limn→∞

||An||.

This completes the proof.

Recall that an operator T is said to be convexoid when

\overline{W(T)} = conv(σ(T)).


Theorem 6. A Hilbert space operator T is convexoid if and only if

\overline{W(T)} = \overline{W(∆(T))}.

Proof. The “only if” part was already pointed out in Introduction. To prove the“if” part, considering T in Lemma 4 and further T in Lemma 1 if necessary, wemay assume that W (T ) and W

(∆(T )

)are closed and T = UP with unitary U .

There fore we have to prove that if W (T ) is closed and U is unitary then

W (T ) = W(∆(T )

)=⇒ W (T ) = conv

(σ(T )

).

Since W (T ) is a compact convex set containing conv(σ(T )

), by the separa-

tion theorem for a compact convex set (see [12] p. 40), with the help of a rotationof the complex plane (around the origin) if necessary, for the proof it suffices toshow, under the condition

(∗) W (T ) = W (T ) = W(∆(T )

),

that for any τ ∈ R

() W (T ) ⊂ ζ : Re(ζ) ≤ τ and W (T ) ∩ ζ : Re(ζ) = τ = ∅

=⇒ σ(T ) ∩ ζ : Re(ζ) = τ = ∅.

When τ = 0 and T is not invertible, we have obviously

0 ∈ σ(T ) ∩ ζ : Re(ζ) = τ = ∅.Therefore in the following we have to consider two cases; the first is the case ofτ = 0 with invertible T and the second is the case of τ = 0.

Write P ≡ |T | for simplicity, so that

T = UP and ∆(T ) = P12UP

12 . (3.4)

Denote by Q the orthoprojection onto the closure of the range of P . Then accordingto the decomposition I = Q⊕ (I −Q), write the positive semi-definite operator Pas P = P1 ⊕ 0 where P1 is a positive definite operator on ran(Q), the range of Q.

Notice that if τ < 0 the (closed) numerical range W (T ) is contained in theopen left half-plane and hence T is invertible, so that Q = I.

Since U is unitary, the first of the assumptions in () implies

Re(UP ) ≤ τI and Re(PU) = U∗ ·Re(UP ) · U ≤ τI, (3.5)

where Re(·) denotes the selfadjoint part: Re(T ) = 12 (T + T ∗).

Now let us consider an operator-valued (strongly) continuous function Φ(ζ)on the strip ζ : |Re(ζ)| ≤ 1

2 defined by

Φ(ζ) def= P12−ζUP

12+ζ with P ζ def= exp(ζ logP1)⊕ 0. (3.6)


Take the resolution of the identity E(λ) (0 ≤ λ <∞) for the operator P1 (see [1]p. 249). Then

P1 =∫ ∞

0

λdE(λ) (3.7)

and

P ζ1 = exp(ζ logP1) =

∫ ∞

0

λζdE(λ). (3.8)

Notice that by definition (3.6) we have

Φ(12 ) = QUP = QT, Φ(0) = ∆(T ) and Φ(− 1

2 ) = PUQ. (3.9)

Now ζ −→ Φ(ζ) is analytic in the interior of the strip, and on the boundary± 1

2 + it : t ∈ R we have

〈Φ(12 + it)x, x〉 = 〈UP (P itx), P itx〉 (3.10)

and〈Φ(− 1

2 + it)x, x〉 = 〈PU(P itx), P itx〉. (3.11)

Since||P itx|| ≤ ||x|| (with equality if τ < 0),

we can conclude from (3.5)

Re(Φ(ζ)

)≤ τI (|Re(ζ)| = 1

2 ). (3.12)

Then it follows from (3.12) via a variant of the three lines theorem in complexanalysis (see [5] Chap. VI, Sect. 3) that

Re(Φ(ζ)

)≤ τI (|Re(ζ)| ≤ 1

2 ). (3.13)

Since by (∗) the second of the assumptions in () implies

ζ : Re(ζ) = τ ∩W(Φ(0)

)= ζ : Re(ζ) = τ ∩W

(∆(T )

)= ∅,

there exists a vector u such that

||u|| = 1 and 〈Re(Φ(0)

)u, u〉 = τ. (3.14)

Fix this u and consider the numerical analytic function

ϕ(ζ) def= 〈Φ(ζ)u, u〉 (|Re(ζ)| ≤ 12 ).

Since by (3.13) and (3.14)

Re(ϕ(ζ)

)≤ τ (|Re(ζ)| ≤ 1

2 ) and Re(ϕ(0)

)= τ,

it follows from a variant of the maximum principle for an analytic function ([5] p.128) that

Re(ϕ(ζ)

)= τ (|Re(ζ)| ≤ 1

2 ). (3.15)


Then from (3.13) and (3.15) by an inequality of Cauchy–Schwarz type we can see(see [8] p. 75) [

τI − Re(Φ(ζ)

)]u = 0 (|Re(ζ)| ≤ 1

2 ).

In particular, [τI − Re

(Φ(0)

)]u = 0.

These considerations show that

M def= ker[τI − Re

(Φ(0)

)](3.16)

is a non-trivial closed subspace and[τI − Re

(Φ(ζ)

)]x = 0 (x ∈M, (|Re(ζ)| ≤ 1

2 ). (3.17)

Now we are in position to prove () under the condition M = ∅. First letus show that M is included in ran(Q). This is immediate when T is invertible,because then P is invertible with Q = I so that ran(Q) is the whole space. Asmentioned in the beginning of the proof of (), it remains to treat the case τ = 0.When τ = 0, writing

Re(Φ(0)

)= P

12 Re(U)P

12 = Q ·Re

(Φ(0)

)·Q,

we have by definition (3.16) that

τ(I −Q)x⊕[τI − Re

(Φ(0)

)]Qx = 0 (x ∈ M).

This implies (I −Q)x = 0, that is, Qx = x, hence M⊂ ran(Q).Now since Q = P it with t = 0 and Qx = x for x ∈ M, we can derive as

before from (3.5), (3.9), (3.10) and (3.11) together with (3.17)[τI − Re(UP )

]x = 0 and

[τI − Re(PU)

]x = 0 (x ∈M). (3.18)

Then since

2τI − 2Re(UP ) = 2τI − UP − PU∗

= 2τI −QRe(U)QP − PQRe(U) − (I −Q)Re(U)P−iIm(U)P − P Im(U)

where Im(U) = 12i (U − U∗), and similarly

2τI − 2Re(PU) = 2τI − PU − U∗P= 2τI −QRe(U)QP − PQRe(U) − (I −Q)Re(U)P

+iIm(U)P − P Im(U),we can conclude from (3.18)

[2τI −QRe(U)QP − PQRe(U)]x − (I −Q)Re(U)Px = 0 (x ∈M), (3.19)

and[Im(U)P − P Im(U)]x = 0 (x ∈ M). (3.20)


With ζ = it (t ∈ R) in (3.17) we can see[τI − Re(Φ(0)

)]P itx = 0 (x ∈ M; t ∈ R).

This shows that M is invariant for P it (t ∈ R). Let Q0 be the orthoprojectiononto the subspace M of ran(Q). Since P it is unitary on ran(Q), it follows fromthe invariance of M for P it (t ∈ R) that

Q0Pit = P itQ0 (t ∈ R). (3.21)

We claim further that

() Q0P = PQ0.

To see this, notice that it follows from (3.8) and (3.21) that for any x, y ∈ran(Q) and t ∈ R∫ ∞

0

λit〈d〈E(λ)Q0x, y〉 = 〈P itQ0x, y〉

= 〈P itx,Q0y〉 =∫ ∞

0

λitd〈E(λ)x,Q0y〉.

Now by the injectivity of the Fourier transform on the set of measures on the realline (see [11] p. 134) we can conclude that for any x, y ∈ ran(Q)

〈E(λ)Q0x, y〉 = 〈E(λ)x,Q0y〉 (0 ≤ λ <∞),

which implies that Q0 commutes with all E(λ), and hence with P by (3.7), estab-lishing the claim ().

Now it follows from definition (3.16) via () that the subspace M is invariantfor any function f(P ) of P , that is,

P 1/2Re(U)P 1/2 · f(P )x = τf(P )x (x ∈ M). (3.22)

With f(t) =√t, we can see from (3.22) that

P12 Re(U)Px = τ · P 1

2 x (x ∈ M),

and henceQRe(U)Q · Px = τx (x ∈ M)

because P 1/2 is injective on M. Again since P (M) is dense in M, this impliesfurther that M is invariant for QRe(U)Q too, and

P ·QRe(U)Qx = QRe(U)Q · Px = τx (x ∈ M). (3.23)

Therefore the positive semidefinite operator P and the selfadjoint contrac-tion QRe(U)Q on M have common approximate (unit) eigenvectors, say vn (n =1, 2, . . .) in M, that is, for some λ ≥ 0 and 0 ≤ θ ≤ π

limn→∞(λI − P )vn = 0 and lim

n→∞

[cos θ · I −QRe(U)

]vn = 0. (3.24)


It follows from (3.23) and (3.24) that

λ cos θ = τ. (3.25)

Here λ > 0, because we are treating the case τ = 0Then it follows from (3.19), (3.20), (3.23) and (3.24) that

limn→∞(I −Q)Re(U)vn = 0 and lim

n→∞(λI − P )Im(U)vn = 0, (3.26)

which implies again by (3.24) that

limn→∞

[cos θ · I − Re(U)

]vn = 0. (3.27)

If | cos θ| = 1, by (3.27) and the fact that U is unitary we have

limn→∞(cos θ · I − U)vn = 0.

Then by (3.24) and (3.25) we have

U · limn→∞(τI − PU)vn = lim

n→∞(τI − T )Uvn = 0,

and henceτ ∈ σ(T ) ∩ ζ : Re(ζ) = τ.

Finally suppose that | cos θ| < 1 and hence sin θ = 0. Then by the unitarityof U and (3.27), each vn can be written in the form

vn = xn + yn (n = 1, 2, . . .),

such that

limn→∞(eiθI − U)xn = 0 and lim

n→∞(e−iθI − U)yn = 0. (3.28)

Then it follows from (3.26) and (3.28) that

sin θ · limn→∞(λI − P )(xn − yn) = lim

n→∞(λI − P )Im(U)vn = 0. (3.29)

Since sin θ = 0, by (3.24) and (3.29) we have

limn→∞(λI − P )(xn + yn) = 0 and lim

n→∞(λI − P )(xn − yn) = 0

so thatlim

n→∞(λI − P )xn = 0 and limn→∞(λI − P )yn = 0,

and hence

limn→∞(λeiθI − T )xn = 0 and lim

n→∞(λe−iθI − T )yn = 0.

Now by (3.25) λeiθ or λe−iθ is an approximate eigenvalue of T with real partλ cos θ = τ , so that

σ(T ) ∩ ζ ; Re(ζ) = τ = ∅.This completes the proof of ().


Corollary 7. A Hilbert space operator T is convexoid if and only if

w(T − ζI) = w(∆(T) − ζI)    (ζ ∈ C).

Proof. First assume that T is convexoid. Then by Theorem 6

W (T ) = W(∆(T )

)which implies, by definition of numerical radius,

w(T − ζI) = sup|ξ − ζ| : ξ ∈W (T )

= sup|ξ − ζ| : ξ ∈W(∆(T )

) = w

(∆(T )− ζI

),

proving the relation.Conversely, if the relation is valid, by the general formula (1.4) we have

W (T ) =⋂ζ∈C

ξ : |ξ − ζ| ≤ w(T − ζI)

=⋂ζ∈C

ξ : |ξ − ζ| ≤ w(∆(T )− ζI

) = W

(∆(T )

).

Now the assertion follows again from Theorem 6.
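The criterion of Theorem 6 can be explored numerically through support functions of the numerical range: h_A(θ) = λ_max(Re(e^{−iθ}A)) describes the closed convex set \overline{W(A)}. The sketch below (an added illustration, assuming Python with NumPy/SciPy; not from the paper) compares h_T with h_{∆(T)} for a normal matrix, which is convexoid, and for a Jordan block, which is not.

import numpy as np
from scipy.linalg import polar

def aluthge(T):
    U, P = polar(T)
    w, V = np.linalg.eigh(P)
    R = (V * np.sqrt(np.clip(w, 0.0, None))) @ V.conj().T
    return R @ U @ R

def support_of_W(A, thetas):
    # support function of the closed numerical range:
    # h_A(theta) = largest eigenvalue of Re(e^{-i*theta} A)
    return np.array([np.linalg.eigvalsh(0.5 * (np.exp(-1j * t) * A
                     + (np.exp(-1j * t) * A).conj().T))[-1] for t in thetas])

thetas = np.linspace(0.0, 2.0 * np.pi, 180, endpoint=False)
N = np.diag([1.0, 1j, -1.0])                  # normal, hence convexoid
J = np.array([[0.0, 1.0], [0.0, 0.0]])        # Jordan block, not convexoid
for T in (N, J):
    gap = np.max(support_of_W(T, thetas) - support_of_W(aluthge(T), thetas))
    print(gap)    # ~0 for the convexoid matrix, about 0.5 for the Jordan block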

4. Convex hull of spectrum

When an operator T is not convexoid, it is an interesting question how to represent conv(σ(T)) in terms of numerical ranges related to T.

To answer this question, we need an operator, different from that in Lemma 4, on the ultra-sum.

Lemma 8. Suppose that T is a bounded linear operator on a Hilbert space H. Then the sequence of operators \mathbf{T} = (∆^n(T)) determines a bounded linear operator on the ultra-sum K of H with the following properties:

σ(\mathbf{T}) ⊂ σ(T)  and  \overline{W(\mathbf{T})} = \overline{W(∆(\mathbf{T}))} = ⋂_{n=1}^{∞} \overline{W(∆^n(T))}.

Proof. The operator T is bounded, because by (1.2)

||T || ≥ ||∆(T )|| ≥ ||∆2(T )|| ≥ · · · .Then it is easy to see

∆(T) =(∆n+1(T )

). (4.1)

Take ζ ∈ σ(T ). Then since by Theorem 2

||(T − ζI)−1|| ≥ ||(∆(T )− ζI

)−1

|| ≥ ||(∆2(T )− ζI

)−1

|| ≥ · · ·


the operator

S def=((∆n(T )− ζI)−1

)is bounded on K and becomes the inverse of T− ζ I. Therefore ζ ∈ σ(T), hence

σ(T) ⊂ σ(T ).

Since by (4.1)

∆k(T)− ζ I =(∆n+k(T )− ζI

)and by Theorem 2

||T − ζI|| ≥ ||∆(T )− ζI|| ≥ ||∆2(T )− ζI|| ≥ · · · ,we can conclude from Lemma 5 that

||∆k(T)− ζ I|| = infn||∆n(T )− ζI|| = ||T− ζ I|| (ζ ∈ C). (4.2)

Now using (4.2) for k = 1, by the general formula (1.3) we have

W (T) =⋂ζ∈C

ξ : |ξ − ζ| ≤ ||T− ζ I||

=

⋂ζ∈C

ξ : |ξ − ζ| ≤ ||∆(T) − ζ I||

= W

(∆(T)

).

In a similar way we have by (4.2)

W (T) =⋂ζ∈C

ξ : |ξ − ζ| ≤ ||T− ζ I||

=

⋂ζ∈C

∞⋂n=1

ξ : |ξ − ζ| ≤ ||∆n(T )− ζI||

=

∞⋂n=1

W(∆n(T )

).

This completes the proof.

Theorem 9. For any Hilbert space operator T the convex hull of its spectrum is represented as

conv(σ(T)) = ⋂_{n=1}^{∞} \overline{W(∆^n(T))}.

Proof. Consider the operator T in Lemma 8 such that

W (T) = W(∆(T)

)=

∞⋂n=1

W(∆n(T )

)and σ(T) ⊂ σ(T ).

Then by Theorem 6 T is convexoid, so that

conv(σ(T)

)= W (T) =

∞⋂n=1

W(∆n(T )

).


Finally since σ(T ) ⊃ σ(T), this implies

conv(σ(T )

)⊃

∞⋂n=1

W(∆n(T )

).

The reverse inclusion is obvious by (1.7). This completes the proof.

Theorem 10. (Yamazaki [15]) The spectral radius of a Hilbert space operator T is represented as

r(T) = lim_{n→∞} ||∆^n(T)|| = inf_{n=1,2,...} ||∆^n(T)||.

Proof. Since by (1.6)

||∆n(T )|| ≥ r(∆n(T )

)= r(T ),

the inequalityinf

n=1,2,...||∆n(T )|| ≥ r(T )

is immediate. To see the reverse inequality, we may assume

infn=1,2,...

||∆n(T )|| = 1, (4.3)

and have to prove r(T ) ≥ 1.To this end, consider the operator T on the ultra-sum K in Lemma 8. Then

by (4.2) and (4.3) we have

||T|| = infn=1,2,...

||∆n(T )|| = 1.

Choose unit vectors xn ∈ H such that

limn→∞ ||∆n(T )xn|| = 1,

and let x = (xn) ∈ K. For any k ≥ 1 let

xkdef= (xn+k)

Then we have, by definition (4.1) of T, for every k = 1, 2, . . .

||xk|| = 1 and ||∆k(T)xk|| = limn→∞ ||∆n+k(T )xn+k|| = 1. (4.4)

We claim, for an operator S with polar representation S = V |S| and ||S|| = 1,that for any k ≥ 0

(†) ||∆k(S)u|| = ||u|| = 1 =⇒ ||Sk+1u|| = 1.

Since ∆0(S) = S by definition, (†) for k = 0 is immediate. Suppose that (†)for some k ≥ 0 is true in general, and suppose

||∆k+1(S)u|| = ||u|| = 1.

Then since by (1.2)

1 = ||S|| ≥ ||∆(S)|| ≥ ||∆k+1(S)|| ≥ ||∆k+1(S)u|| = 1,


we have ||∆(S)|| = 1. Since

1 = ||∆k+1(S)u|| = ||∆k(∆(S)

)u||,

apply the induction assumption to ∆(S) instead of S to conclude

||∆(S)k+1u|| = 1. (4.5)

Since∆(S)k+1 = |S| 12 (V |S|)kV |S| 12 and 0 ≤ |S| 12 ≤ I,

we have by (4.5)

1 = ||∆(S)k+1u|| ≤ ||(V |S|)kV |S| 12 u|| ≤ || |S| 12u|| ≤ ||u|| = 1

and hence||(V |S|)kV |S| 12u|| = || |S| 12 u|| = 1. (4.6)

Recall the following well-known result (see [8] p.75);

(‡) 0 ≤ A ≤ I, ||x|| ≤ 1, ||Ax|| = 1 =⇒ A2x = Ax = x.

Now consider the vector v ≡ (V |S|)kV |S| 12u. Then by (4.6) and (4.5) we have||v|| = 1 and

|| |S| 12 v|| = ||∆(S)k+1u|| = 1.

Now we can apply (‡) to |S| 12 instead of A and to u and v instead of x to get

||Sk+2u|| = ||V |S| · (V |S|)kV |S|u|| = ||(|S| 12 )2(V |S|)kV |S| 12 |S| 12u||= ||(|S| 12 )2(V |S|)kV |S| 12u|| = ||(|S| 12 )2v|| = 1.

This completes induction for (†).Now applying (†) to (4.4) we can conclude

||Tk+1|| ≥ ||Tk+1xk|| = 1 (k = 0, 1, 2, . . .).

Then the Gelfand formula (see [8] p.48) yields

r(T) = limk→∞

||Tk|| 1k ≥ 1. (4.7)

Finally since σ(T) ⊂ σ(T ) implies r(T) ≤ r(T ), by (4.7) we arrive at r(T ) ≥ 1.This completes the proof.
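Theorems 9 and 10 suggest a simple numerical experiment with iterated Aluthge transforms. The sketch below (an added illustration under the same Python/NumPy assumptions; the theorems say nothing about the speed of convergence, so the printed numbers only exhibit the monotone decrease) tracks ||∆^n(T)|| against r(T) and compares one support value of W(∆^n(T)) with the corresponding support value of conv(σ(T)).

import numpy as np
from scipy.linalg import polar

def aluthge(T):
    U, P = polar(T)                      # T = U P with P = |T|
    w, V = np.linalg.eigh(P)
    R = (V * np.sqrt(np.clip(w, 0.0, None))) @ V.conj().T   # P^{1/2}
    return R @ U @ R

rng = np.random.default_rng(1)
T = rng.standard_normal((6, 6)) + 1j * rng.standard_normal((6, 6))
r = np.max(np.abs(np.linalg.eigvals(T)))           # spectral radius r(T)

A = T.copy()
norms = []
for _ in range(60):
    A = aluthge(A)                                 # A = Delta^n(T) after n steps
    norms.append(np.linalg.norm(A, 2))
print(norms[0], norms[-1], r)                      # ||Delta^n(T)|| decreases towards r(T)

# one support value of W(Delta^n(T)) versus the same support value of conv(sigma(T))
t = 0.7
H = 0.5 * (np.exp(-1j * t) * A + (np.exp(-1j * t) * A).conj().T)
print(np.linalg.eigvalsh(H)[-1], np.max(np.real(np.exp(-1j * t) * np.linalg.eigvals(T))))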

References

[1] N.I. Akhiezer and I.M. Glazman, Theory of Linear Operators in Hilbert Space, (Eng-lish Translation) Pitman Pub., Boston, 1981.

[2] A. Aluthge, On p-hyponormal operators for 0 < p < 1, Integral Equations Operator Theory 13 (1990), 307–315.

[3] T. Ando, Aluthge transforms and the convex hull of the eigenvalues of a matrix, Linear Multilinear Algebra 52, 281–292.

[4] S.K. Berberian, Approximate proper vectors, Proc. Amer. Math. Soc. 13 (1962), 111–114.


[5] J.B. Conway, Functions of One Complex Variable I, Springer, New York, 1978.

[6] J.B. Conway, A Course in Functional Analysis, Springer, New York, 1985.

[7] C. Foias, I.B. Jung, E. Ko and C. Pearcy, Complete contractivity of maps associatedwith the Aluthge and Duggal transforms, Pacific J. Math. 209 (2003), 249–259.

[8] P. Halmos, A Hilbert Space Problem Book, Springer, New York, 1985.

[9] S. Hildebrandt, Uber den numerischen Wertebereich eines Operators, Math. Ann.163 (1966), 230–247.

[10] I.B. Jung, E. Ko, and C. Pearcy, Aluthge transforms of operators, Integral EquationsOperator Theory 37 (2000), 437–448.

[11] Y. Katznelson, An Introduction to Harmonic Analysis (Second corrected edition),Dover, New York, 1976.

[12] S.R. Lay, Convex Sets and Their Applications, Wiley, New York, 1982.

[13] P.Y. Wu, Numerical range of Aluthge transform of operators, Linear Algebra Appl.357(2002), 295–298.

[14] T. Yamazaki, On numerical range of the Aluthge transformation, Linear Alg. Appl.341(2002), 111–117.

[15] T. Yamazaki, An expression of spectral radius via Aluthge transformation, Proc.Amer. Math. Soc. 130 (2002), 1131–1137.

Tsuyoshi Ando
Shiroishi-ku, Hongo-dori 9
Minami 4-10-805
Sapporo 003-0024, Japan
e-mail: [email protected]


Operator Theory: Advances and Applications, Vol. 160, 41–52
© 2005 Birkhäuser Verlag Basel/Switzerland

Maximal Nevanlinna-Pick Interpolation for Points in the Open Unit Disc

W. Bhosri, A.E. Frazho and B. Yagci

Dedicated to Israel Gohberg on the occasion of his seventy-fifth birthday

Abstract. This note uses a modification of the classical optimization problem in prediction theory to derive a maximal solution for the Nevanlinna-Pick interpolation problem for each point in the open unit disc. This optimization problem is also used to show that the maximal solution is unique. A state space realization for the maximal solution is given.

1. A positive interpolation problem

In this note we will use a modification of the classical optimization problem in prediction theory [14, 15] to derive a special set of solutions for the Nevanlinna-Pick interpolation or covariance problem in [7, 8, 10, 16]. For each α in the open unit disc, we compute a state space solution to the Nevanlinna-Pick interpolation problem that uniquely satisfies a maximum principle. For α = 0 our maximal solution reduces to the central or maximal entropy solution in [7, 8, 10, 16].

To introduce our Nevanlinna-Pick interpolation problem, let U be a Hilbert space and T on ℓ^2_+(U) be the strictly positive Toeplitz operator matrix given by

T = [ R_0     R_1     R_2    · · · ]
    [ R_{−1}  R_0     R_1    · · · ]
    [ R_{−2}  R_{−1}  R_0    · · · ]
    [  ⋮       ⋮       ⋮      ⋱   ]    on ℓ^2_+(U).    (1.1)

(An operator P is strictly positive if P is an invertible positive operator.) Now let A be a stable operator on a Hilbert space X. By stable we mean that the spectrum of A is contained in the open unit disc D, that is, r_spec(A) < 1. Let C be an operator mapping X onto the whole space U. Let W be the observability operator mapping


X into ℓ^2_+(U) defined by

W = [ C    ]
    [ CA   ]
    [ CA^2 ]
    [  ⋮   ]    : X → ℓ^2_+(U).    (1.2)

Throughout we assume that the pair {C, A} is observable. In other words, we assume that W is left invertible, or equivalently, W*W is invertible. Hence Λ = W*TW is a strictly positive operator on X.

The operator Λ = W*TW is a solution to a Lyapunov equation of the form

Λ = A*ΛA + C̃*C + C*C̃.    (1.3)

Here C̃ is an operator from X into U. To obtain this Lyapunov equation, let C̃ be the operator from X into U defined by

C̃ = (1/2) R_0 C + Σ_{j=1}^{∞} R_j C A^j.    (1.4)

Now let S be the standard forward shift on 2+(U). By employing S∗W = WA, weobtain

Λ −A∗ΛA = W ∗TW −W ∗STS∗W = W ∗ (T − STS∗)W

= W ∗

⎡⎢⎢⎢⎣R0 R1 R2 · · ·R−1 0 0 · · ·R−2 0 0 · · ·

......

.... . .

⎤⎥⎥⎥⎦W

= C∗R0C +∞∑

j=1

C∗RjCAj +

∞∑j=1

A∗jC∗R−jC = C∗C + C∗C .

Therefore Λ is a solution to the Lyapunov equation in (1.3).

This naturally leads to a Nevanlinna-Pick interpolation problem. The data set is a triple of operators {A, C, Λ} where {C, A} is a stable, observable pair and C is onto. Moreover, Λ is a strictly positive operator on X satisfying a Lyapunov equation of the form (1.3). Finally, we assume that U is finite dimensional. Then our Nevanlinna-Pick interpolation problem is to find the set of all strictly positive Toeplitz operators T on ℓ^2_+(U) satisfying Λ = W*TW. The set of all solutions to this problem is given in [7, 8, 16]. It turns out that the Nevanlinna-Pick interpolation problem is equivalent to a state covariance problem arising in linear systems [10, 11]. Reference [10] uses J-expansive functions to derive the set of all solutions. Here we will use some optimization problems from classical prediction theory [14, 15] to present an elementary derivation of a special set of solutions to this interpolation problem. Our set of solutions is parameterized by the open unit disc D. For each α in D, we obtain a solution to the Nevanlinna-Pick interpolation problem which uniquely satisfies a maximal principle. For α = 0, our solution


turns out to be the central solution to the Nevanlinna-Pick interpolation problem presented in [7, 8, 10, 16]. Finally, it is noted that we do not need to obtain C̃ to compute a solution to our Nevanlinna-Pick interpolation problem. All we need to know is that Λ is a solution to a Lyapunov equation of the form (1.3).

This interpolation problem encompasses the classical Caratheodory interpolation problem. To see this assume that A is the upper shift on X = ⊕_1^n U given by

A = [ 0  I  0  · · ·  0 ]
    [ 0  0  I  · · ·  0 ]
    [ ⋮  ⋮  ⋮   ⋱    ⋮ ]
    [ 0  0  0  · · ·  I ]
    [ 0  0  0  · · ·  0 ],        C = [ I  0  0  · · ·  0 ].    (1.5)

Notice that the state space X is simply n orthogonal copies of U. In this setting the observability operator W in (1.2) is given by

W = [ I ]
    [ 0 ]    : ⊕_1^n U → ℓ^2_+(U).

In other words, W embeds X = ⊕_1^n U into the first n components of ℓ^2_+(U). Now assume that T is the strictly positive Toeplitz operator given in (1.1). Then Λ = W*TW is the strictly positive n × n Toeplitz matrix contained in the upper left-hand corner of T, that is,

Λ = W*TW = [ R_0      R_1      R_2      · · ·  R_{n−1} ]
            [ R_{−1}   R_0      R_1      · · ·  R_{n−2} ]
            [ R_{−2}   R_{−1}   R_0      · · ·  R_{n−3} ]
            [  ⋮        ⋮        ⋮        ⋱     ⋮      ]
            [ R_{1−n}  R_{2−n}  R_{3−n}  · · ·  R_0     ].    (1.6)

Now assume that Λ is a strictly positive Toeplitz matrix of the form (1.6). Let C̃ be the operator given by

C̃ = [ R_0/2  R_1  R_2  · · ·  R_{n−1} ].

Then Λ is a solution to the Lyapunov equation in (1.3). Therefore {A, C, Λ} with C, A in (1.5) and Λ in (1.6) strictly positive is a data set for our Nevanlinna-Pick interpolation problem. In this setting, our Nevanlinna-Pick interpolation problem is to find the set of all strictly positive Toeplitz operators T on ℓ^2_+(U) such that Λ is contained in the n × n upper left-hand corner of T. This is precisely the classical Caratheodory interpolation problem.
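As a small sanity check of this setup (an added illustration with hypothetical scalar data, U = C, not taken from the paper), the sketch below assumes Python with NumPy/SciPy, builds A, C, a strictly positive Toeplitz matrix Λ and the row C̃ = [R_0/2, R_1, ..., R_{n−1}], and verifies the Lyapunov equation (1.3) numerically.

import numpy as np
from scipy.linalg import toeplitz

n = 4
R = np.array([2.0, 0.6, 0.2, 0.05])            # R_0, R_1, ..., R_{n-1}; here R_{-j} = R_j
Lam = toeplitz(R)                              # the n x n Toeplitz matrix in (1.6)
assert np.all(np.linalg.eigvalsh(Lam) > 0)     # strictly positive

A = np.diag(np.ones(n - 1), k=1)               # upper shift as in (1.5)
C = np.zeros((1, n)); C[0, 0] = 1.0
C_tilde = np.concatenate(([R[0] / 2.0], R[1:])).reshape(1, n)

# Lyapunov equation (1.3): Lam = A* Lam A + C_tilde* C + C* C_tilde
rhs = A.T @ Lam @ A + C_tilde.T @ C + C.T @ C_tilde
print(np.allclose(Lam, rhs))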

To introduce our solution to the Nevanlinna-Pick interpolation problem, we need some additional notation. Let Θ be a function in H^∞(L(U)). (Here H^∞(L(U)) is the Hardy space consisting of the set of all uniformly bounded analytic functions in the open unit disc whose values are bounded operators on U.) Then T_Θ is the lower triangular Toeplitz operator on ℓ^2_+(U) defined by

T_Θ = [ Θ_0  0    0    · · · ]
      [ Θ_1  Θ_0  0    · · · ]
      [ Θ_2  Θ_1  Θ_0  · · · ]
      [  ⋮    ⋮    ⋮    ⋱   ]    on ℓ^2_+(U),    (1.7)

where Θ(λ) = Σ_{n=0}^{∞} Θ_n λ^n is the Taylor series expansion for Θ. We say that Θ is an invertible outer function if both Θ and Θ^{−1} are functions in H^∞(L(U)). A function Θ in H^∞(L(U)) is an invertible outer function if and only if T_Θ is an invertible operator on ℓ^2_+(U). We say that Θ is an outer spectral factor for the Toeplitz operator T in (1.1) if Θ is an outer function in H^∞(L(U)) satisfying T = T_Θ^*T_Θ. It is well known that the outer spectral factor is unique up to a unitary constant on the left. Finally, T is a strictly positive Toeplitz operator if and only if T admits an invertible outer spectral factor; see Theorem 1.1 page 534 in [6].

Throughout, F is the standard Fourier transform mapping ℓ^2_+(U) onto H^2(U). Here H^2(U) is the Hardy space formed by the set of all analytic functions in the open unit disc with values in U whose Taylor coefficients are square summable. If W is the observability operator in (1.2), then FWx = C(I − λA)^{−1}x, where x is in X.

Now assume that T is a solution to the Nevanlinna-Pick interpolation problem for the data set {A, C, Λ}. In other words, assume that T is a strictly positive Toeplitz operator such that Λ = W*TW. Then T admits a unique invertible outer spectral factor Θ. Therefore Λ = W*T_Θ^*T_Θ W. Clearly, there is a one-to-one correspondence between the set of all solutions to our Nevanlinna-Pick interpolation problem and the set of all invertible outer functions Θ satisfying Λ = W*T_Θ^*T_Θ W. Motivated by this we say that Θ is a spectral interpolant for the data {A, C, Λ} if Θ is an invertible outer function in H^∞(L(U)) satisfying Λ = W*T_Θ^*T_Θ W.

2. Two optimization problems

As before, let Θ be an invertible outer function in H^∞(L(U)). Let α be a fixed scalar in the open unit disc D. Consider the classical optimization problem

µ(y, α) = inf{ ‖Θh‖^2 : h ∈ H^2(U) and h(α) = y }.    (2.1)

In optimal control theory, the error µ(y, α) in the optimization problem is referred to as the “cost”. The idea is to design a controller with the lowest cost possible. Motivated by control theory we will refer to µ(y, α) as the cost. If α = 0, this is precisely the classical optimization problem arising in prediction theory [14, 15, 18], which also played a role in the maximal entropy solution to the state covariance problem in [10]. Now let ϕ_α be the reproducing kernel in H^2 defined by ϕ_α(λ) = 1/(1 − ᾱλ). Set d_α = (1 − |α|^2)^{1/2}. The optimal solution h_opt to the optimization problem in (2.1) is unique and given by

h_opt(λ) = d_α^2 ϕ_α(λ) Θ(λ)^{−1} Θ(α)y  and  µ(y, α) = d_α^2 ‖Θ(α)y‖^2.    (2.2)


To show that hopt is the optimal solution, first observe that hopt(α) = y. Let h beany function in H2(U) satisfying h(α) = y. Recall that if g is a function in H2(U)and u is in U , then (g(α), u) = (g, ϕαu). Using this reproducing property of ϕα,we obtain

‖Θh‖2 = ‖Θ(h− hopt) + Θhopt‖2

= ‖Θ(h− hopt)‖2 + 2(Θ(h− hopt), d2αϕαΘ(α)y) + ‖Θhopt‖2

= ‖Θ(h− hopt)‖2 + 2(Θ(α)(y − y), d2αΘ(α)y) + ‖Θhopt‖2

= ‖Θ(h− hopt)‖2 + ‖Θhopt‖2 ≥ ‖Θhopt‖2 .

This readily implies that

‖Θhopt‖2 ≤ ‖Θ(h− hopt)‖2 + ‖Θhopt‖2 = ‖Θh‖2 . (2.3)

Hence hopt is an optimal solution to the optimization problem in (2.1). The in-equality in (2.3) shows that ‖Θhopt‖ = ‖Θh‖ if and only if ‖Θ(h− hopt)‖ = 0, orequivalently, h = hopt. In other words, hopt is the unique solution to the optimiza-tion problem in (2.1). Since dαϕα is a unit vector, the cost µ(y, α) = ‖Θhopt‖2 =d2

α‖Θ(α)y‖2.Now let us introduce the second optimization problem, which plays a funda-

mental role in our approach to the Nevanlinna-Pick interpolation problem. Recall that the operator Λ = W*TW is the solution to the Lyapunov equation in (1.3). Let

ν(y, α) = inf{ (Λx, x) : C(I − αA)^{−1}x = y }.    (2.4)

Here α is a fixed scalar in D, and y is a fixed vector in U. This is a standard least squares optimization problem whose solution is unique and given by

x_opt = Λ^{−1}(I − ᾱA*)^{−1}C*∆y  and  ν(y, α) = (∆y, y),
∆ = (C(I − αA)^{−1}Λ^{−1}(I − ᾱA*)^{−1}C*)^{−1}.    (2.5)

To develop a connection between (2.4) and the Nevanlinna-Pick interpolation problem, assume that Θ is a spectral interpolant for the data {A, C, Λ}, that is, Θ is an invertible outer function satisfying Λ = W*T_Θ^*T_Θ W. If x is in X, then (FT_Θ Wx)(λ) = Θ(λ)C(I − λA)^{−1}x where λ ∈ D. Hence

(Λx, x) = (W*T_Θ^*T_Θ Wx, x) = ‖T_Θ Wx‖^2 = ‖ΘC(I − λA)^{−1}x‖^2_{H^2}.

This readily implies that

(∆y, y) = inf{ (Λx, x) : C(I − αA)^{−1}x = y } = inf{ ‖ΘC(I − λA)^{−1}x‖^2 : C(I − αA)^{−1}x = y }.    (2.6)

Notice that the solution x_opt to this optimization problem is independent of the spectral interpolant Θ. We claim that

∆ ≥ d_α^2 Θ(α)*Θ(α).    (2.7)

If x is a vector in X satisfying C(I − αA)^{−1}x = y, then h(λ) = C(I − λA)^{−1}x is a function in H^2(U) satisfying h(α) = y. In other words, the optimization problem

ν(y, α) = inf{ ‖ΘC(I − λA)^{−1}x‖^2 : C(I − αA)^{−1}x = y }    (2.8)


searches over a smaller set than the optimization problem in (2.1). This readily implies that the cost ν(y, α) ≥ µ(y, α). By virtue of ν(y, α) = (∆y, y) and µ(y, α) = d_α^2 ‖Θ(α)y‖^2, we arrive at the inequality in (2.7).

The previous analysis yields the following maximal principle: If Θ is any spectral interpolant for {A, C, Λ}, then ∆ ≥ d_α^2 Θ(α)*Θ(α). We say that Θ is an α-maximal spectral interpolant for {A, C, Λ} if Θ is a spectral interpolant satisfying ∆ = d_α^2 Θ(α)*Θ(α).

Assume that Θ is an α-maximal spectral interpolant. This means that the two optimization problems in (2.1) and (2.8) have the same cost, that is, µ(y, α) = ν(y, α) for all y in U. Recall that the optimization problem in (2.8) searches over a smaller set than the optimization problem in (2.1). Because the solutions to these two optimization problems are unique, we must have h_opt(λ) = C(I − λA)^{−1}x_opt for all y in U. By consulting (2.2) and (2.5) this readily implies that

d_α^2 ϕ_α(λ) Θ(λ)^{−1} Θ(α) = C(I − λA)^{−1}Λ^{−1}(I − ᾱA*)^{−1}C*∆.    (2.9)

Since ∆ = d_α^2 Θ(α)*Θ(α), without loss of generality we can assume that d_α Θ(α) = ∆^{1/2}. Then equation (2.9) implies that

Θ(λ) = d_α ((1 − ᾱλ)C(I − λA)^{−1}Λ^{−1}(I − ᾱA*)^{−1}C*∆^{1/2})^{−1}.    (2.10)

Observe that Θ is uniquely determined by the formula in (2.10). In other words, if there exists a spectral interpolant Θ satisfying ∆ = d_α^2 Θ(α)*Θ(α), then Θ is uniquely given by (2.10). (Here we do not distinguish between two outer functions which are equal up to a unitary constant on the left.) So far we have shown that if there exists an α-maximal spectral interpolant for {A, C, Λ}, then Θ is unique and given by (2.10). The following result shows that there exists a unique α-maximal spectral interpolant for any data set.

Theorem 2.1. Let {A, C, Λ} be the data set for a Nevanlinna-Pick interpolation problem. Moreover, let Ω be the function in H^∞(L(U)) defined by

Ω(λ) = d_α^{−1}(1 − ᾱλ)C(I − λA)^{−1}Λ^{−1}(I − ᾱA*)^{−1}C*∆^{1/2},
∆ = (C(I − αA)^{−1}Λ^{−1}(I − ᾱA*)^{−1}C*)^{−1}.    (2.11)

Then the following holds.

(i) The inverse Θ(λ) = Ω(λ)^{−1} is the unique α-maximal spectral factor for {A, C, Λ}. In particular, Ω is an invertible outer function.
(ii) A realization for Θ is given by

Θ(λ) = D − λDC(I − λJ)^{−1}(A − αI)B(CB)^{−1},
B = Λ^{−1}(I − ᾱA*)^{−1}C*,
J = A − (A − αI)B(CB)^{−1}C,
D = d_α ∆^{−1/2}(CB)^{−1}.    (2.12)

Finally, the operator J is stable, that is, r_spec(J) < 1.


Remark 2.1. If α = 0, then the 0-maximal solution in Theorem 2.1 is given by Θ = Ω^{−1}, where

Ω(λ) = C(I − λA)^{−1}Λ^{−1}C*∆^{1/2}  and  ∆ = (CΛ^{−1}C*)^{−1}.

This Θ is precisely the central solution to the Nevanlinna-Pick interpolation problem presented in [7, 8, 10, 16].

Remark 2.2. Let Λ be the strictly positive Toeplitz operator on X = ⊕_1^n U given in (1.6), and A and C the operators on X defined in (1.5). In this case, the Nevanlinna-Pick interpolation problem for {A, C, Λ} reduces to the Caratheodory interpolation problem. Notice that

C(I − λA)^{−1} = [ I  λI  λ^2 I  · · ·  λ^{n−1} I ] := Ψ(λ).    (2.13)

By consulting Theorem 2.1, we see that the α-maximal solution to the Caratheodory interpolation problem is given by Θ = Ω^{−1}, where Ω is the polynomial determined by

Ω(λ) = d_α^{−1}(1 − ᾱλ)Ψ(λ)Λ^{−1}Ψ(α)*∆^{1/2},  ∆ = (Ψ(α)Λ^{−1}Ψ(α)*)^{−1}.    (2.14)

If α = 0, then Ω = [ I  λI  · · ·  λ^{n−1} I ] Λ^{−1}C*∆^{1/2} and ∆ = (CΛ^{−1}C*)^{−1}. In this case, Θ = Ω^{−1} is the outer spectral factor computed from the Levinson filter.
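Formula (2.11) is straightforward to evaluate numerically. The sketch below (an added illustration, assuming Python with NumPy/SciPy and a real α so that no complex conjugates enter; the function names are ad hoc) computes ∆ and Ω(λ) for the Caratheodory data of Remark 2.2 and checks the defining property Ω(α) = d_α ∆^{−1/2}, i.e., d_α Θ(α) = ∆^{1/2}.

import numpy as np
from scipy.linalg import toeplitz

def psd_sqrt(M):
    w, V = np.linalg.eigh(M)
    return (V * np.sqrt(np.clip(w, 0.0, None))) @ V.conj().T

def maximal_interpolant(A, C, Lam, alpha):
    # Delta and Omega(.) following (2.11), for real alpha in (-1, 1)
    n = A.shape[0]
    I = np.eye(n)
    d_alpha = np.sqrt(1.0 - alpha**2)
    B = np.linalg.solve(Lam, np.linalg.solve(I - alpha * A.conj().T, C.conj().T))
    Delta = np.linalg.inv(C @ np.linalg.solve(I - alpha * A, B))
    Dh = psd_sqrt(Delta)                       # Delta^{1/2}
    def Omega(lam):
        return (1.0 - alpha * lam) / d_alpha * (C @ np.linalg.solve(np.eye(n) - lam * A, B)) @ Dh
    return Delta, Omega, d_alpha

# Caratheodory data as in (1.5)/(1.6), scalar case U = C
n, alpha = 4, 0.3
Lam = toeplitz([2.0, 0.6, 0.2, 0.05])
A = np.diag(np.ones(n - 1), k=1)
C = np.zeros((1, n)); C[0, 0] = 1.0

Delta, Omega, d_alpha = maximal_interpolant(A, C, Lam, alpha)
# Theta(alpha) = Omega(alpha)^{-1} should satisfy d_alpha * Theta(alpha) = Delta^{1/2},
# i.e., Omega(alpha) = d_alpha * Delta^{-1/2}:
print(np.allclose(Omega(alpha), d_alpha * np.linalg.inv(psd_sqrt(Delta))))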

Proof of Theorem 2.1. Recall that if F admits a state space realization of the form

F (λ) = N + λC(I − λA)−1E (2.15)

where N is invertible, then the inverse of F exists in some neighborhood of theorigin and is given by

F (λ)−1 = N−1 − λN−1C(I − λ(A− EN−1C)

)−1EN−1 . (2.16)

Using (I − λA)−1 = I + λ(I − λA)−1A, it follows that Ω in (2.11) admits astate space realization of the form

dαΩ(λ)∆−1/2 = CB + λC(I − λA)−1(A− αI)B . (2.17)

Lemma 2.1 below shows that CB is invertible. By consulting (2.15) and (2.16), wesee that Θ = Ω−1 is given by the state space realization in (2.12).

We claim that J satisfies the Lyapunov equation

Λ = J∗ΛJ + C∗D∗DC . (2.18)

To obtain the Lyapunov equation in (2.18), set

P = I −B(CB)−1C and Q = B(CB)−1C .

Using P + Q = I with x in X , we obtain

(Λx, x) = (Λ(P + Q)x, (P + Q)x) = (ΛPx, Px) + 2(ΛPx,Qx) + (ΛQx,Qx) .(2.19)


By employing AP = J − αQ, we obtain

‖Λ1/2APx‖2 = ‖Λ1/2Jx− αΛ1/2Qx‖2

= ‖Λ1/2Jx‖2 − 2(Λ(APx + αQx), αQx) + |α|2‖Λ1/2Qx‖2

= (ΛJx, Jx)− 2(Px, αA∗ΛQx)− |α|2(ΛQx,Qx) . (2.20)

Notice that CP = 0. By applying P ∗ to the left and P to the right of the Lyapunovequation in (1.3), we obtain P ∗ΛP = P ∗A∗ΛAP . This with (2.19) and (2.20) yields

(Λx, x) − (ΛJx, Jx) = (ΛPx, Px) + 2(Px,ΛQx) + (ΛQx,Qx)−(ΛAPx,APx)− 2(Px, αA∗ΛQx)− |α|2(ΛQx,Qx)

= 2(Px, (I − αA∗)ΛQx) + d2α(ΛQx,Qx)

= 2(CPx, (CB)−1Cx) + d2α(ΛQx,Qx)

= d2α(B∗ΛB(CB)−1Cx, (CB)−1Cx) = (C∗D∗DCx, x) .

The last equation follows from the fact that B∗ΛB = ∆−1. Therefore the Lyapunovequation in (2.18) holds.

Now let us show that J is stable. Set L = (A − αI)B(CB)−1, and let k beany positive integer. Using J = A− LC, we obtain⎡⎢⎢⎢⎢⎢⎣

CCACA2

...CAk

⎤⎥⎥⎥⎥⎥⎦ =

⎡⎢⎢⎢⎢⎢⎣I 0 0 · · · 0CL I 0 · · · 0CAL CL I · · · 0

......

.... . .

...CAk−1L CAk−2L CAk−3L · · · I

⎤⎥⎥⎥⎥⎥⎦

⎡⎢⎢⎢⎢⎢⎣CCJCJ2

...CJk

⎤⎥⎥⎥⎥⎥⎦ . (2.21)

The square operator matrix in (2.21) is a lower triangular Toeplitz matrix withthe identity on the main diagonal. In particular, this matrix is invertible. Forthe moment assume that X is finite dimensional. Because C,A is observable,equation (2.21) implies that C, J is observable. Notice that D is invertible, andthus, DC, J is also observable. Since Λ is a strictly positive solution to theLyapunov equation in (2.18) and DC, J is observable, it follows that J is stable.The stability of A and J imply that Ω and Ω−1 are both function in H∞(L(U)).In other words, Θ = Ω−1 is an invertible outer function.

Now assume that X is infinite dimensional. Let Wk and Vk be the operatorsmapping X into ⊕k

0U defined by

Wk =

⎡⎢⎢⎢⎣CCA...

CAk

⎤⎥⎥⎥⎦ and Vk =

⎡⎢⎢⎢⎣CCJ...

CJk

⎤⎥⎥⎥⎦ .

Because A is stable, the operator Wk converges to W in the operator topology as ktends to infinity. Recall that W is left invertible. So there exists an integer k suchthat Wk is left invertible. Since the lower triangular matrix in (2.21) is invertible,Vk is also left invertible. Recall that J satisfies the Lyapunov equation in (2.18).


By consulting Lemma 5.4 in [7], we see that J is stable. Therefore Θ = Ω−1 is aninvertible outer function.

We claim that

Θ(λ)C(I − λA)−1 = DC(I − λJ)−1 . (2.22)

Recall that J = AP + αQ and P + Q = I. Using this we obtain

Θ(λ)C(I − λA)−1 =(D − λDC(I − λJ)−1(A− αI)B(CB)−1

)C(I − λA)−1

= DC(I − λ(I − λJ)−1(A− αI)Q

)(I − λA)−1

= DC(I − λJ)−1 ((I − λJ) − λ(A− αI)Q) (I − λA)−1

= DC(I − λJ)−1 (I − λA(P + Q)) (I − λA)−1

= DC(I − λJ)−1.

Therefore (2.22) holds.Recall that Λ =

∑∞0 J∗nC∗D∗DCJn is the unique solution to the Lyapunov

equation in (2.18). By employing (2.22) with x in X , we obtain

‖TΘWx‖2 = ‖ΘC(I − λA)−1x‖2H2 = ‖DC(I − λJ)−1x‖2

H2

= ‖∞∑

n=0

λnDCJnx‖2 =∞∑

n=0

‖DCJnx‖2

=∞∑

n=0

(J∗nC∗D∗DCJnx, x) = (Λx, x) .

Thus (W ∗T ∗ΘTΘWx, x) = ‖TΘWx‖2 = (Λx, x) for all x in X . Hence Θ is a spec-

tral interpolant for the data A,C,Λ. Notice that Θ(α)−1 = Ω(α) = dα∆−1/2. Inother words, dαΘ(α) = ∆1/2. This readily implies that ∆ = d2

αΘ(α)∗Θ(α). There-fore Θ is the unique α-maximal spectral interpolant for A,C,Λ. This completesthe proof.

Lemma 2.1. Let A,C,Λ be a data set, and B = Λ−1(I−αA∗)−1C∗ where α ∈ D.Then the operator CB is invertible.

Proof. If α = 0, then CB = CΛ−1C∗. In this case, CB is strictly positive. Thereforethe Lemma holds when α = 0.

Now assume that α is nonzero, and recall that U is finite-dimensional. Letus proceed by contradiction and assume that CBu = 0 for some nonzero u in U .This implies that Bu is in the kernel of C. Recall that

xopt = Λ−1(I − αA∗)−1C∗∆y = B∆y

is the unique solution to the optimization problem in (2.4). So if we set y = ∆−1u,then the optimal solution xopt = Bu is in the kernel of C. Moreover, ν(y, α) isnonzero. By employing the Lyapunov equation in (1.3), we see that (Λx, x) =


(ΛAx,Ax) for all x in the kernel of C. Using this we obtain

ν(y, α) = (Λxopt, xopt) ≥ inf(Λx, x) : Cx = 0 and C(I − αA)−1x = y= inf(Λx, x) : Cαx = 0 and C(I − αA)−1Aαx = y= |α|−2 inf(Λx, x) : Cx = 0 and C(I − αA)−1Ax = y= |α|−2 inf(ΛAx,Ax) : Cx = 0 and C(I − αA)−1Ax = y≥ |α|−2 inf(Λx, x) : C(I − αA)−1x = y= |α|−2ν(y, α).

Hence |α|2ν ≥ ν = 0. Thus |α| ≥ 1. Since α ∈ D, we arrive at a contradiction.Therefore CB is invertible. This completes the proof.

3. An approximation result

As before, let T be a specified strictly positive Toeplitz matrix on ℓ^2_+(U), and {A, C, Λ} the corresponding data set where Λ = W*TW. In applications one is trying to estimate T from the data set {A, C, Λ}. Recall that T = T_Φ^*T_Φ where Φ is an invertible outer function in H^∞(L(U)). So in practice one is interested in determining how close the α-maximal spectral interpolant Ω^{−1} is to a specified spectral interpolant Φ. The following result shows that if ∆ is approximately equal to d_α^2 Φ(α)*Φ(α), then Ω^{−1} is approximately equal to Φ.

Proposition 3.1. Let Φ be a spectral interpolant for the data set {A, C, Λ}, and Ω^{−1} the α-maximal spectral interpolant. Then the following equality holds:

‖(ΦΩ∆^{1/2} − d_α Φ(α)) d_α ϕ_α y‖^2 = (∆y, y) − d_α^2 ‖Φ(α)y‖^2.    (3.1)

In particular, if α = 0, then we have

‖ΦΩ∆^{1/2}y − Φ(0)y‖^2 = (∆y, y) − ‖Φ(0)y‖^2.    (3.2)

Proof. Notice that C(I − λA)−1xopt = dαϕαΩ∆1/2y. Using the fact that ϕα is areproducing kernel for H2 with C(I − αA)−1xopt = y, we obtain

‖(ΦΩ∆1/2 − dαΦ(α)

)dαϕαy‖2 = ‖ΦC(I − λA)−1xopt − d2

αϕαΦ(α)y‖2

= ‖ΦC(I − λA)−1xopt‖2 + d2α‖Φ(α)y‖2

−2(ΦC(I − λA)−1xopt, d2αϕαΦ(α)y)

= (Λxopt, xopt) + d2α‖Φ(α)y‖2

−2d2α(Φ(α)C(I − αA)−1xopt,Φ(α)y)

= (∆y, y)− d2α‖Φ(α)y‖2 .

Therefore (3.1) holds. This completes the proof.


References

[1] C.I. Byrnes, T.T. Georgiou, and A. Lindquist, A generalized entropy criterion forNevanlinna-Pick interpolation: A convex optimization approach to certain problemsin systems and control, IEEE Transactions on Automatic Control, 42 (2001) pp.822–839.

[2] P.E. Caines, Linear Stochastic Systems, Wiley, New York, 1988.

[3] M.J. Corless and A.E. Frazho, Linear Systems and Control; An Operator Perspective,Marcel Decker, New York, 2003.

[4] R.L. Ellis, I. Gohberg and D.C. Lay, Extensions with positive real part, a new versionof the abstract band method with applications, Integral Equations and OperatorTheory, 16 (1993) pp. 360–384.

[5] C. Foias and A.E. Frazho, The Commutant Lifting Approach to Interpolation Prob-lems, Operator Theory: Advances and Applications, vol. 44, Birkhauser, 1990.

[6] C. Foias, A.E. Frazho, I. Gohberg and M. A. Kaashoek, Metric Constrained Inter-polation, Commutant Lifting and Systems, Operator Theory: Advances and Appli-cations, vol. 100, Birkhauser, 1998.

[7] A.E. Frazho and M.A. Kaashoek, A band method approach to a positive expansionproblem in a unitary dilation setting, Integral Equations and Operator Theory, 42(2002) pp. 311–371.

[8] A.E. Frazho and M.A. Kaashoek, A Naimark dilation perspective of Nevanlinna-Pickinterpolation, Integral Equations and Operator Theory, to appear.

[9] T.T. Georgiou, Spectral estimation via selective harmonic amplification, IEEETransactions on Automatic Control, 46 (2001) pp. 29–42.

[10] T.T. Georgiou, Spectral analysis based on the state covariance: the maximum en-tropy spectrum and linear fractional parameterization, IEEE Transactions on Auto-matic Control, 47, (2002) pp. 1811–1823.

[11] T.T. Georgiou, The structure of state covariances and its relation to the powerspectrum of the input, IEEE Transactions on Automatic Control, 47, (2002) pp.1056–1066.

[12] I. Gohberg, S. Goldberg and M.A. Kaashoek, Classes of Linear Operators I, OperatorTheory: Advances and Applications, 49, Birkhauser Verlag, Basel, 1990.

[13] I. Gohberg, S. Goldberg and M.A. Kaashoek, Classes of Linear Operators II, Oper-ator Theory: Advances and Applications, 63, Birkhauser Verlag, Basel, 1993.

[14] H. Helson and D. Lowdenslager, Prediction theory and Fourier series in several vari-ables, Acta Math., 99 (1958), pp. 165–202.

[15] H. Helson and D. Lowdenslager, Prediction theory and Fourier series in several vari-ables II, Acta Math., 106 (1961), pp. 175–213.

[16] M.A. Kaashoek and C.G. Zeinstra, The band method and generalized Caratheodory-Toeplitz interpolation at operator points, Integral Equations and Operator Theory,33 (1999) pp. 175–210.

[17] T. Kailath, Linear Systems, Englewood Cliffs: Prentice Hall, New Jersey, 1980.

[18] B. Sz.-Nagy and C. Foias, Harmonic Analysis of Operators on Hilbert Space, NorthHolland Publishing Co., Amsterdam-Budapest, 1970.


W. Bhosri
School of Aeronautics and Astronautics
Purdue University
West Lafayette, IN 47907-1282, USA
e-mail: [email protected]

A.E. Frazho
School of Aeronautics and Astronautics
Purdue University
West Lafayette, IN 47907-1282, USA
e-mail: [email protected]

B. Yagci
School of Aeronautics and Astronautics
Purdue University
West Lafayette, IN 47907-1282, USA
e-mail: [email protected]


Operator Theory: Advances and Applications, Vol. 160, 53–79
© 2005 Birkhäuser Verlag Basel/Switzerland

On the Numerical Solution of a Nonlinear Integral Equation of Prandtl’s Type

M.R. Capobianco, G. Criscuolo and P. Junghanns

Dedicated to Professor Israel Gohberg on the Occasion of his 75th Birthday

Abstract. We discuss solvability properties of a nonlinear hypersingular integral equation of Prandtl’s type using monotonicity arguments together with different collocation iteration schemes for the numerical solution of such equations.

Mathematics Subject Classification (2000). Primary 65R20; Secondary 45G05.

Keywords. Nonlinear hypersingular integral equation, Collocation method.

1. Introduction

We are interested in the numerical solution of integral equations of the form

− (ε/π) ∫_{−1}^{1} g(y)/(y − x)^2 dy + γ(x, g(x)) = f(x),  |x| < 1,    (1.1)

where 0 < ε ≤ 1 and the unknown function g satisfies the boundary conditions

g(±1) = 0.    (1.2)

The integral has to be understood as the “finite part” of the strongly singular integral in the sense of Hadamard, who introduced this concept in relation to the Cauchy principal value.

This type of strongly singular integral equation can be used effectively to model many problems in fracture mechanics (see [6, 7, 10, 11] and the references given there). Denote by D the linear Cauchy singular integral operator

(Dg)(x) = (1/π) ∫_{−1}^{1} g(y)/(y − x) dy,  |x| < 1.


Using the boundary conditions (1.2), we can write

(Lg)(x) := (1/π) ∫_{−1}^{1} g(y)/(y − x)^2 dy = (d/dx)(Dg)(x) = (Dg′)(x).    (1.3)

In a two-dimensional crack problem g is the crack opening displacement defined by the density of the distributed dislocations v(x) as

g(x) = − ∫_{−1}^{x} v(y) dy.

If we suppose that the nondimensional half crack length is equal to 1, the parameter ε in (1.1) corresponds to the inverse of the normalized crack length, measured in terms of a physical length parameter which is small relative to the physical crack length. The stress field at a crack tip has a square-root singularity with respect to the distance measured from the crack tip. This requires that the dislocation density v(x) is similarly singular, and it turns out that the Cauchy singular integral remains bounded at the crack tip. Thus, we suppose that

g(x) = ϕ(x)u(x),  ϕ(x) = √(1 − x^2).    (1.4)

Then, by relations (1.3) and (1.4), we can rewrite equation (1.1) as

− (ε/π) (d/dx) ∫_{−1}^{1} ϕ(y)u(y)/(y − x) dy + γ(x, ϕ(x)u(x)) = f(x),  |x| < 1.    (1.5)

Moreover, it can be supposed that the functions f(x) and γ(x, g) will both be nonnegative for physical reasons. This happens since f and γ represent the applied tensile tractions that pull the crack surfaces apart and the stiffness of the reinforcing fibres that resist crack opening, respectively (for more details the reader is referred to [10]). But, in the present paper we will not make use of such nonnegativity assumptions. We are particularly interested in the class of problems for which γ(x, g) is a monotone function with respect to g, i.e.,

[g_1 − g_2][γ(x, g_1) − γ(x, g_2)] ≥ 0,  |x| ≤ 1,  g_1, g_2 ∈ R.

As examples, in the literature γ(x, g) is chosen as follows:

γ(x, g) = Γ(x) g,  |x| ≤ 1,  g ∈ R,    (1.6)

and

γ(x, g) = Γ(x) √|g| sgn g,  |x| ≤ 1,  g ∈ R,    (1.7)

where Γ(x) > 0, |x| ≤ 1. Both cases occur in the analysis of a relatively long crack in unidirectionally reinforced ceramics (see [10]). The case corresponding to (1.6) is extensively treated in [2, 3].

The paper is organized as follows. In Section 2 we study the solvability of (1.5) and smoothness properties of the solutions. In Section 3 the convergence of a collocation method is proved and iteration methods for the solution of the collocation equations are investigated. In the foundation of such iteration methods for nonlinear operator equations the Lipschitz continuity plays an important role. But this Lipschitz continuity is not satisfied in example (1.7). Hence, in Section 4 we use a transformation of the unknown function and study a collocation method for the transformed equation together with an iteration method for solving the respective collocation equations. In the last section we present and discuss the results of some numerical experiments.

2. Solvability and regularity properties of the solution

By p_n^ϕ(x) we denote the normalized Chebyshev polynomial of the second kind,

p_n^ϕ(cos s) = √(2/π) · sin((n + 1)s)/sin s,  n = 0, 1, . . . ,

and by L^2_ϕ the real Hilbert space of all square integrable functions u : (−1, 1) → R with respect to the weight ϕ(x), equipped with the inner product

∫_{−1}^{1} u(x)v(x)ϕ(x) dx = Σ_{n=0}^{∞} ⟨u, p_n^ϕ⟩_ϕ ⟨v, p_n^ϕ⟩_ϕ.

We consider equation (1.1) in the pair X → X* of the real Banach space X = L^{2,1/2}_ϕ and its dual space X* = L^{2,−1/2}_ϕ with respect to the dual product

⟨u, v⟩_ϕ = Σ_{n=0}^{∞} ⟨u, p_n^ϕ⟩_ϕ ⟨v, p_n^ϕ⟩_ϕ,  u ∈ X*, v ∈ X,

where, for s ≥ 0, L^{2,s}_ϕ is the subspace of L^2_ϕ of all u ∈ L^2_ϕ for which

‖u‖_{ϕ,s} := ( Σ_{n=0}^{∞} (n + 1)^{2s} |⟨u, p_n^ϕ⟩_ϕ|^2 )^{1/2} < ∞,

and L^{2,−s}_ϕ := (L^{2,s}_ϕ)*. Write (1.1) in the form

A(u) := εV u + F(u) = f,    (2.1)

where V : L^{2,1/2}_ϕ → L^{2,−1/2}_ϕ is an isometrical isomorphism given by

(V u)(x) = − (1/π) (d/dx) ∫_{−1}^{1} ϕ(y)u(y)/(y − x) dy,  |x| < 1,

or, which is the same, by

V u = Σ_{n=0}^{∞} (n + 1) ⟨u, p_n^ϕ⟩_ϕ p_n^ϕ.    (2.2)

For u, v ∈ L^{2,s}_ϕ, consider the inner product

⟨u, v⟩_{ϕ,s} = Σ_{n=0}^{∞} (1 + n)^{2s} ⟨u, p_n^ϕ⟩_ϕ ⟨v, p_n^ϕ⟩_ϕ.


Note that

|⟨u, v⟩_{ϕ,s}| ≤ ‖u‖_{ϕ,s−t} ‖v‖_{ϕ,s+t},  u ∈ L^{2,s−t}_ϕ, v ∈ L^{2,s+t}_ϕ.    (2.3)

Moreover, for the operator V : L^{2,s+1/2}_ϕ → L^{2,s−1/2}_ϕ defined by (2.2), we have

⟨V u, u⟩_{ϕ,s} = ‖u‖^2_{ϕ,s+1/2},  u ∈ L^{2,s+1/2}_ϕ.    (2.4)

(See [2] for a more detailed analysis of the operator V.) The operator F : X → X* is defined by

(F(u))(x) = γ(x, ϕ(x)u(x)).

With respect to the function γ : [−1, 1] × R → R we can make different assumptions, for example

(A) (g_1 − g_2)[γ(x, g_1) − γ(x, g_2)] ≥ 0,  x ∈ [−1, 1], g_1, g_2 ∈ R,

and

(B) |γ(x, g_1) − γ(x, g_2)| ≤ λ(x) |g_1 − g_2|^α,  x ∈ [−1, 1], g_1, g_2 ∈ R, for some 0 < α ≤ 1, where

c_α := { ∫_{−1}^{1} [λ(x)]^{2/(1−α)} [ϕ(x)]^{(1+α)/(1−α)} dx,   0 < α < 1,
       { sup{ λ(x)ϕ(x) : −1 ≤ x ≤ 1 },                          α = 1,

satisfies c_α < ∞, and γ(·, 0) ∈ L^2_ϕ.

In any case we assume that y → γ(x, y) is continuous on R for almost all x ∈ [−1, 1] and that x → γ(x, y) is measurable for all y ∈ R. The following definitions are taken from [18].

Definition 2.1. An operator A : X → X* is called

– hemicontinuous, if the function s → ⟨A(u + sv), w⟩ is continuous on [0, 1] for any fixed u, v, w ∈ X;
– strictly monotone, if ⟨A(u) − A(v), u − v⟩ > 0 for all u, v ∈ X with u ≠ v;
– strongly monotone, if there exists a constant m > 0 such that ⟨A(u) − A(v), u − v⟩ ≥ m ‖u − v‖^2_X for all u, v ∈ X;
– coercive, if there exists a function ρ : [0, ∞) → R satisfying lim_{s→∞} ρ(s) = ∞ and ⟨A(u), u⟩ ≥ ρ(‖u‖_X) ‖u‖_X for all u ∈ X.

Lemma 2.2. If (A) is fulfilled and if F maps X into X∗ , then the operator A :X −→ X∗ in (2.1) is strongly monotone (with m = ε) for each ε > 0.


Proof. Let u, v ∈ X . In view of (A) and (2.2), we have

〈A(u)−A(v), u − v〉ϕ ≥ ε

∞∑n=0

(n + 1)| 〈u− v, pϕn〉ϕ |2 = ε ||u− v||2ϕ, 1

2,

which proves the lemma.

Lemma 2.3. If (B) is fulfilled, then the operator F maps L2ϕ into L2

ϕ , where F :L2

ϕ −→ L2ϕ is Holder continuous with exponent α .

Proof. For u, v ∈ L2ϕ , we get

‖F (u)− F (v)‖2ϕ ≤

∫ 1

−1

[λ(x)]2[ϕ(x)]1+2α|u(x)− v(x)|2α dx

≤ c1−αα ‖u− v‖2α

ϕ

in case 0 < α < 1 and

‖F (u)− F (v)‖2ϕ ≤ c21 ‖u− v‖2

ϕ

in case α = 1 . In particular, for v = 0 , we have

‖F (u)− F (0)‖ϕ ≤ const ‖u‖αϕ ,

which together with γ(., 0) ∈ L2ϕ implies F (u) ∈ L2

ϕ for all u ∈ L2ϕ . The Lemma

is proved.

Corollary 2.4 ([18], Theorem 26.A). If the assumptions (A) and (B) are fulfilled, then the operator $A$ is also coercive, and equation (2.1) has a unique solution in $X$ for each $f\in X^*$ and $\varepsilon>0$.

Corollary 2.5. Let the assumptions (A) and (B) be fulfilled. Moreover, let $f\in L^{2,s}_\varphi$ for some $s>0$. If $u\in L^{2,1}_\varphi$ implies $F(u)\in L^{2,s}_\varphi$, then the unique solution $u^*\in X$ of equation (2.1) belongs to $L^{2,s+1}_\varphi$.

Proof. Since $u^*\in X$ and $f\in L^{2,s}_\varphi\subset L^2_\varphi$, we have, due to $F(u^*)\in L^2_\varphi$ (see Lemma 2.3), also $Vu^*\in L^2_\varphi$, which implies $u^*\in L^{2,1}_\varphi$. Consequently, $F(u^*)\in L^{2,s}_\varphi$, and so $Vu^*\in L^{2,s}_\varphi$ and $u^*\in L^{2,s+1}_\varphi$.

The previous corollary can be generalized in the following way.

Corollary 2.6. Let the assumptions (A) and (B) be fulfilled. Moreover, let $f\in L^{2,s}_\varphi$ for some $s>1$. If there is a $t_0\in(0,1]$ such that $u\in L^{2,1}_\varphi$ implies $F(u)\in L^{2,t_0}_\varphi$, and such that $u\in L^{2,t_0+r}_\varphi$ implies $F(u)\in L^{2,\min\{t_0+r,s\}}_\varphi$ for $r=1,2,\ldots$, then the solution $u^*$ of equation (2.1) belongs to $L^{2,s+1}_\varphi$.

Proof. As in the proof of Cor. 2.5 we get $u^*\in L^{2,1}_\varphi\subset L^{2,t_0}_\varphi$. This implies $Vu^*\in L^{2,t_0}_\varphi$ and $u^*\in L^{2,t_0+1}_\varphi$, and so on.


Remark 2.7. If $\gamma(x,g)$ fulfils assumption (B) and if the partial derivatives $\gamma_x(x,g)$ and $\gamma_g(x,g)$ are continuous functions on $(-1,1)\times\mathbb R$ satisfying
\[ |\gamma_x(x,g)|^2\le\frac{\mathrm{const}}{(1-x^2)^2}\Bigl[(1-x^2)^{\delta-\frac12}+g^2\Bigr]\quad\text{and}\quad|\gamma_g(x,g)|\le\mathrm{const} \]
for some $\delta>0$, then $u\in L^{2,1}_\varphi$ implies $F(u)\in L^{2,1}_\varphi$.

Proof. As a result of [1] (see pp. 196, 197) we have that the condition $u\in L^{2,1}_\varphi$ is equivalent to $u\in L^2_\varphi$ and $u'\in L^2_{\varphi^3}$. Due to Lemma 2.3 it remains to show that $u\in L^{2,1}_\varphi$ implies $h'\in L^2_{\varphi^3}$, where
\[ h(x)=\gamma\bigl(x,u(x)\sqrt{1-x^2}\bigr)\,. \]
By our assumptions it follows that
\begin{align*}
|h'(x)|^2(1-x^2)^{\frac32}
&\le\mathrm{const}\Bigl\{\bigl|\gamma_x\bigl(x,u(x)\sqrt{1-x^2}\bigr)\bigr|^2(1-x^2)^{\frac32}\\
&\qquad+\bigl|\gamma_g\bigl(x,u(x)\sqrt{1-x^2}\bigr)\bigr|^2\Bigl(|u'(x)|^2(1-x^2)+\frac{|xu(x)|^2}{1-x^2}\Bigr)(1-x^2)^{\frac32}\Bigr\}\\
&\le\mathrm{const}\Bigl[(1-x^2)^{\delta-1}+|u(x)|^2(1-x^2)^{\frac12}+|u'(x)|^2(1-x^2)^{\frac52}\Bigr]\,,
\end{align*}
thus $|h'(x)|^2(1-x^2)^{\frac32}$ is summable.

Remark 2.8. Based on some results of [17], in [13] it is shown that, for any $f\in L^q_\mu$, problem (1.1), (1.2) has a unique solution $g\in L^p_\sigma$ with $g'\in L^q_\mu$, provided $\gamma(x,g)$ is non-decreasing in $g\in\mathbb R$ for almost all $x\in(-1,1)$, provided
\[ |\gamma(x,g)|\le A(x)+B\,\sigma(x)\,|g|^{p-1}\,, \]
where $p>1$, $A\in L^q_\mu$, $B>0$, and provided, if $p>2$,
\[ g\,\gamma(x,g)\ge C\,\sigma(x)\,|g|^p-D(x)\,, \]
where $D\in L^1$ and $C>0$. Here, the norm in $L^p_\psi$ is defined by
\[ \|g\|_{L^p_\psi}=\Bigl(\int_{-1}^1|g(x)|^p\,\psi(x)\,dx\Bigr)^{\frac1p}\,, \]
$p^{-1}+q^{-1}=1$, and the weight functions are chosen as
\[ \sigma(x)=\bigl(1-x^2\bigr)^{-\frac12}\,,\qquad\mu(x)=\bigl(1-x^2\bigr)^{\frac{q-1}2}\,. \]

Note that, by means of the substitution $x=\cos\tau$, problem (1.1), (1.2) can be written equivalently as
\[ -\frac{\varepsilon\sin\tau}{\pi}\int_0^\pi\frac{\tilde g'(\sigma)\,d\sigma}{\cos\tau-\cos\sigma}+\tilde\gamma\bigl(\tau,\tilde g(\tau)\bigr)=\tilde f(\tau)\,,\qquad0<\tau<\pi\,, \tag{2.5} \]


with $\tilde g(0)=\tilde g(\pi)=0$, where $\tilde g(\tau)=g(\cos\tau)$, $\tilde\gamma(\tau,g)=\gamma(\cos\tau,g)\sin\tau$, and $\tilde f(\tau)=f(\cos\tau)\sin\tau$.

Corollary 2.9 ([13], Concl. 1). Let $\gamma(x,g)$ be a monotone Carathéodory function satisfying
\[ |\tilde\gamma(\tau,g)|\le a(\tau)+B\,|g|^{p-1}\quad\text{and, if }p>1,\quad g\,\tilde\gamma(\tau,g)\ge C\,|g|^p-d(\tau) \]
with some $a\in L^q$, $d\in L^1$, and constants $B,C>0$. Then, for any $\tilde f\in L^q$, problem (2.5) has a unique solution $\tilde g\in L^p$ with $\tilde g'\in L^q$, where $p>1$, $p^{-1}+q^{-1}=1$.

3. A collocation method

Denote by
\[ x^\varphi_{nk}=\cos\frac{k\pi}{n+1}\,,\qquad k=1,\ldots,n\,, \]
the zeros of the $n$th orthonormal polynomial $p^\varphi_n(x)$. Let $X_n$ denote the space of all algebraic polynomials of degree less than $n$ and let $L^\varphi_n$ be the Lagrange interpolation operator onto $X_n$ with respect to the nodes $x^\varphi_{nk}$, $k=1,\ldots,n$. We recall that $L^\varphi_n$ is defined by
\[ L^\varphi_n(f;x)=\sum_{k=1}^nf(x^\varphi_{nk})\,\varphi_{nk}(x)\,,\qquad\varphi_{nk}(x)=\prod_{r=1,\,r\ne k}^n\frac{x-x^\varphi_{nr}}{x^\varphi_{nk}-x^\varphi_{nr}}\,. \]

Moreover, in order to prove the convergence of the collocation method, we recall a well-known result on Lagrange interpolation.

Lemma 3.1 (see [1, 4, 8]). For $s>\frac12$ and for all $f\in L^{2,s}_\varphi$, we have
\[ \lim_{n\to\infty}\|f-L^\varphi_nf\|_{\varphi,s}=0\quad\text{and}\quad\|f-L^\varphi_nf\|_{\varphi,r}\le\mathrm{const}\,n^{r-s}\,\|f\|_{\varphi,s}\,,\qquad0\le r\le s\,. \]

We look for an approximate solution $u_n\in X_n$ to the solution of equation (2.1) by solving the collocation equations
\[ A_n(u_n):=\varepsilon Vu_n+F_n(u_n)=L^\varphi_nf\,,\qquad u_n\in X_n\,, \tag{3.1} \]
where $F_n(u_n):=L^\varphi_nF(u_n)$.
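The quantities entering (3.1) are easy to generate numerically. The following sketch (our own illustration in NumPy, not code from the paper) builds the nodes $x^\varphi_{nk}$, the Gauss weights for the weight $\varphi$ (they appear as the Christoffel numbers $\lambda^\varphi_{nk}$ in the next proof), and the interpolation operator $L^\varphi_n$, and checks the exactness of the quadrature rule and the reproduction of polynomials of degree less than $n$.

```python
import numpy as np

def chebyshev_u_nodes(n):
    """Zeros x_{nk} = cos(k*pi/(n+1)), k = 1,...,n, of the n-th orthonormal
    polynomial with respect to the weight phi(x) = sqrt(1 - x^2)."""
    k = np.arange(1, n + 1)
    return np.cos(k * np.pi / (n + 1))

def christoffel_numbers(n):
    """Gauss weights lambda_{nk} = pi*sin^2(k*pi/(n+1))/(n+1) for the weight phi."""
    k = np.arange(1, n + 1)
    return np.pi * np.sin(k * np.pi / (n + 1)) ** 2 / (n + 1)

def lagrange_interpolant(nodes, values):
    """Callable for the Lagrange interpolation polynomial L_n^phi(f; .)."""
    def L(x):
        x = np.atleast_1d(x).astype(float)
        out = np.zeros_like(x)
        for k, xk in enumerate(nodes):
            ell = np.ones_like(x)
            for r, xr in enumerate(nodes):
                if r != k:
                    ell *= (x - xr) / (xk - xr)
            out += values[k] * ell
        return out
    return L

if __name__ == "__main__":
    n = 8
    x, lam = chebyshev_u_nodes(n), christoffel_numbers(n)
    # Gauss rule: sum_k lambda_{nk} q(x_{nk}) = int_{-1}^{1} phi(x) q(x) dx for deg q <= 2n-1
    print(lam.sum(), np.pi / 2)              # q = 1:   pi/2
    print((lam * x ** 2).sum(), np.pi / 8)   # q = x^2: pi/8
    f = lambda t: 3 * t ** 5 - t ** 2 + 1    # degree < n, reproduced exactly
    L = lagrange_interpolant(x, f(x))
    t = np.linspace(-1, 1, 7)
    print(np.max(np.abs(L(t) - f(t))))       # ~ machine precision
```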

Theorem 3.2. Consider equation (2.1) for a function $f:(-1,1)\longrightarrow\mathbb R$. Assume that the conditions (A) and (B) are satisfied. Then the equations (3.1) have a unique solution $u^*_n\in X_n$. If the solution $u^*\in X$ of (2.1) belongs to $L^{2,s+1}_\varphi$ for some $s>\frac12$, then the solutions $u^*_n$ converge in $X$ to $u^*$, with
\[ \|u^*_n-u^*\|_{\varphi,\frac12}\le\mathrm{const}\,n^{-s}\,\|u^*\|_{\varphi,s+1}\,, \]
where the constant does not depend on $n$, $\varepsilon$, and $u^*$.


Proof. At first we observe that, for $u_n,v_n\in X_n$,
\[ \langle F_n(u_n)-F_n(v_n),u_n-v_n\rangle_\varphi=\sum_{k=1}^n\lambda^\varphi_{nk}\bigl[\gamma\bigl(x^\varphi_{nk},\varphi(x^\varphi_{nk})\xi_k\bigr)-\gamma\bigl(x^\varphi_{nk},\varphi(x^\varphi_{nk})\eta_k\bigr)\bigr](\xi_k-\eta_k)\ \ge\ 0\,, \tag{3.2} \]
where $\lambda^\varphi_{nk}=\pi[\varphi(x^\varphi_{nk})]^2/(n+1)$ are the Christoffel numbers with respect to the weight function $\varphi(x)$, and $\xi_k=u_n(x^\varphi_{nk})$, $\eta_k=v_n(x^\varphi_{nk})$. This implies the strong monotonicity of the operator $\varepsilon V+F_n:X_n\subset X\longrightarrow X_n\subset X^*$. Moreover, the estimate (here let $0<\alpha<1$; for the case $\alpha=1$ see the proof of Cor. 3.4)
\begin{align*}
\|F_n(u_n)-F_n(v_n)\|^2_\varphi
&=\sum_{k=1}^n\lambda^\varphi_{nk}\,\bigl|\gamma\bigl(x^\varphi_{nk},\varphi(x^\varphi_{nk})\xi_k\bigr)-\gamma\bigl(x^\varphi_{nk},\varphi(x^\varphi_{nk})\eta_k\bigr)\bigr|^2\\
&\le\sum_{k=1}^n\lambda^\varphi_{nk}\,[\lambda(x^\varphi_{nk})]^2\,[\varphi(x^\varphi_{nk})]^{2\alpha}\,|\xi_k-\eta_k|^{2\alpha}\\
&\le\Bigl(\sum_{k=1}^n\bigl((\lambda^\varphi_{nk})^{1-\alpha}[\lambda(x^\varphi_{nk})]^2[\varphi(x^\varphi_{nk})]^{2\alpha}\bigr)^{\frac1{1-\alpha}}\Bigr)^{1-\alpha}\Bigl(\sum_{k=1}^n\lambda^\varphi_{nk}|\xi_k-\eta_k|^2\Bigr)^{\alpha}\\
&=:c^{(n)}_\alpha\,\|u_n-v_n\|^{2\alpha}_\varphi
\end{align*}
gives the Hölder continuity of $F_n:X_n\subset L^2_\varphi\longrightarrow X_n\subset L^2_\varphi$. Thus, equation (3.1) is uniquely solvable (comp. Cor. 2.4). With the help of
\[ L^\varphi_nF(u^*)=L^\varphi_nF(L^\varphi_nu^*)=F_n(L^\varphi_nu^*) \]
we can estimate
\begin{align*}
\varepsilon\,\|u^*_n-L^\varphi_nu^*\|^2_{\varphi,\frac12}
&\le\bigl\langle\varepsilon Vu^*_n+F_n(u^*_n)-\varepsilon VL^\varphi_nu^*-F_n(L^\varphi_nu^*),\,u^*_n-L^\varphi_nu^*\bigr\rangle_\varphi\\
&=\bigl\langle L^\varphi_nf-\varepsilon VL^\varphi_nu^*-L^\varphi_nF(u^*),\,u^*_n-L^\varphi_nu^*\bigr\rangle_\varphi\\
&=\varepsilon\,\bigl\langle L^\varphi_nVu^*-VL^\varphi_nu^*,\,u^*_n-L^\varphi_nu^*\bigr\rangle_\varphi\\
&\le\varepsilon\Bigl(\|L^\varphi_nVu^*-Vu^*\|_{\varphi,-\frac12}+\|u^*-L^\varphi_nu^*\|_{\varphi,\frac12}\Bigr)\|u^*_n-L^\varphi_nu^*\|_{\varphi,\frac12}\,.
\end{align*}
Hence, taking into account $\|L^\varphi_nVu^*-Vu^*\|_{\varphi,-\frac12}\le\|L^\varphi_nVu^*-Vu^*\|_\varphi$ and Lemma 3.1, we obtain
\[ \|u^*_n-L^\varphi_nu^*\|_{\varphi,\frac12}\le\mathrm{const}\Bigl(n^{-s}\,\|Vu^*\|_{\varphi,s}+n^{-s-\frac12}\,\|u^*\|_{\varphi,s+1}\Bigr)\,, \]
and the theorem is proved.

Corollary 3.3. Additionally to the assumptions of Theorem 3.2 assume that there is an $r\ge\frac12$ such that
\[ \langle F_n(u_n)-F_n(v_n),u_n-v_n\rangle_{\varphi,r}\ge0\,,\qquad u_n,v_n\in X_n\,,\ n\ge n_0\,, \tag{3.3} \]
and such that $s\ge r-\frac12$. Then
\[ \|u^*_n-u^*\|_{\varphi,r+\frac12}\le\mathrm{const}\,n^{r-\frac12-s}\,\|u^*\|_{\varphi,s+1}\,, \tag{3.4} \]
where the constant does not depend on $n$, $\varepsilon$, and $u^*$.

Proof. Using (2.4), (2.3), and Lemma 3.1 we get, analogously to the proof of Theorem 3.2,
\begin{align*}
\|u^*_n-L^\varphi_nu^*\|^2_{\varphi,r+\frac12}
&\le\bigl\langle L^\varphi_nVu^*-VL^\varphi_nu^*,\,u^*_n-L^\varphi_nu^*\bigr\rangle_{\varphi,r}\\
&\le\Bigl(\|L^\varphi_nVu^*-Vu^*\|_{\varphi,r-\frac12}+\|V(u^*-L^\varphi_nu^*)\|_{\varphi,r-\frac12}\Bigr)\|u^*_n-L^\varphi_nu^*\|_{\varphi,r+\frac12}\\
&\le\mathrm{const}\Bigl(n^{r-\frac12-s}\,\|Vu^*\|_{\varphi,s}+n^{r+\frac12-s-1}\,\|u^*\|_{\varphi,s+1}\Bigr)\|u^*_n-L^\varphi_nu^*\|_{\varphi,r+\frac12}\,.
\end{align*}
Since, again due to Lemma 3.1, $\|u^*-L^\varphi_nu^*\|_{\varphi,r+\frac12}\le\mathrm{const}\,n^{r-\frac12-s}\,\|u^*\|_{\varphi,s+1}$, the assertion is proved.

Let us discuss the question how we can solve the collocation equations (3.1). Although Theorem 3.2 holds for all $0<\alpha\le1$, in what follows we need to assume $\alpha=1$.

Corollary 3.4. If condition (B) with $\alpha=1$ is fulfilled, then the operator $A:X\longrightarrow X^*$ as well as the operators $A_n:X_n\subset X\longrightarrow X_n\subset X^*$ are Lipschitz continuous with constant $c_1+\varepsilon$.

Proof. We give the proof for the operator $A_n$; the proof for $A$ is analogous. Let $u^1_n,u^2_n,u_n\in X_n$. Then
\begin{align*}
\bigl|\bigl\langle F_n(u^1_n)-F_n(u^2_n),u_n\bigr\rangle_\varphi\bigr|
&\le\sum_{k=1}^n\lambda^\varphi_{nk}\bigl|\gamma\bigl(x^\varphi_{nk},\varphi(x^\varphi_{nk})u^1_n(x^\varphi_{nk})\bigr)-\gamma\bigl(x^\varphi_{nk},\varphi(x^\varphi_{nk})u^2_n(x^\varphi_{nk})\bigr)\bigr|\,|u_n(x^\varphi_{nk})|\\
&\le\sum_{k=1}^n\lambda^\varphi_{nk}\,\lambda(x^\varphi_{nk})\,\varphi(x^\varphi_{nk})\,\bigl|u^1_n(x^\varphi_{nk})-u^2_n(x^\varphi_{nk})\bigr|\,|u_n(x^\varphi_{nk})|\\
&\le c_1\,\|u^1_n-u^2_n\|_\varphi\,\|u_n\|_\varphi\ \le\ c_1\,\|u^1_n-u^2_n\|_{\varphi,\frac12}\,\|u_n\|_{\varphi,\frac12}\,,
\end{align*}
and we are done.

Remark that $V$ is just the dual mapping between the spaces $X$ and $X^*$ as well as between $X_n\subset X$ and $X_n\subset X^*$. Hence, we consider (for some fixed $t>0$) the equations
\[ u_n=u_n-tV^{-1}\bigl[\varepsilon Vu_n+F_n(u_n)-L^\varphi_nf\bigr]=:B_n(u_n)\,, \tag{3.5} \]


which are equivalent to (3.1). If we choose $t\in(0,t_\varepsilon)$ with $t_\varepsilon=2\varepsilon/(c_1^2+\varepsilon^2)$, then the operator $B_n:X_n\subset X\longrightarrow X_n\subset X$ is a $k_\varepsilon$-contractive mapping with $k_\varepsilon=\sqrt{(1-t\varepsilon)^2+t^2c_1^2}<1$, i.e.,
\[ \|B_n(u^1_n)-B_n(u^2_n)\|_{\varphi,\frac12}\le k_\varepsilon\,\|u^1_n-u^2_n\|_{\varphi,\frac12}\,. \tag{3.6} \]
This follows from
\begin{align*}
\|B_n(u^1_n)-B_n(u^2_n)\|^2_{\varphi,\frac12}
&=(1-t\varepsilon)^2\,\|u^1_n-u^2_n\|^2_{\varphi,\frac12}-2t(1-t\varepsilon)\,\bigl\langle V^{-1}[F_n(u^1_n)-F_n(u^2_n)],u^1_n-u^2_n\bigr\rangle_{\varphi,\frac12}\\
&\qquad+t^2\,\bigl\|V^{-1}[F_n(u^1_n)-F_n(u^2_n)]\bigr\|^2_{\varphi,\frac12}\\
&=(1-t\varepsilon)^2\,\|u^1_n-u^2_n\|^2_{\varphi,\frac12}-2t(1-t\varepsilon)\,\bigl\langle F_n(u^1_n)-F_n(u^2_n),u^1_n-u^2_n\bigr\rangle_\varphi\\
&\qquad+t^2\,\bigl\|V^{-1}[F_n(u^1_n)-F_n(u^2_n)]\bigr\|^2_{\varphi,\frac12}\\
&\le k_\varepsilon^2\,\|u^1_n-u^2_n\|^2_{\varphi,\frac12}\,,
\end{align*}
taking into account (3.2) and condition (B) with $\alpha=1$. Consequently, under the assumptions (A) and (B) with $\alpha=1$ the collocation equations (3.1) can be solved by applying the method of successive approximation to the fixed point equation (3.5). The smallest possible $k_\varepsilon$ for given $\varepsilon$ and $c_1$ is equal to
\[ k^*_\varepsilon=\frac{c_1}{\sqrt{\varepsilon^2+c_1^2}}\qquad\Bigl(\;\Longleftrightarrow\;t=t^*_\varepsilon=\frac{\varepsilon}{\varepsilon^2+c_1^2}\Bigr)\,. \tag{3.7} \]

Remark that, if we directly apply the formulas of [18, Sect. 25.4] to the fixed point equation (3.5), we obtain, for $t\in(0,t_\varepsilon)$ with $t_\varepsilon=2\varepsilon/(\varepsilon+c_1)^2$, the contraction constant $k_\varepsilon=\sqrt{1-2t\varepsilon+t^2(\varepsilon+c_1)^2}$, whose minimal value is
\[ k^*_\varepsilon=\sqrt{1-\Bigl(\frac{\varepsilon}{\varepsilon+c_1}\Bigr)^2}\qquad\Bigl(\;\Longleftrightarrow\;t=t^*_\varepsilon=\frac{\varepsilon}{(\varepsilon+c_1)^2}\Bigr)\,. \]
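For orientation (our own arithmetic, with the value $c_1=1$ quoted in Example 1 of Section 5): for $\varepsilon=1$ and $c_1=1$, formula (3.7) gives
\[ t^*_\varepsilon=\frac{\varepsilon}{\varepsilon^2+c_1^2}=\frac12\,,\qquad k^*_\varepsilon=\frac{c_1}{\sqrt{\varepsilon^2+c_1^2}}=\frac1{\sqrt2}\approx0.707\,, \]
while $t_\varepsilon=2\varepsilon/(c_1^2+\varepsilon^2)=1$, so the choice $t=0.75=1.5\,t^*_\varepsilon$ used in Example 1 still lies in $(0,t_\varepsilon)$.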

If we seek the solution of (3.1) in the form
\[ u_n(x)=\sum_{k=1}^n\xi_{nk}\,\varphi_{nk}(x)\,, \]
then (3.1) can be written as
\[ \varepsilon\,V_n\Lambda_n\xi_n+F_n(\xi_n)=\eta_n\,,\qquad\xi_n=[\xi_{nk}]_{k=1}^n\,, \tag{3.8} \]
where $\eta_n=[f(x^\varphi_{nk})]_{k=1}^n$, $V_n=U_n^TD_nU_n$, $U_n=\bigl[p^\varphi_j(x^\varphi_{nk})\bigr]_{j=0,\,k=1}^{n-1,\,n}$, $D_n=\mathrm{diag}[1,\ldots,n]$, $\Lambda_n=\mathrm{diag}[\lambda^\varphi_{n1},\ldots,\lambda^\varphi_{nn}]$, and $F_n(\xi_n)=\bigl[\gamma(x^\varphi_{nk},\varphi(x^\varphi_{nk})\xi_{nk})\bigr]_{k=1}^n$ (see [2, (4.12)]). We recall that, due to the orthogonality relations of the $p^\varphi_j$, we have
\[ U_n\Lambda_nU_n^T=I_n=:[\delta_{jk}]_{j,k=1}^n\,. \tag{3.9} \]


Thus, the fixed point iteration for (3.5) takes the form
\[ \xi^{(m+1)}_n=(1-t\varepsilon)\,\xi^{(m)}_n-t\,\Lambda_n^{-1}V_n^{-1}\bigl[F_n(\xi^{(m)}_n)-\eta_n\bigr]\,,\qquad m=0,1,\ldots\,, \tag{3.10} \]
where, taking into account (3.9),
\[ \Lambda_n^{-1}V_n^{-1}=\Lambda_n^{-1}U_n^{-1}D_n^{-1}(U_n^T)^{-1}=U_n^TD_n^{-1}U_n\Lambda_n\,. \]
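A minimal dense-matrix sketch of the iteration (3.10) (our own illustration; the stopping test is a plain Euclidean norm of nodal values, and the fast realization of $\widetilde U_n$ discussed next is not used):

```python
import numpy as np

def collocation_matrices(n):
    """Nodes x_{nk}, Christoffel numbers lambda_{nk}, and U_n[j,k] = p_j^phi(x_{nk})
    for degrees j = 0,...,n-1, as in (3.8)."""
    theta = np.arange(1, n + 1) * np.pi / (n + 1)
    x = np.cos(theta)
    lam = np.pi * np.sin(theta) ** 2 / (n + 1)
    j = np.arange(n)[:, None]
    # orthonormal second-kind Chebyshev polynomials: p_j^phi(cos t) = sqrt(2/pi) sin((j+1)t)/sin t
    U = np.sqrt(2.0 / np.pi) * np.sin((j + 1) * theta) / np.sin(theta)
    return x, lam, U

def solve_collocation_310(n, gamma, f, eps, t, toll=1e-12, maxit=10_000):
    """Successive approximation (3.10) for the collocation system (3.8)."""
    x, lam, U = collocation_matrices(n)
    eta = f(x)
    # Lambda_n^{-1} V_n^{-1} = U_n^T D_n^{-1} U_n Lambda_n, by (3.9)
    W = U.T @ np.diag(1.0 / np.arange(1, n + 1)) @ U @ np.diag(lam)
    phi = np.sqrt(1.0 - x ** 2)
    xi = np.zeros(n)
    for m in range(1, maxit + 1):
        xi_new = (1.0 - t * eps) * xi - t * (W @ (gamma(x, phi * xi) - eta))
        if np.linalg.norm(xi_new - xi) < toll:
            return xi_new, m
        xi = xi_new
    return xi, maxit

# usage with gamma(x,g) = g/sqrt(1-x^2) and the right-hand side f_b(x) = x|x| of Example 1(b):
# xi, N = solve_collocation_310(63, lambda x, g: g / np.sqrt(1 - x ** 2),
#                               lambda x: x * np.abs(x), eps=1.0, t=0.75)
```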

Remark that the matrix $U_n$ can be written as
\[ U_n=\widetilde U_n\widetilde D_n^{-1}\,,\qquad\widetilde U_n=\sqrt{\frac2\pi}\Bigl[\sin\frac{jk\pi}{n+1}\Bigr]_{j,k=1}^n\,,\qquad\widetilde D_n=\mathrm{diag}\Bigl[\sin\frac{k\pi}{n+1}\Bigr]_{k=1}^n\,, \]
and that the matrix $\widetilde U_n$ can be applied to a vector with $O(n\log n)$ computational complexity (see [15, 16]).

Remark 3.5. In [13] the approximate solution is represented in the form
\[ v_n(\tau)=\varphi(\cos\tau)\,u_n(\cos\tau)=\sin\tau\;u_n(\cos\tau)\,. \]
With $\zeta_{nk}=v_n\bigl(\tfrac{k\pi}{n+1}\bigr)$ we get $\zeta_n=\widetilde D_n\xi_n$, so that (3.8) (in case $\varepsilon=1$) is equivalent to (use $\Lambda_n\widetilde D_n^{-1}=\tfrac\pi{n+1}\widetilde D_n$)
\[ \varepsilon\,A_n\zeta_n+\Phi_n(\zeta_n)=\tilde\eta_n\,, \tag{3.11} \]
where
\[ A_n=[\alpha^n_{jk}]_{j,k=1}^n=\frac\pi{n+1}\,\widetilde U_n^TD_n\widetilde U_n=\frac2{n+1}\Bigl[\sum_{\ell=1}^n\ell\,\sin\frac{j\ell\pi}{n+1}\,\sin\frac{\ell k\pi}{n+1}\Bigr]_{j,k=1}^n\,, \]
\[ \Phi_n(\zeta_n)=\bigl[\Phi_{nj}(\zeta_{nj})\bigr]_{j=1}^n=\bigl[\varphi(x^\varphi_{nj})\,\gamma\bigl(x^\varphi_{nj},\zeta_{nj}\bigr)\bigr]_{j=1}^n\,,\qquad\tilde\eta_n=\widetilde D_n\eta_n\,. \]
In [13] it is shown that the sequence $\{\zeta^m_n\}_{m=1}^\infty$ defined by the nonlinear Jacobi iteration
1. solve $\alpha^n_{jj}\widehat\zeta_{nj}+\Phi_{nj}(\widehat\zeta_{nj})=\tilde\eta_{nj}-\displaystyle\sum_{k=1,\,k\ne j}^n\alpha^n_{jk}\,\zeta^m_{nk}$, $j=1,\ldots,n$,
2. set $\zeta^{m+1}_{nj}=\zeta^m_{nj}+\omega\,(\widehat\zeta_{nj}-\zeta^m_{nj})$, $j=1,\ldots,n$,
where $\omega$ is taken from the interval $(0,1]$, converges to the unique solution $\zeta^*_n$ of (3.11) for any $\zeta^0_n\in\mathbb R^n$, if $\gamma(x,g)$ fulfills condition (A), is continuous in $g\in\mathbb R$ for all $x\in[-1,1]$ and bounded in $x\in[-1,1]$ for all $g\in\mathbb R$, and if $f(x)$ is bounded in $x\in[-1,1]$.
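A minimal sketch of this nonlinear Jacobi iteration (our own illustration; the scalar equations of step 1 are solved by bisection on a fixed bracket, and any scalar root finder would do):

```python
import numpy as np

def solve_scalar(a, Phi_j, rhs, lo=-1e6, hi=1e6):
    """Solve a*z + Phi_j(z) = rhs by bisection; the left-hand side is
    non-decreasing since a = alpha_{jj}^n > 0 and Phi_j is monotone by (A)."""
    F = lambda z: a * z + Phi_j(z) - rhs
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if F(mid) > 0.0:
            hi = mid
        else:
            lo = mid
    return 0.5 * (lo + hi)

def nonlinear_jacobi(A, Phi, eta, omega=1.0, toll=1e-12, maxit=100_000):
    """Relaxed nonlinear Jacobi iteration for eps*A_n zeta + Phi_n(zeta) = eta (eps = 1).
    A is the matrix A_n of (3.11); Phi is a list of the scalar functions Phi_{nj}."""
    n = len(eta)
    zeta = np.zeros(n)
    for _ in range(maxit):
        zeta_hat = np.empty(n)
        for j in range(n):
            rhs = eta[j] - (A[j] @ zeta - A[j, j] * zeta[j])   # subtract off-diagonal part only
            zeta_hat[j] = solve_scalar(A[j, j], Phi[j], rhs)
        zeta_new = zeta + omega * (zeta_hat - zeta)
        if np.linalg.norm(zeta_new - zeta) < toll:
            return zeta_new
        zeta = zeta_new
    return zeta
```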

Instead of (3.1), (3.5) let us consider the collocation-iteration scheme
\[ u_m=L^\varphi_{n_m}\bigl[u_{m-1}-tV^{-1}\bigl(\varepsilon Vu_{m-1}+F(u_{m-1})-f\bigr)\bigr]\,,\qquad m=1,2,\ldots\,, \]
where $1<n_0<n_1<n_2<\cdots$ and $u_0\equiv0$. This is equivalent to
\[ u_m=u_{m-1}-tV^{-1}\bigl[\varepsilon Vu_{m-1}+F_{n_m}(u_{m-1})-L^\varphi_{n_m}f\bigr]=:T_m(u_{m-1}) \tag{3.12} \]


with $T_mu=B_{n_m}(P_{n_m}u)$, where $P_n:L^{2,\frac12}_\varphi\longrightarrow L^{2,\frac12}_\varphi$ denotes the projection
\[ P_nu=\sum_{k=0}^{n-1}\langle u,p^\varphi_k\rangle_\varphi\,p^\varphi_k\,. \]
Due to $\|P_n\|_{L^{2,\frac12}_\varphi\to L^{2,\frac12}_\varphi}=1$ and (3.6), the operator $T_m:L^{2,\frac12}_\varphi\longrightarrow L^{2,\frac12}_\varphi$ is a $k_\varepsilon$-contractive operator. Thus, the equation
\[ v_m=T_m(v_m)\,,\qquad v_m\in L^{2,\frac12}_\varphi\,, \]
has a unique solution $v^*_m$, which is nothing else than the solution of the collocation method (3.1) for $n=n_m$. Hence, under the assumptions of Theorem 3.2 with (B) in the case $\alpha=1$,
\[ \|v^*_m-u^*\|_{\varphi,\frac12}=O\bigl(n_m^{-s}\bigr)\,. \tag{3.13} \]
Now, for $m=2,3,\ldots$,
\begin{align*}
\|u_m-v^*_m\|&\le\|T_m(u_{m-1})-T_m(v^*_m)\|\le k_\varepsilon\,\|u_{m-1}-v^*_m\|\\
&\le k_\varepsilon\bigl(\|u_{m-1}-v^*_{m-1}\|+\|v^*_{m-1}-v^*_m\|\bigr)
\end{align*}
and, by repeating this,
\[ \|u_m-v^*_m\|\le k_\varepsilon^{m-1}\,\|u_1-v^*_1\|+\sum_{\ell=1}^{m-1}k_\varepsilon^{m-\ell}\,\|v^*_\ell-v^*_{\ell+1}\|\,. \]
Taking into account the Toeplitz convergence theorem (comp. [18, Prop. 25.1, Problem 25.1]) we get $\|u_m-v^*_m\|\longrightarrow0$, which together with (3.13) implies the convergence of $u_m$ to $u^*$. Of course, these ideas also apply if we use the iteration $u_m=T^r_m(u_{m-1})$ for some fixed integer $r>1$ instead of (3.12). (For more details concerning projection-iteration methods, we refer the reader to [18, Sects. 25.1, 25.2].) Practically we can write the collocation-iteration (3.12) in the form
\[ \xi_m=(1-t\varepsilon)\,E_m\xi_{m-1}-t\,\Lambda^{-1}_{n_m}V^{-1}_{n_m}\bigl[F_{n_m}(E_m\xi_{m-1})-\eta_{n_m}\bigr]\,,\qquad\xi_0=0\,, \tag{3.14} \]
where $E_m=U^T_{n_m}P_{n_mn_{m-1}}U_{n_{m-1}}\Lambda_{n_{m-1}}$ and $P_{nm}=[\delta_{jk}]_{j=1,\,k=1}^{n,\,m}$. Since the fast transformations $\widetilde U_n$ can be realized most effectively for particular $n$, for example $n=2^r-1$, $r\in\mathbb N$ (comp. the examples in Section 5), an appropriate way to use the idea of (3.14) is to combine it with the method (3.10):
1. Choose a finite sequence $n_1<n_2<\cdots<n_{M+1}$ of natural numbers and a natural number $K$.
2. For $m=1,\ldots,M$ do $K$ iterations of the form (3.10) on level $n=n_m$ and use (3.14) to get a good initial approximation $u^{(0)}_{n_{m+1}}$ for (3.10) on level $n_{m+1}$.
3. Apply (3.10) with the initial approximation $u^{(0)}_{n_{M+1}}$ till the desired accuracy is achieved.
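A possible driver for this combined strategy (a sketch under our own naming; the transfer between levels plays the role of the prolongation $E_m$ in (3.14), realized here by converting nodal values to coefficients and re-evaluating at the finer nodes):

```python
import numpy as np

def level_matrices(n):
    th = np.arange(1, n + 1) * np.pi / (n + 1)
    x, lam = np.cos(th), np.pi * np.sin(th) ** 2 / (n + 1)
    j = np.arange(n)[:, None]
    U = np.sqrt(2.0 / np.pi) * np.sin((j + 1) * th) / np.sin(th)   # U_n[j,k] = p_j^phi(x_{nk})
    return x, lam, U

def step_310(xi, x, lam, U, gamma, f, eps, t):
    """One step of (3.10) on the current level."""
    n = len(xi)
    W = U.T @ np.diag(1.0 / np.arange(1, n + 1)) @ U @ np.diag(lam)
    return (1.0 - t * eps) * xi - t * (W @ (gamma(x, np.sqrt(1.0 - x ** 2) * xi) - f(x)))

def multilevel(levels, K, gamma, f, eps, t, toll=1e-12, maxit=500):
    """K sweeps of (3.10) on each coarse level, transfer of the result to the next
    level, and the full iteration on the finest level."""
    xi = np.zeros(levels[0])
    Uc = lamc = None
    for m, n in enumerate(levels):
        x, lam, U = level_matrices(n)
        if m > 0:
            coef = np.zeros(n)                 # coefficients <u, p_j^phi>, zero-padded
            coef[: len(xi)] = Uc @ (lamc * xi)
            xi = U.T @ coef                    # values of the coarse iterate at the fine nodes
        sweeps = K if m < len(levels) - 1 else maxit
        for _ in range(sweeps):
            xi_new = step_310(xi, x, lam, U, gamma, f, eps, t)
            done = np.linalg.norm(xi_new - xi) < toll
            xi = xi_new
            if m == len(levels) - 1 and done:
                break
        Uc, lamc = U, lam
    return xi

# levels n_j = 2^{j+2} - 1 as in Section 5, e.g.
# xi = multilevel([7, 15, 31, 63], K=3, gamma=lambda x, g: g / np.sqrt(1 - x ** 2),
#                 f=lambda x: x * np.abs(x), eps=1.0, t=0.75)
```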


4. Collocation for the case (1.7)

We can try to apply the collocation method described in Section 3 to the example (1.7) (comp. Theorem 3.2). Of course, condition (A) is fulfilled. Moreover, if
\[ \int_{-1}^1[\Gamma(x)]^4\,[\varphi(x)]^3\,dx<\infty\,, \]
then (B) is satisfied with $\alpha=\frac12$, and $u\in L^2_\varphi$ implies $F(u)\in L^2_\varphi$ (see Lemma 2.3). But the conditions of Remark 2.7 are not satisfied, so that the convergence rate established in Theorem 3.2 cannot be proved.

Let us consider the slightly more general example
\[ \gamma(x,g)=\Gamma(x)\,|g|^\alpha\,\mathrm{sgn}\,g\,,\qquad|x|\le1\,,\ g\in\mathbb R\,, \tag{4.1} \]
with $0<\alpha<1$ and a continuous function $\Gamma:[-1,1]\longrightarrow(0,\infty)$. The problem in applying the collocation method (3.1) together with an iteration scheme (3.5) or (3.12) is that condition (B) with $\alpha=1$ is not satisfied. To overcome this difficulty we transform the respective equation (1.1) with $\gamma(x,g)$ defined in (4.1) as follows: define a new unknown function $\tilde g(x)=[\Gamma(x)]^{\delta\alpha}|g(x)|^\alpha\,\mathrm{sgn}\,g(x)$, where $\delta=(1+\alpha)^{-1}$. Then, equation (1.1) with (4.1) is equivalent to
\[ -\frac\varepsilon\pi\int_{-1}^1\frac{|\tilde g(y)|^\beta\,\tilde g(y)}{[\Gamma(y)]^\delta\,(y-x)^2}\,dy+[\Gamma(x)]^\delta\,\tilde g(x)=f(x)\,,\qquad|x|<1\,,\quad\tilde g(\pm1)=0\,, \tag{4.2} \]

where $\beta:=\alpha^{-1}-1>0$. In [14, Sect. 1] it is proved that the solution of
\[ -(Lg)(x)=-(Dg')(x)=f(x)\,,\qquad-1<x<1\,,\quad g(\pm1)=0 \]
(comp. (1.3)) is given by the formula
\[ g(x)=\frac1\pi\int_{-1}^1h(x,y)\,f(y)\,dy\,, \]
where
\[ h(x,y)=\ln\frac{1-xy+\sqrt{(1-x^2)(1-y^2)}}{|y-x|}\,. \]

Thus, equation (4.2) can be written in the form
\[ \varepsilon\,|\tilde g(x)|^\beta\,\tilde g(x)+\frac{[\Gamma(x)]^\delta}{\pi}\int_{-1}^1h(x,y)\,[\Gamma(y)]^\delta\,\tilde g(y)\,dy=\tilde f(x)\,, \tag{4.3} \]
with
\[ \tilde f(x)=\frac{[\Gamma(x)]^\delta}{\pi}\int_{-1}^1h(x,y)\,f(y)\,dy\,. \]


Taking into account, for x, y ∈ (−1, 1) and x = y ,

∂h(x, y)∂y

= − 1y − x

+−x

√1− y2 − y

√1− x2√

1− y2(1− xy −

√(1− x2)(1− y2)

)= −

√1− y2 − x2

√1− y2 − xy

√1− x2 +

√1− x2

(y − x)√

1− y2(1− xy −

√(1− x2)(1− y2)

)= −

√1− x2

(y − x)√

1− y2,

by partial integration we get

(Hf)(x) :=1π

∫ 1

−1

h(x, y)f(y) dy =√

1− x2

π

∫ 1

−1

∫ y

−1f(t) dt√

1− y2(y − x)dy

=√

1− x2

π

∫ 1

−1

∫ y

−1f(t) dt− 1+y

2

∫ 1

−1f(t) dt√

1− y2 (y − x)dy

+√

1− x2

∫ 1

−1

(1 + y) dy√1− y2 (y − x)

∫ 1

−1

f(t) dt

=√

1− x2

π

∫ 1

−1

∫ y

−1f(t) dt− 1+y

2

∫ 1

−1f(t) dt√

1− y2 (y − x)dy

+12

√1− x2

∫ 1

−1

f(t) dt .

For 0 < µ < 12 , define the Banach space Hµ

0 of all functions u : (−1, 1) −→ Rsuch that ϕu is Holder continuous on [−1, 1] with exponent µ and (ϕu)(±1) = 0 ,where the norm in Hµ

0 is given by

‖u‖Hµ0

= ‖ϕu‖Hµ

with

‖f‖Hµ := ‖f‖∞ + sup|f(x1)− f(x2)||x1 − x2|µ

: −1 ≤ x1 < x2 ≤ 1

,

‖f‖∞ = sup |f(x)| : −1 ≤ x ≤ 1 . The Cauchy singular integral operator D :Hµ

0 −→ Hµ0 is bounded (see [5, Sect. 1.6]). Hence, we have

‖Hf‖Hµ ≤ const ‖f‖L

11−µ

(4.4)

since f ∈ L1

1−µ implies ‖F‖Hµ ≤ const ‖f‖L

11−µ

for F (x) =∫ x

−1

f(t) dt . Since

Hµ is compactly embedded into the space of continuous functions, the operatorH : Lp −→ Lq0 is linear and compact for all p > 1 and q0 ≥ 1 .


Due to [18, Prop. 26.7] the operator Gβ : Lp −→ Lq defined by(Gβ(g)

)(x) = |g(x)|βg(x) , p = β + 2 , p−1 + q−1 = 1 ,

is continuous and bounded with

‖Gβ(g)‖Lq ≤ ‖g‖p−1Lp ,

strictly monotone as well as coercive with

〈Gβ(g), g〉 ≥ ‖g‖pLp .

Let f belong to Lp for some p ≥ 2 and let v = Hf . Then f ∈ L2ϕ and v = ϕu =

ϕV −1f . Using (2.2), we obtain, for f = 0 ,⟨ΓδHΓδf, f

⟩=⟨V −1Γδf,Γδf

⟩ϕ

=∞∑

n=0

1n + 1

∣∣∣⟨Γδf, pϕn

⟩ϕ

∣∣∣2 > 0 . (4.5)

Hence, defining B : Lp −→ Lq , p = β + 2 , p−1 + q−1 = 1 , by B(g) = εGβ(g) +ΓδHΓδg , we conclude that B is strictly monotone and coercive with

〈B(g), g〉 ≥ ε ‖g‖pLp .

Consequently, due to [18, Theorem 26.A] we have the following.

Theorem 4.1. For each f ∈ L1 :=⋃p>1

Lp , there is a unique solution g ∈ Lβ+2 of

equation (4.3).

Moreover, the operator Gβ : Lp −→ Lq , p = β + 2 , p−1 + q−1 = 1 , isuniformly monotone. Indeed, using β > 0 and the inequalities

xβ+1 − 1(x − 1)β+1

> 1 , 1 < x <∞ ,

andxβ+1 + 1

(x + 1)β+1≥ min

1, 21−β

=: dβ , 0 ≤ x <∞ ,

we get (|x|βx− |y|βy

)(x − y) ≥ dβ |x− y|β+2 , x, y ∈ R . (4.6)

This implies

〈Gβ(g)−Gβ(f), g − f〉 ≥ dβ ‖g − f‖β+2Lp = a (‖g − f‖Lp) ‖g − f‖Lp

with a(s) = dβsβ+1 .

To solve equation (4.3) or, which is the same, the equation

B(g) = ΓδHf , g ∈ Lβ+2 , (4.7)

numerically, we consider the collocation method

Bn(gn) := εGβ,n(gn) + Hngn = Mϕn ΓδHLϕ

nf , gn ∈ Xn , (4.8)

where

Gβ,n(gn) = MϕnGβ(gn) , Hngn = Mϕ

n ΓδHLϕnΓδ gn , Mϕ

n = ϕLϕnϕ

−1I .


In what follows, let p = β + 2 , p−1 + q−1 = 1 , and let Xpn denote the space Xn of

polynomials of degree less than n equipped with the Lp-norm.

Lemma 4.2. The operator Hn : Xpn −→ Xq

n is strictly monotone.

Proof. Using the Gaussian rule w.r.t. the weight ϕ(x) and the fact that V −1gn ∈ Xn

if gn ∈ Xn , we get, for all gn ∈ Xpn \ Θ ,

〈Hngn, gn〉 =⟨Lϕ

nΓδϕ−1HLϕnΓδ gn, gn

⟩ϕ

=n∑

k=1

λϕnkΓδ(xϕ

nk)(V −1Lϕ

nΓδ gn

)(xϕ

nk) gn(xϕnk)

=⟨V −1Lϕ

nΓδgn, LϕnΓδgn

⟩ϕ> 0

taking into account (4.5).

In what follows we need the existence of a constant Mp > 1 such that ([8,Theorems 2.7])

‖pnσ‖Lp ≤Mp

(n∑

k=1

λn(σp;xϕnk)|pn(xϕ

nk)|p) 1

p

(4.9)

for all pn ∈ Xn , where λn(σp;x) denotes the Christoffel function w.r.t. the Jacobiweight σp(x) = (1 − x2)η . Recall that this relation holds if and only if σ

ϕ ∈ Lp,and ϕ

σ ∈ Lq , i.e., β2 < η < 3β

2 + 2. Recall that ([12, Theorem 6.3.28])

λn(σp;x) ∼ 1n

(√1− x+

1n

)2η+1 (√1 + x +

1n

)2η+1

. (4.10)

Lemma 4.3. Let β2 < η < 3β

2 + 2 . Then the operators Gβ,n : Xpn −→ Xq

n andBn : Xp

n −→ Xqn are uniformly monotone with⟨

Gβ,n(gn)−Gβ,n(fn), gn − fn

⟩≥ b

(∥∥∥gn − fn

∥∥∥Lp

) ∥∥∥gn − fn

∥∥∥Lp

and⟨Bn(gn)−Bn(fn), gn − fn

⟩≥ ε b

(∥∥∥gn − fn

∥∥∥Lp

) ∥∥∥gn − fn

∥∥∥Lp

, fn, gn ∈ Xpn ,

where b(s) = constn−2ηsβ+1 .

Proof. From (4.10) we obtain

λϕnk

ϕ(xϕnk)λn(σp;xϕ

nk)∼ (1− xϕ

nk2)−η > 1 . (4.11)


Thus, taking into account (4.6), (4.9) and (4.11),⟨Gβ,n(gn)−Gβ,n(fn), gn − fn

⟩=⟨Lϕ

nϕ−1[Gβ(gn)−Gβ(fn)

], gn − fn

⟩ϕ

=n∑

k=1

λϕnk

ϕ(xϕnk)

[|gn(xϕ

nk)|β gn(xϕnk)− |fn(xϕ

nk)|β fn(xϕnk)

] [gn(xϕ

nk)− fn(xϕnk)

]≥ dβ

n∑k=1

λϕnk

ϕ(xϕnk)

∣∣∣gn(xϕnk)− fn(xϕ

nk)∣∣∣β+2

≥ const∥∥∥(gn − fn)σ

∥∥∥p

Lp.

On the other hand, due to [12, Cor. 6.3.15] we have∥∥∥(gn − fn)σ∥∥∥p

Lp≥ constn−2η

∥∥∥gn − fn

∥∥∥p

Lp.

Together with Lemma 4.2 we get the assertions.

Corollary 4.4. The operators Gβ,n : Xpn −→ Xq

n and Bn : Xpn −→ Xq

n are coercivewith

〈Gβ,n(gn), gn〉 ≥ constn−2η ‖gn‖pLp and 〈Bn(gn), gn〉 ≥ const ε n−2η ‖gn‖p

Lp ,

gn ∈ Xpn .

Lemma 4.2, Lemma 4.3, and Cor. 4.4 state that we can preserve the essentialproperties of the operator of equation (4.7) for the operator of the collocationmethod (4.8).

Lemma 4.5. For each function h : [−1, 1] −→ R with∥∥∥ϕ 1

q h∥∥∥∞

< ∞ and each

polynomial gn ∈ Xn , we have

〈Mϕn h, gn〉 ≤ const

∥∥∥ϕ 1q h∥∥∥∞‖gn‖Lp ,

where the constant does not depend on h , gn , and n .

Proof. Using the Gaussian rule as well as ([12, Theorem 9.25]), we can estimate

〈Mϕn h, gn〉 =

⟨Lϕ

nϕ−1h, gn

⟩ϕ

=n∑

k=1

λϕnk

ϕ(xϕnk)

h(xϕnk)gn(xϕ

nk)

≤(

n∑k=1

λϕnk

ϕ2(xϕnk)

∣∣∣ϕ 1q (xϕ

nk)h(xϕnk)

∣∣∣q) 1q(

n∑k=1

λϕnk

ϕ(xϕnk)

|gn(xϕnk)|p

) 1p

≤ const∥∥∥ϕ 1

q h∥∥∥∞‖gn‖Lp ,

and the lemma is proved.


Theorem 4.6. For each n ∈ N , the collocation equation (4.8) has a unique solutiong∗n ∈ Xn . If g ∈ Lp is the unique solution of (4.2) (comp. Theorem 4.1) and if Lϕ

n g

converges in Lp to g , and if, for some r > 1 and η ∈(

β2 ,

3β2 + 2

),

limn→∞

(∥∥LϕnΓδg − Γδ g

∥∥Lr + ‖Lϕ

nf − f‖Lr

)n2η = 0,

then g∗n converges to g , where

‖g∗n − Lϕn g‖Lp ≤

[const ε−1

(∥∥LϕnΓδg − Γδg

∥∥Lr + ‖Lϕ

nf − f‖Lr

)n2η

] 1p−1 .

Proof. The unique solvability of (4.2) follows from [18, Theorem 26.A] and Lemma4.3. Furthermore, with the help of Lemma 4.5 as well as the relations Lϕ

nΓδLϕn g =

LϕnΓδ g and Gβ,n(Lϕ

n g) = MϕnGβ(g) we get

ε b (‖g∗n − Lϕn g‖Lp) ‖g∗n − Lϕ

n g‖Lp

≤ 〈Bn(g∗n)−Bn(Lϕn g), g

∗n − Lϕ

n g〉=⟨Mϕ

n ΓδHLϕnf − εGβ,n(Lϕ

n g)−HnLϕn g, g

∗n − Lϕ

n g⟩

=⟨Mϕ

n ΓδHf − εMϕnGβ(g)−HnL

ϕn g + Mϕ

n ΓδH(Lϕnf − f), g∗n − Lϕ

n g⟩

=⟨Mϕ

n ΓδH(Γδg − LϕnΓδg) + Mϕ

n ΓδH(Lϕnf − f), g∗n − Lϕ

n g⟩

≤ const(∥∥H(Γδg − Lϕ

nΓδ g)∥∥∞ + ‖H(Lϕ

nf − f)‖∞)‖g∗n − Lϕ

n g‖Lp .

Now, apply (4.4) for some µ ∈ (0, 12 ) and use b(s) = constn−2 ηsp−1 .

Let us investigate the iteration method

g(m+1)n = g(m)

n − t Rn(g(m)n ) , g(0)

n ≡ 0 , t > 0 , (4.12)

where Rn(gn) = εGβ,n(gn) + Hngn − fn and

Gβ,n(gn) = Lϕnϕ

−1Gβ(gn) , Hngn = Lϕnϕ

−1ΓδHLϕnΓδ gn , fn = Lϕ

nϕ−1ΓδHLϕ

nf ,

for solving the collocation equation (4.8). For this, denote by Yϕn the space Xn of

polynomials of degree less than n (with real coefficients) equipped with the innerproduct ⟨

gn, fn

⟩ϕ

=∫ 1

−1

ϕ(x)gn(x)fn(x) dx =n∑

k=1

λϕnk gn(xϕ

nk)fn(xϕnk) .

Since Γ : [−1, 1] −→ (0,∞) is assumed to be continuous, there exist constantsγ0, γ1 ∈ R such that

γ1 ≥ Γ(x) ≥ γ0 > 0 , −1 ≤ x ≤ 1 . (4.13)

Then, for gn, fn ∈ Yϕn ,⟨

Gβ,n(gn)−Gβ,n(fn), gn − fn

⟩ϕ≥ 0


and (see (4.5) and the proof of Lemma 4.2)⟨Hngn, gn

⟩ϕ

=n−1∑k=0

1k + 1

∣∣∣⟨LϕnΓδgn, p

ϕk

⟩ϕ

∣∣∣2≥ 1

n

∥∥LϕnΓδgn

∥∥2

ϕ=

1n

n∑k=1

λϕnk

∣∣γδ(xϕnk)gn(xϕ

nk)∣∣2 ≥ γ2δ

0

n‖gn‖2

ϕ .

Furthermore, ∥∥∥Hngn

∥∥∥ϕ

=

√√√√ n∑k=1

λϕnk |Γδ(xϕ

nk)(V −1LϕnΓδgn)(xϕ

nk)|2

≤ γδ1

∥∥V −1LϕnΓδgn

∥∥ϕ≤ γ2δ

1 ‖gn‖ϕ .

Taking into account that ‖gn‖ϕ ≤ r implies |gn(xϕnk)| ≤ r√

λϕnk

, k = 1, . . . , n , and

that λϕnk =

π sin2 kπn+1

n + 1≥ 4π

(n + 1)3, we get, for ‖gn‖ϕ ,

∥∥∥fn

∥∥∥ϕ≤ r ,∥∥∥Gβ,n(gn)− Gβ,n(fn)

∥∥∥2

ϕ

n + 1

n∑k=1

∣∣∣|gn(xϕnk)|β gn(xϕ

nk)− |fn(xϕnk)|β fn(xϕ

nk)∣∣∣2

≤ π[(β + 1)rβ ]2

n + 1

n∑k=1

1(λϕ

nk)β|gn(xϕ

nk)− fn(xϕnk)|2

≤ π[(β + 1)rβ ]2

(4π)β+1(n + 1)3β+2

∥∥∥gn − fn

∥∥∥2

ϕ.

Consequently, the operator Rn : Yϕn −→ Yϕ

n is strongly monotone and locallyLipschitz continuous, which implies that, for sufficiently small t > 0 , the iterationmethod (4.12) converges in Yϕ

n to the solution gn ∈ Yϕn of (4.8) (see [18, Prop.

26.8, Theorem 26.B]). Remark that the collocation equation (4.8) can be written as
\[ \varepsilon\,G_n(\xi_n)+H_n\xi_n=H_n\Gamma_n^{-1}\eta_n\,, \]
where $\eta_n=[f(x^\varphi_{nk})]_{k=1}^n$ and $g_n=\sum_{k=1}^n\xi_{nk}\varphi_{nk}$, i.e., $\xi_n=[\xi_{nk}]_{k=1}^n=[g_n(x^\varphi_{nk})]_{k=1}^n$,
\[ G_n(\xi_n)=\bigl[|\xi_{nk}|^\beta\xi_{nk}\bigr]_{k=1}^n\,,\qquad\Gamma_n=\mathrm{diag}\bigl[\Gamma^\delta(x^\varphi_{nk})\bigr]_{k=1}^n\,,\qquad H_n=\Gamma_n\widetilde D_nU_n^TD_n^{-1}U_n\Lambda_n\Gamma_n \]
(comp. the definitions associated with (3.8)). Thus, the iteration equation (4.12) is equivalent to
\[ \xi^{(m+1)}_n=\xi^{(m)}_n-t\bigl[\varepsilon\,G_n(\xi^{(m)}_n)+H_n\xi^{(m)}_n-H_n\Gamma_n^{-1}\eta_n\bigr]\,. \tag{4.14} \]
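For the setting of Example 3 below ($\Gamma(x)=(1-x^2)^{1/4}$, $\alpha=\tfrac12$, hence $\beta=1$, $\delta=\tfrac23$), the iteration (4.14) can be sketched as follows (our own dense-matrix illustration; the function and variable names are not from the paper, and no fast transforms are used):

```python
import numpy as np

def iterate_414(n, Gamma, f, eps, t, alpha=0.5, toll=1e-12, maxit=100_000):
    beta, delta = 1.0 / alpha - 1.0, 1.0 / (1.0 + alpha)
    th = np.arange(1, n + 1) * np.pi / (n + 1)
    x = np.cos(th)
    lam = np.pi * np.sin(th) ** 2 / (n + 1)                       # Lambda_n
    j = np.arange(n)[:, None]
    U = np.sqrt(2.0 / np.pi) * np.sin((j + 1) * th) / np.sin(th)  # U_n
    G = Gamma(x) ** delta                                         # diagonal of Gamma_n
    Dt = np.sin(th)                                               # diagonal of tilde D_n
    M = U.T @ np.diag(1.0 / np.arange(1, n + 1)) @ U @ np.diag(lam)
    H = (G * Dt)[:, None] * M * G[None, :]                        # H_n = Gamma_n tilde D_n U^T D^-1 U Lambda Gamma_n
    rhs = H @ (f(x) / G)                                          # H_n Gamma_n^{-1} eta_n
    xi = np.zeros(n)
    for _ in range(maxit):
        xi_new = xi - t * (eps * np.abs(xi) ** beta * xi + H @ xi - rhs)
        if np.linalg.norm(xi_new - xi) < toll:
            break
        xi = xi_new
    return x, xi_new

# Example 3: Gamma(x) = (1 - x^2)**0.25; supply its right-hand side f and call, e.g.,
# x, xi = iterate_414(63, lambda x: (1 - x ** 2) ** 0.25, f, eps=1.0, t=1.0)
```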


5. Numerical examples

Let us consider equation (1.5) for different functions $f(x)$ and $\gamma(x,g)$.

Example 1. We solve the (linear) equation (1.5) with $\gamma(x,g)=(1-x^2)^{-1/2}g$ and

(a) $\displaystyle f(x)=f_a(x)=x|x|-\frac{\varepsilon x}\pi\left(\frac{2-3x^2}{\sqrt{1-x^2}}\,\ln\frac{1+\sqrt{1-x^2}}{1-\sqrt{1-x^2}}-6\right)$

as well as with

(b) $f(x)=f_b(x)=x|x|$

by the collocation method (3.1) together with the iteration method (3.10) (with $u^{(0)}_n\equiv0$), as well as with the combination of (3.10) and (3.14) described at the end of Section 3.

In the case of the method (3.10) the iteration is stopped if the $L^{2,\frac12}_\varphi$-norm of the difference of two consecutive iterates is smaller than toll, i.e.,
\[ \bigl\|u^{(N)}_n-u^{(N-1)}_n\bigr\|_{\varphi,\frac12}<\mathrm{toll}\,, \tag{5.1} \]
where $u^{(m)}_n=\sum_{k=1}^n\xi^{(m)}_{nk}\varphi_{nk}$. For the combination of (3.10) and (3.14) we choose the sequence
\[ n_1=7<\cdots<n_j=2^{j+2}-1<\cdots<n=n_{M+1}=2^{M+3}-1 \]
and the number $K$ of iterations realized on the levels $n_1,\ldots,n_M$. The number of iterations needed on the last level $n_{M+1}$ to get the same accuracy (5.1) is denoted by $N_K$. Condition (3.3) is satisfied for all $r$.

and the number K of iterations realized on the levels n1, . . . , nM . The number ofiterations needed on the last level nM+1 to get the same accuracy (5.1) is denotedby NK . Condition (3.3) is satisfied for all r .(a) In this case the solution is given by u∗(x) = x|x| (independent of ε). We have,

for n = 2k − 1 ,

a∗2k−1 :=⟨u∗, pϕ

2k−1

⟩ϕ

=4√

2 (−1)k(n + 1)√π(n2 − 4)n(n + 4)

, k = 1, 2, . . . ,

such that u∗ ∈ L2,2.5−δϕ for all δ > 0 . Thus, the convergence rate predicted

by (3.4) for t = 0.5 is

‖u∗n − u∗‖ϕ,1 = O

(nδ−1.5

), δ > 0 arbitrarily small,

which is confirmed by the numerical results presented in the following table,in which the values

ds =∥∥∥u(N)

n − Pnu∗∥∥∥

ϕ,s=

√√√√n−1∑k=0

(k + 1)2s

[⟨u

(N)n , pϕ

k

⟩ϕ− a∗k

]2

are presented for s = 0.5 and s = 1. To compute these values we use therelation [ ⟨

u(N)n , pϕ

k

⟩ϕ

]n−1

k=0= UnΛnξ

(N)n .
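The quantities $d_s$ are cheap to evaluate from the nodal values; a small sketch (ours, not code from the paper), assuming the exact coefficients $a^*_k$ are available:

```python
import numpy as np

def d_s(xi, a_star, s):
    """d_s = || u_n^(N) - P_n u* ||_{phi,s}, computed from the nodal values xi of u_n^(N)
    via the relation [<u_n^(N), p_k^phi>]_{k=0}^{n-1} = U_n Lambda_n xi^(N)."""
    n = len(xi)
    th = np.arange(1, n + 1) * np.pi / (n + 1)
    lam = np.pi * np.sin(th) ** 2 / (n + 1)
    j = np.arange(n)[:, None]
    U = np.sqrt(2.0 / np.pi) * np.sin((j + 1) * th) / np.sin(th)
    coeff = U @ (lam * xi)                      # Fourier coefficients of u_n^(N)
    k = np.arange(n)
    return np.sqrt(np.sum((k + 1.0) ** (2 * s) * (coeff - a_star[:n]) ** 2))
```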


    n      N   N_1  N_3     d_{1/2}      d_1     n^{2.0} d_{1/2}   n^{1.5} d_1
   31     16              1.13e-03   5.43e-03       1.090            0.938
   63     16              2.93e-04   2.00e-03       1.164            0.999
  127     16              7.46e-05   7.21e-04       1.204            1.032
  255     16              1.88e-05   2.58e-04       1.225            1.049
  511     16              4.73e-06   9.15e-05       1.235            1.057
 1023     16    13   12   1.19e-06   3.24e-05       1.241            1.062
 2047     16    12   11   2.97e-07   1.15e-05       1.243            1.064
 4095     16    11   10   7.42e-08   4.06e-06       1.245            1.065
 8191     16    10   10   1.86e-08   1.44e-06       1.246            1.066
16383     16     9    9   4.64e-09   5.08e-07       1.246            1.066
32767     16     9    8   1.16e-09   1.80e-07       1.246            1.066
65535     16     8    7   2.92e-10   6.41e-08       1.256            1.076

Example 1, (a): ε = 1.0, t = 0.75, toll = 10^{-12}

Here the value $t$ is equal to $1.5\,t^*_\varepsilon$ with $t^*_\varepsilon$ from (3.7), where $c_1=1$. For $t=t^*_\varepsilon$, $N=31$ iteration steps are needed in (3.10) to get the same accuracy in (5.1).

The next table presents the corresponding results for $\varepsilon=0.2$. We observe the same convergence rates, and that the numbers in the last two columns do not depend on $\varepsilon$, as predicted by Cor. 3.3. On the other hand, the number of iterations is much higher than for $\varepsilon=1.0$, as expected from (3.7). Here $t$ is about $9\,t^*_\varepsilon$. For $t=t^*_\varepsilon$, $N=410$ iterations are needed in (3.10) to fulfil (5.1).

    n      N   N_3  N_5  N_10     d_{1/2}      d_1     n^{2.0} d_{1/2}   n^{1.5} d_1
   31     40                     1.06e-03   5.17e-03       1.018            0.892
   63     42                     2.82e-04   1.94e-03       1.119            0.972
  127     43                     7.30e-05   7.11e-04       1.178            1.017
  255     44                     1.86e-05   2.56e-04       1.210            1.041
  511     44                     4.70e-06   9.12e-05       1.228            1.053
 1023     44    37   36   36     1.18e-06   3.24e-05       1.237            1.060
 2047     44    35   33   33     2.96e-07   1.15e-05       1.241            1.063
 4095     44    32   30   30     7.42e-08   4.06e-06       1.244            1.065
 8191     44    30   27   26     1.86e-08   1.44e-06       1.245            1.065
16383     44    27   24   23     4.64e-09   5.08e-07       1.246            1.066
32767     44    24   20   20     1.16e-09   1.80e-07       1.246            1.066
65535     44    21   17   16     2.92e-10   6.41e-08       1.256            1.076

Example 1, (a): ε = 0.2, t = 1.7, toll = 10^{-12}

(b) In this case the solution $u^*$ is unknown. For this reason we compare the approximate solutions $u^{(N)}_n$ with $u^{(N)}_{65535}$. We have $f_b\in L^{2,2.5-\delta}_\varphi$ for all $\delta>0$. Thus, by Cor. 2.6 and by (3.4), we get
\[ \|u^*_n-u^*\|_{\varphi,1}=O\bigl(n^{\delta-2.5}\bigr)\,,\qquad\delta>0\text{ arbitrarily small.} \]


    n      N   N_1  N_3     D_{1/2}      D_1     n^{3.0} D_{1/2}   n^{2.5} D_1
   31     14              2.90e-05   1.43e-04       0.865            0.763
   63     14              3.83e-06   2.68e-05       0.958            0.845
  127     14              4.93e-07   4.90e-06       1.009            0.890
  255     14              6.25e-08   8.80e-07       1.036            0.914
  511     14              7.87e-09   1.57e-07       1.050            0.926
 1023     14    10    8   9.87e-10   2.78e-08       1.057            0.932
 2047     14     9    6   1.24e-10   4.93e-09       1.060            0.935
 4095     14     8    5   1.55e-11   8.71e-10       1.061            0.935
 8191     14     7    3   2.00e-12   1.60e-10       1.100            0.974

Example 1, (b): ε = 1.0, t = 0.75, toll = 10^{-12}

The numerical results presented in the tables confirm this theoretical estimate. Here we observe the values of the norms $D_s=\|u^{(N)}_n-u^{(N)}_{65535}\|_{\varphi,s}$ for $s=0.5$ and $s=1$. Of course, here the values of the last two columns depend on $\varepsilon$, since the exact solution $u^*$, and hence $\|u^*\|_{\varphi,3.5}$ in (3.4), depends on $\varepsilon$.

    n      N   N_3  N_5  N_10     D_{1/2}      D_1     n^{3.0} D_{1/2}   n^{2.5} D_1
   31     37                     1.24e-04   6.17e-04       3.697            3.303
   63     38                     1.76e-05   1.24e-04       4.405            3.919
  127     39                     2.36e-06   2.36e-05       4.829            4.281
  255     39                     3.05e-07   4.31e-06       5.064            4.479
  511     39                     3.89e-08   7.76e-07       5.189            4.582
 1023     39    29   26   25     4.91e-09   1.38e-07       5.253            4.635
 2047     39    26   22   20     6.16e-10   2.46e-08       5.286            4.662
 4095     39    23   17   15     7.72e-11   4.35e-09       5.300            4.673
 8191     39    20   13   10     9.77e-12   7.80e-10       5.368            4.738

Example 1, (b): ε = 0.2, t = 1.7, toll = 10^{-12}

Example 2. We solve equation (1.5) with $\gamma(x,g)=|g|\arctan(g)$ and $f(x)$ equal to
\[ x^2\sqrt{1-x^2}\,\arctan\bigl(x|x|\sqrt{1-x^2}\bigr)-\frac{\varepsilon x}\pi\left(\frac{2-3x^2}{\sqrt{1-x^2}}\,\ln\frac{1+\sqrt{1-x^2}}{1-\sqrt{1-x^2}}-6\right) \]
by the collocation method (3.1) together with the iteration method (3.10) (with $u^{(0)}_n\equiv0$), as well as with the combination of (3.10) and (3.14) described at the end of Section 3.

Condition (B) is fulfilled with $\alpha=1$ and $c_1=\pi/2$. The solution is given by $u^*(x)=x|x|$ (independent of $\varepsilon$). In the following tables (where $\varepsilon=1.0$ and $\varepsilon=0.2$) we can observe the convergence rate predicted by (3.4). (The notation of the previous example is used.)


    n      N   N_3     d_{1/2}      d_1     n^{2.0} d_{1/2}   n^{1.5} d_1
   31     12         1.16e-03   5.52e-03       1.116            0.953
   63     12         2.97e-04   2.01e-03       1.178            1.007
  127     12         7.51e-05   7.24e-04       1.212            1.036
  255     12         1.89e-05   2.58e-04       1.229            1.051
  511     12         4.74e-06   9.16e-05       1.237            1.058
 1023     12     8   1.19e-06   3.25e-05       1.242            1.062
 2047     12     8   2.97e-07   1.15e-05       1.244            1.064
 4095     12     7   7.43e-08   4.06e-06       1.245            1.065
 8191     12     6   1.86e-08   1.44e-06       1.246            1.066
16383     12     6   4.64e-09   5.08e-07       1.246            1.066
32767     12     5   1.16e-09   1.80e-07       1.246            1.066
65535     12     5   2.92e-10   6.41e-08       1.256            1.076

Example 2: ε = 1.0, t = 0.9, toll = 10^{-12}

    n      N   N_3  N_5  N_10     d_{1/2}      d_1     n^{2.0} d_{1/2}   n^{1.5} d_1
   31     22                     1.16e-03   5.51e-03       1.113            0.951
   63     22                     2.97e-04   2.01e-03       1.178            1.007
  127     22                     7.51e-05   7.24e-04       1.211            1.036
  255     22                     1.89e-05   2.58e-04       1.229            1.051
  511     22                     4.74e-06   9.16e-05       1.237            1.058
 1023     22    15   15   15     1.19e-06   3.25e-05       1.242            1.062
 2047     22    14   14   14     2.97e-07   1.15e-05       1.244            1.064
 4095     22    12   12   12     7.43e-08   4.06e-06       1.245            1.065
 8191     22    11   11   11     1.86e-08   1.44e-06       1.246            1.066
16383     22    10   10   10     4.64e-09   5.08e-07       1.246            1.066
32767     22     9    9    9     1.16e-09   1.80e-07       1.246            1.066
65535     22     8    7    7     2.92e-10   6.41e-08       1.256            1.076

Example 2: ε = 0.2, t = 3.4, toll = 10^{-12}

Example 3. We solve equation (1.7) with $\Gamma(x)=(1-x^2)^{\frac14}$ and
\[ f(x)=x\left[\sqrt{1-x^2}-\frac\varepsilon\pi\left(\frac{2-3x^2}{\sqrt{1-x^2}}\,\ln\frac{1+\sqrt{1-x^2}}{1-\sqrt{1-x^2}}-6\right)\right] \]
(a) by the collocation method (3.1) together with the iteration method (3.10) (with $u^{(0)}_n\equiv0$), although condition (B) is not satisfied for $\alpha=1$, and
(b) by the collocation method (4.8) together with the iteration method (4.14).
We observe that the iteration method (3.10) can converge (although the condition on the Lipschitz continuity is not satisfied), but usually does not converge for larger $n$ if the iteration parameter $t$ is not small enough. On the other hand, in the iteration method (4.14) the number of iterations essentially depends on the


discretization parameter $n$ and increases strongly with $n$. Thus, a more effective and stable way to solve equation (1.7) seems to be the application of a Newton iteration to the collocation equations (4.8), or a combination of a Newton iteration with the method (4.14). We will discuss this in a forthcoming paper.

(a) The solution is given by $g^*(x)=\sqrt{1-x^2}\,u^*(x)$ with $u^*(x)=x|x|$. In the following tables we use the same notation as in the previous examples.

    n       N       d_{1/2}      d_1     n^{2.0} d_{1/2}   n^{1.5} d_1
   31      68     1.06e-03   5.16e-03       1.017            0.891
   63      68     2.72e-04   1.89e-03       1.078            0.946
  127      68     6.89e-05   6.81e-04       1.111            0.975
  255      68     1.73e-05   2.43e-04       1.128            0.990
  511      68     4.35e-06   8.64e-05       1.136            0.998
 1023   20000     2.90e-04   1.13e-03
 2047   20000     2.93e-04   1.13e-03
 4095   20000     2.91e-04   1.13e-03
 8191   20000     2.91e-04   1.12e-03
16383   20000     2.91e-04   1.12e-03
32767   20000     2.91e-04   1.12e-03
65535   20000     2.91e-04   1.12e-03

Example 3, (a): ε = 1.0, t = 0.3, toll = 10^{-12}

    n       N       d_{1/2}      d_1     n^{2.0} d_{1/2}   n^{1.5} d_1
   31     365     1.06e-03   5.16e-03       1.017            0.891
   63     365     2.72e-04   1.89e-03       1.078            0.946
  127     365     6.89e-05   6.81e-04       1.111            0.975
  255     365     1.73e-05   2.43e-04       1.128            0.990
  511     365     4.35e-06   8.64e-05       1.136            0.998
 1023     365     1.09e-06   3.06e-05       1.141            1.002
 2047     365     2.73e-07   1.08e-05       1.143            1.004
 4095     365     6.82e-08   3.83e-06       1.144            1.005
 8191     365     1.71e-08   1.36e-06       1.145            1.005
16383     365     4.27e-09   4.80e-07       1.145            1.006
32767     365     1.07e-09   1.70e-07       1.145            1.006
65535     365     2.70e-10   6.06e-08       1.158            1.017

Example 3, (a): ε = 1.0, t = 0.06, toll = 10^{-12}

(b) The iteration is stopped if the $L^2_\varphi$-norm of the difference of two consecutive iterates is smaller than toll, i.e.,
\[ \bigl\|\tilde g^{(N)}_n-\tilde g^{(N-1)}_n\bigr\|_\varphi<\mathrm{toll}\,,\qquad\text{where}\quad\tilde g^{(m)}_n=\sum_{k=1}^n\xi^{(m)}_{nk}\varphi_{nk}\,. \]


Then, the transformation
\[ \xi^{(N)}_{nk}\ \longrightarrow\ \tilde\xi^{(N)}_{nk}=\frac{\bigl|\xi^{(N)}_{nk}\bigr|^\beta\,\xi^{(N)}_{nk}}{[\Gamma(x^\varphi_{nk})]^\delta\,\varphi(x^\varphi_{nk})} \]
is applied, and the approximations $\tilde u_n=\sum_{k=1}^n\tilde\xi^{(N)}_{nk}\varphi_{nk}$ are compared with the exact solution $u^*(x)$:
\[ d_s=\bigl\|\tilde u^{(N)}_n-P_nu^*\bigr\|_{\varphi,s}=\sqrt{\sum_{k=0}^{n-1}(k+1)^{2s}\Bigl[\bigl\langle\tilde u^{(N)}_n,p^\varphi_k\bigr\rangle_\varphi-a^*_k\Bigr]^2}\,. \]

    n       N        d_{1/2}       d_1      n^{2.0} d_{1/2}   n^{1.5} d_1
   31      74     1.059e-03   5.163e-03        1.017            0.891
   63     144     2.716e-04   1.891e-03        1.078            0.946
  127     268     6.887e-05   6.812e-04        1.111            0.975
  255     487     1.734e-05   2.432e-04        1.128            0.990
  511     867     4.352e-06   8.639e-05        1.136            0.998
 1023    1511     1.090e-06   3.062e-05        1.141            1.002
 2047    2573     2.727e-07   1.084e-05        1.143            1.004
 4095    4242     6.819e-08   3.833e-06        1.143            1.004
 8191    6677     1.702e-08   1.353e-06        1.142            1.003
16383    9749     4.205e-09   4.737e-07        1.129            0.993
32767   12399     9.948e-10   1.602e-07        1.068            0.950
65535   12773     2.312e-10   5.434e-08        0.993            0.912

Example 3, (b): ε = 1.0, t = 1.0, toll = 10^{-12}

    n       N        d_{1/2}       d_1      n^{2.0} d_{1/2}   n^{1.5} d_1
   31      93     1.003e-03   4.955e-03        0.964            0.855
   63     178     2.581e-04   1.821e-03        1.025            0.910
  127     332     6.555e-05   6.568e-04        1.057            0.940
  255     604     1.652e-05   2.346e-04        1.074            0.955
  511    1077     4.147e-06   8.338e-05        1.083            0.963
 1023    1888     1.039e-06   2.956e-05        1.087            0.967
 2047    3238     2.600e-07   1.046e-05        1.089            0.969
 4095    5403     6.503e-08   3.702e-06        1.090            0.970
 8191    8671     1.626e-08   1.309e-06        1.091            0.970
16383   13112     4.058e-09   4.622e-07        1.089            0.969
32767   17871     1.006e-09   1.622e-07        1.080            0.962
65535   19814     2.457e-10   5.646e-08        1.055            0.947

Example 3, (b): ε = 0.5, t = 1.4, toll = 10^{-13}


References

[1] D. Berthold, W. Hoppe, B. Silbermann, A fast algorithm for solving the generalizedairfoil equation, J. Comp. Appl. Math., 43 (1992), 185–219.

[2] M.R. Capobianco, G. Criscuolo, P. Junghanns, A fast algorithm for Prandtl’s integro-differential equation, J. Comp. Appl. Math., 77 (1997), 103–128.

[3] M.R. Capobianco, G. Criscuolo, P. Junghanns, U. Luther, Uniform convergence ofthe collocation method for Prandtl’s integro-differential equation, ANZIAM J., 42(2000), 151–168.

[4] M.R. Capobianco, G. Mastroianni, Uniform boundedness of Lagrange operator insome weighted Sobolev-type space, Math. Nachr., 187 (1997), 1–17.

[5] I. Gohberg, N. Krupnik, One-Dimensional Linear Singular Integral Equations, Vol-ume I, Birkhauser Verlag, 1992.

[6] N.I. Ioakimidis, Application of finite-part integrals to the singular integral equationsof crack problems in plane and three-dimensional elasticity, Acta Mech., 45 (1982),31–47.

[7] A.C. Kaya, F. Erdogan, On the solution of integral equations with strong singularities,Quart. Appl. Math., 45 (1987), 105–122.

[8] G. Mastroianni, M.G. Russo, Lagrange interpolation in weighted Besov spaces, Con-str. Approx., 15 2 (1999), 257–289.

[9] G. Mastroianni, M.G. Russo, Weighted Marcinkiewicz inequalities and boundednessof the Lagrange operator, Math. Anal. Appl., (2000), 149–182.

[10] S. Nemat-Nasser, M. Hori, Toughening by partial or full bridging of cracks in ceramicsand fiber reinforced composites, Mech. Mat., 6 (1987), 245–269.

[11] S. Nemat-Nasser, M. Hori, Asymptotic solution of a class of strongly singular integralequations, SIAM J. Appl. Math., 50 3 (1990), 716–725.

[12] P. Nevai, Orthogonal Polynomials, Mem. Amer. Math. Soc. 213, Providence, RI,1979.

[13] D. Oestreich, Approximated solution of a nonlinear singular equation of Prandtl’stype, Math. Nachr., 161 (1993), 95–105.

[14] M.A. Sheshko, G.A. Rasol’ko, V.S. Mastyanitsa, On approximate solution ofPrandtl’s integro-differential equation, Differential Equations, 29 (1993), 1345–1354.

[15] G. Steidl, Fast radix-p discrete cosine transform, AAECC, 3 (1992), 39–46.

[16] M. Tasche, Fast algorithms for discrete Chebyshev-Vandermonde transforms and ap-plications, Numer. Algor., 5 (1993), 453–464.

[17] L. von Wolfersdorf, Monotonicity methods for nonlinear singular integral and integro-differential equations, ZAMM, 63 (1983), 249–259.

[18] E. Zeidler, Nonlinear Functional Analysis and its Applications, Part II, SpringerVerlag, 1990.


M.R. Capobianco
C.N.R. – Istituto per le Applicazioni del Calcolo “Mauro Picone”, Sezione di Napoli
Via Pietro Castellino 111
I-80131 Napoli, Italy
e-mail: [email protected]

G. Criscuolo
Dipartimento di Matematica
Università degli Studi di Napoli “Federico II”
Edificio T, Complesso Monte Sant’Angelo
Via Cinthia
I-80126 Napoli, Italy
e-mail: [email protected]

P. Junghanns
Fakultät für Mathematik
Technische Universität Chemnitz
D-09107 Chemnitz, Germany
e-mail: [email protected]


Operator Theory: Advances and Applications, Vol. 160, 81–100
© 2005 Birkhäuser Verlag Basel/Switzerland

Fourier Integral Operators and Gelfand-Shilov Spaces

Marco Cappiello

Abstract. In this work, we study a class of Fourier integral operators of infinite order acting on the Gelfand-Shilov spaces of type S. We also define wave front sets in terms of Gelfand-Shilov classes and study the action of the previous Fourier integral operators on them.

Mathematics Subject Classification (2000). Primary 35S30; Secondary 35A18.

Keywords. Fourier integral operators, θ-wave front set, Gelfand-Shilov spaces.

1. Introduction

Fourier integral operators, as introduced by L. Hormander [20], play a fundamentalrole in microlocal analysis and in the theory of the partial differential equations. Inthese fields, they find a natural application in the analysis of the Cauchy problemfor some classes of hyperbolic equations. In particular, parametrices and solutionsfor these kinds of problems can be expressed in terms of Fourier integral operators.The propagation of singularities associated to the Cauchy problem can also be in-vestigated by studying the action of Fourier integral operators on the wave front setof distributions. A large number of works concerning these operators and their ap-plications in the study of the C∞-well-posedness of hyperbolic problems appearedin the last thirty years, see [21], [23], [32] and the references there. Correspondingresults have been obtained in the context of the Gevrey classes, see for example[19], [6], [30], [31], [17]. The Gevrey framework leads us to consider operators ofinfinite order, i.e., with symbols growing exponentially at infinity. In a differentdirection, S. Coriasco [8] has developed a global calculus for Fourier integral oper-ators defined by symbols a(x, η) satisfying estimates on R2n

x,η, called SG-symbolsin the literature, see [25], [7], [28], [13], [29]. As an application, S. Coriasco [9],S. Coriasco and L. Rodino [12], S. Coriasco and P. Panarese [11] and S. Coriascoand L. Maniccia [10] have proved results on the well-posedness and propagation ofsingularities for some hyperbolic problems globally defined in the space variables,in the framework of the Schwartz spaces S(Rn),S′(Rn). In some recent works, theauthor extends this global calculus in a Gevrey context first for symbols of finite


order [2] and then for symbols of infinite order [3], [4]. In [3], [4], the functional framework is given by the Gelfand-Shilov space $S^\nu_\mu(\mathbb R^n)$, $\mu>0$, $\nu>0$, $\mu+\nu\ge1$, defined as the space of all functions $u\in C^\infty(\mathbb R^n)$ such that
\[ \sup_{\alpha,\beta\in\mathbb N^n}\ \sup_{x\in\mathbb R^n}A^{-|\alpha|}B^{-|\beta|}(\alpha!)^{-\mu}(\beta!)^{-\nu}\bigl|x^\alpha\partial^\beta_xu(x)\bigr|<+\infty \tag{1.1} \]
for some positive constants $A,B$. More precisely, the results have been obtained in $S_\theta(\mathbb R^n)=S^\theta_\theta(\mathbb R^n)$, with $\theta>1$, and in the dual space $S'_\theta(\mathbb R^n)$, corresponding to the case $\mu=\nu=\theta$ in (1.1) and representing a global version of the Gevrey classes $G^\theta(\mathbb R^n)$, $D'_\theta(\mathbb R^n)$.

In this work, we study SG-Fourier integral operators on Gelfand-Shilov spacesfrom a microlocal point of view. In Section 2, we recall the basic results concerningthe SG-calculus on Sθ(Rn). In Section 3, we define polyhomogeneous SG-symbolsof finite order, which extend the standard notion of classical symbols in the SG-context. In Sections 4 and 5, we define the wave front sets for distributions u ∈S′

θ(Rn) and study the action of Fourier integral operators on them. These results

are the starting point for the study of the propagation of singularities for the SG-hyperbolic Cauchy problem in the Gelfand-Shilov spaces, which we shall detail ina forthcoming paper (a construction of parametrices is given in [4]).

In the sequel we will use the following notation:

〈x〉 = (1 + |x|2) 12 for x ∈ Rn

∇xϕ =(

∂ϕ∂x1

, . . . , ∂ϕ∂xn

)Dα

x = Dα1x1

. . .Dαnxn

for all α ∈ Nn, x ∈ Rn, where Dxh= −i∂xh

, h = 1, . . . , n

e1 = (1, 0), e2 = (0, 1), e = (1, 1).

We will denote by Z+ the set of all positive integers and by N the set Z+ ∪ 0.We will also denote by F the Fourier transformation.

We start by giving the basic definitions and properties of the Gelfand-Shilovspaces Sθ(Rn), θ > 1 and describing their relations with the Gevrey spaces. Wewill refer to [14], [15], [24] for proofs and details. Let θ > 1, let A,B be positiveintegers and denote by Sθ,A,B(Rn) the space of all functions u in C∞(Rn) suchthat

supα,β∈Nn

supx∈Rn

A−|α|B−|β|(α!β!)−θ∣∣xα∂β

xu(x)∣∣ < +∞. (1.2)

We haveSθ(Rn) =

⋃A,B∈Z+

Sθ,A,B(Rn).

For any A,B ∈ Z+, the space Sθ,A,B(Rn) is a Banach space endowed with thenorm given by the left-hand side of (1.2). Therefore, we can consider the spaceSθ(Rn) as an inductive limit of an increasing sequence of Banach spaces.

Let us give another characterization of the space Sθ(Rn), providing anotherequivalent topology to Sθ(Rn).


Proposition 1.1. Sθ(Rn) is the space of all functions u ∈ C∞(Rn) such that

supβ∈Nn

supx∈Rn

B−|β|(β!)−θeL|x| 1θ |∂βxu(x)| < +∞

for some positive B,L.

Proposition 1.2. i) Sθ(Rn) is closed under differentiation;ii) We have

Gθo(R

n) ⊂ Sθ(Rn) ⊂ Gθ(Rn),where Gθ(Rn) is the Gevrey space of all functions u ∈ C∞(Rn) satisfying forevery compact subset K of Rn estimates of the form:

supβ∈Nn

B−|β|(β!)−θ supx∈K

|∂βxu(x)| < +∞

for some B = B(K) > 0, and Gθo(R

n) is the space of all functions of Gθ(Rn)with compact support.

We shall denote by S′θ(R

n) the space of all linear continuous forms on Sθ(Rn), alsoknown as temperate ultradistributions, cf. [26].

Remark 1.3. Given u ∈ S′θ(R

n), the restriction of u to Gθo(Rn) is a Gevrey ul-

tradistribution in D′θ(R

n), topological dual of Gθo(R

n). In this sense, we haveS′

θ(Rn) ⊂ D′

θ(Rn). Similarly, the space of the ultradistributions with compact

support E ′θ(R

n) can be regarded as subset of S′θ(R

n).

Theorem 1.4. There exists an isomorphism between L(Sθ(Rn), S′θ(R

n)), the spaceof all linear continuous maps from Sθ(Rn) to S′

θ(Rn), and S′

θ(R2n), which asso-

ciates to every T ∈ L(Sθ(Rn), S′θ(R

n)) a distribution KT ∈ S′θ(R

2n) such that

〈Tu, v〉 = 〈KT , v ⊗ u〉for every u, v ∈ Sθ(Rn). KT is called the kernel of T.

Finally we give a result concerning the action of the Fourier transformation onSθ(Rn).

Proposition 1.5. The Fourier transformation is an automorphism of Sθ(Rn) andextends to an automorphism of S′

θ(Rn).

2. SG-calculus on Gelfand-Shilov spaces

In this section, we illustrate the main results concerning the action of Fourierintegral operators of finite and infinite order on the spaces Sθ(Rn), S′

θ(Rn). These

results have been proved combining the standard arguments of the local theoryof Fourier integral operators on the Gevrey classes, see [6], [18], [33], with thetechniques coming from the SG-calculus on the Schwartz spaces S(Rn),S′(Rn),see [7], [8]. For the sake of brevity, we omit or just sketch the proofs, since theyare given in full detail in [3] and [4].Let µ, ν, θ be real numbers such that µ > 1, ν > 1 and θ ≥ maxµ, ν.


Definition 2.1. For every $A>0$ we denote by $\Gamma^\infty_{\mu,\nu,\theta}(\mathbb R^{2n};A)$ the Fréchet space of all functions $a(x,\eta)\in C^\infty(\mathbb R^{2n})$ satisfying the following condition: for every $\varepsilon>0$
\[ \|a\|_{A,\varepsilon}=\sup_{\alpha,\beta\in\mathbb N^n}\ \sup_{(x,\eta)\in\mathbb R^{2n}}A^{-|\alpha|-|\beta|}(\alpha!)^{-\mu}(\beta!)^{-\nu}\langle\eta\rangle^{|\alpha|}\langle x\rangle^{|\beta|}\exp\bigl[-\varepsilon\bigl(|x|^{\frac1\theta}+|\eta|^{\frac1\theta}\bigr)\bigr]\bigl|D^\alpha_\eta D^\beta_xa(x,\eta)\bigr|<+\infty\,, \]
endowed with the topology defined by the seminorms $\|\cdot\|_{A,\varepsilon}$, $\varepsilon>0$. We set
\[ \Gamma^\infty_{\mu,\nu,\theta}(\mathbb R^{2n})=\varinjlim_{A\to+\infty}\Gamma^\infty_{\mu,\nu,\theta}(\mathbb R^{2n};A) \]
with the topology of inductive limit of Fréchet spaces.

An important subclass of $\Gamma^\infty_{\mu,\nu,\theta}(\mathbb R^{2n})$ is represented by the SG-symbols of finite order, which we define as follows. Let $\mu,\nu$ be real numbers such that $\mu>1$, $\nu>1$, and let $m=(m_1,m_2)$ be a vector of $\mathbb R^2$.

Definition 2.2. For every $B>0$ we denote by $\Gamma^m_{\mu,\nu}(\mathbb R^{2n};B)$ the Banach space of all functions $a(x,\eta)\in C^\infty(\mathbb R^{2n})$ such that
\[ \|a\|_B=\sup_{\alpha,\beta\in\mathbb N^n}\ \sup_{(x,\eta)\in\mathbb R^{2n}}B^{-|\alpha|-|\beta|}(\alpha!)^{-\mu}(\beta!)^{-\nu}\langle\eta\rangle^{-m_1+|\alpha|}\langle x\rangle^{-m_2+|\beta|}\bigl|D^\alpha_\eta D^\beta_xa(x,\eta)\bigr|<+\infty\,, \]
endowed with the norm $\|\cdot\|_B$, and define
\[ \Gamma^m_{\mu,\nu}(\mathbb R^{2n})=\varinjlim_{B\to+\infty}\Gamma^m_{\mu,\nu}(\mathbb R^{2n};B)\,. \]
We observe that $\Gamma^m_{\mu,\nu}(\mathbb R^{2n})\subset\Gamma^\infty_{\mu,\nu,\theta}(\mathbb R^{2n})$ for every $m\in\mathbb R^2$ and for all $\theta\ge\max\{\mu,\nu\}$.

Definition 2.3. A function $\varphi\in\Gamma^e_{\mu,\nu}(\mathbb R^{2n})$ will be called a phase function if it is real-valued and there exists a positive constant $C_\varphi$ such that
\[ C_\varphi^{-1}\langle x\rangle\le\langle\nabla_\eta\varphi\rangle\le C_\varphi\langle x\rangle\,, \tag{2.1} \]
\[ C_\varphi^{-1}\langle\eta\rangle\le\langle\nabla_x\varphi\rangle\le C_\varphi\langle\eta\rangle\,. \tag{2.2} \]
We shall denote by $P_{\mu,\nu}$ the space of all phase functions from $\Gamma^e_{\mu,\nu}(\mathbb R^{2n})$.
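As a simple illustration (our own, but consistent with the observation after (2.4) that the choice $\varphi(x,\eta)=\langle x,\eta\rangle$ yields pseudodifferential operators): the linear phase belongs to $P_{\mu,\nu}$. Indeed, it is real-valued, $\nabla_\eta\varphi=x$ and $\nabla_x\varphi=\eta$, so
\[ \langle\nabla_\eta\varphi\rangle=\langle x\rangle\,,\qquad\langle\nabla_x\varphi\rangle=\langle\eta\rangle\,, \]
and (2.1), (2.2) hold with $C_\varphi=1$; moreover $|D^\alpha_\eta D^\beta_x\langle x,\eta\rangle|\le\langle\eta\rangle^{1-|\alpha|}\langle x\rangle^{1-|\beta|}$, the derivative vanishing for $|\alpha|>1$ or $|\beta|>1$, so that $\varphi\in\Gamma^e_{\mu,\nu}(\mathbb R^{2n})$ with $e=(1,1)$.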

Given a ∈ Γ∞µ,ν,θ(R

2n) and ϕ ∈ Pµ,ν , we can consider the Fourier integraloperator

Aa,ϕu(x) =∫

Rn

eiϕ(x,η)a(x, η)u(η)d−η, u ∈ Sθ(Rn), (2.3)

where we denote d−η = (2π)−ndη. In particular, for ϕ(x, η) = 〈x, η〉, we obtain thepseudodifferential operator of symbol a(x, η)

Au(x) =∫

Rn

ei〈x,η〉a(x, η)u(η)d−η. (2.4)


In view of Proposition 1.5 and Definition 2.1, the integrals (2.3) and (2.4) areabsolutely convergent. Given m ∈ R2, we shall denote by OPSm

µ,ν(Rn) the spaceof all operators of the form (2.4) defined by a symbol a ∈ Γm

µ,ν(R2n). We set

OPSµ,ν(Rn) =⋃

m∈R2

OPSmµ,ν(Rn).

Lemma 2.4. Let ϕ ∈ Γeµ,ν(R2n). Then, for every α, β in Nn, there exists a function

kα,β(x, η) ∈ C∞(R2n) such that

DαηD

βxe

iϕ(x,η) = eiϕ(x,η)kα,β(x, η)

and|kα,β(x, η)| ≤ C|α|+|β||α|!µ|β|!ν ·

·max0,|α|−1∑

h=0

〈x〉h+1−|β|

(h!)µ

max0,|β|−1∑k=0

〈η〉k+1−|α|

(k!)ν(2.5)

for every (x, η) ∈ R2n and for some C > 0 independent of α, β.

With the aid of Lemma 2.4, Propositions 1.1 and 1.5, we obtain the followingresult.

Theorem 2.5. Let ϕ ∈ Pµ,ν . Then, the map (a, u) → Aa,ϕu is a bilinear andseparately continuous map from Γ∞

µ,ν,θ(R2n) × Sθ(Rn) to Sθ(Rn) and it can be

extended to a bilinear and separately continuous map from Γ∞µ,ν,θ(R

2n) × S′θ(R

n)to S′

θ(Rn).

Definition 2.6. An operator of the form (2.3) is said to be θ-regularizing if it canbe extended to a linear continuous map from S′

θ(Rn) to Sθ(Rn).

Proposition 2.7. Let ϕ ∈ Pµ,ν and let a ∈ Sθ(R2n). Then, the operator Aa,ϕ isθ-regularizing.

We now give an asymptotic expansion of symbols from Γ∞µ,ν,θ(R

2n). Let usdenote, for t > 0

Qt = (x, η) ∈ R2n : 〈η〉 < t, 〈x〉 < tand

Qet = R2n \Qt.

Definition 2.8. Let B,C > 0. We shall denote by FS∞µ,ν,θ(R

2n;B,C) the space ofall formal sums

∑j≥0

aj(x, η) such that aj(x, η) ∈ C∞(R2n) for all j ≥ 0 and for

every ε > 0

supj≥0

supα,β∈Nn

sup(x,η)∈Qe

Bjµ+ν−1

C−|α|−|β|−2j(α!)−µ(β!)−ν(j!)−µ−ν+1〈η〉|α|+j〈x〉|β|+j ·

· exp[−ε(|x| 1θ + |η| 1θ )

] ∣∣DαηD

βxaj(x, η)

∣∣ < +∞. (2.6)


Consider the space FS∞µ,ν,θ(R

2n;B,C) obtained from FS∞µ,ν,θ(R

2n;B,C) bytaking its quotient by the subspace

E =

⎧⎨⎩∑j≥0

aj(x, η) ∈ FS∞µ,ν,θ(R

2n;B,C) : supp (aj) ⊂ QBjµ+ν−1 ∀j ≥ 0

⎫⎬⎭ .

By abuse of notation, we shall denote the elements of FS∞µ,ν,θ(R

2n;B,C) by formalsums of the form

∑j≥0

aj(x, η). The arguments in the following are independent of

the choice of representative. We observe that FS∞µ,ν,θ(R

2n;B,C) is a Frechet spaceendowed with the seminorms given by the left-hand side of (2.6), for ε > 0. We set

FS∞µ,ν,θ(R

2n) = lim−→

B,C→+∞FS∞

µ,ν,θ(R2n;B,C).

Each symbol a∈Γ∞µ,ν,θ(R

2n) can be identified with an element∑j≥0

aj of FS∞µ,ν,θ(R

2n)

by setting a0 = a and aj = 0 for all j ≥ 1.

Definition 2.9. We say that two sums∑j≥0

aj ,∑j≥0

a′j from FS∞µ,ν,θ(R

2n) are equiva-

lent

(we write

∑j≥0

aj(x, η) ∼∑j≥0

a′j(x, η)

)if there exist constants B,C > 0 such

that for all ε > 0

supN∈Z+

supα,β∈Nn

sup(x,η)∈Qe

BNµ+ν−1

C−|α|−|β|−2N(α!)−µ(β!)−ν(N !)−µ−ν+1〈η〉|α|+N 〈x〉|β|+N ·

· exp[−ε(|x| 1θ + |η| 1θ )

] ∣∣∣∣∣∣DαηD

βx

∑j<N

(aj − a′j)

∣∣∣∣∣∣ < +∞.

Theorem 2.10. Given a sum∑j≥0

aj ∈ FS∞µ,ν,θ(R

2n), we can find a symbol a in

Γ∞µ,ν,θ(R

2n) such that

a ∼∑j≥0

aj in FS∞µ,ν,θ(R

2n).

Proposition 2.11. Let ϕ ∈ Pµ,ν and a ∈ Γ∞µ,ν,θ(R

2n) such that a ∼ 0. Then, theoperator Aa,ϕ is θ-regularizing.

In the following statements, we will need stronger assumptions on µ, ν, θ.Namely, we will assume that

1 < µ ≤ ν, θ ≥ µ + ν − 1. (2.7)

These assumptions are crucial to define the product of a pseudodifferential and aFourier integral operator in our classes. In particular, the condition θ ≥ µ+ ν − 1is related to the loss of of Gevrey regularity occurring in the stationary phasemethod, cf. [5], [16], [17] and in the composition formula, cf. [1], [6], [19], [22], [33].In some particular cases, the condition (2.7) can be relaxed, see Remark 2.13.

Page 93: Recent Advances in Operator Theory and its Applications: The Israel Gohberg Anniversary Volume

Fourier Integral Operators 87

Theorem 2.12. Let µ, ν, θ be real numbers satisfying (2.7) and let

Aa,ϕu(x) =∫

Rn

eiϕ(x,η)a(x, η)u(η)d−η,

Pu(x) =∫

Rn

ei〈x,η〉p(x, η)u(η)d−η,

where ϕ ∈ Pµ,ν , a ∈ Γ∞µ,ν,θ(R

2n), p ∈ Γmµ,ν(R2n) for some m = (m1,m2) ∈ R2.

Then, PAa,ϕ is, modulo θ-regularizing operators, a Fourier integral operator withphase function ϕ and symbol q ∈ Γ∞

µ,ν,θ(R2n). Furthermore,

q(x, η) ∼∑j≥0

∑|α|=j

(α!)−1Dαz

((∂α

η p)(x, ∇xϕ(x, z, η))a(z, η))|z=x

(2.8)

in FS∞µ,ν,θ(R

2n) with

∇xϕ(x, z, η) =∫ 1

0

(∇xϕ)(z + τ(x − z), η)dτ.

Remark 2.13. With the same notation as in Theorem 2.12, we see that if a∼∑h≥0

ah

in FS∞µ,ν,θ(R

2n), then

q(x, η) ∼∑j≥0

∑|α|=j−h

(α!)−1Dαz

((∂α

η p)(x, ∇xϕ(x, z, η))ah(z, η))|z=x

.

Moreover, when ϕ(x, η) = 〈x, η〉, we have from (2.8) the standard formula for thesymbol of the product of two pseudodifferential operators. We emphasize that,as we are dealing only with pseudodifferential operators, we can simply assumeµ > 1, ν > 1 in Theorem 2.12 instead of 1 < µ ≤ ν. Finally, we observe that if Pis a differential operator, the sum in (2.8) is finite and Theorem 2.12 holds underthe weaker assumptions µ > 1, ν > 1, θ ≥ maxµ, ν.

Remark 2.14. Given µ, ν, θ satisfying (2.7), if ϕ ∈ Pµ,µ, p ∈ Γmµ,µ(R2n) and a ∈

Γ∞ν,µ,θ(R

2n), then the operator PAa,ϕ is a Fourier integral operator with phase ϕ

and symbol q ∈ Γ∞ν,µ,θ(R

2n) satisfying (2.8) in FS∞ν,µ,θ(R

2n).

Under the same assumptions of Theorem 2.12, we are also interested in thestudy of the operator Aa,ϕP, which will occur in the proofs of the statements ofSection 5. Let us give a preliminary result.

Lemma 2.15. Let a ∈ Γ∞µ,ν,θ(R

2n) and ϕ ∈ Pµ,ν and consider the transpose tAa,ϕ

of the operator Aa,ϕ defined by

〈tAa,ϕu, v〉 = 〈u,Aa,ϕv〉, u ∈ S′θ(R

n), v ∈ Sθ(Rn).

Then, we havetAa,ϕ = F Aa,ϕ F−1,

where we denote a(x, η) = a(η, x), ϕ(x, η) = ϕ(η, x).


Theorem 2.16. Let µ, ν, θ be real numbers satisfying (2.7) and let

Aa,ϕu(x) =∫

Rn

eiϕ(x,η)a(x, η)u(η)d−η,

Pu(x) =∫

Rn

ei〈x,η〉p(x, η)u(η)d−η,

where ϕ ∈ Pµ,µ, a ∈ Γ∞µ,ν,θ(R

2n), p ∈ Γmµ,µ(R2n) for some m = (m1,m2) ∈ R2.

Then, the operator Aa,ϕP is, modulo θ-regularizing operators, a Fourier integraloperator with phase function ϕ and symbol h ∈ Γ∞

µ,ν,θ(R2n) such that

h(x, η) ∼∑j≥0

∑|α|=j

(α!)−1Dαζ

((∂α

x p)(∇ηϕ(x, ζ, η), η)a(x, ζ))|ζ=η

(2.9)

in FS∞µ,ν,θ(R

2n), where

∇ηϕ(x, ζ, η) =∫ 1

0

(∇ηϕ)(x, ζ + τ(η − ζ))dτ.

Proof. By Lemma 2.15, Theorem 2.12 and Remark 2.14, denoting by P the op-erator with symbol p(x, η) = p(η, x), we can write

Aa,ϕP = t(tP tAa,ϕ) =t [(F(P Aa,ϕ)F−1)] =t [FAh,ϕF−1]

with h ∈ Γ∞ν,µ,θ(R

2n) such that

h(x, η) ∼∑

α

(α!)−1Dαz

((∂α

η p)(∇ηϕ(η, z, x), x)a(η, z))|z=x

in FS∞ν,µ,θ(R

2n).

Then, applying again Lemma 2.15, we deduce that Aa,ϕP = Ah,ϕ where h satisfies(2.9). Remark 2.17. The results of this section obviously hold also for symbols of finiteorder introduced in Definition 2.2. Nevertheless, in view of the applications in thenext few sections, it is convenient to have for them more precise results, obtainedby defining in a suitable way formal sums of finite order and a correspondingequivalence relation.

Let µ, ν be real numbers such that µ > 1, ν > 1, and let m = (m1,m2) ∈ R2.

Definition 2.18. Let B,C > 0. We shall denote by FSmµ,ν(R2n;B,C) the space of

all formal sums∑j≥0

pj(x, η) such that pj(x, η) ∈ C∞(R2n) for all j ≥ 0 and

supj≥0

supα,β∈Nn

sup(x,η)∈Qe

Bjµ+ν−1

C−|α|−|β|−2j(α!)−µ(β!)−ν(j!)−µ−ν+1·

·〈η〉−m1+|α|+j〈x〉−m2+|β|+j∣∣Dα

ηDβxpj(x, η)

∣∣ < +∞, (2.10)

where, as in Definition 2.8, we identify two sums∑j≥0

pj ,∑j≥0

p′j if pj − p′j vanish in

QBjµ+ν−1 for all j ≥ 0. We set FSmµ,ν(R2n) = lim

−→B,C→+∞

FSmµ,ν(R2n;B,C).


Definition 2.19. We say that two sums∑j≥0

pj ,∑j≥0

p′j from FSmµ,ν(R2n) are equiva-

lent

(we write

∑j≥0

pj(x, η) ∼∑j≥0

p′j(x, η)

)if there exist constants B,C > 0 such

that

supN∈Z+

supα,β∈Nn

sup(x,η)∈Qe

BNµ+ν−1

C−|α|−|β|−2N(α!)−µ(β!)−ν(N !)−µ−ν+1·

·〈η〉−m1+|α|+N 〈x〉−m2+|β|+N

∣∣∣∣∣∣DαηD

βx

∑j<N

(pj − p′j)

∣∣∣∣∣∣ < +∞.

Theorem 2.10 and Proposition 2.11 can be formulated for symbols of finiteorder starting from Definitions 2.18 and 2.19. Moreover, with the same notationas in Theorems 2.12 and 2.16, if a ∈ Γm′

µ,ν(R2n) for some m′ ∈ R2, then theoperators PAa,ϕ and Aa,ϕP are Fourier integral operators with phase function ϕ

and symbol q, respectively h, in Γm+m′µ,ν (R2n) satisfying (2.8), respectively (2.9),

in FSm+m′µ,ν (R2n).

3. Elliptic and polyhomogeneous symbols of finite order

In this section, we investigate two important typologies of SG-symbols of finiteorder and the relations between them. We will restrict our attention to pseudo-differential operators defined by such symbols. Let µ, ν be real numbers such thatµ > 1, ν > 1.

Definition 3.1. A symbol p ∈ Γmµ,ν(R2n) is said to be elliptic if there exist B,C > 0

such that|p(x, η)| ≥ C〈η〉m1 〈x〉m2 ∀(x, η) ∈ Qe

B.

Theorem 3.2. Given p ∈ Γmµ,ν(R2n) elliptic, we can find E,E′ ∈ OPS−m

µ,ν (Rn) suchthat EP = I + R,PE′ = I + R′, where I is the identity operator on S′

θ(Rn) and

R,R′ are θ-regularizing operators.

Proof. By Theorem 2.10, the symbols e and e′ can be constructed starting fromtheir asymptotic expansions

∑j≥0

ej,∑j≥0

e′j. We can define e0 ∈ C∞(R2n) such that

e0(x, η) = p(x, η)−1 ∀(x, η) ∈ QeB

and by induction on j ≥ 1

ej(x, η) = −e0(x, η)∑

0<|α|≤j

(α!)−1∂αη ej−|α|(x, η)Dα

x p(x, η).

By induction on j, it is easy to prove that∑j≥0

ej ∈ FS−mµ,ν (R2n). Moreover, by

Theorem 2.12 and Remark 2.13, the operator E is such that EP = I + R. Theconstruction of E′ is analogous.

Page 96: Recent Advances in Operator Theory and its Applications: The Israel Gohberg Anniversary Volume

90 M. Cappiello

Corollary 3.3. Let p ∈ Γmµ,ν(R2n) be an elliptic symbol and let f ∈ Sθ(Rn) for some

θ ≥ µ + ν − 1. Then, if u ∈ S′θ(R

n) is a solution of the equation Pu = f, thenu ∈ Sθ(Rn).

We can also define the notion of ellipticity of a symbol with respect to another one.

Definition 3.4. Let p ∈ Γmµ,ν(R2n), q ∈ Γm′

µ,ν(R2n) for some m,m′ ∈ R2. We say thatp is elliptic with respect to q if there exist B,C > 0 such that

|p(x, η)| ≥ C〈η〉m1〈x〉m2

for all (x, η) ∈ QeB ∩ supp(q).

In particular, the symbol p is elliptic if and only if p is elliptic with respectto q ≡ 1. Arguing as in the proof of Theorem 3.2, it is easy to prove the followingresult.

Proposition 3.5. Given p ∈ Γmµ,ν(R2n) elliptic with respect to q ∈ Γm′

µ,ν(R2n), wecan find E,E′ ∈ OPSm′−m

µ,ν (Rn) such that EP = Q + R,PE′ = Q + R′, whereR,R′ are θ-regularizing operators.

We can now introduce polyhomogeneous SG-symbols. We follow the approachof Y. Egorov and B.-W. Schulze [13], [29], who have treated polyhomogeneousSG-symbols in the S-S′-framework. Namely, we will define three classes whoseelements are polyhomogeneous in x, in η and in both x, η, respectively. Beforegiving precise definitions of these spaces, we need to introduce in our context anotion of asymptotic expansion with respect to x and η separately. Let µ, ν be realnumbers such that µ > 1, ν > 1 and let m = (m1,m2) be a vector of R2.

Definition 3.6. We denote by FSmµ,ν,η(R2n) the space of all formal sums

∑j≥0

pj(x, η)

such that pj ∈ C∞(R2n)∀j ≥ 0 and there exist B,C > 0 such that

supj≥0

supα,β∈Nn

sup〈η〉≥Bjµ+ν−1

x∈Rn

C−|α|−|β|−j(α!)−µ(β!)−ν(j!)−µ−ν+1·

·〈η〉−m1+|α|+j〈x〉−m2+|β| ∣∣DαηD

βxpj(x, η)

∣∣ < +∞. (3.1)

As in Definition 2.19, we can define an equivalence relation among the ele-ments of FSm

µ,ν,η(R2n).

Definition 3.7. Two sums∑j≥0

pj ,∑j≥0

p′j ∈ FSmµ,ν,η(R2n) are said to be equivalent(

we write∑j≥0

pj ∼η

∑j≥0

p′j

)if there exist B,C > 0 such that

supN∈Z+

supα,β∈Nn

sup〈η〉≥BNµ+ν−1

x∈Rn

C−|α|−|β|−N(α!)−µ(β!)−ν(N !)−µ−ν+1·

·〈η〉−m1+|α|+N 〈x〉−m2+|β|

∣∣∣∣∣∣DαηD

βx

∑j<N

(pj − p′j)

∣∣∣∣∣∣ < +∞.

Page 97: Recent Advances in Operator Theory and its Applications: The Israel Gohberg Anniversary Volume

Fourier Integral Operators 91

In an analogous way, we can define the space FSmµ,ν,x(R2n) and the corre-

sponding relation ∼x .

Remark 3.8. We observe that

FSmµ,ν(R2n) ⊂ FSm

µ,ν,η(R2n) ∩ FSmµ,ν,x(R2n).

Furthermore, if∑j≥0

pj ∼∑j≥0

p′j in FSmµ,ν(R2n), then∑

j≥0

pj ∼η

∑j≥0

p′j and∑j≥0

pj ∼x

∑j≥0

p′j.

Similarly to Theorem 2.10, we have the following result.

Proposition 3.9. Given∑j≥0

pj ∈ FSmµ,ν,η(R2n),

∑j≥0

qj ∈ FSmµ,ν,x(R2n), then there

exist p, q ∈ Γmµ,ν(R2n) such that

p ∼η

∑j≥0

pj in FSmµ,ν,η(R2n),

q ∼x

∑j≥0

qj in FSmµ,ν,x(R2n).

We can define the following classes of homogeneous symbols.

Definition 3.10. We denote by Γ[m1],m2µ,ν (R2n) the space of all symbols p ∈ Γm

µ,ν(R2n)such that p(x, λη) = λm1p(x, η) ∀λ ≥ 1, |η| ≥ c > 0, x ∈ Rn. Analogously, wedefine the space Γm1,[m2]

µ,ν (R2n) by interchanging the roles of x and η. Finally, weset

Γ[m1],[m2]µ,ν (R2n) = Γ[m1],m2

µ,ν (R2n) ∩ Γm1,[m2]µ,ν (R2n).

Using Definitions 3.6, 3.7, 3.10, we can now introduce polyhomogeneous sym-bols.

Definition 3.11. We denote by Γm1,[m2]µ,ν,cl(η)(R

2n) the space of all p ∈ Γm1,[m2]µ,ν (R2n)

satisfying the following condition: there exists a sum∑k≥0

pk ∈ FSmµ,ν,η(R2n) such

that pk ∈ Γ[m1−k],[m2]µ,ν (R2n) ∀k ≥ 0 and p ∼η

∑k≥0

pk in FSmµ,ν,η(R2n).

Definition 3.12. We denote by Γm1,m2µ,ν,cl(η)(R

2n) the space of all symbols p ∈ Γmµ,ν(R2n)

satisfying the following condition: there exists a sum∑k≥0

pk ∈ FSmµ,ν,η(R2n) such

that pk ∈ Γ[m1−k],m2µ,ν (R2n) ∀k ≥ 0 and p ∼η

∑k≥0

pk in FSmµ,ν,η(R2n).

Analogous definitions can be given for the spaces Γ[m1],m2µ,ν,cl(x)(R

2n) and Γm1,m2µ,ν,cl(x)(R

2n),by interchanging the roles of x and η. Finally, we define a space of symbols whichare polyhomogeneous with respect to both the variables.

Page 98: Recent Advances in Operator Theory and its Applications: The Israel Gohberg Anniversary Volume

92 M. Cappiello

Definition 3.13. We denote by Γmµ,ν,cl(R

2n) the space of all symbols p ∈ Γmµ,ν(R2n)

for which the following conditions hold:

i) there exists∑k≥0

pk ∈ FSmµ,ν,η(R2n) such that pk ∈ Γ[m1−k],m2

µ,ν,cl(x) (R2n) ∀k ∈ N,

p ∼η

∑k≥0

pk in FSmµ,ν,η(R2n) and p−

∑k<N

pk ∈ Γm1−N,m2µ,ν,cl(x) (R2n) ∀N ∈ Z+;

ii) there exists∑h≥0

qh ∈ FSmµ,ν,x(R2n) such that qh ∈ Γm1,[m2−h]

µ,ν,cl(η) (R2n) ∀h ∈ N,

p ∼x

∑h≥0

qh in FSmµ,ν,x(R2n) and p−

∑h<N

qh ∈ Γm1,m2−Nµ,ν,cl(η) (R2n) ∀N ∈ Z+.

The following inclusions hold:

Γm1,[m2]µ,ν,cl(η)(R

2n) ⊂ Γmµ,ν,cl(R

2n), Γ[m1],m2

µ,ν,cl(x)(R2n) ⊂ Γm

µ,ν,cl(R2n). (3.2)

A simple homogeneity argument shows that for every p ∈ Γmµ,ν,cl(R

2n) andfor every k ∈ N, there exists a unique function σm1−k

ψ (p) ∈ C∞(Rn × (Rn \ 0))such that σm1−k

ψ (p)(x, λη) = λm1−kσm1−kψ (p)(x, η) for all λ > 0, x ∈ Rn, η = 0

and σm1−kψ (p)(x, η) = pk(x, η) for |η| ≥ c > 0. Analogously, in view of condition

ii) of Definition 3.13, we can associate to every p ∈ Γmµ,ν,cl(R

2n) the functionsσm2−h

e (p) ∀h ∈ N such that σm2−he (p) ∈ C∞((Rn \ 0)×Rn), σm2−h

e (p)(x, η) =qh(x, η) for |x| ≥ c > 0 and σm2−h

e (p)(λx, η) = λm2−hσm2−he (p)(x, η) for all λ >

0, η ∈ Rn, x = 0. We also observe that if ω ∈ Gµ(Rn) is an excision function, i.e.,ω = 0 in a neighborhood of the origin and ω = 1 in a neighborhood of ∞, thenω(η)σm1−k

ψ (p)(x, η) is in Γ[m1−k],m2µ,ν,cl(x) (R2n). Similarly, if χ(x) is an excision function

in Gν(Rn), then χ(x)σm2−he (p)(x, η) is in Γm1,[m2−h]

µ,ν,cl(η) (R2n).By these considerations and by the inclusions (3.2), we can also consider the

functions σm1−kψ (σm2−h

e (p)) and σm2−he (σm1−k

ψ (p)). It is easy to show that

σm1−kψ (σm2−h

e (p)) = σm2−he (σm1−k

ψ (p)) for all h, k ∈ N.

In particular, given p ∈ Γmµ,ν,cl(R

2n), we can consider the triplet

σm1ψ (p), σm2

e (p), σmψe(p),

where we denote σmψe(p) = σm1

ψ (σm2e (p)).

The function σm1ψ (p) is called the homogeneous principal interior symbol of

p and the pair σm2e (p), σm

ψe(p) is the homogeneous principal exit symbol of p.By the previous results, it turns out that, given two excision functions ω(η)

in Gµ(Rn) and χ(x) ∈ Gν(Rn), we have also

p(x, η)− ω(η)σm1ψ (p)(x, η) ∈ Γ(m1−1,m2)

µ,ν,cl (R2n), (3.3)

p(x, η) − χ(x)σm2e (p)(x, η) ∈ Γ(m1,m2−1)

µ,ν,cl (R2n), (3.4)p(x, η)− ω(η)σm1

ψ (p)(x, η) − χ(x)(σm2e (p)(x, η)

− ω(η)σmψe(p)(x, η)) ∈ Γm−e

µ,ν,cl(R2n).

(3.5)

Page 99: Recent Advances in Operator Theory and its Applications: The Israel Gohberg Anniversary Volume

Fourier Integral Operators 93

We denote by OPSmµ,ν,cl(R

n) the set of all operators of the form (2.4) defined bya symbol p ∈ Γm

µ,ν,cl(R2n) and we set, for θ > 1

OPSθcl(R

n) =⋃

m∈R2

µ,ν∈(1,+∞)µ+ν−1≤θ

OPSmµ,ν,cl(R

n).

Remark 3.14. Arguing as in the previous section and applying Remark 2.13, it iseasy to prove that if P ∈ OPSm

µ,ν,cl(Rn), Q ∈ OPSm′

µ,ν,cl(Rn), then the operator

PQ is in OPSm+m′µ,ν,cl (Rn).

We recall (cf. Proposition 1.4.37 in [29]) that a symbol p ∈ Γmµ,ν,cl(R

2n) iselliptic if and only if the three following conditions hold:

σm1ψ (p)(x, η) = 0 ∀(x, η) ∈ Rn × (Rn \ 0), (3.6)

σm2e (p)(x, η) = 0 ∀(x, η) ∈ (Rn \ 0)× Rn, (3.7)

σmψe(p)(x, η) = 0 ∀(x, η) ∈ (Rn \ 0)× (Rn \ 0). (3.8)

Example. Consider a partial differential operator with polynomial coefficients

P =∑

|α|≤m1|β|≤m2

cαβxβDα.

The corresponding symbol belongs to Γmµ,ν,cl(R

2n) for every µ > 1, ν > 1 andm = (m1,m2). The operator P is elliptic in the SG-sense if and only if

σm1ψ =

∑|α|=m1|β|≤m2

cαβxβηα = 0 for η = 0, σm2

e =∑

|α|≤m1|β|=m2

cαβxβηα = 0 for x = 0

andσm

ψe =∑

|α|=m1|β|=m2

cαβxβηα = 0 for x, η = 0.

4. θ-wave front set

In this section, we introduce an appropriate notion of wave front set for distri-butions u ∈ S′

θ(Rn), and prove the standard properties of microellipticity with

respect to the polyhomogeneous operators defined in Section 3. Similar resultshave been proved by S. Coriasco and L. Maniccia [10] for Schwartz tempered dis-tributions. For every ηo ∈ Rn \ 0, we will denote by ∞ηo the projection ηo

|η0|on the unit sphere Sn−1. In the following, an open set V ⊂ Rn is said to be aconic neighborhood of the direction ∞ηo if it is the intersection of an open conecontaining the direction ∞ηo with the complementary set of a closed ball centeredin the origin. The decomposition of the principal symbol into three components inthe previous section suggests to define for the elements of S′

θ(Rn) three sets which

Page 100: Recent Advances in Operator Theory and its Applications: The Israel Gohberg Anniversary Volume

94 M. Cappiello

we will denote by WF θψ,WF θ

e ,WF θψe, θ > 1. To give precise definitions, we need

to introduce two types of cut-off functions.

Definition 4.1. Let yo ∈ Rn and fix ν > 1. We denote by Rνyo

the set of all functionsϕ ∈ Gν

o(Rn) such that 0 ≤ ϕ ≤ 1 and ϕ ≡ 1 in a neighborhood of yo.

Definition 4.2. Let ηo ∈ Rn \ 0 and fix µ > 1. We denote by Zµηo

the set of allfunctions ψ ∈ C∞(Rn) such that ψ(λη) = ψ(η)∀λ ≥ 1 and |η| large, 0 ≤ ψ ≤1, ψ ≡ 1 in a conic neighborhood V of ∞ηo, ψ ≡ 0 outside a conic neighborhoodV ′ of ∞ηo, V ⊂ V ′ and

|Dαηψ(η)| ≤ C|α|+1(α!)µ〈η〉−|α|, η ∈ Rn

for every α ∈ Nn and for some C > 0.

Definition 4.3. Let θ be a positive real number such that θ > 1 and let u ∈ S′θ(R

n).– We say that (xo, ηo) ∈ Rn × (Rn \ 0) is not in WF θ

ψu if there exist positivenumbers µ, ν ∈ (1,+∞) such that θ ≥ maxµ, ν and there exist cut-offfunctions ϕxo in Rν

xo, ψηo ∈ Zµ

ηosuch that ϕxo(ψηo(D)u) ∈ Sθ(Rn).

– We say that (xo, ηo) ∈ (Rn \ 0)×Rn is not in WF θe u if there exist positive

numbers µ, ν ∈ (1,+∞) such that θ ≥ maxµ, ν and there exist cut-offfunctions ϕηo in Rµ

ηo, ψxo ∈ Zν

xosuch that ψxo(ϕηo (D)u) ∈ Sθ(Rn).

– We say that (xo, ηo) ∈ (Rn \ 0)× (Rn \ 0) is not in WF θψeu if there exist

positive numbers µ, ν ∈ (1,+∞) such that θ ≥ maxµ, ν and there existcut-off functions ψxo ∈ Zν

xo, ψηo ∈ Zµ

ηosuch that ψxo(ψηo (D)u) ∈ Sθ(Rn).

Remark 4.4. It is easy to prove that Definition 4.3 is independent of the choiceof µ and ν. In particular, if (xo,∞ηo) /∈ WF θ

ψu, then for any given µ > 1, ν > 1with µ + ν − 1 ≤ θ we may actually find ϕxo ∈ Rν

xo, ψηo ∈ Zµ

ψηosuch that

ϕxo(ψηo(D)u) ∈ Sθ(Rn), and similarly for WF θe u,WF θ

ψeu.

Remark 4.5. We can consider WF θψu as a subset of Rn × Sn−1, being WF θ

ψuinvariant with respect to the multiplication of the second variable η by positivescalars. Analogously, we can consider WF θ

e u ⊂ Sn−1 ×Rn and WF θψeu ⊂ Sn−1 ×

Sn−1.

Remark 4.6. Every u ∈ S′θ(R

n) can be regarded as an element ofD′θ(R

n), accordingto Remark 1.3. It is easy to show that WF θ

ψu coincides with the standard Gevreywave front set of u, cf. [27].

Let us characterize the sets defined before in terms of characteristic manifoldsof polyhomogeneous operators. For p ∈ Γm

µ,ν,cl(R2n), we define

Charψ(P ) = (x, η) ∈ Rn × Sn−1 : σm1ψ (p)(x, η) = 0,

Chare(P ) = (x, η) ∈ Sn−1 × Rn : σm2e (p)(x, η) = 0,

Charψe(P ) = (x, η) ∈ Sn−1 × Sn−1 : σmψe(p)(x, η) = 0.

Page 101: Recent Advances in Operator Theory and its Applications: The Israel Gohberg Anniversary Volume

Fourier Integral Operators 95

Proposition 4.7. Let u ∈ S′θ(R

n). We have the following relations:

WF θψu =

⋂P∈OP Sθ

cl(Rn)

Pu∈Sθ(Rn)

Charψ(P ), WF θe u =

⋂P∈OP Sθ

cl(Rn)

Pu∈Sθ(Rn)

Chare(P ),

WF θψeu =

⋂P∈OP Sθ

cl(Rn)

Pu∈Sθ(Rn)

Charψe(P )

Proof. Let (xo,∞ηo) /∈ WF θψu. Then, by Remark 4.4, there exist µ, ν ∈ (1,+∞)

such that θ ≥ µ+ν−1 and ϕxo in Rνxo, ψηo in Zµ

ηosuch that Pu = ϕxo(ψηo(D)u) ∈

Sθ(Rn). Observe that P is a pseudodifferential operator with symbol ϕxo(x)ψηo (η)in Γ(0,0)

µ,ν,cl(R2n) and that ϕxo(xo)ψηo(ληo) = 1 for λ ∈ R+ sufficiently large. Hence,

(xo,∞ηo) /∈⋂

P∈OP Sθcl

(Rn)

Pu∈Sθ(Rn)

Charψ(P ). Conversely, let us assume that there exists P =

p(x,D) in OPSθcl(R

n) such that Pu ∈ Sθ(Rn) and σψ(p)(xo,∞ηo) = 0. Then,there exists a neighborhood U of xo and a conic neighborhood V of ∞ηo such thatσψ(p)(x,∞η) = 0 ∀(x, η) ∈ U × V. Furthermore, by (3.3), it turns out that if |η|is sufficiently large, we have

|p(x, η)||η|m1〈x〉m2

≥ |σψ(p)(x, η)||η|m1〈x〉m2

− |p(x, η)− σψ(p)(x, η)||η|m1〈x〉m2

≥ C > 0

for some C > 0. Hence we can construct two cut-off functions ϕxo , ψηo supportedin U and in V, respectively, such that p is elliptic with respect to ϕxo(x)ψηo (η). ByProposition 3.5, there exists E ∈ OPS−m

µ,ν (Rn) such that EPu = ϕxo(ψηo (D)u) +Ru, where R is θ-regularizing. Then, ϕxo(ψηo(D)u) = Ru− EPu ∈ Sθ(Rn). Thisgives the statement for WF θ

ψu. The corresponding relation for WF θe u can be ob-

tained with the same argument by simply interchanging the roles of x and η. Asto the third relation, we obtain the inclusion

⋂P∈OP Sθ

cl(Rn)

Pu∈Sθ(Rn)

Charψe(P ) ⊂ WF θψeu di-

rectly again from Definition 4.3 and Remark 4.4. Assume now that there existsP ∈ OPSθ

cl(Rn) such that Pu ∈ Sθ(Rn) and σψe(p)(∞x0,∞η0) = 0. Then, there

exist two conic neighborhoods Vx0 , Vη0 such that σψe(p)(x, η) = 0 if (x, η) is inVx0 × Vη0 . Hence, by (3.5), we have

|p(x, η)||η|m1 |x|m2

≥ C > 0

if |x| and |η| are large enough. Then, we can conclude arguing as for WF θψu.

Theorem 4.8. Let u ∈ S′θ(R

n) and p ∈ Γmµ,ν,cl(R

2n), with µ+ ν − 1 ≤ θ. Then, thefollowing inclusions hold:

WF θψ(Pu) ⊂WF θ

ψu ⊂WF θψ(Pu) ∪ Charψ(P ) (4.1)

WF θe (Pu) ⊂WF θ

e u ⊂WF θe (Pu) ∪ Chare(P ) (4.2)

Page 102: Recent Advances in Operator Theory and its Applications: The Israel Gohberg Anniversary Volume

96 M. Cappiello

WF θψe(Pu) ⊂WF θ

ψeu ⊂WFψe(Pu) ∪ Charψe(P ). (4.3)

Proof. If (xo, ηo) /∈WF θψu, then there exist cut-off functions ϕxo ∈ Rν

xo, ψηo ∈ Zµ

ηo

such that ϕxo(ψηo(D)u) ∈ Sθ(Rn), where, in view of Remark 4.4, we may take thesame µ, ν as for the class Γm

µ,ν,cl(R2n). Shrinking the neighborhoods of xo,∞ηo, we

can construct two cut-off functions ϕxo ∈ Rνxo, ψηo ∈ Zµ

ηosuch that ϕxoϕxo = ϕxo

and ψηoψηo = ψηo . Denote by Q the operator with symbol ϕxoψηo and by Q theoperator with symbol ϕxo ψηo . By Theorem 2.12, we have

QQPu = QPQu + Q[Q,P ]u = QPQu+ Ru,

where R is θ-regularizing. Observe that QQ ∈ OPSθcl(R

n) and σψ(QQ)(xo,∞ηo)= σψ(Q)(xo,∞ηo)σψ(Q)(xo,∞ηo) = 0. Then, by Proposition 4.7, we concludethat (xo,∞ηo) /∈ WF θ

ψ(Pu). This proves the first inclusion in (4.1). Assume nowthat (xo,∞ηo) /∈ WF θ

ψ(Pu). By Proposition 4.7, there exists Q = q(x,D) ∈OPS0

µ,ν,cl(Rn) such that QPu ∈ Sθ(Rn) and σψ(Q)(xo,∞ηo) = 0. Furthermore, if

(xo,∞ηo) /∈ Charψ(P ), then σψ(QP )(xo,∞ηo) = σψ(Q)(x0,∞ηo)σψ(P )(xo,∞ηo)= 0. Moreover, QP ∈ OPSm

µ,ν,cl(Rn). Hence, by Proposition 4.7, we conclude that

(xo,∞ηo) /∈WF θψu. The proofs of (4.2) and (4.3) are analogous.

Proposition 4.9. Let u ∈ S′θ(R

n). Then, u ∈ Sθ(Rn) if and only if WF θψu =

WF θe u = WF θ

ψeu = ∅.

Proof. If WF θψeu = ∅, then, for every (∞xo,∞ηo) ∈ Sn−1×Sn−1, there exist ψxo ∈

Zνxo, ψηo ∈ Zµ

ηosuch that ψxo(ψηo(D)u) ∈ Sθ(Rn). In view of Remark 4.4, we may

fix µ, ν independent of (∞xo,∞ηo). Let us observe that σ(0,0)ψe (ψxo(x)ψηo (η)) = 1 in

a conic set in R2n, obtained as a product of conic sets of Rnx and Rn

η , intersectingSn−1 × Sn−1 in a neighborhood Vxo,ηo of (∞xo,∞ηo). By the compactness ofSn−1 × Sn−1, we can find a finite family (∞xj ,∞ηj), j = 1, . . . , N, such thatVxj ,ηj , j = 1, . . . , N cover Sn−1 × Sn−1. Define

qo(x, η) =∑

j=1,...,N

ψxj (x)ψηj (η).

If |η| > R and |x| > R, with R sufficiently large, then qo(x, η) ≥ C > 0. More-over, by construction, qo(x,D)u ∈ Sθ(Rn). Applying similar compactness argu-ments to x ∈ Rn : |x| ≤ R × Sn−1 and to Sn−1 × η ∈ Rn : |η| ≤ R andusing the assumption WF θ

ψu = WF θe u = ∅, we can construct q1(x, η), q2(x, η)

such that q1(x,D)u ∈ Sθ(Rn), q2(x,D)u ∈ Sθ(Rn) and q1(x, η) ≥ C1 > 0 if|η| > R, |x| ≤ R and q2(x, η) ≥ C2 > 0 if |x| > R, |η| ≤ R. Moreover, obviouslyqo, q1, q2 ∈ Γ(0,0)

µ,ν,cl(R2n). Then, the function q(x, η) = qo(x, η) + q1(x, η) + q2(x, η)

is an elliptic symbol of order (0, 0) and q(x,D)u ∈ Sθ(Rn). Then, u ∈ Sθ(Rn) inview of Corollary 3.3. The inverse implication is trivial.

Page 103: Recent Advances in Operator Theory and its Applications: The Israel Gohberg Anniversary Volume

Fourier Integral Operators 97

We conclude with a proposition which makes clear in what sense the exitcomponents WF θ

e ,WF θψe determine the behavior of a distribution of S′

θ(Rn) at

infinity. The proof follows the same arguments of the proof of Proposition 4.9. Weomit it for sake of brevity.

Proposition 4.10. Let u ∈ S′θ(R

n) and denote by Πx : R2nx,η −→ Rn

x the standardprojection on the variable x. If x0 /∈ Πx(WF θ

e u∪WF θψeu), then there exists ψx0 ∈

Zθx0

such that ψx0u ∈ Sθ(Rn).

5. Action of SG-Fourier integral operators on the θ-wave front set

In this section, we study the action of the Fourier integral operators of infinite or-der defined in Section 2 on the θ-wave front set of ultradistributions u ∈ S′

θ(Rn).

The results presented here have an analogous local version for the standard Gevreyultradistributions from D′

θ(Rn), see [27] and the references there. The correspond-

ing analysis of the action of SG-operators of finite order on the S-wave front set isdue to S. Coriasco and L. Maniccia [10]. We start introducing further assumptionson the phase functions. Given µ > 1, we will denote by Pµ,µ,cl the space of allphase functions ϕ ∈ Pµ,µ such that ϕ is in Γe

µ,µ,cl(R2n). Given ϕ ∈ Pµ,µ,cl, we can

consider the following three maps:

Φψ :(x, ξ = σ1

ψ(∇xϕ)(x, η))−→

(y = σ0

ψ(∇ηϕ)(x, η), η), (5.1)

Φe :(x, ξ = σ0

e(∇xϕ)(x, η))−→

(y = σ1

e(∇ηϕ)(x, η), η), (5.2)

Φψe :(x, ξ = σe1

ψe(∇xϕ)(x, η))−→

(y = σe2

ψe(∇ηϕ)(x, η), η). (5.3)

Definition 5.1. A phase function ϕ ∈ Pµ,µ is said to be regular if there exists C > 0such that

sup(x,η)∈R2n

∣∣∣∣det(

∂2ϕ

∂xj∂ηk

)(x, η)

∣∣∣∣ ≥ C > 0 (5.4)

for all j, k = 1, . . . ,m.

We can apply in particular to a regular phase function ϕ ∈ Pµ,µ the followingresults from [8], [10] in the C∞-setting.

Proposition 5.2. If ϕ ∈ Pµ,µ,cl is regular, then the maps Φψ,Φe,Φψe are globaldiffeomorphisms acting on Rn × Sn−1, Sn−1 × Rn, Sn−1 × Sn−1, respectively.

Proof. See Proposition 12 in [8] for the proof. Theorem 5.3. Let µ, ν, θ be real numbers satisfying (2.7) and let a∈Γ∞

µ,ν,θ(R2n), ϕ∈

Pµ,µ,cl regular. Then, for every u ∈ S′θ(R

n), we have the following inclusions:

WF θψ(Aa,ϕu) ⊂ Φ−1

ψ (WF θψu) (5.5)

WF θe (Aa,ϕu) ⊂ Φ−1

e (WF θe u) (5.6)

WF θψe(Aa,ϕu) ⊂ Φ−1

ψe (WF θψeu). (5.7)

Page 104: Recent Advances in Operator Theory and its Applications: The Israel Gohberg Anniversary Volume

98 M. Cappiello

Proof. We will only prove (5.5) for sake of brevity. The proofs of (5.6) and (5.7) donot present further difficulties. Let (yo, ηo) /∈ WF θ

ψu. Then, by Remark 4.4, thereexist cut-off functions ϕyo ∈ Rµ

yo, ψηo ∈ Zµ

ηosuch that Cu = ϕyo(ψηo(D)u) ∈

Sθ(Rn). We want to prove that there exist ϕxo ∈ Rνxo, ψξo ∈ Zµ

ξosuch that

ϕxo(ψξo(D)Aa,ϕu) ∈ Sθ(Rn). We will fix the supports of ϕxo and ψξo later. Let usdenote by T the operator with symbol t(x, η) = ϕxo(x)ψξo (η). We can write

TAa,ϕu = TAa,ϕCu + TAa,ϕEu,

where E = I − C. We obviously have TAa,ϕCu ∈ Sθ(Rn). To prove (5.5), it issufficient to show that also TAa,ϕEu ∈ Sθ(Rn). Indeed, we want to prove that, bysuitably choosing the supports of ϕxo and ψξo , the operator TAa,ϕE turns out tobe θ-regularizing. Let us first consider the operator B = TAa,ϕ. By Theorem 2.12,B is a Fourier integral operator with phase function ϕ and symbol b ∈ Γ∞

µ,ν,θ(R2n)

such that

b(x, η) ∼∑α

(α!)−1Dαz

((∂α

η t)(x, ∇xϕ(x, z, η))a(z, η))|z=x

(5.8)

in FS∞µ,ν,θ(R

2n). We observe that all the terms in the sum (5.8) contain derivativesof t evaluated in (x,∇xϕ(x, η)). Moreover, by (3.3), we know that ∇xϕ(x, η) =σ1

ψ(∇xϕ)(x, η) mod Γ(0,0)µ,µ for |η| large. Then, by the properties of ϕ, we can assume

that there exists a neighborhood Uxo of xo and a conic neighborhood Vξo of ∞ξo

such that b(x, η) vanishes when (x, ξ) ∈ R2n \ (Uxo × Vξo) . Furthermore, we cantake Uxo and Vξo as small as we want by shrinking the supports of ϕxo and ψξo .Let us now consider BE. By Theorem 2.16, BE is a Fourier integral operator withphase function ϕ and symbol h ∈ Γ∞

µ,ν,θ(R2n) such that

h(x, η) ∼∑α

(α!)−1Dαζ

((∂α

x e)(∇ηϕ(x, ζ, η), η)b(x, ζ))|ζ=η

(5.9)

in FS∞µ,ν,θ(R

2n). Observe that all the terms of the sum (5.9) contain some deriva-tives of e evaluated in (∇ηϕ(x, η), η). Arguing as before, we can conclude that thereexists a neighborhood Uyo of yo and a conic neighborhood Vηo of ∞ηo dependingonly on the supports of ϕyo and ψηo such that e(y, η) = 0 for (y, η) ∈ Uyo × Vηo .Furthermore, by the condition (5.4), it turns out that also the maps

M1 : (x, η) −→ (y, η)

M2 : (x, η) −→ (x, ξ)

defined in terms of (5.5) are global diffeomorphisms on Rn×Sn−1, cf. Proposition12 in [8]. By homogeneity, we deduce that we can choose the supports of ϕxo andψξo sufficiently small such that

M1(M−12 (Uxo × Vξo)) ⊂ Uyo × Vηo .

This gives h ∼ 0 in FS∞µ,ν,θ(R

2n). Then, (5.5) follows from Proposition 2.11.

Page 105: Recent Advances in Operator Theory and its Applications: The Israel Gohberg Anniversary Volume

Fourier Integral Operators 99

Acknowledgment

Thanks are due to Professor Luigi Rodino for helpful discussions and comments.The author also wishes to thank Professor Cornelis Van der Mee and the refer-ees of the paper for several useful remarks which led to an improvement of themanuscript.

References

[1] L. Boutet de Monvel and P. Kree, Pseudodifferential operators and Gevrey classes,Ann. Inst. Fourier, Grenoble, 17 (1967), 295–323.

[2] M. Cappiello, Pseudodifferential operators and spaces of type S, in “Progress inAnalysis” Proceedings 3rd Int. ISAAC Congress, Vol. I, Editors G.W. Begehr, R.B.Gilbert, M.W. Wong, World Scientific, Singapore (2003), 681–688.

[3] M. Cappiello, Pseudodifferential parametrices of infinite order for SG-hyperbolicproblems, Rend. Sem. Mat. Univ. Pol. Torino, 61, 4 (2003), 411–441.

[4] M. Cappiello, Fourier integral operators of infinite order and applications to SG-hyperbolic equations, Preprint 2003. To appear in Tsukuba J. Math.

[5] F. Cardin and A. Lovison, Lack of critical phase points and exponentially faint illu-mination, Preprint 2004.

[6] L. Cattabriga and L. Zanghirati, Fourier integral operators of infinite order onGevrey spaces. Application to the Cauchy problem for certain hyperbolic operators,J. Math. Kyoto Univ., 30 (1990), 142–192.

[7] H.O. Cordes, The technique of pseudodifferential operators, Cambridge Univ. Press,1995.

[8] S. Coriasco, Fourier integral operators in SG classes.I. Composition theorems andaction on SG-Sobolev spaces, Rend. Sem. Mat. Univ. Pol. Torino, 57 n. 4 (1999),249–302.

[9] S. Coriasco, Fourier integral operators in SG classes.II. Application to SG hyperbolicCauchy problems, Ann. Univ. Ferrara Sez VII, 44 (1998), 81–122.

[10] S. Coriasco and L. Maniccia, Wave front set at infinity and hyperbolic linear operatorswith multiple characteristics, Ann. Global Anal. and Geom., 24 (2003), 375–400.

[11] S. Coriasco and P. Panarese, Fourier integral operators defined by classical symbolswith exit behaviour, Math. Nachr., 242 (2002), 61–78.

[12] S. Coriasco and L. Rodino, Cauchy problem for SG-hyperbolic equations with constantmultiplicities, Ricerche di Matematica, Suppl. Vol. XLVIII (1999), 25–43.

[13] Y. Egorov and B.-W. Schulze, Pseudo-differential operators, Singularities, Applica-tions, Birkhauser, 1997.

[14] I.M. Gelfand and G.E. Shilov, Generalized functions, Vol. 2, Academic Press, NewYork-London, 1968.

[15] I.M. Gelfand and N. Ya. Vilenkin, Generalized functions, Vol. 4, Academic Press,New York-London, 1964.

[16] T. Gramchev, Stationary phase method in the Gevrey classes and the Gevrey wavefront sets, C. R. Acad. Bulgare Sci. 36 (1983), n. 12, 1487–1489.

Page 106: Recent Advances in Operator Theory and its Applications: The Israel Gohberg Anniversary Volume

100 M. Cappiello

[17] T. Gramchev, The stationary phase method in Gevrey classes and Fourier integraloperators on ultradistributions, Banach Center Publ., PWN, Warsaw, 19 (1987), 101–111.

[18] T. Gramchev and P. Popivanov, Partial differential equations: Approximate solutionsin scales of functional spaces, Math. Research, 108, WILEY-VCH, Berlin, 2000.

[19] S. Hashimoto, T. Matsuzawa and Y. Morimoto, Operateurs pseudo-differentiels etclasses de Gevrey, Comm. Partial Differential Equations, 8 (1983), 1277–1289.

[20] L. Hormander, Fourier integral operators I, Acta Math. 127 (1971), 79–183.

[21] H. Kumano-go, Pseudodifferential operators, MIT Press, 1981.

[22] R. Lascar, Distributions integrales de Fourier et classes de Denjoy-Carleman. Ap-plications, C. R. Acad. Sci. Paris Ser. A-B 284 (1977), no. 9, A485–A488.

[23] M. Mascarello and L. Rodino, Partial differential equations with multiple character-istics, Akademie Verlag, Berlin, 1997.

[24] B.S. Mitjagin, Nuclearity and other properties of spaces of type S, Amer. Math. Soc.Transl., Ser. 2 93 (1970), 45–59.

[25] C. Parenti, Operatori pseudodifferenziali in Rn e applicazioni, Ann. Mat. Pura Appl.93 (1972), 359–389.

[26] S. Pilipovic, Tempered ultradistributions, Boll. U.M.I. 7 2-B (1988), 235–251.

[27] L. Rodino, Linear Partial Differential Operators in Gevrey Spaces, World ScientificPublishing Co., Singapore, 1993.

[28] E. Schrohe, Spaces of weighted symbols and weighted Sobolev spaces on manifolds, InH. O. Cordes, B. Gramsch and H. Widom editors, Proceedings, Oberwolfach, 1256Springer LNM, New York (1986), 360–377.

[29] B.-W. Schulze, Boundary value problems and singular pseudodifferential operators,J. Wiley & sons, Chichester, 1998.

[30] K. Shinkai and K. Taniguchi, On ultra wave front sets and Fourier integral operatorsof infinite order, Osaka J. Math., 27 (1990), 709–720.

[31] K. Taniguchi, Fourier integral operators in Gevrey class on Rn and the fundamentalsolution for a hyperbolic operator, Publ. RIMS Kyoto Univ., 20 (1984), 491–542.

[32] K. Yagdjan, The Cauchy problem for hyperbolic operators: multiple characteristics,microlocal approach, Akademie Verlag, Berlin, 1997.

[33] L. Zanghirati, Pseudodifferential operators of infinite order and Gevrey classes, Ann.Univ Ferrara, Sez. VII, Sc. Mat., 31 (1985), 197–219.

Marco CappielloDipartimento di MatematicaUniversita degli Studi di TorinoVia Carlo Alberto, 10I-10123 Torino, Italye-mail: [email protected]

Page 107: Recent Advances in Operator Theory and its Applications: The Israel Gohberg Anniversary Volume

Operator Theory:Advances and Applications, Vol. 160, 101–160c© 2005 Birkhauser Verlag Basel/Switzerland

Strongly Regular J-inner Matrix-valuedFunctions and Inverse Problemsfor Canonical Systems

Damir Z. Arov and Harry Dym

To Israel Gohberg, valued teacher, colleague and friend, on his 75th birthday.

Abstract. This paper provides an introduction to the role of strongly regularJ-inner matrix-valued functions in the analysis of inverse problems for canon-ical integral and differential systems. A number of the main results that weredeveloped in a series of papers by the authors are surveyed and examples andapplications are presented, including an application to the matrix Schrodingerequation. The approach of M.G. Krein to inverse problems is discussed briefly.

Mathematics Subject Classification (2000). Primary 34A55, 45Q05, 47B32,46E22 Secondary 34L40, 30E05.

Keywords. canonical systems, differential systems with potential, inverse prob-lems, de Branges spaces, J-inner matrix-valued functions, interpolation, re-producing kernel Hilbert spaces, Dirac systems, Krein systems, Schrodingersystems.

1. Introduction

The purpose of this paper is to present a survey of the useful role played by theclass of strongly regular J-inner mvf’s (matrix-valued functions) in the theory ofdirect and inverse problems for canonical integral and differential systems. Weshall not present proofs, unless they are short and instructive. A more completeanalysis that includes all the missing details may be found in the cited references.A number of illustrative examples are included.

D.Z. Arov thanks the Weizmann Institute of Science for for hospitality and support, throughThe Minerva Foundation. H. Dym thanks Renee and Jay Weiss for endowing the Chair whichsupports his research and the Minerva Foundation.

Page 108: Recent Advances in Operator Theory and its Applications: The Israel Gohberg Anniversary Volume

102 D.Z. Arov and H. Dym

We shall consider four systems of differential equations:(a) canonical integral systems

y(t, λ) = y(0, λ) + iλ

∫ t

0

y(s, λ)dM(s)J , 0 ≤ t < d .

(b) canonical differential systems

y′(t, λ) = iλy(t, λ)H(t)J , 0 ≤ t < d .

(c) differential systems with potential

y′(t, λ) = iλy(t, λ)NJ + y(t, λ)V(t) , 0 ≤ t < d .

(d) matrix Schrodinger equations

−u′′(t, λ) + u(t, λ)q(t) = λu(t, λ) , 0 ≤ t < d .

In these systems J is an m×m signature matrix with rank (Im + J) = rank (Im −J) = p, the mass function M(t) in (a) is a continuous nondecreasing m×m mvf onthe interval [0, d) with M(0) = 0; the Hamiltonian H(t) in (b) and the potentialV(t) in (c) are locally summable m×m mvf’s on the interval [0, d) that are subjectto the constraints

H(t) ≥ 0 and V(t)J + JV(t)∗ = 0 a.e. in [0, d) ;

N ∈ Cm×m is a positive semidefinite matrix and the potential q(t) is a Hermitianlocally summable p× p mvf on the interval [0, d).

The imposed constraints insure that the matrizant Ut(λ) = U(t, λ), 0 ≤ t < d,for each of the systems (a)–(c) is an entire m×m mvf in the variable λ that belongsto the class U(J) of J-inner mvf’s as a function of λ for each choice of t ∈ [0, d), asdoes the fundamental matrix of the Schrodinger equation (d). This article focuseson the case where the matrizant belongs to the subclass UsR(J) of strongly regularJ-inner mvf’s that is introduced in Section 4. This class includes the matrizantsof systems of the form (b) when H(t) is locally absolutely continuous on [0, d),H(t)JH(t) = J and H(0) = Im. It also includes systems of the form (c) whenNJ = JN , such as Dirac systems and Krein systems; see Sections 23 and 24. Atfirst glance these restrictions may seem unduly restrictive. However, in Section 26we shall explain how to exploit certain Dirac systems to study related systems ofthe form (c) with matrizants that do not belong to the class UsR(J). This analysiswill then be used to study a class of matrix Schrodinger equations in Section 26.

The paper is organized as follows: The next section lists the main notation.The subsequent ten sections present a brief survey of the preliminary materialthat is needed to formulate and explain the main developments in the article.These sections are short and are usually devoted to one topic: J-inner mvfs, repro-ducing kernel Hilbert spaces, linear fractional transformations, parametrization ofA ∈ UsR(Jp) and a description of the associated RKHS (reproducing kernel Hilbertspace) H(A), chains of entire J-inner mvf’s, canonical systems, chains of associ-ated pairs, de Branges spaces associated with systems, generalized Caratheodoryinterpolation.

Page 109: Recent Advances in Operator Theory and its Applications: The Israel Gohberg Anniversary Volume

Strong Regularity and Inverse Problems 103

Sections 12–21 present a number of the authors’ results on direct and inversespectral problems for the canonical integral and differential systems described initems (a) and (b) of the previous list. (The two systems are really equivalent,but sometimes one form is more convenient than the other.) In particular weshall discuss the bitangential inverse spectral and input impedance problems (butnot the inverse monodromy problem or the bitangential inverse input scatteringproblem that are considered in [ArD:00a], [ArD:00b], [ArD:02a]and [ArD:02b]). Inour formulation of these problems, the given data includes a normalized monotoniccontinuous chain bt

3(λ), bt4(λ), 0 ≤ t < d, of entire inner p×p mvf’s in addition to

the spectral data (such as a Weyl-Titchmarsh function or a spectral function) thatis usually specified. Thus, for example, the bitangential inverse spectral problemfor the system (a) is: given σ(µ); bt

3(λ), bt4(λ), 0 ≤ t < d, find a continuous

nondecreasing m×m mvf M(t) on the [0, d) with M(0) = 0, such that

(1) σ(µ) is a spectral function of the corresponding system (a).(2) The given pair bt

3(λ), bt4(λ) is associated with the matrizant Ut(λ) of the

system in a prescribed way, for each t ∈ [0, d).(3) Ut ∈ UsR(J) for every t ∈ [0, d).

The condition alluded to in (2) on the given chain of pairs serves to specify theclass of systems in which a solution is sought. The third condition guaranteesthat there is at most one solution in the class specified by (2), up to a possibleparameter α = α∗, α ∈ Cp×p.

Sections 22–25 are devoted to differential systems with potential, i.e., systemsof the type (c), whereas Section 26 considers matrix Schrodinger equations. Finally,Section 27 discusses the approach of M.G. Krein to inverse problems.

There is an extensive literature on the inverse spectral problem for assortedsystems of differential and integral equations; see, e.g., [CG:02], [CG:01], [GKM:02],[Kr:55], [Kr:56], [LeMa:00], [LeSa:75], [MeA:67], [MeA:77], [MeA:99a], [MeA3:99b],[MeA:00], [Sak:96], [Sak:99], [Sak:00a], [Sak:00b], [Sak-A:92], [DK:78], [DI:84],[AlD:84], [AlD:85], [AG:95], [AG:01], [GKS:98], [GKS:02], and the references citedtherein. The notion of associated pairs does not appear explicitly in any of thesepapers. Nevertheless, the restrictions imposed on the structure of the systemsunder study are such as to uniquely define a monotonic chain of normalized as-sociated pairs, even if they are not mentioned explicitly. Moreover, in some ofthese problems, condition (3) is automatically in force; see, e.g., Theorem 22.3and Corollary 22.4.

The bitangential inverse spectral problem with given data

σ(µ); bt3(λ), bt

4(λ), 0 ≤ t < d ,

is solved by considering a bitangential inverse input impedance problem with givendata

c(α)(λ); bt3(λ), bt

4(λ), 0 ≤ t < d ,

Page 110: Recent Advances in Operator Theory and its Applications: The Israel Gohberg Anniversary Volume

104 D.Z. Arov and H. Dym

where the input impedance

c(α)(λ) = iα+1πi

∫ ∞

1

µ− λ− µ

1 + µ2

dσ(µ)

is based on the given spectral function σ(µ), and the same chain bt3(λ), bt

4(λ),0 ≤ t < d, is considered in both problems; see Section 15. The matrizant of asystem (a) that solves the inverse input impedance problem coincides with theresolvent matrix (in Krein’s terminology) that describes the sets

c ∈ Cp×p : (bt3)

−1(c− c(α))(bt4)

−1 ∈ N p×p+ , 0 ≤ t < d ;

see Section 12 for additional details and references. This chain of interpolationproblems (indexed by t) is equivalent to a chain of bitangential extension problemsin the class of continuous p× p mvf’s g(s) on R for which g(−s) = g(s)∗ and∫ ∞

0

ϕ(s)∗∫ ∞

0

g(s− u)− g(s)− g(−u) + g(0)ϕ(u)duds ≥ 0

for every ϕ ∈ Lp2([0,∞)) with compact support. These and other related bitangen-

tial extension problems that generalize a number of extension problems consideredby Krein are studied in [ArD:98].

The link between these classes of problems rests on the formula

c(λ) = λ2

∫ ∞

0

eiλsg(s)ds , λ ∈ C+ ,

which defines a one to one correspondence between mvf’s in the Caratheodoryclass Cp×p and the class of p × p mvf’s g(s) described just above that meet theextra condition g(0) ≤ 0. The extension problems that Krein considered in [Kr:44]correspond to the special choices bt

3(λ) = expiλa3t and bt4(λ) = expiλa4t,

where a3 ≥ 0, a4 ≥ 0 and a3 + a4 > 0. In fact, if say a3 = a4 = 1 and d < ∞,then, as Krein pointed out in [Kr:55] and [Kr:56], only the values of g(s) on theinterval [−2d, 2d] are relevant to the solution of an inverse spectral problem for asystem of the form (c) or (d) on the interval [0, d). This is the reason that extensionproblems are connected with inverse problems; additional discussion of this themein the special case that c(λ) is in the Wiener class may be found in [MeA:77],[DI:84], [KrL:85], [Dy:90], [ArD:??] and Section 27 below.

The interplay between a chain of extension problems (or, equivalently, a chainof interpolation problems) and inverse problems follows a general strategy envi-sioned by M.G. Krein some fifty years ago and is discussed in a little more detailin Section 27. Another generalization of Krein’s extension problems and their ap-plication to inverse problems was developed by L.A. Sakhnovich [Sak:96], [Sak:99],[Sak:00a], [Sak:00b]. Some comparisons of his approach with ours are presented in[ArD:04b].

There seems to be a strong connection between the strategy proposed byKrein for solving inverse problems and the approach advocated in the recent pa-pers [Sim:99], [GeSi:00], [RaSi:00] and [Rem:03]. In particular, the A-function in-troduced in [Sim:99] is remarkably close to Krein’s transition function.

Page 111: Recent Advances in Operator Theory and its Applications: The Israel Gohberg Anniversary Volume

Strong Regularity and Inverse Problems 105

The approach to direct and inverse spectral problems that is described in thissurvey is, roughly speaking, a synthesis of the Krein strategy and RKHS methodsthat exploit the properties of two chains of RKHS’s, H(Ut) and B(Et), 0 ≤ t < d,that are defined in terms of the fundamental solution Ut(λ), 0 ≤ t < d, of thesystem under study. These RKHS’s were introduced and extensively studied byL. de Branges [dBr:63], [dBr:68a], [dBr:68b] and played a significant role in hiscelebrated proof that a suitably normalized real 2 × 2 Hamiltonian H(t) of asystem of the form (b) (with m = 2) is uniquely determined by a spectral functionof the system. A number of other results on inverse problems that were announcedwithout proof by Krein are established in the monograph [DMc:76] with the helpof RKHS methods.

In general

H(Ut1) ⊂ H(Ut2) and B(Et1) ⊂ B(Et2) when 0 ≤ t1 ≤ t2 < d (1.1)

and the inclusions are contractive. Moreover, there exists a partial isometry fromH(Ut) onto B(Et). However, if Ut(λ) belongs to the class UsR(J) of strongly regularJ-inner mvf’s for every t ∈ [0, d), then:

(1) The inclusions in (1.1) are isometric.(2) There exists a fixed p ×m matrix T such that the map f −→ Tf defines a

unitary operator from H(Ut) onto B(Et) for every t ∈ [0, d).(3) There is a useful parametrization of the space H(Ut) in terms of a unique

pair bt3(λ), bt

4(λ) of normalized entire inner p × p mvf’s that are uniquelydefined by Ut(λ).

(4) The chain of pairs bt3(λ), bt

4(λ), 0 ≤ t < d, alluded to in (3) is continuousin t on the interval [0, d) for each fixed choice of λ.

(5) The de Branges space B(Et) and the RKHS (Hp2 bt

3Hp2 )⊕ (Kp

2 (bt4)

−1Kp2 )

based on the Hardy space Hp2 and its orthogonal complement Kp

2 = Lp2 Hp

2

coincide as linear topological spaces, i.e., they contain the same elements andtheir norms are equivalent.

Item (5) in the last list is a generalization of the Paley-Wiener theorem.Thus, for example, if bt

3(λ) = eia3tλIp and bt4(λ) = eia4tλIp for some choice of

finite nonnegative numbers a3 and a4, then

B(Et) =∫ a3t

−a4t

eiλsg(s)ds : g ∈ Lp2([−a4, a3])

and the norms in the two spaces B(Et) and Lp

2([−a4, a3]) are equivalent. Anal-ogous conclusions prevail for the de Branges spaces associated with the matrixSchrodinger equation even though the fundamental matrix for this system of equa-tions is not strongly regular. This result, which generalizes a theorem of Remling[Rem:02], [Rem:03] to the matrix case (though for what, at least as of this moment,appears to be a more restrictive class of potentials) is obtained in Section 26 asa byproduct of the methods of this paper, by exploiting the connections with anassociated Dirac system.

Page 112: Recent Advances in Operator Theory and its Applications: The Israel Gohberg Anniversary Volume

106 D.Z. Arov and H. Dym

The preceding discussion focuses on the direct problem, wherein one startswith a system. Sample inverse problems were discussed earlier. To supplement thatdiscussion, it is worth noting that if Ut(λ), 0 ≤ t < d, is a given normalized chainof entire m×m mvf’s in the class UsR(J), and if the normalized pair bt

3(λ), bt4(λ),

0 ≤ t < d, of entire inner p× p mvf’s that are associated with Ut(λ) is continuousas a function of t on [0, d), then Ut(λ) is automatically the matrizant of a systemof the form (a) with

M(t) = i(∂Ut

∂λ)(0)J for every t ∈ [0, d) .

This serves to connect the resolvent matrices of assorted classes of interpolationand extension problems with canonical systems.

2. Notation

In addition to the standard nomenclature such as Lm×n2 (R), Lm×n

1 (R) andLm×n∞ (R) for the Lebesgue spaces of m × n mvf’s on R;Hm×n

2 and Hm×n∞ for

the Hardy spaces of m×n mvf’s in the open upper half plane C+, C− for the openlower half plane, J = J∗ = J−1 for a general signature matrix, (Rf)(λ) = f(λ)+f(λ)∗/2 , (If)(λ) = f(λ)− f(λ)∗/2i, f#(λ) = f(λ)∗, f∼(λ) = f#(−λ) andthe abbreviations Xm for Xm×1 and χ for χ1, we shall make use of the followingclasses of functions:

Kp2 =Lp

2 Hp2 with respect to the standard inner product.

Ep×q =the set of p× q mvf’s with entire entries.Cp×p = p× p mvf’s c(λ) which are holomorphic with (Rc)(λ) ≥ 0 in C+.Cp×p = c ∈ Cp×p : c ∈ Hp×p

∞ and (Rc)−1 ∈ Lp×p∞ (R).

Sp×q = p× q mvf’s s(λ) which are holomorphic and contractive in C+.Sp×p

in = s ∈ Sp×p : s is an inner mvf.Sp×p

out = s ∈ Sp×p : s is an outer mvf.Sp×q = s ∈ Sp×q : sup‖s(λ)‖ : λ ∈ C+ < 1.H(b) =Hr

2 bHr2 and H∗(b) = Kr

2 b−1Kr2 for b ∈ Sr×r

in .N p×q = h−1g : g ∈ Sp×q and h ∈ S.N p×q

+ = h−1g : g ∈ Sp×q and h ∈ Sout.N p×p

out = h−1g : g ∈ Sp×pout and h ∈ Sout.

U(J) = the set of m×m mvf’s that are J-inner with respect to C+.

UrR(J) = the class of right regular J-inner mvf’s.UsR(J) = the class of strongly regular J-inner mvf’s.US(J) = the class of singular J-inner mvf’s.

X p×qconst =the set of constant functions in the set X p×q.

Page 113: Recent Advances in Operator Theory and its Applications: The Israel Gohberg Anniversary Volume

Strong Regularity and Inverse Problems 107

Πp×q = f ∈ N p×q : f admits a pseudocontinuation to C−such that f# ∈ N q×p.

E ∩ X p×q = Ep×q ∩ X p×q.

Wp×q(γ) = γ +∫∞−∞ eiλth(t)dt with fixed γ ∈ Cp×q and any h ∈ Lp×q

1 (R).Wp×q

+ (γ) = γ +∫∞0

eiλth(t)dt with fixed γ ∈ Cp×q and any h ∈ Lp×q1 (0,∞).

Wp×q− (γ) = γ +

∫ 0

−∞ eiλth(t)dt with fixed γ ∈ Cp×q and any h ∈ Lp×q1 (−∞, 0).

ACp×q([a, b)) = p× q mvf′s g(t) : g(t) = γ +∫ t

0 h(s)ds where γ ∈ Cp×q

and h ∈ Lp×q1 ([a, b)).

Lp×q1, loc([a, b)) = p× q mvf′s g(t) : g ∈ Lp×q

1 ([a, c]) for every c ∈ [a, b).ACp×q

loc ([a, b)) = p× q mvf′s g(t) : g ∈ ACp×q([a, c)) for every c ∈ [a, b).

Here, S stands for the Schur class, C for the Caratheodory class, N for the Nevan-linna class, N+ for the Smirnov class, Nout for outer functions from the Smirnovclass and W stands for the Wiener class; Wm×m

± (Im) and Wm×m± (0) are closed

under multiplication. Finally,

ea = ea(λ) = eiaλ, ρω(λ) = −2πi(λ− ω),kb

ω(λ) = ρω(λ)−1(Ip − b(λ)b(ω)∗) and bω(λ) = ρω(λ)−1(b(λ)−1b(ω)−∗ − Ip)are the RK’s for the RKHS’s H(b) and H∗(b), respectively, when b ∈ Sp×p

in ,

Hf = the domain of holomorphy of the mvf f(λ), H+f = Hf ∩ C+,

Mf denotes the operator of multiplication by the mvf f(λ),ΠM denotes the orthogonal projection onto the subspace M,Π+ = ΠHp

2, Π− = ΠKp

2,

f(λ) =∫∞−∞ eiλxf(x)dx denotes the Fourier transform of f ,

g∨(x) = 12π

∫∞−∞ e−iµxg(µ)dµ denotes the inverse Fourier transform of g,

〈f, g〉st =∫ ∞

−∞g(µ)∗f(µ)dµ denotes the standard inner product

andvvf stands for vector-valued function, while mvf stands for matrix-valuedfunction.L(X,Y ) denotes the set of bounded linear operators from the Hilbert spaceX into the Hilbert space Y , and L(X) is short for L(X,X). All the Hilbertspaces considered in this paper are separable.

3. J-inner mvf’s

An m × m constant matrix J is said to be a signature matrix, if J = J∗ andJJ∗ = Im, i.e., if it is both self-adjoint and unitary with respect to the standard

Page 114: Recent Advances in Operator Theory and its Applications: The Israel Gohberg Anniversary Volume

108 D.Z. Arov and H. Dym

inner product in Cm. The main examples of signature matrices for this paper are

jpq =[Ip 00 −Iq

], p + q = m,

and, if 2p = m,

Jp =[

0 −Ip

−Ip 0

], jp =

[Ip 00 −Ip

]and Jp =

[0 −iIp

iIp o

].

The signature matrices Jp and jp are connected by the signature matrix

V =1√2

[−Ip Ip

Ip Ip

], i.e., VJpV = jp and VjpV = Jp .

An m × m mvf U(λ) is said to be J-inner with respect to the open upper halfplane C+ if it is meromorphic in C+ and if(1) J − U(λ)∗JU(λ) ≥ 0 for every point λ ∈ H+

U and(2) J − U(µ)∗JU(µ) = 0 a.e. on R,

in which H+U denotes the set of points in C+ at which U is holomorphic. This

definition is meaningful because every mvf U(λ) that is meromorphic in C+ andsatisfies the first constraint automatically has nontangential boundary values. Thesecond condition guarantees that detU(λ) ≡ 0 in H+

U and hence permits us todefine a pseudo-continuation of U(λ) to the open lower half plane C− by thesymmetry principle

U(λ) = JU#(λ)−1J for λ ∈ C− ,

where f#(λ) = f(λ)∗. The symbol U(J) will denote the class of J-inner mvf’sconsidered on the set HU of points of holomorphy of U(λ) in the full complexplane C.

4. Reproducing kernel Hilbert spaces

If U ∈ U(J) andρω(λ) = −2πi(λ− ω) ,

then the kernel

KUω (λ =

J − U(λ)JU(ω)∗

ρω(λ)is positive on HU × HU in the sense that

∑ni,j=1 u

∗iK

Uωj

(ωi)uj ≥ 0 for every set ofvectors u1, . . . , un ∈ Cm and points ω1, . . . , ωn ∈ HU ; see, e.g., [Dy:89]. Therefore,by the matrix version of a theorem of Aronszajn [Aron:50], there is an associatedRKHS (reproducing kernel Hilbert space) H(U) with RK (reproducing kernel)KU

ω (λ). This means that for every choice of ω ∈ HU , u ∈ Cm and f ∈ H(U),(1) Kωu ∈ H(U) and(2) 〈f,Kωu〉H(U) = u∗f(ω) .

A vvf f ∈ H(U) is meromorphic in C \ R and has nontangential limits f(µ) a.e.in R that may be used to identify f(λ).

Page 115: Recent Advances in Operator Theory and its Applications: The Israel Gohberg Anniversary Volume

Strong Regularity and Inverse Problems 109

In particular,f(µ) = lim

ν↓0f(µ± iν) a.e. in R .

We shall say that a mvf U ∈ U(J) belongs to the class(1) UsR(J) of strongly regular J-inner mvf’s if H(U) ⊂ Lm

2 .(2) UrR(J) of right regular J-inner mvf’s if H(U) ∩ Lm

2 is dense in H(U).(3) US(J) of singular J-inner mvf’s if H(U) ∩ Lm

2 = 0.

Theorem 4.1. Let U ∈ U(J). Then:(1) U(J) ∩ Lm×m

∞ (R) ⊂ UsR(J).(2) The inclusion in (1) is proper.(3) If U ∈ UsR(J), then (µ + i)−1U(µ) ∈ Lm×m

2 .(4) If U(λ) is an entire mvf, then U ∈ US(J) if and only if it is of minimal

exponential type.

Proof. (1) follows from the discussion of formula (3.25) in [ArD:97] and Remark5.1 below; (2) follows from the example that starts on p. 293 in [ArD:01]; (3) isby definition and (4) follows from Lemma 3.8 and Theorem 3.8 in [ArD:97].

Item (1) of this theorem guarantees that J-inner mvf’s in the Wiener classare automatically strongly regular:

U(J) ∩Wm×m ⊂ UsR(J) .

It also serves to guarantee that the matrizant Ut(λ) = U(t, λ), 0 ≤ t < d, of anumber of classical first order differential systems with potential, such as Diracsystems and Krein systems, are automatically strongly regular; see Section 22 andthe subsequent sections for additional details.

An entire p×m mvf

E(λ) = [E−(λ) E+(λ)]

with p× p components E+(λ) and E−(λ) is said to be a de Branges function if(1) detE+(λ) ≡ 0 in C+ and(2) the mvf χ(λ) = E+(λ)−1E−(λ) is a p× p inner mvf.

The de Branges space B(E) based on an entire de Branges function E(λ) is equalto the set of entire p × 1 vvf’s f(λ) such that E−1

+ f ∈ Hp2 χHp

2 . It is a RKHSwith respect to the inner product

〈f1, f2〉B(E) = 〈E−1+ f1, E

−1+ f2〉st

and the RK is given by the formula

KEω (λ) = −E(λ)jpE(ω)∗

ρω(λ)=

E+(λ)E+(ω)∗ − E−(λ)E−(ω)∗

ρω(λ);

see, e.g., [dBr:68b] and [DI:84]. de Branges spaces can also be defined in the sameway for a more general class of p×m mvf’s that are meromorphic in C+, see, e.g.,[ArD:04a], [ArD:04b] for additional information and references. However, the caseof entire mvf’s E(λ) will suffice for the purposes of this article.

Page 116: Recent Advances in Operator Theory and its Applications: The Israel Gohberg Anniversary Volume

110 D.Z. Arov and H. Dym

If A ∈ E ∩ U(Jp) and

B(λ) = A(λ)V =[b11(λ) b12(λ)b21(λ) b22(λ)

],

with blocks bij(λ) of size p× p, then

E(λ) =√

2[0 Ip]B(λ) ,

is an entire de Branges function and the map

U2 : f ∈ H(A) onto−→√

2[0 Ip]f

is a coisometry from H(A) onto B(E), i.e., it maps H(A) kerU2 isometricallyonto B(E); see, e.g., Theorem 2.5 of [ArD:05] and Theorem 2.3 of [ArD:04b]. Itwill be an isometry, i.e.,

〈f, f〉H(A) = 〈U2f, U2f〉B(E) for every f ∈ H(A) , (4.1)

if and only if the mvf c0(λ) = b12b−122 (which belongs to Cp×p) meets the condition

limν↑∞

ν−1Rc0(iν) = 0 ,

or equivalently, if and only if

Rc0(i) =1π

∫ ∞

−∞

11 + µ2

Rc0(µ)dµ .

In particular,A ∈ UsR(Jp) =⇒ condition (4.1) prevails .

5. Linear fractional transformations

The linear fractional transformation TU based on the four block decomposition

U(λ) =[u11(λ) u12(λ)u12(λ) u22(λ)

],

of an m×m mvf U(λ) that is meromorphic in C+ with diagonal blocks u11(λ) ofsize p× p and u22(λ) of size q × q is defined on the set

D(TU ) = p× q meromorphic mvf’s ε(λ) in C+

such that detu21(λ)ε(λ) + u22(λ) ≡ 0 in C+by the formula

TU [ε] = (u11ε + u12)(u21ε + u22)−1.

If U1, U2 ∈ U(J) and if ε ∈ D(TU2) and TU2 [ε] ∈ D(TU1) then

TU1U2 [ε] = TU1 [TU2 [ε]] .

The notationTU [E] = TU [ε] : ε ∈ E for E ⊂ D(TU )

will be useful.

Page 117: Recent Advances in Operator Theory and its Applications: The Israel Gohberg Anniversary Volume

Strong Regularity and Inverse Problems 111

It is well known that if W ∈ U(jpq), then Sp×q ⊂ D(TW ) and TW [Sp×q ] ⊂Sp×q. Moreover,

W ∈ UsR(jpq) ⇐⇒ TW [Sp×q]⋂

Sp×q = ∅ .

Remark 5.1. The class UsR(jpq) was introduced and defined by the condition onthe right-hand side of this equivalence in [ArD:97] and then it was shown therethat W ∈ UsR(jpq) ⇐⇒ H(W ) ⊂ Lm

2 . Here, we have reversed the order andhave defined the class UsR(J) by the condition H(U) ⊂ Lm

2 for arbitrary signaturematrices J , including J = ±Im.

It is not hard to show that linear fractional transformations based on A ∈U(Jp) map τ ∈ Cp×p ∩ D(TA) into Cp×p. However, the set

C(A) = TB[Sp×p ∩ D(TB)] .

based on the mvf B(λ) = A(λ)V is more useful. In particular,

TA[Cp×p ∩ D(TA)] ⊂ TB[Sp×p ∩ D(TB)] ⊂ Cp×p

andA ∈ UsR(Jp) ⇐⇒ C(A) ∩ Cp×p = ∅ .

We remark that

Sp×p ⊂ D(TB) ⇐⇒ b22(ω)b22(ω)∗ > b21(ω)b21(ω)∗ (5.1)

for some (and hence every) point ω ∈ H+A; see Theorem 2.7 in [ArD:03b].

The set C(A) can also be described directly in terms of a linear fractionaltransformation TA[a, b] based on the mvf A(λ) acting on pairs a(λ), b(λ) ofp× p mvf’s that are meromorphic in C+ by the formula

TA[a, b] = a11(λ)a(λ) + a12(λ)b(λ)a21(λ)a(λ) + a22(λ)b(λ)−1 . (5.2)

The domain of definition of this transformation

D(TA) =

a, b : a and b are meromorphic p× p mvf’s in C+

and deta21(λ)a(λ) + a22(λ)b(λ) ≡ 0 in C+

.

If J is a signature matrix that is unitarily equivalent to Jp, F(J) denotes the setof pairs a(λ), b(λ) of p × p mvf’s that are meromorphic in C+ and satisfy thefollowing two conditions:

(1) [a(λ)∗ b(λ)∗]J[a(λ)b(λ)

]≤ 0 for λ ∈ H+

a ∩H+b .

(2) a(λ)∗a(λ) + b(λ)∗b(λ) > 0 for at least one point λ ∈ H+a ∩H+

b .It is not difficult to check that

C(A) = TA[F(Jp) ∩ D(TA)] ,

whereTA[E] =

TA[a, b] : a, b ∈ E

Page 118: Recent Advances in Operator Theory and its Applications: The Israel Gohberg Anniversary Volume

112 D.Z. Arov and H. Dym

for every subset E of D(TA). Thus, if A ∈ U(Jp), then

A ∈ UsR(Jp) ⇐⇒ TA[F(Jp) ∩ D(TA)] ∩ Cp×p = ∅and

Sp×p ⊂ D(TB) ⇐⇒ F(Jp) ⊂ D(TA) .

Other characterizations of the class UsR(J) in terms of the Treil-Volberg ma-trix version of the Muckenhoupt (A)2 condition are furnished in [ArD:01] and[ArD:03a].

6. Parameterization of A ∈ UsR(Jp)

If A ∈ U(Jp), then the formulas

A#(λ)JpA(λ) = A(λ)JpA#(λ) = Jp ,

which are valid for every point λ ∈ HA∩HA# , yield a number of relations betweenthe blocks of A(λ) and indicate that some of the blocks of A(λ) may be computedin terms of the others. In fact, if A ∈ UsR(Jp), then a pair of p × p inner mvf’sb3(λ), b4(λ) and a mvf c ∈ Cp×p that are connected to A(λ) by the rules givenbelow serve to specify A(λ) up to a constant Jp-unitary multiplier on the right.The first two rules are formulated in terms of the blocks of B(λ) = A(λ)V.

(1) b#21b3 ∈ N p×pout and b3 ∈ Sp×p

in .(2) b4b22 ∈ N p×p

out and b4 ∈ Sp×pin .

(3) c ∈ C(A).The mvf’s b3(λ) and b4(λ) are unique up to a constant p×p unitary multiplier (onthe right for b3(λ) and on the left for b4(λ)) and will be designated an associatedpair of the second kind for A(λ):

b3(λ), b4(λ) ∈ apII(A) .

(There is also a set of associated pairs b1(λ), b2(λ) of the first kind that is moreconvenient to use in some other classes of problems that will not be discussedhere.)

The main conclusions are summarized in the following theorems:

Theorem 6.1. If A ∈ U(Jp) is an entire mvf and b3(λ), b4(λ) ∈ apII(A), thenb3(λ) and b4(λ) are also entire. Conversely, if A ∈ UrR(Jp) (and hence a fortioriif A ∈ UsR(Jp)), b3, b4 ∈ apII(A) and b3(λ) and b4(λ) are entire inner mvf’s,then A(λ) is entire.

Proof. See the discussion in Section 3.4 of [ArD:03b]

Theorem 6.2. If A ∈ UsR(Jp), b3(λ), b4(λ) ∈ apII(A) and c ∈ C(A)∩Hp×p∞ , then

H(A) =[

−Π+c∗g + Π−chg + h

]: g ∈ H(b3) and h ∈ H∗(b4)

,

Page 119: Recent Advances in Operator Theory and its Applications: The Israel Gohberg Anniversary Volume

Strong Regularity and Inverse Problems 113

where Π+ denotes the orthogonal projection of Lp2 onto the Hardy space Hp

2 , Π− =I −Π+ denotes the orthogonal projection of Lp

2 onto Kp2 = Lp

2 Hp2 ,

H(b3) = Hp2 b3H

p2 and H∗(b4) = Kp

2 b∗4Kp2 .

Moreover,

f =[−Π+c

∗g + Π−chg + h

]=⇒ 〈f, f〉H(A) = 〈(c + c∗)(g + h), g + h〉st ,

where g ∈ H(b3), h ∈ H∗(b4) and 〈·, ·〉st denotes the standard inner product in Lp2.

Proof. See Theorem 3.8 in [ArD:05].

Theorem 6.3. If A ∈ E ∩ UsR(Jp), b3, b4 ∈ apII(A) and E(λ) =√

2[0 Ip]B(λ),then

〈f, f〉H(A) = 2‖[0 Ip]f‖2B(E)

for every f ∈ H(A) and

B(E) = H(b3)⊕H∗(b4) as Hilbert spaces with equivalent norms .

Proof. See Theorem 3.8 in [ArD:05].

Remark 6.4. If B(E) = H(b3) ⊕ H∗(b4) as linear spaces, then the two normsin these spaces are automatically equivalent, i.e., there exist a pair of positiveconstants γ1, γ2 such that

γ1‖f‖st ≤ ‖f‖B(E) ≤ γ2‖f‖st

for every f ∈ B(E). This follows from the closed graph theorem and the fact thatB(E) and H(b3)⊕H∗(b4) are both RKHS’s.

7. Chains of entire J-inner mvf’s

A family Ut(λ), 0 ≤ t < d, of entire J-inner mvf’s is said to be

(1) normalized: if Ut(0) = Im for every t ∈ [0, d) and U0(λ) = Im for every λ ∈ C,(2) left monotonic: if (Ut1)−1Ut2 ∈ U(J) when 0 ≤ t1 ≤ t2 < d,(3) continuous: if Ut(λ) is a continuous function of t on the interval [0, d) for

every fixed point λ ∈ C.

Thus, a family Ut(λ), 0 ≤ t < d, of entire J-inner mvf’s is said to be a normalizedleft monotonic continuous chain of entire J-inner mvf’s if the preceding threeconstraints are met. Similarly, a family Ut(λ), 0 ≤ t < d, of entire J-inner mvf’sis said to be a normalized right monotonic continuous chain of entire J-inner mvf’sif the constraints (1), (3) and

(2r) right monotonic: if Ut2(Ut1)−1 ∈ U(J) when 0 ≤ t1 ≤ t2 < d,

are met.

Page 120: Recent Advances in Operator Theory and its Applications: The Israel Gohberg Anniversary Volume

114 D.Z. Arov and H. Dym

8. Canonical systems

The matrizant (fundamental solution) Ut(λ) = U(t, λ), 0 ≤ t < d, of the canonicalintegral system

y(t, λ) = y(0, λ) + iλ

∫ t

0

y(s, λ)dM(s)J , 0 ≤ t < d , (8.1)

based on a continuous nondecreasing m×m mvf M(t) on the interval [0, d) withM(0) = 0, is the unique continuous solution of the integral system

U(t, λ) = Im + iλ

∫ t

0

U(s, λ)dM(s)J , 0 ≤ t < d .

The mass function M(t) of any such canonical integral system is uniquely deter-mined by the matrizant via the formula

M(t) = −i(∂Ut

∂λ

)(0)J , 0 ≤ t < d .

Standard estimates lead to the conclusion that Ut(λ) is an entire mvf of λ for eachfixed t ∈ [0, d). Moreover, as follows from the identity

J − U(t, λ)JU(t, ω)∗ = −i(λ− ω)∫ t

0

U(s, λ)dM(s)U(s, ω)∗, (8.2)

Ut ∈ U(J) for every t ∈ [0, d) and the corresponding RKHS’s H(Ut) are orderedby inclusion as sets:

H(Ut1) ⊂ H(Ut2) if 0 ≤ t1 ≤ t2 < d (8.3)

and‖f‖H(Ut2) ≤ ‖f‖H(Ut1) if f ∈ H(Ut1) . (8.4)

If Ut ∈ UsR(J) for every t ∈ [0, d), then the inclusion in relation (8.3) is isometric,i.e., equality prevails in the inequality (8.4); see, e.g., Theorem 2.5 of [ArD:05] andTheorem 2.3 of [ArD:04b].

The matrizant Ut(λ), 0 ≤ t < d, of every canonical integral system (8.1)with continuous nondecreasing mass function M(t) is a normalized left monotoniccontinuous chain of entire J-inner mvf’s. The converse of this statement is trueunder extra constraints:

Theorem 8.1. Let Ut(λ), 0 ≤ t < d, be a normalized left monotonic continuouschain of entire J-inner mvf’s such that

Ut ∈ UsR(J) for every t ∈ [0, d) . (8.5)

Then there exists exactly one continuous nondecreasing mass function M(t) onthe interval [0, d) such that Ut(λ) is the matrizant of the corresponding canonicalintegral system (8.1).

Proof. This is a corollary of Theorem 4.6 in [ArD:97] and the discussion of formula(0.16) in [ArD:00a].

Page 121: Recent Advances in Operator Theory and its Applications: The Israel Gohberg Anniversary Volume

Strong Regularity and Inverse Problems 115

If M ∈ ACm×mloc ([0, d)), then

M(t) =∫ t

0

H(s)ds

for a mvf

H ∈ Lm×m1, loc ([0, d)) such that H(t) ≥ 0 a.e. in [0, d) . (8.6)

Thus, in this case, a continuous solution of (8.1) is automatically locally absolutelycontinuous on the interval [0, d) and is a solution of the canonical differential system

y′(t, λ) = iλy(t, λ)H(t)J for 0 ≤ t < d . (8.7)

We shall refer to a mvf H(t) that meets the conditions in (8.6) as the Hamiltonianof the canonical differential system (8.7). The matrizant of the canonical integralsystem (8.1) coincides with the matrizant of this canonical differential system.Since M(s) is absolutely continuous with respect to the strictly increasing function

τ(s) = traceM(s) + s for 0 ≤ s < d ,

it is always possible to reexpress the canonical integral system (8.1) as a canon-ical differential system with Hamiltonian H = dM/dτ ; see, e.g., Appendix I in[ArD:00a].

9. Chains of associated pairs

Let At(λ) denote the matrizant of a canonical system of the form (8.1) with J = Jp:

A(t, λ) = Im + iλ

∫ t

0

A(s, λ)dM(s)Jp , 0 ≤ t < d , (9.1)

where M(t) is a continuous nondecreasing m ×m mvf on the interval [0, d) withM(0) = 0 . Then the chain of associated pairs bt

3, bt4 ∈ apII(At), 0 ≤ t < d, of

entire inner p× p mvf’s, is(1) monotonic in the sense that (bt1

3 )−1bt23 ∈ Sp×p

in and bt24 (bt1

4 )−1 ∈ Sp×pin when

0 ≤ t1 ≤ t2 < d.Moreover,(2) the chain may be normalized by the conditions bt

3(0) = bt4(0) = Ip, for every

t ∈ [0, d) and b03(λ) = b04(λ) = Ip for every λ ∈ C. A normalized chain ofassociated pairs is uniquely determined by the matrizant At(λ), 0 ≤ t < d.

If the matrizant At(λ) also satisfies the condition

At ∈ UsR(Jp) for every t ∈ [0, d) , (9.2)

then(3) the normalized chain of associated pairs bt

3, bt4 ∈ apII(At), 0 ≤ t < d, is

continuous in the sense that both bt3(λ) and bt

4(λ) are continuous mvf’s of ton the interval [0, d) for each fixed choice of λ ∈ C.

Page 122: Recent Advances in Operator Theory and its Applications: The Israel Gohberg Anniversary Volume

116 D.Z. Arov and H. Dym

Thus, if the matrizant At(λ) of the canonical integral system (8.1) with J =Jp satisfies condition (9.2), then the corresponding chain bt

3(λ), bt4(λ), 0 ≤ t < d,

of normalized associated pairs of the second kind for At(λ) is uniquely defined byAt(λ) and is a normalized monotonic continuous chain of pairs of entire inner p×pmvf’s; see Theorem 7.4 in [ArD:03b].

In future sections, we will discuss a number of inverse problems for canonicalintegral systems of the form (8.1) with J = Jp. In our formulation of these prob-lems, the given data includes a normalized monotonic continuous chain of pairsbt

3(λ), bt4(λ), 0 ≤ t < d, of entire inner p × p mvf’s in addition to the spectral

data that is furnished in traditional investigations. This chain helps to specify theclass of admissible solutions by imposing the condition

bt3, b

t4 ∈ apII(At) for every t ∈ [0, d)

on the matrizantAt(λ) of the system (8.1) with J = Jp and unknown mass functionM(t). Thus, it is of interest to describe the set of such chains. Moreover, since theRKHS’s H(bt

3) ⊂ Lp2(R) andH∗(bt

4) ⊂ Lp2(R),

bt3 ∈ UsR(Ip) and (bt

4)τ ∈ UsR(Ip) for every t ∈ [0, d)

and hence, Theorem 8.1 guarantees that bt3(λ) is the matrizant of a canonical

integral system based on a continuous non decreasing p × p mvf m3(t) on theinterval [0, d). Similarly, (bt

4(λ))τ is the matrizant of a canonical integral systembased on a continuous non decreasing p × p mvf (m4(t))τ on the interval [0, d).Thus, we are led to the following conclusion:

Theorem 9.1. There is a one to one correspondence between normalized monotoniccontinuous chains of pairs bt

3(λ), bt4(λ), 0 ≤ t < d, of entire inner p × p mvf’s

and the pairs m3(t),m4(t) of continuous nondecreasing p×p mvf’s on [0, d) withm3(0) = m4(0) = 0:bt

3(λ), bt4(λ), 0 ≤ t < d, are the unique continuous solutions of the integral equa-

tions

bt3(λ) = Ip + iλ

∫ t

0

bs3(λ)dm3(s), bt

4(λ) = Ip + iλ

∫ t

0

(dm4(s))bs4(λ), 0 ≤ t < d,

and

m3(t) = −i∂bt3

∂λ(0) and m4(t) = −i∂b

t4

∂λ(0).

See Theorem 2.1 of [ArD:00a] for additional details.

10. de Branges spaces associated with systems

The matrizant At(λ) = A(t, λ), 0 ≤ t < d of the canonical system (8.1) withJ = Jp satisfies the identity (8.2) with J = Jp and U(t, λ) = A(t, λ) for 0 ≤ t < d.

Page 123: Recent Advances in Operator Theory and its Applications: The Israel Gohberg Anniversary Volume

Strong Regularity and Inverse Problems 117

Thus, for any matrix L ∈ Cm×p such that L∗JpL = 0, formula (8.2) implies thatthe kernel

−L∗A(t, λ)JpA(t, ω)∗Lρω(λ)

=12π

∫ t

0

L∗A(s, λ)dM(s)A(s, ω)∗L

is positive on C×C. Thus, if L∗JpL = 0 and rankL = p, then the mvf’s L∗A(t, λ)Vare de Branges functions. The corresponding de Branges spaces play a significantrole in the study of direct and inverse spectral problems. The matrix L may bechosen to conform to the initial conditions imposed on the solution of the canonicalsystem that intervenes in the generalized Fourier transform that will be discussedin later sections. The particular choice L∗ =

√2[0 Ip], leads to the de Branges

functionEt(λ) =

√2[0 Ip]At(λ)V , 0 ≤ t < d, (10.1)

with p× p components

Et(λ) = [Et−(λ) Et

+(λ)] .

Theorem 10.1. Let Et(λ) denote the de Branges function associated with the ma-trizant At(λ), 0 ≤ t < d, of the canonical system (8.1) with J = Jp by formula(10.1) and let Bt(λ) = At(λ)V and bt

3, bt4 ∈ apII(At) for every t ∈ [0, d). Then:

(1) The spaces B(Et) are ordered by contractive inclusion:

B(Et1) ⊂ B(Et2) and ‖f‖B(Et2) ≤ ‖f‖B(Et1) for every f ∈ B(Et1)

when 0 ≤ t1 ≤ t2 < d.If At ∈ UsR(Jp) for every t ∈ [0, d), then more is true:(2) The inclusion in relation (8.3) for Ut(λ) = At(λ) is isometric, i.e., equality

prevails in the inequality (8.4).(3) The map f ∈ H(At) −→

√2[0 Ip]f ∈ B(Et) is unitary for every t ∈ [0, d).

(4) The inclusions in (1) are isometric.(5) B(Et) = H(bt

3) ⊕ H∗(bt4) as Hilbert spaces with equivalent norms for every

t ∈ [0, d).

Proof. Assertion (1) is due to de Branges; it also follows from Theorem 4.4 in[ArD:97] and Theorem 2.4 of [ArD:05]. The remaining assertions rest on Theorem2.4 and Lemma 3.1 of [ArD:05] and the fact that if At ∈ UsR(Jp), then the RKHSH2(At) referred to in the cited theorem is equal to 0.

Expository introductions to the de Branges spaces associated with the solu-tions of generalized string equations may also be found in [DMc:76] and [Dy:70].

11. A generalized Caratheodory interpolation problem

Our next main objective is to discuss a bitangential inverse impedance problem.However, in keeping with a general strategy for studying such problems that wasintroduced by M.G. Krein, we first introduce a family of generalized Caratheodory

Page 124: Recent Advances in Operator Theory and its Applications: The Israel Gohberg Anniversary Volume

118 D.Z. Arov and H. Dym

interpolation problems GCIP(bt3, b

t4; c) based on a mvf c ∈ Cp×p and a normalized

monotonic continuous chain of pairs bt3, b

t4, 0 ≤ t < d of entire inner p× p mvf’s.

In general, the GCIP(b3, b4; c) based on a mvf c ∈ Cp×p and a pair of mvf’sb3, b4 ∈ Sp×p

in is to describe the set

C(b3, b4; c) = c ∈ Cp×p : (b3)−1(c− c)(b4)−1 ∈ N p×p+ .

There is a correspondence between problems of this sort that are subject to theextra condition

C(b3, b4; c) ∩ Cp×p = ∅ (11.1)and the class UsR(Jp):

Theorem 11.1. Let A ∈ UsR(Jp), c ∈ C(A) and let b3, b4 ∈ apII(A). Then thecondition (11.1) is in force and

C(b3, b4; c) = C(A) . (11.2)

If, in addition, A(λ) is an entire mvf, then b3(λ) and b4(λ) are also entire mvf’s.Conversely, if the condition (11.1) is in force for a given choice of c ∈ Cp×p

and b3, b4 ∈ Sp×pin , then there exists a mvf A ∈ U(Jp) such that (11.2) holds.

Moreover, every mvf A ∈ U(Jp) for which (11.2) holds is automatically stronglyregular and there is essentially only such mvf A(λ) (up to a constant Jp-unitaryfactor on the right) such that b3, b4 ∈ apII(A).

Proof. The stated results follow from the results established in [Ar:94]. In applications to inverse problems, the case in which b3(λ) and b4(λ) are

entire inner mvf’s is of particular interest. The following specialization of the pre-ceding theorem is useful.

Theorem 11.2. Let b3, b4 ∈ E∩Sp×pin , let c ∈ Cp×p and assume that condition (11.1)

is in force. Then there exists exactly one mvf A ∈ U(Jp) such that(1) C(A) = C(b3, b4; c).(2) b3, b4 ∈ apII(A).(3) A(0) = Im.

Moreover, this mvf A ∈ E ∩ UsR(Jp).

We remark that the bitangential version of the Krein helical extension prob-lem is equivalent to a GCIP(b3, b4; c) that is based on entire inner mvf’s b3(λ) andb4(λ), whereas the classical Krein helical extension problem corresponds to thespecial case when bt

3(λ) = eαt(λ) and bt4(λ) = eβt(λ), where α ≥ 0, β ≥ 0 and

α + β > 0. To be more precise, in the last setting it is only the sum α+ β that issignificant and not the specific choices of the nonnegative numbers α and β.

In future sections we shall have special interest in mvf’s c(λ) for which

C(edIp, Ip; c) ∩Wp×p+ (γ) = ∅ for some γ ∈ Cp×p. (11.3)

If the mvf

c(λ) = γ + 2∫ ∞

0

eiλth(t)dt (11.4)

Page 125: Recent Advances in Operator Theory and its Applications: The Israel Gohberg Anniversary Volume

Strong Regularity and Inverse Problems 119

belongs to this intersection, then the constant matrix γ and the restriction ofthe p × p mvf h(t) to the interval [0, d] are uniquely determined by c(λ). Weshall refer to this restriction as the accelerant of c(λ) on the interval [0, d]. Ifc ∈ Cp×p ∩Wp×p

+ (γ), then the mvf h(t) in the representation

c(λ) = γ + 2∫ ∞

0

eiλth(t)dt (11.5)

is called the accelerant of c(λ) on the interval [0,∞). Additional information onthe classical Krein extension problems and their bitangential generalizations isfurnished in the last section and, in much more detail, in [ArD:98].

12. The bitangential inverse input impedance problem

The set Cdimp(M) of input impedance matrices for a given canonical integral system

(8.1) with J = Jp is defined as the intersection of the sets C(At):

Cdimp(M) =

⋂0≤t<d

C(At) .

The mvf’s ic(λ), where c ∈ Cdimp(M) are usually called Weyl-Titchmarsh functions.

It may happen that Cdimp(M) = ∅. However, if

Sp×p ⊂ D(TBt0) for some t0 ∈ (0, d), (12.1)

then Cdimp(M) = ∅. In particular, the condition (12.1) will be in force if

e−abt03 b

t04 ∈ Sp×p

in for some a > 0 and At0 ∈ UsR(Jp) ;

see, e.g., Lemma 2.4 of [ArD:04b]. In our formulation of the bitangential inverseinput impedance problem, the given data is a mvf c ∈ Cp×p, and a normalizedmonotonic continuous chain of pairs bt

3(λ), bt4(λ), 0 ≤ t < d, of entire inner p× p

mvf’s. An m ×m mvf M(t) on the interval [0, d) is said to be a solution of thebitangential inverse input impedance problem with data c(λ); bt

3(λ), bt4(λ), 0 ≤

t < d if M(t) is a continuous nondecreasing m×m mvf on the interval [0, d) withM(0) = 0 such that the matrizant At(λ) of the corresponding canonical integralsystem (8.1) meets the following three conditions:

(1) c ∈ Cdimp(M) .

(2) bt3, b

t4 ∈ apII(At) for every t ∈ [0, d) .

(3) At ∈ UsR(Jp) for every t ∈ [0, d) .

Theorem 12.1. Let c ∈ Cp×p, let bt3(λ), bt

4(λ), 0 ≤ t < d, be a normalized mono-tonic continuous chain of pairs of entire inner p × p mvf’s. Then there exists atleast one solution M(t), 0 ≤ t < d, of the bitangential inverse input impedanceproblem for the given set of data if and only if

C(bt3, b

t4; c) ∩ Cp×p = ∅ for every t ∈ [0, d) . (12.2)

Page 126: Recent Advances in Operator Theory and its Applications: The Israel Gohberg Anniversary Volume

120 D.Z. Arov and H. Dym

Moreover, if a solution exists then it is unique and the matrizant At(λ) of thissolution may be characterized as the unique mvf At ∈ U(Jp) such that the followingthree conditions are met for every t ∈ [0, d):(1) C(bt

3, bt4; c) = C(At).

(2) bt3, b

t4 ∈ apII(At).

(3) At(0) = Im.

Proof. See Theorem 7.9 in [ArD:03b]. In order to apply this theorem, we need to know when condition (12.2) is

in force. In particular, the condition (12.2) is satisfied if c ∈ Cp×p. However, ifthe given matrix c ∈ Cp×p ∩ Wp×p

+ (γ), then condition (12.2) will be in force ifγ + γ∗ > 0, even if detR c(µ) = 0 at some points µ ∈ R; see Theorem 5.2 in[ArD:05]. Moreover, if either

limν↑∞

bt03 (iν) = 0 or lim

ν↑∞bt04 (iν) = 0

for some point t0 ∈ [0, d), then the condition γ + γ∗ > 0 is necessary for (12.2) tobe in force and hence for the existence of a canonical system (8.1) with a matrizantAt(λ), 0 ≤ t < d, that meets the conditions (1) (2) and (3); see Theorem 5.4 in[ArD:05].

Remark 12.2. The method of solution depends upon the interplay between theRKHS’s that play a role in the parametrization formulas presented in Theorem6.2 and their corresponding RK’s. This method also yields the formulas for M(t)and the corresponding matrizant At(λ) that are discussed in the next section. Itdiffers from the known methods of Gelfand-Levitan, Marchenko and Krein, whichare not directly applicable to the bitangential problems under consideration.

13. A basic formula

In this section we shall assume that a mvf c ∈ Cp×p and a normalized monotoniccontinuous chain of pairs bt

3(λ), bt4(λ) , 0 ≤ t < d, of entire inner p×p mvf’s that

meet the condition (12.2) have been specified. Then there exists a mvf

ct ∈ C(bt3, b

t4; c) ∩Hp×p

∞ (13.1)

for every t ∈ [0, d) and hence, the operators

Φt11 = ΠH(bt

3)Mct

∣∣∣Hp2, Φt

22 = Π−Mct

∣∣∣H∗(bt

4), Φt

12 = ΠH(bt3)Mct

∣∣∣H∗(bt

4),

(13.2)

Y t1 = ΠH(bt

3)

Mct + (Mct)∗

∣∣∣∣H(bt

3)

= 2R

(Φt

11

∣∣∣∣H(bt

3)

)(13.3)

and

Y t2 = ΠH∗(bt

4)

Mct + (Mct)∗

∣∣∣∣H∗(bt

4)

= 2R

(ΠH∗(bt

4)Φt

22

)(13.4)

Page 127: Recent Advances in Operator Theory and its Applications: The Israel Gohberg Anniversary Volume

Strong Regularity and Inverse Problems 121

are well defined. Moreover, they do not depend upon the specific choice of the mvfct in the set indicated in formula (13.1).

In order to keep the notation relatively simple, an operator T that acts in thespace of p× 1 vvf’s will be applied to p× p mvf’s with columns f1, . . . , fp columnby column: T [f1 · · · fp] = [Tf1 · · ·Tfp]. We define three sets of p× p mvf’s yt

ij(λ),ut

ij(λ) and xtij(λ) by the following system of equations, in which τ3(t) and τ4(t)

denote the exponential types of bt3(λ) and bt

4(λ), respectively and

(R0f)(λ) = f(λ)− f(0)/λ :

yt11(λ) = i

(Φt

11(R0eτ3(t)Ip))(λ) , yt

12(λ) = −i(R0bt3)(λ) ,

yt21(λ) = i

((Φt

22)∗(R0e−τ4(t)Ip)

)(λ) , yt

22(λ) = i(R0(bt4)

−1)(λ) ,(13.5)

Y t1 u

t1j + Φt

12ut2j = yt

1j(λ) ,

(Φt12)

∗ut1j + Y t

2 ut2j = yt

2j(λ) , j = 1, 2 ,(13.6)

xt1j(λ) = −(Φt

11)∗ut

1j and xt2j(λ) = Φt

22ut2j , j = 1, 2 . (13.7)

Theorem 13.1. Let c(λ); bt3(λ), bt

4(λ), 0 ≤ t < d be given where c ∈ Cp×p,bt

3(λ), bt4(λ), 0 ≤ t < d, is a normalized monotonic continuous chain of pairs of

entire inner p × p mvf’s and let assumption (12.2) be in force. Then the uniquesolution M(t) of the inverse input impedance problem considered in Theorem 12.1is given by the formula

M(t) =∫ τ3(t)

0

[xt

11(a) xt12(a)

ut11(a) ut

12(a)

]da +

∫ 0

−τ4(t)

[xt

21(a) xt22(a)

ut21(a) ut

22(a)

]da , (13.8)

where xtij(a) and ut

ij(a) designate the inverse Fourier transforms of the mvf’sxt

ij(λ) and utij(λ) defined earlier, respectively, and the corresponding matrizant

At(λ) = Im + iλ

[xt

11(λ) + xt21(λ) xt

12(λ) + xt22(λ)

ut11(λ) + ut

21(λ) ut12(λ) + ut

22(λ)

]Jp . (13.9)

Proof. See Theorem 4.4 in [ArD:05].

14. Spectral functions

The term spectral function is defined in two different ways: The first definition isin terms of the generalized Fourier transform

(F2f)(λ) = [0 Ip]1√π

∫ d

0

A(s, λ)dM(s)f(s) (14.1)

based on the matrizant of the canonical system (8.1) with J = Jp applied initiallyto the set of f ∈ Lm

2 (dM ; [0, d)) with compact support inside the interval [0, d).

Page 128: Recent Advances in Operator Theory and its Applications: The Israel Gohberg Anniversary Volume

122 D.Z. Arov and H. Dym

A nondecreasing p× p mvf σ(µ) on R is said to be a spectral function for thesystem (8.1) if the Parseval equality∫ ∞

−∞(F2f)(µ)∗dσ(µ)(F2f)(µ),=

∫ d

0

f(t)∗dM(t)f(t)dt (14.2)

holds for every f ∈ Lm2 (dM ; [0, d)) with compact support. The notation Σd

sf (M)will be used to denote the set of spectral functions of a canonical system of theform (8.1) with J = Jp.

Remark 14.1. The generalized Fourier transform introduced in formula (14.1) is aspecial case of of the transform

(FLf)(λ) = L∗ 1√π

∫ d

0

A(s, λ)dM(s)f(s) (14.3)

that is based on a fixed m× p matrix L that meets the conditions L∗JpL = 0 andL∗L = Ip. The mvf y(t, λ) = L∗At(λ) is the unique solution of the system (8.1)with J = Jp that satisfies the initial condition y(0, λ) = L∗. Spectral functionsmay be defined relative to the transform FL in just the same way that theywere defined for the transform F2. Direct and inverse spectral problems for thesespectral functions are easily reduced to the corresponding problems based on F2.;see Sections 4 and 5 of [ArD:04b] and Section 16 below for additional discussion.

The second definition of spectral function is based on the Riesz-Herglotzrepresentation

c(λ) = iα− iβλ +1πi

∫ ∞

−∞

1

µ− λ− 1

1 + µ2

dσ(µ) , λ ∈ C+ , (14.4)

which defines a correspondence between p× p mvf’s c ∈ Cp×p and a set α, β, σ,in which σ(µ) is a nondecreasing p × p mvf on R that is normalized to be leftcontinuous with σ(0) = 0 and is subject to the constraint∫ ∞

−∞

dtraceσ(µ)1 + µ2

<∞ , (14.5)

and α and β are constant p× p matrices such that α = α∗ and β ≥ 0.The mvf σ(µ) in the representation (14.4) will be referred to as the spectral

function of c(λ). Correspondingly, if F ⊂ Cp×p, then

(F)sf = σ : such that σ is the spectral function of some c ∈ F .If c(λ) = TA[Ip] and A ∈ UsR(Jp), then β = 0 and σ(µ) is absolutely continuouswith σ′(µ) = Rc(µ) a.e. on R; see Lemma 2.2 and the discussion following Lemma2.3 in [ArD:05]. Moreover, if A ∈ UsR(Jp) and Sp×p ⊂ D(TAV), then for eachσ ∈ (C(A))sf , there exists at least one p× p Hermitian matrix α such that

c(α)(λ) = iα+1πi

∫ ∞

−∞

1

µ− λ− µ

1 + µ2

dσ(µ) (14.6)

belongs to C(A); see Theorem 2.14 in [ArD:04b].

Page 129: Recent Advances in Operator Theory and its Applications: The Israel Gohberg Anniversary Volume

Strong Regularity and Inverse Problems 123

We shall also make use of the following condition on the growth of the mvfχt

1(λ) = bt4(λ)bt

3(λ):

‖χa1(reiθ + ω)‖ ≤ γ < 1 on the indicated ray in C+ , (14.7)

i.e., the inequality holds for some fixed choice of θ ∈ [0, π], ω ∈ C+, a ∈ (0, d) andall r ≥ 0. It is readily checked that if this inequality is in force for some pointa ∈ (0, d), then it holds for all t ∈ [a, d).

Remark 14.2. The condition (14.7) will be in force if

e−aχt01 (λ) ∈ Sp×p

in

for some choice of a > 0 and t0 ∈ (0, d).

These observations leads to the following conclusion:

Lemma 14.3. If the matrizant At(λ) of the canonical differential system (8.1) withJ = Jp satisfies the condition (9.2) and if condition (14.7) is in force for somea ∈ [0, d) when bt

3, bt4 ∈ apII(At) for t ∈ [0, d) and if c ∈ C(Aa), then β = 0 in

the representation (14.4).

In view of the fact that

A ∈ UsR(Jp) ⇐⇒ C(A) ∩ Cp×p = ∅ , (14.8)

the conditions (12.2) and (9.2) are equivalent if C(At) = C(bt3, b

t4; c) for every

0 ≤ t < d. In particular, these conditions are satisfied if Cdimp(M)∩Cp×p = ∅. They

are also satisfied if Cdimp(M)∩Wm×m(γ) = ∅ for some γ ∈ Cm×m with γ+γ∗ > 0,

by Theorem 5.2 in [ArD:05].

The direct problem

The direct problem for a given canonical system with mass function M(t) on theinterval [0, d) is to describe the set of input impedances Cd

imp(M) and the setΣd

sf (M) of spectral functions of the system.

Theorem 14.4. Let At(λ) denote the matrizant of a canonical integral system ofthe form (8.1) with J = Jp and suppose that the two conditions (9.2) and (14.7)are met. Then

(1) Σdsf (M) = (Cd

imp(M))sf .(2) To each σ ∈ Σd

sf (M) there exists exactly one mvf c(λ) ∈ Cdimp(M) with spec-

tral function σ(µ). Moreover, this mvf c(λ) is equal to one of the mvf’s c(α)(λ)defined by formula (14.6) for some Hermitian matrix α.

(3) If d <∞ and traceM(t) < δ <∞ for every t ∈ [0, d), then the equation (8.1)and the matrizant At(λ) may be considered on the closed interval [0, d] andCdimp(M) = C(Ad).

Proof. See Theorem 2.21 in [ArD:04b].

Page 130: Recent Advances in Operator Theory and its Applications: The Israel Gohberg Anniversary Volume

124 D.Z. Arov and H. Dym

A spectral function σ ∈ Σdsf (M) of the canonical integral system (8.1) with

J = Jp is said to be orthogonal if the isometric operator that extends the gener-alized Fourier transform F2 defined by formula (14.1) maps Lm

2 (dM ; [0, d)) ontoLp

2(dσ; R).

Theorem 14.5. Let the canonical integral system (8.1) with J = Jp, mass func-tion M(t) and matrizant At(λ) be considered on a finite closed interval [0, d](so that traceM(d) < ∞) and let A(λ) = Ad(λ), B(λ) = A(λ)V and E(λ) =√

2[0 Ip]B(λ). Suppose further that

(a) (C(A))sf = Σdsf (M) and

(b) KEω (ω) > 0 for at least one (and hence every) point ω ∈ C+.

Then:

(1) Sp×p ⊂ D(TA).(2) The spectral function σ(µ) of the mvf c(λ) = TB[ε] is an orthogonal spectral

function of the given canonical system if ε is a constant p×p unitary matrix.

Proof. The first assertion is equivalent to condition (b); see (5.1). The proof ofassertion (2) will be given elsewhere.

Remark 14.6. The conclusions of the last theorem can be reformulated in terms ofthe linear fractional transformations of pairs that were discussed briefly in Section5: The inclusion (1) in the last theorem is equivalent to the inclusion F(Jp) ⊂D(TA). Moreover, c(λ) = TB[ε] for some constant unitary p × p matrix ε if andonly if c(λ) = TA[a, b] for some pair of constant p× p matrices a, b that meetthe conditions a∗b + b∗a = 0 and a∗a + b∗b > 0. Assertion (2) is obtained undersomewhat stronger assumptions in Theorem 2.5 of Chapter 4 of [Sak:99].

15. The bitangential inverse spectral problem

In our formulation of the bitangential inverse spectral problem the given dataσ; bt

3, bt4, 0 ≤ t < d is a p × p nondecreasing mvf σ(µ) on R that meets the

constraint (14.5) and a normalized monotonic continuous chain bt3, b

t4, 0 ≤ t < d,

of pairs of entire inner p × p mvf’s. An m × m mvf M(t) on the interval [0, d)is said to be a solution of the bitangential inverse spectral problem with dataσ(µ); bt

3(λ), bt4(λ), 0 ≤ t < d if M(t) is a continuous nondecreasing m × m

mvf on the interval [0, d) with M(0) = 0 such that the matrizant At(λ) of thecorresponding canonical integral system (8.1) with J = Jp meets the followingthree conditions:

(i) σ(µ) is a spectral function for this system, i.e., σ ∈ Σdsf (M).

(ii) bt3, b

t4 ∈ apII(At) for every t ∈ [0, d).

(iii) At ∈ UsR(Jp) for every t ∈ [0, d).

Page 131: Recent Advances in Operator Theory and its Applications: The Israel Gohberg Anniversary Volume

Strong Regularity and Inverse Problems 125

The constraint (ii) defines the class of canonical integral systems in which welook for a solution of the inverse problem for the given spectral function σ(µ).Subsequently, the condition (iii) guarantees that in this class there is at most onesolution.

The solution of this problem rests on the preceding analysis of the bitangentialinverse input impedance problem with data c(α); bt

3, bt4, 0 ≤ t < d, where c(α)(λ)

is given by formula (14.6).

Theorem 15.1. If the data σ; bt3, b

t4, 0 ≤ t < d for a bitangential inverse spectral

problem meets the conditions (14.5) and (14.7) and the mvf c(λ) = c(0)(λ) satisfiesthe constraint (12.2), then the following conclusions hold:

(1) For each Hermitian matrix α ∈ Cp×p, there exists exactly one solutionM (α)(t) of the bitangential inverse input spectral problem such that c(α)(λ)is an input impedance for the corresponding canonical integral system (8.1)with J = Jp based on the mass function M (α)(t).

(2) The solutions M (α)(t) are related to M (0)(t) by the formula

M (α)(t) =

[Ip iα

0 Ip

]M (0)(t)

[Ip 0−iα Ip

]. (15.1)

The corresponding matrizants are related by the formula

A(α)t (λ) =

[Ip iα

0 Ip

]A

(0)t (λ)

[Ip −iα0 Ip

]. (15.2)

(3) If M(t) is a solution of the bitangential inverse spectral problem, then M(t) =M (α)(t) for exactly one Hermitian matrix α ∈ Cp×p.

(4) The solution M (0)(t) and matrizant A(0)t (λ) may be obtained from the formu-

las for the solution of the bitangential inverse input impedance problem withdata c(0); bt

3, bt4, 0 ≤ t < d that are given in Theorem 13.1.

Proof. See Theorem 2.20 in [ArD:04b].

The condition (12.2) is clearly satisfied if c(0) ∈ Cp×p. However, this conditionis far from necessary. If, for example, c(0) ∈ Cp×p∩Wp×p(γ), then, as noted earlier,condition (12.2) holds if γ + γ∗ > 0, even if detRc(µ) = 0 on some set of pointsµ ∈ R.

16. Spectral problems for systems with J = Jp

The preceding results for systems of the form (8.1) with J = Jp may be adapted tosystems of the form (8.1) with any signature matrix J that is unitarily equivalentto Jp. If

J = V ∗JpV for some unitary matrix V , (16.1)

Page 132: Recent Advances in Operator Theory and its Applications: The Israel Gohberg Anniversary Volume

126 D.Z. Arov and H. Dym

then y(t, λ) = y(t, λ)V ∗ is a solution of the system

y(t, λ) = y(t, 0) + iλ

∫ t

0

y(s, λ)dM (s)Jp (16.2)

with mass function M(t) = VM(t)V ∗ and matrizant Ut(λ) = V Ut(λ)V ∗ thatbelongs to class U(Jp). Correspondingly, the set of input impedance matricesCdimp(M) for the system (8.1) with J unitarily equivalent to Jp is defined as the

intersection of the sets C(V UtV∗), 0 ≤ t < d. This set coincides with Cd

imp(M).A nondecreasing p× p mvf σ(µ) on R is said to be a spectral function for the

system (8.1) if the Parseval equality holds for the transform

(F2f)(λ) = [0 Ip]V1√π

∫ d

0

U(s, λ)dM(s)f(s) (16.3)

for every f ∈ Lm2 (dM ; [0, d)) with compact support. Such a spectral function is

said to be orthogonal if the isometry that extends this transform to a mappingfrom Lm

2 (dM ; [0, d)) into Lp2(dσ; R) is onto.

The data for the inverse spectral problem for a canonical integral system ofthe form (8.1) with J unitarily equivalent to Jp is: a fixed constant unitary matrixV such that J = V ∗JpV , a p × p nondecreasing mvf σ(µ) on R that meets theconstraint (14.5) and a normalized monotonic continuous chain bt

3, bt4, 0 ≤ t < d,

of pairs of entire inner p× p mvf’s. A continuous nondecreasing m×m mvf M(t)on the interval [0, d) with M(0) = 0 is said to be a solution of the inverse spectralproblem for this given set of data if the matrizant Ut(λ) of the canonical integralsystem (8.1) with mass function M(t) meets the following three conditions:

(i) σ(µ) is a spectral function for this system, i.e., the Parseval formula basedon the transform (16.3) holds.

(ii) bt3, b

t4 ∈ apII(V UtV

∗) for every t ∈ [0, d).(iii) Ut ∈ UsR(Jp) for every t ∈ [0, d).The results discussed earlier for the case J = Jp, as well as those that will bediscussed in the next section, are easily adapted to the case J = V ∗JpV consideredin this section.

The reduction considered above is also useful in the case J = Jp: It maybe used to reduce direct and inverse spectral problems based on the generalizedFourier transform FL defined in (14.3) to the corresponding problems based onthe transform F2 defined in (14.1) with the help of the unitary matrix V ∗ =[−JpL L], since L∗V ∗ = [0 Ip]:

FL(f) = L∗ 1√π

∫ d

0

A(s, λ)dM(s)f(s)

= L∗V ∗ 1√π

∫ d

0

A(s, λ)dM(s)V f(s) = F2(V f) ,

where A(s, λ) = V A(s, λ)V ∗ and M(s) = VM(s)V ∗.

Page 133: Recent Advances in Operator Theory and its Applications: The Israel Gohberg Anniversary Volume

Strong Regularity and Inverse Problems 127

17. Weyl limit balls

Let At(λ) = A(t, λ), 0 ≤ t < d, denote the matrizant of a canonical integral systemof the form (8.1) with J = Jp, let Bt(λ) = At(λ)V for every t ∈ [0, d) and supposethat Sp×p ⊂ D(TBt0

) for some t0 ∈ [0, d). Then

Sp×p ⊂ D(TBt) for every t ∈ [t0, d) (17.1)

and hence,

Cdimp(M) =

⋂t0≤t<d

TBt [Sp×p] and Cdimp(M) = ∅ .

In view of (5.1), the condition (17.1) is in force if and only if

KEt0ω (ω) > 0 for some (and hence every) point ω ∈ C+ . (17.2)

If condition (17.2) (or, equivalently, (17.1)) is in force, then the set

B∗(ω) = c(ω) : c ∈ Cdimp(M)

is a matrix ball with center γ(ω) and left and right semiradii R(ω) ≥ 0 andRr(ω) ≥ 0 for each point ω ∈ C+:

B∗(ω) = δ(ω) + R(ω)εRr(ω) : ε ∈ Sp×pconst .

The center γ(ω) is unique and the semiradii R(ω) and Rr(ω) are defined up to apositive scalar multiplier δ: R(ω) −→ δR(ω) and Rr(ω) −→ δ−1Rr(ω). The setB∗(ω) is called the Weyl limit ball. By a theorem of Orlov [Or:76], the ranks of thesemiradii n = rankR(ω) and nr = rankRr(ω) are independent of the choice ofthe point ω ∈ C+. If At0 ∈ UsR(Jp), then the three conditions (17.1), (17.2) and

χt01 (ω)χt0

1 (ω)∗ < Ip for at least one (and hence every) point ω ∈ C+ (17.3)

are equivalent; see Theorem 3.3 in [ArD:05].

Theorem 17.1. Let At(λ) = A(t, λ), 0 ≤ t < d, be the matrizant for the system(8.1) with J = Jp, let bt

3, bt4 ∈ apII(At) for every t ∈ [0, d) and suppose that the

mvf χt1(λ) = bt

4(λ)bt3(λ) meets the condition (14.7) and that c ∈ Cd

imp∩ Cp×p. Thenthe ranks n and nr of the left and right semiradii of the limit ball B∗(ω) (that areindependent of ω ∈ C+) may be computed by the formulas

n = rank limt↑d

bt3(ω)bt

3(ω)∗ and nr = rank limt↑d

bt4(ω)∗bt

4(ω) . (17.4)

Proof. See Theorem 3.16 in [ArD:05].

The two extreme cases:

(a) the limit point case, in which at least one of the two indices n, nr is equalto zero, and

(b) the full rank case, in which n = nr = p,

are of particular interest.

Page 134: Recent Advances in Operator Theory and its Applications: The Israel Gohberg Anniversary Volume

128 D.Z. Arov and H. Dym

In the setting of the previous theorem, the following conclusions prevail:(1) The limit point case is in force ⇐⇒ Cd

imp(M) contains exactly one mvf c(λ).(2) If n = 0, then for any two vectors ξ, η ∈ Cp and any point λ ∈ C+,

[ξ∗ η∗]A(s, λ) ∈ L1×m2 (dM ; [0, d)) ⇐⇒ η = c(λ)ξ ,

which serves to characterize the single mvf c(λ) in Cdimp(M). This is really

the Weyl-Titchmarsh characterization of input impedances for this setting.(3) The full rank case is in force ⇐⇒ limt↑d traceM(t) <∞.

The last two conclusions follow from Theorems 6.4 and 7.5 of [ArD:03b], respec-tively.

18. The bitangential inverse spectral problem for σ(µ) = µIp

Theorem 18.1. Let σ(µ) = µIp and let bt3, b

t4 0 ≤ t < d, be a normalized mono-

tonic continuous chain of pairs of entire inner p× p mvf’s that satisfies the condi-tion (14.7). Then every solution M(t) of the bitangential inverse spectral problemwith given data µIp; bt

3, bt4, 0 ≤ t < d is of the form

M (α)(t) =

[Ip iα

0 Ip

]V

[m3(t) 0

0 m4(t)

]V

[Ip 0−iα Ip

],

where α = α∗ is a p× p Hermitian matrix and

mj(t) = −i∂bt

j

∂λ(0) = 2πk

btj

0 (0), j = 3, 4 . (18.1)

Moreover,B(Et) = H(bt

3)⊕H∗(bt4)

as Hilbert spaces and

C(At) = (Ip − bt3εb

t4)(Ip + bt

3εbt4)

−1 : ε ∈ Sp×p .

Proof. Since σ(µ) = µIp is the spectral function of the mvf c(0)(λ) = Ip, itis readily checked that Theorem 15.1 is applicable and guarantees the existenceof a solution M (0)(t) to the bitangential inverse spectral problem for the givendata µIp; bt

3, bt4, 0 ≤ t < d: M (0)(t) is the solution of the bitangential inverse

input impedance problem with data Ip; bt3, b

t4, 0 ≤ t < d and may be obtained

by invoking the formulas in Section 13. In particular, it follows readily from theformulas in Section 13 that the matrizant At(λ) of the system (8.1) with massfunction M (0)(t) is given by the formula

At(λ) = Im + iλ√

2V

[ut

11(λ) ut12(λ)

ut21(λ) ut

22(λ)

]Jp ,

where

ut11(λ) = −bt

3(λ)− Ip

2iλ= −ut

12(λ), ut21(λ) = −bt

4(λ)−1 − Ip

2iλ= ut

22(λ)

Page 135: Recent Advances in Operator Theory and its Applications: The Israel Gohberg Anniversary Volume

Strong Regularity and Inverse Problems 129

(and the mvf’s xtij(λ) that appear in the cited formulas are: xt

1j(λ) = −ut1j(λ) and

xt2j(λ) = ut

2j(λ) for j = 1, 2). Therefore,

M (0)(t) =√

2V

[ut

11(0) ut12(0)

ut21(0) ut

22(0)

],

which can be reexpressed in terms of the mvf’s (18.1) as

M (0)(t) = V

[m3(t) 0

0 m4(t)

]V .

Moreover, since

Bt(λ) = At(λ)V =1√2

[−bt

3(λ) bt4(λ)−1

bt3(λ) bt

4(λ)−1

],

Et(λ) =√

2N∗2Bt(λ) = [bt

3(λ) bt4(λ)−1].

Thus, in this case ‖f‖2B(Et)

= ‖f‖2st for f ∈ B(Et) and

B(Et) = H(bt3)⊕H∗(bt

4)

as Hilbert spaces. Moreover, Theorem 14.4 is applicable,

C(At) = (Ip − bt3εb

t4)(Ip + bt

3εbt4)

−1 : ε ∈ Sp×p ,

βc = 0 for every mvf c ∈ C(At) and

Σdsf (M) = ∩0≤t<d(C(At))sf .

As a concrete application of this theorem, let bt3(λ) = expiλtD3 and bt

4(λ) =expiλtD4, where D3 and D4 are positive semidefinite p × p matrices, then thepreceding formulas imply that

M (0)(t) = V

[tD3 00 tD4

]V

and

At(λ) = expiλtV

[D3 00 D4

]VJp

= V

[eiλD3t 0

0 e−iλD4t

]V ,

for 0 ≤ t < d.

If d = ∞, then n = p− rankD3 and nr = p− rankD4.

If d <∞, then n = nr = p.

Page 136: Recent Advances in Operator Theory and its Applications: The Israel Gohberg Anniversary Volume

130 D.Z. Arov and H. Dym

19. Spectral functions for the space H(b3)⊕H∗(b4)

A non decreasing p× p mvf σ(µ) on R is said to be a spectral function for the deBranges space B(E) based on an entire p×m de Branges function E(λ) if∫ ∞

−∞f(µ)∗dσ(µ)f(µ) = ‖f‖B(E)

for every f ∈ B(E). A spectral function of a de Branges space B(E) is said to bean orthogonal spectral function for B(E) if Lp

2(dσ) = B(E). In this section we shalldescribe the set of all spectral functions for the de Branges spaces that arose inthe preceding section.

Let b3(λ) and b4(λ) be a normalized pair of p × p entire inner mvf’s. Thenthe mvf

A(λ) =1√2

[−b3(λ) b4(λ)−1

b3(λ) b4(λ)−1

]V (19.1)

is a normalized entire Jp-inner mvf and

E(λ) =√

2[Ip 0]B(λ) =[b3(λ) (b4(λ))−1

].

The corresponding de Branges space

B(E) = H(b3)⊕H∗(b4) .

The analysis in the preceding example lends itself to consideration of the followingproblem:Find all nondecreasing p× p mvf’s σ(µ) on R such that∫ ∞

∞f(µ)∗f(µ)dµ =

∫ ∞

∞f(µ)∗dσ(µ)f(µ)

for every f ∈ B(E).If the mvf χ1(λ) = b4(λ)b3(λ) is uniformly contractive on a ray in C+, then,

by Lemma 2.4 and Theorem 2.14 of [ArD:04b],

C(A) =(Ip − b3εb4)(Ip + b3εb4)−1 : ε ∈ Sp×p

and σ(µ) is a solution of the problem stated just above if and only if

σ ∈ (C(A))sf .

If ε is a constant unitary matrix and b(λ) = b3(λ)εb4(λ), then

c(λ) = (Ip − b(λ))(Ip + b(λ))−1

is a meromorphic p × p mvf in C with poles in R at the points µ ∈ R at whichdet (Ip + b(µ)) = 0. Let µ0 = 0 if det (Ip + b(0)) = 0 and let µ1, µ2, . . . , denote theremaining poles of c(λ). Then, since Rc(µ) = 0 for all points µ ∈ R that are notpoles of c(λ), the spectral function σc(µ) of c(λ) is a step function with jumps mj

at the points µj . In this setting, the Riesz-Herglotz formula for c(λ) reduces to

c(λ) = iγ +m0

−πiλ +λ

πi

∑j≥1

mj

µj(µj − λ),

Page 137: Recent Advances in Operator Theory and its Applications: The Israel Gohberg Anniversary Volume

Strong Regularity and Inverse Problems 131

where γ = γ∗ is a constant p × p Hermitian matrix and the condition (14.5) isequivalent to the constraint ∑

j≥1

tracemj

µ2j

<∞ .

By Theorem 14.5, the spectral function

σ(µ) =∑

j:µj<µmj (19.2)

of this mvf c(λ) is orthogonal for the space B(E), i.e., Lp2(dσ; R) = B(E). The

corresponding Parseval formula states that∫ ∞

∞f(µ)∗f(µ)dµ =

∑j≥0

f(µj)∗mjf(µj)

for every f ∈ H(b3)⊕H∗(b4). Moreover, if ξ0, ξ1, . . . is any sequence of vectors inCp such that

ξj ∈ rangemj and∑j≥0

ξ∗jmjξj <∞ ,

then there exists a unique f ∈ H(b3)⊕H∗(b4) such that f(µj) = ξj .The matrix-valued jumps mj of σ(µ) at the points µj can be recovered from

the formula for c(λ):

mj = πi limλ→µj

(µj − λ)c(λ) = 2πi limλ→µj

(µj − λ)(Ip + b(λ))−1 .

Moreover, upon using dσ(µ) to calculate the inner product in B(E) instead of dµ,the formula

ξ∗f(ω) = 〈f,KEω ξ〉 ,

which is valid for every f ∈ B(E) and every ξ ∈ Cp, yields the sampling formula

ξ∗f(ω) =∑j≥0

ξ∗KEµj

(ω)mjf(µj) .

In the special case that m = 1 and b(λ) = eiaλ for some a > 0 this reduces to thewell-known Shannon-Kotelnikov sampling formula.

We remark that for each given normalized pair b3, b4 of entire inner p× pmvf’s, there exists at least one normalized monotonic continuous chain bt

3, bt4,

0 ≤ t < d, of entire inner p × p mvf’s such that b3(λ) = bd3(λ) and b4(λ) = bd

4(λ);see, e.g., Theorem 2.4 in [ArD:00a]. Consequently, the set (C(A))sf for the mvfA(λ) defined by formula (19.1) is equal to the set of all spectral functions for thesystems considered in Example 1 with this chain bt

3, bt4, 0 ≤ t < d. In particular,

the spectral function defined by formula (19.2) is an orthogonal spectral functionfor each such system. The same analysis is applicable to the orthogonal spectralfunctions of general canonical integral systems of the form (8.1) with J = Jp thatare considered in the last assertion of Theorem 14.5 and will be discussed in moredetail elsewhere.

Page 138: Recent Advances in Operator Theory and its Applications: The Israel Gohberg Anniversary Volume

132 D.Z. Arov and H. Dym

20. A left tangential accelerant inverse problem

Let the data c(λ); bt3(λ), bt

4(λ), 0 ≤ t < ∞ for the inverse impedance problemfor the system (8.1) with J = Jp be specified by the formulas:

c(λ) = Ip − 2∫ ∞

0

eiλah(a)da, (20.1)

where

h ∈ Lp×p1 ([0,∞)) and h(a) = h(−a)∗ for a.e. point a ∈ (−∞, 0) , (20.2)

i.e., c ∈ Cp×p∩Wp×p+ (Ip) and −h(t) is the accelerant of c(λ) on the interval [0,∞),

andbt3(λ) = eiλtD and bt

4(λ) = Ip for t ≥ 0 (20.3)where D = diag α1, . . . , αp is a positive definite diagonal matrix. Let

fD(a) =p∑

j=1

eje∗jf(αja) and hD(a, b) =

p∑j,k=1

eje∗jh(αja− αkb)eke∗k ,

where ej , j = 1, . . . , p, denotes the standard basis for Cp and let γtD(a, b) denote the

resolvent kernel for the Fredholm integral operator with kernel D1/2hD(a, b)D1/2

on the square [0, t]× [0, t]:

−D1/2hD(a, b)D1/2 + γtD(a, b)−

∫ t

0

D1/2hD(a, c)D1/2γtD(c, b)dc = 0

and

−D1/2hD(a, b)D1/2 + γtD(a, b)−

∫ t

0

γtD(a, c)D1/2hD(c, b)D1/2dc = 0 .

Theorem 20.1. Let (20.1)–(20.3) be in force, let

v1(a) = −Ip + 2∫ a

0

h(b)db and v2(a) = Ip for a ≥ 0 , (20.4)

and letϕD(a, 0) = [(v1)D(a) (v2)D(a)] .

Then the inverse input impedance problem has a unique solution

M(t) =12

∫ t

0

ϕD(a, 0)∗D1/2

D1/2ϕD(a, 0) +

∫ t

0

γtD(a, b)D1/2ϕD(b, 0)db

da .

(20.5)If it is also assumed that h(t) is continuous, then M(t) is differentiable, M ′(t) islocally absolutely continuous on [0,∞) and

M ′(t) = Y1(t)Y1(t)∗ , (20.6)

where

Y1(t)∗ =1√2

(D1/2ϕD(t, 0) +

∫ t

0

γtD(t, b)D1/2ϕD(b, 0)db

). (20.7)

Page 139: Recent Advances in Operator Theory and its Applications: The Israel Gohberg Anniversary Volume

Strong Regularity and Inverse Problems 133

Proof. See Theorems 5.11 and 5.12 in [ArD:05].

Remark 20.2. The conclusions of the last theorem are in force under the lessrestrictive assumption that c(λ) has an accelerant h(t) on every finite interval[0, d] and that that γ = Ip in formula (11.4).

If D = Ip, then we shall write γt(a, b) instead of γtD(a, b) and the formulas in

the preceding theorem simplify:

Theorem 20.3. Let c ∈ Cp×p ∩ W+(Ip) and suppose that in the representation(20.1) the mvf h(t) is continuous on [0, d) and let γt(a, b) denote the solution ofthe resolvent equation

γt(a, b)−∫ t

0

h(a− c)γt(c, b)dc = h(a− b) , 0 ≤ a, b ≤ t < d . (20.8)

Then:

(1) There exists exactly one solution M(t) of the inverse impedance problem(c(λ); eiλtIp, Ip, 0 ≤ t < d) for the system (8.1) with J = Jp. It has a contin-uous second order derivative M ′′(t) on [0, d) and

M ′(t) = Y (t)

[Ip 00 0

]Y (t)∗ , (20.9)

whereY (t) =

[Y1(t) Y2(t)

](20.10)

is the solution of the Cauchy problem

Y ′(t) = Y (t)

[0 γt(t, 0)

γt(0, t) 0

], 0 ≤ t < d , (20.11)

Y (0) = V ,

Y1(t)∗ = v(t) +∫ t

0

γt(t, b)v(b)db , (20.12)

Y2(t)∗ =1√2[Ip Ip] +

∫ t

0

γt(0, b)v(b)db , 0 ≤ t < d . (20.13)

v(b) =1√2

[− Ip + 2

∫ b

0

h(a)da Ip

](20.14)

and(2)

Y (t)jpY (t)∗ = Jp for every t ∈ [0, d) . (20.15)

Proof. See Theorem 5.13 in [ArD:05].

Page 140: Recent Advances in Operator Theory and its Applications: The Israel Gohberg Anniversary Volume

134 D.Z. Arov and H. Dym

21. The rational homogeneous case

In this section we shall check that if the given input impedance matrix c ∈ Cp×p

is a rational p× p mvf that has no poles on R such that c(∞) + c(∞)∗ > 0, thenthe formulas in the previous subsection lead to the same realization formulas forM(t) that were presented in Section 5 of [ArD:02b]. Let

c(λ) = Ip − 2∫ ∞

0

eiλth(t)dt = Ip − 2C(λIn −A)−1B , for λ ∈ C+,

where the exhibited realization is minimal. By Theorem 20.3, M ′(t) = Y1(t)Y1(t)∗,where

Y1(t) = V

[Ip + iC

∫ t

0R12(b)R22(t)−1C∗

iB∗ ∫ t

0R22(b)dbR22(t)−1C∗

],

and [R11(t) R12(t)R21(t) R22(t)

]= exp

−it

[A + BC BB∗

C∗C A∗ + C∗B∗

].

which is consistent with formula (5.31) of [ArD:02b]. For additional informationand formulas on inverse problems with rational data, see, e.g., [AG:95], [AG:01],[GKS:98], [GKS:02] and the references cited therein. We wish to emphasize thatwe do not assume that

Rc(µ) = Ip −∫ ∞

−∞eiµth(t)dt > 0

for every point µ ∈ R. Thus, detRc(µ) may be equal to zero on some subsetof R.

22. Differential systems with potential

In this section we consider differential systems of the form

u′(t, λ) = iλu(t, λ)NJ + u(t, λ)V(t) , 0 ≤ t < d , (22.1)

with an m×m signature matrix J , a constant m×m matrix N such that

N ≥ 0 (22.2)

and an m×m matrix-valued potential V(t) such that

V ∈ Lm×m1, loc ([0, d)) and V(t)J + JV(t)∗ = 0 a.e. . (22.3)

It is readily checked that the matrizant Ut(λ) = U(t, λ), 0 ≤ t < d, of this systemsatisfies the identity

Ut(λ)JUt(ω)∗′ = i(λ− ω)Ut(λ)NUt(ω)∗ for 0 ≤ t < d, (22.4)

and hence thatJ − Ut(λ)JUt(ω)∗

ρω(λ)=

12π

∫ t

0

Us(λ)NUs(ω)∗ds for 0 ≤ t < d. (22.5)

Page 141: Recent Advances in Operator Theory and its Applications: The Israel Gohberg Anniversary Volume

Strong Regularity and Inverse Problems 135

This in turn leads easily to the conclusion that

Ut ∈ U(J) for every t ∈ [0, d) . (22.6)

In particular, Ut(0) is J-unitary and so invertible. Moreover, the mvf

Yt(λ) = Ut(λ)Ut(0)−1 for 0 ≤ t < d,

is the matrizant of the canonical differential system

y′(t, λ) = iλy(t, λ)H(t)J, 0 ≤ t < d, (22.7)

with Hamiltonian

H(t) = Ut(0)NUt(0)∗, 0 ≤ t < d. (22.8)

If

J = V ∗JpV for some constant unitary matrix V , (22.9)

then

V UtV∗ ∈ U(Jp) for every t ∈ [0, d)

and, in keeping with the conventions initiated in Section 16, we say that

bt3, b

t4 ∈ apII(Ut) if bt

3, bt4 ∈ apII(V UtV

∗) .

Moreover, when (22.9) is in force, we shall introduce the following V dependentdefinitions for differential systems of the form (22.1) with potential V(t) and ma-trizant Ut(λ):

(1) The set Cdimp(V) of input impedances is defined by the formula

Cdimp(V) =

⋂0≤t<d

C(V UtV∗) . (22.10)

(2) The generalized Fourier transform

g(λ) = [0 Ip]V1√π

∫ d

0

U(s, λ)Ng(s)ds (22.11)

for every g ∈ Lm2 (Nds; [0, d)) with compact support in [0, d).

(3) The set Σdsf (V) of spectral functions for the system (22.1) is the set of non-

decreasing p× p mvf’s σ(µ) on R for which the Parseval equality∫ ∞

∞g(µ)∗dσ(µ)g(µ) =

∫ d

0

g(s)∗Ng(s)ds (22.12)

holds for every g ∈ Lm2 (Nds; [0, d)) with compact support in [0, d).

Remark 22.1. The set Cdimp(V) coincides with the set Cd

imp(H) of input impedancematrices for the canonical differential system (22.7) with Hamiltonian H(t) given

Page 142: Recent Advances in Operator Theory and its Applications: The Israel Gohberg Anniversary Volume

136 D.Z. Arov and H. Dym

by formula (22.8). The generalized Fourier transform (16.3) for the canonical sys-tem with dM(t) = H(t)dt and H(t) = U(t)NU(t)∗ for 0 ≤ t < d, is related to thetransform (22.11):

(F2f)(λ) = [0 Ip]V1√π

∫ d

0

U(s, λ)U(s, 0)−1H(s)f(s)ds (22.13)

= [0 Ip]V1√π

∫ d

0

U(s, λ)NU(s, 0)∗f(s)ds (22.14)

for f ∈ Lm2 (H(s)ds; [0, d)) with compact support in [0, d).

Theorem 22.2. IfNJ = JN , (22.15)

then the matrizants Ut(λ) and Yt(λ) of the systems (22.1) and (22.7) are bothstrongly regular:

Ut ∈ UsR(J) and Yt ∈ UsR(J) for every t ∈ [0, d) . (22.16)

Proof. Since N and J are both Hermitian matrices,

NJ = JN ⇐⇒ NJ = (NJ)∗

and hence the assumption NJ = JN guarantees that the mvf expiµtNJ isunitary for µt ∈ R. Consequently, standard estimates show that

Ut ∈ Lm×m∞ and Yt ∈ Lm×m

∞ for every t ∈ [0, d) .

Therefore, by Theorem 4.1,

Yt ∈ UsR(J) and Ut ∈ UsR(J) for every t ∈ [0, d) . Systems of the form (22.1) with NJ − JN = 0 and matrizants Ut ∈ UsR(J)

will be considered in Section 25. However, for the moment we focus on systems forwhich the condition NJ = JN is met.

In particular, the condition NJ = JN is met if N is a convex combinationof the orthogonal projections

PJ =Im + J

2and QJ =

Im − J

2,

or, even more generally, if

N = δ1PJ + δ2QJ , where δ1 ≥ 0 , δ2 ≥ 0 and δ1 + δ2 > 0 . (22.17)

The direct problem

The following results on the direct problem are established in [ArD:??]:

Theorem 22.3. Let Ut(λ) = U(t, λ), 0 ≤ t < d, be the matrizant of the system(22.1). Assume that (22.2), (22.3), (22.9) and (22.17) are in force, that At(λ) =V Ut(λ)V ∗ for every t ∈ [0, d) and that the potential V(t) = V(t)∗ a.e. on theinterval [0, d). Then:(1) At ∈ UsR(Jp) for every t ∈ [0, d).(2) eδ1tIp, eδ2tIp ∈ apII(At) for every t ∈ [0, d).

Page 143: Recent Advances in Operator Theory and its Applications: The Israel Gohberg Anniversary Volume

Strong Regularity and Inverse Problems 137

(3) The de Branges spaces B(Et) based on Et(λ) =√

2At(λ)V are independentof the potential V(t) as linear topological spaces, i.e.,

B(Et) =

∫ δ1t

−δ2t

eiλsh(s)ds : h ∈ Lp2([−δ2t, δ1t])

as linear spaces and for each t ∈ [0, d), there exist a pair of positive constantsγ1 = γ1(t) and γ2 = γ2(t) such that

γ1‖f‖2 ≤ ‖f‖B(Et) ≤ γ2‖f‖2

for every f ∈ B(Et).(4) Sp×p ⊂ D(TAtV) for every t ∈ (0, d).(5) C(At) = TAtV[Sp×p] for every t ∈ (0, d) and β = 0 in the integral represen-

tation (14.4) of every mvf c ∈ C(At), 0 < t < d.(6) Cd

imp(V) =⋂

0≤t<d C(At) = ∅.(7) Σd

sf (V) = (Cdimp(V))sf and the integral representation (14.4) defines a one to

one correspondence between these two sets.(8) If d <∞ and V ∈ L1([0, d]), then Cd

imp(V) = C(Ad).

Proof. A proof will be supplied in [ArD:??].

Corollary 22.4. If At(λ) = A(t, λ), 0 ≤ t < d, is the matrizant of the system (22.1)with J = Jp and if the conditions (22.2) and (22.3) are in force, then:

(a) N = κIm =⇒ eiκtλIp, eiκtλIp ∈ apII(At) .

(b) N = κPJ =⇒ eiκtλIp, Ip ∈ apII(At) .(c) N = κQJ =⇒ Ip, e

iκtλIp ∈ apII(At) .

Remark 22.5. In the preceding theorem, the sets C(At) depend only upon thepotential V(t) and the positive number κ = δ1 + δ2 and not on the particularchoices δ1 ≥ 0 and δ2 ≥ 0. This follows from the fact that the mvf eiδλtUt(λ) is thematrizant of the system (22.1) with potential V(t) that is independent of δ andwith N = (δ1 + δ)PJ +(δ2− δ)QJ for every number δ in the interval −δ1 ≤ δ ≤ δ2.Consequently, the sets Cd

imp(V) and Σdsf (V) depend only upon the potential V(t)

and the number κ.

The inverse input impedance problem

The data for the inverse input impedance problem for differential systems of theform (22.1) on an interval [0, d) is a mvf c ∈ Cp×p and the right-hand endpointd, 0 < d ≤ ∞, of the interval and the problem is to find a locally summablepotential V(t) of the prescribed form on [0, d) such that c ∈ Cd

imp(V). In the settingof Theorem 22.3, it is not necessary to specify a chain bt

3, bt4, 0 ≤ t < d, to solve

this inverse problem.

Theorem 22.6. Let c ∈ Cp×p and 0 < d ≤ ∞ be given and let N of the form (22.17)and V as in (22.9) be fixed. Then:

Page 144: Recent Advances in Operator Theory and its Applications: The Israel Gohberg Anniversary Volume

138 D.Z. Arov and H. Dym

(1) There exists at most one differential system of the form (22.1) with the givenN and J and potential V(t) = V(t)∗ a.e. on [0, d) that meets the condition(22.3) with an input impedance equal to c(λ).

(2) If c ∈ Cp×p has a continuous accelerant h(t) on the interval [0, d) and if thereal part of the matrix γ in definition (11.4) is positive definite and κ > 0, thenthere exists exactly one locally summable potential V(t), 0 ≤ t < d, such that

V(t) = V(t)∗ and V(t)J + JV(t)∗ = 0 a.e. on the interval [0, d) (22.18)

and c ∈ Cdimp(V). Moreover, this potential V(t) is continuous on the interval

[0, d) and is of the form

V(t) = V ∗V

[0 a(t)

a(t)∗ 0

]VV for 0 ≤ t < d . (22.19)

If κ = 1 and c(λ) is given by formula (20.1), then a(t) = γt(t, 0), whereγt(a, b) is the unique solution of the integral equation (20.8).

Proof. Assertion (1) follows from (1) and (2) of Theorem 22.3, Theorem 12.1 andthe fact that the set Cd

imp(V) for the system (22.1) coincides with the set Cdimp(H)

for the corresponding canonical differential system (22.7) with Hamiltonian (22.8).Assertion (2) follows from Theorem 20.3, Remark 22.5, the connection between thesystems (22.1) and (22.7) and the reduction of the matrix γ in formula (11.4) withRγ > 0 to the case γ = Ip considered in Section 5.2 in [ArD:05].

If c ∈ Hp×p∞ ∩ Cd

imp(V) and d = ∞, then Cdimp(V) = c for every N of

the form (22.17) with δ1 + δ2 > 0, i.e., the limit point case prevails for all suchκ = δ1+δ2. This follows from the upper bounds on the left and right semiradii of theWeyl balls that are given in formulas (3.36) and (3.37) of [ArD:05]. Consequently,V ∈ Lm×m

1 ([0,∞)) in this case. Moreover, if δ1 > 0, then the values of the inputimpedance c(λ) may be characterized by the Weyl-Titchmarsh property that isdiscussed in Section 17:

[ξ∗ η∗]V Ut(λ)V ∗ ∈ Lm×m2 ⇐⇒ η = c(λ)ξ

for every point λ ∈ C+.

The inverse spectral problem

The data for the inverse spectral problem is a p× p nondecreasing mvf σ(µ) on Rthat meets the condition (14.5). The special form of N in (22.17) automaticallyinsures that the matrizant will be strongly regular and prescribes the associatedpair of the matrizant in accordance with (1) and (2) of Theorem 22.3. Moreover,for a fixed pair of nonnegative numbers δ1 ≥ 0, δ2 ≥ 0 with κ = δ1 + δ2 > 0,there is at most one mvf c ∈ Cd

imp(V) with the spectral function in its Riesz-Herglotz representation (14.4) equal to the given spectral function σ(µ). This willbe established in Theorem 23.4 for the case δ1 = δ2. The case δ1 = δ2 may bereduced to the case δ1 = δ2 by invoking Remark 22.5. Consequently, Theorem22.6 yields exactly one solution for the inverse spectral problem.

Page 145: Recent Advances in Operator Theory and its Applications: The Israel Gohberg Anniversary Volume

Strong Regularity and Inverse Problems 139

23. Dirac systems

Differential systems of the form (22.1) with rankPJ = rankQJ = p and N = κIm

for some κ > 0 and potentials V(t) ∈ Lm×m1, loc that meet the two conditions

(1) V(t)J + JV(t)∗ = 0 for a.e t ∈ [0, d)(2) V(t) = V(t)∗ for a.e t ∈ [0, d),

are called Dirac systems. Without loss of generality we may assume that κ = 1and J = Jp. Then the potential must of the form

V(t) =

[v1(t) −iv2(t)iv2(t)∗ −v1(t)

], 0 ≤ t < d , (23.1)

where v1(t) = v1(t)∗ and v2(t) = v2(t)∗ a.e. and the matrizant Ut(λ) = U(t, λ) isa solution of the system

u′t(λ) = iλut(λ)Jp + ut(λ)V(t) , for 0 ≤ t < d , (23.2)

with potential V(t) of the form (23.1).The generalized Fourier transform for this system is given by the formula

g(λ) =1√π

∫ d

0

[u21(s, λ) u22(s, λ)]g(s)ds , (23.3)

where g ∈ Lm2 ([0, d)) and has compact support in [0, d). Consequently, a nonde-

creasing p× p mvf σ(µ) on R is said to be a spectral function for the Dirac systemwith N = Im and J = Jp if the Parseval identity∫ ∞

−∞g(µ)∗dσ(µ)g(µ) =

∫ d

0

g(s)∗g(s)ds

holds for every g ∈ Lm2 ([0, d)) with compact support in [0, d).

The direct problem

Theorem 23.1. Let At(λ) = A(t, λ), 0 ≤ t < d, be the matrizant of the system(23.2) with a locally summable potential V(t) of the form (23.1). Then:(1) At ∈ UsR(Jp) for every t ∈ [0, d).(2) etIp, etIp ∈ apII(At) for every t ∈ [0, d).(3) The de Branges spaces B(Et) based on Et(λ) =

√2At(λ)V are independent

of the potential V(t) as linear topological spaces, i.e.,

B(Et) =∫ t

−t

eiλsh(s)ds : h ∈ Lp2([−t, t])

as linear spaces and for each t ∈ [0, d), there exist a pair of positive constantsγ1 = γ1(t) and γ2 = γ2(t) such that

γ1‖f‖2 ≤ ‖f‖B(Et) ≤ γ2‖f‖2

for every f ∈ B(Et).(4) Assertions (4)–(8) of Theorem 22.3 are in force.

Page 146: Recent Advances in Operator Theory and its Applications: The Israel Gohberg Anniversary Volume

140 D.Z. Arov and H. Dym

Proof. This theorem is a special case of Theorem 22.3.

The inverse input impedance problem

The inverse input impedance problem for differential systems of the form (23.2)on an interval [0, d) with a locally summable potential V(t) of the form (23.1) isto find V(t), given a mvf c ∈ Cp×p and the right-hand endpoint d, 0 < d ≤ ∞, ofthe interval.

Theorem 23.2. Let c ∈ Cp×p and 0 < d ≤ ∞ be given. Then:(1) There exists at most one differential system of the form (23.2) with a locally

summable potential V(t) of the form (23.1) and an input impedance equal toc(λ).

(2) There exists exactly one such differential system if c ∈ Cp×p has a continu-ous accelerant h(t) on the interval [0, 2d) and γ = Ip in the representation(22.19). Moreover, in this case, the potential V(t) is given by formula (23.1)with v1(t) = −Ra(2t), v2(t) = Ia(2t), where a(t) = γt(t, 0) and γt(a, b) isthe unique solution of the integral equation (20.8) with d replaced by 2d.

Proof. This theorem is a special case of Theorem 22.6. Remark 23.3. The condition in (2) is automatically met if c ∈ Cp×p ∩W+(Ip) andthe mvf h(t) in the representation (20.1) is continuous on [0,∞). The solution ofthe inverse input impedance problem for a Dirac system on a finite interval [0, d)depends only on the accelerant h(t) of the mvf c(λ) on the interval [0, 2d). Inother words, the solution will be the same for all input impedances with the sameaccelerant on the interval [0, 2d).

The inverse spectral problem

Theorem 23.4. Let σ(µ) be a nondecreasing p×p mvf on R that meets the constraint(14.5) and let 0 < d ≤ ∞. Then:(1) There exists at most one differential system of the form (23.2) with a locally

summable potential V(t) of the form (23.1) with spectral function σ(µ).(2) If the given nondecreasing p×p mvf σ(µ) is differentiable at every point µ ∈ R

and

σ′(µ) = Ip −∫ ∞

−∞eiµth(t)dt ,

where h(t) is a continuous summable p×p mvf on R such that h(t) = h(−t)∗for every point t ∈ R, then there exists exactly one such differential systemon the interval [0,∞). Moreover, in this case, the potential V(t) is givenby formula (23.1) with v1(t) = −Ra(2t) and v2(t) = Ia(2t), where a(t) =γt(t, 0) and γt(a, b) is the unique solution of the integral equation (20.8) withd replaced by 2d.

Proof. The asserted conclusions follow from the preceding theorem and Theorem15.1 applied to the canonical system (8.1) that corresponds to the differentialsystem (23.2) with potential. The latter theorem implies that there is a family of

Page 147: Recent Advances in Operator Theory and its Applications: The Israel Gohberg Anniversary Volume

Strong Regularity and Inverse Problems 141

possible Hermitians H(α)(t) that are parametrized by a Hermitian p× p matrix αand are connected by the formula

H(α)(t) =

[Ip iα

0 Ip

]H(0)(t)

[Ip 0−iα Ip

].

However, since any Hermitian that corresponds to a Dirac system must also satisfythe condition

H(α)(0) = Im ,

it follows that there is only one choice of α that yields a solution.

24. Krein systems

Differential systems of the form (22.1) with rankPJ = rankQJ = p and N = κPJ

or N = κQJ for some κ > 0 and potentials V(t) ∈ Lm×m1 ,loc that meet the two

conditions

(1) V(t)J + JV(t)∗ = 0 for a.e t ∈ [0, d)(2) V(t) = V(t)∗ for a.e t ∈ [0, d),

are called Krein systems.Without loss of generality, we may assume that κ = 1 and J = jp. Then the

matrizant Ut(λ) = U(t, λ) is a solution of the system

u′t(λ) = iλut(λ)

[Ip 00 0

]jp + ut(λ)V(t) for 0 ≤ t < d (24.1)

with potential V(t) of the form

V(t) =

[0 a(t)

a(t)∗ 0

], 0 ≤ t < d . (24.2)

Therefore, since VjpV∗ = Jp and V = V∗, the general formulas (22.10)–(22.12)

imply that

Cdimp(V) =

⋂0≤t<d

C(VUtV)

and, as √2[0 Ip]VUs(λ) = [Ip Ip]Us(λ) ,

the generalized Fourier transform for the system (24.1) may be taken equal to

g(λ) =1√2π

∫ d

0

(u11(s, λ) + u21(s, λ))g(s)ds (24.3)

for every function g ∈ Lp2([0, d)) with compact support in [0, d).

Page 148: Recent Advances in Operator Theory and its Applications: The Israel Gohberg Anniversary Volume

142 D.Z. Arov and H. Dym

Correspondingly, a nondecreasing p× p mvf σ(µ) on R is a spectral functionfor the system (24.1) if∫ ∞

−∞g(µ)∗dσ(µ)g(µ) =

∫ d

0

g(s)∗g(s)ds

for every function g ∈ Lp2([0, d)) with compact support in [0, d).

The direct problem

Theorem 24.1. Let Ut(λ) = U(t, λ), 0 ≤ t < d, be the matrizant of the system(24.1) with a locally summable potential V(t) of the form (24.2) and let At(λ) =VUt(λ)V for every t ∈ [0, d). Then:

(1) At ∈ UsR(Jp) for every t ∈ [0, d).(2) etIp, Ip ∈ apII(At) for every t ∈ [0, d).(3) The de Branges spaces B(Et) based on Et(λ) =

√2At(λ)V are independent

of the potential V(t) as linear topological spaces, i.e.,

B(Et) =∫ t

0

eiλsh(s)ds : h ∈ Lp2([0, t])

as linear spaces and for each t ∈ [0, d), there exist a pair of positive constantsγ1 = γ1(t) and γ2 = γ2(t) such that

γ1‖f‖2 ≤ ‖f‖B(Et) ≤ γ2‖f‖2

for every f ∈ B(Et).(4) Assertions (4)–(8) of Theorem 22.3 are in force.

Proof. This theorem is a special case of Theorem 22.3.

The inverse input impedance problem

The inverse input impedance problem for differential systems of the form (24.1)on an interval [0, d) with a locally summable potential V(t) of the form (24.2) isto find V(t), given a mvf c ∈ Cp×p and the right-hand endpoint d, 0 < d ≤ ∞, ofthe interval.

Theorem 24.2. Let c ∈ Cp×p and 0 < d ≤ ∞ be given. Then:

(1) There exists at most one differential system of the form (24.1) with a locallysummable potential V(t) of the form (24.2) with input impedance equal toc(λ).

(2) There exists exactly one such differential system if c ∈ Cp×p has a continuousaccelerant h(t) on the interval [0, d) and γ = Ip in the representation for-mula (11.4). Moreover, in this case, the potential V(t) may be obtained fromformula (24.2), where a(t) = γt(t, 0) and γt(a, b) is the unique solution of theintegral equation (20.8).

Proof. This theorem is a special case of Theorem 22.6.

Page 149: Recent Advances in Operator Theory and its Applications: The Israel Gohberg Anniversary Volume

Strong Regularity and Inverse Problems 143

Remark 24.3. The condition in (2) is automatically met if c ∈ Cp×p ∩W+(Ip) andthe mvf h(t) in the representation (20.1) is continuous on [0,∞). Moreover, thesolution of the inverse impedance problem for a Krein system on a finite interval[0, d) depends only on the accelerant h(t) on the interval [0, d). In other words,the solution will be the same for all input impedances with the same acceleranton the interval [0, d).

The inverse spectral problem

Theorem 24.4. Let σ(µ) be a nondecreasing p×p mvf on R that meets the constraint(14.5) and let 0 < d ≤ ∞. Then:(1) There exists at most one differential system of the form (24.1) with a locally

summable potential V(t) of the form (24.2) with spectral function σ(µ).(2) If the given nondecreasing p×p mvf σ(µ) is differentiable at every point µ ∈ R

and

σ′(µ) = Ip −∫ ∞

−∞eiµth(t)dt ,

where h(t) is a continuous summable p×p mvf on R such that h(t) = h(−t)∗for every point t ∈ R, then there exists exactly one such differential systemon the interval [0,∞). Moreover, in this case it may be obtained from formula(24.2), where a(t) = γt(t, 0) and γt(a, b) is the unique solution of the integralequation (20.8).

Proof. The asserted conclusions follow from Theorem 23.4 and Remark 22.5. Theorem 20.3, and the solution of the inverse spectral problem for the dif-

ferential systems with potential that are now called Dirac and Krein systems werefirst announced by M. G. Krein [Kr:55], [Kr:56], given a continuous accelerant onan appropriate interval (see also [Sak:00a]). Krein also formulated a converse state-ment: if the potential V(t) of such a system is continuous, then c ∈ Cd

imp(V) hasa continuous accelerant on an appropriate interval. Proofs of a number of relatedstatements may be found in [KrL:85]. For additional discussion, see also [MeA:67]and [KrMA:86].

25. A differential system with potential with NJ = JN

In this section we shall consider differential systems of the form (22.1) with

N =[Ip 00 0

], J = Jp and potentials V ∈ Lm×m

1, loc ([0, d)) (25.1)

that are subject to the two additional constraints

V(t)Jp + JpV(t)∗ = 0 a.e. in [0, d) (25.2)

and

det

[0 Ip]V(t)[Ip

0

]= 0 a.e. in [0, d) . (25.3)

Page 150: Recent Advances in Operator Theory and its Applications: The Israel Gohberg Anniversary Volume

144 D.Z. Arov and H. Dym

The two specific choices of potential

V(t) =[

v(t) 0−iIp −v(t)

], 0 ≤ t < d , with v(t) = v(t)∗ a.e. in [0, d) (25.4)

and

V(t) =[

0 iq(t)−iIp 0

], 0 ≤ t < d , with q(t) = q(t)∗ a.e. in [0, d) ,

(25.5)which both fit into this setting, will play a useful role in our analysis of theSchrodinger equation in the next section. This stems from the fact that for thegiven choice of N and these two potentials

NJpN = 0 and NJpV(t) + V(t)NJp = iIm a.e. in [0, d) .

Consequently, if V ∈ ACm×mloc ([0, d)) and

Y ′t (λ) = iλYt(λ)NJp + Yt(λ)V(t) for 0 ≤ t < d ,

then Y ′t ∈ ACm×m

loc ([0, d)) and

Y ′′t (λ) = Y ′

t (λ)(iλNJp + V(t)) + Yt(λ)V ′(t)= Yt(λ)(iλNJp + V(t)) (iλNJp + V(t) + Yt(λ))V ′(t)

= −λYt(λ) + Yt(λ)(V(t)2 + V ′(t)) ,

i.e., Yt(λ) is automatically the solution of a Schrodinger equation with potentialV(t)2 + V ′(t). Notice that

NJp − JpN = −iJp

for this choice of N . Therefore, we cannot invoke Theorem 22.2 to conclude thatthe matrizant Yt(λ) of the corresponding system is strongly regular. In fact, Yt(λ) isan entire mvf of minimal exponential type and hence of class US(Jp). Nevertheless,it will turn out to be possible to use the interplay between these systems and somerelated systems with strongly regular matrizants to obtain useful conclusions fora class of Schrodinger equations.

Theorem 25.1. Let Yt(λ) = Y (t, λ), 0 ≤ t < d, denote the matrizant of thedifferential system (22.1) with N , J and V(t) specified by (25.1)–(25.3) and letBt(λ) = Yt(λ)V for 0 ≤ t < d. Then the generalized Fourier transform is given bythe formula

g(λ) =1√π

∫ d

0

y21(s, λ)g(s)ds (25.6)

for every g ∈ Lp2([0, d)) with compact support in [0, d) and:

(1) Yt ∈ U(Jp) for every t ∈ [0, d).(2) Sp×p ⊂ D(TYtV) for every t ∈ (0, d).(3) C(Yt) = TYtV[Sp×p] for every t ∈ (0, d).(4) Cd

imp(V) =⋂

0≤t<d C(Yt) = ∅.(5) Σd

sf (V) = (Cdimp(V))sf .

(6) If d <∞ and V ∈ L1([0, d]), then Cdimp(V) = C(Yd).

Page 151: Recent Advances in Operator Theory and its Applications: The Israel Gohberg Anniversary Volume

Strong Regularity and Inverse Problems 145

Proof. In view of (25.2), the matrizant Yt ∈ U(Jp). Therefore, the m×m mvf

Y (t) = Y (t, 0)

is Jp unitary and the mvf

At(λ) = Y (t, λ)Y (t)−1

is the matrizant of the canonical differential system (22.7) with continuous Hamil-tonian

H(t) = Y (t)NY (t)∗ = Y1(t)Y1(t)∗ for t ∈ [0, d) ,

where the m× p mvf Y1(t) is the first block in the block column decomposition

Y (t) = [Y1(t) Y2(t)] .

Moreover, if d < ∞ and V ∈ Lm×m1 ([0, d]), then the mvf A(λ) = Ad(λ) is the

characteristic function A(λ) = Im + iλF (I − λA)−1F ∗Jp of the Livsic-BrodskiiJp-node N = (K,F ;X,Cm), based on the bounded linear operators K ∈ L(X)and F ∈ L(X,Cp) that are defined by the formulas

(Kg)(t) = iY1(t)∗Jp

∫ d

t

Y1(s)g(s)ds and Fg =∫ d

0

Y1(s)g(s)ds ,

where X = Lp2([0, d]) and g ∈ X (and K −K∗ = iFJpF

∗).Let

L =[

0Ip

]and F2 = L∗F .

Then, by Corollary 5.9 in [ArD:04b],∨n≥0

KnF ∗2 Cp = Lp

2([0, d)) ⇐⇒ kerK ∩ kerF = 0 .

Moreover, if this last condition is in force, then, by Theorems 5.10 and 2.14 in[ArD:04b],(a) Sp×p ⊂ D(TAV) ⇐⇒ kerF ∗

2 = 0 .(b) kerK∗ = 0 and rangeK∗ ∩ rangeF ∗

2 = 0 =⇒ Σdsf (H) = (C(A))sf .

Therefore, to complete the proof of the theorem, it suffices to check that(a) kerK = 0.(b) kerK∗ = 0.(c) kerF ∗

2 = 0.(d) rangeK∗ ∩ rangeF ∗

2 = 0.This can be verified by straightforward calculations that exploit the identity[

Y1(t)∗JpY1(t) Y1(t)∗JpY2(t)Y2(t)∗JpY1(t) Y2(t)∗JpY2(t)

]=[

0 −Ip

−Ip 0

],

which is valid for every t ∈ [0, d), the equation Y ′(t) = Y (t)V(t) and the fact thatthe 21 block v21(t) of the potential V(t) is invertible a.e. on the interval [0, d).Details will be presented elsewhere.

Page 152: Recent Advances in Operator Theory and its Applications: The Israel Gohberg Anniversary Volume

146 D.Z. Arov and H. Dym

26. Spectral problems for the Schrodinger equation

In this section we shall indicate how to extract information on direct and inversespectral problems for the Schrodinger equation

−u′′(x, λ) + u(x, λ)q(x) = λu(x, λ) , 0 ≤ x < d , (26.1)

with a p×p matrix-valued potential q(x) from the corresponding results that werediscussed earlier for differential systems of the form (22.1). The direct spectralproblem will be considered in the class of potentials q(t) that satisfy the conditions

(A1) q(t) = q(t)∗ a.e. on [0, d) and q ∈ Lm×m1, loc ([0, d)).

The inverse spectral problem will be discussed under a more stringent conditionon q(t):

(A2) There exists a solution v(t) of the Riccati equation

v′(t) + v(t)2 = q(t) , for every t ∈ [0, d) , (26.2)

in the class of p× p mvf’s v(t) such that

v ∈ ACp×ploc ([0, d)) and v(t) = v(t)∗ for every t ∈ [0, d) . (26.3)

Let ψ(t, λ) and ϕ(t, λ) be the unique solutions of equation (26.1) that meetthe initial conditions

ψ(0, λ) = Ip , ψ′(0, λ) = 0 , ϕ(0, λ) = 0 and ϕ′(0, λ) = Ip ,

respectively, and let

U(t, λ) =[ψ(t, λ) ψ′(t, λ)ϕ(t, λ) ϕ′(t, λ)

](26.4)

be the fundamental matrix of equation (26.1).In the present discussion, we focus on spectral problems for the Schrodinger

equation (26.1) that are related to the generalized Fourier transform

g(λ) =1√π

∫ d

0

ϕ(s, λ)g(s)ds (26.5)

of vvf’s g ∈ Lp2([0, d)) with compact support in [0, d).

A nondecreasing p × p mvf σ(µ) on R is said to be a spectral function of(26.1) with respect to this transform if the Parseval equality∫ ∞

∞g(µ)∗dσ(µ)g(µ) =

∫ d

0

g(s)∗g(s)ds (26.6)

holds for every g ∈ Lp2([0, d)) with compact support in [0, d). The symbol Σd

sf (q)will be used to denote the set of all spectral functions of (26.1) with respect tothis transform.

Page 153: Recent Advances in Operator Theory and its Applications: The Israel Gohberg Anniversary Volume

Strong Regularity and Inverse Problems 147

The direct spectral problem when (A1) is in force

The direct spectral problem is to describe the set Σdsf (q) for a given potential q(t)

on [0, d). The solution of this problem will be given under assumption (A1).

Theorem 26.1. Let U(t, λ) denote the fundamental matrix of equation (26.1), let

X(t, λ) =[iIp 00 Ip

]U(t, λ)

[−iIp 0

0 Ip

]for 0 ≤ t < d (26.7)

and assume that the potential q(t) satisfies the condition (A1). Then(1) Ut(λ) = U(t, λ) is the matrizant of the differential system (22.1) with

N =[Ip 00 0

], J = −Jp and potential V(t) =

[0 q(t)Ip 0

],

i.e.,

U ′(t, λ) = iλU(t, λ)[Ip 00 0

](−Jp) + U(t, λ)

[0 q(t)Ip 0

](26.8)

for 0 ≤ t < d and U(0, λ) = Im.(2) Xt(λ) = X(t, λ) is the matrizant of the differential system (22.1) with

N =[Ip 00 0

], J = Jp and potential V(t) =

[0 iq(t)

−iIp 0

],

i.e.,

X ′(t, λ) = iλX(t, λ)[Ip 00 0

]Jp + X(t, λ)

[0 iq(t)

−iIp 0

](26.9)

for 0 ≤ t < d, and X(0, λ) = Im.(3) Ut ∈ U(−Jp) and Xt ∈ U(Jp) for every t ∈ [0, d).(4) Sp×p ⊂ D(TXtV) for every t ∈ (0, d).(5) C(Xt) = TXtV[Sp×p] for every t ∈ (0, d).(6) Ct

imp(q) = C(Xt) for every t ∈ [0, d).(7) Cd

imp(q) =⋂

0≤t<d C(Xt) = ∅.(8) Σd

sf (q) = (Cdimp(q))sf = Σd

sf (V).(9) If d <∞ and q ∈ Lp×p

1 ([0, d]), then Cdimp(q) = C(Xd).

Proof. Let u(t, λ) be a solution of equation (26.1) and let

y(t, λ) = [u(t, λ) u′(t, λ)] .

Then

y′(t, λ) = [u′(t, λ) u′′(t, λ)]= [u′(t, λ) u(t, λ)((q(t) − λIp)]

= λy(t, λ)[

0 −Ip

0 0

]+ y(t, λ)

[0 q(t)Ip 0

]= iλy(t, λ)

[Ip 00 0

](−Jp) + y(t, λ)

[0 q(t)Ip 0

]

Page 154: Recent Advances in Operator Theory and its Applications: The Israel Gohberg Anniversary Volume

148 D.Z. Arov and H. Dym

This justifies the first assertion. The remaining assertions follow from Theorem25.1 and the identity

−Jp =[−iIp 0

0 Ip

]Jp

[iIp 00 Ip

].

Remark 26.2. Assertions (2) and (3) of the theorem can be formulated directly interms of the fundamental matrix U(t, λ):

(2′) F(−Jp) ⊂ D(TUt) for every t ∈ [0, d).(3′) C(At) = iTUt [F(−Jp)] for every t ∈ [0, d).

The direct spectral problem when (A2) is in force

If assumption (A2) is in force and q(t) = v(t)2 + v′(t), then the mvf’s

Y (t, λ) =[Ip v(0)0 −iIp

]U(t, λ)

[Ip −iv(t)0 iIp

]for 0 ≤ t < d (26.10)

and

At(λ) = LλYt(λ2)L−1λ for 0 ≤ t < d , where Lλ =

[Ip 00 λIp

](26.11)

are matrizants of differential systems of the form (22.1). These systems play auseful role in the study of spectral problems for the Schrodinger equation withpotential q(t) = v(t)2 + v′(t), because the generalized Fourier transforms of allthree systems of equations are simply related. The following table, summarizesthe main facts concerning the four matrizants that have been introduced in thissection and the corresponding transforms for the convenience of the reader.

Matr. N J V(t) g(λ)

Ut(λ)[Ip 00 0

]−Jp

[0 q(t)Ip 0

]−i∫ d

0 u21(s, λ)g(s)ds ,g ∈ Lp

2

Xt(λ)[Ip 00 0

]Jp

[0 iq(t)

−iIp 0

]−i∫ d

0x21(s, λ)g(s)ds ,

g ∈ Lp2

Yt(λ)[Ip 00 0

]Jp

[v(t) 0−iIp −v(t)

]−i∫ d

0 y21(s, λ)g(s)ds ,g ∈ Lp

2

At(λ) Im Jp

[v(t) 00 −v(t)

] −i∫ d

0a21(s, λ)g(s)ds

−i∫ d

0 a22(s)h(s)ds ,g, h ∈ Lp

2

Moreover,

x21(s, λ) = y21(s, λ) = −iu21(s, λ) =a21(s,

√λ)√

λ= −iϕ21(s, λ)

for s ∈ [0, d). This connection permits one to reduce the spectral problem for theSchrodinger equation (26.1) to a spectral problem for Dirac systems. This strategywas initiated by M.G. Krein in [Kr:55].

Page 155: Recent Advances in Operator Theory and its Applications: The Israel Gohberg Anniversary Volume

Strong Regularity and Inverse Problems 149

Theorem 26.3. Let U(t, λ) be the fundamental matrix for the Schrodinger equation(26.1) with a potential q(t) that satisfies condition (A2), let v(t) be the p× p mvfthat appears in this condition and let the mvf Y (t, λ) be defined by (26.10). Then:(1) Yt(λ) = Y (t, λ) is the matrizant of the differential system (22.1) with

N =[Ip 00 0

], J = Jp and potential V(t) =

[v(t) 0−iIp −v(t)

], (26.12)

i.e.,

Y ′(t, λ) = iλY (t, λ)[Ip 00 0

]Jp + Y (t, λ)

[v(t) 0−iIp −v(t)

](26.13)

for 0 ≤ t < d, and Y (0, λ) = Im.(2) Yt ∈ U(Jp) for every t ∈ [0, d).(3) Sp×p ⊂ D(TYtV) for every t ∈ (0, d).(4) C(Yt) = TYtV[Sp×p] for every t ∈ (0, d).(5) Cd

imp(q) = −iv(0) + C(Yt) for every t ∈ [0, d).(6) Ct

imp(q) = −iv(0) + Cdimp(V).

(7) Σdsf (q) = Σd

sf (V) = (Cdimp(V))sf .

(8) If d < ∞ and q ∈ Lp×p1 ([0, d]), then Cd

imp(q) = −iv(0) + C(Yd) and Σdsf (q) =

(C(Yd))sf .

Proof. Let

T (t) =[Ip −iv(t)0 iIp

]for 0 ≤ t < d .

ThenY (t, λ) = T (0)−1U(t, λ)T (t) for 0 ≤ t < d ,

and, in view of formula (26.8),

Y ′(t, λ) = T (0)−1U ′(t, λ)T (t) + T (0)−1U(t, λ)T ′(t)

= T (0)−1U(t, λ)iλN(−Jp) +

[0 q(t)Ip 0

]T (t)

+ T (0)−1U(t, λ)T ′(t) .

Formula (26.13) now follows easily from the fact that for the given choice of N

N(−Jp)T (t) = T (t)NJp and[

0 q(t)Ip 0

]T (t) + T ′(t) = T (t)

[v(t) 0−iIp −v(t)

]when

q(t) = v(t)2 + v′(t) .

Assertion (2) follows from the fact that V(t)Jp + JpV(t)∗ = 0 a.e. on [0, d).Next, since

[0 Ip]Y (t, λ)[Ip

0

]= −i[0 Ip]U(t, λ)

[Ip

0

]= −iϕ(t, λ) for 0 ≤ t < d ,

Page 156: Recent Advances in Operator Theory and its Applications: The Israel Gohberg Anniversary Volume

150 D.Z. Arov and H. Dym

the generalized Fourier transform (25.6)) based on the matrizant Y (t, λ) coincideswith the generalized Fourier transform (26.5) up to a constant factor of modulusone. Therefore,

Σtsf (q) = Σt

sf (V) for 0 ≤ t < d .

Consequently, assertions (3)–(8) follow from Theorem 26.1 and formula (26.10).

Theorem 26.4. Let assumption (A2) be in force for the potential q(t) = v(t)2+v′(t)of the Schrodinger equation, let Yt(λ) denote the matrizant considered in Theorem26.3. Then the mvf At(λ) = LλYt(λ2)L−1

λ is the matrizant of the Dirac system(23.2) and

V(t) =[v(t) 00 −v(t)

], 0 ≤ t < d ,

i.e., At(λ) is a solution of the Cauchy problem

A′t(λ) = iλAt(λ)Jp + At(λ)V(t) , 0 ≤ t < d ,

A0(λ) = Im .

Proof. Clearly

A′t(λ) = LλY

′t (λ2)L−1

λ = Lλ

iλ2Y (t, λ2)NJp + Y (t, λ2)V(t)

,

with N and V(t) as in (26.12). Therefore, since

iλ2LλNJpL−1λ = iλNJp

and

LλV(t)L−1λ = V(t) + iλ

[0 00 Ip

]Jp ,

for this choice of N and V(t), the proof is easily completed. Theorem 26.5. The fundamental matrix Ut(λ) = U(t, λ) for the Schrodinger equa-tion (26.1) with potential q(t) = v(t)2 + v′(t) that satisfies assumption (A2) enjoysthe following properties:(1) Ut ∈ US(−Jp) for every t ∈ [0, d).

(2) lim supr↑∞ln max‖Ut(λ)‖ : |z| ≤ r

r1/2 = lim supµ↓−∞ln ‖Ut(µ)‖|µ|1/2 = t .

Proof. By Theorem 23.1 etIp, etIp ∈ apII(At) and

lim supr↑∞

ln max‖At(λ)‖ : |λ| ≤ rr

= lim supν↑∞

ln ‖At(±iν)‖ν

= t .

But this serves to establish the second assertion, since

‖At(λ)‖ ≤ |λ|‖Yt(λ2)‖ and ‖Yt(λ2)‖ ≤ |λ|‖At(λ)‖ when |λ| ≥ 1

and the matrizants Yt(λ) and Ut(λ) are related by formula (26.11). Thus, as Ut(λ)is an entire m×m mvf of minimal exponential type, assertion (1) is in force. Remark 26.6. Analogous conclusions hold for the matrizants Xt(λ) and Yt(λ).

Page 157: Recent Advances in Operator Theory and its Applications: The Israel Gohberg Anniversary Volume

Strong Regularity and Inverse Problems 151

de Branges spaces

LetEX

t (λ) =√

2[0 Ip]Xt(λ)Vdenote the de Branges function based on the matrizant Xt(λ) that was introducedin Theorem 26.1. Then, in view of formulas (26.4) and (26.7), it is readily checkedthat

EXt (λ) = [ϕ′(t, λ) + iϕ(t, λ) ϕ′(t, λ) − iϕ(t, λ)]

and, since x21(s, λ) = −iu21(s, λ) = −iϕ(s, λ), that the corresponding de Brangesspace

B(EXt ) =

1√π

∫ t

0

ϕ(s, λ)g(s)ds : g ∈ Lp2([0, t]) for every t ∈ [0, d)

(26.14)

with norm

〈g, g〉B(EXt ) =

∫ ∞

−∞g(µ)∗∆X

t (µ)g(µ)dµ ,

where

g(µ) =1√π

∫ t

0

ϕ(s, λ)g(s)ds

and

∆Xt (µ)−1 = (ϕ′(t, µ)− iϕ(t, µ))(ϕ′(t, µ)− iϕ(t, µ))∗

= ϕ′(t, µ)ϕ′(t, µ)∗ + ϕ(t, µ)ϕ(t, µ)∗ ,

since the fundamental matrix Ut ∈ U(−Jp).An analogous set of calculations for the de Branges function

EAt (λ) =

√2[0 Ip]At(λ)V

based on the matrizant At(λ) that was introduced in Theorem 26.4 leads to theconclusion that

EAt (λ) = [a22(t, λ)− a21(t, λ) a22(t, λ) + a21(t, λ)]

and that the corresponding de Branges space

B(EAt ) =

1√π

∫ t

0

[a21(s, λ) a22(s, λ)]f(s)ds : f ∈ Lm2 ([0, t]) for every t ∈ [0, d)

with norm

〈f, f〉B(EXt ) =

∫ ∞

−∞f(µ)∗∆A

t (µ)f(µ)dµ ,

where, upon writing f = col[g h], with components g, h ∈ Lp2([0, t]),

f(µ) =1√π

∫ t

0

[a21(s, λ) a22(s, λ)f(s)ds

=1√π

∫ t

0

a21(s, λ)g(s) + a22(s, λ)h(s)ds

Page 158: Recent Advances in Operator Theory and its Applications: The Israel Gohberg Anniversary Volume

152 D.Z. Arov and H. Dym

and

∆At (µ)−1 = (a22(t, µ) + a21(t, µ))(a22(t, µ) + a21(t, µ))∗

= a22(t, µ)a22(t, µ)∗ + a21(t, µ)a21(t, µ)∗ ,

since At ∈ U(Jp). Moreover, formula (26.10) implies that a21(t, λ) is an odd func-tion of λ, whereas a22(t, λ) is an even function of λ. Thus,

B(EAt ) = B(EA

t )odd ⊕ B(EAt )ev ,

where

B(EAt )odd =

∫ t

0

a21(s, λ)g(s)ds : g ∈ Lp2([0, t])

and

B(EAt )ev =

∫ t

0

a22(s, λ)g(s)ds : g ∈ Lp2([0, t])

.

At the same time, Theorem 23.1 implies that

B(EAt ) =

∫ t

−t

eiλsg(s)ds : g ∈ Lp2([−t, t])

and hence that

B(EAt )odd =

∫ t

0

sin(sλ)g(s)ds : g ∈ Lp2([0, t])

and

B(EAt )ev =

∫ t

0

cos(sλ)g(s)ds : g ∈ Lp2([0, t])

.

Thus, as

y21(s, λ) =a21(s,

√λ)√

λ= −iϕ(s, λ) ,

we obtain the following conclusion:

Theorem 26.7. If the potential q(t) of the Schrodinger equation (26.1) satisfiesassumption (A2), then the de Branges space

B(EXt ) =

1√π

∫ t

0

sin√λs√λ

g(s)ds : g ∈ Lp2([0, t]) for every t ∈ [0, d)

,

(26.15)as linear spaces and hence these spaces do not depend upon the potential.

In view of the indicated connection between Dirac systems and Schrodingerequations, Theorems 23.2 and 23.4 may be applied to yield existence and unique-ness theorems for the inverse input impedance problem and the inverse spectralproblem for the latter when assumption (A2) is in force, as well as recipes for thesolution. A detailed analysis will be presented elsewhere.

Page 159: Recent Advances in Operator Theory and its Applications: The Israel Gohberg Anniversary Volume

Strong Regularity and Inverse Problems 153

Remark 26.8. The identification (26.15) for the scalar case p = 1 is obtained in[Rem:03] under assumption (A1) on the potential q(t) of the Schrodinger equation,which is less restrictive than the assumption (A2) that is imposed here.

27. Epilogue

In the early fifties M.G. Krein published a series of notes on inverse problems forsecond order differential equations and canonical differential systems. Most of thesenotes were short and did not contain detailed proofs. In fact, the task of filling inthe details is far from trivial and is not complete even to this day. Nevertheless,Krein did try to convey a picture of the ideas that guided him. The strategy thathe followed was, to paraphrase his own words [Kr:47], based

on the following idea. Just as every Jacobi matrix may be uniquely de-fined by the solution of a power moment problem, every second orderdifferential operator (of appropriate form) with boundary conditions atone end is uniquely determined by the solution of a generalized momentproblem. For differential operators of sufficiently regular type, this gen-eralized moment problem is the extension problem for Hermitian positivefunctions, that was investigated by the author. . .

Krein’s point of view is described in more detail in his 1964 lecture at the Ju-bilee session of the Moscow Mathematical Society. A translation of this lecture isreprinted in [GG:97].

Every mvf c ∈ Cp×p can be represented as the Fourier transform of a positivedefinite p × p matrix-valued distribution of order at most two. This is equivalentto the fact that the formula

c(λ) = λ2

∫ ∞

0

eiλtg(t)dt for λ ∈ C+ , (27.1)

defines a one to one correspondence between the class of mvf’s c ∈ Cp×p and theclass of mvf’s g ∈ Gp×p∞ (0) that is defined by the following three conditions:

(1) g(t) is a continuous p× p mvf on R with g(−t) = g(t)∗ for every point t ∈ R.(2) The kernel

k(t, s) = g(t− s)− g(t)− g(−s) + g(0)

is positive on [0,∞)× [0,∞).(3) g(0) ≤ 0.

If c ∈ Cp×p, we shall let gc(t) denote the unique mvf in Gp×p∞ (0) that correspondsto c(λ) by formula (27.1); conversely, if g ∈ Gp×p

∞ (0), then we shall let cg(λ) denotethe unique mvf in Cp×p that is defined by formula (27.1). Thus, the interplaybetween the Riesz-Herglotz integral representation (14.4) for mvf’s in Cp×p andformula (27.1) leads to the following complementary pair of integral representation

Page 160: Recent Advances in Operator Theory and its Applications: The Israel Gohberg Anniversary Volume

154 D.Z. Arov and H. Dym

formulas for g ∈ Gp×p∞ (0):

c(λ) = iα− iβλ + 1πi

∫ ∞

−∞

1

µ− λ− µ

1 + µ2

dσ(µ) = cg(λ) (27.2)

gc(t) = −β + iαt + 1π

∫ ∞

−∞

(eiµt − 1− iµt

1 + µ2

)dσ(µ)µ2 = g(t) , (27.3)

in which the parameters α, β and the the spectral function σ(µ) are the same inboth.

Given c ∈ Cp×p, we shall refer to the restriction of the mvf gc(t) to the theinterval [−a, a] as the helical trace of c(λ) on the interval [−a, a]. One of Krein’sfundamental observations was that

if c ∈ Cdimp(q), then the restriction of the potential q(s) to the interval

[0, t], depend only upon the helical trace of c(λ) on the interval [−2t, 2t].

Krein also characterized the set of helical traces gc(t)|[−a,a] : c ∈ Cp×p on a fixedinterval [−a, a]:

Theorem 27.1. Let Gp×pa denote the set of continuous p× p mvf’s on [−a, a] such

that the kernel k(t, s) defined above is positive on [0, a]× [0, a] and let

Gp×pa (0) = g ∈ Gp×p

a : g(0) ≤ 0 .

(1) g ∈ Gp×pa (0) ⇐⇒ there exists a mvf c ∈ Cp×p such that gc(t) = g(t) for

|t| ≤ a.(2) If g ∈ Gp×p

a (0) and g(t) coincides with the helical trace on the interval[−a, a] of a mvf c ∈ Cp×p, then

g ∈ Gp×p∞ (0) : g(t) = g(t) for |t| ≤ a

= gc : c ∈ Cp×p and e−1a (c− c) ∈ N p×p

+ . (27.4)

Proof. See Theorem 3.12 of [GG:97] for a proof of assertion (1) and [Ar:94] orTheorem 5.1 of [ArD:98] for a proof of (2).

This establishes a connection between the set of helical traces of a given mvfc ∈ Cp×p on the interval [−a, a] and the solutions of the generalized Caratheodoryinterpolation problem that was considered in Section 11, i.e., the set in item (2) isequal to

C(eaIp, Ip; c) = C(ea/2Ip, ea/2Ip; c) .

We remark that if g ∈ Gp×p∞ (0), then

g(t) = g(t)∗ for every t ∈ R ⇐⇒ cg(λ) = (cg ) (λ) for every λ ∈ C+ . (27.5)

Moreover, if (27.5) is in force, then α = 0 and σ(−µ + 0) = σ(µ) for µ > 0 in theintegral representation (27.3). Thus, if β = 0, then

g(t) =2π

∫ ∞

0+

cosµt− 1µ2

dσ(µ)− 12π

σ(0+)t2 ,

Page 161: Recent Advances in Operator Theory and its Applications: The Israel Gohberg Anniversary Volume

Strong Regularity and Inverse Problems 155

which can be reexpressed as

g(t) =2π

∫ ∞

0−

cos√µt− 1µ

dτ(µ) ,

where

τ(0+)− τ(0) =12σ(0+)− σ(0) =

12σ(0+) and τ(µ) = σ(

√µ) on [0,∞)

is a nondecreasing p× p mvf on [0,∞) that meets the constraint∫ ∞

0

dtraceτ(µ)1 + µ

<∞ .

Remark 27.2. The class of nondecreasing scalar functions τ(µ) on the interval[0,∞) that satisfy this last constraint coincides with the class of spectral functionswith nonnegative support for a class of strings with arbitrary mass distributionon [0,∞). In this setting, Krein called the Hermitian function g(t) the transitionfunction of the string. Additional information on direct and inverse spectral prob-lems for the string may be found in [KaKr:68] and [DMc:76] and the referencescited therein, particularly a number of Doklady notes by Krein.

Theorem 27.3. Let g ∈ Gp×p2d (0) be given, let the helical trace of the mvf c ∈ Cp×p

coincide with g(t) on the interval (−2d, 2d) and suppose that

C(etIp, etIp; c) ∩ Cp×p = ∅ for every t ∈ [0, d) . (27.6)

Then there exists a unique normalized left monotonic continuous chain of entireJp-inner mvf’s At(λ), 0 ≤ t < d, such that:(1) etIp, etIp ∈ apII(At) for every t ∈ [0, d).(2) C(etIp, etIp; c) = C(At) for every t ∈ [0, d).(3) At ∈ UsR(Jp) for every t ∈ [0, d).

Moreover, the mvf

M(t) = −i∂At

∂λ(0)Jp , 0 ≤ t < d,

is the unique solution of the inverse input impedance problem with data

c; etIp, etIp, 0 ≤ t < d .This solution depends only upon g(t) and not upon the choice of the mvf c(λ).

Proof. The last theorem may be viewed as a corollary of Theorem 12.1 applied tothe chain bt

3, bt4 = etIp, etIp, 0 ≤ t < d.

Necessary and sufficient conditions on g ∈ Gp×p2d (0) to insure that condition

(27.6) is in force are given in [ArD:98].If the mvf g ∈ Gp×p

2d (0) that is given in the preceding theorem is Hermitian,i.e., if g(t) = g(t)∗ for every t ∈ [0, d), then the mvf c ∈ Cp×p may be chosen tomeet the extra symmetry condition c(λ) = (c)∼(λ) and then, if the condition(27.6) is met, the mvf’s At(λ) meet the extra condition

(At)∼(λ)JpAt(λ) = Jp for every t ∈ [0, d) .

Page 162: Recent Advances in Operator Theory and its Applications: The Israel Gohberg Anniversary Volume

156 D.Z. Arov and H. Dym

This property is exploited in the spectral theory of strings and the Schrodingerequation when p = 1; see, e.g., [KrL:85].

There are two particular subclasses of Gp×pa (0) that are of particular interest

and have useful descriptions:

Theorem 27.4. Let g ∈ Gp×pa (0) for some a ∈ [0,∞). Then:

(1) g ∈ C2([−a, a]) and g′(0) = 0 ⇐⇒ g(t) =∫ t

0 (t − s)f(s)ds for some p× pmvf f(s) that is continuous on the interval [−a, a] and meets the positivitycondition∫ a

0

ϕ(t)∗∫ a

0

f(t− s)ϕ(s)dsdt ≥ 0 for every ϕ ∈ Lp2([0, a]]) .

(2) g′ ∈ AC([−a, 0)) ∩AC((0, a]) and Rg′(0+) ≥ 0 ⇐⇒

g(t) =

−γt+∫ t

0 (t− s)h(s)ds for t ∈ [0, a]−g(−t)∗ for t ∈ [−a, 0] ,

where γ ∈ Cp×p and h ∈ Lp1([0, a]) meets the positivity condition∫ a

0

ϕ(t)∗γϕ(s) +∫ a

0

h(t− s)ϕ(s)dsdt ≥ 0 for every ϕ ∈ Lp2([0, a]]) .

The mvf h(t) considered in case (2) is also referred to as the accelerantof g(t) on the interval [0, a]. Additional details on extension problems that areformulated in g ∈ Gp×p

a (0) and the indicated subclasses may be found in [ArD:98].Connections of extension problems in the Wiener class with inverse problems arediscussed in [MeA:67], [DI:84], [KrL:85], [KrMA:86] and [Dy:90].

References

[AlD:84] D. Alpay and H. Dym, Hilbert spaces of analytic functions, inverse scatteringand operator models, I, Integral Equations Operator Theory 7 (1984) 589–641.

[AlD:85] D. Alpay and H. Dym, Hilbert spaces of analytic functions, inverse scatteringand operator models, II, Integral Equations Operator Theory 8 (1985) 145–180.

[AG:95] D. Alpay and I. Gohberg, Inverse spectral problems for differential operatorswith rational scattering matrix functions, J. Differential Equations 118 (1995)1–19.

[AG:01] D. Alpay and I. Gohberg, Inverse problems associated to a canonical differen-tial system, in: Recent Advances in Operator Theory and Related Topics (L.Kerchy, C. Foias, I. Gohberg and H. Langer, eds.), Oper. Theor. Adv. Appl.127, Birkhauser , Basel, 2001, pp. 1–27.

[Ar:94] D.Z. Arov, The generalized bitangent Caratheodory-Nevanlinna-Pick prob-lem and (j, J0)-inner matrix-valued functions, Russian Acad. Sci. Izvestija 42(1994), 1–26.

[Ar:97] D.Z. Arov, On monotone families of J-contractive matrix functions, Algebra iAnaliz 9 (1997), No. 6, 3–37; English transl. St. Petersburg Math. J. 9 (1998),No. 6, 1025–1051.

Page 163: Recent Advances in Operator Theory and its Applications: The Israel Gohberg Anniversary Volume

Strong Regularity and Inverse Problems 157

[ArD:97] D.Z. Arov and H. Dym, J-inner matrix functions, interpolation and inverseproblems for canonical systems, I: Foundations, Integral Equations OperatorTheory 29 (1997), No. 4, 373–454.

[Aron:50] N. Aronszajn, Theory of reproducing kernels, Trans. Amer. Math. Soc.,68(1950), 337–404.

[ArD:98] D.Z. Arov and H. Dym, On three Krein extension problems and some gener-alizations, Integral Equations Operator Theory 31 (1998) 1–91.

[ArD:00a] D.Z. Arov and H. Dym, J-inner matrix functions, interpolation and inverseproblems for canonical systems, II: The inverse monodromy problem, IntegralEquations Operator Theory 36 (2000), No. 1, 11–70.

[ArD:00b] D.Z. Arov and H. Dym, J-inner matrix functions, interpolation and inverseproblems for canonical systems, III: More on the inverse monodromy prob-lem, Integral Equations Operator Theory 36 (2000), No. 2, 127–181.

[ArD:01] [ArD9] D.Z. Arov and H. Dym, Matricial Nehari problems, J-inner matrixfunctions and the Muckenhoupt condition, J. Funct. Anal. 181 (2001) 227–299.

[ArD:02a] D.Z. Arov and H. Dym, J-inner matrix functions, interpolation and inverseproblems for canonical systems, IV: Direct and inverse bitangential input scat-tering problems, Integral Equations Operator Theory 43 (2002), No. 1, 1–67.

[ArD:02b] D.Z. Arov and H. Dym, J-inner matrix functions, interpolation and inverseproblems for canonical systems, V: The inverse input scattering problem forWiener class and rational p × q input scattering matrices, Integral EquationsOperator Theory 43 (2002), No. 1, 68–129.

[ArD:03a] D.Z. Arov and H. Dym, Criteria for the strong regularity of J-inner functionsand γ-generating matrices, J. Math. Anal. Appl. 280 (2003) 387–399.

[ArD:03b] D.Z. Arov and H. Dym, The bitangential inverse input impedance problem forcanonical systems, I.: Weyl-Titchmarsh classification, existence and unique-ness, Integral Equations Operator Theory 47 (2003) 3–49.

[ArD:04a] D.Z. Arov and H. Dym, Strongly regular J-inner matrix functions and relatedproblems, in: Current Trends in Operator Theory and its Applications (J.A.Ball, J.W. Helton, M. Klaus and L. Rodman, eds.), Oper. Theor. Adv. Appl.,149, Birkhauser, Basel, 2004, pp. 79–106.

[ArD:04b] D.Z. Arov and H. Dym, The bitangential inverse spectral problem for canonicalsystems, J. Funct. Anal., 214 (2004), 312–385.

[ArD:05] D.Z. Arov and H. Dym, The bitangential inverse input impedance problemfor canonical systems, II.: Formulas and examples, Integral Equations andOperator Theory 51(2), 155–213 (2005).

[ArD:??] D.Z. Arov and H. Dym, Direct and inverse problems for differential systemsconnected with Dirac systems and related factorization problems, in prepara-tion.

[dBr:63] L. de Branges, Some Hilbert spaces of analytic functions I, Trans. Amer. Math.Soc. 106 (1963) 445–668.

[dBr:68a] L. de Branges, Hilbert Spaces of Entire Functions, Prentice-Hall, EnglewoodCliffs, 1968.

[dBr:68b] L. de Branges, The expansion theorem for Hilbert spaces of entire functions,in: Entire Functions and Related Parts of Analysis, Amer. Math. Soc., Provi-dence, 1968, pp. 79–148.

Page 164: Recent Advances in Operator Theory and its Applications: The Israel Gohberg Anniversary Volume

158 D.Z. Arov and H. Dym

[Bro:71] M.S. Brodskii, Triangular and Jordan Representations of Linear Operators,Transl. of Math. Monographs, 32, Amer. Math. Soc., Providence, 1971.

[BrL:60] M.S. Brodskii and M.S. Livsic, Spectral analysis of non-selfadjoint operatorsand intermmediate systems, Amer. Math. Soc. Transl. (2) 13 (1960) 265–346.

[CG:02] S. Clark and F. Gesztesy, Weyl-Titchmarsh M-function asymptotics, localuniqueness results, trace formulas, and Borg-type theorems for Dirac opera-tors, Trans. Amer. Math. Soc. 354 (2002), 3475–3534.

[CG:01] S. Clark and F. Gesztesy, Weyl-Titchmarsh M -function asymptotics formatrix-valued Schrodinger operators, Proc. London Math. Soc. 82 (2001),701–724.

[Dy:70] H. Dym, An introduction to de Branges spaces of entire functions with appli-cations to differential equations of Sturm-Liouville type, Advances in Math.,5 (1970), 395–471.

[Dy:89] H. Dym, J Contractive Matrix Functions, Reproducing Kernel Hilbert Spacesand Interpolation, CBMS Regional Conference Series, number 71, Amer.Math. Soc., Providence, R.I., 1989.

[Dy:90] H. Dym, On reproducing kernels and the covariance extension problem, in:Analysis and Partial Differential Equations (C. Sadosky, ed.), Marcel Dekker,New York, 1990, pp. 427–482.

[DI:84] H. Dym and A. Iacob, Positive definite extensions, canonical equations andinverse problems, in: Topics in Operator Theory, Systems and Networks (H.Dym and I. Gohberg, eds.), Oper. Theory Adv. Appl. 12, Birkhauser, Basel,1984, pp. 141–240.

[DK:78] H. Dym and N. Kravitsky, On recovering the mass distribution function of astring from its spectral function, in: Topics in Functional Analysis (I. Gohbergand M. Kac, eds.), Academic Press, New York, 1978, pp. 45–90.

[DMc:76] H. Dym and H.P. McKean, Gaussian Processes, Function Theory, and theInverse Spectral Problem, Academic Press, New York, 1976.

[GKM:02] F. Gesztesy, A. Kiselev and K.A. Makarov, Uniqueness results for matrix-valued Schrodinger, Jacobi and Dirac-type operators, Math. Nachr. 239/240(2002) 103–145.

[GeSi:00] F. Gesztesy and B. Simon, A new approach to inverse spectral theory, II. Gen-eral real potentials and the connection to the spectral measure, Ann. Math.,152 (2000), 593–643.

[GKS:98] I. Gohberg, M.A. Kaashoek and A.L. Sakhnovich, Canonical systems withrational spectral densities: Explicit formulas and applications, Math. Nachr.149 (1998) 93–125.

[GKS:02] I. Gohberg, M.A. Kaashoek and A.L. Sakhnovich, Scattering problems with apseudo-exponential potential, Asympt. Anal. 29(2002), no. 1, 1–38.

[GKr:70] I. Gohberg and M.G. Krein, Theory and applications of Volterra operators inHilbert space, Trans. Math. Monographs, 24, Amer. Math. Soc., Providence,R.I., 1970.

[GG:97] M.L. Gorbachuk and V.I. Gorbachuk, M.G. Krein’s Lectures on Entire Op-erators, Operator Theory: Advances and Applications, 97, Birkhauser, Basel,1997.

Page 165: Recent Advances in Operator Theory and its Applications: The Israel Gohberg Anniversary Volume

Strong Regularity and Inverse Problems 159

[Ia:86] A. Iacob, On the Spectral Theory of a Class of Canonical Systems of Differ-ential Equations, PhD Thesis, The Weizmann Institute of Science, Rehovot,Israel, 1986.

[Ka:03] I.S. Kats, ”Linear relations generated by the canonical differential equation ofphase dimension 2, and eigenfunction expansions, St. Petersburg Math. J. 14(2003), no.–3, 429–452.

[KaKr:68] I.S. Kac and M.G. Krein, On the spectral functions of the string, Transl. (2)Amer. Math. Soc., 103(1974), 19–102.

[Kr:44] M.G. Krein, On the logarithm of an infinitely decomposable Hermite-positivefunction, Dokl. Akad. Nauk SSSR 45 (1944), no. 3, 91–94.

[Kr:47] M.G. Krein, A contribution to the theory of entire functions of exponentialtype, Izv. Akad. Nauk SSSR 11 (1947) 309–326.

[Kr:51] M.G. Krein, On the theory of entire matrix functions of exponential type,Ukrain. Mat. Zh. 3 (1951), no. 2, 154–173.

[Kr:55] M.G. Krein, Continuous analogs of theorems on polynomials orthogonal onthe unit circle, Dokl. Akad. Nauk SSSR 105 (1955) 433–436.

[Kr:56] M.G. Krein, On the theory of accelerants and S–matrices of canonical differ-ential systems, Dokl. Akad. Nauk 111 (1956), no. 6, 1167–1170.

[KrL:85] M.G. Krein and H. Langer, On some continuation problems which are closelyrelated to the theory of operators in spaces Πκ. IV: Continuous analoguesof orthogonal polynomials on the unit circle with respect to an indefiniteweight and related continuation problems for some classes of functions, J.Oper. Theory, 13 (1985), 299–417.

[KrMA:86] M.G. Krein and F.E. Melik-Adamyan, The matrix continual analogues ofSchur and Caratheodory-Toeplitz problems, Izv. Akad. Nauk Armyan SSR,Ser. Mat. 21 (1986), no. 2, 107–141.

[LeMa:00] M. Lesch and M.M. Malamud, The inverse spectral problem for first ordersystems on the half line, in: Differential operators and related topics, Vol. I(Odessa, 1997), Oper. Theory Adv. Appl. 117 (2000), Birkhauser, Basel, pp.199–238.

[LeSa:75] B.M. Levitan and I.S. Sargsjan, Introduction to Spectral Theory, Transl.Math. Mon. 39, Amer. Math. Soc., Providence, 1975.

[Li:73] M.S. Livsic, Operators, Oscillations, Waves, Open Systems, Trans. Math.Monographs 34 Amer. Math. Soc., Providence, R.I., 1973.

[Ma:99] M.M. Malamud, Uniqueness questions in inverse problems for systems of or-dinary differential equations on a finite interval, Trans. Moscow Math. Soc.60 (1999) 173–124.

[MeA:67] F.E. Melik-Adamyan, On the theory of matrix accelerants and spectral matrixfunctions of canonical differential systems, Dokl. Akad. Nauk Armyan SSR,45 (1967), 145–151.

[MeA:77] F.E. Melik-Adamyan, On canonical differential operators in Hilbert space, Izv.Akad. Nauk Armyan SSR, Ser. Mat. 12 (1977) 10–31.

[MeA:99a] F.E. Melik-Adamyan, Description of spectral functions for a class of differen-tial operators, J. Contemp. Math. Anal. 34 (1999), no. 2, 54–70 (2000).

[MeA3:99b] F.E. Melik-Adamyan, Description of spectral functions for a class of differ-ential operators with decaying boundary conditions, J. Contemp. Math. Anal.34 (1999), no. 3, 64–74 (2000).

Page 166: Recent Advances in Operator Theory and its Applications: The Israel Gohberg Anniversary Volume

160 D.Z. Arov and H. Dym

[MeA:00] F.E. Melik-Adamyan, Spectral functions of canonical differential equations, J.Contemp. Math. Anal. 35 (2000), no. 2, 42–60 (2001).

[Or:76] S.A. Orlov, Nested matrix discs that depend analytically on a parameterand theorems on the invariance of the ranks of the radii of the limit matrixdiscs, Izv. Akad. Nauk. SSSR Ser. Mat. 40 (1976), No. 3, 593–644, 710.

[P:55] V.P. Potapov, The multiplicative structure of J-contractive matrix functions,Trudy Mosk. Mat. Obshch. 4 (1955) 125–236, English: Amer. Math. Soc.Transl. (2) 15 (1960) 131–243.

[RaSi:00] A. Ramm and B. Simon, A new approach to inverse spectral theory, III. Shortrange potentials, J. d’Analyse Math., 80 (2000), 319–334.

[Rem:02] C. Remling, Schrodinger operators and de Branges spaces, J. Funct. Anal.,196 (2002), 323–394.

[Rem:03] C. Remling, Inverse spectral theory for one dimensional Schrodinger opera-tors: The A function, Math. Z., 245 (2003), 597–617.

[RR:77] M. Rosenblum and J. Rovnyak, Hardy Classes and Operator Theory, DoverReprint, New York, 1977.

[Sak-A:92] A.L. Sakhnovich, Spectral functions of a canonical system of order 2n, Math.USSR Sbornik, 71 (1992), No. 2, 355–369.

[Sak:96] L.A. Sakhnovich, Spectral problems on half-axis. Methods Funct. Anal. Topol-ogy 2 (1996), no. 3-4, 128–140.

[Sak:99] L.A. Sakhnovich, Spectral Theory of Canonical Differential Systems. Methodof Operator Identities, Birkhauser, Basel, 1999.

[Sak:00a] L.A. Sakhnovich, Works by M.G. Krein on inverse problems, Differential op-erators and related topics, Vol. I (Odessa, 1997), 59–69, Oper. Theory Adv.Appl., 117, Birkhauser, Basel, 2000.

[Sak:00b] L.A. Sakhnovich, Spectral theory of a class of canonical differential systems,(Russian) Funktsional. Anal. i Prilozhen. 34 (2000), no. 2, 50–62, 96; transla-tion in Funct. Anal. Appl. 34 (2000), no. 2, 119–128.

[Sim:99] B. Simon, A new approach to inverse spectral theory, I. Fundamental formal-ism, Ann. Math., 150 (1999), 1029–1057.

[Sm:90] Ju.L. Smul’yan, Operator Balls, translation in: Integral Equations OperatorTheory, 13 (1990), No. 6, 864–882.

[W:00] H. Winkler, Small perturbations of canonical systems, Integral Equations Op-erator Theory, 38 (2000) 222–250.

Damir Z. ArovDepartment of MathematicsSouth-Ukranian Pedagogical University65020 Odessa, Ukrainee-mail: [email protected]

Harry DymDepartment of MathematicsThe Weizmann Institute of ScienceRehovot 76100, Israele-mail: [email protected]

Page 167: Recent Advances in Operator Theory and its Applications: The Israel Gohberg Anniversary Volume

Operator Theory:Advances and Applications, Vol. 160, 161–178c© 2005 Birkhauser Verlag Basel/Switzerland

Regularization Processes for Real Functionsand Ill-posed Toeplitz Problems

Claudio Estatico

This work is dedicated to Prof. Israel Gohberg, on the occasion of his 75th birthday.

Abstract. Most preconditioners for Toeplitz systems An(f) arising in the dis-cretization of ill-posed problems give rise to instability and noise amplification.Indeed, since these preconditioners are constructed from linear approximationprocesses of the generating function f , they inherit the ill-posedness of theproblem.

Here we first identify a novel set of approximation processes which reg-ularizes the inversion of real functions. Then, such processes are used as abasic tool for the computation of preconditioners endowed with regularizingproperties. We show that these preconditioners provide fast convergence andnoise control of iterative methods for discrete ill-posed Toeplitz systems.

Mathematics Subject Classification (2000). 47A52,65F22,65F10,15A29.

Keywords. preconditioning, regularization, linear approximation operators,matrix algebras, Toeplitz matrices.

1. Introduction

Preconditioning techniques for Toeplitz systems are widely used in order to speedup the convergence of iterative methods [10]. In this paper we consider n×n Her-mitian Toeplitz matrices An(f) generated by a 2π-periodic Lebesgue integrablereal function f , that is, the entries along the kth diagonal of An(f) are equal tothe kth Fourier coefficient of f [21, 20]. Since the spectral distribution of An(f)is asymptotically equivalent to the distribution of the generating function f , mostpreconditioners are constructed by means of approximations of f [28]. Main ex-amples are the linear approximation processes such as the Fourier partial sumFn(f) =

∑nj=0 aje

ijx and the Cesaro sum Cn(f) = 1n+1

∑nj=0

∑jk=−j ake

ikx ,

This work was partially supported by MIUR, grant numbers 2002014121 and 2004015437.

Page 168: Recent Advances in Operator Theory and its Applications: The Israel Gohberg Anniversary Volume

162 C. Estatico

where ak = 12π

∫ 2π

0 f(x)e−ikxdx are the values on the diagonals of An(f). Theselinear approximation processes lead to the G. Strang natural and the T. Chanoptimal preconditioners respectively [29, 12, 32, 11, 28]. Generally, if the generat-ing function f is sufficiently smooth, the linear approximation processes convergeto f uniformly. In that case, the corresponding preconditioner is a close approx-imation of the system matrix An(f). If An(f) is derived from the discretizationof an ill-posed problem, these preconditioners can yield numerical instability andamplification of the errors due to noise on input data [23, 25]. The rank of thesepreconditioners is asymptotically ill determined as well as the rank of An(f), thatis, these preconditioners inherit the ill-posedness of the problem.

In this paper, we first characterize a class of approximation processes whichallow preconditioners for ill-posed Toeplitz systems to be constructed effectively.In particular, linear approximation processes are filtered by continuous regular-ization algorithms [15]. These procedures lead to approximation operators whichare called regularization processes. Basically, if a continuous real function f hasa root, a regularization process gives rise to a family of bounded functions whichapproximate the unbounded function 1/f .

The properties of preconditioners constructed from regularization processesare then analyzed. We show that these preconditioners have bounded inverses andthe spectrum of the preconditioned matrix is clustered at unity.

With reference to the effectiveness for ill-posed linear systems, we prove thatthe proposed preconditioners belong to the class of regularizing preconditioners[18]. If a linear system comes from the discretization of an ill-posed problem, reg-ularization preconditioners can improve the convergence of appropriate iterativemethods without amplifying the components related to the noise in the data. Thisis a very favorable property, which is absent in other preconditioning strategies.For instance, preconditioners constructed from linear approximation processes be-have differently and often do not yield good results, since they give rise to fastreconstruction on components corrupted by noise.

The paper is organized as follows. In Section 2 we introduce notations and ba-sic results about approximation techniques for Toeplitz preconditioning in trigono-metric matrix algebras. In Section 3 we define the class of regularization processesfor bounded approximations of the unbounded inverse of 2π-periodic real func-tions. In Section 4 we study properties of matrices constructed from regularizationprocesses of the previous section. We prove that the inverse of any n × n matrixassociated with a regularization process of a real function f converges, with respectto n, to the Toeplitz matrix An(f) and that the spectrum of the preconditionedmatrix has a cluster at unity. In Section 5 we study such matrices in the context ofToeplitz preconditioning. We show that these matrices belong to the class of reg-ularization preconditioners [18], and therefore are suitable for preconditioning oflinear systems arising in the discretization of ill-posed problems. Section 6 collectssome final remarks and future goals.

Page 169: Recent Advances in Operator Theory and its Applications: The Israel Gohberg Anniversary Volume

Regularization Processes and Ill-posed Problems 163

2. Toeplitz preconditioning and linear approximation processes

An n×n matrix An = (ai,j)ni,j=1 ∈ Cn×n is said to be a Toeplitz matrix if ai,j = ar,s

for i − j = r − s, that is, An is constant along any diagonal. Toeplitz matricesarise in a wide range of applications, such as the resolution of Fredholm integraloperators with space-invariant integral kernels [5, 20, 10].

The family of Hermitian Toeplitz matrices An = An(f)+∞n=1 is said to be

generated by a Lebesgue-integrable scalar function f : I −→ R, I = [−π, π] , if theentries along the kth diagonal are equal to the kth Fourier Transform coefficientak of f , that is,

[An(f)]r,s = ar−s , ak =12π

∫I

f(x)e−ikxdx ( i2 = −1, k ∈ Z ) .

The Szego-Tyrtyshnikov results state that the distribution of eigenvalues of An(f)asymptotically converge to f ∈ L1(I) [21, 30, 31].

Since there is a connection between the Toeplitz matrix An(f) and thetrigonometric Fourier Transform of the generating function f , preconditioners forAn(f) usually belong to trigonometric matrix algebras. If Un denotes an n × ncomplex unitary matrix of eigenvectors, then the matrix algebra Mn = M(Un) isa matrix space defined as follows

Mn = M(Un) = X = Un∆nU∗n ∈ Cn×n , (2.1)

where ∆n = diag(d0, d1, . . . , dn−1) is the complex diagonal matrix of eigenvalues ofX . If the columns of Un are trigonometric vectors, then the matrix algebra is saidto be trigonometric. In particular, let vnn∈N denote a sequence of trigonometricfunctions on I and let Wnn∈N denote a sequence of grids of n points on I, that

is, Wn = x(n)s n−1

s=0 ⊂ I . If the n×n Vandermonde matrix Vn =(vr(x

(n)s )

)n−1

r,s=0,

is unitary, the corresponding n× n matrix space Mn = M(V ∗n ) is a trigonometric

matrix algebra [14].Thus, for any continuous function g defined in [0, 2π] , Mn(g) denotes the

matrix

Mn(g) = UnGnU∗n ∈M(Un) (2.2)

such that the diagonal matrix Gn is (Gn)s,s = g(x(n)s ) , for s = 0, . . . , n− 1 .

Widely used trigonometric matrix algebras Mn for Toeplitz precondition-ers are the circulant, ω-circulant, Tau and Hartley matrix spaces [13, 4, 7, 8].These algebras are related to the Fast Fourier Transform, the Fast Sine Transformand Fast Hartley Transform; all of them allow fast, i.e., O(n log n), matrix-vectormultiplication and diagonalization.

The convergence speed of iterative system solvers depends on the distribu-tion of singular values of its system matrix: basically the speed is high when thespectrum is “close” to the unity [1].

Page 170: Recent Advances in Operator Theory and its Applications: The Israel Gohberg Anniversary Volume

164 C. Estatico

In order to speed up the convergence, the linear system Anx = b is replaced bythe algebraic equivalent one P−1

n Anx = P−1n b , where Pn is an n×n preconditioner.

Since the rate of convergence can be improved if the spectrum of the preconditionedmatrix P−1

n An is clustered at unity, often we have that P−1n An ≈ I, that is,

Pn ≈ An.The approximation of a Toeplitz matrix An(f) by a preconditioner derives

from the approximation of the generating function f . The linear approximationprocesses in trigonometric functional spaces [32] are widely used approximationschemes for Toeplitz preconditioning [28]. Let Vnn∈N be the sequence of spacesof all the trigonometric polynomials of degree n. Note that Vn ⊂ Vn+1 and∪n∈NVn is dense in the space (C2π , ‖ • ‖∞) of all the globally continuous and2π-periodic functions endowed with the supremum norm. A linear approximationprocess for a (generating) function f ∈ C2π is a sequence of linear approximationoperators Snn∈N , with Sn : C2π −→ Vn , whose images uniformly converge tof , that is,

limn−→+∞ ‖Sn(f)− f‖∞ = 0 . (2.3)

On the grounds of (2.3) and (2.2), if f ∈ C2π , a trigonometric preconditionerPn ∈Mn = M(Un) of a Toeplitz matrix An(f) can be defined as follows

Pn = Mn(Sn(f)) = UnDnU∗n , (2.4)

where Dn is the n× n diagonal matrix such that (Dn)s,s = [Sn(f)](x(n)s ) for s =

0, . . . , n−1 . Notice that the eigenvalues of An(f) are distributed as the generatingfunction f , while the eigenvalues of its preconditioner Pn are distributed as Sn(f).This yields that the preconditioner Mn(Sn(f)) is an accurate approximation in thealgebra Mn of the Toeplitz matrix An(f). Many preconditioners can be representedas (2.4), such as, for instance, the Strang natural and the T.Chan optimal ones [29,12]. We recall that if An is Hermitian, the T.Chan optimal preconditioner Popt(An)solves the minimization problem Popt(An) = arg minX∈Mn ‖An−X‖F , where ‖·‖F

is the Frobenius norm. According to notation (2.2), the optimal preconditioner canbe written as Popt(An) = Mn(Cn(f)), where Cn(f) = 1

n+1

∑nj=0

∑jk=−j ake

ikx isthe linear approximation process of the Cesaro sums [11].

As already mentioned, if a continuous generating function f has a root, as inill-posed problems, the rank of the matrices An(f) is asymptotically ill-determined,since the smallest non-null eigenvalues tend to zero. Any preconditioner (2.4) “in-herits” the spectral distribution of the system matrix and give numerical instabil-ity. For instance, it is known that the T. Chan optimal approximation leads to badnumerical results due to amplification of the noise of the data [23]. In these cases,all preconditioners are usually modified by means of spectral filtering procedures[23, 9, 27, 24, 25, 3, 2, 6, 17]. These procedures give rise to preconditioners whichapproximate the system matrix only in the space less sensitive to noise. In thatway, they are suitable for ill-posed linear systems, as explained in Section 5.

Page 171: Recent Advances in Operator Theory and its Applications: The Israel Gohberg Anniversary Volume

Regularization Processes and Ill-posed Problems 165

3. Regularization processes for real functions

The regularization theory gives mathematical tools for obtaining low noise-sensitivesolutions of ill-posed problems [15, 5].

Let X,Y denote two (infinite dimensional) Hilbert spaces. Given a datumy ∈ Y and a bounded linear operator T : X −→ Y , we consider the linearequation T x = y , where x ∈ X is the output solution. We look for the minimumnorm solution x† of the Gaussian normal equation T ∗Tx = T ∗y, that is, we solvethe equation in the generalized sense.

If the generalized solution x† exists, it is given by x† =∫ ‖T‖2

01/λ d EλT

∗y ,where Eλ is the spectral family of the self-adjoint operator T ∗T [22]. Accordingto Hadamard, a problem is said to be ill-posed if its solution may not exist, maynot be unique, or it does not depend continuously on the data. The regularizationtheory for solving ill-posed problems states that the problem of computing the gen-eralized solution x† is ill-posed if and only if the range R(T ) is non-closed [15]. In-deed, if y ∈ R(T )⊕R(T )⊥ ⊂ Y then the integrand 1/λ has a non-integrable polein zero with respect to the “data-depending” measure d EλT

∗y . On these bases,the computation of x† needs procedures which filter the pole in zero of the inte-grand function 1/λ, whenever R(T ) is a non-closed set. These procedures are calledregularization algorithms. Basically, if T † denotes the (Moore-Penrose) generalizedinverse T † : R(T )⊕R(T )⊥ −→ X such that T †y = x† for y ∈ R(T )⊕R(T )⊥,a regularization algorithm is a family of continuous, i.e., stable, operators whichapproximate the unbounded, i.e., unstable, operator T †.

Simple regularization algorithms come from the approximation of the inte-grand 1/λ by a family of neighboring functions, which are piecewise continuousover the closure [0, ‖T ‖2] of the spectrum of T ∗T (see [15], except from the mono-tonic characterization added here).

Definition 3.1. Let Ī denote the closure of I = [0, β), with β ∈ R⁺ ∪ {+∞}, and let α₀ > 0. A family {Rα}α∈(0,α₀) of real functions Rα : Ī ⊆ R → R is called a regularization inverse on Ī if the following three conditions hold:

(i) ∀α ∈ (0, α₀), Rα is piecewise continuous and globally continuous from the right;

(ii) ∀α ∈ (0, α₀), the function xRα(x) is uniformly bounded, i.e., there exists a constant C > 0 such that |xRα(x)| ≤ C, ∀x ∈ Ī;

(iii) Rα(x) approximates 1/x as α → 0⁺, that is,

lim_{α→0⁺} Rα(x) = 1/x   ∀x ∈ Ī \ {0} .

Moreover, the regularization inverse is called monotone if, for x ∈ (0, η) with η > 0, the following inequality holds:

Rα₁(x) ≥ Rα₂(x)   ∀ 0 < α₁ < α₂ < α₀ .   (3.1)


With the help of the latter definition, we obtain the continuous regularization algorithms.

Lemma 3.2. [15] Let {Rα}α>0 be a regularization inverse on [0, ‖T‖²]. The family of operators {Rα}α>0, Rα : Y → X, such that

Rα y = Rα(T*T) T*y := ∫_0^{‖T‖²} Rα(λ) dE_λ T*y ,   (3.2)

is a regularization linear algorithm for T†, called a continuous regularization algorithm.

For instance, the family of operators {Rα}α>0 of Tikhonov regularization [5, 15], defined as Rα = (T*T + αI)⁻¹T*, can be represented according to (3.2), with Rα(λ) = (λ + α)⁻¹.
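As an aside, the equivalence between the operator form (T*T + αI)⁻¹T* and the spectral form (3.2) with Rα(λ) = (λ + α)⁻¹ can be checked numerically. The following minimal Python sketch is not from the paper: the matrix T, the noise level, and all names are illustrative choices only.

```python
import numpy as np

# Minimal sketch: for a discretized operator T, the Tikhonov solution
# (T*T + alpha I)^{-1} T* y coincides with the spectral formula (3.2)
# applied with the filter R_alpha(lambda) = 1/(lambda + alpha).
rng = np.random.default_rng(0)
n = 40
U, _ = np.linalg.qr(rng.standard_normal((n, n)))
V, _ = np.linalg.qr(rng.standard_normal((n, n)))
s = 0.9 ** np.arange(n)                      # rapidly decaying singular values
T = U @ np.diag(s) @ V.T                     # ill-conditioned test operator
y = T @ rng.standard_normal(n) + 1e-6 * rng.standard_normal(n)   # noisy datum

alpha = 1e-4
x_direct = np.linalg.solve(T.T @ T + alpha * np.eye(n), T.T @ y)
# spectral form: R_alpha(T*T) T* y, with lambda = s_i^2 the eigenvalues of T*T
x_spectral = V @ ((s / (s**2 + alpha)) * (U.T @ y))
print(np.allclose(x_direct, x_spectral))     # True up to round-off
```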

We remark that inequality (3.1) guarantees that the larger the value of the parameter α is, the stronger the filtering capabilities are. If the result of the regularization is unstable, we can adopt a larger regularization parameter in order to improve the noise filtering.

Since it is very simple to design regularization inverses, formula (3.2) provides a constructive method for obtaining regularization linear algorithms. Indeed, regularization inverses can be constructed by means of filters for spectral control. Below, we collect and propose some useful filters, including the Tikhonov one.

(I) Tikhonov Filter [5, 15]

Rα(x) = 1/(x + α) ,   x ≥ 0

(II) Low Pass Filter

Rα(x) = { 0 ,      0 ≤ x < α
          x⁻¹ ,    x ≥ α

(III) M. Hanke, J.G. Nagy and R.J. Plemmons' Filter [23]

Rα(x) = { 1 ,      0 ≤ x < α
          x⁻¹ ,    x ≥ α

(IV) p-Polynomial Low Pass Filter (p ≥ 0)

Rα(x) = { α^{-(p+1)} x^p ,   0 ≤ x < α
          x⁻¹ ,              x ≥ α

(V) (1/α)-Polynomial Low Pass Filter

Rα(x) = { α^{-(α+1)/α} x^{1/α} ,   0 ≤ x < α
          x⁻¹ ,                    x ≥ α

(VI) Exponential Low Pass Filter

Rα(x) = { 0 ,                      x = 0
          α⁻¹ e^{(x−α)/(αx)} ,     0 < x < α
          x⁻¹ ,                    x ≥ α


(VII) Showalter's Filter for asymptotic regularization [15]

Rα(x) = (1/x)(1 − e^{−x/α}) = ∫_0^{1/α} e^{−xs} ds ,   x ≥ 0 .

It is simple to verify that all these regularizing inverses, except (V), are monotone. The list can be extended by considering their linear combinations, with suitable normalizing factors.
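To make the filters concrete, the following Python sketch implements filters (I)–(III) as vectorized functions and spot-checks condition (ii) of Definition 3.1. The function names and the sampled interval are illustrative choices, not part of the paper.

```python
import numpy as np

# Minimal sketches of filters (I)-(III); naming and vectorization are illustrative.
def tikhonov(x, alpha):                # (I):  R_alpha(x) = 1/(x + alpha)
    return 1.0 / (np.asarray(x, float) + alpha)

def low_pass(x, alpha):                # (II): 0 below the cut alpha, 1/x above
    x = np.asarray(x, float)
    return np.where(x < alpha, 0.0, 1.0 / np.maximum(x, alpha))

def hnp(x, alpha):                     # (III): 1 below the cut alpha, 1/x above
    x = np.asarray(x, float)
    return np.where(x < alpha, 1.0, 1.0 / np.maximum(x, alpha))

# condition (ii) of Definition 3.1, |x R_alpha(x)| <= C, checked numerically
x = np.linspace(1e-8, 4.0, 10_000)
for R in (tikhonov, low_pass, hnp):
    assert np.all(np.abs(x * R(x, 0.05)) <= 1.0 + 1e-12)
```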

On the basis of regularization inverses and linear approximation processes (2.3), we now characterize a class of approximation schemes which simultaneously approximate and regularize periodic real functions.

Definition 3.3. Let V2π be the space of the 2π-periodic real functions, and let C2π ⊂ V2π denote the subset of the continuous ones.

A family {Rn;α}n∈N;α∈(0,α₀) of operators Rn;α : C2π → V2π is said to be a regularization process if there exists a linear approximation process {Sn}n∈N in the sense of (2.3) such that, for any function f ∈ C2π and n ∈ N, the following three conditions hold:

(i) ∀α ∈ (0, α₀), the function Rn,α(f) is piecewise continuous in [0, 2π] and continuous from the right at each point x ∈ [0, 2π] such that [Sn(f)](x) = 0;

(ii) ∀α ∈ (0, α₀), the product function Sn(f) Rn,α(f) is uniformly bounded, i.e., there exists a constant C > 0 such that |[Sn(f)](x) [Rn,α(f)](x)| ≤ C for any x ∈ [0, 2π];

(iii) Rn,α(f) "approximates the inverse" of Sn(f), that is,

lim_{α→0⁺} [Rn,α(f)](x) = ([Sn(f)](x))⁻¹   ∀x ∈ {x ∈ [0, 2π] : [Sn(f)](x) ≠ 0} .

Summarizing, a regularization process is able to regularize the approximation of 1/f at all the points x ∈ [0, 2π] such that f(x) = 0. On the other hand, if f(x) ≠ 0, the regularization process must guarantee the convergence to f(x)⁻¹, as stated in the following theorem.

Theorem 3.4. Let {Rn;α}n∈N;α∈(0,α₀) be a regularization process and f ∈ C2π. Let x ∈ [0, 2π] with f(x) ≠ 0.

Then, for any ε > 0, there exist nε,x ∈ N and αε,x > 0 such that

|[Rn,α(f)](x) − f(x)⁻¹| < ε ,

if n > nε,x and 0 < α < αε,x.

Proof. Let Sn be the linear approximation process associated to the regularization process Rn;α in the sense of Definition 3.3, and let sn denote the function such that sn(y) = [Sn(f)](y), ∀y ∈ [0, 2π].


Let us suppose f(x) > 0 (the case f(x) < 0 is analogous). Since f(x) > 0, there exists an nx ∈ N such that sn(x) ≠ 0 for n > nx. This can be easily shown by recalling that the sequence of spaces {Vn}n∈N is dense in (C2π, ‖·‖∞) and that the limit (2.3) holds. Hence, if n > nx, we can write

|[Rn,α(f)](x) − f(x)⁻¹| ≤ |sn(x)⁻¹ − f(x)⁻¹| + |[Rn,α(f)](x) − sn(x)⁻¹| .   (3.3)

By virtue of the uniform approximation of f by sn, if δ ∈ (0, f(x)/2) there exists an nδ ∈ N such that |sn(x) − f(x)| < δ for n > nδ. For the first addend of (3.3), if n > n̄ = max{nx, nδ}, we have that

|sn(x) − f(x)| < δ  ⟺  |sn(x)⁻¹ − f(x)⁻¹| < δ/(sn(x) f(x))  ⟹  |sn(x)⁻¹ − f(x)⁻¹| < 2δ/f(x)² ,

since sn(x) > f(x)/2 > 0. If we substitute δ = (f(x)² ε)/4 in the latter inequality, then |sn(x)⁻¹ − f(x)⁻¹| is bounded by ε/2 for any n > n̄ =: nε,x.

Finally, by virtue of part (iii) of Definition 3.3, we note that the second addend of (3.3) is also bounded by ε/2 for any 0 < α < αε,x, with αε,x sufficiently small, which concludes the proof. □

If f(x) > C > 0 ∀x ∈ [0, 2π], the family of functions Rn,α defined as [Rn,α(f)](x) ≡ ([Sn(f)](x))⁻¹ is a regularization process for f. Nevertheless, it is evident that such a regularization process is useless for the preconditioning of ill-posed Toeplitz systems, since generating functions f(x) > C > 0 are associated with well-posed problems [18].

The following lemma states that the application of a regularization inverse to a linear approximation process gives rise to a regularization process. Since Definition 3.3 has been introduced according to the three conditions of Definition 3.1, the lemma is proved straightforwardly.

Lemma 3.5. Let f ∈ C2π and let {Sn(f)}n∈N be a linear approximation process as in (2.3). Moreover, let {Rα}α∈(0,α₀), α₀ > 0, denote a regularization inverse on [0,+∞) as in Definition 3.1.

Then the family of operators {Rn;α}n∈N;α∈(0,α₀) such that

[Rn;α(f)](x) = Rα([Sn(f)](x)) ,   ∀x ∈ [0, 2π] ,   (3.4)

is a regularization process in the sense of Definition 3.3.

In the following sections, we consider regularization processes applied to generating functions of Toeplitz matrices, and Lemma 3.5 will be useful there.
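Before doing so, the construction (3.4) can be illustrated numerically. The sketch below is an assumption-laden example, not the paper's code: the generating function f(x) = 2 − 2cos(x), its Cesaro (Fejér) sum, and the choice of filter (III) are all illustrative.

```python
import numpy as np

# Sketch of (3.4) for the illustrative generating function f(x) = 2 - 2*cos(x),
# whose Fourier coefficients are f_0 = 2, f_{+-1} = -1 (f has a root at x = 0).
def cesaro_sum(n, x):
    # Fejer/Cesaro sum S_n(f)(x) = sum_{|k|<n} (1 - |k|/n) f_k e^{ikx}
    return 2.0 - 2.0 * (1.0 - 1.0 / n) * np.cos(x)

def hnp_filter(v, alpha):              # filter (III) applied pointwise
    return np.where(np.abs(v) < alpha, 1.0, 1.0 / v)

n, alpha = 64, 0.1
grid = 2 * np.pi * np.arange(n) / n                 # uniform grid of [0, 2*pi)
Rnalpha_f = hnp_filter(cesaro_sum(n, grid), alpha)  # R_{n;alpha}(f) on the grid

f = 2.0 - 2.0 * np.cos(grid)
i = n // 2                                          # x = pi, far from the root
print(abs(Rnalpha_f[i] - 1.0 / f[i]))               # small, and -> 0 as n grows
```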

4. Regularization processes for Toeplitz matrices

Given a Toeplitz matrix An(f), with f ∈ C2π, and a sequence of matrix algebras {Mn}, let {Qn;α}n∈N;α∈(0,α₀) be the family of preconditioners Qn;α ∈ Mn such that

Qn;α = Mn(Rn;α(f)) ,   (4.1)


where the notation was introduced in (2.2). Notice that Rn;α in (4.1) works in place of Sn in (2.4). We will show that, under suitable hypotheses, such preconditioners Qn;α converge to the inverse of An(f), and the eigenvalues of the preconditioned matrix Qn;α An(f) are well clustered at unity.

Before continuing, we need to characterize the matrix algebras that allow good approximations of Toeplitz matrices.

Definition 4.1. [28] A sequence {Mn}n∈N of matrix algebras of order n is called a good sequence of algebras if and only if, for any ε > 0, there exists an integer nε ∈ N such that, for any trigonometric polynomial p of fixed degree and n > nε, the eigenvalues of the matrix An(p) − Mn(p) are contained in (−ε, ε) except for a constant number Hε of outliers.

We remark that all the trigonometric matrix algebras of Section 2 are good sequences of algebras. Now, we may give the first theorem which connects regularization processes to generating functions f ∈ C2π of Toeplitz matrices.

Theorem 4.2. Let {An(f)} be a sequence of Toeplitz matrices generated by a real function f ∈ C2π, and let {Mn} be a good sequence of algebras with grid points Wn. According to Definition 3.3, let {Rn;α}α>0 denote a regularization process.

Let us suppose that, for any ε > 0, there exist nε ∈ N and αε > 0 such that, for n > nε and 0 < α < αε,

[Rn;α(f)](x_i^{(n)}) ≠ 0 ,   x_i^{(n)} ∈ Wn ,  ∀ i ∈ {1, . . . , n}   (4.2)

and

|[Rn;α(f)](x_i^{(n)})⁻¹ − f(x_i^{(n)})| < ε ,   x_i^{(n)} ∈ Wn ,  ∀ i ∈ {1, . . . , n} \ Jε ,   (4.3)

where the number of elements of Jε is o(n).

Let us consider the family of sequences of matrices {Bn;α}n∈N;α∈(0,α₀), with Bn;α ∈ Mn defined as follows:

Bn;α = ( Mn(Rn;α(f)) )⁻¹ .   (4.4)

Then, for any ε > 0 there exist nε ∈ N and αε ∈ (0, α₀) such that, for n > nε and 0 < α < αε, the eigenvalues of the matrix An(f) − Bn;α are contained in (−ε, ε) except for a number Hε = o(n) of outliers. We shall say that Bn;α weakly converges to An(f).

Proof. Let {p_k}_{k∈N} denote a family of trigonometric polynomials of degree k which uniformly approximates the function f. Then, for fixed ε > 0, there exists an n′_ε ∈ N such that

‖An(f) − An(p_{n′_ε})‖₂ < ‖f − p_{n′_ε}‖∞ ≤ ε/3

for n > n′_ε.


Let Zn,α(f) denote a family of real functions defined on Wn such that

[Zn,α(f)](x_i^{(n)}) = { [Rn;α(f)](x_i^{(n)})⁻¹   if i ∉ Jε ,
                         f(x_i^{(n)})              if i ∈ Jε .

By virtue of (4.3), there exist n″_ε > n′_ε in N and αε > 0 such that

‖Mn(Zn;α(f)) − Mn(p_{n′_ε})‖₂ = ‖Mn(Zn;α(f) − p_{n′_ε})‖₂ < ε/3 ,

for n > n″_ε and 0 < α < αε.

Due to (4.2), the matrix Mn(Rn;α(f)) is invertible. Thus we can write

An(f) − Bn;α = An(f) − Mn(Rn;α(f))⁻¹ = An(f) − Mn(Rn;α(f)⁻¹)
= ( An(f) − An(p_{n′_ε}) ) + ( An(p_{n′_ε}) − Mn(p_{n′_ε}) ) + ( Mn(p_{n′_ε}) − Mn(Zn;α(f)) ) + ( Mn(Zn;α(f)) − Mn(Rn;α(f)⁻¹) ) .

The first and the third addends have 2-norm bounded by ε/3, for n > n″_ε and α < αε, as shown before.

Since {Mn} is a good sequence of algebras, the second addend can be split into two parts, the former with norm bounded by ε/3 and the latter with constant rank, for n sufficiently larger than a suitable n‴_ε.

The fourth addend Mn(Zn;α(f)) − Mn(Rn;α(f)⁻¹) is the difference of matrices of the same algebra, with the same generating function except at most #Jε points of the grid Wn of the algebra Mn. It follows that the rank of this last addend is at most equal to #Jε = o(n), for all α > 0.

Therefore we have shown that the matrix An(f) − Bn;α is the sum of two parts, one with 2-norm bounded by ε and the other with rank equal to o(n), if n > nε := max{n″_ε, n‴_ε} and 0 < α < αε. Hence, we have only to invoke the Cauchy interlace theorem, and the result is proved. □

The latter theorem leads to the following lemma, which can be used to analyze the clustering at unity of the preconditioned matrices Mn(Rn;α(f)) An(f). We recall that eigenvalue clustering at unity can give fast convergence of iterative methods such as the conjugate gradient and Landweber methods [5, 10].

Lemma 4.3. Assume the hypotheses of Theorem 4.2 and let f be positive. Then, for any ε > 0, there exist nε ∈ N and αε > 0 such that, for n > nε and 0 < α < αε, the preconditioned matrix

Mn(Rn;α(f)) An(f)

has eigenvalues contained in (1 − ε, 1 + ε) except for at most o(n) outliers.

Proof. From Theorem 4.2, for ε′ > 0 there exist nε′ ∈ N and αε′ > 0 such that the eigenvalues of the matrix An(f) − Mn(Rn;α(f))⁻¹ are contained in (−ε′, ε′) except for a number o(n) of outliers.


Since f is continuous and positive on the closed set [0, 2π], the functions Rn;α(f) are uniformly bounded; this implies that all the operators Mn(Rn;α(f)) are uniformly bounded. With the help of the identity

Mn(Rn;α(f)) An(f) = Mn(Rn;α(f)) ( An(f) − Mn(Rn;α(f))⁻¹ ) + I ,

the claimed result is proved by invoking Theorem 4.2 with ε′ = ε/K, where K ∈ R is

K = sup_{α∈(0,α₀); n∈N} ‖Mn(Rn;α(f))‖₂ . □
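The clustering described by Lemma 4.3 is easy to observe numerically. The following Python sketch is an illustrative experiment only (not the paper's): it uses f(x) = 2 − 2cos(x), the circulant algebra, the Cesaro sum as Sn, and filter (III) as regularization inverse.

```python
import numpy as np

# Illustration of the clustering in Lemma 4.3 under the stated example choices.
n, alpha, eps = 128, 0.1, 0.1
A = 2 * np.eye(n) - np.eye(n, k=1) - np.eye(n, k=-1)   # Toeplitz A_n(f), f = 2-2cos

grid = 2 * np.pi * np.arange(n) / n                    # grid W_n of the algebra
s_n = 2.0 - 2.0 * (1.0 - 1.0 / n) * np.cos(grid)       # Cesaro sum S_n(f) on W_n
d = np.where(np.abs(s_n) < alpha, 1.0, 1.0 / s_n)      # R_alpha(S_n(f)) on W_n

F = np.fft.fft(np.eye(n)) / np.sqrt(n)                 # unitary Fourier matrix F_n
Q = F.conj().T @ np.diag(d) @ F                        # Q_{n;alpha} = M_n(R_{n;alpha}(f))
eigs = np.linalg.eigvals(Q @ A).real
print(np.sum(np.abs(eigs - 1.0) >= eps), "outliers out of", n)  # expected: o(n)
```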

5. Regularization preconditioners from regularization processes

If the continuous generating function has a root, most preconditioners for Toeplitz systems are asymptotically ill-conditioned and give rise to high numerical instability. In order to improve the stability of the preconditioned system by filtering the components related to the noise, in [18] we introduced the class of regularization preconditioners.

Definition 5.1. [18] Let {An}n∈N be a sequence of n×n matrices and let {Mn}n∈N be a sequence of matrix algebras.

A family of matrices {Qn;α}n∈N;α∈(0,α₀), Qn;α ∈ Mn, with α₀ > 0, is a family of regularization preconditioners for {An} if and only if there exists a sequence of preconditioners {Pn}n∈N, Pn ∈ Mn, such that:

(1) Pn weakly converges to An, that is, for any ε > 0 there exists an nε ∈ N such that, for n > nε, the singular values of the matrix Pn − An are contained in the disc B(0, ε) except for a number o(n) of outliers.

(2) For any n ∈ N, we have that

lim_{α→0⁺} sup_{yn∈Cⁿ} ‖Qn;α yn − P†n yn‖ = 0 ,   (5.1)

where the matrix P†n is the Moore-Penrose inverse of Pn.

(3) For any n ∈ N, let |l^{(n)}_{min}| = |l^{(n)}_1| ≤ |l^{(n)}_2| ≤ · · · ≤ |l^{(n)}_n| = |l^{(n)}_{max}| be the singular values of Pn associated with a basis B of singular vectors of the algebra Mn, and let l^{(n)}_{1;α}, l^{(n)}_{2;α}, . . . , l^{(n)}_{n;α} denote the singular values of the matrix Qn;α associated with the same basis B.

If |l^{(n)}_{min}| → 0 (n → +∞) then, for α ∈ (0, α′₀) with 0 < α′₀ < α₀, there exist an index function jα(n) : N → N, which satisfies jα(n) ≤ n, jα(n) → +∞ (n → +∞), and a constant 0 < Cα < 1, such that

0 ≤ |l^{(n)}_{i;α}| |l^{(n)}_i| ≤ Cα < 1 ,   if i ≤ jα(n) .   (5.2)

In addition, if

|l^{(n)}_{i;α₁}| ≥ |l^{(n)}_{i;α₂}|   ∀ i ≤ j′(n)   (5.3)

for 0 < α₁ < α₂ ≤ α₀, where j′ : N → N is an index function such that j′(n) → +∞, then the family {Qn;α} is said to be monotone.


In summary, the "unstable" inversion of the smallest singular values of a (non-regularizing) preconditioner Pn is controlled by using the regularization preconditioners Qn;α, for α ∈ (0, α₀).

Now, using regularization processes of type (3.4), we build families of regularization preconditioners in the sense of the latter definition.

Let {Rα}α∈(0,α₀) be the family of operators Rα : C2π → V2π such that

[Rα(g)](x) = { 0             if g(x) = 0
               Rα(g(x))      otherwise     (5.4)

for any g ∈ C2π, where {Rα}α∈(0,α₀) is a regularization inverse on [0,+∞) in the sense of Definition 3.1. Under the hypotheses of Lemma 3.5 and according to (4.1), we consider the family of Hermitian preconditioners {Qn;α}n∈N;α∈(0,α₀), with Qn;α ∈ Mn, defined as follows:

Qn;α = Mn(Rn;α(f)) = Mn(Rα(Sn(f))) .   (5.5)

The following theorem shows that such a family of preconditioners belongs to the class of Definition 5.1.

Theorem 5.2. Let {An(f)}n∈N be a sequence of Toeplitz matrices generated by a real function f ∈ C2π. Let x ∈ [0, 2π] be a point such that f(x) = 0 and let {Mn}n∈N be a good sequence of algebras.

If Sn denotes a linear approximation process such that (2.3) holds, and the family of operators (5.4) is denoted by {Rα}α∈(0,α₀), then the family of matrices {Qn;α} defined by (5.5) is a family of regularization preconditioners for An(f) according to Definition 5.1.

Furthermore, if the regularization inverse Rα of (5.4) is monotone, then the family {Qn;α} is monotone.

Proof. Let Pn be the n × n Hermitian matrix defined as Pn = Mn(Sn(f)) and let {p_k}_{k∈N} denote a family of real trigonometric polynomials of degree k which uniformly approximates the function f. With the same notation as in Theorem 4.2, we have

An(f) − Pn = ( An(f) − An(p_{n′_ε}) ) + ( An(p_{n′_ε}) − Mn(p_{n′_ε}) ) + ( Mn(p_{n′_ε}) − Mn(Sn(f)) ) .

The norms of the first and third addends are small, for a sufficiently large n, since p_{n′_ε} and Sn(f) converge to f uniformly. Since {Mn} is a good sequence of algebras, the second addend can be split into two parts, the former with small norm and the latter with constant rank, for a sufficiently large n. As a result of the Cauchy interlace theorem, the family {Pn} satisfies condition (1) of Definition 5.1.

Let us consider the eigenvalues of Qn;α = Mn(Rn;α(f)), that is, the values of [Rα(g)](x_s^{(n)}) with g = Sn(f), where {x_s^{(n)}}_{s=0}^{n−1} is the grid Wn of the algebra Mn.


Due to property (iii) of Definition 3.1 for the regularization inverse {Rα}α∈(0,α₀), we have that [Rα(Sn(f))](x_s^{(n)}) = 0 if [Sn(f)](x_s^{(n)}) = 0, and

lim_{α→0⁺} [Rα(Sn(f))](x_s^{(n)}) = lim_{α→0⁺} Rα([Sn(f)](x_s^{(n)})) = ([Sn(f)](x_s^{(n)}))⁻¹ ,

if [Sn(f)](x_s^{(n)}) ≠ 0.

Let K(Pn) denote the kernel of the preconditioner Pn and let yn = un + u⊥n be the unique decomposition of yn ∈ Cⁿ such that un ∈ K(Pn) and u⊥n ∈ K(Pn)⊥.

Recalling that the eigenvalues of Pn are [Sn(f)](x_s^{(n)}), s = 0, . . . , n−1, we have that Qn;α un = 0, since [Rα(Sn(f))](x_s^{(n)}) = 0 at the grid points where Sn(f) vanishes.

Furthermore, since P†n un = 0, we obtain that

lim_{α→0⁺} Qn;α yn = lim_{α→0⁺} Qn;α u⊥n = P†n u⊥n = P†n yn

for all yn ∈ Cⁿ, which states that condition (2) of Definition 5.1 holds for the family (5.5).

Now we discuss condition (3) of Definition 5.1. Since f ∈ C2π and f(x) = 0, we have that lim_{n→+∞} |l^{(n)}_{min}| = 0 in light of the Szegő theorem ([21], Section 5.2). From Definition 3.1, any regularization inverse Rα(x) is bounded in a right neighborhood of x = 0, since it is continuous from the right on [0,+∞). Thus, for any α ∈ (0, α₀) there exists ξα > 0 such that

Rα(x) < Cα x⁻¹

for x ∈ (0, ξα), where Cα is a constant with 0 < Cα < 1. This implies that, for α ∈ (0, α₀),

|[Rα(f)](x) f(x)| ≤ Cα < 1 ,   ∀x ∈ Lα = {x ∈ [0, 2π] : |f(x)| < ξα} .   (5.6)

Since f ∈ C2π, the Lebesgue measure of Lα is positive. The number of points of the grid Wn of Mn whose images under f are contained in the set (−ξα, ξα) tends to infinity as n increases. More precisely, if we define the set

Kn;α = { s ∈ N : x_s^{(n)} ∈ Wn , |f(x_s^{(n)})| < ξα }

of indices related to the smallest eigenvalues of Pn, we have that

lim_{n→+∞} #Kn;α = +∞ .

If we consider the function jα(n) of Definition 5.1 and, for α ∈ (0, α₀), we set jα(n) ≡ #Kn;α, then the second condition for the regularization preconditioners holds. In fact, for i ∈ Kn;α, we have that

lim_{n→+∞} l^{(n)}_i = f(x)   and   lim_{n→+∞} l^{(n)}_{i;α} = [Rα(f)](x)

for a suitable x ∈ Lα, so that inequality (5.2) is a direct consequence of (5.6).


Finally, we show that, if the regularization inverse {Rα}α∈(0,α₀) is monotone, then the family of regularization preconditioners is monotone too.

Let η be the value introduced in Definition 3.1. From (3.1), for 0 < α₁ < α₂ < α₀ we can write

|l^{(n)}_{i;α₁}| = |l_i(Rn;α₁)| = |l_i(Mn(Rα₁(Sn(f))))| = |[Rα₁(Sn(f))](x_i^{(n)})| = |Rα₁([Sn(f)](x_i^{(n)}))|
  ≥ |Rα₂([Sn(f)](x_i^{(n)}))| = |[Rα₂(Sn(f))](x_i^{(n)})| = |l_i(Mn(Rα₂(Sn(f))))| = |l_i(Rn;α₂)| = |l^{(n)}_{i;α₂}|

for all i ∈ Hn = { s ∈ N : x_s^{(n)} ∈ Wn , |(Sn(f))(x_s^{(n)})| ∈ [0, η) }.

If the function j′ of (5.3) is defined as j′(n) ≡ #Hn, we have that {Qn;α} is monotone, since

lim_{n→+∞} #Hn = +∞ . □

The class of regularization preconditioners of Definition 5.1 includes many preconditioners from the literature for Toeplitz systems derived from the discretization of ill-posed problems [23, 26, 25, 10]. Here we show that the basic preconditioner developed in 1993 by M. Hanke, J.G. Nagy and R.J. Plemmons [23] is a preconditioner obeying the hypotheses of Theorem 5.2. This preconditioner will be denoted by HNP.

According to Section 2, let Fn be the unitary Fourier matrix which diagonalizes the matrix algebra Mn = Mn(Fn) of the circulant matrices, and let Popt(An) ∈ Mn denote the optimal circulant preconditioner for the Toeplitz matrix An = An(f), with f ∈ C2π.

If λ₁(B), λ₂(B), . . . , λn(B) denote the eigenvalues of the circulant matrix B with respect to the set of eigenvectors of Fn, the HNP preconditioner Popt,τ(An) with truncation parameter τ > 0 is the circulant matrix such that

λi(Popt,τ(An)) = { 1                  if |λi(Popt(An))| < τ
                   λi(Popt(An))       otherwise .               (5.7)
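A possible implementation of (5.7) is sketched below in Python. It is an illustrative sketch only: the first-column formula used for the optimal circulant approximation of a symmetric Toeplitz matrix, the FFT-based application, and the test generating function are assumptions made here for the example, not code from the paper.

```python
import numpy as np

# Sketch of the HNP construction (5.7): build Chan's optimal circulant from the
# first column of a symmetric Toeplitz matrix, truncate its small eigenvalues to
# 1, and apply the inverse through FFTs.
def optimal_circulant_eigs(t):
    """Eigenvalues (w.r.t. F_n) of the optimal circulant approximation of the
    symmetric Toeplitz matrix with first column t."""
    n = len(t)
    k = np.arange(n)
    c = ((n - k) * t + k * np.r_[0.0, t[:0:-1]]) / n   # first column of P_opt
    return np.fft.fft(c)

def apply_hnp_inverse(t, tau, y):
    """Apply P_{opt,tau}(A_n)^{-1} to a vector y, cf. (5.7)."""
    lam = optimal_circulant_eigs(t)
    lam = np.where(np.abs(lam) < tau, 1.0, lam)        # truncation step of (5.7)
    return np.fft.ifft(np.fft.fft(y) / lam)

# toy usage with f(x) = 2 - 2cos(x), i.e. first column (2, -1, 0, ..., 0)
n = 256
t = np.zeros(n); t[0], t[1] = 2.0, -1.0
y = np.random.default_rng(1).standard_normal(n)
z = apply_hnp_inverse(t, tau=0.1, y=y).real
```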

Lemma 5.3. Let {An(f)}n∈N be a sequence of Toeplitz matrices generated by a real function f ∈ C2π and let x ∈ [0, 2π] be a point such that f(x) = 0.

Let K(Popt(An)) denote the kernel of the optimal circulant preconditioner Popt(An) ∈ Mn(Fn) and let yn = un + u⊥n be the unique decomposition of yn ∈ Cⁿ such that un ∈ K(Popt(An)) and u⊥n ∈ K(Popt(An))⊥.

Then the family of circulant preconditioners {Qn;α}n∈N;α>0, Qn;α ∈ Mn(Fn), such that

Qn;α yn = Popt,α(An)† u⊥n

is a family of monotone regularization preconditioners for An(f) in the sense of Definition 5.1.


Proof. The preconditioners Qn;α belong to the class (5.5) since:

(i) the sequence {Mn(Fn)} of trigonometric algebras of circulant matrices is a good sequence of matrix algebras;

(ii) Rα is the operator (5.4) based on the monotone regularization inverse Rα of the Hanke, Nagy and Plemmons filter (III) described in Section 3;

(iii) the linear approximation process Sn is the Cesaro sum Cn described in Section 2.

The claim follows by invoking Theorem 5.2. □

We argue that the latter lemma could be applied to other preconditioners based on similar filtering procedures [25, 26], with only minor generalizations of the arguments.

Numerical 1-D applications of regularization preconditioners (5.5) can be found in [18], for the solution of a Fredholm equation of the first kind. In [6, 17] several regularization preconditioners for the Landweber and the conjugate gradient methods have been tested for 2-D deconvolution problems in image restoration. An application to an interferometric multi-image problem related to the astronomical image restoration of the Large Binocular Telescope can be found in [19].

6. Concluding remarks

In the case of Hermitian Toeplitz matrices An(f), the eigenvalues of most trigonometric preconditioners are the values of linear approximation processes of the generating function f on a uniform grid of [0, 2π] [10, 11, 28]. These preconditioners approximate the Toeplitz matrix in the noise space, that is, in the subspace related to components of the data mainly corrupted by noise. If the Toeplitz system comes from the discretization of an ill-posed problem, such preconditioners must be endowed with regularization features. Otherwise, preconditioned iterative methods may provide inaccurate results, due to a fast reconstruction of the components with the highest noise [23, 26, 16].

Here we have introduced a different kind of approximation operator for real functions, which has been called a regularization process. If a regularization process is applied to the generating function of a discrete ill-posed Toeplitz matrix, we can design and compute efficient preconditioners for the related linear systems. In this paper, regularization processes have been constructed by applying well-known continuous regularization algorithms for inverse problems to linear approximation processes [15].

Preconditioned iterative system solvers with such regularization preconditioners give rise to fast convergence in the components of the signal space only. On the other hand, in the noise space the convergence is slow, providing a solution less sensitive to data errors.

Some widely used preconditioners for ill-conditioned linear systems belong to the general family of regularizing preconditioners from regularization processes introduced here. On these grounds, the arguments of the paper can be considered as an extension of that earlier work. Some families based on the above techniques have been proposed and tested in [18, 6, 17, 19]. Those numerical tests showed that regularization preconditioners give a fast and "clean" reconstruction of the solution. It is important to notice that regularization preconditioners lead to solutions which can be better than in the non-preconditioned case. This unusual feature is due to the filtering capabilities of the regularization preconditioners, which provide fast and accurate reconstruction simultaneously.

Regularization preconditioners depend on a real value, say α, which plays the role of regularization parameter, that is, it allows both convergence speed and noise filtering to be controlled. Such a real value α is related to a regularization parameter of the regularization process which approximates the generating function of the Toeplitz matrix. The choice of this parameter α is crucial for the effectiveness of the regularization preconditioning procedure. This aspect deserves more attention and will be considered in a future work.

Acknowledgment

The author is grateful to Prof. F. Di Benedetto for useful suggestions and discussions. In addition, the author wishes to thank the anonymous referee for the many suggestions that improved the paper.

References

[1] O. Axelsson and G. Lindskog, The rate of convergence of the preconditioned conjugate gradient method, Numer. Math., 52, 1986, pp. 499–523.

[2] D. Bertaccini, Reliable preconditioned iterative linear solvers for some integrators, Numerical Linear Algebra Appl., 8, 2001, pp. 111–125.

[3] D. Bertaccini, The spectrum of circulant-like preconditioners for some general linear multistep formulas for linear boundary value problems, SIAM J. Numer. Anal., 40, 2002, pp. 1798–1822.

[4] D. Bertaccini and M.K. Ng, Block ω-circulant preconditioners for the systems of differential equations, Calcolo, 40, 2003, pp. 71–90.

[5] M. Bertero and P. Boccacci, Introduction to Inverse Problems in Imaging, Institute of Physics Publishing, London, 1998.

[6] C. Biamino, Applicazioni del metodo di Landweber per la ricostruzione di immagini, Graduate Thesis in Mathematics, Dipartimento di Matematica, Università di Genova, 2003.

[7] D.A. Bini and F. Di Benedetto, A new preconditioner for the parallel solution of positive definite Toeplitz systems, in Proc. 2nd SPAA, Crete, Greece, July 1990, ACM Press, New York, pp. 220–223.

[8] D.A. Bini and P. Favati, On a matrix algebra related to the discrete Hartley transform, SIAM J. Matrix Anal. Appl., 14, 1993, pp. 500–507.

[9] D.A. Bini, P. Favati and O. Menchi, A family of modified regularizing circulant preconditioners for image reconstruction problems, Computers & Mathematics with Applications, 48, 2004, pp. 755–768.


[10] R.H. Chan and M.K. Ng, Conjugate gradient methods for Toeplitz systems, SIAM Rev., 38, 1996, pp. 427–482.

[11] R.H. Chan and M. Yeung, Circulant preconditioners constructed from kernels, SIAM J. Numer. Anal., 30, 1993, pp. 1193–1207.

[12] T. Chan, An optimal circulant preconditioner for Toeplitz systems, SIAM J. Sci. Stat. Comp., 9, 1988, pp. 766–771.

[13] P. Davis, Circulant matrices, John Wiley & Sons, 1979.

[14] F. Di Benedetto and S. Serra Capizzano, A unifying approach to abstract matrix algebra preconditioning, Numer. Math., 82-1, 1999, pp. 57–90.

[15] H.W. Engl, M. Hanke and A. Neubauer, Regularization of Inverse Problems, Kluwer, Dordrecht, 1996.

[16] C. Estatico, A class of filtering superoptimal preconditioners for highly ill-conditioned linear systems, BIT, 42, 2002, pp. 753–778.

[17] C. Estatico, Classes of regularization preconditioners for image processing, Proc. SPIE 2003 – Advanced Signal Processing: Algorithms, Architectures, and Implementations XIII, Ed. F.T. Luk, Vol. 5205, pp. 336–347, 2003, Bellingham, WA, USA.

[18] C. Estatico, A classification scheme for regularizing preconditioners, with application to Toeplitz systems, Linear Algebra Appl., 397, 2005, pp. 107–131.

[19] C. Estatico, Regularized fast deblurring for the Large Binocular Telescope, Tech. Rep. N. 490, Dipartimento di Matematica, Università di Genova, 2003.

[20] R. Gray, Toeplitz and Circulant matrices: a review, http://www-isl.stanford.edu/~gray/toeplitz.pdf, 2000.

[21] U. Grenander and G. Szegő, Toeplitz Forms and Their Applications, Second edition, Chelsea, New York, 1984.

[22] C.W. Groetsch, Generalized inverses of linear operators: representation and approximation, Pure and Applied Mathematics, 37, Marcel Dekker, New York, 1977.

[23] M. Hanke, J.G. Nagy and R. Plemmons, Preconditioned iterative regularization for ill-posed problems, Numerical Linear Algebra and Scientific Computing, L. Reichel, A. Ruttan and R.S. Varga, eds., Berlin, de Gruyter, 1993, pp. 141–163.

[24] J. Kamm and J.G. Nagy, Kronecker product and SVD approximations in image restoration, Linear Algebra Appl., 284, 1998, pp. 177–192.

[25] M. Kilmer, Cauchy-like preconditioners for two-dimensional ill-posed problems, SIAM J. Matrix Anal. Appl., 20, No. 3, 1999, pp. 777–799.

[26] M. Kilmer and D. O'Leary, Pivoted Cauchy-like preconditioners for regularized solution of ill-posed problems, SIAM J. Sci. Comp., 21, 1999, pp. 88–110.

[27] J.G. Nagy, M.K. Ng and L. Perrone, Kronecker product approximation for image restoration with reflexive boundary conditions, SIAM J. Matrix Anal. Appl., 25, 2004, pp. 829–841.

[28] S. Serra Capizzano, Toeplitz preconditioners constructed from linear approximation processes, SIAM J. Matrix Anal. Appl., 20-2, 1998, pp. 446–465.

[29] G. Strang, A proposal for Toeplitz matrix calculations, Stud. Appl. Math., 74, 1986, pp. 171–176.


[30] E. Tyrtyshnikov, A unifying approach to some old and new theorems in distribution and clustering, Linear Algebra Appl., 232, 1996, pp. 1–43.

[31] E. Tyrtyshnikov and N. Zamarashkin, Spectra of multilevel Toeplitz matrices: advanced theory via simple matrix relationship, Linear Algebra Appl., 270, 1997, pp. 15–27.

[32] N. Trefethen, Approximation theory and numerical linear algebra, J. Mason and M. Cox, eds., Chapman and Hall, London, 1990, pp. 336–360.

Claudio Estatico
Dipartimento di Matematica, Università di Genova
Via Dodecaneso 35
I-16146 Genova, Italy
e-mail: [email protected]


Operator Theory: Advances and Applications, Vol. 160, 179–194
© 2005 Birkhäuser Verlag Basel/Switzerland

Minimal State-space Realization for a Class of nD Systems

K. Galkowski

Abstract. Minimal realizations play a key role in system analysis and synthesis. Among a variety of realizations they are characterised by a minimal dimension, which guarantees that no pole-zero cancellations occur, a very important feature for stability analysis, and they are also of importance from the numerical point of view. This paper provides a simple method for minimal realization construction for multi-linear odd rational functions, an important class from the practical point of view due to strong links to so-called reactance functions.

Keywords. Multidimensional systems, multi-linear, odd rational functions.

1. Introduction

The past two to three decades, in particular, have seen a continually growing interest in multidimensional (nD) systems which is clearly linked to the wide variety of applications arising in both the theory and practical applications domains. The key unique feature of an nD system is that the plant or process dynamics (inputs, states and outputs) are functions of more than one independent variable as the result of the fact that information is propagated in independent directions. This is an essential difference with the classical, or 1D, case where the process dynamics (inputs, states and outputs) are functions of only one variable. In both cases, i.e., 1D and nD, the process can be single-input single-output (SISO) or multiple-input multiple-output (MIMO). Hence, for example, a SISO nD linear system can be represented by a transfer function, which is a rational function in n indeterminates.

Many physical systems, data analysis procedures, computational algorithms and (more recently) learning algorithms have a natural (and underexploited) two-dimensional (2D) structure due to the presence of more than one spatial variable, the combined effect of space and time or the combined effect of a spatial/time variable and an integer index representing iteration, pass or trial number. Physical examples of such systems include bench mining systems, metal rolling, automatic ploughing aids and vehicle convoy coordination on motorways whilst algorithmic examples include image processing, discrete models of spatial behavior, point mapping algorithms and recursive learning schemes as illustrated by trajectory learning in iterative learning control.

Focusing on discrete nD linear systems, two basic state space models have been developed. The first is due to Roesser ([19]) and clearly has a first order structure. Among the key features of this model is that the state vector is partitioned into sub-vectors – one for each of the directions of information propagation (usually termed horizontal and vertical respectively). One main alternative to the Roesser model is the Fornasini-Marchesini model class [7]. Note, however, that Roesser and Fornasini-Marchesini models are not fully independent and it is possible to transform one into the other.

A key task in 2D/nD systems theory and applications is the construction of state-space realizations of the [19] or [7] types from input-output data, often in the form of a 2D/nD transfer function matrix. This problem is well studied (see, for example, [3], [4], [6], [8], [11], [14], [20], [21] and references therein). To date, however, the key systems theoretic and applications relevant question of how to construct a so-called minimal realization has not been solved in the general case. For further details, see, for example, [15], [16], or [5]. It is also an interesting mathematical problem in its own right and it plays an important role in multivariable interpolation (see, e.g., [1] and [2]).

This paper aims to extend the class of multidimensional linear systems for which the solution of this key problem is known. In particular, the existence and construction of a minimal realization for the class of single-input single-output nD linear systems characterized by transfer functions with multi-linear (i.e., of first degree in each indeterminate) numerator and denominator has been developed in [8]. In turn, the assumption that the transfer function is odd, which is the topic of this paper, also makes the analysis significantly easier and provides interesting results. Such systems are of some practical interest, as this subclass consists of the so-called reactance functions, a subclass of positive real functions frequently used in circuit theory.

2. Background

A multidimensional (nD) linear system can be described in the state-space form by the well-known Roesser model

\begin{bmatrix} x_1(i_1+1, i_2, \ldots, i_n) \\ \vdots \\ x_n(i_1, \ldots, i_{n-1}, i_n+1) \end{bmatrix} =
\begin{bmatrix} A_{11} & \cdots & A_{1n} \\ \vdots & \ddots & \vdots \\ A_{n1} & \cdots & A_{nn} \end{bmatrix}
\begin{bmatrix} x_1(i_1, \ldots, i_n) \\ \vdots \\ x_n(i_1, \ldots, i_n) \end{bmatrix} +
\begin{bmatrix} B_1 \\ \vdots \\ B_n \end{bmatrix} u(i_1, \ldots, i_n),

Page 187: Recent Advances in Operator Theory and its Applications: The Israel Gohberg Anniversary Volume

Minimal State-space Realization 181

y(i_1, \ldots, i_n) = \begin{bmatrix} C_1 & \cdots & C_n \end{bmatrix}
\begin{bmatrix} x_1(i_1, \ldots, i_n) \\ \vdots \\ x_n(i_1, \ldots, i_n) \end{bmatrix} + D u(i_1, \ldots, i_n),   (1)

where x_i(i_1, . . . , i_n) ∈ R^{p_i} is the ith local state sub-vector (i = 1, . . . , n), u(i_1, . . . , i_n) ∈ R^r is the input (control) vector, y(i_1, . . . , i_n) ∈ R^m is the output vector, A_{ij}, B_i, C_i, D are real matrices of appropriate dimensions (i, j = 1, . . . , n), and i_1, . . . , i_n ∈ Z₊ ∪ {0} are the discrete independent variables. In block form,

A = \begin{bmatrix} A_{11} & \cdots & A_{1n} \\ \vdots & \ddots & \vdots \\ A_{n1} & \cdots & A_{nn} \end{bmatrix} ,
B = \begin{bmatrix} B_1 \\ \vdots \\ B_n \end{bmatrix} ,   (2)
C = \begin{bmatrix} C_1 & \cdots & C_n \end{bmatrix} .

In this paper, SISO systems are investigated, i.e., the input u and the output y are scalars. A linear, multidimensional (nD) system of the form (1) can also be described in the generalised frequency domain by an n-variable rational function matrix. In the SISO case the transfer function matrix becomes a single, rational transfer function

f(s_1, . . . , s_n) = a(s_1, . . . , s_n) / b(s_1, . . . , s_n) ,   (3)

where a(s_1, . . . , s_n) and b(s_1, . . . , s_n) are real, n-variable polynomials. It is well known that the transfer function description (3) is linked to the Roesser model (1) by

f(s_1, . . . , s_n) = D + C (s_1 I_{t_1} ⊕ · · · ⊕ s_n I_{t_n} − A)⁻¹ B ,   (4)

where ⊕ denotes the direct sum, provided that the rational function (4) is proper.

A very important problem in nD systems theory and practice (for the SISO case) is: given a (scalar) rational function (in n complex variables) f(s_1, . . . , s_n) as in (3), construct matrices A, B, C, D as in (2) so that f is realized in the form (4) as the transfer function of the associated linear system (1). Note that a necessary condition for the problem to have a solution is that f be proper in each variable. Conversely, under the assumption that this necessary condition is satisfied, it is known that the realization problem always has a solution (see, e.g., [7]).
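Formula (4) is straightforward to evaluate symbolically. The SymPy sketch below is an illustrative example only: the realization matrices (with t_1 = t_2 = 1) are chosen here for demonstration and are not taken from the paper.

```python
import sympy as sp

# Illustrative check of (4) for a small 2D SISO Roesser realization with
# t1 = t2 = 1; the matrices A, B, C, D below are example choices.
s1, s2 = sp.symbols('s1 s2')
A = sp.Matrix([[0, sp.sqrt(2)], [-sp.sqrt(2), 0]])
B = sp.Matrix([[1], [1]])
C = sp.Matrix([[-1, -1]])
D = sp.Integer(0)

Lambda = sp.diag(s1, s2)                       # s1 I_{t1} (+) s2 I_{t2}
f = sp.simplify(D + (C * (Lambda - A).inv() * B)[0, 0])
print(f)    # equals -(s1 + s2)/(s1*s2 + 2), a proper 2-variable rational function
```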

For the 1D case it is well understood how to construct state-space minimal or least-order realizations, i.e., realizations for which the dimension of the state space is as small as possible among all possible realizations. Moreover, in the 1D case, it is well known that the minimal state space dimension is equal to the degree of the rational transfer function denominator b, provided that there are no pole-zero cancellations between the denominator b and numerator a, i.e., provided that the pair of polynomials a, b is coprime. This suggests our definition of denominator-degree minimal realization for the nD case.


Definition 1. Suppose that f(s_1, . . . , s_n) is a rational function of n complex variables as in (3). Without loss of generality we may assume that the numerator polynomial a(s_1, . . . , s_n) and the denominator polynomial b(s_1, . . . , s_n) are coprime, i.e., have no nontrivial common polynomial factors. Then the realization (4) for f is said to be denominator-degree minimal provided that the dimension p_1 + · · · + p_n of the state space of the realization is equal to the total degree of the denominator b, i.e., the sum of the degrees of b in each variable.

The realization (4) of f is said to be least-order minimal if the dimension p_1 + · · · + p_n of the state space is as small as possible among all possible realizations (4) of f.

In the 1D case we have that least-order minimal and denominator-degree minimal are equivalent, and that it is always possible to obtain a minimal (in either equivalent sense) realization of a given proper rational function. Also in the 1D case, the properties of controllability and observability of the state space realization are equivalent to minimality. For the nD case, on the other hand, the equivalence between minimality and simultaneous controllability and observability fails. Moreover, the concepts of denominator-degree minimal and state-space least-order minimal are not equivalent in general. It is even possible that a given rational function of n variables (proper in each variable separately) may not have a denominator-degree minimal realization. This phenomenon is related to the complication in the nD case that there are at least three distinct notions of coprimeness, termed factor, minor and zero coprimeness, for polynomials in n variables. By definition, however, given that the realization problem always has a solution as observed above, it follows that the state-space minimal (i.e., least-order minimal) realization problem also always has a solution: the issue is how to compute such a realization, where now the size of a least-order minimal realization may be greater than the total denominator degree in a factor-coprime fractional representation of the rational function f. The Elementary Operation Algorithm developed by Galkowski ([8]) is an attempt to produce efficient solutions to this problem and is based on symbolic calculations.

Until now the problem of the existence and construction of denominator-degree minimal realizations for nD systems has been solved only in some particular cases, as, e.g., when the transfer function has a separable denominator or numerator. Due to its great practical importance (digital filter applications) there exists a very rich literature devoted to this class, see for example [18]. However, the problem solution is then much easier than in the general case and in fact it is based on 1D techniques.

3. Problem formulation

Note first that the state-space realization of a SISO system can be derived in the following way (see, e.g., [8]):


1. Define the (n+1)-variable polynomial

   a_f(s_1, . . . , s_{n+1}) = s_{n+1} b(s_1, . . . , s_n) − a(s_1, . . . , s_n) .   (5)

2. Find the companion matrix H for this polynomial, i.e., the matrix H which satisfies

   a_f(s_1, . . . , s_{n+1}) = det [ ⊕_{i=1}^{n} s_i I_{t_i} ⊕ s_{n+1} − H ] ,   (6)

   where t_i denotes both the polynomial degree in the ith variable and the dimension of the ith state sub-vector in (1) in an obvious manner, and the polynomial a_f can be written as

   a_f(s_1, . . . , s_{n+1}) = \sum_{j_1=0}^{t_1} \cdots \sum_{j_n=0}^{t_n} \sum_{j_{n+1}=0}^{t_{n+1}(=1)} a_{j_{n+1} j_n \cdots j_1} \prod_{k=1}^{n+1} s_k^{t_k - j_k} ,   (7)

   where, due to (6), the polynomial a_f(s_1, . . . , s_{n+1}) has to be monic, i.e.,

   a_{0...0} = 1 .   (8)

3. Write the matrix H in the block form H = \begin{bmatrix} H_{11} & H_{12} \\ H_{21} & H_{22} \end{bmatrix}, where H_{22} is a scalar and refers to s_{n+1}. Then H_{11} = A, H_{12} = B, H_{21} = C, H_{22} = D.

Thus the problem of finding a state-space realization for a rational, n-variate transfer function has been replaced by the problem of finding the companion matrix for the (n+1)-variate polynomial a_f. Hence, given a multi-linear polynomial a_f, the aim is to find a matrix that satisfies a polynomial equation of the form (6) with t_i = 1, i = 1, 2, . . . , n, which requires solving this polynomial equation. The problem is solved in [9]. This approach also generalizes to the general MIMO case ([8]).
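Step 1 of this construction and the extraction of the multi-linear coefficients can be automated symbolically. The SymPy sketch below is illustrative only; the helper function, its name, and the indexing convention used for reading the coefficients are assumptions made here for the example.

```python
import sympy as sp
from itertools import combinations

# Sketch of step 1 for the multi-linear case: build a_f = s_{n+1} b - a and list
# its coefficients a_{i1...ik} in the notation (9), i.e. the coefficient of the
# monomial from which s_{i1}, ..., s_{ik} are missing.
def af_coefficients(a, b, svars):
    n = len(svars)
    s_next = sp.Symbol('s%d' % (n + 1))
    allvars = list(svars) + [s_next]
    poly = sp.Poly(sp.expand(s_next * b - a), *allvars)
    term = dict(poly.terms())                  # exponent tuple -> coefficient
    coeffs = {}
    for k in range(n + 2):
        for missing in combinations(range(1, n + 2), k):
            exps = tuple(0 if i + 1 in missing else 1 for i in range(n + 1))
            coeffs[missing] = term.get(exps, sp.Integer(0))
    return poly, coeffs

s1, s2 = sp.symbols('s1 s2')
af, c = af_coefficients(-s1 - s2, s1 * s2 + 2, [s1, s2])
print(af.as_expr())                      # s1*s2*s3 + s1 + s2 + 2*s3 (up to ordering)
print(c[(1, 2)], c[(1, 3)], c[(2, 3)])   # a_12 = 2, a_13 = 1, a_23 = 1
```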

In this paper, we present a simple, efficient construction algorithm for the solution of the problem in the particular case of an odd multi-linear rational function. The solution algorithm obtained here is much simpler than that obtained for the general case in [9]. In the multi-linear case, the polynomial a_f of (7) can be written with t_i = 1, i = 1, 2, . . . , n, and then the only coefficients a_{j_{n+1}...j_1} appearing are those with indices j_1, . . . , j_{n+1} equal to 0 or 1. For simplicity, the coefficients of the polynomial (7) are denoted as

a_{0···010···010···010···0} := a_{i_1 i_2 ··· i_k} ,   (9)

i.e., a_{i_1 i_2 ··· i_k} denotes the coefficient a_{j_{n+1} j_n ··· j_1} of (7) with j_{i_1} = j_{i_2} = · · · = j_{i_k} = 1 and all remaining indices equal to zero.

We say that the rational function f(s_1, . . . , s_n) is odd if f(s_1, . . . , s_n) = −f(−s_1, . . . , −s_n). Thus, if f given by (3) is odd, then either

1. a is odd and b is even, or
2. a is even and b is odd.


In the first case the associated polynomial a_f (5) is odd, while in the second case a_f is even. For the case where f is also multi-linear, a_f being odd means, in terms of the coefficients (9), that

a_{i_1 i_2 ··· i_{2ℓ+1}} = 0 for all ℓ = 0, 1, . . . in case n is even,
a_{i_1 i_2 ··· i_{2ℓ}} = 0 for all ℓ = 1, . . . in case n is odd,

while a_f being even means that

a_{i_1 i_2 ··· i_{2ℓ+1}} = 0 for all ℓ = 0, 1, . . . in case n is odd,
a_{i_1 i_2 ··· i_{2ℓ}} = 0 for all ℓ = 1, 2, . . . in case n is even.

For example, consider the two odd rational functions

(−s_1 − s_2)/(s_1 s_2 + 2)   and   (−s_1 s_2 − s_1 s_3 − s_2 s_3 − 2)/(s_1 s_2 s_3 + s_1 + s_2 + s_3) .

In the first case the polynomial a_f is

a_f = s_1 s_2 s_3 + 2 s_3 + s_2 + s_1
    = a_{000} s_1 s_2 s_3 + a_{011} s_3 + a_{101} s_2 + a_{110} s_1
    = a_{000} s_1 s_2 s_3 + a_{12} s_3 + a_{13} s_2 + a_{23} s_1 ,

while in the second case a_f is given by

a_f = s_1 s_2 s_3 s_4 + s_1 s_4 + s_2 s_4 + s_3 s_4 + s_1 s_2 + s_1 s_3 + s_2 s_3 + 2
    = a_{0000} s_1 s_2 s_3 s_4 + a_{0110} s_1 s_4 + a_{0101} s_2 s_4 + a_{0011} s_3 s_4
      + a_{1100} s_1 s_2 + a_{1010} s_1 s_3 + a_{1001} s_2 s_3 + a_{1111}
    = a_{0000} s_1 s_2 s_3 s_4 + a_{23} s_1 s_4 + a_{13} s_2 s_4 + a_{12} s_3 s_4
      + a_{34} s_1 s_2 + a_{24} s_1 s_3 + a_{14} s_2 s_3 + a_{1234} .

Now the problem to be solved can be stated as follows: given a multi-linear polynomial a_f of the form (7) with t_i = 1 for i = 1, 2, . . . , n+1, find a matrix H which solves equation (6).

4. The denominator-degree minimal realization and existence conditions

The general case has been considered in [9] and, building on it, we here address the particular case of odd transfer functions. First, recall that the polynomial equation (6) for the denominator-degree minimal realization can be rewritten as a system of polynomial equations with the entries h_{ij} of the (n+1) × (n+1) matrix H taken as the unknowns, i.e.,

(−1)^k M_{i_1 i_2 ... i_k} = a_{i_1 i_2 ... i_k} ,   (10)


where, for each k ∈ {1, 2, . . . , n+1}, the set {i_1, i_2, . . . , i_k} is an element of the set C_k{1, 2, . . . , n+1} of k-element subsets of the set {1, 2, . . . , n+1}, and where M_{i_1 i_2 ... i_k} = M_{i_1 i_2 ... i_k}(h_{ij}) denotes the principal minor of the matrix H corresponding to row and column indices i_1, i_2, . . . , i_k. To avoid misunderstandings, \mathbf{M}_{i_1 i_2 ··· i_k} denotes here the respective sub-matrix and M_{i_1 i_2 ··· i_k} the value of the minor, i.e., det \mathbf{M}_{i_1 i_2 ··· i_k} := M_{i_1 i_2 ··· i_k}. Obviously, the minor M_{i_1 i_2 ··· i_k} depends on the entries of the matrix H, and hence (10) constitutes the set of equations whose solution is a required denominator-degree minimal realization. Note that this equation set can be significantly simplified, which is possible due to the fact that the equations (10) for a given k contain some equation parts for values smaller than k.

We define a transversal of a minor M_{i_1 i_2 ... i_k} to be any term of the form A := h_{i_1 α} h_{i_2 β} ··· h_{i_k ψ}, where {α, β, . . . , ψ} is any permutation of the set {i_1, i_2, . . . , i_k}. Then the value of a minor M_{i_1 i_2 ... i_k} can be expressed as

M_{i_1 i_2 ... i_k} = \sum ± A ,

where k ∈ {1, 2, . . . , n+1}, {i_1, i_2, . . . , i_k} ∈ C_k{1, 2, . . . , n+1}, A sweeps over all transversals of M_{i_1 i_2 ... i_k}, and the signs are given by an appropriate expansion of the determinant along rows or columns, respectively. Note that a transversal A either can be written as a product of transversals of two or more lower-order minors M_{i_1 i_2 ··· i_α}, corresponding to a mutually exclusive, exhaustive partition of {i_1, i_2, . . . , i_k}, or it cannot be presented in that way. In the first case, the transversal can be calculated in terms of the coefficients of the polynomial a_f for smaller values of k, while transversals of the second type remain unchanged in the reduced system of equations.

To introduce a formal way to obtain the 'minimal' equation set equivalent to (10), the following notation is necessary. Denote by r = {j_1, j_2, . . . , j_k} a permutation of the set {i_1, i_2, . . . , i_k} such that

j_1 = i_1 ,   {j_2, . . . , j_k} = {i_2, . . . , i_k} ,   j_k > j_2 ,   (11)

and call R{i_1, i_2, . . . , i_k} the set of all such permutations. Next introduce, for any r ∈ R{i_1, i_2, . . . , i_k},

A_r = h_{i_1 j_2} h_{j_2 j_3} ··· h_{j_{k−1} j_k} h_{j_k i_1} ,
A_{r*} = h_{i_1 j_k} h_{j_k j_{k−1}} ··· h_{j_3 j_2} h_{j_2 i_1} ,   (12)

which represent the 'non-partitioned' transversals and enable rewriting (10) in the form

\sum_{r ∈ R{i_1, i_2, ..., i_k}} (A_r + A_{r*}) = −ã_{i_1 i_2 ··· i_k} ,   (13)

where

ã_{i_1 i_2 ··· i_k} = \sum_{z ∈ Z{i_1 i_2 ··· i_k}} (−1)^{u−1} (u−1)! \prod_{v=1}^{u} a_{z_v} .   (14)


Here Z{i_1, i_2, . . . , i_k} is the set of all mutually exclusive, exhaustive partitions of the index set {i_1, i_2, . . . , i_k}, i.e.,

z := {z_1, . . . , z_v, . . . , z_w, . . . , z_u} ∈ Z{i_1 i_2 ··· i_k} :  u = 1, 2, . . . , k;  z_v, z_w ⊆ {i_1 i_2 ··· i_k};  z_v ∩ z_w = ∅;  ∪_{v=1}^{u} z_v = {i_1 i_2 ··· i_k} .

The analogous equation set has been derived and exploited in [9] for the general multi-linear case.

hii = −ai = −ai = 0, (15)i = 1, 2, . . . , n+ 1,

hi1i2hi2i1 = −ai1i2 := −ai1i2 + ai1ai2 = −ai1i2 , (16)i1, i2 ∈ C2 1, 2, . . . , n+ 1

hi1i2hi2i3hi3i1 + hi1i3hi3i2hi2i1 = −ai1i2i3

:= −ai1i2i3 + ai1ai2i3 + ai2ai1i3 + ai3ai1i2 = 0 (17)

i1, i2, i3 ∈ C3 1, 2, . . . , n+ 1hi1i2hi2i3hi3i4hi4i1 + hi1i3hi3i4hi4i2hi2i1

+hi1i2hi2i4hi4i3hi3i1 + hi1i4hi4i2hi2i3hi3i1

+hi1i3hi3i2hi2i4hi4i1 + hi1i4hi4i3hi3i2hi2i1

= −ai1i2i3i4 := −ai1i2i3i4 + ai1i4ai2i3 + ai2i4ai1i3 + ai3i4ai1i2 , (18)

i1, i2, i3, i4 ∈ C4 1, 2, . . . , n+ 1.It is immediate to see from (15) that all diagonal elements of the matrix H mustbe zero. Moreover, it is straightforward from (16) and (17) that

h2i1i2h

2i2i3h

2i3i1 = ai1i3ai2i3ai1i2 (19)

∀ i1, i2, i3 ∈ C3 1, 2, . . . , n+ 1Also, it is straightforward to see from (16)–(18) that for any k > 3 any transversaleAr (see (12)) can be presented in the form of

Ar =Ai1j2j3Ai1j3j4 · · ·Ai1jk−1jk

(−ai1j3) (−ai1j4) · · ·(−ai1jk−1

) . (20)

For example,

hi1i2hi2i3hi3i4hi4i1 =(hi1i2hi2i3hi3i1 ) (hi1i3hi3i4hi4i1)

−ai1i3

.

The following two lemmas can be proved directly in a straightforward way.


Lemma 1. For each {i_1, i_2, . . . , i_k} ∈ C_k{1, 2, . . . , n+1}, k ≥ 3, and permutation r = {i_1, j_2, . . . , j_k} ∈ R{i_1, i_2, . . . , i_k},

A_r = \sqrt{ a_{i_1 j_2} a_{j_2 j_3} ··· a_{j_{k−1} j_k} a_{j_k i_1} } .   (21)

Lemma 2. For each {i_1, i_2, . . . , i_k} ∈ C_k{1, 2, . . . , n+1}, k ≥ 3, and permutation r = {i_1, j_2, . . . , j_k} ∈ R{i_1, i_2, . . . , i_k},

A_{r*} = {  A_r ,   k = 2l
           −A_r ,   k = 2l + 1 .   (22)

Now, we are in a position to characterize whether the denominator-degree minimal realization is real or complex (provided it exists).

Theorem 1. The denominator-degree minimal realization H is real (provided that such an H exists) if and only if, ∀ {i_1, i_2, i_3} ∈ C_3{1, 2, . . . , n+1},

a_{i_1 i_2} a_{i_2 i_3} a_{i_1 i_3} > 0 .   (23)

Proof. Sufficiency. If (23) holds then, by (19), (21) and (20), all transversals A_r for 3 < k ≤ n+1 are real, and hence from (12) all h_{ij} can be real too.

Necessity. If (23) does not hold then, by (21), there has to exist at least one complex h_{ij}. □

The condition of Theorem 1 for n = 3 obviously holds if all polynomial coefficients a_{ij} are positive, but also, for example, when a_{12}, a_{23}, a_{14}, a_{34} < 0 and a_{13}, a_{24} > 0 (the case n+1 = 4).

Next, we calculate possible values of the elements of the matrix H by solving (15)–(17). This, however, does not guarantee that such a matrix is a denominator-degree minimal realization for a given multi-variate, multi-linear polynomial; a set of necessary and sufficient conditions will be presented below.

Theorem 2. The elements of the matrix H (provided that such an H exists) can be calculated as

h_{ii} = 0 ,   (24)
i = 1, 2, . . . , n+1, and

1. If for some {i_1, i_2, i_3} ∈ C_3{1, 2, . . . , n+1} condition (23) holds, then ∀ {i, j} ⊂ {i_1, i_2, i_3}:
   (a) for a_{ij} > 0,
       h_{ij} = −h_{ji} = ±\sqrt{a_{ij}} ;   (25)
   (b) for a_{ij} < 0,
       h_{ij} = h_{ji} = ±\sqrt{|a_{ij}|} .   (26)

2. If for some {i_1, i_2, i_3} ∈ C_3{1, 2, . . . , n+1} condition (23) does not hold, then for such {i, j} ⊂ {i_1, i_2, i_3} that a_{ij} < 0,
       h_{ij} = −h_{ji} = ±j\sqrt{|a_{ij}|} ,   j² = −1 ,   (27)
   and as in (25) when a_{ij} > 0.


Proof. It is straightforward by (15)–(17), Lemmas 1, 2 and the above discussion. □

Theorem 2 shows immediately that the polynomial coefficients a_{i_1 i_2}, when nonzero, can be assumed to be the basic ones (they determine the elements of the matrix H), and the remaining coefficients required for a denominator-degree minimal realization are to be recovered as functions of these basic ones. When some of the coefficients a_{i_1 i_2} are zero, the general procedure of [9] has to be applied.

Note also that the solution of Theorem 2 is not unique: every matrix similar to H via a diagonal similarity matrix is also a solution.
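For the first worked example above, the construction of Theorem 2 and the defining equation (6) can be verified symbolically. The SymPy sketch below is illustrative: it fixes one admissible sign choice for the ± in (25) and is not the paper's code.

```python
import sympy as sp

# Verification sketch for a_f = s1*s2*s3 + s1 + s2 + 2*s3 (a_12 = 2, a_13 = a_23 = 1,
# all positive): with one admissible sign choice, Theorem 2 gives h_ij = -h_ji = sqrt(a_ij)
# and zero diagonal; we check the defining equation (6).
s1, s2, s3 = sp.symbols('s1 s2 s3')
a12, a13, a23 = 2, 1, 1
H = sp.Matrix([[0,              sp.sqrt(a12),  sp.sqrt(a13)],
               [-sp.sqrt(a12),  0,             sp.sqrt(a23)],
               [-sp.sqrt(a13), -sp.sqrt(a23),  0           ]])
af = s1*s2*s3 + s1 + s2 + 2*s3
lhs = sp.expand((sp.diag(s1, s2, s3) - H).det())
print(sp.simplify(lhs - af) == 0)        # True: H solves (6)

# Roesser matrices read off from the block form of H, and the recovered transfer function
A, B = H[:2, :2], H[:2, 2]
C, D = H[2, :2], H[2, 2]
print(sp.simplify(D + (C * (sp.diag(s1, s2) - A).inv() * B)[0, 0]))  # -(s1+s2)/(s1*s2+2)
```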

The next necessary stage is obviously to obtain conditions for the existence of a denominator-degree minimal realization, which can be achieved by substituting the values of the elements of the matrix H already calculated into the remaining equations of (13)–(14) (with the exception of (15)–(17)).

Lemma 3. Given an (n+1)-variate polynomial a_f(s_1, . . . , s_{n+1}) of (5) with nonzero coefficients a_{ij}, a necessary condition for the solvability of the equation set (10), with the entries h_{ij} of a companion matrix H as indeterminates, i.e., a necessary condition for the existence of a denominator-degree minimal realization, is that all the coefficients of the polynomial a_f(s_1, . . . , s_{n+1}) satisfy

a_{i_1 i_2 ··· i_{2l+1}} = 0 ,   ∀ {i_1, i_2, . . . , i_{2l+1}} ∈ C_{2l+1}{1, 2, . . . , n+1} ,

a_{i_1 i_2 ··· i_{2l}} = \sum_{z ∈ Z_2{i_1 i_2 ··· i_{2l}}} \prod_{z_v ∈ z} a_{z_v}
  + \sum_{z′, z″ ∈ Z_2{i_1 i_2 ··· i_{2l}}, z′ ≠ z″} ±2 \prod_{z_v ∈ z′ ∪ z″} \sqrt{a_{z_v}} ,   (28)

∀ {i_1, i_2, . . . , i_{2l}} ∈ C_{2l}{1, 2, . . . , n+1}, where Z_2{i_1, i_2, . . . , i_{2l}} is the set of all mutually exclusive, exhaustive partitions of the index set {i_1, i_2, . . . , i_{2l}} into two-element subsets, i.e.,

z := {z_1, . . . , z_v, . . . , z_l} ∈ Z_2{i_1 i_2 ··· i_{2l}} :  z_v, z_w ⊆ {i_1 i_2 ··· i_{2l}};  z_v ∩ z_w = ∅;  ∪_{v=1}^{l} z_v = {i_1 i_2 ··· i_{2l}} .

Proof. It is straightforward by substituting the results of Theorem 2 into equations (13)–(14) (with the exception of (15)–(17)) and making use of Lemmas 1 and 2. □

For example, for k = 4 we have

a_{i_1 i_2 i_3 i_4} = a_{i_1 i_2} a_{i_3 i_4} + a_{i_1 i_3} a_{i_2 i_4} + a_{i_1 i_4} a_{i_2 i_3}
  ± 2\sqrt{a_{i_1 i_2} a_{i_3 i_4} a_{i_1 i_3} a_{i_2 i_4}}
  ± 2\sqrt{a_{i_1 i_2} a_{i_3 i_4} a_{i_1 i_4} a_{i_2 i_3}}
  ± 2\sqrt{a_{i_1 i_3} a_{i_2 i_4} a_{i_1 i_4} a_{i_2 i_3}} .

However, Lemma 3 does not provide sufficient conditions, as some sign combinations in (28) are not allowed. In what follows, to obtain the necessary and sufficient existence conditions, we solve the problem of the appropriate signs in (28) and of their relationships to the signs of the elements h_{ij} in (25)–(27). First, however, introduce the following notation:

s(i, j) := {  1 ,   a_{ij} > 0
             −1 ,   a_{ij} < 0 ,   (29)

s_{kl} := {  1 ,   h_{kl} > 0
            −1 ,   h_{kl} < 0 ,   (30)

ℵ(···), which denotes the cardinality (the number of elements) of the set {···}, and S(i_1, j_2, j_3, j_4), which denotes the sign of the transversal A_r := A_{i_1 j_2 j_3 j_4}. Based on this notation, it is possible to obtain the following "sign" relationships, first for k = 4.

Lemma 4. While (25)–(26) are valid,

S(i_1, j_2, j_3, j_4) = s_{i_1 j_2} s_{j_2 j_3} s_{j_3 j_4} s_{j_4 j_1} ,   (31)

and while (25), (27) are valid,

S(i_1, j_2, j_3, j_4) = (−1)^{\frac{1}{2} w(i_1, j_2, j_3, j_4)} s_{i_1 j_2} s_{j_2 j_3} s_{j_3 j_4} s_{j_4 j_1} ,   (32)

where w(i_1, j_2, j_3, j_4) := ℵ{ {k, l} ∈ { {i_1, j_2}, {j_2, j_3}, {j_3, j_4}, {j_4, j_1} } : s(k, l) = −1 }.

Proof. Part "1" is straightforward, while part "2" serves to guarantee the reality of the respective A_r in the presence of possibly complex matrix elements. Note also that, due to this, w(i_1, j_2, j_3, j_4) can only equal 0, 2 or 4. □

Lemma 5. ∀ {i_1, i_2, i_3, i_4} ∈ C_4{1, 2, . . . , n+1},

S(i_1, i_3, i_2, i_4) = (−1)^{ω(i_1, i_2, i_3, i_4)} S(i_1, i_2, i_3, i_4) S(i_1, i_2, i_4, i_3) ,   (33)

where

1. ω(i_1, i_2, i_3, i_4) := ℵ{ {k, l} ∈ { {i_1, i_3}, {i_2, i_3}, {i_3, i_4} } : s(k, l) = −1 } + 1   (34)

if (25)–(26) are valid, and

2. ω(i_1, i_2, i_3, i_4) := \frac{1}{2} [ w(i_1, i_2, i_3, i_4) + w(i_1, i_2, i_4, i_3) + w(i_1, i_3, i_2, i_4) ] + 1   (35)

if (25), (27) are valid.

Proof. 1. It is straightforward by noting that

S(i_1, i_2, i_3, i_4) = (s_{i_1 i_2} s_{i_3 i_4})(s_{i_2 i_3} s_{i_4 i_1}) := σ_{i_3 i_4} σ_{i_2 i_3} ,
S(i_1, i_2, i_4, i_3) = (s_{i_1 i_2} s_{i_4 i_3})(s_{i_2 i_4} s_{i_3 i_1}) := σ′_{i_3 i_4} σ_{i_3 i_1} ,
S(i_1, i_3, i_2, i_4) = (s_{i_1 i_3} s_{i_2 i_4})(s_{i_3 i_2} s_{i_4 i_1}) := σ′_{i_3 i_1} σ′_{i_2 i_3} ,

and

σ′_{ij} = {  σ_{ij} ,   s(i, j) = −1
            −σ_{ij} ,   s(i, j) = 1 .


2. Note that, due to

(s_{i_1 i_2})² = 1 ,   s_{i_3 i_4} s_{i_4 i_3} = −1 ,   s_{i_1 i_3} s_{i_3 i_2} = s_{i_3 i_1} s_{i_3 i_2} ,

we have

S(i_1, i_2, i_3, i_4) S(i_1, i_2, i_4, i_3) = (−1)^{\frac{1}{2}[w(i_1, i_2, i_3, i_4) + w(i_1, i_2, i_4, i_3)] + 1} s_{i_1 i_3} s_{i_2 i_4} s_{i_3 i_2} s_{i_4 i_1} ,

which, together with (32), completes the proof. □

Note that from the above analysis it is straightforward to see that not all signs S(j_1, j_2, j_3, j_4), {j_1, j_2, j_3, j_4} ∈ R{i_1, i_2, i_3, i_4}, where {i_1, i_2, i_3, i_4} ∈ C_4{1, 2, . . . , n+1}, can be arbitrarily chosen. However, the next "sign" lemma will play a crucial role in the development of the so-called sign basis, i.e., those {j_1, j_2, j_3, j_4} ∈ R{i_1, i_2, i_3, i_4}, {i_1, i_2, i_3, i_4} ∈ C_4{1, 2, . . . , n+1}, for which the signs can be arbitrarily chosen.

Lemma 6. ∀ {j_1, j_2, j_3, j_4} ∈ R{i_1, i_2, i_3, i_4}, {i_1, i_2, i_3, i_4} ∈ C_4{1, 2, . . . , n+1}, and ∀ k ∈ {1, 2, . . . , n+1} \ {j_1, j_2, j_3, j_4}, the sign S(j_1, j_2, j_3, j_4) can be determined as

S(j_1, j_2, j_3, j_4) = s(j′_1, k) s(k, j′_3) S(j′_1, j′_2, j′_3, k) S(j′_1, k, j′_3, j′_4) ,   (36)

where {j′_1, j′_2, j′_3, j′_4} is any cyclic permutation of {j_1, j_2, j_3, j_4}.

Proof. It is a clear consequence of

h_{j_1 j_2} h_{j_2 j_3} h_{j_3 j_4} h_{j_4 j_1} = ( h_{j′_1 j′_2} h_{j′_2 j′_3} h_{j′_3 k} h_{k j′_1} ) ( h_{j′_1 k} h_{k j′_3} h_{j′_3 j′_4} h_{j′_4 j′_1} ) / ( a_{j′_1 k} a_{k j′_3} ) . □

Due to these results, we can choose arbitrarily S(1, 2, 3, l) and S(1, 2, l, 3),which yields

S(1, 3, 2, l) = (−1)ω(1,2,3,l)S(1, 2, 3, l)S(1, 2, l, 3) (37)

l = 4, 5, . . . , n+ 1. Moreover, the following signs are determined by this choice

S(1, k, 2, l) = s(1, 3)s(2, 3)S(1, 3, 2, k)S(1, 3, 2, l) (38)

k = 4, 5, . . . , n + 1, l = k + 1, k + 2, . . . , n + 1. Also, the signs S(1, 2, k, l) andS(1, 2, l, k) can be partially arbitrarily chosen, i.e., under the condition

S(1, 2, k, l)S(1, 2, l, k) = (−1)ω(1,2,k,l) s(1, 3)s(2, 3)S(1, k, 2, 3)S(1, 3, 2, l) (39)

Note also that the signs of Ar for k > 4 are not independent but are related tothe aforementioned arbitrarily chosen signs. For example,

S (i1, j2, j3, j4, j5, j6) = −s(i1, j4)S (i1, j2, j3, j4)S (i1, j4, j5, j6) . (40)

Now, applying aforementioned sign relationships allows us to present the nec-essary and sufficient conditions for the existence of a denominator-degree minimalrealization.


Theorem 3. Let a_f(s_1, . . . , s_{n+1}) be an (n+1)-variate polynomial of the form (5) with nonzero coefficients a_{ij}. Then a denominator-degree minimal realization exists if and only if all coefficients of a_f(s_1, . . . , s_{n+1}) satisfy

a_{i_1 i_2 ··· i_{2l+1}} = 0,  ∀ {i_1, i_2, . . . , i_{2l+1}} ∈ C_{2l+1}{1, 2, . . . , n + 1},

a_{i_1 i_2 ··· i_{2l}} = ( Σ_{z ∈ Z_2^{i_1 i_2 ··· i_k}} Π_{z_v ∈ z} ±√(a_{z_v}) )^2,   (41)

∀ {i_1, i_2, . . . , i_{2l}} ∈ C_{2l}{1, 2, . . . , n + 1}, or

a_{i_1 i_2 ··· i_{2l}} = ( Σ_{z ∈ Z_2^{i_1 i_2 ··· i_k}} Π_{z_v ∈ z} ±i_{z_v} √(a_{z_v}) )^2,   (42)

where i_{z_v} = 1 or i_{z'_v} = i_{z''_v}, with |i_{z'_v}|^2 = |i_{z''_v}|^2 = 1 and i_{z'_v} i_{z''_v} = −1.

Proof. It is a straightforward consequence of the analysis above.

Note that in the first case all coefficients must be positive. The last problem to solve here is to find the appropriate signs of the elements h_{ij} with respect to the signs in the conditions (41) or (42), which is accomplished by the following sign algorithm based on the previous considerations. Start hence from arbitrarily chosen basic signs S(1, 3, 2, k), S(1, 3, 2, l), k, l = 4, 5, . . . , n + 1, and from S(1, 2, k, l), k = 4, 5, . . . , n, l = k + 1, k + 2, . . . , n + 1, chosen to satisfy (39). Note that they can be rewritten as

S(1, 2, k, l) = h_{k−2} g^k_{l−k},   S(1, 2, l, k) = h_{l−2} d^k_{l−k},   (43)

k = 3, . . . , n, l = k + 1, . . . , n + 1, where

h_i := s_{12} s_{2,i+2}, i = 1, . . . , n − 1;   g^j_i := s_{i,i+j} s_{i+j,1};   d^j_i := s_{j1} s_{i+j,i},   j = 3, . . . , n, i = 1, . . . , n + 1 − j.   (44)

Now, assuming that the sign base S(1, 3, 2, k), S(1, 3, 2, l), k, l = 4, 5, . . . , n + 1, and S(1, 2, k, l), k = 4, 5, . . . , n, l = k + 1, k + 2, . . . , n + 1, which satisfy (39), is known, we can calculate

g^k_{l−k} = S(1, 2, k, l) h_{k−2},   d^k_{l−k} = S(1, 2, l, k) h_{k−2}   (45)


if (25)–(26) are valid for hij , and

g^k_{l−k} = (−1)^{w(1,2,k,l)/2} S(1, 2, k, l) h_{k−2},   d^k_{l−k} = (−1)^{w(1,2,l,k)/2} S(1, 2, l, k) h_{k−2}   (46)

if (25), (27) are valid. In both cases the signs h_{k−2} are arbitrary.

At the final stage we can determine the appropriate signs of the elements h_{ij} of the matrix H using the following algorithm.

1. Put the signs s_{12}, s_{23}, s_{24}, . . . , s_{2,n+1}, s_{31} arbitrarily.
2. Choose s_{l3}, l = 4, . . . , n + 1, according to

   s_{l3} = d^3_{l−3} s_{31}.   (47)

3. Choose s_{l1}, l = 4, . . . , n + 1, according to

   s_{l1} = g^3_{l−3} s_{31} = g^3_{l−3} s_{13} if (25)–(26) are valid and s(3, l) = −1, and s_{l1} = −g^3_{l−3} s_{13} in the rest of the cases.   (48)

4. Finally, for k = 4, 5, . . . , n, l = k + 1, k + 2, . . . , n + 1, put

   s_{lk} = s_{k1} d_{l−k,k},   (49)

which finishes the algorithm.
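The numbered steps above are easy to mechanise. The following Python sketch is a minimal, illustrative implementation of steps 1–4 under our reading of the index conventions in (44)–(49); the tables d and g of signs d^j_i, g^j_i are hypothetical inputs (in practice they come from (45)–(46)), and the case distinction of (48) is only indicated in a comment:

    import numpy as np

    def assign_signs(n, d, g, rng=np.random.default_rng(0)):
        """Sketch of the sign-assignment steps (47)-(49).

        d[j][i] and g[j][i] are assumed to hold the signs d^j_i, g^j_i
        (hypothetical inputs here).  Returns a dict s with s[(i, j)] = s_ij
        for the entries fixed by the algorithm."""
        s = {}
        # Step 1: s_12, s_2j (j = 3,...,n+1) and s_31 are chosen arbitrarily.
        s[(1, 2)] = rng.choice([-1, 1])
        for j in range(3, n + 2):
            s[(2, j)] = rng.choice([-1, 1])
        s[(3, 1)] = rng.choice([-1, 1])
        # Step 2: s_l3 = d^3_{l-3} s_31, l = 4,...,n+1   (47)
        for l in range(4, n + 2):
            s[(l, 3)] = d[3][l - 3] * s[(3, 1)]
        # Step 3: s_l1 = g^3_{l-3} s_31, l = 4,...,n+1   (48), main branch;
        # the sign flips when (25)-(26) hold and s(3, l) = -1 (not modelled here).
        for l in range(4, n + 2):
            s[(l, 1)] = g[3][l - 3] * s[(3, 1)]
        # Step 4: s_lk = s_k1 d_{l-k,k} for k = 4,...,n, l = k+1,...,n+1   (49)
        for k in range(4, n + 1):
            for l in range(k + 1, n + 2):
                s[(l, k)] = s[(k, 1)] * d[l - k][k]
        return s

    # Hypothetical sign tables, only to make the sketch executable.
    n = 6
    d = {j: {i: 1 for i in range(1, n + 2)} for j in range(1, n + 2)}
    g = {j: {i: -1 for i in range(1, n + 2)} for j in range(1, n + 2)}
    print(assign_signs(n, d, g))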

5. Conclusions

The problem of how to construct state-space realizations for a given 2D MIMO system, written, for example, in 2D transfer function matrix form, is central to various applications of 2D systems theory, such as multidimensional filter analysis and synthesis, and has received considerable attention in the literature. There is as yet no general solution to the problem of obtaining a minimal realization (both the denominator-degree and the least-order realization) in the general nD systems case, but there are several approaches which aim at developing a general solution to the problem of determining the minimal possible dimension. These include the work of Guiver and Bose (1982) [10], [11], [21], [5] and the EOA algorithm due to the author. In this paper we have developed a method for examining whether there exists a denominator-degree minimal realization, and for constructing it, for a particular class of SISO nD systems characterized by multi-linear, and moreover odd, transfer function polynomials. This subclass of systems plays a significant role in practical implementations of system and circuit theory, as it has strong links to the important classes of so-called reactance functions [13] and so-called positive systems [12].


Acknowledgments

This work is partially supported by the Ministry of Scientific Research and Information Technology under project 3 T11A 008 26.

The author expresses his gratitude to the reviewers for their valuable comments.

Krzysztof Galkowski is with the University of Zielona Gora and is also a visiting Professor at the University of Southampton; during the academic year 2004–2005 he is on sabbatical leave at the University of Wuppertal under the Gerhard Mercator Guest Professorship funded by the DFG. In 2004 he received the Siemens Poland scientific award for research in the nD systems area.

References

[1] Agler J. and McCarthy J.E. (2002) Pick interpolation and Hilbert function spaces,Amer. Math. Soc., Providence, RI.

[2] Ball J.A. and Trent T.T. (1998) Unitary colligations, reproducing kernel Hilbertspaces, and Nevanlinna-Pick interpolation in several variables, J. Functional Analy-sis, vol. 157, 1–61.

[3] Bose N.K. (1976) New techniques and results in multidimensional problems, Journalof the Franklin Institute, Special issue on recent trends in systems theory.

[4] Bose N.K. (1982) Applied Multidimensional Systems Theory, New York, Van Nos-trand Reinhold.

[5] Cockburn J.C., Morton B.G. (1997) Linear fractional representations of uncertainsystems, Automatica, vol. 33, no.7, 1263–1271.

[6] Eising R. (1978) Realization and stabilization of 2-D systems IEEE Trans. on Auto-matic Control, AC-23, 793–799.

[7] Fornasini E., Marchesini G. (1978) Doubly-indexed dynamical systems, Math. Syst.Theory, vol. 12, 59–72.

[8] K. Galkowski (2002), State-space Realizations of Linear 2-D Systems with Extensionsto the General nD (n > 2) Case, Lecture Notes in Control and Information Sciences,vol. 263, Springer, London.

[9] Galkowski K. (2001), Minimal state-space realization of the particular case of SISOnD discrete linear systems, International Journal of Control, Vol. 74, No. 13, 1279–1294.

[10] Guiver J.P., Bose N.K. (1982), Polynomial matrix primitive factorization over arbi-trary coefficient field and related results, IEEE Trans. on Circuits and Systems, Vol.CAS-29, No. 10, 649–657.

[11] Kaczorek T. (1985) Two Dimensional Linear Systems, Lecture Notes in Control andInformation Sciences, No. 68, Berlin: Springer-Verlag.

[12] Kaczorek T. (2002) Positive 1D and 2D Systems, Berlin: Springer-Verlag.

[13] Koga T. (1968) Synthesis of finite passive N-ports with prescribed positive real ma-trices of several variables, IEEE Trans. on Circuit Theory, Vol. CT-15, No. 1, 2–23.

[14] Kung S.Y. et al. (1977) New results in 2-D systems theory, part II, Proc. IEEE,945–961.


[15] Mentzelopoulos S.H., Theodorou N.J. (1991) N-dimensional minimal state-space re-alization, IEEE Trans. on Circuits and Systems, vol. 38, no. 3, 340–343.

[16] Premaratne K., Jury E.I., Mansour M. (1997) Multivariable canonical forms formodel reduction of a 2-D discrete time systems, IEEE Trans. on Circuits and Sys-tems, vol. 37, no. 4, 488–501.

[17] Pugh A.C., McInerney S.J., Boudellioua M.S., Hayton G.E. (1998) Matrix pencil ofa general 2-D polynomial matrix, Int. J. Control, vol. 71, no. 6, 1027–1050.

[18] Raghuramireddy D., Unbehauen R., 1991, Realization of 2-D denominator-separabledigital filter transfer functions using complex arithmetic, Multidimensional Systemsand Signal Processing, vol. 2, 319–336.

[19] R. Roesser, A discrete state space model for linear image processing, IEEE Trans.Automatic Control, vol. 20, pp. 1–10, 1975.

[20] Sontag E. (1978) On first-order equations for multidimensional filters, IEEE Trans.Acoust. Speech Signal Processing, Vol. 26, No. 5, 480–482.

[21] Zak S. H., Lee E. B., Lu W. S. (1986) Realizations of 2-D filters and time delaysystems, IEEE Trans on Circuits and Systems, Vol. CAS-33, No. 12, 1241–1244.

K. Galkowski
University of Zielona Gora
Institute of Control and Computational Engineering
Podgorna Str. 50
65-246 Zielona Gora, Poland
e-mail: [email protected]


Operator Theory: Advances and Applications, Vol. 160, 195–216
© 2005 Birkhäuser Verlag Basel/Switzerland

Continuity in Weighted Besov Spaces for Pseudodifferential Operators with Non-regular Symbols

Gianluca Garello and Alessandro Morando

Abstract. The authors state and prove a result of continuity in weighted Besov spaces for a class of pseudodifferential operators whose symbol a(x, ξ) admits a finite number of bounded derivatives with respect to ξ and is of weighted Besov type in the x variable.

Mathematics Subject Classification (2000). 35S05; 35A17.

Keywords. Pseudodifferential operators, Besov spaces.

1. Introduction

Let a(x, ξ) be in the Schwartz class of tempered distributions S′(R^n_x × R^n_ξ). We consider the pseudodifferential operator defined in a formal way, for any rapidly decreasing function u(x) ∈ S(R^n), by:

a(x,D)u = (2π)^{−n} ∫ e^{i x·ξ} a(x, ξ) û(ξ) dξ,   (1.1)

where û(ξ) is the Fourier transform of u and the other notations are standard; in particular x·ξ = Σ_{j=1}^n x_j ξ_j. We are primarily interested in studying the conditions

on the symbol a(x, ξ) which allow a(x,D) to belong to the space L(Lp) of boundedlinear operators on Lp, 1 < p <∞.
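For readers who prefer a computational picture, the following Python sketch evaluates a discrete, one-dimensional analogue of the quantization (1.1) on a periodic grid, with a hypothetical bounded symbol; it only illustrates the formula itself, not any boundedness statement:

    import numpy as np

    # Discrete sketch of (1.1): a(x,D)u(x_j) ~ (1/N) sum_k e^{i x_j xi_k} a(x_j, xi_k) u_hat[k].
    N = 256
    x = 2 * np.pi * np.arange(N) / N                       # grid on [0, 2*pi)
    xi = np.fft.fftfreq(N, d=2 * np.pi / N) * 2 * np.pi    # integer frequencies xi_k

    u = np.exp(np.cos(x))                                  # a smooth periodic test function
    u_hat = np.fft.fft(u)                                  # discrete Fourier transform of u

    def symbol(xv, xiv):
        # a hypothetical zero-order symbol, bounded together with its xi-derivatives
        return (1.0 + 0.3 * np.sin(xv)) * xiv / np.sqrt(1.0 + xiv ** 2)

    X, XI = np.meshgrid(x, xi, indexing="ij")              # X[j,k] = x_j, XI[j,k] = xi_k
    Au = (np.exp(1j * X * XI) * symbol(X, XI) * u_hat[None, :]).sum(axis=1) / N

    # sanity check: with symbol identically 1 the formula reproduces u exactly
    assert np.allclose((np.exp(1j * X * XI) * u_hat[None, :]).sum(axis=1) / N, u)
    print(np.linalg.norm(Au) / np.linalg.norm(u))          # crude size of a(x,D)u on this sample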

Aiming at non-specialists we begin by giving a short review of known results.In applications to linear partial differential equations with C∞ coefficients onedeals, as a rule, with symbols a(x, ξ) which are smooth functions both in x and ξ.We first recall some basic result in this case. Namely, let us refer to the Hormander

The authors are supported by F.I.R.B. grant of Italian Government.


symbol classes S^m_{ρ,δ}, m ∈ R, 0 ≤ δ ≤ ρ ≤ 1, given by the sets of those a(x, ξ) satisfying

|∂^α_ξ ∂^β_x a(x, ξ)| ≤ c_{α,β} (1 + |ξ|)^{m − ρ|α| + δ|β|},  x, ξ ∈ R^n, α, β ∈ Z^n_+.   (1.2)

As a natural extension of the Mikhlin–Hörmander Lemma on Fourier multipliers we have a(x,D) ∈ L(L^p) provided that a(x, ξ) ∈ S^0_{1,0}, i.e., a(x,D) is a classical pseudodifferential operator. More generally a(x,D) ∈ L(L^p) if a(x, ξ) ∈ S^0_{1,δ} for some δ < 1; for the proof see for example Taylor [18, Ch. XI, §1, §2, §3]. On the other hand it is well known, since a counter-example of Hirschmann and Wainger, well summarized in [5], that in general a(x,D) ∉ L(L^p) when a(x, ξ) belongs to S^0_{ρ,δ} with ρ < 1. Namely Hörmander [8] and Fefferman [5] proved that the class Op S^{−m}_{ρ,δ} of pseudodifferential operators with symbols in S^{−m}_{ρ,δ}, 0 ≤ δ ≤ ρ < 1, is contained in L(L^p), for 1 < p < ∞, when m ≥ m_p, where the critical order is given by m_p = n(1 − ρ)|1/2 − 1/p|.

Let us now deal with pseudodifferential operators with non-regular symbols.

Their importance in the literature is now increasing, because of the applicationsto linear partial differential equations with non-smooth coefficients and non-linearequations, as well as to applicative problems of a different nature (signal theory,quantization etc . . . ). Willing to review some results in this connection, we mayquote as a basic example pseudodifferential operators with symbols which aredifferentiable with respect to the ξ variable a finite number of times and whichbelong to generalized Holder classes with respect to x. As to Lp continuity of sucha kind of operators, a very important part is covered by the works of M. Nagase[13], [14], [15], [16]. As a significant example we recall the main result of [15].For 1 < p < ∞, 0 ≤ δ < ρ ≤ 1 Nagase assumes that for some suitable positiveconstants k = k(n), µ = µ(n, ρ, δ) the symbol a(x, ξ) satisfies for any ξ ∈ Rn:

sup_x |∂^α_ξ ∂^β_x a(x, ξ)| ≤ c_{α,β} (1 + |ξ|)^{−m_p − ρ|α| + δ|β|} for |α| ≤ k, |β| < µ;
‖∂^α_ξ a(·, ξ)‖_{C^µ} ≤ c_α (1 + |ξ|)^{−m_p − ρ|α| + δµ}, |α| ≤ k,   (1.3)

where ‖·‖_{C^µ} is the standard Hölder norm of order µ. Then, provided that as before m_p = n(1 − ρ)|1/p − 1/2|, it follows that a(x,D) ∈ L(L^p). More precisely, k(n) = [n/2] if 2 ≤ p < ∞ and k(n) = n + 1 if 1 < p < 2.

Among the results that are worthy to be noticed we mention those of J.Marschall [10], [11] obtained using techniques of the paradifferential calculus ofBony and Meyer (see [1], [12]) which in some way generalize the results of Nagase.

Let us quote also the Sugimoto paper [17], where L^p continuity is studied for pseudodifferential operators with symbols a(x, ξ) in weighted Besov spaces with respect to both variables, with a loss of Besov regularity estimated by means of n(1/2 − 1/p), for 2 ≤ p < ∞. In fact, the guiding thread in most of the literature on L^p continuity of pseudodifferential operators is characterized by the critical order


m_p ∼ n|1/p − 1/2|, and much effort is made in estimating it as well as possible, under minimal assumptions on the symbol.

Returning to smooth symbols, a different point of view was offered by Taylor: keeping as a model the proof of the L^p continuity of the operators in Op S^0_{1,0}, he found that it may be adapted to proving the boundedness of a suitable subclass of Op S^0_{ρ,0}, 0 < ρ < 1, by replacing the Mikhlin–Hörmander Lemma on Fourier multipliers [18, Ch. XI] with the analogous one due to Marcinkiewicz–Lizorkin [9], see the next Lemma 2.4. Namely Taylor proved that Op M^0_ρ ⊂ L(L^p), 0 < ρ < 1, 1 < p < ∞, where the smooth symbol classes M^m_ρ, m ∈ R, are given by the functions a(x, ξ) ∈ S^m_{ρ,0} such that ξ^γ ∂^γ_ξ a(x, ξ) ∈ S^m_{ρ,0} for any multi-index γ with components 0 or 1. In some sense we have in this case the critical order m_p = 0.

The present paper, which follows Taylor's basic layout, is an attempt to give a general result of L^p continuity for pseudodifferential operators with similar non-regular symbols. More precisely, pseudodifferential operators are considered corresponding to symbols a(x, ξ) of Taylor's type, but with a finite number of derivatives with respect to ξ and of weighted Besov type B^{s,Λ}_{p,q} with respect to x; here s > 0, 1 < p < ∞, 1 < q < ∞ and Λ is a suitable weight function.

The paper runs as follows. In Section 2 we first introduce the weight functions Λ(ξ). The weighted Besov spaces B^{s,Λ}_{p,q} are then characterized by means of a suitable partition of unity due to Triebel, see [20], [21]. Corresponding properties are given, to be used in the following part of the paper.

In §3 we introduce the non-regular symbols a(x, ξ), first defined as finitelydifferentiable with respect to ξ and in a general Banach space with respect to x.It is shown in this section that such non-regular symbols can be decomposed inan expansion of elementary symbols, see Definition 3.2, following a technique ofCoifman-Meyer [4].

The remaining part of the paper is devoted to the proof of the principal result: Theorem 3.4, about the boundedness of the pseudodifferential operators a(x,D) between two weighted Besov spaces under suitable conditions. Let us notice that our result appears to be new also in the classical case when Λ(ξ) = √(1 + |ξ|^2). Namely, in this case we get an extension of Proposition 4.5 of Taylor [18, Ch. XI] for smooth symbols. Let us emphasize that, with respect to Nagase and Marschall, our assumptions are stronger, but we get the effective L^p continuity of zero-order pseudodifferential operators, without any loss of regularity, i.e., m_p = 0.

In fact, sharp Lp estimates are essential for the applications to non-linearequations, by following the line of Bony-Meyer [1], [12], Beals and Reed [2], Taylor[[19], Ch. 2–3]. After further development of symbolic calculus, applications of ourresult are expected in the study of the regularity of solutions to some kind of non-linear partial differential equations, generalizing the multi-quasi-elliptic equationsconsidered in Garello, Morando [6], [7]. This will be detailed in future papers.


2. Weighted Besov spaces

In the whole paper Λ(ξ) will be a weight function satisfying the following definition.

Definition 2.1. Λ(ξ) ∈ C^∞(R^n) is a weight function provided that the following assumptions are satisfied with some positive constants C > 0, µ_0 ≥ 1.

1. Λ(ξ) ≥ (1/C)(1 + |ξ|)^{µ_0}, ξ ∈ R^n;
2. for every γ ∈ Z^n_+ there exists C_γ > 0 such that Π_{j=1}^n (1 + ξ_j^2)^{γ_j/2} |∂^γ Λ(ξ)| ≤ C_γ Λ(ξ), ξ ∈ R^n;
3. Λ(tξ) ≤ C Λ(ξ), t, ξ ∈ R^n, max_{1≤j≤n} |t_j| ≤ 1, tξ := (t_1ξ_1, . . . , t_nξ_n);
4. (δ-condition) for some 0 < δ < 1 there holds

Λ(ξ) ≤ C( Λ(η) + Λ(ξ − η) + Λ(η)^δ Λ(ξ − η)^δ ),  ξ, η ∈ R^n.   (2.1)

As has been shown in Triebel [20, Lemma 2.1/2], we can always find µ_1 ≥ µ_0 such that Λ(ξ) < C(1 + |ξ|)^{µ_1}.

Example. The basic examples of weight functions are the elliptic weights Λ_m(ξ) := √(1 + Σ_{j=1}^n ξ_j^{2m}). Of greater interest are the multi-quasi-elliptic weights defined as Λ_P(ξ) := √(Σ_{α∈V(P)} ξ^{2α}), where V(P) is the set of vertices of a complete Newton polyhedron, see for instance [7]. Other examples, such as ⟨ξ⟩^s [log(2 + ⟨ξ⟩)]^t, where s, t > 1 and ⟨ξ⟩ = √(1 + |ξ|^2), are given in Triebel [20].
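As a quick illustration, the Python sketch below evaluates the elliptic and logarithmic weights of the Example and performs a crude random spot check of the δ-condition (2.1) for Λ_2 in dimension 3; the choice δ = 1/2 and all numerical parameters are ours and purely illustrative, so the observed bound is only empirical:

    import numpy as np

    rng = np.random.default_rng(1)

    def elliptic_weight(xi, m=2):
        # Lambda_m(xi) = sqrt(1 + sum_j xi_j^{2m})
        return np.sqrt(1.0 + np.sum(xi ** (2 * m), axis=-1))

    def log_weight(xi, s=2.0, t=1.5):
        # <xi>^s [log(2 + <xi>)]^t with <xi> = sqrt(1 + |xi|^2)
        bracket = np.sqrt(1.0 + np.sum(xi ** 2, axis=-1))
        return bracket ** s * np.log(2.0 + bracket) ** t

    # Empirical spot check of the delta-condition (2.1) for Lambda_2, n = 3, delta = 1/2:
    # report the largest observed ratio Lambda(xi)/(Lambda(eta)+Lambda(xi-eta)+Lambda(eta)^d Lambda(xi-eta)^d).
    n, delta = 3, 0.5
    xi = rng.uniform(-50, 50, size=(20000, n))
    eta = rng.uniform(-50, 50, size=(20000, n))
    ratio = elliptic_weight(xi) / (elliptic_weight(eta) + elliptic_weight(xi - eta)
                                   + elliptic_weight(eta) ** delta * elliptic_weight(xi - eta) ** delta)
    print("largest observed ratio:", ratio.max())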

For a fixed H > 1 let us consider the decomposition of R^n given by the sequence of n-intervals P^{(H)}_{h,λ} defined, for h = (h_1, . . . , h_n) ∈ Z^n_+, λ = (λ_1, . . . , λ_n) ∈ E_n = {−1, 1}^n, by

P^{(H)}_{h,λ} := { ξ ∈ R^n : (1/H) 2^{h_j} η_{h_j} ≤ λ_j ξ_j ≤ H 2^{h_j + 1}, j = 1, . . . , n },   (2.2)

where η_h = −1 if h = 0 and η_h = 1 if h > 0.

Remark 2.2. It is easy to prove the existence of a positive number N_0 = N_0(H) such that, for any λ, ε ∈ E_n, P^{(H)}_{h,λ} ∩ P^{(H)}_{k,ε} = ∅ when |h_j − k_j| > N_0 for some j = 1, . . . , n; see [7] and [20].
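The one-dimensional building blocks of the n-intervals P^{(H)}_{h,λ} are easy to inspect numerically. The following Python sketch lists the 1-D factors prescribed by (2.2) and empirically determines, for a sample value of H, how far apart two indices h, k must be before the corresponding intervals are disjoint, which is the one-dimensional content of Remark 2.2 (the value of H and the index range are arbitrary choices):

    def interval(h, lam, H):
        """One-dimensional factor of P^(H)_{h,lam} from (2.2), for lam = +1 or -1."""
        eta = -1 if h == 0 else 1
        lo, hi = (2 ** h) * eta / H, H * 2 ** (h + 1)
        return (lo, hi) if lam == 1 else (-hi, -lo)

    def overlap(I, J):
        # closed intervals, as in (2.2)
        return max(I[0], J[0]) <= min(I[1], J[1])

    H = 2.0
    pairs = [(h, k) for h in range(0, 20) for k in range(0, 20)]
    N0 = max(abs(h - k) for h, k in pairs if overlap(interval(h, 1, H), interval(k, 1, H)))
    print("intervals with |h - k| >", N0, "do not meet (H =", H, ")")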

Following again Triebel [20], we introduce also the next

Definition 2.3 (Partition of unity). For a fixed H > 1, φ(H) is the set of all sequences {ϕ_{h,λ}}_{h∈Z^n_+, λ∈E_n} ⊂ C^∞_0 such that, for every h ∈ Z^n_+, λ ∈ E_n, supp ϕ_{h,λ} ⊂ P^{(H)}_{h,λ}, Σ_{h∈Z^n_+, λ∈E_n} ϕ_{h,λ}(ξ) = 1, and for any α ∈ Z^n_+ there exists a positive constant C_α such that

|∂^α_ξ ϕ_{h,λ}(ξ)| ≤ C_α 2^{−h·α},  ξ ∈ R^n, h ∈ Z^n_+, λ ∈ E_n.   (2.3)


In the remaining part of the paper, the subscripts h and λ will always be understood to run through the sets Z^n_+ and E_n respectively. For 1 ≤ p ≤ ∞, 1 ≤ q ≤ ∞ and any sequence {u_{h,λ}} ⊂ L^p(R^n) we define ‖{u_{h,λ}}‖_{q(L^p)} := (Σ_{h,λ} ‖u_{h,λ}‖_p^q)^{1/q}, with the obvious modification for q = ∞.

Moreover, we will write F_{x→ξ} u(ξ) = û(ξ) for the Fourier transform of a distribution u ∈ S′(R^n) and F^{−1}_{ξ→x} for the inverse Fourier transform. Let us consider a function m(ξ) on R^n; we set m(D)u(x) = F^{−1}_{ξ→x}(m(ξ) û(ξ)) for any u ∈ S′(R^n), provided that the expressions involved make sense; as usual m(D) is called a Fourier multiplier. The next is a classical result in the theory of Fourier multipliers (cf. [9], [18]).

Lemma 2.4 (Lizorkin–Marcinkiewicz, [18], Ch. XI, Prop. 4.5). Let m(ξ) be a continuous function together with its derivatives ∂^γ m(ξ) for any γ ∈ K_n := {0, 1}^n. If there exists a constant B > 0 such that

|ξ^γ ∂^γ m(ξ)| ≤ B,  ξ ∈ R^n, γ ∈ K_n,   (2.4)

then for every 1 < p < ∞ we can find a constant A_p > 0, depending only on p, B and the dimension n, such that:

‖m(D)u‖_p ≤ A_p ‖u‖_p,  u ∈ S(R^n).   (2.5)

Remark 2.5. The Fourier multipliers ϕ_{h,λ}(D) are L^p continuous for every 1 < p < ∞; indeed, by using the inclusions supp ϕ_{h,λ} ⊂ P^{(H)}_{h,λ} and the inequalities (2.3), the functions ϕ_{h,λ} are shown to satisfy the estimates (2.4) with a positive constant B independent of h and λ.

For {ϕ_{h,λ}} ∈ φ(H), 1 < p < ∞, 1 ≤ q ≤ ∞, s ∈ R we can introduce, with the obvious modification for q = ∞, the following norms:

‖u‖_{B^{s,Λ}_{p,q}} := ‖{Λ(c^{(H)}_{h,λ})^s u_{h,λ}}‖_{q(L^p)} := ( Σ_{h,λ} ‖Λ(c^{(H)}_{h,λ})^s u_{h,λ}‖_p^q )^{1/q}.   (2.6)

Here and later on, for any u ∈ S′(R^n), we set u_{h,λ} = ϕ_{h,λ}(D)u, where c^{(H)}_{h,λ} is the center of the n-interval P^{(H)}_{h,λ}.

We denote by B^{s,Λ}_{p,q} the Banach space of tempered distributions u ∈ S′(R^n) whose norm in (2.6) is finite.
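To give a concrete, if crude, picture of the norm (2.6), the Python sketch below computes a one-dimensional discrete analogue for a sampled periodic function, replacing the smooth partition of unity by sharp dyadic frequency cut-offs and taking Λ(ξ) = √(1 + ξ^2); the block "centers" and all numerical parameters are illustrative choices, so the output is only a rough stand-in for the actual Besov norm:

    import numpy as np

    N = 1024
    x = 2 * np.pi * np.arange(N) / N
    xi = np.fft.fftfreq(N, d=2 * np.pi / N) * 2 * np.pi
    u = 1.0 / (1.1 - np.cos(x))                       # sample periodic function
    u_hat = np.fft.fft(u)

    def besov_norm(u_hat, s, p, q):
        total, h = 0.0, 0
        while 2 ** h <= np.abs(xi).max():
            lo, hi = (0 if h == 0 else 2 ** h), 2 ** (h + 1)   # crude dyadic frequency block
            mask = (np.abs(xi) >= lo) & (np.abs(xi) < hi)
            block = np.fft.ifft(np.where(mask, u_hat, 0.0)).real
            lam_c = np.sqrt(1.0 + (1.5 * 2 ** h) ** 2)         # weight at a nominal block center
            lp = np.mean(np.abs(block) ** p) ** (1.0 / p)      # discrete L^p norm of the block
            total += (lam_c ** s * lp) ** q
            h += 1
        return total ** (1.0 / q)

    print(besov_norm(u_hat, s=1.0, p=2, q=2))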

Remark 2.6. It may be shown that for different choices of systems {ϕ_{h,λ}(ξ)} ∈ φ(H) the norms in (2.6) are equivalent. For H, K greater than 1, there holds 1/C < Λ(c^{(H)}_{h,λ}) / Λ(c^{(K)}_{k,ε}) < C when |h − k| ≤ A, for some A > 0 and C > 1 independent of k, h.

The next Propositions 2.7 and 2.8 are proved by adapting to our context the arguments used by Triebel [22] in the framework of the classical Besov spaces B^s_{p,q} (cf. also [21]).


Proposition 2.7 (Nikol'skij representation, [21] Theorem 2.1/2). Let us consider u = Σ_{h,λ} u_{h,λ}, with convergence in S′(R^n), and assume that supp û_{h,λ} ⊂ P^{(H)}_{h,λ}. Then for any s ∈ R, 1 < p < ∞ and 1 ≤ q ≤ ∞ there exists a constant C = C_{s,p,q} > 0 such that:

‖u‖_{B^{s,Λ}_{p,q}} ≤ C ( Σ_{h,λ} Λ(c^{(H)}_{h,λ})^{sq} ‖u_{h,λ}‖_p^q )^{1/q}.   (2.7)

Since Z^n_+ does not have a natural order, here and in the following the convergence in S′(R^n) of the series Σ_{h,λ} u_{h,λ} must be assumed to be independent of the particular order of the terms.

Proposition 2.8 ([7], Proposition 5.2). For every s ∈ R and 1 < p < ∞ the following continuous embeddings hold true:

S(R^n) ⊂ B^{s,Λ}_{p,q_1} ⊂ B^{s,Λ}_{p,q_2} ⊂ S′(R^n), if 1 ≤ q_1 < q_2 ≤ ∞;   (2.8)
B^{s+ε,Λ}_{p,q_1} ⊂ B^{s,Λ}_{p,q_2}, if 1 ≤ q_1, q_2 ≤ ∞, ε > 0.   (2.9)

Moreover, S(R^n) is dense in B^{s,Λ}_{p,q} for any 1 ≤ q < ∞.

In view of the Nikol'skij inequality, see Triebel [22], we can even prove the following

Lemma 2.9 ([7], Lemma 5.1). For any α ∈ Z^n_+ and 1 ≤ p_1 ≤ p_2 ≤ ∞ there exists a positive constant C = C(α, p_1, p_2) such that

‖∂^α v‖_{p_2} ≤ C 2^{h·α + (1/p_1 − 1/p_2)|h|} ‖v‖_{p_1},   (2.10)

for any v ∈ S′(R^n) such that supp v̂ ⊂ P^{(H)}_{h,λ}.

Proposition 2.10 ([7], Proposition 5.3). For any s ∈ R, 1 < p_1 < p_2 < ∞ and 1 ≤ q ≤ ∞, the following continuous embedding holds true:

B^{s + (n/µ_0)(1/p_1 − 1/p_2), Λ}_{p_1,q} ⊂ B^{s,Λ}_{p_2,q}.   (2.11)

Proposition 2.11. For any s ∈ R, 1 < p < ∞, 1 ≤ q ≤ ∞ and H > 1, we can find a positive constant M = M_{s,p,q,H} such that for every u ∈ S′(R^n):

‖u_{h,λ}‖_∞ ≤ M ‖u‖_{B^{s,Λ}_{p,q}} Λ(c^{(H)}_{h,λ})^{−s + n/(µ_0 p)}.   (2.12)

Proof. Since the continuous embedding B^{s,Λ}_{p,q} ⊂ B^{s,Λ}_{p,∞} holds true for any 1 ≤ q < ∞, we may restrict ourselves to proving (2.12) only for q = ∞, without loss of generality. Let u belong to B^{s,Λ}_{p,∞}. By definition (cf. (2.6)) there holds

‖u_{h,λ}‖_p ≤ ‖u‖_{B^{s,Λ}_{p,∞}} Λ(c^{(H)}_{h,λ})^{−s} for all h, λ,   (2.13)


for a given H > 1 ((2.13) would be trivial if u ∉ B^{s,Λ}_{p,∞}, since then ‖u‖_{B^{s,Λ}_{p,∞}} = ∞). Applying to u_{h,λ} the Nikol'skij type inequality (2.10), with α = 0, p_1 = p and p_2 = ∞, gives

‖u_{h,λ}‖_∞ ≤ C_0 2^{|h|/p} ‖u_{h,λ}‖_p for all h, λ,   (2.14)

with a positive C_0 independent of h, λ. Then we find the estimate (2.12) by gathering (2.13), (2.14) and using also 2^{|h|} ≤ C_1 Λ(c^{(H)}_{h,λ})^{n/µ_0} for all h, λ, where C_1 depends only on H and the dimension n; the latter inequalities are an easy consequence of assumption 1 of Definition 2.1.

For r ∈ Z_+ and κ ∈ {−1, 1} we now set:

L^{(H)}_{r,κ} := { t ∈ R : (1/H) 2^r η_r ≤ κ t ≤ H 2^{r+1} },   (2.15)

with η_r = 1 if r > 0, η_0 = −1.

Lemma 2.12. Let us consider u = Σ_{h,λ} u_{h,λ}, with convergence in S′(R^n), such that

supp û_{h,λ} ⊂ J^{(H)}_{h_1,λ_1} × ··· × J^{(H)}_{h_n,λ_n},   (2.16)

where J^{(H)}_{h_j,λ_j} is either L^{(H)}_{h_j,λ_j} defined in (2.15) or [−H 2^{h_j+1}, H 2^{h_j+1}]. Then for every s ≥ 0, γ > 0, 1 < p < ∞ and 1 ≤ q ≤ ∞ there exists a positive constant C = C_{s,γ,p,q} such that

‖u‖_{B^{s,Λ}_{p,q}} ≤ C ( Σ_{h,λ} Λ(c^{(H)}_{h,λ})^{qs} 2^{qγσ(h)·h} ‖u_{h,λ}‖_p^q )^{1/q},   (2.17)

where σ(h) = (χ(h_1), . . . , χ(h_n)) and

χ(h_j) := 1 if J^{(H)}_{h_j,λ_j} = [−H 2^{h_j+1}, H 2^{h_j+1}], and χ(h_j) := 0 otherwise.   (2.18)

Proof. Let us consider a partition of unity ϕk,ε ∈ φ(K), where k ∈ Zn+, ε ∈ En,

K > 1. Then following the arguments in [[7], Lemma 7.3], we obtain:

ϕk,ε(D)u =∑

h∈E(N0),n1k

λ∈En

ϕk,ε(D)uh,λ, (2.19)

where for any fixed N0 > log2(2HK) we set

E(N0),n1k :=

h ∈ Zn

+:hj ≥ kj −N0, j = 1, . . . , n1

kj −N0 ≤ hj ≤ kj + N0, j = n1 + 1, . . . , n

; (2.20)

here, without loss of generality, we have assumed J(H)hj ,λj

=[−H2h+1, h2hj+1

], for

1 ≤ j ≤ n1 and J(H)hj ,λj

= L(H)hj,λj

when n1 + 1 ≤ j ≤ n, for a given 1 ≤ n1 ≤ n. For


the case q = ∞ we obtain

‖u‖Bs,Λp,q

=

⎛⎝∑k,ε Λ(c(K)

k,ε )qs‖∑

h∈E(N0),n1k

λ∈En

ϕk,ε(D)uh,λ‖qp

⎞⎠1q

=

⎛⎝∑k,ε Λ(c(K)

k,ε )qs‖∑

t∈E(N0),n1

λ∈En

ϕk,ε(D)uk+t,λ‖qp

⎞⎠1q

,

(2.21)

where t = h− k and

E(N0),n1 :=t ∈ Zn :

tj ≥ −N0, j = 1, . . . , n1

−N0 ≤ tj ≤ N0, j = n1 + 1, . . . , n

, (2.22)

agreeing that uk+t,λ ≡ 0 when kj + tj < 0 for some 1 ≤ j ≤ n.By the triangle inequality applied to the norm ‖ · ‖q(Lp) we obtain

‖u‖Bs,Λp,q

≤∑

t∈E(N0),n1

∥∥∥∥∥∥ϕk,ε(D)

(∑λ∈En

Λ(c(K)k,ε )suk+t,λ

)k,ε

∥∥∥∥∥∥q(Lp)

. (2.23)

Thanks to Remark 2.5 we obtain∥∥∥∥∥ϕk,ε(D)∑

λ∈En

Λ(c(K)k,ε )suk+t,λ

∥∥∥∥∥p

< C

∥∥∥∥∥∑λ∈En

Λ(c(K)k,ε )suk+t,λ

∥∥∥∥∥p

; (2.24)

moreover, there exists C = Cs,n,N0 > 0 such that Λ(c(K)k,ε ) ≤ CΛ(c(H)

k+t,λ), fort ∈ E(N0),n1 . It then follows

‖u‖Bs,Λp,q

≤ C∑

t∈E(N0),n1

⎛⎝∑k,λ

Λ(c(H)k+t,λ)qs‖uk+t,λ‖q

p

⎞⎠1q

. (2.25)

For an arbitrary γ > 0, let us multiply any term depending on t in the right-hand side of (2.25) by 2γtj and its own inverse, as j = 1, . . . , n1; then by setting∑

t∈E(N0),n1 2−γ|t| = CN0,γ,n1 we obtain

‖u‖Bs,Λp,q

≤ CCN0,γ,n1

⎛⎝∑h,λ

Λ(c(H)h,λ )qs2γh1 . . . 2γhn1‖uh,λ‖q

p

⎞⎠1q

, (2.26)

which ends the proof. All the arguments may be repeated with few changes forthe case q = ∞.

At the end of this section let us consider two interpolation results whichdirectly follow from Calderon [3] and Triebel [20], respectively. In the notation ofthese authors [·, ·]Θ, 0 < Θ < 1 is the complex interpolation functor.


Proposition 2.13. Let (B_0, B_1) and (C_0, C_1) be two interpolation couples. Let L be a linear mapping from B_0 + B_1 to C_0 + C_1 such that x ∈ B_i implies L(x) ∈ C_i and

‖L(x)‖_{C_i} ≤ M_i ‖x‖_{B_i},  i = 0, 1.   (2.27)

Then x ∈ B_Θ := [B_0, B_1]_Θ implies L(x) ∈ C_Θ := [C_0, C_1]_Θ and

‖L(x)‖_{C_Θ} ≤ M_0^{1−Θ} M_1^Θ ‖x‖_{B_Θ}.   (2.28)

Proposition 2.14 ([20], Theorem 4.2/2). For any weight function Λ(ξ) and 1 < p < ∞, 1 ≤ q < ∞:

[B^{0,Λ}_{p,q}, B^{r,Λ}_{p,q}]_Θ = B^{rΘ,Λ}_{p,q},  0 < Θ < 1.   (2.29)

3. Pseudodifferential operators with non-regular symbols

In the remainder of the paper X will be a generic Banach space with norm ‖ · ‖.

Definition 3.1. For any non-negative integer N and m ∈ R, we define X M^m_Λ(N) as the class of all measurable functions a(x, ξ) on R^{2n} such that

Π_{j=1}^n (1 + ξ_j^2)^{γ_j/2} |∂^γ_ξ a(x, ξ)| ≤ C_N Λ(ξ)^m,  |γ| ≤ N, x, ξ ∈ R^n;
Π_{j=1}^n (1 + ξ_j^2)^{γ_j/2} ‖∂^γ_ξ a(·, ξ)‖ ≤ C_N Λ(ξ)^m,  |γ| ≤ N, ξ ∈ R^n.   (3.1)

Definition 3.2. For any integer N ≥ 0, we define X ME(N) as the class of all expansions

σ(x, ξ) := Σ_{h,λ} d_{h,λ}(x) ψ_{h,λ}(ξ)   (3.2)

whose terms d_{h,λ} ∈ L^∞(R^n) ∩ X and ψ_{h,λ} ∈ C^∞_0(R^n) satisfy, for some M > 0, H > 1 and C > 0,

‖d_{h,λ}‖_∞ < M;  ‖d_{h,λ}‖ < M;  supp ψ_{h,λ} ⊂ P^{(H)}_{h,λ};   (3.3)
|∂^α ψ_{h,λ}(ξ)| < C 2^{−h·α}, for any ξ ∈ R^n and |α| ≤ N.   (3.4)

We say that σ(x, ξ) is an elementary symbol.

The expansion in (3.2) is trivially convergent since, in view of Remark 2.2, for any fixed ξ ∈ R^n all but a finite number of its terms vanish. Moreover X ME(N) ⊂ X M^0_Λ(N), for any weight function Λ(ξ) and integer N ≥ 0.
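The structure of an elementary symbol is easy to emulate numerically. The following Python sketch builds a one-dimensional toy example of (3.2), with bounded coefficient functions d_h and smooth bumps ψ_h supported in dyadic frequency bands and satisfying bounds of the type (3.3)–(3.4); all concrete choices (the bump profile, the bands, the coefficients) are hypothetical:

    import numpy as np

    def bump(t):                      # smooth profile, supported in (-1, 1)
        out = np.zeros_like(t)
        inside = np.abs(t) < 1.0
        out[inside] = np.exp(-1.0 / (1.0 - t[inside] ** 2))
        return out

    def psi(h, xi):                   # supported where |xi| is comparable to 2^h
        return bump((np.abs(xi) - 1.5 * 2.0 ** h) / (0.9 * 2.0 ** h))

    def d(h, x):                      # uniformly bounded coefficient functions
        return np.cos((h + 1) * x) / (1.0 + h)

    def sigma(x, xi, hmax=12):        # a toy elementary symbol sigma(x, xi)
        return sum(d(h, x) * psi(h, xi) for h in range(hmax + 1))

    print(sigma(np.array([0.3]), np.array([100.0])))

For any fixed ξ only the two or three bands containing |ξ| contribute, which mirrors the finite-overlap remark above.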

Proposition 3.3. Provided that N ≥ n + 1, any symbol a(x, ξ) ∈ X M^0_Λ(N) may be written as an expansion of elementary symbols a_m(x, ξ) ∈ X ME(N − n − 1), m ∈ Z^n, in the following way:

a(x, ξ) = Σ_{m∈Z^n} (1/(1 + |m|)^{n+1}) a_m(x, ξ),   (3.5)


where |m| = |m_1| + ··· + |m_n| and the expansion is absolutely convergent in L^∞(R^n_x × R^n_ξ). Moreover, the elementary symbols a_m(x, ξ) may be constructed in such a way that the assumptions (3.3), (3.4) are satisfied with constants M and C independent of m.

Proposition 3.3 is proved by slightly modifying the arguments used to showProposition 6.1 in [7].

Theorem 3.4. For any weight function Λ(ξ), let a(x, ξ) be a symbol in B^{r,Λ}_{p,q} M^m_Λ(N) with r > n/((1 − δ)µ_0 p), N ≥ 2n + 1, 1 < p < ∞, 1 ≤ q < ∞ and m ∈ R. Then:

a(x,D) : B^{s+m,Λ}_{p,q} → B^{s,Λ}_{p,q} continuously, for any 0 ≤ s ≤ r.   (3.6)

The following remarks will be useful in proving the previous statement.

1) Thanks to Remark 2.1/2 in [20], there exists a constant C > 1 such that 1/C < Λ(ξ)/Λ(c^{(H)}_{h,λ}) < C for every ξ ∈ P^{(H)}_{h,λ}, and C does not depend on h, λ. Let us assume that {χ_{h,λ}} belongs to φ(K), with K > H, and satisfies χ_{h,λ} = 1 on supp ϕ_{h,λ}; it is quite easy to prove that χ_{h,λ}(D) Λ^m(D) Λ(c^{(H)}_{h,λ})^{−m} is a continuous Fourier multiplier on L^p(R^n), 1 < p < ∞, thanks to Lemma 2.4. We can then show, for any u ∈ S(R^n):

‖Λ^m(D)u‖_{B^{s,Λ}_{p,q}} = ( Σ_{h,λ} Λ(c^{(H)}_{h,λ})^{(s+m)q} ‖Λ(c^{(H)}_{h,λ})^{−m} χ_{h,λ}(D) Λ^m(D) ϕ_{h,λ}(D)u‖_p^q )^{1/q}
 ≤ C ( Σ_{h,λ} Λ(c^{(H)}_{h,λ})^{(s+m)q} ‖ϕ_{h,λ}(D)u‖_p^q )^{1/q} = C ‖u‖_{B^{s+m,Λ}_{p,q}}.

Thus we conclude that for any m, s ∈ R, 1 < p < ∞, 1 ≤ q < ∞, Λ^m(D) is bounded from B^{s+m,Λ}_{p,q} into B^{s,Λ}_{p,q}, because of the density of S(R^n) in B^{s,Λ}_{p,q} for q < ∞. Since a(x,D)Λ^{−m}(D) has symbol a(x, ξ)Λ^{−m}(ξ) ∈ B^{r,Λ}_{p,q} M^0_Λ(N), assuming that Theorem 3.4 holds in the case m = 0, we obtain that the operator a(x,D) = (a(x,D)Λ^{−m}(D))Λ^m(D) is bounded from B^{s+m,Λ}_{p,q} into B^{s,Λ}_{p,q} when 0 ≤ s ≤ r.

2) Proposition 3.3 and the Dominated Convergence Theorem assure that for any a(x, ξ) ∈ B^{r,Λ}_{p,q} M^0_Λ(N) we can write, for u ∈ S(R^n):

a(x,D)u(x) = Σ_{m∈Z^n} (1/(1 + |m|)^{n+1}) a_m(x,D)u(x),  |m| = Σ_{j=1}^n |m_j|,   (3.7)

where a_m(x, ξ) ∈ B^{r,Λ}_{p,q} ME(N − n − 1), and under the assumption of Theorem 3.4 we have N − n − 1 ≥ n. As a first step, let us then prove Theorem 3.4 for σ(x, ξ) ∈ B^{r,Λ}_{p,q} M^0E(N), with N ≥ n.

3) For any σ(x, ξ) ∈ B^{s,Λ}_{p,q} M^0E(N) having the form (3.2) we can write:

σ(x,D)u(x) = Σ_{h,λ} d_{h,λ}(x) u_{h,λ}(x),  u ∈ S(R^n),   (3.8)

where u_{h,λ}(x) = ψ_{h,λ}(D)u(x) and d_{h,λ}(x), ψ_{h,λ}(ξ) satisfy (3.3), (3.4). The expansion in (3.8) is absolutely convergent in L^∞(R^n). In fact, in view of (3.4), for any


T > 0 and some suitable positive constant C we have:

‖uh,λ‖∞ ≤ CCT(

1 + 1H2

∑nj=1 χhj 22hj

)T≤ CCT ah1 . . . ahn , (3.9)

where CT :=∫((1 + |ξ|2)T |u(ξ)| dξ is finite, χh = 0 if h = 0, χh = 1, if h > 0 and

ah = 1 for h = 0, ah = 11

H2 22hM

nfor h > 0.

Using also (3.3), the estimate

Σ_{h,λ} ‖d_{h,λ}‖_∞ ‖u_{h,λ}‖_∞ ≤ M 2^n C_T Σ_{h_1=0}^∞ a_{h_1} ··· Σ_{h_n=0}^∞ a_{h_n} < ∞   (3.10)

shows that (3.8) is absolutely convergent in L^∞(R^n). Let us also remark that for every v ∈ S′(R^n) and {ϕ_{h,λ}} ∈ φ(H) there holds

v = Σ_{h,λ} ϕ_{h,λ}(D)v = Σ_{h,λ} v_{h,λ}, with convergence in S′(R^n).   (3.11)

4) For {ψ_{k,ε}(ξ)} ∈ φ(K), K > 1, let us set d^{k,ε}_{h,λ}(x) := ψ_{k,ε}(D) d_{h,λ}(x) and consider

σ(x,D)u(x) = Σ_{h,λ} Σ_{k,ε} d^{k,ε}_{h,λ}(x) u_{h,λ}(x).   (3.12)

It follows from Proposition 2.11 and (3.3) that ‖d^{k,ε}_{h,λ}‖_∞ < M Λ(c^{(K)}_{k,ε})^{−(r − n/(µ_0 p))}, for some M > 0. Then, using (3.9) and provided that r > n/(µ_0 p), we can conclude that the expansion in (3.12) is absolutely convergent in L^∞(R^n_x).

5) Thanks to the absolute convergence we can change the order of the terms in the expansion (3.12) and choose a useful order. Let us first introduce some notation. Namely, for a fixed N_0 ∈ N and any j ∈ Z_+ we set:

E^{(N_0)}_{1,j} := ∅ if j ≤ N_0, and E^{(N_0)}_{1,j} := Z_+ ∩ [0, j − N_0[ if j > N_0;   (3.13)
E^{(N_0)}_{2,j} := Z_+ ∩ [j − N_0, j + N_0[;   (3.14)
E^{(N_0)}_{3,j} := Z_+ ∩ [j + N_0, ∞[.   (3.15)

For A := {1, 2, . . . , n} and B := {1, 2, 3}, let B^A be the set of all functions ω : A → B. For any h ∈ Z^n_+ and ω ∈ B^A we set E^{(N_0)}_{ω,h} := Π_{i=1}^n E^{(N_0)}_{ω(i),h_i}. With the previous notation we can write:

σ(x,D)u(x) = Σ_{ω∈B^A} Σ_{h,λ} Σ_{k∈E^{(N_0)}_{ω,h}, ε∈E_n} d^{k,ε}_{h,λ}(x) u_{h,λ}(x).   (3.16)


6) For every h, k ∈ Z^n_+ and λ, ε ∈ E_n: supp F(d^{k,ε}_{h,λ} u_{h,λ}) ⊂ P^{(K)}_{k,ε} + P^{(H)}_{h,λ}. The n-intervals P^{(H)}_{h,λ} and P^{(K)}_{k,ε} are obtained as a superposition of n real intervals of the type L^{(H)}_{r,κ} and L^{(K)}_{s,δ} introduced in (2.15). Therefore we have reduced the study of the n-dimensional sum P^{(H)}_{h,λ} + P^{(K)}_{k,ε} to an argument involving the one-dimensional sums L^{(H)}_{r,κ} + L^{(K)}_{s,δ}. We now need the following technical lemma, for the proof of which we refer to [7].

Lemma 3.5 ([7], Lemma 7.1). Let us consider r, s ∈ Z_+, κ, δ ∈ {−1, 1}, and H, K greater than 1. For any positive integer N_0 such that N_0 > log_2(2HK), we can always find two positive constants T, M, with T > H + K, 1/T < min{1/K − 2H/2^{N_0}, 1/H − 2K/2^{N_0}} and M > 2^{N_0+1}K + 2H, which fulfill the following statements, with η_j = 1 if j > 0 and η_0 = −1:

(a) if s ∈ E^{(N_0)}_{1,r} and r > N_0 then
L^{(H)}_{r,κ} + L^{(K)}_{s,δ} ⊂ { θ ∈ R : (1/T) 2^r η_r ≤ κθ ≤ T 2^{r+1} } =: L^{(T)}_{r,κ};   (3.17)

(b) if s ∈ E^{(N_0)}_{2,r} then
L^{(H)}_{r,κ} + L^{(K)}_{s,δ} ⊂ { θ ∈ R : |θ| ≤ M 2^r } =: [−M 2^r, M 2^r];   (3.18)

(c) if s ∈ E^{(N_0)}_{3,r} then
L^{(H)}_{r,κ} + L^{(K)}_{s,δ} ⊂ { θ ∈ R : (1/T) 2^s η_s ≤ δθ ≤ T 2^{s+1} } =: L^{(T)}_{s,δ}.   (3.19)

It then follows that F(d^{k,ε}_{h,λ} u_{h,λ}) is supported in the product of n real intervals of the type (3.17)–(3.19). This suggests to split B^A in the following way:

C_1 := {ω ∈ B^A : ω(A) = {1}};  C_2 := {ω ∈ B^A : ω(A) = {2}};  C_3 := {ω ∈ B^A : ω(A) = {3}};
C_4 := {ω ∈ B^A : ω(A) = {1, 2}};  C_5 := {ω ∈ B^A : ω(A) = {1, 3}};  C_6 := {ω ∈ B^A : ω(A) = {2, 3}};
C_7 := {ω ∈ B^A : ω(A) = {1, 2, 3}}.   (3.20)

The sets C_1, C_2 and C_3 reduce to a singleton {ω}, while C_4–C_7 contain several functions, for any dimension n ≥ 2.

For any σ(x,D) ∈ H^{r,p}_Λ ME(N) we can write

σ(x,D)u(x) = Σ_{j=1}^7 T_j u(x),  u ∈ S(R^n),   (3.21)


where for j = 1, . . . , 7:

T_j u(x) := Σ_{ω∈C_j} Σ_{h,λ} Σ_{k∈E^{(N_0)}_{ω,h}, ε∈E_n} d^{k,ε}_{h,λ}(x) u_{h,λ}(x),  u ∈ S(R^n).   (3.22)

In the following we will work under the conditions obtained step by step in the remarks 1)–6). In particular, in the statements of the following Propositions 3.6–3.9 we will always assume σ(x, ξ) ∈ B^{r,Λ}_{p,q} M^0E(N), with 1 < p < ∞, 1 ≤ q < ∞, Λ(ξ) a weight function, N ≥ n and r > n/(µ_0 p).

Proposition 3.6.

T_1 : B^{s,Λ}_{p,t} → B^{s,Λ}_{p,t} continuously, for any s ∈ R, 1 ≤ t < ∞.   (3.23)

Proof. Assuming N0 > log2(2HK), from Lemma 3.5 we find a constant T > 1 such

that supp dk,ε

h,λuh,λ ⊂ P(T )h,λ , for any h, k ∈ Zn

+, with kj < hj − N0 (j = 1, . . . , n),and any λ, ε ∈ En.

In view of Proposition 2.7, for every s ∈ R and 1 < p <∞ we get:

‖T1u‖Bs,Λp,t

≤ C(∑h,λ

Λ(c(T )h,λ)ts‖uh,λ‖t

p(∑

k∈E(N0)1,h ,ε∈En

‖dk,εh,λ‖∞)t)

1t , (3.24)

where E(N0)1,h :=

∏nj=1 E

(N0)1,hj

and C is independent of u. Since the sequence dh,λis bounded in Br,Λ

p,q and r > nµ0p , from Proposition 2.11 we have:∑

k∈E(N0)1,h ,ε∈En

‖dk,εh,λ(x)‖∞ ≤M

(∑k,ε

Λ(c(K)k,ε )−

(r− n

µ0p

))suph,λ

‖dh,λ‖Br,Λp,q

. (3.25)

Using now (3.25) jointly with Remark 2.2, s ∈ R and 1 ≤ t < ∞, we get for anyu ∈ S(Rn):

‖T1u‖ ≤ CM suph,λ ‖dh,λ‖Br,Λp,q

(∑h,λ Λ(c(H)

h,λ )ts‖uh,λ‖tp

) 1t

≤ CM suph,λ ‖dh,λ‖Br,Λp,q‖u‖Bs,Λ

p,t.

(3.26)

In order to get the last inequality in (3.26), we need to observe that the func-tions ψh,λ(ξ) involved in the expression (3.2) give Lp continuous Fourier multi-pliers ψh,λ(D) for every 1 < p < ∞; indeed it is enough to apply the Lizorkin-Marcinckiewicz Lemma in view of (3.3), (3.4) and follow the arguments used inRemark 2.5. Let us also point out that the estimate (3.4), with N ≥ n, is essen-tial in order to apply Lemma 2.4 to ψh,λ(ξ). Since S(Rn) is dense in Bs,Λ

p,t (cf.Proposition 2.8) the proof is concluded.

Proposition 3.7.

T_2 : B^{s,Λ}_{p,t} → B^{s + r − n/(µ_0 p) − θ, Λ}_{p,t},  1 ≤ t < ∞,   (3.27)

continuously, for any s > −r + n/(µ_0 p), 0 < θ < s + r − n/(µ_0 p).


Proof. Let us set Uh,λ(x) :=∑

k∈E(N0)2,h

ε∈En

dk,εh,λ(x)uh,λ(x), where E

(N0)2,h :=

n∏j=1

E(N0)2,hj

.

From Lemma 3.5 it follows that T2u(x) =∑

h,λ Uh,λ(x) fulfills the assumptions

of Lemma 2.12. Since s + r − nµ0p − θ > 0, we may estimate the B

s+r− nµ0p−θ,Λ

p,t

norm of T2u(x) by means of (2.17), with γ = µ0θn , remembering also that 2|h| ≤

CΛ(c(H)h,λ )

nµ 0 . Then for a positive constant C depending only on r, s, p, t, µ0, n and

θ > 0:

‖T2u‖B

s+r− nµ0p

−θ,Λ

p,t

≤ C

⎛⎝∑h,λ

Λ(c(T )h,λ)t(s+r− n

µ0p )‖Uh,λ‖tp

⎞⎠1t

. (3.28)

Using now Proposition 2.11, Remark 2.6 and observing moreover that E(N0)2,h is a

finite set, we have:

‖Uh,λ‖p ≤M‖uh,λ‖p suph,λ ‖dh,λ‖Br,Λp,q

∑k∈E

(N0)2,h ,ε∈En Λ(c(K)

k,ε )−(

r− nµ0p

)

≤MC‖uh,λ‖pΛ(c(H)h,λ )−

(r− n

µ0p

)suph,λ ‖dh,λ‖Br,Λ

p,q.

(3.29)Then we conclude that

‖T2u‖B

s+r− nµ0p

−θ,Λ

p,t

≤ CM suph,λ ‖dh,λ‖Br,Λp,q

(∑

h,λ Λ(c(H)h,λ )st‖uh,λ‖t

p)1t

≤ CM suph,λ ‖dh,λ‖Br,Λp,q‖u‖Bs,Λ

p,t,

(3.30)

with positive constants C,M independent of u ∈ S(Rn).

Proposition 3.8.

T_3 : B^{s − r + θ + n/(µ_0 p), Λ}_{p,t} → B^{s,Λ}_{p,t}   (3.31)

continuously, for any s < r, 1 ≤ t < ∞, θ > 0.

Proof. Setting Vk,ε(x) :=∑

h∈E(N0−1)1,k

λ∈En

dk,εh,λ(x)uh,λ(x) and exploiting the absolute

convergence of the expansion in (3.12) we may write T3u(x) =∑

k,ε Vk,ε(x), for

any u ∈ S(Rn). It follows from Lemma 3.5 that suppVk,ε ⊂ P(K)k,ε . Then using

Proposition 2.7 and the embedding (2.11) we obtain for any 1 < p1 < p <∞ andsome positive constant C = Ct,s,p,p1 :

‖T3u‖Bs,Λp,t

≤ C

⎛⎝∑k,ε

Λ(c(K)k,ε )

(s+ n

µ0

(1

p1− 1

p

))t

‖Vk,ε‖tp1

⎞⎠ 1t

. (3.32)

Setting now η = 1p1− 1

p and applying the Holder inequality it follows:

‖Vk,ε‖p1 ≤∑

h∈E(N0−1)1,k ,λ∈En ‖dk,ε

h,λ‖p‖uh,λ‖ 1η

≤ suph,λ ‖dh,λ‖Br,Λp,∞

Λ(c(K)k,ε )−r

∑h∈E

(N0−1)1,k ,λ∈En ‖uh,λ‖ 1

η.

(3.33)


Then, writing for the sake of brevity Λk := Λ(c(K)k,ε ), we get

‖T3u‖Bs,Λp,t

≤ C suph,λ

‖dh,λ‖Br,Λp,q

⎛⎜⎜⎝∑k,ε

(Λs−r+ n

µ 0η

k

∑h∈E

(N0−1)1,k

λ∈En

‖uh,λ‖ 1η)t

⎞⎟⎟⎠1t

≤ C suph,λ

‖dh,λ‖Br,Λp,q

∑k,ε

Λs−r+ n

µ 0η+η′

k Λ−η′k

∑h∈E

(N0−1)1,k

λ∈En

‖uh,λ‖ 1η.

(3.34)

Assuming now without any restriction that s − r + nµ0η + η′ < 0, thanks to the

assumption 3 in Definition 2.1 and Remark 2.6 we have: Λ(c(K)k,ε )s−r+ n

µ0η+η′

≤TΛ(c(H)

h,λ )s−r+ nµ0

η+η′, with T > 0 independent of k, h, ε, λ. Then the Bs,Λ

p,t norm ofT3u(x) can be bounded from above by

CT suph,λ

‖dh,λ‖Br,Λp,q

∑k,ε Λ(c(K)

k,ε )−η′ ∑h,λ Λ(c(H)

k,λ )s−r+ nµ0

η+η′‖uh,λ‖ 1

η

≤ CT suph,λ ‖dh,λ‖Br,Λp,q‖u‖

Bs−r+ n

µ0η+η′,Λ

,1

. (3.35)

To get the bound (3.35) it is essential to exploit that the operators ψh,λ(D), fromthe expression in (3.2), are continuous Fourier multipliers in L

1η (Rn) when N ≥ n.

We have thus proved the continuity of T3 from Bs−r+ n

µ0 η+η′,Λ1η ,1

into Bs,Λp,t . As a

result of Proposition 2.10 we obtain

Bs−r+ n

µ0p +η′,Λp,1 ⊂ B

s−r+ nµ0p +η′− n

µ0( 1

p−η),Λ

1η ,1

= Bs−r+ n

µ0η+η′,Λ

1η ,1

, (3.36)

with continuous embedding. Furthermore, using (2.9), we prove the following con-tinuous inclusion

Bs−r+ n

µ0p +θ,Λ

p,t ⊂ Bs−r+ n

µ0p +η′,Λp,1 , (3.37)

for an arbitrary η′ such that 0 < η′ < θ. Gathering estimates (3.34)–(3.37) and

using the density of S(Rn) in Bs−r+ n

µ0p +θ,Λ

p,t we obtain (3.31).

Proposition 3.9.

T_3 : B^{θ + n/(µ_0 p), Λ}_{p,q} → B^{r,Λ}_{p,q} continuously, for any θ > 0.   (3.38)

Proof. We can write ‖T3u‖Br,Λp,q

≤ C‖Λ(c(K)k,ε )rVk,ε‖q(Lp), in the notation of the

previous proof. Applying the Holder-Schwarz inequality we have for the conjugateorder q′ such that 1

q + 1q′ = 1 and any τ > 0:

‖Vk,ε‖p ≤ (∑

h∈E(N0−1)1,k

λ∈En

Λ(c(H)h,λ )−qτ‖dk,ε

h,λ‖qp)

1q (

∑h∈E

(N0−1)1,k

λ∈En

Λ(c(H)h,λ )q′τ‖uh,λ‖q′

∞)1q′ .

(3.39)


We obtain for the Br,Λp,q norm of T3u the following bound

(∑h,λ

Λ(c(H)h,λ )q′τ‖uh,λ‖q′

∞)1q′∑h,λ

Λ(c(H)h,λ )−τ (

∑k,ε

Λ(c(K)k,ε )qr‖dk,ε

h,λ‖qp)

1q

≤ C suph,λ

‖dh,λ‖Br,Λp,q‖Λ(c(H)

h,λ )τuh,λ‖q′ (L∞),(3.40)

where∑

h,λ Λ(c(H)h,λ )−τ is finite. From Lemma 2.9, with p2 = ∞, and the Lp con-

tinuity of the Fourier multiplier ψh,λ(D), it follows, for suitable C > 0,

‖Λ(c(H)h,λ )τuh,λ‖q′(L∞) ≤

∑h,λ Λ(c(H)

h,λ )τ‖uh,λ‖∞

≤ C∑

h,λ Λ(c(H)h,λ )τ+ n

µ0p ‖uh,λ‖p ≤ C‖u‖B

τ+ nµ0p

p,1

.(3.41)

Using again the embedding (2.9) we now obtain for any τ > 0, ε > 0, 1 ≤ t < ∞and suitable C > 0:

‖T3u‖Br,Λp,q

≤ C suph,λ

‖dh,λ‖Br,Λp,q‖u‖

Bτ+ n

µ0p,Λ

p,1

≤ C‖dh,λ‖Br,Λp,q‖u‖

Bτ+ε+ n

µ0p,Λ

p,t

. (3.42)

Since θ = τ+ε ranges over all of (0,∞), using usual density arguments, we conclude

the proof. We have really proved that T3 : Bθ+ n

µ0p ,Λ

p,t → Br,Λp,q as a bounded linear

operator for any 1 ≤ t <∞.

Let us remark that any operator T_j, j = 4, . . . , 7, may be written as a finite sum of operators of the following form:

Ru(x) = Σ_{h,λ} Σ_{k∈E^{(N_0),n_1,n_2,π}_h, ε∈E_n} d^{k,ε}_{h,λ}(x) u_{h,λ}(x),  u ∈ S(R^n).   (3.43)

Here n_1, n_2 are integers such that 0 ≤ n_1 ≤ n_2 ≤ n and at least two of these inequalities must be strict; π is any permutation of the set {1, 2, . . . , n}; E^{(N_0)}_{1,h_{π(j)}}, E^{(N_0)}_{2,h_{π(j)}}, E^{(N_0)}_{3,h_{π(j)}} are defined by (3.13)–(3.15) and

E^{(N_0),n_1,n_2,π}_h := Π_{j=1}^{n_1} E^{(N_0)}_{1,h_{π(j)}} × Π_{j=n_1+1}^{n_2} E^{(N_0)}_{2,h_{π(j)}} × Π_{j=n_2+1}^{n} E^{(N_0)}_{3,h_{π(j)}}.   (3.44)

Therefore, one only needs to study the B^{s,Λ}_{p,q}-continuity of an operator having the form (3.43).

In order to simplify the notation we assume from this moment, without loss of generality, that the permutation π in (3.43) is the identity of {1, 2, . . . , n} and restrict ourselves to the case n_1 = 1, n_2 = 2 and n = 3, that is, Ru(x) := Σ_{h,λ} Σ_{k_j∈E^{(N_0)}_{j,h_j}, j=1,2,3, ε∈E_n} d^{k,ε}_{h,λ}(x) u_{h,λ}(x), for any u ∈ S(R^3). Because of the absolute


convergence of the expansion in the L∞ norm, we can write

Ru(x) :=∑

h1,h2,k3λ1,λ2,ε3

∑kj∈E

(N0)j,hj

, j=1,2,

h3∈E(N0−1)1,k3

,

ε1,ε2,λ3

dk,εh,λ(x)uh,λ(x). (3.45)

From Lemma 3.5, we find T > 1 such that supp dk,ε

h,λuh,λ ⊂ L(T )h1,λ1

×[−T 2h2, T 2h2]×L

(T )k3,ε3

, for any (k1, k2, h3) satisfying k1 < h1 − N0, h2 − N0 ≤ k2 < h2 + N0 andh3 ≤ k3 −N0, and all ε1, ε2, λ3.

For shortness, we set t := (h1, h2, k3), σ := (λ1, λ2, ε3) and E(N0)t := E

(N0)1,h1

×E

(N0)2,h2

×E(N0−1)1,k3

; moreover e(K)1,r,ε := (c1,K

ε2r, 0, 0), e(K)2,r,ε := (0, c2,K

ε2r, 0), e(K)3,r,ε :=

(0, 0, c3,Kε2r), for any integer r, K > 1, ε ∈ −1, 1 and c

j,K:= K± 1

2K , j = 1, 2, 3.Using now Lemma 2.12 we find that for every s ≥ 0, 1 < p < ∞, 1 ≤ q ≤ ∞ andγ > 0 there exists C = Cs,p,q,γ > 0 such that

‖Ru‖Bs,Λp,q

≤ C

(∑t,σ

Λ(c(T )t,σ )qs2qγh2‖Ut,σ‖p

q

) 1p

, (3.46)

where cT,j := T ± 12T , j = 1, 2, 3, c(T )

t,σ =(cT,1λ12h1 , cT,2λ22h2 , cT,3ε32k3

)and

Ut,σ(x) :=∑

(k1,k2,h3)∈E(N0)t

ε1,ε2,λ3

dk,εh,λ(x)uh,λ(x). Let us notice that c

(T )t,σ = c

(T )h,σ +

τ3e(T )3,k3,ε3

, where τ3 := 1 − 2h3−k3 satisfies 0 < τ3 < 1 as h3 ≤ k3 − N0; byusing the assumptions 3 and 4 (δ-condition) of Definition 2.1, we get a positiveconstant C such that

Λ(c(T )t,σ ) ≤ C

(Λ(c(T )

h,σ) + Λ(e(T )3,k3,ε3

) + Λ(c(T )h,σ)δΛ(e(T )

3,k3,ε3)δ), (3.47)

for any t, σ, k3 ≥ h3 + N0. It then follows:

‖Ru‖Bs,Λp,q

≤ C(I1 + I2 + I3), (3.48)

where

I1 := (∑t,σ

2qγh2‖∑

(k1,k2,h3)∈E(N0)t

ε1,ε2,λ3

Λ(c(T )h,σ)sdk,ε

h,λuh,λ‖qp)

1q , (3.49)

I2 := (∑t,σ

2qγh2Λ(e(T )3,k3,ε3

)qs‖Ut,σ‖qp)

1q , (3.50)

I3 := (∑t,σ

2qγh2Λ(e(T )3,k3,ε3

)qsδ‖∑

(k1,k2,h3)∈E(N0)t

ε1,ε2,λ3

Λ(c(T )h,σ)δsdk,ε

h,λuh,λ‖qp)

1q .(3.51)

In order to estimate separately I1, I2, I3, let us compute in the case s=r.


1) Estimate of I1. Thanks to Proposition 2.11:

‖∑

(k1,k2,h3)∈E(N0)t

ε1,ε2,λ3

Λ(c(T )h,σ)rdk,ε

h,λuh,λ‖p ≤∑

(k1,k2,h3)∈E(N0)t

ε1,ε2,λ3

Λ(c(T )h,σ)r‖dk,ε

h,λ‖∞‖uh,λ‖p

≤M suph,λ

‖dh,λ‖Br,Λp,q

∑h3∈E

(N0−1)1,k3

λ3∈En

Λ(c(T )h,σ)r‖uh,λ‖p

∑kj∈E

(N0)j,hj

εj∈En,j=1,2

Λ(c(K)k,ε )−r+ n

µ0p .

(3.52)

Since r > nµ0p we can find θ > 0 in such a way that Λ(c(K)

k,ε )−(

r− nµ0p

)is bounded

above by

Λ(e(K)1,k1,ε1

)−(r− n

µ0p−θ)Λ(e(K)

3,k3,ε3)−

θ3 Λ(e(K)

2,h2,ε2)−

θ3 Λ(e(K)

3,h3,ε3)−

θ3 , (3.53)

for all k ∈ Z3+, k2 −N0 < h2 ≤ k2 + N0, h3 ≤ k3 −N0 and ε ∈ En.

From (3.53) it follows, for kj , j = 1, 2 running as above,∑kj∈E

(N0)j,hj

εj∈En,j=1,2

Λ(c(K)k,ε )−

(r− n

µ0p

)≤ C2Λ(e(K)

3,k3,ε3)−

θ3 Λ(e(K)

2,h2,ε2)−

θ3 Λ(e(K)

3,h3,ε3)−

θ3 . (3.54)

Now from (3.52), together with (3.54) and the Holder inequality, we obtain thefollowing bound for ‖

∑(k1,k2,h3)∈E

(N0)t

ε1,ε2,λ3

Λ(c(T )h,σ)rdk,ε

h,λuh,λ‖p:

MC suph,λ

‖dh,λ‖Br,Λp,q

Λ− θ3

2 Λ− θ3

3 (∑

h3,λ3

Λ(c(T )h,σ)qr‖uh,λ‖p

q)1q , (3.55)

where Λ2 := Λ(e(K)2,h2,ε2

), Λ3 := Λ(e(K)3,k3,ε3

) and C is a suitable positive constantdependent only on q and θ. If we now choose γ > 0 suitably small in such a way that2γh2Λ− θ

32 is uniformly bounded with respect to h2 and using Λ(c(T )

h,σ) ≤ CΛ(c(H)h,λ ),

we can conclude

I1 ≤ CM suph,λ

‖dh,λ‖BsΛp,q

(∑

h1,h2,k3λ1,λ2,ε3

2qh2γΛ−q θ3

3 maxε2=±1

Λ−q θ3∑

h3,λ3

Λ(c(T )h,σ)qr‖uh,λ‖q

p)1q

≤ CM suph,λ

‖dh,λ‖Br,Λp,q

(∑h,λ

Λ(c(H)h,λ )qr‖uh,λ‖q

p)1q ≤ CM sup

h,λ‖dh,λ‖Br,Λ

p,q‖u‖Br,Λ

p,q.

2) Estimate of I2. Let us set Λ1 := Λ(e(K)1,k1,ε1

) and Λ3 := Λ(e(K)3,h3,λ3

); then forgeneric positive ρ, let us multiply Ut,σ by Λρ

1, Λ3 and their own inverses. Since

k1 < h1−N0, in view of assumption 3 in Definition 2.1 we have Λ1 ≤ CΛ(e(K)1,h1,λ1

).Then using also the Cauchy-Schwarz inequality we obtain

‖Ut,σ‖p ≤ Cρ,,q′Λ(e(K)1,h1,λ1

)ρ(∑

(k1,k2,h3)∈E(N0)t

ε1,ε2,λ3

Λq3 ‖dk,ε

h,λ‖qp‖uh,λ‖q∞)

1q

≤ Cρ,,q′Λ(e(K)1,h1,λ1)

ρ(∑

h3,λ3Λq

3 ‖uh,λ‖q∞∑

k1,k2ε1,ε2

‖dk,εh,λ‖q

p)1q ,

(3.56)


where Cq′ρ,,q′ =

∑k1,k2,h3∈E

(N0)t

ε1,ε2,λ3

Λ−q′ρ1 Λ−q′

3 is bounded and 1q + 1

q′ = 1. Then for

a suitable C = Cρ,,q > 0, I2 is bounded by

C(∑

h1,h2λ1,λ2

2qγh2Λ(e(K)1,h1,λ1

)qρ∑

k3,ε3

Λ(e(K)3,k3,ε3

)qr∑

h3,λ3

Λq3 ‖uh,λ‖q

∞∑

k1,k2ε1,ε2

‖dk,εh,λ‖q

p)1q

≤ C∑

h,λ 2γh2Λ(e(K)1,h1,λ1

)ρΛ3‖uh,λ‖∞(

∑k,ε Λ(c(K)

k,ε )qr‖dk,εh,λ‖q

p)1q

≤ C suph,λ ‖dh,λ‖Br,Λp,q

∑h,λ Λ(c(H)

h,λ )ρ++ γµ0 ‖uh,λ‖∞,

(3.57)where the inequality 2γh2Λ(e(K)

1,h1,λ1)ρΛ(e(K)

3,h3,λ3) ≤ CΛ(cH)

h,λ)ρ++ γµ0 is used for

proving the last estimate. Let us remark that the statement of Proposition 2.11holds true even if a smooth partition of unity in φ(H) is replaced by any systemψh,λ satisfying (3.3) and (3.4) up to a finite order N ≥ n; indeed, provided thatN ≥ n, it amounts to ψh,λ(D) being Lp continuous Fourier multipliers. Then byusing Proposition 2.11, we compute

Λ(c(H)h,λ )ρ++ γ

µ0 ‖uh,λ‖∞ ≤MΛ(c(H)h,λ )ρ++ γ

µ0−r+ n

µ0p ‖u‖Br,Λp,q

. (3.58)

Since r > nµ0p , for suitable ρ, , γ we can set −θ = ρ + + γ

µ0− r + n

µ0p

in such a way that∑

h,λ Λ(c(H)h,λ )−θ is bounded. We then conclude that I2 ≤

C suph,λ ‖dh,λ‖Br,Λp,q‖u‖Br,Λ

p,q.

3) Estimate for I3. Let us assume in this part that r > nµ0(1−δ)p with 0 < δ <

1 introduced in (2.1). Using (1 − δ)r and Λ(c(T )h,σ)δruh,λ instead of and uh,λ

respectively, now we can estimate ‖∑

(k1,k2,h3)∈E(N0)t

ε1,ε2,λ3∈En

Λ(c(T )h,σ)δrdk,ε

h,λuh,λ‖p as we

did for ‖Ut,σ‖p in (3.56); we then obtain the following estimate:

‖∑

(k1,k2,h3)∈E(N0)t

ε1,ε2,λ3∈En

Λ(c(T )h,σ)δrdk,ε

h,λuh,λ‖p ≤∑

k1,k2,h3ε1,ε2,λ3

Λ(c(T )h,σ)δr‖dk,ε

h,λ‖p‖uh,λ‖∞

≤ CΛ(e(K)1,h1,λ1

)ρ(∑

h3,λ3

Λ(e(K)3,h3,λ3

)(1−δ)rqΛ(c(K)h,σ )δrq‖uh,λ‖q

∞∑

k1,k2ε1,ε2

‖dk,εh,λ‖q

p)1q .

Pointing out now that

maxΛ(e(T )3,k3,ε3

),Λ(e(K)3,h3,λ3

) ≤ CΛ(e(K)3,k3,ε3

) =: CΛ3, Λ(c(K)h,σ ) ≤ CΛ(cK)

h,λ)

and setting moreover Λ1 := Λ(e(K)1,h1,λ1

), we obtain the following bound for I3, witha suitable C > 0 independent of u, γ, ρ:

C(∑

h1,h2,k3λ1,λ2,ε3

2qγh2Λqrδ3 Λqρ

1

∑h3,λ3

Λ(c(K)h,σ )δrq‖uh,λ‖∞

∑k1,k2ε1,ε2

Λ(1−δ)rq3 ‖dk,ε

h,λ‖qp)

1q

≤ C∑

h,λ 2γh2Λρ1Λ(c(K)

h,λ )δr‖uh,λ‖∞(∑

k,ε Λ(c(K)k,ε )rq‖dk,ε

h,λ‖qp)

1q

≤ C suph,λ ‖dh,λ‖Br,Λp,q

(∑

h,λ Λ(c(H)h,λ )ρ+δr+ γ

µ0−r+ n

µ0p )‖u‖Br,Λp,q

.

(3.59)


Arguing as in 2), since (1 − δ)r > nµ0p , we can now choose positive numbers

θ, ρ, γ such that ρ + δr + γµ0

= r − nµ0p − θ; we have thus proved that I3 ≤

C suph,λ ‖dh,λ‖Br,Λp,q‖u‖Br,Λ

p,q.

Concerning the case s=0, we can just repeat the arguments used to estimateI1, starting from (3.46) with s = 0 and γ > 0 sufficiently small.

Summing up, the above computations lead to the following

Proposition 3.10. For r > n/(µ_0(1 − δ)p), 1 < p < ∞, 1 ≤ q < ∞:

R : B^{r,Λ}_{p,q} → B^{r,Λ}_{p,q} and R : B^{0,Λ}_{p,q} → B^{0,Λ}_{p,q}, continuously.   (3.60)

With the help of the interpolation results in Propositions 2.13 and 2.14, we are able to show the following continuity result about the operators T_j, j = 4, . . . , 7.

Proposition 3.11. For r > n/(µ_0(1 − δ)p), 1 < p < ∞, 1 ≤ q < ∞:

T_j : B^{s,Λ}_{p,q} → B^{s,Λ}_{p,q}, continuously for 0 ≤ s ≤ r, j = 4, . . . , 7.   (3.61)

We end this section with the proof of Theorem 3.4.

Proof of Theorem 3.4. By using Propositions 3.6–3.11 we immediately get the statement for an elementary symbol σ(x, ξ) ∈ B^{r,Λ}_{p,q} ME(N) with N ≥ n. More precisely, for any 0 ≤ s ≤ r, 1 < p < ∞ and 1 ≤ q < ∞,

‖σ(x,D)u‖_{B^{s,Λ}_{p,q}} ≤ C sup_{h,λ} ‖d_{h,λ}‖_{B^{r,Λ}_{p,q}} ‖u‖_{B^{s,Λ}_{p,q}},  u ∈ S(R^n),   (3.62)

where the constant C > 0 depends only on r, s, p, q and n.

Let us now take an arbitrary symbol a(x, ξ) in B^{r,Λ}_{p,q} M^0_Λ(N) for N ≥ 2n + 1; in view of (3.7), where the elementary symbols a_m(x, ξ) are in B^{r,Λ}_{p,q} ME(N − n − 1) with N − n − 1 ≥ n, for every 0 ≤ s ≤ r, 1 < p < ∞, 1 ≤ q < ∞ and u ∈ S(R^n) we obtain

‖a(x,D)u‖_{B^{s,Λ}_{p,q}} ≤ C ‖u‖_{B^{s,Λ}_{p,q}} Σ_{m∈Z^n} (1/(1 + |m|)^{n+1}) sup_{h,λ} ‖d^m_{h,λ}‖_{B^{r,Λ}_{p,q}},   (3.63)

with C > 0 depending only on r, s, p, q and the dimension n. Since the sequences {d^m_{h,λ}}_{h,λ} are bounded in B^{r,Λ}_{p,q} uniformly in m ∈ Z^n, the series Σ_{m∈Z^n} 1/(1 + |m|)^{n+1} converges and S(R^n) is dense in B^{s,Λ}_{p,q}, (3.63) implies that a(x,D) is B^{s,Λ}_{p,q}-bounded.

When the symbol a(x, ξ) ∈ B^{r,Λ}_{p,q} M^m_Λ has an arbitrary order m, we easily reduce to the case of order zero, as was already noticed in this section.

Acknowledgements

We thank Prof. C. van der Mee, Prof. L. Rodino and the two anonymous referees for the suggestions they offered us, above all in order to make the paper clearer and more understandable to non-specialist readers.


References

[1] J.M. Bony, Calcul symbolique et propagation des singularites pour les equations auxderivees partielles non lineaires, Ann. Sc. Ec. Norm. Sup. 14 (1981), 161–205.

[2] M. Beals, M.C. Reeds, Microlocal regularity theorems for non smooth pseudodiffer-ential operators and applications to non-linear problems, Trans. Am. Math. Soc. 285(1984), 159–184.

[3] A.P. Calderon, Intermediate spaces and interpolation, the complex method, StudiaMath. 24 (1964), 113–190.

[4] R. Coifman, Y. Meyer, Au dela des operateurs pseudo-differentiels; Asterisque 57,Soc. Math. France, 1978.

[5] C. Fefferman, Lp bounds for pseudodifferential operators, Israel J. Math. 14 (1973),413–417.

[6] G. Garello, A. Morando, Lp-bounded pseudodifferential operators and regularity formulti-quasi-elliptic equations, Quad. Dip. Mat. Univ. Torino 46/2001, Integral Equa-tions and Operator Theory 51(4) (2005), 501–517.

[7] G. Garello, A. Morando, Lp boundedness for pseudodifferential operators with non-smooth symbols and applications, Quad. Dip. Mat. Univ. Torino 44/2002, to appearon Boll. Un. Mat. It.

[8] L. Hormander, Pseudodifferential operators and hypoelliptic equations, Proc. Symp.Singular Integral AMS 10 (1967), 138–183.

[9] P.I. Lizorkin, (Lp, Lq)-multipliers of Fourier integrals, Dokl. Akad. Nauk SSSR152(1963), 808–811. (Engl. transl. Sov. Math. Dokl. 4 (1963), 1420–1424)

[10] J. Marschall, Pseudo-differential operators with non regular symbols of the class Smρ,δ,

Comm. in Part. Diff. Eq. 12(8) (1987), 921–965. corr. Comm. in Part. Diff. Eq. 13(1)(1988), 129–130.

[11] J. Marschall, Weighted Lp estimates for pseudo-differential operators with non reg-ular symbols, Z. Anal. Anwendungen 10(4)(1991), 493–501.

[12] Y. Meyer, Remarques sur un theoreme de J.-M. Bony, Proceedings of the Seminaron Harmonic Analysis (Pisa, 1980). Rend. Circ. Mat. Palermo (2)suppl. 1 (1981),1–20.

[13] M. Nagase, On a class of Lp-bounded pseudodifferential operators, Sci. Rep. CollegeGen. Ed. Osaka Univ. 33(4)(1985), 1–7.

[14] M. Nagase, On some classes of Lp-bounded pseudodifferential operarators, 23(2)(1986), 425–440.

[15] M. Nagase, On sufficient conditions for pseudodifferential operators to be Lp-bounded. Pseudodifferential operators (Oberwolfach,1986), Lecture Notes in Math.1256, Springer, Berlin 1987.

[16] M. Nagase, On Lp boundedness of a class of pseudodifferential operators. Harmonicanalysis and nonlinear partial differential equations, (Japanese) (Kyoto, 2001).

[17] M. Sugimoto, Lp-boundedness of pseudo-differential operators satisfying Besov esti-mates II, J. Fac. Sci. Univ. Tokyo Sect. IA, Math. 35 (1988), 149–162.

[18] M.E. Taylor, “Pseudodifferential Operators”, Princeton, Univ. Press 1981.

[19] M.E. Taylor, “Pseudodifferential operators and nonlinear PDE”, Birkhauser, Basel-Boston-Berlin, 1991.


[20] H. Triebel, General Function Spaces, III. Spaces Bg(x)p,q and F

g(x)p,q , 1 < p < ∞: basic

properties, Anal. Math. 3(3) (1977), 221–249.

[21] H. Triebel, General Function Spaces, IV. Spaces Bg(x)p,q and F

g(x)p,q , 1 < p < ∞: special

properties, Anal. Math. 3(4) (1977), 299–315.

[22] H. Triebel, “Theory of Function Spaces”, Birkhauser Verlag, Basel, Boston, Stutt-gart, 1983.

Gianluca Garello
Dipartimento di Matematica
Universita di Torino
Via Carlo Alberto 10, I-10123 Torino, Italy
e-mail: [email protected]

Alessandro Morando
Dipartimento di Matematica, Facolta di Ingegneria
Universita di Brescia
Via Valotti 9, I-25133 Brescia, Italy
e-mail: [email protected]


Operator Theory: Advances and Applications, Vol. 160, 217–232
© 2005 Birkhäuser Verlag Basel/Switzerland

A New Proof of an Ellis-Gohberg Theorem on Orthogonal Matrix Functions Related to the Nehari Problem

G.J. Groenewald and M.A. Kaashoek

To Israel Gohberg on the occasion of his 75th birthday, with gratitude and admiration.

Abstract. The state space method for rational matrix functions and a classical inertia theorem are used to give a new proof of the main step in a recent theorem of R.L. Ellis and I. Gohberg on orthogonal matrix functions related to the Nehari problem. Also we comment on a connection with the Nehari–Takagi interpolation problem.

Mathematics Subject Classification (2000). Primary 33C47, 42C05, 47B35;Secondary 47A57.

Keywords. Orthogonal matrix function, inertia theorem, Nehari–Takagi problem.

0. Introduction

This paper concerns the following theorem which was stated and proved in Ellis-Gohberg [4]:

Theorem 0.1. Let k ∈ L^{m×m}_1(a,∞) with a ≥ 0. Assume that there exist solutions g_a in L^{m×m}_1(a,∞) and h_a in L^{m×m}_1(−∞,−a) of the equations

g_a(t) + ∫_a^∞ k(t + s − a) h_a(−s) ds = 0,  (t ≥ a),   (1)

and

∫_a^∞ k(t + s − a)^* g_a(s) ds + h_a(−t) = −k(t)^*,  (t ≥ a).   (2)

The research of the first author is supported by the National Research Foundation, South Africa,under Grant number 2053733.

Page 224: Recent Advances in Operator Theory and its Applications: The Israel Gohberg Anniversary Volume

218 G.J. Groenewald and M.A. Kaashoek

Assume also that there exist γa in Lm×m1 (a,∞) and χa in Lm×m

1 (−∞,−a) satis-fying the equations

γa(t) +∫ ∞

a

k(t + s− a)χa(−s) ds = −k(t), (t ≥ a), (3)

and ∫ ∞

a

k(t+ s− a)∗γa(s) ds + χa(−t) = 0, (t ≥ a). (4)

Define

Φa(λ) = eiλaI +∫ ∞

a

eiλtga(t) dt, (!λ ≥ 0), (5)

and

Θa(λ) = e−iλaI +∫ ∞

a

e−iλtχa(−t) dt, (!λ ≤ 0). (6)

Then Φa(λ) and Θa(λ) are invertible for all real λ, and counting multiplicities, thenumber of zeros of detΦa (respectively, detΘa) in the upper (respectively, lower)half-plane is finite and equals the number of negative eigenvalues of the operator

T =(

I Ka

K∗a I

), (7)

on Lm1 (a,∞)× Lm

1 (−∞,−a). Here

(Kaψ)(t) =∫ ∞

a

k(t+ s− a)ψ(−s) ds, (t ≥ a), (8)

and

(K∗aφ)(−t) =

∫ ∞

a

k(t + s− a)∗φ(s) ds, (t ≥ a), (9)

for φ ∈ Lm1 (a,∞) and ψ ∈ Lm

1 (−∞,−a).1

The proof of this theorem in [4] (see also Chapter 12 of [5]) is given inthree steps. In the first step it is shown that the operator T in (7) is invertiblewhenever the equations (1) and (2) have solutions ga in Lm×m

1 (a,∞) and ha inLm×m

1 (−∞,−a) and the equations (3) and (4) have solutions γa in Lm×m1 (a,∞)

and χa in Lm×m1 (−∞,−a). In the second step T is assumed to be invertible, k is

the limit in Lm×m1 (a,∞) of a sequence kn consisting of continuous functions of

compact support, and the authors show that it suffices to prove the theorem whenk is replaced by kn for n sufficiently large. Since the continuous m × m matrixfunctions with compact support in (a,∞) are dense in Lm×m

1 (a,∞), the secondstep shows that it is enough to prove the theorem for such a function. The latter isdone in Step 3 by converting the operator T into an operator I −B, where B is aself adjoint convolution integral operator on a finite interval with a kernel functiondepending on the difference of arguments. By applying to B the main theorem of[6] the proof is completed.

1In [4] and also in [5] the operator T is considered on Lm×m1 (a,∞)×Lm×m

1 (−∞,−a) but from

the context it is clear that the space Lm1 (a,∞) × Lm

1 (−∞,−a) is meant.

Page 225: Recent Advances in Operator Theory and its Applications: The Israel Gohberg Anniversary Volume

A New Proof of an Ellis-Gohberg Theorem 219

In this paper we replace the role of continuous functions k with compactsupport by functions of the form

k(t) = CetAB, (10)

where A, B and C are matrices of sizes n×n, n×m and m×n, respectively, A isassumed to be stable, i.e., all eigenvalues of A are in the open left half-plane Π−,and the triple (A,B,C) is minimal, i.e.,

n⋂j=0

kerCAj = 0, spanAjBCm : j = 0, . . . , n = Cn. (11)

We shall refer to a function k with such a representation as a stable kernel functionof exponential type on (a,∞) and we call (10) a stable exponential representationof k. Since rational functions with poles off R∪∞ are dense in the Wiener algebraon the real line (see [3, page 63]), stable kernel functions of exponential type on(a,∞) are dense in Lm×m

1 (a,∞), and hence by repeating Step 2 in [4] with sucha k in place of a continuous function with compact support, it is clear that itsuffices to prove the above theorem for functions k of the form (10). We do this byreformulating the underlying problem as a linear algebra problem which is solvedby using classical inertia theorems ([10, page 448]). The choice of the representation(10) is inspired by [8].

The paper consists of four sections (not counting this introduction). In thefirst section we associate with k in (10) the n× n matrix

Ma = In − Pae−aA∗

Qae−aA, (12)

where Pa and Qa are given by

Pa =∫ ∞

a

esABB∗esA∗ds, Qa =

∫ ∞

a

esA∗C∗CesA ds. (13)

Since A is assumed to be stable, Pa and Qa are well-defined n×n matrices, and (11)is equivalent to Pa and Qa being positive definite. We refer to Ma as the indicatorof the operator T in (7) associated to the stable exponential representation of k in(10). We show (Theorem 2.1 below) that T in (7) is invertible if and only if Ma isinvertible, and the number of negative eigenvalues of T is equal to the number ofnegative eigenvalues of Ma, multiplicities taken into account. We also rewrite theequations (1)–(4) in terms of the indicator Ma.

In the second section we show that for Ma invertible, the inertia of Ma is equalto the inertia of the matrix C∗CPa(M−1

a )∗ − A∗. In Section 3 we use this resultand those of Section 1 to prove Theorem 0.1 for the case when k is a stable kernelfunction of exponential type. In the final section we comment on the connectionwith the Nehari–Takagi interpolation problem.

In conclusion we mention that our approach has its roots in the theory ofinput output systems. In fact (see, e.g., [2], page 6), a function k of the form (10)

Page 226: Recent Advances in Operator Theory and its Applications: The Israel Gohberg Anniversary Volume

220 G.J. Groenewald and M.A. Kaashoek

is the impulse response of the system

Σ

x′(t) = Ax(t) + Bu(t), t ≥ 0,y(t) = Cx(t),

at time t to a unit impulse at time 0. Furthermore, the conditions in (11) areequivalent to the requirement that the system Σ is minimal, that is, the order ofA is minimal among all systems with the same impulse response as Σ. When Ais stable, then for a = 0 the matrix Pa in (13) is the controllability gramian andQa is the observability gramian of Σ ([2], page 62), and in that case the operatorKa is the Hankel operator which can be written as the product ΛaΓa, where Γa isthe controllability operator which maps the past input (t < 0) to the present state(t=0), and Λa is the observability operator mapping the present state to futureoutputs. This representation of Ka and the corresponding representation of K∗

a

play an essential role in our analysis (see the proof of Theorem 1.1 below).Finally, for the connection with the theory of orthogonal polynomials we refer

the reader to [5], where Theorem 0.1 is presented as a continuous infinite analogueof Kreın’s theorem [9]. The latter theorem is a generalization of the classical Szegotheorem to the case in which the weight function is not necessarily positive. InTheorem 0.1 the operator T plays the same role as the Toeplitz matrix in Kreın’stheorem. See the first chapter of [5], where these results are described in detailand additional references can be found.

1. The operator T and its indicator

In the sequel we assume throughout that k is a stable kernel function of exponentialtype given by the stable exponential representation (10). In particular, A is stableand (11) holds. Furthermore, Pa and Qa are the n × n matrices defined by (13).Since A and hence also A∗ is stable, it is well–known that Pa and Qa are positivedefinite and satisfy the following Lyapunov equations:

APa + PaA∗ = −eaABB∗eaA∗

, (14)

andA∗Qa + QaA = −eaA∗

C∗CeaA. (15)

Recall that the indicator Ma associated with (10) is given by (12).

Theorem 1.1. Assume k has the stable exponential representation (10). Then theoperator T in (7) is invertible if and only if the corresponding indicator Ma isinvertible and the number of negative eigenvalues of T is equal to the number ofnegative eigenvalues of Ma, multiplicities taken into account.

Proof. Since Ka is of finite rank and T is self adjoint, σ(T ) \ 1 consists of afinite number of eigenvalues of finite multiplicity. Introduce the following auxiliary

Page 227: Recent Advances in Operator Theory and its Applications: The Israel Gohberg Anniversary Volume

A New Proof of an Ellis-Gohberg Theorem 221

operators:

Λa : Cn → Lm1 (a,∞), (Λax)(t) = Ce(t−a)Ax, (t ≥ a)

Λ#a : Cn → Lm

1 (−∞,−a), (Λ#a x)(−t) = B∗e(t−a)A∗

x, (t ≥ a)

Γa : Lm1 (−∞,−a) → Cn, Γaf =

∫ ∞

a

esABf(−s) ds,

Γ#a : Lm

1 (a,∞) → Cn, Γ#a f =

∫ ∞

a

esA∗C∗f(s) ds.

We have

Ka = ΛaΓa, K∗a = Λ#

a Γ#a , and ΓaΛ#

a = Pae−aA∗

, Γ#a Λa = Qae

−aA.

It follows that

T =(

I 00 I

)+(

Λa 00 Λ#

a

)(0 Γa

Γ#a 0

), (16)

Ma = In − ΓaΛ#a Γ#

a Λa. (17)Put

T =(

I 00 I

)+(

0 Γa

Γ#a 0

)(Λa 00 Λ#

a

)=(

I ΓaΛ#a

Γ#a Λa I

). (18)

From (16) it follows that on the domain C \ 1 the operator function λI − T isglobally equivalent (see Section III.2 in [7] for this terminology) to the matrix–valued function λI − T . In particular,

(a) T is invertible if and only if T is invertible,(b) the number of negative eigenvalues of T is equal to the number of negative

eigenvalues of T .Notice that

T =(

I Pae−aA∗

Qae−aA I

)=

(P

12

a 0

0 Q12a

)(I P

12

a e−aA∗Q

12a

Q12a e−aAP

12

a I

)(P

− 12

a 0

0 Q− 1

2a

).

The previous identity is a similarity relation. It follows that (a) and (b) remaintrue if T is replaced by L, where

L =(

I La

L∗a I

), La = P

12

a e−aA∗Q

12a .

Now

L =(

I La

0 I

)(I − LaL

∗a 0

0 I

)(I 0L∗

a I

), (19)

andI − LaL

∗a = P

− 12

a MaP12

a . (20)

Page 228: Recent Advances in Operator Theory and its Applications: The Israel Gohberg Anniversary Volume

222 G.J. Groenewald and M.A. Kaashoek

The relation (19) is a congruence relation, and (20) is a similarity relation. Itfollows that(c) L is invertible if and only if Ma is invertible,(d) the number of negative eigenvalues of L is equal to the number of negative

eigenvalues of Ma multiplicities taken into account.Since (a) and (b) remain true with L in place of T , the theorem is proved. Proposition 1.2. Assume k has the stable exponential representation (10). Thenthere exists ga ∈ Lm×m

1 (a,∞) and ha ∈ Lm×m1 (−∞,−a) such that (1) and (2)

hold if and only if the matrix equation

MaX = −PaC∗ (21)

is solvable. In this case, if X is a solution of (21), then the m×m matrix functions

ga(t) = −Ce(t−a)AX, (22)

ha(−t) = B∗e(t−a)A∗(Qae

−aAX − eaA∗C∗), (23)

satisfy equations (1) and (2), respectively. Furthermore, for this choice of ga thefunction Φa in (5) is given by

Φa(λ) = eiλa[I − iC(λ− iA)−1X ], !λ ≥ 0. (24)

Proof. Suppose that ga and ha satisfy (1) and (2). Define X by

X =∫ ∞

a

esABha(−s) ds. (25)

Then clearly,

ga(t) = −Ce(t−a)A(∫ ∞

a

esABha(−s) ds),

so ga has the representation (22). Next, note that

k(t)∗ = B∗etA∗C∗, t ≥ a. (26)

Substituting (26) for k(t)∗ in (2) yields a formula for ha(−t), namely,

ha(−t) = B∗e(t−a)A∗(∫ ∞

a

esA∗C∗CesA ds)e−aAX − eaA∗

C∗

= B∗e(t−a)A∗(Qae

−aAX − eaA∗C∗).

Thus ha is of the form (23). Next we show that X in (25) is a solution of (21).Indeed, from (25), (23), the first equality in (13) and (12) we get that

X =∫ ∞

a

esABha(−s) ds = (∫ ∞

a

esABB∗e(s−a)A∗ds)(Qae

−aAX − eaA∗C∗)

= (∫ ∞

a

esABB∗esA∗ds)(e−aA∗

Qae−aAX − C∗)

= Pae−aA∗

Qae−aAX − PaC

∗,

which yields (21).

Page 229: Recent Advances in Operator Theory and its Applications: The Israel Gohberg Anniversary Volume

A New Proof of an Ellis-Gohberg Theorem 223

Conversely, suppose that X is a solution of the matrix equation (21), and that ga

and ha are defined by (22) and (23), respectively. Then it follows that

ga(t) +∫ ∞

a

k(t+ s− a)ha(−s) ds

= −Ce(t−a)AX +∫ ∞

a

Ce(t+s−a)ABB∗e(s−a)A∗(Qae

−aAX − eaA∗C∗) ds

= −Ce(t−a)AX + Ce(t−a)A(∫ ∞

a

esABB∗esA∗ds)(e−aA∗

Qae−aAX − C∗)

= −Ce(t−a)AX + Ce(t−a)A(Pae−aA∗

Qae−aAX − PaC

∗)

= −Ce(t−a)AX + Ce(t−a)AX = 0.

So (1) is satisfied. Furthermore, it follows from (13), (22) and (23) that∫ ∞

a

k(t + s− a)∗ga(s) ds + ha(−t)

= −∫ ∞

a

B∗e(t+s−a)A∗C∗Ce(s−a)AX ds + B∗e(t−a)A∗

(Qae−aAX − eaA∗

C∗)

= −B∗e(t−a)A∗(∫ ∞

a

esA∗C∗CesA ds)e−aAX + B∗e(t−a)A∗

×

×(Qae−aAX − eaA∗

C∗)

= −B∗e(t−a)A∗Qae

−aAX + B∗e(t−a)A∗Qae

−aAX −B∗e(t−a)A∗eaA∗

C∗

= −B∗etA∗C∗ = −k(t)∗.

So (2) is satisfied.Finally, let ga be given by (22), and take !λ ≥ 0. Using the Fundamental Theoremof Calculus and the stability of A we get from (5) that

Φa(λ) = eiλaI −∫ ∞

a

eiλtCe(t−a)AX dt

= eiλaI − C(∫ ∞

a

ei(λ−iA)t dt)e−aAX

= eiλaI − C[−i(λ− iA)−1ei(λ−iA)t]∞a e−aAX

= eiλaI − iC(λ− iA)−1eia(λ−iA)e−aAX = eiλa[I − iC(λ− iA)−1X ].

This yields (24) and completes the proof. Put k(t) = k(t)∗ = B∗etA∗

C∗, t ≥ 0, and apply the previous proposition with k

replaced by k. In this way (we omit the details) it is straightforward to derive thefollowing result.

Proposition 1.3. Assume k has the stable exponential representation (10). Thenthere exist γa ∈ Lm×m

1 (a,∞) and χa ∈ Lm×m1 (−∞,−a) such that (3) and (4) hold

if and only if the matrix equation

MaPae−aA∗

U = −Pae−aA∗

QaB (27)

Page 230: Recent Advances in Operator Theory and its Applications: The Israel Gohberg Anniversary Volume

224 G.J. Groenewald and M.A. Kaashoek

is solvable. In this case, if U is a solution of (27), then the m×m matrix functions

χa(−t) = −B∗e(t−a)A∗U, (28)

γa(t) = Ce(t−a)A(Pae−aA∗

U − eaAB), (29)satisfy equations (3) and (4), respectively. Furthermore, for this choice of χa thefunction Θa in (6) is given by

Θa(λ) = e−iλa[I + iB∗(λ + iA∗)−1U ], !λ ≤ 0. (30)

2. The inertia of Ma

We continue with the assumptions that (A,B,C) is a minimal triple and thatA and thus A∗ are both stable. Let Ma be defined by (12), and let the positivedefinite operators Pa and Qa satisfy the Lyapunov equations (14) and (15).

Lemma 2.1. Let Ma be given by (12). Then

MaA−AMa = PaC∗C − eaABB∗Qae

−aA. (31)

Proof. Using (12) and the Lyapunov equations (14) and (15) we readily obtain(31). Indeed,

MaA = A− Pae−aA∗

(QaA)e−aA

= A− Pae−aA∗

(−eaA∗C∗CeaA −A∗Qa)e−aA

= A + PaC∗C + (PaA

∗)e−aA∗Qae

−aA

= A + PaC∗C + (−eaABB∗eaA∗

−APa)e−aA∗Qae

−aA

= A + PaC∗C − eaABB∗Qae

−aA −APae−aA∗

Qae−aA

= AMa + PaC∗C − eaABB∗Qae

−aA,

and so (31) is proved. The next proposition gives necessary and sufficient conditions for the matrix Ma

to be invertible.

Proposition 2.2. The indicator Ma is invertible if and only if the following matrixequations are solvable

MaX = −PaC∗, MaPae

−aA∗U = −Pae

−aA∗QaB. (32)

Proof. Clearly, if Ma is invertible, then the matrix equations in (32) are solvable.Conversely, suppose that the matrix equations (32) are solvable and that Ma isnot invertible. Then there exists a nonzero x in Cn such that x∗Ma = 0. Note thatthis implies from (12) that

x∗eaA = x∗Pae−aA∗

Qa. (33)

Since x∗Ma = 0 and the matrix equations in (32) are solvable, we have

x∗PaC∗ = −x∗MaX = 0, (34)

x∗Pae−aA∗

QaB = −x∗MaPae−aA∗

U = 0. (35)

Page 231: Recent Advances in Operator Theory and its Applications: The Israel Gohberg Anniversary Volume

A New Proof of an Ellis-Gohberg Theorem 225

Using (33) it follows from (35) that

x∗eaAB = 0. (36)

Employing (31), (34) and (36) we conclude that

−x∗AMa = x∗(MaA−AMa)

= x∗PaC∗C − x∗eaABB∗Qae

−aA = 0.

Thus x∗AMa = 0. Repeating this argument with x∗ replaced by x∗A we obtainthat x∗AnMa = 0 for n = 0, 1, 2, . . .. According to (36) with x∗ replaced by x∗An

we havex∗eaAAnB = 0, n = 0, 1, 2, . . . . (37)

But the pair (A,B) satisfies the second identity in (11). Hence (37) implies thatx∗eaA = 0 and thus x∗ = 0. So Ma is invertible. An important ingredient in the sequel is the following inertia theorem. We use thesymbol In M to denote the inertia of the square matrix M .

Theorem 2.3. Assume Ma is invertible, and put A× = A−M−1a PaC

∗C. Then

In Ma = In (−A×)∗. (38)

Proof. The proof is divided into four steps.Step 1. First we show that without loss of generality we may take Pa = I. Indeed,replace the triple (A,B,C) by (A, B, C), where

A = S−1AS, B = S−1B, C = CS, (39)

for some invertible matrix S. Let Pa, Qa and Ma be defined by (13) and (12) withA, B, C in place of A, B, C. From (13) it follows that

Pa =∫ ∞

a

esABB∗esA∗ds =

∫ ∞

a

S−1esASS−1BB∗S∗−1S∗esA∗S∗−1 ds

= S−1(∫ ∞

a

esABB∗esA∗ds)S∗−1

thusPa = S−1PaS

∗−1. (40)Likewise, it follows from (13) that

Qa = S∗QaS. (41)

From (12) it follows that Ma is given by

Ma = In − Pae−aA∗

Qae−aA

= In − (S−1PaS∗−1)S∗e−aA∗

S∗−1(S∗QaS)(S−1e−aAS)

= In − S−1Pae−aA∗

Qae−aAS,

thusMa = S−1MaS. (42)

Page 232: Recent Advances in Operator Theory and its Applications: The Israel Gohberg Anniversary Volume

226 G.J. Groenewald and M.A. Kaashoek

Since Ma is invertible, the same holds true for Ma, and we can define (A)× =

A− Ma

−1PaC

∗C. Note that

(A)× = S−1A×S. (43)

In fact, it follows from (39), (40) and (42) that

S−1A× = S−1A− S−1M−1a PaC

∗C = AS−1 − Ma

−1S−1PaC

∗C

= AS−1 − Ma

−1PaS

∗C∗C = AS−1 − Ma

−1PaC

∗CS−1

= (A− Ma

−1PaC

∗C)S−1 = (A)×S−1,

and hence (43) is proved. It follows from (42) and (43) that

In Ma = In Ma, In ((−A)×)∗ = In (−A×)∗. (44)

Hence it suffices to prove (38) with Ma in place of Ma and (A)× in place of A×.

In particular, by choosing S = P12

a we see from (40) that it suffices to prove thetheorem for Pa = I.Step 2. In the following we assume that Pa = I. Note that this implies that Ma isHermitian. We shall show that the following identity holds:

Ma(−A×) + (−A×)∗Ma = eaABB∗eaA∗+ C∗C, (45)

where now Ma = In − e−aA∗Qae

−aA and A× = A − M−1a C∗C. Note that,

MaA× = MaA−C∗C. On the other hand, by using (14), (15) and (12) we obtain:

(A×)∗Ma = (MaA×)∗ = A∗Ma − C∗C

= A∗ − e−aA∗A∗Qae

−aA − C∗C

= A∗ − e−aA∗(−QaA− eaA∗

C∗CeaA)e−aA − C∗C

= A∗ + e−aA∗Qae

−aAA + C∗C − C∗C= A∗ + (In −Ma)A = A∗ + A−MaA.

Then adding we get (using (14) with Pa = I) that

MaA× + (A×)∗Ma = A∗ + A− C∗C = −eaABB∗eaA∗

− C∗C.

Hence (45) is proved.Step 3. We show that A× has no eigenvalue on iR. Suppose that A× has animaginary eigenvalue, i.e., there exists a nonzero vector x ∈ Cn such that A×x =iαx, α ∈ R. Then using (45) with W = eaABB∗eaA∗

+ C∗C and premultiplyingby x∗ and postmultiplying by x we obtain:

x∗Wx = −x∗(A×)∗Max− x∗MaA×x = iαx∗Max− iαx∗Max = 0.

But then x∗eaABB∗eaA∗x + x∗C∗Cx = 0, i.e., ‖B∗eaA∗

x‖2 + ‖Cx‖2 = 0. ThusCx = 0 and B∗eaA∗

x = 0. Moreover, A×x = iαx and Cx = 0, together imply that

Ax = (A−M−1a C∗C)x = A×x = iαx, α ∈ R.

Page 233: Recent Advances in Operator Theory and its Applications: The Israel Gohberg Anniversary Volume

A New Proof of an Ellis-Gohberg Theorem 227

Hence σ(A) ∩ iR = ∅. This contradicts the stability of A. Therefore, A× has noeigenvalue on iR.Step 4. To finish the proof we use a classical inertia theorem due to D. Carlson andH. Schneider, which can be found in [10, page 448]. We apply this inertia theoremwith

A = (−A×)∗, H = Ma, and W = eaABB∗eaA∗+ C∗C.

From Steps 2–3 we know that A = (−A×)∗ has no eigenvalue on the imaginary axis,and that the Hermitian nonsingular matrix H = Ma satisfies AH + HA∗ = W ,where W = eaABB∗eaA∗

+ C∗C ≥ 0. Thus the Carlson and Schneider inertiatheorem yields In (−A×)∗ = In Ma, and we are done.

3. Proof of Theorem 0.1 for kernel functionsof stable exponential type

Throughout this section k is given by the stable exponential representation (10)with (A,B,C) being a minimal triple. In particular, A and hence A∗ are stable.Let Ma be given by (12) and let the positive definite operators Pa and Qa satisfythe Lyapunov equations (14) and (15).

Proof of Theorem 0.1 for k as in (10). We divide the proof into six steps.Step 1. Let !λ ≥ 0. From (21) and (24) we get that

detΦa(λ) = det eiλaI det[I + iC(λ− iA)−1M−1a PaC

∗]

= det eiλaI det[I + i(λ− iA)−1M−1a PaC

∗C]

= det eiλaI det(λ− iA)−1 det[λ− i(A−M−1a PaC

∗C)].

Thereforedet Φa(λ) = det eiλaI det(λ− iA)−1 det(λ− iA×), (46)

where A× = A − M−1a PaC

∗C. We know that | det eiλaI| = 0 and thatdet(λ − iA)−1 = 0 since A is stable. Hence from (46), we see that det Φa(λ) = 0for !λ ≥ 0 if and only if det(λ − iA×) = 0 for !λ ≥ 0.Step 2. From Step 3 of Theorem 2.3 we know that σ(A×) ∩ iR = ∅, equivalentlythat σ(iA×)∩R = ∅, hence det(λ−iA×) = 0 for each λ ∈ R. It follows immediatelyfrom (46) that detΦa(λ) = 0 for each λ ∈ R. So Φa(λ) is invertible for each λ ∈ R.Step 3. It follows from (46) and the Inertia Theorem 2.3 that

# zeros of detΦa in the upper half-plane == # eigenvalues of iA× in the upper half-plane= # eigenvalues of (−A×)∗ in the left half-plane= # negative eigenvalues of Ma .

Step 4. Suppose that equations (1)–(4) are satisfied. Then the matrix equations(21) and (27) are both solvable (see Propositions 1.2 and 1.3). Therefore by Propo-sition 2.2, Ma is invertible. Hence T is invertible by Theorem 1.1.

Page 234: Recent Advances in Operator Theory and its Applications: The Israel Gohberg Anniversary Volume

228 G.J. Groenewald and M.A. Kaashoek

Step 5. Next, it follows from Theorem 1.1 and the result of Step 3 above that

# negative eigenvalues of T = # negative eigenvalues of Ma

= # zeros of detΦa(λ) in the upper half-plane.

Step 6. Finally, we prove the statement about the number of zeros of detΘa. Infact, we first replace the system triple (A,B,C) by (A∗, C∗, B∗). Define P#

a , Q#a

and M#a by (13) and (12) with (A∗, C∗, B∗) in place of (A,B,C). Then, clearly

P#a = Qa, Q#

a = Pa and M#a is defined by

M#a = In −Qae

−aAPae−aA∗

. (47)

Observe that(M#

a )−1 = Qae−aAM−1

a eaAQ−1a . (48)

Next, it follows from (24) that the transformed function Φ#a (λ) with (A∗, C∗, B∗)

and (P#a , Q#

a ,M#a ) instead of (A,B,C) and (Pa, Qa,Ma), respectively, is defined

byΦ#

a (λ) = eiλa[I + iB∗(λ− iA∗)−1(M#a )−1P#

a B], !λ ≥ 0. (49)

Using (48) and P#a = Qa we can recast Φ#

a (λ) as:

Φ#a (λ) = eiλa[I + iB∗(λ− iA∗)−1Qae

−aAM−1a eaAB], !λ ≥ 0.

Then by comparing the realization formulas for Φ#a above and Θa in (30) we see

thatQae

−aAM−1a eaAB = eaA∗

P−1a M−1

a Pae−aA∗

QaB. (50)

Indeed, first note that

M#a = eaA∗

P−1a MaPae

−aA∗. (51)

Taking inverses of both sides of equation (51) yields:

(M#a )−1 = eaA∗

P−1a M−1

a Pae−aA∗

. (52)

So, using (48) and (52) proves (50). Clearly, from (50) we see that

Φ#a (−λ) = Θa(λ). (53)

Formula (53) and Step 5 together imply that:

# zeros of detΘa in the lower half-plane

= # zeros of detΦ#a in the upper half-plane

= # eigenvalues of (−iA×)∗ in the upper half-plane

= # eigenvalues of iA× in the upper half-plane= # zeros of detΦa in the upper half-plane= # negative eigenvalues of T .

Page 235: Recent Advances in Operator Theory and its Applications: The Israel Gohberg Anniversary Volume

A New Proof of an Ellis-Gohberg Theorem 229

4. A connection with the Nehari–Takagi interpolation problem

We conclude with some comments on the connection with the Nehari–Takagi in-terpolation problem. We consider the rational matrix case, that is, we use theformulation of the Nehari–Takagi problem given in [1, page 452]). Thus K is arational m×m matrix2 function given by a minimal realization of the form

K(z) = C(zI −A)−1B, (54)

where σ(A) ⊂ Π−. Here Π− denotes the open left half-plane. We want to obtainall rational matrix functions F of the form F = K + R such that R is an m×mrational matrix function with at most κ poles in Π− and

‖F‖∞ = sup‖F (z)‖ : z ∈ iR ≤ 1.

If κ = 0, we obtain the Nehari problem, (see [1, page 443]).The solution of the above Nehari–Takagi problem is given by the following

theorem (see [1, page 452]).

Theorem 4.1. Let (54) be a minimal realization for a rational m×m matrix func-tion K with σ(A) ⊂ Π−. Let P and Q be the controllability and observabilitygramians corresponding to (54), that is

P =∫ ∞

0

esABB∗esA∗ds, Q =

∫ ∞

0

esA∗C∗CesA ds.

Assume that 1 is not an eigenvalue of PQ. Then there is a rational matrix functionR with at most κ poles (counted with multiplicities) in Π− such that

‖K + R‖∞ ≤ 1 (55)

if and only if the matrix PQ has at most κ eigenvalues (counted with multiplicities)bigger than 1. Moreover, if κ0 is the number of eigenvalues of PQ bigger than 1,then the rational matrix functions F = K +R satisfying (55) and such that R hasprecisely κ0 poles in Π−, are given by the linear fractional formula

F = (Θ11G + Θ12)(Θ21G + Θ22)−1, (56)

where G is an arbitrary rational m×m matrix function satisfying

supz∈Π−

‖G(z)‖ ≤ 1. (57)

Here

Θ(z) =(

IM 00 IN

)+(

C 00 B∗

)((zI −A)−1 0

0 (zI + A∗)−1

)×(

−ZP ZZ∗ −QZ

)(−C∗ 0

0 B

)where Z = (I − PQ)−1.

2In [1] the matrix functions are allowed to be non-square but in this section we restrict ourselvesto the square case.

Page 236: Recent Advances in Operator Theory and its Applications: The Israel Gohberg Anniversary Volume

230 G.J. Groenewald and M.A. Kaashoek

Let us associate with the minimal realization (54) the kernel function k(t) =CetAB, t ≥ 0. Then for this k the entries Θij , 1 ≤ i, j ≤ 2, in the 2 × 2 blockcoefficient matrix

Θ(z) =(

Θ11(z) Θ12(z)Θ21(z) Θ22(z)

)in Theorem 4.1 are closely related to functions ga, ha, γa, χa (with a = 0) appearingin Theorem 0.1. In fact we have the following proposition.

Proposition 4.2. Given the stable minimal realization (54), put k(t) = CetAB, t ≥0. Then the entries Θij , 1 ≤ i, j ≤ 2, in the 2 × 2 block coefficient matrix Θ aregiven by

Θ11(z) = I +∫ ∞

0

e−ztg(t)d t, z ≥ 0,

Θ21(z) = −∫ ∞

0

ezth(−t)d t, z ≤ 0,

Θ12(z) = −∫ ∞

0

e−ztγ(t)d t, z ≥ 0,

Θ22(z) = I +∫ ∞

0

eztχ(−t)d t, z ≤ 0.

Here the functions g, h, γ, χ are, respectively, equal to the functions ga, ha, γa, χa,with a = 0, appearing in Theorem 0.1 with k(t) = CetAB.

Proof. First notice that P = P0 and Q = Q0. Since M0 = I − P0Q0, we see thatZ = M−1

0 . From Z = (I − PQ)−1, it also follows that

ZP = PZ∗, QZ = Z∗Q. (58)

Now take g = g0. Then we can use (24), (21) and (5) to show that

I +∫ ∞

0

e−ztg(t)d t = I − iC(izI − iA)−1(−ZPC∗) = Θ11(z), z ≥ 0.

In the same way, using the identities in (58), we see that with χ = χ0 the formulas(30), (27) and (6) yield

I +∫ ∞

0

eztχ(−t)d t = I + iB∗(izI + iA∗)−1(−Z∗QB) = Θ22(z), z ≤ 0.

To get the two remaining formulas we first use formulas (21) and (27) toshow that for a = 0 we have

Q0X − C∗ = −Z∗C∗, P0U −B = −ZB.

But then we can use (23) and (29) to prove that

h(−t) = h0(−t) = −B∗etA∗Z∗C∗, γ(t) = γ0(t) = −CetAZB (t ≥ 0).

Page 237: Recent Advances in Operator Theory and its Applications: The Israel Gohberg Anniversary Volume

A New Proof of an Ellis-Gohberg Theorem 231

Using these identities together with the stability of A and A∗ we obtain

−∫ ∞

0

ezth(−t)d t = −B∗(z + A∗)−1Z∗C∗ = Θ21(z), z ≤ 0,

−∫ ∞

0

e−ztγ(−t)d t = C(z −A)−1ZB = Θ12(z), z ≥ 0,

which completes the proof.

The preceding proposition, together with the approximation argument de-scribed in Section 12.3 of [5], can be used to obtain the solution of the Nehari-Takagi problem in a Wiener algebra setting.

References

[1] J.A. Ball, I. Gohberg and L. Rodman, Interpolation of rational matrix functions, OT45, Birkhauser Verlag, Basel, 1990.

[2] M.J. Corless, A.E. Frazho: Linear systems and control, Marcel Dekker, Inc., NewYork, NY, 2003.

[3] K. Clancey and I. Gohberg, Factorization of matrix functions and singular integraloperators, OT 3, Birkhauser Verlag, Basel, 1981.

[4] R.L. Ellis and I. Gohberg, Distribution of zeros of orthogonal functions related tothe Nehari problem, in: Singular integral operators and related topics. Joint German-Israeli Workshop, OT 90, Birkhauser Verlag, Basel, 1996, pp. 244–263.

[5] R.L. Ellis and I. Gohberg, Orthogonal Systems and Convolution Operators, OT 140,Birkhauser Verlag, Basel, 2003.

[6] R.L. Ellis, I. Gohberg and D.C. Lay, Distribution of zeros of matrix-valued con-tinuous analogues of orthogonal polynomials, in: Continuous and discrete Fouriertransforms, extension problems and Wiener-Hopf equations, OT 58, Birkhauser Ver-lag, Basel, 1992, pp. 26–70.

[7] I. Gohberg, S. Goldberg and M.A. Kaashoek, Classes of Linear operators I, OT 49,Birkhauser Verlag, Basel, 1990.

[8] I. Gohberg, M.A. Kaashoek and F. van Schagen, On inversion of convolution integraloperators on a finite interval, in: Operator Theoretical Methods and Applications toMathematical Physics. The Erhard Meister Memorial Volume, OT 147, BirkhauserVerlag, Basel, 2004, pp. 277–285.

[9] M.G. Kreın, On the location of roots of polynomials which are orthogonal on theunit circle with respect to an indefinite weight, Teor. Funkcii, Funkcional. Anal. iPrilozen 2 (1966), 131-137 (Russian).

[10] P. Lancaster and M. Tismenetsky, The theory of matrices with applications, SecondEdition, Academic Press, 1985.

Page 238: Recent Advances in Operator Theory and its Applications: The Israel Gohberg Anniversary Volume

232 G.J. Groenewald and M.A. Kaashoek

G.J. GroenewaldDepartment of MathematicsNorth-West UniversityPrivate Bag X6001Potchefstroom 2520, South Africae-mail: [email protected]

M.A. KaashoekAfdeling Wiskunde, Faculteit der Exacte WetenschappenVrije UniversiteitDe Boelelaan 1081a1081 HV Amsterdam, The Netherlandse-mail: [email protected]

Page 239: Recent Advances in Operator Theory and its Applications: The Israel Gohberg Anniversary Volume

Operator Theory:Advances and Applications, Vol. 160, 233–252c© 2005 Birkhauser Verlag Basel/Switzerland

Schur-type Algorithms for the Solution ofHermitian Toeplitz Systems via Factorization

Georg Heinig and Karla Rost

Dedicated to our teacher and friend Israel Gohberg

Abstract. In this paper fast algorithms for the solution of systems Tu =b with a strongly nonsingular hermitian Toeplitz coefficient matrix T viadifferent kinds of factorizations of the matrix T are discussed. The first aim isto show that ZW-factorization of T is more efficient than the correspondingLU-factorization. The second aim is to design and compare different Schur-type algorithms for LU- and ZW-factorization of T . This concerns the classicalSchur-Bareiss algorithm, 3-term one-step and double-step algorithms, and theSchur-type analogue of a Levinson-type algorithm of B. Krishna and H. Krish-na. The latter one reduces the number of the multiplications by almost 50%compared with the classical Schur-Bareiss algorithm.

Mathematics Subject Classification (2000). Primary 15A23; Secondary 65F05.

Keywords. Toeplitz matrix, Schur algorithm, LU-factorization, ZW-factori-zation.

1. Introduction

This paper is dedicated to fast algorithms for the solution of linear systems ofequations Tu = b with a nonsingular hermitian Toeplitz matrix T = [ ai−j ]ni,j=1,a−j = aj . We assume that T is strongly nonsingular, which means that all lead-ing principal submatrices Tk = [ ai−j ]ki,j=1 (k = 1, . . . , n) are nonsingular. Thiscondition is in particular fulfilled if T is positive definite.

There are mainly two types of direct algorithms to solve a system Tu = bwith computational complexity O(n2): Levinson-type and Schur-type. The originalLevinson algorithm is closely related to Szego’s recursion formulas for orthogonalpolynomials on the unit circle and to factorizations of the inverse matrix T−1.A Levinson-type algorithm can be combined with an inversion formula, like theGohberg-Semencul formula. Schur-type algorithms are related to factorizations of

Page 240: Recent Advances in Operator Theory and its Applications: The Israel Gohberg Anniversary Volume

234 G. Heinig and K. Rost

the matrix itself. Using the factorization of the matrix a linear system of equationscan be solved by back substitution.

Practical experience (see [26]) and theoretical results (see [4], [9], [5]) indi-cate that Schur-type algorithms have, in general, a better stability behavior thanLevinson-type algorithms. For this reason we restrict ourselves in this paper toSchur-type algorithms, i.e., algorithms for fast factorization of T , despite theyhave, in general, a higher computational complexity. However, we develop Schur-type algorithms on the basis of their corresponding Levinson-type counterpart, sothat in this paper also Levinson-type algorithms can be found despite it is notalways mentioned explicitly.

The classical Schur algorithm in its original form is an algorithm in complexfunction theory (see [19] and references therein) to decide whether an analyticfunction maps the unit disk into itself. But it can also be applied to compute theLU-factorization of a Toeplitz matrix. An algorithm for solving Toeplitz systemsvia factorization was first proposed by Bareiss in [1] (not mentioning Schur).

The property of a matrix to be hermitian is reflected in the LU-factorizationin such a way that the U-factor is the adjoint of the L-factor. But hermitian Toe-plitz matrices have an additional symmetry property. They are centro-hermitian.This means that JnTJn = T . Here Jn denotes the n×n matrix of counteridentitywith ones on the antidiagonal and zeros elsewhere. The bar denotes the matrix withconjugate complex entries. This property is not reflected in the LU-factorizationin an obvious way.

For this reason we consider another type of factorization: the ZW-factoriza-tion. This is a representation of T in the form T = ZXZ∗, in which Z is a Z- andX an X-matrix (for the definition of these concepts see Section 2). The factors Zand X in this factorization will also be centro-hermitian. The ZW-factorizationis closely related to the “quadrant interlocking” or WZ-factorization, which wasoriginally introduced and studied by D. J. Evans and his coworkers for the paral-lel solution of tridiagonal systems (see [24], [8] and references therein). While theLU-factorization of a matrix A = [ aij ]ni,j=1 relies on the leading principal sub-matrices Ak = [ aij ]ki,j=1, k = 1, . . . , n, the ZW-factorization relies on the centralsubmatrices [ aij ]n+1−l

i,j=l (l = 1 . . . , [(n + 1)/2]) .The ZW-factorization for real symmetric Toeplitz matrices was first men-

tioned by C. J. Demeure in [7]. In our papers [15] (see also [18]), [16] and [17] ZW-factorizations for skewsymmetric Toeplitz, centrosymmetric and centro-skewsym-metric Toeplitz–plus–Hankel, and general Toeplitz–plus–Hankel matrices were de-scribed, respectively. Note that for skewsymmetric Toeplitz matrices (and with itfor purely imaginary hermitian Toeplitz matrices) the factors of the ZW-factor-ization have some surprising additional symmetry properties which are not sharedby the symmetric case.

The first aim of the present paper is to show that for hermitian Toeplitzmatrices the ZW-factorization leads to more efficient algorithms for the solutionof linear systems than LU-factorization. In Section 2 we consider three types of

Page 241: Recent Advances in Operator Theory and its Applications: The Israel Gohberg Anniversary Volume

Schur-type algorithms for Hermitian Toeplitz Systems 235

factorizations for matrices which are hermitian and centro-hermitian. We showthat the ZW-factorization reflects, in contrast to the LU-factorization, both sym-metry properties. This leads to a computational gain in solving linear systems.The number of additions can still be slightly reduced if instead of the standardZW-factorization a modification, which we call “column conjugate-symmetric ZW-factorization”, is considered. In Section 3 the factors of the factorizations are de-scribed in terms of the residuals of solutions of special systems.

The second aim of the paper is to present and to compare different algorithmsfor LU- and ZW-factorization of hermitian Toeplitz matrices. We present first theclassical Schur algorithm in Section 4, then a one-step Schur algorithm based on3-term recursions in Section 5, and a double-step version of it in Section 6. Thesealgorithms are somehow related to the split Schur algorithm for real symmetricToeplitz matrices of Delsarte and Genin presented in [6].

The split Schur algorithm for real symmetric Toeplitz matrices requires, com-pared with the classical Schur algorithm, only half of the number of multiplicationswhile keeping the number of additions. This gain is not achieved for the algorithmsin Sections 5 and 6. Actually, the split Schur algorithm cannot be directly gener-alized from the real-symmetric to the hermitian case. However, algorithms witha saving of about 50% of the multiplications do exist in the literature (see [22],[21], [3]).

In Section 7 we recall a Levinson-type algorithm presented in [21] and derivea Schur version of it, which will be called Krishna-Schur algorithm and whichis new. We show how ZW-factorizations of T are obtained with the help of thedata computed in this algorithm. In Section 8 we compare the complexity of allalgorithms for the solution of hermitian Toeplitz systems. It turns out that theKrishna-Schur algorithm gives the lowest computational amount.

For sake of simplicity of notation we assume that n is even, n = 2m, through-out the paper. The case of odd n can be considered in an analogous way.

We use the following notations:

– 0k will be a zero vector of length k.– 1k will be a vector of length k with all components equal to 1.– ek will be the kth vector in the standard basis of Cn.

2. LU- versus ZW-factorization

Throughout this section, let A = [ aij ]ni,j=1 be a nonsingular hermitian matrix thatis also centro-hermitian where n is even and m = n/2. We compare the efficiencyof three kinds of factorizations of A for solving a system Au = b, namely the clas-sical LU-factorization and two types of ZW-factorizations with different symmetryproperties. By “efficiency” we mean here in the first place the computational com-plexity of an algorithm for solving a linear system with coefficient matrix A. Thiscomplexity will be O(n2). Therefore, we care only for the n2-term and neglect

Page 242: Recent Advances in Operator Theory and its Applications: The Israel Gohberg Anniversary Volume

236 G. Heinig and K. Rost

lower order terms. Furthermore we will compare the storage requirement (numberof real parameters) for the factorizations which is also O(n2).

CA and CM will stand for complex additions and multiplications, respec-tively, and RA and RM for their real counterparts. We count 1 CA as 2 RA and1 CM as 4 RM plus 2 RA.

2.1. LU-factorization

If A is a strongly nonsingular matrix, then it admits an LU-factorization A =LDL∗, in which D is real diagonal and L is lower triangular. Among the LU-factorizations there is a unique one in which the matrix L has ones on the maindiagonal. This will be referred to as the unit LU-factorization. If A = L0D0L

∗0 is

the unit LU-factorization of A, then the factorization A = LDL∗ with L = L0D0

and D = D−10 will be called standard LU-factorization. The reason for introducing

this concept is that it is often more efficient to produce the standard than the unitLU-factorization.

If an LU-factorization of A is known, then a system Au = b can be solvedvia the solution of two triangular systems and a diagonal system. The diagonalsystem can be solved with O(n) complexity, so it can be neglected in the complexityestimation. The solution of a complex triangular system requires 0.5n2 CM plus0.5n2 CA, which is equivalent to 2n2 RA and 2n2 RM.

Proposition 2.1. If an LU-factorization of A is known, then the solution of Au = brequires 4n2 RM and 4n2 RA.

The property of the matrixA to be centro-hermitian must be somehow hiddenin the structure of the factor L. In the general case the characterization of the factorL for a centro-hermitian A is, to the best of our knowledge, unknown. But in thecase of a hermitian Toeplitz matrix some relations between the entries of L followfrom the theory of orthogonal polynomials on the unit circle. (see [23], relation(1.3) on p.113). However, it is not clear how to get any computational advantagefrom it. Thus the number of real parameters in an LU-factorization, which is thestorage requirement, will be about n2.

2.2. Centro-hermitian ZW-factorization

Since ZW-factorizations are not commonly used, we recall the basic concepts (com-pare [24], [8] and references therein). A matrix A = [ aij ]ni,j=1 is called a W-matrix(or a bow tie matrix) if aij = 0 for all (i, j) for which i > j and i + j > n ori < j and i + j ≤ n . The matrix A will be called a unit W-matrix if in additionaii = 1 and ai,n+1−i = 0 for i = 1, . . . , n. The transpose of a W-matrix is called aZ-matrix (or hourglass matrix). A matrix which is both a Z- and a W-matrix willbe called an X-matrix. A matrix which is either a Z-matrix or a W-matrix will becalled butterfly matrix.

Page 243: Recent Advances in Operator Theory and its Applications: The Israel Gohberg Anniversary Volume

Schur-type algorithms for Hermitian Toeplitz Systems 237

These names are suggested by the shapes of the set of all possible positionsfor nonzero entries, which are as follows:

W =

⎡⎢⎢⎢⎢⎢⎢⎣• •• •• •• • • •• • • •• •

⎤⎥⎥⎥⎥⎥⎥⎦ , Z =

⎡⎢⎢⎢⎢⎢⎢⎣• • • • • •

• ••

• • • • • • •

⎤⎥⎥⎥⎥⎥⎥⎦ ,

X =

⎡⎢⎢⎢⎢⎢⎢⎣• •

• •• •• •

• •• •

⎤⎥⎥⎥⎥⎥⎥⎦ .

A representation of a nonsingular matrix A in the form A = ZXW in which Zis a Z-matrix, W a W-matrix and X an X-matrix is called a ZW-factorizationof A. It is called unit ZW-factorization if Z is a unit Z-matrix and W is a unitW-matrix. Clearly, if A admits a ZW-factorization, then it admits a unique unitZW-factorization.

A necessary and sufficient condition for a matrix A = [ ajk ]nj,k=1 to admit aZW-factorization is that the central submatrices Ac

n+2−2l = [ ajk ]n+1−lj,k=l are non-

singular for l = 1, . . . ,m = n2 . A matrix with this property will be called centro-

nonsingular.Since for a Toeplitz matrix the central submatrices coincide with principal

submatrices, a strongly nonsingular Toeplitz matrix is also centro-nonsingular.It follows from the uniqueness of the unit ZW-factorization that if a centro-

nonsingular matrix A is hermitian and centro-hermitian, then its unit ZW-factori-zation is of the form A = Z0X0Z

∗0 , in which Z0 is centro-hermitian and X0 is

hermitian and centro-hermitian. That means that in the unit ZW-factorizationboth the hermitian and the centro-hermitian structure are reflected. Thus thenumber of real parameters that characterize a centro-hermitian butterfly matrixis only about half the number of parameters describing a triangular matrix.

Any ZW-factorization in which Z is centro-hermitian will be called centro-hermitian ZW-factorization. If A = Z0X0Z

∗0 is the unit ZW-factorization of A,

then A = ZXZ∗ with Z = Z0X0 and X = X−10 will be called standard ZW-

factorization. Clearly, it is also centro-hermitian.If a centro-hermitian ZW-factorization of A is known, then a system Au = b

can be solved via the solution of two centro-hermitian butterfly systems and an X-system. The X-system can be solved with O(n) complexity, so it can be neglectedin the complexity estimation. We show how a centro-hermitian Z-system Zu = bcan be solved and estimate the complexity.

Page 244: Recent Advances in Operator Theory and its Applications: The Israel Gohberg Anniversary Volume

238 G. Heinig and K. Rost

For u ∈ Cn, we denote by u# the vector u# = Jnu. A vector u ∈ Cn is calledconjugate-symmetric if u = u#.

First we observe that a centro-hermitian matrix transforms conjugate-sym-metric vectors into conjugate-symmetric ones, so that the solution u of Zu = bwith a conjugate-symmetric b is conjugate-symmetric again.

The solution of a linear system Zu = b with general b can be reduced tothe solution of two systems with conjugate-symmetric right-hand sides. For thiswe represent b = b+ + ib−, where b+ = 1

2 (b + b#) and b− = 12i (b− b#). Then

b± are conjugate-symmetric, and the solution u is obtained from the solutions ofZu± = b± via u = u+ + iu−.

We consider now a system Zu+ = b+ with a conjugate-symmetric right-hand

side b+ =[

c#

c

], c ∈ Cm. Suppose that the solution is u+ =

[v#

v

]for some

v ∈ Cm.A centro-hermitian Z-matrix is of the form

Z =[JmL0Jm JmL1

L1Jm L0

], (2.1)

where L0 and L1 are lower triangular. Hence Zu+ = b+ is equivalent to

L1v + L0v = c . (2.2)

Let the subscript r designate the real and i the imaginary part of a vector orof a matrix. Then (2.2) is equivalent to

(L0,r + L1,r)vr + (−L0,i + L1,i)vi = cr ,

(L0,i + L1,i)vr + (L0,r − L1,r)vi = ci .

This can be written as a real Z-system

Z ′[Jmvi

vr

]=[Jmci

cr

], (2.3)

where

Z ′ =[Jm(L0,r − L1,r)Jm Jm(L0,i + L1,i)(−L0,i + L1,i)Jm L0,r + L1,r

].

Here all matrices L are lower triangular.In this way the solution of the complex system Zu = b is reduced to the

solution of two systems (2.3) with a real Z-coefficient matrix. Since a Z-system isequivalent to a block triangular system with 2 × 2 blocks the solution of such asystem requires 0.5n2 RM and 0.5n2 RA. Thus for 2 systems n2 RM plus n2 RAare sufficient.

Similar arguments can be used for W-systems with the coefficient matrix Z∗.For building the matrix Z ′ we need 4 additions of m×m real, lower triangular

matrices, which results in 2m2 = 0.5n2 real additions. Let us summarize.

Proposition 2.2. If a centro-hermitian ZW-factorization of A is known, then thesolution of Au = b requires 2n2 RM and 2.5n2 RA.

Page 245: Recent Advances in Operator Theory and its Applications: The Israel Gohberg Anniversary Volume

Schur-type algorithms for Hermitian Toeplitz Systems 239

2.3. Column conjugate-symmetric ZW-factorization

We show now that the number of additions can be still reduced if another kind ofZW-factorization of A is given.

We introduce the n× n X-matrix

Σ =

⎡⎢⎢⎢⎢⎢⎢⎢⎢⎣

−i 1. . . . .

.

−i 1i 1

. .. . . .

i 1

⎤⎥⎥⎥⎥⎥⎥⎥⎥⎦.

Obviously, Σ−1 = 12 Σ∗.

If Z is an n × n centro-hermitian Z-matrix, then the matrix Zh = ZΣ hasthe property JnZh = ZΣ = Zh. That means that Zh has conjugate-symmetriccolumns. Let us call a matrix with this property column conjugate-symmetric. Ifmoreover the X-matrix built from the diagonal and antidiagonal of Zh is equal toΣ, then Zh will be referred to as unit.

A centro-hermitian ZW-factorization A = ZXZ∗ can be transformed into aZW-factorization A = ZhXhZ

∗h in which Zh is unit column conjugate-symmetric.

We will call this factorization unit column conjugate-symmetric ZW-factorization.The standard column conjugate-symmetric ZW-factorization A = ZXZ∗ is givenby Z = ZhXh and X = X−1

h . Concerning the factor Xh we obtain Xh = 14 Σ∗XΣ.

For X is hermitian, Xh is hermitian. Moreover, Xh is real. In fact, we have

Xh =14

Σ∗XΣ =

14

Σ∗JnXJnΣ =

14

Σ∗XΣ = Xh .

Obviously, a column conjugate-symmetric Z-matrix Zh has the form

Zh =[JmL1Jm JmL0

L1Jm L0

], (2.4)

where L0 = L0,r + iL0,i, L1 = L1,r + L1,i are lower triangular matrices, L0,r andL1,i are unit, and L0,i and L1,r have zeros on their main diagonal.

From this representation it can be seen that Zh transforms real vectors toconjugate-symmetric vectors, so that the solution of Zhu = b+ with conjugate-symmetric b+ is real.

Suppose that b+ =[

c#

c

], c = cr +ici, and let u =

[vw

]with v, w ∈ Rm

be the solution of Zhu = b. Then we have

Zh

[vw

]=[Jmcr

cr

]+ i

[−Jmci

ci

],

which is equivalent to

L1,iJmv + L0,iw = ci ,

L1,rJmv + L0,rw = cr .

Page 246: Recent Advances in Operator Theory and its Applications: The Israel Gohberg Anniversary Volume

240 G. Heinig and K. Rost

This system can be written as a real unit Z-system

Z ′h

[vw

]=[Jmci

cr

],

where

Z ′h =

[JmL1,iJm JmL0,i

L1,rJm L0,r

].

In this way the solution of a system Zhu = b is reduced, like in 2.2, to two realunit Z-systems. In contrast to 2.2 no additional amount is necessary to built thematrix Z ′

h. The same can be done for a system with the coefficient matrix Z∗h. Let

us summarize.

Proposition 2.3. If a column conjugate-symmetric ZW-factorization of the matrixA is known, then the solution of Au = b requires 2n2 RM and 2n2 RA.

3. Description of the factors

We show how the factors in the three types of factorizations can be characterizedvia the solutions of some equations. As in the previous section, let A = [ aij ]ni,j=1

be nonsingular hermitian and centro-hermitian, n even and m = n/2. Besides Awe consider the leading principal submatrices Ak = [ aij ]ki,j=1 for k = 1, . . . , n andthe central submatrices Ac

2k = [ aij ]n+1−li,j=l , k + l = m + 1 for k = 1, . . . ,m. All

general observations we specify for the case of a strongly nonsingular hermitianToeplitz matrix T = [ ai−j ]ni,j=1.

3.1. LU-factorization

For strongly nonsingular A, we consider equations

Akuk = ρk ek (k = 1, . . . , n) , (3.1)

where ρk are nonzero real numbers. Then the factors of the LU-factorization A =LDL∗ can be characterized as follows. The kth column of L is given by Lek =

A

[uk

0n−k

]and

D = diag (ξ−1k ρ−1

k )nk=1 ,

where ξk is the last component of uk. If we choose ρk = 1 for all k, then weobtain the unit LU-factorization. If we demand that ξk = 1 for all k we obtain thestandard LU-factorization. In this specific case we write xk instead of uk.

Consider the case of a Toeplitz matrix T . We introduce the residuals

r+jk =[ak+j−1 . . . aj

]xk (3.2)

for j = 0, . . . , n− k. By definition, r+0k = ρk, and ρk is real. The kth column of Lconsists of k − 1 zeros and the numbers r+jk for j = 0, . . . , n− k.

As a conclusion we can state: The standard LU-factorization of T is given ifthe residuals r+jk for j = 0, . . . , n− k and k = 1, . . . , n are known .

Page 247: Recent Advances in Operator Theory and its Applications: The Israel Gohberg Anniversary Volume

Schur-type algorithms for Hermitian Toeplitz Systems 241

3.2. Centro-hermitian ZW-factorization

For centro-nonsingular A, we consider equations of the form

Ac2kwk = ρ−k e1 + ρ+

k e2k (k = 1, . . . ,m)

where ρ±k are numbers satisfying |ρ+k | = |ρ−k |. This condition guarantees that wk

and w#k are linearly independent. Then

Zem+k = A

⎡⎣ 0m−k

wk

0m−k

⎤⎦is the (m+k)th column of Z for k = 1, . . . ,m. The remaining columns are obtainedusing the property of Z to be centro-hermitian by

Zem+1−k = (Zem+k)# .

In order to describe the X-factor we introduce a notation for X-matrices that

is analogous to the “diag” notation for diagonal matrices. If Mk =[αk βk

γk δk

](k = 1, . . . ,m), then we set

xma(Mk)mk=1 =

⎡⎢⎢⎢⎢⎢⎢⎢⎢⎣

αm βm

. . . . ..

α1 β1

γ1 δ1

. .. . . .

γm δm

⎤⎥⎥⎥⎥⎥⎥⎥⎥⎦.

Clearly, xma(Mk)mk=1 is nonsingular if and only if all Mk are nonsingular and

(xma(Mk)mk=1)

−1 = xma(M−1k )m

k=1 .

Now the X-factor is given by

X = xma

⎛⎝[ ξ+

k ξ−kξ−k ξ+

k

]−1 [ρ+

k ρ−kρ−k ρ+

k

]−1⎞⎠m

k=1

, (3.3)

where ξ+k is the last and ξ−k is the first component of wk.

We obtain the factors of the unit ZW-factorization for ρ+k = 1, ρ−k = 0 and

the standard ZW-factorization for the choice ξ+k = 1, ξ−k = 0.

A crucial observation is that for a Toeplitz matrix T we have T c2k = T2k. Let

xk denote, as in the previous subsection, the solution of Tkxk = ρk ek with lastcomponent equal to 1, and let r+jk be defined by (3.2). Besides the numbers r+jk weconsider the residuals

r−jk =[ak+j−1 . . . aj

]x#

k =[aj . . . ak+j−1

]xk (3.4)

Page 248: Recent Advances in Operator Theory and its Applications: The Israel Gohberg Anniversary Volume

242 G. Heinig and K. Rost

for j = 0, . . . , n− k. Then r−0k = 0 and

T2k

[0

x2k−1

]= r−1,2k−1e1 + r+0,2k−1e2k.

Recall that r+0,2k−1 = ρ2k−1 and that ρ2k−1 is real.

That means that we have wk =[

0x2k−1

]for the standard centro-hermitian

ZW-factorization. Thus, the (m + k)th column of the Z-factor of the standardZW-factorization of T is given by

Zem+k =

⎡⎢⎢⎣(r−j,2k−1)

1j=m−k+1

02k−2

(r+j,2k−1)m−kj=0

⎤⎥⎥⎦and the X-factor by

X = xma

⎛⎝[ r+0,2k−1 r−1,2k−1

r−1,2k−1 r+0,2k−1

]−1⎞⎠m

k=1

,

As a conclusion we can state: The standard centro-hermitian ZW-factorizationof T is given if the residuals r±j,2k−1 for j = 0, . . . ,m − k and k = 1, . . . ,m areknown.

In Section 7 we will construct a centro-hermitian ZW-factorization which isnot standard. In this case besides the residuals the first and last components ofwk are needed to apply formula (3.3).

3.3. Column conjugate-symmetric ZW-factorization

For the construction of a column conjugate-symmetric ZW-factorization A =ZhXhZ

∗h we consider two families of equations

Ac2kw

±k = ρ±k e1 + ρ±k e2k (k = 1, . . . ,m)

with Im ρ−k ρ+k = 0. This condition guarantees the linear independence of the solu-

tions w+k and w−

k . Note that the vectors w±k are conjugate-symmetric, since the

right-hand sides are conjugate-symmetric.Now

Zhem+k = A

⎡⎣ 0m−k

w+k

0m−k

⎤⎦ and Zhem−k+1 = A

⎡⎣ 0m−k

w−k

0m−k

⎤⎦are the (m+ k)th and (m− k+1)th columns of Zh, respectively, for k = 1, . . . ,m.

It remains to describe the middle factor Xh. Let ξ±k , denote the last compo-nent of w±

k .We take advantage of the relation[

z1 z2

z1 z2

]=[−i 1i 1

] [Im z1 Im z2Re z1 Re z2

](3.5)

Page 249: Recent Advances in Operator Theory and its Applications: The Israel Gohberg Anniversary Volume

Schur-type algorithms for Hermitian Toeplitz Systems 243

for complex numbers z1, z2, to observe that ZhX−1Z is unit for

XZ = xma([

Im ρ−k Im ρ+k

Re ρ−k Re ρ+k

])m

k=1

.

Similarly, WhX−1W is unit for

Wh =

⎡⎣ 0 0 0 0w−

m w−m−1 . . . w−

1 w+1 . . . w+

m−1 w+m

0 0 0 0

⎤⎦and

XW = xma([

Im ξ−k Im ξ+k

Re ξ−k Re ξ+k

])m

k=1

.

From the uniqueness of the unit column conjugate-symmetric ZW-factorization weconclude now

Xh =12

xma

([Im ξ−k Im ξ+

k

Re ξ−k Re ξ+k

]−1 [ Im ρ−k Im ρ+k

Re ρ−k Re ρ+k

]−1)m

k=1

. (3.6)

The factor 12 appears in view of this factor in Σ−1 = 1

2 Σ∗.For a Toeplitz matrix T we have T c

2k = T2k. We consider the residuals

r±jk =[ak+j−1 . . . aj

]w±

k ,

for j = 0, . . . , n− k. The (m− k + 1)th and (m + k)th columns of Z is given by

Zem−k+1 =

⎡⎢⎢⎣(r−j,2k)0j=m−k

02k−2

(r−j,2k)m−kj=0

⎤⎥⎥⎦ , Zem+k =

⎡⎢⎢⎣(r+

j,2k)0j=m−k

02k−2

(r+j,2k)m−kj=0

⎤⎥⎥⎦ .

As a conclusion we can state: A column conjugate-symmetric ZW-factorization ofT is given if the residuals r±j,2k for j = 0, . . . ,m − k and the last components ofw±

k for k = 1, . . . ,m are known.

4. Classical Schur algorithm for LU- and ZW-factorizations

Throughout this section, let T be a strongly nonsingular hermitian Toeplitz matrix.We use the notations from the previous section.

The classical Schur algorithm is an algorithm that computes in a fast waythe residuals r±jk defined by (3.2) and (3.4), so it can be used to construct boththe LU- and the ZW-factorizations of T considered in the previous section. Forconvenience of the reader we present a derivation of the algorithm.

We collect the residuals r±jk defined by (3.2) and (3.4) to vectors

r+k = (r+jk)n−k

j=0 , r−k = (r−jk)n−kj=0 .

Page 250: Recent Advances in Operator Theory and its Applications: The Israel Gohberg Anniversary Volume

244 G. Heinig and K. Rost

Recall that r−0k = 0 and r+0k = ρk. If T ′k denotes the (n− k)× n matrix

T ′k =

⎡⎢⎣ ak . . . a1

......

an−1 . . . an−k

⎤⎥⎦ ,

then T ′kx

±k = (r±jk)n−k

j=1 .Let us first introduce a notation. For a vector v = (vj)m

j=1 ∈ Cm, we denoteby [v]+, [v]−, and by [v]+− the vectors

[v]+ = (vj)mj=2 , [v]− = (vj)m−1

j=1 , [v]+− = (vj)m−1j=2 . (4.1)

It is easily checked that[Tk+1

T ′k+1

] [x#

k 00 xk

]=

⎡⎣ r+0k r−1k

0k−1 0k−1

[r−k ]+ [r+k ]−

⎤⎦ .

From this relation we obtain[x#

k+1 xk+1

]=[

x#k 00 xk

]Θk , (4.2)

where

Θk =[

1 γk

γk 1

], γk = −r−1k

r+0k

.

This leads to the following.

Proposition 4.1. The residual vectors r±k satisfy the recursion[r−k+1 r+

k+1

]=[[r−k ]+ [r+

k ]−]Θk . (4.3)

The recursion starts with r−1 = r+1 = (aj)n−1

j=0 .

Recall that for the L-factor of the standard LU-factorization we need thenumbers r+jk for j = 0, . . . , n − k and for the diagonal factor the numbers r+0k

(k = 1 . . . , n). For the Z-factor of the standard centro-hermitian ZW-factorizationwe need the numbers r±j,2k−1 for j = 1, . . . ,m − k and k = 1, . . . ,m and for theX-factor the numbers r+0,2k−1 and r−1,2k−1 for these k.

Remark 4.1. At the first glance one might think that for the computation of theparameters of the ZW-factorization only a part of the numbers r±jk have to becomputed. A closer look, however, reveals that this is not the case.

Let us estimate the complexity of the algorithm emerging from Proposition4.1. The step k → k + 1 consists in 2 complex vector additions, 2 multiplicationsof a complex vector by a complex number. The lengths of the vectors are aboutn− k. This results in 4n2 RM and 4n2 RA.

Page 251: Recent Advances in Operator Theory and its Applications: The Israel Gohberg Anniversary Volume

Schur-type algorithms for Hermitian Toeplitz Systems 245

Remark 4.2. There is a recursion similar (4.3) for the columns of unit LU- andZW-factorizations. The difference to (4.3) is that the matrix Θk has not ones onthe main diagonal but some real number. This increases the complexity by n2 RM.That means this algorithm is less efficient than that generated by (4.3).

5. ZW-Factorization via one-step three-term Schur algorithm

To find the standard column conjugate-symmetric ZW-factorization we considerthe equations

Tkx±k = ρ±k e1 + ρ±k ek

for complex numbers ρ±k under the assumption that[eT1

eTk

] [x−

k x+k

]=[−i 1i 1

].

This guarantees the linear independence of x−k and x+

k . Besides the x±k we consider

the residual vectors s±k = (s±jk)n−kj=0 where s±0k = ρ±k and (s±jk)n−k

j=1 = T ′kx

±k .

We have

Tk+1

([0

x±k

]+[

x±k

0

])=

⎡⎢⎢⎢⎢⎢⎢⎢⎣

s±1k + s0k

s±0k

0k−3

s±0k

s±1k + s±0k

⎤⎥⎥⎥⎥⎥⎥⎥⎦Tk+1

⎡⎣ 0x±

k−1

0

⎤⎦ =

⎡⎢⎢⎢⎢⎢⎢⎢⎣

s±1,k−1

s±0,k−1

0k−3

s±0,k−1

s±1,k−1

⎤⎥⎥⎥⎥⎥⎥⎥⎦.

We are looking for real numbers α±k and β±

k such that

x±k+1 =

[0

x±k

]+[

x±k

0

]− α±

k

⎡⎣ 0x−

k−1

0

⎤⎦− β±k

⎡⎣ 0x+

k−1

0

⎤⎦ . (5.1)

We introduce the matrices

Γk =[

Re s−0k Re s+0k

Im s−0k Im s+0k

]. (5.2)

It follows from the linear independence of x±k that the matrices Γk are nonsingular.

A comparison of coefficients reveals that (5.1) holds if we choose[α−

k α+k

β−k β+

k

]= Γ−1

k−1Γk . (5.3)

The recursion for the vectors x±k transfers to a recursion for the residual vectors.

Proposition 5.1. The vectors s±k satisfy the recursion

s±k+1 = [s±k ]− + [s±k ]+ − α±k [s−k−1]± − β±

k [s+k−1]± ,

where α±k and β±

k are given by (5.3).

Page 252: Recent Advances in Operator Theory and its Applications: The Israel Gohberg Anniversary Volume

246 G. Heinig and K. Rost

The initialization is given by[s−2 s+

2

]=

[eT2 T2

T ′2

][−i 1i 1

],

[s−3 s+

3

]=

[eT3 T3

T ′3

]⎡⎣ 1 −ibRe a1 b Ima1

1 i

⎤⎦ ,where b = − 2

a0.

Recall that the Z-factor in the standard column conjugate-symmetric ZW-factorization is given by the numbers s±j,2k for j = 0, . . . ,m− k and k = 1, . . . ,m,and the X-factor is given by the numbers s±0,2k.

Let us estimate the complexity. In the step k → k+1 we have 4 multiplicationsof a complex vector by a real number and 6 additions of complex vectors. Thelengths of the vectors are about n− k. This results in 4n2 RM and 6n2 RA.

Remark 5.1. There is a recursion similar to that in Proposition 5.1 for computingthe columns of the unit column conjugate-symmetric ZW-factorization. In contrastto the classical Schur algorithm, the complexity of this algorithm is the same asfor the standard column conjugate-symmetric ZW-factorization.

6. ZW-factorization via double-step three-term Schur algorithm

Since for the standard column conjugate-symmetric ZW-factorization we need onlythe numbers s±j,2k, i.e., only every second residual vector, it is reasonable to thinkabout a double-step algorithm.

We are looking for a recursion of the form

x±2k+2=

⎡⎣ 00

x±2k

⎤⎦+

⎡⎣ x±2k

00

⎤⎦−⎡⎣ 0 0

x−2k x+

2k

0 0

⎤⎦[ γ±k

δ±k

]−

⎡⎣ 02 02

x−2k−2 x+

2k−2

02 02

⎤⎦[ α±k

β±k

].

If we multiply a vector of this form by T2k+2, then only the first and last 3 com-ponents of the resulting (conjugate-symmetric) vector are nonzero. First we findnumbers α±

k , β±k such that the third last component vanishes, which means

s±0,2k = α±k s

−0,2k−2 − β±

k s+0,2k−2 .

This is equivalent to [α−

k α+k

β−k β+

k

]= Γ−1

2k−2Γ2k (6.1)

where Γ2k is defined by (5.2).Next we find γ±

k and δ±k such that the last but one component vanishes. Thisis equivalent to

s±1,2k − α±k s

−1,2k−2 − β±

k s+1,2k−2 = γ±

k s−0,2k + δ±k s+0,2k .

Page 253: Recent Advances in Operator Theory and its Applications: The Israel Gohberg Anniversary Volume

Schur-type algorithms for Hermitian Toeplitz Systems 247

To write this in matrix form we introduce

Γk =[

Re s−1k Re s+1k

Im s−1k Im s+1k

]. (6.2)

After some calculation we find that[γ−

k γ+k

δ−k δ+k

]= Γ2kΓ−1

2k − Γ2k−2Γ−12k−2 . (6.3)

The recursion of the vectors x±2k transfers to the recursion of the residuals. In

order to present this recursion for the residual vectors s±2k we extend the notation(4.1) as follows

[ s ]++ = [ [ s ]+]+, [ s ]−− = [ [ s ]−]−, [ s ]++−− = [ [ s ]+−]+−.

If S is a matrix then [S]+− means that the [.]+− operator is applied to each columnof S.

Proposition 6.1. The vectors s±2k satisfy the recursions

s+2k+2 = [s+

2k]−−+[s+2k]++−

[s−2k s+

2k

]+−

[γ+

k

δ+k

]−[

s−2k−2 s+2k−2

]++

−−

[α+

k

β+k

],

s−2k+2 = [s−2k]−−+[s−2k]++−[

s−2k s+2k

]+−

[γ−

k

δ−k

]−[

s−2k−2 s+2k−2

]++

−−

[α−

k

β−k

],

where coefficients are given by (6.3) and (5.3).

We start this recursion with k = 1, where we put Γ0 = 0 and s−0 = s+0 = 0.

The vectors s±2 are given in the previous section.We estimate the complexity. In the step k → k+ 1 we have 8 multiplications

of a complex vector by a real number and 10 additions of complex vectors. Thelengths of the vectors are about n− 2k. But the number of steps is only m. Thisresults in 4n2 RM and 5n2 RA. That means the number of multiplications is thesame as for the one-step algorithm of Section 5, but we save n2 RA.

7. ZW-factorizations via the Schur version of Krishna’s algorithm

The algorithms presented in the previous sections do not lead to a further re-duction in the complexity compared with the classical Schur algorithm for ZW-factorizations. The reason is that two families of vectors are computed. It is de-sirable to have all information contained in only one family. In [21] a Levinsonalgorithm of this kind was presented that leads to about 50% reduction of thenumber of multiplications. A similar algorithm was proposed in [22]. For otheralgorithms (based on different ideas) we refer to [3].

Here we present a Schur version of the algorithm in [21] and show how itcan be used to find a centro-hermitian as well as a column conjugate-symmetricZW-factorization of T .

Page 254: Recent Advances in Operator Theory and its Applications: The Israel Gohberg Anniversary Volume

248 G. Heinig and K. Rost

Let qk be the solution of an equation

Tkqk = θk 1k ,

where θk is a nonzero real number. We allow the freedom in admitting a factor θk

and not demanding something about qk in order to save operations. Clearly, qk isconjugate-symmetric. We set sk = (sjk)n−k

j=0 with s0k = θk and (sjk)n−kj=1 = T ′

kqk.We have [

Tk+1

T ′k+1

] [0 qk

qk 0

]=

⎡⎣ s1k θk

θk1k−1 θk1k−1

[sk]− [sk]+

⎤⎦ ,[Tk+1

T ′k+1

]⎡⎣ 0qk−1

0

⎤⎦ =

⎡⎣ s1,k−1

θk−11k−1

[sk−1]+−

⎤⎦ .

The number s1k−θk is nonzero, since otherwise the nonzero vector[

0qk

]−[

qk

0

]would belong to the kernel of Tk+1, which contradicts the nonsingularity of Tk+1.

We are looking for qk+1 to be of the form

qk+1 =[

0 qk

qk 0

] [αk

αk

]−

⎡⎣ 0qk−1

0

⎤⎦ .

If we chooseαk =

s1,k−1 − θk−1

s1,k − θk,

then we really haveTk+1qk+1 = θk+11k+1,

with θk+1 = 2 θk Reαk − θk−1 = 0. The recursion for the solutions leads to arecursion of the residual vectors as follows.

Proposition 7.1. The vectors sk satisfy a recursion

sk+1 = αk[sk]+ + αk[sk]− − [sk−1]+− where αk =s1,k−1 − s0,k−1

s1,k − s0,k.

To start the recursion we observe that we can choose q1 = 1 and q2 =[a0 − a1

a0 − a1

]. Hence

s1 = (aj)n−1j=0 , s2 = ((a0 − a1)aj+1 + (a0 − a1)aj)n−2

j=0 .

Besides the vectors sk we have to compute the last component νk of qk. Therecursion for these numbers follows from the recursion of qk as

νk+1 = αkνk .

In each step of the Schur-type algorithm for computing the vectors sk we have1 multiplication of a complex vector by a complex number and by its conjugatecomplex. This is equivalent to 4 multiplications of a real vector by a real number

Page 255: Recent Advances in Operator Theory and its Applications: The Israel Gohberg Anniversary Volume

Schur-type algorithms for Hermitian Toeplitz Systems 249

and 4 real vector additions. Besides this we have 2 complex vector additions. Sincethe length of the vectors is n− k we end up with 2n2 RM plus 4n2 RA.

We show how we can obtain the data for a centro-hermitian ZW-factorization.For this we observe that the vectors

wk =[

q2k−1

0

]− s0,2k−1

s0,2kq2k

satisfy T2kwk = (s1,2k−1−s0,2k−1) e2k. Since qk are conjugate-symmetric and s0,k

are real, we have

w#k =

[0

q2k−1

]− s0,2k−1

s0,2kq2k.

The first and last components ξ−k and ξ+k of wk are given by

ξ−k = −s0,2k−1

s0,2kν−2k , ξ+

k = ν+2k−1 −

s0,2k−1

s0,2kν+2k . (7.1)

We introduce the vectors

z+k =

(sj+1,2k−1 −

s0,2k−1

s0,2ksj,2k

)m−k

j=0

, z−k =(sj,2k−1 −

s0,2k−1

s0,2ksj,2k

)m−k

j=0

and set

zk =

⎡⎣ (z−k )#

02k−2

z+k

⎤⎦ .

We obtain the following.

Proposition 7.2. The matrix

Z =[

z#m . . . z#

1 z1 . . . zm

]is the Z-factor of a centro-hermitian ZW-factorization of T , and the correspondingX-factor is given by

X = xma

⎛⎝[ ξ+

k ξ−kξk

−ξ+k

]−1 [ρ−1

k 00 ρ−1

k

]⎞⎠m

k=1

,

where ξ±k are given by (7.1) and ρk = s1,2k−1 − s0,2k−1.

For computing these data each step involves 1 multiplication of a complexvector by a real number and 2 additions of complex vectors. The lengths of thevectors are m− k and the number of steps is m. This results in m2 = 0.25n2 RMand 2m2 = 0.5n2 RA. Hence the total amount for computing a centro-hermitianZW-factorization of T will be 2.25n2 RM and 4.5n2 RA.

We show how to compute the data for a column conjugate-symmetric ZW-factorization. Introduce the conjugate-symmetric vectors

w−k =

⎡⎣ 0q2k−2

0

⎤⎦− s0,2k−2

s0,2kq2k, w+

k = i([

q2k−1

0

]−[

0q2k−1

]),

Page 256: Recent Advances in Operator Theory and its Applications: The Israel Gohberg Anniversary Volume

250 G. Heinig and K. Rost

which are obviously linearly independent. Then

T2k

[w−

k w+k

]=

⎡⎣ ρ−k ρ+k

02k−2 02k−2

ρ−k ρ+k

⎤⎦ ,

whereρ−k = s1,2k−2 −

s0,2k−2

s0,2ks1,2k, ρ+

k = i (s1,2k−1 − s0,2k−1) . (7.2)

The last components ξ±k of w±k are given by

ξ−k = −s0,2k−2

s0,2kν2k, ξ+

k = −i ν2k−1 . (7.3)

We introduce vectors t−k =(sj+1,2k−2 −

s0,2k−1

s0,2ksj,2k

)m−k

j=0

,

t+k = i ( sj+1,2k−1 − sj,2k−1)

m−kj=0 and z±k =

⎡⎣ (t±k )#

02k−2

t±k

⎤⎦ .

Then we have the following.

Proposition 7.3. The matrix

Zh =[

z−m . . . z−1 z+1 . . . z+

m

]is the Z-factor of a column conjugate-complex ZW-factorization of T . The corre-sponding X-factor is given by (3.6) with ρ±k , and ξ±k defined by (7.2) and (7.3)..

The amount for computing the data of this column conjugate-complex ZW-factorization is the same as for the centro-hermitian ZW-factorization above.

8. Summary

In the following table we compare the computational complexities for the algo-rithms discussed in this paper.

Method n2 RM n2 RA

LU and classical Schur-Bareiss 8 8

ch ZW and classical Schur 6 6.5

ccs ZW and one-step 3-term Schur 6 8

ccs ZW and double-step 3-term Schur 6 7

ch ZW and Krishna-Schur 4.25 7

ccs ZW and Krishna-Schur 4.25 6.5

Here “ch” stands for “centro-hermitian” and “ccs” stands for “column conjugate-symmetric”.

Page 257: Recent Advances in Operator Theory and its Applications: The Israel Gohberg Anniversary Volume

Schur-type algorithms for Hermitian Toeplitz Systems 251

References

[1] E.H. Bareiss, Numerical solution of linear equations with Toeplitz and vector Toeplitzmatrices, Numer. Math., 13 (1969), 404–424.

[2] D. Bini, V. Pan, Polynomial and Matrix Computations, Birkhauser Verlag, Basel,Boston, Berlin, 1994.

[3] Y. Bistritz, H. Lev-Ari, T. Kailath, Immitance-type three-term Schur and Levinsonrecursions for quasi-Toeplitz complex Hermitian matrices, SIAM J. Matrix. AnalysisAppl., 12, 3 (1991), 497–520.

[4] A. Bojanczyk, R. Brent, F. de Hoog, D. Sweet, On the stability of Bareiss and relatedToeplitz factorization algorithms, SIAM J. Matrix. Analysis Appl., 16, 1 (1995), 40–57.

[5] R.P. Brent, Stability of fast algorithms for structured linear systems, In: T. Kailath,A.H. Sayed (Eds.), Fast Reliable Algorithms for Matrices with Structure, SIAM,Philadelphia, 1999.

[6] P. Delsarte, Y. Genin, On the splitting of classical algorithms in linear predictiontheory, IEEE Transactions on Acoustics Speech and Signal Processing, ASSP-35(1987), 645–653.

[7] C.J. Demeure, Bowtie factors of Toeplitz matrices by means of split algorithms, IEEETransactions on Acoustics Speech and Signal Processing, ASSP-37, 10 (1989), 1601–1603.

[8] D.J. Evans, M. Hatzopoulos, A parallel linear systems solver, Internat. J. Comput.Math., 7, 3 (1979), 227–238.

[9] I. Gohberg, I. Koltracht, T. Xiao, Solution of the Yule-Walker equations, AdvancedSignal Processing Algorithms, Architectures, and Implementation II, Proceedings ofSPIE, 1566 (1991).

[10] I. Gohberg, A. A. Semencul, On the inversion of finite Toeplitz matrices and theircontinuous analogs (in Russian), Matemat. Issledovanya, 7, 2 (1972), 201–223.

[11] G. Golub, C. Van Loan, Matrix Computations, John Hopkins University Press, Bal-timore, 1996.

[12] G. Heinig, Chebyshev-Hankel matrices and the splitting approach for centrosymmetricToeplitz-plus-Hankel matrices, Linear Algebra Appl., 327, 1-3 (2001), 181–196.

[13] G. Heinig, K. Rost, Algebraic Methods for Toeplitz-like Matrices and Operators,Birkhauser Verlag, Basel, Boston, Stuttgart, 1984.

[14] G. Heinig, K. Rost, DFT representations of Toeplitz-plus-Hankel Bezoutians withapplication to fast matrix-vector multiplication, Linear Algebra Appl., 284 (1998),157–175.

[15] G. Heinig, K. Rost, Fast algorithms for skewsymmetric Toeplitz matrices, OperatorTheory: Advances and Applications, Birkhauser Verlag, Basel, Boston, Berlin, 135(2002), 193–208.

[16] G. Heinig, K. Rost, Fast algorithms for centro-symmetric and centro-skewsymmetricToeplitz-plus-Hankel matrices, Numerical Algorithms, 33 (2003), 305–317.

[17] G. Heinig, K. Rost, New fast algorithms for Toeplitz-plus-Hankel matrices, SIAMJournal Matrix Anal. Appl. 25(3), 842–857 (2004).

Page 258: Recent Advances in Operator Theory and its Applications: The Israel Gohberg Anniversary Volume

252 G. Heinig and K. Rost

[18] G. Heinig, K. Rost, Split algorithms for skewsymmetric Toeplitz matrices with arbi-trary rank profile, Theoretical Computer Science 315 (2–3), 453–468 (2004).

[19] T. Kailath, A theorem of I. Schur and its impact on modern signal processing, Opera-tor Theory: Advances and Applications, Birkhauser Verlag, Basel, Boston, Stuttgart,18 (1986), 9–30.

[20] T. Kailath, A.H. Sayed, Fast Reliable Algorithms for Matrices with Structure, SIAM,Philadelphia, 1999.

[21] B. Krishna, H. Krishna, Computationally efficient reduced polynomial based algo-rithms for hermitian Toeplitz matrices, SIAM J. Appl. Math., 49, 4 (1989), 1275–1282.

[22] H. Krishna, S.D. Morgera, The Levinson recurrence and fast algorithms for solvingToeplitz systems of linear equations, IEEE Transactions on Acoustics Speech andSignal Processing, ASSP-35 (1987), 839–848.

[23] E.M. Nikishin, V.N. Sorokin, Rational approximation and orthogonality (in Russian),Nauka, Moscow 1988; English: Transl. of Mathematical Monographs 92, Providence,AMS 1991.

[24] S. Chandra Sekhara Rao, Existence and uniqueness of WZ factorization, ParallelComp., 23, 8 (1997), 1129–1139.

[25] W.F. Trench, An algorithm for the inversion of finite Toeplitz matrices, J. Soc. In-dust. Appl. Math., 12 (1964), 515–522.

[26] J.M. Varah, The prolate matrix, Linear Algebra Appl., 187 (1993), 269–278.

Georg HeinigDepartment of Mathematics and Computer SciencesP.O.Box 5969Safat 1306, Kuwaite-mail: [email protected]

Karla RostDepartment of MathematicsChemnitz University of TechnologyD-09107 Chemnitz, Germanye-mail: [email protected]

Page 259: Recent Advances in Operator Theory and its Applications: The Israel Gohberg Anniversary Volume

Operator Theory:Advances and Applications, Vol. 160, 253–271c© 2005 Birkhauser Verlag Basel/Switzerland

Almost Pontryagin Spaces

M. Kaltenback, H. Winkler and H. Woracek

Abstract. The purpose of this note is to provide an axiomatic treatment ofa generalization of the Pontryagin space concept to the case of degeneratedinner product spaces.

Mathematics Subject Classification (2000). Primary 46C20; Secondary 46C05.

Keywords. Pontryagin space, indefinite scalar product.

1. Introduction

In this note we provide an axiomatic treatment of a generalization of the Pon-tryagin space concept to the case of degenerated inner product spaces. Pontryaginspaces are inner product spaces which can be written as the direct and orthogonalsum of a Hilbert space and a finite-dimensional anti Hilbert space. The subject ofour paper are spaces which can be written as the direct and orthogonal sum ofa Hilbert space, a finite-dimensional anti Hilbert space and a finite-dimensionalneutral space.

The necessity of a systematic approach to such “almost” Pontryagin spacesbecame clear in the study of various topics: For example in the investigation ofindefinite versions of various classical interpolation problems (e.g., [7]). Related tothese questions is the generalization of Krein’s formula for generalized resolvents ofa symmetric operator (e.g., [8]). Another topic where the occurrence of degeneracyplays a crucial role is the theory of Pontryagin spaces of entire functions whichgeneralizes the theory of Louis de Branges on Hilbert spaces of entire functions.

In Section 2 we generalize the concept of Pontryagin spaces by giving thedefinition of almost Pontryagin spaces and investigating the basic notion of Gramoperator and fundamental decomposition. Moreover, the role played by the topol-ogy of an almost Pontryagin space is made clear. In the subsequent Section 3 weinvestigate some elementary constructions which can be made with almost Pon-tryagin spaces. We deal with subspaces, product spaces and factor spaces. Relatedto the last one of these constructions is the notion of morphism between almostPontryagin spaces. Section 4 deals with the concept of completion. This topic is

Page 260: Recent Advances in Operator Theory and its Applications: The Israel Gohberg Anniversary Volume

254 M. Kaltenback, H. Winkler and H. Woracek

much more involved than the previous constructions. However, it is clearly of par-ticular importance to be able to construct almost Pontryagin spaces from givenlinear spaces carrying an inner product. In Section 5 we turn our attention to aparticular class of almost Pontryagin spaces, so-called almost reproducing kernelPontryagin spaces. The intention there is to prove the existence of the correct ana-logue of a reproducing kernel of a reproducing kernel Pontryagin space. Finally,in Section 6, we explain some circumstances where almost Pontryagin spaces ac-tually occur.

2. Almost Pontryagin spaces

Before we give the definition of almost Pontryagin spaces, recall the definition ofPontryagin spaces. A pair (P, [., .]) where P is a complex vector space and [., .] isa hermitian inner product on P is called a Pontryagin space if one can decomposeP as

P = P−[+]P+, (2.1)where (P−, [., .]) is a finite-dimensional anti Hilbert space, (P+, [., .]) is a Hilbertspace, and [+] denotes the direct and [., .]-orthogonal sum. Such decompositionsof P are called fundamental decompositions. It is worthwhile to note (see [2] orsee below) that every Pontryagin space carries a unique Hilbert space topology O(there exists an inner product (., .) such that (P, (., .)) is a Hilbert space and suchthat (., .) induces the topology O, i.e., O = O(.,.)) such that the inner product [., .]is continuous with respect to O. This topology is also called the Pontryagin spacetopology on P.

With respect to this topology the subspace P+ is closed for any fundamen-tal decomposition (2.1). Conversely, the product topology induced on P by anyfundamental decomposition (2.1) coincides with the unique Hilbert space topology.

It will turn out that for almost Pontryagin spaces the uniqueness assertionabout the topology is no longer true. Thus we will include the topology into thedefinition.

Definition 2.1. Let L be a linear space, [., .] an inner product on L and O a Hilbertspace topology on L. The triplet (L, [., .],O) is called an almost Pontryagin space, if(aPS1) [., .] is O-continuous.(aPS2) There exists a O-closed linear subspace M of L with finite codimension

such that (M, [., .]) is a Hilbert space.

Let (R, [., .]) be any linear space equipped with a inner product [., .] and assumethat

sup

dim U : U negative definite subspace of R <∞.

Then (see for example [2, Corollary I.3.4]) the dimensions of all maximal negativedefinite subspaces of R are equal. We denote this number by κ−(R, [., .]) and referto it as the negative index (or the degree of negativity) of (R, [., .]). If the abovesupremum is not finite, we set κ−(R, [., .]) = ∞.

Page 261: Recent Advances in Operator Theory and its Applications: The Israel Gohberg Anniversary Volume

Almost Pontryagin Spaces 255

The isotropic part of an inner product space (R, [., .]) is defined as

R[] =x ∈ R : [x, y] = 0, y ∈ R

.

We will denote its dimension by ∆(R, [., .]) ∈ N∪ 0,∞ and call this number thedegree of degeneracy of (R, [., .]).

Remark 2.2. It immediately follows from the definition that if (L, [., .],O) is analmost Pontryagin space, then κ−(L, [., .]) and ∆(L, [., .]) are both finite.

The fact that a given triplet (L, [., .],O) is an almost Pontryagin space can becharacterized in several ways. First let us give one characterization via a spectralproperty of a Gram operator.

Proposition 2.3. Let L be a linear space, [., .] an inner product on L and O aHilbert space topology on L.

(i) Assume that (L, [., .],O) is an almost Pontryagin space and let (., .) be anyHilbert space inner product which induces the topology O. Then there existsa unique (., .)-selfadjoint bounded operator G(.,.) with

[x, y] = (G(.,.)x, y), x, y ∈ L.

There exists ε > 0 such that σ(G(.,.)) ∩ (−∞, ε) consists of finitely manyeigenvalues of finite multiplicity. If we denote by E(M) the spectral measureof G(.,.), this just means that

dim ranE(−∞, ε) <∞. (2.2)

Moreover,

∆(L, [., .]) = dimkerG(.,.), κ−(L, [., .]) = dim ranE(−∞, 0).

We will refer to G(.,.) as the Gram operator corresponding to (., .).(ii) Let (L, (., .)) be a Hilbert space, and let G be a bounded selfadjoint operator

on (L, (., .)) which satisfies (2.2) where E(M) denotes the spectral measure ofG. Moreover, let O be the topology induced by (., .) and define [., .] = (G., .).Then (L, [., .],O) is an almost Pontryagin space.

Proof. ad (i): Since [., .] is continuous with respect to the topology O, the Lax-Milgram theorem ensures the existence and uniqueness of G(.,.). Moreover, since[., .] is an inner product, G(.,.) is selfadjoint.

Let M be a O-closed linear subspace of L with finite codimension such that(M, [., .]) is a Hilbert space. By the open mapping theorem [., .]|M and (., .)|M areequivalent. Hence PMG(.,.)|M, where PM denotes the (., .)-orthogonal projectiononto M, is strictly positive. Choose ε > 0 such that εIM < PMG(.,.)|M. Assumethat dim ranE(−∞, ε) > codimL M, then ranE(−∞, ε) ∩ M = 0. For any x ∈ranE(−∞, ε) ∩M, (x, x) = 1, we have

ε < (PMG(.,.)|Mx, x) = (G(.,.)x, x) ≤ ε,

a contradiction.

Page 262: Recent Advances in Operator Theory and its Applications: The Israel Gohberg Anniversary Volume

256 M. Kaltenback, H. Winkler and H. Woracek

ad (ii): Choose M = ranE[ε,∞).Then M is (., .)-closed, codimL M = dim ranE(−∞, ε) <∞ and [., .] = (G., .)

is a Hilbert space inner product on M since G|M is strictly positive. Corollary 2.4. Let an almost Pontryagin space (L, [., .],O) be given. If (., .) and〈., .〉 are two Hilbert space inner products on L which both induce the topology Oand T is the (., .)-strictly positive bounded operator on L with 〈., .〉 = (T., .), thenthe Gram operators G(.,.) and G〈.,.〉 are connected by

G(.,.) = TG〈.,.〉.

There exists a Hilbert space inner product (., .) on L which induces O such that itsGram operator G(.,.) is a finite-dimensional perturbation of the identity.

Proof. The first assertion is clear from

(G(.,.)x, y) = [x, y] = 〈G〈.,.〉x, y〉 = (TG〈.,.〉x, y), x, y ∈ L.

For the second assertion choose a Hilbert space inner product 〈., .〉 on L whichinduces O. Let G〈.,.〉 be the corresponding Gram operator, E(M) its spectralmeasure, and choose ε > 0 as in Proposition 2.3, (i). Define

(x, y) = 〈(E[ε,∞)G〈.,.〉 + E(−∞, ε))x, y〉, x, y ∈ L.

Then

G(.,.) = (E[ε,∞)G〈.,.〉 + E(−∞, ε))−1G〈.,.〉 = E[ε,∞) + E(−∞, ε)G〈.,.〉. In the study of Pontryagin spaces so-called fundamental decompositions play

an important role. The following is the correct analogue for almost Pontryaginspaces. In particular, it gives us another characterization of this notion.

Proposition 2.5. The following assertions hold true:(i) Let (L, [., .],O) be an almost Pontryagin space. Then there exists a direct and

[., .]-orthogonal decomposition

L = L+[+]L−[+]L[], (2.3)

where L+ is O-closed, (L+, [., .]) is a Hilbert space and L− is negative definite,dimL− = κ−(L, [., .]).

(ii) Let (L+, (., .)+) be a Hilbert space, (L−, (., .)−) be a finite-dimensional Hilbertspace, and let L0 be a finite-dimensional linear space. Define a linear space

L = L++L−+L0,

and inner products

[x+ + x− + x0, y+ + y− + y0] = (x+, y+)− (x−, y−),

(x+ + x− + x0, y+ + y− + y0) = (x+, y+) + (x−, y−) + (x0, y0)0,where (., .)0 is any Hilbert space inner product on L0. Moreover, let O bethe topology on L induced by the Hilbert space inner product (., .). Then(L, [., .],O) is an almost Pontryagin space. Thereby κ−(L, [., .]) = dimL−and L[] = L0.

Page 263: Recent Advances in Operator Theory and its Applications: The Israel Gohberg Anniversary Volume

Almost Pontryagin Spaces 257

Proof. Let (L, [., .],O) be an almost Pontryagin space. Choose a Hilbert space innerproduct (., .) which induces O, let G(.,.) be the corresponding Gram operator, anddenote by E(M) the spectral measure of G(.,.). Define

L+ = ranE(0,∞), L− = ranE(−∞, 0).

Then L+ is O-closed. The inner products [., .] and (., .) are equivalent on L+ sinceG|L+ is strictly positive. Hence (L+, [., .]) is a Hilbert space. Clearly (L−, [., .]) isnegative definite and dimL− = κ−(L, [., .]). Since E(−∞, 0)+E0+E(0,∞) = I,the space L is decomposed as in (2.3).

Conversely, let (L+, (., .)+), (L−, (., .)−) and L0 be given. The Gram operatorof [., .] with respect to (., .) is equal to

G =

⎛⎝I 0 00 −I 00 0 0

⎞⎠ .

Obviously, ranE(−∞, 12 ) = L− + L0, kerG = L0 and ranE(−∞, 0) = L−.

Corollary 2.6. We have

(i) Let (L+, (., .)+), (L−, (., .)−) and L0 be as in (ii) of Proposition 2.5, andlet (L, [., .],O) be the almost Pontryagin space constructed there. Then L =L+[+]L−[+]L0 is a decomposition of the same kind as in (2.3).

(ii) Let (L, [., .],O) be an almost Pontryagin space, and assume that L is de-composed as L = L+[+]L−[+]L[] where (L+, [., .]) is a Hilbert space and(L−, [., .]) is negative definite.Let (L1, [., .]1,O1) be the almost Pontryagin space constructed by means ofProposition 2.5, (ii), from (L+, [., .]), (L−,−[., .]), L0 = L[]. Then L1 = Land [., .]1 = [., .]. We have O1 = O if and only if L+ is O-closed.

Proof. The assertion (i) follows immediately since L+ is (., .)-closed. We come tothe proof of (ii). The facts that L1 = L and [., .]1 = [., .] are obvious.

Assume that L+ isO-closed. Note that by the assumption on their dimensionsthe subspaces L− and L0 are closed, too. By the Open Mapping Theorem the linearbijection

(x+;x−;x0) → x+ + x− + x0,

is bicontinuous from L+ × L− × L0 provided with the product topology onto Lprovided with O. On the other hand by the definition of O1 this mapping is alsobicontinuous if we provide L with O1. Thus O1 = O.

Finally, assume that O1 = O. By the construction of (., .)1 the space L+ isO1-closed and, therefore, also O-closed.

From the above results we obtain a statement which shows from anotherpoint of view that almost Pontryagin spaces can be viewed as a generalization ofPontryagin spaces.

Page 264: Recent Advances in Operator Theory and its Applications: The Israel Gohberg Anniversary Volume

258 M. Kaltenback, H. Winkler and H. Woracek

Corollary 2.7. Let (P, [., .]) be a Pontryagin space, and let O be the unique topologyon P such that [., .] is continuous (see [2], compare also Corollary 2.10). Then(P, [., .],O) is an almost Pontryagin space. Moreover, ∆(P, [., .]) = 0.

Conversely, if (P, [., .],O) is an almost Pontryagin space with ∆(P, [., .]) = 0,then (P, [., .]) is a Pontryagin space.

Proof. Let (P, [., .]) be a Pontryagin space. Choose a fundamental decomposi-tion P = P+[+]P−. Then P+ is O-closed, (P+, [., .]) is a Hilbert space andcodimP P+ = dimP− <∞.

Let (P, [., .],O) be an almost Pontryagin space with ∆(P, [., .]) = 0. Choosea decomposition P = P+[+]P− according to (2.3). By Corollary 2.6, (ii), thetopology O coincides with the topology of the Pontryagin space (P+[+]P−, [., .]).

It is a noteworthy fact that in certain cases the topology of an almost Pon-tryagin space (L, [., .],O) is uniquely determined by the inner product, see theProposition 2.9 below. However, in general this is not true. This fact goes back to[4], [5].

Lemma 2.8. For any infinite-dimensional almost Pontryagin space (L, [., .],O) with∆(L, [., .]) > 0 there exists a topology T different from O such that also (L, [., .], T )is an almost Pontryagin space.

Proof. Choose a Hilbert space inner product (., .) on L inducing O. Let h ∈ L[]

and K = h(⊥), and let f be a non-continuous linear functional on K. We define thelinear mapping U from L = K(+) spanh onto itself by

U(x + ξh) = x + (ξ + f(x))h, x ∈ K, ξ ∈ C.

The mapping U is bijective and non-continuous. In fact, if it were continuous, thenalso f would be continuous. Nevertheless, U is isometric:

[U(x+ξh), U(y+ηh)] = [x+(ξ+f(x))h, y+(η+f(y))h] = [x, y] = [x+ξh, y+ηh].

Therefore, with (L, [., .],O) also its isometric copy (L, [., .], U−1(O)) is an almostPontryagin space. As U is not continuous we have T = U−1(O) = O.

The existence of a sufficiently large family of functionals which are requiredto be continuous guarantees the uniqueness of the topology. Such a family offunctionals will show up, in particular, when we deal with spaces consisting offunctions such that the point evaluation functionals are continuous.

A family (fi)i∈I of linear functionals on a linear space L is said to be pointseparating if for each two x, y ∈ L, x = y, there exists i ∈ I such that fi(x) = fi(y).

Proposition 2.9. Let (L, [., .],O) be an almost Pontryagin space and assume thatthere exists a point separating family of continuous linear functionals (fi)i∈I . ThenO is the unique Banach space topology on L such that all functionals fi, i ∈ I, arecontinuous.

Page 265: Recent Advances in Operator Theory and its Applications: The Israel Gohberg Anniversary Volume

Almost Pontryagin Spaces 259

Proof. Let T be a Banach space topology on L such that every fi is continuous.The identity mapping id : (L,O) → (L, T ) has a closed graph. In fact, if xn → xwith respect to O and xn → y with respect to T , then by assumption

fi(x) = limn→∞ fi(xn) = fi(y), for all i ∈ I,

and hence x = y. By the Closed Graph Theorem the identity map is bicontinuous,and therefore T = O.

As a corollary we obtain the well-known result that a Pontryagin space carriesa unique Hilbert space topology with respect to which [., .] is continuous.

Corollary 2.10. If an almost Pontryagin space (P, [., .],O) is a Pontryagin space,i.e., ∆(P, [., .]) = 0, then O is the unique Banach space topology T on P such that[., .] is continuous with respect to T . In particular, it is the unique Hilbert spacetopology T on P such that (P, [., .], T ) is an almost Pontryagin space.

Proof. The assumption ∆(P, [., .]) = 0 is equivalent to the fact that the family offunctionals fx = [., x], x ∈ P, is point separating. Hence we can apply Lemma 2.9.

3. Subspaces, products, factors

The next result shows that the class of almost Pontryagin spaces is closed underthe formation of subspaces and finite direct products. Note that the first half ofthis statement is not true for Pontryagin spaces.

Proposition 3.1. Let (L, [., .],O) be an almost Pontryagin space, K a closed linearsubspace of L, and denote by O∩K the subspace topology induced by O on K. Then(K, [., .],O∩K) is an almost Pontryagin space. We have κ−(K, [., .]) ≤ κ−(L, [., .]).

Let (L1, [., .]1,O1) and (L2, [., .]2,O2) be two almost Pontryagin spaces, anddenote by O1 ×O2 the product topology on L1 × L2. Define the inner product

[(u; v), (x; y)] = [u, x]1 + [v, y]2, (u; v), (x; y) ∈ L1 × L2.

Then (L1 × L2, [., .],O1 ×O2) is an almost Pontryagin space. We have

κ−(L1 × L2, [., .]) = κ−(L1, [., .]1) + κ−(L2, [., .]2),

∆(L1 × L2, [., .]) = ∆(L1, [., .]1) + ∆(L2, [., .]2).

Proof. To establish the first part of the assertion choose an O-closed linear sub-space M of L with finite codimension such that (M, [., .]) is a Hilbert space. Wealready saw that by the closed graph theorem [., .] induces the topology O ∩ Mon M. Thus K ∩ M is at the same time O ∩ K-closed linear subspace of K withfinite codimension in K, and a O∩M-closed (i.e., [., .]-closed) subspace of M. Hence(K∩M, [., .]) is a Hilbert space. Thus (K, [., .],O∩K) is an almost Pontryagin space.The relation between the negative indices is clear.

To prove the second assertion take for j = 1, 2 a Oj-closed subspace Mj withfinite codimension in Lj such that (Mj , [., .]j) is a Hilbert space. Then M1×M2 is a

Page 266: Recent Advances in Operator Theory and its Applications: The Israel Gohberg Anniversary Volume

260 M. Kaltenback, H. Winkler and H. Woracek

O1×O2-closed subspace of L1×L2 of finite codimension such that (M1×M2, [., .])is a Hilbert space.

We conclude from Corollary 2.7 together with Proposition 3.1 that everyclosed subspace of a Pontryagin space is an almost Pontryagin space. Also theconverse holds true:

Proposition 3.2. Let (L, [., .],O) be an almost Pontryagin space. Then there existsa Pontryagin space (P, [., .]) such that L is a closed subspace of P with codimen-sion ∆(L, [., .]) and O is the subspace topology on L induced by the Pontryaginspace topology on P. Moreover, κ−(P, [., .]) = κ−(L, [., .]) + ∆(L, [., .]). Any twoPontryagin spaces with the listed properties are isometrically isomorphic.

Conversely, let (P, [., .]) be a Pontryagin space. If L is a closed subspace ofP, so that L with the inner product and topology inherited from (P, [., .]) is analmost Pontryagin space, then codimP L ≥ ∆(L, [., .]).

Proof. Fix a decomposition L = L+[+]L−[+]L[] according to (2.3). Let L′ be alinear space of dimension ∆(L, [., .]) and define

P = L++L−+L[]+L′.

We declare an inner product [., .]1 on P by

[x, y]1 = [x, y], x, y ∈ L+ + L−, (L+ + L−)[⊥]1(L[] + L′),

and the requirement that L[] and L′ are skewly linked neutral subspaces, i.e., forevery non-zero x ∈ L[] there exists a y ∈ L′ such that [x, y] = 0 and, conversely,for every non-zero y ∈ L′ there exists an x ∈ L[] such that [x, y] = 0.

Then (P, [., .]1) is a Pontryagin space because it can be seen as the productof the Pontryagin spaces (L++L−, [., .]) and (L[]+L′, [., .]1).

Clearly, codimP L = ∆(L, [., .]) and [., .]1|L = [., .]. Since L = (L[])[⊥]1 , thespace L is a closed subspace of P. Let T be the subspace topology on L inducedby the Pontryagin space topology on P. Then T coincides with the topology onL obtained from the construction of Proposition 2.5, (ii), applied with (L+, [., .]),(L−,−[., .]) and L[]. Since L+ is O-closed, Corollary 2.6, (ii), yields T = O.

Let (P2, [., .]2) be another Pontryagin space which contains L with codimen-sion ∆(L, [., .]). Then P2 can be decomposed as

P2 = L+[+]L−[+](L[]+L′′),

where L′′ is a neutral subspace skewly linked to L[]. It is now obvious that thereexists an isometric isomorphism of P2 onto the above constructed space P.

The second part of the assertion follows from [2, Theorem I.10.9]: Considerthe ∆(L, [., .])-dimensional subspace L[] of P. Then certainly L ⊆ (L[])[⊥] andthus

codimP L ≥ codimP(L[])[⊥] = dimL[] = ∆(L, [., .]).

Page 267: Recent Advances in Operator Theory and its Applications: The Israel Gohberg Anniversary Volume

Almost Pontryagin Spaces 261

Let us introduce the correct notion of morphism between almost Pontryagin spaces.

Definition 3.3. Let (L1, [., .]1,O1) and (L2, [., .]2,O2) be almost Pontryagin spaces.A map φ : L1 → L2 is called a morphism between (L1, [., .]1,O1) and (L2, [., .]2,O2)if φ is linear, isometric, continuous and maps O1-closed subspaces of L1 onto O2-closed subspaces of L2.

A linear mapping φ from an almost Pontryagin space (L1, [., .]1,O1) onto analmost Pontryagin space (L2, [., .]2,O2) is called an isomorphism if φ is bijective,bicontinuous and isometric with respect to [., .]1 and [., .]2.

Let us collect a couple of elementary facts.

Lemma 3.4. The identity map of an almost Pontryagin space onto itself is an iso-morphism. Every isomorphism is a morphism. The composition of two (iso)mor-phisms is a(n) (iso)morphism.

Let φ : (L1, [., .]1,O1) → (L2, [., .]2,O2) be a morphism. Then(i) kerφ ⊆ L[]1(ii) (ranφ, [., .]2,O2 ∩ ranφ) is an almost Pontryagin space.(iii) If φ is surjective, then φ is open.(iv) If φ is bijective, then φ is an isomorphism.If K is a closed subspace of an almost Pontryagin space (L, [., .],O), then the in-clusion map ι : (K, [., .],O ∩ K) → (L, [., .],O) is a morphism.

Proof. The first statement of the lemma is obvious.ad (i): Since φ is isometric an element x ∈ kerφ must satisfy

[x, y]1 = [φx, φy]2 = 0, y ∈ L,

and hence x ∈ L[]1 .ad (ii): Since ranφ is O2-closed, we may refer to Proposition 3.1.ad (iii): Apply the Open Mapping Theorem.ad (iv): This is an immediate consequence of the previous assertion.

The last statement follows since K is a closed subspace of L. Morphisms can be constructed in a canonical way from subspaces of L[].

Proposition 3.5. Let (L, [., .],O) be an almost Pontryagin space and let R be asubspace of L[]. We consider the factor space L/R endowed with an inner product[., .]1 defined by

[x+ R, y + R]1 = [x, y], (3.1)and with the quotient topology O/R. Then (L/R, [., .]1,O/R) is an almost Pon-tryagin space. We have

κ−(L/R, [., .]1) = κ−(L, [., .]),

∆(L/R, [., .]1) = ∆(L, [., .])− dim R.

The quotient map π : L → L/R is a morphism of

(L, [., .],O) onto (L/R, [., .]1,O/R).

Page 268: Recent Advances in Operator Theory and its Applications: The Israel Gohberg Anniversary Volume

262 M. Kaltenback, H. Winkler and H. Woracek

Proof. The inner product on L/R is well defined by (3.1) because of R ⊆ L[].Since O is a Hilbert space topology and R is a finite-dimensional and, hence, closedsubspace of L, the topology O/R is also a Hilbert space topology.

Denote by π : L → L/R the canonical projection. Since the inner product onL/R is defined according to

(L/R)2[.,.]1 C

L2

π×π

[.,.]

,

and the quotient topology is the final topology with respect to π, we obtain that[., .]1 is O/R-continuous.

Choose an O-closed subspace M of L such that codimL M < ∞ and suchthat (M, [., .]) is a Hilbert space. Since R is finite-dimensional M + R is O-closed.Thus π(M) = (M + R)/R satisfies the requirements of axiom (aPS2).

The formulas for the negative index and the degree of degeneracy are obvious.The quotient map π is clearly linear, isometric and continuous. If U is any O-

closed subspace of L, then also U+R is O-closed and therefore π(U) = (U+R)/Ris O/R-closed. This shows that π is a morphism.

We conclude this section with the 1st homomorphy theorem.

Lemma 3.6. Let φ : (L1, [., .]1,O1) → (L2, [., .]2,O2) be a morphism. Then φ in-duces an isomorphism φ between (L1/ kerφ, [., .]1,O1/ kerφ) and (ranφ, [., .]2,O2∩ranφ) with

(L1, [., .]1,O1)φ

π

(L2, [., .]2,O2)

(L1/ kerφ, [., .]1,O1/ kerφ)φ

(ranφ, [., .]2,O2 ∩ ranφ)

ι

Proof. The induced mapping φ is bijective, isometric and continuous. By the OpenMapping Theorem it is also open. Thus it is an isomorphism.

4. Completions

The generalization of the concept of completion to the almost Pontryagin spacesetting is a much more delicate topic.

Remark 4.1. Let an inner product space (A, [., .]) with κ−(A, [., .]) = κ < ∞be given. Then there always exists a Pontryagin space which contains A/A[] as adense subspace. We are going to sketch the construction of this so-called completionof (A, [., .]) (see, e.g., [4]).

Page 269: Recent Advances in Operator Theory and its Applications: The Israel Gohberg Anniversary Volume

Almost Pontryagin Spaces 263

Take any subspace M of A which is maximal with respect to the propertythat (M, [., .]) is an anti Hilbert space. If e1, . . . , eκ is an orthonormal basis of(M,−[., .]), then

PM = −[., e1]e1 · · · − [., eκ]eκ,

is the orthogonal projection of A onto M. By the maximality property of M theorthogonal complement ((I−PM)A, [., .]) is positive semidefinite. Therefore, settingJM = I − 2PM we see that [JM., .] = (., .)M is a positive semidefinite product onA. We then have (JM., .) = [., .]M, and JM and [., .] are continuous with respectto the topology induced by (., .)M.

Note that if M′ is another subspace of L which is maximal with respect tothe property that (M′, [., .]) is an anti Hilbert space, then PM′ , and hence JM′ and(., .)M′ are continuous with respect to (., .)M. By symmetry we obtain that (., .)M

and (., .)M′ are equivalent scalar products, i.e., there exist α, β > 0 such that

α(x, x)M ≤ (x, x)M′ ≤ β(x, x)M, x ∈ A. (4.1)

This in turn means that these two scalar products induce the same topology T onA. In particular, T is determined by [., .] and not by a particularly chosen M.

A completion of (A, [., .]) is given by (P, [., .]), where P is the completion ofA/A[] with respect to (., .)M. Note that

A()M = x ∈ A : (x, x)M = 0 = x ∈ A : [x, y] = 0 for all y ∈ A = A[],

and that A[] ∩M = 0.After factoring out A[] by continuity we can extend PM, JM, [., .] to P. Then

we have JM = I − 2PM and [., .] = (JM., .)M also on P. The extension PM is theorthogonal projection of P onto M/A[]. It is straightforward to check that

P = PMP[+]((I − PM)P) (4.2)

is a fundamental decomposition of (P, [., .]). Therefore, (P, [., .]) is a Pontryaginspace and by (4.1) its construction does not depend on the chosen space M. More-over, it is the unique Pontryagin space (up to isomorphisms) which contains A/A[]

such that [., .] on P is a continuation of [., .] on A/A[].To see this let (P′, [., .]) be another such Pontryagin space, and let P′ =

P−[+]P+ be a fundamental decomposition of P′. By a density argument we finda subspace M of A with the same dimension as P− such that M/A[] is sufficientlyclose to P− in order that (M, [., .]) is an anti Hilbert space. It follows that M ismaximal with respect to this property. Let P be the orthogonal projection of P′

onto M/A[], and let (., .) be the Hilbert space inner product [(I − 2P )., .] on P.If (P, [., .]) is the completion as constructed above, then the identity φ on

A/A[] is a [., .]-isometric linear mapping from a dense subspace of P onto a densesubspace of P′. By construction φPM = Pφ. Hence φ is isometric with respect to(., .)M and (., .). As both induce the topology on the respective spaces P and P′

we see that φ can be extended to an isomorphism from P onto P′.

Page 270: Recent Advances in Operator Theory and its Applications: The Israel Gohberg Anniversary Volume

264 M. Kaltenback, H. Winkler and H. Woracek

Definition 4.2. Let (A, [., .]) be an inner product space such that κ−(A, [., .]) = κ <∞. An almost Pontryagin space with a linear mapping ((L, [., .],O), ι) is called acompletion of A, if ι is an isometric mapping (with respect to [., .]) from A onto adense subspace ι(A) of L.

Two completions ((L1, [., .]1,O1), ι1) and ((L2, [., .]2,O2), ι2) are called iso-morphic if there exists an isomorphism φ from (L1, [., .]1,O1) onto (L2, [., .]2,O2)such that φ ι1 = ι2.

Remark 4.3. We saw above that, up to isomorphism, there always exists a uniquePontryagin space which is a completion of (A, [., .]).

If we allow the almost Pontryagin space of a completion ((L, [., .],O), ι) to bedegenerated, i.e., ∆(L, [., .]) > 0, then (L, [., .],O) is not uniquely determined if weassume dim A/A[] = ∞. This can be derived immediately from Proposition 2.8.

For dimA/A[] = ∞ it follows from the subsequent result that for any ∆ ≥ 0there exists a completion ((L, [., .],O), ι) of (A, [., .]) such that ∆(L, [., .]) = ∆. Alsofor fixed ∆ Proposition 2.8 shows that (L, [., .],O) is not uniquely determined.

Proposition 4.4. Let (A, [., .]) be an inner product space with κ−(A, [., .]) = κ <∞,and let T be the topology determined by [., .] on A (see Remark 4.1).

If f1, . . . , f∆ are complex linear functionals on A such that no linear combina-tion of them is continuous with respect to T , then there exists an (up to isomorphiccopies) unique completion ((L, [., .],O), ι) with ∆(L, [., .]) = ∆ such that f1, . . . , f∆

are continuous with respect to ι−1(O).

Proof. The construction made in this proof stems from [6].Let (P, [., .]) be the unique Pontryagin space completion of (A, [., .]), i.e., the

completion with respect to T . Let (., .)M be the Hilbert space inner product on Pfrom Remark 4.1 constructed with the help of a subspace M of A being maximalwith respect to the property that (M, [., .]) is an anti Hilbert space. We define

L = P× C∆,

and provide L with the inner product (., .) such that (., .) coincides with (., .)M onP and with the Euclidean product on C∆, and such that L = P(+)C∆. Let [., .]be defined on L by

[(x; ξ), (y; η)] = [x, y].By definition (L, [., .],O(.,.)) is an almost Pontryagin space. Hereby O(.,.) is thetopology induced by (., .) on L.

Now we embed A in L via the mapping ι

ι(x) = (x + A[]; (f1(x), . . . , f∆(x))).

Then ι(A) is dense in L. In fact, if not, then we could find (y; η) ∈ L such that(y; η)(⊥)ι(A). It would follow that

(x + A[],−y) =∆∑

j=1

ηjfj(x), x ∈ A,

Page 271: Recent Advances in Operator Theory and its Applications: The Israel Gohberg Anniversary Volume

Almost Pontryagin Spaces 265

and, therefore, the right-hand side would be continuous with respect to T . Byassumption η = 0 and further y(⊥)A in P. This is not possible as A is dense in P.

The mapping ι is isometric with respect to [., .]. Thus by defining O = O(.,.),((L, [., .],O), ι) is a completion of (A, [., .]). By the definition of ι the functionalsf1, . . . , f∆ are continuous with respect to ι−1(O).

Assume now that ((L′, [., .]′,O′), ι′) is another completion of (A, [., .]) suchthat ∆(L′, [., .]′) = ∆ and such that f1, . . . , f∆ are continuous with respect toι′−1(O′). Let (., .)′ be a Hilbert space scalar product on L′ which induces O′. Byelementary considerations from the theory of locally convex vector spaces we canfactor f1, . . . , f∆ through the isotropic part A()′ of A with respect to (ι(.), ι(.))′.Note that A()′ is also the set of all points in A which have exactly the sameneighborhoods as 0 with respect to the topology ι′−1(O′).

Clearly, (L′, (., .)′) is isomorphic to the completion of A/A()′ with respect to(ι(.), ι(.))′. Hence by continuation to the completion we obtain continuous linearfunctionals g1, . . . , g∆ on (L′, [., .]′,O′) such that f1 = g1 ι′, . . . , f∆ = g∆ ι′.

By Proposition 3.5 (L′/L′[]′ , [., .]′) is a Pontryagin space. We denote by π thefactorization mapping. As (A/A[], [., .]) is isometrically embedded by π ι′ in thisPontryagin space Remark 4.1 shows that (L′/L′[]′ , [., .]′) is an isomorphic copy of(P, [., .]). Let φ : L′/L′[]′ → P be this isomorphism, which satisfies φπ ι′ = idA.

Let 0 = x ∈ L′[]′ be such that g1(x) = · · · = g∆(x) = 0. By elementary lin-ear algebra we find a non-trivial linear combination g of the functionals g1, . . . , g∆

which vanishes on L′[]′ . Hence, we find a functional f on P such that g = f φπ.But then a → f(a+A[]) is a non-trivial linear combination of f1, . . . , f∆ which iscontinuous with respect to T . By assumption this is ruled out. Thus the intersec-tion of the kernels of gj, j = 1, . . . ,∆, has no point in common with L′[]′ exceptof 0. Since the intersection of ∆ hyperplanes has codimension at most ∆, we have

L′[]′+(ker(g1) ∩ · · · ∩ ker(g∆)) = L′, (4.3)

and see that the mapping

ϕ : L′ → L, x → (φ π(x); (g1(x), . . . , g∆(x))),

is bijective. Moreover, ϕ is isometrically with respect to [., .] and satisfies ϕ ι′ = ι.Since in the decomposition (4.3) all subspaces are closed, the Open Mapping The-orem implies that ϕ is bicontinuous with respect to O′ and O.

Remark 4.5. With the notation from Proposition 4.4

〈., .〉 = (., .)M +∆∑

j=1

fj(.)fj(.), (4.4)

is a non-negative inner product on A. It is easy to see that ι induces an isomorphismfrom the completion of (A/A〈〉, 〈., .〉) onto (L, (., .)). In particular, O〈.,.〉 = ι−1(O).

Page 272: Recent Advances in Operator Theory and its Applications: The Israel Gohberg Anniversary Volume

266 M. Kaltenback, H. Winkler and H. Woracek

The completion constructed in Proposition 4.4 appeared in implicit formsalready in various papers. See for example [7].

Definition 4.6. We call the completion of (A, [., .]) constructed in Proposition 4.4the completion of (A, [., .], (fi)i=1,...,∆).

Corollary 4.7. Let (A, [., .]) be an inner product space with κ−(A, [., .]) = κ < ∞,and let T be the topology determined by [., .] on A (see Remark 4.1).

Let (fi)i=1,...,∆ and (f ′i)i=1,...,∆ be two sets of complex linear functionals on

A such that no linear combination of (fi)i=1,...,∆ and no linear combination of(f ′

i)i=1,...,∆ is continuous with respect to T .The completion of (A, [., .], (fi)i=1,...,∆) is isomorphic to the completion of

(A, [., .], (f ′i)i=1,...,∆) if and only if the functionals f ′

1, . . . , f′∆ are continuous with

respect to the topology induced by 〈., .〉 defined in (4.4) on A.

Proof. We denote by ((L, [., .],O), ι) and ((L′, [., .]′,O′), ι′) the completions of thetriplets (A, [., .], (fi)i=1,...,∆) and (A, [., .], (f ′

i)i=1,...,∆), respectively. Moreover, let(gi)i=1,...,∆ and (g′i)i=1,...,∆ be the continuous linear functionals on L and L′, re-spectively, such that fi = gi ι and f ′

i = g′i ι′, respectively.If the two completions are isomorphic by the isomorphism φ : (L, [., .],O) →

(L′, [., .]′,O′), then g′i φ, i = 1, . . . ,∆, are continuous functionals on (L, [., .],O).By Remark 4.5 f ′

i = g′i ι′ = g′i φ ι is continuous on A with respect to thetopology induced by 〈., .〉.

Conversely, if f ′1, . . . , f

′∆ are continuous with respect to the topology induced

by 〈., .〉, then by continuation to the completion we obtain continuous linear func-tionals (h′

i)i=1,...,∆ on (L, [., .],O) such that f ′i = h′

iι. By the uniqueness assertionin Proposition 4.4 ((L, [., .],O), ι) is also a completion of (A, [., .], (f ′

i)i=1,...,∆) .

5. Almost reproducing kernel Pontryagin spaces

Objects of intensive studies are the so-called reproducing kernel Pontryagin spaces.These are Pontryagin spaces (P, [., .]) which consist of functions F mapping someset M into C such that there exist K(., t) ∈ P, t ∈M , with

F (t) = [F,K(., t)], F ∈ P, t ∈M. (5.1)

An equivalent definition of reproducing kernel Pontryagin spaces is the assumptionthat (P, [., .]) consists of complex valued functions on a set M such that the pointevaluations are continuous at all points of M .

The first approach to reproducing kernel Pontryagin spaces does not have animmediate generalization to almost Pontryagin spaces but the second does.

Definition 5.1. Let (L, [., .],O) be an almost Pontryagin space, and assume thatthe elements of L are complex valued functions on a set M . This space is a calledan almost reproducing kernel Pontryagin spaces on M , if for any t ∈M the linear

Page 273: Recent Advances in Operator Theory and its Applications: The Israel Gohberg Anniversary Volume

Almost Pontryagin Spaces 267

functionalft : F → F (t), F ∈ L,

is continuous on L with respect to O.

Remark 5.2. As the elements of L are functions we see that the family (ft)t∈M ofpoint evaluation functionals is point separating. Hence Proposition 2.9 yields theuniqueness of the topology O for which the functionals ft, t ∈M , are continuous.Consequently, we are going to skip the topology and write almost reproducingkernel Pontryagin spaces as pairs (L, [., .]).

A major setback to the study of almost reproducing kernel Pontryagin spacesis the fact that in the case ∆(L, [., .]) > 0 we do not find a reproducing kernelK(s, t) which satisfies (5.1). However, we do have the following

Proposition 5.3. Let (L, [., .]) be an almost reproducing kernel Pontryagin space ona set M and put ∆ = ∆(L, [., .]). Moreover, let N be a separating subset of M , i.e.,assume that the family (ft)t∈N is point separating. Then there exist t1, . . . , t∆ ∈ N ,c ∈ R, and R(., t) ∈ L such that

F (t) = [F,R(., t)] + c(F (t1)R(t, t1) + · · ·+ F (t∆)R(t, t∆)

), F ∈ L, t ∈M.

Proof. The number ∆ is by definition the dimension of L[] = kerG, where G =G(.,.) is the Gram operator with respect to a Hilbert space product (., .) inducingthe topology of (L, [., .]). By Corollary 2.4 we may choose (., .) such that G = I+L,where L is a selfadjoint finite rank operator.

Because of the assumption on N by induction one can easily show the exis-tence of points t1, . . . , t∆ ∈ N , such that h ∈ L[] and h(tj) = 0, j = 1, . . . ,∆,implies h = 0.

Because of the continuity of point evaluations we find elements K(., t) ∈ Lwith

F (t) = (F,K(., t)), F ∈ L.

We define the following selfadjoint operator H of finite rank on L

H(F ) =∆∑

j=1

F (tj)K(., tj).

Let K = ker(H) ∩ ker(L), then K(⊥) is finite-dimensional since the selfadjointoperators H and L are of finite rank, and K(⊥) contains the range of L and H . Forz ∈ C it follows that the restriction of the operator I + L + zH onto K is equalto the identity. Hence I + L + zH is invertible on L if and only if it is invertibleon K(⊥). To show that I + L + zH is invertible for z = i, let (I + L + iH)F = 0.Then (HF,F ) = 0 = ((I+L)F, F ), and the form of H implies that F (tj) = 0, j =1, . . . ,∆. It follows that H(F ) = 0, and hence (I + L)F = 0, or F ∈ L[] as I + Lis the Gram operator. The definition of the points t1, . . . , t∆ implies that F = 0,that is, the operator (I + L + iH) is invertible.

Page 274: Recent Advances in Operator Theory and its Applications: The Israel Gohberg Anniversary Volume

268 M. Kaltenback, H. Winkler and H. Woracek

Thus det((I + L+ zH)|K[⊥]) is not identically zero, and therefore has only adiscrete zero set. In particular, we find c ∈ R such that (I +L+ cH) is invertible.Now set

R(., t) = (I + L + cH)−1K(., t).Note that because of the selfadjointness of I + L + cH ,

R(s, t) = (R(., t),K(., s)) = (K(., t), R(., s)) = R(t, s).

For F ∈ L and t ∈M we have

[F,R(., t)] = ((I + L + cH)F,R(., t))− c(HF,R(., t))

= F (t)− c

∆∑j=1

F (tj)R(t, tj).

6. Examples of almost Pontryagin spaces

As the first topic of this section we are going to sketch the continuation problem forhermitian functions with finitely many negative squares on intervals [−2a, 2a] tothe whole real axis. We will meet inner product spaces and completions in the senseof Section 4. Taking into account also a possible degeneracy of this completion oneobtains a refinement of classical results on the number of all possible extensionsof the given hermitian functions with finitely many negative squares to R. For acomplete treatment of this topic see [7].

Definition 6.1. Let a > 0 be a real number, and assume that f : [−2a, 2a] → C isa continuous function. We say that f is hermitian if it satisfies f(−t) = f(t), t ∈[−2a, 2a], and f is said to be hermitian with κ(∈ N ∪ 0) many negative squaresif the kernel f(t − s), s, t ∈ (−a, a), has κ negative squares. The set of all suchfunctions we denote by Pκ,a.

By Pκ we denote the set of all continuous hermitian functions with κ negativesquares on R, i.e., f(t− s), s, t ∈ R, has κ negative squares.

For κ = 0 the function f is called positive definite.The continuation problem is to find for given f ∈ Pκ,a and κ ∈ N ∪ 0 all

possible extensions f of f to the whole real axis such that f ∈ Pκ. Trivially, by thedefinition of the respective classes a necessary condition for the existence of suchextensions is κ ≤ κ. The following classical result can be found for example in [3].

Theorem 6.2. Let f ∈ Pκ,a. Then either f has exactly one extension belongingto Pκ, or it has infinitely many extensions in Pκ. In the latter case f also hasinfinitely many extensions in Pκ for every κ ≥ κ.

This result originates from the following operator theoretic considerations.First let (P(f), [., .]) be the reproducing kernel Pontryagin space on (−a, a) havingk(s, t) = f(s− t) as its reproducing kernel. As we assume f ∈ Pκ,a the degree ofnegativity of (P(f), [., .]) is κ. Clearly, (P(f), [., .]) is the completion of (A(f), [., .])where A(f) is the linear hull of k(., t) : t ∈ (−a, a).

Page 275: Recent Advances in Operator Theory and its Applications: The Israel Gohberg Anniversary Volume

Almost Pontryagin Spaces 269

Moreover, a certain differential operator S(f) is constructed on (P(f), [., .]).This operator is symmetric and densely defined. Its defect elements are given by

ker (S(f)∗ − z) = eizs, z ∈ C \ R,

as a function of s ∈ (−a, a) if they belong to P(f). Thus S(f) has either defectindices (1, 1) or (0, 0) depending on whether eizs belongs to this space or not.

A crucial fact in verifying Theorem 6.2 is that all extensions of f belongingto Pκ correspond bijectively to all P(f)-minimal selfadjoint extensions A of S(f)in a possibly larger Pontryagin space P ⊇ P(f) with κ−(P, [., .]) = κ. HerebyP(f)-minimal means

cls(P(f) ∪ (A− z)−1x : x ∈ P(f), z ∈ ρ(A)) = P.

Hence, in the case that S(f) has defect index (0, 0) or, equivalently, that S(f) isselfadjoint there are no P(f)-minimal selfadjoint extensions of S(f) other thanS(f) itself. Therefore, f has exactly one extension in Pκ.

If S(f) has defect index (1, 1), then there are infinitely many P(f)-minimalselfadjoint extensions A of S(f) and, hence, infinitely many extensions in Pκ.Moreover, in this case the extensions of f in Pκ for κ ≥ κ correspond bijectively toall P(f)-minimal selfadjoint extensions A of S(f) in a Pontryagin space P ⊇ P(f)with κ−(P, [., .]) = κ, and there are also infinitely many of them for an arbitraryκ ≥ κ.

Theorem 6.2 seems to give a sufficiently satisfactory answer to the continu-ation problem. But as some examples show it can happen that f has exactly oneextension in Pκ but infinitely many extensions in Pκ for some κ > κ. How doesthis fit in with the operator theoretic approach mentioned above?

Here almost Pontryagin spaces come into play. In the case that S(f) has defectindex (1, 1) the fact that eizs, z ∈ C \ R belongs to P(f) can be reformulated bysaying that

Fz :∑

j

αjk(., tj) →∑

j

αjeiztj

are continuous linear functionals on (A, [., .]) for all z ∈ C \ R.If f has a unique extension f0 ∈ Pκ, i.e., S(f) has defect (0, 0), then these

functionals are not continuous. But it can happen that by refining the topologyon (A, [., .]) by finitely many functionals Fz1 , . . . , Fz∆ , zj ∈ C\R as in Remark 4.5we obtain a topology O on (A, [., .]) such that all functionals Fz , z ∈ C \ R, arecontinuous. Hereby let ∆ ∈ N always be chosen such that Fz1 , . . . , Fz∆ is a minimalset of functionals such that all the functionals Fz , z ∈ C \R, are continuous withrespect to O. Then no linear combination of Fz1 , . . . , Fz∆ is continuous with respectto the topology induced by [., .] on A as in Remark 4.1.

Now let ((Q(f), [., .],O(f)), ι) be the completion of (A, [., .], (Fzj )j=1,...,∆).On (Q(f), [., .],O(f)) one can find a symmetric operator T (f) with defect index(1, 1). For the concept of symmetric operators on almost Pontryagin spaces see[8]. In that paper almost Pontryagin spaces were always considered as degeneratesubspaces of Pontryagin spaces, and they were not yet called almost Pontryagin

Page 276: Recent Advances in Operator Theory and its Applications: The Israel Gohberg Anniversary Volume

270 M. Kaltenback, H. Winkler and H. Woracek

spaces. Similarly as for S(f) the extensions f ∈ Pκ of f which differ from f0

correspond bijectively to all Q(f)-minimal selfadjoint extensions A of T (f) in aPontryagin space P ⊇ Q(f) with κ−(P, [., .]) = κ.

Since every Pontryagin space P which contains Q must satisfy κ−(P, [., .]) ≥∆(Q, [., .]) + κ−(Q, [., .]), there exist extensions f ∈ Pκ, f = f0 of f if only ifκ ≥ ∆+κ. In fact, for these κ there always exist infinitely many extensions in Pκ.These considerations yield the following refinement of Theorem 6.2.

Theorem 6.3. Let f ∈ Pκ,a. Then there exists ∆ ∈ 0 ∪ N ∪ ∞ such that• If ∆ > 0, then f has a unique extension in Pκ.• f has no extensions in Pκ for κ < κ < ∆ + κ.• f has infinitely many extensions in Pκ for κ ≥ ∆ + κ.

As a second topic in the present section we give an example of an interestingclass of almost reproducing kernel Pontryagin spaces. In fact, we are going toconsider the indefinite generalization of Hilbert space of entire functions introducedby Louis de Branges (see [1], [9], [10], [11]).

Definition 6.4. An inner product space (L, [., .]) is called a de Branges space (dB-space) if the following three axioms hold true:(dB1) (L, [., .]) is an almost reproducing kernel Pontryagin space on C consisting

of entire functions.(dB2) If F ∈ L then F# ∈ L, where F#(z) = F (z). Moreover,

[F#, G#] = [G,F ].

(dB3) If F ∈ L and z0 ∈ C \ R with F (z0) = 0, thenz − z0z − z0

F (z) ∈ L,

as a function of z. Moreover, if also G ∈ L with G(z0) = 0, then[z − z0z − z0

F (z),z − z0z − z0

G(z)]

= [F,G].

In many cases one can assume that a dB-space also satisfies

For all t ∈ R there exists F ∈ L such that F (t) = 0. (6.1)

One of the main results about dB-spaces is that the set of all admissible dB-subspaces of a given dB-space is totally ordered. To explain this in more detail, letus start with a dB-space satisfying (6.1). We call a subspace K of L a dB-subspaceof (L, [., .]) if (K, [., .]) itself is a dB-space. It is called an admissible dB-subspaceif (K, [., .]) also satisfies (6.1). The following result originates from [1] and wasgeneralized to the indefinite situation in [9].

Theorem 6.5. Let (L, [., .]) be a dB-space satisfying (6.1). Then the set of all ad-missible dB-subspaces is totally ordered with respect to inclusion, i.e., if P and Qare two admissible dB-subspaces of (L, [., .]), then P ⊆ Q or Q ⊆ P.

Page 277: Recent Advances in Operator Theory and its Applications: The Israel Gohberg Anniversary Volume

Almost Pontryagin Spaces 271

One may think of the degenerate members of the chain of admissible dB-subspaces of a given dB-space as singularities. Thus it is desirable not to have toomany of this kind. In [9] the following result was obtained.

Theorem 6.6. With the same assumptions as in Theorem 6.5 the number of ad-missible dB-subspaces K of (L, [., .]) with ∆(K, [., .]) > 0 is finite.

The presence of singularities is exactly what distinguishes the classical –positive definite – case from the indefinite situation. Thus, to obtain a thoroughunderstanding of the structure of an indefinite dB-space, it is inevitable to dealwith degenerated spaces.

References

[1] L. de Branges, Hilbert spaces of entire functions, Prentice-Hall, London 1968[2] J. Bognar, Indefinite inner product spaces, Springer Verlag, Berlin 1974

[3] M. Grossmann, H. Langer, Uber indexerhaltende Erweiterungen eines hermiteschenOperators im Pontrjaginraum, Math. Nachr. 64(1974), 289–317

[4] I.S. Iohvidov, M.G. Kreın, Spectral theory of operators in spaces with indefinite metricI., Trudy Moskov. Mat. Obsc. 5 (1956), 367–432

[5] I.S. Iohvidov, M. G. Kreın, Spectral theory of operators in spaces with indefinitemetric II., Trudy Moskov. Mat. Obsc. 8 (1959), 413–496

[6] P. Jonas, H. Langer, B. Textorius, Models and unitary equivalence of cyclic selfadjointoperators in Pontryagin spaces, Oper. Theory Adv. Appl. 59(1992), 252–284

[7] M. Kaltenback, H. Woracek, On extensions of Hermitian functions with a finitenumber of negative squares, J. Operator Theory 40 (1998), no. 1, 147–183

[8] M. Kaltenback, H. Woracek, The Krein formula for generalized resolvents in degen-erated inner product spaces, Monatsh. Math. 127 (1999), no. 2, 119–140

[9] M. Kaltenback, H. Woracek, Pontryagin spaces of entire functions I, Integral Equa-tions Operator Theory 33(1999), 34–97

[10] M. Kaltenback, H. Woracek, Pontryagin spaces of entire functions II, Integral Equa-tions Operator Theory 33(1999), 305–380

[11] M. Kaltenback, H. Woracek, Pontryagin spaces of entire functions III, Acta Sci.Math. (Szeged) 69 (2003), 241–310

M. Kaltenback and H. WoracekInstitut fur Analysis und Scientific ComputingTechnische Universitat WienWiedner Hauptstraße 8–10A-1040 Wien, Austriae-mail: [email protected]: [email protected]

H. WinklerFaculteit der Wiskunde en NatuurwetenschappenRijksuniversiteit GroningenNL-9700 AV Groningen, The Netherlandse-mail: [email protected]

Page 278: Recent Advances in Operator Theory and its Applications: The Israel Gohberg Anniversary Volume

Operator Theory:Advances and Applications, Vol. 160, 273–298c© 2005 Birkhauser Verlag Basel/Switzerland

Multivariable ρ-contractions

Dmitry S. Kalyuzhnyı-Verbovetzkiı

Dedicated to Israel Gohberg on his 75th birthday

Abstract. We suggest a new version of the notion of ρ-dilation (ρ > 0) of anN-tuple A = (A1, . . . , AN) of bounded linear operators on a common Hilbertspace. We say that A belongs to the class Cρ,N if A admits a ρ-dilation

A = (A1, . . . , AN) for which ζA := ζ1A1 + · · · + ζN AN is a unitary operatorfor each ζ := (ζ1, . . . , ζN) in the unit torus TN . For N = 1 this class coincideswith the class Cρ of B. Sz.-Nagy and C. Foias. We generalize the knowndescriptions of Cρ,1 = Cρ to the case of Cρ,N , N > 1, using so-called Aglerkernels. Also, the notion of operator radii wρ, ρ > 0, is generalized to thecase of N-tuples of operators, and to the case of bounded (in a certain strongsense) holomorphic operator-valued functions in the open unit polydisk DN ,with preservation of all the most important their properties. Finally, we showthat for each ρ > 1 and N > 1 there exists an A = (A1, . . . , AN ) ∈ Cρ,N

which is not simultaneously similar to any T = (T1, . . . , TN ) ∈ C1,N , howeverif A ∈ Cρ,N admits a uniform unitary ρ-dilation then A is simultaneouslysimilar to some T ∈ C1,N .

Mathematics Subject Classification (2000). Primary 47A13; Secondary 47A20,47A56.

Keywords. Multivariable, ρ-dilations, linear pencils of operators, operatorradii, Agler kernels, similarity to a 1-contraction.

1. Introduction

Linear pencils of operators LA(z) := A0 + z1A1 + · · ·+ zNAN on a Hilbert spacewhich take contractive (resp., unitary or J-unitary for some signature operatorJ = J∗ = J−1) values for all z = (z1, . . . , zN ) in the unit torus TN := ζ ∈ CN :|ζk| = 1, k = 1, . . . , N serve as one of possible generalizations of a single contrac-tive (resp., unitary, J-unitary) operator on a Hilbert space. They appear in con-structions of Agler’s unitary colligation and corresponding conservative (unitary)scattering N -dimensional discrete-time linear system of Roesser type [1, 8], andalso of Fornasini–Marchesini type [7], and dissipative (contractive), conservative

Page 279: Recent Advances in Operator Theory and its Applications: The Israel Gohberg Anniversary Volume

274 D.S. Kalyuzhnyı-Verbovetzkiı

(unitary) or J-conservative (J-unitary) scattering N -dimensional linear systems ofone more form introduced in our paper [17] and studied in [17, 18, 19, 23, 20, 21, 7].These constructions, in particular, provide the transfer function realization formu-lae for certain classes of holomorphic functions [1, 8, 17, 19, 21, 7], the solutionsto the Nevanlinna–Pick interpolation problem [2, 8], the Toeplitz corona problem[2, 8], and the commutant lifting problem [6] in several variables.

In [18] we developed the dilation theory for multidimensional linear systems,and in particular gave a necessary and sufficient condition for such a system to havea conservative dilation. As a special case, this gave a criterion for the existenceof a unitary dilation of a contractive (on TN ) linear pencil of operators on aHilbert space. Linear pencils of operators satisfying this criterion inherit the mostimportant properties of single contraction operators on a Hilbert space (note that,due to [22], not all linear pencils which take contractive operator values on TN

satisfy this criterion).The purpose of the present paper is to develop the theory of ρ-contractions in

several variables in the framework of “linear pencils approach”. We introduce thenotion of ρ-dilation of an N -tuple A = (A1, . . . , AN ) of bounded linear operatorson a common Hilbert space by means of a simultaneous ρ-dilation, in the senseof B. Sz.-Nagy and C. Foias [32, 34], of the values of a homogeneous linear pencilof operators zA :=

∑Nk=1 zkAk. The class Cρ,N consists of those N -tuples of

operators A = (A1, . . . , AN ) (ρ-contractions) for which there exists a ρ-dilationA = (A1, . . . , AN ) such that the operators ζA =

∑Nk=1 ζkAk are unitary for all

ζ = (ζ1, . . . , ζN ) ∈ TN . On the one hand, this class generalizes the class Cρ,1 = Cρ

of Sz.-Nagy and Foias [32, 34] consisting of operators which admit a unitary ρ-dilation to the case N > 1. On the other hand, this class generalizes the class ofN -tuples of operators A for which the associated linear pencil of operators zAadmits a unitary dilation in the sense of [18] (this corresponds to ρ = 1) to thecase of N -tuples of operators A which have a unitary ρ-dilation for ρ = 1.

The paper is organized as follows. Section 2 gives preliminaries on ρ-contracti-ons for the case N = 1. Namely, we recall the relevant definitions, the knowncriteria for an operator to be a ρ-contraction, i.e., to belong to the class Cρ ofSz.-Nagy and Foias, the notion of operator radii wρ and their properties, and thetheorem on similarity of ρ-contractions to contractions. In Section 3 we give thedefinitions of a ρ-dilation of an N -tuple of operators, and of the class Cρ,N ofρ-contractions for the case N > 1, and prove a theorem which generalizes thecriteria of ρ-contractiveness to this case, as well as to the case 0 < ρ = 1. Someproperties of classes Cρ,N are discussed. Then it is shown that the notions of aρ-contraction and of the corresponding class Cρ,N , as well as the theorem justmentioned, can be extended to holomorphic functions on the open unit polydiskDN := z ∈ CN : |zk| < 1, k = 1, . . . , N that are bounded in a certain strongsense, though the notion of unitary ρ-dilation is not relevant any more in this case.In Section 4 we define operator radii wρ,N of N -tuples of operators, and operator-function radii w(∞)

ρ,N of bounded holomorphic functions on DN , ρ > 0. These radii

Page 280: Recent Advances in Operator Theory and its Applications: The Israel Gohberg Anniversary Volume

Multivariable ρ-contractions 275

generalize wρ’s and inherit all the most important properties of them. In Section 5we prove that for each ρ > 1 and N > 1 there exists an A = (A1, . . . , AN ) ∈ Cρ,N

which is not simultaneously similar to any T = (T1, . . . , TN) ∈ C1,N . Then weintroduce the classes Cu

ρ,N , ρ > 0, of N -variable ρ-contractions A = (A1, . . . , AN )which admit a uniform unitary ρ-dilation. We prove that if A ∈ Cu

ρ,N for someρ > 1 then A is simultaneously similar to some T ∈ Cu

1,N . Note, that since the classCu

ρ,N (as well as Cρ,N ) increases as a function of ρ, for any ρ ≤ 1 an A ∈ Cuρ,N

(resp., A ∈ Cρ,N ) belongs to Cu1,N (resp., C1,N ) itself. We show the relation of

our results to ones of G. Popescu [30] where a different notion of multivariable ρ-contractions has been introduced, and the relevant theory has been developed. Theclasses Cu

ρ,N , ρ > 0, which appear in Section 5 in connection with the similarityproblem discussed there, certainly deserve a further investigation.

2. Preliminaries

Let L(X ,Y) denote the Banach space of bounded linear operators mapping aHilbert space X into a Hilbert space Y, and L(X ) := L(X ,X ). For ρ > 0, anoperator A ∈ L(X ) is said to be a ρ-dilation of an operator A ∈ L(X ) if X ⊃ Xand

An = ρPX An|X , n ∈ N, (2.1)

where PX denotes the orthogonal projection onto the subspace X in X . If, more-over, A is a unitary operator then A is called a unitary ρ-dilation of A. In [32] (seealso [34]) B. Sz.-Nagy and C. Foias introduced the classes Cρ, ρ > 0, consistingof operators which admit a unitary ρ-dilation. Due to B. Sz.-Nagy [31], the classC1 is precisely the class of all contractions, i.e., operators A such that ‖A‖ ≤ 1.C.A. Berger [9] showed that the class C2 is precisely the class of all operatorsA ∈ L(X ), for some Hilbert space X , which have the numerical radius

w(A) = sup|〈Ax, x〉| : x ∈ X , ‖x‖ = 1equal to at most one. Thus, the classes Cρ, ρ > 0, provide a framework for simul-taneous investigation of these two important classes of operators.

Recall that the Herglotz (or Caratheodory) class H(X ) (respectively, theSchur class S(X )) consists of holomorphic functions f on the open unit disk Dwhich take values in L(X ) and satisfy Re f(z) = f(z) + f(z)∗ $ 0 in the sense ofpositive semi-definiteness of an operator (resp., ‖f(z)‖ ≤ 1) for all z ∈ D. Let usrecall some known characterizations of the classes Cρ.

Theorem 2.1. Let A ∈ L(X ) and ρ > 0. The following statements are equivalent:(i) A ∈ Cρ;(ii) the function kA

ρ (z, w) := ρIX − (ρ − 1) ((zA + (wA)∗) + (ρ − 2)(wA)∗zAsatisfies kA

ρ (z, z) $ 0 for all z ∈ clos(D);(iii) the function ψA

ρ (z) := (1− 2ρ)IX + 2

ρ (IX − zA)−1 belongs to H(X );

(iv) the function ϕAρ (z) := zA ((− 1)zA− ρIX )−1 belongs to S(X ).

Page 281: Recent Advances in Operator Theory and its Applications: The Israel Gohberg Anniversary Volume

276 D.S. Kalyuzhnyı-Verbovetzkiı

Conditions (ii) and (iii) of Theorem 2.1 each characterizing the class Cρ

appear in [32], while condition (iv) is due to C. Davis [11].

Corollary 2.2. Condition (ii) in Theorem 2.1 can be replaced by(ii′) kA

ρ (C,C) := ρIX ⊗ IHC − (ρ− 1)(A⊗ C + (A⊗ C)∗)+ (ρ− 2)(A⊗ C)∗(A⊗ C) $ 0for any contraction C on a Hilbert space HC .

Proof. Indeed, (ii′)⇒(ii), hence (ii′)⇒(i). Conversely, if A ∈ Cρ ∩ L(X ) then forany contraction C on HC one has A⊗C ∈ Cρ because, by [31], C admits a unitarydilation C, and A admits a unitary ρ-dilation A, thus A⊗ C is a unitary ρ-dilationof A⊗ C:

(A⊗ C)n = An ⊗ Cn = (ρPX An|X )⊗ (PHC Cn|HC)

= ρPX⊗HC(An ⊗ Cn)|X ⊗ HC

= ρPX⊗HC(A⊗ C)n|X ⊗ HC , n ∈ N.

Therefore, kAρ (C,C) = kA⊗C

ρ (1, 1) $ 0, i.e., (ii′) is valid. Corollary 2.3. Condition

(v): A⊗ C ∈ Cρ for any contraction C on a Hilbert space,

is equivalent to each of conditions (i)–(iv) of Theorem 2.1.

Proof. See the proof of Corollary 2.2. Any operator A ∈ Cρ is power-bounded :

‖An‖ ≤ ρ, n ∈ N, (2.2)

moreover, its spectral radius

ν(A) = limn→+∞ ‖An‖ 1

n (2.3)

is at most one. In [32] an example of a power-bounded operator which is notcontained in any of the classes Cρ, ρ > 0, is given. However, J.A.R. Holbrook [15]showed that any bounded linear operator A with ν(A) ≤ 1 can be approximatedin the operator norm topology by elements of the classes Cρ. More precisely, if C∞denotes the class of bounded linear operators with spectral radius at most one,and X is a Hilbert space, then

C∞ ∩ L(X ) = clos

⋃0<ρ<∞

Cρ ∩ L(X )

. (2.4)

For a fixed Hilbert space X , the class Cρ as a function of ρ increases [32]:

Cρ ⊂ Cρ′ for ρ < ρ′. (2.5)

Moreover, it was shown by E. Durszt [13] that Cρ increases strictly for dimX ≥ 2:

Cρ = Cρ′ for ρ = ρ′.

Page 282: Recent Advances in Operator Theory and its Applications: The Israel Gohberg Anniversary Volume

Multivariable ρ-contractions 277

Proposition 2.4. For X = C, the classes Cρ coincide for all ρ ≥ 1, and strictlyincrease for 0 < ρ < 1:

Cρ Cρ′ for 0 < ρ < ρ′ ≤ 1.

Proof. If a ∈ C ∼= L(C) belongs to Cρ then ‖a‖ = |a| = ν(a) ≤ 1. Hence Cρ ⊂ C1

for any ρ > 0. Since (2.5) implies Cρ ⊃ C1 for ρ ≥ 1, we get Cρ = C1 for this case,that proves the first part of this proposition.

For the proof of the second part, we will show that for any ε, ρ : 0 < ε < ρ < 1,one has

a :=ρ

2− ρ∈ Cρ\Cρ−ε. (2.6)

If 0 ≤ ε < ρ then, by condition (ii) in Theorem 2.1, the inclusion a ∈ Cρ−ε isequivalent to

ρ− ε− (ρ− ε− 1)(az + az) + (ρ− ε− 2)|az|2 ≥ 0, z ∈ clos(D),

which for a = ρ2−ρ turns into

ρ− ε− 2(ρ− ε− 1)ρ

2− ρr cos θ+ (ρ− ε− 2)

(ρr

2− ρ

)2

≥ 0, r ∈ [0, 1], θ ∈ [0, 2π).

Since ρ − ε − 1 < 0, the left-hand side of this inequality, as a function of θ for afixed r, has a minimum at θ = π, so the latter condition turns into

ρ− ε + 2(ρ− ε− 1)ρr

2− ρ+ (ρ− ε− 2)

(ρr

2− ρ

)2

≥ 0, r ∈ [0, 1].

The left-hand side attains its minimum at r = 1, thus the latter inequality turnsinto

ρ− ε + 2(ρ− ε− 1)ρ

2− ρ+ (ρ− ε− 2)

2− ρ

)2

= − 4ε(2− ρ)2

≥ 0,

which is possible if and only if ε = 0. Thus, (2.6) is true. The properties of the classes Cρ become more clear due to the following

numerical characteristics of operators. J.A.R. Holbrook [15] and J.P. Williams[35], independently, introduced for any A ∈ L(X ) the operator radii

wρ(A) := infu > 0 :1uA ∈ Cρ. (2.7)

Theorem 2.5. wρ(·) has the following properties:(i) wρ(A) <∞;(ii) wρ(A) > 0 unless A = 0, moreover, wρ(A) ≥ 1

ρ‖A‖;(iii) ∀µ ∈ C, wρ(µA) = |µ|wρ(A);(iv) wρ(A) ≤ 1 if and only if A ∈ Cρ;(v) wρ(·) is a norm on L(X ) for any ρ : 0 < ρ ≤ 2, and not a norm on

L(X ), dimX ≥ 2, for any ρ > 2;(vi) w1(A) = ‖A‖ (of course, here ‖ · ‖ is the operator norm on L(X ) with

respect to the Hilbert-space metric on X );

Page 283: Recent Advances in Operator Theory and its Applications: The Israel Gohberg Anniversary Volume

278 D.S. Kalyuzhnyı-Verbovetzkiı

(vii) w2(A) = w(A);(viii) w∞(A) := lim

ρ→+∞wρ(A) = ν(A);

(ix) wρ(IX ) =

1 for ρ ≥ 1,2ρ − 1 for 0 < ρ < 1;

(x) if 0 < ρ < ρ′ then wρ′ (A) ≤ wρ(A) ≤(

2ρ′ρ − 1

)wρ′ (A), thus wρ(A) is

continuous in ρ and non-increasing as ρ increases;(xi) if ‖A‖ = 1 and A2 = 0 then, for any ρ > 0, wρ(A) = 1

ρ ;(xii) if for some ρ0 one has wρ0(A) > w∞(A) (= ν(A)) then for any ρ > ρ0

one has wρ0(A) > wρ(A);(xiii) lgwρ(A) is a convex function in ρ, 0 < ρ < +∞;(xiv) wρ(A) is a convex function in ρ, 0 < ρ < +∞;(xv) the function hA(ρ) := ρwρ(A) is non-decreasing on [1,+∞), and non-

increasing on (0, 1);(xvi) for any ρ such that 0 < ρ < 2 one has ρwρ(A) = (2 − ρ)w2−ρ(A), and

limρ↓0

ρ2wρ(A) = w2(A) (= w(A));

(xvii) ∀ρ : 0 < ρ ≤ 1, wρ(A) ≥(

2ρ − 1

)w2(A);

(xviii) ∀A,B ∈ L(X ), ∀ρ ≥ 1, wρ(AB) ≤ ρ2wρ(A)wρ(B), moreover, ρ2 is thebest constant in this inequality for the case dimX ≥ 2;

(xix) ∀A,B ∈ L(X ), ∀ρ : 0 < ρ < 1, wρ(AB) ≤ (2 − ρ)ρwρ(A)wρ(B), more-over, (2−ρ)ρ is the best constant in this inequality for the case dimX ≥ 2;

(xx) ∀ρ > 0, ∀n ∈ N, wρ(An) ≤ wρ(A)n.

Properties (i)–(xii), (xviii), and (xx) were proved by J.A.R. Holbrook [15],properties (xiii)–(xvi) were discovered by T. Ando and K. Nishio [4]. Property (xix)was shown by K. Okubo and T. Ando [26], and follows also from (xvi) and (xviii).Finally, property (xvii) easily follows from (x) and (xvi). Indeed, for 0 < ρ ≤ 1 onehas w2−ρ(A) ≥ w2(A), hence ρwρ(A) = (2 − ρ)w2−ρ(A) ≥ (2 − ρ)w2(A), whichimplies (xvii).

We have listed in Theorem 2.5 only the most important, as it seems to us,properties of operator radii wρ(·). Other properties of wρ(·) can be found in [15,16, 14, 4, 26, 5] and elsewhere.

Let us note that properties of the classes Cρ discussed before Theorem 2.5,including Proposition 2.4, can be deduced from properties (iv), (vi)–(x) in Theo-rem 2.5. Due to property (iv) in Theorem 2.5, operators from the classes Cρ arecalled ρ-contractions.

Any A ∈ Cρ satisfies the following generalized von Neumann inequality [32]:for any polynomial p of one variable

‖p(A)‖ ≤ max|z|≤1

|ρp(z) + (1− ρ)p(0)|. (2.8)

Page 284: Recent Advances in Operator Theory and its Applications: The Israel Gohberg Anniversary Volume

Multivariable ρ-contractions 279

Let A ∈ L(X ), B ∈ L(Y). Then A is said to be similar to B if there exists abounded invertible operator S ∈ L(X ,Y) such that

A = S−1BS. (2.9)

B. Sz.-Nagy and C. Foias proved in [33] (see also [34]) that any A ∈ Cρ is similarto some T ∈ C1, i.e., any ρ-contraction is similar to a contraction.

To conclude this section, let us remark that the classes Cρ are of continuousinterest, e.g., see recent works [12, 10, 5, 24, 27]. In [30] the classesCρ were extendedto a multivariable setting; we shall discuss this generalization in Section 5.

3. The classes Cρ,N

Let ρ > 0. We will say that an N -tuple of operators A = (A1, . . . , AN ) ∈ L(X )N

is a ρ-dilation of an N -tuple of operators A = (A1, . . . , AN ) ∈ L(X )N if X ⊃ X ,and for any z = (z1, . . . , zN ) ∈ CN the operator zA =

∑Nk=1 zkAk is a ρ-dilation,

in the sense of [32], of the operator zA =∑N

k=1 zkAk, i.e.,

(zA)n = ρPX (zA)n|X , z ∈ CN , n ∈ N. (3.1)

These relations are equivalent to

At = ρPX At|X , t ∈ ZN+ := τ ∈ ZN : τk ≥ 0, k = 1, . . . , N, (3.2)

where At, t ∈ ZN+ , are symmetrized multi-powers of A:

At :=t!|t|!

∑σ

A[σ(1)] · · ·A[σ(|t|)],

and analogously for A. Here for a multi-index t = (t1, . . . , tN), t! := t1! · · · tN ! and|t| := t1+· · ·+tN ; σ runs over the set of all permutations with repetitions in a stringof |t| numbers from the set 1, . . . , N such that the κth number [κ] ∈ 1, . . . , Nappears in this string t[κ] times. Say, if t = (1, 2, 0, . . . , 0) then

At =A1A

22 + A2A1A2 + A2

2A1

3.

In the case of a commutative N -tuple A one has At = At11 · · ·AtN

N , i.e., a usualmulti-power.

Note 3.1. Compare (3.1) and (3.2) with (2.1).

In the case ρ = 1 the notion of ρ-dilation of an N -tuple of operators A =(A1, . . . , AN ) coincides with the notion of dilation of A (or corresponding linearpencil zA) as defined in [18].

We will call A ∈ L(X )N a unitary ρ-dilation of A ∈ L(X )N if A is a ρ-dilationof A and for any ζ ∈ TN the operator ζA =

∑Nk=1 ζkAk is unitary. The class of

operator N -tuples which admit a unitary ρ-dilation will be denoted by Cρ,N .

Page 285: Recent Advances in Operator Theory and its Applications: The Israel Gohberg Anniversary Volume

280 D.S. Kalyuzhnyı-Verbovetzkiı

Let CN denote the family of all N -tuples C = (C1, . . . , CN ) of commutingstrict contractions on a common Hilbert space HC, i.e., CkCj = CjCk and ‖Ck‖ <1 for all k, j ∈ 1, . . . , N. An L(X )-valued function

k(z, w) =∑

(t,s)∈ZN+×ZN

+

k(t, s)wszt, (z, w) ∈ DN × DN ,

which is holomorphic in z ∈ DN and anti-holomorphic in w ∈ DN , will be calledan Agler kernel if

k(C,C) :=∑

(t,s)∈ZN+×ZN

+

k(t, s)⊗C∗sCt $ 0, C ∈ CN , (3.3)

where the series converges in the operator norm topology on L(X ⊗HC). TheAgler–Herglotz class AHN (X ) (resp., the Agler–Schur class ASN (X )) is the classof all L(X )-valued functions f holomorphic on DN for which k(z, w) = f(z)+f(w)∗

(resp., k(z, w) = IX − f(w)∗f(z)) is an Agler kernel. Agler kernels, as well as theclasses AHN (X ) and ASN (X ), were defined and studied by J. Agler in [1]. Thevon Neumann inequality [25] implies that AS1(X ) = S(X ) and AH1(X ) = H(X ).

Remark 3.2. The function kAρ (z, w) from condition (ii) in Theorem 2.1, due to

Corollary 2.2, is an Agler kernel (N = 1).

Theorem 3.3. Let A ∈ L(X )N , ρ > 0. The following conditions are equivalent:

(i) A ∈ Cρ,N ;(ii) the function kA

ρ,N (z, w) := ρIX − (ρ− 1) ((zA + (wA)∗)+ (ρ− 2)(wA)∗zA isan Agler kernel on DN × DN ;

(iii) the function ψAρ,N (z) := (1− 2

ρ)IX + 2ρ(IX − zA)−1 belongs to AHN (X );

(iv) the function ϕAρ,N (z) := zA ((− 1)zA− ρIX )−1 belongs to ASN (X );

(v) A⊗C :=∑N

k=1 Ak ⊗ Ck ∈ Cρ = Cρ,1 for all C ∈ CN .

Remark 3.4. This theorem generalizes Theorem 2.1 with condition (ii) replacedby condition (ii′) from Corollary 2.2, and added condition (v) from Corollary 2.3.

Proof of Theorem 3.3. (i)⇔(iii). The proof of this part combines the idea of B. Sz.-Nagy and C. Foias [32] for the proof of the equivalence (i)⇔(iii) in Theorem 2.1(see Remark 3.4) with Agler’s representation of functions from AHN (X ) [1]. LetA = (A1, . . . , AN ) ∈ Cρ,N ∩L(X )N , and A = (A1, . . . , AN ) ∈ L(X )N be a unitaryρ-dilation of A. By Corollary 4.3 in [18], the linear function LA(z) = zA belongsto the class ASN (X ). Since for any C ∈ CN one has (1+ε)C ∈ CN for a sufficientlysmall ε > 0, the operator A ⊗C, as well as A ⊗ (1 + ε)C, is contractive. Thus,A⊗C is a strict contraction, and the series

IX⊗HC+ 2

∞∑n=1

(A⊗C)n

Page 286: Recent Advances in Operator Theory and its Applications: The Israel Gohberg Anniversary Volume

Multivariable ρ-contractions 281

converges in the L(X ⊗ HC)-norm to

(IX⊗HC+ A⊗C)(IX⊗HC

− A⊗C)−1.

Moreover,Re[(IX⊗HC

+ A⊗C)(IX⊗HC− A⊗C)−1] $ 0. (3.4)

Therefore,

PX⊗HC(IX⊗HC

+ A⊗C)(IX⊗HC− A⊗C)−1|X ⊗ HC

= PX⊗HC

(IX⊗HC

+ 2∞∑

n=1

(A⊗C)n

)∣∣∣∣∣X ⊗HC

= IX⊗HC + 2∞∑

n=1

∑|t|=n

n!t!

(PX ⊗ IHC)(At ⊗Ct)|X ⊗ HC

= IX⊗HC +2ρ

∞∑n=1

∑|t|=n

n!t!

At ⊗Ct = IX⊗HC +2ρ

∞∑n=1

(A⊗C)n

= (1 − 2ρ)IX⊗H +

2ρ(IX⊗H −A⊗C)−1 = ψA

ρ,N (C),

and (3.4) implies ReψAρ,N (C) $ 0. Since the function (IX + zA)(IX − zA)−1 is

well defined and holomorphic on DN , so is

ψAρ,N (z) = PX (IX + zA)(IX − zA)−1|X , z ∈ DN , (3.5)

and we obtain ψAρ,N ∈ AHN (X ).

Conversely, let ψAρ,N ∈ AHN (X ). Since ψA

ρ,N (0) = IX , according to [1], thereexist a Hilbert space X ⊃ X , its subspaces X1, . . . , XN satisfying X =

⊕Nk=1 Xk,

and a unitary operator U ∈ L(X ) such that

ψAρ,N (z) = PX (IX + U(zP))(IX − U(zP))−1|X , z ∈ DN , (3.6)

where zP :=∑N

k=1 zkPXk, i.e., we get (3.5) with Ak = UPXk

, k = 1, . . . , N . Notethat for each ζ ∈ TN the operator ζA is unitary. Developing both parts of (3.6)into the series in homogeneous polynomials convergent in the operator norm, weget

IX +2ρ

∞∑n=1

(zA)n = IX + 2∞∑

n=1

PX (zA)n|X , z ∈ DN ,

that implies the relations

(zA)n = ρPX (zA)n|X , n ∈ N,

for all z ∈ DN , and hence for all z ∈ CN . Thus, A is a unitary ρ-dilation of A,and A ∈ Cρ,N . The equivalence (i)⇔(iii) is proved.

Note that in this proof we have established that each Agler representation(3.6) of ψA

ρ,N gives rise to a unitary ρ-dilation A of A, and vice versa. Indeed, we

Page 287: Recent Advances in Operator Theory and its Applications: The Israel Gohberg Anniversary Volume

282 D.S. Kalyuzhnyı-Verbovetzkiı

already showed that (3.6) determines A. Conversely, if A ∈ L(X )N is a unitaryρ-dilation of A, then (3.5) holds. Set U :=

∑Nk=1 Ak ∈ L(X ) and Xk := A∗

kX , k =1, . . . , N . Then U is unitary, Xk is a closed subspace in X for each k = 1, . . . , N ,the subspaces Xk are pairwise orthogonal, and X =

⊕Nk=1 Xk (see Proposition 2.4

in [17]). Thus, (3.5) turns into (3.6).(v)⇔(iv). Let (v) be true. By Theorem 2.1 applied for A⊗C with a C ∈ CN ,

one has ϕA⊗Cρ ∈ S(X ⊗HC). For ε > 0 small enough, (1 + ε)C ∈ CN , hence

A⊗ (1 + ε)C ∈ Cρ, and ϕA⊗(1+ε)Cρ ∈ S(X ⊗HC). Thus,

ϕAρ,N (C) = A⊗C((ρ− 1)A⊗C− ρIX⊗HC)−1 = ϕA⊗(1+ε)C

ρ

(1

1 + ε

)is a contraction on X ⊗ HC. In particular, ϕA

ρ,N (z) is well defined, holomorphicand contractive on DN . Finally, ϕA

ρ,N ∈ ASN (X ).Conversely, if (iv) is true then for any C ∈ CN :

ϕA⊗Cρ (λ) = λA⊗C((ρ− 1)λA⊗C− ρIX⊗HC)−1 = ϕA

ρ,N (λC)

is well defined, holomorphic and contractive for λ ∈ D. Thus, ϕA⊗Cρ ∈ S(X ⊗HC),

and by Theorem 2.1, A⊗C ∈ Cρ.(v)⇔(iii) and (v)⇔(ii) are proved analogously, using the following relations

for C ∈ CN , λ ∈ D:

ψAρ,N (C) = ψA⊗(1+ε)C

ρ

(1

1 + ε

), ψA⊗C

ρ (λ) = ψAρ,N (λC),

kAρ,N (C,C) = kA⊗C

ρ (1, 1), kA⊗Cρ (λ, λ) = kA

ρ,N (λC, λC).

The proof is complete.

Remark 3.5. For the case ρ = 1 each of conditions (ii)–(v) in Theorem 3.3 meansthat for any C ∈ CN the operator A⊗C is a contraction. In other words,

A ∈ C1,N ∩ L(X )N ⇐⇒ LA ∈ ASN (X ),

that coincides with in [18, Corollary 4.3] (here LA(z) := zA, z ∈ CN ).

Let us also note that using [18, Corollary 4.3] one can deduce (v) from (i)directly. Indeed, if A ∈ L(X )N is a unitary ρ-dilation of A ∈ L(X )N then for anyC ∈ CN by [18, Corollary 4.3] the operator A⊗C is a contraction. Therefore, dueto [31], A⊗C ∈ L(X ⊗HC) has a unitary dilation U ∈ L(K), K ⊃ X ⊗HC. Thenfor any n ∈ N:

(A⊗C)n = ρPX⊗HC(A⊗C)n|X ⊗ HC

= ρPX⊗HC(PX⊗HCUn|X ⊗ HC)|X ⊗ HC

= ρPX⊗HCUn|X ⊗ HC,

i.e., U is a unitary ρ-dilation of the operator A⊗C. Thus, A⊗C ∈ Cρ.

Page 288: Recent Advances in Operator Theory and its Applications: The Israel Gohberg Anniversary Volume

Multivariable ρ-contractions 283

Let us define the numerical radius of an N -tuple of operators A ∈ L(X )N as

w(N)(A) := supC∈CN

w(A ⊗C). (3.7)

For N = 1, w(1)(A) = w(A). Indeed,

w(1)(A) = sup‖C‖<1

w(A⊗ C) ≥ sup0<ε<1

w(A ⊗ (1− ε)IHC ) = sup0<ε<1

(1− ε)w(A)

= w(A);

w(1)(A) = sup‖C‖<1

w(A⊗ C) ≤ sup‖C‖<1

w(A)‖C‖ = w(A).

Here we used the properties w(A ⊗ IH) = w(A) and w(A ⊗ B) ≤ w(A)‖B‖ validfor any A ∈ L(X ), B ∈ L(H) (see, e.g., [14]).

Proposition 3.6. A ∈ C2,N ⇐⇒ w(N)(A) ≤ 1.

Proof. By Theorem 3.3, A ∈ C2,N if and only if A⊗C ∈ C2 = C2,1 for anyC ∈ CN . This, in turn, means that w(A ⊗C) ≤ 1 for any C ∈ CN (by Berger’sresult mentioned in Section 2), i.e., w(N)(A) ≤ 1.

Theorem 3.7. If A ∈ Cρ,N ∩ L(X )N for a ρ > 0, then LA ∈ ρASN (X ). For anyρ > 0 such that ρ = 1, there exists an A ∈ L(X )N such that LA ∈ ρASN (X ) andA /∈ Cρ,N .

Proof. Let A ∈ Cρ,N ∩L(X )N for some ρ > 0, and C ∈ CN . Then A has a unitaryρ-dilation A ∈ L(X )N , and

‖A⊗C‖ =

∥∥∥∥∥N∑

k=1

Ak ⊗ Ck

∥∥∥∥∥ =

∥∥∥∥∥ρ(PX ⊗ IHC)

(N∑

k=1

Ak ⊗ Ck

)∣∣∣∣∣X ⊗HC

∥∥∥∥∥≤ ρ

∥∥∥∥∥N∑

k=1

Ak ⊗ Ck

∥∥∥∥∥ = ρ∥∥∥A⊗C

∥∥∥ ≤ ρ

(here we used again Corollary 4.3 in [18]). Thus, LA ∈ ρASN (X ).Now, let 0 < ρ = 1, and A ∈ L(X )N be such that 1

ρLA(ζ) = 1ρζA is a unitary

operator for each ζ ∈ TN . Then, again by Corollary 4.3 in [18], LA ∈ ρASN (X ).Suppose there exists a unitary ρ-dilation A ∈ L(X )N of A. Then for any ζ ∈ TN ,LA(ζ) = ζA = ρPX (ζA)|X . Hence, for any ζ ∈ TN and x ∈ X ,

‖ζAx‖ = ‖x‖ = ‖1ρζAx‖ = ‖PX (ζA)x‖,

that is possible only if ζAx ∈ X for all ζ ∈ TN and x ∈ X . Therefore, for n > 1,

ρn‖x‖ = ‖(ζA)nx‖ = ‖ρPX (ζA)nx‖ = ρ‖(ζA)nx‖ = ρ‖x‖,that is impossible for x = 0. Thus, A /∈ Cρ,N .

Note 3.8. Compare Theorem 3.7 with Remark 3.5.

Page 289: Recent Advances in Operator Theory and its Applications: The Israel Gohberg Anniversary Volume

284 D.S. Kalyuzhnyı-Verbovetzkiı

The same argument as in the proof of the first part of Theorem 3.7 showsthat, for A ∈ Cρ,N ,

‖(A⊗C)n‖ ≤ ρ, n ∈ N, C ∈ CN . (3.8)

Note 3.9. Compare (3.8) with (2.2).

This uniform (in C ∈ CN) power-boundedness of an N -tuple of operators Ais, in our setting, a generalization of power-boundedness of a single operator. Letus define the spectral radius of an N -tuple of operators A ∈ L(X )N as

ν(N)(A) := limn→+∞

(sup

C∈CN

‖(A⊗C)n‖) 1

n

. (3.9)

Note 3.10. Compare (3.9) with (2.3).

In other words, ν(N)(A) = ν(N,∞)(LA), where ν(N,∞)(f) is the spectral radiusof an element f of the Banach algebra H∞

N (X ) consisting of holomorphic L(X )-valued functions f on DN which satisfy

‖f‖∞,N := supC∈CN

‖f(C)‖ <∞

(this algebra was introduced in [1]). Here f(C) is defined in the same manner ask(C,C) in (3.3), i.e., for

f(z) =∑

t∈ZN+

ftzt, z ∈ DN ,

f(C) :=∑

t∈ZN+

ft ⊗Ct, C ∈ CN ,

where the latter series converges in the L(X ⊗HC)-norm. For N = 1, ν(1)(A) =ν(A). Indeed,

ν(1)(A) = limn→+∞

(sup

‖C‖<1

‖(A⊗ C)n‖) 1

n

= limn→+∞

(sup

‖C‖<1

‖An ⊗ Cn‖) 1

n

= limn→+∞

(‖An‖ sup

‖C‖<1

‖Cn‖) 1

n

= limn→+∞ ‖An‖ 1

n = ν(A).

Remark 3.11. For any A ∈ Cρ,N , by virtue of (3.8), ν(N)(A) ≤ 1.

Theorem 3.12. For a fixed Hilbert space X and any N ≥ 1 the class Cρ,N increasesas a function of ρ:

Cρ,N ⊂ Cρ′,N for ρ < ρ′.Moreover, for dimX ≥ 2, Cρ,N increases strictly:

Cρ,N = Cρ′,N for ρ = ρ′.

For dimX = 1 the classes Cρ,N coincide for all ρ ≥ 1, and strictly increase for0 < ρ < 1.

Page 290: Recent Advances in Operator Theory and its Applications: The Israel Gohberg Anniversary Volume

Multivariable ρ-contractions 285

Proof. For N = 1 this theorem is true (see Section 2). For N > 1 it follows fromthe equivalence (i)⇔(v) in Theorem 3.3. Theorem 3.13. For any A ∈ Cρ,N , C ∈ CN , and a polynomial p of one variable,

‖p(A⊗C)‖ ≤ max|z|≤1

|ρp(z) + (1− ρ)p(0)|.

Proof. This result follows from the generalized von Neumann inequality (2.8) andthe equivalence (i)⇔(v) in Theorem 3.3.

Let us remark that results of this section on N -tuples of operators from theclasses Cρ,N can be extended to elements of H∞

N (X ), though the notion of unitaryρ-dilation no longer makes sense for this case. Define C

(∞)ρ,N as a class of functions

f ∈ H∞N (X ) such that f(C) ∈ Cρ = Cρ,1 for any C ∈ CN . Then, in particular,

Theorem 3.3 implies that A ∈ Cρ,N if and only if LA ∈ C(∞)ρ,N . The following

analogue of Theorem 3.3 is easily obtained.

Theorem 3.14. Let f ∈ H∞N (X ) and ρ > 0. The following conditions are equivalent:

(i) f ∈ C(∞)ρ,N ;

(ii) the function kfρ,N (z, w) := ρIX − (ρ− 1)(f(z) + (f(w)∗) + (ρ − 2)f(w)∗f(z)

is an Agler kernel on DN × DN ;(iii) the function ψf

ρ,N (z) := (1− 2ρ)IX + 2

ρ(IX − f(z))−1 belongs to AHN (X );

(iv) the function ϕfρ,N (z) := f(z)((− 1)f(z)− ρIX )−1 belongs to ASN (X ).

Clearly, H∞N (X ) ∩ C

(∞)1,N = ASN (X ). Set

w(N,∞)(f) := supC∈CN

w(f(C)). (3.10)

Note 3.15. Compare (3.10) with (3.7).

Remark 3.16. Proposition 3.6 extends directly to the class C(∞)2,N , with f ∈ H∞

N (X )in the place of A ∈ L(X )N , and w(N,∞)(f) in the place of w(N)(A). Remark 3.11extends directly to f ∈ H∞

N (X ) in the place of A ∈ L(X )N , and ν(N,∞)(f) in theplace of ν(N)(A). Also, Theorems 3.12 and 3.13 extend to the classes C(∞)

ρ,N .

4. Multivariable operator and operator-function radii

In this section we extend the notion of operator radii wρ, 0 < ρ ≤ ∞, to themultivariable case, i.e., to N -tuples of bounded linear operators and to elementsof the Banach algebra H∞

N (X ). Let 0 < ρ <∞ and f ∈ H∞N (X ). Set

w(∞)ρ,N (f) := infu > 0 :

1uf ∈ C

(∞)ρ,N ,

and for A ∈ L(X )N , define

wρ,N (A) := w(∞)ρ,N (LA).

Page 291: Recent Advances in Operator Theory and its Applications: The Israel Gohberg Anniversary Volume

286 D.S. Kalyuzhnyı-Verbovetzkiı

Due to our remark preceding to Theorem 3.14,

wρ,N (A) = infu > 0 :1uA ∈ Cρ,N. (4.1)

Note 4.1. Compare (4.1) with (2.7).

Clearly, for N = 1 and A ∈ L(X ), wρ,1(A) = wρ(A).

Lemma 4.2. For f ∈ H∞N (X ), A ∈ L(X )N ,

w(∞)ρ,N (f) = sup

C∈CN

wρ(f(C)), (4.2)

wρ,N (A) = supC∈CN

wρ(A⊗C). (4.3)

Proof. Let f ∈ H∞N (X ). Then for u > 0, 1

uf ∈ C(∞)ρ,N if and only if for any C ∈ CN

one has 1uf(C) ∈ Cρ. Therefore,

w(∞)ρ,N (f) = infu > 0 :

1uf ∈ C

(∞)ρ,N = infu > 0 : ∀C ∈ CN ,

1uf(C) ∈ Cρ

= supC∈CN

infu > 0 :1uf(C) ∈ Cρ = sup

C∈CN

wρ(f(C)),

i.e., (4.2) is true. Now, (4.3) follows from (4.2) and the definition of wρ,N (A).

Theorem 4.3. 1. All properties (i)–(xx) listed in Theorem 2.5 are satisfied forw

(∞)ρ,N (·) in the place of wρ(·); f, g ∈ H∞

N (X ) in the place of A,B ∈ L(X ); w(N,∞)(·)in the place of w(·); and ν(N,∞)(·) in the place of ν(·).

2. Properties (i)–(xvii) listed in Theorem 2.5 are satisfied for wρ,N (·) in theplace of wρ(·); A ∈ L(X )N in the place of A ∈ L(X ); w(N)(·) in the place of w(·);and ν(N)(·) in the place of ν(·).

Proof. 1. Let f ∈ H∞N (X ). Then ‖f‖∞,N = supC∈CN ‖f(C)‖ < ∞. By properties

(vi) and (x) in Theorem 2.5, and Lemma 4.2, if 0 < ρ ≤ 1 then

w(∞)ρ,N (f) = sup

C∈CN

wρ(f(C)) ≤(

2ρ− 1

)sup

C∈CN

w1(f(C))

=(

2ρ− 1

)sup

C∈CN

‖f(C)‖ <∞,

and if ρ > 1 then

w(∞)ρ,N (f) = sup

C∈CN

wρ(f(C)) ≤ supC∈CN

w1(f(C)) = supC∈CN

‖f(C)‖ <∞.

Thus, property (i) is fulfilled.Properties (ii)–(vii), (ix)–(xi), (xiii)–(xv), and (xvii)–(xx) easily follow from

the properties in Theorem 2.5 with the same numbers, and Lemma 4.2.The proof of property (viii) is an adaptation of the proof of Theorem 5.1

in [15] to our case. First of all, let us remark that property (iv) implies that if

Page 292: Recent Advances in Operator Theory and its Applications: The Israel Gohberg Anniversary Volume

Multivariable ρ-contractions 287

u > w(∞)ρ,N (f) then 1

uf ∈ C(∞)ρ,N , and for any C ∈ CN one has 1

uf(C) ∈ Cρ. Inparticular,

supC∈CN

∥∥∥∥(f(C)u

)n∥∥∥∥ ≤ ρ, n ∈ N.

Therefore, ν(N,∞)(

fu

)≤ 1, i.e., ν(N,∞)(f) ≤ u. Thus, for any ρ > 0, ν(N,∞)(f) ≤

w(∞)ρ,N (f), moreover,

ν(N,∞)(f) ≤ limρ→+∞w

(∞)ρ,N (f)

(note, that due to property (x), w(∞)ρ,N (f) is a non-increasing and bounded from

below function of ρ, hence it has a limit as ρ→ +∞).For the proof of the opposite inequality, let us first show that if ν(N,∞)(g) < 1

for some g ∈ H∞N (X ) then beginning with some ρ0 > 0 (i.e., for all ρ ≥ ρ0) one

has g ∈ C(∞)ρ,N . Indeed, in this case there exists an s > 1 such that ν(N,∞)(sg) < 1.

Then there exists a B > 0 such that

sn supC∈CN

‖g(C)n‖ ≤ B, n ∈ N.

Hence, for any C ∈ CN ,

Reψgρ,N (C) =

(1− 2

ρ

)IX⊗HC +

Re(IX⊗HC − g(C))−1

= IX⊗HC +2ρ

Re∞∑

n=1

g(C)n $(

1− 2ρ

∞∑n=1

‖g(C)n‖)IX⊗HC

$(

1− 2ρ

∞∑n=1

B

sn

)IX⊗HC =

(1− 2B

ρ(s− 1)

)IX⊗HC $ 0

as soon as ρ ≥ 2Bs−1 . Thus, by Theorem 3.14, g ∈ C

(∞)ρ,N for any ρ ≥ 2B

s−1 .Now, if ν(N,∞)(f) = 0 then for any k ∈ N, ν(N,∞)(kf) = 0. Hence, for ρ ≥ ρ0

we have kf ∈ C(∞)ρ,N , and by property (iv), w(∞)

ρ,N (kf) ≤ 1. Thus,

limρ→+∞w

(∞)ρ,N (f) ≤ 1

k

for any k ∈ N, andlim

ρ→+∞w(∞)ρ,N (f) = 0 = ν(N,∞)(f),

as required.If ν(N,∞)(f) > 0 then for any ε > 0,

ν(N,∞)

(f

(1 + ε)ν(N,∞)(f)

)=

11 + ε

< 1.

Then for ρ ≥ ρ0,

w(∞)ρ,N

(f

(1 + ε)ν(N,∞)(f)

)≤ 1,

Page 293: Recent Advances in Operator Theory and its Applications: The Israel Gohberg Anniversary Volume

288 D.S. Kalyuzhnyı-Verbovetzkiı

hence w(∞)ρ,N (f) ≤ (1 + ε)ν(N,∞)(f). Passing to the limit as ρ → +∞, and then as

ε ↓ 0, we getlim

ρ→+∞w(∞)ρ,N (f) ≤ ν(N,∞)(f),

as required. Thus, property (viii) is proved.For the proof of property (xii), it is enough to suppose, by virtue of positive

homogeneity of w(∞)ρ,N (·) and ν(N,∞)(·), that for f ∈ H∞

N (X ) one has w(∞)ρ0,N (f) =

1, ν(N,∞)(f) < 1, and prove that for any ρ > ρ0, w(∞)ρ,N (f) < 1. By Theorem 3.14

and property (iv) in the present theorem,ρ0

2ψf

ρ0,N (z) =(ρ0

2− 1

)IX + (IX − f(z))−1 ∈ AHN (X ),

i.e., for any C ∈ CN ,

Re[ρ0

2ψf

ρ0,N (C)]

= Re[(ρ0

2− 1

)IX⊗HC + (IX⊗HC − f(C))−1

]$ 0,

and for any ρ > ρ0,

Re[ρ2ψf

ρ,N (C)]

= Re[(ρ

2− 1

)IX⊗HC + (IX⊗HC − f(C))−1

]$ ρ− ρ0

2IX⊗HC .

Since the resolvent Rf (λ) := (λIX − f)−1 is continuous in the H∞N (X )-norm

on the resolvent set of f , and ν(N,∞)(f) < 1, for ε > 0 small enough, one hasν(N,∞)((1 + ε)f) < 1, and

Re[ρ2ψ

(1+ε)fρ,N (C)

]= Re

[(ρ2− 1

)IX⊗HC + (IX⊗HC − (1 + ε)f(C))−1

]$ 0

for any C ∈ CN , i.e., ρ2ψ

(1+ε)fρ,N ∈ AHN (X ), and ψ

(1+ε)fρ,N ∈ AHN (X ). Hence, by

Theorem 3.14, (1 + ε)f ∈ C(∞)ρ,N which means, by property (iv), that w

(∞)ρ,N ((1 +

ε)f) ≤ 1. Thus, w(∞)ρ,N (f) ≤ 1

1+ε < 1, as required.The first part of property (xvi) in this theorem follows from property (xvi)

in Theorem 2.5, and Lemma 4.2. For the proof of the second part of (xvi), we useproperties (xv) and (xvi) from Theorem 2.5, property (xv) in the present theorem,and Lemma 4.2:

limρ↓0

ρ

2w

(∞)ρ,N (f) = sup

0<ρ<1

ρ2w

(∞)ρ,N (f)

= sup

0<ρ<1sup

C∈CN

ρ2wρ(f(C))

= sup

C∈CN

sup0<ρ<1

ρ2wρ(f(C))

= sup

C∈CN

limρ↓0

ρ

2wρ(f(C))

= sup

C∈CN

w2(f(C)) = w(∞)2,N (f).

The proof of property (xvi), as well as part 1 of this theorem, is complete.Part 2 follows from part 1.

Denote by C(∞)∞,N (resp., C∞,N ) the class of CN -bounded holomorphic operator

valued functions on DN (resp., N -tuples of bounded linear operators on a commonHilbert space) with spectral radius at most one.

Page 294: Recent Advances in Operator Theory and its Applications: The Israel Gohberg Anniversary Volume

Multivariable ρ-contractions 289

Theorem 4.4. Let X be a Hilbert space. Then

C(∞)∞,N ∩H∞

N (X ) = clos

⋃0<ρ<∞

(C(∞)ρ,N ∩H∞

N (X ))

; (4.4)

C∞,N ∩ L(X )N = clos

⋃0<ρ<∞

(Cρ,N ∩ L(X )N )

. (4.5)

Note 4.5. Compare (4.4) and (4.5) with (2.4).

Proof of Theorem 4.4. The inclusion “⊃” in (4.4) and (4.5) follows from Remarks3.11 and 3.16, and the fact that the set of CN -bounded holomorphic operator-valued functions on DN (resp., N -tuples of bounded operators) with spectral radiusat most one is closed in H∞

N (X ) (resp., L(X )N ).To show the inclusion “⊂” in (4.4), observe that for f ∈ C

(∞)∞,N ∩H∞

N (X ) and0 < r < 1, ν(N,∞)(rf) ≤ r < 1. By property (viii) from Theorem 4.3, for ρ0 > 0big enough, w(∞)

ρ0,N(rf) < 1, and by property (iv) from the same theorem,

rf ∈ C(∞)ρ0,N ∩H∞

N (X ) ⊂ clos

⋃0<ρ<∞

(C(∞)ρ,N ∩H∞

N (X ))

.

Passing to the limit as r ↑ 1, we get

f ∈ clos

⋃0<ρ<∞

(Cρ,N ∩H∞N (X ))

,

and the inclusion “⊂” in (4.4) follows. Analogously for the inclusion “⊂” in (4.5).

In view of property (iv) in Theorem 4.3, let us call the elements of the classCρ,N (N -variable) ρ-contractions.

5. On similarity of ρ-contractions to 1-contractionsin several variables

An N -tuple of operators A = (A1, . . . , AN ) ∈ L(X )N is said to be simultaneouslysimilar to an N -tuple of operators B = (B1, . . . , BN ) ∈ L(Y)N if there exists aboundedly invertible operator S ∈ L(X ,Y) such that

Ak = S−1BkS, k = 1, . . . , N, (5.1)

or equivalently,zA = S−1(zB)S, z ∈ CN . (5.2)

Note 5.1. Compare (5.1) and (5.2) with (2.9).

Theorem 5.2. For any ρ > 1 and N > 1, there exists an A = (A1, . . . , AN ) ∈ Cρ,N

which is not simultaneously similar to any T = (T1, . . . , TN ) ∈ C1,N .

Page 295: Recent Advances in Operator Theory and its Applications: The Israel Gohberg Anniversary Volume

290 D.S. Kalyuzhnyı-Verbovetzkiı

Proof. Let N = 2, and for any ε ≥ 0 set A(ε) = (A(ε)1 , A

(ε)2 ) ∈ L(C3)2, where

A(ε)1 :=

⎡⎣ 0 1+ε√2

00 0 0− 1+ε√

20 0

⎤⎦ , A(ε)2 :=

⎡⎣ 0 0 1+ε√2

1+ε√2

0 00 0 0

⎤⎦ .Then for any ε ≥ 0 and z ∈ C2,

zA(ε) =

⎡⎢⎣ 0 1+ε√2z1

1+ε√2z2

1+ε√2z2 0 0

− 1+ε√2z1 0 0

⎤⎥⎦ ,

(zA(ε))2 =

⎡⎢⎣ 0 0 00 (1+ε)2

2 z1z2(1+ε)2

2 z22

0 − (1+ε)2

2 z21 − (1+ε)2

2 z1z2

⎤⎥⎦ ,(zA(ε))3 = (zA(ε))4 = . . . = 0,

i.e., zA(ε) is a nilpotent operator of degree 3. Hence, for any ρ > 1 and z ∈ D2,

‖ϕA(0)

ρ,2 (z)‖ =∥∥∥zA(0)((ρ− 1)zA(0) − ρI)−1

∥∥∥ =

∥∥∥∥∥zA(0)

ρ

(I − ρ− 1

ρzA(0)

)−1∥∥∥∥∥

=

∥∥∥∥∥zA(0)

ρ+ (ρ− 1)

(zA(0)

ρ

)2∥∥∥∥∥ ≤ 1

ρ‖zA(0)‖+

ρ− 1ρ2

‖(zA(0))2‖

=1ρ

∥∥∥∥∥∥∥⎡⎢⎣ 0 z1√

2z2√2

z2√2

0 0− z1√

20 0

⎤⎥⎦∥∥∥∥∥∥∥+

ρ− 1ρ2

∥∥∥∥∥∥∥⎡⎣ 0

z2√2

− z1√2

⎤⎦⎡⎣ 0z1√2

z2√2

⎤⎦T∥∥∥∥∥∥∥

≤ 1ρ

+ρ− 1ρ2

=2ρ− 1ρ2

< 1.

Then, due to the von Neumann inequality in two variables [3], one has

‖ϕA(0)

ρ,2 (C)‖ ≤ 2ρ− 1ρ2

< 1, C ∈ C2,

i.e., ϕA(0)

ρ,2 ∈ AS2(C3). Analogously, for ε > 0 small enough (the choice of ε dependson ρ), one has

supC∈C2

‖ϕA(ε)

ρ,2 (C)‖ = supz∈D2

‖ϕA(ε)

ρ,2 (z)‖ ≤ 1 + ε

ρ+ (ρ− 1)

(1 + ε

ρ

)2

< 1,

i.e., ϕA(ε)

ρ,2 ∈ AS2(C3), and by Theorem 3.3, A(ε) ∈ Cρ,2.

Page 296: Recent Advances in Operator Theory and its Applications: The Israel Gohberg Anniversary Volume

Multivariable ρ-contractions 291

Let us show now that for any ε > 0 the pair A(ε) = (A(ε)1 , A

(ε)2 ) is not

simultaneously similar to any pair T = (T1, T2) ∈ C1,2. Observe that

(A(ε)1 + A

(ε)2 )(A(ε)

1 −A(ε)2 ) =

⎡⎢⎣ 0 1+ε√2

1+ε√2

1+ε√2

0 0− 1+ε√

20 0

⎤⎥⎦⎡⎢⎣ 0 1+ε√

2− 1+ε√

2

− 1+ε√2

0 0− 1+ε√

20 0

⎤⎥⎦=

⎡⎢⎣ −(1 + ε)2 0 00 (1+ε)2

2 − (1+ε)2

2

0 − (1+ε)2

2(1+ε)2

2

⎤⎥⎦ .

Thenlim

n→+∞ ‖[(A(ε)1 + A

(ε)2 )(A(ε)

1 −A(ε)2 )]n‖ = ∞. (5.3)

On the other hand, if A(ε) = (A(ε)1 , A

(ε)2 ) is simultaneously similar to some T =

(T1, T2) ∈ C1,2 then for any n ∈ N one would have

‖[(A(ε)1 +A

(ε)2 )(A(ε)

1 −A(ε)2 )]n‖ = ‖S−1[(T1 + T2)(T1 − T2)]nS‖ ≤ ‖S‖‖S−1‖ <∞,

since ‖T1 ± T2‖ ≤ 1. We get a contradiction with (5.3).Examples of N -tuples of operators from Cρ,N , ρ > 1, which are not simul-

taneously similar to any T ∈ C1,N for the case N > 2 can be obtained from theexamples of pairs A = (A(ε)

1 , A(ε)2 ) above, for sufficiently small ε > 0, by setting

zeros for the rest of operators in these N -tuples: A(ε) := (A(ε)1 , A

(ε)2 , 0, . . . , 0).

Let A = (A1, . . . , AN ) ∈ L(X )N . Then A = (A1, . . . , AN ) ∈ L(X )N is calleda uniform ρ-dilation of A if X ⊃ X and

∀n ∈ N, ∀i1, . . . , in ∈ 1, . . . , N, Ai1 · · ·Ain = ρPX Ai1 · · · Ain |X , (5.4)

or equivalently,

∀n ∈ N, ∀z(1), . . . , z(n) ∈ CN , z(1)A · · · z(n)A = ρPX z(1)A · · · z(n)A|X . (5.5)

Note 5.3. Compare (5.4) and (5.5) with (3.1) and (3.2).

Clearly, a uniform ρ-dilation is a ρ-dilation. If A ∈ L(X )N is a uniform ρ-dilation of A ∈ L(X ), and for any ζ ∈ TN , ζA is a unitary operator, then A iscalled a uniform unitary ρ-dilation of A. Denote by Cu

ρ,N the class of N -tuples ofoperators A = (A1, . . . , AN ) on a common Hilbert space which admit a uniformunitary ρ-dilation. Clearly, Cu

ρ,N ⊂ Cρ,N .

Theorem 5.4. Any A = (A1, . . . , AN ) ∈ Cuρ,N is simultaneously similar to some

T = (T1, . . . , TN) ∈ Cu1,N .

Proof. Let A = (A1, . . . , AN ) ∈ Cuρ,N∩L(X )N , and U = (U1, . . . , UN ) ∈ L(X )N be

a uniform unitary ρ-dilation of A. Let A ⊂ L(X ) be the minimal C∗-algebra whichcontains the operators IX , U1, . . . , UN , and B ⊂ L(X ) be the minimal algebra over

Page 297: Recent Advances in Operator Theory and its Applications: The Israel Gohberg Anniversary Volume

292 D.S. Kalyuzhnyı-Verbovetzkiı

C which contains the operators U1, . . . , UN . Clearly, B ⊂ A. Let ϕ : B −→ L(X )be a homomorphism defined on the generators as

ϕ : Uk −→ Ak, k = 1, . . . , N.

The algebra B consists of operators of the form

p(U) =∑

1≤k≤m, i1,...,ik∈1,...,Nαi1,...,ik

Ui1 · · ·Uik,

where αi1,...,ik∈ C for all i1, . . . , ik ∈ 1, . . . , N. Then

ϕ(p(U)) = ϕ(∑

αi1,...,ikUi1 · · ·Uik

) =∑

αi1,...,ikAi1 · · ·Aik

= p(A) = ρPXp(U)|X .

Therefore, if p(U) = 0 then ϕ(p(U)) = 0, and ϕ is correctly defined. The homo-morphism ϕ is completely bounded, i.e.,

‖ϕ‖cb := supn∈N

‖ idn ⊗ϕ‖ <∞,

where idn is the identical map of the matrix algebra Mn(C) onto itself. Moreover,‖ϕ‖cb ≤ ρ. Indeed, for any n ∈ N and a polynomial n × n matrix of N non-commuting variables,

P (X) = [pij(X)]ni,j=1 =

⎡⎣ ∑1≤k≤m, i1,...,ik∈1,...,N

α(ij)i1,...,ik

Xi1 · · ·Xik

⎤⎦n

i,j=1

,

‖(idn ⊗ϕ)(P (U))‖ = ‖(idn ⊗ϕ)([pij(U)]ni,j=1

)‖ = ‖[ϕ(pij(U))]ni,j=1‖

= ‖[pij(A)]ni,j=1‖ = ‖[ρPXpij(U)|X ]ni,j=1‖= ρ‖(ICn ⊗ PX )[pij(U)]ni,j=1 |Cn ⊗X‖≤ ρ‖[pij(U)]ni,j=1‖ = ρ‖P (U)‖.

Then, by Theorem 3.1 in [28], there exist a Hilbert space N , a completely contrac-tive homomorphism γ : B −→ L(N ) (i.e., such that ‖γ‖cb ≤ 1), and a boundedlyinvertible operator S ∈ L(X ,N ) such that

ϕ(b) = S−1γ(b)S, b ∈ B.Moreover, as was shown in the proof of Theorem 3.1 in [28], γ can be chosen inthe form

γ(b) = PNπ(b)|N , b ∈ B,where π : A −→ L(K) is a ∗-homomorphism, for some Hilbert space K ⊃ N . Inaddition, it follows from Theorem 2.7 and the proof of Theorem 2.8 in [28] thatone can choose K = K1 ⊕K1, for some Hilbert space K1, and

π(a) = π1(a)⊕ 0, a ∈ A,where π1 : A −→ L(K1) is a unital ∗-homomorphism. Set

Tk := γ(Uk) ∈ L(N ), k = 1, . . . , N.

Page 298: Recent Advances in Operator Theory and its Applications: The Israel Gohberg Anniversary Volume

Multivariable ρ-contractions 293

ThenAk = ϕ(Uk) = S−1TkS, k = 1, . . . , N.

It remains to show that T = (T1, . . . , TN ) ∈ Cu1,N . Set

Wk := π(Uk) ∈ L(K), k = 1, . . . , N.

Since for any n ∈ N and i1, . . . , in ∈ 1, . . . , N one has

Ti1 · · ·Tin = γ(Ui1 · · ·Uin) = PNπ(Ui1 · · ·Uin)|N= PNWi1 · · ·Win |N ,

W = (W1, . . . ,WN ) is a uniform 1-dilation of T = (T1, . . . , TN ), however, still notunitary. Actually,

Wk = π(Uk) = π1(Uk)⊕ 0 (=: W (1)k ⊕ 0), k = 1, . . . , N.

Since π1 is a unital ∗-homomorphism, and for any ζ ∈ TN ,

(ζU)∗ζU = IX = ζU(ζU)∗,

one has, for any ζ ∈ TN ,

(ζW(1))∗ζW(1) = IK1 = ζW(1)(ζW(1))∗,

where W(1) = (W (1)1 , . . . ,W

(1)N ). Set

Wk := W(1)k ⊕ δ1kV ∈ L(K1 ⊕R), k = 1, . . . , N,

where δij is the Kronecker symbol, and V is a unitary dilation of the zero operatoron K1, e.g., the two-sided shift on the space R := l2(K1) =

⊕+∞−∞K1 (here we

identify the space K1 with the subspace . . .⊕ 0 ⊕ 0 ⊕K1 ⊕ 0 ⊕ 0 ⊕ . . . inR). Then, for any ζ ∈ TN ,

(ζW)∗ζW = IK1⊕R = ζW(ζW)∗,

and W = (W1, . . . , WN ) ∈ L(K1 ⊕ R) is a uniform unitary 1-dilation of W =(W1, . . . ,WN ), and therefore, of T = (T1, . . . , TN ). Thus, T ∈ Cu

1,N , as required.

Theorem 5.4 is similar to the result of G. Popescu [30] on simultaneous simi-larity of ρ-contractions to 1-contractions in several variables, however his notion ofmultivariable ρ-contractions is different. Let us clarify the relation between thesetwo results. Denote by CP

ρ,N (we use here this notation instead of just Cρ, as in[30]) the Popescu class of all N -tuples A = (A1, . . . , AN ) of bounded linear opera-tors on a common Hilbert space, say X , which have a uniform isometric ρ-dilation,i.e., such an N -tuple of operators V = (V1, . . . , VN ) ∈ L(X )N , X ⊃ X , for which(1) V ∗

k Vk = IX , k = 1, . . . , N ;

(2) V ∗k Vj = 0, k = j;

(3) ∀n ∈ N, ∀i1, . . . , in, Ai1 · · ·Ain = ρPXVi1 · · ·Vin |X .

Page 299: Recent Advances in Operator Theory and its Applications: The Israel Gohberg Anniversary Volume

294 D.S. Kalyuzhnyı-Verbovetzkiı

Condition (2) in this definition can be replaced by

(2′)N∑

k=1

VkV∗k ' IX ,

since (1)&(2)⇐⇒(1)&(2’). According to [29], the class CP1,N coincides with the

class of N -tuples of operators A = (A1, . . . , AN ) (∈ L(X )N , for some Hilbertspace X ) such that

N∑k=1

AkA∗k ' IX .

By Theorem 4.5 in [30], any A = (A1, . . . , AN ) ∈ CPρ,N , ρ > 0, is simultaneously

similar to some T = (T1, . . . , TN ) ∈ CP1,N . This is a generalization of the theorem

of Sz.-Nagy and Foias [33] to several variables. Theorem 5.4 of the present paperis a different generalization of the same result, since our classes Cu

ρ,N are differentfrom Popescu’s classes CP

ρ,N for N > 1. More precisely, the following is true.

Theorem 5.5. For any N > 1 and ρ > 0, Cuρ,N CP

ρ,N .

Proof. Let A ∈ Cuρ,N ∩ L(X )N , N > 1, and U ∈ L(X )N be a uniform unitary

ρ-dilation of A. Since for any ζ ∈ TN the operator ζU is unitary, it follows that

ζU(ζU)∗ = IX , ζ ∈ TN ,

which impliesN∑

k=1

UkU∗k = IX .

Thus, by [29], U ∈ CP1,N . Let V ∈ L(X )N be a uniform isometric 1-dilation of U

in the sense of Popescu. Then for any n ∈ N and i1, . . . , in ∈ 1, . . . , N,

Ai1 · · ·Ain = ρPXUi1 · · ·Uin |X = ρPX (PXVi1 · · ·Vin |X )|X= ρPXVi1 · · ·Vin |X ,

i.e., V is a uniform isometric ρ-dilation of A in the sense of Popescu. Thus, A ∈CP

ρ,N . This proves the inclusion Cuρ,N ⊂ CP

ρ,N .Let us prove that this inclusion is proper for any N > 1 and ρ > 0. Firstly,

consider the case N = 2. Let B ∈ L(X0) be any operator of the class Cρ with‖B‖ = ρ. For example,

B :=[

0 ρ0 0

]∈ L(C2)

satisfies B2 = 0 and ‖B‖ = ρ, therefore by properties (iii) and (xi) in Theorem 2.5,wρ(B) = 1, and by property (iv) in the same theorem, B ∈ Cρ. Set X := X0⊕X0,

A1 :=[B 00 0

]∈ L(X ), A2 :=

[0 0B 0

]∈ L(X ). (5.6)

Page 300: Recent Advances in Operator Theory and its Applications: The Israel Gohberg Anniversary Volume

Multivariable ρ-contractions 295

Let U ∈ L(X0) be a unitary ρ-dilation of B. Set X := X0 ⊕ X0 ⊕ . . ., and identifyX = X0 ⊕X0 with the subspace X0 ⊕X0 ⊕ 0 ⊕ 0 ⊕ . . . in X . Set

V1 :=

⎡⎢⎢⎢⎢⎢⎢⎢⎣

U0

U0

. . .

⎤⎥⎥⎥⎥⎥⎥⎥⎦∈ L(X ), V2 :=

⎡⎢⎢⎢⎢⎢⎢⎢⎣

0U

0U

. . .

⎤⎥⎥⎥⎥⎥⎥⎥⎦∈ L(X ),

i.e., the operators V1 and V2 are introduced here as infinite block-diagonal matrices

with equal operator blocks[U0

]∈ L(X0, X0 ⊕ X0) (resp.,

[0U

]∈ L(X0, X0 ⊕

X0)) on the main diagonal. We will show that the pair V = (V1, V2) is a uniformisometric ρ-dilation of the pair A = (A1, A2) in the sense of Popescu. First of all,observe that

V ∗1 V1 = IX = V ∗

2 V2, V ∗1 V2 = V ∗

2 V1 = 0.

Next, the following relations hold:

∀k ∈ N, Ak1 =

[Bk 00 0

]=[ρPX0U

k|X0 00 0

]= ρPXV k

1 |X ;

∀k, n ∈ N, ∀i1, . . . , in ∈ 1, 2, Ak1A2Ai1 · · ·Ain = 0

= ρPXV k1 V2Vi1 · · ·Vin |X

(since Ak1A2 = 0, PX0⊕X0⊕0⊕0⊕...V

k1 V2 = 0);

A2 =[

0 0B 0

]=[

0 0ρPX0U |X0 0

]= ρPXV2|X ;

∀k, n ∈ N, ∀i1, . . . , in ∈ 1, 2, Ak+12 Ai1 · · ·Ain = 0

= ρPXV k+12 Vi1 · · ·Vin |X

(since A22 = 0, PX0⊕X0⊕0⊕0⊕...V

22 = 0);

∀k ∈ N, A2Ak1 =

[0 0

Bk+1 0

]=

[0 0

ρPX0Uk+1|X0 0

]= ρPXV2V

k1 |X ;

∀k ∈ N, A2Ak1A2 = 0 = PXV2V

k1 V2|X ;

∀k, n ∈ N, ∀i1, . . . , in ∈ 1, 2, A2Ak1A2Ai1 · · ·Ain = 0

= ρPXV2Vk1 V2Vi1 · · ·Vin |X

(since Ak1A2 = 0, PX0⊕X0⊕0⊕0⊕...V2V

k1 V2 = 0). Finally, we get

∀n ∈ N, ∀i1, . . . , in ∈ 1, 2, Ai1 · · ·Ain = ρPXVi1 · · ·Vin |X .

Page 301: Recent Advances in Operator Theory and its Applications: The Israel Gohberg Anniversary Volume

296 D.S. Kalyuzhnyı-Verbovetzkiı

Thus, V is a uniform isometric ρ-dilation of A in the sense of Popescu. However,for any ζ ∈ TN ,

‖ζA‖ =∥∥∥∥[ ζ1B 0

ζ2B 0

]∥∥∥∥ =√

2‖B‖ =√

2ρ > ρ.

Therefore, ζA /∈ Cρ for all ζ ∈ TN . We obtain A ∈ CPρ,2\Cu

ρ,2 (moreover, A /∈ Cρ,2).For the case N > 2 (and any ρ > 0) an analogous example of A ∈ CP

ρ,N\Cuρ,N

is easily obtained from the previous one, by setting zeros for the rest of operatorsin the N -tuple, i.e., A := (A1, A2, 0, . . . , 0), where A1 and A2 are defined in (5.6).In this case the construction of a uniform isometric ρ-dilation of A in the sense ofPopescu should be slightly changed (we leave this to a reader as an easy exercise).

Remark 5.6. The pair A(ε) = (A(ε)1 , A

(ε)2 ) constructed in Theorem 5.2 doesn’t

belong to the class Cuρ,2 for any ε > 0 and ρ > 1. Indeed, we have shown in

Theorem 5.2 that A(ε) is not simultaneously similar to any T = (T1, T2) ∈ C1,2,not speaking of T ∈ Cu

1,2. Thus, by Theorem 5.5, A(ε) /∈ Cuρ,2. This can be shown

also by the following estimate: if A(ε) ∈ Cuρ,2 for some ε > 0 and ρ > 1, then there

exists a uniform unitary ρ-dilation U(ε) = (U (ε)1 , U

(ε)2 ) of A(ε) = (A(ε)

1 , A(ε)2 ), and

for any n ∈ N,

‖[(A(ε)1 + A

(ε)2 )(A(ε)

1 −A(ε)2 )]n‖ = ‖ρPX [(U (ε)

1 + U(ε)2 )(U (ε)

1 − U(ε)2 )]n|X‖

≤ ρ‖[(U (ε)1 + U

(ε)2 )(U (ε)

1 − U(ε)2 )]n‖ = ρ <∞.

This contradicts to (5.3). Thus, for each ρ > 1 we obtain for ε > 0 small enough,A(ε) = (A(ε)

1 , A(ε)2 ) ∈ Cρ,2\Cu

ρ,2, as well as A := (A1, A2, 0, . . . , 0) ∈ Cρ,N\Cuρ,N .

Acknowledgements

I am grateful for the hospitality of the Universities of Leeds and Newcastle uponTyne where a part of this work was carried out during my visits under the Inter-national Short Visit Scheme of the LMS (grant no. 5620). I wish to thank also Dr.Michael Dritschel from the University of Newcastle upon Tyne for useful discus-sions.

References

[1] J. Agler, ‘On the representation of certain holomorphic functions defined on apolydisc’, in Topics in Operator Theory: Ernst D. Hellinger Memorial Volume(L. de Branges, I. Gohberg, and J. Rovnyak, eds.), Oper. Theory Adv. Appl. 48(1990) 47–66 (Birkhauser Verlag, Basel).

[2] J. Agler and J.E. McCarthy, ‘Nevanlinna–Pick interpolation on the bidisk’, J. ReineAngew. Math. 506 (1999) 191–204.

[3] T. Ando, ‘On a pair of commutative contractions’, Acta Sci. Math. (Szeged) 24(1963) 88–90.

Page 302: Recent Advances in Operator Theory and its Applications: The Israel Gohberg Anniversary Volume

Multivariable ρ-contractions 297

[4] T. Ando and K. Nishio, ‘Convexity properties of operator radii associated with uni-tary ρ-dilations’, Michigan Math. J. 20 (1973) 303–307.

[5] C. Badea and G. Cassier, ‘Constrained von Neumann inequalities’, Adv. Math. 166no. 2 (2002) 260–297.

[6] J.A. Ball, W.S. Li, D. Timotin and T.T. Trent, ‘A commutant lifting theorem onthe polydisc’, Indiana Univ. Math. J. 48 no. 2 (1999) 653–675.

[7] J.A. Ball, C. Sadosky and V. Vinnikov, ‘Conservative input-state-output systemswith evolution on a multidimensional integer lattice’, Multidimens. Syst. Signal Pro-cess., to appear.

[8] J.A. Ball and T.T. Trent, ‘Unitary colligations, reproducing kernel Hilbert spaces,and Nevanlinna-Pick interpolation in several variables’, J. Funct. Anal. 157 no. 1(1998) 1–61.

[9] C.A. Berger, ‘A strange dilation theorem’, Notices Amer. Math. Soc. 12 (1965) 590.

[10] G. Cassier and T. Fack, ‘Contractions in von Neumann algebras’, J. Funct. Anal.135 no. 2 (1996) 297–338.

[11] C. Davis, ‘The shell of a Hilbert-space operator’, Acta Sci. Math. (Szeged) 29 (1968)69–86.

[12] M.A. Dritschel, S. McCullough and H.J. Woerdeman, ‘Model theory for ρ-con-tractions, ρ ≤ 2’, J. Operator Theory 41 no. 2 (1999) 321–350.

[13] E. Durszt, ‘On unitary ρ-dilations of operators’, Acta Sci. Math. (Szeged) 27 (1966)247–250.

[14] C.K. Fong and J.A.R. Holbrook, ‘Unitarily invariant operator norms’, Canad. J.Math. 35 no. 2 (1983) 274–299.

[15] J.A.R. Holbrook, ‘On the power-bounded operators of Sz.-Nagy and Foias’, Acta Sci.Math. (Szeged) 29 (1968) 299–310.

[16] J.A.R. Holbrook, ‘Inequalities governing the operator radii associated with unitaryρ-dilations’, Michigan Math. J. 18 (1971) 149–159.

[17] D.S. Kalyuzhniy, ‘Multiparametric dissipative linear stationary dynamical scatteringsystems: discrete case’, J. Operator Theory 43 no. 2 (2000) 427–460.

[18] D.S. Kalyuzhniy, ‘Multiparametric dissipative linear stationary dynamical scatteringsystems: discrete case. II. Existence of conservative dilations’, Integral EquationsOperator Theory 36 no. 1 (2000) 107–120.

[19] D.S. Kalyuzhniy, ‘On the notions of dilation, controllability, observability, and min-imality in the theory of dissipative scattering linear nD systems’, in ProceedingsCD of the Fourteenth International Symposium of Mathematical Theory of Net-works and Systems (MTNS), June 19–23, 2000, Perpignan, France (A. El Jai andM. Fliess, Eds.), or http://www.univ-perp.fr/mtns2000/articles/SI13_3.pdf/.

[20] D.S. Kalyuzhniy-Verbovetzky, ‘Cascade connections of linear systems and factoriza-tions of holomorphic operator functions around a multiple zero in several variables’,Math. Rep. (Bucur.) 3(53) no. 4 (2001) 323–332.

[21] D.S. Kalyuzhniy-Verbovetzky, ‘On J-conservative scattering system realizations inseveral variables’, Integral Equations Operator Theory 43 no. 4 (2002) 450–465.

Page 303: Recent Advances in Operator Theory and its Applications: The Israel Gohberg Anniversary Volume

298 D.S. Kalyuzhnyı-Verbovetzkiı

[22] D.S. Kalyuzhnyı, ‘The von Neumann inequality for linear matrix functions of severalvariables’, Mat. Zametki 64 no. 2 (1998) 218–223 (Russian). English transl. in Math.Notes 64 no. 1–2 (1998) 186–189 (1999).

[23] D.S. Kalyuzhnyı-Verbovetskiı, ‘Cascade connections of multiparameter linear sys-tems and the conservative realization of a decomposable inner operator function onthe bidisk’, Mat. Stud. 15 no. 1 (2001) 65–76 (Russian).

[24] T. Nakazi and K. Okubo, ‘ρ-contraction and 2×2 matrix’, Linear Algebra Appl. 283no. 1-3 (1998) 165–169.

[25] J. von Neumann, ‘Eine Spektraltheorie fur allgemeine Operatoren eines unitarenRaumes’, Math. Nachr. 4 (1951) 258–281 (German).

[26] K. Okubo and T. Ando, ‘Operator radii of commuting products’, Proc. Amer. Math.Soc. 56 no. 1 (1976) 203–210.

[27] K. Okubo and I. Spitkovsky, ‘On the characterization of 2×2 ρ-contraction matrices’,Linear Algebra Appl. 325 no. 1-3 (2001) 177–189.

[28] V.I. Paulsen, ‘Every completely polynomially bounded operator is similar to a con-traction’, J. Funct. Anal. 55 no. 1 (1984) 1–17.

[29] G. Popescu, ‘Isometric dilations for infinite sequences of noncommuting operators’,Trans. Amer. Math. Soc. 316 no. 2 (1989) 523–536.

[30] G. Popescu, ‘Positive-definite functions on free semigroups’, Canad. J. Math. 48no. 4 (1996) 887–896.

[31] B. Sz.-Nagy, ‘Sur les contractions de l’espace de Hilbert’, Acta Sci. Math. (Szeged)15 (1953) 87–92 (French).

[32] B. Sz.-Nagy and C. Foias, ‘On certain classes of power-bounded operators in Hilbertspace’, Acta Sci. Math. (Szeged) 27 (1966) 17–25.

[33] B. Sz.-Nagy and C. Foias, ‘Similitude des operateurs de class Cρ a des contractions’,C. R. Acad. Sci. Paris Ser. A-B 264 (1967) A1063–A1065 (French).

[34] B. Sz.-Nagy and C. Foias, Harmonic analysis of operators on Hilbert space (North-Holland, Amsterdam–London, 1970).

[35] J.P. Williams, ‘Schwarz norms for operators’, Pacific J. Math. 24 (1968) 181–188.

Dmitry S. Kalyuzhnyı-VerbovetzkiıDepartment of MathematicsBen-Gurion University of the NegevP.O. Box 653Beer-Sheva 84105, Israele-mail: [email protected]

Page 304: Recent Advances in Operator Theory and its Applications: The Israel Gohberg Anniversary Volume

Operator Theory:Advances and Applications, Vol. 160, 299–309c© 2005 Birkhauser Verlag Basel/Switzerland

The Singularly Continuous Spectrumand Non-Closed Invariant Subspaces

Vadim Kostrykin and Konstantin A. Makarov

Dedicated to Israel Gohberg on the occasion of his 75th birthday

Abstract. Let A be a bounded self-adjoint operator on a separable Hilbertspace H and H0 ⊂ H a closed invariant subspace of A. Assuming that H0 isof codimension 1, we study the variation of the invariant subspace H0 underbounded self-adjoint perturbations V of A that are off-diagonal with respectto the decomposition H = H0 ⊕ H1. In particular, we prove the existence of aone-parameter family of dense non-closed invariant subspaces of the operatorA + V provided that this operator has a nonempty singularly continuousspectrum. We show that such subspaces are related to non-closable denselydefined solutions of the operator Riccati equation associated with generalizedeigenfunctions corresponding to the singularly continuous spectrum of B.

Mathematics Subject Classification (2000). Primary 47A55, 47A15; Secondary47B15.

Keywords. Invariant subspaces, operator Riccati equation, singular spectrum.

1. Introduction

In the present article we address the problem of a perturbation of invariant sub-spaces of self-adjoint operators on a separable Hilbert space H and related questionson the existence of solutions to the operator Riccati equation.

Given a self-adjoint operator A and a closed invariant subspace H0 ⊂ H of Awe set Ai = A|Hi , i = 0, 1, with H1 = HH0. Assuming that the perturbation Vis off-diagonal with respect to the orthogonal decomposition H = H0⊕H1 considerthe self-adjoint operator

B = A + V =(A0 VV ∗ A1

)with V =

(0 VV ∗ 0

),

Page 305: Recent Advances in Operator Theory and its Applications: The Israel Gohberg Anniversary Volume

300 V. Kostrykin and K.A. Makarov

where V is a linear operator from H1 to H0. It is well known (see, e.g., [7]) thatthe Riccati equation

A1X −XA0 −XVX + V ∗ = 0 (1)has a closed (possibly unbounded) solution X : H0 → H1 if and only if its graph

G(H0, X) := x ∈ H |x = x0 ⊕Xx0, x0 ∈ Dom(X) ⊂ H0 (2)

is an invariant closed subspace for the operator B.Sufficient conditions guaranteeing the existence of a solution to equation (1)

require in general the assumption that the spectra of the operators A0 and A1 areseparated,

d := dist(spec(A0), spec(A1)) > 0, (3)and hence H0 and H1 are necessarily spectral invariant subspaces of the operatorA. In particular (see [9]), if

‖V ‖ < cπd with cπ =3π −

√π2 + 32

π2 − 4= 0.503288 . . . , (4)

then the Riccati equation (1) has a bounded solution X satisfying the bound

‖X‖√1 + ‖X‖2

≤ π

2‖V ‖

d− δV< 1

with

δV = ‖V ‖ tan(

12

arctan2‖V ‖d

).

It is plausible to conjecture that condition (4) can be relaxed by the weaker re-quirement ‖V ‖ <

√3d/2 (see [9] for details). However, no proof of that is available

as yet.In general, without additional assumptions, neither condition (3) nor a small-

ness assumption like (4) on the magnitude of the perturbation V can be dropped.However, if the spectra of A0 and A1 are subordinated in the sense that

sup spec(A0) ≤ inf spec(A1),

then for any V with arbitrary large norm the Riccati equation (1) has a contractivesolution [8] (see also [1]). Note that in this case the invariant subspaces H0 and H1

are not necessarily supposed to be spectral invariant subspaces of A.In the present work we prove new existence results for the Riccati equation

under the assumption that the subspace H1 is one-dimensional. In particular,these results imply the existence of a one-parameter family of non-closed invariantsubspaces of the self-adjoint operator B, provided that B has nonempty singularlycontinuous spectrum.

The main result of our paper is presented by the following theorem.

Theorem 1. Assume that dimH1 = 1 and suppose that H0 is a cyclic subspace forthe operator A0 generated by the one-dimensional subspace RanV . Let Spp denotethe set of all eigenvalues of the operator B.

Page 306: Recent Advances in Operator Theory and its Applications: The Israel Gohberg Anniversary Volume

Non-Closed Invariant Subspaces 301

Then there exists a minimal support Ss of the singular part of the spectralmeasure of the operator B such that:

(i) For any λ ∈ Ssc = Ss \ Spp the subspace Ψ(λ) = G(H0, Xλ) ⊂ H is a densenon-closed graph subspace with Xλ : H0 → H1 a non-closed densely definedoperator solving the Riccati equation (1) in the sense of Definition 2.3 below.

(ii) For any λ ∈ Spp ⊂ Ss the subspace Ψ(λ) = G(H0, Xλ) ⊂ H is a closed graphsubspace of codimension 1 with Xλ : H0 → H1 a bounded operator solvingthe Riccati equation (1). Moreover, the operator Xλ is an isolated point (inthe operator norm topology) of the set of all bounded solutions to the Riccatiequation.

The mapping Ψ from Ss to the set M(B) of all (not necessarily closed) subspacesof H invariant with respect to the operator B is injective.

The article is organized as follows. In Section 2 we establish a link betweennon-closable densely defined solutions to the Riccati equation (1) and the associ-ated non-closed invariant subspaces of the operator B. In Section 3 accommodatingthe Simon-Wolff theory [10] to rank two off-diagonal perturbations we perform thespectral analysis of this operator under the assumption that dim H1 = 1. The mainresult of this section is Theorem 3.4. Theorem 1 will be proven in Section 4.

Throughout the whole work the Hilbert space H will assumed to be separable.The notation B(M,N) is used for the set of bounded linear operators from theHilbert space M to the Hilbert space N. We will write B(N) instead of B(N,N).

2. Non-closed graph subspaces

Let H0 be a closed subspace of a Hilbert space H and X a densely defined (possiblyunbounded and not necessarily closed) operator from H0 to H1 = H⊥

0 := H H0

with domain Dom(X). A linear subspace

G(H0, X) := x ∈ H |x = x0 ⊕Xx0, x0 ∈ Dom(X) ⊂ H0is called the graph subspace of H associated with the pair (H0, X) or, in short, thegraph of X .

Recalling general facts on densely defined closable operators (see, e.g., [6]) wemention the following: If X : H0 → H1 is a densely defined non-closable operator,then G(H0, X) is a non-closed subspace of H. Its closure is not a graph subspace,i.e., there is no closed operator Y such that

G(H0, X) = G(H0, Y ).

Proposition 2.1. Let X : H0 → H1 be a densely defined non-closable operator.Then the closed subspace G(H0, X) contains an element orthogonal to H0.

Proof. First, for X : H0 → H1 being a densely defined non-closable operatorwe prove the following alternative: either the closed subspace G(H0, X) containsan element orthogonal to H0 or the subspace H0 contains an element orthogonal

Page 307: Recent Advances in Operator Theory and its Applications: The Israel Gohberg Anniversary Volume

302 V. Kostrykin and K.A. Makarov

to G(H0, X). Indeed, assume on the contrary that neither the closed subspaceG(H0, X) contains an element orthogonal to H0 nor the subspace H0 contains anelement orthogonal to G(H0, X). Then by Theorem 3.2 in [7] there is a closeddensely defined operator Y : H0 → H1 such that G(H0, X) = G(H0, Y ), which is acontradiction.

Now assume that the subspace H0 contains an element x0 orthogonal toG(H0, X). Obviously, this element is orthogonal to G(H0, X), that is, 〈x0 ⊕ 0, x0⊕Xx0〉 = 0, and hence x0 = 0. Then, by the alternative proven above the subspaceG(H0, X) contains an element orthogonal to H0, completing the proof.

For notational setup assume the following hypothesis.

Hypothesis 2.2. Let B be a self-adjoint operator represented with respect to thedecomposition H = H0 ⊕ H1 as a 2× 2 operator block matrix

B =(A0 VV ∗ A1

), (5)

where Ai ∈ B(Hi), i = 0, 1, are bounded self-adjoint operators in Hi while V ∈B(H1,H0) is a bounded operator from H1 to H0. More explicitly, B = A + V,where A is the bounded diagonal self-adjoint operator,

A =(A0 00 A1

), (6)

and the operator V = V∗ is an off-diagonal bounded operator

V =(

0 VV ∗ 0

). (7)

Definition 2.3. A densely defined (possibly unbounded and not necessarily closable)operator X from H0 to H1 with domain Dom(X) is called a strong solution to theRiccati equation

A1X −XA0 −XVX + V ∗ = 0 (8)

ifRan(A0 + V X)|Dom(X) ⊂ Dom(X)

andA1Xx−X(A0 + VX)x + V ∗x = 0 for any x ∈ Dom(X).

Theorem 2.4. Assume Hypothesis 2.2. A densely defined (possibly unbounded andnot necessarily closed) operator X from H0 to H1 with domain Dom(X) is a strongsolution to the Riccati equation (8) if and only if the graph subspace G(H0, X) isinvariant for the operator B.

Proof. First, assume that G(H0, X) is invariant for B. Then

B(x⊕Xx) = (A0x + V Xx)⊕ (A1Xx + V ∗x) ∈ G(H0, X)

Page 308: Recent Advances in Operator Theory and its Applications: The Israel Gohberg Anniversary Volume

Non-Closed Invariant Subspaces 303

for any x ∈ Dom(X). In particular, A0x + V Xx ∈ Dom(X) and

A1Xx + V ∗x = X(A0x + V Xx) for all x ∈ Dom(X),

which proves that X is a strong solution to the Riccati equation (21).To prove the converse statement assume that X is a strong solution to the

Riccati equation (8), that is,

A0x + V Xx ∈ Dom(X)

andA1Xx+ V ∗x = X(A0x+ V Xx), x ∈ Dom(X),

which proves that the graph subspace G(H0, X) is B-invariant. Remark 2.5. By Lemma 4.3 in [7] a closed densely defined operator X : H0 → H1

is a strong solution to the Riccati equation (8) if and only if it is a weak solutionto (8).

3. The singular spectrum of the operator B

Assume the following hypothesis.

Hypothesis 3.1. Assume Hypothesis 2.2. Assume in addition that the Hilbert spaceH1 is one-dimensional,

H1 = C,

and the Hilbert space H0 is the cyclic subspace generated by RanV .

Note that under Hypothesis 3.1 the Hilbert space H0 can be realized as aspace of square integrable functions with respect to a Borel probability measurem with compact support,

H0 = L2(R;m)such that the bounded operator A0 acts on L2(R,m) as the multiplication operator

(A0x0)(λ) = λx0(λ), x0 ∈ L2(R,m),

A1 is the multiplication by a real number a1 and, finally, the linear bounded map

V ∗ : H0 → H1

is given byV ∗x0 = 〈v, x0〉H0 , x0 ∈ H0

for some v ∈ H0.

Lemma 3.2. Assume Hypothesis 3.1. Then the element 0 ⊕ 1 ∈ H = H0 ⊕ H1 iscyclic for the operator B given by (5) – (7) and, hence, B has a simple spectrum.

Proof. By hypothesis (in the above notations) the element v ∈ H0 is cyclic forthe operator A0. Therefore, the cyclic subspace with respect to the operator Bgenerated by the elements v ⊕ 0 ∈ H and 0⊕ 1 ∈ H is the whole H. Without lossof generality we may assume that a1 = 0. Observing that B(0⊕ 1) = v⊕ 0 provesthe claim.

Page 309: Recent Advances in Operator Theory and its Applications: The Israel Gohberg Anniversary Volume

304 V. Kostrykin and K.A. Makarov

Theorem 3.3. Assume Hypothesis 3.1. Then the Herglotz function

φ(z) =1 + (a1 − z)〈v, (A0 − z)−1v〉H0

(a1 − z)− 〈v, (A0 − z)−1v〉H0

(9)

admits the representation

φ(z) =∫

dω(λ)λ − z

,

where ω is a probability measure on R with compact support. Moreover, the operatorB is unitarily equivalent to the multiplication operator by the independent variableon L2(R, ω).

Proof. Introduce the Borel measure Ω with values in the set of non-negative op-erators on H1 ⊕ H1 by

Ω(δ) =(V 00 1

)∗EB(δ)

(V 00 1

),

where(V 00 1

)is the linear map from H1 ⊕ H1 to H0 ⊕ H1 and let

ω(δ) = trΩ(δ), δ ⊂ R a Borel set.

Clearly, the measure ω vanishes on all Borel sets δ such that EB(δ) = 0. In fact,these measures have the same families of Borel sets, on which they vanish. Indeed,assuming ω(δ) = 0 yields

〈v ⊕ 0,EB(δ) v ⊕ 0〉H + 〈0⊕ 1,EB(δ) 0 ⊕ 1〉H = 0

and, hence, in particular,

〈0 ⊕ 1,EB(δ) 0⊕ 1〉H = 0, (10)

which implies EB(δ) = 0.Introducing the B(H1 ⊕ H1)-valued Herglotz function

M(z) =(V 00 1

)∗(B− z)−1

(V 00 1

)(11)

one concludes that the Herglotz function M(z) admits the representation

M(z) =∫

R

dΩ(λ)λ− z

,

and hence

trM(z) =∫

R

dω(λ)λ − z

.

Straightforward computations show that the operator-valued function (11)with respect to the orthogonal decomposition H = H0 ⊕ H1 can be represented asthe 2× 2 matrix

M(z) =(M00(z) M01(z)M10(z) M11(z)

)

Page 310: Recent Advances in Operator Theory and its Applications: The Israel Gohberg Anniversary Volume

Non-Closed Invariant Subspaces 305

with the entries given by

M00(z) = (a1 − z)〈v, (A0 − z)−1v〉[a1 − z − 〈v, (A0 − z)−1v〉]−1,

M11(z) = [a1 − z − 〈v, (A0 − z)−1v〉]−1,

M01(z) = −(a1 − z)−1M00(z),

M10(z) = −(a1 − z)−1M00(z).

Taking the trace of M(z) yields representation (9).Since by Lemma 3.2 the element 0 ⊕ 1 is cyclic and the measure ω and the

spectral measure EB have the same families of Borel sets, on which they vanish,one concludes (see, e.g., [3]) that the operator B is unitarily equivalent to themultiplication operator by the independent variable on L2(R, ω), completing theproof.

Recall that a measurable not necessarily closed set S ⊂ R is a support of ameasure ν if ν(R \ S) = 0. A support S is said to be minimal if any measurablesubset S′ ⊂ S with ν(S′) = 0 has Lebesgue measure zero.

Theorem 3.4. The sets

Ss :=λ ∈ R

∣∣∣ a1 − λ =∫ |v(µ)|2dm(µ)

µ− λ− i0

(12)

and

Ssc :=λ ∈ R

∣∣∣ a1 − λ =∫ |v(µ)|2dm(µ)

µ− λ− i0,

∫ |v(µ)|2dm(µ)|µ− λ|2 = ∞

(13)

are minimal supports of the singular part ωs and the singularly continuous partωsc of the measure ω, respectively. The set

Spp :=λ ∈ R

∣∣∣ a1 − λ =∫ |v(µ)|2dm(µ)

µ− λ,

∫ |v(µ)|2dm(µ)|µ− λ|2 <∞

(14)

coincides with the set of all atoms of the measure ω.

Proof. The fact that (12) is a minimal support of ωs follows from Lemma 3.5 in[4], where one sets m+

a (z) = (a1 − z) and

m+b (z) = 〈v, (A0 − z)−1v〉H0 =

∫ |v(µ)|2dm(µ)µ− z

, Im z = 0.

It is not hard to see (cf., e.g., Example 1 in [2]) that the set Spp coincideswith the set of all eigenvalues of the operator B. Hence, by Theorem 3.3 one provesthat Spp coincides with the set of all atoms of the measure ω. Therefore, to provethat (13) is a minimal support of ωsc it suffices to check the inclusion

Spp ⊂ Ss. (15)

Assume that λ ∈ Spp, that is,

a1 − λ =∫ |v(µ)|2dm(µ)

µ− λ(16)

Page 311: Recent Advances in Operator Theory and its Applications: The Israel Gohberg Anniversary Volume

306 V. Kostrykin and K.A. Makarov

and ∫ |v(µ)|2dm(µ)|µ− λ|2 <∞.

Since ∫ |v(µ)|2dm(µ)|µ− λ| ≤

(∫ |v(µ)|2dm(µ)|µ− λ|2

)1/2

‖v‖L2(R;m),

the dominated convergence theorem yields∫ |v(µ)|2dm(µ)µ− λ− i0

≡ limε→+0

∫ |v(µ)|2dm(µ)µ− λ− iε

=∫ |v(µ)|2dm(µ)

µ− λ,

which together with (16) proves inclusion (15). The proof is complete.

Remark 3.5. By Lemma 5 in [5] from Theorem 3.3 it follows that there existminimal supports of the absolutely continuous part ωac, the singular part ωs, andthe singularly continuous part ωsc of the measure ω such that their closures coincidewith the absolute continuous part specac(B), the singular part specs(B), and thesingularly continuous part specsc(B) of the spectrum, respectively.

4. Riccati equation

Given λ ∈ R, introduce the operator (linear functional)

Xλ : L2(R;m) → H1 = C

on

Dom(Xλ) =

ϕ ∈ L2(R;m)

∣∣∣ limε→+0

∫v(µ)ϕ(µ)µ− λ− iε

dm(µ) exists finitely

by

Xλϕ = limε→+0

∫v(µ)ϕ(µ)µ− λ− iε

dm(µ), ϕ ∈ Dom(Xλ). (17)

Lemma 4.1. If λ ∈ Ss, then the operator Xλ is densely defined.

Proof. Since the element v ∈ L2(R;m) is generating for the operator A0, the set

D = ϕ | ϕ(µ) = v(µ)ψ(µ), ψ is continuously differentiable on Ris dense in L2(R;m). For ϕ ∈ D and ε > 0 one obtains∫

v(µ)ϕ(µ)µ− λ− iε

dm(µ) = ψ(λ)∫ |v(µ)|2

µ− λ− iεdm(µ) (18)

+∫ |v(µ)|2(ψ(µ)− ψ(λ))

µ− λ− iεdm(µ). (19)

Since λ ∈ Ss, by Theorem 3.4 the limit

limε→+0

∫ |v(µ)|2dm(µ)µ− λ− iε

=∫ |v(µ)|2dm(µ)

µ− λ− i0

Page 312: Recent Advances in Operator Theory and its Applications: The Israel Gohberg Anniversary Volume

Non-Closed Invariant Subspaces 307

exists finitely. The integral (19) also has a limit as ε→ +0 since ψ is a continuouslydifferentiable which proves that the left-hand side of (18) has a finite limit asε→ +0. Therefore, D ⊂ Dom(Xλ), that is, Xλ is densely defined.

Remark 4.2. Note that by the Riesz representation theorem Xλ is bounded when-ever the condition ∫ |v(µ)|2

|λ− µ|2 dm(µ) <∞ (20)

holds true. The converse is also true: If Xλ is bounded, then (20) holds. Indeed,by the uniform boundedness principle from definition (17) it follows that

supε∈(0,1]

∫ |v(µ)|2(µ− λ)2 + ε2

dm(µ) <∞,

proving (20) by the monotone convergence theorem.

Theorem 4.3. Let λ ∈ Ss. Then the operator Xλ is a strong solution to the Riccatiequation

A1X −XA0 −XVX + V ∗ = 0. (21)Moreover, if λ ∈ Spp, the solution Xλ is bounded and if λ ∈ Ssc = Ss \ Spp, theoperator Xλ is non-closable.

Proof. Note that A0 Dom(Xλ) ⊂ Dom(Xλ). If λ ∈ Ss, then by Theorem 3.4

a1 − λ =∫ |v(µ)|2dm(µ)

µ− λ− i0.

In particular, v ∈ Dom(Xλ) and

XλV Xλϕ =∫ |v(µ)|2

µ− λ− i0dm(µ) ·Xλϕ

= (a1 − λ)∫

v(µ)ϕ(µ)µ− λ− i0

dm(µ), ϕ ∈ Dom(Xλ).

Therefore, for an arbitrary ϕ ∈ Dom(Xλ) one gets

A1Xλϕ−XλA0ϕ−XλV Xλϕ

=∫

v(µ)ϕ(µ)(a1 − µ)µ− λ− i0

dm(µ)− (a1 − λ)∫

v(µ)ϕ(µ)µ− λ− i0

dm(µ)

=∫

v(µ)ϕ(µ)(λ − µ)µ− λ− i0

dm(µ) = −∫

v(µ)ϕ(µ)dm(µ) = −V ∗ϕ,

which proves that the operator Xλ is a strong solution to the Riccati equation(21).

If λ ∈ Spp, then (20) holds, in which case Xλ is bounded. If λ ∈ Ssc = Ss\Spp,then Xλ is an unbounded densely defined operator (functional) (cf. Remark 4.2).Since every closed finite-rank operator is bounded [6], it follows that for λ ∈ Ssc

the unbounded solution Xλ is non-closable.

Page 313: Recent Advances in Operator Theory and its Applications: The Israel Gohberg Anniversary Volume

308 V. Kostrykin and K.A. Makarov

Proof of Theorem 1. Introduce the mapping

Ψ(λ) = G(H0, Xλ), λ ∈ Ss, (22)

where Xλ is the strong solution to the Riccati equation referred to in Theorem4.3. By Theorem 2.4 the subspace Ψ(λ), λ ∈ Ss is invariant with respect to B.To prove the injectivity of the mapping Ψ, assume that Ψ(λ1) = Ψ(λ2) for someλ1, λ2 ∈ Ss. Due to (22), Xλ1 = Xλ2 which by (17) implies λ1 = λ2.

(i) Let λ ∈ Ssc. By Theorem 4.3 the functional Xλ is non-closable. Since Xλ

is densely defined, the closure G(H0, Xλ) of the subspace G(H0, Xλ) contains thesubspace H0. By Proposition 2.1, the subspace G(H0, Xλ) contains an element or-thogonal to H0. Since H0 ⊂ H is of codimension 1, one concludes that G(H0, Xλ) =H0 ⊕ H1 = H.

(ii) Let λ ∈ Spp. By Theorem 5.3 in [7] the solution Xλ is an isolated point (in theoperator norm topology) of the set of all bounded solutions to the Riccati equation(21) if and only if the subspace G(H0, Xλ) is spectral, that is, there is a Borel set∆ ⊂ R such that

G(H0, Xλ) = Ran EB(∆).

Observe that the one-dimensional graph subspace G(H1,−X∗λ) is invariant

with respect to the operator B. This subspace is spectral since by Lemma 3.2 λ isa simple eigenvalue of the operator B. Thus, G(H0, Xλ) = G(H1,−X∗

λ)⊥ is also aspectral subspace of the operator B.

Acknowledgments

The authors are grateful to C. van der Mee for useful suggestions. K.A. Makarov isindebted to the Graduiertenkolleg “Hierarchie und Symmetrie in mathematischenModellen” for its kind hospitality during his stay at the RWTH Aachen in theSummer of 2003.

References

[1] V. Adamyan, H. Langer, and C. Tretter, Existence and uniqueness of contractivesolutions of some Riccati equations, J. Funct. Anal. 179 (2001), 448–473.

[2] S. Albeverio, K.A. Makarov, and A.K. Motovilov, Graph subspaces and the spectralshift function, Canad. J. Math. 55 (2003), 449–503. arXiv:math.SP/0105142

[3] M.S. Birman and M.Z. Solomyak, Spectral Theory of Self-Adjoint Operators in HilbertSpace, D. Reidel, Dordrecht, 1987.

[4] D.J. Gilbert, On subordinacy and analysis of the spectrum of Schrodinger operatorswith two singular endpoints, Proc. Royal Soc. Edinburgh 112A (1989), 213–229.

[5] D.J. Gilbert and D.B. Pearson, On subordinacy and analysis of the spectrum of one-dimensional Schrodinger operators, J. Math. Anal. Appl. 128 (1987), 30–56.

[6] T. Kato, Perturbation Theory for Linear Operators, Springer-Verlag, Berlin, 1966.

Page 314: Recent Advances in Operator Theory and its Applications: The Israel Gohberg Anniversary Volume

Non-Closed Invariant Subspaces 309

[7] V. Kostrykin, K.A. Makarov, and A.K. Motovilov, Existence and uniqueness of so-lutions to the operator Riccati equation. A geometric approach, in Yu. Karpeshina,G. Stolz, R. Weikard, Y. Zeng (Eds.), Advances in Differential Equations and Mathe-matical Physics, Contemporary Mathematics 327, Amer. Math. Soc., 2003, pp. 181–198. arXiv:math.SP/0207125

[8] V. Kostrykin, K.A. Makarov, and A.K. Motovilov, A generalization of the tan 2Θtheorem, in J.A. Ball, M. Klaus, J.W. Helton, and L. Rodman (Eds.), Current Trendsin Operator Theory and Its Applications. Operator Theory: Advances and Applica-tions 149, Birkhauser, Basel, 2004, pp. 349–372. arXiv:math.SP/0302020

[9] V. Kostrykin, K.A. Makarov, and A.K. Motovilov, Perturbation of spectra and spec-tral subspaces, Trans. Amer. Math. Soc. (to appear), arXiv:math.SP/0306025

[10] B. Simon and T. Wolff, Singular continuous spectrum under rank one perturbationsand localization for random Hamiltonians, Comm. Pure Appl. Math. 39 (1986), 75–90.

Vadim KostrykinFraunhofer-Institut fur LasertechnikSteinbachstraße 15D-52074 Aachen, Germanye-mail: [email protected], [email protected]

URL: http://home.t-online.de/home/kostrykin

Konstantin A. MakarovDepartment of MathematicsUniversity of MissouriColumbia, MO 65211, USAe-mail: [email protected]: http://www.math.missouri.edu/people/kmakarov.html

Page 315: Recent Advances in Operator Theory and its Applications: The Israel Gohberg Anniversary Volume

Operator Theory:Advances and Applications, Vol. 160, 311–336c© 2005 Birkhauser Verlag Basel/Switzerland

Numerical Methods for Cauchy SingularIntegral Equations in Spaces of WeightedContinuous Functions

G. Mastroianni, M.G. Russo and W. Themistoclakis

Dedicated to Professor Israel Gohberg on the occasion of his 75th birthday

Abstract. Some convergent and stable numerical procedures for Cauchy sin-gular integral equations are given. The proposed approach consists of solvingthe regularized equation and is based on the weighted polynomial interpola-tion. The convergence estimates are sharp and the obtained linear systemsare well conditioned.

Mathematics Subject Classification (2000). Primary 65R20; Secondary 45E05.

Keywords. Cauchy singular integral equation, projection method,Lagrange interpolation.

1. Introduction

We consider the Cauchy singular integral equation (CSIE)

(D + νK)f = g (1.1)

where g is a known function on (−1, 1), f is the unknown, ν ∈ R and the operatorsD and K are defined as follows

Df(y) = cosπαf(y)vα,−α(y)− sinπαπ

∫ 1

−1

f(x)x− y

vα,−α(x)dx, (1.2)

Kf(y) =∫ 1

−1

k(x, y)f(x)vα,−α(x)dx, (1.3)

where 0 < α < 1 and vα,−α(x) = (1− x)α(1 + x)−α is a Jacobi weight.

Work supported by INDAM-GNCS project 2003 “Metodi numerici per equazioni integrali”.

Page 316: Recent Advances in Operator Theory and its Applications: The Israel Gohberg Anniversary Volume

312 G. Mastroianni, M.G. Russo and W. Themistoclakis

In the last decades the idea of approximating the solution of (1.1) by poly-nomials has been presented in several papers (we mention for instance [26, 30, 1,3, 4, 5, 10, 11, 12, 13, 14, 18, 25, 27] and the references therein).

Such procedures, usually called “collocation” and “discrete collocation” meth-ods, project the equation onto the subspace of polynomials, replacing the integralby a quadrature formula. By collocation on a suitable set of nodes, one can con-struct a linear system, the solution of which gives the coefficients of the polynomialapproximating the exact solution. The convergence and stability of the method isusually studied by considering a finite-dimensional equation equivalent to the sys-tem.

Anyway this approach does not take into account the condition number ofthe matrix of the system. Indeed if, for an approximation method applied to an op-erator equation, convergence and stability are proved, then immediately it followsthat the norms of the discrete operators and the norms of their inverses are uni-formly bounded, and, consequently, the condition numbers of these operators arealso uniformly bounded. However infinite linear systems exist, depending on thechoice of the polynomial base, that are equivalent to the given finite-dimensionalequation and some of these systems can be ill conditioned (see, e.g., [15, 8]). Forexample if we use as a polynomial basis the fundamental Lagrange polynomialsbased on equispaced points, we get a linear system that is strongly ill conditioned.As a consequence we get a not reliable numerical procedure for evaluating thecoefficients of the approximating polynomial.

In this paper, following an idea in [28], we assume Kf and g sufficientlysmooth and solve the equivalent regularized equation

(I + νDK)f = Dg (1.4)

where

Df(y) = cosπαv−α,α(y)f(y) +sinπαπ

∫ 1

−1

f(x)x− y

v−α,α(x)dx.

Hence we will construct two polynomial sequences Kmfm and Gmm whichare convergent to DKf and Dg, respectively, like the best approximation in somesuitable spaces. Hence we consider the finite-dimensional equation

(I + νKm)fm = Gm (1.5)

where fm is the unknown polynomial. Via standard arguments, it follows that(1.5) has a unique solution fm which converges to f (if f is the solution of (1.4)).The convergence estimates proved here are sharp. Moreover the condition numberof I+νKm is uniformly bounded. Then expanding both sides of (1.5) in a suitablebasis, we get a linear system that is equivalent to (1.5) and whose matrix is wellconditioned (except for some log factor). Moreover in the case when K has asmooth kernel the entries of the matrix can be easily computed. Using suitablepolynomial bases the exposed procedure includes the “collocation” and “discretecollocation” methods.

Page 317: Recent Advances in Operator Theory and its Applications: The Israel Gohberg Anniversary Volume

Numerical Methods for Cauchy Singular Integral Equations 313

We remark that we are not able to use the above mentioned procedure whenthe index of equation (1.1) is ±1, since at the moment, the behavior of the cor-responding operator D in the Zygmund type spaces is not clear. Anyway theauthors believe that the main results of this paper can be extended to the case ofthe Cauchy singular integral equations of index ±1.

The paper is structured as follows. Section 2 collects basic tools of the ap-proximation theory, the mapping properties of the operators D and K and someresults on special interpolation processes. By using such processes some sequencesof operators convergent in norm are constructed (see Lemma 2.3 and Lemma 2.4).In Section 3 some numerical methods are shown and the related theorems aboutthe convergence, the stability and the behavior of the condition number of the lin-ear systems are given. In Section 4 some weakly singular perturbation operators(frequently appearing in the literature) are considered. Several numerical tests aregiven in Section 5. Finally Section 6 and the Appendix are devoted to the proofsof the main results and to other technical details.

2. Preliminary results

2.1. Functional spaces

In order to introduce some functional spaces we will denote by C(−1, 1) the set ofall continuous functions on the open interval (−1, 1). Let vγ,δ(x) = (1−x)γ(1+x)δ

be the Jacobi weight with exponents γ, δ > −1. Let us define Cvγ,δ , γ, δ > 0, as

Cvγ,δ = f ∈ C(−1, 1) : limx→±1

f(x)vγ,δ(x) = 0.

Further in the case γ = 0 (respectively δ = 0), Cvγ,δ consists of all functions whichare continuous on (−1, 1] (respectively on [−1, 1)) such that lim

x→−1(fvγ,δ)(x) = 0

(respectively limx→1

(fvγ,δ)(x) = 0). Moreover if γ = δ = 0 we set Cv0,0 = C[−1, 1].

The norm of a function f ∈ Cvγ,δ is defined as ‖f‖Cvγ,δ := sup|x|≤1 |f(x)vγ,δ(x)| =

‖fvγ,δ‖∞. Somewhere for brevity we will write ‖G‖A = supx∈A |G(x)|, A ⊆ [−1, 1].To deal with smoother functions we define the Sobolev type space

Wr = Wr(vγ,δ) = f ∈ Cvγ,δ : f (r−1) ∈ AC(−1, 1) and ‖f (r)ϕrvγ,δ‖∞ <∞where ϕ(x) =

√1− x2, r ≥ 1 is an integer and AC(−1, 1) is the set of abso-

lutely continuous functions on (−1, 1). The norm in Wr is ‖f‖Wr := ‖fvγ,δ‖∞ +‖f (r)ϕrvγ,δ‖∞.

Now let us introduce some suitable moduli of smoothness. Following Ditzianand Totik [7] ∀f ∈ Cvγ,δ we define

Ωkϕ(f, τ)vγ,δ = sup

0<h≤τ‖vγ,δ∆k

hϕf‖Ihk(2.1)

where ∆khϕf(x) =

∑ki=0(−1)i

(ki

)f(x + (k/2 − i)hϕ(x)), 0 < k ∈ N, Ihk =

[−1 + 4h2k2, 1− 4h2k2].

Page 318: Recent Advances in Operator Theory and its Applications: The Israel Gohberg Anniversary Volume

314 G. Mastroianni, M.G. Russo and W. Themistoclakis

Denoting by Pm the set of all algebraic polynomials of degree at most m, we definethe complete modulus as

ωkϕ(f, τ)vγ,δ = Ωk

ϕ(f, τ)vγ,δ + infq1∈Pk−1

‖vγ,δ(f − q1)‖[−1,−1+4k2τ2] (2.2)

+ infq2∈Pk−1

‖vγ,δ(f − q2)‖[1−4k2τ2,1].

For the sake of simplicity we will also write Ωϕ = Ω1ϕ, ωϕ = ω1

ϕ, Ωkϕ(f, τ) =

Ωkϕ(f, τ)v0,0 , ωk

ϕ(f, τ) = ωkϕ(f, τ)v0,0 , k > 0.

Now denoting by

Em(f)vγ,δ = infP∈Pm

‖(f − P )vγ,δ‖∞

the error of best approximation in Cvγ,δ (Em(f) ≡ Em(f)v0,0), the following in-equalities hold

Em(f)vγ,δ ≤ Cωkϕ(f, 1/m)vγ,δ (2.3)

andωk

ϕ(f, τ)vγ,δ ≤ Cτk∑

0≤i≤ 1τ

(1 + i)k−1Ei(f)vγ,δ (2.4)

where C is a positive constant independent of f,m and τ . Estimates (2.3)–(2.4)can be deduced from [7], but they both explicitly appeared in [19].

By definition Ωkϕ(f, τ)vγ,δ ≤ ωk

ϕ(f, τ)vγ,δ . On the other hand by (2.3)–(2.4)Ωk

ϕ(f, τ)vγ,δ ∼ τβ , 0 < β < k implies ωkϕ(f, τ)vγ,δ ∼ τβ . Therefore, by using

Ωkϕ(f, τ)vγ,δ in place of ωk

ϕ(f, τ)vγ,δ , we can define the Zygmund space

Zr(vγ,δ) =

f ∈ Cvγ,δ : ‖f‖Zr(vγ,δ) = ‖fvγ,δ‖∞ + sup

τ>0

Ωkϕ(f, τ)vγ,δ

τr<∞

where 0 < r < k ∈ N and we set Zr ≡ Zr(v0,0). In conclusion we remark that fordifferentiable functions in (−1, 1) we can estimate Ωk

ϕ by means of the inequality

Ωkϕ(f, τ)vγ,δ ≤ C sup

0<h≤τhk‖f (k)ϕkvγ,δ‖Ihk

, C = C(f, τ) (2.5)

where here and in the sequel C = C(a, b, c, . . .) means that C is a positive constantindependent of the parameters a, b, c, . . ..

Finally the following Lemma could be useful in several contexts.

Lemma 2.1. Let f ∈ Cvγ,δ and Pm ∈ Pm be such that

‖(f − Pm)vγ,δ‖∞ ≤ cEm(f)vγ,δ

holds with some positive constant c = c(m, f). Then∫ 1m

0

ωϕ(f − Pm, t)vγ,δ

tdt ≤ C

∫ 1m

0

ωkϕ(f, t)vγ,δ

tdt, k < m, (2.6)

where C is a positive constant independent of f and m.

Page 319: Recent Advances in Operator Theory and its Applications: The Israel Gohberg Anniversary Volume

Numerical Methods for Cauchy Singular Integral Equations 315

2.2. Mapping properties of D and assumptions on K

The properties of the operator

Df(y) = cosπαf(y)vα,−α(y)− sinπαπ

∫ 1

−1

f(x)x− y

vα,−α(x)dx, 0 < α < 1

in weighted L2 spaces are well known (see, e.g., [26, 30]). In this paper we considerD as a mapping from Zr(vα,0) into Zr(v0,α), r > 0. In [24] the authors extensivelystudied this operator. For the convenience of the reader we recall here the followingresults

• D : Zr(vα,0) −→ Zr(v0,α) is bounded and invertible, ∀r > 0;• the inverse (and bounded) operator, D : Zr(v0,α) −→ Zr(vα,0) is given by

Df(y) = cosπαv−α,α(y)f(y) +sinπαπ

∫ 1

−1

f(x)x− y

v−α,α(x)dx.

With respect to the operator

Kf(y) =∫ 1

−1

k(x, y)f(x)vα,−α(x)dx (2.7)

we assume

supτ>0

Ωkϕ(Kf, τ)v0,α

τr≤ C‖fvα,0‖∞ (2.8)

with k > r > 0 andC independent of f . Therefore the operatorK : Cvα,0 −→ Cv0,α

is compact.Indeed (2.8) and (2.3) imply that limn supf∈S En(Kf)v0,α = 0, where S =

f ∈ Cvα,0 : ‖fvα,0‖ = 1 and this is equivalent to the compactness of K [32].Moreover by (2.8) Kf belongs to Zr(v0,α), r > 0.

In conclusion we remark that (2.8) is satisfied, for instance, if the kernel ofK is of the type k(x, y) = |x− y|µ, µ > −1, µ = 0, with r = µ+ 1 (see Section 4)or if k(x, y) = kx(y) satisfies

sup|x|≤1

supτ>0

Ωkϕ(kx, τ)v0,α

τr<∞, k ≥ r > 0,

(see Lemma 2.4).

2.3. Some interpolation processes

Let pm(vα,−α)m = pα,−αm m be the sequence of the orthonormal Jacobi poly-

nomials with respect to vα,−α (α is the same parameter appearing in the defi-nition of the operators D and K) and having positive leading coefficients. De-note by t1 < · · · < tm the zeros of pα,−α

m and by Lm(vα,−α, F ), F ∈ C(−1, 1),the Lagrange polynomial interpolating F on the nodes t1, . . . , tm. Moreover letLm,1,1(vα,−α, F ) denote the Lagrange polynomial interpolating F ∈ C(−1, 1) on

the knotst1 − 1

2< t1 < . . . < tm < − t1 − 1

2(we choose symmetric additional

Page 320: Recent Advances in Operator Theory and its Applications: The Israel Gohberg Anniversary Volume

316 G. Mastroianni, M.G. Russo and W. Themistoclakis

points only in order to simplify the computations). Sometimes for the sake ofbrevity we will use

Lα,−αm =

Lm(vα,−α), if α ≥ 1/2Lm,1,1(vα,−α), if 0 < α < 1/2 .

As a consequence of [23, Th. 2.1, 2.2, 2.4] and of [21, Th. 3.1] the following estimateshold for any F ∈ Cvα,0

‖[F − Lα,−αm F ]vα,0‖∞ ≤ CEm−1(F )vα,0 logm (2.9)

‖[F − Lα,−αm F ]v0,−α‖1 ≤ CEm−1(F ) (2.10)

where in both cases C is a positive constant independent of F and m and ‖ · ‖1

denotes the L1-norm.With an analogous meaning of the symbols we denote by Lm(v−α,α, F ),

F ∈ C(−1, 1), the Lagrange polynomial interpolating F on the zeros x1, . . . , xm,of p−α,α

m and by Lm,1,1(v−α,α, F ) the Lagrange polynomial interpolating F ∈C(−1, 1) on the knots −xm + 1

2< x1 < . . . < xm <

xm + 12

. Also in this casesometimes we will use the notation

L−α,αm =

Lm(v−α,α), if α ≥ 1/2Lm,1,1(v−α,α), if 0 < α < 1/2 . (2.11)

Recalling the definition of D we can state the following theorem.

Theorem 2.2. Let φ ∈ Zr(v0,α), r > 0, 0 < α < 1. Then

‖D[φ− L−α,αm φ]vα,0‖∞ ≤ C

logmmr

‖φ‖Zr(v0,α) (2.12)

where C is a positive constant independent of φ and m.

2.4. Some operator sequences

As we already remarked, the assumption (2.8) we are making on K, allows thekernel k(x, y) to be also singular. In this case we can define the following operatorsequence

Kmf(y) = DL−α,αm (Kf, y). (2.13)

Due to the invariance property of D on polynomials, Km maps Cvα,0 into Pm+1.Moreover the following Lemma holds.

Lemma 2.3. Let 0 < α < 1. If the operator K satisfies the assumption (2.8) then

‖DK −Km‖Cvα,0→Cvα,0 = O(

logmmr

), r > 0 (2.14)

where the constant in “O” is independent of m.

When the kernel k(x, y) is smooth we introduce

K∗f(y) =∫ 1

−1

Lα,−αm (ky, x)f(x)vα,−α(x)dx, (2.15)

where ky(x) = k(x, y) and the interpolation is made with respect to x.

Page 321: Recent Advances in Operator Theory and its Applications: The Israel Gohberg Anniversary Volume

Numerical Methods for Cauchy Singular Integral Equations 317

Hence we define the sequence K∗mm as

K∗mf(y) = DL−α,α

m (K∗f)(y). (2.16)

K∗m maps Cvα,0 into Pm+1. Moreover, if kx(y) = k(x, y), we can state the following

Lemma.

Lemma 2.4. Let 0 < α < 1. Assume that

supt>0

sup|y|≤1

(1 − y)αΩkϕ(ky, t)

1tr

<∞ (2.17)

and

supt>0

sup|x|≤1

Ωkϕ(kx, t)vα,0

1tr

<∞ (2.18)

with r > 0 and k > r. Then

‖DK −K∗m‖Cvα,0→Cvα,0 = O

(logmmr

)(2.19)

where the constant in “O” is independent of m.

3. Numerical methods

Now go back to the equation (D + νK)f = g. Assume g ∈ Zr(v0,α), r > 0 andthat operator K satisfies condition (2.8):

supτ>0

Ωkϕ(Kf, τ)v0,α

τr≤ C‖fvα,0‖∞.

Under these assumptions we solve the equivalent equation

(I + νDK)f = G, G := Dg. (3.1)

By the mapping properties of D it follows that G ∈ Zr(vα,0) ⊂ Cvα,0 and DK :Cvα,0 −→ Cvα,0 is compact (see, e.g., Lemma 2.3). Thus I + νDK is invertible inCvα,0 . From now on we will assume that (3.1) has a unique solution in Cvα,0 for anyfixed g ∈ Zr(v0,α) and we denote by f the solution of (3.1). In order to constructan approximate solution of (3.1), we solve the finite-dimensional equation

(I + νKm)fm = Gm, (3.2)

where Gm = DL−α,αm g, fm ∈ Pm+1 is the unknown polynomial and Km was

defined in (2.13).

Theorem 3.1. Let 0 < α < 1. If the previous assumptions on K and g hold true,then for m sufficiently large, there exists a unique polynomial fm ∈ Pm+1, whichis the solution of (3.2). Moreover the following estimate holds

‖(f − fm)vα,0‖∞ ≤ Clogmmr

‖g‖Zr(v0,α) (3.3)

where C is a positive constant independent of f and m.

Page 322: Recent Advances in Operator Theory and its Applications: The Israel Gohberg Anniversary Volume

318 G. Mastroianni, M.G. Russo and W. Themistoclakis

Consequently f ∈ Zr−ε(vα,0), with ε > 0 arbitrarily small. Moreover

|cond(I + νKm)− cond(I + νDK)| = O(

logmmr

)(3.4)

where cond(B) = ‖B‖‖B−1‖ is the condition number of the bounded and invertibleoperator B in Cvα,0 and the constant in O is independent of m.

In order to compute the coefficients of fm we construct a linear system whichis equivalent to equation (3.2) and whose coefficient matrix has a bounded (up tosome log factor) condition number w.r.t. the ∞ norm. This goal can be reached inseveral ways, also taking into account the complexity in building the matrix andsolving the linear system.

Here we propose the following procedure for α ≥ 1/2. We note that bydefinition we have

(Kmf)(x) = (DL−α,αm Kf)(x) = (DL−α,α

m Kf)(x) (3.5)

=m∑

i=1

Dl−α,αi (x)

v0,α(xi)(Kf)(xi)v0,α(xi)

where xi denote the zeros of p−α,αm , while l−α,α

i are the fundamental Lagrangepolynomials defined on the same zeros. Analogously

Gm(x) = (DL−α,αm g)(x) = (DL−α,α

m g)(x) (3.6)

=m∑

i=1

Dl−α,αi (x)

v0,α(xi)g(xi)v0,α(xi).

Hence both the polynomials Kmf and Gm are expanded in the basisϕi(x) :=

Dl−α,αi (x)

v0,α(xi)

i=1,...,m

. (3.7)

Therefore we express also the unknown fm in the same basis, i.e., we setfm =

∑mj=1 ajϕj . Now by substituting (3.5), (3.6) and the expression of fm in

(3.2) and making equal the corresponding coefficients in the basis we get

ai + ν(Kfm)(xi)v0,α(xi) = g(xi)v0,α(xi), i = 1, . . . ,m. (3.8)

Taking into account that l−α,αj (x) = λ−α,α

j

∑m−1k=0 p−α,α

k (x)p−α,αk (xj), where

λ−α,αj ≡ λm(v−α,α, xj) =

[m−1∑i=0

p2i (v

−α,α, xj)

]−1

,

Page 323: Recent Advances in Operator Theory and its Applications: The Israel Gohberg Anniversary Volume

Numerical Methods for Cauchy Singular Integral Equations 319

denotes the jth Christoffel number, we get

Kfm(xi) =m∑

j=1

aj

∫ 1

−1

k(x, xi)ϕj(x)vα,−α(x)dx

=m∑

j=1

aj

v0,α(xj)

∫ 1

−1

k(x, xi)λ−α,αj

m−1∑k=0

(Dp−α,αk )(x)p−α,α

k (xj)vα,−α(x)dx

=m∑

j=1

aj

v0,α(xj)λ−α,α

j

m−1∑k=0

p−α,αk (xj)

∫ 1

−1

k(x, xi)pα,−αk (x)vα,−α(x)dx.

Using this expression in (3.8) we finally have the linear system

ai + ν

m∑j=1

ajv0,α(xi)v0,α(xj)

λ−α,αj

m−1∑k=0

p−α,αk (xj)mk(xi) = bi, i = 1, . . . ,m (3.9)

where bi = g(xi)v0,α(xi) and

mk(y) :=∫ 1

−1

k(x, y)pα,−αk (x)vα,−α(x)dx.

The obtained system can be rewritten in a more significant matrix form.Indeed, if we put

Dm = (p−α,αj (xi))i=1,...,m,j=0,...,m−1,

thenD−1

m = (λj(v−α,α)p−α,αk (xj))k=0,...,m−1,j=1,...,m

(see, e.g., [9]).Hence, setting

Mm = (mk(xi))i=1,...,m,k=0,...,m−1, Λm = diag((v0,α(xj))j=1,m)

and denoting by Im the identity matrix of order m, system (3.9) becomes

(Im + νΛ−1m MmD−1

m Λm)am = bm (3.10)

where am = (ai)Ti=1,...,m and bm = (g(x1)v0,α(x1), . . . , g(xm)v0,α(xm))T .

For the sake of simplicity set Cm = Im +νΛ−1m MmD−1

m Λm. Since in the casewhen α ≥ 1/2, Cm is the matrix representation of the operator I + νKm in thebasis ϕii, for arbitrary polynomials fm =

∑mi=1 aiϕi and Gm =

∑mi=1 biϕi there

holds that Cmam = bm ⇔ (I + νKm)fm = Gm, i.e., equation (3.10) is equivalentto (3.2). Now denote by cond(Cm) the condition number of Cm considered as alinear operator in Rm equipped with the ∞ norm.

Proposition 3.2. Under the assumptions of Theorem 3.1 and with α ≥ 1/2 we have

supm

cond(Cm)log4 m

<∞. (3.11)

Page 324: Recent Advances in Operator Theory and its Applications: The Israel Gohberg Anniversary Volume

320 G. Mastroianni, M.G. Russo and W. Themistoclakis

During the construction of system (3.10) we supposed α ≥ 1/2. The case0 < α < 1/2 only needs some additional computations.

The described numerical method can be used when the kernel k(x, t) has someweak singularities. The main effort consists of the computation of the so called“modified moments” mk(xi), which, sometimes, satisfy stable recurrent relations,as we will show in some significant examples later on.

Assume now that the kernel k(x, t) is sufficiently smooth. We propose a dif-ferent approach, avoiding the computation of the mj(xk).

Indeed in this case instead of (3.2) we consider the equation

(I + νK∗m)fm = Gm (3.12)

where fm is the unknown polynomial, Gm is the same as before and K∗m was

defined in (2.16).

Theorem 3.3. Let 0 < α < 1. Under the assumptions of Lemma 2.4, if g ∈Zr(v0,α), then for any sufficiently large m, there exists a unique polynomial fm ∈Pm+1, which is the solution of (3.12). Moreover the following estimate holds

‖(f − fm)vα,0‖∞ ≤ Clogmmr

‖g‖Zr(v0,α) (3.13)

where C is a positive constant independent of f and m.Consequently f ∈ Zr−ε(vα,0), with ε > 0 arbitrarily small. Moreover

|cond(I + νK∗m)− cond(I + νDK)| = O

(logmmr

), (3.14)

where the constant in O is independent of m.

Also in this case in order to compute the coefficients of fm assume α ≥ 1/2and set fm =

∑mi=1 aiϕi and Gm =

∑mi=1 biϕi. Using the same argument as

before, i.e., expanding both sides of (3.12) in the basis ϕii and making equal thecorresponding coefficients we get

ai + ν(K∗fm)(xi)v0,α(xi) = bi, i = 1, . . . ,m (3.15)

where K∗ was defined in (2.15). So we have to evaluate the quantities (K∗fm)(xi).Using the gaussian quadrature formula we have

(K∗fm)(xi) =∫ 1

−1

Lα,−αm (k(·, xi), x)fm(x)vα,−α(x)dx

=m∑

k=1

λα,−αk k(tk, xi)

m∑j=1

ajϕj(tk),

where tk denote the zeros of pα,−αm and λα,−α

k the Christoffel numbers with respectto the weight vα,−α.

By (3.7), using l−α,αj (x) = p−α,α

m (x)/[(x − xj)p′m(v−α,α, xj)], since

D

[p−α,α

m

· − xj

](x) =

pα,−αm (x) − pα,−α

m (xj)x− xj

Page 325: Recent Advances in Operator Theory and its Applications: The Israel Gohberg Anniversary Volume

Numerical Methods for Cauchy Singular Integral Equations 321

and [22]pα,−α

m (xj)p′m(v−α,α, xj)

=sinπαπ

λ−α,αj ,

we get

(K∗fm)(xi) =sinπαπ

m∑k=1

λα,−αk k(tk, xi)

m∑j=1

aj

v0,α(xj)λ−α,α

j

xj − tk.

Using this expression in (3.15) we finally get the system

ai + νsinπαπ

m∑j=1

ajv0,α(xi)v0,α(xj)

λ−α,αj

m∑k=1

λα,−αk

k(tk, xi)xj − tk

= bi, (3.16)

i = 1, . . . ,m

where bi = g(xi)v0,α(xi). We remark that, since [26] minj,k |ϑj −τk| ∼ m−1, wherexj = cosϑj and tk = cos τk, the last term at the left-hand side in (3.16) alwaysmakes sense.

Now denote by Bm the matrix of system (3.16) and by cond (Bm) its condi-tion number in the ∞ norm. We have

Proposition 3.4. Under the assumptions of Theorem 3.3 and with α ≥ 1/2 we have

supm

cond (Bm)log4 m

<∞. (3.17)

Finally the case 0 < α < 1/2 can be handled in a similar way but with someadditional computations.

Remark 3.5. Looking at (3.16) and (3.9) we underline that, even if in some casesthe computational efforts may be comparable, the entries of the matrix in (3.16)can be always computed, while sometimes the computation of the modified mo-ments in (3.9) can be very hard.

4. Some special cases

In this section we will consider some special cases of the operator K. Consider

Kf(y) = Kµf(y) :=∫ 1

−1

kµ(x, y)f(x)vα,−α(x)dx where we set

kµ(x, y) :=

⎧⎨⎩ |x− y|µ, µ > −1, µ = 0

log |x− y|, µ = 0 .(4.1)

We have the following result.

Page 326: Recent Advances in Operator Theory and its Applications: The Israel Gohberg Anniversary Volume

322 G. Mastroianni, M.G. Russo and W. Themistoclakis

Lemma 4.1. Let f ∈ Cvα,0 , 0 < α < 1 and µ > −1. Then

supτ>0

Ωkϕ(Kµf, τ)v0,α

τ1+µ≤ C‖fvα,0‖∞, k > 1 + µ, µ = 0, (4.2)

supτ>0

Ωϕ(Kµf, τ)v0,α

τ log τ−1≤ C‖fvα,0‖∞, µ = 0 (4.3)

where in both cases C is a positive constant independent of f .

The proof of the previous lemma is technical and we refer the reader to theAppendix for a sketch of it.

The previous result assures that assumption (2.8) is satisfied with r = 1 + µin the case µ = 0, and 0 < r < 1, for µ = 0. Hence in the case −1 < µ ≤ 0 wewill solve the linear system (3.9), since Theorem 3.1 holds true. In the case µ > 0both of the Theorems 3.1 and 3.3 are true and so we can solve the linear systems(3.9) or (3.16).

In the cases when we have to solve (3.9) it is necessary to compute theintegrals

mj(y) =∫ 1

−1

kµ(x, y)pα,−αj (x)vα,−α(x)dx.

The quantities mj(y) can be computed by means of suitable recurrencerelations. In the Appendix we will give such recurrence relations in the cases−1 < µ < 0 and µ = 0, α = 1/2.

We note that in the particular case α = 1/2, µ = 0, since ddyK

µf(y) =−πDf(y), using the boundedness of D : Zs(vα,0) −→ Zs(v0,α) we immediatelyget ‖Kµf‖Zs+1(v0,α) ≤ C‖f‖Zs(vα,0) for any f ∈ Zs(vα,0), s > 0. For the numericalmethod (3.2) this means that the rate of convergence will only depend on thesmoothness of the right-hand side of (3.1). For instance if g ∈ Zs(v0,α), by Theorem3.1 we get that the rate of convergence will be O(logm/ms).

Finally we remark that the previous results can be generalized for operatorsof the type Kf(y) = Kµf(y) :=

∫ 1

−1 q(x, y)kµ(x, y)f(x)vα,−α(x)dx where q is a

smooth function (assume for instance that q is many times differentiable with re-spect to both variables). Anyway from the computational point of view the problemis how to compute efficiently the corresponding integrals of the type mj(y).

5. Numerical examples

In this section we give some numerical tests for the proposed methods. All thecomputations were performed in 16-digits arithmetic. In every example we givethe values of the weighted approximating polynomials in two internal points of[−1, 1], the condition number of the matrix in ∞ norm (denoted by κ∞) and thegraph of one weighted approximating polynomial of suitable degree.

Page 327: Recent Advances in Operator Theory and its Applications: The Israel Gohberg Anniversary Volume

Numerical Methods for Cauchy Singular Integral Equations 323

Example 1. Consider the equation

12f(y)v

23 ,− 2

3 (y) +√

32π

∫ 1

−1

f(x)x− y

v23 ,− 2

3 (x)dx

+∫ 1

−1

|x− y|− 18 f(x)v

23 ,− 2

3 (x)dx = 1 + y2.

In this case the right-hand side is smooth while the kernel of the perturbation isweakly singular. We apply the method (3.2), i.e., we solve system (3.9). Accordingto estimate (3.3) we expect an order of convergence O(logm/m7/8). Anyway inthe internal points of [−1, 1] the convergence is faster.

m y = .5 y = .9 κ∞16 1.0009 .634632 1.000993 .6346 12.25364 1.0009932 .634659 12.611128 1.0009932 .63465960 12.817256 1.00099326 .63465960 12.935512 1.0009932693 .634659602 13.002

The graph of the weighted approximating polynomial f512(y)v23 ,0(y) is given

in Fig. 1.

−1 −0.8 −0.6 −0.4 −0.2 0 0.2 0.4 0.6 0.8 10

0.2

0.4

0.6

0.8

1

1.2

1.4

Figure 1. f512(y)v23 ,0(y)

Page 328: Recent Advances in Operator Theory and its Applications: The Israel Gohberg Anniversary Volume

324 G. Mastroianni, M.G. Russo and W. Themistoclakis

Example 2. Consider the equation

cos(.7π)f(y)v.7,−.7(y) − sin(.7π)π

∫ 1

−1

f(x)x− y

v.7,−.7(x)dx

+12

∫ 1

−1

e(x+y)f(x)v.7,−.7(x)dx = sin (1 + y)

In this case both the right-hand side and the kernel are analytic functions. Weapply method (3.12), i.e., linear system (3.16). According to (3.13) a very fastconvergence is expected. Indeed we get the machine precision with m = 32.

m y = −.8 y = −.4 κ∞8 .524545 .48156016 .52454584772060 .48156018319029 8.98452652219436032 .52454584772060 .481560183190292 10.24891532148221

The graph of the approximation f32(y)v0.7,0(y) is given in Fig. 2.

−1 −0.8 −0.6 −0.4 −0.2 0 0.2 0.4 0.6 0.8 1−0.3

−0.2

−0.1

0

0.1

0.2

0.3

0.4

0.5

0.6

Figure 2. f32(y)v0.7,0(y)

Example 3. Consider the equation

12f(y)v

23 ,− 2

3 (y) +√

32π

∫ 1

−1

f(x)x− y

v23 ,− 2

3 (x)dx

− 12π

∫ 1

−1

|x− y|3.5f(x)v23 ,− 2

3 (x)dx = cos(1 + y)

In this case the kernel is smooth but is of the type (4.1). So we can solve system(3.16) or (3.9). In both cases the rate of convergence expected is O(logm/m3.5).The numerical test shows essentially the same results, using one system or the

Page 329: Recent Advances in Operator Theory and its Applications: The Israel Gohberg Anniversary Volume

Numerical Methods for Cauchy Singular Integral Equations 325

other. There is only a slightly better behavior in the case of (3.9) due to the exactcomputation of the modified moments.

m y = .5 y = .9 κ∞16 -.97619 -.4845 35.6932832 -.9761992 -.484502 44.0508164 -.97619925 -.4845025 48.33887128 -.976199252 -.48450252 50.99769256 -.97619925225 -.48450252058 52.46197

The graph of the weighted approximation f512(y)v23 ,0(y) is given in Fig. 3.

−1 −0.8 −0.6 −0.4 −0.2 0 0.2 0.4 0.6 0.8 1−1

−0.8

−0.6

−0.4

−0.2

0

0.2

0.4

Figure 3. f512(y)v23 ,0(y)

Example 4. The last test is devoted to the case of the so called “generalized airfoilequation”

− 1π

∫ 1

−1

f(x)x− y

√1− x

1 + xdx− 1

2

∫ 1

−1

log |x− y|f(x)

√1− x

1 + xdx = e3x

As remarked at the end of Section 4 in this case the rate of convergence dependsonly on the smoothness of the right-hand side. The numerical evidence confirmsthis expectation.

m y = .5 y = .9 κ∞8 5.09 6.3716 5.097410331 6.372656359 8.6845132 5.09741033161161 6.37265635964955 9.92191

Page 330: Recent Advances in Operator Theory and its Applications: The Israel Gohberg Anniversary Volume

326 G. Mastroianni, M.G. Russo and W. Themistoclakis

The graph of the weighted polynomial f32(y)√

1− y is given in Fig. 4

−1 −0.8 −0.6 −0.4 −0.2 0 0.2 0.4 0.6 0.8 1−5

0

5

10

15

20

25

30

35

Figure 4. f32(y)√

1− y

6. Proofs of the main results

Proof of Lemma 2.1.Let Pm ∈ Pm as in the assumption. By (2.4) we have∫ 1

m

0

ωϕ(f − Pm, t)vγ,δ

tdt =

∑j≥m

∫ 1j

1j+1

ωϕ(f − Pm, t)vγ,δ

tdt

≤∑j≥m

ωϕ

(f − Pm,

1j

)vγ,δ

log(

1 +1j

)≤∑j≥m

ωϕ(f − Pm, 1/j)vγ,δ

j

≤ C∑j≥m

1j2

j∑i=0

Ei(f − Pm)vγ,δ

≤ C∑j≥m

1j2

(mEm(f)vγ,δ +

j∑i=m

Ei(f)vγ,δ

)

≤ CEm(f)vγ,δ + C∑j≥m

1j2

j∑i=m

Ei(f)vγ,δ

Page 331: Recent Advances in Operator Theory and its Applications: The Israel Gohberg Anniversary Volume

Numerical Methods for Cauchy Singular Integral Equations 327

≤ CEm(f)vγ,δ + C

∞∑i=m

Ei(f)vγ,δ

i

≤ C

∫ 1m

0

ωkϕ(f, t)vγ,δ

tdt + C

∞∑i=m

Ei(f)vγ,δ

i

having used (2.3). Now using again (2.3) we have that the sum on the right-handside can be estimated as

∞∑i=m

Ei(f)vγ,δ

i≤ C

∞∑i=m

1iωk

ϕ(f, 1/i)vγ,δ

≤ C

∞∑i=m

∫ 1i

1i+1

ωkϕ(f, t)vγ,δ

tdt ≤ C

∫ 1m

0

ωkϕ(f, t)vγ,δ

tdt

and then the Lemma follows. In order to prove Theorem 2.2 we first need the following result.

Proposition 6.1. Let L−α,αm , 0 < α < 1, be the Lagrange operator defined in (2.11).

Then for every φ ∈ Cv0,α we get

‖(DL−α,αm φ)vα,0‖∞ ≤ C logm‖φv0,α‖∞ (6.1)

where C is a positive constant independent of m and φ.

Proof. First consider the case α ≥ 1/2. Using the property Dp−α,αm = pα,−α

m , weget ∀x ∈ [−1, 1]

D

[p−α,α

m

· − xk

](x) =

pα,−αm (x) − pα,−α

m (xk)x− xk

(6.2)

where xk, k = 1, . . . ,m, denote the zeros of p−α,αm . Hence denoting by xd the

closest node to x, i.e., |xd − x| ≤ |xk − x|, k = 1, 2, . . . ,m, we get∣∣∣(1− x)αDL−α,αm φ(x)

∣∣∣ (6.3)

≤∣∣∣∣(1− x)α φ(xd)

p′m(v−α,α, xd)pα,−α

m (x) − pα,−αm (xd)

x− xd

∣∣∣∣+

∣∣∣∣∣∣(1− x)αpα,−αm (x)

m∑k=1k =d

φ(xk)x− xk

1p′m(v−α,α, xk)

∣∣∣∣∣∣+

∣∣∣∣∣∣(1− x)αm∑

k=1k =d

φ(xk)x− xk

pα,−αm (xk)

p′m(v−α,α, xk)

∣∣∣∣∣∣ =: A1 + A2 + A3

Let us estimate the quantities Ai separately. Since [22, (3.1), p. 124],

pα,−αm (xk)

p′m(v−α,α, xk)=

sinπαπ

λ−α,αk (6.4)

Page 332: Recent Advances in Operator Theory and its Applications: The Israel Gohberg Anniversary Volume

328 G. Mastroianni, M.G. Russo and W. Themistoclakis

for A3 we get (see, e.g., [6],[23])

A3 ≤sinπαπ

(1− x)αm∑

k=1k =d

|φ(xk)||x− xk|

λ−α,αk ≤ C logm‖φv0,α‖∞. (6.5)

In order to estimate A2 we recall the following inequalities [29](√1− x +

1m

)γ+ 12(√

1 + x +1m

)δ+ 12

|pm(vγ,δ, x)| ≤ C, (6.6)

∀x ∈ [−1, 1],

|p′m(vγ,δ, yk)|−1 ∼ ϕ(yk)vγ/2+1/4,δ/2+1/4(yk)m−1. (6.7)

Here γ, δ > −1, ϕ(x) =√

1− x2, ykk=1,...,m are the zeros of pγ,δm and C and the

constants in ∼ are independent of m and k. We need also the following estimate[20, Lemma 4.1]

m∑k=1k =d

vγ,δ(yk)m|x− yk|

≤ C logm(√

1− x +1m

)2γ−1(√1 + x +

1m

)2δ−1

(6.8)

where −1/2 ≤ γ, δ ≤ 1/2 and yd is the zero of pγ,δm nearest to x.

Therefore by (6.7) we get

A2 ≤ ‖φv0,α‖∞|(1− x)αpα,−αm (x)|

m∑k=1k =d

(1 + xk)−α

|p′m(v−α,α, xk)|1

|x− xk|

≤ C‖φv0,α‖∞|(1− x)αpα,−αm (x)|

m∑k=1k =d

ϕ(xk)m

v−α/2+1/4,−α/2+1/4(xk)|x− xk|

.

Since α ≥ 1/2, by (6.8) and (6.6), we have

A2 ≤ C logm‖φv0,α‖∞|(1 − x)αpα,−αm (x)| (6.9)

·(√

1− x +1m

)−α+ 12(√

1 + x +1m

)−α+ 12

≤ C logm‖φv0,α‖∞.

Only the estimation of A1 is left. Using the mean value theorem we have

A1 ≤ ‖φv0,α‖∞(1− x)α|p′m(vα,−α, ξ)| (1 + xd)−α

|p′m(v−α,α, xd)|where |ξ − x| ≤ |xd − x|. Since xd is one of the nodes closest to x, 1± xd ∼ 1± ξand, by (6.7), we have

A1 ≤ C

m‖φv0,α‖∞|p′m(vα,−α, ξ)|vα/2+3/4,−α/2+3/4(ξ) (6.10)

∼ C‖φv0,α‖∞|pα+1,−α+1m−1 (ξ)vα/2+3/4,−α/2+3/4(ξ)| ≤ C‖φv0,α‖∞

where we used [31, (4.5.5), p. 72] and (6.6).


Hence (6.1), in the case α ≥ 1/2, follows using estimates (6.5), (6.9) and(6.10) in (6.3) and then taking the maximum with respect to x.

The proof in the case α < 1/2 is exactly the same. Indeed if we set xm+1 =(xm + 1)/2, x0 = −xm+1 and q−α,α(x) = (x − x0)(x − xm+1)p−α,α

m (x), we get

q′−α,α(xk) =

⎡⎣ −2xm+1p−α,αm (−xm+1), k = 0

(xk + xm+1)(xk − xm+1)p′m(v−α,α, xk) k = 1, . . . ,m2xm+1p

−α,αm (xm+1), k = m + 1.

(6.11)

Identities (6.2), (6.4) still hold true with q−α,α playing the role of p−α,αm . Thus,

taking also into account that [29]

p−α,αm (xm+1) ∼ m−α+1/2, p−α,α

m (−xm+1) ∼ mα+1/2

the previous proof can be repeated word by word with q−α,α in place of p−α,αm .

Proof of Theorem 2.2. Let P be an arbitrary polynomial of degree m− 1, m > 1.Since there holds [24]

‖(Dφ)vα,0‖∞ ≤ C

[‖φv0,α‖∞ +

∫ 1

0

ωϕ(φ, t)v0,α

dt

t

](6.12)

and DP ∈ Pm−1, we have

‖D[φ− L−α,αm φ]vα,0‖∞ ≤ ‖D[φ− P ]vα,0‖∞ + ‖D[P − L−α,α

m φ]vα,0‖∞

≤ C‖(φ− P )v0,α‖∞ + C

∫ 1

0

ωϕ(φ − P, t)v0,α

tdt + ‖D[L−α,α

m (P − φ)]vα,0‖∞.

Now by applying Proposition 6.1 we get

‖D[φ− L−α,αm φ]vα,0‖∞ ≤ C logm‖(φ− P )v0,α‖∞

+ C

∫ 1m

0

ωϕ(φ− P, t)v0,α

tdt

where C is a positive constant independent of m and φ. Now choosing P as thebest approximation of φ in Cv0,α and using Lemma 2.1 we get

‖D[φ− L−α,αm φ]vα,0‖∞ ≤ C logmEm(φ)v0,α + C

∫ 1m

0

ωkϕ(φ, t)v0,α

tdt.

Therefore (2.12) follows by inequality (2.3) and recalling that if ωkϕ(φ, t) ∼ tr then

ωkϕ(φ, t) ∼ Ωk

ϕ(φ, t).

Proof of Lemma 2.3. The lemma follows by Theorem 2.2 and assumption (2.8). Proof of Lemma 2.4. First we note that, by definition, (2.18) implies (2.8). More-over we get

‖[(DK −K∗m)f ]vα,0‖∞ ≤ ‖[(DK −Km)f ]vα,0‖∞

+ ‖[DL−α,αm (K∗ −K)f ]vα,0‖∞


where Km is the operator defined in (2.13). Since (2.8) holds true, by Lemma 2.3and Proposition 6.1 it follows

‖[(DK −K∗m)f ]vα,0‖∞ ≤ C

logmmr

‖fvα,0‖∞ + C logm‖(K∗ −K)fv0,α‖∞.

So we have to evaluate the second term on the right-hand side. By the definitionof K∗ and using (2.10) we have

‖(K∗ −K)fv0,α‖∞

= max|y|≤1

(1 + y)α

∣∣∣∣∫ 1

−1

[ky − L−α,αm (ky)](x)f(x)vα,−α(x)dx

∣∣∣∣≤ ‖fvα,0‖∞ max

|y|≤1(1 + y)α

∫ 1

−1

∣∣ky(x)− L−α,αm (ky)(x)

∣∣ v0,−α(x)dx

≤ C‖fvα,0‖∞ max|y|≤1

(1 + y)αEm(ky)

≤ C‖fvα,0‖∞ max|y|≤1

(1 + y)αωkϕ(ky,m

−1).

Since by (2.17) it follows that ωkϕ(ky ,m

−1) ∼ Ωkϕ(ky,m

−1), again by (2.17) wefinally get

‖(K∗ −K)fv0,α‖∞ ≤ C

mr‖fvα,0‖∞

and the Lemma follows.

Proof of Theorem 3.1. By Lemma 2.3 we immediately get that I + νDKm isinvertible and uniformly bounded in Cvα,0 . Therefore by the identity

f − fm = (I + νDKm)−1[(Dg − DL−α,α

m g) + ν(DK −Km)f]

it follows

‖(f − fm)vα,0‖∞ ≤ C‖(Dg − DL−α,αm g)vα,0‖∞

+ C‖(DK −Km)(f)vα,0‖∞

where C is a positive constant independent of m and f . Hence (3.3) can be obtainedby applying Theorem 2.2 and Lemma 2.3. Finally working as in [15] estimate (3.4)can be deduced by Lemma 2.3.

Proof of Proposition 3.2. To prove the proposition we need some preliminary re-sults. First of all since [24]

‖(Dφ)v0,α‖∞ ≤ C

[‖φvα,0‖∞ +

∫ 1

0

ωϕ(φ, t)vα,0dt

t

], (6.13)


by (2.5) we get, for any φ ∈ Pm−1,

‖(Dφ)v0,α‖∞ ≤ C

[‖φvα,0‖∞ +

1m‖φ′ϕvα,0‖∞ (6.14)

+∫ 1

1m

ωϕ(φ, t)vα,0dt

t

]≤ C logm‖φvα,0‖∞

having used the Bernstein inequality and the definition of ωϕ and where C =C(φ,m).

On the other hand we remark that if φ ∈ Pm−1 and φ =∑m

i=1 ciϕi, withϕi defined in (3.7), then ci = (Dφ)(xi)v0,α(xi), where xi are the zeros of p−α,α

m .Moreover by Proposition 6.1 we have

‖φvα,0‖∞ ≤ ‖c‖∞ max|x|≤1

vα,0(x)m∑

i=1

|Dl−α,αi (x)|

v0,α(xi)

= ‖c‖∞‖DL−α,αm ‖Cvα,0−→Cv0,α ≤ C logm‖c‖∞ (6.15)

where C = C(m) and c = (c1, . . . , cm)T .Now let a = (a0, . . . , am−1)T ∈ Rm, b = (b0, . . . , bm−1)T ∈ Rm and set

fm =m−1∑i=0

aiϕi, Gm =m−1∑i=0

biϕi.

Since Cm represents the operator I + νKm in the basis ϕii we have

(I + νKm)fm = Gm ⇐⇒ Cma = b.

By (6.14) and for any a ∈ Rm we get

‖Cma‖∞ = ‖b‖∞ = max1≤i≤m

|(DGm)(xi)v0,α(xi)|

≤ max|x|≤1

|(DGm)(x)v0,α(x)| ≤ C logm‖Gmvα,0‖∞

= C logm‖[(I + νKm)fm]vα,0‖∞≤ C logm‖fmvα,0‖∞‖I + νKm‖Cvα,0−→Cvα,0

≤ C log2 m‖a‖∞‖I + νDK‖Cvα,0−→Cvα,0

where C = C(m) and we used (6.15).Analogously for any b ∈ Rm, if a = C−1

m b we get

‖C−1m b‖∞ = ‖a‖∞ = max

1≤i≤m|(Dfm)(xi)v0,α(xi)|

≤ max|x|≤1

|(Dfm)(x)v0,α(x)| ≤ C logm‖fmvα,0‖∞

= C logm‖[(I + νKm)−1Gm]vα,0‖∞≤ C logm‖Gmvα,0‖∞‖(I + νKm)−1‖Cvα,0−→Cvα,0

≤ C log2 m‖b‖∞‖(I + νDK)−1‖Cvα,0−→Cvα,0 , C = C(m)


having used the same arguments as before. Thus the proof is complete. Proof of Theorem 3.3. The proof is similar to that of Theorem 3.1. Indeed we canrepeat the same arguments using Lemma 2.4 instead of Lemma 2.3. Proof of Proposition 3.4. We can repeat word by word the proof of Proposition 3.2using operator K∗

m, defined in (2.16), instead of Km.

7. Appendix

Proof of Lemma 4.1. Let µ > −1, k be an integer s.t. k > 1+µ and |y| ≤ 1−4h2k2,0 < h ≤ τ . Taking into account the definition of Ωk

ϕ we have to estimate

|∆khϕ(y)(K

µf)(y)(1+y)α|≤∣∣∣∣∫ 1

−1

∆khϕ(y)(k

µx(y))f(x)(1−x)α(1+x)−αdx

∣∣∣∣(1+y)α

≤(1+y)α‖fvα,0‖∞∫ 1

−1

|∆khϕ(y)k

µx(y)|(1+x)−αdx. (7.1)

We split the integration interval on the right-hand side as follows∫ 1

−1

|∆khϕ(y)k

µx(y)|(1 + x)−αdx =

∫ y−2khϕ(y)

−1

+∫ y+2khϕ(y)

y−2khϕ(y)

(7.2)

+∫ 1

y+2khϕ(y)

|∆k

hϕ(y)kµx(y)|(1 + x)−αdx :=

3∑i=1

Gi(y, h).

We remark that in the case of G3 we can use that [7, p. 21]

|∆khϕ(y)k

µx(y)| ≤ C (2khϕ(y))k (x− ξ)µ−k ≤ Chk(x− ξ)µ−k,

where ξ ∈[y − k

2hϕ(y), y +

k

2hϕ(y)

]. Since k > 1 + µ and µ > −1 (also µ = 0)

we get

G3(y, h) ≤ Chk

∫ 1

y+2khϕ(y)

(x− y − khϕ(y))µ−k(1 + x)−αdx

≤ Chk(1 + y)−α

∫ 1−(y+kh)

khϕ(y)

uµ−kdu

≤ Chk(1 + y)−α

∫ ∞

khϕ(y)

uµ−kdu ≤ C(1 + y)−αhµ+1.

In order to estimate G1(y, h) it is sufficient to split the integration intervalin [−1,−1 + (1 + y)/2] and [−1 + (1 + y)/2, y − 2hkϕ(y)].

Finally estimate G2(y, h). We note that in this case 1 + y ∼ 1 + x. Hence wecan write

G2(y, h) ≤ C(1 + y)−αk∑

j=0

∫ y+2khϕ(y)

y−2khϕ(y)

∣∣∣∣kµ

(x, y +

(k

2− j

)hϕ(y)

)∣∣∣∣ dx.


By means of basic computations it is easy to see that each term in the sum has order h^{µ+1} if µ ≠ 0 and h log h^{−1} if µ = 0.

Therefore using the obtained estimates for Gi(y, h), i = 1, 2, 3 in (7.1)–(7.2)and taking the sup on 0 < h ≤ τ the Lemma immediately follows.

7.1. The recurrence relations for the modified moments.

For the convenience of the reader and for further reference, we collect here some recurrence relations for computing the modified moments of the type

\[ m_j(y)=\int_{-1}^{1}k_\mu(x,y)\,p_j^{\alpha,\beta}(x)\,v^{\alpha,\beta}(x)\,dx,\qquad \alpha,\beta>-1, \]

where k_µ is the kernel defined in (4.1).

The case µ ≠ 0. Using [31, (4.5.5), p. 72] and the Rodrigues formula [31, (4.10.1), p. 94]

vα,β(x)pα,βn (x) = − 1√

n(n + α + β + 1)d

dxvα+1,β+1(x)pα+1,β+1

n−1 (x)

it is possible to deduce, for α + β ≠ −1, the recurrence relation

δj+1

(1 +

µ + 1j + α + β + 1

)mj+1(y) = (y + γj)mj(y)

+ δj

(µ+ 1j

− 1)mj−1(y), j = 1, 2, . . .

where

δj =

√4j(j + α)(j + β)(j + α + β)

(2j + α + β − 1)(2j + α + β)2(2j + α + β + 1)

γj =(α− β)(α + β + 2µ+ 2)

(2j + α + β)(2j + α + β + 2).

The starting moments are defined as follows

m0(y) =1

γ(α, β)[m−

0 (y, µ) + m+0 (y, µ)]

with

γ(α, β) =√

2α+β+1B(1 + α, 1 + β)

m−0 (y, µ) := 2α(1 + y)β+µ+1B(1 + µ, 1 + β)

2F1

(−α, 1 + β, µ + β + 2,

1 + y

2

)m+

0 (y, µ) := 2β(1− y)α+µ+1B(1 + µ, 1 + α)

2F1

(−β, 1 + α, µ + α + 2,

1− y

2

)


where B and 2F1 denote respectively the Beta and the hypergeometric function,and

m1(y) =12

√α + β + 3

(α + 1)(β + 1)[(α + β + 2)y + (α− β)]m0(y)

+(α + β + 2)γ(α, β)

(m+0 (y, µ+ 1)−m−

0 (y, µ+ 1))

with γ(α, β), m−0 and m+

0 defined above. In the case α + β = −1 the recurrencerelation still holds but with j = 2, 3, . . . and hence it is necessary to computeseparately also the starting moment m2(y).

The case µ = 0, α = 1/2. In [2, 17] a recurrence relation can be found for the modified moments

\[ m_j(y)=\int_{-1}^{1}\log|x-y|\;p_j^{\frac12,-\frac12}(x)\sqrt{\frac{1-x}{1+x}}\,dx, \]

expressed by using only the polynomials p_j^{-\frac12,\frac12}. The relation is

\[ m_0(y)=\sqrt{\pi}\,(y-\log 2), \]
\[ m_j(y)=\frac{\pi}{2}\Big[\frac{1}{j+1}\,p_{j+1}^{-\frac12,\frac12}(y)-\frac{1}{j(j+1)}\,p_{j}^{-\frac12,\frac12}(y)-\frac{1}{j}\,p_{j-1}^{-\frac12,\frac12}(y)\Big],\qquad j=1,2,\ldots, \]

and it can be computed by means of the formula

\[ p_0^{-\frac12,\frac12}(x)=\frac{1}{\sqrt{\pi}},\qquad p_1^{-\frac12,\frac12}(x)=\frac{1}{\sqrt{\pi}}(2x-1), \]
\[ p_j^{-\frac12,\frac12}(x)=2x\,p_{j-1}^{-\frac12,\frac12}(x)-p_{j-2}^{-\frac12,\frac12}(x),\qquad j=2,3,\ldots \]
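The recurrence above is straightforward to implement. The following Python sketch (an illustration added here, not part of the original paper) generates p_j^{-1/2,1/2} by the three-term recurrence and assembles m_j(y) from the printed formulas; the helper names and the test point y = 0.3 are arbitrary choices, and the final quadrature is only a rough sanity check of m_0.

```python
import math

def p_jacobi(j, x):
    """p_j^{(-1/2,1/2)}(x) from the three-term recurrence quoted above."""
    p_prev = 1.0 / math.sqrt(math.pi)               # p_0
    p_curr = (2.0 * x - 1.0) / math.sqrt(math.pi)   # p_1
    if j == 0:
        return p_prev
    for _ in range(2, j + 1):
        p_prev, p_curr = p_curr, 2.0 * x * p_curr - p_prev
    return p_curr

def modified_moment(j, y):
    """Modified moments m_j(y) of the logarithmic kernel (formulas above)."""
    if j == 0:
        return math.sqrt(math.pi) * (y - math.log(2.0))
    return (math.pi / 2.0) * (p_jacobi(j + 1, y) / (j + 1)
                              - p_jacobi(j, y) / (j * (j + 1))
                              - p_jacobi(j - 1, y) / j)

if __name__ == "__main__":
    y = 0.3
    print([modified_moment(j, y) for j in range(5)])
    # rough midpoint-rule sanity check of
    # m_0(y) = (1/sqrt(pi)) * int_{-1}^{1} log|x-y| sqrt((1-x)/(1+x)) dx
    N, s = 400000, 0.0
    for i in range(N):
        xi = -1.0 + (i + 0.5) * 2.0 / N
        s += math.log(abs(xi - y)) * math.sqrt((1.0 - xi) / (1.0 + xi)) * (2.0 / N)
    print(modified_moment(0, y), s / math.sqrt(math.pi))
```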

Acknowledgments

The authors are grateful to the referees for the accurate reading of the paper and their pertinent remarks.

References

[1] Berthold D., Hoppe W., Silbermann B., A fast algorithm for solving the generalizedairfoil equation, J. Comp. Appl. Math. 43 (1992), 185–219.

[2] Berthold D., Hoppe W., Silbermann B., The numerical solution of the generalizedairfoil equation, J. Integr. Eq. Appl. 4 (1992), 309–336.

[3] Capobianco M.R., The stability and the convergence of a collocation method for aclass of Cauchy singular integral equation, Math. Nachr. 162 (1993), 45–58.

[4] Capobianco M.R., Russo M.G., Uniform convergence estimates for a collocationmethod for the Cauchy Singular integral equation, J. Integral Equations Appl. 9,(1997), no. 1, 21–45.


[5] Capobianco M.R., Junghanns P., Luther U., Mastroianni G., Weighted uniform con-vergence of the quadrature method for Cauchy singular integral equations, Singularintegral operators and related topics (Tel Aviv, 1995) 153–181, Oper. Theory Adv.Appl. 90, Birkhauser, Basel, 1996.

[6] Criscuolo G., Mastroianni G., On the uniform convergence of Gaussian quadraturerules for Cauchy principal value integrals, Numer. Math., 54 (1989), no. 4, 445–461.

[7] Ditzian Z., Totik V., Moduli of smoothness, SCMG Springer-Verlag, New York BerlinHeidelberg London Paris Tokyo, 1987.

[8] Frammartino C., Russo M.G., Numerical remarks on the condition numbers and theeigenvalues of matrices arising from integral equations, Advanced special functionsand integration methods (Melfi, 2000), 291–310, Proc. Melfi Sch. Adv. Top. Math.Phys., 2, Aracne, Rome, 2001.

[9] Gautschi W., The condition of Vandermonde-like matrices involving orthogonal poly-nomials, Linear Algebra and Appl. 52, (1983) 293–300.

[10] Junghanns P., Luther U., Cauchy singular integral equations in spaces of continuousfunctions and methods for their numerical solution, ROLLS Symposium (Leipzig,1996). J. Comput. Appl. Math. 77 (1997), no. 1–2, 201–237.

[11] Junghanns P., Luther U., Uniform convergence of the quadrature method for Cauchysingular integral equations with weakly singular perturbation kernels, Proceedingsof the Third International Conference on Functional Analysis and ApproximationTheory, Vol. II (Acquafredda di Maratea, 1996). Rend. Circ. Mat. Palermo (2) Suppl.No. 52, Vol. II (1998), 551–566.

[12] Junghanns P., Luther U., Uniform convergence of a fast algorithm for a Cauchysingular integral equations, Proceedings of the Sixth Conference of the InternationalLinear Algebra Society (Chemnitz, 1996), Linear Algebra and Appl. 275/276 (1998),327–347.

[13] Junghanns P., Silbermann B., Zur Theorie der Naherungsverfahren fur singulareIntegralgleichungen auf Intervallen, Math. Nachr. 103 (1981), 199–244.

[14] Junghanns P., Silbermann B., The numerical treatment of singular integral equationsby means of polynomial approximations, Preprint, P–MAT–35/86, AdW der DDR,Karl Weierstrass Institut fur Mathematik, Berlin (1986).

[15] Laurita C., Mastroianni G., Revisiting a quadrature method for Cauchy singular integral equations with a weakly singular perturbation kernel, Problems and methods in mathematical physics (Chemnitz, 1999), 307–326, Operator Theory: Advances and Applications, 121, Birkhäuser, Basel, 2001.

[16] Laurita C., Mastroianni G., Russo M. G., Revisiting CSIE in L2: condition numbersand inverse theorems, Integral and Integrodifferential Equations, 159–184, Ser. Math.Anal. Appl., 2, Gordon and Breach, Amsterdam, 2000.

[17] Laurita C., Occorsio D., Numerical solution of the generalized airfoil equation, Ad-vanced special functions and applications (Melfi, 1999), 211–226, Proc. Melfi Sch.Adv. Top. Math. Phys., 1, Aracne, Rome, 2000.

[18] Luther U., Generalized Besov spaces and CSIE, Ph.D. Dissertation, 1998.

[19] Luther U., Russo M.G., Boundedness of the Hilbert transformation in some Besovtype spaces, Integr. Equ. Oper. Theory 36 (2000), no.2, 220–240.


[20] Mastroianni G., Uniform convergence of derivatives of Lagrange interpolation, J.Comput. Appl. Math. 43 (1992), 37–51.

[21] Mastroianni G., Nevai P., Mean convergence of derivatives of Lagrange interpolation,J. Comput. Appl. Math. 34 (1991), no. 3, 385–396.

[22] Mastroianni G., Prossdorf S., Some nodes matrices appearing in the numerical anal-ysis for singular integral equations, BIT 34 (1994), no. 1, 120–128.

[23] Mastroianni G., Russo M.G., Lagrange interpolation in some weighted uniformspaces, Facta Universitatis, Ser. Math. Inform. 12 (1997), 185–201.

[24] Mastroianni G., Russo M.G., Themistoclakis W., The boundedness of the Cauchysingular integral operator in weighted Besov type spaces with uniform norms, Integr.Equ. Oper. Theory 42, (2002), no.1, 57–89.

[25] Mastroianni G., Themistoclakis W., A numerical method for the generalized airfoilequation based on the de la Vallee Poussin interpolation, J. Comput. Appl. Math.180(1), 71–105 (2005).

[26] Mikhlin S.G., Prossdorf S., Singular Integral Operators, Akademie-Verlag, Berlin,1986.

[27] Monegato G., Prossdorf S., Uniform convergence estimates for a collocation and dis-crete collocation method for the generalized airfoil equation, Contributions to Numer-ical Mathematics (A.G. Agarval, ed.), World Scientific Publishing Company 1993,285–299 (see also the errata corrige in the Internal Reprint No. 14 (1993) Dip. Mat.Politecnico di Torino).

[28] Muskhelishvili N.I., Singular Integral Equations, Noordhoff, Groningen, 1953.

[29] Nevai P., Mean convergence of Lagrange interpolation III, Trans. Amer. Math. Soc.282 (1984), 669–698.

[30] Prossdorf S., Silbermann B., Numerical Analysis for Integral and related Opera-tor Equations, Akademie-Verlag, Berlin 1991 and Birkhauser Verlag, Basel-Boston-Stuttgart 1991.

[31] Szego G., Orthogonal Polynomials, AMS, Providence, Rhode Island, 1939.

[32] Timan A.F., Theory of approximation of functions of a real variable, Pergamon Press, Oxford, England, 1963.

G. Mastroianni and M.G. Russo
Dipartimento di Matematica
Università degli Studi della Basilicata
Campus Macchia Romana
I-85100 Potenza, Italy
e-mail: [email protected]
e-mail: [email protected]

W. Themistoclakis
CNR, Istituto per le Applicazioni del Calcolo "Mauro Picone"
Sezione di Napoli
Via P. Castellino 111
I-80131 Napoli, Italy
e-mail: [email protected]


Operator Theory: Advances and Applications, Vol. 160, 337–356
© 2005 Birkhäuser Verlag Basel/Switzerland

On a Gevrey-Nonsolvable Partial Differential Operator

Alessandro Oliaro

Abstract. We consider an operator whose principal part is the mth power of a Mizohata hypoelliptic operator. We assume that the lower order term vanishes at a small rate with respect to the principal part, and we prove the local nonsolvability in Gevrey classes, for large Gevrey index.

Mathematics Subject Classification (2000). 35A07, 35A20, 35D05.

Keywords. Gevrey classes, local solvability, Mizohata operator.

1. Introduction

In this paper we study the nonsolvability in Gevrey classes of the operator

\[ P=(D_t+iat^{2k}D_x)^m+c\,t^{\ell}D_x^n,\qquad (t,x)\in\mathbb{R}^2. \tag{1.1} \]

We recall that, given a real number s > 1 and an open set Ω ⊂ R^N, the Gevrey space G^s(Ω) is the set of all C^∞ functions f such that for every compact set K ⊂ Ω there exists C_K > 0 satisfying \sup_{x\in K}|\partial^\alpha f(x)|\le C_K^{|\alpha|+1}(\alpha!)^s for every α ∈ Z_+^N.

A differential operator Q is said to be G^s locally solvable at the point x_0 if there exists a neighborhood Ω of x_0 such that for any compactly supported function f ∈ G^s there is a solution u ∈ D'_s of the equation Qu = f in Ω, D'_s being the ultradistribution space, topological dual of G^s_0 := G^s ∩ C^∞_0. Since

\[ A(\Omega)=G^1(\Omega)\subset\cdots\subset G^{s_1}(\Omega)\subset G^{s_2}(\Omega)\subset\cdots\subset C^\infty(\Omega)\quad\text{for }1<s_1\le s_2, \]

any operator Q which is G^{s_2} locally solvable is also G^{s_1} locally solvable for s_1 < s_2. Therefore, dealing with non-C^∞ locally solvable operators, we can look for the biggest s up to which they remain G^s solvable, finding a critical index for the local solvability of Q.

Work supported by NATO grant PST.CLG.979347.


Operators connected to P, cf. (1.1), have been considered in many papers. It is well known that the operator

\[ D_t+it^hD_x, \]

h odd, is not C^∞ locally solvable, nor G^s locally solvable at the origin for any 1 < s < ∞; the same result holds for the operator

\[ (D_t+it^hD_x)^m+\text{lower order terms}; \tag{1.2} \]

the C^∞ nonlocal solvability of (1.2) was proved by Cardoso-Treves [3] for m = 2; Goldman [10] proved the C^∞ nonlocal solvability of (1.2) for arbitrary m but under conditions on the lower order terms; finally, Cicognani-Zanghirati [4] and Gramchev [12] showed that (1.2) is not G^s locally solvable for 1 < s < ∞, without any assumption on the lower order terms. Further generalizations of these results are given in Georgiev-Popivanov [9] and Marcolongo-Rodino [17], in which the case of infinite order vanishing coefficients is treated. Regarding the operator P, cf. (1.1), it follows from general results that P is G^s locally solvable for s < m/(m−1), cf. Gramchev-Rodino [14], Marcolongo-Oliaro [16], Spagnolo [24], De Donno-Oliaro [7]. Okaji [19], [20] studied P in the case m = 2, n = 1, proving that, for ℓ = 0, P is G^s hypoelliptic and G^s locally solvable for 1 ≤ s < 4k/(2k−1); moreover, P is C^∞ locally solvable if ℓ ≥ 2k − 1. The case m = 2, n = 1, ℓ < 2k − 1 has been studied by Oliaro-Popivanov-Rodino [21], who proved the G^s nonsolvability at the origin for s > (4k−ℓ)/(2k−ℓ−1), and by Calvo-Popivanov [2], in which the G^s local solvability is proved for s < (4k−ℓ)/(2k−ℓ−1); if m = 2, n = 1 and ℓ < 2k − 1, the critical index for the Gevrey local solvability of P is then found. Regarding the more general case m ≥ 2, we have a result due to Gramchev [11], who proved the G^s local solvability of

\[ (D_t+it^{2k}D_x)^m+\text{lower order terms} \]

for s < 2km/(2km−2k−1), under nonvanishing conditions on the lower order terms. In particular, the operator P was studied by Popivanov [22] in the case 2km−ℓ > 0, m+ℓ < n(2k+1), proving its C^∞ nonlocal solvability at the origin; in the present paper, under an additional condition, cf. (2.3) below, we prove that the operator P is not G^s locally solvable at the origin for s > s_{cr}, where s_{cr} = (2km−ℓ)/(2km−ℓ−(2k+1)(m−n)); this result had already been conjectured by Popivanov [22]. The additional condition (2.3) is technical, and we think that the result is true also when it is not satisfied, but with the technique used in this paper we cannot avoid it. On the other hand, we observe that our result is sharp; indeed, at least in the case m = 2, n = 1, the index s_{cr} that we find in this paper coincides with the critical one, in the sense that we have solvability for s < s_{cr}, cf. Calvo-Popivanov [2].

We finally want to give some references where background material on Partial Differential Equations can be found: many results on Gevrey classes and (non)solvability of Partial Differential Equations in Gevrey and other functional spaces are proved, for example, in the books of Gramchev-Popivanov [13] and Rodino [23]; see also the references therein.


2. The main theorem and the fundamental tool

Let us consider the operator (1.1), where a and c are constants, c ∈ C\{0}, a ∈ R, a < 0; we suppose that 1 ≤ n ≤ m − 1, ℓ, k ∈ N, ℓ ≥ 0, and moreover we shall assume that the next conditions hold:

(a) if m is odd, then c = i^m d, d > 0;
(b) if m is even, then c = −i^m d, d > 0.

We then have the following result.

Theorem 2.1. Assume that the previous conditions are satisfied, and moreover

\[ 2km-\ell>0, \tag{2.1} \]
\[ m+\ell<n(2k+1), \tag{2.2} \]
\[ m\ge 2n,\ \text{or otherwise}\ m<2n\ \text{and}\ \frac{m}{n}(2k+1)(m-n)\ge 2km-\ell. \tag{2.3} \]

Then the transposed operator ^tP of P is not G^s locally solvable at the origin for s > s_{cr}, where

\[ s_{cr}=1+\frac{(2k+1)(m-n)}{2kn+n-m-\ell}. \]

Remark 2.2. The conditions (2.1) and (2.2) mean, roughly speaking, that ℓ cannot be too large with respect to k, m and n; on the other hand, if m < 2n the hypothesis (2.3) prevents ℓ from being too small. For example, if n = m − 1 and k = m we are requiring

\[ 2m^2-\frac{m}{m-1}(2m+1)\le\ell<2m^2-2m-1; \]

for instance, when k = m = 4, n = 3, we must assume 20 ≤ ℓ < 23.
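As a small illustration (not from the paper), the critical index of Theorem 2.1 and the admissible range of ℓ can be tabulated directly from the stated conditions; the helper name gevrey_critical_index below is invented.

```python
from fractions import Fraction

def gevrey_critical_index(k, m, n, l):
    """Evaluate s_cr of Theorem 2.1 after checking (2.1)-(2.3).
    (Invented helper for illustration only.)"""
    assert 1 <= n <= m - 1 and l >= 0
    assert 2 * k * m - l > 0                                  # (2.1)
    assert m + l < n * (2 * k + 1)                            # (2.2)
    if m < 2 * n:                                             # (2.3)
        assert Fraction(m, n) * (2 * k + 1) * (m - n) >= 2 * k * m - l
    return 1 + Fraction((2 * k + 1) * (m - n), 2 * k * n + n - m - l)

# Example from Remark 2.2: k = m = 4, n = 3 forces 20 <= l < 23.
for l in (20, 21, 22):
    print(l, gevrey_critical_index(4, 4, 3, l))
```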

In the sequel we use the following notation:

\[ \varepsilon=\frac{m-n}{2km-\ell}; \tag{2.4} \]

then we can write

\[ s_{cr}=1+\frac{(2k+1)(m-n)}{2kn+n-m-\ell}=\frac{1}{1-(2k+1)\varepsilon}. \tag{2.5} \]

The tool that we shall use to prove our theorem is a necessary condition for the G^s local solvability, proved by Corli [5]. First of all we can introduce a topology in G^s_0(Ω) and G^s(Ω) in the following way. Let us fix K ⊂⊂ Ω and C > 0; we define G^s_0(Ω,K,C) as the set of all functions f ∈ C^∞(Ω) such that supp f ⊂ K and

\[ \|f\|_{s,K,C}:=\sup_{\alpha\in\mathbb{Z}_+^n}\Big(C^{-|\alpha|}(\alpha!)^{-s}\sup_{x\in K}|\partial^\alpha f(x)|\Big)<\infty. \tag{2.6} \]

Analogously we write G^s(Ω,K,C) for the space of all functions f ∈ G^s(Ω) for which the norm (2.6) of the restriction of f to K is finite. A topology in G^s_0(Ω)


and G^s(Ω) is then introduced as follows:

\[ G^s_0(\Omega)=\operatorname*{ind\,lim}_{K\Subset\Omega,\ C\to\infty}G^s_0(\Omega,K,C),\qquad G^s(\Omega)=\operatorname*{proj\,lim}_{K\Subset\Omega}\ \operatorname*{ind\,lim}_{C\to\infty}G^s(\Omega,K,C). \]

The following result is due to Corli [5], who extended to the Gevrey frame the well-known necessary condition of Hörmander [15] for the local solvability in the Schwartz distribution space D′.

Theorem 2.3 (Necessary condition). Let us fix s > 1 and let P be a linear partial differential operator with G^s coefficients; we suppose that P is G^s solvable in Ω (i.e., for every f ∈ G^s_0(Ω) there exists u ∈ D'_s(Ω), solution of Pu = f). Then for every compact set K ⊂ Ω and for every η > ε > 0 there exists a constant C > 0 such that

\[ \|u\|^2_{L^\infty(K)}\le C\,\|u\|_{s,K,\frac{1}{\eta-\varepsilon}}\,\|{}^tPu\|_{s,K,\frac{1}{\eta-\varepsilon}} \tag{2.7} \]

for every u ∈ G^s_0(Ω,K,\frac{1}{\eta}).

3. Construction of a suitable function violating Corli’s inequality

In this section we shall construct a function u_λ(t,x), depending on a (large) parameter λ, that shall contradict the condition (2.7) for λ → +∞. This construction is the same as in Popivanov [22], so we give here only some lines, referring to the above mentioned paper for a more precise treatment.

– First of all, up to a nonvanishing factor, we have that

\[ {}^tP=(D_t+iat^{2k}D_x)^m+(-1)^{m+n}c\,t^{\ell}D_x^n. \]

– By making a partial Fourier transform with respect to x in the equation P(t,D_x,D_t)u = 0 we obtain

\[ P(t,\xi,D_t)\hat u(\xi,t)=0, \tag{3.1} \]

where \hat u(\xi,t)=\int e^{-ix\xi}u(t,x)\,dx.

– If we choose \hat u(\xi,t)=e^{a\frac{t^{2k+1}}{2k+1}\xi}v(\xi,t) in (3.1), we obtain that \hat u(\xi,t) can be written in the form

\[ \hat u(\xi,t)=e^{a\frac{t^{2k+1}}{2k+1}\xi}\,w\big(t\,\xi^{\frac{n}{m+\ell}}\big), \tag{3.2} \]

ξ being considered as a positive (large) parameter; w(s) is then a solution of the equation

\[ \big(\partial_s^m-A^m s^{\ell}\big)w(s)=0, \tag{3.3} \]

where A^m = −c\,i^m, so A ∈ R, A > 0, according to (a)–(b).


From known results of asymptotic analysis, see, e.g., Fedoryuk [8, Chapter 5], Turrittin [25], Braaksma [1], we have that

\[ w(s)\sim C_0\,s^{\frac{1-m}{2m}\ell}\,e^{A\frac{s^{\frac{\ell}{m}+1}}{\frac{\ell}{m}+1}},\qquad s\to\infty, \]
\[ w^{(r)}(s)\sim C_r\,s^{\frac{1-m}{2m}\ell+r\frac{\ell}{m}}\,e^{A\frac{s^{\frac{\ell}{m}+1}}{\frac{\ell}{m}+1}},\qquad s\to\infty,\ r\ge1, \tag{3.4} \]

where the C_r are positive constants; see also Popivanov [22]. These asymptotic expansions are not suitable for our purposes, because they are not uniform with respect to r: indeed they mean that for every ε > 0

\[ \left|\frac{w^{(r)}(s)}{C_r\,s^{\frac{1-m}{2m}\ell+r\frac{\ell}{m}}\,e^{A\frac{s^{\frac{\ell}{m}+1}}{\frac{\ell}{m}+1}}}-1\right|<\varepsilon \]

for s > s_0, but s_0 depends on r. Since the Gevrey seminorms, cf. (2.6), involve all the derivatives of the function, we need some kind of uniform estimate with respect to r.

Lemma 3.1. Let w(s) be a solution of (3.3); then there exists a positive constant D such that for s > S_0 > 1 and for every r ≥ 1 we have

\[ |w(s)|\le D\,s^{\frac{1-m}{2m}\ell}\,e^{A\frac{s^{\frac{\ell}{m}+1}}{\frac{\ell}{m}+1}}, \tag{3.5} \]
\[ |w^{(r)}(s)|\le D^{r+1}r!\;s^{\frac{1-m}{2m}\ell+r\frac{\ell}{m}}\,e^{A\frac{s^{\frac{\ell}{m}+1}}{\frac{\ell}{m}+1}}, \tag{3.6} \]

where S_0 does not depend on r.

Proof. For r ≤ m the thesis follows immediately from (3.4), by taking D suffi-ciently large in (3.5)–(3.6). For r > m we proceed by induction: let us supposethat the estimate (3.6) holds for every r < m+ h, h ∈ N fixed, and let us estimate|w(m+h)(s)|. Remembering that w(m)(s) = Amsw(s), cf. (3.3), we have

|w(m+h)(s)| = |∂hs (Amsw(s))|

≤ Am

min,h∑µ=0

∣∣∣∣(hµ) . . . (− µ + 1)s−µw(h−µ)(s)

∣∣∣∣;by the inductive hypothesis and since . . . ( − µ + 1) ≤ !, for s > S0 we thenhave:

|w(m+h)(s)| ≤ Am

min,h∑µ=0

h! !µ!(h− µ)!

s−µDh−µ+1(h− µ)! s 1−m2m +(h−µ)

m eA s

m

+1

m

+1

≤ Dh+m+1(h + m)! s 1−m2m +(h+m)

m eA s

m

+1

m

+1 ,

by taking D ≥ l!Am.


Let us consider now the cut-off functions ϕ(x), ψ(ρ), g1(t) ∈ Gs′0 (R), 1 < s′ <

scr, cf. (2.5), in such a way that:

– ϕ ≡ 1 for |x| ( 1, ϕ ≡ 0 for |x| > 1, 0 ≤ ϕ(x) ≤ 1 for every x ∈ R;– suppψ = [1, 1 + µ0], ψ(ρ) > 0 for ρ ∈ (1, 1 + µ0),

∫ +∞−∞ ψ(ρ) dρ = 1;

– g1 ≡ 1 for |t− 1| ≤ ε1, g1 ≡ 0 for |t− 1| ≥ 2ε1, 0 ≤ g1(t) ≤ 1 for every t ∈ R;

Now (3.4) gives an asymptotic behavior of u(ξ, t), cf. (3.2), for ξ → +∞ andt ≥ δ0 > 0:

u(ξ, t) ∼ C0(tξn

m+ ) 1−m2m ef(ξ,t),

f(ξ, t) being given by

f(ξ, t) = at2k+1

2k + 1ξ + A

tm +1

m + 1

ξn/m, (3.7)

where a < 0 and A > 0. We can easily prove the following assertions, cf. Popi-vanov [22]:

(i) f(ξ, t) has a maximum for t = tξ, where

tξ = c0ξ−ε, c0 =

(−A

a

) m2km−

> 0, (3.8)

ε being given by (2.4), and moreover

f(ξ, tξ) = α0ξ1−(2k+1)ε, α0 > 0. (3.9)

(ii) There exists a positive constant ε0, sufficiently small, such that for |t− tξ| ≤ε0|tξ| we can write

f(ξ, t) = α0ξ1−(2k+1)ε − e0(t− tξ)2ξ1−(2k−1)ε (3.10)

where α0 is a positive constant and 0 < const1 ≤ e0 ≤ const2.

Let us fix now a cut-off function

gλρ(t, x) = ϕ(x)g1

(t

tλρ

), (3.11)

1 ≤ ρ ≤ 1 + µ0, λ being a (large) parameter, and define

uλ(t, x) =∫ +∞

−∞ψ(ρ)e−α0(λρ)1−(2k+1)ε

eixλρ+a t2k+12k+1 λρw(t(λρ)

nm+ )gλρ(t, x) dρ.

(3.12)

Remark 3.2. The function uλ(t, x) is compactly supported, and its support satisfiessuppuλ ⊂ (1− ε2)tλ ≤ t ≤ (1+ ε2)tλ×|x| ≤ 1 for a suitable constant ε2, with0 < ε2 < 1. Observe in particular that suppuλ does not contain any point witht = 0, for every λ > 0.


4. Fundamental estimates

In this section we shall estimate the left- and right-hand sides of Corli’s inequalityfor the function uλ(t, x). The compact K in (2.7) can contain the point (0, 0), butsince suppuλ does not contain (0, 0), we shall fix in the following

\[ K_\lambda=\big\{(1-\varepsilon_2)t_\lambda\le t\le(1+\varepsilon_2)t_\lambda\big\}\times\big\{|x|\le1\big\}, \tag{4.1} \]

cf. Remark 3.2.

4.1. Some preliminary results

We state here several technical lemmas, that we shall use in the following. To startwith, we recall the next known result.

Lemma 4.1. For every compact K and for all η > ε > 0 there exists a constant C such that

\[ \|fg\|_{s,K,\frac{1}{\eta-\varepsilon}}\le C\,\|f\|_{s,K,\frac{1}{\eta}}\,\|g\|_{s,K,\frac{1}{\eta}} \]

for all f, g ∈ G^s(Ω,K,\frac{1}{\eta}) (or f, g ∈ G^s_0(Ω,K,\frac{1}{\eta})).

For the proof of the previous lemma, see for example Corli [5, 6].

In the following we shall use the Faà di Bruno formula: if f, g : R → R, f, g ∈ C^n(R), then for every ν = 1, . . . , n we have

\[ \frac{d^\nu}{dx^\nu}\big(f(g(x))\big)=\nu!\sum_{k=1}^{\nu}f^{(k)}(g(x))\sum_{\substack{k_1+\cdots+k_\nu=k\\ k_1+2k_2+\cdots+\nu k_\nu=\nu}}\ \prod_{j=1}^{\nu}\frac{1}{k_j!}\Big(\frac{1}{j!}g^{(j)}(x)\Big)^{k_j}. \tag{4.2} \]
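The formula (4.2) can be checked symbolically; the following sketch (an added illustration, assuming the sympy library is available) compares the right-hand side of (4.2), with the inner sum enumerated by brute force, against the directly computed derivative for two arbitrary test functions f and g.

```python
import itertools
import sympy as sp

x = sp.symbols('x')
f = sp.sin(x)          # arbitrary outer test function
g = sp.exp(x) + x**2   # arbitrary inner test function
nu = 4                 # order of differentiation

# Right-hand side of (4.2): nu! * sum_k f^(k)(g(x)) * sum over (k_1,...,k_nu)
# with k_1+...+k_nu = k and k_1+2k_2+...+nu*k_nu = nu of
# prod_j (1/k_j!) * (g^(j)(x)/j!)^{k_j}.
rhs = sp.Integer(0)
for k in range(1, nu + 1):
    fk = sp.diff(f, x, k).subs(x, g)        # f^(k) evaluated at g(x)
    inner = sp.Integer(0)
    for ks in itertools.product(range(nu + 1), repeat=nu):
        if sum(ks) == k and sum((j + 1) * kj for j, kj in enumerate(ks)) == nu:
            term = sp.Integer(1)
            for j, kj in enumerate(ks, start=1):
                term *= sp.Rational(1, sp.factorial(kj)) * (sp.diff(g, x, j) / sp.factorial(j)) ** kj
            inner += term
    rhs += fk * inner
rhs = sp.factorial(nu) * rhs

lhs = sp.diff(f.subs(x, g), x, nu)          # direct nu-th derivative of f(g(x))
print(sp.simplify(lhs - rhs))               # expected output: 0
```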

Let us recall now the definition of Bell polynomials: if \{x_j\}_{j\in\mathbb{Z}_+} is a sequence of real numbers, µ ∈ Z_+ and h is a positive integer, we define

\[ B_{\mu,h}(x_j)=\mu!\sum_{\substack{\sum_{j=1}^{\infty}h_j=h\\ \sum_{j=1}^{\infty}jh_j=\mu}}\ \prod_{j=1}^{\infty}\frac{1}{h_j!}\Big(\frac{1}{j!}x_j\Big)^{h_j}, \tag{4.3} \]

where the h_j are non-negative integers. The following identity holds:

\[ \frac{1}{h!}\Big(\sum_{j=1}^{\infty}x_j\frac{z^j}{j!}\Big)^{h}=\sum_{\mu=h}^{\infty}B_{\mu,h}(x_j)\frac{z^\mu}{\mu!}, \tag{4.4} \]

cf. Mascarello-Rodino [18, Section 5.5]. We now want to prove the following proposition.

Proposition 4.2. Let w(s) be the function of Lemma 3.1, solution of (3.3). Let K ⊂ R² be a compact set satisfying the following condition: for every (t_0,x_0) ∈ K

\[ \lim_{\lambda\to+\infty}t_0(\lambda\rho)^{\frac{n}{m+\ell}}=+\infty \tag{4.5} \]

for every ρ ∈ [1, 1+µ_0]. Then for every s > 1, C > 0, there exist positive constants C_1 and d such that for all ρ ∈ [1, 1+µ_0] and h ∈ Z_+ we have

\[ \big\|e^{ix\lambda\rho+a\frac{t^{2k+1}}{2k+1}\lambda\rho}\,w^{(h)}\big(t(\lambda\rho)^{\frac{n}{m+\ell}}\big)\big\|_{s,K,C}\le C_1^{h+1}\,h!\;\lambda^{h\frac{n\ell}{m(m+\ell)}}\;e^{m(\lambda\rho)+s(d\lambda)^{1/s}+(s-1)(d\lambda^{n/m})^{\frac{1}{s-1}}} \tag{4.6} \]

for λ sufficiently large, where

\[ m(\lambda\rho)=\sup_{(t,x)\in K}\Big(a\frac{t^{2k+1}}{2k+1}\lambda\rho+A\frac{t^{\frac{\ell}{m}+1}}{\frac{\ell}{m}+1}(\lambda\rho)^{n/m}\Big). \]

Proof. For every α, β ∈ Z+ and for fixed h, by formula (4.2) and Lemma 3.1 wehave:

|∂αx ∂

βt

(eixλρ+a t2k+1

2k+1 λρw(h)(t(λρ)n

m+ ))|

=∣∣∣(iλρ)αeixλρ

β∑µ=0

µ

)∂µ

t

(ea t2k+1

2k+1 λρ)(λρ)(β−µ) n

m+w(h+β−µ)(t(λρ)n

m+ )∣∣∣

≤ (λρ)α

β∑µ=0

µ

) ∑0≤q≤µ

ea t2k+12k+1 λρµ!

∑q1+···+qµ=q

q1+2q2+···+µqµ=µ

µ∏j=1

1qj !

(∣∣∂jt

(a t2k+1

2k+1λρ)∣∣

j!

)qj

× (λρ)(β−µ) nm+Dh+β−µ+1(h + β − µ)!

(t(λρ)

nm+

) 1−m2m +(h+β−µ)

m

× eA t

m

+1

m

+1(λρ)n/m

,

for λ > λ0, λ0 being independent on α, β. Now the condition (4.5) implies thatthere exists λ1 > 0 such that t(λρ)

nm+ ≥ 1 for every (t, x) ∈ K, λ > λ1; then we

can find C > 0 satisfying(t(λρ)

nm+

) 1−m2m +(h+β−µ)

m ≤ C(h+β−µ) mλ

nm+ (h+β−µ)

m

for all (t, x) ∈ K, ρ ∈ [1, 1 + µ0] and λ > λ1. From this fact and the estimate(h + β − µ)! ≤ 2h+βh!(β − µ)! we have that for λ > λ0 = maxλ0, λ1

|∂αx ∂

βt

(eixλρ+a t2k+1

2k+1 λρw(h)(t(λρ)n

m+ ))|

≤ Ch+1+α+β0 λαh!λh n

m(m+)

× ea t2k+1

2k+1 λρ+A tm

+1

m

+1(λρ)n/m β∑

µ=0

λ(β−µ) nm (β − µ)!

∑0≤q≤µ

Bµ,q

(∣∣∂jt

(at2k+1

2k + 1λρ)∣∣),

where Bµ,q

(∣∣∂jt

(a t2k+1

2k+1λρ)∣∣) is a Bell Polynomial, cf. (4.3).


Now, using the identity (4.4) with z = µ and xj =∣∣∂j

t

(a t2k+1

2k+1λρ)∣∣ for every

j, we obtain:

Bµ,q

(∣∣∂jt

(at2k+1

2k + 1λρ)∣∣) ≤ Bµ,q

(∣∣∂jt

(at2k+1

2k + 1λρ)∣∣)µµ

µ!

≤ 1q!

( ∞∑j=1

∣∣∂jt

(at2k+1

2k + 1λρ)∣∣µj

j!

)q

≤(2k+1∑

j=1

[aλρ2k(2k − 1) . . . (2k − j + 2)t2k+1−j

]µj

j!

)q

≤ Cqλqeqµ ≤ Cβλµ.

We then get:

|∂αx ∂

βt

(eixλρ+a t2k+1

2k+1 λρw(h)(t(λρ)n

m+ ))| ≤ Ch+1

1 h!λh nm(m+)Cα+β

1 λαem(λρ)

×β∑

µ=0

λ(β−µ) nm (β − µ)!λµ

for every (t, x) ∈ K and λ > λ0. Since β!−s ≤ µ!−s(β−µ)!−s for every µ = 0, . . . , β,we finally obtain:

C−α−β(α!β!)−s sup(t,x)∈K

|∂αx ∂

βt

(eixλρ+a t2k+1

2k+1 λρw(h)(t(λρ)n

m+ ))|

≤ Ch+11 h!λh n

m(m+) em(λρ) (C1C−1λ)α

α!s

β∑µ=0

12β

(2C1C−1λ)µ

µ!s(2C1C

−1λn/m)β−µ

(β − µ)!s−1

≤ Ch+11 h!λh n

m(m+) em(λρ)+s(dλ)1/s+(s−1)(dλn/m)1

s−1

(4.7)

where d = 3C1C−1, since (C1C−1λ)α

α!s ≤((

(C1C−1λ)1/s)α

α!

)s

≤ es(dλ)1s , and similar

estimates hold for (2C1C−1λ)µ

µ!s and (2C1C−1λn/m)β−µ

(β−µ)!s−1 . Since (4.7) is valid for λ > λ0

and λ0 is independent of α, β ∈ Z+, taking the supα,β∈Z+

in the left-hand side of (4.7)

we have that (4.6) holds for every λ > λ0.

Remark 4.3. The compact (4.1) satisfies the condition (4.5): indeed it is sufficientto prove (4.5) for ρ = 1 and t0 = (1 − ε2)tλ = (1 − ε2)c0λ−ε. We then have toprove that

limλ→+∞

(1 − ε2)c0λ−ε+ nm+ = +∞,

that is true since nm+ − ε > 0, as we can deduce from (2.4), (2.1) and (2.2).


Lemma 4.4. Let us consider the cut-off function gλρ(t, x), cf. (3.11). There existpositive constants D and G0 such that∥∥∥(Dαg1)

( t

tλρ

)Dβϕ(x)

∥∥∥s,K, 1

η

≤ Dα+β+1(α!β!)s′eG0λ

εs−s′

, (4.8)

for every α, β ∈ Z+ and s > s′, s′ being the Gevrey order of the functions g1 andϕ. The constants D and G0 are independent of α, β, λ and ρ ∈ [1, 1 + µ0].

Proof. First of all we observe that Dδt

[Dαg1

(t

tλρ

)]= (tλρ)−δ(Dα+δg1)

(t

tλρ

). So

we have:∥∥∥(Dαg1)( t

tλρ

)Dβϕ(x)

∥∥∥s,K, 1

η

≤ supδ,γ∈Z+

(1η

)−δ−γ

(δ! γ!)−s sup(t,x)∈R2

∣∣∣Dδt

[(Dαg1

( t

tλρ

)]Dγ

x

(Dβϕ

)(x)

∣∣∣≤ sup

γ∈Z+

[(1η

)−γ

γ!−s supx∈R

∣∣∂β+γϕ(x)∣∣]

× supδ∈Z+

[(1η

)−δ

δ!−s supt∈R

∣∣∣(tλρ)−δ(∂α+δg1)( t

tλρ

)∣∣∣].Taking into account that ϕ and g1 are compactly supported Gevrey functions oforder s′ we have: sup

x∈R

|∂β+γϕ(x)| ≤ Cβ+γ+1(β + γ)!s′ ≤ Cβ+1

1 β!s′Cγ

1 γ!s′, for every

γ ∈ Z+, since (β + γ)! ≤ 2β+γβ! γ!; a similar estimate holds for g1. Then recallingthe expression of tλρ, cf. (3.8), we obtain:∥∥∥(Dαg1)

( t

tλρ

)Dβϕ(x)

∥∥∥s,K, 1

η

≤ Cβ+11 β!s

′sup

γ∈Z+

[( 1C1η

)−γ

γ!−(s−s′)]

× Cα+12 α!s

′supδ∈Z+

[( c0λερεC2η

)−δ

δ!−(s−s′)].

(4.9)

Now we observe that

supγ∈Z+

[( 1C1η

)−γ

γ!−(s−s′)]

= supγ∈Z+

[[(C1η)

1s−s′

]γγ!

]s−s′

≤ e(s−s′)(C1η)1

s−s′ ;

in the same way we deduce that

supγ∈Z+

[( c0λερεC2η

)−δ

δ!−(s−s′)]≤ e(s−s′)((1+µ0)εC2ηc−1

0 )1

s−s′ λε

s−s′.

Applying the last two estimates in (4.9) we obtain (4.8).

Lemma 4.5. Let R be a real number; then for every s > 1 and C > 0 there existpositive constants b and c such that:

‖tR‖s,Kλ,C ≤ bλ−εRecλε

s−1 (4.10)

Kλ being the compact (4.1); the constants b and c are independent of λ.


Proof. By the definition of Gevrey seminorms we have:

‖tR‖s,Kλ,C = supα∈Z+

[C−αα!−s sup

(1−ε2)tλ≤t≤(1+ε2)tλ

|R(R− 1) . . . (R − α + 1)tR−α|];

now we observe that |R(R − 1) . . . (R − α + 1)| ≤ |R|(|R|+ 1) . . . (|R|+ α− 1) =(|R|+α−1|R|−1

)α! ≤ 2|R|+α−1α! Moreover, |tR−α| ≤ ((1 + sign(R − α)ε2)tλ)R−α. Then

we have:

‖tR‖s,Kλ,C ≤ 2|R|−1((1 + ε2)c0)|R|λ−εR supα∈Z+

[(C(1 − ε2)c02

λ−ε)−α

α!−(s−1)

].

By the same technique used to estimate the right-hand side of (4.9) we obtain that

supα∈Z+

[(C(1− ε2)c02

λ−ε)−α

α!−(s−1)

]≤ e(s−1)(2(C(1−ε2)c0)

−1)1

s−1 λε

s−1.

We then have proved the estimate (4.10) with c = (s − 1)(2(C(1 − ε2)c0)−1

) 1s−1

and b = 2|R|−1((1 + ε2)c0)|R|.

4.2. Lower bound for ‖uλ‖L∞(Kλ)

Let us define K1 = t ∈ R : |t| ≤ 12, and observe that K1 ⊃ supp

(g1( t

tλρ))

forλ) 0. Since meas(K1) = 1, for λ) 0 we have:

‖uλ‖L∞(Kλ) ≥ ‖uλ(t, 0)‖L∞(K1) ≥ ‖uλ(t, 0)‖L1(K1),

where Kλ is the set (4.1); then, recalling the definition of uλ(t, x), cf. (3.12), wehave:

‖uλ‖L∞(Kλ) ≥∣∣∣ ∫∫ ψ(ρ)e−α0(λρ)1−(2k+1)ε

ea t2k+12k+1 λρw(t(λρ)

nm+ )g1

( t

tλρ

)dρ dt

∣∣∣.(4.11)

Now we can apply the same computations as in Popivanov [22] and conclude thatthere exists a constant E0 > 0 such that∫∫

ψ(ρ)e−α0(λρ)1−(2k+1)εea t2k+1

2k+1 λρw(t(λρ)n

m+ )g1

( t

tλρ

)dρ dt =

= E0λ−p+ 1−m

2m ( nm+−ε)(1 + o(1)) as λ→∞,

(4.12)

where p = 1−(2k+1)ε2 and ε = m−n

2km− . Formula (4.12) is obtained by (3.4), bymaking the change of variable y1 = λp(t− tλρ) in the integral over Rt and then byapplying the Lebesgue Dominated Convergence Theorem for λ → ∞. By (4.11)and (4.12) we can conclude that there exists a positive constant E satisfying

‖uλ‖L∞(Kλ) ≥ Eλ−p+ 1−m2m ( n

m+−ε), (4.13)

for λ) 0.


4.3. Upper bound for ‖uλ‖s,Kλ, 1η−ε

By Lemma 4.1 we have:

‖uλ‖s,Kλ, 1η−ε

≤∫ +∞

−∞ψ(ρ)e−α0(λρ)1−(2k+1)ε

× ‖eixλρ+a t2k+12k+1 λρw(t(λρ)

nm+ )‖s,Kλ, 1

η‖gλρ(t, x)‖s,Kλ, 1

ηdρ;

then applying Proposition 4.2 with h = 0 and Lemma 4.4 we obtain:

‖uλ‖s,Kλ, 1η−ε

≤ C

∫ +∞

−∞ψ(ρ)e−α0(λρ)1−(2k+1)ε

em(λρ)+s(dλ)1/s+(s−1)(dλn/m)1

s−1eG0λ

εs−s′

dρ.

We already know that m(λρ) = sup(t,x)∈Kλ

f(λρ, t) = α0(λρ)1−(2k+1)ε, cf. (3.9); so

we can conclude that

‖uλ‖s,Kλ, 1η,ε

≤ Ces(dλ)1/s+(s−1)(dλn/m)1

s−1 +G0λε

s−s′. (4.14)

4.4. Upper bound for ‖Puλ‖s,Kλ, 1η−ε

Observe at first that, since by construction

P (t,Dt, Dx)(eixλρ+a t2k+1

2k+1 λρw(t(λρ)n

m+ ))

= 0,

cf. Popivanov [22], we have:

P (t,Dt, Dx)uλ(t, x) =∑

α1+α2≤mβ1+β2≤m

(α2,β2) =(0,0)

Pαβ(t)∫ +∞

−∞ψ(ρ)e−α0(λρ)1−(2k+1)ε

×Dα1t Dβ1

x

(eixλρ+a t2k+1

2k+1 λρw(t(λρ)n

m+ ))Dα2

t Dβ2x

(gλρ(t, x)

)dρ,

where α = (α1, α2), β = (β1, β2) and Pαβ(t) is a polynomial in t with constant

coefficients of degree ≤ 2km, Pαβ(t) =2km∑r=0

crtr, cr = cr(α, β) ∈ C. Observe that

for α3 ∈ Z+ we can write ∂α3t ea t2k+1

2k+1 λρ =∑

0≤j≤2kα30≤q≤α3

cjqtj(λρ)q ea t2k+1

2k+1 λρ for suitable

cjq ∈ C; then, using Leibnitz rule and recalling that tλρ = c0(λρ)−ε, cf. (3.8), we


have:

P (t,Dt, Dx)uλ(t,x) =∑

α1+α2≤mβ1+β2≤m

(α2,β2) =(0,0)

∑α3+α4=α1

∑0≤j≤2kα30≤q≤α3

2km∑r=0

Ctr+jλβ1+q+εα2+ nm+ α4

×∫ +∞

−∞ψ(ρ)e−α0(λρ)1−(2k+1)ε

ρβ1+q+εα2+ nm+ α4eixλρ+a t2k+1

2k+1 λρ

× w(α4)(t(λρ)n

m+ )g(α2)1

( t

tλρ

)ϕ(β2)(x) dρ

=∑

α1+α2≤mβ1+β2≤m

(α2,β2) =(0,0)

Jαβ(t, x, λ),

(4.15)

where C = C(α1, α2, β1, β2, α3, j, q, r) =(α1α3

)c−α20 crcjq(−i)α1+α2+β2 . We can write

P (t,Dt, Dx)uλ(t, x) =∑

α1+α2≤mβ1+β2≤m

α2 =0

Jαβ(t, x, λ) +∑

α1+α2≤mβ1+β2≤m

α2=0, β2 =0

Jαβ(t, x, λ) := I1 + I2.

(4.16)Now let us analyze separately I1 and I2.

Regarding I1, since α2 = 0, in the t variable we can limit ourselves tosupp

(g′1(

ttλρ

)), and so we have:

‖I1‖s,Kλ, 1η−ε

= ‖I1‖s,K, 1η−ε

,

where K = (t, x) ∈ R2 : ttλ∈ [1− ε2, 1− ε1]∪ [1+ ε1, 1 + ε2], |x| ≤ 1 for suitable

constants 0 < ε1 < ε2 ( 1. Then we can write:

‖I1‖s,Kλ, 1η−ε

≤∑

α1+α2≤mβ1+β2≤m

α2 =0

∑α3+α4=α1

∑0≤j≤2kα30≤q≤α3

2km∑r=0

C‖tr+j‖s,K, 1ηλβ1+q+εα2+ n

m+ α4

×∫ 1+µ0

1

ψ(ρ)e−α0(λρ)1−(2k+1)ερβ1+q+εα2+ n

m+ α4

× ‖eixλρ+a t2k+12k+1 λρw(α4)(t(λρ)

nm+ )‖s,K, 1

η+ε

× ‖g(α2)1

( t

tλρ

)ϕ(β2)(x)‖s,K, 1

η+εdρ.

In order to apply Proposition 4.2 we observe that m(λρ) = sup(t,x)∈K

(a t2k+1

2k+1 λρ +

A tm

+1

m +1

(λρ)n/m)≤ (α0 − L0)(λρ)1−(2k+1)ε where L0 = e0ε

21c

20 > 0, as we can


deduce using the expression (3.10). So we can apply Lemma 4.5, Proposition 4.2and Lemma 4.4 and deduce that

‖I1‖s,Kλ, 1η−ε

≤ Cλ2m+εme−L0λ1−(2k+1)εes(dλ)1/s+(s−1)(dλn/m)

1s−1 +G0λ

εs−s′ +cλ

εs−1

,

(4.17)for λ) 0.

In order to estimate ‖I2‖s,Kλ, 1η−ε

we observe that for every M ∈ N we have:

eixλρ+a t2k+12k+1 λρ = λ−M

(ix+ a

t2k+1

2k + 1

)−M ∂M

∂ρM

(eixλρ+a t2k+1

2k+1 λρ);

then integration by parts in (4.15) gives us:

I2 =∑

α1+α2≤mβ1+β2≤m

β2 =0

∑α3+α4=α1

∑0≤j≤2kα30≤q≤α3

2km∑r=0

(−1)MCtr+jλ−Mλβ1+q+εα2+ nm+ α4

×(ix+ a

t2k+1

2k + 1

)−M∫ 1+µ0

1

eixλρ+a t2k+12k+1 λρ ∂M

∂ρM

[ψ(ρ)e−α0(λρ)1−(2k+1)ε

× ρβ1+q+εα2+ nm+ α4w(α4)(t(λρ)

nm+ )g1

( t

tλρ

)]ϕ(β2)(x) dρ.

We can compute ∂M

∂ρM

[. . .

]in the previous integral via Leibnitz and Faa di Bruno

formulas, cf. (4.2); moreover, since now β2 = 0 we can limit our attention tosupp

(ϕ′(x)

), and so

‖I2‖s,Kλ, 1η−ε

= ‖I2‖s,K′, 1η−ε

; (4.18)

where K ′ = (t, x) ∈ R2 : (1 − ε2)tλ ≤ t ≤ (1 + ε2)tλ, x ∈ [−1,−ε] ∪ [ε, 1], for afixed 0 < ε < 1. Thus we have:‖I2‖s,Kλ, 1

η−ε

≤∑

CM+1

(M

M1

)(K1

M2

)(K2

M3

)(K3

M4

)λ−MλRλM2(1−(2k+1)ε)+M4

nm+ +M5ε

×M5∑h=1

M4∑p=1

‖tS+p+h‖s,K′, 1η

∥∥∥(ix + at2k+1

2k + 1

)−M∥∥∥s,K′, 1

η+ε

FM2FpFhCM3

×∫ 1+µ0

1

|ψ(M1)(ρ)|e−α0(λρ)1−(2k+1)ε‖eixλρ+a t2k+1

2k+1 λρw(α4+p)(t(λρ)n

m+ )‖s,K′, 1η+2ε

×∥∥∥g(h)

1

( t

tλρ

)ϕ(β2)(x)

∥∥∥s,K′, 1

η+2ε

ρV ρM2(1−(2k+1)ε)+M4n

m+ +M5ε dρ,

(4.19)

where S,R, V ∈ R are independent of M and the first sum in the right-hand sideis over α1 +α2 ≤ m, β1+β2 ≤ m, β2 = 0, α3+α4 = α1, 0 ≤ j ≤ 2kα3, 0 ≤ q ≤ α3,r = 0, . . . , 2km, K1 + M1 = M , K2 + M2 = K1, K3 + M3 = K2, M4 + M5 = K3;


moreover the quantities FM2 , Fp, Fh (coming from the Faa di Bruno formula) andCM3 have the following expression:

– FM2 = M2!M2∑i=1

∑γ1+···+γM2=i

γ1+2γ2+···+M2γM2=M2

M2∏ν=1

1γν !

(Cν

ν!

)γν

,

where Cν = |α0(1− (2k + 1)ε) . . . (1 + (2k + 1)ε− ν + 1)|;

– Fp = M4!∑

δ1+···+δM4=pδ1+2δ2+···+M4δM4=M4

M4∏µ=1

1δµ!

(Cµ

µ!

)δµ

,

where Cµ =∣∣∣ n

m + . . .

( n

m + − µ+ 1

)∣∣∣;– Fh = M5!

∑σ1+···+σM5=h

σ1+2σ2+···+M5σM5=M5

M5∏τ=1

1στ !

(Cτ

τ !

)στ

,

where Cτ = |ε(ε− 1) . . . (ε− τ + 1)|;– CM3 =

∣∣(β1 + q + εα2 + nm+α4

). . .

(β1 + q + εα2 + n

m+α4 −M3 + 1)∣∣.

So we have now to estimate the various quantities appearing in the right-hand sideof (4.19). To start with we give the following lemma.

Lemma 4.6. Let K ′ be as before, cf. (4.18). Then∥∥∥(ix + at2k+1

2k + 1

)−M∥∥∥s,K′, 1

η+ε

≤ C(2ε

)M

.

Proof. By Faa di Bruno formula we have:∣∣∣∣∂αx ∂

βt

((ix+ a

t2k+1

2k + 1

)−M)∣∣∣∣ ≤ ∑

0<h≤β

M(M + 1) . . . (M + α + h− 1)

×∣∣∣∣(ix+ a

t2k+1

2k + 1

)−M−α−h∣∣∣∣β!

∑h1+···+hβ=h

h1+2h2+···+βhβ=β

β∏j=1

1hj !

( 1j!

)hj∣∣∣∂j

t

(at2k+1

2k + 1

)∣∣∣hj

.

Now since (1− ε2)tλ ≤ t ≤ (1 + ε2)tλ there exists a constant D > 0 such that forevery λ ≥ 1 and j = 1, . . . , β ∣∣∣∂j

t

(at2k+1

2k + 1

)∣∣∣hj

≤ Dβ ;

moreover the condition |x| ≥ ε, 0 < ε < 1, gives us∣∣∣∣(ix+ at2k+1

2k + 1

)−M−α−h∣∣∣∣ ≤ ε−M−α−β.


Using the identity (4.4) with z = 1 we can see that

β!∑

h1+···+hβ=hh1+2h2+···+βhβ=β

β∏j=1

1hj !

( 1j!

)hj

≤ β!∞∑

β=h

Bβ,h(1) 1β!

=β!h!

( ∞∑j=1

1j!

)h

≤ β!h!eβ;

finally

M(M + 1) . . . (M + α + h− 1) =(M + α + h− 1

M − 1

)(α + h)! ≤ 2M4α+βα!h!,

since (α+h)! ≤ 2α+hα!h! and(M+α+h−1

M−1

)≤ 2M+α+h−1 ≤ 2M+α+β . We then have∣∣∣∣∂α

x ∂βt

((ix+ a

t2k+1

2k + 1

)−M)∣∣∣∣ ≤ (2

ε

)M

Cα+βα!β!

for a constant C independent of α, β and M . Now remembering the definition ofGevrey seminorms, cf. (2.6), we obtain:∥∥∥(ix + a

t2k+1

2k + 1

)−M∥∥∥s,K′, 1

η+ε

≤(2ε

)M

supα,β∈Z+

((C(η + ε)

)α+β(α!β!)−(s−1))

= C(2ε

)M

,

since s > 1.

Taking into account that ψ ∈ Gs′0 (R) we have

|ψ(M1)(ρ)| ≤ CM1+1M1!s′. (4.20)

Regarding CM3 , if we denote by N0 the smallest integer satisfying N0 ≥ β1 + q +εα2 + n

m+α4 for every β1, q, α2, α4 (observe that N0 depends only on n, m, k and) we have:

CM3 ≤ N0 . . . (N0 + M3 − 1) =(N0 + M3 − 1

N0 − 1

)M3!

≤ 2N0+M3−1M3! ≤ CM3+13 M3!

(4.21)

We have now to analyze the quantities FM2 , Fp and Fh. As an example let usconsider Fh. Since Cτ ≤ τ ! we have:

Fh ≤M5!∑

σ1+···+σM5=hσ1+2σ2+···+M5σM5=M5

M5∏τ=1

1στ !

≤ BM5,h(τ !)

≤ 2M5M5!BM5,h(τ !) 1M5!

(12

)M5

≤ 2M5M5!1h!,

(4.22)


as we can deduce by (4.4) with z = 12 . The quantities FM2 and Fp can be treated

in a similar way: we estimate Cν and Cµ as in (4.21) and then we use the sameprocedure as in (4.22), obtaining:

FM2 ≤ CM2+11 M2! (4.23)

Fp ≤ CM4+12 M4!

1p!

(4.24)

Now we can estimate ‖I2‖s,Kλ, 1η−ε

by considering (4.19) and applying Propo-sition 4.2, Lemma 4.4, Lemma 4.5, Lemma 4.6, (4.20), (4.21), (4.22), (4.23) and(4.24); we then have:

‖I2‖s,Kλ, 1η−ε

≤ CM+1λ−MλM max1−(2k+1)ε, nm ,εM !s

′λR

× es(dλ)1/s+(s−1)(dλn/m)1

s−1ecλ

εs−1

eG0λε

s−s′,

for every M ∈ Z+ and λ ) 0, where R does not depend on M . By the Stir-ling formula we have: M ! ≤ C0M

Me−M√M ; we fix now λ = Mh, with h >

s′1−max1−(2k+1)ε, n

m ,ε (observe that if h is fixed then λ = Mh → +∞⇔M → +∞,h being positive); in this way we obtain

CM(√

MMM)s′

λ−MλM max1−(2k+1)ε, nm ,ε ≤ C0.

Moreover e−M in Stirling formula gives rise to the term e−Ms′= e−s′λ1/h

, so thefollowing estimate holds:

‖I2‖s,Kλ, 1η−ε

≤ CλRe−s′λ1/h

es(dλ)1/s+(s−1)(dλn/m)1

s−1ecλ

εs−1

eG0λε

s−s′, (4.25)

for λ) 0.From (4.16) we have that ‖Puλ‖s,Kλ, 1

η−ε≤ ‖I1‖s,Kλ, 1

η−ε+ ‖I2‖s,Kλ, 1

η−ε, and

so by (4.17) and (4.25) the following estimate holds:

‖Puλ‖s,Kλ, 1η−ε

≤ CλR0es(dλ)1/s+(s−1)(dλn/m)1

s−1 +G0λε

s−s′ +cλε

s−1

×e−L0λ1−(2k+1)ε

+ e−s′λ1/h,

(4.26)

for λ ) 0 where R0, L0 are positive constants, h > s′1−max1−(2k+1)ε, n

m,ε and

s′ > 1 is arbitrarily fixed.

5. Proof of Theorem 2.1

Let us fix s > scr and suppose that tP is Gs locally solvable at the origin. Then thecondition (2.7) holds in particular for u = uλ(t, x), for every λ. Applying (4.13),(4.14) and (4.26) the following estimate must be true for every λ) 0:

E2λ−2p+2 1−m2m ( n

m+−ε) ≤ CλR0e2s(dλ)1/s+2(s−1)(dλn/m)1

s−1 +2G0λε

s−s′ +2cλε

s−1

×e−L0λ1−(2k+1)ε

+ e−s′λ1/h,

(5.1)


where R0, L0 are positive constants, h > s′1−max1−(2k+1)ε, n

m ,ε and s′ > 1 isarbitrarily fixed. Now we want to show that, for a suitable choice of h and s′ thefollowing condition is satisfied:

min1− (2k + 1)ε,

1h

> max

1s,n

m

1s− 1

s− s′,

ε

s− 1

. (5.2)

We observe at first that 1h <

1−max1−(2k+1)ε, nm ,ε

s′ < 1−max1− (2k+1)ε, nm , ε;

then if we prove that

min1− (2k + 1)ε, 1−max

1− (2k + 1)ε,

n

m, ε

> max1s,n

m

1s− 1

s− s′,

ε

s− 1

(5.3)

we can choose h and s′ in such a way that (5.2) is satisfied, since we can alwaysfix s′ > 1 and h such that 1−max1− (2k + 1)ε, n

m , ε − h < ν for an arbitraryν > 0. Now, since s > scr, the following estimates hold:

n

m

1scr − 1

>n

m

1s− 1

,

ε

scr − 1>

ε

s− 1;

moreover, choosing 1 < s′ < s− scr + 1 we have:ε

scr − 1>

ε

s− s′.

It is then sufficient to prove that

min1− (2k + 1)ε, 1−max

1− (2k + 1)ε,

n

m, ε

≥ max 1scr

,n

m

1scr − 1

scr − 1

.

(5.4)

Now the condition (2.2) implies that nm > ε, so we have only to show that

min1− (2k + 1)ε, (2k + 1)ε, 1− n

m

≥ max

1scr

,n

m

1scr − 1

. (5.5)

The condition (2.3) implies that (2k + 1)ε ≥ 1 − (2k + 1)ε, and so, recalling theexpression of scr, cf. (2.5), (5.5) is equivalent to

min

1− (2k + 1)ε, 1− n

m

≥ max

1− (2k + 1)ε,

n

m

1− (2k + 1)ε(2k + 1)ε

; (5.6)

finally, since (2k + 1)ε ≥ nm , cf. (2.3), we have that 1 − (2k + 1)ε ≤ 1 − n

m and1− (2k + 1)ε ≥ n

m1−(2k+1)ε(2k+1)ε ; then (5.6) is satisfied, and so (5.2) holds.

Now the condition (5.2) assures us that, when λ→∞, for suitable choices ofh and s′ and for s > scr the leading terms of the two summands in the right-handside of (5.1) are respectively e−L0λ1−(2k+1)ε

and e−s′λ1/h

. Then (5.1) cannot be truefor λ→∞, and so tP is not Gs locally solvable at the origin for s > scr.


Acknowledgment

The author expresses his deep gratitude to Prof. Popivanov for the very instructiveand pleasant discussions on the subject of this paper during his visit in Sofia.

References

[1] B.L.J. Braaksma, Asymptotic analysis of a differential equation of Turrittin, SIAMJ. Math. Anal. 2 (1971), 1–16.

[2] D. Calvo and P. Popivanov, Solvability in Gevrey classes for second powers of theMizohata operator, C. R. Acad. Bulgare Sci. 57 (2004), n. 6, 11–18.

[3] F. Cardoso and F. Treves, A necessary condition of local solvability for pseudo differ-ential equations with double characteristics, Ann. Inst. Fourier, Grenoble 24 (1974),225–292.

[4] M. Cicognani and L. Zanghirati, On a class of unsolvable operators, Ann. ScuolaNorm. Sup. Pisa 20 (1993), 357–369.

[5] A. Corli, On local solvability in Gevrey classes of linear partial differential operatorswith multiple characteristics, Comm. Partial Differential Equations 14 (1989), 1–25.

[6] A. Corli, On local solvability of linear partial differential operators with multiplecharacteristics, J. Differential Equations 81 (1989), 275–293.

[7] G. De Donno and A. Oliaro, Local solvability and hypoellipticity for semilinearanisotropic partial differential equations, Trans. Amer. Math. Soc. 355, 8 (2003),3405–3432.

[8] M. Fedoryuk, Asymptotic Analysis, Springer, Berlin, 1993.

[9] Ch. Georgiev and P. Popivanov, A necessary condition for the local solvability of aclass of operators having double characteristics, Annuaire Univ. Sofia, Fac. Math.Mec, I 75 (1981), 57–71.

[10] R. Goldman, A necessary condition for the local solvability of a pseudodifferentialequation having multiple characteristics, J. Differential Equations 19 (1975), 176–200.

[11] T. Gramchev, Powers of Mizohata type operators in Gevrey classes, Boll. Un. Mat.Ital. B (7) 5 (1991), 135–156.

[12] T. Gramchev, Nonsolvability for analytic partial differential operators with multiplecharacteristics, J. Math. Kyoto Univ. 33 (1993), 989–1002.

[13] T. Gramchev and P. Popivanov, Partial differential equations: approximate solutionsin scales of functional spaces, Math. Res., 108, Wiley-VCH Verlag, Berlin, 2000.

[14] T. Gramchev and L. Rodino, Gevrey solvability for semilinear partial differentialequations with multiple characteristics, Boll. Un. Mat. Ital. B (8) 2 (1999), 65–120.

[15] L. Hormander, Linear Partial Differential Operators, Springer, Berlin, 1963.

[16] P. Marcolongo and A. Oliaro, Local Solvability for Semilinear Anisotropic PartialDifferential Equations, Ann. Mat. Pura Appl. (4) 179 (2001), 229–262.

[17] P. Marcolongo and L. Rodino, Nonsolvability for an operator with multiple complexcharacteristics, Progress in Analysis, Vol. I,II (Berlin, 2001), World Sci. Publishing,River Edge, NJ, 2003, 1057–1065.


[18] M. Mascarello and L. Rodino, Partial differential equations with multiple character-istics, Wiley-Akademie Verlag, Berlin, 1997.

[19] T. Okaji, The local solvability of partial differential operator with multiple character-istics in two independent variables, J. Math. Kyoto Univ. 20, 1 (1980), 125–140.

[20] T. Okaji, Gevrey-hypoelliptic operators which are not C∞-hypoelliptic, J. Math. Ky-oto Univ. 28, 2 (1988), 311–322.

[21] A. Oliaro, P. Popivanov, and L. Rodino, Local solvability for partial differential equa-tions with multiple characteristics, Proceedings of Abstract and Applied Analysisconference 2002 (Hanoi) (Kluwer, ed.), 2002, pp. 143–157.

[22] P. Popivanov, On a nonsolvable partial differential operator, Ann. Univ. Ferrara, Sez.VII, Mat. 49 (2003), 197–208.

[23] L. Rodino, Linear partial differential operators in Gevrey spaces, World Scientific,Singapore, 1993.

[24] S. Spagnolo, Local and semi-global solvability for systems of non-principal type,Comm. Partial Differential Equations 25 (2000), 1115–1141.

[25] H.L. Turrittin, Stokes multipliers for asymptotic solutions of a certain differentialequation, Trans. Amer. Math. Soc. 68 (1950), 304–329.

Alessandro Oliaro
Dip. Matematica
Università di Torino
Via Carlo Alberto, 10
I-10123 Torino, Italy
e-mail: [email protected]


Operator Theory: Advances and Applications, Vol. 160, 357–366
© 2005 Birkhäuser Verlag Basel/Switzerland

Optimal Prediction of Generalized Stationary Processes

Vadim Olshevsky and Lev Sakhnovich

To Israel Gohberg on the occasion of his 75th anniversary with appreciation and friendship

Abstract. Methods for solving optimal filtering and prediction problems for the classical stationary processes are well known since the late forties. Practice often gives rise to what is called generalized stationary processes [GV61], e.g., to white noise and to many other examples. Hence it is of interest to carry over optimal prediction and filtering methods to them. For arbitrary generalized stochastic processes this could be a challenging problem. It was shown recently [OS04] that the generalized matched filtering problem can be efficiently solved for a rather general class of S_J-generalized stationary processes introduced in [S96]. Here it is observed that the optimal prediction problem admits an efficient solution for a slightly narrower class of T_J-generalized stationary processes. Examples indicate that the latter class is wide enough to include white noise, positive frequencies white noise, as well as generalized processes occurring when the smoothing effect gives rise to a situation in which the distribution of probabilities may not exist at some time instances. One advantage of the suggested approach is that it connects solving the optimal prediction problem with inverting the corresponding integral operators S_J. The methods for the latter, e.g., those using the Gohberg-Semencul formula, can be found in the extensive literature, and we include an illustrative example where a computationally efficient solution is feasible.

Mathematics Subject Classification (2000). Primary 60G20, 93E11; Secondary 60G25.

Keywords. Generalized Stationary Processes, Prediction, Filtering, Integral Equations, Gohberg-Semencul Formula.


1. Introduction

1.1. Optimal prediction of classical stationary processes

A complex-valued stochastic process X(t) is called stationary in the wide sense (see, e.g., [D53]) if its expectation is a constant,

E[X(t)] = const, −∞ < t <∞

and the correlation function depends only on the difference (t− s), i.e.,

KX(t, s) = E[X(t)X(s)] = KX(t− s).

We assume that E[|X(t)|²] < ∞. Let us consider a system with the memory depth ω that maps the input stochastic process X(t) into the output stochastic process Y(t) in accordance with the following rule:

\[ Y(t)=\int_{t-\omega}^{t}X(s)\,g(t-s)\,ds,\qquad g\in L(0,\omega). \tag{1.1} \]

In the optimal prediction problem one needs to find a filter g(t) in

Figure 1. Classical Optimal Filter: X(t) → g(t) → Y(t).

so that the output process Y(t) is as close as possible to the true process X(t+τ), where τ > 0 is a given constant. The measure of closeness is understood in the sense of minimizing the quantity

\[ E\big[|X(t+\tau)-Y(t)|^2\big]. \]

Wiener's seminal monograph [W49] solves the above problem for the case ω = ∞ in (1.1). His results were extended to the case ω < ∞ in [ZR50].
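To make the optimization criterion concrete, the following sketch (our illustration, not the method of [W49] or [ZR50]) evaluates the mean-square prediction error of a candidate filter: for a real wide-sense stationary process with covariance K, expanding E[|X(t+τ) − Y(t)|²] with Y as in (1.1) gives K(0) − 2∫_0^ω g(u)K(τ+u)du + ∫_0^ω∫_0^ω g(u)g(v)K(u−v)du dv, which is approximated below by a Riemann sum; the exponential covariance and the particular g are arbitrary test choices.

```python
import numpy as np

def prediction_mse(K, g, w, tau, n=400):
    """Riemann-sum value of E|X(t+tau) - Y(t)|^2 for a candidate filter g on [0, w],
    given the covariance function K of a real stationary process.
    (Illustrative helper, not taken from the paper.)"""
    u = (np.arange(n) + 0.5) * w / n      # midpoints of [0, w]
    du = w / n
    gu = g(u)
    cross = np.sum(gu * K(tau + u)) * du
    quad = np.sum(gu[:, None] * gu[None, :] * K(u[:, None] - u[None, :])) * du * du
    return K(0.0) - 2.0 * cross + quad

# Example: exponential covariance and a crude (non-optimal) exponential filter.
K = lambda r: np.exp(-np.abs(r))
print(prediction_mse(K, lambda u: 0.3 * np.exp(-u), w=2.0, tau=0.5))
```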

1.2. Generalized stationary processes. Motivation

White noise X(t) (having equal intensity at all frequencies within a broad band) is not a stochastic process in the classical sense as defined above. In fact, white noise can be thought of as the derivative of a Brownian motion, which is a continuous stationary stochastic process W(t). It can be shown that W(t) is nowhere differentiable, a fact explaining the highly irregular motions that Robert Brown observed. This means that white noise dW(t)/dt does not exist in the ordinary sense. In fact, it is a generalized stochastic process whose definition is stated in [GV61].

Generally, any receiving device has a certain “inertia” and hence instead ofactually measuring the classical stochastic process X(t) it measures its averagedvalue

Φ(ϕ) =∫

ϕ(t)X(t)dt, (1.2)

Page 363: Recent Advances in Operator Theory and its Applications: The Israel Gohberg Anniversary Volume

Optimal Prediction of Generalized Stationary Processes 359

where ϕ(t) is a certain function characterizing the device. Small changes in ϕ yieldsmall changes in Φ(ϕ) (small changes in the receiving devices yield closer mea-surements), hence Φ is a continuous linear functional (see (1.2)), i.e., a generalizedstochastic process whose definition was given in [GV61].

Hence it is very natural and important to solve the optimal filtering andprediction problems in the case of generalized stochastic processes. This corre-spondence is a sequel to [OS04] where we solved the problem of constructing thematched filtering problem for generalized stationary processes. Here, formulas forsolving the optimal prediction problem are given.

1.3. The main result and the structure of the correspondence

In the next Section 2 we recall the definition [GV61] of generalized stationaryprocesses and describe the system action on it. Then in Section 3 we introduce aclass of TJ -generalized processes for which we will be solving the optimal predictionproblem in Section 4. Finally, in Section 5 we consider a new model of colored noise.It is shown how our general solution to the optimal prediction problem can be abasis to provide a computationally efficient solution in this important example.

2. Generalized stationary processes. Auxiliary results

2.1. The definition of [GV61]Let K denotes the set of all infinitely differentiable finite functions. Let a stochasticfunctional Φ (i.e., a functional assigning to any ϕ(t) ∈ K a stochastic value Φ(ϕ))be linear, i.e.,

Φ(αϕ + βψ) = αΦ(ϕ) + βΦ(ψ).

Let us further assume that all the stochastic values Φ(ϕ) have expectations given by

m(ϕ) = E[Φ(ϕ)] =∫ ∞

−∞xdF (x), where F (x) = P [Φ(ϕ) ≤ x].

Notice that m(ϕ) is a linear functional acting in the space K that depends contin-uously on ϕ. The bilinear functional

B(ϕ, ψ) = E[Φ(ϕ)Φ(ψ)]

is the correlation functional of a stochastic process. It is supposed that B(ϕ, ψ) iscontinuously dependent on either of the arguments.

The stochastic process Φ is called generalized stationary in the wide sense[GV61], [S97] if for any functions ϕ(t) and ψ(t) from K and for any number h theequalities

m[ϕ(t)] = m[ϕ(t + h)], (2.1)

B[ϕ(t), ψ(t)] = B[ϕ(t + h), ψ(t+ h)] (2.2)

hold true.

Page 364: Recent Advances in Operator Theory and its Applications: The Israel Gohberg Anniversary Volume

360 V. Olshevsky and L. Sakhnovich

2.2. System action on generalized stationary processes

If the processes X(t) and Y (t) were classical, then it is standard to assume thatthe system action shown in Figure 1 obeys

Y (t) =∫ T

0

X(t− τ)g(τ)dτ, (where T = w is the memory depth). (2.3)

Let us now consider a more general situation when the system shown in Figure 2.

Φ(t)g(t) Ψ(t)

Figure 2. System action on generalized stationary processes

receives the generalized stationary signal Φ that we assume to be zero-mean. Thenwe define the system action as follows:

Ψ(ϕ) = Φ[∫ T

0

g(τ)ϕ(t + τ)dτ ], (2.4)

The motivation for the latter definition is that if X(t) and Y (t) were the classicalstationary processes then the Formula (2.4) for the corresponding functionals

Φ(ϕ) =∫ ∞

−∞X(t)ϕ(t)dt, Ψ(ϕ) =

∫ ∞

−∞Y (t)ϕ(t)dt (2.5)

is equivalent to the classical relation (2.3), so that the former is a natural gener-alization of the latter.

3. SJ -generalized and TJ -generalized stationary processes

3.1. Definitions

Let us denote byKJ the set of functions inK such that ϕ(t) = 0 when t /∈ J = [a, b].The correlation functional BJ (ϕ, ψ) is called a segment of the correlation functionalB(ϕ, ψ) if

BJ (ϕ, ψ) = B(ϕ, ψ), ϕ, ψ ∈ KJ . (3.1)

Definition 3.1. Generalized stationary processes are called SJ -generalized processesif their segments satisfy

BJ (ϕ, ψ) = (SJϕ, ψ)L2 , (3.2)

where SJ is a bounded nonnegative operator acting in L2(a, b) having the form

SJϕ =d

dt

∫ b

a

ϕ(u)s(t− u)du. (3.3)

Here (·, ·)L2 is the inner product in the space L2(a, b).

Page 365: Recent Advances in Operator Theory and its Applications: The Israel Gohberg Anniversary Volume

Optimal Prediction of Generalized Stationary Processes 361

Formulas for matched filters for the above SJ -generalized processes have beenrecently derived in [OS04]. Here we consider a different problem of optimal predic-tion that is shown to have an efficient solution under the additional assumptioncaptured by the next definition.

Definition 3.2. An SJ -generalized process is referred to as a TJ -generalized processif in addition to (3.2) and (3.3) the kernel s(t) of SJ in (3.3) has a continuousderivative (for t = 0) that we denote by k(t), and moreover

s′(t) = k(t) (t = 0), k(0) = ∞. (3.4)

3.2. Examples

Before solving the optimal prediction problem we provide some illustrative exam-ples.

Example 3.3. White noise. It is well known that white noise W (which is the deriv-ative of a nowhere differentiable Brownian motion) is not a continuous stochasticprocess. In fact, it is a generalized stationary process whose correlation functionalis known [L68] to be

B′(ϕ, ψ) =∫ ∞

−∞

∫ ∞

−∞δ(t− s)ϕ(t)ψ(s)ds.

Thus, in this case we have B′(ϕ, ψ) = (ϕ, ψ)L2 and hence (3.2) implies that whitenoise Φ is a very special SJ -generalized stationary process with

SJ = I. (3.5)

It means that the corresponding kernel function s(t) has the form

s(t) =

12 t > 0

− 12 t < 0.

In accordance with (3.4) the white noise Φ is a TJ -generalized stationary process.

Example 3.4. Positive frequencies white noise (PF-white noise). Observe that theoperator

SJf = fD +j

π−∫ T

0

f(t)x− t

dt, f ∈ L2[0, T ], (3.6)

(where −∫ T

0is the Cauchy Principal Value integral, and D ≥ 1) defines an SJ -

generalized process. When D > 1 then SJ is invertible (see example 4.1), and ifD = 1 then SJ is noninvertible. Notice that if D = 1 then the kernel of SJ is theFourier transform of fPW (z) = 1 having equal intensity at all positive frequenciesand the zero intensity at the negative frequencies (hence the name PF-white noise).Observe that the process is, in fact, TJ-generalized. Indeed, it can be shown thatthe kernel function of SJ has the form

s(t) =

D2 + i

π ln t t > 0−D

2 + iπ ln |t| t < 0

Page 366: Recent Advances in Operator Theory and its Applications: The Israel Gohberg Anniversary Volume

362 V. Olshevsky and L. Sakhnovich

which implies

k(t) = s′(t) =1t

(t = 0), k(0) = ∞.

In accordance with (3.4) the PF-white noise is a TJ-generalized stationary process.

4. Solution to the optimal prediction problem

As is well known [L68], [M65], in the classical case the solution g(t) of the optimalprediction problem can be found by solving∫ w

0

g(u)kx(u− v)dv = kx(u + τ). (4.1)

In the generalized case the solution g to the optimal prediction problem can befound by solving a generalization of (4.1),

SJg = kx(u + τ) = s′(u + τ). (4.2)

If SJ is invertible theng = S−1

J s′(u + τ). (4.3)

Example 4.1. PF-white noise revisited. Here we return to the case considered inexample 3.4. It can be shown that if D > 1 then SJ in (3.6) is positive definiteand invertible with

S−1J f = f(x)D1 − β

j

π−∫ T

0

(t

T − y)jα(

x

T − x)−jα f(t)

x− tdt, (4.4)

where

D1 =D

D2 − 1, β =

1D2 − 1

, (4.5)

and the number α is obtained from

coshαπ = D sinhαπ. (4.6)

Clearly, (4.4) and (4.3) solve the optimal prediction problem in this case.

5. Some practical consequences. A connection to theGohberg-Semencul formula

The main focus of the sections 2–4 had mostly a theoretical nature. In this sectionwe indicate that the Formula (4.3) offers a novel technique allowing one to workout practical problems. Specifically:

• Filtering problems for classical stationary processes typically lead to non-invertible operators SJ , and to find the solution g(t) to the optimal predic-tion problem one needs to solve (4.1). In the case of generalized stationaryprocesses the operator SJ is often invertible and hence there is a better For-mula (4.3).

Page 367: Recent Advances in Operator Theory and its Applications: The Israel Gohberg Anniversary Volume

Optimal Prediction of Generalized Stationary Processes 363

• Secondly, the operator SJ in (3.2) and (3.3) can be seen as a new way ofmodeling colored noise. This model is useful since the existing integral equa-tions literature already describes many particular examples on inverting SJ ,either explicitly or numerically. Hence (4.3) solves the corresponding optimalprediction problem.

Before providing one such example let us rewrite the operator SJ of (3.3),

SJf =d

dx

∫ b

a

f(t)s(x− t)dt, f(x) ∈ L2(a, b)

in a more familiar form.

Proposition 5.1. Let

a = 0, b = T, s(0)− s(−0) = 1, s′(t) = k(t) (t = 0).

With these settings SJ can be rewritten as

SJf = f(x) +∫ T

0

f(t)k(x − t)dt. (5.1)

If k(t) is continuous for t = 0 then the corresponding process is TJ-generalized.

The integral equations literature (see, e.g., [GF74], [M77], [S96]) containsresults on the inversion of the operator SJ of the form (5.1). The following theoremis well known.

Theorem 5.2 (Gohberg-Semencul [GS72] (see also [GF74]).). Let the operator SJ

have the form (5.1) with k(x) ∈ L(−w,w). If there are two functions γ±(x) ∈L(0, w) such that

SJγ+(x) = k(x), SJγ−(x) = k(x− w) (5.2)

then S[0,w] is invertible in Lp(0, w) (p ≥ 1) and

S−1J f = f(x) +

∫ w

0

f(t)γ(x, t)dt, (5.3)

where γ(x, t) is given by (5.4).

γ(x, t) =

⎧⎪⎪⎪⎨⎪⎪⎪⎩−γ+(x − t)−

∫ w+t−x

t

[γ−(w − s)γ+(s + x− t)−γ+(w − s)γ−(s + x− t)

]ds, x > t,

−γ−(x − t)−∫ w

t

[γ−(w − s)γ+(s + x− t)−γ+(w − s)γ−(s + x− t)

]ds, x < t

(5.4)

The latter result leads to a number of interesting special cases when theoperator SJ can be explicitly inverted and hence the Formula (4.3) solves the op-timal prediction problem in these cases. For example, the processes correspondingto k(x) = |x|−h, with 0 < h < 1, or to k(x) = − log |x − t| are of interest. Weelaborate the details for another example next.

Page 368: Recent Advances in Operator Theory and its Applications: The Israel Gohberg Anniversary Volume

364 V. Olshevsky and L. Sakhnovich

Example 5.3. Colored noise approximated by rational functions. The exponentialkernel. Let us consider colored noise approximated by a combination of rationalfunctions with the fixed poles ±iαm,

f(t) =N∑

m=1

γm1

t2 + α2m

, αm > 0, γm > 0.

Let us use its Fourier transform

k(x) =N∑

m=1

βme−αm|x|, βj =π

αmγm (5.5)

to define the operator SJ via (5.1).

Solution to the filtering problem. The situation is exactly the one captured bytheorem 5.2 where the operator (5.1) has the special kernel (5.5). A procedure tosolve (5.2), and hence to find the inverse of SJ is obtained next.

Theorem 5.4 (Computational Procedure). Let SJ be given by (5.1) and its kernelk(x) have the special form (5.5). Then S−1

J is given by the Formulas (5.3), (5.4),(5.2), where

γ+(x) = −γ(x, 0), γ−(x) = −γ(w − x, 0). (5.6)Here

γ(x, 0) = G(x)[F1

F2

]−1

B, (5.7)

where the 1× 2N row G(x), the N × 2N matrices F1, F2, and the 2N × 1 columnB are defined by

G(x) =[eν1x eν2x · · · eν2N x

], F1 =

[1

αi+νk

]1≤i≤N,1≤k≤2N

,

F2 =[

−eνkw

αi−νk

]1≤i≤N,1≤k≤2N

, B =[

1 · · · 1︸ ︷︷ ︸N

0 · · · 0]︸ ︷︷ ︸

N

.

The numbers νk are the roots (we assume them to be pairwise different) of thepolynomial

Q(z) = P (z)− 2N∑

m=1

δm

m∑s=1

z2(m−s)N∑

k=1

α2s−1k βk, (5.8)

where the numbers δk are the coefficients of the polynomial

P (z) =N∏

m=1

(z2 − α2m) =

N∑m=1

δmz2m. (5.9)

Hence, in this important case, the Formula (4.3) allows us to find the explicitsolution g(t) to the optimal prediction problem by plugging in it the Formulas(5.3), (5.4) together with (5.6) (5.7). Notice that there are fast and superfastalgorithms to solve the Cauchy-like linear system in (5.7), see, e.g., [O03] and thereferences therein.

Page 369: Recent Advances in Operator Theory and its Applications: The Israel Gohberg Anniversary Volume

Optimal Prediction of Generalized Stationary Processes 365

Remark 5.5. Note that the problem in example 5.3 can also be solved by using theKalman filtering method, see, e.g., [K61, KSH00]. Hence, in such simplest caseswhen we have a classical stationary process corrupted by white noise our papersuggests an alternative way to derive the solution via solving integral equations.

However, the Kalman filtering method has not yet been carried over to gen-eralized processes, and in the case of TJ -generalized stationary processes our tech-nique is currently the only one available.

Problem 5.6. We would like to conclude this paper with the interesting open prob-lem of extending the Kalman filtering method to generalized stationary processes.

Acknowledgment

This work was supported in part by the NSF contracts 0242518 and 0098222.We would also like to thank the editor Cornelis Van der Mee and the anonymousreferee for the very careful reading of the manuscript, and for a number of helpfulsuggestions.

References

[D53] J.L. Doob, Stochastic processes, Wiley, 1953.

[GF74] I.C. Gohberg and I.A. Feldman, Convolution equations and projection meth-ods for their solution, Transl. Math. Monographs, v. 41, AMS Publications,Providence, Rhode Island, 1974.

[GS72] I. Gohberg and A. Semencul, On the inversion of finite Toeplitz matrices andtheir continual analogues, Math. Issled. 7:2, 201–223. (In Russian.)

[GV61] I.M. Gelfand and N.Ya. Vilenkin, Generalized Functions, No. 4. Some Applica-tions of Harmonic Analysis. Equipped Hilbert Spaces, Gosud. Izdat. Fiz.-Mat.Lit., Moscow, 1961 (Russian);translated as: Generalized Functions. Vol. 4: Applications of Harmonic Analysis,Academic Press, 1964.

[K61] T. Kailath, Lectures on Wiener and Kalman filtering, Springer Verlag, 1961.

[KSH00] T. Kailath, A.S. Sayed, and B. Hassibi, Linear Estimation, Prentice Hall, 2000.

[L68] B.R. Levin, Theoretical Foundations of Statistical Radio Engineering, Moskow,1968.

[M65] D. Middleton, Topics in Communication Theory, McGraw-Hill, 1965.

[M77] N.I. Muskhelishvili, Singular Integral Equations, Aspen Publishers Inc, 1977.

[O03] V. Olshevsky, Pivoting for structured matrices and rational tangential interpo-lation, in Fast Algorithms for Structured Matrices: Theory and Applications,CONM/323, p. 1–75, AMS publications, May 2003.

[OS04] V. Olshevsky and L. Sakhnovich, Matched filtering for generalized StationaryProcesses, 2004, submitted.

[S96] L.A. Sakhnovich, Integral Equations with Difference Kernels on Finite Intervals,Operator Theory Series, v. 84, Birkhauser Verlag, 1996.

[S97] L.A. Sakhnovich, Interpolation Theory and its Applications, Kluwer AcademicPublications, 1997.

Page 370: Recent Advances in Operator Theory and its Applications: The Israel Gohberg Anniversary Volume

366 V. Olshevsky and L. Sakhnovich

[W49] N. Wiener, Extrapolation, Interpolation and Smoothing of Stationary TimeSeries, Wiley, 1949.

[ZR50] L.A. Zadeh and J.R. Ragazzini, An extension of Wiener’s Theory of Prediction,J.Appl.Physics, 1950, 21:7, 645–655.

Vadim OlshevskyDepartment of MathematicsUniversity of ConnecticutStorrs, CT 06269, USAe-mail: [email protected]

Lev SakhnovichDepartment of MathematicsUniversity of ConnecticutStorrs, CT 06269, USAe-mail: [email protected]

Page 371: Recent Advances in Operator Theory and its Applications: The Israel Gohberg Anniversary Volume

Operator Theory:Advances and Applications, Vol. 160, 367–382c© 2005 Birkhauser Verlag Basel/Switzerland

Symmetries of 2D Discrete-TimeLinear Systems

Paula Rocha, Paolo Vettori and Jan C. Willems

Abstract. Static symmetries of linear shift invariant 1D systems had beenthoroughly investigated, and the resulting theory is now a well-establishedtopic. Nevertheless, only partial results were available for the multidimen-sional case, since the extension of the theory for 1D systems proved not tobe straightforward, as usual. Actually, a non trivial regularity assumptionand also restrictions on the set of allowed symmetries had to be imposed. Inthis paper we show how it is possible to overcome these difficulties using aparticular canonical form for 2D discrete linear systems.

Mathematics Subject Classification (2000). Primary 93C55; Secondary 47B39.

Keywords. Discrete-time linear multidimensional systems, symmetry, behav-ioral approach.

1. Introduction

After Noether’s Theorem, the importance of the analysis of symmetries in thestudy of dynamical systems became more than evident. Indeed, the study of sym-metries is a very important tool to comprehend and clarify many intrinsic proper-ties of physical systems. In fact, the knowledge of the symmetries of a dynamicalsystem often leads to a simplification of its mathematical description.

A basic but illuminating example is a dynamical system given by a collectionof equal particles. As is well known, the behavior of this kind of system can be

The research of the first two authors is partially supported by the Unidade de InvestigacaoMatematica e Aplicacoes (UIMA), University of Aveiro, Portugal, through the Programa Ope-racional “Ciencia, Tecnologia, Inovacao” (POCTI) of the Fundacao para a Ciencia e Tecnolo-

gia (FCT), co-financed by the European Community fund FEDER.The research of the third author is supported by the Belgian Federal Government under theDWTC program Interuniversity Attraction Poles, Phase V, 2002–2006, Dynamical Systems andControl: Computation, Identification and Modelling, by the KUL Concerted Research Action(GOA) MEFISTO–666, and by several grants en projects from IWT-Flanders and the FlemishFund for Scientific Research.

Page 372: Recent Advances in Operator Theory and its Applications: The Israel Gohberg Anniversary Volume

368 P. Rocha, P. Vettori and J.C. Willems

described by the dynamics of their center of mass. The same simplified represen-tation can be achieved without any consideration of its physical nature, just byanalyzing the symmetry law which is here given by the invariance with respect tothe exchange of particles.

The first and essential aim of our research is precisely to characterize throughcanonical forms the special structures of system representations that are inducedby the underlying symmetries.

As regards linear systems, in the framework of the behavioral approach, thefirst results were presented by Fagnani and Willems [2, 3, 4] for regular behaviors.However, while in the 1D case every behavior is regular, this condition is not nec-essarily fulfilled by generic multidimensional systems. Moreover, only a restrictedclass of symmetries was allowed in the study of real-valued trajectories.

In Section 2 we introduce the class of dynamical discrete systems we deal within this paper: the behaviors, sets of trajectories which can be described by a finitenumber of linear (ordinary or partial) difference equations. To introduce the notionof symmetric behavior, we recall the necessary concepts regarding symmetries andtheir representations in Section 3.

In Section 4, we provide a canonical way to write the equations which definea behavior. This canonical form is then used in Section 5 to show how the resultsabout symmetric 1D behaviors can be extended to the 2D case without the needof any further assumption.

We conclude by showing how each 2D behavior that exhibits some symmetry,can be described by a set of equations that reflect the intrinsic structure of itssymmetry.

2. Behavioral systems

We briefly recall the notion of dynamical system in the behavioral approach [12, 13]and some results that we will need later on. According to this approach a dynamicalsystem is defined as a triple

Σ = (T,W,B),where T denotes the domain, W the signal space, and B, which is a subset ofWT = w : T → W, represents the set of trajectories which are allowed tooccur by the definition of the system. This is called the behavior of the system.We will consider only discrete, linear, complete and shift-invariant systems. Thisamounts to say that: the domain is T = Zn, n = 1, 2, i.e., for 1D and 2D systems,respectively; W and B are vector spaces over K (the real or the complex field); thedimension of W is finite and B is closed in the topology of pointwise convergence;for any trajectory w ∈ B and τ ∈ T we have στw ∈ B where στ is the shiftoperator such that (στw)(t) = w(t + τ).

Remark 2.1. The multi-index notation is here used to handle 1D and 2D systemsin a unified way. So, if τ = (τ1, τ2) ∈ Z2, then στ = στ1

1 στ22 where σi are the (com-

muting) partial shift operators on the i-th component. Analogous is the notation

Page 373: Recent Advances in Operator Theory and its Applications: The Israel Gohberg Anniversary Volume

Symmetries of 2D Discrete-Time Linear Systems 369

for monomials: sτ = sτ11 sτ2

2 . In the following sections we shall be more explicit todistinguish between 1D and 2D systems. In particular, note that for 2D systems,

στw(t) = στ11 στ2

2 w(t1, t2) = w(t1 + τ1, t2 + τ2).

This way of defining the dynamics leads to a representation free theory, i.e.,to a theory which does not require a specific model for the system equations, asfor example the input/state/output model of classical systems theory. Indeed, itis possible to characterize the trajectories of a dynamical system in many ways.Those studied most are the kernel and image representations, which define thebehavior as the kernel and, respectively, as the image of a suitable operator.

To be more precise, we introduce operators on trajectories w ∈ WT of theform

∑i∈I Riw(t + i), where I ⊆ T is a finite subset of the domain and Ri are

constant matrices with suitable dimensions. We may write∑i∈I

Riw(t + i) =∑i∈I

Riσiw(t) = R(σ, σ−1)w(t), (2.1)

where R(s, s−1) is a univariate or bivariate Laurent polynomial matrix.Using this notation, a behavior B ⊆ (Kq)T is defined by a kernel representa-

tion if there exists R(s, s−1) ∈ K[s, s−1]p×q, for some p ∈ N, such that

B = kerR(σ, σ−1) = w ∈ (Kq)T : R(σ, σ−1)w = 0,i.e., if B is the set of solutions of a matrix difference equation represented bythe difference operator R(σ, σ−1). Note that, by shift-invariance of the system,we may always suppose that the summation in (2.1) is made over indices withnon-negative components and therefore we assume without loss of generality thatkernel representations are polynomial matrices R(s) ∈ K[s]p×q.

This representation is very general as the following theorem [8] states.

Theorem 2.2. Every nD behavior admits a kernel representation.

Moreover, it is possible to establish a deep relation between a behavior andany matrix providing its kernel representation that leads, for instance, to thefollowing fundamental result about inclusion of behaviors [8, Thm. 2.61].

Theorem 2.3. The condition kerR1(σ) ⊆ kerR2(σ) holds if and only if there existsa Laurent polynomial matrix X(s, s−1) such that X(s, s−1)R1(s) = R2(s).

In this theorem, X(s, s−1) has to be a Laurent polynomial matrix. Consider,for instance, the case R1(s) = s and R2(s) = 1.

If X(s, s−1) in Theorem 2.3 is unimodular, i.e., it has a polynomial inverse,say Y (s, s−1), then Y (s, s−1)R2(s) = R1(s) and so kerR1(σ) = kerR2(σ). We saythat in this case R1 and R2 are equivalent representations. Additional hypothesesare needed to prove the converse statement.

Corollary 2.4. If kerR1(σ) = kerR2(σ) and both R1(s) and R2(s) have full rowrank, then X(s, s−1)R1(s) = R2(s) with X(s, s−1) Laurent unimodular matrix.

Page 374: Recent Advances in Operator Theory and its Applications: The Israel Gohberg Anniversary Volume

370 P. Rocha, P. Vettori and J.C. Willems

Note that while every 1D behavior that can be defined by a kernel representa-tion admits a full row rank kernel representation too, called minimal, this does nothold for 2D behaviors. Behaviors that have a full row rank kernel representationare called regular.

3. Symmetries and their representations

To define in a proper way the class of symmetries we will be dealing with, we firsthave to introduce some notions of representation theory. For more details we referthe reader to [10].

Given a finite dimensional vector space W over the field K, we will denoteby GL(W) the group of K-isomorphisms of W .

Definition 3.1. A representation of the group G on the vector space W is a grouphomomorphism

ρ : G→ GL(W), g → ρg.

The degree of a representation is defined by deg ρ = dimW .

Remark 3.2. We suppose in this paper that G is equipped with a topology thatmakes it into a Hausdorff compact topological group. Even if not mentioned, everyrepresentation will be assumed to be continuous. Thus, if G is finite, the discretetopology is employed, which trivially ensures continuity.

Note that, according to the definition, ρg is an isomorphism of W onto itselffor every g. With a little abuse of notation we may implicitly assume that somebasis of W has already been fixed and therefore we will always identify ρg with itsmatrix representation.

Definition 3.3. Given a representation ρ on W , a subspace U ⊆ W is ρ-symmetricif ρU ⊆ U , i.e., ρgU ⊆ U for any g ∈ G.

Note that when U is a ρ-symmetric subspace of W , the restrictions of ρg toU are isomorphisms of U and thus ρ|U is itself a representation which is calledsubrepresentation of ρ. It can be proved that in the case of finite-degree represen-tations there exists another ρ-symmetric subspace V such that W = U ⊕ V . Wewrite also ρ = ρ|U ⊕ρ|V . A representation which does not admit proper symmetricsubspaces, that is to say subrepresentations, is called irreducible. The decompo-sition of W into minimal symmetric subspaces gives then rise to a decompositionof ρ into irreducible subrepresentations. This decomposition becomes unique onlyif we identify different irreducible representations which are isomorphic — η1 isisomorphic to η2, or also η1 ∼= η2, if there exists an isomorphism π : W →W suchthat πη1

g = η2gπ for every g.

Eventually, the standard way to write such a decomposition of ρ into subrep-resentations is

ρ = m1η1 ⊕m2η

2 ⊕ · · · ⊕mrηr, (3.1)

Page 375: Recent Advances in Operator Theory and its Applications: The Israel Gohberg Anniversary Volume

Symmetries of 2D Discrete-Time Linear Systems 371

where the notation miηi stands for the direct sum of mi copies of subrepresenta-

tions isomorphic to ηi.

Example. Consider the symmetric group S3 = e, (12), (13), (23), (123), (132),i.e., the group of all permutations of three elements. The most natural represen-tation of S3 is given by the 3× 3 matrix group generated by

A = ρ(12) =

⎡⎣0 1 01 0 00 0 1

⎤⎦ and B = ρ(123) =

⎡⎣0 1 00 0 11 0 0

⎤⎦ .Note that ρe = I = A2 = B3, ρ(13) = AB, ρ(23) = BA, and ρ(132) = B2.

As it can be easily verified, the subspace U ⊆ R3 generated by[1 1 1

] isρ-symmetric and ρ|U = 1 is the corresponding (trivial) subrepresentation. On theother hand, for any direct summand V of U , ρ = ρ|U ⊕ ρ|V is a decomposition ofρ into irreducible subrepresentations.

To write the representation η = ρ|V in matrix form, it is necessary to fix abasis of V . If, e.g. , V =

⟨[10−1

],[

01−1

]⟩, it follows that η is generated by

η(12) =[0 11 0

]and η(123) =

[0 1−1 −1

]. (3.2)

The theory that was just exposed can be used to define and analyze symme-tries of dynamical systems. A quite general approach would be to chooseW = WT ,the set of trajectories. Nevertheless, in this paper we deal with a simpler class ofsymmetries, called static symmetries, since they act on the coordinates of thetrajectories, i.e., W = W , the signal space of the behavior. Therefore, the repre-sentations are homomorphism of the type

ρ : G→ GL(W ).

In this case, if W = Kq, a representation of G is a family of invertible matricesρg ∈ Kq×q that act on trajectories as one would expect. Actually, for any w ∈WT ,ρgw ∈ WT is such that

ρgw : T →W, t → ρg(w(t))∀g ∈ G.

Definition 3.4. Given a representation ρ onW , a behavior B⊆WT is ρ-symmetric if

ρB ⊆ B.Note that, since ρg are isomorphisms, with ρ−1

g = ρg−1 , a behavior is ρ-symmetric if and only if ρB = BRemark 3.5. The class of symmetries we are dealing with is not the most general.There are many simple symmetries that are group actions [5] but not represen-tations. One example is given by time-reversal symmetries (if w(t) ∈ B then alsow(−t) ∈ B), studied for 1D systems in [2, 11]. Also shift-invariance is a symmetry.Indeed, the behaviors we are considering are symmetric with respect to the actionof the group Z given by the shift operator στ .

Page 376: Recent Advances in Operator Theory and its Applications: The Israel Gohberg Anniversary Volume

372 P. Rocha, P. Vettori and J.C. Willems

4. A canonical form for 2D behaviors

Our aim is to provide a canonical structure for kernel representations of 2D behav-iors, that will be useful in the analysis of their symmetries. This canonical formwas first introduced as forward computational representation in [9] for the analysisof 2D linear discrete systems.

In this paper we illustrate some of the already investigated properties alongwith new ones that are necessary to our purposes.

The idea is to obtain a representation of a behavior B as a product of twofactors which depend only on one indeterminate. As we will see, this permits toderive properties of B using theorems about 1D behaviors (namely, Corollary 2.4).

The factors of canonical representations mentioned above are constructed re-cursively by computing minimal representations of properly defined 1D behaviors.However, just to clarify the whole process and to give some first important defini-tions, we begin by assuming that a kernel representation of B is already given.

Let B = kerR(σ1, σ2), where R(s1, s2) ∈ K[s1, s2]p×q is a polynomial matrixin two indeterminates, and let N−1 be its degree in s2. We may write

R(s1, s2) =N−1∑j=0

Rj(s1)sj2 =

[R0(s1) R1(s1) · · · RN−1(s1)

]⎡⎢⎢⎢⎣

IIs2...

IsN−12

⎤⎥⎥⎥⎦ , (4.1)

where I are q × q identity matrices. If we call RN (s) =[R0(s) · · · RN−1(s)

]and ΦN (s) =

[I · · · IsN−1

], then (4.1) is a factorization of R(s1, s2) intopolynomial factors in one indeterminate

R(s1, s2) = RN (s1)ΦN (s2), (4.2)

where RN (s) ∈ K[s]p×Nq is uniquely determined by R(s1, s2).We will show how to build a canonical representation CN (s1)ΦN (s2) of B

where the matrix CN (s), that is equivalent to RN(s), has a special nested structureand therefore can be defined recursively. In order to do this, it is necessary to shedsome light on the relation between B and the 1D behavior kerRN(σ).

First of all note that w ∈ B if and only if

R(σ1, σ2)w(t1, t2) = RN(σ1)ΦN (σ2)w(t1, t2) = RN (σ1)

⎡⎢⎢⎢⎣w(t1, t2)

w(t1, t2 + 1)...

w(t1, t2 + N − 1)

⎤⎥⎥⎥⎦ = 0.

So, if we partition the trajectories of kerRN (σ) into N blocks, wj : Z → Kq,j = 0, . . . , N − 1, then by the shift invariance of B we may write that

w ∈ B ⇒ RN(σ)

[w0

...wN−1

]= 0 where wj(t) = w(t, j)∀j = 0, . . . , N−1. (4.3)

Page 377: Recent Advances in Operator Theory and its Applications: The Israel Gohberg Anniversary Volume

Symmetries of 2D Discrete-Time Linear Systems 373

This fact tells us that, roughly speaking, trajectories in the restriction of Bto the domain Z×0, 1, . . . , N−1 (i.e., made up of N consecutive horizontal linesof B), belong to kerRN (σ). This shows the opportunity of the following definition.

Definition 4.1. Given a 2D behavior B, we define

Bi =

[w0

...wi−1

]: ∃w ∈ B such that wj(t) = w(t, j)∀j = 0, . . . , i−1

.

Theorem 4.2. The sets Bi are 1D behaviors [7].

Remark 4.3. Note that in general BN ⊆ kerRN (σ) is not an equality. Indeed, letR(s1, s2) = s2. So, B = 0, hence B2 = 0 too, but kerR2(s) = ker

[0 1

]= 0.

In Section 4.1 we show how it is possible to take advantage of kernel repre-sentations of Bi in the construction of a particular kernel representation of Bi+1.These canonical representations of the behaviors Bi will be the main tool fordefining canonical representations of B in Section 4.2.

4.1. Iterative construction of representations of Bi

In this section we show how to compute minimal representations of Bi in an efficientway. More precisely, the structure of a canonical form relative to Bi will be definedby induction. Actually, in the following proposition, we begin by showing how toconstruct a kernel representations of Bi+1 that is based on a representation of Bi.

Proposition 4.4. Bi = kerCi(σ) if and only if there exist a full row rank matrixCi+1(s) and a matrix Ti+1(s) such that Bi+1 = kerCi+1(σ) with

Ci+1(s) =[

Ci(s) 0−Ti+1(s) Ci+1(s)

]. (4.4)

Proof. Assume first that Bi = kerCi(σ). Given any kernel representation Ci+1(s)of Bi+1, it is always possible to put its last q columns in Hermite form just usingelementary operations on the rows. In other words, there exists a unimodularmatrix U(s) such that

U(s)Ci+1(s) =[

Ci(s) 0−Ti+1(s) Ci+1(s)

], (4.5)

where Ci+1(s) has q columns and full row rank.Moreover, it follows from Definition 4.1 that the trajectories of Bi are restric-

tions (to the first iq components) of some trajectory of Bi+1. As a consequence,Bi+1 ⊆ Bi × (Kq)Z and thus Bi+1 ⊆ ker

[Ci(σ) 0

]. So, since (4.5) is still a kernel

representation of Bi+1, we can write

Bi+1 = ker

⎡⎣ Ci(σ) 0Ci(σ) 0

−Ti+1(σ) Ci+1(σ)

⎤⎦ .

Page 378: Recent Advances in Operator Theory and its Applications: The Israel Gohberg Anniversary Volume

374 P. Rocha, P. Vettori and J.C. Willems

However, since Ci(σ)w = 0 for every w ∈ Bi, the operator Ci(σ) does notimpose any further restriction on the trajectories of Bi+1. Hence it can be deletedfrom the kernel representation to obtain the equivalent representation (4.4).

Let us suppose now that (4.4) holds, with Ci+1(s) having full row rank.Clearly Bi ⊆ kerCi(σ). Indeed,

w ∈ Bi ⇔ ∃w∗ :[ww∗

]∈ Bi+1 ⇒ Ci(σ)w = 0.

On the other hand, let w ∈ kerCi(σ). Since Ci+1(s) has full row rank,Ci+1(σ)is a surjective operator and there exists a w∗ such that Ti+1(σ)w = Ci+1(σ)w∗.Hence [ w

w∗ ] ∈ Bi+1 and so w ∈ Bi. Thus Bi = kerCi(σ).

In Proposition 4.4 nothing is said about the minimality of Ci(s). However,the statement assures that, once Ci(s) is minimal, Ci+1(s) is minimal too. So,the recursive scheme assures that Ci(s) is minimal for any i if C1(s) = C1(s) is aminimal representation of B1, the restriction of B to one horizontal line (the axis,for example).

Definition 4.5. We call canonical representation of Bi a lower triangular blockmatrix (each block having q columns)

Ci(s) =

⎡⎢⎣C1(s) 0. . .

∗ Ci(s)

⎤⎥⎦ (4.6)

such that Bi = kerCi(σ) is a minimal representation.

To state the properties of this canonical form, a very important role will beplayed by the family of behaviors defined as follows:

Bi =w ∈ (Kq)Z :

[0w

]∈ Bi

. (4.7)

These can be thought of as the sets of ‘line-values’ that in B follow i−1 null ‘lines’.Note that B1 = kerC1(σ) by definition. However, this is a general fact: given

a canonical representation (4.6) of Bi, the behavior Bj has minimal representationCj(s) for any j = 1, . . . , i. Indeed, by the form of Cj(s) given in (4.4),

w ∈ Bj ⇔ Cj(σ)[0w

]=[

0Cj(σ)w

]= 0 ⇔ w ∈ kerCj(σ). (4.8)

Also the converse result holds, as the following statement shows.

Lemma 4.6. The matrix Ci(s) defined by (4.6) is a canonical representation of Bi

if and only if Cj(s) is a minimal representation of Bj for any j = 1, . . . , i.

Proof. We already showed that the sufficiency holds. To prove the necessary con-dition we proceed by induction.

Page 379: Recent Advances in Operator Theory and its Applications: The Israel Gohberg Anniversary Volume

Symmetries of 2D Discrete-Time Linear Systems 375

If we let C1(s) = C1(s), we only have to prove that if Bj = kerCj(σ) andBj+1 = kerCj+1(σ), then there exists Tj+1(s) such that Bj+1 = kerCj+1(σ), withthe matrix Cj+1(s) as in (4.4).

By Proposition 4.4, there exist matrices Tj+1(s) and Cj+1(s) with full rowrank such that

Bj+1 = ker[

Cj(σ) 0−Tj+1(σ) Cj+1(σ)

].

By (4.8), Bj+1 = ker Cj+1(σ). Hence there must exist a unimodular U(s)such that Cj+1(s) = U(s)Cj+1(s). So, if we put Tj+1(s) = U(s)Tj+1(s), we getthat

Cj+1(s) =[

Cj(s) 0−Tj+1(s) Cj+1(s)

]=[I 00 U(s)

] [Cj(σ) 0

−Tj+1(σ) Cj+1(σ)

],

which provides the claimed canonical representation of Bj+1, j = 1, . . . , i.

This fact permits us to state the following important theorem that, restrictingto a particular class of unimodular matrices, specializes Corollary 2.4 to the caseof canonical representations.

Definition 4.7. Given Bi or, equivalently, a representation (4.6), the set of lowertriangular unimodular i × i block matrices with block sizes nl × nk, where nj isthe number of rows of any minimal representations of Bj , is denoted by U i.

Theorem 4.8. Let Ci(s) be one canonical representation of Bi. Then Ci(s) isa canonical representation of Bi if and only if Ci(s) = U i(s)Ci(s) for someU i(s)∈U i.

Proof. On the diagonal of any matrix U i(s) ∈ U i there are necessarily unimodularmatrices Uj(s), j = 1, . . . , i. Therefore, if Ci(s) has the form (4.6), the diagonalblocks of U i(s)Ci(s), i.e., Uj(s)Cj(s), still have full row rank and so U i(s)Ci(s) isa canonical representation of Bi, by Definition 4.5.

Suppose now that Ci(s) is a canonical representation of Bi, with diagonalblocks Cj(s), j = 1, . . . , i. By Lemma 4.6, both the matrices Cj(s) and Cj(s) areminimal representations of the behavior Bj , hence they have the same dimensionsand, by Corollary 2.4, there exist unimodular matrices Uj(s) such that Cj(s) =Uj(s)Cj(s). The question is whether there exists a matrix U i(s) ∈ U i with diagonalblocks Uj(s) such that Ci(s) = U i(s)Ci(s). We prove it by induction on j. Thefact is trivially true for j = 1. So let us suppose that Cj(s) = U j(s)Cj(s) withU j(s) ∈ Uj . We want to find Vj+1(s) such that U j+1(s) =

[Uj(s) 0

−Vj+1(s) Uj+1(s)

].

Explicitly,

Cj+1(s) =[

Cj(s) 0−Tj+1(s) Cj+1(s)

]=[

U j(s) 0−Vj+1(s) Uj+1(s)

] [Cj(s) 0

−Tj+1(s) Cj+1(s)

].

Page 380: Recent Advances in Operator Theory and its Applications: The Israel Gohberg Anniversary Volume

376 P. Rocha, P. Vettori and J.C. Willems

This condition, since all the other equations are satisfied, is equivalent to Tj+1(s) =Vj+1(s)Cj(s) + Uj+1(s)Tj+1(s). Now, for every w ∈ Bj there exist w∗ such that[ ww∗ ] ∈ Bj+1. Therefore,

Tj+1(σ)w = Cj+1(σ)w∗ = Uj+1(σ)Cj+1(σ)w∗ = Uj+1(σ)Tj+1(σ)w.

This means that (Tj+1(σ) − Uj+1(σ)Tj+1(σ))w = 0, i.e., that

kerCj(σ) = Bj ⊆ ker(Tj+1(σ)− Uj+1(σ)Tj+1(σ)),

which, by Theorem 2.3, implies that there exists a matrix Vj+1(s) such thatTj+1(s)− Uj+1(s)Tj+1(s) = Vj+1(s)Cj(s), thus proving the theorem.

4.2. Canonical representations of a 2D behavior

We use now the canonical form introduced in Definition 4.5 to give a canonicalform for 2D systems.

Definition 4.9. The representation B = kerC(σ1, σ2) of a 2D behavior is canonical(with respect to the second variable) if the degree N in s2 of C(s1, s2) is minimalover all kernel representations of B and in the factorization (4.2),

C(s1, s2) = CN (s1)ΦN (s2), (4.9)

the matrix CN (s) is a canonical representation (4.6).

Remark 4.10. A similar construction would lead to the definition of a canonicalform with respect to the first variable but we will not use it in this paper.

Note that, by its definition, when B is given by a canonical representationCN (s1)ΦN (s2), then BN = kerCN (σ). The canonical representation of kerσ2,given as example in Remark 4.3, is simply C(s1, s2) = 1.

Theorem 4.11. Every 2D behavior B admits a canonical representation.

Proof. By Theorem 2.2, there exist a kernel representations of B. Let R(s1, s2)have minimal degree N in the second variable and factorize it as in Equation (4.1),R(s1, s2) = RN (s1)ΦN (s2). Then, given minimal representations of the 1D behav-iors Bi, Proposition 4.4 and Lemma 4.6 show how to construct canonical repre-sentations Ci(s) of Bi for any i = 1, . . . , N .

We now show that C(s1, s2) = CN (s1)ΦN (s2) is a canonical representationof B. Indeed, as we saw in Remark 4.3, kerCN (σ) = BN ⊆ kerRN(σ). So,

kerCN (σ1)ΦN (σ2) ⊆ kerRN (σ1)ΦN (σ2) = B.On the other hand, if w ∈ B then wk(h) = ΦN (σ2)w(h, k) belongs to BN for

any k ∈ Z and thus

B ⊆ kerC(σ1, σ2) = kerCN (σ1)ΦN (σ2).

Hence, since B = kerC(σ1, σ2), the proof is concluded.

Another fundamental fact regards the relation between two canonical repre-sentations, which is clarified by the following theorem.

Page 381: Recent Advances in Operator Theory and its Applications: The Israel Gohberg Anniversary Volume

Symmetries of 2D Discrete-Time Linear Systems 377

Theorem 4.12. Given a canonical representation C(s1, s2) = CN (s1)ΦN (s2) of the2D behavior B, any other canonical representation is equal to U(s1)C(s1, s2) forsome U(s) ∈ UN , where UN depends on CN (s) as in Definition 4.7.

Note that using this class of canonical representations, we overcome the twoproblems occurring in the equivalence of 2D behaviors. Indeed, two matrices thatare representations of the same behavior are generally not unimodularly equivalent.However, by Theorem 4.12, this holds always true for canonical representations of2D systems and, moreover, the unimodular matrix is a polynomial matrix in oneindeterminate.

These facts will be very useful in the proof of the main result about symme-tries in Section 5.

Example. Let the 2D behavior B be defined by the kernel representation

R(s1, s2) =[s2 − s1 s2 − 11− s1 s2 − s1

]. (4.10)

Since R(s1, s2) is a first order polynomial matrix in s2, i.e., N = 2, it is possibleto write R(s1, s2) = R0(s1) + R1(s1)s2 where

R0(s) =[−s −1

1− s −s

]and R1(s) =

[1 10 1

].

Note that both R0(s) and R1(s) are full row rank matrices and therefore R0(σ)and R1(σ) are surjective operators.

First of all, we show that B1 = (R2)Z, i.e., the restriction of B to one (hor-izontal) line is free. In other words, we prove that any sequence w(t, 0) can beextended to a 2D sequence w(t, τ) ∈ B.

Let us fix w(t, 0). By using the kernel representation of B we have that

R0(σ)w(t, 0) + R1(σ)w(t, 1) = 0. (4.11)

Since R1(σ) is surjective, there exists a sequence w(t, 1) such that (4.11) holdstrue. This shows that any w(t, 0) can be recursively extended to a 2D sequencew(t, τ) which satisfies the defining equation of B for any positive τ .

The same reasoning can be done for negative τ , in this case by using theequation R0(σ)w(t,−1) + R1(σ)w(t, 0) = 0 and the fact that R0(σ) is surjective.We conclude that B1 = kerC1(σ), with C1(s) = 0.

To find a kernel representation of B2 we use Proposition 4.4 and Lemma 4.6,i.e., we look for a kernel representation of B2. By definition, w ∈ B2 if and onlyif [ 0

w ] ∈ B2. So, by shift-invariance, B2 = w(t, 1) : w ∈ B and w(t, 0) = 0.However, by equation (4.11), if w(t, 0) = 0 then R1(σ)w(t, 1) = 0 and, being R1(s)clearly invertible, w(t, 1) = 0. Hence, B2 = kerC2(σ) with C2(s) = I.

To find T2(s) in equation (4.4), observe that −T2(σ)w(t, 0)+C2(σ)w(t, 1) = 0must hold, i.e., w(t, 1) = T2(σ)w(t, 0). By substituting it into equation (4.11), weget [R0(σ)+R1(σ)T2(σ)]w(t, 0) = 0. By the freeness of w(t, 0) and the invertibility

Page 382: Recent Advances in Operator Theory and its Applications: The Israel Gohberg Anniversary Volume

378 P. Rocha, P. Vettori and J.C. Willems

of R1(σ), it follows that

−T2(s) = R1(s)−1R0(s) =[1 −10 1

] [−s −1

1− s −s

]=[−1 s− 1

1− s −s

].

Finally, since C1(s) is null, the canonical representation of B2 is

C2(s) =[−T2(s) C2(s)

]=[

−1 s− 1 1 01− s −s 0 1

](4.12)

and a canonical representation of B is

C(s1, s2) = C2(s1)Φ2(s2) =[s2 − 1 s1 − 11− s1 s2 − s1

].

Notice that U(s1)C(s1, s2) = R(s1, s2) where U(s) = R1(s). This means, byTheorem 4.12, that R(s1, s2) was already a 2D canonical form.

5. Static symmetries of dynamical systems

The aim of this section is to show the effect of static symmetries of B, correspondingto the condition ρB ⊆ B, on its kernel representations.

For 1D systems the following theorem holds [2].

Theorem 5.1. Given a representation ρ of G on W = Kq, the behavior B ⊆WZ isρ-symmetric if and only if it admits a minimal representation provided by R(s) ∈K[s]p×q such that

R(s)ρ = ρ′R(s)where ρ′ is a subrepresentation of ρ.

This theorem has been extended to nD systems [1] but under the restric-tive assumption of behavior regularity. Moreover, in the real case K = R, anotherhypothesis has to be added: the representation cannot have irreducible quater-nionic components (see Section 5.2). However, we will prove that for 2D systemsa complete extension of Theorem 5.1 is possible.

First of all, note that if B = kerR(s1, s2) with R(s1, s2) ∈ K[s1, s2]p×q, thenfor any representation ρ′ of G on Kp, the condition

R(s1, s2)ρ = ρ′R(s1, s2) (5.1)

is sufficient for B to be ρ-symmetric. Indeed, for any w ∈ B we have

R(s1, s2)ρw = ρ′R(s1, s2)w = 0 ⇒ ρw ∈ B.We are going to show that this condition is also necessary in the main theorem

of this paper.

Theorem 5.2. Given a representation ρ of G on W = Kq, the behavior B ⊆WZ2

isρ-symmetric if and only if it admits a kernel representation R(s1, s2) ∈ K[s1, s2]p×q

such that for a suitable representation ρ′ of G

R(s1, s2)ρ = ρ′R(s1, s2). (5.2)

Page 383: Recent Advances in Operator Theory and its Applications: The Israel Gohberg Anniversary Volume

Symmetries of 2D Discrete-Time Linear Systems 379

Moreover, if ρ = m1η1 ⊕ · · · ⊕mrη

r, then ρ′ = m′1η

1 ⊕ · · · ⊕m′rη

r for some m′i,

i = 1, . . . , r.

Proof. We already showed that if equation (5.2) holds, B is ρ-symmetric. We provenow the converse and the condition on ρ′.

Let C(s1, s2) be a canonical representation of B. The behavior is ρ-symmetricif and only if ρB = B, which is equivalent to kerC(σ1, σ2) = kerC(σ1, σ2)ρ. Notethat ΦN (s)ρ = ρNΦ(s)N where

ρN = diag(N︷ ︸︸ ︷

ρ, . . . , ρ). (5.3)

Therefore, if we decompose C(s1, s2) as in the factorization (4.9), we obtain that

C(s1, s2)ρ = CN (s1)ΦN (s2)ρ = CN (s1)ρNΦN (s2). (5.4)

The matrix CN (s)ρN is still a canonical representation of BN . Indeed it isa lower triangular matrix with diagonal blocks Cj(s)ρ, where Cj(s) are the corre-sponding blocks of CN (s). These blocks are clearly full row rank matrices, hencethe conditions of Definition 4.5 are met and C(s1, s2)ρ is a canonical representationof B. Therefore, by Theorem 4.12, a family of unimodular matrices Ug(s1) ∈ UN

is uniquely determined such that

C(s1, s2)ρg = Ug(s1)C(s1, s2), ∀g ∈ G. (5.5)

The rest of the proof is analogous to the 1D case. We just give a sketchwithout the details that can be found in [2].

It is possible to prove that Ug(s) is a continuous polynomial representation ofG. However, by a theorem proved in [6], it is also isomorphic to a constant represen-tation ρ′, i.e., there exists a unimodular matrix V (s) such that V (s)Ug(s) = ρ′gV (s)for any g ∈ G. This means that, if we let R(s1, s2) = V (s1)C(s1, s2), thenB = kerR(σ1, σ2) and, by (5.5),

ρ′gR(s1, s2) = ρ′gV (s1)C(s1, s2) = V (s1)C(s1, s2)ρg = R(s1, s2)ρg, ∀g ∈ G.

To prove that the decomposition of ρ′ uses the irreducible subrepresentationsof ρ, note that, as we already said, matrix (5.4) is a canonical representation,BN = kerCN (σ)ρN , and thus ρNBN ⊆ BN . Using the matrix V (s) that we havejust found, we see that ρ′V (s)CN (s) = V (s)CN (s)ρN and so, by Theorem 5.1, ρ′

is a subrepresentation of ρN . This shows that m′i ≤ Nmi.

One could wonder whether it is possible to take a canonical representationR(s1, s2) in formula (5.2) of Theorem 5.2. The answer, in general, is negative. Theonly result which is possible to state is the following.

Corollary 5.3. If B is ρ-symmetric, then also the 1D behaviors Bj are ρ-symmetric.Moreover, let Cj(s) be the minimal kernel representations of these behaviors suchthat ρ′jCj(s) = Cj(s)ρ, where ρ′j are subrepresentations of ρ (as stated by Theo-rem 5.1). Then there exist Ug(s) ∈ UN such that

Ug(s1)C(s1, s2) = C(s1, s2)ρg, ∀g ∈ G,

where Ug(s) ∈ UN has diagonal blocks ρ′jg .

Page 384: Recent Advances in Operator Theory and its Applications: The Israel Gohberg Anniversary Volume

380 P. Rocha, P. Vettori and J.C. Willems

Proof. This is a straightforward consequence of the proof of Theorem 5.2. Indeed,by the properties of Ug(s) therein stated, by the structure (4.6) of CN (s), and byminimality of Cj(s), the statement follows.

If we take into account also the decomposition into irreducible components ofthe representation ρ inducing the symmetry, it is possible to characterize R(s1, s2)in a more detailed way. This leads to the definition of a canonical structure forkernel representations of symmetric systems.

However, the decomposition of representations depends highly on the basefield, and so we have to distinguish the real and complex cases. We will only exposethe main ideas, referring the reader to [2] for further details.

5.1. Symmetric representations of complex behaviors

Suppose that the basis of W is such that the decomposition (3.1), correspondsto orthogonal subspaces U1, . . . , Ur. If ni = deg ηi, then dimUi = mini. Thematrices ρg have therefore a block diagonal structure. We will suppose without lossof generality that also each ρ′g is a block diagonal matrix. We can then partitionR(s1, s2) into blocks Rij(s1, s2) according to ρg as regards the columns and to ρ′gfor the rows – so that equation (5.2) can be written blockwise

m′iη

iRij(s1, s2) = Rij(s1, s2)mjηj .

As a consequence of a fundamental result in representation theory, the SchurLemma [10], we obtain that Rij(s1, s2) = 0 for i = j and that Rii(s1, s2), ablock of dimension m′

ini ×mini, can be expressed as

Rii(s1, s2) = Λi(s1, s2)⊗ Ini , (5.6)

where Λi(s1, s2) ∈ C[s1, s2]m′i×mi is a suitable polynomial matrix, Ini is the ni×ni

identity matrix and ⊗ is the Kronecker product, such that [aij ]⊗B = [aijB].Therefore, by partitioning also trajectories w ∈ B into r block vectors ac-

cording to ρ, the kernel representation of B can be written as

B = ker(Λ1(s1, s2)⊗ In1)⊕ · · · ⊕ ker(Λr(s1, s2)⊗ Inr ).

5.2. Symmetric representations of real behaviors

When K is the field of real numbers, the general structure presented in the previoussection for a complex representation still holds, but the matrices Λi(s1, s2) inequation 5.6 are not so simple anymore. Indeed, three different types of irreduciblereal representations (real, complex and quaternionic, as defined in [5, §3.5]) giverise to different kinds of blocks Rii(s1, s2). If ρi is a real irreducible representation,the corresponding block looks like

Ai(s1, s2)⊗ Ini

where Ai(s1, s2) ∈ R[s1, s2]m′i×mi .

Page 385: Recent Advances in Operator Theory and its Applications: The Israel Gohberg Anniversary Volume

Symmetries of 2D Discrete-Time Linear Systems 381

For a complex irreducible representation ρi the typical block looks like[Ai(s1, s2) Bi(s1, s2)−Bi(s1, s2) Ai(s1, s2)

]⊗ Ini , (5.7)

where both Ai(s1, s2), Bi(s1, s2) ∈ R[s1, s2]m′i×mi .

Finally, if ρi is a quaternionic irreducible representation, the block Rii(s1, s2)is equal to⎡⎢⎢⎣

Ai(s1, s2) Bi(s1, s2) Ci(s1, s2) Di(s1, s2)−Bi(s1, s2) Ai(s1, s2) −Di(s1, s2) Ci(s1, s2)−Ci(s1, s2) Di(s1, s2) Ai(s1, s2) −Bi(s1, s2)−Di(s1, s2) −Ci(s1, s2) Bi(s1, s2) Ai(s1, s2)

⎤⎥⎥⎦⊗ Ini ,

where Ai(s1, s2), Bi(s1, s2), Ci(s1, s2), Di(s1, s2) ∈ R[s1, s2]m′i×mi .

Note that the size of Rii(s1, s2) is divisible by 2 or by 4 when the correspond-ing subrepresentation is complex or quaternionic.

Example. Consider again the 2D behavior B defined by the kernel representa-tion (4.10). We want to show that it is ρ-symmetric, where ρ is the representationof the group of permutations G = e, (123), (132) defined in (3.2), i.e.,

ρ(123) =[

0 1−1 −1

]and ρ(132) = ρ−1

(123) =[−1 −11 0

].

By equations (5.4) and (5.5), the behavior is symmetric if and only if thereexist a family of unimodular matrices Ug(s) such that

C2(s)ρ2g = Ug(s)C2(s), ∀g ∈ G,

where C2(s) =[−T2(s) C2(s)

]was found in equation (4.12). However, this means

that, in particular, C2(s)ρg = Ug(s)C2(s) and, being C2(s) = I, it follows thatρg = Ug(s).

So, we can affirm that B is ρ-symmetric once we check that also T2(s)ρg =Ug(s)T2(s) holds true, i.e., if and only if[

−1 s− 11− s −s

]ρg = ρg

[−1 s− 1

1− s −s

], ∀g ∈ G.

As a last remark, note that ρ is a complex irreducible representation: incomplex form it is just a cubic root of unity (e.g, − 1

2 + i√

32 ). So, by changing the

base of B, its kernel representation and the representation of G can be written

R(s1, s2) =[2s2 − s1 − 1

√3(s1 − 1)

−√

3(s1 − 1) 2s2 − s1 − 1

]and ρ(123) =

[− 1

2

√3

2

−√

32 − 1

2

],

respectively, according to the structure showed in equation (5.7).

Page 386: Recent Advances in Operator Theory and its Applications: The Israel Gohberg Anniversary Volume

382 P. Rocha, P. Vettori and J.C. Willems

References

[1] C. De Concini and F. Fagnani. Symmetries of differential behaviors and finite groupactions on free modules over a polynomial ring. Math. Control Signals Systems,6(4):307–321, 1993.

[2] F. Fagnani and J.C. Willems. Representations of symmetric linear dynamical sys-tems. SIAM J. Control Optim., 31(5):1267–1293, 1993.

[3] F. Fagnani and J.C. Willems. Interconnections and symmetries of linear differentialsystems. Math. Control Signals Systems, 7(2):167–186, 1994.

[4] F. Fagnani and J.C. Willems. Symmetries of differential systems. In DifferentialEquations, Dynamical Systems, and Control Science, pages 491–504. Marcel DekkerInc., New York, 1994.

[5] W. Fulton and J. Harris. Representation Theory. A First Course. Springer-Verlag,New York, 1991.

[6] V.G. Kac and D.H. Peterson. On geometric invariant theory for infinite-dimensionalgroups. In Algebraic Groups, Utrecht 1986, pages 109–142. Springer-Verlag, Berlin,1987.

[7] J. Komornık, P. Rocha, and J.C. Willems. Closed subspaces, polynomial operatorsin the shift, and ARMA representations. Appl. Math. Lett., 4(3):15–19, 1991.

[8] U. Oberst. Multidimensional constant linear systems. Acta Appl. Math., 20(1–2):1–175, 1990.

[9] P. Rocha and J.C. Willems. Canonical Computational Forms for AR 2-D Systems.Multidimens. Systems Signal Process., 1:251–278, 1990.

[10] J.-P. Serre. Linear Representations of Finite Groups. Springer-Verlag, Berlin, 1977.

[11] P. Vettori. Symmetric controllable behaviors. In Proceedings of the 5th PortugueseConference on Automatic Control, Controlo 2002, pages 552–557, Aveiro, Portugal,2002.

[12] J.C. Willems. Models for dynamics. In Dynamics Reported, volume 2, pages 171–269.John Wiley & Sons Ltd., Chichester, 1989.

[13] J.C. Willems. Paradigms and puzzles in the theory of dynamical systems. IEEETrans. Automat. Control, 36(3):259–294, Mar. 1991.

Paula Rocha and Paolo VettoriDepartamento de MatematicaUniversidade de Aveiro,Campus de Santiago3810-193, Aveiro, Portugale-mail: [email protected]: [email protected]

Jan C. WillemsK.U. LeuvenESAT/SCD (SISTA),Kasteelpark Arenberg 10B-3001 Leuven-Heverlee, Belgiume-mail: [email protected]

Page 387: Recent Advances in Operator Theory and its Applications: The Israel Gohberg Anniversary Volume

Operator Theory:Advances and Applications, Vol. 160, 383–401c© 2005 Birkhauser Verlag Basel/Switzerland

An Algorithm for Solving Toeplitz Systemsby Embedding in Infinite Systems

G. Rodriguez, S. Seatzu and D. Theis

Dedicated to Israel Gohberg on the occasion of his 75th birthday

Abstract. In this paper we propose a new algorithm to solve large Toeplitzsystems. It consists of two steps. First, we embed the system of order N intoa semi-infinite Toeplitz system and compute the first N components of itssolution by an algorithm of complexity O(N log2 N). Then we check the ac-curacy of the approximate solution, by an a posteriori criterion, and updatethe inaccurate components by solving a small Toeplitz system. The numeri-cal performance of the method is then compared with the conjugate gradientmethod, for 3 different preconditioners. It turns out that our method is com-patible with the best PCG methods concerning the accuracy and superiorconcerning the execution time.

Mathematics Subject Classification (2000). Primary 65F05; Secondary 47B35.

Keywords. Toeplitz linear systems, infinite linear systems, Wiener-Hopffactorization, projection method.

1. Introduction

One of the most widespread approaches to the solution of a finite Toeplitz linearsystem

ANxN = bN , (AN )ij = aNi−j , i, j = 0, 1, . . . , N − 1, (1.1)

of large order N , in order to exploit the Toeplitz structure to optimize executiontime and storage, is to employ an iterative method preconditioned by a circulantmatrix. If the matrix AN is symmetric and positive definite the preconditionedconjugate gradient (PCG) is the most commonly used method [7]. This technique

Research partially supported by MIUR under COFIN grants No. 2002014121 and 2004015437.


consists of determining a nonsingular circulant matrix $C_N$ such that the preconditioned linear system
$$(C_N)^{-1} A_N x^N = (C_N)^{-1} b^N \qquad (1.2)$$

can be solved by a small number of iterations. Within this family of preconditioning techniques we can distinguish those usually attributed to G. Strang [8], T. Chan [6], and E. Tyrtyshnikov [22] (see also [21]), characterized by different strategies to compute $C_N$.

For a review of fast direct methods for the solution of Toeplitz systems, see also [13, 14, 15].

We recall that for the solution of a semi-infinite Toeplitz system

$$Ax = b, \qquad (A)_{ij} = a_{i-j}, \quad i,j = 0,1,2,\dots, \qquad (1.3)$$

there exists the classical Wiener-Hopf factorization method [9, 10]. Let us now assume that the symbol
$$A(z) = \sum_{j\in\mathbb{Z}} a_j z^j, \quad |z| = 1, \qquad (1.4)$$
does not vanish on the unit circle and that its winding number is zero. Under this hypothesis, using a (canonical) Wiener-Hopf factorization of the inverse $A(z)^{-1}$ of the symbol for semi-infinite Toeplitz matrices of Wiener class, i.e., for those satisfying the Wiener condition $\sum_{j\in\mathbb{Z}} |a_j| < \infty$, the solution of Eq. (1.3) can be given in terms of Krein's resolvent formula [5, 9, 16]. Various numerical techniques exist for computing a Wiener-Hopf factorization, in particular for symbols of positive definite semi-infinite Toeplitz matrices [1, 2, 3, 20, 23] (see also [12], where a comparative analysis of these techniques was made). Here we employ Krein's method (also called the cepstral method in signal processing [3, 20]), which combines FFT techniques with Krein's resolvent formula to solve Eq. (1.3) numerically.

In this article, assuming $A_N$ can be extended to a semi-infinite Toeplitz matrix whose symbol does not vanish and has winding number zero, we propose a numerical method to solve large Toeplitz systems of the form (1.1). Embedding Eq. (1.1) into a semi-infinite Toeplitz system, solving the semi-infinite system by Krein's method and truncating its solution to its first $N$ components, we obtain a first approximation of the solution of the original Eq. (1.1). The crucial observation is that for large enough $N$ there exists a comparatively small positive integer $n$ such that the first $N-n$ components of the solution vector have an acceptable accuracy, whereas the remaining $n$ components are to be recomputed by solving a finite Toeplitz system of order $n$. Since $n \ll N$, this can be done by any of the established techniques for solving Toeplitz systems. The numerical results obtained are quite satisfactory if compared with those obtained by the conjugate gradient method preconditioned by the techniques of Strang, Chan and Tyrtyshnikov.

Let us now discuss the contents of this paper. In Section 2 we shall explain the algorithm used to solve Eq. (1.1) and also analyze the pointwise error generated in the first step of the method in a simple case. Section 3 is devoted to the various test


matrices of Toeplitz type used in the calculations and Section 4 to the numerical results. Throughout the paper $\|\cdot\|$ denotes the Euclidean vector norm $\|x\| = \bigl[|x_0|^2 + \cdots + |x_{N-1}|^2\bigr]^{1/2}$ for $x = (x_0,\dots,x_{N-1})^T$.

2. The algorithm

Given the linear Toeplitz system (1.1), we embed it in the semi-infinite system

$$A^\infty x^\infty = b^\infty, \qquad (2.1)$$
where
$$a^\infty_{ij} = \begin{cases} a^N_{i-j}, & |i-j| \le N-1,\\ 0, & |i-j| \ge N,\end{cases}
\qquad\text{and}\qquad
b^\infty_i = \begin{cases} b^N_i, & i = 0,1,\dots,N-1,\\ 0, & i \ge N.\end{cases}$$

We assume that $A_N$ and $A^\infty$ are invertible, that the symbol $A(z)$ does not vanish and that its winding number is zero.

The method that we propose is based on the following two steps:

1. computation of the first $N$ components of the solution $x^\infty$ of the semi-infinite system by an algorithm whose computational complexity has order $N\log_2 N$;

2. correction of the components of $x^\infty$ that possibly do not satisfy an a posteriori error criterion, by solving a small Toeplitz system.

Let us preliminarily analyze how large the pointwise error $|x^\infty_i - x^N_i|$, $i = 0,1,\dots,N-1$, is in a simple case. More precisely, let us consider the $N\times N$ tridiagonal real Toeplitz system
$$T_N y^N = b^N$$
given by
$$\begin{bmatrix} \alpha & \beta & & 0\\ \beta & \ddots & \ddots & \\ & \ddots & \ddots & \beta\\ 0 & & \beta & \alpha \end{bmatrix}
\begin{bmatrix} y_0\\ \vdots\\ \vdots\\ y_{N-1}\end{bmatrix}
= \begin{bmatrix} b_0\\ \vdots\\ \vdots\\ b_{N-1}\end{bmatrix} \qquad (2.2)$$

where $\alpha > 2|\beta| > 0$. Letting $b^N = (b_i)_{i=0}^{N-1}$ for $b^\infty = (b_i)_{i=0}^{\infty} \in \ell^2$ and letting $y^\infty = (y_i)_{i=0}^{\infty}$ be the unique solution of the corresponding semi-infinite Toeplitz system
$$T^\infty y^\infty = b^\infty,$$
the vector $(y^\infty)^N = (y^\infty_i)_{i=0}^{N-1}$ is easily seen to satisfy
$$T_N (y^\infty)^N = b^N - \beta\, y^\infty_N\, e^N$$


where $e^N_i = \delta_{i,N-1}$, $i = 0,1,\dots,N-1$, and $(y^\infty)^N = (y^\infty_0,\dots,y^\infty_{N-1})^T$. Hence the solution $y^N$ of (2.2) is given by
$$y^N = (y^\infty)^N + \beta\, y^\infty_N\, \check{x}^N,$$
where $\check{x}^N = (x^N_{N-1-i})_{i=0}^{N-1}$ satisfies $T_N \check{x}^N = e^N$. Since the unique solution of $T^\infty x^\infty = (\delta_{i,0})_{i=0}^{\infty}$ equals $x^\infty = (\delta c^i)_{i=0}^{\infty}$, where $c = [-\alpha + \sqrt{\alpha^2 - 4\beta^2}]/2\beta$ and $\delta = 1/(\alpha + \beta c) = -c/\beta$, we easily find
$$y^N_i - y^\infty_i = \beta\, y^\infty_N\, x^N_{N-1-i}
= \frac{\beta\delta\, y^\infty_N}{1-\beta^2\delta^2 c^{2N}}\bigl(c^{N-1-i} + \beta\delta\, c^{N+i}\bigr)
= -\frac{y^\infty_N}{1-c^{2N+2}}\, c^{N-i}\bigl(1 - c^{2i+2}\bigr),$$

for $i = 0,1,\dots,N-1$. It is well known that $y^\infty_N$ is exponentially decaying as $N\to\infty$ whenever $b^N$ is, in particular when $b_i = 0$ for $i$ large enough. Since

$$\frac{y^N_{i+1} - y^\infty_{i+1}}{y^N_i - y^\infty_i} = \frac{1}{c}\,\frac{1-c^{2i+4}}{1-c^{2i+2}},$$
which exceeds 1 in absolute value, the deviation of $y^N_i$ from the corresponding expression $y^\infty_i$ for the semi-infinite system increases monotonically as $i$ increases from 0 to $N-1$. As a result, the maximum absolute value of the error occurs in the last components of the vector, that is,

$$|y^N_{N-1} - y^\infty_{N-1}| = \max_{i=0,\dots,N-1} |y^N_i - y^\infty_i|.$$
Furthermore, as
$$\max_{i=0,\dots,N-1} |y^N_i - y^\infty_i| \approx |c\, y^\infty_N|,$$

the error goes to zero exponentially fast as $N\to\infty$.

Let us now illustrate the first part of the algorithm. It is well known [9, 10] that the invertibility of a semi-infinite Toeplitz matrix $A^\infty$ is equivalent to the symbol (1.4), associated to the bi-infinite matrix $A_{ij} = a_{i-j}$, $i,j\in\mathbb{Z}$, having a (canonical) Wiener-Hopf factorization
$$A(z) = A_+(z)\, A_-(z), \quad |z| = 1,$$
where $A_+(z)$ and $\bigl(A_+(z)\bigr)^{-1}$ are continuous for $|z|\le 1$ and analytic for $|z|<1$, and $A_-(z)$ and $\bigl(A_-(z)\bigr)^{-1}$ are continuous for $|z|\ge 1$ and analytic for $|z|>1$ and at infinity. A Wiener-Hopf factorization of $A$ exists, in particular, if $A$ is positive definite.

Let
$$\bigl(A(z)\bigr)^{-1} = \Gamma_+(z)\,\Gamma_-(z), \quad |z| = 1, \qquad (2.3)$$


where
$$\Gamma_+(z) = \sum_{j\in\mathbb{Z}_+} \Gamma^{(1)}_j z^j, \qquad \Gamma_-(z) = \sum_{j\in\mathbb{Z}_+} \Gamma^{(2)}_{-j} z^{-j},$$
are its Wiener-Hopf factors. Such a factorization exists if and only if $A^\infty$ is invertible on $\ell^2$.

With this notation, we have the following resolvent formula [5, 9, 16] for the semi-infinite system (2.1):
$$x^\infty_i = \sum_{j\in\mathbb{Z}_+} \Gamma_{ij}\, b^\infty_j, \quad i\in\mathbb{Z}_+,$$
with
$$\Gamma_{ij} = \sum_{0\le h\le \min(i,j)} \Gamma^{(1)}_{i-h}\, \Gamma^{(2)}_{h-j}.$$

We note that $b^\infty_j = 0$ for $j\ge N$ and that, as the matrix $A^\infty$ is banded, the coefficients of its inverse $(A^\infty)^{-1}$ decay exponentially, as do the elements $\Gamma^{(1)}_j$ and $\Gamma^{(2)}_{-j}$ of the Wiener-Hopf factors, as can be shown by using Gelfand theory and Sec. XXX.4(v) of [10].

According to Krein's method [5, 16], the Wiener-Hopf factorization (2.3) can be obtained by decomposing the Fourier series of the negative logarithm of $A(z)$ additively into Fourier series in nonnegative and nonpositive powers of $z$ and exponentiating the terms obtained. More precisely, we compute

$$-\log A(z) = \gamma_+(z) + \gamma_-(z),$$
where
$$\gamma_+(z) = \frac{\gamma_0}{2} + \sum_{j=1}^{\infty} \gamma_j z^j, \qquad
\gamma_-(z) = \frac{\gamma_0}{2} + \sum_{j=1}^{\infty} \gamma_{-j} z^{-j},$$
and then expand in Fourier series the functions
$$\Gamma_+(z) = e^{\gamma_+(z)} \quad\text{and}\quad \Gamma_-(z) = e^{\gamma_-(z)}.$$

Numerical computations confirm that, as the matrix $A^\infty$ is banded, the coefficients $\gamma_j$ and $\gamma_{-j}$ decay exponentially, within machine precision, as is to be expected for theoretical reasons ([10], Sections XXX.1(v) and XXX.4(v)). This fact justifies the approximation of $\log A(z)$ by a polynomial whose degree is comparatively small with respect to $N$. The typical decay of these coefficients is depicted in Figures 1 and 2, where the coefficients $\gamma_j$ and $\Gamma^{(1)}_j$ correspond to a positive definite matrix generated by a function with Gaussian decay (see Section 3). In these figures, as throughout the paper, $\mu = \mathrm{cond}(A^\infty)$, that is, $\mu = \lim_{N\to\infty} \mathrm{cond}(A_N)$ (see Section 3). Though the absolute values of $\gamma_j$ and $\gamma_{-j}$, like those of $\Gamma^{(1)}_j$ and $\Gamma^{(2)}_{-j}$, should decay exponentially fast, starting from an index $m \ll N$ they stagnate for numerical reasons. This observation led us to set to zero all coefficients with $j > m$, thereby determining the polynomials that approximate $\gamma_+(z)$ and $\gamma_-(z)$.

Figure 1. Decay of $|\gamma_j|$ (Gaussian decay, $\sigma = 0.17$, $\mu \approx 2\cdot 10^{12}$, $N = 2^{20}$).

Figure 2. Decay of $|\Gamma^{(1)}_j|$ (Gaussian decay, $\sigma = 0.17$, $\mu \approx 2\cdot 10^{12}$, $N = 2^{20}$).
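To make the factorization step concrete, the following is a minimal NumPy sketch of the cepstral procedure just described: sample the symbol on the unit circle, split the Fourier coefficients of $-\log A(z)$, exponentiate, and apply the two triangular Toeplitz factors to $b^N$. This is our own illustration rather than the authors' Matlab code: the function names are invented, the factors are applied by plain convolutions instead of the FFT products used in the paper, and a real positive symbol is assumed so that the principal logarithm is safe.

```python
import numpy as np

def krein_factors(a, M=2**14):
    """Coefficients of the Wiener-Hopf factors of 1/A(z) by Krein's (cepstral) method.
    `a` holds the symbol coefficients a_j, j = -m,...,m, centred at index m."""
    a = np.asarray(a, dtype=float)
    m = (len(a) - 1) // 2
    z = np.exp(2j * np.pi * np.arange(M) / M)             # M-th roots of unity
    A_vals = sum(a[j + m] * z**j for j in range(-m, m + 1))
    gamma = np.fft.fft(-np.log(A_vals)) / M               # gamma[j] ~ coefficient of z^j (mod M)
    gp = np.zeros(M, complex)
    gm = np.zeros(M, complex)
    gp[0] = gm[0] = gamma[0] / 2
    gp[1:M // 2] = gamma[1:M // 2]                        # gamma_j,  j >= 1
    gm[M // 2 + 1:] = gamma[M // 2 + 1:]                  # gamma_{-j}, stored at index M-j
    G1 = np.fft.fft(np.exp(M * np.fft.ifft(gp))) / M      # Gamma^(1)_j at index j
    G2 = np.fft.fft(np.exp(M * np.fft.ifft(gm))) / M      # Gamma^(2)_{-j} at index M-j
    return G1[:M // 2].real, np.r_[G2[0], G2[:M // 2:-1]].real

def solve_embedded(a, b, M=2**14):
    """First N components of the embedded solution, x = Gamma^(1) Gamma^(2) b."""
    g1, g2 = krein_factors(a, M)
    N, L = len(b), len(g2)
    y = np.convolve(g2[::-1], b)[L - 1:L - 1 + N]         # upper-triangular Toeplitz factor
    return np.convolve(g1, y)[:N]                         # lower-triangular Toeplitz factor
```

In the spirit of the truncation described above, one would additionally zero the coefficients beyond the stagnation index $m$ and replace the convolutions by FFT-based products to reach the stated $O(N\log_2 N)$ cost.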

It is worthwhile to remark that an accurate approximation of the vector $(x^\infty_i)_{i=0}^{N-1}$ can be computed in FFT-time, since the Wiener-Hopf factors $\Gamma_+ = (\Gamma^{(1)}_{i-j})$ and $\Gamma_- = (\Gamma^{(2)}_{i-j})$ of $(A^\infty)^{-1}$ are lower and upper semi-infinite Toeplitz matrices, respectively. More precisely, we need 2 FFTs for computing the polynomial approximation of the Fourier series of $\log A(z)$, 2 for the coefficients $\gamma_j$ and $\gamma_{-j}$, and 2 more FFTs for the coefficients $\Gamma^{(1)}_j$ and $\Gamma^{(2)}_{-j}$. If the matrix $A^\infty$ is symmetric positive definite we just need 4 FFTs, as $\gamma_{-j} = \gamma_j$, $j\in\mathbb{N}$, and $\Gamma^{(2)}_{-j} = \Gamma^{(1)}_j$, $j\in\mathbb{Z}_+$. The vector $(x^\infty)^N$ of the first $N$ components of $x^\infty$ can then be computed as
$$(x^\infty)^N = \Gamma^{(1)}\Gamma^{(2)} b^N,$$
i.e., by 4 additional FFTs (3 in the positive definite case).

Let us now discuss the asymptotic behavior of the pointwise error as $N\to\infty$. To this end we consider a generalization to weighted spaces of well-known results on the projection method [4, 11].

To this end we consider a generalization to weighted spaces of well-known resultson the projection method [4, 11].

More precisely, let us assume that $A^\infty$ is a semi-infinite Toeplitz matrix whose symbol $A(z)$ does not vanish in an annulus about the unit circle and has winding number zero, and let $b^\infty$ be a vector whose elements decay exponentially fast. Furthermore, let $A_N x^N = b^N$ be the system of order $N$ with
$$a^N_{ij} = a^\infty_{ij}, \quad i,j = 0,1,\dots,N-1, \qquad\text{and}\qquad b^N_i = b^\infty_i, \quad i = 0,1,\dots,N-1.$$
We assume that $A_N$ is nonsingular for each value of $N$. Following [4, 11] it is then straightforward to prove

Theorem 2.1. Assuming $A_N$, $A^\infty$ and $b^\infty$ as specified before, the finite Toeplitz system $A_N x^N = b^N$ has a unique solution for each $N$ and, for fixed $\rho > 1$ such that $A(z) \neq 0$ for $\frac{1}{\rho} \le |z| \le \rho$, the sequence
$$\alpha_N(\rho) := \max_{i=0,1,\dots,N-1} \rho^i\, |x^N_i - x^\infty_i| \to 0, \quad\text{as } N\to\infty.$$

A direct consequence is that $\rho^{N-1}|x^N_{N-1} - x^\infty_{N-1}|$ converges to zero as $N\to\infty$.

Furthermore, our numerical results highlight that, as in the tridiagonal case, the error
$$\|x^N - P_N x^\infty\|_\infty := \max_{i=0,\dots,N-1} |x^N_i - x^\infty_i|$$
decays exponentially fast. This property is illustrated in Figure 3, where $\|x^N - P_N x^\infty\|_\infty$ is depicted in log-log scale in the Gaussian case (see Section 3). A similar decay holds true for both the exponentially and the algebraically decaying matrices $A_N$ considered in Section 3.

Figure 3. Decay of $\|x^N - P_N x^\infty\|_\infty$ (Gaussian decay, $\sigma = 0.2$, $\mu \approx 3\cdot 10^{10}$, $x^N_i = 1/(i+1)^2$).

The second step of the algorithm is based on the following observation: our numerical experiments highlight that the pointwise error $|x^N_i - x^\infty_i|$ is generally very small for the first $N-n$ components and large for the last $n$ components, with $n$ comparatively very small with respect to $N$. More precisely, for large enough $N$, the value of $n$ does not depend on $N$, so that the larger $N$ is, the comparatively smaller is $n$, that is, the number of components to be updated. This result is based on numerical computations for matrices whose elements have a Gaussian, exponential or algebraic decay (see Section 3). Unfortunately, we cannot explain this phenomenon for non-tridiagonal Toeplitz systems. A typical situation is depicted in Figure 4.

Figure 4. Pointwise errors on the last components of the solution (Gaussian decay, $\sigma = 0.2$, $\mu \approx 3\cdot 10^{10}$, $N = 2^{17}$, $x^N_i = 1/(i+1)^2$).

The observation of the above phenomenon suggested treating $n$ as a recovery parameter, meaning that we consider the first $N-n$ components of $(x^\infty)^N$ to be substantially correct and the last $n$ to be recomputed. Fixing the value of this parameter, by a criterion we shall make precise later, we partition the finite system (1.1) as follows:
$$\begin{bmatrix} A_{EE} & A_{EF}\\ A_{FE} & A_{FF}\end{bmatrix}
\begin{bmatrix} x_E\\ x_F\end{bmatrix} = \begin{bmatrix} b_E\\ b_F\end{bmatrix},$$
where $x_E = (x^N_0, x^N_1, \dots, x^N_{N-n-1})^T$, $b_E = (b^N_0, b^N_1, \dots, b^N_{N-n-1})^T$, $x_F = (x^N_{N-n}, x^N_{N-n+1}, \dots, x^N_{N-1})^T$ and $b_F = (b^N_{N-n}, b^N_{N-n+1}, \dots, b^N_{N-1})^T$. Hence, if $A_{FF}$ is nonsingular, we can improve the approximation coming from the infinite system (2.1) by setting $x_E = (x^\infty_0, x^\infty_1, \dots, x^\infty_{N-n-1})^T$ and then taking $x_F$ as the solution of the small Toeplitz system
$$A_{FF}\, x_F = b_F - A_{FE}\, x_E. \qquad (2.4)$$
This step takes 3 FFTs for computing the product $A_{FE} x_E$, so that the total number of FFTs required by the algorithm is 13 (10 if the matrix is positive definite). Finally, we take the vector $\begin{bmatrix} x_E\\ x_F\end{bmatrix}$ as an acceptable approximation of $x^N$.

To estimate the recovery parameter $n$ we adopted the following heuristic criterion. Let $(x^N)^{(n)}$ denote the approximation of the solution obtained by taking the first $N-n$ components of the solution of the semi-infinite system (2.1) and recomputing the last $n$ entries by solving system (2.4). Then, choosing an increment $k$, we compare the last computed correction $\|(x^N)^{(n+k)} - (x^N)^{(n)}\|$ with the previous one. More precisely, fixing $0 < c_1 < c_2$, we look for the smallest value of $n$ such that

$$c_1 < \frac{\|(x^N)^{(n+k)} - (x^N)^{(n)}\|}{\|(x^N)^{(n)} - (x^N)^{(n-k)}\|} < c_2. \qquad (2.5)$$

Our numerical experiments suggest that $c_1 = 0.9$ and $c_2 = 1.1$ are suitable values for these two parameters. Furthermore, we found that $n = 40$ and $k = 20$ are good choices for the recovery parameter and the increment. We iterate the correction process, setting $n = n+k$ and correspondingly updating the solution until condition (2.5) is verified, and we take the last computed vector as the approximation of the solution.
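As an illustration of this second step, here is a small SciPy sketch of the correction loop driven by criterion (2.5). The function and variable names are ours, and SciPy's Toeplitz helpers stand in for the FFT-based product and the small Toeplitz solve mentioned in the paper.

```python
import numpy as np
from scipy.linalg import solve_toeplitz, matmul_toeplitz

def correct_tail(col, row, b, x_inf, n0=40, k=20, c1=0.9, c2=1.1, n_max=500):
    """Recovery step: keep the first N-n entries of the embedded solution x_inf
    and recompute the last n by solving the small Toeplitz system (2.4).
    col/row are the first column and first row of A_N."""
    N = len(b)

    def recompute(n):
        x = np.array(x_inf, copy=True)
        # A_FE is the n x (N-n) Toeplitz block of A_N sitting below the leading block.
        rhs = b[N - n:] - matmul_toeplitz((col[N - n:N], col[N - n:0:-1]), x[:N - n])
        x[N - n:] = solve_toeplitz((col[:n], row[:n]), rhs)   # A_FF x_F = b_F - A_FE x_E
        return x

    x_prev, x_curr, n = recompute(n0 - k), recompute(n0), n0
    while n + k <= n_max:
        x_next = recompute(n + k)
        ratio = np.linalg.norm(x_next - x_curr) / np.linalg.norm(x_curr - x_prev)
        if c1 < ratio < c2:            # criterion (2.5): the corrections have levelled off
            return x_next
        x_prev, x_curr, n = x_curr, x_next, n + k
    return x_curr
```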

Our experiments show that the ratio (2.5) oscillates for low values of $n$ and stabilizes around 1, as shown in Figures 5 and 6. In these figures, the value of $n$ selected by the adopted criterion is marked by a circle.

3. Test matrices

In this section we illustrate the method adopted to generate the matrices we used in our numerical experiments. This method, recently proposed in [18] and extended in various directions in [19], allowed us to generate several bi-infinite positive definite Toeplitz matrices, each characterized by a parameter, whose condition number is a known function of this parameter.


Figure 5. Ratio (2.5) vs. recovery parameter $n$ (Algebraic decay, $\sigma = 5.5$, $\mu \approx 3\cdot 10^{12}$, $N = 2^{17}$, $x^N_i = \sin\frac{\pi(i+1)}{N+1}$).

Figure 6. Ratio (2.5) vs. recovery parameter $n$ (Gaussian decay, $\sigma = 0.17$, $\mu \approx 2\cdot 10^{12}$, $N = 2^{17}$, $x^N_i = 1$).

The following, basically well-known, result is also of great relevance in our numerical experiments.


Theorem 3.1. If $A^\infty$ is a bi-infinite positive definite Toeplitz matrix, then the condition number of the semi-infinite matrix $A^\infty_+$, whose elements are $(A^\infty_+)_{ij} = a_{ij}$, $i,j\in\mathbb{Z}_+$, equals the condition number of $A^\infty$.

As a consequence, the condition number of a bi-infinite matrix turns out to be the limit of the condition numbers of finite projections of the corresponding semi-infinite matrix. Let us now introduce the method.

Let $\phi$ be a real function in $L^1(\mathbb{R}) \cap L^2(\mathbb{R})$ satisfying the two following properties:

(a) there exists a number $\gamma > 1$ such that $\int_{\mathbb{R}} (1+|t|^2)^{\gamma}\, \phi(t)^2\, dt < \infty$;

(b) the Fourier transform $\hat\phi$ does not have real zeros.

Every function satisfying these two properties can be used to generate a bi-infinite positive definite matrix, as the following theorem claims.

Theorem 3.2. Let the function $\phi$ satisfy the above properties and consider the sampling points $t_j = \alpha j$, $j\in\mathbb{Z}$, for some $\alpha > 0$. Then the Gram matrix with elements
$$(A^\infty_\phi)_{ij} = k(t_i, t_j) = \int_{\mathbb{R}} \phi(t - t_i)\,\phi(t - t_j)\, dt, \quad i,j\in\mathbb{Z},$$

is a positive definite Toeplitz matrix which is bounded on $\ell^2(\mathbb{Z})$. Further, its symbol
$$A_\phi(z;\alpha) = \sum_{j\in\mathbb{Z}} z^j \int_{\mathbb{R}} \phi(t)\,\phi(t + \alpha j)\, dt = \sum_{j\in\mathbb{Z}} z^j\, \kappa(\alpha j), \quad |z| = 1,$$
satisfies the Wiener condition
$$\sum_{j\in\mathbb{Z}} |\kappa(\alpha j)| < \infty,$$
and the condition number of the Toeplitz matrix equals
$$\frac{\max_{|z|=1} A_\phi(z;\alpha)}{\min_{|z|=1} A_\phi(z;\alpha)}.$$

The proof of this theorem can be found in [18, Theorems 3.1 and 3.2].

Let us now give some examples of bi-infinite positive definite Toeplitz matrices generated by functions having Gaussian, exponential and algebraic decay, respectively. We consider $\alpha = 1$, that is, $t_i = i$, $i\in\mathbb{Z}$.

Gaussian decay. For $\phi_\sigma(t) = e^{-\sigma t^2}$, $\sigma > 0$, the Gram matrix generated as specified above is the following positive definite Toeplitz matrix:
$$(A^\infty_\phi)_{ij} = \sqrt{\frac{\pi}{2\sigma}}\, e^{-\frac{\sigma}{2}(i-j)^2}, \quad i,j\in\mathbb{Z},$$


whose corresponding symbol is
$$A_\phi(z) = c_\sigma \prod_{j=1}^{\infty} \Bigl[\bigl(1 + e^{-(j-\frac12)\sigma} z\bigr)\bigl(1 + e^{-(j-\frac12)\sigma} z^{-1}\bigr)\Bigr],$$
where $z = e^{i\theta}$ and
$$c_\sigma = \sqrt{\frac{\pi}{2\sigma}}\, \prod_{j=1}^{\infty} \bigl(1 - e^{-j\sigma}\bigr).$$

The condition number of $A^\infty_\phi$ is then
$$\mathrm{cond}(A^\infty_\phi) = \frac{A_\phi(1)}{A_\phi(-1)} = \left(\prod_{j=1}^{\infty} \frac{1 + e^{-(j-\frac12)\sigma}}{1 - e^{-(j-\frac12)\sigma}}\right)^{2},$$
which is strictly decreasing from $\infty$ to 1 as $\sigma$ goes from zero to $\infty$, so that for any chosen $\mu > 1$ there is a unique value of $\sigma$ for which $\mathrm{cond}(A^\infty_\phi) = \mu$.

Exponential decay. Let $\phi_\sigma(t) = e^{-\sigma|t|}$. Then we have

$$(A^\infty_\phi)_{ij} = \frac{1 + \sigma|i-j|}{\sigma}\, e^{-\sigma|i-j|}, \quad i,j\in\mathbb{Z},$$
and
$$A_\phi(z) = \frac{p(\sigma) + q(\sigma)(z + z^{-1})}{(1 - z e^{-\sigma})^2 (1 - z^{-1} e^{-\sigma})^2},$$
where $z = e^{i\theta}$ and
$$p(\sigma) = \frac{1}{\sigma}\bigl(1 - e^{-4\sigma}\bigr) - 4 e^{-2\sigma}, \qquad
q(\sigma) = \Bigl(1 + \frac{1}{\sigma}\Bigr) e^{-3\sigma} + \Bigl(1 - \frac{1}{\sigma}\Bigr) e^{-\sigma}.$$

The condition number is
$$\mathrm{cond}(A^\infty_\phi) = \frac{A_\phi(1)}{A_\phi(-1)} = \frac{p(\sigma) + 2q(\sigma)}{p(\sigma) - 2q(\sigma)}\left(\frac{1 + e^{-\sigma}}{1 - e^{-\sigma}}\right)^{4}.$$

As in the Gaussian case, $\mathrm{cond}(A^\infty_\phi)$ strictly decreases from $\infty$ to 1 as $\sigma$ increases from zero to $\infty$.

Algebraic decay. Let $\phi_\sigma(t) = \dfrac{1}{(\sigma^2 + t^2)^2}$ for $\sigma > 0$. Then

$$(A^\infty_\phi)_{ij} = \frac{\pi}{8\sigma^3}\left[\frac{1}{(4\sigma^2 + |i-j|^2)^2} + \frac{4\sigma^2}{(4\sigma^2 + |i-j|^2)^3}\right], \quad i,j\in\mathbb{Z},$$

$$A_\phi(e^{i\theta}) = \left(\frac{\pi}{8\sigma^3 \sinh(2\pi\sigma)}\right)^{2}\left[F_2(\sigma,\theta) + \frac{1}{4\sinh(2\pi\sigma)}\, F_3(\sigma,\theta)\right]$$

and
$$\mathrm{cond}(A^\infty_\phi) = \frac{F_2(\sigma,0) + \dfrac{F_3(\sigma,0)}{4\sinh(2\pi\sigma)}}{F_2(\sigma,\pi) + \dfrac{F_3(\sigma,\pi)}{4\sinh(2\pi\sigma)}},$$


where
$$F_2(\sigma,\theta) = \cosh(2(\pi-\theta)\sigma)\sinh(2\pi\sigma) + 2\sigma\bigl[\pi\cosh(2\sigma\theta) + \theta\sinh(2(\pi-\theta)\sigma)\sinh(2\pi\sigma)\bigr],$$
$$\begin{aligned}
F_3(\sigma,\theta) ={}& \bigl[3 - 4\theta(\pi-\theta)\sigma^2\bigr]\cosh(2(\pi-\theta)\sigma)\sinh^2(2\pi\sigma)\\
&+ 4\pi\sigma\cosh(2\sigma\theta)\sinh(2\pi\sigma)\\
&+ 2(3\theta-\pi)\sigma\sinh(2(\pi-\theta)\sigma)\sinh^2(2\pi\sigma)\\
&+ 2\pi\sigma\cosh(2(\pi-\theta)\sigma)\cosh(2\pi\sigma)\sinh(2\pi\sigma)\\
&+ 8\pi^2\sigma^2\cosh(2\sigma\theta)\cosh(2\pi\sigma)\\
&+ 4\pi^2\sigma^2\theta\sinh(2(\pi-\theta)\sigma)\cosh(2\pi\sigma)\sinh(2\pi\sigma)\\
&- 4\pi\sigma^2\theta\sinh(2\sigma\theta)\sinh(2\pi\sigma).
\end{aligned}$$

In this case, $\mathrm{cond}(A^\infty_\phi)$ strictly increases from 1 to $\infty$ as $\sigma$ increases from zero to $\infty$.
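As a sketch of how such test matrices can be produced in practice, the snippet below generates the first column of the Gaussian-decay Gram matrix and evaluates its asymptotic condition number from the infinite product given above. The helper names are ours, not from the papers [18, 19].

```python
import numpy as np

def gaussian_toeplitz_column(N, sigma):
    """First column (= first row) of the symmetric Toeplitz section A^N_phi with
    (A_phi)_{ij} = sqrt(pi/(2*sigma)) * exp(-sigma*(i-j)**2 / 2)."""
    d = np.arange(N)
    return np.sqrt(np.pi / (2 * sigma)) * np.exp(-sigma * d**2 / 2)

def gaussian_condition_number(sigma, terms=500):
    """cond(A^inf_phi) = ( prod_{j>=1} (1+e^{-(j-1/2)sigma}) / (1-e^{-(j-1/2)sigma}) )^2."""
    j = np.arange(1, terms + 1)
    e = np.exp(-(j - 0.5) * sigma)
    return np.prod((1 + e) / (1 - e)) ** 2

# For sigma = 0.2 this gives mu of the order of 10^10, in line with the value
# mu ~ 3*10^10 quoted for the Gaussian test problems in Section 4.
print(gaussian_condition_number(0.2))
```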

4. Numerical results

In order to assess the effectiveness of the proposed method, we carried out extensive experimentation using matrices generated by truncation of the semi-infinite matrices introduced in the previous section. In every case $\mathrm{cond}(A^\infty_\phi)$ is an upper bound of the condition numbers of these truncations. Further, we note that our experiments concern matrices whose entries exhibit a Gaussian, exponential or algebraic decay away from the main diagonal.

In our numerical experiments, for each test matrix $A^N_\phi$, we consider a set of sample solutions $x^N$ and generate the corresponding data vector $b^N = A^N_\phi x^N$.

The error values quoted in each table are the relative errors

$$E_r(N) = \frac{\|x^N - (x^N)^{(n)}\|}{\|x^N\|},$$

where $n$ is the recovery parameter, chosen by the heuristic criterion described in Section 2.

The results quoted in Tables 1–6 have been computed by our embedding method (EM) and by the preconditioned conjugate gradient (PCG) method with the Chan (PCG/Chan) [6], Strang (PCG/Strang) [8] and Tyrtyshnikov (PCG/Tyrt) [22] preconditioners. The solutions pertaining to the PCG method have been obtained by performing 100 iterations.

The computations were performed in double precision with Matlab ver. 6.5 [17] running under Linux on an AMD Athlon 64 3200+ processor with 1.5 GByte RAM. We remark that while the total computing time for solving a system of order $2^{17}$ by our method is about 10 sec., 100 iterations of the preconditioned conjugate gradient method usually take about 110 sec. In either algorithm all Toeplitz matrix products and inversions of circulant preconditioners have been implemented, as


usual, by means of the Fast Fourier Transform (FFT) in order to optimize their computational complexity.
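For readers who want to reproduce this setup, the following is a minimal sketch of the two standard FFT-based ingredients mentioned here: the Toeplitz matrix–vector product via circulant embedding and the eigenvalues of Strang's circulant preconditioner. The code is our own illustration of these textbook constructions, not the authors' Matlab implementation.

```python
import numpy as np

def toeplitz_matvec(col, row, x):
    """Multiply the N x N Toeplitz matrix with first column `col` and first row `row`
    by `x` in O(N log N) time, by embedding it in a circulant of order 2N."""
    N = len(x)
    c = np.concatenate([col, [0.0], row[:0:-1]])          # first column of the 2N-circulant
    y = np.fft.ifft(np.fft.fft(c) * np.fft.fft(np.concatenate([x, np.zeros(N)])))
    return y[:N].real if np.isrealobj(col) and np.isrealobj(x) else y[:N]

def strang_circulant_eigs(col, row):
    """Eigenvalues of Strang's circulant preconditioner [8]: the circulant copies the
    central diagonals of A_N and wraps the remaining ones around."""
    N = len(col)
    c = col.astype(complex).copy()
    k = np.arange(N // 2 + 1, N)
    c[k] = row[N - k]                                     # wrap-around entries a_{k-N}
    return np.fft.fft(c)                                  # C^{-1} acts by division in Fourier space
```

In a PCG iteration one would apply $A_N$ with `toeplitz_matvec` and apply the preconditioner by dividing the FFT of the residual by these eigenvalues and transforming back.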

In each table, as throughout the paper, $\sigma$ denotes the parameter that identifies the function $\phi$ and the corresponding matrix $A^\infty_\phi$, and $\mu$ is the upper bound of $\mathrm{cond}(A^N_\phi)$.

Table 1. Algebraic decay (N = 2^20; σ = 4.5, µ ≈ 6.6·10^9)

    x^N_i                  n (EM)   Er (EM)     Er (Chan)   Er (Strang)   Er (Tyrt)
    1                        340    3.7·10^-3   1.2·10^-3   3.2·10^-7     4.5·10^-5
    sin(π(i+1)/(N+1))        200    5.4·10^-7   2.1·10^-7   2.2·10^-7     1.1·10^-9
    1/(i+1)^2                320    9.6·10^-8   2.6·10^-2   7.5·10^-8     6.1·10^-1
    (-1)^(i+1)/(i+1)         340    2.5·10^-7   5.2·10^-3   3.0·10^-8     1.0·10^0
    (1-10^-4)^(i+1)           40    3.0·10^-5   8.3·10^-3   3.0·10^-7     5.5·10^-3
    (1-10^-1)^(i+1)           60    2.2·10^-7   1.3·10^-1   2.0·10^-7     1.8·10^-1

Table 2. Algebraic decay (N = 2^20; σ = 5, µ ≈ 10^11)

    x^N_i                  n (EM)   Er (EM)     Er (Chan)   Er (Strang)   Er (Tyrt)
    1                        240    3.9·10^-3   2.2·10^-2   4.6·10^-6     4.9·10^-5
    sin(π(i+1)/(N+1))        200    9.3·10^-5   1.2·10^-6   4.7·10^-6     1.4·10^-9
    1/(i+1)^2                160    6.8·10^-5   4.4·10^-1   1.4·10^-6     7.1·10^-1
    (-1)^(i+1)/(i+1)         300    3.4·10^-6   2.1·10^-1   4.8·10^-7     1.1·10^0
    (1-10^-4)^(i+1)          200    2.7·10^-2   2.1·10^-1   4.9·10^-6     6.5·10^-3
    (1-10^-1)^(i+1)          120    1.9·10^-5   2.9·10^0    3.5·10^-6     2.1·10^-1

Table 3. Exponential decay (N = 2^20; σ = 0.025, µ ≈ 10^8)

    x^N_i                  n (EM)   Er (EM)     Er (Chan)   Er (Strang)   Er (Tyrt)
    1                        240    7.5·10^-6   8.6·10^-9   8.0·10^-9     1.3·10^-4
    sin(π(i+1)/(N+1))        120    7.6·10^-8   9.5·10^-9   9.1·10^-9     4.0·10^-8
    1/(i+1)^2                 80    6.7·10^-9   9.3·10^-10  9.2·10^-10    6.9·10^-1
    (-1)^(i+1)/(i+1)         100    2.6·10^-9   3.4·10^-10  3.3·10^-10    9.6·10^-1
    (1-10^-4)^(i+1)          140    1.7·10^-6   1.0·10^-8   1.0·10^-8     6.0·10^-3
    (1-10^-1)^(i+1)          180    1.7·10^-8   2.6·10^-9   2.7·10^-9     2.0·10^-1

The numerical results shown in each table have been obtained by considering the matrices introduced in Section 3, for different values of the parameter $\sigma$ on


Table 4. Exponential decay (N = 2^20; σ = 0.005, µ ≈ 8·10^10)

    x^N_i                  n (EM)   Er (EM)     Er (Chan)   Er (Strang)   Er (Tyrt)
    1                        100    1.0·10^0    7.7·10^-6   5.4·10^-6     1.9·10^-4
    sin(π(i+1)/(N+1))        100    1.3·10^-3   6.3·10^-6   6.3·10^-6     2.7·10^-7
    1/(i+1)^2                 60    4.0·10^-7   2.8·10^-7   2.7·10^-7     9.6·10^-1
    (-1)^(i+1)/(i+1)         100    1.1·10^-6   2.3·10^-7   9.5·10^-8     1.0·10^0
    (1-10^-4)^(i+1)           40    1.8·10^-3   2.4·10^-5   1.0·10^-6     1.6·10^-2
    (1-10^-1)^(i+1)          260    1.0·10^-6   1.0·10^-6   8.0·10^-7     5.6·10^-1

Table 5. Gaussian decay (N = 2^20; σ = 0.2, µ ≈ 3·10^10)

    x^N_i                  n (EM)   Er (EM)     Er (Chan)   Er (Strang)   Er (Tyrt)
    1                        360    1.1·10^-6   7.9·10^-3   8.5·10^-7     3.1·10^-5
    sin(π(i+1)/(N+1))        280    8.6·10^-7   3.3·10^-7   7.5·10^-7     3.6·10^-10
    1/(i+1)^2                320    4.2·10^-7   5.9·10^-1   3.1·10^-7     5.1·10^-1
    (-1)^(i+1)/(i+1)         380    4.8·10^-7   1.4·10^-1   1.2·10^-7     1.0·10^0
    (1-10^-4)^(i+1)          320    1.4·10^-6   7.2·10^-2   1.1·10^-6     4.5·10^-3
    (1-10^-1)^(i+1)          120    6.1·10^-7   1.5·10^0    6.2·10^-7     1.5·10^-1

Table 6. Gaussian decay (N = 2^20; σ = 0.17, µ ≈ 2·10^12)

    x^N_i                  n (EM)   Er (EM)     Er (Chan)   Er (Strang)   Er (Tyrt)
    1                        360    1.6·10^-4   1.9·10^-2   4.8·10^-5     3.3·10^-5
    sin(π(i+1)/(N+1))        280    5.5·10^-5   6.4·10^-7   6.0·10^-5     4.4·10^-10
    1/(i+1)^2                240    3.1·10^-5   1.2·10^0    3.6·10^-4     5.9·10^-1
    (-1)^(i+1)/(i+1)         320    3.5·10^-5   2.6·10^-1   8.2·10^-6     1.0·10^0
    (1-10^-4)^(i+1)          380    8.4·10^-4   2.0·10^-1   8.7·10^-5     5.3·10^-3
    (1-10^-1)^(i+1)          260    5.0·10^-5   4.2·10^0    9.9·10^-5     1.7·10^-1

which they depend. For each matrix, the following sample solutions have been considered:
$$x^N_i = 1, \qquad x^N_i = \sin\frac{\pi(i+1)}{N+1},$$
$$x^N_i = \frac{1}{(i+1)^2}, \qquad x^N_i = \frac{(-1)^{i+1}}{i+1},$$
$$x^N_i = (1 - 10^{-4})^{i+1}, \qquad x^N_i = (1 - 10^{-1})^{i+1},$$
with $i = 0, 1, \dots, N-1$.


In each table we have reported the relative error $E_r(N)$, with respect to the variation of the dimension $N$ of the linear system, together with the value of the parameter $\sigma$ and the asymptotic condition number $\mu$ of the corresponding matrix.

For the sake of clarity, we make some specific comments pertaining to each class of matrices.

1. Algebraic decay. In this case both our method and the Strang preconditioning technique give reasonable accuracy, also for large values of the condition number of the matrix (Tables 1 and 2). The performance of the Chan preconditioner is not as good, especially in Table 2. Furthermore, it totally fails in one case, as does the Tyrtyshnikov preconditioner. On the whole, both of these preconditioners deliver limited performance on this test problem.

2. Exponential decay. If the condition number is moderately large, $\mu \approx 10^8$ say, our results, as well as those obtained with the Strang and Chan preconditioners, are very good, whereas those generated by the Tyrtyshnikov preconditioner are not always as good (Table 3). We note that to reach a good accuracy in Table 3 by using the Tyrtyshnikov preconditioner we need about 1000 iterations of the PCG method, which take roughly 200 times the computation time of our method.

If the condition number is about $10^{10}$, our method fails in the first example and gives good results in all the other examples, while we have very good results using both the Strang and Chan preconditioners (Table 4). The Tyrtyshnikov preconditioner fails in one case and gives poor results in two other cases. In these cases we need more than 1000 iterations to obtain acceptable results. In particular, the relative errors decrease to about $10^{-3}$ in the last example when 5000 iterations are considered.

3. Gaussian decay. All of the methods are quite effective if the condition number is moderately large, $\mu \le 10^8$ say, though our method is very effective and the Strang and Chan preconditioners perform better than the Tyrtyshnikov preconditioner (Table 5).

If the condition number is much larger (Table 6), our method still gives very good results, as does the Strang preconditioner, whereas the Chan and Tyrtyshnikov preconditioners sometimes fail or give poor results. When they fail, we did not obtain acceptable results even when considering $10^4$ iterations.

The results shown in Tables 1–6 may suggest that the performance of the Tyrtyshnikov preconditioner is always unsatisfactory. However, our numerical experiments show that there are situations in which the Tyrtyshnikov preconditioner proves to be comparable with the other two preconditioning techniques on the test problems considered. This holds true, in particular, when the dimension of the linear system is considerably smaller than in the previous tables; a typical example is displayed in Table 7, where $N = 2^{10}$. We feel that the poorer performance of this preconditioner for larger dimensions is due to rounding error propagation, caused by the larger complexity of the algorithm for its computation with respect to the Strang and Chan preconditioners.


Table 7. Exponential decay (N = 2^10; σ = 0.005, µ ≈ 7.6·10^10)

    x^N_i                  Er (Chan)   Er (Strang)   Er (Tyrt)
    1                      6.1·10^-2   1.1·10^0      4.0·10^-5
    sin(π(i+1)/(N+1))      6.4·10^-2   1.0·10^0      1.8·10^-5
    1/(i+1)^2              7.8·10^-2   7.2·10^-1     6.7·10^-1
    (-1)^(i+1)/(i+1)       3.7·10^-1   9.6·10^-1     9.4·10^-1
    (1-10^-4)^(i+1)        7.7·10^-2   1.4·10^0      6.4·10^-5
    (1-10^-1)^(i+1)        1.0·10^-1   1.5·10^-1     6.0·10^-2

Conclusions

Our numerical results show that our method is reliable if the order of the system is large enough. Furthermore, we generally obtain good results also for moderately ill-conditioned matrices, while the computational effort of our method is lower than that of the preconditioned conjugate gradient method.

Figure 7. Relative error $E_r(N)$ (EM: solid line, PCG/Chan: dashed line; Algebraic decay, $\sigma = 5$, $\mu \approx 10^{11}$, $x^N_i = 1$).

As our method is generated by the resolvent formula for semi-infinite systems, the results improve as $N$ increases, whereas this conclusion certainly does not hold for the iterative methods, even when preconditioned. More precisely, in our method $E_r(N)$ decreases as $N$ increases, whereas this does not always happen for the other methods we tested. In order to give a graphical idea of this fact, in Figure 7 we plotted $E_r(N)$ for $2^{13} \le N \le 2^{20}$, obtained by applying our method (solid line) and the PCG/Chan method (dashed line) to one of the test problems considered (see Section 3).


In conclusion, for large Toeplitz systems our method is comparable with the best PCG methods in terms of accuracy and superior in terms of computation time.

Acknowledgments

The authors are greatly indebted to their friend C.V.M. van der Mee for his help and advice in writing the paper. They also express their gratitude to the referees for their useful comments, which allowed them to improve the paper.

References

[1] F.L. Bauer. Ein direktes Iterationsverfahren zur Hurwitz-Zerlegung eines Polynoms. Arch. Elektr. Übertragung, 9:285–290, 1955.

[2] F.L. Bauer. Beiträge zur Entwicklung numerischer Verfahren für programmgesteuerte Rechenanlagen, II. Direkte Faktorisierung eines Polynoms. Sitz. Ber. Bayer. Akad. Wiss., pp. 163–203, 1956.

[3] B.P. Bogert, M.J.R. Healy, and J.W. Tukey. The frequency analysis of time series for echoes: cepstrum, pseudo-autocovariance, cross-cepstrum and saphe cracking. In: M. Rosenblatt (ed.), Proc. Symposium Time Series Analysis, John Wiley, New York, 1963, pp. 209–243.

[4] A. Böttcher and B. Silbermann. Operator-valued Szegő-Widom theorems. In: E.L. Basor and I. Gohberg (eds.), Toeplitz Operators and Related Topics, volume 71 of Operator Theory: Advances and Applications. Birkhäuser, Basel–Boston, 1994, pp. 33–53.

[5] A. Calderón, F. Spitzer, and H. Widom. Inversion of Toeplitz matrices. Illinois J. Math., 3:490–498, 1959.

[6] T.F. Chan. An optimal circulant preconditioner for Toeplitz systems. SIAM J. Sci. Stat. Comput., 9(4):766–771, 1988.

[7] R.H. Chan and M.K. Ng. Conjugate gradient methods for Toeplitz systems. SIAM Review, 38(3):297–386, 1996.

[8] R. Chan and G. Strang. Toeplitz equations by conjugate gradients with circulant preconditioner. SIAM J. Sci. Stat. Comput., 10(1):104–119, 1989.

[9] I.C. Gohberg and I.A. Feldman. Convolution Equations and Projection Methods for their Solution, volume 41 of Transl. Math. Monographs. Amer. Math. Soc., Providence, RI, 1974.

[10] I. Gohberg, S. Goldberg, and M.A. Kaashoek. Classes of Linear Operators, Vol. II, volume 63 of Operator Theory: Advances and Applications. Birkhäuser, Basel–Boston, 1993.

[11] I. Gohberg and M.A. Kaashoek. Projection method for block Toeplitz operators with operator-valued symbols. In: E.L. Basor and I. Gohberg (eds.), Toeplitz Operators and Related Topics, volume 71 of Operator Theory: Advances and Applications. Birkhäuser, Basel–Boston, 1994, pp. 79–104.

[12] T.N.T. Goodman, C.A. Micchelli, G. Rodriguez, and S. Seatzu. Spectral factorization of Laurent polynomials. Adv. Comput. Math., 7:429–454, 1997.


[13] G. Heinig and K. Rost. Algebraic Methods for Toeplitz-like Matrices and Operators, volume 13 of Operator Theory: Advances and Applications. Birkhäuser, Basel–Boston, 1984.

[14] T. Kailath and A.H. Sayed. Displacement structure: theory and applications. SIAM Review, 37(3):297–386, 1996.

[15] T. Kailath and A.H. Sayed. Fast Reliable Algorithms for Matrices with Structure. SIAM, Philadelphia, PA, 1999.

[16] M.G. Krein. Integral equations on the half-line with kernel depending upon the difference of the arguments. Uspehi Mat. Nauk, 13(5):3–120, 1958 (Russian; translated in AMS Translations, 22:163–288, 1962).

[17] The MathWorks Inc. Matlab ver. 6.5. Natick, MA, 2002.

[18] C.V.M. van der Mee, M.Z. Nashed, and S. Seatzu. Sampling expansions and interpolation in unitarily translation invariant reproducing kernel Hilbert spaces. Adv. Comput. Math., 19(4):355–372, 2003.

[19] C.V.M. van der Mee and S. Seatzu. A method for generating infinite positive definite self-adjoint test matrices and Riesz bases. SIAM J. Matrix Anal. Appl. (to appear).

[20] A.V. Oppenheim and R.W. Schafer. Discrete-Time Signal Processing. Prentice Hall Signal Processing Series. Prentice Hall, Englewood Cliffs, NJ, 1989.

[21] M. Tismenetsky. A decomposition of Toeplitz matrices and optimal circulant preconditioning. Linear Algebra Appl., 154–156:105–121, 1991.

[22] E.E. Tyrtyshnikov. Optimal and superoptimal circulant preconditioners. SIAM J. Matrix Anal. Appl., 13:459–473, 1992.

[23] G. Wilson. Factorization of the covariance generating function of a pure moving average process. SIAM J. Numer. Anal., 6(1):1–7, 1969.

G. Rodriguez, S. Seatzu and D. Theis
Dipartimento di Matematica e Informatica
Università degli Studi di Cagliari
viale Merello 92
I-09123 Cagliari, Italy
e-mail: [email protected], [email protected], [email protected]


Operator Theory: Advances and Applications, Vol. 160, 403–411
© 2005 Birkhäuser Verlag Basel/Switzerland

Fredholm Theory and Numerical Linear Algebra

Bernd Silbermann

Dedicated to Israel Gohberg on the occasion of his seventy-fifth birthday

Abstract. It is shown that for every bounded linear operator A on a separable Hilbert space, the finite sections of A*A reflect perfectly the Fredholm properties of this operator. A few applications are briefly discussed.

Mathematics Subject Classification (2000). 47B35.

1. Introduction

Given a Hilbert space $H$ we denote by $B(H)$ the $C^*$-algebra of all bounded linear operators acting on $H$ and let $K(H) \subset B(H)$ be the ideal of all compact operators. The strong limit of a strongly converging sequence $(A_n) \subset B(H)$ will be denoted by $\text{s-lim}\, A_n$. We shall use without further comment that strong convergence on compact subsets is uniform and that strongly convergent sequences $(A_n) \subset B(H)$ are uniformly bounded (Banach-Steinhaus Theorem). Suppose we are given a sequence $(P_n) \subset B(H)$ of finite rank orthogonal projections such that $(P_n)$ converges strongly to the identity operator $I$ (thus, $H$ is separable, and for every separable Hilbert space there exist such sequences) and consider the sequence $(P_n A P_n)$, where $A \in B(H)$ is fixed and $P_n A P_n : \operatorname{im} P_n \to \operatorname{im} P_n$. Clearly, one can identify $P_n A P_n$ with an $l_n \times l_n$ matrix, where $l_n = \dim \operatorname{im} P_n$. The paper is organized as follows.

Section 2 is devoted to the study of the asymptotic behavior of the singular values associated to the operators (matrices) $P_n A P_n$. It will be shown that for so-called Fredholm sequences the singular values have remarkable behavior: they are subject to the finite splitting property. The result of Section 2 will then be used in Section 3 to show that the Fredholm properties of an operator $A \in B(H)$ are perfectly reflected in the asymptotic behavior of the sequences $(P_n A^* A P_n)$ and $(P_n A A^* P_n)$. A few examples are mentioned in Section 4.

Page 407: Recent Advances in Operator Theory and its Applications: The Israel Gohberg Anniversary Volume

404 B. Silbermann

2. The finite splitting property

Let $H$, $(P_n)$ be as in the Introduction. Form the collection $\mathcal{E}$ of all bounded sequences $(A_n)$ with $A_n : \operatorname{im} P_n \to \operatorname{im} P_n$ where both sequences $(A_n P_n)$, $(A_n^* P_n)$ converge strongly (this implies $(\text{s-lim}\, A_n P_n)^* = \text{s-lim}\, A_n^* P_n$). By defining the algebraic operations and the involution componentwise, and the norm by
$$\|(A_n)\| := \sup_n \|A_n P_n\|,$$
$\mathcal{E}$ actually becomes a unital $C^*$-algebra containing the closed and two-sided ideals
$$\mathcal{G} := \{(A_n)\in\mathcal{E} : \|A_n P_n\| \to 0 \text{ as } n\to\infty\}, \qquad
\mathcal{J} := \{(A_n)\in\mathcal{E} : A_n = P_n k P_n + C_n,\ k\in K(H),\ (C_n)\in\mathcal{G}\}.$$

Definition 1. A sequence $(A_n)\in\mathcal{E}$ is said to be Fredholm if the coset $(A_n)+\mathcal{J}$ is invertible in $\mathcal{E}/\mathcal{J}$. This definition is justified by the circumstance that the Fredholmness of $(A_n)$ implies the Fredholmness of $\text{s-lim}\, A_n P_n$. Moreover, the Fredholmness of a sequence is stable under perturbations which are small or belong to $\mathcal{J}$.

Notice that there are more general and involved notions of Fredholm sequences (see [7], Chapter 6). Theorem 1 below indicates that Fredholm sequences in the sense of Definition 1 are also Fredholm in more general contexts. The reverse is however not true. An instructive example will be presented later on. For technical reasons we need

Definition 2. A sequence $(A_n)\in\mathcal{E}$ is called stable if the operators $A_n$ are invertible for $n$ large enough, say for $n \ge n_0$, and if $\sup_{n\ge n_0} \|A_n^{-1} P_n\| < \infty$.

Notice that $(A_n)\in\mathcal{E}$ is stable if and only if the coset $(A_n)+\mathcal{G}$ is invertible in $\mathcal{E}/\mathcal{G}$. If $(A_n)\in\mathcal{E}$ is stable, then $\text{s-lim}_{n\to\infty}\, A_n P_n =: A$ is invertible and $\text{s-lim}_{n\to\infty}\, A_n^{-1} P_n$ exists and equals $A^{-1}$. The following proposition is well known (see [7], Theorem 1.20).

Proposition 1. If $(A_n)\in\mathcal{E}$ is Fredholm and $\text{s-lim}\, A_n P_n$ is invertible, then $(A_n)$ is stable.

Let (An) ∈ E be arbitrary. We order the singular values of An as follows:

$$0 \le s_1(A_n) \le s_2(A_n) \le \cdots \le s_{l_n}(A_n).$$
For the sake of convenience, let us also put $s_0(A_n) = 0$.

Definition 3. We say that $(A_n)\in\mathcal{E}$ owns the finite splitting property ($k$-splitting property) if there is a $k$ such that
$$\lim_{n\to\infty} s_k(A_n) = 0,$$
while the remaining $l_n - k$ singular values stay away from zero, that is,
$$s_{k+1}(A_n) \ge \delta > 0$$
for $n$ large enough. $k$ is also called the splitting number.


Remark 1. If $(A_n)\in\mathcal{E}$ is subject to the finite splitting property and $k = 0$, then $(A_n)$ is stable.

The following theorem is probably new; for so-called standard models related results were proved in [11] (see also [7]).

Theorem 1. Let $(A_n)\in\mathcal{E}$ be Fredholm. Then $(A_n)$ is subject to the finite splitting property, and the splitting number $k$ equals
$$k = \dim\ker\,(\text{s-lim}\, A_n P_n).$$

Proof. We shall make use of the following alternative description of the singular values (see [6], Theorem 2.1):
$$s_j(A_n) := \min_{B\in\mathcal{F}^{l_n}_{l_n-j}} \|A_n - B\|,$$
where $\mathcal{F}^{l_n}_m$ denotes the collection of all $l_n\times l_n$ matrices of rank at most $m$. Let $R_n$ be the orthoprojection onto $\operatorname{im} P_n P_{\ker A} P_n$. From Lemma 6.21 in [7] and its proof it follows that
$$\operatorname{im} R_n = \operatorname{im} P_n P_{\ker A} P_n,$$
$\operatorname{rank} R_n = \operatorname{rank} P_n P_{\ker A} P_n = \operatorname{rank} P_{\ker A} = \dim\ker A =: k$ for $n$ large enough, and
$$\|R_n - P_n P_{\ker A} P_n\| \to 0 \quad\text{as } n\to\infty.$$

Consequently, $\|A_n R_n\| \to 0$ as $n\to\infty$, and $(A_n R_n)\in\mathcal{G}$. Consider the sequence $(B_n)\in\mathcal{E}$, $B_n := A_n^* A_n (P_n - R_n) + P_n P_{\ker A} P_n$. Obviously, this sequence is also Fredholm and $\text{s-lim}\, B_n P_n = A^*A + P_{\ker A}$ is invertible. Then $(B_n)$ is stable by Proposition 1. Since $\operatorname{rank}(P_n - R_n) = l_n - k$ we get for $n$ large enough
$$s_k(A_n) \le \bigl\|\bigl(A_n - A_n A_n^* A_n (P_n - R_n) B_n^{-1}\bigr) P_n\bigr\|
\le \bigl\|\bigl(A_n B_n - A_n A_n^* A_n (P_n - R_n)\bigr) P_n\bigr\|\, \|B_n^{-1} P_n\|
\le \|B_n^{-1} P_n\|\, \|A_n P_n P_{\ker A} P_n\|.$$
Since $(B_n)$ is stable, there exists for $n$ large enough a constant $C$ with $\|B_n^{-1} P_n\| \le C$. Thus we have

Now consider sk+1(An). By using the well-known inequality sk+1(A∗nAn) ≤

sk+1(An)||A∗n|| and that ||A∗

n|| is bounded away from zero for n large enough (recallthat A∗

nPn converges strongly to A∗ = 0) it has to be shown that sk+1(A∗An) isbounded away from zero (n large enough). We have

sk+1 (A∗nAn) = min

B∈F lnln−k−1

|| (A∗nAn −B)Pn|| =

= minB∈F ln

ln−k−1

|| ((A∗nAn + PnPker APn)−B − PnPker APn)Pn||

≥ minB∈F ln

ln−1

|| ((A∗nAn + PnPker APn)−B)Pn|| =

= s1 (A∗nAn + PnPker APn) ≥ δ > 0

for n large enough since (A∗nAn + PnPker APn) is stable, and we are done.


Theorem 1 has remarkable consequences. One of them should be mentioned here and a further one in the next section. Recall that an element $a$ belonging to a $C^*$-algebra $\mathcal{A}$ is called Moore-Penrose invertible if there exists an element $b\in\mathcal{A}$ such that
$$aba = a, \quad bab = b, \quad (ab)^* = ab, \quad (ba)^* = ba.$$
If $a$ is Moore-Penrose invertible then there exists only one element $b$ with the properties cited above. This element is called the Moore-Penrose inverse of $a$ and is often denoted by $a^+$. It is well known that $A\in B(H)$ is Moore-Penrose invertible if and only if $A$ is normally solvable, that is, $\overline{\operatorname{im} A} = \operatorname{im} A$.

Theorem 2. Let $(A_n)\in\mathcal{E}$ be Fredholm and $A = \text{s-lim}\, A_n P_n$. Then $A_n^+ P_n \to A^+$ if and only if
$$\dim\ker A_n = \dim\ker A$$
for all $n$ large enough.

Proof. Recall that $B\in B(H)$ is Moore-Penrose invertible if and only if $d := \inf\bigl((\operatorname{sp} B^*B)\setminus\{0\}\bigr) > 0$ (Theorem 2.5 in [7]). In this case $d^{-1} = \|B^+\|$. Let now $A := \text{s-lim}\, A_n P_n$. This operator is Fredholm by assumption and Moore-Penrose invertible, as are the operators $A_n$.

If $A_n^+ P_n \to A^+$ strongly, then $\dim\ker A_n = \dim\ker A$ for $n$ large enough. Otherwise there would be a sequence $(n_k)$ such that $\dim\ker A_{n_k} < \dim\ker A$ (by Theorem 1 again), and this would imply that $\|A_{n_k}^+\| \to \infty$. Conversely, if $\dim\ker A_n = \dim\ker A$ for $n$ large enough, then $s_{k+1}(A_n) = \|A_n^+\|^{-1}$ is bounded away from zero by Theorem 1. Hence, $(\|A_n^+\|)$ is bounded and therefore $\text{s-lim}\, A_n^+ P_n = A^+$ (by Theorem 2.12 in [7]).

Remark 2. The well-known results about the continuity of the Moore-Penrose inversion for matrices are an immediate corollary of the last theorem.

Remark 3. If $(A_n)\in\mathcal{E}$ is Fredholm then $\operatorname{ind}\,(\text{s-lim}\, A_n P_n) = 0$.

This result follows immediately from [7]. It is easy to prove it directly. One only has to use that the matrices $A_n^* A_n$ and $A_n A_n^*$ are unitarily equivalent. Indeed, this gives that the splitting numbers of $(A_n)$ and $(A_n^*)$ coincide; as a consequence we get the claim.

Example 1. Consider the shift operator $V : l^2(\mathbb{N}) \to l^2(\mathbb{N})$ given by $V e_n = e_{n+1}$, where $(e_n)_{n\in\mathbb{N}}$ denotes the standard orthonormal basis in $l^2(\mathbb{N})$. Let $(P_n)$ denote the sequence of the orthoprojections that map $l^2(\mathbb{N})$ onto $\operatorname{span}\{e_1,\dots,e_n\}$. Then $(P_n V P_n)$ is Fredholm in the sense of Section 6.1.4 or even of Section 6.3.1 in [7], but not in the sense of Definition 1, because $\operatorname{ind} V = -1$. The reason for this unpleasant fact is the circumstance that the ideal $\mathcal{J}$ is too small. However, the Fredholmness of a sequence $(A_n)\in\mathcal{E}$ implies in any case the Fredholmness in the more general situations discussed in [7].


3. Classes of normally solvable operators and the finite splitting property

The collection of all normally solvable linear and bounded operators acting on $H$ with $\dim\ker A < \infty$ ($\dim\ker A^* < \infty$) will be denoted by $\mathcal{F}_+(H)$ ($\mathcal{F}_-(H)$). The elements in $\mathcal{F}_+(H)$ are called semi-Fredholm operators (and the intersection $\mathcal{F}(H) := \mathcal{F}_+(H) \cap \mathcal{F}_-(H)$ consists exactly of the set of all Fredholm operators, and for $A\in\mathcal{F}(H)$ the number $\operatorname{ind} A = \dim\ker A - \dim\ker A^*$ is well defined).

Theorem 3. $A\in\mathcal{F}_+(H)$ ($A\in\mathcal{F}_-(H)$) if and only if the sequence $(P_n A^* A P_n)$ ($(P_n A A^* P_n)$) is subject to the finite splitting property, and the splitting number $k$ equals $\dim\ker A$ ($\dim\ker A^*$).

Proof. Suppose that $A\in\mathcal{F}_+(H)$. Then it follows that $A^*A\in\mathcal{F}(H)$ and $A^*A + P_{\ker A}$ is an invertible and positive operator. Then $(P_n(A^*A + P_{\ker A})P_n)$ is stable (see [5], Chapter II, §2, or [7], Theorem 1.10), and $(P_n(A^*A)P_n)$ is Fredholm since $(P_n P_{\ker A} P_n)\in\mathcal{J}$. By Theorem 2 the if-part is proved. Conversely, if $A\notin\mathcal{F}_+(H)$ then $A^*A$ is not Fredholm, and Theorem 6.67 in [7] applies, which gives
$$s_l(A_n) \to 0 \quad\text{as } n\to\infty$$
for any $l\in\mathbb{N}$, and we are done.

The last theorem leads to an alternative description of the classes $\mathcal{F}_\pm(H)$ and $\mathcal{F}(H)$, including the determination of $\dim\ker A$ for $A\in\mathcal{F}_+(H)$ ($\dim\ker A^*$ for $A\in\mathcal{F}_-(H)$). Moreover, it allows one (at least theoretically) to answer the question whether a given complex number belongs to the spectrum or the essential spectrum of $A$, or whether it is an eigenvalue of finite multiplicity. Notice that these results are in sharp contrast to some investigations of Ben-Artzi [1], where selfadjoint band operators with at least 5 nonzero main diagonals were considered.

Example 2. Toeplitz operator with rational generating matrix-function. Consider the familiar Toeplitz operators $T(a) : l_2^2(\mathbb{N}) \to l_2^2(\mathbb{N})$ on the $\mathbb{C}^2$-valued $l^2$ space $l_2^2(\mathbb{N})$ with the following generating functions

$$a_1(t) = \begin{pmatrix} 2t^2 + 7t + 3 + \tfrac12 t^{-1} & \tfrac12 t^{-2}\\[2pt] t + 3 + t^{-1} & t^{-2}\end{pmatrix},
\qquad
a_2(t) = \begin{pmatrix} t^3 + 2t & 2t^{-2} + t\\[2pt] 3t^2 + 1 - 5t^{-1} & t^2 - 4\end{pmatrix}.$$

We have $\det a_1(t) \neq 0$ for all $t\in\mathbb{T}$, whereas $\det a_2(-1) = 0$. Then $T(a_1)$ is Fredholm and $T(a_2)$ is not even normally solvable (see [5], Chapter VIII). Below, the 10 smallest singular values are tabulated for $n = 50k$, $k = 1,\dots,10$, for each of the matrices $P_n T^*(a_i) T(a_i) P_n$, $i = 1, 2$. It is seen that for $i = 1$ the finite splitting property is in force with $k = 2$; for $i = 2$ the finite splitting property cannot be observed, which agrees with Theorem 3.


i = 1    n = 50    100    150    200    250    300    350    400    450    500

s10 0.7572 0.7453 0.7431 0.7423 0.7420 0.7418 0.7417 0.7416 0.7415 0.7415

s9 0.7535 0.7444 0.7427 0.7421 0.7418 0.7417 0.7416 0.7415 0.7415 0.7415

s8 0.7503 0.7436 0.7423 0.7419 0.7417 0.7416 0.7415 0.7415 0.7415 0.7414

s7 0.7475 0.7429 0.7420 0.7417 0.7416 0.7415 0.7415 0.7414 0.7414 0.7414

s6 0.7453 0.7423 0.7418 0.7416 0.7415 0.7415 0.7414 0.7414 0.7414 0.7414

s5 0.7436 0.7419 0.7416 0.7415 0.7414 0.7414 0.7414 0.7414 0.7414 0.7414

s4 0.7423 0.7416 0.7415 0.7414 0.7414 0.7414 0.7414 0.7414 0.7414 0.7414

s3 0.7416 0.7414 0.7414 0.7414 0.7414 0.7414 0.7414 0.7414 0.7414 0.7414

s2 0.0000 0.0000 0.0000 0.0000 0.0000 0.0000 0.0000 0.0000 0.0000 0.0000

s1 0.0000 0.0000 0.0000 0.0000 0.0000 0.0000 0.0000 0.0000 0.0000 0.0000

i = 2    n = 50    100    150    200    250    300    350    400    450    500

s10 0.8610 0.6781 0.6202 0.6001 0.5470 0.4595 0.3957 0.3474 0.3095 0.2790

s9 0.8603 0.6493 0.6029 0.5951 0.4837 0.4058 0.3493 0.3066 0.2731 0.2462

s8 0.7726 0.6448 0.6028 0.5192 0.4199 0.3520 0.3029 0.2658 0.2367 0.2134

s7 0.7720 0.6107 0.5753 0.4405 0.3556 0.2980 0.2564 0.2249 0.2004 0.1806

s6 0.6957 0.6098 0.4739 0.3611 0.2912 0.2439 0.2098 0.1841 0.1639 0.1478

s5 0.6518 0.5365 0.3699 0.2812 0.2266 0.1898 0.1632 0.1432 0.1275 0.1150

s4 0.6417 0.3867 0.2648 0.2010 0.1620 0.1356 0.1166 0.1023 0.0911 0.0821

s3 0.4316 0.2331 0.1591 0.1207 0.0972 0.0814 0.0700 0.0614 0.0547 0.0493

s2 0.1461 0.0778 0.0531 0.0402 0.0324 0.0271 0.0233 0.0205 0.0182 0.0164

s1 0.0009 0.0000 0.0000 0.0000 0.0000 0.0000 0.0000 0.0000 0.0000 0.0000
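A computation of this kind is easy to reproduce. The NumPy sketch below builds the finite sections $P_n T^*(a) T(a) P_n$ exactly (by using $n + w$ block rows, where $w$ is the bandwidth of the symbol) and returns their smallest singular values; the helper names are ours, and the coefficient list encodes the symbol $a_1$ from above.

```python
import numpy as np

def block_toeplitz(coeffs, nrows, ncols):
    """Rectangular block Toeplitz matrix with blocks a_{i-j} taken from the dict
    `coeffs`; missing indices are zero blocks."""
    p = next(iter(coeffs.values())).shape[0]
    T = np.zeros((p * nrows, p * ncols))
    for i in range(nrows):
        for j in range(ncols):
            blk = coeffs.get(i - j)
            if blk is not None:
                T[p * i:p * (i + 1), p * j:p * (j + 1)] = blk
    return T

def smallest_singular_values(coeffs, n, bandwidth, how_many=10):
    """Smallest singular values of P_n T(a)^* T(a) P_n; using n+bandwidth block rows
    makes the product exact, since a_k = 0 for |k| > bandwidth."""
    T = block_toeplitz(coeffs, n + bandwidth, n)
    s = np.linalg.svd(T.conj().T @ T, compute_uv=False)
    return s[::-1][:how_many]            # ascending order: s_1, s_2, ...

# Coefficients of the 2x2 Laurent polynomial a_1(t) of Example 2.
a1 = { 2: np.array([[2.0, 0.0], [0.0, 0.0]]),
       1: np.array([[7.0, 0.0], [1.0, 0.0]]),
       0: np.array([[3.0, 0.0], [3.0, 0.0]]),
      -1: np.array([[0.5, 0.0], [1.0, 0.0]]),
      -2: np.array([[0.0, 0.5], [0.0, 1.0]]) }
print(smallest_singular_values(a1, n=200, bandwidth=2))
# Expected behaviour, as in the first table: two values near zero and the rest
# bounded away from zero (splitting number k = 2).
```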

Quasidiagonal operators and their finite sections

The algebra $\mathcal{E}$ contains interesting subalgebras. We call a $C^*$-subalgebra $\mathcal{E}_0$ of $\mathcal{E}$ standard (or a standard model) if $\mathcal{J} \subset \mathcal{E}_0$ and if the invertibility of $\text{s-lim}\, A_n P_n$, $(A_n)\in\mathcal{E}_0$, already implies the stability of $(A_n)$ (for more general situations, see [7]). Every quasidiagonal operator gives rise to a standard subalgebra of $\mathcal{E}$.

Recall that a bounded linear operator $T$ acting on a separable (complex) Hilbert space is said to be quasidiagonal if there exists a sequence $(P_n)_{n\in\mathbb{N}}$ of finite rank orthogonal projections such that $\text{s-lim}\, P_n = I$ and which asymptotically commute with $T$, that is,
$$\|[T, P_n]\| := \|T P_n - P_n T\| \to 0 \quad\text{as } n\to\infty.$$

In particular, every selfadjoint or even normal operator is quasidiagonal, as are their perturbations by compact operators. However, it is by no means trivial to single out a related sequence $(P_n)$. For instance, for multiplication operators in periodic Sobolev spaces $H^\lambda$ related sequences can explicitly be given: these are orthogonal projections onto certain spline spaces (see [10], Section 2.12).

Let $T$ be quasidiagonal with respect to $(P_n) = (P_n)_{n\in\mathbb{N}}$ as given above. Let $C_{(P_n)}(T)$ denote the smallest closed $C^*$-subalgebra containing the sequence $(P_n T P_n)$ and the ideal $\mathcal{J}$.


Proposition 2. $C_{(P_n)}(T)$ actually forms a standard model. Moreover, $C_{(P_n)}(T)/\mathcal{G}$ is isometrically isomorphic to the smallest $C^*$-subalgebra $C(T)$ of $B(H)$ containing $T$ and all compact operators.

Sketch of proof. Suppose $\text{s-lim}\, A_n P_n =: A$ is invertible ($(A_n)\in C_{(P_n)}(T)$). Then $A^{-1}\in C(T)$, and since every element in $C(T)$ is quasidiagonal, $A^{-1}$ owns this property, and
$$\|P_n - P_n A P_n A^{-1} P_n\| = \|P_n A A^{-1} P_n - P_n A P_n A^{-1} P_n\| = \|P_n (P_n A - A P_n) A^{-1} P_n\| \to 0.$$
Hence, $(P_n A P_n)$ is stable. Now it is sufficient to prove that $(P_n A P_n) - (A_n) \in\mathcal{G}$. For this it is sufficient to show it for the special case $A_n = (P_n B_1 P_n)(P_n B_2 P_n)$. We have
$$\|P_n B_1 B_2 P_n - P_n B_1 P_n B_2 P_n\| = \|P_n (P_n B_1 - B_1 P_n) B_2 P_n\| \to 0$$

An immediate consequence of Remark 3 and Proposition 2 is the well-knownfact that the index of a Fredholm quasidiagonal operator is necessarily equal tozero. Now it is obvious that the theory of spectral approximations explained in [7]completely applies to the case at hand.

In the papers [3], [4] Nathaniel Brown proposed further refinements into twodirections: speed of convergence and how to choose the sequence (Pn) of ortho-projections in some special cases such as quasidiagonal unilateral band operators,bilateral band operators or operators in irrational rotation algebras.

We will now take up one problem considered already in [3], namely theweighted shift operator T : l2(N) → l2(N), acting by the rule Ten = αnen+1

where (αn)n∈N is a bounded sequence of complex numbers (the so-called weightsequence). In [3] there was assumed that the sequence (αn) possesses an infinitesubsequence (αnk

) tending to zero. This condition is equivalent to the quasidiag-onality of T . It is known that if T is any weighted shift and r(T ) is the spectralradius of T then sp T = λ ∈ C : |λ| ≤ ρ(T ), that is, the spectrum of T is aslarge as possible. In [3] there was proposed a simple proof of this result under thecondition that T is quasidiagonal.

We will give a simple proof in the general case; this proof is not based on thematerial before, but it is in its spirit. Note that the spectral radius of any weightedshift T is given by (see [9])

r(T ) = liml→∞

||T l|| 1e = liml→∞

(supm

|αmαm+1 . . . αm+l−1|)1l.

Proposition 3. Let $T$ be any weighted shift with weight sequence $(\alpha_n)$. Then $\operatorname{sp} T = \{\lambda\in\mathbb{C} : |\lambda| \le r(T)\}$.


Proof. Suppose $\lambda\notin\operatorname{sp} T$ and $0 < |\lambda| < r(T)$. Let $P_n\in B(l^2(\mathbb{N}))$ be the orthoprojection onto $\operatorname{span}\{e_1,\dots,e_n\}$, where $(e_n)_{n\in\mathbb{N}}$ is the orthonormal standard basis in $l^2(\mathbb{N})$. Because of
$$C_n := P_n(\lambda I - T)P_n = P_n(\lambda I - T)$$
we get $P_n(\lambda I - T)P_n(\lambda I - T)^{-1}P_n = P_n$ and therefore the stability of $(P_n(\lambda I - T)P_n)$. Consider

⎛⎜⎜⎜⎜⎜⎜⎝

1λα1λ

1λ 0

α1α2λ2

α2λ

· · · · · · · · · . . .Πn−1

1 αi

λn

Πn−12 αi

λn−1 · · · αn−1λ2

⎞⎟⎟⎟⎟⎟⎟⎠ .

It is easy to show (using arguments from [3]) that the sequence (C−1n ) is not

uniformly bounded: Let δ := r(T )− |λ| and l0 be such that∣∣∣∣r(t)− supm

|αmαm+1 . . . αm+l−1|1l

∣∣∣∣ < δ

4for all l ≥ l0 .

Hence, (|λ| + δ4 ) < sup

m|αmαm+l . . . αm+l−1|

1l for all l ≥ l0, whence follows the

existence of an m0 = m0(l) such that(1 +

δ

4|λ|

)l

<|αm0αm0+l−1 . . . αm0+l−1|

|λ|l .

This estimate shows that|αm0αm0+l−1 . . . αm0+l−1|

|α|l −→∞ as l →∞ .

However, the numbersαm0αm0+l−1 . . . αm0+l−1

λlare entries in all of the matrices

above for sufficiently large n and therefore (Cn) cannot be stable. The obtainedcontradiction proves the claim.

Notice that this idea can also be used to determine the spectrum of unilateral block weighted shifts. A more refined analysis for such problems, where both the weight sequence and its inverse are uniformly bounded, is contained in [2].
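The mechanism behind this proof is easy to observe numerically. The sketch below, with names of our choosing, uses a weight sequence alternating between 2 and 1/2 (bounded away from zero, so $T$ is not quasidiagonal and $r(T) = 1$ by the formula above) and shows that $\|C_n^{-1}\|$ grows without bound for $|\lambda| < r(T)$, so the finite sections $(C_n)$ cannot be stable.

```python
import numpy as np

def inverse_norm_of_finite_section(alpha, lam, n):
    """Spectral norm of C_n^{-1}, where C_n = P_n (lam*I - T) P_n and
    T e_k = alpha_k e_{k+1} is the unilateral weighted shift."""
    C = lam * np.eye(n) - np.diag(alpha[:n - 1], -1)   # the shift contributes the subdiagonal
    return np.linalg.norm(np.linalg.inv(C), 2)

alpha = np.array([2.0 if k % 2 == 0 else 0.5 for k in range(400)])
for n in (50, 100, 200, 400):
    print(n, inverse_norm_of_finite_section(alpha, lam=0.9, n=n))
# The norms grow roughly like (1/0.9)^n, consistent with 0.9 lying in sp T.
```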

References

[1] A. Ben-Artzi: On approximation spectrum of bounded selfadjoint operators. Integral Equations Operator Theory 9 (1986), no. 2, 266–274.

[2] A. Ben-Artzi, I. Gohberg: Dichotomy, discrete Bohl exponents, and spectrum of block weighted shifts. Integral Equations Operator Theory 14 (1991), 615–677.

[3] N.P. Brown: Quasidiagonality and the finite section method. Preprint, Department of Mathematics, Penn State University (2003).


[4] N.P. Brown: An embedding and the numerical computation of spectra in irrational rotation algebras. Preprint, Department of Mathematics, Penn State University (2003).

[5] I. Gohberg, I. Feldman: Convolution Equations and Projection Methods for Their Solution. Nauka, Moskva 1971 (Russian; English transl.: Am. Math. Soc. Transl. of Math. Monographs 41, Providence, R.I. 1974).

[6] I. Gohberg, M. Krein: Introduction to the Theory of Linear Nonselfadjoint Operators. Nauka, Moskva (Russian; Engl. transl.: Am. Math. Soc. Transl. of Math. Monographs 18, Providence, R.I. 1969).

[7] R. Hagen, S. Roch, B. Silbermann: C*-Algebras and Numerical Analysis. Marcel Dekker, Inc., New York, Basel (2001).

[8] S. Prössdorf, B. Silbermann: Numerical Analysis for Integral and Related Operator Equations. Akademie Verlag, Berlin (1991), and Birkhäuser Verlag, Basel, Boston, Stuttgart (1991).

[9] P. Halmos: A Hilbert Space Problem Book. D. Van Nostrand Company, Inc., Toronto, London (1967).

[10] S. Prössdorf, B. Silbermann: Numerical Analysis for Integral and Related Operator Equations. Akademie Verlag, Berlin, 1991, and Birkhäuser Verlag, Basel, Boston, Stuttgart (1991).

[11] S. Roch, B. Silbermann: Index calculus for approximation methods and singular value decomposition. J. Math. Anal. Appl. 225 (1998), 401–426.

Bernd Silbermann
Technical University Chemnitz
Department of Mathematics
D-09107 Chemnitz, Germany


Operator Theory: Advances and Applications, Vol. 160, 413–424
© 2005 Birkhäuser Verlag Basel/Switzerland

Additive and Multiplicative Perturbations of Exponentially Dichotomous Operators on General Banach Spaces

Cornelis V.M. van der Mee and André C.M. Ran

To Israel Gohberg on the occasion of his 75th birthday

Abstract. Recent perturbation results for exponentially dichotomous operators are generalized, in part by replacing compactness conditions on the perturbation by resolvent compactness. Both additive and multiplicative perturbations are considered.

Mathematics Subject Classification (2000). Primary 47D06; Secondary 47A55.

Keywords. block operator, exponential dichotomy, semigroup perturbation, Riccati equation.

1. Introduction

In [14] perturbation results for exponentially dichotomous operators on Banach spaces were discussed. In this paper we continue the investigation started in that paper.

Recall that an exponentially dichotomous operator is a direct sum $A_0 \oplus (-A_1)$, in which $A_0$ and $A_1$ are generators of exponentially decaying $C_0$-semigroups. Such operators were introduced in [2, 3] in connection with convolution equations on the half-line. Operators of this type also occur in various other applications, see, e.g., [7, 8, 9, 10, 11].

Perturbation results for exponentially dichotomous operators were already studied in [2], where additive perturbations were considered. Results in this direction were later obtained for more particular operators on Hilbert spaces in [8, 9, 10, 11]. In [14] the authors considered additive perturbations for exponentially dichotomous operators on Banach spaces. Multiplicative perturbations were studied in [7, 13].

The research leading to this article was supported in part by MIUR under the COFIN grants2002014121 and 2004015437, and by INdAM.

Page 416: Recent Advances in Operator Theory and its Applications: The Israel Gohberg Anniversary Volume

414 C.V.M. van der Mee and A.C.M. Ran

In this article we accomplish the following two tasks. First we generalize themain results on additive and bounded perturbations of exponentially dichotomousoperators derived in [14] by requiring the corresponding bisemigroup multipliedfrom the right by a bounded additive perturbation to be continuous in the op-erator norm, except possibly as t → 0±. This greatly simplifies the treatment in[14], where it is assumed that either the corresponding bisemigroup itself is con-tinuous in the operator norm (except possibly as t → 0±) or the perturbation isa compact operator. We shall prove a lemma that will allow us to use the sameproofs as in [14] and to refer to [14] for these proofs. Secondly, for exponentiallydichotomous operators having bounded analytic constituent semigroups, we studyperturbations obtained by multiplying the given exponentially dichotomous op-erator from the right by a compact perturbation of the identity. We shall provethat the newly obtained operator is exponentially dichotomous and has boundedanalytic constituent semigroups. We thus generalize results obtained before in [7]in a Hilbert space setting. All of our results will be derived in general complexBanach spaces, including those on Riccati equations.

The main body of this paper consists of two sections. In Section 2 we indi-cate how one of the main results of [14] can be generalized to the present settingwithout changing its proof and discuss the consequences of this result for canon-ical factorization and for block operators. In Section 3 we study perturbations ofanalytic bisemigroup generators. We refer to the introduction of [14] for a morecomprehensive discussion of the existing literature.

Let us introduce some notations. We let R± stand for the right (left, resp.)half-line, including the point at zero. For two complex Banach spaces X and Y,we let L(X ,Y) stand for the Banach spaces of all bounded linear operators fromX into Y. We write L(X ) instead of L(X ,X ).

Let X be a complex Banach space and E an interval of the real line R. ThenLp(E;X ) denotes the Banach space of all strongly measurable functions φ : E → Xsuch that ‖φ(·)‖X ∈ Lp(E), endowed with the Lp-norm, and C0(E;X ) stands forthe Banach space of all bounded continuous functions φ : E → X which vanishat infinity if E is unbounded, endowed with the supremum norm. In particular,C0(R−;X )+C0(R+;X ) is the Banach space of all bounded continuous functionsφ : R → X which vanish at ±∞ and may have a jump discontinuity in zero.

2. Bisemigroups and their perturbations

2.1. Bisemigroup perturbation results

A C0-semigroup (T (t))t≥0 on a complex Banach space X is called uniformly expo-nentially stable if

‖T (t)‖ ≤Me−εt, t ≥ 0, (2.1)

for certain M, ε > 0.

Page 417: Recent Advances in Operator Theory and its Applications: The Israel Gohberg Anniversary Volume

Perturbations of Exponentially Dichotomous Operators 415

A closed and densely defined linear operator −S on a Banach space X iscalled exponentially dichotomous [2] if for some projection P commuting with S,the restrictions of S to ImP and of −S to KerP are the infinitesimal generators ofexponentially decaying C0-semigroups. We then define the bisemigroup generatedby −S as

E(t;−S) =

e−tS(I − P ), t > 0

−e−tSP, t < 0.

Its separating projection P is given by P = −E(0−;−S) = IX − E(0+;−S). Oneeasily verifies the existence of ε > 0 such that λ ∈ C : |Reλ| ≤ ε is contained inthe resolvent set ρ(S) of S and for every x ∈ X

(λ− S)−1x = −∫ ∞

−∞eλtE(t;−S)xdt, |Reλ| ≤ ε. (2.2)

As a result, for every x ∈ X we have ‖(λ − S)−1x‖ → 0 as λ → ∞ in λ ∈ C :|Reλ| ≤ ε′ for some ε′ ∈ (0, ε]. We call the restrictions of e−tS to KerP and ofetS to ImP the constituent semigroups of the exponentially dichotomous operator−S. Observe that x ∈ X : (λ − S)−1x is analytic for Reλ < 0 = KerP , andx ∈ X : (λ− S)−1x is analytic for Reλ > 0 = ImP .

Before deriving our main perturbation result, we prove the following lemma.Note that Theorem 3 of [14] is an immediate consequence of this lemma.

Lemma 2.1. Let −S0 be exponentially dichotomous, Γ a bounded operator such thatE(t;−S0)Γ is norm continuous in 0 = t ∈ R, and −S = −S0 + Γ, where D(S) =D(S0). Suppose the strip λ ∈ C : |Reλ| < ε is contained in the resolvent set ofS for some ε > 0. Then −S is exponentially dichotomous. Moreover, E(t;−S)Γis norm continuous in 0 = t ∈ R with norm continuous limits as t→ 0±.

Proof. There exists ε > 0 such that∫ ∞

−∞eε|t|‖E(t;−S0)‖ dt <∞. (2.3)

Using the resolvent identity

(λ− S)−1 − (λ− S0)−1 = −(λ− S0)−1Γ(λ− S)−1, |Reλ| ≤ ε, (2.4)

for some ε > 0, we obtain the convolution integral equation

E(t;−S)x−∫ ∞

−∞E(t− τ ;−S0)ΓE(τ ;−S)xdτ = E(t;−S0)x, (2.5)

where x ∈ H and 0 = t ∈ R. By assumption, in (2.5) the convolution kernelE(·;−S0)Γ is continuous in the norm except for a jump discontinuity in t = 0.Further, (2.3) implies that eε|·|E(·;−S0)Γ is Bochner integrable.

The symbol of the convolution integral equation (2.5), which equals IH +(λ− S0)−1Γ = (λ− S0)−1(λ− S), tends to IH in the norm as λ→∞ in the strip|Reλ| ≤ ε, because of the Riemann-Lebesgue lemma. Thus there exists ε0 ∈ (0, ε]such that the symbol only takes invertible values on the strip |Reλ| ≤ ε0. By

Page 418: Recent Advances in Operator Theory and its Applications: The Israel Gohberg Anniversary Volume

416 C.V.M. van der Mee and A.C.M. Ran

the Bochner-Phillips theorem ([4], also [6]), the convolution equation (2.5) has aunique solution u(·;x) = E(·;−S)x with the following properties:

1) E(·;−S) is strongly continuous, except for a jump discontinuity at t = 0,

2)∫∞−∞ eε0|t|‖E(t;−S)‖ dt <∞; hence E(·;−S) is exponentially decaying,

3) the identity (2.2) holds.

As a result [2], −S is exponentially dichotomous.

We now present our main perturbation result. Note that Theorem 2 of [14]is an immediate consequence of this result.

Theorem 2.2. Let −S0 be exponentially dichotomous, Γ a bounded operator suchthat the operator (λ − S0)−1Γ is compact for imaginary λ, and −S = −S0 + Γ,where D(S) = D(S0). Suppose the imaginary axis is contained in the resolvent setof S. Then −S is exponentially dichotomous. Moreover, E(t;−S) − E(t;−S0) isa compact operator, also in the limits as t→ 0±.

Proof. It suffices to prove that E(t;−S0)Γ is a compact operator for 0 = t ∈ R.This would imply that (1) E(t;−S0)Γ is norm continuous in 0 = t ∈ R withnorm continuous limits as t → 0±, and (2) the symbol IH + (λ − S0)−1Γ =(λ−S0)−1(λ−S) of the convolution integral equation (2.5) tends to IH in the normas λ → ∞ in the strip |Reλ| ≤ ε. In combination with the absence of imaginaryspectrum of S, the latter would imply that the strip λ ∈ C : |Reλ| < ε0 iscontained in the resolvent set of S for some ε0 > 0. Theorem 2.2 would then beimmediate from Lemma 2.1.

By analytic continuation, we easily prove that (λ−S0)−1Γ is a compact oper-ator on a strip λ ∈ C : |Reλ| ≤ ε for some ε > 0. Thus (λ−S0)−1E(0+,−S0)Γ isanalytic and compact operator valued for Reλ < ε, while (λ− S0)−1E(0−,−S0)Γis analytic and compact operator valued for Reλ > −ε.

Now it is well known ([5], Corollary III 5.5) that

E(t;−S0)x =

⎧⎨⎩ limn→∞ (I +

t

nS0)−nE(0+;−S0)x, t > 0,

limn→∞ (I +

t

nS0)−nE(0−;−S0)x, t < 0.

uniformly in x on relatively compact sets. Since for every 0 = t ∈ R we have that(I + t

nS0)−1Γ is compact for sufficiently large n ∈ N, it follows that

E(t;−S0)Γ =

⎧⎨⎩ limn→∞ (I +

t

nS0)−nE(0+;−S0)Γ, t > 0,

limn→∞ (I +

t

nS0)−nE(0−;−S0)Γ, t < 0,

in the operator norm. Since (I + tnS0)−nE(0±;−S0)Γ is compact for (±t) > 0, it

follows that E(t;−S0)Γ is compact for 0 = t ∈ R, which completes the proof.

Page 419: Recent Advances in Operator Theory and its Applications: The Israel Gohberg Anniversary Volume

Perturbations of Exponentially Dichotomous Operators 417

2.2. Canonical factorization and matching of subspaces

Let −S0 be exponentially dichotomous and Γ a bounded operator on a complexBanach space X , and let −S = −S0 + Γ, where D(S) = D(S0) and λ ∈ C :|Reλ| ≤ ε ⊂ ρ(S) for some ε > 0. Then −S is exponentially dichotomous ifE(t;−S0)Γ is continuous in 0 = t ∈ R in the operator norm. In this section weconsider the analogous vector-valued Wiener-Hopf integral equation

φ(t) −∫ ∞

0

E(t− τ ;−S0)Γφ(τ) dτ = g(t) (2.6)

where t > 0.Suppose W is a continuous function from the extended imaginary axis i(R∪

∞) into L(X ). Then by a left canonical (Wiener-Hopf ) factorization of W wemean a representation of W of the form

W (λ) = W+(λ)W−(λ), Reλ = 0, (2.7)

in which W±(±λ) is continuous on the closed right half-plane (the point at ∞included), is analytic on the open right half-plane, and takes only invertible valuesfor λ in the closed right half-plane (the point at infinity included). Obviously, suchan operator function only takes invertible values on the extended imaginary axis.By a right canonical (Wiener-Hopf ) factorization we mean a representation of Wof the form

W (λ) = W−(λ)W+(λ), Reλ = 0, (2.8)

where W±(λ) are as above.Theorems 6 and 7 and Corollary 8 of [14] can now easily be generalized with

exactly the same proofs. We now require −S0 to be an exponentially dichotomousand Γ a bounded operator on a complex Banach space X (Hilbert space whengeneralizing Corollary 8 of [14]) such that E(t;−S0)Γ is continuous in 0 = t ∈ R inthe operator norm, instead of requiring that either (i) E(t;−S0) itself is continuousin the operator norm for 0 = t ∈ R or (ii) Γ is a compact operator. Lemma 2.1then enables us to apply Theorems 6 and 7 and Corollary 8 of [14] in the case(λ− S0)−1Γ is a compact operator for imaginary λ.

The above generalizations of Theorems 6 and 7 of [14] yield results on theequivalence of (i) left (resp., right) canonical Wiener-Hopf factorizability ofW (λ) =(λ − S0)−1(λ − S), (ii) the complementarity in X of the range of one of the sep-arating projections P0 and P and the kernel of the other, and (iii) the uniquesolvability of the vector-valued convolution equation on the positive (resp. neg-ative) half-line with convolution kernel E(·;−S0)Γ. The above generalization ofCorollary 8 of [14] yields left and right canonical factorizability if the symbolW (λ) = (λ− S0)−1(λ− S) of the half-line convolution equation involved is eitherclose to the identity operator or has a strictly positive definite real part. Sim-ilar results in various different contexts exist in the finite-dimensional case [1],for equations with symbols analytic in a strip and at infinity [3], for extendedPritchard-Salamon realizations [8], and for abstract kinetic equations [7].

Page 420: Recent Advances in Operator Theory and its Applications: The Israel Gohberg Anniversary Volume

418 C.V.M. van der Mee and A.C.M. Ran

2.3. Block operators

Suppose −S0 is exponentially dichotomous and Γ is a bounded linear operator ona complex Banach space X . Define S by −S = −S0 + Γ, and put

X± = ImE(0±;−S0),

i.e., X+ = Im (I − P0) = KerP0 and X− = ImP0. Assuming that Γ[X±] ⊂ X∓,we have the following block decompositions of S0 and S with respect to the directsum X = X++X−:

S0 =(A0 00 −A1

), S =

(A0 −D−Q −A1

), (2.9)

where −A0 and −A1 are the generators of uniformly exponentially stable C0-semigroups and Q : X+ → X− and D : X− → X+ are bounded. Then we call Swritten in the form (2.9) a block operator, which is in line with the definition usedin [14]. In the literature the notion of a block operator is also used in a wider sense(e.g., without assumptions about semigroup generators).

Theorem 9 of [14] can now be generalized in the same way without changingits proof. We now require −S0 to be an exponentially dichotomous operator and Γa bounded operator on a complex Banach space X satisfying Γ[X±] ⊂ X∓, insteadof requiring that either (i) E(t;−S0) itself is continuous in the operator norm for0 = t ∈ R or (ii) Γ is a compact operator. Here X+ and X− are the kernel andrange of the separating projection of −S0, respectively.

The above generalization of Theorem 9 of [14] states that there exists abounded linear operator Π+ from X− into X+ which maps D(A1) into D(A0), hasthe property that B1 = A1 + QΠ+ generates an exponentially stable semigroupon X−, and satisfies the Riccati equation

A0Π+x + Π+A1x−Dx + Π+QΠ+x = 0, x ∈ D(A1), (2.10)

if and only if the equivalent statements (a)–(e) of Theorem 7 of [14] are true. Anal-ogously, it states that there exists a bounded linear operator Π− from X+ into X−

which maps D(A0) into D(A1), has the property that B0 = A0 −DΠ− generatesan exponentially stable semigroup on X+, and satisfies the Riccati equation

Π−A0x + A1 Π−x−Π−DΠ−x + Qx = 0, x ∈ D(A0). (2.11)

if and only if the equivalent statements (a)–(e) of Theorem 8 of [14] are true. Similarresults are valid in the finite-dimensional case [1] and for extended Pritchard-Salamon realizations [8].

3. Analytic bisemigroups and unbounded perturbations

3.1. Preliminaries on analytic semigroups

As in [5] (but in contrast to the definition given in [12]), a closed linear operatorA densely defined on a complex Banach space X is called sectorial if there exists

Page 421: Recent Advances in Operator Theory and its Applications: The Israel Gohberg Anniversary Volume

Perturbations of Exponentially Dichotomous Operators 419

a δ with 0 < δ ≤ (π/2) such that the sector

Σπ2 +δ = λ ∈ C : | argλ| < π

2+ δ \ 0

is contained in the resolvent set of A, and if for each ζ ∈ (0, δ) there exists Mζ ≥ 1such that

‖(λ−A)−1‖ ≤ Mζ

|λ| , λ ∈ Σπ2 +δ−ζ \ 0.

According to [5], Theorem II 4.6, the sectorial operators are exactly the gener-ators of bounded analytic semigroups. Thus A is the generator of a uniformlyexponentially stable analytic semigroup if and only if there exist δ and γ with0 < δ ≤ (π/2) and γ > 0 such that (1) the sector

−γ + Σπ2 +δ = λ ∈ C : | arg(λ + γ)| < π

2+ δ \ −γ (3.1)

is contained in the resolvent set of A, and (2) for each ζ ∈ (0, δ) there exists Mζ ≥ 1such that

‖(λ−A)−1‖ ≤ Mζ

|λ + γ| , λ ∈ −γ +[Σπ

2 +δ−ζ \ −γ]. (3.2)

3.2. Perturbation results for analytic bisemigroups

A bisemigroup is called analytic if its constituent semigroups are analytic. Writing−S for its generator and P for its separating projection, we can define

H(t;−S) =

Se−tS(I − P ), t > 0

−Se−tSP, t < 0,

for the derivative of E(t;−S) with respect to 0 = t ∈ R.Next, we note that the generator −S has the following two properties (cf.

(3.1)–(3.2)):1. there exist δ and γ with 0 < δ ≤ (π/2) and γ > 0 such that the set

Ωδ,γ =λ ∈ C :

∣∣∣π2− argλ

∣∣∣ < δ or |Reλ| < γ

(3.3)

is contained in the resolvent set of S, and2. for each ζ ∈ (0, δ) there exists Nζ ≥ 1 such that

‖(λ− S)−1‖ ≤ Nζ

(1

|λ+ γ| +1

|λ− γ|

), λ ∈ Ωζ,γ \ γ,−γ. (3.4)

It is not clear if a closed and densely defined linear operator −S on X having theproperties (3.3)–(3.4) generates an analytic bisemigroup.

Starting from an exponentially dichotomous operator −S0 on a complex Ba-nach space X generating an analytic bisemigroup and a bounded linear operator∆ on X , we now study sufficient conditions under which the unbounded pertur-bation −S = −S0 + Γ of −S0 for which Γ = S0∆, is a generator of an analyticbisemigroup. We will always assume that 1 /∈ σ(∆) and define −S by

D(S) = (I −∆)−1[D(S0)], −S = −S0(I −∆).

Page 422: Recent Advances in Operator Theory and its Applications: The Israel Gohberg Anniversary Volume

420 C.V.M. van der Mee and A.C.M. Ran

Before deriving our main perturbation result, we prove the following lemma.

Lemma 3.1. Let −S0 be the generator of an analytic bisemigroup and ∆ a boundedlinear operator such that 1 /∈ σ(∆). Suppose that

1. there exist δ and γ with 0 < δ ≤ (π/2) and γ > 0 such that the set Ωδ,γ

defined by (3.3) is contained in the resolvent set of S = S0(I −∆), and2.∫∞−∞ ‖H(t;−S0)∆‖ dt <∞.

Then −S is the generator of an analytic bisemigroup.

Proof. There exists ε > 0 such that (2.3) is true. Using the resolvent identity (2.4),we obtain the convolution integral equation

E(t;−S)x−∫ ∞

−∞H(t− τ ;−S0)∆E(τ ;−S)xdτ = E(t;−S0)x, (3.5)

where x ∈ H and 0 = t ∈ R. By assumption, in (3.5) the convolution kernelH(·;−S0)∆ is continuous in the norm except for a jump discontinuity in t = 0and satisfies

∫∞−∞ eε|t| ‖H(t;−S0)∆‖ dt < ∞. Indeed, the integral is an improper

integral at 0 and at ±∞. Convergence at t = 0 is guaranteed by the secondassumption. Convergence at ±∞ follows from (2.3) and a line of argument as onpage 103 (bottom) of [5], which together prove that H(t;−S0) is exponentiallydecaying. Thus eε|·|H(·;−S0)∆ is Bochner integrable.

The symbol of the convolution integral equation (3.5), which equals I −∆ +λ(λ − S0)−1∆ = (λ − S0)−1(λ − S), tends to I in the norm as λ → ∞ in thestrip |Reλ| < ε, because of the Riemann-Lebesgue lemma. Thus there exists ε0 ∈(0,min(ε, γ)] such that the symbol only takes invertible values on the strip |Reλ| ≤ε0. By the Bochner-Phillips theorem [4], the convolution equation (3.5) has aunique solution u(·;x) = E(·;−S)x with the following properties:

1) E(·;−S) is strongly continuous, except for a jump discontinuity at t = 0,2)

∫∞−∞ eε0|t|‖E(t;−S)‖ dt <∞; hence E(·;−S) is exponentially decaying,

3) the identity (2.2) holds.As a result [2], −S is exponentially dichotomous.

The following result has been established in [7] for the case in which S0 isthe inverse of a bounded and injective selfadjoint operator on a Hilbert space. In[7] it has been sketched how the arguments used to prove the Hilbert space casecan also be applied to prove the Banach space case, without rendering details.

Theorem 3.2. Let −S0 be the generator of an analytic bisemigroup and ∆ a compactoperator such that 1 /∈ σ(∆) and S = S0(I − ∆) does not have purely imaginaryeigenvalues. Suppose that ∫ ∞

−∞‖H(t;−S0)∆‖ dt <∞. (3.6)

Then −S is the generator of an analytic bisemigroup.

Page 423: Recent Advances in Operator Theory and its Applications: The Israel Gohberg Anniversary Volume

Perturbations of Exponentially Dichotomous Operators 421

Proof. It suffices to prove the first condition of Lemma 3.1. Indeed, since thesymbol of the convolution equation (3.5) is a compact perturbation of the identityand is invertible on a strip |Reλ| ≤ ε0 about the imaginary axis while it hasinvertible limits in the operator norm as λ → 0 and λ → ±i∞, the spectrum ofS in this strip must consist of finitely many normal eigenvalues. Thus the firstcondition of Lemma 3.1 amounts to requiring the absence of purely imaginaryeigenvalues of S, as assumed.

It is clear from the proof that the hypotheses of Theorem 3.2 can be replacedby the hypotheses that −S0 is the generator of an analytic bisemigroup, (λ −S0)−1∆ is compact for purely imaginary λ, 1 /∈ σ(∆), S = S0(I − ∆) does nothave purely imaginary eigenvalues, and (3.6) holds. It is not necessary to have ∆itself compact.

It is well known that sectorial operators have fractional powers [12]. Thusgenerators −S = (−A0)+A1 of analytic bisemigroups, where −A0 and −A1 aregenerators of uniformly exponentially stable analytic semigroups, have fractionalpowers defined by |S|α def= (−A0)α+(−A1)α for any α ∈ R. Moreover,

‖|S|αE(t;−S)‖ = O(|t|−α), t→ 0±; (3.7)

∃c > 0 : ‖|S|αE(t;−S)‖ = O(|t|−α e−c|t|), t→ ±∞. (3.8)

As a result of (3.7)–(3.8) we have∥∥|S|−αH(t;−S)∥∥ = O(|t|α−1), t→ 0±;

∃c > 0 :∥∥|S|−αH(t;−S)

∥∥ = O(|t|α−1 e−c|t|), t→ ±∞.

The following corollary is now clear.

Corollary 3.3. Let −S0 be the generator of an analytic bisemigroup and ∆ a com-pact operator such that 1 /∈ σ(∆), S = S0(I −∆) does not have purely imaginaryeigenvalues, and Im∆ ⊂ D(|S0|α) for some α > 0. Then −S is the generator ofan analytic bisemigroup.

3.3. Canonical factorization and matching of subspaces

The following results can all be found in [7] for the case in which S0 is the inverseof a bounded and injective selfadjoint operator on a Hilbert space.

Theorem 3.4. Suppose X is a complex Banach space. Let −S0 be the generatorof an analytic bisemigroup and ∆ a bounded operator with 1 /∈ σ(∆) such that(λ − S0)−1∆ is compact for purely imaginary λ, S0(I − ∆) does not have purelyimaginary eigenvalues, and (3.6) is true. Let P0 and P stand for the separatingprojections of −S0 and −S, respectively. Then the following statements are equiv-alent:(a) The operator function

W (λ) = (λ− S0)−1(λ− S) = IX −∆ + λ(λ − S0)−1∆, |Reλ| ≤ ε, (3.9)

has a left canonical factorization with respect to the imaginary axis.

Page 424: Recent Advances in Operator Theory and its Applications: The Israel Gohberg Anniversary Volume

422 C.V.M. van der Mee and A.C.M. Ran

(b) We have the decomposition

KerP +ImP0 = X . (3.10)

(c) For some (and hence every) E(R+;X ), the vector-valued Wiener-Hopf equa-tion

φ(t)−∫ ∞

0

H(t− τ ;−S0)∆φ(τ) dτ = g(t), t > 0, (3.11)

is uniquely solvable in E(R+;X ) for any g ∈ E(R+;X ).

Theorem 3.5. Suppose X is a complex Banach space. Let −S0 be the generatorof an analytic bisemigroup and ∆ a bounded operator with 1 /∈ σ(∆) such that(λ − S0)−1∆ is compact for purely imaginary λ, S0(I − ∆) does not have purelyimaginary eigenvalues, and (3.6) is true. Let P0 and P stand for the separatingprojections of −S0 and −S, respectively. Then the following statements are equiv-alent:

(a) The operator function

W (λ) = (λ− S0)−1(λ− S) = IX −∆ + λ(λ− S0)−1∆, |Reλ| ≤ ε,

has a right canonical factorization with respect to the imaginary axis.(b) We have the decomposition

KerP0+ImP = X . (3.12)

(c) For some (and hence every) E(R−;X ), the vector-valued Wiener-Hopf equa-tion

φ(t) −∫ 0

−∞H(t− τ ;−S0)∆φ(τ) dτ = g(t), t < 0, (3.13)

is uniquely solvable in E(R−;X ) for any g ∈ E(R−;X ).

Corollary 3.6. Suppose H is a complex Hilbert space. Let −S0 be the generatorof an analytic bisemigroup and ∆ a bounded operator with 1 /∈ σ(∆) such that(λ − S0)−1∆ is compact for purely imaginary λ, S0(I − ∆) does not have purelyimaginary eigenvalues, and (3.6) is true. Let P0 and P be the separating projectionsof −S0 and −S, respectively. Suppose

supReλ=0

‖ −∆ + λ(λ − S0)−1∆‖ < 1.

Then all of the following statements are true:

(a) The operator function W (·) in (3.9) has a left and a right canonical factor-ization with respect to the imaginary axis.

(b) We have the decompositions (3.10) and (3.12).(c) For some (and hence every) E(R±;H), the vector-valued Wiener-Hopf equa-

tion (3.11) [(3.13), respectively] is uniquely solvable in E(R±;H) for anyg ∈ E(R±;H).

Page 425: Recent Advances in Operator Theory and its Applications: The Israel Gohberg Anniversary Volume

Perturbations of Exponentially Dichotomous Operators 423

Acknowledgment

The authors are greatly indebted to Prof. Karl-Heinz Forster for a question sug-gesting a generalization of previous results on bisemigroup perturbation.

References

[1] H. Bart, I. Gohberg, and M.A. Kaashoek, Minimal Factorization of Matrix andOperator Functions. Birkhauser OT 1, Basel and Boston, 1979.

[2] H. Bart, I. Gohberg, and M.A. Kaashoek, Wiener-Hopf factorization, inverse Fouriertransforms and exponentially dichotomous operators. J. Funct. Anal. 68 (1986), 1–42.

[3] H. Bart, I. Gohberg, and M.A. Kaashoek, Wiener-Hopf equations with symbols an-alytic in a strip. In: I. Gohberg and M.A. Kaashoek, eds., Constructive Methods ofWiener-Hopf Factorization. Birkhauser OT 21, Basel, 1986, pp. 39–74.

[4] S. Bochner and R.S. Phillips, Absolutely convergent Fourier expansions for non-com-mutative normed rings. Ann. Math. 43 (1942), 409–418.

[5] K.-J. Engel and R. Nagel, One-parameter Semigroups for Linear Evolution Equa-tions. Springer GTM 194, Berlin, 2000.

[6] I.C. Gohberg and J. Leiterer, Factorization of operator functions with respect to acontour. II. Canonical factorization of operator functions close to the identity. Math.Nachrichten 54 (1972), 41–74 [Russian].

[7] W. Greenberg, C.V.M. van der Mee, and V. Protopopescu, Boundary Value Problemsin Abstract Kinetic Theory. Birkhauser OT 23, Basel and Boston, 1987.

[8] M.A. Kaashoek, C.V.M. van der Mee, and A.C.M. Ran, Wiener-Hopf factorizationof transfer functions of extended Pritchard-Salamon realizations. Math. Nachrichten196 (1998), 71–102.

[9] H. Langer, A.C.M. Ran, and B.A. van de Rotten, Invariant subspaces of infinite-dimensional Hamiltonians and solutions of the corresponding Riccati equations. In:I. Gohberg and H. Langer, eds., Linear Operators and Matrices. Birkhauser OT 130,Basel and Boston, 2001, pp. 235–254.

[10] H. Langer and C. Tretter, Spectral decomposition of some nonselfadjoint block oper-ator matrices. J. Operator Theory 39 (1998), 339–359.

[11] H. Langer and C. Tretter, Diagonalization of certain block operator matrices andapplications to Dirac operators. In: H. Bart, I. Gohberg, and A.C.M. Ran, eds.,Operator Theory and Analysis. Birkhauser OT 122, Basel and Boston, 2001, pp.331–358.

[12] A. Lunardi, Analytic Semigroups and Optimal Regularity in Parabolic Problems,Birkhauser PNDEA 16, Basel and Boston, 1995.

[13] C.V.M. van der Mee, Transport theory in Lp-spaces. Integral Equations and OperatorTheory 6 (1983), 405–443.

[14] A.C.M. Ran and C. van der Mee, Perturbation results for exponentially dichotomousoperators on general Banach spaces. J. Func. Anal. 210 (2004), 193–213.

Page 426: Recent Advances in Operator Theory and its Applications: The Israel Gohberg Anniversary Volume

424 C.V.M. van der Mee and A.C.M. Ran

Cornelis V.M. van der MeeDipartimento di Matematica e InformaticaUniversita di CagliariViale Merello 92I-09123 Cagliari, Italye-mail: [email protected]

Andre C.M. RanAfdeling Wiskunde, FEWVrije UniversiteitDe Boelelaan 1081aNL-1081 HV Amsterdam, The Netherlandse-mail: [email protected]

Page 427: Recent Advances in Operator Theory and its Applications: The Israel Gohberg Anniversary Volume

Operator Theory:Advances and Applications, Vol. 160, 425–439c© 2005 Birkhauser Verlag Basel/Switzerland

Factorization of Block Triangular MatrixFunctions with Off-diagonal Binomials

Cornelis V.M. van der Mee, Leiba Rodman and Ilya M. Spitkovsky

Dedicated to Israel Gohberg on the occasion of his 75th birthday

Abstract. Factorizations of Wiener–Hopf type are considered in the abstractframework of Wiener algebras of matrix-valued functions on connected com-pact abelian groups, with a non-archimedean linear order on the dual group.A criterion for factorizability is established for 2 × 2 block triangular matrixfunctions with elementary functions on the main diagonal and a binomialexpression in the off-diagonal block.

Mathematics Subject Classification (2000). Primary 47A68. Secondary 43A17.

Keywords. Wiener–Hopf factorization, Wiener algebras, linearly orderedgroups.

1. Introduction and the main result

Let G be a (multiplicative) connected compact abelian group and let Γ be its(additive) character group. Recall that Γ consists of continuous homomorphismsof G into the group of unimodular complex numbers. Since G is compact, Γ isdiscrete. In applications, often Γ is an additive subgroup of R, the group of realnumbers, or of Rk, and G is the Bohr compactification of Γ. The group G can bealso thought of as the character group of Γ, an observation that will be often used.

The group G has a unique invariant measure ν satisfying ν(G) = 1, whileΓ is equipped with the discrete topology and the (translation invariant) countingmeasure. It is well known [17] that, because G is connected, Γ can be made intoa linearly ordered group. So let ' be a linear order such that (Γ,') is an orderedgroup, i. e., if x, y, z ∈ Γ and x ' y, then x+z ' y+z. Throughout the paper it will

The research of van der Mee leading to this article was supported by INdAM and MIUR undergrants No. 2002014121 and 2004015437. The research of Rodman and Spitkovsky was partiallysupported by NSF grant DMS-9988579.

Page 428: Recent Advances in Operator Theory and its Applications: The Israel Gohberg Anniversary Volume

426 C.V.M. van der Mee, L. Rodman and I.M. Spitkovsky

be assumed that Γ is ordered with a fixed linear order '. The notations ≺, $, -,max, min (with obvious meaning) will also be used. We put Γ+ = x ∈ Γ : x $ 0and Γ− = x ∈ Γ : x ' 0.

For any nonempty set M , let 1(M) stand for the complex Banach space ofall complex-valued M -indexed sequences x = xjj∈M having at most countablymany nonzero terms that are finite with respect to the norm

‖x‖1 =∑j∈M

|xj |.

Then 1(Γ) is a commutative Banach algebra with unit element with respect tothe convolution product (x∗y)j =

∑k∈Γ xk yj−k. Further, 1(Γ+) and 1(Γ−) are

closed subalgebras of 1(Γ) containing the unit element.Given a = ajj∈Γ ∈ 1(Γ), by the symbol of a we mean the complex-valued

continuous function a on G defined by

a(g) =∑j∈Γ

aj〈j, g〉, g ∈ G, (1)

where 〈j, g〉 stands for the action of the character j ∈ Γ on the group elementg ∈ G (thus, 〈j, g〉 is a unimodular complex number), or, by Pontryagin duality,of the character g ∈ G on the group element j ∈ Γ. The set

σ(a) := j ∈ Γ : aj = 0will be called the Fourier spectrum of a given by (1). Since Γ is written additivelyand G multiplicatively, we have

〈α + β, g〉 = 〈α, g〉 · 〈β, g〉, α, β ∈ Γ, g ∈ G,

〈α, gh〉 = 〈α, g〉 · 〈α, h〉, α ∈ Γ, g, h ∈ G.

We will use the shorthand notation eα for the function eα(g) = 〈α, g〉, g ∈ G.Thus, eα+β = eαeβ, α, β ∈ Γ.

The set of all symbols of elements a ∈ 1(Γ) forms an algebra W (G) ofcontinuous functions on G. The algebra W (G) (with pointwise multiplication andaddition) is isomorphic to 1(Γ). Denote by W (G)+ (resp., W (G)−) the algebraof symbols of elements in 1(Γ+) (resp., 1(Γ−)).

We have the following result. For every unital Banach algebra A we denoteits group of invertible elements by G(A).

Theorem 1. Let G be a compact abelian group with character group Γ, and letW (G)n×n be the corresponding Wiener algebra of n × n matrix functions. ThenA ∈ G(W (G)n×n) if and only if A(g) ∈ G(Cn×n) for every g ∈ G.

This is an immediate consequence of Theorem A.1 in [8] (also proved in [1],and see [14]).

We now consider the discrete abelian subgroup Γ′ of Γ and denote its char-acter group by G′. Then we introduce the annihilator

Λ = g ∈ G : 〈j, g〉 = 1 for all j ∈ Γ′, (2)

Page 429: Recent Advances in Operator Theory and its Applications: The Israel Gohberg Anniversary Volume

Factorization with Off-diagonal binomials 427

which is a closed subgroup of G and hence a compact group. According to Theorem2.1.2 in [17], we have G′ + (G/Λ).

Let us now introduce the natural projection π : G → G/Λ. We observethat the above theorem also applies to W (G′)n×n. Given A ∈ 1(Γ)n×n with itsFourier spectrum restricted to Γ′ (i.e., Aj = 0 for j ∈ Γ \Γ′), we have two symboldefinitions:

AΓ(g) =∑j∈Γ′

Aj〈j, g〉, g ∈ G,

AΓ′(g) =∑j∈Γ′

Aj〈j, g〉, g ∈ G′,

where we have taken into account that Aj = 0 for j ∈ Γ \ Γ′. The latter can bereplaced by

AΓ′([g]) =∑j∈Γ′

Aj〈j, g〉, [g] ∈ (G/Λ),

where [g] = π(g) for g ∈ G. Obviously, 〈j, g〉 only depends on [g] = π(g) if j ∈ Γ′.(If [g1] = [g2], then g1g

−12 ∈ Λ and hence 〈j, g1g

−12 〉 = 1 for all j ∈ Γ′, which implies

the statement.) Thus the two symbol definitions are equivalent in the sense thatthe value of “the” symbol A on g ∈ G only depends on [g] = π(g).

Theorem 2. Let Γ′ be a subgroup of the discrete abelian group Γ, let G and G′

be the character groups of Γ and Γ′, respectively, and let Λ be defined by (2). IfA ∈ W (G)n×n is an element which has all of its Fourier spectrum within Γ′, thenA ∈ G(W (G′)n×n) if and only if A(g) ∈ G(Cn×n) for every g ∈ G.

For the proof see [14].We now consider factorizations. A (left) factorization of A ∈ (W (G))n×n is a

representation of the form

A(g) = A+(g) (diag (ej1(g), . . . , ejn(g)))A−(g), g ∈ G, (3)

where A+ ∈ G((W (G)+)n×n), A− ∈ G((W (G)−)n×n), and j1, . . . , jn ∈ Γ. Hereand elsewhere we use diag (x1, . . . , xn) to denote the n × n diagonal matrix withx1, . . . , xn on the main diagonal, in that order. The elements jk are uniquelydefined (if ordered j1 ' j2 ' · · · ' jn); this can be proved by a standard argument(see [9, Theorem VIII.1.1]). The elements j1, . . . , jn in (3) are called the (left)factorization indices of A.

If all factorization indices coincide with the zero element of Γ, the factor-ization is called canonical. If a factorization of A exists, the function A is calledfactorizable. For Γ = Z and G the unit circle, the definitions and the results areclassical [10], [9], [4]; many results have been generalized to Γ = Rk (see [2] and ref-erences there), and Γ a subgroup of Rk (see [15],[16]). The notion of factorizationin the abstract abelian group setting was introduced and studied, in particular,for block triangular matrices, in [14]. The present paper can be thought of as afollow up of [14].

Page 430: Recent Advances in Operator Theory and its Applications: The Israel Gohberg Anniversary Volume

428 C.V.M. van der Mee, L. Rodman and I.M. Spitkovsky

In this paper we prove the following result.

Theorem 3. Let A have the form

A(g) =[

eλ(g)Ip 0c1eσ(g)− c2eµ(g) e−λ(g)Iq

], g ∈ G, (4)

and assume that λ - 0, µ - σ, and

nµ ≺ λ, nσ ≺ λ for all integers n. (5)

Then A admits a factorization if and only if

rank (λ1c1 − λ2c2) = maxrank (z1c1 − z2c2) : z1, z2 ∈ Cfor every λ1, λ2 ∈ C satisfying |λ1| = |λ2| = 1. (6)

Moreover, in case a factorization exists, the factorization indices of A belong tothe set

±σ,±µ,±λ, λ− (µ− σ), . . . , λ−minp, q(µ− σ).

We emphasize that the setting of Theorem 3 is a non-archimedean linearlyordered abelian group (Γ,$), in contrast with the archimedean linear order ofR and its subgroups. The setting of non-archimedean, as well as archimedean,linearly ordered abelian subgroups was studied in [14].

2. Preliminary results on factorization

Theorem 4. If A admits a factorization (3), and if the Fourier spectrum σ(A) isbounded:

λmin ' σ(A) ' λmax,

for some λmin, λmax ∈ Γ, then the factorization indices are also bounded with thesame bounds

λmin ' jk ' λmax, k = 1, 2, . . . , n, (7)

and moreover,σ(A−) ⊆ j ∈ Γ : −λmax + λmin ' j ' 0, (8)

andσ(A+) ⊆ j ∈ Γ : 0 ' j ' λmax − λmin. (9)

Proof. We follow well-known arguments. Rewrite (3) in the form

A−1+ (e−λminA) = e−λminΛA−.

Since the left-hand side is in W (G)n×n+ , so is the right-hand side, and we have

jk $ λmin for all k = 1, . . . , n (otherwise, A− would contain a zero row, which isimpossible because A− is invertible). Analogously the second inequality in (7) isproved. Now

eλmax−λminA− =(eλmaxΛ

−1)A−1

+ (e−λminA)

Page 431: Recent Advances in Operator Theory and its Applications: The Israel Gohberg Anniversary Volume

Factorization with Off-diagonal binomials 429

is a product of three matrix functions in W (G)n×n+ , and therefore also

eλmax−λminA− ∈W (G)n×n+ .

This proves (8); (9) is proved analogously. It follows from the proof that “one-sided” bounds are valid for the factoriza-

tion indices:

λmin ' σ(A) =⇒ λmin ' jk, for k = 1, 2, . . . , n;

σ(A) ' λmax =⇒ jk ' λmax for k = 1, 2, . . . , n.For future use we record the next corollary of Theorem 4. On Γ+ \0 we considerthe equivalence relation (cf. [6])

i ∼ j ⇐⇒ ∃n,m ∈ N : (ni - j and mj - i).

Here N is the set of positive integers. Any such i, j are called archimedeally equiv-alent (with respect to (Γ,')). The set Arch(Γ,') of archimedean equivalenceclasses, which are additive semigroups (in the sense that they are closed underaddition), can be linearly ordered in a natural way. Given J ∈ Arch(Γ,'), it iseasily seen that

ΓJ := i− j : i, j ∈ Jis the smallest additive subgroup of Γ containing J and that ΓJ in fact containsall archimedean components ' J in Arch(Γ,').

Before proceeding we first discuss some illustrative examples.a. If Zk is ordered lexicographically, in increasing order the archimedean com-

ponents are as follows: J0 = (0), J1 = (0)k−1 × N, J2 = (0)k−2 × N × Z,J3 = (0)k−3 × N × Z2, . . ., Jk−1 = (0)1 × N× Zk−2, and Jk = N × Zk−1.

b. Z2 with linear order (i1, i2) - (0, 0) whenever i1+i2√

5 > 0. Then the orderedgroup is archimedean and in increasing order the archimedean componentsare J0 = (0) and J1 = i ∈ Z2 : i - 0.

c. Let (i1, i2) - (0, 0) whenever i1 + i2 > 0. Then in increasing order thearchimedean components are J0 = (0), J1 = (j,−j) : j ∈ N, and J2 =(i1, i2) : i1 + i2 > 0.We now have the following corollary.

Corollary 5. If A admits a factorization (3), and if the Fourier spectrum σ(A) iscontained in ΓJ for some J ∈ Arch(Γ,'), then

jk ∈ ΓJ , k = 1, . . . , n,

andσ(A±1

− ) ∈ ΓJ , σ(A±1+ ) ∈ ΓJ .

Indeed, in addition to using Theorem 4 we need only to observe that if X ∈G(W (G)n×n) is such that σ(X) ⊆ Γ′ for some subgroup Γ′ ⊆ Γ, then σ(X−1) ⊆Γ′, and apply this observation for X = A± (the observation follows easily fromTheorem 2).

Page 432: Recent Advances in Operator Theory and its Applications: The Israel Gohberg Anniversary Volume

430 C.V.M. van der Mee, L. Rodman and I.M. Spitkovsky

Corollary 5 may be considered as asserting the hereditary property of Fourierspectra for additive subgroups of Γ of the form Γj . We say that a subgroup Γ′ of Γhas the hereditary property if for each matrix function A that admits a factorization(3), and the Fourier spectrum of A is contained in Γ′, we have that the factorizationindices as well as the Fourier spectra ofA± and of A−1

± are also contained in Γ′. Thisnotion was introduced in [16] for Γ the additive group Rk; the hereditary propertyof certain subgroups of Rk was proved there as well. It is an open question whetheror not the hereditary property holds for every subgroup of the character group ofevery connected compact abelian group.

A factorization (3) will be called finitely generated if the Fourier spectra ofA+ and of A− are contained in some finitely generated subgroup of Γ. Clearly, anecessary condition for existence of a finitely generated factorization of A is thatthe Fourier spectrum of A is contained in a finitely generated subgroup of Γ. Weshall prove below that this condition is also sufficient.

In the proof of the following theorem we make use of a natural projection: IfB ∈ W (G)n×n is given by the series

B(g) =∑j∈Γ

Bj〈j, g〉, g ∈ G,

and if Ω is a subset of Γ, we define BΩ by

BΩ(g) =∑j∈Ω

Bj〈j, g〉, g ∈ G.

Clearly, BΩ ∈ W (G)n×n and the Fourier spectrum of BΩ is contained in Ω.

Theorem 6. If A ∈ W (G)n×n is factorizable, and if the Fourier spectrum of A iscontained in a finitely generated subgroup of Γ, then A admits a finitely generatedfactorization.

Proof. Let Γ be a finitely generated subgroup of Γ that contains the Fourier spec-trum of A. Let (3) be a factorization of A. Since (W (G)±)n×n are unital Banachalgebras, the set of invertible elements G((W (G)±)n×n) is open in (W (G)±)n×n.Thus, there exists a finitely generated subgroup Γ of Γ with the following proper-ties:(a) Γ contains Γ;(b) Γ contains the elements j1, . . . , jn;(c) (A−)Ω and (A−1

+ )Ω are invertible in (W (G)±)n×n for every set Ω ⊇ Γ.For verification of (c), note the following estimate:

‖A− − (A−)Ω‖(W (G)±)n×n =∑

j∈Γ\Ω‖(A−)j‖ ≤

∑j∈Γ\Γ

‖(A−)j‖

= ‖A− − (A−)Γ‖(W (G)±)n×n .

Letting G be the dual group of Γ, by Theorem 2 and (3) we have

(A−1+ )Γ, (A−)Γ ∈ G((W (G)±)n×n).

Page 433: Recent Advances in Operator Theory and its Applications: The Israel Gohberg Anniversary Volume

Factorization with Off-diagonal binomials 431

Rewrite the equality (3) in the form

(A+(g))−1A(g) = (diag (ej1(g), . . . , ejn(g)))A−(g).

Write also (omitting the argument g ∈ G in the formulas)((A−1

+ )Γ + ((A+)−1)Γ+\Γ)A

= (diag (ej1 , . . . , ejn))((A−)Γ + (A−)Γ−\Γ

). (10)

Since j1, . . . , jn ∈ Γ and the Fourier spectrum of A is contained in Γ, (10) implies

(A−1+ )ΓA = (diag (ej1 , . . . , ejn)) (A−)Γ.

Rewriting this equality in the form

A =((A−1

+ )Γ)−1

(diag (ej1 , . . . , ejn)) (A−)Γ,

we obtain a finitely generated factorization of A.

Theorem 7. Let A be given as in Theorem 3 with p = q, and assume that (5) holds.If the matrix c1 is invertible and the spectrum of c−1

1 c2 does not intersect the unitcircle, or if c2 is invertible and the spectrum of c−1

2 c1 does not intersect the unitcircle, then A admits a finitely generated factorization. Moreover, the factorizationindices belong to the set ±σ,±µ.

For the proof see [14]. In fact, the proof of Theorem 7 shows more detailedinformation about the factorization indices:

Theorem 8. Under the hypotheses of Theorem 7, assume that c1 is invertible andthe spectrum of c−1

1 c2 does not intersect the unit circle, and let r be the dimensionof the spectral subspace of c−1

1 c2 corresponding to the eigenvalues inside the unitcircle. Then the factorization indices of A are σ (r times), −σ (r times), µ (p− rtimes), and −µ (p− r times).

If c2 is invertible and the spectrum of c−12 c1 does not intersect the unit circle,

then the factorization indices of A are µ (r times), −µ (r times), σ (p− r times),and −σ (p− r times), where r be the dimension of the spectral subspace of c−1

2 c1corresponding to the eigenvalues inside the unit circle.

Finally, we present a result concerning linearly ordered groups that will beused in the next section.

Proposition 9. Let (Γ,') be a finitely generated additive ordered abelian group.Let Γ0 stand for the additive subgroup of Γ generated by all archimedean equiva-lence classes preceding the archimedean equivalence class E. Then there exists anadditive subgroup Γ1 of Γ such that the direct sum decomposition

Γ = Γ0+Γ1 (11)

holds and the coordinate projection Γ → Γ1 is '-order preserving.

Page 434: Recent Advances in Operator Theory and its Applications: The Israel Gohberg Anniversary Volume

432 C.V.M. van der Mee, L. Rodman and I.M. Spitkovsky

Proof. With no loss of generality we assume that Γ = Zk and that the order' on Zk

has been extended to a so-called term order on Rk. That is, if x ' y in Rk, z ∈ Rk

and c ≥ 0, then x+z ' y+z and cx ' cy. Such an extension is always possible but isoften nonunique [3]. There now exists an orthonormal basis e1, . . . , ek of Rk anda decreasing sequence H0, H1, . . . , Hk of linear subspaces of Rk with dimHr =k − r (r = 0, 1, . . . , k) such that er - 0, er ∈ Hr−1 and er ⊥ Hr (r = 1, . . . , k)(cf. [5]). Here we note that the orthonormal basis is completely determined by theterm order ' on Rk, with the one-to-one correspondence between term order (onRk) and orthonormal basis given by

x = (x1, . . . , xk) - (0, . . . , 0) ⇔

⎧⎪⎪⎪⎪⎨⎪⎪⎪⎪⎩x1 > 0 orx1 = 0 and x2 > 0, or...x1 = · · · = xk−1 = 0 and xk > 0.

Indeed, put H0 = Rk and let H1 stand for the set of those points in Rk

all of whose neighborhoods contain elements of both Γ+ and Γ−. Then H1 is alinear subspace of Rk of dimension k − 1 [5]. We now let e1 be the unique unitvector in Rk that is '-positive and orthogonal to H1 and restrict the term orderto H1. We now repeat the same construction in H1 and find a linear subspace H2

of H1 of dimension k− 2 and a unique '-positive unit vector e2 in H1 orthogonalto H2. After finitely many such constructions we arrive at the sequence of linearsubspaces Rk = H0 ⊃ H1 ⊃ · · · ⊃ Hk−1 ⊃ Hk = 0 and the orthonormal basise1, . . . , ek of Rk as indicated above.

Next, let Hr be the smallest linear subspace of Rk spanned by Hr ∩ Zk

(r = 0, 1, . . . , k). From this nonincreasing set of linear subspaces of Rk we selecta maximal strictly decreasing set of nontrivial linear subspaces Rk = L0 ⊃ L1 ⊃· · · ⊃ Lµ−1 = 0. Also let ν be the largest among the integers s ∈ 1, . . . , ksuch that Lµ−1 is spanned by Hs−1 ∩ Zk; then Hν ∩ Zk = 0. If µ = 1, we haveH1 ∩ Zk = 0, so that the ordered group (Zk,') is archimedean; in that casei → ξ1(i)

def= (i, e1) (i.e., the signed distance from i to H1) is an order preservinggroup homomorphism from (Zk,') into R. On the other hand, if µ ≥ 2, we let(i) ξ1(i) stand for the signed distance from i to H1 and p1(i) for the orthogonalprojection of i onto L1, (ii) ξr(i) for the signed distance from pr−1(i) to Hq forq = mins : Lr = span(Hs∩Zk) and pr(i) for the orthogonal projection of pr−1(i)onto Lr (r = 2, . . . , µ−1), and finally (iii) ξµ(i) as the signed distance from pµ−1(i)to Hν . In this way

iϕ→ (ξ1(i), . . . , ξµ(i))

is an order preserving group homomorphism from (Zk,') into Rµ with lexico-graphical order. It then appears that µ is the number of nontrivial (i.e., differentfrom 0) archimedean components. Moreover, in increasing order the archimedeancomponents of (Γ,') are now as follows:

J0 = 0, Jr = [Lµ−r ∩ Γ+] \ ∪r−1s=0 Js (r = 1, . . . , µ). (12)

Page 435: Recent Advances in Operator Theory and its Applications: The Israel Gohberg Anniversary Volume

Factorization with Off-diagonal binomials 433

The additive subgroups of Zk generated by the smallest archimedean componentsare as follows:

ΓJ0 = 0, ΓJr = Lµ−r ∩ Γ (r = 1, . . . , µ). (13)

Let us now define the group homomorphisms πr on Γ with image ΓJrand qr with

kernel ΓJrby π0 = 0, q0 equal the identity, and

πri = ϕ−1(0, . . . , 0, ξµ−r+1(i), . . . , ξµ(i)),qri = ϕ−1(ξ1(i), . . . , ξµ−r(i), 0, . . . , 0).

(14)

Then the fact that the linear order on ϕ[Zk] ⊂ Rµ is lexicographical, impliesthat the additive group homomorphisms q0, q1, . . . , qµ are order preserving, butπ1, . . . , πµ−1 are not. Putting Γ′

Jr= qr[Γ] we obtain the direct sum decomposition

Γ = ΓJr +Γ′Jr, r = 0, 1, . . . , µ,

which completes the proof.

3. Proof of Theorem 3

Using Theorem 6 we can assume without loss of generality that Γ is finitely gen-erated, and furthermore assume that Γ = Zk for some positive integer k.

Consider the part “if”. Applying the transformation

c1 → Sc1T, c2 → Sc2T,

for suitable invertible matrices S and T , we may assume that the pair (c1, c2) isin the Kronecker normal form (see, e.g, [7]); in other words, c1 and c2 are directsums of blocks of the following types:

(a) c1 and c2 are of size k × (k + 1) of the form

c1 =[Ik 0k×1

], c2 =

[0k×1 Ik

].

(b) c1 and c2 are of size (k + 1)× k of the form

c1 =[

Ik

01×k

], c2 =

[01×k

Ik

].

(c) c1 is the k × k upper triangular nilpotent Jordan block, denoted by Vk, andc2 = Ik.

(d) c1 = Ik, and c2 = Vk.(e) c1 and c2 are both invertible of the same size.(f) c1 and c2 are both zero matrices of the same size.

Note that if c1 (resp., c2) is invertible, then condition (6) is equivalent to thecondition that the spectrum of c−1

1 c2 (resp., of c1c−12 ) does not intersect the unit

circle. Thus, by Theorem 7 we are done in cases (c), (d), and (e), as well as in thetrivial case (f).

Page 436: Recent Advances in Operator Theory and its Applications: The Israel Gohberg Anniversary Volume

434 C.V.M. van der Mee, L. Rodman and I.M. Spitkovsky

Consider the cases (a) and (b), where the condition (6) is obviously satisfied.We follow arguments similar to those presented in [13], and also in the proof of[14, Theorem 7].

Let Jk be the k×k matrix with 1’s along the top-right to the left-bottom diag-onal and zeros in all other positions. If A(g) = [ai,j(g)]ni,j=1 ∈ (W (G))n×n, then A∗

will denote the matrix function defined by [aj,i(g)]ni,j=1; clearly, A∗ ∈ (W (G))n×n,and if A ∈ (W (G)±)n×n, then A∗ ∈ (W (G)∓)n×n. The transformation

A →[

0 Jk+1

Jk 0

]A∗

[0 Jk

Jk+1 0

]transforms the case (b) to the case (a). Thus, it will suffice to consider the case(a):

A =

⎡⎣ eλIk 0 00 eλ 0

eσIk − eµVk h e−λIk

⎤⎦ , where h =[

0(k−1)×1

−eµ

].

Let

B+ =

⎡⎣ Ik − eµ−σVk b −eλ−σIk

0 1 00 0

∑k−1j=0 ej(µ−σ)V

jk

⎤⎦ , where b =[

0(k−1)×1

−eµ−σ

],

B− =

⎡⎣ ∑k−1j=0 ej(µ−σ)−λ−σV

jk 0 Ik

0 1 0−Ik 0 0

⎤⎦ .

Clearly,

B+ ∈ G((W (G)+)(2k+1)×(2k+1)) and B− ∈ G((W (G)−)(2k+1)×(2k+1))

(the latter inclusion follows from (5) and from µ $ σ). A direct computation showsthat

Φ0 := B+AB− =

⎡⎣ e−σIk 0 00 eλ 00 hk eσIk

⎤⎦ ,where

(hk)T =[−e(k−1)(µ−σ)+µ . . . −e(µ−σ)+µ −eµ

]. (15)

Define for j = 0, 1, . . . , k − 1 the auxiliary matrices

R+,k−j =

⎡⎣ 1 0 eλ−µ−j(µ−σ)

0 Ik−j−1 hk−j−1e−σ

0 0 1

⎤⎦ ,R−,k−j =

⎡⎣ eσ−µ 0 −10 Ik−j−1 01 0 0

⎤⎦ , Rk−j =[eλ−j(µ−σ) 0hk−j eσIk−j

].

Clearly, R−,k−j ∈ G((W (G)−)(k−j+1)×(k−j+1)), and in view of (5),

R+,k−j ∈ G((W (G)+)(k−j+1)×(k−j+1)).

Page 437: Recent Advances in Operator Theory and its Applications: The Israel Gohberg Anniversary Volume

Factorization with Off-diagonal binomials 435

We also have the recurrence relations

R+,k−jRk−jR−,k−j =[Rk−j−1 0

0 eµ

], R0 = eλ−k(µ−σ), (16)

for j = 0, . . . , k − 1. Note that Φ0 = diag (e−σIk, Rk). Applying consecutively(16) for j = 0, . . . , k − 1, we obtain a factorization A = A+ΛA− with Λ =diag (e−σIk, eλ−k(µ−σ), eµIk). This completes the proof of the “if” part of the the-orem.

For the part “only if”, we make use of the archimedean structure on Γ =(Zk,') (see the previous section). Let Γ0 be the subgroup of Γ generated by allarchimedean classes of Γ that are≺ λ. Condition (5) guarantees that 0 = µ−σ ∈ Γ0

and hence that Γ0 = 0. Since

α ∈ Γ, nα ∈ Γ0 for some n ∈ N =⇒ α ∈ Γ0,

it follows thatΓ = Γ0+Γ1, (17)

a direct sum, for some subgroup Γ1 of Γ = Zk, where the coordinate projectiononto Γ1 along Γ0 is order preserving (Proposition 9). Also, by [11, Theorem 23.18],we may assume

G = G0 ×G1, (18)where Gj is the character group of Γj , j = 0, 1. We write

λ = λ0 + λ1, µ = µ0 + µ1, σ = σ0 + σ1,

in accordance with (17). By construction of Γ0, we have λ1 - 0, and by (5)µ, σ ∈ Γ0, and so µ1 = σ1 = 0.

Assume that A has a factorization

A(g) = A+(g) (diag (ej1(g), . . . , ejn(g)))A−(g), g ∈ G. (19)

In accordance with (17) and (18) write

jk = jk,0 + jk,1, jk,0 ∈ Γ0, jk,1 ∈ Γ1, k = 1, . . . , n,

g = g0g1, g0 ∈ G0, g1 ∈ G1,

and consider the equation (19) in which g0 is kept fixed, whereas g1 is kept variable.To emphasize this interpretation, we write (19) in the form

Ag0 (g1) = A+,g0(g1)(diag (ej1,0(g0), . . . , ejn,0(g0))

(diag (ej1,1(g1), . . . , ejn,1(g1))

)A−,g0(g1). (20)

We consider Γ1 with the linear order induced by (Γ,'). Since the property thatα = α0 + α1 ∈ Γ±, where αj ∈ Γj , j = 0, 1, implies that α1 ∈ (Γ1)±, we obtain

A±,g0 ∈ G((W (G1)±)n×n)

for every g0 ∈ Γ0. Thus, (20) is in fact a factorization of Ag0(g1) whose factorizationindices are j1,1, . . . , jn,1, and moreover we have the following property:

(ℵ) the factorization indices of Ag0(g1) are independent of g0 ∈ G0.

Page 438: Recent Advances in Operator Theory and its Applications: The Israel Gohberg Anniversary Volume

436 C.V.M. van der Mee, L. Rodman and I.M. Spitkovsky

Arguing by contradiction, we assume that

rank (λ1c1 − λ2c2) < maxrank (z1c1 − z2c2) : z1, z2 ∈ Cfor some λ1, λ2 ∈ C satisfying |λ1| = |λ2| = 1. (21)

A contradiction will be obtained with Property (ℵ). We can assume, using theKronecker normal from (see, e.g., [7]), that c1 and c2 have the form

c1 = diag (c1,1, . . . , c1,s), c2 = diag (c2,1, . . . , c2,s),

where each pair of blocks (c1,w, c2,w) has one of the forms (a) - (f). After a permu-tation transformation, we obtain (keeping the same notation for the transformedAg0(g1)):

Ag0 (g1) = diag (Ag0,1(g1), . . . , Ag0,s(g1)),

where

Ag0,w(g1) =[

eλ1(g1)Ipw 0c1,weβ(g0)− c2,weκ(g0) e−λ1(g1)Iqw

]Q, w = 1, . . . , s,

with β, κ ∈ Γ0 independent of w, and Q is a diagonal matrix (also independentof w) with terms of the form eα(g0), α ∈ Γ0 on the main diagonal. Note thatβ = κ (otherwise we would have µ = σ, which is excluded by the hypotheses ofthe theorem). The “if” part of the theorem shows that Ag0,w(g1) is factorable withindices independent of g0 if the pair (c1,w, c2,w) has one of the forms (a), (b), (c),(d), and (f).

Suppose that the pair (c1,w, c2,w) is of the form (e). Then we may furtherassume that c1,w = I and c2,w is in the Jordan form:

c2,w = Jτ1(ρ1)⊕ · · · ⊕ Jτu(ρu),

where Jτj (ρj) is the upper triangular τj × τj Jordan block with the eigenvalueρj (for notational simplicity, we suppress the dependence of ρj , τj , and u on win the notation used). Accordingly, after a permutation transformation we haveAg0,w(g1)Q−1 in the following form:

Ag0,w(g1) = diag (Ag0,w,1(g1), . . . , Ag0,w,u(g1)),

where

Ag0,w,j(g1) =[

eλ1(g1)Iτj 0eβ(g0)Iτj − eκ(g0)Jτj (ρj) e−λ1(g1)Iτj

], j = 1, . . . , u.

If |ρj | = 1, then by the “if” part of the theorem, the factorization indices ofAg0,w,j(g1) are independent of g0 (this can be also checked directly). Assume |ρj | =1; then the factorization indices of Ag0,w,j(g1) equal zero if

eβ(g0)− ρjeκ(g0) = 0. (22)

Indeed, [eλ1I 0S e−λ1I

]=[I eλ1S

−1

0 I

] [0 −S−1

S e−λ1I

],

Page 439: Recent Advances in Operator Theory and its Applications: The Israel Gohberg Anniversary Volume

Factorization with Off-diagonal binomials 437

whereS = eβ(g0)Iτj − eκ(g0)Jτj (ρj).

If |ρj | = 1 andeβ(g0)− ρjeκ(g0) = 0, (23)

then the factorization indices of Ag0,w,j(g1) are zeros (2(τj − 1) times) and ±λ1.It follows that for the values of g0 such that

eβ(g0)− ρeκ(g0) = 0 (24)

for any eigenvalue ρ of c2,w the factorization indices of Ag0,w(g1) are all zeros,whereas in case the equality

eβ(g0)− ρeκ(g0) = 0 (25)

holds for some eigenvalue ρ of c2,w not all factorization indices of Ag0,w(t1) arezeros. Since κ = β, the range of the function

eβ−κ(g0) = eβ(g0) (eκ(g0))−1

coincides with the unit circle (since G is connected and the characters are con-tinuous), and therefore by hypothesis (21) there do exist eigenvalues ρ of c2,w forwhich (25) holds. We obtain a contradiction with Property ℵ.

This completes the proof of Theorem 3.

4. Invertibility vs factorizability

The following conjecture was stated in [14].

Conjecture 10. Every function A ∈ G(W (G)n×n) admits a factorization if andonly if Γ (as an abstract group without regard to $ ) is isomorphic to a subgroupof the additive group of rational numbers Q.

Regarding this conjecture we quote a result from [14]:

Theorem 11. If Γ is not isomorphic to a subgroup of Q, then there exists a 2 × 2matrix function of the form

A(g) =[

eλ(g) 0c1eα1(g) + c2 + c3eα3(g) e−λ(g)

], g ∈ G, (26)

where λ, α1, α2, α3 ∈ Γ, and c1, c2, c3 ∈ C, which does not admit a factorizationwith the factors A± and their inverses A−1

± having finite Fourier spectrum.

We improve on Theorem 11:

Theorem 12. If Γ is not isomorphic (as an abstract group ) to a subgroup of Q,then there exists a 2× 2 matrix function of the form (26) which is not factorable.

Proof. Consider two cases: (1) Γ is archimedean. Then (Γ,$) is isomorphic to asubgroup of the additive group of real numbers (Holder’s theorem, see, e.g., [6])and since Γ is not isomorphic to a subgroup of Q, there exist non-commensurable

Page 440: Recent Advances in Operator Theory and its Applications: The Israel Gohberg Anniversary Volume

438 C.V.M. van der Mee, L. Rodman and I.M. Spitkovsky

elements x, y ∈ Γ \ 0. Using x and y, a known construction (see [12], also [2,Section 8.5]) may be used to produce a 2×2 matrix function of the required form.

(2) Γ is not archimedean. Then there exist σ = 0 ≺ µ ≺ λ ∈ Γ such that (5)holds. Theorem 3 now implies that the function[

eλ(g) 01− eµ(g) e−λ(g)

], g ∈ G,

is not factorable. Theorem 12 and its proof show that if Γ is not isomorphic to a subgroup of

Q, then there exists a 2× 2 matrix function of the form

A(g) =k∑

j=1

cjeαj (g), detA(g) ≡ 1,

which is not factorable, with k = 5 if Γ is archimedean, and k = 4 if Γ is notarchimedean. On the other hand, for every linearly ordered group Γ, every n× nmatrix function of the form

A(g) = c1eα1(g) + c2eα2(g)

with detA(g) = 0, g ∈ G, is factorable. Indeed, this follows easily from the Kro-necker form of the pair of matrices (c1, c2). This leaves the following problem open:

Problem 13. Assume that Γ is not isomorphic to a subgroup of Q.(a) If the subgroup generated by α1, α2, α3 ∈ Γ is not archimedean, prove or

disprove that every n× n matrix function of the form

A(g) = c1eα1(g) + c2eα2(g) + c3eα3(g)

with detA(g) = 0, g ∈ G, is factorable.(b) If Γ is archimedean, prove or disprove that every n × n matrix function of

the form A(g) =∑k

j=1 cjeαj(g) with detA(g) = 0, g ∈ G, and with k = 3 ork = 4, is factorable.

References

[1] G.R. Allan, One-sided inverses in Banach algebras of holomorphic vector-valued functions, J. London Math. Soc. 42, 463–470 (1967).

[2] A. Böttcher, Yu.I. Karlovich, and I.M. Spitkovsky, Convolution Operators and Factorization of Almost Periodic Matrix Functions, Birkhäuser OT 131, Basel and Boston, 2002.

[3] L. Cerlienco and M. Mureddu, Rappresentazione matriciale degli ordini l.c. su R^n e su N^n, Rend. Sem. Fac. Sc. Univ. Cagliari 66, 49–68 (1996).

[4] K.F. Clancey and I. Gohberg, Factorization of Matrix Functions and Singular Integral Operators, Birkhäuser OT 3, Basel and Boston, 1981.

[5] J. Erdős, On the structure of ordered real vector spaces, Publ. Math. Debrecen 4, 334–343 (1956).

[6] L. Fuchs, Partially Ordered Algebraic Systems, Pergamon Press, Oxford, 1963.

[7] F.R. Gantmacher, Applications of the Theory of Matrices, Interscience Publishers, New York, 1959. (Translation from Russian.)

[8] I.C. Gohberg and Yu. Leiterer, Factorization of operator functions with respect to a contour. II. Canonical factorization of operator functions close to the identity, Math. Nachr. 54, 41–74 (1972). (Russian)

[9] I.C. Gohberg and I.A. Feldman, Convolution Equations and Projection Methods for their Solution, Transl. Math. Monographs 41, Amer. Math. Soc., Providence, R.I., 1974.

[10] I.C. Gohberg and M.G. Krein, Systems of integral equations on a half line with kernels depending on the difference of arguments, Amer. Math. Soc. Transl. (2) 14, 217–287 (1960).

[11] E. Hewitt and K.A. Ross, Abstract Harmonic Analysis I, 2nd edition, Springer-Verlag, Berlin, Heidelberg, New York, 1979.

[12] Yu.I. Karlovich and I.M. Spitkovsky, On the Noether property for certain singular integral operators with matrix coefficients of the class SAP and the systems of convolution equations on a finite interval connected with them, Soviet Math. Doklady 27, 358–363 (1983).

[13] Yu.I. Karlovich and I.M. Spitkovsky, Factorization of almost periodic matrix functions and (semi)-Fredholmness of some convolution type equations, No. 4421–85 dep., VINITI, Moscow, 1985. (Russian)

[14] C.V.M. van der Mee, L. Rodman, I.M. Spitkovsky, and H.J. Woerdeman, Factorization of block triangular matrix functions in Wiener algebras on ordered abelian groups. In: J.A. Ball, J.W. Helton, M. Klaus, and L. Rodman (eds.), Current Trends in Operator Theory and its Applications, Birkhäuser OT 149, Basel and Boston, 2004, pp. 441–465.

[15] L. Rodman, I.M. Spitkovsky, and H.J. Woerdeman, Carathéodory–Toeplitz and Nehari problems for matrix-valued almost periodic functions, Trans. Amer. Math. Soc. 350, 2185–2227 (1998).

[16] L. Rodman, I.M. Spitkovsky, and H.J. Woerdeman, Noncanonical factorizations of almost periodic multivariable matrix functions, Operator Theory: Advances and Applications 142, 311–344 (2003).

[17] W. Rudin, Fourier Analysis on Groups, John Wiley, New York, 1962.

Cornelis V.M. van der Mee
Dipartimento di Matematica e Informatica
Università di Cagliari
Viale Merello 92
I-09123 Cagliari, Italy
e-mail: [email protected]

Leiba Rodman and Ilya M. Spitkovsky
Department of Mathematics
The College of William and Mary
Williamsburg, VA 23187-8795, USA
e-mail: [email protected]
e-mail: [email protected]


Operator Theory: Advances and Applications, Vol. 160, 441–468
© 2005 Birkhäuser Verlag Basel/Switzerland

Closely Connected Unitary Realizations of the Solutions to the Basic Interpolation Problem for Generalized Schur Functions

Gerald Wanjala

Abstract. A generalized Schur function which is holomorphic at z = 0 can be written as the characteristic function of a closely connected unitary colligation with a Pontryagin state space. We describe the closely connected unitary colligation of a solution s(z) of the basic interpolation problem for generalized Schur functions (studied in [3]) in terms of the interpolation data and the canonical unitary colligation of the parameter function s1(z) appearing in the formula for s(z). In particular, we consider the case where the interpolation data and the Taylor coefficients of s1(z) at z = 0 are real. We also show that the canonical unitary colligation of s1(z) can be recovered from that of s(z).

Mathematics Subject Classification (2000). Primary 47A48, 47B32, 47B50.

Keywords. Schur transform, generalized Schur function, reproducing kernel Pontryagin space, J-unitary colligation, closely connected colligation, realization.

1. Introduction

We recall that a Schur function is a holomorphic function s(z) defined on the open unit disc D with the property that |s(z)| ≤ 1, z ∈ D, and that a generalized Schur function s(z) with κ negative squares is a meromorphic function on D of the form

s(z) = \prod_{j=1}^{\kappa} \frac{1 - \alpha_j^* z}{z - \alpha_j}\, s_0(z), \qquad (1.1)

where α_j ∈ D and s_0(z) is a Schur function with s_0(α_j) ≠ 0, j = 1, 2, . . . , κ. Here κ is a nonnegative integer; evidently, a generalized Schur function with zero negative squares is a Schur function.
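A minimal illustration of (1.1), not taken from the paper: with κ = 1, α_1 = 1/2, and the constant Schur function s_0(z) ≡ 1 one obtains a generalized Schur function with one negative square.

% Hypothetical example (not from the paper): take \kappa = 1, \alpha_1 = 1/2, s_0 \equiv 1 in (1.1):
\[
  s(z) \;=\; \frac{1 - \tfrac12 z}{\,z - \tfrac12\,}.
\]
% s is meromorphic on D with a single pole at z = 1/2, |s(z)| = 1 on |z| = 1,
% and s_0(\alpha_1) = 1 \neq 0 as required.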


A generalized Schur function s(z) which is holomorphic at z = 0 determines and is determined by a realization of the form

s(z) = s(0) + z \langle (1 - zA)^{-1} u, v \rangle_{P}, \qquad (1.2)

where A is an operator on some Pontryagin space P, u, v ∈ P, and the colligation

U = \begin{pmatrix} A & u \\ \langle\,\cdot\,, v\rangle & s(0) \end{pmatrix} : \begin{pmatrix} P \\ \mathbb{C} \end{pmatrix} \to \begin{pmatrix} P \\ \mathbb{C} \end{pmatrix}

is unitary, that is, UU^* = U^*U = I, and closely connected, which means

P = \operatorname{span}\{ A^m u,\ A^{*n} v \mid m, n \ge 0 \}.

The right-hand side of (1.2) is called the characteristic function of the colligation U and is denoted by s_U(z). A unitary closely connected realization of s(z) is uniquely determined up to isomorphism. The negative index of the state space P of any such realization equals the number of negative squares of s(z). If P is the reproducing kernel space D(s) (see Section 2), the realization is unique and called canonical.
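As a quick sanity check (not taken from the paper), the simplest nonconstant Schur function s(z) = z admits a one-dimensional closely connected unitary realization of the form (1.2).

% Hypothetical minimal example: s(z) = z with state space P = C.
% Take A = 0, u = v = 1, s(0) = 0 in (1.2):
\[
  s(z) = 0 + z\,\langle (1 - z\cdot 0)^{-1} 1,\, 1\rangle_{\mathbb C} = z,
  \qquad
  U = \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix}.
\]
% U is unitary, and span\{A^m u, A^{*n} v : m,n \ge 0\} = C = P,
% so the colligation is closely connected; here \kappa = 0 (a Schur function).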

The basic interpolation problem for generalized Schur functions studied in [3] is as follows.

(BIP): Given σ0 ∈ C, determine all generalized Schur functions s(z) which are holomorphic at the origin and are such that s(0) = σ0.

It was shown in [3] that the solutions s(z) are given by fractional linear transformations of the form

s(z) = T_{\Theta(z)} s_1(z) = \frac{a(z) s_1(z) + b(z)}{c(z) s_1(z) + d(z)}, \qquad (1.3)

where

\Theta(z) = \begin{pmatrix} a(z) & b(z) \\ c(z) & d(z) \end{pmatrix} \qquad (1.4)

is a polynomial matrix which depends on whether |σ0| < 1, |σ0| > 1 or |σ0| = 1, and the parameter s1(z) runs through a set of generalized Schur functions which are holomorphic at z = 0, also depending on these three cases. The polynomial matrix Θ(z) is a generalized Schur function relative to the signature matrix

J = \begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix},

and hence Θ(z) can also be written as the characteristic function of a canonical unitary colligation. For more details see Section 2, which contains the preliminaries about canonical unitary realizations for matrix-valued generalized Schur functions and the basic interpolation problem.

In Section 3, for each of the three cases |σ0| < 1, |σ0| > 1 or |σ0| = 1, we describe the state space D(Θ) in the canonical unitary realization of Θ(z).

In Section 4 we construct a closely connected unitary realization of the solution s(z) from the canonical unitary realizations of the corresponding parameter s1(z) and the polynomial matrix Θ(z). The state space P in the realization of s(z) is a finite-dimensional extension of the state space in the realization of s1(z), and the main operator A in the realization of s(z) is an extension to P of a finite-dimensional perturbation of the main operator A1 in the realization of s1(z). See Theorems 4.2 and 4.3. We also show how the canonical unitary realization of the parameter can be recovered from the closely connected unitary realization of the solution. See Theorem 4.4. Similar results can be found in [3], where closely outer connected coisometric realizations are considered. The fractional linear transformations mentioned above are related to the Schur algorithm for generalized Schur functions developed in [9], [11], [8], and [10]. The connection between the Schur algorithm for generalized Schur functions and their coisometric and unitary realizations has been investigated in [1, 2] and [4], respectively. In these papers a direct method was used which differs from the approach in this paper (and in [3]), where reproducing kernel Pontryagin spaces are the main tool.

In Section 5 we consider the case where the interpolation data and the Taylor coefficients of s1(z) at z = 0 are real. Then the Taylor coefficients of s(z) at z = 0 are also real. According to [5] there exist unique signature operators Js on D(s) and Js1 on D(s1) such that the main operators A and A1 are Js-selfadjoint and Js1-selfadjoint, respectively. In the cases |σ0| ≠ 1 we give explicit formulas relating the two signature operators. In the case |σ0| = 1 the connection is rather complicated and we consider a special case. These results are new even in the case where only Schur functions are considered, that is, when κ = 0 and |σ0| < 1.

2. Preliminaries

2.1. Realizations

Let J be an n × n signature matrix, that is, J = J^* = J^{-1}. By S_κ(C^n, J) we denote the class of generalized Schur functions with κ negative squares. These are the n × n matrix-valued functions S(z) which are meromorphic on D and for which the kernel

K_S(z,w) := \frac{J - S(z) J S(w)^*}{1 - z w^*} : \mathbb{C}^n \to \mathbb{C}^n

has κ negative squares, or, equivalently, the kernel

D_S(z,w) = \begin{pmatrix} \dfrac{J - S(z) J S(w)^*}{1 - z w^*} & \dfrac{S(z) - S(w^*)}{z - w^*} \\[2mm] \dfrac{\tilde S(z) - \tilde S(w^*)}{z - w^*} & \dfrac{J - \tilde S(z) J \tilde S(w)^*}{1 - z w^*} \end{pmatrix} : \begin{pmatrix} \mathbb{C}^n \\ \mathbb{C}^n \end{pmatrix} \to \begin{pmatrix} \mathbb{C}^n \\ \mathbb{C}^n \end{pmatrix}

has κ negative squares, where \tilde S(z) = S(z^*)^*. That these kernels have the same number of negative squares can be shown as in [6]. If J = I_{n×n}, S(z) ∈ S_κ(C^n, J) if and only if it is a product of the inverse of a Blaschke product and a Schur function. See, for example, [7, Section 4.2]; for the case n = 1 the formula for S(z) is given by (1.1).


The set of generalized Schur functions which have κ negative squares and which are holomorphic at the origin will be denoted by S^0_κ(C^n, J), and we set S^0(C^n, J) = ∪_{κ∈N} S^0_κ(C^n, J). To every S(z) ∈ S^0(C^n, J) we associate two reproducing kernel Pontryagin spaces H(S) and D(S). These are the reproducing kernel Pontryagin spaces with reproducing kernels K_S(z,w) and D_S(z,w), respectively. They occur as state spaces in the coisometric and unitary realizations of the matrix function S(z). For an elaborate treatment of these spaces we refer to [7, Section 2.1]. In this paper we focus on the unitary realizations.

If P is a Pontryagin space, all maximal uniformly negative subspaces have the same finite dimension and this number is called the negative index of P and is denoted by ind_− P. For S ∈ S^0_κ(C^n, J) we have that ind_− D(S) = κ.

Theorem 2.1. Let S(z) ∈ S^0(C^n, J).

(i) The operators

A : D(S) \to D(S), \quad A\begin{pmatrix} h \\ k \end{pmatrix}(z) = \begin{pmatrix} \dfrac{h(z)-h(0)}{z} \\[1mm] z\,k(z) - S(z)Jh(0) \end{pmatrix},

B : \mathbb{C}^n \to D(S), \quad (Bf)(z) = \begin{pmatrix} \dfrac{S(z)-S(0)}{z} \\[1mm] J - S(z)JS(0) \end{pmatrix} f,

C : D(S) \to \mathbb{C}^n, \quad C\begin{pmatrix} h \\ k \end{pmatrix} = h(0),

are bounded operators.

(ii) Their adjoints are given by

A^*\begin{pmatrix} h \\ k \end{pmatrix}(z) = \begin{pmatrix} z\,h(z) - S(z)Jk(0) \\[1mm] \dfrac{k(z)-k(0)}{z} \end{pmatrix}, \qquad B^*\begin{pmatrix} h \\ k \end{pmatrix} = k(0),

and, for c ∈ C^n,

(C^*c)(z) = \begin{pmatrix} J - S(z)JS(0) \\[1mm] \dfrac{S(z)-S(0)}{z} \end{pmatrix} c.

(iii) The colligation

U = \begin{pmatrix} A & B \\ C & S(0) \end{pmatrix} : \begin{pmatrix} D(S) \\ \mathbb{C}^n \end{pmatrix} \to \begin{pmatrix} D(S) \\ \mathbb{C}^n \end{pmatrix} \qquad (2.1)

is J-unitary, that is,

U \begin{pmatrix} I & 0 \\ 0 & J \end{pmatrix} U^* = U^* \begin{pmatrix} I & 0 \\ 0 & J \end{pmatrix} U = \begin{pmatrix} I & 0 \\ 0 & J \end{pmatrix},

and closely connected, which means that

\operatorname{span}\{\operatorname{ran} A^m B,\ \operatorname{ran} A^{*n} C^* \mid m, n \ge 0\} = D(S).

(iv) S(z) has the closely connected unitary realization

S(z) = S(0) + zC(I - zA)^{-1}B. \qquad (2.2)

The colligation U in the theorem is unique and is called the canonical J-unitary colligation for S(z). The function on the right-hand side of (2.2) is called the characteristic function of U and is denoted by S_U(z). The realization (2.2) of S(z) is called the canonical J-unitary realization of S(z). To indicate the dependence on S(z), we sometimes write U_S, A_S, B_S, etc. instead of U, A, B, etc. Any other J-unitary closely connected colligation whose characteristic function coincides with S(z) is unitarily equivalent to the canonical unitary colligation. By this we mean that if P' is a Pontryagin space such that

U' = \begin{pmatrix} A' & B' \\ C' & D' \end{pmatrix} : \begin{pmatrix} P' \\ \mathbb{C}^n \end{pmatrix} \to \begin{pmatrix} P' \\ \mathbb{C}^n \end{pmatrix}

is J-unitary:

U' \begin{pmatrix} I & 0 \\ 0 & J \end{pmatrix} U'^* = U'^* \begin{pmatrix} I & 0 \\ 0 & J \end{pmatrix} U' = \begin{pmatrix} I & 0 \\ 0 & J \end{pmatrix},

and closely connected:

\operatorname{span}\{\operatorname{ran} A'^m B',\ \operatorname{ran} A'^{*n} C'^* \mid m, n \ge 0\} = P',

and S_{U'}(z) = D' + z C'(I - zA')^{-1}B' = S(z), then there exists an isomorphism W : D(S) → P' such that

A' = W A_S W^{-1}, \quad B' = W B_S, \quad C' W = C_S, \quad D' = S(0).

We note that Theorem 2.1 can be proved in a similar way as Theorem 2.3.1 in [7] with some minor modifications.

In the sequel we only consider the cases n = 1, J = 1, and n = 2, J = \begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix}. In the first case we write s(z) and S^0 instead of S(z) and S^0(C, 1).

If n = 1 the colligation U in (2.1) can also be written in the form

U = \begin{pmatrix} A & u \\ \langle\,\cdot\,, v\rangle_{D(s)} & s(0) \end{pmatrix} : \begin{pmatrix} D(s) \\ \mathbb{C} \end{pmatrix} \to \begin{pmatrix} D(s) \\ \mathbb{C} \end{pmatrix},

where u, v ∈ D(s) are given by

u(z) = (B1)(z) = \begin{pmatrix} \dfrac{s(z)-s(0)}{z} \\[1mm] 1 - s(z)s(0) \end{pmatrix}, \qquad v(z) = (C^*1)(z) = D_s(z,0)\begin{pmatrix} 1 \\ 0 \end{pmatrix}.

The last equality follows from the reproducing property of the kernel D_s(z,w).


2.2. The basic interpolation problem

In this subsection we recall from [3] the solutions of the basic interpolation problem (BIP) with a given σ0 ∈ C. We use the following notation. Given any k complex numbers s_0 ≠ 0, s_1, . . . , s_{k−1} we form the polynomial

Q(z) = Q(z; s_0, s_1, \ldots, s_{k-1}) = c_0 + c_1 z + \cdots + c_{k-1} z^{k-1} - \bigl( c_{k-1}^* z^{k+1} + c_{k-2}^* z^{k+2} + \cdots + c_0^* z^{2k} \bigr)

of degree 2k, where the coefficients c_0, c_1, . . . , c_{k−1} are determined by the relation

\begin{pmatrix} c_0 & 0 & \cdots & 0 \\ c_1 & c_0 & \cdots & 0 \\ \vdots & \ddots & \ddots & \vdots \\ c_{k-1} & \cdots & c_1 & c_0 \end{pmatrix} \begin{pmatrix} s_0 & 0 & \cdots & 0 \\ s_1 & s_0 & \cdots & 0 \\ \vdots & \ddots & \ddots & \vdots \\ s_{k-1} & \cdots & s_1 & s_0 \end{pmatrix} = \sigma_0 I_k.

Setting

p(z) = c_0 + c_1 z + \cdots + c_{k-1} z^{k-1} \qquad (2.3)

we have that Q(z) = p(z) − z^{2k} p(z^{−*})^*, which implies that −Q(z) = z^{2k} Q(z^{−*})^*.

For s(z) ∈ S^0 we write its Taylor expansion at z = 0 as

s(z) = \sum_{n=0}^{\infty} \sigma_n z^n.

If |σ0| ≤ 1, then s(z) ≡ σ0 is the constant solution of the problem (BIP). If |σ0| > 1, then the function s(z) ≡ σ0 does not belong to the class S^0 and hence is not a solution of the problem (BIP). The function s(z) ∈ S^0 is nonconstant if and only if s(z) − σ0 has a zero at z = 0 of finite order, and the following theorem describes all nonconstant solutions s(z) of the problem (BIP) for which this order equals k.

Theorem 2.2. Let k be an integer ≥ 1 and, if |σ0| = 1, let q be an integer ≥ 0, let s_0 ≠ 0, s_1, . . . , s_{k−1} be any k complex numbers, and set Q(z) = Q(z; s_0, s_1, . . . , s_{k−1}). Then the formula

s(z) = \begin{cases} \dfrac{z^k s_1(z) + \sigma_0}{\sigma_0^* z^k s_1(z) + 1} & \text{if } |\sigma_0| < 1, \\[3mm] \dfrac{\sigma_0 s_1(z) + z^k}{s_1(z) + \sigma_0^* z^k} & \text{if } |\sigma_0| > 1, \\[3mm] \dfrac{(Q(z) + z^k) s_1(z) - \sigma_0 Q(z) z^q}{\sigma_0^* Q(z) s_1(z) - (Q(z) - z^k) z^q} & \text{if } |\sigma_0| = 1, \end{cases} \qquad (2.4)

establishes a one-to-one correspondence between all nonconstant solutions s(z) ∈ S^0 of the problem (BIP) with the property that in all three cases

\sigma_1 = \sigma_2 = \cdots = \sigma_{k-1} = 0 \quad\text{and}\quad \sigma_k \ne 0,

and in the case |σ0| = 1 with the additional property

\sigma_j = s_{j-k}, \qquad j = k, k+1, \ldots, 2k-1,

and all parameters s_1(z) ∈ S^0 with

s_1(0) \ne \begin{cases} 0 & \text{if } |\sigma_0| \ne 1, \text{ or } |\sigma_0| = 1 \text{ and } q > 0, \\ \sigma_0 & \text{if } |\sigma_0| = 1 \text{ and } q = 0. \end{cases} \qquad (2.5)

Consider the case |σ0| < 1. If in formula (2.4) we let k vary over all integers ≥ 1 and replace z^{k−1} s_1(z) by s_2(z), then Theorem 2.2 implies that the formula

s(z) = \frac{z s_2(z) + \sigma_0}{\sigma_0^* z s_2(z) + 1}

gives a one-to-one correspondence between all solutions s(z) ∈ S^0 and all parameters s_2(z) ∈ S^0. The constant solution s(z) ≡ σ0 corresponds to the case s_2(z) ≡ 0. The function s(z) − σ0 has a zero of order k at z = 0 if and only if the corresponding parameter s_2(z) has a zero of order k − 1 at z = 0, that is, s_2(z) = z^{k−1} s_1(z) for some s_1(z) ∈ S^0 with s_1(0) ≠ 0. In the case |σ0| = 1 one gets the set of all solutions of the problem (BIP) first by describing a nonnegative integer q and k arbitrary complex numbers s_0 ≠ 0, s_1, . . . , s_{k−1}, and then by applying a fractional linear transformation with a slightly restricted class of parameters s_1(z) ∈ S^0. Formally, the constant solution s(z) ≡ σ0 can be obtained from (2.4) by replacing Q(z) by ∞.

The expressions in formula (2.4) are fractional linear transformations of the form (1.3), where Θ(z) in (1.4) is given by

\Theta(z) = \Theta_1(z) = \frac{1}{\sqrt{1 - |\sigma_0|^2}} \begin{pmatrix} 1 & \sigma_0 \\ \sigma_0^* & 1 \end{pmatrix} \begin{pmatrix} z^k & 0 \\ 0 & 1 \end{pmatrix}, \quad \text{if } |\sigma_0| < 1, \qquad (2.6)

\Theta(z) = \Theta_2(z) = \frac{1}{\sqrt{|\sigma_0|^2 - 1}} \begin{pmatrix} \sigma_0 & 1 \\ 1 & \sigma_0^* \end{pmatrix} \begin{pmatrix} 1 & 0 \\ 0 & z^k \end{pmatrix}, \quad \text{if } |\sigma_0| > 1, \qquad (2.7)

\Theta(z) = \Theta_3(z) = \begin{pmatrix} Q(z) + z^k & -\sigma_0 Q(z) \\ \sigma_0^* Q(z) & -Q(z) + z^k \end{pmatrix} \begin{pmatrix} 1 & 0 \\ 0 & z^q \end{pmatrix}, \quad \text{if } |\sigma_0| = 1. \qquad (2.8)

If in (2.8) we have q > 0 we write Θ_3(z) = Θ_3^0(z) Ψ_q(z), where

\Theta_3^0(z) = \begin{pmatrix} Q(z) + z^k & -\sigma_0 Q(z) \\ \sigma_0^* Q(z) & -Q(z) + z^k \end{pmatrix}, \qquad \Psi_q(z) = \begin{pmatrix} 1 & 0 \\ 0 & z^q \end{pmatrix}.

For a proof of the following theorem we refer to [3, Theorem 3.2].

Theorem 2.3. The three polynomial matrices Θ_1(z), Θ_2(z), and Θ_3(z) are J-unitary and Θ_1(z) ∈ S^0_0(C^2, J), Θ_2(z) ∈ S^0_k(C^2, J), and Θ_3(z) ∈ S^0_{k+q}(C^2, J).

Recall that here

J = \begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix}

and that Θ(z) is J-unitary means that Θ(z)^* J Θ(z) = J for all z ∈ C with |z| = 1. It follows that δ(z) := det Θ(z) = c z^{\ell} for some nonnegative integer ℓ and some complex number c with |c| = 1; in the cases where Θ = Θ_1, Θ_2, and Θ_3 we have δ(z) = z^k, z^k, and z^{2k+q}, respectively.
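For instance, a short check not spelled out in the paper: the value of δ for Θ_3 follows from (2.8) and |σ_0| = 1.

% Verification sketch for \delta(z) = z^{2k+q} when \Theta = \Theta_3 and |\sigma_0| = 1:
\[
  \det\Theta_3(z)
  = \bigl[(Q(z)+z^k)(-Q(z)+z^k) + \sigma_0\sigma_0^*\, Q(z)^2\bigr] z^q
  = \bigl[z^{2k} - Q(z)^2 + Q(z)^2\bigr] z^q
  = z^{2k+q}.
\]
% The cases \Theta_1, \Theta_2 are analogous, using 1 - \sigma_0\sigma_0^* and
% \sigma_0\sigma_0^* - 1 in the prefactors of (2.6) and (2.7).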


Remark: If s(z) is a solution of the basic interpolation problem (BIP) with corresponding parameter function s_1(z) ∈ S^0_{κ_1}, then s(z) ∈ S^0_κ, where

\kappa = \begin{cases} \kappa_1 & \text{if } |\sigma_0| < 1, \\ \kappa_1 + k & \text{if } |\sigma_0| > 1, \\ \kappa_1 + k + q & \text{if } |\sigma_0| = 1. \end{cases}

This follows from the equality ind_− D(s) = ind_− D(Θ) + ind_− D(s_1), which is proved in Theorem 4.1 below.
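Spelling out the intermediate step (not written out in the paper), using Theorem 2.3 together with ind_− D(S) = κ for S ∈ S^0_κ:

% Negative indices of the state spaces of the three coefficient matrices:
\[
  \operatorname{ind}_- D(\Theta_1) = 0, \qquad
  \operatorname{ind}_- D(\Theta_2) = k, \qquad
  \operatorname{ind}_- D(\Theta_3) = k + q,
\]
% so ind_- D(s) = ind_- D(\Theta) + ind_- D(s_1) yields the three values of \kappa above.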

3. The spaces D(Θ1), D(Θ2), and D(Θ3)

In the following description of these spaces o, u, e_1, e_2 will stand for the vectors

o = \begin{pmatrix} 0 \\ 0 \end{pmatrix}, \quad u = \begin{pmatrix} 1 \\ \sigma_0^* \end{pmatrix}, \quad e_1 = \begin{pmatrix} 1 \\ 0 \end{pmatrix}, \quad e_2 = \begin{pmatrix} 0 \\ 1 \end{pmatrix},

and Δ will be the k × k matrix

\Delta = \begin{pmatrix} 0 & 0 & 0 & 0 & \cdots & 0 \\ c_{k-1} & 0 & 0 & 0 & \cdots & 0 \\ c_{k-2} & c_{k-1} & 0 & 0 & \cdots & 0 \\ \vdots & \ddots & \ddots & \ddots & \ddots & \vdots \\ c_2 & \cdots & c_{k-2} & c_{k-1} & 0 & 0 \\ c_1 & \cdots & c_{k-3} & c_{k-2} & c_{k-1} & 0 \end{pmatrix}.

Theorem 3.1. Let Θ_1(z), Θ_2(z) and Θ_3(z) be defined by (2.6)–(2.8).

(i) The space D(Θ_1) is a Hilbert space spanned by the orthonormal basis

\left\{ \begin{pmatrix} r z^{n-1} u \\ z^{k-n} e_1 \end{pmatrix} \right\}_{n=1}^{k}, \qquad r = \frac{1}{\sqrt{1 - |\sigma_0|^2}},

with Gram matrix I_{k×k}. In particular, the elements of the space D(Θ_1) are of the form

\begin{pmatrix} r\, t(z)\, u \\ z^{k-1} t(z^{-1})\, e_1 \end{pmatrix},

where t(z) is a polynomial of degree ≤ k − 1.

(ii) The space D(Θ_2) is an anti-Hilbert space spanned by the basis

\left\{ \begin{pmatrix} -r z^{n-1} u \\ z^{k-n} e_2 \end{pmatrix} \right\}_{n=1}^{k}, \qquad r = \frac{1}{\sqrt{|\sigma_0|^2 - 1}},

with Gram matrix −I_{k×k}. In particular, the elements of the space D(Θ_2) are of the form

\begin{pmatrix} -r\, t(z)\, u \\ z^{k-1} t(z^{-1})\, e_2 \end{pmatrix},

where t(z) is a polynomial of degree ≤ k − 1.

(iii) If q = 0, the space D(Θ_3) is a Pontryagin space spanned by the basis

\left\{ \begin{pmatrix} z^{n-1} u \\ z^{k-n} J u \end{pmatrix}, \; \begin{pmatrix} z^{n-1}\bigl(J u - 2 z^k p(z^{-*})^* u\bigr) \\ z^{k-n}\bigl(u - 2 z^k p(z^{-1}) J u\bigr) \end{pmatrix} \right\}_{n=1}^{k},

with Gram matrix 2 \begin{pmatrix} 0 & I_{k×k} \\ I_{k×k} & -2(\Delta + \Delta^*) \end{pmatrix}. In particular, the elements of the space D(Θ_3) are of the form

\begin{pmatrix} t_1(z) u \\ z^{k-1} t_1(z^{-1}) J u \end{pmatrix} + \begin{pmatrix} t_2(z)\bigl(J u - 2 z^k p(z^{-*})^* u\bigr) \\ z^{k-1} t_2(z^{-1})\bigl(u - 2 z^k p(z^{-1}) J u\bigr) \end{pmatrix},

where t_1(z) and t_2(z) are polynomials of degree ≤ k − 1.

(iv) If q > 0, the space D(Θ_3) can be decomposed as the orthogonal sum

D(\Theta_3) = \begin{pmatrix} 1 & 0 \\ 0 & \Psi_q \end{pmatrix} D(\Theta_3^0) \oplus \begin{pmatrix} \Theta_3^0 & 0 \\ 0 & 1 \end{pmatrix} D(\Psi_q).

Here the space D(Θ_3^0) is as described in part (iii) and the space D(Ψ_q) is an anti-Hilbert space with basis

\left\{ \begin{pmatrix} -z^{n-1} e_2 \\ z^{q-n} e_2 \end{pmatrix} \right\}_{n=1}^{q}

whose Gram matrix equals −I_{q×q}. Moreover, the map

W : D(\Theta_3) \ni f \mapsto \begin{pmatrix} f_1 \\ f_2 \end{pmatrix} \in \begin{pmatrix} D(\Theta_3^0) \\ D(\Psi_q) \end{pmatrix} \qquad (3.1)

determined by the decomposition

f = \begin{pmatrix} 1 & 0 \\ 0 & \Psi_q \end{pmatrix} f_1 + \begin{pmatrix} \Theta_3^0 & 0 \\ 0 & 1 \end{pmatrix} f_2

is unitary.

Proof. (i) We have that

DΘ1(z, w) =

⎛⎜⎜⎜⎝r2

1− zkw∗k

1− zw∗

(1 σ0

σ∗0 |σ0|2

)rzk − w∗k

z − w∗

(1 0σ∗

0 0

)rzk − w∗k

z − w∗

(1 σ0

0 0

)1− zkw∗k

1− zw∗

(1 00 0

)⎞⎟⎟⎟⎠ ,


from which we derive the equality

DΘ1(z, w)

⎛⎝abo

⎞⎠ = rw∗(k−1)DΘ1(z, w−1)

⎛⎝ oa+ σ0b

d

⎞⎠for a, b, d ∈ C. Since D(Θ1) is spanned by the columns of DΘ1(z, w), this meansthat in fact it is spanned by

1rDΘ1(z, w)

(e1

o

)=

⎛⎝ r(1 + zw∗ + · · ·+ zk−1w∗(k−1))u

(zk−1 + zk−2w∗ + · · ·+ zw∗(k−2) + w∗(k−1))e1

⎞⎠ .

We divide this element by 2πiw∗n, integrate with respect to w∗ over a circlearound w∗ = 0 and, by Cauchy’s theorem, obtain the basis elements described inpart (i) of the theorem. By the reproducing property of the kernel we have⟨

1rDΘ1(z, w)

(e1

o

),1rDΘ1(z, v)

(e1

o

)⟩D(Θ1)

= 1 + vw∗ + · · ·+ vk−1w∗(k−1).

Dividing both sides by −4π2vmw∗n, 1 ≤ m,n ≤ k, and integrating with respect tov and w∗ over circles around the origin, we see that the Gram matrix associatedwith this basis is equal to Ik×k.

(ii) The case for Θ2(z) can be proved similarly and the proof is omitted.

(iii) For this case we have that

DΘ3(z, w) =

⎛⎜⎜⎝1− zkw∗k

1− zw∗ Jzk − w∗k

z − w∗ I

zk − w∗k

z − w∗ I1− zkw∗k

1− zw∗ J

⎞⎟⎟⎠

⎛⎜⎜⎝zkQ(w)∗ + w∗kQ(z)

1− zw∗

(1 σ0

σ∗0 1

)Q(z)−Q(w∗)

z − w∗

(−1 σ0

−σ∗0 1

)Q(z∗)∗ −Q(w)∗

z − w∗

(−1 −σ0

σ∗0 1

)zkQ(w∗) + w∗kQ(z∗)∗

1− zw∗

(1 −σ0

−σ∗0 1

)⎞⎟⎟⎠ .

From this it can be shown that

DΘ3(z, w)(Juo

)= w∗(k−1)DΘ3(z, w

−1)(

ou

),

and

w∗(k−1)DΘ3(z, w−1)

(uo

)−DΘ3(z, w)

(oJu

)= −2

Q(w∗)w∗k

DΘ3(z, w)(

ou

).

These equalities imply that

D(Θ3) = span DΘ3(z, w)(Juo

), DΘ3(z, w)

(uo

). (3.2)


Since

DΘ3(z, w)(Juo

)=

⎛⎜⎜⎝1− zkw∗k

1− zw∗ u

zk − w∗k

z − w∗ Ju

⎞⎟⎟⎠ ,

we obtain, using integration as in the proof of part (i), that

span DΘ3(z, w)(Juo

) = span

(zj−1uzk−jJu

)k

j=1

(3.3)

is a neutral space which accounts for the 0 entry in the left upper corner of theGram matrix. The elements on the right-hand side are linearly independent andthe span coincides with the space of functions of the form(

t(z)uzk−1t(z−1)Ju

),

where t(z) is a polynomial of degree ≤ k − 1. From

DΘ3(z, w)(uo

)=

⎛⎜⎝1− zkw∗k

1− zw∗ Ju− 2zkQ(w)∗ + w∗kQ(z)

1− zw∗ u

zk − w∗k

z − w∗ u + 2Q(z∗)∗ −Q(w)∗

z − w∗ Ju

⎞⎟⎠and Q(z) = p(z)− z2kp(z−∗)∗, we get

DΘ3(z, w)(uo

)=

⎛⎜⎝1− zkw∗k

1− zw∗ Ju− 2zk 1− zkw∗k

1− zw∗ p(z−∗)∗u

zk − w∗k

z − w∗ u− 2zk zk − w∗k

z − w∗ p(1z)Ju

⎞⎟⎠−( tw(z)uzk−1tw(z−1)Ju

),

where

tw(z) = 2zkp(w)∗ − zkp(z−∗)∗ + w∗kp(z)− zkw∗2kp(w−∗)

1− zw∗

is a polynomial of degree ≤ k − 1 in z. The span of the second summand iscontained in the neutral subspace (3.3) and can be dropped from the formulawhen calculating the span on the right-hand side of (3.2). The remainder of theproof of part (iii) can be given by integration and using the reproducing propertyof the kernel as in the proof of part (i) and is omitted.

(iv) The proof of the statements concerning the space D(Ψq) is similar to the proofof (i). The orthogonal decomposition of D(Θ3) and the unitarity of the map followfrom (a) the equality

DΘ3(z, w) =(

1 00 Ψq(z)

)DΘ0

3(z, w)

(1 00 Ψq(w)∗

)+(

Θ03(z) 00 1

)DΨq(z, w)

(Θ0

3(w)∗ 00 1

), (3.4)


(b) the implication that if f1 ∈ D(Θ03) and f2 ∈ D(Ψq) then the identity(

1 00 Ψq

)f1 +

(Θ0

3 00 1

)f2 = 0

implies f1 = 0 and f2 = 0, and (c) reproducing kernel methods as in [7, Section1.5] (see also [3, Theorems 2.1 and 2.2]). The implication in (b) follows from⎛⎜⎜⎝

1 0 0 00 1 0 00 0 1 00 0 0 zq

⎞⎟⎟⎠D(Θ03) ∩

⎛⎜⎜⎝Q(z) + zk −σ0 0 0σ∗

0Q(z) −Q(z) + zk 0 00 0 1 00 0 0 1

⎞⎟⎟⎠D(Ψq) = 0 ,

which can be verified by comparing the degrees of the elements in the two sets.

4. Solutions in terms of colligations

In this section we construct a closely connected unitary colligation U for the solution

s = T_\Theta s_1 = \frac{a s_1 + b}{c s_1 + d}, \qquad \Theta = \begin{pmatrix} a & b \\ c & d \end{pmatrix}, \qquad (4.1)

of the basic interpolation problem (BIP) in terms of the corresponding parameter function s_1, the entries of the matrix function Θ, and their canonical unitary colligations. We shall use the following notation:

\Upsilon = \begin{pmatrix} 1 & -s & 0 & 0 \\ 0 & 0 & \dfrac{s_1}{n} & \dfrac{1}{n} \end{pmatrix}, \qquad \Phi = \begin{pmatrix} a - cs & 0 \\ 0 & \dfrac{1}{n} \end{pmatrix},

where n = c s_1 + d = δ/(a − cs) and δ := det Θ. Here and elsewhere in the sequel, when f is a matrix function on some set in C, we denote by f̃ the matrix function f̃(z) = f(z^*)^*.

Theorem 4.1. Let s in (4.1) be a solution of the interpolation problem (BIP) with parameter function s_1 and Θ = Θ_1, Θ_2, and Θ_3.

(i) The space D(s) can be decomposed as the orthogonal sum

D(s) = \Upsilon D(\Theta) \oplus \Phi D(s_1),

and ind_− D(s) = ind_− D(Θ) + ind_− D(s_1).

(ii) The map

\Lambda : D(s) \ni h \mapsto \begin{pmatrix} f \\ g \end{pmatrix} \in \begin{pmatrix} D(\Theta) \\ D(s_1) \end{pmatrix}

determined by the decomposition h = Υf + Φg is unitary.

Proof. We claim that (1)

D_s(z,w) = \begin{pmatrix} \Upsilon(z) & \Phi(z) \end{pmatrix} \begin{pmatrix} D_\Theta(z,w) & 0 \\ 0 & D_{s_1}(z,w) \end{pmatrix} \begin{pmatrix} \Upsilon(w)^* \\ \Phi(w)^* \end{pmatrix}, \qquad (4.2)

which implies that D(s) = Υ D(Θ) + Φ D(s_1), and (2) the multiplication map

\begin{pmatrix} \Upsilon & \Phi \end{pmatrix} : D(\Theta) \oplus D(s_1) \ni \begin{pmatrix} f \\ g \end{pmatrix} \mapsto \Upsilon f + \Phi g \in D(s)

is injective. The theorem follows from these claims by reproducing kernel methods as in [7, Section 1.5] (see also [3, Theorems 2.1 and 2.2]).

Proof of (1). In the proof of the first claim we shall also use the notation

Ψs =(

1 −s 0 00 0 1 −s

), J1 =

(0 1−1 0

).

It is easy to see that ΘJ1ΘT = det ΘJ1 and hence δJ1Θ−1 = ΘTJ1. This will beused in the third equality of the following calculation.

Ds(z, w) =

⎛⎜⎜⎝1− s(z)s(w)∗

1− zw∗s(z)− s(w∗)

z − w∗

s(z)− s(w∗)z − w∗

1− s(z)s(w)∗

1− zw∗

⎞⎟⎟⎠

=(

1 −s(z) 0 00 0 1 −s(z)

)⎛⎜⎜⎝J

1− zw∗J1

z − w∗J1

z − w∗J

1− zw∗

⎞⎟⎟⎠⎛⎜⎜⎝

1 0−s(w)∗ 0

0 10 −s(w)∗

⎞⎟⎟⎠

= Ψs(z)

⎛⎜⎜⎝J −Θ(z)JΘ(w)∗

1− zw∗[Θ(w∗)−Θ(z)]Θ(w∗)−1J1

z − w∗

J1Θ(z)−1[Θ(z)− Θ(w∗)]z − w∗

J − J1Θ(z)−1JΘ(w)−∗J1

1− zw∗

⎞⎟⎟⎠Ψs(w)∗

+Ψs(z)

⎛⎜⎜⎜⎝Θ(z)JΘ(w)∗

1− zw∗Θ(z)J1Θ(w∗)T

δ(w∗)(z − w∗)

Θ(z)TJ1Θ(w)∗

δ(z)(z − w∗)

Θ(z)TJ1JJ1Θ(w∗)T

δ(z)δ(w∗)(1− zw∗)

⎞⎟⎟⎟⎠Ψs(w)∗

= Ψs(z)(I 00 J1Θ(z)−1

)DΘ(z, w)

(I 00 −Θ(w∗)−1J1

)Ψs(w)∗

+Ψs(z)

⎛⎝Θ(z) 0

0Θ(z)T

δ(z)

⎞⎠⎛⎜⎝ J

1− zw∗J1

z − w∗J1

z − w∗J

1− zw∗

⎞⎟⎠⎛⎝Θ(w)∗ 0

0Θ(w∗)T

δ(w∗)

⎞⎠× Ψs(w)∗.

Using

Ψs

(I 00 J1Θ−1

)= Υ


and

Ψs

⎛⎝Θ 0

0ΘT

δ

⎞⎠ =

⎛⎝(1 −s

)Θ 0 0

0 0(1 −s

) ΘT

δ

⎞⎠=

⎛⎝a− c 0

0a− cs

δ

⎞⎠Ψs1 = ΦΨs1 ,

where the second equality follows from the relation(1 −s

)Θ = (a− sc)

(1 −s1

), (4.3)

we obtain

Ds(z, w) = Υ(z)DΘ(z, w)Υ(w)∗

+ Φ(z)Ψs1(z)

⎛⎜⎜⎝J

1− zw∗J1

z − w∗J1

z − w∗J

1− zw∗

⎞⎟⎟⎠Φ(w)∗Ψs1(w)∗

= Υ(z)DΘ(z, w)Υ(w)∗ + Φ(z)Ds1(z, w)Φ(w)∗.

This proves the first claim.Proof of (2). To prove the second claim, consider

f =(f1

f2

)∈ D(Θ), g =

(g1

g2

)∈ D(s1),

and assume Υf + Φg = 0, that is,(1 −s

)f1 + (a− cs)g1 = 0 (4.4)

and (s1 1

)f2 + g2(z) = 0. (4.5)

Then f1 ∈ H(Θ), g1 ∈ H(s1), and therefore equation (4.4) implies f1 = g1 = 0, asalready shown in [3]. This means that

f =(

0f2

)∈ D(Θ).

Using Theorem 3.1 we conclude that f2 = 0 in all the three cases Θ = Θ1, Θ2

and Θ3. Equation (4.5) then implies g2 = 0. Thus multiplication by(Υ Ψ

)is

injective.

The following theorem is the main result of this paper.

Theorem 4.2. Under the unitary mapping

\Lambda_1 = \begin{pmatrix} \Lambda & 0 \\ 0 & 1 \end{pmatrix} : \begin{pmatrix} D(s) \\ \mathbb{C} \end{pmatrix} \to \begin{pmatrix} \begin{pmatrix} D(\Theta) \\ D(s_1) \end{pmatrix} \\ \mathbb{C} \end{pmatrix}

the canonical unitary colligation

U_s = \begin{pmatrix} A_s & u_s \\ \langle\,\cdot\,, v_s\rangle_{D(s)} & s(0) \end{pmatrix} : \begin{pmatrix} D(s) \\ \mathbb{C} \end{pmatrix} \to \begin{pmatrix} D(s) \\ \mathbb{C} \end{pmatrix}

for s(z) is transformed into the colligation

U = \Lambda_1 U_s \Lambda_1^{-1} = \begin{pmatrix} A & u \\ \langle\,\cdot\,, v\rangle_{D(\Theta)\oplus D(s_1)} & s(0) \end{pmatrix},

where

A = \Lambda A_s \Lambda^{-1} = \begin{pmatrix} A_\Theta + \dfrac{B_\Theta}{n(0)}\begin{pmatrix} s_1(0) \\ 1 \end{pmatrix}\begin{pmatrix} 0 & -1 \end{pmatrix} C_\Theta & \dfrac{B_\Theta}{n(0)}\begin{pmatrix} d(0) \\ -c(0) \end{pmatrix} C_{s_1} \\[3mm] \dfrac{B_{s_1}}{n(0)}\begin{pmatrix} 0 & -1 \end{pmatrix} C_\Theta & A_{s_1} - \dfrac{B_{s_1}}{n(0)}\, c(0)\, C_{s_1} \end{pmatrix},

u = \Lambda u_s = \frac{1}{n(0)} \begin{pmatrix} B_\Theta \begin{pmatrix} s_1(0) \\ 1 \end{pmatrix} \\[2mm] B_{s_1} \end{pmatrix}, \qquad v = \Lambda v_s = \begin{pmatrix} D_\Theta(\,\cdot\,,0) \begin{pmatrix} 1 \\ -\sigma_0^* \\ 0 \\ 0 \end{pmatrix} \\[2mm] 0 \end{pmatrix} \in \begin{pmatrix} D(\Theta) \\ D(s_1) \end{pmatrix}.

The formula for A shows that it is an extension to D(s) of the operator

B_{s_1} = A_{s_1} - \frac{B_{s_1}}{n(0)}\, c(0)\, C_{s_1}

in D(s_1), which is at most a one-dimensional perturbation of A_{s_1}. The theorem implies that, with \Lambda h = \begin{pmatrix} f \\ g \end{pmatrix} and c ∈ C, the following diagram commutes:

\begin{CD}
\begin{pmatrix} D(s) \\ \mathbb{C} \end{pmatrix} @>{U_s}>> \begin{pmatrix} D(s) \\ \mathbb{C} \end{pmatrix} \\
@V{\Lambda_1}VV @VV{\Lambda_1}V \\
\begin{pmatrix} D(\Theta) \\ D(s_1) \\ \mathbb{C} \end{pmatrix} @>{U}>> \begin{pmatrix} D(\Theta) \\ D(s_1) \\ \mathbb{C} \end{pmatrix}
\end{CD}

that is, U \Lambda_1 \begin{pmatrix} h \\ c \end{pmatrix} = \Lambda_1 U_s \begin{pmatrix} h \\ c \end{pmatrix}.

Proof of Theorem 4.2. The formula for v follows from vs = Ds( · , 0)(

10

), formula

(4.2), and the fact thata(0)− c(0)σ0 = 0 (4.6)


in all three cases Θ = Θ1, Θ2, and Θ3. Now we derive the formula for u and beginwith Bs. Denoting by R0 the difference-quotient operator:

R0x(z) =x(z)− x(0)

z, (4.7)

where x(z) is any holomorphic function in a neighborhood of z = 0, we have

Bs =(

R0s1− ss(0)

). (4.8)

The entries of Bs in (4.8) can be written in the form

R0s =1

n(0)

(1 −s

)R0Θ

(s1(0)

1

)+ (a− cs)R0s1

(4.9)

and

1− ss(0) =1

n(0)n

(s1 1

)(J − ΘJΘ(0))

(s1(0)

1

)+ 1− s1s1(0)

. (4.10)

For the equality (4.9) we refer to the proof of Theorem 4.1 in [3]. As for theequality (4.10), the right-hand side equals

−1n(0)n

(s1 1

)(a −cb −d

)(a(0) b(0)c(0) d(0)

)(s1(0)

1

)=

1n(0)n

[(cs1 + d)(c(0)s1(0) + d(0))− (as1 + b)(a(0)s1(0) + b(0))

],

which equals the left-hand side. It follows that

Bs =1

n(0)

(1 −s 0 0

0 0s1n

1n

)(R0Θ

J − ΘJΘ(0)

)(s1(0)

1

)(4.11)

+1

n(0)

(a− cs 0

01n

)(R0s1

1− s1s1(0)

)= Υ

n(0)

(s1(0)

1

)+ Φ

Bs1

n(0),

which yields the formula for u in the theorem.To obtain the expression for As, we start from

Ash = AsΥf + AsΦg, (4.12)

where

h ∈ D(s), f =(f1

f2

)∈ D(Θ), g =

(g1

g2

)∈ D(s1)

are related via h = Υf + Φg, that is, Λh =(fg

). Using the notation

Υ = diag υ1, υ2, υ1 =(1 −s

), υ2 =

1n

(s1 1

)and the formula

R0(xy)(z) = x(z)R0y(z) + R0x(z)y(0), (4.13)


which holds for any two functions x, y, which are holomorphic in a neighborhoodof z = 0, we calculate the first summand on the right-hand side of (4.12):

AsΥf(z) = As

(υ1f1

υ2f2

)(z) =

(R0(υ1f1)(z)

zυ2(z)f2(z)− s(z)υ1(0)f1(0)

)(4.14)

=

⎛⎝ υ1(z)R0f1(z)

υ2(z)[zf2(z)− Θ(z)Jf1(0)]

⎞⎠+

⎛⎝ R0υ1(z)

υ2(z)Θ(z)J − s(z)υ1(0)

⎞⎠ f1(0)

=(υ1(z) 0

0 υ2(z)

)AΘ

(f1

f2

)(z)−

(0 R0s(z)

0 1− s(z)s(0)

)f1(0)

= Υ(z)AΘf(z) + (Bs

(0 −1

)f1(0))(z).

We calculate the second summand on the right-hand side of (4.12) in a similarway. We write

Φ = diag φ1, φ2, φ1 = a− cs, φ2 =1n,

and using (4.6) we get

AsΦg = As

(φ1g1

φ2g2

)= ΦAs1g +

(R0φ1

φ2s1

)g1(0). (4.15)

We claim that the two components of the vector function on the right-hand sideof (4.15) can be written as

R0φ1 =(1 −s

) R0Θn(0)

(d(0)−c(0)

)− (a− cs)R0s1

c(0)n(0)

(4.16)

and

φ2s1 =1

n(0)n

(s1 1

)(J − ΘJΘ(0))

(d(0)−c(0)

)− (1 − s1s1(0))c(0)

. (4.17)

Assuming these claims are true, we find that the vector function takes theform (

R0φ1

φ2s1

)=

(1 −s 0 0

0 01ns1

1n

)(R0Θ

J − ΘJΘ(0)

)1

n(0)

(d(0)−c(0)

)

−(a− cs 0

01n

)(R0s1

1− s1s1(0)

)c(0)n(0)

= ΥBΘ

n(0)

(d(0)−c(0)

)− Φ

Bs1

n(0)c(0).


Substituting this in (4.15), and then substituting (4.15) and (4.14) with Bs re-placed by (4.11) into (4.12) we obtain the formula for A = ΛAsΛ−1 in the theorem:

Ash = ΥAΘf +

n(0)

(s1(0)

1

)(0 −1

)f1(0) +

n(0)

(d(0)−c(0)

)g1(0)

Bs1

n(0)(0 −1

)f1(0) + As1g −

Bs1

n(0)c(0)g1(0)

.

It remains to prove the claims. Equality (4.17) follows from writing out the right-hand side and using

Θ(0)(d(0)−c(0)

)=(δ(0)0

)= 0, δ(z) = det Θ(z). (4.18)

For the proof of (4.16) we write

n = cs1 + d =(1 −s1

)( d−c

),

use (4.18) and repeatedly (4.3) and (4.13). We obtain the following chain of equal-ities:

(a− sc)(1 −s1

)R0

(d−c

)+ (a− sc)R0s1c(0) + R0(a− sc)n(0)

= R0(a− sc)n = R0

(1 −s

)Θ(d−c

)=(1 −s

)R0

[Θ(d−c

)]=(1 −s

)R0Θ

(d(0)−c(0)

)+(1 −s

)ΘR0

(d−c

)=(1 −s

)R0Θ

(d(0)−c(0)

)+ (a− sc)

(1 −s1

)R0

(d−c

).

Comparing both sides we find the equality (4.16).

For the case Θ3(z) = Θ03Ψq and q > 0 there is need to modify Theorem 4.2.

To do this we define Λq to be the composition of the two unitary maps W in (3.1)and Λ, that is,

Λq :=

⎛⎝W 0 00 ID(s1) 00 0 1

⎞⎠Λ :(D(s)

C

)→

⎛⎜⎜⎝⎛⎝D(Θ0

3)D(Ψq)D(s1)

⎞⎠C

⎞⎟⎟⎠ ,

and set

M =(s1(0)

1

)(0 −1

), M1 = Ψq(0)M =

(s1(0)

0

)(0 −1

),

M2 = Ψq(0)MΘ03(0), and M3 = MΘ0

3(0).


Theorem 4.3. With Θ(z) = Θ3(z) and q > 0, under the map Λq the canonicalunitary colligation

Us =(

As us

〈 · vs〉 s(0)

):(D(s)

C

)→(D(s)

C

)is transformed into the colligation

Uq = ΛqUsΛ−1q =

(Aq uq

〈 · vq〉 s(0)

):

⎛⎜⎜⎝⎛⎝D(Θ0

3)D(Ψq)D(s1)

⎞⎠C

⎞⎟⎟⎠→

⎛⎜⎜⎝⎛⎝D(Θ0

3)D(Ψq)D(s1)

⎞⎠C

⎞⎟⎟⎠ ,

where

Aq =

⎛⎜⎜⎜⎜⎜⎜⎜⎝

AΘ03+

BΘ03

n(0)M1CΘ0

3BΘ0

3CΨq +

BΘ03

n(0)M2CΨq 0

BΨq

n(0)MCΘ0

3AΨq +

BΨq

n(0)M3CΨq

BΨq

n(0)

(0

−c(0)

)Cs1

Bs1

n(0)(0 −1

)CΘ0

3

Bs1

n(0)(0 −1

)Θ0

3(0)CΨq As1 −Bs1

n(0)c(0)Cs1

⎞⎟⎟⎟⎟⎟⎟⎟⎠,

uq =1

n(0)

⎛⎜⎜⎜⎜⎝Θ0

3Ψq(0)(s1(0)

1

)Ψq

(s1(0)

1

)Bs1

⎞⎟⎟⎟⎟⎠ , vq =

⎛⎜⎜⎜⎜⎜⎜⎝DΘ0

3( · , 0)

⎛⎜⎜⎝1

−σ∗0

00

⎞⎟⎟⎠00

⎞⎟⎟⎟⎟⎟⎟⎠ ∈

⎛⎝D(Θ03)

D(Ψq)D(s1)

⎞⎠ .

Proof. Using (4.13) and d(0) = 0 (since q > 0), we find that

WAΘ =

(AΘ0

3BΘ0

3CΨq

0 AΨq

), WBΘ =

(BΘ0

3Ψq(0)

BΨq

), CΘ =

(CΘ0

3Θ0

3(0)CΨq

).

Substitution of these formulas into the formulas of Theorem 4.2 yields the desiredresult for Aq and uq. The formula for vq is obtained by using the decompositionin (3.4).

Let P be the projection in the space D(Θ) ⊕ D(s_1) onto the space D(s_1). From the operator matrix form of A we see that

A_{s_1} = P A|_{D(s_1)} + \frac{B_{s_1}}{n(0)}\, c(0)\, C_{s_1}
\qquad\text{and}\qquad
u_{s_1} = B_{s_1} 1 = n(0)\, P u.

This observation and the next theorem show that the canonical unitary colligation of the parameter s_1(z) can be recovered from the canonical unitary colligation of the solution s(z).


Theorem 4.4.

D_{s_1}(\,\cdot\,, 0)\begin{pmatrix} 1 \\ 0 \end{pmatrix} =
\begin{cases}
n(0)^*\, P A^{*k} v & \text{if } |\sigma_0| \ne 1, \\
n(0)^*\, P A^{*(2k)} v & \text{if } |\sigma_0| = 1 \text{ and } q = 0, \\
n(0)^*\, P A^{*(2k+q)} v & \text{if } |\sigma_0| = 1 \text{ and } q > 0.
\end{cases}

Proof. First we note thatA∗ =⎛⎜⎜⎜⎝

A∗Θ +

⟨· , BΘ

n(0)

(s1(0)

1

)⟩C∗

Θ

(0−1

) ⟨· , Bs1

n(0)

⟩C∗

Θ

(0−1

)⟨· , BΘ

n(0)

(d(0)−c(0)

)⟩Ds1(·, 0)

(10

)A∗

s1−⟨· , Bs1

n(0)c(0)

⟩Ds1( · , 0)

(10

)⎞⎟⎟⎟⎠ ,

where A∗Θ is as given in Theorem 2.1 (ii) and

(ab

)= DΘ( · , 0)

⎛⎜⎜⎝00ab

⎞⎟⎟⎠ , C∗Θ

(0−1

)= DΘ( · , 0)

⎛⎜⎜⎝0−100

⎞⎟⎟⎠ .

For |σ0| < 1 we set r = 1/√

1− |σ0|2 and since d(0) = n(0) = r and c(0) = 0 weobtain

A∗ =

⎛⎜⎜⎜⎜⎜⎜⎜⎜⎜⎜⎝

A∗Θ +

1n(0)∗

⟨· , DΘ( · , 0)

⎛⎜⎜⎝00

s1(0)1

⎞⎟⎟⎠⟩C∗

Θ

(0−1

)A∗

12

1n(0)∗

⟨· , DΘ( · , 0)

⎛⎜⎜⎝00r0

⎞⎟⎟⎠⟩Ds1( · , 0)

(10

)A∗

22

⎞⎟⎟⎟⎟⎟⎟⎟⎟⎟⎟⎠,

where the operators A∗12 and A∗

22 need not be specified because they play no rolein the calculations that follow. From

v =(vΘ

0

), vΘ(z) = DΘ(z, 0)

⎛⎜⎜⎝1

−σ∗0

00

⎞⎟⎟⎠ =

⎛⎜⎜⎜⎝1σ∗

0

zk−1

(1/r0

)⎞⎟⎟⎟⎠

and using the reproducing property of the kernel DΘ(z, w), we get

A∗jv =

(A∗j

Θ vΘ

0

), j = 0, 1, . . . , k − 1, and A∗kv =

⎛⎝ ∗

r−∗Ds1( · , 0)(

10

)⎞⎠ ,

where the entry denoted by ∗ is of no consequence here. We conclude that

PA∗kv = r−∗Ds1( · , 0)(

10

)=

1n(0)∗

Ds1( · , 0)(

10

).


The proof of the formula for |σ0| > 1 can be given in a similar way and is thereforeomitted.

For |σ0| = 1 and q = 0, d(0) = −Q(0) = −c0 and c(0) = σ∗0Q(0) = σ∗

0c0.From

v =(vΘ

0

), vΘ(z) = DΘ(z, 0)

⎛⎜⎜⎝1

−σ∗0

00

⎞⎟⎟⎠ =

⎛⎜⎜⎝1σ∗

0

zk−1

(1

−σ∗0

)⎞⎟⎟⎠ ,

we see that for 1 ≤ j ≤ k − 1,

A∗jv =(A∗j

Θ vΘ

0

), and A∗kv =

⎛⎜⎝A∗k vΘ +s1(0)∗ − σ∗

0

n(0)∗C∗

Θ

(0−1

)0

⎞⎟⎠ .

Using

DΘ(z, 0)

⎛⎜⎜⎝0−100

⎞⎟⎟⎠ =

⎛⎜⎜⎝01

zk−1

(0−1

)⎞⎟⎟⎠+

⎛⎜⎜⎝ zkQ(0)(σ0

1

)Q(z∗)∗ −Q(0)∗

z

(σ0

−1

)⎞⎟⎟⎠

and

A∗(k−1)Θ C∗

Θ

(0−1

)=

⎛⎜⎜⎝∗∗0−1

⎞⎟⎟⎠+

⎛⎜⎜⎝∗∗

∗(σ0

−1

)⎞⎟⎟⎠ ,

we see that for 0 ≤ j ≤ k − 1,

A∗(k+j)v =

⎛⎜⎝A∗(k+j) vΘ + pj(A∗Θ)C∗

Θ

(0−1

)0

⎞⎟⎠ ,

where pj(z) is a polynomial of degree j with leading coefficient

s1(0)∗ − σ∗0

n(0)∗=

1c∗0σ0

.

Finally we obtain

A∗2kv =

⎛⎜⎜⎝∗

c∗0σ0s1(0)∗ − σ∗

0

n(0)∗2Ds1( · , 0)

(10

)⎞⎟⎟⎠ =

⎛⎜⎜⎝∗

1n(0)∗

Ds1( · , 0)(

10

)⎞⎟⎟⎠ ,

and conclude that

PT ∗2kv =1

n(0)∗Ds1( · , 0)

(10

).


To prove the formula for the case |σ0| = 1 and q > 0 we use the decomposition

in Theorem 4.3. Setting N1 = C∗Θ0

3

(0−1

)and N2 = c∗0C

∗Ψq

(−σ0

1

)we get

A∗q =

⎛⎜⎜⎜⎜⎜⎜⎜⎜⎝

A∗Θ0

3+ 〈 · ,

BΘ03

n(0)

(s1(0)

0

)〉N1 〈 · ,

BΨq

n(0)

(s1(0)

1

)〉N1 ∗

C∗ΨqB∗

Θ03+ 〈 · ,

BΘ03

n(0)

(s1(0)

0

)〉N2 A∗

Ψq+ 〈 · ,

BΨq

n(0)

(s1(0)

1

)〉N2 ∗

0 〈 · ,BΨq

n(0)

(0

−c(0)

)〉C∗

s1∗

⎞⎟⎟⎟⎟⎟⎟⎟⎟⎠.

With

vΘ03

= DΘ03(z, 0)

⎛⎜⎜⎝1

−σ∗0

00

⎞⎟⎟⎠ =

⎛⎜⎜⎝1σ∗

0

zk−1

(1

−σ∗0

)⎞⎟⎟⎠ ,

wΘ03

:= DΘ03( · , 0)

⎛⎜⎜⎝0−100

⎞⎟⎟⎠ , and wΨq := DΨq( · , 0)

⎛⎜⎜⎝0−100

⎞⎟⎟⎠we obtain

A∗jq vq =

⎛⎝A∗jΘ0

3vΘ0

3

00

⎞⎠ , A∗jΘ0

3vΘ0

3=

⎛⎜⎜⎝ zj

(1σ∗

0

)zk−(1+j)

(1

−σ∗0

)⎞⎟⎟⎠ , j = 0, 1, . . . , k − 1,

A∗k+jq =

⎛⎜⎝A∗k+jΘ0

3vΘ0

3+ pj(A∗

Θ03)wΘ0

3

00

⎞⎟⎠ , j = 0, 1, . . . , k − 1,

where pj(z) is a polynomial of degree j in z with leading coefficient s1(0)∗/n(0)∗,and finally,

A∗2kq =

⎛⎜⎜⎜⎜⎝A∗2k

Θ03vΘ0

3+ pk(A∗

Θ03)wΘ0

3

s∗1(0)n(0)∗

wΨq

0

⎞⎟⎟⎟⎟⎠ .

For 1 ≤ j ≤ q − 1 the last formula yields

A∗2k+jq =

⎛⎜⎜⎜⎜⎝A∗2k+j

Θ03

vΘ03+ pk+j(A∗

Θ03)wΘ0

3

s1(0)∗

n(0)∗A∗j

ΨqwΨq

0

⎞⎟⎟⎟⎟⎠ ,

which gives the desired result.


5. The symmetry condition

In this section we apply [5, Corollary 3.5, Theorem 3.1]:

Theorem 5.1. Let s(z) ∈ S^0_κ have the Taylor expansion s(z) = \sum_{n=0}^{\infty} \sigma_n z^n at z = 0, and assume s(z) = s_U(z), where

U = \begin{pmatrix} A_s & u_s \\ \langle\,\cdot\,, v_s\rangle & s(0) \end{pmatrix} : \begin{pmatrix} P \\ \mathbb{C} \end{pmatrix} \to \begin{pmatrix} P \\ \mathbb{C} \end{pmatrix}

is a closely connected unitary colligation. The following are equivalent:

(1) There exists a λ ∈ C with |λ| = 1 such that λσ_n is real for all n.
(2) A_s is J_s-selfadjoint for some signature operator J_s on P.

In this case J_s is unique and J_s v_s = λ u_s.

In the following we may assume without loss of generality that λ = 1. If in the interpolation problem (BIP) the interpolation data are real and the Taylor expansion at z = 0 of the parameter function s_1(z),

s_1(z) = \sum_{n=0}^{\infty} \tau_n z^n, \qquad (5.1)

has real coefficients τ_n, then the Taylor coefficients σ_n of the corresponding solution s(z) are also real. So there exist signature operators J_{s_1} on the state space D(s_1) and J_s on the state space D(s) = \begin{pmatrix} D(\Theta) \\ D(s_1) \end{pmatrix} (see Theorem 4.1) such that A_{s_1} is J_{s_1}-selfadjoint and A_s is J_s-selfadjoint. We express J_s : \begin{pmatrix} D(\Theta) \\ D(s_1) \end{pmatrix} \to \begin{pmatrix} D(\Theta) \\ D(s_1) \end{pmatrix} in terms of J_{s_1} and the interpolation data. We consider three cases corresponding to |σ_0| < 1, |σ_0| > 1 and |σ_0| = 1. For any function x(z) with Taylor expansion at z = 0

x(z) = \sum_{n=0}^{\infty} x_n z^n,

we denote by [x]_k(z) the polynomial consisting of the first k terms of the series:

[x]_k(z) = x_0 + x_1 z + \cdots + x_{k-1} z^{k-1}.

Recall that R_0 is the difference-quotient operator defined by (4.7).

Case I: σ0 ∈ R, |σ0| < 1 and τn ∈ R. Using the notation as in Theorem 3.1 (i) andTheorem 4.1 we have

Js

⎛⎝rt1(z)ut2(z)e1

g

⎞⎠ =

⎛⎝rf1(z)uf2(z)e1

h

⎞⎠ ,


where t1(z) is a polynomial of degree ≤ k, g, h ∈ D(s1),

t2(z) = zk−1t1(1/z),

f1(z) = [s1t2]k(z) + 〈(1 − zkAks1

)(1− zAs1)−1Js1h, vs1〉

f2(z) = zk−1f1(1/z) = [s1]k(R0)t1(z) + 〈(zk −Aks1

)(z −As1)−1Js1h, vs1〉,

h = t1(As1)us1 + Aks1Js1g.

Relative to the basis given in Theorem 3.1 (i), Js has the matrix representa-tion

Js =

⎛⎜⎜⎜⎜⎜⎜⎜⎜⎜⎜⎜⎜⎜⎜⎝

0 · · · 0 0 0 τ0 〈 · , us1〉0 · · · 0 0 τ0 τ1 〈 · , As1us1〉0 · · · 0 τ0 τ1 τ2 〈 · , A2

s1us1〉

......

......

0 τ0 τ1 τ2 · · · τk−2 〈 · , Ak−2s1

us1〉τ0 τ1 τ2 τ3 · · · τk−1 〈 · , Ak−1

s1us1〉

us1 As1us1 A2s1us1 A3

s1us1 · · · Ak−1

s1us1 Ak

s1Js1

⎞⎟⎟⎟⎟⎟⎟⎟⎟⎟⎟⎟⎟⎟⎟⎠.

Case II: σ0 ∈ R, |σ0| > 1. In the basic interpolation problem (BIP) the parameters1(z) has the property s1(0) = 0. Hence s−1

1 (z) = 1/s1(z) is holomorphic at z = 0and we write its Taylor expansion as

s−11 (z) =

∞∑n=0

µnzn.

Since s1(z) has real Taylor coefficients, the coefficients µn are real also. Using thenotation as in Theorem 3.1 (ii) and Theorem 4.1 we have

Js

⎛⎝−rt1(z)ut2(z)e2

g

⎞⎠ =

⎛⎝−rf1(z)uf2(z)e2

h

⎞⎠ ,

where t1(z) is a polynomial of degree ≤ k, g, h ∈ D(s1),

t2(z) = zk−1t1(1/z),

f1(z) = [s−11 t2]k(z) + µ0〈(1 − zkBk

s1)(1− zBs1)

−1Js1h, vs1〉,f2(z) = zk−1f1(1/z) = [s−1

1 ]k(R0)t1(z) + µ0〈(zk − Bks1

)(z −Bs1)−1Js1h, vs1〉,

h = −µ0t1(Bs1)us1 + Bks1Js1g,

and Bs1 = As1 − µ0〈 · , vs1〉us1 .


Relative to the basis given in Theorem 3.1 (ii), Js has the matrix representation

Js =

⎛⎜⎜⎜⎜⎜⎜⎜⎜⎜⎜⎜⎜⎜⎜⎜⎝

0 · · · 0 0 µ0 µ0〈 · , us1〉0 · · · 0 µ0 µ1 µ0〈 · , Bs1us1〉0 · · · µ0 µ1 µ2 µ0〈 · , B2

s1us1〉...

......

...

0 µ0 µk−2 µ0〈 · , Bk−2s1 us1〉

µ0 µ1 µ2 · · · µk−1 µ0〈 · , Bk−1s1 us1〉

−µ0us1 −µ0Bs1us1 −µ0B2s1us1 · · · −µ0B

k−1s1 us1 Bk

s1Js1

⎞⎟⎟⎟⎟⎟⎟⎟⎟⎟⎟⎟⎟⎟⎟⎟⎠.

We sketch the proof of Case II, that of Case I is similar and therefore omitted.Evidently, Js is selfadjoint in the space (Ck)′⊕D(s1) where (Ck)′ is the anti-Hilbertspace of Ck, that is, the space Ck provided with the negative inner product −y∗x,x, y ∈ Ck. Let B be the basis for the space D(Θ2) given in Theorem 3.1 (ii). Then

AΘ2B = B

⎛⎜⎜⎜⎜⎜⎜⎜⎝

0 1 0 0 · · · 00 0 1 0 · · · 00 0 0 1 · · · 0...

......

. . . . . ....

0 0 0 · · · 0 10 0 0 · · · 0 0

⎞⎟⎟⎟⎟⎟⎟⎟⎠, BΘ2

(ab

)= B

⎛⎜⎜⎜⎝00...1

⎞⎟⎟⎟⎠(0 −1

)(ab

),

and CΘ2B =(−ru 0 · · · 0

). It follows that A in Theorem 4.2 with Θ = Θ2

has the matrix representation

A =

⎛⎜⎜⎜⎜⎜⎜⎜⎜⎜⎜⎜⎜⎜⎝

0 1 0 0 · · · 0 00 0 1 0 · · · 0 0...

.... . .

. . .. . .

......

0 0 0 0. . . 0 0

0 0 0 0. . . 1 0

−µ0σ∗o 0 0 0 · · · 0 µ0〈 · , vs1〉

σ∗0µ0us1 0 0 0 · · · 0 Bs1

⎞⎟⎟⎟⎟⎟⎟⎟⎟⎟⎟⎟⎟⎟⎠.

Using the identities (see [1, Lemma 4.3])

Bs1B∗s1− µ2

0〈 · , us1〉us1 = I, B∗s1Bs1 − µ2

0〈 · , vs1〉vs1 = I,

Bs1vs1 + µ0us1 = 0, B∗s1us1 + µ0vs1 = 0,

〈vs1 , vs1〉 = 1− 1µ2

0

, 〈us1 , us1〉 = 1− 1µ2

0

,

we find after some tedious but straightforward calculations that J2s = I and that

JsA is selfadjoint in (Ck)′ ⊕D(s1).


Case III: As an example we consider the case where in Theorem 2.2 k = 1, q = 0,|σ0| = 1 and σ1 = s0. We assume that the interpolation data are real, that is,σ0 = ±1 and s0 ∈ R, and that the Taylor coefficients τn of s1(z) in (5.1) are real.Note that the polynomial p(z) in (2.3) has the form p(z) = c0 = σ0/s0 and thataccording to (2.5) τ0 = s0. Using the notation as in Theorem 3.1 (iii) and Theorem4.1 we find that

Js

⎧⎪⎨⎪⎩⎛⎜⎝ t0u

t0Jug

⎞⎟⎠+

⎛⎜⎝t1(Ju− 2c∗0u)

t1(u− 2c0Ju)0

⎞⎟⎠⎫⎪⎬⎪⎭ =

⎛⎜⎝ f0u

f0Juh

⎞⎟⎠+

⎛⎜⎝f1(Ju− 2c∗0u)

f1(u− 2c0Ju)0

⎞⎟⎠ ,

where t0 and t1 are complex numbers, g, h ∈ D(s1),

f0 =(τ0 + σ0)s02(τ0 − σ0)

t0 +(τ0 + σ0)2s20 + 4s0τ1 + 1

2(τ0 − σ0)2s0t1 +

τ0 + σ0

2(τ0 − σ0)2〈g, us1〉

− σ0

τ0 − σ0〈g,Bs1us1〉,

f1 =s02

t0 +

τ0 + σ0

τ0 − σ0t1 +

1τ0 − σ0

〈g, us1〉,

h =s0t0

τ0 − σ0us1 +

(τ0 + σ0)t1(τ0 − σ0)2

us1 − 2σ0t1

τ0 − σ0Bs1us1 +

s0(τ0 − σ0)2

〈g, us1〉us1

+B2s1Js1g,

and Bs1 = As1 −〈 · , vs1〉τ0 − σ0

us1 . Relative to the basis given in Theorem 3.1 (iii), Js

has the matrix representation

Js =

⎛⎜⎜⎜⎜⎜⎜⎜⎜⎝

(τ0 + σ0)s0

2(τ0 − σ0)

(τ0 + σ0)2s2

0 + 4s0τ1 + 1

2(τ0 − σ0)2s0

(τ0 + σ0)〈 · , us1〉2(τ0 − σ0)2

− σ0〈 · , Bs1us1〉τ0 − σ0

s0

2

s0(τ0 + σ0)

2(τ0 − σ0)

s0〈 · , us1〉2(τ0 − σ0)

s0

τ0 − σ0us1

(τ0 + σ0)us1

(τ0 − σ0)2−2

σ0Bs1us1

τ0 − σ0B2

s1Js1 +s0〈 · , us1〉us1

(τ0 − σ0)2

⎞⎟⎟⎟⎟⎟⎟⎟⎟⎠.

To show that Js is selfadjoint one uses the fact that the Gram matrix G is given by

G =

⎛⎝0 2 02 0 00 0 I

⎞⎠and that J∗ = G−1J×

s G where J×s is the complex conjugate transpose of Js.

The equality J2s = I can be established by using that in this case Bs1 is a unitary


operator on the space D(s1). Lastly, that JsA is selfadjoint in the space C2⊕D(s1)with Gram matrix G follows from straightforward calculations and the matrixrepresentation

A =

⎛⎜⎜⎜⎜⎜⎜⎜⎝

− (τ0 + σ0)σ0s02(τ0 − σ0)

(τ0 + σ0)σ0s02(τ0 − σ0)

− 2σ0

s0−σ0〈 · , vs1〉

τ0 − σ0

−σ0s02

σ0s02

0

− σ0s0τ0 − σ0

us1

σ0s0τ0 − σ0

us1 Bs1

⎞⎟⎟⎟⎟⎟⎟⎟⎠of A with respect to the basis given in Theorem 3.1 (iii).

Acknowledgement

I would like to thank my Ph.D. advisor Professor Aad Dijksma for his valuable contribution towards the writing of this paper. His remarks have been very useful in coming up with this final version.

References

[1] D. Alpay, T.Ya. Azizov, A. Dijksma, and H. Langer, The Schur algorithm for generalized Schur functions I: Coisometric realizations, Operator Theory: Adv. Appl., vol. 129, Birkhäuser Verlag, Basel, 2001, 1–36.

[2] D. Alpay, T.Ya. Azizov, A. Dijksma, and H. Langer, The Schur algorithm for generalized Schur functions II: Jordan chains and transformation of characteristic functions, Monatsh. Math. 138 (2003), 1–29.

[3] D. Alpay, T.Ya. Azizov, A. Dijksma, H. Langer, and G. Wanjala, A basic interpolation problem for generalized Schur functions and coisometric realizations, Operator Theory: Adv. Appl., vol. 143, Birkhäuser Verlag, Basel, 2003, 39–76.

[4] D. Alpay, T.Ya. Azizov, A. Dijksma, H. Langer, and G. Wanjala, The Schur algorithm for generalized Schur functions IV: unitary realizations, Operator Theory: Adv. Appl., Birkhäuser Verlag, Basel, to appear.

[5] D. Alpay, T.Ya. Azizov, A. Dijksma, and J. Rovnyak, Colligations in Pontryagin spaces with a symmetric characteristic function, Operator Theory: Adv. Appl., vol. 130, Birkhäuser Verlag, Basel, 2001, 55–82.

[6] D. Alpay, A. Dijksma, J. van der Ploeg, and H.S.V. de Snoo, Holomorphic operators between Krein spaces and the number of squares of associated kernels, Operator Theory: Adv. Appl., vol. 59, Birkhäuser Verlag, Basel, 1992, 11–29.

[7] D. Alpay, A. Dijksma, J. Rovnyak, and H. de Snoo, Schur functions, operator colligations, and reproducing kernel Pontryagin spaces, Operator Theory: Adv. Appl., vol. 96, Birkhäuser Verlag, Basel, 1997.

[8] M.J. Bertin, A. Decomps-Guilloux, M. Grandet-Hugot, M. Pathiaux-Delfosse, and J.P. Schreiber, Pisot and Salem numbers, Birkhäuser Verlag, Basel, 1992.

[9] C. Chamfy, Fonctions méromorphes sur le cercle unité et leurs séries de Taylor, Ann. Inst. Fourier 8 (1958), 211–251.

[10] P. Delsarte, Y. Genin, and Y. Kamp, Pseudo-Carathéodory functions and Hermitian Toeplitz matrices, Philips J. Res. 41(1) (1986), 1–54.

[11] J. Dufresnoy, Le problème des coefficients pour certaines fonctions méromorphes dans le cercle unité, Ann. Acad. Sc. Fenn. Ser. A.I 250/9 (1958), 1–7.

Gerald Wanjala
Department of Mathematics
University of Groningen
P.O. Box 800
NL-9700 AV Groningen, The Netherlands
e-mail: [email protected]


Operator Theory: Advances and Applications, Vol. 160, 469–478
© 2005 Birkhäuser Verlag Basel/Switzerland

Trace-Class Weyl Transforms

M.W. Wong

This paper is dedicated to Professor Israel Gohberg on the occasion of his 75th birthday.

Abstract. Criteria for Weyl transforms to be in the trace class are given and the traces of these trace-class Weyl transforms are computed. A characterization of trace-class Weyl transforms is proved and a trace formula for all trace-class Weyl transforms is derived.

Mathematics Subject Classification (2000). Primary 47G30.

Keywords. Weyl transforms, localization operators, Weyl-Heisenberg groups, traces.

1. Introduction

Let σ ∈ L^1(R^{2n}) ∪ L^2(R^{2n}). Then the Weyl transform associated to the symbol σ is the bounded linear operator W_σ : L^2(R^n) → L^2(R^n) given by

(W_\sigma f, g)_{L^2(\mathbb{R}^n)} = (2\pi)^{-n/2} \int_{\mathbb{R}^n}\!\int_{\mathbb{R}^n} \sigma(x,\xi)\, W(f,g)(x,\xi)\, dx\, d\xi

for all f and g in L^2(R^n), where ( , )_{L^2(R^n)} is the inner product in L^2(R^n) and W(f,g) is the Wigner transform of f and g defined by

W(f,g)(x,\xi) = (2\pi)^{-n/2} \int_{\mathbb{R}^n} e^{-i\xi\cdot p}\, f\Bigl(x + \frac{p}{2}\Bigr)\, \overline{g\Bigl(x - \frac{p}{2}\Bigr)}\, dp

for all x and ξ in R^n. It is well known that W(f,g) ∈ L^2(R^{2n}) for all f and g in L^2(R^n).

for all x and ξ in Rn.It is well known that W (f, g) ∈ L2(R2n) for all f and g in L2(Rn).Let X be a complex and separable Hilbert space in which the inner product

is denoted by ( , ), and let A : X → X be a compact operator. If we denote byA∗ : X → X the adjoint of A : X → X , then the linear operator (A∗A)

12 : X → X

is positive and compact. Let ψk : k = 1, 2, . . . be an orthonormal basis for X

This research has been partially supported by the Natural Sciences and Engineering ResearchCouncil of Canada (NSERC) OGP0008562.

Page 471: Recent Advances in Operator Theory and its Applications: The Israel Gohberg Anniversary Volume

470 M.W. Wong

consisting of eigenvectors of (A∗A)12 : X → X , and let sk(A) be the eigenvalue

corresponding to the eigenvector ψk, k = 1, 2, . . . . We call sk(A), k = 1, 2, . . . ,the singular values of A : X → X. If

∑∞k=1 sk(A) < ∞, then the linear operator

A : X → X is said to be in the trace class S1. It can be shown that S1 is a Banachspace in which the norm ‖ ‖S1 is given by

‖A‖S1 =∞∑

k=1

sk(A), A ∈ S1.

Let A : X → X be a linear operator in S1 and let ϕk : k = 1, 2, . . . be anyorthonormal basis for X . Then it can be shown that the series

∑∞k=1(Aϕk, ϕk) is

absolutely convergent and the sum is independent of the choice of the orthonormalbasis ϕk : k = 1, 2, . . .. Thus, we can define the trace tr(A) of every linearoperator A : X → X in S1 by

tr(A) =∞∑

k=1

(Aϕk, ϕk),

where ϕk : k = 1, 2, . . . is any orthonormal basis for X .
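A minimal example, not from the paper: a diagonal operator with summable eigenvalues.

% Hypothetical example on X = \ell^2: let A e_k = k^{-2} e_k for the standard
% orthonormal basis \{e_k\}. Then (A^*A)^{1/2} = A and s_k(A) = k^{-2}, so
\[
  \|A\|_{S_1} = \sum_{k=1}^{\infty} \frac{1}{k^2} = \frac{\pi^2}{6} < \infty,
  \qquad
  \operatorname{tr}(A) = \sum_{k=1}^{\infty} (A e_k, e_k) = \frac{\pi^2}{6},
\]
% and A belongs to the trace class S_1.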

Key problems

1. Which functions σ in L^1(R^{2n}) ∪ L^2(R^{2n}) are such that W_σ : L^2(R^n) → L^2(R^n) is in S_1?
2. If W_σ : L^2(R^n) → L^2(R^n) is in S_1, what is tr(W_σ)?

A sample of known results is surveyed in Section 2. In Section 3, we use the theory of two-wavelet localization operators on the Weyl-Heisenberg group to give another class of symbols σ for which W_σ : L^2(R^n) → L^2(R^n) is in S_1. In Sections 4 and 5, we recall, respectively, Hilbert-Schmidt operators and twisted convolutions, which are used in Section 6 to provide a characterization of trace-class Weyl transforms. A trace formula for all trace-class Weyl transforms is given in Section 7.

2. Some known results

Good sufficient conditions on σ to ensure that W_σ : L^2(R^n) → L^2(R^n) is in S_1 can be formulated in terms of Sobolev spaces and weighted L^2 spaces. For a positive number s, we let H^{s,2} be the set of functions on R^{2n} defined by

H^{s,2} = \Bigl\{ \sigma \in L^2(\mathbb{R}^{2n}) : \int_{\mathbb{R}^{2n}} (1 + |q|^2 + |p|^2)^s\, |\hat\sigma(q,p)|^2\, dq\, dp < \infty \Bigr\},

where \hat\sigma is the Fourier transform of σ defined by

\hat\sigma(z) = (2\pi)^{-n} \lim_{R\to\infty} \int_{|\zeta|\le R} e^{-i z\cdot\zeta}\, \sigma(\zeta)\, d\zeta, \qquad z \in \mathbb{R}^{2n},


and the convergence is understood to take place in L^2(R^{2n}). It is clear that H^{s,2} is the L^2 Sobolev space of order s on R^{2n}. We define the space L^{s,2} on R^{2n} by

L^{s,2} = \Bigl\{ \sigma \in L^2(\mathbb{R}^{2n}) : \int_{\mathbb{R}^{2n}} (1 + |x|^2 + |\xi|^2)^s\, |\sigma(x,\xi)|^2\, dx\, d\xi < \infty \Bigr\}.

The following result is due to Daubechies [2] and Hörmander [12].

Theorem 2.1. Let σ ∈ H^{s,2} ∩ L^{s,2}, s > 2n. Then W_σ : L^2(R^n) → L^2(R^n) is in S_1.

The following improvement of Theorem 2.1 can be found in the paper [11] by Heil, Ramanathan and Topiwala.

Theorem 2.2. Let σ ∈ H^{s,2} ∩ L^{s,2}, s > n. Then W_σ : L^2(R^n) → L^2(R^n) is in S_1.

The following sufficient condition in terms of the Wigner transform can be found in Section 9 of the book [4] by Dimassi and Sjöstrand, and also in the paper [8] by Gröchenig.

Theorem 2.3. Let σ ∈ L^2(R^{2n}) be such that W(σ,σ) ∈ L^1(R^{4n}). Then W_σ : L^2(R^n) → L^2(R^n) is in S_1.

Using the terminology of modulation spaces in the book [9] by Gröchenig, the symbol σ in the preceding theorem is said to be in the space M^{1,1}. It is shown in Chapter 11 of [9] that

M^{1,1} \subseteq W(\mathbb{R}^{2n}),

where W(R^{2n}) is the Wiener space defined by

W(\mathbb{R}^{2n}) = \Bigl\{ f \in L^\infty(\mathbb{R}^{2n}) : \sum_{m \in \mathbb{Z}^{2n}} \| f(\cdot + m) \|_{L^\infty([0,1]^{2n})} < \infty \Bigr\}.

Since W(R^{2n}) ⊆ L^1(R^{2n}), it follows that M^{1,1} ⊆ L^1(R^{2n}). Thus, the symbols in Theorems 2.1–2.3 are in L^1(R^{2n}). The advantage of having L^1 symbols is revealed by the following result in the paper [5] by Du and Wong.

Theorem 2.4. Let σ ∈ L^1(R^{2n}) be such that the Weyl transform W_σ : L^2(R^n) → L^2(R^n) is in S_1. Then

\operatorname{tr}(W_\sigma) = (2\pi)^{-n} \int_{\mathbb{R}^n}\!\int_{\mathbb{R}^n} \sigma(x,\xi)\, dx\, d\xi.

We see in a moment that there are functions σ ∈ L^1(R^{2n}) for which W_σ : L^2(R^n) → L^2(R^n) is not in S_1.

3. Two-wavelet localization operators

We begin with a recall of the definition of the Weyl-Heisenberg group. Let (WH)^n = R^{2n} × R/2πZ, where Z is the set of all integers. Then we define the binary operation · on (WH)^n by

(q_1, p_1, t_1) \cdot (q_2, p_2, t_2) = (q_1 + q_2,\; p_1 + p_2,\; t_1 + t_2 + q_1 \cdot p_2)

for all points (q_1, p_1, t_1) and (q_2, p_2, t_2) in (WH)^n, where q_1 · p_2 is the Euclidean inner product of q_1 and p_2 in R^n; t_1, t_2 and t_1 + t_2 + q_1 · p_2 are cosets in the quotient group R/2πZ in which the group law is addition modulo 2π. With respect to the multiplication ·, (WH)^n is a non-abelian group in which (0, 0, 0) is the identity element and the inverse element of (q, p, t) is (−q, −p, −t + q · p) for all (q, p, t) in (WH)^n.
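A one-line check of the stated inverse, immediate from the group law but not written out in the paper:

% Verification that (-q,-p,-t+q\cdot p) is the inverse of (q,p,t):
\[
  (q,p,t)\cdot(-q,-p,-t+q\cdot p)
  = \bigl(q-q,\; p-p,\; t+(-t+q\cdot p)+q\cdot(-p)\bigr)
  = (0,0,0).
\]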

To simplify the notation a little bit, we identify R^{2n} with C^n. Thus, (WH)^n = C^n × R/2πZ, which can be identified with C^n × [0, 2π] = R^{2n} × [0, 2π]. Thus, it is plausible, and indeed the case, that the Lebesgue measure dq dp dt on R^{2n} × [0, 2π] is the left and right Haar measure on (WH)^n. Therefore (WH)^n is a locally compact, Hausdorff and unimodular group, which we call the Weyl-Heisenberg group.

Let π : (WH)^n → U(L^2(R^n)) be the mapping defined by

(\pi(q,p,t) f)(x) = e^{i(p\cdot x - q\cdot p + t)} f(x - q), \qquad x \in \mathbb{R}^n,

for all (q, p, t) in (WH)^n and all f in L^2(R^n), where U(L^2(R^n)) is the group of all unitary operators on L^2(R^n). Then it can be shown that π is an irreducible and unitary representation of (WH)^n on L^2(R^n) such that there exists a function ϕ in L^2(R^n) with ‖ϕ‖_{L^2(R^n)} = 1 and

\int_0^{2\pi}\!\int_{\mathbb{R}^n}\!\int_{\mathbb{R}^n} |(\varphi, \pi(q,p,t)\varphi)_{L^2(\mathbb{R}^n)}|^2\, dq\, dp\, dt < \infty.

In more succinct language, we say that π is a square-integrable representation of (WH)^n on L^2(R^n), ϕ is an admissible wavelet for π, and the number c_ϕ defined by

c_\varphi = \int_0^{2\pi}\!\int_{\mathbb{R}^n}\!\int_{\mathbb{R}^n} |(\varphi, \pi(q,p,t)\varphi)_{L^2(\mathbb{R}^n)}|^2\, dq\, dp\, dt

is the wavelet constant associated to ϕ. In fact, it is well known that every function ϕ in L^2(R^n) with ‖ϕ‖_{L^2(R^n)} = 1 is an admissible wavelet and

c_\varphi = (2\pi)^{n+1}.

We can now look at localization operators with, say, L^1 symbols on the Weyl-Heisenberg group. To this end, let ϕ and ψ be two admissible wavelets for π and let F ∈ L^1((WH)^n). Then the localization operator L_{F,ϕ,ψ} : L^2(R^n) → L^2(R^n) associated to the symbol F and the admissible wavelets ϕ and ψ is defined by

(L_{F,\varphi,\psi} u, v)_{L^2(\mathbb{R}^n)} = \frac{1}{c_\varphi} \int_{(WH)^n} F(g)\, (u, \pi(g)\varphi)_{L^2(\mathbb{R}^n)}\, (\pi(g)\psi, v)_{L^2(\mathbb{R}^n)}\, d\mu(g)

for all u and v in L^2(R^n), where dμ(g) is the Haar measure on (WH)^n.

To specialize, we let F ∈ L^1(R^{2n}) and let \tilde F ∈ L^1((WH)^n) be defined by

\tilde F(q,p,t) = F(q,p), \qquad (q,p,t) \in (WH)^n.


Then simple calculations give

(L_{\tilde F,\varphi,\psi} u, v)_{L^2(\mathbb{R}^n)} = (2\pi)^{-n} \int_{\mathbb{R}^n}\!\int_{\mathbb{R}^n} F(q,p)\, (u, \varphi_{q,p})_{L^2(\mathbb{R}^n)}\, (\psi_{q,p}, v)_{L^2(\mathbb{R}^n)}\, dq\, dp

for all u and v in L^2(R^n), where ϕ_{q,p} is the function defined by

\varphi_{q,p}(x) = e^{i p\cdot x}\, \varphi(x - q), \qquad x \in \mathbb{R}^n.

It is worth pointing out that the localization operator L_{\tilde F,ϕ,ψ} : L^2(R^n) → L^2(R^n) is the same as the linear operator D_{F,ϕ,ψ} : L^2(R^n) → L^2(R^n) given by

(D_{F,\varphi,\psi} u, v)_{L^2(\mathbb{R}^n)} = (2\pi)^{-n} \int_{\mathbb{R}^n}\!\int_{\mathbb{R}^n} F(q,p)\, (u, \varphi_{q,p})_{L^2(\mathbb{R}^n)}\, (\psi_{q,p}, v)_{L^2(\mathbb{R}^n)}\, dq\, dp

for all u and v in L^2(R^n). If ϕ = ψ, then the linear operator D_{F,ϕ,ϕ} : L^2(R^n) → L^2(R^n) is the localization operator first studied in the paper [3] by Daubechies in the context of signal analysis. It is convenient to call D_{F,ϕ,ψ} : L^2(R^n) → L^2(R^n) the Daubechies operator with symbol F and admissible wavelets ϕ and ψ.

The connection that is useful to us is the following theorem, which is essentially a consequence of Theorems 16.1 and 17.1 in the book [16] by Wong.

Theorem 3.1. Let F ∈ L^1(R^{2n}). Then

D_{F,\varphi,\psi} = W_{F * V(\varphi,\psi)},

where V(ϕ,ψ) is the Fourier-Wigner transform of ϕ and ψ given by

V(\varphi,\psi)^{\wedge} = W(\varphi,\psi).

The following theorem, which is an immediate consequence of Theorems 2.4 and 3.1, is a special case of Theorem 16.1 in the book [17] by Wong.

Theorem 3.2. Let F ∈ L^1(R^{2n}). Then D_{F,ϕ,ψ} : L^2(R^n) → L^2(R^n) is in S_1 and

\operatorname{tr}(L_{\tilde F,\varphi,\psi}) = (\psi,\varphi)_{L^2(\mathbb{R}^n)}\, (2\pi)^{-n} \int_{\mathbb{R}^n}\!\int_{\mathbb{R}^n} F(q,p)\, dq\, dp.

From the preceding two theorems, we have the following sufficient conditionfor a Weyl transform to be in the trace class.

Theorem 3.3. Let σ ∈ {L1(R2n) ∗ V(ϕ, ψ) : ϕ, ψ ∈ L2(Rn)}. If we write

\[
\sigma = F * V(\varphi,\psi),
\]

where F ∈ L1(R2n) and ϕ, ψ ∈ L2(Rn), then Wσ : L2(Rn) → L2(Rn) is in S1 and

\[
\mathrm{tr}(W_{F * V(\varphi,\psi)}) = \|\varphi\|_{L^2(\mathbb{R}^n)}\,\|\psi\|_{L^2(\mathbb{R}^n)}\,(\psi,\varphi)_{L^2(\mathbb{R}^n)}\int_{\mathbb{R}^n}\!\int_{\mathbb{R}^n} F(x,\xi)\, dx\, d\xi
\]

for all F in L1(R2n), and ϕ and ψ in L2(Rn).

The set {L1(R2n) ∗ V(ϕ, ψ) : ϕ, ψ ∈ L2(Rn)} can be traced back to the paper [1] by Cohen and is known as the Cohen class in time-frequency analysis.


Remark 3.4. The sufficient condition given in Theorem 3.3 for a Weyl transform to be in the trace class is also a special case of Corollary 1.12 in the paper [15] by Toft. Indeed, it follows from Corollary 1.12 in [15] that Wµ∗τ ∈ S1 when µ is a bounded measure and Wτ ∈ S1. Now, let F ∈ L1(R2n) and let τ = V(ϕ, ψ), where ϕ and ψ are functions in L2(Rn). Then F is a bounded measure. Since Wτ is an operator of rank one, Wτ ∈ S1. Thus, by Corollary 1.12 in [15], WF∗τ ∈ S1.

4. The Hilbert-Schmidt class

A compact operator A : X → X from a complex and separable Hilbert space X into X is said to be in the Hilbert-Schmidt class S2 if its singular values sk(A), k = 1, 2, . . . , are such that ∑_{k=1}^∞ s_k(A)² < ∞. It can be shown that S2 is a Hilbert space in which the inner product ( , )S2 is given by

\[
(A, B)_{S_2} = \sum_{k=1}^{\infty} (A\varphi_k, B\varphi_k), \quad A, B \in S_2,
\]

where {ϕk : k = 1, 2, . . .} is an orthonormal basis for X, the series is absolutely convergent and the sum is independent of the choice of the orthonormal basis {ϕk : k = 1, 2, . . .} for X.

We need the following connections between trace-class operators and Hilbert-Schmidt operators.

Theorem 4.1. S1 ⊆ S2.

Theorem 4.2. A linear operator A from a complex and separable Hilbert space X into X is in S1 if and only if A = BC, where B : X → X and C : X → X are linear operators in S2. Furthermore, for all B and C in S2,

\[
\|BC\|_{S_1} \le \|B\|_{S_2}\|C\|_{S_2}.
\]
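In a finite-dimensional setting, S1 and S2 reduce to the nuclear (trace) norm and the Frobenius norm, and Theorems 4.1 and 4.2 can be checked directly on random matrices; the following short sketch (an added illustration, not from the paper) does exactly that.

```python
import numpy as np

def nuclear(A):
    # ||A||_{S_1}: the sum of the singular values of A
    return np.sum(np.linalg.svd(A, compute_uv=False))

def frobenius(A):
    # ||A||_{S_2}: the Frobenius (Hilbert-Schmidt) norm of A
    return np.linalg.norm(A, "fro")

rng = np.random.default_rng(1)
m = 40
B = rng.normal(size=(m, m)) + 1j * rng.normal(size=(m, m))
C = rng.normal(size=(m, m)) + 1j * rng.normal(size=(m, m))

# ||BC||_{S_1} <= ||B||_{S_2} ||C||_{S_2}   (Theorem 4.2)
print(nuclear(B @ C) <= frobenius(B) * frobenius(C))   # True
# the S_2 norm is dominated by the S_1 norm, reflecting S_1 subset of S_2 (Theorem 4.1)
print(frobenius(B @ C) <= nuclear(B @ C))              # True
```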

There is an extensive literature on S1 and S2; we mention here only the recent books [7] by Gohberg, Goldberg and Krupnik, and [17] by Wong.

If X = L2(Rn), then it is well known that a linear operator A : L2(Rn) → L2(Rn) is in S2 if and only if there exists a function h in L2(R2n) such that

\[
(Af)(x) = \int_{\mathbb{R}^n} h(x,y)\, f(y)\, dy, \quad x \in \mathbb{R}^n,
\]

for all f in L2(Rn).
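The kernel characterization is transparent in a discretization: sampling a kernel h ∈ L2(R2) on a grid represents A by a matrix, and the squares of its singular values sum to approximately the squared L2-norm of h. The following sketch (an added illustration; the particular kernel and grid are arbitrary, and the identity ‖A‖S2 = ‖h‖L2(R2n), a standard fact not stated explicitly above, is what is being checked) makes this concrete for n = 1.

```python
import numpy as np

# Discretize (Af)(x) = int h(x, y) f(y) dy for a kernel h in L^2(R^2)  (n = 1).
x = np.linspace(-8.0, 8.0, 400); dx = x[1] - x[0]
X, Y = np.meshgrid(x, x, indexing="ij")
h = (1 + X * Y) * np.exp(-(X ** 2 + 2 * Y ** 2) / 2)    # an arbitrary kernel in L^2(R^2)

# In the orthonormal basis of L^2-normalized grid indicators, A is approximately
# represented by the matrix h * dx.
M = h * dx
sv = np.linalg.svd(M, compute_uv=False)                 # singular values s_k of the discretization

# sum_k s_k^2 approximates ||h||^2_{L^2(R^2)}, so A is Hilbert-Schmidt
print(np.sum(sv ** 2), np.sum(np.abs(h) ** 2) * dx * dx)

# applying A to a sampled function is a matrix-vector product with the same matrix
f = np.exp(-(x - 1.0) ** 2)
Af = M @ f                                              # quadrature approximation of (Af)(x_i)
print(Af[:3])
```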

The following two results, due to Pool [13], play an important role in the solutions of the key problems stated in Section 1, and can be found in Chapters 7 and 6 of the book [16] by Wong, respectively.

Theorem 4.3. Let σ and τ be in L2(R2n). Then the Weyl transforms Wσ : L2(Rn) → L2(Rn) and Wτ : L2(Rn) → L2(Rn) associated to the symbols σ and τ, respectively, are in S2 and

\[
(W_\sigma, W_\tau)_{S_2} = (2\pi)^{-n}(\sigma, \tau)_{L^2(\mathbb{R}^{2n})},
\]

where ( , )L2(R2n) is the inner product in L2(R2n).


Theorem 4.4. Every linear operator from L2(Rn) into L2(Rn) that is in S2 is a Weyl transform Wσ : L2(Rn) → L2(Rn) associated to some symbol σ in L2(R2n).

In view of Theorems 2.1 and 2.4, we see that a symbol in L1(R2n), but not in L2(R2n), gives a Weyl transform Wσ : L2(Rn) → L2(Rn) that is not in S1.

5. Twisted convolutions

Let us begin by identifying R2n with Cn and any point (x, ξ) in R2n with the point z = x + iξ in Cn. We define the symplectic form [ , ] on Cn by

\[
[z, w] = 2\,\mathrm{Im}(z \cdot w), \quad z, w \in \mathbb{C}^n,
\]

where z = (z1, z2, . . . , zn), w = (w1, w2, . . . , wn) and

\[
z \cdot w = \sum_{j=1}^{n} z_j \overline{w_j}.
\]

Let λ be a fixed real number. Then we define the twisted convolution f ∗λ g of two measurable functions f and g on Cn by

\[
(f *_\lambda g)(z) = \int_{\mathbb{C}^n} f(z-w)\, g(w)\, e^{i\lambda[z,w]}\, dw, \quad z \in \mathbb{C}^n,
\]

provided that the integral exists.
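Since the twisted convolution is the main computational ingredient in what follows, here is a minimal direct-summation sketch of f ∗λ g on a grid in C ≅ R2 (an added illustration; the Gaussian inputs and the truncation of the integral to a square are arbitrary choices, and the symplectic form is implemented as [z, w] = 2 Im(z w̄), in line with the definition above). Direct summation over an N × N grid costs O(N⁴), so the grid is kept small.

```python
import numpy as np

def twisted_conv(f, g, lam, L=6.0, N=48):
    # (f *_lam g)(z) = int_C f(z - w) g(w) e^{i lam [z, w]} dw, with [z, w] = 2 Im(z conj(w)),
    # approximated by a Riemann sum over the square [-L, L]^2; f and g are callables on C.
    u = np.linspace(-L, L, N)
    du = u[1] - u[0]
    W = (u[:, None] + 1j * u[None, :]).ravel()     # quadrature nodes w in C
    out = np.empty(W.shape, dtype=complex)
    gw = g(W)
    for j, z in enumerate(W):                      # evaluate the result at the same nodes
        phase = np.exp(1j * lam * 2.0 * np.imag(z * np.conj(W)))
        out[j] = np.sum(f(z - W) * gw * phase) * du * du
    return out

# two rapidly decaying test functions on C (arbitrary choices for this sketch)
f = lambda z: np.exp(-np.abs(z) ** 2)
g = lambda z: np.exp(-np.abs(z - 1.0) ** 2 / 2)

# the twisted convolution is not commutative, but f *_lam g = g *_{-lam} f
a = twisted_conv(f, g, lam=0.25)       # the lambda = 1/4 case appearing in Theorem 5.1 below
b = twisted_conv(g, f, lam=-0.25)
c = twisted_conv(g, f, lam=0.25)
print(np.max(np.abs(a - b)))           # small: agreement up to truncation and quadrature error
print(np.max(np.abs(a - c)))           # not small: the order matters for a fixed lambda
```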

The following formula, from the paper [10] by Grossmann, Loupias and Stein, for the product of two Weyl transforms associated to symbols in L2(R2n) plays a pivotal role in the solutions of the key problems.

Theorem 5.1. Let σ and τ be functions in L2(Cn). Then WσWτ = Wω, where ω ∈ L2(Cn) and

\[
\omega = (2\pi)^{-n}(\sigma *_{1/4} \tau).
\]

A proof of Theorem 5.1 can be found, for instance, in Chapter 12 of the book [14] by Stein and Chapter 9 of the book [16] by Wong.

Further results on extensions, continuity and applications of the twisted convolution can also be found in the book [6] by Folland and the paper [15] by Toft.

6. A characterization

We give in this section a necessary and sufficient condition on the symbol σ to ensure that the Weyl transform Wσ : L2(Rn) → L2(Rn) is in S1. To this end, we introduce the subset W of L2(Cn) given by

\[
W = \{(2\pi)^{-n}(a *_{1/4} b)^{\vee} : a, b \in L^2(\mathbb{C}^n)\},
\]

where (· · ·)∨ denotes the inverse Fourier transform of (· · ·).


Theorem 6.1. Let σ ∈ L2(Cn). Then Wσ : L2(Rn) → L2(Rn) is in S1 if and only if σ ∈ W. Furthermore, if σ = (2π)−n(a ∗_{1/4} b)∨, where a and b are in L2(Cn), then

\[
\|W_\sigma\|_{S_1} \le (2\pi)^{-n}\|a\|_{L^2(\mathbb{C}^n)}\|b\|_{L^2(\mathbb{C}^n)}.
\]

Proof. Suppose that σ ∈ W. Then

\[
\sigma = (2\pi)^{-n}(a *_{1/4} b),
\]

where a and b are in L2(Cn). By Theorem 5.1,

\[
W_\sigma = W_a W_b.
\]

By Theorem 4.3, Wa and Wb are in S2. So, by Theorem 4.2, Wσ ∈ S1. Conversely, suppose that Wσ ∈ S1. Then, by Theorem 4.1, Wσ ∈ S2. By Theorem 4.2, Wσ is a product of two linear operators in S2. By Theorem 4.4, we get

\[
W_\sigma = W_a W_b,
\]

where a and b are in L2(Cn). Thus, by Theorem 5.1,

\[
\sigma = (2\pi)^{-n}(a *_{1/4} b),
\]

and hence σ ∈ W. To prove the trace-class norm inequality, we note that

\[
\|W_\sigma\|_{S_1} = \|W_a W_b\|_{S_1} \le \|W_a\|_{S_2}\|W_b\|_{S_2} = (2\pi)^{-n}\|a\|_{L^2(\mathbb{C}^n)}\|b\|_{L^2(\mathbb{C}^n)}.
\]

Remark 6.2. Theorem 6.1 is a special case of Proposition 1.9 in the paper [15] by Toft. The more straightforward proof given above is new and interesting in its own right.

7. A trace formula

Theorem 7.1. Let σ = (2π)−n(a ∗_{1/4} b)∨, where a and b are in L2(Cn). Then

\[
\mathrm{tr}(W_\sigma) = (2\pi)^{-n}\int_{\mathbb{C}^n} a(w)\, b(w)\, dw.
\]

Proof. We first prove the theorem for a and b in S(Cn). Since

\[
(a *_{1/4} b)(z) = \int_{\mathbb{C}^n} a(z-w)\, b(w)\, e^{\frac{i}{4}[z,w]}\, dw, \quad z \in \mathbb{C}^n,
\]


we see that a ∗_{1/4} b ∈ S(Cn) and hence σ ∈ L1(Cn). By Theorem 2.4, we get

\[
\begin{aligned}
\mathrm{tr}(W_\sigma) &= (2\pi)^{-2n}\int_{\mathbb{C}^n} (a *_{1/4} b)^{\vee}(z)\, dz \\
&= (2\pi)^{-n}(a *_{1/4} b)(0) \\
&= (2\pi)^{-n}\int_{\mathbb{C}^n} a(-w)\, b(w)\, dw \\
&= (2\pi)^{-n}\int_{\mathbb{C}^n} a(w)\, b(w)\, dw.
\end{aligned}
\]

This proves the result when a and b are in S(Cn). Since S(Cn) is dense in L2(Cn), the full result follows by standard limiting arguments.

Acknowledgement

The author is grateful to the referees and Professor Cornelis van der Mee for very constructive comments that led to a much improved version of the paper.

References

[1] L. Cohen, Generalized phase space distributions, J. Math. Phys. 7 (1966), 781–786.

[2] I. Daubechies, On the distributions corresponding to bounded operators in the Weyl quantization, Comm. Math. Phys. 75 (1980), 229–238.

[3] I. Daubechies, Time-frequency localization operators: a geometric phase space approach, IEEE Trans. Inform. Theory 34 (1988), 605–612.

[4] M. Dimassi and J. Sjöstrand, Spectral Asymptotics in the Semi-Classical Limit, Cambridge University Press, 1999.

[5] J. Du and M.W. Wong, A trace formula for Weyl transforms, Approx. Theory Applic. 16 (2000), 41–45.

[6] G.B. Folland, Harmonic Analysis in Phase Space, Princeton University Press, 1989.

[7] I. Gohberg, S. Goldberg and N. Krupnik, Traces and Determinants of Linear Operators, Birkhäuser, 2001.

[8] K. Gröchenig, An uncertainty principle related to the Poisson summation formula, Studia Math. 121 (1996), 87–104.

[9] K. Gröchenig, Foundations of Time-Frequency Analysis, Birkhäuser, 2001.

[10] A. Grossmann, G. Loupias and E.M. Stein, An algebra of pseudodifferential operators and quantum mechanics in phase space, Ann. Inst. Fourier (Grenoble) 18 (1968), 343–368.

[11] C. Heil, J. Ramanathan and P. Topiwala, Singular values of compact pseudodifferential operators, J. Funct. Anal. 150 (1997), 426–452.

[12] L. Hörmander, The Weyl calculus of pseudodifferential operators, Comm. Pure Appl. Math. 32 (1979), 360–444.

[13] J.C.T. Pool, Mathematical aspects of the Weyl correspondence, J. Math. Phys. 7 (1966), 66–76.

[14] E.M. Stein, Harmonic Analysis: Real-Variable Methods, Orthogonality and Oscillatory Integrals, Princeton University Press, 1993.

[15] J. Toft, Regularizations, decompositions and lower bound problems in the Weyl calculus, Comm. Partial Differential Equations 25 (2000), 1201–1234.

[16] M.W. Wong, Weyl Transforms, Springer-Verlag, 1998.

[17] M.W. Wong, Wavelet Transforms and Localization Operators, Birkhäuser, 2002.

M.W. Wong
Department of Mathematics and Statistics
York University
4700 Keele Street
Toronto, Ontario M3J 1P3, Canada
e-mail: [email protected]