RANDOM VARIABLES, WIENER-HOPF FILTERING AND CONTROL
FORMULATED IN ABSTRACT SPACES
by
LEONARD JENG TUNG, B.S., M.S. in E.E.
A DISSERTATION
IN
ELECTRICAL ENGINEERING
Submitted to the Graduate Faculty of Texas Tech University in
Partial Fulfillment of the Requirements for
the Degree of
DOCTOR OF PHILOSOPHY
August, 1977
ACKNOWLEDGEMENTS
I am grateful to Dr. Richard Saeks for his helpful discussions
and guidance through the course of this investigation; to Dr. Stanley
R. Liberty, Dr. John F. Walkup, Dr. Marion 0. Hagler and Dr. Thomas G.
McLaughlin for their suggestions and criticisms in the preparation of
this dissertation.
TABLE OF CONTENTS
ACKNOWLEDGEMENTS
I. INTRODUCTION
II. BANACH RESOLUTION SPACE
III. FACTORIZATION OF OPERATORS
IV. HILBERT SPACE REPRESENTATION THEORY
V. REPRODUCING KERNEL RESOLUTION SPACE OF HILBERT SPACE VALUED RANDOM VARIABLES
VI. WIENER-HOPF OPTIMIZATION THEORY
VII. CONCLUSION
LIST OF REFERENCES
CHAPTER I
INTRODUCTION
For a long time it has been a goal of some systems engineers to combine the concept of causality with optimization theorems, so that optimization would lead to a realizable solution. During the past years causality has been introduced into Hilbert space — the so-called Hilbert resolution space [1] — where most of the optimization problems are formulated. However, not enough effort has been devoted to solving optimization problems under the constraint of causality. A typical problem is the Wiener-Hopf filter. This problem has been thoroughly studied in the time domain. In the frequency domain, however, the problem is solved in a rather intuitive fashion: people simply take the causal part of an optimal solution obtained without the causality constraint and claim it is the optimal solution under the constraint of causality. It is the purpose of this dissertation to formulate the problem in an abstract space and attack it by well-developed linear operator theorems. For comparison, we first look into the classical Wiener-Hopf filter. Then we discuss the techniques needed for the formulation — the techniques that will be developed in this dissertation, as well as other applications where these techniques could be used.
Classical Wiener-Hopf Filtering [2]
The problem itself is quite simple. Let x(t) denote a signal
process and n(t) denote a noise process. Both processes are stationary
and zero-mean, and they are statistically independent. However, they are mixed together, i.e. they appear in the form x(t) + n(t). What we would like to do is to design a filter, linear and causal, to operate on x(t) + n(t) such that the error, defined as e(t) = x(t) - y(t), where y(t) is the output of the filter, has the smallest variance. Let e denote the error and σ denote its variance; then

σ = E{e²} = (1/2πj) ∫_{-j∞}^{j∞} { S_x(s)[1 - H(s)][1 - H(-s)] + S_n(s)H(s)H(-s) } ds,

where S_x(s) and S_n(s) are the spectral densities of x(t) and n(t), respectively, and H(s) is the transfer function of the filter. Rearranging terms in σ, we get

σ = (1/2πj) ∫_{-j∞}^{j∞} { [S_x(s) + S_n(s)]H(s)H(-s) - S_x(s)H(s) - S_x(s)H(-s) + S_x(s) } ds.
Since S_x(s) + S_n(s) is also a spectral density, it is symmetric in the s-plane and can be factored into one factor having its poles and zeros in the left-half plane and another having the corresponding poles and zeros in the right-half plane. Thus it can be written as

S_x(s) + S_n(s) = F(s)F(-s).
Then

σ = (1/2πj) ∫_{-j∞}^{j∞} { [F(s)H(s) - S_x(s)/F(-s)][F(-s)H(-s) - S_x(s)/F(s)] + S_x(s)S_n(s)/[F(s)F(-s)] } ds.

The minimum of σ occurs when

H(s) = S_x(s)/[F(s)F(-s)] = S_x(s)/[S_x(s) + S_n(s)].

Unfortunately, this H(s) is also symmetric in the s-plane, hence it cannot represent a causal system. So far the argument is correct, although it leads nowhere. In order to get a causal filter that might do the job, some authors suggest an intuitive approach. Since the minimum of σ occurs when

F(s)H(s) - S_x(s)/F(-s) = 0,

if we simply let

F(s)H(s) = [S_x(s)/F(-s)]_+ ,

the part of the partial-fraction expansion of S_x(s)/F(-s) with all its poles in the left-half plane, then

H(s) = (1/F(s)) [S_x(s)/F(-s)]_+ .

The only problem left is to prove that (1/F(s))[S_x(s)/F(-s)]_+, the so-called Wiener-Hopf filter, does represent a causal system and is actually the optimal filter. To do so, one ought to prove that the cross terms in σ, such as

(1/2πj) ∫_{-j∞}^{j∞} { H(s)F(s) - [S_x(s)/F(-s)]_+ } [S_x(s)/F(-s)]_- ds,

where [·]_- denotes the partial-fraction part with all its poles in the right-half plane, do not contribute negative values to σ.
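The completing-the-square step above can be checked numerically on the jω-axis. The spectra below (S_x(s) = 1/(1 - s²), white noise S_n(s) = 1, with spectral factor F(s) = (sqrt(2) + s)/(1 + s)) and the test filter H are illustrative choices, not taken from the dissertation; this is a sketch of the algebraic identity only.

```python
# Numerical check of the completing-the-square identity behind the
# Wiener-Hopf derivation.  S_x, S_n, and H below are illustrative
# choices; F is the spectral factor of S_x + S_n with its pole and
# zero in the left-half plane.
def Sx(s): return 1 / (1 - s * s)           # signal spectrum, poles at +-1
def Sn(s): return 1.0 + 0j                  # white noise spectrum
def F(s):  return (2 ** 0.5 + s) / (1 + s)  # S_x(s) + S_n(s) = F(s)F(-s)
def H(s):  return 1 / (s + 3)               # arbitrary test filter

for w in (0.3, 1.0, 2.5):
    s = 1j * w
    # spectral factorization check
    assert abs(F(s) * F(-s) - (Sx(s) + Sn(s))) < 1e-12
    # integrand before completing the square ...
    lhs = ((Sx(s) + Sn(s)) * H(s) * H(-s)
           - Sx(s) * H(s) - Sx(s) * H(-s) + Sx(s))
    # ... and after
    rhs = ((F(s) * H(s) - Sx(s) / F(-s)) * (F(-s) * H(-s) - Sx(s) / F(s))
           + Sx(s) * Sn(s) / (F(s) * F(-s)))
    assert abs(lhs - rhs) < 1e-12
print("integrand identity holds on the j-omega axis")
```

The two integrands agree pointwise for any H, which is exactly why minimizing the squared term minimizes σ in the unconstrained problem.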
The Wiener-Hopf filter is a well-known result. Our purpose here
is not to rederive it but rather to point out what procedures should be
done for the formulation in abstract space.
Pre-formulation of Wiener-Hopf Filtering in Abstract Space
In this section, we will not actually formulate the problem but
rather point out what should be done for the formulation. Referring to
the classical W-H filtering, we find that there are five major considerations. They are
i. Random Process
ii. Causality
iii. Factorization
iv. Additive Decomposition
v. Optimization.
These problems are discussed in the following.
(i) Random Process: How should a random process be defined when
it is formulated in an abstract space, and what kinds of spaces are
suitable for our purpose? In order to answer this question, let's first
review what a random process is. A random process can be thought of as a function of two independent variables, x(t,w), where w is the outcome of some experiment. When t is fixed, x(t,w) represents a random variable. When w is fixed, x(t,w) represents a time function. So if x(t,w), with w fixed, is an element of some function space, say L2, we could think of the random process as a random variable which takes values in the function space L2. Of course, this idea will work only if an adequate probability measure can be defined over the function space. Fortunately this kind of probability measure has been defined over metric spaces [3,4]. For our purposes, Hilbert spaces and Banach spaces are used, not only because these spaces possess nice properties but also because stochastic concepts such as "mean" and "variance operator" can be defined therein. It turns out that the variance operator is "positive" and "self-adjoint" for a reflexive Banach space valued random variable, and these are just the properties required for operator factorization. In chapters IV and V of this dissertation, Banach space and Hilbert space valued random variables, along with their characteristics, especially the reproducing kernel resolution space, are discussed, with the probability measures over these spaces assumed implicitly.
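The idea of a function-space valued random variable can be sketched in finite dimensions by discretizing each sample path to a vector in R^n. The process below (a random gain on a fixed shape plus noise) is a hypothetical example; the matrix Q plays the role of the variance operator, and the positivity and self-adjointness claimed above can be checked directly.

```python
import random

# Finite-dimensional sketch: each draw of the process is discretized on
# n time points, i.e. treated as a vector in R^n standing in for L2.
random.seed(0)
n, trials = 8, 4000

def sample_path():
    # one draw: a random gain on a fixed shape, plus small noise
    a = random.gauss(0, 1)
    return [a * (t + 1) / n + 0.1 * random.gauss(0, 1) for t in range(n)]

paths = [sample_path() for _ in range(trials)]
# the "mean" is now a vector ...
mean = [sum(p[t] for p in paths) / trials for t in range(n)]
# ... and the "variance operator" an n x n matrix Q_ij = E{x_i x_j} - m_i m_j
Q = [[sum(p[i] * p[j] for p in paths) / trials - mean[i] * mean[j]
      for j in range(n)] for i in range(n)]

# Q is self-adjoint (symmetric) ...
assert all(Q[i][j] == Q[j][i] for i in range(n) for j in range(n))
# ... and positive: (x, Qx) >= 0 for test vectors x
for _ in range(5):
    x = [random.gauss(0, 1) for _ in range(n)]
    Qx = [sum(Q[i][j] * x[j] for j in range(n)) for i in range(n)]
    assert sum(x[i] * Qx[i] for i in range(n)) >= -1e-9
print("empirical variance operator is symmetric and positive")
```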
(ii) Causality: We mentioned that causality has been introduced in Hilbert space — the so-called Hilbert resolution space. In chapter II, we extend the work done for Hilbert space to Banach space. When a Banach space is equipped with a resolution of the identity (it is then called a Banach resolution space), there is a resolution of the identity over the dual space which is naturally induced by the original resolution of the identity (by using the adjoint). Since most of our applications are formulated in reflexive Banach spaces, emphasis is given to the reflexive Banach resolution space and its dual space with the induced resolution of the identity. Operator properties such as causality, anti-causality, left- and right-miniphase, and maxiphase are also defined and discussed in chapter II.
(iii) Factorization: Operator factorization is the foundation of this dissertation. Because of this, chapter III is totally devoted to the development of the factorization theorems. Throughout this dissertation, the desired form of factorization is either K K* or T* T. In either form, the operator to be factorized has to be "positive" and "self-adjoint". These properties, commonly used for operators on Hilbert space, can be extended to operators which map reflexive Banach spaces to their dual spaces. One more thing is demanded for the factorization: the factor space, from which or to which the factor operator maps, has to be a Hilbert space. In some applications the factor space is more important than the factor operator itself; the reproducing kernel resolution space is one example.
In the first half of chapter III, factorization theorems are
developed. These theorems give miniphase (or maxiphase) factors which
in turn give causal (or anti-causal) inverse operators when an inverse
is guaranteed. The second half of chapter III deals with reproducing
kernel resolution space — a special kind of factor space which is
formed by the factor operator given in the factorization theorem, but
is independent of the factor used in its definition.
(iv) Additive Decomposition: Operator decomposition into a causal part, an anti-causal part and a memoryless part is treated in Ref. 1. For our purposes, the operators involved are Hilbert-Schmidt operators, which guarantees the decomposition.
(v) Optimization: As in the classical W-H filtering, we would
like to minimize the second-order statistics of the error. However,
when the W-H filtering is formulated in Hilbert space, the second-order
statistics of the error are manifested in the variance operator. Even
though variance operators are positive and self-adjoint, it is not an
easy task to minimize the variance operator directly, since here we are
dealing with the partial ordering of positive operators. Fortunately,
when the operators involved are nuclear, there is a theorem which allows
us to take the trace of the variance operators and to minimize it, thus
changing the operator minimization problem into a minimization of some
scalar. Moreover, taking the trace of the operator actually kills the
cross terms in the variance operator of the error, since the cross terms
involved are either strictly causal or strictly anti-causal and the
traces of these kinds of operator are zero. This is the job that is not
done in the classical W-H filtering. In chapter VI, we first formulate
the W-H filtering in Hilbert space and solve it by using linear operator
theorems. With the result from the filtering, we then look into the
so-called W-H control. In this context, a feedback and feedforward control system is derived by the techniques developed in the W-H filtering.
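The trace argument can be seen in miniature with matrices, where strictly causal operators correspond to strictly lower-triangular matrices and strictly anti-causal ones to strictly upper-triangular matrices. This toy illustration is not the nuclear-operator machinery of chapter VI, only its finite-dimensional shadow.

```python
# Toy version of the trace argument: strictly lower- and strictly
# upper-triangular matrices have zero trace, so cross terms of this
# kind vanish when the variance operator is traced.
n = 5

def strictly_lower(seed):
    # an arbitrary strictly lower-triangular ("strictly causal") matrix
    return [[float((3 * i + 7 * j + seed) % 5) if i > j else 0.0
             for j in range(n)] for i in range(n)]

def transpose(A):
    return [[A[j][i] for j in range(n)] for i in range(n)]

def trace(A):
    return sum(A[i][i] for i in range(n))

C = strictly_lower(1)              # strictly causal cross term
A = transpose(strictly_lower(2))   # strictly anti-causal cross term
assert trace(C) == 0.0
assert trace(A) == 0.0
# a memoryless (diagonal) operator, by contrast, keeps its trace
I = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
assert trace(I) == float(n)
print("strictly causal and strictly anti-causal terms are traceless")
```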
Summary
While we have found that factorization theorems are most important to the formulation and solving of the W-H filtering and control, we have also found two other areas of application — scattering operators (the operator version of the scattering variables of classical network theory) and the reproducing kernel resolution space of Banach space valued random variables (Hilbert space valued random variables being special cases). Hence, for the reference of the reader, we summarize in the following.
In chapter II, Banach resolution spaces are defined first. Then causality, anti-causality, miniphase and maxiphase are defined. Miniphase and maxiphase are defined in such a way that the causality of the inverse can be determined once its existence is guaranteed. In chapter III, factorization theorems are developed. Positive and self-adjoint operators over reflexive Banach spaces are factorized into left- and right-miniphase operators and their adjoints. A "unique" factor space, the so-called reproducing kernel resolution space, is also discussed in this chapter. In chapter IV, scattering variables along with Banach space valued random variables are discussed. This chapter is termed "Hilbert space representation theory" since problems formulated more generally in Banach spaces may be transformed into problems in Hilbert spaces, where they are easier to deal with. In chapter V, we examine Hilbert space valued random variables. A certain kind of random variable can be approximated by a sequence of random variables which take values in the RKRS of the variance operator of the original random variable and have variance operators that approach (weakly) the identity operator of the RKRS. In chapter VI, Wiener-Hopf filtering is formulated in Hilbert space and attacked via linear operator theorems. With the result for the filtering, we then look into the so-called W-H control of a feedback and feedforward control system.
CHAPTER II
BANACH RESOLUTION SPACE
By a Banach resolution space we mean a 2-tuple (B, _B F), where B is a Banach space (a complete normed linear vector space) over the field of real numbers and _B F is the so-called resolution of the identity in B, which will be defined in the following section. By working with Banach resolution spaces, we will define and discuss properties associated with operators over the Banach spaces, such as causality, anti-causality, left- and right-miniphase and maxiphase, etc. All these properties are for the future development of the main work of this dissertation.
Resolution of Identity in Banach Space
Def.II.1. Let B be a Banach space. By a resolution of identity, _B F, in B, we mean a family of bounded linear operators, _B F(Δ), on B, defined for each Borel subset Δ of the real number set R, satisfying the following conditions:
i. _B F(R) = I_B = the identity operator on B.
ii. _B F(Δ1) _B F(Δ2) = _B F(Δ1 ∩ Δ2), for all Δ1, Δ2 ∈ β(R) = the set of all Borel subsets of R.
iii. _B F(∪ Δi) = Σ _B F(Δi), where {Δi} is a finite set of disjoint Borel subsets of R.
iv. ||_B F(Δ)x|| ≤ ||x||, for all Δ ∈ β(R) and x ∈ B (equivalent statement: the norm of _B F(Δ) is either 0 or 1).
† See Ref. 1 for comparative work in Hilbert space.
The left subscript in the notation _B F indicates that the resolution of identity is defined over the space B; it will be dropped when no ambiguity would result.
With the resolution of identity defined as above, some of its properties are worth mentioning.
Thm.II.1. Let (B, _B F) be a Banach resolution space. We have
i. _B F(Δ)x = 0 => x ∈ _B F(R - Δ)[B] = the range of _B F(R - Δ), for all Δ ∈ β(R).
ii. _B F(Φ) = 0, where Φ is the empty set.
iii. _B F(Δ)[B] is complete (closed), for all Δ ∈ β(R).
Proof: i. x = I_B x = _B F(R)x = _B F(Δ ∪ (R - Δ))x = _B F(Δ)x + _B F(R - Δ)x, for all x ∈ B. Hence _B F(Δ)x = 0 => x = _B F(R - Δ)x ∈ _B F(R - Δ)[B].
ii. I_B = _B F(R) = _B F(R ∪ Φ) = _B F(R) + _B F(Φ) = I_B + _B F(Φ), so _B F(Φ) = 0.
iii. Let {x_i} be a Cauchy sequence in _B F(Δ)[B]; then there exists x ∈ B such that x_i -> x. Since
_B F(Δ)x = _B F(Δ)(lim x_i) = lim _B F(Δ)x_i = lim x_i = x,
x ∈ _B F(Δ)[B]. (Here we employ the fact that _B F(Δ)² = _B F(Δ), for all Δ ∈ β(R); thus x_i ∈ _B F(Δ)[B] implies x_i = _B F(Δ)x_i.)
Working with a Banach resolution space (B, _B F), it is natural and important to ask whether we can define a resolution of identity in B*, the dual space of B. The following theorem gives us the answer.
Thm.II.2. Let (B, _B F) be a Banach resolution space. Then {_B F(Δ)* | Δ ∈ β(R)} is a resolution of identity in B*. This resolution of identity in B* is called the induced resolution of identity.
Proof: i. (x, _B F(Δ)*y) = (_B F(Δ)x, y), for all x ∈ B, y ∈ B* and all Δ ∈ β(R), so
(x, _B F(R)*y) = (_B F(R)x, y) = (x, y), for all x ∈ B and y ∈ B*. This implies that _B F(R)*y = y, for all y ∈ B*, i.e. _B F(R)* = I_{B*}.
ii. _B F(Δ1)* _B F(Δ2)* = [_B F(Δ2) _B F(Δ1)]* = _B F(Δ1 ∩ Δ2)*.
iii. We have _B F(∪ Δi) = Σ _B F(Δi), so _B F(∪ Δi)* = [Σ _B F(Δi)]* = Σ _B F(Δi)*.
iv. ||_B F(Δ)* x|| ≤ ||_B F(Δ)*|| ||x|| = ||_B F(Δ)|| ||x|| ≤ ||x||.
The definition of a resolution of identity can be better understood by examples. Two typical examples are illustrated as follows.
Example 1. L_p: the Banach space of equivalence classes of functions that map R to R and satisfy
∫_{-∞}^{∞} |f(t)|^p dt < ∞, where 1 ≤ p < ∞.
Define
F(Δ)f(t) = χ(Δ)f(t) = { 0, for t ∉ Δ; f(t), for t ∈ Δ }, for f ∈ L_p.
It is easy to show that {F(Δ) | Δ ∈ β(R)} is a resolution of identity in L_p once the properties of χ(Δ) are explored.
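In a discretized setting, F(Δ) becomes multiplication by the characteristic function of an index set, and the four conditions of Def.II.1 can be checked directly. The sampled grid, the function f, and the sets below are arbitrary illustrative choices.

```python
# Discretized version of Example 1: the "function" f lives on a finite
# grid and F(delta) multiplies by the characteristic function of delta.
points = list(range(-3, 4))

def F(delta):
    def op(f):
        return {t: (f[t] if t in delta else 0.0) for t in points}
    return op

f = {t: float(t * t + 1) for t in points}
d1, d2 = {-2, -1, 0, 1}, {0, 1, 2}

# i. F(R) is the identity operator
assert F(set(points))(f) == f
# ii. F(d1)F(d2) = F(d1 intersect d2)
assert F(d1)(F(d2)(f)) == F(d1 & d2)(f)
# iii. additivity over disjoint Borel sets
g1, g2, u = F({-2, -1})(f), F({1, 2})(f), F({-2, -1, 1, 2})(f)
assert all(u[t] == g1[t] + g2[t] for t in points)
# iv. the norm does not increase (p = 2 here)
norm = lambda h: sum(v * v for v in h.values()) ** 0.5
assert norm(F(d1)(f)) <= norm(f)
print("all four axioms of Def.II.1 hold for the truncation operators")
```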
Example 2. Let p, q ∈ R such that 1/p + 1/q = 1. Then L_p is a reflexive Banach space whose dual space is L_q. For all f ∈ L_p, g ∈ L_q, define
(f, g) = ∫_{-∞}^{∞} f(t)g(t) dt.
Let {F(Δ) | Δ ∈ β(R)} be the resolution of identity in L_p as defined in Example 1. Then we have
(F(Δ)f, g) = ∫_{-∞}^{∞} [χ(Δ)f(t)]g(t) dt = ∫_Δ f(t)g(t) dt = ∫_{-∞}^{∞} f(t)[χ(Δ)g(t)] dt = (f, χ(Δ)g).
So {χ(Δ) | Δ ∈ β(R)} is also the induced resolution of identity in L_q.
Causality of Operators
(1) Causal Operators
Def.II.2. Let (X, _X F), (Y, _Y F) be Banach resolution spaces. T: X -> Y is linear. T is said to be causal if
_X F^t x1 = _X F^t x2 => _Y F^t T x1 = _Y F^t T x2, for all t ∈ R, where
_B F^t = _B F((-∞, t)), B = X, Y.
The definition above holds for every linear operator. However, when we deal with bounded linear operators, the following theorem comes in handy.
Thm.II.3. T: X -> Y is bounded and linear. Then the following statements are equivalent:
i. T is causal.
ii. _Y F^t T = _Y F^t T _X F^t, for all t ∈ R.
iii. ||_Y F^t T x|| ≤ ||T _X F^t x||, for all x ∈ X and all t ∈ R.
iv. _X F^t x = 0 => _Y F^t T x = 0.
v. T _X F_t[X] ⊆ _Y F_t[Y], for all t ∈ R.
vi. T _X F_t = _Y F_t T _X F_t, for all t ∈ R.
vii. _Y F^t T _X F_t = 0, for all t ∈ R,
where _B F_t = I_B - _B F^t, B = X, Y.
Proof: a. (i => ii) For all t ∈ R, _X F^t x = (_X F^t)²x = _X F^t(_X F^t x), for all x ∈ X. So, by causality, _Y F^t T x = _Y F^t T(_X F^t x), for all x ∈ X, i.e. _Y F^t T = _Y F^t T _X F^t.
b. (ii => iii) ||_Y F^t T x|| = ||_Y F^t T _X F^t x|| ≤ ||_Y F^t|| ||T _X F^t x|| ≤ ||T _X F^t x||, for all x ∈ X and t ∈ R.
c. (iii => iv) _X F^t x = 0 => T _X F^t x = 0 => ||T _X F^t x|| = 0 => ||_Y F^t T x|| = 0 => _Y F^t T x = 0, for all t ∈ R.
d. (iv => v) For all y ∈ T _X F_t[X], there exists x ∈ X such that y = T _X F_t x. But _X F^t(_X F_t x) = 0, so by iv, _Y F^t T(_X F_t x) = 0, hence y = T _X F_t x = _Y F_t T _X F_t x ∈ _Y F_t[Y].
e. (v => vi) For all x ∈ X there exists y ∈ Y such that T _X F_t x = _Y F_t y. So _Y F_t T _X F_t x = _Y F_t(_Y F_t y) = _Y F_t y = T _X F_t x, for all x ∈ X. Hence _Y F_t T _X F_t = T _X F_t.
f. (vi => vii) _Y F^t T _X F_t = _Y F^t(_Y F_t T _X F_t) = 0, since _Y F^t _Y F_t = 0.
g. (vii => i) By vii, _Y F^t T _X F_t = _Y F^t T(I_X - _X F^t) = 0, i.e. _Y F^t T = _Y F^t T _X F^t. Hence _X F^t(x1 - x2) = 0 implies _Y F^t T(x1 - x2) = _Y F^t T _X F^t(x1 - x2) = 0, and T is causal.
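In the discrete-time, finite-dimensional analogue, _Y F^t is truncation to the first t coordinates (the "past") and causal operators are exactly the lower-triangular matrices. The following sketch checks conditions ii and vii of Thm.II.3 for such an operator; the matrix T is an arbitrary illustrative choice.

```python
# Finite-dimensional analogue of Thm.II.3: F^t truncates to the first t
# coordinates, F_t = I - F^t keeps the "future", and a causal operator
# is a lower-triangular matrix T.
n = 4

def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def trunc(t):      # F^t
    return [[1.0 if i == j and i < t else 0.0 for j in range(n)]
            for i in range(n)]

def cotrunc(t):    # F_t = I - F^t
    return [[1.0 if i == j and i >= t else 0.0 for j in range(n)]
            for i in range(n)]

T = [[float(i - j + 1) if i >= j else 0.0 for j in range(n)]
     for i in range(n)]
zero = [[0.0] * n for _ in range(n)]

for t in range(n + 1):
    Ft, F_t = trunc(t), cotrunc(t)
    # condition ii: F^t T = F^t T F^t
    assert matmul(Ft, T) == matmul(matmul(Ft, T), Ft)
    # condition vii: F^t T F_t = 0 (past output ignores future input)
    assert matmul(matmul(Ft, T), F_t) == zero
print("lower-triangular T satisfies causality conditions ii and vii")
```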
(2) Anti-causal Operators
The definition of anti-causality is quite similar to that of
causality and the corresponding theorem can be proved by a similar
argument. Therefore, the definition as well as the corresponding
theorem are stated as follows without further explanation.
Def.II.3. (X, _X F), (Y, _Y F): Banach resolution spaces. T: X -> Y is linear. T is said to be anti-causal if
_X F_t x1 = _X F_t x2 => _Y F_t T x1 = _Y F_t T x2.
Thm.II.4. T: X -> Y is linear and bounded; then the following statements are equivalent:
i. T is anti-causal.
ii. _Y F_t T = _Y F_t T _X F_t, for all t ∈ R.
iii. ||_Y F_t T x|| ≤ ||T _X F_t x||, for all x ∈ X and all t ∈ R.
iv. _X F_t x = 0 => _Y F_t T x = 0.
v. T _X F^t[X] ⊆ _Y F^t[Y], for all t ∈ R.
vi. T _X F^t = _Y F^t T _X F^t, for all t ∈ R.
vii. _Y F_t T _X F^t = 0, for all t ∈ R.
(3) Memoryless Operators
With causality and anti-causality defined as above, a special
class of operators can be defined as shown in the next definition.
Def.II.4. (X, _X F), (Y, _Y F): Banach resolution spaces. T: X -> Y is linear. T is said to be memoryless if T is both causal and anti-causal.
As in the cases of causal and anti-causal operators, we have a corresponding theorem for memoryless operators.
Thm.II.5. T: X -> Y is linear and bounded. T is memoryless if and only if T _X F(Δ) = _Y F(Δ) T, for all Δ ∈ β(R).
Proof: a. (<=) With T _X F^t = _Y F^t T, we have _Y F_t T _X F^t = _Y F_t _Y F^t T = 0, so T is anti-causal (Thm.II.4, vii). Similarly, T _X F_t = _Y F_t T implies _Y F^t T _X F_t = _Y F^t _Y F_t T = 0, i.e. T is causal (Thm.II.3, vii).
b. (=>) By causality, T _X F_t = _Y F_t T _X F_t = _Y F_t T(I_X - _X F^t) = _Y F_t T - _Y F_t T _X F^t. By anti-causality, _Y F_t T _X F^t = 0. So T _X F_t = _Y F_t T, for all t ∈ R. Similarly, T _X F^t = _Y F^t T, for all t ∈ R. These imply that T _X F(Δ) = _Y F(Δ) T, for all Δ ∈ β(R).
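In the same finite-dimensional analogue, a memoryless operator is a diagonal (pointwise-gain) matrix, and Thm.II.5 says it must commute with every F(Δ). A small illustrative check over all index sets of a 4-dimensional space:

```python
import itertools

# Finite-dimensional analogue of Thm.II.5: the diagonal matrix M is
# memoryless and commutes with the truncation F(delta) for every delta.
n = 4

def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def F(delta):
    # resolution of identity: keep the coordinates indexed by delta
    return [[1.0 if i == j and i in delta else 0.0 for j in range(n)]
            for i in range(n)]

# an arbitrary diagonal (memoryless) operator
M = [[2.0 * i + 1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]

for bits in itertools.product([0, 1], repeat=n):
    delta = {i for i in range(n) if bits[i]}
    assert matmul(M, F(delta)) == matmul(F(delta), M)
print("the diagonal operator commutes with every F(delta)")
```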
(4) Causality of Adjoint and Inverse Operator
The following theorem tells us the relation between the causality of an operator and the causality of its adjoint.
Thm.II.6. (X, _X F), (Y, _Y F): Banach resolution spaces. T: X -> Y is linear and bounded, and we take the induced resolution of identity on each of X* and Y*. Then
i. T is causal => T* is anti-causal,
ii. T is anti-causal => T* is causal.
Proof: The proof follows trivially from the fact that
(_Y F^t T _X F_t)* = (_X F_t)* T* (_Y F^t)*.
In the above theorem, the causality of T* depends on the causality of T. However, when we deal with a special class of Banach spaces, the so-called reflexive Banach spaces, we actually have an "if and only if" condition. The corollary below makes this clear.
Cor. (X, _X F), (Y, _Y F) and T are as in Thm.II.6, and X, Y are reflexive, i.e. X** = X, Y** = Y. Then
i. T is causal <=> T* is anti-causal,
ii. T is anti-causal <=> T* is causal.
The proof of this corollary is nothing more than identifying notation, so it is omitted.
As the above theorem deals with the causality of adjoint operators, the next theorem deals with the causality of the inverse of an operator.
Thm.II.7. T: X -> Y is linear, bounded and invertible. Then
i. T⁻¹ is causal iff _Y F^t T x = 0 implies _X F^t x = 0.
ii. T⁻¹ is anti-causal iff _Y F_t T x = 0 implies _X F_t x = 0.
Proof: The proofs of i and ii are similar, so we only describe one of them, namely the proof of i.
(=>) T⁻¹ is causal, i.e. _Y F^t y = 0 => _X F^t T⁻¹ y = 0. If _Y F^t T x = 0, then _X F^t T⁻¹(T x) = 0, i.e. _X F^t x = 0.
(<=) For all y ∈ Y, there exists x ∈ X such that y = T x, i.e. x = T⁻¹ y. With
_Y F^t y = 0 => _Y F^t T x = 0 => _X F^t x = 0 => _X F^t T⁻¹ y = 0,
T⁻¹ is causal.
Although the above condition for causal invertibility is
necessary and sufficient, it is not readily applicable. However, for
those applications in which we are interested, the invertibility is
either guaranteed or it is of no concern. Hence the above theorem is
good enough for our purpose.
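Thm.II.7 can be illustrated in finite dimensions: a lower-triangular matrix with nonzero diagonal is causally invertible, and forward substitution, which uses only "past" values of the right-hand side, shows that the inverse is again lower triangular. The matrix T below is an arbitrary example.

```python
# Finite-dimensional sketch of Thm.II.7: the inverse of an invertible
# lower-triangular (causal) matrix is again lower triangular (causal).
n = 4
T = [[1.0 if i == j else (0.5 if i > j else 0.0) for j in range(n)]
     for i in range(n)]

def solve_lower(L, b):
    # forward substitution: x[i] depends only on b[0..i]
    x = [0.0] * n
    for i in range(n):
        x[i] = (b[i] - sum(L[i][j] * x[j] for j in range(i))) / L[i][i]
    return x

# build T^{-1} column by column from T^{-1} e_k
cols = [solve_lower(T, [1.0 if i == k else 0.0 for i in range(n)])
        for k in range(n)]
Tinv = [[cols[k][i] for k in range(n)] for i in range(n)]

# the inverse is again lower triangular ...
assert all(Tinv[i][j] == 0.0 for i in range(n) for j in range(n) if j > i)
# ... and it really is the inverse: T Tinv = I
for i in range(n):
    for j in range(n):
        s = sum(T[i][k] * Tinv[k][j] for k in range(n))
        assert abs(s - (1.0 if i == j else 0.0)) < 1e-12
print("the inverse of the causal operator T is causal")
```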
(5) Miniphase and Maxiphase
The definitions of left- and right-miniphase and maxiphase are very important for the development of the factorization of operators. Hence, they are emphasized separately in this section.
Def.II.5. T: X -> Y is linear. T is said to be
i. left-miniphase if _Y F^t T x = 0 <=> _X F^t x = 0,
ii. left-maxiphase if _Y F_t T x = 0 <=> _X F_t x = 0,
iii. right-miniphase if cl(T[_X F_t[X]]) = _Y F_t[Y],
iv. right-maxiphase if cl(T[_X F^t[X]]) = _Y F^t[Y],
where cl denotes closure.
It is easy to see that being miniphase, left- or right-, implies being causal, while being maxiphase implies being anti-causal. Furthermore, when invertibility is guaranteed, left-miniphase implies that the inverse is causal, while left-maxiphase implies that the inverse is anti-causal. An important property of miniphase and maxiphase operators is given in the following theorem for the case in which the spaces involved are reflexive.
Thm.II.8. i. (X, _X F), (Y, _Y F) are Banach resolution spaces; (X*, _X F*), (Y*, _Y F*) are the dual spaces of X and Y, respectively, with their induced resolutions of identity.
ii. X, Y are reflexive. T: X -> Y is linear and bounded.
Then
1. T is left-miniphase iff T* is right-maxiphase,
2. T is left-maxiphase iff T* is right-miniphase.
We need a lemma to prove Thm.II.8; the lemma is stated and proved as follows.
Lemma. i. B is a Banach space.
ii. C, D are subspaces of B, with C ⊆ D and D closed.
Then C is dense in D iff D^⊥ = C^⊥, where
C^⊥ = {x* ∈ B* | (x, x*) = 0, for all x ∈ C} and
D^⊥ = {y* ∈ B* | (y, y*) = 0, for all y ∈ D}.
Proof: (=>) a. C ⊆ D => D^⊥ ⊆ C^⊥.
b. For all x* ∈ C^⊥, (x, x*) = 0 for all x ∈ C. For all y ∈ D, there exists {x_i} in C such that x_i -> y. So (y, x*) = (lim x_i, x*) = lim (x_i, x*) = 0, i.e. x* ∈ D^⊥ and C^⊥ ⊆ D^⊥.
(<=) a. Assume C is not dense in D, i.e. there is a y ∈ D such that inf_{x ∈ C} ||x - y|| ≠ 0.
b. inf_{x ∈ C} ||x - y|| = max_{x* ∈ C^⊥, ||x*|| ≤ 1} (y, x*) (Ref. 11). But with D^⊥ = C^⊥ and y ∈ D, max_{y* ∈ D^⊥, ||y*|| ≤ 1} (y, y*) = 0. Contradiction.
Proof of Thm.II.8: The proofs of parts 1 and 2 are similar. Here we describe one of them, namely the proof of part 1.
(=>) We have _Y F^t T x = 0 <=> _X F^t x = 0.
i. T is causal, so T* is anti-causal. By Thm.II.4,
T*[(_Y F^t)*[Y*]] ⊆ (_X F^t)*[X*].
ii. For all x ∈ (T*[(_Y F^t)*[Y*]])^⊥ (recall X** = X),
0 = (x, T*(_Y F^t)* y*) = (_Y F^t T x, y*), for all y* ∈ Y*.
This implies _Y F^t T x = 0, so _X F^t x = 0 (recall Y** = Y). Hence
0 = (_X F^t x, x*) = (x, (_X F^t)* x*), for all x* ∈ X*,
so x ∈ ((_X F^t)*[X*])^⊥.
By i and ii, (T*[(_Y F^t)*[Y*]])^⊥ = ((_X F^t)*[X*])^⊥, so by the Lemma,
cl(T*[(_Y F^t)*[Y*]]) = (_X F^t)*[X*],
i.e. T* is right-maxiphase.
(<=) We have cl(T*[(_Y F^t)*[Y*]]) = (_X F^t)*[X*].
i. T* is anti-causal, so T is causal. So _X F^t x = 0 => _Y F^t T x = 0.
ii. If _Y F^t T x = 0, then
0 = (_Y F^t T x, y*) = (x, T*(_Y F^t)* y*), for all y* ∈ Y*.
But for every (_X F^t)* x*, x* ∈ X*, there is a sequence {y_i*} in Y* such that T*(_Y F^t)* y_i* -> (_X F^t)* x*. So
(x, (_X F^t)* x*) = (x, lim T*(_Y F^t)* y_i*) = lim (x, T*(_Y F^t)* y_i*) = 0.
So (_X F^t x, x*) = 0, for all x* ∈ X*, which implies _X F^t x = 0.
From the theorem, we notice that the definitions of left- and right-miniphase are equivalent when the operator is invertible. This statement is also true for left- and right-maxiphase.
CHAPTER III
FACTORIZATION OF OPERATORS
Operator factorization is the foundation of this dissertation; but not every operator on an arbitrary Banach space can be factorized in the desired form. Throughout this dissertation, the desired form of factorization is either K K* or T* T. In either form, we notice that the operator to be factorized is "self-adjoint". This requirement presents no problem in Hilbert space; however, "self-adjoint" does not always make sense for operators on Banach spaces. Furthermore, the operator has to be "positive", and again "positive" does not always make sense in Banach spaces. Fortunately these concepts can be defined in a rather useful class of Banach spaces — the reflexive Banach spaces — in which they are defined as follows:
Def.III.1. i. B is a reflexive Banach space.
ii. T: B -> B* is linear and bounded.
T is said to be positive if (x, T x) ≥ 0, for all x ∈ B. T is said to be self-adjoint if T* = T.
Note that T*: B** = B -> B*, so it makes sense to compare T and T*.
Next, we write down a theorem which is the foundation for the factorizations to be discussed. This theorem is stated without proof; interested readers are referred to Masani's work (9).
Thm.III.1. Let Q: B -> B* be a linear, bounded, positive, and self-adjoint operator, where B is a reflexive Banach space and B* is its dual. Then there exists a Hilbert space H and a linear bounded operator K from H to B* such that Q = K K*.
With this theorem, we can go on to the main subject.
Left- and Right-miniphase
Thm.III.2 (left factorization).
i. (B, F) is a reflexive Banach resolution space.
ii. Q: (B, F) -> (B*, F*), where F* is the induced resolution of identity in B*; Q is linear, bounded, positive and self-adjoint.
Then there exists a Hilbert resolution space (Ĥ, Ê)† and a linear bounded operator K̂: (Ĥ, Ê) -> (B*, F*), such that
1. Q = K̂ K̂*,
2. K̂ is left-miniphase,
3. the factorization is unique up to a memoryless unitary transformation.
† The requirements for a Hilbert resolution space are more restrictive than those for a Banach resolution space, namely E(Δ)* = E(Δ) is required for a Hilbert resolution space, a condition which does not make sense in a Banach resolution space.
Proof: A. i. By Thm.III.1, there exists a Hilbert space H and a linear bounded operator K from H to B* such that Q = K K*.
ii. Define Ĥ = cl R(K*) (cl denotes closure) and K̂ = K restricted to Ĥ, K̂: Ĥ -> B*; then
K̂*: B** = B -> Ĥ.
iii. (K̂* b, x)_Ĥ = (b, K̂ x)_B = (b, K x)_B = (K* b, x)_H, for all b ∈ B, x ∈ Ĥ. Since K* b, K̂* b ∈ Ĥ, it follows that K̂* b = K* b, for all b ∈ B.
iv. K̂ K̂* b = K̂ K* b = K K* b = Q b, for all b ∈ B. So K̂ K̂* = K K* = Q.
v. Since Ĥ = cl K*[B] = cl R(K̂*), K̂* has dense range. So K̂ is 1-1.
B. i. Define Ĥ^t = cl R(K̂* F^t) and let Ê^t be the orthogonal projection on Ĥ^t. Then (Ê^t)² = (Ê^t)* = Ê^t.
ii. Since Ĥ^t becomes cl R(K̂*), which is Ĥ, as t -> ∞, we have lim_{t->∞} Ê^t = I_Ĥ (Ê^t -> I_Ĥ weakly).
iii. When s < t, Ĥ^s = cl R(K̂* F^s), Ĥ^t = cl R(K̂* F^t), and R(F^s) ⊆ R(F^t): since F^t b = F^s b + F([s,t)) b, for all b ∈ B, and for b' ∈ R(F^s), b' = F^s b', we get
F^t b' = F^s(F^s b') + F([s,t))(F^s b') = F^s b' = b',
i.e. b' ∈ R(F^t). So R(K̂* F^s) ⊆ R(K̂* F^t) and Ĥ^s ⊆ Ĥ^t. So Ê^s Ê^t = Ê^t Ê^s = Ê^s.
iv. With Ê^t defined for all (-∞, t), t ∈ R, Ê can be extended uniquely to β(R) to be a spectral measure, i.e. a resolution of identity.
v. Since Ê^t[Ĥ] = Ĥ^t = cl R(K̂* F^t) by definition, K̂* is right-maxiphase (Def.II.5). So K̂ is left-miniphase (Thm.II.8).
C. i. Let K̃: (H̃, Ẽ) -> (B*, F*) be another left-miniphase factorization of Q.
ii. Define W on R(K̃*) by W(K̃* b) = K̂* b. W is well-defined, for if K̃* b = K̃* a, then
K̂ K̂* a = Q a = K̃ K̃* a = K̃ K̃* b = Q b = K̂ K̂* b;
but K̂ is 1-1, so K̂* a = K̂* b.
iii. For any b ∈ B,
||W K̃* b||²_Ĥ = ||K̂* b||²_Ĥ = (K̂* b, K̂* b)_Ĥ = (b, K̂ K̂* b)_B = (b, K̃ K̃* b)_B = (K̃* b, K̃* b)_H̃ = ||K̃* b||²_H̃,
so W is isometric on R(K̃*). Since R(K̃*) is dense in H̃, W can be isometrically extended to H̃.
iv. For all z ∈ R(K̃*), z = K̃* x for some x ∈ B; hence
K̂ W z = K̂ W K̃* x = K̂ K̂* x = Q x = K̃ K̃* x = K̃ z.
So K̂ W = K̃ over R(K̃*), and K̂ W = K̃ over H̃ via continuity, as W extends to H̃.
v. By the same argument, there exists V: Ĥ -> H̃, V isometric, such that K̃ V = K̂. So K̂ W V = K̂, and hence W V = I_Ĥ, since K̂ is 1-1. Similarly, V W = I_H̃.
vi. With K̂ W = K̃, we have
(F^t)* K̂ Ê^t W = (F^t)* K̂ W = (F^t)* K̃ = (F^t)* K̃ Ẽ^t = (F^t)* K̂ W Ẽ^t = (F^t)* K̂ Ê^t W Ẽ^t
(K̂, K̃ are causal). So (F^t)* K̂ (Ê^t W - Ê^t W Ẽ^t) z = 0 for all z ∈ H̃; hence Ê^t(Ê^t W - Ê^t W Ẽ^t) z = 0 (K̂ is left-miniphase), i.e. Ê^t W = Ê^t W Ẽ^t, so W is causal.
vii. Similarly, V is causal. But W = V*, which is anti-causal (Thm.II.6), so W is memoryless.
Should one notice that in the theorem above we actually define a resolution of identity to make the factorization left-miniphase, he will not be surprised that other resolutions of identity can be defined to make the factorization left-maxiphase. The following corollary makes just that point; most of its proof is omitted to avoid repetition.
Cor. i. Q: (B, F) -> (B*, F*).
ii. Q is linear, bounded, positive and self-adjoint.
Then there exists a Hilbert resolution space (Ĥ, Ê) and a linear bounded operator K̂: (Ĥ, Ê) -> (B*, F*), such that
1. Q = K̂ K̂*,
2. K̂ is left-maxiphase,
3. the factorization is unique up to a memoryless unitary transformation.
Proof: Define Ĥ_t = cl R(K̂* F_t), where K̂ is as defined in Thm.III.2. Then the proof follows correspondingly.
Since we are dealing with reflexive Banach spaces, there is no necessity of labeling "K̂" as the factorization. Simply treating K̂* as an independent operator, say T, we can call T the factorization. In order to tell the difference between K̂ and T, we simply call T (which is K̂*) the right factorization, and we have the following theorem.
Thm.III.3 (right factorization).
i. Q: (B, F) -> (B*, F*).
ii. Q is linear, bounded, positive and self-adjoint.
Then there exists a Hilbert resolution space (Ĥ, Ê) and a linear bounded operator T: (B, F) -> (Ĥ, Ê), such that
1. Q = T* T,
2. T is right-miniphase,
3. the factorization is unique up to a memoryless unitary transformation.
Proof: Define T = K̂*, then T* = K̂, and define Ĥ_t = cl R(T F_t). Then the rest of the proof follows as in Thm.III.2.
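In finite dimensions, with the truncation resolution of identity, the factorization theorems reduce to the Cholesky factorization: a symmetric positive definite matrix Q factors as Q = K K^T with K lower triangular (the "left-miniphase" factor), uniquely up to diagonal ±1 factors, which are the memoryless unitaries. A sketch under these finite-dimensional assumptions, not the dissertation's construction:

```python
def cholesky(Q):
    # lower-triangular K with Q = K K^T (Q symmetric positive definite)
    n = len(Q)
    K = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = sum(K[i][k] * K[j][k] for k in range(j))
            if i == j:
                K[i][j] = (Q[i][i] - s) ** 0.5
            else:
                K[i][j] = (Q[i][j] - s) / K[j][j]
    return K

# an arbitrary symmetric positive definite "variance operator"
Q = [[4.0, 2.0, 0.0],
     [2.0, 5.0, 3.0],
     [0.0, 3.0, 6.0]]
K = cholesky(Q)

# K is causal (lower triangular) ...
assert all(K[i][j] == 0.0 for i in range(3) for j in range(3) if j > i)
# ... and K K^T reproduces Q
for i in range(3):
    for j in range(3):
        assert abs(sum(K[i][k] * K[j][k] for k in range(3)) - Q[i][j]) < 1e-9
print("Q = K K^T with K lower triangular")
```

Choosing positive diagonal entries pins the factor down, just as the left-miniphase condition pins down K̂ up to a memoryless unitary.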
Reproducing Kernel Resolution Space (RKRS)
There is a common statement in each theorem of the previous section, namely "the factorization is unique up to a memoryless unitary transformation". For certain applications, such as the study of Banach space valued random variables, we would like to get rid of the ambiguity just mentioned. The concept of a reproducing kernel resolution space, which evolves from the theorems of the previous section, is just what we want. First, let's define some notation.
Def.III.2. Q, K̂, and T are defined as in the previous section, with Q = K̂ K̂* = T* T. Let H_Q = R(K̂) = R(T*). For x, y in R(K̂), define
(x, y)_{H_Q} = (K̂⁻¹ x, K̂⁻¹ y)_Ĥ.
For z, w in R(T*), define
(z, w)_{H_Q} = (T*⁻¹ z, T*⁻¹ w)_Ĥ.
Define _Q E^t = K̂ Ê^t K̂⁻¹ on H_Q and _Q E_t = T* Ê_t T*⁻¹ on H_Q, then extend them to β(R). Note that Ê corresponds to the Ê in Thm.III.2.
Thm.III.4. (H_Q, _Q E) (the so-called RKRS) defined above is a Hilbert resolution space which is independent of the factorization Q = K̂ K̂* used in its definition. Moreover,
i. (H_Q, _Q E) is unitarily equivalent to (Ĥ, Ê),
ii. R(Q) is dense in H_Q,
iii. (F^t)* x = 0 iff _Q E^t x = 0, for x ∈ H_Q,
iv. K̂: (Ĥ, Ê) -> (H_Q, _Q E) is memoryless,
where (Ĥ, Ê) corresponds to the (Ĥ, Ê) in Thm.III.2.
Proof: i. H_Q = R(K), so H_Q is a linear vector space.
ii. By definition, (x, y)_{H_Q} = (K^{-1} x, K^{-1} y)_H. It is trivial to show that
1. (x, y) = (y, x), for all x, y ∈ H_Q,
2. (x, a y) = a (x, y), for all a ∈ R,
3. (x, y + z) = (x, y) + (x, z), for all x, y, z ∈ H_Q,
4. (x, x)_{H_Q} = (K^{-1} x, K^{-1} x)_H ≥ 0, for all x ∈ H_Q; and since K^{-1} is linear and 1-1, x ≠ 0 implies K^{-1} x ≠ 0, so (x, x) > 0 for x ∈ H_Q and x ≠ 0.
So (x, y)_{H_Q} is an inner product over H_Q.
iii. Let {x_i} be a Cauchy sequence in H_Q; then {K^{-1} x_i} is Cauchy in H, so K^{-1} x_i → z for some z ∈ H. But z = K^{-1} K z, so x_i → K z in H_Q. Hence H_Q is a Hilbert space.
iv. Assume Q = K K* = K' K'*, where K and K' are left-miniphase factorizations of Q on the factor spaces (H, E) and (H', E'), respectively.
1. By Thm.III.2, there exists a memoryless unitary transformation U: (H', E') → (H, E) such that K U = K'. So R(K') = R(K U) = R(K), since U is onto. Hence H_Q is independent of the factorization.
2. (K'^{-1} z, K'^{-1} w)_{H'} = ((K U)^{-1} z, (K U)^{-1} w)_{H'} = (U^{-1} K^{-1} z, U^{-1} K^{-1} w)_{H'} = (K^{-1} z, K^{-1} w)_H, for all z, w ∈ H_Q, since U is unitary. So the inner product is independent of the factorization.
3. K' E'^t K'^{-1} = (K U) E'^t (K U)^{-1} = K U E'^t U^{-1} K^{-1} = K E^t U U^{-1} K^{-1} = K E^t K^{-1}, since U is memoryless. Therefore _Q E is also independent of the factorization.
v. Define K̃: H → H_Q by K̃ x = K x, for all x ∈ H. Then K̃ is 1-1 and onto, and for all x ∈ H,
‖K̃ x‖²_{H_Q} = ‖K x‖²_{H_Q} = ‖K^{-1} K x‖²_H = ‖x‖²_H.
So K̃ is a unitary mapping. Furthermore, we have _Q E^t = K E^t K^{-1} on H_Q, and for all z ∈ H_Q there is a unique x ∈ H such that z = K x = K̃ x. So x = K^{-1} z = K̃^{-1} z, i.e. K^{-1} = K̃^{-1}. Hence _Q E^t = K̃ E^t K̃^{-1}. This means that (H_Q, _Q E) is unitarily equivalent to (H, E).
vi. R(Q) = R(K K*) = K[R(K*)] = K̃[R(K*)]. Since R(K*) is dense in H, K̃[R(K*)] is dense in H_Q (for K̃ is unitary).
vii. 1. K is left-miniphase, so E^t x = 0 iff (F^t)* K x = 0, for x ∈ H.
2. For all z ∈ H_Q there exists x ∈ H such that z = K x, or x = K^{-1} z. So E^t K^{-1} z = 0 iff (F^t)* z = 0, for z ∈ H_Q.
3. If (F^t)* z = 0, then _Q E^t z = K E^t K^{-1} z = 0. And if _Q E^t z = 0, i.e. K E^t K^{-1} z = 0, then E^t K^{-1} z = 0, since K is 1-1. So (F^t)* z = 0.
viii. K: (H, E) → (H_Q, _Q E) is actually the K̃ defined in v. K is left-miniphase, so E^t x = 0 iff (F^t)* K x = 0. Hence E^t x = 0 iff _Q E^t K̃ x = 0, for x ∈ H. But since K̃ x = K x, this implies K̃ is also left-miniphase. So K̃^{-1} = K̃* is causal (by K̃ being left-miniphase), which makes K̃ anti-causal; K̃, being left-miniphase, is also causal. Therefore K̃ is memoryless.
Immediately following from the previous theorem is a rather interesting result, which gives us a sort of "unique" left-factorization. This result is indicated in the next corollary.
Cor. Let (H_Q, _Q E) be as defined in Thm.III.4. Then there exists a left-factorization P: (H_Q, _Q E) → (B*, F*), such that
i. P z = z, for all z ∈ H_Q ⊆ B*,
ii. P* b = Q b, for all b ∈ B.
Proof: By Thm.III.4, we have K̃: H → H_Q, a memoryless unitary operator, with K̃ x = K x for all x ∈ H. Define P = K K̃^{-1}: H_Q → B*. Then
1. For all z ∈ H_Q there exists x ∈ H such that K̃ x = z, so P z = K K̃^{-1} z = K x = K̃ x = z.
2. By Thm.III.4, _Q E^t z = 0 iff (F^t)* z = 0, for z ∈ H_Q. So _Q E^t z = 0 iff (F^t)* P z = 0, for z ∈ H_Q. Hence P is left-miniphase.
3. P* b = (K K̃^{-1})* b = (K̃^{-1})* K* b = K̃ K* b = K K* b = Q b, for all b ∈ B. And P P* b = P (Q b) = Q b.
The "uniqueness" we mentioned is due to the fact that (H_Q, _Q E) is independent of the factorization.
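The factorization-independence behind this "uniqueness" can be seen concretely in a finite-dimensional sketch: in R^n the RKRS inner product (x, y)_{H_Q} = (K^{-1} x, K^{-1} y) reduces to x·Q^{-1}y, so it cannot depend on which factor K of Q is used. The helper names, the matrix, and the rotation angle below are illustrative assumptions, not anything from the text.

```python
import math

def chol(Q):
    """Lower-triangular L with Q = L L^T."""
    n = len(Q)
    L = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = sum(L[i][k] * L[j][k] for k in range(j))
            L[i][j] = math.sqrt(Q[i][i] - s) if i == j else (Q[i][j] - s) / L[j][j]
    return L

def fwd(L, b):  # solve L w = b (L lower triangular)
    n = len(b)
    w = [0.0] * n
    for i in range(n):
        w[i] = (b[i] - sum(L[i][k] * w[k] for k in range(i))) / L[i][i]
    return w

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

Q = [[4.0, 2.0], [2.0, 3.0]]
L = chol(Q)
x, y = [1.0, 2.0], [3.0, -1.0]

# <x, y>_{H_Q} via the factor K = L: (K^-1 x, K^-1 y)
ip_L = dot(fwd(L, x), fwd(L, y))

# same inner product via a different factor K' = L U, U a rotation
# (K' K'^T = L U U^T L^T = Q); K'^-1 = U^T L^-1, and U^T preserves dot products
c, s = math.cos(0.7), math.sin(0.7)
def rotT(v):  # apply U^T
    return [c * v[0] + s * v[1], -s * v[0] + c * v[1]]
ip_LU = dot(rotT(fwd(L, x)), rotT(fwd(L, y)))

assert abs(ip_L - ip_LU) < 1e-12
print("RKRS inner product agrees for both factorizations:", ip_L)
```

The two factors differ by exactly the memoryless unitary of the theorem; the inner product they induce on R(K) is the same.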
As in the first section of this chapter, there are corresponding
theorems to Thm.III.4. These theorems are described below, with proofs
only sketched.
Thm.III.5. (H_Q, _Q E), defined at the beginning of this section, is a Hilbert resolution space which is independent of the factorization Q = T* T used in its definition. Moreover,
i. (H_Q, _Q E) is unitarily equivalent to (H, E),
ii. R(Q) is dense in H_Q,
iii. (F^t)* x = 0 iff _Q E^t x = 0, for x ∈ H_Q,
iv. T*: (H, E) → (H_Q, _Q E) is memoryless.
Proof: i. The proof that (H_Q, _Q E) is a Hilbert resolution space and is independent of the factorization is essentially the same as that of Thm.III.4, and therefore is omitted.
ii. Define T̃: (H, E) → (H_Q, _Q E) by T̃ x = T* x, for all x ∈ H. Then T̃ is 1-1 and onto, and for all x ∈ H,
‖T̃ x‖²_{H_Q} = ‖T* x‖²_{H_Q} = ‖(T*)^{-1} T* x‖²_H = ‖x‖²_H.
So T̃ is a unitary mapping. With T̃ defined as above, it is routine to verify the rest of the theorem; note that T*: (H, E) → (H_Q, _Q E) in iv is actually the T̃ defined above.
Cor. With (H_Q, _Q E) defined as above, there is a right-factorization (miniphase) of Q, q: (B, F) → (H_Q, _Q E), such that
i. q x = Q x, for all x ∈ B,
ii. q* z = z, for all z ∈ H_Q ⊆ B*.
Proof: By Thm.III.5, T̃: H → H_Q is memoryless and unitary. Define q = T̃ T: (B, F) → (H_Q, _Q E). Then
i. q x = T̃ T x = T* (T x) = Q x, for all x ∈ B.
ii. q* z = (T̃ T)* z = T* T̃* z = T* (T̃^{-1} z) = z, for all z ∈ H_Q, since T̃ is unitary and T̃ x = T* x.
iii. By Thm.III.5, _Q E^t x = 0 iff (F^t)* x = 0, for x ∈ H_Q. Since q* z = z, _Q E^t z = 0 iff (F^t)* q* z = 0, for z ∈ H_Q. So q* is left-maxiphase, i.e. q is right-miniphase.
iv. q* q b = (T̃ T)* (T̃ T) b = T* T̃* T̃ T b = T* T b = Q b, for all b ∈ B.
CHAPTER IV
HILBERT SPACE REPRESENTATION THEORY
So far, we have been dealing with basic concepts and fundamental theorems. From now on, applications will be our prime concern. In this chapter, two special areas, a scattering-operator generalization of the scattering variables and Banach space valued random variables, are discussed. Both areas fit in nicely with the theorems developed in previous chapters. Although at this moment we don't have complete theorems for these general problems, the exploration of the problems serves its own purpose.
Scattering Operator
Classically in network analysis, network variables, such as
voltage and current, are assumed to be in Hilbert space. Although the
scattering variable is a very useful tool in network analysis, it is not
clear what is the significance of the "characteristic impedance" which
transforms voltages and currents into objects in a different Hilbert
space. Situations have occurred where we have to assume network
variables to be in Banach spaces. If the same idea for the scattering
variable works here, the function of the characteristic impedance should
be the transforming of network variables in Banach space into elements
in Hilbert space. Theoretically, it is much easier to work with Hilbert
space. Therefore the significance of the characteristic impedance lies
in changing a problem that is more general in formulation into a problem
that is easier to handle. In this section, we are going to generalize
the idea of scattering variables in Banach spaces with the help of the
factorization theorems developed in chapter III.
The reason that the generalization of the scattering variable fits in with the factorization theorems is the "duality" between voltage and current. Thinking of the impedance Z, or the transfer function, as an operator from a current space to a voltage space, the product V·I, where V denotes voltage and I current, plays the role of a linear functional over the current space. Thus, the voltage V can just as well be thought of as a linear functional. Similarly, the admittance can assume the same role as the impedance; in that case, I is a linear functional over the voltage space. What model could be better than a reflexive Banach space to fit the structure just described? Therefore, the current space and the voltage space are chosen to be a reflexive Banach space and its dual.
Since the impedance operator is the inverse of the admittance operator and vice versa, the operators involved in the formulation of the scattering operator are invertible. The following theorems describe the effect of invertibility on the factorization theorems.
Thm.IV.1. Let Z: (B, F) → (B*, F*), where B is a reflexive Banach space, be positive, causal, invertible, linear and bounded. Let M = (1/2)(Z + Z*); then M is also positive, invertible, and furthermore self-adjoint. Then
i. There exists a Hilbert resolution space (H, E) and a left-factorization of M, K_0: (H, E) → (B*, F*), such that K_0 is left-miniphase and invertible.
ii. There exists a Hilbert resolution space (H, E) and a right-factorization of M, T_0: (B, F) → (H, E), such that T_0 is right-miniphase and invertible.
Proof: The existence of the left- and right-factorizations follows from Thm.III.2 and Thm.III.3. All we have to show is the invertibility of K_0 and T_0. We have M = K_0 K_0* = T_0* T_0, and M is onto. So K_0 and T_0* are onto, but they are also 1-1. Therefore K_0 and T_0* are invertible, and T_0* invertible implies T_0 invertible.
Note here that we use the same Hilbert space for left- and right-
factorization. This can be justified from the proofs of Thm.III.2 and
Thm.III.3.
Thm.IV.2. Let M be defined as above. If K: (H, E) → (B*, F*) is a linear bounded operator such that
i. K is 1-1, onto and causal,
ii. M = K K*,
then there exists a linear bounded operator U: (H, E) → (H, E), such that
a. K = K_0 U,
b. U is causal and unitary.
Proof: Simply let U = K_0^{-1} K; then
U* = K* (K_0^{-1})* = K* (K_0*)^{-1} and
U* U = K* (K_0*)^{-1} K_0^{-1} K = K* (K_0 K_0*)^{-1} K = K* M^{-1} K = K* (K K*)^{-1} K = K* (K*)^{-1} K^{-1} K = I.
Similarly U U* = I, so U* = U^{-1}.
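The relation K = K_0 U can be checked numerically in a small matrix model (ignoring the causality requirement on U, which this sketch does not attempt to illustrate): any factor K of M differs from a reference factor K_0 by the unitary U = K_0^{-1} K. The matrix and rotation angle below are arbitrary illustrative choices.

```python
import math

def chol(M):
    """Lower-triangular L with M = L L^T."""
    n = len(M)
    L = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = sum(L[i][k] * L[j][k] for k in range(j))
            L[i][j] = math.sqrt(M[i][i] - s) if i == j else (M[i][j] - s) / L[j][j]
    return L

def fwd(L, b):  # solve L w = b for lower-triangular L
    n = len(b)
    w = [0.0] * n
    for i in range(n):
        w[i] = (b[i] - sum(L[i][k] * w[k] for k in range(i))) / L[i][i]
    return w

def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

def transpose(A):
    return [list(r) for r in zip(*A)]

M = [[2.0, 1.0], [1.0, 2.0]]
K0 = chol(M)                       # reference left factorization, M = K0 K0*

# another factor K = K0 U_true, with U_true a rotation (so K K* = M automatically)
c, s = math.cos(0.4), math.sin(0.4)
U_true = [[c, -s], [s, c]]
K = matmul(K0, U_true)

# recover U = K0^{-1} K column by column (solve K0 u_j = k_j)
n = 2
cols = [fwd(K0, [K[i][j] for i in range(n)]) for j in range(n)]
U = [[cols[j][i] for j in range(n)] for i in range(n)]

UtU = matmul(transpose(U), U)      # should be the identity: U is unitary
for i in range(n):
    for j in range(n):
        assert abs(UtU[i][j] - (1.0 if i == j else 0.0)) < 1e-12
        assert abs(U[i][j] - U_true[i][j]) < 1e-12
print("recovered U = K0^{-1} K is unitary")
```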
Thm.IV.3. Let M be defined as above. If T: (B, F) → (H, E) is a linear bounded operator such that
i. T is 1-1, onto and causal,
ii. M = T* T,
then there exists W: (H, E) → (H, E), such that
a. T = W T_0,
b. W is causal and unitary.
Proof: Simply let W = T T_0^{-1}; then the proof follows as in Thm.IV.2.
The significance of Thm.IV.2 and Thm.IV.3 is that they enable us to choose causal and unitary operators U and W as desired in order to make the factorization satisfy some requirements. Unfortunately, the existence of, and a way to find, such U and W are currently beyond our reach, even though it is trivial to do so in the classical case, i.e. for the scattering variable. With this in mind, let us now formulate the problem.
Consider the following network:
[Figure: a. n-port network, series loaded; b. optimal matching situation]
From the network diagram,
V = (Z_0 + Z_0*) I_i = (Z_0 + Z_L) I_a = V_a + Z_0 I_a.
Define V_r = V_a - V_i and I_r = -(I_a - I_i); then we have V_r = Z_0 I_r.
Define I_r = S_I I_i, where S_I is called the current-basis scattering operator; then
I_r = I_i - I_a = I_i - (Z_0 + Z_L)^{-1} (Z_0 + Z_0*) I_i,
so
S_I = I_B - (Z_0 + Z_L)^{-1} (Z_0 + Z_0*), where I_B is the identity mapping on the current space B.
Now let K, T be the factorizations of (1/2)(Z_0 + Z_0*) as defined in Thm.IV.2 and Thm.IV.3, i.e.
M = (1/2)(Z_0 + Z_0*) = K K* = T* T.
Define a = K* I_i, b = T I_r and b = S a, where S is the so-called scattering operator. Then I_r = S_I I_i implies that
T^{-1} b = S_I (K*)^{-1} a, so b = T S_I (K*)^{-1} a. Hence
S = T S_I (K*)^{-1} = T (K*)^{-1} - 2 T (Z_0 + Z_L)^{-1} K
= C - 2 T Y K, where C = T (K*)^{-1} and Y = (Z_0 + Z_L)^{-1}.
In order to have a causal scattering operator S, we need a causal C. However, C = T (K*)^{-1}, which is not causal in general. By Thm.IV.2 and Thm.IV.3,
C = W T_0 (K_0*)^{-1} U, where K_0, T_0 denote the left- and right-factorizations, respectively. Therefore the requirement for the selection of U and W is to make C causal.
Similarly, consider the following network, an n-port network
parallel loaded by an n-port network.
[Figure: a. parallel loaded n-port; b. optimal matching situation]
Equations such as the following can be easily verified. Define
I_r = -(I_a - I_i) and V_r = V_a - V_i,
and let V_r = S_V V_i, where S_V is the so-called voltage-basis scattering operator. Then
S_V = (Y_0 + Y_L)^{-1} (Y_0* - Y_L) = -I_{B*} + (Y_0 + Y_L)^{-1} (Y_0 + Y_0*),
where I_{B*} is the identity mapping on the voltage space B*. Define
a = Q* V_i and b = P V_r, where P, Q are the factorizations of
(1/2)(Y_0 + Y_0*) = P* P = Q Q*, as mentioned in Thm.IV.2 and Thm.IV.3, and
b = S a, where S is the scattering operator. Then
S = -P (Q*)^{-1} + 2 P Z Q = -D + 2 P Z Q, where D = P (Q*)^{-1} and Z = (Y_0 + Y_L)^{-1}.
Similarly, in order to have a causal S, we must have a causal D. However,
D = P (Q*)^{-1} = W' P_0 (Q_0*)^{-1} U', where P_0, Q_0 are the right- and left-factorizations of (1/2)(Y_0 + Y_0*). Therefore the requirement for the selection of W' and U' is to make D causal.
One of the most useful properties of the scattering variable is that it gives a measure of the optimal transducer power gain. To see that this property still holds for scattering operators, let us consider the "power" entering the load network. For the series loaded network,
I_a = -I_r + I_i = -T^{-1} b + (K*)^{-1} a,
V_a = V_r + V_i = Z_0 T^{-1} b + Z_0* (K*)^{-1} a.
So the power entering the load is given by:
(I_a, V_a)_B = (-T^{-1} b + (K*)^{-1} a, Z_0 T^{-1} b + Z_0* (K*)^{-1} a)_B
= ((K*)^{-1} a, Z_0* (K*)^{-1} a)_B - (T^{-1} b, Z_0 T^{-1} b)_B
+ ((K*)^{-1} a, Z_0 T^{-1} b)_B - (T^{-1} b, Z_0* (K*)^{-1} a)_B
= (a, K^{-1} Z_0* (K*)^{-1} a)_H - (b, (T*)^{-1} Z_0 T^{-1} b)_H
+ (a, K^{-1} Z_0 T^{-1} b)_H - (K^{-1} Z_0 T^{-1} b, a)_H
(the last two terms cancel in a real Hilbert space)
= (1/2)(a, K^{-1} Z_0* (K*)^{-1} a)_H + (1/2)(K^{-1} Z_0* (K*)^{-1} a, a)_H
- (1/2)(b, (T*)^{-1} Z_0 T^{-1} b)_H - (1/2)((T*)^{-1} Z_0 T^{-1} b, b)_H
= (a, (1/2) K^{-1} (Z_0* + Z_0) (K*)^{-1} a)_H - (b, (1/2)(T*)^{-1} (Z_0* + Z_0) T^{-1} b)_H
= (a, K^{-1} K K* (K*)^{-1} a)_H - (b, (T*)^{-1} T* T T^{-1} b)_H
= (a, a)_H - (b, b)_H
= (a, a)_H - (S a, S a)_H
= (a, a)_H - (a, S* S a)_H
= (a, (I_H - S* S) a)_H.
The above equation indicates that, for a passive network, I_H - S* S is a positive operator, and S* S = I_H for a lossless network. The same result can be obtained for the parallel network.
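As a minimal sanity check, consider the scalar (1-port) case with real impedances, an assumption well below the generality of the text: there K = T = √Z_0 and C = 1, S reduces to the classical reflection coefficient, and 1 - S² is exactly the ratio of delivered to available power. The numeric values are arbitrary illustrative choices.

```python
import math

# scalar 1-port: every operator above becomes a real number
Z0, ZL = 50.0, 75.0
M = Z0                         # (1/2)(Z0 + Z0*) for a real Z0
K = T = math.sqrt(M)           # K = T = sqrt(M) gives M = K K* = T* T
C = T / K                      # C = T (K*)^{-1} = 1 in the scalar case
Y = 1.0 / (Z0 + ZL)
S = C - 2.0 * T * Y * K        # the scattering "operator", here a scalar

# S is the classical reflection coefficient (ZL - Z0)/(ZL + Z0)
assert abs(S - (ZL - Z0) / (ZL + Z0)) < 1e-12

# power balance: delivered / available power = 1 - S^2
Vs = 10.0                      # source voltage (arbitrary)
I = Vs / (Z0 + ZL)
P_load = I * I * ZL            # power entering the load
P_avail = Vs * Vs / (4.0 * Z0) # maximum (conjugate-matched) power
assert abs(P_load / P_avail - (1.0 - S * S)) < 1e-12

# matched load ZL = Z0: S = 0 and all available power is delivered
assert 1.0 - 2.0 * Z0 / (Z0 + Z0) == 0.0
print("S =", S, "and 1 - S^2 =", 1.0 - S * S)
```

The lossless case S* S = I corresponds here to |S| = 1, i.e. a purely reactive termination.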
Banach Space Valued Random Variables
One way to think of a random process is to consider it as a random variable which takes values in a function space. Of course, we have to use an adequate probability measure to make the idea work. Fortunately, this kind of measure has been defined for metric spaces. Among all kinds of metric spaces, Hilbert space has the nicest properties. Hence, the Hilbert space valued random variable is the kind that has been studied most. However, there are problems where a random process considered as a Hilbert space valued random variable does not lead to a satisfactory result. Consequently this kind of random process has to be considered as a random variable over spaces other than Hilbert space. The next choice is naturally Banach space. In this section, we first define stochastic properties such as the "mean" and "variance operator" of a Banach space valued random variable, assuming the existence of such random variables. We then look into the factorization of the variance operator and the characterization that might be achieved via the factorization, i.e. the RKRS. Interestingly enough, the RKRS of a
Banach space valued random variable is a Hilbert space. This seems to
be a nice result, but there are obstacles for further application. All
these will be discussed in the following.
(1) Variance Operator
Probability measure on a Banach space is a rather complicated matter. In the sequel, we implicitly assume its existence, as indicated by the expectation symbol E{·}. However, we will not entangle ourselves in the probability measure itself.
Let ρ, π denote finitely additive random variables taking values in a reflexive Banach space B. Assume
i. E{|(ρ, x*)|} < ∞, for all x* ∈ B*,   (a)
ii. E{(ρ, x*)} is continuous in x*.
Then there exists a unique m_ρ ∈ B such that
E{(ρ, x*)} = (m_ρ, x*),
for E{(ρ, x*)} is a continuous linear functional on B*, so it is an element of B** = B. m_ρ is termed the mean of the random variable ρ. The mean has the following properties:
i. m_{ρ+π} = m_ρ + m_π,
ii. ‖m_ρ‖ ≤ E{‖ρ‖},
iii. if L: B → B is bounded and linear, then m_{Lρ} = L m_ρ.
As in most stochastic processes, the mean is not our prime concern. In the sequel we thus assume that all random variables have zero mean. For the definition of the variance operator, we have to assume the following:
i. E{|(ρ, x*)(π, y*)|} < ∞, for all x*, y* ∈ B*,   (b)
ii. E{(ρ, x*)(π, y*)} is continuous in x* and y*.
It is easy to show that condition (b) implies condition (a). Furthermore, we have the following lemma to facilitate the definition of the variance operator.
Lemma. A continuous bilinear functional (x|y) on a Banach space B is also bounded (i.e. there exists M ∈ R such that |(x|y)| ≤ M ‖x‖ ‖y‖, for all x, y ∈ B).
Proof: Assume not; then for each i ∈ N there exist x_i, y_i ∈ B such that
|(x_i|y_i)| > i ‖x_i‖ ‖y_i‖.
Define
u_i = x_i / (√i ‖x_i‖), v_i = y_i / (√i ‖y_i‖).
Then u_i → 0 and v_i → 0. However,
|(u_i|v_i)| = |(x_i|y_i)| / (i ‖x_i‖ ‖y_i‖) > 1,
while continuity requires (u_i|v_i) → (0|0) = 0, a contradiction.
Now if we fix y*, then E{(ρ, x*)(π, y*)} is a bounded linear functional on B* (so an element of B** = B). This implies that there exists a unique ρ_{y*} ∈ B such that
E{(ρ, x*)(π, y*)} = (ρ_{y*}, x*), for all x* ∈ B*.
Define a mapping Q_{ρπ}: B* → B by
Q_{ρπ} y* = ρ_{y*}.
It can be easily proved that Q_{ρπ} is linear. Moreover, Q_{ρπ} is bounded, since
‖Q_{ρπ} y*‖ = ‖ρ_{y*}‖ = sup_{x* ≠ 0} |E{(ρ, x*)(π, y*)}| / ‖x*‖ ≤ sup_{x* ≠ 0} M ‖x*‖ ‖y*‖ / ‖x*‖ = M ‖y*‖.
The operator Q_{ρπ} is termed the covariance operator of the random variables ρ and π. Covariance operators satisfy the following conditions:
i. Q_{(Lρ)(Kπ)} = L Q_{ρπ} K*, where L and K are bounded linear operators on B.
ii. Define Q_ρ = Q_{ρρ}; then
Q_{ρ+π} = Q_ρ + Q_{ρπ} + Q_{πρ} + Q_π.
Q_ρ is called the variance operator of ρ.
iii. Q_{ρπ} = Q_{πρ}*; in particular, Q_ρ* = Q_ρ.
iv. Q_ρ is positive, for (Q_ρ y*, y*) = E{(ρ, y*)²} ≥ 0.
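Conditions iii and iv can be illustrated in a hedged finite-dimensional setting (an assumption of this sketch, not of the text): take ρ = A g for a fixed matrix A and standard normal g, so B = R³, the pairing is the ordinary dot product, and the empirically estimated variance operator comes out self-adjoint and positive.

```python
import random

random.seed(0)

# hypothetical stand-in: rho = A g with g standard normal, true variance Q = A A^T
A = [[1.0, 0.0], [0.5, 2.0], [0.0, 1.0]]   # maps R^2 -> R^3

def sample():
    g = [random.gauss(0.0, 1.0) for _ in range(2)]
    return [sum(A[i][k] * g[k] for k in range(2)) for i in range(3)]

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

N = 20000
samples = [sample() for _ in range(N)]

def Q_hat(y):
    """Empirical variance operator: Q y = E{ rho (rho, y) }."""
    acc = [0.0, 0.0, 0.0]
    for r in samples:
        c = dot(r, y)
        for i in range(3):
            acc[i] += r[i] * c
    return [a / N for a in acc]

x, y = [1.0, -1.0, 2.0], [0.5, 1.0, 0.0]
# self-adjointness: (Q x, y) = (Q y, x)
assert abs(dot(Q_hat(x), y) - dot(Q_hat(y), x)) < 1e-9
# positivity: (Q y, y) = E{(rho, y)^2} >= 0
assert dot(Q_hat(y), y) >= 0.0
print("empirical variance operator is self-adjoint and positive")
```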
(2) Reproducing Kernel Resolution Space (RKRS)
As mentioned in the previous section, a reflexive Banach space valued random variable has a variance operator Q_ρ which is positive and self-adjoint. Then Thm.III.2 and Thm.III.4 come into the picture and we have the following: there exists a Hilbert resolution space (H, E) and a left-factorization (miniphase) K: (H, E) → (B, F) such that Q_ρ = K K*, and also the RKRS (H_ρ, _ρE), which corresponds to (H_Q, _Q E) in Thm.III.4. In this section the resolution of the identity does not play a vital role; however, for future applications, we include it. Since we have R(Q_ρ) ⊆ H_ρ ⊆ B, one of the natural questions to ask is whether the random variable takes values only in H_ρ, and, if it does, what we can say about the original random variable. The answer to the first part of the question is no, and we will give a counterexample in the next chapter. However, if we happen to have the random variable taking values in H_ρ, we would have the following properties. First define B_0 = R(K), where K: H → B. Then K^{-1}: B_0 → H, and we have (K^{-1})*: H → B_0*. Consider E{(ρ, x)_ρ (ρ, y)_ρ}, for x, y in H_ρ:
E{(ρ, x)_ρ (ρ, y)_ρ} = E{(K^{-1} ρ, K^{-1} x)_H (K^{-1} ρ, K^{-1} y)_H}
= E{(ρ, (K^{-1})* K^{-1} x)_{B_0} (ρ, (K^{-1})* K^{-1} y)_{B_0}}.
Note: (K^{-1})* K^{-1} x and (K^{-1})* K^{-1} y are elements of B_0*, i.e., linear functionals on B_0. By the Hahn-Banach theorem, there exist x* and y* in B* such that
x*|_{B_0} = (K^{-1})* K^{-1} x,
y*|_{B_0} = (K^{-1})* K^{-1} y.
So (ρ, (K^{-1})* K^{-1} x)_{B_0} = (ρ, x*)_B
and (ρ, (K^{-1})* K^{-1} y)_{B_0} = (ρ, y*)_B.
Therefore,
E{(ρ, x)_ρ (ρ, y)_ρ} = E{(ρ, x*)_B (ρ, y*)_B}
= (Q_ρ x*, y*)_B
= (K K* x*, y*)_B
= (K* x*, K* y*)_H.
What is K* y*? We claim it is K^{-1} y. For:
‖K* y* - K^{-1} y‖²_H = (K* y* - K^{-1} y, K* y* - K^{-1} y)_H
= (K* y*, K* y*)_H - 2 (K* y*, K^{-1} y)_H + (K^{-1} y, K^{-1} y)_H;
however,
(K* y*, K^{-1} y)_H = (K K^{-1} y, y*)_B
= (y, y*)_B, since y ∈ R(K),
= (y, (K^{-1})* K^{-1} y)_{B_0}
= (K^{-1} y, K^{-1} y)_H,
and
(K* y*, K* y*)_H = (Q_ρ y*, y*)_B.
But Q_ρ y* ∈ R(Q_ρ) ⊆ R(K), so
(K* y*, K* y*)_H = (Q_ρ y*, (K^{-1})* K^{-1} y)_{B_0}
= (K^{-1} Q_ρ y*, K^{-1} y)_H
= (K^{-1} K K* y*, K^{-1} y)_H
= (K* y*, K^{-1} y)_H
= (K^{-1} y, K^{-1} y)_H.
Hence ‖K* y* - K^{-1} y‖² = 0, i.e. K* y* = K^{-1} y. Similarly, K* x* = K^{-1} x. Going back to E{(ρ, x)_ρ (ρ, y)_ρ}, we see that
E{(ρ, x)_ρ (ρ, y)_ρ} = (Q_ρ x*, y*)_B
= (K* x*, K* y*)_H
= (K^{-1} x, K^{-1} y)_H
= (x, y)_ρ.
Furthermore, E{(ρ, x)_ρ} = E{(ρ, x*)_B} = 0.
The above results show not only that ρ can be considered as a zero-mean random variable over H_ρ, by satisfying condition (b) of the previous section, but also that ρ has the identity operator on H_ρ as its variance operator. Unfortunately, these results are overshadowed by the prerequisite that ρ take values in H_ρ only.
CHAPTER V
REPRODUCING KERNEL RESOLUTION SPACE OF HILBERT
SPACE VALUED RANDOM VARIABLES
In the last chapter, one problem was left unsolved for Banach space valued random variables. We mentioned there that the random variable may not take values only in the RKRS of its variance operator; hence the characterization of a random variable by the RKRS of its variance operator needs further justification. Here we will give a counterexample to make the point, then examine some remedies for random variables on Hilbert space. The result is that a certain kind of random variable can be approximated by a sequence of random variables which take values in the RKRS of the variance operator of the original random variable and have variance operators that approach (weakly) the identity operator of the RKRS.
Since the counterexample involves random variables on Hilbert space, there are some special properties of this kind of random variable that are generally not true for Banach space valued random variables. These properties are summarized in the following theorems.
Thm.V.1. Let ρ be a random variable which takes values in a Hilbert space H, is zero-mean, and has variance operator Q. Then ρ takes values in R(Q)⁻ (the closure of R(Q)) with probability 1.
Proof: i. By definition,
E{(ρ, x)(ρ, y)} = (x, Q y), for all x, y in H.
ii. Let ρ = ρ₁ + ρ₂, where ρ₁ = P ρ, P is the projection mapping onto R(Q)⁻, and ρ₂ = ρ - ρ₁ = (I - P) ρ.
iii. 0 = E{(ρ, x)} = E{(ρ₁ + ρ₂, x)} = E{(ρ₁, x)} + E{(ρ₂, x)}.
But E{(ρ₁, x)} = E{(P ρ, x)} = E{(ρ, P x)} = 0, so E{(ρ₂, x)} = 0.
iv. E{(ρ₁, x)²} = E{(P ρ, x)²} = E{(ρ, P x)²} = (P x, Q P x) = (x, P Q P x) = (x, Q P x) = (P Q x, x) = (Q x, x) = E{(ρ, x)²},
and E{(ρ₁, x)(ρ₂, x)} = E{(P ρ, x)([I - P] ρ, x)} = E{(ρ, P x)(ρ, [I - P] x)} = (P x, Q [I - P] x) = 0.
But E{(ρ, x)²} = E{(ρ₁, x)²} + 2 E{(ρ₁, x)(ρ₂, x)} + E{(ρ₂, x)²}.
Hence 0 = E{(ρ₂, x)²}, for all x ∈ H. So ρ₂ = 0 with probability 1, i.e. ρ = ρ₁ = P ρ with probability 1. Therefore ρ takes values in R(Q)⁻ with probability 1.
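The theorem is easy to visualize in a hypothetical finite-dimensional setting (where R(Q)⁻ = R(Q) is just the column space of a matrix and the "probability 1" statement becomes exact): with ρ = A g for a rank-deficient A, every sample of ρ lies in the range of Q = A A*.

```python
import random

random.seed(1)

# rho = A g with rank(A) = 2 in R^3, so R(Q) = column space of A (a plane)
A = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
normal = [1.0, 1.0, -1.0]        # vector orthogonal to the column space of A

def sample():
    g = [random.gauss(0.0, 1.0) for _ in range(2)]
    return [sum(A[i][k] * g[k] for k in range(2)) for i in range(3)]

for _ in range(1000):
    r = sample()
    d = sum(r[i] * normal[i] for i in range(3))
    assert abs(d) < 1e-12        # every sample lies in R(Q)
print("all samples lie in the closure of R(Q)")
```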
Thm.V.2. With ρ, Q as defined in Thm.V.1, let Q be a compact operator, so that Q x = Σ_{i=1}^∞ λ_i (x, φ_i) φ_i, where λ_i is an eigenvalue and φ_i is the corresponding eigenvector, with the properties that
i. λ_i > 0, for all i, since Q is positive,
ii. lim_{i→∞} λ_i = 0,
iii. {φ_i}_{i=1}^∞ is an orthonormal system.
Then ρ = Σ_{i=1}^∞ a_i φ_i, where a_i = (ρ, φ_i), with probability 1.
Proof: i. E{(ρ - Σ_{i=1}^n a_i φ_i, x)} = E{(ρ, x)} - Σ_{i=1}^n (φ_i, x) E{a_i} = 0 - Σ_{i=1}^n (φ_i, x) E{(ρ, φ_i)} = 0, for all n.
ii. E{(ρ - Σ_{i=1}^n a_i φ_i, x)²}
= E{(ρ, x)²} - 2 E{(ρ, x)(Σ_{i=1}^n a_i φ_i, x)} + E{(Σ_{i=1}^n a_i φ_i, x)²}
= (x, Q x) - 2 Σ_{i=1}^n (φ_i, x) E{(ρ, x) a_i} + Σ_{i=1}^n Σ_{j=1}^n (φ_i, x)(φ_j, x) E{a_i a_j}.
But E{(ρ, x) a_i} = E{(ρ, x)(ρ, φ_i)} = (x, Q φ_i) = λ_i (x, φ_i),
and E{a_i a_j} = E{(ρ, φ_i)(ρ, φ_j)} = (φ_i, Q φ_j) = λ_i δ_{ij}.
So E{(ρ - Σ_{i=1}^n a_i φ_i, x)²} = Σ_{i=1}^∞ λ_i (x, φ_i)² - 2 Σ_{i=1}^n λ_i (x, φ_i)² + Σ_{i=1}^n λ_i (x, φ_i)²
= Σ_{i=n+1}^∞ λ_i (x, φ_i)² ≤ (sup_{i>n} λ_i) ‖x‖² → 0, as n → ∞.
By i and ii, we conclude that ρ = Σ_{i=1}^∞ a_i φ_i with probability 1.
Now we are ready for the counterexample.
Exp. Let {φ_i, ψ_i}_{i=1}^∞ be an orthonormal system in a Hilbert space H. Let ρ = Σ_{n=1}^∞ (1/n) cos(2nπω) φ_n + Σ_{n=1}^∞ (1/n) sin(2nπω) ψ_n, where ω is a real valued random variable uniformly distributed in [0,1]. Then we have
i. ‖ρ‖² = Σ_{n=1}^∞ (1/n²) cos²(2nπω) + Σ_{n=1}^∞ (1/n²) sin²(2nπω) = Σ_{n=1}^∞ (1/n²)[cos²(2nπω) + sin²(2nπω)] = Σ_{n=1}^∞ (1/n²) < ∞.
ii. E{(ρ, x)} = E{Σ_{n=1}^∞ (1/n) cos(2nπω)(φ_n, x) + Σ_{n=1}^∞ (1/n) sin(2nπω)(ψ_n, x)}.
But E{cos(2nπω)} = ∫₀¹ cos(2nπω) dω = 0, for all n, and E{sin(2nπω)} = ∫₀¹ sin(2nπω) dω = 0, for all n.
So E{(ρ, x)} = 0, for all x ∈ H.
iii. E{(ρ, x)(ρ, y)}
= E{[Σ_{n=1}^∞ (1/n) cos(2nπω)(φ_n, x) + Σ_{n=1}^∞ (1/n) sin(2nπω)(ψ_n, x)]
· [Σ_{m=1}^∞ (1/m) cos(2mπω)(φ_m, y) + Σ_{m=1}^∞ (1/m) sin(2mπω)(ψ_m, y)]}
= E{Σ_n Σ_m (1/nm) cos(2nπω) cos(2mπω)(φ_n, x)(φ_m, y)
+ Σ_n Σ_m (1/nm) cos(2nπω) sin(2mπω)(φ_n, x)(ψ_m, y)
+ Σ_n Σ_m (1/nm) sin(2nπω) cos(2mπω)(ψ_n, x)(φ_m, y)
+ Σ_n Σ_m (1/nm) sin(2nπω) sin(2mπω)(ψ_n, x)(ψ_m, y)}.
But E{sin(2nπω) sin(2mπω)} = ∫₀¹ sin(2nπω) sin(2mπω) dω = (1/2) δ_{nm},
E{sin(2nπω) cos(2mπω)} = 0,
E{cos(2nπω) cos(2mπω)} = (1/2) δ_{nm}.
So E{(ρ, x)(ρ, y)}
= Σ_n (1/2n²)(φ_n, x)(φ_n, y) + Σ_n (1/2n²)(ψ_n, x)(ψ_n, y)
= (Σ_n (1/2n²)(φ_n, x) φ_n + Σ_n (1/2n²)(ψ_n, x) ψ_n, y)
=: (Q x, y).
Hence Q x = Σ_n (1/2n²)(φ_n, x) φ_n + Σ_n (1/2n²)(ψ_n, x) ψ_n,
Q φ_i = (1/2i²) φ_i and Q ψ_i = (1/2i²) ψ_i.
Hence the 1/2i² are the eigenvalues and {φ_i, ψ_i} are the corresponding orthonormal eigenvectors.
iv. We claim that ρ takes values in R(Q)⁻ - R(K) with probability 1, where
K = Σ_n (1/(n√2))(φ_n, ·) φ_n + Σ_n (1/(n√2))(ψ_n, ·) ψ_n.
Since R(Q) = R(K K*) ⊆ R(K), this also shows that ρ takes values in R(Q)⁻ - R(Q). Here is how:
For all ω ∈ [0,1],
ρ(ω) = Σ_n (1/n) cos(2nπω) φ_n + Σ_n (1/n) sin(2nπω) ψ_n,
and ‖ρ(ω)‖² = Σ_n 1/n².
If there exists z in R(K) such that ρ(ω) = z, then there exists x ∈ H such that ρ(ω) = z = K x
= Σ_n (1/(n√2))(φ_n, x) φ_n + Σ_n (1/(n√2))(ψ_n, x) ψ_n.
This implies that
(φ_i, x) = √2 i (φ_i, z) and (ψ_i, x) = √2 i (ψ_i, z).
For x in H, Σ_i (φ_i, x)² + Σ_i (ψ_i, x)² has to be finite. But z = ρ(ω), (φ_i, z) = (1/i) cos(2iπω) and (ψ_i, z) = (1/i) sin(2iπω). So
Σ_i [√2 i (φ_i, z)]² + Σ_i [√2 i (ψ_i, z)]² = Σ_i 2 cos²(2iπω) + Σ_i 2 sin²(2iπω) = Σ_i 2.
This series does not converge, so a contradiction occurs. With Thm.V.1, we conclude that ρ takes values in R(Q)⁻ - R(K), and hence in R(Q)⁻ - R(Q), with probability 1.
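The two series in the counterexample can be tabulated numerically. Assuming the orthonormal system, the coefficients of ρ(ω) are the explicit scalars above, so the norm series converges to Σ 1/n² = π²/6 while the coefficient series required for membership in R(K) grows like 2N:

```python
import math

omega = 0.3   # any fixed value of the uniform random variable

def norm_sq_partial(N):
    # sum_{n<=N} (1/n^2)[cos^2(2 n pi w) + sin^2(2 n pi w)] = sum_{n<=N} 1/n^2
    return sum((1.0 / n**2) * (math.cos(2 * n * math.pi * omega) ** 2 +
                               math.sin(2 * n * math.pi * omega) ** 2)
               for n in range(1, N + 1))

def membership_partial(N):
    # sum_{n<=N} [sqrt(2) n (phi_n, rho)]^2 + [sqrt(2) n (psi_n, rho)]^2
    #   = sum_{n<=N} 2 cos^2 + 2 sin^2 = 2 N
    return sum(2 * math.cos(2 * n * math.pi * omega) ** 2 +
               2 * math.sin(2 * n * math.pi * omega) ** 2
               for n in range(1, N + 1))

assert abs(norm_sq_partial(100000) - math.pi ** 2 / 6) < 1e-4
assert abs(membership_partial(1000) - 2000.0) < 1e-9
print("norm series converges; membership series diverges linearly")
```

So ρ(ω) always has finite norm, yet never satisfies the summability test for lying in R(K).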
Thm.V.2 gives us a pretty good idea of how to approximate a random variable. But there is a requirement that the random variable must have a compact variance operator. The following is what we found:
i. With ρ, Q as defined in Thm.V.1 and Thm.V.2, let Q be compact. Then by Thm.V.2, we have ρ = Σ_i (ρ, φ_i) φ_i with probability 1, where the {φ_i} are the orthonormal eigenvectors of Q.
ii. Define ρ_n = Σ_{i=1}^n (ρ, φ_i) φ_i = P_n ρ, where P_n is the projection on the subspace spanned by {φ_1, φ_2, ..., φ_n}. ρ_n is zero-mean, and
E{(ρ_n, x)(ρ_n, y)} = E{(P_n ρ, x)(P_n ρ, y)}
= E{(ρ, P_n x)(ρ, P_n y)}
= (P_n x, Q P_n y)
= (P_n x, Σ_{i=1}^n λ_i (φ_i, y) φ_i)
= (x, P_n Σ_{i=1}^n λ_i (φ_i, y) φ_i)
= (x, Σ_{i=1}^n λ_i (φ_i, y) φ_i)
=: (x, Q_n y).
So Q_n y = Σ_{i=1}^n λ_i (φ_i, y) φ_i = P_n Q y.
iii. E{(ρ-ρ_n, x)(ρ-ρ_n, y)} = E{([I-P_n] ρ, x)([I-P_n] ρ, y)}
= ([I-P_n] x, Q [I-P_n] y)
= (x, Q y) - (x, Q P_n y) - (P_n x, Q y) + (P_n x, Q P_n y)
= (x, Q y) - (x, Q P_n y) - (x, P_n Q y) + (x, P_n Q P_n y).
But Q P_n y = P_n Q y = P_n Q P_n y = Σ_{i=1}^n λ_i (φ_i, y) φ_i.
So E{(ρ-ρ_n, x)(ρ-ρ_n, y)} = (x, Q y) - (x, P_n Q y) = (x, [Q - P_n Q] y) = (x, [Q - Q_n] y).
iv. ‖(Q - Q_n) x‖² = ‖Σ_{i=n+1}^∞ λ_i (φ_i, x) φ_i‖² = Σ_{i=n+1}^∞ λ_i² (φ_i, x)² ≤ (sup_{i>n} λ_i)² ‖x‖².
So ‖(Q - Q_n) x‖ / ‖x‖ ≤ sup_{i>n} λ_i, for all x ≠ 0. So ‖Q - Q_n‖ ≤ sup_{i>n} λ_i → 0, as n → ∞, i.e. Q_n → Q uniformly.
v. For ρ_n = Σ_{i=1}^n (ρ, φ_i) φ_i = Σ_{i=1}^n λ_i (ρ/λ_i, φ_i) φ_i, ρ_n takes values in R(Q) ⊆ R(K) only, where K is the factorization of Q. ρ_n is also zero-mean in H_Q.
Consider the RKRS variance of ρ_n: for all x, y ∈ H_Q = R(K),
E{(ρ_n, x)_Q (ρ_n, y)_Q}
= E{(K^{-1} ρ_n, K^{-1} x)_H (K^{-1} ρ_n, K^{-1} y)_H}
= E{(ρ_n, (K^{-1})* K^{-1} x)_H (ρ_n, (K^{-1})* K^{-1} y)_H}
= ((K^{-1})* K^{-1} x, Q_n (K^{-1})* K^{-1} y)_H
= (K^{-1} x, K^{-1} P_n Q (K^{-1})* K^{-1} y)_H
= (K^{-1} x, K^{-1} P_n K K* (K^{-1})* K^{-1} y)_H
= (K^{-1} x, K^{-1} P_n K K^{-1} y)_H
= (K^{-1} x, K^{-1} P_n y)_H, for y ∈ R(K),
= (x, P_n y)_Q.
Note that (K^{-1})* K^{-1} x is actually an element of R(K) ⊆ H. Furthermore, since Q_n → Q uniformly,
Q_n (K^{-1})* K^{-1} y → Q (K^{-1})* K^{-1} y.
So (P_n y, x)_Q = ((K^{-1})* K^{-1} x, Q_n (K^{-1})* K^{-1} y)_H
→ ((K^{-1})* K^{-1} x, Q (K^{-1})* K^{-1} y)_H
= ((K^{-1})* K^{-1} x, K K* (K^{-1})* K^{-1} y)_H
= (K^{-1} x, K^{-1} K K* (K^{-1})* K^{-1} y)_H
= (K^{-1} x, K^{-1} y)_H
= (x, y)_Q, for all x, y ∈ H_Q.
This means that P_n → I_{H_Q} weakly.
The above results simply indicate that although we cannot directly consider a random variable as "white noise" in the RKRS of its variance operator (that is, as having the identity mapping as its variance operator), in some situations we may as well approximate the random variable by a sequence of random variables that take values only in the RKRS and have variance operators approaching the identity mapping weakly, i.e., a sequence of random variables "approaching white noise".
CHAPTER VI
WIENER-HOPF OPTIMIZATION THEORY
In the first part of this chapter, we formulate the classical Wiener-Hopf filtering in Hilbert spaces. With the techniques developed for this filtering, we then look into a special control problem, the so-called Wiener-Hopf control. By choosing an adequate optimization criterion, this kind of control fits nicely into the method of Wiener-Hopf filtering. In both cases we deal with the optimization of positive operators, and because of this, we designate this chapter "Wiener-Hopf Optimization Theory".
Wiener-Hopf Filtering
Before we formulate the Wiener-Hopf filtering in Hilbert spaces, we have to assume certain requirements to facilitate the formulation. Let X be a random variable denoting the signal and n be a random variable denoting the noise. Both random variables take values in the Hilbert resolution space (H, E) and they have the following properties:
i. E{(X, z)²} and E{(n, z)²} are finite, for all z ∈ H.
ii. X, n are zero-mean and statistically independent, i.e. Q_{Xn} = Q_{nX} = 0.
iii. Q_X, Q_n denote the variance operators of X and n, respectively. As such, they are positive, self-adjoint, bounded and linear. We assume that
a. Q_X is Hilbert-Schmidt,
b. Q_n is positive definite,
c. Q_X + Q_n is onto.
Let T denote the filter operator; T has to be causal and Hilbert-Schmidt. Let Y denote the output of the filter. Define e = X - Y; then e = X - T(n + X) and Q_e = (I - T) Q_X (I - T)* + T Q_n T*. Since Q_e is a function of T, we will write Q_e as Q_e(T) to indicate the dependence on T. Our assignment is to find an optimal filter T_0 such that Q_e(T_0) is minimal, i.e. for any T with Q_e(T) ≤ Q_e(T_0) we have Q_e(T) = Q_e(T_0).
Solution:
i. Q_e = (I - T) Q_X (I - T)* + T Q_n T*
= T (Q_X + Q_n) T* - T Q_X - Q_X T* + Q_X.
Since Q_n is positive definite and Q_X is positive, Q_X + Q_n is positive definite, hence 1-1.
ii. Let F: (H, E) → (H, E) be the left factorization of Q_X + Q_n such that
a. Q_X + Q_n = F F*,
b. E^t x = 0 iff E^t F x = 0, for all t ∈ R.
This can be done because Q_X + Q_n is positive definite, self-adjoint and onto. F can be taken as the square root of Q_X + Q_n.
iii. Since Q_X + Q_n is onto, for all y ∈ H there exists x ∈ H such that (Q_X + Q_n) x = y. So y = F F* x = F(F* x), i.e. F is onto. F is also 1-1, since it is left-miniphase. So F^{-1} exists and is causal (this also follows from the property of being left-miniphase).
iv. Q_e = T F F* T* - T Q_X - Q_X T* + Q_X
= [T F - Q_X (F*)^{-1}][F* T* - F^{-1} Q_X] + Q_X - Q_X (F F*)^{-1} Q_X
= [T F - Q_X (F*)^{-1}][T F - Q_X (F*)^{-1}]* + Q_X - Q_X (F F*)^{-1} Q_X.
Since Q_X and Q_X (F F*)^{-1} Q_X are positive and independent of T, to find the minimum of Q_e is the same as to find the minimum of [T F - Q_X (F*)^{-1}][T F - Q_X (F*)^{-1}]*, denoted Q(T).
v. Lemma 1. For Hilbert-Schmidt operators, the additive decomposition (into causal and strictly anti-causal parts) always exists and each component is again Hilbert-Schmidt. The proof of this lemma can be found in the reference indicated.
vi. Lemma 2. For any self-adjoint, nuclear operator pair A and B, with A and B comparable,
A = B iff Tr[A] = Tr[B].
Proof: a. The lemma is equivalent to saying that, for any positive, nuclear and self-adjoint operator A, A = 0 iff Tr[A] = 0.
b. (⇒) Trivial.
c. (⇐) Tr[A] = Σ_i λ_i(A) = 0, where λ_i(A) denotes the i-th largest eigenvalue of A. But λ_i(A) ≥ 0, for all i, since A is positive and self-adjoint. So λ_i(A) = 0, for all i. This implies the radius of the spectrum of A is zero, so ‖A‖ = 0, for A is self-adjoint. So A = 0.
vii. Qx is Hilbert-Schmidt and (F*)^-1 is bounded, so
     Qx (F*)^-1 is also Hilbert-Schmidt. Let
     Qx (F*)^-1 = C + A, where A is the strictly anti-causal
     part of Qx (F*)^-1 and C is the causal part of Qx (F*)^-1.
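In matrix terms, the additive decomposition of step vii splits an operator into its lower-triangular (causal) part and its strictly upper-triangular (strictly anti-causal) part; the latter has zero diagonal, which is the finite-dimensional shadow of Lemma 3. A sketch, with a random matrix standing in for Qx (F*)^-1:

```python
import numpy as np

rng = np.random.default_rng(1)
M = rng.standard_normal((5, 5))   # stands in for Qx (F*)^-1

C = np.tril(M)          # causal part: lower triangle with diagonal
A = np.triu(M, k=1)     # strictly anti-causal part: strictly upper

assert np.allclose(C + A, M)         # additive decomposition
assert np.isclose(np.trace(A), 0.0)  # zero diagonal => zero trace
```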
viii. Q(T) = [T F - Qx (F*)^-1][T F - Qx (F*)^-1]*
           = [T F - C - A][T F - C - A]*
           = [T F - C][T F - C]* - [T F - C] A* - A [T F - C]*
             + A A*.
ix. Lemma 3. The trace of any strictly causal (strictly anti-
    causal) nuclear operator is zero.
    Proof: The radius of the spectrum of a strictly causal
           (strictly anti-causal) nuclear operator, say T,
           is zero. Therefore,
           Tr[T] = Σ λi(T) = 0.
x. Lemma 4. The composite of any two Hilbert-Schmidt
   operators is nuclear [22].
   Lemma 5. The strictly causal (strictly anti-causal)
   operators form a two-sided ideal in the space of causal
   (anti-causal) operators.
xi. Now consider the trace of Q(T) (Q(T) is nuclear by
    Lemma 4):
    Tr[Q(T)] = Tr[(T F - C)(T F - C)*] - Tr[(T F - C) A*]
               - Tr[A (T F - C)*] + Tr[A A*].
    Since A is strictly anti-causal, A* is strictly causal.
    T F - C is causal, so (T F - C)* is anti-causal. Hence
    (T F - C) A* is strictly causal and A (T F - C)* is
    strictly anti-causal by Lemma 5. So
    Tr[(T F - C) A*] = 0 = Tr[A (T F - C)*] by Lemma 3. Hence
    Tr[Q(T)] = Tr[(T F - C)(T F - C)*] + Tr[A A*].
    Since (T F - C)(T F - C)* and A A* are both positive, their
    traces are also positive. So
    min Tr[Q(T)] = Tr[A A*], attained at T = C F^-1, a causal
    operator.
xii. Claim: when T = C F^-1, Qe is minimal. First, note that
     Qe is minimal whenever Q(T) is minimal. Second, for any
     causal, Hilbert-Schmidt T such that Q(T) ≤ Q(C F^-1),
     we have
     Tr[Q(T)] ≤ Tr[Q(C F^-1)].
     But Tr[Q(C F^-1)] is minimum, so
     Tr[Q(C F^-1)] = Tr[Q(T)]. Hence
     Q(C F^-1) = Q(T) by Lemma 2.
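Steps i through xii can be replayed end-to-end with matrices, again reading causal as lower triangular. The sketch below (hypothetical Qx and Qn) builds the factor F, splits Qx (F*)^-1 into C + A, forms the optimal filter T0 = C F^-1, and checks that the residual trace is exactly Tr[A A*] and that a random causal competitor does no better:

```python
import numpy as np

rng = np.random.default_rng(2)
n = 5
B1 = rng.standard_normal((n, n))
B2 = rng.standard_normal((n, n))
Qx = B1 @ B1.T                       # signal variance (positive)
Qn = B2 @ B2.T + 0.1 * np.eye(n)     # noise variance (positive definite)

F = np.linalg.cholesky(Qx + Qn)      # left factorization, step ii
M = Qx @ np.linalg.inv(F.T)          # Qx (F*)^-1, step vii
C, A = np.tril(M), np.triu(M, k=1)   # causal / strictly anti-causal

T0 = C @ np.linalg.inv(F)            # optimal filter T0 = C F^-1
assert np.allclose(T0, np.tril(T0))  # T0 is causal

def trQ(T):
    """Tr[(T F - M)(T F - M)*], the trace of Q(T) from step iv."""
    D = T @ F - M
    return np.trace(D @ D.T)

# At T0 the residual is exactly Tr[A A*] (step xi) ...
assert np.isclose(trQ(T0), np.trace(A @ A.T))
# ... and a random causal competitor does no better.
T = np.tril(rng.standard_normal((n, n)))
assert trQ(T) >= trQ(T0) - 1e-9
```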
At this point, the operator version of Wiener-Hopf filtering
is complete. However, the requirement that certain operators be
Hilbert-Schmidt seems too restrictive. Being Hilbert-Schmidt is
required for the formation of a nuclear Q(T) in the variance operator,
so that the trace of Q(T) makes sense. Furthermore, the "if and
only if" statement of Lemma 2 enables us to minimize the trace in
order to find the optimal filter. Taking the memoryless part of the
variance operator has a similar effect on the variance operator as
taking the trace. Intuitively, the scheme used previously might also
work here once the additive decomposition is guaranteed. Unfortunately,
there is no counterpart of Lemma 2, i.e., a positive, self-adjoint
operator having zero memoryless part is not necessarily the zero
operator. Therefore, in order to optimize the operator (filter) over
a broader class of operators, namely operators that can be decomposed
additively (for example, S in the space of compact operators), we
have to compromise the optimization criterion. The compromise is made
by minimizing the memoryless part of the variance operator instead of
the operator itself. With this modification, the problem can be
solved in the same fashion once the additive decomposition is
guaranteed by working in S, and it leads to the same solution.
Wiener-Hopf Control
Fig. 1 below shows a control system suggested by
Youla [23]. The center branch represents the plant P together with
the controller C, while the upper and the lower branches represent
the feedforward and the feedback compensators, respectively.
[Block diagram omitted in this copy.]
Fig. 1. A Feedback and Feedforward Control System.
In the diagram, u is the input to the system and y is the output.
"d" represents the load disturbance, "ℓ" is the noise picked up by
the feedforward sensor, and "m" is the noise picked up by the feedback
sensor. The only unknown is the controller C. Our goal is to
construct an optimal C such that a certain sum of variance operators
is minimal. However, before we formulate the problem and solve it,
we must assume the following:
i. The signals u, ℓ, d and m are random variables that take
   values in Hilbert resolution spaces (H, E), (G, E), (D, E)
   and (M, E), respectively.
ii. P: K -> H, P0: D -> H, L: D -> H, L0: G -> H, F: H -> H and
    F0: M -> H are linear and bounded. C: H -> K is linear.
    Here (K, E) is also a Hilbert resolution space.
iii. E{(u, x)²}, E{(ℓ, g)²}, E{(d, z)²} and E{(m, w)²} are
     finite, for all x ∈ H, g ∈ G, z ∈ D and w ∈ M.
     E{(u, x1)(u, x2)}, E{(ℓ, g1)(ℓ, g2)}, E{(d, z1)(d, z2)}
     and E{(m, w1)(m, w2)} are continuous in x1 and x2 of H,
     g1 and g2 of G, z1 and z2 of D, and w1 and w2 of M,
     respectively.
iv. The random variables u, ℓ, d and m are zero-mean and
    statistically independent.†
v. Qu, Qℓ, Qd and Qm denote the variance operators of u, ℓ, d
   and m, respectively. As such, they are positive, self-
   adjoint, bounded and linear. We assume that
   a. Qu + P0 Qd Pd* is Hilbert-Schmidt, where
      Pd = F P0 + L;
   b. Qu + F0 Qm F0* + L0 Qℓ L0* + Pd Qd Pd* is positive
      definite and onto.
vi. Def.: The plant P and the feedback compensator F form an
    admissible pair if (a) both are causal, and (b) there exists
    a controller C (causal and linear) such that (I_H + F P C)^-1
    exists and P C is Hilbert-Schmidt.
vii. P is onto.
Problem: For an admissible pair P and F, find an optimal
controller C (over the class of operators that make P and F
admissible) such that Qe + Q(Ps r) is minimal, where e = u - y and
r is the input to the plant, while Ps is a linear and bounded mapping
from K to H which transforms the random variable r such that Ps r
takes values in H and Q(Ps r) is a positive operator over H, and
hence compatible with Qe. The operator Ps will be specified later.

† Random variables u and ℓ are independent if the covariance operator
of u and A ℓ is zero for every linear bounded operator A: G -> H.
Solution:
i. From Fig. 1,
   y = P r + P0 d,
   v = F y + F0 m,
   h = L d + L0 ℓ.
   Straightforward calculation yields
   y = P R (u - F0 m - L0 ℓ) + (P0 - P R Pd) d,
   r = R (u - F0 m - L0 ℓ - Pd d),
   e = u - y
     = (I_H - P R) u + P R (F0 m + L0 ℓ) - (P0 - P R Pd) d,
   where R = C S, S = (I_H + F P C)^-1 and Pd = F P0 + L.
   So
   Qe = (I_H - P R) Qu (I_H - P R)*
        + (P0 - P R Pd) Qd (P0 - P R Pd)*
        + P R (F0 Qm F0* + L0 Qℓ L0*) (P R)*,
   Q(Ps r) = Ps R (Qu + F0 Qm F0* + L0 Qℓ L0* + Pd Qd Pd*)
             (Ps R)*.
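The closed-loop expressions in step i can be spot-checked numerically. The sketch below assumes the interconnection r = C(u - v - h) implied by Fig. 1, solves the loop directly for random matrices (all dimensions and gains hypothetical), and compares the result against the closed forms for r and y:

```python
import numpy as np

rng = np.random.default_rng(3)
nH, nK, nD, nG, nM = 4, 3, 2, 2, 2
P  = rng.standard_normal((nH, nK))         # plant
C  = 0.1 * rng.standard_normal((nK, nH))   # controller
F  = 0.1 * rng.standard_normal((nH, nH))   # feedback compensator
F0 = rng.standard_normal((nH, nM))
P0 = rng.standard_normal((nH, nD))
L  = rng.standard_normal((nH, nD))
L0 = rng.standard_normal((nH, nG))
u   = rng.standard_normal(nH)
ell = rng.standard_normal(nG)
d   = rng.standard_normal(nD)
m   = rng.standard_normal(nM)

# Loop: y = P r + P0 d, v = F y + F0 m, h = L d + L0 ell,
# with the assumed interconnection r = C (u - v - h), so that
# (I + C F P) r = C (u - F P0 d - F0 m - L d - L0 ell).
r = np.linalg.solve(np.eye(nK) + C @ F @ P,
                    C @ (u - F @ P0 @ d - F0 @ m - L @ d - L0 @ ell))
y = P @ r + P0 @ d

# Closed forms from the text (push-through: C S = (I + C F P)^-1 C).
S  = np.linalg.inv(np.eye(nH) + F @ P @ C)
R  = C @ S
Pd = F @ P0 + L
assert np.allclose(r, R @ (u - F0 @ m - L0 @ ell - Pd @ d))
assert np.allclose(y, P @ R @ (u - F0 @ m - L0 @ ell)
                      + (P0 - P @ R @ Pd) @ d)
```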
ii. Note that the only parameter depending on C is R, and Ps
    and R always appear as the product Ps R in Q(Ps r).
    Therefore, in order to make Q(Ps r) compatible, the proper
    choice for Ps is k P, with k a real constant. For Ps = k P,
    Qe + Q(Ps r)
      = (1 + k²) P R (Qu + F0 Qm F0* + L0 Qℓ L0* + Pd Qd Pd*)
        (P R)* - P R (Qu + Pd Qd P0*) - (Qu + P0 Qd Pd*) (P R)*
        + Qu + P0 Qd P0*
      = P R Q0 (P R)* - P R Q1* - Q1 (P R)* + Qu + P0 Qd P0*,
    where Q0 = (1 + k²)(Qu + F0 Qm F0* + L0 Qℓ L0* + Pd Qd Pd*)
    and Q1 = Qu + P0 Qd Pd*.
iii. Q0 is positive definite and onto by assumption, so it is
     also 1-1. Let T: (H, E) -> (H, E) be the left
     factorization of Q0 such that (a) Q0 = T T*,
     (b) E^t x = 0 iff E^t T x = 0, for all real numbers t.
     T^-1 exists and is causal by the same argument as in the
     filtering.
iv. Qe + Q(Ps r)
      = P R T T* (P R)* - P R Q1* - Q1 (P R)* + Qu + P0 Qd P0*
      = [P R T - Q1 (T*)^-1][T* (P R)* - T^-1 Q1*]
        + Qu + P0 Qd P0* - Q1 (T*)^-1 T^-1 Q1*
      = [P R T - Q1 (T*)^-1][P R T - Q1 (T*)^-1]*
        + Qu + P0 Qd P0* - Q1 Q0^-1 Q1*.
    Qu, P0 Qd P0* and Q1 Q0^-1 Q1* are positive and independent
    of C. Therefore, to minimize Qe + Q(Ps r) is the same as to
    minimize Q(P R) = [P R T - Q1 (T*)^-1][P R T - Q1 (T*)^-1]*.
V. P R = P C S and Q^ are Hilbert-Schmidt, So Q(P R) is
nuclear. By Lemma 1 to 5 in the filtering, minimal Q(P R)
occurs when P R = B T"\ where B is the causal part of Ql (T*)'\
vi. The next step is to find C. By formula,
    P R = P C S = P C (I_H + F P C)^-1.
    So P C = P R + P R F P C,
    (I_H - P R F) P C = P R,
    P C = (I_H - P R F)^-1 P R.
    So P C0 = (I_H - B T^-1 F)^-1 B T^-1, if the inverse of
    I_H - B T^-1 F exists. C0 denotes an optimal controller.
    P is not necessarily a 1-1 mapping, but it is onto. This
    means there is probably more than one solution for C0.
    But all we are interested in is "a" solution. The
    following shows how we choose to define C0:
    For all x ∈ H, define
    G x = (I_H - B T^-1 F)^-1 B T^-1 x.
    G x ∈ H, so there exists z ∈ K such that P z = G x.
    Let w = z - PN z, where PN is the projection on the
    subspace N = {z ∈ K | P z = 0}, the null space of P.
    Define: C0 x = w = z - PN z.
    a. C0 is well-defined. For if there exist z1, z2 ∈ K such
       that P z1 = P z2 = G x, then P (z1 - z2) = 0. So
       z1 - z2 ∈ N and PN (z1 - z2) = z1 - z2, i.e.,
       z1 - PN z1 = z2 - PN z2 = C0 x.
    b. Let C0 (x1 + x2) = z12 - PN z12, where z12 is such that
       P z12 = G (x1 + x2). So
       P z12 = G x1 + G x2 = P z1 + P z2 = P (z1 + z2), where
       z1, z2 are such that P zi = G xi, i = 1, 2.
       So C0 (x1 + x2) = (z1 + z2) - PN (z1 + z2)
                       = z1 - PN z1 + z2 - PN z2
                       = C0 x1 + C0 x2.
       Similarly, C0 (b x) = b C0 (x), for all real b.
       So C0 is linear.
    c. P C0 x = P (z - PN z) = P z = G x.
    From a, b and c, C0 is a solution of
    P C0 = (I_H - B T^-1 F)^-1 B T^-1.
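The null-space construction of C0 can be mirrored with matrices: take any preimage of G x under an onto P (here via the Moore-Penrose pseudoinverse) and strip its null-space component. A sketch with hypothetical dimensions, where the matrix G stands in for (I_H - B T^-1 F)^-1 B T^-1:

```python
import numpy as np

rng = np.random.default_rng(4)
nH, nK = 3, 5
P = rng.standard_normal((nH, nK))   # onto (full row rank), not 1-1
G = rng.standard_normal((nH, nH))   # stands in for the optimal map

# Projection PN onto N = null(P).
_, _, Vt = np.linalg.svd(P)
N  = Vt[nH:].T                      # orthonormal basis of null(P)
PN = N @ N.T

# Columnwise preimages Z with P @ Z = G, then strip null-space parts.
Z  = np.linalg.pinv(P) @ G
C0 = Z - PN @ Z

assert np.allclose(P @ C0, G)       # C0 solves P C0 = G
assert np.allclose(PN @ C0, 0)      # no null-space component left
```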
As pointed out in the context of Wiener-Hopf filtering, if we
relax the optimization criterion by minimizing the memoryless part
of Qe + Q(Ps r) instead of the operator itself, the requirement of
being Hilbert-Schmidt can be replaced by the requirement of being in
S. The same solution follows.
CHAPTER VII
CONCLUSION
In this dissertation, we have developed Banach resolution space
and factorization theorems for operators that map from reflexive Banach
spaces to their duals. These fundamental definitions and theorems are
then directly applied to the study of scattering operators and Banach
space valued random variables. As yet, however, we do not have
complete theories in these two areas. We then come to the subject of
Wiener-Hopf filtering. According to the development of this
dissertation, Wiener-Hopf filtering should have been formulated in reflexive
Banach resolution space. However, there are still problems to be
solved in Banach resolution spaces, e.g., operator decomposition.
Therefore, the Wiener-Hopf filtering is instead formulated in Hilbert
resolution space. In this formulation, the method of taking the trace
of a nuclear operator has been used to get rid of the cross terms
involved in the variance operator. We have mentioned that we could
take the memoryless part of the variance operator to cancel the cross
terms at the expense of accepting a less restrictive optimization
criterion — optimizing the memoryless part of an operator instead of
the operator itself. This is important in two ways. First, it allows
us to work on a more general class of operators which guarantees the
decomposition. The advantage of working with S is that S is an ideal
in the space of compact operators. This property allows us to make
assumptions directly concerning the noise, the signal and the filter
instead of only the variance operator of the error. Second, it makes
possible the generalization of the formulation in reflexive Banach
resolution space, because the term "trace" does not make sense for
operators which map from reflexive Banach spaces to their duals whereas
the memoryless part of an operator is meaningful. The only problem
left is the unique existence of the memoryless part of an operator.
This will, presumably, be a future development in Wiener-Hopf filtering.
With the results from filtering, we then look into the problem of
Wiener-Hopf control. What we have done is to improve the performance
measure by using an optimal controller in a feedback and feedforward
control system. However, not much effort has been devoted to studying
the stabilization of an unstable system by the Wiener-Hopf technique.
This will also be a future direction of research for Wiener-Hopf
control theory.
LIST OF REFERENCES
1. Saeks, R., Resolution Space, Operators and Systems, New York, Springer-Verlag, 1973.
2. Cooper, G.R. and McGillem, C.D., Probabilistic Methods of Signal and System Analysis, New York, Holt, Rinehart and Winston, Inc., 1971.
3. Parthasarathy, K.R., Probability Measures on Metric Spaces, Academic Press, 1967.
4. Balakrishnan, A.V., Introduction to Optimization Theory in a Hilbert Space, New York, Springer-Verlag, 1971.
5. Bachman, G. and Narici, L., Functional Analysis, New York, Academic Press, 1966.
6. Chobanian, S.A., "On a Class of Functions of a Banach Space Valued Stationary Stochastic Process", Sakharth SSR Mecn. Acad. Moambe 55, pp. 21-24, 1969.
7. Chobanian, S.A., "On Some Properties of Positive Operator-Valued Measures in Banach Spaces", Sakharth SSR Mecn. Acad. Moambe 57, pp. 273-276, 1970.
8. Vakhania, N.N., "The Covariance of Random Elements in a Banach Space", Thbilis Sanelmc. Univ. Gamoqeneb. Math. Inst. 2 (1969), pp. 179-184.
9. Masani, P., "An Explicit Treatment of Dilation Theory", published notes.
10. Rohrer, R.A., "The Scattering Matrix: Normalized to Complex n-port Load Networks", IEEE Trans, on Circuit Theory, Vol. CT-12, pp. 223-230, 1965.
11. Luenberger, D.G., Optimization by Vector Space Methods, New York, J. Wiley and Sons, 1969.
12. Saeks, R., "Reproducing Kernel Resolution Space and its Applications", Journal of the Franklin Institute, Vol. 302, No. 4, pp. 331-355, Oct. 1976.
13. Kailath, T. and Duttweiler, D., "An RKHS Approach to Detection and Estimation Problems-Part III: Generalized Innovations and a Likelihood-ratio Formula", IEEE Trans. Info. Thy., Vol. IT-18, pp. 730-745, 1972.
14. DeSantis, R.M., "Causality, Strict Causality and Invertibility for Systems in Hilbert Resolution Spaces", SIAM J. Control, Vol. 12, No. 3, Aug. 1974.
15. DeSantis, R.M., "Causality Structure of Engineering Systems", Ph.D. Thesis, University of Michigan, Sept. 1971.
16. Duttweiler, D., "Reproducing Kernel Hilbert Space Techniques for Detection and Estimation Problems", Ph.D. Thesis, Stanford Univ., 1970.
17. Grenander, U., Probabilities on Algebraic Structures, Wiley, New York, 1963.
18. Loève, M., "Sur les fonctions aléatoires stationnaires du second ordre", Rev. Sci., Vol. 83, pp. 297-310, 1945.
19. Aronszajn, N., "Theory of Reproducing Kernels", Trans. Amer. Math. Soc., Vol. 68, pp. 337-404, 1950.
20. Porter, W.A., "Some Circuit Theory Concepts Revisited", Int. J. on Control, Vol. 12, pp. 443-448, 1970.
21. Schnure, W.K., "Controllability and Observability Criteria for Causal Operators on a Hilbert Resolution Space", Proc. of the 14th MSCT, Univ. of Denver, 1971.
22. Gohberg, I.C. and Krein, M.G., Introduction to the Theory of Linear Nonselfadjoint Operators, American Mathematical Society, 1969.
23. Youla, D.C., Jabr, H.A. and Bongiorno, J.J., Jr., "Modern Wiener-Hopf Design of Optimal Controllers-Part II: The Multivariable Case", IEEE Trans. on Automatic Control, Vol. AC-21, No. 3, June 1976.
24. Yosida, K., Functional Analysis, Springer, Heidelberg, 1966.