An Always Convergent Method for
Approximating the Spectral Radius of a
Non-Negative Matrix, With Particular
Reference to a Leontief Input-Output
System.
By
Robert James Wood
Submitted for the Degree of
Doctor of Philosophy
Charles Sturt University
Bathurst
2009
CERTIFICATE OF AUTHORSHIP OF THESIS
& AGREEMENT FOR THE RETENTION &
USE OF THE THESIS
DOCTORAL AND MASTER BY RESEARCH
APPLICANTS
To be completed by the student for submission
with each of the bound copies of the thesis
submitted for examination to the Centre of
Research & Graduate Training.
For duplication purpose, please TYPE or PRINT
on this form in BLACK PEN ONLY.
Please keep a copy for your own records.
I
Robert James Wood
Hereby declare that this submission is my own work and that, to the best of my knowledge
and belief, it contains no material previously published or written by another person nor
material which to a substantial extent has been accepted for the award of any other degree
or diploma at Charles Sturt University or any other educational institution, except where due
acknowledgment is made in the thesis. Any contribution made to the research by colleagues
with whom I have worked at Charles Sturt University or elsewhere during my candidature is
fully acknowledged.
I agree that the thesis be accessible for the purpose of study and research in accordance with
the normal conditions established by the University Librarian for the care, loan and
reproduction of the thesis.*
Signature Date
* Subject to confidentiality provisions as approved by the University
Acknowledgements
I am indebted to a number of people who have assisted me in this research, and I
would like to extend my sincere gratitude to:
Michael O’Neill, my former lecturer, colleague, and now supervisor, whose
influence and assistance have been greatly appreciated. If it weren’t for his teaching and
influence throughout my career, I don’t think I would have got to this point. For this I will
be forever grateful.
David Tien, my colleague and assistant supervisor, whose kind words were
always appreciated.
Gene Golub (unfortunately now deceased), whose kind words at conferences
encouraged me so much.
And to the most important people in my life, my wife Louise, and children
Kathryn, Hamish and Rosalie, who have been a constant support throughout the period of
this research.
Abstract
The Leontief input-output model is used to forecast various effects that can occur in an
industry as it interacts with other industries under changing conditions within an
economy. It is because of this valuable ability to forecast various effects (interaction)
that the model has become very widely used. Users of the model need to be aware of the
conditions under which a unique solution exists for the system of input-output equations,
and also when this system is ill-conditioned. The Hawkins-Simon and Spectral Radius
conditions have been previously established and guarantee uniqueness. New proofs are
presented for these two conditions and their equivalence is established. Several useful
bounds are developed for the condition number of the system of input-output equations.
An always convergent method for approximating the spectral radius of an input-output
matrix is developed. The spectral radius of a matrix is the largest eigenvalue of the
matrix in absolute value. The more general problem of approximating the spectral radius
of a general non-negative matrix is considered and an always convergent method is
developed for this problem.
TABLE OF CONTENTS
Acknowledgements ............................................................................................................. 3
Abstract ............................................................................................................................... 4
CHAPTER 1 ......................................................................................................................... 7
INTRODUCTION ................................................................................................................. 7
1.1 Background to the Research ...................................................................................... 7
1.2 Motivation for the Research: The Leontief Input-Output Model ............................ 7
1.3 Aim of this Thesis ..................................................................................................... 10
1.4 Literature Review. .................................................................................................... 11
1.4.1 Irreducibility. ......................................................................................................... 12
1.4.2 Perron-Frobenius Theorem and its Variant. ....................................... 14
1.4.3 Hawkins-Simon Condition. ................................................................................... 14
1.4.4 The positivity of F. ................................................................................................. 15
1.4.5 The Spectral Radius Condition. ............................................................................ 15
1.4.6 The Power Method and Inverse Power Method. ................................................ 15
1.4.7 The Bounds of Collatz. ........................................................................................... 17
1.4.8 Condition Number of a Linear System of Equations. .......................................... 18
1.4.9 Another Bound for the Spectral Radius of a Non-Negative Matrix.................... 18
1.4.10 The QR Iteration .................................................................................................. 19
1.4.11 Orthogonal and Simultaneous Iteration. ........................................................... 20
1.4.12 Krylov Subspace Methods. .................................................................................. 21
1.4.13 M-Matrices. .......................................................................................................... 22
1.5 Chapter organisation and objectives. ..................................................................... 22
1.6 Original Research ..................................................................................................... 25
CHAPTER 2 ..................................................................................................................... 27
AN ALTERNATIVE PROOF OF THE HAWKINS-SIMON CONDITION .................. 27
2.1 Introduction .............................................................................................................. 27
2.2 Leontief Input-Output System ................................................................................. 27
2.3 Hawkins-Simon Condition ....................................................................................... 28
2.4 Computational Aspects of the Hawkins-Simon Condition .................................... 34
2.5 Conclusion ................................................................................................................. 41
CHAPTER 3 ..................................................................................................................... 43
USING THE SPECTRAL RADIUS TO DETERMINE WHETHER A LEONTIEF
SYSTEM HAS A UNIQUE POSITIVE SOLUTION ...................................................... 43
3.2 The Spectral Radius Condition ................................................................................ 46
3.3 Testing for the Spectral Radius Condition using Bounds ...................................... 48
3.4 Testing for the Spectral Radius Condition by Computing ρ(A) ........................... 50
3.5 Conclusion ................................................................................................................. 59
CHAPTER 4 ..................................................................................................................... 63
A NEW METHOD FOR CALCULATING THE SPECTRAL RADIUS OF AN ........... 63
INPUT-OUTPUT MATRIX ............................................................................................. 63
4.1 Conditions for a Unique, Positive Solution of the Model ....................................... 63
4.2 Bounds on the Spectral Radius of an Irreducible Non-Negative Matrix .............. 63
4.3 Bounds on the Spectral Radius of any Non-Negative Matrix ................................ 70
4.4 Rate of Convergence ................................................................................................. 73
4.5 A New and Improved Method .................................................................................. 75
4.6 Condition of the Dominant Eigenvalue ................................................................... 78
4.7 Conclusion ................................................................................................................. 79
CHAPTER 5 ....................................................................................................................... 82
AN ALWAYS CONVERGENT METHOD FOR FINDING THE SPECTRAL RADIUS OF AN IRREDUCIBLE NON-NEGATIVE MATRIX ....................................................................... 82
5.1 Introduction .............................................................................................................. 82
5.2 The Power Method ................................................................................................... 82
5.3 Techniques for a Reducible Matrix ......................................................................... 86
5.4 Applying the Method of Collatz ............................................................................... 88
5.5 Conclusion ................................................................................................................. 92
CHAPTER 6 ....................................................................................................................... 93
A FASTER ALGORITHM FOR IDENTIFICATION OF AN M-MATRIX ............................. 93
6.1 Introduction .............................................................................................................. 93
6.2 What is an M-matrix? ............................................................................................... 93
6.3 Computational Aspects ............................................................................................ 97
6.4 Conclusion ............................................................................................................... 102
CHAPTER 7 ..................................................................................................................... 103
FINDING THE SPECTRAL RADIUS OF A LARGE SPARSE NON-NEGATIVE MATRIX 103
7.1 Introduction ............................................................................................................ 103
7.2 The Method of Collatz ............................................................................................ 103
7.3 The Arnoldi Method ............................................................................................... 104
7.4 Other Methods ........................................................................................................ 106
7.5 Comparison of the methods ................................................................................... 107
7.6 Practicalities for the Method of Collatz. ................................................................ 108
7.7 Conclusion ............................................................................................................... 111
CHAPTER 8 ..................................................................................................................... 112
ESTIMATING THE CONDITION NUMBER OF A LEONTIEF SYSTEM ............... 112
8.1 Introduction ............................................................................................................ 112
8.2 A Bound Using the Row Norm ............................................................................... 113
8.3 A Bound Using the Spectral Radius ....................................................................... 115
8.4 A More Practical Bound ......................................................................................... 118
8.5 The Best Possible Constants .................................................................................. 120
8.6 Conclusion ............................................................................................................... 121
Chapter 9 ........................................................................................................................ 124
AN UPPER BOUND FOR THE SPECTRAL RADIUS OF A NON-NEGATIVE MATRIX .. 124
9.1 Introduction ............................................................................................................ 124
9.2 A New Proof of an Established Bound .................................................................. 124
9.3 Conclusion ............................................................................................................... 136
Chapter 10 ...................................................................................................................... 137
CONCLUSION.................................................................................................................. 137
10.1 Introduction .......................................................................................................... 137
PUBLICATIONS .............................................................................................................. 140
BIBLIOGRAPHY .............................................................................................................. 141
CHAPTER 1
INTRODUCTION
1.1 Background to the Research
This research is an extension of my master’s degree thesis, titled “Research and
Development and Sensitivity Analysis within the Input-Output Model”, University of
Technology, Sydney 1996. As implied by the title of the master’s thesis it was based on
the input-output model in mathematical economics, looked at sensitivity analysis within
the input-output model, and included some methods for determining when the input-
output model has a unique positive solution. This current thesis goes beyond the above
application and considers some wider mathematical questions, in particular issues
concerning the spectral radius of a general non-negative matrix. The spectral radius of a
matrix is the largest eigenvalue of the matrix in absolute value.
1.2 Motivation for the Research: The Leontief Input-Output Model
The input-output model was developed by Wassily Leontief during the 1930s, work for
which he won the Nobel Prize for Economics in 1973. It is often referred to as the Leontief Input-
Output Model. It was designed to forecast/measure the interflow of goods and services
that can occur within an industry as it interacts with other industries under changing
conditions in the economy. For example, the model can measure the complex input-
output transactions involved in the manufacture of a motor car involving direct purchases
of sheet metal. Through this direct purchase, the motor industry will, however, also have
been indirectly responsible for the purchase of iron ore, limestone, coke, electricity and
other supplies which are required in the making of sheet metal. The most convenient
method for representing these inter-industry transactions is via an input-output table,
which records the production and disposal of the products of an economic system for a
particular time period. The table is basically a quantitative snapshot of the economy in
question, capturing its essential features during the period of interest.
If we imagine an economy to consist of n industries each producing a single product, the
Leontief input-output model determines how much each industry should produce to meet
inter-industry* demands among the n industries and m final demands for their product.
The following table shows how much in dollars ($) a given industry buys from and sells to
other industries, how much it sells to final demand**, and how much it buys from primary
inputs***.
TABLE 1
Input-Output Transaction Table ($)

                                Buying Industries(a)            Final Demand(b)          Total(d)
                                1     2    ...    n             1     2    ...    m
Selling              1          x11   x12  ...    x1n           f11   f12  ...    f1m    X1
Industries           2          x21   x22  ...    x2n           f21   f22  ...    f2m    X2
                     .          .       (Q1)      .             .       (Q2)      .      .
                     n          xn1   xn2  ...    xnn           fn1   fn2  ...    fnm    Xn
Primary              1          P11   P12  ...    P1n
Inputs(c)            2          P21   P22  ...    P2n
                     .          .       (Q3)      .
                     k          Pk1   Pk2  ...    Pkn
Total(d)                        X1    X2   ...    Xn

* Inter-industry demands are represented in Q1 in the above table. This quadrant shows how much each
industry buys from or sells to other industries, and is referred to as the intermediate quadrant.
** Final demands are demands placed on the output of each industry, from outside the intermediate
quadrant. Examples of this are household consumption, exports and capital formation. It is the quadrant
marked Q2, and is referred to as the final demand quadrant.
*** Primary inputs are inputs which originate outside the intermediate quadrant. Examples of primary
inputs are salaries and wages (payments to employees), imports, and payments to governments in the form of
indirect taxes. It is the quadrant marked Q3, and is referred to as the primary inputs quadrant.

(a) x_ij = value of product sold by industry i to industry j
(b) f_ij = value of product sold by industry i to final demand j
(c) P_ij = value of sales by primary input i to industry j
(d) X_j = total output of industry j

Also, for this input-output system, X_j = total input to industry j.
This table is called the transaction table and is normally constructed in dollars.
At this point it is worth noting that no X_j = 0 (j = 1, ..., n), as this would mean that industry j
does not produce any product to sell to other industries or to satisfy final demand. X_j = 0
would also indicate that no inputs are required for industry j to operate. Therefore industry j
would be excluded from consideration because it has no influence on the economy in
question.
From the transaction table we can write down the following system of equations:

x_11 + x_12 + ... + x_1n + f_11 + ... + f_1m = X_1
x_21 + x_22 + ... + x_2n + f_21 + ... + f_2m = X_2
    .
    .                                                 (1)
    .
x_n1 + x_n2 + ... + x_nn + f_n1 + ... + f_nm = X_n
Let a_ij = x_ij / X_j.

From this it can be seen that

Σ_{i=1}^{n} a_ij ≤ 1,

since a_ij is calculated by dividing each x_ij element in a column of the input-output
transaction table by the total of that column. (Note further that if an industry requires
primary inputs then Σ_{i=1}^{n} a_ij < 1.)
The system of equations (1) can then be rewritten as:

a_11 X_1 + a_12 X_2 + ... + a_1n X_n + f_11 + ... + f_1m = X_1
a_21 X_1 + a_22 X_2 + ... + a_2n X_n + f_21 + ... + f_2m = X_2
    .
    .                                                 (2)
    .
a_n1 X_1 + a_n2 X_2 + ... + a_nn X_n + f_n1 + ... + f_nm = X_n
Now writing

A = [ a_11  a_12  ...  a_1n ]
    [ a_21  a_22  ...  a_2n ]
    [  .     .          .   ]
    [ a_n1  a_n2  ...  a_nn ],

F = [ F_1 ]   [ f_11 + f_12 + ... + f_1m ]
    [ F_2 ] = [ f_21 + f_22 + ... + f_2m ]
    [  .  ]   [            .             ]
    [ F_n ]   [ f_n1 + f_n2 + ... + f_nm ]

and

X = [ X_1 ]
    [ X_2 ]
    [  .  ]
    [ X_n ],

the system (2) becomes

A X + F = X, or

(I - A) X = F,    (3)

where I is the n by n identity matrix.
This is the system of linear equations that models the Leontief input-output system and has
the solution
X = (I - A)^{-1} F, provided (I - A)^{-1} exists.    (4)
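In practice the solution (4) is computed by solving the linear system (3) directly rather than by forming the inverse explicitly. A minimal sketch with an illustrative 2-industry matrix:

import numpy as np

A = np.array([[0.2, 0.3],
              [0.4, 0.1]])   # illustrative input-output matrix, rho(A) < 1
F = np.array([100., 200.])   # final demand vector

# Solve (I - A) X = F; this is cheaper and more accurate than
# explicitly computing (I - A)^{-1} and multiplying.
X = np.linalg.solve(np.eye(2) - A, F)
print(X)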
1.3 Aim of this Thesis
The research conducted for my master’s thesis investigated conditions under which
(I - A)^{-1} exists. It is known that the spectral radius of the matrix A can be used to
determine whether the system of equations represented by the input-output system has a
unique, positive solution. This is just the well-known result that (I - A)^{-1} exists if
ρ(A) < 1, where ρ(A) represents the spectral radius of A (see, for example, Atkinson
(1989, p 491)). In the input-output context A is a non-negative matrix. A non-negative
matrix is a matrix in which all the elements are greater than or equal to zero; a
positive matrix is a matrix with all elements greater than zero. Thus methods for finding
the spectral radius of a non-negative matrix are relevant in the input-output context.
The research for this thesis had as its aim to find an always-convergent method for
approximating the spectral radius of more general non-negative matrices, not just
those that arise in the input-output context. This has been achieved. The research
has also provided an alternative proof to the Hawkins-Simon condition, after
Georgescu-Roegen noted that the original Hawkins-Simon proof contained a
fallacious continuity argument.
An alternative proof has been provided to Levinger’s Theorem concerning the
relationship between ρ(A) and ρ((A + A^T)/2). The published proof by Berman and
Plemmons required a further qualification.
A new formula for estimating the condition number of a Leontief System has also
been derived.
1.4 Literature Review.
The motivation for finding an always-convergent method for approximating the spectral
radius of a non-negative matrix is its applicability not only to the Leontief model, as
detailed earlier, but to many other areas: Google’s PageRank, see Golub and Greif (2006)
and Andersson and Ekstrom (2004); genetics and population growth, see Saad (2003);
the harvesting of animal populations; energy consumption, see Slesser (1978);
environmental modelling, see Wagershals (2007); and determining the rate of
convergence of an iterative method for solving a system of linear equations, see Young
(1971). As detailed above, calculating the spectral radius of a matrix is useful in a number
of applications. The author’s particular interest has been the area of Mathematical
Economics known as Leontief Input-Output Analysis, where finding the spectral radius
of a non-negative matrix is an important technique in verifying that a linear system has a
unique positive solution (see Chapter 3). Other applications include finding the roots of a
polynomial equation using a companion matrix, determining whether a Hessian matrix is
positive definite, finding the first few eigenvalues of a covariance matrix (an important
statistical technique in factor analysis), calculating the 2-norm of a matrix
( ||A||_2 = √(ρ(A^T A)) ), and verifying that a matrix is convergent (i.e. lim_{n→∞} A^n = 0,
which occurs if and only if ρ(A) < 1). It is acknowledged that some of the applications
listed above do not always involve non-negative matrices but, when they do, the technique
is appropriate.
The following well-known ideas and known results are integral to this thesis, and form
the basis of its conclusions.
1.4.1 Irreducibility.
The concept of irreducibility was first introduced by Frobenius in 1912.
An n×n matrix A is said to be reducible if there exists a permutation matrix P such
that

P^T A P = [ B  C ]
          [ 0  D ],

where B and D are square matrices and 0 is a zero matrix (not necessarily square).
Otherwise it is irreducible. Irreducibility of a matrix can be determined from its
associated directed graph G(A). If the directed graph is traversable, that is, there is a path
from any one vertex to any other, then the matrix A is irreducible. Connectivity from
vertex i to vertex j is indicated in the matrix A if a_ij > 0.
Note: If A is an n×n matrix, then the (i,j)th element of A^2 is

a_i1 a_1j + a_i2 a_2j + ... + a_in a_nj.

Provided A is non-negative and a_ik a_kj > 0 for some k, the (i,j)th element of A^2
is positive and there exists a two-step path from vertex i to vertex j.

Similarly, if the (i,j)th element of A^3 is positive there exists a 3-step path from vertex i
to vertex j, and, generally, if the (i,j)th element of A^{n-1} is positive there is an (n-1)-
step path from vertex i to vertex j. Without retracing steps, the longest possible path
from vertex i to vertex j is of length (n-1), when i ≠ j. So if all off-diagonal
elements of A + A^2 + ... + A^{n-1} are positive then it is possible to find a path from any one
vertex to any other.

According to Varga (1962, p20), if I + A + A^2 + ... + A^{n-1} > 0 then it is possible to go from
any one vertex to any other vertex; hence the graph G(A) is strongly connected and the
matrix A is irreducible.
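Varga's criterion translates directly into a simple numerical test: form I + A + A^2 + ... + A^{n-1} (only the zero pattern of A matters) and check that every entry is positive. A sketch of this test (not optimised; for large sparse matrices a graph-connectivity search is far cheaper):

import numpy as np

def is_irreducible(A):
    """Test irreducibility of a non-negative matrix via the positivity
    of I + A + A^2 + ... + A^(n-1) (Varga (1962, p20))."""
    n = A.shape[0]
    B = (A != 0).astype(float)   # only the zero pattern matters
    S = np.eye(n)
    P = np.eye(n)
    for _ in range(n - 1):
        P = P @ B                # P accumulates B^k
        S += P
    return bool(np.all(S > 0))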
Irreducible matrices can be categorized as primitive or cyclic. An irreducible matrix is
cyclic when it has more than one eigenvalue equal in magnitude to its spectral radius. A
primitive matrix has a unique eigenvalue equal in magnitude to its spectral radius.
1.4.2 Perron-Frobenius Theorem and its Variant.
Let A ≥ 0 be an irreducible n×n matrix. Then,
i) A has a positive, simple real eigenvalue equal to the spectral radius ρ(A).
ii) the eigenvector corresponding to ρ(A) is positive.
iii) if any element of A increases, then so does ρ(A).
This is the Perron-Frobenius Theorem, a proof of which is given in Varga (1962, p 30).
The following is a variant of the Perron-Frobenius Theorem, a proof of which is given in
Varga (1962, p 46).
Let A ≥ 0 be an n×n matrix. Then,
i) A has a non-negative real eigenvalue equal to its spectral radius. Moreover,
this eigenvalue is positive unless A is reducible and the normal form of A is
strictly upper triangular.
ii) the eigenvector corresponding to ρ(A) is non-negative.
iii) if any element of A increases, then ρ(A) does not decrease.
1.4.3 Hawkins-Simon Condition.
Hawkins and Simon (1949) presented the condition known as the Hawkins-Simon
condition as a necessary and sufficient condition to verify that an input-output system had
a unique positive solution. It stated that “A necessary and sufficient condition that X
satisfying (4) be all positive is that all principal minors of the matrix (I - A) be positive”.
The principal minors are the determinants of the square sub-matrices of a given matrix.
For a large matrix, the Hawkins-Simon condition can be computationally intensive.
Georgescu-Roegen (1966, ch.9) showed that "the Hawkins-Simon proof contained a
fallacious continuity argument."
1.4.4 The positivity of F.
Miller and Blair (1985, p 36) and Jensen and West (1986) claim that a unique positive
solution of

(I - A) X = F

is obtained provided F_i ≥ 0 (i = 1, ..., n) and that not all F_i = 0. However, having some
zero and non-zero elements in F can sometimes lead to a degenerate case. Previous
authors do not appear to have considered this degenerate case. Implications of this are
examined later in section 3.1.
1.4.5 The Spectral Radius Condition.
Debreu and Herstein (1953) and Varga (1962, p 82) show that if the spectral radius ρ(A)
of matrix A is less than one, then (I - A)^{-1} exists. Implications of this are
examined in Chapters 3 and 4.
1.4.6 The Power Method and Inverse Power Method.
The Power Method has proved to be important in this research, and is the basis of many
other methods such as the Inverse Power Method. (See Wilkinson (1965)). The Power
Method seems to have been originally used by Lord Rayleigh early in his career in an
1873 work in which he needed to approximate the normal modes of a complex vibrating
system. He subsequently used it and related methods in his classic text “The Theory of
Sound” (1877).
The main use of the Power Method is to find the dominant eigenvalue of a matrix. The
method is as follows:
Suppose an n×n matrix A has n eigenvalues λ_1, λ_2, ..., λ_n with corresponding linearly
independent eigenvectors x_1, x_2, ..., x_n. Assume that one eigenvalue (say λ_1) is largest in
absolute value.

Next, choose an arbitrary vector q_0, which will of necessity be a linear combination of the
set of eigenvectors, i.e.

q_0 = α_1 x_1 + α_2 x_2 + ... + α_n x_n.

Then

q_1 = A q_0 = α_1 A x_1 + α_2 A x_2 + ... + α_n A x_n
            = α_1 λ_1 x_1 + α_2 λ_2 x_2 + ... + α_n λ_n x_n, since A x_i = λ_i x_i,

q_2 = A q_1 = A^2 q_0 = α_1 λ_1^2 x_1 + α_2 λ_2^2 x_2 + ... + α_n λ_n^2 x_n,

and, in general,

q_m = A^m q_0 = λ_1^m [ α_1 x_1 + Σ_{j=2}^{n} α_j (λ_j / λ_1)^m x_j ].

Since |λ_j / λ_1| < 1 for all j = 2, 3, ..., n,

lim_{m→∞} (λ_j / λ_1)^m = 0.

Hence,

R = lim_{m→∞} (q_m^T A q_m) / (q_m^T q_m)
  = lim_{m→∞} ( (λ_1^m α_1 x_1)^T (λ_1^{m+1} α_1 x_1) ) / ( (λ_1^m α_1 x_1)^T (λ_1^m α_1 x_1) )
  = λ_1.

This quantity R is known as the Rayleigh Quotient. Difficulties can occur with the
method if α_1 = 0 or if A does not have a complete set of linearly independent
eigenvectors. These difficulties are discussed further in Chapter 3.
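A minimal sketch of the Power Method as described above, using the Rayleigh Quotient as the eigenvalue estimate (the tolerance, iteration cap and starting vector are illustrative choices):

import numpy as np

def power_method(A, tol=1e-10, max_iter=1000):
    """Approximate the dominant eigenvalue and eigenvector of A."""
    q = np.ones(A.shape[0])          # arbitrary starting vector
    lam = 0.0
    for _ in range(max_iter):
        z = A @ q
        q = z / np.linalg.norm(z)    # normalise to avoid overflow/underflow
        lam_new = q @ (A @ q)        # Rayleigh Quotient (q has unit norm)
        if abs(lam_new - lam) < tol:
            break
        lam = lam_new
    return lam, q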
The Inverse Power Method, which was proposed by Wielandt (1944), is a variation of the
Power Method which substitutes (A - λ*I)^{-1} for the A of the Power Method, where λ*
is a shift. This allows one to detect the eigenvalue closest to λ*. Convergence is then to
1/(λ_1 - λ*), from which λ_1 can be calculated. The rate of convergence for the Inverse
Power Method is |λ_1 - λ*| / |λ_2 - λ*| (see Atkinson (1989, p628ff)), and with λ* closer
to λ_1 than to λ_2, it generally has an improved rate of convergence when compared to
the Power Method, which has the convergence rate |λ_2 / λ_1|.
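A corresponding sketch of the Inverse Power Method; in a serious implementation one would factorise A - λ*I once and reuse the factorisation, but plain solves keep the idea visible:

import numpy as np

def inverse_power_method(A, shift, tol=1e-10, max_iter=1000):
    """Approximate the eigenvalue of A closest to the shift (lambda*)."""
    n = A.shape[0]
    M = A - shift * np.eye(n)
    q = np.ones(n)
    mu = np.inf
    for _ in range(max_iter):
        z = np.linalg.solve(M, q)           # apply (A - shift*I)^{-1}
        q = z / np.linalg.norm(z)
        mu_new = q @ np.linalg.solve(M, q)  # tends to 1/(lambda_1 - shift)
        if abs(mu_new - mu) < tol:
            break
        mu = mu_new
    return shift + 1.0 / mu, q              # recover lambda_1 from mu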
1.4.7 The Bounds of Collatz.
Collatz (1942) established an upper and a lower bound for the spectral radius of a positive
matrix, which is necessarily irreducible. The bounds are as follows:

Let A ≥ 0 be an n×n irreducible matrix and q_0 be an arbitrary positive n-dimensional
vector. Defining q_1 = A q_0, let

λ_0 = min_{1≤i≤n} ( q_1^(i) / q_0^(i) )   and   Λ_0 = max_{1≤i≤n} ( q_1^(i) / q_0^(i) ),

where the superscript (i) represents the ith component of a vector. Then, denoting the
spectral radius of A by ρ(A),

λ_0 ≤ ρ(A) ≤ Λ_0.

Wielandt (1950) extended the result to non-negative irreducible matrices, not just positive
matrices. The bounds of Collatz are examined further in Chapter 4.
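These bounds cost only one matrix-vector product. A sketch, assuming A ≥ 0 is irreducible and q is positive componentwise (the example matrix is illustrative; its spectral radius is 3):

import numpy as np

def collatz_bounds(A, q):
    """Collatz bounds: min and max of the componentwise ratios (Aq)_i / q_i."""
    ratios = (A @ q) / q
    return ratios.min(), ratios.max()   # lower <= rho(A) <= upper

A = np.array([[0., 2.],
              [3., 1.]])
lo, hi = collatz_bounds(A, np.ones(2))
print(lo, hi)   # 2.0 <= rho(A) = 3 <= 4.0; iterating q <- Aq tightens the bounds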
1.4.8 Condition Number of a Linear System of Equations.
In input-output analysis it is necessary to solve a system of linear equations. Knowing the
condition of the system provides an estimate of the accuracy of the final solution and also
indicates the precision of the arithmetic which must be used to compute the solution.
Stewart (1973) defines the condition number of the matrix A as

κ(A) = ||A|| ||A^{-1}||.

In Chapter 8 this result is used to establish a new bound for the condition number of an
input-output system.
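As a small illustration (the 2×2 matrix is hypothetical), the condition number in a given norm can be computed directly for the Leontief coefficient matrix:

import numpy as np

A = np.array([[0.2, 0.3],
              [0.4, 0.1]])
M = np.eye(2) - A   # the Leontief coefficient matrix

# kappa(M) = ||M|| * ||M^{-1}||, here in the infinity (row) norm.
kappa = np.linalg.norm(M, np.inf) * np.linalg.norm(np.linalg.inv(M), np.inf)
print(kappa)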
1.4.9 Another Bound for the Spectral Radius of a Non-Negative Matrix.
The problem of approximating the spectral radius of a non-negative matrix is of importance
where iterative processes require an initial estimate of the spectral radius. The best-known
bounds are the 1-norm, 2-norm and ∞-norm, see Stewart (1973). However, other well-
known bounds are due to Frobenius (1912) and Minc (1988) for non-negative matrices, and
Ledermann (1950), Ostrowski (1952), and Brauer (1957) for positive matrices.

Levinger (1969) in the Notices of the American Mathematical Society posed the
following proposition: ρ((A + A^T)/2) is always greater than or equal to the spectral radius
of A, where A ≥ 0. Levinger later proved this result, as did a number of other
mathematicians. The solutions by Deutsch and Walsh were published in the Notices of
the American Mathematical Society in 1970. This result is important because finding the
eigenvalues of a symmetric matrix is a better-conditioned problem than finding
the eigenvalues of an unsymmetric matrix. Berman and Plemmons (1979) consider this
same problem and state further that the equality ρ((A + A^T)/2) = ρ(A) holds if and only
if A and A^T have a common eigenvector corresponding to the spectral radius. A full
examination of the Theorems by Levinger and Berman and Plemmons is presented in
chapter 9.
1.4.10 The QR Iteration
The generally accepted method for finding all the eigenvalues of a matrix is the QR
algorithm, which was developed by Francis (1961). The basis of the algorithm is as
follows;
An n×n real matrix A can be factorised in the form

A = QR,

where Q is orthogonal and R is upper triangular. The computation of Q and R
proceeds as follows: orthogonal matrices P_1, P_2, ..., P_{n-1} (e.g. Householder reflectors) are
constructed so that P_{n-1} P_{n-2} ... P_2 P_1 A = R is upper triangular. If we let

Q^T = P_{n-1} P_{n-2} ... P_2 P_1,

then

A = QR, since Q is orthogonal (i.e. Q^T = Q^{-1}).
Using the above results, the QR iteration proceeds as follows. Let

A_0 = A.

Then factorise A_0 in the form

A_0 = Q_0 R_0.

Next reverse the product, producing

A_1 = R_0 Q_0 = Q_0^T A_0 Q_0,

where A_1 has the same eigenvalues as A_0. Continuing this process, for k = 1, 2, 3, ...,

A_k = Q_k R_k,
A_{k+1} = R_k Q_k = Q_k^T A_k Q_k,

where successive matrices A_k are similar to each other. If A is in upper-Hessenberg
form, then it can be shown that A_k converges to an upper triangular matrix in general, or
to a diagonal matrix if A is symmetric. The eigenvalues of A are then the diagonal
elements of A_k. The convergence of the QR method can also be accelerated by use of a
shift and associated QR factorisations:

viz. A_k - qI = Q_k R_k,

where q is an approximation to an eigenvalue. Then, define

A_{k+1} = R_k Q_k + qI,

which will have the same eigenvalues as A_k. See Atkinson (1989, p623ff).
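A bare-bones sketch of the unshifted QR iteration; production implementations first reduce A to upper-Hessenberg form and use shifts, as noted above:

import numpy as np

def qr_iteration(A, iters=200):
    """Unshifted QR iteration: A_k tends to (quasi-)triangular form."""
    Ak = A.astype(float).copy()
    for _ in range(iters):
        Q, R = np.linalg.qr(Ak)   # factorise A_k = Q_k R_k
        Ak = R @ Q                # reverse the product: A_{k+1} = R_k Q_k
    return np.diag(Ak)            # eigenvalue approximations (real eigenvalues)

A = np.array([[2., 1.],
              [1., 3.]])
print(qr_iteration(A))            # compare with np.linalg.eigvals(A)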
1.4.11 Orthogonal and Simultaneous Iteration.
Another method that is a generalization of the Power Method is the method of Orthogonal
Iteration. It begins with an initial n×r (1 ≤ r ≤ n) matrix Q_0 with orthonormal columns
and generates Z_1 = A Q_0. The QR factorization of Z_1, viz. Z_1 = Q_1 R_1, produces the next Q
matrix, and the process is repeated to generate Q_2 R_2, Q_3 R_3, etc. It can be shown (see
Golub and Van Loan (1996, p 333)) that the span of the columns of the Q matrix tends to an
invariant subspace containing the eigenvectors of the r most dominant eigenvalues of A.
The method of Orthogonal Iteration can sometimes be very slow if the gap between λ_r and
λ_{r+1} is not sufficiently wide. Accordingly, the method of Simultaneous Iteration is designed
to accelerate the convergence of Orthogonal Iteration by periodically performing a Schur
decomposition of the matrices Q_k^T A Q_k. See Stewart (1976).
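A sketch of Orthogonal Iteration for the r most dominant eigenvalues (the random start and iteration count are illustrative):

import numpy as np

def orthogonal_iteration(A, r, iters=500, seed=0):
    """Approximate the r dominant eigenvalues of A by Orthogonal Iteration."""
    n = A.shape[0]
    rng = np.random.default_rng(seed)
    Q, _ = np.linalg.qr(rng.standard_normal((n, r)))  # orthonormal start Q_0
    for _ in range(iters):
        Z = A @ Q                 # Z_{k+1} = A Q_k
        Q, _ = np.linalg.qr(Z)    # re-orthonormalise: Z_{k+1} = Q_{k+1} R_{k+1}
    # Eigenvalues of the projected r x r matrix approximate the dominant ones.
    return np.linalg.eigvals(Q.T @ A @ Q)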
1.4.12 Krylov Subspace Methods.
Another range of methods for finding the dominant eigenvalue generates an invariant
subspace; these methods are typically used for large sparse matrices. A particular
invariant subspace method is the procedure introduced by Arnoldi (1951), which begins by
building an orthonormal basis {v_1, v_2, ..., v_m} for the subspace K_m, where

K_m = span{ v_1, A v_1, ..., A^{m-1} v_1 }

and v_1 is an arbitrarily chosen vector of norm one. With V_m = [v_1, v_2, ..., v_m], the
so-called Arnoldi factorization

A V_m = V_m H_m + E

is computed. Here H_m is an m×m upper Hessenberg matrix and E = h_{m+1,m} v_{m+1} e_m^T is
a rank-one matrix. See Saad (1992) for further details. For m sufficiently large the
eigenvalues of H_m provide approximations to a set of the eigenvalues of A, among which
is usually the dominant eigenvalue.
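A sketch of the basic Arnoldi process using modified Gram-Schmidt; for large sparse problems one would use sparse matrix-vector products and, in practice, a restarted implementation such as scipy.sparse.linalg.eigs:

import numpy as np

def arnoldi(A, v1, m):
    """Build V_m (orthonormal columns) and upper Hessenberg H_m with
    A V_m = V_m H_m + h_{m+1,m} v_{m+1} e_m^T."""
    n = A.shape[0]
    V = np.zeros((n, m + 1))
    H = np.zeros((m + 1, m))
    V[:, 0] = v1 / np.linalg.norm(v1)
    for j in range(m):
        w = A @ V[:, j]
        for i in range(j + 1):              # orthogonalise against v_1..v_{j+1}
            H[i, j] = V[:, i] @ w
            w = w - H[i, j] * V[:, i]
        H[j + 1, j] = np.linalg.norm(w)
        if H[j + 1, j] == 0.0:              # invariant subspace found early
            return V[:, :j + 1], H[:j + 1, :j + 1]
        V[:, j + 1] = w / H[j + 1, j]
    return V[:, :m], H[:m, :]               # eigenvalues of H_m approximate A's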
1.4.13 M-Matrices.
An n×n matrix A with elements a_ij is said to be an M-matrix if it satisfies the following
properties:

1. a_ii > 0 for i = 1, ..., n
2. a_ij ≤ 0 for i ≠ j, i, j = 1, ..., n
3. A is nonsingular
4. A^{-1} ≥ 0.

This is the usual definition of an M-matrix. However, as most authors point out (see for
example Saad (2003, p 28)), property 1 is redundant, as properties 2, 3 and 4 together
imply 1. Properties 1 and 2 are easily tested, but not so properties 3 and 4, which for
large matrices would typically involve extensive computation. The matrix (I - A) in an
input-output system is an example of such an M-matrix, since typically (I - A) is non-
singular. The identification of M-matrices is further examined in Chapter 6.
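A naive sketch that tests the defining properties directly; Chapter 6 develops a cheaper test, since explicitly inverting A is exactly what one wants to avoid for large matrices:

import numpy as np

def is_m_matrix_naive(A, tol=1e-12):
    """Direct check of the M-matrix definition (expensive for large n)."""
    off_diag = A - np.diag(np.diag(A))
    if np.any(off_diag > tol):                # property 2: a_ij <= 0 for i != j
        return False
    if abs(np.linalg.det(A)) <= tol:          # property 3: A nonsingular
        return False
    return bool(np.all(np.linalg.inv(A) >= -tol))  # property 4: A^{-1} >= 0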
1.5 Chapter organisation and objectives.
In Chapter 2 the Hawkins-Simon condition is discussed. An alternative proof of this
condition, which does not require a continuity argument, is presented and the
computational difficulties that may be encountered when testing this condition are
discussed. A proof is given showing that a system of input-output equations is column
diagonally dominant and hence no pivoting strategies are required to stop the growth of
rounding errors when solving the system in floating-point arithmetic. Many of the results
derived in this chapter are new and have been published in the following paper,
O'Neill, M.J. and Wood, R.J. (1999): An Alternative Proof of the Hawkins-Simon Condition,
Asia Pacific Journal of Operational Research, Operational Research Society of Singapore,
Singapore, 16, 173-183.
In Chapter 3 alternative methods to the Hawkins-Simon condition are investigated in
order to determine whether a system of input-output equations has a unique positive
solution. The methods considered in this chapter are computationally easier to use and
are shown to be equivalent to the Hawkins-Simon condition. In particular, the use of the
spectral radius of the matrix A is investigated for determining whether an input-output
system has a unique positive solution. It also examines the issue of the positivity of F ,
the final demand vector. Many of the results derived in this chapter are new and have
been published in the following paper,
Wood, R.J. and O'Neill, M.J. (2002): Using the Spectral Radius to determine whether a
Leontief System has a unique positive solution, Asia Pacific Journal of Operational Research,
Operational Research Society of Singapore, Singapore, 19, 233-247.
In Chapter 4 methods for calculating the spectral radius of an input-output matrix are
discussed. The Power Method, the Inverse Power Method, and a Hybrid Method that
combines the two are considered. Conditions under which the methods are convergent,
as well as the rates of convergence for the methods, are examined.
Chapter 5 presents an always convergent method for determining the spectral radius of a
general irreducible non-negative matrix, not just that of a matrix from an input-output
problem. The method makes use of the bounds of Collatz and Wielandt and is called “The
Method of Collatz”. The results derived in this chapter have been published in the following
paper,
Wood, R.J and O'Neill, M.J. (2004). An always convergent method for finding the spectral
radius of a non-negative matrix. ANZIAM J., 45(E): C474-C485. [Online]
http://anziamj.austms.org.au/V45/CTAC2003/Wood/Wood.pdf
In Chapter 6 a method for identifying M-matrices is considered. The matrix (I - A) in
an input-output system is an example of such a matrix. The techniques of Chapter 5 are
employed here. The results derived in this chapter have been published in the following
paper,
Wood, R.J and O'Neill, M.J (2005): A faster algorithm for identification of an M-Matrix,
http://anziamj.austms.org.au/V46/CTAC2004/Wood/home.html
In Chapter 7 a comparison is given of the techniques of Chapter 5 with other methods for
determining the spectral radius of a particular type of non-negative matrix viz., a large
sparse matrix. These methods include the Hybrid Method described in Chapter 4,
Orthogonal Iteration, Simultaneous Iteration, and Arnoldi’s Method. The results derived
in this chapter have been published in the following paper
Wood, R.J. and O’Neill, M.J. (2007): Finding the Spectral Radius of a large Sparse Non-Negative
matrix, http://anziamj.austms.org.au/ojs/index.php/ANZIAMJ/article/view/117/99
In Chapter 8 two new results are derived that bound the condition number of an input-
output system. One is a result that provides achievable bounds but these could in practice
involve some computational complexity. However, using readily accessible matrix
norms, an alternative bound is also derived. It is much easier to calculate, but can be
more conservative in its estimates.
In Chapter 9 a new proof is given for the Levinger result mentioned in section 1.4.9 of the
Literature Review. A further extension to this result is derived, and consideration is given to
the result enunciated by Berman and Plemmons.
1.6 Original Research
The author claims that the following research is, to the best of his knowledge, original:
Chapter 2: the alternative proof of the Hawkins-Simon condition, the result
concerning the non-necessity for a pivoting strategy in solving an Input-Output
System, and Theorem 2.3. Also there is a discussion of further computational
aspects of the Hawkins-Simon condition.
Chapter 3: the result concerning the positivity of F, the final demand for a
company’s product. Also part ii) Theorem 3.2, and Theorems 3.3 and 3.4 are
new, which show that the Inverse Power Method is certain to converge and yield
the spectral radius of an input-output matrix.
Chapter 4: the proof of Theorems 4.1 and 4.2 that show that the Method of
Collatz is certain to converge for a non-negative, primitive matrix.
Chapter 5: the proof that the Inverse Collatz Method is always convergent for a
general irreducible, non-negative matrix, not just an input-output matrix.
Chapter 6: spectral radius methods for determining whether a matrix is an M-
matrix.
Chapter 8: bounds on the condition number of an input-output matrix (I - A).
Chapter 9: an alternative proof that the spectral radius of the symmetric part of a
non-negative matrix A is always greater than or equal to the spectral radius of
A , and proof of a necessary and sufficient condition for equality of the spectral
radii.
CHAPTER 2
The contents of this chapter are included in the paper
O'Neill, M.J. and Wood, R.J. (1999): An Alternative Proof of the Hawkins-Simon
Condition, Asia Pacific Journal of Operational Research, Operational Research Society of
Singapore, Singapore, 16, 173-183.
AN ALTERNATIVE PROOF OF THE HAWKINS-SIMON CONDITION
2.1 Introduction
The Hawkins-Simon condition for the existence of a unique positive solution of a
Leontief input-output system was put forward by Hawkins and Simon (1949). Interest in
the proof of the Hawkins-Simon condition arose in an article by Georgescu-Roegen
(1966, ch.9). In this article it was demonstrated that "the Hawkins-Simon proof contains
a fallacious continuity argument." This chapter gives an alternative proof of the Hawkins-
Simon condition without the need for a continuity argument. Furthermore, some
important computational aspects of the Hawkins-Simon condition are discussed. These
include the effect of rounding and inherent errors and also whether it is necessary to
employ a pivoting strategy, so vital to control the growth of rounding errors when solving
a general system of linear equations in approximate arithmetic.
2.2 Leontief Input-Output System
As seen in section 1.2 the input-output system can be formulated as a system of linear
equations with the following form,
(I - A) X = F, where A ≥ 0.    (1)
The system (1) has the solution
X = (I - A)^{-1} F, provided (I - A)^{-1} exists.    (2)
2.3 Hawkins-Simon Condition
A necessary and sufficient condition that (I - A)-1 exists and the solution (2) is positive is
provided by the Hawkins-Simon condition. This condition will now be derived.
The system (1) may be written in the expanded form

(1 - a_11) X_1 - a_12 X_2 - ... - a_1n X_n = F_1
- a_21 X_1 + (1 - a_22) X_2 - ... - a_2n X_n = F_2
    .
    .                                                 (3)
    .
- a_n1 X_1 - a_n2 X_2 - ... + (1 - a_nn) X_n = F_n

where F_i ≥ 0 (i = 1, ..., n) and not all F_i = 0,
0 ≤ a_ij < 1 (i = 1, ..., n; j = 1, ..., n),
and Σ_{i=1}^{n} a_ij ≤ 1 (j = 1, ..., n).

Two lemmas are needed for the ensuing derivation of the Hawkins-Simon condition.

In the matrix A, the leading principal sub-matrix of order k is the k×k sub-matrix in the
upper left-hand corner of A and is denoted by A^[k]. The determinant of this matrix is
called the leading principal sub-determinant of order k.

Lemma 2.1: If L is a lower triangular matrix then, using the notation defined above,

(LA)^[k] = L^[k] A^[k]

for all k for which A^[k] is defined.
Proof: Consider the partitions

L = [ L^[k]   0    ]        A = [ A^[k]   A_12 ]
    [ L_21    L_22 ],           [ A_21    A_22 ],

where A_12, A_21, A_22 are all sub-matrices which conform with respect to the
multiplication LA; also L^[k] and L_22 are lower triangular. Then

LA = [ L^[k] A^[k]               L^[k] A_12            ]
     [ L_21 A^[k] + L_22 A_21    L_21 A_12 + L_22 A_22 ].

So (LA)^[k] = L^[k] A^[k]. ■
The system of equations (3) is usually solved using Gaussian elimination. When a
matrix A is reduced using Gaussian elimination, the (i,j)th element at the ith stage of the
elimination is represented by a_ij^(i). The pivot element at this stage will be a_ii^(i).
Lemma 2.2: The pivot elements a_ii^(i) (i = 1, 2, ..., k) are positive if and only if the leading
principal sub-determinants det A^[i] (i = 1, 2, ..., k) are all positive.

Proof: The proof is by induction on k. For k = 1, the lemma is trivially true since
A^[1] = a_11^(1). For the induction step it is sufficient to assume that the determinants of
A^[1], ..., A^[k-1] are all positive and show that det A^[k] is positive if and only if a_kk^(k) is
positive.
By the induction hypothesis, a_11^(1), a_22^(2), ..., a_{k-1,k-1}^(k-1) > 0. Hence Gaussian elimination may be
carried out through to its (k-1)th step by using unit lower triangular matrices (also known
as Gauss transformations) M_i (a unit lower triangular matrix is a lower triangular matrix
with all of its diagonal elements unity) of index i (i = 1, 2, ..., k-1), such that

M_{k-1} ... M_2 M_1 A = A^(k) = [ A_11^(k)   A_12^(k) ]
                                [ 0          A_22^(k) ],

where A_11^(k) is upper triangular with diagonal elements a_11^(1), ..., a_{k-1,k-1}^(k-1). For a further
explanation of this aspect of Gaussian elimination see Stewart (1973, Ch.3). It follows
that (A^(k))^[k] is upper triangular with diagonal elements a_11^(1), ..., a_{k-1,k-1}^(k-1), a_kk^(k).

Since a_11^(1), ..., a_{k-1,k-1}^(k-1) > 0, det (A^(k))^[k] is positive if and only if a_kk^(k) is positive. Now since
M_1, ..., M_{k-1} are lower triangular,

(A^(k))^[k] = M_{k-1}^[k] ... M_1^[k] A^[k], by Lemma 2.1.

Since M_i is unit lower triangular, so is M_i^[k] and hence det M_i^[k] = 1. Thus det A^[k] is
positive if and only if det (A^(k))^[k] is positive, which, we have seen, is true if and only if
a_kk^(k) > 0. ■
The proof of this lemma is an adaptation of a proof in Stewart (1973, Ch 3).
Theorem 2.1: A necessary and sufficient condition that the solution of the system (3)
be unique and positive is that all of the leading principal sub-determinants (or minors)
of the coefficient matrix are positive. (This is the Hawkins-Simon condition.)
Proof: Firstly it is proved that if a unique positive solution exists for the system (3) then
all of the leading principal sub-determinants are positive. (This is the necessity part of
the proof.)

For 1 - a_11 there are three cases:

(i) 1 - a_11 > 0,   (ii) 1 - a_11 = 0,   (iii) 1 - a_11 < 0.

In case (i) positive multiples of the first equation can be added to all other equations to
zeroise any non-zero elements in the first column. All other non-diagonal elements
remain negative or zero, and the constant terms remain non-negative.

In case (ii) the interpretation must be that industry 1 consumes all of the product it
produces. Consequently, a_12 = a_13 = ... = a_1n = F_1 = 0, which means that the first equation in
the system (3) is a trivial identity. Further, since Σ_{i=1}^{n} a_i1 ≤ 1 and a_11 = 1, all other a_i1
must be zero. So the system (3) has no equation involving X_1. Accordingly X_1 can take
any arbitrary value, which means the solution is not unique. Hence if there is a unique
solution, case (ii) is not possible.

In case (iii) a_11 > 1, which is not possible according to the definition of a_ij.

So the only possibility, if there is a unique solution, is that 1 - a_11 > 0.

Thus a_11^(1) = 1 - a_11 > 0. The proof proceeds by induction on k, to prove a_kk^(k) > 0. Let the
solution of the system (3) be continued using Gaussian elimination. Assume that after
completion of the (k-1)th step in this Gaussian elimination procedure the reduced system
is in the form
a_11^(1) X_1 - a_12^(1) X_2 - ... - a_1n^(1) X_n = F_1^(1)
           a_22^(2) X_2 - ... - a_2n^(2) X_n = F_2^(2)
                .
                .                                                 (4)
           a_kk^(k) X_k - ... - a_kn^(k) X_n = F_k^(k)
         - a_{k+1,k}^(k) X_k + a_{k+1,k+1}^(k) X_{k+1} - ... - a_{k+1,n}^(k) X_n = F_{k+1}^(k)
                .
                .
         - a_{n,k}^(k) X_k - ... + a_{n,n}^(k) X_n = F_n^(k)

where a_ii^(i) > 0 (i = 1, 2, ..., k-1) and, consequently, in the kth row all the other
(non-diagonal) a's and F_k^(k) are greater than or equal to zero. It is possible to continue the
Gaussian elimination thus far, since the diagonal elements that are the pivot elements are all
positive.
Then there are three possibilities for a_kk^(k):

(i) a_kk^(k) > 0,   (ii) a_kk^(k) = 0,   (iii) a_kk^(k) < 0.

In case (i) positive multiples of the kth equation can be added to all necessary equations
to zeroise elements in the kth column below a_kk^(k). All other non-diagonal elements
remain negative or zero.

In case (ii) either a_{k,k+1}^(k), a_{k,k+2}^(k), ..., a_{k,n}^(k), F_k^(k) are all zero or at least one of these is positive.
In the former, the system would not have a unique solution and in the latter, either there
cannot be a positive solution for all X_j (j = k, k+1, ..., n) or else there is no solution at
all. Hence if there is a unique positive solution, case (ii) is not possible.
In case (iii) there are the same two possibilities as in case (ii). In the former, the only
solution is X_k = 0, which is not allowed, and in the latter there cannot be a positive
solution for all X_j (j = k, k+1, ..., n). Hence if there is a positive solution, case (iii) is not
possible.

The only possibility then for a unique positive solution is that a_kk^(k) > 0.

So, by mathematical induction, if a unique positive solution exists for the system (3) it is
necessary that

a_11^(1), a_22^(2), ..., a_nn^(n)

are all positive. By the preceding Lemma 2.2 this implies that all of the leading principal
sub-determinants in the system (3) are positive. This proves the necessity part of the
condition.
Next it is proved that if all of the leading principal sub-determinants of the coefficient
matrix are positive then a unique positive solution exists for the system (3).
If all of the leading principal sub-determinants are positive then, by Lemma 2.2, all of the
pivot elements a_11^(1), ..., a_nn^(n) are positive.

When Gaussian elimination is complete for the system (4) the final equation will be

a_nn^(n) X_n = F_n^(n) ≥ 0.    (5)
Now in equation (5) F_n^(n) cannot be zero, since then X_n = 0, which is not allowed, as it
would mean that industry n does not produce any products to sell to other industries or to
satisfy final demand. It also would not require any inputs to operate as an industry.
Therefore it would be excluded from consideration because it has no influence on the
economy in question. So F_n^(n) must be positive, which means that X_n is positive. The
proof proceeds by induction on k.

Assume that X_n, X_{n-1}, ..., X_{n-k+1} are all positive. Then in the (n-k)th equation, when the
system (4) has been completely reduced,

a_{n-k,n-k}^(n-k) X_{n-k} - a_{n-k,n-k+1}^(n-k) X_{n-k+1} - ... - a_{n-k,n}^(n-k) X_n = F_{n-k}^(n-k) ≥ 0,    (6)

either a_{n-k,n-k+1}^(n-k), ..., a_{n-k,n}^(n-k), F_{n-k}^(n-k) are all zero, implying X_{n-k} = 0, which is not allowed, or
at least one of these coefficients is positive, in which case X_{n-k} is positive.

Hence, by mathematical induction, all of the X_j (j = 1, 2, ..., n) are positive.
Furthermore the solution is unique, as well as positive, since the determinant of the
coefficient matrix is positive and hence non-zero, implying that the system (3) is non-
singular.
This proves the sufficiency part of the condition. ■
2.4 Computational Aspects of the Hawkins-Simon Condition
(i) The necessity to evaluate a set of determinants in order to test the Hawkins-Simon
condition can be avoided.
By Lemma 2.2,

a_11^(1), a_22^(2), ..., a_nn^(n) > 0

is equivalent to the Hawkins-Simon condition. Hence, the Hawkins-Simon condition can
be tested in the course of solving the system of linear equations by examining the pivot
elements as they occur. If they are all positive then the Hawkins-Simon condition is
satisfied.
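A sketch of this pivot-based test: run Gaussian elimination on I - A (no pivoting is needed here, by Theorem 2.2) and check each pivot as it emerges. A tolerance of the order of unit round-off guards against the uncertainty discussed in point (ii) below; the particular threshold used here is a heuristic choice:

import numpy as np

def hawkins_simon(A):
    """Test the Hawkins-Simon condition on I - A via the signs of the pivots.

    A pivot of the order of unit round-off is treated as a failure, since
    its sign cannot be trusted (see point (ii))."""
    M = np.eye(A.shape[0]) - A
    n = M.shape[0]
    threshold = n * np.finfo(float).eps * np.abs(M).max()
    for k in range(n):
        pivot = M[k, k]
        if pivot <= threshold:        # not safely positive
            return False
        # Eliminate column k from the remaining rows (one Gaussian step).
        M[k+1:, k+1:] -= np.outer(M[k+1:, k], M[k, k+1:]) / pivot
    return True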
(ii) In exact arithmetic the test described in (i) presents no problem. However, in
approximate arithmetic uncertainties may arise. Consider the following example where
computations are performed using 4-digit arithmetic with chopping.
Example: Given

A = [ 1/3  2/3 ]
    [ 2/3  1/3 ],

then

I - A = [  2/3  -2/3 ]
        [ -2/3   2/3 ],

which is singular. However, in 4-digit floating-point arithmetic with chopping, matrix
A becomes

fl(A) = [ 0.3333  0.6666 ]
        [ 0.6666  0.3333 ].

Then,

I - fl(A) = [  0.6667  -0.6666 ]
            [ -0.6666   0.6667 ].

Using Gaussian elimination and 4-digit arithmetic with chopping, this reduces to the
matrix

[ 0.6667  -0.6666 ]
[ 0        0.0003 ].
According to the test in (i), I - fl(A) is non-singular, since the pivot elements are both
positive. But it is obviously fallacious to conclude in this case that I - A is non-singular.
The (2,2) element of the reduced matrix is of the order of unit round-off (or machine
epsilon) in 4-digit arithmetic. For a further explanation of machine epsilon, see Forsythe
et al. (1977, p.13). Consequently, in an automatic routine for testing the Hawkins-Simon
condition, the pivot elements a_11^(1), a_22^(2), ..., a_nn^(n) should be tested to ascertain whether they
are of the order of unit round-off. If so, there must be some doubt about the
non-singularity of I - A.
(iii) Is there a necessity for Pivoting in Solving the System of Linear Equations?
For a solution of a general system of linear equations in approximate arithmetic a
pivoting strategy (usually partial pivoting) is required to control the growth of rounding
errors. For a further explanation of pivoting see Stewart (1973, p.127). However
provided that the Hawkins-Simon condition holds, partial pivoting is not necessary in the
solution of the system of equations (3). This is guaranteed by the following theorem.
Theorem 2.2: The coefficient matrix in (3) is column diagonally dominant, i.e.

a_jj^(1) ≥ Σ_{i=1, i≠j}^{n} |a_ij^(1)|   for j = 1, 2, ..., n, where a_jj^(1) = 1 - a_jj,

and each reduced sub-matrix in the Gaussian elimination procedure is also column
diagonally dominant, provided that the previous pivot element is positive.

Proof: Since in the system (3)

Σ_{i=1}^{n} a_ij ≤ 1   (j = 1, 2, ..., n),

then

(1 - a_jj) ≥ a_1j + a_2j + ... + a_{j-1,j} + a_{j+1,j} + ... + a_nj,    (7)

i.e. a_jj^(1) ≥ Σ_{i=1, i≠j}^{n} |a_ij^(1)|, since all the a's are non-negative.

So the coefficient matrix I - A in (3) is column diagonally dominant.

In particular then, the element a_11^(1) will be greater than or equal to the size of any
other element in the first column of I - A. This obviates the need for partial pivoting in
the first stage of the Gaussian elimination procedure.
After zeroising elements below the diagonal in the first column, we move to the second
stage of Gaussian elimination. At this stage we wish to zeroise elements below the
diagonal in the (n-1) by (n-1) reduced sub-matrix with typical element a_ij^(2) given by

a_ij^(2) = a_ij^(1) - a_i1^(1) a_1j^(1) / a_11^(1),   i = 2, ..., n;  j = 2, ..., n.    (8)

This step can indeed be carried out, since the previous pivot element a_11^(1) is positive (and
hence is non-zero).
Then, from (8) we deduce that

Σ_{i=2, i≠j}^{n} |a_ij^(2)| = Σ_{i=2, i≠j}^{n} | a_ij^(1) - a_i1^(1) a_1j^(1) / a_11^(1) |

    ≤ Σ_{i=2, i≠j}^{n} |a_ij^(1)| + ( |a_1j^(1)| / a_11^(1) ) Σ_{i=2, i≠j}^{n} |a_i1^(1)|

    ≤ ( a_jj^(1) - |a_1j^(1)| ) + ( |a_1j^(1)| / a_11^(1) ) ( a_11^(1) - |a_j1^(1)| ), using (7)

    = a_jj^(1) - |a_1j^(1)| |a_j1^(1)| / a_11^(1)

    = a_jj^(1) - a_1j^(1) a_j1^(1) / a_11^(1), since a_1j^(1) ≤ 0 and a_j1^(1) ≤ 0 for j ≠ 1

    = a_jj^(2), using (8) with i = j.

Thus the (n-1) by (n-1) reduced sub-matrix is column diagonally dominant.
Continuing in this way it can be shown by induction that all of the reduced sub-matrices
are column diagonally dominant, if the previous pivot element is positive, and hence no
pivoting strategy is required. ■
(iv) In certain cases inherent errors in the coefficient matrix of (3) may cause a wrong
decision with respect to the Hawkins-Simon condition.
If the elements a_ij^(1) of I - A are in error by a maximum amount ε, then the elements of
the reduced sub-matrix a_ij^(2), given by (8), have maximum propagated error

ε + mε, where m = | a_i1^(1) / a_11^(1) |.

But 0 ≤ m ≤ 1, since the coefficient matrix is column diagonally dominant. So the
maximum propagated error in a_ij^(2) is 2ε.

Further, when the (n-1) by (n-1) reduced sub-matrix is also column diagonally
dominant, it can be shown in a similar way that the elements a_ij^(3) have maximum
propagated error 4ε. Continuing in the same way it is found that the maximum
propagated error in a_ij^(4) is 8ε, the maximum propagated error in a_ij^(5) is 16ε, etc.

In particular then,

a_11^(1) will have maximum propagated error ε
a_22^(2) will have maximum propagated error 2ε
a_33^(3) will have maximum propagated error 4ε
    .
    .
a_nn^(n) will have maximum propagated error 2^(n-1) ε.

So, if at any stage in the Gaussian elimination procedure a pivot element is of the same
size as the maximum propagated error in that pivot element, there must be some doubt as
to whether the pivot element is truly positive or has been rendered positive by the
propagation of inherent errors in the original coefficient matrix. This then raises the
possibility that the Hawkins-Simon condition is not actually satisfied.
(v) Another computational difficulty with the Hawkins-Simon condition can occur when
A is a large n×n matrix, for then det(I - A) → 0 as n → ∞.
Theorem 2.3: If A is a large n×n input-output matrix, then as n → ∞,
det(I - A) → 0.
Proof: The proof of this condition comes directly from the proof of the Hawkins-Simon
condition. If the system of linear equations (4) has been completely reduced it can be
written in the form
a a a a
a a a
a a
a
X
X
X
F
F
F
n
n
n
nnn
n nn
111
121
131
11
222
232
22
333
33
1
2
11
220
0
0
0 0 0 0
( ) ( ) ( ) ( )
( ) ( ) ( )
( ) ( )
( )
( )
( )
( )
...
...
. ...
. . . .
.
.
.
.
system.output input an ofproperty a ,10 since 10
1 now
11
)1(
11
11
)1(
11
aa
aa
It was also proven in the Hawkins-Simon condition that a11 1 , since if a11 1 this
would result in a system of equations with a non-unique solution.
0 1111a( )
Hence the first principal sub-determinant a111( )
must satisfy 0 1111 a( )
.
Now after k steps of Gaussian elimination
a a m a
m a m a
m a m a m a m a
m a m a m a m a m a m a
m a m a m a m a m a m a
kkk
kk k k
k k k
k k k k k
k k k k k k k
k k k k k k k k k k k k k k k
( )
, , , , , , ,
( )
( ( )
( ( ) ( ))
.
.
( ( ) ... ( ))
1 1 1
2 2 12 1
3 3 13 1 23 2 12 1
4 4 14 1 24 2 12 1 34 3 13 1
1 1 1 1 1 2 1 2 12 1 2 1 2 1 2 1
where $m_{ij} \ge 0$ is the multiplier needed to zeroize the (j, i)th element when using row i,
and $0 \le a_{ij} \le 1$.
Therefore it can be guaranteed that the (k, k)th element of the reduced matrix, at any
stage of Gaussian elimination, is less than or equal to one, since
$$a_{kk}^{(k)} = a_{kk} - (\text{a combination of positive or zero quantities}).$$
Now from Lemma 2.2 it was proven that $a_{kk}^{(k)} > 0$ for k = 1,...,n, provided the principal
sub-determinants are positive. Hence $0 < a_{kk}^{(k)} \le 1$.
The kth leading principal sub-determinant is equal to $a_{11}^{(1)} a_{22}^{(2)} a_{33}^{(3)} \cdots a_{kk}^{(k)}$. Now, since
each $0 < a_{kk}^{(k)} \le 1$ for $k = 1, \ldots, n$, it can be guaranteed that the maximum value of any
sub-determinant is one. Under normal circumstances not all $a_{kk}^{(k)} = 1$, therefore as
$n \to \infty$, $a_{11}^{(1)} a_{22}^{(2)} a_{33}^{(3)} \cdots a_{nn}^{(n)}$ will approach zero. ■
This then raises the possibility that if the Hawkins-Simon condition is checked by
evaluating $\det(I - A)$, it could lead to an incorrect conclusion when dealing with a
large input-output system.
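Theorem 2.3 is easy to observe numerically. The following sketch, assuming Python with NumPy, generates hypothetical random input-output matrices (the generator random_io_matrix and its parameters are invented purely for the illustration) and shows $\det(I - A)$ collapsing towards zero as n grows:

```python
import numpy as np

rng = np.random.default_rng(0)

def random_io_matrix(n, diag=0.4, off=0.4):
    """A hypothetical random input-output matrix: each industry uses a
    sizeable share `diag` of its own output, plus non-negative off-diagonal
    coefficients scaled so every column sum equals diag + off < 1."""
    A = rng.random((n, n))
    np.fill_diagonal(A, 0.0)
    A = off * A / A.sum(axis=0)      # off-diagonal column sums = off
    return A + diag * np.eye(n)      # total column sums = diag + off

for n in (5, 20, 80):
    A = random_io_matrix(n)
    print(n, np.linalg.det(np.eye(n) - A))   # shrinks rapidly with n
```

Here every pivot of I - A is at most 0.6, so the determinant is bounded by $0.6^n$, in line with the theorem.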
2.5 Conclusion
As mentioned in the introduction, the original proof of the Hawkins-Simon condition
contained an erroneous continuity argument. This chapter has presented an alternative
proof of the Hawkins-Simon condition using only basic ideas from linear algebra and
avoiding the necessity for a continuity argument.
An attempt has been made to present a proof, modeled on the proof in the original paper
by Hawkins and Simon (1949), but a proof that exhibited a rigorous justification at each
step. An alternative condition to the Hawkins-Simon condition can be derived using the
spectral radius of the matrix A in (1). It is shown in chapter 3 that this condition is
equivalent to the Hawkins-Simon condition.
Further, it has been shown that the Hawkins-Simon condition can be tested during the
course of Gaussian elimination to solve the system of linear equations in the Leontief
model. All that is necessary is to check the positivity of the pivot elements as they
emerge in the Gaussian elimination procedure. Care must be exercised in any test of this
positivity since there will be some doubt when a pivot element is of the order of unit
round-off with respect to the computer arithmetic being used.
Although a pivoting strategy such as partial pivoting is generally advisable when solving
a system of linear equations in approximate arithmetic, a partial pivoting strategy is not
needed for the Leontief model provided that successive pivot elements are positive. This
is because the reduced sub-matrix at the next stage of Gaussian elimination will be
column diagonally dominant.
Inherent errors in the original coefficient matrix of the Leontief model will cause these
errors to be propagated to the pivot element and thereby possibly magnified. There must
then be some doubt that the Hawkins-Simon condition holds when a pivot element is of
the same size as the maximum propagated error in that pivot element.
CHAPTER 3
The contents of this chapter are included in the paper
Wood, R.J. and O'Neill, M.J. (2002): Using the Spectral Radius to determine whether a
Leontief System has a unique positive solution, Asia Pacific Journal of Operational Research,
Operational Research Society of Singapore, Singapore, 19, 233-247.
USING THE SPECTRAL RADIUS TO DETERMINE WHETHER A LEONTIEF
SYSTEM HAS A UNIQUE POSITIVE SOLUTION
3.1 The Positivity of F
The issue of the positivity of F, within the input-output system of equations
(I - A) X = F, (1)
has not been addressed fully by previous authors such as Sohn (1986), Miller and Blair
(1985) and Jensen and West (1986). They state that $F_i \ge 0$ $(i = 1, \ldots, n)$ and that not all
$F_i = 0$. This is satisfactory in most cases. However, having some zero elements in F can
sometimes lead to a degenerate case, as shown in Example 1 below. Gerard and Herstein
(1953) and Varga (1962, p 82) show that if the spectral radius $\rho(A)$ has a positive value
less than one then $(I - A)^{-1}$ exists.
Both the Hawkins-Simon condition and the Spectral Radius condition are concerned with
conditions under which there is a unique positive solution of the system (1), and our
contention is that this is also dependent upon the positivity of F. It is quite conceivable
that some elements of F could be zero. This might occur if the total production of a
particular industry was consumed by itself and other industries. Consequently, the
external demand for that product would be zero. In such circumstances, the solution
vector X, given by
$$X = (I - A)^{-1} F \qquad (2)$$
may turn out to be strictly positive, but on the other hand, may also turn out to have some
zero elements. The following example shows these possibilities.
Example 1
If
$$A = \begin{pmatrix} 1/2 & 0 & 1/8\\ 1/4 & 1/2 & 1/4\\ 1/8 & 0 & 1/2 \end{pmatrix} \quad \text{and} \quad F = \begin{pmatrix} 1\\ 0\\ 1 \end{pmatrix}, \quad \text{then} \quad (I - A)^{-1} = \begin{pmatrix} 32/15 & 0 & 8/15\\ 4/3 & 2 & 4/3\\ 8/15 & 0 & 32/15 \end{pmatrix},$$
and by the use of (2),
$$X = \begin{pmatrix} 8/3\\ 8/3\\ 8/3 \end{pmatrix},$$
which is a strictly positive solution. On the other hand, if
$$F = \begin{pmatrix} 0\\ 1\\ 0 \end{pmatrix}, \quad \text{then} \quad X = \begin{pmatrix} 0\\ 2\\ 0 \end{pmatrix},$$
which is not strictly positive. This can only happen if A is reducible. This is because if
A is irreducible (i.e. $(I - A)^{-1}$ is irreducible),
$$X = (I - A)^{-1}F \ge (I + A + A^2)F > 0, \quad \text{assuming } \rho(A) < 1,$$
since $I + A + A^2$ is a positive matrix. ■ See section 1.4.
In the latter case, the associated transaction table (transaction tables are explained in
Miller and Blair (1985)) would be:

                    Industry 1   Industry 2   Industry 3   Final demand   Total outputs
  Industry 1            0            0            0             0               0
  Industry 2            0            1            0             1               2
  Industry 3            0            0            0             0               0
  Primary inputs        0            1            0
  Total inputs          0            2            0
What has resulted is a degenerate economic system in which the linearity assumption of
the Leontief model is no longer valid. This is because, for both Industry 1 and 3, the ratio
of inputs to total outputs is undefined (total outputs being zero). These ratios should be
equal to the corresponding ratios in the original input-output matrix A. It has become a
degenerate economic system in that a functioning 3-industry economy has effectively
been reduced to a 1-industry economy, since Industry 1 and 3 contribute in no way to the
economy. The possibility of such a degenerate case is removed if the demand vector, F,
is strictly positive. As will be shown later, this results in a strictly positive solution for
the vector X, and thereby ensures that all ratios will be defined.
However, a question that must be answered is this: What if some of the elements of F are
legitimately zero? The answer is to replace such elements in F by a small positive
number. But how small should they be, and what guarantee is there that a small change
in the demand vector will not cause a large change in the output vector, X? Both
questions are answered by the following standard result for the perturbation of a linear
system. (See Stewart (1973, pages 194-198)).
$$\frac{\|X - \tilde{X}\|}{\|X\|} \le \kappa(I - A)\,\frac{\|F - \tilde{F}\|}{\|F\|}, \qquad (3)$$
where X is the output vector for the system (1) when $F > 0$, $\tilde{X}$ is the output vector for
the perturbed system (4) when $\tilde{F} > 0$, and $\kappa(I - A)$ is the condition number of $I - A$:
$$(I - A)\tilde{X} = \tilde{F} \qquad (4)$$
Knowing $\kappa(I - A)$, or an upper bound for it, $\tilde{F}$ can be chosen appropriately close
to F, and then (3) guarantees $\tilde{X}$ is appropriately close to X.
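A minimal numerical sketch of this perturbation strategy, assuming Python with NumPy and re-using the matrix of Example 1 (the value eps = 1e-6 is purely illustrative):

```python
import numpy as np

A = np.array([[0.5, 0.0, 0.125],
              [0.25, 0.5, 0.25],
              [0.125, 0.0, 0.5]])
M = np.eye(3) - A
F = np.array([0.0, 1.0, 0.0])

eps = 1e-6                          # illustrative size of the perturbation
F_pert = np.where(F == 0, eps, F)   # replace legitimate zeros by eps

X = np.linalg.solve(M, F)
X_pert = np.linalg.solve(M, F_pert)

kappa = np.linalg.cond(M)           # condition number of I - A
lhs = np.linalg.norm(X - X_pert) / np.linalg.norm(X)
rhs = kappa * np.linalg.norm(F - F_pert) / np.linalg.norm(F)
print(lhs <= rhs, lhs, rhs)         # relative change in X bounded as in (3)
```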
3.2 The Spectral Radius Condition
The Hawkins-Simon condition provides a necessary and sufficient condition for the
existence of a unique, positive solution of the system (1). This condition states that:
A necessary and sufficient condition for the existence of a unique positive solution of (1)
is that all of the leading principal sub-determinants of the matrix $(I - A)$ be positive.
The Spectral Radius condition provides an alternative necessary and sufficient condition
for the existence of a unique, positive solution of (1). This condition is proved in the
following theorem.
Theorem 3.1: A necessary and sufficient condition that a unique, positive solution exists
for the system (1) with $F > 0$, is that the spectral radius of A (denoted by $\rho(A)$) is less
than unity.
Proof:
(i) Sufficiency
If $\rho(A) < 1$, it is well-known (see Varga (1962, page 82)) that a unique solution exists for
the system (1). Positivity of this solution can be seen by the fact that $\rho(A) < 1$ guarantees
the convergence of the Neumann series:
$$(I - A)^{-1} = I + A + A^2 + A^3 + \cdots, \qquad (5)$$
which shows that the elements of $(I - A)^{-1}$ must be non-negative, and that no row of
$(I - A)^{-1}$ can consist entirely of zeros. Thus, recalling that F is strictly positive, the
solution (2) must also be strictly positive.
(ii) Necessity
If a unique, positive solution exists for the system (1), then a unique solution certainly
exists, and, by the result from Varga (1962), it is necessary that $\rho(A) < 1$, since for an
input-output system $\rho(A) \le 1$ (see section 1.2, p10) and $\rho(A) \neq 1$, otherwise $(I - A)$ is
singular. ■
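The Neumann series (5) is itself computable, and offers a quick check of the non-negativity of $(I - A)^{-1}$. A sketch, assuming Python with NumPy (the function neumann_inverse and the truncation at 60 terms are illustrative):

```python
import numpy as np

def neumann_inverse(A, terms=60):
    """Partial sums of the Neumann series (5): I + A + A^2 + ... .
    Converges to the inverse of (I - A) whenever rho(A) < 1."""
    n = len(A)
    S, P = np.eye(n), np.eye(n)
    for _ in range(terms):
        P = P @ A              # next power of A
        S += P                 # accumulate the series
    return S

A = np.array([[0.5, 0.0, 0.125],
              [0.25, 0.5, 0.25],
              [0.125, 0.0, 0.5]])
print(np.allclose(neumann_inverse(A), np.linalg.inv(np.eye(3) - A)))  # True
```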
Since the Spectral Radius condition and the Hawkins-Simon condition both provide
necessary and sufficient conditions for the unique positive solution of (1), these are
equivalent conditions.
3.3 Testing for the Spectral Radius Condition using Bounds
There are available some standard bounds on the spectral radius and these provide useful
tests for the Spectral Radius condition.
(i) It is well-known that, for any real matrix A,
$$\rho(A) \le \|A\|, \qquad (6)$$
where $\|A\|$ is any natural norm. See Stewart (1973, p 271) for details of
this bound. In particular, then
$$\rho(A) \le \|A\|_\infty, \text{ the maximum absolute row sum,} \qquad (7)$$
and
$$\rho(A) \le \|A\|_1, \text{ the maximum absolute column sum.} \qquad (8)$$
Thus, if the row sum or the column sum of the matrix A is less than unity, results (7) and
(8) verify the Spectral Radius condition.
(ii) It is also known that, for a non-negative matrix A, $\rho(A)$ is greater than or equal to
the largest diagonal element of the matrix A. (See Lemma 8.2, p116.) This
provides a lower bound for the spectral radius, and coupled with the upper bound
provided by (i) above, it provides a useful bound on the spectral radius.
(iii) A well-known result from Varga (1962, p31) shows that, for a non-negative,
irreducible matrix A,
$$\text{minimum column sum of } A \le \rho(A) \le \text{maximum column sum of } A. \qquad (9)$$
(The actual result in Varga (1962) is for the row sum, but the argument is the same for
column sums since the eigenvalues of A and $A^T$ are identical.) This result can be
extended to non-negative reducible matrices, as the following theorem shows.
Theorem 3.2: If $A \ge 0$ is an $n \times n$ reducible or irreducible matrix, then the
minimum column sum of $A \le \rho(A) \le$ maximum column sum of A.
Proof:
(i) If A is irreducible the result is proven in Varga (1962, p31).
(ii) If A is reducible then it can be shown that there exists an $n \times n$ permutation matrix
P such that
$$PAP^T = \begin{pmatrix}
A_{11} & A_{12} & \cdots & A_{1s}\\
0 & A_{22} & \cdots & A_{2s}\\
\vdots & & \ddots & \vdots\\
0 & 0 & \cdots & A_{ss}
\end{pmatrix} \qquad (10)$$
where the submatrices $A_{11}, A_{22}, \ldots, A_{ss}$ are square and are either a $1 \times 1$ null matrix or an
irreducible matrix. See Varga (1962, p 46) for details of this result.
The transformation from A to $PAP^T$ will not affect the minimum and maximum column
sums. So, if $A_{11}$ is a null matrix, then, since the elements of A are non-negative, the
minimum column sum of A must be zero. Hence, the lower bound of the theorem is
established in this case.
Alternatively, if $A_{11}$ is an irreducible matrix, then (9) shows that
$$\text{minimum column sum of } A_{11} \le \rho(A_{11}) \le \text{maximum column sum of } A_{11}. \qquad (11)$$
Therefore,
$$\rho(A) \ge \rho(A_{11}) \ge \text{minimum column sum of } A_{11} \ge \text{minimum column sum of } A, \qquad (12)$$
since the column sums of $A_{11}$ must also be among the column sums of A. Hence, the
lower bound is established in this case. The upper bound is established easily using the
result (8), since $\|A\|_1$ = maximum column sum. ■
The above proof is an extension of Varga's (1962, p31) proof of the bounds on the
spectral radius for an irreducible matrix, part (ii) being the new part.
Corollary 3.1: For reasons stated earlier (see p 48), the column sums in Theorem 3.2 can
be replaced by row sums and the result still holds true. ■
The above results are now applied to the following example.
Example 2:
$$\text{If } A = \begin{pmatrix} 0 & 0.05 & 0\\ 0 & 0 & 0.05\\ 0.4 & 0.5 & 0.05 \end{pmatrix},$$
the eigenvalues of A are $-0.05$, $-0.1$, and $0.2$. So $\rho(A) = 0.2$;
(7) shows that $\rho(A) \le 0.95$;
(8) shows that $\rho(A) \le 0.55$;
Theorem 3.2 shows that $0.1 \le \rho(A) \le 0.55$, and the corollary shows that
$0.05 \le \rho(A) \le 0.95$. Hence, it could have been guaranteed that $X = (I - A)^{-1}F$ has a
unique positive solution, without the need to determine $\rho(A)$. However this may not
always be the case. (For example, if $\|A\|_\infty \ge 1$ and $\|A\|_1 \ge 1$.)
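These bounds require nothing more than row and column sums. A sketch, assuming Python with NumPy and applied to the matrix of Example 2:

```python
import numpy as np

A = np.array([[0.0, 0.05, 0.0],
              [0.0, 0.0, 0.05],
              [0.4, 0.5, 0.05]])

row_sums = A.sum(axis=1)   # bounds (7) and Corollary 3.1
col_sums = A.sum(axis=0)   # bounds (8) and Theorem 3.2

print(col_sums.min(), col_sums.max())    # 0.1  <= rho(A) <= 0.55
print(row_sums.min(), row_sums.max())    # 0.05 <= rho(A) <= 0.95
print(max(abs(np.linalg.eigvals(A))))    # rho(A) = 0.2
```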
3.4 Testing for the Spectral Radius Condition by Computing $\rho(A)$
Sometimes it may be necessary to verify the Spectral Radius condition by actual
computation of $\rho(A)$. This is achieved quite conveniently by means of the Inverse
Power method, a well-known numerical method for calculating a particular matrix
eigenvalue which is closest to a given value $\sigma$. Beginning with an initial arbitrary
n-dimensional vector $q_0$, successive vectors $q_v$ are calculated, where
$$q_v = (A - \sigma I)^{-1} q_{v-1}, \quad \text{for } v = 1, 2, 3, \ldots \qquad (13)$$
Note that, if $\sigma$ is already an eigenvalue, then $(A - \sigma I)$ will be revealed as a singular
matrix. We will assume that $\sigma$ is not an eigenvalue.
Assuming that A has a complete set of linearly independent eigenvectors, $x_1, x_2, \ldots, x_n$,
then $q_0$ can be expressed in terms of this basis of eigenvectors as
$$q_0 = \alpha_1 x_1 + \alpha_2 x_2 + \cdots + \alpha_n x_n. \qquad (14)$$
Then, from (13) and (14),
$$q_v = (A - \sigma I)^{-v} q_0 = \frac{\alpha_1}{(\lambda_1 - \sigma)^v}x_1 + \frac{\alpha_2}{(\lambda_2 - \sigma)^v}x_2 + \cdots + \frac{\alpha_n}{(\lambda_n - \sigma)^v}x_n, \qquad (15)$$
where $\lambda_1, \lambda_2, \ldots, \lambda_n$ are the eigenvalues corresponding to $x_1, x_2, \ldots, x_n$. Furthermore, (15)
can now be written in the form
$$q_v = \frac{1}{(\lambda_1 - \sigma)^v}\left[\alpha_1 x_1 + \alpha_2\left(\frac{\lambda_1 - \sigma}{\lambda_2 - \sigma}\right)^{v} x_2 + \cdots + \alpha_n\left(\frac{\lambda_1 - \sigma}{\lambda_n - \sigma}\right)^{v} x_n\right]. \qquad (16)$$
We shall assume for simplicity that $\lambda_1, \lambda_2, \ldots, \lambda_n$ and $\sigma$ are all real. Then, representing the
unique eigenvalue closest to the value $\sigma$ by $\lambda_1$, and representing the $k$th component of
$q_v$ by $q_v^{(k)}$, assumed to be the most positive of the components of $q_v$,
$$\frac{q_v^{(k)}}{q_{v-1}^{(k)}} = \frac{1}{\lambda_1 - \sigma}\cdot
\frac{\alpha_1 x_1^{(k)} + \alpha_2\bigl(\frac{\lambda_1-\sigma}{\lambda_2-\sigma}\bigr)^{v} x_2^{(k)} + \cdots + \alpha_n\bigl(\frac{\lambda_1-\sigma}{\lambda_n-\sigma}\bigr)^{v} x_n^{(k)}}
{\alpha_1 x_1^{(k)} + \alpha_2\bigl(\frac{\lambda_1-\sigma}{\lambda_2-\sigma}\bigr)^{v-1} x_2^{(k)} + \cdots + \alpha_n\bigl(\frac{\lambda_1-\sigma}{\lambda_n-\sigma}\bigr)^{v-1} x_n^{(k)}}
\to \frac{1}{\lambda_1 - \sigma} \quad \text{as } v \to \infty. \qquad (17)$$
When $q_v^{(k)}/q_{v-1}^{(k)}$ has stabilised in this process, the ratio gives the value $\theta$, where
$$\theta = \frac{1}{\lambda_1 - \sigma}, \qquad (18)$$
and from this, $\lambda_1$ is obtained as
$$\lambda_1 = \sigma + \frac{1}{\theta}. \qquad (19)$$
Some scaling of the $q_v$ may be necessary to prevent overflow or underflow in a computer
calculation. This scaling does not affect the computation of $\lambda_1$. A more comprehensive
discussion of the method can be found in Stewart (1973, pgs 340-345).
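A minimal sketch of the Inverse Power method as described above, assuming Python with NumPy (the function name, iteration cap and tolerance are illustrative; the iterate is rescaled each step, which, as noted, does not affect the computed eigenvalue):

```python
import numpy as np

def inverse_power(A, sigma=1.0, iters=100, tol=1e-10):
    """Inverse Power method sketch: iterate q_v = (A - sigma*I)^{-1} q_{v-1}
    from a strictly positive q_0, then recover lambda_1 via (18)-(19)."""
    n = len(A)
    M = A - sigma * np.eye(n)
    q = np.ones(n)                      # strictly positive initial vector
    lam = None
    for _ in range(iters):
        q_new = np.linalg.solve(M, q)   # avoids forming the inverse
        k = np.argmax(np.abs(q_new))    # track the largest component
        theta = q_new[k] / q[k]         # stabilising ratio, as in (18)
        lam_new = sigma + 1.0 / theta   # eigenvalue closest to sigma, (19)
        q = q_new / np.linalg.norm(q_new)   # rescale against overflow
        if lam is not None and abs(lam_new - lam) < tol:
            return lam_new
        lam = lam_new
    return lam

A = np.array([[0.0, 0.65, 0.55],
              [0.25, 0.05, 0.1],
              [0.25, 0.05, 0.0]])
print(inverse_power(A))   # approximately 0.601741, as in Example 3 below
```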
Potential problems arise during implementation of the Inverse Power method if any of the
following circumstances arise:
(i) If $\lambda_1 = \lambda_2 = \cdots = \lambda_r$ $(r > 1)$, then the limit in equation (17) becomes
$$\frac{q_v^{(k)}}{q_{v-1}^{(k)}} \to \frac{1}{\lambda_1 - \sigma}\cdot\frac{\alpha_1 x_1^{(k)} + \alpha_2 x_2^{(k)} + \cdots + \alpha_r x_r^{(k)}}{\alpha_1 x_1^{(k)} + \alpha_2 x_2^{(k)} + \cdots + \alpha_r x_r^{(k)}} = \frac{1}{\lambda_1 - \sigma}, \qquad (20)$$
provided $\alpha_1, \alpha_2, \ldots, \alpha_r$ are not all zero. So, convergence to $\frac{1}{\lambda_1 - \sigma}$
still occurs in this case. However, if there are unequal eigenvalues that
are equidistant from $\sigma$ (and some of these may be complex), the above analysis is
not valid.
(ii) An initial vector $q_0$ is chosen unfortunately with $\alpha_1 = 0$. Convergence in this
case may be to a subdominant eigenvalue such as $\lambda_2$, which is not the closest
eigenvalue to $\sigma$.
(iii) The matrix A is deficient, i.e. it does not have a complete set of n linearly
independent eigenvectors. In such a case, equation (15) is invalid, and the method
must be justified by a more extensive analysis employing the Jordan form of A.
It is most fortunate that, in the case of a non-negative matrix A with $\rho(A) < 1$, all of
these potential difficulties are overcome because of the nature of A, and by choosing $q_0$
as an arbitrary positive vector. Theorem 3.3 and Theorem 3.4 establish this.
Theorem 3.3: For a non-negative matrix A with a complete set of eigenvectors and
with $\rho(A) < 1$, the Inverse Power method is certain to converge and yield the spectral
radius of A when the initial vector $q_0$ is chosen with strictly positive elements, and $\sigma$,
the approximate eigenvalue, is chosen to be equal to one.
Proof: Matrix A could have a pair of complex conjugate eigenvalues, say $\lambda_2$ and $\bar{\lambda}_2$,
which are equal in magnitude to $\rho(A)$, or a negative real eigenvalue, say $\lambda_3$, which is
equal in magnitude to $\rho(A)$. This is illustrated in the following diagram.
[Figure 1. Spectrum of Eigenvalues: $\lambda_1 = \rho(A)$, $\lambda_2$, $\bar{\lambda}_2$ and $\lambda_3$ plotted in the complex plane, with $\lambda_1$ the eigenvalue closest to the point $\sigma = 1$ on the real axis.]
It is obvious that $|1 - \lambda_1| < |1 - \lambda_2|,\ |1 - \bar{\lambda}_2|,\ |1 - \lambda_3|$. So other real or complex eigenvalues
equal in magnitude to $\rho(A)$ do not interfere with convergence of the Inverse Power
method to the value $\lambda_1$, as referred to above in (i).
Since A has a complete set of eigenvectors it can then be diagonalised in the form
$A = XDX^{-1}$, i.e.
$$A = (x_1\; x_2\; \cdots\; x_n)
\begin{pmatrix} \lambda_1 & 0 & \cdots & 0\\ 0 & \lambda_2 & & \vdots\\ \vdots & & \ddots & 0\\ 0 & \cdots & 0 & \lambda_n \end{pmatrix}
\begin{pmatrix} y_1^T\\ y_2^T\\ \vdots\\ y_n^T \end{pmatrix} \qquad (21)$$
where $x_1, x_2, \ldots, x_n$ are the right eigenvectors of A and $y_1^T, y_2^T, \ldots, y_n^T$ are the
corresponding left eigenvectors of A. (See Stewart (1973, p 277) for proof of this
result.)
Hence, since $X^{-1}X = I$, the identity matrix, we can conclude that:
$$y_1^T x_1 = 1 \quad \text{and} \quad y_1^T x_j = 0, \; j \neq 1. \qquad (22)$$
However $y_1^T$, being the left eigenvector corresponding to $\lambda_1 = \rho(A)$, is the dominant
right column eigenvector of $A^T$, and hence, by a variant of the Perron-Frobenius
Theorem (see Varga (1962, p46)), $y_1 \ge 0$. The variant of the Perron-Frobenius Theorem
includes the case when the matrix is reducible. Thus, if $q_0$ is defined as in (14) and
$q_0 > 0$,
$$\alpha_1 = y_1^T q_0 > 0. \qquad (23)$$
(i) If A is also irreducible, then, by the Perron-Frobenius theorem (see Varga (1962,
p 30)), $\lambda_1 = \rho(A)$, as well as being real and positive, is a simple eigenvalue of A.
Hence (17) approaches a definite limit, and since from (23), $\alpha_1 \neq 0$, then (18) and
(19) yield $\rho(A)$.
(ii) If A is reducible, but still has a simple dominant eigenvalue, then the above
remains true by the variant of the Perron-Frobenius Theorem. Alternatively, if A
is reducible, and has an eigenvalue $\lambda_1 = \rho(A)$ of multiplicity r, then equation
(20) applies and yields
$$\frac{q_v^{(k)}}{q_{v-1}^{(k)}} \to \frac{1}{\lambda_1 - 1}, \qquad (24)$$
provided that the limiting vector $\alpha_1 x_1 + \alpha_2 x_2 + \cdots + \alpha_r x_r$ in (20) is not the zero
vector. However, since $x_1, x_2, \ldots, x_n$ form a linearly independent set, such a zero
vector can only occur if $\alpha_1 = \alpha_2 = \cdots = \alpha_r = 0$. This cannot be, since by (23),
$\alpha_1 > 0$. (18) and (19) may then be used to find $\lambda_1 = \rho(A)$. ■
Theorem 3.4: For a non-negative matrix A, which does not have a complete set of
eigenvectors, but for which $\rho(A) < 1$, the Inverse Power method is certain to converge
and yield the spectral radius of A, when the initial vector $q_0$ is chosen with strictly
positive elements and $\sigma$, the approximate eigenvalue, is chosen to be equal to one.
Proof: If A does not have a complete set of linearly independent eigenvectors, then A
may still be either irreducible or reducible.
(i) If A is irreducible, then, by the Perron-Frobenius Theorem, it has a simple, real
eigenvalue $\lambda_1 = \rho(A)$ and hence $(I - A)^{-1}$ has a simple, real, dominant
eigenvalue $\frac{1}{1 - \lambda_1}$. Therefore the matrix $(I - A)^{-1}$ can be factorised into the
Jordan form
$$(I - A)^{-1} = (x_1\; x_2\; \cdots\; x_n)
\begin{pmatrix} \frac{1}{1-\lambda_1} & 0\\ 0 & \tilde{J} \end{pmatrix}
\begin{pmatrix} y_1^T\\ \vdots\\ y_n^T \end{pmatrix}, \qquad (25)$$
where $\tilde{J}$ consists of Jordan blocks (see Stewart (1973, p 270)) involving the
lesser eigenvalues $\lambda_2, \lambda_3, \ldots$. In this case, $x_1$ is still the right eigenvector and $y_1^T$
the left eigenvector corresponding to the dominant eigenvalue $\lambda_1 = \rho(A)$, since
the eigenvectors of A and $(I - A)^{-1}$ are the same. Following similar arguments to
Theorem 3.3, it can then be shown that $\alpha_1 > 0$.
(ii) If A is reducible then $(I - A)^{-1}$ can be factorised into the form
$$(I - A)^{-1} = (x_1\; x_2\; \cdots\; x_n)
\begin{pmatrix} \hat{J} & 0\\ 0 & \tilde{J} \end{pmatrix}
\begin{pmatrix} y_1^T\\ \vdots\\ y_n^T \end{pmatrix}, \qquad (26)$$
where $\hat{J}$ is a set of Jordan blocks associated with $\lambda_1$ and is $r \times r$ $(1 \le r \le n)$, and $\tilde{J}$
consists of Jordan blocks associated with the lesser eigenvalues $\lambda_2, \lambda_3, \ldots$.
It is observed then that (25) is a special case of (26) with $r = 1$. Thus we can merge the
two cases (i) and (ii).
More particularly, (26) can be factorised in the form
$$(I - A)^{-1} = (x_1\; x_2\; \cdots\; x_n)
\begin{pmatrix} J_r & & & \\ & J_s & & \\ & & J_t & \\ & & & \ddots \end{pmatrix}
\begin{pmatrix} y_1^T\\ y_2^T\\ \vdots\\ y_n^T \end{pmatrix}, \qquad (27)$$
where $J_r$ (associated with $\frac{1}{1-\lambda_1}$), $J_s$ (associated with $\frac{1}{1-\lambda_2}$), $J_t$ (associated with
$\frac{1}{1-\lambda_3}$), ... are sets of Jordan blocks of one or both of the forms
$$\begin{pmatrix} \frac{1}{1-\lambda_i} & 1 & & \\ & \frac{1}{1-\lambda_i} & \ddots & \\ & & \ddots & 1\\ & & & \frac{1}{1-\lambda_i} \end{pmatrix}, \qquad
\begin{pmatrix} \frac{1}{1-\lambda_i} & & \\ & \ddots & \\ & & \frac{1}{1-\lambda_i} \end{pmatrix}. \qquad (28)$$
Hence, from (27), we have the result that
$$(I - A)^{-v} = (x_1\; x_2\; \cdots\; x_n)
\begin{pmatrix} J_r^v & & \\ & J_s^v & \\ & & \ddots \end{pmatrix}
\begin{pmatrix} y_1^T\\ y_2^T\\ \vdots\\ y_n^T \end{pmatrix}, \qquad (29)$$
where $J_r^v, J_s^v, \ldots, J_w^v$ are matrices of the form (30), or $m \times m$ $(2 \le m \le r)$ matrices of the
form (31), or a combination of these:
$$\begin{pmatrix} \left(\frac{1}{1-\lambda_i}\right)^{v} & & \\ & \ddots & \\ & & \left(\frac{1}{1-\lambda_i}\right)^{v} \end{pmatrix}, \qquad (30)$$
$$\begin{pmatrix}
\left(\frac{1}{1-\lambda_i}\right)^{v} & \binom{v}{1}\left(\frac{1}{1-\lambda_i}\right)^{v-1} & \cdots & \binom{v}{m-1}\left(\frac{1}{1-\lambda_i}\right)^{v-m+1}\\
 & \left(\frac{1}{1-\lambda_i}\right)^{v} & \ddots & \vdots\\
 & & \ddots & \binom{v}{1}\left(\frac{1}{1-\lambda_i}\right)^{v-1}\\
 & & & \left(\frac{1}{1-\lambda_i}\right)^{v}
\end{pmatrix}. \qquad (31)$$
See Blum (1972, p 73) for further information on the derivation of (31).
If the only Jordan block associated with the dominant eigenvalue $\lambda_1$ is of the form (30),
then
$$\frac{q_v}{q_{v-1}} = \frac{(I-A)^{-v}q_0}{(I-A)^{-(v-1)}q_0}
\to \frac{(x_1\; x_2\; \cdots\; x_n)\left(\frac{1}{1-\lambda_1}\right)^{v}\begin{pmatrix} I_r & 0\\ 0 & 0 \end{pmatrix}\begin{pmatrix} y_1^T\\ \vdots\\ y_n^T \end{pmatrix}q_0}
{(x_1\; x_2\; \cdots\; x_n)\left(\frac{1}{1-\lambda_1}\right)^{v-1}\begin{pmatrix} I_r & 0\\ 0 & 0 \end{pmatrix}\begin{pmatrix} y_1^T\\ \vdots\\ y_n^T \end{pmatrix}q_0}, \qquad (32)$$
as $v \to \infty$. Note that the ratio of the vectors $q_v$ and $q_{v-1}$ is defined to be the ratio of
corresponding elements. Furthermore, such ratios will always exist, since $q_v$ (and also
$q_{v-1}$) is strictly positive. This is so, since $q_v = (I - A)^{-v}q_0$, with $q_0 > 0$, and $(I - A)^{-1}$ is
non-negative and cannot have a complete row of zeros (otherwise it would be singular).
Expanding the right-hand side of (32) gives:
$$\frac{q_v}{q_{v-1}} \to \frac{1}{1-\lambda_1}\cdot\frac{\alpha_1 x_1 + \alpha_2 x_2 + \cdots + \alpha_r x_r}{\alpha_1 x_1 + \alpha_2 x_2 + \cdots + \alpha_r x_r}, \quad (r \ge 1). \qquad (33)$$
Again, $x_1$ is still a right eigenvector and $y_1^T$, from (32), a left eigenvector corresponding
to the dominant eigenvalue $\lambda_1 = \rho(A)$. Thus $\alpha_1 > 0$, as before.
Then the limiting vector in (33) cannot be the zero vector, since $\alpha_1 > 0$ and $x_1, x_2, \ldots, x_r$
are linearly independent. Therefore, from (33),
$$\frac{q_v^{(k)}}{q_{v-1}^{(k)}} \to \frac{1}{1-\lambda_1}. \qquad (34)$$
On the other hand, if the largest Jordan block associated with $\lambda_1$ is of the form (31) and,
without loss of generality, this block is placed first in the Jordan form, then
$$\frac{q_v^{(k)}}{q_{v-1}^{(k)}} \to \frac{1}{1-\lambda_1}\cdot\frac{\alpha_m x_1^{(k)}}{\alpha_m x_1^{(k)}}. \qquad (35)$$
Using a similar argument to that for $\alpha_1 > 0$, it may be shown that $\alpha_m > 0$, and since
$x_1 \ne 0$, $\alpha_m x_1$ cannot be the zero vector. Hence (35) approaches the same limit as (34).
Equations (18) and (19) may then be used to find $\rho(A)$. This completes the proof of
Theorem 3.4. ■
Theorem 3.3 and Theorem 3.4 together show that the Inverse Power method is certain to
converge, whether A has a complete set of eigenvectors or not.
A potential computational problem arises if $\alpha_1$, which we have shown must be strictly
positive, is nevertheless extremely small. This issue is addressed by a number of authors
who provide evidence that, even if $\alpha_1 = 0$, rounding errors will eventually make it non-
zero and convergence to the dominant eigenvalue occurs as a consequence. See Stewart
(1972, pgs 343-344) and Ralston and Rabinowitz (1978, pgs 494-495).
Example 3:
When the Inverse Power method was applied to the following matrix
$$A = \begin{pmatrix} 0 & 0.65 & 0.55\\ 0.25 & 0.05 & 0.1\\ 0.25 & 0.05 & 0 \end{pmatrix},$$
the method converged to the correct six decimal approximation $\rho(A) = 0.601741$, in 9
iterations.
3.5 Conclusion
The formulation of the system of linear equations (1) for a Leontief input-output system
should ensure ideally that the final demand vector F is strictly positive. This then avoids
the possibility of a degenerate system and a violation of the linearity assumption. In
circumstances where a particular element of F is zero, this should be made a small
positive number. The resulting effect on the solution vector X can then be controlled via
equation (3).
Formulating the Leontief input-output system in this way, with zero elements in F made
positive, it is possible then to state that a necessary and sufficient condition for the
existence of a unique, positive solution to the system (1) is that $\rho(A) < 1$. This is the so-
called Spectral Radius condition, and is an equivalent alternative to the Hawkins-Simon
condition.
The Spectral Radius condition can often be verified simply by the use of certain results
that provide bounds for $\rho(A)$. A useful bound is provided by Theorem 3.2.
If it is required to calculate $\rho(A)$, this can be achieved by use of the Inverse Power
method, which is certain to converge with the choice of a strictly positive initial vector.
In certain circumstances, such as a simulation, it may be desirable to answer the question:
Does the input-output system have a unique positive solution? In such circumstances it
may not be required to actually find the solution. Calculating $\rho(A)$ may be a more
efficient way of answering this question, since the computation of $\rho(A)$ may require less
time than calculating the full solution.
There are situations where a direct calculation of $\rho(A)$ may be desirable in preference to
reliance on a bound for $\rho(A)$. For example, an estimate may be required for the
condition number of the system (1). Such an estimate is provided readily by the new
result
$$\frac{\rho(A)}{1 - \rho(A)} \le \kappa(I - A) \le \frac{n\,\rho(A)}{1 - \rho(A)}, \qquad (36)$$
where $\kappa(I - A)$ is the condition number of the system (1). This result is proved in
chapter 8. This may then be considered as a more efficient way of estimating the
condition number $\kappa(I - A)$ than by use of $\kappa(I - A) = \|I - A\|\,\|(I - A)^{-1}\|$, which requires
the full calculation of $(I - A)^{-1}$. This new result (36) could then be evaluated alongside
other methods for determining the condition number. This estimate of the condition
number can be used consequently to determine the precision of the computer arithmetic
needed to solve accurately the system (1). For example, if the condition number is of the
order p10 , then p digits of accuracy may be lost in the solution, and therefore the
precision of arithmetic used must be substantially greater than p decimal digits of
accuracy.
Furthermore the condition number estimate can be used to determine the accuracy of the
solution via standard error bounds such as
$$\frac{\|X - \tilde{X}\|}{\|X\|} \le \kappa(I - A)\,\frac{\|R\|}{\|F\|}, \qquad (37)$$
where $\tilde{X}$ is the computed solution and R is the residual vector $F - (I - A)\tilde{X}$.
See Stewart (1973, pgs 192-198) for further information on these bounds. Also the
estimate of the condition number can be used to determine the size of a small perturbation
in F, in order to ensure the positivity of F and retain satisfactory accuracy in the
solution vector X. This can be done via equation (3).
It is worth noting that the method described in this chapter has obvious applications to
other areas such as:-
(i) finding the 2-norm of a matrix, where $\|A\|_2 = \sqrt{\rho(A^T A)}$ (see the sketch after
this list);
(ii) finding the first few eigenvalues of a covariance matrix (this is important
in factor analysis, a statistical technique);
(iii) verifying that a matrix is convergent (i.e. $A^n \to 0$ as $n \to \infty$) (this will
happen if and only if $\rho(A) < 1$);
(iv) finding a circle within the complex plane inside which are located all the
zeros of a polynomial. (This can be done via the companion matrix of the
polynomial.)
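As an illustration of application (i), a sketch assuming Python with NumPy (the function two_norm_via_rho is an invented name; a plain power iteration on the symmetric matrix $A^TA$ stands in for the methods of this chapter):

```python
import numpy as np

def two_norm_via_rho(A, iters=200):
    """Application (i): power iteration on A^T A gives rho(A^T A), whose
    square root is the matrix 2-norm."""
    B = A.T @ A                       # symmetric, non-negative here
    q = np.ones(B.shape[0])
    for _ in range(iters):
        q_new = B @ q
        rho = q_new.max() / q.max()   # ratio of corresponding components
        q = q_new / np.linalg.norm(q_new)
    return np.sqrt(rho)

A = np.array([[0.0, 0.65, 0.55],
              [0.25, 0.05, 0.1],
              [0.25, 0.05, 0.0]])
print(two_norm_via_rho(A), np.linalg.norm(A, 2))  # the two values agree
```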
CHAPTER 4
A NEW METHOD FOR CALCULATING THE SPECTRAL RADIUS OF AN
INPUT-OUTPUT MATRIX
4.1 Conditions for a Unique, Positive Solution of the Model
There are two well-known methods for determining whether the system
(I - A) X = F (1)
has a unique, positive solution. One is the Hawkins-Simon condition. The other is the
Spectral Radius condition, a proof of which was given previously in Theorem 3.1.
The purpose of the present chapter is to derive a robust computational method for
calculating the spectral radius of a non-negative matrix, such as the matrix A in the
Leontief model. It is known that for the Leontief input-output model, 1)( A , since the
column sum of A does not exceed unity. However, it may be necessary to calculate
)(A explicitly in order to verify that )(A is indeed strictly less than one.
4.2 Bounds on the Spectral Radius of an Irreducible Non-Negative Matrix
A non-negative matrix is either irreducible or reducible. Bounds are developed first for
an irreducible matrix and these results are later extended to reducible matrices. The
following well-known inequality is used in the proof of Theorem 4.1 which follows.
Lemma 4.1: If $b_i, c_i$ $(i = 1, \ldots, n)$ are positive real numbers such that
$$\frac{b_1}{c_1} \le \frac{b_2}{c_2} \le \cdots \le \frac{b_n}{c_n}, \qquad (2)$$
then
$$\frac{b_1}{c_1} \le \frac{b_1 + b_2 + \cdots + b_n}{c_1 + c_2 + \cdots + c_n} \le \frac{b_n}{c_n}. \qquad (3)$$
Proof: Let $b_i = \theta_i c_i$ $(i = 1, \ldots, n)$, where $\theta_i > 0$. Then (2) becomes
$$\theta_1 \le \theta_2 \le \cdots \le \theta_n \qquad (4)$$
and
$$\theta_1(c_1 + c_2 + \cdots + c_n) \le \theta_1 c_1 + \theta_2 c_2 + \cdots + \theta_n c_n \le \theta_n(c_1 + c_2 + \cdots + c_n), \qquad (5)$$
since $\theta_1, \theta_n$ are the minimum and the maximum respectively of the $\theta_i$. Then dividing
(5) by $c_1 + c_2 + \cdots + c_n$ proves the result (3). ■
Theorem 4.1: Let $A \ge 0$ be an $n \times n$ irreducible matrix and $q_0 > 0$ be an arbitrary positive
n-dimensional vector. Defining $q_\nu = Aq_{\nu-1} = A^\nu q_0$, $\nu \ge 1$, let
$$\underline{\lambda}_\nu = \min_{1 \le i \le n}\frac{q_\nu^{(i)}}{q_{\nu-1}^{(i)}} \quad \text{and} \quad \overline{\lambda}_\nu = \max_{1 \le i \le n}\frac{q_\nu^{(i)}}{q_{\nu-1}^{(i)}},$$
where the superscript i represents the $i$th component of a vector. Then, denoting the
spectral radius of A by $\rho(A)$,
$$\underline{\lambda}_1 \le \underline{\lambda}_2 \le \cdots \le \rho(A) \le \cdots \le \overline{\lambda}_2 \le \overline{\lambda}_1. \qquad (6)$$
Proof: We have $q_\nu = Aq_{\nu-1}$ and $q_{\nu+1} = Aq_\nu$. (7)
Then, if $q_\nu^{(j)}$ is the $j$th component of $q_\nu$, and $a_j^T$ is the $j$th row of A,
$$\frac{q_{\nu+1}^{(j)}}{q_\nu^{(j)}} = \frac{a_j^T q_\nu}{a_j^T q_{\nu-1}}. \qquad (8)$$
However, $a_j^T q_\nu = a_j^T A q_{\nu-1}$, (9)
and
$$a_j^T A = (a_{j1}, a_{j2}, \ldots, a_{jn})\begin{pmatrix} a_1^T\\ a_2^T\\ \vdots\\ a_n^T \end{pmatrix} = a_{j1}a_1^T + a_{j2}a_2^T + \cdots + a_{jn}a_n^T. \qquad (10)$$
Using (9) and (10) in (8) and expanding the denominator of the right-hand side of (8)
gives
$$\frac{q_{\nu+1}^{(j)}}{q_\nu^{(j)}} = \frac{(a_{j1}a_1^T + a_{j2}a_2^T + \cdots + a_{jn}a_n^T)\,q_{\nu-1}}{a_{j1}q_{\nu-1}^{(1)} + a_{j2}q_{\nu-1}^{(2)} + \cdots + a_{jn}q_{\nu-1}^{(n)}}. \qquad (11)$$
Furthermore,
$$\frac{q_\nu^{(i)}}{q_{\nu-1}^{(i)}} = \frac{a_i^T q_{\nu-1}}{q_{\nu-1}^{(i)}}, \quad i = 1, \ldots, n, \qquad (12)$$
and since A is irreducible the numerator and denominator of (12) will always be positive.
(This is so, since $q_0$ is strictly positive and $q_1$ could only contain a zero element if A
possessed a row of zeros, which is impossible when A is irreducible. Continuing the
argument shows that $q_2, q_3, \ldots$ are also positive vectors.)
Next, multiply the numerator and denominator of the R.H.S. of (12) by $a_{ji}$ to obtain
$$\frac{q_\nu^{(i)}}{q_{\nu-1}^{(i)}} = \frac{a_{ji}\,a_i^T q_{\nu-1}}{a_{ji}\,q_{\nu-1}^{(i)}}, \quad i = 1, \ldots, n. \qquad (13)$$
Identifying the ratios in (13) with $\frac{b_i}{c_i}$ in Lemma 4.1, we then have
$$\min_{1 \le i \le n}\frac{q_\nu^{(i)}}{q_{\nu-1}^{(i)}} \le \frac{(a_{j1}a_1^T + a_{j2}a_2^T + \cdots + a_{jn}a_n^T)\,q_{\nu-1}}{a_{j1}q_{\nu-1}^{(1)} + a_{j2}q_{\nu-1}^{(2)} + \cdots + a_{jn}q_{\nu-1}^{(n)}} \le \max_{1 \le i \le n}\frac{q_\nu^{(i)}}{q_{\nu-1}^{(i)}}. \qquad (14)$$
However, this applies for all $j = 1, \ldots, n$. Hence
$$\underline{\lambda}_\nu \le \underline{\lambda}_{\nu+1} \quad \text{and} \quad \overline{\lambda}_{\nu+1} \le \overline{\lambda}_\nu. \qquad (15)$$
Furthermore, Theorem 2.2 (Varga, 1962, p32) shows that either
$$\min_{1 \le i \le n}\frac{q_\nu^{(i)}}{q_{\nu-1}^{(i)}} < \rho(A) < \max_{1 \le i \le n}\frac{q_\nu^{(i)}}{q_{\nu-1}^{(i)}} \qquad (16)$$
or
$$\frac{q_\nu^{(i)}}{q_{\nu-1}^{(i)}} = \rho(A), \quad \text{for all } 1 \le i \le n. \qquad (17)$$
Now either (16) or (17) is true for all $\nu \ge 1$, and so, combining whichever of these applies
together with (15), produces the result (6). ■
The generation of the $q_\nu$ as $A^\nu q_0$ shows that Theorem 4.1 is closely related to the Power
Method for calculating the dominant eigenvalue of a matrix. Theorem 4.1 is also an
extension of the bounds of Collatz (1942). We shall refer to the method of Theorem 4.1
as the Method of Collatz. The proof of Theorem 4.1 is new. Its enunciation appears as
an unproven exercise in Varga (1962, p 34). A query to Varga shed no light on its origin.
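A minimal sketch of the Method of Collatz, assuming Python with NumPy (the function name, iteration cap and tolerance are illustrative); note how the gap between the bounds doubles as a built-in error estimate:

```python
import numpy as np

def collatz_bounds(A, q=None, iters=50, tol=1e-10):
    """Method of Collatz (Theorem 4.1) sketch: iterate q = A q and record
    the min and max component-wise ratios, which bracket rho(A)."""
    A = np.asarray(A, dtype=float)
    q = np.ones(A.shape[0]) if q is None else q
    for _ in range(iters):
        q_new = A @ q
        ratios = q_new / q
        lower, upper = ratios.min(), ratios.max()   # the two bounds of (6)
        if upper - lower < tol:                     # built-in error estimate
            break
        q = q_new / q_new.max()                     # scale against overflow
    return lower, upper

A = np.array([[0.0, 1.0, 0.0],
              [0.0, 0.0, 1.0],
              [1.0, 4.0, 5.0]])    # the primitive matrix of Example 1 below
print(collatz_bounds(A))           # bounds closing in on rho(A) = 5.7287
```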
The following examples demonstrate how well, if at all, this result approximates the
spectral radius of certain types of irreducible matrices.
Example 1:
$$A = \begin{pmatrix} 0 & 1 & 0\\ 0 & 0 & 1\\ 1 & 4 & 5 \end{pmatrix}.$$
  ν   q_ν                     Ratios of corresponding elements   Bounds on ρ(A)
  0   [1, 1, 1]
  1   [1, 1, 10]              1, 1, 10                           1, 10
  2   [1, 10, 55]             1, 10, 5.5                         1, 10
  3   [10, 55, 316]           10, 5.5, 5.7455                    5.5, 10
  4   [55, 316, 1810]         5.5, 5.7455, 5.7278                5.5, 5.7455
  5   [316, 1810, 10369]      5.7455, 5.7278, 5.7287             5.7278, 5.7455
  6   [1810, 10369, 59401]    5.7278, 5.7287, 5.7287             5.7278, 5.7287
Here the matrix A is primitive (see section 1.4 of Chapter 1, or Varga (1962, p35), for the
definition of a primitive matrix), and it is obvious that the sequences of both lower and
upper bounds are converging to the correct spectral radius approximation $\rho(A) = 5.7287$.
In general, scaling of the $q_\nu$ is needed to prevent overflow or underflow. However, this
scaling does not affect the result of Theorem 4.1.
Example 2:
$$B = \begin{pmatrix} 0 & 1 & 0\\ 0 & 0 & 2\\ 3 & 0 & 0 \end{pmatrix}$$

  ν   q_ν          Ratios of corresponding elements   Bounds on ρ(B)
  0   [1, 1, 1]
  1   [1, 2, 3]    1, 2, 3                            1, 3
  2   [2, 6, 3]    2, 3, 1                            1, 3
  3   [6, 6, 6]    3, 1, 2                            1, 3
In this case there is no improvement in the lower and upper bounds for $\rho(B)$, although
these bounds still include the true spectral radius approximation $\rho(B) = 1.8171$. The
matrix B is cyclic (see section 1.4 of Chapter 1, or Varga (1962, p35), for the definition of a
cyclic matrix), so it would appear that converging bounds occur only when the
irreducible matrix is primitive. (This is the subject of the next theorem.) However, the
cycling in the ratios of corresponding elements can be avoided by use of a spectral shift,
this being obtained by simply adding the identity matrix to B. Example 3 shows this.
Example 3: Let
$$C = B + I = \begin{pmatrix} 1 & 1 & 0\\ 0 & 1 & 2\\ 3 & 0 & 1 \end{pmatrix}.$$
C is then a primitive matrix and when the Method of Collatz is applied, this yields, after
15 iterations, the approximation $\rho(C) = 2.8171$. Hence $\rho(B) = \rho(C) - 1 = 1.8171$. The
obvious discrepancy in the behaviour of primitive and cyclic matrices is explained in
Theorem 4.2, but before stating the theorem some preliminary lemmas are needed.
Lemma 4.2: If $A \ge 0$ is primitive, with Jordan basis $x_1, x_2, \ldots, x_n$ corresponding to the
eigenvalues $\lambda_1 > |\lambda_2| \ge |\lambda_3| \ge \cdots$, and $q_0 > 0$, an arbitrary positive vector, is written in
terms of this basis as $q_0 = \alpha_1 x_1 + \alpha_2 x_2 + \cdots + \alpha_n x_n$, then it is guaranteed that $\alpha_1 > 0$.
Proof: Since A is primitive it has a unique eigenvalue of maximum modulus.
Therefore A can be factorised in the form
$$A = X \begin{pmatrix} \lambda_1 & 0\\ 0 & \tilde{J} \end{pmatrix} X^{-1}, \qquad (18)$$
where $\tilde{J}$ is a Jordan matrix associated with the lesser eigenvalues $\lambda_2, \lambda_3, \ldots$,
$X = (x_1\; x_2\; \cdots\; x_n)$ is a matrix with columns that are the right principal vectors of A and
$X^{-1} = (y_1\; y_2\; \cdots\; y_n)^T$ is a matrix with rows which are the corresponding left principal
vectors of A. (See Halmos (1958, p 112ff) for verification that the factorisation (18) is
possible.)
Hence, since $X^{-1}X = I$, the identity matrix, we can conclude that
$$y_1^T x_1 = 1 \quad \text{and} \quad y_1^T x_j = 0, \; j \neq 1. \qquad (19)$$
However $y_1^T$, being the left eigenvector corresponding to $\lambda_1 = \rho(A)$, is the dominant
right eigenvector of $A^T$, and hence, by the Perron-Frobenius Theorem, $y_1 > 0$.
Therefore, $\alpha_1 = y_1^T q_0 > 0$. ■ (20)
Lemma 4.3: If $\tilde{J}$ is a Jordan matrix comprising Jordan blocks associated with the
eigenvalues $\lambda_2, \lambda_3, \ldots$, and $\lambda_1 > |\lambda_2| \ge |\lambda_3| \ge \cdots$, where $\lambda_1$ is real and positive, then
$$\frac{\tilde{J}^\nu}{\lambda_1^\nu} \to 0, \quad \text{as } \nu \to \infty.$$
Proof: The spectral radius of $\tilde{J}$ is $|\lambda_2| < \lambda_1$, so the entries of $\tilde{J}^\nu$ grow no faster than
$|\lambda_2|^\nu$ multiplied by a fixed polynomial in $\nu$. Hence
$$\frac{\tilde{J}^\nu}{\lambda_1^\nu} \to 0 \quad \text{as } \nu \to \infty. ■ \qquad (21)$$
Theorem 4.2: For the Method of Collatz, both the sequences $\{\underline{\lambda}_\nu\}$ and $\{\overline{\lambda}_\nu\}$
converge to $\rho(A)$, from an arbitrary initial positive vector $q_0$, if and only if the
irreducible matrix $A \ge 0$ is primitive. (This is the result included in Varga (1962, p 44)
as an exercise to be proved.)
Proof: If A is primitive, it has a single dominant eigenvalue $\lambda_1$ equal to $\rho(A)$, and A
can be factorised as in (18). Then applying the method of Theorem 4.1,
$$\frac{q_{\nu+1}}{q_\nu} = \frac{X \begin{pmatrix} \lambda_1^{\nu+1} & 0\\ 0 & \tilde{J}^{\nu+1} \end{pmatrix} X^{-1} q_0}{X \begin{pmatrix} \lambda_1^{\nu} & 0\\ 0 & \tilde{J}^{\nu} \end{pmatrix} X^{-1} q_0}, \qquad (22)$$
where the ratio of vectors is intended as the ratio of corresponding elements. Equation
(22) can be re-written in the form
$$\frac{q_{\nu+1}}{q_\nu} = \lambda_1\cdot\frac{X \begin{pmatrix} 1 & 0\\ 0 & \tilde{J}^{\nu+1}/\lambda_1^{\nu+1} \end{pmatrix} X^{-1} q_0}{X \begin{pmatrix} 1 & 0\\ 0 & \tilde{J}^{\nu}/\lambda_1^{\nu} \end{pmatrix} X^{-1} q_0}. \qquad (23)$$
Taking the limit as $\nu \to \infty$, and using Lemma 4.3, we have
$$\frac{q_{\nu+1}}{q_\nu} \to \lambda_1\cdot\frac{X \begin{pmatrix} 1 & 0\\ 0 & 0 \end{pmatrix} X^{-1} q_0}{X \begin{pmatrix} 1 & 0\\ 0 & 0 \end{pmatrix} X^{-1} q_0}. \qquad (24)$$
Let $q_0$ be represented in terms of the principal vectors of A (the columns of X), as
$$q_0 = \alpha_1 x_1 + \alpha_2 x_2 + \cdots + \alpha_n x_n. \qquad (25)$$
Hence (24) becomes
$$\frac{q_{\nu+1}}{q_\nu} \to \lambda_1\cdot\frac{\alpha_1 x_1}{\alpha_1 x_1} = \lambda_1\,[1, 1, \ldots, 1]^T, \qquad (26)$$
since $\alpha_1 > 0$, by Lemma 4.2, and $x_1 > 0$, by the Perron-Frobenius Theorem.
This shows that $\{\underline{\lambda}_\nu\}$ and $\{\overline{\lambda}_\nu\}$ both converge to $\lambda_1 = \rho(A)$.
Conversely, if $A \ge 0$ is irreducible, but not primitive, then it must be cyclic of order t,
and can be factorised into Jordan canonical form as
$$A = X \begin{pmatrix} \Lambda & 0\\ 0 & \tilde{J} \end{pmatrix} X^{-1}, \qquad (27)$$
where $\Lambda = \operatorname{diag}\bigl(\lambda_1 e^{2\pi i k/t}\bigr)$, $k = 0, 1, \ldots, t-1$, and $\tilde{J}$ is a Jordan matrix with diagonal
elements all less than $\lambda_1$ in modulus (Atkinson (1989, p 480)). Again applying the method
of Theorem 4.1,
$$\frac{q_{\nu+1}}{q_\nu} = \frac{X \begin{pmatrix} \Lambda^{\nu+1} & 0\\ 0 & \tilde{J}^{\nu+1} \end{pmatrix} X^{-1} q_0}{X \begin{pmatrix} \Lambda^{\nu} & 0\\ 0 & \tilde{J}^{\nu} \end{pmatrix} X^{-1} q_0}. \qquad (28)$$
Then, using Lemma 4.3,
$$\lim_{\nu \to \infty}\left[\frac{q_{\nu+1}}{q_\nu} - \frac{X \begin{pmatrix} \Lambda^{\nu+1} & 0\\ 0 & 0 \end{pmatrix} X^{-1} q_0}{X \begin{pmatrix} \Lambda^{\nu} & 0\\ 0 & 0 \end{pmatrix} X^{-1} q_0}\right] = 0. \qquad (29)$$
However, because of the nature of $\Lambda$, the ratios in (29) will cycle through the same set of
t values as the index increases from $\nu$ to $\nu + t$. Hence, the method does not converge at all
when A is irreducible, but not primitive. ■
Theorem 4.2 is thus proved and this explains the difference in behaviour of the matrices
in Examples 1 and 2 above.
4.3 Bounds on the Spectral Radius of any Non-Negative Matrix
We now show that Theorem 4.1 can be generalised to cover the case of any non-negative
matrix, not just an irreducible matrix. To prove this generalisation two results are
required: firstly Theorem 3.2, and secondly Lemma 4.4. The following Lemma 4.4 is also
given in Varga (1962, p 32). However an alternative proof is given for it here.
Lemma 4.4: If A is an $n \times n$ non-negative matrix and $q_0 > 0$ is any vector with positive
components $\delta_1, \delta_2, \ldots, \delta_n$, then
$$\min_{1 \le i \le n}\frac{1}{\delta_i}\sum_{j=1}^{n} a_{ij}\delta_j \;\le\; \rho(A) \;\le\; \max_{1 \le i \le n}\frac{1}{\delta_i}\sum_{j=1}^{n} a_{ij}\delta_j. \qquad (30)$$
Proof: Let $D = \operatorname{diag}(\delta_1, \delta_2, \ldots, \delta_n)$, where $\delta_i > 0$ $(i = 1, \ldots, n)$.
Now let $B = D^{-1}AD$. Then B will also be a non-negative matrix, and
$$B = \begin{pmatrix}
a_{11} & \frac{\delta_2}{\delta_1}a_{12} & \cdots & \frac{\delta_n}{\delta_1}a_{1n}\\
\frac{\delta_1}{\delta_2}a_{21} & a_{22} & \cdots & \frac{\delta_n}{\delta_2}a_{2n}\\
\vdots & & \ddots & \vdots\\
\frac{\delta_1}{\delta_n}a_{n1} & \frac{\delta_2}{\delta_n}a_{n2} & \cdots & a_{nn}
\end{pmatrix}. \qquad (31)$$
By using Theorem 3.2, applied to $B^T$, and noting that $\rho(B^T) = \rho(B)$,
$$\min_{1 \le i \le n}\frac{1}{\delta_i}\sum_{j=1}^{n} a_{ij}\delta_j \;\le\; \rho(B) \;\le\; \max_{1 \le i \le n}\frac{1}{\delta_i}\sum_{j=1}^{n} a_{ij}\delta_j. \qquad (32)$$
These are also the bounds for $\rho(A)$, since A and B, being similar, have the same
eigenvalues. ■
Theorem 4.3: Let $A \ge 0$ be an $n \times n$ matrix such that if $q_0 > 0$ then $Aq_0 > 0$. Then the
result of Theorem 4.1 also applies in this case, i.e.
$$\underline{\lambda}_1 \le \underline{\lambda}_2 \le \cdots \le \rho(A) \le \cdots \le \overline{\lambda}_2 \le \overline{\lambda}_1.$$
Proof: The proof here exactly models that of Theorem 4.1, with the condition
$q_0 > 0 \Rightarrow Aq_0 > 0$ ensuring the positivity of the ratios in (11) and (12), and Lemma 4.4
guaranteeing that (16), or its alternative (17), is still valid. ■
If A is primitive, then, by Theorem 4.2, convergence will occur from both ends. If A is
cyclic, the bounds will not improve. If A is reducible, its dominant eigenvector may be
entirely positive or contain some zero elements. In the former case, convergence from
both ends will occur, but in the latter case, the upper bound will converge to the dominant
eigenvalue and the lower bound to a sub-dominant eigenvalue. Examples 1 and 2 have
already demonstrated convergence behaviour when the matrix A is primitive or cyclic.
Examples 4 and 5 show what happens when A is reducible. Example 4 is the case of a
matrix with a dominant eigenvector which is totally positive, and convergence to
$\rho(A) = 0.1$ occurs from both ends. Example 5 is the case of a matrix with a dominant
eigenvector containing a zero element. In this case the lower bound converges to the
subdominant eigenvalue $\lambda_2 = 0.25$, but the upper bound still converges to $\rho(A) = 0.5$. If
all dominant eigenvectors and their associated principal vectors for a reducible matrix
have zeros in corresponding positions, then the lower bound will converge to a sub-
dominant eigenvalue. In Examples 4 and 5, $q_0$ was chosen as $[1, 1]^T$.
Example 4:
$$F = \begin{pmatrix} 0.05 & 0.9\\ 0 & 0.1 \end{pmatrix}$$
Example 5:
$$G = \begin{pmatrix} 0.5 & 0.75\\ 0 & 0.25 \end{pmatrix}$$

  Example 4                        Example 5
  ν    lower     upper             ν    lower     upper
  1    0.1000    0.9500            1    0.2500    1.2500
  2    0.1000    0.1447            2    0.2500    0.6500
  3    0.1000    0.1135            3    0.2500    0.5577
  9    0.1000    0.1002            11   0.2500    0.5002
  10   0.1000    0.1001            12   0.2500    0.5001
  11   0.1000    0.1000            13   0.2500    0.5000
4.4 Rate of Convergence
Since the method of Theorem 4.3 is essentially equivalent to the Power Method, rates of
convergence will be similar. It is a well-known property of the Power Method that
$$e_{\nu+1} \approx \left|\frac{\lambda_2}{\lambda_1}\right| e_\nu,$$
where $e_\nu$ is the error in $\rho(A)$ at the $\nu$th step (see Wilkinson (1965, p 577)).
This is the case when A has a complete set of linearly independent eigenvectors and
$\lambda_1 > |\lambda_2| \ge |\lambda_3| \ge \cdots \ge |\lambda_n|$. If A is still non-defective, but has multiple dominant
eigenvalues $\lambda_1 = \lambda_2 = \cdots = \lambda_r > |\lambda_{r+1}| \ge \cdots \ge |\lambda_n|$, then
$$e_{\nu+1} \approx \left|\frac{\lambda_{r+1}}{\lambda_1}\right| e_\nu$$
(see Isaacson and Keller (1966, p147ff)). In both of the above cases, convergence will be
slow if the convergence ratio is close to 1, and this will occur if $|\lambda_2| \approx \lambda_1$ or
$|\lambda_{r+1}| \approx \lambda_1$. In such a case, convergence may be accelerated by the use of Aitken's
$\Delta^2$ method. This is shown in Example 6.
Example 6:
$$H = \begin{pmatrix} 0.89 & 0.05 & 0\\ 0 & 0.90 & 0\\ 0 & 0 & 0.6 \end{pmatrix},$$
where the final column is the extrapolated Aitken's $\Delta^2$ approximation.

  ν    lower     upper     Aitken
  1    0.6000    0.9400
  2    0.6000    0.9379
  3    0.6000    0.9359    0.9168
  4    0.6000    0.9342    0.9158
  46   0.6000    0.9326    0.9031
  47   0.6000    0.9311    0.9031
  48   0.6000    0.9297    0.9030

The correct spectral radius is obviously $\rho(H) = 0.9$.
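Aitken's $\Delta^2$ extrapolation used above is a one-line computation. A sketch, assuming Python (the geometric test sequence is constructed purely for the illustration; the rounded table entries above are too coarse for a faithful replay):

```python
def aitken(x0, x1, x2):
    """Aitken's Delta-squared extrapolation of three successive iterates:
    x2 - (x2 - x1)^2 / (x2 - 2*x1 + x0)."""
    denom = x2 - 2.0 * x1 + x0
    return x2 - (x2 - x1) ** 2 / denom if denom != 0 else x2

# a sequence converging like the upper bounds of Example 6: error ratio 0.89/0.9
r = 0.89 / 0.9
seq = [0.9 + 0.1 * r**v for v in range(3)]
print(aitken(*seq))   # 0.9, since the error here is purely geometric
```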
For a non-defective matrix, with all eigenvalues equal, we have the trivial case of $A = \lambda I$.
In the case where A does not have a complete set of linearly independent eigenvectors,
then, using the Jordan canonical form of A, it can be shown that $e_{\nu+1} \approx e_\nu$, by the fact
that the error at the $\nu$th step is given by $e_\nu \approx \frac{k}{\nu}$, and this produces very slow
convergence. However Example 7 shows that the rate of convergence can be accelerated by using
$(\nu+1)\overline{\lambda}_{\nu+1} - \nu\overline{\lambda}_\nu$. This result can be proven by the fact that the error at step $\nu$ is
proportional to $\frac{1}{\nu}$; the proof of this follows.
$$(\nu+1)\overline{\lambda}_{\nu+1} - \nu\overline{\lambda}_\nu = (\nu+1)\left(\lambda_1 + \frac{k}{\nu+1}\right) - \nu\left(\lambda_1 + \frac{k}{\nu}\right) = \lambda_1 + k - k = \lambda_1.$$
Example 7:
$$K = \begin{pmatrix} 0.8 & 0.1 & 0\\ 0 & 0.8 & 0\\ 0 & 0 & 0.6 \end{pmatrix}$$

  ν    lower     upper     accelerated
  1    0.6000    0.9000
  2    0.6000    0.8889    0.8778
  3    0.6000    0.8800    0.8622
  4    0.6000    0.8727    0.8508
  38   0.6000    0.8178    0.8028
  39   0.6000    0.8174    0.8027
  40   0.6000    0.8170    0.8026

The correct spectral radius is obviously $\rho(K) = 0.8$.
4.5 A New and Improved Method
For calculating the spectral radius of a non-negative matrix A, with $\rho(A) < 1$, the method
described so far has several drawbacks:
(i) The matrix A may be cyclic, in which case the bounds do not improve.
(ii) If the matrix A is reducible, it may contain a complete row of zeros, in which case
the necessary condition $q_0 > 0 \Rightarrow Aq_0 > 0$ is violated.
(iii) If the matrix A is reducible, it may have a dominant eigenvector containing a zero
element, in which case the lower bound will converge to a subdominant eigenvalue.
Both of the first two difficulties are overcome if matrix A is replaced by $I + A$ (as in
Example 3) or by $(I - A)^{-1}$. However the convergence ratio of the former is at least
$$\frac{|1 + \lambda_2|}{1 + \lambda_1}, \quad \text{and of the latter} \quad \frac{1 - \lambda_1}{|1 - \lambda_2|},$$
and in fact the latter is superior for an input-output system. We shall call the former the
Shifted Collatz Method and the latter the Inverse Collatz Method. However it is
important to note that the method proposed in this chapter is providing both upper and
lower bounds for the spectral radius.
Theorem 4.4: The Inverse Collatz Method has a faster rate of convergence than the
Shifted Collatz Method when applied to an input-output matrix that is primitive.
Proof: We require to prove that
$$\frac{1 - \lambda_1}{|1 - \lambda_2|} \le \frac{|1 + \lambda_2|}{1 + \lambda_1}, \quad \text{i.e.} \quad (1 - \lambda_1)(1 + \lambda_1) \le |1 - \lambda_2|\,|1 + \lambda_2|.$$
For an input-output system $\lambda_1 < 1$ is real and positive, however $\lambda_2$ could be complex.
Let $\lambda_2 = r(\cos\phi + i\sin\phi)$ where $r \le \lambda_1 < 1$. Then
$$(1 - \lambda_1)(1 + \lambda_1) = 1 - \lambda_1^2,$$
and
$$|1 - \lambda_2|\,|1 + \lambda_2| = |1 - \lambda_2^2| = \sqrt{1 - 2r^2\cos 2\phi + r^4} \ge \sqrt{(1 - r^2)^2} = 1 - r^2,$$
since the maximum value of $\cos 2\phi$ is 1.
Since $r \le \lambda_1$, we have $1 - r^2 \ge 1 - \lambda_1^2$, and the result is proven. ■
Theorems 4.1, 4.2 and 4.3 still apply for the method with A replaced by $(I - A)^{-1}$, since
$\rho(A) < 1$ guarantees that $(I - A)^{-1} \ge 0$. The diagram below shows that, if $\rho(A) < 1$, there
cannot exist different dominant eigenvalues of $(I - A)^{-1}$ (some real, some complex),
since the dominant eigenvalues will be those closest to one. Hence, if $\rho(A) < 1$,
$(I - A)^{-1}$ cannot be cyclic.
[Diagram: the eigenvalues $\lambda_1 = \rho(A)$, $\lambda_2$, $\bar{\lambda}_2$ and $\lambda_3$ in the complex plane, with $\lambda_1$ the unique eigenvalue closest to the point 1 on the real axis.]
Furthermore, $(I - A)^{-1}$ cannot contain a complete row of zeros, otherwise its inverse
would not exist. The dominant eigenvalue of A is then easily recoverable from the
dominant eigenvalue of $(I - A)^{-1}$, noting that eigenvalues of $(I - A)^{-1}$ are of the form
$\frac{1}{1 - \lambda}$, where $\lambda$ is an eigenvalue of A.
Example 8:
$$L = \begin{pmatrix} 0 & 0.25 & 0\\ 0 & 0 & 0.5\\ 0.75 & 0 & 0 \end{pmatrix}, \quad \text{using the Shifted Collatz Method:}$$

  ν    lower     upper
  1    1.2500    1.7500
  2    1.3000    1.5833
  3    1.3654    1.5658
  17   1.4542    1.4544
  18   1.4542    1.4543
  19   1.4543    1.4543

Example 9: the same matrix L, using the Inverse Collatz Method:

  ν    lower     upper
  1    1.5172    2.1379
  2    1.6740    1.9770
  3    1.7672    1.9057
  12   1.8324    1.8325
  13   1.8324    1.8325
  14   1.8324    1.8324

The spectral radius from Example 8 is $\rho(L) = 1.4543 - 1 = 0.4543$ and from Example 9 is
$\rho(L) = 1 - 1/1.8324 = 0.4543$. These examples show that the Inverse Collatz Method has
a faster rate of convergence than the Shifted Collatz Method.
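A minimal sketch of the Inverse Collatz Method, assuming Python with NumPy (the function name and tolerances are illustrative); the inverse is never formed, each iterate being obtained by solving a linear system:

```python
import numpy as np

def inverse_collatz(A, iters=200, tol=1e-8):
    """Inverse Collatz Method sketch: the Method of Collatz applied to
    (I - A)^{-1}, with each iterate computed by a linear solve."""
    n = len(A)
    M = np.eye(n) - A
    q = np.ones(n)
    for _ in range(iters):
        q_new = np.linalg.solve(M, q)    # q_new = (I - A)^{-1} q
        ratios = q_new / q
        lower, upper = ratios.min(), ratios.max()
        if upper - lower < tol:
            break
        q = q_new / q_new.max()          # scale against overflow
    mu = 0.5 * (lower + upper)           # dominant eigenvalue of (I - A)^{-1}
    return 1.0 - 1.0 / mu                # recover rho(A) = 1 - 1/mu

L_matrix = np.array([[0.0, 0.25, 0.0],
                     [0.0, 0.0, 0.5],
                     [0.75, 0.0, 0.0]])
print(inverse_collatz(L_matrix))   # approximately 0.4543, as in Example 9
```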
The difficulty mentioned in (iii) above can be avoided by perturbing certain elements of
the matrix to ensure that it is irreducible; see Theorem 4.5 below. This simply requires
adding $\varepsilon$ to all elements on the super diagonal and to the (n, 1) element (i.e. the
(1, 2), (2, 3), ..., (n-1, n) and the (n, 1) elements). In this case convergence is guaranteed
for the Shifted Collatz Method and the Inverse Collatz Method. However convergence
may be slow in general.
Theorem 4.5: If an $n \times n$ non-negative matrix A is reducible it can be made irreducible
by perturbing the elements as described above.
Proof: Let
$$\Delta = \begin{pmatrix} 0 & \varepsilon & & \\ & 0 & \ddots & \\ & & \ddots & \varepsilon\\ \varepsilon & & & 0 \end{pmatrix}, \quad \text{where } \varepsilon > 0.$$
If we construct a directed graph for the matrix $\Delta$, a fully traversable graph from node 1 to
node n can be constructed, therefore $\Delta$ must be irreducible. Hence if the matrix $\Delta$ is
added to any other non-negative matrix A, the newly constructed matrix will also be
irreducible. ■
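The perturbation of Theorem 4.5 is straightforward to construct. A sketch, assuming Python with NumPy (the function make_irreducible and the size of eps are illustrative):

```python
import numpy as np

def make_irreducible(A, eps=1e-8):
    """Theorem 4.5 sketch: add eps to the super-diagonal and the (n,1)
    element, creating the cycle 1 -> 2 -> ... -> n -> 1 in the directed
    graph, so the perturbed matrix is irreducible."""
    n = len(A)
    delta = np.zeros((n, n))
    idx = np.arange(n - 1)
    delta[idx, idx + 1] = eps    # super-diagonal: (1,2), (2,3), ..., (n-1,n)
    delta[n - 1, 0] = eps        # the (n, 1) element closes the cycle
    return A + delta

A = np.array([[0.5, 0.75],
              [0.0, 0.25]])      # the reducible matrix G of Example 5
print(make_irreducible(A))
```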
4.6 Condition of the Dominant Eigenvalue
Ill-conditioning of eigenvalues may occur when some eigenvalues are very close
together. For example, the matrix
$$A = \begin{pmatrix} 0.0011 & 0.9900\\ 0 & 0.0010 \end{pmatrix}$$
has a badly conditioned eigenvalue $\lambda_1 = 0.0011$, which is very close to the other
eigenvalue $\lambda_2 = 0.0010$. Ill-conditioning is evident by reason of the fact that a small
perturbation of $10^{-6}$, in the (2,1) element of A, produces a new dominant eigenvalue
$\lambda_1 \approx 0.0020$. This perturbed eigenvalue does not even agree with $\lambda_1$ to one
significant digit! To determine the
severity of ill-conditioning a result from Stewart (1973, p 296) may be utilised. This
states that if $\lambda$ is a simple eigenvalue of A with right eigenvector x and left eigenvector
y, with $\|x\|_2 = 1$ and $y^T x = 1$, and A is deflated using an orthogonal matrix R such that
$$R^T A R = \begin{pmatrix} \lambda & h^T\\ 0 & C \end{pmatrix},$$
then
$$|\lambda' - \lambda| \le \varepsilon\,\|y\|_2 + O(\varepsilon^2), \qquad (33)$$
where $\lambda'$ is the corresponding eigenvalue of the perturbed matrix $A + E$. Also, $\varepsilon = \|E\|_2$,
$\delta = \|(\lambda I - C)^{-1}\|_2^{-1}$ and $\eta = \|h\|_2$. Hence the numbers $\|y\|_2$, $\delta$ and $\eta$ give a measure of the
condition of the simple eigenvalue $\lambda$. These numbers were calculated for the dominant
eigenvalue of 5000 randomly generated input-output matrices for each of the orders 5, 10,
20, 50, 100, 200. In all but a very few exceptional cases, $\|y\|_2$ was close to 1, $\delta$ was
greater than 0.1 and $\eta$ was less than 1, indicating that typically, the dominant eigenvalue of
an input-output matrix is well-conditioned. The exceptional cases occurred when the
input-output matrix was of low order (5 or 10), and very sparse.
4.7 Conclusion
Approximating the spectral radius of an input-output matrix is an important a priori
method for determining whether the system (1) has a unique, positive solution. In many
cases the Method of Collatz will be entirely satisfactory to approximate the spectral
radius of a non-negative matrix, such as an input-output matrix. However, there are
certain cases (listed in section 4.5) where the method fails when applied to the matrix A,
or fails to provide converging bounds for $\rho(A)$. Most of these cases can be remedied by
either a spectral shift from A to $I + A$ or by replacing A by $(I - A)^{-1}$. The latter
alternative has the more desirable convergence ratio, and is the new approach advocated
in this chapter. Superior results are obtained for a hybrid method, consisting of a
prescribed number of steps of the Method of Collatz and, if convergence has not occurred,
then the Inverse Power Method is applied with $(qI - A)^{-1}$, where q, the upper bound, is
calculated from the initial application of the Method of Collatz. This superiority is
particularly evident in matrices of order less than 20. One nagging difficulty still
remains, however: the case of a reducible matrix with a dominant eigenvector containing
a zero element. In this case the lower bound converges to a subdominant eigenvalue,
even when the hybrid method is used. Fortunately, it may be shown that the upper bound
still converges to the Spectral Radius. See Theorems 3.3 and 3.4. An alternative
approach would be to perturb slightly certain elements of the matrix to ensure that it is
primitive, as described in Theorem 4.5. Convergence of both lower and upper bounds
will result, albeit with a much slower rate of convergence in general. This convergence is
guaranteed by Theorem 4.2. The appealing attribute of the alternative approach is that
with converging lower and upper bounds that continually bracket the spectral radius, the
problem of premature convergence to the wrong value can be avoided. The overall effect
of the small perturbation can be controlled by the use of (33).
Convergence acceleration techniques are useful in cases of slow convergence. These
were described in section 4.4.
Though the new method is specific to a non-negative matrix A such that $\rho(A) < 1$, the
method can be readily extended to deal with any non-negative matrix. This is achieved
by introducing a positive scale factor $\beta$ such that $\rho\bigl(\tfrac{1}{\beta}A\bigr) < 1$. A satisfactory $\beta$ is
easily obtained using Lemma 4.4, and then $\rho(A) = \beta\,\rho\bigl(\tfrac{1}{\beta}A\bigr)$.
Ill-conditioned dominant eigenvalues do not appear to occur frequently for input-output
matrices of realistic order (i.e. > 20).
Wilkinson (1965, p 620) suggests that rounding errors do not grow significantly for the
Inverse Power Method, which is essentially the main part of the hybrid method.
Equation (33) can be used to estimate the effect of inherent errors in the input-output
matrix. These errors are typically of maximum magnitude 0.05, as reported by Sarma
(1977). With $\|y\|_2$ being typically close to 1, this ensures reasonable accuracy for the
calculation of the spectral radius.
CHAPTER 5
The contents of this chapter are included in the paper
Wood, R.J and O'Neill, M.J. (2004). An always convergent method for finding the spectral
radius of a non-negative matrix. ANZIAM J., 45(E): C474-C485. [Online]
http://anziamj.austms.org.au/V45/CTAC2003/Wood/Wood.pdf
AN ALWAYS CONVERGENT METHOD FOR FINDING THE SPECTRAL RADIUS OF AN IRREDUCIBLE NON-NEGATIVE MATRIX
5.1 Introduction
Calculating the spectral radius of a matrix is useful in a number of applications. My
particular interest has been the area of Mathematical Economics known as Input-Output
Analysis, where finding the spectral radius of a non-negative matrix is an important
technique in verifying that a linear system has a unique positive solution (see chapter 3).
5.2 The Power Method
This is a well-known method for approximating the dominant eigenvalue, $\lambda_1$, of a matrix.
In this chapter, discussion is restricted to that of an $n \times n$ non-negative matrix A. The
method proceeds by choosing an initial vector $q_0$ and performing the iterations
$q_\nu = Aq_{\nu-1} = A^\nu q_0$, $\nu \ge 1$. Typically, $q_\nu$ tends to the dominant eigenvector, and the
dominant eigenvalue is usually obtained by one of two methods:
(i) calculation of $\lambda_1^{(\nu)}$, where
$$\lambda_1^{(\nu)} = \frac{u^T A q_\nu}{u^T q_\nu},$$
and u is chosen so that $u^T q_0 \neq 0$. (See Conte and de Boor (1980, p 192)). A smoother
variant of this replaces u by $q_\nu$. (See Golub and Van Loan (1996, p 326).)
(ii) comparison of two corresponding non-zero components of $q_{\nu+1}$ and $q_\nu$. (See
Atkinson (1989, p 604)). The ratio of these components tends to $\lambda_1$.
In order to avoid overflow and underflow, appropriate scaling of $q_\nu$ is carried out at each
step. The convergence ratio of the method is typically $\left|\frac{\lambda_2}{\lambda_1}\right|$, where $\lambda_1$ is the dominant
eigenvalue and $\lambda_2$ is a subdominant eigenvalue. Difficulties can occur with the method
if the matrix A has two eigenvalues of maximum magnitude, or if $\lambda_1$ and $\lambda_2$ have
approximately equal magnitudes. Difficulties can also occur if $q_0$ does not have a
component in the direction of $x_1$, the dominant eigenvector. Convergence will then be to
a subdominant eigenvalue. It is worth noting that rounding errors may intervene and
produce eventual convergence to $\lambda_1$, even in this case. (See Stewart (1973, p343)).
However in the case of a non-negative matrix, a strictly positive initial vector will always
have a positive component in the direction of $x_1$. This is proved in the following
theorem, which applies to any non-negative matrix, reducible or irreducible.
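A minimal sketch of the Power Method with estimate (ii), assuming Python with NumPy (the function name and iteration cap are illustrative):

```python
import numpy as np

def power_method(A, iters=100):
    """Power Method sketch using estimate (ii): the ratio of corresponding
    components of successive iterates tends to the dominant eigenvalue."""
    q = np.ones(A.shape[0])        # strictly positive initial vector
    lam = None
    for _ in range(iters):
        q_new = A @ q
        k = np.argmax(np.abs(q_new))
        lam = q_new[k] / q[k]      # ratio of corresponding components
        q = q_new / np.abs(q_new).max()   # rescale against overflow
    return lam
```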
Theorem 5.1: Let matrix $A \ge 0$, with Jordan basis $x_1, x_2, \ldots, x_n$ corresponding to the
eigenvalues $\lambda_1 = \lambda_2 = \cdots = \lambda_k > |\lambda_{k+1}| \ge \cdots \ge |\lambda_n|$, have an initial $m \times m$
$(1 \le m \le k)$ Jordan block associated with $\lambda_1$. Let $q_0 > 0$ be an arbitrary positive vector,
which when written in terms of this basis is $q_0 = \alpha_1 x_1 + \alpha_2 x_2 + \cdots + \alpha_n x_n$. Then it is
guaranteed that $\alpha_m > 0$, so that $q_0$ has a component in the direction of the dominant
eigenvector $x_1$.
Proof: A can be factorised in the form
$$A = X \begin{pmatrix} J_1 & 0\\ 0 & \tilde{J} \end{pmatrix} X^{-1}, \quad \text{where } X = (x_1\; x_2\; \cdots\; x_n) \qquad (1)$$
and $J_1$ is the initial $m \times m$ Jordan block associated with $\lambda_1$.
(See Halmos (1958, p 112ff) for verification that the factorisation (1) is possible.)
If we pre-multiply both sides of (1) by $X^{-1} = (y_1\; y_2\; \cdots\; y_n)^T$, then
$$\begin{pmatrix} y_1^T\\ \vdots\\ y_n^T \end{pmatrix} A = \begin{pmatrix} J_1 & 0\\ 0 & \tilde{J} \end{pmatrix} \begin{pmatrix} y_1^T\\ \vdots\\ y_n^T \end{pmatrix},$$
and the $m$th row gives $y_m^T A = \lambda_1 y_m^T$, which means $y_m^T$ is a left eigenvector of A
corresponding to the dominant eigenvalue, $\lambda_1$. ($y_m^T \ne 0$, since $X^{-1}$ cannot have a
complete row of zeros.)
Therefore $y_m \ge 0$, by the variation of the Perron-Frobenius theorem. (See section 1.4.)
Now, $y_m^T q_0 = y_m^T(\alpha_1 x_1 + \cdots + \alpha_m x_m + \cdots + \alpha_n x_n) = \alpha_m > 0$, since $y_m^T q_0 > 0$.
Further, if we post-multiply both sides of (1) by X,
$$A\,(x_1\; x_2\; \cdots\; x_n) = (x_1\; x_2\; \cdots\; x_n)\begin{pmatrix} J_1 & 0\\ 0 & \tilde{J} \end{pmatrix},$$
and the first column gives $Ax_1 = \lambda_1 x_1$, which means $x_1$ is a right eigenvector of A
corresponding to the dominant eigenvalue, $\lambda_1$, and is hence non-negative. (It is not
totally zero, since it is a column of the non-singular X.) Then $x_1^T q_0 > 0$, since $x_1 \ge 0$,
$x_1 \ne 0$ and $q_0 > 0$. ■
The theorem can obviously be extended to any $m \times m$ Jordan block for $\lambda_1$. We now
show how the Power Method performs on a selection of test matrices; our aim in
including these examples is to show the deficiencies of the rival methods (i) and (ii). In
each case, $q_0$ was chosen as $[1, 1, \ldots, 1]^T$.
Example 1:
$$A = \begin{pmatrix} 0 & 1 & 0\\ 0 & 0 & 1\\ 1 & 4 & 5 \end{pmatrix}, \quad \lambda_1 = 5.7287, \quad \lambda_{2,3} = -0.3644 \pm 0.2045i.$$
This matrix has a single dominant eigenvalue. Using method (ii) described above the
method converges in 6 iterations to the correct 4 decimal value. The iterates are 10, 5.5,
5.7455, 5.7278, 5.7287, 5.7287. This shows that a standard method works quite well for
this matrix.
Example 2:
$$B = \begin{pmatrix} 0 & 1 & 0\\ 0 & 0 & 2\\ 3 & 0 & 0 \end{pmatrix};$$
this matrix is cyclic; it has 1 real and 2 complex eigenvalues, all of which are equal in
magnitude. In particular $|\lambda_1| = |\lambda_2| = |\lambda_3| = \sqrt[3]{6}$.
Using method (i), it does not converge; it cycles among the values 2, 1.8333, 1.6364.
Example 3:
$$C = \begin{pmatrix} 0 & 1 & 0 & 0\\ 0 & 0 & 1 & 0\\ 0 & 0 & 0 & 1\\ 1 & 0 & 1 & 0 \end{pmatrix}, \quad \lambda_{1,2} = \pm 1.2720, \quad \lambda_{3,4} = \pm 0.7862i.$$
This matrix is also cyclic and has two dominant eigenvalues which are real, and of equal
magnitude but opposite sign. Using method (i), it does not converge; it cycles between
the values 1.2361 and 1.3090.
Example 4:
$$D = \begin{pmatrix} 0.92 & 0.0001\\ 0.0002 & 0.91 \end{pmatrix}, \quad \lambda_1 = 0.9200, \quad \lambda_2 = 0.9100.$$
This matrix has two eigenvalues, which are both real and approximately equal in
magnitude. When method (i) is applied to the above matrix, the method converges to the
incorrect value of 0.9121 after two iterations. This occurs if an absolute value
convergence criterion is applied to the eigenvalue approximations. However, if a
different convergence criterion such as suggested in Golub and Van Loan (1996, p332) is
applied then such premature convergence is avoided, but convergence is very slow with
only one decimal place of accuracy after 100 iterations. The Golub and Van Loan
procedure uses an approximation to the left and right eigenvectors to estimate the error at
each step.
Using method (ii) with the Golub and Van Loan convergence criterion, false premature
convergence is avoided, but convergence is also very slow, with only two decimal places
of accuracy after 100 iterations.
Example 5:
$$E = \begin{pmatrix} 0.92 & 1 & 0\\ 0 & 0.5 & 1\\ 0 & 0 & 0.92 \end{pmatrix}, \quad \lambda_1 = \lambda_2 = 0.92, \quad \lambda_3 = 0.5.$$
This matrix has two dominant eigenvalues, which are real and equal. Furthermore, $\lambda_1$
and $\lambda_2$ occur in a $2 \times 2$ Jordan block. Admittedly, this matrix is reducible but can be
made irreducible by adding $10^{-6}$ to the (3, 1) element. The effect of this perturbation can
be estimated by using the result from Stewart (1973) quoted later in the conclusion.
When method (ii) was applied with the Golub and Van Loan convergence criterion,
convergence was extremely slow with only one decimal place of accuracy after 100
iterations. The purpose in the inclusion of this example is to show that, even if a non-
negative matrix is reducible, modifications can be made to estimate its spectral radius.
Great care should be taken, however, in perturbing defective matrices. It is advisable to
estimate the relevant quantities in the above-mentioned result from Stewart (1973).
5.3 Techniques for a Reducible Matrix
If the matrix A is reducible then converging bounds do not necessarily occur. However,
this can be overcome by adding to A the matrix E , where
$$E = \begin{pmatrix} 0 & \varepsilon & & \\ & 0 & \ddots & \\ & & \ddots & \varepsilon\\ \varepsilon & & & 0 \end{pmatrix}, \quad \text{where } \varepsilon > 0 \text{ and small.}$$
E can also be represented by
$$E = \varepsilon\left(\sum_{i=1}^{n-1} e_i e_{i+1}^T + e_n e_1^T\right).$$
This ensures that $A + E$ is irreducible, and the method of this paper can be applied. An
attendant difficulty is then whether the perturbation in matrix A significantly affects the
spectral radius. To determine the impact of this perturbation a result from Stewart (1973,
p296) is helpful. This result states that if $\lambda$ is a simple eigenvalue of A with right
eigenvector x and left eigenvector y, with $\|x\|_2 = 1$ and $y^T x = 1$, and A is deflated using
an orthogonal matrix R such that
$$R^T A R = \begin{pmatrix} \lambda & h^T\\ 0 & C \end{pmatrix},$$
then
$$|\lambda' - \lambda| \le \varepsilon\,\|y\|_2 + O(\varepsilon^2),$$
where $\lambda'$ is the corresponding eigenvalue of the perturbed matrix $A + E$. Also, $\varepsilon = \|E\|_2$,
$\delta = \|(\lambda I - C)^{-1}\|_2^{-1}$ and $\eta = \|h\|_2$. So the numbers $\|y\|_2$, $\delta$ and $\eta$ give a
measure of the condition of the simple eigenvalue $\lambda$. These numbers were calculated for
the dominant eigenvalue of 5000 randomly generated non-negative matrices for each of
the orders 5, 10, 20, 50, 100, 200. In all but a very few exceptional cases, $\|y\|_2$ was close
to 1, $\delta$ was greater than 0.1 and $\eta$ was less than 1, indicating that typically, the dominant
eigenvalue of a non-negative matrix is not greatly affected if the values in E are
appropriately small.
If severe convergence problems are encountered when A is reducible, then use can be
made of the result that an $n \times n$ permutation matrix P exists such that
$$PAP^T = \begin{pmatrix}
A_{11} & A_{12} & \cdots & A_{1s}\\
0 & A_{22} & \cdots & A_{2s}\\
\vdots & & \ddots & \vdots\\
0 & 0 & \cdots & A_{ss}
\end{pmatrix} \qquad (10)$$
where the submatrices $A_{11}, A_{22}, \ldots, A_{ss}$ are square, and are either a $1 \times 1$ null matrix or
irreducible matrices. See Varga (1962, p 46) for details of this result. The matrix $PAP^T$
is referred to as the normal form of A. A procedure for converting to normal form is
given in Senata (1973, Ch 1). It is then possible to apply the Method of Collatz to the
submatrices $A_{11}, A_{22}, \ldots, A_{ss}$, and because they are irreducible, the method is guaranteed
to converge.
5.4 Applying the Method of Collatz
In order to overcome some of the above difficulties, the more robust Method of Collatz
(see Theorem 4.1) was applied to the above examples.
For the matrix A, in Example 1, convergence was quite rapid when the Method of
Collatz was applied. The bounds $(\underline{\lambda}_\nu, \overline{\lambda}_\nu)$ are (1, 10), (1, 10), (5.5, 10), (5.5, 5.7455),
(5.7278, 5.7455), (5.7278, 5.7287). However, for the matrix B in Example 2, the bounds
do not improve. The initial bounds produced are (1, 3), and these do not improve, because
the matrix B is cyclic.
Noting that A is primitive and B is cyclic, we then have an explanation of their different
behaviours. The Method of Collatz is obviously closely related to the Power Method, but
it has the double advantage of being always convergent when A is primitive (see
Theorem 4.2), and also of providing an estimate of the error at each step, and this
overcomes the problem of premature convergence to the wrong value. It is worth noting
that a cyclic matrix can always be converted to a primitive matrix by means of a positive
spectral shift. This is proved in the next theorem.
Theorem 5.2: If the matrix $A \ge 0$ is an $n \times n$ irreducible matrix then the matrix $qI + A$,
where $q > 0$, is primitive.
Proof:
Case (i): If A is cyclic it will have t eigenvalues equal in magnitude to the spectral radius,
and these will lie on the complex circle with radius $\lambda = \rho(A)$. The form of these
eigenvalues will be $\lambda e^{2\pi i k/t}$, $k = 0, 1, \ldots, t-1$; the form of the other eigenvalues will be
$\mu e^{i\theta}$ where $\mu < \lambda$ and $0 \le \theta < 2\pi$, since these eigenvalues must lie inside the spectral
circle. The eigenvalues of $qI + A$ are of the form $q + \lambda e^{2\pi i k/t}$ or $q + \mu e^{i\theta}$, and
these have modulus
$$\sqrt{q^2 + 2q\lambda\cos\tfrac{2\pi k}{t} + \lambda^2} \quad \text{or} \quad \sqrt{q^2 + 2q\mu\cos\theta + \mu^2},$$
attaining a unique maximum value when $k = 0$ or $\theta = 0$ respectively. The maximum
value of the former is $q + \lambda$ and of the latter $q + \mu$. And since $\mu < \lambda$, the matrix $qI + A$
has only one eigenvalue that attains the value $q + \lambda$. Hence the matrix $qI + A$ is
primitive if A is cyclic.
Case (ii): If A is primitive the eigenvalues of $qI + A$ will be of the form $q + \lambda$ or
$q + \mu e^{i\theta}$ where $\mu < \lambda$ and $0 < \theta < 2\pi$. These have maximum modulus $q + \lambda$
or $q + \mu$ respectively. Since $\mu < \lambda$, the former is obviously greater, and is unique.
Hence, $qI + A$ is primitive if A is primitive. ■
Theorem 5.2 can be used to ensure convergence for matrix B (see Example 2), by
applying the Method of Collatz to the matrix B₁ = B + I, where I is the 3×3 identity
matrix. B₁ is then a primitive matrix and after 15 iterations, the approximation
ρ(B₁) ≈ 2.8171 is obtained. Hence ρ(B) = ρ(B₁) − 1 ≈ 1.8171. The question then arises
as to what might be an optimal shift to give the most favourable rate of convergence.
This is not easily answered except in the case of a real symmetric matrix. As remarked
by Stewart (1973, p342), the search for an optimal shift is not very satisfactory in
automatic computation. An alternative approach is adopted in this chapter, where the
Method of Collatz is applied to the matrix (qI − A)^{-1} instead of the matrix A. However,
first it is necessary to show that (qI − A)^{-1} is primitive.
Theorem 5.3: If A ≥ 0 is an n×n irreducible matrix with ρ(A) < q, then (qI − A)^{-1} is a
non-negative irreducible matrix. Furthermore it is primitive.

Proof: We first show (qI − A)^{-1} is non-negative and irreducible. Since ρ(A) < q,

    (qI − A)^{-1} = (1/q)(I − A/q)^{-1} = (1/q)(I + A/q + A²/q² + … + Aⁿ/qⁿ + …).     (2)

Therefore, since A is non-negative, (qI − A)^{-1} is also non-negative. Furthermore, the
sum of the series (2) must be irreducible since A is irreducible. We next show the matrix
(qI − A)^{-1} is primitive.
Case (i): If A is cyclic with index t, the eigenvalues of (qI − A)^{-1} will have the form

    (q − λe^{2πik/t})^{-1}, k = 0, 1, …, t−1,   or   (q − μe^{iθ})^{-1},

where μ < λ and 0 ≤ θ < 2π. These have modulus

    (q² − 2qλ cos(2πk/t) + λ²)^{-1/2}   or   (q² − 2qμ cos θ + μ²)^{-1/2}

respectively. Unique maximum values for each are obtained when k = 0 or θ = 0
respectively. Hence the spectral radius of (qI − A)^{-1} is (q − λ)^{-1} or (q − μ)^{-1}
and, since we know that μ < λ < q, the matrix (qI − A)^{-1} will have a single dominant
eigenvalue (q − λ)^{-1}. Therefore (qI − A)^{-1} is primitive when A is cyclic.

Case (ii): If A is primitive the eigenvalues of (qI − A)^{-1} will have the form (q − λ)^{-1}
or (q − μe^{iθ})^{-1}, where μ < λ and 0 ≤ θ < 2π. As above, the maximum modulus is
(q − λ)^{-1}, and this is unique. Hence we can say that the matrix (qI − A)^{-1} is
primitive when the matrix A is primitive. ■
Unfortunately, if A is reducible, (qI − A)^{-1} may also be reducible. Example 6 shows such
a case.

Example 6: If A = [1 1; 0 1], then, with q = 3,

    (qI − A)^{-1} = [ 0.5  0.25 ]
                    [ 0    0.5  ]

Both A and (qI − A)^{-1} are reducible.
Corollary 5.1: If A is an irreducible, non-negative matrix, and ρ(A) < q, then the
Method of Collatz applied to the matrix (qI − A)^{-1} is certain to converge.
Proof: Follows from Theorems 4.1, 4.2 and 5.3. ■
In this method it is not necessary to explicitly calculate (qI − A)^{-1}, but merely to
calculate the solution of the linear system (qI − A)x = y at each iteration, noting that a
single LU decomposition of (qI − A) will suffice for all iterations.
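A sketch of this scheme (hedged; names are illustrative) factorises qI − A once and reuses the factors at every step, converting the bounds for ρ((qI − A)^{-1}) back to bounds for ρ(A) via ρ((qI − A)^{-1}) = 1/(q − ρ(A)):

    import numpy as np
    from scipy.linalg import lu_factor, lu_solve

    def inverse_collatz(A, q, tol=1e-4, max_iter=1000):
        # Apply the Collatz bounds to (qI - A)^{-1} without forming the
        # inverse: one LU decomposition of (qI - A) serves every iteration.
        n = A.shape[0]
        lu, piv = lu_factor(q * np.eye(n) - A)
        x = np.ones(n)
        for _ in range(max_iter):
            y = lu_solve((lu, piv), x)          # y = (qI - A)^{-1} x
            lo, hi = (y / x).min(), (y / x).max()
            if 1.0 / lo - 1.0 / hi < tol:       # interval width for rho(A)
                break
            x = y / y.max()
        # Bounds for rho((qI - A)^{-1}) map to bounds for rho(A):
        return q - 1.0 / lo, q - 1.0 / hi       # (lower, upper) for rho(A)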
This method is obviously closely related to the Inverse Power Method, but has several
advantages: it is always convergent when A is irreducible and it gives an estimate of the
error at each step. It also has an advantage over the Method of Collatz in that it applies to
any irreducible, non-negative matrix, not just a primitive matrix. Since ρ(A) ≤ ‖A‖∞,
choosing q > ‖A‖∞ will ensure ρ(A) < q. This method was applied to Examples 1-5, with
the following results:
Example 1 converged in 17 iterations. Example 2 converged in 8 iterations. Example 3
converged in 9 iterations. Example 4 converged in 2 iterations. Example 5 had not
converged after 100 iterations, but if q was reset to the last upper bound, it converged in
a further 14 iterations. All convergence was to the correct 4-decimal eigenvalue.
5.5 Conclusion
This chapter presents an always convergent method for finding the spectral radius of a
non-negative, irreducible matrix. It is a method closely related to the Power Method and
the Inverse Power Method. However it has advantages over both of these methods, viz
certainty of convergence, a reliable estimate of the error at each step and the ability to
restart the iterations in the case of very slow convergence, as was seen in Example 5
above.
CHAPTER 6
The contents of this chapter are included in the paper
Wood, R.J. and O'Neill, M.J. (2005): A faster algorithm for identification of an M-Matrix,
http://anziamj.austms.org.au/V46/CTAC2004/Wood/home.html
A FASTER ALGORITHM FOR IDENTIFICATION OF AN M-MATRIX
6.1 Introduction
Some problems involving elliptic partial differential equations, when solved by finite-
difference methods, lead to a linear system where the coefficient matrix is what is known
as an M-matrix (see Young (1971)). Furthermore, when such an M-matrix is sparse there
are well-established iterative techniques for solving the linear system (see Saad (2003, p
28)). M-matrices also occur in linear systems associated with Input-Output analysis in
Economic modelling.
6.2 What is an M-matrix?
An M-matrix and its properties 1) to 4) are defined in section 1.4, p 22.
Saad (2003, p 28) and Young (1971, p 43) give an alternative and simpler method for
determining whether a matrix is an M-matrix. This method is described in the following
theorem.
Theorem 6.1: Let properties 1) and 2) hold for matrix A. Construct the matrix B = I − D^{-1}A,
where D is the diagonal of A. Then A is an M-matrix if and only if ρ(B) < 1.
Proof: See Young (1971, p43). ■
So properties 3) and 4) can be replaced by the condition ρ(B) < 1. The following
examples show the application of Theorem 6.1. All matrices in these examples satisfy
properties 1) and 2).
Example 1: If A = [1 −1/4; −1/4 1], then B = [0 1/4; 1/4 0] and ρ(B) = 1/4 < 1, which
implies that A is an M-matrix.

Example 2: If A = [2 −2; −1 1], then B = [0 1; 1 0] and ρ(B) = 1. So A is not an M-matrix.

Example 3: If A = [2 −3; −1 1], then B = [0 3/2; 1 0] and ρ(B) = √(3/2) > 1. So A is not an
M-matrix.
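As a sketch (illustrative Python; a dense eigenvalue routine stands in here for the Method of Collatz or a norm bound), the test of Theorem 6.1 is only a few lines:

    import numpy as np

    def is_m_matrix_thm61(A):
        # Theorem 6.1 test: with D = diag(A), form B = I - D^{-1}A and
        # check rho(B) < 1.
        B = np.eye(A.shape[0]) - np.diag(1.0 / np.diag(A)) @ A
        return max(abs(np.linalg.eigvals(B))) < 1.0

    print(is_m_matrix_thm61(np.array([[1.0, -0.25], [-0.25, 1.0]])))  # Example 1: True
    print(is_m_matrix_thm61(np.array([[2.0, -3.0], [-1.0, 1.0]])))    # Example 3: False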
It may be possible to check the property ρ(B) < 1 without actually calculating the spectral
radius of B. For example, using the fact that the spectral radius cannot exceed the row or
column norm, calculation of one or other of these norms may establish immediately that
ρ(B) < 1. This is certainly true in Example 1, where ‖B‖∞ = 1/4 and, since ρ(B) ≤ ‖B‖∞,
then ρ(B) ≤ 1/4. Even if this is not possible, as in Examples 2 and 3, noting that B is
non-negative, the Method of Collatz may be used to compute ρ(B). Alternatively, an
appropriate value q > ‖B‖∞ may be selected and a variant of the Inverse Power Method
applied to B, which is essentially an application of the Method of Collatz to the matrix
(qI − B)^{-1}. See chapter 5 for details of this method.
This chapter considers the alternative approach of Theorem 6.2, the proof of which uses
the following easily proved Lemma.
Lemma 6.1: Let A be any real n×n matrix with elements α_ij. If A has non-positive
off-diagonal elements then it can be decomposed into the form A = σI − C, with
σ ≥ max_i α_ii and C ≥ 0.

Proof: Write A = σI − C, i.e.

    [ α_11  α_12  …  α_1n ]   [ σ  0  …  0 ]   [ γ_11  γ_12  …  γ_1n ]
    [ α_21  α_22  …  α_2n ] = [ 0  σ  …  0 ] − [ γ_21  γ_22  …  γ_2n ]
    [  …                  ]   [  …          ]   [  …                 ]
    [ α_n1  α_n2  …  α_nn ]   [ 0  0  …  σ ]   [ γ_n1  γ_n2  …  γ_nn ]

Just choose γ_ij = −α_ij for i ≠ j and γ_ii = σ − α_ii, with σ ≥ max_i α_ii. This then
ensures that C ≥ 0. ■
Theorem 6.2: Let properties 1) and 2) hold for matrix A. Decompose the matrix A into
the form A = σI − C, where σ ≥ max_i α_ii. Then A is an M-matrix if and only if
ρ(C) < σ. (See Berman et al (1979, Ch 2).)

Proof: The proof is quite straightforward when use is made of Theorem 3.8 in Varga
(1962, p 83), which states that if C ≥ 0 is an n×n matrix and σ > 0, then the following are
equivalent:
1) ρ(C) < σ;
2) (σI − C) is non-singular and (σI − C)^{-1} ≥ 0. ■
As with the method described in Theorem 6.1, the properties 3) and 4) can be replaced by
the condition ρ(C) < σ. For the matrix A in Example 1, choosing σ = 2 gives
C = [1 1/4; 1/4 1], and, since ρ(C) = 5/4 < 2, A is an M-matrix. For the matrix A
in Example 2, choosing σ = 3 gives C = [1 2; 1 2] with ρ(C) = 3. So the matrix A is
not an M-matrix. For the matrix A in Example 3, choosing σ = 3 gives C = [1 3; 1 2] with
ρ(C) = (3 + √13)/2 > 3. Therefore the matrix A is not an M-matrix.
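The corresponding sketch for the Theorem 6.2 test (again illustrative Python, with a dense eigenvalue call standing in for a bounding technique):

    import numpy as np

    def is_m_matrix_thm62(A, sigma=None):
        # Theorem 6.2 test: write A = sigma*I - C with sigma >= max_i a_ii,
        # so that C >= 0 (Lemma 6.1), and check rho(C) < sigma.
        if sigma is None:
            sigma = np.diag(A).max()
        C = sigma * np.eye(A.shape[0]) - A
        return max(abs(np.linalg.eigvals(C))) < sigma

    print(is_m_matrix_thm62(np.array([[1.0, -0.25], [-0.25, 1.0]]), sigma=2.0))  # True
    print(is_m_matrix_thm62(np.array([[2.0, -3.0], [-1.0, 1.0]]), sigma=3.0))    # False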
As stated previously, it may be possible to check the property ρ(C) < σ using row or
column norms, without actually calculating the spectral radius of C . This is certainly true
for the matrix C from Example 1. Failing this, the fact that C is non-negative allows the
always-convergent Method of Collatz mentioned previously in Chapter 5 to be used to
compute )(C .
It should be noted that there is considerable flexibility in choosing the σ of Theorem 6.2.
However, if the chosen σ does not produce a matrix C such that ρ(C) < σ, then
increasing the value of σ will not remedy the situation and verify that A is an M-matrix.
This is obvious from the following corollary.

Corollary 6.1: If ρ(C) ≥ σ, increasing the value of σ by k > 0 will only increase
ρ(C) by k.
Proof: Since

    A = σI − C = (σ + k)I − (kI + C) = σ'I − C', where σ' = σ + k and C' = kI + C,

    ρ(C') = ρ(kI + C) = ρ(C) + k,

since the Perron-Frobenius Theorem applies to C ≥ 0. ■

Hence increasing σ by k will only result in the spectral radius of C increasing by k, so
that ρ(C') ≥ σ' whenever ρ(C) ≥ σ.
In both the methods of Theorem 6.1 and 6.2 it may be required, as was seen in Examples
2 and 3, to compute the spectral radius of the non-negative matrix. The procedure for this
is greatly simplified if it is known that we are dealing with an irreducible matrix. The
following two theorems describe conditions under which it is known that the matrices B
and C are irreducible.
Theorem 6.3: If D is a non-singular diagonal matrix then B = I − D^{-1}A is irreducible
if and only if A is irreducible.
Proof: Since D is non-singular the elements of D are non-zero, so D^{-1}A is irreducible
if A is irreducible. Then, since forming I − D^{-1}A could zeroise only the diagonal elements of
D^{-1}A, it does not alter the irreducibility of D^{-1}A. Hence B = I − D^{-1}A is irreducible if
A is irreducible. Conversely, if B is irreducible, then, since A = D(I − B), A is
irreducible. ■
Theorem 6.4: C = σI − A is irreducible if and only if A is irreducible.

Proof: Since forming σI − A could zeroise only the diagonal elements of A, it does not alter the
irreducibility of A. Hence C = σI − A is irreducible if A is irreducible. Conversely,
since A = σI − C, A is irreducible if C is irreducible. ■
Hence, knowing that A is irreducible guarantees that both B and C are irreducible. This
has implications for the Method of Collatz, which operates more efficiently when the
subject matrix is irreducible.
6.3 Computational Aspects
Example 4 shows the results of the Method of Collatz when applied to a particular
matrix A .
Example 4: If A = [10 −9; −11 10], then A satisfies properties 1) and 2), and B is as shown
below. Choosing σ = 11 gives C as shown below. In each case the bounds are
calculated using the Inverse Collatz Method.

    B = [0 0.9; 1.1 0], with eigenvalues λ1 = 0.995, λ2 = −0.995
    C = [1 9; 11 1],    with eigenvalues λ1 = 10.9499, λ2 = −8.9499

    Iteration   Bounds for ρ(B)        Bounds for ρ(C)
        1       0.9947   0.9952       10.9474   10.9524
        2       0.9950   0.9950       10.9499   10.9499

In this table, the first column gives the iteration number, and for each matrix the two
columns give a lower and an upper bound for the spectral radius. In both cases only the
first iteration is necessary, as the upper bound for the spectral radius is in each case
less than the required value for an M-matrix.
Both the method of Theorem 6.1 and that of Theorem 6.2 can be quite robust, in the sense
that they are not unduly affected by an ill-conditioned matrix. The next example shows
this.
Example 5: If

    A = [ 10^p          −(10^p − ε) ]
        [ −(10^p + ε)    10^p       ]

with p moderately large and ε sufficiently small, then A is obviously ill-conditioned. In
fact the condition number κ(A) = ‖A‖∞‖A^{-1}‖∞ is approximately 4×10^(2p)/ε². For this
matrix A,

    B = [ 0              1 − 10^(-p)ε ]
        [ 1 + 10^(-p)ε   0            ]

and ρ(B) = √(1 − 10^(-2p)ε²) < 1. Theorem 6.1 then gives the correct conclusion that A is
an M-matrix. Using σ = 10^p + 1 gives

    C = [ 1          10^p − ε ]
        [ 10^p + ε   1        ]

with ρ(C) = 1 + √(10^(2p) − ε²) < 10^p + 1 = σ.
Theorem 6.2 also gives the correct conclusion that A is an M-matrix. It must be
acknowledged, however, that if 10^(-2p)ε² underflows to zero in approximate arithmetic, a
wrong conclusion will result in each case.
In the method of Theorem 6.1, if D has very small elements, the computation of D^{-1}A
has the potential to compound any errors in A . This is not a problem with the method of
Theorem 6.2. Furthermore, when D has very small elements, it is possible that B will
be a non-symmetric matrix with elements differing widely in magnitude. In such
circumstances, B is likely to have ill-conditioned eigenvalues. Example 6 is a case in
point.
Example 6: If

    A = [ 0.6909    0          −0.0059 ]
        [ −0.8166   10^(-6)     0      ]
        [ −0.9810   −0.0697     0.6909 ]

then

    B = [ 0            0              8.54×10^(-3) ]
        [ 8.166×10^5   0              0            ]
        [ 1.42         1.009×10^(-1)  0            ]
The eigenvalues of B are 8.8943 and −4.4471 ± 7.7019i, which are almost equal in
magnitude. This results in very slow convergence of the Method of Collatz and the
Inverse Collatz Method. The dominant eigenvalue of B is ill-conditioned and this causes
convergence to a slightly incorrect approximation. However, balancing the matrix
remedies this situation. A balanced matrix is one where the sum of the ith row is equal
to the sum of the ith column, and A is said to be balanceable if a positive diagonal
matrix X exists such that X^{-1}AX is balanced. See Parlett and Reinsch (1969).
The method of Theorem 6.2, with σ = 1.6909, yields

    C = [ 1        0        0.0059 ]
        [ 0.8166   1.6909   0      ]
        [ 0.9810   0.0697   1      ]
and none of the above difficulties occur. In view of potential difficulties with the
stability of methods for finding eigenvalues of a non-symmetric matrix, (See Atkinson
(1989, p596ff)), there are obvious advantages in preserving symmetry when A is
symmetric. The following two theorems show that the method of Theorem 6.2 preserves
the symmetry of A , but the method of Theorem 6.1 doesn't necessarily do so.
Theorem 6.5: If matrix A is symmetric, B = I − D^{-1}A is not symmetric in general.

Proof: If B = I − D^{-1}A then B^T = (I − D^{-1}A)^T = I − A^T D^{-1} = I − AD^{-1} ≠ B in general. ■

Theorem 6.6: If matrix A is symmetric then so also is C = σI − A.

Proof: If C = σI − A, then C^T = (σI − A)^T = σI − A^T = σI − A = C. ■
A comparison can be made with respect to the number of operations of each method. The
Method of Theorem 6.1 requires (n² − n) operations to calculate B whereas, with the
Method of Theorem 6.2, only n operations are required to calculate C . Hence when n
is large there is a possibility of greater rounding errors in the method of Theorem 6.1.
The Method of Collatz operates most effectively when the non-negative matrix is
irreducible. If the matrix is reducible, then there are at least two ways to proceed:
i) Convert it to normal form. See section 5.3.
The following example shows how this technique is applied.
Example 7: If

    A = [  8    0   −2    0    0 ]
        [  0   10    0    0    0 ]
        [ −2    0   10    0    0 ]
        [ −2   −2   −2   10   −2 ]
        [  0    0    0    0    8 ]

This matrix is obviously reducible. It satisfies properties 1) and 2) and

    B = [ 0     0     0.25   0    0   ]
        [ 0     0     0      0    0   ]
        [ 0.2   0     0      0    0   ]
        [ 0.2   0.2   0.2    0    0.2 ]
        [ 0     0     0      0    0   ]

Choosing σ = 11 gives

    C = [ 3   0   2   0   0 ]
        [ 0   1   0   0   0 ]
        [ 2   0   1   0   0 ]
        [ 2   2   2   1   2 ]
        [ 0   0   0   0   3 ]
Converting C to normal form, and noting that the normal form will have the same
eigenvalues as C, we obtain

    PCP^T = [ 1   2   2   2   2 ]
            [ 0   3   2   0   0 ]
            [ 0   2   1   0   0 ]
            [ 0   0   0   1   0 ]
            [ 0   0   0   0   3 ]

where the four irreducible diagonal blocks are [1], [3 2; 2 1], [1] and [3]. Applying the
Inverse Method of Collatz to the (2,2) block gives [4, 4.333] as the interval of uncertainty
for the spectral radius of this block and ultimately then for ρ(C). Since all values in this
interval are less than σ = 11, A is confirmed as an M-matrix. This example is chosen to
illustrate the technique, but it should be observed that the column norm of C reveals
immediately that ρ(C) ≤ 7 < 11.
ii) Perturb slightly the reducible matrix, as described in section 5.3.

For the matrix B in Example 7, choose ε = 10^(-6). Then ‖y‖₂ = 1.6210, δ = 0.1523, and
η = 0.3617. The quantities ‖y‖₂, δ and η are explained in section 5.3. Using the bound
given in that section yields |λ' − λ| ≤ 1.6210×10^(-6), and this indicates that the difference
between the spectral radius of B and B + E is probably suspect in the sixth decimal
place.
The Inverse Collatz Method requires the solution of a system of linear equations. If this
is computationally expensive the straight Method of Collatz (Theorem 4.1) can be
applied.
6.4 Conclusion
In this chapter two methods for determining whether a matrix is an M-matrix are
suggested. Both methods overcome the difficulty of showing that A^{-1} exists and is non-
negative, by calculating or bounding the spectral radius of an associated non-negative
matrix. If the non-negative matrix is irreducible, an always convergent method is
available for calculating its spectral radius when this is necessary. If the matrix is
reducible, then two ways are suggested to handle this situation.
The method of Theorem 6.2 requires fewer operations than that of Theorem 6.1 and is
thus potentially faster and more accurate. Also, it avoids the potential problem of
amplification of errors in A, which can occur in the calculation of B = I − D^{-1}A when
some elements of D are very small. However, care should be taken to choose σ large
enough to avoid drastic loss of significant digits in the calculation of C = σI − A. The
method of Theorem 6.2 preserves the symmetry of matrix A , but the method of Theorem
6.1 does not, in general. This has implications for the condition of the eigenvalues of B
and C . To avoid the solution of a large system of linear equations the straight Method of
Collatz rather than the Inverse Collatz Method can be applied. In this context, B may be
cyclic, which results in non-convergence of the method. This problem does not occur
with matrix C . It is reasonable then to conclude that the method of Theorem 6.2 is
computationally superior to that of Theorem 6.1.
CHAPTER 7
The contents of this chapter are included in the paper
Wood, R.J. and O'Neill, M.J. (2007): Finding the Spectral Radius of a large Sparse Non-Negative
matrix, http://anziamj.austms.org.au/ojs/index.php/ANZIAMJ/article/view/117/99
FINDING THE SPECTRAL RADIUS OF A LARGE SPARSE NON-NEGATIVE MATRIX
7.1 Introduction
The accepted method for finding all the eigenvalues of a full matrix is the QR method.
However, if it is desired to find just the dominant eigenvalue of a large sparse, non-
negative matrix there is an element of unnecessary computational expense in using the QR
method. Furthermore, the requirement of the QR method that the subject matrix must first
be converted to Hessenberg form, which is an O(n³) operation, may be prohibitively
expensive and may also destroy the sparsity of that matrix. A more promising approach is
to use a method that calculates only the dominant, or the first few largest eigenvalues of
the matrix, and a method that at the same time preserves the sparsity.
One such approach is to use a Krylov sub-space method such as that of Arnoldi. Another
approach is to develop further a result by Collatz (1942). Both of these methods will be
discussed and a further comparison made with the methods of Orthogonal Iteration and
Simultaneous Iteration.
7.2 The Method of Collatz
A commonly used set of test matrices developed by G.W. Stewart for large, sparse matrix
computation is a set of transition matrices for a particular Markov chain consisting of a
random walk on a (k+1) × (k+1) triangular grid. These matrices are usually denoted
Mark(k). For a description of these matrices see Saad (1992, Ch 2). Mark(14) has
dimension n = 120 and contains 420 non-zero elements. From the definition of the Mark
matrices it is easily shown that they are irreducible. However they are also cyclic, and to
ensure primitivity the Method of Collatz has been applied to the matrix Mark(14) + I,
where I is the identity matrix of order 120, with the following results:
Method of Collatz for Mark(14) + I

    Iteration   Lower bound   Upper bound
        1         1.0525        55.5695
        2         1.5528         3.6045
        3         1.7001         2.6934
       ...
      238         2.0000         2.0001
      239         2.0000         2.0000
The Mark matrices are row stochastic and consequently have a dominant eigenvalue of
one. Consequently Mark(14) + I will have a dominant eigenvalue of 2. When the
Method of Collatz for this matrix was implemented using Matlab it was found that, with a
randomly generated positive initial vector q₀, the number of flops to achieve 4 decimal
places of accuracy in the dominant eigenvalue was 316 800, as calculated by Matlab.
7.3 The Arnoldi Method
The procedure introduced by Arnoldi (1951) begins by building an orthogonal basis
{v₁, v₂, …, v_m} for the Krylov subspace K_m, where K_m = span{v₁, Av₁, …, A^{m-1}v₁} and v₁
is an arbitrarily chosen vector of norm one. The vectors in this basis then form the
successive columns of the matrix V_m in the so-called Arnoldi factorisation

    A V_m = V_m H_m + h_{m+1,m} v_{m+1} e_m^T,

where H_m is an m×m upper Hessenberg matrix and h_{m+1,m} v_{m+1} e_m^T is a rank one matrix.
See Saad (1992, p 148) for further details. For m sufficiently large the eigenvalues of
H_m provide approximations to a set of the eigenvalues of A. The basic Arnoldi
algorithm (see Saad (1992, p 147)) was applied to the test matrix Mark(14), with the
following results.
    m     Flops     ρ(H_m)
    10    39 640    0.9994
    20   127 280    1.0000
    30   262 840    1.0000

Note: The number of flops was calculated by Matlab.
The total number of flops required to determine the dominant eigenvalue correct to 4
decimal places of accuracy was 429 840. It must be acknowledged that, since both
Arnoldi and the Method of Collatz were started with a Matlab-generated positive initial
vector, the number of flops may vary depending on how close that initial vector is to the
dominant eigenvector. To remove this effect Arnoldi and the Method of Collatz were
repeated 100 times with different random initial vectors and the average number of flops
calculated for each. The result was that, for Mark(14), Arnoldi averaged 5.0×10^5 flops
and, for Mark(14) + I, the Method of Collatz averaged 3.7×10^5 flops.
7.4 Other Methods
Other methods that are useful for large, sparse matrices are the methods of Orthogonal
Iteration and Simultaneous Iteration. Orthogonal Iteration begins with an initial n×r
matrix Q₀ with orthonormal columns and generates Z₁ = AQ₀. The QR factorization of
Z₁, viz Z₁ = Q₁R₁, produces the next Q matrix, and the process is repeated to generate
Q₂R₂, Q₃R₃, etc. Under certain conditions for convergence the diagonal entries of the
R_k matrix will eventually approximate the r largest eigenvalues of A. Full details of
the method can be found in Golub and Van Loan (1996, page 332 ff).
The method of Orthogonal Iteration can sometimes be very slow. Accordingly, the
method of Simultaneous Iteration is designed to accelerate the convergence of
Orthogonal Iteration by periodically performing a Schur decomposition. See Stewart
(1976, p 278) for a description of the technique.
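A minimal sketch of Orthogonal Iteration (illustrative Python; the tuned Matlab implementations timed in this chapter are described in the references):

    import numpy as np

    def orthogonal_iteration(A, r=1, iters=200, seed=0):
        # Repeatedly form Z = A Q and re-orthonormalise via QR; the diagonal
        # of R approximates the r largest eigenvalues of A (Golub and Van
        # Loan 1996, p 332 ff). Absolute values are taken because the QR
        # sign convention can flip signs between iterations.
        rng = np.random.default_rng(seed)
        Q, _ = np.linalg.qr(rng.standard_normal((A.shape[0], r)))
        for _ in range(iters):
            Q, R = np.linalg.qr(A @ Q)
        return np.abs(np.diag(R))

With r = 1 this reduces to the Power Method with normalisation at each step.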
The two methods with r = 1 were applied to Mark(14) + I, with the following results
in order to achieve 4-decimal accuracy:

    Method          Total no. of flops
    Orthogonal      2.8×10^5
    Simultaneous    2.7×10^5
A further alternative method is to use the Matlab function eigs to find the dominant
eigenvalue. When eigs was repeated 100 times for the matrix Mark(14) the average
number of flops required to achieve 4-decimal accuracy in the dominant eigenvalue was
approximately 1.7×10^6 (2.5×10^6 flops for Mark(14) + I). A further difficulty
here was that sometimes eigs for Mark(14) converged to the eigenvalue −1 instead of 1.
Also, when the QR method (applied using the Matlab function eig) was applied to
Mark(14) in order to calculate all eigenvalues, the number of flops needed was
approximately 14.8×10^6 (17×10^6 flops for Mark(14) + I).
7.5 Comparison of the methods
For the test matrix Mark(14) the Method of Collatz consistently recorded fewer flops than
the Arnoldi method and eigs in finding the dominant eigenvalue. It must be
acknowledged that the method of Arnoldi, and the Matlab routine eig are also providing
some or all of the subdominant eigenvalues, and so more information than just the
dominant eigenvalue. However, if the sole aim is to find the dominant eigenvalue of a
non-negative matrix then the Method of Collatz would appear to be superior in terms of
the number of flops, when compared to the Arnoldi method and eig routine. The Matlab
routines eigs and eig would appear to be expensive in terms of the number of flops
required when the problem is just to find the dominant eigenvalue.
When it is known that a matrix is irreducible and primitive, the Method of Collatz is
certain to converge to the dominant eigenvalue. For example, with the matrix Mark(14) it
is possible to verify its irreducibility just from the definition of the matrix, since all nodes
are totally connectable. Because of its close relationship to the Power method, Arnoldi
will typically generate a set of m eigenvalues among which will be the dominant
eigenvalue, although convergence to this dominant eigenvalue is not guaranteed. A
further matter of uncertainty with Arnoldi is the choice of m , the dimension of the Krylov
subspace to be used. Choosing m = 10, 20, 30 proved to be effective for Mark(14), but on
the other hand choosing m = 10, 11, 12, …, 21 was not so effective, as the following table
shows.
    m     Flops     ρ(H_m)
    10    39 640    0.9997
    11    46 244    0.9965
    20   127 280    1.0001
    21   138 684    1.0001
In fact, in this case, premature convergence to the wrong value 1.0001 has occurred, and
many more flops were required to achieve 4-decimal accuracy.
For Mark(14) + I, the methods of Orthogonal Iteration and Simultaneous Iteration
required slightly fewer flops than the Method of Collatz, but the numbers of flops were of
the same order. The advantage of the Method of Collatz is that it bounds the dominant
eigenvalue at each step and it is guaranteed to converge. There are conditions under
which Orthogonal Iteration can be guaranteed to converge to the dominant eigenvalue, but
these are not easily verifiable a priori. For Simultaneous Iteration the condition
λ_r > λ_{r+1} (where Q₀ is an n×r matrix) does guarantee convergence for an irreducible
matrix. See Stewart (1976).
7.6 Practicalities for the Method of Collatz.
To ensure irreducibility of any non-negative matrix A, a slight perturbation of A will
suffice, as described in section 5.3. Furthermore, if all the diagonal elements of the
resulting irreducible matrix are zero, a diagonal shift by the identity matrix will ensure that
we are not dealing with a cyclic matrix, and that primitivity is assured. See Theorem 5.2. A
relevant concern when perturbing the elements of A by a small amount is its effect on the
eigenvalues. If the matrix has an ill-conditioned dominant eigenvalue, a small
perturbation may result in a very inaccurate largest eigenvalue. However, another result
given in section 5.3 is helpful in determining the impact of a small perturbation.
Numerical experiments were carried out with 100 randomly generated sparse non-
negative matrices for each of the orders 100, 200, 300, 400. Sparsity density was
randomly selected in the range (0, 0.5). In all but a very few exceptional cases, ‖y‖₂ was
less than 10, and δ was greater than 0.1 and less than 10, indicating that typically, the
dominant eigenvalue of a non-negative matrix is not greatly affected if the values in E
given in section 5.3 are appropriately small. The few exceptional cases occurred when the
sparsity density was very low – less than 2%. Hence for matrices where sparsity is of
such an order it would be wise to convert the matrix to normal form, which is block
upper-triangular with the square diagonal blocks being irreducible matrices or null
matrices. The Method of Collatz may then be applied to those irreducible blocks. In
cases where the Method of Collatz produces very slow convergence a superior rate of
convergence can usually be achieved by a hybrid method consisting of several steps of the
Method of Collatz applied to the matrix A, followed by the Method of Collatz applied to
the matrix (qI − A)^{-1}, with q chosen as the upper bound for the dominant eigenvalue
calculated after several steps of the Method of Collatz for A. See Theorem 5.3, which
guarantees that (qI − A)^{-1} is non-negative and primitive in this situation. However, the
Method of Collatz applied to (qI − A)^{-1} involves the solution of a large system of linear
equations and may be prohibitively expensive if A is too large. When the Hybrid method
(see p 79) was applied to Mark(14) + I, the method converged in 7.6×10^4 flops, which
is 28% of the number of flops for Simultaneous Iteration. In order to investigate
performance of the methods for larger, sparse matrices the following results were
compiled:
Example 1: Mark(50), dimension 1326 × 1326

    Method          Total no. of flops
    Collatz         8.4×10^6
    Hybrid          5.6×10^6
    Arnoldi         1.5×10^7
    Orthogonal      5.7×10^6
    Simultaneous    5.6×10^6

Example 2: Mark(100), dimension 5151 × 5151

    Method          Total no. of flops
    Collatz         6.0×10^7
    Hybrid          6.5×10^7
    Arnoldi         12.8×10^7
    Orthogonal      4.4×10^7
    Simultaneous    4.2×10^7
For very large matrices the Hybrid method starts to lose its computational advantage,
because it involves a solution of a large system of linear equations. For large sparse
matrices of order up to 5151 × 5151 the number of flops for the Collatz and Hybrid
method is of a comparable order with the other methods, including a Krylov based
method. For matrices of order greater than 5151 the comparative advantage in flops of
the Collatz Method and Hybrid Method over the Krylov based Method may be lost.
However the Collatz Method and Hybrid Method are certain to converge and bound the
spectral radius and this is not necessarily so for other Methods.
7.7 Conclusion
This chapter presents the Method of Collatz for finding the spectral radius of a large,
sparse, non-negative, irreducible matrix. The method has an advantage over other
methods presented in this paper, viz Arnoldi, Orthogonal Iteration, and Simultaneous
Iteration, in that it bounds the dominant eigenvalue. For large, sparse matrices the
number of flops is of a comparable order with the other methods. If the convergence of
the Method of Collatz is slow, it can be applied to (qI − A)^{-1} and this has superior
convergence if q is chosen appropriately and sufficiently close to λ₁, the dominant
eigenvalue of A. Thus, the method can be restarted in the case of very slow
convergence, resulting in the so-called Hybrid method as per p 79. The Method of
Collatz and the Hybrid method also have an advantage over the Arnoldi method which
requires the initial size of the Krylov space to be predetermined. A poor choice of the
initial dimension can affect performance of the method, in that more flops are needed,
and in some cases this can lead to premature convergence to the wrong value as shown in
section 7.5. The Method of Collatz and the Hybrid method do not suffer from this
difficulty.
CHAPTER 8
ESTIMATING THE CONDITION NUMBER OF A LEONTIEF SYSTEM
8.1 Introduction

The condition number of a linear system gives a measure of the effect on the final
solution when the coefficient matrix is perturbed. As such it provides an estimate of the
accuracy of the final solution and also indicates the precision of the arithmetic that must
be used to compute the solution. If the condition number is large, then small
perturbations in the coefficient matrix may cause significant errors in the final solution.
Turing (1948) seems to have been the first to use the term condition number, which he
defined in terms of matrix norms, and that definition will be continued in this chapter.
Further information on the condition number may be found in Stewart (1973, Ch 4.).
This chapter derives bounds for the condition number of I − A, the coefficient matrix of
the Leontief linear system used in input-output analysis:

    (I − A)X = F.     (1)

These bounds involve the spectral radius, the trace, the row norm and the column norm of
matrix A. The trace of A, denoted by tr(A), is defined as the sum of the leading diagonal
elements of A, and is also equal to the sum of the eigenvalues of A.

The condition number of I − A, denoted by κ(I − A), will be defined using the
infinity norm as

    κ(I − A) = ‖I − A‖∞ ‖(I − A)^{-1}‖∞,     (2)

where ‖·‖∞ is defined as the maximum absolute row sum.
8.2 A Bound Using the Row Norm
Since the row norm is simple to calculate, there are obvious attractions for a condition
number bound involving ‖I − A‖∞ and ‖A‖∞. In Theorem 8.1 such a bound is derived.

Theorem 8.1: If ‖A‖∞ < 1, then

    1 ≤ κ(I − A) ≤ ‖I − A‖∞ / (1 − ‖A‖∞).     (3)
Proof:

All matrix norms in this proof are infinity norms. Let X ≠ 0, where X is an n-dimensional
column vector. Then

    ‖(I − A)X‖ = ‖X − AX‖ ≥ ‖X‖ − ‖AX‖ ≥ ‖X‖(1 − ‖A‖) > 0, since ‖X‖ > 0 and ‖A‖ < 1.

Hence, if X ≠ 0, then (I − A)X ≠ 0. Therefore (I − A) is non-singular.
Now, since

    (I − A)(I − A)^{-1} = I,

then (I − A)^{-1} − A(I − A)^{-1} = I, i.e. (I − A)^{-1} = I + A(I − A)^{-1}. Hence,

    ‖(I − A)^{-1}‖ ≤ ‖I‖ + ‖A‖‖(I − A)^{-1}‖,

i.e. ‖(I − A)^{-1}‖(1 − ‖A‖) ≤ 1, since ‖I‖ = 1, i.e.

    ‖(I − A)^{-1}‖ ≤ 1/(1 − ‖A‖), if ‖A‖ < 1.     (4)

This result applies for any consistent matrix norm. Applying it to the infinity norm
shows that

    κ(I − A) = ‖I − A‖∞‖(I − A)^{-1}‖∞ ≤ ‖I − A‖∞/(1 − ‖A‖∞).     (5)

This gives an upper bound for κ(I − A).
Also, since

    I = (I − A)(I − A)^{-1},

then 1 = ‖I‖ ≤ ‖I − A‖‖(I − A)^{-1}‖, and it follows that

    κ(I − A) = ‖I − A‖∞‖(I − A)^{-1}‖∞ ≥ 1.     (6)

This gives a lower bound for κ(I − A).
Combining (5) and (6) gives the result (3). ■

The above proof is based on Theorem 3.4, Stewart (1973, p187).
Example 1: If

    A = [ 1/5   1/4 ]
        [ 0     1/3 ]

then

    I − A = [ 4/5   −1/4 ]
            [ 0      2/3 ]

So ‖A‖∞ = 9/20 and ‖I − A‖∞ = 21/20.

Using Theorem 8.1, 1 ≤ κ(I − A) ≤ 21/11 ≈ 1.91.

Using (2), κ(I − A) = 231/128 ≈ 1.80.

Theorem 8.1 provides quite a sharp upper bound in this example.
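The figures above are easily reproduced; the following sketch (illustrative Python) evaluates the definition (2) and the Theorem 8.1 bound for Example 1:

    import numpy as np

    A = np.array([[1/5, 1/4],
                  [0.0, 1/3]])
    I = np.eye(2)
    norm_inf = lambda M: np.abs(M).sum(axis=1).max()   # maximum absolute row sum

    kappa = norm_inf(I - A) * norm_inf(np.linalg.inv(I - A))   # definition (2)
    upper = norm_inf(I - A) / (1 - norm_inf(A))                # Theorem 8.1
    print(kappa, upper)   # about 1.80 and 1.91, as in the text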
The difficulty with Theorem 8.1 is its limited applicability. It is quite feasible to have an
input-output matrix A for which ‖A‖∞ ≥ 1, and in such circumstances, the theorem cannot
be applied.
8.3 A Bound Using the Spectral Radius
It is well-known that ρ(A) < 1 is a necessary and sufficient condition for a unique
solution of the system (1). Accordingly it would be most appropriate to have available a
condition number bound which had as its only restriction that ρ(A) < 1. Such a bound
would then be universally applicable to all systems with a unique solution. Theorem 8.2
provides such a bound.

Lemma 8.1: If ρ(A) < 1, then

    ‖(I − A)^{-1}‖∞ ≥ 1/(1 − ρ(A)).     (7)

Proof: If A is a non-negative n by n matrix with eigenvalues λ₁, λ₂, …, λ_n and
ρ(A) = λ₁, say, then λ₁ is real by the Perron-Frobenius theorem. Also ρ(A) < 1
guarantees the existence of (I − A)^{-1}, and the eigenvalues of (I − A)^{-1} are

    1/(1 − λ₁), 1/(1 − λ₂), …, 1/(1 − λ_n).

Hence,

    ‖(I − A)^{-1}‖∞ ≥ ρ((I − A)^{-1}) = 1/(1 − λ₁) = 1/(1 − ρ(A)).     (8)

See Theorem 3.3 for a verification that 1/(1 − λ₁) is indeed the dominant eigenvalue. ■
Lemma 8.2: If A is an n by n non-negative matrix, then any element on the leading
diagonal of A must be less than or equal to the spectral radius of A.

Proof: Let

    A = [ a11  a12  …  a1n ]
        [ a21  a22  …  a2n ]
        [  …               ]
        [ an1  an2  …  ann ], where A ≥ 0.

Let the largest element on the leading diagonal of A be a_ii for some i = 1, …, n. Let

    U = [ a11  a12  …  a1n ]
        [ 0    a22  …  a2n ]
        [  …               ]
        [ 0    0    …  ann ]

Then ρ(U) = a_ii.

By the Perron-Frobenius theorem, ρ(U) does not decrease when the lower-diagonal
elements of U are increased to make it equal to the matrix A. Therefore

    ρ(A) ≥ ρ(U) = a_ii.

Hence, any element on the leading diagonal of A must be less than or equal to ρ(A). ■
Theorem 8.2: If ρ(A) < 1, then

    ‖I − A‖∞/(1 − ρ(A)) ≤ κ(I − A) ≤ n‖I − A‖∞/(1 − ρ(A)).     (9)

Proof:

    ‖(I − A)^{-1}‖∞ ≥ ρ((I − A)^{-1}), using (6) in section 3.3.     (10)

But, by (8) in the proof of Lemma 8.1,

    ρ((I − A)^{-1}) = 1/(1 − ρ(A)).     (11)

Hence,

    ‖(I − A)^{-1}‖∞ ≥ 1/(1 − ρ(A)),     (12)

and, on multiplying (12) by ‖I − A‖∞,

    ‖I − A‖∞/(1 − ρ(A)) ≤ κ(I − A).     (13)

This gives a lower bound for κ(I − A).

Wong (1954) showed that the elements of (I − A)^{-1} are non-negative and that the
diagonal elements in each row are the maximal elements in that row.

By Lemma 8.2, the largest leading diagonal element of (I − A)^{-1} is less than or equal to

    ρ((I − A)^{-1}) = 1/(1 − ρ(A)).

Using Wong's result, this means that the maximum row sum of (I − A)^{-1} is at most
n/(1 − ρ(A)). Hence,

    ‖(I − A)^{-1}‖∞ ≤ n/(1 − ρ(A)),     (14)

and

    κ(I − A) ≤ n‖I − A‖∞/(1 − ρ(A)).     (15)

This gives an upper bound for κ(I − A).

Combining (13) and (15) gives the result (9). ■
Example 2: Applying Theorem 8.2 to Example 1 shows that

    1.575 ≤ κ(I − A) ≤ 3.15.

Recalling that the actual κ(I − A) is approximately 1.80, it is seen that, although the
upper bound is somewhat coarser than that obtained using Theorem 8.1, it is still of the
same order, viz a number between 1 and 10. As will be explained later, it is the order of
the upper bound which is most significant.
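A corresponding check of the Theorem 8.2 bounds for the same matrix (a sketch; the dense eigenvalue call stands in for a spectral-radius method):

    import numpy as np

    A = np.array([[1/5, 1/4],
                  [0.0, 1/3]])
    n = A.shape[0]
    I = np.eye(n)
    norm_inf = lambda M: np.abs(M).sum(axis=1).max()

    rho = max(abs(np.linalg.eigvals(A)))          # spectral radius, here 1/3
    lower = norm_inf(I - A) / (1 - rho)           # 1.575
    upper = n * norm_inf(I - A) / (1 - rho)       # 3.15
    print(lower, upper)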
8.4 A More Practical Bound
A practical difficulty associated with the bound in Theorem 8.2 is that ρ(A) is not easily
calculated for a matrix of large dimensions. Theorem 8.3 provides a related alternative,
which is easier to calculate.
Lemma 8.3: If A is a real n by n matrix, then

    tr(A) ≤ nρ(A).     (16)

Proof: tr(A) = λ₁ + λ₂ + … + λ_n. Then

    |tr(A)| ≤ |λ₁| + |λ₂| + … + |λ_n| ≤ nρ(A), where ρ(A) = |λ₁|.

Since A is a real matrix,

    tr(A) ≤ |tr(A)|.

The result follows. ■
Theorem 8.3: If ‖A‖₁ < 1 (see (8) in section 3.3 for the definition of ‖A‖₁), then

    n‖I − A‖∞/(n − tr(A)) ≤ κ(I − A) ≤ n‖I − A‖∞/(1 − ‖A‖₁).     (17)

Proof: Using the fact that ρ(A) ≤ ‖A‖₁ < 1, the upper bound of Theorem 8.2 shows that

    κ(I − A) ≤ n‖I − A‖∞/(1 − ‖A‖₁).     (18)

Also, using Lemma 8.3,

    1/(1 − ρ(A)) ≥ n/(n − tr(A)).     (19)

Then, using (18) and (19), the result (17) is proved. ■

Theorem 8.3 provides more practical bounds, since ‖A‖₁ and tr(A) are readily calculated.
Example 3: Applying Theorem 8.3 to the matrix in Example 1 shows that, with
tr(A) = 8/15 and ‖A‖₁ = 7/12,

    63/44 ≤ κ(I − A) ≤ 126/25, i.e. 1.43 ≤ κ(I − A) ≤ 5.04.

This is a slightly coarser bound than that given by Theorem 8.2. Also, it should be noted
that there exist matrices A for which ρ(A) < 1 but ‖A‖₁ ≥ 1, and in these circumstances
Theorem 8.3 is not applicable.
Example 4: The matrix A, where

    A = [ 1/2   1/2 ]
        [ 0     1/2 ]

is a case in point: ρ(A) = 1/2 < 1, but ‖A‖₁ = 1.
8.5 The Best Possible Constants
The constant n in the upper bound of (9) is the best possible, and the lower bound of (9)
is the best possible. This is so, since the upper bound and lower bound are achieved in
some cases. The next two examples show this.
Example 5: This example shows a particular case when the upper bound is attained.

If

    A = [ 3/4   1/2 ]
        [ 0     1/2 ]

then ‖I − A‖∞ = 3/4, ‖(I − A)^{-1}‖∞ = 8, ρ(A) = 3/4, and

    κ(I − A) = 6 = n‖I − A‖∞/(1 − ρ(A)).

Example 6: This example shows a particular case when the lower bound is attained.

If

    A = [ 1/2   0   ]
        [ 0     1/2 ]

then ‖I − A‖∞ = 1/2, ‖(I − A)^{-1}‖∞ = 2, ρ(A) = 1/2, and

    κ(I − A) = 1 = ‖I − A‖∞/(1 − ρ(A)).
8.6 Conclusion

In a system of the form (1) with ρ(A) < 1, Theorem 8.1 provides bounds on the condition
number κ(I − A) when ‖A‖∞ < 1. Theorem 8.2 provides the most general bounds on
κ(I − A). The constants in the upper and lower bounds of Theorem 8.2 are the best
possible, since they are achieved in some cases. However, Theorem 8.2 requires
calculation of the spectral radius of A, which is computationally intensive for a large
matrix. In the case ‖A‖₁ < 1, which will be satisfied by very many input-output systems, a
more computationally available bound is provided by Theorem 8.3, which gives bounds
in terms of ‖A‖₁ and tr(A). It is worth noting that, in all theorems, ‖I − A‖∞ could be
replaced in the upper bound by n. However this is likely to make the upper bound very
conservative.
The condition number of I − A could be calculated from its definition in (2). However,
this is a very computationally intensive process as it requires the full computation of
(I − A)^{-1}. Theorems 8.1, 8.2 and 8.3 provide a much more efficient means of bounding
the condition number.
The practical significance of κ(I − A) is two-fold. Firstly, it gives an indication of the
possible loss of significant digits in solution of the system (1) and hence is a guide to the
precision of the computer arithmetic which must be used to solve the system in order to
obtain an accurate answer. For example, if κ(I − A) is of the order of 10^p, then p
decimal digits of accuracy may be lost in solving the system. For a more detailed
discussion of this aspect of the condition number, see Stewart (1973, p 196). Secondly,
κ(I − A) is a significant indicator of the accuracy of a computed solution. For example,
in the system (1), if there is considerable uncertainty in the external demand vector F, to
the extent that the system (1) may present as the perturbed system

    (I − A)X' = F',     (20)

the accuracy of the solution of the system (20) is then provided by the standard result

    ‖X − X'‖∞/‖X‖∞ ≤ κ(I − A) ‖F − F'‖∞/‖F‖∞.     (21)

See Stewart (1973, pgs 194-198) for further details of this aspect of the condition
number.
Theorem 8.2 emphasises the importance of the spectral radius and the associated Spectral
Radius condition, ρ(A) < 1. This condition and the Hawkins-Simon condition are the
two most well-known conditions for a unique solution of an input-output system of
equations.
CHAPTER 9
AN UPPER BOUND FOR THE SPECTRAL RADIUS OF A NON-NEGATIVE MATRIX
9.1 Introduction
This chapter provides a new proof that the spectral radius of the symmetric part of a non-
negative matrix A provides an upper bound for the spectral radius of A . This theorem
was posed and proved by Levinger and mention was made of this in the Notices of the
American Mathematical Society (1969). However his solution was not printed there. It
was also solved by a number of others: Deutsch, Walsh, Barker, Kuttler, Minc, Mueller
and Thompson. They each had their own unique solution to the problem. However the
solutions by Deutsch and Walsh, which were published in the Notices of the American
Mathematical Society (1970), did not investigate the conditions necessary for equality of
the spectral radii. The conditions under which the two spectral radii are equal are
examined in this chapter.
9.2 A New Proof of an Established Bound
The following theorem establishes that the spectral radius of the symmetric part of a non-
negative matrix A is always greater than or equal to the spectral radius of A .
Theorem 9.1: If λ is the spectral radius of a non-negative matrix A and μ is the
spectral radius of the symmetric part of A, viz. (A + A^T)/2, then λ ≤ μ.

Proof: Matrix A may be deflated using an elementary reflector R formed from the
eigenvector of the dominant eigenvalue λ. The Perron-Frobenius Theorem guarantees
that both eigenvector and eigenvalue will be real. Consequently R will be a real
symmetric matrix and

    R^T A R = [ λ   h^T ]
              [ 0   C   ]     (1)

See Stewart (1973, p279) for details of this deflation procedure. It then follows that

    R^T A^T R = [ λ   0   ]
                [ h   C^T ]     (2)

Combining (1) and (2), we obtain

    R^T ((A + A^T)/2) R = [ λ     h^T/2        ]
                          [ h/2   (C + C^T)/2  ]     (3)

Then, using the fact that the Rayleigh quotient of the right-hand matrix in (3) must be less
than or equal to μ (see Stewart (1973, p312)), we have

    (1, 0, …, 0) [ λ  h^T/2 ; h/2  (C + C^T)/2 ] (1, 0, …, 0)^T = λ ≤ μ,

i.e. λ ≤ μ. ■
As stated earlier, this theorem was posed by Bernard W. Levinger in the Notices of the
American Mathematical Society in 1969. The above proof is a new proof of the same
theorem.
Unfortunately this result does not extend generally to a matrix with negative elements.
Example 1 is a case in point.
Example 1:
    A = [ 1    1 ]
        [ −1   1 ]

For this matrix the eigenvalues are 1 ± i, with ρ(A) = √2 but ρ((A + A^T)/2) = 1.
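Both the bound and its failure for matrices with negative entries are easy to check numerically; a sketch (illustrative Python):

    import numpy as np

    def sym_part_bound(A):
        # Returns (rho(A), rho((A + A^T)/2)); Theorem 9.1 asserts the first
        # is at most the second when A is non-negative.
        rho = lambda M: max(abs(np.linalg.eigvals(M)))
        return rho(A), rho((A + A.T) / 2)

    # Non-negative matrix: the bound holds (3 <= 1 + sqrt(5)).
    print(sym_part_bound(np.array([[2.0, 1.0], [3.0, 0.0]])))
    # Example 1 above (negative entry): the bound fails, sqrt(2) > 1.
    print(sym_part_bound(np.array([[1.0, 1.0], [-1.0, 1.0]])))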
Corollary 9.1: If unit vector x is the dominant eigenvector of the right-hand matrix in
(3), then

    μ = x^T [ λ  h^T/2 ; h/2  (C + C^T)/2 ] x.     (4)

Proof: This follows directly from (3) using the Rayleigh quotient result. ■

Corollary 9.2: If unit vector y is the dominant eigenvector of (A + A^T)/2, then

    μ = y^T R [ λ  h^T/2 ; h/2  (C + C^T)/2 ] R^T y.     (5)

Proof: This follows directly from (4), using the fact that x = R^T y. ■

Results (4) and (5) give some insight into the relationship between λ and μ. For
example, when A is of dimension 2, (4) becomes

    μ = λξ₁² + hξ₁ξ₂ + Cξ₂²,

where x = (ξ₁, ξ₂)^T with ξ₁² + ξ₂² = 1. This shows that μ is dependent upon λ, h, C and x.
In the special case that A is also symmetric, the following theorem shows that further
deductions can be made.
Theorem 9.2: If a non-negative matrix A is symmetric, then (a) h = 0, (b) λ = μ, and
(c) x = e₁ is an eigenvector of R^T A R corresponding to λ.

Proof: (a) Since A is symmetric, the right-hand matrices in (1) and (2) must be
identical. This implies h = 0 and C is symmetric.

(b) If A = A^T, then (A + A^T)/2 = A and hence λ = μ.

(c) From (a) and (b) above, h = 0, C = C^T and λ = μ. Using these facts we can say that
the eigenvector of R^T A R satisfies

    [ λ   0 ] (ξ₁, …, ξ_n)^T = λ (ξ₁, …, ξ_n)^T.
    [ 0   C ]

Hence x = e₁ is then an eigenvector of R^T A R corresponding to λ, since it satisfies this
equation. ■
Converses of Theorem 9.2 are not true.

The following Example 2 shows that none of the conditions (a), (b), and (c) of Theorem
9.2 ensure that A is symmetric. In this example, both A and (A + A^T)/2 have the same
dominant eigenvalue 3 and associated eigenvector (1, 1, 0, 0)^T. Hence λ = μ, and
it may also be shown that h = 0, and x = e₁ is an eigenvector of R^T A R. However, A is
obviously non-symmetric.

Example 2:

    A = [ 2   1   0   0 ]
        [ 1   2   0   0 ]
        [ 0   0   1   1 ]
        [ 0   0   0   0 ]
On the other hand, for some matrices, the quantity ‖h‖₂ will give a measure of the
closeness of A to a symmetric matrix. This can be seen in the following example of a
2×2 matrix.

Example 3:

    A = [ 1    α ]
        [ α²   1 ], where α > 0.

If

    R = [ cos θ   sin θ  ]
        [ sin θ   −cos θ ],

and θ is chosen so that R^T A R = [ λ  h ; 0  C ], it can be shown that h = α − α²,
i.e. ‖h‖₂ = α² − α if α > 1. Hence, ‖h‖₂ is an increasing function of α for
α > 1. This agrees with what we would intuitively expect of a matrix A, which is
becoming more and more unsymmetric as α increases beyond 1.
However, it should be noted that ‖h‖₂ does not always indicate the closeness of A to a
symmetric matrix. Example 4 is a matrix for which it can be shown that h = 0, but the
matrix is obviously far from symmetric.

Example 4:

    A = [ 1   0   1 ]
        [ 1   1   0 ]
        [ 0   1   1 ]
Theorem 9.2 and Example 2 show that there are circumstances, other than when A is
symmetric, in which λ = μ, and hence ρ((A + A^T)/2) is an achievable bound for ρ(A).
We now consider several theorems which establish, for certain classes of non-negative
matrices, precisely when λ = μ. These theorems require the following three lemmas.
Lemma 9.1: If A ≥ 0 is an irreducible n×n matrix, then A cannot have two linearly
independent eigenvectors with non-negative components.

Proof: Let x₁ be the eigenvector of A corresponding to the dominant eigenvalue
λ₁. Then x₁ = (ξ₁, ξ₂, …, ξ_n) > 0, by the Perron-Frobenius theorem. Let x₂ be another
real eigenvector of A corresponding to λ₂; this eigenvalue must then be real, and less
than λ₁. Eigenvectors x₁ and x₂ will then be linearly independent, since they correspond
to different eigenvalues.

Assume that x₂ = (η₁, η₂, …, η_n) ≥ 0. Form the linear combination

    y = c₁x₁ + c₂x₂, where c₁ > 0, c₂ > 0.

Then

    Ay/y = (c₁Ax₁ + c₂Ax₂)/(c₁x₁ + c₂x₂) = (c₁λ₁x₁ + c₂λ₂x₂)/(c₁x₁ + c₂x₂).     (6)

(Note: The ratio in (6) is intended as a ratio of corresponding elements of the vectors Ay
and y.)

A typical ratio of components in (6) will be bounded as follows:

    (c₁λ₁ξ_i + c₂λ₂η_i)/(c₁ξ_i + c₂η_i) < λ₁, if η_i > 0, for all i.

(ξ_i, being the ith component of x₁, will be positive.) This contradicts Theorems 2.1 and
2.2 (Varga 1962, pages 30-32), which imply that

    (c₁λ₁ξ_i + c₂λ₂η_i)/(c₁ξ_i + c₂η_i) ≥ λ₁, for at least one value of i.

Furthermore, if some η_i = 0, the maximum value of the ratio of components in (6) is λ₁
and this still contradicts the result from Varga (1962, p 34). Hence x₂ cannot be non-
negative. ■
This result is quoted as an exercise in Varga (1962, p 34), but is not proved there. The
proof given here is new.
Lemma 9.1 does not extend generally to a reducible matrix as Example 5 shows.
Example 5:

    A = [ 2   1   0   0 ]
        [ 1   2   0   0 ]
        [ 0   0   2   1 ]
        [ 0   0   1   2 ]

Matrix A is obviously reducible but has non-negative linearly independent eigenvectors
(1, 1, 0, 0)^T and (0, 0, 1, 1)^T corresponding to the dominant eigenvalue λ₁ = 3.
Lemma 9.2: If A ≥ 0 and irreducible, and ξ₁ is the first component of the vector x in
Corollary 9.1, then ξ₁ may be chosen to be greater than zero.

Proof: Writing R = [w, U], where w > 0 is the dominant eigenvector of A,

    x = R^T y = [ w^T y ]
                [ U^T y ],

where w > 0 and y > 0, by the Perron-Frobenius Theorem.

Hence ξ₁ = w^T y > 0. ■
Lemma 9.3: If A ≥ 0 and irreducible, then (C + C^T)/2 cannot have an eigenvalue ≥ μ.

Proof: Assume (C + C^T)/2 has an eigenvalue μ, with corresponding unit eigenvector
u. Then,

    (0, u^T) [ λ  h^T/2 ; h/2  (C + C^T)/2 ] (0, u^T)^T = u^T ((C + C^T)/2) u = μ.

This implies that the dominant eigenvector of

    [ λ     h^T/2       ]
    [ h/2   (C + C^T)/2 ]

is x = (0, u^T)^T. But this is impossible, since there cannot be another eigenvector
corresponding to the simple eigenvalue μ, and by Lemma 9.2, the first component of its
dominant eigenvector x must be positive.

Hence (C + C^T)/2 cannot have an eigenvalue μ.

Furthermore, assume (C + C^T)/2 has an eigenvalue ν > μ, with associated unit eigenvector
u. Then,

    (0, u^T) [ λ  h^T/2 ; h/2  (C + C^T)/2 ] (0, u^T)^T = u^T ((C + C^T)/2) u = ν > μ.

This implies that

    [ λ     h^T/2       ]
    [ h/2   (C + C^T)/2 ]

has an eigenvalue greater than μ, which is a contradiction, since this matrix is similar to
(A + A^T)/2 and so has spectral radius μ.

Combining the above two results, we can say that (C + C^T)/2 cannot have an eigenvalue
greater than or equal to μ, when A is irreducible. ■
The following theorem shows that λ = μ exactly when A and A^T have the same dominant
eigenvector, and that this condition is equivalent to h = 0.

Theorem 9.3: If A ≥ 0 and irreducible, then λ = μ,
i) if and only if A and A^T have the same dominant eigenvector, or
ii) if and only if h = 0.
Proof: i) If A and A^T have the same dominant eigenvector w > 0 corresponding to the
dominant eigenvalue λ, then Aw = λw and A^T w = λw, so

    ((A + A^T)/2) w = λw.

Since A is irreducible, so also is (A + A^T)/2 and, using Lemma 9.1, w must be the
dominant eigenvector of (A + A^T)/2. Hence λ = μ.

Conversely, if λ = μ, from (3) we can say that

    [ λ  h^T/2 ; h/2  (C + C^T)/2 ] (ξ₁, ξ)^T = μ (ξ₁, ξ)^T = λ (ξ₁, ξ)^T,

where x = (ξ₁, ξ)^T is the dominant eigenvector of the right-hand matrix in (3);

i.e. λξ₁ + h^Tξ/2 = λξ₁,     (7)

and (h/2)ξ₁ + ((C + C^T)/2)ξ = λξ.     (8)

From (7), h^Tξ = 0, and pre-multiplying (8) by ξ^T yields

    ξ^T ((C + C^T)/2) ξ = λ ξ^Tξ   or   ξ = 0.     (9)

From Lemma 9.3, the first possibility in (9) is not feasible; so ξ = 0, i.e.
x = (ξ₁, 0)^T = (1, 0)^T, since ‖x‖₂ = 1. Since y = Rx, this means that

    y = [w, U] (1, 0)^T = w.

Hence y = w, i.e.

    ((A + A^T)/2) w = μw.

Using the fact that Aw = λw = μw, we then have A^T w = μw. Therefore A and A^T
have the same dominant eigenvector w.

(ii) Writing R = [w, U] as above, where again w > 0 is the dominant eigenvector
corresponding to λ, we have

    h = U^T A^T w.     (10)

(See Stewart (1973, p312) for verification of this result.)

Then, if h = 0 we conclude that A^T w is orthogonal to all columns of U. But so is w
orthogonal to all columns of U. So we can then say

    A^T w = cw, for some real scalar c.     (11)

However, Aw = λw.     (12)

Subtracting,

    (A − A^T)w = (λ − c)w.     (13)

Noting that A − A^T is skew-symmetric and hence has eigenvalues which are either zero or
pure imaginary, and noting further that all values in (13) are real, the only possibility is
that λ − c = 0, and hence c = λ. Thus (11) becomes

    A^T w = λw.     (14)

Combining (12) and (14) gives

    ((A + A^T)/2) w = λw.     (15)

Since A is irreducible, so also is (A + A^T)/2, and using Lemma 9.1, we can say that w
must be the dominant eigenvector of (A + A^T)/2. Hence λ = μ.

Conversely, if λ = μ then by i) A and A^T have the same dominant eigenvector, i.e.
Aw = λw and A^T w = λw. But then h = U^T A^T w = λU^T w = 0, since U^T w = 0, from the
construction of R. ■
Unfortunately the requirement in Theorem 9.3 that A be irreducible is necessary. This
requirement appears to be missing in a statement of this result in
Berman and Plemmons (1979, p 53), in which they state that the equality holds if and only
if A and A^T have a common eigenvector corresponding to ρ(A). Example 6 shows a
reducible matrix for which A and A^T have the same dominant eigenvector, but λ ≠ μ.
Example 6:

    A = [ 1   2   3   0   0   ]
        [ 1   2   3   0   0   ]
        [ 4   2   0   0   0   ]
        [ 0   0   0   1   100 ]
        [ 0   0   0   0   2   ]

A and A^T have the same dominant eigenvector, (1, 1, 1, 0, 0)^T.
The spectral radius of both A and A^T is 6. However the spectral radius of (A + A^T)/2
is 51.5025, so λ ≠ μ.
The relationship of μ to ‖A‖₂ is shown in the following theorem.

Theorem 9.4: For A ≥ 0, λ ≤ μ ≤ ‖A‖₂; i.e.

    ρ(A) ≤ ρ((A + A^T)/2) ≤ ‖A‖₂.

Proof: The left-hand inequality has already been proved in Theorem 9.1.
To show μ ≤ ‖A‖₂, we note that the spectral radius of a matrix is no greater than its
2-norm, and hence we have

    μ = ρ((A + A^T)/2) ≤ ‖(A + A^T)/2‖₂ ≤ (‖A‖₂ + ‖A^T‖₂)/2 = ‖A‖₂. ■
This theorem shows that μ is the superior bound for ρ(A), when compared with ‖A‖₂.
It should be noted that the result μ ≤ ‖A‖₂ does not require that matrix A be non-
negative; it applies for any real matrix. The next theorem provides a characterisation of
μ.
Theorem 9.5: μ − λ is the largest non-negative eigenvalue of the matrix

    [ 0      h^T/2                  ]
    [ h/2    (C + C^T)/2 − λI_{n−1} ]

Proof: From (3), μ must be the dominant eigenvalue of the matrix

    G = [ λ     h^T/2       ]
        [ h/2   (C + C^T)/2 ],

since this matrix and (A + A^T)/2 are similar. Then μ − λ will be the largest non-
negative eigenvalue of G − λI. ■
9.3 Conclusion

Theorem 9.1 provides a bound for the spectral radius of a matrix A in terms of the
spectral radius of the symmetric part of A. The treatment of Theorem 9.1 given here also
includes a necessary and sufficient condition for equality of the spectral radii, a condition
not included in the original solution published in the Notices of The American
Mathematical Society (1970). The advantage of Theorem 9.1 is that most computational
methods for finding the eigenvalues of a symmetric matrix are more stable, and have
faster convergence. See Stewart (1973, p308). Although ‖A‖₂ = √(ρ(A^T A)) provides an upper
bound for ρ(A) in terms of a symmetric matrix, Theorem 9.4 shows that ρ((A + A^T)/2)
provides a superior bound. The bounds by Frobenius, Minc, Ledermann, Ostrowski, and
Brauer (see section 1.4.9) were compared to the spectral radius of the symmetric part of A.
In most cases these bounds were inferior, and applied mainly to positive matrices.
CHAPTER 10
CONCLUSION
10.1 Introduction
Over the past few decades input-output analysis has been applied to numerous areas, such
as the software industry, (Thornton, Park, Goddard and Hughes, 1988); estimating
pollution output, (Miller and Blair, 1985); company analysis, (Russo, 1976); and
estimating energy input and output, (Slesser, 1978; Herendeen, 1963). Input-output
tables can be used whenever output needs to be estimated. The range of
applications to which the model has been applied indicates its versatility, but at the
same time highlights the necessity to ensure that researchers who apply the model are
aware of the conditions necessary for a unique positive solution of the input-output
system.
The research for this thesis began with a consideration of these conditions, viz., the
Hawkins-Simon condition and the Spectral Radius condition. Whilst these conditions
have been established elsewhere, new proofs have been presented in this thesis, and the
equivalence of the conditions established. The present research has also led to a
tightening of the restrictions on F , the final demand vector.
Focus for the research then turned to a method for finding the spectral radius of an input-
output matrix and in Theorems 3.3 and 3.4 a new proof has been provided, showing that
the Inverse Power Method applied to an input-output matrix is guaranteed to yield the
desired spectral radius. This latter consideration led to methods for finding the spectral
radius of a general, non-negative matrix. The bounds developed by Collatz and Wielandt
provided the basis for an always convergent method that has been named “The Method of
Collatz”. It is important to note that the way in which the bounds of Collatz have been
applied iteratively is one of the innovations in this Thesis.
The present Thesis has applied the Method of Collatz to the problem of finding the
spectral radius of a large, sparse non-negative matrix and evaluated its performance
against rival methods.
The Method of Collatz has also been applied to a new result requiring the computation of
the spectral radius of a matrix, hence enabling fast identification of an M-matrix.
The Thesis has established new bounds for determining the condition number of an Input-
Output system. This condition number is important in determining the accuracy of the
solution, and also the precision of the arithmetic which must be used to compute the
solution.
Relating the spectral radius of a general non-negative matrix to a related symmetric
matrix has obvious computational advantages. This has been done by means of a result
previously established by Levinger but developed further in this thesis.
The author hopes this research will lead to an increase in understanding of the input-
output model, an important tool in economic modelling, and that the new result
concerning an always convergent method for determining the spectral radius of a non-
negative matrix will be a useful advancement in knowledge and techniques in the area of
Numerical Linear Algebra.
The author claims that the following research is to the best of his knowledge original.
This original research also demonstrates how the objectives outlined in Chapter 1 were
achieved.
Chapter 2: the alternative proof of the Hawkins-Simon condition, the result
concerning the non-necessity for a pivoting strategy in solving an Input-Output
System, and Theorem 2.3. Also discussion of further computational aspects of
the Hawkins-Simon condition.
Chapter 3: the result concerning the positivity of F, the final demand for a
company’s product. Also part ii) Theorem 3.2, and Theorems 3.3 and 3.4, which
show that the Inverse Power Method is certain to converge and yield the spectral
radius of an input-output matrix.
Chapter 4: the proof of Theorems 4.1 and 4.2, which show that the Method of
Collatz is certain to converge for a non-negative, primitive matrix.
Chapter 5: the proof that the Inverse Collatz Method is always convergent for a
general irreducible, non-negative matrix, not just an input-output matrix.
Chapter 6: spectral radius methods for determining whether a matrix is an M-
matrix.
Chapter 8: bounds on the condition number of an input-output matrix (I − A).
Chapter 9: an alternative proof that the spectral radius of the symmetric part of a
non-negative matrix A is always greater than or equal to the spectral radius of
A , and proof of a necessary and sufficient condition for equality of the spectral
radii.
It is the author’s contention that the two most significant results of this research have
been the establishment of always convergent methods for finding the spectral radius of an
input-output matrix and the spectral radius of a more general, non-negative matrix.
Further research may be useful in applying the Method of Collatz to symmetric matrices
and comparing this method to other methods for finding the spectral radius of symmetric
matrices.
PUBLICATIONS
O'Neill, M.J. and Wood, R.J. (1999): An Alternative Proof of the Hawkins-Simon
Condition, Asia Pacific Journal of Operational Research, Operational Research Society of
Singapore, Singapore, 16, 173-183.

Wood, R.J. and O'Neill, M.J. (2002): Using the Spectral Radius to determine whether a
Leontief System has a unique positive solution, Asia Pacific Journal of Operational
Research, Operational Research Society of Singapore, Singapore, 19, 233-247.

Wood, R.J. and O'Neill, M.J. (2004): An always convergent method for finding the
spectral radius of a non-negative matrix, ANZIAM J., 45(E), C474-C485. [Online]
http://anziamj.austms.org.au/V45/CTAC2003/Wood/Wood.pdf

Wood, R.J. and O'Neill, M.J. (2005): A faster algorithm for identification of an M-Matrix,
http://anziamj.austms.org.au/V46/CTAC2004/Wood/home.html

Wood, R.J. and O'Neill, M.J. (2007): Finding the Spectral Radius of a large Sparse Non-
Negative matrix,
http://anziamj.austms.org.au/ojs/index.php/ANZIAMJ/article/view/117/99
BIBLIOGRAPHY
Andersson, E. and Ekstrom, P. (2004). Investigating Google's PageRank Algorithm,
www.it.uu.se/edu/course/homepage/projektTDB/vt04/projekt5/website/report.pdf
Anton, H. and Rorres, C. (1991). Elementary Linear Algebra: Applications Version, John
Wiley and Sons, Inc, New York.
Atkinson, K.E. (1989). An Introduction to Numerical Analysis, 2nd edition, John Wiley
and Sons Inc, Singapore.
Berman, A. and Plemmons, R.J. (1979). Nonnegative Matrices in the Mathematical
Sciences, Academic Press, New York.
Berman, A. and Shaked-Monderer, N. (2003). Completely Positive Matrices, World
Scientific, New Jersey.
Blum, E.K. (1972). Numerical Analysis and Computation Theory and Practice, Addison-
Wesley Publishing Company Inc, Philippines.
Brauer, A. (1957). The theorems of Ledermann and Ostrowski on positive matrices,
Duke Math. J., 24, 265-274.
Burden, R.L. and Faires, J.D. (1978). Numerical Analysis, Prindle, Weber and Schmidt,
Boston.
Collatz, L. (1942). Einschließungssatz für die charakteristischen Zahlen von Matrizen,
Math. Z., 48, 221-226.
Conte, S. and de Boor, C. (1980). Elementary Numerical Analysis, 3rd edition, McGraw-
Hill, New York.
Debreu, G. and Herstein, I.N. (1953). Nonnegative Square Matrices, in Readings in
Input-Output Analysis: Theory and Applications, ed. I. Sohn, Oxford University Press,
Oxford, 200-209.
Deutsch, E. (1970). Solution of problem posed by Levinger (1969), The Spectral Radius
of a Matrix, Notices of the American Mathematical Society.
Forsythe, G.E., Malcolm, M.A. and Moler, C.B. (1977). Computer Methods for
Mathematical Computations, Prentice-Hall, New Jersey.
Francis, J.G.F. (1961). The QR Transformation, Computer Journal, 4, 265-271.
Frobenius, G. (1912). Über Matrizen aus nicht negativen Elementen, S.-B. K. Preuss.
Akad. Wiss., Berlin, 456-477.
Georgescu-Roegen, N. (1966): Analytical Economics Issues and Problems, Harvard
University Press, Harvard.
Golub, G.H. and Van Loan, C.F. (1996). Matrix Computations, The Johns Hopkins
University Press, Baltimore, Maryland.
Golub, G.H. and Greif, C. (2006). An Arnoldi-type Algorithm for Computing PageRank,
www.cs.ubc.ca/~greif/Papers/gg2006BIT.pdf
Halmos, P.R. (1958). Finite-dimensional Vector Spaces, D. Van Nostrand Company,
Inc, New Jersey.
Hawkins, D. and Simon, H.A. (1949). Note: Some Conditions of Macroeconomic
Stability, Econometrica, 17, 245-248.
Hawkins, D. and Simon, H.A. (1949). Note: Some Conditions of Macroeconomic
Stability, in Readings in Input-Output Analysis: Theory and Applications, ed. I. Sohn,
Oxford University Press, Oxford, 196-199.
Herendeen, R.A. (1963). An Energy Input-Output Matrix for the USA: User's Guide,
Centre for Advanced Computation, University of Illinois, Urbana.
Isaacson, E. and Keller, H.B. (1966). Analysis of Numerical Methods, John Wiley and
Sons, Inc, New York.
Jensen, R.C. and West, G.R. (1986). Australian Regional Developments No. 1: Input-
Output for Practitioners - Theory and Applications, Australian Government Publishing
Service, Canberra.
Krylov, A.N. (1931). O cislennom resenii uravnenija, kotorym v techniceskih voprosah
opredeljajutsja castoty malyh kolebanii material'nyh sistem, Izv. Akad. Nauk SSSR, Ser.
Fiz.-Mat., 4, 491-539.
Ledermann, W. (1950). Bounds for the greatest latent root of a positive matrix, J. London
Math. Soc., 25, 265-268.
Levinger, B.W. (1969). The Spectral Radius of a Matrix, Notices of the American
Mathematical Society.
McKenzie, L. (1959). Matrices with Dominant Diagonals and Economic Theory, in K.J.
Arrow, S. Karlin and P. Suppes (eds), Mathematical Methods in the Social Sciences,
Stanford University Press, Stanford.
Miller, R.E. and Blair, P.D. (1985). Input-Output Analysis, Foundations and Extensions,
Prentice-Hall Inc., Englewood Cliffs, New Jersey.
Minc, H. (1988). Nonnegative Matrices, Wiley, New York.
Noble, B. and Daniel, J.W. (1977). Applied Linear Algebra, 2nd ed., Prentice-Hall, Inc.,
Englewood Cliffs.
O'Neill, M.J. and Wood, R.J. (1999). An Alternative Proof of the Hawkins-Simon
Condition, Asia Pacific Journal of Operational Research, Operational Research Society of
Singapore, Singapore, 16, 173-183.
Ortega, J.M. (1972). Numerical Analysis - A Second Course, Academic Press, New
York, 48.
Ostrowski, A.M. (1952). Bounds for the greatest latent root of a positive matrix, J.
London Math. Soc., 27, 253-256.
Parlett, B.N. and Reinsch, C. (1969). Balancing a Matrix for Calculation of Eigenvalues
and Eigenvectors, Numerische Mathematik, 13, 293-304.
Perron, O. (1907). Zur Theorie der Matrices, Math. Ann., 64, 248-263.
Ralston, A. and Rabinowitz, P. (1978). A First Course in Numerical Analysis, McGraw-
Hill, New York.
Rayleigh, Lord (1877). The Theory of Sound, Phil. Trans. Roy. Soc. London, A, 161, 77.
Russo, J.A. (1976). Input-Output Analysis for Financial Decision-Making, Management
Accounting, 58(3), 22-24.
Saad, Y. (2003). Iterative Methods for Sparse Linear Systems, SIAM, Philadelphia.
Saad, Y. (1992). Numerical Methods for Large Eigenvalue Problems, Manchester
University Press, New York.
Sarma, K.S. (1977). An Input-Output Econometric Model, IBM Systems Journal, 4, 398-
420.
Seneta, E. (1973). Non-Negative Matrices, George Allen and Unwin, London.
Slesser, M. (1978). Energy in the Economy, Macmillan Press.
Sohn, I. (1986). Readings in Input-Output Analysis, Oxford University Press, Inc, New
York.
Stewart, G.W. (1973). Introduction to Matrix Computations, Academic Press, New York.
Stewart, G.W. (1976). Simultaneous Iteration for Computing Invariant Subspaces of
Non-Hermitian Matrices, Numer. Math., 25, 123-136.
Thornton, B.S., Park, T.M., Goddard, J. and Hughes, J.M. (1988). Staffing and Training
Implications of Australian Software Exports Targets, The Australian Computer Journal,
20(4), 161-167.
Turing, A.M. (1948). Rounding-Off Errors in Matrix Processes, Quart. J. Mech. Appl.
Math., 1, 287-308.
Varga, R.S. (1962). Matrix Iterative Analysis, Prentice-Hall, Englewood Cliffs, New
Jersey.
Wagenhals, L.W., Janssen, R.A. and DeGregorio, E.A. (2007). Model Interoperation for
Effects Based Planning, 12th International Command and Control Research and
Technology Symposium, Newport.
Walsh, B. (1970). Solution of problem posed by Levinger (1969), The Spectral Radius of
a Matrix, Notices of the American Mathematical Society.
Wielandt, H. (1944). Das Iterationsverfahren bei nicht selbstadjungierten linearen
Eigenwertaufgaben, Math. Z., 50, 93-143.
Wielandt, H. (1950). Unzerlegbare, nicht negative Matrizen, Math. Z., 52, 642-645.
Wilkinson, J.H. (1965). The Algebraic Eigenvalue Problem, Oxford University Press, Ely
House, London.
Wong, Y.K. (1954). An Elementary Treatment of an Input-Output System, Naval
Research Logistics Quarterly, 1(4), 321-326.
Wood, R.J. and O'Neill, M.J. (2002). Using the Spectral Radius to determine whether a
Leontief System has a unique positive solution, Asia Pacific Journal of Operational
Research, Operational Research Society of Singapore, Singapore, 19, 233-247.
Wood, R.J. and O'Neill, M.J. (2004). An always convergent method for finding the
spectral radius of a non-negative matrix, ANZIAM J., 45(E), C474-C485. [Online]
http://anziamj.austms.org.au/V45/CTAC2003/Wood/Wood.pdf
Wood, R.J. and O'Neill, M.J. (2005). A faster algorithm for identification of an M-Matrix,
http://anziamj.austms.org.au/V46/CTAC2004/Wood/home.html
Wood, R.J. and O'Neill, M.J. (2007). Finding the Spectral Radius of a Large Sparse Non-
Negative Matrix,
http://anziamj.austms.org.au/ojs/index.php/ANZIAMJ/article/view/117/99
Young, D.M. (1971). Iterative Solution of Large Linear Systems, Academic Press, New
York.