Source: math.wsu.edu/faculty/tsat/files/pietro_thesis.pdf
MATRIX ROOTS OF NONNEGATIVE AND EVENTUALLY NONNEGATIVE
MATRICES
By
PIETRO PAPARELLA
A dissertation submitted in partial fulfillment of the requirements for the degree of
DOCTOR OF PHILOSOPHY
WASHINGTON STATE UNIVERSITY
Department of Mathematics
JULY 2013
© Copyright by PIETRO PAPARELLA, 2013
All Rights Reserved
To the Faculty of Washington State University:
The members of the Committee appointed to examine the disser-
tation of PIETRO PAPARELLA find it satisfactory and recommend that
it be accepted.
Michael J. Tsatsomeros, Ph.D., Chair
Judith J. McDonald, Ph.D., Chair
Bala Krishnamoorthy, Ph.D.
MATRIX ROOTS OF NONNEGATIVE AND EVENTUALLY NONNEGATIVE
MATRICES
Abstract
by Pietro Paparella, Ph.D.
Washington State University
July 2013
Co-Chairs: Michael J. Tsatsomeros and Judith J. McDonald
Eventually nonnegative matrices are real matrices whose powers become and remain
nonnegative. As such, eventually nonnegative matrices are, a fortiori, matrix roots of non-
negative matrices, which motivates us to study the matrix roots of nonnegative matrices.
Using classical matrix function theory and Perron-Frobenius theory, we characterize, clas-
sify, and describe in terms of the complex and real Jordan canonical form the pth-roots of
nonnegative and eventually nonnegative matrices.
Acknowledgements
This dissertation would not have been possible without the unwavering encouragement, pa-
tience, and love of my wife Allison L. Paparella. I wish to thank my children Quintin L.
Paparella and Alessia R. Paparella for providing me with inspiration to help me finish this
endeavor. I also would like to acknowledge my mother Maria Aquino for always supporting
me and my siblings and for her tremendous efforts and sacrifice in raising three children on
her own.
I owe a deep debt of gratitude to my advisors Michael J. Tsatsomeros and Judith J.
McDonald. Their vision, guidance, wisdom, and expertise were paramount in the completion
of this project. I also wish to extend my gratitude to Bala Krishnamoorthy, who went
above and beyond the duties and responsibilities of a committee member: thank you for
responding to my pesky emails, the sage advice, and the encouraging words. I would also
like to thank David S. Watkins for serving on my committee and K. A. “Ari” Ariyawansa
for generously providing me a research assistantship from 2009–2011, so that I was able to
return to Washington State to complete my Ph.D. in mathematics.
I am also grateful for wonderful friends and family for their encouraging words and
support: Travis W. Arnold; Aaron L. Carr-Callen; Kirk J. Griffith; Jasmin L. and Jeffrey
A. Harshman; Jameus C. Hutchens; Francesco Paparella; Stefano Paparella; Christine S.
and V. Joshua Phillips; Lena Y. and V. Jefferson Phillips; Daniela Paparella and Joseph W.
Thornton; and Frank and Chiara D. Trejos.
Finally, I would like to acknowledge several graduate students from my time at Wash-
ington State and Louisiana State for the wonderful memories, collaboration, and continued
inspiration: Baha’ M. Alzalg, Jeremy J. Becnel, Timothy D. Breitzman, Mihaly Kovacs,
Suat Namli, and Rokas Varaneckas.
To my wife Allison, my children Quintin and Alessia, and my mother Maria.
“If I have seen further it is by standing on ye sholders of Giants” – Sir Isaac Newton.
Contents
Abstract
Acknowledgements
Dedication
Inspiration

1 Introduction
  1.1 Background and Motivation
  1.2 Notation and Definitions
  1.3 Preliminaries
    1.3.1 Combinatorial Structure of a Matrix
    1.3.2 Upper-triangular Toeplitz Matrices

2 Theoretical Background
  2.1 Perron-Frobenius Theory of Nonnegative Matrices
  2.2 Theory of Functions of Matrices

3 Matrix Roots of Nonnegative Matrices
  3.1 2×2 Matrices
    3.1.1 1 > λ ≥ 0
    3.1.2 1 > 0 > λ
    3.1.3 λ = 1
  3.2 Primitive Matrices
  3.3 Imprimitive Irreducible Matrices
    3.3.1 Complete Residue Systems
    3.3.2 Jordan Chains of h-cyclic Matrices
    3.3.3 Main Results
  3.4 Reducible Matrices
    3.4.1 Preliminary Results

Bibliography
Chapter 1
Introduction
1.1 Background and Motivation
A real matrix A is said to be eventually nonnegative (respectively, positive) if there exists a
nonnegative integer p such that Ak is entrywise nonnegative (respectively, positive) for all
k ≥ p. If p is the smallest such integer, then p is called the power index of A and is denoted
by p(A).
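To make the power index concrete, here is a small numerical sketch we add (not part of the thesis; the function name `power_index` is ours) that searches a finite horizon of powers with NumPy. The example matrix below turns out to have power index 4: its first and third powers have a negative entry, while the fourth and all higher powers are entrywise positive.

```python
import numpy as np

def power_index(A, max_power=20, tol=0.0):
    """Smallest p with A^k entrywise nonnegative for every k = p, ..., max_power
    (a finite-horizon numerical sketch of the power index, not a proof)."""
    A = np.asarray(A, dtype=float)
    Ak = np.eye(A.shape[0])
    p = None
    for k in range(1, max_power + 1):
        Ak = Ak @ A
        if (Ak >= -tol).all():
            if p is None:
                p = k
        else:
            p = None  # a later power has a negative entry; restart the search
    return p

A = [[2., 1.], [2., -1.]]   # eventually positive: A^4 and beyond are positive
print(power_index(A))       # 4
```

Because the search horizon is finite, a returned value is only evidence, not a certificate, of eventual nonnegativity.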
Eventual nonnegativity was introduced and studied in [5] and was the central subject of
study in various publications (see, e.g., [3, 4, 10, 14, 19, 20, 21, 24, 25]). It is well-known
that the notions of eventual positivity and nonnegativity are associated with properties of
the eigenspace corresponding to the spectral radius.
A real matrix A has the Perron-Frobenius property if its spectral radius is a positive eigen-
value corresponding to an entrywise nonnegative eigenvector. The strong Perron-Frobenius
property further requires that the spectral radius is simple; it dominates in modulus every
other eigenvalue of A; and it has an entrywise positive eigenvector.
Several challenges regarding the theory and applications of eventually nonnegative matrices remain unresolved. For example, eventual positivity of A is equivalent to A and A^T having the strong Perron-Frobenius property; however, the Perron-Frobenius property for A and A^T is a necessary but not sufficient condition for eventual nonnegativity of A.
The motivation of this thesis is the following observation: an eventually nonnegative (respectively, positive) matrix with power index p = p(A) is, a fortiori, a matrix pth-root of the nonnegative matrix A^p (i.e., it is a solution to the matrix equation X^p − A^p = 0). As a consequence, in order to gain insight into eventual nonnegativity, it is natural to examine the roots of matrices that possess the (strong) Perron-Frobenius property.
1.2 Notation and Definitions
The following notational conventions are adopted throughout the thesis; when convenient,
additional notation and definitions are introduced and adopted throughout.
(1) Denote by C the set of complex numbers, R the set of real numbers, Z the set of integers, and N the set of natural numbers. Denote by i the imaginary unit, i.e., i := √−1.

(2) When convenient, an indexed set of the form x_i, x_{i+1}, ..., x_{i+j} is abbreviated to $\{x_k\}_{k=i}^{i+j}$.
(3) Denote by Mn(C) (Mn(R)) the algebra of complex (respectively, real) n× n matrices.
(4) Given A ∈ M_n(C), A = [a_{ij}] = [a_{ij}]_{i,j=1}^n denotes the entries of A, where a_{ij} ∈ C is the (i, j)-entry of A. When convenient, the (i, j)-entry of A may also be denoted by [A]_{ij}.
(5) Denote by C^n (R^n) the n-dimensional complex (respectively, real) vector space. Given x ∈ C^n (x ∈ R^n), [x]_i denotes the ith component of x.
(6) Denote by In the n× n identity matrix. Whenever the context is clear, the subscript
may be suppressed.
(7) For λ ∈ C, J_n(λ) ∈ M_n(C) denotes the n×n Jordan block, i.e.,

$$[J_n(\lambda)]_{ij} := \begin{cases} \lambda, & i = j; \\ 1, & j = i + 1; \\ 0, & \text{otherwise.} \end{cases}$$
(8) Given A ∈ M_n(C):

(i) the spectrum of A, denoted by σ(A), is the multiset

$$\sigma(A) := \{ \lambda \in \mathbb{C} : \det(\lambda I - A) = 0 \},$$

i.e., σ(A) contains the eigenvalues of A;

(ii) the spectral radius of A, denoted by ρ = ρ(A), is the quantity

$$\rho(A) := \max_{\lambda_i \in \sigma(A)} |\lambda_i|;$$

and

(iii) the peripheral spectrum of A, denoted by π(A), is the (multi-)set given by

$$\pi(A) := \{ \lambda \in \sigma(A) : |\lambda| = \rho \}.$$
(9) The direct sum of the matrices A_1, A_2, ..., A_k, where A_i ∈ M_{n_i}(C), denoted by A_1 ⊕ A_2 ⊕ ··· ⊕ A_k, $\bigoplus_{i=1}^{k} A_i$, or diag(A_1, A_2, ..., A_k), is the matrix

$$\begin{bmatrix} A_1 & & & \\ & A_2 & & \\ & & \ddots & \\ & & & A_k \end{bmatrix} \in M_n(\mathbb{C}),$$

where $n = \sum_{i=1}^{k} n_i$.
(10) For A ∈ M_n(C), let $J = Z^{-1} A Z = \bigoplus_{i=1}^{t} J_{n_i}(\lambda_i) = \bigoplus_{i=1}^{t} J_{n_i}$, where $\sum n_i = n$, denote a Jordan canonical form of A.

(11) The order of the largest Jordan block of A corresponding to the eigenvalue λ_i is called the index of λ_i and is denoted by m = m_i = index_{λ_i} A.
(12) For c = (c_1, c_2, ..., c_n), where c_i ∈ C for all i, the circulant matrix (or simply circulant) of c, denoted by circ(c), is the n×n matrix given by

$$\mathrm{circ}(c) := \begin{bmatrix} c_1 & c_2 & \cdots & c_n \\ c_n & c_1 & \cdots & c_{n-1} \\ \vdots & \vdots & \ddots & \vdots \\ c_2 & c_3 & \cdots & c_1 \end{bmatrix} \in M_n(\mathbb{C}).$$
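The definition above can be sketched in a few lines of NumPy (our illustration, not the thesis's; the name `circ` mirrors the notation): row i is the first row cyclically shifted i places to the right.

```python
import numpy as np

def circ(c):
    """circ(c): the circulant whose first row is c = (c_1, ..., c_n);
    row i is c cyclically shifted i places to the right."""
    c = np.asarray(c)
    return np.array([np.roll(c, i) for i in range(len(c))])

print(circ([1, 2, 3]))
# [[1 2 3]
#  [3 1 2]
#  [2 3 1]]
```

Note that `scipy.linalg.circulant` follows the first-column convention, which is the transpose of the first-row convention used here.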
(13) For A, B ∈ M_n(C), the Hadamard product of A and B, denoted by A ∘ B, is the n×n matrix whose (i, j)-entry is given by [A ∘ B]_{ij} := a_{ij} b_{ij}.
(14) For A ∈ M_n(C) and X ∈ M_n(C), X is said to be a matrix pth-root, pth-root, matrix root, or simply root of A if X satisfies the matrix equation X^p − A = 0.

(15) For k ∈ N, let R(k) denote the canonical complete residue system (mod k), i.e., R(k) = {0, 1, ..., k − 1}.
1.3 Preliminaries
Because the combinatorial structure of a matrix (i.e., the locations of its zero and nonzero entries)
will be central to the analysis of the matrix roots of nonnegative, irreducible, imprimitive
matrices, we devote Subsection 1.3.1 to notation, concepts, and background concerning the
combinatorial structure of a matrix.
In addition, in Subsection 1.3.2, we establish ancillary results pertaining to upper-triangular
Toeplitz matrices.
1.3.1 Combinatorial Structure of a Matrix
For notation and definitions concerning the combinatorial structure of a matrix, i.e., the location of the zero and nonzero entries of a matrix, we follow [2] and [10]. For further results
concerning combinatorial matrix theory, see [2] and references therein.
A directed graph (or simply digraph) Γ = (V,E) consists of a finite, nonempty set V of
vertices, together with a set E ⊆ V × V of arcs. Digraphs may contain both arcs (u, v) and
(v, u) as well as loops, i.e., arcs of the form (v, v). In this thesis, we do not consider directed
general graphs (or simply general digraphs), i.e., digraphs that contain multiple copies of the
same arc.
For A ∈ M_n(C), the directed graph (or simply digraph) of A, denoted by Γ = Γ(A), has vertex set V = {1, ..., n} and arc set E = {(i, j) ∈ V × V : a_{ij} ≠ 0}. If R, C ⊆ {1, ..., n}, then A[R|C] denotes the submatrix of A whose rows and columns are indexed by R and C, respectively. If R = C, then A[R|R] is abbreviated as A[R]. For a digraph Γ = (V, E) and W ⊆ V, the induced subdigraph Γ[W] is the digraph with vertex set W and arc set {(u, v) ∈ E : u, v ∈ W}. For a square matrix A, Γ(A[W]) is identified with Γ(A)[W] by a slight abuse of notation.
A digraph Γ is strongly connected if for any two distinct vertices u and v of Γ, there
is a walk in Γ from u to v (as a consequence, there must also be a walk from v to u;
thus, both conditions could be employed as the definition). Following [2], we consider every
vertex of V as strongly connected to itself. For a strongly connected digraph Γ, the index
of imprimitivity is the greatest common divisor of the lengths of the closed walks in Γ.
A strong digraph is primitive if its index of imprimitivity is one, otherwise it is imprimitive.
The strong components of Γ are the maximal strongly connected subdigraphs of Γ.
For n ≥ 2, a matrix A ∈ M_n(C) is reducible if there exists a permutation matrix P such that

$$P^T A P = \begin{bmatrix} A_{11} & A_{12} \\ 0 & A_{22} \end{bmatrix},$$

where A_{11} and A_{22} are nonempty square matrices and 0 is a (possibly rectangular) block consisting entirely of zero entries. If A is not reducible, then A is called irreducible. The connection between reducibility and the digraph of A is as follows: A is irreducible if and only if Γ(A) is strongly connected¹ (see, e.g., [2, Theorem 3.2.1] or [11, Theorem 6.2.24]). If A ∈ M_n(C) is reducible, then there exists a permutation matrix P such that

$$P^T A P = \begin{bmatrix} A_{11} & A_{12} & \cdots & A_{1k} \\ & A_{22} & \cdots & A_{2k} \\ & & \ddots & \vdots \\ & & & A_{kk} \end{bmatrix}, \quad (1.3.1)$$

where each A_{ii} is irreducible (see, e.g., [2, Theorem 3.2.4] or [11, Exercise 8, §8.3]). The matrix on the right-hand side of (1.3.1) is called the Frobenius (or irreducible) normal form of A.
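As a numerical aside (our sketch, not the thesis's; the function name is hypothetical), irreducibility can be tested via the strong connectivity of Γ(A); a standard equivalent criterion is that (I + |A|)^(n−1) has no zero entry.

```python
import numpy as np

def is_irreducible(A):
    """A square A is irreducible iff Gamma(A) is strongly connected; a
    standard equivalent test is that (I + |A|)^(n-1) is entrywise positive."""
    A = np.abs(np.asarray(A, dtype=float))
    n = A.shape[0]
    M = np.linalg.matrix_power(np.eye(n) + A, n - 1)
    return bool((M > 0).all())

print(is_irreducible([[0., 1.], [1., 0.]]),   # a 2-cycle: strongly connected
      is_irreducible([[1., 1.], [0., 1.]]))   # block upper triangular: reducible
# → True False
```

Note that for n = 1 the test returns True, consistent with the convention (footnote above) that every vertex is strongly connected to itself.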
For h ≥ 2, a digraph Γ = (V,E) is cyclically h-partite if there exists an ordered partition
Π = (π1, . . . , πh) of V into h nonempty subsets such that for each arc (i, j) ∈ E, there
¹Following [2], vertices are strongly connected to themselves, so we take this result to hold for all n ∈ N and not just n ≥ 2.
exists ℓ ∈ {1, ..., h} with i ∈ π_ℓ and j ∈ π_{ℓ+1} (where, for convenience, we take π_{h+1} := π_1).
For h ≥ 2, a strong digraph Γ is cyclically h-partite if and only if h divides the index of
imprimitivity (see, e.g., [2, p. 70]). For h ≥ 2, a matrix A ∈Mn(C) is called h-cyclic if Γ (A)
is cyclically h-partite. If Γ (A) is cyclically h-partite with ordered partition Π, then A is said
to be h-cyclic with partition Π or that Π describes the h-cyclic structure of A. The ordered
partition Π = (π_1, ..., π_h) is consecutive if π_1 = {1, ..., i_1}, π_2 = {i_1 + 1, ..., i_2}, ..., π_h = {i_{h−1} + 1, ..., n}. If A is h-cyclic with consecutive ordered partition Π, then A has the block
form

$$A = \begin{bmatrix} 0 & A_{12} & 0 & \cdots & 0 \\ 0 & 0 & A_{23} & \ddots & \vdots \\ \vdots & \vdots & \ddots & \ddots & 0 \\ 0 & 0 & \cdots & 0 & A_{(h-1)h} \\ A_{h1} & 0 & \cdots & 0 & 0 \end{bmatrix}, \quad (1.3.2)$$
where Ai,i+1 = A[πi|πi+1] ([2, p. 71]). For any h-cyclic matrix A, there exists a permutation
matrix P such that PAP T is h-cyclic with consecutive ordered partition. The cyclic index
of A is the largest h for which A is h-cyclic.
An irreducible nonnegative matrix A is primitive if Γ (A) is primitive, and the index of
imprimitivity of A is the index of imprimitivity of Γ (A). If A is irreducible and imprimitive
with index of imprimitivity h ≥ 2, then h is the cyclic index of A, Γ (A) is cyclically h-partite
with ordered partition Π = (π1, . . . , πh), and the sets πi are uniquely determined (up to cyclic
permutation of the π_i) (see, for example, [2, p. 70]). Furthermore, Γ(A^h) is the disjoint union of h primitive digraphs on the sets of vertices π_i, i = 1, ..., h (see, e.g., [2, §3.4]).
Following [10], for an ordered partition Π = (π_1, ..., π_h) of {1, ..., n} into h nonempty subsets, the cyclic characteristic matrix, denoted by χ_Π, is the n×n matrix whose (i, j)-entry is 1 if there exists ℓ ∈ {1, ..., h} such that i ∈ π_ℓ and j ∈ π_{ℓ+1}, and 0 otherwise. For an ordered partition Π = (π_1, ..., π_h) of {1, ..., n} into h nonempty subsets, note that

(1) χ_Π is h-cyclic and Γ(χ_Π) contains every arc (i, j) with i ∈ π_ℓ and j ∈ π_{ℓ+1}; and

(2) A ∈ M_n(C) is h-cyclic with ordered partition Π if and only if Γ(A) ⊆ Γ(χ_Π).
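Observation (2) above gives a direct computational test, which we sketch below (our code, not the thesis's; 0-based indices and the names `chi` and `is_h_cyclic` are ours): build χ_Π and check that every nonzero of A sits on a 1 of χ_Π.

```python
import numpy as np

def chi(partition, n):
    """Cyclic characteristic matrix chi_Pi of an ordered partition
    Pi = (pi_1, ..., pi_h) of {0, ..., n-1} (0-based indices here)."""
    h = len(partition)
    X = np.zeros((n, n), dtype=int)
    for ell in range(h):
        for i in partition[ell]:
            for j in partition[(ell + 1) % h]:   # pi_{h+1} := pi_1
                X[i, j] = 1
    return X

def is_h_cyclic(A, partition):
    """A is h-cyclic with partition Pi iff Gamma(A) is a subdigraph of
    Gamma(chi_Pi), i.e., every nonzero of A sits where chi_Pi has a 1."""
    A = np.asarray(A)
    X = chi(partition, A.shape[0])
    return bool(((A != 0) <= (X == 1)).all())

A = [[0, 0, 7], [0, 0, 2], [3, 5, 0]]        # 2-cyclic with Pi = ({1,2},{3})
print(is_h_cyclic(A, [[0, 1], [2]]))         # True
```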
1.3.2 Upper-triangular Toeplitz Matrices
In this subsection we establish the following lemmas concerning upper-triangular Toeplitz
matrices.
Lemma 1.3.1. For n ≥ 2, let λ, c, y_1, ..., y_{n−1} ∈ C with y_{n−1} ≠ 0, and let z denote the zero vector of appropriate dimension. If

$$y := (y_1, y_2, \dots, y_{n-1})^T \in \mathbb{C}^{n-1}, \qquad \tilde{y} := (c, y_1, \dots, y_{n-2})^T \in \mathbb{C}^{n-1},$$

$$J := \begin{bmatrix} J_{n-1}(\lambda) & y \\ z^T & \lambda \end{bmatrix} \in M_n(\mathbb{C}), \qquad \text{and} \qquad Q := \begin{bmatrix} I_{n-1} & \tilde{y} \\ z^T & y_{n-1} \end{bmatrix} \in M_n(\mathbb{C}),$$

then $QJQ^{-1} = J_n(\lambda)$.

Proof. Note that

$$Q^{-1} = \begin{bmatrix} I_{n-1} & -\tilde{y}/y_{n-1} \\ z^T & 1/y_{n-1} \end{bmatrix},$$

hence

$$QJQ^{-1} = \begin{bmatrix} I_{n-1} & \tilde{y} \\ z^T & y_{n-1} \end{bmatrix} \begin{bmatrix} J_{n-1}(\lambda) & y \\ z^T & \lambda \end{bmatrix} \begin{bmatrix} I_{n-1} & -\tilde{y}/y_{n-1} \\ z^T & 1/y_{n-1} \end{bmatrix} = \begin{bmatrix} J_{n-1}(\lambda) & y + \lambda\tilde{y} \\ z^T & \lambda y_{n-1} \end{bmatrix} \begin{bmatrix} I_{n-1} & -\tilde{y}/y_{n-1} \\ z^T & 1/y_{n-1} \end{bmatrix} = \begin{bmatrix} J_{n-1}(\lambda) & \frac{1}{y_{n-1}}\left(y + \lambda\tilde{y} - J_{n-1}(\lambda)\tilde{y}\right) \\ z^T & \lambda \end{bmatrix}.$$

If $N_n = [n_{ij}] \in M_n(\mathbb{C})$ is the nilpotent matrix defined by

$$n_{ij} = \begin{cases} 1, & j = i + 1 \\ 0, & \text{otherwise,} \end{cases}$$

then it is well known that $J_n(\lambda) = \lambda I_n + N_n$. Furthermore, $N_{n-1}\tilde{y} = (y_1, y_2, \dots, y_{n-2}, 0)^T$ and

$$\frac{y + \lambda\tilde{y} - J_{n-1}(\lambda)\tilde{y}}{y_{n-1}} = \frac{y + \lambda\tilde{y} - (\lambda I_{n-1} + N_{n-1})\tilde{y}}{y_{n-1}} = \frac{y - N_{n-1}\tilde{y}}{y_{n-1}} = \frac{(y_1, \dots, y_{n-2}, y_{n-1})^T - (y_1, \dots, y_{n-2}, 0)^T}{y_{n-1}} = (0, 0, \dots, 0, 1)^T,$$

which completes the proof.
Lemma 1.3.2. For n ≥ 2, consider the upper-triangular Toeplitz matrix

$$F := \begin{bmatrix} x_1 & x_2 & \cdots & x_n \\ & x_1 & \ddots & \vdots \\ & & \ddots & x_2 \\ & & & x_1 \end{bmatrix} \in M_n(\mathbb{C}),$$

where x_1, ..., x_n ∈ C and x_i ≠ 0 for all i = 2, ..., n. Then there exists an invertible matrix Q such that $QFQ^{-1} = J_n(x_1)$.

Proof. Proceed by induction on n. For the base case n = 2, note that

$$\begin{bmatrix} 1 & 0 \\ 0 & x_2 \end{bmatrix} \begin{bmatrix} x_1 & x_2 \\ 0 & x_1 \end{bmatrix} \begin{bmatrix} 1 & 0 \\ 0 & 1/x_2 \end{bmatrix} = \begin{bmatrix} x_1 & x_2 \\ 0 & x_1 x_2 \end{bmatrix} \begin{bmatrix} 1 & 0 \\ 0 & 1/x_2 \end{bmatrix} = \begin{bmatrix} x_1 & 1 \\ 0 & x_1 \end{bmatrix},$$

so that the base case holds.

Assume the assertion holds for matrices of order n − 1. Let $\tilde{F}$ be the submatrix of F obtained by deleting the nth row and column of F, i.e.,

$$\tilde{F} = \begin{bmatrix} x_1 & \cdots & x_{n-1} \\ & \ddots & \vdots \\ & & x_1 \end{bmatrix} \in M_{n-1}(\mathbb{C}),$$

and let $x = (x_n, x_{n-1}, \dots, x_2)^T \in \mathbb{C}^{n-1}$. Then

$$F = \begin{bmatrix} \tilde{F} & x \\ z^T & x_1 \end{bmatrix}$$

and, by the inductive hypothesis, there exists an invertible matrix $\tilde{Q}$ such that $\tilde{Q}\tilde{F}\tilde{Q}^{-1} = J_{n-1}(x_1)$. The matrix

$$\hat{Q} := \begin{bmatrix} \tilde{Q} & z \\ z^T & 1 \end{bmatrix}$$

is invertible with $\hat{Q}^{-1} = \begin{bmatrix} \tilde{Q}^{-1} & z \\ z^T & 1 \end{bmatrix}$, and

$$\hat{Q} F \hat{Q}^{-1} = \begin{bmatrix} \tilde{Q} & z \\ z^T & 1 \end{bmatrix} \begin{bmatrix} \tilde{F} & x \\ z^T & x_1 \end{bmatrix} \begin{bmatrix} \tilde{Q}^{-1} & z \\ z^T & 1 \end{bmatrix} = \begin{bmatrix} \tilde{Q}\tilde{F} & \tilde{Q}x \\ z^T & x_1 \end{bmatrix} \begin{bmatrix} \tilde{Q}^{-1} & z \\ z^T & 1 \end{bmatrix} = \begin{bmatrix} \tilde{Q}\tilde{F}\tilde{Q}^{-1} & \tilde{Q}x \\ z^T & x_1 \end{bmatrix} = \begin{bmatrix} J_{n-1}(x_1) & \tilde{Q}x \\ z^T & x_1 \end{bmatrix}. \quad (1.3.3)$$

If $y := \tilde{Q}x \in \mathbb{C}^{n-1}$ and $\tilde{y} := (0, y_1, \dots, y_{n-2})^T \in \mathbb{C}^{n-1}$, then, following Lemma 1.3.1, a similarity transform by the matrix

$$\bar{Q} := \begin{bmatrix} I_{n-1} & \tilde{y} \\ z^T & y_{n-1} \end{bmatrix}$$

brings the matrix in (1.3.3) into the desired form. Thus, the desired similarity transformation is given by $Q = \bar{Q}\hat{Q}$.
In Table 1, we present the transformation Q and its inverse for several dimensions.
Table 1: The transformation matrix Q and its inverse Q^{-1}.

n = 2:
$$F = \begin{bmatrix} x_1 & x_2 \\ 0 & x_1 \end{bmatrix}, \quad Q = \begin{bmatrix} 1 & 0 \\ 0 & x_2 \end{bmatrix}, \quad Q^{-1} = \begin{bmatrix} 1 & 0 \\ 0 & 1/x_2 \end{bmatrix}.$$

n = 3:
$$F = \begin{bmatrix} x_1 & x_2 & x_3 \\ 0 & x_1 & x_2 \\ 0 & 0 & x_1 \end{bmatrix}, \quad Q = \begin{bmatrix} 1 & 0 & 0 \\ 0 & x_2 & x_3 \\ 0 & 0 & x_2^2 \end{bmatrix}, \quad Q^{-1} = \begin{bmatrix} 1 & 0 & 0 \\ 0 & 1/x_2 & -x_3/x_2^3 \\ 0 & 0 & 1/x_2^2 \end{bmatrix}.$$

n = 4:
$$F = \begin{bmatrix} x_1 & x_2 & x_3 & x_4 \\ 0 & x_1 & x_2 & x_3 \\ 0 & 0 & x_1 & x_2 \\ 0 & 0 & 0 & x_1 \end{bmatrix}, \quad Q = \begin{bmatrix} 1 & 0 & 0 & 0 \\ 0 & x_2 & x_3 & x_4 \\ 0 & 0 & x_2^2 & 2x_2 x_3 \\ 0 & 0 & 0 & x_2^3 \end{bmatrix}, \quad Q^{-1} = \begin{bmatrix} 1 & 0 & 0 & 0 \\ 0 & 1/x_2 & -x_3/x_2^3 & (2x_3^2 - x_2 x_4)/x_2^5 \\ 0 & 0 & 1/x_2^2 & -2x_3/x_2^4 \\ 0 & 0 & 0 & 1/x_2^3 \end{bmatrix}.$$
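As a quick numerical sanity check of Lemma 1.3.2 (our sketch, not part of the thesis), the n = 3 transformation from Table 1 does bring F to Jordan form; the values x_1 = 2, x_2 = 3, x_3 = 5 are arbitrary nonzero choices.

```python
import numpy as np

# n = 3 instance of Lemma 1.3.2 with the Table 1 transformation
# (x1, x2, x3 chosen arbitrarily with x2 != 0).
x1, x2, x3 = 2.0, 3.0, 5.0
F = np.array([[x1, x2, x3],
              [0., x1, x2],
              [0., 0., x1]])
Q = np.array([[1., 0., 0.],
              [0., x2, x3],
              [0., 0., x2 ** 2]])
J = Q @ F @ np.linalg.inv(Q)
print(np.allclose(J, [[x1, 1., 0.], [0., x1, 1.], [0., 0., x1]]))  # True
```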
Chapter 2
Theoretical Background
2.1 Perron-Frobenius Theory of Nonnegative Matrices
We review some well-known results, concepts, and definitions from the theory of nonnegative
matrices. For further results, see [11, Chapter 8] or [1] and references therein.
Recall that a matrix A = [aij] ∈Mn(R) is said to be (entrywise) nonnegative (respectively,
positive), denoted A ≥ 0 (respectively, A > 0), if aij ≥ 0 (respectively, aij > 0) for all
1 ≤ i, j ≤ n. Recall that a matrix A ∈ Mn(R) is eventually positive (nonnegative) if there
exists a nonnegative integer p such that Ak is entrywise positive (nonnegative) for all k ≥ p.
If p is the smallest such integer, then p is called the power index of A and is denoted by
p(A).
We recall the Perron-Frobenius theorem for positive matrices (see [11, Theorem 8.2.11]).
Theorem 2.1.1. If A ∈Mn(R) is positive, then
(a) ρ > 0;
(b) ρ ∈ σ (A);
(c) there exists a positive vector x such that Ax = ρx;
(d) ρ is an algebraically (and hence geometrically) simple eigenvalue of A; and
(e) |λ| < ρ for every λ ∈ σ (A) such that λ 6= ρ.
There are nonnegative matrices with zero entries that satisfy the conclusions of Theorem 2.1.1. Recall that
a nonnegative matrix A ∈ Mn(R) is said to be primitive if it is irreducible and has only
one eigenvalue of maximum modulus. The conclusions to Theorem 2.1.1 apply to primitive
matrices (see [11, Theorem 8.5.1]), and the following theorem is a useful characterization of
primitivity (see [11, Theorem 8.5.2]).
Theorem 2.1.2. If A ∈ Mn(R) is nonnegative, then A is primitive if and only if Ak > 0
for some k ≥ 1.
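Theorem 2.1.2 can be turned into a finite test (our sketch, not the thesis's): by Wielandt's classical bound, a primitive n×n matrix already satisfies A^k > 0 at k = (n−1)² + 1, so checking that single power suffices.

```python
import numpy as np

def is_primitive(A):
    """Primitivity test for a nonnegative A via Theorem 2.1.2; by Wielandt's
    bound it suffices to check the single power k = (n-1)^2 + 1."""
    A = np.asarray(A, dtype=float)
    n = A.shape[0]
    k = (n - 1) ** 2 + 1
    return bool((np.linalg.matrix_power(A, k) > 0).all())

print(is_primitive([[0., 1.], [1., 1.]]),   # irreducible with a loop: primitive
      is_primitive([[0., 1.], [1., 0.]]))   # 2-cyclic: imprimitive
# → True False
```

For large n or large entries this direct power test can overflow floating point; it is meant only as an illustration of the theorem.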
Perron-Frobenius type results also hold when A is an irreducible imprimitive matrix (see
[11, §8.4]).
Theorem 2.1.3. Let A ∈ Mn(R) and suppose that A is irreducible, nonnegative, and h is
the cyclic index of A. Then
(a) ρ > 0;
(b) ρ ∈ σ (A);
(c) there exists a positive vector x such that Ax = ρx;
(d) ρ is an algebraically (and hence geometrically) simple eigenvalue of A; and
(e) π(A) = {ρ exp(i2πk/h) : k ∈ R(h)}.
The following result is the widest-reaching form of the Perron-Frobenius theorem.
Theorem 2.1.4. If A ∈Mn(R) and A ≥ 0, then ρ (A) ∈ σ (A) and there exists a nonnegative
vector x ≥ 0, x 6= 0, such that Ax = ρ (A)x.
One can verify that the matrix

$$\begin{bmatrix} 2 & 1 \\ 2 & -1 \end{bmatrix}$$

possesses properties (a) through (e) of Theorem 2.1.1, is irreducible, but obviously contains a negative entry. This motivates the following concept.
Definition 2.1.5. A matrix A ∈ Mn(R) is said to possess the strong Perron-Frobenius
property if A possesses properties (a) through (e) of Theorem 2.1.1.
The following theorem characterizes the strong Perron-Frobenius property (see [6, Lemma
2.1], [14, Theorem 1], or [20, Theorem 2.2]).
Theorem 2.1.6. A real matrix A is eventually positive if and only if A and AT possess the
strong Perron-Frobenius property.
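As an illustrative numerical check of Theorem 2.1.6 (our sketch, not the thesis's; the function name and tolerances are ours), the matrix displayed before Definition 2.1.5 and its transpose both pass a strong Perron-Frobenius test, consistent with its eventual positivity.

```python
import numpy as np

def has_strong_pf(A):
    """Numerically test the strong Perron-Frobenius property: the spectral
    radius is a positive, simple, strictly dominant eigenvalue with an
    entrywise positive eigenvector (after normalizing the sign)."""
    vals, vecs = np.linalg.eig(A)
    k = int(np.argmax(np.abs(vals)))
    rho = vals[k]
    if abs(np.imag(rho)) > 1e-12 or np.real(rho) <= 0:
        return False
    others = np.delete(np.abs(vals), k)
    v = np.real(vecs[:, k])
    v = v if v[int(np.argmax(np.abs(v)))] > 0 else -v
    return bool((others < np.real(rho) - 1e-12).all() and (v > 0).all())

A = np.array([[2., 1.], [2., -1.]])
# A and A^T both have the strong PF property, so A is eventually positive
# (indeed A^4 and all higher powers are entrywise positive).
print(has_strong_pf(A) and has_strong_pf(A.T))  # True
```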
2.2 Theory of Functions of Matrices
We review some basic notions and results from the theory of matrix functions (for further
results on the theory of matrix functions, see [8], [12, Chapter 9], or [15, Chapter 6]).
Definition 2.2.1. Let f : C → C be a function and f^{(k)} denote the kth derivative of f. The function f is said to be defined on the spectrum of A if the values

$$f^{(k)}(\lambda_i), \quad k = 0, \dots, m_i - 1, \quad i = 1, \dots, s,$$

called the values of the function f on the spectrum of A, exist, where λ_1, ..., λ_s denote the distinct eigenvalues of A and m_i = index_{λ_i} A.
Definition 2.2.2 (Matrix function via Jordan canonical form). If f is defined on the spectrum of A ∈ M_n(C), then

$$f(A) := Z f(J) Z^{-1} = Z \left( \bigoplus_{i=1}^{t} f(J_{n_i}) \right) Z^{-1},$$

where

$$f(J_{n_i}) := \begin{bmatrix} f(\lambda_i) & f'(\lambda_i) & \cdots & \frac{f^{(n_i - 1)}(\lambda_i)}{(n_i - 1)!} \\ & f(\lambda_i) & \ddots & \vdots \\ & & \ddots & f'(\lambda_i) \\ & & & f(\lambda_i) \end{bmatrix}. \quad (2.2.1)$$
The following equivalent definition (see [8, Theorem 1.12]) serves to illuminate the notion
of a primary root, and will be used to examine the 2× 2 case in Section 3.1.
Definition 2.2.3 (Matrix function via Hermite interpolation). Let f be defined on the spectrum of A ∈ M_n(C) and let ϕ be the minimal polynomial of A. Then f(A) := p(A), where p is the polynomial of degree less than

$$\sum_{i=1}^{s} n_i = \deg \varphi$$

that satisfies the interpolation conditions

$$p^{(j)}(\lambda_i) = f^{(j)}(\lambda_i), \quad j = 0, \dots, n_i - 1, \quad i = 1, \dots, s. \quad (2.2.2)$$

There is a unique such p, and it is known as the Hermite interpolating polynomial.
Remark 2.2.4. The Hermite interpolating polynomial p is given explicitly by the Lagrange-Hermite formula

$$p(t) = \sum_{i=1}^{s} \left[ \left( \sum_{j=0}^{n_i - 1} \frac{1}{j!} \phi_i^{(j)}(\lambda_i) (t - \lambda_i)^j \right) \prod_{j \neq i} (t - \lambda_j)^{n_j} \right], \quad (2.2.3)$$

where $\phi_i(t) = f(t) / \prod_{j \neq i} (t - \lambda_j)^{n_j}$. For a matrix with distinct eigenvalues, i.e., n_i = 1 and s = n, (2.2.3) reduces to

$$p(t) = \sum_{i=1}^{n} f(\lambda_i)\, \ell_i(t), \qquad \ell_i(t) = \prod_{\substack{j = 1 \\ j \neq i}}^{n} \frac{t - \lambda_j}{\lambda_i - \lambda_j}. \quad (2.2.4)$$
Formulae (2.2.3) and (2.2.4) will be used to classify the primary matrix roots of 2×2 matrices in Section 3.1.
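Formula (2.2.4) is easy to exercise numerically; the following sketch (ours, not the thesis's; the function name is hypothetical) evaluates f(A) as the Lagrange interpolant of f at the eigenvalues, for a matrix with distinct eigenvalues, and recovers the principal square root.

```python
import numpy as np

def f_of_A_lagrange(A, f):
    """Evaluate f(A) via (2.2.4): for A with distinct eigenvalues, f(A) is
    the Lagrange interpolant of f at the eigenvalues, evaluated at A."""
    A = np.asarray(A, dtype=complex)
    lam = np.linalg.eigvals(A)
    n = len(lam)
    I = np.eye(n)
    result = np.zeros((n, n), dtype=complex)
    for i in range(n):
        Li = I.astype(complex)  # ell_i(A) = prod_{j != i} (A - lam_j I)/(lam_i - lam_j)
        for j in range(n):
            if j != i:
                Li = Li @ (A - lam[j] * I) / (lam[i] - lam[j])
        result = result + f(lam[i]) * Li
    return result

A = np.array([[2., 1.], [0., 3.]])   # distinct eigenvalues 2 and 3
X = f_of_A_lagrange(A, np.sqrt)      # principal square root of A
print(np.allclose(X @ X, A))         # True
```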
Theorem 2.2.5. Let f be defined on the spectrum of a nonsingular matrix A ∈ M_n(C). If $J = \bigoplus_{i=1}^{t} J_{n_i}(\lambda_i) = Z^{-1} A Z$ is a Jordan canonical form of A, then

$$J_f := \bigoplus_{i=1}^{t} J_{n_i}(f(\lambda_i))$$

is a Jordan canonical form of f(A).

Proof. Because $f(A) = Z f(J) Z^{-1} = Z \left( \bigoplus_{i=1}^{t} f(J_{n_i}) \right) Z^{-1}$, note that, for each i = 1, ..., t, $f(J_{n_i})$ is an upper-triangular Toeplitz matrix. Following Lemma 1.3.2, for each i = 1, ..., t, there exists an invertible matrix $Q_i \in M_{n_i}(\mathbb{C})$ such that $f(J_{n_i}) = Q_i J_{n_i}(f(\lambda_i)) Q_i^{-1}$, and hence

$$f(A) = Z \left( \bigoplus_{i=1}^{t} f(J_{n_i}) \right) Z^{-1} = Z \left( \bigoplus_{i=1}^{t} Q_i J_{n_i}(f(\lambda_i)) Q_i^{-1} \right) Z^{-1} = \left( Z \bigoplus_{i=1}^{t} Q_i \right) \left( \bigoplus_{i=1}^{t} J_{n_i}(f(\lambda_i)) \right) \left( \left( \bigoplus_{i=1}^{t} Q_i \right)^{-1} Z^{-1} \right) = \tilde{Z} \left( \bigoplus_{i=1}^{t} J_{n_i}(f(\lambda_i)) \right) \tilde{Z}^{-1},$$

where $\tilde{Z} := Z \bigoplus_{i=1}^{t} Q_i$, and the result is established.
Definition 2.2.6. For z = r exp(iθ) ∈ C, where r > 0, and an integer p > 1, let

$$z^{1/p} := r^{1/p} \exp(i\theta/p),$$

and, for j ∈ {0, 1, ..., p − 1}, define

$$f_j(z) = z^{1/p} \exp(i 2\pi j / p) = r^{1/p} \exp(i[\theta + 2\pi j]/p), \quad (2.2.5)$$

i.e., f_j is the (j + 1)st branch of the pth-root function.
Note that

$$f_j^{(k)}(z) = \frac{1}{p^k} \left[ \prod_{i=0}^{k-1} (1 - ip) \right] r^{(1 - kp)/p} \exp\left( \frac{i[2\pi j + \theta(1 - kp)]}{p} \right), \quad (2.2.6)$$

where k is a nonnegative integer and the product $\prod_{i=0}^{k-1} (1 - ip)$ is empty when k = 0.
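The branches in (2.2.5) can be sketched directly (our code, not the thesis's; the name `f_branch` is ours): the p branches evaluated at z enumerate the p complex pth roots of z.

```python
import cmath
import math

def f_branch(z, j, p):
    """f_j(z) = r^(1/p) * exp(i(theta + 2*pi*j)/p), the (j+1)st branch of
    the p-th root function as in (2.2.5); theta is the principal argument."""
    r, theta = abs(z), cmath.phase(z)
    return r ** (1.0 / p) * cmath.exp(1j * (theta + 2 * math.pi * j) / p)

# The p branches enumerate the p complex p-th roots of z.
z, p = 8j, 3
roots = [f_branch(z, j, p) for j in range(p)]
print(all(abs(w ** p - z) < 1e-9 for w in roots))  # True
```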
Next we present several technical but easily established lemmas on the branches of the pth-root function.
Lemma 2.2.7. If z = r exp(iθ), z′ = r′ exp(iθ′) ∈ C and r < r′, then |f_j(z)| < |f_{j′}(z′)| for all j, j′ ∈ R(p).

Proof. Note that

$$|f_j(z)| = \left| r^{1/p} \exp(i[\theta + 2\pi j]/p) \right| = r^{1/p} \left| \exp(i[\theta + 2\pi j]/p) \right| = r^{1/p} < (r')^{1/p} = (r')^{1/p} \left| \exp(i[\theta' + 2\pi j']/p) \right| = \left| (r')^{1/p} \exp(i[\theta' + 2\pi j']/p) \right| = |f_{j'}(z')|.$$
Lemma 2.2.8. For z ∈ C with ℑ(z) ≠ 0, j, j′ ∈ R(p), and $f_j^{(k)}$ as in (2.2.6), we have $f_j^{(k)}(z) = \overline{f_{j'}^{(k)}(\bar{z})}$ if and only if j + j′ ≡ 0 (mod p).

Proof. For some ℓ ∈ Z, note that

$$f_j^{(k)}(z) = \overline{f_{j'}^{(k)}(\bar{z})}$$
⇔ exp(i[2πj + θ(1 − kp)]/p) = exp(i[−2πj′ + θ(1 − kp)]/p)
⇔ exp(i[2πj + θ(1 − kp)]/p) = exp(i[−2πj′ + θ(1 − kp)]/p) · exp(i2πℓp/p)
⇔ exp(i[2πj + θ(1 − kp)]/p) = exp(i[2π(ℓp − j′) + θ(1 − kp)]/p)
⇔ 2πj + θ(1 − kp) = 2π(ℓp − j′) + θ(1 − kp)
⇔ j + j′ = ℓp
⇔ j + j′ ≡ 0 (mod p).

Finally, note that j + j′ ≡ 0 (mod p) ⇔ j = j′ = 0 or j = p − j′.
Lemma 2.2.9. Let z = r exp(iπ), r > 0. For j, j′ ∈ {0, 1, ..., p − 1} and $f_j^{(k)}$ as in (2.2.6), we have $f_j^{(k)}(z) = \overline{f_{j'}^{(k)}(z)}$ if and only if j + j′ = p − 1.

Proof. Note that

$$f_j^{(k)}(z) = \overline{f_{j'}^{(k)}(z)}$$
⇔ exp(i[2πj + π(1 − kp)]/p) = exp(i[−2πj′ − π(1 − kp)]/p)
⇔ exp(i[2πj + π(1 − kp)]/p) = exp(i[−2πj′ − π(1 − kp)]/p) · exp(i2π(p − kp)/p)
⇔ exp(i[2πj + π(1 − kp)]/p) = exp(i[−2πj′ − π(1 − kp) + 2π(p − kp)]/p)
⇔ 2πj + π(1 − kp) = −2πj′ − π(1 − kp) + 2π(p − kp)
⇔ 2πj + 2π(1 − kp) = −2πj′ + 2π(p − kp)
⇔ j + j′ = p − 1.
The following theorem classifies all pth-roots of a general nonsingular matrix [23, Theo-
rems 2.1 and 2.2].
Theorem 2.2.10 (Classification of pth-roots of nonsingular matrices). If A ∈ M_n(C) is nonsingular, then A has precisely p^s pth-roots that are expressible as polynomials in A, given by

$$X_j = Z \left( \bigoplus_{i=1}^{t} f_{j_i}(J_{n_i}) \right) Z^{-1}, \quad (2.2.7)$$

where j = (j_1, ..., j_t), j_i ∈ {0, 1, ..., p − 1}, and j_i = j_k whenever λ_i = λ_k.

If s < t, then A has additional pth-roots that form parameterized families

$$X_j(U) = Z U \left( \bigoplus_{i=1}^{t} f_{j_i}(J_{n_i}) \right) U^{-1} Z^{-1}, \quad (2.2.8)$$

where U is an arbitrary nonsingular matrix that commutes with J and, for each j, there exist i and k, depending on j, such that λ_i = λ_k while j_i ≠ j_k.
In the theory of matrix functions, the roots given by (2.2.7) are called primary functions
(or roots) of A and are expressible as polynomials in A, and the roots given by (2.2.8), which
exist only if A is derogatory (i.e., if some eigenvalue appears in more than one Jordan block),
are called the nonprimary functions or roots of A and are not expressible as polynomials in
A [8, Chapter 1].
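For a diagonalizable matrix with distinct eigenvalues (so s = t and every root is primary), formula (2.2.7) can be enumerated directly; the sketch below (ours, not the thesis's; the function name is hypothetical) picks one branch per eigenvalue in the eigenbasis.

```python
import numpy as np
from itertools import product

def primary_roots(A, p):
    """Enumerate the p^n primary p-th roots of a diagonalizable matrix with
    n distinct eigenvalues: pick one branch f_{j_i} per eigenvalue (2.2.7)."""
    lam, Z = np.linalg.eig(np.asarray(A, dtype=complex))
    Zinv = np.linalg.inv(Z)
    roots = []
    for branches in product(range(p), repeat=len(lam)):
        d = [abs(l) ** (1.0 / p) * np.exp(1j * (np.angle(l) + 2 * np.pi * j) / p)
             for l, j in zip(lam, branches)]
        roots.append(Z @ np.diag(d) @ Zinv)
    return roots

A = np.array([[5., 4.], [4., 5.]])   # eigenvalues 9 and 1
roots = primary_roots(A, 2)          # the 4 primary square roots of A
print(all(np.allclose(X @ X, A) for X in roots))  # True
```

For repeated eigenvalues this naive enumeration would have to be restricted by the constraint j_i = j_k whenever λ_i = λ_k; the sketch assumes distinct eigenvalues.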
The next result provides a necessary and sufficient condition for the existence of a root
for a general matrix (see [22]).
Theorem 2.2.11 (Existence of pth-root). A matrix A ∈ M_n(C) has a pth-root if and only if the "ascent sequence" of integers d_1, d_2, ... defined by

$$d_i = \dim\!\left(\mathrm{null}(A^i)\right) - \dim\!\left(\mathrm{null}(A^{i-1})\right)$$

has the property that, for every integer ν ≥ 0, no more than one element of the sequence lies strictly between pν and p(ν + 1).
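The ascent-sequence condition is easy to check numerically; here is a small sketch we add (not part of the thesis; function names are ours), using matrix rank to compute null-space dimensions. Since the null spaces of A^i stabilize by i = n, the first n terms of the sequence suffice.

```python
import numpy as np
from collections import Counter

def ascent_sequence(A):
    """d_i = dim null(A^i) - dim null(A^(i-1)), i = 1, ..., n."""
    A = np.asarray(A, dtype=float)
    n = A.shape[0]
    dims = [n - np.linalg.matrix_rank(np.linalg.matrix_power(A, i))
            for i in range(n + 1)]
    return [dims[i] - dims[i - 1] for i in range(1, n + 1)]

def has_pth_root(A, p):
    """Theorem 2.2.11: A has a p-th root iff, for every nu >= 0, at most one
    d_i lies strictly between p*nu and p*(nu + 1)."""
    strict = [d for d in ascent_sequence(A) if d % p != 0]  # d in some open interval
    return all(c <= 1 for c in Counter(d // p for d in strict).values())

print(has_pth_root(np.eye(2), 2), has_pth_root([[0., 1.], [0., 0.]], 2))
# → True False : the 2x2 nilpotent Jordan block has no square root
```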
Before we state results concerning the matrix roots of a real matrix, we state some well-
known results concerning the real Jordan canonical form for real matrices (see [11, Section
3.4], [15, Section 6.7]).
Theorem 2.2.12 (Real Jordan canonical form). If A ∈ M_n(R) has r real eigenvalues (including multiplicities) and c complex conjugate pairs of eigenvalues (including multiplicities), then there exists a real, invertible matrix R ∈ M_n(R) such that

$$R^{-1} A R = J_R = \left( \bigoplus_{k=1}^{r} J_{n_k}(\lambda_k) \right) \oplus \left( \bigoplus_{k=r+1}^{r+c} C_{n_k}(\lambda_k) \right),$$

where:

1.

$$C_k(\lambda) := \begin{bmatrix} C(\lambda) & I_2 & & \\ & C(\lambda) & \ddots & \\ & & \ddots & I_2 \\ & & & C(\lambda) \end{bmatrix} \in M_{2k}(\mathbb{R}); \quad (2.2.9)$$

2.

$$C(\lambda) := \begin{bmatrix} \Re(\lambda) & \Im(\lambda) \\ -\Im(\lambda) & \Re(\lambda) \end{bmatrix} \in M_2(\mathbb{R}); \quad (2.2.10)$$

3. λ_1, ..., λ_r are the real eigenvalues (including multiplicities) of A; and

4. $\lambda_{r+1}, \bar{\lambda}_{r+1}, \dots, \lambda_{r+c}, \bar{\lambda}_{r+c}$ are the complex eigenvalues (including multiplicities) of A.
Lemma 2.2.13. Let λ ∈ C and suppose $C_k(\lambda)$ and $C(\lambda)$ are defined as in (2.2.9) and (2.2.10), respectively. If $S_k := \bigoplus_{i=1}^{k} S \in M_{2k}(\mathbb{C})$, where

$$S := \begin{bmatrix} -i & -i \\ 1 & -1 \end{bmatrix},$$

then

$$S_k^{-1} C_k(\lambda) S_k = D_k(\lambda) := \begin{bmatrix} D(\lambda) & I_2 & & \\ & D(\lambda) & \ddots & \\ & & \ddots & I_2 \\ & & & D(\lambda) \end{bmatrix} \in M_{2k}(\mathbb{C}), \quad (2.2.11)$$

where $D(\lambda) := \begin{bmatrix} \lambda & 0 \\ 0 & \bar{\lambda} \end{bmatrix}$.

Proof. Proceed by induction on k, the number of 2×2 blocks: when k = 1, one readily obtains

$$S^{-1} C(\lambda) S = \frac{1}{2} \begin{bmatrix} i & 1 \\ i & -1 \end{bmatrix} \begin{bmatrix} \Re(\lambda) & \Im(\lambda) \\ -\Im(\lambda) & \Re(\lambda) \end{bmatrix} \begin{bmatrix} -i & -i \\ 1 & -1 \end{bmatrix} = D(\lambda).$$

Now assume the assertion holds for all matrices of the form (2.2.9) of dimension 2(k − 1). The matrices in the product $S_k^{-1} C_k(\lambda) S_k$ can be partitioned as

$$\begin{bmatrix} S_{k-1}^{-1} & Z \\ Z^T & S^{-1} \end{bmatrix} \begin{bmatrix} C_{k-1}(\lambda) & Y \\ Z^T & C(\lambda) \end{bmatrix} \begin{bmatrix} S_{k-1} & Z \\ Z^T & S \end{bmatrix},$$

where $Z \in M_{2(k-1),2}(\mathbb{R})$ is a rectangular zero matrix, $Y = \begin{bmatrix} Z_2 \\ \vdots \\ Z_2 \\ I_2 \end{bmatrix} \in M_{2(k-1),2}(\mathbb{R})$, and $Z_2$ is the 2×2 zero matrix. With the above partition in mind, and following the induction hypothesis, we obtain

$$S_k^{-1} C_k(\lambda) S_k = \begin{bmatrix} D_{k-1}(\lambda) & S_{k-1}^{-1} Y S \\ Z^T & S^{-1} C(\lambda) S \end{bmatrix},$$

but note that $S_{k-1}^{-1} Y S = Y$ and $S^{-1} C(\lambda) S = D(\lambda)$.
Lemma 2.2.14. Let λ ∈ C and suppose $D_k(\lambda)$ and $D(\lambda)$ are defined as in Lemma 2.2.13. If $P_k$ is the permutation matrix given by

$$P_k = \begin{bmatrix} e_1 & e_3 & \cdots & e_{2k-1} & e_2 & e_4 & \cdots & e_{2k} \end{bmatrix} \in M_{2k}(\mathbb{R}), \quad (2.2.12)$$

where $e_i$ denotes the ith canonical basis vector of appropriate dimension, then

$$P_k^T D_k(\lambda) P_k = J_k(\lambda) \oplus J_k(\bar{\lambda}).$$

Proof. Proceed by induction on k: the base case k = 1 is trivial, so we assume the assertion holds for matrices of the form (2.2.11) of dimension 2(k − 1). If $z_n$ denotes the n × 1 zero vector, then

$$D_k(\lambda) = \begin{bmatrix} \lambda & e_2^T & 0 \\ z_{2(k-1)} & D_{k-1}(\bar{\lambda}) & e_{2k-3} \\ 0 & z_{2(k-1)}^T & \bar{\lambda} \end{bmatrix},$$

and if $\tilde{P}$ is the permutation matrix defined by

$$\tilde{P} := \begin{bmatrix} 1 & z_{2(k-1)}^T & 0 \\ z_{2(k-1)} & P_{k-1} & z_{2(k-1)} \\ 0 & z_{2(k-1)}^T & 1 \end{bmatrix} \in M_{2k}(\mathbb{R}),$$

then, following the induction hypothesis,

$$\tilde{P}^T D_k(\lambda) \tilde{P} = \begin{bmatrix} \lambda & z_{k-1}^T & e_1^T & 0 \\ z_{k-1} & J_{k-1}(\bar{\lambda}) & Z_{k-1} & e_{k-1} \\ z_{k-1} & Z_{k-1} & J_{k-1}(\lambda) & z_{k-1} \\ 0 & z_{k-1}^T & z_{k-1}^T & \bar{\lambda} \end{bmatrix}, \quad (2.2.13)$$

where $Z_{k-1}$ denotes the (k − 1) × (k − 1) zero matrix. A permutation similarity by the matrix

$$\hat{P} := \begin{bmatrix} 1 & & & \\ & & I_{k-1} & \\ & I_{k-1} & & \\ & & & 1 \end{bmatrix} \in M_{2k}(\mathbb{R})$$

brings the matrix on the right-hand side of (2.2.13) to the desired form. The proof is completed by noting that $\tilde{P}\hat{P} = P_k$, since right-hand multiplication by $\hat{P}$ permutes columns 2 through k with columns k + 1 through 2k − 1.
Corollary 2.2.15. Let λ ∈ C, λ ≠ 0, and let f be a function defined on the spectrum of $J_k(\lambda) \oplus J_k(\bar{\lambda})$. For j a nonnegative integer, let $f_\lambda^{(j)}$ denote $f^{(j)}(\lambda)$. If $C_k(\lambda)$ and $C(\lambda)$ are defined as in (2.2.9) and (2.2.10), respectively, then

$$f(C_k(\lambda)) = \begin{bmatrix} C(f_\lambda) & C(f'_\lambda) & \cdots & C\!\left(\frac{f_\lambda^{(k-1)}}{(k-1)!}\right) \\ & C(f_\lambda) & \ddots & \vdots \\ & & \ddots & C(f'_\lambda) \\ & & & C(f_\lambda) \end{bmatrix} \in M_{2k}(\mathbb{R})$$

if and only if $f_\lambda^{(j)} = \overline{f_{\bar{\lambda}}^{(j)}}$ for all j.

Proof. Following Lemmas 2.2.13 and 2.2.14,

$$P_k^T S_k^{-1} C_k(\lambda) S_k P_k = J_k(\lambda) \oplus J_k(\bar{\lambda}). \quad (2.2.14)$$

Since $f(X^{-1} A X) = X^{-1} f(A) X$ ([8, Theorem 1.13(c)]), $f(A \oplus B) = f(A) \oplus f(B)$ [8, Theorem 1.13(g)], and $f_\lambda^{(j)} = \overline{f_{\bar{\lambda}}^{(j)}}$ for all j, applying f to (2.2.14) yields

$$P_k^T S_k^{-1} f(C_k(\lambda)) S_k P_k = f(J_k(\lambda)) \oplus f(J_k(\bar{\lambda})) = f(J_k(\lambda)) \oplus \overline{f(J_k(\lambda))}.$$

Hence,

$$S_k^{-1} f(C_k(\lambda)) S_k = P_k \left[ f(J_k(\lambda)) \oplus \overline{f(J_k(\lambda))} \right] P_k^T = \begin{bmatrix} D(f_\lambda) & D(f'_\lambda) & \cdots & D\!\left(\frac{f_\lambda^{(k-1)}}{(k-1)!}\right) \\ & D(f_\lambda) & \ddots & \vdots \\ & & \ddots & D(f'_\lambda) \\ & & & D(f_\lambda) \end{bmatrix}$$

and

$$f(C_k(\lambda)) = S_k \begin{bmatrix} D(f_\lambda) & D(f'_\lambda) & \cdots & D\!\left(\frac{f_\lambda^{(k-1)}}{(k-1)!}\right) \\ & D(f_\lambda) & \ddots & \vdots \\ & & \ddots & D(f'_\lambda) \\ & & & D(f_\lambda) \end{bmatrix} S_k^{-1} = \begin{bmatrix} C(f_\lambda) & C(f'_\lambda) & \cdots & C\!\left(\frac{f_\lambda^{(k-1)}}{(k-1)!}\right) \\ & C(f_\lambda) & \ddots & \vdots \\ & & \ddots & C(f'_\lambda) \\ & & & C(f_\lambda) \end{bmatrix}.$$

The converse follows by noting that, for μ, ν ∈ C, the product

$$S \begin{bmatrix} \mu & 0 \\ 0 & \nu \end{bmatrix} S^{-1} = \begin{bmatrix} -i & -i \\ 1 & -1 \end{bmatrix} \begin{bmatrix} \mu & 0 \\ 0 & \nu \end{bmatrix} \cdot \frac{1}{2} \begin{bmatrix} i & 1 \\ i & -1 \end{bmatrix} = \frac{1}{2} \begin{bmatrix} -i\mu & -i\nu \\ \mu & -\nu \end{bmatrix} \begin{bmatrix} i & 1 \\ i & -1 \end{bmatrix} = \frac{1}{2} \begin{bmatrix} \mu + \nu & i(\nu - \mu) \\ i(\mu - \nu) & \mu + \nu \end{bmatrix}$$

is real if and only if $\nu = \bar{\mu}$.
Remark 2.2.16. In general, our analysis reveals that

$$f(C_k(\lambda)) = \begin{bmatrix} f(C(\lambda)) & f'(C(\lambda)) & \cdots & \frac{f^{(k-1)}(C(\lambda))}{(k-1)!} \\ & f(C(\lambda)) & \ddots & \vdots \\ & & \ddots & f'(C(\lambda)) \\ & & & f(C(\lambda)) \end{bmatrix} \in M_{2k}(\mathbb{C}),$$

which bears a striking resemblance to (2.2.1).
Corollary 2.2.17. If λ ∈ C with ℑ(λ) ≠ 0 and

$$F_j(C_k(\lambda)) := S_k P_k \begin{bmatrix} f_{j_1}(J_k(\lambda)) & 0 \\ 0 & f_{j_2}(J_k(\bar{\lambda})) \end{bmatrix} P_k^T S_k^{-1} \in M_{2k}(\mathbb{C}), \quad (2.2.15)$$

where $j = (j_1, j_2)$ and $j_1, j_2 \in \{0, 1, \dots, p - 1\}$, then

$$F_j(C_k(\lambda)) = \begin{bmatrix} C(f_{j_1}(\lambda)) & C(f'_{j_1}(\lambda)) & \cdots & C\!\left(\frac{f_{j_1}^{(k-1)}(\lambda)}{(k-1)!}\right) \\ & C(f_{j_1}(\lambda)) & \ddots & \vdots \\ & & \ddots & C(f'_{j_1}(\lambda)) \\ & & & C(f_{j_1}(\lambda)) \end{bmatrix} \in M_{2k}(\mathbb{R})$$

if and only if $j_1 = j_2 = 0$ or $j_1 = p - j_2$.

Proof. Follows from Lemma 2.2.8 and Corollary 2.2.15.
Corollary 2.2.18. If λ = r exp(iπ) ∈ C, where r > 0, and $F_j$ is defined as in (2.2.15), then

$$F_j(C_k(\lambda)) = \begin{bmatrix} C(f_{j_1}(\lambda)) & C(f'_{j_1}(\lambda)) & \cdots & C\!\left(\frac{f_{j_1}^{(k-1)}(\lambda)}{(k-1)!}\right) \\ & C(f_{j_1}(\lambda)) & \ddots & \vdots \\ & & \ddots & C(f'_{j_1}(\lambda)) \\ & & & C(f_{j_1}(\lambda)) \end{bmatrix} \in M_{2k}(\mathbb{R})$$

if and only if $j_1 + j_2 = p - 1$.

Proof. Follows from Lemma 2.2.9 and Corollary 2.2.15.
The next theorem provides a necessary and sufficient condition for the existence of a real
pth-root of a real A (see [9, Theorem 2.3]) and our proof utilizes the real Jordan canonical
form.
Theorem 2.2.19 (Existence of real pth-root). A matrix A ∈ Mn(R) has a real pth-root if
and only if it satisfies the ascent sequence condition specified in Theorem 2.2.11 and, if p is
even, A has an even number of Jordan blocks of each size for every negative eigenvalue.
Proof. Case 1: p is even. Following Theorem 2.2.12, there exists a real, invertible matrix R such that

$$A = R \begin{bmatrix} J_0 & & & \\ & J_+ & & \\ & & J_- & \\ & & & C \end{bmatrix} R^{-1},$$

where J_0 collects the singular Jordan blocks; J_+ collects the Jordan blocks with positive real eigenvalues; J_− collects the Jordan blocks with negative real eigenvalues; and C collects blocks of the form (2.2.9) corresponding to the complex conjugate pairs of eigenvalues of A.

By hypothesis, if J_k(λ) is a submatrix of J_−, it must appear an even number of times; for every such pair of blocks, it follows that

$$P_k [J_k(\lambda) \oplus J_k(\lambda)] P_k^T = C_k(\lambda),$$

where P_k is defined as in (2.2.12). Thus, there exists a permutation matrix P such that

$$A = R P^T P \begin{bmatrix} J_0 & & & \\ & J_+ & & \\ & & J_- & \\ & & & C \end{bmatrix} P^T P R^{-1} = \tilde{R} \begin{bmatrix} J_0 & & \\ & J_+ & \\ & & \tilde{C} \end{bmatrix} \tilde{R}^{-1},$$

where $\tilde{R} = R P^T$ and $\tilde{C}$ collects all the blocks of the form (2.2.9).

Since the ascent sequence condition holds for A, it must also hold for J_0, so J_0 has a pth-root, say W_0, and W_0 can be taken real in view of the construction given in [22, Section 3]; clearly, there exists a real matrix W_+ such that $W_+^p = J_+$ and, following Corollaries 2.2.17 and 2.2.18, there exists a real matrix W_c such that $W_c^p = \tilde{C}$. Hence, the matrix $X = \tilde{R}[W_0 \oplus W_+ \oplus W_c]\tilde{R}^{-1}$ is a real pth-root of A.

Conversely, if A satisfies the ascent sequence condition but has an odd number of Jordan blocks of some size corresponding to a negative eigenvalue, then the process just described cannot produce a real matrix pth-root, as one of the Jordan blocks cannot be paired, so that the root of such a block is necessarily complex.

Case 2: p is odd. Follows similarly to the first case, since real roots can be taken for J_0, J_+, J_−, and C.
We now present an analog of Theorem 2.2.10 for real matrices.
Theorem 2.2.20 (Classification of pth-roots of nonsingular real matrices). Let F_j be defined as in (2.2.15). If A ∈ M_n(R) is nonsingular, then A has precisely p^s primary pth-roots, given by
\[
X_j = R \begin{bmatrix} \bigoplus_{k=1}^{r} f_{j_k}(J_{n_k}(\lambda_k)) & 0 \\ 0 & \bigoplus_{k=r+1}^{r+c} F_{j_k}(C_{n_k}(\lambda_k)) \end{bmatrix} R^{-1}, \qquad (2.2.16)
\]
where j = (j_1, …, j_r, j_{r+1}, …, j_{r+c}), j_k = (j_{k_1}, j_{k_2}) for k = r + 1, …, r + c, and j_i = j_k whenever λ_i = λ_k.

If s < t, then A has additional nonprimary pth-roots that form parameterized families given by
\[
X_j(U) = RU \begin{bmatrix} \bigoplus_{k=1}^{r} f_{j_k}(J_{n_k}(\lambda_k)) & 0 \\ 0 & \bigoplus_{k=r+1}^{r+c} F_{j_k}(C_{n_k}(\lambda_k)) \end{bmatrix} U^{-1} R^{-1}, \qquad (2.2.17)
\]
where U is an arbitrary nonsingular matrix that commutes with J_R, and for each j there exist i and k, depending on j, such that λ_i = λ_k while j_i ≠ j_k.
Proof. Following Theorem 2.2.12, there exists a real, invertible matrix R such that
\[
R^{-1} A R = J_R = \left[ \bigoplus_{k=1}^{r} J_{n_k}(\lambda_k) \right] \oplus \left[ \bigoplus_{k=r+1}^{r+c} C_{n_k}(\lambda_k) \right];
\]
if T = I ⊕ [⊕_{k=r+1}^{r+c} S_{n_k} P_{n_k}], then
\[
T^{-1} J_R T = J = \left[ \bigoplus_{k=1}^{r} J_{n_k}(\lambda_k) \right] \oplus \left[ \bigoplus_{k=r+1}^{r+c} \left( J_{n_k}(\lambda_k) \oplus J_{n_k}(\bar\lambda_k) \right) \right],
\]
following Lemmas 2.2.13 and 2.2.14. Following Theorem 2.2.10, J_R has p^s primary roots given by
\[
T \left[ \bigoplus_{k=1}^{r} f_{j_k}(J_{n_k}(\lambda_k)) \oplus \bigoplus_{k=r+1}^{r+c} \left( f_{j_{k_1}}(J_{n_k}(\lambda_k)) \oplus f_{j_{k_2}}(J_{n_k}(\bar\lambda_k)) \right) \right] T^{-1}
= \left[ \bigoplus_{k=1}^{r} f_{j_k}(J_{n_k}(\lambda_k)) \right] \oplus \left[ \bigoplus_{k=r+1}^{r+c} F_{j_k}(C_{n_k}(\lambda_k)) \right],
\]
where j_k = (j_{k_1}, j_{k_2}) for k = r + 1, …, r + c, which establishes (2.2.16).

If A is derogatory, then J_R has additional roots of the form
\[
TW \left[ \bigoplus_{k=1}^{r} f_{j_k}(J_{n_k}(\lambda_k)) \oplus \bigoplus_{k=r+1}^{r+c} \left( f_{j_{k_1}}(J_{n_k}(\lambda_k)) \oplus f_{j_{k_2}}(J_{n_k}(\bar\lambda_k)) \right) \right] W^{-1} T^{-1},
\]
where W is an arbitrary nonsingular matrix that commutes with J. Note that
\[
TW \left[ \bigoplus_{k=1}^{r} f_{j_k}(J_{n_k}(\lambda_k)) \oplus \bigoplus_{k=r+1}^{r+c} \left( f_{j_{k_1}}(J_{n_k}(\lambda_k)) \oplus f_{j_{k_2}}(J_{n_k}(\bar\lambda_k)) \right) \right] W^{-1} T^{-1}
= U \left[ \bigoplus_{k=1}^{r} f_{j_k}(J_{n_k}(\lambda_k)) \oplus \bigoplus_{k=r+1}^{r+c} F_{j_k}(C_{n_k}(\lambda_k)) \right] U^{-1},
\]
where U = TWT^{-1}. Following [15, Theorem 1, §12.4], U is an arbitrary, nonsingular matrix that commutes with J_R, which establishes (2.2.17).
30
The next result identifies the number of real primary pth-roots of a real matrix (cf. [9, Theorem 2.4]); our proof utilizes the real Jordan canonical form.
Corollary 2.2.21. Let the nonsingular real matrix A have r_1 distinct positive real eigenvalues, r_2 distinct negative real eigenvalues, and c distinct complex-conjugate pairs of eigenvalues. If p is even, there are (a) 2^{r_1} p^c real primary pth-roots when r_2 = 0; and (b) no real primary pth-roots when r_2 > 0. If p is odd, there are p^c real primary pth-roots.
Proof. Following Theorem 2.2.12, there exists a real, invertible matrix R such that
\[
R^{-1} A R = J_R = \left[ \bigoplus_{k=1}^{r} J_{n_k}(\lambda_k) \right] \oplus \left[ \bigoplus_{k=r+1}^{r+c} C_{n_k}(\lambda_k) \right].
\]
Case 1: p is even. If r_2 > 0, then A does not possess a real primary root, since A must have an
even number of Jordan blocks of each size for every negative eigenvalue and the same branch
of the pth-root function must be selected for every Jordan block containing the same negative
eigenvalue. If r_2 = 0, then, following Corollary 2.2.17, for every complex-conjugate pair of eigenvalues, there are p choices such that F_{j_k}(C_{n_k}(λ_k)) is real. For every real eigenvalue, there are two choices such that f_{j_k}(J_{n_k}(λ_k)) is real, yielding 2^{r_1} p^c real primary roots.
Case 2: p is odd. The matrix X_j is real provided that the principal branch of the pth-root function is chosen for every real eigenvalue. Similar to the first case, there are p choices such that F_{j_k}(C_{n_k}(λ_k)) is real, yielding p^c real primary roots.
The following theorem extends Theorem 2.2.10 to include singular matrices (see [9, The-
orem 2.6]).
Theorem 2.2.22 (Classification of pth-roots). Let A ∈ M_n(C) have the Jordan canonical form Z^{-1}AZ = J = J_0 ⊕ J_1, where J_0 collects together all the Jordan blocks corresponding to the eigenvalue zero and J_1 contains the remaining Jordan blocks. If A possesses a pth-root, then all pth-roots of A are given by X = Z(X_0 ⊕ X_1)Z^{-1}, where X_1 is any pth-root of J_1, characterized by Theorem 2.2.10, and X_0 is any pth-root of J_0.
Chapter 3
Matrix Roots of Nonnegative Matrices
3.1 2× 2 Matrices
We analyze and categorize the primary matrix roots of a 2 × 2 nonnegative matrix via the interpolating-polynomial definition of a matrix function (see Definition 2.2.3, (2.2.3), and (2.2.4)): let A ∈ M_2(R) and suppose A ≥ 0. Following Theorem 2.1.4 and the fact that the spectrum of a real matrix must be self-conjugate (see, e.g., [11, Appendix C]), without loss of generality, we may assume that σ(A) = {1, λ}, where 1 ≥ |λ| ≥ 0 and λ ∈ R. Note that the Jordan form of A must be
\[
\begin{bmatrix} 1 & 0 \\ 0 & \lambda \end{bmatrix} \qquad (3.1.1)
\]
or
\[
\begin{bmatrix} 1 & 1 \\ 0 & 1 \end{bmatrix}. \qquad (3.1.2)
\]
As in (2.2.5), let f_j denote the (j + 1)st branch of the pth-root function. For 2 × 2 matrices having the Jordan form (3.1.1), except when λ = 1, following (2.2.4), the required Lagrange-Hermite interpolating polynomial is given by
\[
p(t) = f_{j_1}(1)\,\frac{t-\lambda}{1-\lambda} + f_{j_2}(\lambda)\,\frac{t-1}{\lambda-1}
= \frac{\omega_p^{j_1} - f_{j_2}(\lambda)}{1-\lambda}\, t + \frac{f_{j_2}(\lambda) - \lambda\,\omega_p^{j_1}}{1-\lambda}.
\]
Thus, if j = (j_1, j_2), then A has p^2 primary pth-roots given by
\[
A_j^{1/p} = \frac{\omega_p^{j_1} - f_{j_2}(\lambda)}{1-\lambda}\, A + \frac{f_{j_2}(\lambda) - \lambda\,\omega_p^{j_1}}{1-\lambda}\, I_2. \qquad (3.1.3)
\]
3.1.1 1 > λ ≥ 0

Let A_j^{1/p} be the matrix given in (3.1.3) and consider two cases depending on the parity of p:

(1) p is even. The root A_j^{1/p} is real if and only if j_1, j_2 ∈ {0, p/2}.

If j_1 = j_2 = 0, then
\[
A_j^{1/p} = \frac{1-\sqrt[p]{\lambda}}{1-\lambda}\, A + \frac{\sqrt[p]{\lambda}-\lambda}{1-\lambda}\, I_2 \ge 0,
\]
because (1 − \sqrt[p]{λ})/(1 − λ) > 0 and (\sqrt[p]{λ} − λ)/(1 − λ) > 0.

If j_1 = j_2 = p/2, then
\[
A_j^{1/p} = \frac{\sqrt[p]{\lambda}-1}{1-\lambda}\, A + \frac{\lambda-\sqrt[p]{\lambda}}{1-\lambda}\, I_2 \le 0.
\]

If j_1 = p/2 and j_2 = 0, then A_j^{1/p} cannot be nonnegative, since the off-diagonal entries of A are multiplied by a negative coefficient and are unaffected by the added positive multiple of the identity.

If j_1 = 0 and j_2 = p/2, then f_{j_2}(λ) = −\sqrt[p]{λ}, so (3.1.3) gives
\[
A_j^{1/p} = \frac{1+\sqrt[p]{\lambda}}{1-\lambda}\, A - \frac{\lambda+\sqrt[p]{\lambda}}{1-\lambda}\, I_2,
\]
and A_j^{1/p} ≥ 0 if and only if (1 + \sqrt[p]{λ}) a_{ii} ≥ λ + \sqrt[p]{λ} for i = 1, 2.

(2) p is odd. The root A_j^{1/p} is real if and only if j_1 = j_2 = 0, in which case
\[
A_j^{1/p} = \frac{1-\sqrt[p]{\lambda}}{1-\lambda}\, A + \frac{\sqrt[p]{\lambda}-\lambda}{1-\lambda}\, I_2 \ge 0,
\]
because (1 − \sqrt[p]{λ})/(1 − λ) > 0 and (\sqrt[p]{λ} − λ)/(1 − λ) > 0.

If λ = 0, then A is a rank-one matrix and
\[
A_j^{1/p} = \omega_p^{j_1} A, \qquad j_1 \in R(p).
\]
3.1.2 1 > 0 > λ

Let A_j^{1/p} be the matrix given in (3.1.3) and consider two cases depending on the parity of p:

(1) p is even. If λ < 0 and p is even, then A_j^{1/p} ∉ M_2(R).

(2) p is odd. Same as case (2) when 1 > λ > 0.
3.1.3 λ = 1

If the Jordan form of A is (3.1.1), then (trivially) A = I_2 and \{\omega_p^{j_1} I_2\}_{j_1=0}^{p-1} are the pth-roots of A.

If the Jordan form of A is (3.1.2), then (2.2.3) reduces to
\[
p(t) = f_{j_1}(1) + f'_{j_1}(1)(t-1) = \omega_p^{j_1} + \frac{\omega_p^{j_1}}{p}(t-1),
\]
so that
\[
A_j^{1/p} = f_{j_1}(1) I_2 + f'_{j_1}(1)(A - I_2) = \frac{\omega_p^{j_1}}{p}\, A + \left[ \omega_p^{j_1} - \frac{\omega_p^{j_1}}{p} \right] I_2,
\]
which is real if and only if j_1 = 0 (or j_1 = p/2, when p is even). When j_1 = 0, note that
\[
A_j^{1/p} = \frac{1}{p}\, A + \left[ 1 - \frac{1}{p} \right] I_2 = \frac{A + (p-1) I_2}{p}.
\]
As a special case, note that if A = \begin{bmatrix} 1 & 1 \\ 0 & 1 \end{bmatrix}, then J_p := \begin{bmatrix} 1 & 1/p \\ 0 & 1 \end{bmatrix} is a pth-root of A for all p ∈ N.
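As a quick numerical sanity check (a minimal NumPy sketch), the claimed pth-root of the Jordan block can be verified directly, since the (1, 2) entries of powers of an upper-triangular 2 × 2 matrix with unit diagonal simply add:

```python
import numpy as np

A = np.array([[1.0, 1.0],
              [0.0, 1.0]])

for p in range(1, 8):
    Jp = np.array([[1.0, 1.0 / p],
                   [0.0, 1.0]])
    # [[1, 1/p], [0, 1]]^p = [[1, 1], [0, 1]]
    assert np.allclose(np.linalg.matrix_power(Jp, p), A)
```
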
3.2 Primitive Matrices
We present our main results for primitive matrices.
Theorem 3.2.1. Let the nonsingular primitive matrix A have r_1 distinct positive real eigenvalues, r_2 distinct negative real eigenvalues, and c distinct complex-conjugate pairs of eigenvalues. If p is even, there are (a) 2^{r_1 − 1} p^c eventually positive primary pth-roots when r_2 = 0; and (b) no eventually positive primary pth-roots if r_2 > 0. If p is odd, there are p^c eventually positive primary pth-roots.
Proof. If r_2 > 0, then, following Corollary 2.2.21, the matrix A does not have a real primary pth-root; hence, a fortiori, it cannot have an eventually positive primary pth-root.
If r_2 = 0, then, following Theorems 2.2.12 and 2.1.1, there exists a real, invertible matrix R such that
\[
A = \begin{bmatrix} x & R' \end{bmatrix}
\begin{bmatrix} \rho & \\ & \bigoplus_{k=2}^{r} J_{n_k}(\lambda_k) \oplus \bigoplus_{k=r+1}^{r+c} C_{n_k}(\lambda_k) \end{bmatrix}
\begin{bmatrix} y^T \\ (R^{-1})' \end{bmatrix}, \qquad (3.2.1)
\]
where R = [x  R'] ∈ M_n(R), R^{-1} = [y^T; (R^{-1})'] ∈ M_n(R) (partitioned by rows), x > 0 is the right Perron vector, and y > 0 is the left Perron vector.

Following Lemma 2.2.7, |f_j(z)| < |f_{j'}(z')| whenever |z| < |z'|, for all j, j' ∈ {0, 1, …, p − 1}, so that any primary pth-root X of the form
\[
X_j = R \begin{bmatrix} \sqrt[p]{\rho} & \\ & \bigoplus_{k=2}^{r} f_{j_k}(J_{n_k}(\lambda_k)) \oplus \bigoplus_{k=r+1}^{r+c} F_{j_k}(C_{n_k}(\lambda_k)) \end{bmatrix} R^{-1}
\]
inherits the strong Perron-Frobenius property from A. Since f(A)^T = f(A^T) ([8, Theorem 1.13(b)]), a similar argument demonstrates that X^T inherits the strong Perron-Frobenius property from A^T.
Case 1: p is even. For all k = 2, …, r, there are two possible choices such that f_{j_k}(J_{n_k}(λ_k)) is real; following Corollary 2.2.17, for all k = r + 1, …, r + c, there are p choices such that F_{j_k}(C_{n_k}(λ_k)) is real. Thus, there are 2^{r_1 − 1} p^c possible ways to select X and X^T to be real.

Case 2: p is odd. For all k = 2, …, r, the principal branch of the pth-root function must be selected so that f_{j_k}(J_{n_k}(λ_k)) is real and, similar to the previous case, there are p choices such that F_{j_k}(C_{n_k}(λ_k)) is real. Hence, there are p^c ways to select X and X^T real.

Regardless of the parity of p, following Theorem 2.1.6, the matrices X and X^T are eventually positive.
Example 3.2.2. We demonstrate Theorem 3.2.1 via an example. Consider the matrix
\[
A = \frac{1}{5}
\begin{bmatrix}
16 & 16 & 6 & 11 & 1 \\
7 & 12 & 12 & 12 & 7 \\
9 & 4 & 14 & 9 & 14 \\
8 & 8 & 8 & 13 & 13 \\
10 & 10 & 10 & 5 & 15
\end{bmatrix}.
\]
Note that A = R J_R R^{-1}, where
\[
J_R = \begin{bmatrix}
10 & 0 & 0 & 0 & 0 \\
0 & 1 & 1 & 1 & 0 \\
0 & -1 & 1 & 0 & 1 \\
0 & 0 & 0 & 1 & 1 \\
0 & 0 & 0 & -1 & 1
\end{bmatrix}, \qquad
R = \begin{bmatrix}
1 & 1 & 1 & 1 & 1 \\
1 & -1 & 0 & 0 & 0 \\
1 & 0 & -1 & 0 & 0 \\
1 & 0 & 0 & -1 & 0 \\
1 & 0 & 0 & 0 & -1
\end{bmatrix},
\]
and
\[
R^{-1} = \frac{1}{5} \begin{bmatrix}
1 & 1 & 1 & 1 & 1 \\
1 & -4 & 1 & 1 & 1 \\
1 & 1 & -4 & 1 & 1 \\
1 & 1 & 1 & -4 & 1 \\
1 & 1 & 1 & 1 & -4
\end{bmatrix}.
\]
Because σ(A) = {10, 1 + i, 1 + i, 1 − i, 1 − i}, following Theorem 3.2.1, A has eight primary matrix square-roots, of which the matrices
\[
X_j = R \begin{bmatrix}
\sqrt{10} & 0 & 0 & 0 & 0 \\
0 & 1.0987 & 0.4551 & 0.3884 & -0.1609 \\
0 & -0.4551 & 1.0987 & 0.1609 & 0.3884 \\
0 & 0 & 0 & 1.0987 & 0.4551 \\
0 & 0 & 0 & -0.4551 & 1.0987
\end{bmatrix} R^{-1}
= \begin{bmatrix}
1.6668 & 1.0232 & 0.1130 & 0.4738 & -0.1145 \\
0.2762 & 1.3749 & 0.7313 & 0.6646 & 0.1153 \\
0.3939 & -0.0612 & 1.4926 & 0.5548 & 0.7823 \\
0.3217 & 0.3217 & 0.3217 & 1.4204 & 0.7768 \\
0.5037 & 0.5037 & 0.5037 & 0.0486 & 1.6024
\end{bmatrix},
\]
where j = (0, (0, 0)), and
\[
X_{j'} = R \begin{bmatrix}
\sqrt{10} & 0 & 0 & 0 & 0 \\
0 & -1.0987 & -0.4551 & -0.3884 & 0.1609 \\
0 & 0.4551 & -1.0987 & -0.1609 & -0.3884 \\
0 & 0 & 0 & -1.0987 & -0.4551 \\
0 & 0 & 0 & 0.4551 & -1.0987
\end{bmatrix} R^{-1}
= \begin{bmatrix}
-0.4019 & 0.2417 & 1.1519 & 0.7911 & 1.3794 \\
0.9887 & -0.1100 & 0.5336 & 0.6003 & 1.1496 \\
0.8710 & 1.3261 & -0.2276 & 0.7101 & 0.4826 \\
0.9432 & 0.9432 & 0.9432 & -0.1555 & 0.4881 \\
0.7612 & 0.7612 & 0.7612 & 1.2163 & -0.3375
\end{bmatrix},
\]
where j' = (0, (1, 1)), are eventually positive square-roots of A.
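The first root above can be reproduced with standard tools: scipy.linalg.sqrtm computes the principal square root, which corresponds to the branch j = (0, (0, 0)). A hedged sketch (assuming only the matrix A displayed above):

```python
import numpy as np
from scipy.linalg import sqrtm

A = np.array([[16, 16,  6, 11,  1],
              [ 7, 12, 12, 12,  7],
              [ 9,  4, 14,  9, 14],
              [ 8,  8,  8, 13, 13],
              [10, 10, 10,  5, 15]]) / 5.0

X = sqrtm(A)                      # principal square root: branch j = (0, (0, 0))
if np.iscomplexobj(X):            # sqrtm may return a complex dtype with ~0 imag part
    X = X.real
assert np.allclose(X @ X, A)
# X has a few negative entries, yet X^2 = A > 0, so X is eventually positive
assert (X < 0).any()
assert (np.linalg.matrix_power(X, 2) > 0).all()
```
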
The following question arises from Example 3.2.2: are Xj and Xj′ the only eventually
positive square-roots of A? This is answered in the following result, which yields an explicit
description of the eventually positive primary roots of a nonsingular primitive matrix A.
Theorem 3.2.3. Let A be a nonsingular, primitive matrix and let X_j be any primary pth-root of A of the form
\[
X_j = R \begin{bmatrix} f_{j_1}(\rho) & \\ & \bigoplus_{k=2}^{r} f_{j_k}(J_{n_k}(\lambda_k)) \oplus \bigoplus_{k=r+1}^{r+c} F_{j_k}(C_{n_k}(\lambda_k)) \end{bmatrix} R^{-1},
\]
where j = (j1, . . . , jr, jr+1, . . . , jr+c) and jk = (jk1 , jk2), for k = r + 1, . . . , r + c. If p is odd,
then Xj is eventually positive if and only if
1. j1 = 0;
2. jk = 0 for all k = 2, . . . , r; and
3. jk = (0, 0) or jk = (jk1 , p− jk1) for all k = r + 1, . . . , r + c.
If p is even, then Xj is eventually positive if and only if
1. j1 = 0;
2. jk = 0 or jk = p/2 for all k = 2, . . . , r; and
3. jk = (0, 0) or jk = (jk1 , p− jk1) for all k = r + 1, . . . , r + c.
Proof. We demonstrate necessity as sufficiency is shown in the proof of Theorem 3.2.1.
To this end, we demonstrate the contrapositive. Case 1: p is odd. If the principal branch of the pth-root function is not selected for the Perron eigenvalue, then X_j cannot possess the strong Perron-Frobenius property; if j_k ≠ 0 for some k ∈ {2, …, r}, then f_{j_k}(J_{n_k}(λ_k)) is not real, so that X_j cannot be real; similarly, X_j is not real if j_k ≠ (0, 0) and j_k ≠ (j_{k_1}, p − j_{k_1}) for some k ∈ {r + 1, …, r + c}.

Case 2: p is even. The result is similar to the first case, but we note that, without loss of generality, we may assume that A does not have any negative eigenvalues (else it cannot possess a real primary root).
Theorem 3.2.4. Let A be a primitive, nonsingular, derogatory matrix that possesses a real root, and let X_j(U) be any nonprimary root of A of the form
\[
X_j(U) = RU \begin{bmatrix} f_{j_1}(\rho) & \\ & \bigoplus_{k=2}^{r} f_{j_k}(J_{n_k}(\lambda_k)) \oplus \bigoplus_{k=r+1}^{r+c} F_{j_k}(C_{n_k}(\lambda_k)) \end{bmatrix} U^{-1} R^{-1},
\]
where j = (j_1, …, j_r, j_{r+1}, …, j_{r+c}) and j_k = (j_{k_1}, j_{k_2}), for k = r + 1, …, r + c, subject to the constraint that for each j, there exist i and k, depending on j, such that λ_i = λ_k, while j_i ≠ j_k.
If p is even, then Xj(U) is eventually positive if and only if
(1) j1 = 0;
(2) jk = 0 or jk = p/2, for all k = 2, . . . , r;
(3) jk = (0, 0) or jk = (jk1 , p− jk1) for all k = r + 1, . . . , r + c;
(4) U is a real nonsingular matrix; and
(5) Jordan blocks containing negative eigenvalues are transformed, via a permutation ma-
trix, to blocks of the form (2.2.9) (see proof of Theorem 2.2.19) and branches for these
blocks are selected in accordance with Corollary 2.2.18.
If p is odd, then Xj(U) is eventually positive if and only if
(1) j1 = 0;
(2) jk = 0 for all k = 2, . . . , r;
(3) jk = (0, 0) or jk = (jk1 , p− jk1) for all k = r + 1, . . . , r + c; and
(4) U is a real nonsingular matrix.
Example 3.2.5. We demonstrate Theorem 3.2.4 via an example. Consider the matrix
\[
A = \frac{1}{9} \begin{bmatrix}
32 & 32 & 14 & 23 & 5 & 32 & 14 & 23 & 5 \\
17 & 26 & 26 & 26 & 17 & 17 & 17 & 17 & 17 \\
19 & 10 & 28 & 19 & 28 & 19 & 19 & 19 & 19 \\
18 & 18 & 18 & 27 & 27 & 18 & 18 & 18 & 18 \\
20 & 20 & 20 & 11 & 29 & 20 & 20 & 20 & 20 \\
17 & 17 & 17 & 17 & 17 & 26 & 26 & 26 & 17 \\
19 & 19 & 19 & 19 & 19 & 10 & 28 & 19 & 28 \\
18 & 18 & 18 & 18 & 18 & 18 & 18 & 27 & 27 \\
20 & 20 & 20 & 20 & 20 & 20 & 20 & 11 & 29
\end{bmatrix}.
\]
Note that A = R J_R R^{-1}, where
\[
J_R = \begin{bmatrix}
20 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 \\
0 & 1 & 1 & 1 & 0 & 0 & 0 & 0 & 0 \\
0 & -1 & 1 & 0 & 1 & 0 & 0 & 0 & 0 \\
0 & 0 & 0 & 1 & 1 & 0 & 0 & 0 & 0 \\
0 & 0 & 0 & -1 & 1 & 0 & 0 & 0 & 0 \\
0 & 0 & 0 & 0 & 0 & 1 & 1 & 1 & 0 \\
0 & 0 & 0 & 0 & 0 & -1 & 1 & 0 & 1 \\
0 & 0 & 0 & 0 & 0 & 0 & 0 & 1 & 1 \\
0 & 0 & 0 & 0 & 0 & 0 & 0 & -1 & 1
\end{bmatrix},
\]
\[
R = \begin{bmatrix}
1 & 1 & 1 & 1 & 1 & 1 & 1 & 1 & 1 \\
1 & -1 & 0 & 0 & 0 & 0 & 0 & 0 & 0 \\
1 & 0 & -1 & 0 & 0 & 0 & 0 & 0 & 0 \\
1 & 0 & 0 & -1 & 0 & 0 & 0 & 0 & 0 \\
1 & 0 & 0 & 0 & -1 & 0 & 0 & 0 & 0 \\
1 & 0 & 0 & 0 & 0 & -1 & 0 & 0 & 0 \\
1 & 0 & 0 & 0 & 0 & 0 & -1 & 0 & 0 \\
1 & 0 & 0 & 0 & 0 & 0 & 0 & -1 & 0 \\
1 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & -1
\end{bmatrix},
\]
and
\[
R^{-1} = \frac{1}{9} \begin{bmatrix}
1 & 1 & 1 & 1 & 1 & 1 & 1 & 1 & 1 \\
1 & -8 & 1 & 1 & 1 & 1 & 1 & 1 & 1 \\
1 & 1 & -8 & 1 & 1 & 1 & 1 & 1 & 1 \\
1 & 1 & 1 & -8 & 1 & 1 & 1 & 1 & 1 \\
1 & 1 & 1 & 1 & -8 & 1 & 1 & 1 & 1 \\
1 & 1 & 1 & 1 & 1 & -8 & 1 & 1 & 1 \\
1 & 1 & 1 & 1 & 1 & 1 & -8 & 1 & 1 \\
1 & 1 & 1 & 1 & 1 & 1 & 1 & -8 & 1 \\
1 & 1 & 1 & 1 & 1 & 1 & 1 & 1 & -8
\end{bmatrix}.
\]
If j = (0, (0, 0), (1, 1)), then, following Theorem 3.2.4, any matrix of the form
\[
X_j(U) = RU \begin{bmatrix} \sqrt{20} & & \\ & F_{(0,0)}(C_2(1+i)) & \\ & & F_{(1,1)}(C_2(1+i)) \end{bmatrix} U^{-1} R^{-1},
\]
where
\[
U = \begin{bmatrix}
u_1 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 \\
0 & u_2 & u_3 & 1 & 0 & u_4 & u_5 & 1 & 0 \\
0 & -u_3 & u_2 & 0 & 1 & -u_5 & u_4 & 0 & 1 \\
0 & 0 & 0 & u_2 & u_3 & 0 & 0 & u_4 & u_5 \\
0 & 0 & 0 & -u_3 & u_2 & 0 & 0 & -u_5 & u_4 \\
0 & u_6 & u_7 & 1 & 0 & u_8 & u_9 & 1 & 0 \\
0 & -u_7 & u_6 & 0 & 1 & -u_9 & u_8 & 0 & 1 \\
0 & 0 & 0 & u_6 & u_7 & 0 & 0 & u_8 & u_9 \\
0 & 0 & 0 & -u_7 & u_6 & 0 & 0 & -u_9 & u_8
\end{bmatrix},
\]
u_1 ≠ 0, and u_i ∈ R, for i = 2, …, 9, is an eventually positive square-root of A.
Next, we present an analog of Theorem 2.2.22 for real matrices.
Theorem 3.2.6. Let the primitive matrix A ∈ M_n(R) have the real Jordan canonical form R^{-1}AR = J_R = J_0 ⊕ J_1, where J_0 collects all the singular Jordan blocks and J_1 collects the remaining blocks. If A possesses a real root, then all eventually positive pth-roots of A are given by X = R(X_0 ⊕ X_1)R^{-1}, where X_1 is any pth-root of J_1 characterized by Theorem 3.2.3 or Theorem 3.2.4, and X_0 is a real pth-root of J_0.
Remark 3.2.7. It should be clear that Theorems 3.2.1, 3.2.3, 3.2.4, and 3.2.6 remain true if the assumption of primitivity is replaced with eventual positivity.
Recall that for A ∈ M_n(C) with no eigenvalues on R^−, the principal pth-root, denoted by A^{1/p}, is the unique pth-root of A all of whose eigenvalues lie in the segment {z : −π/p < arg(z) < π/p} [8, Theorem 7.2]. In addition, recall that a nonnegative matrix A is said to be stochastic if \sum_{j=1}^{n} a_{ij} = 1 for all i = 1, …, n.
In [9], motivated by applications to discrete-time Markov chains, two classes of stochastic matrices were identified that possess stochastic principal pth-roots for all p. As a consequence, and of particular interest, for these classes of matrices the twelfth-root of an annual transition matrix is itself a transition matrix. A (monthly) transition matrix that contains a negative entry or is complex is meaningless in the context of a model; however, the following remark demonstrates that, under suitable conditions, a matrix root of a primitive stochastic matrix will be eventually stochastic (i.e., eventually positive with row sums equal to one).
Eventual stochasticity may be useful in the following manner: consider, for example, the application of discrete-time Markov chains in credit risk. Let P = [p_{ij}] ∈ M_n(R) be a primitive transition matrix, where p_{ij} ∈ [0, 1] is the probability that a firm with rating i transitions to rating j. Such matrices are derived from annual data, and the twelfth-root of such a matrix would correspond to a monthly transition matrix if its entries were nonnegative. In application, however, the twelfth-root would be used for forecasting purposes (e.g., to estimate the likelihood that a firm with credit rating i transitions to rating j, m months into the future). Thus, if R is the twelfth-root of P and R^m ≥ 0, then r^{(m)}_{ij} is a candidate for the aforementioned probability. Moreover, it is well known that n^2 − 2n + 2 is a sharp upper bound for the index of primitivity of a primitive matrix (see, e.g., [11, Corollary 8.5.9]), so that eventual stochasticity is feasible in application.
Remark 3.2.8. Theorems 3.2.1, 3.2.3, 3.2.4, and 3.2.6 remain true if stochasticity is added as an assumption on the matrix A and the conclusion of eventual positivity is replaced with eventual stochasticity.
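As an illustration of the row-sum preservation underlying eventual stochasticity, the following sketch uses SciPy's fractional_matrix_power; the 3 × 3 annual transition matrix is a hypothetical example, not one taken from [9]:

```python
import numpy as np
from scipy.linalg import fractional_matrix_power

# Hypothetical primitive annual transition matrix (rows sum to one)
P = np.array([[0.90, 0.08, 0.02],
              [0.05, 0.90, 0.05],
              [0.02, 0.08, 0.90]])

R = fractional_matrix_power(P, 1 / 12)   # principal twelfth-root
assert np.allclose(np.linalg.matrix_power(R, 12), P)
# Since P e = e and R = f(P) with f(1) = 1, the row sums of R are all one
assert np.allclose(R.sum(axis=1), 1.0)
```

Because the Gershgorin disks of P lie in the open right half-plane, P has no eigenvalues on R^−, so the principal twelfth-root exists and is real.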
We conclude with the following remark.
Remark 3.2.9. If A is an eventually positive matrix, then B := A^q is eventually positive for all q ∈ N; thus, all the previous results hold for fractional powers of A.
3.3 Imprimitive Irreducible Matrices
Consider the matrices
\[
A = \begin{bmatrix} 0 & 1 & 0 \\ 0 & 0 & 1 \\ 1 & 0 & 0 \end{bmatrix}, \qquad
B = \begin{bmatrix} 0 & 0 & 1 \\ 1 & 0 & 0 \\ 0 & 1 & 0 \end{bmatrix}, \qquad
C = \frac{1}{3} \begin{bmatrix} 2 & 2 & -1 \\ -1 & 2 & 2 \\ 2 & -1 & 2 \end{bmatrix}.
\]
One can verify that B^2 = C^2 = A, yet the matrix C is not eventually nonnegative despite possessing positive left and right eigenvectors. In this section, we investigate why B is guaranteed to be eventually nonnegative while C is not.
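The claims above are easy to check numerically. A minimal NumPy sketch, using only the three matrices just displayed, verifies that B and C are both square roots of A and that the negative entries of C recur in its powers:

```python
import numpy as np

A = np.array([[0, 1, 0],
              [0, 0, 1],
              [1, 0, 0]], dtype=float)
B = np.array([[0, 0, 1],
              [1, 0, 0],
              [0, 1, 0]], dtype=float)
C = np.array([[ 2,  2, -1],
              [-1,  2,  2],
              [ 2, -1,  2]]) / 3.0

assert np.allclose(B @ B, A)
assert np.allclose(C @ C, A)

# B is a permutation matrix, so every power of B is nonnegative.  C, however,
# satisfies C^6 = A^3 = I, so its powers cycle with period 6 and the negative
# entries of C recur forever: C is not eventually nonnegative.
assert np.allclose(np.linalg.matrix_power(C, 6), np.eye(3))
assert all((np.linalg.matrix_power(C, 6 * m + 1) < -1e-12).any() for m in range(1, 4))
```
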
3.3.1 Complete Residue Systems
Our discussion of complete residue systems begins with the following definition.
Definition 3.3.1. If R = {a_0, a_1, …, a_{h−1}} is a set of integers, then R is said to be a complete residue system (mod h), denoted CRS(h), if the map f : R → R(h) defined by
\[
a_i \longmapsto q_i := a_i \ (\mathrm{mod}\ h)
\]
is injective or, equivalently, surjective.

With the aforementioned definition, we readily glean the following equivalent properties if R is a CRS(h):
(P1) if i ≠ j, then q_i ≠ q_j;

(P2) if i ≠ j, then a_i ≢ a_j (mod h); and

(P3) for a ∈ Z, if a ≡ a_i (mod h), then a ≢ a_j (mod h) for all j ≠ i.
The next two number-theoretic results are well-known (see, e.g., [17, Theorem 3.3], [18,
Theorem 4.6 and Corollary 4.7] or [7, Theorems 54 and 55]).
Theorem 3.3.2. If gcd (k,m) = d, then ka ≡ kb (mod m) if and only if
a ≡ b (mod (m/d)) .
The following special case is particularly useful.
Corollary 3.3.3. If gcd (k,m) = 1, then ka ≡ kb (modm) if and only if a ≡ b (modm).
Theorem 3.3.4. If {a_i}_{i=0}^{h−1} is a CRS(h) and p ∈ Z, then {pa_i}_{i=0}^{h−1} is a CRS(h) if and only if gcd(h, p) = 1.
Remark 3.3.5. It is common to find only sufficiency in the literature (see [7, Theorem 56],
[17, Theorem 3.5], [18, Section 4.1, Exercise 10], [13, Theorem 20(i)], or [16, Theorem 62]),
however, as the following proof shows, necessity holds as well.
Proof. If gcd(h, p) = 1, then, following Corollary 3.3.3, pa_i ≡ pa_j (mod h) if and only if a_i ≡ a_j (mod h), which holds if and only if i = j.

Conversely, if gcd(h, p) = d > 1, then {a_i}_{i=0}^{h−1} is not a CRS(h/d). Thus, there exist distinct integers i and j such that a_i ≡ a_j (mod (h/d)), which holds, by Theorem 3.3.2, if and only if pa_i ≡ pa_j (mod h); i.e., {pa_i}_{i=0}^{h−1} is not a CRS(h).
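Theorem 3.3.4 can be exercised exhaustively for a small modulus. A short Python sketch (the helper is_crs is our own, not from the text):

```python
from math import gcd

def is_crs(nums, h):
    """True iff nums is a complete residue system mod h."""
    return sorted(n % h for n in nums) == list(range(h))

h = 12
R = list(range(h))                        # the canonical CRS(h)
for p in range(1, 2 * h):
    # Theorem 3.3.4: {p*a_i} is a CRS(h) exactly when gcd(h, p) = 1
    assert is_crs([p * a for a in R], h) == (gcd(h, p) == 1)
```
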
Application to Complete Sets of Roots of Unity
For h ∈ N, h > 1, let
\[
\omega = \omega_h := \exp(2\pi i/h) \in \mathbb{C},
\]
\[
\Omega_h := \{\omega^k\}_{k=0}^{h-1} = \{\omega_h^k\}_{k=0}^{h-1} \subseteq \mathbb{C}, \qquad (3.3.1)
\]
and
\[
\Omega_h^* := \left(1, \omega, \ldots, \omega^{h-1}\right) = \left(1, \omega_h, \ldots, \omega_h^{h-1}\right), \qquad (3.3.2)
\]
i.e., Ω_h = {z ∈ C : z^h − 1 = 0}.

With ω as defined above, note that
\[
\alpha \equiv \beta \ (\mathrm{mod}\ h) \implies \omega^\alpha = \omega^\beta \qquad (3.3.3)
\]
is easily established.
Theorem 3.3.6. If {a_i}_{i=0}^{h−1} ⊂ Z, then {ω^{a_i}}_{i=0}^{h−1} = Ω_h if and only if {a_i}_{i=0}^{h−1} is a CRS(h).

Proof. If r_i := a_i (mod h), then a_i ≡ r_i (mod h) and, following (3.3.3), ω^{a_i} = ω^{r_i}, and the conclusion follows.

Conversely, if {a_i}_{i=0}^{h−1} is not a CRS(h), then, with r_i := a_i (mod h), {r_i}_{i=0}^{h−1} ⊊ R(h) and {ω^{a_i}}_{i=0}^{h−1} ⊊ Ω_h.
The following is a corollary to Theorems 3.3.4 and 3.3.6.

Corollary 3.3.7. For p > 1, let Ω_h^p := {1, ω^p, …, ω^{(h−1)p}}. Then Ω_h^p = Ω_h if and only if gcd(p, h) = 1.
Lemma 3.3.8. If a, b, c, h ∈ Z, then a ≡ b (mod h) if and only if (a + c) ≡ (b + c) (mod h).

Proof. Note that
\[
a \equiv b \ (\mathrm{mod}\ h) \iff a - b = hk \ (k \in \mathbb{Z}) \iff (a+c) - (b+c) = hk \iff (a+c) \equiv (b+c) \ (\mathrm{mod}\ h).
\]
Corollary 3.3.9. If {a_i}_{i=0}^{h−1} is a CRS(h) and ℓ ∈ Z, then {pa_i + ℓ}_{i=0}^{h−1} is a CRS(h) if and only if gcd(h, p) = 1.
Lemma 3.3.10. If a, b, c, d, h ∈ Z, then (a + hc) ≡ (b + hd) (mod h) if and only if a ≡ b (mod h).

Proof. Note that
\[
a \equiv b \ (\mathrm{mod}\ h) \iff a - b = hk \ (k \in \mathbb{Z}) \iff (a + hc) - (b + hd) = h(k + c - d) \iff (a + hc) \equiv (b + hd) \ (\mathrm{mod}\ h).
\]
Corollary 3.3.11. If {a_i}_{i=0}^{h−1} ⊂ Z and {j_i}_{i=0}^{h−1} ⊂ Z, then {a_i}_{i=0}^{h−1} is a CRS(h) if and only if {a_i + hj_i}_{i=0}^{h−1} is a CRS(h).

Proof. If {a_i + hj_i}_{i=0}^{h−1} is not a CRS(h), then there exist distinct integers k, ℓ ∈ R(h) such that (a_k + hj_k) ≡ (a_ℓ + hj_ℓ) (mod h). Following Lemma 3.3.10, this holds if and only if a_k ≡ a_ℓ (mod h), i.e., if and only if {a_i}_{i=0}^{h−1} is not a CRS(h).
Theorem 3.3.12. If f_k denotes the (k + 1)st branch of the pth-root function as defined in (2.2.5),
\[
B := \{ j = (j_0, j_1, \ldots, j_{h-1}) : j_k \in R(p), \ \forall k \in R(h) \},
\]
and
\[
(\Omega_h)_j^{1/p} := \{ f_{j_k}(\omega^k) \}_{k=0}^{h-1}, \qquad j \in B,
\]
then there exists a unique j ∈ B such that (Ω_h)_j^{1/p} = Ω_h if and only if gcd(h, p) = 1.
Remark 3.3.13. For every k ∈ R(h) and j ∈ B, note that
\[
f_{j_k}(\omega^k) = \exp\left(\frac{2\pi k i}{hp}\right) \exp\left(\frac{2\pi j_k i}{p}\right) = \exp\left(\frac{2\pi [k + h j_k] i}{hp}\right) = \omega^{(k + h j_k)/p},
\]
so that
\[
(\Omega_h)_j^{1/p} = \left\{ \omega^{(k + h j_k)/p} \right\}_{k=0}^{h-1}.
\]
Proof. Existence. If gcd(h, p) = 1, then, following Corollary 3.3.9, for each k ∈ R(h), the set {k + hj}_{j=0}^{p−1} is a CRS(p); in particular, there exists j_k ∈ R(p) such that k + hj_k ≡ 0 (mod p), i.e., k + hj_k is divisible by p. Moreover, the set
\[
\left\{ \frac{k + h j_k}{p} \right\}_{k=0}^{h-1}
\]
is a CRS(h); indeed, following Corollary 3.3.3,
\[
\frac{i + h j_i}{p} \equiv \frac{\ell + h j_\ell}{p} \ (\mathrm{mod}\ h)
\]
if and only if (i + hj_i) ≡ (ℓ + hj_ℓ) (mod h), which holds, following Lemma 3.3.10, if and only if i ≡ ℓ (mod h). Since R(h) is the canonical CRS(h), it follows that i = ℓ, and the claim is established.

Uniqueness. Uniqueness follows readily by construction of j. Moreover, it should be clear that j_0 = 0, i.e., 1 must be chosen as the pth-root of 1.

For the converse, suppose gcd(h, p) = d > 1. We claim that for at least one k, there does not exist j_k ∈ R(p) such that k + hj_k ≡ 0 (mod p); for contradiction, suppose the assertion is false, so that, for each k, there exists j_k such that k + hj_k ≡ 0 (mod p), i.e., there exists an integer q_k such that k + hj_k = pq_k, or hj_k = pq_k − k, so that pq_k ≡ k (mod h). Because k ∈ R(h), the set {pq_k}_{k=0}^{h−1} is a CRS(h), contradicting, by Theorem 3.3.4, the assumption that gcd(h, p) > 1. Following Remark 3.3.13, at least one exponent in the set
\[
(\Omega_h)_j^{1/p} = \left\{ \omega^{(k + h j_k)/p} \right\}_{k=0}^{h-1}
\]
is not an integer, so that (Ω_h)_j^{1/p} ≠ Ω_h.
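The branch selection in the existence part of the proof can be carried out mechanically. A Python sketch (the helper root_branches is ours, illustrating the construction for one choice of h and p with gcd(h, p) = 1):

```python
from math import gcd
import cmath

def root_branches(h, p):
    """For gcd(h, p) = 1: the unique j_k in R(p) with k + h*j_k = 0 (mod p)."""
    assert gcd(h, p) == 1
    return [next(j for j in range(p) if (k + h * j) % p == 0) for k in range(h)]

h, p = 5, 3
w = cmath.exp(2j * cmath.pi / h)          # omega_h
js = root_branches(h, p)

# f_{j_k}(w^k) = w^{(k + h j_k)/p}; the exponents (k + h j_k)/p are integers
# forming a CRS(h), so these pth-roots are exactly the elements of Omega_h.
exps = [(k + h * jk) // p for k, jk in enumerate(js)]
assert sorted(e % h for e in exps) == list(range(h))
assert all(abs((w ** e) ** p - w ** k) < 1e-9 for k, e in enumerate(exps))
```
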
Remark 3.3.14. If j ∈ B is the unique h-tuple specified in Theorem 3.3.12, then
\[
\left\{ \frac{k + h j_k}{p} \right\}_{k=0}^{h-1}
\]
is a CRS(h); moreover, note that 0 ≤ (k + hj_k)/p ≤ h − 1. The lower bound is trivial, and the upper bound follows because, for any k,
\[
\frac{k + h j_k}{p} \le \frac{(h-1) + h(p-1)}{p} = \frac{hp - 1}{p} = h - \frac{1}{p} < h.
\]
Thus,
\[
\left\{ \frac{k + h j_k}{p} \right\}_{k=0}^{h-1} = R(h). \qquad (3.3.4)
\]
For j ∈ B, consider
\[
(\Omega_h^*)_j^{1/p} := \left( f_{j_0}(1), f_{j_1}(\omega), \ldots, f_{j_{h-1}}(\omega^{h-1}) \right)
= \left( 1, \omega^{(1 + h j_1)/p}, \ldots, \omega^{[(h-1) + h j_{h-1}]/p} \right).
\]
Following (3.3.4), (1, ω^{(1+hj_1)/p}, …, ω^{[(h−1)+hj_{h−1}]/p}) is a permutation of the elements of Ω_h. Moreover, following Lemma 3.3.10 and (3.3.3), note that for k ∈ {2, …, h − 1},
\[
\left( \omega^{(1 + h j_1)/p} \right)^k = \omega^{(k + h k j_1)/p} = \omega^{(k + h j_k)/p},
\]
thus
\[
(\Omega_h^*)_j^{1/p} = \left( 1, \omega^{(1 + h j_1)/p}, \ldots, \omega^{[(h-1) + h j_{h-1}]/p} \right)
= \left( 1, \left( \omega^{(1 + h j_1)/p} \right)^1, \ldots, \left( \omega^{(1 + h j_1)/p} \right)^{h-1} \right).
\]
Note that
\[
(\Omega_h^*)^q := \left( 1, \omega^q, \ldots, \omega^{q(h-1)} \right) = \left( 1, (\omega^q)^1, \ldots, (\omega^q)^{h-1} \right)
\]
is a permutation of the elements of Ω_h if and only if gcd(h, q) = 1; following (3.3.3), for every h there are ϕ(h) such permutations, where ϕ denotes Euler's totient (or phi) function. Thus, if gcd(h, p) = 1, then there exists q ∈ N with gcd(h, q) = 1 such that
\[
(\Omega_h^*)_j^{1/p} = (\Omega_h^*)^q. \qquad (3.3.5)
\]
Corollary 3.3.15. Let B be defined as in Theorem 3.3.12, and let (Ω_h)_j^{q/p} := (Ω_h^q)_j^{1/p} or (Ω_h)_j^{q/p} := ((Ω_h)_j^{1/p})^q. Then there exists a unique j ∈ B such that (Ω_h)_j^{q/p} = Ω_h if and only if gcd(h, p) = gcd(h, q) = 1.
3.3.2 Jordan Chains of h-cyclic matrices
Unless otherwise noted, in this subsection it is assumed that A ∈ M_n(C) is a nonsingular, h-cyclic matrix with ordered partition Π and that, without loss of generality, A has the form (1.3.2).
Lemma 3.3.16. For i, j ∈ Z, let α_{ij} := (i − j) (mod h).

1. If {x_{⟨0,j⟩}}_{j=1}^{r} is a right Jordan chain corresponding to λ ∈ σ(A), where, for j = 1, …, r, x_{⟨0,j⟩} is partitioned conformably with A as
\[
x_{\langle 0,j \rangle} = \begin{bmatrix} x_{1j} \\ x_{2j} \\ \vdots \\ x_{hj} \end{bmatrix},
\]
then, for k = 1, …, h − 1, the set
\[
\left\{ x_{\langle k,j \rangle} := \begin{bmatrix} (\omega^k)^{\alpha_{1j}} x_{1j} \\ (\omega^k)^{\alpha_{2j}} x_{2j} \\ \vdots \\ (\omega^k)^{\alpha_{hj}} x_{hj} \end{bmatrix} \right\}_{j=1}^{r}
\]
is a right Jordan chain corresponding to λω^k.

2. If {y_{⟨i,0⟩}}_{i=1}^{r} is a left Jordan chain corresponding to λ ∈ σ(A), where, for i = 1, …, r, y_{⟨i,0⟩} is partitioned conformably with A as
\[
y_{\langle i,0 \rangle}^T = \begin{bmatrix} y_{i1}^T & y_{i2}^T & \cdots & y_{ih}^T \end{bmatrix},
\]
then, for k = 1, …, h − 1, the set
\[
\left\{ y_{\langle i,k \rangle}^T := \begin{bmatrix} (\omega^k)^{\alpha_{i1}} y_{i1}^T & (\omega^k)^{\alpha_{i2}} y_{i2}^T & \cdots & (\omega^k)^{\alpha_{ih}} y_{ih}^T \end{bmatrix} \right\}_{i=1}^{r}
\]
is a left Jordan chain corresponding to λω^k.
Proof. The following facts are easily established:
\[
\alpha_{ij} = \alpha_{(i+1)(j+1)}, \quad \forall i, j \in \mathbb{Z}, \qquad (3.3.6)
\]
\[
\alpha_{(i+1)j} = \alpha_{i(j-1)} = (\alpha_{ij} + 1) \ (\mathrm{mod}\ h), \quad \forall i, j \in \mathbb{Z}, \qquad (3.3.7)
\]
\[
\alpha \equiv \beta \ (\mathrm{mod}\ h) \implies (\omega^k)^\alpha = (\omega^k)^\beta. \qquad (3.3.8)
\]
For convenience, let h + 1 := 1. As a consequence of the h-cyclic structure of A, for i = 1, …, h and j = 1, …, r, note that the ith component of the vector A x_{⟨0,j⟩} is given by
\[
\left[ A x_{\langle 0,j \rangle} \right]_i = A_{i(i+1)} x_{(i+1)j}. \qquad (3.3.9)
\]
By hypothesis,
\[
A x_{\langle 0,1 \rangle} = \lambda x_{\langle 0,1 \rangle}, \qquad (3.3.10)
\]
so that, following (3.3.7)–(3.3.10),
\[
\left[ A x_{\langle k,1 \rangle} \right]_i
= (\omega^k)^{\alpha_{(i+1)1}} A_{i(i+1)} x_{(i+1)1}
= \lambda (\omega^k)^{\alpha_{(i+1)1}} x_{i1}
= \lambda \omega^k (\omega^k)^{\alpha_{i1}} x_{i1}
= \left[ \lambda \omega^k x_{\langle k,1 \rangle} \right]_i,
\]
i.e., (λω^k, x_{⟨k,1⟩}) constitutes a right eigenpair for A.

By hypothesis, for j = 2, …, r,
\[
A x_{\langle 0,j \rangle} = x_{\langle 0,j-1 \rangle} + \lambda x_{\langle 0,j \rangle}, \qquad (3.3.11)
\]
hence, following (3.3.9) and (3.3.11),
\[
\left[ A x_{\langle 0,j \rangle} \right]_i = A_{i(i+1)} x_{(i+1)j} = x_{i(j-1)} + \lambda x_{ij}.
\]
Following (3.3.6)–(3.3.8), (3.3.9), and (3.3.11), for i = 1, …, h and j = 2, …, r,
\[
\left[ A x_{\langle k,j \rangle} \right]_i
= (\omega^k)^{\alpha_{(i+1)j}} A_{i(i+1)} x_{(i+1)j}
= (\omega^k)^{\alpha_{(i+1)j}} \left( x_{i(j-1)} + \lambda x_{ij} \right)
= (\omega^k)^{\alpha_{i(j-1)}} x_{i(j-1)} + \lambda \omega^k (\omega^k)^{\alpha_{ij}} x_{ij},
\]
i.e.,
\[
A x_{\langle k,j \rangle} = x_{\langle k,j-1 \rangle} + \lambda \omega^k x_{\langle k,j \rangle}.
\]
For the second assertion, first note that, as a consequence of the h-cyclic structure of A, for i = 1, …, r and j = 1, …, h, the jth component of the vector y_{⟨i,0⟩}^T A is given by
\[
\left[ y_{\langle i,0 \rangle}^T A \right]_j = y_{i(j-1)}^T A_{(j-1)j}. \qquad (3.3.12)
\]
By hypothesis,
\[
y_{\langle r,0 \rangle}^T A = \lambda y_{\langle r,0 \rangle}^T, \qquad (3.3.13)
\]
so that, following (3.3.6)–(3.3.8), (3.3.12), and (3.3.13), for k = 1, …, h − 1,
\[
\left[ y_{\langle r,k \rangle}^T A \right]_j
= (\omega^k)^{\alpha_{r(j-1)}} y_{r(j-1)}^T A_{(j-1)j}
= \lambda (\omega^k)^{\alpha_{r(j-1)}} y_{rj}^T
= \lambda \omega^k (\omega^k)^{\alpha_{rj}} y_{rj}^T
= \left[ \lambda \omega^k y_{\langle r,k \rangle}^T \right]_j,
\]
i.e., (λω^k, y_{⟨r,k⟩}) is a left eigenpair for A.

By hypothesis, for i = 1, …, r − 1,
\[
y_{\langle i,0 \rangle}^T A = \lambda y_{\langle i,0 \rangle}^T + y_{\langle i+1,0 \rangle}^T, \qquad (3.3.14)
\]
hence, following (3.3.12) and (3.3.14),
\[
\left[ y_{\langle i,0 \rangle}^T A \right]_j = y_{i(j-1)}^T A_{(j-1)j} = \lambda y_{ij}^T + y_{(i+1)j}^T.
\]
Following (3.3.6)–(3.3.8), (3.3.12), and (3.3.14), for i = 1, …, r − 1 and j = 1, …, h,
\[
\left[ y_{\langle i,k \rangle}^T A \right]_j
= (\omega^k)^{\alpha_{i(j-1)}} y_{i(j-1)}^T A_{(j-1)j}
= (\omega^k)^{\alpha_{i(j-1)}} \left( \lambda y_{ij}^T + y_{(i+1)j}^T \right)
= \lambda \omega^k (\omega^k)^{\alpha_{ij}} y_{ij}^T + (\omega^k)^{\alpha_{(i+1)j}} y_{(i+1)j}^T,
\]
i.e.,
\[
y_{\langle i,k \rangle}^T A = \lambda \omega^k y_{\langle i,k \rangle}^T + y_{\langle i+1,k \rangle}^T,
\]
and the proof is complete.
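Part 1 of Lemma 3.3.16 (with r = 1 and 1 × 1 blocks) can be illustrated with the 3-cyclic permutation matrix from the start of this section. A NumPy sketch, twisting the Perron eigenvector by powers of ω:

```python
import numpy as np

h = 3
w = np.exp(2j * np.pi / h)                 # omega_h
# A 3-cyclic matrix of the form (1.3.2) with blocks A_12 = A_23 = A_31 = [1]
A = np.array([[0, 1, 0],
              [0, 0, 1],
              [1, 0, 0]], dtype=complex)

x = np.ones(3, dtype=complex)              # right eigenvector for lambda = 1
alpha = lambda i, j: (i - j) % h           # alpha_{ij} of Lemma 3.3.16

for k in range(1, h):
    # twist block i of x by (w^k)^{alpha_{i1}} (every block has size 1 here)
    xk = np.array([w ** (k * alpha(i, 1)) * x[i - 1] for i in range(1, h + 1)])
    # (lambda * w^k, x_{<k,1>}) is a right eigenpair of A
    assert np.allclose(A @ xk, (w ** k) * xk)
```
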
Corollary 3.3.17. If J_r(λ) is a Jordan block of J, then J_r(λω^k) is a Jordan block of J for k = 1, …, h − 1. In particular, if λ ∈ σ(A), then λω^k ∈ σ(A) for k = 1, …, h − 1.
Following (3.3.8), note that, for α ∈ Z,
\[
(\omega^k)^\alpha = (\omega^k)^{\alpha \ (\mathrm{mod}\ h)}. \qquad (3.3.15)
\]
For ℓ = 1, …, r,
\[
\omega^k \begin{bmatrix} (\omega^k)^{\alpha_{1\ell}} \\ (\omega^k)^{\alpha_{2\ell}} \\ \vdots \\ (\omega^k)^{\alpha_{h\ell}} \end{bmatrix}
\begin{bmatrix} (\omega^k)^{\alpha_{\ell 1}} & (\omega^k)^{\alpha_{\ell 2}} & \cdots & (\omega^k)^{\alpha_{\ell h}} \end{bmatrix}
= \omega^k \left[ (\omega^k)^{\alpha_{i\ell} + \alpha_{\ell j}} \right]_{i,j=1}^{h}.
\]
Following (3.3.15), note that
\[
(\omega^k)^{\alpha_{i\ell} + \alpha_{\ell j}}
= (\omega^k)^{(\alpha_{i\ell} + \alpha_{\ell j}) \ (\mathrm{mod}\ h)}
= (\omega^k)^{[(i-\ell)(\mathrm{mod}\ h) + (\ell - j)(\mathrm{mod}\ h)](\mathrm{mod}\ h)}
= (\omega^k)^{[(i-\ell)+(\ell-j)](\mathrm{mod}\ h)}
= (\omega^k)^{(i-j)(\mathrm{mod}\ h)}
= (\omega^k)^{\alpha_{ij}},
\]
hence
\[
\omega^k \begin{bmatrix} (\omega^k)^{\alpha_{1\ell}} \\ \vdots \\ (\omega^k)^{\alpha_{h\ell}} \end{bmatrix}
\begin{bmatrix} (\omega^k)^{\alpha_{\ell 1}} & \cdots & (\omega^k)^{\alpha_{\ell h}} \end{bmatrix}
= \omega^k \left[ (\omega^k)^{\alpha_{ij}} \right]_{i,j=1}^{h}
= \left[ (\omega^k)^{\alpha_{ij} + 1} \right]_{i,j=1}^{h}.
\]
Following (3.3.7) and (3.3.8), (ω^k)^{α_{ij}+1} = (ω^k)^{(α_{ij}+1)(mod h)} = (ω^k)^{α_{(i+1)j}}. Thus
\[
\left[ (\omega^k)^{\alpha_{(i+1)j}} \right]_{i,j=1}^{h} =
\begin{bmatrix}
\omega^k & 1 & (\omega^k)^{h-1} & \cdots & (\omega^k)^2 \\
(\omega^k)^2 & \omega^k & 1 & \cdots & (\omega^k)^3 \\
\vdots & \ddots & \ddots & \ddots & \vdots \\
(\omega^k)^{h-1} & \cdots & (\omega^k)^2 & \omega^k & 1 \\
1 & (\omega^k)^{h-1} & \cdots & (\omega^k)^2 & \omega^k
\end{bmatrix}
= \mathrm{circ}\left( \omega^k, 1, (\omega^k)^{h-1}, \ldots, (\omega^k)^2 \right).
\]
Because ω^ℓ is an hth-root of unity for ℓ ∈ Z, note that, for ℓ = 1, …, h − 1,
\[
\sum_{k=0}^{h-1} (\omega^k)^\ell = \sum_{k=0}^{h-1} (\omega^\ell)^k = \frac{(\omega^\ell)^h - 1}{\omega^\ell - 1} = 0.
\]
Thus,
\[
\sum_{k=0}^{h-1} \mathrm{circ}\left( \omega^k, 1, (\omega^k)^{h-1}, \ldots, (\omega^k)^2 \right)
= h \cdot \mathrm{circ}\big( \overbrace{0, 1, 0, \ldots, 0}^{h} \big). \qquad (3.3.16)
\]
For ℓ = 1, …, r − 1,
\[
\begin{bmatrix} (\omega^k)^{\alpha_{1\ell}} \\ (\omega^k)^{\alpha_{2\ell}} \\ \vdots \\ (\omega^k)^{\alpha_{h\ell}} \end{bmatrix}
\begin{bmatrix} (\omega^k)^{\alpha_{(\ell+1)1}} & (\omega^k)^{\alpha_{(\ell+1)2}} & \cdots & (\omega^k)^{\alpha_{(\ell+1)h}} \end{bmatrix}
= \left[ (\omega^k)^{\alpha_{i\ell} + \alpha_{(\ell+1)j}} \right]_{i,j=1}^{h}.
\]
Following (3.3.15),
\[
(\omega^k)^{\alpha_{i\ell} + \alpha_{(\ell+1)j}}
= (\omega^k)^{[\alpha_{i\ell} + \alpha_{(\ell+1)j}](\mathrm{mod}\ h)}
= (\omega^k)^{[(i-\ell)(\mathrm{mod}\ h) + (\ell+1-j)(\mathrm{mod}\ h)](\mathrm{mod}\ h)}
= (\omega^k)^{[(i+1)-j](\mathrm{mod}\ h)}
= (\omega^k)^{\alpha_{(i+1)j}},
\]
hence
\[
\begin{bmatrix} (\omega^k)^{\alpha_{1\ell}} \\ \vdots \\ (\omega^k)^{\alpha_{h\ell}} \end{bmatrix}
\begin{bmatrix} (\omega^k)^{\alpha_{(\ell+1)1}} & \cdots & (\omega^k)^{\alpha_{(\ell+1)h}} \end{bmatrix}
= \mathrm{circ}\left( \omega^k, 1, (\omega^k)^{h-1}, \ldots, (\omega^k)^2 \right).
\]
With Ω*_h as defined in (3.3.2), let λΩ*_h := (λ, λω, …, λω^{h−1}) and
\[
J(\lambda \Omega_h^*, r) := \begin{bmatrix}
J_r(\lambda) & & & \\
& J_r(\lambda\omega) & & \\
& & \ddots & \\
& & & J_r(\lambda\omega^{h-1})
\end{bmatrix} \in M_{rh}(\mathbb{C}).
\]
Thus, the Jordan form of a nonsingular h-cyclic matrix A is of the form
\[
Z^{-1} A Z = J = \bigoplus_{i=1}^{t} J(\lambda_i \Omega_h^*, r_i),
\]
and the Jordan chains comprising the matrix Z may be selected as in Lemma 3.3.16; i.e., if {x_{⟨0,j⟩}}_{j=1}^{r} is a right Jordan chain and {y_{⟨i,0⟩}}_{i=1}^{r} is a left Jordan chain for λ, then {x_{⟨k,j⟩}}_{j=1}^{r} is a right Jordan chain and {y_{⟨i,k⟩}}_{i=1}^{r} is a left Jordan chain for λω^k, k = 1, …, h − 1.
Lemma 3.3.18. For i = 1, …, t, if
\[
A_i := Z \,\mathrm{diag}\Big( 0, \ldots, 0, \overbrace{J(\lambda_i \Omega_h^*, r_i)}^{i}, 0, \ldots, 0 \Big)\, Z^{-1} \in M_n(\mathbb{C}), \qquad (3.3.17)
\]
then Γ(A_i) ⊆ Γ(χ_Π), A_i commutes with A, and A_i A_j = A_j A_i = 0 for j ∈ {1, …, t}, i ≠ j.
Proof. Note that
\[
A_i = \sum_{k=0}^{h-1} \left( \sum_{j=1}^{r_i} \lambda \omega^k\, x_{\langle k,j \rangle} y_{\langle j,k \rangle}^T + \sum_{j=1}^{r_i - 1} x_{\langle k,j \rangle} y_{\langle j+1,k \rangle}^T \right).
\]
Writing each outer product blockwise, the (p, q) block of x_{⟨k,j⟩} y_{⟨j,k⟩}^T is (ω^k)^{α_{pj} + α_{jq}} x_{pj} y_{jq}^T, and the (p, q) block of x_{⟨k,j⟩} y_{⟨j+1,k⟩}^T is (ω^k)^{α_{pj} + α_{(j+1)q}} x_{pj} y_{(j+1)q}^T. Hence, by the two circulant identities established above, the scalar multiplying block (p, q) is the (p, q) entry of circ(ω^k, 1, (ω^k)^{h−1}, …, (ω^k)^2), and interchanging the order of summation yields
\[
A_i = \lambda \sum_{j=1}^{r_i} \left( \sum_{k=0}^{h-1} \mathrm{circ}\big( \omega^k, 1, (\omega^k)^{h-1}, \ldots, (\omega^k)^2 \big) \right) \circ \left[ x_{pj} y_{jq}^T \right]_{p,q=1}^{h}
+ \sum_{j=1}^{r_i - 1} \left( \sum_{k=0}^{h-1} \mathrm{circ}\big( \omega^k, 1, (\omega^k)^{h-1}, \ldots, (\omega^k)^2 \big) \right) \circ \left[ x_{pj} y_{(j+1)q}^T \right]_{p,q=1}^{h},
\]
where ∘ denotes this blockwise scaling. Following (3.3.16),
\[
A_i = \lambda \sum_{j=1}^{r_i} \mathrm{circ}\big( \overbrace{0, h, 0, \ldots, 0}^{h} \big) \circ \left[ x_{pj} y_{jq}^T \right]_{p,q=1}^{h}
+ \sum_{j=1}^{r_i - 1} \mathrm{circ}\big( \overbrace{0, h, 0, \ldots, 0}^{h} \big) \circ \left[ x_{pj} y_{(j+1)q}^T \right]_{p,q=1}^{h}
= \begin{bmatrix}
0 & A_{12}^{(i)} & & & \\
& 0 & A_{23}^{(i)} & & \\
& & \ddots & \ddots & \\
& & & 0 & A_{(h-1)h}^{(i)} \\
A_{h1}^{(i)} & & & & 0
\end{bmatrix},
\]
where, for ℓ = 1, …, h (with h + 1 := 1),
\[
A_{\ell(\ell+1)}^{(i)} := \lambda \sum_{j=1}^{r_i} h\, x_{\ell j}\, y_{j(\ell+1)}^T + \sum_{j=1}^{r_i - 1} h\, x_{\ell j}\, y_{(j+1)(\ell+1)}^T. \qquad (3.3.18)
\]
The conclusion that Γ(A_i) ⊆ Γ(χ_Π) follows because {x_{⟨0,j⟩}}_{j=1}^{r_i} and {y_{⟨j,0⟩}}_{j=1}^{r_i} are partitioned conformably with χ_Π; the matrices A_i and A commute because they are simultaneously triangularizable; and A_i A_j = A_j A_i = 0, for i ≠ j, by construction of the matrices A_i and A_j.
Corollary 3.3.19. If $r = 1$ and $\{x_{\langle 0,j\rangle}\}_{j=1}^{r}$ and $\{y_{\langle i,0\rangle}\}_{i=1}^{r}$ are strictly nonzero Jordan chains and A has cyclic index h, then $A_i$ has cyclic index h, $\Gamma(A_i) = \Gamma(\chi_\Pi)$, and $A_i$ commutes with A.
As a special case, we examine the following: let A be a nonnegative, irreducible, imprimitive, nonsingular matrix with index of cyclicity h and assume, without loss of generality, that $\rho(A) = 1$. Following Theorem 2.1.3 and Corollary 3.3.17, note that the Jordan form of A is
$$Z^{-1}AZ = \operatorname{diag}\big(J(\Omega_h^*, 1),\, J(\lambda_2\Omega_h^*, r_2),\, \dots,\, J(\lambda_t\Omega_h^*, r_t)\big).$$
Consider the matrix
$$A_1 := Z \operatorname{diag}\big(J(\Omega_h^*, 1),\, 0\big) Z^{-1}.$$
Following Corollary 3.3.19, $\Gamma(A_1) = \Gamma(\chi_\Pi)$ and $AA_1 = A_1A$. Moreover, following Theorem 2.1.3, there exist positive vectors x and y such that $Ax = x$ and $y^TA = y^T$. If x and y are partitioned conformably with A as
$$x = \begin{bmatrix} x_1 \\ x_2 \\ \vdots \\ x_h \end{bmatrix} \quad \text{and} \quad y^T = \begin{bmatrix} y_1^T & y_2^T & \cdots & y_h^T \end{bmatrix},$$
then, following (3.3.18),
$$A_1 = h \begin{bmatrix}
0 & x_1y_2^T & \cdots & \cdots & 0 \\
0 & 0 & x_2y_3^T & \cdots & 0 \\
\vdots & \vdots & \ddots & \ddots & \vdots \\
0 & 0 & \cdots & 0 & x_{h-1}y_h^T \\
x_hy_1^T & 0 & \cdots & 0 & 0
\end{bmatrix} \geq 0.$$
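As a numerical sanity check of the computation above (our own illustration; the row-stochastic blocks B and C below are arbitrary choices, not from the text), the following sketch builds a 2-cyclic nonnegative matrix with ρ(A) = 1, extracts the positive right and left Perron vectors x and y, and verifies that the peripheral spectral part of A agrees with the block formula for A₁ and is nonnegative.

```python
import numpy as np

# Build a 2-cyclic, row-stochastic (hence rho(A) = 1) nonnegative matrix.
B = np.array([[0.5, 0.5], [0.25, 0.75]])
C = np.array([[0.6, 0.4], [0.3, 0.7]])
A = np.block([[np.zeros((2, 2)), B],
              [C, np.zeros((2, 2))]])
h = 2

# Right Perron vector x (row-stochastic, so x = ones) and left Perron
# vector y, normalized so that y @ x = 1.
x = np.ones(4)
w, V = np.linalg.eig(A.T)
y = np.real(V[:, np.argmin(np.abs(w - 1))])
y = y / (y @ x)

# Peripheral spectral part: for h = 2 the eigenpairs for -1 are
# (x1; -x2) and (y1; -y2), with the same normalization.
s = np.array([1, 1, -1, -1])
A1_spectral = np.outer(x, y) - np.outer(s * x, s * y)

# The block formula A1 = h * [[0, x1 y2^T], [x2 y1^T, 0]] from (3.3.18):
A1_formula = h * np.block([[np.zeros((2, 2)), np.outer(x[:2], y[2:])],
                           [np.outer(x[2:], y[:2]), np.zeros((2, 2))]])

assert np.allclose(A1_spectral, A1_formula)
assert np.all(A1_formula >= 0)   # A1 is nonnegative, as claimed
```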
3.3.3 Main Results
Unless otherwise noted, we assume $A \in M_n(\mathbb{R})$ is a nonnegative, nonsingular, irreducible, imprimitive matrix with $\rho(A) = 1$. Before we state our main results on such matrices, we introduce some concepts and notation pertinent to the discussion: following Friedland [5], for a (multi-)set $\sigma = \{\lambda_i\}_{i=1}^{n} \subseteq \mathbb{C}$, let $\rho = \rho(\sigma) := \max_i |\lambda_i|$ and $\bar{\sigma} := \{\bar{\lambda}_i\}_{i=1}^{n} \subseteq \mathbb{C}$. If $\sigma = \bar{\sigma}$, we say that $\sigma$ is self-conjugate. Clearly, $\sigma$ is self-conjugate if and only if $\bar{\sigma}$ is self-conjugate. The (multi-)set $\sigma$ is said to be a Frobenius (multi-)set if, for some positive integer $h \leq n$, the following properties hold:

(i) $\rho(\sigma) > 0$;

(ii) $\sigma \cap \{z \in \mathbb{C} : |z| = \rho(\sigma)\} = \rho\,\Omega_h$; and

(iii) $\sigma = \omega\sigma$, i.e., $\sigma$ is invariant under rotation by the angle $2\pi/h$.

Clearly, the set $\Omega_h$ as defined in (3.3.1) is a self-conjugate Frobenius set. Let $\lambda = r\exp(i\theta) \in \mathbb{C}$, $\Im(\lambda) \neq 0$, $\lambda\Omega_h := \{\lambda, \lambda\omega_h, \dots, \lambda\omega_h^{h-1}\}$, $\varphi = 2\pi k/h$, and assume $\gcd(h, p) = 1$.
With $f_j$ as defined in (2.2.5), note that
$$
\begin{aligned}
f_i(\lambda)\, f_{j_k}(\omega_h^k) &= r^{1/p} \exp\big(i[\theta + 2\pi i]/p\big) \exp\big(i[\varphi + 2\pi j_k]/p\big) \\
&= r^{1/p} \exp\big(i[(\theta + \varphi) + 2\pi(i + j_k)]/p\big) \\
&= r^{1/p} \exp\big(i[\theta + \varphi]/p\big) \exp\big(i[2\pi(i + j_k)]/p\big) \\
&= r^{1/p} \exp\big(i[\theta + \varphi]/p\big)\, \omega_p^{\,i + j_k} \\
&= r^{1/p} \exp\big(i[\theta + \varphi]/p\big)\, \omega_p^{\,(i + j_k) \bmod p} && \text{(by (3.3.8))} \\
&= r^{1/p} \exp\big(i[\theta + \varphi]/p\big) \exp\big(i\,2\pi[(i + j_k) \bmod p]/p\big) \\
&= r^{1/p} \exp\big(i[(\theta + \varphi) + 2\pi((i + j_k) \bmod p)]/p\big) = f_{j_k^{(i)}}(\tilde{\lambda}),
\end{aligned} \tag{3.3.19}
$$
where $\tilde{\lambda} = r\exp(i[\theta + \varphi])$ and $j_k^{(i)} = (i + j_k) \bmod p \in \{0, 1, \dots, p-1\}$.
Hence, following Theorem 3.3.12, there exists a unique $j \in B$ such that, for all $i \in R(p)$, the set
$$f_i(\lambda)\,(\Omega_h)^{1/p}_j := \big\{ f_i(\lambda) f_{j_0}(1),\, f_i(\lambda) f_{j_1}(\omega_h),\, \dots,\, f_i(\lambda) f_{j_{h-1}}(\omega_h^{h-1}) \big\} \tag{3.3.20}$$
is a self-conjugate Frobenius set; moreover, following the analysis leading up to (3.3.19), there exists $j^{(i)} = \big(j^{(i)}_0, j^{(i)}_1, \dots, j^{(i)}_{h-1}\big) \in B$ such that
$$f_i(\lambda)\,(\Omega_h)^{1/p}_j = \big\{ f_{j^{(i)}_0}(\lambda),\, f_{j^{(i)}_1}(\lambda\omega_h),\, \dots,\, f_{j^{(i)}_{h-1}}(\lambda\omega_h^{h-1}) \big\}. \tag{3.3.21}$$
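The identity (3.3.19) can be spot-checked numerically. In the sketch below we assume the branch definition $f_j(z) = |z|^{1/p}\exp(i(\arg z + 2\pi j)/p)$ with the argument taken in $[0, 2\pi)$, which is our reading of (2.2.5); the particular values of p, h, and λ are our own arbitrary choices, selected so that $\theta + \varphi < 2\pi$.

```python
import numpy as np

p, h = 3, 5
omega_h = np.exp(2j * np.pi / h)

def f(j, z):
    # Assumed branch of the pth root: f_j(z) = |z|^(1/p) e^{i(arg z + 2 pi j)/p},
    # with arg z taken in [0, 2*pi).
    theta = np.angle(z) % (2 * np.pi)
    return abs(z) ** (1 / p) * np.exp(1j * (theta + 2 * np.pi * j) / p)

lam = 1.3 * np.exp(0.7j)   # lambda = r e^{i theta}, theta + phi stays below 2 pi
k = 1
for i in range(p):
    for jk in range(p):
        lhs = f(i, lam) * f(jk, omega_h ** k)
        rhs = f((i + jk) % p, lam * omega_h ** k)   # branch (i + j_k) mod p
        assert np.isclose(lhs, rhs)
```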
The following important lemma was introduced and stated without proof in [5, §4, Lemma
1] and proven rigorously in [25, Theorem 3.1].
Lemma 3.3.20. Let A be an eventually nonnegative matrix. If A is not nilpotent, then the
spectrum of A is a union of self-conjugate Frobenius sets.
As a consequence of Theorem 3.3.12 and Lemma 3.3.20, it should be clear that A cannot possess an eventually nonnegative matrix pth-root if $\gcd(h, p) \neq 1$; however, more can be ascertained.
Partition the Jordan form of A as
$$\operatorname{diag}(J_+,\, J_-,\, J_C) \tag{3.3.22}$$
where:

(1) $J(\lambda\Omega_h^*, r)$ is a submatrix of $J_+$ if and only if $\sigma(J(\lambda\Omega_h^*, r)) \cap \mathbb{R}_+ \neq \emptyset$ and $\sigma(J(\lambda\Omega_h^*, r)) \cap \mathbb{R}_- = \emptyset$;

(2) $J(\lambda\Omega_h^*, r)$ is a submatrix of $J_-$ if and only if $\sigma(J(\lambda\Omega_h^*, r)) \cap \mathbb{R}_- \neq \emptyset$; and

(3) $J(\lambda\Omega_h^*, r)$ is a submatrix of $J_C$ if and only if $\sigma(J(\lambda\Omega_h^*, r)) \cap \mathbb{R} = \emptyset$.

Suppose $J_+$ has $r_1$ distinct blocks of the form $J(\lambda\Omega_h^*, r)$, $J_-$ has $r_2$ distinct blocks of the form $J(\lambda\Omega_h^*, r)$, and $J_C$ has $c$ distinct blocks of the form $J(\lambda\Omega_h^*, r)$.
We are now ready to present our main results.
Theorem 3.3.21. Let $A \in M_n(\mathbb{R})$ and suppose that $A \geq 0$ is nonsingular, irreducible, and imprimitive. Let $h > 1$ be the cyclic index of A, and suppose that Π describes the h-cyclic structure of A, the Jordan form of A is partitioned as in (3.3.22), and $\gcd(h, p) = 1$. If p is even, then A has (a) $2^{r_1-1}p^c$ eventually nonnegative primary pth-roots when $r_2 = 0$; and (b) no eventually nonnegative primary pth-roots when $r_2 > 0$. If p is odd, then A has $p^c$ eventually nonnegative primary pth-roots.
Proof. For $\lambda \in \mathbb{C}$, $\lambda \neq 0$, and $j = (j_0, j_1, \dots, j_{h-1}) \in B$, let
$$F_j(J(\lambda\Omega_h^*, r)) := \operatorname{diag}\big(f_{j_0}(J_r(\lambda)),\, f_{j_1}(J_r(\lambda\omega)),\, \dots,\, f_{j_{h-1}}(J_r(\lambda\omega^{h-1}))\big).$$
We form an eventually nonnegative root X by carefully selecting a root for each submatrix
of every block appearing in (3.3.22).
Case 1: p is even. We consider the blocks appearing in (3.3.22):

(i) $J_+$: Following Theorem 2.1.3, $J(\Omega_h^*, 1)$ is a submatrix of $J_+$ (note that h must be odd) and, following Theorem 3.3.12, there exists $j = (j_0, j_1, \dots, j_{h-1}) \in B$ such that $\sigma(F_j(J(\Omega_h^*, 1))) = \Omega_h$. With that specific choice of j, it is clear that $\rho(F_j(J(\Omega_h^*, 1))) = 1$.

For any other submatrix $J(\lambda\Omega_h^*, r)$ of $J_+$, without loss of generality, we may assume that $\lambda \in \mathbb{R}_+$. For every such λ, k must be chosen such that $f_k(\lambda)$ is real (see Corollary 2.2.21), and $f_k(\lambda)$ is real if $k = 0$ or $k = p/2$; hence, there are two choices such that $f_k(\lambda)$ is real and, following (3.3.20) and (3.3.21), two choices such that $\sigma\big(F_{j^{(k)}}(J(\lambda\Omega_h^*, r))\big)$ is a self-conjugate Frobenius set.
(ii) $J_-$: If $r_2 > 0$, then A does not have a real primary root so that, a fortiori, it cannot have an eventually nonnegative primary root.
(iii) $J_C$: For any submatrix $J(\lambda\Omega_h^*, r)$ of $J_C$, following (3.3.20) and (3.3.21), for the same j chosen for $F_j(J(\Omega_h^*, 1))$ in (i), there exists
$$j^{(k)} = \big(j^{(k)}_0, j^{(k)}_1, \dots, j^{(k)}_{h-1}\big) \in B$$
such that $\sigma\big(F_{j^{(k)}}(J(\lambda\Omega_h^*, r))\big)$ is a self-conjugate Frobenius set for all $k \in R(p)$. Thus, there are $p^c$ ways to choose roots for blocks in $J_C$.
Following the analysis contained in (i)–(iii), note that there are $2^{r_1-1}p^c$ ways to form a root in this manner.
Case 2: p is odd. Following Theorem 3.3.12 and properties of the pth-root function, if p is odd, then for any submatrix $J(\lambda\Omega_h^*, r)$ of $J_+$ or $J_-$, there is only one choice $j \in B$ such that $\sigma(F_j(J(\lambda\Omega_h^*, r)))$ is a self-conjugate Frobenius set. For submatrices $J(\lambda\Omega_h^*, r)$ of $J_C$, the analysis is the same as in (iii). In this manner, there are $p^c$ possible selections.
Partition the Jordan form of A as $\operatorname{diag}\big(J(\Omega_h^*, 1),\, \tilde{J}\big)$. With the above partition in mind, consider the matrix pth-root of A given by
$$X := Z \operatorname{diag}\big(F_j(J(\Omega_h^*, 1)),\, F(\tilde{J})\big) Z^{-1},$$
where j is selected as in Theorem 3.3.12 and $F(\tilde{J})$ is a pth-root of $\tilde{J}$ containing blocks of the form $F_{j^{(k)}}(J(\lambda\Omega_h^*, r))$, chosen as in Case 1 or Case 2. Because every power of a reducible matrix is reducible, note that X must be irreducible. If $\hat{h}$ is the cyclic index of X, then $1 \leq \hat{h} \leq h$ and $\hat{h}$ must divide h (if $\hat{h} > h$, then X would have $\hat{h}$ eigenvalues of maximum modulus so that A would have $\hat{h}$ eigenvalues of maximum modulus, contradicting the maximality of h). However, we claim that $\hat{h} = h$. For contradiction, assume $\hat{h} < h$ and consider the matrix $A_1$ given by
$$A_1 = Z \operatorname{diag}\big(J(\Omega_h^*, 1),\, 0\big) Z^{-1}.$$
Following Corollary 3.3.19, $A_1 \geq 0$ is irreducible and h-cyclic, and Π describes the h-cyclic structure of $A_1$. Next, consider the matrix $X_1$ given by
$$X_1 := Z \operatorname{diag}\big(F_j(J(\Omega_h^*, 1)),\, 0\big) Z^{-1},$$
which is a matrix pth-root of $A_1$. Following Corollary 3.3.19, the cyclic index of $X_1$ is h. However, following the remarks leading up to (3.3.5), there exists $q \in \mathbb{N}$ relatively prime to h such that $X_1 = A_1^q \geq 0$ (along with $X_1$ being a pth-root of $A_1$, this implies $X_1 = X_1^{pq}$). Thus, $X_1$ is a nonnegative, irreducible, h-cyclic matrix, contradicting the assumption that $\hat{h} < h$. Hence, $\hat{h} = h$ and X is h-cyclic. Before we continue with the proof, note that $\Gamma(X) \subseteq \Gamma(X_1) = \Gamma(\chi_\Pi^q)$.
Following Theorem 2.2.5, there exists $\widetilde{Z} \in M_n(\mathbb{C})$ such that
$$\widetilde{Z}^{-1}X\widetilde{Z} = \operatorname{diag}\Big(F_j(J(\Omega_h^*, 1)),\, J\big(f_{j_2}(\lambda_2)(\Omega_h^*)^{1/p}_j, r_2\big),\, \dots,\, J\big(f_{j_t}(\lambda_t)(\Omega_h^*)^{1/p}_j, r_t\big)\Big).$$
By construction of $\widetilde{Z}$, note that
$$X_1 = \widetilde{Z} \operatorname{diag}\big(F_j(J(\Omega_h^*, 1)),\, 0\big) \widetilde{Z}^{-1} = Z \operatorname{diag}\big(F_j(J(\Omega_h^*, 1)),\, 0\big) Z^{-1}.$$
Let
$$X_2 := \widetilde{Z} \operatorname{diag}\Big(0,\, J\big(f_{j_2}(\lambda_2)(\Omega_h^*)^{1/p}_j, r_2\big),\, \dots,\, J\big(f_{j_t}(\lambda_t)(\Omega_h^*)^{1/p}_j, r_t\big)\Big) \widetilde{Z}^{-1}.$$
Following Lemma 3.3.18, $X_2$ is h-cyclic and $\Gamma(X_2) \subseteq \Gamma(\chi_\Pi^q)$; in addition, $X_1X_2 = X_2X_1 = 0$. Thus, for $k \in \mathbb{N}$, $X^k = X_1^k + X_2^k$. Because $\rho(X_2) < 1$, $\lim_{k\to\infty} X_2^k = 0$. The matrices $X_1$ and $X_2$ are h-cyclic, $\operatorname{rank}(X_1^2) = \operatorname{rank}(X_1)$, $\operatorname{rank}(X_2^2) = \operatorname{rank}(X_2)$, and $\Gamma(X_2) \subseteq \Gamma(X_1) = \Gamma(\chi_\Pi^q)$; thus, following [10, Theorem 2.7], note that $\Gamma(X_2^k) \subseteq \Gamma(X_1^k) = \Gamma(\chi_\Pi^{qk})$. Thus, there exists $\hat{p} \in \mathbb{N}$ such that $X_1^k > X_2^k$ for all $k \geq \hat{p}$, i.e., X is eventually nonnegative.
Remark 3.3.22. Following the notation of Theorem 3.3.21, note that all primary roots of A are given by
$$X = Z \operatorname{diag}\big(F_{j_1}(J(\Omega_h^*, 1)),\, F_{j_2}(J(\lambda_2\Omega_h^*, r_2)),\, \dots,\, F_{j_t}(J(\lambda_t\Omega_h^*, r_t))\big) Z^{-1},$$
where $j_k \in B$ for $k = 1, \dots, t$, subject to the constraint that $j_k = j_\ell$ (because $j_k$ and $j_\ell$ are ordered p-tuples, equality is meant entrywise) if $\lambda_k = \lambda_\ell$.
If A is derogatory, i.e., some eigenvalue appears in more than one Jordan block in the Jordan form of A, then A has additional roots of the form
$$X(U) = ZU \operatorname{diag}\big(F_{j_1}(J(\Omega_h^*, 1)),\, F_{j_2}(J(\lambda_2\Omega_h^*, r_2)),\, \dots,\, F_{j_t}(J(\lambda_t\Omega_h^*, r_t))\big) U^{-1}Z^{-1},$$
where U is any matrix that commutes with J, subject to the constraint that $j_k \neq j_\ell$ if $\lambda_k = \lambda_\ell$. Select branches $j_1, j_2, \dots, j_t$ following the proof of Theorem 3.3.21. Following [15, Theorem 1, §12.4], note that the matrix U must be of the form
$$U = \operatorname{diag}(u_1, u_2, \dots, u_h) \oplus \widetilde{U},$$
and X(U) is real if U is selected to be real. The matrix X(U) is irreducible because $X(U)^p = A$. Moreover,
$$X_1(U) := ZU \operatorname{diag}\big(F_{j_1}(J(\Omega_h^*, 1)),\, 0\big) U^{-1}Z^{-1} = X_1,$$
where X1 is defined as in the proof of Theorem 3.3.21. Thus, X1(U) is a nonnegative,
irreducible, h-cyclic matrix that commutes with X(U), so the argument demonstrating the
eventual nonnegativity of X is also valid for X(U).
Corollary 3.3.23. Let $A \in M_n(\mathbb{R})$ and suppose that $A \geq 0$ is irreducible and imprimitive with index of cyclicity h. Then A possesses an eventually nonnegative pth-root if and only if A possesses a real root and $\gcd(h, p) = 1$.
Theorem 3.3.24. Let $A \in M_n(\mathbb{R})$ and suppose that $A \geq 0$ is irreducible and imprimitive. If $Z^{-1}AZ = J = J_0 \oplus J_1$ is a Jordan canonical form of A, where $J_0$ collects all the singular Jordan blocks and $J_1$ collects the remaining Jordan blocks, and A possesses a real root, then all eventually nonnegative pth-roots of A are given by $X = Z(X_0 \oplus X_1)Z^{-1}$, where $X_1$ is any pth-root of $J_1$, characterized by Theorem 3.3.21 or Remark 3.3.22, and $X_0$ is a real pth-root of $J_0$.
Although the following result is known (see [10, Algorithm 3.1 and its proof]), our work
provides another proof.
Corollary 3.3.25. If $A \in M_n(\mathbb{R})$ is irreducible with index of cyclicity h, where $h > 1$, and Π describes the h-cyclic structure of A, then A is eventually nonnegative if and only if there exists a nonsingular matrix Z such that
$$Z^{-1}AZ = \operatorname{diag}\big(J(\Omega_h^*, 1),\, \tilde{J}\big),$$
and, associated with $\rho(A) = 1 \in \sigma(A)$, are positive right and left eigenvectors x and y, respectively.
Proof. If A is eventually nonnegative, select p relatively prime to h such that $A^p \geq 0$. Then A is a pth-root of $A^p$ and the result follows from Theorem 3.3.24.

The converse is clear by setting $X_1 = Z \operatorname{diag}\big(J(\Omega_h^*, 1),\, 0\big) Z^{-1}$ and $X_2 = Z \operatorname{diag}\big(0,\, \tilde{J}\big) Z^{-1}$.
Remark 3.3.26. It should be clear that Theorem 3.3.24 remains true if the assumption of
nonnegativity is replaced with eventual nonnegativity.
3.4 Reducible Matrices
Recall that a matrix $A \in M_n(\mathbb{C})$, $n \geq 2$, is reducible if there exists a permutation matrix P such that
$$A = P \begin{bmatrix} A_{11} & A_{12} \\ 0 & A_{22} \end{bmatrix} P^T,$$
where $A_{11}$ and $A_{22}$ are square, nonempty matrices. Via strong induction, it can be shown that if A is reducible, then there exists a permutation matrix P such that
$$P^TAP = \begin{bmatrix}
A_{11} & A_{12} & \cdots & A_{1k} \\
 & A_{22} & \cdots & A_{2k} \\
 & & \ddots & \vdots \\
 & & & A_{kk}
\end{bmatrix}, \tag{3.4.1}$$
where each matrix $A_{ii}$ appearing on the diagonal is irreducible. Recall that the matrix on the right-hand side of (3.4.1) is known as the Frobenius, or irreducible, normal form of A.
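In practice the Frobenius normal form can be computed from the strongly connected components of the digraph of A. The sketch below is our own illustration, not part of the dissertation (the routine name and the example matrix are ours): it finds the strongly connected components with scipy and orders them topologically, so that the permuted matrix is block upper triangular with irreducible diagonal blocks.

```python
import numpy as np
from scipy.sparse import csr_matrix
from scipy.sparse.csgraph import connected_components

def frobenius_normal_form(A):
    """Return a permutation p such that A[p][:, p] is block upper
    triangular with irreducible diagonal blocks (Frobenius normal form)."""
    n = A.shape[0]
    ncomp, labels = connected_components(csr_matrix(A != 0),
                                         directed=True, connection='strong')
    # Topologically sort the condensation (Kahn's algorithm) so that all
    # edges run from earlier blocks to later ones.
    indeg = np.zeros(ncomp, dtype=int)
    edges = set()
    for i, j in zip(*np.nonzero(A)):
        if labels[i] != labels[j]:
            edges.add((labels[i], labels[j]))
    for _, c in edges:
        indeg[c] += 1
    order, stack = [], [c for c in range(ncomp) if indeg[c] == 0]
    while stack:
        c = stack.pop()
        order.append(c)
        for a, b in list(edges):
            if a == c:
                edges.discard((a, b))
                indeg[b] -= 1
                if indeg[b] == 0:
                    stack.append(b)
    rank = {c: r for r, c in enumerate(order)}
    # Sort indices by (component rank, index within component).
    return np.argsort([rank[labels[i]] * (n + 1) + i for i in range(n)])

# Example: a 4-by-4 reducible matrix with components {1}, {0}, {2, 3}.
A = np.array([[1., 0, 2, 0],
              [3, 1, 0, 4],
              [0, 0, 1, 5],
              [0, 0, 6, 1]])
p = frobenius_normal_form(A)
B = A[np.ix_(p, p)]
# B is now block upper triangular: everything below the diagonal
# blocks of sizes 1, 1, 2 is zero.
assert np.allclose(B[2:, :2], 0) and B[1, 0] == 0
```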
Identifying the eventually nonnegative matrix roots of nonnegative reducible matrices
is beyond the scope of this dissertation (and perhaps a dissertation in and of itself). The
main difficulty arises in controlling entries of the off-diagonal blocks in the Frobenius normal
form (1.3.1). Moreover, the assumption that gcd (h, p) = 1 is not necessarily required: for
instance, consider the matrix
$$A = \begin{bmatrix}
0 & 1 & 0 & 0 \\
0 & 0 & 1 & 0 \\
0 & 0 & 0 & 1 \\
1 & 0 & 0 & 0
\end{bmatrix}$$
and note that
$$B := \begin{bmatrix}
0 & 0 & 1 & 0 \\
0 & 0 & 0 & 1 \\
1 & 0 & 0 & 0 \\
0 & 1 & 0 & 0
\end{bmatrix} = A^2;$$
B is a reducible 2-cyclic matrix, $\sigma(B) = \Omega_2 \cup \Omega_2$, and B has an irreducible nonnegative square root, even though $\gcd(2, 2) = 2 > 1$.
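The example can be verified directly; the following is our own numerical restatement of the computation above.

```python
import numpy as np

# The 4-cycle permutation A and its square B from the example above.
A = np.array([[0, 1, 0, 0],
              [0, 0, 1, 0],
              [0, 0, 0, 1],
              [1, 0, 0, 0]])
B = A @ A
assert np.array_equal(B, np.array([[0, 0, 1, 0],
                                   [0, 0, 0, 1],
                                   [1, 0, 0, 0],
                                   [0, 1, 0, 0]]))

# sigma(B) = Omega_2 ∪ Omega_2 = {1, 1, -1, -1}: the eigenvalues
# {1, i, -1, -i} of the 4-cycle, squared.
ev = np.linalg.eigvals(B)
assert np.allclose(np.sort(ev.real), [-1, -1, 1, 1])
assert np.allclose(ev.imag, 0)

# B is reducible: it permutes within {0, 2} and within {1, 3}.
assert B[np.ix_([0, 2], [1, 3])].sum() == 0

# Yet B has the irreducible nonnegative square root A, even though
# gcd(h, p) = gcd(2, 2) = 2.
```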
72
3.4.1 Preliminary Results
Definition 3.4.1. A matrix $A \in M_n(\mathbb{C})$ is said to be completely reducible if there exists a permutation matrix P such that
$$P^TAP = \bigoplus_{i=1}^{k} A_{ii} = \operatorname{diag}(A_{11}, \dots, A_{kk}), \tag{3.4.2}$$
where $k \geq 2$ and $A_{11}, \dots, A_{kk}$ are square irreducible matrices.
Remark 3.4.2. Following the definition, if A is completely reducible, without loss of generality, we may assume A is in the form of the matrix on the right-hand side of (3.4.2). Furthermore, note that $A \overset{v}{\geq} 0$ (i.e., A is eventually nonnegative) if and only if $A_{ii} \overset{v}{\geq} 0$ for all $i = 1, \dots, k$.
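The remark can be illustrated numerically. In the sketch below (our own example; the eventually positive block $A_{11}$ is an arbitrary choice with one negative entry and a dominant Perron root), powers of a block diagonal matrix are the block diagonals of the powers, so eventual nonnegativity is decided blockwise.

```python
import numpy as np
from scipy.linalg import block_diag

# A11 is eventually positive: it has a negative entry, but its Perron
# root (about 2.4) dominates and both Perron vectors are positive, so
# every sufficiently high power is positive.
A11 = np.array([[2.0, 1.0],
                [1.0, -0.1]])
A22 = np.array([[0.0, 1.0],
                [1.0, 0.0]])   # nonnegative, hence trivially eventually nonnegative

A = block_diag(A11, A22)

# Powers of a block diagonal matrix are block diagonal powers, so A is
# eventually nonnegative exactly when each diagonal block is.
for k in range(2, 9):
    Ak = np.linalg.matrix_power(A, k)
    assert np.allclose(Ak, block_diag(np.linalg.matrix_power(A11, k),
                                      np.linalg.matrix_power(A22, k)))
    assert np.all(np.linalg.matrix_power(A11, k) > 0)
```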
The following is a corollary to Theorem 3.3.24 and Corollary 3.3.23.

Corollary 3.4.3. Let $A \in M_n(\mathbb{R})$ and suppose that $A \overset{v}{\geq} 0$ is nonsingular and completely reducible. Let $h_i$ denote the cyclicity of $A_{ii}$ for $i = 1, \dots, k$. If each $A_{ii}$ possesses a real root, then A possesses an eventually nonnegative root if and only if $\gcd(h_i, p) = 1$ for all i.
Bibliography
[1] A. Berman and R. J. Plemmons. Nonnegative matrices in the mathematical sciences,
volume 9 of Classics in Applied Mathematics. Society for Industrial and Applied Math-
ematics (SIAM), Philadelphia, PA, 1994. Revised reprint of the 1979 original.
[2] R. A. Brualdi and H. J. Ryser. Combinatorial matrix theory, volume 39 of Encyclopedia
of Mathematics and its Applications. Cambridge University Press, Cambridge, 1991.
[3] S. Carnochan Naqvi and J. J. McDonald. The combinatorial structure of eventually
nonnegative matrices. Electron. J. Linear Algebra, 9:255–269 (electronic), 2002.
[4] S. Carnochan Naqvi and J. J. McDonald. Eventually nonnegative matrices are similar
to seminonnegative matrices. Linear Algebra Appl., 381:245–258, 2004.
[5] S. Friedland. On an inverse problem for nonnegative and eventually nonnegative ma-
trices. Israel J. Math., 29(1):43–60, 1978.
[6] D. Handelman. Positive matrices and dimension groups affiliated to C∗-algebras and
topological Markov chains. J. Operator Theory, 6(1):55–74, 1981.
[7] G. H. Hardy and E. M. Wright. An introduction to the theory of numbers. Oxford
University Press, Oxford, sixth edition, 2008. Revised by D. R. Heath-Brown and J. H.
Silverman, With a foreword by Andrew Wiles.
[8] N. J. Higham. Functions of matrices. Society for Industrial and Applied Mathematics
(SIAM), Philadelphia, PA, 2008. Theory and computation.
[9] N. J. Higham and L. Lin. On pth roots of stochastic matrices. Linear Algebra Appl.,
435(3):448–463, 2011.
[10] L. Hogben. Eventually cyclic matrices and a test for strong eventual nonnegativity.
Electron. J. Linear Algebra, 19:129–140, 2009.
[11] R. A. Horn and C. R. Johnson. Matrix analysis. Cambridge University Press, Cam-
bridge, 1990. Corrected reprint of the 1985 original.
[12] R. A. Horn and C. R. Johnson. Topics in matrix analysis. Cambridge University Press,
Cambridge, 1994. Corrected reprint of the 1991 original.
[13] J. Hunter. Number theory. Oliver & Boyd, Edinburgh, 1964.
[14] C. R. Johnson and P. Tarazaga. On matrices with Perron-Frobenius properties and
some negative entries. Positivity, 8(4):327–338, 2004.
[15] P. Lancaster and M. Tismenetsky. The theory of matrices. Computer Science and
Applied Mathematics. Academic Press Inc., Orlando, FL, second edition, 1985.
[16] E. Landau. Elementary number theory. Chelsea Publishing Co., New York, N.Y., 1958.
Translated by J. E. Goodman.
[17] W. J. LeVeque. Fundamentals of number theory. Dover Publications Inc., Mineola, NY,
1996. Reprint of the 1977 original.
[18] C. T. Long. Elementary introduction to number theory. Prentice Hall Inc., Englewood
Cliffs, NJ, third edition, 1987.
[19] J. J. McDonald and D. M. Morris. Level characteristics corresponding to peripheral
eigenvalues of a nonnegative matrix. Linear Algebra Appl., 429(7):1719–1729, 2008.
[20] D. Noutsos. On Perron-Frobenius property of matrices having some negative entries.
Linear Algebra Appl., 412(2-3):132–153, 2006.
[21] D. Noutsos and M. J. Tsatsomeros. On the numerical characterization of the reachability
cone for an essentially nonnegative matrix. Linear Algebra Appl., 430(4):1350–1363,
2009.
[22] P. J. Psarrakos. On the mth roots of a complex matrix. Electron. J. Linear Algebra,
9:32–41 (electronic), 2002.
[23] M. I. Smith. A Schur algorithm for computing matrix pth roots. SIAM J. Matrix Anal.
Appl., 24(4):971–989 (electronic), 2003.
[24] B. G. Zaslavsky and J. J. McDonald. A characterization of Jordan canonical forms
which are similar to eventually nonnegative matrices with the properties of nonnegative
matrices. Linear Algebra Appl., 372:253–285, 2003.
[25] B. G. Zaslavsky and B.-S. Tam. On the Jordan form of an irreducible matrix with
eventually non-negative powers. Linear Algebra Appl., 302/303:303–330, 1999. Special
issue dedicated to Hans Schneider (Madison, WI, 1998).