Source: math.wsu.edu/faculty/tsat/files/pietro_thesis.pdf
MATRIX ROOTS OF NONNEGATIVE AND EVENTUALLY NONNEGATIVE
MATRICES
By
PIETRO PAPARELLA
A dissertation submitted in partial fulfillment of the requirements for the degree of
DOCTOR OF PHILOSOPHY
WASHINGTON STATE UNIVERSITY
Department of Mathematics
JULY 2013
© Copyright by PIETRO PAPARELLA, 2013
All Rights Reserved
To the Faculty of Washington State University:
The members of the Committee appointed to examine the disser-
tation of PIETRO PAPARELLA find it satisfactory and recommend that
it be accepted.
Michael J. Tsatsomeros, Ph.D., Chair
Judith J. McDonald, Ph.D., Chair
Bala Krishnamoorthy, Ph.D.
MATRIX ROOTS OF NONNEGATIVE AND EVENTUALLY NONNEGATIVE
MATRICES
Abstract
by Pietro Paparella, Ph.D.
Washington State University
July 2013
Co-Chairs: Michael J. Tsatsomeros and Judith J. McDonald
Eventually nonnegative matrices are real matrices whose powers become and remain
nonnegative. As such, eventually nonnegative matrices are, a fortiori, matrix roots of non-
negative matrices, which motivates us to study the matrix roots of nonnegative matrices.
Using classical matrix function theory and Perron-Frobenius theory, we characterize, clas-
sify, and describe in terms of the complex and real Jordan canonical form the pth-roots of
nonnegative and eventually nonnegative matrices.
Acknowledgements
This dissertation would not have been possible without the unwavering encouragement, pa-
tience, and love of my wife Allison L. Paparella. I wish to thank my children Quintin L.
Paparella and Alessia R. Paparella for providing me with inspiration to help me finish this
endeavor. I also would like to acknowledge my mother Maria Aquino for always supporting
me and my siblings and for her tremendous efforts and sacrifice in raising three children on
her own.
I owe a deep debt of gratitude to my advisors Michael J. Tsatsomeros and Judith J.
McDonald. Their vision, guidance, wisdom, and expertise were paramount in the completion
of this project. I also wish to extend my gratitude to Bala Krishnamoorthy, who went
above and beyond the duties and responsibilities of a committee member: thank you for
responding to my pesky emails, the sage advice, and the encouraging words. I would also
like to thank David S. Watkins for serving on my committee and K. A. “Ari” Ariyawansa
for generously providing me a research assistantship from 2009–2011, so that I was able to
return to Washington State to complete my Ph.D. in mathematics.
I am also grateful for wonderful friends and family for their encouraging words and
support: Travis W. Arnold; Aaron L. Carr-Callen; Kirk J. Griffith; Jasmin L. and Jeffrey
A. Harshman; Jameus C. Hutchens; Francesco Paparella; Stefano Paparella; Christine S.
and V. Joshua Phillips; Lena Y. and V. Jefferson Phillips; Daniela Paparella and Joseph W.
Thornton; and Frank and Chiara D. Trejos.
Finally, I would like to acknowledge several graduate students from my time at Wash-
ington State and Louisiana State for the wonderful memories, collaboration, and continued
inspiration: Baha’ M. Alzalg, Jeremy J. Becnel, Timothy D. Breitzman, Mihaly Kovacs,
Suat Namli, and Rokas Varaneckas.
To my wife Allison, my children Quintin and Alessia, and my mother Maria.
“If I have seen further it is by standing on ye sholders of Giants” – Sir Isaac Newton.
Contents
Abstract
Acknowledgements
Dedication
Inspiration

1 Introduction
  1.1 Background and Motivation
  1.2 Notation and Definitions
  1.3 Preliminaries
    1.3.1 Combinatorial Structure of a Matrix
    1.3.2 Upper-triangular Toeplitz Matrices

2 Theoretical Background
  2.1 Perron-Frobenius Theory of Nonnegative Matrices
  2.2 Theory of Functions of Matrices

3 Matrix Roots of Nonnegative Matrices
  3.1 2×2 Matrices
    3.1.1 1 > λ ≥ 0
    3.1.2 1 > 0 > λ
    3.1.3 λ = 1
  3.2 Primitive Matrices
  3.3 Imprimitive Irreducible Matrices
    3.3.1 Complete Residue Systems
    3.3.2 Jordan Chains of h-cyclic Matrices
    3.3.3 Main Results
  3.4 Reducible Matrices
    3.4.1 Preliminary Results

Bibliography
Chapter 1
Introduction
1.1 Background and Motivation
A real matrix A is said to be eventually nonnegative (respectively, positive) if there exists a
nonnegative integer p such that Ak is entrywise nonnegative (respectively, positive) for all
k ≥ p. If p is the smallest such integer, then p is called the power index of A and is denoted
by p(A).
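To make the power index concrete, here is a small numerical sketch we add (not part of the thesis; the function name `power_index` is ours) that searches a finite horizon of powers with NumPy. The example matrix below turns out to have power index 4: its first and third powers have a negative entry, while the fourth and all higher powers are entrywise positive.

```python
import numpy as np

def power_index(A, max_power=20, tol=0.0):
    """Smallest p with A^k entrywise nonnegative for every k = p, ..., max_power
    (a finite-horizon numerical sketch of the power index, not a proof)."""
    A = np.asarray(A, dtype=float)
    Ak = np.eye(A.shape[0])
    p = None
    for k in range(1, max_power + 1):
        Ak = Ak @ A
        if (Ak >= -tol).all():
            if p is None:
                p = k
        else:
            p = None  # a later power has a negative entry; restart the search
    return p

A = [[2., 1.], [2., -1.]]   # eventually positive: A^4 and beyond are positive
print(power_index(A))       # 4
```

Because the search horizon is finite, a returned value is only evidence, not a certificate, of eventual nonnegativity.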
Eventual nonnegativity was introduced and studied in [5] and was the central subject of
study in various publications (see, e.g., [3, 4, 10, 14, 19, 20, 21, 24, 25]). It is well-known
that the notions of eventual positivity and nonnegativity are associated with properties of
the eigenspace corresponding to the spectral radius.
A real matrix A has the Perron-Frobenius property if its spectral radius is a positive eigen-
value corresponding to an entrywise nonnegative eigenvector. The strong Perron-Frobenius
property further requires that the spectral radius is simple; it dominates in modulus every
other eigenvalue of A; and it has an entrywise positive eigenvector.
Several challenges regarding the theory and applications of eventually nonnegative matrices remain unresolved. For example, eventual positivity of A is equivalent to A and A^T having the strong Perron-Frobenius property; however, the Perron-Frobenius property for A and A^T is a necessary but not sufficient condition for eventual nonnegativity of A.
The motivation of this thesis is the following observation: an eventually nonnegative (respectively, positive) matrix with power index p = p(A) is, a fortiori, a matrix pth-root of the nonnegative matrix A^p (i.e., it is a solution to the matrix equation X^p − A^p = 0). As a consequence, in order to gain insight into eventual nonnegativity, it is natural to examine the roots of matrices that possess the (strong) Perron-Frobenius property.
1.2 Notation and Definitions
The following notational conventions are adopted throughout the thesis; when convenient,
additional notation and definitions are introduced and adopted throughout.
(1) Denote by C the set of complex numbers, R the set of real numbers, Z the set of integers, and N the set of natural numbers. Denote by i the imaginary unit, i.e., i := √−1.

(2) When convenient, an indexed set of the form x_i, x_{i+1}, ..., x_{i+j} is abbreviated to $\{x_k\}_{k=i}^{i+j}$.
(3) Denote by Mn(C) (Mn(R)) the algebra of complex (respectively, real) n× n matrices.
(4) Given A ∈ M_n(C), A = [a_{ij}] = [a_{ij}]_{i,j=1}^n denotes the entries of A, where a_{ij} ∈ C is the (i, j)-entry of A. When convenient, the (i, j)-entry of A may also be denoted by [A]_{ij}.
(5) Denote by C^n (R^n) the n-dimensional complex (respectively, real) vector space. Given x ∈ C^n (x ∈ R^n), [x]_i denotes the ith component of x.
(6) Denote by In the n× n identity matrix. Whenever the context is clear, the subscript
may be suppressed.
(7) For λ ∈ C, J_n(λ) ∈ M_n(C) denotes the n×n Jordan block, i.e.,

$$[J_n(\lambda)]_{ij} := \begin{cases} \lambda, & i = j; \\ 1, & j = i + 1; \\ 0, & \text{otherwise.} \end{cases}$$
(8) Given A ∈ M_n(C):

(i) the spectrum of A, denoted by σ(A), is the multiset

$$\sigma(A) := \{ \lambda \in \mathbb{C} : \det(\lambda I - A) = 0 \},$$

i.e., σ(A) contains the eigenvalues of A;

(ii) the spectral radius of A, denoted by ρ = ρ(A), is the quantity

$$\rho(A) := \max_{\lambda_i \in \sigma(A)} |\lambda_i|;$$

and

(iii) the peripheral spectrum of A, denoted by π(A), is the (multi-)set given by

$$\pi(A) := \{ \lambda \in \sigma(A) : |\lambda| = \rho \}.$$
(9) The direct sum of the matrices A_1, A_2, ..., A_k, where A_i ∈ M_{n_i}(C), denoted by A_1 ⊕ A_2 ⊕ ··· ⊕ A_k, $\bigoplus_{i=1}^{k} A_i$, or diag(A_1, A_2, ..., A_k), is the matrix

$$\begin{bmatrix} A_1 & & & \\ & A_2 & & \\ & & \ddots & \\ & & & A_k \end{bmatrix} \in M_n(\mathbb{C}),$$

where $n = \sum_{i=1}^{k} n_i$.
(10) For A ∈ M_n(C), let $J = Z^{-1} A Z = \bigoplus_{i=1}^{t} J_{n_i}(\lambda_i) = \bigoplus_{i=1}^{t} J_{n_i}$, where $\sum n_i = n$, denote a Jordan canonical form of A.

(11) The order of the largest Jordan block of A corresponding to the eigenvalue λ_i is called the index of λ_i and is denoted by m = m_i = index_{λ_i} A.
(12) For c = (c_1, c_2, ..., c_n), where c_i ∈ C for all i, the circulant matrix (or simply circulant) of c, denoted by circ(c), is the n×n matrix given by

$$\mathrm{circ}(c) := \begin{bmatrix} c_1 & c_2 & \cdots & c_n \\ c_n & c_1 & \cdots & c_{n-1} \\ \vdots & \vdots & \ddots & \vdots \\ c_2 & c_3 & \cdots & c_1 \end{bmatrix} \in M_n(\mathbb{C}).$$
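The definition above can be sketched in a few lines of NumPy (our illustration, not the thesis's; the name `circ` mirrors the notation): row i is the first row cyclically shifted i places to the right.

```python
import numpy as np

def circ(c):
    """circ(c): the circulant whose first row is c = (c_1, ..., c_n);
    row i is c cyclically shifted i places to the right."""
    c = np.asarray(c)
    return np.array([np.roll(c, i) for i in range(len(c))])

print(circ([1, 2, 3]))
# [[1 2 3]
#  [3 1 2]
#  [2 3 1]]
```

Note that `scipy.linalg.circulant` follows the first-column convention, which is the transpose of the first-row convention used here.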
(13) For A, B ∈ M_n(C), the Hadamard product of A and B, denoted by A ∘ B, is the n×n matrix whose (i, j)-entry is given by [A ∘ B]_{ij} := a_{ij} b_{ij}.
(14) For A ∈ M_n(C) and X ∈ M_n(C), X is said to be a matrix pth-root, pth-root, matrix root, or simply root of A if X satisfies the matrix equation X^p − A = 0.

(15) For k ∈ N, let R(k) denote the canonical complete residue system (mod k), i.e., R(k) = {0, 1, ..., k − 1}.
1.3 Preliminaries
Because the combinatorial structure of a matrix (i.e., the locations of its zero and nonzero entries)
will be central to the analysis of the matrix roots of nonnegative, irreducible, imprimitive
matrices, we devote Subsection 1.3.1 to notation, concepts, and background concerning the
combinatorial structure of a matrix.
In addition, in Subsection 1.3.2, we establish ancillary results pertaining to upper-triangular
Toeplitz matrices.
1.3.1 Combinatorial Structure of a Matrix
For notation and definitions concerning the combinatorial structure of a matrix, i.e., the location of the zero and nonzero entries of a matrix, we follow [2] and [10]. For further results
concerning combinatorial matrix theory, see [2] and references therein.
A directed graph (or simply digraph) Γ = (V,E) consists of a finite, nonempty set V of
vertices, together with a set E ⊆ V × V of arcs. Digraphs may contain both arcs (u, v) and
(v, u) as well as loops, i.e., arcs of the form (v, v). In this thesis, we do not consider directed
general graphs (or simply general digraphs), i.e., digraphs that contain multiple copies of the
same arc.
For A ∈ M_n(C), the directed graph (or simply digraph) of A, denoted by Γ = Γ(A), has vertex set V = {1, ..., n} and arc set E = {(i, j) ∈ V × V : a_{ij} ≠ 0}. If R, C ⊆ {1, ..., n}, then A[R|C] denotes the submatrix of A whose rows and columns are indexed by R and C, respectively. If R = C, then A[R|R] is abbreviated as A[R]. For a digraph Γ = (V, E) and W ⊆ V, the induced subdigraph Γ[W] is the digraph with vertex set W and arc set {(u, v) ∈ E : u, v ∈ W}. For a square matrix A, Γ(A[W]) is identified with Γ(A)[W] by a slight abuse of notation.
A digraph Γ is strongly connected if for any two distinct vertices u and v of Γ, there
is a walk in Γ from u to v (as a consequence, there must also be a walk from v to u;
thus, both conditions could be employed as the definition). Following [2], we consider every
vertex of V as strongly connected to itself. For a strongly connected digraph Γ, the index
of imprimitivity is the greatest common divisor of the lengths of the closed walks in Γ.
A strong digraph is primitive if its index of imprimitivity is one, otherwise it is imprimitive.
The strong components of Γ are the maximal strongly connected subdigraphs of Γ.
For n ≥ 2, a matrix A ∈ M_n(C) is reducible if there exists a permutation matrix P such that

$$P^T A P = \begin{bmatrix} A_{11} & A_{12} \\ 0 & A_{22} \end{bmatrix},$$

where A_{11} and A_{22} are nonempty square matrices and 0 is a (possibly rectangular) block consisting entirely of zero entries. If A is not reducible, then A is called irreducible. The connection between reducibility and the digraph of A is as follows: A is irreducible if and only if Γ(A) is strongly connected¹ (see, e.g., [2, Theorem 3.2.1] or [11, Theorem 6.2.24]). If A ∈ M_n(C) is reducible, then there exists a permutation matrix P such that

$$P^T A P = \begin{bmatrix} A_{11} & A_{12} & \cdots & A_{1k} \\ & A_{22} & \cdots & A_{2k} \\ & & \ddots & \vdots \\ & & & A_{kk} \end{bmatrix}, \quad (1.3.1)$$

where each A_{ii} is irreducible (see, e.g., [2, Theorem 3.2.4] or [11, Exercise 8, §8.3]). The matrix on the right-hand side of (1.3.1) is called the Frobenius (or irreducible) normal form of A.
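As a numerical aside (our sketch, not the thesis's; the function name is hypothetical), irreducibility can be tested via the strong connectivity of Γ(A); a standard equivalent criterion is that (I + |A|)^(n−1) has no zero entry.

```python
import numpy as np

def is_irreducible(A):
    """A square A is irreducible iff Gamma(A) is strongly connected; a
    standard equivalent test is that (I + |A|)^(n-1) is entrywise positive."""
    A = np.abs(np.asarray(A, dtype=float))
    n = A.shape[0]
    M = np.linalg.matrix_power(np.eye(n) + A, n - 1)
    return bool((M > 0).all())

print(is_irreducible([[0., 1.], [1., 0.]]),   # a 2-cycle: strongly connected
      is_irreducible([[1., 1.], [0., 1.]]))   # block upper triangular: reducible
# → True False
```

Note that for n = 1 the test returns True, consistent with the convention (footnote above) that every vertex is strongly connected to itself.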
For h ≥ 2, a digraph Γ = (V,E) is cyclically h-partite if there exists an ordered partition
Π = (π1, . . . , πh) of V into h nonempty subsets such that for each arc (i, j) ∈ E, there
¹Following [2], vertices are strongly connected to themselves, so we take this result to hold for all n ∈ N and not just n ≥ 2.
exists ℓ ∈ {1, ..., h} with i ∈ π_ℓ and j ∈ π_{ℓ+1} (where, for convenience, we take π_{h+1} := π_1).
For h ≥ 2, a strong digraph Γ is cyclically h-partite if and only if h divides the index of
imprimitivity (see, e.g., [2, p. 70]). For h ≥ 2, a matrix A ∈Mn(C) is called h-cyclic if Γ (A)
is cyclically h-partite. If Γ (A) is cyclically h-partite with ordered partition Π, then A is said
to be h-cyclic with partition Π or that Π describes the h-cyclic structure of A. The ordered
partition Π = (π_1, ..., π_h) is consecutive if π_1 = {1, ..., i_1}, π_2 = {i_1 + 1, ..., i_2}, ..., π_h = {i_{h−1} + 1, ..., n}. If A is h-cyclic with consecutive ordered partition Π, then A has the block
form

$$A = \begin{bmatrix} 0 & A_{12} & 0 & \cdots & 0 \\ 0 & 0 & A_{23} & \ddots & \vdots \\ \vdots & \vdots & \ddots & \ddots & 0 \\ 0 & 0 & \cdots & 0 & A_{(h-1)h} \\ A_{h1} & 0 & \cdots & 0 & 0 \end{bmatrix}, \quad (1.3.2)$$
where Ai,i+1 = A[πi|πi+1] ([2, p. 71]). For any h-cyclic matrix A, there exists a permutation
matrix P such that PAP T is h-cyclic with consecutive ordered partition. The cyclic index
of A is the largest h for which A is h-cyclic.
An irreducible nonnegative matrix A is primitive if Γ (A) is primitive, and the index of
imprimitivity of A is the index of imprimitivity of Γ (A). If A is irreducible and imprimitive
with index of imprimitivity h ≥ 2, then h is the cyclic index of A, Γ (A) is cyclically h-partite
with ordered partition Π = (π1, . . . , πh), and the sets πi are uniquely determined (up to cyclic
permutation of the π_i) (see, for example, [2, p. 70]). Furthermore, Γ(A^h) is the disjoint union of h primitive digraphs on the sets of vertices π_i, i = 1, ..., h (see, e.g., [2, §3.4]).
Following [10], for an ordered partition Π = (π_1, ..., π_h) of {1, ..., n} into h nonempty subsets, the cyclic characteristic matrix, denoted by χ_Π, is the n×n matrix whose (i, j)-entry is 1 if there exists ℓ ∈ {1, ..., h} such that i ∈ π_ℓ and j ∈ π_{ℓ+1}, and 0 otherwise. For an ordered partition Π = (π_1, ..., π_h) of {1, ..., n} into h nonempty subsets, note that

(1) χ_Π is h-cyclic and Γ(χ_Π) contains every arc (i, j) with i ∈ π_ℓ and j ∈ π_{ℓ+1}; and

(2) A ∈ M_n(C) is h-cyclic with ordered partition Π if and only if Γ(A) ⊆ Γ(χ_Π).
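Observation (2) above gives a direct computational test, which we sketch below (our code, not the thesis's; 0-based indices and the names `chi` and `is_h_cyclic` are ours): build χ_Π and check that every nonzero of A sits on a 1 of χ_Π.

```python
import numpy as np

def chi(partition, n):
    """Cyclic characteristic matrix chi_Pi of an ordered partition
    Pi = (pi_1, ..., pi_h) of {0, ..., n-1} (0-based indices here)."""
    h = len(partition)
    X = np.zeros((n, n), dtype=int)
    for ell in range(h):
        for i in partition[ell]:
            for j in partition[(ell + 1) % h]:   # pi_{h+1} := pi_1
                X[i, j] = 1
    return X

def is_h_cyclic(A, partition):
    """A is h-cyclic with partition Pi iff Gamma(A) is a subdigraph of
    Gamma(chi_Pi), i.e., every nonzero of A sits where chi_Pi has a 1."""
    A = np.asarray(A)
    X = chi(partition, A.shape[0])
    return bool(((A != 0) <= (X == 1)).all())

A = [[0, 0, 7], [0, 0, 2], [3, 5, 0]]        # 2-cyclic with Pi = ({1,2},{3})
print(is_h_cyclic(A, [[0, 1], [2]]))         # True
```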
1.3.2 Upper-triangular Toeplitz Matrices
In this subsection we establish the following lemmas concerning upper-triangular Toeplitz
matrices.
Lemma 1.3.1. For n ≥ 2, let λ, c, y_1, ..., y_{n−1} ∈ C with y_{n−1} ≠ 0, and let z denote the zero vector of appropriate dimension. If

$$y := (y_1, y_2, \dots, y_{n-1})^T \in \mathbb{C}^{n-1}, \qquad \tilde{y} := (c, y_1, \dots, y_{n-2})^T \in \mathbb{C}^{n-1},$$

$$J := \begin{bmatrix} J_{n-1}(\lambda) & y \\ z^T & \lambda \end{bmatrix} \in M_n(\mathbb{C}), \qquad \text{and} \qquad Q := \begin{bmatrix} I_{n-1} & \tilde{y} \\ z^T & y_{n-1} \end{bmatrix} \in M_n(\mathbb{C}),$$

then $QJQ^{-1} = J_n(\lambda)$.

Proof. Note that

$$Q^{-1} = \begin{bmatrix} I_{n-1} & -\tilde{y}/y_{n-1} \\ z^T & 1/y_{n-1} \end{bmatrix},$$

hence

$$QJQ^{-1} = \begin{bmatrix} I_{n-1} & \tilde{y} \\ z^T & y_{n-1} \end{bmatrix} \begin{bmatrix} J_{n-1}(\lambda) & y \\ z^T & \lambda \end{bmatrix} \begin{bmatrix} I_{n-1} & -\tilde{y}/y_{n-1} \\ z^T & 1/y_{n-1} \end{bmatrix} = \begin{bmatrix} J_{n-1}(\lambda) & y + \lambda\tilde{y} \\ z^T & \lambda y_{n-1} \end{bmatrix} \begin{bmatrix} I_{n-1} & -\tilde{y}/y_{n-1} \\ z^T & 1/y_{n-1} \end{bmatrix} = \begin{bmatrix} J_{n-1}(\lambda) & \frac{1}{y_{n-1}}\left(y + \lambda\tilde{y} - J_{n-1}(\lambda)\tilde{y}\right) \\ z^T & \lambda \end{bmatrix}.$$

If $N_n = [n_{ij}] \in M_n(\mathbb{C})$ is the nilpotent matrix defined by

$$n_{ij} = \begin{cases} 1, & j = i + 1 \\ 0, & \text{otherwise,} \end{cases}$$

then it is well known that $J_n(\lambda) = \lambda I_n + N_n$. Furthermore, $N_{n-1}\tilde{y} = (y_1, y_2, \dots, y_{n-2}, 0)^T$ and

$$\frac{y + \lambda\tilde{y} - J_{n-1}(\lambda)\tilde{y}}{y_{n-1}} = \frac{y + \lambda\tilde{y} - (\lambda I_{n-1} + N_{n-1})\tilde{y}}{y_{n-1}} = \frac{y - N_{n-1}\tilde{y}}{y_{n-1}} = \frac{(y_1, \dots, y_{n-2}, y_{n-1})^T - (y_1, \dots, y_{n-2}, 0)^T}{y_{n-1}} = (0, 0, \dots, 0, 1)^T,$$

which completes the proof.
Lemma 1.3.2. For n ≥ 2, consider the upper-triangular Toeplitz matrix

$$F := \begin{bmatrix} x_1 & x_2 & \cdots & x_n \\ & x_1 & \ddots & \vdots \\ & & \ddots & x_2 \\ & & & x_1 \end{bmatrix} \in M_n(\mathbb{C}),$$

where x_1, ..., x_n ∈ C and x_i ≠ 0 for all i = 2, ..., n. Then there exists an invertible matrix Q such that $QFQ^{-1} = J_n(x_1)$.

Proof. Proceed by induction on n. For the base case n = 2, note that

$$\begin{bmatrix} 1 & 0 \\ 0 & x_2 \end{bmatrix} \begin{bmatrix} x_1 & x_2 \\ 0 & x_1 \end{bmatrix} \begin{bmatrix} 1 & 0 \\ 0 & 1/x_2 \end{bmatrix} = \begin{bmatrix} x_1 & x_2 \\ 0 & x_1 x_2 \end{bmatrix} \begin{bmatrix} 1 & 0 \\ 0 & 1/x_2 \end{bmatrix} = \begin{bmatrix} x_1 & 1 \\ 0 & x_1 \end{bmatrix},$$

so that the base case holds.

Assume the assertion holds for matrices of order n − 1. Let $\tilde{F}$ be the submatrix of F obtained by deleting the nth row and column of F, i.e.,

$$\tilde{F} = \begin{bmatrix} x_1 & \cdots & x_{n-1} \\ & \ddots & \vdots \\ & & x_1 \end{bmatrix} \in M_{n-1}(\mathbb{C}),$$

and let $x = (x_n, x_{n-1}, \dots, x_2)^T \in \mathbb{C}^{n-1}$. Then

$$F = \begin{bmatrix} \tilde{F} & x \\ z^T & x_1 \end{bmatrix}$$

and, by the inductive hypothesis, there exists an invertible matrix $\tilde{Q}$ such that $\tilde{Q}\tilde{F}\tilde{Q}^{-1} = J_{n-1}(x_1)$. The matrix

$$\hat{Q} := \begin{bmatrix} \tilde{Q} & z \\ z^T & 1 \end{bmatrix}$$

is invertible with $\hat{Q}^{-1} = \begin{bmatrix} \tilde{Q}^{-1} & z \\ z^T & 1 \end{bmatrix}$, and

$$\hat{Q} F \hat{Q}^{-1} = \begin{bmatrix} \tilde{Q} & z \\ z^T & 1 \end{bmatrix} \begin{bmatrix} \tilde{F} & x \\ z^T & x_1 \end{bmatrix} \begin{bmatrix} \tilde{Q}^{-1} & z \\ z^T & 1 \end{bmatrix} = \begin{bmatrix} \tilde{Q}\tilde{F} & \tilde{Q}x \\ z^T & x_1 \end{bmatrix} \begin{bmatrix} \tilde{Q}^{-1} & z \\ z^T & 1 \end{bmatrix} = \begin{bmatrix} \tilde{Q}\tilde{F}\tilde{Q}^{-1} & \tilde{Q}x \\ z^T & x_1 \end{bmatrix} = \begin{bmatrix} J_{n-1}(x_1) & \tilde{Q}x \\ z^T & x_1 \end{bmatrix}. \quad (1.3.3)$$

If $y := \tilde{Q}x \in \mathbb{C}^{n-1}$ and $\tilde{y} := (0, y_1, \dots, y_{n-2})^T \in \mathbb{C}^{n-1}$, then, following Lemma 1.3.1, a similarity transform by the matrix

$$\bar{Q} := \begin{bmatrix} I_{n-1} & \tilde{y} \\ z^T & y_{n-1} \end{bmatrix}$$

brings the matrix in (1.3.3) into the desired form. Thus, the desired similarity transformation is given by $Q = \bar{Q}\hat{Q}$.
In Table 1, we present the transformation Q and its inverse for several dimensions.
Table 1: The transformation matrix Q and its inverse Q^{-1}.

n = 2:
$$F = \begin{bmatrix} x_1 & x_2 \\ 0 & x_1 \end{bmatrix}, \quad Q = \begin{bmatrix} 1 & 0 \\ 0 & x_2 \end{bmatrix}, \quad Q^{-1} = \begin{bmatrix} 1 & 0 \\ 0 & 1/x_2 \end{bmatrix}.$$

n = 3:
$$F = \begin{bmatrix} x_1 & x_2 & x_3 \\ 0 & x_1 & x_2 \\ 0 & 0 & x_1 \end{bmatrix}, \quad Q = \begin{bmatrix} 1 & 0 & 0 \\ 0 & x_2 & x_3 \\ 0 & 0 & x_2^2 \end{bmatrix}, \quad Q^{-1} = \begin{bmatrix} 1 & 0 & 0 \\ 0 & 1/x_2 & -x_3/x_2^3 \\ 0 & 0 & 1/x_2^2 \end{bmatrix}.$$

n = 4:
$$F = \begin{bmatrix} x_1 & x_2 & x_3 & x_4 \\ 0 & x_1 & x_2 & x_3 \\ 0 & 0 & x_1 & x_2 \\ 0 & 0 & 0 & x_1 \end{bmatrix}, \quad Q = \begin{bmatrix} 1 & 0 & 0 & 0 \\ 0 & x_2 & x_3 & x_4 \\ 0 & 0 & x_2^2 & 2x_2 x_3 \\ 0 & 0 & 0 & x_2^3 \end{bmatrix}, \quad Q^{-1} = \begin{bmatrix} 1 & 0 & 0 & 0 \\ 0 & 1/x_2 & -x_3/x_2^3 & (2x_3^2 - x_2 x_4)/x_2^5 \\ 0 & 0 & 1/x_2^2 & -2x_3/x_2^4 \\ 0 & 0 & 0 & 1/x_2^3 \end{bmatrix}.$$
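As a quick numerical sanity check of Lemma 1.3.2 (our sketch, not part of the thesis), the n = 3 transformation from Table 1 does bring F to Jordan form; the values x_1 = 2, x_2 = 3, x_3 = 5 are arbitrary nonzero choices.

```python
import numpy as np

# n = 3 instance of Lemma 1.3.2 with the Table 1 transformation
# (x1, x2, x3 chosen arbitrarily with x2 != 0).
x1, x2, x3 = 2.0, 3.0, 5.0
F = np.array([[x1, x2, x3],
              [0., x1, x2],
              [0., 0., x1]])
Q = np.array([[1., 0., 0.],
              [0., x2, x3],
              [0., 0., x2 ** 2]])
J = Q @ F @ np.linalg.inv(Q)
print(np.allclose(J, [[x1, 1., 0.], [0., x1, 1.], [0., 0., x1]]))  # True
```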
Chapter 2
Theoretical Background
2.1 Perron-Frobenius Theory of Nonnegative Matrices
We review some well-known results, concepts, and definitions from the theory of nonnegative
matrices. For further results, see [11, Chapter 8] or [1] and references therein.
Recall that a matrix A = [aij] ∈Mn(R) is said to be (entrywise) nonnegative (respectively,
positive), denoted A ≥ 0 (respectively, A > 0), if aij ≥ 0 (respectively, aij > 0) for all
1 ≤ i, j ≤ n. Recall that a matrix A ∈ Mn(R) is eventually positive (nonnegative) if there
exists a nonnegative integer p such that Ak is entrywise positive (nonnegative) for all k ≥ p.
If p is the smallest such integer, then p is called the power index of A and is denoted by
p(A).
We recall the Perron-Frobenius theorem for positive matrices (see [11, Theorem 8.2.11]).
Theorem 2.1.1. If A ∈Mn(R) is positive, then
(a) ρ > 0;
(b) ρ ∈ σ (A);
(c) there exists a positive vector x such that Ax = ρx;
(d) ρ is an algebraically (and hence geometrically) simple eigenvalue of A; and
(e) |λ| < ρ for every λ ∈ σ (A) such that λ 6= ρ.
There are nonnegative matrices with zero entries that satisfy the conclusions of Theorem 2.1.1. Recall that
a nonnegative matrix A ∈ Mn(R) is said to be primitive if it is irreducible and has only
one eigenvalue of maximum modulus. The conclusions to Theorem 2.1.1 apply to primitive
matrices (see [11, Theorem 8.5.1]), and the following theorem is a useful characterization of
primitivity (see [11, Theorem 8.5.2]).
Theorem 2.1.2. If A ∈ Mn(R) is nonnegative, then A is primitive if and only if Ak > 0
for some k ≥ 1.
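Theorem 2.1.2 can be turned into a finite test (our sketch, not the thesis's): by Wielandt's classical bound, a primitive n×n matrix already satisfies A^k > 0 at k = (n−1)² + 1, so checking that single power suffices.

```python
import numpy as np

def is_primitive(A):
    """Primitivity test for a nonnegative A via Theorem 2.1.2; by Wielandt's
    bound it suffices to check the single power k = (n-1)^2 + 1."""
    A = np.asarray(A, dtype=float)
    n = A.shape[0]
    k = (n - 1) ** 2 + 1
    return bool((np.linalg.matrix_power(A, k) > 0).all())

print(is_primitive([[0., 1.], [1., 1.]]),   # irreducible with a loop: primitive
      is_primitive([[0., 1.], [1., 0.]]))   # 2-cyclic: imprimitive
# → True False
```

For large n or large entries this direct power test can overflow floating point; it is meant only as an illustration of the theorem.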
Perron-Frobenius type results also hold when A is an irreducible imprimitive matrix (see
[11, §8.4]).
Theorem 2.1.3. Let A ∈ Mn(R) and suppose that A is irreducible, nonnegative, and h is
the cyclic index of A. Then
(a) ρ > 0;
(b) ρ ∈ σ (A);
(c) there exists a positive vector x such that Ax = ρx;
(d) ρ is an algebraically (and hence geometrically) simple eigenvalue of A; and
(e) π(A) = {ρ exp(i2πk/h) : k ∈ R(h)}.
The following result is the widest-reaching form of the Perron-Frobenius theorem.
Theorem 2.1.4. If A ∈Mn(R) and A ≥ 0, then ρ (A) ∈ σ (A) and there exists a nonnegative
vector x ≥ 0, x 6= 0, such that Ax = ρ (A)x.
One can verify that the matrix

$$\begin{bmatrix} 2 & 1 \\ 2 & -1 \end{bmatrix}$$

possesses properties (a) through (e) of Theorem 2.1.1, is irreducible, but obviously contains a negative entry. This motivates the following concept.
Definition 2.1.5. A matrix A ∈ Mn(R) is said to possess the strong Perron-Frobenius
property if A possesses properties (a) through (e) of Theorem 2.1.1.
The following theorem characterizes the strong Perron-Frobenius property (see [6, Lemma
2.1], [14, Theorem 1], or [20, Theorem 2.2]).
Theorem 2.1.6. A real matrix A is eventually positive if and only if A and AT possess the
strong Perron-Frobenius property.
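As an illustrative numerical check of Theorem 2.1.6 (our sketch, not the thesis's; the function name and tolerances are ours), the matrix displayed before Definition 2.1.5 and its transpose both pass a strong Perron-Frobenius test, consistent with its eventual positivity.

```python
import numpy as np

def has_strong_pf(A):
    """Numerically test the strong Perron-Frobenius property: the spectral
    radius is a positive, simple, strictly dominant eigenvalue with an
    entrywise positive eigenvector (after normalizing the sign)."""
    vals, vecs = np.linalg.eig(A)
    k = int(np.argmax(np.abs(vals)))
    rho = vals[k]
    if abs(np.imag(rho)) > 1e-12 or np.real(rho) <= 0:
        return False
    others = np.delete(np.abs(vals), k)
    v = np.real(vecs[:, k])
    v = v if v[int(np.argmax(np.abs(v)))] > 0 else -v
    return bool((others < np.real(rho) - 1e-12).all() and (v > 0).all())

A = np.array([[2., 1.], [2., -1.]])
# A and A^T both have the strong PF property, so A is eventually positive
# (indeed A^4 and all higher powers are entrywise positive).
print(has_strong_pf(A) and has_strong_pf(A.T))  # True
```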
2.2 Theory of Functions of Matrices
We review some basic notions and results from the theory of matrix functions (for further
results on the theory of matrix functions, see [8], [12, Chapter 9], or [15, Chapter 6]).
Definition 2.2.1. Let f : C → C be a function and f^{(k)} denote the kth derivative of f. The function f is said to be defined on the spectrum of A if the values

$$f^{(k)}(\lambda_i), \quad k = 0, \dots, m_i - 1, \quad i = 1, \dots, s,$$

called the values of the function f on the spectrum of A, exist, where λ_1, ..., λ_s denote the distinct eigenvalues of A and m_i = index_{λ_i} A.
Definition 2.2.2 (Matrix function via Jordan canonical form). If f is defined on the spectrum of A ∈ M_n(C), then

$$f(A) := Z f(J) Z^{-1} = Z \left( \bigoplus_{i=1}^{t} f(J_{n_i}) \right) Z^{-1},$$

where

$$f(J_{n_i}) := \begin{bmatrix} f(\lambda_i) & f'(\lambda_i) & \cdots & \frac{f^{(n_i - 1)}(\lambda_i)}{(n_i - 1)!} \\ & f(\lambda_i) & \ddots & \vdots \\ & & \ddots & f'(\lambda_i) \\ & & & f(\lambda_i) \end{bmatrix}. \quad (2.2.1)$$
The following equivalent definition (see [8, Theorem 1.12]) serves to illuminate the notion
of a primary root, and will be used to examine the 2× 2 case in Section 3.1.
Definition 2.2.3 (Matrix function via Hermite interpolation). Let f be defined on the spectrum of A ∈ M_n(C) and let ϕ be the minimal polynomial of A. Then f(A) := p(A), where p is the polynomial of degree less than

$$\sum_{i=1}^{s} n_i = \deg \varphi$$

that satisfies the interpolation conditions

$$p^{(j)}(\lambda_i) = f^{(j)}(\lambda_i), \quad j = 0, \dots, n_i - 1, \quad i = 1, \dots, s. \quad (2.2.2)$$

There is a unique such p, and it is known as the Hermite interpolating polynomial.
Remark 2.2.4. The Hermite interpolating polynomial p is given explicitly by the Lagrange-Hermite formula

$$p(t) = \sum_{i=1}^{s} \left[ \left( \sum_{j=0}^{n_i - 1} \frac{1}{j!} \phi_i^{(j)}(\lambda_i) (t - \lambda_i)^j \right) \prod_{j \neq i} (t - \lambda_j)^{n_j} \right], \quad (2.2.3)$$

where $\phi_i(t) = f(t) / \prod_{j \neq i} (t - \lambda_j)^{n_j}$. For a matrix with distinct eigenvalues, i.e., n_i = 1 and s = n, (2.2.3) reduces to

$$p(t) = \sum_{i=1}^{n} f(\lambda_i)\, \ell_i(t), \qquad \ell_i(t) = \prod_{\substack{j = 1 \\ j \neq i}}^{n} \frac{t - \lambda_j}{\lambda_i - \lambda_j}. \quad (2.2.4)$$
Formulae (2.2.3) and (2.2.4) will be used to classify the primary matrix roots of 2×2 matrices in Section 3.1.
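Formula (2.2.4) is easy to exercise numerically; the following sketch (ours, not the thesis's; the function name is hypothetical) evaluates f(A) as the Lagrange interpolant of f at the eigenvalues, for a matrix with distinct eigenvalues, and recovers the principal square root.

```python
import numpy as np

def f_of_A_lagrange(A, f):
    """Evaluate f(A) via (2.2.4): for A with distinct eigenvalues, f(A) is
    the Lagrange interpolant of f at the eigenvalues, evaluated at A."""
    A = np.asarray(A, dtype=complex)
    lam = np.linalg.eigvals(A)
    n = len(lam)
    I = np.eye(n)
    result = np.zeros((n, n), dtype=complex)
    for i in range(n):
        Li = I.astype(complex)  # ell_i(A) = prod_{j != i} (A - lam_j I)/(lam_i - lam_j)
        for j in range(n):
            if j != i:
                Li = Li @ (A - lam[j] * I) / (lam[i] - lam[j])
        result = result + f(lam[i]) * Li
    return result

A = np.array([[2., 1.], [0., 3.]])   # distinct eigenvalues 2 and 3
X = f_of_A_lagrange(A, np.sqrt)      # principal square root of A
print(np.allclose(X @ X, A))         # True
```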
Theorem 2.2.5. Let f be defined on the spectrum of a nonsingular matrix A ∈ M_n(C). If $J = \bigoplus_{i=1}^{t} J_{n_i}(\lambda_i) = Z^{-1} A Z$ is a Jordan canonical form of A, then

$$J_f := \bigoplus_{i=1}^{t} J_{n_i}(f(\lambda_i))$$

is a Jordan canonical form of f(A).

Proof. Because $f(A) = Z f(J) Z^{-1} = Z \left( \bigoplus_{i=1}^{t} f(J_{n_i}) \right) Z^{-1}$, note that, for each i = 1, ..., t, $f(J_{n_i})$ is an upper-triangular Toeplitz matrix. Following Lemma 1.3.2, for each i = 1, ..., t, there exists an invertible matrix $Q_i \in M_{n_i}(\mathbb{C})$ such that $f(J_{n_i}) = Q_i J_{n_i}(f(\lambda_i)) Q_i^{-1}$, and hence

$$f(A) = Z \left( \bigoplus_{i=1}^{t} f(J_{n_i}) \right) Z^{-1} = Z \left( \bigoplus_{i=1}^{t} Q_i J_{n_i}(f(\lambda_i)) Q_i^{-1} \right) Z^{-1} = \left( Z \bigoplus_{i=1}^{t} Q_i \right) \left( \bigoplus_{i=1}^{t} J_{n_i}(f(\lambda_i)) \right) \left( \left( \bigoplus_{i=1}^{t} Q_i \right)^{-1} Z^{-1} \right) = \tilde{Z} \left( \bigoplus_{i=1}^{t} J_{n_i}(f(\lambda_i)) \right) \tilde{Z}^{-1},$$

where $\tilde{Z} := Z \bigoplus_{i=1}^{t} Q_i$, and the result is established.
Definition 2.2.6. For z = r exp(iθ) ∈ C, where r > 0, and an integer p > 1, let

$$z^{1/p} := r^{1/p} \exp(i\theta/p),$$

and, for j ∈ {0, 1, ..., p − 1}, define

$$f_j(z) = z^{1/p} \exp(i 2\pi j / p) = r^{1/p} \exp(i[\theta + 2\pi j]/p), \quad (2.2.5)$$

i.e., f_j is the (j + 1)st branch of the pth-root function.
Note that

$$f_j^{(k)}(z) = \frac{1}{p^k} \left[ \prod_{i=0}^{k-1} (1 - ip) \right] r^{(1 - kp)/p} \exp\left( \frac{i[2\pi j + \theta(1 - kp)]}{p} \right), \quad (2.2.6)$$

where k is a nonnegative integer and the product $\prod_{i=0}^{k-1} (1 - ip)$ is empty when k = 0.
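The branches in (2.2.5) can be sketched directly (our code, not the thesis's; the name `f_branch` is ours): the p branches evaluated at z enumerate the p complex pth roots of z.

```python
import cmath
import math

def f_branch(z, j, p):
    """f_j(z) = r^(1/p) * exp(i(theta + 2*pi*j)/p), the (j+1)st branch of
    the p-th root function as in (2.2.5); theta is the principal argument."""
    r, theta = abs(z), cmath.phase(z)
    return r ** (1.0 / p) * cmath.exp(1j * (theta + 2 * math.pi * j) / p)

# The p branches enumerate the p complex p-th roots of z.
z, p = 8j, 3
roots = [f_branch(z, j, p) for j in range(p)]
print(all(abs(w ** p - z) < 1e-9 for w in roots))  # True
```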
Next we present several technical but easily established lemmas on the branches of the pth-root function.
Lemma 2.2.7. If z = r exp(iθ), z′ = r′ exp(iθ′) ∈ C and r < r′, then |f_j(z)| < |f_{j′}(z′)| for all j, j′ ∈ R(p).

Proof. Note that

$$|f_j(z)| = \left| r^{1/p} \exp(i[\theta + 2\pi j]/p) \right| = r^{1/p} \left| \exp(i[\theta + 2\pi j]/p) \right| = r^{1/p} < (r')^{1/p} = (r')^{1/p} \left| \exp(i[\theta' + 2\pi j']/p) \right| = \left| (r')^{1/p} \exp(i[\theta' + 2\pi j']/p) \right| = |f_{j'}(z')|.$$
Lemma 2.2.8. For z ∈ C with ℑ(z) ≠ 0, j, j′ ∈ R(p), and $f_j^{(k)}$ as in (2.2.6), we have $f_j^{(k)}(z) = \overline{f_{j'}^{(k)}(\bar{z})}$ if and only if j + j′ ≡ 0 (mod p).

Proof. For some ℓ ∈ Z, note that

$$f_j^{(k)}(z) = \overline{f_{j'}^{(k)}(\bar{z})}$$
⇔ exp(i[2πj + θ(1 − kp)]/p) = exp(i[−2πj′ + θ(1 − kp)]/p)
⇔ exp(i[2πj + θ(1 − kp)]/p) = exp(i[−2πj′ + θ(1 − kp)]/p) · exp(i2πℓp/p)
⇔ exp(i[2πj + θ(1 − kp)]/p) = exp(i[2π(ℓp − j′) + θ(1 − kp)]/p)
⇔ 2πj + θ(1 − kp) = 2π(ℓp − j′) + θ(1 − kp)
⇔ j + j′ = ℓp
⇔ j + j′ ≡ 0 (mod p).

Finally, note that j + j′ ≡ 0 (mod p) ⇔ j = j′ = 0 or j = p − j′.
Lemma 2.2.9. Let z = r exp(iπ), r > 0. For j, j′ ∈ {0, 1, ..., p − 1} and $f_j^{(k)}$ as in (2.2.6), we have $f_j^{(k)}(z) = \overline{f_{j'}^{(k)}(z)}$ if and only if j + j′ = p − 1.

Proof. Note that

$$f_j^{(k)}(z) = \overline{f_{j'}^{(k)}(z)}$$
⇔ exp(i[2πj + π(1 − kp)]/p) = exp(i[−2πj′ − π(1 − kp)]/p)
⇔ exp(i[2πj + π(1 − kp)]/p) = exp(i[−2πj′ − π(1 − kp)]/p) · exp(i2π(p − kp)/p)
⇔ exp(i[2πj + π(1 − kp)]/p) = exp(i[−2πj′ − π(1 − kp) + 2π(p − kp)]/p)
⇔ 2πj + π(1 − kp) = −2πj′ − π(1 − kp) + 2π(p − kp)
⇔ 2πj + 2π(1 − kp) = −2πj′ + 2π(p − kp)
⇔ j + j′ = p − 1.
The following theorem classifies all pth-roots of a general nonsingular matrix [23, Theo-
rems 2.1 and 2.2].
Theorem 2.2.10 (Classification of pth-roots of nonsingular matrices). If A ∈ M_n(C) is nonsingular, then A has precisely p^s pth-roots that are expressible as polynomials in A, given by

$$X_j = Z \left( \bigoplus_{i=1}^{t} f_{j_i}(J_{n_i}) \right) Z^{-1}, \quad (2.2.7)$$

where j = (j_1, ..., j_t), j_i ∈ {0, 1, ..., p − 1}, and j_i = j_k whenever λ_i = λ_k.

If s < t, then A has additional pth-roots that form parameterized families

$$X_j(U) = Z U \left( \bigoplus_{i=1}^{t} f_{j_i}(J_{n_i}) \right) U^{-1} Z^{-1}, \quad (2.2.8)$$

where U is an arbitrary nonsingular matrix that commutes with J and, for each j, there exist i and k, depending on j, such that λ_i = λ_k while j_i ≠ j_k.
In the theory of matrix functions, the roots given by (2.2.7) are called primary functions
(or roots) of A and are expressible as polynomials in A, and the roots given by (2.2.8), which
exist only if A is derogatory (i.e., if some eigenvalue appears in more than one Jordan block),
are called the nonprimary functions or roots of A and are not expressible as polynomials in
A [8, Chapter 1].
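For a diagonalizable matrix with distinct eigenvalues (so s = t and every root is primary), formula (2.2.7) can be enumerated directly; the sketch below (ours, not the thesis's; the function name is hypothetical) picks one branch per eigenvalue in the eigenbasis.

```python
import numpy as np
from itertools import product

def primary_roots(A, p):
    """Enumerate the p^n primary p-th roots of a diagonalizable matrix with
    n distinct eigenvalues: pick one branch f_{j_i} per eigenvalue (2.2.7)."""
    lam, Z = np.linalg.eig(np.asarray(A, dtype=complex))
    Zinv = np.linalg.inv(Z)
    roots = []
    for branches in product(range(p), repeat=len(lam)):
        d = [abs(l) ** (1.0 / p) * np.exp(1j * (np.angle(l) + 2 * np.pi * j) / p)
             for l, j in zip(lam, branches)]
        roots.append(Z @ np.diag(d) @ Zinv)
    return roots

A = np.array([[5., 4.], [4., 5.]])   # eigenvalues 9 and 1
roots = primary_roots(A, 2)          # the 4 primary square roots of A
print(all(np.allclose(X @ X, A) for X in roots))  # True
```

For repeated eigenvalues this naive enumeration would have to be restricted by the constraint j_i = j_k whenever λ_i = λ_k; the sketch assumes distinct eigenvalues.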
The next result provides a necessary and sufficient condition for the existence of a root
for a general matrix (see [22]).
Theorem 2.2.11 (Existence of pth-root). A matrix A ∈ M_n(C) has a pth-root if and only if the "ascent sequence" of integers d_1, d_2, ... defined by

$$d_i = \dim\!\left(\mathrm{null}(A^i)\right) - \dim\!\left(\mathrm{null}(A^{i-1})\right)$$

has the property that, for every integer ν ≥ 0, no more than one element of the sequence lies strictly between pν and p(ν + 1).
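The ascent-sequence condition is easy to check numerically; here is a small sketch we add (not part of the thesis; function names are ours), using matrix rank to compute null-space dimensions. Since the null spaces of A^i stabilize by i = n, the first n terms of the sequence suffice.

```python
import numpy as np
from collections import Counter

def ascent_sequence(A):
    """d_i = dim null(A^i) - dim null(A^(i-1)), i = 1, ..., n."""
    A = np.asarray(A, dtype=float)
    n = A.shape[0]
    dims = [n - np.linalg.matrix_rank(np.linalg.matrix_power(A, i))
            for i in range(n + 1)]
    return [dims[i] - dims[i - 1] for i in range(1, n + 1)]

def has_pth_root(A, p):
    """Theorem 2.2.11: A has a p-th root iff, for every nu >= 0, at most one
    d_i lies strictly between p*nu and p*(nu + 1)."""
    strict = [d for d in ascent_sequence(A) if d % p != 0]  # d in some open interval
    return all(c <= 1 for c in Counter(d // p for d in strict).values())

print(has_pth_root(np.eye(2), 2), has_pth_root([[0., 1.], [0., 0.]], 2))
# → True False : the 2x2 nilpotent Jordan block has no square root
```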
Before we state results concerning the matrix roots of a real matrix, we state some well-
known results concerning the real Jordan canonical form for real matrices (see [11, Section
3.4], [15, Section 6.7]).
Theorem 2.2.12 (Real Jordan canonical form). If A ∈ M_n(R) has r real eigenvalues (including multiplicities) and c complex conjugate pairs of eigenvalues (including multiplicities), then there exists a real, invertible matrix R ∈ M_n(R) such that

$$R^{-1} A R = J_R = \left( \bigoplus_{k=1}^{r} J_{n_k}(\lambda_k) \right) \oplus \left( \bigoplus_{k=r+1}^{r+c} C_{n_k}(\lambda_k) \right),$$

where:

1.

$$C_k(\lambda) := \begin{bmatrix} C(\lambda) & I_2 & & \\ & C(\lambda) & \ddots & \\ & & \ddots & I_2 \\ & & & C(\lambda) \end{bmatrix} \in M_{2k}(\mathbb{R}); \quad (2.2.9)$$

2.

$$C(\lambda) := \begin{bmatrix} \Re(\lambda) & \Im(\lambda) \\ -\Im(\lambda) & \Re(\lambda) \end{bmatrix} \in M_2(\mathbb{R}); \quad (2.2.10)$$

3. λ_1, ..., λ_r are the real eigenvalues (including multiplicities) of A; and

4. $\lambda_{r+1}, \bar{\lambda}_{r+1}, \dots, \lambda_{r+c}, \bar{\lambda}_{r+c}$ are the complex eigenvalues (including multiplicities) of A.
Lemma 2.2.13. Let λ ∈ C and suppose $C_k(\lambda)$ and $C(\lambda)$ are defined as in (2.2.9) and (2.2.10), respectively. If $S_k := \bigoplus_{i=1}^{k} S \in M_{2k}(\mathbb{C})$, where

$$S := \begin{bmatrix} -i & -i \\ 1 & -1 \end{bmatrix},$$

then

$$S_k^{-1} C_k(\lambda) S_k = D_k(\lambda) := \begin{bmatrix} D(\lambda) & I_2 & & \\ & D(\lambda) & \ddots & \\ & & \ddots & I_2 \\ & & & D(\lambda) \end{bmatrix} \in M_{2k}(\mathbb{C}), \quad (2.2.11)$$

where $D(\lambda) := \begin{bmatrix} \lambda & 0 \\ 0 & \bar{\lambda} \end{bmatrix}$.

Proof. Proceed by induction on k, the number of 2×2 blocks: when k = 1, one readily obtains

$$S^{-1} C(\lambda) S = \frac{1}{2} \begin{bmatrix} i & 1 \\ i & -1 \end{bmatrix} \begin{bmatrix} \Re(\lambda) & \Im(\lambda) \\ -\Im(\lambda) & \Re(\lambda) \end{bmatrix} \begin{bmatrix} -i & -i \\ 1 & -1 \end{bmatrix} = D(\lambda).$$

Now assume the assertion holds for all matrices of the form (2.2.9) of dimension 2(k − 1). The matrices in the product $S_k^{-1} C_k(\lambda) S_k$ can be partitioned as

$$\begin{bmatrix} S_{k-1}^{-1} & Z \\ Z^T & S^{-1} \end{bmatrix} \begin{bmatrix} C_{k-1}(\lambda) & Y \\ Z^T & C(\lambda) \end{bmatrix} \begin{bmatrix} S_{k-1} & Z \\ Z^T & S \end{bmatrix},$$

where $Z \in M_{2(k-1),2}(\mathbb{R})$ is a rectangular zero matrix, $Y = \begin{bmatrix} Z_2 \\ \vdots \\ Z_2 \\ I_2 \end{bmatrix} \in M_{2(k-1),2}(\mathbb{R})$, and $Z_2$ is the 2×2 zero matrix. With the above partition in mind, and following the induction hypothesis, we obtain

$$S_k^{-1} C_k(\lambda) S_k = \begin{bmatrix} D_{k-1}(\lambda) & S_{k-1}^{-1} Y S \\ Z^T & S^{-1} C(\lambda) S \end{bmatrix},$$

but note that $S_{k-1}^{-1} Y S = Y$ and $S^{-1} C(\lambda) S = D(\lambda)$.
Lemma 2.2.14. Let λ ∈ C and suppose $D_k(\lambda)$ and $D(\lambda)$ are defined as in Lemma 2.2.13. If $P_k$ is the permutation matrix given by

$$P_k = \begin{bmatrix} e_1 & e_3 & \cdots & e_{2k-1} & e_2 & e_4 & \cdots & e_{2k} \end{bmatrix} \in M_{2k}(\mathbb{R}), \quad (2.2.12)$$

where $e_i$ denotes the ith canonical basis vector of appropriate dimension, then

$$P_k^T D_k(\lambda) P_k = J_k(\lambda) \oplus J_k(\bar{\lambda}).$$

Proof. Proceed by induction on k: the base case k = 1 is trivial, so we assume the assertion holds for matrices of the form (2.2.11) of dimension 2(k − 1). If $z_n$ denotes the n × 1 zero vector, then

$$D_k(\lambda) = \begin{bmatrix} \lambda & e_2^T & 0 \\ z_{2(k-1)} & D_{k-1}(\bar{\lambda}) & e_{2k-3} \\ 0 & z_{2(k-1)}^T & \bar{\lambda} \end{bmatrix},$$

and if $\tilde{P}$ is the permutation matrix defined by

$$\tilde{P} := \begin{bmatrix} 1 & z_{2(k-1)}^T & 0 \\ z_{2(k-1)} & P_{k-1} & z_{2(k-1)} \\ 0 & z_{2(k-1)}^T & 1 \end{bmatrix} \in M_{2k}(\mathbb{R}),$$

then, following the induction hypothesis,

$$\tilde{P}^T D_k(\lambda) \tilde{P} = \begin{bmatrix} \lambda & z_{k-1}^T & e_1^T & 0 \\ z_{k-1} & J_{k-1}(\bar{\lambda}) & Z_{k-1} & e_{k-1} \\ z_{k-1} & Z_{k-1} & J_{k-1}(\lambda) & z_{k-1} \\ 0 & z_{k-1}^T & z_{k-1}^T & \bar{\lambda} \end{bmatrix}, \quad (2.2.13)$$

where $Z_{k-1}$ denotes the (k − 1) × (k − 1) zero matrix. A permutation similarity by the matrix

$$\hat{P} := \begin{bmatrix} 1 & & & \\ & & I_{k-1} & \\ & I_{k-1} & & \\ & & & 1 \end{bmatrix} \in M_{2k}(\mathbb{R})$$

brings the matrix on the right-hand side of (2.2.13) to the desired form. The proof is completed by noting that $\tilde{P}\hat{P} = P_k$, since right-hand multiplication by $\hat{P}$ permutes columns 2 through k with columns k + 1 through 2k − 1.
Corollary 2.2.15. Let λ ∈ C, λ ≠ 0, and let f be a function defined on the spectrum of $J_k(\lambda) \oplus J_k(\bar{\lambda})$. For j a nonnegative integer, let $f_\lambda^{(j)}$ denote $f^{(j)}(\lambda)$. If $C_k(\lambda)$ and $C(\lambda)$ are defined as in (2.2.9) and (2.2.10), respectively, then

$$f(C_k(\lambda)) = \begin{bmatrix} C(f_\lambda) & C(f'_\lambda) & \cdots & C\!\left(\frac{f_\lambda^{(k-1)}}{(k-1)!}\right) \\ & C(f_\lambda) & \ddots & \vdots \\ & & \ddots & C(f'_\lambda) \\ & & & C(f_\lambda) \end{bmatrix} \in M_{2k}(\mathbb{R})$$

if and only if $f_\lambda^{(j)} = \overline{f_{\bar{\lambda}}^{(j)}}$ for all j.

Proof. Following Lemmas 2.2.13 and 2.2.14,

$$P_k^T S_k^{-1} C_k(\lambda) S_k P_k = J_k(\lambda) \oplus J_k(\bar{\lambda}). \quad (2.2.14)$$

Since $f(X^{-1} A X) = X^{-1} f(A) X$ ([8, Theorem 1.13(c)]), $f(A \oplus B) = f(A) \oplus f(B)$ [8, Theorem 1.13(g)], and $f_\lambda^{(j)} = \overline{f_{\bar{\lambda}}^{(j)}}$ for all j, applying f to (2.2.14) yields

$$P_k^T S_k^{-1} f(C_k(\lambda)) S_k P_k = f(J_k(\lambda)) \oplus f(J_k(\bar{\lambda})) = f(J_k(\lambda)) \oplus \overline{f(J_k(\lambda))}.$$

Hence,

$$S_k^{-1} f(C_k(\lambda)) S_k = P_k \left[ f(J_k(\lambda)) \oplus \overline{f(J_k(\lambda))} \right] P_k^T = \begin{bmatrix} D(f_\lambda) & D(f'_\lambda) & \cdots & D\!\left(\frac{f_\lambda^{(k-1)}}{(k-1)!}\right) \\ & D(f_\lambda) & \ddots & \vdots \\ & & \ddots & D(f'_\lambda) \\ & & & D(f_\lambda) \end{bmatrix}$$

and

$$f(C_k(\lambda)) = S_k \begin{bmatrix} D(f_\lambda) & D(f'_\lambda) & \cdots & D\!\left(\frac{f_\lambda^{(k-1)}}{(k-1)!}\right) \\ & D(f_\lambda) & \ddots & \vdots \\ & & \ddots & D(f'_\lambda) \\ & & & D(f_\lambda) \end{bmatrix} S_k^{-1} = \begin{bmatrix} C(f_\lambda) & C(f'_\lambda) & \cdots & C\!\left(\frac{f_\lambda^{(k-1)}}{(k-1)!}\right) \\ & C(f_\lambda) & \ddots & \vdots \\ & & \ddots & C(f'_\lambda) \\ & & & C(f_\lambda) \end{bmatrix}.$$

The converse follows by noting that, for μ, ν ∈ C, the product

$$S \begin{bmatrix} \mu & 0 \\ 0 & \nu \end{bmatrix} S^{-1} = \begin{bmatrix} -i & -i \\ 1 & -1 \end{bmatrix} \begin{bmatrix} \mu & 0 \\ 0 & \nu \end{bmatrix} \cdot \frac{1}{2} \begin{bmatrix} i & 1 \\ i & -1 \end{bmatrix} = \frac{1}{2} \begin{bmatrix} -i\mu & -i\nu \\ \mu & -\nu \end{bmatrix} \begin{bmatrix} i & 1 \\ i & -1 \end{bmatrix} = \frac{1}{2} \begin{bmatrix} \mu + \nu & i(\nu - \mu) \\ i(\mu - \nu) & \mu + \nu \end{bmatrix}$$

is real if and only if $\nu = \bar{\mu}$.
Remark 2.2.16. In general, our analysis reveals that

$$f(C_k(\lambda)) = \begin{bmatrix} f(C(\lambda)) & f'(C(\lambda)) & \cdots & \frac{f^{(k-1)}(C(\lambda))}{(k-1)!} \\ & f(C(\lambda)) & \ddots & \vdots \\ & & \ddots & f'(C(\lambda)) \\ & & & f(C(\lambda)) \end{bmatrix} \in M_{2k}(\mathbb{C}),$$

which bears a striking resemblance to (2.2.1).
Corollary 2.2.17. If λ ∈ C with ℑ(λ) ≠ 0 and

$$F_j(C_k(\lambda)) := S_k P_k \begin{bmatrix} f_{j_1}(J_k(\lambda)) & 0 \\ 0 & f_{j_2}(J_k(\bar{\lambda})) \end{bmatrix} P_k^T S_k^{-1} \in M_{2k}(\mathbb{C}), \quad (2.2.15)$$

where $j = (j_1, j_2)$ and $j_1, j_2 \in \{0, 1, \dots, p - 1\}$, then

$$F_j(C_k(\lambda)) = \begin{bmatrix} C(f_{j_1}(\lambda)) & C(f'_{j_1}(\lambda)) & \cdots & C\!\left(\frac{f_{j_1}^{(k-1)}(\lambda)}{(k-1)!}\right) \\ & C(f_{j_1}(\lambda)) & \ddots & \vdots \\ & & \ddots & C(f'_{j_1}(\lambda)) \\ & & & C(f_{j_1}(\lambda)) \end{bmatrix} \in M_{2k}(\mathbb{R})$$

if and only if $j_1 = j_2 = 0$ or $j_1 = p - j_2$.

Proof. Follows from Lemma 2.2.8 and Corollary 2.2.15.
Corollary 2.2.18. If λ = r exp(iπ) ∈ C, where r > 0, and $F_j$ is defined as in (2.2.15), then

$$F_j(C_k(\lambda)) = \begin{bmatrix} C(f_{j_1}(\lambda)) & C(f'_{j_1}(\lambda)) & \cdots & C\!\left(\frac{f_{j_1}^{(k-1)}(\lambda)}{(k-1)!}\right) \\ & C(f_{j_1}(\lambda)) & \ddots & \vdots \\ & & \ddots & C(f'_{j_1}(\lambda)) \\ & & & C(f_{j_1}(\lambda)) \end{bmatrix} \in M_{2k}(\mathbb{R})$$

if and only if $j_1 + j_2 = p - 1$.

Proof. Follows from Lemma 2.2.9 and Corollary 2.2.15.
The next theorem provides a necessary and sufficient condition for the existence of a real
pth-root of a real A (see [9, Theorem 2.3]) and our proof utilizes the real Jordan canonical
form.
Theorem 2.2.19 (Existence of real pth-root). A matrix A ∈ Mn(R) has a real pth-root if
and only if it satisfies the ascent sequence condition specified in Theorem 2.2.11 and, if p is
even, A has an even number of Jordan blocks of each size for every negative eigenvalue.
Proof. Case 1: p is even. Following Theorem 2.2.12, there exists a real, invertible matrix R such that

$$A = R \begin{bmatrix} J_0 & & & \\ & J_+ & & \\ & & J_- & \\ & & & C \end{bmatrix} R^{-1},$$

where J_0 collects the singular Jordan blocks; J_+ collects the Jordan blocks with positive real eigenvalues; J_− collects the Jordan blocks with negative real eigenvalues; and C collects blocks of the form (2.2.9) corresponding to the complex conjugate pairs of eigenvalues of A.

By hypothesis, if J_k(λ) is a submatrix of J_−, it must appear an even number of times; for every such pair of blocks, it follows that

$$P_k [J_k(\lambda) \oplus J_k(\lambda)] P_k^T = C_k(\lambda),$$

where P_k is defined as in (2.2.12). Thus, there exists a permutation matrix P such that

$$A = R P^T P \begin{bmatrix} J_0 & & & \\ & J_+ & & \\ & & J_- & \\ & & & C \end{bmatrix} P^T P R^{-1} = \tilde{R} \begin{bmatrix} J_0 & & \\ & J_+ & \\ & & \tilde{C} \end{bmatrix} \tilde{R}^{-1},$$

where $\tilde{R} = R P^T$ and $\tilde{C}$ collects all the blocks of the form (2.2.9).

Since the ascent sequence condition holds for A, it must also hold for J_0, so J_0 has a pth-root, say W_0, and W_0 can be taken real in view of the construction given in [22, Section 3]; clearly, there exists a real matrix W_+ such that $W_+^p = J_+$ and, following Corollaries 2.2.17 and 2.2.18, there exists a real matrix W_c such that $W_c^p = \tilde{C}$. Hence, the matrix $X = \tilde{R}[W_0 \oplus W_+ \oplus W_c]\tilde{R}^{-1}$ is a real pth-root of A.

Conversely, if A satisfies the ascent sequence condition but has an odd number of Jordan blocks of some size corresponding to a negative eigenvalue, then the process just described cannot produce a real matrix pth-root, as one of the Jordan blocks cannot be paired, so that the root of such a block is necessarily complex.

Case 2: p is odd. Follows similarly to the first case, since real roots can be taken for J_0, J_+, J_−, and C.
We now present an analog of Theorem 2.2.10 for real matrices.
Theorem 2.2.20 (Classification of pth-roots of nonsingular real matrices). Let F_j be defined as in (2.2.15). If A ∈ M_n(R) is nonsingular, then A has precisely p^s primary pth-roots, given by
\[
X_j = R \begin{bmatrix} \bigoplus_{k=1}^{r} f_{j_k}(J_{n_k}(\lambda_k)) & 0 \\ 0 & \bigoplus_{k=r+1}^{r+c} F_{j_k}(C_{n_k}(\lambda_k)) \end{bmatrix} R^{-1}, \qquad (2.2.16)
\]
where j = (j_1, …, j_r, j_{r+1}, …, j_{r+c}), j_k = (j_{k_1}, j_{k_2}) for k = r + 1, …, r + c, and j_i = j_k whenever λ_i = λ_k.

If s < t, then A has additional nonprimary pth-roots that form parameterized families given by
\[
X_j(U) = RU \begin{bmatrix} \bigoplus_{k=1}^{r} f_{j_k}(J_{n_k}(\lambda_k)) & 0 \\ 0 & \bigoplus_{k=r+1}^{r+c} F_{j_k}(C_{n_k}(\lambda_k)) \end{bmatrix} U^{-1} R^{-1}, \qquad (2.2.17)
\]
where U is an arbitrary nonsingular matrix that commutes with J_R, and for each j there exist i and k, depending on j, such that λ_i = λ_k while j_i ≠ j_k.
Proof. Following Theorem 2.2.12, there exists a real, invertible matrix R such that
\[
R^{-1} A R = J_R = \left[ \bigoplus_{k=1}^{r} J_{n_k}(\lambda_k) \right] \oplus \left[ \bigoplus_{k=r+1}^{r+c} C_{n_k}(\lambda_k) \right];
\]
if T = I ⊕ [⊕_{k=r+1}^{r+c} S_{n_k} P_{n_k}], then
\[
T^{-1} J_R T = J = \left[ \bigoplus_{k=1}^{r} J_{n_k}(\lambda_k) \right] \oplus \left[ \bigoplus_{k=r+1}^{r+c} \left( J_{n_k}(\lambda_k) \oplus J_{n_k}(\bar\lambda_k) \right) \right],
\]
following Lemmas 2.2.13 and 2.2.14. Following Theorem 2.2.10, J_R has p^s primary roots given by
\[
T \left[ \bigoplus_{k=1}^{r} f_{j_k}(J_{n_k}(\lambda_k)) \oplus \bigoplus_{k=r+1}^{r+c} \left( f_{j_{k_1}}(J_{n_k}(\lambda_k)) \oplus f_{j_{k_2}}(J_{n_k}(\bar\lambda_k)) \right) \right] T^{-1}
= \left[ \bigoplus_{k=1}^{r} f_{j_k}(J_{n_k}(\lambda_k)) \right] \oplus \left[ \bigoplus_{k=r+1}^{r+c} F_{j_k}(C_{n_k}(\lambda_k)) \right],
\]
where j_k = (j_{k_1}, j_{k_2}) for k = r + 1, …, r + c, which establishes (2.2.16).

If A is derogatory, then J_R has additional roots of the form
\[
TW \left[ \bigoplus_{k=1}^{r} f_{j_k}(J_{n_k}(\lambda_k)) \oplus \bigoplus_{k=r+1}^{r+c} \left( f_{j_{k_1}}(J_{n_k}(\lambda_k)) \oplus f_{j_{k_2}}(J_{n_k}(\bar\lambda_k)) \right) \right] W^{-1} T^{-1},
\]
where W is an arbitrary nonsingular matrix that commutes with J. Note that
\[
TW \left[ \bigoplus_{k=1}^{r} f_{j_k}(J_{n_k}(\lambda_k)) \oplus \bigoplus_{k=r+1}^{r+c} \left( f_{j_{k_1}}(J_{n_k}(\lambda_k)) \oplus f_{j_{k_2}}(J_{n_k}(\bar\lambda_k)) \right) \right] W^{-1} T^{-1}
= U \left[ \bigoplus_{k=1}^{r} f_{j_k}(J_{n_k}(\lambda_k)) \oplus \bigoplus_{k=r+1}^{r+c} F_{j_k}(C_{n_k}(\lambda_k)) \right] U^{-1},
\]
where U = TWT^{-1}. Following [15, Theorem 1, §12.4], U is an arbitrary, nonsingular matrix that commutes with J_R, which establishes (2.2.17).
30
The next result identifies the number of real primary pth-roots of a real matrix (cf. [9, Theorem 2.4]); our proof utilizes the real Jordan canonical form.
Corollary 2.2.21. Let the nonsingular real matrix A have r_1 distinct positive real eigenvalues, r_2 distinct negative real eigenvalues, and c distinct complex-conjugate pairs of eigenvalues. If p is even, there are (a) 2^{r_1} p^c real primary pth-roots when r_2 = 0; and (b) no real primary pth-roots when r_2 > 0. If p is odd, there are p^c real primary pth-roots.
Proof. Following Theorem 2.2.12, there exists a real, invertible matrix R such that
\[
R^{-1} A R = J_R = \left[ \bigoplus_{k=1}^{r} J_{n_k}(\lambda_k) \right] \oplus \left[ \bigoplus_{k=r+1}^{r+c} C_{n_k}(\lambda_k) \right].
\]
Case 1: p is even. If r_2 > 0, then A does not possess a real primary root, since A must have an
even number of Jordan blocks of each size for every negative eigenvalue and the same branch
of the pth-root function must be selected for every Jordan block containing the same negative
eigenvalue. If r_2 = 0, then, following Corollary 2.2.17, for every complex-conjugate pair of eigenvalues, there are p choices such that F_{j_k}(C_{n_k}(λ_k)) is real. For every real eigenvalue, there are two choices such that f_{j_k}(J_{n_k}(λ_k)) is real, yielding 2^{r_1} p^c real primary roots.
Case 2: p is odd. The matrix X_j is real provided that the principal branch of the pth-root function is chosen for every real eigenvalue. Similar to the first case, there are p choices such that F_{j_k}(C_{n_k}(λ_k)) is real, yielding p^c real primary roots.
The following theorem extends Theorem 2.2.10 to include singular matrices (see [9, The-
orem 2.6]).
Theorem 2.2.22 (Classification of pth-roots). Let A ∈ M_n(C) have the Jordan canonical form Z^{-1}AZ = J = J_0 ⊕ J_1, where J_0 collects together all the Jordan blocks corresponding to the eigenvalue zero and J_1 contains the remaining Jordan blocks. If A possesses a pth-root, then all pth-roots of A are given by X = Z(X_0 ⊕ X_1)Z^{-1}, where X_1 is any pth-root of J_1, characterized by Theorem 2.2.10, and X_0 is any pth-root of J_0.
Chapter 3
Matrix Roots of Nonnegative Matrices
3.1 2× 2 Matrices
We analyze and categorize the primary matrix roots of a 2 × 2 nonnegative matrix via the interpolating-polynomial definition of a matrix function (see Definition 2.2.3, (2.2.3), and (2.2.4)): let A ∈ M_2(R) and suppose A ≥ 0. Following Theorem 2.1.4 and the fact that the spectrum of a real matrix must be self-conjugate (see, e.g., [11, Appendix C]), without loss of generality, we may assume that σ(A) = {1, λ}, where 1 ≥ |λ| ≥ 0 and λ ∈ R. Note that the Jordan form of A must be
\[
\begin{bmatrix} 1 & 0 \\ 0 & \lambda \end{bmatrix} \qquad (3.1.1)
\]
or
\[
\begin{bmatrix} 1 & 1 \\ 0 & 1 \end{bmatrix}. \qquad (3.1.2)
\]
As in (2.2.5), let f_j denote the (j + 1)st branch of the pth-root function. For 2 × 2 matrices having the Jordan form (3.1.1), except when λ = 1, following (2.2.4), the required Lagrange-Hermite interpolating polynomial is given by
\[
p(t) = f_{j_1}(1)\,\frac{t-\lambda}{1-\lambda} + f_{j_2}(\lambda)\,\frac{t-1}{\lambda-1}
= \frac{\omega_p^{j_1} - f_{j_2}(\lambda)}{1-\lambda}\, t + \frac{f_{j_2}(\lambda) - \lambda\,\omega_p^{j_1}}{1-\lambda}.
\]
Thus, if j = (j_1, j_2), then A has p^2 primary pth-roots given by
\[
A_j^{1/p} = \frac{\omega_p^{j_1} - f_{j_2}(\lambda)}{1-\lambda}\, A + \frac{f_{j_2}(\lambda) - \lambda\,\omega_p^{j_1}}{1-\lambda}\, I_2. \qquad (3.1.3)
\]
3.1.1 1 > λ ≥ 0

Let A_j^{1/p} be the matrix given in (3.1.3) and consider two cases depending on the parity of p:

(1) p is even. The root A_j^{1/p} is real if and only if j_1, j_2 ∈ {0, p/2}.

If j_1 = j_2 = 0, then
\[
A_j^{1/p} = \frac{1-\sqrt[p]{\lambda}}{1-\lambda}\, A + \frac{\sqrt[p]{\lambda}-\lambda}{1-\lambda}\, I_2 \ge 0,
\]
because (1 − \sqrt[p]{λ})/(1 − λ) > 0 and (\sqrt[p]{λ} − λ)/(1 − λ) > 0.

If j_1 = j_2 = p/2, then
\[
A_j^{1/p} = \frac{\sqrt[p]{\lambda}-1}{1-\lambda}\, A + \frac{\lambda-\sqrt[p]{\lambda}}{1-\lambda}\, I_2 \le 0.
\]

If j_1 = p/2 and j_2 = 0, then A_j^{1/p} cannot be nonnegative, since the off-diagonal entries of A are multiplied by a negative coefficient and are unaffected by the added positive multiple of the identity.

If j_1 = 0 and j_2 = p/2, then f_{j_2}(λ) = −\sqrt[p]{λ}, so (3.1.3) gives
\[
A_j^{1/p} = \frac{1+\sqrt[p]{\lambda}}{1-\lambda}\, A - \frac{\lambda+\sqrt[p]{\lambda}}{1-\lambda}\, I_2,
\]
and A_j^{1/p} ≥ 0 if and only if (1 + \sqrt[p]{λ}) a_{ii} ≥ λ + \sqrt[p]{λ} for i = 1, 2.

(2) p is odd. The root A_j^{1/p} is real if and only if j_1 = j_2 = 0, in which case
\[
A_j^{1/p} = \frac{1-\sqrt[p]{\lambda}}{1-\lambda}\, A + \frac{\sqrt[p]{\lambda}-\lambda}{1-\lambda}\, I_2 \ge 0,
\]
because (1 − \sqrt[p]{λ})/(1 − λ) > 0 and (\sqrt[p]{λ} − λ)/(1 − λ) > 0.

If λ = 0, then A is a rank-one matrix and
\[
A_j^{1/p} = \omega_p^{j_1} A, \qquad j_1 \in R(p).
\]
3.1.2 1 > 0 > λ

Let A_j^{1/p} be the matrix given in (3.1.3) and consider two cases depending on the parity of p:

(1) p is even. If λ < 0 and p is even, then A_j^{1/p} ∉ M_2(R).

(2) p is odd. Same as case (2) when 1 > λ > 0.
3.1.3 λ = 1

If the Jordan form of A is (3.1.1), then (trivially) A = I_2 and \{\omega_p^{j_1} I_2\}_{j_1=0}^{p-1} are the pth-roots of A.

If the Jordan form of A is (3.1.2), then (2.2.3) reduces to
\[
p(t) = f_{j_1}(1) + f'_{j_1}(1)(t-1) = \omega_p^{j_1} + \frac{\omega_p^{j_1}}{p}(t-1),
\]
so that
\[
A_j^{1/p} = f_{j_1}(1) I_2 + f'_{j_1}(1)(A - I_2) = \frac{\omega_p^{j_1}}{p}\, A + \left[ \omega_p^{j_1} - \frac{\omega_p^{j_1}}{p} \right] I_2,
\]
which is real if and only if j_1 = 0 (or j_1 = p/2, when p is even). When j_1 = 0, note that
\[
A_j^{1/p} = \frac{1}{p}\, A + \left[ 1 - \frac{1}{p} \right] I_2 = \frac{A + (p-1) I_2}{p}.
\]
As a special case, note that if A = \begin{bmatrix} 1 & 1 \\ 0 & 1 \end{bmatrix}, then J_p := \begin{bmatrix} 1 & 1/p \\ 0 & 1 \end{bmatrix} is a pth-root of A for all p ∈ N.
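As a quick numerical sanity check (a minimal NumPy sketch), the claimed pth-root of the Jordan block can be verified directly, since the (1, 2) entries of powers of an upper-triangular 2 × 2 matrix with unit diagonal simply add:

```python
import numpy as np

A = np.array([[1.0, 1.0],
              [0.0, 1.0]])

for p in range(1, 8):
    Jp = np.array([[1.0, 1.0 / p],
                   [0.0, 1.0]])
    # [[1, 1/p], [0, 1]]^p = [[1, 1], [0, 1]]
    assert np.allclose(np.linalg.matrix_power(Jp, p), A)
```
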
3.2 Primitive Matrices
We present our main results for primitive matrices.
Theorem 3.2.1. Let the nonsingular primitive matrix A have r_1 distinct positive real eigenvalues, r_2 distinct negative real eigenvalues, and c distinct complex-conjugate pairs of eigenvalues. If p is even, there are (a) 2^{r_1 − 1} p^c eventually positive primary pth-roots when r_2 = 0; and (b) no eventually positive primary pth-roots if r_2 > 0. If p is odd, there are p^c eventually positive primary pth-roots.
Proof. If r_2 > 0, then, following Corollary 2.2.21, the matrix A does not have a real primary pth-root; hence, a fortiori, it cannot have an eventually positive primary pth-root.
If r_2 = 0, then, following Theorems 2.2.12 and 2.1.1, there exists a real, invertible matrix R such that
\[
A = \begin{bmatrix} x & R' \end{bmatrix}
\begin{bmatrix} \rho & \\ & \bigoplus_{k=2}^{r} J_{n_k}(\lambda_k) \oplus \bigoplus_{k=r+1}^{r+c} C_{n_k}(\lambda_k) \end{bmatrix}
\begin{bmatrix} y^T \\ (R^{-1})' \end{bmatrix}, \qquad (3.2.1)
\]
where R = [x  R'] ∈ M_n(R), R^{-1} = [y^T; (R^{-1})'] ∈ M_n(R) (partitioned by rows), x > 0 is the right Perron vector, and y > 0 is the left Perron vector.

Following Lemma 2.2.7, |f_j(z)| < |f_{j'}(z')| whenever |z| < |z'|, for all j, j' ∈ {0, 1, …, p − 1}, so that any primary pth-root X of the form
\[
X_j = R \begin{bmatrix} \sqrt[p]{\rho} & \\ & \bigoplus_{k=2}^{r} f_{j_k}(J_{n_k}(\lambda_k)) \oplus \bigoplus_{k=r+1}^{r+c} F_{j_k}(C_{n_k}(\lambda_k)) \end{bmatrix} R^{-1}
\]
inherits the strong Perron-Frobenius property from A. Since f(A)^T = f(A^T) ([8, Theorem 1.13(b)]), a similar argument demonstrates that X^T inherits the strong Perron-Frobenius property from A^T.
Case 1: p is even. For all k = 2, …, r, there are two possible choices such that f_{j_k}(J_{n_k}(λ_k)) is real; following Corollary 2.2.17, for all k = r + 1, …, r + c, there are p choices such that F_{j_k}(C_{n_k}(λ_k)) is real. Thus, there are 2^{r_1 − 1} p^c possible ways to select X and X^T to be real.

Case 2: p is odd. For all k = 2, …, r, the principal branch of the pth-root function must be selected so that f_{j_k}(J_{n_k}(λ_k)) is real and, similar to the previous case, there are p choices such that F_{j_k}(C_{n_k}(λ_k)) is real. Hence, there are p^c ways to select X and X^T real.

Regardless of the parity of p, following Theorem 2.1.6, the matrices X and X^T are eventually positive.
Example 3.2.2. We demonstrate Theorem 3.2.1 via an example. Consider the matrix
\[
A = \frac{1}{5}
\begin{bmatrix}
16 & 16 & 6 & 11 & 1 \\
7 & 12 & 12 & 12 & 7 \\
9 & 4 & 14 & 9 & 14 \\
8 & 8 & 8 & 13 & 13 \\
10 & 10 & 10 & 5 & 15
\end{bmatrix}.
\]
Note that A = R J_R R^{-1}, where
\[
J_R = \begin{bmatrix}
10 & 0 & 0 & 0 & 0 \\
0 & 1 & 1 & 1 & 0 \\
0 & -1 & 1 & 0 & 1 \\
0 & 0 & 0 & 1 & 1 \\
0 & 0 & 0 & -1 & 1
\end{bmatrix}, \qquad
R = \begin{bmatrix}
1 & 1 & 1 & 1 & 1 \\
1 & -1 & 0 & 0 & 0 \\
1 & 0 & -1 & 0 & 0 \\
1 & 0 & 0 & -1 & 0 \\
1 & 0 & 0 & 0 & -1
\end{bmatrix},
\]
and
\[
R^{-1} = \frac{1}{5} \begin{bmatrix}
1 & 1 & 1 & 1 & 1 \\
1 & -4 & 1 & 1 & 1 \\
1 & 1 & -4 & 1 & 1 \\
1 & 1 & 1 & -4 & 1 \\
1 & 1 & 1 & 1 & -4
\end{bmatrix}.
\]
Because σ(A) = {10, 1 + i, 1 + i, 1 − i, 1 − i}, following Theorem 3.2.1, A has eight primary matrix square-roots, of which the matrices
\[
X_j = R \begin{bmatrix}
\sqrt{10} & 0 & 0 & 0 & 0 \\
0 & 1.0987 & 0.4551 & 0.3884 & -0.1609 \\
0 & -0.4551 & 1.0987 & 0.1609 & 0.3884 \\
0 & 0 & 0 & 1.0987 & 0.4551 \\
0 & 0 & 0 & -0.4551 & 1.0987
\end{bmatrix} R^{-1}
= \begin{bmatrix}
1.6668 & 1.0232 & 0.1130 & 0.4738 & -0.1145 \\
0.2762 & 1.3749 & 0.7313 & 0.6646 & 0.1153 \\
0.3939 & -0.0612 & 1.4926 & 0.5548 & 0.7823 \\
0.3217 & 0.3217 & 0.3217 & 1.4204 & 0.7768 \\
0.5037 & 0.5037 & 0.5037 & 0.0486 & 1.6024
\end{bmatrix},
\]
where j = (0, (0, 0)), and
\[
X_{j'} = R \begin{bmatrix}
\sqrt{10} & 0 & 0 & 0 & 0 \\
0 & -1.0987 & -0.4551 & -0.3884 & 0.1609 \\
0 & 0.4551 & -1.0987 & -0.1609 & -0.3884 \\
0 & 0 & 0 & -1.0987 & -0.4551 \\
0 & 0 & 0 & 0.4551 & -1.0987
\end{bmatrix} R^{-1}
= \begin{bmatrix}
-0.4019 & 0.2417 & 1.1519 & 0.7911 & 1.3794 \\
0.9887 & -0.1100 & 0.5336 & 0.6003 & 1.1496 \\
0.8710 & 1.3261 & -0.2276 & 0.7101 & 0.4826 \\
0.9432 & 0.9432 & 0.9432 & -0.1555 & 0.4881 \\
0.7612 & 0.7612 & 0.7612 & 1.2163 & -0.3375
\end{bmatrix},
\]
where j' = (0, (1, 1)), are eventually positive square-roots of A.
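The first root above can be reproduced with standard tools: scipy.linalg.sqrtm computes the principal square root, which corresponds to the branch j = (0, (0, 0)). A hedged sketch (assuming only the matrix A displayed above):

```python
import numpy as np
from scipy.linalg import sqrtm

A = np.array([[16, 16,  6, 11,  1],
              [ 7, 12, 12, 12,  7],
              [ 9,  4, 14,  9, 14],
              [ 8,  8,  8, 13, 13],
              [10, 10, 10,  5, 15]]) / 5.0

X = sqrtm(A)                      # principal square root: branch j = (0, (0, 0))
if np.iscomplexobj(X):            # sqrtm may return a complex dtype with ~0 imag part
    X = X.real
assert np.allclose(X @ X, A)
# X has a few negative entries, yet X^2 = A > 0, so X is eventually positive
assert (X < 0).any()
assert (np.linalg.matrix_power(X, 2) > 0).all()
```
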
The following question arises from Example 3.2.2: are Xj and Xj′ the only eventually
positive square-roots of A? This is answered in the following result, which yields an explicit
description of the eventually positive primary roots of a nonsingular primitive matrix A.
Theorem 3.2.3. Let A be a nonsingular, primitive matrix and let X_j be any primary pth-root of A of the form
\[
X_j = R \begin{bmatrix} f_{j_1}(\rho) & \\ & \bigoplus_{k=2}^{r} f_{j_k}(J_{n_k}(\lambda_k)) \oplus \bigoplus_{k=r+1}^{r+c} F_{j_k}(C_{n_k}(\lambda_k)) \end{bmatrix} R^{-1},
\]
where j = (j1, . . . , jr, jr+1, . . . , jr+c) and jk = (jk1 , jk2), for k = r + 1, . . . , r + c. If p is odd,
then Xj is eventually positive if and only if
1. j1 = 0;
2. jk = 0 for all k = 2, . . . , r; and
3. jk = (0, 0) or jk = (jk1 , p− jk1) for all k = r + 1, . . . , r + c.
If p is even, then Xj is eventually positive if and only if
1. j1 = 0;
2. jk = 0 or jk = p/2 for all k = 2, . . . , r; and
3. jk = (0, 0) or jk = (jk1 , p− jk1) for all k = r + 1, . . . , r + c.
Proof. We demonstrate necessity as sufficiency is shown in the proof of Theorem 3.2.1.
To this end, we demonstrate the contrapositive. Case 1: p is odd. If the principal branch of the pth-root function is not selected for the Perron eigenvalue, then X_j cannot possess the strong Perron-Frobenius property; if j_k ≠ 0 for some k ∈ {2, …, r}, then f_{j_k}(J_{n_k}(λ_k)) is not real, so that X_j cannot be real; similarly, X_j is not real if j_k ≠ (0, 0) and j_k ≠ (j_{k_1}, p − j_{k_1}) for some k ∈ {r + 1, …, r + c}.

Case 2: p is even. The result is similar to the first case, but we note that, without loss of generality, we may assume that A does not have any negative eigenvalues (else it cannot possess a real primary root).
Theorem 3.2.4. Let A be a primitive, nonsingular, derogatory matrix that possesses a real root, and let X_j(U) be any nonprimary root of A of the form
\[
X_j(U) = RU \begin{bmatrix} f_{j_1}(\rho) & \\ & \bigoplus_{k=2}^{r} f_{j_k}(J_{n_k}(\lambda_k)) \oplus \bigoplus_{k=r+1}^{r+c} F_{j_k}(C_{n_k}(\lambda_k)) \end{bmatrix} U^{-1} R^{-1},
\]
where j = (j_1, …, j_r, j_{r+1}, …, j_{r+c}) and j_k = (j_{k_1}, j_{k_2}), for k = r + 1, …, r + c, subject to the constraint that for each j, there exist i and k, depending on j, such that λ_i = λ_k, while j_i ≠ j_k.
If p is even, then Xj(U) is eventually positive if and only if
(1) j1 = 0;
(2) jk = 0 or jk = p/2, for all k = 2, . . . , r;
(3) jk = (0, 0) or jk = (jk1 , p− jk1) for all k = r + 1, . . . , r + c;
(4) U is a real nonsingular matrix; and
(5) Jordan blocks containing negative eigenvalues are transformed, via a permutation ma-
trix, to blocks of the form (2.2.9) (see proof of Theorem 2.2.19) and branches for these
blocks are selected in accordance with Corollary 2.2.18.
If p is odd, then Xj(U) is eventually positive if and only if
(1) j1 = 0;
(2) jk = 0 for all k = 2, . . . , r;
(3) jk = (0, 0) or jk = (jk1 , p− jk1) for all k = r + 1, . . . , r + c; and
(4) U is a real nonsingular matrix.
Example 3.2.5. We demonstrate Theorem 3.2.4 via an example. Consider the matrix
\[
A = \frac{1}{9} \begin{bmatrix}
32 & 32 & 14 & 23 & 5 & 32 & 14 & 23 & 5 \\
17 & 26 & 26 & 26 & 17 & 17 & 17 & 17 & 17 \\
19 & 10 & 28 & 19 & 28 & 19 & 19 & 19 & 19 \\
18 & 18 & 18 & 27 & 27 & 18 & 18 & 18 & 18 \\
20 & 20 & 20 & 11 & 29 & 20 & 20 & 20 & 20 \\
17 & 17 & 17 & 17 & 17 & 26 & 26 & 26 & 17 \\
19 & 19 & 19 & 19 & 19 & 10 & 28 & 19 & 28 \\
18 & 18 & 18 & 18 & 18 & 18 & 18 & 27 & 27 \\
20 & 20 & 20 & 20 & 20 & 20 & 20 & 11 & 29
\end{bmatrix}.
\]
Note that A = R J_R R^{-1}, where
\[
J_R = \begin{bmatrix}
20 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 \\
0 & 1 & 1 & 1 & 0 & 0 & 0 & 0 & 0 \\
0 & -1 & 1 & 0 & 1 & 0 & 0 & 0 & 0 \\
0 & 0 & 0 & 1 & 1 & 0 & 0 & 0 & 0 \\
0 & 0 & 0 & -1 & 1 & 0 & 0 & 0 & 0 \\
0 & 0 & 0 & 0 & 0 & 1 & 1 & 1 & 0 \\
0 & 0 & 0 & 0 & 0 & -1 & 1 & 0 & 1 \\
0 & 0 & 0 & 0 & 0 & 0 & 0 & 1 & 1 \\
0 & 0 & 0 & 0 & 0 & 0 & 0 & -1 & 1
\end{bmatrix},
\]
\[
R = \begin{bmatrix}
1 & 1 & 1 & 1 & 1 & 1 & 1 & 1 & 1 \\
1 & -1 & 0 & 0 & 0 & 0 & 0 & 0 & 0 \\
1 & 0 & -1 & 0 & 0 & 0 & 0 & 0 & 0 \\
1 & 0 & 0 & -1 & 0 & 0 & 0 & 0 & 0 \\
1 & 0 & 0 & 0 & -1 & 0 & 0 & 0 & 0 \\
1 & 0 & 0 & 0 & 0 & -1 & 0 & 0 & 0 \\
1 & 0 & 0 & 0 & 0 & 0 & -1 & 0 & 0 \\
1 & 0 & 0 & 0 & 0 & 0 & 0 & -1 & 0 \\
1 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & -1
\end{bmatrix},
\]
and
\[
R^{-1} = \frac{1}{9} \begin{bmatrix}
1 & 1 & 1 & 1 & 1 & 1 & 1 & 1 & 1 \\
1 & -8 & 1 & 1 & 1 & 1 & 1 & 1 & 1 \\
1 & 1 & -8 & 1 & 1 & 1 & 1 & 1 & 1 \\
1 & 1 & 1 & -8 & 1 & 1 & 1 & 1 & 1 \\
1 & 1 & 1 & 1 & -8 & 1 & 1 & 1 & 1 \\
1 & 1 & 1 & 1 & 1 & -8 & 1 & 1 & 1 \\
1 & 1 & 1 & 1 & 1 & 1 & -8 & 1 & 1 \\
1 & 1 & 1 & 1 & 1 & 1 & 1 & -8 & 1 \\
1 & 1 & 1 & 1 & 1 & 1 & 1 & 1 & -8
\end{bmatrix}.
\]
If j = (0, (0, 0), (1, 1)), then, following Theorem 3.2.4, any matrix of the form
\[
X_j(U) = RU \begin{bmatrix} \sqrt{20} & & \\ & F_{(0,0)}(C_2(1+i)) & \\ & & F_{(1,1)}(C_2(1+i)) \end{bmatrix} U^{-1} R^{-1},
\]
where
\[
U = \begin{bmatrix}
u_1 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 \\
0 & u_2 & u_3 & 1 & 0 & u_4 & u_5 & 1 & 0 \\
0 & -u_3 & u_2 & 0 & 1 & -u_5 & u_4 & 0 & 1 \\
0 & 0 & 0 & u_2 & u_3 & 0 & 0 & u_4 & u_5 \\
0 & 0 & 0 & -u_3 & u_2 & 0 & 0 & -u_5 & u_4 \\
0 & u_6 & u_7 & 1 & 0 & u_8 & u_9 & 1 & 0 \\
0 & -u_7 & u_6 & 0 & 1 & -u_9 & u_8 & 0 & 1 \\
0 & 0 & 0 & u_6 & u_7 & 0 & 0 & u_8 & u_9 \\
0 & 0 & 0 & -u_7 & u_6 & 0 & 0 & -u_9 & u_8
\end{bmatrix},
\]
u_1 ≠ 0, and u_i ∈ R, for i = 2, …, 9, is an eventually positive square-root of A.
Next, we present an analog of Theorem 2.2.22 for real matrices.
Theorem 3.2.6. Let the primitive matrix A ∈ M_n(R) have the real Jordan canonical form R^{-1}AR = J_R = J_0 ⊕ J_1, where J_0 collects all the singular Jordan blocks and J_1 collects the remaining blocks. If A possesses a real root, then all eventually positive pth-roots of A are given by X = R(X_0 ⊕ X_1)R^{-1}, where X_1 is any pth-root of J_1 characterized by Theorem 3.2.3 or Theorem 3.2.4, and X_0 is a real pth-root of J_0.
Remark 3.2.7. It should be clear that Theorems 3.2.1, 3.2.3, 3.2.4, and 3.2.6 remain true if the assumption of primitivity is replaced with eventual positivity.
Recall that for A ∈ M_n(C) with no eigenvalues on R^−, the principal pth-root, denoted by A^{1/p}, is the unique pth-root of A all of whose eigenvalues lie in the segment {z : −π/p < arg(z) < π/p} [8, Theorem 7.2]. In addition, recall that a nonnegative matrix A is said to be stochastic if \sum_{j=1}^{n} a_{ij} = 1 for all i = 1, …, n.
In [9], motivated by applications to discrete-time Markov chains, two classes of stochastic matrices were identified that possess stochastic principal pth-roots for all p. As a consequence, and of particular interest, for these classes of matrices the twelfth-root of an annual transition matrix is itself a transition matrix. A (monthly) transition matrix that contains a negative entry or is complex is meaningless in the context of a model; however, the following remark demonstrates that, under suitable conditions, a matrix root of a primitive stochastic matrix will be eventually stochastic (i.e., eventually positive with row sums equal to one).
Eventual stochasticity may be useful in the following manner: consider, for example, the application of discrete-time Markov chains in credit risk. Let P = [p_{ij}] ∈ M_n(R) be a primitive transition matrix, where p_{ij} ∈ [0, 1] is the probability that a firm with rating i transitions to rating j. Such matrices are derived from annual data, and the twelfth-root of such a matrix would correspond to a monthly transition matrix if its entries were nonnegative. In application, however, the twelfth-root would be used for forecasting purposes (e.g., to estimate the likelihood that a firm with credit rating i transitions to rating j, m months into the future). Thus, if R is the twelfth-root of P and R^m ≥ 0, then r^{(m)}_{ij} is a candidate for the aforementioned probability. Moreover, it is well known that n^2 − 2n + 2 is a sharp upper bound for the index of primitivity of a primitive matrix (see, e.g., [11, Corollary 8.5.9]), so that eventual stochasticity is feasible in application.
Remark 3.2.8. Theorems 3.2.1, 3.2.3, 3.2.4, and 3.2.6 remain true if stochasticity is added as an assumption on the matrix A and the conclusion of eventual positivity is replaced with eventual stochasticity.
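As an illustration of the row-sum preservation underlying eventual stochasticity, the following sketch uses SciPy's fractional_matrix_power; the 3 × 3 annual transition matrix is a hypothetical example, not one taken from [9]:

```python
import numpy as np
from scipy.linalg import fractional_matrix_power

# Hypothetical primitive annual transition matrix (rows sum to one)
P = np.array([[0.90, 0.08, 0.02],
              [0.05, 0.90, 0.05],
              [0.02, 0.08, 0.90]])

R = fractional_matrix_power(P, 1 / 12)   # principal twelfth-root
assert np.allclose(np.linalg.matrix_power(R, 12), P)
# Since P e = e and R = f(P) with f(1) = 1, the row sums of R are all one
assert np.allclose(R.sum(axis=1), 1.0)
```

Because the Gershgorin disks of P lie in the open right half-plane, P has no eigenvalues on R^−, so the principal twelfth-root exists and is real.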
We conclude with the following remark.
Remark 3.2.9. If A is an eventually positive matrix, then B := A^q is eventually positive for all q ∈ N; thus, all the previous results hold for fractional powers of A.
3.3 Imprimitive Irreducible Matrices
Consider the matrices
\[
A = \begin{bmatrix} 0 & 1 & 0 \\ 0 & 0 & 1 \\ 1 & 0 & 0 \end{bmatrix}, \qquad
B = \begin{bmatrix} 0 & 0 & 1 \\ 1 & 0 & 0 \\ 0 & 1 & 0 \end{bmatrix}, \qquad
C = \frac{1}{3} \begin{bmatrix} 2 & 2 & -1 \\ -1 & 2 & 2 \\ 2 & -1 & 2 \end{bmatrix}.
\]
One can verify that B^2 = C^2 = A, yet the matrix C is not eventually nonnegative despite possessing positive left and right eigenvectors. In this section, we investigate why B is guaranteed to be eventually nonnegative while C is not.
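The claims above are easy to check numerically. A minimal NumPy sketch, using only the three matrices just displayed, verifies that B and C are both square roots of A and that the negative entries of C recur in its powers:

```python
import numpy as np

A = np.array([[0, 1, 0],
              [0, 0, 1],
              [1, 0, 0]], dtype=float)
B = np.array([[0, 0, 1],
              [1, 0, 0],
              [0, 1, 0]], dtype=float)
C = np.array([[ 2,  2, -1],
              [-1,  2,  2],
              [ 2, -1,  2]]) / 3.0

assert np.allclose(B @ B, A)
assert np.allclose(C @ C, A)

# B is a permutation matrix, so every power of B is nonnegative.  C, however,
# satisfies C^6 = A^3 = I, so its powers cycle with period 6 and the negative
# entries of C recur forever: C is not eventually nonnegative.
assert np.allclose(np.linalg.matrix_power(C, 6), np.eye(3))
assert all((np.linalg.matrix_power(C, 6 * m + 1) < -1e-12).any() for m in range(1, 4))
```
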
3.3.1 Complete Residue Systems
Our discussion of complete residue systems begins with the following definition.
Definition 3.3.1. If R = {a_0, a_1, …, a_{h−1}} is a set of integers, then R is said to be a complete residue system (mod h), denoted CRS(h), if the map f : R → R(h) defined by
\[
a_i \longmapsto q_i := a_i \ (\mathrm{mod}\ h)
\]
is injective or, equivalently, surjective.

With the aforementioned definition, we readily glean the following equivalent properties if R is a CRS(h):
(P1) if i ≠ j, then q_i ≠ q_j;

(P2) if i ≠ j, then a_i ≢ a_j (mod h); and

(P3) for a ∈ Z, if a ≡ a_i (mod h), then a ≢ a_j (mod h) for all j ≠ i.
The next two number-theoretic results are well-known (see, e.g., [17, Theorem 3.3], [18,
Theorem 4.6 and Corollary 4.7] or [7, Theorems 54 and 55]).
Theorem 3.3.2. If gcd (k,m) = d, then ka ≡ kb (mod m) if and only if
a ≡ b (mod (m/d)) .
The following special case is particularly useful.
Corollary 3.3.3. If gcd (k,m) = 1, then ka ≡ kb (modm) if and only if a ≡ b (modm).
Theorem 3.3.4. If {a_i}_{i=0}^{h−1} is a CRS(h) and p ∈ Z, then {pa_i}_{i=0}^{h−1} is a CRS(h) if and only if gcd(h, p) = 1.
Remark 3.3.5. It is common to find only sufficiency in the literature (see [7, Theorem 56],
[17, Theorem 3.5], [18, Section 4.1, Exercise 10], [13, Theorem 20(i)], or [16, Theorem 62]),
however, as the following proof shows, necessity holds as well.
Proof. If gcd(h, p) = 1, then, following Corollary 3.3.3, pa_i ≡ pa_j (mod h) if and only if a_i ≡ a_j (mod h), which holds if and only if i = j.

Conversely, if gcd(h, p) = d > 1, then {a_i}_{i=0}^{h−1} is not a CRS(h/d). Thus, there exist distinct integers i and j such that a_i ≡ a_j (mod (h/d)), which holds, by Theorem 3.3.2, if and only if pa_i ≡ pa_j (mod h); i.e., {pa_i}_{i=0}^{h−1} is not a CRS(h).
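Theorem 3.3.4 can be exercised exhaustively for a small modulus. A short Python sketch (the helper is_crs is our own, not from the text):

```python
from math import gcd

def is_crs(nums, h):
    """True iff nums is a complete residue system mod h."""
    return sorted(n % h for n in nums) == list(range(h))

h = 12
R = list(range(h))                        # the canonical CRS(h)
for p in range(1, 2 * h):
    # Theorem 3.3.4: {p*a_i} is a CRS(h) exactly when gcd(h, p) = 1
    assert is_crs([p * a for a in R], h) == (gcd(h, p) == 1)
```
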
Application to Complete Sets of Roots of Unity
For h ∈ N, h > 1, let
\[
\omega = \omega_h := \exp(2\pi i/h) \in \mathbb{C},
\]
\[
\Omega_h := \{\omega^k\}_{k=0}^{h-1} = \{\omega_h^k\}_{k=0}^{h-1} \subseteq \mathbb{C}, \qquad (3.3.1)
\]
and
\[
\Omega_h^* := \left(1, \omega, \ldots, \omega^{h-1}\right) = \left(1, \omega_h, \ldots, \omega_h^{h-1}\right), \qquad (3.3.2)
\]
i.e., Ω_h = {z ∈ C : z^h − 1 = 0}.

With ω as defined above, note that
\[
\alpha \equiv \beta \ (\mathrm{mod}\ h) \implies \omega^\alpha = \omega^\beta \qquad (3.3.3)
\]
is easily established.
Theorem 3.3.6. If {a_i}_{i=0}^{h−1} ⊂ Z, then {ω^{a_i}}_{i=0}^{h−1} = Ω_h if and only if {a_i}_{i=0}^{h−1} is a CRS(h).

Proof. If r_i := a_i (mod h), then a_i ≡ r_i (mod h) and, following (3.3.3), ω^{a_i} = ω^{r_i}, and the conclusion follows.

Conversely, if {a_i}_{i=0}^{h−1} is not a CRS(h), then, with r_i := a_i (mod h), {r_i}_{i=0}^{h−1} ⊊ R(h) and {ω^{a_i}}_{i=0}^{h−1} ⊊ Ω_h.
The following is a corollary to Theorems 3.3.4 and 3.3.6.

Corollary 3.3.7. For p > 1, let Ω_h^p := {1, ω^p, …, ω^{(h−1)p}}. Then Ω_h^p = Ω_h if and only if gcd(p, h) = 1.
Lemma 3.3.8. If a, b, c, h ∈ Z, then a ≡ b (mod h) if and only if (a + c) ≡ (b + c) (mod h).

Proof. Note that
\[
a \equiv b \ (\mathrm{mod}\ h) \iff a - b = hk \ (k \in \mathbb{Z}) \iff (a+c) - (b+c) = hk \iff (a+c) \equiv (b+c) \ (\mathrm{mod}\ h).
\]
Corollary 3.3.9. If {a_i}_{i=0}^{h−1} is a CRS(h) and ℓ ∈ Z, then {pa_i + ℓ}_{i=0}^{h−1} is a CRS(h) if and only if gcd(h, p) = 1.
Lemma 3.3.10. If a, b, c, d, h ∈ Z, then (a + hc) ≡ (b + hd) (mod h) if and only if a ≡ b (mod h).

Proof. Note that
\[
a \equiv b \ (\mathrm{mod}\ h) \iff a - b = hk \ (k \in \mathbb{Z}) \iff (a + hc) - (b + hd) = h(k + c - d) \iff (a + hc) \equiv (b + hd) \ (\mathrm{mod}\ h).
\]
Corollary 3.3.11. If {a_i}_{i=0}^{h−1} ⊂ Z and {j_i}_{i=0}^{h−1} ⊂ Z, then {a_i}_{i=0}^{h−1} is a CRS(h) if and only if {a_i + hj_i}_{i=0}^{h−1} is a CRS(h).

Proof. If {a_i + hj_i}_{i=0}^{h−1} is not a CRS(h), then there exist distinct integers k, ℓ ∈ R(h) such that (a_k + hj_k) ≡ (a_ℓ + hj_ℓ) (mod h). Following Lemma 3.3.10, this holds if and only if a_k ≡ a_ℓ (mod h), i.e., if and only if {a_i}_{i=0}^{h−1} is not a CRS(h).
Theorem 3.3.12. If f_k denotes the (k + 1)st branch of the pth-root function as defined in (2.2.5),
\[
B := \{ j = (j_0, j_1, \ldots, j_{h-1}) : j_k \in R(p), \ \forall k \in R(h) \},
\]
and
\[
(\Omega_h)_j^{1/p} := \{ f_{j_k}(\omega^k) \}_{k=0}^{h-1}, \qquad j \in B,
\]
then there exists a unique j ∈ B such that (Ω_h)_j^{1/p} = Ω_h if and only if gcd(h, p) = 1.
Remark 3.3.13. For every k ∈ R(h) and j ∈ B, note that
\[
f_{j_k}(\omega^k) = \exp\left(\frac{2\pi k i}{hp}\right) \exp\left(\frac{2\pi j_k i}{p}\right) = \exp\left(\frac{2\pi [k + h j_k] i}{hp}\right) = \omega^{(k + h j_k)/p},
\]
so that
\[
(\Omega_h)_j^{1/p} = \left\{ \omega^{(k + h j_k)/p} \right\}_{k=0}^{h-1}.
\]
Proof. Existence. If gcd(h, p) = 1, then, following Corollary 3.3.9, for each k ∈ R(h), the set {k + hj}_{j=0}^{p−1} is a CRS(p); in particular, there exists j_k ∈ R(p) such that k + hj_k ≡ 0 (mod p), i.e., k + hj_k is divisible by p. Moreover, the set
\[
\left\{ \frac{k + h j_k}{p} \right\}_{k=0}^{h-1}
\]
is a CRS(h); indeed, following Corollary 3.3.3,
\[
\frac{i + h j_i}{p} \equiv \frac{\ell + h j_\ell}{p} \ (\mathrm{mod}\ h)
\]
if and only if (i + hj_i) ≡ (ℓ + hj_ℓ) (mod h), which holds, following Lemma 3.3.10, if and only if i ≡ ℓ (mod h). Since R(h) is the canonical CRS(h), it follows that i = ℓ, and the claim is established.

Uniqueness. Uniqueness follows readily by construction of j. Moreover, it should be clear that j_0 = 0, i.e., 1 must be chosen as the pth-root of 1.

For the converse, suppose gcd(h, p) = d > 1. We claim that for at least one k, there does not exist j_k ∈ R(p) such that k + hj_k ≡ 0 (mod p); for contradiction, suppose the assertion is false, so that, for each k, there exists j_k such that k + hj_k ≡ 0 (mod p), i.e., there exists an integer q_k such that k + hj_k = pq_k, or hj_k = pq_k − k, so that pq_k ≡ k (mod h). Because k ∈ R(h), the set {pq_k}_{k=0}^{h−1} is a CRS(h), contradicting, by Theorem 3.3.4, the assumption that gcd(h, p) > 1. Following Remark 3.3.13, at least one exponent in the set
\[
(\Omega_h)_j^{1/p} = \left\{ \omega^{(k + h j_k)/p} \right\}_{k=0}^{h-1}
\]
is not an integer, so that (Ω_h)_j^{1/p} ≠ Ω_h.
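The branch selection in the existence part of the proof can be carried out mechanically. A Python sketch (the helper root_branches is ours, illustrating the construction for one choice of h and p with gcd(h, p) = 1):

```python
from math import gcd
import cmath

def root_branches(h, p):
    """For gcd(h, p) = 1: the unique j_k in R(p) with k + h*j_k = 0 (mod p)."""
    assert gcd(h, p) == 1
    return [next(j for j in range(p) if (k + h * j) % p == 0) for k in range(h)]

h, p = 5, 3
w = cmath.exp(2j * cmath.pi / h)          # omega_h
js = root_branches(h, p)

# f_{j_k}(w^k) = w^{(k + h j_k)/p}; the exponents (k + h j_k)/p are integers
# forming a CRS(h), so these pth-roots are exactly the elements of Omega_h.
exps = [(k + h * jk) // p for k, jk in enumerate(js)]
assert sorted(e % h for e in exps) == list(range(h))
assert all(abs((w ** e) ** p - w ** k) < 1e-9 for k, e in enumerate(exps))
```
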
Remark 3.3.14. If j ∈ B is the unique h-tuple specified in Theorem 3.3.12, then
\[
\left\{ \frac{k + h j_k}{p} \right\}_{k=0}^{h-1}
\]
is a CRS(h); moreover, note that 0 ≤ (k + hj_k)/p ≤ h − 1. The lower bound is trivial, and the upper bound follows because, for any k,
\[
\frac{k + h j_k}{p} \le \frac{(h-1) + h(p-1)}{p} = \frac{hp - 1}{p} = h - \frac{1}{p} < h.
\]
Thus,
\[
\left\{ \frac{k + h j_k}{p} \right\}_{k=0}^{h-1} = R(h). \qquad (3.3.4)
\]
For j ∈ B, consider
\[
(\Omega_h^*)_j^{1/p} := \left( f_{j_0}(1), f_{j_1}(\omega), \ldots, f_{j_{h-1}}(\omega^{h-1}) \right)
= \left( 1, \omega^{(1 + h j_1)/p}, \ldots, \omega^{[(h-1) + h j_{h-1}]/p} \right).
\]
Following (3.3.4), (1, ω^{(1+hj_1)/p}, …, ω^{[(h−1)+hj_{h−1}]/p}) is a permutation of the elements of Ω_h. Moreover, following Lemma 3.3.10 and (3.3.3), note that for k ∈ {2, …, h − 1},
\[
\left( \omega^{(1 + h j_1)/p} \right)^k = \omega^{(k + h k j_1)/p} = \omega^{(k + h j_k)/p},
\]
thus
\[
(\Omega_h^*)_j^{1/p} = \left( 1, \omega^{(1 + h j_1)/p}, \ldots, \omega^{[(h-1) + h j_{h-1}]/p} \right)
= \left( 1, \left( \omega^{(1 + h j_1)/p} \right)^1, \ldots, \left( \omega^{(1 + h j_1)/p} \right)^{h-1} \right).
\]
Note that
\[
(\Omega_h^*)^q := \left( 1, \omega^q, \ldots, \omega^{q(h-1)} \right) = \left( 1, (\omega^q)^1, \ldots, (\omega^q)^{h-1} \right)
\]
is a permutation of the elements of Ω_h if and only if gcd(h, q) = 1; following (3.3.3), for every h there are ϕ(h) such permutations, where ϕ denotes Euler's totient (or phi) function. Thus, if gcd(h, p) = 1, then there exists q ∈ N with gcd(h, q) = 1 such that
\[
(\Omega_h^*)_j^{1/p} = (\Omega_h^*)^q. \qquad (3.3.5)
\]
Corollary 3.3.15. Let B be defined as in Theorem 3.3.12, and let (Ω_h)_j^{q/p} := (Ω_h^q)_j^{1/p} or (Ω_h)_j^{q/p} := ((Ω_h)_j^{1/p})^q. Then there exists a unique j ∈ B such that (Ω_h)_j^{q/p} = Ω_h if and only if gcd(h, p) = gcd(h, q) = 1.
3.3.2 Jordan Chains of h-cyclic matrices
Unless otherwise noted, in this subsection it is assumed that A ∈ M_n(C) is a nonsingular, h-cyclic matrix with ordered partition Π and that, without loss of generality, A has the form (1.3.2).
Lemma 3.3.16. For i, j ∈ Z, let α_{ij} := (i − j) (mod h).

1. If {x_{⟨0,j⟩}}_{j=1}^{r} is a right Jordan chain corresponding to λ ∈ σ(A), where, for j = 1, …, r, x_{⟨0,j⟩} is partitioned conformably with A as
\[
x_{\langle 0,j \rangle} = \begin{bmatrix} x_{1j} \\ x_{2j} \\ \vdots \\ x_{hj} \end{bmatrix},
\]
then, for k = 1, …, h − 1, the set
\[
\left\{ x_{\langle k,j \rangle} := \begin{bmatrix} (\omega^k)^{\alpha_{1j}} x_{1j} \\ (\omega^k)^{\alpha_{2j}} x_{2j} \\ \vdots \\ (\omega^k)^{\alpha_{hj}} x_{hj} \end{bmatrix} \right\}_{j=1}^{r}
\]
is a right Jordan chain corresponding to λω^k.

2. If {y_{⟨i,0⟩}}_{i=1}^{r} is a left Jordan chain corresponding to λ ∈ σ(A), where, for i = 1, …, r, y_{⟨i,0⟩} is partitioned conformably with A as
\[
y_{\langle i,0 \rangle}^T = \begin{bmatrix} y_{i1}^T & y_{i2}^T & \cdots & y_{ih}^T \end{bmatrix},
\]
then, for k = 1, …, h − 1, the set
\[
\left\{ y_{\langle i,k \rangle}^T := \begin{bmatrix} (\omega^k)^{\alpha_{i1}} y_{i1}^T & (\omega^k)^{\alpha_{i2}} y_{i2}^T & \cdots & (\omega^k)^{\alpha_{ih}} y_{ih}^T \end{bmatrix} \right\}_{i=1}^{r}
\]
is a left Jordan chain corresponding to λω^k.
Proof. The following facts are easily established:
\[
\alpha_{ij} = \alpha_{(i+1)(j+1)}, \quad \forall i, j \in \mathbb{Z}, \qquad (3.3.6)
\]
\[
\alpha_{(i+1)j} = \alpha_{i(j-1)} = (\alpha_{ij} + 1) \ (\mathrm{mod}\ h), \quad \forall i, j \in \mathbb{Z}, \qquad (3.3.7)
\]
\[
\alpha \equiv \beta \ (\mathrm{mod}\ h) \implies (\omega^k)^\alpha = (\omega^k)^\beta. \qquad (3.3.8)
\]
For convenience, let h + 1 := 1. As a consequence of the h-cyclic structure of A, for i = 1, …, h and j = 1, …, r, note that the ith component of the vector A x_{⟨0,j⟩} is given by
\[
\left[ A x_{\langle 0,j \rangle} \right]_i = A_{i(i+1)} x_{(i+1)j}. \qquad (3.3.9)
\]
By hypothesis,
\[
A x_{\langle 0,1 \rangle} = \lambda x_{\langle 0,1 \rangle}, \qquad (3.3.10)
\]
so that, following (3.3.7)–(3.3.10),
\[
\left[ A x_{\langle k,1 \rangle} \right]_i
= (\omega^k)^{\alpha_{(i+1)1}} A_{i(i+1)} x_{(i+1)1}
= \lambda (\omega^k)^{\alpha_{(i+1)1}} x_{i1}
= \lambda \omega^k (\omega^k)^{\alpha_{i1}} x_{i1}
= \left[ \lambda \omega^k x_{\langle k,1 \rangle} \right]_i,
\]
i.e., (λω^k, x_{⟨k,1⟩}) constitutes a right eigenpair for A.

By hypothesis, for j = 2, …, r,
\[
A x_{\langle 0,j \rangle} = x_{\langle 0,j-1 \rangle} + \lambda x_{\langle 0,j \rangle}, \qquad (3.3.11)
\]
hence, following (3.3.9) and (3.3.11),
\[
\left[ A x_{\langle 0,j \rangle} \right]_i = A_{i(i+1)} x_{(i+1)j} = x_{i(j-1)} + \lambda x_{ij}.
\]
Following (3.3.6)–(3.3.8), (3.3.9), and (3.3.11), for i = 1, …, h and j = 2, …, r,
\[
\left[ A x_{\langle k,j \rangle} \right]_i
= (\omega^k)^{\alpha_{(i+1)j}} A_{i(i+1)} x_{(i+1)j}
= (\omega^k)^{\alpha_{(i+1)j}} \left( x_{i(j-1)} + \lambda x_{ij} \right)
= (\omega^k)^{\alpha_{i(j-1)}} x_{i(j-1)} + \lambda \omega^k (\omega^k)^{\alpha_{ij}} x_{ij},
\]
i.e.,
\[
A x_{\langle k,j \rangle} = x_{\langle k,j-1 \rangle} + \lambda \omega^k x_{\langle k,j \rangle}.
\]
For the second assertion, first note that, as a consequence of the h-cyclic structure of A, for i = 1, …, r and j = 1, …, h, the jth component of the vector y_{⟨i,0⟩}^T A is given by
\[
\left[ y_{\langle i,0 \rangle}^T A \right]_j = y_{i(j-1)}^T A_{(j-1)j}. \qquad (3.3.12)
\]
By hypothesis,
\[
y_{\langle r,0 \rangle}^T A = \lambda y_{\langle r,0 \rangle}^T, \qquad (3.3.13)
\]
so that, following (3.3.6)–(3.3.8), (3.3.12), and (3.3.13), for k = 1, …, h − 1,
\[
\left[ y_{\langle r,k \rangle}^T A \right]_j
= (\omega^k)^{\alpha_{r(j-1)}} y_{r(j-1)}^T A_{(j-1)j}
= \lambda (\omega^k)^{\alpha_{r(j-1)}} y_{rj}^T
= \lambda \omega^k (\omega^k)^{\alpha_{rj}} y_{rj}^T
= \left[ \lambda \omega^k y_{\langle r,k \rangle}^T \right]_j,
\]
i.e., (λω^k, y_{⟨r,k⟩}) is a left eigenpair for A.

By hypothesis, for i = 1, …, r − 1,
\[
y_{\langle i,0 \rangle}^T A = \lambda y_{\langle i,0 \rangle}^T + y_{\langle i+1,0 \rangle}^T, \qquad (3.3.14)
\]
hence, following (3.3.12) and (3.3.14),
\[
\left[ y_{\langle i,0 \rangle}^T A \right]_j = y_{i(j-1)}^T A_{(j-1)j} = \lambda y_{ij}^T + y_{(i+1)j}^T.
\]
Following (3.3.6)–(3.3.8), (3.3.12), and (3.3.14), for i = 1, …, r − 1 and j = 1, …, h,
\[
\left[ y_{\langle i,k \rangle}^T A \right]_j
= (\omega^k)^{\alpha_{i(j-1)}} y_{i(j-1)}^T A_{(j-1)j}
= (\omega^k)^{\alpha_{i(j-1)}} \left( \lambda y_{ij}^T + y_{(i+1)j}^T \right)
= \lambda \omega^k (\omega^k)^{\alpha_{ij}} y_{ij}^T + (\omega^k)^{\alpha_{(i+1)j}} y_{(i+1)j}^T,
\]
i.e.,
\[
y_{\langle i,k \rangle}^T A = \lambda \omega^k y_{\langle i,k \rangle}^T + y_{\langle i+1,k \rangle}^T,
\]
and the proof is complete.
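Part 1 of Lemma 3.3.16 (with r = 1 and 1 × 1 blocks) can be illustrated with the 3-cyclic permutation matrix from the start of this section. A NumPy sketch, twisting the Perron eigenvector by powers of ω:

```python
import numpy as np

h = 3
w = np.exp(2j * np.pi / h)                 # omega_h
# A 3-cyclic matrix of the form (1.3.2) with blocks A_12 = A_23 = A_31 = [1]
A = np.array([[0, 1, 0],
              [0, 0, 1],
              [1, 0, 0]], dtype=complex)

x = np.ones(3, dtype=complex)              # right eigenvector for lambda = 1
alpha = lambda i, j: (i - j) % h           # alpha_{ij} of Lemma 3.3.16

for k in range(1, h):
    # twist block i of x by (w^k)^{alpha_{i1}} (every block has size 1 here)
    xk = np.array([w ** (k * alpha(i, 1)) * x[i - 1] for i in range(1, h + 1)])
    # (lambda * w^k, x_{<k,1>}) is a right eigenpair of A
    assert np.allclose(A @ xk, (w ** k) * xk)
```
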
Corollary 3.3.17. If J_r(λ) is a Jordan block of J, then J_r(λω^k) is a Jordan block of J for k = 1, …, h − 1. In particular, if λ ∈ σ(A), then λω^k ∈ σ(A) for k = 1, …, h − 1.
Following (3.3.8), note that, for α ∈ Z,
\[
(\omega^k)^\alpha = (\omega^k)^{\alpha \ (\mathrm{mod}\ h)}. \qquad (3.3.15)
\]
For ℓ = 1, …, r,
\[
\omega^k \begin{bmatrix} (\omega^k)^{\alpha_{1\ell}} \\ (\omega^k)^{\alpha_{2\ell}} \\ \vdots \\ (\omega^k)^{\alpha_{h\ell}} \end{bmatrix}
\begin{bmatrix} (\omega^k)^{\alpha_{\ell 1}} & (\omega^k)^{\alpha_{\ell 2}} & \cdots & (\omega^k)^{\alpha_{\ell h}} \end{bmatrix}
= \omega^k \left[ (\omega^k)^{\alpha_{i\ell} + \alpha_{\ell j}} \right]_{i,j=1}^{h}.
\]
Following (3.3.15), note that
\[
(\omega^k)^{\alpha_{i\ell} + \alpha_{\ell j}}
= (\omega^k)^{(\alpha_{i\ell} + \alpha_{\ell j}) \ (\mathrm{mod}\ h)}
= (\omega^k)^{[(i-\ell)(\mathrm{mod}\ h) + (\ell - j)(\mathrm{mod}\ h)](\mathrm{mod}\ h)}
= (\omega^k)^{[(i-\ell)+(\ell-j)](\mathrm{mod}\ h)}
= (\omega^k)^{(i-j)(\mathrm{mod}\ h)}
= (\omega^k)^{\alpha_{ij}},
\]
hence
\[
\omega^k \begin{bmatrix} (\omega^k)^{\alpha_{1\ell}} \\ \vdots \\ (\omega^k)^{\alpha_{h\ell}} \end{bmatrix}
\begin{bmatrix} (\omega^k)^{\alpha_{\ell 1}} & \cdots & (\omega^k)^{\alpha_{\ell h}} \end{bmatrix}
= \omega^k \left[ (\omega^k)^{\alpha_{ij}} \right]_{i,j=1}^{h}
= \left[ (\omega^k)^{\alpha_{ij} + 1} \right]_{i,j=1}^{h}.
\]
Following (3.3.7) and (3.3.8), (ω^k)^{α_{ij}+1} = (ω^k)^{(α_{ij}+1)(mod h)} = (ω^k)^{α_{(i+1)j}}. Thus
\[
\left[ (\omega^k)^{\alpha_{(i+1)j}} \right]_{i,j=1}^{h} =
\begin{bmatrix}
\omega^k & 1 & (\omega^k)^{h-1} & \cdots & (\omega^k)^2 \\
(\omega^k)^2 & \omega^k & 1 & \cdots & (\omega^k)^3 \\
\vdots & \ddots & \ddots & \ddots & \vdots \\
(\omega^k)^{h-1} & \cdots & (\omega^k)^2 & \omega^k & 1 \\
1 & (\omega^k)^{h-1} & \cdots & (\omega^k)^2 & \omega^k
\end{bmatrix}
= \mathrm{circ}\left( \omega^k, 1, (\omega^k)^{h-1}, \ldots, (\omega^k)^2 \right).
\]
Because ω^ℓ is an hth-root of unity for ℓ ∈ Z, note that, for ℓ = 1, …, h − 1,
\[
\sum_{k=0}^{h-1} (\omega^k)^\ell = \sum_{k=0}^{h-1} (\omega^\ell)^k = \frac{(\omega^\ell)^h - 1}{\omega^\ell - 1} = 0.
\]
Thus,
\[
\sum_{k=0}^{h-1} \mathrm{circ}\left( \omega^k, 1, (\omega^k)^{h-1}, \ldots, (\omega^k)^2 \right)
= h \cdot \mathrm{circ}\big( \overbrace{0, 1, 0, \ldots, 0}^{h} \big). \qquad (3.3.16)
\]
For ℓ = 1, …, r − 1,
\[
\begin{bmatrix} (\omega^k)^{\alpha_{1\ell}} \\ (\omega^k)^{\alpha_{2\ell}} \\ \vdots \\ (\omega^k)^{\alpha_{h\ell}} \end{bmatrix}
\begin{bmatrix} (\omega^k)^{\alpha_{(\ell+1)1}} & (\omega^k)^{\alpha_{(\ell+1)2}} & \cdots & (\omega^k)^{\alpha_{(\ell+1)h}} \end{bmatrix}
= \left[ (\omega^k)^{\alpha_{i\ell} + \alpha_{(\ell+1)j}} \right]_{i,j=1}^{h}.
\]
Following (3.3.15),
\[
(\omega^k)^{\alpha_{i\ell} + \alpha_{(\ell+1)j}}
= (\omega^k)^{[\alpha_{i\ell} + \alpha_{(\ell+1)j}](\mathrm{mod}\ h)}
= (\omega^k)^{[(i-\ell)(\mathrm{mod}\ h) + (\ell+1-j)(\mathrm{mod}\ h)](\mathrm{mod}\ h)}
= (\omega^k)^{[(i+1)-j](\mathrm{mod}\ h)}
= (\omega^k)^{\alpha_{(i+1)j}},
\]
hence
\[
\begin{bmatrix} (\omega^k)^{\alpha_{1\ell}} \\ \vdots \\ (\omega^k)^{\alpha_{h\ell}} \end{bmatrix}
\begin{bmatrix} (\omega^k)^{\alpha_{(\ell+1)1}} & \cdots & (\omega^k)^{\alpha_{(\ell+1)h}} \end{bmatrix}
= \mathrm{circ}\left( \omega^k, 1, (\omega^k)^{h-1}, \ldots, (\omega^k)^2 \right).
\]
With Ω*_h as defined in (3.3.2), let λΩ*_h := (λ, λω, …, λω^{h−1}) and
\[
J(\lambda \Omega_h^*, r) := \begin{bmatrix}
J_r(\lambda) & & & \\
& J_r(\lambda\omega) & & \\
& & \ddots & \\
& & & J_r(\lambda\omega^{h-1})
\end{bmatrix} \in M_{rh}(\mathbb{C}).
\]
Thus, the Jordan form of a nonsingular h-cyclic matrix A is of the form
\[
Z^{-1} A Z = J = \bigoplus_{i=1}^{t} J(\lambda_i \Omega_h^*, r_i),
\]
and the Jordan chains comprising the matrix Z may be selected as in Lemma 3.3.16; i.e., if {x_{⟨0,j⟩}}_{j=1}^{r} is a right Jordan chain and {y_{⟨i,0⟩}}_{i=1}^{r} is a left Jordan chain for λ, then {x_{⟨k,j⟩}}_{j=1}^{r} is a right Jordan chain and {y_{⟨i,k⟩}}_{i=1}^{r} is a left Jordan chain for λω^k, k = 1, …, h − 1.
Lemma 3.3.18. For i = 1, …, t, if
\[
A_i := Z \,\mathrm{diag}\Big( 0, \ldots, 0, \overbrace{J(\lambda_i \Omega_h^*, r_i)}^{i}, 0, \ldots, 0 \Big)\, Z^{-1} \in M_n(\mathbb{C}), \qquad (3.3.17)
\]
then Γ(A_i) ⊆ Γ(χ_Π), A_i commutes with A, and A_i A_j = A_j A_i = 0 for j ∈ {1, …, t}, i ≠ j.
Proof. Note that
\[
A_i = \sum_{k=0}^{h-1} \left( \sum_{j=1}^{r_i} \lambda \omega^k\, x_{\langle k,j \rangle} y_{\langle j,k \rangle}^T + \sum_{j=1}^{r_i - 1} x_{\langle k,j \rangle} y_{\langle j+1,k \rangle}^T \right).
\]
Writing each outer product blockwise, the (p, q) block of x_{⟨k,j⟩} y_{⟨j,k⟩}^T is (ω^k)^{α_{pj} + α_{jq}} x_{pj} y_{jq}^T, and the (p, q) block of x_{⟨k,j⟩} y_{⟨j+1,k⟩}^T is (ω^k)^{α_{pj} + α_{(j+1)q}} x_{pj} y_{(j+1)q}^T. Hence, by the two circulant identities established above, the scalar multiplying block (p, q) is the (p, q) entry of circ(ω^k, 1, (ω^k)^{h−1}, …, (ω^k)^2), and interchanging the order of summation yields
\[
A_i = \lambda \sum_{j=1}^{r_i} \left( \sum_{k=0}^{h-1} \mathrm{circ}\big( \omega^k, 1, (\omega^k)^{h-1}, \ldots, (\omega^k)^2 \big) \right) \circ \left[ x_{pj} y_{jq}^T \right]_{p,q=1}^{h}
+ \sum_{j=1}^{r_i - 1} \left( \sum_{k=0}^{h-1} \mathrm{circ}\big( \omega^k, 1, (\omega^k)^{h-1}, \ldots, (\omega^k)^2 \big) \right) \circ \left[ x_{pj} y_{(j+1)q}^T \right]_{p,q=1}^{h},
\]
where ∘ denotes this blockwise scaling. Following (3.3.16),
\[
A_i = \lambda \sum_{j=1}^{r_i} \mathrm{circ}\big( \overbrace{0, h, 0, \ldots, 0}^{h} \big) \circ \left[ x_{pj} y_{jq}^T \right]_{p,q=1}^{h}
+ \sum_{j=1}^{r_i - 1} \mathrm{circ}\big( \overbrace{0, h, 0, \ldots, 0}^{h} \big) \circ \left[ x_{pj} y_{(j+1)q}^T \right]_{p,q=1}^{h}
= \begin{bmatrix}
0 & A_{12}^{(i)} & & & \\
& 0 & A_{23}^{(i)} & & \\
& & \ddots & \ddots & \\
& & & 0 & A_{(h-1)h}^{(i)} \\
A_{h1}^{(i)} & & & & 0
\end{bmatrix},
\]
where, for ℓ = 1, …, h (with h + 1 := 1),
\[
A_{\ell(\ell+1)}^{(i)} := \lambda \sum_{j=1}^{r_i} h\, x_{\ell j}\, y_{j(\ell+1)}^T + \sum_{j=1}^{r_i - 1} h\, x_{\ell j}\, y_{(j+1)(\ell+1)}^T. \qquad (3.3.18)
\]
The conclusion that Γ(A_i) ⊆ Γ(χ_Π) follows because {x_{⟨0,j⟩}}_{j=1}^{r_i} and {y_{⟨j,0⟩}}_{j=1}^{r_i} are partitioned conformably with χ_Π; the matrices A_i and A commute because they are simultaneously triangularizable; and A_i A_j = A_j A_i = 0, for i ≠ j, by construction of the matrices A_i and A_j.
Corollary 3.3.19. If $r = 1$ and $\{x_{\langle 0,j\rangle}\}_{j=1}^{r}$ and $\{y_{\langle i,0\rangle}\}_{i=1}^{r}$ are strictly nonzero Jordan chains and A has cyclic index h, then $A_i$ has cyclic index h, $\Gamma(A_i) = \Gamma(\chi_\Pi)$, and $A_i$ commutes with A.
As a special case, we examine the following: let A be a nonnegative, irreducible, imprimitive, nonsingular matrix with index of cyclicity h and assume, without loss of generality, that $\rho(A) = 1$. Following Theorem 2.1.3 and Corollary 3.3.17, note that the Jordan form of A is
$$Z^{-1}AZ = \operatorname{diag}\big(J(\Omega_h^*, 1),\, J(\lambda_2\Omega_h^*, r_2),\, \dots,\, J(\lambda_t\Omega_h^*, r_t)\big).$$
Consider the matrix
$$A_1 := Z \operatorname{diag}\big(J(\Omega_h^*, 1),\, 0\big) Z^{-1}.$$
Following Corollary 3.3.19, $\Gamma(A_1) = \Gamma(\chi_\Pi)$ and $AA_1 = A_1A$. Moreover, following Theorem 2.1.3, there exist positive vectors x and y such that $Ax = x$ and $y^TA = y^T$. If x and y are partitioned conformably with A as
$$x = \begin{bmatrix} x_1 \\ x_2 \\ \vdots \\ x_h \end{bmatrix} \quad \text{and} \quad y^T = \begin{bmatrix} y_1^T & y_2^T & \cdots & y_h^T \end{bmatrix},$$
then, following (3.3.18),
$$A_1 = h \begin{bmatrix}
0 & x_1y_2^T & \cdots & \cdots & 0 \\
0 & 0 & x_2y_3^T & \cdots & 0 \\
\vdots & \vdots & \ddots & \ddots & \vdots \\
0 & 0 & \cdots & 0 & x_{h-1}y_h^T \\
x_hy_1^T & 0 & \cdots & 0 & 0
\end{bmatrix} \geq 0.$$
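As a numerical sanity check of the computation above (our own illustration; the row-stochastic blocks B and C below are arbitrary choices, not from the text), the following sketch builds a 2-cyclic nonnegative matrix with ρ(A) = 1, extracts the positive right and left Perron vectors x and y, and verifies that the peripheral spectral part of A agrees with the block formula for A₁ and is nonnegative.

```python
import numpy as np

# Build a 2-cyclic, row-stochastic (hence rho(A) = 1) nonnegative matrix.
B = np.array([[0.5, 0.5], [0.25, 0.75]])
C = np.array([[0.6, 0.4], [0.3, 0.7]])
A = np.block([[np.zeros((2, 2)), B],
              [C, np.zeros((2, 2))]])
h = 2

# Right Perron vector x (row-stochastic, so x = ones) and left Perron
# vector y, normalized so that y @ x = 1.
x = np.ones(4)
w, V = np.linalg.eig(A.T)
y = np.real(V[:, np.argmin(np.abs(w - 1))])
y = y / (y @ x)

# Peripheral spectral part: for h = 2 the eigenpairs for -1 are
# (x1; -x2) and (y1; -y2), with the same normalization.
s = np.array([1, 1, -1, -1])
A1_spectral = np.outer(x, y) - np.outer(s * x, s * y)

# The block formula A1 = h * [[0, x1 y2^T], [x2 y1^T, 0]] from (3.3.18):
A1_formula = h * np.block([[np.zeros((2, 2)), np.outer(x[:2], y[2:])],
                           [np.outer(x[2:], y[:2]), np.zeros((2, 2))]])

assert np.allclose(A1_spectral, A1_formula)
assert np.all(A1_formula >= 0)   # A1 is nonnegative, as claimed
```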
3.3.3 Main Results
Unless otherwise noted, we assume $A \in M_n(\mathbb{R})$ is a nonnegative, nonsingular, irreducible, imprimitive matrix with $\rho(A) = 1$. Before we state our main results on such matrices, we introduce some concepts and notation pertinent to the discussion: following Friedland [5], for a (multi-)set $\sigma = \{\lambda_i\}_{i=1}^{n} \subseteq \mathbb{C}$, let $\rho = \rho(\sigma) := \max_i |\lambda_i|$ and $\bar{\sigma} := \{\bar{\lambda}_i\}_{i=1}^{n} \subseteq \mathbb{C}$. If $\sigma = \bar{\sigma}$, we say that $\sigma$ is self-conjugate. Clearly, $\sigma$ is self-conjugate if and only if $\bar{\sigma}$ is self-conjugate. The (multi-)set $\sigma$ is said to be a Frobenius (multi-)set if, for some positive integer $h \leq n$, the following properties hold:

(i) $\rho(\sigma) > 0$;

(ii) $\sigma \cap \{z \in \mathbb{C} : |z| = \rho(\sigma)\} = \rho\,\Omega_h$; and

(iii) $\sigma = \omega\sigma$, i.e., $\sigma$ is invariant under rotation by the angle $2\pi/h$.

Clearly, the set $\Omega_h$ as defined in (3.3.1) is a self-conjugate Frobenius set. Let $\lambda = r\exp(i\theta) \in \mathbb{C}$, $\Im(\lambda) \neq 0$, $\lambda\Omega_h := \{\lambda, \lambda\omega_h, \dots, \lambda\omega_h^{h-1}\}$, $\varphi = 2\pi k/h$, and assume $\gcd(h, p) = 1$.
With $f_j$ as defined in (2.2.5), note that
$$
\begin{aligned}
f_i(\lambda)\, f_{j_k}(\omega_h^k) &= r^{1/p} \exp\big(i[\theta + 2\pi i]/p\big) \exp\big(i[\varphi + 2\pi j_k]/p\big) \\
&= r^{1/p} \exp\big(i[(\theta + \varphi) + 2\pi(i + j_k)]/p\big) \\
&= r^{1/p} \exp\big(i[\theta + \varphi]/p\big) \exp\big(i[2\pi(i + j_k)]/p\big) \\
&= r^{1/p} \exp\big(i[\theta + \varphi]/p\big)\, \omega_p^{\,i + j_k} \\
&= r^{1/p} \exp\big(i[\theta + \varphi]/p\big)\, \omega_p^{\,(i + j_k) \bmod p} && \text{(by (3.3.8))} \\
&= r^{1/p} \exp\big(i[\theta + \varphi]/p\big) \exp\big(i\,2\pi[(i + j_k) \bmod p]/p\big) \\
&= r^{1/p} \exp\big(i[(\theta + \varphi) + 2\pi((i + j_k) \bmod p)]/p\big) = f_{j_k^{(i)}}(\tilde{\lambda}),
\end{aligned} \tag{3.3.19}
$$
where $\tilde{\lambda} = r\exp(i[\theta + \varphi])$ and $j_k^{(i)} = (i + j_k) \bmod p \in \{0, 1, \dots, p-1\}$.
Hence, following Theorem 3.3.12, there exists a unique $j \in B$ such that, for all $i \in R(p)$, the set
$$f_i(\lambda)\,(\Omega_h)^{1/p}_j := \big\{ f_i(\lambda) f_{j_0}(1),\, f_i(\lambda) f_{j_1}(\omega_h),\, \dots,\, f_i(\lambda) f_{j_{h-1}}(\omega_h^{h-1}) \big\} \tag{3.3.20}$$
is a self-conjugate Frobenius set; moreover, following the analysis leading up to (3.3.19), there exists $j^{(i)} = \big(j^{(i)}_0, j^{(i)}_1, \dots, j^{(i)}_{h-1}\big) \in B$ such that
$$f_i(\lambda)\,(\Omega_h)^{1/p}_j = \big\{ f_{j^{(i)}_0}(\lambda),\, f_{j^{(i)}_1}(\lambda\omega_h),\, \dots,\, f_{j^{(i)}_{h-1}}(\lambda\omega_h^{h-1}) \big\}. \tag{3.3.21}$$
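The identity (3.3.19) can be spot-checked numerically. In the sketch below we assume the branch definition $f_j(z) = |z|^{1/p}\exp(i(\arg z + 2\pi j)/p)$ with the argument taken in $[0, 2\pi)$, which is our reading of (2.2.5); the particular values of p, h, and λ are our own arbitrary choices, selected so that $\theta + \varphi < 2\pi$.

```python
import numpy as np

p, h = 3, 5
omega_h = np.exp(2j * np.pi / h)

def f(j, z):
    # Assumed branch of the pth root: f_j(z) = |z|^(1/p) e^{i(arg z + 2 pi j)/p},
    # with arg z taken in [0, 2*pi).
    theta = np.angle(z) % (2 * np.pi)
    return abs(z) ** (1 / p) * np.exp(1j * (theta + 2 * np.pi * j) / p)

lam = 1.3 * np.exp(0.7j)   # lambda = r e^{i theta}, theta + phi stays below 2 pi
k = 1
for i in range(p):
    for jk in range(p):
        lhs = f(i, lam) * f(jk, omega_h ** k)
        rhs = f((i + jk) % p, lam * omega_h ** k)   # branch (i + j_k) mod p
        assert np.isclose(lhs, rhs)
```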
The following important lemma was introduced and stated without proof in [5, §4, Lemma
1] and proven rigorously in [25, Theorem 3.1].
Lemma 3.3.20. Let A be an eventually nonnegative matrix. If A is not nilpotent, then the
spectrum of A is a union of self-conjugate Frobenius sets.
As a consequence of Theorem 3.3.12 and Lemma 3.3.20, it should be clear that A cannot possess an eventually nonnegative matrix pth-root if $\gcd(h, p) \neq 1$; however, more can be ascertained.
Partition the Jordan form of A as
$$\operatorname{diag}(J_+,\, J_-,\, J_C) \tag{3.3.22}$$
where:

(1) $J(\lambda\Omega_h^*, r)$ is a submatrix of $J_+$ if and only if $\sigma(J(\lambda\Omega_h^*, r)) \cap \mathbb{R}_+ \neq \emptyset$ and $\sigma(J(\lambda\Omega_h^*, r)) \cap \mathbb{R}_- = \emptyset$;

(2) $J(\lambda\Omega_h^*, r)$ is a submatrix of $J_-$ if and only if $\sigma(J(\lambda\Omega_h^*, r)) \cap \mathbb{R}_- \neq \emptyset$; and

(3) $J(\lambda\Omega_h^*, r)$ is a submatrix of $J_C$ if and only if $\sigma(J(\lambda\Omega_h^*, r)) \cap \mathbb{R} = \emptyset$.

Suppose $J_+$ has $r_1$ distinct blocks of the form $J(\lambda\Omega_h^*, r)$, $J_-$ has $r_2$ distinct blocks of the form $J(\lambda\Omega_h^*, r)$, and $J_C$ has $c$ distinct blocks of the form $J(\lambda\Omega_h^*, r)$.
We are now ready to present our main results.
Theorem 3.3.21. Let $A \in M_n(\mathbb{R})$ and suppose that $A \geq 0$ is nonsingular, irreducible, and imprimitive. Let $h > 1$ be the cyclic index of A, and suppose that Π describes the h-cyclic structure of A, the Jordan form of A is partitioned as in (3.3.22), and $\gcd(h, p) = 1$. If p is even, then A has (a) $2^{r_1-1}p^c$ eventually nonnegative primary pth-roots when $r_2 = 0$; and (b) no eventually nonnegative primary pth-roots when $r_2 > 0$. If p is odd, then A has $p^c$ eventually nonnegative primary pth-roots.
Proof. For $\lambda \in \mathbb{C}$, $\lambda \neq 0$, and $j = (j_0, j_1, \dots, j_{h-1}) \in B$, let
$$F_j(J(\lambda\Omega_h^*, r)) := \operatorname{diag}\big(f_{j_0}(J_r(\lambda)),\, f_{j_1}(J_r(\lambda\omega)),\, \dots,\, f_{j_{h-1}}(J_r(\lambda\omega^{h-1}))\big).$$
We form an eventually nonnegative root X by carefully selecting a root for each submatrix
of every block appearing in (3.3.22).
Case 1: p is even. We consider the blocks appearing in (3.3.22):

(i) $J_+$: Following Theorem 2.1.3, $J(\Omega_h^*, 1)$ is a submatrix of $J_+$ (note that h must be odd) and, following Theorem 3.3.12, there exists $j = (j_0, j_1, \dots, j_{h-1}) \in B$ such that $\sigma(F_j(J(\Omega_h^*, 1))) = \Omega_h$. With that specific choice of j, it is clear that $\rho(F_j(J(\Omega_h^*, 1))) = 1$.

For any other submatrix $J(\lambda\Omega_h^*, r)$ of $J_+$, without loss of generality, we may assume that $\lambda \in \mathbb{R}_+$. For every such λ, k must be chosen such that $f_k(\lambda)$ is real (see Corollary 2.2.21), and $f_k(\lambda)$ is real if $k = 0$ or $k = p/2$; hence, there are two choices such that $f_k(\lambda)$ is real and, following (3.3.20) and (3.3.21), two choices such that $\sigma\big(F_{j^{(k)}}(J(\lambda\Omega_h^*, r))\big)$ is a self-conjugate Frobenius set.
(ii) $J_-$: If $r_2 > 0$, then A does not have a real primary root so that, a fortiori, it cannot have an eventually nonnegative primary root.
(iii) $J_C$: For any submatrix $J(\lambda\Omega_h^*, r)$ of $J_C$, following (3.3.20) and (3.3.21), for the same j chosen for $F_j(J(\Omega_h^*, 1))$ in (i), there exists
$$j^{(k)} = \big(j^{(k)}_0, j^{(k)}_1, \dots, j^{(k)}_{h-1}\big) \in B$$
such that $\sigma\big(F_{j^{(k)}}(J(\lambda\Omega_h^*, r))\big)$ is a self-conjugate Frobenius set for all $k \in R(p)$. Thus, there are $p^c$ ways to choose roots for blocks in $J_C$.
Following the analysis contained in (i)–(iii), note that there are $2^{r_1-1}p^c$ ways to form a root in this manner.
Case 2: p is odd. Following Theorem 3.3.12 and properties of the pth-root function, if p is odd, then for any submatrix $J(\lambda\Omega_h^*, r)$ of $J_+$ or $J_-$, there is only one choice $j \in B$ such that $\sigma(F_j(J(\lambda\Omega_h^*, r)))$ is a self-conjugate Frobenius set. For submatrices $J(\lambda\Omega_h^*, r)$ of $J_C$, the analysis is the same as in (iii). In this manner, there are $p^c$ possible selections.
Partition the Jordan form of A as $\operatorname{diag}\big(J(\Omega_h^*, 1),\, \tilde{J}\big)$. With the above partition in mind, consider the matrix pth-root of A given by
$$X := Z \operatorname{diag}\big(F_j(J(\Omega_h^*, 1)),\, F(\tilde{J})\big) Z^{-1},$$
where j is selected as in Theorem 3.3.12 and $F(\tilde{J})$ is a pth-root of $\tilde{J}$ containing blocks of the form $F_{j^{(k)}}(J(\lambda\Omega_h^*, r))$, chosen as in Case 1 or Case 2. Because every power of a reducible matrix is reducible, note that X must be irreducible. If $\hat{h}$ is the cyclic index of X, then $1 \leq \hat{h} \leq h$ and $\hat{h}$ must divide h (if $\hat{h} > h$, then X would have $\hat{h}$ eigenvalues of maximum modulus so that A would have $\hat{h}$ eigenvalues of maximum modulus, contradicting the maximality of h). However, we claim that $\hat{h} = h$. For contradiction, assume $\hat{h} < h$ and consider the matrix $A_1$ given by
$$A_1 = Z \operatorname{diag}\big(J(\Omega_h^*, 1),\, 0\big) Z^{-1}.$$
Following Corollary 3.3.19, $A_1 \geq 0$ is irreducible and h-cyclic, and Π describes the h-cyclic structure of $A_1$. Next, consider the matrix $X_1$ given by
$$X_1 := Z \operatorname{diag}\big(F_j(J(\Omega_h^*, 1)),\, 0\big) Z^{-1},$$
which is a matrix pth-root of $A_1$. Following Corollary 3.3.19, the cyclic index of $X_1$ is h. However, following the remarks leading up to (3.3.5), there exists $q \in \mathbb{N}$ relatively prime to h such that $X_1 = A_1^q \geq 0$ (along with $X_1$ being a pth-root of $A_1$, this implies $X_1 = X_1^{pq}$). Thus, $X_1$ is a nonnegative, irreducible, h-cyclic matrix, contradicting the assumption that $\hat{h} < h$. Hence, $\hat{h} = h$ and X is h-cyclic. Before we continue with the proof, note that $\Gamma(X) \subseteq \Gamma(X_1) = \Gamma(\chi_\Pi^q)$.
Following Theorem 2.2.5, there exists $\widetilde{Z} \in M_n(\mathbb{C})$ such that
$$\widetilde{Z}^{-1}X\widetilde{Z} = \operatorname{diag}\Big(F_j(J(\Omega_h^*, 1)),\, J\big(f_{j_2}(\lambda_2)(\Omega_h^*)^{1/p}_j, r_2\big),\, \dots,\, J\big(f_{j_t}(\lambda_t)(\Omega_h^*)^{1/p}_j, r_t\big)\Big).$$
By construction of $\widetilde{Z}$, note that
$$X_1 = \widetilde{Z} \operatorname{diag}\big(F_j(J(\Omega_h^*, 1)),\, 0\big) \widetilde{Z}^{-1} = Z \operatorname{diag}\big(F_j(J(\Omega_h^*, 1)),\, 0\big) Z^{-1}.$$
Let
$$X_2 := \widetilde{Z} \operatorname{diag}\Big(0,\, J\big(f_{j_2}(\lambda_2)(\Omega_h^*)^{1/p}_j, r_2\big),\, \dots,\, J\big(f_{j_t}(\lambda_t)(\Omega_h^*)^{1/p}_j, r_t\big)\Big) \widetilde{Z}^{-1}.$$
Following Lemma 3.3.18, $X_2$ is h-cyclic and $\Gamma(X_2) \subseteq \Gamma(\chi_\Pi^q)$; in addition, $X_1X_2 = X_2X_1 = 0$. Thus, for $k \in \mathbb{N}$, $X^k = X_1^k + X_2^k$. Because $\rho(X_2) < 1$, $\lim_{k\to\infty} X_2^k = 0$. The matrices $X_1$ and $X_2$ are h-cyclic, $\operatorname{rank}(X_1^2) = \operatorname{rank}(X_1)$, $\operatorname{rank}(X_2^2) = \operatorname{rank}(X_2)$, and $\Gamma(X_2) \subseteq \Gamma(X_1) = \Gamma(\chi_\Pi^q)$; thus, following [10, Theorem 2.7], note that $\Gamma(X_2^k) \subseteq \Gamma(X_1^k) = \Gamma(\chi_\Pi^{qk})$. Thus, there exists $\hat{p} \in \mathbb{N}$ such that $X_1^k > X_2^k$ for all $k \geq \hat{p}$, i.e., X is eventually nonnegative.
Remark 3.3.22. Following the notation of Theorem 3.3.21, note that all primary roots of A are given by
$$X = Z \operatorname{diag}\big(F_{j_1}(J(\Omega_h^*, 1)),\, F_{j_2}(J(\lambda_2\Omega_h^*, r_2)),\, \dots,\, F_{j_t}(J(\lambda_t\Omega_h^*, r_t))\big) Z^{-1},$$
where $j_k \in B$ for $k = 1, \dots, t$, subject to the constraint that $j_k = j_\ell$ (because $j_k$ and $j_\ell$ are ordered p-tuples, equality is meant entrywise) if $\lambda_k = \lambda_\ell$.
If A is derogatory, i.e., some eigenvalue appears in more than one Jordan block in the Jordan form of A, then A has additional roots of the form
$$X(U) = ZU \operatorname{diag}\big(F_{j_1}(J(\Omega_h^*, 1)),\, F_{j_2}(J(\lambda_2\Omega_h^*, r_2)),\, \dots,\, F_{j_t}(J(\lambda_t\Omega_h^*, r_t))\big) U^{-1}Z^{-1},$$
where U is any matrix that commutes with J, subject to the constraint that $j_k \neq j_\ell$ if $\lambda_k = \lambda_\ell$. Select branches $j_1, j_2, \dots, j_t$ following the proof of Theorem 3.3.21. Following [15, Theorem 1, §12.4], note that the matrix U must be of the form
$$U = \operatorname{diag}(u_1, u_2, \dots, u_h) \oplus \widetilde{U},$$
and X(U) is real if U is selected to be real. The matrix X(U) is irreducible because $X(U)^p = A$. Moreover,
$$X_1(U) := ZU \operatorname{diag}\big(F_{j_1}(J(\Omega_h^*, 1)),\, 0\big) U^{-1}Z^{-1} = X_1,$$
where X1 is defined as in the proof of Theorem 3.3.21. Thus, X1(U) is a nonnegative,
irreducible, h-cyclic matrix that commutes with X(U), so the argument demonstrating the
eventual nonnegativity of X is also valid for X(U).
Corollary 3.3.23. Let $A \in M_n(\mathbb{R})$ and suppose that $A \geq 0$ is irreducible and imprimitive with index of cyclicity h. Then A possesses an eventually nonnegative pth-root if and only if A possesses a real root and $\gcd(h, p) = 1$.
Theorem 3.3.24. Let $A \in M_n(\mathbb{R})$ and suppose that $A \geq 0$ is irreducible and imprimitive. If $Z^{-1}AZ = J = J_0 \oplus J_1$ is a Jordan canonical form of A, where $J_0$ collects all the singular Jordan blocks and $J_1$ collects the remaining Jordan blocks, and A possesses a real root, then all eventually nonnegative pth-roots of A are given by $X = Z(X_0 \oplus X_1)Z^{-1}$, where $X_1$ is any pth-root of $J_1$, characterized by Theorem 3.3.21 or Remark 3.3.22, and $X_0$ is a real pth-root of $J_0$.
Although the following result is known (see [10, Algorithm 3.1 and its proof]), our work
provides another proof.
Corollary 3.3.25. If $A \in M_n(\mathbb{R})$ is irreducible with index of cyclicity h, where $h > 1$, and Π describes the h-cyclic structure of A, then A is eventually nonnegative if and only if there exists a nonsingular matrix Z such that
$$Z^{-1}AZ = \operatorname{diag}\big(J(\Omega_h^*, 1),\, \tilde{J}\big),$$
and, associated with $\rho(A) = 1 \in \sigma(A)$, are positive right and left eigenvectors x and y, respectively.
Proof. If A is eventually nonnegative, select p relatively prime to h such that $A^p \geq 0$. Then A is a pth-root of $A^p$ and the result follows from Theorem 3.3.24.

The converse is clear by setting $X_1 = Z \operatorname{diag}\big(J(\Omega_h^*, 1),\, 0\big) Z^{-1}$ and $X_2 = Z \operatorname{diag}\big(0,\, \tilde{J}\big) Z^{-1}$.
Remark 3.3.26. It should be clear that Theorem 3.3.24 remains true if the assumption of
nonnegativity is replaced with eventual nonnegativity.
3.4 Reducible Matrices
Recall that a matrix $A \in M_n(\mathbb{C})$, $n \geq 2$, is reducible if there exists a permutation matrix P such that
$$A = P \begin{bmatrix} A_{11} & A_{12} \\ 0 & A_{22} \end{bmatrix} P^T,$$
where $A_{11}$ and $A_{22}$ are square, nonempty matrices. Via strong induction, it can be shown that if A is reducible, then there exists a permutation matrix P such that
$$P^TAP = \begin{bmatrix}
A_{11} & A_{12} & \cdots & A_{1k} \\
 & A_{22} & \cdots & A_{2k} \\
 & & \ddots & \vdots \\
 & & & A_{kk}
\end{bmatrix}, \tag{3.4.1}$$
where each matrix $A_{ii}$ appearing on the diagonal is irreducible. Recall that the matrix on the right-hand side of (3.4.1) is known as the Frobenius, or irreducible, normal form of A.
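In practice the Frobenius normal form can be computed from the strongly connected components of the digraph of A. The sketch below is our own illustration, not part of the dissertation (the routine name and the example matrix are ours): it finds the strongly connected components with scipy and orders them topologically, so that the permuted matrix is block upper triangular with irreducible diagonal blocks.

```python
import numpy as np
from scipy.sparse import csr_matrix
from scipy.sparse.csgraph import connected_components

def frobenius_normal_form(A):
    """Return a permutation p such that A[p][:, p] is block upper
    triangular with irreducible diagonal blocks (Frobenius normal form)."""
    n = A.shape[0]
    ncomp, labels = connected_components(csr_matrix(A != 0),
                                         directed=True, connection='strong')
    # Topologically sort the condensation (Kahn's algorithm) so that all
    # edges run from earlier blocks to later ones.
    indeg = np.zeros(ncomp, dtype=int)
    edges = set()
    for i, j in zip(*np.nonzero(A)):
        if labels[i] != labels[j]:
            edges.add((labels[i], labels[j]))
    for _, c in edges:
        indeg[c] += 1
    order, stack = [], [c for c in range(ncomp) if indeg[c] == 0]
    while stack:
        c = stack.pop()
        order.append(c)
        for a, b in list(edges):
            if a == c:
                edges.discard((a, b))
                indeg[b] -= 1
                if indeg[b] == 0:
                    stack.append(b)
    rank = {c: r for r, c in enumerate(order)}
    # Sort indices by (component rank, index within component).
    return np.argsort([rank[labels[i]] * (n + 1) + i for i in range(n)])

# Example: a 4-by-4 reducible matrix with components {1}, {0}, {2, 3}.
A = np.array([[1., 0, 2, 0],
              [3, 1, 0, 4],
              [0, 0, 1, 5],
              [0, 0, 6, 1]])
p = frobenius_normal_form(A)
B = A[np.ix_(p, p)]
# B is now block upper triangular: everything below the diagonal
# blocks of sizes 1, 1, 2 is zero.
assert np.allclose(B[2:, :2], 0) and B[1, 0] == 0
```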
Identifying the eventually nonnegative matrix roots of nonnegative reducible matrices
is beyond the scope of this dissertation (and perhaps a dissertation in and of itself). The
main difficulty arises in controlling entries of the off-diagonal blocks in the Frobenius normal
form (1.3.1). Moreover, the assumption that gcd (h, p) = 1 is not necessarily required: for
instance, consider the matrix
$$A = \begin{bmatrix}
0 & 1 & 0 & 0 \\
0 & 0 & 1 & 0 \\
0 & 0 & 0 & 1 \\
1 & 0 & 0 & 0
\end{bmatrix}$$
and note that
$$B := \begin{bmatrix}
0 & 0 & 1 & 0 \\
0 & 0 & 0 & 1 \\
1 & 0 & 0 & 0 \\
0 & 1 & 0 & 0
\end{bmatrix} = A^2;$$
B is a reducible 2-cyclic matrix, $\sigma(B) = \Omega_2 \cup \Omega_2$, and B has an irreducible nonnegative square root, even though $\gcd(2, 2) = 2 > 1$.
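The example can be verified directly; the following is our own numerical restatement of the computation above.

```python
import numpy as np

# The 4-cycle permutation A and its square B from the example above.
A = np.array([[0, 1, 0, 0],
              [0, 0, 1, 0],
              [0, 0, 0, 1],
              [1, 0, 0, 0]])
B = A @ A
assert np.array_equal(B, np.array([[0, 0, 1, 0],
                                   [0, 0, 0, 1],
                                   [1, 0, 0, 0],
                                   [0, 1, 0, 0]]))

# sigma(B) = Omega_2 ∪ Omega_2 = {1, 1, -1, -1}: the eigenvalues
# {1, i, -1, -i} of the 4-cycle, squared.
ev = np.linalg.eigvals(B)
assert np.allclose(np.sort(ev.real), [-1, -1, 1, 1])
assert np.allclose(ev.imag, 0)

# B is reducible: it permutes within {0, 2} and within {1, 3}.
assert B[np.ix_([0, 2], [1, 3])].sum() == 0

# Yet B has the irreducible nonnegative square root A, even though
# gcd(h, p) = gcd(2, 2) = 2.
```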
72
3.4.1 Preliminary Results
Definition 3.4.1. A matrix $A \in M_n(\mathbb{C})$ is said to be completely reducible if there exists a permutation matrix P such that
$$P^TAP = \bigoplus_{i=1}^{k} A_{ii} = \operatorname{diag}(A_{11}, \dots, A_{kk}), \tag{3.4.2}$$
where $k \geq 2$ and $A_{11}, \dots, A_{kk}$ are square irreducible matrices.
Remark 3.4.2. Following the definition, if A is completely reducible, without loss of generality, we may assume A is in the form of the matrix on the right-hand side of (3.4.2). Furthermore, note that $A \overset{v}{\geq} 0$ (i.e., A is eventually nonnegative) if and only if $A_{ii} \overset{v}{\geq} 0$ for all $i = 1, \dots, k$.
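The remark can be illustrated numerically. In the sketch below (our own example; the eventually positive block $A_{11}$ is an arbitrary choice with one negative entry and a dominant Perron root), powers of a block diagonal matrix are the block diagonals of the powers, so eventual nonnegativity is decided blockwise.

```python
import numpy as np
from scipy.linalg import block_diag

# A11 is eventually positive: it has a negative entry, but its Perron
# root (about 2.4) dominates and both Perron vectors are positive, so
# every sufficiently high power is positive.
A11 = np.array([[2.0, 1.0],
                [1.0, -0.1]])
A22 = np.array([[0.0, 1.0],
                [1.0, 0.0]])   # nonnegative, hence trivially eventually nonnegative

A = block_diag(A11, A22)

# Powers of a block diagonal matrix are block diagonal powers, so A is
# eventually nonnegative exactly when each diagonal block is.
for k in range(2, 9):
    Ak = np.linalg.matrix_power(A, k)
    assert np.allclose(Ak, block_diag(np.linalg.matrix_power(A11, k),
                                      np.linalg.matrix_power(A22, k)))
    assert np.all(np.linalg.matrix_power(A11, k) > 0)
```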
The following is a corollary to Theorem 3.3.24 and Corollary 3.3.23.

Corollary 3.4.3. Let $A \in M_n(\mathbb{R})$ and suppose that $A \overset{v}{\geq} 0$ is nonsingular and completely reducible. Let $h_i$ denote the cyclicity of $A_{ii}$ for $i = 1, \dots, k$. If each $A_{ii}$ possesses a real root, then A possesses an eventually nonnegative root if and only if $\gcd(h_i, p) = 1$ for all i.
Bibliography
[1] A. Berman and R. J. Plemmons. Nonnegative matrices in the mathematical sciences,
volume 9 of Classics in Applied Mathematics. Society for Industrial and Applied Math-
ematics (SIAM), Philadelphia, PA, 1994. Revised reprint of the 1979 original.
[2] R. A. Brualdi and H. J. Ryser. Combinatorial matrix theory, volume 39 of Encyclopedia
of Mathematics and its Applications. Cambridge University Press, Cambridge, 1991.
[3] S. Carnochan Naqvi and J. J. McDonald. The combinatorial structure of eventually
nonnegative matrices. Electron. J. Linear Algebra, 9:255–269 (electronic), 2002.
[4] S. Carnochan Naqvi and J. J. McDonald. Eventually nonnegative matrices are similar
to seminonnegative matrices. Linear Algebra Appl., 381:245–258, 2004.
[5] S. Friedland. On an inverse problem for nonnegative and eventually nonnegative ma-
trices. Israel J. Math., 29(1):43–60, 1978.
[6] D. Handelman. Positive matrices and dimension groups affiliated to C∗-algebras and
topological Markov chains. J. Operator Theory, 6(1):55–74, 1981.
[7] G. H. Hardy and E. M. Wright. An introduction to the theory of numbers. Oxford
University Press, Oxford, sixth edition, 2008. Revised by D. R. Heath-Brown and J. H.
Silverman, With a foreword by Andrew Wiles.
[8] N. J. Higham. Functions of matrices. Society for Industrial and Applied Mathematics
(SIAM), Philadelphia, PA, 2008. Theory and computation.
[9] N. J. Higham and L. Lin. On pth roots of stochastic matrices. Linear Algebra Appl.,
435(3):448–463, 2011.
[10] L. Hogben. Eventually cyclic matrices and a test for strong eventual nonnegativity.
Electron. J. Linear Algebra, 19:129–140, 2009.
[11] R. A. Horn and C. R. Johnson. Matrix analysis. Cambridge University Press, Cam-
bridge, 1990. Corrected reprint of the 1985 original.
[12] R. A. Horn and C. R. Johnson. Topics in matrix analysis. Cambridge University Press,
Cambridge, 1994. Corrected reprint of the 1991 original.
[13] J. Hunter. Number theory. Oliver & Boyd, Edinburgh, 1964.
[14] C. R. Johnson and P. Tarazaga. On matrices with Perron-Frobenius properties and
some negative entries. Positivity, 8(4):327–338, 2004.
[15] P. Lancaster and M. Tismenetsky. The theory of matrices. Computer Science and
Applied Mathematics. Academic Press Inc., Orlando, FL, second edition, 1985.
[16] E. Landau. Elementary number theory. Chelsea Publishing Co., New York, N.Y., 1958.
Translated by J. E. Goodman.
[17] W. J. LeVeque. Fundamentals of number theory. Dover Publications Inc., Mineola, NY,
1996. Reprint of the 1977 original.
[18] C. T. Long. Elementary introduction to number theory. Prentice Hall Inc., Englewood
Cliffs, NJ, third edition, 1987.
[19] J. J. McDonald and D. M. Morris. Level characteristics corresponding to peripheral
eigenvalues of a nonnegative matrix. Linear Algebra Appl., 429(7):1719–1729, 2008.
[20] D. Noutsos. On Perron-Frobenius property of matrices having some negative entries.
Linear Algebra Appl., 412(2-3):132–153, 2006.
[21] D. Noutsos and M. J. Tsatsomeros. On the numerical characterization of the reachability
cone for an essentially nonnegative matrix. Linear Algebra Appl., 430(4):1350–1363,
2009.
[22] P. J. Psarrakos. On the mth roots of a complex matrix. Electron. J. Linear Algebra,
9:32–41 (electronic), 2002.
[23] M. I. Smith. A Schur algorithm for computing matrix pth roots. SIAM J. Matrix Anal.
Appl., 24(4):971–989 (electronic), 2003.
[24] B. G. Zaslavsky and J. J. McDonald. A characterization of Jordan canonical forms
which are similar to eventually nonnegative matrices with the properties of nonnegative
matrices. Linear Algebra Appl., 372:253–285, 2003.
[25] B. G. Zaslavsky and B.-S. Tam. On the Jordan form of an irreducible matrix with
eventually non-negative powers. Linear Algebra Appl., 302/303:303–330, 1999. Special
issue dedicated to Hans Schneider (Madison, WI, 1998).