Chapter 2
A short review of matrix algebra
2.1 Vectors and vector spaces
Definition 2.1.1. A vector a of dimension n is a collection of n elements
typically written as
a = \begin{pmatrix} a_1 \\ a_2 \\ \vdots \\ a_n \end{pmatrix} = (a_i)_n.
Vectors of length 2 (two-dimensional vectors) can be thought of as points in the plane, and vectors of length 3 as points in three-dimensional space, as illustrated in Figure 2.1.
[Figure 2.1: Vectors in two and three dimensional spaces. The two-dimensional panel shows the points (-1.5, 2), (1, 1), and (1, -2); the three-dimensional panel, with axes x_1, x_2, x_3, shows the points (2.5, 1.5, 0.95) and (0, 1.5, 0.95).]
• A vector with all elements equal to zero is known as a zero vector and
is denoted by 0.
• A vector whose elements are stacked vertically is known as a column vector, whereas a vector whose elements are stacked horizontally will be referred to as a row vector. (Unless otherwise mentioned, all vectors will be taken to be column vectors.)
• A row vector representation of a column vector is known as its transpose. We will use the notation '′' or 'T' to indicate a transpose. For instance, if a = \begin{pmatrix} a_1 \\ a_2 \\ \vdots \\ a_n \end{pmatrix} and b = (a_1 \; a_2 \; \ldots \; a_n), then we write b = a^T or a = b^T.
• Vectors of the same dimension are conformable to algebraic operations such as addition and subtraction. The sum of two or more vectors of dimension n is another n-dimensional vector whose elements are the sums of the corresponding elements of the summand vectors. That is,

(a_i)_n ± (b_i)_n = (a_i ± b_i)_n.
• Vectors can be multiplied by a scalar:

c(a_i)_n = (ca_i)_n.
• The product of two vectors of the same dimension can be formed when one of them is a row vector and the other is a column vector. The result is called the inner, dot, or scalar product. If a = \begin{pmatrix} a_1 \\ a_2 \\ \vdots \\ a_n \end{pmatrix} and b = \begin{pmatrix} b_1 \\ b_2 \\ \vdots \\ b_n \end{pmatrix}, then

a^T b = a_1 b_1 + a_2 b_2 + \ldots + a_n b_n.
Definition 2.1.2. The length, magnitude, or Euclidean norm of a vector is defined as the square root of the sum of squares of its elements and is denoted by ||·||. For example,

||a|| = ||(a_i)_n|| = \sqrt{\sum_{i=1}^{n} a_i^2} = \sqrt{a^T a}.
• The length of the sum of two or more vectors is less than or equal to the sum of the lengths of each vector (the triangle inequality, a consequence of the Cauchy-Schwarz inequality |a^T b| ≤ ||a|| ||b||):

||a + b|| ≤ ||a|| + ||b||.
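These definitions are easy to check numerically; a small sketch in Python's numpy (the vectors are arbitrary):

```python
import numpy as np

a = np.array([1.0, 1.0])
b = np.array([-1.5, 2.0])

dot = a @ b                        # inner (scalar) product a^T b
norm_a = np.sqrt(a @ a)            # Euclidean norm ||a|| = sqrt(a^T a)
assert np.isclose(norm_a, np.linalg.norm(a))

# triangle inequality: ||a + b|| <= ||a|| + ||b||
assert np.linalg.norm(a + b) <= np.linalg.norm(a) + np.linalg.norm(b)
```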
Definition 2.1.3. A set of vectors {a_1, a_2, \ldots, a_m} is linearly dependent if at least one of them can be written as a linear combination of the others. In other words, {a_1, a_2, \ldots, a_m} are linearly dependent if there exists at least one non-zero c_j such that

\sum_{j=1}^{m} c_j a_j = 0. \qquad (2.1.1)

Equivalently, for some k with c_k ≠ 0,

a_k = -(1/c_k) \sum_{j \neq k} c_j a_j.
Definition 2.1.4. A set of vectors is linearly independent if it is not linearly dependent. That is, in order for (2.1.1) to hold, all c_j's must be equal to zero.
Definition 2.1.5. Two vectors a and b are orthogonal if their scalar product is zero. That is, a^T b = 0, and we write a ⊥ b.
Definition 2.1.6. A set of vectors is said to be mutually orthogonal if
members of any pair of vectors belonging to the set are orthogonal.
• If non-zero vectors are mutually orthogonal, then they are linearly independent.
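For instance, the following numpy sketch (with three hypothetical vectors) confirms mutual orthogonality via the Gram matrix V^T V and linear independence via the rank:

```python
import numpy as np

# three mutually orthogonal, non-zero vectors in R^3
v1 = np.array([1.0, 1.0, 0.0])
v2 = np.array([1.0, -1.0, 0.0])
v3 = np.array([0.0, 0.0, 2.0])

V = np.column_stack([v1, v2, v3])
G = V.T @ V                            # Gram matrix; zero off-diagonals <=> mutual orthogonality
assert np.allclose(G, np.diag(np.diag(G)))
assert np.linalg.matrix_rank(V) == 3   # full rank <=> linearly independent
```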
Definition 2.1.7. Vector space. A set of vectors which is closed under addition and scalar multiplication is known as a vector space.

Thus, if V is a vector space, then for any two vectors a and b from V and any two constants c_a and c_b, (i) c_a a + c_b b ∈ V, and in particular (ii) c_a a ∈ V.
Definition 2.1.8. Span. All possible linear combinations of a set of linearly independent vectors form the span of that set.

Thus, if A = {a_1, a_2, \ldots, a_m} is a set of m linearly independent vectors, then the span of A is given by

span(A) = \left\{ a : a = \sum_{j=1}^{m} c_j a_j \right\},

for some numbers c_j, j = 1, 2, \ldots, m. Viewed differently, the set of vectors A generates the vector space span(A) and is referred to as a basis of span(A).
Formally,

• Let a_1, a_2, \ldots, a_m be a set of m linearly independent n-dimensional vectors in a vector space V that spans V. Then a_1, a_2, \ldots, a_m together form a basis of V, and the dimension of a vector space is defined as the number of vectors in its basis. That is, dim(V) = m.
2.2 Matrix
Definition 2.2.1. A matrix is a rectangular or square arrangement of numbers. A matrix with m rows and n columns is referred to as an m × n (read as 'm by n') matrix. An m × n matrix A with (i, j)th element a_{ij} is written as

A = (a_{ij})_{m×n} =
\begin{bmatrix}
a_{11} & a_{12} & \ldots & a_{1n} \\
a_{21} & a_{22} & \ldots & a_{2n} \\
\vdots & \vdots & \ddots & \vdots \\
a_{m1} & a_{m2} & \ldots & a_{mn}
\end{bmatrix}.
If m = n then the matrix is a square matrix.
Definition 2.2.2. A diagonal matrix is a square matrix whose off-diagonal elements are all zero.

A diagonal matrix with diagonal elements a_1, a_2, \ldots, a_n is written as

diag(a_1, a_2, \ldots, a_n) =
\begin{bmatrix}
a_1 & 0 & \ldots & 0 \\
0 & a_2 & \ldots & 0 \\
\vdots & \vdots & \ddots & \vdots \\
0 & 0 & \ldots & a_n
\end{bmatrix}.
Definition 2.2.3. An n × n diagonal matrix with all diagonal elements equal to 1 is known as the identity matrix of order n and is denoted by I_n.
A similar notation J_{mn} is sometimes used for an m × n matrix with all elements equal to 1, i.e.,

J_{mn} =
\begin{bmatrix}
1 & 1 & \ldots & 1 \\
1 & 1 & \ldots & 1 \\
\vdots & \vdots & \ddots & \vdots \\
1 & 1 & \ldots & 1
\end{bmatrix}
= [1_m \; 1_m \; \ldots \; 1_m].
Like vectors, matrices with the same dimensions can be added together, and the result is another matrix of the same dimensions. Any matrix is conformable to multiplication by a scalar. If A = (a_{ij})_{m×n} and B = (b_{ij})_{m×n}, then

1. A ± B = (a_{ij} ± b_{ij})_{m×n}, and
2. cA = (ca_{ij})_{m×n}.
Definition 2.2.4. The transpose of a matrix A = (a_{ij})_{m×n} is defined by A^T = (a_{ji})_{n×m}.

• If A = A^T, then A is symmetric.
• (A + B)^T = A^T + B^T.
Definition 2.2.5. Matrix product. If A = (a_{ij})_{m×n} and B = (b_{ij})_{n×p}, then

AB = (c_{ij})_{m×p}, \quad c_{ij} = \sum_{k=1}^{n} a_{ik} b_{kj} = a_i^T b_j,

where a_i^T is the ith row of A (imagined as a vector) and b_j is the jth column (vector) of B.

• (AB)^T = B^T A^T,
• (AB)C = A(BC), whenever defined,
• A(B + C) = AB + AC, whenever defined,
• J_{mn} J_{np} = n J_{mp}.
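As a quick numerical check of these product rules (a numpy sketch with arbitrary dimensions):

```python
import numpy as np

rng = np.random.default_rng(0)
A = rng.standard_normal((3, 4))
B = rng.standard_normal((4, 2))
C = rng.standard_normal((2, 5))

assert np.allclose((A @ B).T, B.T @ A.T)       # (AB)^T = B^T A^T
assert np.allclose((A @ B) @ C, A @ (B @ C))   # (AB)C = A(BC)

m, n, p = 3, 4, 2
assert np.allclose(np.ones((m, n)) @ np.ones((n, p)),
                   n * np.ones((m, p)))        # J_mn J_np = n J_mp
```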
2.3 Rank, Column Space and Null Space
Definition 2.3.1. The rank of a matrix A is the number of linearly independent rows or columns of A. We denote it by rank(A).
• rank(A^T) = rank(A).
• An m × n matrix A with rank m (respectively, n) is said to have full row (column) rank.
• If A is a square matrix with n rows and rank(A) < n, then A is singular and the inverse does not exist.
• rank(AB) ≤ min(rank(A), rank(B)).
• rank(A^T A) = rank(AA^T) = rank(A) = rank(A^T).
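These rank properties can be spot-checked with numpy's matrix_rank; the sketch below constructs A as a product so that its rank is at most 3:

```python
import numpy as np

rng = np.random.default_rng(1)
A = rng.standard_normal((5, 3)) @ rng.standard_normal((3, 4))  # rank(A) <= 3

r = np.linalg.matrix_rank(A)
assert np.linalg.matrix_rank(A.T) == r        # rank(A^T) = rank(A)
assert np.linalg.matrix_rank(A.T @ A) == r    # rank(A^T A) = rank(A)
assert np.linalg.matrix_rank(A @ A.T) == r    # rank(AA^T) = rank(A)
```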
Definition 2.3.2. Inverse of a square matrix. If A is a square matrix with n rows and rank(A) = n, then A is called non-singular, and there exists a matrix A^{-1} such that AA^{-1} = A^{-1}A = I_n. The matrix A^{-1} is known as the inverse of A.

• A^{-1} is unique.
• If A and B are invertible and have the same dimension, then (AB)^{-1} = B^{-1}A^{-1}.
• (cA)^{-1} = A^{-1}/c for any non-zero scalar c.
• (A^T)^{-1} = (A^{-1})^T.
Definition 2.3.3. Column space. The column space of a matrix A is the vector space generated by the columns of A. If A = (a_{ij})_{m×n} = (a_1 \; a_2 \; \ldots \; a_n), then the column space of A, denoted by C(A) or R(A), is given by

C(A) = \left\{ a : a = \sum_{j=1}^{n} c_j a_j \right\},

for scalars c_j, j = 1, 2, \ldots, n.
Alternatively, a ∈ C(A) iff there exists a vector c such that

a = Ac.

• What is the dimension of the vectors in C(A)?
• How many vectors will a basis of C(A) have?
• dim(C(A)) = ?
• If A = BC, then C(A) ⊆ C(B).
• If C(A) ⊆ C(B), then there exists a matrix C such that A = BC.
Example 2.3.1. Find a basis for the column space of the matrix

A = \begin{bmatrix} -1 & 2 & -1 \\ 1 & 1 & 4 \\ 0 & 2 & 2 \end{bmatrix}.
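A sketch of how the example can be checked with sympy: here a_3 = 3a_1 + a_2, so rank(A) = 2 and the first two columns form a basis of C(A).

```python
import sympy as sp

A = sp.Matrix([[-1, 2, -1],
               [ 1, 1,  4],
               [ 0, 2,  2]])

print(A.rank())          # 2, so a basis of C(A) has two vectors
print(A.columnspace())   # the two pivot columns: a basis of C(A)
# the dependency: column 3 = 3*column 1 + column 2
assert A[:, 2] == 3 * A[:, 0] + A[:, 1]
```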
Definition 2.3.4. Null Space. The null space of an m × n matrix A is defined as the vector space consisting of the solutions of the system of equations Ax = 0. The null space of A is denoted by N(A) and can be written as

N(A) = \{x : Ax = 0\}.

• What is the dimension of the vectors in N(A)?
• How many vectors are there in a basis of N(A)?
• dim(N(A)) = n − rank(A), known as the nullity of A.
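Numerically, an orthonormal basis of N(A) can be obtained with scipy's null_space; the sketch below reuses the matrix of Example 2.3.1, whose nullity is 3 − 2 = 1:

```python
import numpy as np
from scipy.linalg import null_space

A = np.array([[-1.0, 2.0, -1.0],
              [ 1.0, 1.0,  4.0],
              [ 0.0, 2.0,  2.0]])

N = null_space(A)                # columns form an orthonormal basis of N(A)
assert np.allclose(A @ N, 0.0)   # every basis vector solves Ax = 0
assert N.shape[1] == A.shape[1] - np.linalg.matrix_rank(A)  # nullity = n - rank(A)
```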
Definition 2.3.5. Orthogonal complements. Two subspaces V_1 and V_2 of a vector space V form orthogonal complements relative to V if every vector in V_1 is orthogonal to every vector in V_2. We write V_1 = V_2^⊥ or, equivalently, V_2 = V_1^⊥.
• V_1 ∩ V_2 = {0}.
• If dim(V_1) = r, then dim(V_2) = n − r, where n is the dimension of the vectors in the vector space V.
• Every vector a in V can be uniquely decomposed into two components a_1 and a_2 such that

a = a_1 + a_2, \quad a_1 ∈ V_1, \; a_2 ∈ V_2. \qquad (2.3.1)

• If (2.3.1) holds, then

||a||^2 = ||a_1||^2 + ||a_2||^2. \qquad (2.3.2)
How?
Proof of (2.3.1).

• Existence. Suppose the decomposition is not possible. Then a is independent of the basis vectors of V_1 and V_2 combined. But that would make the total number of independent vectors in V equal to n + 1. Is that possible?

• Uniqueness. Suppose two such decompositions are possible, namely,

a = a_1 + a_2, \quad a_1 ∈ V_1, \; a_2 ∈ V_2,

and

a = b_1 + b_2, \quad b_1 ∈ V_1, \; b_2 ∈ V_2.

Then,

a_1 − b_1 = b_2 − a_2.

This implies

a_1 = b_1 and b_2 = a_2. (Why?)
Proof of (2.3.2).

• From (2.3.1), and since a_1 ⊥ a_2 implies a_1^T a_2 = a_2^T a_1 = 0,

||a||^2 = a^T a = (a_1 + a_2)^T (a_1 + a_2) = a_1^T a_1 + a_1^T a_2 + a_2^T a_1 + a_2^T a_2 = ||a_1||^2 + ||a_2||^2. \qquad (2.3.3)

This result is known as the Pythagorean theorem.
[Figure 2.2: Orthogonal decomposition (direct sum). With V = {(x, y) : x, y ∈ R} = R^2, V_1 = {(x, y) ∈ R^2 : x = y}, and V_2 = {(x, y) ∈ R^2 : x + y = 0}, the point (2, 1) decomposes as (3/2, 3/2) + (1/2, −1/2).]
Theorem 2.3.2. If A is an m × n matrix, and C(A) and N(A^T) respectively denote the column space of A and the null space of A^T, then

C(A) = N(A^T)^⊥.

Proof. • dim(C(A)) = rank(A) = rank(A^T) = r (say), and dim(N(A^T)) = m − r.

• Suppose a_1 ∈ C(A) and a_2 ∈ N(A^T). Then there exists a c such that

Ac = a_1,

and

A^T a_2 = 0.

Now,

a_1^T a_2 = c^T A^T a_2 = 0. \qquad (2.3.4)
• (More on orthogonality.) If V_1 ⊆ V_2, and V_1^⊥ and V_2^⊥ respectively denote their orthogonal complements, then

V_2^⊥ ⊆ V_1^⊥.
Proof. (Proof of the result on the previous page.) Suppose a_1 ∈ V_1. Then we can write

a_1 = A_1 c_1,

for some vector c_1, where the columns of the matrix A_1 are the basis vectors of V_1. Similarly,

a_2 = A_2 c_2, \quad ∀ a_2 ∈ V_2.

In other words,

V_1 = C(A_1)

and

V_2 = C(A_2).

Since V_1 ⊆ V_2, there exists a matrix B such that A_1 = A_2 B (see the column space properties following Definition 2.3.3). Now let a ∈ V_2^⊥ =⇒ a ∈ N(A_2^T), implying

A_2^T a = 0.

But

A_1^T a = B^T A_2^T a = 0,

proving that a ∈ N(A_1^T) = V_1^⊥.
2.4 Trace
The trace of a matrix will come in handy when we talk about the distribution of quadratic forms.

Definition 2.4.1. Trace of a square matrix is the sum of its diagonal elements. Thus, if A = (a_{ij})_{n×n}, then

trace(A) = \sum_{i=1}^{n} a_{ii}.
• trace(I_n) = n.
• trace(A) = trace(A^T).
• trace(A + B) = trace(A) + trace(B).
• trace(AB) = trace(BA).
• trace(A^T A) = trace(AA^T) = \sum_{i=1}^{n} \sum_{j=1}^{n} a_{ij}^2.
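A quick numpy check of the last two properties (A and B are arbitrary rectangular matrices, so AB and BA have different sizes but the same trace):

```python
import numpy as np

rng = np.random.default_rng(2)
A = rng.standard_normal((3, 4))
B = rng.standard_normal((4, 3))

assert np.isclose(np.trace(A @ B), np.trace(B @ A))  # trace(AB) = trace(BA)
assert np.isclose(np.trace(A.T @ A), np.sum(A**2))   # trace(A^T A) = sum of squared elements
```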
2.5 Determinants
Definition 2.5.1. Determinant. The determinant of a scalar is the scalar itself. The determinant of an n × n matrix A = (a_{ij})_{n×n} is a scalar, written as |A|, where

|A| = \sum_{j=1}^{n} a_{ij} (−1)^{i+j} |M_{ij}|,

for any fixed i, where the determinant |M_{ij}| of the matrix M_{ij} is known as the minor of a_{ij}, and the matrix M_{ij} is obtained by deleting the ith row and jth column of A.
• |A| = |A^T|.
• |diag(d_i, i = 1, 2, \ldots, n)| = \prod_{i=1}^{n} d_i. This also holds if the matrix is an upper or lower triangular matrix with diagonal elements d_i, i = 1, 2, \ldots, n.
• |AB| = |A||B|.
• |cA| = c^n |A|.
• If A is singular (rank(A) < n), then |A| = 0.
• |A^{-1}| = 1/|A|.
• The determinants of block-diagonal (block-triangular) matrices work the way you would expect. For instance,

\begin{vmatrix} A & C \\ 0 & B \end{vmatrix} = |A||B|.

In general, provided A is non-singular,

\begin{vmatrix} A & B \\ C & D \end{vmatrix} = |A| \, |D − CA^{-1}B|.
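Both block formulas can be spot-checked numerically; in the sketch below the blocks are randomly generated, and A is (almost surely) non-singular as the second formula requires:

```python
import numpy as np

rng = np.random.default_rng(3)
A = rng.standard_normal((3, 3))
B = rng.standard_normal((3, 2))
C = rng.standard_normal((2, 3))
D = rng.standard_normal((2, 2))

# block-triangular case: |[A B; 0 D]| = |A||D|
M1 = np.block([[A, B], [np.zeros((2, 3)), D]])
assert np.isclose(np.linalg.det(M1), np.linalg.det(A) * np.linalg.det(D))

# general case: |[A B; C D]| = |A| |D - C A^{-1} B|
M2 = np.block([[A, B], [C, D]])
schur = D - C @ np.linalg.inv(A) @ B
assert np.isclose(np.linalg.det(M2), np.linalg.det(A) * np.linalg.det(schur))
```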
2.6 Eigenvalues and Eigenvectors
Definition 2.6.1. Eigenvalues and eigenvectors. The eigenvalues (λ) of a square matrix A_{n×n} and the corresponding eigenvectors (a) are defined by the set of equations

Aa = λa. \qquad (2.6.1)

Equation (2.6.1) leads to the polynomial equation

|A − λI_n| = 0. \qquad (2.6.2)

For a given eigenvalue, the corresponding eigenvectors are obtained as solutions to equation (2.6.1); these solutions constitute the eigenspace of the matrix A for that eigenvalue.
Example 2.6.1. Find the eigenvalues and eigenvectors of the matrix

A = \begin{bmatrix} -1 & 2 & 0 \\ 1 & 2 & 1 \\ 0 & 2 & -1 \end{bmatrix}.
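For this matrix the eigenvalues turn out to be −2, −1, and 3; a quick numerical check with numpy:

```python
import numpy as np

A = np.array([[-1.0, 2.0,  0.0],
              [ 1.0, 2.0,  1.0],
              [ 0.0, 2.0, -1.0]])

vals, vecs = np.linalg.eig(A)
print(np.sort(vals.real))            # approximately [-2., -1., 3.]
for lam, a in zip(vals, vecs.T):     # each pair satisfies A a = lambda a
    assert np.allclose(A @ a, lam * a)
```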
Since in this course our focus will be on the eigenvalues of symmetric matrices, henceforth we state the results on eigenvalues and eigenvectors as applied to a symmetric matrix A. Some of the results will, however, hold for a general A. If you are interested, please consult a linear algebra book such as Harville's Matrix Algebra From a Statistician's Perspective.
Definition 2.6.2. Spectrum. The spectrum of a matrix A is defined as the
set of distinct (real) eigenvalues {λ1, λ2, . . . , λk} of A.
• The eigenspace L of a matrix A corresponding to an eigenvalue λ can be written as

L = N(A − λI_n).

• trace(A) = \sum_{i=1}^{n} λ_i.
• |A| = \prod_{i=1}^{n} λ_i.
• |I_n ± A| = \prod_{i=1}^{n} (1 ± λ_i).
• Eigenvectors associated with different eigenvalues are mutually orthogonal, or can be chosen to be mutually orthogonal, and hence are linearly independent.
• rank(A) is the number of non-zero λ_i's.
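For a symmetric matrix these properties are easy to confirm with numpy's eigvalsh (a sketch on a randomly generated symmetric matrix):

```python
import numpy as np

rng = np.random.default_rng(4)
X = rng.standard_normal((4, 4))
A = (X + X.T) / 2                    # a random symmetric matrix

lam = np.linalg.eigvalsh(A)
assert np.isclose(np.trace(A), lam.sum())          # trace = sum of eigenvalues
assert np.isclose(np.linalg.det(A), lam.prod())    # determinant = product of eigenvalues
assert np.isclose(np.linalg.det(np.eye(4) + A),
                  np.prod(1 + lam))                # |I + A| = prod(1 + lambda_i)
```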
The proofs of some of these results are easily obtained through the application of a special theorem called the spectral decomposition theorem.
Definition 2.6.3. Orthogonal Matrix. A matrix A_{n×n} is said to be orthogonal if

A^T A = I_n = A A^T.

This immediately implies that A^{-1} = A^T.
Theorem 2.6.2. Spectral decomposition. Any symmetric matrix A can be decomposed as

A = BΛB^T,

where Λ = diag(λ_1, \ldots, λ_n) is the diagonal matrix of eigenvalues and B is an orthogonal matrix having the eigenvectors of A as its columns, namely, B = [a_1 \; a_2 \; \ldots \; a_n], where the a_j's are orthonormal eigenvectors corresponding to the eigenvalues λ_j, j = 1, 2, \ldots, n.
Proof.
Outline of the proof of the spectral decomposition theorem:

• By definition, B satisfies

AB = BΛ, \qquad (2.6.3)

and

B^T B = I_n.

Then from (2.6.3),

A = BΛB^{-1} = BΛB^T.
The spectral decomposition of a symmetric matrix allows one to form a 'square root' of that matrix. Provided the eigenvalues are non-negative, if we define

\sqrt{A} = B \sqrt{Λ} B^T,

it is easy to verify that

\sqrt{A} \sqrt{A} = A.

In general, one can define

A^α = B Λ^α B^T, \quad α ∈ R,

whenever λ_i^α is defined for each i.
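A sketch of the square-root construction in numpy, using eigh for the spectral decomposition; A is built as XX^T so that its eigenvalues are non-negative, as the square root requires:

```python
import numpy as np

rng = np.random.default_rng(5)
X = rng.standard_normal((3, 3))
A = X @ X.T                          # symmetric with non-negative eigenvalues

lam, B = np.linalg.eigh(A)           # A = B diag(lam) B^T
lam = np.clip(lam, 0.0, None)        # guard against tiny negative round-off
sqrtA = B @ np.diag(np.sqrt(lam)) @ B.T
assert np.allclose(sqrtA @ sqrtA, A)         # sqrt(A) sqrt(A) = A
```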
Example 2.6.3. Find a matrix B and the matrix Λ (the diagonal matrix of eigenvalues) such that

A = \begin{bmatrix} 6 & -2 \\ -2 & 9 \end{bmatrix} = B^T Λ B.
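For this matrix the eigenvalues are 5 and 10. A numerical sketch with numpy's eigh, whose B has the orthonormal eigenvectors as columns (so A = BΛB^T; the transpose of this B gives the B^TΛB form asked for):

```python
import numpy as np

A = np.array([[ 6.0, -2.0],
              [-2.0,  9.0]])

lam, B = np.linalg.eigh(A)
print(lam)                                    # [ 5. 10.]
assert np.allclose(B.T @ B, np.eye(2))        # B is orthogonal
assert np.allclose(B @ np.diag(lam) @ B.T, A)
```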
2.7 Solutions to linear systems of equations
A linear system of m equations in n unknowns is written as

Ax = b, \qquad (2.7.1)

where A_{m×n} is a matrix, b is a vector of known constants, and x is an unknown vector. The goal usually is to find a value (solution) of x such that (2.7.1) is satisfied. When b = 0, the system is said to be homogeneous. It is easy to see that homogeneous systems are always consistent, that is, they have at least one solution (x = 0).

• The solution set of a homogeneous system of equations Ax = 0 forms a vector space and is given by N(A).
• A non-homogeneous system of equations Ax = b is consistent iff

rank(A, b) = rank(A).

– The system of linear equations Ax = b is consistent iff b ∈ C(A).
– If A is square and rank(A) = n, then Ax = b has a unique solution given by x = A^{-1}b.
2.7.1 G-inverse
One way to obtain the solutions to a system of equations (2.7.1) is simply to transform the augmented matrix (A, b) into row-reduced echelon form. However, such forms are not convenient for further algebraic treatment. In analogy with the inverse of a non-singular matrix, one can define an inverse, referred to as a generalized inverse (in short, g-inverse), of any matrix, square or rectangular, singular or non-singular. This generalized inverse makes finding the solutions of linear equations easier. Theoretical developments based on the g-inverse are very powerful for solving problems arising in linear models.
Definition 2.7.1. G-inverse. A g-inverse of a matrix A_{m×n} is a matrix G_{n×m} that satisfies the relationship

AGA = A.
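Every matrix has at least one g-inverse; in particular, the Moore-Penrose inverse computed by numpy's pinv is one, so the defining relationship can be verified directly even for a singular matrix:

```python
import numpy as np

A = np.array([[1.0, 2.0],
              [2.0, 4.0]])           # singular: rank 1

G = np.linalg.pinv(A)                # the Moore-Penrose inverse is one g-inverse
assert np.allclose(A @ G @ A, A)     # AGA = A
```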
The following two lemmas are useful for finding a g-inverse of a matrix A.

Lemma 2.7.1. Suppose rank(A_{m×n}) = r, and A_{m×n} can be factorized as

A_{m×n} = \begin{bmatrix} A_{11} & A_{12} \\ A_{21} & A_{22} \end{bmatrix}

such that A_{11} is of dimension r × r with rank(A_{11}) = r. Then a g-inverse of A is given by

G_{n×m} = \begin{bmatrix} A_{11}^{-1} & 0 \\ 0 & 0 \end{bmatrix}.
Example 2.7.2. Find a g-inverse of the matrix

A = \begin{bmatrix} 1 & 1 & 1 & 1 \\ 0 & 1 & 0 & -1 \\ 1 & 0 & 1 & 2 \end{bmatrix}.
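Here rank(A) = 2 (the third row equals row 1 minus row 2), and the leading 2 × 2 block is non-singular, so Lemma 2.7.1 applies with A_{11} taken as that block; a numerical check:

```python
import numpy as np

A = np.array([[1.0, 1.0, 1.0,  1.0],
              [0.0, 1.0, 0.0, -1.0],
              [1.0, 0.0, 1.0,  2.0]])   # rank 2: row3 = row1 - row2

G = np.zeros((4, 3))                    # G = [[A11^{-1}, 0], [0, 0]]
G[:2, :2] = np.linalg.inv(A[:2, :2])
assert np.allclose(A @ G @ A, A)        # so G is a g-inverse of A
```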
Suppose you do not have a non-singular r × r leading block to begin with. What do you do then?

Lemma 2.7.3. Suppose rank(A_{m×n}) = r, and there exist non-singular matrices B and C such that

BAC = \begin{bmatrix} D & 0 \\ 0 & 0 \end{bmatrix},

where D is a diagonal matrix with rank(D) = r. Then a g-inverse of A is given by

G_{n×m} = C \begin{bmatrix} D^{-1} & 0 \\ 0 & 0 \end{bmatrix} B.
• rank(G) ≥ rank(A).
• A g-inverse of a matrix is not necessarily unique. For instance,

– If G is a g-inverse of a symmetric matrix A, then GAG is also a g-inverse of A.
– If G is a g-inverse of a symmetric matrix A, then G_1 = (G + G^T)/2 is also a g-inverse of A.
– A g-inverse of a diagonal matrix D = diag(d_1, \ldots, d_n) is another diagonal matrix D^g = diag(d_1^g, \ldots, d_n^g), where

d_i^g = \begin{cases} 1/d_i, & d_i \neq 0, \\ 0, & d_i = 0. \end{cases}

Again, as you can see, we concentrate on symmetric matrices, as these matrix properties will mostly be applied to symmetric matrices in this course.
Another way of finding a g-inverse of a symmetric matrix.

Lemma 2.7.4. Let A be an n × n symmetric matrix with spectral decomposition A = BΛB^T. Then a g-inverse of A is given by

G = B Λ^g B^T,

where B and Λ bear the same meaning as in the spectral decomposition theorem, and Λ^g is the g-inverse of the diagonal matrix Λ described above.
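A sketch of this lemma in numpy on a singular symmetric matrix; the eigenvalues are inverted only where non-zero, exactly as in the diagonal g-inverse above:

```python
import numpy as np

A = np.array([[1.0, 1.0],
              [1.0, 1.0]])                 # symmetric, rank 1 (eigenvalues 0 and 2)

lam, B = np.linalg.eigh(A)                 # A = B diag(lam) B^T
lam_g = np.array([1.0 / l if abs(l) > 1e-12 else 0.0 for l in lam])
G = B @ np.diag(lam_g) @ B.T
assert np.allclose(A @ G @ A, A)           # AGA = A
```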
2.7.2 Back to the system of equations
Theorem 2.7.5. If Ax = b is a consistent system of linear equations and G is a g-inverse of A, then Gb is a solution to Ax = b.
Proof.
Theorem 2.7.6. x^* is a solution to the consistent system of linear equations Ax = b iff there exists a vector c such that

x^* = Gb + (I − GA)c,

for some g-inverse G of A.
Proof.
Proof of Theorem 2.7.6.

If part. For any compatible vector c and for any g-inverse G of A, define

x^* = Gb + (I − GA)c.

Then, since the system is consistent, b ∈ C(A) and hence AGb = b (Theorem 2.7.5), so that

Ax^* = A[Gb + (I − GA)c] = AGb + (A − AGA)c = b + 0 = b.

Only if part. Suppose x^* is a solution to the consistent system of linear equations Ax = b. Then

x^* = Gb + (x^* − Gb) = Gb + (x^* − GAx^*) = Gb + (I − GA)c,

where c = x^*.
Remark 2.7.1. 1. Any solution to the system of equations Ax = b can be written as the sum of two components: one being a solution by itself and the other lying in the null space of A.

2. Once one g-inverse of A has been computed, all possible solutions of Ax = b have been identified (via Theorem 2.7.6).
Example 2.7.7. Give a general form of the solutions to the system of equations

\begin{bmatrix} 1 & 2 & 1 & 0 \\ 1 & 1 & 1 & 1 \\ 0 & 1 & 0 & -1 \\ 1 & -1 & 1 & 3 \end{bmatrix}
\begin{pmatrix} x_1 \\ x_2 \\ x_3 \\ x_4 \end{pmatrix}
= \begin{pmatrix} 5 \\ 3 \\ 2 \\ -1 \end{pmatrix}.
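A numerical sketch of this example, taking G to be the Moore-Penrose inverse: Gb is one particular solution, and adding (I − GA)c for an arbitrary c yields another, as Theorem 2.7.6 asserts:

```python
import numpy as np

A = np.array([[1.0,  2.0, 1.0,  0.0],
              [1.0,  1.0, 1.0,  1.0],
              [0.0,  1.0, 0.0, -1.0],
              [1.0, -1.0, 1.0,  3.0]])
b = np.array([5.0, 3.0, 2.0, -1.0])

# consistency: rank(A, b) = rank(A)
assert np.linalg.matrix_rank(np.column_stack([A, b])) == np.linalg.matrix_rank(A)

G = np.linalg.pinv(A)                    # one choice of g-inverse
x_star = G @ b                           # a particular solution
assert np.allclose(A @ x_star, b)

c = np.array([1.0, -2.0, 0.5, 3.0])      # arbitrary vector
x_gen = x_star + (np.eye(4) - G @ A) @ c
assert np.allclose(A @ x_gen, b)         # still a solution
```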
Idempotent matrix and projections
Definition 2.7.2. Idempotent matrix. A square matrix B is idempotent if B^2 = BB = B.

• If B is idempotent, then rank(B) = trace(B).
• If B_{n×n} is idempotent, then I_n − B is also idempotent, with rank(I_n − B) = n − trace(B).
• If B_{n×n} is idempotent with rank(B) = n, then B = I_n.

Lemma 2.7.8. If the m × n matrix A has rank r, then the matrix I_n − GA is idempotent with rank n − r, where G is a g-inverse of A.
Definition 2.7.3. Projection. A square matrix P_{n×n} is a projection onto a vector space V ⊆ R^n iff all three of the following hold: (a) P is idempotent, (b) ∀x ∈ R^n, Px ∈ V, and (c) ∀x ∈ V, Px = x. An idempotent matrix is a projection onto its own column space.
Example 2.7.9. Let the vector space be defined as

V = {(v_1, v_2) : v_2 = k v_1} ⊆ R^2,

for some non-zero real constant k. Consider the matrix

P = \begin{bmatrix} t & (1 − t)/k \\ kt & (1 − t) \end{bmatrix}

for any real number t ∈ R. Notice that

(a) PP = P,

(b) for any x = (x_1, x_2)^T ∈ R^2, Px = (t x_1 + (1 − t)x_2/k, \; kt x_1 + (1 − t)x_2)^T ∈ V,

(c) for any x = (x_1, x_2)^T = (x_1, k x_1)^T ∈ V, Px = x.
Thus, P is a projection onto the vector space V. Notice that the projection P is not unique, as it depends on the choice of t. Consider k = 1. Then V is the linear space representing the line with unit slope passing through the origin. When multiplied by the projection matrix (for t = 2)

P_1 = \begin{bmatrix} 2 & -1 \\ 2 & -1 \end{bmatrix},

any point in the two-dimensional real space produces a point in V. For instance, the point (1, .5), when multiplied by P_1, produces (1.5, 1.5), which belongs to V.
[Figure 2.3: Projections. The point (1, 1/2) is projected onto V = {(x, y) : x = y} at (1.5, 1.5) by P_1 and at (0.75, 0.75) by P_2.]
But the projection

P_2 = \begin{bmatrix} .5 & .5 \\ .5 & .5 \end{bmatrix}

projects the point (1, .5) onto V at (0.75, 0.75). See Figure 2.3.
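The two projections of this example can be verified directly in numpy (with k = 1, t = 2 for P_1 and t = 1/2 for P_2):

```python
import numpy as np

P1 = np.array([[2.0, -1.0],
               [2.0, -1.0]])        # t = 2, k = 1
P2 = np.array([[0.5,  0.5],
               [0.5,  0.5]])        # t = 1/2, k = 1: the symmetric projection

x = np.array([1.0, 0.5])
for P in (P1, P2):
    assert np.allclose(P @ P, P)    # idempotent
    y = P @ x
    assert np.isclose(y[1], y[0])   # P x lands in V = {(v1, v2) : v2 = v1}

print(P1 @ x, P2 @ x)               # [1.5 1.5] and [0.75 0.75], as in Figure 2.3
```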
Back to g-inverse and solution of system of equations
Lemma 2.7.10. If G is a g-inverse of A, then I − GA is a projection onto N(A).

Proof. Left as an exercise.

Lemma 2.7.11. If G is a g-inverse of A, then AG is a projection onto C(A).
Proof. Left as an exercise (Done in class).
Lemma 2.7.12. If P and Q are symmetric and both project onto the same space V ⊆ R^n, then P = Q.
Proof.
By definition, for any x ∈ R^n, Px ∈ V and Qx ∈ V. Let

Px = x_1 ∈ V \quad and \quad Qx = x_2 ∈ V.

Then,

(P − Q)x = x_1 − x_2, \quad ∀x ∈ R^n. \qquad (2.7.2)

Multiplying both sides by P^T = P, and using the fact that P leaves x_1, x_2 ∈ V fixed, we get

P(P − Q)x = P(x_1 − x_2) = x_1 − x_2, \quad ∀x ∈ R^n. \qquad (2.7.3)

Subtracting (2.7.2) from (2.7.3), we obtain

[P(P − Q) − (P − Q)]x = 0, \quad ∀x ∈ R^n,

=⇒ Q = PQ.

Multiplying both sides of (2.7.2) by Q^T = Q and following a similar procedure, we can show that P = PQ = Q.
Lemma 2.7.13. Suppose V_1 and V_2 (V_1 ⊆ V_2) are vector spaces in R^n and P_1, P_2, and P_1^⊥ are symmetric projections onto V_1, V_2, and V_1^⊥, respectively. Then,

1. P_1 P_2 = P_2 P_1 = P_1. (The smaller projection survives.)
2. P_1^⊥ P_1 = P_1 P_1^⊥ = 0.
3. P_2 − P_1 is a projection matrix. (What does it project onto?)
Proof. See Ravishanker and Dey, Page 62, Result 2.6.7.
2.8 Definiteness
Definition 2.8.1. Quadratic form. If x is a vector in R^n and A is a matrix in R^{n×n}, then the scalar x^T A x is known as a quadratic form in x.

The matrix A does not need to be symmetric, but any quadratic form x^T A x can be expressed in terms of a symmetric matrix, since the scalar x^T A x equals its own transpose x^T A^T x:

x^T A x = (x^T A x + x^T A^T x)/2 = x^T [(A + A^T)/2] x.

Thus, without loss of generality, the matrix associated with a quadratic form will be assumed symmetric.
Definition 2.8.2. Non-negative definite/Positive semi-definite. A quadratic form x^T A x and the corresponding matrix A are non-negative definite if x^T A x ≥ 0 for all x ∈ R^n.

Definition 2.8.3. Positive definite. A quadratic form x^T A x and the corresponding matrix A are positive definite if x^T A x > 0 for all x ∈ R^n, x ≠ 0 (so that x^T A x = 0 only when x = 0).
Properties related to definiteness
1. Positive definite matrices are non-singular. The inverse of a positive
definite matrix is also positive definite.
2. A symmetric matrix is positive (non-negative) definite iff all of its eigen-
values are positive (non-negative).
3. All diagonal elements and hence the trace of a positive definite matrix
are positive.
4. If A is symmetric positive definite, then there exists a non-singular matrix Q such that A = QQ^T.

5. A symmetric projection matrix is always positive semi-definite.
6. If A and B are non-negative definite, then so is A + B. If one of A or
B is positive definite, then so is A + B.
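A sketch checking properties 2 and 3 on a matrix that is positive definite by construction (property 4 run in reverse: A = QQ^T, plus a small ridge to be safe):

```python
import numpy as np

rng = np.random.default_rng(6)
Q = rng.standard_normal((3, 3))
A = Q @ Q.T + 0.1 * np.eye(3)     # symmetric positive definite by construction

lam = np.linalg.eigvalsh(A)
assert np.all(lam > 0)            # property 2: all eigenvalues positive
assert np.all(np.diag(A) > 0)     # property 3: positive diagonal (hence positive trace)

x = rng.standard_normal(3)
assert x @ A @ x > 0              # the quadratic form is positive for this x != 0
```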
2.9 Derivatives with respect to (and of) vectors
Definition 2.9.1. Derivative with respect to a vector. Let f(a) be any scalar function of the vector a_{n×1}. Then the derivative of f with respect to a is defined as the vector

\frac{δf}{δa} = \begin{pmatrix} \frac{δf}{δa_1} \\ \frac{δf}{δa_2} \\ \vdots \\ \frac{δf}{δa_n} \end{pmatrix},

and the derivative with respect to a^T is defined as

\frac{δf}{δa^T} = \left[ \frac{δf}{δa} \right]^T.

The second derivative of f with respect to a is written as the derivative of each of the elements of δf/δa with respect to a^T, stacked as the rows of an n × n matrix, i.e.,

\frac{δ^2 f}{δa \, δa^T} = \frac{δ}{δa^T} \left\{ \frac{δf}{δa} \right\} =
\begin{bmatrix}
\frac{δ^2 f}{δa_1^2} & \frac{δ^2 f}{δa_1 δa_2} & \ldots & \frac{δ^2 f}{δa_1 δa_n} \\
\frac{δ^2 f}{δa_2 δa_1} & \frac{δ^2 f}{δa_2^2} & \ldots & \frac{δ^2 f}{δa_2 δa_n} \\
\vdots & \vdots & \ddots & \vdots \\
\frac{δ^2 f}{δa_n δa_1} & \frac{δ^2 f}{δa_n δa_2} & \ldots & \frac{δ^2 f}{δa_n^2}
\end{bmatrix}.
Example 2.9.1. Derivatives of linear and quadratic functions of a vector.

1. \frac{δ(a^T b)}{δb} = a.

2. \frac{δ(b^T A b)}{δb} = Ab + A^T b.
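Both formulas can be verified against central finite differences (a numerical sketch; A, a, and b_0 are arbitrary):

```python
import numpy as np

rng = np.random.default_rng(7)
A = rng.standard_normal((3, 3))
a = rng.standard_normal(3)
b0 = rng.standard_normal(3)
eps = 1e-6

def num_grad(f, b):
    # central finite-difference gradient of a scalar function f at b
    return np.array([(f(b + eps * e) - f(b - eps * e)) / (2 * eps)
                     for e in np.eye(3)])

assert np.allclose(num_grad(lambda b: a @ b, b0), a)          # d(a^T b)/db = a
assert np.allclose(num_grad(lambda b: b @ A @ b, b0),
                   A @ b0 + A.T @ b0, atol=1e-5)              # d(b^T A b)/db = Ab + A^T b
```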
Derivatives with respect to matrices can be defined in a similar fashion. We will only remind ourselves of one result on matrix derivatives, which will come in handy when we talk about likelihood inference.

Lemma 2.9.2. If A_{n×n} is a symmetric non-singular matrix, then

\frac{δ \ln |A|}{δA} = A^{-1}.
2.10 Problems
1. Are the following sets of vectors linearly independent? If not, in each case find at least one vector that is dependent on the others in the set.
(a) v_1^T = (0, −1, 0), v_2^T = (0, 0, 1), v_3^T = (−1, 0, 0)
(b) v_1^T = (2, −2, 6), v_2^T = (1, 1, 1)
(c) v_1^T = (2, 2, 0, −2), v_2^T = (2, 0, 1, −1), v_3^T = (0, −2, 1, 1)
2. Show that a set of non-zero mutually orthogonal vectors v_1, v_2, \ldots, v_n is linearly independent.
3. Find the determinant and inverse of the matrices

(a) \begin{bmatrix} 1 & ρ \\ ρ & 1 \end{bmatrix}, \quad
\begin{bmatrix} 1 & ρ & ρ \\ ρ & 1 & ρ \\ ρ & ρ & 1 \end{bmatrix}, \quad
\begin{bmatrix} 1 & ρ & \ldots & ρ \\ ρ & 1 & \ldots & ρ \\ \vdots & \vdots & \ddots & \vdots \\ ρ & ρ & \ldots & 1 \end{bmatrix}_{n×n}

(b) \begin{bmatrix} 1 & ρ \\ ρ & 1 \end{bmatrix}, \quad
\begin{bmatrix} 1 & ρ & ρ^2 \\ ρ & 1 & ρ \\ ρ^2 & ρ & 1 \end{bmatrix}, \quad
\begin{bmatrix} 1 & ρ & ρ^2 & \ldots & ρ^n \\ ρ & 1 & ρ & \ldots & ρ^{n−1} \\ \vdots & \vdots & \vdots & \ddots & \vdots \\ ρ^n & ρ^{n−1} & ρ^{n−2} & \ldots & 1 \end{bmatrix}

(c) \begin{bmatrix} 1 & ρ \\ ρ & 1 \end{bmatrix}, \quad
\begin{bmatrix} 1 & ρ & 0 \\ ρ & 1 & ρ \\ 0 & ρ & 1 \end{bmatrix}, \quad
\begin{bmatrix} 1 & ρ & 0 & \ldots & 0 & 0 \\ ρ & 1 & ρ & \ldots & 0 & 0 \\ 0 & ρ & 1 & \ldots & 0 & 0 \\ \vdots & \vdots & \vdots & \ddots & \vdots & \vdots \\ 0 & 0 & 0 & \ldots & 1 & ρ \\ 0 & 0 & 0 & \ldots & ρ & 1 \end{bmatrix}_{n×n}