Chapter 2
A short review of matrix algebra
2.1 Vectors and vector spaces
Definition 2.1.1. A vector a of dimension n is a collection of n elements
typically written as
a = \begin{pmatrix} a_1 \\ a_2 \\ \vdots \\ a_n \end{pmatrix} = (a_i)_n.
Vectors of length 2 (two-dimensional vectors) can be thought of as points in the plane, and vectors of length 3 as points in three-dimensional space, as illustrated in Figure 2.1.
[Figure 2.1: Vectors in two and three dimensional spaces. The two-dimensional panel shows the points (-1.5, 2), (1, 1), and (1, -2); the three-dimensional panel, with axes x_1, x_2, x_3, shows the points (2.5, 1.5, 0.95) and (0, 1.5, 0.95).]
• A vector with all elements equal to zero is known as a zero vector and
is denoted by 0.
• A vector whose elements are stacked vertically is known as a column vector, whereas a vector whose elements are stacked horizontally will be referred to as a row vector. (Unless otherwise mentioned, all vectors will be taken to be column vectors.)
• A row vector representation of a column vector is known as its transpose. We will use the notation '′' or 'T' to indicate a transpose. For instance, if a = \begin{pmatrix} a_1 \\ a_2 \\ \vdots \\ a_n \end{pmatrix} and b = (a_1 \; a_2 \; \ldots \; a_n), then we write b = a^T or a = b^T.
• Vectors of the same dimension are conformable to algebraic operations such as addition and subtraction. The sum of two or more vectors of dimension n is another n-dimensional vector whose elements are the sums of the corresponding elements of the summand vectors. That is,

(a_i)_n ± (b_i)_n = (a_i ± b_i)_n.
• Vectors can be multiplied by a scalar:

c(a_i)_n = (ca_i)_n.
• The product of two vectors of the same dimension can be formed when one of them is a row vector and the other is a column vector. The result is called the inner, dot, or scalar product. If a = \begin{pmatrix} a_1 \\ a_2 \\ \vdots \\ a_n \end{pmatrix} and b = \begin{pmatrix} b_1 \\ b_2 \\ \vdots \\ b_n \end{pmatrix}, then

a^T b = a_1 b_1 + a_2 b_2 + \ldots + a_n b_n.
Definition 2.1.2. The length, magnitude, or Euclidean norm of a vector is defined as the square root of the sum of squares of its elements and is denoted by ||·||. For example,

||a|| = ||(a_i)_n|| = \sqrt{\sum_{i=1}^{n} a_i^2} = \sqrt{a^T a}.
• The length of the sum of two or more vectors is less than or equal to the sum of the lengths of each vector (the triangle inequality, a consequence of the Cauchy-Schwarz inequality |a^T b| ≤ ||a|| ||b||):

||a + b|| ≤ ||a|| + ||b||.
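These definitions are easy to check numerically; a small sketch in Python's numpy (the vectors are arbitrary):

```python
import numpy as np

a = np.array([1.0, 1.0])
b = np.array([-1.5, 2.0])

dot = a @ b                        # inner (scalar) product a^T b
norm_a = np.sqrt(a @ a)            # Euclidean norm ||a|| = sqrt(a^T a)
assert np.isclose(norm_a, np.linalg.norm(a))

# triangle inequality: ||a + b|| <= ||a|| + ||b||
assert np.linalg.norm(a + b) <= np.linalg.norm(a) + np.linalg.norm(b)
```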
Definition 2.1.3. A set of vectors {a_1, a_2, \ldots, a_m} is linearly dependent if at least one of them can be written as a linear combination of the others. In other words, {a_1, a_2, \ldots, a_m} are linearly dependent if there exists at least one non-zero c_j such that

\sum_{j=1}^{m} c_j a_j = 0. \qquad (2.1.1)

Equivalently, for some k with c_k ≠ 0,

a_k = -(1/c_k) \sum_{j \neq k} c_j a_j.
Definition 2.1.4. A set of vectors is linearly independent if it is not linearly dependent. That is, in order for (2.1.1) to hold, all c_j's must be equal to zero.
Definition 2.1.5. Two vectors a and b are orthogonal if their scalar product is zero. That is, a^T b = 0, and we write a ⊥ b.
Definition 2.1.6. A set of vectors is said to be mutually orthogonal if
members of any pair of vectors belonging to the set are orthogonal.
• If non-zero vectors are mutually orthogonal, then they are linearly independent.
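For instance, the following numpy sketch (with three hypothetical vectors) confirms mutual orthogonality via the Gram matrix V^T V and linear independence via the rank:

```python
import numpy as np

# three mutually orthogonal, non-zero vectors in R^3
v1 = np.array([1.0, 1.0, 0.0])
v2 = np.array([1.0, -1.0, 0.0])
v3 = np.array([0.0, 0.0, 2.0])

V = np.column_stack([v1, v2, v3])
G = V.T @ V                            # Gram matrix; zero off-diagonals <=> mutual orthogonality
assert np.allclose(G, np.diag(np.diag(G)))
assert np.linalg.matrix_rank(V) == 3   # full rank <=> linearly independent
```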
Definition 2.1.7. Vector space. A set of vectors which is closed under addition and scalar multiplication is known as a vector space.

Thus, if V is a vector space, then for any two vectors a and b from V and any two constants c_a and c_b, (i) c_a a + c_b b ∈ V, and in particular (ii) c_a a ∈ V.
Definition 2.1.8. Span. All possible linear combinations of a set of linearly independent vectors form the span of that set.

Thus, if A = {a_1, a_2, \ldots, a_m} is a set of m linearly independent vectors, then the span of A is given by

span(A) = \left\{ a : a = \sum_{j=1}^{m} c_j a_j \right\},

for some numbers c_j, j = 1, 2, \ldots, m. Viewed differently, the set of vectors A generates the vector space span(A) and is referred to as a basis of span(A).
Formally,

• Let a_1, a_2, \ldots, a_m be a set of m linearly independent n-dimensional vectors in a vector space V that spans V. Then a_1, a_2, \ldots, a_m together form a basis of V, and the dimension of a vector space is defined as the number of vectors in its basis. That is, dim(V) = m.
2.2 Matrix
Definition 2.2.1. A matrix is a rectangular or square arrangement of numbers. A matrix with m rows and n columns is referred to as an m × n (read as 'm by n') matrix. An m × n matrix A with (i, j)th element a_{ij} is written as

A = (a_{ij})_{m×n} =
\begin{bmatrix}
a_{11} & a_{12} & \ldots & a_{1n} \\
a_{21} & a_{22} & \ldots & a_{2n} \\
\vdots & \vdots & \ddots & \vdots \\
a_{m1} & a_{m2} & \ldots & a_{mn}
\end{bmatrix}.
If m = n then the matrix is a square matrix.
Definition 2.2.2. A diagonal matrix is a square matrix whose off-diagonal elements are all zero.

A diagonal matrix with diagonal elements a_1, a_2, \ldots, a_n is written as

diag(a_1, a_2, \ldots, a_n) =
\begin{bmatrix}
a_1 & 0 & \ldots & 0 \\
0 & a_2 & \ldots & 0 \\
\vdots & \vdots & \ddots & \vdots \\
0 & 0 & \ldots & a_n
\end{bmatrix}.
Definition 2.2.3. An n × n diagonal matrix with all diagonal elements equal to 1 is known as the identity matrix of order n and is denoted by I_n.
A similar notation J_{mn} is sometimes used for an m × n matrix with all elements equal to 1, i.e.,

J_{mn} =
\begin{bmatrix}
1 & 1 & \ldots & 1 \\
1 & 1 & \ldots & 1 \\
\vdots & \vdots & \ddots & \vdots \\
1 & 1 & \ldots & 1
\end{bmatrix}
= [1_m \; 1_m \; \ldots \; 1_m].
Like vectors, matrices with the same dimensions can be added together, and the result is another matrix of the same dimensions. Any matrix is conformable to multiplication by a scalar. If A = (a_{ij})_{m×n} and B = (b_{ij})_{m×n}, then

1. A ± B = (a_{ij} ± b_{ij})_{m×n}, and
2. cA = (ca_{ij})_{m×n}.
Definition 2.2.4. The transpose of a matrix A = (a_{ij})_{m×n} is defined by A^T = (a_{ji})_{n×m}.

• If A = A^T, then A is symmetric.
• (A + B)^T = A^T + B^T.
Definition 2.2.5. Matrix product. If A = (a_{ij})_{m×n} and B = (b_{ij})_{n×p}, then

AB = (c_{ij})_{m×p}, \quad c_{ij} = \sum_{k=1}^{n} a_{ik} b_{kj} = a_i^T b_j,

where a_i^T is the ith row of A (imagined as a vector) and b_j is the jth column (vector) of B.

• (AB)^T = B^T A^T,
• (AB)C = A(BC), whenever defined,
• A(B + C) = AB + AC, whenever defined,
• J_{mn} J_{np} = n J_{mp}.
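As a quick numerical check of these product rules (a numpy sketch with arbitrary dimensions):

```python
import numpy as np

rng = np.random.default_rng(0)
A = rng.standard_normal((3, 4))
B = rng.standard_normal((4, 2))
C = rng.standard_normal((2, 5))

assert np.allclose((A @ B).T, B.T @ A.T)       # (AB)^T = B^T A^T
assert np.allclose((A @ B) @ C, A @ (B @ C))   # (AB)C = A(BC)

m, n, p = 3, 4, 2
assert np.allclose(np.ones((m, n)) @ np.ones((n, p)),
                   n * np.ones((m, p)))        # J_mn J_np = n J_mp
```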
2.3 Rank, Column Space and Null Space
Definition 2.3.1. The rank of a matrix A is the number of linearly independent rows or columns of A. We denote it by rank(A).
• rank(A^T) = rank(A).
• An m × n matrix A with rank m (respectively, n) is said to have full row (column) rank.
• If A is a square matrix with n rows and rank(A) < n, then A is singular and the inverse does not exist.
• rank(AB) ≤ min(rank(A), rank(B)).
• rank(A^T A) = rank(AA^T) = rank(A) = rank(A^T).
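These rank properties can be spot-checked with numpy's matrix_rank; the sketch below constructs A as a product so that its rank is at most 3:

```python
import numpy as np

rng = np.random.default_rng(1)
A = rng.standard_normal((5, 3)) @ rng.standard_normal((3, 4))  # rank(A) <= 3

r = np.linalg.matrix_rank(A)
assert np.linalg.matrix_rank(A.T) == r        # rank(A^T) = rank(A)
assert np.linalg.matrix_rank(A.T @ A) == r    # rank(A^T A) = rank(A)
assert np.linalg.matrix_rank(A @ A.T) == r    # rank(AA^T) = rank(A)
```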
Definition 2.3.2. Inverse of a square matrix. If A is a square matrix with n rows and rank(A) = n, then A is called non-singular, and there exists a matrix A^{-1} such that AA^{-1} = A^{-1}A = I_n. The matrix A^{-1} is known as the inverse of A.

• A^{-1} is unique.
• If A and B are invertible and have the same dimension, then (AB)^{-1} = B^{-1}A^{-1}.
• (cA)^{-1} = A^{-1}/c for any non-zero scalar c.
• (A^T)^{-1} = (A^{-1})^T.
Definition 2.3.3. Column space. The column space of a matrix A is the vector space generated by the columns of A. If A = (a_{ij})_{m×n} = (a_1 \; a_2 \; \ldots \; a_n), then the column space of A, denoted by C(A) or R(A), is given by

C(A) = \left\{ a : a = \sum_{j=1}^{n} c_j a_j \right\},

for scalars c_j, j = 1, 2, \ldots, n.
Alternatively, a ∈ C(A) iff there exists a vector c such that

a = Ac.

• What is the dimension of the vectors in C(A)?
• How many vectors will a basis of C(A) have?
• dim(C(A)) = ?
• If A = BC, then C(A) ⊆ C(B).
• If C(A) ⊆ C(B), then there exists a matrix C such that A = BC.
Example 2.3.1. Find a basis for the column space of the matrix

A = \begin{bmatrix} -1 & 2 & -1 \\ 1 & 1 & 4 \\ 0 & 2 & 2 \end{bmatrix}.
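A sketch of how the example can be checked with sympy: here a_3 = 3a_1 + a_2, so rank(A) = 2 and the first two columns form a basis of C(A).

```python
import sympy as sp

A = sp.Matrix([[-1, 2, -1],
               [ 1, 1,  4],
               [ 0, 2,  2]])

print(A.rank())          # 2, so a basis of C(A) has two vectors
print(A.columnspace())   # the two pivot columns: a basis of C(A)
# the dependency: column 3 = 3*column 1 + column 2
assert A[:, 2] == 3 * A[:, 0] + A[:, 1]
```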
Definition 2.3.4. Null Space. The null space of an m × n matrix A is defined as the vector space consisting of the solutions of the system of equations Ax = 0. The null space of A is denoted by N(A) and can be written as

N(A) = \{x : Ax = 0\}.

• What is the dimension of the vectors in N(A)?
• How many vectors are there in a basis of N(A)?
• dim(N(A)) = n − rank(A), known as the nullity of A.
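Numerically, an orthonormal basis of N(A) can be obtained with scipy's null_space; the sketch below reuses the matrix of Example 2.3.1, whose nullity is 3 − 2 = 1:

```python
import numpy as np
from scipy.linalg import null_space

A = np.array([[-1.0, 2.0, -1.0],
              [ 1.0, 1.0,  4.0],
              [ 0.0, 2.0,  2.0]])

N = null_space(A)                # columns form an orthonormal basis of N(A)
assert np.allclose(A @ N, 0.0)   # every basis vector solves Ax = 0
assert N.shape[1] == A.shape[1] - np.linalg.matrix_rank(A)  # nullity = n - rank(A)
```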
Definition 2.3.5. Orthogonal complements. Two subspaces V_1 and V_2 of a vector space V form orthogonal complements relative to V if every vector in V_1 is orthogonal to every vector in V_2. We write V_1 = V_2^⊥ or, equivalently, V_2 = V_1^⊥.
• V_1 ∩ V_2 = {0}.
• If dim(V_1) = r, then dim(V_2) = n − r, where n is the dimension of the vectors in the vector space V.
• Every vector a in V can be uniquely decomposed into two components a_1 and a_2 such that

a = a_1 + a_2, \quad a_1 ∈ V_1, \; a_2 ∈ V_2. \qquad (2.3.1)

• If (2.3.1) holds, then

||a||^2 = ||a_1||^2 + ||a_2||^2. \qquad (2.3.2)
How?
Proof of (2.3.1).

• Existence. Suppose the decomposition is not possible. Then a is independent of the basis vectors of V_1 and V_2 combined. But that would make the total number of independent vectors in V equal to n + 1. Is that possible?

• Uniqueness. Suppose two such decompositions are possible, namely,

a = a_1 + a_2, \quad a_1 ∈ V_1, \; a_2 ∈ V_2,

and

a = b_1 + b_2, \quad b_1 ∈ V_1, \; b_2 ∈ V_2.

Then,

a_1 − b_1 = b_2 − a_2.

This implies

a_1 = b_1 and b_2 = a_2. (Why?)
Proof of (2.3.2).

• From (2.3.1), and since a_1 ⊥ a_2 implies a_1^T a_2 = a_2^T a_1 = 0,

||a||^2 = a^T a = (a_1 + a_2)^T (a_1 + a_2) = a_1^T a_1 + a_1^T a_2 + a_2^T a_1 + a_2^T a_2 = ||a_1||^2 + ||a_2||^2. \qquad (2.3.3)

This result is known as the Pythagorean theorem.
[Figure 2.2: Orthogonal decomposition (direct sum). With V = {(x, y) : x, y ∈ R} = R^2, V_1 = {(x, y) ∈ R^2 : x = y}, and V_2 = {(x, y) ∈ R^2 : x + y = 0}, the point (2, 1) decomposes as (3/2, 3/2) + (1/2, −1/2).]
Theorem 2.3.2. If A is an m × n matrix, and C(A) and N(A^T) respectively denote the column space of A and the null space of A^T, then

C(A) = N(A^T)^⊥.

Proof. • dim(C(A)) = rank(A) = rank(A^T) = r (say), and dim(N(A^T)) = m − r.

• Suppose a_1 ∈ C(A) and a_2 ∈ N(A^T). Then there exists a c such that

Ac = a_1,

and

A^T a_2 = 0.

Now,

a_1^T a_2 = c^T A^T a_2 = 0. \qquad (2.3.4)
• (More on orthogonality.) If V_1 ⊆ V_2, and V_1^⊥ and V_2^⊥ respectively denote their orthogonal complements, then

V_2^⊥ ⊆ V_1^⊥.
Proof. (Proof of the result on the previous page.) Suppose a_1 ∈ V_1. Then we can write

a_1 = A_1 c_1,

for some vector c_1, where the columns of the matrix A_1 are the basis vectors of V_1. Similarly,

a_2 = A_2 c_2, \quad ∀ a_2 ∈ V_2.

In other words,

V_1 = C(A_1)

and

V_2 = C(A_2).

Since V_1 ⊆ V_2, there exists a matrix B such that A_1 = A_2 B (see the column space properties following Definition 2.3.3). Now let a ∈ V_2^⊥ =⇒ a ∈ N(A_2^T), implying

A_2^T a = 0.

But

A_1^T a = B^T A_2^T a = 0,

proving that a ∈ N(A_1^T) = V_1^⊥.
2.4 Trace
The trace of a matrix will come in handy when we talk about the distribution of quadratic forms.

Definition 2.4.1. Trace of a square matrix is the sum of its diagonal elements. Thus, if A = (a_{ij})_{n×n}, then

trace(A) = \sum_{i=1}^{n} a_{ii}.
• trace(I_n) = n.
• trace(A) = trace(A^T).
• trace(A + B) = trace(A) + trace(B).
• trace(AB) = trace(BA).
• trace(A^T A) = trace(AA^T) = \sum_{i=1}^{n} \sum_{j=1}^{n} a_{ij}^2.
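A quick numpy check of the last two properties (A and B are arbitrary rectangular matrices, so AB and BA have different sizes but the same trace):

```python
import numpy as np

rng = np.random.default_rng(2)
A = rng.standard_normal((3, 4))
B = rng.standard_normal((4, 3))

assert np.isclose(np.trace(A @ B), np.trace(B @ A))  # trace(AB) = trace(BA)
assert np.isclose(np.trace(A.T @ A), np.sum(A**2))   # trace(A^T A) = sum of squared elements
```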
2.5 Determinants
Definition 2.5.1. Determinant. The determinant of a scalar is the scalar itself. The determinant of an n × n matrix A = (a_{ij})_{n×n} is a scalar, written as |A|, where

|A| = \sum_{j=1}^{n} a_{ij} (−1)^{i+j} |M_{ij}|,

for any fixed i, where the determinant |M_{ij}| of the matrix M_{ij} is known as the minor of a_{ij}, and the matrix M_{ij} is obtained by deleting the ith row and jth column of A.
• |A| = |A^T|.
• |diag(d_i, i = 1, 2, \ldots, n)| = \prod_{i=1}^{n} d_i. This also holds if the matrix is an upper or lower triangular matrix with diagonal elements d_i, i = 1, 2, \ldots, n.
• |AB| = |A||B|.
• |cA| = c^n |A|.
• If A is singular (rank(A) < n), then |A| = 0.
• |A^{-1}| = 1/|A|.
• The determinants of block-diagonal (block-triangular) matrices work the way you would expect. For instance,

\begin{vmatrix} A & C \\ 0 & B \end{vmatrix} = |A||B|.

In general, provided A is non-singular,

\begin{vmatrix} A & B \\ C & D \end{vmatrix} = |A| \, |D − CA^{-1}B|.
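Both block formulas can be spot-checked numerically; in the sketch below the blocks are randomly generated, and A is (almost surely) non-singular as the second formula requires:

```python
import numpy as np

rng = np.random.default_rng(3)
A = rng.standard_normal((3, 3))
B = rng.standard_normal((3, 2))
C = rng.standard_normal((2, 3))
D = rng.standard_normal((2, 2))

# block-triangular case: |[A B; 0 D]| = |A||D|
M1 = np.block([[A, B], [np.zeros((2, 3)), D]])
assert np.isclose(np.linalg.det(M1), np.linalg.det(A) * np.linalg.det(D))

# general case: |[A B; C D]| = |A| |D - C A^{-1} B|
M2 = np.block([[A, B], [C, D]])
schur = D - C @ np.linalg.inv(A) @ B
assert np.isclose(np.linalg.det(M2), np.linalg.det(A) * np.linalg.det(schur))
```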
2.6 Eigenvalues and Eigenvectors
Definition 2.6.1. Eigenvalues and eigenvectors. The eigenvalues (λ) of a square matrix A_{n×n} and the corresponding eigenvectors (a) are defined by the set of equations

Aa = λa. \qquad (2.6.1)

Equation (2.6.1) leads to the polynomial equation

|A − λI_n| = 0. \qquad (2.6.2)

For a given eigenvalue, the corresponding eigenvectors are obtained as solutions to equation (2.6.1); these solutions constitute the eigenspace of the matrix A for that eigenvalue.
Example 2.6.1. Find the eigenvalues and eigenvectors of the matrix

A = \begin{bmatrix} -1 & 2 & 0 \\ 1 & 2 & 1 \\ 0 & 2 & -1 \end{bmatrix}.
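For this matrix the eigenvalues turn out to be −2, −1, and 3; a quick numerical check with numpy:

```python
import numpy as np

A = np.array([[-1.0, 2.0,  0.0],
              [ 1.0, 2.0,  1.0],
              [ 0.0, 2.0, -1.0]])

vals, vecs = np.linalg.eig(A)
print(np.sort(vals.real))            # approximately [-2., -1., 3.]
for lam, a in zip(vals, vecs.T):     # each pair satisfies A a = lambda a
    assert np.allclose(A @ a, lam * a)
```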
Since in this course our focus will be on the eigenvalues of symmetric matrices, henceforth we state the results on eigenvalues and eigenvectors as applied to a symmetric matrix A. Some of the results will, however, hold for a general A. If you are interested, please consult a linear algebra book such as Harville's Matrix Algebra From a Statistician's Perspective.
Definition 2.6.2. Spectrum. The spectrum of a matrix A is defined as the
set of distinct (real) eigenvalues {λ1, λ2, . . . , λk} of A.
• The eigenspace L of a matrix A corresponding to an eigenvalue λ can be written as

L = N(A − λI_n).

• trace(A) = \sum_{i=1}^{n} λ_i.
• |A| = \prod_{i=1}^{n} λ_i.
• |I_n ± A| = \prod_{i=1}^{n} (1 ± λ_i).
• Eigenvectors associated with different eigenvalues are mutually orthogonal, or can be chosen to be mutually orthogonal, and hence are linearly independent.
• rank(A) is the number of non-zero λ_i's.
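For a symmetric matrix these properties are easy to confirm with numpy's eigvalsh (a sketch on a randomly generated symmetric matrix):

```python
import numpy as np

rng = np.random.default_rng(4)
X = rng.standard_normal((4, 4))
A = (X + X.T) / 2                    # a random symmetric matrix

lam = np.linalg.eigvalsh(A)
assert np.isclose(np.trace(A), lam.sum())          # trace = sum of eigenvalues
assert np.isclose(np.linalg.det(A), lam.prod())    # determinant = product of eigenvalues
assert np.isclose(np.linalg.det(np.eye(4) + A),
                  np.prod(1 + lam))                # |I + A| = prod(1 + lambda_i)
```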
The proofs of some of these results are easily obtained through the application of a special theorem called the spectral decomposition theorem.
Definition 2.6.3. Orthogonal Matrix. A matrix A_{n×n} is said to be orthogonal if

A^T A = I_n = A A^T.

This immediately implies that A^{-1} = A^T.
Theorem 2.6.2. Spectral decomposition. Any symmetric matrix A can be decomposed as

A = BΛB^T,

where Λ = diag(λ_1, \ldots, λ_n) is the diagonal matrix of eigenvalues and B is an orthogonal matrix having the eigenvectors of A as its columns, namely, B = [a_1 \; a_2 \; \ldots \; a_n], where the a_j's are orthonormal eigenvectors corresponding to the eigenvalues λ_j, j = 1, 2, \ldots, n.
Proof.
Outline of the proof of the spectral decomposition theorem:

• By definition, B satisfies

AB = BΛ, \qquad (2.6.3)

and

B^T B = I_n.

Then from (2.6.3),

A = BΛB^{-1} = BΛB^T.
The spectral decomposition of a symmetric matrix allows one to form a 'square root' of that matrix. Provided the eigenvalues are non-negative, if we define

\sqrt{A} = B \sqrt{Λ} B^T,

it is easy to verify that

\sqrt{A} \sqrt{A} = A.

In general, one can define

A^α = B Λ^α B^T, \quad α ∈ R,

whenever λ_i^α is defined for each i.
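A sketch of the square-root construction in numpy, using eigh for the spectral decomposition; A is built as XX^T so that its eigenvalues are non-negative, as the square root requires:

```python
import numpy as np

rng = np.random.default_rng(5)
X = rng.standard_normal((3, 3))
A = X @ X.T                          # symmetric with non-negative eigenvalues

lam, B = np.linalg.eigh(A)           # A = B diag(lam) B^T
lam = np.clip(lam, 0.0, None)        # guard against tiny negative round-off
sqrtA = B @ np.diag(np.sqrt(lam)) @ B.T
assert np.allclose(sqrtA @ sqrtA, A)         # sqrt(A) sqrt(A) = A
```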
Example 2.6.3. Find a matrix B and the matrix Λ (the diagonal matrix of eigenvalues) such that

A = \begin{bmatrix} 6 & -2 \\ -2 & 9 \end{bmatrix} = B^T Λ B.
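For this matrix the eigenvalues are 5 and 10. A numerical sketch with numpy's eigh, whose B has the orthonormal eigenvectors as columns (so A = BΛB^T; the transpose of this B gives the B^TΛB form asked for):

```python
import numpy as np

A = np.array([[ 6.0, -2.0],
              [-2.0,  9.0]])

lam, B = np.linalg.eigh(A)
print(lam)                                    # [ 5. 10.]
assert np.allclose(B.T @ B, np.eye(2))        # B is orthogonal
assert np.allclose(B @ np.diag(lam) @ B.T, A)
```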
2.7 Solutions to linear systems of equations
A linear system of m equations in n unknowns is written as

Ax = b, \qquad (2.7.1)

where A_{m×n} is a matrix, b is a vector of known constants, and x is an unknown vector. The goal usually is to find a value (solution) of x such that (2.7.1) is satisfied. When b = 0, the system is said to be homogeneous. It is easy to see that homogeneous systems are always consistent, that is, they have at least one solution (x = 0).

• The solution set of a homogeneous system of equations Ax = 0 forms a vector space and is given by N(A).
• A non-homogeneous system of equations Ax = b is consistent iff

rank(A, b) = rank(A).

– The system of linear equations Ax = b is consistent iff b ∈ C(A).
– If A is square and rank(A) = n, then Ax = b has a unique solution given by x = A^{-1}b.
2.7.1 G-inverse
One way to obtain the solutions to a system of equations (2.7.1) is simply to transform the augmented matrix (A, b) into row-reduced echelon form. However, such forms are not convenient for further algebraic treatment. In analogy with the inverse of a non-singular matrix, one can define an inverse, referred to as a generalized inverse (in short, g-inverse), of any matrix, square or rectangular, singular or non-singular. This generalized inverse makes finding the solutions of linear equations easier. Theoretical developments based on the g-inverse are very powerful for solving problems arising in linear models.
Definition 2.7.1. G-inverse. A g-inverse of a matrix A_{m×n} is a matrix G_{n×m} that satisfies the relationship

AGA = A.
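Every matrix has at least one g-inverse; in particular, the Moore-Penrose inverse computed by numpy's pinv is one, so the defining relationship can be verified directly even for a singular matrix:

```python
import numpy as np

A = np.array([[1.0, 2.0],
              [2.0, 4.0]])           # singular: rank 1

G = np.linalg.pinv(A)                # the Moore-Penrose inverse is one g-inverse
assert np.allclose(A @ G @ A, A)     # AGA = A
```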
The following two lemmas are useful for finding a g-inverse of a matrix A.

Lemma 2.7.1. Suppose rank(A_{m×n}) = r, and A_{m×n} can be factorized as

A_{m×n} = \begin{bmatrix} A_{11} & A_{12} \\ A_{21} & A_{22} \end{bmatrix}

such that A_{11} is of dimension r × r with rank(A_{11}) = r. Then a g-inverse of A is given by

G_{n×m} = \begin{bmatrix} A_{11}^{-1} & 0 \\ 0 & 0 \end{bmatrix}.
Example 2.7.2. Find a g-inverse of the matrix

A = \begin{bmatrix} 1 & 1 & 1 & 1 \\ 0 & 1 & 0 & -1 \\ 1 & 0 & 1 & 2 \end{bmatrix}.
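Here rank(A) = 2 (the third row equals row 1 minus row 2), and the leading 2 × 2 block is non-singular, so Lemma 2.7.1 applies with A_{11} taken as that block; a numerical check:

```python
import numpy as np

A = np.array([[1.0, 1.0, 1.0,  1.0],
              [0.0, 1.0, 0.0, -1.0],
              [1.0, 0.0, 1.0,  2.0]])   # rank 2: row3 = row1 - row2

G = np.zeros((4, 3))                    # G = [[A11^{-1}, 0], [0, 0]]
G[:2, :2] = np.linalg.inv(A[:2, :2])
assert np.allclose(A @ G @ A, A)        # so G is a g-inverse of A
```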
Suppose you do not have a non-singular r × r leading block to begin with. What do you do then?

Lemma 2.7.3. Suppose rank(A_{m×n}) = r, and there exist non-singular matrices B and C such that

BAC = \begin{bmatrix} D & 0 \\ 0 & 0 \end{bmatrix},

where D is a diagonal matrix with rank(D) = r. Then a g-inverse of A is given by

G_{n×m} = C \begin{bmatrix} D^{-1} & 0 \\ 0 & 0 \end{bmatrix} B.
• rank(G) ≥ rank(A).
• A g-inverse of a matrix is not necessarily unique. For instance,

– If G is a g-inverse of a symmetric matrix A, then GAG is also a g-inverse of A.
– If G is a g-inverse of a symmetric matrix A, then G_1 = (G + G^T)/2 is also a g-inverse of A.
– A g-inverse of a diagonal matrix D = diag(d_1, \ldots, d_n) is another diagonal matrix D^g = diag(d_1^g, \ldots, d_n^g), where

d_i^g = \begin{cases} 1/d_i, & d_i \neq 0, \\ 0, & d_i = 0. \end{cases}

Again, as you can see, we concentrate on symmetric matrices, as these matrix properties will mostly be applied to symmetric matrices in this course.
Another way of finding a g-inverse of a symmetric matrix.

Lemma 2.7.4. Let A be an n × n symmetric matrix with spectral decomposition A = BΛB^T. Then a g-inverse of A is given by

G = B Λ^g B^T,

where B and Λ bear the same meaning as in the spectral decomposition theorem, and Λ^g is the g-inverse of the diagonal matrix Λ described above.
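A sketch of this lemma in numpy on a singular symmetric matrix; the eigenvalues are inverted only where non-zero, exactly as in the diagonal g-inverse above:

```python
import numpy as np

A = np.array([[1.0, 1.0],
              [1.0, 1.0]])                 # symmetric, rank 1 (eigenvalues 0 and 2)

lam, B = np.linalg.eigh(A)                 # A = B diag(lam) B^T
lam_g = np.array([1.0 / l if abs(l) > 1e-12 else 0.0 for l in lam])
G = B @ np.diag(lam_g) @ B.T
assert np.allclose(A @ G @ A, A)           # AGA = A
```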
2.7.2 Back to the system of equations
Theorem 2.7.5. If Ax = b is a consistent system of linear equations and G is a g-inverse of A, then Gb is a solution to Ax = b.
Proof.
Theorem 2.7.6. x^* is a solution to the consistent system of linear equations Ax = b iff there exists a vector c such that

x^* = Gb + (I − GA)c,

for some g-inverse G of A.
Proof.
Proof of Theorem 2.7.6.

If part. For any compatible vector c and for any g-inverse G of A, define

x^* = Gb + (I − GA)c.

Then, since the system is consistent, b ∈ C(A) and hence AGb = b (Theorem 2.7.5), so that

Ax^* = A[Gb + (I − GA)c] = AGb + (A − AGA)c = b + 0 = b.

Only if part. Suppose x^* is a solution to the consistent system of linear equations Ax = b. Then

x^* = Gb + (x^* − Gb) = Gb + (x^* − GAx^*) = Gb + (I − GA)c,

where c = x^*.
Remark 2.7.1. 1. Any solution to the system of equations Ax = b can be written as the sum of two components: one being a solution by itself and the other lying in the null space of A.

2. Once one g-inverse of A has been computed, all possible solutions of Ax = b have been identified (via Theorem 2.7.6).
Example 2.7.7. Give a general form of the solutions to the system of equations

\begin{bmatrix} 1 & 2 & 1 & 0 \\ 1 & 1 & 1 & 1 \\ 0 & 1 & 0 & -1 \\ 1 & -1 & 1 & 3 \end{bmatrix}
\begin{pmatrix} x_1 \\ x_2 \\ x_3 \\ x_4 \end{pmatrix}
= \begin{pmatrix} 5 \\ 3 \\ 2 \\ -1 \end{pmatrix}.
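A numerical sketch of this example, taking G to be the Moore-Penrose inverse: Gb is one particular solution, and adding (I − GA)c for an arbitrary c yields another, as Theorem 2.7.6 asserts:

```python
import numpy as np

A = np.array([[1.0,  2.0, 1.0,  0.0],
              [1.0,  1.0, 1.0,  1.0],
              [0.0,  1.0, 0.0, -1.0],
              [1.0, -1.0, 1.0,  3.0]])
b = np.array([5.0, 3.0, 2.0, -1.0])

# consistency: rank(A, b) = rank(A)
assert np.linalg.matrix_rank(np.column_stack([A, b])) == np.linalg.matrix_rank(A)

G = np.linalg.pinv(A)                    # one choice of g-inverse
x_star = G @ b                           # a particular solution
assert np.allclose(A @ x_star, b)

c = np.array([1.0, -2.0, 0.5, 3.0])      # arbitrary vector
x_gen = x_star + (np.eye(4) - G @ A) @ c
assert np.allclose(A @ x_gen, b)         # still a solution
```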
Idempotent matrix and projections
Definition 2.7.2. Idempotent matrix. A square matrix B is idempotent if B^2 = BB = B.

• If B is idempotent, then rank(B) = trace(B).
• If B_{n×n} is idempotent, then I_n − B is also idempotent, with rank(I_n − B) = n − trace(B).
• If B_{n×n} is idempotent with rank(B) = n, then B = I_n.

Lemma 2.7.8. If the m × n matrix A has rank r, then the matrix I_n − GA is idempotent with rank n − r, where G is a g-inverse of A.
Definition 2.7.3. Projection. A square matrix P_{n×n} is a projection onto a vector space V ⊆ R^n iff all three of the following hold: (a) P is idempotent, (b) ∀x ∈ R^n, Px ∈ V, and (c) ∀x ∈ V, Px = x. An idempotent matrix is a projection onto its own column space.
Example 2.7.9. Let the vector space be defined as

V = {(v_1, v_2) : v_2 = k v_1} ⊆ R^2,

for some non-zero real constant k. Consider the matrix

P = \begin{bmatrix} t & (1 − t)/k \\ kt & (1 − t) \end{bmatrix}

for any real number t ∈ R. Notice that

(a) PP = P,

(b) for any x = (x_1, x_2)^T ∈ R^2, Px = (t x_1 + (1 − t)x_2/k, \; kt x_1 + (1 − t)x_2)^T ∈ V,

(c) for any x = (x_1, x_2)^T = (x_1, k x_1)^T ∈ V, Px = x.
Thus, P is a projection onto the vector space V. Notice that the projection P is not unique, as it depends on the choice of t. Consider k = 1. Then V is the linear space representing the line with unit slope passing through the origin. When multiplied by the projection matrix (for t = 2)

P_1 = \begin{bmatrix} 2 & -1 \\ 2 & -1 \end{bmatrix},

any point in the two-dimensional real space produces a point in V. For instance, the point (1, .5), when multiplied by P_1, produces (1.5, 1.5), which belongs to V.
[Figure 2.3: Projections. The point (1, 1/2) is projected onto V = {(x, y) : x = y} at (1.5, 1.5) by P_1 and at (0.75, 0.75) by P_2.]
But the projection

P_2 = \begin{bmatrix} .5 & .5 \\ .5 & .5 \end{bmatrix}

projects the point (1, .5) onto V at (0.75, 0.75). See Figure 2.3.
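The two projections of this example can be verified directly in numpy (with k = 1, t = 2 for P_1 and t = 1/2 for P_2):

```python
import numpy as np

P1 = np.array([[2.0, -1.0],
               [2.0, -1.0]])        # t = 2, k = 1
P2 = np.array([[0.5,  0.5],
               [0.5,  0.5]])        # t = 1/2, k = 1: the symmetric projection

x = np.array([1.0, 0.5])
for P in (P1, P2):
    assert np.allclose(P @ P, P)    # idempotent
    y = P @ x
    assert np.isclose(y[1], y[0])   # P x lands in V = {(v1, v2) : v2 = v1}

print(P1 @ x, P2 @ x)               # [1.5 1.5] and [0.75 0.75], as in Figure 2.3
```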
Back to g-inverse and solution of system of equations
Lemma 2.7.10. If G is a g-inverse of A, then I − GA is a projection onto N(A).

Proof. Left as an exercise.

Lemma 2.7.11. If G is a g-inverse of A, then AG is a projection onto C(A).
Proof. Left as an exercise (Done in class).
Lemma 2.7.12. If P and Q are symmetric and both project onto the same space V ⊆ R^n, then P = Q.
Proof.
By definition, for any x ∈ R^n, Px ∈ V and Qx ∈ V. Let

Px = x_1 ∈ V \quad and \quad Qx = x_2 ∈ V.

Then,

(P − Q)x = x_1 − x_2, \quad ∀x ∈ R^n. \qquad (2.7.2)

Multiplying both sides by P^T = P, and using the fact that P leaves x_1, x_2 ∈ V fixed, we get

P(P − Q)x = P(x_1 − x_2) = x_1 − x_2, \quad ∀x ∈ R^n. \qquad (2.7.3)

Subtracting (2.7.2) from (2.7.3), we obtain

[P(P − Q) − (P − Q)]x = 0, \quad ∀x ∈ R^n,

=⇒ Q = PQ.

Multiplying both sides of (2.7.2) by Q^T = Q and following a similar procedure, we can show that P = PQ = Q.
Lemma 2.7.13. Suppose V_1 and V_2 (V_1 ⊆ V_2) are vector spaces in R^n and P_1, P_2, and P_1^⊥ are symmetric projections onto V_1, V_2, and V_1^⊥, respectively. Then,

1. P_1 P_2 = P_2 P_1 = P_1. (The smaller projection survives.)
2. P_1^⊥ P_1 = P_1 P_1^⊥ = 0.
3. P_2 − P_1 is a projection matrix. (What does it project onto?)
Proof. See Ravishanker and Dey, Page 62, Result 2.6.7.
2.8 Definiteness
Definition 2.8.1. Quadratic form. If x is a vector in R^n and A is a matrix in R^{n×n}, then the scalar x^T A x is known as a quadratic form in x.

The matrix A does not need to be symmetric, but any quadratic form x^T A x can be expressed in terms of a symmetric matrix, since the scalar x^T A x equals its own transpose x^T A^T x:

x^T A x = (x^T A x + x^T A^T x)/2 = x^T [(A + A^T)/2] x.

Thus, without loss of generality, the matrix associated with a quadratic form will be assumed symmetric.
Definition 2.8.2. Non-negative definite/Positive semi-definite. A quadratic form x^T A x and the corresponding matrix A are non-negative definite if x^T A x ≥ 0 for all x ∈ R^n.

Definition 2.8.3. Positive definite. A quadratic form x^T A x and the corresponding matrix A are positive definite if x^T A x > 0 for all x ∈ R^n, x ≠ 0 (so that x^T A x = 0 only when x = 0).
Properties related to definiteness
1. Positive definite matrices are non-singular. The inverse of a positive
definite matrix is also positive definite.
2. A symmetric matrix is positive (non-negative) definite iff all of its eigen-
values are positive (non-negative).
3. All diagonal elements and hence the trace of a positive definite matrix
are positive.
4. If A is symmetric positive definite, then there exists a non-singular matrix Q such that A = QQ^T.

5. A symmetric projection matrix is always positive semi-definite.
6. If A and B are non-negative definite, then so is A + B. If one of A or
B is positive definite, then so is A + B.
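A sketch checking properties 2 and 3 on a matrix that is positive definite by construction (property 4 run in reverse: A = QQ^T, plus a small ridge to be safe):

```python
import numpy as np

rng = np.random.default_rng(6)
Q = rng.standard_normal((3, 3))
A = Q @ Q.T + 0.1 * np.eye(3)     # symmetric positive definite by construction

lam = np.linalg.eigvalsh(A)
assert np.all(lam > 0)            # property 2: all eigenvalues positive
assert np.all(np.diag(A) > 0)     # property 3: positive diagonal (hence positive trace)

x = rng.standard_normal(3)
assert x @ A @ x > 0              # the quadratic form is positive for this x != 0
```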
2.9 Derivatives with respect to (and of) vectors
Definition 2.9.1. Derivative with respect to a vector. Let f(a) be any scalar function of the vector a_{n×1}. Then the derivative of f with respect to a is defined as the vector

\frac{δf}{δa} = \begin{pmatrix} \frac{δf}{δa_1} \\ \frac{δf}{δa_2} \\ \vdots \\ \frac{δf}{δa_n} \end{pmatrix},

and the derivative with respect to a^T is defined as

\frac{δf}{δa^T} = \left[ \frac{δf}{δa} \right]^T.

The second derivative of f with respect to a is written as the derivative of each of the elements of δf/δa with respect to a^T, stacked as the rows of an n × n matrix, i.e.,

\frac{δ^2 f}{δa \, δa^T} = \frac{δ}{δa^T} \left\{ \frac{δf}{δa} \right\} =
\begin{bmatrix}
\frac{δ^2 f}{δa_1^2} & \frac{δ^2 f}{δa_1 δa_2} & \ldots & \frac{δ^2 f}{δa_1 δa_n} \\
\frac{δ^2 f}{δa_2 δa_1} & \frac{δ^2 f}{δa_2^2} & \ldots & \frac{δ^2 f}{δa_2 δa_n} \\
\vdots & \vdots & \ddots & \vdots \\
\frac{δ^2 f}{δa_n δa_1} & \frac{δ^2 f}{δa_n δa_2} & \ldots & \frac{δ^2 f}{δa_n^2}
\end{bmatrix}.
Example 2.9.1. Derivatives of linear and quadratic functions of a vector.

1. \frac{δ(a^T b)}{δb} = a.

2. \frac{δ(b^T A b)}{δb} = Ab + A^T b.
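Both formulas can be verified against central finite differences (a numerical sketch; A, a, and b_0 are arbitrary):

```python
import numpy as np

rng = np.random.default_rng(7)
A = rng.standard_normal((3, 3))
a = rng.standard_normal(3)
b0 = rng.standard_normal(3)
eps = 1e-6

def num_grad(f, b):
    # central finite-difference gradient of a scalar function f at b
    return np.array([(f(b + eps * e) - f(b - eps * e)) / (2 * eps)
                     for e in np.eye(3)])

assert np.allclose(num_grad(lambda b: a @ b, b0), a)          # d(a^T b)/db = a
assert np.allclose(num_grad(lambda b: b @ A @ b, b0),
                   A @ b0 + A.T @ b0, atol=1e-5)              # d(b^T A b)/db = Ab + A^T b
```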
Derivatives with respect to matrices can be defined in a similar fashion. We will only remind ourselves of one result on matrix derivatives, which will come in handy when we talk about likelihood inference.

Lemma 2.9.2. If A_{n×n} is a symmetric non-singular matrix, then

\frac{δ \ln |A|}{δA} = A^{-1}.
2.10 Problems
1. Are the following sets of vectors linearly independent? If not, in each case find at least one vector that is dependent on the others in the set.
(a) v_1^T = (0, −1, 0), v_2^T = (0, 0, 1), v_3^T = (−1, 0, 0)
(b) v_1^T = (2, −2, 6), v_2^T = (1, 1, 1)
(c) v_1^T = (2, 2, 0, −2), v_2^T = (2, 0, 1, −1), v_3^T = (0, −2, 1, 1)
2. Show that a set of non-zero mutually orthogonal vectors v_1, v_2, \ldots, v_n is linearly independent.
3. Find the determinant and inverse of the matrices

(a) \begin{bmatrix} 1 & ρ \\ ρ & 1 \end{bmatrix}, \quad
\begin{bmatrix} 1 & ρ & ρ \\ ρ & 1 & ρ \\ ρ & ρ & 1 \end{bmatrix}, \quad
\begin{bmatrix} 1 & ρ & \ldots & ρ \\ ρ & 1 & \ldots & ρ \\ \vdots & \vdots & \ddots & \vdots \\ ρ & ρ & \ldots & 1 \end{bmatrix}_{n×n}

(b) \begin{bmatrix} 1 & ρ \\ ρ & 1 \end{bmatrix}, \quad
\begin{bmatrix} 1 & ρ & ρ^2 \\ ρ & 1 & ρ \\ ρ^2 & ρ & 1 \end{bmatrix}, \quad
\begin{bmatrix} 1 & ρ & ρ^2 & \ldots & ρ^n \\ ρ & 1 & ρ & \ldots & ρ^{n−1} \\ \vdots & \vdots & \vdots & \ddots & \vdots \\ ρ^n & ρ^{n−1} & ρ^{n−2} & \ldots & 1 \end{bmatrix}

(c) \begin{bmatrix} 1 & ρ \\ ρ & 1 \end{bmatrix}, \quad
\begin{bmatrix} 1 & ρ & 0 \\ ρ & 1 & ρ \\ 0 & ρ & 1 \end{bmatrix}, \quad
\begin{bmatrix} 1 & ρ & 0 & \ldots & 0 & 0 \\ ρ & 1 & ρ & \ldots & 0 & 0 \\ 0 & ρ & 1 & \ldots & 0 & 0 \\ \vdots & \vdots & \vdots & \ddots & \vdots & \vdots \\ 0 & 0 & 0 & \ldots & 1 & ρ \\ 0 & 0 & 0 & \ldots & ρ & 1 \end{bmatrix}_{n×n}