notes_mmu218_00036
TRANSCRIPT
Applied Mathematics for Engineers
Reference textbook: Kreyszig, E., Advanced Engineering Mathematics, 10th Ed., John Wiley & Sons, 2011.
Chapter 1
Vectors in Rn and Cn, Spatial Vectors
1.1 Introduction
The weights of eight students are listed as

156, 125, 145, 134, 178, 145, 162, 193

We can denote these numbers using a single symbol w with different subscripts:

w1, w2, w3, w4, w5, w6, w7, w8

Each subscript denotes the position of the number in the list; for example, w1 = 156 is the first number and w2 = 125 is the second. Such a list of values is called a linear array or vector:

w = (w1, w2, ..., w8)

Vector Addition and Scalar Multiplication
Vector Addition
The resultant u + v of two vectors u and v is obtained by the so-called parallelogram law. Furthermore, if

u = (a, b, c)   and   v = (a′, b′, c′)

then the end point of the vector u + v is

(a + a′, b + b′, c + c′)
Scalar Multiplication
The product ku of a vector u by a real number k is obtained as follows: if

u = (a, b, c)

then

ku = (ka, kb, kc)

Mathematically, the vector u is defined by its end point (a, b, c); we write u = (a, b, c). The ordered triple (a, b, c) of real numbers may be called a point or a vector.

General notation: n-tuple (a1, a2, ..., an)
1.2 Vectors in Rn
The set of all n-tuples of real numbers, denoted by Rn, is called n-space. A particular n-tuple in Rn is called a point or vector:

u = (a1, a2, ..., an)

The numbers ai are called the coordinates, components, or elements of u. Two vectors u and v are equal, written u = v, if they have the same number of components and the corresponding components are equal. The vector (0, 0, ..., 0) is called the zero vector.
Column Vectors

[1]   [ 3]   [ 1]
[2]   [−4]   [ 5]
             [−6]

Row Vectors

[1 2]   [3 −4]   [1 5 −6]
1.3 Vector Addition and Scalar Multiplication
Consider two vectors u and v in Rn:

u = (a1, a2, ..., an)   and   v = (b1, b2, ..., bn)

Their sum, written u + v, is

u + v = (a1 + b1, a2 + b2, ..., an + bn)

The scalar product, or simply the product, of the vector u by a real number k, written ku, is

ku = (ka1, ka2, ..., kan)

Both u + v and ku are also vectors in Rn.
Negatives and subtraction are defined in Rn as follows:

−u = (−1)u   and   u − v = u + (−1)v
Given vectors u1, u2, ..., um in Rn and scalars k1, k2, ..., km, we can form the vector

v = k1u1 + k2u2 + k3u3 + · · · + kmum

Vector v is called a linear combination of the vectors u1, u2, ..., um.
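The operations above are easy to check numerically. A minimal Python sketch (the helper names vec_add, scalar_mul, and linear_combination are our own):

```python
def vec_add(u, v):
    """Componentwise sum u + v of two vectors in R^n."""
    assert len(u) == len(v), "vectors must have the same number of components"
    return [a + b for a, b in zip(u, v)]

def scalar_mul(k, u):
    """Product ku of a vector u by a real number k."""
    return [k * a for a in u]

def linear_combination(scalars, vectors):
    """v = k1*u1 + k2*u2 + ... + km*um."""
    v = [0] * len(vectors[0])
    for k, u in zip(scalars, vectors):
        v = vec_add(v, scalar_mul(k, u))
    return v

u = [1, 2, 3]
v = [4, 5, 6]
print(vec_add(u, v))                        # [5, 7, 9]
print(scalar_mul(2, u))                     # [2, 4, 6]
print(linear_combination([2, -1], [u, v]))  # [-2, -1, 0]
```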
1.4 Dot (Inner) Product
Consider arbitrary vectors u, v ∈ Rn:

u = (a1, a2, ..., an)   and   v = (b1, b2, ..., bn)

The dot product (also called the inner product or scalar product) of u and v is denoted and defined by

u · v = a1b1 + a2b2 + a3b3 + · · · + anbn

u and v are said to be orthogonal if their dot product is zero:

u ⊥ v ⟺ u · v = 0
Norm (Length) of a Vector
The norm or length of a vector u ∈ Rn, denoted by ‖u‖, is defined by

‖u‖ = √(u · u)

For u = (a1, a2, ..., an) ∈ Rn,

‖u‖ = √(u · u) = √(a1² + a2² + a3² + · · · + an²)

Note that ‖u‖ ≥ 0, and

u = 0 ⟺ ‖u‖ = 0
Unit vector
u is called a unit vector ⟺ ‖u‖ = 1. For every v ≠ 0 in Rn,

v̂ = (1/‖v‖) v = v/‖v‖

Dividing v by its norm is called normalizing v; the resulting v̂ is the unit vector in the same direction as v.
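The dot product, norm, and normalization above can be sketched as follows (helper names are our own; math.sqrt supplies the square root):

```python
import math

def dot(u, v):
    """Dot product u . v = a1*b1 + ... + an*bn."""
    return sum(a * b for a, b in zip(u, v))

def norm(u):
    """Norm (length) of u: sqrt(u . u)."""
    return math.sqrt(dot(u, u))

def normalize(v):
    """Unit vector v/||v|| in the same direction as v (v != 0)."""
    n = norm(v)
    return [a / n for a in v]

u = [1, 2, -3]
v = [6, 3, 4]
print(dot(u, v))          # 1*6 + 2*3 + (-3)*4 = 0, so u is orthogonal to v
print(norm([3, 4]))       # 5.0
print(normalize([3, 4]))  # [0.6, 0.8]
```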
Distance, Angles, Projections
Consider arbitrary vectors u, v ∈ Rn:

u = (a1, a2, ..., an)   and   v = (b1, b2, ..., bn)

The distance between u and v is denoted and defined by d(u, v) = ‖u − v‖:

‖u − v‖ = √((a1 − b1)² + (a2 − b2)² + (a3 − b3)² + · · · + (an − bn)²)

The angle θ between any nonzero vectors u, v ∈ Rn is defined by
cos θ = (u · v) / (‖u‖ ‖v‖)

If u · v = 0, then θ = π/2; that is,

u · v = 0 ⟹ θ = π/2 ⟹ u ⊥ v
The projection of a vector u onto a vector v ≠ 0 is the vector described by

proj(u, v) = ((u · v) / ‖v‖²) v
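Distance, angle, and projection can be computed directly from the formulas above. A small sketch (function names our own):

```python
import math

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def norm(u):
    return math.sqrt(dot(u, u))

def distance(u, v):
    """d(u, v) = ||u - v||."""
    return norm([a - b for a, b in zip(u, v)])

def angle(u, v):
    """Angle between nonzero u, v, from cos(theta) = u.v / (||u|| ||v||)."""
    return math.acos(dot(u, v) / (norm(u) * norm(v)))

def proj(u, v):
    """Projection of u onto v != 0: (u.v / ||v||^2) v."""
    c = dot(u, v) / dot(v, v)
    return [c * b for b in v]

print(distance([1, 2], [3, 4]))   # sqrt(8) = 2.828...
print(angle([1, 0], [0, 1]))      # pi/2 (orthogonal vectors)
print(proj([1, 2], [1, 0]))       # [1.0, 0.0]
```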
1.5 Located Vectors, Hyperplanes, Lines, Curves in Rn
An n-tuple P(ai) ≡ (a1, a2, ..., an) ∈ Rn is viewed as a point P(ai).
An n-tuple u = [c1, c2, ..., cn] ∈ Rn is viewed as a vector u from the origin O to the point C(c1, c2, ..., cn).

Located Vectors
Let A(ai) ∈ Rn and B(bi) ∈ Rn. The located vector or directed line segment from A to B, written A → B or AB, is

u = AB = B − A = [b1 − a1, b2 − a2, ..., bn − an]

For A, B ∈ R3 with A(a1, a2, a3) and B(b1, b2, b3), the vector u = B − A ends at the point P(b1 − a1, b2 − a2, b3 − a3).

Hyperplanes
A hyperplane H in Rn is the set of points (x1, x2, ..., xn) that satisfy a linear equation

a1x1 + a2x2 + · · · + anxn = b

where u = [a1, a2, ..., an] ≠ 0. H in R2 is a line; H in R3 is a plane.

In R3, u ⊥ PQ for any P(pi) ∈ H and Q(qi) ∈ H; in this sense u ⊥ H. Since P(pi) ∈ H and Q(qi) ∈ H, they satisfy the hyperplane equation:

a1p1 + a2p2 + · · · + anpn = b      (1.5.1)
a1q1 + a2q2 + · · · + anqn = b
Let

v = PQ = Q − P = [q1 − p1, q2 − p2, ..., qn − pn]

Then

u · v = a1(q1 − p1) + a2(q2 − p2) + · · · + an(qn − pn)
u · v = (a1q1 + a2q2 + · · · + anqn) − (a1p1 + a2p2 + · · · + anpn) = b − b = 0

Thus v = PQ ⊥ u.
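The orthogonality argument above can be verified numerically for a concrete hyperplane (the plane 2x1 + 3x2 + 6x3 = 6 and the points P, Q are our own choices):

```python
def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

# Hyperplane H in R^3: 2*x1 + 3*x2 + 6*x3 = 6, with normal u and constant b.
u = [2, 3, 6]
b = 6

# Two points P, Q on H (they satisfy the hyperplane equation):
P = [3, 0, 0]
Q = [0, 2, 0]
assert dot(u, P) == b and dot(u, Q) == b

# v = PQ = Q - P is orthogonal to u, exactly as the derivation shows:
v = [q - p for q, p in zip(Q, P)]
print(dot(u, v))  # 0
```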
Lines in Rn

The line L in Rn passing through the point P(b1, b2, ..., bn) in the direction of a vector u = [a1, a2, ..., an] ≠ 0 consists of the points X(x1, x2, ..., xn) that satisfy X = P + tu, that is,

x1 = a1t + b1
x2 = a2t + b2
. . .
xn = ant + bn

or, equivalently, L(t) = (ai t + bi), where the parameter t ∈ R. A line L in R3 is shown in Fig.
Curves in Rn
Let D ≡ [a, b] ⊆ R. A function F : D → Rn is a curve in Rn; F(t) is assumed to be continuous. For every t ∈ D, F(t) ∈ Rn:

F(t) = [F1(t), F2(t), ..., Fn(t)]

The derivative of F(t) is

V = dF(t)/dt = [dF1(t)/dt, dF2(t)/dt, ..., dFn(t)/dt]

which is tangent to the curve. The unit tangent vector to the curve is

T(t) = V(t) / ‖V(t)‖
1.6 Vectors in R3 (Spatial Vectors), ijk Notation

Vectors u ∈ R3 are called spatial vectors.
i = [1, 0, 0] denotes the unit vector in the x-direction
j = [0, 1, 0] denotes the unit vector in the y-direction
k = [0, 0, 1] denotes the unit vector in the z-direction

Every u ∈ R3 can be written u = [a, b, c] = ai + bj + ck.

For u, v ∈ {i, j, k} with u ≠ v, we have u ⊥ v, and ‖u‖ = 1 for every u ∈ {i, j, k}.
Suppose

u = a1i + a2j + a3k
v = b1i + b2j + b3k

Then

u + v = (a1 + b1)i + (a2 + b2)j + (a3 + b3)k
cu = ca1i + ca2j + ca3k,  where c ∈ R
u · v = a1b1 + a2b2 + a3b3
‖u‖ = √(u · u) = √(a1² + a2² + a3²)
Cross Product

There is a special operation for vectors u, v ∈ R3 that is not defined in Rn for n ≠ 3, called the cross product u × v. Recall the determinant of order two:

|a b|
|c d| = ad − bc,   so that −(ad − bc) = bc − ad

Suppose

u = a1i + a2j + a3k
v = b1i + b2j + b3k

Then

u × v = (a2b3 − a3b2) i + (a3b1 − a1b3) j + (a1b2 − a2b1) k

Equivalently, in terms of 2 × 2 determinants taken from the array

[a1 a2 a3]
[b1 b2 b3]

we have

u × v = |a2 a3| i − |a1 a3| j + |a1 a2| k
        |b2 b3|     |b1 b3|     |b1 b2|
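A sketch of the cross product via the component formula above (function names our own); note that u × v is orthogonal to both u and v:

```python
def cross(u, v):
    """Cross product of u, v in R^3, via the 2x2 minors above."""
    a1, a2, a3 = u
    b1, b2, b3 = v
    return [a2 * b3 - a3 * b2,
            a3 * b1 - a1 * b3,
            a1 * b2 - a2 * b1]

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

i, j, k = [1, 0, 0], [0, 1, 0], [0, 0, 1]
print(cross(i, j))   # [0, 0, 1]  (i x j = k)

u = [1, 2, 3]
v = [4, 5, 6]
w = cross(u, v)
print(w)                     # [-3, 6, -3]
print(dot(w, u), dot(w, v))  # 0 0  (u x v is orthogonal to both u and v)
```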
Chapter 2
Algebra of Matrices
2.1 Introduction
The entries in our matrices will come from some arbitrary, but fixed, field K. The elements of K are called numbers or scalars. Nothing essential is lost if the reader assumes that K is the real field R.
2.2 Matrices
A matrix A over a field K is a rectangular array of scalars:

A = [a11 a12 · · · a1n]
    [a21 a22 · · · a2n]
    [ ⋮   ⋮    ⋱   ⋮ ]
    [am1 am2 · · · amn]
The rows of matrix A are m row vectors:

(a11, a12, ..., a1n),  (a21, a22, ..., a2n),  ...,  (am1, am2, ..., amn)

The columns of matrix A are n column vectors:

[a11]   [a12]         [a1n]
[a21]   [a22]   ···   [a2n]
[ ⋮ ]   [ ⋮ ]         [ ⋮ ]
[am1]   [am2]         [amn]
The element aij, called the ij-entry or ij-element, appears in row i and column j. Matrix A can be written as

A = [aij]

A matrix with m rows and n columns is called an m by n matrix, written m × n, where m and n give the size of the matrix.
A and B are equal: A = B ⟺ size(A) = size(B) and aij = bij for all i, j.
If m = 1 and n > 1, matrix A is called a row matrix or row vector.
If m > 1 and n = 1, matrix A is called a column matrix or column vector.
A = [aij = 0] is called a zero matrix.
A = [aij] with aij ∈ R is called a real matrix.
A = [aij] with aij ∈ C is called a complex matrix.
2.3 Matrix Addition and Scalar Multiplication
Let

A = [aij]   and   B = [bij]   with   size(A) = size(B) = m × n

The sum of A and B, written A + B, is

A + B = [a11 + b11   a12 + b12   · · ·   a1n + b1n]
        [a21 + b21   a22 + b22   · · ·   a2n + b2n]
        [    ⋮            ⋮        ⋱         ⋮    ]
        [am1 + bm1   am2 + bm2   · · ·   amn + bmn]

The product of the matrix A by a scalar k, written k · A or simply kA, is

kA = [ka11   ka12   · · ·   ka1n]
     [ka21   ka22   · · ·   ka2n]
     [  ⋮      ⋮      ⋱       ⋮ ]
     [kam1   kam2   · · ·   kamn]

size(A + B) and size(kA) are also m × n. Define

−A ≡ (−1)A   and   A − B ≡ A + (−B)

The matrix −A is called the negative of the matrix A, and the matrix A − B is called the difference of A and B. If size(A) ≠ size(B), then A + B is not defined.
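Matrix addition and scalar multiplication act entrywise, as the displays above show. A minimal sketch using nested Python lists (helper names our own):

```python
def mat_add(A, B):
    """Entrywise sum; sizes must agree, else A + B is not defined."""
    assert len(A) == len(B) and len(A[0]) == len(B[0]), "A + B is not defined"
    return [[a + b for a, b in zip(ra, rb)] for ra, rb in zip(A, B)]

def mat_scale(k, A):
    """Multiply every entry of A by the scalar k."""
    return [[k * a for a in row] for row in A]

A = [[1, 2], [3, 4]]
B = [[5, 6], [7, 8]]
print(mat_add(A, B))                  # [[6, 8], [10, 12]]
print(mat_scale(3, A))                # [[3, 6], [9, 12]]
print(mat_add(A, mat_scale(-1, A)))   # [[0, 0], [0, 0]]  (A - A = zero matrix)
```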
2.4 Summation Symbol
The summation symbol Σ (the Greek capital letter sigma) in Σ_{k=1}^{n} f(k) has the following meaning:

k = 1:  f(1)
k = 2:  f(2),  running sum f(1) + f(2)
k = 3:  f(3),  running sum f(1) + f(2) + f(3)
. . .
k = n:  f(n),  total f(1) + f(2) + f(3) + · · · + f(n)

k is called the index; 1 and n are called, respectively, the lower and upper limits. The general expression for Σ can be written

Σ_{k=n1}^{n2} f(k) = f(n1) + f(n1 + 1) + f(n1 + 2) + · · · + f(n2)
2.5 Matrix Multiplication
The product of matrices A and B is written AB. First, the product of a row matrix A = [ai] and a column matrix B = [bi] with the same number of elements is defined to be the scalar (or 1 × 1 matrix)

AB = [a1, a2, ..., an] [b1]
                       [b2]
                       [ ⋮]
                       [bn]  =  a1b1 + a2b2 + a3b3 + · · · + anbn

that is,

AB = Σ_{k=1}^{n} ak bk

AB is a scalar (or a 1 × 1 matrix). AB is not defined if A and B have different numbers of elements.
Definition

Suppose A = [aij] and B = [bij] are matrices such that the number of columns of A is equal to the number of rows of B; say, A is an m × p matrix and B is a p × n matrix. Then the product AB is the m × n matrix C = [cij] whose ij-entry is obtained by multiplying the ith row of A by the jth column of B. That is,

cij = ai1b1j + ai2b2j + ai3b3j + · · · + aipbpj

or

cij = Σ_{k=1}^{p} aik bkj

If A is an m × p matrix and B is a q × n matrix with p ≠ q, then the product AB is not defined.
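The row-by-column rule above can be sketched directly (a naive triple loop; mat_mul is our own name):

```python
def mat_mul(A, B):
    """Product of an m x p matrix A and a p x n matrix B: c_ij = sum_k a_ik * b_kj."""
    p = len(B)
    assert len(A[0]) == p, "AB is not defined: cols(A) != rows(B)"
    n = len(B[0])
    return [[sum(A[i][k] * B[k][j] for k in range(p)) for j in range(n)]
            for i in range(len(A))]

A = [[1, 2, 3],
     [4, 5, 6]]      # 2 x 3
B = [[7, 8],
     [9, 10],
     [11, 12]]       # 3 x 2
print(mat_mul(A, B))  # [[58, 64], [139, 154]]  (a 2 x 2 matrix)
```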
2.6 Transpose of a Matrix
The transpose of a matrix A, written Aᵀ, is obtained by writing the rows of A as columns. For example,

[1 2 3]ᵀ   [1 4]
[4 5 6]  = [2 5]
           [3 6]

[1 −3 −5]ᵀ = [ 1]
             [−3]
             [−5]

If A = [aij] is an m × n matrix, then Aᵀ = [bij] is the n × m matrix where bij = aji.
If A = [aij] is a 1 × n row matrix, then Aᵀ = [bij] is the n × 1 column matrix.
If A = [aij] is an m × 1 column matrix, then Aᵀ = [bij] is the 1 × m row matrix.
2.7 Square Matrices
A = [aij] is a matrix with size m × n. If m = n, A is said to be a square matrix. An n × n square matrix is said to be of order n and is sometimes called an n-square matrix.

Diagonal and Trace

Let A = [aij] be an n-square matrix. The elements of the diagonal or main diagonal of A are S = {aij | i = j}. The trace of A, written tr(A), is the sum of the diagonal elements:

tr(A) = a11 + a22 + a33 + · · · + ann
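Transpose and trace from the preceding two sections can be sketched together (helper names our own):

```python
def transpose(A):
    """A^T: the (i, j) entry of A^T is the (j, i) entry of A."""
    return [list(row) for row in zip(*A)]

def trace(A):
    """tr(A) = sum of the diagonal elements of a square matrix."""
    assert len(A) == len(A[0]), "trace is defined for square matrices"
    return sum(A[i][i] for i in range(len(A)))

A = [[1, 2, 3],
     [4, 5, 6]]
print(transpose(A))   # [[1, 4], [2, 5], [3, 6]]

S = [[1, 2],
     [3, 4]]
print(trace(S))       # 1 + 4 = 5
```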
Identity Matrix, Scalar Matrices
The n-square identity or unit matrix, denoted by In or simply I, is the n-square matrix with 1's on the diagonal and 0's everywhere else. For any n-square matrix A,

AI = IA = A

If B is an m × n matrix, then

B In = Im B = B

For any scalar k, the matrix kI that contains k's on the diagonal and 0's elsewhere is called a scalar matrix, and

(kI)A = k(IA) = kA

The Kronecker delta function δij is defined by

δij = { 0 if i ≠ j
      { 1 if i = j

Thus the identity matrix may be written

I = [δij]
2.8 Powers of Matrices, Polynomials in Matrices
Let A be an n × n matrix. Powers of A are defined as follows:

A² = AA,   A³ = A²A,   ...,   A^(n+1) = AⁿA,   and   A⁰ = I

Given a polynomial

f(x) = a0 + a1x + a2x² + · · · + anxⁿ,   ai ∈ R

we substitute x = A and replace the constant term a0 by a0I to define

f(A) = a0I + a1A + a2A² + · · · + anAⁿ

If f(A) = 0, then A is called a zero or root of f(x).
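Substituting a matrix into a polynomial, as defined above, can be sketched as follows (helper names our own; the example polynomial f(x) = (x − 1)² is our own choice):

```python
def mat_mul(A, B):
    p = len(B)
    return [[sum(A[i][k] * B[k][j] for k in range(p)) for j in range(len(B[0]))]
            for i in range(len(A))]

def identity(n):
    return [[1 if i == j else 0 for j in range(n)] for i in range(n)]

def mat_poly(coeffs, A):
    """f(A) = a0*I + a1*A + a2*A^2 + ... for f(x) = a0 + a1*x + a2*x^2 + ..."""
    n = len(A)
    power = identity(n)            # A^0 = I
    result = [[0] * n for _ in range(n)]
    for a in coeffs:
        result = [[result[i][j] + a * power[i][j] for j in range(n)]
                  for i in range(n)]
        power = mat_mul(power, A)  # next power of A
    return result

A = [[1, 2],
     [0, 1]]
# f(x) = 1 - 2x + x^2 = (x - 1)^2, so f(A) = (A - I)^2
print(mat_poly([1, -2, 1], A))  # [[0, 0], [0, 0]] -> A is a root of f(x)
```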
2.9 Invertible (Nonsingular) Matrices
A square matrix A is said to be invertible or nonsingular if there exists a matrix B such that

AB = BA = I

where I is the identity matrix. Such a matrix B is unique. We call such a matrix B the inverse of A and denote it by A⁻¹. If B is the inverse of A, then A is the inverse of B. Suppose A and B are invertible. Then AB is invertible and

(AB)⁻¹ = B⁻¹A⁻¹

More generally, if A1, A2, ..., Ak are invertible, then their product is invertible and

(A1A2 · · · Ak)⁻¹ = Ak⁻¹ · · · A2⁻¹A1⁻¹

that is, the product of the inverses in the reverse order.
Inverse of a 2× 2 Matrix
Let A be an arbitrary 2 × 2 matrix, say

A = [a b]
    [c d]

We want to find a general formula for the inverse of A,

A⁻¹ = [x1 x2]
      [y1 y2]

such that AA⁻¹ = I:

AA⁻¹ = [a b][x1 x2] = [1 0]
       [c d][y1 y2]   [0 1]

that is,

[ax1 + by1   ax2 + by2]   [1 0]
[cx1 + dy1   cx2 + dy2] = [0 1]

The above matrix equality yields four equations:

ax1 + by1 = 1      ax2 + by2 = 0
cx1 + dy1 = 0      cx2 + dy2 = 1

Let |A| = ad − bc, called the determinant of A, and assume |A| ≠ 0. The unknowns x1, x2, y1, y2 can then be found uniquely:

x1 = d/|A|      x2 = −b/|A|
y1 = −c/|A|     y2 = a/|A|

Thus

A⁻¹ = [a b]⁻¹ = (1/|A|) [ d  −b]
      [c d]             [−c   a]

If |A| = 0, then A is not invertible.
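The 2 × 2 inverse formula above translates directly into code (a sketch; inverse_2x2 is our own name):

```python
def inverse_2x2(A):
    """Inverse of a 2x2 matrix [[a, b], [c, d]] via the formula above."""
    (a, b), (c, d) = A
    det = a * d - b * c        # |A| = ad - bc
    if det == 0:
        raise ValueError("|A| = 0: A is not invertible")
    return [[ d / det, -b / det],
            [-c / det,  a / det]]

A = [[2, 3],
     [1, 4]]                   # |A| = 8 - 3 = 5
Ainv = inverse_2x2(A)
print(Ainv)                    # [[0.8, -0.6], [-0.2, 0.4]]

# Check that A * Ainv is (approximately) the 2 x 2 identity:
prod = [[sum(A[i][k] * Ainv[k][j] for k in range(2)) for j in range(2)]
        for i in range(2)]
print(prod)
```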
Inverse of an n× n Matrix
Suppose A is an arbitrary n-square matrix. Finding its inverse A⁻¹ reduces to finding the solution of a collection of n × n systems of linear equations.
2.10 Special Types of Square Matrices
Diagonal and Triangular Matrices
A square matrix D = [dij] is diagonal ⟺ dij = 0 for i ≠ j. We write

D = diag(d11, d22, ..., dnn)

Examples:

[3  0  0]
[0 −7  0] ≡ diag(3, −7, 2)
[0  0  2]

[4  0]
[0 −5] ≡ diag(4, −5)

and the 4-square diagonal matrix diag(6, 0, −9, 8), whose off-diagonal entries (all zero) are often left blank.
A square matrix A = [aij] is upper triangular ⟺ aij = 0 for i > j. Examples:

[a11 a12]     [b11 b12 b13]     [c11 c12 c13 c14]
[    a22]     [    b22 b23]     [    c22 c23 c24]
              [        b33]     [        c33 c34]
                                [            c44]

(blank entries are zero). A lower triangular matrix is a square matrix A = [aij] with aij = 0 for i < j.

Special Real Square Matrices: Symmetric, Orthogonal, Normal

A matrix A is symmetric if A = Aᵀ. Equivalently, A = [aij] is symmetric if each aij = aji.
A matrix A is skew-symmetric if A = −Aᵀ. Equivalently, A = [aij] is skew-symmetric if each aij = −aji. Clearly, the diagonal elements of such a matrix must all be zero, since aii = −aii forces aii = 0:

aii = 0 for i = j,   and   aij = −aji for i ≠ j

Matrix A must be square if A = Aᵀ or A = −Aᵀ.
Orthogonal Matrices
A real matrix A is orthogonal if Aᵀ = A⁻¹, that is, AAᵀ = AᵀA = I. Thus A must necessarily be square and invertible. Now, suppose A is a real orthogonal 3 × 3 matrix with rows

u1 = (a1, a2, a3)   u2 = (b1, b2, b3)   u3 = (c1, c2, c3)      (2.10.1)

Since A is orthogonal, we must have AAᵀ = I:

[a1 a2 a3][a1 b1 c1]   [1 0 0]
[b1 b2 b3][a2 b2 c2] = [0 1 0] = I      (2.10.2)
[c1 c2 c3][a3 b3 c3]   [0 0 1]

The above matrix equality yields the following equations:

a1² + a2² + a3² = 1       a1b1 + a2b2 + a3b3 = 0    a1c1 + a2c2 + a3c3 = 0
a1b1 + a2b2 + a3b3 = 0    b1² + b2² + b3² = 1       b1c1 + b2c2 + b3c3 = 0
a1c1 + a2c2 + a3c3 = 0    b1c1 + b2c2 + b3c3 = 0    c1² + c2² + c3² = 1

This implies

u1 · u1 = 1,   u2 · u2 = 1,   u3 · u3 = 1,   and   ui · uj = 0 for i ≠ j

The rows u1, u2, u3 are unit vectors and are orthogonal to each other.
In general, vectors u1, u2, ..., un ∈ Rn are said to form an orthonormal set of vectors if the vectors are unit vectors and are orthogonal to each other, that is,

ui · uj = { 0 if i ≠ j
          { 1 if i = j

In other words, ui · uj = δij, where δij is the Kronecker delta function.

AAᵀ = I ⟹ the rows of A form an orthonormal set of vectors.
AᵀA = I ⟹ the columns of A form an orthonormal set of vectors.
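The orthonormality condition ui · uj = δij can be checked numerically. A sketch (has_orthonormal_rows is our own name; the rotation matrix example is our own choice):

```python
import math

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def has_orthonormal_rows(A, tol=1e-12):
    """Check u_i . u_j = delta_ij for the rows of A (equivalently, A A^T = I)."""
    n = len(A)
    for i in range(n):
        for j in range(n):
            expected = 1.0 if i == j else 0.0
            if abs(dot(A[i], A[j]) - expected) > tol:
                return False
    return True

# A rotation matrix about the z-axis is orthogonal:
c, s = math.cos(0.3), math.sin(0.3)
A = [[c, -s, 0],
     [s,  c, 0],
     [0,  0, 1]]
print(has_orthonormal_rows(A))            # True
print(has_orthonormal_rows([[1, 1, 0],
                            [0, 1, 0],
                            [0, 0, 1]]))  # False (first row is not a unit vector)
```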
2.11 Block Matrices
Using a system of horizontal and vertical lines, we can partition a matrix A into submatrices called blocks (cells). The convenience of partitioning matrices, say A and B, into blocks is that the result of operations on A and B can be obtained by carrying out the computation with the blocks, just as if they were the actual elements of the matrices. The notation A = [Aij] will be used for a block matrix A with blocks Aij.

Suppose A = [Aij] and B = [Bij] are block matrices with the same numbers of row and column blocks, and suppose that corresponding blocks have the same size. Then

A + B = [A11 + B11   A12 + B12   · · ·   A1n + B1n]
        [A21 + B21   A22 + B22   · · ·   A2n + B2n]
        [    ⋮            ⋮        ⋱         ⋮    ]
        [Am1 + Bm1   Am2 + Bm2   · · ·   Amn + Bmn]

and

kA = [kA11   kA12   · · ·   kA1n]
     [kA21   kA22   · · ·   kA2n]
     [  ⋮      ⋮      ⋱       ⋮ ]
     [kAm1   kAm2   · · ·   kAmn]

Suppose that U = [Uik] and V = [Vkj] are block matrices such that each product UikVkj is defined. Then

UV = [W11   W12   · · ·   W1n]
     [W21   W22   · · ·   W2n]
     [ ⋮     ⋮      ⋱      ⋮ ]
     [Wm1   Wm2   · · ·   Wmn]

where Wij = Ui1V1j + Ui2V2j + · · · + UipVpj.
Square Block Matrices
Let M be a block matrix. Then M is called a square block matrix if

1. M is a square matrix,
2. the blocks form a square matrix, and
3. the diagonal blocks are also square.

The latter two conditions will occur if and only if there are the same number of horizontal and vertical lines and they are placed symmetrically.

Block Diagonal Matrices

Let M = [Aij] be a square block matrix such that the nondiagonal blocks are all zero matrices, that is, Aij = 0 for i ≠ j. We write

M = diag(A11, A22, ..., Arr)   or   M = A11 ⊕ A22 ⊕ · · · ⊕ Arr

Suppose f(x) is a polynomial and M is a block diagonal matrix. Then f(M) is the block diagonal matrix

f(M) = diag(f(A11), f(A22), ..., f(Arr))

M is invertible ⟺ every diagonal block Aii is invertible, and then

M⁻¹ = diag(A11⁻¹, A22⁻¹, ..., Arr⁻¹)

Analogously, a square block matrix is called a block upper triangular matrix if the blocks below the diagonal are zero matrices, and a block lower triangular matrix if the blocks above the diagonal are zero matrices.
Chapter 3
Systems of Linear Equations
3.1 Introduction
All our systems of linear equations involve scalars as both coefficients and constants, and such scalars may come from any number field K. There is almost no loss in generality if the reader assumes that all our scalars are real numbers, that is, that they come from the real field R.
3.2 Basic Definitions, Solutions
Linear Equations and Solutions
A linear equation in unknowns x1, x2, ..., xn is an equation that can be put in the standard form

a1x1 + a2x2 + · · · + anxn = b

where a1, a2, ..., an and b are constants. The constant ak is called the coefficient of xk, and b is called the constant term of the equation. A solution of the linear equation is a list of values

x1 = k1,   x2 = k2,   ...,   xn = kn

or, equivalently, a vector

u = (k1, k2, ..., kn)

such that

a1k1 + a2k2 + · · · + ankn = b

is true; we then say that u satisfies the equation.
Systems of Linear Equations
A system of m linear equations L1, L2, ..., Lm in n unknowns x1, x2, ..., xn can be put in the standard form

a11x1 + a12x2 + · · · + a1nxn = b1
a21x1 + a22x2 + · · · + a2nxn = b2
. . .
am1x1 + am2x2 + · · · + amnxn = bm

where the aij and bi are constants. The number aij is the coefficient of the unknown xj in the equation Li, and the number bi is the constant of the equation Li. It is called a square system if m = n, that is, if the number m of equations is equal to the number n of unknowns. The system is said to be homogeneous if all the constant terms are zero; otherwise the system is said to be nonhomogeneous. The system of linear equations is said to be consistent if it has one or more solutions, and it is said to be inconsistent if it has no solution.
Augmented and Coefficient Matrices of a System
Consider the general system of m equations in n unknowns. Its augmented matrix M and coefficient matrix A are

M = [a11 a12 · · · a1n  b1]        A = [a11 a12 · · · a1n]
    [a21 a22 · · · a2n  b2]            [a21 a22 · · · a2n]
    [ ⋮   ⋮    ⋱   ⋮    ⋮ ]            [ ⋮   ⋮    ⋱   ⋮ ]
    [am1 am2 · · · amn  bm]            [am1 am2 · · · amn]

M is the augmented matrix of the system and A is called the coefficient matrix. We may write M = [A B], where B denotes the column vector of constants.
Degenerate Linear Equations
A linear equation is said to be degenerate if all the coefficients are zero:

0x1 + 0x2 + · · · + 0xn = b

The solution of such an equation depends only on the value of the constant b. Specifically:
(i) If b ≠ 0, then the equation has no solution.
(ii) If b = 0, then every vector u = (k1, k2, ..., kn) is a solution.

Leading Unknown in a Nondegenerate Linear Equation

By the leading unknown of a nondegenerate linear equation L, we mean the first unknown in L with a nonzero coefficient. For example, x3 is the leading unknown of

0x1 + 0x2 + 5x3 + 6x4 + 0x5 + 8x6 = 7

and y is the leading unknown of

0x + 2y − 4z = 5
3.3 Elementary Operations
The following operations on a system of linear equations L1, L2, ..., Lm are called elementary operations.

[E1]: Interchange Li and Lj:   Li ←→ Lj
[E2]: Replace Li by kLi (k ≠ 0):   kLi −→ Li
[E3]: Replace Lj by kLi + Lj:   kLi + Lj −→ Lj

Suppose a system M of linear equations is obtained from a system L of linear equations by a finite sequence of elementary operations. Then M and L have the same solutions.
3.4 Small Square Systems of Linear Equations
Systems of Two Linear Equations in Two Unknowns (2 × 2 Systems)

A1x + B1y = C1
A2x + B2y = C2

The system has exactly one solution when

A1/A2 ≠ B1/B2,   equivalently,   A1B2 − A2B1 ≠ 0

The system has no solution when

A1/A2 = B1/B2 ≠ C1/C2

The system has infinitely many solutions when

A1/A2 = B1/B2 = C1/C2

Determinant of order two:

|A1 B1|
|A2 B2| = A1B2 − A2B1
Elimination Algorithm
Algorithm 3.1: The input consists of two nondegenerate linear equations L1 and L2 in two unknowns with a unique solution.

Part A. (Forward Elimination) Multiply each equation by a constant so that the resulting coefficients of one unknown are negatives of each other, and then add the two equations to obtain a new equation L that has only one unknown.

Part B. (Back-substitution) Solve for the unknown in the new equation L (which contains only one unknown), substitute this value of the unknown into one of the original equations, and then solve to obtain the value of the other unknown.

Part A of Algorithm 3.1 can be applied to any system even if the system does not have a unique solution. In such a case, the new equation L will be degenerate and Part B will not apply.
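Algorithm 3.1 can be sketched as follows (an illustrative implementation under the extra assumption a1 ≠ 0, so that back-substitution into L1 works; exact Fraction arithmetic avoids rounding):

```python
from fractions import Fraction

def solve_2x2(L1, L2):
    """Sketch of Algorithm 3.1 for a1*x + b1*y = c1, a2*x + b2*y = c2,
    assuming a unique solution (a1*b2 - a2*b1 != 0) and a1 != 0."""
    a1, b1, c1 = (Fraction(v) for v in L1)
    a2, b2, c2 = (Fraction(v) for v in L2)
    # Part A (forward elimination): a2*L1 - a1*L2 eliminates x, leaving
    # the one-unknown equation (a2*b1 - a1*b2) y = a2*c1 - a1*c2.
    coeff = a2 * b1 - a1 * b2
    assert coeff != 0, "no unique solution"
    y = (a2 * c1 - a1 * c2) / coeff
    # Part B (back-substitution) into L1:
    x = (c1 - b1 * y) / a1
    return x, y

# 2x + 3y = 8 and x - y = -1  ->  x = 1, y = 2
print(solve_2x2((2, 3, 8), (1, -1, -1)))
```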
3.5 Systems in Triangular and Echelon Form
Triangular Form
2x1 + 3x2 + 5x3 − 2x4 = 9
5x2 − x3 + 3x4 = 1
7x3 − x4 = 3
2x4 = 8
Such a triangular system always has a unique solution, which may be obtained by back-substitution.
Echelon Form, Pivot and Free Variables
2x1 + 6x2 − x3 + 4x4 − 2x5 = 7
x3 + 2x4 + 2x5 = 5
3x4 − 9x5 = 6
x1, x3, x4 are called pivot variables and the other unknowns x2 and x5 are called free variables.

Consider a system of linear equations in echelon form, say with r equations in n unknowns. There are two cases:

r = n: There are as many equations as unknowns (triangular form). Then the system has a unique solution.
r < n: There are more unknowns than equations. Then we can arbitrarily assign values to the n − r free variables and solve uniquely for the r pivot variables, obtaining a solution of the system.

The general solution of a system with free variables may be described in either of two equivalent ways: one description is called the "Parametric Form" of the solution, and the other description is called the "Free-Variable Form".
Parametric Form
Assign arbitrary values, called parameters, to the free variables, and then use back-substitution to obtain values for the pivot variables.
Free-Variable Form
Use back-substitution to solve for the pivot variables directly in terms of the free variables.
3.6 Gauss Elimination
It essentially consists of two parts:

Part A. (Forward Elimination) Step-by-step reduction of the system yielding either a degenerate equation with no solution (which indicates the system has no solution) or an equivalent simpler system in triangular or echelon form.

Part B. (Backward Elimination) Step-by-step back-substitution to find the solution of the simpler system.
Part A. (Forward Elimination)
Input: The m × n system of linear equations.

Elimination Step: Find the first unknown in the system with a nonzero coefficient (which now must be x1).

1. Arrange so that a11 ≠ 0. That is, if necessary, interchange equations so that the first unknown x1 appears with a nonzero coefficient in the first equation.

2. Use a11 as a pivot to eliminate x1 from all equations except the first equation. That is, for i > 1:

(a) Set m = −ai1/a11
(b) Replace Li by mL1 + Li

The system now has the following form:

a11x1 + a12x2 + a13x3 + · · · + a1nxn = b1
        a2j2 xj2 + · · · + a2nxn = b2
        . . .
        amjm xjm + · · · + amnxn = bm

where x1 does not appear in any equation except the first, a11 ≠ 0, and xj2 denotes the first unknown with a nonzero coefficient in any equation other than the first.

3. Examine each new equation L.

(a) If L has the form 0x1 + 0x2 + · · · + 0xn = b with b ≠ 0, then STOP. The system is inconsistent and has no solution.
(b) If L has the form 0x1 + 0x2 + · · · + 0xn = 0, or if L is a multiple of another equation, then delete L from the system.

Recursion Step: Repeat the Elimination Step with each new "smaller" subsystem formed by all the equations excluding the first equation.

Output: Finally, the system is reduced to triangular or echelon form, or a degenerate equation with no solution is obtained, indicating an inconsistent system.

The next remarks refer to the Elimination Step in Algorithm 3.2.

1. The number m in (a) is called the multiplier:

m = −ai1/a11 = −(coefficient to be deleted)/(pivot)

2. One could alternatively apply the following operation in (b):

Replace Li by −ai1L1 + a11Li

This would avoid fractions if all the scalars were originally integers.
3.7 Elementary Matrices
Elementary Column Operations
Now let A be a matrix with columns C1, C2, ..., Cn. The following operations on A, analogous to the elementary row operations, are called elementary column operations:

[F1] (Column Interchange): Interchange columns Ci and Cj.
[F2] (Column Scaling): Replace Ci by kCi (where k ≠ 0).
[F3] (Column Addition): Replace Cj by kCi + Cj.

We may indicate each of the column operations by writing, respectively,

1. Ci ←→ Cj
2. kCi −→ Ci
3. kCi + Cj −→ Cj
Now let f denote an elementary column operation, and let F be the matrix obtained by applying f to the identity matrix I, that is, F = f(I). Then F is called the elementary matrix corresponding to the elementary column operation f. Note that F is always a square matrix.
Theorem:
For any matrix A, f(A) = AF . That is, the result of applying an elementary column operation f on a matrix A can be obtainedby postmultiplying A by the corresponding elementary matrix F .
Elementary Row Operations
Suppose A is a matrix with rows R1, R2, ..., Rm. The following operations on A are called elementary row operations:

[E1] (Row Interchange): Interchange Ri and Rj:   Ri ←→ Rj
[E2] (Row Scaling): Replace Ri by kRi (where k ≠ 0):   kRi −→ Ri
[E3] (Row Addition): Replace Rj by kRi + Rj:   kRi + Rj −→ Rj

Let e denote an elementary row operation and let e(A) denote the result of applying the operation e to a matrix A. Now let E be the matrix obtained by applying e to the identity matrix I, that is, E = e(I). Then E is called the elementary matrix corresponding to the elementary row operation e. Note that E is always a square matrix.
Theorem:
Let e be an elementary row operation and let E be the corresponding m × m elementary matrix. Then

e(A) = EA

where A is any m × n matrix. In other words, the result of applying an elementary row operation e to a matrix A can be obtained by premultiplying A by the corresponding elementary matrix E.
3.8 Linear Systems of Equations; Gauss Elimination, Matrix Formulation
3.8.1 Introduction
A system of m linear equations L1, L2, ..., Lm in n unknowns x1, x2, ..., xn can be put in the standard form

a11x1 + a12x2 + · · · + a1nxn = b1
a21x1 + a22x2 + · · · + a2nxn = b2
. . .
am1x1 + am2x2 + · · · + amnxn = bm

with

M = [a11 a12 · · · a1n  b1]        A = [a11 a12 · · · a1n]
    [a21 a22 · · · a2n  b2]            [a21 a22 · · · a2n]
    [ ⋮   ⋮    ⋱   ⋮    ⋮ ]            [ ⋮   ⋮    ⋱   ⋮ ]
    [am1 am2 · · · amn  bm]            [am1 am2 · · · amn]

M is the augmented matrix of the system and A is called the coefficient matrix. We may write M = [A B], where B denotes the column vector of constants.
3.8.2 Homogeneous Systems Of Linear Equations
A system of linear equations is said to be homogeneous if all the constant terms are zero. Thus a homogeneous system has the form AX = 0. Clearly, such a system always has the zero vector 0 = (0, 0, ..., 0) as a solution, called the zero or trivial solution. Accordingly, we are usually interested in whether or not the system has a nonzero solution. Since a homogeneous system AX = 0 always has at least the zero solution, it can always be put in an echelon form, say

a11x1 + a12x2 + a13x3 + a14x4 + · · · + a1nxn = 0
        a2j2 xj2 + a2,j2+1 xj2+1 + · · · + a2nxn = 0
        . . .
        arjr xjr + ar,jr+1 xjr+1 + · · · + arnxn = 0

Here r denotes the number of equations in echelon form and n denotes the number of unknowns. Thus the echelon system has n − r free variables. The question of nonzero solutions reduces to the following two cases:

1. r = n. The system has only the zero solution.
2. r < n. The system has a nonzero solution.

Accordingly, if we begin with fewer equations than unknowns, then, in echelon form, r < n, and the system has a nonzero solution. The augmented matrix M determines the system completely because it contains all the given numbers appearing in the system of equations.
3.8.3 Systems Of Linear Equations And Linear Combinations Of Vectors
The general system of linear equations may be rewritten as the following vector equation:

   [a11]      [a12]              [a1n]   [b1]
x1 [a21] + x2 [a22] + · · · + xn [a2n] = [b2]
   [ ⋮ ]      [ ⋮ ]              [ ⋮ ]   [ ⋮]
   [am1]      [am2]              [amn]   [bm]

Accordingly, the general system of linear equations and the above equivalent vector equation have a solution if and only if the column vector of constants is a linear combination of the columns of the coefficient matrix.
Linear Combinations of Orthogonal Vectors, Fourier Coe�cients
Recall first (Section 1.4) that the dot (inner) product u · v of vectors u = (a1, ..., an) and v = (b1, ..., bn) in Rn is defined by

u · v = a1b1 + a2b2 + · · · + anbn

Furthermore, vectors u and v are said to be orthogonal if their dot product u · v = 0.

Suppose that u1, u2, ..., un are nonzero pairwise orthogonal vectors in Rn. This means

ui · uj = 0 for i ≠ j      (3.8.1)

and

ui · ui ≠ 0 for each i

Then, for any vector v in Rn, there is an easy way to write v as a linear combination of u1, u2, ..., un.

Theorem:

Suppose that u1, u2, ..., un are nonzero pairwise orthogonal vectors in Rn. Then, for any vector v in Rn,

v = ((v · u1)/(u1 · u1)) u1 + ((v · u2)/(u2 · u2)) u2 + · · · + ((v · un)/(un · un)) un      (3.8.2)

We emphasize that there must be n such orthogonal vectors ui in Rn for the formula to be used. Note also that each ui · ui ≠ 0, since each ui is a nonzero vector.

Remark:

The scalar ki (appearing in Theorem 3.10) is called the Fourier coefficient of v with respect to ui:

ki = (v · ui)/(ui · ui) = (v · ui)/‖ui‖²

It is analogous to a coefficient in the celebrated Fourier series of a function.
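Formula (3.8.2) can be sketched directly: compute the Fourier coefficients ki and confirm they reconstruct v (the function name and example vectors are our own):

```python
def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def fourier_combination(v, us):
    """Write v as sum of k_i * u_i for nonzero pairwise orthogonal u_1..u_n,
    using Fourier coefficients k_i = (v . u_i)/(u_i . u_i)."""
    ks = [dot(v, u) / dot(u, u) for u in us]
    recon = [sum(k * u[i] for k, u in zip(ks, us)) for i in range(len(v))]
    return ks, recon

u1 = [1, 1, 0]
u2 = [1, -1, 0]
u3 = [0, 0, 1]     # nonzero and pairwise orthogonal
v = [3, 1, 5]
ks, recon = fourier_combination(v, [u1, u2, u3])
print(ks)     # [2.0, 1.0, 5.0]
print(recon)  # [3.0, 1.0, 5.0]  (reconstructs v)
```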
3.8.4 Matrix Equation Of A System Of Linear Equations
The general system of m linear equations in n unknowns is equivalent to the matrix equation

[a11 a12 . . . a1n][x1]   [b1]
[a21 a22 . . . a2n][x2] = [b2]
[ ⋮   ⋮   ⋱    ⋮ ][ ⋮]   [ ⋮]
[am1 am2 . . . amn][xn]   [bm]

or

AX = B

where A = [aij] is the coefficient matrix, X = [xj] is the column vector of unknowns, and B = [bi] is the column vector of constants. The statement that the system of linear equations and the matrix equation are equivalent means that any vector solution of the system is a solution of the matrix equation, and vice versa.

A system AX = B of linear equations is square if and only if the matrix A of coefficients is square. In such a case, we have the following important result.

Theorem:

A square system AX = B of linear equations has a unique solution if and only if the matrix A is invertible. In such a case, A⁻¹B is the unique solution of the system.
3.8.5 Geometric Interpretation. Existence and Uniqueness of Solutions
The theorem has a geometric description when the system consists of two equations in two unknowns, where each equation represents a line in R2. It also has a geometric description when the system consists of three nondegenerate equations in three unknowns, where the three equations correspond to planes H1, H2, H3 in R3.

Consider a system of two nondegenerate linear equations in two unknowns x and y, which can be put in the standard form

A1x + B1y = C1
A2x + B2y = C2

The system has exactly one solution: here the two lines intersect in one point. This occurs when the lines have distinct slopes or, equivalently, when the coefficients of x and y are not proportional:

A1/A2 ≠ B1/B2,   equivalently,   A1B2 − A2B1 ≠ 0
The section gives two matrix algorithms that accomplish the following:

1. Algorithm 3.3 transforms any matrix A into an echelon form.
2. Algorithm 3.4 transforms the echelon matrix into its row canonical form.

These algorithms, which use the elementary row operations, are simply restatements of Gaussian elimination as applied to matrices rather than to linear equations.

Algorithm 3.3 (Forward Elimination): The input is any matrix A. (The algorithm puts 0's below each pivot, working from the "top-down".) The output is an echelon form of A.

Step 1.

(a) Find the first column with a nonzero entry. Let j1 denote this column. Arrange so that a1j1 ≠ 0. That is, if necessary, interchange rows so that a nonzero entry appears in the first row in column j1.
CHAPTER 3. SYSTEMS OF LINEAR EQUATIONS 22
Figure 3.8.1: Geometric Interpretation. Existence and Uniqueness of Solutions
Figure 3.8.2: Geometric Interpretation: 2D Space
Figure 3.8.3: Geometric Interpretation: 3D Space
Figure 3.8.4: Gauss Elimination Example: Electrical Network
Use a1j1 as a pivot to obtain 0's below a1j1. Specifically, for i > 1:

Set m = −aij1/a1j1;
Replace Ri by mR1 + Ri

[That is, apply the operation −(aij1/a1j1)R1 + Ri → Ri.]
Step 2. Repeat Step 1 with the submatrix formed by all the rows excluding the first row. Here we let j2 denote the first column in the subsystem with a nonzero entry. Hence, at the end of Step 2, we have a2j2 ≠ 0. Continue the above process until a submatrix has only zero rows. We emphasize that at the end of the algorithm, the pivots will be
a1j1 , a2j2 , . . . , arjr
where r denotes the number of nonzero rows in the final echelon matrix.
Remark 1: The following number m in Step 1(b) is called the multiplier:

m = −aij1/a1j1 = −(entry to be deleted)/(pivot)
Remark 2: One could replace the operation in Step 1(b) by: Replace Ri by −aij1 R1 + a1j1 Ri. This would avoid fractions if all the scalars were originally integers.
Algorithm 3.4 (Backward Elimination): The input is a matrix A = [aij] in echelon form with pivot entries
a1j1 , a2j2 , . . . , arjr (3.8.3)
The output is the row canonical form of A.
Step 1.
(a) (Use row scaling so the last pivot equals 1.) Multiply the last nonzero row Rr by 1/arjr.
(b) (Use arjr = 1 to obtain 0's above the pivot.) For i = r − 1, r − 2, ..., 1: set m = −aijr and replace Ri by mRr + Ri. (That is, apply the operations −aijr Rr + Ri → Ri.)
Steps 2 to r − 1. Repeat Step 1 for rows Rr−1, Rr−2, ..., R2.
Step r. (Use row scaling so the first pivot equals 1.) Multiply R1 by 1/a1j1.
Remark: We emphasize that Gaussian elimination is a two-stage process. Specifically:
Stage A (Algorithm 3.3). Puts 0's below each pivot, working from the top row R1 down.
Stage B (Algorithm 3.4). Puts 0's above each pivot, working from the bottom row Rr up.
There is another algorithm, called Gauss-Jordan, that also row reduces a matrix to its row canonical form. The difference is that Gauss-Jordan puts 0's both below and above each pivot as it works its way from the top row R1 down. Although Gauss-Jordan may be easier to state and understand, it is much less efficient than the two-stage Gaussian elimination algorithm.
Application to Systems of Linear Equations
One way to solve a system of linear equations is by working with its augmented matrix M rather than the equations themselves. Specifically, we reduce M to echelon form (which tells us whether the system has a solution), and then further reduce M to its row canonical form (which essentially gives the solution of the original system of linear equations).
Example: Gauss Elimination. Electrical Network
Solve the linear system
x1 − x2 + x3 = 0
−x1 + x2 − x3 = 0
10x2 + 25x3 = 90
20x1 + 10x2 = 80
Solution by Gauss Elimination
Form the augmented matrix
Step 1. Elimination of x1
Step 2. Elimination of x2
The result
Backsubstitution: in this order x3, x2, x1
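The notes solve this system by hand via Gauss elimination and back substitution; the result can also be checked numerically. A minimal NumPy sketch (the coefficient matrix is assembled from the four equations above; since there are four equations in three unknowns, least squares is used, which returns the exact solution because the system is consistent):

```python
import numpy as np

# Coefficient matrix and right-hand side of the four network equations.
A = np.array([[1.0, -1.0, 1.0],
              [-1.0, 1.0, -1.0],
              [0.0, 10.0, 25.0],
              [20.0, 10.0, 0.0]])
b = np.array([0.0, 0.0, 90.0, 80.0])

# Overdetermined but consistent: lstsq recovers the exact solution.
x, residuals, rank, _ = np.linalg.lstsq(A, b, rcond=None)
print(x)  # approximately [2. 4. 2.]
```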
3.8.5.1 Gauss Elimination: The Three Possible Cases of Systems
Gauss elimination can take care of linear systems with a unique solution, with infinitely many solutions, and with no solutions (inconsistent systems).
Example: Gauss Elimination if Infinitely Many Solutions Exist. Solve the following linear system of three equations in four unknowns whose augmented matrix is
System of linear equations and Augmented Matrix
Step 1: Elimination of x1
Step 2: Elimination of x2
Backsubstitution:
x2 = 1− x3 + 4x4
x1 = 2− x4
Since x3 and x4 remain arbitrary, we have infinitely many solutions. Setting x3 = t1 and x4 = t2, the solution can be written as
x1 = 2 − t2
x2 = 1 − t1 + 4t2
Example: Gauss Elimination if no Solution Exists Consider the system of linear equations
Systems of linear equations and Augmented Matrix
Step 1: Elimination of x1
Step 2: Elimination of x2
Backsubstitution:
The false statement 0 = 12 shows that the system has no solution.
3.8.6 Row Echelon Form and Information From It
At the end of the Gauss elimination, the form of the coefficient matrix, the augmented matrix, and the system itself is called the row echelon form.
Row Echelon Form Examples
At the end of the Gauss elimination (before the back substitution) the row echelon form of the augmented matrix will be
1. Exactly one solution if r = n and br+1, ..., bm (if present) are zero.
2. Infinitely many solutions if r < n and br+1, ..., bm (if present) are zero.
3. No solution if r < m and one of the entries br+1, ..., bm is nonzero.
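The three cases above can also be detected programmatically by comparing the rank of the coefficient matrix with the rank of the augmented matrix (an equivalent rank-based criterion, not the echelon-form inspection used in the notes). A sketch; the helper name classify and the small test systems are illustrative:

```python
import numpy as np

def classify(A, b):
    """Classify Ax = b by comparing rank(A), rank([A | b]), and the
    number of unknowns n (rank-based version of the three cases)."""
    n = A.shape[1]
    rA = np.linalg.matrix_rank(A)
    rAb = np.linalg.matrix_rank(np.column_stack([A, b]))
    if rA < rAb:
        return "no solution"          # an inconsistent row appeared
    if rA == n:
        return "exactly one solution"
    return "infinitely many solutions"

print(classify(np.array([[1.0, 1.0], [1.0, -1.0]]), np.array([2.0, 0.0])))
print(classify(np.array([[1.0, 1.0], [2.0, 2.0]]), np.array([2.0, 4.0])))
print(classify(np.array([[1.0, 1.0], [2.0, 2.0]]), np.array([2.0, 5.0])))
```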
3.9 Solutions of Linear Systems: Existence, Uniqueness
always has the trivial solution x1 = 0, x2 = 0, ..., xn = 0. Nontrivial solutions exist if and only if rank(A) = r < n.
3.9.1 Second- and Third-Order Determinants
3.9.1.1 Second-order Determinant
A determinant of second order is denoted and de�ned by
3.9.1.2 Cramer's rule for solving linear systems of two equations in two unknowns
with D ≠ 0. The value D = 0 appears for inconsistent nonhomogeneous systems and for homogeneous systems with nontrivial solutions.
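The displayed formulas are not reproduced in this transcript, so as a sketch of Cramer's rule for the 2×2 case: x = D1/D and y = D2/D, where D is the coefficient determinant and D1, D2 are obtained by replacing the corresponding column with the right-hand side. The helper name is illustrative:

```python
def cramer_2x2(a11, a12, a21, a22, b1, b2):
    """Cramer's rule for a11 x + a12 y = b1, a21 x + a22 y = b2."""
    D = a11 * a22 - a12 * a21        # coefficient determinant
    if D == 0:
        raise ValueError("D = 0: no unique solution")
    D1 = b1 * a22 - a12 * b2          # first column replaced by b
    D2 = a11 * b2 - b1 * a21          # second column replaced by b
    return D1 / D, D2 / D

# 4x + 3y = 12, 2x + y = 5  ->  x = 1.5, y = 2
x, y = cramer_2x2(4, 3, 2, 1, 12, 5)
```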
3.9.1.3 Third-order Determinant
A determinant of third order can be de�ned by
3.9.1.4 Cramer's Rule for Linear Systems of Three Equations
where D1, D2, and D3 are given by
where D is given by
3.10 Determinants: Cramer's Rule
A determinant of order n is a scalar associated with an n × n matrix A = [aij], which is written
for n = 1
for n > 1
or
and Mjk is a determinant of order n − 1, namely, the determinant of the submatrix of A obtained from A by omitting the row and column of the entry ajk, that is, the jth row and the kth column. Mjk is called the minor of ajk in D, and Cjk = (−1)^(j+k) Mjk the cofactor of ajk in D.
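The cofactor expansion just described can be sketched directly in code, a naive recursive implementation expanding along the first row (for illustration only; the expansion costs O(n!) and is impractical for large n):

```python
def det(M):
    """Determinant by cofactor expansion along the first row:
    D = sum over k of a_1k * C_1k, with C_1k = (-1)^(1+k) * M_1k."""
    n = len(M)
    if n == 1:
        return M[0][0]
    total = 0
    for k in range(n):
        # Minor M_1k: delete row 1 and column k.
        minor = [row[:k] + row[k + 1:] for row in M[1:]]
        total += (-1) ** k * M[0][k] * det(minor)
    return total

print(det([[1, 2], [3, 4]]))                     # -> -2
print(det([[2, 0, 0], [0, 3, 0], [0, 0, 4]]))    # -> 24
```

The second call illustrates the triangular-matrix example: the determinant is the product of the diagonal entries.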
Example: Expansions of a Third-Order Determinant
Example: Determinant of a Triangular Matrix
3.10.1 General Properties of Determinants
3.10.2 Determination of Rank and Submatrices
Submatrix:
Any matrix obtained by deleting some rows and/or columns of a given matrix [A].
Example:
Find all submatrices of the following 2 × 3 matrix:
A =
[a11 a12 a13
a21 a22 a23
]
Of course, one obvious submatrix is the matrix [A] itself, with no row or column deleted. Other submatrices are the following.
Three 2 × 2 submatrices
[
a11 a12
a21 a22
] [a11 a13
a21 a23
] [a12 a13
a22 a23
]Two 1× 3 submatrices [
a11 a12 a13
] [a21 a22 a23
]Three 2× 1 submatrices [
a11
a21
] [a12
a22
] [a13
a23
]Six 1× 2 submatrices [
a11 a12
] [a11 a13
] [a21 a22
] [a21 a23
] [a22 a23
]Six 1× 1 submatrices [
a11
] [a12
] [a13
] [a21
] [a22
] [a23
]Rank
A general matrix [A] is said to be of rank r if it contains at least one square submatrix of size r × r with a non-vanishing (non-zero) determinant, while the determinant of any square submatrix of [A] of size greater than r is zero.
Example:
A =
4 2 1 3
6 3 4 7
2 1 0 1

Matrix A contains four 3 × 3 submatrices, but the determinant of each is zero. So the rank of [A] is not 3. Matrix A contains the submatrix
[4 1
6 4]
whose determinant is not zero (4 · 4 − 1 · 6 = 10). Therefore, the rank of [A] is 2.
For an n × n square matrix [A], if det[A] = 0, then its rank is less than n. In that case, [A] is called a singular matrix. Consequently, an n × n matrix [A] has a rank equal to n if and only if det[A] is not equal to zero; i.e., [A] is non-singular.
Example:
Show that matrix A is singular
A =
1 2 3
4 5 6
7 8 9

det(A) = |A| = 1 · (5 · 9 − 6 · 8) − 2 · (4 · 9 − 6 · 7) + 3 · (4 · 8 − 5 · 7)
= (45 − 48) − 2(36 − 42) + 3(32 − 35) = −3 + 12 − 9 = 0
The rank of [A] is less than n = 3. Hence, it is a singular matrix. The rank is 2.
3.10.3 Cramer's Rule
3.10.4 Useful Formulas for Inverses
Example: Inverse of a 2× 2 Matrix
Example: Inverse of a 3× 3 Matrix
Example: Inverse of a Diagonal Matrix
3.11 Determination of the Inverse by the Gauss-Jordan Method
Example
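The worked example itself is not reproduced in this transcript, so as a minimal sketch of the Gauss-Jordan inversion procedure: row-reduce the augmented matrix [A | I] until the left half is the identity; the right half is then A⁻¹. Partial pivoting is added here for numerical safety, and the 2 × 2 test matrix is illustrative:

```python
import numpy as np

def gauss_jordan_inverse(A):
    """Invert A by reducing [A | I] to [I | A^-1] (assumes A invertible)."""
    n = len(A)
    M = np.hstack([np.asarray(A, dtype=float), np.eye(n)])
    for j in range(n):
        p = j + np.argmax(np.abs(M[j:, j]))   # partial pivoting
        M[[j, p]] = M[[p, j]]
        M[j] /= M[j, j]                       # scale the pivot to 1
        for i in range(n):
            if i != j:
                M[i] -= M[i, j] * M[j]        # zero out column j elsewhere
    return M[:, n:]

A = np.array([[2.0, 1.0], [1.0, 1.0]])
print(gauss_jordan_inverse(A))  # -> [[ 1. -1.], [-1.  2.]]
```

In practice np.linalg.inv is the robust choice; the function above only illustrates the method.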
Chapter 4
Matrix Eigenvalue Problems
A matrix eigenvalue problem considers the vector equation
Ax = λx
where
A is an n × n square matrix,
λ is an unknown scalar,
x is an unknown vector.
x = 0 is always a solution to the above equation and so it is of no interest. We seek solutions where x ≠ 0.
Terminology: λ's that satisfy the matrix eigenvalue problem are called eigenvalues of A. The corresponding nonzero x's are called eigenvectors of A.
Now consider the following numeric examples. Observe the influence of multiplication by the matrix on the given vectors.
Case I:
[6 3] [5]   [33]
[4 7] [1] = [27]
In the first case, we get a totally new vector with a different direction and different length when compared to the original vector.
Case II:
[6 3] [3]   [30]
[4 7] [4] = [40]
In the second case something interesting happens. The multiplication produces a vector
[30]      [3]
[40] = 10 [4]
which means the new vector has the same direction as the original vector. The scale constant is
λ = 10
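The two cases can be reproduced numerically:

```python
import numpy as np

A = np.array([[6, 3], [4, 7]])

# Case I: the product is a vector with a new direction and length.
print(A @ np.array([5, 1]))   # -> [33 27]

# Case II: the product is a scalar multiple of the input, so
# [3, 4] is an eigenvector of A with eigenvalue 10.
v = np.array([3, 4])
print(A @ v)                  # -> [30 40], which equals 10 * v
```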
Formal definition of the eigenvalue problem: Let A = [aij]n×n.
Consider the following vector equation
Ax = λx
Find x ≠ 0 and the corresponding λ.
Geometric Interpretation of the Solution of the Eigenvalue Problem
1. Geometrically, we are looking for vectors x for which multiplication by A has the same effect as multiplication by a scalar λ.
2. Ax should be proportional to x.
3. Thus, the multiplication has the effect of producing, from the original vector x, a new vector that has the same or opposite (minus sign) direction as the original vector.
CHAPTER 4. MATRIX EIGENVALUE PROBLEMS 36
Terminology of the Eigenvalue Problem:
λ — eigenvalue / characteristic value / latent root of matrix A
x ≠ 0 — eigenvector / characteristic vector of matrix A
{λi} (i = 1, ..., n) — spectrum of A
max |λi| (i = 1, ..., n) — spectral radius of matrix A
4.1 How to Find Eigenvalues and Eigenvectors
This Example demonstrates how to systematically solve a simple eigenvalue problem.
Example
All steps of the eigenvalue problem are illustrated in terms of the matrix
A =
[−5  2]
[ 2 −2]
Eigenvalues: If we write the eigenvalue problem that corresponds to the given matrix,
Ax =
[−5  2] [x1]     [x1]
[ 2 −2] [x2] = λ [x2]
If we expand this vector equation we get
−5x1 + 2x2 = λx1
2x1 − 2x2 = λx2
This equation can be cast in the following form
(−5 − λ)x1 + 2x2 = 0
2x1 + (−2 − λ)x2 = 0
In matrix notation,
(A − λI)x = 0
This is a homogeneous linear system. By Cramer's theorem it has a nontrivial solution x ≠ 0 if and only if its coefficient determinant is zero, that is,
D(λ) = det(A − λI) =
| −5 − λ     2    |
|    2     −2 − λ |
D(λ) = λ² + 7λ + 6 = 0
Below you may find some more information about the terminology used in this chapter.
D(λ) — characteristic determinant; if expanded, the characteristic polynomial
D(λ) = 0 — characteristic equation of A
The roots of the characteristic equation
D(λ) = λ² + 7λ + 6 = 0
are the eigenvalues of A. In this particular problem, the λ's are
λ1 = −1 λ2 = −6
Eigenvector of A corresponding to λ1
in the original equations of eigenvalue problem set
λ = λ1
(−5 − λ)x1 + 2x2 = 0
2x1 + (−2 − λ)x2 = 0
Then we get
−4x1 + 2x2 = 0
2x1 − x2 = 0
A solution to the above set of equations can be obtained from either of the equations as
x2 = 2x1
Note that since we set λ = λ1, the determinant of the coefficient matrix of the original vector equation (A − λI)x = 0 vanishes:
D(λ1) = det(A − λ1I) = 0
The number of independent equations is one: the above two equations are linearly dependent, so in fact we have only one independent equation. If we examine the equations after we set λ = λ1,
−4x1 + 2x2 = 0
2x1 − x2 = 0
we see that we can get equation (1) if we multiply equation (2) by the scalar −2. We can compute the first eigenvector up to an unknown scalar multiplier; if we choose x1 = 1, we obtain the eigenvector
x1 =
[1]
[2]
We can check the solution by substituting this eigenvector into the original eigenvalue problem:
Ax1 =
[−5  2] [1]   [−1]
[ 2 −2] [2] = [−2] = (−1)x1 = λ1x1
Eigenvector of A corresponding to λ2
For λ = λ2 = −6
(−5 − λ)x1 + 2x2 = 0
2x1 + (−2 − λ)x2 = 0
reduces to
x1 + 2x2 = 0
2x1 + 4x2 = 0
Solution of the above homogeneous system of equations is
x2 = −x1/2
One of the unknowns is arbitrary, set
x1 = 2
we can compute
x2 = −1
The eigenvector, up to an unknown scale, can be written as
x2 =
[ 2]
[−1]
Let's check the result:
Ax2 =
[−5  2] [ 2]   [−12]
[ 2 −2] [−1] = [  6] = (−6)x2 = λ2x2
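The hand computation can be confirmed with numpy.linalg.eig:

```python
import numpy as np

A = np.array([[-5.0, 2.0], [2.0, -2.0]])
lams, X = np.linalg.eig(A)

# Eigenvalues -1 and -6 (the ordering and the unit-length scaling of
# the eigenvectors are numpy implementation details).
print(np.sort(lams))  # approximately [-6. -1.]

# Each column of X satisfies A x = lambda x:
for lam, x in zip(lams, X.T):
    assert np.allclose(A @ x, lam * x)
```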
4.2 General form of Eigenvalue Problem for an n× n Matrix
In the sequel we will investigate the general form of the eigenvalue problem for a matrix A = [aij]n×n:
a11x1 + · · ·+ a1nxn = λx1
a21x1 + · · ·+ a2nxn = λx2
...
an1x1 + · · ·+ annxn = λxn
The above homogeneous set of linear equations can be written as
(a11 − λ)x1 + a12x2 + · · · + a1nxn = 0
a21x1 + (a22 − λ)x2 + · · · + a2nxn = 0
...
an1x1 + an2x2 + · · · + (ann − λ)xn = 0
In matrix notation,
(A − λI)x = 0
By Cramer's theorem, this homogeneous linear system of equations has a nontrivial solution if and only if the corresponding determinant of the coefficients is zero:
D(λ) = det(A − λI) =
| a11 − λ    a12     · · ·    a1n    |
|   a21    a22 − λ   · · ·    a2n    |
|   ...      ...     . . .    ...    |
|   an1      an2     · · ·  ann − λ  | = 0
Let's talk about terminology.
A − λI — characteristic matrix
D(λ) — characteristic determinant
det(A − λI) = 0 — characteristic equation of matrix A
D(λ) = polynomial of order n in λ — characteristic polynomial of A
Eigenvalues: The eigenvalues of a square matrix A are the roots of the characteristic equation of A. The eigenvalues must be determined first. Once they are known, the corresponding eigenvectors are obtained from the homogeneous system of linear equations, for instance by Gauss elimination, where λ is the eigenvalue for which an eigenvector is wanted.
Example: Multiple Eigenvalues
Find the eigenvalues and eigenvectors of the matrix A
A =
−2 +2 −3
+2 +1 −6
−1 −2 +0
The eigenvalue problem in matrix notation can be written as
(A − λI)x = 0
the characteristic determinant gives the characteristic equation
D(λ) = det(A − λI) = 0
−λ3 − λ2 + 21λ+ 45 = 0
The eigenvalues of matrix A, which are the roots of the characteristic equation, can be found as
λ1 = 5, λ2 = λ3 = −3
To find the eigenvectors, we apply Gauss elimination to the system
(A − λI)x = 0
Set λ = λ1
(A − λI) = A − 5I =
−7 +2 −3
+2 −4 −6
−1 −2 −5
Apply Gauss elimination to reduce the above system to echelon form. Note that we do not necessarily have to use the augmented matrix, since the vector of constants is all zeros. The above matrix row-reduces to
−7     2      −3
 0   −24/7   −48/7
 0     0       0
Hence it has rank 2. Choose x3 = −1; then, using
(−24/7)x2 − (48/7)x3 = 0
we can compute x2 = 2; then, using
−7x1 + 2x2 − 3x3 = 0
we can compute
x1 = 1
Hence the eigenvector corresponding to λ = λ1 is
[ 1]
[ 2]
[−1]
Set λ = λ2
(A − λI) = A + 3I =
 1  2 −3
 2  4 −6
−1 −2  3
Apply Gauss elimination to reduce the above system to echelon form (again without the augmented matrix, since the constants are all zeros). The above matrix row-reduces to
1 2 −3
0 0  0
0 0  0
Hence it has rank 1. Use the only available equation, which is the first equation, to compute x1:
x1 + 2x2 − 3x3 = 0
Solve for x1
x1 = −2x2 + 3x3
Choose
x2 = 1 x3 = 0
and
x2 = 0 x3 = 1
we obtain two linearly independent eigenvectors of matrix A corresponding to λ = λ2 = λ3, because the rank is one and the number of unknowns is three. These eigenvectors are
x2 =
[−2]
[ 1]
[ 0]
and
x3 =
[3]
[0]
[1]
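A quick numerical check of all three eigenvectors found above:

```python
import numpy as np

A = np.array([[-2.0, 2.0, -3.0],
              [2.0, 1.0, -6.0],
              [-1.0, -2.0, 0.0]])

# The eigenvector found for lambda = 5 ...
x1 = np.array([1.0, 2.0, -1.0])
assert np.allclose(A @ x1, 5 * x1)

# ... and the two independent eigenvectors for the double eigenvalue -3.
for v in (np.array([-2.0, 1.0, 0.0]), np.array([3.0, 0.0, 1.0])):
    assert np.allclose(A @ v, -3 * v)
```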
Example: Real Matrices with Complex Eigenvalues and Eigenvectors
Since real polynomials may have complex roots (which then occur in conjugate pairs), a real matrix may have complex eigenvaluesand eigenvectors
A =
[ 0  1]
[−1  0]
The characteristic equation of the skew-symmetric matrix A is
det(A − λI) =
| −λ   1 |
| −1  −λ | = λ² + 1 = 0
Solution of the above characteristic equation gives the eigenvalues
λ1 = i (= √−1), λ2 = −i
Eigenvectors can be obtained, for λ1 and λ2 respectively, from
−ix1 + x2 = 0
ix1 + x2 = 0
Choose arbitrarily x1 = 1; then
x1 =
[1]
[i]
and x2 =
[ 1]
[−i]
Eigenvalues of the Transpose
The transpose AT of a square matrix A has the same eigenvalues as A.
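Both facts, the conjugate eigenvalue pair of the skew-symmetric example and the transpose property, can be verified numerically:

```python
import numpy as np

A = np.array([[0.0, 1.0], [-1.0, 0.0]])
lams = np.linalg.eigvals(A)

# The real skew-symmetric matrix has the complex conjugate pair -i, +i ...
assert np.allclose(np.sort_complex(lams), [-1j, 1j])

# ... and A^T has the same eigenvalues as A.
assert np.allclose(np.sort_complex(np.linalg.eigvals(A.T)),
                   np.sort_complex(lams))
```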
4.3 Symmetric, Skew-Symmetric, and Orthogonal Matrices
4.3.1 Introduction
De�nitions:
A real square matrix A = [ajk] is called symmetric if transposition leaves it unchanged:
AT = A, thus akj = ajk
A real square matrix A = [ajk] is called skew-symmetric if transposition gives the negative of A:
AT = −A, thus akj = −ajk
A real square matrix A = [ajk] is called orthogonal if transposition gives the inverse of A:
AT = A−1
Any real square matrix A may be written as the sum of a symmetric matrix R and a skew-symmetric matrix S, where
R = (1/2)(A + AT) and S = (1/2)(A − AT)
Eigenvalues of Symmetric and Skew-Symmetric Matrices
• The eigenvalues of a symmetric matrix are real.
• The eigenvalues of a skew-symmetric matrix are pure imaginary or zero.
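Both eigenvalue statements, together with the R/S decomposition above, can be checked numerically; a sketch using an arbitrary random matrix:

```python
import numpy as np

rng = np.random.default_rng(0)
M = rng.standard_normal((4, 4))

# Decompose M into symmetric and skew-symmetric parts.
R = (M + M.T) / 2
S = (M - M.T) / 2
assert np.allclose(R + S, M)

# Eigenvalues of the symmetric part are real ...
assert np.allclose(np.imag(np.linalg.eigvals(R)), 0)
# ... and those of the skew-symmetric part are purely imaginary (or zero).
assert np.allclose(np.real(np.linalg.eigvals(S)), 0)
```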
4.3.2 Orthogonal Transformations and Orthogonal Matrices
Orthogonal transformations are transformations
y = Ax
where A is an orthogonal matrix. Plane rotation through an angle θ is an orthogonal transformation:
y =
[y1]   [ cos θ   sin θ] [x1]
[y2] = [−sin θ   cos θ] [x2]
It can be shown that any orthogonal transformation in the plane or in three-dimensional space is a rotation
Invariance of Inner Product
An orthogonal transformation preserves the value of the inner product of vectors a and b in Rn, de�ned by
a · b = aT b = [a1 · · · an][b1 · · · bn]T = a1b1 + · · · + anbn
That is, for any a and b in Rn, any orthogonal n × n matrix A, and
u = Aa
v = Ab
we have
u · v = a · b
Hence the transformation also preserves the length or norm of any vector a in Rn, given by ‖a‖ = √(a · a) = √(aTa).
Orthonormality of Column and Row Vectors
A real square matrix is orthogonal if and only if its column vectors a1, a2, ..., an (and also its row vectors) form an orthonormal system, that is,
aj · ak = aTj ak = 0 if j ≠ k, and 1 if j = k
Determinant of an Orthogonal Matrix
The determinant of an orthogonal matrix has the value +1 or −1
Eigenvalues of an Orthogonal Matrix
The eigenvalues of an orthogonal matrix A are real or complex conjugates in pairs and have absolute value 1
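These properties of orthogonal matrices can all be verified for the plane rotation used above (θ = 0.3 is an arbitrary choice):

```python
import numpy as np

theta = 0.3
Q = np.array([[np.cos(theta), np.sin(theta)],
              [-np.sin(theta), np.cos(theta)]])   # plane rotation

assert np.allclose(Q.T @ Q, np.eye(2))                 # Q^T = Q^-1
assert np.isclose(abs(np.linalg.det(Q)), 1.0)          # det = +1 or -1
assert np.allclose(np.abs(np.linalg.eigvals(Q)), 1.0)  # |lambda| = 1

# Inner products (hence lengths) are preserved:
a, b = np.array([1.0, 2.0]), np.array([3.0, -1.0])
assert np.isclose((Q @ a) @ (Q @ b), a @ b)
```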
4.4 Eigenbases. Diagonalization. Quadratic Forms
4.4.1 Introduction
Eigenvectors of an n× n matrix A may (or may not!) form a basis for Rn
An "eigenbasis" (basis of eigenvectors), if it exists, is of great advantage, because then we can write the following:
x = c1x1 + c2x2 + · · ·+ cnxn
where x1, x2, ..., xn are the eigenvectors that form the eigenbasis. Since (λj, xj) is an eigenvalue-eigenvector pair of the matrix eigenvalue problem
Axj = λjxj
we can write
y = Ax = A(c1x1 + c2x2 + · · · + cnxn)
y = c1Ax1 + c2Ax2 + · · · + cnAxn
y = c1λ1x1 + c2λ2x2 + · · · + cnλnxn
This shows that we have decomposed the complicated action of A on an arbitrary vector x into a sum of simple actions(multiplication by scalars) on the eigenvectors of A.
Theorem: Basis of Eigenvectors
If an n × n matrix A has n distinct eigenvalues, then A has a basis of eigenvectors x1, x2, ..., xn for Rn.
Theorem: Symmetric Matrices
A symmetric matrix has an orthonormal basis of eigenvectors for Rn
4.4.2 Similarity of Matrices. Diagonalization
Eigenbases also play a role in reducing a matrix A to a diagonal matrix whose entries are the eigenvalues of A. This is done by a "similarity transformation".
De�nition: Similar Matrices. Similarity Transformation
An n × n matrix Â is called similar to an n × n matrix A if
Â = P−1AP
for some (nonsingular!) n × n matrix P. This transformation, which gives Â from A, is called a similarity transformation.
Theorem: Eigenvalues and Eigenvectors of Similar Matrices
If Â is similar to A, then Â has the same eigenvalues as A. Furthermore, if x is an eigenvector of A, then y = P−1x is an eigenvector of Â corresponding to the same eigenvalue.
Example: Eigenvalues and Vectors of Similar Matrices
Let,
A =
[6 −3]
[4 −1]
and
P =
[1 3]
[1 4]
Then
Â = P−1AP =
[ 4 −3] [6 −3] [1 3]   [3 0]
[−1  1] [4 −1] [1 4] = [0 2]
Â has the eigenvalues
λ1 = 3 and λ2 = 2
The characteristic equation of A is
(6 − λ)(−1 − λ) + 12 = λ² − 5λ + 6 = 0
The roots of this characteristic equation (the eigenvalues of A) are
λ1 = 3 and λ2 = 2
which confirms the first part of the theorem. In order to compute the eigenvectors, we use the matrix equation
(A − λI)x = 0
If we select the first row, we get
(6 − λ)x1 − 3x2 = 0
For λ = λ1 = 3, this gives
3x1 − 3x2 = 0
so the first eigenvector can be written as
x1 =
[1]
[1]
For λ = λ2 = 2, this gives
4x1 − 3x2 = 0
so the second eigenvector can be written as
x2 =
[3]
[4]
The theorem states that
y1 = P−1x1 =
[ 4 −3] [1]   [1]
[−1  1] [1] = [0]
y2 = P−1x2 =
[ 4 −3] [3]   [0]
[−1  1] [4] = [1]
Indeed, these are eigenvectors of the diagonal matrix Â. We see that x1 and x2 are the columns of P. By a suitable similarity transformation we can now transform a matrix A to a diagonal matrix D whose diagonal entries are the eigenvalues of A:
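The whole similarity computation can be reproduced in a few lines:

```python
import numpy as np

A = np.array([[6.0, -3.0], [4.0, -1.0]])
P = np.array([[1.0, 3.0], [1.0, 4.0]])

# The similarity transformation P^-1 A P gives the diagonal matrix above.
Ahat = np.linalg.inv(P) @ A @ P
assert np.allclose(Ahat, [[3.0, 0.0], [0.0, 2.0]])

# Similar matrices share eigenvalues:
assert np.allclose(np.sort(np.linalg.eigvals(A)),
                   np.sort(np.linalg.eigvals(Ahat)))

# And P^-1 maps the eigenvector [1, 1] of A to [1, 0]:
assert np.allclose(np.linalg.inv(P) @ np.array([1.0, 1.0]), [1.0, 0.0])
```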
Theorem: Diagonalization of a Matrix
If an n× n matrix A has a basis of eigenvectors, then
D = X−1AX
is diagonal, with the eigenvalues of A as the entries on the main diagonal. Here X is the matrix with these eigenvectors as column vectors. Also,
Dm = X−1AmX (m = 2, 3, ...)
Example: Diagonalization
Diagonalize
A =
  7.3  0.2 −3.7
−11.5  1.0  5.5
 17.7  1.8 −9.3
The characteristic determinant can be written as |A − λI| = 0
This gives the characteristic equation
−λ³ − λ² + 12λ = 0
The roots (eigenvalues) of this characteristic equation are
λ1 = 3, λ2 = −4, λ3 = 0
We apply Gauss elimination to (A − λI)x = 0
with
λ = λ1, λ2, λ3
and find the corresponding eigenvectors (λ1, x1), (λ2, x2), (λ3, x3). From these eigenvectors we form the transformation matrix X:
X = [x1 x2 x3]
Then we use Gauss-Jordan elimination to compute X−1 from X. The results can be summarized as

λ1 = 3, x1 = [−1, 3, −1]T
λ2 = −4, x2 = [1, −1, 3]T
λ3 = 0, x3 = [2, 1, 4]T

X =
−1  1  2
 3 −1  1
−1  3  4

X−1 =
−0.7  0.2  0.3
−1.3 −0.2  0.7
 0.8  0.2 −0.2

Calculate AX and premultiply by X−1:

D = X−1AX =
[−0.7  0.2  0.3] [  7.3  0.2 −3.7] [−1  1  2]   [3  0  0]
[−1.3 −0.2  0.7] [−11.5  1.0  5.5] [ 3 −1  1] = [0 −4  0]
[ 0.8  0.2 −0.2] [ 17.7  1.8 −9.3] [−1  3  4]   [0  0  0]
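The diagonalization can be verified numerically from the eigenvectors found above:

```python
import numpy as np

A = np.array([[7.3, 0.2, -3.7],
              [-11.5, 1.0, 5.5],
              [17.7, 1.8, -9.3]])

# Columns of X are the eigenvectors for the eigenvalues 3, -4, 0.
X = np.array([[-1.0, 1.0, 2.0],
              [3.0, -1.0, 1.0],
              [-1.0, 3.0, 4.0]])

D = np.linalg.inv(X) @ A @ X
assert np.allclose(D, np.diag([3.0, -4.0, 0.0]))
```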
4.4.3 Quadratic Forms. Transformation to Principal Axes
By definition, a quadratic form Q in the components x1, x2, ..., xn of a vector x is
Q = xTAx = Σ(j=1..n) Σ(k=1..n) ajk xj xk
When the above double summation is expanded,
Q = a11x1² + a12x1x2 + · · · + a1nx1xn
  + a21x2x1 + a22x2² + · · · + a2nx2xn
  + · · ·
  + an1xnx1 + an2xnx2 + · · · + annxn²
A = [ajk] is called the coefficient matrix. A is assumed to be symmetric.
We know that the symmetric coefficient matrix A has an orthonormal basis of eigenvectors. Hence if we form the matrix X from these orthonormal vectors,
X = [x1 x2 · · · xn]
we obtain a matrix X that is orthogonal, so we may conclude that
X−1 = XT
Then we can write
D = X−1AX
or
A = XDX−1
or, by using the orthogonality property of X (that is, X−1 = XT), we can write A as
A = XDXT
If we substitute this form of A into the quadratic form Q,
Q = xTXDXTx
If we set
XTx = y
and use the orthogonal property of X that is X−1 = XT , we have
X−1x = y
or
x = Xy
Similarly
xTX = (XTx)T = yT
and
XTx = y
so Q simply becomes
Q = yTDy = λ1y1² + λ2y2² + · · · + λnyn²
Theorem: Principal Axes Theorem
The substitution of the following transformation
x = Xy
transforms the quadratic form
Q = xTAx = Σ(j=1..n) Σ(k=1..n) ajk xj xk  (akj = ajk)
to the principal axes form or canonical form (10), where λ1, λ2, ..., λn are the (not necessarily distinct) eigenvalues of the (symmetric!) matrix A, and X is an orthogonal matrix with corresponding eigenvectors x1, x2, ..., xn, respectively, as column vectors.
Example: Transformation to Principal Axes. Conic Sections
Transform the conic section which is represented by the following quadratic form
Q = 17x1² − 30x1x2 + 17x2²
Q can be written as
Q = xTAx
where
A =
[ 17 −15]
[−15  17]
x =
[x1]
[x2]
First we must compute the transformation matrix X; the columns of X are the eigenvectors of matrix A, so we must solve an eigenvalue problem. The characteristic equation of matrix A is
(17 − λ)² − 15² = 0
Roots of the characteristic equation, eigenvalues of matrix A are
λ1 = 2 λ2 = 32
Using the theorem, we know that if we solve the eigenvalue problem completely, find the corresponding eigenvectors (λ1, x1), (λ2, x2), and form the orthogonal transformation matrix X from the eigenvectors {x1, x2},
X = [x1 x2]
then we can use the transformation
x = Xy
together with the knowledge that X−1 = XT (because the matrix X is orthogonal). We will end up with the following representation of the quadratic form in y:
Q = λ1y1² + λ2y2²
or
Q = 2y1² + 32y2²
To calculate the direction of the principal axes in the xy-coordinates, we have to determine normalized eigenvectors. The eigenvalue problem can be set up as
(A − λI)x = 0
The eigenvalues are
λ1 = 2
λ2 = 32
Solving (A − λI)x = 0 with λ = λ1, λ2, we get
x1 =
[1/√2]
[1/√2]
x2 =
[−1/√2]
[ 1/√2]
Hence
x = Xy =
[1/√2  −1/√2] [y1]
[1/√2   1/√2] [y2]
that is,
x1 = y1/√2 − y2/√2
x2 = y1/√2 + y2/√2
This is a 45° rotation.
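The transformation to principal axes can be checked numerically:

```python
import numpy as np

A = np.array([[17.0, -15.0], [-15.0, 17.0]])
X = np.array([[1.0, -1.0],
              [1.0, 1.0]]) / np.sqrt(2)   # normalized eigenvectors as columns

# x = X y reduces Q = x^T A x to the principal axes form 2 y1^2 + 32 y2^2:
assert np.allclose(X.T @ A @ X, np.diag([2.0, 32.0]))

# Spot check with an arbitrary y:
y = np.array([1.0, 2.0])
x = X @ y
assert np.isclose(x @ A @ x, 2 * y[0]**2 + 32 * y[1]**2)
```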
Chapter 5
Vector and Scalar Functions and Their Fields.
Vector Calculus: Derivatives
5.1 Introduction
De�nition: Vector Function
Let P be any point in a domain of de�nition. Then a vector function v is de�ned as
v = v(P) = [v1(P) v2(P) v3(P)]
Note that v is a 3D vector and its value depends on the point P in space. In general, a vector function defines a vector field in its domain of definition.
Example: Typical vector �elds
1. Field of tangent vectors of a curve
2. Normal vectors of a surface
3. Velocity �eld of a rotating body
De�nition: Scalar Function
Values of a scalar function are scalars. It is de�ned as
f = f (P )
that depends on P. Like a vector function, a scalar function defines a scalar field in a three-dimensional domain, or on a surface or curve in space.
Example: Typical scalar �elds
1. Temperature �eld of a body
2. Pressure �eld of the air in Earth's atmosphere
CHAPTER 5. VECTOR AND SCALAR FUNCTIONS AND THEIR FIELDS. VECTOR CALCULUS: DERIVATIVES 49
Notation Vector Function: Cartesian coordinates x, y, z
Instead of writing v (P ), we can write
v(x, y, z) = [v1(x, y, z) v2(x, y, z) v3(x, y, z)]
where P = [x y z]
Notation Scalar Function: Cartesian coordinates x, y, z
Instead of writing f (P ), we can write
f (P ) = f (x, y, z)
where P = [x y z]
Caution: Vector Field Representation
The components depend on our choice of coordinate system, whereas a vector �eld that has a physical or geometric meaningshould have magnitude and direction depending only on P , not on the choice of coordinate system.
Example: Scalar Function (Euclidean Distance in Space)
f(P) = f(x, y, z) = √((x − x0)² + (y − y0)² + (z − z0)²)
f(P) is a scalar function; f(P) defines a scalar field in space.
Example: Vector Field (Velocity Field)
At any instant the velocity vectors v(P ) of a rotating body B constitute a vector �eld, called the velocity �eld of the rotation.
v(x, y, z) = w × r = w × [x y z] = w × (xi + yj + zk)
w = ωk
Then
v =
| i  j  k |
| 0  0  ω |
| x  y  z | = ω[−y  x  0] = ω(−yi + xj)
Example: Vector Field (Field of Force, Gravitational Field)
According to Newton's law of gravitation: let a particle A of mass M be fixed at a point and let a particle B of mass m be free to take up various positions P in space. Then A attracts B. The vector function that describes the gravitational force acting on B is
p = −c (x − x0)/r³ i − c (y − y0)/r³ j − c (z − z0)/r³ k
Gravitational �eld
5.2 Vector Calculus
First we will study basic concepts of
• convergence,
• continuity,
• and di�erentiability
of vector functions.
De�nition: Convergence
An infinite sequence of vectors a(n), n = 1, 2, ..., is said to converge if there is a vector a such that
lim (n→∞) |a(n) − a| = 0
a is called the limit vector of that sequence:
lim (n→∞) a(n) = a
Every component of this sequence of vectors, expressed in Cartesian coordinates, must converge to the corresponding component of a.
De�nition: Limit
A vector function v(t) of a real variable t is said to have the limit l as t approaches t0 if v(t) is defined in some neighborhood of t0 (possibly except at t0 itself) and
lim (t→t0) |v(t) − l| = 0
Then we write
lim (t→t0) v(t) = l
De�nition: Neighborhood
A neighborhood of t0 is an interval (segment) on the t− axis containing t0 as an interior point (not as an endpoint).
De�nition: Continuity
A vector function v(t) is said to be continuous at t = t0 if it is defined in some neighborhood of t0 (including at t0 itself!) and
lim (t→t0) v(t) = v(t0)
In Cartesian coordinates,
v(t) = [v1(t) v2(t) v3(t)] = v1(t)i + v2(t)j + v3(t)k
If v1(t), v2(t), v3(t) are continuous at t0, then we can conclude that v(t) is continuous at t0.
De�nition: Derivative of a Vector Function
A vector function v(t) is said to be differentiable at a point t if the following limit exists:
v′(t) = lim (Δt→0) [v(t + Δt) − v(t)] / Δt
This vector v′(t) is called the derivative of v(t).
In the Cartesian coordinate system,
v′(t) = [v1′(t) v2′(t) v3′(t)]
Hence the derivative v′(t) is obtained by differentiating each component separately.
Differentiation Rules
(cv)′ = cv′
(u + v)′ = u′ + v′
(u · v)′ = u′ · v + u · v′
(u × v)′ = u′ × v + u × v′
(u v w)′ = (u′ v w) + (u v′ w) + (u v w′)
5.3 Partial Derivatives of a Vector Function
Suppose
v = [v1 v2 v3] = v1i + v2j + v3k
and the components are differentiable functions of n variables t1, t2, ..., tn. Then ∂v/∂tm is defined as the vector function
∂v/∂tm = (∂v1/∂tm)i + (∂v2/∂tm)j + (∂v3/∂tm)k
Second partial derivatives can be written as
∂²v/∂tl∂tm = (∂²v1/∂tl∂tm)i + (∂²v2/∂tl∂tm)j + (∂²v3/∂tl∂tm)k
5.4 Curves. Arc Length. Curvature. Torsion
The application of vector calculus to geometry is a field known as differential geometry. Bodies that move in space form paths that may be represented by curves C. This shows the need for parametric representations of C with a parameter t, which may denote time or something else.
A typical parametric representation is given by
r(t) = [x(t) y(t) z(t)] = x(t)i + y(t)j + z(t)k
Here t is the parameter and x, y, z are the Cartesian coordinates. To each t = t0 there corresponds a point of C with position vector r(t0) whose coordinates are x(t0), y(t0), z(t0).
The use of parametric representations has key advantages over other representations that involve projections into the xy-plane and xz-plane or involve a pair of equations with y or with z as the independent variable. The parametric representation induces an orientation on C. This means that as we increase t, we travel along the curve C in a certain direction. The sense of increasing t is called the positive sense on C; the sense of decreasing t is then called the negative sense on C. The following examples give parametric representations of several important curves.
Example: Circle. Parametric Representation. Positive Sense
The circle x² + y² = 4, z = 0 in the xy-plane with center 0 and radius 2 can be represented parametrically by
r(t) = [2 cos t  2 sin t  0]
or simply by
r(t) = [2 cos t  2 sin t]
where 0 ≤ t ≤ 2π. Indeed,
x² + y² = (2 cos t)² + (2 sin t)² = 4(cos² t + sin² t) = 4
For t = 0 we have r(0) = [2 0]. For t = π/2 we have r(π/2) = [0 2]. The positive sense induced by this representation is the counterclockwise sense.
If we replace t with t* = −t, we have t = −t* and get
r*(t*) = [2 cos(−t*)  2 sin(−t*)] = [2 cos t*  −2 sin t*]
This has reversed the orientation, and the circle is now oriented clockwise.
Example: Ellipse
The vector function
r(t) = [a cos t  b sin t  0] = a cos t i + b sin t j
represents an ellipse in the xy-plane with center at the origin and principal axes in the directions of the x- and y-axes. In fact, since cos² t + sin² t = 1, we obtain
x²/a² + y²/b² = 1, z = 0
If b = a, then it represents a circle of radius a.
Example: Straight Line
A straight line L through a point A with position vector a in the direction of a constant vector b can be represented parametrically in the form
r(t) = a + tb = [a1 + tb1  a2 + tb2  a3 + tb3]
If b is a unit vector, its components are the direction cosines of L. In this case, |t| measures the distance of the points of L from A. For instance, the straight line in the xy-plane through A: (3, 2) having slope 1 is
r(t) = [3 2 0] + t[1/√2  1/√2  0] = [3 + t/√2  2 + t/√2  0]
A plane curve is a curve that lies in a plane in space. A curve that is not plane is called a twisted curve.
Example: Circular Helix
The twisted curve C represented by the vector function
r(t) = [a cos t  a sin t  ct] = a cos t i + a sin t j + ct k

is called a circular helix. It lies on the cylinder x² + y² = a². If c > 0 the helix is shaped like a right-handed screw; if c < 0 it looks like a left-handed screw; if c = 0 it is a circle.
A simple curve is a curve without multiple points, that is, without points at which the curve intersects or touches itself. The circle and the helix are simple curves. An arc of a curve is the portion between any two points of the curve.
Tangent to a Curve
The next idea is the approximation of a curve by straight lines, leading to tangents and to a definition of length. Tangents are straight lines touching a curve. The tangent to a simple curve C at a point P of C is the limiting position of a straight line L through P and a point Q of C as Q approaches P along C. If C is given by r(t), and P and Q correspond to t and t + Δt, then a vector in the direction of L is

(1/Δt)[r(t + Δt) − r(t)]

In the limit this vector becomes the derivative

r′(t) = lim_{Δt→0} (1/Δt)[r(t + Δt) − r(t)]

provided r(t) is differentiable. If r′(t) ≠ 0, we call r′(t) a tangent vector of C at P because it has the direction of the tangent. The corresponding unit vector is the unit tangent vector

u = (1/|r′|) r′

Note that both r′ and u point in the direction of increasing t. Hence their sense depends on the orientation of C; it is reversed if we reverse the orientation. It is now easy to see that the tangent to C at P is given by

q(w) = r + wr′

This is the sum of the position vector r of P and a multiple of the tangent vector r′ of C at P. Both vectors depend on P. The variable w is the parameter.
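The limit definition and the tangent formula can be checked numerically: a centered difference approximates r′(t), and q(w) = r + wr′ then produces points on the tangent line. A minimal sketch; the curve used here, the ellipse r(t) = (2cos t, sin t), is an illustrative assumption:

```python
import math

def r(t):
    # illustrative curve: ellipse x = 2 cos t, y = sin t
    return (2 * math.cos(t), math.sin(t))

def r_prime(t, h=1e-6):
    # centered-difference approximation of the derivative r'(t)
    p, q = r(t + h), r(t - h)
    return tuple((a - b) / (2 * h) for a, b in zip(p, q))

def tangent_point(t, w):
    # q(w) = r(t) + w r'(t): a point on the tangent line at parameter t
    pos, vel = r(t), r_prime(t)
    return tuple(p + w * v for p, v in zip(pos, vel))

t0 = math.pi / 4
print(r_prime(t0))           # close to (-sqrt(2), 1/sqrt(2))
print(tangent_point(t0, 1))  # close to (0, sqrt(2))
```

The numerical tangent vector agrees with the analytic derivative (−2sin t, cos t) to the accuracy of the finite difference.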
Example: Tangent to an Ellipse
Find the tangent to the ellipse

(1/4)x² + y² = 1

at P: (√2, 1/√2).

Solution: The ellipse can be written parametrically as

r(t) = [a cos t  b sin t  0] = a cos t i + b sin t j

which represents an ellipse in the xy-plane with center at the origin and principal axes in the directions of the x- and y-axes; since cos²t + sin²t = 1, we obtain x²/a² + y²/b² = 1, z = 0. Thus we can identify the constants of the ellipse as

a = 2,  b = 1

This gives

r(t) = [2cos t  sin t]

The derivative is

r′(t) = [−2sin t  cos t]

We must find the t that corresponds to P:

r(π/4) = [2cos(π/4)  sin(π/4)] = [√2  1/√2]

Hence we conclude that t = π/4. We can compute

r′(π/4) = [−√2  1/√2]

Thus we get the answer

q(w) = [√2  1/√2] + w[−√2  1/√2] = [√2(1 − w)  (1/√2)(1 + w)]

Length of a Curve
The length l of a curve will be the limit of the lengths of broken lines of n chords, with larger and larger n. Let r(t), a ≤ t ≤ b, represent C. For each n = 1, 2, . . . we subdivide (partition) the interval a ≤ t ≤ b by points

t0(= a), t1, t2, . . . , tn−1, tn(= b), where t0 < t1 < t2 < · · · < tn

This gives a broken line of chords with endpoints r(t0), . . . , r(tn). We do this arbitrarily, but so that the greatest |Δtm| = |tm − tm−1| approaches zero as n → ∞. The lengths l1, l2, . . . of these broken lines of chords can be obtained from the Pythagorean theorem. If r(t) has a continuous derivative, it can be shown that the sequence l1, l2, . . . has a limit, which is independent of the particular choice of the representation of C and of the choice of subdivisions. This limit is given by the integral

l = ∫_a^b √(r′ · r′) dt,  (r′ = dr/dt)

l is called the length of C, and C is called rectifiable. The actual evaluation of the integral will, in general, be difficult. However, some simple cases are given in the problem set.
Arc Length s of a Curve
The length of a curve C is a constant, a positive number. But if we replace the fixed b with a variable t, the integral becomes a function of t, denoted by s(t) and called the arc length function or simply the arc length of C. Thus

s(t) = ∫_a^t √(r′ · r′) dt̃,  (r′ = dr/dt̃)

where the variable of integration is denoted by t̃ because t is now used in the upper limit. Geometrically, s(t0) with some t0 is the length of the arc of C between the points with parametric values a and t0. The choice of a (the point s = 0) is arbitrary; changing a means changing s by a constant.
Linear Element ds.
If we differentiate s(t) and square, we have

(ds/dt)² = (dr/dt) · (dr/dt) = |r′(t)|² = (dx/dt)² + (dy/dt)² + (dz/dt)²

We can write

dr = [dx  dy  dz] = dx i + dy j + dz k

Then we can write

ds² = dx² + dy² + dz²
ds is called the linear element of C.
Arc Length as Parameter.
The use of s in

r(t) = [x(t) y(t) z(t)] = x(t)i + y(t)j + z(t)k

instead of an arbitrary t simplifies various formulas:

r(s) = [x(s) y(s) z(s)] = x(s)i + y(s)j + z(s)k

For the unit tangent vector

u(t) = (1/|r′(t)|) r′(t)

we simply obtain

u(s) = r′(s)

Indeed,

|r′(s)| = ds/ds = 1

shows that r′(s) is a unit vector.
Example: Circular Helix. Circle. Arc Length as Parameter
The helix

r(t) = [a cos t  a sin t  ct]

has the derivative

r′(t) = [−a sin t  a cos t  c]

Hence

r′ · r′ = a² + c²

which is a constant denoted by K². Hence the integrand in

s(t) = ∫_a^t √(r′ · r′) dt̃

is constant and equal to K (taking a = 0 as the reference point), and the integral is

s = Kt

Thus t = s/K, so that a representation of the helix with the arc length s as parameter is

r*(s) = r(s/K) = [a cos(s/K)  a sin(s/K)  cs/K],  K = √(a² + c²)

A circle is obtained if we set c = 0. Then K = a, t = s/a, and a representation with arc length s as parameter is

r*(s) = r(s/a) = [a cos(s/a)  a sin(s/a)  0]
Curves in Mechanics. Velocity. Acceleration
Curves play a basic role in mechanics, where they may serve as paths of moving bodies. Such a curve C should then be represented by a parametric representation r(t) with time t as parameter. The tangent vector of C is called the velocity vector v because, being tangent, it points in the instantaneous direction of motion, and its length gives the speed

|v| = |r′| = √(r′ · r′) = ds/dt

(see the formula for (ds/dt)² above). The second derivative of r(t) is called the acceleration vector and is denoted by a. Its length |a| is called the acceleration of the motion. Thus

v(t) = r′(t),  a(t) = v′(t) = r′′(t)
Tangential and Normal Acceleration.
Whereas the velocity vector is always tangent to the path of motion, the acceleration vector will in general have another direction. We can split the acceleration vector into two directional components, that is,

a = atan + anorm

where the tangential acceleration vector atan is tangent to the path and the normal acceleration vector anorm is normal (perpendicular) to the path. Expressions for the vectors are obtained from

v(t) = r′(t),  a(t) = v′(t) = r′′(t)

by the chain rule:

v(t) = dr/dt = (dr/ds)(ds/dt) = u(s) ds/dt

where u(s) = r′(s) is the unit tangent vector. Another differentiation gives

a(t) = dv/dt = d/dt [u(s) ds/dt] = (du/ds)(ds/dt)² + u(s) d²s/dt²

Since the tangent vector u(s) has constant length (length one), its derivative du/ds is perpendicular to u(s). Hence the first term on the right side is the normal acceleration vector, and the second term is the tangential acceleration vector. Now the length |atan| is the absolute value of the projection of a in the direction of v, that is,

|atan| = |a · v| / |v|

Hence atan is this expression times the unit vector in the direction of v, that is,

atan = ((a · v)/(v · v)) v

Also,

anorm = a − atan
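The decomposition formulas can be applied directly to numbers. A sketch; the motion r(t) = (t², t), with v = (2t, 1) and a = (2, 0), evaluated at t = 1, is an illustrative assumption:

```python
def dot(u, v):
    return sum(x * y for x, y in zip(u, v))

def split_acceleration(v, a):
    # a_tan = ((a · v) / (v · v)) v ;  a_norm = a − a_tan
    s = dot(a, v) / dot(v, v)
    a_tan = tuple(s * x for x in v)
    a_norm = tuple(x - y for x, y in zip(a, a_tan))
    return a_tan, a_norm

# illustrative motion r(t) = (t^2, t): v = (2t, 1), a = (2, 0); take t = 1
v, a = (2.0, 1.0), (2.0, 0.0)
a_tan, a_norm = split_acceleration(v, a)
print(a_tan)           # (1.6, 0.8)
print(a_norm)          # (0.4, -0.8)
print(dot(a_norm, v))  # 0: the normal part is perpendicular to the velocity
```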
Example: Centripetal Acceleration. Centrifugal Force
The vector function

r(t) = [R cos ωt  R sin ωt] = R cos ωt i + R sin ωt j

(with fixed i and j) represents a circle C of radius R with center at the origin of the xy-plane and describes the motion of a small body B counterclockwise around the circle. Differentiation gives the velocity vector

v = r′ = [−Rω sin ωt  Rω cos ωt] = −Rω sin ωt i + Rω cos ωt j

v is tangent to C. Its magnitude, the speed, is

|v| = |r′| = √(r′ · r′) = Rω

Hence it is constant. The speed divided by the distance R from the center is called the angular speed. It equals ω, so it is constant, too. Differentiating the velocity vector, we obtain the acceleration vector

a = v′ = [−Rω² cos ωt  −Rω² sin ωt] = −Rω² cos ωt i − Rω² sin ωt j

This shows that a = −ω²r, so there is an acceleration toward the center, called the centripetal acceleration of the motion. It occurs because the velocity vector is changing direction at a constant rate. Its magnitude is constant: |a| = ω²|r| = ω²R. Multiplying a by the mass m of B, we get the centripetal force ma. The opposite vector −ma is called the centrifugal force. At each instant these two forces are in equilibrium. We see that in this motion the acceleration vector is normal (perpendicular) to C; hence there is no tangential acceleration.
Example: Superposition of Rotations. Coriolis Acceleration
A projectile is moving with constant speed along a meridian of the rotating earth. Find its acceleration.
Solution: Let x, y, z be a fixed Cartesian coordinate system in space, with unit vectors i, j, k in the directions of the axes. Let the Earth, together with a unit vector b, be rotating about the z-axis with angular speed ω > 0. Since b is rotating together with the Earth, it is of the form

b(t) = cos ωt i + sin ωt j

Let the projectile be moving on the meridian whose plane is spanned by b and k with constant angular speed γ > 0. Then its position vector in terms of b and k is

r(t) = R cos γt b(t) + R sin γt k,  (R = radius of the Earth)

Next we apply vector calculus to obtain the desired acceleration of the projectile. Our result will be unexpected, and highly relevant for air and space travel. The first and second derivatives of b with respect to t are

b′(t) = −ω sin ωt i + ω cos ωt j
b′′(t) = −ω² cos ωt i − ω² sin ωt j = −ω² b(t)

The first and second derivatives of r(t) with respect to t are

v = r′(t) = R cos γt b′ − γR sin γt b + γR cos γt k
a = v′ = R cos γt b′′ − 2γR sin γt b′ − γ²R cos γt b − γ²R sin γt k
  = R cos γt b′′ − 2γR sin γt b′ − γ²r

Since b′′ = −ω²b, the first term in a (involving ω) is the centripetal acceleration due to the rotation of the Earth. Similarly, the third term in the last line (involving γ) is the centripetal acceleration due to the motion of the projectile on the meridian M of the rotating Earth. The second, unexpected term in a, acor = −2γR sin γt b′, is called the Coriolis acceleration and is due to the interaction of the two rotations. On the Northern Hemisphere, sin γt > 0 (for t > 0; also γ > 0 by assumption), so that acor has the direction of −b′, that is, opposite to the rotation of the Earth. |acor| is maximum at the North Pole and zero at the equator. The projectile B of mass m0 experiences a force −m0 acor opposite to m0 acor, which tends to let B deviate from M to the right (and in the Southern Hemisphere, where sin γt < 0, to the left). This deviation has been observed for missiles, rockets, shells, and atmospheric airflow.
Curvature and Torsion.
The curvature κ(s) of a curve C: r(s) (s the arc length) at a point P of C measures the rate of change |u′(s)| of the unit tangent vector u(s) at P. Hence κ(s) measures the deviation of C at P from a straight line (its tangent at P). Since u(s) = r′(s), the definition is

κ(s) = |u′(s)| = |r′′(s)|,  (′ = d/ds)

The torsion τ(s) of C at P measures the rate of change of the osculating plane O of curve C at point P. Note that this plane is spanned by u and u′. Hence τ(s) measures the deviation of C at P from a plane (from O at P). Now the rate of change is also measured by the derivative b′ of a normal vector b of O. By the definition of vector product, a unit normal vector of O is

b = u × (1/κ)u′ = u × p

Here p = (1/κ)u′ is called the unit principal normal vector and b is called the unit binormal vector of C at P. The vectors are labeled in the figure. Here we must assume that κ ≠ 0, hence κ > 0. The absolute value of the torsion is now defined by

|τ(s)| = |b′(s)|

Whereas κ(s) is nonnegative, it is practical to give the torsion a sign, motivated by "right-handed" and "left-handed". Since b is a unit vector, it has constant length; hence b′ is perpendicular to b. Now b′ is also perpendicular to u because, by the definition of vector product, we have b · u = 0 and b · u′ = 0. This implies

(b · u)′ = 0, that is, b′ · u + b · u′ = b′ · u + 0 = 0

Hence if b′ ≠ 0 at P, it must have the direction of p or −p, so it must be of the form b′ = −τp. Taking the dot product of this with p and using p · p = 1 gives

τ(s) = −p(s) · b′(s)
The minus sign is chosen to make the torsion of a right-handed helix positive and that of a left-handed helix negative. The orthonormal vector triple u, p, b is called the trihedron of C. The figure also shows the names of the three straight lines in the directions of u, p, b, which are the intersections of the osculating plane, the normal plane, and the rectifying plane.
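For a curve given by a general parameter t, the standard formulas κ = |r′ × r′′| / |r′|³ and τ = ((r′ × r′′) · r′′′) / |r′ × r′′|², which are equivalent to the arc-length definitions above (though not derived in these notes), allow a direct check on the circular helix, whose curvature and torsion are the constants a/(a² + c²) and c/(a² + c²). A sketch with illustrative values a = 2, c = 1:

```python
import math

a, c = 2.0, 1.0  # helix r(t) = (a cos t, a sin t, c t); illustrative values

def derivs(t):
    # exact first, second, third derivatives of the helix
    r1 = (-a * math.sin(t), a * math.cos(t), c)
    r2 = (-a * math.cos(t), -a * math.sin(t), 0.0)
    r3 = (a * math.sin(t), -a * math.cos(t), 0.0)
    return r1, r2, r3

def cross(u, v):
    return (u[1]*v[2] - u[2]*v[1], u[2]*v[0] - u[0]*v[2], u[0]*v[1] - u[1]*v[0])

def dot(u, v):
    return sum(x * y for x, y in zip(u, v))

def norm(u):
    return math.sqrt(dot(u, u))

r1, r2, r3 = derivs(0.7)
w = cross(r1, r2)
kappa = norm(w) / norm(r1) ** 3   # curvature
tau = dot(w, r3) / dot(w, w)      # torsion
print(kappa, tau)  # a/(a²+c²) = 0.4 and c/(a²+c²) = 0.2
```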
5.5 Calculus Review: Functions of Several Variables
Chain Rules
Figure shows the notations in the following basic theorem.
In calculus, x, y, z are often called the intermediate variables, in contrast with the independent variables u, v and the dependentvariable w.
Special Cases of Practical Interest
If w = f(x, y) and x = x(u, v), y = y(u, v), then

∂w/∂u = (∂w/∂x)(∂x/∂u) + (∂w/∂y)(∂y/∂u),  ∂w/∂v = (∂w/∂x)(∂x/∂v) + (∂w/∂y)(∂y/∂v)

If w = f(x, y, z) and x = x(t), y = y(t), z = z(t), then

dw/dt = (∂w/∂x)(dx/dt) + (∂w/∂y)(dy/dt) + (∂w/∂z)(dz/dt)

If w = f(x, y) and x = x(t), y = y(t), then

dw/dt = (∂w/∂x)(dx/dt) + (∂w/∂y)(dy/dt)

If w = f(x) and x = x(t), then

dw/dt = (dw/dx)(dx/dt)
Partial Derivatives on a Surface z = g(x, y)
Let w = f(x, y, z) and let z = g(x, y) represent a surface S in space. Then on S the function becomes

w(x, y) = f(x, y, g(x, y))

Hence, by the chain rule, the partial derivatives are

∂w/∂x = ∂f/∂x + (∂f/∂z)(∂g/∂x)

∂w/∂y = ∂f/∂y + (∂f/∂z)(∂g/∂y)   [z = g(x, y)]
Mean Value Theorem
Special Cases
For a function f(x, y) of two variables,

f(x0 + h, y0 + k) − f(x0, y0) = h ∂f/∂x + k ∂f/∂y

with the partial derivatives evaluated at a suitable point of the segment joining the two points; and, for a function f(x) of a single variable,

f(x0 + h) − f(x0) = h df/dx

where the domain D is a segment of the x-axis and the derivative is taken at a suitable point between x0 and x0 + h.
5.6 Gradient of a Scalar Field. Directional Derivative
Some of the vector �elds that occur in applications�not all of them!� can be obtained from scalar �elds. It is the �gradient�that allows us to obtain vector �elds from scalar �elds.
Definition: Gradient

The gradient of a scalar function f(x, y, z) is the vector function

grad f = ∇f = [∂f/∂x  ∂f/∂y  ∂f/∂z] = (∂f/∂x)i + (∂f/∂y)j + (∂f/∂z)k

Notation: ∇

The differential operator ∇ is defined by

∇ = [∂/∂x  ∂/∂y  ∂/∂z] = (∂/∂x)i + (∂/∂y)j + (∂/∂z)k

Use of Gradients:

Gradients are useful in several ways, notably in giving the rate of change of f in any direction in space, in obtaining surface normal vectors, and in deriving vector fields from scalar fields.
Directional Derivative
From calculus we know that the partial derivatives give the rates of change of f(x, y, z) in the directions of the three coordinate axes. It seems natural to extend this and ask for the rate of change of f(x, y, z) in an arbitrary direction in space.
The next idea is to use Cartesian xyz-coordinates and for b a unit vector. Then the line L is given by

r(s) = x(s) i + y(s) j + z(s) k = p0 + sb,  (|b| = 1)

where p0 is the position vector of P. The directional derivative Dbf = df/ds is the derivative of the function f with respect to the arc length s of L. Hence, assuming that f has continuous partial derivatives and applying the chain rule,

Dbf = df/ds = (∂f/∂x)x′ + (∂f/∂y)y′ + (∂f/∂z)z′

where primes denote derivatives with respect to s (which are taken at s = 0). But differentiating r(s) above gives

r′(s) = x′ i + y′ j + z′ k = b

Hence Dbf is simply the inner product of grad f and b; that is,

Dbf = df/ds = b · grad f

If the direction is given by a vector a of any length (≠ 0), then

Daf = df/ds = (1/|a|) a · grad f

Example: Gradient. Directional Derivative
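The worked example itself is not reproduced in these notes, but the formula Daf = (1/|a|) a · grad f can be sketched numerically. The field f(x, y, z) = x² + yz, the point P = (1, 2, 3), and the direction a = (1, 2, 2) are illustrative assumptions; a finite difference along the line confirms the result:

```python
import math

def f(x, y, z):
    # illustrative scalar field
    return x * x + y * z

def grad_f(x, y, z):
    # exact gradient of f
    return (2 * x, z, y)

def directional_derivative(p, a):
    # D_a f = (1/|a|) a · grad f, for a direction vector a of any length
    na = math.sqrt(sum(x * x for x in a))
    g = grad_f(*p)
    return sum(ax * gx for ax, gx in zip(a, g)) / na

p, a = (1.0, 2.0, 3.0), (1.0, 2.0, 2.0)
print(directional_derivative(p, a))  # 4.0

# finite-difference check along the line r(s) = p + s a/|a|
h, na = 1e-6, 3.0
q = tuple(x + h * ax / na for x, ax in zip(p, a))
print((f(*q) - f(*p)) / h)  # close to 4.0
```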
Gradient Is a Vector. Maximum Increase
grad f points in the direction of maximum increase of f .
Theorem: Use of Gradient: Direction of Maximum Increase
Gradient as Surface Normal Vector
Gradients have an important application in connection with surfaces, namely, as surface normal vectors, as follows. Let S be a surface represented by f(x, y, z) = c, where f is differentiable. Such a surface is called a level surface of f, and for different c we get different level surfaces. Now let C be a curve on S through a point P of S. As a curve in space, C has a representation r(t) = [x(t) y(t) z(t)]. For C to lie on the surface S, the components of r(t) must satisfy f(x, y, z) = c, that is,

f(x(t), y(t), z(t)) = c

Now a tangent vector of C is r′(t) = [x′(t) y′(t) z′(t)], and the tangent vectors of all curves on S passing through P will generally form a plane, called the tangent plane of S at P. The normal of this plane (the straight line through P perpendicular to the tangent plane) is called the surface normal to S at P. A vector in the direction of the surface normal is called a surface normal vector of S at P. We can obtain such a vector quite simply by differentiating f(x(t), y(t), z(t)) = c with respect to t. By the chain rule,

(∂f/∂x)x′ + (∂f/∂y)y′ + (∂f/∂z)z′ = (grad f) · r′ = 0

Hence grad f is orthogonal to all the vectors r′ in the tangent plane, so that it is a normal vector of S at P.
Theorem: Gradient as Surface Normal Vector
Example: Gradient as Surface Normal Vector. Cone
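The details of this example are not reproduced in these notes; a sketch in its spirit takes the cone as the level surface f(x, y, z) = x² + y² − z² = 0 and checks that grad f is perpendicular to a tangent vector of a curve lying on the cone (the particular curve, a circle at height z = 1, is an illustrative assumption):

```python
import math

def grad_f(x, y, z):
    # gradient of f(x, y, z) = x² + y² − z²; the level surface f = 0 is a cone
    return (2 * x, 2 * y, -2 * z)

def curve(t):
    # a circle at height z = 1 lying on the cone (illustrative curve on S)
    return (math.cos(t), math.sin(t), 1.0)

def curve_prime(t):
    return (-math.sin(t), math.cos(t), 0.0)

t = 0.9
p = curve(t)
print(p[0]**2 + p[1]**2 - p[2]**2)       # 0: the point is on the cone
g, v = grad_f(*p), curve_prime(t)
print(sum(a * b for a, b in zip(g, v)))  # 0: grad f is ⊥ to the tangent vector
```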
5.7 Vector Fields That Are Gradients of Scalar Fields (�Potentials�)
Some vector fields have the advantage that they can be obtained from scalar fields. Such a vector field is given by a vector function v(P), which is obtained as the gradient of a scalar function, say,

v(P) = grad f(P)

The function f(P) is called a potential function or a potential of v(P). Such a v(P) and the corresponding vector field are called conservative, because in such a vector field energy is conserved; that is, no energy is lost (or gained) in displacing a body from a point P to another point in the field and back to P.
5.8 Divergence of a Vector Field
From a scalar field we can obtain a vector field by the gradient. Conversely, from a vector field we can obtain a scalar field by the divergence, or another vector field by the curl. Let v(x, y, z) be a differentiable vector function, where x, y, z are Cartesian coordinates, and let v1, v2, v3 be the components of v. Then the function

div v = ∂v1/∂x + ∂v2/∂y + ∂v3/∂z

is called the divergence of v or the divergence of the vector field defined by v. Another common notation for the divergence is

div v = ∇ · v

with the understanding that the "product" (∂/∂x)v1 in the dot product means the partial derivative ∂v1/∂x. Note that ∇ · v means the scalar div v, whereas ∇f means the vector grad f.
Theorem: Invariance of the Divergence
Let f(x, y, z) be a twice differentiable scalar function. Then its gradient exists,

grad f = [∂f/∂x  ∂f/∂y  ∂f/∂z]

and we can form its divergence:

div(grad f) = ∂²f/∂x² + ∂²f/∂y² + ∂²f/∂z²

Hence we have the basic result that the divergence of the gradient is the Laplacian: div(grad f) = ∇²f.
5.9 Curl of a Vector Field
Let

v(x, y, z) = [v1 v2 v3] = v1i + v2j + v3k

be a differentiable vector function of the Cartesian coordinates x, y, z. Then the curl of the vector function v, or of the vector field given by v, is defined by the "symbolic" determinant

curl v = ∇ × v = det[i  j  k; ∂/∂x  ∂/∂y  ∂/∂z; v1  v2  v3]
       = (∂v3/∂y − ∂v2/∂z)i + (∂v1/∂z − ∂v3/∂x)j + (∂v2/∂x − ∂v1/∂y)k
Example: Curl of a Vector Function
Example: Rotation of a Rigid Body. Relation to the Curl
Theorem: Rotating Body and Curl
Theorem: Grad, Div, Curl
Theorem: Invariance of Curl
Chapter 6
Vector Integral Calculus. Integral Theorems
Goal of this chapter
• Line Integral
• Surface Integral
• Volume Integrals
Vector integral calculus extends integrals as known from regular calculus to
• integrals over curves
� called line integrals
• surfaces
� called surface integrals
• and solids,
� called triple integrals
We can transform these different integrals into one another. We will learn
• Green's theorem
• Gauss's divergence theorem
• Stokes's theorem
Green's theorem in the plane allows you
• to transform line integrals into double integrals,
• or conversely,
• double integrals into line integrals
Gauss's divergence theorem
• converts surface integrals into triple integrals, and vice-versa
Stokes's theorem deals with
• converting line integrals into surface integrals, and vice-versa
CHAPTER 6. VECTOR INTEGRAL CALCULUS. INTEGRAL THEOREMS 69
6.1 Line Integrals
The concept of a line integral is a simple and natural generalization of a definite integral

∫_a^b f(x) dx
we integrate the function f(x), known as the integrand,
• from x = a
• along the x-axis to
• x = b.
Now, in a line integral,
• we shall integrate a given function
� called the integrand,
• along a curve C
� in space
� or in the plane
we represent the curve C
• by a parametric representation
The curve C is called
• the path of integration.
The path of integration goes from A to B.
• A: is its initial point
• and B: is its terminal point.
• C is now oriented.
• The direction from A to B, in which t increases is called the positive direction on C.
De�nition and Evaluation of Line Integrals
A line integral of a vector function F(r) over a curve C: r(t), a ≤ t ≤ b, is defined by

∫_C F(r) · dr = ∫_a^b F(r(t)) · r′(t) dt

where r(t) is a parametric representation of C.
If we write dr in terms of components,

dr = [dx  dy  dz] and ′ = d/dt

For instance, one of the component integrals that arises is

∫_{t1=0}^{t2=π/2} cos²t sin t dt

To evaluate it, make the substitution

u = cos t,  du = −sin t dt
u1 = cos t1 = cos 0 = 1
u2 = cos t2 = cos(π/2) = 0

so that

∫_{t1=0}^{t2=π/2} cos²t sin t dt = ∫_{u1=1}^{u2=0} u²(−du) = ∫_0^1 u² du = 1/3
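The substitution can be checked numerically: a midpoint-rule sum of the original integrand and of the transformed integrand should both approach 1/3. A minimal sketch:

```python
import math

def midpoint(f, a, b, n=20000):
    # midpoint-rule numerical integration of f over [a, b]
    h = (b - a) / n
    return sum(f(a + (k + 0.5) * h) for k in range(n)) * h

lhs = midpoint(lambda t: math.cos(t) ** 2 * math.sin(t), 0.0, math.pi / 2)
rhs = midpoint(lambda u: u * u, 0.0, 1.0)  # after the substitution u = cos t
print(lhs, rhs)  # both close to 1/3
```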
Integration by parts

∫ u dv = uv − ∫ v du

In order to integrate

−3 ∫_0^{2π} t sin t dt

set

u = t,  dv = sin t dt
du = dt,  v = −cos t

Then

−3 ∫_0^{2π} t sin t dt = −3 ([−t cos t]_0^{2π} + ∫_0^{2π} cos t dt) = −3(−2π + 0) = 6π

For the second integral, use the identity

cos²t = (1/2)(1 + cos 2t)

The third integral can be evaluated easily.
Simple general properties of the line integral
If the sense of integration along C is reversed, the value of the integral is multiplied by -1.
6.2 Path Independence of Line Integrals
We want to �nd out under what conditions, in some domain,
• a line integral takes on the same value
• no matter what path of integration is taken (in that domain).
As before we consider line integrals
The line integral is said to be path independent in a domain D in space
• if for every pair of endpoints
• A, B in domain D,
• it has the same value for all paths in D
• that begin at A and end at B.
We shall see that path independence of (1) in a domain D holds if and only if:
• Theorem-I: F = grad f
• Theorem-II: Integration around closed curves C in D always gives 0.
• Theorem-III: curl F = 0 provided D is simply connected
is analogous to
Path Independence and Integration Around Closed Curves
Path Independence and Exactness of Di�erential Forms
A third idea relates path independence to the exactness of the differential form

F · dr = F1 dx + F2 dy + F3 dz

This form is called exact in a domain D in space if it is the differential

df = (∂f/∂x) dx + (∂f/∂y) dy + (∂f/∂z) dz

of a differentiable function f(x, y, z) everywhere in D. Comparing these two formulas, we see that the form is exact if and only if there is a differentiable function f(x, y, z) in D such that, everywhere in D,

F = grad f, that is, F1 = ∂f/∂x, F2 = ∂f/∂y, F3 = ∂f/∂z
6.3 Calculus Review: Double Integrals.
Properties of Double Integral
Mean Value Theorem for Double Integral
Evaluation of Double Integrals by Two Successive Integrations
Double integrals over a region R may be evaluated by two successive integrations. We may integrate first over y and then over x. Then the formula is

∬_R f(x, y) dx dy = ∫_a^b [∫_{g(x)}^{h(x)} f(x, y) dy] dx

Here y = g(x) and y = h(x) represent the boundary curve of R; keeping x constant, we integrate f(x, y) over y from g(x) to h(x). The result is a function of x, and we integrate it from x = a to x = b. Similarly,

∬_R f(x, y) dx dy = ∫_c^d [∫_{p(y)}^{q(y)} f(x, y) dx] dy

The boundary curve of R is now represented by x = p(y) and x = q(y). Treating y as a constant, we first integrate f(x, y) over x from p(y) to q(y) and then integrate the resulting function of y from y = c to y = d.
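The two successive integrations can be mimicked numerically with nested midpoint sums. A sketch; the integrand f(x, y) = xy over the triangle 0 ≤ x ≤ 1, 0 ≤ y ≤ x (exact value 1/8) is an illustrative assumption:

```python
def double_integral(f, a, b, g, h, n=400):
    # evaluates ∬_R f dx dy by two successive midpoint integrations:
    # inner over y from g(x) to h(x), outer over x from a to b
    hx = (b - a) / n
    total = 0.0
    for i in range(n):
        x = a + (i + 0.5) * hx
        y0, y1 = g(x), h(x)
        hy = (y1 - y0) / n
        inner = sum(f(x, y0 + (j + 0.5) * hy) for j in range(n)) * hy
        total += inner * hx
    return total

# f(x, y) = x·y over the triangle 0 ≤ x ≤ 1, 0 ≤ y ≤ x; exact value 1/8
val = double_integral(lambda x, y: x * y, 0.0, 1.0, lambda x: 0.0, lambda x: x)
print(val)  # close to 0.125
```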
6.3.1 Applications of Double Integrals
The area A of a region R in the xy-plane is given by the double integral

A = ∬_R dx dy

The volume V beneath the surface z = f(x, y) and above a region R in the xy-plane is

V = ∬_R f(x, y) dx dy

Let f(x, y) be the density (mass per unit area) of a distribution of mass in the xy-plane. Then the total mass M in R is

M = ∬_R f(x, y) dx dy

The center of gravity of the mass in R has the coordinates x̄, ȳ, where

x̄ = (1/M) ∬_R x f(x, y) dx dy,  ȳ = (1/M) ∬_R y f(x, y) dx dy

The moments of inertia Ix and Iy of the mass in R about the x- and y-axes, respectively, are

Ix = ∬_R y² f(x, y) dx dy,  Iy = ∬_R x² f(x, y) dx dy

The polar moment of inertia I0 about the origin of the mass in R is

I0 = Ix + Iy = ∬_R (x² + y²) f(x, y) dx dy
6.3.2 Change of Variables in Double Integrals. Jacobian
Recall from calculus that for a definite integral the formula for the change of variable from x to u is

∫_a^b f(x) dx = ∫_α^β f(x(u)) (dx/du) du

such that x(α) = a and x(β) = b. The formula for a change of variables in double integrals from x, y to u, v is

∬_R f(x, y) dx dy = ∬_{R*} f(x(u, v), y(u, v)) |∂(x, y)/∂(u, v)| du dv

that is, the integrand is expressed in terms of u and v, and dx dy is replaced by du dv times the absolute value of the Jacobian

J = ∂(x, y)/∂(u, v) = det[∂x/∂u  ∂x/∂v; ∂y/∂u  ∂y/∂v] = (∂x/∂u)(∂y/∂v) − (∂x/∂v)(∂y/∂u)

Example: Change of Variables in a Double Integral

Polar coordinates r and θ can be introduced by setting

x = r cos θ,  y = r sin θ

Then

J = ∂(x, y)/∂(r, θ) = det[cos θ  −r sin θ; sin θ  r cos θ] = r

and

∬_R f(x, y) dx dy = ∬_{R*} f(r cos θ, r sin θ) r dr dθ

where R* is the region in the rθ-plane corresponding to R in the xy-plane.
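The polar change of variables can be checked numerically: integrating f(x, y) = x² + y² over the unit disk via r dr dθ should give π/2 (the integrand becomes r², and ∫₀^{2π}∫₀^1 r³ dr dθ = π/2). A sketch:

```python
import math

def polar_integral(f, R, n=400):
    # ∬ f(x, y) dx dy over the disk of radius R, computed in polar coordinates:
    # the Jacobian of x = r cos θ, y = r sin θ contributes the factor r
    hr, ht = R / n, 2 * math.pi / n
    total = 0.0
    for i in range(n):
        r = (i + 0.5) * hr
        for j in range(n):
            th = (j + 0.5) * ht
            total += f(r * math.cos(th), r * math.sin(th)) * r * hr * ht
    return total

val = polar_integral(lambda x, y: x * x + y * y, 1.0)
print(val)  # close to π/2
```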
Example: Double Integrals in Polar Coordinates. Center of Gravity. Moments of Inertia
6.4 Green's Theorem in the Plane
6.4.1 Introduction
Double integrals over a plane region may be transformed into line integrals over the boundary of the region and conversely.
Theorem: Green's Theorem
Setting F = [F1 F2] = F1i + F2j, we obtain the vectorial form

∬_R (curl F) · k dx dy = ∮_C F · dr
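The vectorial form can be verified numerically for a simple field. With F = (−y, x) we have (curl F) · k = 2, so over the unit square the double integral equals 2, and Green's theorem predicts the same value for the line integral around the boundary. A sketch (the field and region are illustrative assumptions):

```python
def line_integral(F, segments, n=20000):
    # ∮_C F · dr along a list of straight segments (p, q), each parameterized linearly
    total = 0.0
    for (p, q) in segments:
        dx, dy = q[0] - p[0], q[1] - p[1]
        h = 1.0 / n
        for k in range(n):
            t = (k + 0.5) * h
            x, y = p[0] + t * dx, p[1] + t * dy
            f1, f2 = F(x, y)
            total += (f1 * dx + f2 * dy) * h
    return total

# F = (−y, x): (curl F)·k = ∂F2/∂x − ∂F1/∂y = 2, so over the unit square
# Green's theorem predicts ∮_C F·dr = ∬_R 2 dx dy = 2
square = [((0, 0), (1, 0)), ((1, 0), (1, 1)), ((1, 1), (0, 1)), ((0, 1), (0, 0))]
val = line_integral(lambda x, y: (-y, x), square)
print(val)  # close to 2.0
```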
Example: Veri�cation of Green's Theorem in the Plane
6.4.2 Some Applications of Green's Theorem
Example: Area of a Plane Region as a Line Integral Over the Boundary
Example: Area of a Plane Region in Polar Coordinates
6.5 Surfaces for Surface Integrals
• With line integrals, we integrate over curves in space
• with surface integrals we integrate over surfaces in space.
• Each curve in space is represented by a parametric equation
• This suggests that we should also �nd parametric representations for the surfaces in space.
6.5.1 Representation of Surfaces
Representations of a surface S in xyz-space are

z = f(x, y) or g(x, y, z) = 0

For example,

z = +√(a² − x² − y²) or x² + y² + z² − a² = 0,  (z ≥ 0)

represents a hemisphere of radius a and center 0. For surfaces S in surface integrals, it will often be more practical to use a parametric representation. Surfaces are two-dimensional; hence we need two parameters, which we call u and v. Thus a parametric representation of a surface S in space is of the form

r(u, v) = [x(u, v)  y(u, v)  z(u, v)] = x(u, v)i + y(u, v)j + z(u, v)k

where (u, v) varies in some region R of the uv-plane. This mapping maps every point (u, v) in R onto the point of S with position vector r(u, v).
Example: Parametric Representation of a Cylinder
Example: Parametric Representation of a Sphere
Example: Parametric Representation of a Cone
6.5.2 Tangent Plane and Surface Normal
Recall that the tangent vectors of all the curves on a surface S through a point P of S form a plane, called the tangent plane of S at P. Exceptions are points where S has an edge or a cusp (like a cone), so that S cannot have a tangent plane at such a point. Furthermore, a vector perpendicular to the tangent plane is called a normal vector of S at P. The partial derivatives ru and rv at P are tangential to S at P; hence their cross product gives a normal vector N of S at P:

N = ru × rv ≠ 0

The corresponding unit normal vector n of S at P is

n = (1/|N|) N = (1/|ru × rv|) ru × rv

Also, if S is represented by g(x, y, z) = 0, then

n = (1/|grad g|) grad g

A surface S is called a smooth surface if its surface normal depends continuously on the points of S. S is called piecewise smooth if it consists of finitely many smooth portions.
Example: Unit Normal Vector of a Sphere / Unit Normal Vector of a Cone
6.6 Surface Integrals
To define a surface integral, we take a surface S, given by a parametric representation as just discussed,

r(u, v) = [x(u, v)  y(u, v)  z(u, v)] = x(u, v)i + y(u, v)j + z(u, v)k

where (u, v) varies over a region R in the uv-plane. S has a normal vector

N = ru × rv,  n = (1/|N|) N

For a given vector function F we can now define the surface integral over S by

∬_S F · n dA = ∬_R F(r(u, v)) · N(u, v) du dv

Here N = |N| n with |N| = |ru × rv|, and |N| is the area of the parallelogram with sides ru and rv, by the definition of cross product. Hence

n dA = n |N| du dv = N du dv

and we see that dA = |N| du dv is the element of area of S. Also, F · n is the normal component of F. We can write this in components, using

F = [F1 F2 F3],  N = [N1 N2 N3],  n = [cos α  cos β  cos γ]

Here α, β, γ are the angles between n and the coordinate axes. We can write

cos α dA = dy dz,  cos β dA = dz dx,  cos γ dA = dx dy

We can use these formulas to evaluate surface integrals by converting them to double integrals over regions in the coordinate planes of the xyz-coordinate system.
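The definition ∬_S F · n dA = ∬_R F · N du dv can be applied numerically. A sketch; the surface (the upper unit hemisphere, parametrized by r(u, v) = (sin u cos v, sin u sin v, cos u)) and the constant field F = (0, 0, 1) are illustrative assumptions; the exact flux is the projected area πa² = π:

```python
import math

a = 1.0  # sphere radius (illustrative)

def N(u, v):
    # outward normal N = r_u × r_v of r(u,v) = (a sin u cos v, a sin u sin v, a cos u)
    s, c = math.sin(u), math.cos(u)
    return (a * a * s * s * math.cos(v), a * a * s * s * math.sin(v), a * a * s * c)

def flux(F, n=600):
    # ∬_S F·n dA = ∬_R F(r(u,v)) · N(u,v) du dv over 0 ≤ u ≤ π/2, 0 ≤ v ≤ 2π
    hu, hv = (math.pi / 2) / n, (2 * math.pi) / n
    total = 0.0
    for i in range(n):
        u = (i + 0.5) * hu
        for j in range(n):
            v = (j + 0.5) * hv
            Nu, Fv = N(u, v), F(u, v)
            total += (Fv[0] * Nu[0] + Fv[1] * Nu[1] + Fv[2] * Nu[2]) * hu * hv
    return total

# flux of the constant field F = (0, 0, 1) through the upper hemisphere
val = flux(lambda u, v: (0.0, 0.0, 1.0))
print(val)  # close to π
```

Since F is constant here, it is passed directly as a function of the surface parameters (u, v).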
Example: Flux Through a Surface
Example: Surface Integral
6.7 Triple Integrals. Divergence Theorem of Gauss
6.7.1 Introduction
The divergence theorem transforms surface integrals into triple integrals. A triple integral is an integral of a function taken over a closed, bounded, three-dimensional region T in space. We subdivide T by planes parallel to the coordinate planes. Then we consider those boxes of the subdivision that lie entirely inside T and number them from 1 to n; each box is a rectangular parallelepiped. In each such box we choose an arbitrary point, say, (xk, yk, zk) in box k. The volume of box k we denote by ΔVk. We now form the sum

Jn = Σ_{k=1}^{n} f(xk, yk, zk) ΔVk

We do this for larger and larger positive integers n, arbitrarily but so that the maximum length of all the edges of those n boxes approaches zero as n approaches infinity. Then it can be shown that the sequence converges to a limit. This limit is called the triple integral of f(x, y, z) over the region T and is denoted by

∭_T f(x, y, z) dx dy dz

Triple integrals can be evaluated by three successive integrations. This is similar to the evaluation of double integrals by two successive integrations.
6.7.2 Divergence Theorem of Gauss
Triple integrals can be transformed into surface integrals over the boundary surface of a region in space, and conversely.
Such a transformation is of practical interest because one of the two kinds of integral is often simpler than the other. The transformation is done by the divergence theorem, which involves the divergence of a vector function

F = [F1 F2 F3] = F1i + F2j + F3k
Theorem: Divergence Theorem of Gauss
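The theorem statement itself (a figure in these notes) asserts that, under suitable smoothness assumptions, ∭_T div F dV = ∬_S F · n dA with n the outward unit normal of the boundary surface S of T. Both sides can be computed numerically; a sketch for the illustrative field F = (x², y, z) over the unit cube, where div F = 2x + 2 and both sides equal 3:

```python
def F(x, y, z):
    # illustrative vector field
    return (x * x, y, z)

def div_F(x, y, z):
    # divergence of F: ∂F1/∂x + ∂F2/∂y + ∂F3/∂z = 2x + 2
    return 2 * x + 2

def triple_integral(g, n=60):
    # midpoint sum of g over the unit cube
    h = 1.0 / n
    total = 0.0
    for i in range(n):
        for j in range(n):
            for k in range(n):
                total += g((i + 0.5) * h, (j + 0.5) * h, (k + 0.5) * h)
    return total * h ** 3

def surface_flux(n=400):
    # ∬_S F·n dA over the six faces of the unit cube, with outward normals
    h = 1.0 / n
    total = 0.0
    for i in range(n):
        for j in range(n):
            u, v = (i + 0.5) * h, (j + 0.5) * h
            total += (F(1, u, v)[0] - F(0, u, v)[0]) * h * h  # faces x = 1, x = 0
            total += (F(u, 1, v)[1] - F(u, 0, v)[1]) * h * h  # faces y = 1, y = 0
            total += (F(u, v, 1)[2] - F(u, v, 0)[2]) * h * h  # faces z = 1, z = 0
    return total

t_val = triple_integral(div_F)
s_val = surface_flux()
print(t_val, s_val)  # both close to 3
```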
6.8 Stokes's Theorem
Double integrals over a region in the plane can be transformed into line integrals over the boundary curve of that region and, conversely, line integrals into double integrals. This important result is known as Green's theorem in the plane. We can also transform triple integrals into surface integrals and vice versa, that is, surface integrals into triple integrals; this "big" theorem is called Gauss's divergence theorem. Another "big" theorem allows us to transform surface integrals into line integrals and, conversely, line integrals into surface integrals. It is called Stokes's theorem.
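Stokes's theorem states (in the standard form, not reproduced in these notes) that ∬_S (curl F) · n dA = ∮_C F · dr, where C is the boundary curve of S with matching orientation. A numerical sketch for the illustrative field F = (−y, x, 0), with S the unit disk in the xy-plane (n = k, curl F = (0, 0, 2)) and C the unit circle; both sides equal 2π:

```python
import math

def F(x, y, z):
    # illustrative field F = (−y, x, 0); curl F = (0, 0, 2)
    return (-y, x, 0.0)

# surface side: S is the unit disk in the xy-plane, n = k, so
# ∬_S (curl F)·n dA = 2 · area(S) = 2π
surface_side = 2 * math.pi

# line side: ∮_C F·dr around the unit circle r(t) = (cos t, sin t, 0)
n = 20000
h = 2 * math.pi / n
line_side = 0.0
for k in range(n):
    t = (k + 0.5) * h
    x, y = math.cos(t), math.sin(t)
    fx, fy, _ = F(x, y, 0.0)
    # r'(t) = (−sin t, cos t, 0)
    line_side += (fx * (-math.sin(t)) + fy * math.cos(t)) * h

print(surface_side, line_side)  # both close to 2π ≈ 6.283
```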