
Applied Mathematics for Engineers

Reference textbook: Kreyszig, E., Advanced Engineering Mathematics, 10th Ed., John Wiley & Sons, 2011.


Chapter 1

Vectors in R^n and C^n, Spatial Vectors

1.1 Introduction

The weights of eight students are listed as

156, 125, 145, 134, 178, 145, 162, 193

We can denote these numbers by a single symbol w with different subscripts:

w1, w2, w3, w4, w5, w6, w7, w8

Each subscript denotes the position of the number in the list; for example, w1 = 156 is the first number and w2 = 125 the second. Such a list of values is called a linear array or vector:

w = (w1, w2, . . . , w8)

Vector Addition and Scalar Multiplication

Vector Addition

The resultant u + v of two vectors u and v is obtained by the so-called parallelogram law. Furthermore, if

u = (a, b, c) and v = (a′, b′, c′)

then the end point of the vector u + v is

u + v = (a + a′, b + b′, c + c′)

Scalar Multiplication

The product ku of a vector u by a real number k is obtained as follows: if

u = (a, b, c)

then

ku = (ka, kb, kc)

Mathematically, a vector u is defined by its components (a, b, c), and we write u = (a, b, c). The ordered triple (a, b, c) of real numbers may be called a point or a vector.

General notation: the n-tuple (a1, a2, . . . , an)


1.2 Vectors in R^n

The set of all n-tuples of real numbers, denoted by R^n, is called n-space. A particular n-tuple in R^n is called a point or vector:

u = (a1, a2, . . . , an)

The numbers ai are called the coordinates, components, or elements of u. Two vectors u and v are equal, written u = v, if they have the same number of components and the corresponding components are equal. The vector (0, 0, . . . , 0) is called the zero vector.

Column vectors (a semicolon separates the entries of a column):

[1; 2],  [3; −4],  [1; 5; −6]

Row vectors:

[1 2],  [3 −4],  [1 5 −6]

1.3 Vector Addition and Scalar Multiplication

Consider two vectors u and v in R^n:

u = (a1, a2, . . . , an) and v = (b1, b2, . . . , bn)

Their sum, written u + v, is

u + v = (a1 + b1, a2 + b2, . . . , an + bn)

The scalar product, or simply product, of the vector u by a real number k, written ku, is

ku = (ka1, ka2, . . . , kan)

Both u + v and ku are also vectors in R^n.

Negatives and subtraction are defined in R^n as follows:

−u = (−1)u

and

u − v = u + (−1)v

Given vectors u1, u2, . . . , um in R^n and scalars k1, k2, . . . , km, we can form the vector

v = k1u1 + k2u2 + k3u3 + · · · + kmum

The vector v is called a linear combination of the vectors u1, u2, . . . , um.
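These componentwise operations map directly onto NumPy arrays. A minimal sketch (assuming NumPy is available; the names u1, u2, k1, k2 are illustrative):

    import numpy as np

    u1 = np.array([1.0, 2.0, 3.0])
    u2 = np.array([0.0, 1.0, -1.0])

    # Vector addition and scalar multiplication act componentwise.
    print(u1 + u2)   # [1. 3. 2.]
    print(3 * u1)    # [3. 6. 9.]

    # A linear combination v = k1*u1 + k2*u2.
    k1, k2 = 2.0, -1.0
    v = k1 * u1 + k2 * u2
    print(v)         # [2. 3. 7.]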


1.4 Dot (Inner) Product

Consider arbitrary vectors u, v ∈ R^n:

u = (a1, a2, . . . , an) and v = (b1, b2, . . . , bn)

The dot product (also called inner product or scalar product) of u and v is denoted and defined by

u · v = a1b1 + a2b2 + a3b3 + · · · + anbn

The vectors u and v are said to be orthogonal if their dot product vanishes:

u ⊥ v ⇐⇒ u · v = 0

Norm (Length) of a Vector

The norm or length of a vector u ∈ R^n is denoted by ‖u‖ and defined by

‖u‖ = √(u · u) ≥ 0

For u = (a1, a2, . . . , an) ∈ R^n,

‖u‖ = √(u · u) = √(a1² + a2² + a3² + · · · + an²)

Moreover, ‖u‖ ≥ 0 always, and

u = 0 ⇐⇒ ‖u‖ = 0

Unit vector

u is called a unit vector ⇐⇒ ‖u‖ = 1. For every v ≠ 0 in R^n,

v̂ = (1/‖v‖) v = v/‖v‖

This is called normalizing v; v̂ is the unit vector in the same direction as v.

Distance, Angles, Projections

Consider arbitrary vectors u, v ∈ R^n:

u = (a1, a2, . . . , an) and v = (b1, b2, . . . , bn)

The distance between u and v is denoted by d(u, v) = ‖u − v‖:

‖u − v‖ = √((a1 − b1)² + (a2 − b2)² + (a3 − b3)² + · · · + (an − bn)²)

The angle θ between any u, v ∈ R^n with u, v ≠ 0 is defined by


cos θ = (u · v) / (‖u‖ ‖v‖)

In particular,

u · v = 0 ⇐⇒ θ = π/2 ⇐⇒ u ⊥ v

The projection of a vector u onto a vector v ≠ 0 is the vector

proj(u, v) = ((u · v)/‖v‖²) v
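The definitions above translate directly into a few lines of NumPy; the following sketch is illustrative (the function name proj is ours, not a standard library call):

    import numpy as np

    def proj(u, v):
        # Projection of u onto v != 0: ((u . v) / ||v||^2) v
        return (np.dot(u, v) / np.dot(v, v)) * v

    u = np.array([1.0, 2.0, 2.0])
    v = np.array([3.0, 0.0, 4.0])

    print(np.dot(u, v))           # u . v = 11.0
    print(np.linalg.norm(u))      # ||u|| = 3.0
    print(np.linalg.norm(u - v))  # d(u, v)
    cos_theta = np.dot(u, v) / (np.linalg.norm(u) * np.linalg.norm(v))
    print(np.arccos(cos_theta))   # angle theta in radians
    print(proj(u, v))             # projection of u onto v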

1.5 Located Vectors, Hyperplanes, Lines, Curves in R^n

The n-tuple P(ai) ≡ (a1, a2, . . . , an) ∈ R^n is a point, while the n-tuple u = [c1, c2, . . . , cn] ∈ R^n is a vector from the origin O to the point C(c1, c2, . . . , cn).

Located Vectors

Let A(ai) ∈ R^n and B(bi) ∈ R^n. The located vector or directed line segment from A to B, written AB, is

u = AB = B − A = [b1 − a1, b2 − a2, . . . , bn − an]

For A, B ∈ R³ with A(a1, a2, a3) and B(b1, b2, b3), the vector u = B − A corresponds to the point P(b1 − a1, b2 − a2, b3 − a3).

Hyperplanes

A hyperplane H in R^n is the set of points (x1, x2, . . . , xn) satisfying

a1x1 + a2x2 + · · · + anxn = b

where u = [a1, a2, . . . , an] ≠ 0. In R², H is a line; in R³, H is a plane.

In R³, u ⊥ PQ for any P(pi) ∈ H and Q(qi) ∈ H; that is, u ⊥ H. Indeed, since P(pi) ∈ H and Q(qi) ∈ H, they satisfy the hyperplane equation:

a1p1 + a2p2 + · · · + anpn = b   (1.5.1)
a1q1 + a2q2 + · · · + anqn = b

Let

v = PQ = Q − P = [q1 − p1, q2 − p2, . . . , qn − pn]

Then

u · v = a1(q1 − p1) + a2(q2 − p2) + · · · + an(qn − pn)
u · v = (a1q1 + a2q2 + · · · + anqn) − (a1p1 + a2p2 + · · · + anpn) = b − b = 0

Thus v = PQ ⊥ u.


Lines in R^n

The line L in R^n passing through the point P(b1, b2, . . . , bn) in the direction of a nonzero vector u = [a1, a2, . . . , an] consists of the points X(x1, x2, . . . , xn) that satisfy

X = P + tu, that is,

x1 = a1t + b1
x2 = a2t + b2
. . .
xn = ant + bn

or L(t) = (ait + bi), where the parameter t ranges over R. (The case L in R³ is shown in the figure.)

Curves in R^n

Let D ≡ [a, b] ⊆ R. A function F : D → R^n is a curve in R^n; F(t) is assumed to be continuous. For every t ∈ D, F(t) ∈ R^n:

F(t) = [F1(t), F2(t), . . . , Fn(t)]

The derivative of F(t) is

V = dF(t)/dt = [dF1(t)/dt, dF2(t)/dt, . . . , dFn(t)/dt]

which is tangent to the curve, and

T(t) = V(t)/‖V(t)‖

is the unit tangent vector to the curve.

1.6 Vectors in R³ (Spatial Vectors), ijk Notation

A vector u ∈ R³ is called a spatial vector. Define

i = [1, 0, 0], the unit vector in the x-direction
j = [0, 1, 0], the unit vector in the y-direction
k = [0, 0, 1], the unit vector in the z-direction

Every u ∈ R³ with u = [a, b, c] can be written as u = ai + bj + ck.

For any u, v ∈ {i, j, k} with u ≠ v we have u ⊥ v, and every u ∈ {i, j, k} satisfies ‖u‖ = 1.

Suppose

u = a1i + a2j + a3k
v = b1i + b2j + b3k

Then

u + v = (a1 + b1)i + (a2 + b2)j + (a3 + b3)k

cu = ca1i + ca2j + ca3k, where c ∈ R

u · v = a1b1 + a2b2 + a3b3

‖u‖ = √(u · u) = √(a1² + a2² + a3²)


Cross Product

There is a special operation for vectors u, v ∈ R³ that is not defined in R^n for n ≠ 3, called the cross product u × v. Recall the determinant of order two:

|a b; c d| = ad − bc  and  −|a b; c d| = bc − ad

Suppose

u = a1i + a2j + a3k
v = b1i + b2j + b3k

Then

u × v = (a2b3 − a3b2)i + (a3b1 − a1b3)j + (a1b2 − a2b1)k

Equivalently, u × v may be written as the symbolic determinant

u × v = |i j k; a1 a2 a3; b1 b2 b3| = |a2 a3; b2 b3| i − |a1 a3; b1 b3| j + |a1 a2; b1 b2| k
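NumPy implements this operation for 3-vectors; a quick sketch checking the component formula and the fact that u × v is orthogonal to both factors:

    import numpy as np

    u = np.array([1.0, 2.0, 3.0])   # a1 i + a2 j + a3 k
    v = np.array([4.0, 5.0, 6.0])   # b1 i + b2 j + b3 k

    w = np.cross(u, v)   # (a2 b3 - a3 b2, a3 b1 - a1 b3, a1 b2 - a2 b1)
    print(w)             # [-3.  6. -3.]

    # The cross product is orthogonal to both u and v.
    print(np.dot(w, u), np.dot(w, v))   # 0.0 0.0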


Chapter 2

Algebra of Matrices

2.1 Introduction

The entries in our matrices will come from some arbitrary, but fixed, field K. The elements of K are called numbers or scalars. Nothing essential is lost if the reader assumes that K is the real field R.

2.2 Matrices

A matrix A over a field K is a rectangular array of scalars:

A =
[ a11 a12 · · · a1n ]
[ a21 a22 · · · a2n ]
[ . . . . . .  . . .]
[ am1 am2 · · · amn ]

The rows of matrix A are m row vectors:

(a11, a12, · · · , a1n), (a21, a22, · · · , a2n), · · · , (am1, am2, · · · , amn)

The columns of matrix A are n column vectors:

[a11; a21; . . . ; am1], [a12; a22; . . . ; am2], · · · , [a1n; a2n; . . . ; amn]

The element aij, called the ij-entry or ij-element, appears in row i and column j. Matrix A can be written as

A = [aij]

A matrix with m rows and n columns is called an m by n matrix, written m × n, where m and n give the size of the matrix. Two matrices are equal, A = B ⇐⇒ size(A) = size(B) and aij = bij for all i, j.

If m = 1 and n > 1, matrix A is called a row matrix or row vector.
If m > 1 and n = 1, matrix A is called a column matrix or column vector.
A = [aij] with every aij = 0 is called a zero matrix.
A = [aij] with aij ∈ R =⇒ A is called a real matrix.
A = [aij] with aij ∈ C =⇒ A is called a complex matrix.

2.3 Matrix Addition and Scalar Multiplication

Let

A = [aij] and B = [bij]

with


size(A) = size(B) = m × n

The sum of A and B, written A + B, is

A + B =
[ a11 + b11   a12 + b12   · · ·   a1n + b1n ]
[ a21 + b21   a22 + b22   · · ·   a2n + b2n ]
[   . . .       . . .     . . .     . . .   ]
[ am1 + bm1   am2 + bm2   · · ·   amn + bmn ]

The product of the matrix A by a scalar k, written k · A or simply kA, is

kA =
[ ka11 ka12 · · · ka1n ]
[ ka21 ka22 · · · ka2n ]
[ . . .  . . . . . . . ]
[ kam1 kam2 · · · kamn ]

size(A + B) and size(kA) are also m × n. Define

−A ≡ (−1)A  and  A − B ≡ A + (−B)

The matrix −A is called the negative of the matrix A, and the matrix A − B is called the difference of A and B. If size(A) ≠ size(B), then A + B is not defined.

2.4 Summation Symbol

The summation symbol Σ (the Greek capital letter sigma) in Σ_{k=1}^{n} f(k) has the following meaning:

k = 1: f(1)
k = 2: f(2), giving f(1) + f(2)
k = 3: f(3), giving f(1) + f(2) + f(3)
. . .
k = n: f(n), giving f(1) + f(2) + f(3) + · · · + f(n)

k is called the index; 1 and n are called, respectively, the lower and upper limits. The general expression for Σ can be written

Σ_{k=n1}^{n2} f(k) = f(n1) + f(n1 + 1) + f(n1 + 2) + · · · + f(n2)

2.5 Matrix Multiplication

The product of matrices A and B is written AB. The product of a row matrix A = [ai] and a column matrix B = [bi] with the same number of elements is defined to be a scalar (or 1 × 1 matrix):

AB = [a1 a2 · · · an][b1; b2; . . . ; bn] = a1b1 + a2b2 + a3b3 + · · · + anbn

that is,

AB = Σ_{k=1}^{n} akbk

AB is a scalar (or a 1 × 1 matrix). AB is not defined if A and B have different numbers of elements.


Definition

Suppose A = [aij] and B = [bij] are matrices such that the number of columns of A is equal to the number of rows of B; say, A is an m × p matrix and B is a p × n matrix. Then the product AB is the m × n matrix C = [cij] whose ij-entry is obtained by multiplying the ith row of A by the jth column of B. That is,

cij = ai1b1j + ai2b2j + ai3b3j + · · · + aipbpj

or

cij = Σ_{k=1}^{p} aikbkj

If A is an m × p matrix and B is a q × n matrix with p ≠ q, then the product AB is not defined.
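A small sketch of this definition, computing each cij = Σ aik bkj with explicit loops and comparing against NumPy's built-in product:

    import numpy as np

    A = np.array([[1.0, 2.0, 3.0],
                  [4.0, 5.0, 6.0]])          # m x p = 2 x 3
    B = np.array([[7.0, 8.0],
                  [9.0, 10.0],
                  [11.0, 12.0]])             # p x n = 3 x 2

    m, p = A.shape
    q, n = B.shape
    assert p == q, "AB is not defined when the inner sizes differ"

    C = np.zeros((m, n))
    for i in range(m):
        for j in range(n):
            # ij-entry: ith row of A times jth column of B
            C[i, j] = sum(A[i, k] * B[k, j] for k in range(p))

    print(C)
    print(np.allclose(C, A @ B))   # True: matches NumPy's matrix product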

2.6 Transpose of a Matrix

The transpose of a matrix A is written A^T. For example,

[1 2 3; 4 5 6]^T = [1 4; 2 5; 3 6]

[1 −3 −5]^T = [1; −3; −5]

If A = [aij] is an m × n matrix, then A^T = [bij] is the n × m matrix where bij = aji.
If A = [aij] is a 1 × n row matrix, then A^T is an n × 1 column matrix.
If A = [aij] is an m × 1 column matrix, then A^T is a 1 × m row matrix.

2.7 Square Matrices

Let A = [aij] be a matrix of size m × n. If m = n, A is said to be a square matrix. An n × n square matrix is said to be of order n and is sometimes called an n-square matrix.

Diagonal and Trace

Let A = [aij] be an n-square matrix. The elements of the diagonal or main diagonal of A are S = {aij | i = j}. The trace of A, written tr(A), is the sum of the diagonal elements:

tr(A) = a11 + a22 + a33 + · · · + ann

Identity Matrix, Scalar Matrices

The n-square identity or unit matrix, denoted by In or simply I, is the n-square matrix with 1's on the diagonal and 0's everywhere else. For an n-square matrix A,

AI = IA = A

If B is an m × n matrix, then

BIn = ImB = B

For any scalar k, the matrix kI, which contains k's on the diagonal and 0's elsewhere, is called a scalar matrix, since


(kI)A = k(IA) = kA

The Kronecker delta function δij is defined by

δij = 0 if i ≠ j,  δij = 1 if i = j

Thus the identity matrix may be written as

I = [δij]

2.8 Powers of Matrices, Polynomials in Matrices

Let A be an n × n matrix. Powers of A are defined as follows:

A² = AA,  A³ = A²A,  . . . ,  A^{n+1} = A^n A,  and  A⁰ = I

For a polynomial

f(x) = a0 + a1x + a2x² + · · · + anx^n,  ai ∈ R

we define the matrix polynomial

f(A) = a0I + a1A + a2A² + · · · + anA^n

that is, x is replaced by A and the constant term a0 is replaced by a0I. If f(A) = 0, then A is called a zero or root of f(x).

2.9 Invertible (Nonsingular) Matrices

A square matrix A is said to be invertible or nonsingular if there exists a matrix B such that

AB = BA = I

where I is the identity matrix. Such a matrix B is unique; we call it the inverse of A and denote it by A⁻¹. If B is the inverse of A, then A is the inverse of B.

Suppose A and B are invertible. Then AB is invertible and (AB)⁻¹ = B⁻¹A⁻¹. More generally, if A1, A2, . . . , Ak are invertible, then their product is invertible and

(A1A2 . . . Ak)⁻¹ = Ak⁻¹ . . . A2⁻¹A1⁻¹

the product of the inverses in the reverse order.

Inverse of a 2× 2 Matrix

Let A be an arbitrary 2 × 2 matrix, say

A = [a b; c d]

We want to find a general formula for the inverse of A,

A⁻¹ = [x1 x2; y1 y2]

such that AA⁻¹ = I:

[a b; c d][x1 x2; y1 y2] = [1 0; 0 1]

that is,

[ax1 + by1   ax2 + by2; cx1 + dy1   cx2 + dy2] = [1 0; 0 1]

The above matrix equality yields four equations:

ax1 + by1 = 1,  ax2 + by2 = 0
cx1 + dy1 = 0,  cx2 + dy2 = 1

Let |A| = ad − bc, called the determinant of A, and assume |A| ≠ 0. The unknowns x1, x2, y1, y2 can then be found uniquely:

x1 = d/|A|,   x2 = −b/|A|
y1 = −c/|A|,  y2 = a/|A|

Thus

A⁻¹ = [a b; c d]⁻¹ = (1/|A|)[d −b; −c a]

If |A| = 0, then A is not invertible.

Inverse of an n × n Matrix

Suppose A is an arbitrary n-square matrix. Finding its inverse A⁻¹ reduces to finding the solutions of a collection of n systems of linear equations in n unknowns, one system for each column of A⁻¹.
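The 2 × 2 formula above is easy to code directly; a sketch (the helper name inv2x2 is ours) with the determinant check, compared against NumPy's general-purpose inverse:

    import numpy as np

    def inv2x2(A):
        # Inverse of a 2x2 matrix via the adjugate formula; fails if |A| = 0.
        (a, b), (c, d) = A
        det = a * d - b * c
        if det == 0:
            raise ValueError("matrix is not invertible (|A| = 0)")
        return (1.0 / det) * np.array([[d, -b],
                                       [-c, a]])

    A = np.array([[2.0, 5.0],
                  [1.0, 3.0]])
    print(inv2x2(A))                                   # [[ 3. -5.] [-1.  2.]]
    print(np.allclose(inv2x2(A), np.linalg.inv(A)))    # True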

2.10 Special Types of Square Matrices

Diagonal and Triangular Matrices

A square matrix D = [dij] is diagonal ⇐⇒ dij = 0 for i ≠ j. We write

D = diag(d11, d22, . . . , dnn)

Examples:

[3 0 0; 0 −7 0; 0 0 2] ≡ diag(3, −7, 2)

[4 0; 0 −5] ≡ diag(4, −5)

and diag(6, 0, −9, 8) denotes the 4 × 4 diagonal matrix with diagonal entries 6, 0, −9, 8.

A square matrix A = [aij] is upper triangular ⇐⇒ aij = 0 for i > j. For example:

[a11 a12; 0 a22],  [b11 b12 b13; 0 b22 b23; 0 0 b33],  [c11 c12 c13 c14; 0 c22 c23 c24; 0 0 c33 c34; 0 0 0 c44]

A lower triangular matrix is a square matrix A = [aij] with aij = 0 for i < j.

Special Real Square Matrices: Symmetric, Orthogonal, Normal

A matrix A is symmetric if A = A^T. Equivalently, A = [aij] is symmetric if each aij = aji.

A matrix A is skew-symmetric if A = −A^T. Equivalently, A = [aij] is skew-symmetric if each aij = −aji. Clearly the diagonal elements of such a matrix must all be zero, since aii = −aii forces aii = 0; that is,

A is skew-symmetric ⇐⇒ aii = 0 for i = j and aji = −aij for i ≠ j

A matrix A must be square if A = A^T or A = −A^T.


Orthogonal Matrices

A real matrix A is orthogonal if A^T = A⁻¹, that is, AA^T = A^T A = I. Thus A must necessarily be square and invertible.

Now suppose A is a real orthogonal 3 × 3 matrix with rows

u1 = (a1, a2, a3),  u2 = (b1, b2, b3),  u3 = (c1, c2, c3)   (2.10.1)

Since A is orthogonal, we must have AA^T = I:

[a1 a2 a3; b1 b2 b3; c1 c2 c3][a1 b1 c1; a2 b2 c2; a3 b3 c3] = [1 0 0; 0 1 0; 0 0 1] = I   (2.10.2)

The above matrix equality yields the following equations:

a1² + a2² + a3² = 1    a1b1 + a2b2 + a3b3 = 0    a1c1 + a2c2 + a3c3 = 0
a1b1 + a2b2 + a3b3 = 0    b1² + b2² + b3² = 1    b1c1 + b2c2 + b3c3 = 0
a1c1 + a2c2 + a3c3 = 0    b1c1 + b2c2 + b3c3 = 0    c1² + c2² + c3² = 1

This implies

u1 · u1 = 1,  u2 · u2 = 1,  u3 · u3 = 1,  and  ui · uj = 0 for i ≠ j

That is, the rows u1, u2, u3 are unit vectors and are orthogonal to each other. In general, vectors u1, u2, . . . , un ∈ R^n are said to form an orthonormal set of vectors if they are unit vectors and are mutually orthogonal:

ui · uj = 0 if i ≠ j,  ui · uj = 1 if i = j

In other words, ui · uj = δij, where δij is the Kronecker delta function.

AA^T = I =⇒ the rows of A form an orthonormal set of vectors.
A^T A = I =⇒ the columns of A form an orthonormal set of vectors.
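A numeric check of these facts on a plane rotation matrix, a standard example of an orthogonal matrix:

    import numpy as np

    theta = np.pi / 6
    A = np.array([[np.cos(theta), -np.sin(theta)],
                  [np.sin(theta),  np.cos(theta)]])

    # A A^T = A^T A = I, so both rows and columns are orthonormal sets.
    print(np.allclose(A @ A.T, np.eye(2)))      # True
    print(np.allclose(A.T @ A, np.eye(2)))      # True
    print(np.allclose(np.linalg.inv(A), A.T))   # A^{-1} = A^T: True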

2.11 Block Matrices

Using a system of horizontal and vertical lines, we can partition a matrix A into submatrices called blocks (cells). The convenience of partitioning matrices A and B into blocks is that the result of operations on A and B can be obtained by carrying out the computation with the blocks, just as if they were the actual elements of the matrices. The notation A = [Aij] will be used for a block matrix A with blocks Aij.

Suppose A = [Aij] and B = [Bij] are block matrices with the same numbers of row and column blocks, and suppose that corresponding blocks have the same size. Then

A + B =
[ A11 + B11   A12 + B12   · · ·   A1n + B1n ]
[ A21 + B21   A22 + B22   · · ·   A2n + B2n ]
[   . . .       . . .     . . .     . . .   ]
[ Am1 + Bm1   Am2 + Bm2   · · ·   Amn + Bmn ]

and

kA =
[ kA11 kA12 · · · kA1n ]
[ kA21 kA22 · · · kA2n ]
[ . . .  . . . . . . . ]
[ kAm1 kAm2 · · · kAmn ]

Suppose U = [Uik] and V = [Vkj] are block matrices such that each product UikVkj is defined. Then

UV =
[ W11 W12 · · · W1n ]
[ W21 W22 · · · W2n ]
[ . . . . . . . . . ]
[ Wm1 Wm2 · · · Wmn ]

where Wij = Ui1V1j + Ui2V2j + · · · + UipVpj.


Square Block Matrices

Let M be a block matrix. Then M is called a square block matrix if

1. M is a square matrix

2. The blocks form a square matrix

3. The diagonal blocks are also square

The latter two conditions will occur if and only if there are the same number of horizontal and vertical lines and they are placed symmetrically.

Block Diagonal Matrices

Let M = [Aij] be a square block matrix such that the nondiagonal blocks are all zero matrices, that is, Aij = 0 for i ≠ j. Then M is called a block diagonal matrix, written

M = diag(A11, A22, . . . , Arr)  or  M = A11 ⊕ A22 ⊕ · · · ⊕ Arr

Suppose f(x) is a polynomial and M is a block diagonal matrix. Then f(M) is the block diagonal matrix

f(M) = diag(f(A11), f(A22), . . . , f(Arr))

M is invertible ⇐⇒ each Aii is invertible, and then

M⁻¹ = diag(A11⁻¹, A22⁻¹, . . . , Arr⁻¹)

Analogously, a square block matrix is called a block upper triangular matrix if the blocks below the diagonal are zero matrices, and a block lower triangular matrix if the blocks above the diagonal are zero matrices.


Chapter 3

Systems of Linear Equations

3.1 Introduction

All our systems of linear equations involve scalars as both coefficients and constants, and such scalars may come from any number field K. There is almost no loss in generality if the reader assumes that all our scalars are real numbers, that is, that they come from the real field R.

3.2 Basic Definitions, Solutions

Linear Equations and Solutions

A linear equation in unknowns x1, x2, . . . , xn is an equation that can be put in the standard form

a1x1 + a2x2 + · · · + anxn = b

where a1, a2, . . . , an and b are constants. The constant ak is called the coefficient of xk, and b is called the constant term of the equation.

A solution of the linear equation is a list of values

x1 = k1,  x2 = k2,  . . . ,  xn = kn

or

u = (k1, k2, . . . , kn)

such that

a1k1 + a2k2 + · · · + ankn = b

holds; we then say u satisfies the equation.

Systems of Linear Equations

A system of m linear equations L1, L2, . . . , Lm in n unknowns x1, x2, . . . , xn can be put in the standard form

a11x1 + a12x2 + · · · + a1nxn = b1
a21x1 + a22x2 + · · · + a2nxn = b2
. . .
am1x1 + am2x2 + · · · + amnxn = bm

where the aij and bi are constants. The number aij is the coefficient of the unknown xj in the equation Li, and the number bi is the constant of the equation Li. The system is called a square system if m = n, that is, if the number m of equations is equal to the number n of unknowns. The system is said to be homogeneous if all the constant terms are zero; otherwise the system is said to be nonhomogeneous. The system is said to be consistent if it has one or more solutions, and inconsistent if it has no solution.


Augmented and Coefficient Matrices of a System

Consider the general system of m equations in n unknowns. Define

M = [A B] =
[ a11 a12 · · · a1n | b1 ]
[ a21 a22 · · · a2n | b2 ]
[ . . .    . . .    | .. ]
[ am1 am2 · · · amn | bm ]

and

A =
[ a11 a12 · · · a1n ]
[ a21 a22 · · · a2n ]
[ . . .    . . .    ]
[ am1 am2 · · · amn ]

M is the augmented matrix of the system and A is called the coefficient matrix; B denotes the column vector of constants.

Degenerate Linear Equations

A linear equation is said to be degenerate if all the coefficients are zero:

0x1 + 0x2 + · · · + 0xn = b

The solution of such an equation depends only on the value of the constant b. Specifically:

(i) If b ≠ 0, then the equation has no solution.
(ii) If b = 0, then every vector u = (k1, k2, . . . , kn) is a solution.

Leading Unknown in a Nondegenerate Linear Equation

By the leading unknown of a nondegenerate linear equation L, we mean the first unknown in L with a nonzero coefficient. For example, x3 is the leading unknown of

0x1 + 0x2 + 5x3 + 6x4 + 0x5 + 8x6 = 7

and y is the leading unknown of

0x + 2y − 4z = 5

3.3 Elementary Operations

The following operations on a system of linear equations L1, L2, . . . , Lm are called elementary operations.

[E1]: Interchange Li and Lj:  Li ←→ Lj
[E2]: Replace Li by kLi (k ≠ 0):  kLi −→ Li
[E3]: Replace Lj by kLi + Lj:  kLi + Lj −→ Lj

Suppose a system M of linear equations is obtained from a system L of linear equations by a finite sequence of elementary operations. Then M and L have the same solutions.

3.4 Small Square Systems of Linear Equations

Systems of Two Linear Equations in Two Unknowns (2 × 2 Systems)

A1x + B1y = C1
A2x + B2y = C2

The system has exactly one solution if

A1/B1 ≠ A2/B2, that is, A1B2 − A2B1 ≠ 0

The system has no solution if

A1/A2 = B1/B2 ≠ C1/C2

The system has infinitely many solutions if

A1/A2 = B1/B2 = C1/C2

Determinant of order two:

|A1 B1; A2 B2| = A1B2 − A2B1


Elimination Algorithm

Algorithm 3.1: The input consists of two nondegenerate linear equations L1 and L2 in two unknowns with a unique solution.

Part A. (Forward Elimination) Multiply each equation by a constant so that the resulting coefficients of one unknown are negatives of each other, and then add the two equations to obtain a new equation L that has only one unknown.

Part B. (Back-substitution) Solve for the unknown in the new equation L (which contains only one unknown), substitute this value of the unknown into one of the original equations, and then solve to obtain the value of the other unknown.

Part A of Algorithm 3.1 can be applied to any system, even if the system does not have a unique solution. In such a case, the new equation L will be degenerate and Part B will not apply.

3.5 Systems in Triangular and Echelon Form

Triangular Form

2x1 + 3x2 + 5x3 − 2x4 = 9

5x2 − x3 + 3x4 = 1

7x3 − x4 = 3

2x4 = 8

Such a triangular system always has a unique solution, which may be obtained by back-substitution.

Echelon Form, Pivot and Free Variables

2x1 + 6x2 − x3 + 4x4 − 2x5 = 7

x3 + 2x4 + 2x5 = 5

3x4 − 9x5 = 6

x1, x3, x4 are called pivot variables and the other unknowns, x2 and x5, are called free variables.

Consider a system of linear equations in echelon form, say with r equations in n unknowns. There are two cases.

r = n: There are as many equations as unknowns (triangular form). Then the system has a unique solution.

r < n: There are more unknowns than equations. Then we can arbitrarily assign values to the n − r free variables and solve uniquely for the r pivot variables, obtaining a solution of the system.

The general solution of a system with free variables may be described in either of two equivalent ways: one description is called the "Parametric Form" of the solution, and the other is called the "Free-Variable Form".

Parametric Form

Assign arbitrary values, called parameters, to the free variables, and then use back-substitution to obtain values for the pivot variables.

Free-Variable Form

Use back-substitution to solve for the pivot variables directly in terms of the free variables.

3.6 Gauss Elimination

It essentially consists of two parts:

Part A. (Forward Elimination) Step-by-step reduction of the system yielding either a degenerate equation with no solution (which indicates the system has no solution) or an equivalent simpler system in triangular or echelon form.

Part B. (Backward Elimination) Step-by-step back-substitution to find the solution of the simpler system.

Part A. (Forward Elimination)

Input: The m × n system of linear equations.

Elimination Step: Find the first unknown in the system with a nonzero coefficient (which we may assume is x1).

1. Arrange so that a11 ≠ 0. That is, if necessary, interchange equations so that the first unknown x1 appears with a nonzero coefficient in the first equation.

2. Use a11 as a pivot to eliminate x1 from all equations except the first equation. That is, for each i > 1:


(a) Set m = −ai1/a11

(b) Replace Li by mL1 + Li

The system now has the following form:

a11x1 + a12x2 + a13x3 + · · · + a1nxn = b1
a2j2 xj2 + · · · + a2nxn = b2
. . .
amjm xjm + · · · + amnxn = bm

where x1 does not appear in any equation except the first, a11 ≠ 0, and xj2 denotes the first unknown with a nonzero coefficient in any equation other than the first.

3. Examine each new equation L.

(a) If L has the form 0x1 + 0x2 + · · · + 0xn = b with b ≠ 0, then STOP. The system is inconsistent and has no solution.

(b) If L has the form 0x1 + 0x2 + · · · + 0xn = 0, or if L is a multiple of another equation, then delete L from the system.

Recursion Step: Repeat the Elimination Step with each new "smaller" subsystem formed by all the equations excluding the first equation.

Output: Finally, the system is reduced to triangular or echelon form, or a degenerate equation with no solution is obtained, indicating an inconsistent system.

The next remarks refer to the Elimination Step in Algorithm 3.2.

1. The number m in (a) is called the multiplier:

m = −ai1/a11 = −(coefficient to be deleted)/(pivot)

2. One could alternatively apply the following operation in (b):

Replace Li by −ai1L1 + a11Li

This would avoid fractions if all the scalars were originally integers.
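A compact sketch of forward elimination plus back-substitution for a square system with a unique solution (an illustrative implementation with partial pivoting, not the textbook's pseudocode verbatim):

    import numpy as np

    def gauss_solve(A, b):
        # Solve Ax = b by forward elimination and back-substitution.
        M = np.hstack([A.astype(float), b.reshape(-1, 1).astype(float)])
        n = len(b)
        for j in range(n):
            p = j + np.argmax(np.abs(M[j:, j]))   # pivot row (partial pivoting)
            M[[j, p]] = M[[p, j]]
            for i in range(j + 1, n):
                m = -M[i, j] / M[j, j]            # the multiplier
                M[i] += m * M[j]                  # replace L_i by m L_j + L_i
        x = np.zeros(n)
        for i in range(n - 1, -1, -1):            # back-substitution
            x[i] = (M[i, -1] - M[i, i + 1:n] @ x[i + 1:]) / M[i, i]
        return x

    A = np.array([[2.0, 3.0], [4.0, 1.0]])
    b = np.array([8.0, 6.0])
    print(gauss_solve(A, b))   # [1. 2.]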

3.7 Elementary Matrices

Elementary Column Operations

Now let A be a matrix with columns C1, C2, . . . , Cn. The following operations on A, analogous to the elementary row operations, are called elementary column operations.

[F1] (Column Interchange): Interchange columns Ci and Cj.
[F2] (Column Scaling): Replace Ci by kCi (where k ≠ 0).
[F3] (Column Addition): Replace Cj by kCi + Cj.

We may indicate each of the column operations by writing, respectively:

1. Ci ←→ Cj

2. kCi −→ Ci

3. kCi + Cj −→ Cj

Now let f denote an elementary column operation, and let F be the matrix obtained by applying f to the identity matrix I, that is, F = f(I). Then F is called the elementary matrix corresponding to the elementary column operation f. Note that F is always a square matrix.

Theorem:

For any matrix A, f(A) = AF. That is, the result of applying an elementary column operation f to a matrix A can be obtained by postmultiplying A by the corresponding elementary matrix F.


Elementary Row Operations

Suppose A is a matrix with rows R1, R2, . . . , Rm. The following operations on A are called elementary row operations.

[E1] (Row Interchange): Interchange Ri and Rj:  Ri ←→ Rj
[E2] (Row Scaling): Replace Ri by kRi (k ≠ 0):  kRi −→ Ri
[E3] (Row Addition): Replace Rj by kRi + Rj:  kRi + Rj −→ Rj

Let e denote an elementary row operation and let e(A) denote the result of applying the operation e to a matrix A. Now let E be the matrix obtained by applying e to the identity matrix I, that is, E = e(I). Then E is called the elementary matrix corresponding to the elementary row operation e. Note that E is always a square matrix.

Theorem:

Let e be an elementary row operation and let E be the corresponding m × m elementary matrix. Then

e(A) = EA

where A is any m × n matrix. In other words, the result of applying an elementary row operation e to a matrix A can be obtained by premultiplying A by the corresponding elementary matrix E.

3.8 Linear Systems of Equations; Gauss Elimination, Matrix Formulation

3.8.1 Introduction

A system of m linear equations L1, L2, . . . , Lm in n unknowns x1, x2, . . . , xn can be put in the standard form

a11x1 + a12x2 + · · · + a1nxn = b1
a21x1 + a22x2 + · · · + a2nxn = b2
. . .
am1x1 + am2x2 + · · · + amnxn = bm

As in Section 3.2, M denotes the augmented matrix of the system and A the coefficient matrix:

M = [A B]

where B denotes the column vector of constants.

3.8.2 Homogeneous Systems Of Linear Equations

A system of linear equations is said to be homogeneous if all the constant terms are zero. Thus a homogeneous system has the form AX = 0. Clearly, such a system always has the zero vector 0 = (0, 0, . . . , 0) as a solution, called the zero or trivial solution. Accordingly, we are usually interested in whether or not the system has a nonzero solution.

Since a homogeneous system AX = 0 does have at least the zero solution, it can always be put in an echelon form, say

a11x1 + a12x2 + a13x3 + a14x4 + · · · + a1nxn = 0
a2j2 xj2 + a2,j2+1 xj2+1 + · · · + a2nxn = 0
. . .
arjr xjr + ar,jr+1 xjr+1 + · · · + arnxn = 0

Here r denotes the number of equations in echelon form and n denotes the number of unknowns. Thus the echelon system has n − r free variables. The question of nonzero solutions reduces to the following two cases:

1. r = n. The system has only the zero solution.

2. r < n. The system has a nonzero solution.


Accordingly, if we begin with fewer equations than unknowns, then, in echelon form, r < n, and the system has a nonzero solution. The augmented matrix M determines the system completely because it contains all the given numbers appearing in the system of equations.

3.8.3 Systems Of Linear Equations And Linear Combinations Of Vectors

The general system of linear equations may be rewritten as the following vector equation:

x1[a11; a21; . . . ; am1] + x2[a12; a22; . . . ; am2] + · · · + xn[a1n; a2n; . . . ; amn] = [b1; b2; . . . ; bm]

Accordingly, the general system of linear equations and the above equivalent vector equation have a solution if and only if the column vector of constants is a linear combination of the columns of the coefficient matrix.

Linear Combinations of Orthogonal Vectors, Fourier Coefficients

Recall first (Section 1.4) that the dot (inner) product u · v of vectors u = (a1, . . . , an) and v = (b1, . . . , bn) in R^n is defined by

u · v = a1b1 + a2b2 + · · · + anbn

Furthermore, vectors u and v are said to be orthogonal if their dot product u · v = 0.

Suppose u1, u2, . . . , un are nonzero, pairwise orthogonal vectors in R^n. This means

ui · uj = 0 for i ≠ j   (3.8.1)

and

ui · ui ≠ 0 for each i

Then, for any vector v in R^n, there is an easy way to write v as a linear combination of u1, u2, . . . , un, given in the following theorem.

Theorem:

Suppose u1, u2, . . . , un are nonzero, pairwise orthogonal vectors in R^n. Then, for any vector v in R^n,

v = (v · u1)/(u1 · u1) u1 + (v · u2)/(u2 · u2) u2 + · · · + (v · un)/(un · un) un   (3.8.2)

We emphasize that there must be n such orthogonal vectors ui in R^n for the formula to be used. Note also that ui · ui ≠ 0 for each i, since each ui is a nonzero vector.

Remark:

The scalar ki appearing in the theorem is called the Fourier coefficient of v with respect to ui:

ki = (v · ui)/(ui · ui) = (v · ui)/‖ui‖²

It is analogous to a coefficient in the celebrated Fourier series of a function.
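A quick sketch of the decomposition v = k1u1 + · · · + knun for a pairwise orthogonal set (the three vectors below were chosen for illustration):

    import numpy as np

    # Three nonzero, pairwise orthogonal vectors in R^3.
    U = [np.array([1.0, 1.0, 0.0]),
         np.array([1.0, -1.0, 0.0]),
         np.array([0.0, 0.0, 2.0])]

    v = np.array([3.0, 5.0, 7.0])

    # Fourier coefficient of v with respect to each u_i: k_i = (v.u_i)/(u_i.u_i)
    ks = [np.dot(v, u) / np.dot(u, u) for u in U]
    print(ks)   # [4.0, -1.0, 3.5]

    recombined = sum(k * u for k, u in zip(ks, U))
    print(np.allclose(recombined, v))   # True: v = k1 u1 + k2 u2 + k3 u3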


3.8.4 Matrix Equation Of A System Of Linear Equations

The general system of m linear equations in n unknowns is equivalent to the matrix equation

[a11 a12 . . . a1n; a21 a22 . . . a2n; . . . ; am1 am2 . . . amn][x1; x2; . . . ; xn] = [b1; b2; . . . ; bm]

or

AX = B

where A = [aij] is the coefficient matrix, X = [xj] is the column vector of unknowns, and B = [bi] is the column vector of constants. The statement that the system of linear equations and the matrix equation are equivalent means that any vector solution of the system is a solution of the matrix equation, and vice versa.

A system AX = B of linear equations is square if and only if the matrix A of coefficients is square. In such a case, we have the following important result.

Theorem:

A square system AX = B of linear equations has a unique solution if and only if the matrix A is invertible. In such a case, A⁻¹B is the unique solution of the system.
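A numeric illustration of the theorem (in practice np.linalg.solve is preferred over forming A⁻¹ explicitly):

    import numpy as np

    A = np.array([[2.0, 1.0],
                  [5.0, 3.0]])
    B = np.array([4.0, 11.0])

    # A is invertible (det = 1), so X = A^{-1} B is the unique solution.
    X = np.linalg.inv(A) @ B
    print(X)                                        # [1. 2.]
    print(np.allclose(np.linalg.solve(A, B), X))    # True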

3.8.5 Geometric Interpretation. Existence and Uniqueness of Solutions

The theorem has a geometric description when the system consists of two equations in two unknowns, where each equation represents a line in R². It also has a geometric description when the system consists of three nondegenerate equations in three unknowns, where the three equations correspond to planes H1, H2, H3 in R³.

Consider a system of two nondegenerate linear equations in two unknowns x and y, which can be put in the standard form

A1x + B1y = C1
A2x + B2y = C2

The system has exactly one solution: here the two lines intersect in one point. This occurs when the lines have distinct slopes or, equivalently, when the coefficients of x and y are not proportional:

A1/A2 ≠ B1/B2, that is, A1B2 − A2B1 ≠ 0

This section gives two matrix algorithms that accomplish the following:

1. Algorithm 3.3 transforms any matrix A into an echelon form.

2. Algorithm 3.4 transforms the echelon matrix into its row canonical form.

These algorithms, which use the elementary row operations, are simply restatements of Gaussian elimination as applied to matrices rather than to linear equations.

Algorithm 3.3 (Forward Elimination): The input is any matrix A. (The algorithm puts 0's below each pivot, working from the "top down".) The output is an echelon form of A.

Step 1. Find the first column with a nonzero entry. Let j1 denote this column. Arrange so that a1j1 ≠ 0. That is, if necessary, interchange rows so that a nonzero entry appears in the first row in column j1.


Figure 3.8.1: Geometric Interpretation. Existence and Uniqueness of Solutions

Figure 3.8.2: Geometric Interpretation: 2D Space

Figure 3.8.3: Geometric Interpretation: 3D Space

Figure 3.8.4: Gauss Elimination Example: Electrical Network


Use a1j1 as a pivot to obtain 0's below a1j1. Specifically, for i > 1:

(a) Set m = −aij1/a1j1;

(b) Replace Ri by mR1 + Ri

[That is, apply the operation −(aij1/a1j1)R1 + Ri −→ Ri.]

Step 2. Repeat Step 1 with the submatrix formed by all the rows excluding the first row. Here we let j2 denote the first column in the subsystem with a nonzero entry, so that at the end of Step 2 we have a2j2 ≠ 0.

Continue the above process until a submatrix has only zero rows.

We emphasize that at the end of the algorithm, the pivots will be

a1j1 , a2j2 , . . . , arjr

where r denotes the number of nonzero rows in the final echelon matrix.

Remark 1: The number m in Step 1(b) is called the multiplier:

m = −aij1/a1j1 = −(entry to be deleted)/(pivot)

Remark 2: One could replace the operation in Step 1(b) by

Replace Ri by −aij1R1 + a1j1Ri

This would avoid fractions if all the scalars were originally integers.

Algorithm 3.4 (Backward Elimination): The input is a matrix A = [aij] in echelon form with pivot entries

a1j1 , a2j2 , . . . , arjr (3.8.3)

The output is the row canonical form of A.

Step 1.

(a) (Use row scaling so the last pivot equals 1.) Multiply the last nonzero row Rr by 1/arjr.

(b) (Use arjr = 1 to obtain 0's above the pivot.) For i = r − 1, r − 2, . . . , 1:

Set m = −aijr and replace Ri by mRr + Ri

(That is, apply the operations −aijr Rr + Ri −→ Ri.)

Steps 2 to r − 1. Repeat Step 1 for rows Rr−1, Rr−2, . . . , R2.

Step r. (Use row scaling so the first pivot equals 1.) Multiply R1 by 1/a1j1.

Remark: We emphasize that Gaussian elimination is a two-stage process. Specifically:

Stage A (Algorithm 3.3). Puts 0's below each pivot, working from the top row R1 down.

Stage B (Algorithm 3.4). Puts 0's above each pivot, working from the bottom row Rr up.

There is another algorithm, called Gauss-Jordan, that also row reduces a matrix to its row canonical form. The difference is that Gauss-Jordan puts 0's both below and above each pivot as it works its way from the top row R1 down. Although Gauss-Jordan may be easier to state and understand, it is much less efficient than the two-stage Gaussian elimination algorithm.

Application to Systems of Linear Equations

One way to solve a system of linear equations is by working with its augmented matrix M rather than the equations themselves. Specifically, we reduce M to echelon form (which tells us whether the system has a solution), and then further reduce M to its row canonical form (which essentially gives the solution of the original system of linear equations).

Example: Gauss Elimination. Electrical Network

Solve the linear system

x1 − x2 + x3 = 0

−x1 + x2 − x3 = 0

10x2 + 25x3 = 90

20x1 + 10x2 = 80

Solution by Gauss Elimination


Form the augmented matrix

Step 1. Elimination of x1

Step 2. Elimination of x2

The result

Back-substitution: in this order, x3, x2, x1
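The worked matrices of this example appear as figures in the source; as a cross-check, a least-squares solve of this consistent 4 × 3 system recovers the same solution:

    import numpy as np

    # x1 - x2 + x3 = 0; -x1 + x2 - x3 = 0; 10 x2 + 25 x3 = 90; 20 x1 + 10 x2 = 80
    A = np.array([[ 1.0, -1.0,  1.0],
                  [-1.0,  1.0, -1.0],
                  [ 0.0, 10.0, 25.0],
                  [20.0, 10.0,  0.0]])
    b = np.array([0.0, 0.0, 90.0, 80.0])

    x, residual, rank, sv = np.linalg.lstsq(A, b, rcond=None)
    print(x)   # [2. 4. 2.]  ->  x1 = 2, x2 = 4, x3 = 2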

3.8.5.1 Gauss Elimination: The Three Possible Cases of Systems

The Gauss elimination can take care of linear systems with a unique solution, with infinitely many solutions, and without solutions (inconsistent systems).

Example: Gauss Elimination if Infinitely Many Solutions Exist. Solve the following linear system of three equations in four unknowns whose augmented matrix is

System of linear equations and Augmented Matrix


Step 1: Elimination of x1

Step 2: Elimination of x2

Back-substitution:

x2 = 1 − x3 + 4x4
x1 = 2 − x4

Since x3 and x4 remain arbitrary, we have infinitely many solutions. Setting x3 = t1 and x4 = t2, we can also write

x2 = 1 − t1 + 4t2
x1 = 2 − t2

where t1 and t2 are arbitrary parameters.

Example: Gauss Elimination if No Solution Exists. Consider the system of linear equations

Systems of linear equations and Augmented Matrix

Step 1: Elimination of x1

Step 2: Elimination of x2

Back-substitution:

The false statement 0 = 12 shows that the system has no solution.

3.8.6 Row Echelon Form and Information From It

At the end of the Gauss elimination, the resulting form of the coefficient matrix, the augmented matrix, and the system itself is called the row echelon form.


Row Echelon Form Examples

At the end of the Gauss elimination (before the back substitution), the row echelon form of the augmented matrix indicates:

1. Exactly one solution if r = n and br+1, . . . , bm, if present, are zero.

2. Infinitely many solutions if r < n and br+1, . . . , bm, if present, are zero.

3. No solution if r < m and one of the entries br+1, . . . , bm is nonzero.


3.9 Solutions of Linear Systems: Existence, Uniqueness


A homogeneous system AX = 0 always has the trivial solution x1 = 0, x2 = 0, . . . , xn = 0. Nontrivial solutions exist if and only if rank(A) = r < n.

3.9.1 Second- and Third-Order Determinants

3.9.1.1 Second-Order Determinant

A determinant of second order is denoted and defined by

D = |a11 a12; a21 a22| = a11a22 − a12a21

3.9.1.2 Cramer's Rule for Solving Linear Systems of Two Equations in Two Unknowns

For the system

a11x1 + a12x2 = b1
a21x1 + a22x2 = b2

Cramer's rule gives x1 = D1/D and x2 = D2/D with D ≠ 0, where D1 and D2 are obtained from D by replacing the first and the second column, respectively, by the column of constants. The value D = 0 appears for inconsistent nonhomogeneous systems and for homogeneous systems with nontrivial solutions.

3.9.1.3 Third-Order Determinant

A determinant of third order can be defined by the expansion

|a11 a12 a13; a21 a22 a23; a31 a32 a33| = a11|a22 a23; a32 a33| − a12|a21 a23; a31 a33| + a13|a21 a22; a31 a32|


3.9.1.4 Cramer's Rule for Linear Systems of Three Equations

x1 = D1/D,  x2 = D2/D,  x3 = D3/D  (D ≠ 0)

where D is the third-order determinant of the coefficient matrix and Dk is obtained from D by replacing the kth column by the column of constants.

3.10 Determinants: Cramer's Rule

A determinant of order n is a scalar associated with an n × n matrix A = [aij], written D = det(A) = |A|.

For n = 1, D = a11. For n > 1, D may be computed by the cofactor expansion along any row j:

D = Σ_{k=1}^{n} ajk Cjk,  where Cjk = (−1)^{j+k} Mjk

and Mjk is a determinant of order n − 1, namely, the determinant of the submatrix of A obtained from A by omitting the row and column of the entry ajk, that is, the jth row and the kth column. Mjk is called the minor of ajk in D, and Cjk the cofactor of ajk in D.


Example: Expansions of a Third-Order Determinant

Example: Determinant of a Triangular Matrix

3.10.1 General Properties of Determinants


3.10.2 Determination of Rank and Submatrices

Submatrix:

Any matrix obtained by deleting some rows and/or columns of a given matrix [A].

Example:

Find all submatrices of the following 2 × 3 matrix:

A = [a11 a12 a13; a21 a22 a23]

Of course, one obvious submatrix is the matrix [A] itself, with no row or column deleted. The other submatrices are:

Three 2 × 2 submatrices:

[a11 a12; a21 a22],  [a11 a13; a21 a23],  [a12 a13; a22 a23]

Two 1 × 3 submatrices:

[a11 a12 a13],  [a21 a22 a23]

Three 2 × 1 submatrices:

[a11; a21],  [a12; a22],  [a13; a23]

Six 1 × 2 submatrices:

[a11 a12],  [a11 a13],  [a12 a13],  [a21 a22],  [a21 a23],  [a22 a23]

Six 1 × 1 submatrices:

[a11],  [a12],  [a13],  [a21],  [a22],  [a23]

Rank

A general matrix [A] is said to be of rank r if it contains at least one square submatrix of size r × r with a non-vanishing (nonzero) determinant, while the determinant of any square submatrix of [A] of size greater than r is zero.

Example:

A = [4 2 1 3; 6 3 4 7; 2 1 0 1]

Matrix A contains four 3 × 3 submatrices, but the determinant of each is zero, so the rank of [A] is not 3. Matrix A contains the submatrix

[4 1; 6 4]

whose determinant is not zero. Therefore, the rank of [A] is 2.

For an n × n square matrix [A], if det[A] = 0, then its rank is less than n. In that case, [A] is called a singular matrix. Consequently, an n × n matrix [A] has rank equal to n if and only if det[A] is not equal to zero; i.e., [A] is nonsingular.
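NumPy can confirm the rank computation in the example above:

    import numpy as np

    A = np.array([[4.0, 2.0, 1.0, 3.0],
                  [6.0, 3.0, 4.0, 7.0],
                  [2.0, 1.0, 0.0, 1.0]])

    print(np.linalg.matrix_rank(A))   # 2

    # The 2x2 submatrix from rows 1-2 and columns 1 and 3: det = 16 - 6 = 10 != 0
    S = A[np.ix_([0, 1], [0, 2])]
    print(np.linalg.det(S))           # 10.0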

Example:

Show that the matrix A is singular:

A = [1 2 3; 4 5 6; 7 8 9]

det(A) = |A| = 1·|5 6; 8 9| − 2·|4 6; 7 9| + 3·|4 5; 7 8|

|A| = (45 − 48) − 2(36 − 42) + 3(32 − 35) = −3 + 12 − 9 = 0

The rank of [A] is less than n = 3. Hence, it is a singular matrix. The rank is 2.

3.10.3 Cramer's Rule

3.10.4 Useful Formulas for Inverses


Example: Inverse of a 2× 2 Matrix

Example: Inverse of a 3× 3 Matrix

Example: Inverse of a Diagonal Matrix


3.11 Determination of the Inverse by the Gauss-Jordan Method

Example
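The worked example is given as a figure in the source; the method itself row-reduces the block matrix [A | I] until the left block becomes I, at which point the right block is A⁻¹. A minimal sketch (illustrative, with partial pivoting):

    import numpy as np

    def gauss_jordan_inverse(A):
        # Row-reduce [A | I] to [I | A^{-1}].
        n = A.shape[0]
        M = np.hstack([A.astype(float), np.eye(n)])
        for j in range(n):
            p = j + np.argmax(np.abs(M[j:, j]))   # pivot of largest magnitude
            if M[p, j] == 0:
                raise ValueError("matrix is singular")
            M[[j, p]] = M[[p, j]]
            M[j] /= M[j, j]                       # scale the pivot row to 1
            for i in range(n):
                if i != j:
                    M[i] -= M[i, j] * M[j]        # clear column j elsewhere
        return M[:, n:]

    A = np.array([[2.0, 1.0], [5.0, 3.0]])
    print(gauss_jordan_inverse(A))                # [[ 3. -1.] [-5.  2.]]
    print(np.allclose(gauss_jordan_inverse(A), np.linalg.inv(A)))   # True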


Chapter 4

Matrix Eigenvalue Problems

A matrix eigenvalue problem considers the vector equation

Ax = λx

where

A is an n × n square matrix,
λ is an unknown scalar, and
x is an unknown vector.

x = 0 is always a solution of the above equation and so is of no interest; we seek solutions with x ≠ 0.

Terminology: the λ's that satisfy the matrix eigenvalue problem are called eigenvalues of A, and the corresponding nonzero x's are called eigenvectors of A.

Now consider the following numeric examples and observe the influence of multiplication by the matrix on the given vectors.

Case I:

[6 3; 4 7][5; 1] = [33; 27]

In the first case, we get a totally new vector with a different direction and a different length when compared to the original vector.

Case II:

[6 3; 4 7][3; 4] = [30; 40]

In the second case something interesting happens. The multiplication produces a vector

[30; 40] = 10[3; 4]

which means the new vector has the same direction as the original vector. The scale constant is

λ = 10

Formal definition of the eigenvalue problem: let A = [aij] be n × n and consider the vector equation

Ax = λx

Find x ≠ 0 and the corresponding λ.

Geometric interpretation of the solution of the eigenvalue problem:

1. Geometrically, we are looking for vectors x for which multiplication by A has the same effect as multiplication by a scalar λ.

2. Ax should be proportional to x.

3. Thus, the multiplication has the effect of producing, from the original vector x, a new vector that has the same or opposite (minus sign) direction as the original vector.


Terminology of the Eigenvalue Problem:

λ — eigenvalue / characteristic value / latent root of matrix A
x ≠ 0 — eigenvector / characteristic vector of matrix A
{λ1, . . . , λn} — spectrum of A
max{|λ1|, . . . , |λn|} — spectral radius of matrix A

4.1 How to Find Eigenvalues and Eigenvectors

This example demonstrates how to systematically solve a simple eigenvalue problem.

Example

All steps of the eigenvalue problem are illustrated in terms of the matrix

A = [−5 2; 2 −2]

Eigenvalues

Write the eigenvalue problem that corresponds to the given matrix:

Ax = [−5 2; 2 −2][x1; x2] = λ[x1; x2]

If we expand this vector equation we get

−5x1 + 2x2 = λx1
2x1 − 2x2 = λx2

This can be cast in the following form:

(−5 − λ)x1 + 2x2 = 0
2x1 + (−2 − λ)x2 = 0

In matrix notation,

(A − λI)x = 0

This is a homogeneous linear system. By Cramer's theorem it has a nontrivial solution x ≠ 0 if and only if its coefficient determinant is zero, that is,

D(λ) = det(A − λI) = |−5 − λ  2;  2  −2 − λ| = 0

D(λ) = λ² + 7λ + 6 = 0

Below is some more information about the terminology used in this chapter.

Terminology:

D(λ) — characteristic determinant (if expanded, the characteristic polynomial)
D(λ) = 0 — characteristic equation of A

The roots of the characteristic equation

D(λ) = λ² + 7λ + 6 = 0

are the eigenvalues of A. In this particular problem, the λ's are

λ1 = −1,  λ2 = −6

Eigenvector of A corresponding to λ1

In the original equations of the eigenvalue problem, set


λ = λ1 in

(−5 − λ)x1 + 2x2 = 0
2x1 + (−2 − λ)x2 = 0

Then we get

−4x1 + 2x2 = 0
2x1 − x2 = 0

A solution to the above set of equations can be obtained from either of the equations as

x2 = 2x1

Note that since we set λ = λ1, the determinant of the original vector equation (A − λI)x = 0 vanishes:

D(λ1) = det(A − λ1I) = 0

so the two equations above are linearly dependent and we have in fact only one independent equation. Indeed, examining the equations after setting λ = λ1,

−4x1 + 2x2 = 0
2x1 − x2 = 0

we see that we can get equation (1) by multiplying equation (2) by the scalar −2. We can therefore compute the first eigenvector only up to an unknown scalar multiplier; if we choose x1 = 1, we obtain the eigenvector

x1 = [1; 2]

We can check the solution by substituting this eigenvector into the original eigenvalue problem:

Ax1 = [−5 2; 2 −2][1; 2] = [−1; −2] = (−1)x1 = λ1x1

Eigenvector of A corresponding to λ2

For λ = λ2 = −6, the system

(−5 − λ)x1 + 2x2 = 0
2x1 + (−2 − λ)x2 = 0

reduces to

x1 + 2x2 = 0
2x1 + 4x2 = 0

The solution of the above homogeneous system of equations is

x2 = −x1/2

One of the unknowns is arbitrary; set

x1 = 2

Then we can compute


x2 = −1

The eigenvector, written up to an unknown scale factor, is

x2 = [2; −1]

Let's check the result:

Ax2 = [−5 2; 2 −2][2; −1] = [−12; 6] = (−6)x2 = λ2x2
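In practice the whole computation is one library call; a sketch verifying this example (np.linalg.eig returns unit-length eigenvectors, i.e., the same vectors up to the arbitrary scale factor discussed above, and the ordering of the eigenvalues may vary):

    import numpy as np

    A = np.array([[-5.0, 2.0],
                  [ 2.0, -2.0]])

    lam, X = np.linalg.eig(A)   # eigenvalues and eigenvector columns
    print(lam)                  # [-6. -1.] (order may vary)

    for k in range(2):
        # Each column of X satisfies A x = lambda x (up to rounding).
        print(np.allclose(A @ X[:, k], lam[k] * X[:, k]))   # True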

4.2 General Form of the Eigenvalue Problem for an n × n Matrix

In the sequel we investigate the general form of the eigenvalue problem for a matrix A = [aij] of size n × n:

a11x1 + · · · + a1nxn = λx1
a21x1 + · · · + a2nxn = λx2
. . .
an1x1 + · · · + annxn = λxn

The above set of homogeneous linear equations can be written as

(a11 − λ)x1 + a12x2 + · · · + a1nxn = 0
a21x1 + (a22 − λ)x2 + · · · + a2nxn = 0
. . .
an1x1 + an2x2 + · · · + (ann − λ)xn = 0

In matrix notation,

(A − λI)x = 0

By Cramer's theorem, this homogeneous linear system of equations has a nontrivial solution if and only if the corresponding determinant of the coefficients is zero:

D(λ) = det(A − λI) =
| a11 − λ   a12       · · ·   a1n     |
| a21       a22 − λ   · · ·   a2n     |
|  . . .     . . .    . . .    . . .  |
| an1       an2       · · ·   ann − λ | = 0

Terminology:

A − λI — characteristic matrix
D(λ) — characteristic determinant
D(λ) = det(A − λI) = 0 — characteristic equation of matrix A
D(λ), a polynomial of nth order in λ — characteristic polynomial of A

Eigenvalues: The eigenvalues of a square matrix A are the roots of the characteristic equation of A. The eigenvalues must be determined first. Once these are known, the corresponding eigenvectors are obtained from the homogeneous system of linear equations, for instance by Gauss elimination, with λ set to the eigenvalue for which an eigenvector is wanted.

Example: Multiple Eigenvalues

Find the eigenvalues and eigenvectors of the matrix A

A = [−2 2 −3; 2 1 −6; −1 −2 0]

The eigenvalue problem in matrix notation can be written as

(A − λI)x = 0

The characteristic determinant gives the characteristic equation

D(λ) = det(A − λI) = −λ³ − λ² + 21λ + 45 = 0

The eigenvalues of matrix A, which are the roots of the characteristic equation, are

λ1 = 5,  λ2 = λ3 = −3

To find the eigenvectors, we apply Gauss elimination to the system

(A − λI)x = 0

Set λ = λ1:

A − λ1I = A − 5I = [−7 2 −3; 2 −4 −6; −1 −2 −5]

Apply Gauss elimination to reduce the above system to echelon form; note that we do not necessarily have to use the augmented matrix, since the vector of constants is all zeros. The above matrix row-reduces to

[−7 2 −3; 0 −24/7 −48/7; 0 0 0]

Hence it has rank 2. Choose x3 = −1; then using

−(24/7)x2 − (48/7)x3 = 0

we can compute x2 = 2; then using

−7x1 + 2x2 − 3x3 = 0

we can compute

x1 = 1

Hence the eigenvector corresponding to λ = λ1 is

[1; 2; −1]

Set λ = λ2:

A − λ2I = A + 3I = [1 2 −3; 2 4 −6; −1 −2 3]

Apply Gauss elimination to reduce the above system to echelon form (again the augmented matrix is not needed since the constants are all zeros). The above matrix row-reduces to

[1 2 −3; 0 0 0; 0 0 0]

Hence it has rank 1. Use the only available equation, the first, to compute x1:


x1 + 2x2 − 3x3 = 0

Solve for x1:

x1 = −2x2 + 3x3

Choosing

x2 = 1, x3 = 0  and then  x2 = 0, x3 = 1

we obtain two linearly independent eigenvectors of matrix A corresponding to λ = λ2 = λ3 (two, because the rank is one and the number of unknowns is three, leaving 3 − 1 = 2 free variables). These eigenvectors are

x2 = [−2; 1; 0]  and  x3 = [3; 0; 1]

Example: Real Matrices with Complex Eigenvalues and Eigenvectors

Since real polynomials may have complex roots (which then occur in conjugate pairs), a real matrix may have complex eigenvalues and eigenvectors. For example, let

A = [0 1; −1 0]

The characteristic equation of the skew-symmetric matrix A is

det(A − λI) = |−λ 1; −1 −λ| = λ² + 1 = 0

Solution of the above characteristic equation gives the eigenvalues

λ1 = i (= √−1),  λ2 = −i

Eigenvectors are obtained from

−ix1 + x2 = 0  (for λ1)
ix1 + x2 = 0   (for λ2)

Choosing x1 = 1 arbitrarily in each case, we obtain

x1 = [1; i]  and  x2 = [1; −i]

Eigenvalues of the Transpose

The transpose AT of a square matrix A has the same eigenvalues as A.


4.3 Symmetric, Skew-Symmetric, and Orthogonal Matrices

4.3.1 Introduction

Definitions:

A real square matrix A = [ajk] is called symmetric if transposition leaves it unchanged:

A^T = A, thus akj = ajk

A real square matrix A = [ajk] is called skew-symmetric if transposition gives the negative of A:

A^T = −A, thus akj = −ajk

A real square matrix A = [ajk] is called orthogonal if transposition gives the inverse of A:

A^T = A⁻¹

Any real square matrix A may be written as the sum of a symmetric matrix R and a skew-symmetric matrix S, where

R = (1/2)(A + A^T)  and  S = (1/2)(A − A^T)

Eigenvalues of Symmetric and Skew-Symmetric Matrices

• The eigenvalues of a symmetric matrix are real.

• The eigenvalues of a skew-symmetric matrix are pure imaginary or zero.

4.3.2 Orthogonal Transformations and Orthogonal Matrices

Orthogonal transformations are transformations

y = Ax

where A is an orthogonal matrix. A plane rotation through an angle θ is an orthogonal transformation:

y = [y1; y2] = [cos θ  sin θ; −sin θ  cos θ][x1; x2]

It can be shown that any orthogonal transformation in the plane or in three-dimensional space is a rotation.

Invariance of the Inner Product

An orthogonal transformation preserves the value of the inner product of vectors a and b in R^n, defined by

a · b = a^T b = [a1 · · · an][b1; . . . ; bn]

That is, for any a and b in R^n and any orthogonal n × n matrix A, with

u = Aa  and  v = Ab

we have

u · v = a · b

Hence the transformation also preserves the length or norm of any vector a in R^n, given by

‖a‖ = √(a · a) = √(a^T a)


Orthonormality of Column and Row Vectors

A real square matrix is orthogonal if and only if its column vectors a1, a2, . . . , an (and also its row vectors) form an orthonormal system, that is,

aj · ak = aj^T ak = 0 if j ≠ k,  and  aj · ak = 1 if j = k

Determinant of an Orthogonal Matrix

The determinant of an orthogonal matrix has the value +1 or −1

Eigenvalues of an Orthogonal Matrix

The eigenvalues of an orthogonal matrix A are real or complex conjugates in pairs and have absolute value 1

4.4 Eigenbases. Diagonalization. Quadratic Forms

4.4.1 Introduction

The eigenvectors of an n × n matrix A may (or may not!) form a basis for R^n. An "eigenbasis" (basis of eigenvectors), if it exists, is of great advantage, because then any x ∈ R^n can be written

x = c1x1 + c2x2 + · · · + cnxn

where x1, x2, . . . , xn are the eigenvectors that form the eigenbasis. Since (λj, xj) is an eigenvalue-eigenvector pair of the matrix eigenvalue problem,

Axj = λjxj

we can write

y = Ax = A(c1x1 + c2x2 + · · · + cnxn)
y = c1Ax1 + c2Ax2 + · · · + cnAxn
y = c1λ1x1 + c2λ2x2 + · · · + cnλnxn

This shows that we have decomposed the complicated action of A on an arbitrary vector x into a sum of simple actions (multiplication by scalars) on the eigenvectors of A.

Theorem: Basis of Eigenvectors

If an n × n matrix A has n distinct eigenvalues, then A has a basis of eigenvectors x1, x2, . . . , xn for R^n.

Theorem: Symmetric Matrices

A symmetric matrix has an orthonormal basis of eigenvectors for R^n.

4.4.2 Similarity of Matrices. Diagonalization

Eigenbases also play a role in reducing a matrix A to a diagonal matrix whose entries are the eigenvalues of A. This is done bya �similarity transformation

De�nition: Similar Matrices. Similarity Transformation

An n× n matrix A is called similar to an n× n matrix A if

A = P−1AP

for some (nonsingular!) n× n matrix P . This transformation, which gives A from A is called a similarity transformation


Theorem: Eigenvalues and Eigenvectors of Similar Matrices

If Â is similar to A, then Â has the same eigenvalues as A. Furthermore, if x is an eigenvector of A, then y = P⁻¹x is an eigenvector of Â corresponding to the same eigenvalue.

Example: Eigenvalues and Vectors of Similar Matrices

Let

A = [6 −3; 4 −1]  and  P = [1 3; 1 4]

Then

Â = P⁻¹AP = [4 −3; −1 1][6 −3; 4 −1][1 3; 1 4] = [3 0; 0 2]

Â has the eigenvalues

λ1 = 3 and λ2 = 2

The characteristic equation of A is

(6 − λ)(−1 − λ) + 12 = λ² − 5λ + 6 = 0

The roots of this characteristic equation (the eigenvalues of A) are

λ1 = 3 and λ2 = 2

which confirms the first part of the theorem. In order to compute the eigenvectors of A, we use the matrix equation

(A − λI)x = 0

If we select the first row, we get

(6 − λ)x1 − 3x2 = 0

For λ = λ1 = 3, this gives

3x1 − 3x2 = 0

so the first eigenvector can be written as

x1 = [1; 1]

For λ = λ2 = 2, this gives

4x1 − 3x2 = 0

so the second eigenvector can be written as

x2 = [3; 4]

The theorem states that

y1 = P⁻¹x1 = [4 −3; −1 1][1; 1] = [1; 0]
y2 = P⁻¹x2 = [4 −3; −1 1][3; 4] = [0; 1]

Indeed, these are eigenvectors of the diagonal matrix Â. We see that x1 and x2 are the columns of P. By a suitable similarity transformation we can now transform a matrix A to a diagonal matrix D whose diagonal entries are the eigenvalues of A:


Theorem: Diagonalization of a Matrix

If an n × n matrix A has a basis of eigenvectors, then

D = X⁻¹AX

is diagonal, with the eigenvalues of A as the entries on the main diagonal. Here X is the matrix with these eigenvectors as column vectors. Also,

D^m = X⁻¹A^mX  (m = 2, 3, . . .)

Example: Diagonalization

Diagonalize

A = [7.3 0.2 −3.7; −11.5 1.0 5.5; 17.7 1.8 −9.3]

The characteristic determinant can be written as

det(A − λI) = 0

This gives the characteristic equation

−λ³ − λ² + 12λ = 0

The roots (eigenvalues) of this characteristic equation are

λ1 = 3,  λ2 = −4,  λ3 = 0

We apply Gauss elimination to

(A − λI)x = 0

with λ = λ1, λ2, λ3 and find the corresponding eigenvector pairs (λ1, x1), (λ2, x2), (λ3, x3). From these eigenvectors we form the transformation matrix X:

X = [x1 x2 x3]

Then we use Gauss-Jordan elimination to compute X⁻¹ from X. The results can be summarized as

λ1 = 3, x1 = [−1; 3; −1],  λ2 = −4, x2 = [1; −1; 3],  λ3 = 0, x3 = [2; 1; 4]

X = [−1 1 2; 3 −1 1; −1 3 4],  X⁻¹ = [−0.7 0.2 0.3; −1.3 −0.2 0.7; 0.8 0.2 −0.2]

Calculate AX and premultiply by X⁻¹:

D = X⁻¹AX = [−0.7 0.2 0.3; −1.3 −0.2 0.7; 0.8 0.2 −0.2][7.3 0.2 −3.7; −11.5 1.0 5.5; 17.7 1.8 −9.3][−1 1 2; 3 −1 1; −1 3 4] = [3 0 0; 0 −4 0; 0 0 0]
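The same computation can be reproduced numerically; a minimal sketch assuming NumPy:

```python
import numpy as np

A = np.array([[  7.3, 0.2, -3.7],
              [-11.5, 1.0,  5.5],
              [ 17.7, 1.8, -9.3]])

X = np.array([[-1.0,  1.0, 2.0],    # eigenvectors from the example as columns
              [ 3.0, -1.0, 1.0],
              [-1.0,  3.0, 4.0]])

D = np.linalg.inv(X) @ A @ X
print(np.round(D, 10))              # diag(3, -4, 0), up to rounding

# D^m = X^{-1} A^m X, e.g. for m = 3:
lhs = np.linalg.matrix_power(D, 3)
rhs = np.linalg.inv(X) @ np.linalg.matrix_power(A, 3) @ X
print(np.allclose(lhs, rhs))        # True
```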


4.4.3 Quadratic Forms. Transformation to Principal Axes

By definition, a quadratic form Q in the components x1, x2, . . . , xn of a vector x is

Q = xᵀAx = Σ_{j=1}^{n} Σ_{k=1}^{n} a_jk x_j x_k

When the above double summation is expanded,

Q = a11x1² + a12x1x2 + · · · + a1nx1xn
  + a21x2x1 + a22x2² + · · · + a2nx2xn
  + · · ·
  + an1xnx1 + an2xnx2 + · · · + annxn²

A = [a_jk] is called the coefficient matrix. A is assumed to be symmetric.

We know that a symmetric coefficient matrix A has an orthonormal basis of eigenvectors. Hence if we form the matrix X from these orthonormal eigenvectors,

X = [x1 x2 · · · xn]

we obtain a matrix X that is orthogonal, so we may conclude that

X⁻¹ = Xᵀ

Then we can write

D = X⁻¹AX  or  A = XDX⁻¹

or, by using the orthogonality property of X, that is X⁻¹ = Xᵀ, we can write A as

A = XDXᵀ

If we substitute this form of A into the quadratic form Q,

Q = xᵀXDXᵀx

If we set

Xᵀx = y

and use the orthogonality property X⁻¹ = Xᵀ, we have

X⁻¹x = y,  or  x = Xy

Similarly,

xᵀX = (Xᵀx)ᵀ = yᵀ

so Q simply becomes

Q = yᵀDy = λ1y1² + λ2y2² + · · · + λnyn²


Theorem: Principal Axes Theorem

The substitution

x = Xy

transforms the quadratic form

Q = xᵀAx = Σ_{j=1}^{n} Σ_{k=1}^{n} a_jk x_j x_k   (a_kj = a_jk)

to the principal axes form or canonical form

Q = λ1y1² + λ2y2² + · · · + λnyn²

where λ1, λ2, . . . , λn are the (not necessarily distinct) eigenvalues of the (symmetric!) matrix A, and X is an orthogonal matrix with corresponding eigenvectors x1, x2, . . . , xn, respectively, as column vectors.

Example: Transformation to Principal Axes. Conic Sections

Transform the conic section represented by the following quadratic form:

Q = 17x1² − 30x1x2 + 17x2²

Q can be written as

Q = xᵀAx

where

A = [17 −15; −15 17],  x = [x1; x2]

First we must compute the transformation matrix X; the columns of X are the eigenvectors of the matrix A. Hence we must solve an eigenvalue problem. The characteristic equation of matrix A is

(17 − λ)² − 15² = 0

The roots of the characteristic equation, the eigenvalues of matrix A, are

λ1 = 2,  λ2 = 32

Using the theorem, we know that if we solve the eigenvalue problem completely, find the corresponding eigenvector pairs (λ1, x1), (λ2, x2), form the orthogonal transformation matrix X from the eigenvectors {x1, x2},

X = [x1 x2]

and finally use the transformation

x = Xy

together with the knowledge that X⁻¹ = Xᵀ because the matrix X is orthogonal, we end up with the following representation of the quadratic form in y:

Q = λ1y1² + λ2y2² = 2y1² + 32y2²

To calculate the directions of the principal axes in the x1x2-coordinates, we have to determine normalized eigenvectors. The eigenvalue problem is set up as

(A − λI)x = 0

The eigenvalues are λ1 = 2


and λ2 = 32. Solving

(A − λI)x = 0

with λ = λ1, λ2, we get

x1 = [1/√2; 1/√2],  x2 = [−1/√2; 1/√2]

Hence

x = Xy = [1/√2 −1/√2; 1/√2 1/√2][y1; y2]

that is,

x1 = y1/√2 − y2/√2
x2 = y1/√2 + y2/√2

This is a 45° rotation.
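A numerical sketch of this transformation to principal axes, assuming NumPy (numpy.linalg.eigh returns an orthonormal eigenbasis for a symmetric matrix):

```python
import numpy as np

A = np.array([[ 17.0, -15.0],
              [-15.0,  17.0]])

lam, X = np.linalg.eigh(A)          # lam = [2, 32]; columns of X orthonormal
print(lam)

def Q(x):
    return x @ A @ x                # quadratic form x^T A x

rng = np.random.default_rng(0)
y = rng.standard_normal(2)          # arbitrary point in y-coordinates
x = X @ y                           # x = X y
print(np.isclose(Q(x), lam[0]*y[0]**2 + lam[1]*y[1]**2))   # True
```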


Chapter 5

Vector and Scalar Functions and Their Fields.

Vector Calculus: Derivatives

5.1 Introduction

Definition: Vector Function

Let P be any point in a domain of definition. Then a vector function v is defined as

v = v(P) = [v1(P)  v2(P)  v3(P)]

Note that v is a 3D vector and its value depends on the point P in space. In general, a vector function defines a vector field in its domain of definition.

Example: Typical vector fields

1. Field of tangent vectors of a curve

2. Normal vectors of a surface

3. Velocity field of a rotating body

Definition: Scalar Function

The values of a scalar function are scalars. It is defined as

f = f(P)

a function that depends on P. Like a vector function, a scalar function defines a scalar field in a three-dimensional domain, or on a surface or a curve in space.

Example: Typical scalar fields

1. Temperature field of a body

2. Pressure field of the air in Earth's atmosphere



Notation Vector Function: Cartesian coordinates x, y, z

Instead of writing v(P), we can write

v(x, y, z) = [v1(x, y, z)  v2(x, y, z)  v3(x, y, z)]

where P = [x y z].

Notation Scalar Function: Cartesian coordinates x, y, z

Instead of writing f(P), we can write

f(P) = f(x, y, z)

where P = [x y z].

Caution: Vector Field Representation

The components depend on our choice of coordinate system, whereas a vector field that has a physical or geometric meaning should have magnitude and direction depending only on P, not on the choice of coordinate system.

Example: Scalar Function (Euclidean Distance in Space)

f(P) = f(x, y, z) = √((x − x0)² + (y − y0)² + (z − z0)²)

f(P) is a scalar function; f(P) defines a scalar field in space.

Example: Vector Field (Velocity Field)

At any instant, the velocity vectors v(P) of a rotating body B constitute a vector field, called the velocity field of the rotation.

v(x, y, z) = w × r = w × [x y z] = w × (xi + yj + zk)

For rotation about the z-axis, w = ωk. Then

v = | i j k; 0 0 ω; x y z | = ω[−y  x  0] = ω(−yi + xj)
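A quick check of this velocity field, as a sketch assuming NumPy (the value of ω and the sample point are arbitrary):

```python
import numpy as np

omega = 2.0
w = np.array([0.0, 0.0, omega])      # rotation vector along the z-axis

r = np.array([1.5, -0.5, 3.0])       # arbitrary point [x, y, z]
v = np.cross(w, r)                   # velocity field v = w x r

x, y, _ = r
print(v)                                                  # [-omega*y, omega*x, 0]
print(np.allclose(v, omega * np.array([-y, x, 0.0])))     # True
```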

Example: Vector Field (Field of Force, Gravitational Field)

According to Newton's law of gravitation: let a particle A of mass M be fixed at a point P0: (x0, y0, z0), and let a particle B of mass m be free to take up various positions P: (x, y, z) in space. Then A attracts B. The vector function that describes the gravitational force acting on B is

p = −c (x − x0)/r³ i − c (y − y0)/r³ j − c (z − z0)/r³ k

where r is the distance between P and P0.


(Figure: the gravitational field.)

5.2 Vector Calculus

First we will study basic concepts of

• convergence,

• continuity,

• and differentiability

of vector functions.

Definition: Convergence

An infinite sequence of vectors a(n), n = 1, 2, . . ., is said to converge if there is a vector a such that

lim_{n→∞} |a(n) − a| = 0

a is called the limit vector of that sequence, and we write

lim_{n→∞} a(n) = a

If the vectors are expressed in Cartesian coordinates, every component of the sequence must converge to the corresponding component of a.

Definition: Limit

A vector function v(t) of a real variable t is said to have the limit l as t approaches t0 if v(t) is defined in some neighborhood of t0 (possibly except at t0 itself) and

lim_{t→t0} |v(t) − l| = 0

Then we write

lim_{t→t0} v(t) = l

Definition: Neighborhood

A neighborhood of t0 is an interval (segment) on the t-axis containing t0 as an interior point (not as an endpoint).

Definition: Continuity

A vector function v(t) is said to be continuous at t = t0 if it is defined in some neighborhood of t0 (including at t0 itself!) and

lim_{t→t0} v(t) = v(t0)

In Cartesian coordinates,

v(t) = [v1(t)  v2(t)  v3(t)] = v1(t)i + v2(t)j + v3(t)k

If v1(t), v2(t), v3(t) are continuous at t0, then we can conclude that v(t) is continuous at t0.


Definition: Derivative of a Vector Function

A vector function v(t) is said to be differentiable at a point t if the following limit exists:

v′(t) = lim_{Δt→0} [v(t + Δt) − v(t)] / Δt

This vector v′(t) is called the derivative of v(t). In the Cartesian coordinate system,

v′(t) = [v1′(t)  v2′(t)  v3′(t)]

Hence the derivative v′(t) is obtained by differentiating each component separately.

Differentiation rules:

(cv)′ = cv′

(u + v)′ = u′ + v′

(u · v)′ = u′ · v + u · v′

(u × v)′ = u′ × v + u × v′

(u v w)′ = (u′ v w) + (u v′ w) + (u v w′)

5.3 Partial Derivatives of a Vector Function

Suppose

v = [v1  v2  v3] = v1i + v2j + v3k

where the components are differentiable functions of n variables t1, t2, . . . , tn. Then the partial derivative ∂v/∂tm is defined as the vector function

∂v/∂tm = (∂v1/∂tm) i + (∂v2/∂tm) j + (∂v3/∂tm) k

Second partial derivatives can be written as

∂²v/∂tl∂tm = (∂²v1/∂tl∂tm) i + (∂²v2/∂tl∂tm) j + (∂²v3/∂tl∂tm) k

5.4 Curves. Arc Length. Curvature. Torsion

The application of vector calculus to geometry is a field known as differential geometry. Bodies that move in space form paths that may be represented by curves C. This shows the need for parametric representations of C with parameter t, which may denote time or something else.


A typical parametric representation is given by

r(t) = [x(t)  y(t)  z(t)] = x(t)i + y(t)j + z(t)k

Here t is the parameter and x, y, z are the Cartesian coordinates. To each t = t0 there corresponds a point of C with position vector r(t0), whose coordinates are x(t0), y(t0), z(t0). The use of parametric representations has key advantages over other representations that involve projections into the xy-plane and xz-plane, or involve a pair of equations with y or with z as independent variable. The parametric representation induces an orientation on C. This means that as we increase t, we travel along the curve C in a certain direction. The sense of increasing t is called the positive sense on C; the sense of decreasing t is then called the negative sense on C. The following examples give parametric representations of several important curves.

Example: Circle. Parametric Representation. Positive Sense

The circle x² + y² = 4, z = 0 in the xy-plane with center 0 and radius 2 can be represented parametrically by

r(t) = [2 cos t  2 sin t  0]

or simply by

r(t) = [2 cos t  2 sin t]

where 0 ≤ t ≤ 2π. Indeed,

x² + y² = (2 cos t)² + (2 sin t)² = 4(cos²t + sin²t) = 4

For t = 0 we have r(0) = [2 0]; for t = π/2 we have r(π/2) = [0 2]. The positive sense induced by this representation is the counterclockwise sense. If we replace t with t* = −t, we have t = −t* and get

r*(t*) = [2 cos(−t*)  2 sin(−t*)] = [2 cos t*  −2 sin t*]

This has reversed the orientation, and the circle is now oriented clockwise.


Example: Ellipse

The vector function

r(t) = [a cos t  b sin t  0] = a cos t i + b sin t j

represents an ellipse in the xy-plane with center at the origin and principal axes in the directions of the x- and y-axes. In fact, since cos²t + sin²t = 1, we obtain

x²/a² + y²/b² = 1,  z = 0

If b = a, then it represents a circle of radius a.

Example: Straight Line

A straight line L through a point A with position vector a in the direction of a constant vector b can be represented parametrically in the form

r(t) = a + tb = [a1 + tb1  a2 + tb2  a3 + tb3]

If b is a unit vector, its components are the direction cosines of L. In this case, |t| measures the distance of the points of L from A. For instance, the straight line in the xy-plane through A: (3, 2) having slope 1 is

r(t) = [3 2 0] + t[1/√2  1/√2  0] = [3 + t/√2  2 + t/√2  0]

A plane curve is a curve that lies in a plane in space. A curve that is not plane is called a twisted curve.

Example: Circular Helix

The twisted curve C represented by the vector function

r(t) = [a cos t  a sin t  ct] = a cos t i + a sin t j + ct k

is called a circular helix. It lies on the cylinder x² + y² = a². If c > 0 the helix is shaped like a right-handed screw; if c < 0 it looks like a left-handed screw; if c = 0 it is a circle.


A simple curve is a curve without multiple points, that is, without points at which the curve intersects or touches itself. Circles and helices are simple curves. An arc of a curve is the portion between any two points of the curve.

Tangent to a Curve

The next idea is the approximation of a curve by straight lines, leading to tangents and to a definition of length. Tangents are straight lines touching a curve. The tangent to a simple curve C at a point P of C is the limiting position of a straight line L through P and a point Q of C as Q approaches P along C. If C is given by r(t), and P and Q correspond to t and t + Δt, then a vector in the direction of L is

(1/Δt)[r(t + Δt) − r(t)]

In the limit this vector becomes the derivative

r′(t) = lim_{Δt→0} (1/Δt)[r(t + Δt) − r(t)]

provided r(t) is differentiable. If r′(t) ≠ 0, we call r′(t) a tangent vector of C at P because it has the direction of the tangent. The corresponding unit vector is the unit tangent vector

u = (1/|r′|) r′

Note that both r′ and u point in the direction of increasing t. Hence their sense depends on the orientation of C; it is reversed if we reverse the orientation. It is now easy to see that the tangent to C at P is given by

q(w) = r + w r′

This is the sum of the position vector r of P and a multiple of the tangent vector r′ of C at P. Both vectors depend on P. The variable w is the parameter.


Example: Tangent to an Ellipse

Find the tangent to the ellipse

(1/4)x² + y² = 1

at P: (√2, 1/√2).

Solution: A parametric representation of the ellipse is

r(t) = [a cos t  b sin t  0] = a cos t i + b sin t j

since cos²t + sin²t = 1 gives x²/a² + y²/b² = 1, z = 0. Thus we can identify the constants of the ellipse as

a = 2,  b = 1

This gives

r(t) = [2 cos t  sin t]

The derivative is

r′(t) = [−2 sin t  cos t]

We must find the t that corresponds to P:

r(π/4) = [2 cos(π/4)  sin(π/4)] = [√2  1/√2]

Hence we conclude that t = π/4. We can compute

r′(π/4) = [−√2  1/√2]

Thus we get the answer

q(w) = [√2  1/√2] + w[−√2  1/√2] = [√2(1 − w)  (1/√2)(1 + w)]

Length of a Curve

The length l of a curve C will be the limit of the lengths of broken lines of n chords, with larger and larger n. Let r(t), a ≤ t ≤ b, represent C. For each n = 1, 2, . . . we subdivide (partition) the interval a ≤ t ≤ b by points

t0 (= a), t1, t2, . . . , tn−1, tn (= b),  where t0 < t1 < t2 < · · · < tn

This gives a broken line of chords with endpoints r(t0), . . . , r(tn). We do this arbitrarily, but so that the greatest |Δtm| = |tm − tm−1| approaches zero as n → ∞. The lengths l1, l2, . . . of these broken lines of chords can be obtained from the Pythagorean theorem. If r(t) has a continuous derivative, it can be shown that the sequence l1, l2, . . . has a limit, which is independent of the particular choice of the representation of C and of the choice of subdivisions. This limit is given by the integral

l = ∫_a^b √(r′ · r′) dt   (r′ = dr/dt)

l is called the length of C, and C is called rectifiable. The actual evaluation of the integral will, in general, be difficult. However, some simple cases are given in the problem set.
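In practice the integral is often evaluated numerically. A sketch assuming SciPy and NumPy, computing the length of one turn of the circular helix r(t) = [a cos t, a sin t, ct] (exact value 2π√(a² + c²), anticipating the helix example below):

```python
import numpy as np
from scipy.integrate import quad

a, c = 2.0, 0.5

def speed(t):
    # |r'(t)| = sqrt(r' . r') for r(t) = [a cos t, a sin t, c t]
    rp = np.array([-a*np.sin(t), a*np.cos(t), c])
    return np.sqrt(rp @ rp)

l, _ = quad(speed, 0.0, 2*np.pi)
print(l, 2*np.pi*np.sqrt(a**2 + c**2))   # the two values agree
```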


Arc Length s of a Curve

The length of a curve C is a constant, a positive number. But if we replace the fixed b with a variable t, the integral becomes a function of t, denoted by s(t) and called the arc length function or simply the arc length of C. Thus

s(t) = ∫_a^t √(r′ · r′) dt̃   (r′ = dr/dt̃)

where the variable of integration is denoted by t̃ because t is now used in the upper limit. Geometrically, s(t0) with some t0 is the length of the arc of C between the points with parametric values a and t0. The choice of a (the point s = 0) is arbitrary; changing a means changing s by a constant.

Linear Element ds.

If we differentiate the arc length function s(t) and square, we have

(ds/dt)² = (dr/dt) · (dr/dt) = |r′(t)|² = (dx/dt)² + (dy/dt)² + (dz/dt)²

We can write

dr = [dx dy dz] = dx i + dy j + dz k

Then

ds² = dx² + dy² + dz²

ds is called the linear element of C.

Arc Length as Parameter.

The use of s in

r(t) = [x(t)  y(t)  z(t)] = x(t)i + y(t)j + z(t)k

instead of an arbitrary t simplifies various formulas:

r(s) = [x(s)  y(s)  z(s)] = x(s)i + y(s)j + z(s)k

For the unit tangent vector

u(t) = (1/|r′(t)|) r′(t)

we simply obtain

u(s) = r′(s)

Indeed,

|r′(s)| = ds/ds = 1

shows that r′(s) is a unit vector.

Example: Circular Helix. Circle. Arc Length as Parameter

The helix

r(t) = [a cos t  a sin t  ct]

has the derivative

r′(t) = [−a sin t  a cos t  c]

Hence

r′ · r′ = a² + c²

which is a constant, denoted by K². Hence the integrand in

s(t) = ∫_0^t √(r′ · r′) dt̃

is constant and equal to K, and (taking the lower limit to be 0) the integral is

s = Kt,  thus  t = s/K

so that a representation of the helix with the arc length s as parameter is

r*(s) = r(s/K) = [a cos(s/K)  a sin(s/K)  cs/K],  K = √(a² + c²)

A circle is obtained if we set c = 0. Then K = a, t = s/a, and a representation with arc length s as parameter is

r*(s) = r(s/a) = [a cos(s/a)  a sin(s/a)  0]

Curves in Mechanics. Velocity. Acceleration

Curves play a basic role in mechanics, where they may serve as paths of moving bodies. Such a curve C should then be represented by a parametric representation r(t) with time t as parameter. The tangent vector of C is then called the velocity vector v because, being tangent, it points in the instantaneous direction of motion, and its length gives the speed

|v| = |r′| = √(r′ · r′) = ds/dt

(see the linear element formula above). The second derivative of r(t) is called the acceleration vector and is denoted by a. Its length |a| is called the acceleration of the motion. Thus

v(t) = r′(t),  a(t) = v′(t) = r″(t)

Tangential and Normal Acceleration.

Whereas the velocity vector is always tangent to the path of motion, the acceleration vector will generally have another direction. We can split the acceleration vector into two directional components, that is,

a = a_tan + a_norm

where the tangential acceleration vector a_tan is tangent to the path and the normal acceleration vector a_norm is normal (perpendicular) to the path. Expressions for the vectors are obtained from

v(t) = r′(t),  a(t) = v′(t) = r″(t)

by the chain rule:

v(t) = dr/dt = (dr/ds)(ds/dt) = u(s) (ds/dt)

where u(s) = r′(s) is the unit tangent vector. Another differentiation gives

a(t) = dv/dt = d/dt [u(s) (ds/dt)] = (du/ds)(ds/dt)² + u(s)(d²s/dt²)

Since the tangent vector u(s) has constant length (length one), its derivative du/ds is perpendicular to u(s). Hence the first term on the right side is the normal acceleration vector, and the second term on the right side is the tangential acceleration vector. Now the length |a_tan| is the absolute value of the projection of a in the direction of v, that is,

|a_tan| = |a · v| / |v|

Hence a_tan is this expression times the unit vector (1/|v|)v in the direction of v, that is,

a_tan = ((a · v)/(v · v)) v

Also,

a_norm = a − a_tan
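A numerical sketch of this decomposition, assuming NumPy (the path r(t) = [t, t², t³] and the sample time are arbitrary choices):

```python
import numpy as np

t = 1.3                                   # arbitrary sample time
v = np.array([1.0, 2*t, 3*t**2])          # r'(t) for r(t) = [t, t^2, t^3]
a = np.array([0.0, 2.0, 6*t])             # r''(t)

a_tan  = (a @ v) / (v @ v) * v            # tangential component
a_norm = a - a_tan                        # normal component

print(np.isclose(a_norm @ v, 0.0))        # True: a_norm is perpendicular to v
print(np.allclose(a_tan + a_norm, a))     # True: the split reconstructs a
```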

Example: Centripetal Acceleration. Centrifugal Force

The vector function

r(t) = [R cos ωt  R sin ωt] = R cos ωt i + R sin ωt j

(with fixed i and j) represents a circle C of radius R with center at the origin of the xy-plane and describes the motion of a small body B counterclockwise around the circle. Differentiation gives the velocity vector

v = r′ = [−Rω sin ωt  Rω cos ωt] = −Rω sin ωt i + Rω cos ωt j

v is tangent to C. Its magnitude, the speed, is

|v| = |r′| = √(r′ · r′) = Rω

Hence it is constant. The speed divided by the distance R from the center is called the angular speed. It equals ω, so it is constant too. Differentiating the velocity vector, we obtain the acceleration vector

a = v′ = [−Rω² cos ωt  −Rω² sin ωt] = −Rω² cos ωt i − Rω² sin ωt j

This shows that a = −ω²r, so that there is an acceleration toward the center, called the centripetal acceleration of the motion. It occurs because the velocity vector is changing direction at a constant rate. Its magnitude is constant: |a| = ω²|r| = ω²R. Multiplying a by the mass m of B, we get the centripetal force ma. The opposite vector −ma is called the centrifugal force. At each instant these two forces are in equilibrium. We see that in this motion the acceleration vector is normal (perpendicular) to C; hence there is no tangential acceleration.

Example: Superposition of Rotations. Coriolis Acceleration

A projectile is moving with constant speed along a meridian of the rotating earth. Find its acceleration.

Solution: Let x, y, z be a fixed Cartesian coordinate system in space, with unit vectors i, j, k in the directions of the axes. Let the Earth, together with a unit vector b, be rotating about the z-axis with angular speed ω > 0. Since b is rotating together with the Earth, it is of the form

b(t) = cos ωt i + sin ωt j

Let the projectile be moving on the meridian whose plane is spanned by b and k, with constant angular speed γ > 0. Then its position vector in terms of b and k is

r(t) = R cos γt b(t) + R sin γt k   (R = radius of the Earth)

Next we apply vector calculus to obtain the desired acceleration of the projectile. Our result will be unexpected and highly relevant for air and space travel. The first and second derivatives of b with respect to t are

b′(t) = −ω sin ωt i + ω cos ωt j

b″(t) = −ω² cos ωt i − ω² sin ωt j = −ω²b(t)

The first and second derivatives of r(t) with respect to t are

v = r′(t) = R cos γt b′ − γR sin γt b + γR cos γt k

a = v′ = R cos γt b″ − 2γR sin γt b′ − γ²R cos γt b − γ²R sin γt k
       = R cos γt b″ − 2γR sin γt b′ − γ²r

Since b″ = −ω²b, we conclude that the first term in a (involving ω) is the centripetal acceleration due to the rotation of the Earth. Similarly, the third term in the last line (involving γ!) is the centripetal acceleration due to the motion of the projectile on the meridian M of the rotating Earth. The second, unexpected term,

a_cor = −2γR sin γt b′

is called the Coriolis acceleration and is due to the interaction of the two rotations. On the Northern Hemisphere, sin γt > 0 (for t > 0; also γ > 0 by assumption), so that a_cor has the direction of −b′, that is, opposite to the rotation of the Earth. |a_cor| is maximum at the North Pole and zero at the equator. The projectile B of mass m0 experiences a force −m0 a_cor opposite to m0 a_cor, which tends to let B deviate from M to the right (and in the Southern Hemisphere, where sin γt < 0, to the left). This deviation has been observed for missiles, rockets, shells, and atmospheric airflow.

Curvature and Torsion.

The curvature κ(s) of a curve C: r(s) (s the arc length) at a point P of C measures the rate of change |u′(s)| of the unit tangent vector u(s) at P. Hence κ(s) measures the deviation of C at P from a straight line (its tangent at P). Since u(s) = r′(s), the definition is

κ(s) = |u′(s)| = |r″(s)|   (′ = d/ds)

The torsion τ(s) of C at P measures the rate of change of the osculating plane O of curve C at point P. Note that this plane is spanned by u and u′. Hence τ(s) measures the deviation of C at P from a plane (from O at P). Now the rate of change is also measured by the derivative b′ of a normal vector b of O. By the definition of vector product, a unit normal vector of O is b = u × (1/κ)u′ = u × p. Here p = (1/κ)u′ is called the unit principal normal vector and b is called the unit binormal vector of C at P. The vectors are labeled in the figure. Here we must assume that κ ≠ 0; hence κ > 0. The absolute value of the torsion is now defined by

|τ(s)| = |b′(s)|

Whereas κ(s) is nonnegative, it is practical to give the torsion a sign, motivated by "right-handed" and "left-handed". Since b is a unit vector, it has constant length; hence b′ is perpendicular to b. Now b′ is also perpendicular to u because, by the definition of vector product, we have b · u = 0, b · u′ = 0. This implies

(b · u)′ = 0,  that is,  b′ · u + b · u′ = b′ · u + 0 = 0

Hence if b′ ≠ 0 at P, it must have the direction of p or −p, so that it must be of the form b′ = −τp. Taking the dot product of this by p and using p · p = 1 gives

τ(s) = −p(s) · b′(s)


The minus sign is chosen to make the torsion of a right-handed helix positive and that of a left-handed helix negative. The orthonormal vector triple u, p, b is called the trihedron of C. The figure also shows the names of the three straight lines in the directions of u, p, b, which are the intersections of the osculating plane, the normal plane, and the rectifying plane.
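For a curve given with an arbitrary parameter t (not arc length), curvature and torsion are commonly computed from the standard formulas κ = |r′ × r″| / |r′|³ and τ = (r′ × r″) · r‴ / |r′ × r″|², which are equivalent to the arc-length definitions above. A sketch assuming NumPy, checked on the circular helix, for which κ = a/(a² + c²) and τ = c/(a² + c²):

```python
import numpy as np

a, c, t = 2.0, 0.5, 0.7                            # helix parameters, sample point

r1 = np.array([-a*np.sin(t),  a*np.cos(t), c])     # r'
r2 = np.array([-a*np.cos(t), -a*np.sin(t), 0.0])   # r''
r3 = np.array([ a*np.sin(t), -a*np.cos(t), 0.0])   # r'''

cr = np.cross(r1, r2)
kappa = np.linalg.norm(cr) / np.linalg.norm(r1)**3
tau   = (cr @ r3) / (cr @ cr)

print(np.isclose(kappa, a/(a**2 + c**2)))   # True
print(np.isclose(tau,   c/(a**2 + c**2)))   # True
```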

5.5 Calculus Review: Functions of Several Variables

Chain Rules

Figure shows the notations in the following basic theorem.

In calculus, x, y, z are often called the intermediate variables, in contrast with the independent variables u, v and the dependentvariable w.

Special Cases of Practical Interest

If w = f(x, y) and x = x(u, v), y = y(u, v), then

∂w/∂u = (∂w/∂x)(∂x/∂u) + (∂w/∂y)(∂y/∂u),  ∂w/∂v = (∂w/∂x)(∂x/∂v) + (∂w/∂y)(∂y/∂v)

If w = f(x, y, z) and x = x(t), y = y(t), z = z(t), then

dw/dt = (∂w/∂x)(dx/dt) + (∂w/∂y)(dy/dt) + (∂w/∂z)(dz/dt)

If w = f(x, y) and x = x(t), y = y(t), then

dw/dt = (∂w/∂x)(dx/dt) + (∂w/∂y)(dy/dt)

If w = f(x) and x = x(t), then

dw/dt = (dw/dx)(dx/dt)

Partial Derivatives on a Surface z = g(x, y)

Let w = f(x, y, z) and let z = g(x, y) represent a surface S in space. Then on S the function becomes

w(x, y) = f(x, y, g(x, y))

Hence the partial derivatives are

∂w/∂x = ∂f/∂x + (∂f/∂z)(∂g/∂x),  ∂w/∂y = ∂f/∂y + (∂f/∂z)(∂g/∂y)   [z = g(x, y)]

Mean Value Theorem

Let f(x, y, z) be continuous and have continuous first partial derivatives in a domain D, and let P0: (x0, y0, z0) and P: (x0 + h, y0 + k, z0 + l) be points in D such that the straight-line segment P0P lies in D. Then

f(x0 + h, y0 + k, z0 + l) − f(x0, y0, z0) = h ∂f/∂x + k ∂f/∂y + l ∂f/∂z

the partial derivatives being evaluated at a suitable point of that segment.

Special Cases

For a function f(x, y) of two variables,

f(x0 + h, y0 + k) − f(x0, y0) = h ∂f/∂x + k ∂f/∂y

and, for a function f(x) of a single variable,

f(x0 + h) − f(x0) = h df/dx

where the domain D is a segment of the x-axis and the derivative is taken at a suitable point between x0 and x0 + h.


5.6 Gradient of a Scalar Field. Directional Derivative

Some of the vector fields that occur in applications (not all of them!) can be obtained from scalar fields. It is the "gradient" that allows us to obtain vector fields from scalar fields.

Definition: Gradient

Notation: ∇

The gradient of a scalar function f(x, y, z) is denoted by grad f or ∇f and defined as

grad f = ∇f = [∂f/∂x  ∂f/∂y  ∂f/∂z] = (∂f/∂x)i + (∂f/∂y)j + (∂f/∂z)k

The differential operator ∇ is defined by

∇ = (∂/∂x)i + (∂/∂y)j + (∂/∂z)k

Use of Gradients:

Gradients are useful in several ways, notably in giving the rate of change of f in any direction in space, in obtaining surface normal vectors, and in deriving vector fields from scalar fields.

Directional Derivative

From calculus we know that the partial derivatives give the rates of change of f(x, y, z) in the directions of the three coordinate axes. It seems natural to extend this and ask for the rate of change of f in an arbitrary direction in space: the directional derivative D_b f = df/ds of f at a point P in the direction of a vector b is defined as the limit of (f(Q) − f(P))/s as Q approaches P along the straight line L through P in the direction of b, where s is the (signed) distance between P and Q.


The next idea is to use Cartesian xyz-coordinates and to take b as a unit vector. Then the line L is given by

r(s) = x(s)i + y(s)j + z(s)k = p0 + sb   (|b| = 1)

where p0 is the position vector of P. D_b f = df/ds is the derivative of the function f with respect to the arc length s of L. Hence, assuming that f has continuous partial derivatives and applying the chain rule,

D_b f = df/ds = (∂f/∂x)x′ + (∂f/∂y)y′ + (∂f/∂z)z′

where primes denote derivatives with respect to s (which are taken at s = 0). But differentiating r(s) gives

r′(s) = x′i + y′j + z′k = b

Hence D_b f is simply the inner product of grad f and b; that is,

D_b f = df/ds = b · grad f

If the direction is given by a vector a of any length (≠ 0), then

D_a f = df/ds = (1/|a|) a · grad f

Example: Gradient. Directional Derivative
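The worked example itself is lost to the extraction; as an illustrative stand-in, here is a sketch assuming NumPy, with an arbitrary scalar field f and direction a (both hypothetical choices), checked against a finite difference:

```python
import numpy as np

def f(p):
    x, y, z = p
    return 2*x**2 + 3*y*z                   # arbitrary scalar field

def grad_f(p):
    x, y, z = p
    return np.array([4*x, 3*z, 3*y])        # grad f, computed by hand

P = np.array([1.0, 2.0, 0.5])
a = np.array([1.0, -2.0, 2.0])              # direction vector, any length

D = (a / np.linalg.norm(a)) @ grad_f(P)     # D_a f = (1/|a|) a . grad f

h = 1e-6                                    # finite-difference check along a
D_num = (f(P + h*a/np.linalg.norm(a)) - f(P)) / h
print(D, D_num)                             # the two values agree closely
```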

Gradient Is a Vector. Maximum Increase

grad f points in the direction of maximum increase of f.

Theorem: Use of Gradient: Direction of Maximum Increase

Let f(P) = f(x, y, z) be a scalar function having continuous first partial derivatives in some domain B in space. Then grad f exists in B and is a vector; that is, its length and direction are independent of the particular choice of Cartesian coordinates. If grad f(P) ≠ 0 at some point P, it has the direction of maximum increase of f at P.


Gradient as Surface Normal Vector

Gradients have an important application in connection with surfaces, namely, as surface normal vectors, as follows. Let S be a surface represented by f(x, y, z) = c, where f is differentiable. Such a surface is called a level surface of f, and for different c we get different level surfaces. Now let C be a curve on S through a point P of S. As a curve in space, C has a representation r(t) = [x(t)  y(t)  z(t)]. For C to lie on the surface S, the components of r(t) must satisfy f(x, y, z) = c, that is,

f(x(t), y(t), z(t)) = c

Now a tangent vector of C is r′(t) = [x′(t)  y′(t)  z′(t)], and the tangent vectors of all curves on S passing through P will generally form a plane, called the tangent plane of S at P. The normal of this plane (the straight line through P perpendicular to the tangent plane) is called the surface normal to S at P. A vector in the direction of the surface normal is called a surface normal vector of S at P. We can obtain such a vector quite simply by differentiating

f(x(t), y(t), z(t)) = c

with respect to t. By the chain rule,

(∂f/∂x)x′ + (∂f/∂y)y′ + (∂f/∂z)z′ = (grad f) · r′ = 0

Hence grad f is orthogonal to all the vectors r′ in the tangent plane, so that it is a normal vector of S at P.

Theorem: Gradient as Surface Normal Vector

Example: Gradient as Surface Normal Vector. Cone
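The cone example is lost to the extraction; as an illustrative stand-in, a sketch assuming NumPy, using the hypothetical level surface f = x² + y² − z² = 0 (a cone), where the gradient gives a surface normal directly:

```python
import numpy as np

def grad_f(p):
    x, y, z = p
    return np.array([2*x, 2*y, -2*z])   # grad of f = x^2 + y^2 - z^2

P = np.array([3.0, 4.0, 5.0])            # point on the cone: 9 + 16 - 25 = 0
N = grad_f(P)                            # surface normal vector at P
n = N / np.linalg.norm(N)                # unit surface normal vector

# Any tangent direction r' at P satisfies (grad f) . r' = 0; for example,
rp = np.array([-4.0, 3.0, 0.0])          # tangent to the horizontal circle
print(N @ rp)                            # 0.0
print(n)
```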


5.7 Vector Fields That Are Gradients of Scalar Fields ("Potentials")

Some vector fields have the advantage that they can be obtained from scalar fields. Such a vector field is given by a vector function v(P), which is obtained as the gradient of a scalar function, say, v(P) = grad f(P). The function f(P) is called a potential function or a potential of v(P). Such a v(P) and the corresponding vector field are called conservative because in such a vector field energy is conserved; that is, no energy is lost (or gained) in displacing a body from a point P to another point in the field and back to P.

5.8 Divergence of a Vector Field

From a scalar field we can obtain a vector field by the gradient. Conversely, from a vector field we can obtain a scalar field by the divergence, or another vector field by the curl. Let v(x, y, z) be a differentiable vector function, where x, y, z are Cartesian coordinates, and let v1, v2, v3 be the components of v. Then the function

div v = ∂v1/∂x + ∂v2/∂y + ∂v3/∂z

is called the divergence of v or the divergence of the vector field defined by v. Another common notation for the divergence is

div v = ∇ · v

with the understanding that the "product" (∂/∂x)v1 in the dot product means the partial derivative ∂v1/∂x. Note that ∇ · v means the scalar div v, whereas ∇f means the vector grad f.

Theorem: Invariance of the Divergence

Let f(x, y, z) be a twice differentiable scalar function. Then its gradient exists,

v = grad f = [∂f/∂x  ∂f/∂y  ∂f/∂z]

and we can then form the divergence,

div v = div(grad f) = ∂²f/∂x² + ∂²f/∂y² + ∂²f/∂z²

Hence we have the basic result that the divergence of the gradient is the Laplacian:

div(grad f) = ∇²f
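A symbolic check of this identity, as a sketch assuming SymPy is available (the scalar field f is an arbitrary choice):

```python
import sympy as sp

x, y, z = sp.symbols('x y z')
f = x**2*y + sp.sin(y*z)                     # arbitrary scalar field

grad_f = [sp.diff(f, v) for v in (x, y, z)]  # grad f
div_grad_f = sum(sp.diff(g, v) for g, v in zip(grad_f, (x, y, z)))

laplacian = sp.diff(f, x, 2) + sp.diff(f, y, 2) + sp.diff(f, z, 2)
print(sp.simplify(div_grad_f - laplacian))   # 0
```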


5.9 Curl of a Vector Field

Let

v(x, y, z) = [v1  v2  v3] = v1i + v2j + v3k

be a differentiable vector function of the Cartesian coordinates x, y, z. Then the curl of the vector function v, or of the vector field given by v, is defined by the "symbolic" determinant

curl v = ∇ × v = | i  j  k;  ∂/∂x  ∂/∂y  ∂/∂z;  v1  v2  v3 |
       = (∂v3/∂y − ∂v2/∂z)i + (∂v1/∂z − ∂v3/∂x)j + (∂v2/∂x − ∂v1/∂y)k

Example: Curl of a Vector Function
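The worked example is lost to the extraction; as an illustrative stand-in, a sketch assuming SymPy's vector module, with an arbitrary field v:

```python
from sympy.vector import CoordSys3D, curl

C = CoordSys3D('C')                          # Cartesian frame with C.x, C.y, C.z
v = C.y*C.z*C.i + 3*C.z*C.x*C.j + C.z*C.k    # arbitrary vector field

print(curl(v))
# By the determinant formula:
#   i-component: d(z)/dy - d(3zx)/dz = -3x
#   j-component: d(yz)/dz - d(z)/dx  =  y
#   k-component: d(3zx)/dx - d(yz)/dy = 2z
# so curl v = -3x i + y j + 2z k
```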

Example: Rotation of a Rigid Body. Relation to the Curl

Theorem: Rotating Body and Curl

Theorem: Grad, Div, Curl


Theorem: Invariance of Curl


Chapter 6

Vector Integral Calculus. Integral Theorems

Goal of this chapter

• Line Integral

• Surface Integral

• Volume Integrals

Vector integral calculus extends integrals as known from regular calculus to

• integrals over curves
  – called line integrals
• integrals over surfaces
  – called surface integrals
• and integrals over solids
  – called triple integrals

We can transform these different integrals into one another. We will learn

• Green's theorem
• Gauss's divergence theorem
• Stokes's theorem

Green's theorem in the plane allows you

• to transform line integrals into double integrals,
• or conversely,
• double integrals into line integrals

Gauss's divergence theorem

• converts surface integrals into triple integrals, and vice versa

Stokes's theorem deals with

• converting line integrals into surface integrals, and vice versa



6.1 Line Integrals

The concept of a line integral is a simple and natural generalization of a definite integral

∫_a^b f(x) dx

in which we integrate the function f(x), also known as the integrand,

• from x = a

• along the x-axis to

• x = b.

Now, in a line integral,

• we shall integrate a given function
  – called the integrand,
• along a curve C
  – in space
  – or in the plane

we represent the curve C

• by a parametric representation

The curve C is called

• the path of integration.

The path of integration goes from A to B.

• A: is its initial point

• and B: is its terminal point.

• C is now oriented.

• The direction from A to B, in which t increases is called the positive direction on C.

Definition and Evaluation of Line Integrals

A line integral of a vector function F(r) over a curve C: r(t), a ≤ t ≤ b, is defined by

∫_C F(r) · dr = ∫_a^b F(r(t)) · r′(t) dt   (r′ = dr/dt)

where r(t) is a parametric representation of C.


If we write dr in terms of components,

dr = [dx dy dz]  and  ′ = d/dt

we get

∫_C F · dr = ∫_a^b (F1 x′ + F2 y′ + F3 z′) dt

For instance, an integral of the form

∫_{t1=0}^{t2=π/2} cos²t sin t dt

can be evaluated by the substitution

u = cos t,  du = −sin t dt

u1 = cos t1 = cos 0 = 1,  u2 = cos t2 = cos(π/2) = 0

so that

∫_{t1=0}^{t2=π/2} cos²t sin t dt = ∫_{u1=1}^{u2=0} u²(−du) = ∫_0^1 u² du = 1/3


Integration by parts:

∫ u dv = uv − ∫ v du

In order to integrate

−3 ∫_0^{2π} t sin t dt

set

u = t,  dv = sin t dt

du = dt,  v = −cos t

Then

−3 ∫_0^{2π} t sin t dt = −3 ( [−t cos t]_0^{2π} + ∫_0^{2π} cos t dt ) = −3(−2π + 0) = 6π

For the second integral, use

cos²t = (1/2)(1 − cos 2t)

The third integral can be evaluated easily.

Simple general properties of the line integral

If the sense of integration along C is reversed, the value of the integral is multiplied by -1.


6.2 Path Independence of Line Integrals

We want to find out under what conditions, in some domain,

• a line integral takes on the same value

• no matter what path of integration is taken (in that domain).

As before we consider line integrals

The line integral is said to be path independent in a domain D in space

• if for every pair of endpoints

• A, B in domain D,

• it has the same value for all paths in D

• that begin at A and end at B.

We shall see that path independence of the line integral in a domain D holds if and only if:

• Theorem I: F = grad f

• Theorem II: integration around closed curves C in D always gives 0.

• Theorem III: curl F = 0, provided D is simply connected.



Path Independence and Integration Around Closed Curves

Path Independence and Exactness of Differential Forms

A third idea relates path independence to the exactness of the differential form

F · dr = F1 dx + F2 dy + F3 dz

This form is called exact in a domain D in space if it is the differential

df = (∂f/∂x)dx + (∂f/∂y)dy + (∂f/∂z)dz

of a differentiable function f(x, y, z) everywhere in D, that is, if we have F · dr = df. Comparing these two formulas, we see that the form is exact if and only if there is a differentiable function f(x, y, z) in D such that, everywhere in D,

F = grad f,  that is,  F1 = ∂f/∂x,  F2 = ∂f/∂y,  F3 = ∂f/∂z


6.3 Calculus Review: Double Integrals.

Properties of Double Integral

Mean Value Theorem for Double Integral

Evaluation of Double Integrals by Two Successive Integrations

Double integrals over a region R may be evaluated by two successive integrations. We may integrate first over y and then over x. Then the formula is

∫∫_R f(x, y) dx dy = ∫_a^b [ ∫_{g(x)}^{h(x)} f(x, y) dy ] dx

Here y = g(x) and y = h(x) represent the boundary curve of R; keeping x constant, we integrate f(x, y) over y from g(x) to h(x). The result is a function of x, and we integrate it from x = a to x = b. Similarly,

∫∫_R f(x, y) dx dy = ∫_c^d [ ∫_{p(y)}^{q(y)} f(x, y) dx ] dy

The boundary curve of R is now represented by x = p(y) and x = q(y). Treating y as a constant, we first integrate f(x, y) over x from p(y) to q(y) and then integrate the resulting function of y from y = c to y = d.


6.3.1 Applications of Double Integrals

The area A of a region R in the xy-plane is given by the double integral

A = ∫∫_R dx dy

The volume V beneath the surface z = f(x, y) and above a region R in the xy-plane is

V = ∫∫_R f(x, y) dx dy

Let f(x, y) be the density (mass per unit area) of a distribution of mass in the xy-plane. Then the total mass M in R is

M = ∫∫_R f(x, y) dx dy

the center of gravity of the mass in R has the coordinates x̄, ȳ, where

x̄ = (1/M) ∫∫_R x f(x, y) dx dy,  ȳ = (1/M) ∫∫_R y f(x, y) dx dy

the moments of inertia Ix and Iy of the mass in R about the x- and y-axes, respectively, are

Ix = ∫∫_R y² f(x, y) dx dy,  Iy = ∫∫_R x² f(x, y) dx dy

and the polar moment of inertia I0 about the origin of the mass in R is

I0 = Ix + Iy = ∫∫_R (x² + y²) f(x, y) dx dy
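These formulas translate directly into nested numerical integration. A sketch assuming SciPy's dblquad (the region, a unit square, and the density are arbitrary choices):

```python
from scipy.integrate import dblquad

f = lambda y, x: 1.0 + x            # density; dblquad integrates f(y, x)

# Region R: the unit square 0 <= x <= 1, 0 <= y <= 1.
M,  _ = dblquad(f, 0, 1, 0, 1)                          # total mass
Mx, _ = dblquad(lambda y, x: x*f(y, x), 0, 1, 0, 1)
My, _ = dblquad(lambda y, x: y*f(y, x), 0, 1, 0, 1)

x_bar, y_bar = Mx/M, My/M
print(M, x_bar, y_bar)              # 1.5, 5/9 = 0.555..., 0.5
```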


6.3.2 Change of Variables in Double Integrals. Jacobian

Recall from calculus that for a definite integral the formula for the change of variable from x to u is

∫_a^b f(x) dx = ∫_α^β f(x(u)) (dx/du) du

such that

x(α) = a,  x(β) = b

The formula for a change of variables in double integrals from x, y to u, v is

∫∫_R f(x, y) dx dy = ∫∫_{R*} f(x(u, v), y(u, v)) |∂(x, y)/∂(u, v)| du dv

that is, the integrand is expressed in terms of u and v, and dx dy is replaced by du dv times the absolute value of the Jacobian

J = ∂(x, y)/∂(u, v) = | ∂x/∂u  ∂x/∂v;  ∂y/∂u  ∂y/∂v | = (∂x/∂u)(∂y/∂v) − (∂x/∂v)(∂y/∂u)

Example: Change of Variables in a Double Integral

Polar coordinates r and θ can be introduced by setting

x = r cos θ,  y = r sin θ

Then

J = ∂(x, y)/∂(r, θ) = | cos θ  −r sin θ;  sin θ  r cos θ | = r

and

∫∫_R f(x, y) dx dy = ∫∫_{R*} f(r cos θ, r sin θ) r dr dθ

where R* is the region in the rθ-plane corresponding to R in the xy-plane.

Example: Double Integrals in Polar Coordinates. Center of Gravity. Moments of Inertia
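The example body is lost to the extraction; as an illustrative stand-in, a sketch assuming SciPy: the polar moment of inertia of a uniform disk of radius R with density 1 is I0 = ∫∫ (x² + y²) dx dy = ∫_0^{2π} ∫_0^R r² · r dr dθ = πR⁴/2, where the extra factor r is the Jacobian.

```python
import numpy as np
from scipy.integrate import dblquad

R = 2.0

# Integrand (x^2 + y^2) = r^2 in polar coordinates, times the Jacobian r;
# inner variable r in [0, R], outer variable theta in [0, 2*pi].
I0, _ = dblquad(lambda r, th: r**2 * r, 0, 2*np.pi, 0, R)

print(I0, np.pi * R**4 / 2)          # the two values agree
```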

6.4 Green's Theorem in the Plane

6.4.1 Introduction

Double integrals over a plane region may be transformed into line integrals over the boundary of the region and conversely.

Theorem: Green's Theorem in the Plane

Let R be a closed bounded region in the xy-plane whose boundary C consists of finitely many smooth curves. Let F1(x, y) and F2(x, y) be functions that are continuous and have continuous partial derivatives ∂F1/∂y and ∂F2/∂x everywhere in some domain containing R. Then

∫∫_R (∂F2/∂x − ∂F1/∂y) dx dy = ∮_C (F1 dx + F2 dy)

the line integral being taken along the entire boundary C of R such that R is on the left as we advance in the direction of integration.


Setting F = [F1  F2] = F1i + F2j, we obtain the vectorial form

∫∫_R (curl F) · k dx dy = ∮_C F · dr

Example: Verification of Green's Theorem in the Plane


6.4.2 Some Applications of Green's Theorem

Example: Area of a Plane Region as a Line Integral Over the Boundary

Example: Area of a Plane Region in Polar Coordinates

6.5 Surfaces for Surface Integrals

• With line integrals, we integrate over curves in space

• with surface integrals we integrate over surfaces in space.

• Each curve in space is represented by a parametric equation

• This suggests that we should also �nd parametric representations for the surfaces in space.


6.5.1 Representation of Surfaces

Representations of a surface S in xyz-space are

z = f(x, y)  or  g(x, y, z) = 0

For example,

z = +√(a² − x² − y²)  or  x² + y² + z² − a² = 0   (z ≥ 0)

represents a hemisphere of radius a and center 0. For surfaces S in surface integrals, it will often be more practical to use a parametric representation. Surfaces are two-dimensional; hence we need two parameters, which we call u and v. Thus a parametric representation of a surface S in space is of the form

r(u, v) = [x(u, v)  y(u, v)  z(u, v)] = x(u, v)i + y(u, v)j + z(u, v)k

where (u, v) varies in some region R of the uv-plane. This mapping maps every point (u, v) in R onto the point of S with position vector r(u, v).

Example: Parametric Representation of a Cylinder


Example: Parametric Representation of a Sphere

Example: Parametric Representation of a Cone

6.5.2 Tangent Plane and Surface Normal

Recall that the tangent vectors of all the curves on a surface S through a point P of S form a plane, called the tangent plane of S at P. Exceptions are points where S has an edge or a cusp (like a cone), so that S cannot have a tangent plane at such a point. Furthermore, a vector perpendicular to the tangent plane is called a normal vector of S at P. The partial derivatives ru and rv at P are tangential to S at P. Hence their cross product gives a normal vector N of S at P:

N = ru × rv ≠ 0

The corresponding unit normal vector n of S at P is

n = (1/|N|) N = (1/|ru × rv|) ru × rv

Also, if S is represented by g(x, y, z) = 0, then

n = (1/|grad g|) grad g

A surface S is called a smooth surface if its surface normal depends continuously on the points of S. S is called piecewise smooth if it consists of finitely many smooth portions.


Example: Unit Normal Vector of a Sphere / Unit Normal Vector of a Cone
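The example body is lost to the extraction; as an illustrative stand-in for the sphere case, a sketch assuming NumPy: for the sphere of radius a parametrized by r(u, v) = [a cos v cos u, a cos v sin u, a sin v], the normal N = ru × rv turns out to be radial, so the unit normal equals r/a.

```python
import numpy as np

a, u, v = 2.0, 0.8, 0.3                  # radius and arbitrary parameters

r  = np.array([a*np.cos(v)*np.cos(u), a*np.cos(v)*np.sin(u), a*np.sin(v)])
ru = np.array([-a*np.cos(v)*np.sin(u), a*np.cos(v)*np.cos(u), 0.0])
rv = np.array([-a*np.sin(v)*np.cos(u), -a*np.sin(v)*np.sin(u), a*np.cos(v)])

N = np.cross(ru, rv)                     # normal vector N = ru x rv
n = N / np.linalg.norm(N)                # unit normal vector

print(np.allclose(n, r / a))             # True: n is radial, as expected
```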

6.6 Surface Integrals

To define a surface integral, we take a surface S, given by a parametric representation as just discussed,

r(u, v) = [x(u, v)  y(u, v)  z(u, v)] = x(u, v)i + y(u, v)j + z(u, v)k

where (u, v) varies over a region R in the uv-plane. S has a normal vector

N = ru × rv,  n = (1/|N|) N

For a given vector function F we can now define the surface integral over S by

∫∫_S F · n dA = ∫∫_R F(r(u, v)) · N(u, v) du dv

Here N = |N| n, and |N| = |ru × rv| is the area of the parallelogram with sides ru and rv, by the definition of cross product. Hence

n dA = n |N| du dv = N du dv

And we see that dA = |N| du dv is the element of area of S. Also, F · n is the normal component of F. We can write this in components, using

F = [F1  F2  F3],  N = [N1  N2  N3],  n = [cos α  cos β  cos γ]

Here α, β, γ are the angles between n and the coordinate axes. We can write

cos α dA = dy dz,  cos β dA = dz dx,  cos γ dA = dx dy

We can use these formulas to evaluate surface integrals by converting them to double integrals over regions in the coordinate planes of the xyz-coordinate system.

Example: Flux Through a Surface
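The example body is lost to the extraction; as an illustrative stand-in, a sketch assuming SciPy and NumPy: the flux of the hypothetical field F = [0, 0, z] through the upper hemisphere of radius a, computed as ∫∫ F(r(u, v)) · N(u, v) du dv with the sphere parametrization from the previous example (exact value 2πa³/3):

```python
import numpy as np
from scipy.integrate import dblquad

a = 2.0

def integrand(v, u):
    # Sphere patch r(u, v); F = [0, 0, z] evaluated on the surface.
    z  = a*np.sin(v)
    ru = np.array([-a*np.cos(v)*np.sin(u), a*np.cos(v)*np.cos(u), 0.0])
    rv = np.array([-a*np.sin(v)*np.cos(u), -a*np.sin(v)*np.sin(u), a*np.cos(v)])
    N  = np.cross(ru, rv)               # normal vector N = ru x rv
    F  = np.array([0.0, 0.0, z])
    return F @ N                        # F(r(u, v)) . N(u, v)

flux, _ = dblquad(integrand, 0, 2*np.pi, 0, np.pi/2)
print(flux, 2*np.pi*a**3/3)             # the two values agree
```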


Example: Surface Integral


6.7 Triple Integrals. Divergence Theorem of Gauss

6.7.1 Introduction

The divergence theorem transforms surface integrals into triple integrals. A triple integral is an integral of a function taken over a closed, bounded, three-dimensional region T in space. We subdivide T by planes parallel to the coordinate planes. Then we consider those boxes of the subdivision that lie entirely inside T and number them from 1 to n; each box is a rectangular parallelepiped. In each such box we choose an arbitrary point, say (xk, yk, zk) in box k. The volume of box k we denote by ΔVk. We now form the sum

Jn = Σ_{k=1}^{n} f(xk, yk, zk) ΔVk

We do this for larger and larger positive integers n, arbitrarily but so that the maximum length of all the edges of those n boxes approaches zero as n approaches infinity. Then it can be shown that the sequence J1, J2, . . . converges to a limit. This limit is called the triple integral of f(x, y, z) over the region T and is denoted by

∫∫∫_T f(x, y, z) dx dy dz

Triple integrals can be evaluated by three successive integrations. This is similar to the evaluation of double integrals by two successive integrations.

6.7.2 Divergence Theorem of Gauss

Triple integrals can be transformed into surface integrals over the boundary surface of a region in space, and conversely.


Such a transformation is of practical interest because one of the two kinds of integral is often simpler than the other. The transformation is done by the divergence theorem, which involves the divergence of a vector function

F = [F1  F2  F3] = F1i + F2j + F3k

Theorem: Divergence Theorem of Gauss (Transformation Between Triple and Surface Integrals)

Let T be a closed bounded region in space whose boundary is a piecewise smooth orientable surface S. Let F(x, y, z) be a vector function that is continuous and has continuous first partial derivatives in some domain containing T. Then

∫∫∫_T div F dV = ∫∫_S F · n dA

where n is the outer unit normal vector of S.
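A quick sanity check of the theorem in code, assuming NumPy: for the hypothetical field F = [x, y, z] over the ball of radius a, div F = 3, so the triple integral is 3 · (4/3)πa³; on the sphere, F · n = a, so the flux is a times the surface area. Both sides give 4πa³.

```python
import numpy as np

a = 1.5

# Left side: integral of div F = 3 over the ball = 3 * volume.
lhs = 3 * (4/3) * np.pi * a**3

# Right side: on the sphere |r| = a, F = r and n = r/a, so F . n = a,
# and the flux is a * (surface area) = a * 4*pi*a^2.
rhs = a * 4 * np.pi * a**2

print(lhs, rhs, np.isclose(lhs, rhs))    # 4*pi*a^3 twice, True
```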


6.8 Stokes's Theorem

Double integrals over a region in the plane can be transformed into line integrals over the boundary curve of that region and, conversely, line integrals into double integrals. This important result is known as Green's theorem in the plane. We can also transform triple integrals into surface integrals and vice versa, that is, surface integrals into triple integrals. This "big" theorem is called Gauss's divergence theorem. There is another "big" theorem that allows us to transform surface integrals into line integrals and conversely, line integrals into surface integrals. It is called Stokes's theorem.
