Functions of Two Variables: Notes 2

The Chain Rule:

Recall that the chain rule for a function of one variable says that, for differentiable functions $f$ and $g$,

$$\frac{d}{dx} f(g(x)) = f'(g(x))\, g'(x).$$

Now, we extend the chain rule to functions of two or more variables.

Theorem 1: If $f(x, y)$ is a differentiable function of $x$ and $y$, and both $x$ and $y$ are differentiable functions of a single variable $t$, then $f(x(t), y(t))$ is differentiable with respect to $t$, and its derivative is given by the equation

$$\frac{d}{dt} f(x(t), y(t)) = \frac{\partial f}{\partial x}(x(t), y(t))\,\frac{dx}{dt} + \frac{\partial f}{\partial y}(x(t), y(t))\,\frac{dy}{dt}.$$

Proof: Let $g(t) = f(x(t), y(t))$. Then, by definition of the ordinary derivative, we have

$$g'(t) = \lim_{\Delta t \to 0} \frac{g(t + \Delta t) - g(t)}{\Delta t} = \lim_{\Delta t \to 0} \frac{f(x(t + \Delta t), y(t + \Delta t)) - f(x(t), y(t))}{\Delta t}.$$

For simplicity, if we write $\Delta x = x(t + \Delta t) - x(t)$, $\Delta y = y(t + \Delta t) - y(t)$ and $\Delta f = f(x(t + \Delta t), y(t + \Delta t)) - f(x(t), y(t))$, then the preceding equation gives us

$$\frac{d}{dt} f(x(t), y(t)) = g'(t) = \lim_{\Delta t \to 0} \frac{\Delta f}{\Delta t}.$$

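Since $f$ is a differentiable function of $x$ and $y$, by definition, we have that

$$\Delta f = \frac{\partial f}{\partial x}\,\Delta x + \frac{\partial f}{\partial y}\,\Delta y + \varepsilon_1\,\Delta x + \varepsilon_2\,\Delta y \qquad \ldots (1)$$

where both $\varepsilon_1 \to 0$ and $\varepsilon_2 \to 0$ as $(\Delta x, \Delta y) \to (0, 0)$. Dividing (1) throughout by $\Delta t$ and then letting $\Delta t \to 0$, we get

$$\frac{\Delta f}{\Delta t} = \frac{\partial f}{\partial x}\frac{\Delta x}{\Delta t} + \frac{\partial f}{\partial y}\frac{\Delta y}{\Delta t} + \varepsilon_1 \frac{\Delta x}{\Delta t} + \varepsilon_2 \frac{\Delta y}{\Delta t},$$

$$\lim_{\Delta t \to 0} \frac{\Delta f}{\Delta t} = \frac{\partial f}{\partial x} \lim_{\Delta t \to 0} \frac{\Delta x}{\Delta t} + \frac{\partial f}{\partial y} \lim_{\Delta t \to 0} \frac{\Delta y}{\Delta t} + \lim_{\Delta t \to 0} \varepsilon_1 \lim_{\Delta t \to 0} \frac{\Delta x}{\Delta t} + \lim_{\Delta t \to 0} \varepsilon_2 \lim_{\Delta t \to 0} \frac{\Delta y}{\Delta t} \qquad \ldots (2)$$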

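Now,

$$\lim_{\Delta t \to 0} \frac{\Delta x}{\Delta t} = \lim_{\Delta t \to 0} \frac{x(t + \Delta t) - x(t)}{\Delta t} = \frac{dx}{dt} \quad \text{and} \quad \lim_{\Delta t \to 0} \frac{\Delta y}{\Delta t} = \lim_{\Delta t \to 0} \frac{y(t + \Delta t) - y(t)}{\Delta t} = \frac{dy}{dt}.$$

Also, since $x(t)$ and $y(t)$ are differentiable, they are continuous as well, so that

$$\lim_{\Delta t \to 0} \Delta x = \lim_{\Delta t \to 0} \left[ x(t + \Delta t) - x(t) \right] = 0, \qquad \lim_{\Delta t \to 0} \Delta y = \lim_{\Delta t \to 0} \left[ y(t + \Delta t) - y(t) \right] = 0.$$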

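Consequently, $(\Delta x, \Delta y) \to (0, 0)$ as $\Delta t \to 0$. Hence, $\lim_{\Delta t \to 0} \varepsilon_1 = 0$ and $\lim_{\Delta t \to 0} \varepsilon_2 = 0$. Thus, using equation (2), we obtain

$$\lim_{\Delta t \to 0} \frac{\Delta f}{\Delta t} = \frac{\partial f}{\partial x}\frac{dx}{dt} + \frac{\partial f}{\partial y}\frac{dy}{dt}, \quad \text{that is,} \quad \frac{d}{dt} f(x(t), y(t)) = \frac{\partial f}{\partial x}(x(t), y(t))\,\frac{dx}{dt} + \frac{\partial f}{\partial y}(x(t), y(t))\,\frac{dy}{dt}.$$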

Remarks: We consider $x$ and $y$ to be intermediate variables, which are both functions of a single variable $t$. If we write $w = f(x, y)$, then the Chain rule is conveniently written in short form as

$$\frac{dw}{dt} = \frac{\partial w}{\partial x}\frac{dx}{dt} + \frac{\partial w}{\partial y}\frac{dy}{dt}.$$

To remember it, we use the tree diagram as shown below:

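Similarly, if $w = f(x, y, z)$ is a function of three variables, which are themselves differentiable functions of a single variable $t$, then the appropriate Chain rule is

$$\frac{dw}{dt} = \frac{\partial w}{\partial x}\frac{dx}{dt} + \frac{\partial w}{\partial y}\frac{dy}{dt} + \frac{\partial w}{\partial z}\frac{dz}{dt}.$$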

Example 1: Let $z = f(x, y) = x^2 e^y$, where $x$ and $y$ are functions of one variable $t$ defined by $x(t) = t^2 - 1$, $y(t) = \sin t$. Find the derivative of $z$ with respect to $t$.

Solution: First, we compute the derivatives

$$\frac{\partial z}{\partial x} = 2x e^y, \quad \frac{\partial z}{\partial y} = x^2 e^y, \quad x'(t) = 2t, \quad y'(t) = \cos t.$$

By applying Theorem 1, we obtain

$$\begin{aligned}
\frac{dz}{dt} &= \frac{\partial z}{\partial x}\frac{dx}{dt} + \frac{\partial z}{\partial y}\frac{dy}{dt} \\
&= 2x e^y (2t) + x^2 e^y \cos t \\
&= 2(t^2 - 1) e^{\sin t} (2t) + (t^2 - 1)^2 e^{\sin t} \cos t.
\end{aligned}$$
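As a sanity check on Example 1, the chain rule value can be compared with a finite-difference estimate of the derivative of $g(t) = f(x(t), y(t))$. The following is a minimal sketch, not part of the original notes; the function names and the sample point `t0` are illustrative choices.

```python
import math

def x(t):
    return t**2 - 1

def y(t):
    return math.sin(t)

def f(x_val, y_val):
    return x_val**2 * math.exp(y_val)

def dz_dt_chain(t):
    """Chain rule of Theorem 1: f_x * x'(t) + f_y * y'(t)."""
    f_x = 2 * x(t) * math.exp(y(t))
    f_y = x(t)**2 * math.exp(y(t))
    return f_x * (2 * t) + f_y * math.cos(t)

def dz_dt_numeric(t, h=1e-6):
    """Central difference quotient of g(t) = f(x(t), y(t))."""
    return (f(x(t + h), y(t + h)) - f(x(t - h), y(t - h))) / (2 * h)

t0 = 1.3  # arbitrary sample value of t (an assumption for illustration)
chain_value = dz_dt_chain(t0)
numeric_value = dz_dt_numeric(t0)
print(chain_value, numeric_value)
```

The two printed numbers should agree to several decimal places; any value of `t0` can be used.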

Example 2: Let $w = f(x, y) = x^2 y - y^2$, where $x = \sin t$, $y = e^t$. Find $dw/dt$ when $t = 0$.

Solution: By the Chain rule in Theorem 1, we have

$$\begin{aligned}
\frac{dw}{dt} &= \frac{\partial w}{\partial x}\frac{dx}{dt} + \frac{\partial w}{\partial y}\frac{dy}{dt} \\
&= 2xy \cos t + (x^2 - 2y)\, e^t \\
&= 2 \sin t \, e^t \cos t + (\sin^2 t - 2e^t)\, e^t.
\end{aligned}$$

When $t = 0$, it follows that $dw/dt = -2$.
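Example 2 can also be cross-checked by substituting $x = \sin t$ and $y = e^t$ first, which gives $g(t) = e^t \sin^2 t - e^{2t}$, and differentiating that one-variable function directly. The sketch below compares the two routes; the helper names are illustrative, not from the notes.

```python
import math

def dw_dt_chain(t):
    """Chain rule: w_x * x'(t) + w_y * y'(t) with w = x^2 y - y^2."""
    x, y = math.sin(t), math.exp(t)
    return 2 * x * y * math.cos(t) + (x**2 - 2 * y) * math.exp(t)

def dw_dt_substituted(t):
    """Derivative of g(t) = e^t sin^2(t) - e^(2t), found by one-variable rules."""
    return (math.exp(t) * math.sin(t)**2
            + 2 * math.exp(t) * math.sin(t) * math.cos(t)
            - 2 * math.exp(2 * t))

value_at_zero = dw_dt_chain(0.0)
print(value_at_zero, dw_dt_substituted(0.0))
```

Both routes give the same function of $t$, so they agree at every sample point, not only at $t = 0$.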

Remarks: In Examples 1 and 2, we could have first substituted for $x$ and $y$, then computed the derivative of $g(t) = f(x(t), y(t))$ using the usual rules of differentiation of one variable. However, as the next example shows, sometimes we don't have any alternative but to use the chain rule to find the derivative of a function of two variables, for instance when only the rates of change of $x$ and $y$ are known and not the functions $x(t)$ and $y(t)$ themselves.

Example 3: Suppose the production of a firm is modeled by the Cobb-Douglas production function $P(k, l) = 20\, k^{1/4} l^{3/4}$, where $k$ measures the capital (in millions of dollars) and $l$ measures the labour force (in thousands of workers). Suppose that when $l = 2$ and $k = 6$, the labour force is decreasing at the rate of 20 workers per year and capital is growing at the rate of $400,000 per year. Determine the rate of change of production.

Solution: Let $t$ denote the time in years and $g(t) = P(k(t), l(t))$. From the Chain rule, we have

$$g'(t) = \frac{\partial P}{\partial k}\frac{dk}{dt} + \frac{\partial P}{\partial l}\frac{dl}{dt} = \frac{\partial P}{\partial k}\, k'(t) + \frac{\partial P}{\partial l}\, l'(t).$$

Also, we have $\dfrac{\partial P}{\partial k} = 5\, k^{-3/4} l^{3/4}$ and $\dfrac{\partial P}{\partial l} = 15\, k^{1/4} l^{-1/4}$. With $l = 2$ and $k = 6$, this gives us

$$\frac{\partial P}{\partial k}(6, 2) \approx 2.1935 \quad \text{and} \quad \frac{\partial P}{\partial l}(6, 2) \approx 19.7411.$$

Since $k$ is measured in millions of dollars and $l$ in thousands of workers, we have $k'(t) = 0.4$ and $l'(t) = -0.02$. Thus, we obtain

$$g'(t) = \frac{\partial P}{\partial k}\, k'(t) + \frac{\partial P}{\partial l}\, l'(t) \approx (2.1935)(0.4) + (19.7411)(-0.02) \approx 0.48258.$$

This indicates that the production is increasing at the rate of approximately one-half unit per year.
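The arithmetic of Example 3 is easy to reproduce. The sketch below evaluates the two partial derivatives of $P(k, l) = 20 k^{1/4} l^{3/4}$ and combines them with the given rates; the function name `production_rate` is an illustrative choice, and the rates are converted to the units of $k$ and $l$ (millions of dollars, thousands of workers).

```python
def production_rate(k, l, dk_dt, dl_dt):
    """Return (dP/dt, P_k, P_l) for P(k, l) = 20 k^(1/4) l^(3/4)."""
    p_k = 5 * k**(-0.75) * l**0.75     # partial derivative with respect to k
    p_l = 15 * k**0.25 * l**(-0.25)    # partial derivative with respect to l
    return p_k * dk_dt + p_l * dl_dt, p_k, p_l

# $400,000 per year = 0.4 million; 20 workers per year lost = -0.02 thousand
rate, p_k, p_l = production_rate(k=6, l=2, dk_dt=0.4, dl_dt=-0.02)
print(round(p_k, 4), round(p_l, 4), round(rate, 4))
```

The small difference between this result and 0.48258 comes from rounding the partial derivatives to four decimal places before multiplying.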

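Example 4: Two objects are travelling in elliptical paths given by the following parametric equations.

$$\begin{aligned}
x_1 &= 4 \cos t \quad \text{and} \quad y_1 = 2 \sin t \qquad &\ldots \text{first object} \\
x_2 &= 2 \sin 2t \quad \text{and} \quad y_2 = 3 \cos 2t \qquad &\ldots \text{second object}
\end{aligned}$$

At what rate is the distance between the objects changing when $t = \pi$?

Solution: The motion of the two objects is shown in the figure below.

The distance $s$ between the two objects, as a function of four variables, is given by

$$s = \sqrt{(x_2 - x_1)^2 + (y_2 - y_1)^2},$$

and when $t = \pi$, we have $x_1 = -4$, $y_1 = 0$, $x_2 = 0$, $y_2 = 3$, so that $s = 5$. Therefore, when $t = \pi$, the partial derivatives of $s$ are as follows:

$$\frac{\partial s}{\partial x_1} = \frac{-(x_2 - x_1)}{s} = -\frac{4}{5}, \quad \frac{\partial s}{\partial y_1} = \frac{-(y_2 - y_1)}{s} = -\frac{3}{5}, \quad \frac{\partial s}{\partial x_2} = \frac{x_2 - x_1}{s} = \frac{4}{5}, \quad \frac{\partial s}{\partial y_2} = \frac{y_2 - y_1}{s} = \frac{3}{5}.$$

Also, at $t = \pi$,

$$\frac{dx_1}{dt} = -4 \sin t = 0, \quad \frac{dy_1}{dt} = 2 \cos t = -2, \quad \frac{dx_2}{dt} = 4 \cos 2t = 4, \quad \frac{dy_2}{dt} = -6 \sin 2t = 0.$$

Thus, using the appropriate Chain rule, we obtain that the distance is changing at the rate of

$$\begin{aligned}
\frac{ds}{dt} &= \frac{\partial s}{\partial x_1}\frac{dx_1}{dt} + \frac{\partial s}{\partial y_1}\frac{dy_1}{dt} + \frac{\partial s}{\partial x_2}\frac{dx_2}{dt} + \frac{\partial s}{\partial y_2}\frac{dy_2}{dt} \\
&= \left(-\frac{4}{5}\right)(0) + \left(-\frac{3}{5}\right)(-2) + \left(\frac{4}{5}\right)(4) + \left(\frac{3}{5}\right)(0) \\
&= \frac{22}{5}.
\end{aligned}$$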
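The four-variable chain rule in this example can be checked numerically. The sketch below, with illustrative function names, evaluates the chain-rule sum at $t = \pi$ and compares it with a central difference quotient of the distance $s(t)$ between the points $(4\cos t, 2\sin t)$ and $(2\sin 2t, 3\cos 2t)$.

```python
import math

def positions(t):
    """Coordinates (x1, y1, x2, y2) of the two objects at time t."""
    return (4 * math.cos(t), 2 * math.sin(t),
            2 * math.sin(2 * t), 3 * math.cos(2 * t))

def distance(t):
    x1, y1, x2, y2 = positions(t)
    return math.hypot(x2 - x1, y2 - y1)

def ds_dt_chain(t):
    """Sum of (partial of s) * (time derivative) over the four variables."""
    x1, y1, x2, y2 = positions(t)
    s = distance(t)
    partials = (-(x2 - x1) / s, -(y2 - y1) / s, (x2 - x1) / s, (y2 - y1) / s)
    rates = (-4 * math.sin(t), 2 * math.cos(t),
             4 * math.cos(2 * t), -6 * math.sin(2 * t))
    return sum(p * r for p, r in zip(partials, rates))

h = 1e-6
rate_chain = ds_dt_chain(math.pi)
rate_numeric = (distance(math.pi + h) - distance(math.pi - h)) / (2 * h)
print(rate_chain, rate_numeric)  # both close to 22/5 = 4.4
```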

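Remarks: We can easily extend the Chain Rule given in Theorem 1 to the case of a function $f(x, y)$, where both $x$ and $y$ are functions of two independent variables $s$ and $t$, $x = x(s, t)$, $y = y(s, t)$. It is given in the next theorem (without proof).

Theorem 2: If $z = f(x, y)$ is a differentiable function of $x$ and $y$, and both $x$ and $y$, as functions of two variables $s$ and $t$, written as $x = x(s, t)$, $y = y(s, t)$, have first order partial derivatives, then we have the chain rules:

$$\frac{\partial z}{\partial s} = \frac{\partial z}{\partial x}\frac{\partial x}{\partial s} + \frac{\partial z}{\partial y}\frac{\partial y}{\partial s}; \qquad \frac{\partial z}{\partial t} = \frac{\partial z}{\partial x}\frac{\partial x}{\partial t} + \frac{\partial z}{\partial y}\frac{\partial y}{\partial t}.$$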

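Example 4: Use the Chain Rule to find $\partial w / \partial s$ and $\partial w / \partial t$ for $w = 2xy$, where $x = s^2 + t^2$ and $y = s/t$.

Solution: First, we find the partial derivatives of $x$ and $y$ with respect to $s$ and $t$.

$$\frac{\partial x}{\partial s} = 2s, \quad \frac{\partial x}{\partial t} = 2t, \quad \frac{\partial y}{\partial s} = \frac{1}{t}, \quad \frac{\partial y}{\partial t} = -\frac{s}{t^2}.$$

Hence, by the Chain rule given in Theorem 2, we get

$$\frac{\partial w}{\partial s} = \frac{\partial w}{\partial x}\frac{\partial x}{\partial s} + \frac{\partial w}{\partial y}\frac{\partial y}{\partial s} = 2y(2s) + 2x\left(\frac{1}{t}\right) = 2\left(\frac{s}{t}\right)(2s) + 2(s^2 + t^2)\left(\frac{1}{t}\right) = \frac{6s^2 + 2t^2}{t},$$

$$\frac{\partial w}{\partial t} = \frac{\partial w}{\partial x}\frac{\partial x}{\partial t} + \frac{\partial w}{\partial y}\frac{\partial y}{\partial t} = 2y(2t) + 2x\left(-\frac{s}{t^2}\right) = 2\left(\frac{s}{t}\right)(2t) + 2(s^2 + t^2)\left(-\frac{s}{t^2}\right) = \frac{2st^2 - 2s^3}{t^2}.$$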
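The two closed forms obtained here, $\partial w/\partial s = (6s^2 + 2t^2)/t$ and $\partial w/\partial t = (2st^2 - 2s^3)/t^2$ for $w = 2xy$ with $x = s^2 + t^2$ and $y = s/t$, can be compared with finite differences of $w$ regarded directly as a function of $s$ and $t$. This is a minimal sketch; the sample point `(s0, t0)` is an arbitrary assumption.

```python
def w(s, t):
    """w = 2xy with x = s^2 + t^2 and y = s/t, written in terms of s and t."""
    return 2 * (s**2 + t**2) * (s / t)

def dw_ds(s, t):
    return (6 * s**2 + 2 * t**2) / t

def dw_dt(s, t):
    return (2 * s * t**2 - 2 * s**3) / t**2

s0, t0, h = 1.5, 2.0, 1e-6
numeric_ds = (w(s0 + h, t0) - w(s0 - h, t0)) / (2 * h)  # hold t fixed
numeric_dt = (w(s0, t0 + h) - w(s0, t0 - h)) / (2 * h)  # hold s fixed
print(dw_ds(s0, t0), numeric_ds)
print(dw_dt(s0, t0), numeric_dt)
```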

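Remarks: The Chain Rule in Theorem 2 can be extended to any number of variables. For example, if $w$ is a function of $n$ variables $x_1, x_2, \ldots, x_n$, where each $x_i$ is a differentiable function of the $m$ variables $t_1, t_2, \ldots, t_m$, then for $w = f(x_1, x_2, \ldots, x_n)$, we have

$$\begin{aligned}
\frac{\partial w}{\partial t_1} &= \frac{\partial w}{\partial x_1}\frac{\partial x_1}{\partial t_1} + \frac{\partial w}{\partial x_2}\frac{\partial x_2}{\partial t_1} + \cdots + \frac{\partial w}{\partial x_n}\frac{\partial x_n}{\partial t_1} \\
\frac{\partial w}{\partial t_2} &= \frac{\partial w}{\partial x_1}\frac{\partial x_1}{\partial t_2} + \frac{\partial w}{\partial x_2}\frac{\partial x_2}{\partial t_2} + \cdots + \frac{\partial w}{\partial x_n}\frac{\partial x_n}{\partial t_2} \\
&\;\;\vdots \\
\frac{\partial w}{\partial t_m} &= \frac{\partial w}{\partial x_1}\frac{\partial x_1}{\partial t_m} + \frac{\partial w}{\partial x_2}\frac{\partial x_2}{\partial t_m} + \cdots + \frac{\partial w}{\partial x_n}\frac{\partial x_n}{\partial t_m}
\end{aligned}$$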

The tree diagram below gives the Chain rule for a function of three variables $x, y, z$, where each of these intermediate variables is a function of two variables $u$ and $v$.

Implicit Partial Differentiation:

Suppose that $x$ and $y$ are related by the equation $F(x, y) = 0$, where it is assumed that $y = f(x)$ is a differentiable function of $x$. The problem is to find $dy/dx$. If $y$ can be solved explicitly in terms of $x$, then we can find $dy/dx$ by the usual methods of differentiation. However, if $y$ cannot be solved in terms of $x$, then we use the Chain rule to find $dy/dx$. Consider the function $w = F(x, y) = F(x, f(x))$. Using the Chain rule, we find that

$$\frac{dw}{dx} = F_x(x, y)\,\frac{dx}{dx} + F_y(x, y)\,\frac{dy}{dx}.$$

Since $w = F(x, y) = 0$, we have $dw/dx = 0$, so that $F_x(x, y)\,\dfrac{dx}{dx} + F_y(x, y)\,\dfrac{dy}{dx} = 0$. But $dx/dx = 1$, so that if $F_y(x, y) \neq 0$, we get

$$\frac{dy}{dx} = -\frac{F_x(x, y)}{F_y(x, y)}.$$

We formally state this result as a theorem of implicit differentiation:

Theorem 3: If the equation $F(x, y) = 0$ defines $y$ implicitly as a differentiable function of $x$ such that $F_y(x, y) \neq 0$, then

$$\frac{dy}{dx} = -\frac{F_x(x, y)}{F_y(x, y)}.$$

Similarly, if the equation $F(x, y, z) = 0$ defines $z$ implicitly as a differentiable function of $x$ and $y$ such that $F_z(x, y, z) \neq 0$, then

$$\frac{\partial z}{\partial x} = -\frac{F_x(x, y, z)}{F_z(x, y, z)} \quad \text{and} \quad \frac{\partial z}{\partial y} = -\frac{F_y(x, y, z)}{F_z(x, y, z)}.$$

The above theorem can be extended to differentiable functions defined implicitly with any number of variables.

Example 5: Find $dy/dx$, given $y^3 + y^2 - 5y - x^2 + 4 = 0$.

Solution: Let $F(x, y) = y^3 + y^2 - 5y - x^2 + 4$. Then we have $F_x(x, y) = -2x$ and $F_y(x, y) = 3y^2 + 2y - 5$.

At every point $(x, y)$ where $F_y(x, y) = 3y^2 + 2y - 5 \neq 0$ (that is, where $y \neq 1$ and $y \neq -5/3$), it follows by Theorem 3 that

$$\frac{dy}{dx} = -\frac{F_x(x, y)}{F_y(x, y)} = \frac{2x}{3y^2 + 2y - 5}.$$
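The implicit differentiation formula can be tested numerically at a point of the curve. The sketch below reads the equation of Example 5 as $y^3 + y^2 - 5y - x^2 + 4 = 0$ (an assumption about the signs, chosen to be consistent with the derivatives $F_x = -2x$ and $F_y = 3y^2 + 2y - 5$), uses the point $(2, 0)$ on that curve, and compares $-F_x/F_y$ with the slope obtained by solving for $y$ near $x = 2$ with Newton's method. The helper names are illustrative.

```python
def F(x, y):
    return y**3 + y**2 - 5 * y - x**2 + 4

def F_y(y):
    return 3 * y**2 + 2 * y - 5

def dy_dx(x, y):
    """Theorem 3: dy/dx = -F_x / F_y, with F_x = -2x."""
    return 2 * x / F_y(y)

def solve_y(x, y_guess, steps=50):
    """Newton iteration in y for F(x, y) = 0, started near y_guess."""
    y = y_guess
    for _ in range(steps):
        y -= F(x, y) / F_y(y)
    return y

x0, y0, h = 2.0, 0.0, 1e-5          # F(2, 0) = 0, so (2, 0) is on the curve
slope_formula = dy_dx(x0, y0)
slope_numeric = (solve_y(x0 + h, y0) - solve_y(x0 - h, y0)) / (2 * h)
print(slope_formula, slope_numeric)  # both close to -0.8
```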

Example 6: Find $\partial z / \partial x$ and $\partial z / \partial y$, given $3x^2 z - x^2 y^2 + 2z^3 + 3yz - 5 = 0$.

Solution: Let $F(x, y, z) = 3x^2 z - x^2 y^2 + 2z^3 + 3yz - 5$. Then we have

$$\begin{aligned}
F_x(x, y, z) &= 6xz - 2xy^2, \\
F_y(x, y, z) &= -2x^2 y + 3z, \\
F_z(x, y, z) &= 3x^2 + 6z^2 + 3y.
\end{aligned}$$

Hence, we obtain

$$\begin{aligned}
\frac{\partial z}{\partial x} &= -\frac{F_x(x, y, z)}{F_z(x, y, z)} = \frac{2xy^2 - 6xz}{3x^2 + 6z^2 + 3y}, \\
\frac{\partial z}{\partial y} &= -\frac{F_y(x, y, z)}{F_z(x, y, z)} = \frac{2x^2 y - 3z}{3x^2 + 6z^2 + 3y}.
\end{aligned}$$

Directional Derivatives:

Recall that by the Chain Rule, if $f(x, y)$ is differentiable, then the rate at which $f$ changes with respect to $t$ along a differentiable curve $x = g(t)$, $y = h(t)$ is

$$\frac{df}{dt} = \frac{\partial f}{\partial x}\frac{dx}{dt} + \frac{\partial f}{\partial y}\frac{dy}{dt}.$$

At any point $P_0(x_0, y_0) = P_0(g(t_0), h(t_0))$, this equation gives the instantaneous rate of change of $f$ with respect to increasing $t$ and therefore depends, among other things, on the direction of motion along the curve.

Example 1: Suppose the motion is along a straight line and $s$ is the arc length parameter along the line measured from $P_0(x_0, y_0)$ in the direction of a given unit vector $\mathbf{u}$. Then $df/ds$ at $P_0$ is the instantaneous rate of change of $f$ with respect to distance in its domain in the direction of $\mathbf{u}$. By varying $\mathbf{u}$, we find the rates at which $f$ changes with respect to distance as we move through $P_0$ in different directions.

Example 2: Suppose $z = T(x, y)$ gives the temperature at each point $(x, y)$ in a region $R$ of the plane, and let $P_0(x_0, y_0)$ be a particular point in $R$. Then we know that the partial derivative $T_x(x_0, y_0)$ gives the rate at which the temperature changes if we move from $P_0$ in the $x$ direction, while the rate of temperature change in the $y$ direction is given by $T_y(x_0, y_0)$. The question is how to find the direction of greatest temperature change, which may be in a direction not parallel to either of the coordinate axes.

Example 3: While hiking in rugged terrain, we may think of the altitude on a hillside at the point given by longitude $x$ and latitude $y$ as defining a function $f(x, y)$. If we face due east (in the direction of the positive $x$-axis), then we know that the slope of the terrain is given by the partial derivative $\partial f / \partial x$. Similarly, facing due north, the slope of the terrain is given by $\partial f / \partial y$. However, how would one compute the slope in some other direction, say north-by-northwest? Also, how would one find the direction of steepest ascent or descent?

To answer these questions, we now introduce the concept of directional derivative and understand its geometrical interpretation.

Suppose that we want to find the instantaneous rate of change of $f(x, y)$ at the point $P(a, b)$ and in the direction given by the unit vector $\mathbf{u} = u_1 \mathbf{i} + u_2 \mathbf{j}$. Let $Q(x, y)$ be any point on the line through $P(a, b)$ in the direction of $\mathbf{u}$ (see figure above). Notice that the vector $\overrightarrow{PQ}$ is then parallel to $\mathbf{u}$. Since two vectors are parallel if and only if one is a scalar multiple of the other, we have that $\overrightarrow{PQ} = h\mathbf{u}$ for some scalar $h$, so that

$$(x - a)\mathbf{i} + (y - b)\mathbf{j} = h\mathbf{u} = h u_1 \mathbf{i} + h u_2 \mathbf{j}.$$

It then follows that $x - a = h u_1$, $y - b = h u_2$, so that $x = a + h u_1$, $y = b + h u_2$. The point $Q$ is then described by $(a + h u_1, b + h u_2)$, as indicated in the figure above. The average rate of change of $z = f(x, y)$ along the line from $P$ to $Q$ is therefore

$$\frac{f(a + h u_1, b + h u_2) - f(a, b)}{h}.$$

The instantaneous rate of change of $f(x, y)$ at the point $P(a, b)$ and in the direction of the unit vector $\mathbf{u}$ is then found by taking the limit as $h \to 0$. We give this limit a special name in the following definition.

Definition 1: The directional derivative of $f(x, y)$ at the point $(a, b)$ and in the direction of the unit vector $\mathbf{u} = u_1 \mathbf{i} + u_2 \mathbf{j}$ is given by

$$D_{\mathbf{u}} f(a, b) = \lim_{h \to 0} \frac{f(a + h u_1, b + h u_2) - f(a, b)}{h},$$

provided the limit exists.

Remarks:

a) Notice that this limit resembles the definition of partial derivative, except that in this case, both variables may change.

b) Calculating directional derivatives by this definition is similar to finding the derivative of a function of one variable by the limit process. However, a simpler working formula for finding directional derivatives involving the partial derivatives $f_x$ and $f_y$ is given in Theorem 1 below.

c) At a particular point $P(a, b)$, there are infinitely many directional derivatives of the function $f(x, y)$, one for each direction radiating from $P(a, b)$. Two of these are the partial derivatives $f_x(a, b)$ and $f_y(a, b)$. To see this, note that if $\mathbf{u} = \mathbf{i}$ (so $u_1 = 1$ and $u_2 = 0$), then

$$D_{\mathbf{i}} f(a, b) = \lim_{h \to 0} \frac{f(a + h, b) - f(a, b)}{h} = f_x(a, b),$$

and if $\mathbf{u} = \mathbf{j}$ (so $u_1 = 0$ and $u_2 = 1$),

$$D_{\mathbf{j}} f(a, b) = \lim_{h \to 0} \frac{f(a, b + h) - f(a, b)}{h} = f_y(a, b).$$

Theorem 1: (Directional derivatives using partial derivatives)

Let $f(x, y)$ be a function that is differentiable at $P(a, b)$. Then $f$ has a directional derivative in the direction of the unit vector $\mathbf{u} = u_1 \mathbf{i} + u_2 \mathbf{j}$ given by

$$D_{\mathbf{u}} f(a, b) = f_x(a, b)\, u_1 + f_y(a, b)\, u_2 \qquad (1)$$

Proof: We define a function $F$ of a single variable $h$ by $F(h) = f(a + h u_1, b + h u_2)$. Then

$$D_{\mathbf{u}} f(a, b) = \lim_{h \to 0} \frac{f(a + h u_1, b + h u_2) - f(a, b)}{h} = \lim_{h \to 0} \frac{F(h) - F(0)}{h} = F'(0).$$

Writing $x = a + h u_1$, $y = b + h u_2$, and applying the Chain rule to $F$, we obtain

$$F'(h) = \frac{dF}{dh} = \frac{\partial f}{\partial x}\frac{dx}{dh} + \frac{\partial f}{\partial y}\frac{dy}{dh} = f_x(x, y)\, u_1 + f_y(x, y)\, u_2.$$

When $h = 0$, we have $x = a$ and $y = b$, so that

$$D_{\mathbf{u}} f(a, b) = F'(0) = f_x(a, b)\, u_1 + f_y(a, b)\, u_2.$$

Geometrical Interpretation of Directional Derivative:

The directional derivative of $f(x, y)$ at a point $P(x_0, y_0)$ in the domain of $f$ and in the direction of the unit vector $\mathbf{u} = u_1 \mathbf{i} + u_2 \mathbf{j}$ can be interpreted as a slope of the surface $z = f(x, y)$ at the point $P(x_0, y_0)$ in the direction of $\mathbf{u}$. To see this, we reduce the problem to two dimensions by intersecting the surface with a vertical plane passing through the point $P(x_0, y_0)$ and parallel to $\mathbf{u}$ as shown in Figure (b) below.

This vertical plane intersects the surface to form a curve $C$. The slope of the surface at $(x_0, y_0, f(x_0, y_0))$ in the direction of $\mathbf{u}$ is defined as the slope of the curve $C$ at $(x_0, y_0, f(x_0, y_0))$. The vertical plane used to form $C$ intersects the $xy$-plane in a line $L$, represented by the equations $x = x_0 + t u_1$, $y = y_0 + t u_2$, so that for any value of $t$, the point $Q(x, y)$ lies on the line $L$. The points on the surface corresponding to $P$ and $Q$ are $(x_0, y_0, f(x_0, y_0))$ and $(x, y, f(x, y))$, respectively. Since the distance between $P$ and $Q$ is

$$\sqrt{(x - x_0)^2 + (y - y_0)^2} = \sqrt{(t u_1)^2 + (t u_2)^2} = |t| \sqrt{u_1^2 + u_2^2} = |t| \quad (\mathbf{u} \text{ is a unit vector}),$$

the slope of the secant line through the points $(x_0, y_0, f(x_0, y_0))$ and $(x, y, f(x, y))$ is

$$\frac{f(x, y) - f(x_0, y_0)}{t} = \frac{f(x_0 + t u_1, y_0 + t u_2) - f(x_0, y_0)}{t}.$$

Letting $t \to 0$, we obtain the slope of the tangent line to the curve $C$ at $(x_0, y_0)$. But this is, by definition, the directional derivative of $f$ at $(x_0, y_0)$ in the direction of $\mathbf{u}$:

$$D_{\mathbf{u}} f(x_0, y_0) = \lim_{t \to 0} \frac{f(x_0 + t u_1, y_0 + t u_2) - f(x_0, y_0)}{t}.$$

Thus, geometrically, the directional derivative in a given direction at any point on the surface gives the slope of the surface in that direction.

Remarks:

1) Note that since $\mathbf{u}$ is a unit vector, we may write $\mathbf{u} = \cos\theta\, \mathbf{i} + \sin\theta\, \mathbf{j}$, where $\theta$ is the angle that the vector $\mathbf{u}$ makes with the positive $x$-axis. With this notation, any point on $L$ can be written as $x = x_0 + t\cos\theta$, $y = y_0 + t\sin\theta$.

2) In order to compute the directional derivative of $f$ at the point $(a, b)$ in the direction of the vector $\mathbf{v} = v_1 \mathbf{i} + v_2 \mathbf{j}$, where $\mathbf{v}$ is not a unit vector, $\mathbf{v}$ must be normalized. That is, consider the unit vector $\mathbf{u} = \mathbf{v} / \|\mathbf{v}\|$ in the direction of $\mathbf{v}$ and then find $D_{\mathbf{u}} f$ using the formula (1) proved in Theorem 1.

Example 1: Find the directional derivative of ( ) 2 3, 3 2f x y x y= + at the point ( )1,2P in the direction of the unit vector ( ) ( )1 2 3 2u i j= . (Ans: 2 6 3 )

Example 2: Find the directional derivative of $f(x, y) = x^2 \sin 2y$ at the point $P(1, \pi/2)$ in the direction of the vector $\mathbf{v} = 3\mathbf{i} - 4\mathbf{j}$. (Ans: $8/5$)

Example 3: For $f(x, y) = x^2 y - 4y^3$, compute $D_{\mathbf{u}} f(2, 1)$ where (i) $\mathbf{u} = (\sqrt{3}/2)\,\mathbf{i} + (1/2)\,\mathbf{j}$, and (ii) $\mathbf{u}$ is in the direction from $(2, 1)$ to $(4, 0)$. (Ans: $2\sqrt{3} - 4$; $16/\sqrt{5}$)
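The limit in Definition 1 can be approximated directly, which gives a way to check stated answers without computing any partial derivatives. The sketch below is an illustration only: the function `directional_derivative` is a hypothetical helper, it normalizes the direction vector first, and it uses a symmetric difference quotient in place of the one-sided quotient of Definition 1 for better accuracy. It is applied to $f(x, y) = x^2 \sin 2y$ at $(1, \pi/2)$ along $3\mathbf{i} - 4\mathbf{j}$, and to $f(x, y) = x^2 y - 4y^3$ at $(2, 1)$ in the two directions of Example 3.

```python
import math

def directional_derivative(func, a, b, v, h=1e-6):
    """Approximate D_u func(a, b), where u is the unit vector along v."""
    norm = math.hypot(v[0], v[1])
    u1, u2 = v[0] / norm, v[1] / norm
    forward = func(a + h * u1, b + h * u2)
    backward = func(a - h * u1, b - h * u2)
    return (forward - backward) / (2 * h)

def f2(x, y):
    return x**2 * math.sin(2 * y)

def f3(x, y):
    return x**2 * y - 4 * y**3

d2 = directional_derivative(f2, 1, math.pi / 2, (3, -4))
d3_i = directional_derivative(f3, 2, 1, (math.sqrt(3) / 2, 0.5))
d3_ii = directional_derivative(f3, 2, 1, (4 - 2, 0 - 1))  # from (2,1) to (4,0)
print(d2, d3_i, d3_ii)
```

The printed values can be compared with $8/5$, $2\sqrt{3} - 4$ and $16/\sqrt{5}$.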

Gradient of a function of two variables:

Definition 2: Let $z = f(x, y)$ be a function of two variables $x$ and $y$ such that the partial derivatives $f_x$ and $f_y$ exist. Then the gradient of $f$, denoted by $\nabla f(x, y)$ or $\operatorname{grad} f(x, y)$, is the vector

$$\nabla f(x, y) = f_x(x, y)\,\mathbf{i} + f_y(x, y)\,\mathbf{j}.$$

Note that the gradient $\nabla f(x, y)$ is a vector in the plane and not a vector in space. Also, think of $\nabla$ as an operator which produces a vector in the plane. We read $\nabla f$ as "del f".

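Example 4: The gradient of $f(x, y) = x^2 y + y^3$ is

$$\nabla f(x, y) = \frac{\partial}{\partial x}\left(x^2 y + y^3\right)\mathbf{i} + \frac{\partial}{\partial y}\left(x^2 y + y^3\right)\mathbf{j} = 2xy\,\mathbf{i} + (x^2 + 3y^2)\,\mathbf{j}.$$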

Example 5: Find the gradient of $f(x, y) = y \ln x + x y^2$ at the point $(1, 2)$. (Ans: $6\mathbf{i} + 4\mathbf{j}$)

Theorem 2: (The gradient formula for the directional derivative)

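If $f$ is a differentiable function of $x$ and $y$, then the directional derivative of $f$ at the point $P(a, b)$ in the direction of the unit vector $\mathbf{u}$ is

$$D_{\mathbf{u}} f(a, b) = \nabla f(a, b) \cdot \mathbf{u}.$$

Proof: Since $\nabla f(x, y) = f_x(x, y)\,\mathbf{i} + f_y(x, y)\,\mathbf{j}$ for all $x$ and $y$, and $\mathbf{u} = u_1 \mathbf{i} + u_2 \mathbf{j}$, we have

$$\begin{aligned}
\nabla f(a, b) \cdot \mathbf{u} &= \left( f_x(a, b)\,\mathbf{i} + f_y(a, b)\,\mathbf{j} \right) \cdot \left( u_1 \mathbf{i} + u_2 \mathbf{j} \right) \\
&= f_x(a, b)\, u_1 + f_y(a, b)\, u_2 \\
&= D_{\mathbf{u}} f(a, b),
\end{aligned}$$

using equation (1) in Theorem 1.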

Example 6: Find the directional derivative of $f(x, y) = \ln(x^2 + y^3)$ at $P(1, -3)$ in the direction of $\mathbf{v} = 2\mathbf{i} - 3\mathbf{j}$. (Ans: $77\sqrt{13}/338$)

Example 7: Find the directional derivative of $f(x, y) = 3x^2 - 2y^2$ at $(-3/4, 0)$ in the direction of $\overrightarrow{PQ}$, where $P = (-3/4, 0)$ and $Q = (0, 1)$. (Ans: $-27/10$)
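The gradient formula $D_{\mathbf{u}} f = \nabla f \cdot \mathbf{u}$ reduces each of these exercises to a dot product. The sketch below evaluates it for $f(x, y) = \ln(x^2 + y^3)$ at $(1, -3)$ along $2\mathbf{i} - 3\mathbf{j}$, and for $f(x, y) = 3x^2 - 2y^2$ at $(-3/4, 0)$ along the vector from $(-3/4, 0)$ to $(0, 1)$. The gradients are entered by hand, and the helper names `unit` and `dot` are illustrative.

```python
import math

def unit(v):
    """Normalize a plane vector, as required before using the formula."""
    norm = math.hypot(v[0], v[1])
    return (v[0] / norm, v[1] / norm)

def dot(a, b):
    return a[0] * b[0] + a[1] * b[1]

def grad_f6(x, y):
    """Gradient of ln(x^2 + y^3)."""
    denom = x**2 + y**3
    return (2 * x / denom, 3 * y**2 / denom)

def grad_f7(x, y):
    """Gradient of 3x^2 - 2y^2."""
    return (6 * x, -4 * y)

d6 = dot(grad_f6(1, -3), unit((2, -3)))

P, Q = (-0.75, 0.0), (0.0, 1.0)
d7 = dot(grad_f7(*P), unit((Q[0] - P[0], Q[1] - P[1])))
print(d6, d7)
```

The printed values can be compared with $77\sqrt{13}/338 \approx 0.8214$ and $-27/10$.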

Basic properties of gradient:

Let f and g be differentiable functions of x and y . Then

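a) Constant Rule: $\nabla c = \mathbf{0}$ for any constant $c$.
b) Linearity Rule: $\nabla(af + bg) = a\nabla f + b\nabla g$ for constants $a$ and $b$.
c) Product Rule: $\nabla(fg) = f\nabla g + g\nabla f$.
d) Quotient Rule: $\nabla\left(\dfrac{f}{g}\right) = \dfrac{g\nabla f - f\nabla g}{g^2}$, $(g \neq 0)$.
e) Power Rule: $\nabla f^n = n f^{n-1} \nabla f$.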

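Proof: (a) Since $\dfrac{\partial}{\partial x}(c) = 0$ and $\dfrac{\partial}{\partial y}(c) = 0$, we have $\nabla c = \mathbf{0}$ for any constant $c$.

(b) Using the linearity rule of partial (ordinary) derivatives, we get

$$\begin{aligned}
\nabla(af + bg) &= \frac{\partial}{\partial x}(af + bg)\,\mathbf{i} + \frac{\partial}{\partial y}(af + bg)\,\mathbf{j} = (a f_x + b g_x)\,\mathbf{i} + (a f_y + b g_y)\,\mathbf{j} \\
&= a\left(f_x \mathbf{i} + f_y \mathbf{j}\right) + b\left(g_x \mathbf{i} + g_y \mathbf{j}\right) = a\nabla f + b\nabla g.
\end{aligned}$$

(c) Using the product rule of partial (ordinary) derivatives, we have

$$\begin{aligned}
\nabla(fg) &= \frac{\partial}{\partial x}(fg)\,\mathbf{i} + \frac{\partial}{\partial y}(fg)\,\mathbf{j} = (f g_x + f_x g)\,\mathbf{i} + (f g_y + f_y g)\,\mathbf{j} \\
&= f\left(g_x \mathbf{i} + g_y \mathbf{j}\right) + g\left(f_x \mathbf{i} + f_y \mathbf{j}\right) = f\nabla g + g\nabla f.
\end{aligned}$$

(d) Using the quotient rule of partial (ordinary) derivatives, we obtain for $g \neq 0$,

$$\begin{aligned}
\nabla\left(\frac{f}{g}\right) &= \frac{\partial}{\partial x}\left(\frac{f}{g}\right)\mathbf{i} + \frac{\partial}{\partial y}\left(\frac{f}{g}\right)\mathbf{j} = \frac{g f_x - f g_x}{g^2}\,\mathbf{i} + \frac{g f_y - f g_y}{g^2}\,\mathbf{j} \\
&= \frac{g\left(f_x \mathbf{i} + f_y \mathbf{j}\right) - f\left(g_x \mathbf{i} + g_y \mathbf{j}\right)}{g^2} = \frac{g\nabla f - f\nabla g}{g^2}.
\end{aligned}$$

(e) Using the power rule of partial (ordinary) derivatives, we have

$$\nabla f^n = \frac{\partial}{\partial x}\left(f^n\right)\mathbf{i} + \frac{\partial}{\partial y}\left(f^n\right)\mathbf{j} = n f^{n-1} f_x\,\mathbf{i} + n f^{n-1} f_y\,\mathbf{j} = n f^{n-1}\left(f_x \mathbf{i} + f_y \mathbf{j}\right) = n f^{n-1} \nabla f.$$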

Theorem 3: (Maximal Direction Property of the Gradient) Let $f(x, y)$ be a differentiable function of $x$ and $y$. Let $(a, b)$ be any point in the domain of $f$. Then

1) If $\nabla f(a, b) = \mathbf{0}$, then $D_{\mathbf{u}} f(a, b) = 0$ for any direction $\mathbf{u}$.
2) If $\nabla f(a, b) \neq \mathbf{0}$, then
a) The direction of maximum increase of $f$ at $(a, b)$ is $\nabla f(a, b)$. The maximum rate of change of $f$ at $(a, b)$ is $\|\nabla f(a, b)\|$.
b) The direction of minimum increase (that is, of maximum decrease) of $f$ at $(a, b)$ is $-\nabla f(a, b)$. The minimum rate of change of $f$ at $(a, b)$ is $-\|\nabla f(a, b)\|$.
3) The rate of change of $f$ at $(a, b)$ is 0 in the directions orthogonal to $\nabla f(a, b)$.
4) The gradient $\nabla f(a, b)$ is orthogonal to the level curve $f(x, y) = c$ at the point $(a, b)$, where $c = f(a, b)$.

Proof: 1) Given that $\nabla f(a, b) = \mathbf{0}$, we have $D_{\mathbf{u}} f(a, b) = \nabla f(a, b) \cdot \mathbf{u} = \mathbf{0} \cdot \mathbf{u} = 0$ for any direction $\mathbf{u}$.

2) Let $\nabla f(a, b) \neq \mathbf{0}$. Then

(a) Using Theorem 2, we have

$$D_{\mathbf{u}} f(a, b) = \nabla f(a, b) \cdot \mathbf{u} = \|\nabla f(a, b)\|\, \|\mathbf{u}\| \cos\theta = \|\nabla f(a, b)\| \cos\theta,$$

where $\theta$ is the angle between the gradient vector at $(a, b)$ and the direction vector $\mathbf{u}$. Now, $\|\nabla f(a, b)\| \cos\theta$ has its maximum value when $\cos\theta$ assumes its largest value, that is, when $\theta = 0$, so that $\cos\theta = 1$. Thus, the direction of maximum increase of $f$ at $(a, b)$ is attained when the angle between $\nabla f(a, b)$ and $\mathbf{u}$ is 0. In other words, $\nabla f(a, b)$ is in the same direction as $\mathbf{u}$. Hence, the direction of maximum increase of $f$ at $(a, b)$ is

$$\mathbf{u} = \frac{\nabla f(a, b)}{\|\nabla f(a, b)\|},$$

and the largest value of $D_{\mathbf{u}} f(a, b)$ is $\|\nabla f(a, b)\|$.

(b) As in part (a), the minimum value of $D_{\mathbf{u}} f(a, b)$ occurs when $\cos\theta = -1$, that is, when $\theta = \pi$. This happens when $\mathbf{u}$ points in the direction of $-\nabla f(a, b)$, and in this direction

$$D_{\mathbf{u}} f(a, b) = (-1)\,\|\nabla f(a, b)\| = -\|\nabla f(a, b)\|.$$

3) Since $D_{\mathbf{u}} f(a, b) = \nabla f(a, b) \cdot \mathbf{u}$, and $\nabla f(a, b) \cdot \mathbf{u} = 0$ if and only if $\nabla f(a, b)$ is orthogonal to $\mathbf{u}$, we get that $D_{\mathbf{u}} f(a, b) = 0$ in the directions orthogonal to $\nabla f(a, b)$.

4) Let the vector equation of the level curve be $\mathbf{r}(t) = x(t)\,\mathbf{i} + y(t)\,\mathbf{j}$, where $t$ is a parameter. Since $f(x(t), y(t)) = c$ for every $t$, we differentiate it with respect to $t$ and apply the Chain rule to obtain

$$\frac{d}{dt} f(x(t), y(t)) = 0 \;\Rightarrow\; \frac{\partial f}{\partial x}\frac{dx}{dt} + \frac{\partial f}{\partial y}\frac{dy}{dt} = 0 \;\Rightarrow\; \nabla f \cdot \mathbf{r}'(t) = 0.$$

This implies that $\nabla f$ is normal to the velocity vector $\mathbf{r}'(t)$ at every point on the level curve. But $\mathbf{r}'(t)$ is tangent to the level curve at every point, so that $\nabla f$ is normal to the level curve at every point. In particular, $\nabla f(a, b)$ is orthogonal to the level curve at the point $(a, b)$.

Example 8: In what direction is the function defined by $f(x, y) = x e^{2y - x}$ increasing most rapidly at the point $P(2, 1)$, and what is the maximum rate of increase? In what direction is $f$ decreasing most rapidly?

Solution: By the preceding theorem, the gradient of $f$ at $P$ provides the required answer. So, we first find the gradient of $f$:

$$\nabla f = f_x\,\mathbf{i} + f_y\,\mathbf{j} = \left(e^{2y - x} - x e^{2y - x}\right)\mathbf{i} + 2x e^{2y - x}\,\mathbf{j} = e^{2y - x}\left[(1 - x)\,\mathbf{i} + 2x\,\mathbf{j}\right].$$

Now, at the point $P(2, 1)$, we have $\nabla f(2, 1) = -\mathbf{i} + 4\mathbf{j}$. Thus, the most rapid rate of increase is $\|\nabla f(2, 1)\| = \sqrt{17}$ and it occurs in the direction of $-\mathbf{i} + 4\mathbf{j}$. The most rapid rate of decrease therefore occurs in the direction $-\nabla f(2, 1) = \mathbf{i} - 4\mathbf{j}$.

Example 9: Find the direction of maximum increase on the contour plot of the surface given by $f(x, y) = 3x - x^3 - 3xy^2$ from the point $P(0.6, -0.7)$ and sketch the path of steepest ascent.

Solution: The direction of maximum increase at $P$ is given by $\nabla f(0.6, -0.7)$. We have

$$\nabla f = \langle 3 - 3x^2 - 3y^2,\; -6xy \rangle,$$

so that $\nabla f(0.6, -0.7) = \langle 0.45, 2.52 \rangle$. The unit vector in this direction is then $\mathbf{u} \approx \langle 0.176, 0.984 \rangle$, which is the direction of maximum increase. Note that the path of steepest ascent is a curve that remains perpendicular to each level curve through which it passes. Finding the path of steepest ascent exactly is difficult. The unit vector $\mathbf{u}$ does not point towards the maximum (the peak of the surface) at $(1, 0)$. However, a plausible path of steepest ascent is shown in the figure below:
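A path of steepest ascent can be approximated numerically by repeatedly taking a small step in the direction of the gradient. The sketch below does this for $f(x, y) = 3x - x^3 - 3xy^2$ starting from $(0.6, -0.7)$. It is only a rough illustration: the step size `0.01` and the number of steps are arbitrary assumptions, and the simple stepping scheme (Euler's method on the gradient field) is a stand-in for an exact solution of the path.

```python
import math

def grad_f(x, y):
    """Gradient of f(x, y) = 3x - x^3 - 3x y^2."""
    return (3 - 3 * x**2 - 3 * y**2, -6 * x * y)

def steepest_ascent_path(x, y, step=0.01, n_steps=2000):
    """Follow the gradient from (x, y) in small steps; return all points."""
    path = [(x, y)]
    for _ in range(n_steps):
        gx, gy = grad_f(x, y)
        x, y = x + step * gx, y + step * gy
        path.append((x, y))
    return path

gx0, gy0 = grad_f(0.6, -0.7)
norm0 = math.hypot(gx0, gy0)
start_direction = (gx0 / norm0, gy0 / norm0)   # roughly (0.176, 0.984)

path = steepest_ascent_path(0.6, -0.7)
end_point = path[-1]
print(start_direction, end_point)
```

Although the starting direction points almost straight up the $y$-axis, the path bends as the gradient changes and ends near the peak at $(1, 0)$.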

Remarks: The directional derivative and gradient concepts can easily be extended to functions of three or more variables. The basic properties of the gradient and the maximal property remain valid for functions of three or more variables. For example, for a function $f(x, y, z)$ of three variables, the gradient $\nabla f$ is defined by

$$\nabla f = \frac{\partial f}{\partial x}\,\mathbf{i} + \frac{\partial f}{\partial y}\,\mathbf{j} + \frac{\partial f}{\partial z}\,\mathbf{k},$$

and the directional derivative $D_{\mathbf{u}} f$ of $f(x, y, z)$ at the point $P_0(x_0, y_0, z_0)$ in the direction of the unit vector $\mathbf{u}$ is given by

$$D_{\mathbf{u}} f(x_0, y_0, z_0) = \nabla f(x_0, y_0, z_0) \cdot \mathbf{u}.$$

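Example 10: Let $f(x, y, z) = xy \sin(xz)$. Find $\nabla f(1, -2, \pi)$ and then compute the directional derivative of $f$ at $(1, -2, \pi)$ in the direction of the vector $\mathbf{v} = -2\mathbf{i} + 3\mathbf{j} - 5\mathbf{k}$.

Solution: By definition, we have $\nabla f = f_x\,\mathbf{i} + f_y\,\mathbf{j} + f_z\,\mathbf{k}$. We now compute the partial derivatives at $(1, -2, \pi)$:

$$\begin{aligned}
f_x &= \frac{\partial}{\partial x}\left(xy \sin(xz)\right) = y \sin(xz) + xyz \cos(xz) &&\Rightarrow\; f_x(1, -2, \pi) = 2\pi; \\
f_y &= \frac{\partial}{\partial y}\left(xy \sin(xz)\right) = x \sin(xz) &&\Rightarrow\; f_y(1, -2, \pi) = 0; \\
f_z &= \frac{\partial}{\partial z}\left(xy \sin(xz)\right) = x^2 y \cos(xz) &&\Rightarrow\; f_z(1, -2, \pi) = 2.
\end{aligned}$$

Thus, the gradient of $f$ at $(1, -2, \pi)$ is $\nabla f(1, -2, \pi) = 2\pi\,\mathbf{i} + 2\,\mathbf{k}$. To find $D_{\mathbf{v}} f$, we first need to normalize $\mathbf{v}$ to get the unit vector $\mathbf{u}$ in the direction of $\mathbf{v}$:

$$\mathbf{u} = \frac{\mathbf{v}}{\|\mathbf{v}\|} = \frac{-2\mathbf{i} + 3\mathbf{j} - 5\mathbf{k}}{\sqrt{(-2)^2 + 3^2 + (-5)^2}} = \frac{1}{\sqrt{38}}\left(-2\mathbf{i} + 3\mathbf{j} - 5\mathbf{k}\right),$$

and then note that $D_{\mathbf{v}} f = D_{\mathbf{u}} f$. Thus, we have

$$D_{\mathbf{u}} f(1, -2, \pi) = \nabla f(1, -2, \pi) \cdot \mathbf{u} = \left(2\pi\,\mathbf{i} + 2\,\mathbf{k}\right) \cdot \frac{1}{\sqrt{38}}\left(-2\mathbf{i} + 3\mathbf{j} - 5\mathbf{k}\right) = \frac{1}{\sqrt{38}}\left(-4\pi - 10\right) \approx -3.66.$$

Theorem 4: (Normal Property of the Gradient) Suppose the function $f$ is differentiable at the point $P(a, b, c)$ and that the gradient at $P$ satisfies $\nabla f(a, b, c) \neq \mathbf{0}$. Then $\nabla f(a, b, c)$ is orthogonal to the level surface of $f$ through $P$.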

Proof: Let $C$ be any smooth curve on the level surface $f(x, y, z) = k$ that passes through $P$ and has the vector equation $\mathbf{R}(t) = x(t)\,\mathbf{i} + y(t)\,\mathbf{j} + z(t)\,\mathbf{k}$ for all $t$ in some interval $I$. Since $C$ lies on the level surface, any point $(x(t), y(t), z(t))$ on the curve $C$ must satisfy the equation $f(x(t), y(t), z(t)) = k$. Differentiating this equation with respect to $t$, we obtain

$$\frac{d}{dt} f(x(t), y(t), z(t)) = 0.$$

Applying the Chain rule on the left hand side, we get that for all $t \in I$,

$$f_x(x(t), y(t), z(t))\,\frac{dx}{dt} + f_y(x(t), y(t), z(t))\,\frac{dy}{dt} + f_z(x(t), y(t), z(t))\,\frac{dz}{dt} = 0,$$

or

$$\nabla f(x(t), y(t), z(t)) \cdot \frac{d\mathbf{R}}{dt} = 0.$$

In particular, at the point $P(a, b, c)$, we have

$$\nabla f(a, b, c) \cdot \left.\frac{d\mathbf{R}}{dt}\right|_{(a, b, c)} = 0 \qquad \ldots (1)$$

But since the curve is smooth, the velocity vector to the curve $C$ at $P$ satisfies $\left.\dfrac{d\mathbf{R}}{dt}\right|_{(a, b, c)} \neq \mathbf{0}$, and it is given that $\nabla f(a, b, c) \neq \mathbf{0}$. Therefore, from (1), we must have that the vector $\nabla f(a, b, c)$ is normal to the vector $\left.\dfrac{d\mathbf{R}}{dt}\right|_{(a, b, c)}$. That is, $\nabla f(a, b, c)$ is orthogonal to the level surface of $f$ through $P(a, b, c)$.

Remarks: The preceding theorem implies that the gradient vector at a point $P(a, b, c)$ on the level surface is normal to the tangent vector $\mathbf{T} = d\mathbf{R}/dt$ of each curve $C$ on the surface that passes through $P$. Thus, all these tangent vectors lie in a single plane through $P$ with normal vector $\mathbf{N} = \nabla f(a, b, c)$. This plane is the tangent plane to the surface at $P$. Note that since $P$ is an arbitrary point on the level surface, there is a unique tangent plane at every point on the level surface of $f$ where $\nabla f \neq \mathbf{0}$.

Example 11: Find a vector that is normal to the level surface $x^2 + 2xy - yz + 3z^2 = 7$ at the point $P(1, 1, -1)$.

Solution: By the preceding theorem, the gradient vector $\nabla f(1, 1, -1)$ is normal to the level surface $f(x, y, z) = 7$, where $f(x, y, z) = x^2 + 2xy - yz + 3z^2$. First, we have

$$\nabla f = f_x\,\mathbf{i} + f_y\,\mathbf{j} + f_z\,\mathbf{k} = (2x + 2y)\,\mathbf{i} + (2x - z)\,\mathbf{j} + (6z - y)\,\mathbf{k}.$$

At the point $(1, 1, -1)$, we have, therefore, $\nabla f(1, 1, -1) = 4\mathbf{i} + 3\mathbf{j} - 7\mathbf{k}$ as the required normal.

Example 12: Sketch the level curve $f(x, y) = c$ corresponding to $c = 1$ for the function $f(x, y) = x^2 - y^2$ and find a normal vector at the point $P(2, \sqrt{3})$.

Solution: The level curve for $c = 1$ is the hyperbola given by $x^2 - y^2 = 1$ (trace it). The gradient vector is normal to the level curve, by Theorem 3 part (4). We have

$$\nabla f = f_x\,\mathbf{i} + f_y\,\mathbf{j} = 2x\,\mathbf{i} - 2y\,\mathbf{j}.$$

Thus, at the point $(2, \sqrt{3})$, we get that $\nabla f(2, \sqrt{3}) = 4\mathbf{i} - 2\sqrt{3}\,\mathbf{j}$ is the required normal.

Example 13: The set of points $(x, y)$ with $0 \le x \le 5$ and $0 \le y \le 5$ is a square in the first quadrant of the $xy$-plane. Suppose this square is heated in such a way that the temperature at the point $P(x, y)$ is given by $T(x, y) = x^2 + y^2$. In what direction will heat flow from the point $P(3, 4)$?

Solution: The level curves of $T$ are called isothermal curves. From physics it is known that the flow of heat is perpendicular to the isothermal curves, and points in the direction of decreasing temperature. Thus, if $\mathbf{H}(x, y)$ denotes the heat flow at a point $(x, y)$ in the region, then we can express the heat flow as

$$\mathbf{H}(x, y) = -k\,\nabla T(x, y) \qquad \ldots (2)$$

where $k$ is a positive constant (called the thermal conductivity). Since $T(3, 4) = 25$, the point $P(3, 4)$ lies on the isotherm $T(x, y) = 25$, which is part of the circle $x^2 + y^2 = 25$. Because $\nabla T(x, y) = 2x\,\mathbf{i} + 2y\,\mathbf{j}$, we get that $\nabla T(3, 4) = 6\mathbf{i} + 8\mathbf{j}$. Using the heat flow equation (2), we have that the heat flow at $P(3, 4)$ satisfies

$$\mathbf{H}(3, 4) = -k\left(6\mathbf{i} + 8\mathbf{j}\right).$$

Because the thermal conductivity $k$ is positive, the heat flows from $P(3, 4)$ in the direction of the unit vector $\mathbf{u}$ given by

$$\mathbf{u} = \frac{-(6\mathbf{i} + 8\mathbf{j})}{\sqrt{(-6)^2 + (-8)^2}} = -\frac{6}{\sqrt{100}}\,\mathbf{i} - \frac{8}{\sqrt{100}}\,\mathbf{j} = -\frac{3}{5}\,\mathbf{i} - \frac{4}{5}\,\mathbf{j}.$$

Example 14: Find equations for the tangent plane and the normal line at the point $P(1, -1, 2)$ on the surface $S$ given by $x^2 y + y^2 z + z^2 x = 5$.

Solution: The given surface $S$ can be written as the level surface $F(x, y, z) = 5$, where $F(x, y, z) = x^2 y + y^2 z + z^2 x$.

The gradient vector $\nabla F$ is normal to the level surface $S$ at $P(1, -1, 2)$. We have

$$\nabla F(x, y, z) = (2xy + z^2)\,\mathbf{i} + (x^2 + 2yz)\,\mathbf{j} + (y^2 + 2xz)\,\mathbf{k}.$$

Thus, a normal vector at $P(1, -1, 2)$ is $\mathbf{N} = \nabla F(1, -1, 2) = 2\mathbf{i} - 3\mathbf{j} + 5\mathbf{k}$. Hence, if $Q(x, y, z)$ is any point on the plane, then the equation of the tangent plane passing through $P(1, -1, 2)$ with normal vector $\mathbf{N}$ is given by $\overrightarrow{PQ} \cdot \mathbf{N} = 0$, or equivalently

$$2(x - 1) - 3(y + 1) + 5(z - 2) = 0 \quad \text{or} \quad 2x - 3y + 5z = 15.$$

The normal line to the surface at $P(1, -1, 2)$ with direction numbers $\langle 2, -3, 5 \rangle$ is

$$x = 1 + 2t, \quad y = -1 - 3t, \quad z = 2 + 5t, \quad t \in \mathbb{R}.$$
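The tangent plane of this example can be checked numerically by comparing it with the surface itself near the point of tangency. The sketch below takes the surface to be $x^2 y + y^2 z + z^2 x = 5$ and the point to be $(1, -1, 2)$, computes the normal vector from the gradient, and then measures how far the plane is from the surface at a nearby $(x, y)$. The helper names and the nearby sample point are illustrative assumptions.

```python
def F(x, y, z):
    return x**2 * y + y**2 * z + z**2 * x

def grad_F(x, y, z):
    return (2 * x * y + z**2, x**2 + 2 * y * z, y**2 + 2 * x * z)

P = (1.0, -1.0, 2.0)
N = grad_F(*P)                               # normal vector at P
d = sum(n * p for n, p in zip(N, P))         # plane: N . (x, y, z) = d

def z_on_plane(x, y):
    return (d - N[0] * x - N[1] * y) / N[2]

def z_on_surface(x, y, z_guess, steps=50):
    """Newton iteration in z for F(x, y, z) = 5, started near z_guess."""
    z = z_guess
    for _ in range(steps):
        z -= (F(x, y, z) - 5) / (y**2 + 2 * x * z)
    return z

x1, y1 = 1.001, -0.999                       # a point close to (1, -1)
gap = abs(z_on_surface(x1, y1, 2.0) - z_on_plane(x1, y1))
print(N, d, gap)
```

The gap between surface and plane is much smaller than the displacement of about 0.001, which is what tangency means.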

Example 15: Find the equation for the tangent plane and the normal line to the cone $z^2 = x^2 + y^2$ at the point where $x = 3$, $y = 4$, and $z > 0$.

Solution: At the point $(3, 4)$, since $z > 0$, we have $z = \sqrt{3^2 + 4^2} = \sqrt{25} = 5$. If we write $F(x, y, z) = x^2 + y^2 - z^2$, then the cone can be regarded as the level surface $F(x, y, z) = 0$.

The partial derivatives of $F$ are

$$F_x(x, y, z) = 2x, \quad F_y(x, y, z) = 2y, \quad F_z(x, y, z) = -2z.$$

Therefore, the direction numbers of the normal to the cone at $(3, 4, 5)$ are

$$F_x(3, 4, 5) = 6, \quad F_y(3, 4, 5) = 8, \quad F_z(3, 4, 5) = -10.$$

Thus the tangent plane to the cone at $(3, 4, 5)$ has the equation

$$6(x - 3) + 8(y - 4) - 10(z - 5) = 0 \;\Rightarrow\; 3x + 4y - 5z = 0.$$

The normal line is given parametrically by the equations

$$x = 3 + 6t, \quad y = 4 + 8t, \quad z = 5 - 10t, \quad t \in \mathbb{R}.$$

Extrema of functions of two variables:

Definition: (Absolute Extrema)

(1) We call $f(a, b)$ the absolute maximum of $f$ on the region $R$ if $f(a, b) \ge f(x, y)$ for all $(x, y) \in R$.

(2) We call $f(a, b)$ the absolute minimum of $f$ on $R$ if $f(a, b) \le f(x, y)$ for all $(x, y) \in R$.

(3) In either case (1) or (2), $f(a, b)$ is called an absolute extremum of $f$.

Theorem 1: (Extreme value theorem)

Let f be a continuous function of two variables x and y defined on a closed and bounded region R in the xy plane.

a) There is at least one point in $R$ where $f$ takes on a minimum value.
b) There is at least one point in $R$ where $f$ takes on a maximum value.

Definition: (Relative Extrema)

Let $f$ be a function defined on a region $R$ containing $(a, b)$.

1) The function $f$ has a relative minimum at $(a, b)$ if $f(x, y) \ge f(a, b)$ for all $(x, y)$ in an open disk in $R$ containing $(a, b)$.
2) The function $f$ has a relative maximum at $(a, b)$ if $f(x, y) \le f(a, b)$ for all $(x, y)$ in an open disk in $R$ containing $(a, b)$.

The figure given below shows peaks and valleys on the graph of a function $f$, which are respectively the points of relative maximum and relative minimum of $f$.

[Figure: relative extrema, showing a local maximum and a local minimum]

Definition: (Critical Point)

Let \(f\) be defined on an open region \(R\) containing the point \((x_0,y_0)\). The point \((x_0,y_0)\) is a critical point of \(f\) if one of the following is true:

1) \(f_x(x_0,y_0) = 0\) and \(f_y(x_0,y_0) = 0\).

2) \(f_x(x_0,y_0)\) or \(f_y(x_0,y_0)\) does not exist.

Definition: (Saddle Point)

A critical point \((x_0,y_0)\) is called a saddle point of \(f\) if every open disk centered at \((x_0,y_0)\) contains points in the domain of \(f\) that satisfy \(f(x,y) > f(x_0,y_0)\) as well as points in the domain of \(f\) that satisfy \(f(x,y) < f(x_0,y_0)\).

Remarks: Note that if \(f\) is differentiable at a critical point \((x_0,y_0)\), then

\[ \nabla f(x_0,y_0) = f_x(x_0,y_0)\,\mathbf{i} + f_y(x_0,y_0)\,\mathbf{j} = 0\,\mathbf{i} + 0\,\mathbf{j}. \]

Therefore, every directional derivative of \(f\) at \((x_0,y_0)\) must be 0. This implies that the graph of \(f\) has a horizontal tangent plane at the point \((x_0, y_0, f(x_0,y_0))\). Since some critical points yield saddle points, a critical point \((x_0,y_0)\) is only a possible location for a relative extremum. See figures below:

Theorem 2: Let \(f\) be a function of two variables \(x\) and \(y\) defined on an open region \(R\) containing the point \((x_0,y_0)\). If \(f\) has a relative extremum at \((x_0,y_0)\) and the partial derivatives \(f_x\) and \(f_y\) both exist at \((x_0,y_0)\), then \(f_x(x_0,y_0) = f_y(x_0,y_0) = 0\).

Proof: Let \(F(x) = f(x,y_0)\). Then \(F\) must have a relative extremum at \(x = x_0\), so that \(F'(x_0) = 0\). This implies that \(f_x(x_0,y_0) = 0\). Similarly, \(G(y) = f(x_0,y)\) has a relative extremum at \(y = y_0\), so \(G'(y_0) = 0\), which implies that \(f_y(x_0,y_0) = 0\). Thus, we must have both \(f_x(x_0,y_0) = 0\) and \(f_y(x_0,y_0) = 0\).

Remarks: The preceding theorem implies that there is a horizontal tangent plane at each extreme point where the first partial derivatives exist. However, the theorem does not say that whenever a horizontal tangent plane occurs at a point \(P\), there must be an extremum there. All that we can deduce is that such a point \(P\) is a possible location for a relative extremum.

Example 1: Discuss the nature of the critical point \((0,0)\) for the quadric surfaces a) \(z = x^2 + y^2\). b) \(z = 1 - x^2 - y^2\).

Solution: The graphs of the quadric surfaces are shown below.

Let \(f(x,y) = x^2 + y^2\) and \(g(x,y) = 1 - x^2 - y^2\). A point \((a,b)\) is a critical point of a differentiable function \(F(x,y)\) if \(F_x(a,b) = 0\) and \(F_y(a,b) = 0\). We use these equations to find the critical points of \(f\) and \(g\) and discuss their nature:

a) \(f_x(x,y) = 2x\) and \(f_y(x,y) = 2y\). By definition of a critical point, we must have \(2x = 0\) and \(2y = 0\), that is, \((x,y) = (0,0)\). Thus \((0,0)\) is the only critical point. The function \(f\) has a relative minimum at \((0,0)\) because \(x^2 + y^2 > 0 = f(0,0)\) for all \((x,y) \neq (0,0)\).

b) \(g_x(x,y) = -2x\) and \(g_y(x,y) = -2y\), so again \((0,0)\) is the only critical point. Since \(z = 1 - x^2 - y^2 = 1 - (x^2 + y^2)\) and \(x^2 + y^2 \ge 0\) for all \(x\) and \(y\), it follows that \(z \le 1\) for all \(x\) and \(y\), with the value \(z = 1\) occurring when \((x,y) = (0,0)\). That is, \((0,0)\) is a point of relative maximum.

Example 2: Determine the critical points of \(h(x,y) = y^2 - x^2\) and discuss their nature.

Solution: The graph of the quadric surface \(z = y^2 - x^2\) is shown below.

Since \(h_x(x,y) = -2x\) and \(h_y(x,y) = 2y\), the equations \(-2x = 0 = 2y\) imply that \((0,0)\) is the only critical point of \(h\). At \((0,0)\), we have \(z = 0\). But every open disk \(D\) centered at \((0,0)\) contains points where \(h\) is negative and points where \(h\) is positive: on the \(x\)-axis (\(y = 0\)) we have \(h(x,0) = -x^2 < 0\) for \(x \neq 0\), and on the \(y\)-axis (\(x = 0\)) we have \(h(0,y) = y^2 > 0\) for \(y \neq 0\). Thus, the function \(h\) has neither a relative maximum nor a relative minimum at \((0,0)\), and hence \((0,0)\) is a saddle point of the hyperbolic paraboloid \(h(x,y) = y^2 - x^2\).

Example 3: Determine the relative extrema of

a) \(f(x,y) = 2x^2 + y^2 + 8x - 6y + 20\). b) \(g(x,y) = 1 - (x^2 + y^2)^{1/3}\).

Solution: The graphs of the two surfaces are shown below:

[Figure: graphs of the surfaces in parts (a) and (b).]

We begin by finding the critical points of \(f\) and \(g\), which are found using their partial derivatives. The partial derivatives of \(f\) and \(g\) are

\[ f_x(x,y) = 4x + 8, \quad f_y(x,y) = 2y - 6; \]

\[ g_x(x,y) = -\frac{2x}{3(x^2+y^2)^{2/3}}, \quad g_y(x,y) = -\frac{2y}{3(x^2+y^2)^{2/3}}. \]

(a) The partial derivatives \(f_x(x,y) = 4x + 8\) and \(f_y(x,y) = 2y - 6\) are defined for all \(x\) and \(y\). Therefore, the only critical points of \(f\) are those for which both partials \(f_x\) and \(f_y\) are 0. To locate these points, we solve the equations \(4x + 8 = 0\), \(2y - 6 = 0\) simultaneously. This gives \(x = -2\), \(y = 3\), so \((-2,3)\) is the only critical point. Since \(f(-2,3) = 3\) and \(f(x,y) = 2(x+2)^2 + (y-3)^2 + 3 > 3\) for all \((x,y) \neq (-2,3)\), we conclude that a relative minimum of \(f\) occurs at \((-2,3)\), and the value of the relative minimum is \(f(-2,3) = 3\).

(b) Both partial derivatives \(g_x\) and \(g_y\) exist for all \((x,y)\) except \((0,0)\). The critical points are those points at which one of the partials does not exist or both are 0. Where the partials exist, \(g_x(x,y) = 0 = g_y(x,y)\) would require \(x = 0\) and \(y = 0\), so these equations have no solution away from the origin. Hence \((0,0)\), where the partials fail to exist, is the only critical point of \(g\). Since \(g(0,0) = 1\) and \(g(x,y) = 1 - (x^2+y^2)^{1/3} < 1\) for all \((x,y) \neq (0,0)\), we find that \(g\) has a relative maximum at \((0,0)\).

Second Partials Test:

In the preceding examples, it was relatively easy to find the relative extrema of the given functions or to determine when a critical point is a saddle point, because the arguments were purely algebraic. For more complicated functions, it is better to rely on the analytical criterion given by the following second partials test:

Theorem 3: Let \(f\) have continuous second order partial derivatives on an open region containing a point \((x_0,y_0)\) for which

\[ f_x(x_0,y_0) = 0 \quad \text{and} \quad f_y(x_0,y_0) = 0. \]

To test for relative extrema of \(f\) at \((x_0,y_0)\), consider the quantity, called the discriminant of \(f\), defined by the equation

\[ D(x_0,y_0) = f_{xx}(x_0,y_0)\, f_{yy}(x_0,y_0) - \left[ f_{xy}(x_0,y_0) \right]^2. \]

(1) A relative maximum occurs at \((x_0,y_0)\) if \(D(x_0,y_0) > 0\) and \(f_{xx}(x_0,y_0) < 0\) (or equivalently, \(D(x_0,y_0) > 0\) and \(f_{yy}(x_0,y_0) < 0\)).

(2) A relative minimum occurs at \((x_0,y_0)\) if \(D(x_0,y_0) > 0\) and \(f_{xx}(x_0,y_0) > 0\) (or equivalently, \(D(x_0,y_0) > 0\) and \(f_{yy}(x_0,y_0) > 0\)).

(3) A saddle point occurs at \((x_0,y_0)\) if \(D(x_0,y_0) < 0\).

(4) The test is inconclusive if \(D(x_0,y_0) = 0\).
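The four cases of the second partials test translate directly into a small decision procedure. The sketch below is only an illustration: the function name `classify_critical_point` is made up for this example, and it assumes the caller has already verified that the point is a critical point and supplies the values of the three second partials there.

```python
def classify_critical_point(fxx, fyy, fxy):
    """Second partials test from the values of f_xx, f_yy, f_xy at a critical point."""
    d = fxx * fyy - fxy ** 2  # discriminant D
    if d > 0:
        # D > 0 forces f_xx and f_yy to be nonzero with the same sign.
        return "relative maximum" if fxx < 0 else "relative minimum"
    if d < 0:
        return "saddle point"
    return "inconclusive"

# f(x, y) = x^2 + y^2 at (0, 0): f_xx = 2, f_yy = 2, f_xy = 0
print(classify_critical_point(2, 2, 0))
# h(x, y) = y^2 - x^2 at (0, 0): h_xx = -2, h_yy = 2, h_xy = 0
print(classify_critical_point(-2, 2, 0))
```

The two sample calls reuse the functions of Examples 1 and 2, whose classification was obtained there by algebraic arguments.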

Example 4: Find the relative extrema of

(a) \(f(x,y) = -x^3 + 4xy - 2y^2 + 1\). (b) \(g(x,y) = x^2y^2\).

Solution: (a) We first find the critical points of \(f\). Because the partial derivatives

\[ f_x(x,y) = -3x^2 + 4y \quad \text{and} \quad f_y(x,y) = 4x - 4y \]

both exist for all \(x\) and \(y\), the only critical points are those for which both \(f_x(x,y)\) and \(f_y(x,y)\) are zero. Thus, we obtain \(-3x^2 + 4y = 0\) and \(4x - 4y = 0\). The second equation gives \(y = x\), and substituting into the first gives \(x(4 - 3x) = 0\), so that \(x = y = 0\) or \(x = y = 4/3\). That is, the critical points of \(f\) are \((0,0)\) and \((4/3, 4/3)\).

To test these points for relative extrema, we now compute the second partials:

\[ f_{xx}(x,y) = -6x, \quad f_{yy}(x,y) = -4, \quad f_{xy}(x,y) = 4. \]

For the critical point \((0,0)\), we have \(f(0,0) = 1\), and the discriminant of \(f\) is

\[ D(0,0) = f_{xx}(0,0)\, f_{yy}(0,0) - \left[ f_{xy}(0,0) \right]^2 = 0 - 16 < 0, \]

so \(f\) has a saddle point at \((0,0)\). For the critical point \((4/3, 4/3)\),

\[ D(4/3,4/3) = f_{xx}(4/3,4/3)\, f_{yy}(4/3,4/3) - \left[ f_{xy}(4/3,4/3) \right]^2 = (-8)(-4) - 16 = 16 > 0, \]

and \(f_{xx}(4/3,4/3) = -8 < 0\). So, by the second partials test, \(f\) has a relative maximum at \((4/3, 4/3)\).

(b) Since both the partial derivatives \(g_x(x,y) = 2xy^2\) and \(g_y(x,y) = 2x^2y\) exist for all \(x\) and \(y\), and both are zero if \(x = 0\) or \(y = 0\), every point along the \(x\)-axis or \(y\)-axis is a critical point of \(g\). To determine the nature of these critical points, we find the second partials of \(g\):

\[ g_{xx}(x,y) = 2y^2, \quad g_{yy}(x,y) = 2x^2, \quad g_{xy}(x,y) = 4xy. \]

If either \(x = 0\) or \(y = 0\), then

\[ D(x,y) = g_{xx}(x,y)\, g_{yy}(x,y) - \left[ g_{xy}(x,y) \right]^2 = 4x^2y^2 - 16x^2y^2 = -12x^2y^2 = 0. \]

So, the second partials test fails. However, because \(g(x,y) = 0\) for every point along the \(x\)-axis or \(y\)-axis, and \(g(x,y) = x^2y^2 > 0\) for all other points, we conclude that each of these critical points yields an absolute minimum. (See figure below)
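The discriminant values in part (a) of Example 4 can be reproduced with exact rational arithmetic. This is a minimal sketch: the helper names `second_partials` and `discriminant` are hypothetical, and the second partials are hard-coded from the formulas derived in the solution rather than computed symbolically.

```python
from fractions import Fraction

def second_partials(x, y):
    """(f_xx, f_yy, f_xy) for f(x, y) = -x^3 + 4xy - 2y^2 + 1."""
    return -6 * x, Fraction(-4), Fraction(4)

def discriminant(x, y):
    """D = f_xx * f_yy - (f_xy)^2 evaluated at (x, y)."""
    fxx, fyy, fxy = second_partials(x, y)
    return fxx * fyy - fxy ** 2

for point in [(Fraction(0), Fraction(0)), (Fraction(4, 3), Fraction(4, 3))]:
    fxx, _, _ = second_partials(*point)
    print(point, "D =", discriminant(*point), "f_xx =", fxx)
```

Using `Fraction` avoids rounding at the point \((4/3, 4/3)\), so the sign of the discriminant is not affected by floating-point error.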

Example 5: Find the absolute extrema of the function \(f(x,y) = \sin(xy)\) on the closed region given by \(0 \le x \le \pi\), \(0 \le y \le 1\).

Solution: From the partial derivatives

\[ f_x(x,y) = y\cos(xy), \quad f_y(x,y) = x\cos(xy), \]

we obtain that the critical points in the domain of \(f\) are \((0,0)\) and each point lying in the domain and on the hyperbola given by \(xy = \pi/2\). Since every point on the hyperbola \(xy = \pi/2\) yields the value \(f(x,y) = \sin(\pi/2) = 1\), which is the largest value the sine function can take, all the critical points lying on the hyperbola \(xy = \pi/2\) are points of absolute maximum.

The critical point \((0,0)\) yields an absolute minimum of \(f\) because for every point in the domain \(0 \le x \le \pi\), \(0 \le y \le 1\), we have \(0 \le xy \le \pi\), which implies that \(0 \le \sin(xy) \le 1\), that is, \(0 \le f(x,y) \le 1\), while \(f(0,0) = 0\).

To locate other absolute extrema, we consider the points of the boundary of the domain: \(x = 0\), \(x = \pi\), \(y = 0\), \(y = 1\). Since \(\sin(xy) = 0\) for all the points on the \(x\)-axis, on the \(y\)-axis and at the point \((\pi, 1)\), each of these points yields an absolute minimum of \(f\). (See figure below)

Example 6: Find all critical points on the graph of the following functions and classify each point as a relative extremum or a saddle point:

(1) ( ) 3 3, 8 24f x y x xy y= + . (2) ( ) 2 4,g x y x y= .

(Try on your own and draw the rough sketch of the surface in each case.)

Example 7: Find the absolute extrema of the function \(h(x,y) = e^{x^2 - y^2}\) over the disk \(x^2 + y^2 \le 1\).

Solution: Since the partial derivatives \(h_x(x,y) = 2xe^{x^2-y^2}\) and \(h_y(x,y) = -2ye^{x^2-y^2}\) exist for all the points \((x,y)\), and both are zero only when \(x = 0\) and \(y = 0\), it follows that \((0,0)\) is the only critical point of \(h\) and it is inside the unit disk \(x^2 + y^2 \le 1\). Also, \(h(0,0) = e^0 = 1\). The second partials at the origin are \(h_{xx}(0,0) = 2\), \(h_{yy}(0,0) = -2\) and \(h_{xy}(0,0) = 0\), so the discriminant of \(h\) at \((0,0)\) is \(D(0,0) = (2)(-2) - 0^2 = -4 < 0\). By the second partials test, \((0,0)\) is a saddle point, so \(h\) has no extremum there and the absolute extrema must occur on the boundary of the disk.

Now we examine the values of \(h\) on the boundary curve \(x^2 + y^2 = 1\). On this boundary curve, \(y^2 = 1 - x^2\), so we obtain \(h(x,y) = e^{x^2 - (1 - x^2)} = e^{2x^2 - 1}\). In order to find the absolute extrema of \(h\), we need to find the largest and smallest values of

\[ F(x) = e^{2x^2 - 1} \quad \text{for } -1 \le x \le 1. \]

Since \(F'(x) = 4xe^{2x^2-1} = 0\) only when \(x = 0\), and at \(x = 0\) we have \(y = \pm 1\), the points \((0,1)\) and \((0,-1)\) are boundary critical points. At the endpoints of the interval \([-1,1]\), the corresponding points are \((1,0)\) and \((-1,0)\), which are also candidates for extrema of \(h\). Thus, by computing the values of \(h\) at all of these candidate points:

\(h(0,1) = e^{-1}\), \(h(0,-1) = e^{-1}\), \(h(1,0) = e\), \(h(-1,0) = e\), we find that the absolute maximum value of \(h\) on the given unit disk is \(e\), which occurs at \((1,0)\) and \((-1,0)\); the absolute minimum value is \(e^{-1}\), which occurs at \((0,1)\) and \((0,-1)\).

Example 8: Find the point on the plane \(x + 2y + z = 5\) that is closest to the point \(P(0,3,4)\).

Solution: Let \(Q(x,y,z)\) be any point on the plane \(x + 2y + z = 5\). Then \(z = 5 - x - 2y\) and the distance from \(P\) to \(Q\) is \(d = \sqrt{x^2 + (y-3)^2 + (5 - x - 2y - 4)^2}\). Since the minimum value of \(d\) will occur at the same points where \(d^2\) is minimized, we minimize \(d^2 = f(x,y)\), where

\[ f(x,y) = x^2 + (y-3)^2 + (1 - x - 2y)^2. \]

First, we find the critical points of \(f\) by solving the system of equations

\[ f_x(x,y) = 2x - 2(1 - x - 2y) = 4x + 4y - 2 = 0, \]

\[ f_y(x,y) = 2(y-3) - 4(1 - x - 2y) = 4x + 10y - 10 = 0. \]

Thus, we obtain \(x = -5/6\), \(y = 4/3\). Since \(f_{xx} = 4\), \(f_{yy} = 10\), \(f_{xy} = 4\), we find that the discriminant of \(f\) at the critical point \((-5/6, 4/3)\) is

\[ D(-5/6, 4/3) = f_{xx}\, f_{yy} - \left[ f_{xy} \right]^2 = (4)(10) - 16 = 24 > 0, \]

and \(f_{xx}(-5/6, 4/3) = 4 > 0\), so a relative minimum occurs at \((-5/6, 4/3)\). Note that this relative minimum must also be an absolute minimum because there must be exactly one point on the plane that is closest to the given point. The corresponding \(z\) value is

\[ z = 5 - \left(-\tfrac{5}{6}\right) - 2\left(\tfrac{4}{3}\right) = \tfrac{19}{6}. \]

Thus, the closest point on the plane is \(Q(-5/6, 4/3, 19/6)\) and the minimum distance is

\[ d = \sqrt{\left(-\tfrac{5}{6}\right)^2 + \left(\tfrac{4}{3} - 3\right)^2 + \left(1 + \tfrac{5}{6} - 2 \cdot \tfrac{4}{3}\right)^2} = \sqrt{\tfrac{25}{6}} = \frac{5}{\sqrt{6}}. \]
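The answer to Example 8 can be cross-checked by a different route from the one used in the solution: instead of minimizing \(d^2\), the sketch below projects \(P\) orthogonally onto the plane using its normal vector \(\langle 1, 2, 1 \rangle\). The function name `closest_point_on_plane` is an assumption made for this illustration.

```python
import math

def closest_point_on_plane(normal, d, p):
    """Orthogonal projection of p onto the plane normal . X = d, plus the distance."""
    n_dot_p = sum(n * c for n, c in zip(normal, p))
    n_sq = sum(n * n for n in normal)
    t = (d - n_dot_p) / n_sq                       # step along the normal from p
    q = tuple(c + t * n for n, c in zip(normal, p))
    return q, abs(d - n_dot_p) / math.sqrt(n_sq)

# Plane x + 2y + z = 5 and point P(0, 3, 4) from Example 8.
q, dist = closest_point_on_plane((1, 2, 1), 5, (0, 3, 4))
print(q, dist)
```

Both approaches should agree, since the closest point on a plane is the foot of the perpendicular dropped from \(P\).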

Example 9: A rectangular box is resting on the \(xy\)-plane with one vertex at the origin. The opposite vertex lies on the plane \(6x + 4y + 3z = 24\). Find the maximum volume of the box. See figure below.

Solution: Let \(x\), \(y\), \(z\) represent the length, width and height of the box, respectively. Because one vertex of the box lies at the origin and the vertex opposite to this vertex lies on the plane \(6x + 4y + 3z = 24\), the point \((x,y,z)\) must satisfy the equation of this plane. Thus, we get

\[ z = \tfrac{1}{3}(24 - 6x - 4y). \]

Thus, we can write the volume \(V = xyz\) of the box as a function of two variables as follows:

\[ V(x,y) = xy \cdot \tfrac{1}{3}(24 - 6x - 4y) = \tfrac{1}{3}\left(24xy - 6x^2y - 4xy^2\right), \]

where the domain of \(V\) is the triangle in the \(xy\)-plane bounded by the lines \(x = 0\), \(y = 0\) and \(3x + 2y - 12 = 0\). By setting the first partial derivatives equal to 0,

\[ V_x(x,y) = \tfrac{1}{3}\left(24y - 12xy - 4y^2\right) = \tfrac{y}{3}(24 - 12x - 4y) = 0, \]

\[ V_y(x,y) = \tfrac{1}{3}\left(24x - 6x^2 - 8xy\right) = \tfrac{x}{3}(24 - 6x - 8y) = 0, \]

we obtain that the critical points are \((0,0)\) and \((4/3, 2)\). At \((0,0)\), the volume is 0, so that point does not yield a maximum value. At the point \((4/3, 2)\), we apply the second partials test:

\[ V_{xx} = -4y, \quad V_{yy} = -\frac{8x}{3}, \quad V_{xy} = \tfrac{1}{3}(24 - 12x - 8y). \]

Because

\[ V_{xx}(4/3,2)\, V_{yy}(4/3,2) - \left[ V_{xy}(4/3,2) \right]^2 = (-8)\left(-\tfrac{32}{9}\right) - \left(-\tfrac{8}{3}\right)^2 = \tfrac{64}{3} > 0 \]

and \(V_{xx}(4/3,2) = -8 < 0\), we conclude that the maximum volume of the box occurs at \((x,y) = (4/3, 2)\) and the maximum volume is

\[ V(4/3, 2) = \tfrac{1}{3}\left[ 24\left(\tfrac{4}{3}\right)(2) - 6\left(\tfrac{4}{3}\right)^2(2) - 4\left(\tfrac{4}{3}\right)(2)^2 \right] = \tfrac{64}{9} \text{ cubic units.} \]

Note that the volume is 0 at the boundary points of the triangular domain of \(V\).
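The result of Example 9 can be verified with exact arithmetic. The sketch below checks that both first partials vanish at \((4/3, 2)\) and then compares the volume there with the largest volume found on a regular grid covering the triangular domain. The helper names and the grid spacing of \(1/12\) are choices made for this illustration, not part of the notes.

```python
from fractions import Fraction

def volume(x, y):
    """V(x, y) = (1/3)(24xy - 6x^2 y - 4xy^2)."""
    return Fraction(1, 3) * (24 * x * y - 6 * x**2 * y - 4 * x * y**2)

def gradient(x, y):
    """First partials (V_x, V_y) from the solution of Example 9."""
    vx = Fraction(1, 3) * (24 * y - 12 * x * y - 4 * y**2)
    vy = Fraction(1, 3) * (24 * x - 6 * x**2 - 8 * x * y)
    return vx, vy

x0, y0 = Fraction(4, 3), Fraction(2)
print(gradient(x0, y0), volume(x0, y0))

# Grid points (i/12, j/12) inside the triangle 3x + 2y <= 12, x >= 0, y >= 0.
best = max(
    volume(Fraction(i, 12), Fraction(j, 12))
    for i in range(49)
    for j in range(73)
    if 3 * i + 2 * j <= 144
)
print(best)
```

A grid search is not a proof, but it gives a sanity check that no sampled point of the domain beats the critical point.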

Example 10: An electronics manufacturer determines that the profit (in dollars) obtained by producing \(x\) units of a DVD player and \(y\) units of a DVD recorder is approximated by the model \(P(x,y) = 8x + 10y - 0.001(x^2 + xy + y^2) - 10{,}000\). Find the production level that produces a maximum profit. What is the maximum profit?

Solution: The partial derivatives of the profit function are

\(P_x(x,y) = 8 - 0.001(2x + y)\) and \(P_y(x,y) = 10 - 0.001(x + 2y)\). By setting these partial derivatives equal to 0, and after simplifying, we obtain the following system of equations:

\[ 2x + y = 8000, \]

\[ x + 2y = 10{,}000. \]

Solving this system produces \(x = 2000\) and \(y = 4000\). The second partial derivatives of \(P\) at \((x,y) = (2000, 4000)\) are

\[ P_{xx}(2000,4000) = -0.002, \quad P_{yy}(2000,4000) = -0.002, \quad P_{xy}(2000,4000) = -0.001. \]

Because \(P_{xx}(2000,4000) < 0\) and

\[ P_{xx}(2000,4000)\, P_{yy}(2000,4000) - \left[ P_{xy}(2000,4000) \right]^2 = (-0.002)^2 - (-0.001)^2 > 0, \]

we conclude that the production level of \(x = 2000\) units and \(y = 4000\) units yields a maximum profit. The maximum profit is \(P(2000,4000) = 18{,}000\), that is, $18,000.

Remarks: Note that in the preceding example, it was assumed that the manufacturing plant is able to produce the required number of units to yield a maximum profit. In actual practice, the production would be bounded by physical constraints. We will next study such constrained optimization problems and learn how to deal with them by a very ingenious technique called the method of Lagrange multipliers.
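Returning to the unconstrained profit model of Example 10, the first-order conditions form a 2-by-2 linear system, so they can be solved by Cramer's rule and the profit evaluated directly. This is a minimal sketch; `solve_2x2` and `profit` are names introduced only for this illustration.

```python
def profit(x, y):
    """P(x, y) = 8x + 10y - 0.001(x^2 + xy + y^2) - 10,000."""
    return 8 * x + 10 * y - 0.001 * (x**2 + x * y + y**2) - 10_000

def solve_2x2(a, b, c, d, e, f):
    """Solve a*x + b*y = e, c*x + d*y = f by Cramer's rule (assumes a*d - b*c != 0)."""
    det = a * d - b * c
    return (e * d - b * f) / det, (a * f - e * c) / det

# 2x + y = 8000 and x + 2y = 10,000 from setting P_x = P_y = 0.
x, y = solve_2x2(2, 1, 1, 2, 8000, 10_000)
print(x, y, profit(x, y))
```

The profit is computed in floating point, so the printed value may differ from the exact figure by a tiny rounding error.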

The Method of Lagrange Multipliers

Many optimization problems may have restrictions, or constraints, on the values that can be used to produce the optimal solution. Such constraints tend to complicate optimization problems because the optimal solution can occur at a boundary point of the domain. For solving such problems, we use the Method of Lagrange Multipliers.

To motivate the method of Lagrange multipliers, suppose that we are trying to maximize a function \(f(x,y)\) subject to the constraint \(g(x,y) = 0\). Geometrically, this means that we are looking for a point \((x_0,y_0)\) in the domain of \(f\) and on the graph of the constraint curve \(g(x,y) = 0\) at which \(f(x,y)\) is as large as possible. To help locate such a point, let us construct all the level curves of \(f(x,y)\) in the same coordinate system as the graph of \(g(x,y) = 0\).

In figure (a), each point of intersection of \(g(x,y) = 0\) with a level curve is a candidate for a solution, since these points lie on the constraint curve. Among the seven such intersections shown in the figure, the maximum value of \(f(x,y)\) occurs at the intersection \((x_0,y_0)\) where \(f(x,y)\) has a value of 400. Note that at \((x_0,y_0)\), the constraint curve and the level curve just touch and thus have a common tangent line at this point. Since \(\nabla f(x_0,y_0)\) is normal to the level curve \(f(x,y) = 400\) at \((x_0,y_0)\), and since \(\nabla g(x_0,y_0)\) is normal to the constraint curve \(g(x,y) = 0\) at \((x_0,y_0)\), we conclude that the vectors \(\nabla f(x_0,y_0)\) and \(\nabla g(x_0,y_0)\) must be parallel. That is,

\[ \nabla f(x_0,y_0) = \lambda \nabla g(x_0,y_0) \qquad (*) \]

for some scalar \(\lambda\). The same condition holds at points on the constraint curve where \(f(x,y)\) has a minimum. For example, if the level curves are as shown in figure (b), then the minimum value of \(f(x,y)\) occurs where the constraint curve just touches a level curve. Thus, to find the maximum or minimum of \(f(x,y)\) subject to the constraint \(g(x,y) = 0\), we look for points at which equation (*) holds. This is the method of Lagrange multipliers. The scalar \(\lambda\) is called a Lagrange multiplier.

To see how this method works, let us look at the following problem. Suppose we want to find the greatest and the smallest values that the function \(f(x,y) = xy\) takes on the ellipse

\[ \frac{x^2}{8} + \frac{y^2}{2} = 1. \]

That is, we want to find the extreme values of \(f(x,y) = xy\) subject to the constraint

\[ g(x,y) = \frac{x^2}{8} + \frac{y^2}{2} - 1 = 0. \]

The level curves of the function \(f(x,y) = xy\) are the hyperbolas \(xy = c\), where \(c\) is a constant (see figure below).

The farther the hyperbolas lie from the origin, the larger the absolute value of \(f\). We want to find the extreme values of \(f(x,y)\) given that the point \((x,y)\) is in the domain of \(f\) and also lies on the ellipse \(x^2 + 4y^2 = 8\). Which hyperbolas intersecting the ellipse lie farthest from the origin? The hyperbolas that just graze the ellipse, the ones that are tangent to it, are farthest. To find the appropriate hyperbola, we use the fact that two curves are tangent at a point if and only if their gradient vectors are parallel. This means that \(\nabla f(x,y)\) must be a scalar multiple of \(\nabla g(x,y)\) at the point of tangency, so that

\[ \nabla f(x,y) = \lambda \nabla g(x,y) \iff y\,\mathbf{i} + x\,\mathbf{j} = \frac{\lambda}{4}x\,\mathbf{i} + \lambda y\,\mathbf{j}. \]

Thus, we first find the values of \(x\), \(y\) and \(\lambda\) for which \(y\,\mathbf{i} + x\,\mathbf{j} = \frac{\lambda}{4}x\,\mathbf{i} + \lambda y\,\mathbf{j}\) and \(g(x,y) = 0\). Solving the first equation, we get \(y = \frac{\lambda}{4}x\) and \(x = \lambda y\), so that \(y = \frac{\lambda^2}{4}y\), that is, \(y = 0\) or \(\lambda = \pm 2\). We now consider these two cases.

Case 1: If \(y = 0\), then \(x = y = 0\). But \((0,0)\) does not satisfy the equation \(g(x,y) = 0\), that is, \((0,0)\) does not lie on the ellipse. Hence, \(y \neq 0\).

Case 2: If \(y \neq 0\), then \(\lambda = \pm 2\) and \(x = \pm 2y\). Substituting this in the equation \(g(x,y) = 0\), we obtain \((\pm 2y)^2 + 4y^2 = 8\), so \(4y^2 + 4y^2 = 8\) and \(y = \pm 1\). Therefore, the function \(f(x,y) = xy\) takes on its extreme values on the ellipse at the four points \((\pm 2, 1)\), \((\pm 2, -1)\). Hence, the extreme values are \(xy = 2\) and \(xy = -2\).
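The extreme values of \(xy\) on the ellipse can also be checked by brute force, without Lagrange multipliers. The sketch below assumes the standard parametrization \(x = \sqrt{8}\cos t\), \(y = \sqrt{2}\sin t\) of the ellipse \(x^2/8 + y^2/2 = 1\) and samples \(f\) along it; the function name and the number of samples are arbitrary choices for this illustration.

```python
import math

def f_on_ellipse(t):
    """f(x, y) = xy at the point of x^2/8 + y^2/2 = 1 with parameter t."""
    x = math.sqrt(8) * math.cos(t)
    y = math.sqrt(2) * math.sin(t)
    return x * y

n = 3600  # samples per revolution; t = pi/4 and t = 3*pi/4 fall on the grid
values = [f_on_ellipse(2 * math.pi * k / n) for k in range(n)]
print(max(values), min(values))
```

Along this parametrization \(xy = 4\cos t \sin t = 2\sin 2t\), which makes the extreme values \(\pm 2\) visible at a glance.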

Now, we state and prove the necessary conditions for the existence of Lagrange multipliers.

Theorem: (Lagrange Theorem)

Let \(f\) and \(g\) have continuous first order partial derivatives such that \(f\) has an extremum at a point \((a,b)\) on the smooth constraint curve \(g(x,y) = c\). If \(\nabla g(a,b) \neq \mathbf{0}\), then there is a real number \(\lambda\) such that

\[ \nabla f(a,b) = \lambda \nabla g(a,b). \]

Proof: Let us represent the smooth curve given by \(g(x,y) = c\) by the vector-valued function

\[ \mathbf{r}(t) = x(t)\,\mathbf{i} + y(t)\,\mathbf{j}, \quad \mathbf{r}'(t) \neq \mathbf{0}, \]

where \(x'\) and \(y'\) are continuous on an open interval \(I\). Define the function \(h\) on \(I\) as \(h(t) = f(x(t), y(t))\). Then, since \(f(a,b)\) is an extreme value of \(f\), we have that

\[ h(t_0) = f(x(t_0), y(t_0)) = f(a,b) \]

is an extreme value of \(h\). This implies that \(h'(t_0) = 0\). But by the Chain rule, we have

\[ h'(t_0) = f_x(a,b)\,x'(t_0) + f_y(a,b)\,y'(t_0) = \nabla f(a,b) \cdot \mathbf{r}'(t_0) = 0. \]

Thus, we obtain that \(\nabla f(a,b)\) is orthogonal to \(\mathbf{r}'(t_0)\). Moreover, \(\nabla g(a,b)\) is orthogonal to \(\mathbf{r}'(t_0)\), because \(g\) is constant along the curve. Consequently, the gradients \(\nabla f(a,b)\) and \(\nabla g(a,b)\) are parallel. Hence, there exists a scalar \(\lambda\) such that \(\nabla f(a,b) = \lambda \nabla g(a,b)\). This completes the proof.

Method of Lagrange multipliers:

Let \(f\) and \(g\) satisfy the hypothesis of Lagrange's Theorem, and let \(f\) have a minimum or maximum subject to the constraint \(g(x,y) = c\). To find the minimum or maximum of \(f\), use the following steps:

1. Simultaneously solve the equations \(\nabla f(x,y) = \lambda \nabla g(x,y)\) and \(g(x,y) = c\), that is, solve the system of equations

\[ f_x(x,y) = \lambda g_x(x,y), \]

\[ f_y(x,y) = \lambda g_y(x,y), \]

\[ g(x,y) = c. \]

2. Evaluate \(f\) at each solution point obtained in the first step. The largest value yields the maximum of \(f\) subject to the constraint \(g(x,y) = c\), and the smallest value yields the minimum of \(f\) subject to the constraint \(g(x,y) = c\).

Example 11: Find the maximum value of \(f(x,y) = 4xy\), where \(x > 0\) and \(y > 0\), subject to the constraint \(\frac{x^2}{9} + \frac{y^2}{16} = 1\).

Solution: Geometrically, \(4xy\) is the area of a rectangle inscribed in the given ellipse with sides parallel to the coordinate axes. For the rectangle to be inscribed inside the given ellipse, it must have one of its vertices in the first quadrant. Let \((x,y)\) be the vertex of the rectangle in the first quadrant, so that \(x > 0\) and \(y > 0\) (see figure above). Note that the other three vertices of the rectangle are then determined uniquely as \((-x,y)\), \((-x,-y)\), \((x,-y)\). Now, because the rectangle has sides of lengths \(2x\) and \(2y\), its area is given by the objective function \(f(x,y) = 4xy\). We want to find \(x\) and \(y\) such that \(f(x,y)\) is a maximum. However, the choice of \((x,y)\) is restricted to the first-quadrant points that lie on the ellipse (the constraint)

\[ \frac{x^2}{9} + \frac{y^2}{16} = 1. \]

We now use the method of Lagrange multipliers. Let \(g(x,y) = \frac{x^2}{9} + \frac{y^2}{16} = 1\). By equating \(\nabla f(x,y) = 4y\,\mathbf{i} + 4x\,\mathbf{j}\) and \(\lambda \nabla g(x,y) = \frac{2\lambda x}{9}\,\mathbf{i} + \frac{2\lambda y}{16}\,\mathbf{j}\), we obtain the following system of equations:

\[ 4y = \tfrac{2}{9}\lambda x, \quad 4x = \tfrac{1}{8}\lambda y, \quad \frac{x^2}{9} + \frac{y^2}{16} = 1 \ \text{(constraint equation)}. \]

From the first equation, we obtain \(\lambda = 18y/x\), and substitution into the second equation produces \(4x = \tfrac{1}{8}\left(\tfrac{18y}{x}\right)y\), that is, \(x^2 = \tfrac{9}{16}y^2\). Substituting this value of \(x^2\) into the third equation gives \(\tfrac{1}{9}\cdot\tfrac{9}{16}y^2 + \tfrac{1}{16}y^2 = 1\), so \(y^2 = 8\) and \(y = \pm 2\sqrt{2}\). But \(y > 0\), so we take the positive value \(y = 2\sqrt{2}\). This gives \(x^2 = 9/2\), or \(x = 3/\sqrt{2}\) because \(x > 0\). Hence, the maximum value of \(f\) occurs at \((3/\sqrt{2}, 2\sqrt{2})\) and the maximum value of \(f\) is

\[ f\left(\tfrac{3}{\sqrt{2}}, 2\sqrt{2}\right) = 4 \cdot \tfrac{3}{\sqrt{2}} \cdot 2\sqrt{2} = 24. \]

Geometry of the solution: We consider the constraint equation to be the fixed level curve \(g(x,y) = 1\), where \(g(x,y) = \frac{x^2}{9} + \frac{y^2}{16}\). The level curves of \(f\), given by \(f(x,y) = 4xy = k\) for a constant \(k\), represent a family of hyperbolas (see figure below).

In this family, the level curves that meet the given constraint correspond to the hyperbolas that intersect the ellipse. Moreover, to maximize \(f(x,y)\), we wanted to find the hyperbola that just barely satisfies the constraint. The level curve that does this is the one that is tangent to the ellipse.

Example 12: The Cobb-Douglas production function for a software manufacturer is given by

\(f(x,y) = 100x^{3/4}y^{1/4}\), where \(x\) represents the units of labour (at $150 per unit) and \(y\) represents the units of capital (at $250 per unit). The total cost of labour and capital is limited to $50,000. Find the maximum production level for this manufacturer.

Solution: The limit on the cost of labour and capital produces the constraint equation

\[ g(x,y) = 50{,}000, \quad \text{where } g(x,y) = 150x + 250y. \]

Thus, \(\nabla g(x,y) = 150\,\mathbf{i} + 250\,\mathbf{j}\). From the given function, we get

\[ \nabla f(x,y) = 75x^{-1/4}y^{1/4}\,\mathbf{i} + 25x^{3/4}y^{-3/4}\,\mathbf{j}. \]

By Lagrange's theorem, there exists \(\lambda\) such that \(\nabla f(x,y) = \lambda \nabla g(x,y)\). This gives us the following system of equations:

\[ 75x^{-1/4}y^{1/4} = 150\lambda, \quad 25x^{3/4}y^{-3/4} = 250\lambda, \quad 150x + 250y = 50{,}000. \]

By solving for \(\lambda\) in the first equation, we have \(\lambda = \tfrac{1}{2}x^{-1/4}y^{1/4}\), and substituting in the second equation gives \(25x^{3/4}y^{-3/4} = 125x^{-1/4}y^{1/4}\), that is, \(25x = 125y\), or \(x = 5y\). By putting this value of \(x\) into the third equation, we obtain \(y = 50\) and thus \(x = 250\). So, the maximum production level is obtained when 250 units of labour are employed and 50 units of capital are invested, and the maximum production level is

\[ f(250, 50) = 100(250)^{3/4}(50)^{1/4} \approx 16{,}719 \text{ product units.} \]

Remarks:

Economists call the Lagrange multiplier obtained in a production function the marginal productivity of money. For instance, in the preceding example, the marginal productivity of money at \(x = 250\) and \(y = 50\) is

\[ \lambda = \tfrac{1}{2}x^{-1/4}y^{1/4} = \tfrac{1}{2}(250)^{-1/4}(50)^{1/4} \approx 0.334, \]

which means that for each additional dollar spent on production, an additional 0.334 unit of the product can be produced.
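The numbers in Example 12 can be reproduced with a few lines of floating-point arithmetic. The sketch below evaluates the production level and the multiplier at \((250, 50)\), and checks that both gradient equations and the budget constraint are satisfied there. The function name `production` and the variable names are illustrative assumptions.

```python
def production(x, y):
    """Cobb-Douglas production f(x, y) = 100 x^(3/4) y^(1/4)."""
    return 100 * x**0.75 * y**0.25

x, y = 250, 50
cost = 150 * x + 250 * y              # budget constraint g(x, y)
lam = 0.5 * x**-0.25 * y**0.25        # multiplier from the first equation

# Residuals of the two gradient equations f_x = 150*lam and f_y = 250*lam.
residual_x = 75 * x**-0.25 * y**0.25 - 150 * lam
residual_y = 25 * x**0.75 * y**-0.75 - 250 * lam

print(cost, production(x, y), lam)
print(residual_x, residual_y)
```

Residuals close to zero confirm that the pair \((250, 50)\) together with this \(\lambda\) satisfies the Lagrange system up to rounding.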

Example 13: At what point(s) on the circle \(x^2 + y^2 = 1\) does the function \(f(x,y) = xy\) attain its absolute maximum, and what is that maximum?

Solution: The circle \(x^2 + y^2 = 1\) is a closed and bounded set and \(f(x,y) = xy\) is a continuous function, so it follows from the Extreme-Value Theorem that \(f\) has an absolute maximum and an absolute minimum on the circle. To find these extrema, we will use Lagrange multipliers to find the constrained relative extrema, and then evaluate \(f\) at those relative extrema to find the absolute extrema.

We want to maximize \(f(x,y) = xy\) subject to the constraint \(g(x,y) = 0\), where \(g(x,y) = x^2 + y^2 - 1\).

First, we will find the constrained relative extrema. For this purpose, we need the gradients \(\nabla f(x,y) = y\,\mathbf{i} + x\,\mathbf{j}\) and \(\nabla g(x,y) = 2x\,\mathbf{i} + 2y\,\mathbf{j}\). Now, \(\nabla g(x,y) = \mathbf{0}\) if and only if \(x = 0\) and \(y = 0\), so that \(\nabla g(x,y) \neq \mathbf{0}\) for any point on the circle \(x^2 + y^2 = 1\). Thus, at a constrained relative extremum, we must have

\[ \nabla f(x,y) = \lambda \nabla g(x,y) \iff y\,\mathbf{i} + x\,\mathbf{j} = \lambda(2x\,\mathbf{i} + 2y\,\mathbf{j}). \]

This gives us the pair of equations \(y = 2\lambda x\), \(x = 2\lambda y\). It follows from these equations that if \(x = 0\), then \(y = 0\), and if \(y = 0\), then \(x = 0\). In either case, we have \(x^2 + y^2 = 0\), so the constraint equation \(g(x,y) = 0\) is not satisfied. Thus, we can assume that \(x\) and \(y\) are nonzero, and we can rewrite the equations as

\[ \lambda = \frac{y}{2x} \quad \text{and} \quad \lambda = \frac{x}{2y}. \]

This implies that \(x^2 = y^2\), which on substituting in the constraint equation gives \(2x^2 - 1 = 0\).

Thus, we obtain \(x = \pm 1/\sqrt{2}\), \(y = \pm 1/\sqrt{2}\). Hence, the constrained relative extrema occur at the points

\[ \left(\tfrac{1}{\sqrt{2}}, \tfrac{1}{\sqrt{2}}\right), \quad \left(\tfrac{1}{\sqrt{2}}, -\tfrac{1}{\sqrt{2}}\right), \quad \left(-\tfrac{1}{\sqrt{2}}, \tfrac{1}{\sqrt{2}}\right), \quad \left(-\tfrac{1}{\sqrt{2}}, -\tfrac{1}{\sqrt{2}}\right). \]

Thus, the absolute maximum of \(f(x,y) = xy\) is \(1/2\), at the points \(\left(\tfrac{1}{\sqrt{2}}, \tfrac{1}{\sqrt{2}}\right)\) and \(\left(-\tfrac{1}{\sqrt{2}}, -\tfrac{1}{\sqrt{2}}\right)\).

Also, note that \(f\) has an absolute minimum \(-1/2\) at the points \(\left(\tfrac{1}{\sqrt{2}}, -\tfrac{1}{\sqrt{2}}\right)\) and \(\left(-\tfrac{1}{\sqrt{2}}, \tfrac{1}{\sqrt{2}}\right)\). In the figure above, some level curves \(xy = c\) and the constraint curve \(x^2 + y^2 = 1\) are shown in the vicinity of the maxima of \(f\).

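Example 14: Find the maximum and minimum values of the function \(f(x,y) = 3x + 4y\) on the circle \(x^2 + y^2 = 1\).

Solution: We model this as a Lagrange multiplier problem with \(f(x,y) = 3x + 4y\), \(g(x,y) = x^2 + y^2 - 1\), and look for the values of \(x\), \(y\) and \(\lambda\) that satisfy the equations

\[ \nabla f(x,y) = \lambda \nabla g(x,y): \quad 3\,\mathbf{i} + 4\,\mathbf{j} = 2\lambda x\,\mathbf{i} + 2\lambda y\,\mathbf{j}; \]

\[ g(x,y) = 0: \quad x^2 + y^2 - 1 = 0. \]

The gradient equation implies that \(\lambda \neq 0\), and gives \(x = \frac{3}{2\lambda}\), \(y = \frac{2}{\lambda}\). These equations tell us that both \(x\) and \(y\) have the same sign. The constraint equation gives us that

\[ \left(\frac{3}{2\lambda}\right)^2 + \left(\frac{2}{\lambda}\right)^2 - 1 = 0 \implies \lambda = \pm\frac{5}{2}. \]

Thus, \(x = \pm 3/5\) and \(y = \pm 4/5\), and \(f(x,y) = 3x + 4y\) has extreme values at \((x,y) = \pm(3/5, 4/5)\) (see figure below). The maximum and minimum values of \(f\) are therefore, respectively,

\[ 3\left(\tfrac{3}{5}\right) + 4\left(\tfrac{4}{5}\right) = \tfrac{25}{5} = 5 \quad \text{and} \quad 3\left(-\tfrac{3}{5}\right) + 4\left(-\tfrac{4}{5}\right) = -\tfrac{25}{5} = -5. \]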
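The solution of Example 14 can be checked numerically. The sketch below computes the positive multiplier from the constraint equation, recovers the maximizing point, and compares the result with a brute-force maximum over sampled points of the unit circle. The variable names and the sample count are choices made for this illustration.

```python
import math

a, b = 3, 4                     # coefficients of f(x, y) = a*x + b*y
lam = math.hypot(a, b) / 2      # positive root of (a^2 + b^2) / (4 lam^2) = 1
x, y = a / (2 * lam), b / (2 * lam)

def f(x, y):
    return a * x + b * y

# Brute-force comparison over sampled points (cos t, sin t) of the unit circle.
n = 100_000
sampled_max = max(
    f(math.cos(2 * math.pi * k / n), math.sin(2 * math.pi * k / n))
    for k in range(n)
)
print((x, y), f(x, y), sampled_max)
```

The sampled maximum can only approach the true maximum from below, because the maximizer need not fall exactly on a sample point.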

Example 15: Suppose that the temperature of a metal plate is given by \(T(x,y) = x^2 + 2x + y^2\) for points \((x,y)\) on the elliptical plate defined by \(x^2 + 4y^2 \le 24\). Find the maximum and minimum temperatures on the plate.

Solution: The plate corresponds to the shaded region as shown in the figure below.

We first look for critical points of \(T(x,y)\) inside the region \(R\). The gradient of \(T\) is

\[ \nabla T(x,y) = (2x + 2)\,\mathbf{i} + 2y\,\mathbf{j}. \]

At a critical point, we must have \(\nabla T(x,y) = \mathbf{0}\), that is, \(2x + 2 = 0\) and \(2y = 0\), which implies \(x = -1\), \(y = 0\). Thus, the only candidate for a maximum or minimum of \(T\) inside the region \(R\) is the point \((-1,0)\), and at this point \(T(-1,0) = -1\). Now, we look for extrema of \(T(x,y)\) on the boundary of the region \(R\), that is, the ellipse \(x^2 + 4y^2 = 24\), using the method of Lagrange multipliers. The constraint is

\[ g(x,y) = x^2 + 4y^2 - 24 = 0. \]

By Lagrange's theorem, any extremum on the ellipse must satisfy the gradient equation

\[ \nabla T(x,y) = \lambda \nabla g(x,y). \]

Thus, we must have \((2x + 2)\,\mathbf{i} + 2y\,\mathbf{j} = \lambda(2x\,\mathbf{i} + 8y\,\mathbf{j})\). This occurs when \(2x + 2 = 2\lambda x\) and \(2y = 8\lambda y\).

The second equation holds when \(y = 0\) or \(\lambda = 1/4\). If \(y = 0\), the constraint equation \(x^2 + 4y^2 = 24\) gives \(x = \pm\sqrt{24}\). If \(\lambda = 1/4\), then the first equation gives \(x = -4/3\) and the constraint equation implies that \(y = \pm\sqrt{50}/3\). Now, we compare the values of \(T\) at all of these points, namely, one interior critical point and the candidates for boundary extrema:

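\[ \begin{aligned} T(-1, 0) &= -1, \\ T\left(\sqrt{24}, 0\right) &= 24 + 2\sqrt{24} \approx 33.8, \\ T\left(-\sqrt{24}, 0\right) &= 24 - 2\sqrt{24} \approx 14.2, \\ T\left(-\tfrac{4}{3}, \tfrac{\sqrt{50}}{3}\right) &= \tfrac{14}{3} \approx 4.7, \\ T\left(-\tfrac{4}{3}, -\tfrac{\sqrt{50}}{3}\right) &= \tfrac{14}{3} \approx 4.7. \end{aligned} \]

Hence, the minimum value of \(T\) is \(-1\) at the point \((-1,0)\) and the maximum value of \(T\) is \(24 + 2\sqrt{24}\) at the point \((\sqrt{24}, 0)\).

Example 16: Find the extreme values of \(f(x,y) = x^2 + 2y^2 - 2x + 3\) subject to the constraint \(x^2 + y^2 \le 10\).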

Solution: This problem is similar to Example 15. We divide the constraint into two cases.

a) For points on the circle \(x^2 + y^2 = 10\), use Lagrange multipliers to find that the maximum value of \(f(x,y)\) is 24. This value occurs at \((-1,3)\) and at \((-1,-3)\). In a similar way, we can determine that the minimum value of \(f\) on the circle is approximately 6.675, and this value occurs at \((\sqrt{10}, 0)\).

b) For points inside the circle, use the techniques of finding the relative extrema to conclude that the function has a relative minimum of 2 at the point \((1,0)\).

By combining these two results, we conclude that \(f\) has a maximum of 24 at \((-1, \pm 3)\) and a minimum of 2 at \((1,0)\), as shown in the figure below.
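Since Example 16 only states its results, it may help to tabulate \(f\) at every candidate point. The sketch below assumes the candidates come from the interior condition \(\nabla f = \mathbf{0}\), which gives \((1,0)\), and from the Lagrange conditions \(2x - 2 = 2\lambda x\), \(4y = 2\lambda y\) on the circle, which give \(y = 0\) or \(\lambda = 2\). The dictionary labels are purely descriptive.

```python
import math

def f(x, y):
    """f(x, y) = x^2 + 2y^2 - 2x + 3."""
    return x**2 + 2 * y**2 - 2 * x + 3

r = math.sqrt(10)
candidates = {
    "interior critical point (1, 0)": (1, 0),
    "boundary point (-1, 3)": (-1, 3),
    "boundary point (-1, -3)": (-1, -3),
    "boundary point (sqrt(10), 0)": (r, 0),
    "boundary point (-sqrt(10), 0)": (-r, 0),
}
values = {name: f(*point) for name, point in candidates.items()}
for name, value in values.items():
    print(f"{name}: {value:.3f}")
```

The largest and smallest entries of the table are the absolute maximum and minimum on the closed disk.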

Example 17: (Lagrange Multipliers and three variables)

Find the minimum value of \(f(x,y,z) = 2x^2 + y^2 + 3z^2\) subject to the constraint \(2x - 3y - 4z = 49\).

Solution: Let \(g(x,y,z) = 0\), where \(g(x,y,z) = 2x - 3y - 4z - 49\), represent the constraint surface, which is a plane. Because we have

\[ \nabla g(x,y,z) = 2\,\mathbf{i} - 3\,\mathbf{j} - 4\,\mathbf{k} \quad \text{and} \quad \nabla f(x,y,z) = 4x\,\mathbf{i} + 2y\,\mathbf{j} + 6z\,\mathbf{k}, \]

and Lagrange's theorem gives \(\nabla f = \lambda \nabla g\), we obtain the following system of equations:

\[ \begin{aligned} 4x &= 2\lambda &&: \ f_x(x,y,z) = \lambda g_x(x,y,z), \\ 2y &= -3\lambda &&: \ f_y(x,y,z) = \lambda g_y(x,y,z), \\ 6z &= -4\lambda &&: \ f_z(x,y,z) = \lambda g_z(x,y,z), \\ 2x - 3y - 4z - 49 &= 0 &&: \ \text{(the constraint)}. \end{aligned} \]

Solving these equations, we get \(\lambda = 6\), \(x = 3\), \(y = -9\), \(z = -4\). So, the optimum value of \(f\) is attained at \((3, -9, -4)\) and the optimum value is

\[ f(3, -9, -4) = 2(3)^2 + (-9)^2 + 3(-4)^2 = 147. \]

Note that from the original function and the constraint, \(f\) has no maximum. So the optimum value determined above is a minimum of \(f\).
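The system in Example 17 is linear once \(x\), \(y\) and \(z\) are written in terms of \(\lambda\), so it can be solved exactly. The sketch below substitutes \(x = \lambda/2\), \(y = -3\lambda/2\), \(z = -2\lambda/3\) into the constraint and uses rational arithmetic; the variable names are illustrative.

```python
from fractions import Fraction

# From 4x = 2*lam, 2y = -3*lam, 6z = -4*lam:
#   x = lam/2, y = -3*lam/2, z = -2*lam/3.
# Substituting into 2x - 3y - 4z = 49 gives coeff * lam = 49.
coeff = 2 * Fraction(1, 2) - 3 * Fraction(-3, 2) - 4 * Fraction(-2, 3)
lam = Fraction(49) / coeff

x, y, z = lam / 2, -3 * lam / 2, -2 * lam / 3
value = 2 * x**2 + y**2 + 3 * z**2

print(lam, (x, y, z), value)
```

Because the constraint is a single linear equation in \(\lambda\) after substitution, the solution is unique, which is consistent with the single tangent ellipsoid in the geometric picture.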

Geometrical view of Lagrange multiplier theorem in three variables: A graphical interpretation of constrained optimization problems in three variables is similar to that of two variables except that level surfaces are used instead of level curves. For instance, in the preceding example, the level surfaces of f are ellipsoids centered at the origin, and the constraint is a plane. The minimum value of f is represented by the ellipsoid that is tangent to the constraint plane, as shown in the figure below.

The method of Lagrange Multipliers with two constraints:

The problem is to find the minimum or maximum value of a differentiable function $f(x, y, z)$ subject to two constraints $g(x, y, z) = 0$ and $h(x, y, z) = 0$, where $g$ and $h$ are also differentiable. Notice that for both constraints to be satisfied at a point $(x, y, z)$, the point must lie on both surfaces defined by the constraints. Consequently, in order for there to be a solution, we must assume that the two surfaces intersect. We further assume that $\nabla g$ and $\nabla h$ are nonzero and are not parallel, so that the two surfaces intersect in a curve $C$ and are not tangent to one another. If $f$ has an extremum at a point $(x_0, y_0, z_0)$ on the curve $C$, then $\nabla f(x_0, y_0, z_0)$ must be normal to the curve. Notice that since $C$ lies on both constraint surfaces, $\nabla g(x_0, y_0, z_0)$ and $\nabla h(x_0, y_0, z_0)$ are both orthogonal to $C$ at $(x_0, y_0, z_0)$. This implies that $\nabla f(x_0, y_0, z_0)$ must lie in the plane determined by $\nabla g(x_0, y_0, z_0)$ and $\nabla h(x_0, y_0, z_0)$ (see figure below).

That is, for $(x, y, z) = (x_0, y_0, z_0)$ and some constants $\lambda$ and $\mu$ (Lagrange multipliers),

$$\nabla f(x, y, z) = \lambda \nabla g(x, y, z) + \mu \nabla h(x, y, z).$$

The method of Lagrange multipliers for the case of two constraints then consists of finding the point $(x, y, z)$ and the Lagrange multipliers $\lambda$ and $\mu$ (for a total of five unknowns) satisfying the five equations

$$
\begin{aligned}
f_x(x, y, z) &= \lambda g_x(x, y, z) + \mu h_x(x, y, z),\\
f_y(x, y, z) &= \lambda g_y(x, y, z) + \mu h_y(x, y, z),\\
f_z(x, y, z) &= \lambda g_z(x, y, z) + \mu h_z(x, y, z),\\
g(x, y, z) &= 0,\\
h(x, y, z) &= 0.
\end{aligned}
$$
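To make the five equations concrete, here is a minimal sketch of a residual check: given a candidate point and multipliers, it returns the five left-minus-right differences, which should all vanish at a solution. The helper `lagrange_residuals` is a hypothetical name introduced here. The test data are $f = x^2 + y^2 + z^2$ with the plane and paraboloid of Example 18, and the multiplier values $68/5$ and $-12/5$ for the point $(2, 2, 8)$ were worked out for this sketch; they are not stated in the notes.

```python
from fractions import Fraction


def lagrange_residuals(grad_f, grad_g, grad_h, g, h, point, lam, mu):
    """Residuals of grad f = lam*grad g + mu*grad h, g = 0, h = 0 (five numbers)."""
    gf, gg, gh = grad_f(*point), grad_g(*point), grad_h(*point)
    stationarity = [gf[i] - lam * gg[i] - mu * gh[i] for i in range(3)]
    return stationarity + [g(*point), h(*point)]


# Illustrative data: f = x^2 + y^2 + z^2, plane g = 0, paraboloid h = 0.
def grad_f(x, y, z):
    return (2 * x, 2 * y, 2 * z)


def g(x, y, z):
    return x + y + z - 12


def grad_g(x, y, z):
    return (1, 1, 1)


def h(x, y, z):
    return x**2 + y**2 - z


def grad_h(x, y, z):
    return (2 * x, 2 * y, -1)


# Candidate point with multipliers computed by hand for this sketch.
res_good = lagrange_residuals(grad_f, grad_g, grad_h, g, h,
                              (2, 2, 8), Fraction(68, 5), Fraction(-12, 5))
# Same point with arbitrary (wrong) multipliers: stationarity fails.
res_bad = lagrange_residuals(grad_f, grad_g, grad_h, g, h, (2, 2, 8), 1, 1)

print(res_good)
print(res_bad)
```

With the wrong multipliers the last two residuals still vanish, because the point lies on both surfaces; only the three gradient equations detect the error.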

Example 18: The plane $x + y + z = 12$ intersects the paraboloid $z = x^2 + y^2$ in an ellipse. Find the point on the ellipse that is closest to the origin.

Solution: The intersection of the given plane and the paraboloid is shown below.

Finding the point on the ellipse that is closest to the origin is the same as finding the point $(x, y, z)$ that minimizes the distance $d = \sqrt{(x - 0)^2 + (y - 0)^2 + (z - 0)^2} = \sqrt{x^2 + y^2 + z^2}$, or equivalently minimizes the function $f(x, y, z) = x^2 + y^2 + z^2$, subject to the constraints

$$g(x, y, z) = x + y + z - 12 = 0 \quad \text{and} \quad h(x, y, z) = x^2 + y^2 - z = 0.$$

At any extremum, we must have, by the method of Lagrange multipliers, that

$$
\begin{aligned}
\nabla f(x, y, z) &= \lambda \nabla g(x, y, z) + \mu \nabla h(x, y, z),\\
\text{or} \quad 2x\,\mathbf{i} + 2y\,\mathbf{j} + 2z\,\mathbf{k} &= \lambda(\mathbf{i} + \mathbf{j} + \mathbf{k}) + \mu(2x\,\mathbf{i} + 2y\,\mathbf{j} - \mathbf{k}).
\end{aligned}
$$

Together with the constraint equations, we now have the system of five equations

$$2x = \lambda + 2\mu x, \quad 2y = \lambda + 2\mu y, \quad 2z = \lambda - \mu, \quad x + y + z - 12 = 0, \quad x^2 + y^2 - z = 0. \qquad (*)$$

From the first two equations, we have $\lambda = 2x(1 - \mu)$ and $\lambda = 2y(1 - \mu)$. Equating these two expressions for $\lambda$, we obtain

$$2x(1 - \mu) = 2y(1 - \mu), \quad \text{that is,} \quad 2(x - y)(1 - \mu) = 0.$$

It follows that either $\mu = 1$ (in which case $\lambda = 0$) or $x = y$. However, if $\mu = 1$ and $\lambda = 0$, then the third equation in $(*)$ gives $z = -1/2$, which contradicts $z = x^2 + y^2 \ge 0$. Consequently, the only possibility is $x = y$, from which it follows that $z = 2x^2$. Substituting this into the first constraint (the equation of the plane), we obtain

$$0 = x + y + z - 12 = x + x + 2x^2 - 12 = 2x^2 + 2x - 12 = 2(x + 3)(x - 2).$$

Thus either $x = -3$ or $x = 2$. Since $y = x$ and $z = 2x^2$, the points of extremum of $f$ are $(-3, -3, 18)$ and $(2, 2, 8)$. Finally, since

$$f(-3, -3, 18) = 342 \quad \text{and} \quad f(2, 2, 8) = 72,$$

the point on the intersection of the two surfaces closest to the origin is $(2, 2, 8)$. By the same reasoning, the point on the intersection that is farthest from the origin is $(-3, -3, 18)$. Notice that these are also consistent with what you see in the figure.
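The result of Example 18 can be cross-checked without multipliers by sampling the ellipse directly. The sketch below assumes a parametrization that is not in the notes: eliminating $z$ from $x + y + z = 12$ and $z = x^2 + y^2$ gives the circle $(x + 1/2)^2 + (y + 1/2)^2 = 25/2$ in the $xy$-plane. The sample count `N` is an arbitrary choice.

```python
import math

# Projection of the ellipse onto the xy-plane:
# x + y + x^2 + y^2 = 12  <=>  (x + 1/2)^2 + (y + 1/2)^2 = 25/2.
R = math.sqrt(12.5)


def point_on_ellipse(t):
    x = -0.5 + R * math.cos(t)
    y = -0.5 + R * math.sin(t)
    return (x, y, x**2 + y**2)  # z from the paraboloid


def dist_sq(p):
    return sum(c * c for c in p)


N = 3600  # number of sample angles (arbitrary)
samples = [point_on_ellipse(2 * math.pi * k / N) for k in range(N)]

closest = min(samples, key=dist_sq)
farthest = max(samples, key=dist_sq)

print("closest :", closest, "f =", dist_sq(closest))
print("farthest:", farthest, "f =", dist_sq(farthest))
```

A grid search of this kind only approximates the extrema in general; here the sample angles happen to include $t = \pi/4$ and $t = 5\pi/4$, which correspond to the two points found above.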

Example 19: Let $T(x, y, z) = 20 + 2x + 2y + z^2$ represent the temperature at each point on the sphere $x^2 + y^2 + z^2 = 11$. Find the extreme temperatures on the curve formed by the intersection of the plane $x + y + z = 3$ and the sphere.

(Ans.: Min. Temp. $T = 25$ and Max. Temp. $T = 91/3$)
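The stated answer to Example 19 can be checked numerically. The candidate points in the sketch below are not given in the notes; they were worked out for this check from the two cases of the multiplier equations (either the multiplier of the sphere constraint vanishes, which forces $z = 1$, or $x = y$), so treat them as assumptions to verify rather than as part of the source.

```python
import math


def T(x, y, z):
    return 20 + 2 * x + 2 * y + z**2


def on_curve(x, y, z, tol=1e-9):
    """True if the point lies on both the sphere and the plane."""
    return abs(x * x + y * y + z * z - 11) < tol and abs(x + y + z - 3) < tol


s = 2 * math.sqrt(3) / 3
candidates = [
    (3.0, -1.0, 1.0), (-1.0, 3.0, 1.0),   # case z = 1
    (1 + s, 1 + s, 1 - 2 * s),            # case x = y
    (1 - s, 1 - s, 1 + 2 * s),
]

assert all(on_curve(*p) for p in candidates)
temps = [T(*p) for p in candidates]
print("min T =", min(temps), " max T =", max(temps))
```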

Example 20: The plane $x + y + z = 1$ cuts the cylinder $x^2 + y^2 = 1$ in an ellipse. Find the points on the ellipse that lie closest to and farthest from the origin.

(Ans.: The closest points are $(1, 0, 0)$ and $(0, 1, 0)$. The farthest point is $\left(-\tfrac{\sqrt{2}}{2}, -\tfrac{\sqrt{2}}{2}, 1 + \sqrt{2}\right)$.)
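As a rough check of the answer to Example 20, the ellipse can be sampled directly. The sketch assumes the parametrization $x = \cos t$, $y = \sin t$, $z = 1 - \cos t - \sin t$, which is not in the notes; the sample count `N` and the tolerance `TOL` are arbitrary choices.

```python
import math


def point(t):
    x, y = math.cos(t), math.sin(t)   # on the cylinder x^2 + y^2 = 1
    return (x, y, 1 - x - y)          # z from the plane x + y + z = 1


def dist_sq(p):
    return sum(c * c for c in p)


N = 3600    # number of sample angles (arbitrary)
TOL = 1e-9  # tolerance for treating two distances as equal

samples = [point(2 * math.pi * k / N) for k in range(N)]
d_min = min(dist_sq(p) for p in samples)
d_max = max(dist_sq(p) for p in samples)

closest = [p for p in samples if dist_sq(p) - d_min < TOL]
farthest = [p for p in samples if d_max - dist_sq(p) < TOL]

print("closest :", closest)
print("farthest:", farthest)
```

Collecting all samples within the tolerance, rather than taking a single minimum, is what lets the sketch report both closest points.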
