1 functions in two variables

BEEM103 —Optimization Techiniques for Economists Dieter Balkenborg

Departments of Economics

Lecture Week 2 University of Exeter

1 Functions in two variables

A function z = f (x, y) or simply z (x, y) in two independent variables with one dependentvariable assigns to each pair (x, y) of (decimal) numbers from a certain domain D inthe two-dimensional plane a number z = f (x, y).1 x and y are hereby the independentvariables and z is the dependent variable. The graph of f is the surface in 3-dimensionalspace consisting of all points (x, y, f (x, y)) with (x, y) in D.

Example 1 The cubic polynomial

z = f (x, y) = x3 − 3x2 − y2

is defined on the entire plane. It has the graph:

1

468

00

2y x

01

2

z 1

24

2 3

6

12

The vertical line is here the z-axis, the x- and the y- axis span a horizontal plane. Thusthe height of the graph indicates the values of the function. Lighter colours correspond tohigher values of the function. For instance, f (0, 0) = 0, so the origin of the coordinatesystem (0, 0, 0) belongs to the graph and is a peak. The function has a saddle point at(x, y) = (2, 0) where z = f (x, y) = 8− 3× 4− 0 = −4.

Exercise 2 Evaluate

z = f (2, 1)

z = f (3, 0)

z = f (4,−4)

z = f (4, 4)

Could the information fit with the graph?

1The two-dimensional plane consisting of all pairs of numbers (x, y) is usually denoted by IR2.

Example 3 A production function like

Q =6√K√L = K

16L

12

of a firm describes for each combination of capital (K ≥ 0) and labour (L ≥ 0) the totalamount Q ≥ 0 of output which the firm can produce with these inputs. The specifiedfunction has the graph:

0 10 20

y

200

10 15

x50

24

z6

Example 4 Consider a firm with the above production function and assume that the firmis a price taker in the product market and in both factor markets. If P is the price of thecommodity produced, r the interest rate (= the price of capital) and w the wage rate (=the price of labour) then the profit of this firm is

Π (K,L) = PQ− rK − wL= PK

16L

12 − rK − wL

if it employs K units of capital and L units of labour. For the prices P = 12, r = 1,w = 3 we obtain the function

Π (K,L) = 12K16L

12 −K − 3L

with the graph:

20x

10

z

0

10

10

0

20

y

2010

0

One can show that profits is maximized for K = L = 8.

2

The level curve of the function z = f (x, y) for the level c is the solution set to theequation

f (x, y) = c

where c is a given constant. Geometrically, a level curve is obtained by intersecting thegraph of f with a horizontal plane z = c and then projecting into the (x, y)-plane. Thisis illustrated here for the cubic polynomial discussed above:

0 00

5

12

1

x

2 3

5

y

z 11

2

1 1 20

2

1

0 3

xz1

y

2

In the case of a production function the level curves are called isoquants. An isoquantshows for a given output level capital-labour combinations which yield the same output.

0 10

x

20

y

200

10 1550

24z6

010

20

20

x

10

y

z0

Finally, the linear functionz = 3x+ 4y

has the graph and the level curves:

42 0

x0 y

420

10z

3020

z0

y

4

2

x0 2 4

3

The level curves of a linear function form a family of parallel lines. This can be seenalgebraically as follows: For a given level c the level curve is given by

c = 3x+ 4y

4y = c− 3x

y =c

4− 3

4x

The level curves are therefore all lines with slope −34. They differ only in the intercept c

4.

Hence they are all parallel lines.

Exercise 5 Describe the isoquant of the production function

Q = KL

for the quantity Q = 2.

Exercise 6 Describe the isoquant of the production function

Q =√KL

for the quantity Q = 4.

Remark 7 The exercises illustrate the following general principle: If h (z) is an increas-ing (or decreasing) function in one variable, then the composite function h (f (x, y)) hasthe same level curves as the given function f (x, y) . However, they correspond to differentlevels.

00

x4y

42

20

20

z 10

00

x4y

42

20

4

z 2

Q = KL Q =√KL

2 Partial derivatives

Exercise 8 What is the derivative of

z (x) = a3x2

with respect to x when a is a given constant?

4

Exercise 9 What is the derivative of

z (y) = y3b2

with respect to y when b is a given constant?

Consider now a function in two variables z = f (x, y). If we fix y at a value y = y0 andvary only x we obtain a function in one variable z = F (x) = f (x, y0). The derivative ofthis function F (x) at x = x0 is called the partial derivative of f with respect to x anddenoted by ∂z

∂x |x=x0,y=y0= dF

dx |x=x0= dF

dx(x0).2 The partial derivative ∂z

∂y |x=x0,y=y0is defined

symmetrically. (Notation: “d”=“dee”, “δ”=“delta”, “∂”=“del”)We do not have to specify y0 numerically. To partially differentiate in the direction

of x it suffi ces to think of y and all expressions containing only y as exogenously fixedconstants. We can then use the familiar rules for differentiating functions in one variablein order to obtain ∂z

∂x.

Other common notations for partial derivatives are ∂f∂x, ∂f∂yor fx, fy.

Example 10 Letz (x, y) = y3x2

Then∂z

∂x= 2y3x

∂z

∂y= 3y2x2

Example 11 Letz = x3 + x2y2 + y4.

Setting e.g. y = 1 we obtain z = x3 + x2 + 1 and hence

∂z

∂x |y=1= 3x2 + 2x

∂z

∂x |x=1,y=1= 5

For fixed, but arbitrary, y we obtain

∂z

∂x= 3x2 + 2xy2

as follows: We can differentiate the sum x3 + x2y2 + y4 with respect to x term-by-term.Differentiating x3 yields 2x2, differentiating x2y2 yields 2xy2 because we think now of y2

as a constant andd(ax2)dx

= 2ax holds for any constant a. Finally, the derivative of anyconstant term is zero, so the derivative of y4 with respect to x is zero.Similarly considering x as fixed and y variable we obtain

∂z

∂y= 2x2y + 4x3

2From now on I will often write dzdx |x=x0

instead of dzdx (x0) to denote the derivative evaluated at a

special value of x.

5

Example 12 The partial derivatives ∂q∂K

and ∂q∂Lof a production function q = f (K,L)

are called the marginal product of capital and (respectively) labour. They describeapproximately by how much output increases if the input of capital (respectively labour)is increased by a small unit. If one fixes for the production function in Example 3 capitalinput at K = 64 one obtains q = K

16L

12 = 2L

12 which has, as a function in the one

variable L, the graph

0 1 2 3 4 50

1

2

3

4

x

Q

This graph is obtained from the graph of the function in two variables by intersecting thelatter with a vertical plane parallel to L-q-axes.

10 15

x

20

z6420

20

y10

0 50

thus the partial derivatives ∂q∂K

and ∂q∂Ldescribe geometrically the slope of the function in

the K- and, respectively, the L- direction. Notice that for any given level of capital thereis diminishing productivity of labour: The more labour is used, the less is the increasein output when one more unit of labour is employed. Algebraically this can be verified asfollows:

∂q

∂L=

1

2K

16L−

12 =

1

2

6√K

2√L> 0,

the marginal product of labour is always positive. Differentiating a second time we obtain

∂2q

∂L2=

∂

∂L

(∂q

∂L

)= −1

4K

16L−

32 = −1

4

6√K

2√L3

< 0,

so the marginal product of labour is decreasing. Verify similarly that there is diminishingproductivity of capital!

Exercise 13 Find the partial derivatives of

z =(x2 + 2x

) (y3 − y2

)+ 10x+ 3y

6

3 The tangent plane

For a function y = f (x) in one variable the derivative dfdx |x=x0

at x0 is the slope of thetangent to the graph of f through the point (x0, f (x0)).

0.5 1.0 1.5 2.0

1

0

1

2

3

4

x

y

The tangent of f (x) = x2 at x = 1.

This tangent is the graph of the linear function

y = l (x) = f (x0) +df

dx |x=x0(x− x0) .

This is so because l (x0) = f (x0) , so the graph of l goes through the point (x0, f (x0)).Moreover,

y = l (x) =

(f (x0)− df

dx0 |x=x0

x0

)+df

dx |x=x0x,

so l (x) is indeed a linear function in x with intercept(f (x0)− df

dx0 |x=x0x0

)and the “right”

slope dfdx |x=x0

.The graph of a function in one variable is a curve and its tangent is a line. By analogy,

the graph of a function z = f (x, y) in two variables is a surface and its tangent at a point(x0, y0) is a plane.

0

2z

4

2

y431

00

1

x2 3 4

7

Non-vertical planes are the graphs of linear functions z = ax+ by+ c. The tangent planeof the function z = f (x, y) in (x0, y0) is, correspondingly, the graph of the linear function

z = l (x, y) = f (x0, y0) +∂f

∂x |x=x0,y=y0(x− x0) +

∂f

∂y |x=x0,y=y0

(y − y0) (1)

since the graph of this linear function contains the point (x0, y0, f (x0, y0)) and has theright slopes in the x- and y-directions.

Example 14 This is the example shown in the above graph. The tangent plane of z =f (x, y) =

√x+√y at the point (1, 1) is obtained as follows: We have

∂z

∂x=

1

2

1√x

and∂z

∂y=

1

2

1√y

and hence∂z

∂x |x=1,y=1=

1

2and

∂z

∂y |x=1,y=1

=1

2.

Since f (1, 1) = 2 the tangent plane is the graph of the linear function

z = l (x, y) = 2 +1

2(x− 1) +

1

2(y − 1) = 1 +

1

2x+

1

2y

Using the abbreviations dx = x− x0, dy = y − y0 and dz = z − z0 the formula for thetangent can be neatly rewritten as

dz =∂z

∂xdx+

∂z

∂ydy (2)

an expression which is called the total differential.

4 The slope of level curves

Suppose (x0, y0) is a point on the level curve f (x, y) = c. Then the slope of tangent tothe level curve in this point in the x-y-plane is

dy

dx |x=x0,y=y0= − ∂z

∂x |x=x0,y=y0

/∂z

∂y |x=x0,y=y0

(3)

provided ∂z∂y |x=x0,y=y0

6= 0. This is known as the rule for implicit differentiation: If the

function y = y (x) is implicitly defined by the equation f (x, y (x)) = c, then its derivativedydxis given by the above formula.

Example 15 A linear functiony = ax+ by + c

8

has the partial derivatives ∂z∂x

= a and ∂z∂y

= b. According to the above formula its levelcurves have the slope

− ∂z

∂x

/∂z

∂y= −a

b.

Indeed, its level curves are the solutions to the equations

c̄ = ax+ by + c

with some constant c̄. Solving the equation for y we obtain

y = −abx+

c̄− cb

which is a linear function with slope −ab.

Example 16 An isoquant of the production function Q = K16L

12 has according to the

above formula the slope

dK

dL= − ∂Q

∂L

/∂Q

∂K= −

(1

2K

16L−

12

)/(1

6K−

56L

12

)= −6

2K

16K

56L−

12L−

12 = −3

K

L.

in the point (K,L).Indeed, an isoquant is the set of solutions to the equation q̄ = K

16L

12 for fixed q̄. Solving

for K we see that the isoquant is the graph of the function

K =(q̄L−

12

)6

= q̄6L−3.

Differentiation yieldsdK

dL= −3q̄6L−4

and since (K,L) is on the isoquant

dK

dL= −3

(K

16L

12

)6

L−4 = −3K

L.

Economically, −dKdLis the marginal rate of substitution of labour for capital: If labour

is reduced by one unit, how much more capital is (approximately) needed to produce thesame output? Just reducing labour by one unit reduces output by the marginal product oflabour ∂q

∂L. Each additional unit of capital produces ∂q

∂K—the marginal product of capital

—more units. Therefore, if now capital input is increased by ∂q∂L

/∂q∂Kunits, output changes

overall by

− ∂q∂L

+

(∂q

∂L

/∂q

∂K

)∂q

∂K= 0.

9

Remark 17 A quick way to memorize the formula (3) is to use the total differential asfollows: In order to stay on the level curve one must have dz = z − z0 = 0, so

0 = dz =∂z

∂xdx+

∂z

∂ydy

hence

dy = −(∂z

∂x

/∂z

∂y

)dx

ordy

dx= − ∂z

∂x

/∂z

∂y.

Remark 18 The implicit function theorem states that if ∂z∂y |x=x0,y=y0

6= 0 for some point

(x0, y0) on a level curve f (x, y) = c, then one can find a function y = g (x) definednear x = x0 such that the level curve is locally the graph of this function, i.e. one hasf (x, g (x)) = c for x near x0.

Exercise 19 For the production function

Q =√KL

calculate the marginal rate of substitution for K = 3 and L = 4.

5 The chain rule

Suppose that z = f (x, y) is a function in two variables x, y. Suppose that x depends onthe variable t via x = g (t) and that y depends on the variable t via y = h (t). Then wecan define the composite function in one variable z = F (t) = f (x (t) , y (t)) which has tas the independent and z as the dependent variable. The derivative of this function, if itexists, is calculated as

dz

dt=∂z

∂x

dx

dt+∂z

∂y

dy

dt.

Notice that this is just the total differential divided by dt.

Example 20 Suppose z = xy, x = t3 + 2t, y = t2 − t. Then, according to the chain rule,

dz

dt=

∂z

∂x

dx

dt+∂z

∂y

dy

dt= y ×

(3t2 + 2

)+ x× (2t− 1)

=(t2 − t

) (3t2 + 2

)+ (2t− 1)

(t3 + 2t

)Since z = xy = (t3 + 2t) (t2 − t) we get the same result using the product rule:

dz

dt=(3t2 + 2

) (t2 − t

)+(t3 + 2t

)(2t− 1) .

10

6 Optimization problems

6.1 Unconstrained optimization

Suppose we want to find the absolute maximum of a function

z = f (x, y)

in two variables, i.e., we want to find the pair of numbers (x∗, y∗) such that

f (x∗, y∗) ≥ f (x, y)

holds for all (x, y). If (x∗, y∗) is such a maximum the following must hold: If we freezethe variable y at the optimal value y∗ and vary only the variable x then the function

f (x, y∗)

—which is now just a function in just one variable —has a maximum at x∗. Typically themaximum of this function in one variable will be a critical point. Hence we expect thepartial derivative ∂z

∂xto be zero at the optimum. Similarly we expect the partial derivative

∂z∂yto be zero at the optimum. We are thus led to the first order conditions

∂z

∂x |x=x∗,y=y∗= 0

∂z

∂y |x=x∗,y=y∗= 0

which must typically be satisfied. We speak of first order conditions because only thefirst derivatives are involved. The conditions do not tell us whether we have a maximum,a minimum or a saddle point. To find out the latter we will have to look at the secondderivatives and the so-called “second order conditions”, as will be discussed in a laterlecture.

Example 21 Consider a price-taking firm with the production function Q (K,L). Let rbe the interest rate (the price of capital K), let w be the wage rate (the price of labour L)and let P be the price of the output the firm is producing. When the firm uses K units ofcapital and L units labour she can produce Q (K,L) units of the output and hence makea revenue of

P ×Q (K,L)

by selling each unit of output at the market price P . Her production costs will then be hertotal expenditure on the inputs capital and labour

rK + wL.

Her profit will be the difference

Π (K,L) = PQ (K,L)− rK − rL.

11

The firm tries to find the input combination (K∗, L∗) which maximizes her profit. To findit we want to solve the first order conditions

∂Π

∂K= P

∂Q

∂K− r = 0 (4)

∂Π

∂L= P

∂Q

∂L− w = 0 (5)

These conditions are intuitive: Suppose for instance P ∂Q∂K− r > 0. ∂Q

∂Kis the marginal

product of capital, i.e., it tells us how much more can be produced by using one more unitof capital. P ∂Q

∂Kis the marginal increase in revenue when one more unit of capital is used

and the additional output is sold on the market. r is the marginal cost of using one moreunit of capital. P ∂Q

∂K− r > 0 means that the marginal profit from using one more unit

of capital is positive, i.e., it increases profits to produce more by using more capital andtherefore we cannot be in a profit optimum. Correspondingly, P ∂Q

∂K− r < 0 would mean

that profit can be increased by producing less and using less capital. So (4) should hold inthe optimum. Similarly (5) should hold.It is useful to rewrite the two first order conditions as

∂Q

∂K=

r

P(6)

∂Q

∂L=

w

P(7)

Thus, for both inputs it must be the case that the marginal product is equal to the priceratio of input- to output price. When we divide here the two left-hand sides and equatethem with the quotient of the right hand sides we obtain as a consequence

∂Q

∂K

/∂Q

∂L=

r

P

/ w

P=r

w(8)

so the marginal rate of substitution must be equal to the price ratio of the input price(which is the relative price of capital in terms of labour).

Example 22 Let us be more specific and assume that the production function is

Q (K,L) = K16L

12

and that the output- and input prices are P = 12, r = 1 and w = 3. Then

∂Q

∂K=

1

6K−

56L

12

∂Q

∂L=

1

2K

16L−

12

∂Q

∂K

/∂Q

∂L=

1

6K−

56L

12

/1

2K

16L−

12 =

1

3K−

56L

12K−

16L

12 =

1

3

L

K

The first order conditions (6) and (7) become

1

6K−

56L

12 =

1

12(9)

1

2K

16L−

12 =

3

12. (10)

12

This is a simultaneous system of two equations in two unknowns. The implied condition(8) becomes

1

3

L

K=

1

3

which simplifies toL = K

Example 23 A monopolist with total cost function TC (Q) = Q2 sells his product in twodifferent countries. When he sells QA units of the good in country A he will obtain theprice

PA = 22− 3QA

for each unit. When he sells QB units of the good in country B he obtains the price

PB = 34− 4QB.

How much should the monopolist sell in the two countries in order to maximize profits?To solve this problem we calculate first the profit function as the sum of the revenues

in each country minus the production costs. Total revenue in country A is

TRA = PAQA = (22− 3QA)QA

Total revenue in country B is

TRB = PBQB = (34− 4QB)QB

Total production costs areTC = (QA +QB)2

Therefore the profit is

Π (QA, QB) = (22− 3QA)QA + (34− 4QB)QB − (QA +QB)2 ,

a quadratic function in QAand QB. In order to find the profit maximum we must find thecritical points, i.e., we must solve the first order conditions

∂Π

∂QA= −3QA + (22− 3QA)− 2 (QA +QB) = 22− 8QA − 2QB = 0

∂Π

∂QB= −4QB + (34− 4QB)− 2 (QA +QB) = 34− 2QA − 10QB = 0

or

8QA + 2QB = 22 (11)

2QA + 10QB = 34.

Thus we have to solve a linear simultaneous system of two equations in two unknowns.

13

7 Simultaneous systems of equations

7.1 Linear systems

We illustrate four methods to solve a linear system of equations using the example

5x+ 7y = 50 (12)

4x− 6y = −18

7.1.1 Using the slope-intercept form

We rewrite both linear equations in slope-intercept form

7y = 50− 5x

y =50

7− 5

7x

4x− 6y = −18

4x+ 18 = 6y

y =2

3x+ 3

2 0 2 4 6

4

6

8

x

y

The solutions to each equation form a line, which we have now described as the graph oflinear functions. At the intersection point of the two lines the two linear functions musthave the same y-value. Hence

50

7− 5

7x = y =

2

3x+ 3

50

7− 3 =

2

3x+

5

7x | × 3× 7

150− 63 = 14x+ 15x

87 = 29x

x =87

29= 3

We have found the value of x in a solution. To calculate the value of y we use one of thelinear functions in slope-intercept form

y =2

3× 3 + 3 = 5

14

The solution to the system of equations is x∗ = 3, y∗ = 5It is strongly recommended to check the result for the original system of equations.

5× 3 + 7× 5 = 15 + 35 = 50

4× 3− 6× 5 = 12− 30 = −18

7.1.2 The method of substitution

First we solve one of the equations in (12), say the second, for one of the variables, say y:

4x− 6y = −18

4x+ 18 = 6y

y =4

6x+ 3 =

2

3x+ 3 (13)

Then we replace y in the other equation by the result. (Do not forget to place bracketsaround the expression.) We obtain an equation in only one variable, x, which we solvefor x.

5x+ 7y = 50

5x+ 7

(2

3x+ 3

)= 50

5x+14

3x+ 21 = 50

15

3x+

14

3x+ 21 = 50

29

3x = 50− 21 = 29

x =3

29× 29 = 3

We have found x and can now use (13) to find y:

y =2

3x+ 3 =

2

3× 3 + 3 = 5

Hence the solution is x = 3, y = 5.

7.1.3 The method of elimination

To eliminate x we proceed in two stages: First we multiply the first equation by the coef-ficient of x in the second equation and we multiply the second equation by the coeffi cientof x in the first equation

5x+ 7y = 50 | × 4

4x− 6y = −18 | × 5

20x+ 28y = 200

20x− 30y = −90

15

Then we subtract the second equation from the first equation

20x+ 28y = 20020x− 30y = −90 |−0 + 28y − (−30y) = 200− (−90)58y = 290y = 290

58= 5

Having found y we use one of the original equations to find x

5x+ 7y = 50

5x+ 35 = 50

5x = 15

x = 3

Instead of first eliminating x we could have first eliminated y:

5x+ 7y = 50 | × 6

4x− 6y = −18 | × 7

30x+ 42y = 300

28x− 42y = −126 |+58x = 174

x = 3

7.1.4 Cramer’s rule

This is a little preview on linear algebra. In linear algebra the system of equations iswritten as [

5 74 −6

] [xy

]=

[50−18

]where

[x1

x2

]and

[50−18

]are so-called columns vectors, simply two numbers written

below each other and surrounded by square brackets.[

5 74 −6

]is the 2 × 2-matrix

of coeffi cients. A 2 × 2-matrix is a system of four numbers arranged in a square andsurrounded by square brackets.With each 2× 2-matrix

A =

[a bc d

]we associate a new number called the determinant

detA =

∣∣∣∣ a bc d

∣∣∣∣ = ad− cb

Notice that the determinant is indicated by vertical lines in contrast to square brackets fora matrix.

16

For instance, ∣∣∣∣ 5 74 −6

∣∣∣∣ = 5× (−6)− 4× 7 = −30− 28 = −58

Cramer’s rule uses determinants to give an explicit formula for the solution of a linearsimultaneous system of equations of the form

ax+ by = e

cx+ dy = f

or [a bc d

] [xy

]=

[ef

].

Namely,

x∗ =

∣∣∣∣ e bf d

∣∣∣∣∣∣∣∣ a bc d

∣∣∣∣ =ed− bfad− bc y∗ =

∣∣∣∣ a ec f

∣∣∣∣∣∣∣∣ a bc d

∣∣∣∣ =af − cead− bc

Thus x∗ and y∗ are calculated as the quotients of two determinants. In both cases wedivide by the determinant of the matrix of the coeffi cients. For x∗ the determinant inthe numerator is the determinant of the matrix obtained by replacing in the matrix ofcoeffi cients the coeffi cients a and c of x by the constant terms e and f , i.e., the firstcolumn is replaced by the column vector with the constant terms. Similarly, for y∗ thedeterminant in the numerator is the determinant of the matrix obtained by replacing inthe matrix of coeffi cients the coeffi cients b and d of y by the constant terms e and f , i.e.,the second column is replaced by the column vector with the constant terms.The method works only if the determinant in the denominator is not zero.In our example,

x∗ =

∣∣∣∣ 50 7−18 −6

∣∣∣∣∣∣∣∣ 5 74 −6

∣∣∣∣ =50× (−6)− (−18)× 7

5× (−6)− 4× 7=−300 + 126

−30− 28=−174

−58= 3

y∗ =

∣∣∣∣ 5 504 −18

∣∣∣∣∣∣∣∣ 5 74 −6

∣∣∣∣ =5× (−18)− 4× 50

5× (−6)− 4× 7=−90− 200

−58=−290

−58= 5

Exercise 24 Use the above methods to find the optimum in Example 11

7.1.5 Existence and uniqueness.

A linear simultaneous system of equations

ax+ by = e

cx+ dy = f

17

can have zero, exactly one or infinitely many solution. To see that only these cases canarise, bring both equations into slope-intercept form3

y =e

b− a

bx

y =f

d− c

dx

If the slopes of these two linear functions are the same, they describe identical or parallellines. The slopes are identical when a

b= c

d, i.e., when the determinant of the matrix of

coeffi cients ad− cb is zero. If in addition the intercepts are equal, both equations describethe same line. In this case all points (x, y) on the line are solutions. If the intercepts aredifferent the two equations describe parallel lines. These do not intersect and hence thereis no solution.

Example 25

x+ 2y = 3

2x+ 4y = 4

has no solution: The two lines

y =3

2− 1

2x

y = 1− 1

2x

are parallel

2 1 1 2 3 4 5

1

1

2

x

y

There is no common solution. Trying to find one yields a contradiction

3

2− 1

2x = y = 1− 1

2x |+ 1

2x

3

2= 1

3We assume here for simplicity that the coeffi cients b and d are not zero. The arguments can be easilyextended to these cases, except when all four coeffi cients are zero. The latter case is trivial - either thereis no solution or all pairs of numbers (x, y) are solutions.

18

7.2 One equation non-linear, one linear

Consider, for instance,

y2 + x− 1 = 0

y +1

2x = 1

In this case we solve the linear equation for one of the variables and substitute the resultinto the non-linear equation. As a result we obtain one non-linear equation in a singleunknown. This makes the problem simpler, but admittedly some luck is needed to solvethe non-linear equation in one variable. Solving the linear equation for x can sometimesgive a simpler non-linear equation than solving for y and vice versa. One may have to tryboth possibilities. In our example it is convenient to solve for x:

1

2x = 1− yx = 2− 2y

y2 + (2− 2y)− 1 = y2 − 2y + 1 = (y − 1)2 = 0

So the unique solution is y∗ = 1 and x∗ = 2− 2× 1 = 0.

7.3 Two non-linear equations

There is no general method. One needs to memorize some tricks which work in specialcases.In the example of profit-maximization with a Cobb-Douglas production function we

were led to the system

1

6K−

56L

12 =

1

121

2K

16L−

12 =

3

12.

This is a simultaneous system of two equations in two unknowns which looks hard tosolve. However, as we have seen, division of the two equations shows that L = K musthold in a critical point.We can substitute this into, say, the first equation and obtain

1

12=

1

6K−

56L

12 =

1

6K−

56K

12 =

1

6K−

56

+ 36 =

1

6K−

26 =

1

6K−

13

K−13 =

1

213√K

=1

23√K = 2

K = 23 = 8

Thus the solution to the first order conditions (and, in fact, the optimum) isK∗ = L∗ = 8.

Remark 26 This method works with any Cobb-Douglas production function Q (K,L) =KaLb where the indices a and b are positive and sum to less than 1.

19

Remark 27 The first order conditions (9) and (10) form a “hidden” linear system ofequations. Namely, if you take the logarithms (introduced later!) you get the system

ln1

6− 5

6lnK +

1

2lnL = ln

1

12

ln1

2+

1

5lnK − 1

2lnL = ln

3

12

which is linear in lnK and lnL. One can solve this system for lnK and lnL, which thengives us the values for L and K.

8 Youngs theorem

Example:z = f (x, y) = x3 + 3x2y2 + y4

This is a function in two variables. When we calculate the two partial derivatives weobtain two new functions in two variables:

∂z

∂x= 3x2 + 6xy2

∂z

∂y= 6x2y + 4y3

Each of these two functions can again be partially differentiated in the two directions. Sothere are four second partial derivatives:

∂2z

∂x2=

∂

∂x

(∂z

∂x

)=

∂

∂x

(3x2 + 6xy2

)= 6x+ 6y2

∂2z

∂y∂x=

∂

∂y

(∂z

∂x

)=

∂

∂y

(3x2 + 6xy2

)= 12xy

∂2z

∂x∂y=

∂

∂x

(∂z

∂y

)=

∂

∂x

(6x2y + 4y3

)= 12xy

∂2z

∂y2=

∂

∂y

(∂z

∂y

)=

∂

∂y

(6x2y + 4y3

)= 6x2 + 12y2

We observe that out of these four new functions two are the same, namely the crossderivatives ∂2z

∂y∂xand ∂2z

∂x∂y. Thus it does not matter whether we first take the derivative

with respect to x and then with respect to y or vice versa. Youngs theorem states thetwo cross derivatives ∂2z

∂y∂xand ∂2z

∂x∂yare always the same, provided some mild conditions

are satisfied.4.The four second derivatives are conveniently arranged in the form of a 2 × 2-matrix

called the Hessian matrix. In our example:

H =

[∂2z∂x2

∂2z∂y∂x

∂2z∂x∂y

∂2z∂y2

]=

[6x+ 6y2 12xy

12xy 6x2 + 12y2

]4The second derivatives must exist and must be continuous functions.

9 Second order conditions

A critical point of a function z = f (x, y) can be a peak, a trough or a saddle point:

4

30

10

20

0

zx

2

y0 42 4

4

z = −x2 − y2

peak

100

10z10

0

y100

0 10

x10

0

z = x2 − y2

saddle

44

x2

y

0

10

20

30

z

4204

z = x2 + y2

trough

As in the case of a function in one variable, the second derivatives can be used todecide which of the three cases occur. However, the criterion that separates a saddlepoints from peaks and troughs may come as a surprise.

Theorem 28 Suppose that the function z = f (x, y) has a critical point at (x∗, y∗). If thedeterminant of the Hessian

detH =

∣∣∣∣∣ ∂2z∂x2

∂2z∂y∂x

∂2z∂x∂y

∂2z∂y2

∣∣∣∣∣ =∂2z

∂x2

∂2z

∂y2− ∂2z

∂x∂y

∂2z

∂y∂x=∂2z

∂x2

∂2z

∂y2−(∂2z

∂x∂y

)2

is negative at (x∗, y∗) then (x∗, y∗) is a saddle point.21

If this determinant is positive then at (x∗, y∗) then (x∗, y∗) is a peak or a trough. Inthis case the signs of ∂

2z∂x2 |(x∗,y∗) and

∂2z∂y2 |(x∗,y∗)

are the same. If both signs are positive, then

(x∗, y∗) is a trough. If both signs are negative, then (x∗, y∗) is a peak.Nothing can be said if the determinant is zero.

detH

< 0: saddle point

> 0:{

∂2z∂x2

> 0: trough∂2z∂x2

< 0: peak

Remark 29 If ∂2z∂x2is positive and ∂2z

∂y2is negative or vice versa then the determinant is

automatically negative and hence one has a saddle point. This is so because ∂2z∂x2

∂2z∂y2

is

negative and therefore also ∂2z∂x2

∂2z∂y2−(∂2z∂x∂y

)2

.

Remark 30 When the determinant is zero more complicated saddle points are possible.For instance, z = yx2 − y3 has a critical point at (0, 0) where all second derivatives arezero. Its graph is called a “monkey saddle”.

Example 31 z = f (x, y) = x2 − y2. The partial derivatives are ∂z∂x

= 2x and ∂z∂y

= −2y.Clearly, (0, 0) is the only critical point. The determinant of the Hessian is

detH =

∣∣∣∣∣ ∂2z∂x2

∂2z∂y∂x

∂2z∂x∂y

∂2z∂y2

∣∣∣∣∣ =

∣∣∣∣ 2 00 −2

∣∣∣∣ = (2) (−2)− 0× 0 = −4 < 0

Hence the function has a saddle point at (0, 0).

Example 32 z = − 110x2 + 101

100yx− 1

10y2 has the two partial derivatives ∂z

∂x= −1

5x+ 101

100y

and ∂z∂y

= 101100x − 1

5y. It is not diffi cult to show that (0, 0) is the only critical point. The

determinant of the Hessian is

detH =

∣∣∣∣∣ ∂2z∂x2

∂2z∂y∂x

∂2z∂x∂y

∂2z∂y2

∣∣∣∣∣ =

∣∣∣∣ −15

101100

101100

15

∣∣∣∣ =

(−1

5

)(−1

5

)−(

101

100

)2

= − 9801

10 000< 0

22

Hence the function has a saddle point at (0, 0) and not a minimum, although both ∂2z∂x2

and∂2z∂y2are negative. The graph of the function and the level curves are as follows:

yx

44

2

20

2002

20

44

2z 0z

4

2

0

x

y2

4

4

04 2 2

Exercise 33 Show that the critical point in the example on page 1 is a peak.

Example 34 We have shown in the handouts of the previous weeks that the profit function

Π (K,L) = 12K16L

12 −K − 3L

has a critical point at K∗ = L∗ = 8. Is it a relative maximum? We have

∂Π

∂K= 2K−

56L

12

∂Π

∂L= 6K

16L−

12

and hence

H =

[∂2Π∂K2

∂2Π∂L∂K

∂2Π∂K∂L

∂2Π∂L2

]=

[−5

3K−

116 L

12 K−

56L−

12

K−56L−

12 −3K

16L−

32

]For K = L = 8 we obtain (since 8−

83 = 1

( 3√8)= 1

64)

detH|K=8L=8

=

∣∣∣∣ −53× 8−

116

+ 12 8−

56− 12

8−56− 12 −3× 8

16− 32

∣∣∣∣ =

∣∣∣∣ −53× 8−

43 8−

43

8−43 −3× 8−

43

∣∣∣∣=

(−5

3× 8−

43

)(−3× 8−

43

)−(

8−43

)2

= 5×(

8−43

)2

−(

8−43

)2

= 4×(

8−43

)2

= 4× 8−83 =

4(3√

8)8 =

22

28=

1

26=

1

64> 0

Thus the determinant is positive. Since ∂2Π∂K2

|K=8L=8

> 0 we can conclude that K∗ = L∗ = 8

is a trough.

10 Absolute maxima or minima

The role of intervals for functions in one variable is taken over by convex sets whenconsidering functions in two variables. A region of the (x, y)-plane is called convex ifit contains with every two points (x1, y1) and (x2, y2) in the region the line segment

23

connecting these two points. (Notice: There are concave and convex functions and thereare concave and convex lenses, but there is no such thing as a “concave set”.)

two convex sets two non-convex sets

The (x, y)-plane and the the set of all non-negative coordinate pairs (x, y) are also convexsets.An absolute maximum of a function f (x, y) in a region is a point (x∗, y∗) in the region

such that f (x∗, y∗) ≥ f (x, y) holds with respect to all other points in the region.

Theorem 35 Suppose the function f (x, y) is defined on a convex set and has there aunique critical point (x∗, y∗). If the determinant of the Hessian at this point is positiveand if ∂

2f∂x2

is negative (positive) then (x∗, y∗) is an absolute maximum (minimum) on theconvex set.

The conditions are satisfied in many examples we have discussed, for instance in theprofit-maximization problem above.

Remark 36 Suppose the function f (x, y) is defined on a convex set. Suppose the de-terminant of the Hessian is always positive in the region and that ∂

2f∂x2

is always positive(respectively, negative). Then the function is convex (respectively, negative).Hereby the function is convex (concave) if the area above (below) the graph of the

function is convex.

24

1 functions in two variables

Documents