BEEM103: Optimization Techniques for Economists, Dieter Balkenborg
Department of Economics
Lecture Week 2 University of Exeter
1 Functions in two variables
A function $z = f(x, y)$, or simply $z(x, y)$, in two independent variables with one dependent variable assigns to each pair $(x, y)$ of (decimal) numbers from a certain domain $D$ in the two-dimensional plane a number $z = f(x, y)$.[1] Here $x$ and $y$ are the independent variables and $z$ is the dependent variable. The graph of $f$ is the surface in 3-dimensional space consisting of all points $(x, y, f(x, y))$ with $(x, y)$ in $D$.
Example 1 The cubic polynomial
$$z = f(x, y) = x^3 - 3x^2 - y^2$$
is defined on the entire plane. It has the graph:
[Figure: surface plot of $z = x^3 - 3x^2 - y^2$ with axes $x$, $y$ and $z$.]
The vertical line is here the z-axis; the x- and the y-axis span a horizontal plane. Thus the height of the graph indicates the values of the function. Lighter colours correspond to higher values of the function. For instance, $f(0, 0) = 0$, so the origin of the coordinate system $(0, 0, 0)$ belongs to the graph and is a peak. The function has a saddle point at $(x, y) = (2, 0)$, where $z = f(2, 0) = 8 - 3 \times 4 - 0 = -4$.
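The values quoted for Example 1 can be checked numerically. The following is a minimal Python sketch; the name `f` mirrors the text, while the step size and the neighbourhood test are illustrative choices and not part of the lecture.

```python
def f(x, y):
    """Cubic polynomial from Example 1: z = x^3 - 3x^2 - y^2."""
    return x**3 - 3 * x**2 - y**2

# The two points discussed in the text.
print(f(0, 0))  # 0
print(f(2, 0))  # -4

# Near the origin the function stays at or below f(0, 0) = 0 (a peak) ...
step = 0.1
neighbours = [(dx, dy) for dx in (-step, 0, step) for dy in (-step, 0, step)]
is_local_peak = all(f(dx, dy) <= f(0, 0) for dx, dy in neighbours)
print(is_local_peak)

# ... whereas around (2, 0) it rises in the x-direction and falls in the
# y-direction, which is the signature of a saddle point.
rises_in_x = f(2 + step, 0) > f(2, 0) and f(2 - step, 0) > f(2, 0)
falls_in_y = f(2, step) < f(2, 0) and f(2, -step) < f(2, 0)
print(rises_in_x, falls_in_y)
```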
Exercise 2 Evaluate
z = f (2, 1)
z = f (3, 0)
z = f (4,−4)
z = f (4, 4)
Is this information consistent with the graph?
[1] The two-dimensional plane consisting of all pairs of numbers $(x, y)$ is usually denoted by $\mathbb{R}^2$.
Example 3 A production function like
$$Q = \sqrt[6]{K}\,\sqrt{L} = K^{1/6} L^{1/2}$$
of a firm describes for each combination of capital ($K \ge 0$) and labour ($L \ge 0$) the total amount $Q \ge 0$ of output which the firm can produce with these inputs. The specified function has the graph:
[Figure: surface plot of the production function $Q = K^{1/6} L^{1/2}$.]
Example 4 Consider a firm with the above production function and assume that the firm is a price taker in the product market and in both factor markets. If $P$ is the price of the commodity produced, $r$ the interest rate (= the price of capital) and $w$ the wage rate (= the price of labour) then the profit of this firm is
$$\Pi(K, L) = PQ - rK - wL = P K^{1/6} L^{1/2} - rK - wL$$
if it employs $K$ units of capital and $L$ units of labour. For the prices $P = 12$, $r = 1$, $w = 3$ we obtain the function
$$\Pi(K, L) = 12 K^{1/6} L^{1/2} - K - 3L$$
with the graph:
[Figure: surface plot of the profit function $\Pi(K, L) = 12 K^{1/6} L^{1/2} - K - 3L$.]
One can show that profit is maximized for $K = L = 8$.
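The claim that profit peaks at $K = L = 8$ can be made plausible with a coarse grid search. This is only a numerical sanity check under the prices of Example 4, not a proof; the function name `profit` and the grid are assumptions made for illustration.

```python
def profit(K, L, P=12.0, r=1.0, w=3.0):
    """Profit of the firm in Example 4 for the given prices."""
    return P * K ** (1 / 6) * L ** (1 / 2) - r * K - w * L

best = profit(8, 8)

# Compare with every point on a coarse grid of input combinations
# (0.5, 1.0, ..., 20.0 in both directions; the grid contains (8, 8)).
grid = [0.5 * i for i in range(1, 41)]
grid_max = max(profit(K, L) for K in grid for L in grid)

print(round(best, 6))           # profit at K = L = 8
print(grid_max <= best + 1e-9)  # no grid point does better
```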
The level curve of the function $z = f(x, y)$ for the level $c$ is the solution set to the equation
$$f(x, y) = c$$
where $c$ is a given constant. Geometrically, a level curve is obtained by intersecting the graph of $f$ with a horizontal plane $z = c$ and then projecting into the $(x, y)$-plane. This is illustrated here for the cubic polynomial discussed above:
[Figure: graph of the cubic polynomial intersected with a horizontal plane, and the corresponding level curves in the $(x, y)$-plane.]
In the case of a production function the level curves are called isoquants. An isoquant shows for a given output level capital-labour combinations which yield the same output.
[Figure: graph of the production function and its isoquants.]
Finally, the linear function
$$z = 3x + 4y$$
has the graph and the level curves:
[Figure: graph of $z = 3x + 4y$ and its level curves.]
The level curves of a linear function form a family of parallel lines. This can be seen algebraically as follows: For a given level $c$ the level curve is given by
$$c = 3x + 4y$$
$$4y = c - 3x$$
$$y = \frac{c}{4} - \frac{3}{4}x$$
The level curves are therefore all lines with slope $-\frac{3}{4}$. They differ only in the intercept $\frac{c}{4}$. Hence they are all parallel lines.
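The algebra for the level curves of $z = 3x + 4y$ can be mirrored in a few lines of Python. This is a small sketch; the helper name `level_curve_y` and the chosen levels are illustrative assumptions.

```python
def level_curve_y(x, c):
    """Solve c = 3x + 4y for y: the level curve of z = 3x + 4y at level c."""
    return c / 4 - 3 / 4 * x

slopes = []
for c in (0, 4, 8):
    y0, y1 = level_curve_y(0, c), level_curve_y(1, c)
    slope = y1 - y0          # rise over a run of 1
    slopes.append(slope)
    print(c, y0, slope)      # intercept c/4 changes, slope stays at -3/4
```

Every level produces the same slope, which is the numerical counterpart of the statement that the level curves are parallel.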
Exercise 5 Describe the isoquant of the production function
Q = KL
for the quantity Q = 2.
Exercise 6 Describe the isoquant of the production function
$$Q = \sqrt{KL}$$
for the quantity Q = 4.
Remark 7 The exercises illustrate the following general principle: If $h(z)$ is an increasing (or decreasing) function in one variable, then the composite function $h(f(x, y))$ has the same level curves as the given function $f(x, y)$. However, they correspond to different levels.
[Figure: surface plots of the production functions $Q = KL$ and $Q = \sqrt{KL}$.]
2 Partial derivatives
Exercise 8 What is the derivative of
$$z(x) = a^3 x^2$$
with respect to $x$ when $a$ is a given constant?
Exercise 9 What is the derivative of
$$z(y) = y^3 b^2$$
with respect to $y$ when $b$ is a given constant?
Consider now a function in two variables $z = f(x, y)$. If we fix $y$ at a value $y = y_0$ and vary only $x$ we obtain a function in one variable $z = F(x) = f(x, y_0)$. The derivative of this function $F(x)$ at $x = x_0$ is called the partial derivative of $f$ with respect to $x$ and denoted by
$$\left.\frac{\partial z}{\partial x}\right|_{x=x_0,\,y=y_0} = \left.\frac{dF}{dx}\right|_{x=x_0} = \frac{dF}{dx}(x_0)$$
(see footnote [2]). The partial derivative $\left.\frac{\partial z}{\partial y}\right|_{x=x_0,\,y=y_0}$ is defined symmetrically. (Notation: “d” = “dee”, “δ” = “delta”, “∂” = “del”.)
We do not have to specify $y_0$ numerically. To partially differentiate in the direction of $x$ it suffices to think of $y$ and all expressions containing only $y$ as exogenously fixed constants. We can then use the familiar rules for differentiating functions in one variable in order to obtain $\frac{\partial z}{\partial x}$.
Other common notations for partial derivatives are $\frac{\partial f}{\partial x}$, $\frac{\partial f}{\partial y}$ or $f_x$, $f_y$.
Example 10 Let
$$z(x, y) = y^3 x^2$$
Then
$$\frac{\partial z}{\partial x} = 2 y^3 x, \qquad \frac{\partial z}{\partial y} = 3 y^2 x^2$$
Example 11 Let
$$z = x^3 + x^2 y^2 + y^4.$$
Setting e.g. $y = 1$ we obtain $z = x^3 + x^2 + 1$ and hence
$$\left.\frac{\partial z}{\partial x}\right|_{y=1} = 3x^2 + 2x, \qquad \left.\frac{\partial z}{\partial x}\right|_{x=1,\,y=1} = 5$$
For fixed, but arbitrary, $y$ we obtain
$$\frac{\partial z}{\partial x} = 3x^2 + 2xy^2$$
as follows: We can differentiate the sum $x^3 + x^2 y^2 + y^4$ with respect to $x$ term-by-term. Differentiating $x^3$ yields $3x^2$, differentiating $x^2 y^2$ yields $2xy^2$ because we think now of $y^2$ as a constant and $\frac{d(ax^2)}{dx} = 2ax$ holds for any constant $a$. Finally, the derivative of any constant term is zero, so the derivative of $y^4$ with respect to $x$ is zero.
Similarly, considering $x$ as fixed and $y$ variable we obtain
$$\frac{\partial z}{\partial y} = 2x^2 y + 4y^3$$
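A symmetric difference quotient gives a quick numerical check of the partial derivatives of $z = x^3 + x^2 y^2 + y^4$. The sketch below is illustrative only: the helper `central_difference` and the step size `h` are assumptions, not part of the lecture.

```python
def z(x, y):
    return x**3 + x**2 * y**2 + y**4

def dz_dx(x, y):
    """Partial derivative with respect to x (y treated as a constant)."""
    return 3 * x**2 + 2 * x * y**2

def dz_dy(x, y):
    """Partial derivative with respect to y (x treated as a constant)."""
    return 2 * x**2 * y + 4 * y**3

def central_difference(g, x, y, wrt, h=1e-6):
    """Approximate a partial derivative of g by a symmetric difference quotient."""
    if wrt == "x":
        return (g(x + h, y) - g(x - h, y)) / (2 * h)
    return (g(x, y + h) - g(x, y - h)) / (2 * h)

x0, y0 = 1.0, 1.0
print(dz_dx(x0, y0), central_difference(z, x0, y0, "x"))  # both close to 5
print(dz_dy(x0, y0), central_difference(z, x0, y0, "y"))  # both close to 6
```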
[2] From now on I will often write $\left.\frac{dz}{dx}\right|_{x=x_0}$ instead of $\frac{dz}{dx}(x_0)$ to denote the derivative evaluated at a special value of $x$.
Example 12 The partial derivatives $\frac{\partial q}{\partial K}$ and $\frac{\partial q}{\partial L}$ of a production function $q = f(K, L)$ are called the marginal product of capital and (respectively) labour. They describe approximately by how much output increases if the input of capital (respectively labour) is increased by a small unit. If one fixes for the production function in Example 3 capital input at $K = 64$ one obtains $q = K^{1/6} L^{1/2} = 2 L^{1/2}$, which has, as a function in the one variable $L$, the graph
[Figure: graph of $q = 2 L^{1/2}$ for $0 \le L \le 5$.]
This graph is obtained from the graph of the function in two variables by intersecting the latter with a vertical plane parallel to the L- and q-axes.
[Figure: graph of the production function intersected with a vertical plane at fixed $K$.]
Thus the partial derivatives $\frac{\partial q}{\partial K}$ and $\frac{\partial q}{\partial L}$ describe geometrically the slope of the function in the K- and, respectively, the L-direction. Notice that for any given level of capital there is diminishing productivity of labour: The more labour is used, the less is the increase in output when one more unit of labour is employed. Algebraically this can be verified as follows:
$$\frac{\partial q}{\partial L} = \frac{1}{2} K^{1/6} L^{-1/2} = \frac{1}{2}\,\frac{\sqrt[6]{K}}{\sqrt{L}} > 0,$$
the marginal product of labour is always positive. Differentiating a second time we obtain
$$\frac{\partial^2 q}{\partial L^2} = \frac{\partial}{\partial L}\left(\frac{\partial q}{\partial L}\right) = -\frac{1}{4} K^{1/6} L^{-3/2} = -\frac{1}{4}\,\frac{\sqrt[6]{K}}{\sqrt{L^3}} < 0,$$
so the marginal product of labour is decreasing. Verify similarly that there is diminishing productivity of capital!
Exercise 13 Find the partial derivatives of
$$z = (x^2 + 2x)(y^3 - y^2) + 10x + 3y$$
3 The tangent plane
For a function $y = f(x)$ in one variable the derivative $\left.\frac{df}{dx}\right|_{x=x_0}$ at $x_0$ is the slope of the tangent to the graph of $f$ through the point $(x_0, f(x_0))$.
[Figure: the tangent of $f(x) = x^2$ at $x = 1$.]
This tangent is the graph of the linear function
$$y = l(x) = f(x_0) + \left.\frac{df}{dx}\right|_{x=x_0} (x - x_0).$$
This is so because $l(x_0) = f(x_0)$, so the graph of $l$ goes through the point $(x_0, f(x_0))$. Moreover,
$$y = l(x) = \left(f(x_0) - \left.\frac{df}{dx}\right|_{x=x_0} x_0\right) + \left.\frac{df}{dx}\right|_{x=x_0} x,$$
so $l(x)$ is indeed a linear function in $x$ with intercept $f(x_0) - \left.\frac{df}{dx}\right|_{x=x_0} x_0$ and the “right” slope $\left.\frac{df}{dx}\right|_{x=x_0}$.
The graph of a function in one variable is a curve and its tangent is a line. By analogy, the graph of a function $z = f(x, y)$ in two variables is a surface and its tangent at a point $(x_0, y_0)$ is a plane.
[Figure: a surface $z = f(x, y)$ together with its tangent plane.]
Non-vertical planes are the graphs of linear functions $z = ax + by + c$. The tangent plane of the function $z = f(x, y)$ in $(x_0, y_0)$ is, correspondingly, the graph of the linear function
$$z = l(x, y) = f(x_0, y_0) + \left.\frac{\partial f}{\partial x}\right|_{x=x_0,\,y=y_0} (x - x_0) + \left.\frac{\partial f}{\partial y}\right|_{x=x_0,\,y=y_0} (y - y_0) \qquad (1)$$
since the graph of this linear function contains the point $(x_0, y_0, f(x_0, y_0))$ and has the right slopes in the x- and y-directions.
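Example 14 This is the example shown in the above graph. The tangent plane of $z = f(x, y) = \sqrt{x} + \sqrt{y}$ at the point $(1, 1)$ is obtained as follows: We have
$$\frac{\partial z}{\partial x} = \frac{1}{2}\,\frac{1}{\sqrt{x}} \quad\text{and}\quad \frac{\partial z}{\partial y} = \frac{1}{2}\,\frac{1}{\sqrt{y}}$$
and hence
$$\left.\frac{\partial z}{\partial x}\right|_{x=1,\,y=1} = \frac{1}{2} \quad\text{and}\quad \left.\frac{\partial z}{\partial y}\right|_{x=1,\,y=1} = \frac{1}{2}.$$
Since $f(1, 1) = 2$ the tangent plane is the graph of the linear function
$$z = l(x, y) = 2 + \frac{1}{2}(x - 1) + \frac{1}{2}(y - 1) = 1 + \frac{1}{2}x + \frac{1}{2}y$$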
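How well the tangent plane of Example 14 approximates the surface near $(1, 1)$ can be seen numerically. A minimal sketch, assuming the function names `f` and `tangent_plane` (chosen here for illustration):

```python
import math

def f(x, y):
    return math.sqrt(x) + math.sqrt(y)

def tangent_plane(x, y):
    """Linear function l(x, y) = 1 + x/2 + y/2 from Example 14."""
    return 1 + 0.5 * x + 0.5 * y

errors = []
for d in (0.1, 0.01, 0.001):
    x, y = 1 + d, 1 + d
    error = abs(f(x, y) - tangent_plane(x, y))
    errors.append(error)
    print(d, error)  # the error shrinks much faster than the displacement d
```

At the point of tangency itself the two functions agree exactly, and the gap between them shrinks quickly as one approaches that point.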
Using the abbreviations $dx = x - x_0$, $dy = y - y_0$ and $dz = z - z_0$ the formula for the tangent plane can be neatly rewritten as
$$dz = \frac{\partial z}{\partial x}\,dx + \frac{\partial z}{\partial y}\,dy \qquad (2)$$
an expression which is called the total differential.
4 The slope of level curves
Suppose $(x_0, y_0)$ is a point on the level curve $f(x, y) = c$. Then the slope of the tangent to the level curve at this point in the x-y-plane is
$$\left.\frac{dy}{dx}\right|_{x=x_0,\,y=y_0} = -\left.\frac{\partial z}{\partial x}\right|_{x=x_0,\,y=y_0} \Big/ \left.\frac{\partial z}{\partial y}\right|_{x=x_0,\,y=y_0} \qquad (3)$$
provided $\left.\frac{\partial z}{\partial y}\right|_{x=x_0,\,y=y_0} \neq 0$. This is known as the rule for implicit differentiation: If the function $y = y(x)$ is implicitly defined by the equation $f(x, y(x)) = c$, then its derivative $\frac{dy}{dx}$ is given by the above formula.
Example 15 A linear function
$$z = ax + by + c$$
has the partial derivatives $\frac{\partial z}{\partial x} = a$ and $\frac{\partial z}{\partial y} = b$. According to the above formula its level curves have the slope
$$-\frac{\partial z}{\partial x} \Big/ \frac{\partial z}{\partial y} = -\frac{a}{b}.$$
Indeed, its level curves are the solutions to the equations
$$\bar{c} = ax + by + c$$
with some constant $\bar{c}$. Solving the equation for $y$ we obtain
$$y = -\frac{a}{b}x + \frac{\bar{c} - c}{b}$$
which is a linear function with slope $-\frac{a}{b}$.
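Example 16 An isoquant of the production function $Q = K^{1/6} L^{1/2}$ has according to the above formula the slope
$$\frac{dK}{dL} = -\frac{\partial Q}{\partial L} \Big/ \frac{\partial Q}{\partial K} = -\left(\frac{1}{2} K^{1/6} L^{-1/2}\right) \Big/ \left(\frac{1}{6} K^{-5/6} L^{1/2}\right) = -\frac{6}{2}\, K^{1/6} K^{5/6} L^{-1/2} L^{-1/2} = -3\,\frac{K}{L}$$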
in the point $(K, L)$.
Indeed, an isoquant is the set of solutions to the equation $\bar{q} = K^{1/6} L^{1/2}$ for fixed $\bar{q}$. Solving for $K$ we see that the isoquant is the graph of the function
$$K = \left(\bar{q} L^{-1/2}\right)^6 = \bar{q}^6 L^{-3}.$$
Differentiation yields
$$\frac{dK}{dL} = -3 \bar{q}^6 L^{-4}$$
and since $(K, L)$ is on the isoquant
$$\frac{dK}{dL} = -3 \left(K^{1/6} L^{1/2}\right)^6 L^{-4} = -3\,\frac{K}{L}.$$
Economically, $-\frac{dK}{dL}$ is the marginal rate of substitution of labour for capital: If labour is reduced by one unit, how much more capital is (approximately) needed to produce the same output? Just reducing labour by one unit reduces output by the marginal product of labour $\frac{\partial q}{\partial L}$. Each additional unit of capital produces $\frac{\partial q}{\partial K}$ (the marginal product of capital) more units. Therefore, if now capital input is increased by $\frac{\partial q}{\partial L} \big/ \frac{\partial q}{\partial K}$ units, output changes overall by
$$-\frac{\partial q}{\partial L} + \left(\frac{\partial q}{\partial L} \Big/ \frac{\partial q}{\partial K}\right)\frac{\partial q}{\partial K} = 0.$$
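Remark 17 A quick way to memorize the formula (3) is to use the total differential as follows: In order to stay on the level curve one must have $dz = z - z_0 = 0$, so
$$0 = dz = \frac{\partial z}{\partial x}\,dx + \frac{\partial z}{\partial y}\,dy$$
hence
$$dy = -\left(\frac{\partial z}{\partial x} \Big/ \frac{\partial z}{\partial y}\right) dx$$
or
$$\frac{dy}{dx} = -\frac{\partial z}{\partial x} \Big/ \frac{\partial z}{\partial y}.$$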
Remark 18 The implicit function theorem states that if $\left.\frac{\partial z}{\partial y}\right|_{x=x_0,\,y=y_0} \neq 0$ for some point $(x_0, y_0)$ on a level curve $f(x, y) = c$, then one can find a function $y = g(x)$ defined near $x = x_0$ such that the level curve is locally the graph of this function, i.e. one has $f(x, g(x)) = c$ for $x$ near $x_0$.
Exercise 19 For the production function
$$Q = \sqrt{KL}$$
calculate the marginal rate of substitution for K = 3 and L = 4.
5 The chain rule
Suppose that $z = f(x, y)$ is a function in two variables $x$, $y$. Suppose that $x$ depends on the variable $t$ via $x = g(t)$ and that $y$ depends on the variable $t$ via $y = h(t)$. Then we can define the composite function in one variable $z = F(t) = f(x(t), y(t))$ which has $t$ as the independent and $z$ as the dependent variable. The derivative of this function, if it exists, is calculated as
$$\frac{dz}{dt} = \frac{\partial z}{\partial x}\frac{dx}{dt} + \frac{\partial z}{\partial y}\frac{dy}{dt}.$$
Notice that this is just the total differential divided by $dt$.
Example 20 Suppose $z = xy$, $x = t^3 + 2t$, $y = t^2 - t$. Then, according to the chain rule,
$$\frac{dz}{dt} = \frac{\partial z}{\partial x}\frac{dx}{dt} + \frac{\partial z}{\partial y}\frac{dy}{dt} = y \times (3t^2 + 2) + x \times (2t - 1) = (t^2 - t)(3t^2 + 2) + (2t - 1)(t^3 + 2t)$$
Since $z = xy = (t^3 + 2t)(t^2 - t)$ we get the same result using the product rule:
$$\frac{dz}{dt} = (3t^2 + 2)(t^2 - t) + (t^3 + 2t)(2t - 1).$$
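The chain rule result of Example 20 can be compared with a numerical derivative of the composite function. A short sketch; the evaluation point `t0` and step `h` are arbitrary illustrative choices.

```python
def x(t):
    return t**3 + 2 * t

def y(t):
    return t**2 - t

def F(t):
    """Composite function z = F(t) = x(t) * y(t)."""
    return x(t) * y(t)

def dz_dt(t):
    """Chain rule: dz/dt = (dz/dx)(dx/dt) + (dz/dy)(dy/dt) with z = xy."""
    return y(t) * (3 * t**2 + 2) + x(t) * (2 * t - 1)

t0, h = 2.0, 1e-6
numeric = (F(t0 + h) - F(t0 - h)) / (2 * h)
print(dz_dt(t0), numeric)  # both close to 64
```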
6 Optimization problems
6.1 Unconstrained optimization
Suppose we want to find the absolute maximum of a function
$$z = f(x, y)$$
in two variables, i.e., we want to find the pair of numbers $(x^*, y^*)$ such that
$$f(x^*, y^*) \ge f(x, y)$$
holds for all $(x, y)$. If $(x^*, y^*)$ is such a maximum the following must hold: If we freeze the variable $y$ at the optimal value $y^*$ and vary only the variable $x$ then the function
$$f(x, y^*),$$
which is now a function in just one variable, has a maximum at $x^*$. Typically the maximum of this function in one variable will be a critical point. Hence we expect the partial derivative $\frac{\partial z}{\partial x}$ to be zero at the optimum. Similarly we expect the partial derivative $\frac{\partial z}{\partial y}$ to be zero at the optimum. We are thus led to the first order conditions
$$\left.\frac{\partial z}{\partial x}\right|_{x=x^*,\,y=y^*} = 0, \qquad \left.\frac{\partial z}{\partial y}\right|_{x=x^*,\,y=y^*} = 0$$
which must typically be satisfied. We speak of first order conditions because only the first derivatives are involved. The conditions do not tell us whether we have a maximum, a minimum or a saddle point. To find out the latter we will have to look at the second derivatives and the so-called “second order conditions”, as will be discussed in a later lecture.
Example 21 Consider a price-taking firm with the production function $Q(K, L)$. Let $r$ be the interest rate (the price of capital $K$), let $w$ be the wage rate (the price of labour $L$) and let $P$ be the price of the output the firm is producing. When the firm uses $K$ units of capital and $L$ units of labour she can produce $Q(K, L)$ units of the output and hence make a revenue of
$$P \times Q(K, L)$$
by selling each unit of output at the market price $P$. Her production costs will then be her total expenditure on the inputs capital and labour
$$rK + wL.$$
Her profit will be the difference
$$\Pi(K, L) = P Q(K, L) - rK - wL.$$
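The firm tries to find the input combination $(K^*, L^*)$ which maximizes her profit. To find it we want to solve the first order conditions
$$\frac{\partial \Pi}{\partial K} = P\,\frac{\partial Q}{\partial K} - r = 0 \qquad (4)$$
$$\frac{\partial \Pi}{\partial L} = P\,\frac{\partial Q}{\partial L} - w = 0 \qquad (5)$$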
These conditions are intuitive: Suppose for instance $P\frac{\partial Q}{\partial K} - r > 0$. $\frac{\partial Q}{\partial K}$ is the marginal product of capital, i.e., it tells us how much more can be produced by using one more unit of capital. $P\frac{\partial Q}{\partial K}$ is the marginal increase in revenue when one more unit of capital is used and the additional output is sold on the market. $r$ is the marginal cost of using one more unit of capital. $P\frac{\partial Q}{\partial K} - r > 0$ means that the marginal profit from using one more unit of capital is positive, i.e., it increases profits to produce more by using more capital and therefore we cannot be in a profit optimum. Correspondingly, $P\frac{\partial Q}{\partial K} - r < 0$ would mean that profit can be increased by producing less and using less capital. So (4) should hold in the optimum. Similarly (5) should hold.
It is useful to rewrite the two first order conditions as
$$\frac{\partial Q}{\partial K} = \frac{r}{P} \qquad (6)$$
$$\frac{\partial Q}{\partial L} = \frac{w}{P} \qquad (7)$$
Thus, for both inputs it must be the case that the marginal product is equal to the ratio of the input price to the output price. When we divide here the two left-hand sides and equate them with the quotient of the right-hand sides we obtain as a consequence
$$\frac{\partial Q}{\partial K} \Big/ \frac{\partial Q}{\partial L} = \frac{r}{P} \Big/ \frac{w}{P} = \frac{r}{w} \qquad (8)$$
so the marginal rate of substitution must be equal to the ratio of the input prices (which is the relative price of capital in terms of labour).
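Example 22 Let us be more specific and assume that the production function is
$$Q(K, L) = K^{1/6} L^{1/2}$$
and that the output and input prices are $P = 12$, $r = 1$ and $w = 3$. Then
$$\frac{\partial Q}{\partial K} = \frac{1}{6} K^{-5/6} L^{1/2}$$
$$\frac{\partial Q}{\partial L} = \frac{1}{2} K^{1/6} L^{-1/2}$$
$$\frac{\partial Q}{\partial K} \Big/ \frac{\partial Q}{\partial L} = \frac{1}{6} K^{-5/6} L^{1/2} \Big/ \frac{1}{2} K^{1/6} L^{-1/2} = \frac{1}{3} K^{-5/6} L^{1/2} K^{-1/6} L^{1/2} = \frac{1}{3}\,\frac{L}{K}$$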
The first order conditions (6) and (7) become
$$\frac{1}{6} K^{-5/6} L^{1/2} = \frac{1}{12} \qquad (9)$$
$$\frac{1}{2} K^{1/6} L^{-1/2} = \frac{3}{12}. \qquad (10)$$
This is a simultaneous system of two equations in two unknowns. The implied condition (8) becomes
$$\frac{1}{3}\,\frac{L}{K} = \frac{1}{3}$$
which simplifies to
$$L = K$$
Example 23 A monopolist with total cost function $TC(Q) = Q^2$ sells his product in two different countries. When he sells $Q_A$ units of the good in country A he will obtain the price
$$P_A = 22 - 3Q_A$$
for each unit. When he sells $Q_B$ units of the good in country B he obtains the price
$$P_B = 34 - 4Q_B.$$
How much should the monopolist sell in the two countries in order to maximize profits?
To solve this problem we calculate first the profit function as the sum of the revenues in each country minus the production costs. Total revenue in country A is
$$TR_A = P_A Q_A = (22 - 3Q_A) Q_A$$
Total revenue in country B is
$$TR_B = P_B Q_B = (34 - 4Q_B) Q_B$$
Total production costs are
$$TC = (Q_A + Q_B)^2$$
Therefore the profit is
$$\Pi(Q_A, Q_B) = (22 - 3Q_A) Q_A + (34 - 4Q_B) Q_B - (Q_A + Q_B)^2,$$
a quadratic function in $Q_A$ and $Q_B$. In order to find the profit maximum we must find the critical points, i.e., we must solve the first order conditions
$$\frac{\partial \Pi}{\partial Q_A} = -3Q_A + (22 - 3Q_A) - 2(Q_A + Q_B) = 22 - 8Q_A - 2Q_B = 0$$
$$\frac{\partial \Pi}{\partial Q_B} = -4Q_B + (34 - 4Q_B) - 2(Q_A + Q_B) = 34 - 2Q_A - 10Q_B = 0$$
or
$$8Q_A + 2Q_B = 22 \qquad (11)$$
$$2Q_A + 10Q_B = 34.$$
Thus we have to solve a linear simultaneous system of two equations in two unknowns.
7 Simultaneous systems of equations
7.1 Linear systems
We illustrate four methods to solve a linear system of equations using the example
5x+ 7y = 50 (12)
4x− 6y = −18
7.1.1 Using the slope-intercept form
We rewrite both linear equations in slope-intercept form
$$7y = 50 - 5x$$
$$y = \frac{50}{7} - \frac{5}{7}x$$
$$4x - 6y = -18$$
$$4x + 18 = 6y$$
$$y = \frac{2}{3}x + 3$$
[Figure: the two lines $y = \frac{50}{7} - \frac{5}{7}x$ and $y = \frac{2}{3}x + 3$ in the $(x, y)$-plane.]
The solutions to each equation form a line, which we have now described as the graph of linear functions. At the intersection point of the two lines the two linear functions must have the same y-value. Hence
$$\frac{50}{7} - \frac{5}{7}x = y = \frac{2}{3}x + 3$$
$$\frac{50}{7} - 3 = \frac{2}{3}x + \frac{5}{7}x \qquad | \times 3 \times 7$$
$$150 - 63 = 14x + 15x$$
$$87 = 29x$$
$$x = \frac{87}{29} = 3$$
We have found the value of $x$ in a solution. To calculate the value of $y$ we use one of the linear functions in slope-intercept form
$$y = \frac{2}{3} \times 3 + 3 = 5$$
The solution to the system of equations is $x^* = 3$, $y^* = 5$.
It is strongly recommended to check the result for the original system of equations.
$$5 \times 3 + 7 \times 5 = 15 + 35 = 50$$
$$4 \times 3 - 6 \times 5 = 12 - 30 = -18$$
7.1.2 The method of substitution
First we solve one of the equations in (12), say the second, for one of the variables, say $y$:
$$4x - 6y = -18$$
$$4x + 18 = 6y$$
$$y = \frac{4}{6}x + 3 = \frac{2}{3}x + 3 \qquad (13)$$
Then we replace $y$ in the other equation by the result. (Do not forget to place brackets around the expression.) We obtain an equation in only one variable, $x$, which we solve for $x$.
$$5x + 7y = 50$$
$$5x + 7\left(\frac{2}{3}x + 3\right) = 50$$
$$5x + \frac{14}{3}x + 21 = 50$$
$$\frac{15}{3}x + \frac{14}{3}x + 21 = 50$$
$$\frac{29}{3}x = 50 - 21 = 29$$
$$x = \frac{3}{29} \times 29 = 3$$
We have found $x$ and can now use (13) to find $y$:
$$y = \frac{2}{3}x + 3 = \frac{2}{3} \times 3 + 3 = 5$$
Hence the solution is $x = 3$, $y = 5$.
7.1.3 The method of elimination
To eliminate $x$ we proceed in two stages: First we multiply the first equation by the coefficient of $x$ in the second equation and we multiply the second equation by the coefficient of $x$ in the first equation
$$5x + 7y = 50 \qquad | \times 4$$
$$4x - 6y = -18 \qquad | \times 5$$
$$20x + 28y = 200$$
$$20x - 30y = -90$$
Then we subtract the second equation from the first equation
$$0 + 28y - (-30y) = 200 - (-90)$$
$$58y = 290$$
$$y = \frac{290}{58} = 5$$
Having found $y$ we use one of the original equations to find $x$
$$5x + 7y = 50$$
$$5x + 35 = 50$$
$$5x = 15$$
$$x = 3$$
Instead of first eliminating $x$ we could have first eliminated $y$:
$$5x + 7y = 50 \qquad | \times 6$$
$$4x - 6y = -18 \qquad | \times 7$$
$$30x + 42y = 300$$
$$28x - 42y = -126 \qquad | +$$
$$58x = 174$$
$$x = 3$$
7.1.4 Cramer’s rule
This is a little preview of linear algebra. In linear algebra the system of equations is written as
$$\begin{bmatrix} 5 & 7 \\ 4 & -6 \end{bmatrix} \begin{bmatrix} x \\ y \end{bmatrix} = \begin{bmatrix} 50 \\ -18 \end{bmatrix}$$
where $\begin{bmatrix} x \\ y \end{bmatrix}$ and $\begin{bmatrix} 50 \\ -18 \end{bmatrix}$ are so-called column vectors, simply two numbers written below each other and surrounded by square brackets. $\begin{bmatrix} 5 & 7 \\ 4 & -6 \end{bmatrix}$ is the $2 \times 2$-matrix of coefficients. A $2 \times 2$-matrix is a system of four numbers arranged in a square and surrounded by square brackets.
With each $2 \times 2$-matrix
$$A = \begin{bmatrix} a & b \\ c & d \end{bmatrix}$$
we associate a new number called the determinant
$$\det A = \begin{vmatrix} a & b \\ c & d \end{vmatrix} = ad - cb$$
Notice that the determinant is indicated by vertical lines in contrast to square brackets for a matrix.
For instance,
$$\begin{vmatrix} 5 & 7 \\ 4 & -6 \end{vmatrix} = 5 \times (-6) - 4 \times 7 = -30 - 28 = -58$$
Cramer’s rule uses determinants to give an explicit formula for the solution of a linear simultaneous system of equations of the form
$$ax + by = e$$
$$cx + dy = f$$
or
$$\begin{bmatrix} a & b \\ c & d \end{bmatrix} \begin{bmatrix} x \\ y \end{bmatrix} = \begin{bmatrix} e \\ f \end{bmatrix}.$$
Namely,
$$x^* = \frac{\begin{vmatrix} e & b \\ f & d \end{vmatrix}}{\begin{vmatrix} a & b \\ c & d \end{vmatrix}} = \frac{ed - bf}{ad - bc}, \qquad y^* = \frac{\begin{vmatrix} a & e \\ c & f \end{vmatrix}}{\begin{vmatrix} a & b \\ c & d \end{vmatrix}} = \frac{af - ce}{ad - bc}$$
Thus $x^*$ and $y^*$ are calculated as the quotients of two determinants. In both cases we divide by the determinant of the matrix of the coefficients. For $x^*$ the determinant in the numerator is the determinant of the matrix obtained by replacing in the matrix of coefficients the coefficients $a$ and $c$ of $x$ by the constant terms $e$ and $f$, i.e., the first column is replaced by the column vector with the constant terms. Similarly, for $y^*$ the determinant in the numerator is the determinant of the matrix obtained by replacing in the matrix of coefficients the coefficients $b$ and $d$ of $y$ by the constant terms $e$ and $f$, i.e., the second column is replaced by the column vector with the constant terms.
The method works only if the determinant in the denominator is not zero.
In our example,
$$x^* = \frac{\begin{vmatrix} 50 & 7 \\ -18 & -6 \end{vmatrix}}{\begin{vmatrix} 5 & 7 \\ 4 & -6 \end{vmatrix}} = \frac{50 \times (-6) - (-18) \times 7}{5 \times (-6) - 4 \times 7} = \frac{-300 + 126}{-30 - 28} = \frac{-174}{-58} = 3$$
$$y^* = \frac{\begin{vmatrix} 5 & 50 \\ 4 & -18 \end{vmatrix}}{\begin{vmatrix} 5 & 7 \\ 4 & -6 \end{vmatrix}} = \frac{5 \times (-18) - 4 \times 50}{5 \times (-6) - 4 \times 7} = \frac{-90 - 200}{-58} = \frac{-290}{-58} = 5$$
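Cramer's rule for a $2 \times 2$ system translates almost literally into code. The following Python sketch uses exact fractions to avoid rounding; the function names `det2` and `cramer` are illustrative and not taken from the lecture.

```python
from fractions import Fraction

def det2(a, b, c, d):
    """Determinant of the 2x2 matrix [[a, b], [c, d]]."""
    return a * d - c * b

def cramer(a, b, c, d, e, f):
    """Solve ax + by = e, cx + dy = f by Cramer's rule (exact arithmetic)."""
    denominator = det2(a, b, c, d)
    if denominator == 0:
        raise ValueError("determinant is zero: Cramer's rule does not apply")
    x = Fraction(det2(e, b, f, d), denominator)  # first column replaced by (e, f)
    y = Fraction(det2(a, e, c, f), denominator)  # second column replaced by (e, f)
    return x, y

# System (12): 5x + 7y = 50, 4x - 6y = -18.
print(cramer(5, 7, 4, -6, 50, -18))  # (Fraction(3, 1), Fraction(5, 1))
```

The explicit check for a zero determinant reflects the remark that the method only works when the denominator is not zero.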
Exercise 24 Use the above methods to find the optimum in Example 23, i.e., solve the system (11).
7.1.5 Existence and uniqueness.
A linear simultaneous system of equations
$$ax + by = e$$
$$cx + dy = f$$
can have zero, exactly one or infinitely many solutions. To see that only these cases can arise, bring both equations into slope-intercept form[3]
$$y = \frac{e}{b} - \frac{a}{b}x$$
$$y = \frac{f}{d} - \frac{c}{d}x$$
If the slopes of these two linear functions are the same, they describe identical or parallel lines. The slopes are identical when $\frac{a}{b} = \frac{c}{d}$, i.e., when the determinant of the matrix of coefficients $ad - cb$ is zero. If in addition the intercepts are equal, both equations describe the same line. In this case all points $(x, y)$ on the line are solutions. If the intercepts are different the two equations describe parallel lines. These do not intersect and hence there is no solution.
Example 25
$$x + 2y = 3$$
$$2x + 4y = 4$$
has no solution: The two lines
$$y = \frac{3}{2} - \frac{1}{2}x$$
$$y = 1 - \frac{1}{2}x$$
are parallel
[Figure: the two parallel lines $y = \frac{3}{2} - \frac{1}{2}x$ and $y = 1 - \frac{1}{2}x$.]
There is no common solution. Trying to find one yields a contradiction
$$\frac{3}{2} - \frac{1}{2}x = y = 1 - \frac{1}{2}x \qquad | + \frac{1}{2}x$$
$$\frac{3}{2} = 1$$
[3] We assume here for simplicity that the coefficients $b$ and $d$ are not zero. The arguments can be easily extended to these cases, except when all four coefficients are zero. The latter case is trivial: either there is no solution or all pairs of numbers $(x, y)$ are solutions.
7.2 One equation non-linear, one linear
Consider, for instance,
$$y^2 + x - 1 = 0$$
$$y + \frac{1}{2}x = 1$$
In this case we solve the linear equation for one of the variables and substitute the result into the non-linear equation. As a result we obtain one non-linear equation in a single unknown. This makes the problem simpler, but admittedly some luck is needed to solve the non-linear equation in one variable. Solving the linear equation for $x$ can sometimes give a simpler non-linear equation than solving for $y$ and vice versa. One may have to try both possibilities. In our example it is convenient to solve for $x$:
$$\frac{1}{2}x = 1 - y$$
$$x = 2 - 2y$$
$$y^2 + (2 - 2y) - 1 = y^2 - 2y + 1 = (y - 1)^2 = 0$$
So the unique solution is $y^* = 1$ and $x^* = 2 - 2 \times 1 = 0$.
7.3 Two non-linear equations
There is no general method. One needs to memorize some tricks which work in special cases.
In the example of profit-maximization with a Cobb-Douglas production function we were led to the system
$$\frac{1}{6} K^{-5/6} L^{1/2} = \frac{1}{12}$$
$$\frac{1}{2} K^{1/6} L^{-1/2} = \frac{3}{12}.$$
This is a simultaneous system of two equations in two unknowns which looks hard to solve. However, as we have seen, division of the two equations shows that $L = K$ must hold in a critical point.
We can substitute this into, say, the first equation and obtain
$$\frac{1}{12} = \frac{1}{6} K^{-5/6} L^{1/2} = \frac{1}{6} K^{-5/6} K^{1/2} = \frac{1}{6} K^{-5/6 + 3/6} = \frac{1}{6} K^{-2/6} = \frac{1}{6} K^{-1/3}$$
$$K^{-1/3} = \frac{1}{2}$$
$$\frac{1}{\sqrt[3]{K}} = \frac{1}{2}$$
$$\sqrt[3]{K} = 2$$
$$K = 2^3 = 8$$
Thus the solution to the first order conditions (and, in fact, the optimum) is $K^* = L^* = 8$.
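As the text recommends for linear systems, the result can be checked against the original equations. The sketch below plugs $K = L = 8$ into the two first order conditions; the function names are illustrative assumptions.

```python
def mp_capital(K, L):
    """Marginal product of capital for Q = K^(1/6) L^(1/2)."""
    return (1 / 6) * K ** (-5 / 6) * L ** (1 / 2)

def mp_labour(K, L):
    """Marginal product of labour for Q = K^(1/6) L^(1/2)."""
    return (1 / 2) * K ** (1 / 6) * L ** (-1 / 2)

P, r, w = 12, 1, 3
K_star = L_star = 8

# Conditions (6) and (7): marginal product = input price / output price.
print(mp_capital(K_star, L_star), r / P)  # both approximately 1/12
print(mp_labour(K_star, L_star), w / P)   # both approximately 1/4
```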
Remark 26 This method works with any Cobb-Douglas production function $Q(K, L) = K^a L^b$ where the indices $a$ and $b$ are positive and sum to less than 1.
Remark 27 The first order conditions (9) and (10) form a “hidden” linear system of equations. Namely, if you take the logarithms (introduced later!) you get the system
$$\ln\frac{1}{6} - \frac{5}{6}\ln K + \frac{1}{2}\ln L = \ln\frac{1}{12}$$
$$\ln\frac{1}{2} + \frac{1}{6}\ln K - \frac{1}{2}\ln L = \ln\frac{3}{12}$$
which is linear in $\ln K$ and $\ln L$. One can solve this system for $\ln K$ and $\ln L$, which then gives us the values for $L$ and $K$.
8 Young's theorem
Example:
$$z = f(x, y) = x^3 + 3x^2 y^2 + y^4$$
This is a function in two variables. When we calculate the two partial derivatives we obtain two new functions in two variables:
$$\frac{\partial z}{\partial x} = 3x^2 + 6xy^2$$
$$\frac{\partial z}{\partial y} = 6x^2 y + 4y^3$$
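Each of these two functions can again be partially differentiated in the two directions. So there are four second partial derivatives:
$$\frac{\partial^2 z}{\partial x^2} = \frac{\partial}{\partial x}\left(\frac{\partial z}{\partial x}\right) = \frac{\partial}{\partial x}\left(3x^2 + 6xy^2\right) = 6x + 6y^2$$
$$\frac{\partial^2 z}{\partial y \partial x} = \frac{\partial}{\partial y}\left(\frac{\partial z}{\partial x}\right) = \frac{\partial}{\partial y}\left(3x^2 + 6xy^2\right) = 12xy$$
$$\frac{\partial^2 z}{\partial x \partial y} = \frac{\partial}{\partial x}\left(\frac{\partial z}{\partial y}\right) = \frac{\partial}{\partial x}\left(6x^2 y + 4y^3\right) = 12xy$$
$$\frac{\partial^2 z}{\partial y^2} = \frac{\partial}{\partial y}\left(\frac{\partial z}{\partial y}\right) = \frac{\partial}{\partial y}\left(6x^2 y + 4y^3\right) = 6x^2 + 12y^2$$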
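We observe that out of these four new functions two are the same, namely the cross derivatives $\frac{\partial^2 z}{\partial y \partial x}$ and $\frac{\partial^2 z}{\partial x \partial y}$. Thus it does not matter whether we first take the derivative with respect to $x$ and then with respect to $y$ or vice versa. Young's theorem states that the two cross derivatives $\frac{\partial^2 z}{\partial y \partial x}$ and $\frac{\partial^2 z}{\partial x \partial y}$ are always the same, provided some mild conditions are satisfied.[4]
The four second derivatives are conveniently arranged in the form of a $2 \times 2$-matrix called the Hessian matrix. In our example:
$$H = \begin{bmatrix} \frac{\partial^2 z}{\partial x^2} & \frac{\partial^2 z}{\partial y \partial x} \\ \frac{\partial^2 z}{\partial x \partial y} & \frac{\partial^2 z}{\partial y^2} \end{bmatrix} = \begin{bmatrix} 6x + 6y^2 & 12xy \\ 12xy & 6x^2 + 12y^2 \end{bmatrix}$$
[4] The second derivatives must exist and must be continuous functions.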
9 Second order conditions
A critical point of a function z = f (x, y) can be a peak, a trough or a saddle point:
[Figure: three surface plots: $z = -x^2 - y^2$ (peak), $z = x^2 - y^2$ (saddle), $z = x^2 + y^2$ (trough).]
As in the case of a function in one variable, the second derivatives can be used to decide which of the three cases occurs. However, the criterion that separates saddle points from peaks and troughs may come as a surprise.
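Theorem 28 Suppose that the function $z = f(x, y)$ has a critical point at $(x^*, y^*)$. If the determinant of the Hessian
$$\det H = \begin{vmatrix} \frac{\partial^2 z}{\partial x^2} & \frac{\partial^2 z}{\partial y \partial x} \\ \frac{\partial^2 z}{\partial x \partial y} & \frac{\partial^2 z}{\partial y^2} \end{vmatrix} = \frac{\partial^2 z}{\partial x^2}\frac{\partial^2 z}{\partial y^2} - \frac{\partial^2 z}{\partial x \partial y}\frac{\partial^2 z}{\partial y \partial x} = \frac{\partial^2 z}{\partial x^2}\frac{\partial^2 z}{\partial y^2} - \left(\frac{\partial^2 z}{\partial x \partial y}\right)^2$$
is negative at $(x^*, y^*)$ then $(x^*, y^*)$ is a saddle point.
If this determinant is positive at $(x^*, y^*)$ then $(x^*, y^*)$ is a peak or a trough. In this case the signs of $\left.\frac{\partial^2 z}{\partial x^2}\right|_{(x^*, y^*)}$ and $\left.\frac{\partial^2 z}{\partial y^2}\right|_{(x^*, y^*)}$ are the same. If both signs are positive, then $(x^*, y^*)$ is a trough. If both signs are negative, then $(x^*, y^*)$ is a peak.
Nothing can be said if the determinant is zero.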
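$$\det H \begin{cases} < 0: & \text{saddle point} \\ > 0: & \begin{cases} \frac{\partial^2 z}{\partial x^2} > 0: & \text{trough} \\ \frac{\partial^2 z}{\partial x^2} < 0: & \text{peak} \end{cases} \end{cases}$$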
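The decision scheme of Theorem 28 can be written as a small function. This is a sketch under the assumption that the second derivatives have already been evaluated at a critical point; the function name and the returned labels are illustrative.

```python
def classify_critical_point(z_xx, z_yy, z_xy):
    """Apply the second order test of Theorem 28 to the second derivatives
    evaluated at a critical point."""
    det_h = z_xx * z_yy - z_xy**2
    if det_h < 0:
        return "saddle point"
    if det_h > 0:
        # Here z_xx and z_yy necessarily have the same sign.
        return "trough" if z_xx > 0 else "peak"
    return "inconclusive"  # nothing can be said if the determinant is zero

print(classify_critical_point(-2, -2, 0))  # z = -x^2 - y^2 at (0, 0): peak
print(classify_critical_point(2, -2, 0))   # z = x^2 - y^2 at (0, 0): saddle point
print(classify_critical_point(2, 2, 0))    # z = x^2 + y^2 at (0, 0): trough
```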
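Remark 29 If $\frac{\partial^2 z}{\partial x^2}$ is positive and $\frac{\partial^2 z}{\partial y^2}$ is negative or vice versa then the determinant is automatically negative and hence one has a saddle point. This is so because $\frac{\partial^2 z}{\partial x^2}\frac{\partial^2 z}{\partial y^2}$ is negative and therefore also $\frac{\partial^2 z}{\partial x^2}\frac{\partial^2 z}{\partial y^2} - \left(\frac{\partial^2 z}{\partial x \partial y}\right)^2$.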
Remark 30 When the determinant is zero more complicated saddle points are possible. For instance, $z = yx^2 - y^3$ has a critical point at $(0, 0)$ where all second derivatives are zero. Its graph is called a “monkey saddle”.
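Example 31 $z = f(x, y) = x^2 - y^2$. The partial derivatives are $\frac{\partial z}{\partial x} = 2x$ and $\frac{\partial z}{\partial y} = -2y$. Clearly, $(0, 0)$ is the only critical point. The determinant of the Hessian is
$$\det H = \begin{vmatrix} \frac{\partial^2 z}{\partial x^2} & \frac{\partial^2 z}{\partial y \partial x} \\ \frac{\partial^2 z}{\partial x \partial y} & \frac{\partial^2 z}{\partial y^2} \end{vmatrix} = \begin{vmatrix} 2 & 0 \\ 0 & -2 \end{vmatrix} = (2)(-2) - 0 \times 0 = -4 < 0$$
Hence the function has a saddle point at $(0, 0)$.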
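Example 32 $z = -\frac{1}{10}x^2 + \frac{101}{100}yx - \frac{1}{10}y^2$ has the two partial derivatives $\frac{\partial z}{\partial x} = -\frac{1}{5}x + \frac{101}{100}y$ and $\frac{\partial z}{\partial y} = \frac{101}{100}x - \frac{1}{5}y$. It is not difficult to show that $(0, 0)$ is the only critical point. The determinant of the Hessian is
$$\det H = \begin{vmatrix} \frac{\partial^2 z}{\partial x^2} & \frac{\partial^2 z}{\partial y \partial x} \\ \frac{\partial^2 z}{\partial x \partial y} & \frac{\partial^2 z}{\partial y^2} \end{vmatrix} = \begin{vmatrix} -\frac{1}{5} & \frac{101}{100} \\ \frac{101}{100} & -\frac{1}{5} \end{vmatrix} = \left(-\frac{1}{5}\right)\left(-\frac{1}{5}\right) - \left(\frac{101}{100}\right)^2 = -\frac{9801}{10\,000} < 0$$
Hence the function has a saddle point at $(0, 0)$ and not a maximum, although both $\frac{\partial^2 z}{\partial x^2}$ and $\frac{\partial^2 z}{\partial y^2}$ are negative. The graph of the function and the level curves are as follows: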
[Figure: graph and level curves of $z = -\frac{1}{10}x^2 + \frac{101}{100}yx - \frac{1}{10}y^2$.]
Exercise 33 Show that the critical point $(0, 0)$ in Example 1 is a peak.
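Example 34 We have shown in the handouts of the previous weeks that the profit function
$$\Pi(K, L) = 12 K^{1/6} L^{1/2} - K - 3L$$
has a critical point at $K^* = L^* = 8$. Is it a relative maximum? We have
$$\frac{\partial \Pi}{\partial K} = 2 K^{-5/6} L^{1/2} - 1, \qquad \frac{\partial \Pi}{\partial L} = 6 K^{1/6} L^{-1/2} - 3$$
and hence
$$H = \begin{bmatrix} \frac{\partial^2 \Pi}{\partial K^2} & \frac{\partial^2 \Pi}{\partial L \partial K} \\ \frac{\partial^2 \Pi}{\partial K \partial L} & \frac{\partial^2 \Pi}{\partial L^2} \end{bmatrix} = \begin{bmatrix} -\frac{5}{3} K^{-11/6} L^{1/2} & K^{-5/6} L^{-1/2} \\ K^{-5/6} L^{-1/2} & -3 K^{1/6} L^{-3/2} \end{bmatrix}$$
For $K = L = 8$ we obtain (since $8^{-8/3} = 1/(\sqrt[3]{8})^8 = 1/2^8$)
$$\det H|_{K=8,\,L=8} = \begin{vmatrix} -\frac{5}{3} \times 8^{-11/6 + 1/2} & 8^{-5/6 - 1/2} \\ 8^{-5/6 - 1/2} & -3 \times 8^{1/6 - 3/2} \end{vmatrix} = \begin{vmatrix} -\frac{5}{3} \times 8^{-4/3} & 8^{-4/3} \\ 8^{-4/3} & -3 \times 8^{-4/3} \end{vmatrix}$$
$$= \left(-\frac{5}{3} \times 8^{-4/3}\right)\left(-3 \times 8^{-4/3}\right) - \left(8^{-4/3}\right)^2 = 5 \times \left(8^{-4/3}\right)^2 - \left(8^{-4/3}\right)^2 = 4 \times \left(8^{-4/3}\right)^2$$
$$= 4 \times 8^{-8/3} = \frac{4}{(\sqrt[3]{8})^8} = \frac{2^2}{2^8} = \frac{1}{2^6} = \frac{1}{64} > 0$$
Thus the determinant is positive. Since $\left.\frac{\partial^2 \Pi}{\partial K^2}\right|_{K=8,\,L=8} < 0$ we can conclude that $K^* = L^* = 8$ is a peak.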
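The Hessian determinant of the profit function at $K = L = 8$ can be recomputed numerically from the second derivatives of $\Pi(K, L) = 12 K^{1/6} L^{1/2} - K - 3L$. A minimal sketch; the function and variable names are illustrative assumptions.

```python
def hessian_profit(K, L):
    """Second derivatives of Pi(K, L) = 12 K^(1/6) L^(1/2) - K - 3L."""
    pi_KK = -(5 / 3) * K ** (-11 / 6) * L ** (1 / 2)
    pi_KL = K ** (-5 / 6) * L ** (-1 / 2)
    pi_LL = -3 * K ** (1 / 6) * L ** (-3 / 2)
    return pi_KK, pi_KL, pi_LL

pi_KK, pi_KL, pi_LL = hessian_profit(8, 8)
det_h = pi_KK * pi_LL - pi_KL**2

print(det_h, 1 / 64)  # the two numbers agree up to rounding
print(pi_KK < 0)      # the second derivative with respect to K is negative
```

A positive determinant together with a negative second derivative with respect to $K$ is the combination that Theorem 28 associates with a peak, which fits a profit maximum.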
10 Absolute maxima or minima
The role of intervals for functions in one variable is taken over by convex sets when considering functions in two variables. A region of the $(x, y)$-plane is called convex if it contains with every two points $(x_1, y_1)$ and $(x_2, y_2)$ in the region the line segment connecting these two points. (Notice: There are concave and convex functions and there are concave and convex lenses, but there is no such thing as a “concave set”.)
[Figure: two convex sets and two non-convex sets.]
The $(x, y)$-plane and the set of all non-negative coordinate pairs $(x, y)$ are also convex sets.
An absolute maximum of a function $f(x, y)$ in a region is a point $(x^*, y^*)$ in the region such that $f(x^*, y^*) \ge f(x, y)$ holds with respect to all other points in the region.
Theorem 35 Suppose the function $f(x, y)$ is defined on a convex set and has there a unique critical point $(x^*, y^*)$. If the determinant of the Hessian at this point is positive and if $\frac{\partial^2 f}{\partial x^2}$ is negative (positive) then $(x^*, y^*)$ is an absolute maximum (minimum) on the convex set.
The conditions are satisfied in many examples we have discussed, for instance in the profit-maximization problem above.
Remark 36 Suppose the function $f(x, y)$ is defined on a convex set. Suppose the determinant of the Hessian is always positive in the region and that $\frac{\partial^2 f}{\partial x^2}$ is always positive (respectively, negative). Then the function is convex (respectively, concave).
Hereby the function is convex (concave) if the area above (below) the graph of the function is convex.