Chapter 10: Differential Calculus and Nonlinear Comparative Statics

Differential calculus is a core tool for optimization and comparative statics. This chapter applies the concepts of differential calculus to multivariate problems.
10.1 Limits and Continuity

Recall from our discussion of univariate calculus that an infinite sequence has a limit L iff almost all of its points are arbitrarily close to L. Recall as well that we say that a function f has a limit L at point x0 if f[x] approaches L as x approaches x0. These basic concepts are unchanged, but in a multivariate setting there is a complication. Our discussion of univariate function limits involved one-sided limits, but in a multivariate setting we can approach x0 from many directions, not just two. Nevertheless, the core idea behind continuity remains the same: a function is continuous at x0 if it does not have a hole or a jump at x0. This means that the function limit is the same no matter which direction (in the domain) we come from, and additionally equals the value of the function at x0.
f = Function[{x, y}, If[x y < 0, 1/3, 2/3]];
For this function, we generally find that if we set a value of y and then let x approach 0, the answer
depends on the sign of y, and we get a different answer if x is coming from the left or from the right.
And by symmetry, we encounter the same issues if we fix a value of x and then let y approach zero.
Limit[f[x, y], x → 0, Direction → -1]
1/3   y < 0
2/3   True
Limit[f[x, y], x → 0, Direction → 1]
1/3   y > 0
2/3   True
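For readers following along outside Mathematica, here is a minimal numeric sketch of the same direction-dependent behavior in plain Python (an aside, not part of the WL workflow): fixing y < 0 and approaching x = 0 from the right keeps x y < 0, while approaching from the left does not.

```python
def f(x, y):
    # 1/3 where x*y < 0, else 2/3 -- the same piecewise function as in the text
    return 1/3 if x*y < 0 else 2/3

# Fix y = -1. From the right (x > 0), x*y < 0, so the values approach 1/3;
# from the left (x < 0), x*y > 0, so the values approach 2/3.
from_right = [f(10**-k, -1) for k in range(1, 6)]
from_left = [f(-10**-k, -1) for k in range(1, 6)]
print(from_right[-1], from_left[-1])
```

The two one-sided sequences settle on different values, so the limit at x = 0 does not exist for this fixed y.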
Next we turn to a bivariate example of a discontinuous function whose function limits are well defined.
As in the univariate case, the existence of a function limit is not sufficient for continuity. As an example
of a function that has both holes and limits, consider the following.
f = Function[{x, y}, Sin[x + y]/(x + y)];
This function is undefined when x + y ⩵ 0. This shows up as a missing ridge in the function graph.
Plot3D[f[x, y], {x, -π, π}, {y, -π, π}, ColorFunction → "GrayYellowTones"]
Pick (0, 0) as an example point of discontinuity. No matter how we approach (0, 0) through the domain
of f , we always approach a function value of 1. For example,
Limit[f[x, y] /. {y → m x}, x → 0]
1
Note that as users we are again expected to be alert to the fact that this result does not apply if m = -1.
Mathematica gives us the generally useful result, which applies for all other values of m.
Finally, let us look at a function of two variables that is undefined at a single point: the origin.
f = Function[{x, y}, x y/(x^2 + y^2)];
Although this function behaves smoothly everywhere else, it does not have a limit at the origin. Here is
an easy way to see that. Note that, except at (0, 0), the value of the function is 1 /2 whenever x ⩵ y, yet
the value of the function is -1 /2 whenever x ⩵ -y. So we can easily find two directions of approach to
the origin that yield sequences in the range that approach different limits.
This is enough information to show that the function does not have a limit at the origin. But these two
cases are simply special cases of collinearity between x and y as we approach the origin. More gener-
ally we can observe that along any line through the origin the function has a constant value.
f[x, y] /. {y → m x} // Simplify
m/(1 + m^2)
This means that the limit of the function values varies continuously as we vary the line along which we approach the origin. The two values initially considered were just the largest and the smallest of these varied limits.
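A parallel symbolic check with SymPy (an aside, not part of the WL workflow): substituting y = m x reduces the function to the constant m/(1 + m^2), so different lines through the origin produce different limiting values.

```python
import sympy as sp

x, m = sp.symbols('x m', real=True)

# f(x, y) = x*y/(x^2 + y^2), restricted to the line y = m*x
f_on_line = x*(m*x) / (x**2 + (m*x)**2)
along_line = sp.simplify(f_on_line)   # constant in x: m/(m**2 + 1)
print(along_line)
# Two different lines give two different limits, so no limit exists at the origin.
print(along_line.subs(m, 1), along_line.subs(m, -1))
```

Since the value along the line is constant in x, the limit along each line is just that constant, and it ranges between -1/2 and 1/2.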
Plot[m/(1 + m^2), {m, -10, 10}]

[Plot of m/(1 + m^2) for m from -10 to 10; the values range between -1/2 and 1/2.]
10.2 Multivariate Differential Calculus

A continuous multivariate function may additionally be differentiable. We begin with the notion of a partial derivative.
Partial Derivatives
A first-order partial derivative is just the ordinary derivative of a function with respect to one of its argu-
ments, when all the other arguments are constant. In other words, when we fix all the arguments of the
function but one, we are left with an ordinary univariate function. The derivative of this univariate func-
tion is a partial derivative of the original function.
To illustrate, suppose f is a real-valued function of two real variables. We can find its difference quotient with respect to either the first argument or the second. For example,
Clear[f, x, y, h]
DifferenceQuotient[f[x, y], {x, h}]
(-f[x, y] + f[h + x, y])/h
Once we have the difference quotient, we can find the derivative we seek as before: by shrinking the step size to 0. Here we illustrate this approach to producing partial derivatives with an arbitrary bivariate function.
Expression                                  Result
f = Function[{x, y}, x^2 y^2]               Function[{x, y}, x^2 y^2]
dqx = DifferenceQuotient[f[x,y], {x, h}]    h y^2 + 2 x y^2
Limit[dqx, h → 0]                           2 x y^2
dqy = DifferenceQuotient[f[x,y], {y, h}]    h x^2 + 2 x^2 y
Limit[dqy, h → 0]                           2 x^2 y
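The same difference-quotient construction can be replayed in SymPy (an aside, not part of the WL workflow): form the quotient by hand, then let the step size go to zero.

```python
import sympy as sp

x, y, h = sp.symbols('x y h')
f = x**2 * y**2

# Difference quotient in x, then shrink the step size to zero.
dqx = ((f.subs(x, x + h) - f) / h).expand()   # h*y**2 + 2*x*y**2
fx = sp.limit(dqx, h, 0)                      # partial derivative in x
# Same construction in y.
dqy = ((f.subs(y, y + h) - f) / h).expand()   # h*x**2 + 2*x**2*y
fy = sp.limit(dqy, h, 0)                      # partial derivative in y
print(fx, fy)
```

The limits reproduce the table's results, 2 x y^2 and 2 x^2 y.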
The `Derivative` command can produce partial derivatives of ordinary functions. We need to specify the order of differentiation with respect to each of the slots of the function. The partial derivatives of functions that are not yet defined use a slightly unusual notation, but it is unambiguous once one is accustomed to it. As a superscript to the function, we see a tuple that gives the order of differentiation with respect to each of the function arguments.
Expression           Result
Derivative[1,0][f]   f(1,0)
Derivative[0,1][f]   f(0,1)
The notation is a bit unusual, but it is nicely unambiguous (for any continuously differentiable function).
If the function is known, the `Derivative` command produces a new function representing the partial
derivative.
Expression                      Result
f = Function[{x, y}, x^2 y^2]   Function[{x, y}, x^2 y^2]
Derivative[1,0][f]              Function[{x, y}, 2 x y^2]
Derivative[0,1][f]              Function[{x, y}, 2 x^2 y]
To partially differentiate a multivariate expression, one can request a single partial derivative with the
`D` command, or one can request a list of all the partial derivatives at once with the `Grad` command.
This list represents a vector of partial derivatives, known as the gradient of the function.
Clear[f]
Grad[f[x, y], {x, y}]
{f(1,0)[x, y], f(0,1)[x, y]}
The output of the `Grad` command is a list of the first-order partial derivatives, given in the same order as the provided variables. Here is an example with an arbitrary bivariate function.
Expression                      Result
f = Function[{x, y}, x^2 y^2]   Function[{x, y}, x^2 y^2]
D[f[x,y], x]                    2 x y^2
D[f[x,y], y]                    2 x^2 y
Grad[f[x,y], {x, y}]            {2 x y^2, 2 x^2 y}
There is also a nabla shorthand for `Grad` that we will occasionally use. The variables of differentiation are provided as a list that subscripts a nabla, which is entered as `del`.
∇{x,y} f[x, y]
{2 x y^2, 2 x^2 y}
Visualizing the Partial Derivatives
In order to plot a function of two variables, we will need three dimensions: two dimensions for the
argument values, and one dimension for the resulting function value. Recall that the `Plot3D` command
nicely fits the bill for this need. To make it easier to keep track of which axis represents which value, we
will use the `AxesLabel` option.
g1 = Plot3D[x^2 + y^2, {x, -1, 1}, {y, -1, 1},
AxesLabel → {"x", "y", None}, ImageSize → 288]
Each grid line represents a univariate function. If we hold one variable constant while varying the other,
we can produce two-dimensional plots. Here we elaborate on this observation by slicing our surface
plot with a plane at y = 0. This helps us visualize the univariate function that results when we fix y = 0
but allow x to vary. The resulting function graph is the intersection between our surface and the plane.
g2 = Graphics3D[{InfinitePlane[{{-1, 0, 1}, {0, 0, 0}, {1, 0, 1}}]}];
Show[{g1, g2}]
Recall that setting y = 0 this way is called partial function application. After partially applying our bivari-
ate function, we end up with a univariate function. Naturally, we can create a two-dimensional plot of
this univariate function. The slopes we observe in such a plot are derivatives of the partially applied
function, but they are partial derivatives of our multivariate function.
[Figure 33: Univariate Slopes are Partial Derivatives — two panels showing z plotted against x (with y fixed at 0) and z plotted against y (with x fixed at 0).]
Since we can think of the derivative of a univariate function as the slope of a tangent line to the graph of
the univariate function, we can correspondingly think of a partial derivative as the slope of a tangent line
to the graph of a partially applied multivariate function.
Multivariate Differential: Visualizing the Gradient
If we have a univariate function f, the derivative f'[x] tells us approximately how much the function will increase given a unit increase in x, so f'[x] dx is the approximate change in the value of f given a change in x in the amount dx. We exploited this observation to develop the notion of the differential of a univariate function.
If we have a bivariate function f, the partial derivatives ∇{x,y} f[x, y] correspondingly tell us approximately how much f will increase given a unit change in x or a unit change in y, while we hold the other variable constant. If both variables change by amounts (dx, dy), then we correspondingly approximate the change in the value of f by the dot product of the gradient and the change vector: df = ∇{x,y} f[x, y].(dx, dy). This is the differential of our bivariate function. Expressing this with WL, we have
Clear[f, x, y, dx, dy]
df = Dot[∇{x,y}f[x, y], {dx, dy}]
dy f(0,1)[x, y] + dx f(1,0)[x, y]
As a concrete example, consider the following function.
f = Function[{x, y}, x^2 + y^2];
df
2 dx x + 2 dy y
The gradient gives the direction of most rapid increase in the function value. To see that, let f be a real-
valued function of a real-vector input, and define g[t] = f [x + t v]. Assuming g is differentiable at 0, the
rate of increase of g is g '[0] = ∇x f .v, which is the dot product of the gradient and the vector v. This is
called the directional derivative of f in the direction v (at the point x). That is, a directional derivative is a
linear combination of partial derivatives. For example,
Clear[f]
D[f[x1 + t v1, x2 + t v2], t] /. {t → 0}
v2 f(0,1)[x1, x2] + v1 f(1,0)[x1, x2]
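A concrete SymPy check of this identity (an aside, not part of the WL workflow), using x^2 y^2 as a stand-in for an arbitrary smooth function: differentiating g[t] = f[x + t v] at t = 0 gives exactly the gradient dotted with v.

```python
import sympy as sp

x, y, t, vx, vy = sp.symbols('x y t vx vy')
f = lambda a, b: a**2 * b**2   # stand-in for an arbitrary smooth function

# g[t] = f[x + t v]; the directional derivative is g'[0].
directional = sp.diff(f(x + t*vx, y + t*vy), t).subs(t, 0)

# The gradient of f, dotted with the direction vector v.
grad = [sp.diff(f(x, y), x), sp.diff(f(x, y), y)]
dot = grad[0]*vx + grad[1]*vy

print(sp.simplify(directional - dot))  # 0: the two expressions agree
```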
To get rid of the influence of length, let v be a unit vector. This allows us to again ask, how much will a unit step in a particular direction increase the value of f? Recall from our earlier discussion of angles that to make this dot product as large as possible, we need v to be codirectional with the gradient.
A gradient of a function is perpendicular to the associated curve in the level set. In this section we will
illustrate this. First we illustrate the gradient.
For this illustration, we will use a Cobb-Douglas production function. The traditional representation is
Y = A Kα N1-α. For concreteness, set A = 10 and α = 0.3. Recall that user-defined symbols should start
with lower-case letters in WL. So let us simply use minuscule letters in our representation.
Clear[k, dk, n, dn, y]
y = Function[{k, n}, Evaluate[10 * k^α * n^(1 - α) /. {α → 0.3}]]
Function[{k, n}, 10 k^0.3 n^0.7]
Use what you know about `Plot3D` to make a nice surface plot of our Cobb-Douglas function. (Name it `g1`.)
We will add a direction vector and an isoquant to this plot. Let us arbitrarily choose an initial point (k0, n0) for our illustration. Letting y[k, n] be our Cobb-Douglas function, we evaluate it at this point. This gives us a triple (k0, n0, y0), which describes a three-dimensional point.
{k0, n0} = {5, 5};
y0 = y[k0, n0];
pt0 = {k0, n0, y0}
{5, 5, 50.}
The partial derivative of a production function with respect to the capital input is called the marginal
product of capital. The partial derivative of a production function with respect to the labor input is called
the marginal product of labor. We can produce these both at one go with the `Grad` command.
grad = Grad[y[k, n], {k, n}]
{3. n^0.7/k^0.7, 7. k^0.3/n^0.3}
The gradient gives us the direction of steepest increase. Evaluate this gradient at this initial point,
(k0, n0). Use this to produce a new point by moving in the direction of steepest increase from our initial
point.
{dk, dn} = grad /. {k → k0, n → n0};
{k1, n1} = {k0, n0} + {dk, dn};
y1 = y[k1, n1];
pt1 = {k1, n1, y1}
{8., 12., 106.256}
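A quick numeric replay of this gradient step in Python (an aside, not part of the WL workflow), with A = 10 and α = 0.3 as in the text:

```python
def y(k, n):
    # Cobb-Douglas production function with A = 10, alpha = 0.3
    return 10 * k**0.3 * n**0.7

def grad_y(k, n):
    # (dy/dk, dy/dn) = (3 n^0.7 / k^0.7, 7 k^0.3 / n^0.3)
    return (3 * n**0.7 / k**0.7, 7 * k**0.3 / n**0.3)

k0, n0 = 5.0, 5.0
dk, dn = grad_y(k0, n0)       # (3.0, 7.0) at (5, 5)
k1, n1 = k0 + dk, n0 + dn     # (8.0, 12.0)
print(y(k0, n0), y(k1, n1))   # 50.0 and about 106.256
```

Stepping in the gradient direction from (5, 5) reproduces the new point (8, 12) and the higher output level seen above.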
The `Graphics3D` command provides for drawing three-dimensional arrows. Here we draw an arrow
from our initial point to our new point, calling this drawing `g2`. We then use the `Show` command to
add it to our surface plot. (Note how the combined plot uses the options set in the first plot, since it is
the first argument to `Show`.)
g2 = Graphics3D[{
Arrowheads[0.02, Appearance → "Projected"], Thick, Arrow[{pt0, pt1}]
}];
Show[g1, g2]
An isoquant comprises capital-labor pairs that give a fixed value of output. We will find it informative to
add an isoquant to our drawing. For any fixed value of y, we can solve for k as a function of n. Here we
simply solve for the function by hand, since it is so simple.
kny = Function[{n, y}, (y/(10 n^0.7))^(1/0.3)];
Using this function to draw the isoquant corresponding to our initial inputs, we get
g3 = ParametricPlot3D[{kny[n, y0], n, y0}, {n, 0, 20}];
Show[{g1, g2, g3}]
It is more common to draw an isoquant as a projection on the input plane. We can use the
`ContourPlot` command to do this automatically for our production function, for any specified level of
production. Here we increase the value of `PlotPoints` in order to produce a smoother isoquant, and we
use the `Epilog` option to add an arrow corresponding to our gradient. In order to highlight the orthogo-
nal relationship between the gradient and the slope of the isoquant, we also use `ParametricPlot` to add
a tangent line to the isoquant.
g1 = ContourPlot[y[k, n] ⩵ y0, {k, 0, 20}, {n, 0, 20},
PlotPoints → 100, Epilog → Arrow[{{k0, n0}, {k1, n1}}]];
tangentSlope = D[kny[n, y0], n] /. {k → k0, n → n0};
g2 = ParametricPlot[{kny[5, y0] + tangentSlope * (n - 5), n}, {n, 0.1, 7},
PlotStyle → {Thin, Dashed}];
Show[{g1, g2}]
10.3 Nonlinear Comparative Statics

A Nonlinear System of Equations
Consider the following as our “structural” equations (without worrying too much about the nature of economic structure).
Y = A(i - π, Y, G)
m = L(i, Y)    (30)
Here Y is total production in the economy, A is the function determining demand for that production, i is
the nominal interest rate, π is the expected inflation rate, G measures the “fiscal stance” (i.e., how
expansionary fiscal policy is), and m is the real money supply.
ClearAll[fA, fL, i, y, m, Π, g] (* all Mma variables should start lower case *)
lmnl = m ⩵ fL[i, y];
isnl = y ⩵ fA[i - Π, y, g];
Produce a textbook “Keynesian” model by taking Y and i to be endogenous, or produce a textbook
“Classical” model by taking m and i to be endogenous.
Total Differentials
We will first consider the money market. Recall that equation (30) described money market equilibrium as m = L(i, Y). This must hold both before and after any exogenous changes. That is, we require that we start out in money market equilibrium, and we also require that we end up in money market equilibrium. It follows that the change in the real money supply (dm) must equal the change in real money demand (dL).
dm = dL    (31)
The change in real money demand has two sources: changes in i and changes in Y. As usual, we will
represent these as d i and d Y. Of course, the change in real money demand depends not only on the
size of the changes in these arguments, but also on how sensitive money demand is to each of these
arguments.
dL = Li di + LY dY    (32)
Putting these two observations together, we get
dm = Li di + LY dY    (33)
We call this the “total differential” of the LM equation. It makes a very simple statement: we start out on an LM curve, and we end up on an LM curve.
dlmnl = Dt[lmnl]
Dt[m] ⩵ Dt[y] fL(0,1)[i, y] + Dt[i] fL(1,0)[i, y]
Let’s make this a bit easier to read (but not to manipulate) by introducing some notation as rules. Note
that this is purely for convenience in reading: we are using strings (not symbols) in the result.
notationRulesLM = {fL(0,1)[i, y] → "Ly", fL(1,0)[i, y] → "Li", Dt[m] → "dm", Dt[i] → "di", Dt[y] → "dy"};
dlmnl /. notationRulesLM
dm ⩵ Li di + Ly dy
Next consider the goods market. Recall that the equation
Y = A(i - π, Y, G)    (34)
represents equilibrium in the goods market. This must hold both before and after any exogenous
changes. That is, we require that we start out in goods market equilibrium, and we also require that we
end up in goods market equilibrium. It follows that the changes in real income must equal the changes
in real aggregate demand. Looking at the equation for the IS curve, we can see that this means that the
change in real income (d Y) must equal the change in real aggregate demand (d A).
dY ⩵ dA (35)
The change in aggregate demand has three sources: changes in r, changes in Y, and changes in G. We represent these changes as dr, dY, and dG. Of course, the changes in aggregate demand depend not only on the size of the changes in these arguments, but also on how sensitive aggregate demand is to each of these arguments.
dA = Ar (di - dπ) + AY dY + AG dG    (36)
Putting these two pieces together, we have the total differential of the IS equation:
dY = Ar (di - dπ) + AY dY + AG dG    (37)
Note that A ( · , · , ·) has only three arguments. Do not be misled by the fact that we choose to write r
as i -π. This does not change the number of arguments of the aggregate demand function. E.g., there
is no derivative Ai.
disnl = Dt[isnl]
Dt[y] ⩵ Dt[g] fA(0,0,1)[i - Π, y, g] + Dt[y] fA(0,1,0)[i - Π, y, g] + (Dt[i] - Dt[Π]) fA(1,0,0)[i - Π, y, g]
notationRulesIS = {fA(0,0,1)[i - Π, y, g] → "AG", fA(0,1,0)[i - Π, y, g] → "Ay", fA(1,0,0)[i - Π, y, g] → "Ar", Dt[g] → "dg", Dt[Π] → "dΠ"};
notationRules = Join[notationRulesLM, notationRulesIS];
disnl /. notationRules
dy ⩵ AG dg + Ay dy + Ar (di - dΠ)
Implicit Function Theorem
The IFT provides the conditions under which we can characterize the partial derivatives of the reduced
form in terms of the partial derivatives of the structural form. That is, we can do qualitative comparative
statics.
Review the IFT using the online notes.
“Keynesian” Model
Let us first consider a textbook Keynesian model. Assuming satisfaction of the assumptions of the
implicit function theorem, there is an implied reduced form for the Keynesian model. The reduced form
expresses the solution for each endogenous variables in terms of the exogenous variables. We will
represent this as
i = i(m, π, G)
Y = Y(m, π, G)    (38)
The implicit function theorem tells us how to find the partial derivatives of i (., .) and Y (., .).
Note how we use the letter i to represent both a variable (on the left) and a function (on the right). This
is common practice among economists and mathematicians, as it helps us keep track of which function
is related to which variable. (However we will not usually be able to do this in a computer algebra
system.) Note that since we did not begin with an explicit functional form for the structural equations we
cannot hope to find an explicit functional form for the reduced form. Instead we rely on qualitative
information about the structural equations to make qualitative statements about the reduced form.
dlmnl
dlmnl /. notationRules
Dt[m] ⩵ Dt[y] fL(0,1)[i, y] + Dt[i] fL(1,0)[i, y]
dm ⩵ Li di + Ly dy
The total differential can be used to find the slope of the LM curve. Suppose we allow only i and Y to change (so that dm = 0). Then we must have
0 = Li di + LY dY
(di/dY)|LM = -LY/Li > 0    (39)
Dt[lmnl] /. {Dt[m] → 0} /. notationRules (* represent restricted total differential *)
Solve[Dt[lmnl] /. {Dt[m] → 0, Dt[y] → 1}, Dt[i]] /. notationRules
0 ⩵ Li di + Ly dy
{{di → -(Ly/Li)}}
This represents the way i and Y must change together to maintain equilibrium in the money market,
ceteris paribus. That is, this determines the slope of the “Keynesian” LM curve. Under the standard
assumptions that LY > 0 and Li < 0, the “Keynesian” LM curve has a positive slope.
Similarly, if we allow only i and Y to change in the goods market, we must have
dY = Ar di + AY dY
(di/dY)|IS = (1 - AY)/Ar < 0    (40)
restrictions = {Dt[m] → 0, Dt[g] → 0, Dt[Π] → 0}
Dt[isnl] /. restrictions /. notationRules (* represent restricted total differential *)
Solve[Dt[isnl] /. restrictions /. {Dt[y] → 1}, Dt[i]] /. notationRules
{Dt[m] → 0, Dt[g] → 0, Dt[Π] → 0}
dy ⩵ Ar di + Ay dy
{{di → (1 - Ay)/Ar}}
This is the way i and Y must change together to maintain equilibrium in the goods market. That is, this
determines the slope of the “Keynesian” IS curve. Under the standard assumptions that 0 < AY < 1 and
Ar < 0, the “Keynesian” IS curve has a negative slope.
Solving the Nonlinear Keynesian Model
So we have seen what is required to stay on the IS curve and what is required to stay on the LM curve.
Putting these together we have
dY = Ar (di - dπ) + AY dY + AG dG
dm = Li di + LY dY    (41)
When we insist that both of these equations hold together, we are insisting that we stay on both the IS and LM curves simultaneously. In this system there are two endogenous variables, di and dY, which are being determined so as to achieve this simultaneous satisfaction of the IS and LM equations.
Now we just solve two linear equations in two unknowns. First prepare to set up the system as a matrix
equation by moving all terms involving the endogenous variables to the left. (Note that this is the first
time we have paid attention to which variables are endogenous.)
-Ar di + (1 - AY) dY = -Ar dπ + AG dG
Li di + LY dY = dm    (42)
Now rewrite this system as a matrix equation in the form J x = b.

[ -Ar   (1 - AY) ] [ di ]   [ -Ar dπ + AG dG ]
[  Li      LY    ] [ dY ] = [       dm       ]    (43)
Then solve for the endogenous variables by multiplying both sides by J^(-1).

[ di ]               1               [  LY   -(1 - AY) ] [ -Ar dπ + AG dG ]
[ dY ]  =  ------------------------  [ -Li      -Ar    ] [       dm       ]
             -Ar LY - (1 - AY) Li

                     1               [ -LY    (1 - AY) ] [ -Ar dπ + AG dG ]
        =  ------------------------  [  Li       Ar    ] [       dm       ]    (44)
              Ar LY + (1 - AY) Li
Letting Δ = Ar LY + (1 - AY) Li, we can write this as

[ di ]    1   [ -LY    (1 - AY) ] [ -Ar dπ + AG dG ]
[ dY ] = ---  [  Li       Ar    ] [       dm       ]    (45)
          Δ
Invoking the standard assumptions on the structural form partial derivatives, listed above, we note that Δ = Ar LY + (1 - AY) Li < 0.
solnK = Solve[dlmnl && disnl, {Dt[i], Dt[y]}];
solnK /. notationRules
{{di → -((-AG Ly dg + dm - Ay dm + Ar Ly dΠ)/(-Li + Ay Li - Ar Ly)), dy → -((-AG Li dg - Ar dm + Ar Li dΠ)/(Li - Ay Li + Ar Ly))}}
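The same linear system can be solved symbolically with SymPy (an aside, not part of the WL workflow), which lets us confirm the reduced-form expressions independently:

```python
import sympy as sp

di, dy, dm, dg, dPi = sp.symbols('di dy dm dg dPi')
Li, Ly, Ar, Ay, AG = sp.symbols('Li Ly Ar Ay AG')

lm = sp.Eq(dm, Li*di + Ly*dy)                    # total differential of LM
is_ = sp.Eq(dy, AG*dg + Ay*dy + Ar*(di - dPi))   # total differential of IS

# Two linear equations in the two endogenous differentials di, dy.
sol = sp.solve([lm, is_], [di, dy])

di_expected = (-AG*Ly*dg + dm - Ay*dm + Ar*Ly*dPi) / (Li - Ay*Li + Ar*Ly)
dy_expected = (AG*Li*dg + Ar*dm - Ar*Li*dPi) / (Li - Ay*Li + Ar*Ly)
print(sp.simplify(sol[di] - di_expected), sp.simplify(sol[dy] - dy_expected))
```

Both differences simplify to zero, matching the `Solve` output above.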
Fiscal policy experiment:

[ ∂i/∂G ]    1   [ -LY    (1 - AY) ] [ AG ]    1   [ -LY AG ]   [ + ]
[ ∂Y/∂G ] = ---  [  Li       Ar    ] [ 0  ] = ---  [  Li AG ] = [ + ]    (46)
             Δ                                 Δ
(* find the partial responses to dg *)
$Assumptions = 1 > fA(0,1,0)[i - Π, y, g] > 0 && fA(0,0,1)[i - Π, y, g] > 0 &&
fA(1,0,0)[i - Π, y, g] < 0 && fL(0,1)[i, y] > 0 && fL(1,0)[i, y] < 0
gpartials = solnK /. {Dt[m] → 0, Dt[Π] → 0, Dt[g] → 1}
gpartials /. notationRules // Simplify
{didg, dydg} = {Dt[i], Dt[y]} /. gpartials[[1]]
Sign[{didg, dydg}] // Simplify
1 > fA(0,1,0)[i - Π, y, g] > 0 && fA(0,0,1)[i - Π, y, g] > 0 && fA(1,0,0)[i - Π, y, g] < 0 && fL(0,1)[i, y] > 0 && fL(1,0)[i, y] < 0
{{Dt[i] → (fL(0,1)[i, y] fA(0,0,1)[i - Π, y, g])/(-fL(1,0)[i, y] + fL(1,0)[i, y] fA(0,1,0)[i - Π, y, g] - fL(0,1)[i, y] fA(1,0,0)[i - Π, y, g]), Dt[y] → (fL(1,0)[i, y] fA(0,0,1)[i - Π, y, g])/(fL(1,0)[i, y] - fL(1,0)[i, y] fA(0,1,0)[i - Π, y, g] + fL(0,1)[i, y] fA(1,0,0)[i - Π, y, g])}}
{{di → -((AG Ly)/(Li - Ay Li + Ar Ly)), dy → (AG Li)/(Li - Ay Li + Ar Ly)}}
{(fL(0,1)[i, y] fA(0,0,1)[i - Π, y, g])/(-fL(1,0)[i, y] + fL(1,0)[i, y] fA(0,1,0)[i - Π, y, g] - fL(0,1)[i, y] fA(1,0,0)[i - Π, y, g]), (fL(1,0)[i, y] fA(0,0,1)[i - Π, y, g])/(fL(1,0)[i, y] - fL(1,0)[i, y] fA(0,1,0)[i - Π, y, g] + fL(0,1)[i, y] fA(1,0,0)[i - Π, y, g])}
{1, 1}
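A numeric spot check of the fiscal experiment in Python (an aside; the particular parameter values are assumptions chosen only to satisfy the standard sign restrictions):

```python
import numpy as np

# Illustrative values (assumptions, not from the text) with
# Ar < 0, 0 < Ay < 1, AG > 0, Li < 0, Ly > 0.
Ar, Ay, AG = -1.0, 0.5, 1.0
Li, Ly = -1.0, 0.5

# Coefficient matrix J and right-hand side for dG = 1, dm = dpi = 0.
J = np.array([[-Ar, 1 - Ay],
              [Li, Ly]])
b = np.array([AG, 0.0])
di_dG, dY_dG = np.linalg.solve(J, b)
print(di_dG > 0, dY_dG > 0)   # expansionary fiscal policy raises both i and Y
```

The signs agree with the qualitative result `{1, 1}` above; any other values obeying the sign restrictions give the same signs.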
Monetary policy experiment:
We know from the implicit function theorem that this is the same as solving for the partial derivatives of the reduced form. Thus we can write

[ ∂i/∂m ]    1   [ -LY    (1 - AY) ] [ 0 ]    1   [ (1 - AY) ]   [ - ]
[ ∂Y/∂m ] = ---  [  Li       Ar    ] [ 1 ] = ---  [    Ar    ] = [ + ]    (47)
             Δ                                Δ
(* find the partial responses to dm *)
mpartials = solnK /. {Dt[g] → 0, Dt[Π] → 0, Dt[m] → 1};
mpartials /. notationRules // Simplify
{didm, dydm} = {Dt[i], Dt[y]} /. mpartials[[1]]
Sign[{didm, dydm}] // Simplify
{{di → (1 - Ay)/(Li - Ay Li + Ar Ly), dy → Ar/(Li - Ay Li + Ar Ly)}}
{(-1 + fA(0,1,0)[i - Π, y, g])/(-fL(1,0)[i, y] + fL(1,0)[i, y] fA(0,1,0)[i - Π, y, g] - fL(0,1)[i, y] fA(1,0,0)[i - Π, y, g]), fA(1,0,0)[i - Π, y, g]/(fL(1,0)[i, y] - fL(1,0)[i, y] fA(0,1,0)[i - Π, y, g] + fL(0,1)[i, y] fA(1,0,0)[i - Π, y, g])}
{-1, 1}
Experiment: change in expected inflation.
Recall our reduced form:

[ di ]    1   [ -LY    (1 - AY) ] [ -Ar dπ + AG dG ]
[ dY ] = ---  [  Li       Ar    ] [       dm       ]    (48)
          Δ
Now set dG = 0 and dm = 0.

[ di ]    1   [ -LY    (1 - AY) ] [ -Ar dπ ]
[ dY ] = ---  [  Li       Ar    ] [    0   ]    (49)
          Δ
Now divide both sides by dπ.

[ ∂i/∂π ]    1   [ -LY    (1 - AY) ] [ -Ar ]    1   [  LY Ar ]   [ + ]
[ ∂Y/∂π ] = ---  [  Li       Ar    ] [  0  ] = ---  [ -Li Ar ] = [ + ]    (50)
             Δ                                  Δ
“Classical” Model (Exercise)
In the Classical case we follow the same procedures and the same type of reasoning, making only a single change: instead of Y we take m to be endogenous, so that m and i are the endogenous variables. Note that we start with the same system of structural equations:
Y = A(i - π, Y, G)
m = L(i, Y)    (51)
It follows that the total differential is unchanged:
dY = Ar (di - dπ) + AY dY + AG dG
dm = Li di + LY dY    (52)
Of course, all the partial derivatives from the structural form are unchanged: Ar < 0, 0 < AY < 1, AG > 0, Li < 0, and LY > 0.
But of course we have a different set of endogenous variables, so we have a different implied reduced form:
m = m(Y, π, G)
i = i(Y, π, G)    (53)
So when we write down the matrix equation, we use our new set of endogenous variables:

[  Ar   0 ] [ di ]   [ Ar dπ + (1 - AY) dY - AG dG ]
[ -Li   1 ] [ dm ] = [            LY dY            ]    (54)
Solving for the changes in the endogenous variables:

[ di ]    1   [ 1    0  ] [ Ar dπ + (1 - AY) dY - AG dG ]
[ dm ] = ---  [ Li   Ar ] [            LY dY            ]    (55)
          Ar
So for example

[ ∂i/∂π ]    1   [ 1    0  ] [ Ar ]    1   [  Ar   ]   [ + ]
[ ∂m/∂π ] = ---  [ Li   Ar ] [ 0  ] = ---  [ Li Ar ] = [ - ]    (56)
             Ar                         Ar
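The Classical experiment can also be checked numerically (an aside; the parameter values are assumptions satisfying Ar < 0 and Li < 0):

```python
import numpy as np

Ar, Li = -1.0, -1.0               # assumed values with Ar < 0, Li < 0

# Matrix equation (54) with dpi = 1 and dY = dG = 0.
Jc = np.array([[Ar, 0.0],
               [-Li, 1.0]])
b = np.array([Ar, 0.0])
di_dpi, dm_dpi = np.linalg.solve(Jc, b)
print(di_dpi, dm_dpi)   # i rises one-for-one with expected inflation; m = Li < 0 falls
```

The solution is ∂i/∂π = 1 and ∂m/∂π = Li, matching the signs in (56).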
10.4 Curvature

During our examination of the calculus of univariate functions, we found that for any differentiable function we can find a derivative function whose values represent the slopes of the primitive function. If this derivative function is in turn differentiable, we found we could produce another function -- the second-order derivative function -- whose values represent the curvature of the primitive function.
We would now like to explore the curvature of functions of more than one variable. We will emphasize the bivariate case, which allows relatively easy visualization.
Basic Bivariate Example
Clearly the slopes of the first-order partial derivatives are giving us information about the curvature of
the function. But now we need to consider a difficulty similar to a problem we ran into when thinking
about function limits in a multidimensional setting: there are many different ways we can slice a three-
dimensional surface. As a result, the information about the curvature provided only by slices parallel to
the axes is too limited.
In order to explore this, introduce a direction vector v along with a scalar λ. The scalar will control the
size of the step we take in the direction of the direction vector.
Clear[f, x, y, λ, vx, vy]
D[f[x + λ vx, y + λ vy], λ]
D[f[x + λ vx, y + λ vy], {λ, 2}] // Expand
vy f(0,1)[x + vx λ, y + vy λ] + vx f(1,0)[x + vx λ, y + vy λ]
vy^2 f(0,2)[x + vx λ, y + vy λ] + 2 vx vy f(1,1)[x + vx λ, y + vy λ] + vx^2 f(2,0)[x + vx λ, y + vy λ]
In this expression, we see not only the second-order partial derivatives with respect to a single variable, but also derivatives with respect to one variable and then the other. These are sometimes called mixed derivatives. The result that the value of a mixed derivative does not depend on the order of differentiation is sometimes called Young’s Theorem or Schwarz’s Theorem.
Theorem
If f has continuous second-order partial derivatives, then fx,y ⩵ fy,x.
The matrix of second-order partial derivatives is called the Hessian of the function. We can produce the
Hessian with the `D` command. (Note that satisfaction of Young’s theorem is assumed; the matrix is
symmetric.)
hess = D[f[x, y], {{x, y}, 2}]; hess // MatrixForm
f(2,0)[x, y] f(1,1)[x, y]
f(1,1)[x, y] f(0,2)[x, y]
Let us consider the associated quadratic form.
{vx, vy}.hess.{vx, vy} // Expand
vy^2 f(0,2)[x, y] + 2 vx vy f(1,1)[x, y] + vx^2 f(2,0)[x, y]
We can produce our earlier expression for the curvature along a line through the origin by setting
v = (1, m). In fact, any value for v other than (0, 0) represents the curvature resulting from a movement
along some line. It follows that in order to say something about the curvature of the function indepen-
dently of what direction we are moving, we need enough information about the Hessian of the function.
In particular, if the Hessian ensures that regardless of the (nonzero) value of v we will get a positive
result, we will say that the curvature is positive, and that the function is convex. This is the case when
the Hessian is a positive-definite matrix. However, if the Hessian ensures that regardless of the
(nonzero) value of v we will get a negative result, we will say that the curvature is negative, and that the
function is concave. This is the case when the Hessian is a negative definite matrix.
Recall that we can test for positive definiteness by first checking that all diagonal elements are positive and then testing that all leading principal minors are positive. Similarly, we can test for negative definiteness by first checking that all diagonal elements are negative and then testing that the leading principal minors alternate in sign.
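The leading-principal-minor test is easy to script in Python (an aside, not part of the WL workflow). Here we apply it to the Hessians of x^2 + y^2 and of the saddle example x^2 - y^2 discussed later in the chapter:

```python
import numpy as np

def leading_principal_minors(H):
    """Determinants of the upper-left 1x1, 2x2, ... submatrices of H."""
    return [float(np.linalg.det(H[:k, :k])) for k in range(1, H.shape[0] + 1)]

H_convex = np.array([[2.0, 0.0], [0.0, 2.0]])    # Hessian of x^2 + y^2
H_saddle = np.array([[2.0, 0.0], [0.0, -2.0]])   # Hessian of x^2 - y^2

print(leading_principal_minors(H_convex))   # all positive -> positive definite
print(leading_principal_minors(H_saddle))   # mixed signs -> indefinite
```

For H_convex the minors are all positive, so the matrix is positive definite; for H_saddle they have mixed signs that do not match the negative-definite alternating pattern either, so the matrix is indefinite.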
ClearAll[f, x, y, x0, y0, λ]
With[{f = Function[{x, y}, x^2 + y^2]},
 Manipulate[
  g1 = ParametricPlot3D[
    {x0 + λ vx, y0 + λ vy, f[x0 + λ vx, y0 + λ vy]} /. {x0 → 0, y0 → 0}, {λ, -1, 1},
    AxesLabel → {"x", "y", None},
    PlotRange → {{-1, 1}, {-1, 1}, {0, 2}}];
  g2 = Graphics3D[Arrow[{{0, 0, 0}, {vx, vy, 0}}]];
  Show[g1, g2],
  {{vx, 1}, -1, 1}, {{vy, 1}, -1, 1}]]
f = Function[{x, y}, x^2 + y^2];
hess = D[f[x, y], {{x, y}, 2}]
PositiveDefiniteMatrixQ[hess]
{{2, 0}, {0, 2}}
True
Weak Curvature
We may have a minimum where the function is only weakly convex. We can check for this case by ensuring the Hessian is positive semidefinite.
f = Function[{x, y}, x^2];
Plot3D[f[x, y], {x, -1, 1}, {y, -1, 1}]
Grad[f[x, y], {x, y}]
{2 x, 0}
The point (0, 0) is a stationary point, and it is an extremum. However, it is not unique -- not even locally.
In this case we find that the Hessian matrix is nonnegative definite, which is also called positive semi-
definite.
hess = D[f[x, y], {{x, y}, 2}]
{{2, 0}, {0, 0}}
PositiveSemidefiniteMatrixQ[hess]
True
Saddle Point
Just as in the univariate case, curvature may differ even in sign at different points of the function. How-
ever, we now have new possibilities. We will be particularly interested in the possibility of a saddle
point, where at a single point of the function the curvature is negative along one slice but positive along
another.
f = Function[{x, y}, x^2 - y^2];
Plot3D[f[x, y], {x, -1, 1}, {y, -1, 1},
AxesLabel → {"x", "y", None}]
Note that (0, 0) is a stationary point.
Solve[Grad[f[x, y], {x, y}] ⩵ 0, {x, y}]
{{x → 0, y → 0}}
But we are not dealing with an extremum. We can see this by examining the second-order partial derivatives along each axis.
Expression            Result
f                     Function[{x, y}, x^2 - y^2]
Derivative[2,0][f]    Function[{x, y}, 2]
Derivative[0,2][f]    Function[{x, y}, -2]
So along the x-axis, the function looks convex, suggesting a minimum. Yet along the y-axis, the func-
tion looks concave, suggesting a maximum. Correspondingly, the Hessian matrix of the function is
indefinite.
hess = D[f[x, y], {{x, y}, 2}]
{{2, 0}, {0, -2}}
ClearAll[f, x, y, x0, y0, λ]
With[{f = Function[{x, y}, x^2 - y^2]},
 Manipulate[
  g1 = ParametricPlot3D[
    {x0 + λ vx, y0 + λ vy, f[x0 + λ vx, y0 + λ vy]} /. {x0 → 0, y0 → 0}, {λ, -10, 10},
    AxesLabel → {"x", "y", None},
    PlotRange → {{-1, 1}, {-1, 1}, {-1, 1}}];
  g2 = Graphics3D[Arrow[{{0, 0, 0}, {vx, vy, 0}}]];
  g3 = Plot3D[f[x, y], {x, -1, 1}, {y, -1, 1}];
  Show[g1, g2, g3],
  {{vx, 1}, -1, 1}, {{vy, 1}, -1, 1}]]
10.5 Bivariate Optimization
In this section, we explore how to find extreme points of a multivariate function. Our approach will be to extend our calculus tools for optimization to the multivariate case. The discussion below will focus on the bivariate case, but it is easily generalized.
The key to finding a multivariate extremum is the realization that it must also be a univariate extremum in each of the variables individually. We already know how to search for a univariate extremum by searching for stationary points and then checking the curvature at those points.
The default display provides grid lines and is quite good, but it may not always meet our needs. As
always, WL offers a plethora of plotting options, and the best approach is not to try to master them all
but rather to browse the documentation on an ad hoc basis. Here we mention the `MeshFunctions`
option, because it allows us to nicely see the level sets of our function, and the `Mesh` option, which
can control the number of level sets to display. A level set is the set of combinations of x and y that produce a
constant function value. The mesh function is provided with three arguments, representing the x, y, and
z coordinates. To produce level sets, our choice of mesh function is just the third of these arguments.
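To make the definition concrete, a level set of f(x, y) = x² - y² at a positive value c is a hyperbola, which can be parametrized with hyperbolic functions since cosh²t - sinh²t = 1. A small Python check (the parametrization and tolerance are illustrative choices, not from the book) confirms that every point on the curve yields the same function value:

```python
import math

def f(x, y):
    return x**2 - y**2

# For c > 0, the level set f(x, y) = c is traced by
# x = sqrt(c)*cosh(t), y = sqrt(c)*sinh(t).
c = 0.5
points = [(math.sqrt(c) * math.cosh(t), math.sqrt(c) * math.sinh(t))
          for t in (-2.0, -1.0, 0.0, 1.0, 2.0)]

# All points share the constant function value c (up to rounding error).
values = [f(px, py) for px, py in points]
print(all(abs(v - c) < 1e-12 for v in values))  # True
```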
Plot3D[f[x, y], {x, -1, 1}, {y, -1, 1},
AxesLabel → {"x", "y", "z"}, MeshFunctions → {#3 &}, Mesh → 10]
ContourPlot[f[x, y], {x, -1, 1}, {y, -1, 1},
 ContourLabels → (Style[Text[#3, {#1, #2}], GrayLevel[.3], 6] &), ImageSize → 288]
Stationary Points
Definition
A stationary point of a differentiable function is a point where the function has a null gradient.
This result should seem natural, in light of our work on univariate functions and our discussion of the
function differential. To be at a stationary point, the directional derivative must vanish regardless of our
chosen direction. Since we found that the directional derivative is a linear combination of partial deriva-
tives, a null gradient ensures a null directional derivative.
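That linear-combination claim is easy to verify symbolically. The sketch below (in SymPy, as a cross-check outside the book's Wolfram code; the choice f = x² + y² is just an example) differentiates f along an arbitrary direction (vx, vy) and compares the result with the dot product of the gradient and the direction:

```python
import sympy as sp

x, y, vx, vy, lam = sp.symbols('x y vx vy lam')
f = x**2 + y**2  # illustrative choice; any smooth bivariate f works

grad = [sp.diff(f, x), sp.diff(f, y)]

# Directional derivative: differentiate f along (x + lam*vx, y + lam*vy)
# and evaluate at lam = 0 ...
along = f.subs({x: x + lam * vx, y: y + lam * vy}, simultaneous=True)
directional = sp.diff(along, lam).subs(lam, 0)

# ... then compare with the linear combination of partials, grad . (vx, vy).
linear_combo = grad[0] * vx + grad[1] * vy
print(sp.simplify(directional - linear_combo))  # 0
```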
As in the univariate case, we will begin our search for extrema by searching for stationary points. We
therefore search for an extremum by looking at the stationary points: find (xs, ys) such that the two first-
order partial derivatives equal zero. The slopes of the grid lines in our first figure are the partial
derivatives. We can compute these one at a time, using the `D` command as usual. Recall that we can
also use the `Grad` command to compute them all at one go as a list of values.
Expression                        Result
f = Function[{x,y}, x^2 + y^2]    Function[{x, y}, x^2 + y^2]
D[f[x,y], x]                      2 x
D[f[x,y], y]                      2 y
Grad[f[x,y], {x,y}]               {2 x, 2 y}
Once we have the first-order partial derivatives, we can search for a stationary point, where they all
simultaneously equal 0. The intuition is very similar to the univariate case: to be at the extremum of a
differentiable function is necessarily to be where the slope is 0. For example, if we are at a maximum,
there must be no direction of possible increase. This certainly means each partial derivative must be
zero. However, we have also seen that the directional derivative can be written as a weighted sum of
partial derivatives. As a result, if we can find a point where all first-order partial derivatives are zero, we
have found a stationary point.
Solve[Grad[f[x, y], {x, y}] ⩵ 0, {x, y}, Reals]
{{x → 0, y → 0}}
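The same first-order conditions can be solved outside Mathematica; a SymPy sketch (an outside-the-book cross-check) reproduces the unique stationary point of x² + y²:

```python
import sympy as sp

x, y = sp.symbols('x y', real=True)
f = x**2 + y**2

# Solve both first-order conditions simultaneously: 2x = 0 and 2y = 0.
stationary = sp.solve([sp.diff(f, x), sp.diff(f, y)], [x, y], dict=True)
print(stationary)  # [{x: 0, y: 0}]
```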
Small Changes Can Matter
Suppose we are looking for an extremum of f [x, y]. If we consider any strictly increasing transformation
of f , it will have the same extrema. In this section, we show one way that observation can be useful.
Consider the following real-valued function of real variables.
ClearAll[f]
f = Function[{x, y}, Sqrt[x^2 + y^2]]
Plot3D[f[x, y], {x, -1, 1}, {y, -1, 1}, Mesh → 5, MeshFunctions → {#3 &}]
Function[{x, y}, Sqrt[x^2 + y^2]]
It seems pretty clear from the plot that (0, 0) is a minimizer. In an attempt to show that, let us proceed
as usual: searching for an extremum by trying to find (x1, x2) such that the two first-order partial
derivatives equal zero. Applying the chain rule, we see that the gradient is
gradf = Grad[f[x, y], {x, y}]
{x/Sqrt[x^2 + y^2], y/Sqrt[x^2 + y^2]}
So let us proceed apace and solve the necessary first-order conditions for the minimum of a differentiable function. But this produces an empty set of solutions.
Solve[gradf ⩵ 0, {x, y}, Reals]
{}
The problem arises because the gradient is undefined at (0, 0). In order to visualize the problem, let us
look at a slice of the surface at y ⩵ 0.
Plot[f[x, 0], {x, -1, 1}, ImageSize → 144]
We see that although the function is continuous, it is not differentiable at 0. So the function minimum
does occur at a critical point, but that critical point is not a stationary point, because the function is not
differentiable there.
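A small numerical sketch makes the failure concrete (plain Python, as an outside-the-book illustration; the helper `grad_x` is hypothetical). The gradient component x / √(x² + y²) has a zero denominator at the origin, and the slice f(x, 0) = |x| has one-sided slopes of +1 and -1 there: a kink rather than a flat tangent.

```python
import math

def f(x, y):
    return math.sqrt(x**2 + y**2)

def grad_x(x, y):
    """First gradient component x / sqrt(x^2 + y^2); None where undefined."""
    try:
        return x / math.sqrt(x**2 + y**2)
    except ZeroDivisionError:
        return None

# The gradient does not exist at the origin, so the first-order
# conditions can never be satisfied there.
print(grad_x(0, 0))  # None

# One-sided difference quotients along the slice y = 0:
h = 1e-8
slope_right = (f(h, 0) - f(0, 0)) / h
slope_left = (f(-h, 0) - f(0, 0)) / (-h)
print(round(slope_right), round(slope_left))  # 1 -1
```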
The domain of f is ℝ² and the range of f is ℝ₊. Define g to be the result of squaring the value of f.
Since the range of values is nonnegative, squaring is a strictly increasing transformation of the range.
g = Function[{x, y}, Power[f[x, y], 2]];
g[x, y]
x^2 + y^2
Now we are in familiar territory. We can take a standard calculus-based approach to optimization.
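Carrying the transformed problem to completion in SymPy (again an outside-the-book cross-check): the squared function g(x, y) = x² + y² is differentiable everywhere, the first-order conditions now pick out the origin, and the Hessian there is positive definite, so (0, 0) minimizes g and therefore minimizes f as well.

```python
import sympy as sp

x, y = sp.symbols('x y', real=True)
g = x**2 + y**2  # the square of f(x, y) = sqrt(x^2 + y^2)

# First-order conditions now succeed where they failed for f itself.
stationary = sp.solve([sp.diff(g, x), sp.diff(g, y)], [x, y], dict=True)
print(stationary)  # [{x: 0, y: 0}]

# Positive definite Hessian: (0, 0) is a minimum of g, hence of f.
hess = sp.hessian(g, (x, y))
print(hess.is_positive_definite)  # True
```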