Chapter 10: Differential Calculus and Nonlinear Comparative Statics

Differential calculus is a core tool for optimization and comparative statics. This chapter applies the concepts of differential calculus to multivariate problems.
10.1 Limits and Continuity

Recall from our discussion of univariate calculus that an infinite sequence has a limit L iff almost all of its points are arbitrarily close to L. Recall as well that we say that a function f has a limit L at point x0 if f[x] approaches L as x approaches x0. These basic concepts are unchanged, but in a multivariate setting there is a complication. Our discussion of univariate function limits involved one-sided limits, but in a multivariate setting we can approach x0 from many directions, not just two. Nevertheless, the core idea behind continuity remains the same: a function is continuous at x0 if it does not have a hole or a jump at x0. This means that the function limit is the same no matter which direction (in the domain) we come from, and additionally equals the value of the function at x0.
f = Function[{x, y}, If[x y < 0, 1/3, 2/3]];
For this function, we generally find that if we set a value of y and then let x approach 0, the answer
depends on the sign of y, and we get a different answer if x is coming from the left or from the right.
And by symmetry, we encounter the same issues if we fix a value of x and then let y approach zero.
Limit[f[x, y], x → 0, Direction → -1]
1/3   y < 0
2/3   True
Limit[f[x, y], x → 0, Direction → 1]
1/3   y > 0
2/3   True
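For readers following along outside Mathematica, here is a minimal numeric sketch of the same direction-dependent behavior in plain Python (an aside, not part of the WL workflow): fixing y < 0 and approaching x = 0 from the right keeps x y < 0, while approaching from the left does not.

```python
def f(x, y):
    # 1/3 where x*y < 0, else 2/3 -- the same piecewise function as in the text
    return 1/3 if x*y < 0 else 2/3

# Fix y = -1. From the right (x > 0), x*y < 0, so the values approach 1/3;
# from the left (x < 0), x*y > 0, so the values approach 2/3.
from_right = [f(10**-k, -1) for k in range(1, 6)]
from_left = [f(-10**-k, -1) for k in range(1, 6)]
print(from_right[-1], from_left[-1])
```

The two one-sided sequences settle on different values, so the limit at x = 0 does not exist for this fixed y.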
Next we turn to a bivariate example of a discontinuous function whose function limits are well defined.
As in the univariate case, the existence of a function limit is not sufficient for continuity. As an example
of a function that has both holes and limits, consider the following.
f = Function[{x, y}, Sin[x + y]/(x + y)];
This function is undefined when x + y ⩵ 0. This shows up as a missing ridge in the function graph.
Plot3D[f[x, y], {x, -π, π}, {y, -π, π}, ColorFunction → "GrayYellowTones"]
Pick (0, 0) as an example point of discontinuity. No matter how we approach (0, 0) through the domain
of f , we always approach a function value of 1. For example,
Limit[f[x, y] /. {y → m x}, x → 0]
1
Note that as users we are again expected to be alert to the fact that this result does not apply if m = -1.
Mathematica gives us the generally useful result, which applies for all other values of m.
Finally, let us look at a function of two variables that is undefined at a single point: the origin.
f = Function[{x, y}, x y/(x^2 + y^2)];
Although this function behaves smoothly everywhere else, it does not have a limit at the origin. Here is
an easy way to see that. Note that, except at (0, 0), the value of the function is 1 /2 whenever x ⩵ y, yet
the value of the function is -1 /2 whenever x ⩵ -y. So we can easily find two directions of approach to
the origin that yield sequences in the range that approach different limits.
This is enough information to show that the function does not have a limit at the origin. But these two
cases are simply special cases of collinearity between x and y as we approach the origin. More gener-
ally we can observe that along any line through the origin the function has a constant value.
f[x, y] /. {y → m x} // Simplify
m/(1 + m^2)
This means that the limit of the function values varies continuously as we vary the line along which we approach the origin. The two values initially considered were just the largest and the smallest of these varied limits.
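A parallel symbolic check with SymPy (an aside, not part of the WL workflow): substituting y = m x reduces the function to the constant m/(1 + m^2), so different lines through the origin produce different limiting values.

```python
import sympy as sp

x, m = sp.symbols('x m', real=True)

# f(x, y) = x*y/(x^2 + y^2), restricted to the line y = m*x
f_on_line = x*(m*x) / (x**2 + (m*x)**2)
along_line = sp.simplify(f_on_line)   # constant in x: m/(m**2 + 1)
print(along_line)
# Two different lines give two different limits, so no limit exists at the origin.
print(along_line.subs(m, 1), along_line.subs(m, -1))
```

Since the value along the line is constant in x, the limit along each line is just that constant, and it ranges between -1/2 and 1/2.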
Plot[m/(1 + m^2), {m, -10, 10}]

[Plot of m/(1 + m^2) for m from -10 to 10; the values range between -1/2 and 1/2.]
10.2 Multivariate Differential Calculus

A continuous multivariate function may additionally be differentiable. We begin with the notion of a partial derivative.
Partial Derivatives
A first-order partial derivative is just the ordinary derivative of a function with respect to one of its argu-
ments, when all the other arguments are constant. In other words, when we fix all the arguments of the
function but one, we are left with an ordinary univariate function. The derivative of this univariate func-
tion is a partial derivative of the original function.
To illustrate, suppose f is a real-valued function of two real variables. We can find its difference quotient with respect to either the first argument or the second. For example,
Clear[f, x, y, h]
DifferenceQuotient[f[x, y], {x, h}]
(-f[x, y] + f[h + x, y])/h
Once we have the difference quotient, we can find the derivative we seek as before: by shrinking the step size to 0. Here we illustrate this approach to producing partial derivatives with an arbitrary bivariate function.
Expression                                  Result
f = Function[{x, y}, x^2 y^2]               Function[{x, y}, x^2 y^2]
dqx = DifferenceQuotient[f[x,y], {x, h}]    h y^2 + 2 x y^2
Limit[dqx, h → 0]                           2 x y^2
dqy = DifferenceQuotient[f[x,y], {y, h}]    h x^2 + 2 x^2 y
Limit[dqy, h → 0]                           2 x^2 y
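The same difference-quotient construction can be replayed in SymPy (an aside, not part of the WL workflow): form the quotient by hand, then let the step size go to zero.

```python
import sympy as sp

x, y, h = sp.symbols('x y h')
f = x**2 * y**2

# Difference quotient in x, then shrink the step size to zero.
dqx = ((f.subs(x, x + h) - f) / h).expand()   # h*y**2 + 2*x*y**2
fx = sp.limit(dqx, h, 0)                      # partial derivative in x
# Same construction in y.
dqy = ((f.subs(y, y + h) - f) / h).expand()   # h*x**2 + 2*x**2*y
fy = sp.limit(dqy, h, 0)                      # partial derivative in y
print(fx, fy)
```

The limits reproduce the table's results, 2 x y^2 and 2 x^2 y.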
The `Derivative` command can produce partial derivatives of ordinary functions. We need to specify the order of differentiation with respect to each of the slots of the function. The partial derivatives of functions that are not yet defined use a slightly unusual notation, but it is unambiguous once one is accustomed to it. As a superscript to the function, we see a tuple that gives the order of differentiation with respect to each of the function arguments.
Expression           Result
Derivative[1,0][f]   f(1,0)
Derivative[0,1][f]   f(0,1)
The notation is a bit unusual, but it is nicely unambiguous (for any continuously differentiable function).
If the function is known, the `Derivative` command produces a new function representing the partial
derivative.
Expression                      Result
f = Function[{x, y}, x^2 y^2]   Function[{x, y}, x^2 y^2]
Derivative[1,0][f]              Function[{x, y}, 2 x y^2]
Derivative[0,1][f]              Function[{x, y}, 2 x^2 y]
To partially differentiate a multivariate expression, one can request a single partial derivative with the
`D` command, or one can request a list of all the partial derivatives at once with the `Grad` command.
This list represents a vector of partial derivatives, known as the gradient of the function.
Clear[f]
Grad[f[x, y], {x, y}]
{f(1,0)[x, y], f(0,1)[x, y]}
The output of the `Grad` command is a list of the first-order partial derivatives, given in the same order as the provided variables. Here is an example with an arbitrary bivariate function.
Expression                      Result
f = Function[{x, y}, x^2 y^2]   Function[{x, y}, x^2 y^2]
D[f[x,y], x]                    2 x y^2
D[f[x,y], y]                    2 x^2 y
Grad[f[x,y], {x, y}]            {2 x y^2, 2 x^2 y}
There is also a nabla shorthand for `Grad` that we will occasionally use. The variables of differentiation are provided as a list that subscripts a nabla, which is entered as `del`.
∇{x,y} f[x, y]
{2 x y^2, 2 x^2 y}
Visualizing the Partial Derivatives
In order to plot a function of two variables, we will need three dimensions: two dimensions for the
argument values, and one dimension for the resulting function value. Recall that the `Plot3D` command
nicely fits the bill for this need. To make it easier to keep track of which axis represents which value, we
will use the `AxesLabel` option.
g1 = Plot3D[x^2 + y^2, {x, -1, 1}, {y, -1, 1},
AxesLabel → {"x", "y", None}, ImageSize → 288]
Each grid line represents a univariate function. If we hold one variable constant while varying the other,
we can produce two-dimensional plots. Here we elaborate on this observation by slicing our surface
plot with a plane at y = 0. This helps us visualize the univariate function that results when we fix y = 0
but allow x to vary. The resulting function graph is the intersection between our surface and the plane.
g2 = Graphics3D[{InfinitePlane[{{-1, 0, 1}, {0, 0, 0}, {1, 0, 1}}]}];
Show[{g1, g2}]
Recall that setting y = 0 this way is called partial function application. After partially applying our bivari-
ate function, we end up with a univariate function. Naturally, we can create a two-dimensional plot of
this univariate function. The slopes we observe in such a plot are derivatives of the partially applied
function, but they are partial derivatives of our multivariate function.
[Figure 33: Univariate Slopes are Partial Derivatives — two panels showing z plotted against x (with y fixed at 0) and z plotted against y (with x fixed at 0).]
Since we can think of the derivative of a univariate function as the slope of a tangent line to the graph of
the univariate function, we can correspondingly think of a partial derivative as the slope of a tangent line
to the graph of a partially applied multivariate function.
Multivariate Differential: Visualizing the Gradient
If we have a univariate function f, the derivative f'[x] tells us approximately how much the function will increase given a unit increase in x, so f'[x] dx is the approximate change in the value of f given a change in x in the amount dx. We exploited this observation to develop the notion of the differential of a univariate function.
If we have a bivariate function f, the partial derivatives ∇{x,y} f[x, y] correspondingly tell us approximately how much f will increase given a unit change in x or a unit change in y, while we hold the other variable constant. If both variables change by amounts (dx, dy), then we correspondingly approximate the change in the value of f by the dot product of the gradient and the change vector: df = ∇{x,y} f[x, y].(dx, dy). This is the differential of our bivariate function. Expressing this with WL, we have
Clear[f, x, y, dx, dy]
df = Dot[∇{x,y}f[x, y], {dx, dy}]
dy f(0,1)[x, y] + dx f(1,0)[x, y]
As a concrete example, consider the following function.
f = Function[{x, y}, x^2 + y^2];
df
2 dx x + 2 dy y
The gradient gives the direction of most rapid increase in the function value. To see that, let f be a real-
valued function of a real-vector input, and define g[t] = f [x + t v]. Assuming g is differentiable at 0, the
rate of increase of g is g '[0] = ∇x f .v, which is the dot product of the gradient and the vector v. This is
called the directional derivative of f in the direction v (at the point x). That is, a directional derivative is a
linear combination of partial derivatives. For example,
Clear[f]
D[f[x1 + t v1, x2 + t v2], t] /. {t → 0}
v2 f(0,1)[x1, x2] + v1 f(1,0)[x1, x2]
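A concrete SymPy check of this identity (an aside, not part of the WL workflow), using x^2 y^2 as a stand-in for an arbitrary smooth function: differentiating g[t] = f[x + t v] at t = 0 gives exactly the gradient dotted with v.

```python
import sympy as sp

x, y, t, vx, vy = sp.symbols('x y t vx vy')
f = lambda a, b: a**2 * b**2   # stand-in for an arbitrary smooth function

# g[t] = f[x + t v]; the directional derivative is g'[0].
directional = sp.diff(f(x + t*vx, y + t*vy), t).subs(t, 0)

# The gradient of f, dotted with the direction vector v.
grad = [sp.diff(f(x, y), x), sp.diff(f(x, y), y)]
dot = grad[0]*vx + grad[1]*vy

print(sp.simplify(directional - dot))  # 0: the two expressions agree
```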
To get rid of the influence of length, let v be a unit vector. This allows us to again ask, how much will a unit step in a particular direction increase the value of f? Recall from our earlier discussion of angles that to make this dot product as large as possible, we need v to be codirectional with the gradient.
A gradient of a function is perpendicular to the associated curve in the level set. In this section we will
illustrate this. First we illustrate the gradient.
For this illustration, we will use a Cobb-Douglas production function. The traditional representation is
Y = A Kα N1-α. For concreteness, set A = 10 and α = 0.3. Recall that user-defined symbols should start
with lower-case letters in WL. So let us simply use minuscule letters in our representation.
Clear[k, dk, n, dn, y]
y = Function[{k, n}, Evaluate[10 * k^α * n^(1 - α) /. {α → 0.3}]]
Function[{k, n}, 10 k^0.3 n^0.7]
Use what you know about `Plot3D` to make a nice surface plot of our Cobb-Douglas function. (Name it `g1`.)
We will add a direction vector and an isoquant to this plot. Let us arbitrarily choose an initial point (k0, n0) for our illustration. Letting y[k, n] be our Cobb-Douglas function, we evaluate it at this point. This gives us a triple (k0, n0, y0), which describes a three-dimensional point.
{k0, n0} = {5, 5};
y0 = y[k0, n0];
pt0 = {k0, n0, y0}
{5, 5, 50.}
The partial derivative of a production function with respect to the capital input is called the marginal
product of capital. The partial derivative of a production function with respect to the labor input is called
the marginal product of labor. We can produce these both at one go with the `Grad` command.
grad = Grad[y[k, n], {k, n}]
{3. n^0.7/k^0.7, 7. k^0.3/n^0.3}
The gradient gives us the direction of steepest increase. Evaluate this gradient at this initial point,
(k0, n0). Use this to produce a new point by moving in the direction of steepest increase from our initial
point.
{dk, dn} = grad /. {k → k0, n → n0};
{k1, n1} = {k0, n0} + {dk, dn};
y1 = y[k1, n1];
pt1 = {k1, n1, y1}
{8., 12., 106.256}
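A quick numeric replay of this gradient step in Python (an aside, not part of the WL workflow), with A = 10 and α = 0.3 as in the text:

```python
def y(k, n):
    # Cobb-Douglas production function with A = 10, alpha = 0.3
    return 10 * k**0.3 * n**0.7

def grad_y(k, n):
    # (dy/dk, dy/dn) = (3 n^0.7 / k^0.7, 7 k^0.3 / n^0.3)
    return (3 * n**0.7 / k**0.7, 7 * k**0.3 / n**0.3)

k0, n0 = 5.0, 5.0
dk, dn = grad_y(k0, n0)       # (3.0, 7.0) at (5, 5)
k1, n1 = k0 + dk, n0 + dn     # (8.0, 12.0)
print(y(k0, n0), y(k1, n1))   # 50.0 and about 106.256
```

Stepping in the gradient direction from (5, 5) reproduces the new point (8, 12) and the higher output level seen above.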
The `Graphics3D` command provides for drawing three-dimensional arrows. Here we draw an arrow
from our initial point to our new point, calling this drawing `g2`. We then use the `Show` command to
add it to our surface plot. (Note how the combined plot uses the options set in the first plot, since it is
the first argument to `Show`.)
g2 = Graphics3D[{
Arrowheads[0.02, Appearance → "Projected"], Thick, Arrow[{pt0, pt1}]
}];
Show[g1, g2]
An isoquant comprises capital-labor pairs that give a fixed value of output. We will find it informative to
add an isoquant to our drawing. For any fixed value of y, we can solve for k as a function of n. Here we
simply solve for the function by hand, since it is so simple.
kny = Function[{n, y}, (y/(10 n^0.7))^(1/0.3)];
Using this function to draw the isoquant corresponding to our initial inputs, we get
g3 = ParametricPlot3D[{kny[n, y0], n, y0}, {n, 0, 20}];
Show[{g1, g2, g3}]
It is more common to draw an isoquant as a projection on the input plane. We can use the
`ContourPlot` command to do this automatically for our production function, for any specified level of
production. Here we increase the value of `PlotPoints` in order to produce a smoother isoquant, and we
use the `Epilog` option to add an arrow corresponding to our gradient. In order to highlight the orthogo-
nal relationship between the gradient and the slope of the isoquant, we also use `ParametricPlot` to add
a tangent line to the isoquant.
g1 = ContourPlot[y[k, n] ⩵ y0, {k, 0, 20}, {n, 0, 20},
PlotPoints → 100, Epilog → Arrow[{{k0, n0}, {k1, n1}}]];
tangentSlope = D[kny[n, y0], n] /. {k → k0, n → n0};
g2 = ParametricPlot[{kny[5, y0] + tangentSlope * (n - 5), n}, {n, 0.1, 7},
PlotStyle → {Thin, Dashed}];
Show[{g1, g2}]
10.3 Nonlinear Comparative Statics

A Nonlinear System of Equations
Consider the following as our “structural” equations (without worrying too much about the nature of economic structure).
Y = A(i - π, Y, G)
m = L(i, Y)    (30)
Here Y is total production in the economy, A is the function determining demand for that production, i is
the nominal interest rate, π is the expected inflation rate, G measures the “fiscal stance” (i.e., how
expansionary fiscal policy is), and m is the real money supply.
ClearAll[fA, fL, i, y, m, Π, g] (* all Mma variables should start lower case *)
lmnl = m ⩵ fL[i, y];
isnl = y ⩵ fA[i - Π, y, g];
Produce a textbook “Keynesian” model by taking Y and i to be endogenous, or produce a textbook
“Classical” model by taking m and i to be endogenous.
Total Differentials
We will first consider the money market. Recall that equation (30) described money market equilibrium as m = L(i, Y). This must hold both before and after any exogenous changes. That is, we require that we start out in money market equilibrium, and we also require that we end up in money market equilibrium. It follows that the change in the real money supply (dm) must equal the change in real money demand (dL).
dm = dL    (31)
The change in real money demand has two sources: changes in i and changes in Y. As usual, we will
represent these as d i and d Y. Of course, the change in real money demand depends not only on the
size of the changes in these arguments, but also on how sensitive money demand is to each of these
arguments.
dL = Li di + LY dY    (32)
Putting these two observations together, we get
dm = Li di + LY dY    (33)
We call this the “total differential” of the LM equation. It makes a very simple statement: we start out on an LM curve, and we end up on an LM curve.
dlmnl = Dt[lmnl]
Dt[m] ⩵ Dt[y] fL(0,1)[i, y] + Dt[i] fL(1,0)[i, y]
Let’s make this a bit easier to read (but not to manipulate) by introducing some notation as rules. Note
that this is purely for convenience in reading: we are using strings (not symbols) in the result.
notationRulesLM = {fL(0,1)[i, y] → "Ly", fL(1,0)[i, y] → "Li", Dt[m] → "dm", Dt[i] → "di", Dt[y] → "dy"};
dlmnl /. notationRulesLM
dm ⩵ Li di + Ly dy
Next consider the goods market. Recall that the equation
Y = A(i - π, Y, G)    (34)
represents equilibrium in the goods market. This must hold both before and after any exogenous
changes. That is, we require that we start out in goods market equilibrium, and we also require that we
end up in goods market equilibrium. It follows that the changes in real income must equal the changes
in real aggregate demand. Looking at the equation for the IS curve, we can see that this means that the
change in real income (d Y) must equal the change in real aggregate demand (d A).
dY ⩵ dA (35)
The change in aggregate demand has three sources: changes in r, changes in Y, and changes in G. We represent these changes as dr, dY, and dG. Of course, the changes in aggregate demand depend not only on the size of the changes in these arguments, but also on how sensitive aggregate demand is to each of these arguments.
dA = Ar (di - dπ) + AY dY + AG dG    (36)
Putting these two pieces together, we have the total differential of the IS equation:
dY = Ar (di - dπ) + AY dY + AG dG    (37)
Note that A ( · , · , ·) has only three arguments. Do not be misled by the fact that we choose to write r
as i -π. This does not change the number of arguments of the aggregate demand function. E.g., there
is no derivative Ai.
disnl = Dt[isnl]
Dt[y] ⩵ Dt[g] fA(0,0,1)[i - Π, y, g] + Dt[y] fA(0,1,0)[i - Π, y, g] + (Dt[i] - Dt[Π]) fA(1,0,0)[i - Π, y, g]
notationRulesIS = {fA(0,0,1)[i - Π, y, g] → "AG", fA(0,1,0)[i - Π, y, g] → "Ay", fA(1,0,0)[i - Π, y, g] → "Ar", Dt[g] → "dg", Dt[Π] → "dΠ"};
notationRules = Join[notationRulesLM, notationRulesIS];
disnl /. notationRules
dy ⩵ AG dg + Ay dy + Ar (di - dΠ)
Implicit Function Theorem
The IFT provides the conditions under which we can characterize the partial derivatives of the reduced
form in terms of the partial derivatives of the structural form. That is, we can do qualitative comparative
statics.
Review the IFT using the online notes.
“Keynesian” Model
Let us first consider a textbook Keynesian model. Assuming satisfaction of the assumptions of the
implicit function theorem, there is an implied reduced form for the Keynesian model. The reduced form
expresses the solution for each endogenous variables in terms of the exogenous variables. We will
represent this as
i = i(m, π, G)
Y = Y(m, π, G)    (38)
The implicit function theorem tells us how to find the partial derivatives of i (., .) and Y (., .).
Note how we use the letter i to represent both a variable (on the left) and a function (on the right). This
is common practice among economists and mathematicians, as it helps us keep track of which function
is related to which variable. (However we will not usually be able to do this in a computer algebra
system.) Note that since we did not begin with an explicit functional form for the structural equations we
cannot hope to find an explicit functional form for the reduced form. Instead we rely on qualitative
information about the structural equations to make qualitative statements about the reduced form.
dlmnl
dlmnl /. notationRules
Dt[m] ⩵ Dt[y] fL(0,1)[i, y] + Dt[i] fL(1,0)[i, y]
dm ⩵ Li di + Ly dy
The total differential can be used to find the slope of the LM curve. Suppose we allow only i and Y to change (so that dm = 0). Then we must have
0 = Li di + LY dY
(di/dY)|LM = -LY/Li > 0    (39)
Dt[lmnl] /. {Dt[m] → 0} /. notationRules (* represent restricted total differential *)
Solve[Dt[lmnl] /. {Dt[m] → 0, Dt[y] → 1}, Dt[i]] /. notationRules
0 ⩵ Li di + Ly dy
{{di → -(Ly/Li)}}
This represents the way i and Y must change together to maintain equilibrium in the money market,
ceteris paribus. That is, this determines the slope of the “Keynesian” LM curve. Under the standard
assumptions that LY > 0 and Li < 0, the “Keynesian” LM curve has a positive slope.
Similarly, if we allow only i and Y to change in the goods market, we must have
dY = Ar di + AY dY
(di/dY)|IS = (1 - AY)/Ar < 0    (40)
restrictions = {Dt[m] → 0, Dt[g] → 0, Dt[Π] → 0}
Dt[isnl] /. restrictions /. notationRules (* represent restricted total differential *)
Solve[Dt[isnl] /. restrictions /. {Dt[y] → 1}, Dt[i]] /. notationRules
{Dt[m] → 0, Dt[g] → 0, Dt[Π] → 0}
dy ⩵ Ar di + Ay dy
{{di → (1 - Ay)/Ar}}
This is the way i and Y must change together to maintain equilibrium in the goods market. That is, this
determines the slope of the “Keynesian” IS curve. Under the standard assumptions that 0 < AY < 1 and
Ar < 0, the “Keynesian” IS curve has a negative slope.
Solving the Nonlinear Keynesian Model
So we have seen what is required to stay on the IS curve and what is required to stay on the LM curve.
Putting these together we have
dY = Ar (di - dπ) + AY dY + AG dG
dm = Li di + LY dY    (41)
When we insist that both of these equations hold together, we are insisting that we stay on both the IS and LM curves simultaneously. In this system there are two endogenous variables, di and dY, which are being determined so as to achieve this simultaneous satisfaction of the IS and LM equations.
Now we just solve two linear equations in two unknowns. First prepare to set up the system as a matrix
equation by moving all terms involving the endogenous variables to the left. (Note that this is the first
time we have paid attention to which variables are endogenous.)
-Ar di + (1 - AY) dY = -Ar dπ + AG dG
Li di + LY dY = dm    (42)
Now rewrite this system as a matrix equation in the form J x = b.

[ -Ar   (1 - AY) ] [ di ]   [ -Ar dπ + AG dG ]
[  Li      LY    ] [ dY ] = [       dm       ]    (43)
Then solve for the endogenous variables by multiplying both sides by J^(-1).

[ di ]               1               [  LY   -(1 - AY) ] [ -Ar dπ + AG dG ]
[ dY ]  =  ------------------------  [ -Li      -Ar    ] [       dm       ]
             -Ar LY - (1 - AY) Li

                     1               [ -LY    (1 - AY) ] [ -Ar dπ + AG dG ]
        =  ------------------------  [  Li       Ar    ] [       dm       ]    (44)
              Ar LY + (1 - AY) Li
Letting Δ = Ar LY + (1 - AY) Li, we can write this as

[ di ]    1   [ -LY    (1 - AY) ] [ -Ar dπ + AG dG ]
[ dY ] = ---  [  Li       Ar    ] [       dm       ]    (45)
          Δ
Invoking the standard assumptions on the structural form partial derivatives, listed above, we note that Δ = Ar LY + (1 - AY) Li < 0.
solnK = Solve[dlmnl && disnl, {Dt[i], Dt[y]}];
solnK /. notationRules
{{di → -((-AG Ly dg + dm - Ay dm + Ar Ly dΠ)/(-Li + Ay Li - Ar Ly)), dy → -((-AG Li dg - Ar dm + Ar Li dΠ)/(Li - Ay Li + Ar Ly))}}
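The same linear system can be solved symbolically with SymPy (an aside, not part of the WL workflow), which lets us confirm the reduced-form expressions independently:

```python
import sympy as sp

di, dy, dm, dg, dPi = sp.symbols('di dy dm dg dPi')
Li, Ly, Ar, Ay, AG = sp.symbols('Li Ly Ar Ay AG')

lm = sp.Eq(dm, Li*di + Ly*dy)                    # total differential of LM
is_ = sp.Eq(dy, AG*dg + Ay*dy + Ar*(di - dPi))   # total differential of IS

# Two linear equations in the two endogenous differentials di, dy.
sol = sp.solve([lm, is_], [di, dy])

di_expected = (-AG*Ly*dg + dm - Ay*dm + Ar*Ly*dPi) / (Li - Ay*Li + Ar*Ly)
dy_expected = (AG*Li*dg + Ar*dm - Ar*Li*dPi) / (Li - Ay*Li + Ar*Ly)
print(sp.simplify(sol[di] - di_expected), sp.simplify(sol[dy] - dy_expected))
```

Both differences simplify to zero, matching the `Solve` output above.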
Fiscal policy experiment:

[ ∂i/∂G ]    1   [ -LY    (1 - AY) ] [ AG ]    1   [ -LY AG ]   [ + ]
[ ∂Y/∂G ] = ---  [  Li       Ar    ] [ 0  ] = ---  [  Li AG ] = [ + ]    (46)
             Δ                                 Δ
(* find the partial responses to dg *)
$Assumptions = 1 > fA(0,1,0)[i - Π, y, g] > 0 && fA(0,0,1)[i - Π, y, g] > 0 &&
fA(1,0,0)[i - Π, y, g] < 0 && fL(0,1)[i, y] > 0 && fL(1,0)[i, y] < 0
gpartials = solnK /. {Dt[m] → 0, Dt[Π] → 0, Dt[g] → 1}
gpartials /. notationRules // Simplify
{didg, dydg} = {Dt[i], Dt[y]} /. gpartials[[1]]
Sign[{didg, dydg}] // Simplify
1 > fA(0,1,0)[i - Π, y, g] > 0 && fA(0,0,1)[i - Π, y, g] > 0 && fA(1,0,0)[i - Π, y, g] < 0 && fL(0,1)[i, y] > 0 && fL(1,0)[i, y] < 0
{{Dt[i] → (fL(0,1)[i, y] fA(0,0,1)[i - Π, y, g])/(-fL(1,0)[i, y] + fL(1,0)[i, y] fA(0,1,0)[i - Π, y, g] - fL(0,1)[i, y] fA(1,0,0)[i - Π, y, g]), Dt[y] → (fL(1,0)[i, y] fA(0,0,1)[i - Π, y, g])/(fL(1,0)[i, y] - fL(1,0)[i, y] fA(0,1,0)[i - Π, y, g] + fL(0,1)[i, y] fA(1,0,0)[i - Π, y, g])}}
{{di → -((AG Ly)/(Li - Ay Li + Ar Ly)), dy → (AG Li)/(Li - Ay Li + Ar Ly)}}
{(fL(0,1)[i, y] fA(0,0,1)[i - Π, y, g])/(-fL(1,0)[i, y] + fL(1,0)[i, y] fA(0,1,0)[i - Π, y, g] - fL(0,1)[i, y] fA(1,0,0)[i - Π, y, g]), (fL(1,0)[i, y] fA(0,0,1)[i - Π, y, g])/(fL(1,0)[i, y] - fL(1,0)[i, y] fA(0,1,0)[i - Π, y, g] + fL(0,1)[i, y] fA(1,0,0)[i - Π, y, g])}
{1, 1}
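A numeric spot check of the fiscal experiment in Python (an aside; the particular parameter values are assumptions chosen only to satisfy the standard sign restrictions):

```python
import numpy as np

# Illustrative values (assumptions, not from the text) with
# Ar < 0, 0 < Ay < 1, AG > 0, Li < 0, Ly > 0.
Ar, Ay, AG = -1.0, 0.5, 1.0
Li, Ly = -1.0, 0.5

# Coefficient matrix J and right-hand side for dG = 1, dm = dpi = 0.
J = np.array([[-Ar, 1 - Ay],
              [Li, Ly]])
b = np.array([AG, 0.0])
di_dG, dY_dG = np.linalg.solve(J, b)
print(di_dG > 0, dY_dG > 0)   # expansionary fiscal policy raises both i and Y
```

The signs agree with the qualitative result `{1, 1}` above; any other values obeying the sign restrictions give the same signs.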
Monetary policy experiment:
We know from the implicit function theorem that this is the same as solving for the partial derivatives of the reduced form. Thus we can write

[ ∂i/∂m ]    1   [ -LY    (1 - AY) ] [ 0 ]    1   [ (1 - AY) ]   [ - ]
[ ∂Y/∂m ] = ---  [  Li       Ar    ] [ 1 ] = ---  [    Ar    ] = [ + ]    (47)
             Δ                                Δ
(* find the partial responses to dm *)
mpartials = solnK /. {Dt[g] → 0, Dt[Π] → 0, Dt[m] → 1};
mpartials /. notationRules // Simplify
{didm, dydm} = {Dt[i], Dt[y]} /. mpartials[[1]]
Sign[{didm, dydm}] // Simplify
{{di → (1 - Ay)/(Li - Ay Li + Ar Ly), dy → Ar/(Li - Ay Li + Ar Ly)}}
{(-1 + fA(0,1,0)[i - Π, y, g])/(-fL(1,0)[i, y] + fL(1,0)[i, y] fA(0,1,0)[i - Π, y, g] - fL(0,1)[i, y] fA(1,0,0)[i - Π, y, g]), fA(1,0,0)[i - Π, y, g]/(fL(1,0)[i, y] - fL(1,0)[i, y] fA(0,1,0)[i - Π, y, g] + fL(0,1)[i, y] fA(1,0,0)[i - Π, y, g])}
{-1, 1}
Experiment: change in expected inflation.
Recall our reduced form:

[ di ]    1   [ -LY    (1 - AY) ] [ -Ar dπ + AG dG ]
[ dY ] = ---  [  Li       Ar    ] [       dm       ]    (48)
          Δ
Now set dG = 0 and dm = 0.

[ di ]    1   [ -LY    (1 - AY) ] [ -Ar dπ ]
[ dY ] = ---  [  Li       Ar    ] [    0   ]    (49)
          Δ
Now divide both sides by dπ.

[ ∂i/∂π ]    1   [ -LY    (1 - AY) ] [ -Ar ]    1   [  LY Ar ]   [ + ]
[ ∂Y/∂π ] = ---  [  Li       Ar    ] [  0  ] = ---  [ -Li Ar ] = [ + ]    (50)
             Δ                                  Δ
“Classical” Model (Exercise)
In the Classical case we follow the same procedures and the same type of reasoning, making only a single change: instead of Y we take m to be endogenous, so that m and i are the endogenous variables. Note that we start with the same system of structural equations:
Y = A(i - π, Y, G)
m = L(i, Y)    (51)
It follows that the total differential is unchanged:
dY = Ar (di - dπ) + AY dY + AG dG
dm = Li di + LY dY    (52)
Of course, all the partial derivatives from the structural form are unchanged: Ar < 0, 0 < AY < 1, AG > 0, Li < 0, and LY > 0.
But of course we have a different set of endogenous variables, so we have a different implied reduced form:
m = m(Y, π, G)
i = i(Y, π, G)    (53)
So when we write down the matrix equation, we use our new set of endogenous variables:

[  Ar   0 ] [ di ]   [ Ar dπ + (1 - AY) dY - AG dG ]
[ -Li   1 ] [ dm ] = [            LY dY            ]    (54)
Solving for the changes in the endogenous variables:

[ di ]    1   [ 1    0  ] [ Ar dπ + (1 - AY) dY - AG dG ]
[ dm ] = ---  [ Li   Ar ] [            LY dY            ]    (55)
          Ar
So for example

[ ∂i/∂π ]    1   [ 1    0  ] [ Ar ]    1   [  Ar   ]   [ + ]
[ ∂m/∂π ] = ---  [ Li   Ar ] [ 0  ] = ---  [ Li Ar ] = [ - ]    (56)
             Ar                         Ar
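The Classical experiment can also be checked numerically (an aside; the parameter values are assumptions satisfying Ar < 0 and Li < 0):

```python
import numpy as np

Ar, Li = -1.0, -1.0               # assumed values with Ar < 0, Li < 0

# Matrix equation (54) with dpi = 1 and dY = dG = 0.
Jc = np.array([[Ar, 0.0],
               [-Li, 1.0]])
b = np.array([Ar, 0.0])
di_dpi, dm_dpi = np.linalg.solve(Jc, b)
print(di_dpi, dm_dpi)   # i rises one-for-one with expected inflation; m = Li < 0 falls
```

The solution is ∂i/∂π = 1 and ∂m/∂π = Li, matching the signs in (56).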
10.4 Curvature

During our examination of the calculus of univariate functions, we found that for any differentiable function we can find a derivative function whose values represent the slopes of the primitive function. If this derivative function is in turn differentiable, we found we could produce another function -- the second-order derivative function -- whose values represent the curvature of the primitive function.
We would now like to explore the curvature of functions of more than one variable. We will emphasize the bivariate case, which allows relatively easy visualization.
Basic Bivariate Example
Clearly the slopes of the first-order partial derivatives are giving us information about the curvature of
the function. But now we need to consider a difficulty similar to a problem we ran into when thinking
about function limits in a multidimensional setting: there are many different ways we can slice a three-
dimensional surface. As a result, the information about the curvature provided only by slices parallel to
the axes is too limited.
In order to explore this, introduce a direction vector v along with a scalar λ. The scalar will control the
size of the step we take in the direction of the direction vector.
Clear[f, x, y, λ, vx, vy]
D[f[x + λ vx, y + λ vy], λ]
D[f[x + λ vx, y + λ vy], {λ, 2}] // Expand
vy f(0,1)[x + vx λ, y + vy λ] + vx f(1,0)[x + vx λ, y + vy λ]
vy^2 f(0,2)[x + vx λ, y + vy λ] + 2 vx vy f(1,1)[x + vx λ, y + vy λ] + vx^2 f(2,0)[x + vx λ, y + vy λ]
In this expression, we see not only the second-order partial derivatives with respect to a single variable, but also derivatives with respect to one variable and then the other. These are sometimes called mixed derivatives. The result that the value of a mixed derivative does not depend on the order of differentiation is sometimes called Young’s Theorem or Schwarz’s Theorem.
Theorem
If f has continuous second-order partial derivatives, then fx,y ⩵ fy,x.
The matrix of second-order partial derivatives is called the Hessian of the function. We can produce the
Hessian with the `D` command. (Note that satisfaction of Young’s theorem is assumed; the matrix is
symmetric.)
hess = D[f[x, y], {{x, y}, 2}]; hess // MatrixForm
f(2,0)[x, y] f(1,1)[x, y]
f(1,1)[x, y] f(0,2)[x, y]
Let us consider the associated quadratic form.
{vx, vy}.hess.{vx, vy} // Expand
vy^2 f(0,2)[x, y] + 2 vx vy f(1,1)[x, y] + vx^2 f(2,0)[x, y]
We can produce our earlier expression for the curvature along a line through the origin by setting
v = (1, m). In fact, any value for v other than (0, 0) represents the curvature resulting from a movement
along some line. It follows that in order to say something about the curvature of the function indepen-
dently of what direction we are moving, we need enough information about the Hessian of the function.
In particular, if the Hessian ensures that regardless of the (nonzero) value of v we will get a positive
result, we will say that the curvature is positive, and that the function is convex. This is the case when
the Hessian is a positive-definite matrix. However, if the Hessian ensures that regardless of the
(nonzero) value of v we will get a negative result, we will say that the curvature is negative, and that the
function is concave. This is the case when the Hessian is a negative definite matrix.
Recall that we can test for positive definiteness by first checking that all diagonal elements are positive and then testing that all leading principal minors are positive. Similarly, we can test for negative definiteness by first checking that all diagonal elements are negative and then testing that the leading principal minors alternate in sign.
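The leading-principal-minor test is easy to script in Python (an aside, not part of the WL workflow). Here we apply it to the Hessians of x^2 + y^2 and of the saddle example x^2 - y^2 discussed later in the chapter:

```python
import numpy as np

def leading_principal_minors(H):
    """Determinants of the upper-left 1x1, 2x2, ... submatrices of H."""
    return [float(np.linalg.det(H[:k, :k])) for k in range(1, H.shape[0] + 1)]

H_convex = np.array([[2.0, 0.0], [0.0, 2.0]])    # Hessian of x^2 + y^2
H_saddle = np.array([[2.0, 0.0], [0.0, -2.0]])   # Hessian of x^2 - y^2

print(leading_principal_minors(H_convex))   # all positive -> positive definite
print(leading_principal_minors(H_saddle))   # mixed signs -> indefinite
```

For H_convex the minors are all positive, so the matrix is positive definite; for H_saddle they have mixed signs that do not match the negative-definite alternating pattern either, so the matrix is indefinite.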
ClearAll[f, x, y, x0, y0, λ]
With[{f = Function[{x, y}, x^2 + y^2]},
 Manipulate[
  g1 = ParametricPlot3D[
    {x0 + λ vx, y0 + λ vy, f[x0 + λ vx, y0 + λ vy]} /. {x0 → 0, y0 → 0}, {λ, -1, 1},
    AxesLabel → {"x", "y", None},
    PlotRange → {{-1, 1}, {-1, 1}, {0, 2}}];
  g2 = Graphics3D[Arrow[{{0, 0, 0}, {vx, vy, 0}}]];
  Show[g1, g2],
  {{vx, 1}, -1, 1}, {{vy, 1}, -1, 1}]]
f = Function[{x, y}, x^2 + y^2];
hess = D[f[x, y], {{x, y}, 2}]
PositiveDefiniteMatrixQ[hess]
{{2, 0}, {0, 2}}
True
Weak Curvature
We may have a minimum where the function is only weakly convex. We can check for this case by ensuring the Hessian is positive semidefinite.
f = Function[{x, y}, x^2];
Plot3D[f[x, y], {x, -1, 1}, {y, -1, 1}]
Grad[f[x, y], {x, y}]
{2 x, 0}
The point (0, 0) is a stationary point, and it is an extremum. However, it is not unique -- not even locally.
In this case we find that the Hessian matrix is nonnegative definite, which is also called positive semi-
definite.
hess = D[f[x, y], {{x, y}, 2}]
{{2, 0}, {0, 0}}
PositiveSemidefiniteMatrixQ[hess]
True
Saddle Point
Just as in the univariate case, curvature may differ even in sign at different points of the function. How-
ever, we now have new possibilities. We will be particularly interested in the possibility of a saddle
point, where at a single point of the function the curvature is negative along one slice but positive along
another.
f = Function[{x, y}, x^2 - y^2];
Plot3D[f[x, y], {x, -1, 1}, {y, -1, 1},
AxesLabel → {"x", "y", None}]
Note that (0, 0) is a stationary point.
Solve[Grad[f[x, y], {x, y}] ⩵ 0, {x, y}]
{{x → 0, y → 0}}
But we are not dealing with an extremum. We can see this by examining the second-order partial derivatives along each axis.
Expression            Result
f                     Function[{x, y}, x^2 - y^2]
Derivative[2,0][f]    Function[{x, y}, 2]
Derivative[0,2][f]    Function[{x, y}, -2]
So along the x-axis, the function looks convex, suggesting a minimum. Yet along the y-axis, the func-
tion looks concave, suggesting a maximum. Correspondingly, the Hessian matrix of the function is
indefinite.
hess = D[f[x, y], {{x, y}, 2}]
{{2, 0}, {0, -2}}
ClearAll[f, x, y, x0, y0, λ]
With[{f = Function[{x, y}, x^2 - y^2]},
 Manipulate[
  g1 = ParametricPlot3D[
    {x0 + λ vx, y0 + λ vy, f[x0 + λ vx, y0 + λ vy]} /. {x0 → 0, y0 → 0}, {λ, -10, 10},
    AxesLabel → {"x", "y", None},
    PlotRange → {{-1, 1}, {-1, 1}, {-1, 1}}];
  g2 = Graphics3D[Arrow[{{0, 0, 0}, {vx, vy, 0}}]];
  g3 = Plot3D[f[x, y], {x, -1, 1}, {y, -1, 1}];
  Show[g1, g2, g3],
  {{vx, 1}, -1, 1}, {{vy, 1}, -1, 1}]]
10.5 Bivariate Optimization
In this section, we explore how to find extreme points of a multivariate function. Our approach will be to extend our calculus tools for optimization to the multivariate case. The discussion below will focus on the bivariate case, but it is easily generalized.
The key to finding a multivariate extremum is the realization that it must also be a univariate extremum in each of the variables individually. We already know how to search for a univariate extremum by searching for stationary points and then checking the curvature at those points.
The default display provides grid lines and is quite good, but it may not always meet our needs. As
always, WL offers a plethora of plotting options, and the best approach is not to try to master them all
but rather to browse the documentation on an ad hoc basis. Here we mention the `MeshFunctions`
option, because it allows us to nicely see the level sets of our function, and the `Mesh` option, which
can control the number of level sets to display. A level set is the set of combinations of x and y that produce a
constant function value. The mesh function is provided with three arguments, representing the x, y, and
z coordinates. To produce level sets, our choice of mesh function is just the third of these arguments.
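To make the definition concrete, a level set of f(x, y) = x² - y² at a positive value c is a hyperbola, which can be parametrized with hyperbolic functions since cosh²t - sinh²t = 1. A small Python check (the parametrization and tolerance are illustrative choices, not from the book) confirms that every point on the curve yields the same function value:

```python
import math

def f(x, y):
    return x**2 - y**2

# For c > 0, the level set f(x, y) = c is traced by
# x = sqrt(c)*cosh(t), y = sqrt(c)*sinh(t).
c = 0.5
points = [(math.sqrt(c) * math.cosh(t), math.sqrt(c) * math.sinh(t))
          for t in (-2.0, -1.0, 0.0, 1.0, 2.0)]

# All points share the constant function value c (up to rounding error).
values = [f(px, py) for px, py in points]
print(all(abs(v - c) < 1e-12 for v in values))  # True
```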
Plot3D[f[x, y], {x, -1, 1}, {y, -1, 1},
AxesLabel → {"x", "y", "z"}, MeshFunctions → {#3 &}, Mesh → 10]
ContourPlot[f[x, y], {x, -1, 1}, {y, -1, 1},
 ContourLabels → (Style[Text[#3, {#1, #2}], GrayLevel[.3], 6] &), ImageSize → 288]
Stationary Points
Definition
A stationary point of a differentiable function is a point where the function has a null gradient.
This result should seem natural, in light of our work on univariate functions and our discussion of the
function differential. To be at a stationary point, the directional derivative must vanish regardless of our
chosen direction. Since we found that the directional derivative is a linear combination of partial deriva-
tives, a null gradient ensures a null directional derivative.
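That linear-combination claim is easy to verify symbolically. The sketch below (in SymPy, as a cross-check outside the book's Wolfram code; the choice f = x² + y² is just an example) differentiates f along an arbitrary direction (vx, vy) and compares the result with the dot product of the gradient and the direction:

```python
import sympy as sp

x, y, vx, vy, lam = sp.symbols('x y vx vy lam')
f = x**2 + y**2  # illustrative choice; any smooth bivariate f works

grad = [sp.diff(f, x), sp.diff(f, y)]

# Directional derivative: differentiate f along (x + lam*vx, y + lam*vy)
# and evaluate at lam = 0 ...
along = f.subs({x: x + lam * vx, y: y + lam * vy}, simultaneous=True)
directional = sp.diff(along, lam).subs(lam, 0)

# ... then compare with the linear combination of partials, grad . (vx, vy).
linear_combo = grad[0] * vx + grad[1] * vy
print(sp.simplify(directional - linear_combo))  # 0
```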
As in the univariate case, we will begin our search for extrema by searching for stationary points. We
therefore search for an extremum by looking at the stationary points: find (xs, ys) such that the two first-
order partial derivatives equal zero. The slopes of the grid lines in our first figure are the partial
derivatives. We can compute these one at a time, using the `D` command as usual. Recall that we can
also use the `Grad` command to compute them all at one go as a list of values.
Expression                        Result
f = Function[{x,y}, x^2 + y^2]    Function[{x, y}, x^2 + y^2]
D[f[x,y], x]                      2 x
D[f[x,y], y]                      2 y
Grad[f[x,y], {x,y}]               {2 x, 2 y}
Once we have the first-order partial derivatives, we can search for a stationary point, where they all
simultaneously equal 0. The intuition is very similar to the univariate case: to be at the extremum of a
differentiable function is necessarily to be where the slope is 0. For example, if we are at a maximum,
there must be no direction of possible increase. This certainly means each partial derivative must be
zero. However, we have also seen that the directional derivative can be written as a weighted sum of
partial derivatives. As a result, if we can find a point where all first-order partial derivatives are zero, we
have found a stationary point.
Solve[Grad[f[x, y], {x, y}] ⩵ 0, {x, y}, Reals]
{{x → 0, y → 0}}
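The same first-order conditions can be solved outside Mathematica; a SymPy sketch (an outside-the-book cross-check) reproduces the unique stationary point of x² + y²:

```python
import sympy as sp

x, y = sp.symbols('x y', real=True)
f = x**2 + y**2

# Solve both first-order conditions simultaneously: 2x = 0 and 2y = 0.
stationary = sp.solve([sp.diff(f, x), sp.diff(f, y)], [x, y], dict=True)
print(stationary)  # [{x: 0, y: 0}]
```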
Small Changes Can Matter
Suppose we are looking for an extremum of f [x, y]. If we consider any strictly increasing transformation
of f , it will have the same extrema. In this section, we show one way that observation can be useful.
Consider the following real-valued function of real variables.
ClearAll[f]
f = Function[{x, y}, Sqrt[x^2 + y^2]]
Plot3D[f[x, y], {x, -1, 1}, {y, -1, 1}, Mesh → 5, MeshFunctions → {#3 &}]
Function[{x, y}, Sqrt[x^2 + y^2]]
It seems pretty clear from the plot that (0, 0) is a minimizer. In an attempt to show that, let us proceed
as usual: searching for an extremum by trying to find (x1, x2) such that the two first-order partial
derivatives equal zero. Applying the chain rule, we see that the gradient is
gradf = Grad[f[x, y], {x, y}]
{x/Sqrt[x^2 + y^2], y/Sqrt[x^2 + y^2]}
So let us proceed apace and solve the necessary first-order conditions for the minimum of a differentiable function. But this produces an empty set of solutions.
Solve[gradf ⩵ 0, {x, y}, Reals]
{}
The problem arises because the gradient is undefined at (0, 0). In order to visualize the problem, let us
look at a slice of the surface at y ⩵ 0.
Plot[f[x, 0], {x, -1, 1}, ImageSize → 144]
We see that although the function is continuous, it is not differentiable at 0. So the function minimum
does occur at a critical point, but that critical point is not a stationary point, because the function is not
differentiable there.
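A small numerical sketch makes the failure concrete (plain Python, as an outside-the-book illustration; the helper `grad_x` is hypothetical). The gradient component x / √(x² + y²) has a zero denominator at the origin, and the slice f(x, 0) = |x| has one-sided slopes of +1 and -1 there: a kink rather than a flat tangent.

```python
import math

def f(x, y):
    return math.sqrt(x**2 + y**2)

def grad_x(x, y):
    """First gradient component x / sqrt(x^2 + y^2); None where undefined."""
    try:
        return x / math.sqrt(x**2 + y**2)
    except ZeroDivisionError:
        return None

# The gradient does not exist at the origin, so the first-order
# conditions can never be satisfied there.
print(grad_x(0, 0))  # None

# One-sided difference quotients along the slice y = 0:
h = 1e-8
slope_right = (f(h, 0) - f(0, 0)) / h
slope_left = (f(-h, 0) - f(0, 0)) / (-h)
print(round(slope_right), round(slope_left))  # 1 -1
```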
The domain of f is ℝ² and the range of f is ℝ₊. Define g to be the result of squaring the value of f.
Since the range of values is nonnegative, squaring is a strictly increasing transformation of the range.
g = Function[{x, y}, Power[f[x, y], 2]];
g[x, y]
x^2 + y^2
Now we are in familiar territory. We can take a standard calculus-based approach to optimization.
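Carrying the transformed problem to completion in SymPy (again an outside-the-book cross-check): the squared function g(x, y) = x² + y² is differentiable everywhere, the first-order conditions now pick out the origin, and the Hessian there is positive definite, so (0, 0) minimizes g and therefore minimizes f as well.

```python
import sympy as sp

x, y = sp.symbols('x y', real=True)
g = x**2 + y**2  # the square of f(x, y) = sqrt(x^2 + y^2)

# First-order conditions now succeed where they failed for f itself.
stationary = sp.solve([sp.diff(g, x), sp.diff(g, y)], [x, y], dict=True)
print(stationary)  # [{x: 0, y: 0}]

# Positive definite Hessian: (0, 0) is a minimum of g, hence of f.
hess = sp.hessian(g, (x, y))
print(hess.is_positive_definite)  # True
```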