Michael Sampson
An Introduction to Mathematical Economics
Part 2

Loglinear Publishing
Copyright © 2001 Michael Sampson. Loglinear Publications: http://www.loglinear.com Email: [email protected].
Terms of Use

This document is distributed "AS IS" and with no warranties of any kind, whether express or implied. Until November 1, 2001 you are hereby given permission to print one (1) and only one hardcopy version free of charge from the electronic version of this document (i.e., the pdf file) provided that:

1. The printed version is for your personal use only.
2. You make no further copies from the hardcopy version. In particular no photocopies, electronic copies or any other form of reproduction.
3. You agree not to ever sell the hardcopy version to anyone else.
4. You agree that if you ever give the hardcopy version to anyone else that this page, in particular the Copyright Notice and the Terms of Use, are included, and the person to whom the copy is given accepts these Terms of Use.

Until November 1, 2001 you are hereby given permission to make (and if you wish sell) an unlimited number of copies on paper only from the electronic version (i.e., the pdf file) of this document, or from a printed copy of the electronic version of this document, provided that:

1. You agree to pay a royalty of either $3.00 Canadian or $2.00 US per copy to the author within 60 days of making the copies, or to destroy any copies after 60 days for which you have not paid the royalty of $3.00 Canadian or $2.00 US per copy. Payment can be made either by cheque or money order and should be sent to the author at:

   Professor Michael Sampson
   Department of Economics
   Concordia University
   1455 de Maisonneuve Blvd W.
   Montreal, Quebec
   Canada, H3G 1M8

2. If you intend to make five or more copies, or if you can reasonably expect that five or more copies of the text will be made, then you agree to notify the author before making any copies by Email at: [email protected] or by fax at 514-848-4536.
3. You agree to include on each paper copy of this document, and at the same page number as this page on the electronic version of the document: 1) the above Copyright Notice, 2) the URL: http://www.loglinear.com and 3) the Email address [email protected]. You may then if you wish remove this Terms of Use from the paper copies you make.
Contents

0.1 Preface

1 Total Differentials
  1.1 Endogenous and Exogenous Variables
  1.2 Structural and Reduced Forms of a Model
    1.2.1 The Structural Form of a Model
    1.2.2 The Reduced Form of a Model
    1.2.3 Implicit and Explicit Functions
    1.2.4 Calculating Derivatives
  1.3 Total Differentials for Beginners
    1.3.1 A Recipe for Calculating dy/dx
  1.4 Total Differentials for Intermediates
    1.4.1 A Recipe for Calculating ∂y/∂x_j
    1.4.2 Example 1: A Simple Linear Function
    1.4.3 Example 2
    1.4.4 Example 3: The Bivariate Normal Density
    1.4.5 Example 4: Labour Demand
  1.5 Total Differentials for Experts
    1.5.1 Systems of Implicit Functions
    1.5.2 A Recipe for Calculating ∂y_i/∂x_j
    1.5.3 Example 1
    1.5.4 Example 2: Supply and Demand I
    1.5.5 Example 3: Supply and Demand II
    1.5.6 Example 4: The IS-LM Model
  1.6 Total Differentials and Optimization
    1.6.1 A General Discussion
    1.6.2 A Quick Derivation of the Main Principle
    1.6.3 Profit Maximization
    1.6.4 Constrained Optimization
    1.6.5 Utility Maximization
    1.6.6 Cost Minimization

2 The Envelope Theorem
  2.1 Unconstrained Optimization
    2.1.1 Profit Maximization in the Short-Run I
    2.1.2 Envelope Theorem I
    2.1.3 Profit Maximization in the Short-Run II
    2.1.4 Profit Maximization in the Long-Run
    2.1.5 Envelope Theorem II
    2.1.6 A Recipe for Unconstrained Optimization
    2.1.7 The Profit Function
    2.1.8 Homogeneous Functions
    2.1.9 Concavity and Convexity
    2.1.10 Concavity, Convexity and Homogeneity of f*(x)
    2.1.11 Properties of the Profit Function
    2.1.12 Some Symmetry Results
  2.2 Constrained Optimization
    2.2.1 Envelope Theorem III: Constrained Optimization
    2.2.2 A Recipe for Constrained Optimization
    2.2.3 Concavity, Convexity and Homogeneity of f*(x)
    2.2.4 Cost Minimization
    2.2.5 Properties of the Cost Function
    2.2.6 Utility Maximization
    2.2.7 Expenditure Minimization

3 Integration and Random Variables
  3.1 Introduction
  3.2 Discrete Summation
    3.2.1 Definitions
    3.2.2 Example 1: Σ_{i=1}^n aX_i
    3.2.3 Example 2: Σ_{i=1}^n (aX_i + bY_i)
    3.2.4 Example 3: Σ_{i=1}^n (aX_i + b)
    3.2.5 Example 4: Σ_{i=1}^n (aX_i + b)²
    3.2.6 Example 5: Σ_{i=1}^n (X_i − X̄) = 0
    3.2.7 Manipulating the Sigma Notation
    3.2.8 Example 1: Σ_{i=1}^n (aX_i + bY_i)²
    3.2.9 Example 2: Σ_{i=1}^n (aX_i + b)²
    3.2.10 Example 3: Σ_{i=1}^n (X_i − X̄)²
  3.3 Integration
    3.3.1 The Fundamental Theorem of Integral Calculus
    3.3.2 Example 1: ∫_1^10 x² dx
    3.3.3 Example 2: ∫_0^5 e^(−x) dx
    3.3.4 Integration by Parts
    3.3.5 Example 1: ∫_0^4 x e^(−x) dx
    3.3.6 Example 2: Γ(n)
    3.3.7 Integration as Summation
    3.3.8 ∫_a^b dx as a Linear Operator
    3.3.9 Example 1: ∫_a^b (c f(x) + d g(x))² dx
    3.3.10 Example 2: ∫_a^b (f(x) + d)² dx
  3.4 Random Variables
    3.4.1 Introduction
    3.4.2 Discrete Random Variables
    3.4.3 The Bernoulli Distribution
    3.4.4 The Binomial Distribution
    3.4.5 The Poisson Distribution
    3.4.6 Continuous Random Variables
    3.4.7 The Uniform Distribution
    3.4.8 The Standard Normal Distribution
    3.4.9 The Normal Distribution
    3.4.10 The Chi-Squared Distribution
    3.4.11 The Student's t Distribution
    3.4.12 The F Distribution
  3.5 Expected Values
    3.5.1 E[X] for Discrete Random Variables
    3.5.2 Example 1: Die Roll
    3.5.3 Example 2: Binomial Distribution
    3.5.4 E[X] for Continuous Random Variables
    3.5.5 Example 1: The Uniform Distribution
    3.5.6 The Standard Normal Distribution
    3.5.7 E[ ] as a Linear Operator
    3.5.8 The Rules for E[ ]
    3.5.9 Example 1: E[(3X + 4Y)²]
    3.5.10 Example 2: E[(3X + 4)²]
    3.5.11 Example 3: X ~ N[μ, σ²]
  3.6 Variances
    3.6.1 Definition
    3.6.2 Some Results for Variances
    3.6.3 Example 1: Die Roll
    3.6.4 Example 2: Uniform Distribution
    3.6.5 Example 3: The Normal Distribution
  3.7 Covariances
    3.7.1 Definitions
    3.7.2 Independence
    3.7.3 Results for Covariances
  3.8 Linear Combinations of Random Variables
    3.8.1 Linear Combinations of Two Random Variables
    3.8.2 The Capital Asset Pricing Model (CAPM)
    3.8.3 An Example
    3.8.4 Sums of Uncorrelated Random Variables
    3.8.5 An Example
    3.8.6 The Sample Mean
    3.8.7 The Mean and Variance of the Binomial Distribution
    3.8.8 The Linear Regression Model for Beginners
    3.8.9 Linear Combinations of Correlated Random Variables
    3.8.10 The Linear Regression Model for Experts

4 Dynamics
  4.1 Introduction
  4.2 Trigonometry
    4.2.1 Degrees and Radians
    4.2.2 The Functions cos(x) and sin(x)
    4.2.3 The Functions tan(x) and arctan(x)
  4.3 Complex Numbers
    4.3.1 Introduction
    4.3.2 Arithmetic with Complex Numbers
    4.3.3 Complex Conjugates
    4.3.4 The Absolute Value of a Complex Number
    4.3.5 The Argument of a Complex Number
    4.3.6 The Polar Form of a Complex Number
    4.3.7 De Moivre's Theorem
    4.3.8 Euler's Formula
    4.3.9 When does (a + bi)^n → 0 as n → ∞?
  4.4 First-Order Linear Difference Equations
    4.4.1 Definition
    4.4.2 Example
    4.4.3 Stability
    4.4.4 A Macroeconomic Example
    4.4.5 The Cob-Web Model
  4.5 Second-Order Linear Difference Equations
    4.5.1 The Behavior of Y_t with Complex Roots
    4.5.2 Example 1: Real Roots and Stable
    4.5.3 Example 2: Real Roots and Unstable
    4.5.4 Example 3: Complex Roots and Stable
    4.5.5 Example 4: Complex Roots and Unstable
    4.5.6 A Macroeconomic Example
  4.6 First-Order Differential Equations
    4.6.1 Definitions
    4.6.2 The Case where b = 0
    4.6.3 The Case where b ≠ 0
    4.6.4 Example 1
    4.6.5 Example 2
  4.7 Second-Order Differential Equations
    4.7.1 Solution
    4.7.2 Stability with Real Roots
    4.7.3 Stability with Complex Roots
    4.7.4 Example 1: Stable with Real Roots
    4.7.5 Example 2: Unstable with Real Roots
    4.7.6 Example 3: Stable with Complex Roots
    4.7.7 Example 4: Unstable with Complex Roots
0.1 Preface
Thanks go to my Economics 326 students from the Winter 1999 and Winter 2000 semesters for working so hard on the first and second versions of these lecture notes. I would in particular like to thank Tamarha Louis-Charles, Dana Al-Aghbar, Christopher Martin and Samnang Om for pointing out many typos and errors.
Chapter 1

Total Differentials
1.1 Endogenous and Exogenous Variables

In economics we mostly work with mathematical models. By their very nature these models contain variables which can be divided into two classes: 1) endogenous variables and 2) exogenous variables.

Endogenous variables (from Greek, endo: within and genous: born, hence born or generated from within the model) are those variables which the model attempts to explain.

In a supply and demand model, for example, both the equilibrium price P and quantity Q are determined by the model, that is, at the intersection of the supply and demand curves. They are therefore endogenous variables.

Exogenous variables (from Greek, exo: from outside, hence generated from outside the model) are those variables whose determination the model says nothing about, but which nevertheless influence the endogenous variables.

An example of an exogenous variable in a supply and demand model would be the weather. Bad weather shifts the supply curve of, say, oranges to the left, causing P to go up and Q to go down. Weather therefore influences P and Q, but P and Q do not (presumably) influence the weather. A supply and demand model says nothing about how the weather is determined.

Often policy variables such as taxes, government expenditure or the money supply are exogenous variables.

Mathematically you can think of the exogenous variables as the x's or the independent variables, while the endogenous variables are the dependent variables or the y's. Typically in economics we do not use x's and y's in our notation, and so you will often need to think carefully about what the exogenous variables are and what the endogenous variables are in any particular model.
1.2 Structural and Reduced Forms of a Model

1.2.1 The Structural Form of a Model

When we first write down an economic model, that is, before doing any complicated derivations, it is often the case that the endogenous variables y = [y_i] (say an n × 1 vector) and exogenous variables x = [x_i] (say an m × 1 vector) are all jumbled together and appear on both sides of the = sign as:

M(y, x) = N(y, x).

This form of the model is often referred to as the structural form of the model, since it is in this form that the basic assumptions (or structure) of the model are most apparent.
1.2.2 The Reduced Form of a Model

In economics we are often interested in how changes in the x's, the exogenous variables, affect the y's, the endogenous variables. For example, we might ask how increasing the tax rate t (an exogenous variable) affects the price P or quantity Q (endogenous variables).

It is often hard to perform such experiments with the structural form of the model. The problem is that the y's and x's are all jumbled together. It would instead be more useful to have the reduced form of the model, which is:

y = h(x)

so that y is by itself on the left-hand side and some function of x is on the right-hand side. Given knowledge of the form of h(x) we can consider any number of changes in the x's and then see how the y's change.
An Example

Suppose y is the number of workers that a firm hires, x1 is the price of the good that the firm sells, and x2 is the nominal wage. From a model we find that the marginal revenue of hiring an extra worker is:

M(y, x1, x2) = x1 y^(−1/2)

while the marginal cost is:

N(y, x1, x2) = x2 y^(1/2).

Equating marginal revenue M and marginal cost N we arrive at a structural model: M(y, x1, x2) = N(y, x1, x2), or:

x1 y^(−1/2) = x2 y^(1/2).
To get the reduced form we need to get y alone on the left-hand side. In this case this is not hard to do. With a little algebra we find that:

y = h(x1, x2) = x1/x2.

Suppose x1 = x2 = 1. We then know that y = 1. Consider three experiments: 1) doubling x1 while keeping x2 fixed, 2) doubling x2 while keeping x1 fixed, and 3) doubling both x1 and x2. We see from the reduced form that the first experiment results in doubling y to 2, the second experiment results in halving y to 1/2, while in the third experiment nothing happens to y.
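These three experiments are easy to verify with a few lines of code. The following is a minimal Python sketch, where the function h simply implements the reduced form above:

```python
# Numerical check of the three experiments on the reduced form y = h(x1, x2) = x1/x2.
def h(x1, x2):
    return x1 / x2

y0 = h(1.0, 1.0)             # baseline: x1 = x2 = 1 gives y = 1
y_double_x1 = h(2.0, 1.0)    # experiment 1: doubling x1 doubles y
y_double_x2 = h(1.0, 2.0)    # experiment 2: doubling x2 halves y
y_double_both = h(2.0, 2.0)  # experiment 3: doubling both leaves y unchanged

print(y0, y_double_x1, y_double_x2, y_double_both)  # 1.0 2.0 0.5 1.0
```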
1.2.3 Implicit and Explicit Functions

Closely related to the notion of structural and reduced forms are implicit and explicit functions.

Definition 1 An implicit function is any function written as:

g(y, x) = 0

so that both the y's and x's appear jumbled together on the left-hand side and zero is on the right-hand side.

Given the structural form of a model it is very easy to rewrite it as an implicit function. Thus from the structural form M(y, x) = N(y, x) we can find the implicit function g(y, x) as:

g(y, x) = M(y, x) − N(y, x) = 0.

Using the example of M(y, x1, x2) = x1 y^(−1/2) and N(y, x1, x2) = x2 y^(1/2) we have:

g(y, x1, x2) = x1 y^(−1/2) − x2 y^(1/2) = 0.

Definition 2 An explicit function unscrambles the y's and x's so that y is alone on the left-hand side as:

y = h(x).

Thus an explicit function and the reduced form are essentially the same thing. Given an explicit function or reduced form it is always possible to write it as an implicit function as:

g(y, x) = y − h(x) = 0.

However, it is often impossible to find an explicit function or the reduced form from an implicit function or structural model.
For example, given the implicit function:

g(y, x) = y ln(y) − ln(x) = 0, x > 1

there is no way of getting y alone on the left-hand side. The best we could do is:

y ln(y) = ln(x)

or

y^y = x.

Nevertheless, g(y, x) = y ln(y) − ln(x) = 0 is a perfectly respectable function. We can for example show that it is increasing and concave. We could even plot it, as shown below.

[Figure: plot of y against x for g(y, x) = y ln(y) − ln(x) = 0 with x > 1, showing y rising from 1 to about 2 as x runs from 1 to 5]
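Although no explicit formula for y = h(x) exists here, h(x) can still be evaluated numerically at any given x. The following is a minimal Python sketch using bisection; the bracketing interval [1, 10] is an assumption suitable for moderate x > 1:

```python
import math

def solve_y(x, lo=1.0, hi=10.0, tol=1e-12):
    """Solve y*ln(y) = ln(x) for y by bisection (the solution has y > 1 when x > 1)."""
    f = lambda y: y * math.log(y) - math.log(x)
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if f(lo) * f(mid) <= 0:   # root lies in [lo, mid]
            hi = mid
        else:                     # root lies in [mid, hi]
            lo = mid
        if hi - lo < tol:
            break
    return 0.5 * (lo + hi)

y = solve_y(4.0)
print(y)  # y ≈ 2.0, consistent with y**y = x since 2**2 = 4
```

At x = 4 the solver returns y = 2, which is consistent with y^y = x since 2² = 4.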
1.2.4 Calculating Derivatives

In economics we are often interested in calculating the partial derivative:

∂y_i/∂x_j

which tells us the relationship between the jth exogenous variable and the ith endogenous variable when all other exogenous variables are held constant. This corresponds to the experiment of keeping all exogenous variables except x_j constant, changing x_j by a small amount, and seeing how much y_i changes.

Definition 3 If ∂y_i/∂x_j > 0 then a positive relationship exists between x_j and y_i; that is, an increase (decrease) in x_j will result in an increase (decrease) in y_i.

Definition 4 If ∂y_i/∂x_j < 0 then a negative relationship exists between x_j and y_i; that is, an increase (decrease) in x_j will result in a decrease (increase) in y_i.
In the example above, where we found that:

y = h(x1, x2) = x1/x2

we have:

∂y/∂x1 = 1/x2 > 0, ∂y/∂x2 = −x1/(x2)² < 0.

Thus there is a positive relationship between x1 and y and a negative relationship between x2 and y.

Since it is often not possible to find the reduced form of the model, that is, we often cannot unscramble the model enough to get y alone on the left-hand side with h(x) on the right-hand side, we cannot always calculate ∂y_i/∂x_j in the usual manner.

Fortunately, no matter how jumbled up the structural form of the model or the implicit function is, it is still possible to calculate ∂y_i/∂x_j indirectly using total differentials. In what follows we will learn this technique, starting with the simplest cases and building up to the more general.

We begin with beginners' total differentials, where there is one exogenous variable x and one endogenous variable y. We then move on to intermediate total differentials, where there are many exogenous variables but still only one endogenous variable. Finally we move on to advanced total differentials, where there are both many endogenous and many exogenous variables.
1.3 Total Differentials for Beginners

Suppose we are given an implicit function:

g(y, x) = 0

where both y and x are scalars, so that there is one endogenous variable y and one exogenous variable x. Under one condition this defines an explicit function:

y = h(x).

The reason we know this is the Implicit Function Theorem, which states that:

Theorem 5 Implicit Function Theorem I: The explicit function h(x) is guaranteed to exist as long as:

∂g(y, x)/∂y ≠ 0.

The problem then is to calculate:

dy/dx = h′(x).

We do this by taking the total differential of g(y, x), which is:
Definition 6 The total differential is defined as:

(∂g(y, x)/∂y) dy + (∂g(y, x)/∂x) dx = 0.

The total differential can be used to calculate dy/dx using the following recipe:
1.3.1 A Recipe for Calculating dy/dx

Step 1. If it is not already clear to you, identify the endogenous variable y and the exogenous variable x in your model.

Step 2. If the structural model is not written as an implicit function of the form g(y, x) = 0, rewrite it so that it is in this form.

Step 3. Take the partial derivative of g(y, x) with respect to y and call this a, so that a = ∂g(y, x)/∂y. Verify that a ≠ 0, which ensures that the explicit function exists by the Implicit Function Theorem. Multiply a by dy to get:

a × dy.

Step 4. Take the partial derivative of g(y, x) with respect to x and call this b, so that b = ∂g(y, x)/∂x. Multiply b by dx to get:

b × dx.

Step 5. Add the results of steps 3 and 4 together and set them equal to zero to get the total differential:

a × dy + b × dx = 0.

Step 6. Solve for the ratio dy/dx to get:

dy/dx = −b/a

or alternatively, using the definitions of a and b:

dy/dx = −(∂g(y, x)/∂x) / (∂g(y, x)/∂y).

Note that if a = 0 this ratio would be undefined.
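The recipe can also be mimicked numerically: approximate a = ∂g/∂y and b = ∂g/∂x by central finite differences and form −b/a. The Python sketch below is an illustration; the helper dydx and the test function g(y, x) = y − x² are assumptions introduced here, not taken from the text:

```python
# Sketch of the recipe dy/dx = -(∂g/∂x)/(∂g/∂y), with the partial derivatives
# approximated by central finite differences.
def dydx(g, y, x, eps=1e-6):
    gy = (g(y + eps, x) - g(y - eps, x)) / (2 * eps)  # a = ∂g/∂y
    gx = (g(y, x + eps) - g(y, x - eps)) / (2 * eps)  # b = ∂g/∂x
    if gy == 0:
        raise ValueError("∂g/∂y = 0: explicit function not guaranteed to exist")
    return -gx / gy

# Check on g(y, x) = y - x**2, whose explicit form y = x**2 has dy/dx = 2x.
g = lambda y, x: y - x ** 2
print(dydx(g, y=9.0, x=3.0))  # ≈ 6.0
```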
Example 1: A Simple Linear Function

Let us begin with an easy structural model written as:

3Q − 4R = 9

where Q is, say, the quantity of apples produced in an orchard and R is the amount of rainfall.
Applying Step 1 of the recipe, Q is the endogenous variable and R is the exogenous variable. This follows from the common-sense notion that while rainfall might affect apple production, apple production is unlikely to affect rainfall.

Applying Step 2, we can write this as an implicit function by putting the 9 on the other side of the equal sign to get:

g(Q, R) = 3Q − 4R − 9 = 0.

From Steps 3, 4 and 5, the total differential for g(Q, R) is:

3dQ − 4dR = 0

since:

a = ∂g(Q, R)/∂Q = 3, b = ∂g(Q, R)/∂R = −4.

Note that a = 3 ≠ 0, and so the explicit function exists by the Implicit Function Theorem.

Solving for dQ/dR from the total differential it then follows that:

dQ/dR = 4/3.

This is the same answer we would get from working with the explicit form of the function or the reduced form. We obtain this by solving g(Q, R) = 3Q − 4R − 9 = 0 for Q to obtain:

Q = (4/3)R + 3

from which it easily follows that dQ/dR = 4/3.
Example 2

Consider the implicit function:

g(y, x) = ln(y) + (1/2)x² + (1/2) ln(2π) = 0.

This can be written explicitly, or as a reduced form, as:

y = h(x) = (1/√(2π)) e^(−x²/2)

which is the standard normal density. From h(x) we can directly calculate h′(x) as:

dy/dx = −x (1/√(2π)) e^(−x²/2) = −xy.
Let us now attempt to obtain the derivative via the total differential. Using Steps 3, 4 and 5 we obtain a = ∂g(y, x)/∂y = 1/y and b = ∂g(y, x)/∂x = x, so that the total differential is:

(1/y) dy + x dx = 0.

Solving for dy/dx we then get:

dy/dx = −xy.
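The answer dy/dx = −xy can be cross-checked numerically by comparing it against a finite difference of the explicit h(x). A short Python sketch (the evaluation point x = 0.7 is arbitrary):

```python
import math

# Check dy/dx = -x*y for the standard normal density by comparing the
# total-differential answer to a central finite difference of h(x).
def h(x):
    return math.exp(-0.5 * x * x) / math.sqrt(2 * math.pi)

x = 0.7
eps = 1e-6
numeric = (h(x + eps) - h(x - eps)) / (2 * eps)  # finite-difference dy/dx
formula = -x * h(x)                              # total-differential answer -x*y
print(numeric, formula)  # the two agree to many decimal places
```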
Example 3

Suppose we were given a structural equation of the form:

y^y = x.

Here, from the notation, y is the endogenous variable and x is the exogenous variable. From Step 2, by taking the ln( ) of both sides we can write this as an implicit function:

g(y, x) = y ln(y) − ln(x) = 0

which we have already seen cannot be written as an explicit function y = h(x).¹

From Steps 3, 4 and 5 the total differential of g(y, x) is:

(1 + ln(y)) dy − (1/x) dx = 0

since:

a = ∂g(y, x)/∂y = 1 + ln(y), b = ∂g(y, x)/∂x = −1/x.

Solving for dy/dx:

dy/dx = 1 / (x(1 + ln(y))).

We can use this result to show that 0 < dy/dx < 1 for x > 1, and hence that a positive relationship exists between x and y. This follows since:

x > 1 =⇒ y ln(y) = ln(x) > 0

implies that y > 1, since if y < 1 then ln(y) < 0. Therefore from x > 1 and (1 + ln(y)) > 1 we have:

x > 1 =⇒ 0 < dy/dx = 1 / (x(1 + ln(y))) < 1.

¹Note that we could have just as correctly used g(y, x) = y^y − x = 0; that is, there are many different correct ways of writing down an implicit function.
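Since y cannot be isolated, checking dy/dx = 1/(x(1 + ln y)) numerically requires solving for y first. A Python sketch, where bisection over the assumed bracket [1, 10] is used to invert the implicit function:

```python
import math

# Check dy/dx = 1/(x*(1 + ln y)) for y*ln(y) = ln(x) against a finite
# difference of the numerically inverted function.
def solve_y(x):
    lo, hi = 1.0, 10.0          # for x > 1 the solution satisfies y > 1
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        # f(y) = y*ln(y) - ln(x) is increasing for y > 1
        if mid * math.log(mid) - math.log(x) >= 0:
            hi = mid
        else:
            lo = mid
    return 0.5 * (lo + hi)

x, eps = 4.0, 1e-5
y = solve_y(x)
numeric = (solve_y(x + eps) - solve_y(x - eps)) / (2 * eps)
formula = 1 / (x * (1 + math.log(y)))
print(numeric, formula)  # both ≈ 0.148, and 0 < dy/dx < 1 as claimed
```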
As an exercise, differentiate this one more time with respect to x and show that:

d²y/dx² = −(1 / (x(1 + ln(y)))) × (1/x + (1 / (y(1 + ln(y)))) × dy/dx) < 0

so that the function is concave, as the plot above illustrates.
Example 4: The Demand for Labour in the Short-Run

Suppose a firm faces a short-run production function Q = f(L), where f′(L) > 0 and f″(L) < 0, and where Q is output and L is the amount of labour. If w = W/P is the real wage then the firm will choose the L* which satisfies:

f′(L*) = w.

Here w is the exogenous variable and L* is the endogenous variable. We can write this structural form of the model as an implicit function as:

g(L*, w) = f′(L*) − w = 0.

This defines the labour demand function, or the explicit function, or the reduced form as:

L* = L*(w).

The total differential is:

f″(L*) dL* − dw = 0

since:

a = ∂g(L*, w)/∂L* = f″(L*), b = ∂g(L*, w)/∂w = −1.

Note that a < 0, and so the explicit function exists. Note also that the coefficient on dL* is negative, since the diminishing marginal product of labour implies that f″(L*) < 0.

Solving for the ratio dL*/dw we find that:

dL*/dw = 1/f″(L*) < 0

and so the labour demand curve is downward sloping.
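The result dL*/dw = 1/f″(L*) can be illustrated with a concrete production function. In the Python sketch below, f(L) = √L is an assumed functional form chosen only for illustration; it gives f′(L) = 1/(2√L), hence L*(w) = 1/(4w²), and f″(L) = −1/(4L^(3/2)):

```python
# Check dL*/dw = 1/f''(L*) for the illustrative production function f(L) = sqrt(L).
def L_star(w):
    return 1 / (4 * w ** 2)      # from f'(L*) = w

def f2(L):
    return -1 / (4 * L ** 1.5)   # f''(L) < 0: diminishing marginal product

w, eps = 0.5, 1e-6
numeric = (L_star(w + eps) - L_star(w - eps)) / (2 * eps)  # slope of labour demand
formula = 1 / f2(L_star(w))                                # total-differential answer
print(numeric, formula)  # both ≈ -4.0: labour demand slopes down
```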
1.4 Total Differentials for Intermediates

Consider now an implicit function where there are m exogenous variables x1, x2, ..., xm (which we may write as an m × 1 vector x) and one endogenous variable y. The implicit function is now written as:

g(y, x1, x2, ..., xm) = 0

which determines the explicit function:

y = h(x1, x2, ..., xm)

as long as the Implicit Function Theorem is satisfied, which states that:

Theorem 7 Implicit Function Theorem II: The explicit function y = h(x1, x2, ..., xm) is guaranteed to exist as long as:

∂g(y, x1, x2, ..., xm)/∂y ≠ 0.

We wish to calculate ∂y/∂x_j = ∂h(x1, x2, ..., xm)/∂x_j in order to determine how a change in the jth exogenous variable x_j will affect the endogenous variable y. To do this from the implicit function g(y, x) = 0 we calculate the total differential, defined by:

Definition 8 The total differential of g(y, x) = 0 is given by:

a × dy + b1 dx1 + b2 dx2 + ··· + bm dxm = 0

where a, b1, b2, ..., bm are defined below in the following recipe:
1.4.1 A Recipe for Calculating ∂y/∂x_j

We proceed as follows:

Step 1. If it is not already clear to you, identify the endogenous variable y and the exogenous variables x1, x2, ..., xm in your model.

Step 2. Write the structural form of the model as an implicit function of the form:

g(y, x1, x2, ..., xm) = 0.

This generally means just collecting all terms on the left-hand side of the structural model and setting them equal to zero.

Step 3. Take the partial derivative of g(y, x) with respect to y and call this a. Make sure that a ≠ 0, so that the explicit function exists by the Implicit Function Theorem. Multiply a by dy to get:

a × dy.

Step 4. Take the partial derivative of g(y, x) with respect to x1 and call this b1. Multiply b1 by dx1 to get b1 dx1. Repeat this for x2, x3, ..., xm and add the results together to get:

b1 dx1 + b2 dx2 + ··· + bm dxm.

Step 5. Add the results of steps 3 and 4 together and set them equal to zero to get the total differential:

a × dy + b1 dx1 + b2 dx2 + ··· + bm dxm = 0.

Step 6. To get the partial derivative ∂y/∂x_j, set all dx's equal to zero except dx_j, and then replace the remaining d's in the total differential with ∂'s. Thus we set dy = ∂y, dx_j = ∂x_j and dx_i = 0 for i ≠ j in the total differential to get:

a ∂y + b_j ∂x_j = 0.

Now solve for the ratio ∂y/∂x_j:

∂y/∂x_j = −b_j/a = −(∂g(y, x)/∂x_j) / (∂g(y, x)/∂y).
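This multivariate recipe also translates directly into code: approximate a and each b_j by central finite differences and form −b_j/a. The helper implicit_partial and the test function g(y, x1, x2) = y − x1·x2 below are illustrative assumptions, not from the text:

```python
# Sketch of ∂y/∂x_j = -b_j/a for an implicit function g(y, x1, ..., xm) = 0,
# with a and b_j approximated by central finite differences.
def implicit_partial(g, y, x, j, eps=1e-6):
    a = (g(y + eps, *x) - g(y - eps, *x)) / (2 * eps)   # a = ∂g/∂y
    xp, xm = list(x), list(x)
    xp[j] += eps
    xm[j] -= eps
    b_j = (g(y, *xp) - g(y, *xm)) / (2 * eps)           # b_j = ∂g/∂x_j
    return -b_j / a

# Check on g(y, x1, x2) = y - x1*x2, where ∂y/∂x1 = x2 and ∂y/∂x2 = x1.
g = lambda y, x1, x2: y - x1 * x2
print(implicit_partial(g, y=6.0, x=[2.0, 3.0], j=0))  # ≈ 3.0
print(implicit_partial(g, y=6.0, x=[2.0, 3.0], j=1))  # ≈ 2.0
```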
1.4.2 Example 1: A Simple Linear Function

Consider the structural model:

3Q − 4R − 7S = 9

where Q is the production of apples, R is rainfall and S is the amount of sunshine. Following Step 1, Q is the endogenous variable and R and S are the two exogenous variables.

The structural model can be re-written as an implicit function as:

g(Q, R, S) = 3Q − 4R − 7S − 9 = 0.

Following Steps 3, 4 and 5 the total differential is:

3dQ − 4dR − 7dS = 0

since:

a = ∂g(Q, R, S)/∂Q = 3, b1 = ∂g(Q, R, S)/∂R = −4, b2 = ∂g(Q, R, S)/∂S = −7.

Suppose we wish to calculate ∂Q/∂S. Following Step 6, first set dR = 0 and then replace the remaining d's with ∂'s, so that dQ = ∂Q and dS = ∂S. We then obtain:

3∂Q − 7∂S = 0.

Solving for the ratio ∂Q/∂S then yields:

∂Q/∂S = 7/3.

In this example we could also have calculated ∂Q/∂S from the reduced form. From 3Q − 4R − 7S − 9 = 0 you can arrive at the reduced form (or explicit function) as:

Q = h(R, S) = (4/3)R + (7/3)S + 3

from which it is trivial to calculate:

∂Q/∂S = 7/3.
1.4.3 Example 2

Consider the implicit function:

g(y, x1, x2) = x1 y^(−1/2) − x2 y^(1/2) = 0.

We have already seen that this has a reduced form:

y = h(x1, x2) = x1/x2

from which it is easy to calculate ∂y/∂x1 and ∂y/∂x2.

Let us find these partial derivatives from the implicit function. The total differential is:

−(1/2)(x1 y^(−3/2) + x2 y^(−1/2)) dy + y^(−1/2) dx1 − y^(1/2) dx2 = 0

since:

a = ∂g(y, x1, x2)/∂y = −(1/2) x1 y^(−3/2) − (1/2) x2 y^(−1/2)
b1 = ∂g(y, x1, x2)/∂x1 = y^(−1/2)
b2 = ∂g(y, x1, x2)/∂x2 = −y^(1/2).

To calculate ∂y/∂x1, set dx2 = 0 and replace the remaining d's with ∂'s, so that dy = ∂y and dx1 = ∂x1. We then obtain:

−(1/2)(x1 y^(−3/2) + x2 y^(−1/2)) ∂y + y^(−1/2) ∂x1 = 0.

Solving for the ratio ∂y/∂x1 yields:

∂y/∂x1 = 2y^(−1/2) / (x1 y^(−3/2) + x2 y^(−1/2)).

As an exercise, use the reduced form y = x1/x2 to prove that the above expression for ∂y/∂x1 is equal to 1/x2.

To calculate ∂y/∂x2, set dx1 = 0 and replace the remaining d's with ∂'s to obtain:

−(1/2)(x1 y^(−3/2) + x2 y^(−1/2)) ∂y − y^(1/2) ∂x2 = 0.

Solving for ∂y/∂x2 yields:

∂y/∂x2 = −2y^(1/2) / (x1 y^(−3/2) + x2 y^(−1/2)) < 0.

As an exercise, use the reduced form y = x1/x2 to prove that the above expression for ∂y/∂x2 is equal to −x1/(x2)².
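Both exercises can be spot-checked numerically at a particular point. A Python sketch evaluating the total-differential formulas on the solution y = x1/x2; the point x1 = 3, x2 = 2 is arbitrary:

```python
# Check the implicit-function answers for g(y, x1, x2) = x1*y**(-1/2) - x2*y**(1/2)
# against the reduced-form derivatives ∂y/∂x1 = 1/x2 and ∂y/∂x2 = -x1/x2**2.
x1, x2 = 3.0, 2.0
y = x1 / x2                     # on the solution, g(y, x1, x2) = 0

denom = x1 * y ** -1.5 + x2 * y ** -0.5
dy_dx1 = 2 * y ** -0.5 / denom  # formula from the total differential
dy_dx2 = -2 * y ** 0.5 / denom

print(dy_dx1, 1 / x2)           # both ≈ 0.5
print(dy_dx2, -x1 / x2 ** 2)    # both ≈ -0.75
```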
1.4.4 Example 3: The Bivariate Normal Density
Consider the implicit function:

g(y, x1, x2) = ln(y) + (1/2)(x1^2 + x2^2) + ln(2π) = 0

from which we can find the explicit function:

y = h(x1, x2) = (1/(2π)) e^{-(1/2)(x1^2 + x2^2)}.
This is the bivariate standard normal, plotted below:

[Figure: surface plot of y = h(x1, x2) = (1/(2π)) e^{-(1/2)(x1^2 + x2^2)}]
If we start from the explicit function we can directly calculate:

∂y/∂x1 = −x1 (1/(2π)) e^{-(1/2)(x1^2 + x2^2)} = −x1 y

∂y/∂x2 = −x2 (1/(2π)) e^{-(1/2)(x1^2 + x2^2)} = −x2 y.
If instead we calculate the total differential we find that:

(1/y) dy + x1 dx1 + x2 dx2 = 0

since

a = ∂g(y, x1, x2)/∂y = 1/y,  b1 = ∂g(y, x1, x2)/∂x1 = x1,  b2 = ∂g(y, x1, x2)/∂x2 = x2.
To calculate ∂y/∂x1 set dy = ∂y, dx1 = ∂x1 and dx2 = 0 in the total differential to obtain:

(1/y) ∂y + x1 ∂x1 = 0

so that solving for the ratio ∂y/∂x1 we find:

∂y/∂x1 = −x1 y.

To calculate ∂y/∂x2 set dy = ∂y, dx2 = ∂x2 and dx1 = 0 in the total differential to obtain:

(1/y) ∂y + x2 ∂x2 = 0

so that solving for the ratio ∂y/∂x2 we find:

∂y/∂x2 = −x2 y.
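As a numeric sanity check (an illustration, not part of the text), the sketch below differentiates the explicit bivariate normal h(x1, x2) by a central difference and compares the result with the formula ∂y/∂x1 = −x1 y obtained above; the evaluation point is arbitrary.

```python
import math

def h(x1, x2):
    # explicit function y = (1/(2*pi)) * exp(-(x1**2 + x2**2)/2)
    return math.exp(-0.5 * (x1**2 + x2**2)) / (2 * math.pi)

x1, x2 = 0.7, -1.2   # arbitrary evaluation point
y = h(x1, x2)
eps = 1e-6
dy_dx1 = (h(x1 + eps, x2) - h(x1 - eps, x2)) / (2 * eps)
print(dy_dx1, -x1 * y)  # the two agree
```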
1.4.5 Example 4: Labour Demand
Suppose a firm faces a short-run production function Q = f(L), where Q is output and L is the amount of labour. With price P and nominal wage W, profits π as a function of L are given by:

π(L) = P f(L) − W L.
The profit-maximizing L* then satisfies the first-order condition:

π'(L*) = 0 = P f'(L*) − W

with the second-order condition:

P f''(L*) < 0.

Here P and W are the exogenous variables and L* is the endogenous variable. Note that the first-order condition gives us the appropriate implicit function:

g(L*, W, P) = P f'(L*) − W = 0.

This defines the labour demand function (the explicit function or reduced form) as:

L* = L*(W, P).
From the implicit function g(L*, W, P) = 0 the total differential is:

P f''(L*) dL* − dW + f'(L*) dP = 0

since

a = ∂g(L*, W, P)/∂L* = P f''(L*),  b1 = ∂g(L*, W, P)/∂W = −1,  b2 = ∂g(L*, W, P)/∂P = f'(L*).
Note that the coefficient on dL* is negative from the second-order condition; that is, a = P f''(L*) < 0.

To calculate ∂L*/∂W set dP = 0 in the total differential and replace the remaining d's by ∂'s so that dL* = ∂L* and dW = ∂W to get:

P f''(L*) ∂L* − ∂W = 0

or

P f''(L*) ∂L* = ∂W.

Dividing both sides by ∂W and solving for ∂L*/∂W:

∂L*(W, P)/∂W = 1 / (P f''(L*(W, P))) < 0
and so the labour demand curve is downward sloping.

To calculate ∂L*/∂P set dW = 0 in the total differential and replace the remaining d's by ∂'s so that dL* = ∂L* and dP = ∂P to get:

P f''(L*) ∂L* + f'(L*) ∂P = 0.

Solving we find that:

∂L*/∂P = −f'(L*) / (P f''(L*)) > 0

since f'(L*) > 0 (the marginal product of labour is positive) and P f''(L*) < 0 from the second-order condition.
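To make Example 4 concrete, here is a hedged numeric sketch using an assumed technology f(L) = √L (the text leaves f unspecified). For this f the first-order condition gives L* = (P/(2W))^2, and the formula ∂L*/∂W = 1/(P f''(L*)) should agree with the derivative of that closed form.

```python
def L_star(W, P):
    # labour demand for the assumed technology f(L) = sqrt(L)
    return (P / (2 * W)) ** 2

P, W = 3.0, 1.5
Ls = L_star(W, P)
f_pp = -0.25 * Ls ** -1.5        # f''(L*) when f(L) = sqrt(L)
implicit = 1 / (P * f_pp)        # dL*/dW from the total differential
eps = 1e-6
direct = (L_star(W + eps, P) - L_star(W - eps, P)) / (2 * eps)
print(implicit, direct)  # both negative and equal
```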
1.5 Total Differentials for Experts
1.5.1 Systems of Implicit Functions
Suppose we have a model with n endogenous variables y1, y2, ..., yn and m exogenous variables x1, x2, ..., xm. We can write this more compactly by letting y = [yi] be an n × 1 vector of the endogenous variables and x = [xi] an m × 1 vector of the exogenous variables. Suppose you are given n implicit functions:

gi(y, x) = 0, for i = 1, 2, ..., n.

This defines n reduced form or explicit functions given by:

yi = hi(x), for i = 1, 2, ..., n
as long as the Implicit Function Theorem is satisfied, which states:

Theorem 9 (Implicit Function Theorem III) The explicit functions yi = hi(x1, x2, ..., xm) for i = 1, 2, ..., n are guaranteed to exist as long as det[A] ≠ 0, where A is the n × n matrix:

A = [ ∂g1(y,x)/∂y1   ∂g1(y,x)/∂y2   ···   ∂g1(y,x)/∂yn ]
    [ ∂g2(y,x)/∂y1   ∂g2(y,x)/∂y2   ···   ∂g2(y,x)/∂yn ]
    [       ⋮               ⋮         ⋱          ⋮      ]
    [ ∂gn(y,x)/∂y1   ∂gn(y,x)/∂y2   ···   ∂gn(y,x)/∂yn ]

For example, suppose we have a structural model where apple production Q1 and honey production Q2 are related to the amount of rainfall R, sunshine S and temperature T as:

3Q1 + 2Q2 + 3S + 5T = 2R − 7
Q1 − 2Q2 + 6R − 3T = −2S + 4.

Here Q1 and Q2 are the endogenous variables and R, S, and T are the exogenous variables. Note that we have two equations and two endogenous variables. This can then be written as two implicit functions:

g1(Q1, Q2, R, S, T) = 3Q1 + 2Q2 − 2R + 3S + 5T + 7 = 0
g2(Q1, Q2, R, S, T) = Q1 − 2Q2 + 6R + 2S − 3T − 4 = 0.
Here:

A = [ ∂g1/∂Q1  ∂g1/∂Q2 ] = [ 3   2 ]
    [ ∂g2/∂Q1  ∂g2/∂Q2 ]   [ 1  −2 ]

and

det[A] = (3)(−2) − (2)(1) = −8 ≠ 0

and so the explicit functions:

Q1 = h1(R, S, T)
Q2 = h2(R, S, T)
are guaranteed to exist by the Implicit Function Theorem. We might therefore want to calculate ∂Q2/∂S = ∂h2(R, S, T)/∂S; that is, how sunshine affects honey production.

In general we might want to calculate ∂yi/∂xj in order to determine the nature of the relationship between the jth exogenous variable xj and the ith endogenous variable yi. We do this by calculating the total differential, defined as follows.
Definition 10 The total differential of gi(y, x) = 0 for i = 1, 2, ..., n is defined as:

A dy + B dx = 0

where dy = [dy1, dy2, ..., dyn]' and dx = [dx1, dx2, ..., dxm]' are column vectors, A is the n × n matrix defined above, and B is the n × m matrix:

B = [ ∂g1(y,x)/∂x1   ∂g1(y,x)/∂x2   ···   ∂g1(y,x)/∂xm ]
    [ ∂g2(y,x)/∂x1   ∂g2(y,x)/∂x2   ···   ∂g2(y,x)/∂xm ]
    [       ⋮               ⋮         ⋱          ⋮      ]
    [ ∂gn(y,x)/∂x1   ∂gn(y,x)/∂x2   ···   ∂gn(y,x)/∂xm ]
The recipe for calculating the total differential, and from it ∂yi/∂xj, is as follows.
1.5.2 A Recipe for Calculating ∂yi/∂xj
Step 1. Identify the n endogenous variables y1, y2, ..., yn and the m exogenous variables x1, x2, ..., xm in your model.
Step 2. If the structural model is not written as a system of implicit functions, write it in the form:

g1(y1, y2, ..., yn, x1, x2, ..., xm) = 0
g2(y1, y2, ..., yn, x1, x2, ..., xm) = 0
⋮
gn(y1, y2, ..., yn, x1, x2, ..., xm) = 0.
Note that you should have as many implicit functions as endogenous variables!

Step 3. Take the partial derivative of the first implicit function g1(y, x) = 0 with respect to y1. Call this a11, so that a11 = ∂g1(y, x)/∂y1. Multiply a11 by dy1 to get a11 dy1. Continue this with the remaining endogenous variables y2, y3, ..., yn and add the results together to obtain:

a11 dy1 + a12 dy2 + ··· + a1n dyn.

Note that a1j = ∂g1(y, x)/∂yj is the coefficient on dyj in the first implicit function.

Step 4. Take the partial derivative of g1(y, x) with respect to x1 and call this b11. Multiply b11 by dx1 to get b11 dx1. Repeat for x2, x3, ..., xm to get:

b11 dx1 + b12 dx2 + ··· + b1m dxm.

Step 5. Add the results of Steps 3 and 4 together and set the sum equal to zero. This gives you the total differential for the first implicit function:

a11 dy1 + a12 dy2 + ··· + a1n dyn + b11 dx1 + b12 dx2 + ··· + b1m dxm = 0.

Now repeat this for the second implicit function to get:

a21 dy1 + a22 dy2 + ··· + a2n dyn + b21 dx1 + b22 dx2 + ··· + b2m dxm = 0

so that a2j = ∂g2(y, x)/∂yj and b2j = ∂g2(y, x)/∂xj. Keep doing this for all n implicit equations to obtain n total differentials:
a11 dy1 + a12 dy2 + ··· + a1n dyn + b11 dx1 + b12 dx2 + ··· + b1m dxm = 0
a21 dy1 + a22 dy2 + ··· + a2n dyn + b21 dx1 + b22 dx2 + ··· + b2m dxm = 0
⋮
an1 dy1 + an2 dy2 + ··· + ann dyn + bn1 dx1 + bn2 dx2 + ··· + bnm dxm = 0

so that aij = ∂gi(y, x)/∂yj and bij = ∂gi(y, x)/∂xj are the coefficients on dyj and dxj in the ith total differential. Thus there are n total differentials, one for each implicit equation.

Step 6. To get ∂yi/∂xj from the total differential, set all dx's equal to zero except dxj and replace all remaining d's with ∂'s to obtain:
a11 ∂y1 + a12 ∂y2 + ··· + a1n ∂yn + b1j ∂xj = 0
a21 ∂y1 + a22 ∂y2 + ··· + a2n ∂yn + b2j ∂xj = 0
⋮
an1 ∂y1 + an2 ∂y2 + ··· + ann ∂yn + bnj ∂xj = 0.

Now put the terms involving ∂xj on the right-hand side to get:

a11 ∂y1 + a12 ∂y2 + ··· + a1n ∂yn = −b1j ∂xj
a21 ∂y1 + a22 ∂y2 + ··· + a2n ∂yn = −b2j ∂xj
⋮
an1 ∂y1 + an2 ∂y2 + ··· + ann ∂yn = −bnj ∂xj

or, in matrix form:

[ a11  a12  ···  a1n ] [ ∂y1 ]     [ b1j ]
[ a21  a22  ···  a2n ] [ ∂y2 ]     [ b2j ]
[  ⋮    ⋮    ⋱    ⋮  ] [  ⋮  ] = − [  ⋮  ] ∂xj.
[ an1  an2  ···  ann ] [ ∂yn ]     [ bnj ]

Divide both sides by the scalar ∂xj and rewrite this in matrix notation:
A (∂y/∂xj) = −bj

where ∂y/∂xj and bj are n × 1 column vectors given respectively by:

∂y/∂xj = [ ∂y1/∂xj, ∂y2/∂xj, ..., ∂yn/∂xj ]'   and   bj = [ b1j, b2j, ..., bnj ]'.

Note that bj is the jth column of the matrix B. To solve for ∂yi/∂xj use Cramer's rule to obtain:

∂yi/∂xj = det[Ai(−bj)] / det[A]

where Ai(−bj) is the matrix you obtain by replacing the ith column of A with the vector −bj.
1.5.3 Example 1
Given the apple/honey production example above, we have two linear implicit equations:

g1(Q1, Q2, R, S, T) = 3Q1 + 2Q2 − 2R + 3S + 5T + 7 = 0
g2(Q1, Q2, R, S, T) = Q1 − 2Q2 + 6R + 2S − 3T − 4 = 0.

Suppose we wish to calculate ∂Q2/∂S.
Using Total Differentials
Following Steps 3, 4, and 5 we have:

a11 = ∂g1/∂Q1 = 3,  a12 = ∂g1/∂Q2 = 2,
b11 = ∂g1/∂R = −2,  b12 = ∂g1/∂S = 3,  b13 = ∂g1/∂T = 5

so that the total differential for g1(Q1, Q2, R, S, T) is:

3dQ1 + 2dQ2 − 2dR + 3dS + 5dT = 0.

Similarly, since:
a21 = ∂g2/∂Q1 = 1,  a22 = ∂g2/∂Q2 = −2,
b21 = ∂g2/∂R = 6,  b22 = ∂g2/∂S = 2,  b23 = ∂g2/∂T = −3

the total differential for g2(Q1, Q2, R, S, T) is:

dQ1 − 2dQ2 + 6dR + 2dS − 3dT = 0.

Putting these two total differentials together we have:

3dQ1 + 2dQ2 − 2dR + 3dS + 5dT = 0
dQ1 − 2dQ2 + 6dR + 2dS − 3dT = 0.
To calculate ∂Q2/∂S from the total differential set dR = dT = 0 and replace the remaining d's with ∂'s so that dS = ∂S, dQ1 = ∂Q1 and dQ2 = ∂Q2 to obtain:

3∂Q1 + 2∂Q2 + 3∂S = 0
∂Q1 − 2∂Q2 + 2∂S = 0

or in matrix notation:

[ 3   2 ] [ ∂Q1 ]   [ −3 ]
[ 1  −2 ] [ ∂Q2 ] = [ −2 ] ∂S.
Dividing both sides by ∂S we then get:

[ 3   2 ] [ ∂Q1/∂S ]   [ −3 ]
[ 1  −2 ] [ ∂Q2/∂S ] = [ −2 ].

Solving by Cramer's rule then yields:

∂Q2/∂S = det[ 3  −3; 1  −2 ] / det[ 3  2; 1  −2 ] = (−3)/(−8) = 3/8.
Similarly, if we wish to calculate ∂Q1/∂R set dS = dT = 0 in the total differential and replace the remaining d's with ∂'s so that dR = ∂R, dQ1 = ∂Q1 and dQ2 = ∂Q2. We therefore obtain:

3∂Q1 + 2∂Q2 − 2∂R = 0
∂Q1 − 2∂Q2 + 6∂R = 0

or in matrix notation:

[ 3   2 ] [ ∂Q1 ]   [  2 ]
[ 1  −2 ] [ ∂Q2 ] = [ −6 ] ∂R.
Dividing both sides by ∂R:

[ 3   2 ] [ ∂Q1/∂R ]   [  2 ]
[ 1  −2 ] [ ∂Q2/∂R ] = [ −6 ].

Solving by Cramer's rule then yields:

∂Q1/∂R = det[ 2  2; −6  −2 ] / det[ 3  2; 1  −2 ] = 8/(−8) = −1.
From the Reduced Form
Since the structural model is linear, we can easily find the reduced form using matrix algebra. From the implicit functions we have:

[ 3   2 ] [ Q1 ]   [ −2  3   5 ] [ R ]   [  7 ]   [ 0 ]
[ 1  −2 ] [ Q2 ] + [  6  2  −3 ] [ S ] + [ −4 ] = [ 0 ]
                                 [ T ]
so that:

[ Q1 ]     [ 3   2 ]^{-1} [ −2  3   5 ] [ R ]   [ 3   2 ]^{-1} [  7 ]
[ Q2 ] = − [ 1  −2 ]      [  6  2  −3 ] [ S ] − [ 1  −2 ]      [ −4 ]
                                        [ T ]

       [ −1    −5/4  −1/2 ] [ R ]   [ −3/4  ]
     = [ 5/2    3/8  −7/4 ] [ S ] + [ −19/8 ].
                            [ T ]
Writing these out we get the reduced form:

Q1 = h1(R, S, T) = −R − (5/4)S − (1/2)T − 3/4
Q2 = h2(R, S, T) = (5/2)R + (3/8)S − (7/4)T − 19/8

so that:

∂Q1/∂R = −1,  ∂Q2/∂S = 3/8.
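Both routes can be confirmed with a few lines of Python (an illustrative check, not part of the text), writing out the 2 × 2 determinants of Cramer's rule by hand:

```python
def det2(m):
    # determinant of a 2x2 matrix given as [[a, b], [c, d]]
    return m[0][0] * m[1][1] - m[0][1] * m[1][0]

A = [[3, 2], [1, -2]]   # coefficients on dQ1, dQ2
b_S = [3, 2]            # column of B for S
b_R = [-2, 6]           # column of B for R
# replace the relevant column of A with -b_j, per Cramer's rule
dQ2_dS = det2([[A[0][0], -b_S[0]], [A[1][0], -b_S[1]]]) / det2(A)
dQ1_dR = det2([[-b_R[0], A[0][1]], [-b_R[1], A[1][1]]]) / det2(A)
print(dQ2_dS, dQ1_dR)  # 0.375 and -1.0, as in the text
```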
1.5.4 Example 2: Supply and Demand I
First consider a demand and supply model given by:

Qd = D(P),  D'(P) < 0  (the demand curve)
Qs = S(P),  S'(P) > 0  (the supply curve)
Qs = Qd = Q  (the equilibrium condition).

Here there are no exogenous variables; the first equation says that the demand curve slopes downward, the second that the supply curve slopes upward, and the third that in equilibrium supply equals demand. We can rewrite this as two implicit functions with two endogenous variables Q and P:

g1(Q, P) = Q − D(P) = 0
g2(Q, P) = Q − S(P) = 0.

At this stage there are no exogenous variables. Let us now introduce two exogenous variables: a tax TH paid by households and a tax TF paid by firms, so that the model becomes:

Qd = D(P + TH)
Qs = S(P − TF)
Qs = Qd = Q
or as implicit functions:

g1(Q, P, TH, TF) = Q − D(P + TH) = 0
g2(Q, P, TH, TF) = Q − S(P − TF) = 0.

This determines the equilibrium quantity Q and price P as functions of TH and TF:

Q = h1(TH, TF)
P = h2(TH, TF).

To simplify the notation define:

DP ≡ D'(P + TH)
SP ≡ S'(P − TF).

To find the effect of the tax TH on the endogenous variables P and Q, we calculate the total differential, which is:

dQ − DP dP − DP dTH = 0
dQ − SP dP + SP dTF = 0
since:

a11 = ∂g1(Q, P, TH, TF)/∂Q = 1,   a12 = ∂g1(Q, P, TH, TF)/∂P = −DP
b11 = ∂g1(Q, P, TH, TF)/∂TH = −DP,   b12 = ∂g1(Q, P, TH, TF)/∂TF = 0

and:

a21 = ∂g2(Q, P, TH, TF)/∂Q = 1,   a22 = ∂g2(Q, P, TH, TF)/∂P = −SP
b21 = ∂g2(Q, P, TH, TF)/∂TH = 0,   b22 = ∂g2(Q, P, TH, TF)/∂TF = SP.
To find ∂Q/∂TH set dTF = 0 and replace all the remaining d's in the total differential with ∂'s to obtain:

∂Q − DP ∂P − DP ∂TH = 0
∂Q − SP ∂P = 0

or

[ 1  −DP ] [ ∂Q ]   [ DP ]
[ 1  −SP ] [ ∂P ] = [ 0  ] ∂TH.

Dividing both sides by ∂TH we obtain:

[ 1  −DP ] [ ∂Q/∂TH ]   [ DP ]
[ 1  −SP ] [ ∂P/∂TH ] = [ 0  ]
so that from Cramer's rule:

∂Q/∂TH = det[ DP  −DP; 0  −SP ] / det[ 1  −DP; 1  −SP ] = −DP SP / (−SP + DP) < 0

so that the tax on households reduces the equilibrium quantity.
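As a quick illustration (with assumed curves not found in the text), take linear curves D(P) = 10 − P and S(P) = 2P, so that DP = −1 and SP = 2; the Cramer's-rule formula for ∂Q/∂TH then gives a definite negative number:

```python
D_P, S_P = -1.0, 2.0                   # slopes of the assumed linear curves
dQ_dTH = (-D_P * S_P) / (-S_P + D_P)   # -D_P*S_P / (-S_P + D_P)
print(dQ_dTH)  # -2/3: the household tax lowers the equilibrium quantity
```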
1.5.5 Example 3: Supply and Demand II
There are of course many more variables than taxes which shift demand and supply curves. Let a be any variable which shifts the demand curve to the right and b any variable which shifts the supply curve to the right. We can write a more general supply and demand model as:

Qd = D(P, a),  DP ≡ ∂D(P, a)/∂P < 0,  Da ≡ ∂D(P, a)/∂a > 0
Qs = S(P, b),  SP ≡ ∂S(P, b)/∂P > 0,  Sb ≡ ∂S(P, b)/∂b > 0
Qs = Qd = Q.
For example, if a = TH and b = TF with D(P, a) = D(P + TH) and S(P, b) = S(P − TF), then we would have the previous supply and demand model. Here the exogenous variables are a and b and the endogenous variables are Q and P. Written as a system of implicit equations we have:

g1(Q, P, a, b) = Q − D(P, a) = 0
g2(Q, P, a, b) = Q − S(P, b) = 0.

The total differential is then:

dQ − DP dP − Da da + 0db = 0
dQ − SP dP + 0da − Sb db = 0.
Suppose we want to see what happens if the demand curve shifts to the right while the supply curve stays fixed. Set db = 0, da = ∂a, dQ = ∂Q and dP = ∂P, so that from the total differential:

[ 1  −DP ] [ ∂Q/∂a ]   [ Da ]
[ 1  −SP ] [ ∂P/∂a ] = [ 0  ]
so that using Cramer's rule:

∂Q/∂a = det[ Da  −DP; 0  −SP ] / det[ 1  −DP; 1  −SP ] = −Da SP / (−SP + DP) > 0

∂P/∂a = det[ 1  Da; 1  0 ] / det[ 1  −DP; 1  −SP ] = −Da / (−SP + DP) > 0.

Thus anything which shifts the demand curve to the right increases Q and P. Verify this using a diagram!
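A one-line numeric check (the slopes below are assumptions for illustration, not from the text): with DP = −1.5, SP = 2 and Da = 1 the two Cramer's-rule expressions are indeed both positive.

```python
D_P, D_a, S_P = -1.5, 1.0, 2.0    # assumed slopes and demand-shift effect
denom = -S_P + D_P                # determinant of the coefficient matrix
dQ_da = -D_a * S_P / denom
dP_da = -D_a / denom
print(dQ_da, dP_da)  # both positive: Q and P rise with the demand shift
```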
1.5.6 Example 4: The IS/LM Model
The Goods and Services Market
In the IS/LM model the goods and services market is described by:

cd = c(y),  0 < c'(y) < 1
id = i(r),  i'(r) < 0
gd = g_o
yd = cd + id + gd
y = yd

where cd is consumption demand (with c'(y) the marginal propensity to consume), id is investment demand (which depends negatively on the rate of interest), gd is government expenditure, yd is aggregate demand and y is GNP.

Here government expenditure is exogenous and equal to some constant g_o (the o subscript is used to denote exogenous variables) while all other variables, i.e. c, i, y and r, are endogenous. In equilibrium we require that y = yd = c(y) + i(r) + g_o, so that:

y = c(y) + i(r) + g_o

which is the structural form of the IS curve. We can write this as an implicit function:

g1(y, r, g_o, ms_o) = y − c(y) − i(r) − g_o = 0

where y and r are the endogenous variables and g_o and ms_o (the not-yet-introduced exogenous money supply) are the exogenous variables. The total differential for the IS curve is then:

(1 − c'(y)) dy − i'(r) dr − dg_o + 0 dms_o = 0.
The Slope of the IS Curve
Let us now show that the IS curve is downward sloping. The IS curve is the set of all (r, y) combinations which satisfy this implicit function for fixed values of g_o and ms_o. We can therefore think of y as the independent or exogenous variable and r as the dependent or endogenous variable, and take the total differential with dg_o = dms_o = 0 to obtain:

(1 − c'(y)) dy − i'(r) dr = 0.

Solving for the slope of the IS curve, dr/dy, we obtain:

dr/dy = (1 − c'(y)) / i'(r) < 0

since 1 − c'(y) > 0 and i'(r) < 0. Thus the IS curve is downward sloping.
The Money Market
The money market is described by:

md = k(y) + l(r),  k'(y) > 0,  l'(r) < 0
ms = ms_o

where k(y) is the transactions demand for money, l(r) is the speculative demand for money, which depends negatively on r, and ms_o is the exogenous supply of money. In equilibrium ms = md, so that we have:

ms_o = k(y) + l(r)

which is the structural form of the LM curve. We can write the LM curve as an implicit function:

g2(y, r, g_o, ms_o) = k(y) + l(r) − ms_o = 0.

This has a total differential given by:

k'(y) dy + l'(r) dr + 0 dg_o − dms_o = 0.
The Slope of the LM Curve
Let us now show that the LM curve is upward sloping. The LM curve is the set of all (r, y) combinations which satisfy this implicit function for fixed values of g_o and ms_o. We can therefore think of y as the independent or exogenous variable and r as the dependent or endogenous variable, and take the total differential with dg_o = dms_o = 0 to obtain:

k'(y) dy + l'(r) dr = 0

and solve for dr/dy, the slope of the LM curve, as:

dr/dy = −k'(y)/l'(r) > 0

since k'(y) > 0 and l'(r) < 0. Thus the LM curve is upward sloping.
The IS and LM Curves as Implicit Functions
Taken together the IS and LM equations determine the structural form of the model, with two endogenous variables y and r and two exogenous variables g_o and ms_o. We can write the IS and LM equations respectively as:

g1(y, r, g_o, ms_o) = y − c(y) − i(r) − g_o = 0
g2(y, r, g_o, ms_o) = k(y) + l(r) − ms_o = 0.

These two equations then determine a reduced form:

y = f1(g_o, ms_o)
r = f2(g_o, ms_o).

We wish to calculate such partial derivatives as ∂y/∂g_o, which tells us how GNP responds to changes in government expenditure, or ∂r/∂ms_o, which tells us how interest rates respond to changes in the money supply.
The Total Differential

The total differential for the IS/LM model is given by:

(1 − c'(y)) dy − i'(r) dr − dg_o = 0
k'(y) dy + l'(r) dr − dms_o = 0

since:

a11 = ∂g1/∂y = 1 − c'(y),  a12 = ∂g1/∂r = −i'(r),
b11 = ∂g1/∂g_o = −1,  b12 = ∂g1/∂ms_o = 0

and

a21 = ∂g2/∂y = k'(y),  a22 = ∂g2/∂r = l'(r),
b21 = ∂g2/∂g_o = 0,  b22 = ∂g2/∂ms_o = −1.
The Effect of Changes in Government Expenditure

To find ∂y/∂g_o set dms_o = 0 and replace the d's by ∂'s to obtain:

(1 − c'(y)) ∂y − i'(r) ∂r − ∂g_o = 0
k'(y) ∂y + l'(r) ∂r = 0

or, writing this in matrix notation:

[ 1 − c'(y)  −i'(r) ] [ ∂y ]   [ 1 ]
[ k'(y)       l'(r) ] [ ∂r ] = [ 0 ] ∂g_o.
Dividing both sides by ∂g_o we obtain:

[ 1 − c'(y)  −i'(r) ] [ ∂y/∂g_o ]   [ 1 ]
[ k'(y)       l'(r) ] [ ∂r/∂g_o ] = [ 0 ].
Finally, using Cramer's rule:

∂y/∂g_o = det[ 1  −i'(r); 0  l'(r) ] / det[ 1 − c'(y)  −i'(r); k'(y)  l'(r) ]
        = l'(r) / ((1 − c'(y)) l'(r) + i'(r) k'(y))

so that simplifying we have:

∂y/∂g_o = 1 / ((1 − c'(y)) + i'(r) k'(y)/l'(r)).

Since 0 < c'(y) < 1 it follows that 0 < 1 − c'(y) < 1, so the first term in the denominator is positive. The second term, i'(r) k'(y)/l'(r), represents the crowding-out effect and is also positive since i'(r) < 0, l'(r) < 0 and k'(y) > 0. We therefore conclude that ∂y/∂g_o > 0, so that an increase in government expenditure increases GNP.

Using Cramer's rule to solve for ∂r/∂g_o we find that:
∂r/∂g_o = det[ 1 − c'(y)  1; k'(y)  0 ] / det[ 1 − c'(y)  −i'(r); k'(y)  l'(r) ]
        = −k'(y) / det[ 1 − c'(y)  −i'(r); k'(y)  l'(r) ] > 0

so that an increase in government expenditure increases interest rates. It follows from this that investment falls when government expenditure increases. Using the chain rule we have:
∂i(r(g_o, ms_o))/∂g_o = i'(r(g_o, ms_o)) ∂r(g_o, ms_o)/∂g_o < 0

since i'(r) < 0.
The Effect of Changes in the Money Supply

Similarly, to find ∂y/∂ms_o set dg_o = 0 and replace the d's by ∂'s to obtain:

(1 − c'(y)) ∂y − i'(r) ∂r = 0
k'(y) ∂y + l'(r) ∂r − ∂ms_o = 0
or

[ 1 − c'(y)  −i'(r) ] [ ∂y/∂ms_o ]   [ 0 ]
[ k'(y)       l'(r) ] [ ∂r/∂ms_o ] = [ 1 ].

Using Cramer's rule it follows that:

∂y/∂ms_o = det[ 0  −i'(r); 1  l'(r) ] / det[ 1 − c'(y)  −i'(r); k'(y)  l'(r) ] > 0

and

∂r/∂ms_o = det[ 1 − c'(y)  0; k'(y)  1 ] / det[ 1 − c'(y)  −i'(r); k'(y)  l'(r) ] < 0

so that an increase in the money supply increases GNP and decreases interest rates. (Verify this!)

It follows that investment increases when the money supply increases since, using the chain rule, we have:

∂i(r(g_o, ms_o))/∂ms_o = i'(r(g_o, ms_o)) ∂r(g_o, ms_o)/∂ms_o > 0

since i'(r) < 0.
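A numeric illustration with assumed (hypothetical) IS/LM slopes c' = 0.8, i' = −50, k' = 0.5 and l' = −100, none of which come from the text, confirms all four signs derived above:

```python
cp, ip, kp, lp = 0.8, -50.0, 0.5, -100.0   # assumed c', i', k', l'
det_A = (1 - cp) * lp + ip * kp            # determinant of the system
dy_dgo = lp / det_A                        # effect of g_o on GNP
dr_dgo = -kp / det_A                       # effect of g_o on interest rates
dy_dms = ip / det_A                        # effect of ms_o on GNP
dr_dms = (1 - cp) / det_A                  # effect of ms_o on interest rates
print(dy_dgo > 0, dr_dgo > 0, dy_dms > 0, dr_dms < 0)  # all True
```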
1.6 Total Differentials and Optimization
1.6.1 A General Discussion
Economics deals with rational behavior, and rational behavior typically means maximizing or minimizing something. For example, utility and profits get maximized while costs get minimized.

Maximization and minimization imply first-order and second-order conditions. First-order conditions say that some expression of the endogenous and exogenous variables must equal zero, and so directly give implicit functions from which we can take total differentials.

Second-order conditions tell us something about a symmetric matrix, the Hessian H. For example, with an unconstrained maximization problem we know that H must be negative definite, while for an unconstrained minimization problem H must be positive definite.

It turns out that when we take the total differential, the coefficients on the dy's, the endogenous variables, always come from H, the Hessian from the second-order conditions. This turns out to be invaluable in signing partial derivatives since, if we use Cramer's rule, det[H] will always appear in the denominator.
1.6.2 A Quick Derivation of the Main Principle
Let y be an n × 1 vector of endogenous variables and x an m × 1 vector of exogenous variables. Suppose that our structural model involves either maximizing or minimizing:

f(y, x).

The optimal y, say y*, will then satisfy the first-order conditions:

∂f(y*, x)/∂yi = 0, for i = 1, 2, ..., n.

Here y* is the vector of endogenous variables and x is the vector of exogenous variables. The first-order conditions then define a set of n implicit functions given by:

gi(y*, x) = ∂f(y*, x)/∂yi = 0, for i = 1, 2, ..., n.

This determines the reduced form of n equations:

y*_i = y*_i(x), for i = 1, 2, ..., n.
From the second-order conditions we have that the Hessian:

H = [ ∂²f(y*, x)/∂yi∂yj ]

is negative definite for a maximum and positive definite for a minimum. Taking the total differential of g(y*, x) = 0 we find that:

H dy + B dx = 0

where the n × n matrix H and the n × m matrix B are given by:

H(y*, x) = [ ∂²f(y*, x)/∂yi∂yj ],   B = [ ∂²f(y*, x)/∂yi∂xj ].

Using Cramer's rule we then find, as before, that:
∂yi/∂xj = det[Hi(−bj)] / det[H].

From the second-order conditions we know that H is either negative definite (for a maximum) or positive definite (for a minimum), and hence non-singular. Furthermore, we can determine the sign of det[H]; for example, det[H] > 0 if H is positive definite. This can be used in determining whether ∂yi/∂xj is positive or negative when we use Cramer's rule.
While this derivation has skipped some details, the essential point is easy to grasp and is this: when taking the total differential from an unconstrained optimization problem, the matrix of coefficients on the dy's is identical to H, the Hessian from the second-order conditions. This fact can then be used in signing partial derivatives like ∂yi/∂xj.
1.6.3 Profit Maximization
Consider the problem of profit maximization in the long run for a perfectly competitive firm with technology Q = F(L, K), where π is profits, Q is output, P is price, L is labour, K is capital, W is the nominal wage and R is the rental cost of capital. Profits are thus given by:

π(L, K, P, W, R) = P F(L, K) − W L − R K.

From the first-order conditions we have:

P ∂F(L*, K*)/∂L − W = 0
P ∂F(L*, K*)/∂K − R = 0.
Here L* and K* are the endogenous variables and P, W and R are the exogenous variables. The first-order conditions thus define two implicit functions:

g1(L*, K*, P, W, R) = P ∂F(L*, K*)/∂L − W = 0
g2(L*, K*, P, W, R) = P ∂F(L*, K*)/∂K − R = 0

which determine the reduced form:

L* = L*(P, W, R)
K* = K*(P, W, R).
To simplify the notation let us make the following definitions:

FL ≡ ∂F(L*, K*)/∂L,   FK ≡ ∂F(L*, K*)/∂K,
FLL ≡ ∂²F(L*, K*)/∂L²,   FLK ≡ ∂²F(L*, K*)/∂L∂K,   FKK ≡ ∂²F(L*, K*)/∂K².

The second-order conditions are then that the Hessian H given by:

H = [ P FLL  P FLK ]
    [ P FLK  P FKK ]

is negative definite. Therefore:

P FLL(L*, K*) < 0,   P FKK(L*, K*) < 0,   det[H] > 0.
Taking the total differential we find that:

P FLL dL* + P FLK dK* + FL dP − dW = 0
P FLK dL* + P FKK dK* + FK dP − dR = 0

so that in matrix notation (the 2 × 2 matrix of coefficients is H):

[ P FLL  P FLK ] [ dL* ]   [ FL  −1   0 ] [ dP ]   [ 0 ]
[ P FLK  P FKK ] [ dK* ] + [ FK   0  −1 ] [ dW ] = [ 0 ].
                                          [ dR ]
Note that the matrix of coefficients on the vector of dL* and dK*, the endogenous variables, is H, the Hessian from the second-order conditions.

Suppose now we wish to calculate ∂L*/∂W and ∂K*/∂W. Set dP = dR = 0 and change the remaining d's to ∂'s so that dL* = ∂L*, dK* = ∂K* and dW = ∂W; put the terms involving ∂W on the right-hand side and divide by ∂W to obtain:

[ P FLL  P FLK ] [ ∂L*/∂W ]   [ 1 ]
[ P FLK  P FKK ] [ ∂K*/∂W ] = [ 0 ].
Using Cramer's rule we find that:

∂L*/∂W = det[ 1  P FLK; 0  P FKK ] / det[H] = P FKK / det[H] < 0

since P FKK < 0 and det[H] > 0 from the second-order conditions. Note how we have used the second-order conditions to show that the long-run labour demand curve is downward sloping. From this we can also show, using Cramer's rule, that:
∂K*/∂W = det[ P FLL  1; P FLK  0 ] / det[H] = −P FLK / det[H]

which can be either positive or negative depending on the sign of FLK.

Suppose now we calculate ∂K*/∂R and ∂L*/∂R. From the total differential we can show that:

[ P FLL  P FLK ] [ ∂L*/∂R ]   [ 0 ]
[ P FLK  P FKK ] [ ∂K*/∂R ] = [ 1 ]

so that:

∂L*/∂R = det[ 0  P FLK; 1  P FKK ] / det[H] = −P FLK / det[H]

∂K*/∂R = det[ P FLL  0; P FLK  1 ] / det[H] = P FLL / det[H] < 0.
Thus, as one would expect, ∂K*/∂R < 0, so that the long-run demand curve for capital is downward sloping. Comparing the expressions for ∂K*/∂W and ∂L*/∂R we find that:

∂L*/∂R = ∂K*/∂W = −P FLK / det[H].

This (very surprising!) symmetry result often occurs in these kinds of problems.
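The symmetry result can be illustrated numerically under an assumed Cobb-Douglas technology F(L, K) = L^{1/3} K^{1/3} (an assumption, not from the text), whose factor demands have the closed forms used below; the symmetry then shows up as equal cross-derivatives.

```python
def L_star(P, W, R):
    # factor demands for the assumed F(L, K) = L**(1/3) * K**(1/3)
    return (P / 3) ** 3 / (W ** 2 * R)

def K_star(P, W, R):
    return (P / 3) ** 3 / (W * R ** 2)

P, W, R = 9.0, 2.0, 1.5
eps = 1e-6
dL_dR = (L_star(P, W, R + eps) - L_star(P, W, R - eps)) / (2 * eps)
dK_dW = (K_star(P, W + eps, R) - K_star(P, W - eps, R)) / (2 * eps)
print(dL_dR, dK_dW)  # equal, illustrating dL*/dR = dK*/dW
```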
1.6.4 Constrained Optimization
Often in economics we are maximizing or minimizing subject to a constraint. Suppose we wish to choose y to maximize or minimize f(y, x) subject to the constraint h(y, x) = 0. This leads to the Lagrangian:

L(λ, y, x) = f(y, x) + λ h(y, x)

and the first-order conditions:

∂L(λ*, y*, x)/∂λ = h(y*, x) = 0
∂L(λ*, y*, x)/∂yi = ∂f(y*, x)/∂yi + λ* ∂h(y*, x)/∂yi = 0, for i = 1, 2, ..., n.

These n + 1 first-order conditions lead to n + 1 implicit functions given by:

g1(λ*, y*, x) = h(y*, x) = 0
g_{i+1}(λ*, y*, x) = ∂f(y*, x)/∂yi + λ* ∂h(y*, x)/∂yi = 0, for i = 1, 2, ..., n.
Here the endogenous variables are λ* and y*, while x is the vector of exogenous variables. This in turn determines the reduced form:

λ* = λ*(x)
y*_i = y*_i(x), for i = 1, 2, ..., n.

The second-order conditions lead to the (n + 1) × (n + 1) matrix H, which is the Hessian of L(λ, y, x) evaluated at λ*, y*. Consider now taking the total differential. We will then find that:

H dZ* + B dx = 0

where:

dZ* = [ dλ*; dy* ]

is the (n + 1) × 1 vector of endogenous variables.

The essential point is easy to grasp and is this: when taking the total differential from a constrained optimization problem, the matrix of coefficients on the dλ and dy is identical to H, the Hessian from the second-order conditions. This fact can then be used in signing partial derivatives like ∂yi/∂xj.
1.6.5 Utility Maximization
Consider a household with a utility function:

U(X1, X2) = U1(X1) + U2(X2)

where:

U1'(X1) > 0,  U1''(X1) < 0
U2'(X2) > 0,  U2''(X2) < 0

and with income Y and prices P1 and P2. The Lagrangian for the constrained maximization problem is then:
L(λ, X1, X2) = U1(X1) + U2(X2) + λ(Y − P1 X1 − P2 X2)

which yields the first-order conditions:

∂L(λ*, X1*, X2*)/∂λ = Y − P1 X1* − P2 X2* = 0
∂L(λ*, X1*, X2*)/∂X1 = U1'(X1*) − λ* P1 = 0
∂L(λ*, X1*, X2*)/∂X2 = U2'(X2*) − λ* P2 = 0.
Here λ*, X1* and X2* are the endogenous variables while Y, P1 and P2 are the exogenous variables. This leads to 3 implicit functions:

g1(λ*, X1*, X2*, Y, P1, P2) = Y − P1 X1* − P2 X2* = 0
g2(λ*, X1*, X2*, Y, P1, P2) = U1'(X1*) − λ* P1 = 0
g3(λ*, X1*, X2*, Y, P1, P2) = U2'(X2*) − λ* P2 = 0

which determine the reduced form:

λ* = λ*(Y, P1, P2)
X1* = X1*(Y, P1, P2)
X2* = X2*(Y, P1, P2).
Note that λ* > 0, since from the first-order conditions:

λ* = U1'(X1*)/P1 = U2'(X2*)/P2 > 0.
The second-order condition for this problem is that the Hessian of L(λ*, X1*, X2*), given by:

H = [  0    −P1         −P2        ]
    [ −P1   U1''(X1*)    0         ]
    [ −P2    0          U2''(X2*)  ]

satisfies:

det[H] = −(P1)² U2''(X2*) − (P2)² U1''(X1*) > 0.
Taking the total differential we find that:

0dλ* − P1 dX1* − P2 dX2* + dY − X1* dP1 − X2* dP2 = 0
−P1 dλ* + U1'' dX1* + 0dX2* − λ* dP1 = 0
−P2 dλ* + 0dX1* + U2'' dX2* − λ* dP2 = 0

where to simplify the notation we define:

U1'' ≡ U1''(X1*),  U2'' ≡ U2''(X2*).
Suppose we wish to calculate ∂X1*/∂P1. Then setting dP2 = dY = 0 and changing all the d's to ∂'s we obtain:

0∂λ* − P1 ∂X1* − P2 ∂X2* − X1* ∂P1 = 0
−P1 ∂λ* + U1'' ∂X1* + 0∂X2* − λ* ∂P1 = 0
−P2 ∂λ* + 0∂X1* + U2'' ∂X2* = 0.

Writing this in matrix notation and dividing both sides by ∂P1 we obtain (the 3 × 3 matrix of coefficients is H):

[  0    −P1   −P2  ] [ ∂λ*/∂P1  ]   [ X1* ]
[ −P1   U1''   0   ] [ ∂X1*/∂P1 ] = [ λ*  ]
[ −P2    0    U2'' ] [ ∂X2*/∂P1 ]   [ 0   ]
so that:

∂X1*/∂P1 = det[ 0  X1*  −P2;  −P1  λ*  0;  −P2  0  U2'' ] / det[H]
         = (P1 X1* U2'' − (P2)² λ*) / det[H] < 0

since det[H] > 0 from the second-order conditions, U2'' < 0 and λ* > 0.
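A concrete check under the assumed log utility U1(X) = U2(X) = ln(X) (an assumption; the text keeps U1 and U2 general): demands are then X1* = Y/(2P1) and X2* = Y/(2P2), and the expression for ∂X1*/∂P1 should reduce to −Y/(2P1²).

```python
Y, P1, P2 = 12.0, 2.0, 3.0
X1, X2 = Y / (2 * P1), Y / (2 * P2)     # demands under assumed log utility
lam = 1 / (P1 * X1)                     # lambda* = U1'(X1*)/P1 with U1 = ln
U1pp, U2pp = -1 / X1**2, -1 / X2**2     # U'' = -1/X**2 for log utility
det_H = -(P1**2) * U2pp - (P2**2) * U1pp
dX1_dP1 = (P1 * X1 * U2pp - P2**2 * lam) / det_H
print(dX1_dP1, -Y / (2 * P1**2))  # both -1.5: demand slopes down
```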
1.6.6 Cost Minimization
For which of you, desiring to build a tower, doth not first sit down and count the cost... Luke 14:28
Suppose a firm wishes to minimize costs C given by:

C = W L + R K

subject to the constraint that Q units are produced, where:

Q = F(L, K).

This leads to the Lagrangian:

L(λ, L, K) = W L + R K + λ(Q − F(L, K))

and the first-order conditions:

∂L(λ*, L*, K*)/∂λ = Q − F(L*, K*) = 0
∂L(λ*, L*, K*)/∂L = W − λ* ∂F(L*, K*)/∂L = 0
∂L(λ*, L*, K*)/∂K = R − λ* ∂F(L*, K*)/∂K = 0.
Here there are three endogenous variables λ*, L*, K* and three exogenous variables Q, W, R. The first-order conditions lead to three implicit functions:

g1(λ*, L*, K*, Q, W, R) = Q − F(L*, K*) = 0
g2(λ*, L*, K*, Q, W, R) = W − λ* ∂F(L*, K*)/∂L = 0
g3(λ*, L*, K*, Q, W, R) = R − λ* ∂F(L*, K*)/∂K = 0

which determine the reduced form:

λ* = λ*(Q, W, R)
L* = L*(Q, W, R)
K* = K*(Q, W, R).
L*(Q, W, R) and K*(Q, W, R) are the conditional factor demands for labour and capital. To simplify the notation define:

FL ≡ ∂F(L*, K*)/∂L,   FK ≡ ∂F(L*, K*)/∂K,
FLL ≡ ∂²F(L*, K*)/∂L²,   FLK ≡ ∂²F(L*, K*)/∂L∂K,   FKK ≡ ∂²F(L*, K*)/∂K²

in which case the second-order condition is that the matrix:

H = [  0     −FL        −FK      ]
    [ −FL   −λ* FLL    −λ* FLK   ]
    [ −FK   −λ* FLK    −λ* FKK   ]

satisfies:

det[H] < 0.
Taking the total differential we find that:

0dλ* − FL dL* − FK dK* + dQ = 0
−FL dλ* − λ* FLL dL* − λ* FLK dK* + dW = 0
−FK dλ* − λ* FLK dL* − λ* FKK dK* + dR = 0.

Suppose we wish to calculate ∂L*/∂W. Setting dR = dQ = 0 and changing the remaining d's to ∂'s we find that:

0∂λ* − FL ∂L* − FK ∂K* = 0
−FL ∂λ* − λ* FLL ∂L* − λ* FLK ∂K* + ∂W = 0
−FK ∂λ* − λ* FLK ∂L* − λ* FKK ∂K* = 0.
Writing this in matrix notation and dividing both sides by ∂W we obtain (the 3 × 3 matrix of coefficients is H):

[  0    −FL       −FK      ] [ ∂λ*/∂W ]   [  0 ]
[ −FL  −λ* FLL   −λ* FLK   ] [ ∂L*/∂W ] = [ −1 ]
[ −FK  −λ* FLK   −λ* FKK   ] [ ∂K*/∂W ]   [  0 ]

so that:

∂L*/∂W = det[ 0  0  −FK;  −FL  −1  −λ* FLK;  −FK  0  −λ* FKK ] / det[H] = (FK)² / det[H] < 0

since det[H] < 0 from the second-order conditions and (FK)² > 0.
As an exercise show that:

∂K*/∂R < 0

and that:

∂L*/∂R = ∂K*/∂W.
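These exercises can be illustrated numerically with an assumed technology F(L, K) = √(LK) (not from the text), whose conditional factor demands are L* = Q√(R/W) and K* = Q√(W/R):

```python
import math

Q, W, R = 4.0, 2.0, 8.0
eps = 1e-6
L = lambda W, R: Q * math.sqrt(R / W)   # conditional demands for the
K = lambda W, R: Q * math.sqrt(W / R)   # assumed F(L, K) = sqrt(L*K)
dK_dR = (K(W, R + eps) - K(W, R - eps)) / (2 * eps)
dL_dR = (L(W, R + eps) - L(W, R - eps)) / (2 * eps)
dK_dW = (K(W + eps, R) - K(W - eps, R)) / (2 * eps)
print(dK_dR < 0, abs(dL_dR - dK_dW) < 1e-6)  # both True
```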
Chapter 2
The Envelope Theorem
2.1 Unconstrained Optimization
2.1.1 Profit Maximization in the Short-Run I
Consider a firm with a production function Q = √L, so that the only factor of production is labour L. To simplify things a bit, instead of working with nominal profits given by:

Π = P Q − W L

we will work with real profits given by:

π = Π/P = Q − w L.

Real profits can be written as a function of L and the real wage w = W/P as:

π(L, w) = √L − w L.
Let us call π(L, w) the naive profit function. It is naive because it gives profits for any L, whether rational or irrational.

As an example of an irrational choice, consider the case where w = 1/2 and the firm hires a million workers, L = 1,000,000. Real profits are then:

π(1000000, 1/2) = √1000000 − (1/2) × 1000000 = −499,000.

Clearly this is not rational; the firm could do much better by simply shutting down its operations and setting L = 0, in which case it would receive zero profits since:

π(0, 1/2) = √0 − (1/2) × 0 = 0.
CHAPTER 2. THE ENVELOPE THEOREM 39
Of course in economics we do not believe that the firm will choose irrational values of L. Instead it will choose the profit-maximizing L* which satisfies the first-order condition:

∂π(L*, w)/∂L = 1/(2√L*) − w = 0.

This implicitly defines the labour demand curve L* = L*(w). In this case we can find L*(w) explicitly as:

L*(w) = (2w)^{-2}.

Thus facing a real wage of w = 1/2 the clever, or profit-maximizing, firm would hire

L*(1/2) = (2 × 1/2)^{-2} = 1

or one worker, in which case its profits would be:

π(1, 1/2) = √1 − (1/2) × 1 = 1/2.
Consider now replacing $L$ in $\pi(L, w)$ with $L^*(w) = (2w)^{-2}$. Denote this by $\pi^*(w)$, which I will call the clever profit function. This is given by:
$$\pi^*(w) = \pi(L^*(w), w) = \sqrt{L^*(w)} - wL^*(w) = \sqrt{(2w)^{-2}} - w(2w)^{-2} = \frac{1}{2w} - \frac{1}{4w}$$
and so:
$$\pi^*(w) = \frac{1}{4w}.$$
Note that since:
$$\pi^{*\prime\prime}(w) = \frac{1}{2w^3} > 0$$
$\pi^*(w)$ is convex. We will see later that the profit function is always convex.

$\pi^*(w)$ gives profits as a function of the real wage $w$ on the assumption that the firm hires the profit-maximizing amount of labour. For example, if $w = \frac{1}{2}$ the profit-maximizing firm will receive real profits equal to $\pi^*\left(\tfrac{1}{2}\right) = \frac{1}{4 \times \frac{1}{2}} = \tfrac{1}{2}$.
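The naive and clever profit functions are easy to check numerically. A minimal sketch, using nothing beyond the text's own example $Q = \sqrt{L}$:

```python
import math

# Naive profit for Q = sqrt(L): profits at ANY L, rational or not.
def naive_profit(L, w):
    return math.sqrt(L) - w * L

# Labour demand from the first-order condition: L*(w) = (2w)^(-2).
def L_star(w):
    return (2 * w) ** -2

# Clever profit: naive profit evaluated at the optimal L*(w).
def clever_profit(w):
    return naive_profit(L_star(w), w)

w = 0.5
print(L_star(w))                    # 1.0 (one worker)
print(clever_profit(w))             # 0.5 (= 1/(4w))
print(naive_profit(1_000_000, w))   # -499000.0 (the irrational choice)
```

The clever function reproduces $1/(4w)$, and the million-worker choice reproduces the $-499{,}000$ loss from the text.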
Consider taking the derivative of the naive profit function with respect to the real wage $w$. We obtain:
$$\frac{\partial \pi(L, w)}{\partial w} = \frac{\partial}{\partial w}\left(\sqrt{L} - wL\right) = -L.$$
Now suppose we do the same thing with the clever profit function. We find that:
$$\frac{d\pi^*(w)}{dw} = \frac{d}{dw}\left(\frac{1}{4w}\right) = -\frac{1}{4w^2} = -L^*(w).$$
Note the parallel between the two results, where we get $-L$ and $-L^*$ respectively. This is no coincidence but follows from the envelope theorem, which states that:
$$\frac{d\pi^*(w)}{dw} = \left.\frac{\partial \pi(L, w)}{\partial w}\right|_{L = L^*(w)} = \frac{\partial \pi(L^*(w), w)}{\partial w};$$
that is, to find the derivative of the clever function $\pi^*(w)$ with respect to $w$ you only need to take the derivative of the naive function $\pi(L, w)$ with respect to $w$ and then replace $L$ by $L^*(w)$.
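This envelope result can be confirmed with a quick finite-difference check (a sketch; the step size $h$ is arbitrary):

```python
# Check d(pi*)/dw = -L*(w) numerically, with pi*(w) = 1/(4w) and
# L*(w) = (2w)^(-2) from the sqrt(L) example above.
def clever_profit(w):
    return 1 / (4 * w)

def L_star(w):
    return (2 * w) ** -2

w, h = 0.5, 1e-6
lhs = (clever_profit(w + h) - clever_profit(w - h)) / (2 * h)  # d(pi*)/dw
rhs = -L_star(w)   # envelope theorem: partial pi/partial w at L = L*
print(lhs)   # approx -1.0
print(rhs)   # -1.0
```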
2.1.2 Envelope Theorem I

Let $f(y, x)$ be the naive version of a function to be optimized (i.e., either maximized or minimized) over $y$. Here $x$ is some exogenous variable. From the first-order conditions we have:
$$\frac{\partial f(y^*, x)}{\partial y} = 0.$$
This first-order condition implicitly defines the optimal value $y^*$ as a function of $x$, or:
$$y^* = y^*(x).$$
Suppose we replace $y$ in $f(y, x)$ with $y^*(x)$ to obtain the clever version of the function, so that:
$$f^*(x) = f(y^*(x), x).$$
To calculate the derivative of the clever function with respect to $x$, use the chain rule to obtain:
$$\frac{df^*(x)}{dx} = \left(\frac{\partial f(y^*(x), x)}{\partial y}\right)\frac{\partial y^*(x)}{\partial x} + \frac{\partial f(y^*(x), x)}{\partial x}.$$
However, by the first-order conditions:
$$\frac{\partial f(y^*(x), x)}{\partial y} = 0$$
and so the term in brackets is zero. We therefore have:
$$\frac{df^*(x)}{dx} = \left.\frac{\partial f(y, x)}{\partial x}\right|_{y = y^*(x)} = \frac{\partial f(y^*(x), x)}{\partial x}.$$
Thus the derivative of the clever version of the function is identical to the partial derivative of the naive version of the function with respect to $x$. We therefore have:

Theorem 11 (Envelope Theorem I): If $f(y, x)$ is the naive function and $f^*(x)$ is the clever function then:
$$\frac{df^*(x)}{dx} = \left.\frac{\partial f(y, x)}{\partial x}\right|_{y = y^*(x)} = \frac{\partial f(y^*(x), x)}{\partial x}.$$
2.1.3 Profit Maximization in the Short-Run II

Let us now consider the more general problem where real profits are given by:
$$\pi(L, w) = f(L) - wL$$
where instead of $f(L) = \sqrt{L}$ we simply assume that the short-run production function satisfies $f'(L) > 0$ and $f''(L) < 0$. Profit maximization requires that $L^*(w)$ satisfies:
$$\frac{\partial \pi(L^*(w), w)}{\partial L} = f'(L^*(w)) - w = 0$$
from which, using total differentials, we can show that:
$$\frac{dL^*(w)}{dw} = \frac{1}{f''(L^*(w))} < 0$$
so that the short-run labour demand curve slopes downward.

The clever profit function is then defined as:
$$\pi^*(w) = \pi(L^*(w), w) = f(L^*(w)) - wL^*(w).$$
From the envelope theorem, since:
$$\frac{\partial \pi(L, w)}{\partial w} = -L$$
one can obtain the negative of the demand for labour $L^*(w)$ by differentiating the clever profit function with respect to $w$:
$$\frac{d\pi^*(w)}{dw} = \left.\frac{\partial \pi(L, w)}{\partial w}\right|_{L = L^*(w)} = -L^*(w).$$
Furthermore the clever profit function is convex, since we have already shown that $\frac{dL^*(w)}{dw} < 0$ and hence:
$$\frac{d^2\pi^*(w)}{dw^2} = -\frac{dL^*(w)}{dw} > 0.$$
2.1.4 Profit Maximization in the Long-Run

We will now apply the envelope theorem to profit maximization in the long run. This will illustrate how the envelope theorem works when there are many endogenous and exogenous variables.

Consider a firm with production function $Q = F(L, K)$, where $Q$ is output, $L$ is labour and $K$ is the amount of capital. The firm faces three prices: $P$, $W$ and $R$. The naive profit function is:
$$\pi(L, K, P, W, R) = PF(L, K) - WL - RK.$$
Again $\pi(L, K, P, W, R)$ is naive because it gives profits for any choice of $L$ and $K$, even irrational choices that do not maximize profits.

Again, in economics we do not believe firms are irrational and will just choose any $L$ and $K$. Instead they will choose the $L^*$ and $K^*$ which maximize profits. These satisfy the first-order conditions:
$$P\frac{\partial F(L^*, K^*)}{\partial L} - W = 0$$
$$P\frac{\partial F(L^*, K^*)}{\partial K} - R = 0$$
which implicitly define:
$$L^* = L^*(P, W, R), \qquad K^* = K^*(P, W, R)$$
that is, the demand for labour $L^*(P, W, R)$ and the demand for capital $K^*(P, W, R)$.

If we replace $L$ and $K$ in $Q = F(L, K)$ with $L^*$ and $K^*$ then we obtain the firm's supply curve $Q^*(P, W, R)$, given by:
$$Q^*(P, W, R) = F(L^*(P, W, R), K^*(P, W, R)).$$
If we replace $L$ and $K$ in $\pi(L, K, P, W, R)$ with $L^*(P, W, R)$ and $K^*(P, W, R)$, then we get the clever profit function:
$$\pi^*(P, W, R) = \pi(L^*(P, W, R), K^*(P, W, R), P, W, R) = PF(L^*, K^*) - WL^* - RK^*.$$
Suppose we attempt to calculate a partial derivative of the clever profit function, say $\frac{\partial \pi^*(P, W, R)}{\partial W}$. At first sight it would appear that:
$$\frac{\partial \pi^*(P, W, R)}{\partial W} = \frac{\partial}{\partial W}\left(PF(L^*, K^*) - WL^* - RK^*\right) = -L^*$$
which looks almost the same as the derivative of the naive profit function:
$$\frac{\partial \pi(L, K, P, W, R)}{\partial W} = -L.$$
However, this line of reasoning is based on an error, since the profit-maximizing amount of labour $L^*(P, W, R)$ and capital $K^*(P, W, R)$ also depend on $W$. Nevertheless, this incorrect reasoning still gives the correct answer! This is, again, the envelope theorem.

Doing the differentiation properly we obtain the correct expression:
$$\frac{\partial \pi^*(P, W, R)}{\partial W} = \frac{\partial}{\partial W}\left(PF(L^*, K^*) - WL^* - RK^*\right) = -L^* + \left(P\frac{\partial F(L^*, K^*)}{\partial L} - W\right)\frac{\partial L^*}{\partial W} + \left(P\frac{\partial F(L^*, K^*)}{\partial K} - R\right)\frac{\partial K^*}{\partial W}.$$
However, by the first-order conditions $P\frac{\partial F(L^*, K^*)}{\partial L} - W = 0$ and $P\frac{\partial F(L^*, K^*)}{\partial K} - R = 0$. Thus the last two terms in brackets are zero and we are back to (what was actually correct, but now for the correct reason):
$$\frac{\partial \pi^*(P, W, R)}{\partial W} = -L^*(P, W, R)$$
or alternatively:
$$\frac{\partial \pi^*(P, W, R)}{\partial W} = \left.\frac{\partial \pi(L, K, P, W, R)}{\partial W}\right|_{\substack{L = L^*(P, W, R)\\ K = K^*(P, W, R)}}.$$
2.1.5 Envelope Theorem II

Let $f(y, x)$ be the naive version of a function to be optimized (either maximized or minimized) over $y$, where $y = [y_i]$ is an $n \times 1$ vector of endogenous variables and $x = [x_i]$ is an $m \times 1$ vector of exogenous variables.

From the first-order conditions we then have that the optimal value of $y$, denoted by $y^*$, will satisfy:
$$\frac{\partial f(y^*, x)}{\partial y_i} = 0 \quad \text{for } i = 1, 2, \ldots, n.$$
This implicitly defines the $y_i^*$'s as functions of the exogenous variables:
$$y_i^* = y_i^*(x_1, x_2, \ldots, x_m) \quad \text{for } i = 1, 2, \ldots, n.$$
Suppose we replace $y$ in $f(y, x)$ with $y^*(x)$ to obtain the clever version of the function, $f^*(x)$. Thus:
$$f^*(x) \equiv f(y^*(x), x)$$
or more explicitly:
$$f^*(x_1, x_2, \ldots, x_m) = f(y_1^*(x), y_2^*(x), \ldots, y_n^*(x), x_1, x_2, \ldots, x_m).$$
To calculate the partial derivative:
$$\frac{\partial f^*(x_1, x_2, \ldots, x_m)}{\partial x_i}$$
use the chain rule to obtain:
$$\frac{\partial f^*(x_1, x_2, \ldots, x_m)}{\partial x_i} = \left(\frac{\partial f(y^*(x), x)}{\partial y_1}\right)\frac{\partial y_1^*(x)}{\partial x_i} + \left(\frac{\partial f(y^*(x), x)}{\partial y_2}\right)\frac{\partial y_2^*(x)}{\partial x_i} + \cdots + \left(\frac{\partial f(y^*(x), x)}{\partial y_n}\right)\frac{\partial y_n^*(x)}{\partial x_i} + \frac{\partial f(y^*(x), x)}{\partial x_i}.$$
By the first-order conditions:
$$\frac{\partial f(y^*(x), x)}{\partial y_1} = \frac{\partial f(y^*(x), x)}{\partial y_2} = \cdots = \frac{\partial f(y^*(x), x)}{\partial y_n} = 0$$
so all of the terms in brackets are zero. We therefore have:

Theorem 12 (Envelope Theorem II): If $f(y, x)$ is the naive function and $f^*(x)$ is the clever function then:
$$\frac{\partial f^*(x_1, x_2, \ldots, x_m)}{\partial x_i} = \left.\frac{\partial f(y, x)}{\partial x_i}\right|_{y = y^*(x)} = \frac{\partial f(y^*(x), x)}{\partial x_i}.$$
Thus the partial derivative of the clever version of the function is identical to the partial derivative of the naive version of the function when we replace $y$ by the optimal $y^*(x)$.
2.1.6 A Recipe for Unconstrained Optimization

Given the original naive version of the function $f(y, x)$ and the clever version of the function $f^*(x) = f(y^*(x), x)$, suppose you wish to calculate:
$$\frac{\partial f^*(x)}{\partial x_i}.$$
Step 1: First calculate the partial derivative of the naive version of the function with respect to the $i$th exogenous variable, or:
$$\frac{\partial f(y, x)}{\partial x_i}.$$
Step 2: Replace all occurrences of the endogenous variables $y$ with the optimized values $y^*(x)$. This gives the correct answer, that is:
$$\frac{\partial f^*(x)}{\partial x_i} = \left.\frac{\partial f(y, x)}{\partial x_i}\right|_{y = y^*(x)} = \frac{\partial f(y^*(x), x)}{\partial x_i}.$$
2.1.7 The Profit Function

The following results, which relate the factor demand and supply curves to derivatives of the clever profit function, are collectively known as Hotelling's lemma.

The Demand for Labour

Let us use the envelope theorem recipe to calculate:
$$\frac{\partial \pi^*(P, W, R)}{\partial W}.$$
Step 1: Calculate the derivative of the naive profit function:
$$\frac{\partial \pi(L, K, P, W, R)}{\partial W} = \frac{\partial}{\partial W}\left(PF(L, K) - WL - RK\right) = -L.$$
Step 2: Replace all occurrences of $L$ and $K$ with $L^*$ and $K^*$ to obtain:
$$\frac{\partial \pi^*(P, W, R)}{\partial W} = -L^*(P, W, R)$$
which is the (negative of the) firm's demand curve for labour.
The Demand for Capital

To calculate:
$$\frac{\partial \pi^*(P, W, R)}{\partial R}:$$
Step 1: Calculate the partial derivative of the naive profit function with respect to $R$ to obtain:
$$\frac{\partial \pi(L, K, P, W, R)}{\partial R} = \frac{\partial}{\partial R}\left(PF(L, K) - WL - RK\right) = -K.$$
Step 2: Replace all occurrences of $L$ and $K$ with $L^*$ and $K^*$ to obtain:
$$\frac{\partial \pi^*(P, W, R)}{\partial R} = -K^*(P, W, R)$$
which is the (negative of the) firm's demand curve for capital.

The Supply Curve

To calculate:
$$\frac{\partial \pi^*(P, W, R)}{\partial P}:$$
Step 1: Calculate the derivative of the naive profit function with respect to $P$:
$$\frac{\partial \pi(L, K, P, W, R)}{\partial P} = \frac{\partial}{\partial P}\left(PF(L, K) - WL - RK\right) = F(L, K).$$
Step 2: Replace all occurrences of $L$ and $K$ with $L^*$ and $K^*$ to obtain:
$$\frac{\partial \pi^*(P, W, R)}{\partial P} = F(L^*(P, W, R), K^*(P, W, R)) = Q^*(P, W, R)$$
since $Q = F(L, K)$. This is then the firm's supply curve.
An Example

Suppose the firm's production function $Q = F(L, K)$ is Cobb-Douglas, or:
$$Q = L^\alpha K^\beta$$
where $\alpha + \beta < 1$ (the production function is homogeneous of degree $\alpha + \beta$). Then it can be shown that the clever profit function $\pi^*(P, W, R)$ is given by:
$$\pi^*(P, W, R) = (1 - \alpha - \beta)\,\alpha^{\frac{\alpha}{1-\alpha-\beta}}\,\beta^{\frac{\beta}{1-\alpha-\beta}}\,P^{\frac{1}{1-\alpha-\beta}}\,W^{\frac{-\alpha}{1-\alpha-\beta}}\,R^{\frac{-\beta}{1-\alpha-\beta}}.$$
Note that we can rewrite this as:
$$\pi^*(P, W, R) = BP^cW^dR^e$$
where:
$$B = (1 - \alpha - \beta)\,\alpha^{\frac{\alpha}{1-\alpha-\beta}}\,\beta^{\frac{\beta}{1-\alpha-\beta}}, \quad c = \frac{1}{1-\alpha-\beta}, \quad d = \frac{-\alpha}{1-\alpha-\beta}, \quad e = \frac{-\beta}{1-\alpha-\beta}$$
which also has the Cobb-Douglas functional form. Note that the profit function is homogeneous of degree 1 since:
$$c + d + e = \frac{1}{1-\alpha-\beta} + \frac{-\alpha}{1-\alpha-\beta} + \frac{-\beta}{1-\alpha-\beta} = 1$$
with $c \geq 0$, $d \leq 0$ and $e \leq 0$.

Sometimes we start with the production function, derive $L^*(P, W, R)$, $K^*(P, W, R)$ and $Q^*(P, W, R)$, and then $\pi^*(P, W, R)$. However, we can also start with $\pi^*(P, W, R)$ and from there derive $L^*(P, W, R)$, $K^*(P, W, R)$ and $Q^*(P, W, R)$.

For example, suppose the clever profit function is given by:
$$\pi^*(P, W, R) = 10P^{\frac{5}{2}}W^{-\frac{3}{4}}R^{-\frac{3}{4}}.$$
To calculate the firm's supply curve $Q^*(P, W, R)$, simply differentiate $\pi^*(P, W, R)$ with respect to $P$. That is:
$$Q^*(P, W, R) = \frac{\partial \pi^*(P, W, R)}{\partial P} = \frac{\partial}{\partial P}\left(10P^{\frac{5}{2}}W^{-\frac{3}{4}}R^{-\frac{3}{4}}\right) = 25P^{\frac{3}{2}}W^{-\frac{3}{4}}R^{-\frac{3}{4}}.$$
Note that $Q^*(P, W, R)$ is homogeneous of degree 0.

To calculate the firm's demand curve for labour $L^*(P, W, R)$, simply differentiate $\pi^*(P, W, R)$ with respect to $W$ and multiply by $-1$. That is:
$$L^*(P, W, R) = -\frac{\partial \pi^*(P, W, R)}{\partial W} = -\frac{\partial}{\partial W}\left(10P^{\frac{5}{2}}W^{-\frac{3}{4}}R^{-\frac{3}{4}}\right) = \frac{15}{2}P^{\frac{5}{2}}W^{-\frac{7}{4}}R^{-\frac{3}{4}}.$$
Note that $L^*(P, W, R)$ is homogeneous of degree 0.

To calculate the firm's demand curve for capital $K^*(P, W, R)$, simply differentiate $\pi^*(P, W, R)$ with respect to $R$ and multiply by $-1$. That is:
$$K^*(P, W, R) = -\frac{\partial \pi^*(P, W, R)}{\partial R} = -\frac{\partial}{\partial R}\left(10P^{\frac{5}{2}}W^{-\frac{3}{4}}R^{-\frac{3}{4}}\right) = \frac{15}{2}P^{\frac{5}{2}}W^{-\frac{3}{4}}R^{-\frac{7}{4}}.$$
Note that $K^*(P, W, R)$ is homogeneous of degree 0.
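Hotelling's lemma for this example can be verified by numerical differentiation. The sketch below recomputes $Q^*$, $L^*$ and $K^*$ from $\pi^*$ and also checks degree-0 homogeneity of $L^*$; the evaluation point is arbitrary.

```python
# Recover Q*, L*, K* from pi*(P,W,R) = 10 P^(5/2) W^(-3/4) R^(-3/4).
def pi_star(P, W, R):
    return 10 * P**2.5 * W**-0.75 * R**-0.75

def d(f, x, h=1e-6):   # central finite difference
    return (f(x + h) - f(x - h)) / (2 * h)

P, W, R = 2.0, 1.0, 1.0
Q = d(lambda p: pi_star(p, W, R), P)    # supply: 25 P^(3/2) W^(-3/4) R^(-3/4)
L = -d(lambda w: pi_star(P, w, R), W)   # labour demand: (15/2) P^(5/2) W^(-7/4) R^(-3/4)
K = -d(lambda r: pi_star(P, W, r), R)   # capital demand
print(Q, 25 * P**1.5)    # both approx 70.71
print(L, 7.5 * P**2.5)   # both approx 42.43
print(K, 7.5 * P**2.5)   # same as L here since W = R = 1

# Degree-0 homogeneity: doubling all prices leaves demand unchanged
L2 = -d(lambda w: pi_star(2 * P, w, 2 * R), 2 * W)
print(abs(L - L2) < 1e-3)   # True
```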
2.1.8 Homogeneous Functions

Discussion

In economics we often work with homogeneous functions. Sometimes these are production functions, when we wish to capture the idea of increasing, decreasing or constant returns to scale. Other times economic theory requires that certain functions be homogeneous. Thus profit and cost functions are always homogeneous of degree 1, demand and supply curves are always homogeneous of degree 0, while the marginal utility of income must be homogeneous of degree $-1$.

Definition

Let us review the definition of a homogeneous function.

Definition 13: A function $f(x_1, x_2, \ldots, x_n)$, or $f(x)$ where $x$ is an $n \times 1$ vector, is said to be homogeneous of degree $k$ if for any positive scalar $\lambda$:
$$f(\lambda x_1, \lambda x_2, \ldots, \lambda x_n) = \lambda^k f(x_1, x_2, \ldots, x_n)$$
or:
$$f(\lambda x) = \lambda^k f(x).$$

An Example

The classic example of a homogeneous function is the Cobb-Douglas functional form:
$$f(x_1, x_2, \ldots, x_n) = A(x_1)^{\alpha_1}(x_2)^{\alpha_2}\cdots(x_n)^{\alpha_n}$$
which is homogeneous of degree:
$$k = \alpha_1 + \alpha_2 + \cdots + \alpha_n$$
since:
$$f(\lambda x_1, \lambda x_2, \ldots, \lambda x_n) = A(\lambda x_1)^{\alpha_1}(\lambda x_2)^{\alpha_2}\cdots(\lambda x_n)^{\alpha_n} = A\lambda^{\alpha_1}(x_1)^{\alpha_1}\lambda^{\alpha_2}(x_2)^{\alpha_2}\cdots\lambda^{\alpha_n}(x_n)^{\alpha_n} = \lambda^{\alpha_1+\alpha_2+\cdots+\alpha_n}f(x_1, x_2, \ldots, x_n).$$
Thus $F(L, K) = L^{\frac{1}{3}}K^{\frac{1}{6}}$ is homogeneous of degree:
$$k = \frac{1}{3} + \frac{1}{6} = \frac{1}{2}.$$
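Homogeneity is easy to test numerically. The sketch below checks $F(\lambda L, \lambda K) = \lambda^{1/2}F(L, K)$ at an arbitrarily chosen point and scale factor:

```python
# F(L,K) = L^(1/3) K^(1/6) is homogeneous of degree 1/2:
# scaling both inputs by lam scales output by lam^0.5.
def F(L, K):
    return L ** (1 / 3) * K ** (1 / 6)

lam, L, K = 3.0, 2.0, 5.0
print(F(lam * L, lam * K))      # equals lam^0.5 * F(L, K)
print(lam ** 0.5 * F(L, K))
```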
Some Results for Homogeneous Functions

Theorem 14 (Euler's theorem): If $f(x)$ is homogeneous of degree $k$ then:
$$kf(x) = \frac{\partial f(x)}{\partial x_1}x_1 + \frac{\partial f(x)}{\partial x_2}x_2 + \cdots + \frac{\partial f(x)}{\partial x_n}x_n.$$
For example, if $F(L, K) = L^{\frac{1}{3}}K^{\frac{1}{6}}$ then you can verify that:
$$\frac{1}{2}F(L, K) = \frac{1}{2}L^{\frac{1}{3}}K^{\frac{1}{6}} = \frac{\partial F(L, K)}{\partial L}L + \frac{\partial F(L, K)}{\partial K}K = \frac{1}{3}L^{\frac{1}{3}-1}K^{\frac{1}{6}} \times L + \frac{1}{6}L^{\frac{1}{3}}K^{\frac{1}{6}-1} \times K.$$
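Euler's theorem for this example can be verified at a sample point (a quick numeric check, not a proof):

```python
# Euler's theorem for F(L,K) = L^(1/3) K^(1/6), homogeneous of degree 1/2:
# F_L * L + F_K * K should equal (1/2) * F(L, K).
def F(L, K):
    return L ** (1 / 3) * K ** (1 / 6)

L, K = 2.0, 5.0
FL = (1 / 3) * L ** (1 / 3 - 1) * K ** (1 / 6)   # partial wrt L
FK = (1 / 6) * L ** (1 / 3) * K ** (1 / 6 - 1)   # partial wrt K
print(FL * L + FK * K)   # equals 0.5 * F(L, K)
print(0.5 * F(L, K))
```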
It turns out that the partial derivative of a homogeneous function is also homogeneous, of degree one less than the original function. We have:

Theorem 15: If $f(x)$ is homogeneous of degree $k$ then $\frac{\partial f(x)}{\partial x_i}$ is homogeneous of degree $k - 1$.

For example, if $F(L, K) = L^{\frac{1}{3}}K^{\frac{1}{6}}$, which is homogeneous of degree $k = \frac{1}{2}$, then you can verify that:
$$\frac{\partial F(L, K)}{\partial L} = \frac{1}{3}L^{-\frac{2}{3}}K^{\frac{1}{6}}$$
is homogeneous of degree $k - 1 = -\frac{1}{2}$.
2.1.9 Concavity and Convexity

Discussion: We have seen that concavity and convexity are of fundamental importance in economics. It turns out that there are certain difficulties with the notions of concavity and convexity that we have used so far.

For example, we have said that a function is (globally) convex if $f''(x) > 0$ for all $x$ in the domain of $f$. Consider the function:
$$y = |x|$$
[Figure: plot of $y = |x|$ for $-4 \le x \le 4$.]

This function is clearly valley-like and hence we would like to say that it is convex. However, for this function $f''(x) = 0$ for $x \neq 0$, and $f''(x)$ is not defined when $x = 0$, so according to the calculus definition of convexity the function is not convex.

Another problem arises with functions such as:
$$f(x) = x^4$$
[Figure: plot of $y = x^4$ for $-4 \le x \le 4$.]

This is also clearly valley-like. For $x \neq 0$ we have $f''(x) = 12x^2 > 0$, but $f''(0) = 0$, so the definition of convexity we have used fails here as well.
For these and other reasons we will now discuss a more satisfactory definition of concavity and convexity. It turns out that it is possible to define concavity and convexity without resorting to calculus. The basic idea is that if a function is convex, then the line connecting any two points on the graph of the function must lie everywhere above the function, while if it is concave the line must lie everywhere below. This leads us to the following definitions of concavity and convexity.

The Definition of Convexity

Definition 16: A function $f(x)$ is convex if and only if for all $x_1$ and $x_2$ in the domain of $f(x)$:
$$f(\lambda x_1 + (1 - \lambda)x_2) \leq \lambda f(x_1) + (1 - \lambda)f(x_2)$$
where $\lambda$ is any scalar with $0 \leq \lambda \leq 1$.

We have the following:

Theorem 17: If $f(x)$ has first and second derivatives then $f(x)$ is convex if and only if the Hessian of $f(x)$ is positive semi-definite for all $x$.

The Definition of Strict Convexity

Definition 18: The function $f(x)$ is strictly convex if and only if for all $x_1 \neq x_2$ in the domain of $f(x)$:
$$f(\lambda x_1 + (1 - \lambda)x_2) < \lambda f(x_1) + (1 - \lambda)f(x_2)$$
where $\lambda$ is any scalar with $0 < \lambda < 1$.

Note that unlike the definition of convexity we exclude $\lambda = 0$ and $\lambda = 1$ in the definition of strict convexity and rule out identical $x_1$ and $x_2$.

We have the following results:

Theorem 19: If $f(x)$ is strictly convex then it is convex.

A function may be convex without being strictly convex if it has linear segments. For example, it can be shown that $y = |x|$ is convex but not strictly convex because it is linear to the right and left of the origin. You can verify this yourself by looking at the graph of $y = |x|$.

A strictly convex function always has valley-like curvature. For example, $y = x^2$ or $y = x^4$ is strictly convex.
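The chord definition can be tested directly in code. The sketch below computes the gap $\lambda f(x_1) + (1-\lambda)f(x_2) - f(\lambda x_1 + (1-\lambda)x_2)$, which is zero on a linear segment of $|x|$ but strictly positive for $x^4$:

```python
# Chord test: for a convex f the gap below is >= 0; strict convexity
# demands a strictly positive gap whenever x1 != x2 and 0 < t < 1.
def chord_gap(f, x1, x2, t):
    return t * f(x1) + (1 - t) * f(x2) - f(t * x1 + (1 - t) * x2)

# |x| is convex but NOT strictly convex: zero gap on a linear segment.
print(chord_gap(abs, 1.0, 3.0, 0.5))                  # 0.0
# x^4 is strictly convex: positive gap for distinct points.
print(chord_gap(lambda x: x ** 4, -1.0, 2.0, 0.5))    # 8.4375 > 0
```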
Theorem 20: If the Hessian of $f(x)$ is everywhere positive definite, then $f(x)$ is strictly convex.

The Definition of Concavity

Definition 21: The function $f(x)$ is concave if and only if for all $x_1$ and $x_2$ in the domain of $f(x)$:
$$f(\lambda x_1 + (1 - \lambda)x_2) \geq \lambda f(x_1) + (1 - \lambda)f(x_2)$$
where $\lambda$ is any scalar with $0 \leq \lambda \leq 1$.

Theorem 22: If $f(x)$ has first and second derivatives then $f(x)$ is concave if and only if the Hessian of $f(x)$ is negative semi-definite for all $x$.
The Definition of Strict Concavity

Definition 23: $f(x)$ is strictly concave if and only if for all $x_1 \neq x_2$ in the domain of $f(x)$:
$$f(\lambda x_1 + (1 - \lambda)x_2) > \lambda f(x_1) + (1 - \lambda)f(x_2)$$
where $\lambda$ is any scalar with $0 < \lambda < 1$.

Note that unlike the definition of concavity we exclude $\lambda = 0$ and $\lambda = 1$ in the definition of strict concavity and rule out identical $x_1$ and $x_2$.

Theorem 24: If $f(x)$ is strictly concave then it is concave.

A function may be concave without being strictly concave if it has linear segments. For example, it can be shown that $y = -|x|$ is concave but not strictly concave because it is linear to the right and left of the origin.

A linear function is both concave and convex but is neither strictly concave nor strictly convex.

A strictly concave function always has mountain-like curvature. For example, $y = -x^2$ or $y = -x^4$ is strictly concave.

Theorem 25: If the Hessian of $f(x)$ is everywhere negative definite, then $f(x)$ is strictly concave.
2.1.10 Concavity, Convexity and Homogeneity of $f^*(x)$

In economics the exogenous variables, or the $x$'s, are often prices. This means that the naive function $f(y, x)$ is often a linear function of the $x$'s or prices and so can be written as:
$$f(y, x) = \sum_{i=1}^{m} x_i\phi_i(y) = x^T\phi(y) \qquad (2.1)$$
where $\phi(y) = [\phi_i(y)]$ is an $m \times 1$ vector. This linearity of the naive function has important implications for the clever function $f^*(x)$. We have the following results:

Theorem 26: If the naive function $f(y, x)$ can be written as:
$$f(y, x) = x^T\phi(y)$$
then the clever function $f^*(x)$ is homogeneous of degree 1, so that $f^*(\lambda x) = \lambda f^*(x)$ for all positive $\lambda$. Furthermore $y^*(x)$ is homogeneous of degree 0; that is, $y^*(\lambda x) = y^*(x)$ for all $\lambda > 0$.

Theorem 27: If the naive function $f(y, x) = x^T\phi(y)$ is being maximized, the clever function $f^*(x)$ is convex.

Theorem 28: If the naive function $f(y, x) = x^T\phi(y)$ is being minimized, the clever function $f^*(x)$ is concave.
Proof of Homogeneity

We first prove that $f^*(x)$, given by:
$$f^*(x) = x^T\phi(y^*(x))$$
is homogeneous of degree 1, and that $y^*(x)$, the value of $y$ which maximizes or minimizes $f(y, x)$, is homogeneous of degree 0. (See page 48 for a review of homogeneity.)

First, note that the naive function is homogeneous of degree 1 in $x$ since:
$$f(y, \lambda x) = (\lambda x)^T\phi(y) = \lambda\left(x^T\phi(y)\right) = \lambda f(y, x).$$
Therefore, since $\lambda > 0$, it follows that if $y^*(x)$ maximizes or minimizes $f(y, x)$ it also maximizes or minimizes $f(y, \lambda x)$, and so:
$$y^*(\lambda x) = y^*(x).$$
Thus $y^*(x)$ is homogeneous of degree 0. Furthermore:
$$f^*(\lambda x) = (\lambda x)^T\phi(y^*(\lambda x)) = (\lambda x)^T\phi(y^*(x)) = \lambda x^T\phi(y^*(x)) = \lambda f^*(x)$$
where the second equality follows by the homogeneity of $y^*(x)$. This proves that $f^*(x)$ is homogeneous of degree 1.
Proof that $f^*(x)$ is convex when $f(y, x)$ is being maximized

Suppose the naive function $f(y, x)$ is being maximized. Let $x_1$ and $x_2$ (which are $m \times 1$ vectors) be two values of $x$ and let $x_3 = \lambda x_1 + (1 - \lambda)x_2$ (where $0 \leq \lambda \leq 1$) be a convex combination of $x_1$ and $x_2$. In order to prove that $f^*(x)$ is convex (see page 52 for a review of convexity) we need to show that:
$$f^*(x_3) \leq \lambda f^*(x_1) + (1 - \lambda)f^*(x_2).$$
We have then that:
$$f^*(x_3) = (\lambda x_1 + (1 - \lambda)x_2)^T\phi(y^*(x_3)) = \lambda x_1^T\phi(y^*(x_3)) + (1 - \lambda)x_2^T\phi(y^*(x_3)).$$
However:
$$x_1^T\phi(y^*(x_3)) = f(y^*(x_3), x_1) \leq f(y^*(x_1), x_1) = x_1^T\phi(y^*(x_1)) = f^*(x_1)$$
since $y^*(x_1)$ maximizes $f(y, x_1)$. Similarly we have:
$$x_2^T\phi(y^*(x_3)) = f(y^*(x_3), x_2) \leq f(y^*(x_2), x_2) = x_2^T\phi(y^*(x_2)) = f^*(x_2)$$
since $y^*(x_2)$ maximizes $f(y, x_2)$. We therefore conclude that:
$$f^*(x_3) = \lambda x_1^T\phi(y^*(x_3)) + (1 - \lambda)x_2^T\phi(y^*(x_3)) \leq \lambda f^*(x_1) + (1 - \lambda)f^*(x_2)$$
so that $f^*(x)$ is convex.
Proof that $f^*(x)$ is concave when $f(y, x)$ is being minimized

Suppose the naive function $f(y, x)$ is being minimized. Let $x_1$ and $x_2$ (which are $m \times 1$ vectors) be two values of $x$ and let $x_3 = \lambda x_1 + (1 - \lambda)x_2$ (where $0 \leq \lambda \leq 1$) be a convex combination of $x_1$ and $x_2$. In order to prove that $f^*(x)$ is concave we need to show that (see page 21):
$$f^*(x_3) \geq \lambda f^*(x_1) + (1 - \lambda)f^*(x_2).$$
We have then that:
$$f^*(x_3) = (\lambda x_1 + (1 - \lambda)x_2)^T\phi(y^*(x_3)) = \lambda x_1^T\phi(y^*(x_3)) + (1 - \lambda)x_2^T\phi(y^*(x_3)).$$
However:
$$x_1^T\phi(y^*(x_3)) = f(y^*(x_3), x_1) \geq f(y^*(x_1), x_1) = x_1^T\phi(y^*(x_1)) = f^*(x_1)$$
since $y^*(x_1)$ minimizes $f(y, x_1)$. Similarly:
$$x_2^T\phi(y^*(x_3)) = f(y^*(x_3), x_2) \geq f(y^*(x_2), x_2) = x_2^T\phi(y^*(x_2)) = f^*(x_2)$$
since $y^*(x_2)$ minimizes $f(y, x_2)$. We therefore conclude that:
$$f^*(x_3) = \lambda x_1^T\phi(y^*(x_3)) + (1 - \lambda)x_2^T\phi(y^*(x_3)) \geq \lambda f^*(x_1) + (1 - \lambda)f^*(x_2)$$
so that $f^*(x)$ is concave.
2.1.11 Properties of the Profit Function

Let us now apply the results of the previous section to the profit function. These results apply since the naive profit function is a linear function of the exogenous variables, which are the prices $P$, $W$ and $R$, since:
$$\pi(L, K, P, W, R) = PF(L, K) - WL - RK = x^T\phi(y)$$
where:
$$x = \begin{bmatrix} P \\ W \\ R \end{bmatrix}, \quad y = \begin{bmatrix} L \\ K \end{bmatrix}, \quad \phi(y) = \begin{bmatrix} F(L, K) \\ -L \\ -K \end{bmatrix}.$$
From this it follows that (see page 53):

Theorem 29: $\pi^*(P, W, R)$ is homogeneous of degree 1.

Theorem 30: $\pi^*(P, W, R)$ is convex.

Theorem 31: $L^*(P, W, R)$, $K^*(P, W, R)$ and $Q^*(P, W, R)$ are homogeneous of degree 0.

The fact that $L^*(P, W, R)$, $K^*(P, W, R)$ and $Q^*(P, W, R)$ are homogeneous of degree 0 is a basic requirement of rationality: the firm is not subject to money illusion. For example, if a 100% inflation doubles $P$, $W$ and $R$, then a rational firm does not change its production or factor demands, or:
$$L^*(2P, 2W, 2R) = L^*(P, W, R)$$
$$K^*(2P, 2W, 2R) = K^*(P, W, R)$$
$$Q^*(2P, 2W, 2R) = Q^*(P, W, R).$$
We can use the convexity of $\pi^*(P, W, R)$ to show that the firm's factor demand curves slope downwards and the firm's supply curve slopes upwards. Since $\pi^*(P, W, R)$ is convex (and assuming first and second derivatives exist), the Hessian given by:
$$H(P, W, R) = \begin{bmatrix} \frac{\partial^2\pi^*}{\partial P^2} & \frac{\partial^2\pi^*}{\partial P\partial W} & \frac{\partial^2\pi^*}{\partial P\partial R} \\[4pt] \frac{\partial^2\pi^*}{\partial P\partial W} & \frac{\partial^2\pi^*}{\partial W^2} & \frac{\partial^2\pi^*}{\partial W\partial R} \\[4pt] \frac{\partial^2\pi^*}{\partial P\partial R} & \frac{\partial^2\pi^*}{\partial W\partial R} & \frac{\partial^2\pi^*}{\partial R^2} \end{bmatrix}$$
is positive semi-definite (see page 52). Since a positive semi-definite matrix has non-negative diagonal elements, we have:
$$\frac{\partial^2\pi^*(P, W, R)}{\partial P^2} \geq 0, \quad \frac{\partial^2\pi^*(P, W, R)}{\partial W^2} \geq 0, \quad \frac{\partial^2\pi^*(P, W, R)}{\partial R^2} \geq 0. \qquad (2.2)$$
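For the explicit example $\pi^*(P,W,R) = 10P^{5/2}W^{-3/4}R^{-3/4}$ from earlier, these non-negative own second derivatives can be checked numerically (a sketch; the evaluation point and step size are arbitrary):

```python
# Own-price second derivatives of pi*(P,W,R) = 10 P^(5/2) W^(-3/4) R^(-3/4)
# should all be nonnegative, as convexity of the profit function requires.
def pi_star(P, W, R):
    return 10 * P**2.5 * W**-0.75 * R**-0.75

def d2(f, x, h=1e-4):   # central second difference
    return (f(x + h) - 2 * f(x) + f(x - h)) / h**2

P, W, R = 2.0, 1.0, 1.5
print(d2(lambda p: pi_star(p, W, R), P) >= 0)   # True
print(d2(lambda w: pi_star(P, w, R), W) >= 0)   # True
print(d2(lambda r: pi_star(P, W, r), R) >= 0)   # True
```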
The Demand Curve for Labour Slopes Downwards

Let us begin by showing that the firm's labour demand curve is downward sloping. Differentiate both sides of:
$$\frac{\partial \pi^*(P, W, R)}{\partial W} = -L^*(P, W, R)$$
with respect to $W$ to obtain:
$$-\frac{\partial L^*(P, W, R)}{\partial W} = \frac{\partial^2\pi^*(P, W, R)}{\partial W^2} \geq 0$$
where the inequality follows from (2.2). It therefore follows that:¹
$$\frac{\partial L^*(P, W, R)}{\partial W} \leq 0$$
so that the firm's demand curve for labour slopes downwards.
The Demand Curve for Capital Slopes Downwards

A similar argument shows that the demand for capital is downward sloping. Differentiate both sides of:
$$\frac{\partial \pi^*(P, W, R)}{\partial R} = -K^*(P, W, R)$$
with respect to $R$ to obtain:
$$-\frac{\partial K^*(P, W, R)}{\partial R} = \frac{\partial^2\pi^*(P, W, R)}{\partial R^2} \geq 0$$
where the inequality follows from (2.2). It therefore follows that:
$$\frac{\partial K^*(P, W, R)}{\partial R} \leq 0$$
so the demand curve for capital slopes downwards.
The Supply Curve Slopes Upwards

To show that the firm's supply curve slopes upwards, differentiate both sides of:
$$\frac{\partial \pi^*(P, W, R)}{\partial P} = Q^*(P, W, R)$$
with respect to $P$ to obtain:
$$\frac{\partial Q^*(P, W, R)}{\partial P} = \frac{\partial^2\pi^*(P, W, R)}{\partial P^2} \geq 0$$
where the inequality follows from (2.2). It therefore follows that:
$$\frac{\partial Q^*(P, W, R)}{\partial P} \geq 0$$
so the firm's supply curve slopes upwards.

¹Recall that if $-X \geq a$ then $X \leq -a$; for example, from $-5 \geq -7$ it follows that $5 \leq 7$. Multiplying both sides of an inequality by a negative number reverses the inequality.
2.1.12 Some Symmetry Results

There are a number of symmetry results that can also be demonstrated using these techniques. Differentiate both sides of:
$$\frac{\partial \pi^*(P, W, R)}{\partial W} = -L^*(P, W, R)$$
with respect to $R$ to obtain:
$$-\frac{\partial L^*(P, W, R)}{\partial R} = \frac{\partial^2\pi^*(P, W, R)}{\partial R\partial W}.$$
Now differentiate both sides of:
$$\frac{\partial \pi^*(P, W, R)}{\partial R} = -K^*(P, W, R)$$
with respect to $W$ to obtain:
$$-\frac{\partial K^*(P, W, R)}{\partial W} = \frac{\partial^2\pi^*(P, W, R)}{\partial W\partial R}.$$
By Young's theorem $\frac{\partial^2\pi^*(P, W, R)}{\partial R\partial W} = \frac{\partial^2\pi^*(P, W, R)}{\partial W\partial R}$, and so we can conclude that:
$$\frac{\partial L^*(P, W, R)}{\partial R} = \frac{\partial K^*(P, W, R)}{\partial W}.$$
As an exercise, use the same approach to show that:
$$\frac{\partial Q^*(P, W, R)}{\partial R} = -\frac{\partial K^*(P, W, R)}{\partial P}, \qquad \frac{\partial Q^*(P, W, R)}{\partial W} = -\frac{\partial L^*(P, W, R)}{\partial P}.$$
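The symmetry result can be illustrated with the factor demands of the earlier explicit example (a numeric check of Young's theorem for that $\pi^*$; the evaluation point is arbitrary):

```python
# For pi* = 10 P^(5/2) W^(-3/4) R^(-3/4) the factor demands are
# L* = (15/2) P^(5/2) W^(-7/4) R^(-3/4) and
# K* = (15/2) P^(5/2) W^(-3/4) R^(-7/4); check dL*/dR = dK*/dW.
def L_star(P, W, R):
    return 7.5 * P**2.5 * W**-1.75 * R**-0.75

def K_star(P, W, R):
    return 7.5 * P**2.5 * W**-0.75 * R**-1.75

def d(f, x, h=1e-6):   # central finite difference
    return (f(x + h) - f(x - h)) / (2 * h)

P, W, R = 2.0, 1.0, 1.5
dL_dR = d(lambda r: L_star(P, W, r), R)
dK_dW = d(lambda w: K_star(P, w, R), W)
print(dL_dR, dK_dW)   # equal (and both negative): the cross effects coincide
```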
2.2 Constrained Optimization

2.2.1 Envelope Theorem III: Constrained Optimization

The envelope theorem also works for constrained optimization problems where we are using Lagrangians. Let $f(y, x)$ be the naive function which is to be optimized, where $y = [y_i]$ is an $n \times 1$ vector (of variables over which we optimize $f(y, x)$) and $x = [x_i]$ is an $m \times 1$ vector of exogenous variables. The naive function is being optimized subject to a constraint:
$$h(y, x) = 0.$$
This leads to the Lagrangian:
$$\mathcal{L}(\lambda, y, x) = f(y, x) + \lambda h(y, x).$$
As before, we call $f(y, x)$ the naive version of the function because it does not restrict the choice of $y$ to be optimal or rational.

The optimal value of $y$, say $y^*$, along with the Lagrange multiplier $\lambda^*$, satisfies the first-order conditions:
$$\frac{\partial\mathcal{L}(\lambda^*(x), y^*(x), x)}{\partial\lambda} = h(y^*, x) = 0,$$
$$\frac{\partial\mathcal{L}(\lambda^*(x), y^*(x), x)}{\partial y_i} = \frac{\partial f(y^*, x)}{\partial y_i} + \lambda^*\frac{\partial h(y^*, x)}{\partial y_i} = 0 \quad \text{for } i = 1, 2, \ldots, n.$$
This implicitly defines $n + 1$ functions:
$$\lambda^* = \lambda^*(x), \qquad y_i^* = y_i^*(x) \quad \text{for } i = 1, 2, \ldots, n.$$
We can replace $y$ with $y^*(x)$ in $f(y, x)$ to get the clever version of the function:
$$f^*(x) = f(y^*(x), x).$$
We could also replace $\lambda$ and $y$ in the Lagrangian $\mathcal{L}(\lambda, y, x)$ with $\lambda^*(x)$ and $y^*(x)$ to obtain the clever Lagrangian $\mathcal{L}(\lambda^*(x), y^*(x), x)$. Since $y^*(x)$ satisfies the constraint $h(y^*(x), x) = 0$, it follows that:
$$\mathcal{L}(\lambda^*(x), y^*(x), x) = f(y^*(x), x) + \lambda^*(x)h(y^*(x), x) = f(y^*(x), x) = f^*(x).$$
We thus have:
$$f^*(x) = \mathcal{L}(\lambda^*(x), y^*(x), x).$$
Suppose now we wish to calculate:
$$\frac{\partial f^*(x_1, x_2, \ldots, x_m)}{\partial x_i}.$$
Using the chain rule and differentiating both sides of:
$$f^*(x_1, x_2, \ldots, x_m) = \mathcal{L}(\lambda^*(x), y^*(x), x)$$
with respect to $x_i$, we find that:
$$\frac{\partial f^*(x_1, x_2, \ldots, x_m)}{\partial x_i} = \left(\frac{\partial\mathcal{L}(\lambda^*(x), y^*(x), x)}{\partial\lambda}\right)\frac{\partial\lambda^*(x)}{\partial x_i} + \left(\frac{\partial\mathcal{L}(\lambda^*(x), y^*(x), x)}{\partial y_1}\right)\frac{\partial y_1^*(x)}{\partial x_i} + \cdots + \left(\frac{\partial\mathcal{L}(\lambda^*(x), y^*(x), x)}{\partial y_n}\right)\frac{\partial y_n^*(x)}{\partial x_i} + \frac{\partial\mathcal{L}(\lambda^*(x), y^*(x), x)}{\partial x_i}.$$
However, by the first-order conditions:
$$\frac{\partial\mathcal{L}(\lambda^*(x), y^*(x), x)}{\partial\lambda} = \frac{\partial\mathcal{L}(\lambda^*(x), y^*(x), x)}{\partial y_1} = \cdots = \frac{\partial\mathcal{L}(\lambda^*(x), y^*(x), x)}{\partial y_n} = 0$$
so that all the derivatives in brackets above are zero. We therefore have:

Theorem 32 (Envelope Theorem III): Let $f(y, x)$ be the naive function which is optimized subject to a constraint $h(y, x) = 0$ with Lagrangian:
$$\mathcal{L}(\lambda, y, x) = f(y, x) + \lambda h(y, x).$$
Then:
$$\frac{\partial f^*(x_1, x_2, \ldots, x_m)}{\partial x_i} = \left.\frac{\partial\mathcal{L}(\lambda, y, x)}{\partial x_i}\right|_{\substack{\lambda = \lambda^*(x)\\ y = y^*(x)}} = \frac{\partial\mathcal{L}(\lambda^*(x), y^*(x), x)}{\partial x_i}.$$
The envelope theorem for constrained optimization states that to calculate the derivative of the clever version of the function $f^*(x)$ with respect to $x_i$, first take the derivative of the Lagrangian $\mathcal{L}(\lambda, y, x)$ with respect to $x_i$ and then evaluate it at $\lambda^*(x)$ and $y^*(x)$.
2.2.2 A Recipe for Constrained Optimization

Given the original naive version of the function $f(y, x)$, the constraint $h(y, x) = 0$, the optimal value $y^*(x)$ and the clever version of the function $f^*(x) = f(y^*(x), x)$, suppose you wish to calculate:
$$\frac{\partial f^*(x)}{\partial x_i}.$$
The Lagrangian for the constrained optimization problem is:
$$\mathcal{L}(\lambda, y, x) = f(y, x) + \lambda h(y, x).$$
The recipe proceeds as follows:

Step 1: Calculate the partial derivative of the Lagrangian $\mathcal{L}(\lambda, y, x)$ with respect to the $i$th exogenous variable, or:
$$\frac{\partial\mathcal{L}(\lambda, y, x)}{\partial x_i} = \frac{\partial f(y, x)}{\partial x_i} + \lambda\frac{\partial h(y, x)}{\partial x_i}.$$
Step 2: Replace all occurrences of $\lambda$ and $y$ with $\lambda^*(x)$ and $y^*(x)$. This gives the desired result, that is:
$$\frac{\partial f^*(x)}{\partial x_i} = \left.\frac{\partial\mathcal{L}(\lambda, y, x)}{\partial x_i}\right|_{\substack{\lambda = \lambda^*(x)\\ y = y^*(x)}} = \frac{\partial f(y^*(x), x)}{\partial x_i} + \lambda^*(x)\frac{\partial h(y^*(x), x)}{\partial x_i}.$$
2.2.3 Concavity, Convexity and Homogeneity of $f^*(x)$

As with unconstrained optimization, the $x$'s in the naive function $f(y, x)$ are often prices, so that $f(y, x)$ is often a linear function of the $x$'s and can be written as:
$$f(y, x) = \sum_{i=1}^{m} x_i\phi_i(y) = x^T\phi(y) \qquad (2.3)$$
where $\phi(y) = [\phi_i(y)]$ is an $m \times 1$ vector.

We saw for unconstrained optimization that this linearity had important implications for the clever function $f^*(x)$. The same results follow for constrained optimization as long as the constraint does not depend on $x$, or:
$$h(y, x) = h(y).$$
We then have the following results:

Theorem 33: If the naive function $f(y, x)$ can be written as:
$$f(y, x) = x^T\phi(y)$$
and the constraint does not depend on $x$, so that:
$$h(y, x) = h(y)$$
then the clever function $f^*(x)$ is homogeneous of degree 1, so that $f^*(\lambda x) = \lambda f^*(x)$ for all positive $\lambda$. Furthermore $y^*(x)$ is homogeneous of degree 0; that is, $y^*(\lambda x) = y^*(x)$ for all $\lambda > 0$.

Theorem 34: If the naive function $f(y, x) = x^T\phi(y)$ is being maximized, the clever function $f^*(x)$ is convex.

Theorem 35: If the naive function $f(y, x) = x^T\phi(y)$ is being minimized, the clever function $f^*(x)$ is concave.

Proof of Homogeneity

We first prove that $f^*(x)$, given by:
$$f^*(x) = x^T\phi(y^*(x))$$
is homogeneous of degree 1, and that $y^*(x)$, the value of $y$ which maximizes or minimizes $f(y, x)$, is homogeneous of degree 0 (see page 48 to review homogeneity).

First, note that the naive function is homogeneous of degree 1 in $x$ since:
$$f(y, \lambda x) = (\lambda x)^T\phi(y) = \lambda\left(x^T\phi(y)\right) = \lambda f(y, x).$$
Therefore, since $\lambda > 0$ and since $h(y^*(x)) = 0$ (that is, $y^*(x)$ satisfies the constraint), it follows that if $y^*(x)$ maximizes or minimizes $f(y, x)$ subject to $h(y) = 0$ it also maximizes or minimizes $f(y, \lambda x)$ subject to $h(y) = 0$, and so:
$$y^*(\lambda x) = y^*(x).$$
Thus $y^*(x)$ is homogeneous of degree 0. Furthermore:
$$f^*(\lambda x) = (\lambda x)^T\phi(y^*(\lambda x)) = (\lambda x)^T\phi(y^*(x)) = \lambda x^T\phi(y^*(x)) = \lambda f^*(x)$$
where the second equality follows by the homogeneity of $y^*(x)$. This proves that $f^*(x)$ is homogeneous of degree 1.

Proof that $f^*(x)$ is convex when $f(y, x)$ is being maximized

Suppose the naive function $f(y, x)$ is being maximized. Let $x_1$ and $x_2$ (which are $m \times 1$ vectors) be two values of $x$ and let $x_3 = \lambda x_1 + (1 - \lambda)x_2$ (where $0 \leq \lambda \leq 1$) be a convex combination of $x_1$ and $x_2$. In order to prove that $f^*(x)$ is convex we need to show that (see page 52):
$$f^*(x_3) \leq \lambda f^*(x_1) + (1 - \lambda)f^*(x_2).$$
We have then that:
$$f^*(x_3) = (\lambda x_1 + (1 - \lambda)x_2)^T\phi(y^*(x_3)) = \lambda x_1^T\phi(y^*(x_3)) + (1 - \lambda)x_2^T\phi(y^*(x_3)).$$
However:
$$x_1^T\phi(y^*(x_3)) = f(y^*(x_3), x_1) \leq f(y^*(x_1), x_1) = x_1^T\phi(y^*(x_1)) = f^*(x_1)$$
since $y^*(x_1)$ maximizes $f(y, x_1)$ subject to $h(y) = 0$ and $h(y^*(x_3)) = 0$. Similarly:
$$x_2^T\phi(y^*(x_3)) = f(y^*(x_3), x_2) \leq f(y^*(x_2), x_2) = x_2^T\phi(y^*(x_2)) = f^*(x_2)$$
since $y^*(x_2)$ maximizes $f(y, x_2)$ subject to $h(y) = 0$ and $h(y^*(x_3)) = 0$. We therefore conclude that:
$$f^*(x_3) = \lambda x_1^T\phi(y^*(x_3)) + (1 - \lambda)x_2^T\phi(y^*(x_3)) \leq \lambda f^*(x_1) + (1 - \lambda)f^*(x_2)$$
so that $f^*(x)$ is convex.
Proof that $f^*(x)$ is concave when $f(y, x)$ is being minimized

The proof follows the same reasoning as above. You may want to do it as an exercise.

2.2.4 Cost Minimization

The naive cost function is given by:
$$C = WL + RK$$
which tells us what it costs the firm to hire $L$ workers and $K$ units of capital. The production function is given by:
$$Q = F(L, K)$$
which gives the technology available to the firm.

Any profit-maximizing firm must minimize the cost of producing a given level of output $Q$. Consider the problem of minimizing the cost of producing $Q$ units of output. The problem of minimizing cost subject to the constraint that $Q$ units be produced leads to the Lagrangian:
$$\mathcal{L}(\lambda, L, K, Q, W, R) = WL + RK + \lambda(Q - F(L, K))$$
with first-order conditions given by:
$$\frac{\partial\mathcal{L}(\lambda^*, L^*, K^*, Q, W, R)}{\partial\lambda} = Q - F(L^*, K^*) = 0$$
$$\frac{\partial\mathcal{L}(\lambda^*, L^*, K^*, Q, W, R)}{\partial L} = W - \lambda^*\frac{\partial F(L^*, K^*)}{\partial L} = 0$$
$$\frac{\partial\mathcal{L}(\lambda^*, L^*, K^*, Q, W, R)}{\partial K} = R - \lambda^*\frac{\partial F(L^*, K^*)}{\partial K} = 0$$
which implicitly determine three functions:
$$\lambda^* = \lambda^*(Q, W, R), \quad L^* = L^*(Q, W, R), \quad K^* = K^*(Q, W, R).$$
$L^*(Q, W, R)$ and $K^*(Q, W, R)$ are called the conditional factor demands for labour and capital respectively. They are conditional in the sense of being conditional on $Q$ units of output being produced.²

Substituting $L^*(Q, W, R)$ and $K^*(Q, W, R)$ into the naive cost function $C = WL + RK$ yields the clever cost function:
$$C^*(Q, W, R) = WL^*(Q, W, R) + RK^*(Q, W, R).$$

²This $Q$ is not necessarily the profit-maximizing level of output, and so the conditional factor demands are not the same as the ordinary factor demands which, as we saw earlier with profit maximization, depend on $P$, $W$ and $R$.
The Conditional Demand for Labour
Suppose we wish to calculate
$$\frac{\partial C^*(Q,W,R)}{\partial W}.$$
Following the recipe for the envelope theorem we proceed as follows:

Step 1: Differentiate the Lagrangian $\mathcal{L}(\lambda, L, K, Q, W, R)$ with respect to $W$ to obtain:
$$\frac{\partial \mathcal{L}(\lambda, L, K, Q, W, R)}{\partial W} = \frac{\partial}{\partial W}\left(WL + RK + \lambda(Q - F(L,K))\right) = L.$$
Step 2: Substitute all occurrences of $L$, $K$ and $\lambda$ with $L^*$, $K^*$ and $\lambda^*$:
$$\frac{\partial C^*(Q,W,R)}{\partial W} = L^*(Q,W,R).$$
This is an example of Shephard's lemma. It states that to find the conditional demand for labour one need only differentiate the cost function $C^*(Q,W,R)$ with respect to $W$.
The Conditional Demand for Capital
Now suppose we wish to calculate
$$\frac{\partial C^*(Q,W,R)}{\partial R}.$$
Following the recipe for the envelope theorem we proceed as follows:

Step 1: Differentiate the Lagrangian $\mathcal{L}(\lambda, L, K, Q, W, R)$ with respect to $R$ to obtain:
$$\frac{\partial \mathcal{L}(\lambda, L, K, Q, W, R)}{\partial R} = \frac{\partial}{\partial R}\left(WL + RK + \lambda(Q - F(L,K))\right) = K.$$
Step 2: Substitute all occurrences of $L$, $K$ and $\lambda$ with $L^*$, $K^*$ and $\lambda^*$:
$$\frac{\partial C^*(Q,W,R)}{\partial R} = K^*(Q,W,R).$$
This is again Shephard's lemma. It states that we can find the conditional demand for capital by differentiating the cost function $C^*(Q,W,R)$ with respect to the rental cost of capital $R$.
Marginal Cost
Finally, suppose we attempt to calculate marginal cost:
$$MC(Q,W,R) = \frac{\partial C^*(Q,W,R)}{\partial Q}.$$
Following the recipe for the envelope theorem we proceed as follows:

Step 1: Differentiate the Lagrangian with respect to $Q$ to obtain:
$$\frac{\partial \mathcal{L}(\lambda, L, K, Q, W, R)}{\partial Q} = \frac{\partial}{\partial Q}\left(WL + RK + \lambda(Q - F(L,K))\right) = \lambda.$$
Step 2: Substitute all occurrences of $L$, $K$ and $\lambda$ with $L^*$, $K^*$ and $\lambda^*$:
$$\frac{\partial C^*(Q,W,R)}{\partial Q} = \lambda^*(Q,W,R).$$
Thus the Lagrange multiplier $\lambda^*$ is in fact equal to marginal cost.
An Example
It is a fact that if the firm's production function $Q = F(L,K)$ is Cobb-Douglas, or:
$$Q = L^{\alpha} K^{\beta},$$
then the cost function $C^*(Q,W,R)$ has the Cobb-Douglas form:
$$C^*(Q,W,R) = (\alpha+\beta)\,\alpha^{-\frac{\alpha}{\alpha+\beta}}\,\beta^{-\frac{\beta}{\alpha+\beta}}\,Q^{\frac{1}{\alpha+\beta}}\,W^{\frac{\alpha}{\alpha+\beta}}\,R^{\frac{\beta}{\alpha+\beta}}$$
or
$$C^*(Q,W,R) = BQ^c W^d R^e$$
where:
$$B = (\alpha+\beta)\,\alpha^{-\frac{\alpha}{\alpha+\beta}}\,\beta^{-\frac{\beta}{\alpha+\beta}}, \quad c = \frac{1}{\alpha+\beta}, \quad d = \frac{\alpha}{\alpha+\beta}, \quad e = \frac{\beta}{\alpha+\beta}.$$
Note that:
$$d + e = 1$$
so that the cost function is homogeneous of degree 1 in $W$ and $R$ when $Q$ is kept fixed. This turns out to be a property that all cost functions share.

If the technology has decreasing returns to scale, or $\alpha+\beta < 1$, then $c > 1$ and average cost, given by:
$$\frac{C^*(Q,W,R)}{Q} = BQ^{c-1}W^d R^e,$$
is an increasing function of $Q$.

If the technology has constant returns to scale, or $\alpha+\beta = 1$, then $c = 1$ and average cost is independent of $Q$ (the average cost curve is flat) since:
$$\frac{C^*(Q,W,R)}{Q} = BW^d R^e$$
does not depend on $Q$.

Finally, if the technology has increasing returns to scale, or $\alpha+\beta > 1$, then $c < 1$ and average cost:
$$\frac{C^*(Q,W,R)}{Q} = BQ^{c-1}W^d R^e$$
is a decreasing function of $Q$.
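These closed-form expressions are easy to sanity-check numerically. The following Python sketch (the parameter values $\alpha = 0.3$, $\beta = 0.5$, $Q = 2$, $W = 3$, $R = 5$ are arbitrary illustrations, not from the text) compares the formula for $C^*$ against a brute-force cost minimization along the isoquant, and checks that $d + e = 1$ delivers degree-1 homogeneity in $(W, R)$:

```python
def cd_cost_formula(Q, W, R, a, b):
    """Closed-form Cobb-Douglas cost function C*(Q, W, R)."""
    s = a + b
    B = s * a ** (-a / s) * b ** (-b / s)
    return B * Q ** (1 / s) * W ** (a / s) * R ** (b / s)

def cd_cost_numeric(Q, W, R, a, b):
    """Brute force: along the isoquant Q = L^a K^b we have K = (Q / L^a)^(1/b);
    minimize W*L + R*K over a fine grid of L values."""
    best = float("inf")
    for i in range(1, 200000):
        L = 0.001 * i                   # grid from 0.001 to 199.999
        K = (Q / L ** a) ** (1 / b)     # keeps output fixed at Q
        best = min(best, W * L + R * K)
    return best

a, b, Q, W, R = 0.3, 0.5, 2.0, 3.0, 5.0
print(cd_cost_formula(Q, W, R, a, b))   # agrees with the grid search
print(cd_cost_numeric(Q, W, R, a, b))
# Doubling both factor prices doubles cost: homogeneity of degree 1.
print(cd_cost_formula(Q, 2 * W, 2 * R, a, b) / cd_cost_formula(Q, W, R, a, b))
```

A grid search is crude but transparent; any one-dimensional minimizer would do, since the constraint pins down $K$ once $L$ is chosen.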
An Example
Suppose the cost function is given by:
$$C^*(Q,W,R) = Q^2 W^{\frac{3}{4}} R^{\frac{1}{4}}$$
which implies decreasing returns to scale since $c = 2 > 1$.

To calculate the firm's conditional demand curve for labour $L^*(Q,W,R)$, simply differentiate $C^*(Q,W,R)$ with respect to $W$ to obtain:
$$L^*(Q,W,R) = \frac{\partial C^*(Q,W,R)}{\partial W} = \frac{\partial}{\partial W}\left(Q^2 W^{\frac{3}{4}} R^{\frac{1}{4}}\right) = \frac{3}{4} Q^2 W^{-\frac{1}{4}} R^{\frac{1}{4}}.$$
To calculate the firm's conditional demand curve for capital $K^*(Q,W,R)$, simply differentiate $C^*(Q,W,R)$ with respect to $R$:
$$K^*(Q,W,R) = \frac{\partial C^*(Q,W,R)}{\partial R} = \frac{\partial}{\partial R}\left(Q^2 W^{\frac{3}{4}} R^{\frac{1}{4}}\right) = \frac{1}{4} Q^2 W^{\frac{3}{4}} R^{-\frac{3}{4}}.$$
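Shephard's lemma can be checked for this example by comparing the closed-form demands against central-difference derivatives of $C^*$; a minimal Python sketch (the values $Q = 2$, $W = 3$, $R = 5$ are arbitrary):

```python
def cost(Q, W, R):
    """The example cost function C*(Q,W,R) = Q^2 W^(3/4) R^(1/4)."""
    return Q ** 2 * W ** 0.75 * R ** 0.25

def L_star(Q, W, R):
    """Conditional labour demand from Shephard's lemma."""
    return 0.75 * Q ** 2 * W ** (-0.25) * R ** 0.25

def K_star(Q, W, R):
    """Conditional capital demand from Shephard's lemma."""
    return 0.25 * Q ** 2 * W ** 0.75 * R ** (-0.75)

Q, W, R, h = 2.0, 3.0, 5.0, 1e-6
# Central differences approximate the partial derivatives of C*.
dC_dW = (cost(Q, W + h, R) - cost(Q, W - h, R)) / (2 * h)
dC_dR = (cost(Q, W, R + h) - cost(Q, W, R - h)) / (2 * h)
print(dC_dW, L_star(Q, W, R))   # the two agree
print(dC_dR, K_star(Q, W, R))   # the two agree
```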
2.2.5 Properties of the Cost Function
The naive cost function is a linear function of the exogenous variables $W$ and $R$ when $Q$ is held fixed since:
$$C = WL + RK = x^T \phi(y)$$
where:
$$x = \begin{bmatrix} W \\ R \end{bmatrix}, \quad y = \begin{bmatrix} L \\ K \end{bmatrix}, \quad \phi(y) = y.$$
From this and the fact that the constraint does not depend on $W$ or $R$ it follows that (see page 26):

Theorem 36 $C^*(Q,W,R)$ is homogeneous of degree 1 in $W$ and $R$ when $Q$ is held fixed; that is:
$$C^*(Q, \lambda W, \lambda R) = \lambda C^*(Q,W,R).$$

Theorem 37 $C^*(Q,W,R)$ is a concave function of $W$ and $R$ when $Q$ is held fixed.

Theorem 38 $L^*(Q,W,R)$ and $K^*(Q,W,R)$ are homogeneous of degree 0 in $W$ and $R$ when $Q$ is held fixed; that is:
$$L^*(Q, \lambda W, \lambda R) = L^*(Q,W,R), \quad K^*(Q, \lambda W, \lambda R) = K^*(Q,W,R).$$

The fact that $L^*(Q,W,R)$, $K^*(Q,W,R)$ and $Q^*(P,W,R)$ are homogeneous of degree 0 is a basic requirement of rationality: the firm is not subject to money illusion. For example, if a 100% inflation doubles $W$ and $R$ then a rational firm does not change its means of producing $Q$ units of output, or:
$$L^*(Q, 2W, 2R) = L^*(Q,W,R), \quad K^*(Q, 2W, 2R) = K^*(Q,W,R).$$
From the concavity of the cost function it follows that if we treat $Q$ as a constant then the Hessian given by:
$$H(Q,W,R) = \begin{bmatrix} \frac{\partial^2 C^*(Q,W,R)}{\partial W^2} & \frac{\partial^2 C^*(Q,W,R)}{\partial W \partial R} \\[4pt] \frac{\partial^2 C^*(Q,W,R)}{\partial W \partial R} & \frac{\partial^2 C^*(Q,W,R)}{\partial R^2} \end{bmatrix}$$
is negative semi-definite (see page 52). This in turn implies that the diagonal elements are non-positive, or:
$$\frac{\partial^2 C^*(Q,W,R)}{\partial W^2} \le 0, \quad \frac{\partial^2 C^*(Q,W,R)}{\partial R^2} \le 0. \tag{2.4}$$
The Conditional Factor Demand Curves are Downward Sloping
We can use the concavity of $C^*(Q,W,R)$ to show that the firm's conditional factor demand curves are downward sloping.

Let us begin by showing the firm's labour demand curve is downward sloping. Differentiate both sides of
$$\frac{\partial C^*(Q,W,R)}{\partial W} = L^*(Q,W,R)$$
with respect to $W$ to obtain:
$$\frac{\partial L^*(Q,W,R)}{\partial W} = \frac{\partial^2 C^*(Q,W,R)}{\partial W^2} \le 0$$
where the inequality follows from (2.4).

A similar argument shows that the conditional demand for capital is downward sloping. Differentiate both sides of
$$\frac{\partial C^*(Q,W,R)}{\partial R} = K^*(Q,W,R)$$
with respect to $R$ to obtain:
$$\frac{\partial K^*(Q,W,R)}{\partial R} = \frac{\partial^2 C^*(Q,W,R)}{\partial R^2} \le 0$$
where the inequality follows from (2.4).
A Symmetry Result
Just as with the profit function, we can establish a symmetry result regarding the partial derivatives of the factor demands. Differentiate both sides of
$$\frac{\partial C^*(Q,W,R)}{\partial W} = L^*(Q,W,R)$$
with respect to $R$ to obtain:
$$\frac{\partial L^*(Q,W,R)}{\partial R} = \frac{\partial^2 C^*(Q,W,R)}{\partial R \partial W}.$$
Now differentiate both sides of
$$\frac{\partial C^*(Q,W,R)}{\partial R} = K^*(Q,W,R)$$
with respect to $W$ to obtain:
$$\frac{\partial K^*(Q,W,R)}{\partial W} = \frac{\partial^2 C^*(Q,W,R)}{\partial W \partial R}.$$
By Young's theorem $\frac{\partial^2 C^*(Q,W,R)}{\partial R \partial W} = \frac{\partial^2 C^*(Q,W,R)}{\partial W \partial R}$, and so we conclude that:
$$\frac{\partial L^*(Q,W,R)}{\partial R} = \frac{\partial K^*(Q,W,R)}{\partial W}.$$
More Results
Using Euler's theorem (see page 49) and the fact that $L^*(Q,W,R)$ is homogeneous of degree 0 we have:
$$L^*(Q,W,R) \times 0 = 0 = \frac{\partial L^*(Q,W,R)}{\partial W} W + \frac{\partial L^*(Q,W,R)}{\partial R} R$$
or:
$$\frac{\partial L^*(Q,W,R)}{\partial R} = -\frac{\partial L^*(Q,W,R)}{\partial W} \frac{W}{R}.$$
Since we have already seen that:
$$\frac{\partial L^*(Q,W,R)}{\partial W} \le 0$$
it follows that:
$$\frac{\partial L^*(Q,W,R)}{\partial R} \ge 0$$
so that an increase in the price of capital increases (or more precisely does not decrease) the conditional demand for labour.

We have shown the symmetry result that $\frac{\partial L^*(Q,W,R)}{\partial R} = \frac{\partial K^*(Q,W,R)}{\partial W}$, and so we have immediately that:
$$\frac{\partial K^*(Q,W,R)}{\partial W} \ge 0.$$
An Example
Note that in the example above where:
$$C^*(Q,W,R) = Q^2 W^{\frac{3}{4}} R^{\frac{1}{4}}$$
$C^*(Q,W,R)$ is homogeneous of degree 1 in $W$ and $R$; that is:
$$C^*(Q, \lambda W, \lambda R) = Q^2 (\lambda W)^{\frac{3}{4}} (\lambda R)^{\frac{1}{4}} = \lambda^{\frac{3}{4}} \lambda^{\frac{1}{4}} Q^2 W^{\frac{3}{4}} R^{\frac{1}{4}} = \lambda Q^2 W^{\frac{3}{4}} R^{\frac{1}{4}} = \lambda C^*(Q,W,R).$$
You can verify that
$$L^*(Q,W,R) = \frac{\partial}{\partial W}\left(Q^2 W^{\frac{3}{4}} R^{\frac{1}{4}}\right) = \frac{3}{4} Q^2 W^{-\frac{1}{4}} R^{\frac{1}{4}}$$
is homogeneous of degree 0 and can be written as a function of $\frac{W}{R}$ as:
$$L^*(Q,W,R) = \frac{3}{4} Q^2 \left(\frac{W}{R}\right)^{-\frac{1}{4}}$$
so that:
$$\bar{L}\left(Q, \frac{W}{R}\right) = \frac{3}{4} Q^2 \left(\frac{W}{R}\right)^{-\frac{1}{4}}.$$
Similarly
$$K^*(Q,W,R) = \frac{\partial}{\partial R}\left(Q^2 W^{\frac{3}{4}} R^{\frac{1}{4}}\right) = \frac{1}{4} Q^2 W^{\frac{3}{4}} R^{-\frac{3}{4}}$$
is homogeneous of degree 0 and can be written as a function of $\frac{W}{R}$ as:
$$K^*(Q,W,R) = \frac{1}{4} Q^2 \left(\frac{W}{R}\right)^{\frac{3}{4}}$$
so that:
$$\bar{K}\left(Q, \frac{W}{R}\right) = \frac{1}{4} Q^2 \left(\frac{W}{R}\right)^{\frac{3}{4}}.$$
2.2.6 Utility Maximization
Suppose a household consuming two goods $Q_1$ and $Q_2$ has a (naive) utility function:
$$U = U(Q_1, Q_2).$$
It has income $Y$ and faces prices $P_1$ and $P_2$, so that the budget constraint is:
$$Y = P_1 Q_1 + P_2 Q_2.$$
This leads to the Lagrangian:
$$\mathcal{L}(\lambda, Q_1, Q_2, P_1, P_2, Y) = U(Q_1, Q_2) + \lambda(Y - P_1 Q_1 - P_2 Q_2)$$
and the first-order conditions:
$$\frac{\partial \mathcal{L}(\lambda^*, Q_1^*, Q_2^*, P_1, P_2, Y)}{\partial \lambda} = Y - P_1 Q_1^* - P_2 Q_2^* = 0$$
$$\frac{\partial \mathcal{L}(\lambda^*, Q_1^*, Q_2^*, P_1, P_2, Y)}{\partial Q_1} = \frac{\partial U(Q_1^*, Q_2^*)}{\partial Q_1} - \lambda^* P_1 = 0$$
$$\frac{\partial \mathcal{L}(\lambda^*, Q_1^*, Q_2^*, P_1, P_2, Y)}{\partial Q_2} = \frac{\partial U(Q_1^*, Q_2^*)}{\partial Q_2} - \lambda^* P_2 = 0.$$
These first-order conditions implicitly define the reduced form:
$$\lambda^* = \lambda^*(P_1, P_2, Y), \quad Q_1^* = Q_1^*(P_1, P_2, Y), \quad Q_2^* = Q_2^*(P_1, P_2, Y).$$
Here $Q_1^*(P_1, P_2, Y)$ and $Q_2^*(P_1, P_2, Y)$ are the Marshallian demand curves for the two goods. In general Marshallian demand curves are functions of prices and nominal income $Y$. Later we will investigate another kind of demand curve, the Hicksian demand curve, which is a function of prices and real income.

The indirect utility function (or the clever utility function) is given by:
$$U^*(Y, P_1, P_2) = U(Q_1^*(Y, P_1, P_2), Q_2^*(Y, P_1, P_2)).$$
Since $Q_1^*(Y, P_1, P_2)$ and $Q_2^*(Y, P_1, P_2)$ are homogeneous of degree 0 in $Y$, $P_1$ and $P_2$, it follows that $U^*(Y, P_1, P_2)$ is also homogeneous of degree 0.
The Lagrange Multiplier
The Lagrange multiplier $\lambda^*(P_1, P_2, Y)$ turns out to be the marginal utility of income. To see this, define the marginal utility of income as:
$$\frac{\partial U^*(P_1, P_2, Y)}{\partial Y}$$
and apply the envelope theorem recipe as follows:

Step 1: Take the derivative of the Lagrangian with respect to $Y$ to get:
$$\frac{\partial \mathcal{L}(\lambda, Q_1, Q_2, P_1, P_2, Y)}{\partial Y} = \frac{\partial}{\partial Y}\left(U(Q_1, Q_2) + \lambda(Y - P_1 Q_1 - P_2 Q_2)\right) = \lambda.$$
Step 2: Replace all occurrences of $\lambda$, $Q_1$ and $Q_2$ with the optimal values $\lambda^*$, $Q_1^*$ and $Q_2^*$ to obtain:
$$\frac{\partial U^*(P_1, P_2, Y)}{\partial Y} = \lambda^*(P_1, P_2, Y).$$
Thus the Lagrange multiplier turns out to be the marginal utility of income.

Exercise: Show that $\lambda^*(P_1, P_2, Y)$ is homogeneous of degree $-1$.
Calculating $\frac{\partial U^*(P_1, P_2, Y)}{\partial P_1}$

Suppose we now wish to calculate:
$$\frac{\partial U^*(P_1, P_2, Y)}{\partial P_1}.$$
Using the recipe for the envelope theorem we proceed as follows:

Step 1: Differentiate the Lagrangian with respect to $P_1$ to get:
$$\frac{\partial \mathcal{L}(\lambda, Q_1, Q_2, P_1, P_2, Y)}{\partial P_1} = \frac{\partial}{\partial P_1}\left(U(Q_1, Q_2) + \lambda(Y - P_1 Q_1 - P_2 Q_2)\right) = -\lambda Q_1.$$
Step 2: Replace all occurrences of $\lambda$, $Q_1$ and $Q_2$ with the optimal values $\lambda^*$, $Q_1^*$ and $Q_2^*$ to obtain:
$$\frac{\partial U^*(P_1, P_2, Y)}{\partial P_1} = -\lambda^*(P_1, P_2, Y) \times Q_1^*(P_1, P_2, Y).$$
Calculating $\frac{\partial U^*(P_1, P_2, Y)}{\partial P_2}$

Suppose we now wish to calculate:
$$\frac{\partial U^*(P_1, P_2, Y)}{\partial P_2}.$$
Using the recipe for the envelope theorem we proceed as follows:

Step 1: Take the derivative of the Lagrangian with respect to $P_2$ to get:
$$\frac{\partial \mathcal{L}(\lambda, Q_1, Q_2, P_1, P_2, Y)}{\partial P_2} = \frac{\partial}{\partial P_2}\left(U(Q_1, Q_2) + \lambda(Y - P_1 Q_1 - P_2 Q_2)\right) = -\lambda Q_2.$$
Step 2: Replace all occurrences of $\lambda$, $Q_1$ and $Q_2$ with the optimal values $\lambda^*$, $Q_1^*$ and $Q_2^*$ to obtain:
$$\frac{\partial U^*(P_1, P_2, Y)}{\partial P_2} = -\lambda^*(P_1, P_2, Y) \times Q_2^*(P_1, P_2, Y).$$
Roy’s Identity
Up until now, when we differentiated a clever function with respect to a price we always obtained either a demand or a supply curve. With utility maximization this turns out not to be the case. Nevertheless, we can use the indirect utility function to calculate the demand for $Q_1$ and $Q_2$. Thus using:
$$\frac{\partial U^*(P_1, P_2, Y)}{\partial Y} = \lambda^*(P_1, P_2, Y)$$
$$\frac{\partial U^*(P_1, P_2, Y)}{\partial P_1} = -\lambda^*(P_1, P_2, Y) \times Q_1^*(P_1, P_2, Y)$$
we obtain:
$$Q_1^*(P_1, P_2, Y) = -\frac{\partial U^*(P_1, P_2, Y)/\partial P_1}{\partial U^*(P_1, P_2, Y)/\partial Y}$$
which is Roy's identity. Similarly, using:
$$\frac{\partial U^*(P_1, P_2, Y)}{\partial Y} = \lambda^*(P_1, P_2, Y)$$
$$\frac{\partial U^*(P_1, P_2, Y)}{\partial P_2} = -\lambda^*(P_1, P_2, Y) \times Q_2^*(P_1, P_2, Y)$$
we obtain:
$$Q_2^*(P_1, P_2, Y) = -\frac{\partial U^*(P_1, P_2, Y)/\partial P_2}{\partial U^*(P_1, P_2, Y)/\partial Y}$$
which is again Roy's identity.
An Illustration of Roy’s Identity
Given the utility function:
$$U(Q_1, Q_2) = \frac{1}{3}\ln(Q_1) + \frac{2}{3}\ln(Q_2)$$
we obtain the Lagrangian:
$$\mathcal{L}(\lambda, Q_1, Q_2, P_1, P_2, Y) = \frac{1}{3}\ln(Q_1) + \frac{2}{3}\ln(Q_2) + \lambda(Y - P_1 Q_1 - P_2 Q_2)$$
which leads to:
$$\lambda^*(P_1, P_2, Y) = \frac{1}{Y}, \quad Q_1^*(P_1, P_2, Y) = \frac{Y}{3P_1}, \quad Q_2^*(P_1, P_2, Y) = \frac{2Y}{3P_2}.$$
To find the indirect (or clever) utility function, substitute $Q_1^*$ and $Q_2^*$ into $U(Q_1, Q_2)$ to obtain:
$$U^*(P_1, P_2, Y) = \frac{1}{3}\ln\left(\frac{Y}{3P_1}\right) + \frac{2}{3}\ln\left(\frac{2Y}{3P_2}\right) = \ln(Y) - \frac{1}{3}\ln(P_1) - \frac{2}{3}\ln(P_2) + \frac{1}{3}\ln\left(\frac{1}{3}\right) + \frac{2}{3}\ln\left(\frac{2}{3}\right).$$
Now suppose you were just given the indirect utility function and had to calculate the demand curves for $Q_1$ and $Q_2$ using Roy's identity. We have:
$$\frac{\partial U^*(P_1, P_2, Y)}{\partial Y} = \frac{1}{Y}, \quad \frac{\partial U^*(P_1, P_2, Y)}{\partial P_1} = \frac{-1}{3P_1}, \quad \frac{\partial U^*(P_1, P_2, Y)}{\partial P_2} = \frac{-2}{3P_2}$$
so that:
$$Q_1^*(P_1, P_2, Y) = -\frac{\partial U^*(P_1, P_2, Y)/\partial P_1}{\partial U^*(P_1, P_2, Y)/\partial Y} = -\frac{\;\frac{-1}{3P_1}\;}{\frac{1}{Y}} = \frac{Y}{3P_1}$$
and so we recover the original demand curve for $Q_1$ from $U^*(P_1, P_2, Y)$. Similarly:
$$Q_2^*(P_1, P_2, Y) = -\frac{\partial U^*(P_1, P_2, Y)/\partial P_2}{\partial U^*(P_1, P_2, Y)/\partial Y} = -\frac{\;\frac{-2}{3P_2}\;}{\frac{1}{Y}} = \frac{2Y}{3P_2}$$
and so we recover the original demand curve for $Q_2$ from $U^*(P_1, P_2, Y)$.
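The recovery of the demand curves via Roy's identity can also be carried out numerically, replacing the analytic partial derivatives with central differences; a Python sketch (the prices and income are arbitrary illustrative values):

```python
import math

def U_star(P1, P2, Y):
    """Indirect utility for U = (1/3)ln(Q1) + (2/3)ln(Q2)."""
    return (math.log(Y) - math.log(P1) / 3 - 2 * math.log(P2) / 3
            + math.log(1 / 3) / 3 + 2 * math.log(2 / 3) / 3)

def roy(P1, P2, Y, h=1e-6):
    """Recover the Marshallian demands via Roy's identity, using
    central differences for the partial derivatives of U*."""
    dU_dY  = (U_star(P1, P2, Y + h) - U_star(P1, P2, Y - h)) / (2 * h)
    dU_dP1 = (U_star(P1 + h, P2, Y) - U_star(P1 - h, P2, Y)) / (2 * h)
    dU_dP2 = (U_star(P1, P2 + h, Y) - U_star(P1, P2 - h, Y)) / (2 * h)
    return -dU_dP1 / dU_dY, -dU_dP2 / dU_dY

P1, P2, Y = 2.0, 4.0, 120.0
Q1, Q2 = roy(P1, P2, Y)
print(Q1, Y / (3 * P1))      # both match: demand for good 1
print(Q2, 2 * Y / (3 * P2))  # both match: demand for good 2
```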
2.2.7 Expenditure Minimization
A rational household will always minimize the expenditure necessary to obtain a certain level of utility, say $U_o$. Let:
$$E = P_1 Q_1 + P_2 Q_2$$
be the (naive) level of expenditure on $Q_1$ and $Q_2$. Given the utility function $U(Q_1, Q_2)$ we obtain the constraint:
$$U_o = U(Q_1, Q_2).$$
Minimizing $E$ subject to this constraint leads to the Lagrangian:
$$\mathcal{L}(\lambda, Q_1, Q_2, P_1, P_2, U_o) = P_1 Q_1 + P_2 Q_2 + \lambda(U_o - U(Q_1, Q_2))$$
and the first-order conditions:
$$\frac{\partial \mathcal{L}(\lambda^*, Q_1^*, Q_2^*, P_1, P_2, U_o)}{\partial \lambda} = U_o - U(Q_1^*, Q_2^*) = 0$$
$$\frac{\partial \mathcal{L}(\lambda^*, Q_1^*, Q_2^*, P_1, P_2, U_o)}{\partial Q_1} = P_1 - \lambda^* \frac{\partial U(Q_1^*, Q_2^*)}{\partial Q_1} = 0$$
$$\frac{\partial \mathcal{L}(\lambda^*, Q_1^*, Q_2^*, P_1, P_2, U_o)}{\partial Q_2} = P_2 - \lambda^* \frac{\partial U(Q_1^*, Q_2^*)}{\partial Q_2} = 0.$$
These 3 implicit functions determine the explicit functions:
$$\lambda^* = \lambda^*(P_1, P_2, U_o), \quad Q_1^* = Q_1^*(P_1, P_2, U_o), \quad Q_2^* = Q_2^*(P_1, P_2, U_o).$$
Here $Q_1^*(P_1, P_2, U_o)$ and $Q_2^*(P_1, P_2, U_o)$ are the Hicksian demand curves. They differ from the Marshallian demand curves only in that $U_o$, which can be thought of as real income, takes the place of $Y$ (or nominal income) in the Marshallian demand curves.

Substituting $Q_1^*(P_1, P_2, U_o)$ and $Q_2^*(P_1, P_2, U_o)$ into the naive expenditure function $E = P_1 Q_1 + P_2 Q_2$, we obtain the clever expenditure function:
$$E^*(P_1, P_2, U_o) = P_1 Q_1^*(P_1, P_2, U_o) + P_2 Q_2^*(P_1, P_2, U_o).$$
Properties of the Expenditure Function
The naive expenditure function is a linear function of the exogenous variables $P_1$ and $P_2$ when $U_o$ is held fixed since:
$$E = P_1 Q_1 + P_2 Q_2 = x^T \phi(y)$$
where:
$$x = \begin{bmatrix} P_1 \\ P_2 \end{bmatrix}, \quad y = \begin{bmatrix} Q_1 \\ Q_2 \end{bmatrix}, \quad \phi(y) = y.$$
From this it follows (see page 26) that:

Theorem 39 $E^*(P_1, P_2, U_o)$ is homogeneous of degree 1 in $P_1$ and $P_2$ when $U_o$ is held fixed; that is:
$$E^*(\lambda P_1, \lambda P_2, U_o) = \lambda E^*(P_1, P_2, U_o).$$

Theorem 40 $E^*(P_1, P_2, U_o)$ is a concave function of $P_1$ and $P_2$ when $U_o$ is held fixed.

Theorem 41 $Q_1^*(P_1, P_2, U_o)$ and $Q_2^*(P_1, P_2, U_o)$ are homogeneous of degree 0 in $P_1$ and $P_2$ when $U_o$ is held fixed. That is:
$$Q_1^*(\lambda P_1, \lambda P_2, U_o) = Q_1^*(P_1, P_2, U_o), \quad Q_2^*(\lambda P_1, \lambda P_2, U_o) = Q_2^*(P_1, P_2, U_o).$$

Since the expenditure function is a concave function of $P_1$ and $P_2$ when $U_o$ is held fixed, it follows that the Hessian:
$$H(P_1, P_2, U_o) = \begin{bmatrix} \frac{\partial^2 E^*(P_1, P_2, U_o)}{\partial P_1^2} & \frac{\partial^2 E^*(P_1, P_2, U_o)}{\partial P_1 \partial P_2} \\[4pt] \frac{\partial^2 E^*(P_1, P_2, U_o)}{\partial P_1 \partial P_2} & \frac{\partial^2 E^*(P_1, P_2, U_o)}{\partial P_2^2} \end{bmatrix}$$
is negative semi-definite (see page 52) and consequently it has non-positive diagonal elements, or:
$$\frac{\partial^2 E^*(P_1, P_2, U_o)}{\partial P_1^2} \le 0, \quad \frac{\partial^2 E^*(P_1, P_2, U_o)}{\partial P_2^2} \le 0.$$
The Hicksian Demand Curves are Downward Sloping
The Marshallian demand curves encompass a substitution and an income effect. Since the income effect can be either positive or negative, Marshallian demand curves can slope either upwards or downwards.

We can however use the concavity of $E^*(P_1, P_2, U_o)$ to show that the Hicksian demand curves are always downward sloping. This is because, unlike a Marshallian demand curve, Hicksian demand curves have only a substitution effect, which always leads to downward sloping demand curves.

Using the envelope theorem we can show that:
$$\frac{\partial E^*(P_1, P_2, U_o)}{\partial P_1} = Q_1^*(P_1, P_2, U_o).$$
Now differentiate both sides with respect to $P_1$ to obtain:
$$\frac{\partial Q_1^*(P_1, P_2, U_o)}{\partial P_1} = \frac{\partial^2 E^*(P_1, P_2, U_o)}{\partial P_1^2} \le 0$$
where the inequality follows from the concavity of the expenditure function. Similarly, from:
$$\frac{\partial E^*(P_1, P_2, U_o)}{\partial P_2} = Q_2^*(P_1, P_2, U_o)$$
and differentiating both sides with respect to $P_2$, we obtain:
$$\frac{\partial Q_2^*(P_1, P_2, U_o)}{\partial P_2} = \frac{\partial^2 E^*(P_1, P_2, U_o)}{\partial P_2^2} \le 0$$
where the inequality follows from the concavity of the expenditure function. Thus we see that the Hicksian demand curves are always downward sloping.
A Symmetry Result
The Hicksian demand curves display the same symmetry as the conditional factor demand functions. Differentiate both sides of
$$\frac{\partial E^*(P_1, P_2, U_o)}{\partial P_1} = Q_1^*(P_1, P_2, U_o)$$
with respect to $P_2$ to obtain:
$$\frac{\partial Q_1^*(P_1, P_2, U_o)}{\partial P_2} = \frac{\partial^2 E^*(P_1, P_2, U_o)}{\partial P_2 \partial P_1}.$$
Now differentiate both sides of
$$\frac{\partial E^*(P_1, P_2, U_o)}{\partial P_2} = Q_2^*(P_1, P_2, U_o)$$
with respect to $P_1$ to obtain:
$$\frac{\partial Q_2^*(P_1, P_2, U_o)}{\partial P_1} = \frac{\partial^2 E^*(P_1, P_2, U_o)}{\partial P_1 \partial P_2}.$$
By Young's theorem $\frac{\partial^2 E^*(P_1, P_2, U_o)}{\partial P_2 \partial P_1} = \frac{\partial^2 E^*(P_1, P_2, U_o)}{\partial P_1 \partial P_2}$, and so we can conclude that:
$$\frac{\partial Q_1^*(P_1, P_2, U_o)}{\partial P_2} = \frac{\partial Q_2^*(P_1, P_2, U_o)}{\partial P_1}.$$
Chapter 3
Integration and Random Variables
3.1 Introduction

The economic world is a world of uncertainty. To understand models of economic decision making under uncertainty, whether it is insuring a house against fire or picking a portfolio of risky assets, one needs to be able to manipulate variables which take on different outcomes with different probabilities. These variables are known as random variables.

The basic tool for working with a random variable $X$ is the expectations operator $E[X]$. $E[\,\cdot\,]$ in turn is based on either discrete summation, which involves the sigma notation $\sum_{i=1}^{n}$, or continuous summation, which involves integration $\int_a^b (\,\cdot\,)\,dx$.

Therefore, before studying random variables we will first study discrete summation, or $\sum_{i=1}^{n}$, and then continuous summation, or $\int_a^b (\,\cdot\,)\,dx$.

3.2 Discrete Summation

3.2.1 Definitions

Given a set of $n$ numbers $X_1, X_2, X_3, \ldots, X_n$ it is natural to want to add them up as:
$$X_1 + X_2 + X_3 + \cdots + X_n.$$
I call this the dot, dot, dot notation. The dot, dot, dot notation is the safest way of working with sums, but also the most cumbersome.
CHAPTER 3. INTEGRATION AND RANDOM VARIABLES 79
Sums can be represented more compactly using the sigma notation as:
$$\sum_{i=1}^{n} X_i \equiv X_1 + X_2 + X_3 + \cdots + X_n.$$
Here $i$ is the summation index. Underneath the $\sum$ we write $i = 1$ to indicate that we begin the sum with $X_1$. After including $X_1$, we then increase $i$ to 2 and include $X_2$ in the sum. This process continues until we reach $i = n$ with $X_n$ included in the sum, at which point we stop.

This notation can be modified in a number of ways. For example, to start the summation with $X_5$ and end with $X_{23}$ we would write $\sum_{i=5}^{23} X_i$, or:
$$\sum_{i=5}^{23} X_i = X_5 + X_6 + X_7 + \cdots + X_{23}.$$
The sigma notation is much more efficient than the dot, dot, dot notation, but it is also more dangerous. One strategy for avoiding these dangers when deriving results is to revert to the safe dot, dot, dot notation and then, once we think we have the answer, convert back into the summation notation. Here are some examples.
3.2.2 Example 1: $\sum_{i=1}^{n} aX_i$

Converting to the dot, dot, dot notation we have:
$$\sum_{i=1}^{n} aX_i = aX_1 + aX_2 + aX_3 + \cdots + aX_n = a\underbrace{(X_1 + X_2 + X_3 + \cdots + X_n)}_{=\sum_{i=1}^{n} X_i} = a\sum_{i=1}^{n} X_i.$$
3.2.3 Example 2: $\sum_{i=1}^{n} (aX_i + bY_i)$

Converting to the dot, dot, dot notation we have:
$$\sum_{i=1}^{n} (aX_i + bY_i) = (aX_1 + bY_1) + (aX_2 + bY_2) + \cdots + (aX_n + bY_n)$$
$$= a(X_1 + X_2 + \cdots + X_n) + b(Y_1 + Y_2 + \cdots + Y_n) = a\sum_{i=1}^{n} X_i + b\sum_{i=1}^{n} Y_i.$$
3.2.4 Example 3: $\sum_{i=1}^{n} (aX_i + b)$

This one is tricky. It is a great temptation to write $\sum_{i=1}^{n} (aX_i + b) = a\sum_{i=1}^{n} X_i + b$, which is wrong. Instead:
$$\sum_{i=1}^{n} (aX_i + b) = (aX_1 + b) + (aX_2 + b) + \cdots + (aX_n + b)$$
$$= a(X_1 + X_2 + \cdots + X_n) + \underbrace{(b + b + \cdots + b)}_{n \text{ times}} = a\sum_{i=1}^{n} X_i + nb.$$
3.2.5 Example 4: $\sum_{i=1}^{n} (aX_i + b)^2$

Again converting into the dot, dot, dot notation we have:
$$\sum_{i=1}^{n} (aX_i + b)^2 = (aX_1 + b)^2 + (aX_2 + b)^2 + \cdots + (aX_n + b)^2$$
$$= (a^2 X_1^2 + 2abX_1 + b^2) + (a^2 X_2^2 + 2abX_2 + b^2) + \cdots + (a^2 X_n^2 + 2abX_n + b^2)$$
$$= a^2(X_1^2 + \cdots + X_n^2) + 2ab(X_1 + \cdots + X_n) + \underbrace{(b^2 + \cdots + b^2)}_{n \text{ times}} = a^2\sum_{i=1}^{n} X_i^2 + 2ab\sum_{i=1}^{n} X_i + nb^2.$$
3.2.6 Example 5: $\sum_{i=1}^{n} (X_i - \bar{X}) = 0$

The sample mean $\bar{X}$ is:
$$\bar{X} = \frac{1}{n}\sum_{j=1}^{n} X_j = \frac{1}{n}(X_1 + X_2 + \cdots + X_n)$$
and so:
$$\sum_{i=1}^{n} (X_i - \bar{X}) = (X_1 - \bar{X}) + (X_2 - \bar{X}) + \cdots + (X_n - \bar{X})$$
$$= (X_1 + X_2 + \cdots + X_n) - n\bar{X} = (X_1 + X_2 + \cdots + X_n) - n \cdot \frac{1}{n}(X_1 + X_2 + \cdots + X_n)$$
$$= (X_1 + X_2 + \cdots + X_n) - (X_1 + X_2 + \cdots + X_n) = 0.$$
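The identities in Examples 3, 4 and 5 hold for any list of numbers, which makes them easy to sanity-check in code; a small Python sketch with arbitrary data:

```python
X = [3.0, 1.0, 4.0, 1.0, 5.0, 9.0]   # arbitrary data
a, b, n = 2.0, 7.0, len(X)
xbar = sum(X) / n                    # the sample mean

# Example 3: sum of (a*Xi + b) equals a*sum(Xi) + n*b, not a*sum(Xi) + b.
assert abs(sum(a * x + b for x in X) - (a * sum(X) + n * b)) < 1e-9

# Example 4: sum of (a*Xi + b)^2 expands term by term.
lhs = sum((a * x + b) ** 2 for x in X)
rhs = a ** 2 * sum(x ** 2 for x in X) + 2 * a * b * sum(X) + n * b ** 2
assert abs(lhs - rhs) < 1e-9

# Example 5: deviations from the sample mean always sum to zero.
assert abs(sum(x - xbar for x in X)) < 1e-9
print("all three identities hold")
```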
3.2.7 Manipulating the Sigma Notation

Deriving expressions by converting into the dot, dot, dot notation is quite safe but also very inefficient. A faster but more dangerous approach is to work directly with the sigma notation $\sum_{i=1}^{n}$. There are a small number of rules that need to be mastered in order to be able to do this.
Rules for Manipulating the Summation Operator $\sum_{i=1}^{n}$

Rule 1: $\sum_{i=1}^{n}$ is a linear operator; that is, if the exponent on the bracket in front is 1, or given $\sum_{i=1}^{n} (X_i + Y_i)^1$ or equivalently $\sum_{i=1}^{n} (X_i + Y_i)$, then you may take $\sum_{i=1}^{n}$ inside the brackets and apply it to each term inside as:
$$\sum_{i=1}^{n} (X_i + Y_i) = \sum_{i=1}^{n} X_i + \sum_{i=1}^{n} Y_i.$$
More generally:
$$\sum_{i=1}^{n} (X_{1i} + X_{2i} + \cdots + X_{mi}) = \sum_{i=1}^{n} X_{1i} + \sum_{i=1}^{n} X_{2i} + \cdots + \sum_{i=1}^{n} X_{mi}.$$
This only works for linear functions! If the exponent on the brackets is not 1 then this step is not justified. For example:
$$\sum_{i=1}^{n} (X_i + Y_i)^2 \ne \left(\sum_{i=1}^{n} X_i + \sum_{i=1}^{n} Y_i\right)^2.$$
In general, if $f(x)$ is a nonlinear function then:
$$\sum_{i=1}^{n} f(X_i + Y_i) \ne f\left(\sum_{i=1}^{n} X_i + \sum_{i=1}^{n} Y_i\right);$$
for example, if $f(x) = e^x = \exp(x)$, which is not linear, then:
$$\sum_{i=1}^{n} \exp(X_i + Y_i) \ne \exp\left(\sum_{i=1}^{n} X_i + \sum_{i=1}^{n} Y_i\right).$$
Rule 2: You may always pull $\sum_{i=1}^{n}$ past any expression that does not depend on $i$ (or whatever the summation index is), as long as something remains in front of $\sum_{i=1}^{n}$. For example:
$$\sum_{i=1}^{n} aX_i = a\sum_{i=1}^{n} X_i$$
since $a$ does not have an $i$ subscript and $X_i$ remains in front. However:
$$\sum_{i=1}^{n} a_i X_i \ne a_i \sum_{i=1}^{n} X_i$$
since $a_i$ here has an $i$ subscript, and:
$$\sum_{i=1}^{n} b \ne b\sum_{i=1}^{n}$$
since there always needs to be something in front of $\sum_{i=1}^{n}$.

Rule 3: If in front of $\sum_{i=1}^{n}$ there is nothing which has an $i$ subscript, then simply multiply this by the number of terms being summed. Thus:
$$\sum_{i=1}^{n} b = nb$$
since $n$ terms are being summed, while:
$$\sum_{i=3}^{7} b = 5b$$
since there are 5 terms being summed, that is $i = 3, 4, 5, 6, 7$.

Rule 4: As long as there is no conflict with other definitions you can always change the summation index without affecting the result. For example, if we replace $i$ in $\sum_{i=1}^{n} X_i$ with $j$ we get the same result. That is:
$$\sum_{i=1}^{n} X_i = \sum_{j=1}^{n} X_j.$$
As an exercise, redo Examples 1, 2, 3, 4 and 5 above without converting into the dot, dot, dot notation. Below are some examples of how these rules can be used to derive correct results.
3.2.8 Example 1: $\sum_{i=1}^{n} (aX_i + bY_i)^2$

Since the exponent on the brackets in $\sum_{i=1}^{n} (aX_i + bY_i)^2$ is 2, by Rule 1 you cannot immediately pull the summation inside the brackets. However, since:
$$(aX_i + bY_i)^2 = \left(a^2 X_i^2 + 2abX_i Y_i + b^2 Y_i^2\right)^1 = a^2 X_i^2 + 2abX_i Y_i + b^2 Y_i^2,$$
which now has an exponent of 1, we can pull the summation sign through the expanded expression as:
$$\sum_{i=1}^{n} (aX_i + bY_i)^2 = \sum_{i=1}^{n} \left(a^2 X_i^2 + 2abX_i Y_i + b^2 Y_i^2\right) = \sum_{i=1}^{n} a^2 X_i^2 + \sum_{i=1}^{n} 2abX_i Y_i + \sum_{i=1}^{n} b^2 Y_i^2.$$
Since $a^2$, $2ab$ and $b^2$ do not depend on $i$, we can by Rule 2 pull $\sum_{i=1}^{n}$ past these constants to obtain:
$$\sum_{i=1}^{n} (aX_i + bY_i)^2 = a^2\sum_{i=1}^{n} X_i^2 + 2ab\sum_{i=1}^{n} X_i Y_i + b^2\sum_{i=1}^{n} Y_i^2.$$
3.2.9 Example 2: $\sum_{i=1}^{n} (aX_i + b)^2$

This is a special case of Example 1 where $Y_i = 1$ for all $i$. However, it is useful to work it through since it provides an example of Rule 3. Thus by changing the exponent to 1 we have:
$$\sum_{i=1}^{n} (aX_i + b)^2 = \sum_{i=1}^{n} \left(a^2 X_i^2 + 2abX_i + b^2\right) = a^2\sum_{i=1}^{n} X_i^2 + 2ab\sum_{i=1}^{n} X_i + \sum_{i=1}^{n} b^2.$$
Since $b^2$ in $\sum_{i=1}^{n} b^2$ does not have an $i$ subscript, by Rule 3 we multiply $b^2$ by the number of terms being summed so that:
$$\sum_{i=1}^{n} b^2 = nb^2$$
and hence:
$$\sum_{i=1}^{n} (aX_i + b)^2 = a^2\sum_{i=1}^{n} X_i^2 + 2ab\sum_{i=1}^{n} X_i + nb^2.$$
3.2.10 Example 3: $\sum_{i=1}^{n} (X_i - \bar{X})^2$

This is a special case of Example 2 with $a = 1$ and $b = -\bar{X}$. However, it is again useful to do it from scratch. The sample mean $\bar{X}$ is:
$$\bar{X} = \frac{1}{n}\sum_{j=1}^{n} X_j.$$
Since there is an exponent of 2 on $(X_i - \bar{X})^2$ we cannot pull $\sum_{i=1}^{n}$ through the brackets. However, by expanding and then using Rule 1 we have:
$$\sum_{i=1}^{n} (X_i - \bar{X})^2 = \sum_{i=1}^{n} \left(X_i^2 - 2\bar{X}X_i + \bar{X}^2\right) = \sum_{i=1}^{n} X_i^2 + \sum_{i=1}^{n} (-2\bar{X}X_i) + \sum_{i=1}^{n} \bar{X}^2.$$
Note that in the second sum the term $-2\bar{X}$ does not have an $i$ subscript, and so by Rule 2 we can pull $\sum_{i=1}^{n}$ past it to $X_i$. In the third sum $\bar{X}^2$ does not have an $i$ subscript, so that by Rule 3 we simply multiply it by $n$. We thus have:
$$\sum_{i=1}^{n} (X_i - \bar{X})^2 = \sum_{i=1}^{n} X_i^2 - 2\bar{X}\sum_{i=1}^{n} X_i + n\bar{X}^2.$$
We can further simplify by noting that:
$$\bar{X} = \frac{1}{n}\sum_{j=1}^{n} X_j = \frac{1}{n}\sum_{i=1}^{n} X_i$$
since by Rule 4 we can replace the summation index $j$ with $i$ without affecting the result. Therefore:
$$\sum_{i=1}^{n} X_i = n\bar{X}$$
and hence:
$$\sum_{i=1}^{n} (X_i - \bar{X})^2 = \sum_{i=1}^{n} X_i^2 - 2n\bar{X}^2 + n\bar{X}^2 = \sum_{i=1}^{n} X_i^2 - n\bar{X}^2.$$
As an exercise, prove that:
$$\sum_{i=1}^{n} (X_i - \bar{X})(Y_i - \bar{Y}) = \sum_{i=1}^{n} (X_i - \bar{X})Y_i = \sum_{i=1}^{n} X_i Y_i - n\bar{X}\bar{Y}.$$
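Both the derived identity and the exercise can be verified numerically for arbitrary data; a quick Python sketch:

```python
X = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]   # arbitrary data
Y = [1.0, 3.0, 2.0, 5.0, 4.0, 6.0, 8.0, 7.0]
n = len(X)
xbar, ybar = sum(X) / n, sum(Y) / n

# sum of (Xi - Xbar)^2 equals sum of Xi^2 minus n * Xbar^2
lhs1 = sum((x - xbar) ** 2 for x in X)
rhs1 = sum(x ** 2 for x in X) - n * xbar ** 2
assert abs(lhs1 - rhs1) < 1e-9

# exercise: sum of (Xi - Xbar)(Yi - Ybar) equals sum of Xi*Yi minus n * Xbar * Ybar
lhs2 = sum((x - xbar) * (y - ybar) for x, y in zip(X, Y))
rhs2 = sum(x * y for x, y in zip(X, Y)) - n * xbar * ybar
assert abs(lhs2 - rhs2) < 1e-9
print("both identities hold")
```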
3.3 Integration
3.3.1 The Fundamental Theorem of Integral Calculus
The definite integral
$$\int_a^b f(x)\,dx$$
is often interpreted as the area under the curve $f(x)$ between $a$ and $b$. Here $x$ is the variable over which we are integrating, while $a$ is the lower limit of the integral and $b$ the upper limit.

From the fundamental theorem of integral calculus we can evaluate integrals by finding the anti-derivative of $f(x)$, given:
Theorem 42 Fundamental theorem of integral calculus:
$$\int_a^b f(x)\,dx = F(x)\Big|_a^b \equiv F(b) - F(a)$$
where $F(x)$ is the anti-derivative of $f(x)$; that is:
$$F'(x) = f(x).$$
3.3.2 Example 1: $\int_1^{10} x^2\,dx$

If $f(x) = x^2$ then $F(x) = \frac{x^3}{3}$; that is, $F'(x) = \frac{d}{dx}\frac{x^3}{3} = x^2$. Therefore:
$$\int_1^{10} x^2\,dx = \frac{x^3}{3}\Big|_1^{10} = \frac{10^3}{3} - \frac{1^3}{3} = 333.$$
3.3.3 Example 2: $\int_0^5 e^{-x}\,dx$

If $f(x) = e^{-x}$ then $F(x) = -e^{-x}$ and:
$$\int_0^5 e^{-x}\,dx = -e^{-x}\Big|_0^5 = 1 - e^{-5}.$$
In general we have that:
$$\int_0^n e^{-x}\,dx = -e^{-x}\Big|_0^n = 1 - e^{-n}.$$
Note that $\lim_{n\to\infty} e^{-n} = 0$. We can then let $n = \infty$ and obtain:
$$\int_0^\infty e^{-x}\,dx = -e^{-x}\Big|_0^\infty = 1.$$
3.3.4 Integration by Parts
One of the more important tricks for evaluating integrals of functions is integration by parts. This is the integral calculus analogue of the product rule in differential calculus. Suppose we have an integral:
$$\int_a^b f(x)\,dx$$
and we can write $f(x)$ as:
$$f(x) = u(x) \times v'(x)$$
where we know the anti-derivative of $v'(x)$, which is $v(x)$. Then:
Theorem 43 Integration by Parts: If $f(x) = u(x) \times v'(x)$ then:
$$\int_a^b f(x)\,dx = \int_a^b u(x) \times v'(x)\,dx = u(x)v(x)\Big|_a^b - \int_a^b u'(x)v(x)\,dx.$$
3.3.5 Example 1: $\int_0^4 xe^{-x}\,dx$

Suppose we wish to evaluate the integral:
$$\int_0^4 xe^{-x}\,dx.$$
Here it is not obvious what the anti-derivative of $xe^{-x}$ is. However, the anti-derivative of $e^{-x}$ is just $-e^{-x}$, so set:
$$u(x) = x, \quad v'(x) = e^{-x}, \quad u'(x) = 1, \quad v(x) = -e^{-x}.$$
Using integration by parts we then have:
$$\int_0^4 xe^{-x}\,dx = x \times (-e^{-x})\Big|_0^4 - \int_0^4 (1)(-e^{-x})\,dx = -4e^{-4} + \int_0^4 e^{-x}\,dx = -4e^{-4} - e^{-4} + 1 = 1 - 5e^{-4} = 0.9084218.$$
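The value obtained by integration by parts can be cross-checked against a direct numerical approximation of the integral; a Python sketch using a midpoint Riemann sum:

```python
import math

def midpoint_riemann(f, a, b, n=20000):
    """Midpoint Riemann-sum approximation to the integral of f over [a, b]."""
    dx = (b - a) / n
    return sum(f(a + (i + 0.5) * dx) for i in range(n)) * dx

numeric = midpoint_riemann(lambda x: x * math.exp(-x), 0.0, 4.0)
exact = 1 - 5 * math.exp(-4)   # the integration-by-parts answer
print(numeric, exact)          # both approximately 0.9084218
```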
3.3.6 Example 2: $\Gamma(n)$

Consider the gamma function defined by:
$$\Gamma(n) = \int_0^\infty x^{n-1}e^{-x}\,dx.$$
This turns out to be a very important function in econometrics. For $n = 1$ we have:
$$\Gamma(1) = \int_0^\infty e^{-x}\,dx = 1.$$
There turns out to be a close connection between the gamma function and factorials like $5! = 5 \times 4 \times 3 \times 2 \times 1 = 120$. This is based on the following result:
Theorem 44 The Recursive Property of the Gamma Function:
$$\Gamma(n) = (n-1)\Gamma(n-1).$$
The proof uses integration by parts with:
$$u(x) = x^{n-1}, \quad v'(x) = e^{-x}, \quad u'(x) = (n-1)x^{n-2}, \quad v(x) = -e^{-x}.$$
Then:
$$\Gamma(n) = \int_0^\infty x^{n-1}e^{-x}\,dx = -x^{n-1}e^{-x}\Big|_0^\infty + (n-1)\int_0^\infty x^{n-2}e^{-x}\,dx.$$
However it can be shown that:
$$-x^{n-1}e^{-x}\Big|_0^\infty = 0$$
and by definition:
$$\int_0^\infty x^{n-2}e^{-x}\,dx \equiv \Gamma(n-1)$$
so we have:
$$\Gamma(n) = (n-1)\Gamma(n-1).$$
Starting with $\Gamma(1) = 1 = 0!$ and using this property of $\Gamma(n)$ we have:
$$\Gamma(1) = 1 = 0!$$
$$\Gamma(2) = (2-1) \times \Gamma(1) = 1 = 1!$$
$$\Gamma(3) = (3-1) \times \Gamma(2) = 2 = 2!$$
$$\Gamma(4) = (4-1) \times \Gamma(3) = 6 = 3!$$
$$\Gamma(5) = (5-1) \times \Gamma(4) = 24 = 4!$$
and in general, as long as $n$ is an integer:
$$\Gamma(n) = (n-1)!$$
We can also use the gamma function to define non-integer factorials. For example, it can be shown that:
$$\Gamma\left(\tfrac{1}{2}\right) = \left(-\tfrac{1}{2}\right)! = \sqrt{\pi}$$
$$\Gamma\left(\tfrac{3}{2}\right) = \tfrac{1}{2}\Gamma\left(\tfrac{1}{2}\right) = \left(\tfrac{1}{2}\right)! = \frac{\sqrt{\pi}}{2}$$
$$\Gamma\left(\tfrac{5}{2}\right) = \tfrac{3}{2}\Gamma\left(\tfrac{3}{2}\right) = \left(\tfrac{3}{2}\right)! = \frac{3\sqrt{\pi}}{4}$$
etc.
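Python's standard library exposes the gamma function as math.gamma, so the recursion, the factorial connection and the half-integer values can all be checked directly:

```python
import math

# Recursion Gamma(n) = (n - 1) * Gamma(n - 1), and Gamma(n) = (n - 1)! for integers.
for n in range(2, 8):
    assert abs(math.gamma(n) - (n - 1) * math.gamma(n - 1)) < 1e-6
    assert abs(math.gamma(n) - math.factorial(n - 1)) < 1e-6

# Half-integer values involve sqrt(pi).
assert abs(math.gamma(0.5) - math.sqrt(math.pi)) < 1e-12
assert abs(math.gamma(1.5) - math.sqrt(math.pi) / 2) < 1e-12
assert abs(math.gamma(2.5) - 3 * math.sqrt(math.pi) / 4) < 1e-12
print("gamma identities verified")
```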
3.3.7 Integration as Summation
Integration and summation are very closely related concepts. In fact it is often useful to think of $\int_a^b f(x)\,dx$ as the summation of $f(x)\,dx$ as $x$ goes from $a$ to $b$, just as $\sum_{i=1}^{n} f(X_i)$ is a summation of $f(X_i)$ as $i$ goes from 1 to $n$. The symbol $\sum$ is sigma, the Greek S, where S here stands for sum. The symbol $\int$ also looks like a (somewhat squashed) S.

An integral is the limiting form of a sum. If you let:
$$dx_n = \frac{b-a}{n}$$
then:
$$\int_a^b f(x)\,dx = \lim_{n\to\infty} \sum_{i=0}^{n} f(a + dx_n i)\,dx_n = \lim_{n\to\infty} \frac{b-a}{n} \sum_{i=0}^{n} f\left(a + \frac{i(b-a)}{n}\right).$$
Consequently, if $n$ is large we can approximate an integral by a sum as:
$$\int_a^b f(x)\,dx \approx \frac{b-a}{n} \sum_{i=0}^{n} f\left(a + \frac{i(b-a)}{n}\right).$$
As $n \to \infty$ think of $dx_n$ approaching an infinitely small number $dx$ that is not quite zero. The integral $\int_a^b f(x)\,dx$ then can be thought of as summing an infinite number of infinitely small $f(x)\,dx$'s as $x$ goes from $a$ to $b$.
An Example
We showed that:
$$\int_1^{10} x^2\,dx = 333.$$
Here $a = 1$ and $b = 10$. If we set $n = 9$ in the approximation we have:
$$dx_n = \frac{b-a}{n} = \frac{10-1}{9} = 1$$
and so we obtain the approximation:
$$\int_1^{10} x^2\,dx \approx dx_n \sum_{i=0}^{9} (1 + dx_n i)^2 = 1^2 + 2^2 + 3^2 + 4^2 + 5^2 + 6^2 + 7^2 + 8^2 + 9^2 + 10^2 = 385.$$
The error, the difference between 333, the exact value of $\int_1^{10} x^2\,dx$, and the approximate value of 385, can be reduced by increasing $n$. If we let $n = 18$ in the approximation we have:
$$dx_n = \frac{b-a}{n} = \frac{10-1}{18} = \frac{1}{2}$$
so that:
$$\int_1^{10} x^2\,dx \approx \frac{1}{2}\sum_{i=0}^{18}\left(1 + \frac{1}{2}i\right)^2 = \frac{1}{2}\left(1^2 + 1.5^2 + 2^2 + 2.5^2 + \cdots + 9.5^2 + 10^2\right) = 358.625.$$
Using a computer we can make $n$ much larger. If $n = 1000$, so that:
$$dx_n = \frac{10-1}{1000} = \frac{9}{1000},$$
we have:
$$\int_1^{10} x^2\,dx \approx \frac{9}{1000}\sum_{i=0}^{1000}\left(1 + \frac{9}{1000}i\right)^2 = \frac{9}{1000}\left(1^2 + 1.009^2 + 1.018^2 + \cdots + 9.991^2 + 10^2\right) = 333.4546.$$
Finally, if $n = 10000$, so that:
$$dx_n = \frac{10-1}{10000} = \frac{9}{10000},$$
we have:
$$\int_1^{10} x^2\,dx \approx \frac{9}{10000}\sum_{i=0}^{10000}\left(1 + \frac{9}{10000}i\right)^2 = 333.0455.$$
Each time $n$ is made larger the approximation gets better.
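The approximations above are easy to reproduce; a Python sketch of the sum $\frac{b-a}{n}\sum_{i=0}^{n} f(a + i\,dx_n)$ for $f(x) = x^2$ on $[1, 10]$:

```python
def approx(n, a=1.0, b=10.0):
    """The approximation ((b - a)/n) * sum of f(a + i*dx) for i = 0..n,
    with f(x) = x^2."""
    dx = (b - a) / n
    return dx * sum((a + i * dx) ** 2 for i in range(n + 1))

for n in (9, 18, 1000, 10000):
    print(n, approx(n))
# The approximation falls toward the exact value 333 as n grows:
# n = 9 gives 385, n = 18 gives 358.625.
```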
3.3.8 $\int_a^b dx$ as a Linear Operator

Since $\int_a^b dx$ is a limiting form of $\sum_{i=1}^{n}$, and since $\sum_{i=1}^{n}$ is a linear operator, it is not surprising that $\int_a^b dx$ is also a linear operator. We can therefore manipulate expressions involving $\int_a^b dx$ using rules very similar to those for $\sum_{i=1}^{n}$.

Rules for Manipulating the Integration Operator $\int_a^b dx$
Rule 1: $\int_a^b dx$ is a linear operator. That is, if the exponent on the bracket is 1, or given $\int_a^b (f(x) + g(x))^1\,dx$ or equivalently $\int_a^b (f(x) + g(x))\,dx$, then you may take $\int_a^b dx$ inside the brackets and apply it to each term inside as:
$$\int_a^b (f(x) + g(x))\,dx = \int_a^b f(x)\,dx + \int_a^b g(x)\,dx.$$
If the exponent on the brackets is not 1 then this step is not justified. For example:
$$\int_a^b (f(x) + g(x))^2\,dx \ne \left(\int_a^b f(x)\,dx + \int_a^b g(x)\,dx\right)^2.$$
Rule 2: You may pull $\int_a^b$ past any expression that does not depend on $x$ (or whatever the integration variable is), but do not go past the $dx$ term. Thus:
$$\int_a^b cf(x)\,dx = c\int_a^b f(x)\,dx$$
since $c$ does not depend on $x$.

Rule 3: If when moving over $\int_a^b$ you reach $dx$, then set $\int_a^b dx = b - a$. Thus for example:
$$\int_3^{10} dx = 10 - 3 = 7.$$
Rule 4: As long as it does not conflict with other notation, you can change the integration variable without affecting the expression. Thus:
$$\int_a^b f(x)\,dx = \int_a^b f(y)\,dy.$$
Here are some examples:
3.3.9 Example 1: $\int_a^b (cf(x) + dg(x))^2\,dx$

Before we can take $\int_a^b$ through the brackets we need to expand the square to get an exponent of 1. Thus:
$$\int_a^b (cf(x) + dg(x))^2\,dx = \int_a^b \left(c^2 f(x)^2 + 2cd\,f(x)g(x) + d^2 g(x)^2\right)dx.$$
Using Rule 1 and pulling $\int_a^b dx$ through the brackets and applying it to each term we obtain:
$$\int_a^b (cf(x) + dg(x))^2\,dx = \int_a^b c^2 f(x)^2\,dx + \int_a^b 2cd\,f(x)g(x)\,dx + \int_a^b d^2 g(x)^2\,dx.$$
By Rule 2 we can take $\int_a^b dx$ past the terms not depending on $x$, that is $c^2$, $2cd$ and $d^2$, to get:
$$\int_a^b (cf(x) + dg(x))^2\,dx = c^2\int_a^b f(x)^2\,dx + 2cd\int_a^b f(x)g(x)\,dx + d^2\int_a^b g(x)^2\,dx.$$
3.3.10 Example 2: $\int_a^b (f(x) + d)^2\,dx$

Again squaring the brackets to get an exponent of 1 we obtain:
$$\int_a^b (f(x) + d)^2\,dx = \int_a^b \left(f(x)^2 + 2d\,f(x) + d^2\right)dx.$$
Now that the brackets have an exponent of 1 we can pull $\int_a^b dx$ through the brackets to get:
$$\int_a^b (f(x) + d)^2\,dx = \int_a^b f(x)^2\,dx + 2d\int_a^b f(x)\,dx + d^2\int_a^b dx.$$
By Rule 3 the term $\int_a^b dx$ becomes $b - a$, and so we have:
$$\int_a^b (f(x) + d)^2\,dx = \int_a^b f(x)^2\,dx + 2d\int_a^b f(x)\,dx + d^2(b-a).$$
3.4 Random Variables
3.4.1 Introduction
Up until now all the variables we have considered are nonrandom. A random variable is a new kind of variable, one that, roughly speaking, takes on different values with different probabilities.

There are two basic kinds of random variables: 1) discrete random variables, where we can make a numbered list of the outcomes, and 2) continuous random variables, where the outcomes are all possible values along the real line.
3.4.2 Discrete Random Variables
A discrete random variable $X$ has $n$ possible outcomes $x_1, x_2, \ldots, x_n$, each associated with a probability $p_1, p_2, \ldots, p_n$. Note the convention of using upper case letters to denote the random variable, here $X$, and lower case letters to denote possible outcomes, here $x_1, x_2, \ldots, x_n$.
Definition 45 A discrete random variable is a list of outcomes and associated probabilities given as:
$$X = \begin{bmatrix} \text{Outcomes} & \text{Probabilities} \\ x_1 & p_1 \\ x_2 & p_2 \\ \vdots & \vdots \\ x_n & p_n \end{bmatrix}$$
where the probabilities satisfy:

1. $0 \le p_i \le 1$ for $i = 1, 2, \ldots, n$
2. $\sum_{i=1}^n p_i = p_1 + p_2 + \cdots + p_n = 1$.

For example if $X$ is the outcome of rolling a fair die, the outcomes are $1, 2, 3, 4, 5$ and $6$, each with probability $\frac{1}{6}$, so that:
$$X = \begin{bmatrix} \text{Outcomes} & \text{Probabilities} \\ 1 & \tfrac{1}{6} \\ 2 & \tfrac{1}{6} \\ 3 & \tfrac{1}{6} \\ 4 & \tfrac{1}{6} \\ 5 & \tfrac{1}{6} \\ 6 & \tfrac{1}{6} \end{bmatrix}.$$
Given any function $f(x)$ and random variable $X$ you can always create a new random variable $f(X)$ by applying $f(x)$ to each of the outcomes. For example if $X$ is the die roll above and $f(x) = x^2$ then we simply square all the outcomes to obtain:
$$X^2 = \begin{bmatrix} \text{Outcomes} & \text{Probabilities} \\ 1^2 & \tfrac{1}{6} \\ 2^2 & \tfrac{1}{6} \\ 3^2 & \tfrac{1}{6} \\ 4^2 & \tfrac{1}{6} \\ 5^2 & \tfrac{1}{6} \\ 6^2 & \tfrac{1}{6} \end{bmatrix}.$$
Note that it is the outcomes and not the probabilities that get squared!
Degenerate Random Variables
Definition 46 If a random variable has only one outcome, which it takes on with probability $1$, then we say that $X$ is a degenerate random variable.

For example if $X = 3$ with probability $1$, or:
$$X = \begin{bmatrix} \text{Outcomes} & \text{Probabilities} \\ 3 & 1 \end{bmatrix}$$
then $X$ is a degenerate random variable.
3.4.3 The Bernoulli Distribution
Definition 47 A random variable $X$ which takes on two values, $1$ and $0$, with probabilities $p$ and $1 - p$ where $0 \le p \le 1$, is said to have a Bernoulli distribution, or:
$$X = \begin{bmatrix} \text{Outcomes} & \text{Probabilities} \\ 1 & p \\ 0 & 1 - p \end{bmatrix}$$
If $X$ has a Bernoulli distribution we write $X \sim B(p)$.

For example suppose that 30 percent of the population supports the Gamma Party. If I take one person at random and ask him if he supports the Gamma Party, and let $X = 1$ if he says yes and $X = 0$ if he says no, then $X$ has a Bernoulli distribution with probability $p = 0.3$.
3.4.4 The Binomial Distribution
Definition 48 Suppose $X_i \sim B(p)$ are $n$ independent Bernoullis and define $Y$ as their sum:
$$Y = X_1 + X_2 + X_3 + \cdots + X_n.$$
Then $Y$ is said to have a binomial distribution, denoted as $Y \sim B(p, n)$.

Continuing the Gamma Party example above, if I ask $n = 2000$ people taken randomly from the population, then $Y$ is the number of people in my sample who say they support the Gamma Party, or $Y \sim B(0.3, 2000)$.

If $Y \sim B(p, n)$ then there are $n + 1$ possible outcomes $k = 0, 1, 2, \ldots, n$ and it can be shown that
$$\Pr[Y = k] = \frac{n!}{k!(n-k)!} p^k (1-p)^{n-k}, \qquad k = 0, 1, 2, \ldots, n$$
or:
$$Y = \begin{bmatrix} \text{Outcomes} & \text{Probabilities} \\ 0 & \frac{n!}{0!(n-0)!} p^0 (1-p)^{n-0} \\ 1 & \frac{n!}{1!(n-1)!} p^1 (1-p)^{n-1} \\ 2 & \frac{n!}{2!(n-2)!} p^2 (1-p)^{n-2} \\ \vdots & \vdots \\ n & \frac{n!}{n!(n-n)!} p^n (1-p)^{n-n} \end{bmatrix}$$
For example given a one-third probability of success, or $p = \frac{1}{3}$, and four trials, or $n = 4$, so that $Y \sim B\left(\frac{1}{3}, 4\right)$, we have:
$$\Pr[Y = 0] = \frac{4!}{0!(4-0)!} \left(\frac{1}{3}\right)^0 \left(1 - \frac{1}{3}\right)^{4-0} = \frac{16}{81}$$
$$\Pr[Y = 1] = \frac{4!}{1!(4-1)!} \left(\frac{1}{3}\right)^1 \left(1 - \frac{1}{3}\right)^{4-1} = \frac{32}{81}$$
$$\Pr[Y = 2] = \frac{4!}{2!(4-2)!} \left(\frac{1}{3}\right)^2 \left(1 - \frac{1}{3}\right)^{4-2} = \frac{24}{81}$$
$$\Pr[Y = 3] = \frac{4!}{3!(4-3)!} \left(\frac{1}{3}\right)^3 \left(1 - \frac{1}{3}\right)^{4-3} = \frac{8}{81}$$
$$\Pr[Y = 4] = \frac{4!}{4!(4-4)!} \left(\frac{1}{3}\right)^4 \left(1 - \frac{1}{3}\right)^{4-4} = \frac{1}{81}$$
or:
$$Y = \begin{bmatrix} \text{Outcomes} & \text{Probabilities} \\ 0 & \tfrac{16}{81} \\ 1 & \tfrac{32}{81} \\ 2 & \tfrac{24}{81} \\ 3 & \tfrac{8}{81} \\ 4 & \tfrac{1}{81} \end{bmatrix}$$
Verify that the probabilities sum to one.
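This table takes only a few lines to reproduce in code. The sketch below writes the pmf exactly as in the formula for $\Pr[Y = k]$; the helper name `binom_pmf` is an illustrative choice.

```python
from math import comb

def binom_pmf(k, n, p):
    """Pr[Y = k] for Y ~ B(p, n): n!/(k!(n-k)!) p^k (1-p)^(n-k)."""
    return comb(n, k) * p ** k * (1 - p) ** (n - k)

# Reproduce the table for Y ~ B(1/3, 4)
probs = [binom_pmf(k, 4, 1 / 3) for k in range(5)]

print([round(q * 81) for q in probs])   # [16, 32, 24, 8, 1]
print(abs(sum(probs) - 1) < 1e-12)      # the probabilities sum to one
```

The last line carries out the suggested verification that the five probabilities sum to one.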
3.4.5 The Poisson Distribution
The Poisson distribution is an example of a discrete random variable with an infinite number of possible outcomes.

Definition 49 We say that $Y$ has a Poisson distribution with mean parameter $\mu > 0$, or $Y \sim P(\mu)$, if:
$$\Pr[Y = k] = \frac{e^{-\mu} \mu^k}{k!}, \qquad k = 0, 1, 2, \ldots$$
Alternatively we can represent a Poisson random variable as:
$$Y = \begin{bmatrix} \text{Outcomes} & \text{Probabilities} \\ 0 & e^{-\mu} \frac{\mu^0}{0!} \\ 1 & e^{-\mu} \frac{\mu^1}{1!} \\ 2 & e^{-\mu} \frac{\mu^2}{2!} \\ 3 & e^{-\mu} \frac{\mu^3}{3!} \\ \vdots & \vdots \end{bmatrix}.$$
You can verify that the probabilities sum to one from the Taylor series $e^{\mu} = \sum_{k=0}^{\infty} \frac{\mu^k}{k!}$, so that:
$$\sum_{k=0}^{\infty} \Pr[Y = k] = \sum_{k=0}^{\infty} \frac{e^{-\mu} \mu^k}{k!} = e^{-\mu} \sum_{k=0}^{\infty} \frac{\mu^k}{k!} = e^{-\mu} e^{\mu} = 1.$$
We have the following relationship between the Poisson and the binomial distribution, sometimes referred to as the Law of Small Numbers:

Theorem 50 If $Y \sim B(p, n)$ has a binomial distribution and $n \to \infty$, $p \to 0$ but $np \to \mu > 0$, then $Y$ has an approximate Poisson distribution, or $Y \sim P(\mu)$ as $n \to \infty$.

For example consider an intersection where $n = 1000$ cars pass through each year. The probability of any one car having an accident is very small, say $p = \frac{1}{250}$, and let us assume it is independent of the other cars. If $Y$ is then the total number of accidents at the intersection, then
$$Y \sim B\left(\frac{1}{250}, 1000\right).$$
Therefore:
$$\Pr[Y = 0] = \frac{1000!}{0! \times 1000!} \left(\frac{1}{250}\right)^0 \left(1 - \frac{1}{250}\right)^{1000} = 0.01817$$
$$\Pr[Y = 1] = \frac{1000!}{1! \times 999!} \left(\frac{1}{250}\right)^1 \left(1 - \frac{1}{250}\right)^{999} = 0.07296$$
$$\Pr[Y = 2] = \frac{1000!}{2! \times 998!} \left(\frac{1}{250}\right)^2 \left(1 - \frac{1}{250}\right)^{998} = 0.14638$$
$$\Pr[Y = 3] = \frac{1000!}{3! \times 997!} \left(\frac{1}{250}\right)^3 \left(1 - \frac{1}{250}\right)^{997} = 0.19556$$
$$\Pr[Y = 4] = \frac{1000!}{4! \times 996!} \left(\frac{1}{250}\right)^4 \left(1 - \frac{1}{250}\right)^{996} = 0.19556$$
$$\Pr[Y = 5] = \frac{1000!}{5! \times 995!} \left(\frac{1}{250}\right)^5 \left(1 - \frac{1}{250}\right)^{995} = 0.15661$$
$$\vdots$$
If instead we work with the Poisson distribution with $\mu = np = 1000 \times \frac{1}{250} = 4$, or
$$Y \sim P(4),$$
we obtain a very good approximation. Thus:
$$\Pr[Y = 0] \approx \frac{e^{-4} 4^0}{0!} = 0.01831$$
$$\Pr[Y = 1] \approx \frac{e^{-4} 4^1}{1!} = 0.07326$$
$$\Pr[Y = 2] \approx \frac{e^{-4} 4^2}{2!} = 0.14652$$
$$\Pr[Y = 3] \approx \frac{e^{-4} 4^3}{3!} = 0.19537$$
$$\Pr[Y = 4] \approx \frac{e^{-4} 4^4}{4!} = 0.19537$$
$$\Pr[Y = 5] \approx \frac{e^{-4} 4^5}{5!} = 0.15629$$
$$\vdots$$
and so the probability of $3$ (or $4$ for that matter) accidents occurring in one year is $0.19556$ using the exact binomial distribution and $0.19537$ using the Poisson approximation.
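The comparison is easy to reproduce; a minimal sketch (function names are illustrative choices) that computes both columns side by side:

```python
from math import comb, exp, factorial

n, p = 1000, 1 / 250
mu = n * p   # = 4.0, the Poisson mean parameter

def binom_pmf(k):
    """Exact binomial probability Pr[Y = k] for Y ~ B(1/250, 1000)."""
    return comb(n, k) * p ** k * (1 - p) ** (n - k)

def poisson_pmf(k):
    """Poisson approximation e^(-mu) mu^k / k! with mu = np = 4."""
    return exp(-mu) * mu ** k / factorial(k)

# The two columns track each other closely, as the Law of Small Numbers predicts
for k in range(6):
    print(k, binom_pmf(k), poisson_pmf(k))
```

For $k = 3$ the exact binomial value is about $0.1956$ and the Poisson approximation about $0.1954$, matching the figures in the text.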
3.4.6 Continuous Random Variables
A continuous random variable $X$ is one where the possible outcomes may fall anywhere along the real line: $-\infty < x < \infty$. It turns out that the infinity of the real line is greater than the infinity of the number of integers, which means that continuous random variables have to be handled differently than discrete random variables.¹

Definition 51 A continuous random variable $X$ is one where the possible outcomes may fall anywhere along the real line, or $-\infty < x < \infty$. Associated with each outcome $x$ is a probability density $p(x)$ which has the following properties:

1. $p(x) \ge 0$ for all $-\infty < x < \infty$
2. $\int_{-\infty}^{\infty} p(x)\,dx = 1$
3. $\Pr[a \le X \le b] = \int_a^b p(x)\,dx$.

Thus the area under the density between $a$ and $b$ gives the probability that $X$ will fall in that interval.

¹For example, it turns out that the probability that $X$ takes on any particular value $x$ must be zero!
3.4.7 The Uniform Distribution
Definition 52 We say that the random variable $X$ has a uniform distribution with lower bound $a$ and upper bound $b$, or $X \sim U(a, b)$, if:
$$p(x) = \begin{cases} \frac{1}{b-a} & \text{for } a \le x \le b \\ 0 & \text{for } x < a \text{ or } x > b. \end{cases}$$
Note that this satisfies the requirements for a density. In particular $p(x) \ge 0$ and $\int_{-\infty}^{\infty} p(x)\,dx = 1$ since:
$$\int_{-\infty}^{\infty} p(x)\,dx = \int_a^b \frac{1}{b-a}\,dx = \frac{1}{b-a} \int_a^b dx = \frac{1}{b-a} \times (b-a) = 1.$$
For example imagine picking a number between $0$ and $10$, where each number has an equal likelihood of being picked, and calling this $X$. Then $X$ has a uniform distribution with lower bound $0$ and upper bound $10$, or
$$X \sim U(0, 10).$$
Given that $X \sim U(0, 10)$ we have, for $0 \le x \le 10$:
$$p(x) = \frac{1}{10 - 0} = \frac{1}{10}$$
and the probability of picking a number between $5$ and $7$ would be:
$$\Pr[5 \le X \le 7] = \int_5^7 p(x)\,dx = \int_5^7 \frac{1}{10}\,dx = \frac{1}{10} \int_5^7 dx = \frac{1}{10}(7 - 5) = \frac{1}{5}.$$
It is possible for $p(x) > 1$. For example with the uniform distribution when $a = 0$ and $b = \frac{1}{2}$ we have
$$p(x) = \frac{1}{\frac{1}{2} - 0} = 2 > 1$$
for $0 < x < \frac{1}{2}$. Remember that $p(x)$ is a density, not a probability, and so there is nothing paradoxical about this.
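Because the uniform density is constant, the integral in property 3 reduces to a ratio of interval lengths. A minimal sketch (the function name `uniform_prob` is an illustrative choice):

```python
def uniform_prob(a, b, lo, hi):
    """Pr[lo <= X <= hi] for X ~ U(a, b): the integral of the constant
    density 1/(b - a), which is just (length of interval)/(b - a)."""
    lo, hi = max(lo, a), min(hi, b)   # the density is zero outside [a, b]
    return max(hi - lo, 0) / (b - a)

print(uniform_prob(0, 10, 5, 7))     # 0.2, i.e. the 1/5 computed above
print(1 / (0.5 - 0))                 # the U(0, 1/2) density is 2 > 1
```

Clipping the interval to $[a, b]$ before dividing implements the fact that the density vanishes outside the bounds.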
3.4.8 The Standard Normal Distribution
De…nition 53 We say that the continuous random variable Z has a standardnormal distribution or Z » N [0; 1] if it has a density given by:
p (z) =1p2¼e¡
z2
2 ;¡1 < z <1:
A little re‡ection will show that p (z) has a global maximum at z = 0 andthat p (¡z) = p (z) so that it is symmetric around 0: This is con…rmed by theplot of the density found below:
[Figure: The Standard Normal Density]
3.4.9 The Normal Distribution
Definition 54 If a random variable $X$ can be written as $X = \mu + \sigma Z$, where $Z \sim N[0, 1]$ and $\mu$ and $\sigma$ are nonrandom constants, then we say that $X$ has a normal distribution with mean $\mu$ and variance $\sigma^2$, or $X \sim N[\mu, \sigma^2]$.

It can be shown that the normal distribution has density:
$$p(x) = \frac{1}{\sqrt{2\pi\sigma^2}} e^{-\frac{(x-\mu)^2}{2\sigma^2}}, \qquad -\infty < x < \infty.$$
For example if $\mu = 65$ and $\sigma = 15$ we have:
$$p(x) = \frac{1}{\sqrt{2\pi \cdot 15^2}} e^{-\frac{(x-65)^2}{2 \times 15^2}}$$
which is plotted below:

[Figure: Normal density with $\mu = 65$ and $\sigma = 15$]
3.4.10 The Chi-Squared Distribution
Definition 55 If $Y = Z_1^2 + Z_2^2 + \cdots + Z_r^2$, where the $Z_i$'s are independent standard normals, then $Y$ is said to have a chi-squared distribution with $r$ degrees of freedom, or $Y \sim \chi^2_r$.

The chi-squared distribution is very important in econometrics. For example $s^2$, the sample variance, turns out (suitably scaled) to have a chi-squared distribution.

It turns out that if $Y \sim \chi^2_r$ then $Y$ has density:
$$p(y) = \frac{y^{\frac{r}{2}-1} e^{-y/2}}{2^{r/2}\,\Gamma\!\left(\frac{r}{2}\right)}, \qquad 0 \le y < \infty.$$
For example if $r = 3$:
$$p(y) = \frac{y^{\frac{3}{2}-1} e^{-y/2}}{2^{3/2}\,\Gamma\!\left(\frac{3}{2}\right)} = \frac{1}{\sqrt{2\pi}} \sqrt{y}\, e^{-\frac{1}{2}y}$$
which is plotted below:

[Figure: Chi-squared density with $r = 3$]
3.4.11 The Student’s t Distribution:
Definition 56 If $Z$ has a standard normal distribution and $Y$ has a chi-squared distribution with $r$ degrees of freedom, and if $Z$ and $Y$ are independent, then:
$$X = \frac{Z}{\sqrt{\frac{Y}{r}}}$$
is said to have a Student's $t$ distribution with $r$ degrees of freedom, which is denoted as $X \sim T_r$.

Again, this is an important distribution in econometrics. The $t$ distribution has the same bell shape as the standard normal distribution; in fact as $r \to \infty$ it turns out that the $t$ distribution approaches a standard normal.

It can be shown that if $X \sim T_r$ then $X$ has density:
$$p(x) = \frac{\Gamma\!\left(\frac{r+1}{2}\right)}{\Gamma\!\left(\frac{1}{2}\right)\Gamma\!\left(\frac{r}{2}\right)\sqrt{r}} \left(1 + \frac{x^2}{r}\right)^{-\frac{r+1}{2}}, \qquad -\infty < x < \infty.$$
For example if $r = 7$:
$$p(x) = \frac{\Gamma\!\left(\frac{7+1}{2}\right)}{\Gamma\!\left(\frac{1}{2}\right)\Gamma\!\left(\frac{7}{2}\right)\sqrt{7}} \left(1 + \frac{x^2}{7}\right)^{-\frac{7+1}{2}} = \frac{16}{5\pi\sqrt{7}} \cdot \frac{1}{\left(1 + \frac{x^2}{7}\right)^4}$$
which is plotted below:

[Figure: Student's $t$ density with $r = 7$]
3.4.12 The F Distribution:
Definition 57 If $Y_1 \sim \chi^2_r$ has a chi-squared distribution with $r$ degrees of freedom and $Y_2 \sim \chi^2_s$ has a chi-squared distribution with $s$ degrees of freedom, and if $Y_1$ and $Y_2$ are independent, then the ratio:
$$Y = \frac{Y_1 / r}{Y_2 / s}$$
has an $F$ distribution with $r$ degrees of freedom in the numerator and $s$ degrees of freedom in the denominator. This is denoted as $Y \sim F_{r,s}$.

Again this is an important distribution in econometrics; often, for example, hypothesis tests lead to test statistics with an $F$ distribution. The density for an $F$ distribution can be written as:
$$p(y) = \frac{\left(\frac{r}{s}\right)^{r/2} \Gamma\!\left(\frac{r+s}{2}\right)}{\Gamma\!\left(\frac{r}{2}\right)\Gamma\!\left(\frac{s}{2}\right)} \cdot \frac{y^{r/2-1}}{\left(1 + \frac{ry}{s}\right)^{\frac{r+s}{2}}}, \qquad 0 \le y < \infty.$$
For example if $r = 3$ and $s = 7$ then the density is:
$$p(y) = \frac{\left(\frac{3}{7}\right)^{3/2} \Gamma(5)}{\Gamma\!\left(\frac{3}{2}\right)\Gamma\!\left(\frac{7}{2}\right)} \cdot \frac{y^{3/2-1}}{\left(1 + \frac{3y}{7}\right)^5} = \frac{384\sqrt{3}}{35\pi\sqrt{7}} \cdot \frac{y^{1/2}}{\left(1 + \frac{3y}{7}\right)^5}$$
which is plotted below:

[Figure: $F$ density with $r = 3$ and $s = 7$]
3.5 Expected Values
Suppose the random variable $X$ is, say, the growth of GNP next year and that:
$$X = \begin{bmatrix} \text{Outcomes} & \text{Probabilities} \\ 1\% & \tfrac{1}{4} \\ 2\% & \tfrac{1}{2} \\ 3\% & \tfrac{1}{4} \end{bmatrix}$$
Your job is to forecast what GNP growth will be. To do this you must pick some number, which will be your forecast, as your best guess of what GNP growth will be. One way of doing this is to use the expected value of $X$, written as $E[X]$ or $\mu$ or $\mu_X$. In the above example $\mu = \mu_X = E[X] = 2\%$.

This concept turns out to be fundamental for the study of random variables. We will first discuss it for discrete random variables and then for continuous random variables.
3.5.1 E [X] for Discrete Random Variables
Definition 58 The expectation of a discrete random variable $X$ is:
$$E[X] = x_1 p_1 + x_2 p_2 + \cdots + x_n p_n = \sum_{i=1}^n x_i p_i.$$
The recipe for calculating $E[X]$ given the outcomes $x_1, x_2, \ldots, x_n$ and probabilities $p_1, p_2, \ldots, p_n$ is as follows:

Recipe for Calculating $E[X]$

1. Take each outcome $x_i$
2. Weight it by its probability: $x_i \times p_i$
3. And sum to get: $E[X] = \sum_{i=1}^n x_i p_i$.

This may be represented as:
$$\begin{bmatrix} \text{Outcomes} & \text{Probabilities} & \text{Product} \\ x_1 & p_1 & x_1 \times p_1 \\ x_2 & p_2 & x_2 \times p_2 \\ x_3 & p_3 & x_3 \times p_3 \\ \vdots & \vdots & \vdots \\ x_n & p_n & x_n \times p_n \\ & & E[X] \equiv \sum_{i=1}^n x_i \times p_i \end{bmatrix}$$
Note that $E[X]$, unlike $X$, is not random.

More generally we can take the expectation of a function of a random variable. Given the function $f(x)$ we have a new random variable $Y = f(X)$ with outcomes $f(x_1), f(x_2), \ldots, f(x_n)$ and associated probabilities $p_1, p_2, \ldots, p_n$, in which case:

Definition 59 Given a discrete random variable $X$ then $E[f(X)]$ is defined as:
$$E[f(X)] = f(x_1) p_1 + f(x_2) p_2 + \cdots + f(x_n) p_n = \sum_{i=1}^n f(x_i) p_i.$$
This may be represented as:
$$\begin{bmatrix} \text{Outcomes} & \text{Probabilities} & \text{Product} \\ f(x_1) & p_1 & f(x_1) \times p_1 \\ f(x_2) & p_2 & f(x_2) \times p_2 \\ f(x_3) & p_3 & f(x_3) \times p_3 \\ \vdots & \vdots & \vdots \\ f(x_n) & p_n & f(x_n) \times p_n \\ & & E[f(X)] \equiv \sum_{i=1}^n f(x_i) \times p_i \end{bmatrix}$$
One of the things you must be careful about is not confusing $E[f(X)]$ with $f(E[X])$. In general the two are different, except for linear functions. This is because, like $\sum$ and $\int$, it turns out that $E[\,]$ is a linear operator.

We will illustrate this difference by comparing $E[X]^2$ with $E[X^2]$ for each of the random variables considered below. In general we will see that:
$$E[X^2] \ge E[X]^2$$
with equality only when $X$ is a degenerate random variable; that is, when it takes on only one value with probability $1$.
3.5.2 Example 1: Die Roll

Let the random variable $X$ be the outcome of rolling a fair die so that $X$ is defined as:
$$X = \begin{bmatrix} \text{Outcomes} & \text{Probabilities} \\ 1 & \tfrac{1}{6} \\ 2 & \tfrac{1}{6} \\ 3 & \tfrac{1}{6} \\ 4 & \tfrac{1}{6} \\ 5 & \tfrac{1}{6} \\ 6 & \tfrac{1}{6} \end{bmatrix}$$
Then by weighting each outcome by its probability we obtain:
$$E[X] = 1 \times \frac{1}{6} + 2 \times \frac{1}{6} + 3 \times \frac{1}{6} + 4 \times \frac{1}{6} + 5 \times \frac{1}{6} + 6 \times \frac{1}{6} = \frac{7}{2} = 3.5.$$
If $f(x) = x^2$ then we have:
$$X^2 = \begin{bmatrix} \text{Outcomes} & \text{Probabilities} \\ 1^2 & \tfrac{1}{6} \\ 2^2 & \tfrac{1}{6} \\ 3^2 & \tfrac{1}{6} \\ 4^2 & \tfrac{1}{6} \\ 5^2 & \tfrac{1}{6} \\ 6^2 & \tfrac{1}{6} \end{bmatrix}$$
so that weighting each outcome by its probability we obtain:
$$E[X^2] = 1^2 \times \frac{1}{6} + 2^2 \times \frac{1}{6} + 3^2 \times \frac{1}{6} + 4^2 \times \frac{1}{6} + 5^2 \times \frac{1}{6} + 6^2 \times \frac{1}{6} = \frac{91}{6} = 15.167.$$
We thus have:
$$15.167 = \frac{91}{6} = E[X^2] > E[X]^2 = \left(\frac{7}{2}\right)^2 = 12.25.$$
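The weighted-sum recipe above translates directly into code; a sketch using exact fractions so the results match $\frac{7}{2}$ and $\frac{91}{6}$ exactly:

```python
from fractions import Fraction

# The die-roll expectations above, computed with exact fractions.
outcomes = [1, 2, 3, 4, 5, 6]
p = Fraction(1, 6)

EX = sum(x * p for x in outcomes)           # weight each outcome and sum
EX2 = sum(x ** 2 * p for x in outcomes)     # square the outcomes first, then weight

print(EX, EX2)          # 7/2 91/6
print(EX2 > EX ** 2)    # True: E[X^2] > E[X]^2, as claimed
```

Using `Fraction` rather than floats keeps the arithmetic exact, which makes the comparison with the hand computation unambiguous.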
3.5.3 Example 2: Binomial Distribution

Given the probabilities calculated for the binomial distribution where $Y \sim B\left(\frac{1}{3}, 4\right)$ we have:
$$Y = \begin{bmatrix} \text{Outcomes} & \text{Probabilities} \\ 0 & \tfrac{16}{81} \\ 1 & \tfrac{32}{81} \\ 2 & \tfrac{24}{81} \\ 3 & \tfrac{8}{81} \\ 4 & \tfrac{1}{81} \end{bmatrix}$$
and so weighting each outcome by its probability we have:
$$E[Y] = 0 \times \frac{16}{81} + 1 \times \frac{32}{81} + 2 \times \frac{24}{81} + 3 \times \frac{8}{81} + 4 \times \frac{1}{81} = \frac{4}{3}.$$
If $f(x) = x^2$ we have:
$$Y^2 = \begin{bmatrix} \text{Outcomes} & \text{Probabilities} \\ 0^2 & \tfrac{16}{81} \\ 1^2 & \tfrac{32}{81} \\ 2^2 & \tfrac{24}{81} \\ 3^2 & \tfrac{8}{81} \\ 4^2 & \tfrac{1}{81} \end{bmatrix}$$
and so weighting outcomes by probabilities we have:
$$E[Y^2] = 0^2 \times \frac{16}{81} + 1^2 \times \frac{32}{81} + 2^2 \times \frac{24}{81} + 3^2 \times \frac{8}{81} + 4^2 \times \frac{1}{81} = \frac{8}{3}.$$
Again we have that:
$$2.6667 = \frac{8}{3} = E[Y^2] > E[Y]^2 = \left(\frac{4}{3}\right)^2 = 1.7778.$$
3.5.4 $E[X]$ for Continuous Random Variables

As with discrete random variables we can define the expectation of a continuous random variable. The basic idea is exactly the same: we take each outcome $x$, weight it by its probability $p(x)\,dx$ and sum, using $\int_{-\infty}^{\infty}$ instead of $\sum_{i=1}^n$. We thus have:

Definition 60 If $X$ is a continuous random variable then:
$$E[X] = \int_{-\infty}^{\infty} x\,p(x)\,dx.$$

Definition 61 Given a continuous random variable $X$ then $E[f(X)]$ is defined as:
$$E[f(X)] = \int_{-\infty}^{\infty} f(x)\,p(x)\,dx.$$
3.5.5 Example 1: The Uniform Distribution

Theorem 62 If $X \sim U(a, b)$ has a uniform distribution then:
$$E[X] = \frac{b + a}{2}.$$
Proof. Since $p(x) = \frac{1}{b-a}$ for $a < x < b$ and $0$ otherwise, we have:
$$E[X] = \int_{-\infty}^{\infty} x\,p(x)\,dx = \int_a^b x\,\frac{1}{b-a}\,dx = \frac{1}{b-a} \int_a^b x\,dx = \frac{1}{b-a} \left.\frac{x^2}{2}\right|_a^b = \frac{1}{2}\frac{b^2 - a^2}{b-a} = \frac{1}{2}\frac{(b-a)(b+a)}{b-a} = \frac{b+a}{2}.$$
Thus $E[X] = \frac{b+a}{2}$ is the midpoint between $a$ and $b$. For example if $X \sim U(0, 10)$, then
$$E[X] = \frac{0 + 10}{2} = 5.$$

Theorem 63 If $X \sim U(a, b)$ has a uniform distribution then:
$$E[X^2] = \frac{b^2 + ab + a^2}{3}.$$
Proof. To calculate $E[X^2]$ we have:
$$E[X^2] = \int_{-\infty}^{\infty} x^2 p(x)\,dx = \int_a^b x^2 \frac{1}{b-a}\,dx = \frac{1}{b-a} \int_a^b x^2\,dx = \frac{1}{b-a} \left.\frac{x^3}{3}\right|_a^b = \frac{1}{3}\frac{b^3 - a^3}{b-a} = \frac{1}{3}\frac{(b-a)(b^2 + ab + a^2)}{b-a} = \frac{b^2 + ab + a^2}{3}.$$
Thus if $X \sim U(0, 10)$, we have from the above formula that:
$$E[X^2] = \frac{10^2 + 10 \times 0 + 0^2}{3} = \frac{100}{3} = 33.3.$$
Note that
$$33.3 = E[X^2] > E[X]^2 = (5)^2 = 25.$$
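The two uniform moments can be checked by numerical integration of Definitions 60 and 61; the midpoint-rule helper below is an illustrative sketch, not part of the text.

```python
def midpoint_integral(f, a, b, n=20000):
    """Midpoint-rule approximation of the integral of f over [a, b]."""
    h = (b - a) / n
    return sum(f(a + (i + 0.5) * h) for i in range(n)) * h

a, b = 0.0, 10.0                      # X ~ U(0, 10)
density = lambda x: 1.0 / (b - a)     # p(x) = 1/(b - a) on [a, b]

EX = midpoint_integral(lambda x: x * density(x), a, b)
EX2 = midpoint_integral(lambda x: x ** 2 * density(x), a, b)

print(abs(EX - (a + b) / 2) < 1e-6)                       # E[X] = 5
print(abs(EX2 - (b ** 2 + a * b + a ** 2) / 3) < 1e-6)    # E[X^2] = 100/3
```

Both checks print `True`, confirming Theorems 62 and 63 for this case.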
3.5.6 The Standard Normal Distribution

If $Z \sim N[0, 1]$ then it can be shown that:
$$E[Z] = \int_{-\infty}^{\infty} z\,\frac{1}{\sqrt{2\pi}} e^{-\frac{z^2}{2}}\,dz = 0.$$
This follows from the symmetry of $p(z) = \frac{1}{\sqrt{2\pi}} e^{-\frac{z^2}{2}}$ around $0$. In fact it can be shown that:

Theorem 64 If $Z \sim N[0, 1]$ then:
$$E[Z^k] = 0 \qquad \text{for } k = 1, 3, 5, \ldots$$
For $k = 2$ we have:

Theorem 65 If $Z \sim N[0, 1]$ then:
$$E[Z^2] = 1.$$
Proof. We have:
$$E[Z^2] = \int_{-\infty}^{\infty} z^2 \frac{1}{\sqrt{2\pi}} e^{-\frac{z^2}{2}}\,dz = \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{\infty} z \left(z e^{-\frac{z^2}{2}}\right) dz.$$
Using integration by parts, where $u(z) = z$, $u'(z) = 1$, $v'(z) = z e^{-\frac{z^2}{2}}$ and $v(z) = -e^{-\frac{z^2}{2}}$, we have:
$$\frac{1}{\sqrt{2\pi}} \int_{-\infty}^{\infty} z \left(z e^{-\frac{z^2}{2}}\right) dz = \left. -\frac{1}{\sqrt{2\pi}} z e^{-\frac{z^2}{2}} \right|_{-\infty}^{\infty} + \int_{-\infty}^{\infty} \frac{1}{\sqrt{2\pi}} e^{-\frac{z^2}{2}}\,dz = 1$$
since:
$$\lim_{z \to \pm\infty} -z e^{-\frac{z^2}{2}} = 0$$
and
$$\int_{-\infty}^{\infty} \frac{1}{\sqrt{2\pi}} e^{-\frac{z^2}{2}}\,dz = \int_{-\infty}^{\infty} p(z)\,dz = 1$$
since $p(z)$ is a density.

We thus have for $Z \sim N[0, 1]$ that:
$$1 = E[Z^2] > E[Z]^2 = 0^2 = 0.$$
It is possible to show the following using integration by parts:

Theorem 66 If $Z \sim N[0, 1]$ then:
$$E[Z^4] = 3.$$
The number $3$ is the kurtosis of the standard normal distribution and is used in econometrics to tell if data are well described by the normal distribution.

Proof. To show that:
$$\int_{-\infty}^{\infty} z^4 \frac{1}{\sqrt{2\pi}} e^{-\frac{z^2}{2}}\,dz = 3$$
use integration by parts where $u(z) = z^3$, $u'(z) = 3z^2$, $v'(z) = z e^{-\frac{z^2}{2}}$ and $v(z) = -e^{-\frac{z^2}{2}}$. From this you obtain:
$$\int_{-\infty}^{\infty} z^4 \frac{1}{\sqrt{2\pi}} e^{-\frac{z^2}{2}}\,dz = \underbrace{\left. -\frac{1}{\sqrt{2\pi}} z^3 e^{-\frac{z^2}{2}} \right|_{z=-\infty}^{z=\infty}}_{=0} + 3 \underbrace{\int_{-\infty}^{\infty} z^2 \frac{1}{\sqrt{2\pi}} e^{-\frac{z^2}{2}}\,dz}_{=1} = 3$$
since $e^{-\frac{z^2}{2}}$ dominates $z^3$ for large $z$, and by our previous proof that $E[Z^2] = 1$.

More generally it can be shown, using a proof by induction, that for even powers $E[Z^{2k}]$:

Theorem 67 If $Z \sim N[0, 1]$ then:
$$E[Z^{2k}] = \frac{2^k \Gamma\!\left(k + \frac{1}{2}\right)}{\Gamma\!\left(\frac{1}{2}\right)} \qquad \text{for } k = 0, 1, 2, 3, \ldots$$

Example 68 Using this result we find for $k = 4$ that:
$$E[Z^8] = \frac{2^4 \Gamma\!\left(4 + \frac{1}{2}\right)}{\Gamma\!\left(\frac{1}{2}\right)} = 105.$$
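The moment formula in Theorem 67 can be checked against a direct numerical integration of $E[Z^k] = \int z^k p(z)\,dz$. The sketch below is illustrative: the truncation point `L` and grid size `n` are ad hoc choices, and `normal_moment` is not a name from the text.

```python
from math import exp, gamma, pi, sqrt

def normal_moment(k, n=50000, L=12.0):
    """E[Z^k] for Z ~ N[0,1], approximated by a midpoint rule on [-L, L]
    (the tails beyond |z| = 12 are numerically negligible)."""
    h = 2 * L / n
    total = 0.0
    for i in range(n):
        z = -L + (i + 0.5) * h
        total += z ** k * exp(-z * z / 2)
    return total * h / sqrt(2 * pi)

# Compare with the closed form E[Z^(2k)] = 2^k Gamma(k + 1/2) / Gamma(1/2)
for k in [1, 2, 4]:
    closed = 2 ** k * gamma(k + 0.5) / gamma(0.5)
    print(2 * k, round(normal_moment(2 * k), 3), closed)  # moments 1, 3 and 105
```

The quadrature reproduces $E[Z^2] = 1$, $E[Z^4] = 3$ and $E[Z^8] = 105$ to several decimal places.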
3.5.7 $E[\,]$ as a Linear Operator

Discussion

Since the expectations operator $E[\,]$ is based on $\sum$ for discrete random variables and $\int$ for continuous random variables, and since both $\sum$ and $\int$ are linear operators, it turns out that $E[\,]$ inherits these properties and is a linear operator as well. This means that if $f(x)$ is a linear function then:
$$E[f(X)] = f(E[X]) \qquad \text{for } f(x) \text{ a linear function.}$$
For example given the linear function $f(x) = ax + b$ with $X$ a continuous random variable we have:
$$E[f(X)] = \int_{-\infty}^{\infty} f(x)\,p(x)\,dx = \int_{-\infty}^{\infty} (ax + b)\,p(x)\,dx = a \underbrace{\int_{-\infty}^{\infty} x\,p(x)\,dx}_{=E[X]} + b \underbrace{\int_{-\infty}^{\infty} p(x)\,dx}_{=1} = aE[X] + b = f(E[X]),$$
while if $X$ is discrete:
$$E[f(X)] = \sum_{i=1}^n f(x_i)\,p_i = \sum_{i=1}^n (ax_i + b)\,p_i = a \underbrace{\sum_{i=1}^n x_i p_i}_{=E[X]} + b \underbrace{\sum_{i=1}^n p_i}_{=1} = aE[X] + b = f(E[X]).$$
However if $f(x)$ is nonlinear then we cannot take $E[\,]$ inside the brackets. This is illustrated above by the nonlinear function $f(x) = x^2$, where we showed for the distributions above that:
$$E[X^2] > E[X]^2.$$
Up until now we have derived results by expressing $E[f(X)]$ either as an integral $\int_{-\infty}^{\infty} f(x)\,p(x)\,dx$ or as a summation $\sum_{i=1}^n f(x_i)\,p_i$. For many derivations it turns out that this is not necessary: one can derive results directly using the fact that $E[\,]$ is a linear operator. To do this you need to know the rules given below.
3.5.8 The Rules for $E[\,]$

Rules for Manipulating the Expectations Operator $E[\,]$

Rule 1: $E[\,]$ is a linear operator. That is, if $X$ and $Y$ are two random variables and if the exponent on the bracket is $1$, as in $E\left[(X + Y)^1\right]$ or equivalently $E[(X + Y)]$, then you may take $E[\,]$ inside the brackets and apply it to each term inside, as:
$$E[(X + Y)] = E[X] + E[Y].$$
More generally if $X_1, X_2, \ldots, X_m$ are $m$ random variables then:
$$E[(X_1 + X_2 + \cdots + X_m)] = E[X_1] + E[X_2] + \cdots + E[X_m].$$
This only works for linear functions! If the exponent on the brackets is not $1$ then this step is not justified. For example:
$$E\left[(X + Y)^2\right] \ne (E[X] + E[Y])^2.$$
In general if $f(x)$ is a nonlinear function then
$$E[f(X + Y)] \ne f(E[X] + E[Y]);$$
for example if $f(x) = e^x$, which is not linear, then:
$$E\left[e^{X+Y}\right] \ne e^{E[X] + E[Y]}.$$
Rule 2: You may always pull $E[\,]$ past any expression that is nonrandom. For example if $a$ is a nonrandom constant (e.g. $a = 3$) then:
$$E[aX] = aE[X]$$
but if $Y$ is another random variable:
$$E[XY] \ne Y E[X]$$
and in general:
$$E[XY] \ne E[X]E[Y].$$
Rule 3: The expectation of a nonrandom constant is just that constant. That is, if $a$ is a nonrandom constant then:
$$E[a] = a.$$
Thus $E[7] = 7$.

Here are some examples:
3.5.9 Example 1: $E\left[(3X + 4Y)^2\right]$

To pull $E[\,]$ through the brackets we must expand to get an exponent of $1$. Thus:
$$E\left[(3X + 4Y)^2\right] = E\left[\left(9X^2 + 24XY + 16Y^2\right)\right].$$
Pulling $E[\,]$ through the brackets we obtain:
$$E\left[(3X + 4Y)^2\right] = E\left[9X^2\right] + E[24XY] + E\left[16Y^2\right].$$
Since $9$, $24$ and $16$ are nonrandom, we can pull $E[\,]$ past them to obtain:
$$E\left[(3X + 4Y)^2\right] = 9E\left[X^2\right] + 24E[XY] + 16E\left[Y^2\right].$$
3.5.10 Example 2: $E\left[(3X + 4)^2\right]$

To pull $E[\,]$ through the brackets we must expand to get an exponent of $1$. Thus:
$$E\left[(3X + 4)^2\right] = E\left[\left(9X^2 + 24X + 16\right)\right].$$
Pulling $E[\,]$ through the brackets we obtain:
$$E\left[(3X + 4)^2\right] = E\left[9X^2\right] + E[24X] + E[16].$$
Since $9$ and $24$ are nonrandom we can pull $E[\,]$ past them, and use the fact that $E[16] = 16$, to obtain:
$$E\left[(3X + 4)^2\right] = 9E\left[X^2\right] + 24E[X] + 16.$$
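The identity in Example 2 holds for any distribution of $X$, which a small Monte Carlo sketch makes vivid. The choice of distribution below (a normal shifted to mean $2$) is an arbitrary illustration, not from the text.

```python
import random

random.seed(0)
draws = [random.gauss(0, 1) + 2 for _ in range(10000)]    # sample of X values

EX = sum(draws) / len(draws)                              # estimate of E[X]
EX2 = sum(x * x for x in draws) / len(draws)              # estimate of E[X^2]
lhs = sum((3 * x + 4) ** 2 for x in draws) / len(draws)   # estimate of E[(3X+4)^2]
rhs = 9 * EX2 + 24 * EX + 16                              # 9 E[X^2] + 24 E[X] + 16

# The expansion holds sample by sample, so the two estimates agree
# up to floating-point rounding, whatever the distribution of X.
print(abs(lhs - rhs) < 1e-6)
```

The agreement is exact (not just approximate) because the sample mean, like $E[\,]$, is itself a linear operator.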
3.5.11 Example 3: $X \sim N[\mu, \sigma^2]$

If $X \sim N[\mu, \sigma^2]$ then we can write $X$ as $X = \mu + \sigma Z$ where $Z \sim N[0, 1]$. It follows that:
$$E[X] = E[(\mu + \sigma Z)] = E[\mu] + \sigma \underbrace{E[Z]}_{=0} = \mu$$
since $\mu$ is nonrandom and we have already shown that $E[Z] = 0$. Thus if $X \sim N[6, 25]$ then $E[X] = 6$.

Also, since we have already shown that $E[Z^2] = 1$, it follows that:
$$E[X^2] = E\left[(\mu + \sigma Z)^2\right] = E\left[\left(\mu^2 + 2\mu\sigma Z + \sigma^2 Z^2\right)\right] = \mu^2 + 2\mu\sigma \underbrace{E[Z]}_{=0} + \sigma^2 \underbrace{E[Z^2]}_{=1} = \mu^2 + \sigma^2.$$
Thus if $X \sim N[6, 25]$ then $E[X^2] = 6^2 + 25 = 61$. We thus have:

Theorem 69 If $X \sim N[\mu, \sigma^2]$ then:
$$\mu^2 + \sigma^2 = E[X^2] > E[X]^2 = \mu^2.$$
3.6 Variances

3.6.1 Definition

While the population mean $\mu = E[X]$ is used to measure the central tendency of the random variable $X$, the variance $Var[X] \equiv \sigma^2 \equiv \sigma^2_X$ is used to measure the dispersion or volatility of $X$. For example if $X$ is the return on some stock then $E[X]$ is used to measure the expected return, while $Var[X]$ provides a measure of the riskiness of that particular stock.

Definition 70 The variance of a random variable is defined as:
$$Var[X] \equiv E\left[(X - E[X])^2\right] \equiv E\left[(X - \mu)^2\right];$$
that is, if $f(x) = (x - \mu)^2$ then $Var[X] = E[f(X)]$.

Definition 71 The standard deviation of $X$ is the positive square root of the variance and is denoted as:
$$\sigma \equiv \sigma_X \equiv \sqrt{Var[X]}.$$
3.6.2 Some Results for Variances

Theorem 72 $Var[X] \ge 0$.

Theorem 73 $Var[X] = 0$ if and only if $X$ is a degenerate random variable. That is, if $Var[X] = 0$ then $X$ is equal to a nonrandom constant with probability $1$.

Thus for example $Var[7] = 0$, and $Var[X] = 0$ implies that $\Pr[X = c] = 1$ for some $c$.

Theorem 74 If $c$ is a nonrandom constant and $X$ is a random variable then:²
$$Var[X + c] = Var[X].$$
Thus for example $Var[X + 99999999] = Var[X]$.

²This follows since $E[X + c] = E[X] + c$ and so:
$$Var[X + c] \equiv E\left[(X + c - E[X + c])^2\right] = E\left[(X + c - c - E[X])^2\right] = E\left[(X - E[X])^2\right] \equiv Var[X].$$
Theorem 75 $Var[X] = E[X^2] - E[X]^2 = E[X^2] - \mu^2$.

Proof. Using the rules for $E[\,]$, and expanding the brackets to have an exponent of $1$ in the definition of the variance, we find that:
$$Var[X] = E\left[(X - \mu)^2\right] = E\left[\left(X^2 - 2\mu X + \mu^2\right)\right] = E\left[X^2\right] + E[-2\mu X] + E\left[\mu^2\right] = E\left[X^2\right] - 2\mu E[X] + \mu^2.$$
By definition $\mu = E[X]$, so that:
$$Var[X] = E\left[X^2\right] - 2E[X]E[X] + E[X]^2 = E\left[X^2\right] - E[X]^2.$$
Alternatively we can always replace $E[X]$ with $\mu$ to obtain:
$$Var[X] = E\left[X^2\right] - \mu^2.$$

Theorem 76 $E[X^2] = E[X]^2 + Var[X]$.

Theorem 77 $E[X^2] \ge E[X]^2$, with equality only when $X$ is a degenerate random variable.

Proof. This follows from Theorem 76, since $Var[X] \ge 0$ and hence:
$$E\left[X^2\right] = E[X]^2 + Var[X] \ge E[X]^2$$
with equality only when $Var[X] = 0$, that is, when $X$ is a degenerate random variable.
3.6.3 Example 1: Die Roll

Suppose $X$ is the result of rolling a fair die, so that $X$ can take on the outcomes $1, 2, 3, 4, 5$ and $6$, each with probability $\frac{1}{6}$. We saw above that:
$$E[X] = 1 \times \frac{1}{6} + 2 \times \frac{1}{6} + \cdots + 6 \times \frac{1}{6} = 3.5.$$
To calculate $Var[X]$ from the definition we need to calculate $E[f(X)]$ where $f(x) = (x - 3.5)^2$. We thus have:
$$(X - 3.5)^2 = \begin{bmatrix} \text{Outcomes} & \text{Probabilities} \\ (1 - 3.5)^2 & \tfrac{1}{6} \\ (2 - 3.5)^2 & \tfrac{1}{6} \\ (3 - 3.5)^2 & \tfrac{1}{6} \\ (4 - 3.5)^2 & \tfrac{1}{6} \\ (5 - 3.5)^2 & \tfrac{1}{6} \\ (6 - 3.5)^2 & \tfrac{1}{6} \end{bmatrix}$$
so that weighting outcomes with probabilities we obtain:
$$Var[X] = (1 - 3.5)^2 \times \frac{1}{6} + (2 - 3.5)^2 \times \frac{1}{6} + \cdots + (6 - 3.5)^2 \times \frac{1}{6} = 2.916667.$$
The standard deviation of $X$ is then:
$$\sigma = \sqrt{2.916667} = 1.707825.$$
Another way to calculate $Var[X]$ is to calculate $E[X^2]$ as before:
$$E[X^2] = 1^2 \times \frac{1}{6} + 2^2 \times \frac{1}{6} + \cdots + 6^2 \times \frac{1}{6} = \frac{91}{6}$$
and then to use Theorem 75, so that:
$$Var[X] = \frac{91}{6} - (3.5)^2 = 2.916667.$$
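Both routes to the die-roll variance can be carried out in exact arithmetic; a sketch using fractions (so that $2.916667$ appears in its exact form $\frac{35}{12}$):

```python
from fractions import Fraction

outcomes = [1, 2, 3, 4, 5, 6]
p = Fraction(1, 6)

EX = sum(x * p for x in outcomes)                        # 7/2
var_def = sum((x - EX) ** 2 * p for x in outcomes)       # the definition of Var[X]
var_short = sum(x ** 2 * p for x in outcomes) - EX ** 2  # the Theorem 75 shortcut

print(var_def, var_short)     # 35/12 35/12, i.e. 2.916667 both ways
print(var_def == var_short)   # True
```

The two computations agree exactly, illustrating that the shortcut formula is an identity and not an approximation.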
3.6.4 Example 2: Uniform Distribution

Suppose $X \sim U(a, b)$ has a uniform distribution. We have:

Theorem 78 If $X \sim U(a, b)$ has a uniform distribution then:
$$Var[X] = \frac{(b - a)^2}{12}.$$
Proof. We have already seen that:
$$E[X] = \frac{a + b}{2} \qquad \text{and} \qquad E[X^2] = \frac{b^2 + ab + a^2}{3}.$$
Consequently we have:
$$Var[X] = E[X^2] - E[X]^2 = \frac{b^2 + ab + a^2}{3} - \left(\frac{a + b}{2}\right)^2 = \frac{(b - a)^2}{12}.$$
Thus if $X \sim U(0, 10)$ then:
$$Var[X] = \frac{(10 - 0)^2}{12} = \frac{25}{3}$$
so the standard deviation is:
$$\sigma = \sqrt{\frac{25}{3}} = \frac{5}{\sqrt{3}}.$$
3.6.5 Example 3: The Normal Distribution

If $Z \sim N[0, 1]$ then $Var[Z] = 1$, since we have already shown that $E[Z^2] = 1$ and $E[Z] = 0$. Thus we have:
$$Var[Z] = E[Z^2] - E[Z]^2 = 1 - 0^2 = 1.$$
From this we can show that:

Theorem 79 If $X \sim N[\mu, \sigma^2]$ then:
$$E[X] = \mu, \qquad Var[X] = \sigma^2.$$
Proof. Since $X = \mu + \sigma Z$, where $Z \sim N[0, 1]$ is a standard normal, it follows that:
$$E[X] = \mu + \sigma E[Z] = \mu.$$
To show that $Var[X] = \sigma^2$, note that we have already shown that $E[X^2] = \mu^2 + \sigma^2$, and hence:
$$Var[X] = E[X^2] - E[X]^2 = \mu^2 + \sigma^2 - \mu^2 = \sigma^2.$$
Thus if $X \sim N[6, 25]$ then $E[X] = 6$ and $Var[X] = 25$.
3.7 Covariances

3.7.1 Definitions

Definition 80 The covariance of two random variables $X$ and $Y$ is defined as:
$$Cov[X, Y] = E[(X - E[X])(Y - E[Y])].$$

Definition 81 Alternatively, if $f(x) = (x - \mu_X)$ and $g(y) = (y - \mu_Y)$ then:
$$Cov[X, Y] = E[f(X)\,g(Y)].$$
If $Cov[X, Y] > 0$ then there exists some positive linear relationship between $X$ and $Y$, so that when $X$ is above (below) average then $Y$ tends to be above (below) average. For example if $X$ is ice cream consumption and $Y$ is the temperature, one would expect that $Cov[X, Y] > 0$; that is, above average ice cream consumption would be associated with above average temperatures.

If $Cov[X, Y] < 0$ then there exists some negative linear relationship between $X$ and $Y$, so that when $X$ is above (below) average then $Y$ tends to be below (above) average. For example if $X$ is ice cream consumption and $Y$ is rainfall, one would expect that $Cov[X, Y] < 0$; that is, above (below) average ice cream consumption would be associated with below (above) average rainfall.

Definition 82 The correlation coefficient $\rho$ is:
$$\rho = \frac{Cov[X, Y]}{Var[X]^{\frac{1}{2}}\,Var[Y]^{\frac{1}{2}}}.$$
Since the standard deviations in the denominator are positive, it follows that $\rho > 0$ ($\rho < 0$) if and only if $Cov[X, Y] > 0$ ($Cov[X, Y] < 0$), so that the sign of $\rho$ is always the same as the sign of $Cov[X, Y]$. For example suppose that:
$$Var[X_1] = 9, \quad Cov[X_1, X_2] = 6, \quad Var[X_2] = 16;$$
then:
$$\rho = \frac{6}{9^{\frac{1}{2}}\,16^{\frac{1}{2}}} = \frac{1}{2}.$$
3.7.2 Independence

Suppose we have two random variables $X$ and $Y$ and we wish to say that the two random variables are independent of each other. For discrete random variables the usual definition of independence is that $X$ and $Y$ are independent if the joint probability distribution satisfies:
$$\Pr[X = x_i, Y = y_j] = \Pr[X = x_i]\,\Pr[Y = y_j]$$
for all $i$ and $j$.

I will use another definition of independence based on the expectation $E[\,]$, which (it turns out) is equivalent to the above definition. It is:

Definition 83 $X$ and $Y$ are independent if and only if for all functions $f(x)$ and $g(y)$:
$$E[f(X)\,g(Y)] = E[f(X)]\,E[g(Y)].$$
3.7.3 Results for Covariances

Here are some important results involving covariances:

Theorem 84 If $X$ and $Y$ are independent then $Cov[X, Y] = 0$.

Proof. This follows since independence implies that:
$$E[f(X)\,g(Y)] = E[f(X)]\,E[g(Y)]$$
for any functions $f(x)$ and $g(y)$. In particular, letting:
$$f(x) = (x - E[X]), \qquad g(y) = (y - E[Y])$$
we have:
$$E[f(X)] = (E[X] - E[X]) = 0, \qquad E[g(Y)] = (E[Y] - E[Y]) = 0$$
and hence for these two functions independence implies that:
$$Cov[X, Y] = E[f(X)\,g(Y)] = E[f(X)]\,E[g(Y)] = 0 \times 0 = 0.$$
Theorem 85 If $Cov[X, Y] = 0$ it does not follow that $X$ and $Y$ are independent.

Proof. We show this result by providing a counter-example, that is, two random variables with $0$ covariance but which are not independent. Suppose $X$ can take on the outcomes $-1, 0, 1$ and $Y$ can take on the outcomes $0, 1, 2$. The probabilities of the various combinations of outcomes are given by:
$$\begin{array}{cc|ccc} & & \multicolumn{3}{c}{Y} \\ & & 0 & 1 & 2 \\ \hline & -1 & 0 & \tfrac{1}{4} & 0 \\ X & 0 & \tfrac{1}{4} & 0 & \tfrac{1}{4} \\ & 1 & 0 & \tfrac{1}{4} & 0 \end{array}$$
so that for example $\Pr[X = -1, Y = 1] = \frac{1}{4}$ while $\Pr[X = -1, Y = 0] = 0$. You can show that $E[X] = 0$ and $E[XY] = 0$, so that $Cov[X, Y] = 0$. However, $X$ and $Y$ are not independent, since if $Y = 1$ it is impossible for $X = 0$, and so $Y$ carries information about $X$. Alternatively, $\Pr[X = 0] = \frac{1}{2}$ and $\Pr[Y = 1] = \frac{1}{2}$ but:
$$0 = \Pr[X = 0, Y = 1] \ne \Pr[X = 0] \times \Pr[Y = 1] = \frac{1}{4}$$
and so $X$ and $Y$ are not independent.
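The counter-example can be verified mechanically from the joint table; the dictionary representation below is an illustrative encoding, not from the text.

```python
from fractions import Fraction

# The joint distribution from the counter-example: {(x, y): Pr[X=x, Y=y]}
q = Fraction(1, 4)
joint = {(-1, 1): q, (0, 0): q, (0, 2): q, (1, 1): q}

EX = sum(x * pr for (x, y), pr in joint.items())
EY = sum(y * pr for (x, y), pr in joint.items())
EXY = sum(x * y * pr for (x, y), pr in joint.items())
cov = EXY - EX * EY

px0 = sum(pr for (x, y), pr in joint.items() if x == 0)   # Pr[X = 0] = 1/2
py1 = sum(pr for (x, y), pr in joint.items() if y == 1)   # Pr[Y = 1] = 1/2
p01 = joint.get((0, 1), Fraction(0))                      # Pr[X = 0, Y = 1] = 0

print(cov)                  # 0: the covariance vanishes
print(p01 == px0 * py1)     # False: yet X and Y are not independent
```

The covariance is exactly zero, while the joint probability at $(0, 1)$ fails the product test, just as the proof argues.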
Theorem 86 If $Cov[X, Y] = 0$ and $X$ and $Y$ are normally distributed then it does follow that $X$ and $Y$ are independent.

Theorem 87 If $c$ is a nonrandom constant then $Cov[X, c] = 0$.

Theorem 88 $Cov[X, X] = Var[X]$.

Theorem 89 $Cov[X, Y] = E[XY] - E[X]E[Y]$.

Proof. Let $\mu_X \equiv E[X]$ and $\mu_Y \equiv E[Y]$. From the definition of the covariance, and multiplying out the brackets, we have:
$$Cov[X, Y] = E[(X - \mu_X)(Y - \mu_Y)] = E[(XY - \mu_X Y - \mu_Y X + \mu_X \mu_Y)] = E[XY] - \mu_X E[Y] - \mu_Y E[X] + \mu_X \mu_Y.$$
Therefore:
$$Cov[X, Y] = E[XY] - 2E[X]E[Y] + E[X]E[Y] = E[XY] - E[X]E[Y].$$

Theorem 90 $E[XY] = E[X]E[Y] + Cov[X, Y]$.

Theorem 91 $E[XY] = E[X]E[Y]$ if and only if $Cov[X, Y] = 0$.
3.8 Linear Combinations of Random Variables

3.8.1 Linear Combinations of Two Random Variables

Suppose the random variable $Y$ is a linear combination of the random variables $X_1$ and $X_2$, so that:
$$Y = aX_1 + bX_2 + c.$$
If we know $E[X_1]$ and $E[X_2]$ then we can immediately calculate $E[Y]$ since:
$$E[Y] = aE[X_1] + bE[X_2] + c.$$
For example, given two assets (say bonds and stocks) with returns $X_1$ and $X_2$, if $X_1$ has an expected return of $4\%$ or $E[X_1] = 4$, and $X_2$ has an expected return of $10\%$ or $E[X_2] = 10$, and we have a portfolio with $30\%$ in $X_1$ and $70\%$ in $X_2$, or $Y = 0.3X_1 + 0.7X_2$, then the expected return on the portfolio is:
$$E[Y] = E[0.3X_1 + 0.7X_2] = 0.3E[X_1] + 0.7E[X_2] = 0.3 \times 4 + 0.7 \times 10 = 8.2$$
so the expected return on the portfolio is $8.2\%$.

Another important question is: can we calculate $Var[Y]$ if we know $Var[X_1]$ and $Var[X_2]$? The answer in general is no; we also need to know $Cov[X_1, X_2]$. First, if $Y = aX_1 + bX_2 + c$ we can ignore $c$ since:
$$Var[aX_1 + bX_2 + c] = Var[aX_1 + bX_2].$$
To calculate $Var[aX_1 + bX_2]$ we have the following important result:

Theorem 92 Given two random variables $X_1, X_2$ and two nonrandom constants $a$ and $b$:
$$Var[aX_1 + bX_2] = a^2 Var[X_1] + 2ab\,Cov[X_1, X_2] + b^2 Var[X_2].$$
From this we can show that:

Theorem 93 The matrix:
$$\begin{bmatrix} Var[X_1] & Cov[X_1, X_2] \\ Cov[X_1, X_2] & Var[X_2] \end{bmatrix}$$
is positive semi-definite.

Proof. This follows since for all $a$ and $b$:
$$Var[aX_1 + bX_2] = a^2 Var[X_1] + 2ab\,Cov[X_1, X_2] + b^2 Var[X_2] = \begin{bmatrix} a & b \end{bmatrix} \begin{bmatrix} Var[X_1] & Cov[X_1, X_2] \\ Cov[X_1, X_2] & Var[X_2] \end{bmatrix} \begin{bmatrix} a \\ b \end{bmatrix} \ge 0$$
since $Var[aX_1 + bX_2] \ge 0$.

We can also show that the correlation coefficient always lies between $-1$ and $1$:
CHAPTER 3. INTEGRATION AND RANDOM VARIABLES 121
Theorem 94 If V ar [X1] 6= 0 and V ar [X2] 6= 0 then: ¡1 · ½ · 1: 4
Proof. Using the leading principal minors and the fact that the matrix in the previous theorem is positive semi-definite we have:

det | Var[X1]      Cov[X1,X2] |  =  Var[X1] Var[X2] − Cov[X1,X2]²
    | Cov[X1,X2]   Var[X2]    |
                                 =  Var[X1] Var[X2] (1 − Cov[X1,X2]² / (Var[X1] Var[X2]))
                                 =  Var[X1] Var[X2] (1 − ρ²)
                                 ≥  0.

From this it follows that 1 − ρ² ≥ 0, or −1 ≤ ρ ≤ 1. If ρ ≠ ±1 then both leading principal minors are strictly positive and the matrix is positive definite.
Theorem 95 If ρ = 1 then X2 = αX1 + β where α > 0.

Proof. From

Var[a X1 + b X2] = a² Var[X1] + 2ab Cov[X1,X2] + b² Var[X2]

and letting a = 1/√Var[X1] and b = −1/√Var[X2] we have:

Var[ X1/√Var[X1] − X2/√Var[X2] ] = 2(1 − ρ).

If ρ = 1 then this variance is zero, and hence

X1/√Var[X1] − X2/√Var[X2] = c

where c is a nonrandom constant. It then follows that:

X2 = (√Var[X2]/√Var[X1]) X1 + (−c √Var[X2])

so that X2 = αX1 + β with α = √Var[X2]/√Var[X1] > 0 and β = −c √Var[X2].
Theorem 96 If ρ = −1 then X2 = αX1 + β where α < 0.

Proof. Set a = 1/√Var[X1] and b = 1/√Var[X2] and use the same idea as above to show that:

Var[a X1 + b X2] = 2(1 + ρ).
An Example

Suppose that:

Var[X1] = 9,  Cov[X1,X2] = 6,  Var[X2] = 16.

The correlation coefficient is then given by:

ρ = 6 / (√9 √16) = 1/2

and the variance-covariance matrix is:

| 9   6  |
| 6   16 |

which you can verify is positive definite. If Y = 0.3 X1 + 0.7 X2 then

Var[Y] = (0.3)² Var[X1] + 2 (0.3)(0.7) Cov[X1,X2] + (0.7)² Var[X2]
       = (0.3)² × 9 + 2 × 0.3 × 0.7 × 6 + (0.7)² × 16
       = 11.17

and so the standard deviation is σ_Y = √11.17 = 3.34.
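The same example can be redone with the variance-covariance matrix directly: for w = (0.3, 0.7), Var[Y] is the quadratic form wᵀVw. A short numpy sketch (the use of numpy is an assumption, not part of the text):

```python
import numpy as np

# The worked example via the variance-covariance matrix: Var[Y] = w' V w.
V = np.array([[9.0, 6.0], [6.0, 16.0]])
w = np.array([0.3, 0.7])

rho = V[0, 1] / np.sqrt(V[0, 0] * V[1, 1])   # correlation = 6/(3*4) = 0.5
var_y = w @ V @ w                            # = 11.17
sd_y = np.sqrt(var_y)                        # ≈ 3.34

print(rho, var_y, sd_y)
```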
3.8.2 The Capital Asset Pricing Model (CAPM)

Theory

Suppose you can invest in three assets with returns R0, R1 and R2. The first asset R0 is risk free (e.g. short-term government bonds) so that R0 = r0 with probability 1. Assets 1 and 2 (say long-term bonds and stocks) are risky with:

E[R1] = r1,  Var[R1] = σ1²
E[R2] = r2,  Var[R2] = σ2²
Cov[R1,R2] = σ12.

Consider now a portfolio with weights on each asset ω0, ω1 and ω2 with:

ω0 + ω1 + ω2 = 1.
Let the return on the portfolio be R, which is given by:

R = ω0 r0 + ω1 R1 + ω2 R2.

Since ω0 = 1 − ω1 − ω2 we can rewrite this as:

R = (1 − ω1 − ω2) r0 + ω1 R1 + ω2 R2

or:

R = r0 + ω1 (R1 − r0) + ω2 (R2 − r0)

where R1 − r0 and R2 − r0 are the returns in excess of the risk-free rate r0. Let r = E[R] be the expected return on the portfolio, which is given by:

E[R] = r0 + ω1 E[R1 − r0] + ω2 E[R2 − r0]
     = r0 + ω1 (r1 − r0) + ω2 (r2 − r0)

or

r̃ = ω1 r̃1 + ω2 r̃2

where r̃ = r − r0 is the expected return of the portfolio net of the risk-free rate r0, while r̃1 = r1 − r0 and r̃2 = r2 − r0 are the expected returns on asset 1 and asset 2 net of the risk-free rate. The risk of the portfolio is given by the variance, which is

Var[R] = ω1² σ1² + 2 ω1 ω2 σ12 + ω2² σ2².
Suppose you are a portfolio manager. Your client wants to obtain an expected return on this portfolio of r. Your job is to find the portfolio weights ω1 and ω2 (with ω0 = 1 − ω1 − ω2) which achieve this at the minimum possible risk. This leads to the constrained minimization problem of choosing ω1 and ω2 to minimize

(ω1² σ1² + 2 ω1 ω2 σ12 + ω2² σ2²) / 2

(dividing by two just makes the math a little simpler; clearly minimizing half the variance is equivalent to minimizing the variance itself) subject to the constraint:

r̃ = ω1 r̃1 + ω2 r̃2

with the Lagrangian:

L(λ, ω1, ω2) = (ω1² σ1² + 2 ω1 ω2 σ12 + ω2² σ2²)/2 + λ (r̃ − ω1 r̃1 − ω2 r̃2).
The first-order conditions are:

∂L(λ*, ω1*, ω2*)/∂λ  = r̃ − ω1* r̃1 − ω2* r̃2 = 0
∂L(λ*, ω1*, ω2*)/∂ω1 = ω1* σ1² + ω2* σ12 − λ* r̃1 = 0
∂L(λ*, ω1*, ω2*)/∂ω2 = ω1* σ12 + ω2* σ2² − λ* r̃2 = 0

with second-order conditions:

H = det | 0     −r̃1   −r̃2  |
        | −r̃1   σ1²   σ12  |  < 0.
        | −r̃2   σ12   σ2²  |

The first-order conditions can be re-written in matrix notation as:

| 0     −r̃1   −r̃2  | | λ*  |   | −r̃ |
| −r̃1   σ1²   σ12  | | ω1* | = |  0 |
| −r̃2   σ12   σ2²  | | ω2* |   |  0 |

so that from Cramer's rule we have:

ω1* = (1/H) det | 0     −r̃   −r̃2  |
                | −r̃1    0    σ12  |  =  r̃ (σ12 r̃2 − σ2² r̃1) / H
                | −r̃2    0    σ2²  |

and:

ω2* = (1/H) det | 0     −r̃1   −r̃ |
                | −r̃1   σ1²    0  |  =  r̃ (σ12 r̃1 − σ1² r̃2) / H.
                | −r̃2   σ12    0  |

Note that

ω1*/ω2* = (σ12 r̃2 − σ2² r̃1) / (σ12 r̃1 − σ1² r̃2)

is independent of r̃. This means that the relative proportions of the two risky assets in the minimum-risk portfolio are independent of the required rate of return r̃.
3.8.3 An Example

Suppose that R0 = 1 with probability 1 and:

E[R1] = 5,  Var[R1] = 25
E[R2] = 9,  Var[R2] = 49
Cov[R1,R2] = 14.

The expected return on the portfolio then satisfies:

E[R] = r0 + ω1 E[R1 − r0] + ω2 E[R2 − r0]
     = 1 + ω1 (5 − 1) + ω2 (9 − 1)
     = 1 + 4 ω1 + 8 ω2

while the risk of the portfolio is:

Var[R] = 25 ω1² + 28 ω1 ω2 + 49 ω2².

Suppose you wish to obtain an expected return of r = 7 (or r̃ = 7 − 1 = 6) at the least possible risk. This leads to the constrained minimization problem of choosing ω1 and ω2 to minimize:

(25 ω1² + 28 ω1 ω2 + 49 ω2²) / 2

subject to the constraint:

7 = 1 + 4 ω1 + 8 ω2

or:

6 − 4 ω1 − 8 ω2 = 0.

This leads to the Lagrangian:

L(λ, ω1, ω2) = (25 ω1² + 28 ω1 ω2 + 49 ω2²)/2 + λ (6 − 4 ω1 − 8 ω2)

and the first-order conditions:

∂L(λ*, ω1*, ω2*)/∂λ  = 6 − 4 ω1* − 8 ω2* = 0
∂L(λ*, ω1*, ω2*)/∂ω1 = 25 ω1* + 14 ω2* − 4 λ* = 0
∂L(λ*, ω1*, ω2*)/∂ω2 = 14 ω1* + 49 ω2* − 8 λ* = 0

with the second-order conditions being satisfied since:

H = det | 0    −4   −8 |
        | −4   25   14 |  = −1488 < 0.
        | −8   14   49 |
The first-order conditions in matrix notation are then:

| 0    −4   −8 | | λ*  |   | −6 |
| −4   25   14 | | ω1* | = |  0 |
| −8   14   49 | | ω2* |   |  0 |

so that from Cramer's rule we have:

ω1* = (1/(−1488)) det | 0    −6   −8 |
                      | −4    0   14 |  =  21/62 = 0.3387
                      | −8    0   49 |

and:

ω2* = (1/(−1488)) det | 0    −4   −6 |
                      | −4   25    0 |  =  36/62 = 0.5806.
                      | −8   14    0 |

Thus:

ω0* = 1 − ω1* − ω2* = 1 − 21/62 − 36/62 = 5/62 = 0.0806.

Thus the portfolio which obtains an expected return of 7% with the least variance has about 8% of the portfolio in the risk-free asset R0, about 34% in R1 and 58% in R2. The riskiness of the return is then:

Var[R] = 25 ω1² + 28 ω1 ω2 + 49 ω2²
       = 25 (21/62)² + 28 (21/62)(18/31) + 49 (18/31)²
       = 3087/124

so that the standard deviation is σ = √(3087/124) = 4.99.
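The first-order conditions form a linear system, so they can also be solved numerically rather than by Cramer's rule. A numpy sketch of the example (numpy is an assumption; the numbers are from the text):

```python
import numpy as np

# The bordered system from the first-order conditions: H [lambda*, w1*, w2*]' = [-6, 0, 0]'.
H = np.array([[ 0.0, -4.0, -8.0],
              [-4.0, 25.0, 14.0],
              [-8.0, 14.0, 49.0]])
rhs = np.array([-6.0, 0.0, 0.0])

lam, w1, w2 = np.linalg.solve(H, rhs)
w0 = 1.0 - w1 - w2

print(np.linalg.det(H))   # ≈ -1488
print(w1, w2, w0)         # ≈ 0.3387, 0.5806, 0.0806
```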
3.8.4 Sums of Uncorrelated Random Variables

Suppose that the random variable Y is a linear combination of the n uncorrelated random variables X1, X2, ..., Xn, so that

Y = a1 X1 + a2 X2 + a3 X3 + ··· + an Xn

and

Cov[Xi, Xj] = 0 for i ≠ j.

As we have seen, a sufficient condition for X1, X2, ..., Xn to be uncorrelated is that they be independent, since:

Xi, Xj independent  =>  Cov[Xi, Xj] = 0.

We then have the following:

Theorem 97 Given that X1, X2, ..., Xn are uncorrelated, it follows that if

Y = a1 X1 + a2 X2 + a3 X3 + ··· + an Xn

then

Var[Y] = a1² Var[X1] + a2² Var[X2] + ··· + an² Var[Xn].

Thus to calculate the variance of Y we need only square the coefficient on each Xi, take the variance, and sum.
3.8.5 An Example

If X1, X2 and X3 are uncorrelated and

Var[X1] = 25,  Var[X2] = 36,  Var[X3] = 49

then if Y = 5 X1 − 4 X2 + 2 X3 we have:

Var[Y] = (5)² Var[X1] + (−4)² Var[X2] + (2)² Var[X3]
       = (5)² × 25 + (−4)² × 36 + (2)² × 49
       = 1397

so that the standard deviation of Y is σ = √1397 = 37.38.

One common source of confusion is a negative coefficient, for example the −4 on X2 in this example. It is probably best at the beginning to rewrite this with + signs and all numbers in brackets as:

Y = (5) X1 + (−4) X2 + (2) X3

and then to square the coefficients, take variances and sum.
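The arithmetic of this example fits in a couple of lines of Python (an illustration, not part of the text):

```python
# Theorem 97 applied to the example: square each coefficient, multiply by
# the matching variance, and sum.
coeffs = [5, -4, 2]
variances = [25, 36, 49]

var_y = sum(a**2 * v for a, v in zip(coeffs, variances))
print(var_y)          # 1397
print(var_y ** 0.5)   # ≈ 37.38
```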
3.8.6 The Sample Mean

Consider n uncorrelated random variables X1, X2, ..., Xn with the following properties:

A1: E[Xi] = μ for i = 1, 2, ..., n
A2: Var[Xi] = σ² for i = 1, 2, ..., n
A3: Cov[Xi, Xj] = 0 for i ≠ j.

Note that each of the uncorrelated random variables has the same mean μ by A1 and the same variance σ² by A2.

Definition 98 An estimator θ̂ of a parameter θ is said to be unbiased if:

E[θ̂] = θ.

Consider the sample mean:

X̄ = (1/n)(X1 + X2 + ··· + Xn)
  = (1/n) X1 + (1/n) X2 + ··· + (1/n) Xn

as an estimator of the unknown parameter μ. We have the following:

Theorem 99 Given A1, the sample mean X̄ is an unbiased estimator of μ, or E[X̄] = μ.
Proof. The proof is as follows:

X̄ = (1/n) E[X1] + (1/n) E[X2] + ··· + (1/n) E[Xn]    (each E[Xi] = μ by A1)
  = (1/n) μ + (1/n) μ + ··· + (1/n) μ
  = n × (1/n) μ
  = μ

and so X̄ is an unbiased estimator of μ. We can also show that:

Theorem 100 Given A1, A2 and A3:

Var[X̄] = σ²/n.
Proof. Since the Xi's are uncorrelated by A3, we obtain the variance by squaring the coefficients, here 1/n, so that:

Var[X̄] = (1/n)² Var[X1] + (1/n)² Var[X2] + ··· + (1/n)² Var[Xn].

Now since all the variances are equal to σ² by A2 we have:

Var[X̄] = (1/n)² σ² + (1/n)² σ² + ··· + (1/n)² σ²
        = n × (1/n)² σ²
        = σ²/n.
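A quick simulation sketch of Theorem 100 (numpy assumed; the distribution and parameters are arbitrary): with n i.i.d. draws of variance σ², the sample mean should have variance close to σ²/n.

```python
import numpy as np

# Draw many samples of size n, compute each sample mean, and compare the
# variance of those means with sigma^2 / n.
rng = np.random.default_rng(1)
sigma2, n, reps = 4.0, 25, 100_000

xbars = rng.normal(0.0, np.sqrt(sigma2), size=(reps, n)).mean(axis=1)
print(np.var(xbars), sigma2 / n)   # both ≈ 0.16
```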
It therefore follows that X̄ gets, in a probabilistic sense, closer and closer to μ as the number of observations n goes to infinity. This is sometimes referred to as the Law of Large Numbers.
3.8.7 The Mean and Variance of the Binomial Distribution

If Y ~ B(p, n) has a binomial distribution then:

Pr[Y = k] = (n! / (k! (n − k)!)) p^k (1 − p)^(n−k).

Therefore we have:

E[Y] = Σ_{k=0}^{n} k × (n! / (k! (n − k)!)) p^k (1 − p)^(n−k)

and

Var[Y] = Σ_{k=0}^{n} (k − E[Y])² × (n! / (k! (n − k)!)) p^k (1 − p)^(n−k).

Neither of these is easy to evaluate, but in fact they simplify considerably after a great deal of work. Using a proof by induction you can show that:

Theorem 101 If Y ~ B(p, n) has a binomial distribution then:

E[Y] = np and Var[Y] = np(1 − p).
It is much easier to prove these results using the fact that Y can be written as the sum of n independent Bernoullis X1, X2, ..., Xn, or:

Y = X1 + X2 + X3 + ··· + Xn.

Since Xi takes on the values 1 and 0 with probabilities p and 1 − p, we have:

E[Xi] = 1 × p + 0 × (1 − p) = p
E[Xi²] = 1² × p + 0² × (1 − p) = p
Var[Xi] = E[Xi²] − E[Xi]² = p − p² = p(1 − p).

Therefore E[Y] = np since:

E[Y] = E[X1] + E[X2] + E[X3] + ··· + E[Xn] = np

as each E[Xi] = p. Similarly it can be shown that Var[Y] = np(1 − p). Since the Xi's are independent and hence uncorrelated, we can square the coefficients, take the variances, and use the fact that each Var[Xi] = p(1 − p), so that:

Var[Y] = 1² Var[X1] + 1² Var[X2] + ··· + 1² Var[Xn] = np(1 − p).

Thus, for example, if n = 2000 and p = 0.3, so that Y ~ B(0.3, 2000), then:

E[Y] = 2000 × 0.3 = 600
Var[Y] = 2000 × 0.3 × (1 − 0.3) = 420.
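This last example can be checked by simulation (numpy assumed; the parameters are those of the example):

```python
import numpy as np

# Draw Y ~ B(0.3, 2000) repeatedly and compare the sample mean and variance
# with np = 600 and np(1-p) = 420.
rng = np.random.default_rng(2)
n, p, reps = 2000, 0.3, 50_000

y = rng.binomial(n, p, size=reps)
print(n * p, n * p * (1 - p))   # 600.0 420.0
print(y.mean(), y.var())        # close to 600 and 420
```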
3.8.8 The Linear Regression Model for Beginners

The classical linear regression model with a constant and one regressor satisfies the following assumptions:

A1. Yi = α + Xi β + ei for i = 1, 2, ..., n
A2. E[ei] = 0
A3. Var[ei] = σ² for i = 1, 2, ..., n
A4. Cov[ei, ej] = 0 for i ≠ j
A5. Xi is nonrandom with Σ_{j=1}^{n} (Xj − X̄)² ≠ 0.

The least squares estimator of β is then:

β̂ = Σ_{i=1}^{n} (Xi − X̄)(Yi − Ȳ) / Σ_{j=1}^{n} (Xj − X̄)²
  = Σ_{i=1}^{n} ai Yi

where:

ai = (Xi − X̄) / Σ_{j=1}^{n} (Xj − X̄)².

As an exercise in the use of the summation notation, show that:

Lemma 102 If ai is defined as above then:

Σ_{i=1}^{n} ai = 0
Σ_{i=1}^{n} ai Xi = 1
Σ_{i=1}^{n} ai² = 1 / Σ_{j=1}^{n} (Xj − X̄)².
We then have that:

β̂ = Σ_{i=1}^{n} ai Yi
  = Σ_{i=1}^{n} ai (α + Xi β + ei)
  = α Σ_{i=1}^{n} ai + β Σ_{i=1}^{n} ai Xi + Σ_{i=1}^{n} ai ei
  = β + Σ_{i=1}^{n} ai ei

since Σ ai = 0 and Σ ai Xi = 1 by Lemma 102. We can now show that:

Theorem 103 β̂ is an unbiased estimator of β, or E[β̂] = β.

Proof. First note that since the ai's depend on the Xi's, which are assumed to be nonrandom by A5, it follows that the ai's are nonrandom. We can therefore pull E[·] past the ai's. Now using A2 we have:

E[β̂] = E[ β + Σ_{i=1}^{n} ai ei ]
     = β + Σ_{i=1}^{n} ai E[ei]
     = β

since each E[ei] = 0 by A2.
We can also show that:

Theorem 104 The variance of β̂ is given by:

Var[β̂] = σ² / Σ_{j=1}^{n} (Xj − X̄)².

Proof. First, since β is a nonrandom constant, we can drop it from the variance:

Var[β̂] = Var[ β + Σ_{i=1}^{n} ai ei ] = Var[ Σ_{i=1}^{n} ai ei ].

By A4 the ei's are uncorrelated, so we need only square the coefficients and take the variances. Doing this, using A3 and the fact that

Σ_{i=1}^{n} ai² = 1 / Σ_{j=1}^{n} (Xj − X̄)²

we have:

Var[β̂] = Σ_{i=1}^{n} ai² Var[ei]
       = σ² Σ_{i=1}^{n} ai²          (since each Var[ei] = σ² by A3)
       = σ² / Σ_{j=1}^{n} (Xj − X̄)².
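Theorems 103 and 104 can be illustrated by simulation. The sketch below (numpy assumed; all parameter values are made up) holds X fixed, redraws the errors many times, and compares the mean and variance of β̂ with β and σ²/Σ(Xj − X̄)²:

```python
import numpy as np

# Simulate Y_i = alpha + beta*X_i + e_i with fixed, nonrandom X, many times.
rng = np.random.default_rng(3)
alpha, beta, sigma = 2.0, 0.5, 1.0
x = np.arange(10.0)                  # nonrandom regressor, n = 10
sxx = np.sum((x - x.mean()) ** 2)    # sum of squared deviations = 82.5

reps = 200_000
e = rng.normal(0.0, sigma, size=(reps, len(x)))
y = alpha + beta * x + e
bhat = ((x - x.mean()) * (y - y.mean(axis=1, keepdims=True))).sum(axis=1) / sxx

print(bhat.mean(), beta)             # ≈ 0.5: beta-hat is unbiased
print(bhat.var(), sigma**2 / sxx)    # both ≈ 0.0121
```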
3.8.9 Linear Combinations of Correlated Random Variables

Expectations Using Sigma and Vector Notation

Now let Y be a linear combination of n possibly correlated random variables X1, X2, X3, ..., Xn, so that:

Y = a1 X1 + a2 X2 + a3 X3 + ··· + an Xn

which can be written more compactly using the sigma notation as:

Y = Σ_{j=1}^{n} aj Xj

and even more compactly using vector notation as:

Y = aᵀX

where a = [a1, a2, a3, ..., an]ᵀ is an n × 1 vector of constants and X = [X1, X2, X3, ..., Xn]ᵀ is an n × 1 random vector. We then have:

Theorem 105 Given Y = a1 X1 + a2 X2 + ··· + an Xn we have:

E[Y] = a1 E[X1] + a2 E[X2] + ··· + an E[Xn]

or

E[Y] = Σ_{i=1}^{n} ai E[Xi]

or

E[Y] = aᵀ E[X]

where by E[X], the expected value of the random vector X, we mean the n × 1 vector:

E[X] = [E[X1], E[X2], E[X3], ..., E[Xn]]ᵀ.
Using Matrix Notation

We could generalize this setup even further. Suppose we have m linear combinations Yi for i = 1, 2, ..., m (say the returns on m different portfolios), all with different weights, so that:

Yi = ai1 X1 + ai2 X2 + ai3 X3 + ··· + ain Xn, for i = 1, 2, ..., m

or:

Yi = Σ_{j=1}^{n} aij Xj for i = 1, 2, ..., m

or in matrix notation:

Y = AX

where Y = [Y1, Y2, Y3, ..., Ym]ᵀ is m × 1, A is the m × n matrix of weights with typical element aij, and X = [X1, X2, X3, ..., Xn]ᵀ is n × 1.

Theorem 106 If Y is an m × 1 random vector given by Y = AX then:

E[Y] = A E[X]

where E[Y] = [E[Y1], E[Y2], E[Y3], ..., E[Ym]]ᵀ.
Variances

If we only have one linear combination, so that:

Y = Σ_{i=1}^{n} ai Xi = aᵀX,

and the Xi's are correlated, then we can no longer just square the coefficients, take the variances and sum. Instead it can be shown that:

Theorem 107 If Y = Σ_{i=1}^{n} ai Xi then:

Var[Y] = Σ_{i=1}^{n} Σ_{j=1}^{n} ai aj Cov[Xi, Xj].

Theorem 108 If Y = Σ_{i=1}^{n} ai Xi then:

Var[Y] = Σ_{i=1}^{n} ai² Var[Xi] + 2 Σ_{i=1}^{n} Σ_{j=i+1}^{n} ai aj Cov[Xi, Xj].

Theorem 109 If Y = Σ_{i=1}^{n} ai Xi then:

Var[Y] = Σ_{i=1}^{n} ai² Var[Xi]
       + 2 a1 a2 Cov[X1,X2] + 2 a1 a3 Cov[X1,X3] + ··· + 2 a1 an Cov[X1,Xn]
       + 2 a2 a3 Cov[X2,X3] + 2 a2 a4 Cov[X2,X4] + ··· + 2 a2 an Cov[X2,Xn]
       + 2 a3 a4 Cov[X3,X4] + 2 a3 a5 Cov[X3,X5] + ··· + 2 a3 an Cov[X3,Xn]
       ...
       + 2 a_{n−1} an Cov[X_{n−1}, Xn].

It turns out that the best way to proceed is to generalize the notation Var[X] to allow X to be an n × 1 random vector.
Definition 110 If X is an n × 1 random vector then Var[X] is defined as:

Var[X] = E[(X − E[X])(X − E[X])ᵀ] = E[XXᵀ] − E[X] E[X]ᵀ

which is equal to the n × n matrix:

| Var[X1]      Cov[X1,X2]   Cov[X1,X3]   ···   Cov[X1,Xn] |
| Cov[X1,X2]   Var[X2]      Cov[X2,X3]   ···   Cov[X2,Xn] |
| Cov[X1,X3]   Cov[X2,X3]   Var[X3]      ···   Cov[X3,Xn] |
|    ...          ...          ...       ...      ...     |
| Cov[X1,Xn]   Cov[X2,Xn]   Cov[X3,Xn]   ···   Var[Xn]    |

This is often referred to as the variance-covariance matrix. Note that Var[X] is symmetric.

Theorem 111 If Y = aᵀX then:

Var[aᵀX] = aᵀ Var[X] a.

Theorem 112 Var[X] is positive semi-definite.

Proof. This follows from the previous theorem and the fact that variances are always nonnegative, so that Var[aᵀX] ≥ 0 and hence:

aᵀ Var[X] a = Var[aᵀX] ≥ 0

for all a. Thus Var[X] is a positive semi-definite matrix.

We can say that Var[X] is positive definite if no linear combination of X is nonrandom. In particular we have:

Theorem 113 If there is no c ≠ 0 such that cᵀX = d with probability 1, where d is a constant, then Var[X] is positive definite.

Theorem 114 If Y = AX then:

Var[AX] = A Var[X] Aᵀ.
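Theorem 114 can be checked numerically: generate correlated X with a known variance-covariance matrix V and compare the sample covariance of Y = AX with A V Aᵀ. A numpy sketch with made-up matrices:

```python
import numpy as np

# Compare the theoretical Var[AX] = A V A' with the sample covariance of AX.
rng = np.random.default_rng(4)
V = np.array([[4.0, 1.0, 0.5],
              [1.0, 3.0, 1.5],
              [0.5, 1.5, 2.0]])     # a positive definite variance-covariance matrix
A = np.array([[1.0, 2.0, 0.0],
              [0.0, 1.0, -1.0]])    # two linear combinations of three variables

x = rng.multivariate_normal(np.zeros(3), V, size=300_000)
y = x @ A.T                          # each row is one draw of Y = AX
theoretical = A @ V @ A.T
simulated = np.cov(y, rowvar=False)

print(theoretical)   # [[20. 3.5] [3.5 2.]]
print(simulated)     # close to the theoretical matrix
```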
3.8.10 The Linear Regression Model for Experts

The classical linear regression model is defined by the following assumptions:

A1. Y = Xβ + e, where Y is an n × 1 random vector, X is n × p, β is a p × 1 nonrandom vector and e is an n × 1 random vector.
A2. E[e] = 0
A3. Var[e] = σ²I, where I is an n × n identity matrix.
A4. X is nonrandom with rank[X] = p.

For example, the linear regression model we considered earlier with a constant and one regressor is a special case of this model in which Y = [Y1, Y2, Y3, ..., Yn]ᵀ, e = [e1, e2, e3, ..., en]ᵀ, β = [α, β]ᵀ, and X is the n × 2 matrix whose first column is a column of 1's and whose second column is [X1, X2, X3, ..., Xn]ᵀ.
It turns out that the least squares estimator is:

β̂ = (XᵀX)⁻¹ XᵀY.

It can then be shown that:

β̂ = (XᵀX)⁻¹ XᵀY
  = (XᵀX)⁻¹ Xᵀ(Xβ + e)              (since Y = Xβ + e by A1)
  = (XᵀX)⁻¹ XᵀX β + (XᵀX)⁻¹ Xᵀe
  = β + (XᵀX)⁻¹ Xᵀe
  = β + Ae

where:

A = (XᵀX)⁻¹ Xᵀ

is nonrandom by A4. We have the following:

Theorem 115 β̂ is unbiased, or E[β̂] = β.

Proof. By A4 the matrix A is nonrandom, so that:

E[β̂] = E[β + Ae]
     = β + E[Ae]
     = β + A E[e]
     = β

since E[e] = 0 by A2.
We also have:

Theorem 116 The variance-covariance matrix of β̂ is:

Var[β̂] = σ² (XᵀX)⁻¹.

Proof. We have:

Var[β̂] = Var[β + Ae] = Var[Ae]

since β is nonrandom. Thus:

Var[β̂] = Var[Ae]
       = A Var[e] Aᵀ
       = σ² A Aᵀ

since Var[e] = σ²I by A3. Since A = (XᵀX)⁻¹Xᵀ and Aᵀ = X(XᵀX)⁻¹, we have:

Var[β̂] = σ² A Aᵀ
       = σ² (XᵀX)⁻¹ XᵀX (XᵀX)⁻¹
       = σ² (XᵀX)⁻¹.
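A minimal numeric illustration of β̂ = (XᵀX)⁻¹XᵀY for the constant-plus-one-regressor case (the data values below are made up for the example):

```python
import numpy as np

# Build the n x 2 design matrix (column of 1's, then the regressor) and
# apply the least squares formula directly.
x1 = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([2.1, 3.9, 6.2, 8.1, 9.8])

X = np.column_stack([np.ones_like(x1), x1])
beta_hat = np.linalg.inv(X.T @ X) @ X.T @ y
print(beta_hat)   # [intercept, slope] = [0.14, 1.96]
```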
Chapter 4

Dynamics

4.1 Introduction

Difference equations, and their continuous-time analogues, differential equations, are often used to describe and predict how economic variables change over time; for example, in describing movements in the business cycle.

A basic understanding of trigonometry turns out to be important for difference and differential equations. Strange and wonderful as it may seem, trigonometry in turn is very closely related to complex numbers, that is, numbers which involve the square root of −1, or i ≡ √−1. We therefore begin with a review of trigonometry, go on from there to show the deep connection with complex numbers, and then proceed to difference and differential equations.
4.2 Trigonometry

4.2.1 Degrees and Radians

In everyday life we typically measure angles using degrees. A full turn, for example, is 360°, and a right angle is 90°. In mathematics we typically use radians to measure angles. With radians a full turn is 2π instead of 360°, a right angle is π/2 instead of 90°, while a 45° angle is π/4. In general we have:

radians = (2π/360) × degrees

where 2π/360 ≈ 0.0175.
4.2.2 The Functions cos(x) and sin(x)

The two basic trigonometric functions are cos(x) and sin(x), which are plotted below:

[plots of y = sin(x) and y = cos(x) for −10 ≤ x ≤ 10, each oscillating between −1 and 1]
The functions sin(x) and cos(x) have the following properties:

Properties of sin(x) and cos(x)

1. |sin(x)| ≤ 1, |cos(x)| ≤ 1,
2. sin(x)² + cos(x)² = 1,
3. cos(−x) = cos(x) (i.e., cos(x) is an even function),
4. sin(−x) = −sin(x) (i.e., sin(x) is an odd function),
5. cos(x + 2π) = cos(x) (cos(x) has a period of 2π),
6. sin(x + 2π) = sin(x) (sin(x) has a period of 2π),
7. d sin(x)/dx = cos(x),
8. d cos(x)/dx = −sin(x),
9. sin(θ1 + θ2) = cos(θ1) sin(θ2) + cos(θ2) sin(θ1),
10. cos(θ1 + θ2) = cos(θ1) cos(θ2) − sin(θ1) sin(θ2),
11. cos(x) = 1 − x²/2! + x⁴/4! − x⁶/6! + ···,
12. sin(x) = x − x³/3! + x⁵/5! − x⁷/7! + ···.

The last two follow from the Taylor series around x0 = 0 and the facts that cos(0) = 1 and sin(0) = 0.
4.2.3 The Functions tan(x) and arctan(x)

Definition 117 The function tan(x) is defined by:

tan(x) = sin(x)/cos(x) for −π/2 < x < π/2.

Using the quotient rule and the fact that sin(x)² + cos(x)² = 1, it can be shown that:

Theorem 118

d tan(x)/dx = 1/cos(x)² > 0.

We can therefore define the inverse function arctan(x), which satisfies:

tan(arctan(x)) = arctan(tan(x)) = x.

As an exercise you may want to show that:

d arctan(x)/dx = 1/(1 + x²).
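The exercise can be checked numerically: a centered difference quotient for arctan should match 1/(1 + x²). A short Python sketch:

```python
import math

# Approximate d arctan(x)/dx with a centered difference quotient and
# compare against 1/(1 + x^2).
def numeric_deriv(f, x, h=1e-6):
    return (f(x + h) - f(x - h)) / (2 * h)

for x in [-2.0, 0.0, 0.5, 3.0]:
    print(x, numeric_deriv(math.atan, x), 1 / (1 + x * x))
```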
4.3 Complex Numbers

4.3.1 Introduction

Consider the roots of the quadratic:

x² + 1 = 0.

This does not have a solution amongst ordinary, or more precisely real, numbers since it implies that:

x² = −1

and for both negative and positive numbers x² ≥ 0.

We can however invent a new class of numbers, which turn out to be surprisingly useful, called complex numbers, where the solution to x² = −1 is simply defined to exist as a new kind of number, neither positive nor negative, but where:

x = √−1 ≡ i.

Many polynomials do not have any real roots, but any polynomial will have roots if one allows for complex numbers. For example the quadratic

x² − 6x + 13 = 0

has no real roots, since from the usual formula the roots would be:

(6 ± √((−6)² − 4 × 13)) / 2 = (6 ± √−16) / 2

and the expression inside the square root is negative. Amongst complex numbers, though, since √−16 = 4i the roots are:

(6 ± 4i)/2

or:

x1 = 3 + 2i and x2 = 3 − 2i.

In general complex numbers are written as:

a + bi

where a and b are real numbers. For example, given the complex number 3 + 2i we have a = 3 and b = 2.

Definition 119 Given a complex number a + bi we call a the real part, or

Re[a + bi] = a.

Definition 120 Given a complex number a + bi we call b the imaginary part, or

Im[a + bi] = b.

Thus given 3 + 2i we have

Re[3 + 2i] = 3
Im[3 + 2i] = 2.

If x is a real number we have Im[x] = 0, so that for example Im[7] = 0.
4.3.2 Arithmetic with Complex Numbers

Just as with real numbers we can add, subtract, multiply and divide complex numbers using the following:

(a + bi) + (c + di) = (a + c) + (b + d)i
(a + bi) − (c + di) = (a − c) + (b − d)i
(a + bi)(c + di) = (ac − bd) + (ad + bc)i

1/(a + bi) = (a − bi)/(a² + b²) = a/(a² + b²) − (b/(a² + b²))i

(a + bi)/(c + di) = (a + bi)(c − di) / ((c + di)(c − di))
                  = (ac + bd)/(c² + d²) + ((bc − ad)/(c² + d²))i.

Thus for example:

(3 + 4i) + (5 + 6i) = 8 + 10i
(3 + 4i) × (5 + 6i) = −9 + 38i
1/(5 + 6i) = 5/61 − (6/61)i
(3 + 4i) ÷ (5 + 6i) = 39/61 + (2/61)i.
4.3.3 Complex Conjugates

Definition 121 Given a complex number a + bi we denote its complex conjugate by conj(a + bi), where:

conj(a + bi) ≡ a − bi.

For example, the complex conjugate of 3 + 4i is given by:

conj(3 + 4i) = 3 − 4i.

The complex roots of any polynomial always appear as conjugate pairs; that is, we have:

Theorem 122 Given the nth degree polynomial:

f(x) = α0 xⁿ + α1 xⁿ⁻¹ + α2 xⁿ⁻² + ··· + α_{n−1} x + αn = 0

where the coefficients α0, α1, α2, ..., αn are real numbers, if x = a + bi is a root then so too is its complex conjugate x = a − bi.

For example, with the quadratic

x² − 6x + 13 = 0

the two roots x1 = 3 + 2i and x2 = 3 − 2i are conjugates of each other. An interesting and useful property of conjugates is that:

Theorem 123 The conjugate of the product of two complex numbers is the product of the conjugates, so that:

conj((a + bi)(c + di)) = conj(a + bi) × conj(c + di).

An Example

We verify this theorem using the following example. Note that:

(7 + 3i)(4 + 2i) = 22 + 26i

and so:

conj((7 + 3i)(4 + 2i)) = conj(22 + 26i) = 22 − 26i.

However, we get the same answer if we calculate the product of the conjugates:

conj(7 + 3i) × conj(4 + 2i) = (7 − 3i)(4 − 2i) = 22 − 26i.
4.3.4 The Absolute Value of a Complex Number

For real numbers we often calculate absolute values; for example, |−3| = 3. Just as with real numbers, one can calculate the absolute value of a complex number a + bi.

Definition 124 The absolute value of a complex number a + bi is defined as:

|a + bi| = √((a + bi) conj(a + bi)) = √(a² + b²).

For example:

|3 + 4i| = √(3² + 4²) = 5.

One very useful way of thinking about the complex number a + bi is as the point (a, b) in a two-dimensional space, with the real part a on the horizontal axis and the imaginary part b on the vertical axis; for 3 + 4i this is the point (3, 4). It then follows that the length r from the origin to the point (a, b) is the absolute value of a + bi since:

r = |a + bi| = √(a² + b²).

Thus for 3 + 4i we have r = √(3² + 4²) = 5.
4.3.5 The Argument of a Complex Number

The angle θ between the horizontal axis and the ray from the origin to (a, b) is called the argument of a + bi. We thus have:

Definition 125 The argument of a + bi, denoted arg[a + bi], is defined as:

θ = arg[a + bi] =  arctan(b/a)          if a > 0, b > 0
                   π + arctan(b/a)      if a < 0
                   2π + arctan(b/a)     if a > 0, b < 0.

Thus for 3 + 4i we have:

θ = arg(3 + 4i) = arctan(4/3) = 0.927.

To illustrate the calculation of θ where either a or b is negative, we have:

arg(−3 + 4i) = π + arctan(−4/3) = 2.214
arg(3 − 4i) = 2π + arctan(−4/3) = 5.356.
4.3.6 The Polar Form of a Complex Number

Definition 126 The polar form of a complex number is:

a + bi = r (cos(θ) + i sin(θ))

where r = |a + bi| and θ = arg[a + bi].

For example you can verify that:

3 + 4i = 5 (cos(0.927) + i sin(0.927))
−3 + 4i = 5 (cos(2.214) + i sin(2.214))
3 − 4i = 5 (cos(5.356) + i sin(5.356)).
4.3.7 De Moivre's Theorem

Consider calculating (a + bi)ⁿ. One way of doing this is to expand the brackets, so that for example:

(a + bi)³ = (a + bi)(a + bi)(a + bi)
          = (a² − b² + 2abi)(a + bi)
          = a³ − 3ab² + (3a²b − b³)i.

This approach would involve a lot of work if n were large, and would not work if n is not an integer (for example, what is (1 + i)^(1/2) or √(1 + i)?). A surprising result, called De Moivre's theorem, is the following:

Theorem 127 (De Moivre's theorem) If a + bi has absolute value r and argument θ, then:

(a + bi)ⁿ = rⁿ (cos(nθ) + i sin(nθ)).

Example 1

Consider the complex number 1 + 1i. We have r = √(1² + 1²) = √2 and θ = arctan(1/1) = π/4, so:

1 + 1i = √2 (cos(π/4) + i sin(π/4)).

Verify this using the fact that cos(π/4) = 1/√2 and sin(π/4) = 1/√2. Furthermore, from De Moivre's theorem:

(1 + 1i)⁵ = (√2)⁵ (cos(5π/4) + i sin(5π/4)) = −4 − 4i.

You should verify this both by multiplying out

(1 + 1i) × (1 + 1i) × (1 + 1i) × (1 + 1i) × (1 + 1i)

and by using the polar form.

Example 2

Other examples using the results from above are:

(3 + 4i)ⁿ = 5ⁿ (cos(0.927 × n) + i sin(0.927 × n))
(−3 + 4i)ⁿ = 5ⁿ (cos(2.214 × n) + i sin(2.214 × n))
(3 − 4i)ⁿ = 5ⁿ (cos(5.356 × n) + i sin(5.356 × n)).

For example we have:

(3 + 4i)^(1/2) = √(3 + 4i)
              = 5^(1/2) (cos(0.927 × 1/2) + i sin(0.927 × 1/2))
              = 2 + 1i.

Thus we can calculate square roots of complex numbers! You can verify that 2 + i is in fact the square root of 3 + 4i since:

(2 + i)(2 + i) = 3 + 4i.
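De Moivre's theorem translates directly into code via the polar form; Python's cmath.polar and cmath.rect do the conversions:

```python
import cmath

# De Moivre: z^n = r^n (cos(n*theta) + i*sin(n*theta)), including non-integer n.
def demoivre(z, n):
    r, theta = cmath.polar(z)
    return cmath.rect(r ** n, n * theta)

print(demoivre(1 + 1j, 5))     # ≈ -4 - 4i
print(demoivre(3 + 4j, 0.5))   # ≈ 2 + i, a square root of 3 + 4i
print((2 + 1j) * (2 + 1j))     # (3+4j), confirming the square root
```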
4.3.8 Euler's Formula

To see why De Moivre's theorem is true, consider the Taylor series of e^(θi), which is:

e^(θi) = 1 + θi + (θi)²/2! + (θi)³/3! + (θi)⁴/4! + ···.

Since i² = −1, i³ = −i, i⁴ = 1 etc., we have:

e^(θi) = (1 − θ²/2! + θ⁴/4! − θ⁶/6! + ···) + i (θ − θ³/3! + θ⁵/5! − θ⁷/7! + ···)
       = cos(θ) + i sin(θ)

from the Taylor series for cos(θ) and sin(θ). We therefore have the (really mind-boggling!) result:

Theorem 128 Euler's formula:

e^(θi) = cos(θ) + i sin(θ).
To get a feel for some of the depth behind Euler's formula, consider setting θ = π. Since cos(π) = −1 and sin(π) = 0, we then have:

e^(πi) = −1

or

e^(πi) + 1 = 0

which unites the five most fundamental numbers, e, π, i, 1 and 0, in one simple formula. Taking logs of both sides of e^(πi) = −1 we obtain ln(−1) = πi.

We can also calculate e^(iπ/2) = i. Taking logs of both sides we find that ln(i) = iπ/2. If we now ask what sort of number is i^i (mathematicians once thought this would be a new kind of number), we find that it turns out to be a real number! Thus i^i = e^(ln(i) i) = e^((π/2)i × i) = e^(−π/2) ≈ 0.207880.
We have the following:

Theorem 129 The absolute value of e^(θi) is 1, or |e^(θi)| = 1, and the conjugate of e^(θi) is e^(−θi), or conj(e^(θi)) = e^(−θi).

Proof.

|e^(θi)| = √(cos(θ)² + sin(θ)²) = 1

conj(e^(θi)) = conj(cos(θ) + i sin(θ)) = cos(θ) − i sin(θ) = e^(−θi).
Examples of Euler's Formula

Verify the following examples:

e^(πi) = cos(π) + i sin(π) = −1
e^((π/4)i) = cos(π/4) + i sin(π/4) = 1/√2 + (1/√2)i
e^(i) = cos(1) + i sin(1) = 0.540 + 0.841i.

You can also verify from a previous example that:

(3 + 4i)ⁿ = 5ⁿ e^(0.927i × n)
(−3 + 4i)ⁿ = 5ⁿ e^(2.214i × n)
(3 − 4i)ⁿ = 5ⁿ e^(5.356i × n).
A Proof of De Moivre's Theorem

Since

a + bi = r (cos(θ) + i sin(θ))

and by Euler's formula

e^(θi) = cos(θ) + i sin(θ)

we have:

a + bi = r e^(θi).

It therefore follows that:

(a + bi)ⁿ = (r e^(θi))ⁿ = rⁿ e^(nθi) = rⁿ (cos(nθ) + i sin(nθ))

which is De Moivre's theorem. Thus given 3 + 4i, where r = √(3² + 4²) = 5 and θ = arg(3 + 4i) = arctan(4/3) = 0.927, we have

3 + 4i = 5 e^(0.927i)

so that:

(3 + 4i)³ = 5³ e^(3 × 0.927i) = −117 + 44i.
Using Euler's Formula to Represent cos(x) and sin(x)

Consider using Euler's formula as follows:

(e^(θi) + e^(−θi)) / 2 = (cos(θ) + i sin(θ) + cos(−θ) + i sin(−θ)) / 2 = cos(θ)

since cos(−θ) = cos(θ) and sin(−θ) = −sin(θ). On your own, apply the same logic and show that:

(e^(θi) − e^(−θi)) / (2i) = sin(θ).

We therefore have:

cos(θ) = (e^(θi) + e^(−θi)) / 2
sin(θ) = (e^(θi) − e^(−θi)) / (2i).

These relations can be used to easily prove many trigonometric results. For example, from

e^((θ1+θ2)i) = cos(θ1 + θ2) + i sin(θ1 + θ2)

and

e^((θ1+θ2)i) = e^(θ1 i) e^(θ2 i) = (cos(θ1) + i sin(θ1))(cos(θ2) + i sin(θ2))
             = (cos(θ1) cos(θ2) − sin(θ1) sin(θ2)) + i (cos(θ1) sin(θ2) + cos(θ2) sin(θ1))

we can, by equating real terms with real terms and imaginary terms with imaginary terms, arrive at:

cos(θ1 + θ2) = cos(θ1) cos(θ2) − sin(θ1) sin(θ2)
sin(θ1 + θ2) = cos(θ1) sin(θ2) + cos(θ2) sin(θ1).
4.3.9 When Does (a + bi)ⁿ → 0 as n → ∞?

With differential and difference equations it turns out that the important property of stability depends on what happens to:

(a + bi)ⁿ

as n increases without bound. For real numbers we know that aⁿ → 0 if and only if |a| < 1. For example, if we take higher and higher powers of −1/2 we get:

(−1/2)ⁿ = −1/2, 1/4, −1/8, 1/16, ... for n = 1, 2, 3, 4, ...

which approaches 0 as n increases, and |−1/2| < 1. In contrast, if a = 2 then we have:

2ⁿ = 2, 4, 8, 16, 32, 64, ... for n = 1, 2, 3, 4, 5, 6, ...

so that lim_{n→∞} 2ⁿ = ∞ ≠ 0.

It turns out that this generalizes to complex numbers. In particular we have:

Theorem 130

lim_{n→∞} (a + bi)ⁿ = 0 if and only if |a + bi| < 1.

This follows since:

(a + bi)ⁿ = rⁿ e^(nθi)

where r = |a + bi|, together with the fact, already shown, that |e^(θi)| = 1 for any θ, so that:

|(a + bi)ⁿ| = rⁿ |e^(nθi)| = rⁿ → 0

if and only if r = |a + bi| < 1.

For example, consider 1/2 + (1/2)i. We have:

r = √((1/2)² + (1/2)²) = 0.707 < 1

so that:

lim_{n→∞} (1/2 + (1/2)i)ⁿ = 0.

To see that this is so, consider:

(1/2 + (1/2)i)¹ = 1/2 + (1/2)i
(1/2 + (1/2)i)² = (1/2)i
(1/2 + (1/2)i)³ = −1/4 + (1/4)i
(1/2 + (1/2)i)⁴ = −1/4
(1/2 + (1/2)i)⁵ = −1/8 − (1/8)i
...
(1/2 + (1/2)i)⁴⁰ = 1/1048576.

Note how the real and imaginary components get smaller and smaller as n gets larger.

Now consider 1 + 1i. Since r = |1 + 1i| is:

r = √(1² + 1²) = 1.414 > 1

we have that:

lim_{n→∞} (1 + i)ⁿ ≠ 0.

To see that this is so, consider:

(1 + i)² = 2i
(1 + i)³ = −2 + 2i
(1 + i)⁴ = −4
(1 + i)⁵ = −4 − 4i
(1 + i)⁶ = −8i
...
(1 + i)⁴⁰ = 1048576.

Note how the real and imaginary components get larger and larger as n gets larger.
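The same contrast takes a few lines of Python, since the built-in complex type handles the powers:

```python
# |z| < 1 makes z^n shrink toward 0, while |z| > 1 makes it blow up.
z_stable, z_unstable = 0.5 + 0.5j, 1 + 1j

print(abs(z_stable), abs(z_unstable))   # ≈ 0.707 and ≈ 1.414
print(z_stable ** 40)                   # ≈ 1/1048576 (tiny)
print(z_unstable ** 40)                 # ≈ 1048576 (huge)
```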
4.4 First-Order Linear Difference Equations

4.4.1 Definition

Let Y_t be some variable at time t; for example, Y_t could be GNP at time t. Time is discrete, beginning at t = 0 and progressing in units of 1 so that t = 0, 1, 2, ..., and so there is a sequence of values of Y_t, beginning at Y_0 and continuing as Y_1, Y_2, Y_3, ....

Definition 131 A first-order linear difference equation is defined by an equation of movement:

Y_t = α + φ Y_{t−1}

and by a starting value Y_0 which gets the process started.

4.4.2 Example

For example suppose that Y_0 = 200, α = 300, φ = 0.5, so that:

Y_t = 300 + 0.5 Y_{t−1}.

It then follows that:

Y_1 = 300 + 0.5 Y_0 = 300 + 0.5 × 200 = 400
Y_2 = 300 + 0.5 Y_1 = 300 + 0.5 × 400 = 500
Y_3 = 300 + 0.5 Y_2 = 300 + 0.5 × 500 = 550
Y_4 = 300 + 0.5 Y_3 = 300 + 0.5 × 550 = 575
Y_5 = 300 + 0.5 Y_4 = 300 + 0.5 × 575 = 587.5

and so on. Note in the example that Y_t is increasing and appears to be approaching some value. This is the equilibrium value Y* of the process, which is defined as:

Definition 132 The equilibrium value Y* is that value of Y_t such that if Y_t = Y* then Y_{t+k} = Y* for all k > 0.

For a first-order linear difference equation we have that if Y_{t−1} = Y* then Y_t = Y*, so that substituting Y_{t−1} = Y* and Y_t = Y* into Y_t = α + φ Y_{t−1} yields:

Y* = α + φ Y*.

Thus as long as φ ≠ 1 we can solve for Y* as:

Y* = α / (1 − φ).

In the above example we have:

Y* = 300 + 0.5 Y*

so that solving we obtain Y* = 300/(1 − 0.5) = 600.
4.4.3 Stability
An important question is: if we start at any $Y_0$, does $Y_t$ always approach $Y^*$? If it does, then we say that the difference equation is stable.

Definition 133 A difference equation is stable if for any starting value $Y_0$
$$\lim_{t \to \infty} Y_t = Y^*.$$
For example, suppose you have a ball and a teacup as illustrated below. If you take the teacup and put a small ball anywhere inside it, the ball eventually comes to rest at the bottom of the teacup. This is an example of stable dynamics, with the bottom of the teacup a stable equilibrium point. If however we turn the teacup over, then the top of the teacup is an equilibrium point, since if we put the ball exactly at this point it will remain there forever if nothing disturbs it. If however the ball starts anywhere except at the top of the cup, or if it is slightly disturbed while it is at the top of the cup, then it will move away from the equilibrium. This then is an example of unstable dynamics, with the top of the teacup an unstable equilibrium point.
Figure 4.1: A ball and a teacup: stable and unstable equilibria.

Just as with the teacup example, it turns out that some difference equations are stable and some are unstable. The above example turns out to be stable. If instead
$$Y_t = 300 + 2 Y_{t-1}$$
with say $Y_0 = 200$, then this is unstable. To see this note that:
$$Y_1 = 300 + 2 Y_0 = 300 + 2 \times 200 = 700$$
$$Y_2 = 300 + 2 Y_1 = 300 + 2 \times 700 = 1700$$
$$Y_3 = 300 + 2 Y_2 = 300 + 2 \times 1700 = 3700$$
$$Y_4 = 300 + 2 Y_3 = 300 + 2 \times 3700 = 7700$$
$$Y_5 = 300 + 2 Y_4 = 300 + 2 \times 7700 = 15700$$
etc., which does not approach $Y^*$ given by
$$Y^* = \frac{300}{1-2} = -300.$$
The question then is: under what circumstances is a difference equation stable, and under what circumstances is it unstable? We can answer this question precisely for a first-order linear difference equation by defining $Y^*_t$ as the deviation from the equilibrium value $Y^*$, or
$$Y^*_t \equiv Y_t - Y^*$$
so that $Y_t = Y^*_t + Y^*$. We therefore have:
$$\underbrace{Y^*_t + Y^*}_{Y_t} = \alpha + \phi \underbrace{\left(Y^*_{t-1} + Y^*\right)}_{Y_{t-1}} = \underbrace{\alpha + \frac{\alpha \phi}{1-\phi}}_{=Y^*} + \phi Y^*_{t-1}.$$
Thus the $Y^*$ on the left side cancels with the $Y^*$ on the right side and we have:
$$Y^*_t = \phi Y^*_{t-1}$$
which is the same difference equation except without the constant $\alpha$. It follows then that:
$$Y^*_t = \phi \underbrace{Y^*_{t-1}}_{=\phi Y^*_{t-2}} = \phi^2 Y^*_{t-2} = \cdots = \phi^t Y^*_0$$
or
$$Y^*_t = \phi^t Y^*_0.$$
Recall that $Y^*_t = Y_t - Y^*$. It therefore follows that:

Theorem 134 The solution to a first-order linear difference equation takes the form:
$$Y_t = Y^* + \phi^t\left(Y_0 - Y^*\right).$$

We therefore have stability if and only if:
$$\lim_{t \to \infty} \phi^t = 0$$
and so we obtain the following:

Theorem 135 The difference equation $Y_t = \alpha + \phi Y_{t-1}$ is stable if and only if:
$$-1 < \phi < 1.$$

Note that stability depends only on $\phi$ and not on $Y_0$ or $\alpha$. Thus $Y_t = 300 + 0.5 Y_{t-1}$ is stable because $-1 < \phi = \frac{1}{2} < 1$, while $Y_t = 300 + 2 Y_{t-1}$ is not stable because $\phi = 2 > 1$.
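Theorem 134 can be checked numerically: iterating the equation of motion and evaluating the closed form $Y_t = Y^* + \phi^t(Y_0 - Y^*)$ should give the same path. A sketch using the stable example above:

```python
# Compare direct iteration of Y_t = alpha + phi*Y_{t-1} with the closed-form
# solution of Theorem 134.

def y_iterated(alpha: float, phi: float, y0: float, t: int) -> float:
    y = y0
    for _ in range(t):
        y = alpha + phi * y
    return y

def y_closed(alpha: float, phi: float, y0: float, t: int) -> float:
    y_star = alpha / (1.0 - phi)          # equilibrium value Y*
    return y_star + phi ** t * (y0 - y_star)

# Stable example: alpha = 300, phi = 0.5, Y_0 = 200; both agree, and the
# closed form shows convergence toward Y* = 600 for large t.
for t in range(10):
    assert abs(y_iterated(300.0, 0.5, 200.0, t) - y_closed(300.0, 0.5, 200.0, t)) < 1e-9
print(y_closed(300.0, 0.5, 200.0, 50))  # very close to 600
```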
4.4.4 A Macroeconomic Example
Suppose that $Y_t$ is GNP, $C_t$ is consumption, $I_t$ is investment and $G_t$ is government expenditure, so that:
$$Y_t = C_t + I_t + G_t.$$
Both investment and government expenditure are exogenous, so that:
$$I_t = I, \qquad G_t = G$$
where $I$ and $G$ are constants. Consumption depends on income lagged one period via the consumption function
$$C_t = C + c Y_{t-1}, \qquad 0 < c < 1$$
where $c$ is the marginal propensity to consume. It follows that:
$$Y_t = C_t + I_t + G_t = C + c Y_{t-1} + I + G$$
or
$$Y_t = \underbrace{(C + I + G)}_{\alpha} + \underbrace{c}_{\phi}\, Y_{t-1}$$
which is a first-order difference equation.
The equilibrium value of GNP is thus:
$$Y^* = \frac{\alpha}{1-\phi} = \frac{C + I + G}{1 - c}.$$
Note for example that $\frac{\partial Y^*}{\partial G} = \frac{1}{1-c}$, the conventional multiplier. The economy is stable since $\phi = c$ and $0 < c < 1$. In fact
$$Y_t = \frac{C+I+G}{1-c} + c^t\left(Y_0 - \frac{C+I+G}{1-c}\right) \to \frac{C+I+G}{1-c} \quad \text{as } t \to \infty.$$
As an example, consider the economy where $Y_0 = 400$, $c = 0.5$ and $C = I = G = 100$, in which case:
$$Y_t = 300 + 0.5 Y_{t-1}$$
which has the solution:
$$Y_t = 600 - 200\left(\frac{1}{2}\right)^t$$
with a time path plotted below:

[Plot: the time path of $Y_t = 600 - 200\left(\frac{1}{2}\right)^t$, rising from $400$ toward $600$ as $t$ runs from $0$ to $10$.]
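The multiplier $\frac{\partial Y^*}{\partial G} = \frac{1}{1-c}$ can be illustrated with the numbers of this example: raising $G$ by one unit raises equilibrium GNP by $\frac{1}{1-0.5} = 2$ units. A minimal sketch:

```python
# Equilibrium GNP Y* = (C + I + G)/(1 - c) and the government-spending
# multiplier 1/(1 - c), using the example values C = I = G = 100, c = 0.5.

def equilibrium_gnp(C: float, I: float, G: float, c: float) -> float:
    return (C + I + G) / (1.0 - c)

base = equilibrium_gnp(100.0, 100.0, 100.0, 0.5)     # 600.0
bumped = equilibrium_gnp(100.0, 100.0, 101.0, 0.5)   # G raised by 1
print(bumped - base)  # 2.0, the multiplier 1/(1 - 0.5)
```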
4.4.5 The Cob-Web Model
Suppose that the quantity supplied depends on the price that occurred oneperiod ago and not the current price. For example suppose hog producers mightmake their decisions on how many hogs to bring to market based on last year’shog price and not this years. Assume that the quantity demanded depends onthe current price. Then we have:
Qdt = a¡ bPtQst = c+ dPt¡1
CHAPTER 4. DYNAMICS 157
where Qdt is the quantity demanded, Qst is the quantity supplied and Pt is the
price. In equilibrium: Qdt = Qst so that price Pt follows a …rst-order lineardi¤erence equation given by:
a¡ bPt = c+ dPt¡1or:
Pt =a¡ cb
¡ dbPt¡1:
To solve this note that the equilibrium price P¤ is:
P ¤ =a¡cb
1 + db
=a¡ cb+ d
so that:
Pt = P¤ +
µ¡db
¶t(Po ¡ P ¤) :
Note that because ¡db is a negative number, the actual price will, over consec-
utive years, over and undershoot the equilibrium price.Stability requires that d < b or that the slope of the supply curve be less
than the slope of the demand curve.
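The oscillating price path is easy to simulate. The parameter values below are illustrative assumptions (the text leaves $a$, $b$, $c$, $d$ general); they satisfy $d < b$, so the path is stable:

```python
# Cobweb price dynamics: P_t = (a - c)/b - (d/b) * P_{t-1}, with
# equilibrium price P* = (a - c)/(b + d).

def cobweb_path(a: float, b: float, c: float, d: float, p0: float, periods: int) -> list:
    prices = [p0]
    for _ in range(periods):
        prices.append((a - c) / b - (d / b) * prices[-1])
    return prices

# Hypothetical parameters: a = 100, b = 2, c = 10, d = 1 (so d < b: stable).
a, b, c, d = 100.0, 2.0, 10.0, 1.0
p_star = (a - c) / (b + d)                    # 30.0
path = cobweb_path(a, b, c, d, p0=20.0, periods=30)
print(p_star, path[1], path[2], path[-1])     # path oscillates around 30 and converges
```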
4.5 Second-Order Linear Difference Equations

Definition 136 A second-order linear difference equation has an equation of motion given by:
$$Y_t = \alpha + \phi_1 Y_{t-1} + \phi_2 Y_{t-2}$$
with given starting values $Y_0$ and $Y_1$.

If $Y_t = Y^*$, the equilibrium value, it follows that $Y_{t+1} = Y^*$ and $Y_{t+2} = Y^*$, so that $Y^*$ satisfies:
$$Y^* = \alpha + \phi_1 Y^* + \phi_2 Y^*$$
so that:

Theorem 137 The equilibrium value for a second-order linear difference equation is given by:
$$Y^* = \frac{\alpha}{1 - \phi_1 - \phi_2}.$$
If we define $Y^*_t = Y_t - Y^*$ then:
$$Y^*_t = \phi_1 Y^*_{t-1} + \phi_2 Y^*_{t-2}$$
so that $Y^*_t$ obeys the same difference equation but without the constant $\alpha$. To solve the difference equation in this form we hazard a guess that:
$$Y^*_t = A r^t$$
for some $A$ and $r$. Substituting $Ar^t$ for $Y^*_t$, $Ar^{t-1}$ for $Y^*_{t-1}$ and $Ar^{t-2}$ for $Y^*_{t-2}$, it follows that:
$$A r^t = \phi_1 A r^{t-1} + \phi_2 A r^{t-2}.$$
Cancelling the $A$ and dividing both sides by $r^{t-2}$, we find that $r$ is a root of the characteristic polynomial defined by:

Definition 138 The characteristic polynomial of the second-order difference equation $Y_t = \alpha + \phi_1 Y_{t-1} + \phi_2 Y_{t-2}$ is
$$r^2 - \phi_1 r - \phi_2.$$

Thus the two roots $r_1$ and $r_2$ of the characteristic polynomial are:
$$r_1, r_2 = \frac{\phi_1 \pm \sqrt{\phi_1^2 + 4\phi_2}}{2}.$$
The general solution to a second-order linear difference equation then takes the form:
$$Y^*_t = A_1 r_1^t + A_2 r_2^t$$
or, using the fact that $Y^*_t = Y_t - Y^*$, we have:

Theorem 139 The solution to a second-order linear difference equation takes the form:
$$Y_t = Y^* + A_1 r_1^t + A_2 r_2^t.$$
Thus stability requires that $|r_1| < 1$ and $|r_2| < 1$, in which case $r_1^t \to 0$ and $r_2^t \to 0$ as $t \to \infty$ and:
$$\lim_{t \to \infty} Y_t = Y^* + A_1 \underbrace{\lim_{t \to \infty} r_1^t}_{=0} + A_2 \underbrace{\lim_{t \to \infty} r_2^t}_{=0} = Y^*.$$
Thus we have:

Theorem 140 The linear second-order difference equation:
$$Y_t = \alpha + \phi_1 Y_{t-1} + \phi_2 Y_{t-2}$$
is stable if and only if:
$$|r_1| < 1 \quad \text{and} \quad |r_2| < 1$$
where $r_1$ and $r_2$ are the roots of the characteristic polynomial.

Note that there is no difficulty in establishing stability if $r_1$ and $r_2$ are complex, since we know how to calculate absolute values for complex numbers. Thus:

Theorem 141 If $r_1 = a + bi$ and $r_2 = a - bi$, the difference equation is stable if and only if:
$$|r_1| = |r_2| = \sqrt{a^2 + b^2} < 1.$$

To calculate $A_1$ and $A_2$ we use the fact that $Y_0$ and $Y_1$ are given, and substitute $t = 0$ and $t = 1$ in
$$Y_t = Y^* + A_1 r_1^t + A_2 r_2^t$$
to obtain:
$$Y_0 = Y^* + A_1 + A_2$$
$$Y_1 = Y^* + A_1 r_1 + A_2 r_2.$$
It then follows that:
$$\begin{bmatrix} 1 & 1 \\ r_1 & r_2 \end{bmatrix}\begin{bmatrix} A_1 \\ A_2 \end{bmatrix} = \begin{bmatrix} Y_0 - Y^* \\ Y_1 - Y^* \end{bmatrix}$$
so that $A_1$ and $A_2$ are the solutions to two linear equations:
$$\begin{bmatrix} A_1 \\ A_2 \end{bmatrix} = \begin{bmatrix} 1 & 1 \\ r_1 & r_2 \end{bmatrix}^{-1}\begin{bmatrix} Y_0 - Y^* \\ Y_1 - Y^* \end{bmatrix} = \frac{1}{r_2 - r_1}\begin{bmatrix} r_2 & -1 \\ -r_1 & 1 \end{bmatrix}\begin{bmatrix} Y_0 - Y^* \\ Y_1 - Y^* \end{bmatrix}.$$
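The $2 \times 2$ system can be solved directly with the explicit inverse formula above. A sketch, fed with the numbers from Example 1 below ($r_1 = \frac{4}{5}$, $r_2 = \frac{1}{2}$, $Y^* = 1000$, $Y_0 = 800$, $Y_1 = 850$) as a plausible check:

```python
# Solve [1 1; r1 r2][A1; A2] = [Y0 - Y*; Y1 - Y*] using the explicit
# 2x2 inverse from the text.

def solve_A(r1: float, r2: float, y0: float, y1: float, y_star: float):
    det = r2 - r1
    b0, b1 = y0 - y_star, y1 - y_star
    A1 = (r2 * b0 - b1) / det
    A2 = (-r1 * b0 + b1) / det
    return A1, A2

A1, A2 = solve_A(0.8, 0.5, 800.0, 850.0, 1000.0)
print(A1, A2)  # A1 = -500/3 and A2 = -100/3, matching the worked example
```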
4.5.1 The Behavior of Yt with Complex Roots
Assuming that we have complex roots, we can write them in polar form as:
$$r_1 = r e^{\theta i}, \qquad r_2 = r e^{-\theta i}.$$
It can be shown that $A_1$ and $A_2$ will also be complex, and conjugates of each other. Thus if $A_1 = c + di$ and $A_2 = c - di$ are written in polar form we have:
$$A_1 = A e^{\alpha i}, \qquad A_2 = A e^{-\alpha i}$$
where $A = |c + di|$ and $\alpha = \arg(c + di)$. Then:
$$Y_t = Y^* + A_1 r_1^t + A_2 r_2^t = Y^* + A e^{\alpha i}\left(r e^{\theta i}\right)^t + A e^{-\alpha i}\left(r e^{-\theta i}\right)^t = Y^* + 2 A r^t\left(\frac{e^{i(\theta t + \alpha)} + e^{-i(\theta t + \alpha)}}{2}\right) = Y^* + 2 A r^t \cos\left(\theta t + \alpha\right).$$
Thus with complex roots $Y_t$ will be a damped cosine wave if $r < 1$, since in this case $r^t \to 0$ as $t \to \infty$ and so $Y_t$ will converge to $Y^*$. If $r > 1$, $Y_t$ will be an exploding cosine wave, since in this case $r^t \to \infty$ as $t \to \infty$ and so $Y_t$ will diverge from $Y^*$. Thus in general complex roots result in a kind of wave behavior for $Y_t$.
4.5.2 Example 1 Real Roots and Stable
Consider the second-order difference equation:
$$Y_t = 100 + 1.3 Y_{t-1} - 0.4 Y_{t-2}, \qquad Y_1 = 850, \quad Y_0 = 800.$$
The equilibrium value $Y^*$ is then:
$$Y^* = \frac{100}{1 - (1.3) - (-0.4)} = 1000.$$
The characteristic polynomial is:
$$r^2 - 1.3 r + 0.4 = 0$$
so that:
$$r_1, r_2 = \frac{-(-1.3) \pm \sqrt{(-1.3)^2 - 4 \times 0.4}}{2}$$
or:
$$r_1 = \frac{4}{5}, \qquad r_2 = \frac{1}{2}.$$
Since $|r_1| < 1$ and $|r_2| < 1$, this difference equation is stable; that is, no matter what the starting values, $Y_t$ will converge to the equilibrium value $Y^* = 1000$. The general solution is then:
$$Y_t = 1000 + A_1\left(\frac{4}{5}\right)^t + A_2\left(\frac{1}{2}\right)^t$$
where $A_1$ and $A_2$ depend on the starting values.
Since $Y_0 = 800$ and $Y_1 = 850$ we have
$$Y_0 = 800 = 1000 + A_1\left(\frac{4}{5}\right)^0 + A_2\left(\frac{1}{2}\right)^0$$
$$Y_1 = 850 = 1000 + A_1\left(\frac{4}{5}\right)^1 + A_2\left(\frac{1}{2}\right)^1$$
so that in matrix notation:
$$\begin{bmatrix} 1 & 1 \\ \frac{4}{5} & \frac{1}{2} \end{bmatrix}\begin{bmatrix} A_1 \\ A_2 \end{bmatrix} = \begin{bmatrix} -200 \\ -150 \end{bmatrix}$$
or, solving for $A_1$ and $A_2$, we have:
$$\begin{bmatrix} A_1 \\ A_2 \end{bmatrix} = \begin{bmatrix} 1 & 1 \\ \frac{4}{5} & \frac{1}{2} \end{bmatrix}^{-1}\begin{bmatrix} -200 \\ -150 \end{bmatrix} = \begin{bmatrix} -\frac{500}{3} \\ -\frac{100}{3} \end{bmatrix}.$$
We thus have the solution:
$$Y_t = 1000 - \frac{500}{3}\left(\frac{4}{5}\right)^t - \frac{100}{3}\left(\frac{1}{2}\right)^t.$$
The movement of $Y_t$ over time is plotted below:

[Plot: $Y_t$ rising from $800$ toward $1000$ as $t$ runs from $0$ to $20$.]

Note how $Y_t$ approaches the equilibrium value $Y^* = 1000$.
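A quick numerical check of this solution: iterating the difference equation directly and evaluating the closed form should agree at every $t$.

```python
# Verify Example 1: the recursion Y_t = 100 + 1.3*Y_{t-1} - 0.4*Y_{t-2}
# matches the closed form Y_t = 1000 - (500/3)(4/5)^t - (100/3)(1/2)^t.

def iterate2(alpha: float, phi1: float, phi2: float, y0: float, y1: float, t: int) -> float:
    ys = [y0, y1]
    for _ in range(2, t + 1):
        ys.append(alpha + phi1 * ys[-1] + phi2 * ys[-2])
    return ys[t]

def closed(t: int) -> float:
    return 1000.0 - (500.0 / 3.0) * 0.8 ** t - (100.0 / 3.0) * 0.5 ** t

for t in range(20):
    assert abs(iterate2(100.0, 1.3, -0.4, 800.0, 850.0, t) - closed(t)) < 1e-6
print(closed(0), closed(1))  # 800.0 and 850.0, the starting values
```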
4.5.3 Example 2 Real Roots and Unstable
Consider the second-order difference equation:
$$Y_t = -100 + 0.7 Y_{t-1} + 0.4 Y_{t-2}, \qquad Y_1 = 850, \quad Y_0 = 800.$$
The equilibrium value $Y^*$ is then:
$$Y^* = \frac{-100}{1 - (0.7) - (0.4)} = 1000.$$
The characteristic polynomial is:
$$r^2 - 0.7 r - 0.4 = 0$$
so that:
$$r_1, r_2 = \frac{-(-0.7) \pm \sqrt{(-0.7)^2 - 4 \times (-0.4)}}{2}$$
or:
$$r_1 = 1.07, \qquad r_2 = -0.373.$$
Since $r_1 = 1.07 > 1$, it follows that the difference equation is unstable. The general solution then takes the form:
$$Y_t = 1000 + A_1(1.07)^t + A_2(-0.373)^t$$
where $A_1$ and $A_2$ depend on the starting values. Since $Y_0 = 800$ and $Y_1 = 850$ we have
$$Y_0 = 800 = 1000 + A_1(1.07)^0 + A_2(-0.373)^0$$
$$Y_1 = 850 = 1000 + A_1(1.07)^1 + A_2(-0.373)^1$$
so that in matrix notation:
$$\begin{bmatrix} 1 & 1 \\ 1.07 & -0.373 \end{bmatrix}\begin{bmatrix} A_1 \\ A_2 \end{bmatrix} = \begin{bmatrix} -200 \\ -150 \end{bmatrix}$$
or, solving for $A_1$ and $A_2$, we have:
$$\begin{bmatrix} A_1 \\ A_2 \end{bmatrix} = \begin{bmatrix} 1 & 1 \\ 1.07 & -0.373 \end{bmatrix}^{-1}\begin{bmatrix} -200 \\ -150 \end{bmatrix} = \begin{bmatrix} -155.65 \\ -44.352 \end{bmatrix}.$$
Thus the solution is:
$$Y_t = 1000 - 155.65 \times (1.07)^t - 44.352 \times (-0.373)^t$$
which is plotted below:

[Plot: $Y_t$ falling away from the equilibrium $Y^* = 1000$ and diverging downward as $t$ runs from $0$ to $30$.]
4.5.4 Example 3 Complex Roots and Stable
Consider the second-order difference equation:
$$Y_t = 100 + Y_{t-1} - \frac{1}{2} Y_{t-2}, \qquad Y_1 = 120, \quad Y_0 = 100.$$
The equilibrium value $Y^*$ is then:
$$Y^* = \frac{100}{1 - (1) - \left(-\frac{1}{2}\right)} = 200.$$
The characteristic polynomial is:
$$r^2 - r + \frac{1}{2} = 0$$
so that:
$$r_1, r_2 = \frac{1 \pm \sqrt{(-1)^2 - 4 \times \frac{1}{2}}}{2}$$
or:
$$r_1 = \frac{1}{2} + \frac{1}{2}i, \qquad r_2 = \frac{1}{2} - \frac{1}{2}i.$$
Since
$$|r_1| = |r_2| = \sqrt{\left(\frac{1}{2}\right)^2 + \left(\frac{1}{2}\right)^2} = \frac{1}{\sqrt{2}} < 1$$
the difference equation is stable; that is, no matter what the starting values, $Y_t$ will converge to the equilibrium value $Y^* = 200$. The general solution is then:
$$Y_t = 200 + A_1\left(\frac{1}{2} + \frac{1}{2}i\right)^t + A_2\left(\frac{1}{2} - \frac{1}{2}i\right)^t$$
where $A_1$ and $A_2$ depend on the starting values. Since $Y_0 = 100$ and $Y_1 = 120$ we have
$$Y_0 = 100 = 200 + A_1\left(\frac{1}{2} + \frac{1}{2}i\right)^0 + A_2\left(\frac{1}{2} - \frac{1}{2}i\right)^0$$
$$Y_1 = 120 = 200 + A_1\left(\frac{1}{2} + \frac{1}{2}i\right)^1 + A_2\left(\frac{1}{2} - \frac{1}{2}i\right)^1$$
so that in matrix notation:
$$\begin{bmatrix} 1 & 1 \\ \frac{1}{2} + \frac{1}{2}i & \frac{1}{2} - \frac{1}{2}i \end{bmatrix}\begin{bmatrix} A_1 \\ A_2 \end{bmatrix} = \begin{bmatrix} -100 \\ -80 \end{bmatrix}$$
or, solving for $A_1$ and $A_2$, we have:
$$\begin{bmatrix} A_1 \\ A_2 \end{bmatrix} = \begin{bmatrix} 1 & 1 \\ \frac{1}{2} + \frac{1}{2}i & \frac{1}{2} - \frac{1}{2}i \end{bmatrix}^{-1}\begin{bmatrix} -100 \\ -80 \end{bmatrix} = \begin{bmatrix} -50 + 30i \\ -50 - 30i \end{bmatrix}.$$
We thus have the solution:
$$Y_t = 200 + (-50 + 30i)\left(\frac{1}{2} + \frac{1}{2}i\right)^t + (-50 - 30i)\left(\frac{1}{2} - \frac{1}{2}i\right)^t.$$
Here
$$\frac{1}{2} + \frac{1}{2}i = \frac{1}{\sqrt{2}}\, e^{\frac{\pi}{4}i}, \qquad \frac{1}{2} - \frac{1}{2}i = \frac{1}{\sqrt{2}}\, e^{-\frac{\pi}{4}i}$$
$$-50 + 30i = \sqrt{3400}\, e^{2.601173 i}, \qquad -50 - 30i = \sqrt{3400}\, e^{-2.601173 i}$$
since:
$$\arg(-50 + 30i) = \arctan\left(\frac{30}{-50}\right) + \pi = 2.601173.$$
Therefore:
$$Y_t = 200 + 2 \times \sqrt{3400} \times \left(\frac{1}{\sqrt{2}}\right)^t \times \cos\left(\frac{\pi}{4} t + 2.601173\right).$$
This is plotted below:

[Plot: $Y_t$ oscillating from $100$ at $t = 0$ and damping toward $200$ as $t$ runs to $20$.]
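The complex-root expression and the real cosine form are two ways of writing the same (real) path, and this can be checked numerically:

```python
# Example 3: the complex-coefficient solution and its polar cosine form
# should produce the same real sequence Y_t.

import math

def complex_form(t: int) -> float:
    r1, r2 = 0.5 + 0.5j, 0.5 - 0.5j
    A1, A2 = -50.0 + 30.0j, -50.0 - 30.0j
    return (200.0 + A1 * r1 ** t + A2 * r2 ** t).real

def cosine_form(t: int) -> float:
    amp = 2.0 * math.sqrt(3400.0)
    return 200.0 + amp * (1.0 / math.sqrt(2.0)) ** t * math.cos(math.pi / 4.0 * t + 2.601173)

for t in range(12):
    assert abs(complex_form(t) - cosine_form(t)) < 1e-2
print(complex_form(0), complex_form(1))  # 100.0 and 120.0, the starting values
```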
4.5.5 Example 4 Complex Roots and Unstable
Consider the second-order difference equation:
$$Y_t = 100 + 2 Y_{t-1} - 2 Y_{t-2}, \qquad Y_1 = 120, \quad Y_0 = 100.$$
The equilibrium value $Y^*$ is then:
$$Y^* = \frac{100}{1 - (2) - (-2)} = 100.$$
The characteristic polynomial is:
$$r^2 - 2r + 2 = 0$$
so that:
$$r_1 = 1 + i, \qquad r_2 = 1 - i.$$
Since
$$|r_1| = |r_2| = \sqrt{1^2 + 1^2} = \sqrt{2} > 1$$
the difference equation is unstable; that is, no matter what the starting values, $Y_t$ will diverge from the equilibrium value $Y^* = 100$. The general solution is then:
$$Y_t = 100 + A_1(1+i)^t + A_2(1-i)^t$$
where $A_1$ and $A_2$ depend on the starting values. Since $Y_0 = 100$ and $Y_1 = 120$ we have
$$Y_0 = 100 = 100 + A_1(1+i)^0 + A_2(1-i)^0$$
$$Y_1 = 120 = 100 + A_1(1+i)^1 + A_2(1-i)^1$$
so that in matrix notation:
$$\begin{bmatrix} 1 & 1 \\ 1+i & 1-i \end{bmatrix}\begin{bmatrix} A_1 \\ A_2 \end{bmatrix} = \begin{bmatrix} 0 \\ 20 \end{bmatrix}$$
or, solving for $A_1$ and $A_2$, we have:
$$\begin{bmatrix} A_1 \\ A_2 \end{bmatrix} = \begin{bmatrix} 1 & 1 \\ 1+i & 1-i \end{bmatrix}^{-1}\begin{bmatrix} 0 \\ 20 \end{bmatrix} = \begin{bmatrix} -10i \\ 10i \end{bmatrix}.$$
We thus have the solution:
$$Y_t = 100 - 10i(1+i)^t + 10i(1-i)^t.$$
Since:
$$1 + i = \sqrt{2}\, e^{\frac{\pi}{4}i}, \qquad 1 - i = \sqrt{2}\, e^{-\frac{\pi}{4}i}, \qquad -i = e^{-\frac{\pi}{2}i}$$
we can rewrite this as:
$$Y_t = 100 + 20\left(\sqrt{2}\right)^t \cos\left(\frac{\pi}{4} t - \frac{\pi}{2}\right)$$
which is plotted below:

[Plot: $Y_t$ oscillating with explosively growing amplitude as $t$ increases.]
4.5.6 A Macroeconomic Example
Suppose as before that $Y_t$ is GNP, $C_t$ is consumption, $I_t$ is investment and $G_t$ is government expenditure, so that:
$$Y_t = C_t + I_t + G_t.$$
Now, however, let investment be endogenous according to the accelerator theory of investment, so that
$$I_t = I + \gamma\left(Y_{t-1} - Y_{t-2}\right), \qquad \gamma > 0.$$
We keep government expenditure exogenous, so that:
$$G_t = G.$$
As before, consumption depends on income lagged one period via the consumption function
$$C_t = C + c Y_{t-1}, \qquad 0 < c < 1$$
where $c$ is the marginal propensity to consume. It now follows that:
$$Y_t = C_t + I_t + G_t = C + c Y_{t-1} + I + \gamma\left(Y_{t-1} - Y_{t-2}\right) + G$$
or
$$Y_t = \underbrace{(C + I + G)}_{\alpha} + \underbrace{(c + \gamma)}_{\phi_1} Y_{t-1} \underbrace{- \gamma}_{\phi_2}\, Y_{t-2}$$
which is a second-order difference equation. The equilibrium value of GNP is thus:
$$Y^* = \frac{\alpha}{1 - \phi_1 - \phi_2} = \frac{C+I+G}{1 - (c+\gamma) - (-\gamma)} = \frac{C+I+G}{1-c}$$
so that the equilibrium value is unchanged. This is because in equilibrium investment is the same, since if $Y_{t-1} = Y_{t-2} = Y^*$:
$$I_t = I + \gamma\left(Y_{t-1} - Y_{t-2}\right) = I + \gamma\left(Y^* - Y^*\right) = I$$
so that $\frac{\partial Y^*}{\partial G} = \frac{1}{1-c}$, as before.
Stability requires that the roots
$$r_1, r_2 = \frac{(c+\gamma) \pm \sqrt{(c+\gamma)^2 - 4\gamma}}{2}$$
both be less than one in absolute value.

An interesting question is whether an economic model could have complex roots. Thus consider the case where:
$$C_t = 100 + 0.5 Y_{t-1}$$
$$I_t = 100 + 0.8\left(Y_{t-1} - Y_{t-2}\right)$$
$$G_t = 100$$
and $Y_0 = Y_1 = 150$. In this case we have the difference equation:
$$Y_t = 300 + 1.3 Y_{t-1} - 0.8 Y_{t-2}$$
with characteristic polynomial:
$$r^2 - 1.3 r + 0.8 = 0$$
which has complex roots:
$$r_1 = 0.65 + 0.61441 i, \qquad r_2 = 0.65 - 0.61441 i.$$
Since $|0.65 + 0.61441 i| = 0.89443 < 1$, this is stable. Alternatively, since the roots are complex and $|\phi_2| = 0.8 < 1$, the difference equation is stable.
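Because the roots are complex with modulus below one, simulated GNP should cycle toward the equilibrium $Y^* = \frac{100+100+100}{1-0.5} = 600$. A minimal simulation of this multiplier-accelerator economy:

```python
# Simulate Y_t = (C + I + G) + (c + gamma)*Y_{t-1} - gamma*Y_{t-2} with the
# example values c = 0.5, gamma = 0.8: a damped cycle toward Y* = 600.

def simulate_economy(c, gamma, C, I, G, y0, y1, periods):
    ys = [y0, y1]
    for _ in range(periods):
        ys.append((C + I + G) + (c + gamma) * ys[-1] - gamma * ys[-2])
    return ys

path = simulate_economy(0.5, 0.8, 100.0, 100.0, 100.0, 150.0, 150.0, 100)
y_star = (100.0 + 100.0 + 100.0) / (1.0 - 0.5)   # 600.0
print(y_star, path[-1])  # after 100 periods the cycle has died out near 600
```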
4.6 First-Order Differential Equations

Difference equations are used when we think of time as being discrete. Economic data is inevitably in discrete time, whether it be weekly, monthly or annual. However, we tend to experience time as being continuous, and it is often easier to work with continuous time. In this case dynamics involves not difference equations but differential equations.

4.6.1 Definitions
Definition 142 A first-order linear differential equation consists of an equation of motion of the form:
$$y'(t) + a_1 y(t) = b$$
with a given starting value $y(0)$.
The problem is to solve for $y(t)$. Often the dependence of $y(t)$ on $t$ is omitted, so we write $y$ instead of $y(t)$ and $y'$ instead of $y'(t)$. Thus a first-order linear differential equation is sometimes written instead as:
$$y' + a_1 y = b.$$
4.6.2 The Case where $b = 0$

A differential equation with $b = 0$ is referred to as a homogeneous differential equation. Homogeneous differential equations are simpler to work with. Therefore consider:
$$y'(t) + a_1 y(t) = 0$$
or:
$$y'(t) = -a_1 y(t).$$
Dividing both sides by $y(t)$ we have:
$$\frac{y'(t)}{y(t)} = -a_1.$$
But
$$\frac{d \ln\left(y(t)\right)}{dt} = \frac{y'(t)}{y(t)}$$
so this is equivalent to:
$$\frac{d \ln\left(y(t)\right)}{dt} = -a_1$$
which implies that:
$$\ln\left(y(t)\right) = -a_1 t + a_0$$
for some constant $a_0$. It then follows that:
$$y(t) = A e^{-a_1 t}$$
where $A = e^{a_0}$. Setting $t = 0$ we can solve for $A$ as:
$$y(0) = A$$
so that:

Theorem 143 The solution to $y'(t) + a_1 y(t) = 0$ with $y(0)$ given is:
$$y(t) = y(0)\, e^{-a_1 t}.$$
A necessary and sufficient condition for this differential equation to be stable is then
$$a_1 > 0$$
in which case
$$\lim_{t \to \infty} y(t) = 0$$
independent of $y(0)$.

For example, given $y'(t) + 4y(t) = 0$ with $y(0) = 10$, we have the solution:
$$y(t) = 10 e^{-4t}$$
which is stable since $y(t) \to 0$ as $t \to \infty$, as can be seen from the plot below:

[Plot: $y(t) = 10 e^{-4t}$ decaying from $10$ toward $0$ as $t$ runs from $0$ to $5$.]

Verify this on your own by calculating $y(t)$ for, say, $t = 1, 4, 10$, etc. On the other hand, given $y'(t) - 3y(t) = 0$ with $y(0) = 5$, the solution is:
$$y(t) = 5 e^{3t}$$
which is unstable, as can be seen from the plot below:

[Plot: $y(t) = 5 e^{3t}$ growing from $5$ to about $22$ as $t$ runs from $0$ to $0.5$.]

Verify this on your own by calculating $y(t)$ for, say, $t = 1, 4, 10$, etc.
4.6.3 The Case where $b \neq 0$

Let us now return to the case where the constant term $b$ is not zero. We can solve this model by transforming the original model into a form where $b = 0$. We are given:
$$y'(t) + a_1 y(t) = b.$$
If $y(t)$ ever reaches the equilibrium value $y^*$, then it stays there forever. Thus if $y(t) = y^*$ then $y'(t) = 0$, so that:
$$y^* = \frac{b}{a_1}.$$
Let $\tilde{y}(t) = y(t) - y^*$ be the deviation of $y(t)$ from $y^*$. We then have:
$$\tilde{y}'(t) = y'(t)$$
so that replacing $y(t)$ with $y(t) = \tilde{y}(t) + y^*$ and $y'(t)$ with $\tilde{y}'(t)$, it follows that:
$$\tilde{y}'(t) + a_1\left(\tilde{y}(t) + y^*\right) = b$$
or:
$$\tilde{y}'(t) + a_1 \tilde{y}(t) = 0.$$
Thus $\tilde{y}(t)$ follows an identical homogeneous differential equation (but with $b = 0$). From our previous analysis of this special case we conclude that:
$$\tilde{y}(t) = \tilde{y}(0)\, e^{-a_1 t}$$
or, setting $\tilde{y}(t) = y(t) - y^*$ and $\tilde{y}(0) = y(0) - y^*$:

Theorem 144 The solution to $y'(t) + a_1 y(t) = b$ with $y(0)$ given is:
$$y(t) = y^* + \left(y(0) - y^*\right) e^{-a_1 t}.$$
We therefore have:
Theorem 145 A necessary and sufficient condition for $y'(t) + a_1 y(t) = b$ to be stable is that:
$$a_1 > 0$$
in which case
$$\lim_{t \to \infty} y(t) = y^*$$
independent of $y(0)$.
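The closed form of Theorem 144 can be checked against a crude numerical integration of the differential equation. This sketch uses a simple Euler scheme (an assumption of this illustration, not part of the text) on the stable example $y'(t) + 4y(t) = 8$ with $y(0) = 10$:

```python
# Compare the closed-form solution y(t) = y* + (y(0) - y*) e^{-a1 t}
# with Euler integration of y'(t) = b - a1*y(t).

import math

def ode_closed_form(a1: float, b: float, y0: float, t: float) -> float:
    y_star = b / a1
    return y_star + (y0 - y_star) * math.exp(-a1 * t)

def euler(a1: float, b: float, y0: float, t: float, steps: int = 100000) -> float:
    dt = t / steps
    y = y0
    for _ in range(steps):
        y += (b - a1 * y) * dt   # y' = b - a1*y
    return y

exact = ode_closed_form(4.0, 8.0, 10.0, 1.0)
approx = euler(4.0, 8.0, 10.0, 1.0)
print(exact, approx)  # both near 2 + 8*exp(-4)
```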
4.6.4 Example 1:
Consider $y'(t) + 4y(t) = 8$ with $y(0) = 10$. Then
$$y^* = \frac{8}{4} = 2$$
with solution:
$$y(t) = 2 + 8 e^{-4t}$$
which is stable, as can be seen from the plot below:

[Plot: $y(t) = 2 + 8 e^{-4t}$ decaying from $10$ toward $2$ as $t$ runs from $0$ to $5$.]
Verify this on your own by calculating $y(t)$ as $t$ takes on larger values.
4.6.5 Example 2
Consider $y'(t) - 3y(t) = 9$ with $y(0) = 6$. Then
$$y^* = \frac{9}{-3} = -3$$
with solution:
$$y(t) = -3 + 9 e^{3t}$$
which is unstable, as can be seen from the plot below:

[Plot: $y(t) = -3 + 9 e^{3t}$ growing from $6$ to about $37$ as $t$ runs from $0$ to $0.5$.]
Verify this on your own by calculating $y(t)$ as $t$ takes on larger values.
4.7 Second-Order Differential Equations

Definition 146 A second-order linear differential equation is given as:
$$y''(t) + a_1 y'(t) + a_2 y(t) = b$$
with $y'(0)$ and $y(0)$ given.

If $y(t) = y^*$, the equilibrium value, then $y''(t) = y'(t) = 0$, so that:

Theorem 147 The equilibrium value $y^*$ is given by:
$$y^* = \frac{b}{a_2}.$$
4.7.1 Solution
To solve this differential equation we will first attempt to eliminate $b$. To this end define
$$\tilde{y}(t) = y(t) - y^*$$
so that
$$\tilde{y}'(t) = y'(t), \qquad \tilde{y}''(t) = y''(t).$$
Replacing $y(t)$ by $\tilde{y}(t) + y^*$, $y'(t)$ with $\tilde{y}'(t)$ and $y''(t)$ with $\tilde{y}''(t)$, we have:
$$\tilde{y}''(t) + a_1 \tilde{y}'(t) + a_2 \tilde{y}(t) = 0.$$
We saw for the first-order differential equation that the solution took the form:
$$\tilde{y}(t) = A e^{rt}.$$
Suppose we assume this to be true and ask what value $r$ will take. To answer this, set $\tilde{y}(t) = A e^{rt}$, so that $\tilde{y}'(t) = r A e^{rt}$ and $\tilde{y}''(t) = r^2 A e^{rt}$. It follows that:
$$\underbrace{r^2 A e^{rt}}_{=\tilde{y}''(t)} + a_1 \underbrace{r A e^{rt}}_{=\tilde{y}'(t)} + a_2 \underbrace{A e^{rt}}_{=\tilde{y}(t)} = 0$$
or:
$$r^2 + a_1 r + a_2 = 0$$
which implies that $r$ is a root of the characteristic polynomial defined by:

Definition 148 The characteristic polynomial of the differential equation:
$$y''(t) + a_1 y'(t) + a_2 y(t) = b$$
is:
$$r^2 + a_1 r + a_2.$$

We therefore have from
$$r^2 + a_1 r + a_2 = 0$$
that the two roots $r_1$ and $r_2$ are given by:
$$r_1 = \frac{-a_1 + \sqrt{a_1^2 - 4 a_2}}{2}, \qquad r_2 = \frac{-a_1 - \sqrt{a_1^2 - 4 a_2}}{2}.$$
Since there are two roots, it follows that:
$$\tilde{y}(t) = A_1 e^{r_1 t} + A_2 e^{r_2 t}$$
or, using the fact that $\tilde{y}(t) = y(t) - y^*$:

Theorem 149 The solution to $y''(t) + a_1 y'(t) + a_2 y(t) = b$ with $y'(0)$ and $y(0)$ given takes the form:
$$y(t) = y^* + A_1 e^{r_1 t} + A_2 e^{r_2 t}.$$
To calculate $A_1$ and $A_2$, substitute $t = 0$ into the above solution and into
$$y'(t) = r_1 A_1 e^{r_1 t} + r_2 A_2 e^{r_2 t}$$
to obtain the two linear equations:
$$y(0) = y^* + A_1 + A_2$$
$$y'(0) = r_1 A_1 + r_2 A_2.$$
It therefore follows that:
$$\begin{bmatrix} 1 & 1 \\ r_1 & r_2 \end{bmatrix}\begin{bmatrix} A_1 \\ A_2 \end{bmatrix} = \begin{bmatrix} y(0) - y^* \\ y'(0) \end{bmatrix}$$
which can be solved to obtain $A_1$ and $A_2$.
4.7.2 Stability with Real Roots
If
$$a_1^2 - 4 a_2 \geq 0$$
then the term under the square root sign that defines $r_1$ and $r_2$ in
$$\frac{-a_1 \pm \sqrt{a_1^2 - 4 a_2}}{2}$$
will be non-negative, so that both $r_1$ and $r_2$ will be real. It then follows that $y(t)$ is stable if and only if both roots are negative, that is:
$$r_1 < 0, \qquad r_2 < 0.$$
4.7.3 Stability with Complex Roots
If
$$a_1^2 - 4 a_2 < 0$$
then the term under the square root sign that defines $r_1$ and $r_2$ in
$$\frac{-a_1 \pm \sqrt{a_1^2 - 4 a_2}}{2}$$
will be negative, so that $r_1$ and $r_2$ will be complex numbers, or more precisely complex conjugates, with:
$$r_1 = c + di, \qquad r_2 = c - di$$
where:
$$c = \frac{-a_1}{2}, \qquad d = \frac{\sqrt{4 a_2 - a_1^2}}{2}.$$
Since:
$$e^{r_1 t} = e^{(c+di)t} = e^{ct}\left(\cos(dt) + i \sin(dt)\right)$$
$$e^{r_2 t} = e^{(c-di)t} = e^{ct}\left(\cos(dt) - i \sin(dt)\right)$$
it follows that
$$\lim_{t \to \infty} e^{r_1 t} = \lim_{t \to \infty} e^{r_2 t} = 0$$
if and only if $c < 0$. We thus have:

Theorem 150 The differential equation
$$y''(t) + a_1 y'(t) + a_2 y(t) = b$$
is stable if and only if the real parts of the two roots $r_1$ and $r_2$ of the characteristic polynomial are negative.

Alternatively, since $c = -\frac{a_1}{2} < 0$, stability in the case of complex roots is equivalent to the condition:
$$a_1 > 0.$$
4.7.4 Example 1: Stable with Real Roots
Consider:
$$y''(t) + 4 y'(t) + 2 y(t) = 4, \qquad y(0) = 0, \quad y'(0) = 1.$$
Then:
$$y^* = \frac{4}{2} = 2$$
and the quadratic that defines $r_1$ and $r_2$ is:
$$r^2 + 4r + 2 = 0$$
with solutions:
$$r_1 = -2 + \sqrt{2} = -0.5857864 < 0, \qquad r_2 = -2 - \sqrt{2} = -3.414214 < 0.$$
Since $r_1 < 0$ and $r_2 < 0$, we conclude that $y(t)$ is stable. To solve for $A_1$ and $A_2$, note that from
$$y(t) = 2 + A_1 e^{r_1 t} + A_2 e^{r_2 t}$$
it follows that:
$$\begin{bmatrix} 1 & 1 \\ -2+\sqrt{2} & -2-\sqrt{2} \end{bmatrix}\begin{bmatrix} A_1 \\ A_2 \end{bmatrix} = \begin{bmatrix} -2 \\ 1 \end{bmatrix}$$
so:
$$\begin{bmatrix} A_1 \\ A_2 \end{bmatrix} = \begin{bmatrix} 1 & 1 \\ -2+\sqrt{2} & -2-\sqrt{2} \end{bmatrix}^{-1}\begin{bmatrix} -2 \\ 1 \end{bmatrix} = \begin{bmatrix} -\frac{3}{4}\sqrt{2} - 1 \\ \frac{3}{4}\sqrt{2} - 1 \end{bmatrix}.$$
It follows then that the general solution is:
$$y(t) = 2 + \left(-\frac{3}{4}\sqrt{2} - 1\right) e^{\left(-2+\sqrt{2}\right)t} + \left(\frac{3}{4}\sqrt{2} - 1\right) e^{-\left(2+\sqrt{2}\right)t}$$
which is plotted below:

[Plot: $y(t)$ rising from $0$ toward $2$ as $t$ runs from $0$ to $10$.]

Note how $y(t)$ approaches $y^* = 2$, reflecting the fact that this differential equation is stable.
4.7.5 Example 2: Unstable with Real Roots
Consider:
$$y''(t) - 6 y'(t) - 2 y(t) = -6, \qquad y(0) = 0, \quad y'(0) = 1$$
so that:
$$y^* = \frac{-6}{-2} = 3$$
and the quadratic that defines $r_1$ and $r_2$ is:
$$r^2 - 6r - 2 = 0$$
with solutions:
$$r_1 = 3 + \sqrt{11} = 6.32 > 0, \qquad r_2 = 3 - \sqrt{11} = -0.32 < 0.$$
Since $r_1 > 0$, it follows (even though $r_2 < 0$) that the differential equation is unstable. The exact solution is:
$$y(t) = 3 + \left(-\frac{3}{2} + \frac{5}{11}\sqrt{11}\right) e^{\left(3+\sqrt{11}\right)t} - \frac{1}{22}\left(10 + 3\sqrt{11}\right)\sqrt{11}\, e^{\left(3-\sqrt{11}\right)t}$$
which is plotted below:

[Plot: $y(t)$ growing explosively from $0$ as $t$ runs from $0$ to $1$.]
4.7.6 Example 3: Stable with Complex Roots
Consider:
$$y''(t) + 2 y'(t) + 3 y(t) = 6, \qquad y(0) = 0, \quad y'(0) = 1.$$
Then:
$$y^* = \frac{6}{3} = 2$$
and the quadratic that defines $r_1$ and $r_2$ is:
$$r^2 + 2r + 3 = 0$$
so that:
$$r_1 = -1 + i\sqrt{2}, \qquad r_2 = -1 - i\sqrt{2}.$$
Since the real part of $r_1$ and $r_2$, here $-1$, is negative, it follows that the differential equation is stable. The solution can be shown to be:
$$y(t) = 2 - \left(2\cos\left(\sqrt{2}\, t\right) + \frac{1}{\sqrt{2}}\sin\left(\sqrt{2}\, t\right)\right) e^{-t}$$
which is plotted below:

[Plot: $y(t)$ rising from $0$ and settling at $2$ as $t$ runs from $0$ to $5$.]

Note that it is the factor $e^{-t}$ which ensures stability, since $e^{-t} \to 0$ as $t \to \infty$.
4.7.7 Example 4: Unstable with Complex Roots
Consider:
$$y''(t) - 2 y'(t) + 3 y(t) = 6, \qquad y(0) = 0, \quad y'(0) = 1.$$
Then:
$$y^* = \frac{6}{3} = 2$$
and the quadratic that defines $r_1$ and $r_2$ is:
$$r^2 - 2r + 3 = 0$$
so that:
$$r_1 = 1 + i\sqrt{2}, \qquad r_2 = 1 - i\sqrt{2}.$$
Since the real part of $r_1$ and $r_2$ equals $1$ and hence is positive, the differential equation is unstable. It can be shown that the solution is:
$$y(t) = 2 + \left(\frac{3}{\sqrt{2}}\sin\left(\sqrt{2}\, t\right) - 2\cos\left(\sqrt{2}\, t\right)\right) e^{t}$$
which is plotted below:

[Plot: $y(t)$ oscillating with exploding amplitude as $t$ runs from $0$ to $7$.]

Note that it is the factor $e^{t}$ which causes the instability, since $e^{t} \to \infty$ as $t \to \infty$.
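The Example 4 solution can be verified numerically by differentiating it with finite differences and checking that the residual of $y'' - 2y' + 3y - 6$ is close to zero. This is a sketch, with the step size $h$ an assumption of the illustration:

```python
# Finite-difference check that y(t) = 2 + ((3/sqrt(2)) sin(sqrt(2) t)
# - 2 cos(sqrt(2) t)) e^t satisfies y'' - 2y' + 3y = 6.

import math

def y(t: float) -> float:
    s2 = math.sqrt(2.0)
    return 2.0 + ((3.0 / s2) * math.sin(s2 * t) - 2.0 * math.cos(s2 * t)) * math.exp(t)

def residual(t: float, h: float = 1e-4) -> float:
    d1 = (y(t + h) - y(t - h)) / (2 * h)               # central first derivative
    d2 = (y(t + h) - 2 * y(t) + y(t - h)) / h ** 2     # central second derivative
    return d2 - 2 * d1 + 3 * y(t) - 6.0

for t in [0.0, 0.5, 1.0, 2.0]:
    print(residual(t))  # close to zero at every point checked
```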