
Numerical Integration and Differentiation

Quantitative Macroeconomics [Econ 5725]

Raul Santaeulalia-Llopis

Washington University in St. Louis

Spring 2016

Raul Santaeulalia-Llopis (Wash.U.) Numerical Integration and Differentiation Spring 2016 1 / 27

1 Numerical Differentiation

• One-Sided and Two-Sided Differentiation

• Computational Issues on Very Small Numbers

2 Numerical Integration

• Newton-Cotes Methods

• Gaussian Quadrature

• Monte Carlo Integration

• Quasi-Monte Carlo Integration


Numerical Differentiation

The definition of the derivative at x∗ is

$$ f'(x^*) = \lim_{h \to 0} \frac{f(x^* + h) - f(x^*)}{h} $$


• Hence, a natural way to numerically obtain the derivative is to use:

$$ f'(x^*) \approx \frac{f(x^* + h) - f(x^*)}{h} \tag{1} $$

with a small h. We call (1) the one-sided derivative.

• Another way to numerically obtain the derivative is to use:

$$ f'(x^*) \approx \frac{f(x^* + h) - f(x^* - h)}{2h} \tag{2} $$

with a small h. We call (2) the two-sided derivative.

We can show that the two-sided numerical derivative has a smaller error than the one-sided numerical derivative. We can see this in 3 steps.
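As a quick illustration of formulas (1) and (2), here is a minimal Python sketch. The function names, the test function exp(x), the point x∗ = 1, and the step h = 1e-4 are choices made for this example only and do not come from the slides.

```python
import math

def one_sided(f, x, h=1e-4):
    """One-sided difference, formula (1)."""
    return (f(x + h) - f(x)) / h

def two_sided(f, x, h=1e-4):
    """Two-sided difference, formula (2)."""
    return (f(x + h) - f(x - h)) / (2.0 * h)

x_star = 1.0
exact = math.exp(x_star)   # the derivative of exp(x) is exp(x)

err_one = abs(one_sided(math.exp, x_star) - exact)
err_two = abs(two_sided(math.exp, x_star) - exact)
print(err_one, err_two)    # the two-sided error should be much smaller
```

For this smooth function the two-sided error is several orders of magnitude below the one-sided error at the same h.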


• Step 1, use a Taylor expansion of order 3 around $x^*$ to obtain

$$ f(x) = f(x^*) + f'(x^*)(x - x^*) + \frac{1}{2} f''(x^*)(x - x^*)^2 + \frac{1}{6} f'''(x^*)(x - x^*)^3 + O_3(x) \tag{3} $$

• Step 2, evaluate the expansion (3) at $x = x^* + h$

$$ f(x^* + h) = f(x^*) + f'(x^*) h + \frac{1}{2} f''(x^*) h^2 + \frac{1}{6} f'''(x^*) h^3 + O_3(x^* + h) \tag{4} $$

and rearrange to obtain the one-sided derivative formula:

$$ \frac{f(x^* + h) - f(x^*)}{h} = f'(x^*) + \frac{1}{2} f''(x^*) h + \frac{1}{6} f'''(x^*) h^2 + \frac{O_3(x^* + h)}{h} \tag{5} $$

• Step 3, evaluate the expansion (3) at $x = x^* - h$

$$ f(x^* - h) = f(x^*) + f'(x^*)(-h) + \frac{1}{2} f''(x^*)(-h)^2 + \frac{1}{6} f'''(x^*)(-h)^3 + O_3(x^* - h) \tag{6} $$

and combine (4) and (6) to obtain the two-sided derivative formula:

$$ \frac{f(x^* + h) - f(x^* - h)}{2h} = f'(x^*) + \frac{1}{6} f'''(x^*) h^2 + \frac{O_3(x^* + h) - O_3(x^* - h)}{2h} \tag{7} $$

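Comparing (5) and (7) shows the error associated with the two-sided formula is smaller: the leading error term in (5), $\frac{1}{2} f''(x^*) h$, is of order $h$, whereas that term cancels in (7), whose leading error term, $\frac{1}{6} f'''(x^*) h^2$, is of order $h^2$.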
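The different error orders can be checked numerically. The sketch below is only an illustration under assumed choices (f = exp, x∗ = 1, h = 0.01, and hypothetical helper names): halving h should roughly halve the one-sided error and cut the two-sided error by about a factor of four.

```python
import math

def forward_error(h, x=1.0):
    """Absolute error of the one-sided formula for f = exp."""
    return abs((math.exp(x + h) - math.exp(x)) / h - math.exp(x))

def central_error(h, x=1.0):
    """Absolute error of the two-sided formula for f = exp."""
    return abs((math.exp(x + h) - math.exp(x - h)) / (2.0 * h) - math.exp(x))

h = 1e-2
ratio_forward = forward_error(h) / forward_error(h / 2)   # near 2: error of order h
ratio_central = central_error(h) / central_error(h / 2)   # near 4: error of order h^2
print(ratio_forward, ratio_central)
```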


Computational Issues on Very Small Numbers

• Exact arithmetic and computer arithmetic do not always give the same answers. This is not an issue of programming skills but a matter of computer precision.

• For example, compute

y = (1.0e-20 + 1.0) - 1.0

and

y = 1.0e-20 + (1.0 - 1.0)

where 1.0e-20 is the computer's shorthand for $10^{-20}$.

Exact arithmetic says the two statements above are identical because addition and subtraction are associative. A computer, however, would evaluate the statements differently. The first statement would, incorrectly, likely result in y = 0, whereas the second one would, correctly, result in $y = 10^{-20}$.

• Usually, if one is using double precision in Fortran, numbers lose precision once relative differences reach about $10^{-16}$, because double precision carries roughly 16 significant decimal digits.
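The same experiment can be run in Python, whose floats are double precision on common platforms (an assumption about the machine, not something stated in the slides). This is a small sketch, not Fortran code from the course.

```python
import sys

y1 = (1.0e-20 + 1.0) - 1.0   # 1.0e-20 is lost when it is added to 1.0 first
y2 = 1.0e-20 + (1.0 - 1.0)   # the subtraction happens first, so nothing is lost

print(y1)                      # 0.0
print(y2)                      # 1e-20
print(sys.float_info.epsilon)  # machine epsilon, about 2.2e-16 for doubles
```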


For this reason, practice suggests an alternative for the two-sided formula:

$$ f'(x^*) \approx \frac{f(x^* + h) - f(x^* - h)}{(x^* + h) - (x^* - h)} \tag{8} $$

with a small h. Note that the computed denominator might not be precisely 2h. That is why this formula helps to avoid trouble associated with a small h.
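One way to code formula (8) is to store the two evaluation points first and divide by their realized distance. The following is a hedged sketch; the function name, the default step, and the test function are assumptions for illustration.

```python
import math

def two_sided_realized_step(f, x, h=1e-6):
    """Two-sided difference that divides by the realized step, formula (8)."""
    x_plus = x + h            # the stored value may differ slightly from x + h
    x_minus = x - h
    dx = x_plus - x_minus     # realized distance between the evaluation points
    return (f(x_plus) - f(x_minus)) / dx

approx = two_sided_realized_step(math.exp, 1.0)
print(approx, math.e)
```

The numerator and the denominator are then computed from the same pair of stored points, which is the purpose of (8).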


Numerical Integration

• Goal: Compute the definite integral of a real-valued function f w.r.t. a weight function w over an interval I of $\mathbb{R}^n$,

$$ \int_I f(x)\, w(x)\, dx $$

• The weight function can be, for example

• w(x) ≡ 1 → the integral is the area under f

• w(x) ≡ p.d.f. of a r.v. x with support I → the integral is E[f(x)].


• We study methods that approximate a definite integral with a weighted sum of function values,

$$ \int_I f(x)\, w(x)\, dx \approx \sum_{i=0}^{n} w_i f(x_i) $$


• We will see three classes of numerical integration (numerical quadrature) methods that differ on how the quadrature weights $w_i$ and the quadrature nodes $x_i$ are chosen.

• Newton-Cotes methods approximate the integrand f between nodes using low-order polynomials and sum the integrals.

• Gaussian Quadrature methods choose the nodes and weights that satisfy some moment-matching conditions.

• Monte Carlo (and quasi-Monte Carlo) methods use equally weighted random or equidistributed nodes.


Newton-Cotes Methods

• Univariate quadrature methods are designed to approximate the integral of a real-valued function f defined on a bounded interval [a, b] of the real line.

• Two Newton-Cotes methods are widely used,

• Trapezoid Rule

• Simpson’s Rule

• Both rules are easy to implement and are typically adequate for computing the area under a continuous function.


Trapezoid Rule

• First, partition the interval [a, b] into subintervals (say, though not necessarily, of equal length)

• Define the nodes $x_i = a + (i - 1)h$ for $i = 1, ..., n$ with $h = \frac{b - a}{n - 1}$.

• Second, approximate f over each subinterval $[x_i, x_{i+1}]$ using a piecewise linear spline passing through $(x_i, f(x_i))$ and $(x_{i+1}, f(x_{i+1}))$


• Third, the area under each line segment defines a trapezoid that approximates the area under f over the subinterval,

$$ \int_{x_i}^{x_{i+1}} f(x)\, dx \approx \frac{h}{2} \left[ f(x_i) + f(x_{i+1}) \right] $$

Summing up the areas of the trapezoids across the subintervals yields the Trapezoid rule:

$$ \int_a^b f(x)\, dx \approx \sum_{i=1}^{n} w_i f(x_i) $$

with $w_1 = w_n = \frac{h}{2}$ and $w_i = h$ otherwise.
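The nodes and weights of the Trapezoid rule translate directly into code. Below is a minimal Python sketch with equally spaced nodes; the function names and the two test integrands are illustrative assumptions.

```python
def trapezoid_nodes_weights(a, b, n):
    """Return n equally spaced nodes on [a, b] and the trapezoid weights (n >= 2)."""
    h = (b - a) / (n - 1)
    nodes = [a + i * h for i in range(n)]   # x_1, ..., x_n, stored 0-based
    weights = [h] * n
    weights[0] = weights[-1] = h / 2.0      # the two endpoints get weight h/2
    return nodes, weights

def trapezoid(f, a, b, n):
    """Weighted sum of function values at the trapezoid nodes."""
    nodes, weights = trapezoid_nodes_weights(a, b, n)
    return sum(w * f(x) for x, w in zip(nodes, weights))

line_area = trapezoid(lambda x: 2.0 * x + 1.0, 0.0, 1.0, 5)   # exact value is 2
square_area = trapezoid(lambda x: x * x, 0.0, 1.0, 101)       # exact value is 1/3
print(line_area, square_area)
```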


• Remarks:

• It is simple and robust.

• First-order exact: if not for rounding error, it will exactly compute the integral of any first-order polynomial (a line)

• If the integrand is smooth, the trapezoid rule yields an approximation error that shrinks quadratically with the width of the subintervals, $O(h^2)$.


Simpson’s Rule

• First, partition the interval [a, b] into an even number of subintervals (say, of equal length)

• Define the nodes $x_i = a + (i - 1)h$ for $i = 1, ..., n$ with $h = \frac{b - a}{n - 1}$ and n odd.

• Second, approximate f over the jth pair of subintervals $[x_{j-1}, x_j]$ and $[x_j, x_{j+1}]$ using a piecewise quadratic function that passes through $(x_{j-1}, f(x_{j-1}))$, $(x_j, f(x_j))$ and $(x_{j+1}, f(x_{j+1}))$


• Third, the area under this quadratic function approximates the area under f over the pair of subintervals,

$$ \int_{x_{j-1}}^{x_{j+1}} f(x)\, dx \approx \frac{h}{3} \left[ f(x_{j-1}) + 4 f(x_j) + f(x_{j+1}) \right] $$

Summing up the areas of the quadratic approximants across the pairs of subintervals yields Simpson's Rule:

$$ \int_a^b f(x)\, dx \approx \sum_{i=1}^{n} w_i f(x_i) $$

with $w_1 = w_n = \frac{h}{3}$ and otherwise $w_i = \frac{4h}{3}$ if i is even, and $w_i = \frac{2h}{3}$ if i is odd.
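A direct transcription of these weights into Python might look as follows. This is a sketch under the slide's assumptions (equally spaced nodes, n odd); the function name and test integrands are chosen for illustration.

```python
import math

def simpson(f, a, b, n):
    """Composite Simpson's rule with n equally spaced nodes (n odd, n >= 3)."""
    if n < 3 or n % 2 == 0:
        raise ValueError("n must be odd and at least 3")
    h = (b - a) / (n - 1)
    total = 0.0
    for i in range(1, n + 1):          # 1-based index, as in the text
        x_i = a + (i - 1) * h
        if i == 1 or i == n:
            w_i = h / 3.0              # endpoints
        elif i % 2 == 0:
            w_i = 4.0 * h / 3.0        # even i
        else:
            w_i = 2.0 * h / 3.0        # odd interior i
        total += w_i * f(x_i)
    return total

cubic_area = simpson(lambda x: x ** 3, 0.0, 2.0, 3)   # exact value is 4
sine_area = simpson(math.sin, 0.0, math.pi, 101)      # exact value is 2
print(cubic_area, sine_area)
```

With only three nodes the cubic is already integrated exactly, up to rounding.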


• Remarks:

• Easy to implement, like the Trapezoid rule.

• Even though it is based on locally quadratic approximations of the integrand, it is third-order exact: if not for rounding error, it will exactly compute the integral of any cubic polynomial.

• If the integrand is smooth, Simpson's rule yields an approximation error that shrinks with the fourth power of the subinterval width, $O(h^4)$, compared with $O(h^2)$ for the trapezoid rule.

• Simpson's rule is preferred to the Trapezoid rule when f is smooth because it offers twice the degree of approximation.

• If f exhibits discontinuities, the trapezoid rule will often be more accurate.

• Newton-Cotes rules based on 4th and higher order piecewise polynomial approximations exist, but are rarely used.
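The two convergence rates can be seen by halving h and comparing errors. The sketch below is illustrative only: the integrand exp(x) on [0, 1], the node counts 11 and 21, and the compact function definitions are assumptions made for this example.

```python
import math

def trapezoid(f, a, b, n):
    h = (b - a) / (n - 1)
    interior = sum(f(a + i * h) for i in range(1, n - 1))
    return h * (0.5 * f(a) + interior + 0.5 * f(b))

def simpson(f, a, b, n):
    """n must be odd; offsets below are 0-based, so odd offsets are even i in the text."""
    h = (b - a) / (n - 1)
    four = sum(f(a + i * h) for i in range(1, n - 1, 2))   # weight 4h/3
    two = sum(f(a + i * h) for i in range(2, n - 1, 2))    # weight 2h/3
    return h / 3.0 * (f(a) + 4.0 * four + 2.0 * two + f(b))

exact = math.e - 1.0   # integral of exp(x) over [0, 1]

# Going from 11 to 21 nodes halves h.
trap_ratio = abs(trapezoid(math.exp, 0.0, 1.0, 11) - exact) / abs(trapezoid(math.exp, 0.0, 1.0, 21) - exact)
simp_ratio = abs(simpson(math.exp, 0.0, 1.0, 11) - exact) / abs(simpson(math.exp, 0.0, 1.0, 21) - exact)
print(trap_ratio, simp_ratio)   # roughly 4 and roughly 16
```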


• Higher dimensional integration: generalizations of the univariate Newton-Cotes quadrature schemes through tensor product principles.

• Suppose one wishes to integrate a real-valued function defined on a rectangle $\{(x_1, x_2) \mid a_1 \le x_1 \le b_1,\ a_2 \le x_2 \le b_2\}$ in $\mathbb{R}^2$.

• One way to proceed is to compute the Newton-Cotes nodes and weights:

• $\{(x_{1i}, w_{1i}) \mid i = 1, ..., n_1\}$ for the real interval $(a_1, b_1)$, and

• $\{(x_{2j}, w_{2j}) \mid j = 1, ..., n_2\}$ for the real interval $(a_2, b_2)$

• The tensor product Newton-Cotes rule for the rectangle would comprise the $n = n_1 n_2$ grid points of the form

• $\{(x_{1i}, x_{2j}) \mid i = 1, ..., n_1,\ j = 1, ..., n_2\}$ with associated weights,

• $\{w_{ij} = w_{1i} w_{2j} \mid i = 1, ..., n_1,\ j = 1, ..., n_2\}$

• This construction principle can be applied to higher dimensions using repeated tensor product operations.
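The tensor product construction can be sketched with the trapezoid rule in each dimension. The helper names, the integrand f(x1, x2) = x1·x2, and the rectangle [0, 1] × [0, 2] below are assumptions chosen for illustration.

```python
def trapezoid_rule(a, b, n):
    """Univariate trapezoid nodes and weights on [a, b]."""
    h = (b - a) / (n - 1)
    nodes = [a + i * h for i in range(n)]
    weights = [h / 2.0 if i in (0, n - 1) else h for i in range(n)]
    return nodes, weights

def tensor_trapezoid(f, a1, b1, n1, a2, b2, n2):
    """Tensor product rule on the rectangle [a1, b1] x [a2, b2]."""
    x1, w1 = trapezoid_rule(a1, b1, n1)
    x2, w2 = trapezoid_rule(a2, b2, n2)
    total = 0.0
    for x1i, w1i in zip(x1, w1):
        for x2j, w2j in zip(x2, w2):
            total += w1i * w2j * f(x1i, x2j)   # weight w_ij = w_1i * w_2j
    return total

volume = tensor_trapezoid(lambda x, y: x * y, 0.0, 1.0, 11, 0.0, 2.0, 21)
print(volume)   # the exact integral is 1
```

The number of grid points is n1·n2, so the cost grows quickly as dimensions are added.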


Gaussian Quadrature

• Gaussian quadrature rules are constructed w.r.t. specific weight functions w.

• For a weight function w defined on an interval $I \subset \mathbb{R}$ of the real line and for a given order of approximation n, the quadrature nodes $x_1, ..., x_n$ and quadrature weights $w_1, ..., w_n$ are chosen so as to satisfy 2n "moment-matching" conditions:

$$ \int_I x^k\, w(x)\, dx = \sum_{i=1}^{n} w_i x_i^k, \quad \text{for } k = 0, ..., 2n - 1 \tag{9} $$

• The integral approximation is computed by the weighted sum of function values at the prescribed nodes:

$$ \int_I w(x)\, f(x)\, dx \approx \sum_{i=1}^{n} w_i f(x_i) \tag{10} $$
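As a concrete check of (9) and (10), the sketch below uses the standard three-point Gauss-Legendre rule on [-1, 1] (weight function w(x) ≡ 1), whose nodes and weights are hard-coded here rather than computed. The choice n = 3 and the integrand cos(x) are assumptions made for this example.

```python
import math

# Three-point Gauss-Legendre rule on [-1, 1] for w(x) = 1.
nodes = [-math.sqrt(3.0 / 5.0), 0.0, math.sqrt(3.0 / 5.0)]
weights = [5.0 / 9.0, 8.0 / 9.0, 5.0 / 9.0]

def exact_moment(k):
    """Integral of x**k over [-1, 1]."""
    return 2.0 / (k + 1) if k % 2 == 0 else 0.0

# Condition (9) for k = 0, ..., 2n - 1 with n = 3.
moment_gaps = [
    abs(sum(w * x ** k for x, w in zip(nodes, weights)) - exact_moment(k))
    for k in range(6)
]

# Formula (10) applied to f(x) = cos(x); the exact integral is 2 sin(1).
approx = sum(w * math.cos(x) for x, w in zip(nodes, weights))
error = abs(approx - 2.0 * math.sin(1.0))
print(max(moment_gaps), error)
```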


• Gaussian quadrature over a bounded interval w.r.t. the constant weight function, w(x) ≡ 1, is called Gauss-Legendre quadrature. This is appropriate for computing the area under a curve because of its consistency with Riemann-integrable functions. If f is Riemann integrable, then the approximation afforded by Gauss-Legendre quadrature can be made arbitrarily precise by increasing the number of nodes.

Gauss-Legendre quadrature should be applied with caution if the function has discontinuous derivatives, as in $f(x) = \frac{x + |x|}{2}$. If the function f possesses known kink points, it is often possible to break the integral into the sum of 2 or more integrals of smooth functions. When it is not possible to produce smooth integrands this way, then Newton-Cotes quadrature methods are more efficient.
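The effect of a kink, and of breaking the integral at it, can be illustrated with the function f(x) = (x + |x|)/2 on [-1, 1]. The sketch assumes a three-point Gauss-Legendre rule with hard-coded nodes and weights; the interval and the helper name are choices made for this example.

```python
import math

NODES = [-math.sqrt(3.0 / 5.0), 0.0, math.sqrt(3.0 / 5.0)]
WEIGHTS = [5.0 / 9.0, 8.0 / 9.0, 5.0 / 9.0]

def gauss_legendre_3(f, a, b):
    """Three-point Gauss-Legendre rule rescaled from [-1, 1] to [a, b]."""
    half = 0.5 * (b - a)
    mid = 0.5 * (a + b)
    return half * sum(w * f(mid + half * x) for x, w in zip(NODES, WEIGHTS))

def f(x):
    return (x + abs(x)) / 2.0   # equals max(x, 0), with a kink at x = 0

exact = 0.5
whole = gauss_legendre_3(f, -1.0, 1.0)                                   # ignores the kink
split = gauss_legendre_3(f, -1.0, 0.0) + gauss_legendre_3(f, 0.0, 1.0)   # breaks at the kink
print(abs(whole - exact), abs(split - exact))
```

Each piece is linear once the interval is split at x = 0, so the split version is exact up to rounding, while the unsplit version is visibly off.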


• When the weight function w is the probability density function of some continuous random variable X, Gaussian quadrature basically "discretizes" the continuous random variable X by replacing it with a discrete random variable with mass points $x_i$ and probabilities $w_i$ that approximate X in the sense that both random variables have the same moments of order less than 2n:

$$ \sum_{i=1}^{n} w_i x_i^k = E X^k \quad \text{for } k = 0, ..., 2n - 1 \tag{11} $$

Given the mass points and probabilities of the discrete approximant, the expectation of any function of the continuous random variable may be approximated using the expectation of the function of the discrete approximant:

$$ E f(X) = \int_I f(x)\, w(x)\, dx \approx \sum_{i=1}^{n} w_i f(x_i) \tag{12} $$
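For a standard normal X, a three-point discretization with mass points 0 and ±√3 and probabilities 2/3 and 1/6 matches the moments in (11) for n = 3. The sketch below hard-codes that rule as an assumption of the example and applies (12) to f(x) = exp(x), for which E exp(X) = exp(1/2).

```python
import math

# Three mass points and probabilities matching the moments of N(0, 1) up to order 5.
mass_points = [-math.sqrt(3.0), 0.0, math.sqrt(3.0)]
probabilities = [1.0 / 6.0, 2.0 / 3.0, 1.0 / 6.0]

def discrete_expectation(f):
    """Expectation of f under the discrete approximant, as in (12)."""
    return sum(p * f(x) for x, p in zip(mass_points, probabilities))

# Moments of the standard normal for k = 0, ..., 5.
normal_moments = [1.0, 0.0, 1.0, 0.0, 3.0, 0.0]
moment_gaps = [
    abs(discrete_expectation(lambda x, k=k: x ** k) - m)
    for k, m in enumerate(normal_moments)
]

approx = discrete_expectation(math.exp)
exact = math.exp(0.5)
print(max(moment_gaps), approx, exact)
```

With only three points the approximation of E exp(X) is within about one percent; more nodes would tighten it.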


Monte Carlo Integration

• Motivated by the Strong Law of Large Numbers: if $x_1, x_2, ...$ are independent realizations of a random variable X and f is a continuous function, then

$$ \lim_{n \to \infty} \frac{1}{n} \sum_{i=1}^{n} f(x_i) = E f(X) \tag{13} $$

with probability one.

• The Monte Carlo integration scheme: to compute an approximation of the expectation $E f(X)$, one draws a random sample $x_1, x_2, ..., x_n$ from the distribution of X and sets

$$ E f(X) \approx \frac{1}{n} \sum_{i=1}^{n} f(x_i) $$
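A minimal Monte Carlo sketch in Python follows. The uniform distribution, the integrand x², the sample size, and the seed are all assumptions made for this illustration.

```python
import random

def monte_carlo_expectation(f, draw, n):
    """Average f over n independent draws produced by draw()."""
    return sum(f(draw()) for _ in range(n)) / n

rng = random.Random(12345)   # fixed seed so the run is reproducible
estimate = monte_carlo_expectation(lambda x: x * x, rng.random, 100_000)
print(estimate)              # E[X^2] = 1/3 for X uniform on [0, 1)
```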


• Issues with the pseudorandomness:

• Most packages produce pseudorandom variables that are uniformly distributed on the interval [0, 1].

A uniform random number generator is very useful for generating random samples from other distributions: suppose X has a cumulative distribution function

$$ F(x) = \Pr(X \le x) \tag{14} $$

whose inverse has a well-defined closed form.

If U is uniformly distributed on (0, 1), then $F^{-1}(U)$ has the same distribution as X.

Thus, to generate a random sample $x_1, x_2, ..., x_n$ from the X distribution, one generates a random sample $u_1, u_2, ..., u_n$ from the uniform distribution and sets $x_i = F^{-1}(u_i)$.

• Most packages also provide intrinsic routines that generate pseudorandom standard normal variables.
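The inverse-CDF recipe can be sketched for an exponential distribution, whose CDF F(x) = 1 − exp(−λx) has the closed-form inverse −ln(1 − u)/λ. The distribution, the rate λ = 2, and the seed are assumptions chosen for illustration.

```python
import math
import random

def exponential_inverse_cdf(u, rate):
    """Inverse of F(x) = 1 - exp(-rate * x), valid for 0 <= u < 1."""
    return -math.log(1.0 - u) / rate

rng = random.Random(2016)
rate = 2.0
sample = [exponential_inverse_cdf(rng.random(), rate) for _ in range(100_000)]
sample_mean = sum(sample) / len(sample)
print(sample_mean)   # should be close to 1 / rate = 0.5
```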


• Problem: it is almost impossible to generate a truly random sample of variates of any distribution. The employed subroutines generate purely deterministic, not random, sequences of numbers that depend on the seed (initialization point). Good subroutines generate sequences that appear to be random, in that they pass certain statistical tests for randomness. This is why we call them "pseudorandom" numbers. [1]

[1] In general, when simulating a model we deal with it by doing a large enough number of simulations and computing statistics that average all the simulations; this applies to both estimated and calibrated economies.
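The dependence on the seed is easy to see in practice. The short sketch below uses Python's standard generator; the seeds 42 and 43 and the helper name are arbitrary choices for this example.

```python
import random

def draws(seed, n=5):
    """First n uniform draws from a generator started at the given seed."""
    rng = random.Random(seed)
    return [rng.random() for _ in range(n)]

same_seed = draws(42) == draws(42)        # same seed, identical sequence
different_seed = draws(42) == draws(43)   # different seeds, different sequences
print(same_seed, different_seed)
```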


Some other Pros and Cons

• Monte Carlo integration is preferred over Gaussian quadrature because of its simplicity if the routine for computing Gaussian mass points and probabilities is not efficient (or if the integration is over many dimensions).

• Monte Carlo integration is subject to sampling error that cannot be bounded with certainty.

• The approximation can be made more accurate by increasing the size of the random sample, but doing so can be expensive if evaluating f or generating the pseudorandom variate is costly.

• Approximations generated by Monte Carlo integration will vary from one integration to the next unless initiated at the same point. This implies that making use of Monte Carlo integration in conjunction with other iterative schemes (such as dynamic programming) is very problematic.

• Quasi-Monte Carlo methods can circumvent some of these problems.


Quasi-Monte Carlo Integration

• Quasi-Monte Carlo methods started from insights from probability theory.

• These methods rely on sequences $\{x_i\}$ with the property that

$$ \lim_{n \to \infty} \frac{b - a}{n} \sum_{i=1}^{n} f(x_i) = \int_a^b f(x)\, dx \tag{15} $$

without regard to whether the sequence passes standard tests of randomness.


• Any sequence that satisfies this condition for arbitrary (Riemann) integrable functions can be used to approximate an integral on [a, b].

• Although the Law of Large Numbers assures us that this statement is true when the $x_i$ are i.i.d., other sequences also satisfy this property. Actually, sequences that are explicitly nonrandom but instead attempt to fill in space in a regular manner can often provide more accurate approximations to definite integrals: there are numerous schemes for generating equidistributed sequences, including the Niederreiter, Weyl, and Haber sequences.
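As one illustration, a one-dimensional Weyl-type sequence can be built from the fractional parts of i·α for an irrational α. The sketch below assumes α = √2 and the integrand x² on [0, 3]; these choices, and the function names, are for illustration only and are not taken from the slides.

```python
import math

def weyl_sequence(n, alpha=math.sqrt(2.0)):
    """First n terms of frac(i * alpha), i = 1, ..., n, on [0, 1)."""
    return [(i * alpha) % 1.0 for i in range(1, n + 1)]

def quasi_monte_carlo(f, a, b, n):
    """Approximate the integral of f on [a, b] with a finite-n version of (15)."""
    points = [a + (b - a) * u for u in weyl_sequence(n)]
    return (b - a) / n * sum(f(x) for x in points)

approx = quasi_monte_carlo(lambda x: x * x, 0.0, 3.0, 10_000)
print(approx)   # the exact integral is 9
```

The sequence is fully deterministic, so repeated calls return the same approximation, which avoids the run-to-run variation of Monte Carlo integration.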
