
Concepts in Calculus II

Beta Version

UNIVERSITY PRESS OF FLORIDA

Florida A&M University, Tallahassee
Florida Atlantic University, Boca Raton
Florida Gulf Coast University, Ft. Myers
Florida International University, Miami
Florida State University, Tallahassee
New College of Florida, Sarasota
University of Central Florida, Orlando
University of Florida, Gainesville
University of North Florida, Jacksonville
University of South Florida, Tampa
University of West Florida, Pensacola

Orange Grove Texts Plus

Concepts in Calculus II

Beta Version

Miklos Bona and Sergei Shabanov

University of Florida Department of Mathematics

University Press of Florida

Gainesville • Tallahassee • Tampa • Boca Raton

Pensacola • Orlando • Miami • Jacksonville • Ft. Myers • Sarasota

Copyright 2012 by the University of Florida Board of Trustees on behalf of the University of Florida Department of Mathematics

This work is licensed under a modified Creative Commons Attribution-Noncommercial-No Derivative Works 3.0 Unported License. To view a copy of this license, visit http://creativecommons.org/licenses/by-nc-nd/3.0/. You are free to electronically copy, distribute, and transmit this work if you attribute authorship. However, all printing rights are reserved by the University Press of Florida (http://www.upf.com). Please contact UPF for information about how to obtain copies of the work for print distribution. You must attribute the work in the manner specified by the author or licensor (but not in any way that suggests that they endorse you or your use of the work). For any reuse or distribution, you must make clear to others the license terms of this work. Any of the above conditions can be waived if you get permission from the University Press of Florida. Nothing in this license impairs or restricts the author’s moral rights.

ISBN 978-1-61610-156-5

Orange Grove Texts Plus is an imprint of the University Press of Florida, which is the scholarly publishing agency for the State University System of Florida, comprising Florida A&M University, Florida Atlantic University, Florida Gulf Coast University, Florida International University, Florida State University, New College of Florida, University of Central Florida, University of Florida, University of North Florida, University of South Florida, and University of West Florida.

University Press of Florida

15 Northwest 15th Street

Gainesville, FL 32611-2079

http://www.upf.com

Contents

Chapter 6. Applications of Integration
    36. The Area Between Curves
    37. Volumes
    38. Cylindrical Shells
    39. Work and Hydrostatic Force
    40. Average Value of a Function

Chapter 7. Methods of Integration
    41. Integration by Parts
    42. Trigonometric Integrals
    43. Trigonometric Substitution
    44. Integrating Rational Functions
    45. Strategy of Integration
    46. Integration Using Tables and Software Packages
    47. Approximate Integration
    48. Improper Integrals

Chapter 8. Sequences and Series
    49. Infinite Sequences
    50. Special Sequences
    51. Series
    52. Series of Nonnegative Terms
    53. Comparison Tests
    54. Alternating Series
    55. Ratio and Root Tests
    56. Rearrangements
    57. Power Series
    58. Representation of Functions as Power Series
    59. Taylor Series

Chapter 9. Further Applications of Integration
    60. Arc Length
    61. Surface Area
    62. Applications to Physics and Engineering
    63. Applications to Economics and the Life Sciences
    64. Probability

Chapter 10. Planar Curves
    65. Parametric Curves
    66. Calculus with Parametric Curves
    67. Polar Coordinates
    68. Parametric Curves: The Arc Length and Surface Area
    69. Areas and Arc Lengths in Polar Coordinates
    70. Conic Sections

Chapter and section numbering continues from the previous volume in the series, Concepts in Calculus I.

CHAPTER 6

Applications of Integration

36. The Area Between Curves

36.1. The Basic Problem. In the previous chapter, we learned that if the function $f$ satisfies $f(x) \ge 0$ for all real numbers $x$ in the interval $[a, b]$, then the area of the domain whose borders are the graph of $f$, the horizontal axis, and the vertical lines $x = a$ and $x = b$ is equal to $\int_a^b f(x)\,dx$. If there is no danger of confusion as to what $a$ and $b$ are, then this fact is sometimes informally expressed by the sentence “the integral of $f$ is equal to the area of the domain that is under the graph of $f$.”

What can we say about the area of the domain between two curves? There are several ways to ask this question. The easiest version, discussed in the following theorem, differs from the previous situation only in that the horizontal axis is replaced by the graph of another function $g$.

Theorem 6.1. Let $f$ and $g$ be two functions such that, for all real numbers $x \in [a, b]$, the inequality $f(x) \ge g(x)$ holds. Then the domain whose borders are the graph of $f$, the graph of $g$, and the vertical lines $x = a$ and $x = b$ has area

\[ A = \int_a^b (f(x) - g(x))\,dx. \]

See Figure 6.1 for an illustration of the content of Theorem 6.1.

The reader is invited to explain why this theorem is a direct consequence of the fact that we recalled in the first paragraph of this section. The reader is also invited to explain why the theorem holds even if $f$ and $g$ take negative values.

Example 6.1. Compute the area $A(D)$ of the domain $D$ whose borders are the graph of the function $f(x) = x^3 + 1$, the graph of the function $g(x) = x^2 + 2$, and the vertical lines $x = 2$ and $x = 3$. See Figure 6.2 for an illustration of this specific example.

Figure 6.1. Area enclosed by $f(x)$ and $g(x)$ between $x = a$ and $x = b$.

Figure 6.2. Area enclosed by the graphs of $f(x) = x^3 + 1$ and $g(x) = x^2 + 2$ between $x = 2$ and $x = 3$.

Solution: In order to see that Theorem 6.1 is applicable, we must first show that, for all $x \in [2, 3]$, the inequality $f(x) \ge g(x)$ holds. This is not difficult, since we only need to show that if $x \in [2, 3]$, then $f(x) \ge g(x)$, that is,

\[ x^3 + 1 \ge x^2 + 2, \]
\[ x^3 - x^2 \ge 1, \]
\[ x^2(x - 1) \ge 1, \]

and this is clearly true since $x \ge 2$, so $x^2 \ge 4$, and $x - 1 \ge 1$, forcing $x^2(x - 1) \ge 4$.

Therefore, Theorem 6.1 applies, and we have

\[ A(D) = \int_2^3 \bigl((x^3 + 1) - (x^2 + 2)\bigr)\,dx = \int_2^3 (x^3 - x^2 - 1)\,dx = \left[\frac{x^4}{4} - \frac{x^3}{3} - x\right]_2^3 = \frac{107}{12} = 8\tfrac{11}{12}. \]

□
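The computation in Example 6.1 can be double-checked with exact rational arithmetic. The following is a minimal sketch, not part of the original text; the helper name `antiderivative` and the choice of sample points are illustrative assumptions. It spot-checks the hypothesis $f \ge g$ of Theorem 6.1 at a few points of $[2, 3]$ and then evaluates the antiderivative of $f - g$ at the endpoints.

```python
from fractions import Fraction

def f(x):
    return x**3 + 1

def g(x):
    return x**2 + 2

def antiderivative(x):
    """Antiderivative of f(x) - g(x) = x^3 - x^2 - 1."""
    return x**4 / 4 - x**3 / 3 - x

# Spot-check f >= g at eleven equally spaced sample points of [2, 3].
# This is only a sanity check, not a proof of the inequality.
samples = [Fraction(2) + Fraction(k, 10) for k in range(11)]
assert all(f(x) >= g(x) for x in samples)

# Evaluate the definite integral exactly via the antiderivative.
area = antiderivative(Fraction(3)) - antiderivative(Fraction(2))
print(area)  # 107/12
```

Using `Fraction` instead of floating-point numbers keeps the result exact, so it can be compared directly with the fraction obtained by hand.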

36.2. Intersecting Curves. Sometimes, the points $a$ and $b$ are determined by the curves themselves, and not given in advance. In that case, we have to compute them before we can apply Theorem 6.1.

Example 6.2. Find the area $A(D)$ of the domain $D$ whose borders are the graphs of the functions $f(x) = x^2 + 3x + 5$ and $g(x) = 2x^2 + 7x + 8$. See Figure 6.3 for an illustration.

Figure 6.3. Area enclosed by the graphs of $f(x)$ and $g(x)$ between $x = -3$ and $x = -1$.

Solution: Let us find the points in which the graphs of $f$ and $g$ intersect. In these points, we have

\[ x^2 + 3x + 5 = 2x^2 + 7x + 8, \]
\[ 0 = x^2 + 4x + 3, \]
\[ 0 = (x + 3)(x + 1). \]

That is, the two curves intersect in two points, and these points have horizontal coordinates $a = -3$ and $b = -1$. Furthermore, if $x \in [-3, -1]$, that is, if $x$ is between those two intersection points, then

\[ f(x) - g(x) = (x^2 + 3x + 5) - (2x^2 + 7x + 8) = -(x^2 + 4x + 3) = -(x + 3)(x + 1) \ge 0, \]

since $x + 3 \ge 0$ and $x + 1 \le 0$. Therefore, if $x \in [-3, -1]$, then $f(x) \ge g(x)$, and Theorem 6.1 applies. So we have

\[ A(D) = \int_{-3}^{-1} (f(x) - g(x))\,dx = \int_{-3}^{-1} -(x^2 + 4x + 3)\,dx = \left[-\frac{x^3}{3} - 2x^2 - 3x\right]_{-3}^{-1} = \frac{4}{3} = 1\tfrac{1}{3}. \]

□
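The two steps of Example 6.2, finding the intersection points and then integrating between them, can be sketched as follows. This is an illustrative check rather than part of the original text; the variable names (`qa`, `qb`, `qc`, `left`, `right`) are assumptions, and the use of `math.isqrt` relies on the discriminant being a perfect square in this particular example.

```python
import math
from fractions import Fraction

# Coefficients of g(x) - f(x) = x^2 + 4x + 3; its roots are the
# horizontal coordinates of the intersection points.
qa, qb, qc = 1, 4, 3
disc = qb * qb - 4 * qa * qc          # 16 - 12 = 4
root = math.isqrt(disc)               # exact, since 4 is a perfect square
left = Fraction(-qb - root, 2 * qa)   # smaller root
right = Fraction(-qb + root, 2 * qa)  # larger root

def antiderivative(x):
    """Antiderivative of f(x) - g(x) = -x^2 - 4x - 3."""
    return -(x**3) / 3 - 2 * x**2 - 3 * x

area = antiderivative(right) - antiderivative(left)
print(left, right, area)  # -3 -1 4/3
```

For a quadratic whose discriminant is not a perfect square, the roots would be irrational and a floating-point square root would be needed instead.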

The situation becomes slightly more complicated if $f \ge g$ does not hold throughout the entire interval $[a, b]$. For instance, it could happen that $f(x) \ge g(x)$ at the beginning of the interval $[a, b]$, and then, from a given point on, $g(x) \ge f(x)$. In that case, we split $[a, b]$ up into smaller intervals so that on each of these smaller intervals, either $f(x) \ge g(x)$ or $g(x) \ge f(x)$ holds. Then we can apply Theorem 6.1 to each of these intervals. As on some of these intervals $f(x) \ge g(x)$ holds, while on some others $g(x) \ge f(x)$ holds, the application of Theorem 6.1 will sometimes involve the computation of $\int (f(x) - g(x))\,dx$ and sometimes $\int (g(x) - f(x))\,dx$. The following theorem formalizes this idea.

Theorem 6.2. Let $f$ and $g$ be two functions. Then the area of the domain whose borders are the graph of $f$, the graph of $g$, the vertical line $x = a$, and the vertical line $x = b$ is equal to

\[ \int_a^b |f(x) - g(x)|\,dx. \]

Note that Theorem 6.1 is a special case of Theorem 6.2, namely, the special case when $f(x) - g(x) = |f(x) - g(x)|$ for all $x \in [a, b]$.

Example 6.3. Let $f(x) = x^3 + 3x^2 + 2x$ and let $g(x) = x^3 + x^2$. Compute the area $A(D)$ of the domain whose borders are the graphs of $f$ and $g$ and the vertical lines $x = -2$ and $x = 1$. See Figure 6.4 for an illustration.

Figure 6.4. Graphs of $f(x)$ and $g(x)$ on $[-2, 1]$.

Solution: In order to use Theorem 6.2, we need to compute $|f(x) - g(x)|$. We have

\[ f(x) - g(x) = (x^3 + 3x^2 + 2x) - (x^3 + x^2) = 2x^2 + 2x = 2x(x + 1), \]

and the two curves intersect exactly where $2x(x + 1) = 0$. That is, there are only two points where these two curves intersect, namely, at $x = -1$ and $x = 0$. If $x \le -1$ or if $x \ge 0$, then $f(x) - g(x) = 2x(x + 1) \ge 0$, so $|f(x) - g(x)| = f(x) - g(x) = 2x^2 + 2x$. If $-1 < x < 0$, then $f(x) - g(x) < 0$, so $|f(x) - g(x)| = g(x) - f(x) = -2x^2 - 2x$. Figure 6.5 shows the behavior of the function $|f(x) - g(x)|$.

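We can now directly apply Theorem 6.2. We get

\[
\begin{aligned}
A(D) &= \int_{-2}^{1} |f(x) - g(x)|\,dx \\
&= \int_{-2}^{-1} (f(x) - g(x))\,dx + \int_{-1}^{0} (g(x) - f(x))\,dx + \int_{0}^{1} (f(x) - g(x))\,dx \\
&= \int_{-2}^{-1} (2x^2 + 2x)\,dx + \int_{-1}^{0} (-2x^2 - 2x)\,dx + \int_{0}^{1} (2x^2 + 2x)\,dx \\
&= \left[\frac{2x^3}{3} + x^2\right]_{-2}^{-1} + \left[-\frac{2x^3}{3} - x^2\right]_{-1}^{0} + \left[\frac{2x^3}{3} + x^2\right]_{0}^{1} \\
&= \frac{5}{3} + \frac{1}{3} + \frac{5}{3} = \frac{11}{3}.
\end{aligned}
\]

Figure 6.5. Graph of $|f(x) - g(x)|$ on $[-2, 1]$.

□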
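Theorem 6.2 also lends itself to a direct numerical check that does not require splitting the interval by hand. The sketch below is an illustration under stated assumptions, not the authors' method: it approximates $\int_{-2}^{1} |f(x) - g(x)|\,dx$ with a midpoint Riemann sum, and the function name `area_between` and the number of subintervals are arbitrary choices.

```python
def f(x):
    return x**3 + 3 * x**2 + 2 * x

def g(x):
    return x**3 + x**2

def area_between(f, g, a, b, n=100_000):
    """Midpoint-rule approximation of the integral of |f - g| over [a, b]."""
    dx = (b - a) / n
    total = 0.0
    for i in range(n):
        x = a + (i + 0.5) * dx       # midpoint of the i-th subinterval
        total += abs(f(x) - g(x)) * dx
    return total

approx = area_between(f, g, -2.0, 1.0)
print(approx)  # a floating-point value close to 11/3
```

Summing the three exact pieces of the hand computation, $5/3 + 1/3 + 5/3$, gives $11/3$, which the numerical value should match to several decimal places.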

36.3. Curves Failing the Vertical Line Test. Sometimes we want to compute the area between two curves that do not pass the vertical line test; that is, they contain two or more points on the same vertical line. Such curves are not graphs of functions of the variable $x$. If they pass the horizontal line test, that is, if they do not contain two points on the same horizontal line, then they can be viewed as graphs of functions of $y$. We can then change the roles of $x$ and $y$ in Theorems 6.1 and 6.2 and proceed as in the earlier examples of this section.

Example 6.4. Compute the area $A(D)$ of the domain between the vertical line $x = 4$ and the curve given by the equation $y^2 = x$. See Figure 6.6 for an illustration.

Figure 6.6. Graph of $y^2 = x$ and $x = 4$.

Solution: Neither curve satisfies the vertical line test, but both satisfy the horizontal line test. Therefore, we set $f(y) = 4$ and $g(y) = y^2$. It is clear that the two curves intersect at the points given by $y = -2$ and $y = 2$. Between these two points, the value of $f(y)$ is larger. Therefore, Theorem 6.1 applies (with the roles of $x$ and $y$ reversed). So we have

\[
\begin{aligned}
A(D) &= \int_{-2}^{2} (f(y) - g(y))\,dy \\
&= \int_{-2}^{2} (4 - y^2)\,dy \\
&= \left[4y - \frac{y^3}{3}\right]_{-2}^{2} \\
&= 16 - \frac{16}{3} \\
&= 10\tfrac{2}{3}.
\end{aligned}
\]

□
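Reversing the roles of $x$ and $y$ amounts to slicing the domain horizontally instead of vertically. The following sketch illustrates that idea numerically for Example 6.4; the function name `area_by_horizontal_slices` and its parameters are hypothetical, and the midpoint rule is simply one convenient choice of Riemann sum.

```python
def area_by_horizontal_slices(right, left, c, d, n=10_000):
    """Approximate the area between x = left(y) and x = right(y) for c <= y <= d.

    Each horizontal slice at height y has length right(y) - left(y)
    and thickness dy.
    """
    dy = (d - c) / n
    total = 0.0
    for i in range(n):
        y = c + (i + 0.5) * dy       # midpoint of the i-th slice
        total += (right(y) - left(y)) * dy
    return total

# Right boundary: the vertical line x = 4.  Left boundary: the parabola x = y^2.
approx = area_by_horizontal_slices(lambda y: 4.0, lambda y: y * y, -2.0, 2.0)
print(approx)  # a floating-point value close to 32/3
```

The exact value from the antiderivative is $16 - 16/3 = 32/3 = 10\tfrac{2}{3}$.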

Note that the geometric meaning of reversing the roles of $x$ and $y$ is simply reflecting all curves across the line $y = x$. That reflection does not change the area of any domain, so one can expect analogous methods of computing areas before and after that reflection.

36.4. Exercises.

(1) Find the area of the domain whose borders are the vertical line $x = 0$, the vertical line $x = 2$, and the graphs of the functions $f(x) = x^2 + 3$ and $g(x) = \sin x$.

(2) Find the area of the domain whose borders are the vertical line $x = 1$, the vertical line $x = 3$, and the graphs of the functions $f(x) = x^3$ and $g(x) = e^{-x}$.

(3) Find the area of the domain between the graphs of the functions $f(x) = x^2 + 2$ and $g(x) = 4x - 1$.

(4) Find the area of the domain between the graphs of the functions $f(x) = x^3 - 3x^2$ and $g(x) = x^2$.

(5) Find the area between the curves given by the equations $x = 5y$ and $x = y^2 + 6$.

(6) Compute the area between the three curves $f(x) = x$, $g(x) = -x$, and $h(x) = 4$.

37. Volumes

37.1. Extending the Definition of Volumes. If a solid $S$ can be built up using unit cubes, then we can simply say that the volume $V(S)$ of $S$ is the number of unit cubes used to build $S$. However, if the borders of $S$ are not planes, then this method will have to be modified. A ball or a cone is an example of this.

So we would like to define the notion of volume so that it is applicable to a large class of solids, not just to those solids that are bordered by planes. This definition should agree with our intuition. It should also be in accordance with the fact that we can approximate all solids with a collection of very small cubes; therefore, $V(S)$ must be close to the total volume of the cubes used in the approximation.

With these goals in mind, we recall that we already defined the area of a domain in the plane whose borders are the graphs of continuous functions. Building on that definition, we say that the volume of a prism is its base area times its height. More formally, let $S$ be a solid whose base and cover are identical copies of the plate $P$, located on two parallel planes that are at distance $h$ from each other. Then we define the volume of $S$ to be

\[ V(S) = A(P)\,h, \]

where $A(P)$ is the area of $P$. See Figure 6.7 for an illustration. In particular, the volume of a cylinder whose base is a circle of radius $r$ and whose height is $h$ is $r^2 \pi h$.

Now let $S$ be any solid located between the planes given by the equations $x = a$ and $x = b$. In order to define and compute the volume $V(S)$ of $S$, we cut $S$ into $n$ parts by the equally spaced planes $x = x_i$ for $i = 0, 1, \dots, n$, where

\[ a = x_0 < x_1 < \cdots < x_n = b. \]

Figure 6.7. The volume of $S$ is defined by $V(S) = A(P)\,h$.

Let $n$ be large. Then the part $S_i$ of $S$ that is between the planes $x = x_{i-1}$ and $x = x_i$ is well approximated by a prism $P_i$ described as follows. The height of $P_i$ is $\Delta x = (b - a)/n$, the base plate and cover plate of $P_i$ are on the planes $x = x_{i-1}$ and $x = x_i$, respectively, and the base and cover plates of $P_i$ are congruent to the intersection $T_i$ of $S$ and the plane $x = x_i^*$ for some point $x_i^* \in [x_{i-1}, x_i]$. We can assume that $x_i^*$ is the midpoint of $[x_{i-1}, x_i]$, but that will turn out to be insignificant. Note that $P_i$ has volume $A(T_i)\,\Delta x$.

If $n$ is large, then the union of the prisms $P_i$ approximates $S$ well, so $V(S)$ should be defined in a way that assures that $V(S)$ is close to

\[ \sum_{i=1}^{n} V(P_i) = \sum_{i=1}^{n} A(T_i)\,\Delta x. \tag{6.1} \]

As $n$ goes to infinity, the Riemann sum on the right-hand side of (6.1) has a limit. We define that limit to be the volume $V(S)$ of $S$, so

\[ V(S) = \lim_{n \to \infty} \sum_{i=1}^{n} A(T_i)\,\Delta x. \]

On the other hand, by the definition of the definite integral, we have

\[ \lim_{n \to \infty} \sum_{i=1}^{n} A(T_i)\,\Delta x = \int_a^b A(t)\,dt, \]

where $A(t)$ is the area of the intersection of $S$ and the plane $x = t$. Therefore, this integral is equal to the volume $V(S)$ of the solid $S$. That proves the following theorem, which will be our main tool in this section.

Theorem 6.3. Let $S$ be a solid located between the planes $x = a$ and $x = b$, and let $A(t)$ be the area of the intersection of $S$ and the plane $x = t$. Then the volume $V(S)$ of $S$ satisfies the equation

\[ V(S) = \int_a^b A(t)\,dt. \]

Example 6.5. Compute the volume of the ball $B$ whose center is at the origin and whose radius is $r$.

Solution: For any given $t \in [-r, r]$, the intersection of the plane $x = t$ and the ball $B$ is a circle. Let $C_t$ denote this circle. By the Pythagorean theorem, $C_t$ has radius $\sqrt{r^2 - t^2}$, and therefore, the area of $C_t$ is $A(t) = (r^2 - t^2)\pi$. See Figure 6.8 for an illustration.

Figure 6.8. The volume of the ball $B$ approximated by cylinders.

Now we can use Theorem 6.3 to compute the volume of $B$. We get

\[
\begin{aligned}
V(B) &= \int_{-r}^{r} \pi (r^2 - t^2)\,dt \\
&= \pi \cdot \left[r^2 t - \frac{1}{3} t^3\right]_{-r}^{r} \\
&= 2 r^3 \pi - \frac{2}{3} r^3 \pi \\
&= \frac{4}{3} r^3 \pi.
\end{aligned}
\]

□
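The slicing argument behind Theorem 6.3 can be imitated numerically: cut the ball into thin disks, add up their volumes, and compare with $\frac{4}{3} r^3 \pi$. The sketch below is illustrative only; the function name `ball_volume_by_slices`, the test radius, and the number of slices are assumptions chosen for the demonstration.

```python
import math

def ball_volume_by_slices(r, n=100_000):
    """Approximate the volume of a ball of radius r by n thin disks.

    The disk at position t has radius sqrt(r^2 - t^2), so its area is
    A(t) = pi * (r^2 - t^2); its thickness is dt.
    """
    dt = 2 * r / n
    total = 0.0
    for i in range(n):
        t = -r + (i + 0.5) * dt      # midpoint of the i-th slice
        total += math.pi * (r * r - t * t) * dt
    return total

r = 2.0
approx = ball_volume_by_slices(r)
exact = 4 / 3 * math.pi * r**3
print(approx, exact)  # the two values should agree to many decimal places
```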

There is nothing magical about the $x$ axis as far as Theorem 6.3 is concerned. The argument that yielded that theorem can be repeated for the $y$ axis instead of the $x$ axis, yielding the following theorem.

Theorem 6.4. Let $S$ be a solid located between the planes $y = a$ and $y = b$, and let $B(t)$ be the area of the intersection of $S$ and the plane $y = t$. Then the volume $V(S)$ of $S$ satisfies the equation

\[ V(S) = \int_a^b B(t)\,dt. \]

Example 6.6. Let $S$ be the right circular cone whose symmetry axis is the $y$ axis, whose apex is at $y = h$, and whose base is a circle in the plane $y = 0$ with its center at the origin and with radius $r$. Find the volume of $S$.

Solution: The cone $S$ is between the planes $y = 0$ and $y = h$, and $B(t)$ of Theorem 6.4 is easier to compute than $A(t)$ of Theorem 6.3, so we use the former.

The intersection of the plane $y = t$ and $S$ is a circle. The radius $r_t$ of this circle, by similar triangles, satisfies

\[ \frac{r_t}{r} = \frac{h - t}{h}, \]

showing that $r_t = r(h - t)/h$. Therefore, $B(t) = r^2 (h - t)^2 \pi / h^2$, and Theorem 6.4 implies

\[
\begin{aligned}
V(S) &= \int_0^h \frac{r^2 (h - t)^2 \pi}{h^2}\,dt \\
&= \frac{r^2 \pi}{h^2} \int_0^h (h^2 - 2ht + t^2)\,dt \\
&= \frac{r^2 \pi}{h^2} \left[h^2 t - h t^2 + \frac{t^3}{3}\right]_0^h \\
&= \frac{r^2 \pi}{h^2} \cdot \frac{h^3}{3} \\
&= \frac{1}{3} h r^2 \pi.
\end{aligned}
\]

See Figure 6.9 for an illustration. □

37.2. Annular Rings. In the examples that we have solved so far, the computation of $A(t)$ or $B(t)$, that is, the computation of the area of the intersection between a solid and a horizontal or vertical plane, was not difficult. That computation could be done directly.

There are situations in which the domains whose areas we need to compute are not convex; that is, visually speaking, there is a hole in them. This happens particularly often when $S$ is obtained by rotating a domain $D$ around a line.

Figure 6.9. Right circular cone.

Example 6.7. Let $D$ be the domain between the two curves $y = x^2 = f(x)$ and $y = 2x = g(x)$ and let $S$ be the solid obtained by rotating $D$ about the line $x = -3$. Find the volume of $S$.

Solution: The two curves intersect at the points $(0, 0)$ and $(2, 4)$. The intersection of the horizontal plane $y = t$ with $S$ has the form of an annular ring, which is sometimes informally called a washer. This is simply a smaller circle cut out of the middle of a larger circle, so that the two circles are concentric. If the larger circle has radius $r_1$ and the smaller circle has radius $r_2$, then the annular ring has area $\pi (r_1^2 - r_2^2)$.

This general recipe enables us to compute $B(t)$ in the example at hand. The points in $D$ satisfy $x \in [0, 2]$ and $y \in [0, 4]$. As $0 \le x \le 2$, the inequality $x^2 \le 2x$ holds. So, for $t \in [0, 4]$, the point $P_i = (t/2, t)$ on the graph of $g(x) = y = 2x$ is closer to the $y$ axis than the point $P_o = (\sqrt{t}, t)$ on the graph of $y = x^2 = f(x)$. (It takes a larger value of $x$ to get the same value $t = y$ by $f$ than it takes to get the same value by $g$.) So the outer circle of the annulus will be given by the rotated image of the curve of $f$ (the parabola), and the inner circle of the annulus will be given by the rotated image of the curve of $g$ (the straight line).

In particular, for fixed $t$, the radii are obtained as the distance of $P_o$ (resp. $P_i$) from the axis of rotation, that is, the line $x = -3$. For the inner radius, this yields

\[ r_2 = \left| -3 - \frac{t}{2} \right| = \frac{t + 6}{2}, \]

while for the outer radius, this yields

\[ r_1 = \left| -3 - \sqrt{t} \right| = \sqrt{t} + 3. \]

Therefore, the area of the annular ring that is the intersection of $S$ and the plane $y = t$ is given by

\[ B(t) = \pi (r_1^2 - r_2^2) = \pi \left(\sqrt{t} + 3\right)^2 - \pi \left(\frac{t + 6}{2}\right)^2 = \pi \left(6 t^{0.5} - \frac{t^2}{4} - 2t\right). \]

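Now we can apply Theorem 6.4 to compute $V(S)$. We obtain

\[
\begin{aligned}
V(S) &= \int_0^4 B(t)\,dt \\
&= \pi \int_0^4 \left(6 t^{0.5} - \frac{t^2}{4} - 2t\right) dt \\
&= \pi \left[4 t^{1.5} - \frac{t^3}{12} - t^2\right]_0^4 \\
&= \frac{32\pi}{3} \approx 33.51.
\end{aligned}
\]

□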
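Because the arithmetic in a washer computation is easy to get wrong, a numerical cross-check of Example 6.7 is worthwhile. The following is a hedged sketch, not the book's method: `washer_area` and `volume` are illustrative names, and the midpoint rule with the stated number of slices is an arbitrary choice. Evaluating the antiderivative $\pi\left[4 t^{1.5} - t^3/12 - t^2\right]$ between $0$ and $4$ gives $\pi (32 - 16/3 - 16) = 32\pi/3$, and the numerical sum should agree with that value.

```python
import math

def washer_area(t):
    """Area B(t) of the annular ring cut from the solid by the plane y = t."""
    outer = math.sqrt(t) + 3   # distance from x = -3 to the parabola point (sqrt(t), t)
    inner = t / 2 + 3          # distance from x = -3 to the line point (t/2, t)
    return math.pi * (outer**2 - inner**2)

def volume(n=200_000):
    """Midpoint-rule approximation of the integral of B(t) over [0, 4]."""
    dt = 4 / n
    return sum(washer_area((i + 0.5) * dt) for i in range(n)) * dt

approx = volume()
exact = 32 * math.pi / 3
print(approx, exact)  # both approximately 33.51
```

The same volume can be confirmed independently by the method of cylindrical shells: $\int_0^2 2\pi (x + 3)(2x - x^2)\,dx = 2\pi \cdot \frac{16}{3} = \frac{32\pi}{3}$.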

37.3. Special Cases of Theorems 6.3 and 6.4. Note that the solids we discussed so far in this section could be obtained by rotating the graph of a function around an axis. The volumes of such solids are called volumes of revolution. Indeed, the ball of Example 6.5 can be obtained by rotating the graph of the function $f(x) = \sqrt{r^2 - x^2}$ (a semicircle) about the $x$ axis. The cone of Example 6.6 can be obtained by rotating the graph of the function $f(x) = -\frac{xh}{r} + h = y$ (a straight line) about the $y$ axis.

For such solids, the areas $A(t)$ and $B(t)$ appearing in Theorems 6.3 and 6.4 are easy to compute, since the intersections appearing in those theorems will be circles or annular rings. If $S$ is a solid obtained by rotating the graph of the function $f(x) = y$ about the $x$ axis, then the intersection of the plane $x = t$ and $S$ is a circle of radius $f(t)$, and hence $A(t) = f(t)^2 \pi$. If $S$ is a solid obtained by rotating the curve of the function $g(y) = x$ about the $y$ axis, then the intersection of $S$ and the plane $y = t$ is a circle of radius $g(t)$, and so $B(t) = g(t)^2 \pi$. This yields the following special versions of Theorems 6.3 and 6.4.

Theorem 6.5. Let $S$ be a solid between the planes $x = a$ and $x = b$ obtained by rotating the graph of the function $f(x) = y$ about the $x$ axis. Then we have

\[ V(S) = \int_a^b f(t)^2 \pi\,dt. \]

Theorem 6.6. Let $S$ be a solid between the planes $y = a$ and $y = b$ obtained by rotating the graph of the function $g(y) = x$ about the $y$ axis. Then we have

\[ V(S) = \int_a^b g(t)^2 \pi\,dt. \]

The exercises at the end of this section will provide further examples for the uses of these theorems.

If the domain to be rotated does not include the entire area between the curve and the coordinate axis (for instance, because it is a domain between two curves), then we get annular rings, which we discussed in Section 37.2.

37.4. A Solid Not Obtained by Revolution. While volumes of revolution are a very frequent application of Theorems 6.3 and 6.4, they are not the only applications of those theorems.

Example 6.8. Let $S$ be a pyramid whose base is a square of side length $a$ and whose height is $h$. Compute the volume of $S$.

Solution: The first step is to place $S$ in a coordinate system so that Theorem 6.3 can be applied. Let us place the axis of $S$ on the $x$ axis of the coordinate system, so that the center of the base of $S$ is at the origin and the apex of $S$ is at $x = h$. This does not completely determine the position of $S$, because $S$ could still rotate around the $x$ axis. However, such rotations will not change the value of $A(t)$ for any $t \in [0, h]$, and so they are insignificant for the computation of $V(S)$.

Now note that, for any $t \in [0, h]$, the intersection of $S$ and the plane $x = t$ is a square of side length $a(h - t)/h$ (by similar triangles). See Figure 6.10 for an illustration. So $A(t) = a^2 (h - t)^2 / h^2$, and Theorem 6.3 implies

\[
\begin{aligned}
V(S) &= \int_0^h \frac{a^2 (h - t)^2}{h^2}\,dt \\
&= \frac{a^2}{h^2} \left[h^2 t - h t^2 + \frac{t^3}{3}\right]_0^h \\
&= \frac{a^2 h}{3}.
\end{aligned}
\]

Figure 6.10. Pyramid.

□
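Theorem 6.3 only needs the cross-sectional area $A(t)$, whatever the shape of the cross section. As a rough illustration, the sketch below feeds the square cross sections of the pyramid into a generic slicing routine. The name `volume_by_slices` and the sample dimensions ($a = 3$, $h = 5$) are hypothetical and chosen only for the demonstration.

```python
def volume_by_slices(cross_section_area, a, b, n=10_000):
    """Midpoint-rule approximation of the integral of A(t) over [a, b]."""
    dt = (b - a) / n
    total = 0.0
    for i in range(n):
        t = a + (i + 0.5) * dt       # midpoint of the i-th slice
        total += cross_section_area(t) * dt
    return total

side, height = 3.0, 5.0              # hypothetical pyramid dimensions

def square_section(t):
    """Area of the square cut from the pyramid by the plane x = t."""
    return (side * (height - t) / height) ** 2

approx = volume_by_slices(square_section, 0.0, height)
exact = side**2 * height / 3         # the formula a^2 h / 3
print(approx, exact)                 # both close to 15
```

Passing the cross-section function as an argument keeps the routine independent of whether the solid is a solid of revolution.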

37.5. The Big Picture. Note that the theorems presented in this section on integrals confirm our intuition that if a certain function measures a quantity, then, under the appropriate conditions, the integral of that function measures a quantity that is somehow in a space that is one dimension higher. For instance, we saw earlier that if $f(x)$ measured the height of a curve at a given horizontal coordinate $x$, then, under the appropriate conditions, $\int f(x)\,dx$ measured the area under the curve. So taking integrals meant moving from one dimension to two. In this section, the functions $A(t)$ and $B(t)$ measured areas of domains in a given plane, while $\int A(t)\,dt$ and $\int B(t)\,dt$ measured volumes. So taking integrals meant moving from two dimensions to three.

37.6. Exercises.

(1) Compute the volume of the solid between the planes $x = -1$ and $x = 1$ obtained by rotating the graph of the function $f(x) = x^2 = y$ about the $y$ axis.

(2) Compute the volume of the solid between the planes $x = -1$ and $x = 1$ obtained by rotating the graph of the function $f(x) = x^2 = y$ about the $x$ axis.

(3) Compute the volume of the solid between the planes $x = 0$ and $x = \pi$ obtained by rotating the graph of the function $f(x) = \sin x = y$ about the $x$ axis.

(4) Compute the volume of the solid obtained by rotating the domain between the curves $y = \sqrt{x}$ and $y = x$ about the line $y = -2$.

(5) Compute the volume of the solid obtained by rotating the domain between the curves $y = x^4$ and $y = x$ about the line $x = -1$.

(6) Compute the volume of a regular tetrahedron of side length $z$. A regular tetrahedron is a solid that has four faces, each of which is a regular triangle.

(7) Compute the volume of the solid between the planes $x = 0$ and $x = 1$ obtained by rotating the curve $f(x) = e^x = y$ about the $y$ axis.

(8) Compute the volume of the solid between the planes $x = 0$ and $x = 1$ obtained by rotating the curve of the function $f(x) = x(1 - x)$ about the $x$ axis.

38. Cylindrical Shells

38.1. An Alternative Method to Compute Volumes. In principle, Theorems 6.3 and 6.4 are simple methods to compute the volumes of solids. In practice, however, the areas $A(t)$ and $B(t)$ that appear in these theorems may be difficult to explicitly evaluate. One situation in which these areas are often difficult to compute is when the solid in question is obtained by rotating a domain around some line; that is, it is a solid of revolution.

As an example, let us try to compute the volume of the solid $S$ obtained by rotating about the $y$ axis the domain bordered by the lines $x = 0$, $x = 3$, and $y = 0$ and the graph of the function $f(x) = 3x^3 - x^4$. If we try to solve this problem using Theorem 6.3 or 6.4, we run into difficulties, because $A(t)$ and $B(t)$ will not be easy to compute. For instance, if we wanted to use Theorem 6.4, then, in order to compute $B(t)$, we would need to describe the intersection of $S$ and the plane $y = t$. For this, we would have to find the $x$ coordinates of the points of that intersection; that is, we would need to find all real numbers $x \in [0, 3]$ for which $y = 3x^3 - x^4 = t$. This is a fourth-degree equation for $x$, which is very difficult and cumbersome to solve. If we wanted to use Theorem 6.3, then, in order to compute $A(t)$, we would need to describe the intersection of the plane $x = t$ and $S$, which is not straightforward to do.

In situations like this, that is, when the application of Theorems 6.3 and 6.4 leads to technical difficulties, it often helps to use another method called the method of cylindrical shells. A cylindrical shell $C$ is simply a cylinder $C_1$ from which a smaller cylinder $C_2$ is removed, so that $C_1$ and $C_2$ have the same symmetry axis. See Figure 6.11 for an illustration. If $C_2$ is just a little bit smaller than $C_1$, then $C$ looks like a shell, explaining the name cylindrical shell.

Figure 6.11. Single cylindrical shell.

If $C_1$ and $C_2$ both have height $h$ and $C_i$ has radius $r_i$, then the volume of $C$ can be computed as

\[
\begin{aligned}
V(C) &= V(C_1) - V(C_2) \\
&= h r_1^2 \pi - h r_2^2 \pi \\
&= h \pi (r_1^2 - r_2^2).
\end{aligned}
\]

Note that the last form of $V(C)$ can be rearranged as

\[ V(C) = h \cdot (r_1 - r_2) \cdot \frac{2\pi (r_1 + r_2)}{2}. \tag{6.2} \]

This way of writing $V(C)$ might seem contrived at first sight. However, it has the following motivation. Note that $r_1 - r_2$ is the “width” of $C$, while $h$ is its height. Finally, if we flatten $C$ out in the plane, it will become a brick with side lengths $h$, $r_1 - r_2$, and $2\pi \frac{r_1 + r_2}{2}$, since the length of the remaining side is equal to the circumference of a circle whose radius is the average of the radii of $C_1$ and $C_2$.

In other words, (6.2) says that the volume of a cylindrical shell is equal to the product of its height, width, and “length” (if the latter is interpreted properly).

We are now in a position to use cylindrical shells to compute volumes. Let $S$ be a solid that is obtained by rotating the domain $D$, which lies below the curve of $f(x) = y$ from $x = a$ to $x = b$, about the $y$ axis. In order to estimate $V(S)$, we cut $[a, b]$ into $n$ intervals of equal length using points

\[ a = x_0 < x_1 < \cdots < x_{n-1} < x_n = b. \]

For each integer $i \in [1, n]$, we will take a cylindrical shell $S_i$, which will roughly cover the part of $S$ that is obtained by rotating the part of $D$ that is between the lines $x = x_{i-1}$ and $x = x_i$ about the $y$ axis. More precisely, this shell will be obtained by removing the cylinder $C_{i,2}$ from the cylinder $C_{i,1}$, where $C_{i,1}$ and $C_{i,2}$ are both cylinders whose symmetry axis is the $y$ axis and whose height is $f(x_i^*)$ for the midpoint $x_i^*$ of the interval $[x_{i-1}, x_i]$. The radius of $C_{i,1}$ is $x_i$, and the radius of $C_{i,2}$ is $x_{i-1}$.

If we set $\Delta x = (b - a)/n$, then (6.2) implies that

\[ V(S_i) = f(x_i^*)\,\Delta x\,2\pi x_i^*, \]

since $x_i - x_{i-1} = \Delta x$ and $x_i^* = (x_{i-1} + x_i)/2$ is the midpoint of the interval $[x_{i-1}, x_i]$. Summing the last displayed equation over all $i$, we get

\[ V(S) \approx \sum_{i=1}^{n} f(x_i^*)\,\Delta x\,2\pi x_i^*, \tag{6.3} \]

since the union of the shells $S_i$ has roughly identical volume to $S$.

As $n$ goes to infinity, this approximation gets better and better, and the Riemann sum on the right-hand side of (6.3) converges to the corresponding definite integral. Hence, we have proved the following theorem.

Theorem 6.7. The volume $V(S)$ of the solid obtained by rotating the domain $D$ whose borders are the curve of $f(x) = y$, the lines $x = a$ and $x = b$, and the horizontal axis $y = 0$ about the $y$ axis is equal to

\[ V(S) = \int_a^b 2\pi x f(x)\,dx. \]

Example 6.9. Compute the volume of the solid $S$ obtained by rotating about the $y$ axis the domain bordered by the lines $x = 0$, $x = 3$, and $y = 0$ and the graph of the function $f(x) = 3x^3 - x^4$.

Figure 6.12. (a) The curve of $y = 3x^3 - x^4$ and (b) the solid obtained by its rotation.

Solution: By Theorem 6.7, we have

\[
\begin{aligned}
V(S) &= 2\pi \int_0^3 x (3x^3 - x^4)\,dx \\
&= 2\pi \int_0^3 (3x^4 - x^5)\,dx \\
&= 2\pi \left[\frac{3x^5}{5} - \frac{x^6}{6}\right]_0^3 \\
&= 2\pi \left(\frac{729}{5} - \frac{729}{6}\right) \\
&= \frac{243\pi}{5} \approx 152.68.
\end{aligned}
\]

□
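The Riemann sum (6.3) that leads to Theorem 6.7 can be evaluated directly on a computer. The sketch below is one possible illustration, assuming midpoint sample points as in the derivation; the function name `shell_volume` and the number of shells are arbitrary choices, not something prescribed by the text.

```python
import math

def shell_volume(f, a, b, n=100_000):
    """Sum of the shell volumes f(x*) * dx * 2*pi*x* over n shells on [a, b].

    The solid is obtained by rotating the region under y = f(x) about the y axis.
    """
    dx = (b - a) / n
    total = 0.0
    for i in range(n):
        x = a + (i + 0.5) * dx       # midpoint x* of the i-th subinterval
        total += f(x) * dx * 2 * math.pi * x
    return total

approx = shell_volume(lambda x: 3 * x**3 - x**4, 0.0, 3.0)
exact = 2 * math.pi * (729 / 5 - 729 / 6)   # equals 243*pi/5
print(approx, exact)  # both approximately 152.68
```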

The axis around which we rotate a domain does not have to be a coordinate axis in order for the method of cylindrical shells to be applicable. We can apply the method as long as we can decompose the solid in question into cylindrical shells whose height and radius we can compute.

Example 6.10. Let $S$ be the solid obtained by rotating the domain whose borders are the horizontal axis, the vertical lines $x = 0$ and $x = 2$, and the graph of the function $f(x) = 2x - x^2$ about the vertical line $x = 3$. Compute $V(S)$.

Solution: We can decompose $S$ into cylindrical shells whose common axis is the vertical line $x = 3$. The shell containing the point $x$ of the horizontal axis will have height $f(x) = 2x - x^2$ and radius $3 - x$.

Figure 6.13. (a) The curve of $y = x^2/2$ and (b) the solid obtained by its rotation.

Therefore, we have

\[
\begin{aligned}
V(S) &= \int_0^2 2\pi (2x - x^2)(3 - x)\,dx \\
&= 2\pi \int_0^2 (x^3 - 5x^2 + 6x)\,dx \\
&= 2\pi \left[\frac{x^4}{4} - \frac{5x^3}{3} + 3x^2\right]_0^2 \\
&= 2\pi \cdot \frac{8}{3} \\
&\approx 16.755.
\end{aligned}
\]

□
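When the axis of rotation is a vertical line $x = c$ rather than the $y$ axis, the only change in the shell sum is that the radius becomes the distance $|c - x|$. A minimal sketch of that variation follows; the name `shell_volume_about` and its `axis` parameter are hypothetical, and the routine assumes that the whole interval $[a, b]$ lies on one side of the axis.

```python
import math

def shell_volume_about(f, a, b, axis, n=100_000):
    """Shell-method approximation for rotating the region under y = f(x),
    a <= x <= b, about the vertical line x = axis.

    Assumes the interval [a, b] lies entirely on one side of the axis.
    """
    dx = (b - a) / n
    total = 0.0
    for i in range(n):
        x = a + (i + 0.5) * dx       # midpoint of the i-th subinterval
        radius = abs(axis - x)       # distance from the shell to the axis
        total += 2 * math.pi * radius * f(x) * dx
    return total

approx = shell_volume_about(lambda x: 2 * x - x**2, 0.0, 2.0, axis=3.0)
exact = 2 * math.pi * 8 / 3
print(approx, exact)  # both approximately 16.755
```

Setting `axis=0.0` recovers the situation of Theorem 6.7 for an interval with $a \ge 0$.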

38.2. Exercises.

In (1)–(3), use the method presented in this section to compute the volume of the solid obtained by rotating the domain between the given curves about the $y$ axis.

(1) $f(x) = x^3 = y$, $x = 1$, $y = 0$.

(2) $f(x) = \frac{1}{x^2}$, $x = 2$, $x = 3$, $y = 0$.

(3) $f(x) = x$, $g(x) = -x$, $x = 2$.

In (4)–(7), use the best available method to compute the volume of the solid obtained by rotating the domain between the given curves about the given axis.

(4) $f(x) = x = y$, $g(x) = -x = y$, $x = 2$, about $x = -1$.

(5) $f(x) = 6 - x$, $g(x) = x = y$, $y = 6$, about the $y$ axis.

(6) $f(x) = 6 - x$, $g(x) = x = y$, $y = 6$, about the line $y = 6$.

(7) $f(x) = 6 - x$, $g(x) = x = y$, $y = 6$, about the line $y = 7$.

39. Work and Hydrostatic Force

39.1. Work Moving a Point-like Object. In physics, the word work has a more specific meaning than in everyday life. Work in physics means that a force is exerted to move an object a certain distance.

The force $F$ moving an object is computed by the formula

\[ F = m \frac{d^2 s}{dt^2}, \]

which is called Newton’s second law. Here $m$ is the mass of the object, while $a = \frac{d^2 s}{dt^2}$ is its acceleration. So Newton’s second law says that the force needed to move an object at a given constant acceleration is in direct proportion to the mass of the object.

If a constant force $F$ is exerted while an object moves distance $d$, then the work done by that constant force is computed by the formula

\[ W = Fd. \]

Note that in the metric system, distance is measured in meters (m), time is measured in seconds (s), and therefore acceleration is measured in m/s$^2$. Mass is measured in kilograms (kg), so force is measured in kg · m/s$^2$, which are called newtons (N), and, finally, work is measured in N · m, which are also called joules (J). One joule is the work that is done when a force equal to 1 newton moves an object a distance of 1 meter.

Example 6.11. How much work is needed to lift a child of 20 kg to a height of 0.5 meters? Use the fact that gravitation causes downward acceleration of $g = 9.8$ m/s$^2$.

Solution: In order to lift the child, one needs to overcome the downward acceleration caused by gravity. This means that an upward force of

\[ m \cdot g = 20 \text{ kg} \cdot 9.8\ \frac{\text{m}}{\text{s}^2} = 196 \text{ N} \]

has to be exerted across a distance of $d = 0.5$ meters. This yields

\[ W = Fd = 196 \text{ N} \cdot 0.5 \text{ m} = 98 \text{ J}. \]

So the work needed is 98 J. □

If the force exerted is not constant across the entire distance, but the distance can be split up into a few parts so that the force is constant on each part, then we can compute the work done by the force on each part just as in the previous example, and then we can add the obtained amounts to get the total amount of work done across the entire distance.

If the force exerted changes according to a continuous function f(d), then we can approximate the work done using the idea of the previous paragraph and then use integration to compute the total work done by the force as follows.

Let a and b be real numbers and let us assume that an object is moving from a to b, and the force moving the object at a given point x is equal to f(x), where f is a continuous function. In order to compute the work done across the entire distance, let us split the interval [a, b] into n equal intervals, using points

\[ a = x_0 < x_1 < \cdots < x_{n-1} < x_n = b, \]

and set $\Delta x = (b - a)/n$. Then the work done by the force on the interval $I_i = [x_{i-1}, x_i]$ is about $f(x_i^*)\,\Delta x$, where $x_i^*$ is some sample point in $I_i$.

Indeed, f is continuous, so if n is large and therefore $I_i$ is short, then f does not change much on that interval, so the shape of the domain under the curve of f and above $I_i$ is roughly a rectangle. This means that the total work done by the force on [a, b] is close to

\[ \sum_{i=1}^{n} f(x_i^*)\,\Delta x. \tag{6.4} \]

As n gets larger, this approximation gets better, and so we define the total work done by the force across the interval [a, b] as the limit of the sum in (6.4) as n goes to infinity. On the other hand, that sum is a Riemann sum, so its limit, as $n \to \infty$, is the definite integral

\[ \int_a^b f(x)\,dx. \]

In other words, we have proved the following theorem.

Theorem 6.8. Let a and b be real numbers. If an object is moved from a to b by a force that is equal to f(x) at point x, where f is continuous on [a, b], then the total work done by the force on [a, b] is

\[ W = \int_a^b f(x)\,dx. \]

Example 6.12. The force (in newtons) needed to extend a given spring x meters over its natural length is given by the function f(x) = 70x. How much work is needed to extend the spring 10 cm over its natural length?

Solution: Since 10 cm = 0.1 m, Theorem 6.8 gives

\[ W = \int_0^{0.1} 70x\,dx = \left[35x^2\right]_0^{0.1} = 0.35\ \text{J}. \]

So 0.35 J of work is needed to stretch the spring 10 cm over its natural length. □

We point out that the law of physics that says that the force needed to extend a spring by x units over its natural length is equal to kx is called Hooke's law, and k is called the spring constant.
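The Riemann-sum construction behind Theorem 6.8 can be tried out numerically. The following is a minimal sketch, not part of the original text: the helper name `work_riemann`, the choice of midpoints as sample points, and the number of subintervals are all assumptions made for illustration. It uses a spring that obeys Hooke's law with k = 70 N/m, with x measured in meters.

```python
def work_riemann(force, a, b, n=10_000):
    """Approximate the work done by `force` on [a, b] with a midpoint Riemann sum."""
    dx = (b - a) / n
    # Sample point x_i^* is taken to be the midpoint of the i-th subinterval.
    return sum(force(a + (i + 0.5) * dx) for i in range(n)) * dx


def spring_force(x):
    """Hooke's law with spring constant k = 70 N/m (x in meters)."""
    return 70 * x


# Work needed to stretch the spring from 0 m to 0.1 m.
work = work_riemann(spring_force, 0.0, 0.1)
print(work)
```

Because the antiderivative 35x^2 takes the value 0.35 at x = 0.1, the printed sum should agree with 0.35 J up to rounding error.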

39.2. Hydrostatic Force. Let us say that we want to pump water out of a tank that has the shape of the southern half of a ball of radius 1 (m). How much work is needed to do that? This question is more complex than the previous one since deeper layers of the hemisphere are smaller, and water in those layers has to travel farther in order to reach the top of the tank.

Therefore, we will cut the tank up into small layers and estimate the amount of work needed to pump out each layer of water.

Let x = 0 denote the bottom of the tank and let x = 1 denote the center of the top circle of the tank. Cut the tank into n horizontal layers by planes that are at heights

\[ 0 = x_0 < x_1 < \cdots < x_n = 1. \]

Here $x_i - x_{i-1} = 1/n = \Delta x$ for all i. Let $L_i$ denote the ith layer. The shape of this layer is close to a cylinder of height $\Delta x$. Each water particle in this layer has to be pumped a distance roughly equal to $1 - z_i$, where $z_i$ is a point in $[x_{i-1}, x_i]$. The square of the radius of the cylinder approximating $L_i$ is, by the Pythagorean theorem, $1 - (1 - z_i)^2 = 2z_i - z_i^2$, and therefore the volume of $L_i$ is close to $V_i = \Delta x\,(2z_i - z_i^2)\,\pi$. See Figure 6.14 for an illustration.

One cubic meter of water has a mass of 1000 kg, so the density of water is $\rho = 1000\ \text{kg/m}^3$. Therefore, the mass of the water in $L_i$ is close to $m_i = \rho V_i$. In order to pump this water out of the tank, the downward force caused by gravitation, that is, $m_i g$, has to be overcome, across a distance of $1 - z_i$. Therefore, the work needed to pump out the water in $L_i$ is about $m_i g (1 - z_i)$, and the work needed to pump out all the water in the tank is approximated by

\[ \sum_{i=1}^{n} m_i g (1 - z_i) = \rho g \pi \sum_{i=1}^{n} \Delta x\,(2z_i - z_i^2)(1 - z_i). \tag{6.5} \]


Figure 6.14. The tank in a coordinate system, and its layer at height z.

As n grows, the expression displayed in (6.5) approximates the needed work better and better. We define the total work needed (to pump all the water out of the tank) to be the limit of the sum shown in (6.5) as n goes to infinity. As that sum is a Riemann sum, its limit, as n goes to infinity, is the definite integral

\[ W = \rho g \pi \int_0^1 (2x - x^2)(1 - x)\,dx. \]

As the integrand is a polynomial, this integral is very easy to compute. We get that

\[
\begin{aligned}
W &= \rho g \pi \int_0^1 (x^3 - 3x^2 + 2x)\,dx \\
  &= \rho g \pi \left[ \frac{x^4}{4} - x^3 + x^2 \right]_0^1 \\
  &= \frac{\rho g \pi}{4} \approx 7696.9\ \text{J}.
\end{aligned}
\]

So it takes almost 7700 J of work to pump out all the water from the tank.

39.3. Exercises.

(1) How much work is done when a book of mass 2 kg is lifted 1.5 meters from its original location?

(2) If it takes 10 J of work to lift an object 2 meters, how much work does it take to lift that object an additional 3 meters?

(3) If it takes 20 J of work to stretch a spring 20 cm over its natural length, how much work does it take to stretch that spring an additional 5 cm?

(4) How much work is needed to pump out all the water from a tank of the shape of an inverted cone of height 10 whose top circle has radius 2? (Length is measured in meters.)

(5) How much work is needed to pump out all the water from a tank of the shape of a cylinder of height 20 whose base circle has radius 30?

(6) How much work is needed to pump out all the water from a tank of the shape of an inverted pyramid of height 15 whose top plate is a square of side length 10?

40. Average Value of a Function

The concept of average is a simple one as long as we take the average of a finite number of values, such as the average price of a house in a given neighborhood or the average daily high temperature in a given city in a given month. If $a_1, a_2, \ldots, a_n$ are real numbers, then

\[ A = \frac{a_1 + a_2 + \cdots + a_n}{n} \tag{6.6} \]

is their average. However, what can we say about the average value of a function over a given interval [a, b]? We will clearly need a new definition for that since there are infinitely many real numbers in [a, b], so summing all of them and then dividing their sum by the number of summands is not an option.

Here is an intuitive way of extending the definition of average to the values taken by a function over an interval. It follows from (6.6) that A is the only real number with the property that if we replace each of $a_1, a_2, \ldots, a_n$ by A, then the sum $a_1 + a_2 + \cdots + a_n$ does not change. This observation suggests the following definition.

Definition 6.1. Let f be a function such that $\int_a^b f(x)\,dx$ exists. Then the average value of f on the interval [a, b] is the real number

\[ c = \frac{\int_a^b f(x)\,dx}{b - a}. \]

Indeed, c is the only real number with the property that if we replace f by the constant function f(x) = c, then the integral $\int_a^b f(x)\,dx$ does not change.


A more systematic approach is the following. As we saw when we first learned about integrals, $\int_a^b f(x)\,dx$ can be approximated in the following way. Split [a, b] into n equal intervals and choose a point $x_i$ in the ith such interval. Take a rectangle of height $f(x_i)$ over the ith interval. The average value of the n values of f taken at the points $x_i$ is, of course,

\[ A_n = \frac{f(x_1) + f(x_2) + \cdots + f(x_n)}{n}. \]

On the other hand, the total area of the n rectangles we have just defined is

\[ R_n = \frac{b - a}{n} \cdot \bigl(f(x_1) + f(x_2) + \cdots + f(x_n)\bigr). \]

Comparing the last two displayed equations, we see that

\[ A_n = \frac{R_n}{b - a}. \tag{6.7} \]

If n goes to infinity, then the n rectangles will approximate the domain under the graph of f, and so the right-hand side of (6.7) will converge to $\frac{\int_a^b f(x)\,dx}{b - a}$, while the left-hand side will approximate the average value of f on [a, b].

Example 6.13. What is the average value A of sin x on the interval $[0, \pi]$?

Solution: We have

\[ A = \frac{\int_0^{\pi} \sin x\,dx}{\pi} = \frac{[-\cos x]_0^{\pi}}{\pi} = \frac{1 - (-1)}{\pi} = \frac{2}{\pi}. \]

□

See Figure 6.15 for an illustration.

It is worth pointing out that a continuous function f will actually take its average value on each interval. This is the content of the following theorem.

Theorem 6.9. Let f be a continuous function on [a, b] and let c be the average value of f on [a, b]. Then there exists a real number $x \in [a, b]$ such that f(x) = c.


Figure 6.15. The average value of sin x on [0, π].

Proof. It suffices to show that if m is the minimum of f on [a, b] and M is the maximum of f on [a, b], then $m \le c \le M$, and our claim follows from the intermediate value theorem.

We know that

\[ m(b - a) \le \int_a^b f(x)\,dx \le M(b - a) \]

for obvious geometric reasons. Now divide all three terms by b − a to get $m \le c \le M$. □

Example 6.14. There is a real number $x \in [0, \pi]$ such that $\sin x = 2/\pi$.

Solution: This follows from the previous example and Theorem 6.9. □
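Theorem 6.9 only asserts that such a point exists, but for sin x on [0, π] one can also locate it numerically. The sketch below is an illustration under assumptions: the helper names `average_value` and `bisect`, the midpoint sampling, and the tolerance are choices made here. It approximates the average value by a Riemann sum and then uses bisection on [0, π/2], where sin x is increasing, to find a point at which sin x equals that average.

```python
import math


def average_value(f, a, b, n=100_000):
    """Approximate the average value of f on [a, b] with a midpoint Riemann sum."""
    dx = (b - a) / n
    return sum(f(a + (i + 0.5) * dx) for i in range(n)) * dx / (b - a)


def bisect(g, lo, hi, tol=1e-12):
    """Find a root of g in [lo, hi], assuming g(lo) and g(hi) have opposite signs."""
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if (g(lo) < 0) == (g(mid) < 0):
            lo = mid  # the sign change lies in the right half
        else:
            hi = mid  # the sign change lies in the left half
    return (lo + hi) / 2


c = average_value(math.sin, 0.0, math.pi)                # should be close to 2/pi
x = bisect(lambda t: math.sin(t) - c, 0.0, math.pi / 2)  # a point where sin x = c
print(c, x)
```

Bisection is itself an application of the intermediate value theorem, the same tool used in the proof of Theorem 6.9.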

40.1. Exercises.

(1) Find the average value of $x^n$ on [0, 1].
(2) Find the average value of tan x on $[0, \pi/4]$.
(3) Find the average value of ln x on [1, e].
(4) Find the average value of $e^x$ on [0, 1].
(5) What is larger, the average value of sin x or the average value of cos x, if both averages are taken on the interval $[0, \pi/2]$?
(6) What is larger, the average value of sin x on $[0, \pi]$ or the average value of sin x on $[-14\pi, 17\pi]$? Can you find an answer that does not involve computation?

CHAPTER 7

Methods of Integration

41. Integration by Parts

41.1. Method of Integration by Parts. Let u and v be two differentiable functions of the variable x. We used the simple product rule

\[ (uv)' = u'v + uv' \tag{7.1} \]

to compute the derivative of the product of these two functions. Is there a similar rule for computing the integral of the product of two functions? In general, the answer is no. There is no rule that provides the integral of the product of two functions that would work in every case. However, there are many cases in which a relatively simple way of "reversing" the product rule of differentiation will give us the answer we are trying to obtain.

Indeed, integrating both sides of the product rule of differentiation (7.1) with respect to x, we get the identity

\[ u(x)v(x) = \int u'(x)v(x)\,dx + \int u(x)v'(x)\,dx \]

or, after rearrangement,

\[ \int u'(x)v(x)\,dx = u(x)v(x) - \int u(x)v'(x)\,dx. \tag{7.2} \]

Formula (7.2) is very useful if we want to compute the integral of the product of two functions, one of which can play the role of $u'$ and the other one of which can play the role of v. If we can compute u and $\int uv'\,dx$, then formula (7.2) enables us to compute $\int u'v\,dx$ as well. If we cannot carry out one or both of these computations, then formula (7.2) will not help.

Example 7.1. Compute $\int x e^x\,dx$.

Solution: We set $u'(x) = e^x$ and $v(x) = x$. Then formula (7.2) is easy to apply, since $u(x) = e^x$ and $v'(x) = 1$. Therefore, (7.2) implies that

\[ \int x e^x\,dx = e^x \cdot x - \int e^x \cdot 1\,dx = e^x \cdot x - e^x + C = e^x(x - 1) + C. \]

□


The reader is encouraged to verify that the obtained solution is correct by computing the derivative of $e^x(x - 1)$ and checking that it is indeed equal to $e^x \cdot x$.
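For readers who want to compare their work, the verification takes one application of the product rule:

\[ \frac{d}{dx}\bigl(e^x(x - 1)\bigr) = e^x \cdot (x - 1) + e^x \cdot 1 = x e^x. \]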

At this point, the reader may be asking how we knew that we needed to set $u'(x) = e^x$ and $v(x) = x$, and not the other way around. The answer is that the other distribution of roles, that is, $u'(x) = x$ and $v(x) = e^x$, would not have helped. Indeed, if we had chosen $u'$ and v in that way, we would have needed to compute $\int uv'\,dx = \int \frac{e^x x^2}{2}\,dx$. That would have been more complex than the original problem. We should always choose $u'$ and v so that $\int uv'\,dx$ is easy to compute. That usually means selecting v so that it becomes simpler when differentiated, and selecting $u'$ so that it does not get much more complex when integrated (or at least one of these two desirable outcomes occurs).


Example 7.2. Compute $\int x \cos x\,dx$.

Solution: We set $u'(x) = \cos x$ and $v(x) = x$, which means that $u(x) = \sin x$ and $v'(x) = 1$. So formula (7.2) implies

\[ \int x \cos x\,dx = x \sin x - \int \sin x\,dx = x \sin x + \cos x + C. \]

□

The technique of integration we have just explained is called integration by parts.

41.2. Advanced Examples. Sometimes the integrand does not seem to be a product, but it can be transformed into one. The following is a classic example.

Example 7.3. Compute $\int \ln x\,dx$.

Solution: The crucial observation is that writing $\ln x = 1 \cdot \ln x$ helps. Let $u'(x) = 1$ and $v(x) = \ln x$. Then $u(x) = x$ and $v'(x) = 1/x$, so, crucially, $u(x)v'(x) = 1$. Therefore, formula (7.2) yields

\[ \int \ln x\,dx = \int 1 \cdot \ln x\,dx = x \ln x - \int 1\,dx = x \ln x - x + C. \]

□

Sometimes integration by parts leads to an equation or a system of equations that needs to be solved in order to get the solution to our problem.

Example 7.4. Compute $\int e^x \cos x\,dx$.

Solution: We set $u'(x) = e^x$ and $v(x) = \cos x$. Then $u(x) = e^x$ and $v'(x) = -\sin x$, and formula (7.2) yields

\[ \int e^x \cos x\,dx = e^x \cos x + \int e^x \sin x\,dx. \tag{7.3} \]

So we could solve our problem if we could compute the integral $\int e^x \sin x\,dx$. We can do that by applying the technique of integration by parts again, setting $u'(x) = e^x$ and $v(x) = \sin x$. We obtain

\[ \int e^x \sin x\,dx = e^x \sin x - \int e^x \cos x\,dx. \tag{7.4} \]

Finally, note that (7.3) and (7.4) form a system of equations with unknowns $\int e^x \cos x\,dx$ and $\int e^x \sin x\,dx$. We can solve this system, for instance, by adding these two equations and noting that $\int e^x \sin x\,dx$ cancels. We get the equation

\[ \int e^x \cos x\,dx = e^x(\cos x + \sin x) - \int e^x \cos x\,dx \]

or

\[ \int e^x \cos x\,dx = \frac{e^x}{2}(\cos x + \sin x) + C. \]

□

Note that substituting the obtained expression for $\int e^x \cos x\,dx$ into (7.4), we get a formula for $\int e^x \sin x\,dx$, namely,

\[ \int e^x \sin x\,dx = \frac{e^x}{2}(\sin x - \cos x) + C. \]
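The two antiderivatives obtained above can be sanity-checked numerically. The sketch below is an illustration, not part of the original text: the helper `simpson`, the interval [0, 1], and the names `F_cos` and `F_sin` are assumptions chosen here. It compares a composite Simpson's rule approximation of each definite integral with the difference of the claimed antiderivative at the endpoints.

```python
import math


def simpson(f, a, b, n=1000):
    """Composite Simpson's rule on [a, b]; n must be even."""
    h = (b - a) / n
    total = f(a) + f(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * f(a + i * h)
    return total * h / 3


def F_cos(x):
    """Claimed antiderivative of e^x cos x."""
    return math.exp(x) * (math.cos(x) + math.sin(x)) / 2


def F_sin(x):
    """Claimed antiderivative of e^x sin x."""
    return math.exp(x) * (math.sin(x) - math.cos(x)) / 2


numeric_cos = simpson(lambda x: math.exp(x) * math.cos(x), 0.0, 1.0)
numeric_sin = simpson(lambda x: math.exp(x) * math.sin(x), 0.0, 1.0)
print(numeric_cos, F_cos(1.0) - F_cos(0.0))
print(numeric_sin, F_sin(1.0) - F_sin(0.0))
```

Each printed pair should agree to many decimal places, which is what the fundamental theorem of calculus predicts if the formulas are right.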

41.3. Definite Integrals. If we evaluate both sides of formula (7.2) from a to b and we apply the fundamental theorem of calculus, we get the identity

\[ \int_a^b u'v\,dx = [uv]_a^b - \int_a^b uv'\,dx. \tag{7.5} \]

Example 7.5. Evaluate $\int_1^2 \ln x\,dx$.

Solution: As we saw in Example 7.3, we can set $u(x) = x$ and $v(x) = \ln x$. Then $u'(x) = 1$, $v'(x) = 1/x$, and formula (7.5) yields

\[ \int_1^2 \ln x\,dx = [x \ln x]_1^2 - \int_1^2 1\,dx = 2 \ln 2 - 1. \]

□


41.4. Exercises.

(1) Compute $\int x \sin x\,dx$.
(2) Compute $\int x^2 e^x\,dx$.
(3) Compute $\int x \ln x\,dx$.
(4) Compute $\int x^2 \ln x\,dx$.
(5) Evaluate $\int_1^2 x \cos 2x\,dx$.
(6) Evaluate $\int_0^1 x^2 \sin x\,dx$.

42. Trigonometric Integrals

42.1. Powers of sin and cos. In this section, we consider functions of the form $f(x) = \sin^m x \cos^n x$ and discuss techniques for their integration. It seems natural to first consider the cases when m or n is 0, that is, when f is just a power of sin or cos. Even these special cases will break up into further subcases. The easiest subcase is when the exponents are even numbers. In that case, we can use the trigonometric identities

\[ \cos 2x = 2\cos^2 x - 1 \tag{7.6} \]

and

\[ \sin 2x = 2 \sin x \cos x \tag{7.7} \]

to eliminate high powers of trigonometric functions in the integrand.


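Example 7.6. Compute $\int \cos^4 x\,dx$.

Solution: Using (7.6), we get that $\cos^2 x = \frac{1 + \cos 2x}{2}$, and so

\[ \cos^4 x = \left(\frac{1 + \cos 2x}{2}\right)^2 = \frac{1}{4} + \frac{\cos 2x}{2} + \frac{\cos^2 2x}{4}. \]

Applying (7.6) again, with 2x replacing x, we get that $\cos^2 2x = \frac{1 + \cos 4x}{2}$, so the previous displayed equation turns into

\[ \cos^4 x = \frac{1}{4} + \frac{\cos 2x}{2} + \frac{1 + \cos 4x}{8} = \frac{3}{8} + \frac{\cos 2x}{2} + \frac{\cos 4x}{8}. \]

Having eliminated the powers of cos, the integration is easy to carry out as follows:

\[
\begin{aligned}
\int \cos^4 x\,dx &= \int \left(\frac{3}{8} + \frac{\cos 2x}{2} + \frac{\cos 4x}{8}\right) dx \\
&= \frac{3x}{8} + \frac{\sin 2x}{4} + \frac{\sin 4x}{32} + C.
\end{aligned}
\]

□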
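A quick numerical spot check can catch slips in power-reduction computations such as the one above. The following sketch is illustrative only: the function names, the sample points, and the step size of the central difference are assumptions made here. It checks that the reduced form 3/8 + (cos 2x)/2 + (cos 4x)/8 agrees with cos^4 x, and that the derivative of the computed antiderivative reproduces cos^4 x.

```python
import math


def reduced(x):
    """Power-reduced form of cos^4 x."""
    return 3 / 8 + math.cos(2 * x) / 2 + math.cos(4 * x) / 8


def antiderivative(x):
    """Antiderivative of cos^4 x (constant of integration omitted)."""
    return 3 * x / 8 + math.sin(2 * x) / 4 + math.sin(4 * x) / 32


def derivative(F, x, h=1e-5):
    """Central-difference approximation of F'(x)."""
    return (F(x + h) - F(x - h)) / (2 * h)


samples = [0.0, 0.3, 1.0, 2.5, -1.7]
identity_error = max(abs(math.cos(x) ** 4 - reduced(x)) for x in samples)
derivative_error = max(
    abs(derivative(antiderivative, x) - math.cos(x) ** 4) for x in samples
)
print(identity_error, derivative_error)
```

Both printed errors should be tiny; a spot check like this does not prove the identity, but a large error would signal an algebra mistake.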

The computation is more complex if the integrand is an odd power of sin or cos. In that case, we separate one factor and convert the rest into the other trigonometric function, using the rule $\cos^2 x + \sin^2 x = 1$.

Example 7.7. Compute $\int \sin^3 x\,dx$.

Solution: We have

\[ \sin^3 x = \sin x \cdot \sin^2 x = \sin x \cdot (1 - \cos^2 x) = \sin x - \sin x \cos^2 x. \]

The advantage of this form is that it makes integration by substitution easy. Indeed, set $u = \cos x$, then $du/dx = -\sin x$, and so

\[ \int -\sin x \cos^2 x\,dx = \int u^2\,du = \frac{u^3}{3} + C = \frac{\cos^3 x}{3} + C. \]

Comparing the two displayed equations of this solution and noting that $\int \sin x\,dx = -\cos x + C$, we get

\[ \int \sin^3 x\,dx = -\cos x + \frac{\cos^3 x}{3} + C. \]

□

The methods shown above can be used to compute the integral of products of powers of sin x and cos x. In other words, the method allows us to compute $\int \cos^m x \sin^n x\,dx$ as shown below.

Example 7.8. Compute $\int \cos^2 x \sin^3 x\,dx$.

Solution: Just as in Example 7.7, we separate one sin x factor. This accomplishes two things. It allows us to convert the remaining even number of sin x factors to cos x factors, and it allows us to integrate by substitution.

Indeed,

\[
\begin{aligned}
\int \cos^2 x \sin^3 x\,dx &= \int \cos^2 x \sin^2 x \sin x\,dx \\
&= \int \cos^2 x (1 - \cos^2 x) \sin x\,dx \\
&= \int (\cos^2 x \sin x - \cos^4 x \sin x)\,dx \\
&= \int (-u^2 + u^4)\,du \\
&= -\frac{u^3}{3} + \frac{u^5}{5} + C \\
&= -\frac{\cos^3 x}{3} + \frac{\cos^5 x}{5} + C,
\end{aligned}
\]

where we used the substitution $u = \cos x$. □


We can always proceed this way if at least one of m and n in the integrand $\cos^m x \sin^n x$ is odd. Indeed, in that case, after separating one factor from that odd power, an even power remains, and that can be converted to the other trigonometric function using the identity $\sin^2 x + \cos^2 x = 1$. If both m and n are even, then we can use that identity right away.

Example 7.9. Compute $\int \cos^4 x \sin^2 x\,dx$.

Solution: We have

\[
\begin{aligned}
\int \cos^4 x \sin^2 x\,dx &= \int \cos^4 x (1 - \cos^2 x)\,dx \\
&= \int \cos^4 x\,dx - \int \cos^6 x\,dx.
\end{aligned}
\]

Now note that we computed $\int \cos^4 x\,dx$ in Example 7.6. You are asked to compute $\int \cos^6 x\,dx$ in Exercise 42.4.2. The difference of these two results then provides the solution of the present example. □

42.2. Powers of tan and sec. When integrating a product of the form $\tan^m x \sec^n x$, we will use the identity $\sec^2 x = \tan^2 x + 1$ and the differentiation rules $(\tan x)' = \sec^2 x$ and $(\sec x)' = \sec x \tan x$.

There are two easy cases, namely, when m is odd (and n is at least 1) and when $n \ge 2$ is even.

In the first case, that is, when m is odd and $n \ge 1$, we separate one factor of tan x sec x and express the remaining factors in terms of sec x by the identity $-1 + \sec^2 x = \tan^2 x$. Then we substitute $u = \sec x$, which leads to $\frac{du}{dx} = \tan x \sec x$.


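Example 7.10. Compute $\int \tan^3 x \sec x\,dx$.

Solution: Following the strategy explained above, we have

\[
\begin{aligned}
\int \tan^3 x \sec x\,dx &= \int \tan^2 x \tan x \sec x\,dx \\
&= \int (-1 + \sec^2 x) \tan x \sec x\,dx \\
&= \int (-1 + u^2)\,du \\
&= -u + \frac{u^3}{3} + C \\
&= -\sec x + \frac{\sec^3 x}{3} + C.
\end{aligned}
\]

□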

In the second case, that is, when $n \ge 2$ is even, we separate one factor of $\sec^2 x$, express the remaining factors in terms of tan x using the identity $\sec^2 x = 1 + \tan^2 x$, and substitute $u = \tan x$, which leads to $\frac{du}{dx} = \sec^2 x$.

Example 7.11. Compute $\int \sec^4 x\,dx$.

Solution: We have

\[
\begin{aligned}
\int \sec^4 x\,dx &= \int \sec^2 x \sec^2 x\,dx \\
&= \int (1 + \tan^2 x) \sec^2 x\,dx \\
&= \int (1 + u^2)\,du \\
&= u + \frac{u^3}{3} + C \\
&= \tan x + \frac{\tan^3 x}{3} + C.
\end{aligned}
\]

□
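As with integration by parts, a result obtained by substitution can be checked by differentiating. For the antiderivative of $\sec^4 x$ found above, the chain rule and the identity $\sec^2 x = 1 + \tan^2 x$ give

\[ \frac{d}{dx}\left(\tan x + \frac{\tan^3 x}{3}\right) = \sec^2 x + \tan^2 x \sec^2 x = \sec^2 x\,(1 + \tan^2 x) = \sec^4 x. \]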

If we are not in these two easy cases, then there is no recipe that will always work. We then need to have a separate strategy for each problem. We will show examples of that in Exercises 42.4.6 and 42.4.7.

42.3. Some Other Trigonometric Integrals. If our goal is to compute integrals of the form $\int \cos mx \sin nx\,dx$, $\int \cos mx \cos nx\,dx$, and $\int \sin mx \sin nx\,dx$, then we can often make use of the following identities:

\[ \sin a \cos b = \frac{1}{2}\sin(a - b) + \frac{1}{2}\sin(a + b), \tag{7.8} \]

\[ \cos a \cos b = \frac{1}{2}\cos(a - b) + \frac{1}{2}\cos(a + b), \tag{7.9} \]

\[ \sin a \sin b = \frac{1}{2}\cos(a - b) - \frac{1}{2}\cos(a + b). \tag{7.10} \]


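Example 7.12. Compute $\int \cos 3x \cos 5x\,dx$.

Solution: Using (7.9) with $a = 3x$ and $b = 5x$ and noting that $\cos(-2x) = \cos 2x$, we get that $\cos 3x \cos 5x = \frac{1}{2}\cos 2x + \frac{1}{2}\cos 8x$, and so

\[
\begin{aligned}
\int \cos 3x \cos 5x\,dx &= \int \left(\frac{1}{2}\cos 2x + \frac{1}{2}\cos 8x\right) dx \\
&= \frac{1}{4}\sin 2x + \frac{1}{16}\sin 8x + C.
\end{aligned}
\]

□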

42.4. Exercises.

(1) Compute $\int \sin^3 x\,dx$.
(2) Compute $\int \cos^6 x\,dx$.
(3) Compute $\int \sin^3 x \cos^2 x\,dx$.
(4) Compute $\int \tan^2 x \sec^4 x\,dx$.
(5) Compute $\int \tan^3 x \sec^5 x\,dx$.
(6) Compute $\int \tan^5 x\,dx$ by separating one factor of $\tan^2 x$ in the integrand and expressing it in terms of $\sec^2 x$.
(7) Compute $\int \sec^3 x\,dx$ using integration by parts, with $u'(x) = \sec^2 x$ and $v(x) = \sec x$.

43. Trigonometric Substitution

43.1. Reversing the Technique of Substitutions. Let us assume that we want to compute the area of a circle by viewing one-fourth of that circle as the domain under a curve. Let r be the radius of the circle, and let us place the center of the circle at the origin. Then the northeastern quarter of the circle, shown in Figure 7.1, is just the domain under the graph of the function $f(x) = \sqrt{r^2 - x^2}$, where x ranges from 0 to r. In other words, we need to compute the integral

\[ \int_0^r \sqrt{r^2 - x^2}\,dx. \tag{7.11} \]

In Chapter 5, we presented the technique of integration by substitution. This technique worked in situations when the best way to compute an integral was to define a simple function of x, such as $y(x) = x^2$, and then continue the integration in terms of that new variable y.

In order to compute the integral in (7.11), we use the reverse of the strategy mentioned in the previous paragraph. We define another variable y so that x is a simple function f of y. It is important to define f and y so that f is one-to-one, since that assures that $f(y) = x$ is equivalent to $f^{-1}(x) = y$.

Figure 7.1. The northeastern quadrant of the unit circle.

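In computing the integral in (7.11), we can set $x = r \sin y$. Then $dx/dy = r \cos y$, and the limits of integration are $y = 0$ and $y = \pi/2$. This yields

\[
\begin{aligned}
\int_0^r \sqrt{r^2 - x^2}\,dx &= \int_0^{\pi/2} \sqrt{r^2 - r^2 \sin^2 y}\; r \cos y\,dy \\
&= r^2 \int_0^{\pi/2} \sqrt{1 - \sin^2 y}\,\cos y\,dy \\
&= r^2 \int_0^{\pi/2} \cos^2 y\,dy \\
&= r^2 \left[\frac{y}{2} + \frac{\sin 2y}{4}\right]_0^{\pi/2} \\
&= \frac{r^2 \pi}{4}.
\end{aligned}
\]

Note that we could write cos y for $\sqrt{1 - \sin^2 y}$, since $0 \le y \le \pi/2$, and in that interval, cos y is nonnegative.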

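We point out that by converting the indefinite integral

\[ r^2 \left(\frac{y}{2} + \frac{\sin 2y}{4}\right) \]

back to a function of x, using $y = \sin^{-1}(x/r)$ and $\sin 2y = 2 \sin y \cos y$, we get that

\[ \int \sqrt{r^2 - x^2}\,dx = \frac{r^2}{2} \sin^{-1}\left(\frac{x}{r}\right) + \frac{x}{2}\sqrt{r^2 - x^2} + C. \]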

The result that we are going to compute in the next example will be useful in the next section, when we will learn a technique to integrate rational functions.

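Example 7.13. Compute the integral $\int \frac{1}{(1 + x^2)^2}\,dx$.

Solution: We use the substitution $x = \tan y$. Then $y = \tan^{-1}(x)$, and so $dy/dx = 1/(1 + x^2)$, and hence $dy = dx/(1 + x^2)$. This yields

\[
\begin{aligned}
\int \frac{1}{(1 + x^2)^2}\,dx &= \int \frac{1}{1 + x^2}\,dy \\
&= \int \frac{1}{1 + \tan^2 y}\,dy \\
&= \int \cos^2 y\,dy \\
&= \frac{y}{2} + \frac{\sin 2y}{4} + C \\
&= \frac{y}{2} + \frac{\sin y \cos y}{2} + C \\
&= \frac{1}{2} \cdot \tan^{-1}(x) + \frac{1}{2} \cdot \frac{x}{x^2 + 1} + C.
\end{aligned}
\]

The last step is justified since

\[ \frac{x}{x^2 + 1} = \frac{\tan y}{1 + \tan^2 y} = \tan y \cos^2 y = \sin y \cos y. \tag{7.12} \]

□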

Figure 7.2 illustrates this trigonometric argument.


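Example 7.14. Compute $\int \frac{1}{\sqrt{x^2 - 1}}\,dx$.

Solution: The denominator reminds us of the trigonometric identity $\tan^2 y = \sec^2 y - 1$, and so, if $y \in [0, \pi/2)$, then $\tan y = \sqrt{\sec^2 y - 1}$. Therefore, we use the substitution $x = \sec y$. Then $dx/dy = \tan y \sec y$. Hence, we have

\[
\begin{aligned}
\int \frac{1}{\sqrt{x^2 - 1}}\,dx &= \int \frac{1}{\sqrt{\sec^2 y - 1}}\,dx \\
&= \int \frac{1}{\tan y}\,dx \\
&= \int \frac{\tan y \cdot \sec y}{\tan y}\,dy \\
&= \int \sec y\,dy \\
&= \ln|\sec y + \tan y| + C \\
&= \ln\left|x + \sqrt{x^2 - 1}\right| + C.
\end{aligned}
\]

□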
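The result can be confirmed by differentiation. For $x > 1$, where the absolute value can be dropped, the chain rule gives

\[ \frac{d}{dx} \ln\left(x + \sqrt{x^2 - 1}\right) = \frac{1 + \dfrac{x}{\sqrt{x^2 - 1}}}{x + \sqrt{x^2 - 1}} = \frac{\dfrac{\sqrt{x^2 - 1} + x}{\sqrt{x^2 - 1}}}{x + \sqrt{x^2 - 1}} = \frac{1}{\sqrt{x^2 - 1}}. \]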

Figure 7.3 illustrates this trigonometric argument.

Figure 7.2. Some expressions from (7.12).

Figure 7.3. Some expressions occurring in the solution of Example 7.14.

43.2. Summary of the Most Frequently Used Trigonometric Substitutions. The three examples that we have seen so far in this section show the three most frequently used reverse substitutions. That is,

(i) To compute $\int \sqrt{r^2 - x^2}\,dx$, use the reverse substitution $x = r \sin y$.

(ii) To compute integrals involving $(r^2 + x^2)$ under a root sign or in the denominator of a fraction, use the reverse substitution $x = r \tan y$.

(iii) To compute $\int \sqrt{x^2 - r^2}\,dx$, use the reverse substitution $x = r \sec y$.

Finally, a word of caution. The availability of the method of reverse substitution does not mean that this method is always the best one to compute an integral that contains a square root sign. One of the following exercises can be solved by another method faster (and, no, we are not revealing which one).

43.3. Exercises.

(1) Use the method presented in this section to compute the area of an ellipse determined by the equation

\[ \frac{x^2}{a^2} + \frac{y^2}{b^2} = 1. \]

(2) Compute $\int \sqrt{1 - 4x^2}\,dx$.
(3) Compute $\int \sqrt{1 + x^2}\,dx$.

(4) Compute∫

x√x−5 dx.

(5) Compute $\int \frac{1}{x^2 \sqrt{x^2 - 4}}\,dx$.
(6) Compute $\int \sqrt{x^2 - 2x}\,dx$.


44. Integrating Rational Functions

44.1. Introduction. Recall that a rational function is the ratio of two polynomials, such as

\[ R(x) = \frac{P(x)}{Q(x)} = \frac{3x + 5}{2x^2 + 4x + 9}. \]

Integrating rational functions is relatively simple, because most of these functions can be obtained as sums of even simpler functions. If the degree of P(x) is at least as large as the degree of Q(x), then we can divide P(x) by Q(x), getting a polynomial as a quotient, and possibly a remainder. That is, if the degree of P is at least as large as the degree of Q, then there exist polynomials $P_1(x)$ and $P_2(x)$ such that the degree of $P_2(x)$ is less than the degree of Q(x) and

\[ R(x) = \frac{P(x)}{Q(x)} = \frac{P_1(x)Q(x) + P_2(x)}{Q(x)} = P_1(x) + \frac{P_2(x)}{Q(x)}. \]

As $P_1(x)$ is a polynomial, it is easy to integrate. Therefore, the difficulty of integrating R(x) lies in integrating $\frac{P_2(x)}{Q(x)}$, which is a rational function whose denominator is of higher degree than its numerator.

For this reason, in the rest of this section, we focus on integrating rational functions with that property, that is, when the degree of the denominator is higher than the degree of the numerator.

Example 7.15. Let $R(x) = \frac{x^3 + 2x + 1}{x^2 - x + 1}$. Then dividing P(x) by Q(x) using long division, we get

\[ P(x) = (x + 1)(x^2 - x + 1) + 2x, \]

so

\[ R(x) = \frac{P(x)}{Q(x)} = x + 1 + \frac{2x}{x^2 - x + 1}, \]

and integrating R(x) boils down to integrating $\frac{2x}{x^2 - x + 1}$.
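Polynomial long division is mechanical enough to be written as a short routine. The sketch below is an illustration, not the authors' method: the function name `poly_divmod` and the convention of listing coefficients from the highest degree to the lowest are assumptions made here. It uses exact fractions so that no rounding occurs.

```python
from fractions import Fraction


def poly_divmod(p, q):
    """Divide polynomial p by q and return (quotient, remainder).

    Coefficients are listed from the highest degree to the lowest,
    and the leading coefficient of q is assumed to be nonzero.
    """
    p = [Fraction(c) for c in p]
    q = [Fraction(c) for c in q]
    quotient = []
    while len(p) >= len(q):
        factor = p[0] / q[0]          # next coefficient of the quotient
        quotient.append(factor)
        for i in range(len(q)):       # subtract factor * q, aligned at the leading term
            p[i] -= factor * q[i]
        p.pop(0)                      # the leading coefficient is now zero
    return quotient, p


# P(x) = x^3 + 2x + 1 and Q(x) = x^2 - x + 1, as in Example 7.15.
quotient, remainder = poly_divmod([1, 0, 2, 1], [1, -1, 1])
print(quotient, remainder)
```

For the polynomials of Example 7.15, the quotient has coefficients 1, 1 (that is, x + 1) and the remainder has coefficients 2, 0 (that is, 2x).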

44.2. Breaking Up the Denominator. In order to decide how to break up a rational function R(x) into the sum of simpler terms, we analyze the denominator Q(x) of R(x). A theorem in complex analysis, sometimes called the fundamental theorem of algebra, implies that if q(x) is a polynomial whose coefficients are real numbers, then q(x) can be written as a product of polynomials that are of degree 1 or 2.

This decomposition, or factorization, of Q(x) will determine the way in which we break up our rational function into the sum of simpler terms. There are several cases to distinguish, based on the factorization of Q(x).

44.2.1. Distinct Linear Factors. The easiest case is when Q(x) factors into the product of polynomials of degree 1, and each of these terms occurs only once.


Example 7.16. Compute $\int \frac{1}{x^2 + 3x + 2}\,dx$.

Solution: Note that $x^2 + 3x + 2 = (x + 1)(x + 2)$. Using that observation, we are looking for real numbers A and B such that

\[ \frac{1}{x^2 + 3x + 2} = \frac{A}{x + 1} + \frac{B}{x + 2} \tag{7.13} \]

as functions, that is, such that (7.13) holds for all real numbers x at which both sides are defined. Multiplying both sides by $x^2 + 3x + 2$, we get

\[ 1 = A(x + 2) + B(x + 1). \tag{7.14} \]

Both sides of (7.14) are polynomials, so if (7.14) is to hold as an identity, it must hold for $x = -1$ and $x = -2$ as well. However, if $x = -1$, then (7.14) reduces to 1 = A, and if $x = -2$, then (7.14) reduces to 1 = −B. So we conclude that A = 1 and B = −1 are the numbers we wanted to find. It is now easy to compute the requested integral as follows:

\[
\begin{aligned}
\int \frac{1}{x^2 + 3x + 2}\,dx &= \int \frac{1}{x + 1}\,dx - \int \frac{1}{x + 2}\,dx \\
&= \ln|x + 1| - \ln|x + 2| + C.
\end{aligned}
\]

□

The above method can always be applied if Q(x) factors into a product of linear polynomials, each of which occurs only once. In particular, if Q(x) decomposes as $a(x - a_1)(x - a_2) \cdots (x - a_k)$, then we can decompose R(x) into a sum of the form

\[ \frac{A_1}{x - a_1} + \frac{A_2}{x - a_2} + \cdots + \frac{A_k}{x - a_k}. \]

After determining the numbers $A_i$, we can integrate each of the above k summands.
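The substitution trick used in Example 7.16 (plugging each root into the cleared equation) works for any number of distinct linear factors, and it can be sketched as a small routine. Everything named below is an assumption made for illustration: the function names `eval_poly` and `partial_fractions_distinct`, the coefficient order (highest degree first), and the `leading` parameter for the constant a. The routine assumes that the roots are distinct and that the numerator has lower degree than the denominator.

```python
from fractions import Fraction


def eval_poly(coeffs, x):
    """Evaluate a polynomial by Horner's rule; coefficients from highest degree down."""
    result = Fraction(0)
    for c in coeffs:
        result = result * x + c
    return result


def partial_fractions_distinct(numerator, roots, leading=1):
    """Return the numbers A_i in P(x) / (a (x - a_1) ... (x - a_k)) = sum of A_i / (x - a_i).

    Setting x = a_i after clearing denominators leaves only the i-th term, so
    A_i = P(a_i) / (a * product of (a_i - a_j) over j != i).
    """
    coefficients = []
    for i, r in enumerate(roots):
        denom = Fraction(leading)
        for j, s in enumerate(roots):
            if j != i:
                denom *= (r - s)
        coefficients.append(eval_poly(numerator, Fraction(r)) / denom)
    return coefficients


# 1 / ((x + 1)(x + 2)): numerator 1, roots -1 and -2.
A = partial_fractions_distinct([1], [-1, -2])
print(A)
```

For the integrand 1/((x + 1)(x + 2)), the routine returns the coefficients 1 and −1, so the decomposition is 1/(x + 1) − 1/(x + 2).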

44.2.2. Repeated Linear Factors. The next case is when Q(x) factors into linear terms, but some of these terms occur more than once.


Example 7.17. Compute $\int \frac{2x + 7}{(x + 1)^2 (x - 1)}\,dx$.

Solution: Just as in the previous case, we decompose the integrand into a sum of simpler fractions. We are looking for real numbers A, B, and C such that

\[ \frac{2x + 7}{(x + 1)^2 (x - 1)} = \frac{A}{x + 1} + \frac{B}{(x + 1)^2} + \frac{C}{x - 1}. \]

Multiplying both sides by the denominator of the left-hand side, we get

\[ 2x + 7 = A(x + 1)(x - 1) + B(x - 1) + C(x + 1)^2. \]

Substituting x = 1 in the last displayed equation yields 9 = 4C, so C = 2.25. Substituting x = −1 yields 5 = −2B, so B = −2.5. Finally, the coefficient of $x^2$ on the left-hand side is 0, while on the right-hand side, it is A + C. So A + C = 0, yielding A = −2.25.

Now we are in a position to compute the requested integral.

\[
\begin{aligned}
\int \frac{2x + 7}{(x + 1)^2 (x - 1)}\,dx &= \int \frac{-2.25}{x + 1}\,dx + \int \frac{-2.5}{(x + 1)^2}\,dx + \int \frac{2.25}{x - 1}\,dx \\
&= -2.25 \ln|x + 1| + \frac{2.5}{x + 1} + 2.25 \ln|x - 1| + C.
\end{aligned}
\]

□

In general, if a term $(x + a)^k$ occurs in Q(x), then the partial fraction decomposition of R(x) will contain one term with denominator $(x + a)^i$ for each $i \in \{1, 2, \ldots, k\}$. For instance, if $Q(x) = (x + 2)^3 (x + 5)^2 (x - 10)$, then R(x) will have a partial fraction decomposition of the form

\[ \frac{A_1}{x + 2} + \frac{A_2}{(x + 2)^2} + \frac{A_3}{(x + 2)^3} + \frac{A_4}{x + 5} + \frac{A_5}{(x + 5)^2} + \frac{A_6}{x - 10}. \]

44.2.3. Distinct Quadratic Factors. The third case is when the factorization of Q(x) contains some quadratic factors that are irreducible (i.e., they are not the product of two linear polynomials with real coefficients), but none of these irreducible quadratic factors occurs more than once. In that case, after obtaining the partial fraction decomposition of R(x), we may have to resort to the formulas

\[ \int \frac{1}{x^2 + 1}\,dx = \tan^{-1} x + C \]

and

\[ \int \frac{1}{x^2 + a^2}\,dx = \frac{1}{a} \tan^{-1}\left(\frac{x}{a}\right) + C. \]


Example 7.18. Compute $\int \frac{4x + 2}{x^3 + x^2 + x + 1}\,dx$.

Solution: It is easy to notice that setting x = −1 turns the denominator to 0; hence, the denominator is divisible by x + 1. Dividing the denominator by x + 1, we get $x^2 + 1$, so the denominator factors as $(x + 1)(x^2 + 1)$. The factor $x^2 + 1$ is irreducible (it is not divisible by x − b for any real number b, since no real number b satisfies the equation $b^2 + 1 = 0$).

Therefore, we are looking for real numbers A, B, and C such that

\[ \frac{4x + 2}{x^3 + x^2 + x + 1} = \frac{A}{x + 1} + \frac{B}{x^2 + 1} + \frac{Cx}{x^2 + 1}. \tag{7.15} \]

The reader is invited to verify that the third summand of the right-hand side is necessary; that is, if the summand $\frac{Cx}{x^2 + 1}$ is removed, then no pair of real numbers (A, B) will satisfy (7.15).

In order to find the correct values of A, B, and C, multiply both sides of (7.15) by $(x + 1) \cdot (x^2 + 1)$ and rearrange, to get

\[ 4x + 2 = (A + C)x^2 + (B + C)x + A + B. \]

The coefficient of $x^2$ is 0 on the left-hand side, so it has to be 0 on the right-hand side. Therefore, A + C = 0. Similarly, the coefficient of x is 4 on the left-hand side, so it has to be 4 on the right-hand side, forcing B + C = 4. Similarly, the constant terms of the two sides have to be equal, and, consequently, A + B = 2. Solving this system of equations, we get A = −1, B = 3, and C = 1. Therefore,

\[
\begin{aligned}
\int \frac{4x + 2}{x^3 + x^2 + x + 1}\,dx &= \int -\frac{1}{x + 1}\,dx + 3\int \frac{1}{x^2 + 1}\,dx + \int \frac{x}{x^2 + 1}\,dx \\
&= -\ln|x + 1| + 3 \tan^{-1} x + \frac{1}{2} \ln(x^2 + 1) + C.
\end{aligned}
\]

□
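For completeness, here is one way to solve the system $A + C = 0$, $B + C = 4$, $A + B = 2$ that arises in this example. The first two equations give $A = -C$ and $B = 4 - C$. Substituting these into the third equation yields

\[ A + B = -C + (4 - C) = 4 - 2C = 2, \]

so $C = 1$, and therefore $A = -1$ and $B = 3$.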

In general, if $x^2 + ax + b$ is a quadratic factor in Q(x), then the partial fraction decomposition will contain a summand of the form $\frac{E}{x^2 + ax + b}$ and a summand of the form $\frac{Fx}{x^2 + ax + b}$. Again, the latter is necessary, since a rational fraction of the form $\frac{E + Fx}{x^2 + ax + b}$ will not equal one of the form $\frac{E}{x^2 + ax + b}$ for any choice of E if $F \ne 0$.

44.2.4. Repeated Quadratic Factors. Finally, it can happen that the factorization of Q(x) contains irreducible quadratic factors, some of which occur more than once.


Example 7.19. Compute $\int \frac{x^3 + 2x^2 + 3x + 7}{x^4 + 2x^2 + 1}\,dx$.

Solution: It is easy to see that the denominator factors as $(x^2 + 1)^2$. Hence, we are looking for real numbers A, B, C, and D such that

\[ \frac{x^3 + 2x^2 + 3x + 7}{x^4 + 2x^2 + 1} = \frac{A}{x^2 + 1} + \frac{Bx}{x^2 + 1} + \frac{C}{(x^2 + 1)^2} + \frac{Dx}{(x^2 + 1)^2}. \]

Multiplying both sides by $x^4 + 2x^2 + 1$ and rearranging, we get

\[ x^3 + 2x^2 + 3x + 7 = Bx^3 + Ax^2 + (B + D)x + (A + C). \]

For each k, the coefficients of $x^k$ must be the same on both sides. Hence, A = 2 and B = 1, so C = 5 and D = 2.

Now we can compute the requested integral using the preceding partial fraction decomposition as follows:

\[
\begin{aligned}
\int \frac{x^3 + 2x^2 + 3x + 7}{x^4 + 2x^2 + 1}\,dx
&= \int \left(\frac{2}{x^2 + 1} + \frac{x}{x^2 + 1} + \frac{5}{(x^2 + 1)^2} + \frac{2x}{(x^2 + 1)^2}\right) dx \\
&= 2 \tan^{-1} x + \frac{1}{2} \ln(x^2 + 1) + \frac{5x}{2(x^2 + 1)} + \frac{5}{2} \tan^{-1} x - \frac{1}{x^2 + 1} + C \\
&= \frac{1}{2} \ln(x^2 + 1) + 4.5 \tan^{-1} x + \frac{1}{2} \cdot \frac{5x - 2}{x^2 + 1} + C.
\end{aligned}
\]

Here we used the formula for $\int \frac{1}{(x^2 + 1)^2}\,dx$ that we computed in the last section, in Example 7.13. □
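The only term in this computation that does not come from a formula stated earlier is the last one. It can be obtained by the substitution $u = x^2 + 1$, for which $du = 2x\,dx$:

\[ \int \frac{2x}{(x^2 + 1)^2}\,dx = \int \frac{1}{u^2}\,du = -\frac{1}{u} + C = -\frac{1}{x^2 + 1} + C. \]

Similarly, multiplying the result of Example 7.13 by 5 gives

\[ \int \frac{5}{(x^2 + 1)^2}\,dx = \frac{5}{2} \tan^{-1} x + \frac{5x}{2(x^2 + 1)} + C. \]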

By now, the reader must know what the general version of the technique of the preceding example is. If the factorization of Q(x) contains $(x^2 + ax + b)^k$, then, for each integer i such that $1 \le i \le k$, the partial fraction decomposition of R(x) will contain a summand of the form $\frac{E_i}{(x^2 + ax + b)^i}$ and a summand of the form $\frac{F_i x}{(x^2 + ax + b)^i}$.

44.3. Rationalizing Substitutions. There are situations when a function that is not a rational function can be turned into one by an appropriate substitution, and then it can be integrated by the methods presented in this section. The most frequent scenario in which this happens is when the integrand contains roots, but if those roots are replaced by another variable, we get a rational function in that other variable.

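Example 7.20. Compute $\int \frac{\sqrt{x}}{\sqrt{x}+1}\,dx$.

Solution: We use the substitution $\sqrt{x} = y$. Then $dy/dx = \frac{1}{2\sqrt{x}} = \frac{1}{2y}$. This leads to

\[
\begin{aligned}
\int \frac{\sqrt{x}}{\sqrt{x}+1}\,dx &= \int \frac{y}{y+1}\,2y\,dy \\
&= \int \frac{2y^2}{y+1}\,dy \\
&= \int \left( 2(y-1) + \frac{2}{y+1} \right) dy \\
&= y^2 - 2y + 2\ln(y+1) \\
&= x - 2\sqrt{x} + 2\ln(\sqrt{x}+1).
\end{aligned}
\]

□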

Note that the computation would have been very similar if the integrand contained some other root of $x$. Indeed, if the integrand contained $\sqrt[r]{x}$ instead of $\sqrt{x}$, then we would have substituted $y = \sqrt[r]{x}$, and that would have turned the integrand into a rational function of $y$. Indeed, $y = x^{1/r}$ implies

\[
\frac{dy}{dx} = \frac{x^{(1/r)-1}}{r}, \qquad \text{that is,} \qquad \frac{dy}{dx} = \frac{y^{1-r}}{r},
\]

and therefore $dx = r y^{r-1}\,dy$.

In other words, $dx$ is equal to $dy$ times a polynomial function of $y$, so, indeed, the integrand will be a rational function of $y$.
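To see the change of variables at work numerically, one can compare a definite integral in $x$ with the corresponding integral in $y$. The sketch below is only an illustration for the case $r = 2$ of Example 7.20; the interval $[1, 4]$, the helper `midpoint_rule`, and the number of subintervals are our own assumptions.

```python
import math

def midpoint_rule(func, a, b, n=20000):
    """Plain midpoint-rule approximation of the integral of func over [a, b]."""
    h = (b - a) / n
    return h * sum(func(a + (k + 0.5) * h) for k in range(n))

def closed_form(x):
    """Antiderivative found in Example 7.20."""
    return x - 2 * math.sqrt(x) + 2 * math.log(math.sqrt(x) + 1)

# Integral in the original variable x over [1, 4].
in_x = midpoint_rule(lambda x: math.sqrt(x) / (math.sqrt(x) + 1), 1.0, 4.0)

# Same integral after y = sqrt(x), dx = 2y dy; the limits become [1, 2].
in_y = midpoint_rule(lambda y: 2 * y**2 / (y + 1), 1.0, 2.0)

exact = closed_form(4.0) - closed_form(1.0)

if __name__ == "__main__":
    print(in_x, in_y, exact)
```

All three numbers should agree to several decimal places, which reflects that the substitution changes the form of the integrand but not the value of the integral.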

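44.4. Exercises.

(1) Compute $\int \frac{5}{x^2+5x+4}\,dx$.

(2) Compute $\int \frac{3}{x^3+x^2-x-1}\,dx$.

(3) Compute $\int \frac{1}{x^4+5x^2+4}\,dx$.

(4) Compute $\int \frac{1}{x^4+4x^2+4}\,dx$.

(5) Compute $\int \frac{x^2+4}{x^3-x}\,dx$.

(6) Compute $\int \frac{x+3}{x^3+1}\,dx$.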

(7) Compute $\int \frac{\sqrt{x}+1}{\sqrt{x}-3}\,dx$.

(8) Compute $\int \frac{\sqrt[3]{x}}{\sqrt[3]{x}+2}\,dx$.

45. Strategy of Integration

We presented various integration techniques in this chapter and in some preceding chapters. The most general ones were integration by parts and integration by substitution. The most frequently studied special cases were related to trigonometric functions and their inverses. Reverse substitution came up in some special cases. We also discussed the integration of rational functions, using the technique of partial fractions.

In short, we have learned a decent number of methods. For this very reason, it is sometimes not obvious which method we should use when trying to integrate a function. While there is no general rule, in this section we will provide a few guidelines.

So let $f$ be a function that is not equal to one of the functions whose integral we either know offhand or have a deterministic method to compute. That is, $f$ is not a polynomial, $f$ is not a rational function, $f$ is not the function $f(x) = a^x$ or $f(x) = \log_a x$ for some positive real number $a$, and $f$ is not one of the basic trigonometric functions like $\sin x$ or $\tan x$. Let us also assume that simple algebra will not help, that is, that $f$ cannot be transformed into one of these elementary functions by simple algebraic transformations. Then how do we decide which method to use?

45.1. Substitution. The method that needs the least amount of work, when it is available, is a simple substitution, so it is reasonable to try to use that method first. There is a particularly good chance for this approach to work when $f$ is the composition of two functions, one of which has a constant derivative, or when $f$ is of the form $f(x) = h'(g(x))g'(x)$, since then $f(x) = \frac{d}{dx}h(g(x))$, and so $\int f(x)\,dx = h(g(x))$. In the language of substitutions, this means that substituting $y = g(x)$ will work, since

\[
\int f\,dx = \int h'(g(x))g'(x)\,dx = \int h'(y)\frac{dy}{dx}\,dx = \int h'(y)\,dy.
\]

In other words, the integral of the composite function $f$ is turned into something simpler, the integral of the function $h'$.

Example 7.21. Let $f(x) = \sin 2x$. Then, using the substitution $y = 2x$, we get that

\[
\int \sin 2x\,dx = \frac{1}{2}\int \sin y\,dy = -\frac{1}{2}\cos y = -\frac{1}{2}\cos 2x.
\]

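Example 7.22. Let $f(x) = \frac{3x}{x^2+1}$. Then we set $y = x^2+1$, so $dy/dx = 2x$. This leads to

\[
\begin{aligned}
\int \frac{3x}{x^2+1}\,dx &= \int \frac{3x}{y}\,dx \\
&= \frac{3}{2}\int \frac{1}{y}\,dy \\
&= \frac{3}{2}\ln y \\
&= \frac{3}{2}\ln(x^2+1).
\end{aligned}
\]

The reader should compute the integral $\int \sin^n x \cos x\,dx$ at this point.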


45.2. Integration by Parts. If $f$ is the product of two functions, but substitution does not seem to help, then integration by parts is the logical next step. This technique is particularly useful when one of the two functions whose product is $f$ is made significantly simpler by differentiation.

Compute $\int xe^{-x}\,dx$.

Solution: Considering the integrand, we notice that substitution is unlikely to help, since $x$ and $e^{-x}$ are not closely related. On the other hand, the integrand is a product, and one of the factors, $x$, is made simpler by differentiation. Therefore, we choose the technique of integration by parts, with $u = x$ and $v' = e^{-x}$. Then $u' = 1$, while $v = -e^{-x}$, and we get

\[
\begin{aligned}
\int xe^{-x}\,dx &= -xe^{-x} - \int \left(-e^{-x}\right) dx \\
&= -xe^{-x} - e^{-x} \\
&= -(1+x)e^{-x}.
\end{aligned}
\]

□

45.3. Radicals. If the two most general methods (substitution and integration by parts) are not helpful, then it is quite possible that there is a root sign in the integrand. In that case, there are two specific methods that we can try, reverse substitution and rationalizing substitution. The easiest way to know when to use each of these two methods is to remember the relatively few cases in which reverse substitution works directly. As we have seen, these are the integrals involving $\int \frac{1}{r^2+x^2}$, $\int \sqrt{r^2-x^2}$, and $\int \sqrt{x^2-r^2}$. If the integrand does not contain any of these functions, then it may be simpler to use a rationalizing substitution, such as in computing the integral $\int \frac{\sqrt{x}}{\sqrt{x}+4}\,dx$. The exercises at the end of this section will ask the reader to decide which method to use for a few specific examples.

45.4. If Everything Else Fails. If none of our methods work, then it may be that an unexpected transformation of the integrand may help, at least with relating the integral to one that is not quite as challenging to compute.

Compute $\int \frac{\sin^4 x}{\tan^2 x}\cos^2 2x\,dx$.

Solution: We use the trigonometric identity $\tan x = \sin x/\cos x$ to rewrite the integrand. We get

\[
\begin{aligned}
\int \frac{\sin^4 x}{\tan^2 x}\cos^2 2x\,dx &= \int \sin^2 x \cos^2 x \cos^2 2x\,dx \\
&= \int \frac{1}{4}\sin^2 2x \cos^2 2x\,dx \\
&= \int \frac{1}{16}\sin^2 4x\,dx,
\end{aligned}
\]

which is easy to integrate with the method we learned for powers of trigonometric functions. □

It is important to point out that sometimes there is indeed no solution; that is, there exist elementary functions $f$ such that there is no elementary function $F(x)$ satisfying $F'(x) = f(x)$. Examples of such functions $f$ include $e^{x^2}$, $e^x/x$, and $\frac{1}{\ln x}$.

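45.5. Exercises.

In all of the exercises below, compute the integral.

(1) $\int x^2 e^{-5x}\,dx$.

(2) $\int \frac{x}{\sqrt{x^2-1}}\,dx$.

(3) $\int \frac{x+1}{x^3+1}\,dx$.

(4) $\int e^{\sin x}\cos x\,dx$.

(5) $\int \frac{e^x}{e^x-1}\,dx$.

(6) $\int \frac{1}{\sqrt{x^2+4}}\,dx$.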

(7) $\int \frac{1}{\sqrt{x}+4}\,dx$.

(8) $\int \frac{1}{x\ln x}\,dx$.

(9) $\int \frac{1}{x\ln x\cdot\ln(\ln x)}\,dx$.

46. Integration Using Tables and Software Packages

46.1. Tables of Integrals. Tables of integrals can be found in many calculus textbooks and on the Internet. The website www.integral-table.com is a good example, and we will use it as a reference in this subsection. (When we say “the table of integrals,” we mean that table.)

No matter how extensive a table of integrals is, it cannot contain all integrals. It is therefore important to know how to use these tables to compute integrals that are not contained in the tables in the same form.

The easiest case is when the integral to be computed is a special case of a more general integral that is in the table.


Example 7.25. Use the table of integrals to compute $\int \ln(x^2+9)\,dx$.

Solution: Looking at the table of integrals found at the website www.integral-table.com, we find that the integrand is a special case of integral (45), with $a = 3$. Using the formula given in the table for the general case with $a = 3$, we get the result

\[
\int \ln(x^2+9)\,dx = x\ln(x^2+9) + 6\tan^{-1}\left(\frac{x}{3}\right) - 2x + C.
\]

□

Sometimes, we have to resort to integration by substitution to be able to use the table of integrals.

Example 7.26. Use the table of integrals to compute $\int \sqrt{9-4x^2}\,dx$.

Solution: Taking a look at the table of integrals, we find that formula (30) provides a formula for $\int \sqrt{a^2-x^2}\,dx$. In order to be able to use that formula, we set $y = 2x$, which implies $dy/dx = 2$. Therefore,

\[
\int \sqrt{9-4x^2}\,dx = \frac{1}{2}\int \sqrt{9-y^2}\,dy.
\]

Now we can apply formula (30) from the table of integrals, to get

\[
\begin{aligned}
\frac{1}{2}\int \sqrt{9-y^2}\,dy &= \frac{y}{4}\sqrt{9-y^2} + \frac{9}{4}\tan^{-1}\left(\frac{y}{\sqrt{9-y^2}}\right) + C \\
&= \frac{x}{2}\sqrt{9-4x^2} + \frac{9}{4}\tan^{-1}\left(\frac{2x}{\sqrt{9-4x^2}}\right) + C.
\end{aligned}
\]

□

Sometimes we need to carry out some algebraic manipulation before we can use the technique of substitutions in connection with using the table of integrals.

Example 7.27. Use the table of integrals to compute $\int \frac{x^2+2x+1}{\sqrt{x^2+2x+10}}\,dx$.

Solution: There is no integral in the table of integrals that would immediately stand out as one that is very similar to this one. The crucial observation is that the substitution $y = x+1$ significantly simplifies our integrand. That substitution leads to the integral $\int \frac{y^2}{\sqrt{y^2+3^2}}\,dy$, which, in turn, can be directly found in the table of integrals as item (36). Substituting $x$ back into the obtained formula, we get

\[
\int \frac{x^2+2x+1}{\sqrt{x^2+2x+10}}\,dx = \frac{(x+1)\sqrt{x^2+2x+10}}{2} - \frac{9}{2}\ln\left| x + 1 + \sqrt{x^2+2x+10} \right| + C.
\]

□

46.2. Software Packages. Computer software packages such as Maple and Mathematica are very useful tools of integration. These packages will compute the definite or indefinite integrals of a large class of functions, then they will present the results in a form that is usually, but not always, the form the user expected. In this section, we show a few examples of these unexpected results and explain how to interpret them.

To start with a very basic example, type

int(x^2+3*x,x);

into Maple. We get the answer $\frac{1}{3}x^3 + \frac{3}{2}x^2$. This is the correct answer having constant term 0. Experimenting with other functions, we note that Maple always answers in this way, that is, without the constant $C$ at the end. It is important not to forget this if we will be using the obtained function in some further computation.

Maple does not always provide the simplest form for the integral that it computes. For instance, if we ask Maple to compute the indefinite integral $\int x(x^2+3)^6\,dx$, we get the output

\[
\frac{1}{14}x^{14} + \frac{3}{2}x^{12} + \frac{27}{2}x^{10} + \frac{135}{2}x^8 + \frac{405}{2}x^6 + \frac{729}{2}x^4 + \frac{729}{2}x^2. \tag{7.16}
\]

However, it is very easy to compute $\int x(x^2+3)^6\,dx$ by hand, using the substitution $u = x^2$. That substitution leads to the same solution, but in a much simpler form, namely, $\frac{1}{14}(x^2+3)^7$. If we want to verify that this result indeed agrees with the one given by Maple and displayed in (7.16), we can ask Maple to expand the expression $H = \frac{1}{14}(x^2+3)^7$ using the expand command. We see that the expanded expression indeed agrees with the one given in (7.16), up to the constant terms at the end.
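The same comparison can be reproduced without a computer algebra system. The sketch below is only an illustration in Python (it is not Maple code, and the names `expanded_coefficients` and `maple_output` are our own): it expands $\frac{1}{14}(x^2+3)^7$ with the binomial theorem in exact rational arithmetic and compares the result with the coefficients listed in (7.16).

```python
from fractions import Fraction
from math import comb

def expanded_coefficients():
    """Coefficients of (x^2 + 3)^7 / 14, keyed by the exponent of x."""
    return {2 * k: Fraction(comb(7, k) * 3 ** (7 - k), 14) for k in range(8)}

# Coefficients of the output displayed in (7.16), keyed by the exponent of x.
maple_output = {
    14: Fraction(1, 14),
    12: Fraction(3, 2),
    10: Fraction(27, 2),
    8: Fraction(135, 2),
    6: Fraction(405, 2),
    4: Fraction(729, 2),
    2: Fraction(729, 2),
}

if __name__ == "__main__":
    coefficients = expanded_coefficients()
    agree = all(coefficients[power] == value
                for power, value in maple_output.items())
    # The two answers differ only in the constant term of the expansion.
    print(agree, coefficients[0])
```

The only coefficient without a counterpart in (7.16) is the constant term, which is absorbed by the constant of integration.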

There are other commands like expand that are useful if we want to transform the output of an integration software package. The commands rationalize and simplify are examples of these.

There is often more than one way to express integrals involving hyperbolic functions. A striking example is the following. If we ask Maple to compute $\int \frac{1}{1-x^2}\,dx$ by typing

int(1/(1-x^2),x);

Maple returns the answer $\tanh^{-1} x$. This is very surprising, since it is not difficult to integrate the integrand as a rational function, with no hyperbolic functions involved. Indeed,

\[
\frac{1}{1-x^2} = \frac{0.5}{x+1} - \frac{0.5}{x-1},
\]

and so

\[
\int \frac{1}{1-x^2}\,dx = \frac{1}{2}\ln|x+1| - \frac{1}{2}\ln|x-1| = \ln\sqrt{\left|\frac{x+1}{x-1}\right|}. \tag{7.17}
\]

The result obtained by Maple actually agrees with the result given in (7.17) for $|x| < 1$, even if that is not obvious. Indeed, if $y = \tanh^{-1} x$, then, by definition, $x = (e^y - e^{-y})/(e^y + e^{-y})$. Solving this equation for $y$ is not completely trivial, but at the end, it yields $y = \ln\sqrt{\frac{1+x}{1-x}}$, which is the right-hand side of (7.17) when $|x| < 1$. (Hint: Multiply both the numerator and the denominator by $e^y$ to get $x = (e^{2y} - 1)/(e^{2y} + 1)$.)

Finally, when computing definite integrals, Maple sometimes answers by using the names of some rare functions. For instance, if we want to compute $\int_0^1 \sin(x^2)\,dx$ by typing

int(sin(x^2),x=0..1);

then we get the answer

(1/2)*FresnelS(sqrt(2)/sqrt(Pi))*sqrt(2)*sqrt(Pi).

Here FresnelS refers to the Fresnel sine integral, a concept beyond the scope of this book. If we simply want to know a numerical value for $\int_0^1 \sin(x^2)\,dx$, we can type

evalf(int(sin(x^2),x=0..1));

instead. Maple outputs 0.3102683017 as the answer.

46.3. Exercises.

(1) Use the table of integrals to compute $\int x\ln(3x+5)\,dx$.

(2) Use the table of integrals to compute $\int x\cos\left(\frac{\pi}{2} - x\right)dx$.

(3) Use the table of integrals to compute∫

x+7x+4x+5dx.

(4) Use the table of integrals to compute $\int \frac{x^2}{\sqrt{x^6+16}}\,dx$.

(5) Use your favorite software package to compute $\int xe^x\,dx$, $\int x^2e^x\,dx$, and $\int x^3e^x\,dx$. Do you see a pattern? Try to guess what $\int x^4e^x\,dx$ is, then verify your guess by using your software package again.

(6) Let $f(x) = \frac{1}{\sqrt{1-x^2}}$ and let $g(x) = \sqrt{\frac{1}{1-x^2}}$. Clearly, $f(x) = g(x)$ for all $x$ where these functions are defined. Compute the integral of both functions with Maple. If the results seem different, show that they are in fact equal.


47. Approximate Integration

Sometimes it is not possible to find the exact value of a definite integral $\int_a^b f(x)\,dx$. It could happen that we cannot find the antiderivative of $f(x)$ or that the antiderivative of $f(x)$ is not an elementary function. Or it could happen that $f$ itself is not given by a formula, but instead, $f$ is given by its graph, which is plotted by a computer program. In this case, we resort to methods of approximate integration.

47.1. Basic Approximation Methods. The key observation behind the approximation methods is the fact that if $f(x) \ge 0$ for $x \in [a, b]$, then $\int_a^b f(x)\,dx$ is equal to the area below the graph of the function $f$ on that interval. More precisely, $\int_a^b f(x)\,dx$ is equal to the area of the domain bordered by the horizontal axis, the vertical lines $x = a$ and $x = b$, and the graph of $f$.

In order to estimate the area of this domain $D$, we cut $D$ into small vertical strips. To do so, we choose real numbers

\[
a = x_0 < x_1 < \cdots < x_n = b
\]

and then consider the vertical lines $x = x_i$ for all $i$. There are several ways in which we can estimate the area of each strip. Let $S_i$ be the strip bordered by the lines $x = x_{i-1}$ and $x = x_i$.

We could say that the area of $S_i$ is roughly equal to the area of the rectangle whose base is the interval $[x_{i-1}, x_i]$ and whose height is $f(x_{i-1})$. This is called the left-endpoint approximation. Or we could say that the area of $S_i$ is roughly equal to the area of the rectangle whose base is the interval $[x_{i-1}, x_i]$ and whose height is $f(x_i)$. This is called the right-endpoint approximation. Or we could use $f(\bar{x}_i)$, the value that $f$ takes at the midpoint $\bar{x}_i = (x_{i-1} + x_i)/2$ of the interval $[x_{i-1}, x_i]$. This is called the midpoint approximation. Summing over the allowed values of $i$ (i.e., $i$ ranges from 1 to $n$), we see that these three methods provide the following three estimates for $\int_a^b f(x)\,dx$.

(i) The left-endpoint approximation shows

\[
\int_a^b f(x)\,dx \approx \sum_{i=1}^n (x_i - x_{i-1}) f(x_{i-1}). \tag{7.18}
\]

(ii) The right-endpoint approximation shows

\[
\int_a^b f(x)\,dx \approx \sum_{i=1}^n (x_i - x_{i-1}) f(x_i).
\]

(iii) The midpoint approximation shows

\[
\int_a^b f(x)\,dx \approx \sum_{i=1}^n (x_i - x_{i-1}) f(\bar{x}_i),
\]

where $\bar{x}_i = (x_{i-1} + x_i)/2$ is the midpoint of $[x_{i-1}, x_i]$.

As the above formulas suggest, it will be particularly easy to work with these formulas if the points $x_1, x_2, \ldots, x_{n-1}$ split the interval $[a, b]$ into equal parts, since in that case $x_i - x_{i-1} = (b-a)/n$ for all $i$.

Example 7.28. Find the approximate value of $\int_0^1 e^{x^2}\,dx$.

Solution: Let us use the left-endpoint method with $n = 4$ and $x_1$, $x_2$, and $x_3$ splitting $[0, 1]$ into four equal parts. That means that $x_1 = 1/4$, $x_2 = 1/2$, and $x_3 = 3/4$. Then (7.18) implies

\[
\int_0^1 e^{x^2}\,dx \approx \frac{1}{4}\left(e^0 + e^{1/16} + e^{1/4} + e^{9/16}\right) \approx 1.2759,
\]

since $x_i - x_{i-1} = 1/4$ for all $i$.

If we use the right-endpoint method, with the same set of points $x_i$, we get

\[
\int_0^1 e^{x^2}\,dx \approx \frac{1}{4}\left(e^{1/16} + e^{1/4} + e^{9/16} + e\right) \approx 1.7055.
\]

It is not surprising that the second method yields the larger result, since the integrand $e^{x^2}$ is an increasing function on $[0, 1]$, so $f(x_i) > f(x_{i-1})$.

Figure 7.4. Left-endpoint method.

Figure 7.5. Right-endpoint method.

Furthermore, for each point $x \in [x_{i-1}, x_i]$, we have $f(x_{i-1}) \le f(x) \le f(x_i)$. So the left-endpoint method underestimates the area of each strip $S_i$, while the right-endpoint method overestimates it. Therefore, the area of $D$, and hence the correct value of $\int_0^1 e^{x^2}\,dx$, is between the two values of 1.2759 and 1.7055 computed above. □

Replacing the value of $n = 4$ by some larger number will result in a more precise approximation (and more work). Using the midpoint method will result in an approximation $A$ that is closer to the actual value of the integral $\int_0^1 e^{x^2}\,dx$, but it is not completely obvious from which side $A$ approximates $\int_0^1 e^{x^2}\,dx$, that is, whether $A < \int_0^1 e^{x^2}\,dx$ or $A > \int_0^1 e^{x^2}\,dx$.

47.2. More Advanced Approximation Methods.

47.2.1. Trapezoid Method. If the difference between $f(x_{i-1})$ and $f(x_i)$ is large, then estimating the area of $S_i$ by using rectangles could lead to large errors. A more refined approach is to estimate the area of $S_i$ by computing the area of the trapezoid whose vertices are the points $(x_{i-1}, 0)$, $(x_i, 0)$, $(x_i, f(x_i))$, and $(x_{i-1}, f(x_{i-1}))$. We know that the area of this trapezoid is the average length of its parallel sides times the distance of those parallel sides from each other, that is, $\frac{(f(x_i) + f(x_{i-1}))(x_i - x_{i-1})}{2}$.

Figure 7.6. Trapezoid method.

Summing over all possible values of $i$, we get an estimate for the entire area of $D$, that is, for $\int_a^b f(x)\,dx$. Indeed, we obtain the formula

\[
\int_a^b f(x)\,dx \approx \sum_{i=1}^n \frac{(f(x_i) + f(x_{i-1}))(x_i - x_{i-1})}{2}.
\]

In particular, if the $x_i$ are chosen so that they split the interval $[a, b]$ into $n$ equal parts, then the last displayed formula simplifies to

\[
\begin{aligned}
\int_a^b f(x)\,dx &\approx \frac{b-a}{2n}\sum_{i=1}^n \left(f(x_i) + f(x_{i-1})\right) \\
&= \frac{b-a}{2n}\left(f(a) + 2f(x_1) + 2f(x_2) + \cdots + 2f(x_{n-1}) + f(b)\right).
\end{aligned}
\]

Note that $f(x_0) = f(a)$ and $f(x_n) = f(b)$ occur only once in the sum in the last line since $a$ and $b$ are each part of only one of the intervals $[x_{i-1}, x_i]$.

Example 7.29. Use the trapezoid method with $n = 4$ to find the approximate value of $\int_0^1 e^{x^2}\,dx$.

Solution: We will have $x_1 = 1/4$, $x_2 = 1/2$, and $x_3 = 3/4$, just as in Example 7.28. This yields

\[
\int_0^1 e^{x^2}\,dx \approx \frac{1}{8}\left(e^0 + 2e^{1/16} + 2e^{1/4} + 2e^{9/16} + e\right) \approx 1.4907.
\]

□


Figure 7.7. Trapezoid method.

The alert reader may have noticed that the result we obtained is precisely the average of the left-endpoint and right-endpoint approximations we obtained for the same integral in the previous section. (This means that it is a better approximation than at least one of the two earlier ones.) This is not an accident, and in Exercise 47.4.5, the reader will be asked to prove that, under certain conditions, this phenomenon will always occur.
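The averaging phenomenon just described can be observed numerically before it is proved. The sketch below is an illustration under the assumption of equally spaced points; `trapezoid` and `endpoint_sum` are hypothetical helper names introduced here.

```python
import math

def trapezoid(f, a, b, n):
    """Trapezoid-rule estimate with n equal subintervals."""
    h = (b - a) / n
    interior = sum(f(a + i * h) for i in range(1, n))
    return h / 2 * (f(a) + 2 * interior + f(b))

def endpoint_sum(f, a, b, n, right=False):
    """Left-endpoint (default) or right-endpoint rectangle estimate."""
    h = (b - a) / n
    start = 1 if right else 0
    return h * sum(f(a + i * h) for i in range(start, start + n))

def f(x):
    """Integrand of Example 7.29."""
    return math.exp(x ** 2)

if __name__ == "__main__":
    t4 = trapezoid(f, 0.0, 1.0, 4)
    average = (endpoint_sum(f, 0.0, 1.0, 4)
               + endpoint_sum(f, 0.0, 1.0, 4, right=True)) / 2
    # The two numbers should agree up to rounding error.
    print(t4, average)
```

Trying other values of `n` or other integrands gives the same agreement, which is what the exercise asks the reader to explain.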

47.2.2. Simpson’s Method. A similar method is Simpson’s method, in which we use parabolas instead of straight lines for approximation. For simplicity, let us now assume that the points $x_i$ split the interval $[a, b]$ into $n$ equal parts, that is, $x_i - x_{i-1} = (b-a)/n$. In order to simplify the notation, let us set $y_i = f(x_i)$.

For any integer $i \in [1, n-1]$, consider the points $(x_{i-1}, y_{i-1})$, $(x_i, y_i)$, and $(x_{i+1}, y_{i+1})$. There is exactly one parabola $p_i$ of the form $y = Ax^2 + Bx + C$ that contains these three points. It can then be proved that the area under that parabola (more precisely, the area of the domain $P_i$ bordered by the horizontal axis, the vertical lines $x = x_{i-1}$ and $x = x_{i+1}$, and $p_i$) is equal to

\[
\frac{b-a}{3n}\left(y_{i-1} + 4y_i + y_{i+1}\right). \tag{7.19}
\]

If we summed (7.19) over all possible values of $i$, we would not get a good estimate, since most points of the domain under the curve would be part of two of the $P_i$. For instance, a point with a horizontal coordinate between $x_i$ and $x_{i+1}$ is part of both $P_i$ and $P_{i+1}$. Therefore, we sum the last displayed equation over all odd values of $i$, so that the domains $P_1, P_3, \ldots, P_{n-1}$ cover $[a, b]$ without overlapping, and, accordingly, we stipulate that $n$ be an even number. This leads to the following estimate.

Theorem 7.1 (Simpson’s Method). Let $n$ be an even positive integer. Then

\[
\int_a^b f(x)\,dx \approx \frac{b-a}{3n}\left(y_0 + 4y_1 + 2y_2 + 4y_3 + \cdots + 2y_{n-2} + 4y_{n-1} + y_n\right).
\]

Example 7.30. Use Simpson’s method with $n = 4$ to approximate $\int_0^1 \sin(x^2)\,dx$.

Solution: We have $y_0 = 0$, $y_1 = \sin(1/16)$, $y_2 = \sin(1/4)$, $y_3 = \sin(9/16)$, and $y_4 = \sin 1$. So Simpson’s method yields

\[
\int_0^1 \sin(x^2)\,dx \approx \frac{1}{12}\left(4\sin(1/16) + 2\sin(1/4) + 4\sin(9/16) + \sin 1\right) \approx 0.31.
\]

□

Figure 7.8. Simpson’s method.


Note that this result confirms our intuition in that if $x \in [0, 1]$, then $\sin(x^2) \le \sin x$, and so

\[
\int_0^1 \sin(x^2)\,dx \le \int_0^1 \sin x\,dx = 1 - \cos 1 \approx 0.4597.
\]
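Simpson's method with equally spaced points can be sketched in a few lines of Python. This is an illustration only; the function name `simpson` and the decision to reject odd `n` with an exception are our own choices.

```python
import math

def simpson(f, a, b, n):
    """Simpson's-rule estimate of the integral of f over [a, b].

    n must be a positive even integer. The weights follow the pattern
    1, 4, 2, 4, ..., 2, 4, 1 and the sum is scaled by (b - a) / (3n).
    """
    if n <= 0 or n % 2 != 0:
        raise ValueError("n must be a positive even integer")
    h = (b - a) / n
    total = f(a) + f(b)
    for i in range(1, n):
        weight = 4 if i % 2 == 1 else 2
        total += weight * f(a + i * h)
    return h / 3 * total

if __name__ == "__main__":
    estimate = simpson(lambda x: math.sin(x ** 2), 0.0, 1.0, 4)
    upper = 1 - math.cos(1)  # integral of sin x over [0, 1]
    print(estimate, upper)
```

Because the rule integrates each parabola exactly, it returns the exact value for polynomials of degree at most 2 (and, as it turns out, degree 3 as well), which is a convenient way to test an implementation.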

47.3. Bounds on the Error Term. The error term $E$ of an approximation is the difference between the number obtained by the approximation and the actual value of the quantity that was approximated. (In this section, that actual value is the value of $\int_a^b f(x)\,dx$.) It goes without saying that the smaller the absolute value of the error term, the better the approximation is.

The field of numerical analysis studies the error terms of approximation methods. The techniques of that field yield various bounds on error terms. We collected some of these bounds in the following theorem.

Theorem 7.2. Let $f$ be a twice-differentiable function on $[a, b]$ such that $|f''(x)| \le M$ if $x \in [a, b]$. Then the following hold for the approximation methods used to compute $\int_a^b f(x)\,dx$.

(a) If $E_T$ is the error term of the trapezoid method, then

\[
|E_T| \le \frac{M(b-a)^3}{12n^2}.
\]

(b) If $E_M$ is the error term of the midpoint method, then

\[
|E_M| \le \frac{M(b-a)^3}{24n^2}.
\]

(c) If, for all $x \in [a, b]$, the number $f^{(4)}(x)$ is defined and satisfies $|f^{(4)}(x)| \le K$ for some constant $K$, and $E_S$ is the error term of Simpson’s method, then

\[
|E_S| \le \frac{K(b-a)^5}{180n^4}.
\]

Comparing the formulas of parts (a) and (b) of the previous theorem, we can conclude that the worst-case scenario of the midpoint method is better than the worst-case scenario of the trapezoid method. This, of course, does not mean that the midpoint method is always better than the trapezoid method.
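The bounds of Theorem 7.2 are simple formulas, so they are easy to evaluate and to invert, that is, to find how large $n$ must be for a desired accuracy. The Python sketch below is an illustration; the function names are ours, and the sample constant $M = 6e$ is the bound on $|f''|$ for $f(x) = e^{x^2}$ on $[0, 1]$, where $f''(x) = e^{x^2}(4x^2+2)$.

```python
import math

def trapezoid_bound(M, a, b, n):
    """Upper bound on |E_T| from part (a) of Theorem 7.2."""
    return M * (b - a) ** 3 / (12 * n ** 2)

def midpoint_bound(M, a, b, n):
    """Upper bound on |E_M| from part (b) of Theorem 7.2."""
    return M * (b - a) ** 3 / (24 * n ** 2)

def simpson_bound(K, a, b, n):
    """Upper bound on |E_S| from part (c) of Theorem 7.2."""
    return K * (b - a) ** 5 / (180 * n ** 4)

def smallest_n(bound, tolerance, step=1):
    """Smallest n (a multiple of step) whose bound is below tolerance.

    bound is a function of n alone. Use step=2 for Simpson's method,
    which requires an even n.
    """
    n = step
    while bound(n) >= tolerance:
        n += step
    return n

if __name__ == "__main__":
    M = 6 * math.e  # bound on |f''| for f(x) = exp(x^2) on [0, 1]
    print(trapezoid_bound(M, 0.0, 1.0, 4), midpoint_bound(M, 0.0, 1.0, 4))
    print(smallest_n(lambda n: trapezoid_bound(M, 0.0, 1.0, n), 1e-6))
```

Since the bounds in (a) and (b) shrink like $1/n^2$, gaining two more correct decimal places requires roughly ten times as many subintervals, whereas the $1/n^4$ bound in (c) requires far fewer.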

Example 7.31. Find an upper bound for the error of the approximation obtained in Example 7.29.

Solution: We apply part (a) of Theorem 7.2. We have $f(x) = e^{x^2}$, so $f''(x) = e^{x^2}(4x^2 + 2)$. This is an increasing function on $[0, 1]$, so its maximum is taken at $x = 1$, showing that $|f''(x)| \le 6e$. As in Example 7.29, we chose $n = 4$, so the previous theorem yields

\[
|E_T| \le \frac{M(b-a)^3}{12n^2} = \frac{6e}{12 \cdot 16} \approx 0.085.
\]

□

47.4. Exercises.

(1) Use $n = 4$ and the midpoint method to find the approximate value of $\int_0^1 e^{x^2}\,dx$.

(2) Use $n = 4$ and the trapezoid method to find the approximate value of $\int_0^1 e^{-x^2}\,dx$.

(3) Use $n = 4$ and Simpson’s method to find the approximate value of $\int_0^1 \sin x^2\,dx$.

(4) What value of $n$ should we use in each of the preceding three exercises to get an error term that is less than $10^{-6}$?

(5) Let us assume that the points $x_1, x_2, \ldots, x_{n-1}$ used for approximate integration split the interval $[a, b]$ into $n$ equal segments. Prove that the result obtained by the trapezoid method will be the average of the left-endpoint method and the right-endpoint method.

(6) How large is the error term of the approximation in exercise 1 in the worst case?

48. Improper Integrals

In our studies of integration, we have not dealt with definite integrals over infinite intervals, nor did we integrate functions over an interval if the function was not defined in every point of that interval. In this section, we will consider definite integrals of these kinds, which are called improper integrals.

48.1. Infinite Intervals. For finite intervals, we have identified $\int_a^b f(x)\,dx$ with the area of the domain limited by the graph of $f$, the vertical lines $x = a$ and $x = b$, and the horizontal axis. We learned that, by the fundamental theorem of calculus, the equality $\int_a^b f(x)\,dx = F(b) - F(a)$ holds, where $F$ is an antiderivative of $f$.

Now let us consider the integral of $f$ over the infinite interval $[a, \infty)$. Recalling that $\int_a^b f(x)\,dx$ is equal to a certain area, we intuitively want $\int_a^\infty f(x)\,dx$ to equal the area of the domain bordered by the line $x = a$, the horizontal axis, and the graph of $f$. Note that this area may be a finite number, even if it is not squeezed between two vertical lines. One example of this is when $f(x) = 0$ if $x > N$ for some real number $N$.

Figure 7.9. Area under the curve $y = f(x)$ from $x = a$ to $x = b$.

The time has come to formally define $\int_a^\infty f(x)\,dx$.

Definition 7.1. Let $f$ be a function. If the integral $\int_a^b f(x)\,dx$ exists for all $b > a$ and $\lim_{b\to\infty}\int_a^b f(x)\,dx = L$ exists as a (finite) real number, then we say that the integral $\int_a^\infty f(x)\,dx$ is convergent, and we write $\int_a^\infty f(x)\,dx = L$.

If $\lim_{b\to\infty}\int_a^b f(x)\,dx$ does not exist or is infinite, then we say that $\int_a^\infty f(x)\,dx$ is divergent.

Note that if $F$ is an antiderivative of $f$, then

\[
\lim_{b\to\infty}\int_a^b f(x)\,dx = \lim_{b\to\infty}\left(F(b) - F(a)\right) = \left(\lim_{b\to\infty}F(b)\right) - F(a).
\]

Therefore, the integral $\int_a^\infty f(x)\,dx$ is convergent if and only if $\lim_{b\to\infty}F(b)$ exists and is finite.

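Example 7.32. Let $f(x) = x^{-2}$. Compute $\int_1^\infty f(x)\,dx$.

Figure 7.10. Area under the curve $y = f(x)$ from 1 to $\infty$.

Solution: We have

\[
\begin{aligned}
\int_1^\infty x^{-2}\,dx &= \lim_{b\to\infty}\int_1^b x^{-2}\,dx \\
&= \lim_{b\to\infty}\left[-x^{-1}\right]_1^b \\
&= \lim_{b\to\infty}\left(-\frac{1}{b} + 1\right) \\
&= 1.
\end{aligned}
\]

In particular, $\int_1^\infty f(x)\,dx$ is convergent. □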

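Encouraged by the simple solution of the last example, we are going to compute the more general integral $\int_1^\infty x^r\,dx$ for any real number $r$.

Example 7.33. Let $f(x) = x^r$. Compute $\int_1^\infty f(x)\,dx$.

Solution: Let us first assume that $r \neq -1$. Then we have

\[
\begin{aligned}
\int_1^\infty x^r\,dx &= \lim_{b\to\infty}\int_1^b x^r\,dx \\
&= \lim_{b\to\infty}\left[\frac{1}{r+1}x^{r+1}\right]_1^b.
\end{aligned}
\]

If $r > -1$, then $r + 1 > 0$ and $\lim_{x\to\infty}x^{r+1} = \infty$, so the limit in the last displayed row is infinite, and hence $\int_1^\infty x^r\,dx$ is divergent.

If $r < -1$, then $r + 1 < 0$ and $\lim_{x\to\infty}x^{r+1} = 0$, so the limit in the last displayed row is equal to $-\frac{1}{r+1}$, and hence $\int_1^\infty x^r\,dx$ is convergent.

If $r = -1$, then we need to compute $\int_1^\infty x^r\,dx$ differently, since, in that case, the formula $\int x^r\,dx = \frac{x^{r+1}}{r+1}$ does not apply. Instead, we have

\[
\begin{aligned}
\int_1^\infty x^{-1}\,dx &= \lim_{b\to\infty}\int_1^b x^{-1}\,dx \\
&= \lim_{b\to\infty}\left[\ln x\right]_1^b \\
&= \lim_{b\to\infty}\ln b = \infty.
\end{aligned}
\]

Therefore, $\int_1^\infty x^{-1}\,dx$ is divergent. □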

Note that the results of the previous example prove the following important theorem.

Theorem 7.3. Let $r$ be a real number.

(i) If $r \ge -1$, then $\int_1^\infty x^r\,dx$ is divergent.

(ii) If $r < -1$, then $\int_1^\infty x^r\,dx$ is convergent.
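The dichotomy in Theorem 7.3 becomes visible if we tabulate the integrals over $[1, b]$ for growing $b$. The sketch below is only an illustration; the function name `partial_integral` and the sample exponents are our own choices, and the values come from the antiderivatives used in Example 7.33.

```python
import math

def partial_integral(r, b):
    """Exact value of the integral of x**r over [1, b], for b > 1."""
    if r == -1:
        return math.log(b)
    return (b ** (r + 1) - 1) / (r + 1)

if __name__ == "__main__":
    for r in (-2, -1.5, -1, -0.5):
        values = [partial_integral(r, 10.0 ** k) for k in (1, 3, 6)]
        # For r < -1 the values level off; for r >= -1 they keep growing.
        print(r, values)
```

For $r = -1$ the growth is very slow (logarithmic), which is why a table of values alone can be misleading and the limit computation is needed.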

The following definition is not very surprising. It is the counterpart of Definition 7.1.

Definition 7.2. Let $f$ be a function and let $b$ be a real number such that, for all real numbers $a < b$, the integral $\int_a^b f(x)\,dx$ exists. If $L = \lim_{a\to-\infty}\int_a^b f(x)\,dx$ exists as a (finite) real number, then we say that the integral $\int_{-\infty}^b f(x)\,dx$ is convergent, and we write $\int_{-\infty}^b f(x)\,dx = L$. If $\lim_{a\to-\infty}\int_a^b f(x)\,dx$ is infinite or if it does not exist, then we say that $\int_{-\infty}^b f(x)\,dx$ is divergent.

The following definition makes it clear how and when we can define an integral on the entire line of real numbers.

Definition 7.3. Let $f$ be a function and let $m$ be a real number such that both $\int_{-\infty}^m f(x)\,dx$ and $\int_m^\infty f(x)\,dx$ are convergent. Then we say that the integral $\int_{-\infty}^\infty f(x)\,dx$ is convergent and that

\[
\int_{-\infty}^\infty f(x)\,dx = \int_{-\infty}^m f(x)\,dx + \int_m^\infty f(x)\,dx.
\]

Otherwise, we say that $\int_{-\infty}^\infty f(x)\,dx$ is divergent.

See Figure 7.11 for an illustration.

Example 7.34. Compute $\int_{-\infty}^\infty e^{-x}\,dx$.

Figure 7.11. $\int_{-\infty}^m f(x)\,dx$ is blue, while $\int_m^\infty f(x)\,dx$ is orange.

Figure 7.12. $\int_{-\infty}^0 e^{-x}\,dx$.

Solution: We set $m = 0$ and apply Definition 7.3. We get that $\int_{-\infty}^\infty e^{-x}\,dx$ is convergent if both of $\int_{-\infty}^0 e^{-x}\,dx$ and $\int_0^\infty e^{-x}\,dx$ are convergent. However,

\[
\int_{-\infty}^0 e^{-x}\,dx = \lim_{a\to-\infty}\left[-e^{-x}\right]_a^0 = \lim_{a\to-\infty}\left(-1 + e^{-a}\right) = \infty
\]

is divergent and therefore so is $\int_{-\infty}^\infty e^{-x}\,dx$. □

Figure 7.12 shows the domain whose area is equal to $\int_{-\infty}^0 e^{-x}\,dx$. The reader could ask how we knew that we needed to select 0, and not some other real number, for the role of $m$, that is, to split the real number line into two parts. The answer is that we did not, and other choices of $m$ would have given the same result since the integrand tends to infinity as $x$ goes to negative infinity. We chose $m = 0$ because it was convenient to do so.

Note that all improper integrals discussed in this section are calledType 1 improper integrals.

48.2. Vertical Asymptotes. Sometimes we may want to compute the integral of a function $f$ on a finite interval $[a, b]$ so that in some point $c \in [a, b]$, the function $f$ has a vertical asymptote. An example is the function $f(x) = 1/(x^2 - 4)$ on the interval $[1, 3]$. In this case, we use the technique of limits to formally define $\int_a^b f(x)\,dx$, as we did in the previous section.

Definition 7.4. Let $f$ be a function that is continuous on $[a, b]$, except for one point $c \in [a, b]$. Then we set

\[
\int_a^c f(x)\,dx = \lim_{t\to c^-}\int_a^t f(x)\,dx \tag{7.20}
\]

and

\[
\int_c^b f(x)\,dx = \lim_{t\to c^+}\int_t^b f(x)\,dx. \tag{7.21}
\]

Furthermore, if both of the two limits displayed above exist and are finite, we set

\[
\int_a^b f(x)\,dx = \int_a^c f(x)\,dx + \int_c^b f(x)\,dx.
\]

Note that if the only point $c$ in which $f$ is not continuous is one of the endpoints of $[a, b]$, then we only have to compute one of (7.20) and (7.21), since the other integral is taken over a trivial interval and is hence zero.

Example 7.35. Compute $\int_0^1 x^{-1/2}\,dx$.

Solution: As the only point in $[0, 1]$ in which $f(x) = x^{-1/2}$ is not continuous is 0, we use formula (7.21) with $c = 0$ and $b = 1$. We get

\[
\begin{aligned}
\int_0^1 x^{-1/2}\,dx &= \lim_{t\to 0^+}\int_t^1 x^{-1/2}\,dx \\
&= \lim_{t\to 0^+}\left[2x^{1/2}\right]_t^1 \\
&= 2 - \lim_{t\to 0^+}2t^{1/2} \\
&= 2 - 0 = 2.
\end{aligned}
\]

Figure 7.13. $\int_0^1 x^{-1/2}\,dx$.

So the integral $\int_0^1 x^{-1/2}\,dx$ is convergent. □

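Example 7.36. Compute $\int_{-1}^4 x^{-2}\,dx$.

Solution: We apply Definition 7.4 since the interval $[-1, 4]$ has one point, $c = 0$, where the integrand is not continuous. Therefore,

\[
\begin{aligned}
\int_{-1}^4 x^{-2}\,dx &= \int_{-1}^0 x^{-2}\,dx + \int_0^4 x^{-2}\,dx \\
&= \lim_{t\to 0^-}\int_{-1}^t x^{-2}\,dx + \lim_{t\to 0^+}\int_t^4 x^{-2}\,dx \\
&= \lim_{t\to 0^-}\left[-x^{-1}\right]_{-1}^t + \lim_{t\to 0^+}\left[-x^{-1}\right]_t^4 \\
&= \infty + \infty = \infty.
\end{aligned}
\]

So the integral in question is divergent. □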
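The contrast between the last two examples can be seen by evaluating the integrals over $[t, b]$ for smaller and smaller $t > 0$. The following Python sketch is an illustration; the function names are ours, and the formulas are the antiderivative evaluations used in Examples 7.35 and 7.36.

```python
import math

def tail_inverse_sqrt(t):
    """Integral of x**(-1/2) over [t, 1], using the antiderivative 2*sqrt(x)."""
    return 2 - 2 * math.sqrt(t)

def tail_inverse_square(t, b=4.0):
    """Integral of x**(-2) over [t, b], using the antiderivative -1/x."""
    return 1 / t - 1 / b

if __name__ == "__main__":
    for k in (2, 4, 8):
        t = 10.0 ** (-k)
        # The first column approaches 2; the second grows without bound.
        print(t, tail_inverse_sqrt(t), tail_inverse_square(t))
```

Both integrands blow up at 0, so the presence of a vertical asymptote alone does not decide convergence; what matters is how fast the integrand grows near it.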

Figure 7.14 shows the domain whose area is equal to $\int_{-1}^4 x^{-2}\,dx$ and the correct way of breaking that interval up into two parts.

Note that we would have reached the wrong conclusion if we had disregarded the fact that $x^{-2}$ is not continuous at $x = 0$ and tried to apply the fundamental theorem of calculus. Indeed, in that case, we would have obtained the wrong result: $\left[-x^{-1}\right]_{-1}^4 = -\frac{1}{4} - 1 = -\frac{5}{4}$. This result is incorrect (the integrand is positive, so the integral cannot be negative), and the incorrect step was to apply the fundamental theorem of calculus for a function that is not continuous in the entire interval of integration.

The integrals that we have discussed in this section are called Type 2 improper integrals.

Figure 7.14. $\int_{-1}^4 x^{-2}\,dx$.

48.3. Further Remarks.

48.3.1. Improper Integrals of Mixed Type. There are some integrals that are improper for two reasons. They are taken over an infinite interval, and that interval contains a point in which the function is not continuous. In that case, we split up the interval of integration so that now we have two integrals, one of which is of Type 1 and the other of which is of Type 2.

Example 7.37. Compute $\int_0^\infty \frac{1}{(x-2)^2}\,dx$.

Solution: We break up the interval $[0, \infty)$ into the union of the two intervals $[0, 3]$ and $[3, \infty)$, getting

\[
\int_0^\infty \frac{1}{(x-2)^2}\,dx = \int_0^3 \frac{1}{(x-2)^2}\,dx + \int_3^\infty \frac{1}{(x-2)^2}\,dx.
\]

The first term on the right-hand side is an improper integral of Type 2, and the second term on the right-hand side is an improper integral of Type 1. We can compute both by the methods presented earlier in this section. □

48.3.2. Comparison Test. Comparison tests for improper integrals work very similarly to those for proper integrals.

Theorem 7.4. Let us assume that, for all $x \ge a$, the chain of inequalities $0 \le f(x) \le g(x)$ holds.

(i) If $\int_a^\infty f(x)\,dx$ is divergent, then so is $\int_a^\infty g(x)\,dx$.

(ii) If $\int_a^\infty g(x)\,dx$ is convergent, then so is $\int_a^\infty f(x)\,dx$.

Example 7.38. Show that $\int_3^\infty \frac{1}{x^2\ln x}\,dx$ is convergent.

Figure 7.15. $\int_0^\infty \frac{1}{(x-2)^2}\,dx$.

Figure 7.16. $\int_3^\infty \frac{1}{x^2\ln x}\,dx$.

Solution: If $x \ge 3$, then $\ln x > 1$, so $x^2\ln x > x^2$, and therefore the integrand is less than $\frac{1}{x^2}$. See Figure 7.16 for an illustration. On the other hand, we know that $\int_3^\infty 1/x^2\,dx$ is convergent, so our claim follows from the comparison test. □
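The inequality behind the comparison test can be observed on finite intervals $[3, b]$. The sketch below is an illustration only: it approximates $\int_3^b \frac{1}{x^2 \ln x}\,dx$ with a midpoint rule (the helper `midpoint_rule` and its default number of subintervals are our own assumptions) and compares it with the exact value $\frac{1}{3} - \frac{1}{b}$ of $\int_3^b \frac{1}{x^2}\,dx$.

```python
import math

def midpoint_rule(func, a, b, n=20000):
    """Midpoint-rule approximation of the integral of func over [a, b]."""
    h = (b - a) / n
    return h * sum(func(a + (k + 0.5) * h) for k in range(n))

def partial_integrals(b):
    """Return (integral of 1/(x^2 ln x), integral of 1/x^2) over [3, b]."""
    smaller = midpoint_rule(lambda x: 1 / (x ** 2 * math.log(x)), 3.0, b)
    larger = 1 / 3 - 1 / b  # exact, from the antiderivative -1/x
    return smaller, larger

if __name__ == "__main__":
    for b in (10.0, 100.0, 1000.0):
        # The first number stays below the second, which stays below 1/3.
        print(b, partial_integrals(b))
```

Since the partial integrals of the smaller function are increasing in $b$ and bounded by $\frac{1}{3}$, they must have a finite limit, which is the content of part (ii) of Theorem 7.4 in this example.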

48.4. Exercises.

(1) Is $\int_1^\infty \sin x\,dx$ convergent or divergent?

(2) Is $\int_{0.5}^\infty x^{-1.5}\,dx$ convergent?

(3) Is

$\int_3^\infty \frac{1}{\sqrt{x-2}}\,dx$ convergent?

(4) Is $\int_{-\infty}^\infty x^{-2}\,dx$ convergent?

(5) Is $\int_0^\infty xe^{-x}\,dx$ convergent?

(6) Is $\int_{-\infty}^\infty xe^{-x^2}\,dx$ convergent?

(7) Is $\int_0^\infty \frac{1}{e^x+x}\,dx$ convergent?

CHAPTER 8

Sequences and Series

49. Infinite Sequences

A sequence can be thought of as an ordered list of numbers $a_1, a_2, \ldots, a_n, a_{n+1}, \ldots$. The subscript $n$ indicates the position of a number $a_n$ in the sequence; for example, $a_1$ is the first element, $a_n$ is the $n$th element, and so on.

Definition 8.1 (Sequence). A sequence is a function $f$ defined on the set of all positive integers; that is, it is a rule that assigns a number to each positive integer. If $f(n) = a_n$ for $n = 1, 2, \ldots$, it is customary to denote the range of $f$ by the symbol $\{a_n\}$ or $\{a_n\}_1^\infty$.

So a sequence can be defined by specifying the rule $a_n = f(n)$ to calculate the $n$th term from an integer $n$. For example,

\[ a_n = \frac{n}{n+1} \longleftrightarrow \left\{\frac{n}{n+1}\right\}_1^\infty = \left\{\frac12, \frac23, \frac34, ...\right\}, \]
\[ a_n = \frac{(-1)^n}{n} \longleftrightarrow \left\{\frac{(-1)^n}{n}\right\}_1^\infty = \left\{-1, \frac12, -\frac13, \frac14, ...\right\}, \]
\[ a_n = q^n \longleftrightarrow \{q^n\}_0^\infty = \{1, q, q^2, q^3, ...\}. \]
Sequences can also be defined recursively, that is, by a relation that allows us to find $a_n$ if $a_m$, $m < n$, are known. For example, the Fibonacci sequence $\{f_n\}$ is defined by the recurrence relation
\[ f_1 = f_2 = 1, \quad f_n = f_{n-1} + f_{n-2}, \quad n \ge 3 \implies \{f_n\} = \{1, 1, 2, 3, 5, 8, 13, ...\}. \]
Graphic representation of sequences. A sequence can be pictured similarly to the graph of a function by plotting points $(x, y) = (n, a_n)$, $n = 1, 2, ...$, on the $xy$ plane. For example, the sequence $a_n = n/(n+1)$ is the set of points on the graph $y = x/(x+1)$ corresponding to all positive integer values of $x$, that is, $x = 1, 2, ...$.
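The two ways of defining a sequence described above (an explicit rule and a recurrence relation) can be sketched in a few lines of Python. This is only an illustration; the helper names `explicit_terms` and `fibonacci_terms` are not from the text.

```python
from fractions import Fraction

def explicit_terms(rule, count):
    """First `count` terms a_1, ..., a_count of a sequence given by a_n = rule(n)."""
    return [rule(n) for n in range(1, count + 1)]

def fibonacci_terms(count):
    """First `count` terms from f_1 = f_2 = 1, f_n = f_{n-1} + f_{n-2} for n >= 3."""
    terms = []
    for n in range(1, count + 1):
        if n <= 2:
            terms.append(1)
        else:
            terms.append(terms[-1] + terms[-2])
    return terms

# a_n = n/(n+1), kept as exact fractions
ratio_terms = explicit_terms(lambda n: Fraction(n, n + 1), 4)
fib_terms = fibonacci_terms(7)
print(ratio_terms)
print(fib_terms)
```

The pairs $(n, a_n)$ produced this way are exactly the points one would plot to picture the sequence.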

49.1. Limit of a Sequence. The sequence $a_n = n/(n+1)$ has the property that the values $a_n$ approach 1 as $n$ becomes larger. Indeed, the difference
\[ 1 - a_n = 1 - \frac{n}{n+1} = \frac{1}{n+1} \]


Figure 8.1. Set of points on the graph $y = x/(x+1)$ corresponding to integer values $x = n$. For large $x$, $x/(x+1)$ approaches 1 from below, and hence $n/(n+1) = 1/(1+1/n) \to 1$ as $n \to \infty$. The difference $1 - n/(n+1) = 1/(n+1)$ can be made smaller than any (small) number $\varepsilon > 0$ for all $n > N$ and some integer $N$.

decreases with increasing $n$ and hence can be made smaller than any preassigned positive number $\varepsilon$ for all $n > N$, where $N$ depends on $\varepsilon$. For example, put $\varepsilon = 10^{-2}$. Then the condition $1 - a_n < \varepsilon$ implies that $1/(n+1) < \varepsilon$ or $1/\varepsilon - 1 < n$ or $99 < n$, that is, $1 - a_n < 10^{-2}$ for all $n > 99$. If $\varepsilon = 10^{-4}$, then $1 - a_n < 10^{-4}$ for all $n > N = 9999$. In other words, no matter how small $\varepsilon$ is, there is only a finite number of elements of the sequence that lie outside the interval $(1-\varepsilon, 1+\varepsilon)$. In this case, the sequence is said to converge to the limit value 1.
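The dependence of $N$ on $\varepsilon$ in this example can be checked by direct search. The sketch below assumes the sequence $a_n = n/(n+1)$ from the text and uses exact rational arithmetic; the function name `smallest_threshold` is illustrative.

```python
from fractions import Fraction

def smallest_threshold(eps):
    """Smallest integer N >= 0 such that 1 - a_n = 1/(n+1) < eps for all n > N."""
    N = 0
    # 1/(n+1) decreases with n, so it is enough to test the first index n = N + 1.
    while Fraction(1, N + 2) >= eps:
        N += 1
    return N

N_for_hundredth = smallest_threshold(Fraction(1, 100))
N_for_ten_thousandth = smallest_threshold(Fraction(1, 10_000))
print(N_for_hundredth, N_for_ten_thousandth)
```

A smaller $\varepsilon$ forces a larger $N$, in agreement with the values 99 and 9999 found above.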

Definition 8.2 (Limit of a Sequence). A sequence $\{a_n\}$ has the limit $a$ if, for every $\varepsilon > 0$, there is a corresponding integer $N$ such that $|a_n - a| < \varepsilon$ for all $n > N$. In this case, the sequence is said to be convergent, and one writes
\[ \lim_{n\to\infty} a_n = a \quad\text{or}\quad a_n \to a \text{ as } n \to \infty. \]
If a sequence has no limit, it is called divergent.

One can say that a sequence $\{a_n\}$ converges to a number $a$ if and only if every open interval containing $a$ contains all but finitely many of the elements of $\{a_n\}$.

Theorem 8.1 (Uniqueness of the Limit). The limit of a convergent sequence is unique:
\[ \lim_{n\to\infty} a_n = a \ \text{ and } \ \lim_{n\to\infty} a_n = a' \implies a = a'. \]


Figure 8.2. Definition of the limit of a sequence. The dots indicate numerical values $a_n$ (vertical axis). The integer $n$ increases from left to right (horizontal axis). The convergence of $a_n$ to a number $a$ means that, for any small $\varepsilon > 0$, there is an integer $N$ such that all the numbers $a_n$, $n > N$, lie in the interval $(a-\varepsilon, a+\varepsilon)$. It is clear that $N$ depends on $\varepsilon$. Generally, a smaller $\varepsilon$ requires a larger $N$.

Proof. Fix $\varepsilon > 0$. Then, by the definition of the limit, there are numbers $N$ and $N'$ such that $|a_n - a| < \varepsilon$ if $n > N$ and $|a_n - a'| < \varepsilon$ if $n > N'$. Hence, both inequalities hold for $n > \max(N, N')$, and for all such $n$:
\[ 0 \le |a - a'| = |a - a_n + a_n - a'| \le |a_n - a| + |a_n - a'| < 2\varepsilon; \]
that is, the nonnegative number $|a - a'|$ is smaller than any preassigned positive number, which means that $|a - a'| = 0$ or $a = a'$. □

Since a sequence is a function defined on all positive integers, there is a great deal of similarity between the asymptotic behavior of a function $f(x)$ as $x \to \infty$ and a sequence $a_n = f(n)$.

Theorem 8.2 (Limits of Sequences and Functions). Let $f$ be a function on $(0,\infty)$. Suppose that $\lim_{x\to\infty} f(x) = a$. If $a_n = f(n)$, where $n$ is an integer, then $\lim_{n\to\infty} a_n = a$.

The validity of the theorem follows immediately from the definition of the limit $\lim_{x\to\infty} f(x) = a$ (i.e., given $\varepsilon > 0$, there is a corresponding number $M$ such that $|f(x) - a| < \varepsilon$ for all $x > M$) by noting that the range of $f(x)$ contains the sequence $a_n = f(n)$.

Example 8.1. Find the limit of the sequence $a_n = \ln n/n$ if it exists or show that the sequence is divergent.


Solution: Consider the function $f(x) = \ln x/x$ such that $a_n = f(n)$ for all positive integers $n$. Hence,
\[ \lim_{n\to\infty} \frac{\ln n}{n} = \lim_{x\to\infty} \frac{\ln x}{x} = \lim_{x\to\infty} \frac{1/x}{1} = 0, \]
where the indeterminate form $\frac{\infty}{\infty}$ arising from $\ln x/x$ as $x \to \infty$ has been resolved by means of l'Hospital's rule. Note that l'Hospital's rule applies not to sequences but to functions of a real variable. □

Following the analogy between the limits of sequences and functions, one can select a particular class of divergent sequences.

Definition 8.3 (Infinite Limits). The limit $\lim_{n\to\infty} a_n = \infty$ means that, for every positive number $M$, there is a corresponding integer $N$ such that $a_n > M$ for all $n > N$. Similarly, the limit $\lim_{n\to\infty} a_n = -\infty$ means that, for every negative number $M$, there is a corresponding integer $N$ such that $a_n < M$ for all $n > N$.

Example 8.2. Analyze the convergence of the sequence $a_n = 1/n^p$, where $p$ is real.

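Solution: Put $f(x) = 1/x^p$ for $x > 0$. Then $a_n = f(n)$ and therefore
\[ \lim_{n\to\infty} \frac{1}{n^p} = \lim_{x\to\infty} \frac{1}{x^p} = \begin{cases} 0 & \text{if } p > 0, \\ 1 & \text{if } p = 0, \\ \infty & \text{if } p < 0. \end{cases} \]
□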

Example 8.3. Analyze the convergence of the sequence $a_n = q^n$, $n = 0, 1, ...$, where $q$ is real.

Solution: Suppose $q > 0$. Put $f(x) = q^x = e^{x\ln q}$. From the properties of the exponential function, it follows that $e^{ax} \to \infty$ if $a = \ln q > 0$, $e^{ax} = 1$ if $a = \ln q = 0$, and $e^{ax} \to 0$ if $a = \ln q < 0$. Therefore, $a_n \to \infty$ if $q > 1$, $a_n = 1 \to 1$ if $q = 1$, and $a_n \to 0$ if $0 < q < 1$. When $q = 0$, $a_n = 0$ for $n \ge 1$. Suppose $q < 0$. Then $q = -|q|$ and $a_n = (-1)^n|q|^n = (-1)^n e^{n\ln|q|}$. If $|q| < 1$, then even and odd terms of the sequence converge to 0: $a_{2n} = e^{2n\ln|q|} \to 0$ and $a_{2n-1} = -e^{(2n-1)\ln|q|} \to 0$ as $n \to \infty$. When $q = -1$, the sequence $a_n = (-1)^n$ is divergent because $a_{2n} = 1$ and $a_{2n-1} = -1$; that is, the sequence oscillates between 1 and $-1$ for all $n$ and $a_n$ does not approach any number. Finally, if $q < -1$, the sequence is divergent, too, because $a_{2n} = e^{2n\ln|q|} \to \infty$ but $a_{2n-1} = -e^{(2n-1)\ln|q|} \to -\infty$. Moreover, it approaches neither $\infty$ nor $-\infty$ as it oscillates, taking ever-increasing positive and negative values.

Thus,
\[ \lim_{n\to\infty} q^n = \begin{cases} 0 & \text{if } q \in (-1, 1), \\ 1 & \text{if } q = 1, \\ \infty & \text{if } q > 1, \end{cases} \]
and the sequence does not converge if $q \le -1$. □

49.2. Subsequences. Given a sequence $\{a_n\}$, consider a sequence $\{n_k\}$ of positive integers such that $n_1 < n_2 < n_3 < \cdots$. Then the sequence $\{a_{n_k}\}$, $k = 1, 2, ...$, is called a subsequence of $\{a_n\}$. Recall that a sequence $\{a_n\}$ converges to a number $a$ if and only if every open interval containing $a$ contains all but finitely many of the elements of $\{a_n\}$. Therefore, $\{a_n\}$ converges to $a$ if and only if every subsequence of $\{a_n\}$ converges to $a$. This necessary and sufficient criterion for convergence has already been used in Example 8.3. The sequence $a_n = (-1)^n$ does not converge because it has two subsequences $a_{2n} = 1$ and $a_{2n-1} = -1$, which converge to different numbers, $1 \ne -1$.

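49.3. Limit Laws for Sequences. The limit laws for functions also hold for sequences, and their proofs are similar. If $\{a_n\}$ and $\{b_n\}$ converge to numbers $a$ and $b$, respectively, and $c$ is a constant, then
\[ \lim_{n\to\infty} (a_n + b_n) = \lim_{n\to\infty} a_n + \lim_{n\to\infty} b_n = a + b, \]
\[ \lim_{n\to\infty} (ca_n) = c\lim_{n\to\infty} a_n = ca, \]
\[ \lim_{n\to\infty} (a_n b_n) = \lim_{n\to\infty} a_n \lim_{n\to\infty} b_n = ab, \]
\[ \lim_{n\to\infty} \frac{a_n}{b_n} = \frac{\lim_{n\to\infty} a_n}{\lim_{n\to\infty} b_n} = \frac{a}{b} \quad\text{if } b \ne 0, \]
\[ \lim_{n\to\infty} (a_n)^p = \Bigl(\lim_{n\to\infty} a_n\Bigr)^p = a^p \quad\text{if } p > 0 \text{ and } a_n > 0. \]
The squeeze theorem also applies to sequences.

Theorem 8.3 (Squeeze Theorem). If $c_n \le a_n \le b_n$ for $n > N$ and $\lim_{n\to\infty} b_n = \lim_{n\to\infty} c_n = a$, then $\lim_{n\to\infty} a_n = a$, where $a$ can also be $\pm\infty$.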

Example 8.4. Find the limit of $a_n = \sin(\pi/\sqrt{n})$.

Solution: Since $-x \le \sin x \le x$ if $x \ge 0$, one has $c_n = -\pi/\sqrt{n} \le a_n \le \pi/\sqrt{n} = b_n$, where $c_n \to 0$ and $b_n \to 0$ as $n \to \infty$. By the squeeze theorem, $\sin(\pi/\sqrt{n}) \to 0$ as $n \to \infty$. □

Theorem 8.4. If $\lim_{n\to\infty} |a_n| = 0$, then $\lim_{n\to\infty} a_n = 0$.


Figure 8.3. The squeeze theorem. The dots indicate numerical values (vertical axis) of the sequences $b_n$ (blue), $c_n$ (black), and $a_n$ (red). The integer $n$ increases from left to right (horizontal axis). The sequences $b_n$ and $c_n$ converge to a number $a$. This means that the differences $|b_n - a|$ and $|c_n - a|$ can be made arbitrarily small for all $n \ge N$ and some integer $N$. Since $c_n \le a_n \le b_n$, the difference $|a_n - a|$ is also arbitrarily small for all $n \ge N$. By the definition of the limit, the sequence $a_n$ must converge to $a$, too.

This theorem follows directly from the definition of the limit of a sequence where $a = 0$.

Theorem 8.5. If $a_n \to a$ as $n \to \infty$ and the function $f$ is continuous at $a$, then
\[ \lim_{n\to\infty} f(a_n) = f(a). \]
This theorem asserts that if a continuous function is applied to the terms of a convergent sequence, the result is also convergent.

Proof. The continuity of $f$ at $a$ means that $\lim_{x\to a} f(x) = f(a)$ or, by the definition of this limit, for any $\varepsilon > 0$, there is a corresponding $\delta > 0$ such that $|f(x) - f(a)| < \varepsilon$ whenever $|x - a| < \delta$. Having found such $\delta$, put $\varepsilon' = \delta$ and, by the definition of the limit $\lim_{n\to\infty} a_n = a$, for any such $\varepsilon' > 0$, there is a corresponding integer $N$ such that $|a_n - a| < \varepsilon' = \delta$ if $n > N$. Therefore, for any $\varepsilon > 0$, one can find a corresponding integer $N$ such that $|f(a_n) - f(a)| < \varepsilon$ for all $n > N$, which means that $\lim_{n\to\infty} f(a_n) = f(a)$. □


Example 8.5. Find the limit of the sequence $a_n = \exp(1/n^2)$.

Solution: Consider the sequence $b_n = 1/n^2$. Then
\[ \lim_{n\to\infty} b_n = \lim_{x\to\infty} \frac{1}{x^2} = 0. \]
Put $f(x) = e^x$. Then $a_n = f(b_n)$. By continuity of the exponential function,
\[ \lim_{n\to\infty} a_n = \exp\Bigl(\lim_{n\to\infty} b_n\Bigr) = e^0 = 1. \]
□

49.4. Exercises.

(1) Find a formula for the general term $a_n$ of the sequences:
\[ \{a_n\} = \left\{1, -\frac13, \frac15, -\frac17, \frac19, -\frac1{11}, ...\right\}, \]
\[ \{a_n\} = \left\{1, \frac12, \frac14, \frac18, \frac1{16}, ...\right\}, \]
\[ \{a_n\} = \left\{-\frac14, \frac27, -\frac3{10}, \frac4{13}, -\frac5{16}, ...\right\}. \]

In (2)–(13), determine whether the sequence converges or diverges. If it converges, find the limit.

(2) $a_n = 2n$.
(3) $a_n = 2n - (-1)^n 2n$.
(4) $a_n = (3 - 5n^2)/(1 + n^2)$.
(5) $a_n = \tan[n\pi/(2 + 4n)]$.
(6) $a_n = \sin^2[\pi(n^2 + 2)/(2n^2 + 5)]$.
(7) $a_n = \ln(an)/\ln(bn)$, where $a$ and $b$ are positive numbers.
(8) $a_n = n^p e^{-n}$, where $p$ is real.
(9) $a_n = n\cos(1/n)$.
(10) $a_n = \sqrt{(n^3 + 1)/(8n^3 + 4n^2 + 2n + 1)}$.
(11) $a_n = (\ln n)^p/n$, where $p > 0$.
(12) $a_n = \tan^{-1}(n^2)$.
(13) $a_n = \sqrt{n^2 + n} - n$.


50. Special Sequences

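Theorem 8.6 (Special Sequences). Let $p$ and $q$ be real numbers.
\[ \lim_{n\to\infty} \sqrt[n]{p} = 1 \quad\text{if } p > 0, \tag{8.1} \]
\[ \lim_{n\to\infty} \sqrt[n]{n} = 1, \tag{8.2} \]
\[ \lim_{n\to\infty} \frac{n^q}{p^n} = 0 \quad\text{if } p > 1, \tag{8.3} \]
\[ \lim_{n\to\infty} \frac{n!}{n^n} = 0, \tag{8.4} \]
\[ \lim_{n\to\infty} \frac{q^n}{n!} = 0. \tag{8.5} \]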

Proof.

(8.1). If $p > 1$, put $a_n = \sqrt[n]{p} - 1$. Then $a_n > 0$ and, by the binomial theorem,
\[ 1 + na_n \le (1 + a_n)^n = p. \]
Note that all terms in $(1 + a_n)^n = 1 + na_n + n(n-1)a_n^2/2 + \cdots + na_n^{n-1} + a_n^n$ are positive. So, by retaining only the first two terms, a smaller number is obtained, that is, $1 + na_n \le (1 + a_n)^n$. It follows from this inequality that
\[ 0 < a_n \le \frac{p - 1}{n}. \]
By the squeeze theorem, $a_n \to 0$ as $n \to \infty$ and hence $\sqrt[n]{p} = a_n + 1 \to 1$. The case $p = 1$ is trivial. If $0 < p < 1$, the result is obtained by taking reciprocals:
\[ \lim_{n\to\infty} \sqrt[n]{p} = \lim_{n\to\infty} \frac{1}{\sqrt[n]{1/p}} = \frac{1}{\lim_{n\to\infty} \sqrt[n]{1/p}} = 1 \]
because $1/p > 1$.

(8.2). Put $a_n = \sqrt[n]{n} - 1$. Then $a_n \ge 0$ and, by the binomial theorem,
\[ n = (1 + a_n)^n \ge \frac{n(n-1)}{2}a_n^2. \]
Hence, for $n \ge 2$,
\[ 0 \le a_n \le \sqrt{\frac{2}{n-1}}. \]
By the squeeze theorem, $a_n \to 0$ or $\sqrt[n]{n} = a_n + 1 \to 1$ as $n \to \infty$.

(8.3). Consider the function $f(x) = x^q e^{-cx}$, where $c > 0$. By the asymptotic property of the exponential function, $f(x) \to 0$ as $x \to \infty$ for any $q$; the exponential grows faster than any power function (which has been proved in Calculus I). Since $a_n = f(n)$ for $c = \ln p > 0$ if $p > 1$, one concludes that
\[ \lim_{n\to\infty} \frac{n^q}{p^n} = \lim_{n\to\infty} n^q e^{-n\ln p} = \lim_{x\to\infty} x^q e^{-x\ln p} = 0. \]

(8.4). The following inequality holds:
\[ a_n = \frac{n!}{n^n} = \frac{1\cdot 2\cdot 3\cdots n}{n\cdot n\cdot n\cdots n} = \frac1n\cdot\frac2n\cdot\frac3n\cdots\frac nn \le \frac1n \implies 0 < a_n \le \frac1n. \]
By the squeeze theorem, $a_n \to 0$ as $n \to \infty$.

(8.5). If $q > 0$, then there is a positive integer $k$ such that $k - 1 \le q < k$, that is, $k$ is the smallest positive integer such that $q/k < 1$. The following inequality holds for $n \ge k$:
\[ a_n = \frac{q^n}{n!} = \frac q1\cdot\frac q2\cdots\frac{q}{k-1}\cdot\frac qk\cdots\frac qn \le q^{k-1}\,\frac qk\cdots\frac{q}{n-1}\cdot\frac qn \le q^{k-1}\,\frac qn = \frac{q^k}{n}. \]
By the squeeze theorem, $0 < a_n \le q^k/n \to 0$ as $n \to \infty$ and $a_n$ converges to 0. The case $q = 0$ is trivial. If $q < 0$, then $|a_n| = |q^n/n!| = |q|^n/n! \to 0$ as $n \to \infty$ and hence $a_n$ converges to 0, too. □
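The five limits of Theorem 8.6 can be illustrated numerically. The sketch below is only a sanity check, not a proof; the sample values $p = 2$, $q = 3$ and the function name `special_sequences` are arbitrary choices, and logarithms (with `math.lgamma` for $\ln n!$) are used to avoid overflow for large $n$.

```python
import math

def special_sequences(n, p=2.0, q=3.0):
    """Values at index n of the sequences in (8.1)-(8.5); assumes p > 1 and q > 0."""
    log_factorial = math.lgamma(n + 1)  # ln(n!)
    return {
        "root_p": p ** (1.0 / n),                                       # (8.1)
        "root_n": n ** (1.0 / n),                                       # (8.2)
        "power_over_exp": math.exp(q * math.log(n) - n * math.log(p)),  # (8.3)
        "fact_over_power": math.exp(log_factorial - n * math.log(n)),   # (8.4)
        "exp_over_fact": math.exp(n * math.log(q) - log_factorial),     # (8.5)
    }

values = special_sequences(1000)
for name, value in values.items():
    print(name, value)
```

For $n = 1000$ the first two values are already close to 1, while the last three are numerically indistinguishable from 0.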

Example 8.6. Find the limit of $a_n = \sqrt[n]{n^q}$, where $q > 1$, if it exists or show that the sequence diverges.

Solution:
\[ \lim_{n\to\infty} \sqrt[n]{n^q} = \lim_{n\to\infty} (\sqrt[n]{n})^q = \Bigl(\lim_{n\to\infty} \sqrt[n]{n}\Bigr)^q = 1^q = 1 \]
by (8.2) and the basic limit laws. □

50.1. Monotonic Sequences.

Definition 8.4 (Monotonic Sequences). A sequence $a_n$ is said to be

monotonically increasing if $a_n \le a_{n+1}$,
monotonically decreasing if $a_n \ge a_{n+1}$

for all $n = 1, 2, ...$.

The class of monotonic sequences consists of the increasing and the decreasing sequences.

Example 8.7. Show that the sequence $a_n = n/(n^2 + 1)$ is monotonically decreasing.


Figure 8.4. The sequence on the left is monotonically increasing, and the sequence on the right is monotonically decreasing.

Solution: The inequality $a_n \ge a_{n+1}$ must be established. It is equivalent to the following inequalities obtained by cross-multiplication:
\[ \frac{n+1}{(n+1)^2 + 1} \le \frac{n}{n^2 + 1} \iff (n+1)(n^2+1) \le n[(n+1)^2 + 1] \]
\[ \iff n^3 + n^2 + n + 1 \le n^3 + 2n^2 + 2n \]
\[ \iff 1 \le n^2 + n. \]
The latter inequality is true for $n \ge 1$. Therefore, $a_{n+1} \le a_n$ (in fact, the strict inequality $a_{n+1} < a_n$ holds as well), and the sequence is monotonically decreasing. □

Definition 8.5 (Bounded Sequence). A sequence is said to be bounded above if there is a number $M$ such that
\[ a_n \le M \quad\text{for all } n \ge 1. \]
A sequence is said to be bounded below if there is a number $m$ such that
\[ m \le a_n \quad\text{for all } n \ge 1. \]
A sequence is said to be bounded if it is bounded above and below:
\[ m \le a_n \le M \quad\text{for all } n \ge 1. \]

For example, the sequence $a_n = 1/n$ is bounded: $0 < a_n \le 1$. The sequence $a_n = e^n$ is bounded below, but not above.

Completeness axiom for the set of real numbers. The completeness axiom for the set of real numbers says that if $S$ is a nonempty set of real numbers that has an upper bound $M$ ($x \le M$ for all $x \in S$), then $S$ has a least upper bound. By definition, the number $a$ is a least upper bound of $S$ if $a$ is an upper bound of $S$ and, for any $\varepsilon > 0$, $a - \varepsilon$ is not an upper bound of $S$. The least upper bound is called the supremum of $S$ and denoted $\sup S$. Naturally, $\sup S \le M$ for any upper bound $M$ of $S$. If $S$ has a lower bound $m$, then it also has the greatest lower bound, denoted


Figure 8.5. A bounded sequence. The dots indicate numerical values of $a_n$ (vertical axis). The integer $n$ increases from left to right (horizontal axis). All the numbers $a_n$ lie in the interval $m \le a_n \le M$.

$\inf S$ (the infimum of $S$). The number $\inf S$ is a lower bound of $S$ such that $\inf S + \varepsilon$ is not a lower bound of $S$ for any $\varepsilon > 0$; that is, $m \le \inf S$ for any lower bound $m$ of $S$. The completeness axiom is an expression of the fact that there is no gap or hole in the real number line.

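Theorem 8.7 (Monotonic Sequence Theorem). Suppose $\{a_n\}$ is monotonic. Then $\{a_n\}$ converges if and only if it is bounded.

Proof. Suppose $a_n \le a_{n+1}$ (the proof is analogous in the other case). Let $S$ be the range of $\{a_n\}$. If $\{a_n\}$ is bounded, let $a$ be the least upper bound of $S$ (it exists by the completeness axiom). Then $a_n \le a$ for all $n \ge 1$. For every $\varepsilon > 0$, there is an integer $N$ such that
\[ a - \varepsilon < a_N \le a; \]
otherwise, $a - \varepsilon$ would be an upper bound of $S$. Since $\{a_n\}$ increases, the inequality $n \ge N$ implies that
\[ a - \varepsilon < a_n \le a \iff |a - a_n| < \varepsilon \quad\text{whenever } n \ge N, \]
which shows that $\{a_n\}$ converges to $a$. Conversely, a convergent sequence is always bounded because all but finitely many of its terms lie in the interval $(a - 1, a + 1)$, where $a$ is its limit. □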

Example 8.8. Investigate the convergence of the sequence defined by the recurrence relation $a_1 = 2$ and $a_{n+1} = \frac12(a_n + 3)$.

Solution: Let us compute the first few terms of the sequence: $a_1 = 2$, $a_2 = 2.5$, $a_3 = 2.75$, and so on. The initial terms suggest that the sequence is monotonically increasing, and one can try to prove this property $a_n \le a_{n+1}$ for all $n$. A commonly used technique to do so


Figure 8.6. Monotonic sequence theorem. A bounded monotonic sequence with numerical values indicated by dots (vertical axis). The integer $n$ increases from left to right. If $S = \sup_n\{a_n\}$ is the least upper bound of all $a_n$, then, for any number $\varepsilon > 0$, $S - \varepsilon$ is not an upper bound of the sequence. Since $a_n$ increases monotonically, there is an integer $N$ such that all the numbers $a_n$, $n > N$, are greater than $S - \varepsilon$ and hence lie in the interval $(S - \varepsilon, S)$. This means that $a_n$ converges to $S$.

is mathematical induction. The statement is true for $n = 1$. Suppose that the statement is true for $n = k$; then one has to prove that the statement is also true for $n = k + 1$. If the proof goes through, then starting with $n = 1$, one can establish the statement for $n = 2$, $n = 3$, and so on. This is the basic idea of mathematical induction. Using the recurrence relation,
\[ a_{k+1} > a_k \implies \frac12(a_{k+1} + 3) > \frac12(a_k + 3) \implies a_{k+2} > a_{k+1}. \]
Thus, the sequence is indeed monotonically increasing. If it happens to be bounded, then it converges. Again, mathematical induction turns out to be helpful. The first terms suggest that $a_n < 3$. This is true for $n = 1$. Suppose the inequality is true for $n = k$. Let us try to prove that this assumption implies that the inequality holds for $n = k + 1$. Using the recurrence relation,
\[ a_k < 3 \implies \frac12(a_k + 3) < \frac12(3 + 3) \implies a_{k+1} < 3. \]
Thus, the sequence is monotonic and bounded and hence converges. If the sequence $a_n$ converges to $a$, then so does the sequence $a_{n+k}$ for any integer $k$ (in the definition of the limit, change $N$ to $N + k$ to prove this). Since the existence of the limit has been established, one can take the limit of both sides of the recurrence relation:
\[ \lim_{n\to\infty} a_{n+1} = \frac12\Bigl(\lim_{n\to\infty} a_n + 3\Bigr) \implies a = \frac12(a + 3) \implies a = 3. \]
Thus, $a_n \to 3$ as $n \to \infty$. □
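The monotonicity, the bound, and the limit found in Example 8.8 can be checked on the first terms of the recurrence. The sketch below uses exact fractions; the function name `recurrence_terms` is illustrative, and a finite check of this kind supports but does not replace the induction argument.

```python
from fractions import Fraction

def recurrence_terms(count):
    """First `count` terms of a_1 = 2, a_{n+1} = (a_n + 3)/2, as exact fractions."""
    terms = [Fraction(2)]
    while len(terms) < count:
        terms.append((terms[-1] + 3) / 2)
    return terms

terms = recurrence_terms(20)
is_increasing = all(a < b for a, b in zip(terms, terms[1:]))
is_bounded_by_3 = all(a < 3 for a in terms)
gap = 3 - terms[-1]  # distance of a_20 from the limit 3
print(is_increasing, is_bounded_by_3, float(gap))
```

The gap $3 - a_n$ is halved at every step, which is consistent with $a_n \to 3$.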

Example 8.9. Investigate the convergence of the sequence defined by the recurrence relation $a_1 = \sqrt2$ and $a_{n+1} = \sqrt{2 + a_n}$.

Solution: The first few terms of the sequence suggest that the sequence is increasing: $a_1 = \sqrt2 < a_2 = \sqrt{2 + \sqrt2}$. Let us try to prove the inequality $a_n < a_{n+1}$ by induction. Suppose it is true for $n = k$. Then, by monotonicity of the square root function and the recurrence relation,
\[ a_k < a_{k+1} \implies 2 + a_k < 2 + a_{k+1} \implies \sqrt{2 + a_k} < \sqrt{2 + a_{k+1}} \implies a_{k+1} < a_{k+2}. \]
The first terms of the sequence suggest that $a_1 < 3$ and $a_2 < 3$. Let us try to prove that $a_n < 3$ for all $n$ by induction. Suppose the inequality holds for $n = k$. Then, by the recurrence relation,
\[ a_k < 3 \implies 2 + a_k < 5 \implies \sqrt{2 + a_k} < \sqrt5 < 3 \implies a_{k+1} < 3. \]
Thus, the sequence is monotonic and bounded, and hence it converges. If its limit is $a$, then
\[ \lim_{n\to\infty} a_{n+1} = \lim_{n\to\infty} \sqrt{2 + a_n} = \sqrt{2 + \lim_{n\to\infty} a_n} \implies a = \sqrt{2 + a} \implies a = 2, \]
where the positive root of $a^2 = 2 + a$ has been taken because all $a_n$ are positive.
□
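A quick floating-point iteration of the recurrence in Example 8.9 shows the same behavior. This is a minimal sketch; the function name `nested_root_terms` and the choice of 15 terms are arbitrary, and rounding errors would eventually hide the strict increase if many more terms were taken.

```python
import math

def nested_root_terms(count):
    """First `count` terms of a_1 = sqrt(2), a_{n+1} = sqrt(2 + a_n) in floating point."""
    terms = [math.sqrt(2.0)]
    while len(terms) < count:
        terms.append(math.sqrt(2.0 + terms[-1]))
    return terms

roots = nested_root_terms(15)
roots_increasing = all(a < b for a, b in zip(roots, roots[1:]))
roots_bounded = all(a < 3 for a in roots)
print(roots_increasing, roots_bounded, roots[-1])
```

The computed terms increase, stay below 3, and approach 2.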

50.2. Exercises.

In (1)–(5), find the limit of the sequence $\{a_n\}$ or show that it does not exist.

(1) $a_n = \sqrt[n]{2n^2 + 3}$.
(2) $a_n = \cos^{2n}(n^2)/n$.
(3) $a_n = nr^n$, where $r$ is real.
(4) $a_n = (2n-1)!!/(2n)^n$, where $(2n-1)!! = 1\cdot 3\cdot 5\cdots(2n-1)$.
(5) $a_n = \sqrt[n]{3^n + 5^n}$.

In (6)–(10), determine whether the sequence is monotonic or not monotonic. Is the sequence bounded?

(6) $a_n = (-2)^n$.
(7) $a_n = (-1)^n n$.
(8) $a_n = ne^{-n}$.
(9) $a_n = n + \frac1n$.
(10) $a_n = \sin(qn)/n$, where $q$ is real.

In (11)–(15), find the limit of the sequence or show that it does not exist.

(11) $a_1 = 1$ and $a_{n+1} = 4 - a_n$.
(12) $a_1 = 1$ and $a_{n+1} = 1/(1 + a_n)$.
(13) $a_1 = 1$ and $a_{n+1} = 3 - 1/a_n$.
(14) $a_1 = 2$ and $a_{n+1} = 1/(3 - a_n)$.
(15) $a_1 = 1$ and $a_{n+1} = 1 + 1/(1 + a_n)$.
(16) The size of an undisturbed fish population has been modeled by the formula $p_{n+1} = bp_n/(a + p_n)$, where $p_n$ is the fish population after $n$ years and $a$ and $b$ are positive constants that depend on the species and the environment. Suppose that $p_0 > 0$. Show that $p_{n+1} < (b/a)p_n$. Then prove that $p_n \to 0$ if $a > b$; that is, the population dies out. Finally, show that $p_n \to b - a$ if $b > a$.
Hint: Show that $p_n$ is increasing and bounded, $0 < p_n < b - a$, if $p_0 < b - a$. If $p_0 > b - a$, then $p_n$ is decreasing and bounded, $p_n > b - a$.

51. Series

51.1. Basic Definitions and Notation. With a sequence $\{a_n\}$, one can associate a sequence $\{s_n\}$, where
\[ s_n = \sum_{k=1}^n a_k = a_1 + a_2 + \cdots + a_n. \]
The symbol
\[ \sum_{n=1}^\infty a_n = a_1 + a_2 + a_3 + \cdots \tag{8.6} \]
is called an infinite series, or just a series. The numbers $s_n$ are called the partial sums of the series (8.6). The limits of summation are often omitted to denote a series; that is, the symbol $\sum a_n$ also stands for an infinite series. If $\{s_n\}$ converges to $s$, then the series is said to converge and one writes
\[ \sum_{n=1}^\infty a_n = s \quad\text{or}\quad \lim_{n\to\infty} \sum_{k=1}^n a_k = s. \]
The number $s$ is called the sum of the series. If the sequence of partial sums $\{s_n\}$ diverges, the series is said to diverge. It should be understood that $s$ is the limit of a sequence of sums, and it is not obtained merely by addition.


For example, the sequence of partial sums for the series $\sum (-1)^n$ is $s_1 = -1$, $s_2 = -1 + 1 = 0$, $s_3 = s_2 - 1 = -1$, or, generally, $s_n = ((-1)^n - 1)/2$. This sequence diverges as it has two subsequences $s_{2n} = 0$ and $s_{2n-1} = -1$, which converge to different numbers, $0 \ne -1$. If one simply uses addition, different values for the sum of the series may be obtained:
\[ \sum_{n=1}^\infty a_n = (a_1 + a_2) + (a_3 + a_4) + (a_5 + a_6) + \cdots = 0 + 0 + \cdots = 0, \]
\[ \sum_{n=1}^\infty a_n = a_1 + (a_2 + a_3) + (a_4 + a_5) + \cdots = -1 + 0 + 0 + \cdots = -1, \]
\[ \sum_{n=1}^\infty a_n = (a_1 + a_2 + a_3) + (a_4 + a_5 + a_6) + \cdots = -1 + 1 - 1 + \cdots, \]
where the last grouping again produces a divergent series. If, in addition, the terms are reordered (as the commutativity of addition would allow in a finite sum), the sum can be made equal to any integer! The reader is advised to verify this. Thus, the addition rules cannot generally be applied to evaluate the sum of a series.
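The partial sums of $\sum (-1)^n$ can be tabulated directly and compared with the closed form $s_n = ((-1)^n - 1)/2$. The sketch below uses only the standard library; the variable names are illustrative.

```python
from itertools import accumulate

indices = range(1, 13)
terms = [(-1) ** n for n in indices]                    # a_1, ..., a_12
partial_sums = list(accumulate(terms))                  # s_1, ..., s_12
closed_form = [((-1) ** n - 1) // 2 for n in indices]   # ((-1)^n - 1)/2

odd_indexed = partial_sums[0::2]    # s_1, s_3, s_5, ...
even_indexed = partial_sums[1::2]   # s_2, s_4, s_6, ...
print(partial_sums)
```

The odd-indexed and even-indexed partial sums form two constant subsequences with different values, so the sequence of partial sums has no limit.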

51.2. Geometric Series. Take a piece of rope of length 1 m. Cut it in half. Keep one half and cut the other half into two pieces of equal length. Keep doing this, that is, keeping one half and cutting the other half into two equal-length pieces. The total length of the retained pieces is
\[ \frac12 + \frac14 + \frac18 + \cdots = \frac12\left(1 + \frac12 + \frac14 + \cdots\right) = \frac12\sum_{n=0}^\infty \frac{1}{2^n}. \]
This series must converge. The partial sum $s_n$ here is the total length of retained pieces. The sequence $\{s_n\}$ is monotonically increasing (after each cut a piece of rope is added) and bounded by the total length 1. So it converges. From the geometry, it is also clear that $1 - s_n = 1/2^n$, where $n$ is the number of cuts, and hence $s_n \to 1$ as one would expect (the total length of the rope). So it is concluded that
\[ \sum_{n=0}^\infty \frac{1}{2^n} = 2. \]
This series is an example of the geometric series:
\[ 1 + q + q^2 + q^3 + \cdots = \sum_{n=0}^\infty q^n, \]
where $q$ is a number. The geometric series does not converge for every value of $q$.

Theorem 8.8 (Convergence of a Geometric Series). A geometric series $\sum_{n=0}^\infty q^n$ converges if $|q| < 1$, and, in this case,
\[ \sum_{n=0}^\infty q^n = \frac{1}{1 - q}, \quad |q| < 1, \]
and the series diverges otherwise.

Proof. If $q = 1$, the sequence of partial sums $s_n = n$ obviously diverges. If $q \ne 1$, one has
\[ s_n = 1 + q + q^2 + \cdots + q^{n-1} \implies qs_n = q + q^2 + q^3 + \cdots + q^n. \]
Subtracting these equations, one infers
\[ s_n - qs_n = 1 - q^n \implies s_n = \frac{1 - q^n}{1 - q}. \]
Therefore,
\[ \lim_{n\to\infty} s_n = \lim_{n\to\infty} \frac{1 - q^n}{1 - q} = \frac{1}{1 - q} - \frac{1}{1 - q}\lim_{n\to\infty} q^n. \]
For $q \ne 1$, it has been found (Example 8.3) that the sequence $a_n = q^n$ converges only if $|q| < 1$, and, in this case, $q^n \to 0$ as $n \to \infty$, so that $s_n \to 1/(1 - q)$. If $|q| \ge 1$, the geometric series diverges. □
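The closed form for the partial sums obtained in the proof can be compared with term-by-term summation. The sketch below uses exact fractions with the sample ratio $q = -2/3$, which is an arbitrary choice for illustration; the function names are not from the text.

```python
from fractions import Fraction

def geometric_partial_sum(q, n):
    """s_n = 1 + q + ... + q^(n-1), summed term by term."""
    return sum(q ** k for k in range(n))

def closed_form(q, n):
    """(1 - q^n)/(1 - q), valid for q != 1."""
    return (1 - q ** n) / (1 - q)

q = Fraction(-2, 3)
formulas_agree = all(geometric_partial_sum(q, n) == closed_form(q, n) for n in range(1, 30))
limit_value = 1 / (1 - q)                       # 1/(1 - q) for |q| < 1
distance = abs(closed_form(q, 60) - limit_value)
print(formulas_agree, limit_value, float(distance))
```

For $|q| < 1$ the distance between $s_n$ and $1/(1-q)$ is $|q|^n/|1-q|$, which shrinks geometrically.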

Example 8.10. Analyze the convergence of the series $4 - \frac83 + \frac{16}{9} - \frac{32}{27} + \cdots$.

Solution: The series can be written in the form $4q^0 + 4q^1 + 4q^2 + 4q^3 + \cdots$, where $q = -2/3$. So its partial sums are four times the partial sums of the geometric series with $q = -2/3$. Therefore,
\[ \sum_{n=0}^\infty 4\left(-\frac23\right)^n = 4\sum_{n=0}^\infty \left(-\frac23\right)^n = \frac{4}{1 - (-\frac23)} = \frac{12}{5}. \]
□

When real numbers are presented in decimal form, one often encounters a situation in which a number has a repeated pattern of decimal places. Take, for example, the number 1.2131313...; that is, the combination 13 repeats itself in all decimal places starting in the second decimal place.

Example 8.11. Is the number 1.2131313... rational or irrational? If it is rational, write it as a ratio of integers.


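Solution: By definition of the decimal representation,
\[ 1.2131313... = 1.2 + \frac{13}{10^3} + \frac{13}{10^5} + \frac{13}{10^7} + \cdots = 1.2 + \frac{13}{10^3}\sum_{n=0}^\infty \left(\frac{1}{100}\right)^n \]
\[ = 1.2 + \frac{13}{10^3}\cdot\frac{100}{99} = \frac{12}{10} + \frac{13}{990} = \frac{1201}{990}, \]
so the number is rational.
□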
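The arithmetic of Example 8.11 can be verified with exact rational numbers. The sketch below simply repeats the steps of the solution; the variable names are illustrative.

```python
from fractions import Fraction

non_repeating_part = Fraction(12, 10)            # 1.2
geometric_sum = 1 / (1 - Fraction(1, 100))       # sum of (1/100)^n over n >= 0
value = non_repeating_part + Fraction(13, 1000) * geometric_sum
print(value, float(value))
```

The same pattern applies to any decimal with a repeating block: the block contributes a geometric series whose ratio is a negative power of 10.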

51.3. Necessary Condition for a Series to Converge. The following theorem follows from the limit laws applied to the sequences of partial sums.

Theorem 8.9 (Properties of Series). Suppose that the series $\sum a_n$ and $\sum b_n$ are convergent and their sums are $s$ and $t$, respectively. Let $c$ be a number. Then the series $\sum (a_n + b_n)$ and $\sum ca_n$ converge and
\[ \sum (a_n + b_n) = \sum a_n + \sum b_n = s + t, \qquad \sum (ca_n) = c\sum a_n = cs. \]
Indeed, if $\{s_n\}$ and $\{t_n\}$ are the sequences of partial sums of the series $\sum a_n$ and $\sum b_n$, respectively, then the partial sums of the series $\sum (a_n + b_n)$ and $\sum (ca_n)$ are $s_n + t_n$ and $cs_n$, respectively. By the limit laws, $s_n + t_n \to s + t$ and $cs_n \to cs$.

Note that the convergence of the series $\sum (a_n + b_n)$ does not imply the convergence of $\sum a_n$ and $\sum b_n$. For example, put $a_n = 1$ and $b_n = -1$. The series $\sum (a_n + b_n) = \sum 0 = 0$, while $\sum a_n = \sum 1$ and $\sum b_n = \sum (-1)$ diverge, and the equality $\sum (a_n + b_n) = \sum a_n + \sum b_n$ becomes meaningless ("$0 = \infty - \infty$"). This shows that the rules of algebra for finite sums are not generally applicable to series. Only series from a special class of absolutely convergent series, discussed later, behave pretty much as finite sums.

It is clear that every theorem about sequences can be stated in terms of series by putting $a_1 = s_1$ and $a_n = s_n - s_{n-1}$ for $n > 1$ and vice versa. In particular, if the series converges, that is, $s_n \to s$ as $n \to \infty$, then one can take the limit on both sides of this recurrence relation and conclude that $\lim_{n\to\infty} a_n = \lim_{n\to\infty} (s_n - s_{n-1}) = s - s = 0$; that is, for a convergent series $\sum a_n$, the sequence $\{a_n\}$ necessarily converges to 0.

Theorem 8.10 (Necessary Condition for a Series to Converge). If the series $\sum a_n$ converges, then $\lim_{n\to\infty} a_n = 0$.

The converse is not generally true; that is, the condition $\lim_{n\to\infty} a_n = 0$ is not sufficient for a series to converge. However, it can still be used as a test for divergence of a series.

Corollary 8.1 (Test for Divergence of a Series). If the limit $\lim_{n\to\infty} a_n$ does not exist or if $\lim_{n\to\infty} a_n \ne 0$, then the series $\sum a_n$ diverges.

Example 8.12. Show that the series $\sum n^3/(3n^3 + 1)$ diverges.

Solution:
\[ \lim_{n\to\infty} a_n = \lim_{n\to\infty} \frac{n^3}{3n^3 + 1} = \lim_{n\to\infty} \frac{1}{3 + \frac{1}{n^3}} = \frac13 \ne 0, \]
so the series diverges. □

If the necessary condition is satisfied, the series may converge or diverge. The sequence of partial sums has to be analyzed.

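Example 8.13. Find the sum of the series $\sum_{n=1}^\infty \frac{1}{n(n+1)}$ if it exists or show that it does not exist.

Solution: The necessary condition for convergence is evidently satisfied. So the sequence of partial sums has to be analyzed for convergence:
\[ s_n = \sum_{k=1}^n \frac{1}{k(k+1)} = \sum_{k=1}^n \left(\frac1k - \frac{1}{k+1}\right) \]
\[ = \left(1 - \frac12\right) + \left(\frac12 - \frac13\right) + \left(\frac13 - \frac14\right) + \cdots + \left(\frac1n - \frac{1}{n+1}\right) = 1 - \frac{1}{n+1} \to 1 \quad\text{as } n \to \infty. \]
So the sequence $\{s_n\}$ converges to 1 and hence
\[ \sum_{n=1}^\infty \frac{1}{n(n+1)} = 1. \]
□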
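The telescoping cancellation in Example 8.13 can be confirmed with exact fractions. This is a minimal sketch; the function name `partial_sum` is illustrative.

```python
from fractions import Fraction

def partial_sum(n):
    """s_n = sum of 1/(k(k+1)) for k = 1, ..., n, as an exact fraction."""
    return sum(Fraction(1, k * (k + 1)) for k in range(1, n + 1))

telescoping_holds = all(partial_sum(n) == 1 - Fraction(1, n + 1) for n in range(1, 51))
print(telescoping_holds, partial_sum(4))
```

Since $s_n = 1 - 1/(n+1)$ holds exactly for each tested $n$, the partial sums approach 1 from below.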

This example is a particular case of a telescopic series.

Theorem 8.11 (Convergence of a Telescopic Series). A telescopic series $\sum_{n=1}^\infty (a_n - a_{n+1})$ converges if $\lim_{n\to\infty} a_n = a$, and, in this case,
\[ \sum_{n=1}^\infty (a_n - a_{n+1}) = a_1 - a. \]
The proof is analogous to the above example and based on the fact that the sequence of partial sums of a telescopic series $s_n = a_1 - a_{n+1}$ converges to $a_1 - a$. The details are left to the reader as an exercise.


51.4. Exercises.

In (1)–(3), determine whether the geometric series converges or diverges.

(1) $\sum_{n=0}^\infty \frac{\pi^n}{3^{n+1}}$  (2) $\sum_{n=1}^\infty \frac{e^n}{3^{n-1}}$  (3) $\sum_{n=0}^\infty \frac{(-5)^n}{3^{2n}}$

In (4)–(9), determine whether the series converges or diverges. If it converges, find its sum. Here $p$ is a positive number, $p > 0$.

(4) $\sum_{k=1}^\infty \frac{k^2}{k^2 + k + 1}$  (5) $\sum_{n=1}^\infty \frac{2 - 3^n}{5^n}$  (6) $\sum_{n=1}^\infty \sqrt[n]{p}$

(7) $\sum_{n=1}^\infty (\sin p)^n$  (8) $\sum_{n=1}^\infty \frac{e^n}{n^p}$  (9) $\sum_{n=1}^\infty \left(e^{-2n} + \frac{4}{n(n+1)}\right)$

In (10)–(12), determine whether the series converges or diverges by expressing it as a telescopic series. Find the sum of the series if it exists.

(10) $\sum_{n=2}^\infty \frac{1}{n^2 - 1}$  (11) $\sum_{n=1}^\infty \frac{3}{n^2 + 3n + 3}$  (12) $\sum_{n=1}^\infty \ln\frac{n}{n+1}$

In (13) and (14), express the number as a ratio of integers.

(13) 1.23232323....
(14) 1.53525252....

In (15)–(17), find the values of $x$ for which the series converges. Find the sum of the series for those values of $x$.

(15) $\sum_{n=1}^\infty \frac{x^n}{2^n}$  (16) $\sum_{n=1}^\infty \frac{\sin^n x}{3^n}$  (17) $\sum_{n=1}^\infty (x - 5)^n$

In (18) and (19), solve the equation.

(18) $\sum_{n=2}^\infty (1 + x)^{-n} = 3$  (19) $\sum_{n=0}^\infty e^{nx} = 9$

52. Series of Nonnegative Terms

In many applications, the terms of a series decrease monotonically. It appears that there is a relation between convergence of such series and convergence of improper integrals over an interval $[1,\infty)$. This relation allows one to establish a necessary and sufficient condition for series of nonnegative terms to converge.


52.1. The Integral Test. Suppose $f(x)$ is a positive, continuous, monotonically decreasing function on $[1,\infty)$ such that $f(x) \to 0$ as $x \to \infty$. Suppose also that the improper integral
\[ \int_1^\infty f(x)\,dx = \lim_{a\to\infty} \int_1^a f(x)\,dx = I_f \]
exists. The value $I_f$ is the area under the graph $y = f(x)$ over the interval $[1,\infty)$. Consider the series $\sum_{n=1}^\infty f(n)$. The necessary condition for convergence is fulfilled as $f(n) \to 0$ as $n \to \infty$. To investigate the convergence of the series, one has to analyze the convergence of its partial sums:
\[ s_n = \sum_{k=1}^n f(k) = f(1) + f(2) + \cdots + f(n). \]
Since the function $f(x)$ monotonically decreases and is continuous on every interval $[k, k+1]$, it attains its minimal and maximal values at the endpoints, $f(k+1) \le f(x) \le f(k)$ on this interval, and therefore
\[ f(k+1) \le \int_k^{k+1} f(x)\,dx \le f(k). \tag{8.7} \]
This inequality leads to the following upper and lower estimates of the partial sums:
\[ s_n \le f(1) + \int_1^2 f(x)\,dx + \cdots + \int_{n-1}^n f(x)\,dx = f(1) + \int_1^n f(x)\,dx, \]
\[ s_n \ge \int_1^2 f(x)\,dx + \int_2^3 f(x)\,dx + \cdots + \int_n^{n+1} f(x)\,dx = \int_1^{n+1} f(x)\,dx, \]
so that
\[ \int_1^{n+1} f(x)\,dx \le s_n \le f(1) + \int_1^n f(x)\,dx \quad\text{for all } n \ge 1. \tag{8.8} \]
This inequality shows that the following theorem holds.

This inequality shows that the following theorem holds.

Figure 8.7. Integral test. An illustration of inequality (8.7).


Theorem 8.12 (Integral Test). Suppose $f$ is a continuous, positive, decreasing function on $[1,\infty)$ and let $a_n = f(n)$. Then the series $\sum_{n=1}^\infty a_n$ converges if and only if the improper integral $\int_1^\infty f(x)\,dx$ converges. In other words,
\[ \int_1^\infty f(x)\,dx \text{ converges} \implies \sum_{n=1}^\infty f(n) \text{ converges}, \]
\[ \int_1^\infty f(x)\,dx \text{ diverges} \implies \sum_{n=1}^\infty f(n) \text{ diverges}. \]

Proof. If the improper integral converges to a number $I_f$, then by (8.8) the sequence of partial sums is bounded, $s_n \le f(1) + I_f$, and monotonically increases, $s_n \le s_n + f(n+1) = s_{n+1}$. Therefore, it is convergent. If the improper integral diverges, then, for any number $M > 0$, there is an integer $N$ such that $\int_1^{n+1} f(x)\,dx \ge M$ for all $n > N$. By the left inequality of (8.8), $M \le s_n$ for all $n > N$; that is, $\{s_n\}$ is a monotonically increasing, unbounded sequence and hence it diverges. □
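Inequality (8.8) can be tested on a concrete function. The sketch below assumes the sample choice $f(x) = 1/x^2$, for which $\int_1^b f(x)\,dx = 1 - 1/b$ is known in closed form, and uses exact fractions; the function names are illustrative.

```python
from fractions import Fraction

def f(k):
    """Sample function f(x) = 1/x^2 evaluated at a positive integer k."""
    return Fraction(1, k * k)

def integral_from_1(b):
    """Exact value of the integral of 1/x^2 over [1, b], namely 1 - 1/b."""
    return 1 - Fraction(1, b)

def bounds_hold(n):
    """Check integral over [1, n+1] <= s_n <= f(1) + integral over [1, n]."""
    s_n = sum(f(k) for k in range(1, n + 1))
    return integral_from_1(n + 1) <= s_n <= f(1) + integral_from_1(n)

all_bounds_hold = all(bounds_hold(n) for n in range(1, 201))
print(all_bounds_hold)
```

Because the upper bound $f(1) + 1 - 1/n$ stays below 2 for every $n$, the partial sums of $\sum 1/n^2$ are bounded, which is the mechanism used in the proof.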

Remark. Suppose that $a_n = f(n)$, where $f(x)$ is a function on $[1,\infty)$ that is continuous, positive, and decreasing on $[N,\infty)$, where $N \ge 1$ is an integer. Then
\[ \sum_{n=1}^\infty a_n \text{ converges} \iff \int_N^\infty f(x)\,dx \text{ converges}; \tag{8.9} \]
that is, the integral test applies even if the sequence $a_n$ becomes monotonically decreasing only for $n \ge N \ge 1$. This is easy to understand by isolating the first $N - 1$ terms in the series
\[ \sum_{n=1}^\infty a_n = a_1 + a_2 + \cdots + a_{N-1} + \sum_{n=N}^\infty a_n = a_1 + a_2 + \cdots + a_{N-1} + \sum_{n=1}^\infty b_n, \]
where $b_n = a_{N+n-1}$. Convergence of $\sum b_n$ implies convergence of $\sum a_n$ and vice versa as they differ by a number. Put $b_n = g(n)$, where $g(x) = f(x + N - 1)$, which is a continuous, positive, decreasing function on $[1,\infty)$, and
\[ \int_1^\infty g(x)\,dx = \int_1^\infty f(x + N - 1)\,dx = \int_N^\infty f(u)\,du \]
by changing the integration variable $u = x + N - 1$.


52.2. Special Series of Nonnegative Terms.

Theorem 8.13. The $p$-series
\[ \sum_{n=1}^\infty \frac{1}{n^p} \]
converges if $p > 1$ and diverges if $p \le 1$.

Proof. If $p \le 0$, the series diverges because the necessary condition for convergence is not fulfilled: $a_n \to \infty$ if $p < 0$ and $a_n = 1 \ne 0$ if $p = 0$. For $p > 0$, consider the function $f(x) = x^{-p}$, which is positive, continuous, and decreasing on $[1,\infty)$, and
\[ \int_1^a \frac{dx}{x^p} = \begin{cases} \dfrac{1}{p-1}\left(1 - \dfrac{1}{a^{p-1}}\right) & \text{if } p \ne 1, \\[2mm] \ln a & \text{if } p = 1. \end{cases} \]
The limit $a \to \infty$ exists only if $p > 1$; that is, the improper integral converges if $p > 1$ and diverges if $0 < p \le 1$. So, by the integral test, the series converges if $p > 1$ and diverges if $0 < p \le 1$. □

Note that the series $\sum n^{-p}$ diverges for all $0 < p \le 1$ despite the fact that the necessary condition for convergence is fulfilled: $a_n = n^{-p} \to 0$. In particular, the harmonic series $\sum_{n=1}^\infty \frac1n$ diverges.

The sum of a $p$-series $\zeta(p) = \sum n^{-p}$ depends on the value of $p > 1$; that is, this series defines a function on $(1,\infty)$. This function is called Riemann's zeta function.
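The slow divergence of the harmonic series can be seen numerically from inequality (8.8) with $f(x) = 1/x$, which gives $\ln(n+1) \le s_n \le 1 + \ln n$. The sketch below checks these bounds in floating point for a few sample values of $n$; the function name and the sample values are arbitrary.

```python
import math

def harmonic_partial_sum(n):
    """s_n = 1 + 1/2 + ... + 1/n, accumulated with math.fsum to limit rounding error."""
    return math.fsum(1.0 / k for k in range(1, n + 1))

sample_sizes = (10, 1000, 100000)
harmonic_bounds_hold = all(
    math.log(n + 1) <= harmonic_partial_sum(n) <= 1 + math.log(n)
    for n in sample_sizes
)
for n in sample_sizes:
    print(n, harmonic_partial_sum(n), math.log(n + 1))
```

The partial sums grow without bound, but only logarithmically, which is why the divergence is not visible from the first few terms.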

Example 8.14. Investigate the convergence of the series $\sum_{n=1}^\infty (n + 2)^{-3/2}$.

Solution: The series can be written as
\[ \sum_{n=1}^\infty \frac{1}{(n+2)^{3/2}} = \sum_{n=3}^\infty \frac{1}{n^{3/2}} = -1 - \frac{1}{2^{3/2}} + \sum_{n=1}^\infty \frac{1}{n^{3/2}}. \]
The latter series is a $p$-series that converges for $p = 3/2 > 1$. □

Theorem 8.14. The series
$$\sum_{n=2}^{\infty} \frac{1}{n(\ln n)^p}$$
converges if $p > 1$, and it diverges if $p \le 1$.

Proof. Consider the function $g(x) = x(\ln x)^p$ for $x > 1$. Its derivative reads $g'(x) = (\ln x)^{p-1}(p + \ln x)$. If $p \ge 0$, then $g'(x) > 0$ for all $x > 1$ and $g(x)$ increases, so its reciprocal $f(x) = 1/g(x)$ decreases. If $p < 0$, then $g'(x) > 0$ for all $x > e^{-p}$ and hence $g(x)$ increases, while $f(x) = 1/g(x)$ decreases if $x > e^{-p} > 1$. Thus, for any $p$, there is an integer $N$ such that the function $f(x) = 1/[x(\ln x)^p]$ is continuous, positive, and decreasing on $[N,\infty)$. By the integral test (8.9), the series in question converges if and only if the improper integral

$$\int_N^{\infty} \frac{dx}{x(\ln x)^p} = \int_{\ln N}^{\infty} \frac{du}{u^p}$$

converges, where the integration variable has been changed, $u = \ln x$, $du = dx/x$. This integral diverges if $p \le 1$ and converges if $p > 1$, and the conclusion of the theorem follows. □

52.3. Estimate of the Sum. If a partial sum $s_n$ is used to estimate the sum of a convergent series of nonnegative terms $\sum f(n)$, how good is such an estimate? The remainder $s - s_n$ has to be investigated to answer this question.

Corollary 8.2 (Estimate of Sums). Suppose $f$ is a continuous, positive, decreasing function on $[1,\infty)$ and let $a_n = f(n)$. If the series $\sum a_n$ converges to a number $s$, then

$$\int_{n+1}^{\infty} f(x)\,dx \le s - s_n \le \int_n^{\infty} f(x)\,dx,$$

where $\{s_n\}$ is the sequence of partial sums.

Proof. The first inequality is obtained by taking the limit $n \to \infty$ in (8.8) with the result

(8.10) $$\int_1^{\infty} f(x)\,dx \le \sum_{n=1}^{\infty} a_n \le f(1) + \int_1^{\infty} f(x)\,dx,$$

which is a legitimate operation because (8.8) holds for all $n$ and the series converges (and so does the improper integral by the integral test). The remainder estimate is obtained by subtracting (8.8) from (8.10). Note that the value of the improper integral does not coincide with the sum; it only determines an interval (8.10) in which the sum of the series lies. □

Example 8.15. Test the series $\sum_{n=1}^{\infty} (n^2+1)^{-1}$ for convergence or divergence. If it converges, estimate its sum.

Solution: Put $f(x) = (x^2+1)^{-1}$, which is a continuous, positive, decreasing function on $[1,\infty)$, such that the series in question is $\sum f(n)$. Therefore, the integral test applies, and the series converges because

$$\int_1^{\infty} \frac{dx}{x^2+1} = \lim_{a\to\infty} \tan^{-1} x \,\Big|_1^a = \lim_{a\to\infty} \tan^{-1} a - \frac{\pi}{4} = \frac{\pi}{2} - \frac{\pi}{4} = \frac{\pi}{4}.$$

By (8.10), its sum lies in the interval $\frac{\pi}{4} \le s \le f(1) + \frac{\pi}{4} = \frac{1}{2} + \frac{\pi}{4}$. □
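The bounds of Corollary 8.2 can be checked numerically for this example. The following is a minimal sketch, not part of the original text: the helper names `tail_integral` and `sum_bounds` are hypothetical, and the closed form $(\pi\coth\pi - 1)/2$ for the sum is a known result quoted here only as a reference value.

```python
import math

def f(x):
    # Term function of Example 8.15: f(x) = 1 / (x^2 + 1)
    return 1.0 / (x * x + 1.0)

def tail_integral(n):
    # Integral of f from n to infinity: pi/2 - arctan(n)
    return math.pi / 2 - math.atan(n)

def sum_bounds(n):
    # Corollary 8.2: s_n + int_{n+1}^inf f <= s <= s_n + int_n^inf f
    s_n = sum(f(k) for k in range(1, n + 1))
    return s_n + tail_integral(n + 1), s_n + tail_integral(n)

lower, upper = sum_bounds(10)
# Reference value (assumption: standard closed form, not derived in the text)
exact = (math.pi / math.tanh(math.pi) - 1) / 2
print(lower, exact, upper)
```

With only ten terms the two bounds already differ by less than $f(10) = 1/101$, which is much tighter than the interval obtained from (8.10) alone.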


Example 8.16. Test the series $\sum_{n=1}^{\infty} n e^{-n}$ for convergence or divergence. If it converges, estimate its sum.

Solution: Consider the function $f(x) = xe^{-x}$. Since $f'(x) = e^{-x} - xe^{-x} = (1-x)e^{-x} \le 0$ if $x \ge 1$, the function decreases on $[1,\infty)$, and the integral test applies to assess convergence of the series $\sum f(n)$:

$$\int_1^{\infty} xe^{-x}\,dx = -\int_1^{\infty} x\,de^{-x} = -\lim_{a\to\infty} xe^{-x}\Big|_1^a + \int_1^{\infty} e^{-x}\,dx = \frac{1}{e} + \frac{1}{e} = \frac{2}{e},$$

where integration by parts has been used to evaluate the integral. The series converges to a number $s$ that lies in the interval $2e^{-1} \le s \le f(1) + 2e^{-1} = 3e^{-1}$. □

Example 8.17. Estimate values of Riemann's zeta function $\zeta(p)$. How many terms does one need to retain in the partial sum $s_n$ to approximate $\zeta(p)$ correct to $N$ decimal places?

Solution: Riemann's zeta function is defined by the sum of the series $\zeta(p) = \sum_{n=1}^{\infty} n^{-p}$. For $p > 1$,

$$\int_1^{\infty} \frac{dx}{x^p} = \lim_{a\to\infty} \frac{x^{1-p}}{1-p}\Big|_1^a = \lim_{a\to\infty} \frac{a^{1-p}}{1-p} - \frac{1}{1-p} = \frac{1}{p-1}.$$

Since $f(1) = 1$, by (8.10),

$$\frac{1}{p-1} \le \zeta(p) \le \frac{p}{p-1}.$$

By Corollary 8.2,

$$0 \le \zeta(p+1) - s_n \le \int_n^{\infty} \frac{dx}{x^{p+1}} = \frac{1}{pn^p}.$$

If $\zeta(p+1)$, $p > 0$, is to be approximated by $s_n$ correct to $N$ decimal places, then the remainder should be less than $5\cdot 10^{-N-1}$, which yields the condition on the number of terms: $\frac{1}{pn^p} < 5\cdot 10^{-N-1}$ or $n^p > 10^{N+1}/(5p)$ or $n \sim \sqrt[p]{10^{N+1}/(5p)}$. □
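The condition $n^p > 10^{N+1}/(5p)$ can be turned into a small search. The sketch below is an illustration only: the function names are hypothetical, and $p$ is restricted to positive integers (an assumption made here so that the comparison can be done in exact integer arithmetic). The value used for $\zeta(3)$ is the standard numerical value of Apéry's constant, not something computed in the text.

```python
def terms_needed(p: int, N: int) -> int:
    # Smallest n with 1/(p n^p) < 5 * 10^(-N-1), i.e. 5 p n^p > 10^(N+1).
    # p is assumed to be a positive integer so the test is exact.
    target = 10 ** (N + 1)
    n = 1
    while 5 * p * n ** p <= target:
        n += 1
    return n

def partial_zeta(q: int, n: int) -> float:
    # Partial sum s_n of the series for zeta(q)
    return sum(1.0 / k ** q for k in range(1, n + 1))

# zeta(3) = zeta(p + 1) with p = 2, correct to N = 4 decimal places
n = terms_needed(2, 4)
ZETA_3 = 1.2020569031595942  # reference value (assumption: Apery's constant)
print(n, ZETA_3 - partial_zeta(3, n))
```

The linear search is deliberately naive; for large $N$ and small $p$ one would start from the closed-form estimate $\sqrt[p]{10^{N+1}/(5p)}$ instead.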


52.4. Exercises.
In (1)–(9), determine whether the series converges or diverges.

(1) $\sum_{n=1}^{\infty} \frac{1}{n^{9/8}}$
(2) $\sum_{n=2}^{\infty} \frac{(\ln n)^4}{n}$
(3) $\sum_{n=2}^{\infty} \frac{1 - n\ln n}{n^2}$
(4) $\sum_{n=1}^{\infty} \frac{n^2 - 2n - 5}{3n^{7/3}}$
(5) $\sum_{n=1}^{\infty} \frac{1}{n^2 - 4n + 5}$
(6) $\sum_{n=1}^{\infty} \frac{n}{n^4 + 1}$
(7) $\sum_{n=1}^{\infty} \frac{e^{1/n}}{n^2}$
(8) $\sum_{n=1}^{\infty} \frac{2n+1}{n(n+1)}$
(9) $\sum_{n=1}^{\infty} \frac{\tan^{-1} n}{n^2 + 1}$

In (10)–(14), determine the values of $p$ for which the series is convergent.

(10) $\sum_{n=3}^{\infty} \frac{1}{n \ln n\,(\ln(\ln n))^p}$
(11) $\sum_{n=1}^{\infty} n(1+n^2)^p$
(12) $\sum_{n=1}^{\infty} n^p e^{-n}$
(13) $\sum_{n=1}^{\infty} p^{\ln n}$, $p > 0$
(14) $\sum_{n=1}^{\infty} \left(\frac{p}{n} - \frac{1}{n+1}\right)$

(15) How many terms of the series in Theorem 8.14 would one need to add to find its sum correct to $N$ decimal places?
(16) Show that the sequence
$$a_n = 1 + \frac{1}{2} + \frac{1}{3} + \cdots + \frac{1}{n} - \ln n$$
converges. The limit $\lim_{n\to\infty} a_n = \gamma$ is called Euler's constant.
Hints: (1) Use (8.8) to show that if $s_n$ is the partial sum of the harmonic series, then $s_n \le 1 + \ln n$ and hence $a_n \le 1$ (i.e., the sequence $\{a_n\}$ is bounded).
(2) Interpret $a_n - a_{n+1}$ as a difference of areas to show that $\{a_n\}$ is monotonic.

53. Comparison Tests

Consider the series
$$\sum_{n=1}^{\infty} a_n = \sum_{n=1}^{\infty} \frac{1}{n^4+1}.$$
This series has terms smaller than the corresponding terms of the convergent p-series
$$\sum_{n=1}^{\infty} b_n = \sum_{n=1}^{\infty} \frac{1}{n^4}$$
because $a_n < b_n$ for all $n$. If $s_n$ is the partial sum for $\sum a_n$ and $t_n$ is the partial sum for $\sum b_n$, then $s_n < t_n$. Since $t_n$ increases and converges to a number $t$, it is bounded, $t_n < t$, and hence $s_n < t$; that is, the sequence $\{s_n\}$ is monotonic and bounded, and therefore it converges. This line of argument admits a generalization.

Theorem 8.15 (Comparison Test). Suppose that $\sum a_n$ and $\sum b_n$ are series such that $a_n \ge 0$ and $b_n \ge 0$ for all $n \ge N$ and some integer $N \ge 1$. Then

$\sum b_n$ converges and $a_n \le b_n$ for all $n \ge N$ $\Longrightarrow$ $\sum a_n$ converges,
$\sum b_n$ diverges and $a_n \ge b_n$ for all $n \ge N$ $\Longrightarrow$ $\sum a_n$ diverges.

Proof. The series $\sum_{n=1}^{\infty} a_n$ and $\sum_{n=N}^{\infty} a_n$ differ by a number $a_1 + a_2 + \cdots + a_{N-1}$. So convergence of $\sum_{n=N}^{\infty} a_n$ implies convergence of $\sum_{n=1}^{\infty} a_n$ and vice versa. Therefore, it is sufficient to consider the case $N = 1$. The sequences of partial sums $\{s_n\}$, $s_n = a_1 + a_2 + \cdots + a_n$, and $\{t_n\}$, $t_n = b_1 + b_2 + \cdots + b_n$, are monotonically increasing sequences because $a_n \ge 0$ and $b_n \ge 0$. If $\sum b_n$ converges, then $t_n \to t$ as $n \to \infty$ and $t_n \le t$ for all $n$. By the hypothesis $a_n \le b_n$ for all $n \ge 1$, and therefore $s_n \le t_n \le t$, which shows that $\{s_n\}$ is monotonic and bounded and, hence, converges. If $\sum b_n$ diverges, then $t_n \to \infty$. From the hypothesis $a_n \ge b_n$, it follows that $s_n \ge t_n$. Thus, $s_n \to \infty$ as $n \to \infty$. □

When applying the comparison test, the convergence properties of the series $\sum b_n$ must be known. In many instances, a good choice is a geometric series (Theorem 8.8), a p-series (Theorem 8.13), a telescopic series (Theorem 8.11), or the series in Theorem 8.14.


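Example 8.18. Test the series $\sum_{n=1}^{\infty} (2n+1)/(3n^3+n^2+1)$ for convergence.

Solution: Since $a_n$ is a rational function of $n$, a convenient choice of a series in the comparison test is a p-series:

$$\frac{2n+1}{3n^3+n^2+1} < \frac{2n+1}{3n^3} = \frac{2}{3}\,\frac{1}{n^2} + \frac{1}{3}\,\frac{1}{n^3}.$$

The series

$$\sum_{n=1}^{\infty} \frac{2n+1}{3n^3} = \frac{2}{3}\sum_{n=1}^{\infty} \frac{1}{n^2} + \frac{1}{3}\sum_{n=1}^{\infty} \frac{1}{n^3}$$

converges as the sum of two convergent p-series. Hence, the series in question converges by the comparison test. □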


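Example 8.19. Test the series $\sum_{n=1}^{\infty} (\sqrt{n+1} - \sqrt{n})$ for convergence.

Solution: One has

$$a_n = \sqrt{n+1} - \sqrt{n} = \frac{(\sqrt{n+1} - \sqrt{n})(\sqrt{n+1} + \sqrt{n})}{\sqrt{n+1} + \sqrt{n}} = \frac{1}{\sqrt{n+1} + \sqrt{n}} \ge \frac{1}{\sqrt{2n} + \sqrt{n}} = \frac{1}{1+\sqrt{2}}\,\frac{1}{\sqrt{n}} = b_n.$$

The p-series $\sum 1/\sqrt{n}$ diverges and so does the series in question by the comparison test. □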

Theorem 8.16 (Limit Comparison Test). Suppose that $\sum a_n$ and $\sum b_n$ are series with positive terms. Let $c = \lim_{n\to\infty} (a_n/b_n)$.
• If $c = 0$ and $\sum b_n$ converges, then $\sum a_n$ converges.
• If $0 < c < \infty$, then $\sum a_n$ converges if and only if $\sum b_n$ converges.
• If $c = \infty$ and $\sum b_n$ diverges, then $\sum a_n$ diverges.

Proof. If $c = 0$, then there is an integer $N$ such that $a_n/b_n < 1$ for all $n > N$ by the definition of the limit. Hence, $a_n < b_n$ for all $n > N$. If $\sum b_n$ converges, then $\sum a_n$ converges by the comparison test. If $c \in (0,\infty)$, then, by the definition of the limit, for any number $\varepsilon$ with $0 < \varepsilon < c$, there is an integer $N$ such that

$$\left|c - \frac{a_n}{b_n}\right| < \varepsilon \iff m = c - \varepsilon < \frac{a_n}{b_n} < c + \varepsilon = M \quad \text{for all } n > N.$$

Therefore,
$$m b_n < a_n < M b_n \quad \text{for all } n > N.$$

By the comparison test, convergence of $\sum b_n$ implies convergence of $\sum a_n$ due to the inequality $a_n < Mb_n$. The divergence of $\sum b_n$ implies divergence of $\sum a_n$, again by the comparison test, as $mb_n < a_n$ with $m > 0$. If $c = \infty$, then, for any $M > 0$, there is an integer $N$ such that $a_n/b_n > M$ when $n > N$. The inequality $a_n > Mb_n$ shows that divergence of $\sum b_n$ implies divergence of $\sum a_n$ by the comparison test. □

It is often helpful to investigate the asymptotic behavior of $a_n$ as $n \to \infty$ to identify a suitable $b_n$ in the limit comparison test.


Example 8.20. Test the series $\sum_{n=1}^{\infty} (2n^{3/2} + n)/\sqrt{n^6 + n^4 + 1}$ for convergence.

Solution: Let us find the asymptotic behavior of $a_n$ as $n \to \infty$. For large $n$, the top of the ratio behaves as $\sim 2n^{3/2}$, while the bottom of the ratio behaves as $\sim (n^6)^{1/2} = n^3$. Therefore,

$$a_n = \frac{2n^{3/2} + n}{\sqrt{n^6 + n^4 + 1}} = \frac{2n^{3/2}\left(1 + \frac{1}{2\sqrt{n}}\right)}{n^3\sqrt{1 + \frac{1}{n^2} + \frac{1}{n^6}}} \sim \frac{2n^{3/2}}{n^3} = 2n^{-3/2} = b_n$$

in the asymptotic region $n \to \infty$. This shows that the ratio $a_n/b_n$ converges to $c = 1$ as $n \to \infty$. By the limit comparison test, the series $\sum a_n$ converges because the p-series $\sum b_n = 2\sum n^{-3/2}$ converges. □
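A quick numerical look at the ratio $a_n/b_n$ can help confirm that a candidate $b_n$ captures the asymptotic behavior before the limit is computed by hand. The sketch below is illustrative only; the sample values of $n$ are an arbitrary choice.

```python
import math

def a(n):
    # Terms of the series in Example 8.20
    return (2 * n ** 1.5 + n) / math.sqrt(n ** 6 + n ** 4 + 1)

def b(n):
    # Comparison terms b_n = 2 n^(-3/2)
    return 2 * n ** -1.5

# The ratio a_n / b_n should approach c = 1 as n grows
ratios = [a(n) / b(n) for n in (10, 100, 1000, 10 ** 4, 10 ** 6)]
for r in ratios:
    print(r)
```

The printed ratios decrease toward 1, consistent with the factor $1 + \frac{1}{2\sqrt{n}}$ in the numerator being the dominant correction.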


Example 8.21. Test the series $\sum_{n=1}^{\infty} (n^5 + 3^n)/(n^3 + 5^n)$ for convergence.

Solution: Recall that the power function increases more slowly than the exponential function, that is, $n^p q^{-n} \to 0$ as $n \to \infty$ for any $q > 1$ and any $p$. Hence, the asymptotic behavior of $a_n$ is

$$a_n = \frac{3^n(1 + n^5 3^{-n})}{5^n(1 + n^3 5^{-n})} \sim \frac{3^n}{5^n} = (3/5)^n = b_n.$$

This shows that the ratio $a_n/b_n$ converges to $c = 1$ as $n \to \infty$. By the limit comparison test, $\sum a_n$ converges because the geometric series $\sum b_n$ converges ($q = 3/5 < 1$). □

53.1. Estimating Sums. If a series $\sum a_n$ converges by comparison with a series $\sum b_n$, then the sum of $\sum a_n$ can be estimated by comparing remainders for the series $\sum b_n$. Indeed, put $\sum a_n = s$ and $\sum b_n = t$. Let $\{t_n\}$ and $\{s_n\}$ be the sequences of partial sums for $\sum b_n$ and $\sum a_n$, respectively. The remainders satisfy the inequality

$$s - s_n = a_{n+1} + a_{n+2} + \cdots \le b_{n+1} + b_{n+2} + \cdots = t - t_n.$$

So the accuracy of the approximation $s \approx s_n$ is the same as or higher than that of the approximation $t \approx t_n$. If, for example, one finds that $n = N$ is sufficient for the approximation $t \approx t_N$ to be correct to a specific number of decimal places, then $s \approx s_N$ is also correct to that or even a higher number of decimal places. The remainder is easy to estimate when $b_n = f(n)$, where the function $f$ is simple to integrate: $t - t_n \le \int_n^{\infty} f(x)\,dx$.

Example 8.22. Determine how many terms are needed to estimate the sum of the series $\sum_{n=1}^{\infty} \tan^{-1}(n^2)/(n^3+1)$ correct to five decimal places.

Solution: The function $\tan^{-1} x$ is monotonically increasing for $x > 0$, approaching asymptotically the value $\pi/2$. Therefore,

$$a_n = \frac{\tan^{-1}(n^2)}{n^3+1} \le \frac{\pi}{2}\,\frac{1}{n^3+1} \le \frac{\pi}{2}\,\frac{1}{n^3} = b_n.$$

Hence,

$$s - s_n \le t - t_n \le \frac{\pi}{2}\int_n^{\infty} \frac{dx}{x^3} = \frac{\pi}{4n^2} < 5\cdot 10^{-6} \;\Rightarrow\; n > \frac{\sqrt{\pi}}{2\sqrt{5}}\,10^3 \approx 396.$$

□
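The last step can be checked by a direct search for the smallest integer $n$ that satisfies the remainder bound. This is a sketch under the same bound $\pi/(4n^2)$ derived above; the function name `terms_for_accuracy` is hypothetical.

```python
import math

def terms_for_accuracy(decimals: int) -> int:
    # Smallest n with pi / (4 n^2) < 5 * 10^(-decimals - 1),
    # the remainder bound obtained in Example 8.22.
    tol = 5 * 10.0 ** (-decimals - 1)
    n = 1
    while math.pi / (4 * n * n) >= tol:
        n += 1
    return n

n_five = terms_for_accuracy(5)
print(n_five)
```

Because the threshold $\frac{\sqrt{\pi}}{2\sqrt{5}}10^3 \approx 396.3$ is not an integer, the search returns the next integer above it.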


53.2. Exercises.
In (1)–(12), determine whether the series converges or diverges.

(1) $\sum_{n=1}^{\infty} \frac{n}{n^{5/3} + n^{1/3} + 1}$
(2) $\sum_{n=2}^{\infty} \frac{n(\ln n)^4}{n^2+1}$
(3) $\sum_{n=2}^{\infty} \frac{\cos^2(n)}{n^2}$
(4) $\sum_{n=1}^{\infty} \frac{1 + (-1)^n}{n^{3/2}+1}$
(5) $\sum_{n=1}^{\infty} \frac{1}{\sqrt[3]{n^3+n+1}}$
(6) $\sum_{n=1}^{\infty} \frac{1+2^n}{n^2+2^n}$
(7) $\sum_{n=1}^{\infty} \frac{e^{1/\sqrt{n}}}{n+1}$
(8) $\sum_{n=1}^{\infty} \sin^2\left(\frac{1}{n}\right)$
(9) $\sum_{n=1}^{\infty} \frac{n!}{n^n}$
(10) $\sum_{n=1}^{\infty} \frac{1}{n^{1+1/n}}$
(11) $\sum_{n=1}^{\infty} (\sqrt[n]{n} - 1)^n$
(12) $\sum_{n=1}^{\infty} \frac{n^2}{n!}$

(13) How many terms does one need in the partial sum to estimate the sum of the series $\sum_{n=1}^{\infty} \sin^3 n/(n^3+n)$ up to five decimal places?
(14) Consider a sequence $\{a_n\}$, where $a_n$ can take any value from the set $\{0, 1, 2, ..., p-1\}$, where $p > 1$ is an integer. The meaning of the representation of a number $0.a_1a_2a_3...$ with base $p$ is that

(8.11) $$0.a_1a_2a_3... = \frac{a_1}{p} + \frac{a_2}{p^2} + \frac{a_3}{p^3} + \cdots.$$

When $p = 10$, the decimal system is obtained. The binary representation corresponds to $p = 2$. The Maya used $p = 20$ (the number of fingers and toes). The Babylonians used $p = 60$. Show that the series (8.11) always converges.
(15) Show that if $a_n > 0$ and $\sum a_n$ converges, then $\sum \ln(1 + a_n)$ converges, too.
(16) Prove that the convergence of $\sum a_n$, where $a_n > 0$, implies the convergence of $\sum \sqrt{a_n}/n$.
(17) If $\sum a_n$ converges and if the sequence $\{b_n\}$ is monotonic and bounded, prove that $\sum a_nb_n$ converges.

54. Alternating Series

Definition 8.6 (Alternating Series). Let $\{b_n\}$ be a sequence of nonnegative terms. The series
$$\sum (-1)^{n-1} b_n = b_1 - b_2 + b_3 - b_4 + b_5 - \cdots$$
is called an alternating series.

For example, the series

(8.12) $$1 - \frac{1}{2} + \frac{1}{3} - \frac{1}{4} + \frac{1}{5} - \cdots = \sum_{n=1}^{\infty} \frac{(-1)^{n-1}}{n}$$

is an alternating series. It is called the alternating harmonic series.

Theorem 8.17 (Alternating Series Test). If a sequence of positive terms $\{b_n\}$ is monotonically decreasing and $\lim_{n\to\infty} b_n = 0$, then the alternating series $\sum (-1)^{n-1} b_n$ converges:

(i) $b_{n+1} \le b_n$ for all $n$ and (ii) $\lim_{n\to\infty} b_n = 0$ $\Longrightarrow$ $\sum_{n=1}^{\infty} (-1)^{n-1} b_n$ converges.

Proof. The convergence of the sequence of partial sums $\{s_n\}$ is to be established. Consider the subsequence of even partial sums $\{s_{2k}\}$. One has $s_2 = b_1 - b_2 \ge 0$, $s_4 = s_2 + (b_3 - b_4) \ge s_2$, and, in general,

$$s_{2k} = s_{2(k-1)} + (b_{2k-1} - b_{2k}) \ge s_{2(k-1)} \ge s_{2(k-2)} \ge \cdots \ge s_2 \ge 0$$

by the monotonicity of the sequence $\{b_n\}$. Thus, the subsequence $\{s_{2k}\}$ is monotonically increasing. By regrouping the terms in a different way, one can see that

$$s_{2k} = b_1 - (b_2 - b_3) - (b_4 - b_5) - \cdots - (b_{2k-2} - b_{2k-1}) - b_{2k} \le b_1$$

because all numbers in parentheses are nonnegative by hypothesis (i), which shows that $\{s_{2k}\}$ is also bounded. Therefore, it converges by the monotonic sequence theorem:

$$\lim_{k\to\infty} s_{2k} = s.$$

For the subsequence of odd partial sums $s_{2k+1} = s_{2k} + b_{2k+1}$, one infers by the limit laws and hypothesis (ii) that

$$\lim_{k\to\infty} s_{2k+1} = \lim_{k\to\infty} s_{2k} + \lim_{k\to\infty} b_{2k+1} = s + 0 = s.$$

The convergence of two particular subsequences of a sequence to the same number $s$ does not generally guarantee that the sequence converges to $s$ (all its subsequences should converge to $s$). Here, however, the even and odd partial sums together exhaust the sequence. By definition, the limits of $\{s_{2k}\}$ and $\{s_{2k+1}\}$ mean that, given any number $\varepsilon > 0$, there are positive integers $N_1$ and $N_2$ such that $|s_{2k} - s| < \varepsilon$ if $k > N_1$ and $|s_{2k+1} - s| < \varepsilon$ if $k > N_2$. Put $N = \max(2N_1, 2N_2 + 1)$. Then $|s_n - s| < \varepsilon$ for all $n > N$, which means that $s_n \to s$ as $n \to \infty$. □

By this test, the alternating harmonic series (8.12) converges because the sequence $b_n = 1/n$ is monotonically decreasing and converges to 0.

Figure 8.8. Alternating series test. An illustration of its proof where two subsequences, $s_{2k}$ and $s_{2k-1}$, of the sequence $s_n$ of partial sums are analyzed for convergence.


Example 8.23. Test the series $\sum_{n=1}^{\infty} \sin(\pi n/2)/n$ for convergence.

Solution: One has $\sin(\pi n/2) = 1, 0, -1, 0, 1, ...$ for $n = 1, 2, 3, 4, 5, ...$, respectively, or, in general, for odd $n = 2k-1$, $\sin(\pi n/2) = (-1)^{k-1}$, while for even $n = 2k$, $\sin(\pi n/2) = \sin(\pi k) = 0$. Thus, the series in question is an alternating series:

$$\sum_{n=1}^{\infty} \frac{\sin(\pi n/2)}{n} = \sum_{n=1}^{\infty} \frac{(-1)^{n-1}}{2n-1} = \sum_{n=1}^{\infty} (-1)^{n-1} b_n, \qquad b_n = \frac{1}{2n-1}.$$

The sequence $\{b_n\}$ is monotonically decreasing and $b_n \to 0$ as $n \to \infty$. So the series converges by the alternating series test. □

Remark. It should be noted that Theorem 8.17 provides only a sufficient condition for an alternating series to converge. So there are convergent alternating series that do not satisfy the hypotheses of Theorem 8.17. For example, the alternating series with $b_n = \sin^2(\pi n/q)/n^2$, where $q$ is an integer, is convergent by the comparison test because $|(-1)^{n+1} b_n| = b_n \le 1/n^2$ and the p-series $\sum 1/n^2$ converges (see the next section on absolutely convergent series). However, the sequence $\{b_n\}$ is not monotonically decreasing because $b_n \ge 0$ and it has a zero subsequence $b_{kq} = 0$, $k = 1, 2, ...$. So $b_n$ oscillates between the zero sequence and the sequence $1/n^2$.

Remark. Hypothesis (i) of Theorem 8.17 may be weakened to

(i) $b_{n+1} \le b_n$ for all $n \ge N$

for some integer $N \ge 1$. Indeed,

$$\sum_{n=1}^{\infty} (-1)^{n-1} b_n = b_1 - b_2 + b_3 - \cdots + (-1)^{N-2} b_{N-1} + \sum_{n=N}^{\infty} (-1)^{n-1} b_n$$
$$= b_1 - b_2 + b_3 - \cdots + (-1)^{N-2} b_{N-1} + (-1)^{N-1} \sum_{n=1}^{\infty} (-1)^{n-1} c_n,$$

where $c_n = b_{n+N-1}$. The series $\sum (-1)^{n-1} b_n$ and $\pm\sum (-1)^{n-1} c_n$ differ by a number, and therefore the convergence of one of them implies the convergence of the other. The series $\sum (-1)^{n-1} c_n$ converges by Theorem 8.17 as $c_{n+1} \le c_n$ for all $n$ and $\lim_{n\to\infty} c_n = \lim_{n\to\infty} b_{n+N-1} = 0$.

Example 8.24. Test the series $\sum_{n=1}^{\infty} (-1)^{n-1} n^p/(n+1)$ for convergence if $p < 1$.

Solution: Here $b_n = n^p/(n+1)$ and, for $p < 1$,

$$\lim_{n\to\infty} b_n = \lim_{n\to\infty} \frac{n^p}{n+1} = \lim_{n\to\infty} \frac{n^{p-1}}{1 + \frac{1}{n}} = \lim_{n\to\infty} n^{p-1} = 0.$$

So hypothesis (ii) of Theorem 8.17 is fulfilled. However, the monotonicity of $\{b_n\}$ is not obvious. To investigate it, consider the function $f(x) = x^p/(x+1)$, where $x \ge 1$. If $f(x)$ monotonically decreases, then so does the sequence $b_n = f(n)$. The condition $f'(x) \le 0$ has to be verified:

$$f'(x) = \frac{px^{p-1}(x+1) - x^p}{(x+1)^2} \le 0 \iff (p-1)x^p + px^{p-1} \le 0 \iff p \le x(1-p).$$

If $p \le 0$, this is true as $x \ge 1$. If $0 < p < 1$, then $f(x)$ monotonically decreases for $x \ge p/(1-p)$ and one can always find an integer $N \ge p/(1-p)$ such that $b_{n+1} < b_n$ for all $n > N$. So the series converges for all $p < 1$. □

54.1. Estimating Sums of Alternating Series. A partial sum $s_n$ of any convergent alternating series can be used as an approximation of the total sum $s$, but this is not of much use unless the accuracy of the approximation is assessed. The following theorem asserts that the absolute error of the approximation $s \approx s_n$ does not exceed the value of $b_{n+1}$.

Theorem 8.18 (Alternating Series Sum Estimation). If $s = \sum (-1)^{n-1} b_n$ is the sum of an alternating series that satisfies

(i) $0 \le b_{n+1} \le b_n$ for all $n$ and (ii) $\lim_{n\to\infty} b_n = 0$,

then $|s - s_n| \le b_{n+1}$.

Proof. In the proof of the alternating series test, it was found that the subsequence $\{s_{2k}\}$ approaches the limit value $s$ from below, $s_{2k} \le s$. On the other hand, the subsequence $\{s_{2k-1}\}$ approaches the limit value $s$ from above. Indeed, $s_1 = b_1$, $s_3 = s_1 - b_2 + b_3 \le s_1$ because $b_3 \le b_2$, and, in general, $s_{2k+1} = s_{2k-1} - b_{2k} + b_{2k+1} \le s_{2k-1}$; that is, $\{s_{2k+1}\}$ is monotonically decreasing. This shows that the sequence of partial sums $s_n$ oscillates around $s$ so that the sum $s$ always lies between any two consecutive partial sums $s_n$ and $s_{n+1}$, as depicted in Figure 8.8. Hence,

$$|s - s_n| \le |s_{n+1} - s_n| = b_{n+1}.$$

□

Example 8.25. Estimate the number of terms in a partial sum $s_n$ needed to approximate the sum of the alternating harmonic series correct to $N$ decimal places.

Solution: Here, $b_n = 1/n$. Hence, the approximation $s \approx s_n$ is correct to $N$ decimal places if the absolute error does not exceed $5\cdot 10^{-N-1}$: $|s - s_n| \le b_{n+1} < 5\cdot 10^{-N-1}$ or $1/(n+1) < 5\cdot 10^{-N-1}$ or $n > 0.2\cdot 10^{N+1} - 1 = 2\cdot 10^N - 1$.

□
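The error bound of Theorem 8.18 is easy to test on the alternating harmonic series, whose sum is $\ln 2$ (see Exercise 54.2.18). The following is a minimal sketch with hypothetical helper names; it compares the actual error with $b_{n+1} = 1/(n+1)$ and evaluates the term count from the example above.

```python
import math

def partial_sum(n: int) -> float:
    # s_n of the alternating harmonic series 1 - 1/2 + 1/3 - ...
    return sum((-1) ** (k - 1) / k for k in range(1, n + 1))

def terms_needed(N: int) -> int:
    # Smallest integer n with 1/(n+1) < 5 * 10^(-N-1), i.e. n > 2 * 10^N - 1
    return 2 * 10 ** N

s = math.log(2)
for n in (10, 100, 1000):
    error = abs(s - partial_sum(n))
    print(n, error, 1.0 / (n + 1))  # actual error vs. the bound b_{n+1}
```

The observed error is roughly half of the bound, which illustrates that the estimate $|s - s_n| \le b_{n+1}$ is safe but not sharp, and that this series converges very slowly.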

Remark. If the monotonicity condition $b_{n+1} \le b_n$ holds only if $n \ge N$, the conclusion of Theorem 8.18 also holds only if $n \ge N$. Indeed, in the notation of the remark on weakening hypothesis (i) of Theorem 8.17, put $t = \sum (-1)^{n-1} c_n$, where $c_n = b_{n+N-1}$. Let $t_n$ be a partial sum for the series $\sum (-1)^{n-1} c_n$. Then $s = s_{N-1} + (-1)^{N-1} t$ and $s_n = s_{N-1} + (-1)^{N-1} t_{n-N+1}$ for $n \ge N$. Therefore,

$$|s - s_n| = |t - t_{n-N+1}| \le c_{n-N+2} = b_{n+1} \quad \text{for all } n \ge N.$$

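54.2. Exercises.
In (1)–(15), determine whether the series converges or diverges (here $p$ is real).

(1) $\sum_{n=1}^{\infty} \frac{(-1)^n}{\ln(n+3)}$
(2) $\sum_{n=1}^{\infty} \frac{(-1)^n n}{\sqrt{n^3+1}}$
(3) $\sum_{n=1}^{\infty} \frac{\cos(n\pi/2)}{n^{4/5}}$
(4) $\sum_{n=1}^{\infty} \frac{(-1)^n n}{(n^{3/2}+1)^{2/3}}$
(5) $\sum_{n=2}^{\infty} \frac{(-1)^n n}{(\ln n)^p}$
(6) $\sum_{n=1}^{\infty} \frac{(-1)^n n^3}{2^n}$
(7) $\sum_{n=2}^{\infty} \frac{(-1)^n (\ln n)^p}{n}$
(8) $\sum_{n=1}^{\infty} (-1)^n \sin\left(\frac{\pi}{n}\right)$
(9) $\sum_{n=1}^{\infty} \frac{(-1)^n n^2}{n^4+1}$
(10) $\sum_{n=1}^{\infty} \frac{(-1)^n}{n^{1+1/n} - n}$
(11) $\sum_{n=1}^{\infty} (-1)^n (\sqrt[n]{n} - 1)^n$
(12) $\sum_{n=1}^{\infty} (-1)^n \frac{n^n}{n!}$
(13) $\sum_{n=1}^{\infty} \frac{(-1)^n}{n+p}$
(14) $\sum_{n=1}^{\infty} \frac{(-1)^n}{n^p}$
(15) $\sum_{n=1}^{\infty} \frac{(-1)^n (n^2+n+1)}{(2n+3)^2}$

In (16) and (17), find $n$ for which the approximation by partial sums $s \approx s_n$ is correct to $N$ decimal places for the series.

(16) $\sum_{n=1}^{\infty} \frac{(-1)^n n}{10^n}$
(17) $\sum_{n=1}^{\infty} \frac{(-1)^n n^{1/3}}{n^{1/3}+6}$

(18) Prove that the sum of the alternating harmonic series is
$$\sum_{n=1}^{\infty} \frac{(-1)^{n-1}}{n} = \ln 2.$$
Hint: Show that a partial sum of the alternating harmonic series is $s_{2n} = h_{2n} - h_n$, where $h_n = a_n + \ln n$ and the sequence $\{a_n\}$ is defined in Exercise 52.4.16. Then use the result of the latter exercise to prove that $s_n \to \ln 2$ as $n \to \infty$.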

55. Ratio and Root Tests

55.1. Absolutely Convergent Series.

Definition 8.7 (Absolute Convergence). A series $\sum a_n$ is called absolutely convergent if the series of absolute values $\sum |a_n|$ is convergent.

Absolute convergence is stronger than convergence, meaning that there are convergent series that do not converge absolutely. For example, the alternating harmonic series $\sum a_n$, $a_n = (-1)^{n-1}/n$, is convergent, but not absolutely convergent because the series of absolute values $|a_n| = 1/n$ is nothing but the harmonic series $\sum 1/n$, which is divergent (as a p-series with $p = 1$). On the other hand, absolute convergence implies convergence.


Theorem 8.19 (Convergence and Absolute Convergence). Every absolutely convergent series is convergent.

Proof. For any sequence $\{a_n\}$, the following inequality holds:

$$0 \le a_n + |a_n| \le 2|a_n|$$

because $|a_n|$ is either $a_n$ or $-a_n$. It shows that the series $\sum b_n$, where $b_n = a_n + |a_n|$, converges by the comparison test because $\sum 2|a_n| = 2\sum |a_n|$ converges if $\sum a_n$ converges absolutely. Hence, the series $\sum a_n = \sum b_n - \sum |a_n|$ converges as the difference of two convergent series. □


Example 8.26. Test the series $\sum_{n=1}^{\infty} [\sin n - 2\cos(2n)]/n^{3/2}$ for absolute convergence.

Solution: Making use of the inequality $|A + B| \le |A| + |B|$ and the properties that $|\sin x| \le 1$ and $|\cos x| \le 1$, one infers

$$|a_n| = \frac{|\sin n - 2\cos(2n)|}{n^{3/2}} \le \frac{|\sin n| + 2|\cos(2n)|}{n^{3/2}} \le \frac{3}{n^{3/2}}.$$

The series of absolute values $\sum |a_n|$ converges by comparison with the convergent p-series $3\sum n^{-3/2}$ (here $p = 3/2 > 1$). So the series in question converges absolutely. □

Definition 8.8 (Conditional Convergence). A series $\sum a_n$ is called conditionally convergent if it is convergent but not absolutely convergent.

Thus, all convergent series are separated into two classes: conditionally convergent and absolutely convergent series. The key difference between properties of absolutely convergent and conditionally convergent series is studied in the next section.

55.2. Ratio Test.

Theorem 8.20 (Ratio Test). Given a series $\sum a_n$, suppose the following limit exists:

$$\lim_{n\to\infty} \left|\frac{a_{n+1}}{a_n}\right| = c,$$

where $c \ge 0$ or $c = \infty$.
• If $c < 1$, then $\sum a_n$ converges absolutely.
• If $c > 1$, then $\sum a_n$ diverges.
• If $c = 1$, then the test gives no information.


Proof. If $c < 1$, then the existence of the limit means that, for any $\varepsilon > 0$, there is an integer $N$ such that

$$-\varepsilon < \left|\frac{a_{n+1}}{a_n}\right| - c < \varepsilon \;\Longrightarrow\; \left|\frac{a_{n+1}}{a_n}\right| < c + \varepsilon = q < 1 \quad \text{for all } n \ge N.$$

Note that since $c$ is strictly less than 1, one can always take $\varepsilon > 0$ small enough so that the number $q = c + \varepsilon < 1$. In particular, put $n = N + k - 1$, where $k \ge 1$. Applying the inequality $|a_{n+1}| < q|a_n|$ consecutively $k$ times,

$$|a_{N+k}| < q|a_{N+k-1}| < q^2|a_{N+k-2}| < \cdots < q^k|a_N| = |a_N| q^{-N} q^{N+k}.$$

This shows that

(8.13) $$|a_n| < \beta q^n, \quad \beta = |a_N| q^{-N}, \quad \text{for all } n > N.$$

The series $\sum |a_n|$ converges by comparison with the convergent geometric series $\sum \beta q^n = \beta \sum q^n$ because $q < 1$. So $\sum a_n$ converges absolutely. If $c > 1$, then there is an integer $N$ such that $|a_{n+1}|/|a_n| > 1$ or $|a_{n+1}| > |a_n| \ge 0$ for all $n \ge N$. Hence, the necessary condition for a series to converge, $a_n \to 0$ as $n \to \infty$, does not hold; that is, the series $\sum a_n$ diverges. If $c = 1$, it is sufficient to give examples of a convergent and a divergent series for which $c = 1$. Consider a p-series $\sum n^{-p}$. One has

$$c = \lim_{n\to\infty} \left|\frac{a_{n+1}}{a_n}\right| = \lim_{n\to\infty} \frac{n^p}{(n+1)^p} = \lim_{n\to\infty} \frac{1}{(1 + 1/n)^p} = 1$$

for any $p$. But a p-series converges if $p > 1$ and diverges otherwise. □

Example 8.27. Find all values of $p$ and $q$ for which the series $\sum_{n=1}^{\infty} n^p q^n$ converges absolutely.

Solution: Here $a_n = n^p q^n$. One has

$$c = \lim_{n\to\infty} \left|\frac{a_{n+1}}{a_n}\right| = \lim_{n\to\infty} \frac{(n+1)^p}{n^p}\,\frac{|q|^{n+1}}{|q|^n} = |q| \lim_{n\to\infty} (1 + 1/n)^p = |q|.$$

So, for $|q| < 1$ and any $p$, the series converges absolutely by the ratio test, and for $|q| > 1$ it diverges. If $q = \pm 1$, the ratio test is inconclusive, and these cases have to be studied by different means. If $|q| = 1$, then $\sum |a_n| = \sum n^p = \sum 1/n^{-p}$, which is a p-series that converges if $-p > 1$ or $p < -1$. Thus, the series converges absolutely for all $p$ if $|q| < 1$ and for $p < -1$ if $q = \pm 1$. Note that, for $-1 \le p < 0$ and $q = -1$, the series conditionally converges (i.e., it is convergent but not absolutely convergent). In this case, it is a convergent alternating p-series $\sum (-1)^n/n^{-p}$ (see Exercise 54.2.14). □


55.3. Root Test.

Theorem 8.21 (Root Test). Given a series $\sum a_n$, suppose the following limit exists:

$$\lim_{n\to\infty} \sqrt[n]{|a_n|} = c,$$

where $c \ge 0$ or $c = \infty$.
• If $c < 1$, then $\sum a_n$ converges absolutely.
• If $c > 1$, then $\sum a_n$ diverges.
• If $c = 1$, then the test gives no information.

Proof. If $c < 1$, then, as in the proof of the ratio test, the existence of the limit means that, for any $c < q < 1$, there is an integer $N$ such that

$$\sqrt[n]{|a_n|} < q \;\Longrightarrow\; |a_n| < q^n \quad \text{for all } n \ge N.$$

This shows that the series $\sum |a_n|$ converges by comparison with the convergent geometric series $\sum q^n$, $0 < q < 1$. So $\sum a_n$ converges absolutely. If $c > 1$, then there exists an integer $N$ such that $\sqrt[n]{|a_n|} > 1$ for all $n \ge N$, and hence the condition $a_n \to 0$ as $n \to \infty$ does not hold. The series $\sum a_n$ diverges. If $c = 1$, consider a p-series: $\sqrt[n]{n^{-p}} = (\sqrt[n]{n})^{-p} \to 1^{-p} = 1$ by Theorem 8.6. But a p-series converges if $p > 1$ and diverges if $p \le 1$. So the root test is inconclusive. □

Example 8.28. Test the convergence of the series $\sum a_n$, where $a_n = [(2n^2+5)/(3n^2+2)]^n$.

Solution: Here $|a_n| = a_n$, and the absolute convergence is equivalent to the convergence. One has

$$\lim_{n\to\infty} \sqrt[n]{|a_n|} = \lim_{n\to\infty} \frac{2n^2+5}{3n^2+2} = \lim_{n\to\infty} \frac{2 + 5/n^2}{3 + 2/n^2} = \frac{2}{3} < 1.$$

So the series converges. □

55.4. Oscillatory Behavior of Sequences in the Root and Ratio Tests. Consider a sequence defined recursively by $a_1 = 1$ and $a_{n+1} = \frac{1}{2}(\sin n)a_n$. An attempt to test the convergence of $\sum a_n$ by the ratio test leads to the sequence $c_n = |a_{n+1}|/|a_n| = \frac{1}{2}|\sin n|$, which does not converge as it oscillates between 0 and 1/2. Similarly, the sequence used in the root test may also exhibit oscillatory behavior and be nonconvergent, for example, $a_n = (\frac{1}{2}\sin n)^n$ so that $c_n = \sqrt[n]{|a_n|} = \frac{1}{2}|\sin n|$. The ratio and root tests, as stated in Theorems 8.20 and 8.21, assume the existence of the limit $c_n \to c$. What can be said about the convergence of a series when this limit does not exist?

To answer this question, recall that, in the proof of the ratio or root test, the existence of $\lim_{n\to\infty} c_n = c < 1$ has been used only to establish the boundedness of the sequence, $c_n \le q < 1$ for all $n \ge N$, which is sufficient for the series $\sum a_n$ to converge. But the boundedness of $\{c_n\}$ does not imply its convergence! Evidently, the boundedness condition holds in the above examples, $c_n = \frac{1}{2}|\sin n| \le \frac{1}{2} < 1$ for all $n$. Similarly, the existence of the limit value $c > 1$ has only been used to show that $\sqrt[n]{|a_n|} \ge 1$ or $|a_n| \ge 1$ for infinitely many $n$ to conclude that the sequence $\{a_n\}$ cannot converge to 0 and hence $\sum a_n$ diverges. If $|a_{n+1}|/|a_n| \ge 1$ for all $n \ge N$, then again $\{a_n\}$ cannot converge to 0 (by the proof of the ratio test). Thus, the convergence of $\{c_n\}$ in the root or ratio test is not really necessary.

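Theorem 8.22 (Ratio and Root Tests Refined). Given a series $\sum_n a_n$, put $c_n = |a_{n+1}|/|a_n|$ or $c_n = \sqrt[n]{|a_n|}$. Then

$$c_n \le q < 1 \text{ for all } n \ge N \;\Longrightarrow\; \sum a_n \text{ converges},$$
$$\left\{\begin{array}{l} \sqrt[n]{|a_n|} \ge 1 \text{ for infinitely many } n \\[1mm] \dfrac{|a_{n+1}|}{|a_n|} \ge 1 \text{ for all } n \ge N \end{array}\right. \;\Longrightarrow\; \sum a_n \text{ diverges}$$

for some integer $N$.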

55.5. Wider Scope of the Root Test. If the limit of $|a_{n+1}|/|a_n|$ exists, then so does the limit of $\sqrt[n]{|a_n|}$ and

(8.14) $$\lim_{n\to\infty} \sqrt[n]{|a_n|} = \lim_{n\to\infty} \frac{|a_{n+1}|}{|a_n|}.$$

The converse is not true; that is, the existence of the limit of $\sqrt[n]{|a_n|}$ does not generally imply the existence of the limit of $|a_{n+1}|/|a_n|$ (the latter may or may not exist). Furthermore, if the sequence $\sqrt[n]{|a_n|}$ does not converge, neither does $|a_{n+1}|/|a_n|$. A proof of these assertions is given in more advanced calculus courses. Thus, the ratio test has the same predicting power as the root test only if $|a_{n+1}|/|a_n|$ converges.

In general, the root test (as in Theorem 8.22) has wider scope, meaning that whenever the ratio test shows convergence, the root test does, too, and whenever the root test is inconclusive, the ratio test is, too. The subtlety to note here is that the converse of the latter statement is not generally true; that is, the inconclusiveness of the ratio test does not imply the inconclusiveness of the root test. The assertion can be illustrated with the following example. Consider a convergent series obtained from the sum of two geometric series in which the order of summation is changed:

$$\sum_{n=1}^{\infty} a_n = \frac{1}{2} + \frac{1}{3} + \frac{1}{2^2} + \frac{1}{3^2} + \frac{1}{2^3} + \frac{1}{3^3} + \cdots = \frac{1}{2}\sum_{k=0}^{\infty} \frac{1}{2^k} + \frac{1}{3}\sum_{k=0}^{\infty} \frac{1}{3^k} = \frac{3}{2},$$

where the sum of a geometric series has been used (Theorem 8.8). Now note that if $n = 2k$ is even, then $a_{2k} = (1/3)^k$, and $a_{2k-1} = (1/2)^k$ if $n = 2k-1$ is odd. Take the subsequence of ratios for odd $n = 2k-1$, $c_{2k-1} = a_{2k}/a_{2k-1} = (2/3)^k$. It converges to 0 as $k \to \infty$. On the other hand, the subsequence of ratios for even $n = 2k$ diverges: $c_{2k} = a_{2k+1}/a_{2k} = \frac{1}{2}(3/2)^k \to \infty$ as $k \to \infty$. So the limit of $c_n$ does not exist; moreover, the ratio test (as in Theorem 8.22) fails miserably because $c_n$ is not even bounded. The series converges by the root test. Indeed, $c_{2k} = \sqrt[2k]{a_{2k}} = 1/\sqrt{3} < 1$ and $c_{2k-1} = \sqrt[2k-1]{a_{2k-1}} = 2^{-k/(2k-1)} < 1/\sqrt{2} < 1$ because $k/(2k-1) > 1/2$. Although the sequence $c_n$ does not converge (its even-indexed terms equal $1/\sqrt{3}$, while its odd-indexed terms approach $1/\sqrt{2}$), it is bounded, $c_n \le q = 1/\sqrt{2} < 1$ for all $n$, and hence the series converges by Theorem 8.22. A similar example is given in Exercise 55.7.19. Thus, the ratio test is sensitive to the order of summation, while this is not so for the root test.
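The behavior of the two test sequences for this interleaved series can be tabulated directly. The sketch below is illustrative; exact rational arithmetic (`fractions.Fraction`) is an implementation choice made here so that the ratios are computed without rounding.

```python
from fractions import Fraction

def a(n: int) -> Fraction:
    # Interleaved series 1/2 + 1/3 + 1/2^2 + 1/3^2 + ...
    # a_{2k-1} = (1/2)^k and a_{2k} = (1/3)^k
    k = (n + 1) // 2
    return Fraction(1, 2) ** k if n % 2 == 1 else Fraction(1, 3) ** k

# Ratio-test sequence c_n = a_{n+1} / a_n (unbounded along even n)
ratios = [a(n + 1) / a(n) for n in range(1, 21)]
# Root-test sequence c_n = a_n^(1/n) (stays below 1/sqrt(2))
roots = [float(a(n)) ** (1.0 / n) for n in range(1, 60)]
# Partial sum approaches 3/2
total = sum(a(n) for n in range(1, 61))

print([float(r) for r in ratios[:6]])
print(max(roots), float(total))
```

The ratio sequence alternates between values tending to 0 and values growing without bound, while the root sequence stays within a fixed interval below 1, in line with the discussion above.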

55.6. When the Ratio Test Is Inconclusive.

Theorem 8.23 (De Morgan's Ratio Test). Let $\sum a_n$ be a series in which $|a_{n+1}|/|a_n| \to 1$ as $n \to \infty$. The series converges absolutely if

$$\lim_{n\to\infty} n\left(\left|\frac{a_{n+1}}{a_n}\right| - 1\right) = b < -1.$$

The proof of this theorem is left to the reader as an exercise (see Exercise 55.7.18). Consider the asymptotic behavior of the ratio $c_n = |a_{n+1}|/|a_n|$ as $n \to \infty$. The theorem asserts that if $c_n$ behaves as $c_n \sim 1 + b/n$ for large $n$ (i.e., neglecting terms of order $1/n^p$ where $p > 1$), then the series $\sum a_n$ converges if $b < -1$.

For a p-series, the ratio test is inconclusive (see the proof of the ratio test). However, De Morgan's test resolves the inconclusiveness. Indeed, for large $n$,

$$c_n = \frac{n^p}{(n+1)^p} = \frac{1}{(1 + 1/n)^p} \sim 1 - \frac{p}{n},$$

where the asymptotic behavior has been found from the linearization $f(x) = (1+x)^{-p} \sim f(0) + f'(0)x = 1 - px$ for small $x = 1/n$. So $b = -p$ and the series converges if $b < -1$ or $p > 1$.

This illustrates a basic technical trick for applying De Morgan's test. Suppose that there is a function $f(x)$ such that $|a_{n+1}|/|a_n| = f(1/n)$. If $f$ is differentiable at $x = 0$, that is, $f(x) \approx f(0) + f'(0)x$ as $x \to 0$, then

$$|a_{n+1}|/|a_n| = f(1/n) \sim f(0) + f'(0)(1/n) = 1 + f'(0)/n,$$

and the series $\sum a_n$ converges absolutely if $f'(0) < -1$. Note that the property $f(0) = 1$ follows from the inconclusiveness of the ratio test.
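The quantity $n(c_n - 1)$ in De Morgan's test can be estimated numerically when the limit is not obvious. The sketch below uses the p-series as a test case; the helper names `de_morgan_value` and `p_series_ratio` are hypothetical, and evaluating at one large $n$ is only a heuristic, not a proof that the limit exists.

```python
def de_morgan_value(ratio, n: int) -> float:
    # n * (|a_{n+1}/a_n| - 1), the expression whose limit is b
    return n * (ratio(n) - 1.0)

def p_series_ratio(p: float):
    # Ratio |a_{n+1}/a_n| = (n / (n + 1))^p for the p-series
    return lambda n: (n / (n + 1.0)) ** p

n = 10 ** 6
b_for_p2 = de_morgan_value(p_series_ratio(2.0), n)  # expected near -2
b_for_p1 = de_morgan_value(p_series_ratio(1.0), n)  # expected near -1
print(b_for_p2, b_for_p1)
```

For $p = 2$ the value is close to $-2 < -1$, so the test gives convergence, while for $p = 1$ it approaches the borderline value $-1$, where the theorem as stated draws no conclusion (the harmonic series in fact diverges).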

55.7. Exercises.
In (1)–(15), determine whether the series is absolutely convergent, conditionally convergent, or divergent (here $p$ is real).

(1) $\sum_{n=1}^{\infty} \frac{(-1)^n}{\sqrt[3]{n}}$
(2) $\sum_{n=1}^{\infty} n^2\left(\frac{2}{3}\right)^n$
(3) $\sum_{n=1}^{\infty} \frac{n!}{p^n}$, $p \ne 0$
(4) $\sum_{n=1}^{\infty} \frac{p^n}{n!}$
(5) $\sum_{n=2}^{\infty} \frac{(-1)^n 2^{1/n}}{n^p}$
(6) $\sum_{n=1}^{\infty} \frac{(-1)^n n^{p+3}}{n!}$
(7) $\sum_{n=2}^{\infty} \frac{(-1)^n (\ln n)^p}{n}$
(8) $\sum_{n=1}^{\infty} \frac{(-1)^n n}{\sqrt[4]{n^3+1}}$
(9) $\sum_{n=1}^{\infty} \frac{n!}{n^n}$
(10) $\sum_{n=1}^{\infty} \frac{(-1)^n}{(\ln n)^p}$
(11) $\sum_{n=1}^{\infty} \left(\frac{2n^3+n}{3n^3+5}\right)^n$
(12) $\sum_{n=1}^{\infty} \left(1 + \frac{1}{n}\right)^{n^2}$
(13) $\sum_{n=1}^{\infty} \frac{n^p}{(\ln n)^n}$
(14) $\sum_{n=1}^{\infty} \frac{2\cdot 4 \cdots (2n)}{n!}$
(15) $\sum_{n=1}^{\infty} \frac{p^n n!}{5\cdot 8 \cdots (3n+2)}$

(16) For which integers $p > 0$ is the series $\sum_{n=1}^{\infty} (n!)^2/(pn)!$ convergent?
(17) (Estimating sums). Given a series $\sum a_n$ with positive terms, put $c_n = a_{n+1}/a_n$. Suppose that $c_n \to c < 1$, that is, the series converges, $\sum a_n = s$. Let $s_n$ be a partial sum. Prove that
$$s - s_n \le \frac{a_{n+1}}{1 - c_{n+1}}$$
if $\{c_n\}$ is a decreasing sequence, and
$$s - s_n \le \frac{a_{n+1}}{1 - c}$$
if $\{c_n\}$ is an increasing sequence.
Hint: Use the geometric series as in the proof of the ratio test to estimate the remainder $s - s_n = a_{n+1} + a_{n+2} + \cdots$.
(18) Prove De Morgan's ratio test.
Hints: Compare the series $\sum |a_n|$ with the convergent p-series $\sum b_n$, where $b_n = A/n^p$ and $p = (1-b)/2 > 1$ if $b < -1$. Show that $n(b_{n+1}/b_n - 1) \to -p$ as $n \to \infty$, where $-p > b$. Next, show that, by choosing the constant $A$, one can always make $|a_n| < b_n$ for all $n$.
(19) Consider a geometric series with $q = 1/2$ in which the order of terms is changed by swapping terms in each consecutive pair:
$$a_1 + a_2 + a_3 + \cdots = \frac{1}{2} + 1 + \frac{1}{8} + \frac{1}{4} + \frac{1}{32} + \frac{1}{16} + \frac{1}{128} + \frac{1}{64} + \cdots.$$
Test the convergence of this series using the root and ratio tests.

56. Rearrangements

Here the difference between conditionally convergent and absolutelyconvergent series is further refined through the concept of rearrange-ment.

Definition 8.9 (Rearrangement). Let $\{k_n\}$, $n = 1, 2, \ldots$, be a sequence of positive integers in which every positive integer appears exactly once (i.e., $k_n = k_{n'}$ if and only if $n = n'$). Given a series $\sum a_n$, put $a'_n = a_{k_n}$. The series $\sum a'_n$ is called a rearrangement of $\sum a_n$.

For a finite sum, a rearrangement of its terms does not change the value of the sum. This is not generally so for convergent series.

Consider an alternating p-series:

$$\sum_{n=1}^{\infty} a_n = \sum_{n=1}^{\infty} \frac{(-1)^{n-1}}{n} = 1 - \frac12 + \frac13 - \frac14 + \frac15 - \frac16 + \cdots . \tag{8.15}$$

The series is convergent but not absolutely convergent (its sum is $s = \ln 2$; see Exercise 54.2.18). One of its rearrangements reads

$$\sum_{n=1}^{\infty} a'_n = 1 + \frac13 - \frac12 + \frac15 + \frac17 - \frac14 + \frac19 + \frac1{11} - \frac16 + \cdots \tag{8.16}$$

in which two positive terms are always followed by one negative. Let $s_n$ and $s'_n$ be partial sums of (8.15) and (8.16), respectively. Put $h_n = 1 + 1/2 + \cdots + 1/n$ (a partial sum of the harmonic series). Then $s_{2n} = h_{2n} - h_n$. Furthermore,

$$\begin{aligned}
s'_{3n} &= 1 + \frac13 + \frac15 + \frac17 + \cdots + \frac1{4n-1} - \frac12 - \frac14 - \cdots - \frac1{2n} \\
&= h_{4n} - \frac12 - \frac14 - \cdots - \frac1{4n} - \frac12 h_n = h_{4n} - \frac12 h_{2n} - \frac12 h_n \\
&= (h_{4n} - h_{2n}) + \frac12 (h_{2n} - h_n) = s_{4n} + \frac12 s_{2n}.
\end{aligned}$$


Taking the limit $n \to \infty$ in this equality, one finds $s' = s + s/2 = 3s/2$, where $s$ and $s'$ are the sums of (8.15) and (8.16), respectively. Thus, a rearrangement of the series has changed its sum! This fact is not specific to the example considered but inherent in all conditionally convergent series. Terms of a conditionally convergent series occur with different signs (positive and negative). By regrouping positive and negative terms, it will be proved that the sum of a conditionally convergent series can be made any number or $\pm\infty$. The analysis begins by studying the properties of sums of positive and negative terms of a conditionally convergent series.
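The change of the sum under this rearrangement can be observed numerically. The following is a minimal Python sketch, not part of the original text; the function names are illustrative. It sums (8.15) in its natural order and (8.16) in blocks of two positive terms followed by one negative term.

```python
import math

def original_partial_sum(n):
    """Partial sum s_n of the alternating series (8.15)."""
    return sum((-1) ** (k - 1) / k for k in range(1, n + 1))

def rearranged_partial_sum(blocks):
    """Partial sum s'_{3n} of (8.16) with n = blocks.

    Block j contributes 1/(4j-3) + 1/(4j-1) - 1/(2j):
    two positive terms followed by one negative term.
    """
    total = 0.0
    for j in range(1, blocks + 1):
        total += 1.0 / (4 * j - 3) + 1.0 / (4 * j - 1) - 1.0 / (2 * j)
    return total

s = math.log(2.0)
print(original_partial_sum(200000), "vs ln 2 =", s)
print(rearranged_partial_sum(200000), "vs (3/2) ln 2 =", 1.5 * s)
```

Both partial sums approach their limits slowly (the error decays like 1/n), which is typical of conditionally convergent series.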

Given a number $x$, put $x^{\pm} = (x \pm |x|)/2$. The number $x^+ = x$ if $x > 0$ and $x^+ = 0$ otherwise. Similarly, $x^- = x$ if $x < 0$ and $x^- = 0$ otherwise.

Lemma 8.1. Given a series $\sum a_n$, consider two series $\sum a_n^+$ and $\sum a_n^-$, where $a_n^{\pm} = (a_n \pm |a_n|)/2$ (the series of positive and negative terms). Then
(i) If $\sum a_n$ converges absolutely, then $\sum a_n^+$ and $\sum a_n^-$ converge.
(ii) If $\sum a_n$ is conditionally convergent, then $\sum a_n^+$ and $\sum a_n^-$ diverge.

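Proof. Let $\sum a_n = s$ and $\sum |a_n| = t$, where $t < \infty$ if $\sum a_n$ converges absolutely and $t = \infty$ if it is conditionally convergent. Let $s_n^{\pm}$ be partial sums of $\sum a_n^{\pm}$, $s_n$ be partial sums of $\sum a_n$, and $t_n$ be partial sums of $\sum |a_n|$. It follows from the definition $a_n^{\pm} = (a_n \pm |a_n|)/2$ that

$$a_n^+ + a_n^- = a_n, \quad a_n^+ - a_n^- = |a_n| \quad \Longrightarrow \quad s_n^+ + s_n^- = s_n, \quad s_n^+ - s_n^- = t_n,$$

so that $s_n^{\pm} = (s_n \pm t_n)/2$, where $s_n \to s$ and $t_n \to t$ as $n \to \infty$. If $\sum a_n$ converges absolutely, then $t < \infty$ and hence $s_n^{\pm} \to s^{\pm} = (s \pm t)/2$; that is, both series $\sum a_n^{\pm}$ converge. If $\sum a_n$ is conditionally convergent, then $t_n \to \infty$ while $s_n$ remains bounded, so $s_n^+ \to \infty$ and $s_n^- \to -\infty$; that is, both sequences $s_n^{\pm}$ diverge. □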

Theorem 8.24 (Riemann's Rearrangement Theorem). Let $\sum a_n$ be a series that converges, but not absolutely. Then, for any $c$ that is a real number or $\pm\infty$, there exists a rearrangement $\sum a'_n$ whose sequence of partial sums $\{s'_n\}$ converges to $c$.

Proof. Let $p_1, p_2, \ldots$ denote the nonnegative terms of $\sum a_n$ in the order in which they occur, and let $q_1, q_2, \ldots$ denote the negative terms of $\sum a_n$ in the order in which they occur. In the notation of Lemma 8.1, the series $\sum p_n$ and $\sum a_n^+$, as well as $\sum q_n$ and $\sum a_n^-$, may only differ by zero terms. So the series $\sum p_n$ and $\sum q_n$ diverge.

Consider the following rearrangement, whose partial sums are denoted by $s_n$. Given a number $c$, take the first $k_1$ terms $p_n$, where $k_1$ is the smallest integer such that $s_{k_1} = p_1 + p_2 + \cdots + p_{k_1} > c$; that is, $k_1$ is defined by the condition $s_{k_1} - p_{k_1} \le c < s_{k_1}$, so that $|c - s_{k_1}| \le p_{k_1}$. If $c < 0$, then skip this first step. Next, take the first $m_1$ terms $q_n$, where $m_1$ is the smallest integer such that $s_{k_1+m_1} = s_{k_1} + q_1 + \cdots + q_{m_1} < c$; that is, $m_1$ is defined by the condition $s_{k_1+m_1} - q_{m_1} \ge c > s_{k_1+m_1}$, so that $|c - s_{k_1+m_1}| \le |q_{m_1}|$. This can always be done because partial sums of $\sum p_n$ can be larger than any number, while partial sums of $\sum q_n$ can be smaller than any number owing to the divergence of these series. So

$$s_1 \le s_n \le s_{k_1}, \quad 1 \le n \le k_1, \quad \text{where } |c - s_{k_1}| \le p_{k_1},$$

$$s_{k_1} \ge s_n \ge s_{k_1+m_1}, \quad k_1 \le n \le k_1 + m_1, \quad \text{where } |c - s_{k_1+m_1}| \le |q_{m_1}|.$$

Next, take the next $k_2$ terms $p_n$, where $k_2$ is the smallest integer such that $s_{k_1+m_1+k_2} > c$, and take the next $m_2$ terms $q_n$, where $m_2$ is the smallest integer for which $s_{k_1+m_1+k_2+m_2} < c$, and so on. At the $n$th step of the procedure, let $n_1$ be the integer for which the last term in $s_{n_1}$ is the last nonnegative term added at this step (denoted $p_{k_n}$ by a slight abuse of notation), and let $n_2$ be the integer for which the last term in $s_{n_2}$ is the last negative term added at this step (denoted $q_{m_n}$); that is, $n_2 = n_1 + m_n$. The partial sums of the constructed rearrangement oscillate about $c$, reaching local maxima $s_{n_1}$ and local minima $s_{n_2}$:

$$s_{n_1} \ge s_n \ge s_{n_2}, \quad n_1 \le n \le n_2, \qquad |c - s_{n_1}| \le p_{k_n}, \quad |c - s_{n_2}| \le |q_{m_n}|. \tag{8.17}$$

By convergence of the series $\sum a_n$, $a_n \to 0$ as $n \to \infty$. Hence, $p_n$ and $q_n$ also converge to 0, and so do the subsequences $p_{k_n} \to 0$ and $q_{m_n} \to 0$. Thus, all local maxima and minima of the sequence of partial sums $\{s_n\}$ converge to $c$ by (8.17), which shows that $s_n \to c$. Finally, if $c = \pm\infty$, one can take any divergent sequence $c_n \to \infty$ (or $-\infty$) and construct a rearrangement such that $s_{k_1}$ overshoots $c_1$ and $s_{k_1+m_1}$ undershoots $c_1$, $s_{k_1+m_1+k_2}$ overshoots $c_2$ and $s_{k_1+m_1+k_2+m_2}$ undershoots $c_2$, and so on. Obviously, this sequence of partial sums diverges. □
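The construction in the proof is easy to try on the alternating harmonic series (8.15). The sketch below is an illustration in Python, not part of the original text; the function name and the one-term-at-a-time form of the rule are assumptions made for brevity. It adds the next unused positive term while the partial sum does not exceed the target c and the next unused negative term otherwise, which produces the same overshoot and undershoot pattern (up to the treatment of exact equality with c).

```python
def riemann_rearrangement(target, n_terms):
    """Greedy rearrangement of sum (-1)^(k-1)/k steering partial sums to target.

    Returns the final partial sum and the list of terms in the new order.
    """
    pos = 1      # next odd denominator: the positive term 1/pos
    neg = 2      # next even denominator: the negative term -1/neg
    total = 0.0
    terms = []
    for _ in range(n_terms):
        if total <= target:
            term = 1.0 / pos     # partial sum too small: add a positive term
            pos += 2
        else:
            term = -1.0 / neg    # partial sum too large: add a negative term
            neg += 2
        total += term
        terms.append(term)
    return total, terms

for c in (0.0, 1.0, 2.5):
    total, _ = riemann_rearrangement(c, 100000)
    print(c, total)
```

Every term of the original series is used at most once and in its original relative order within the positive and negative groups, as in the proof; the distance to the target is controlled by the size of the last term that crossed it.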

Absolutely convergent series have a drastically different property.

Theorem 8.25 (Rearrangement and Absolute Convergence). If a series $\sum a_n$ converges absolutely, then every rearrangement of $\sum a_n$ converges, and they all converge to the same sum.

Proof. Let $t_n = |a_1| + |a_2| + \cdots + |a_n|$ be a partial sum of the series of absolute values. The sequence $\{t_n\}$ converges to a number $t$ by the hypothesis; that is, for any $\varepsilon > 0$, there is an integer $N$ such that $|t - t_n| < \varepsilon$ for all $n \ge N$. Therefore, for $n > N$,

$$\sum_{k=N+1}^{n} |a_k| = t_n - t_N = |t_n - t + t - t_N| \le |t_n - t| + |t - t_N| < 2\varepsilon.$$

So, by taking $N$ large enough, the sum of any finite number of terms $|a_k|$, $k > N$, can be made smaller than any preassigned positive number. Let $s_n$ and $s'_n$ be partial sums of $\sum a_n$ and its rearrangement $\sum a'_n$. One can take $n > N$ large enough such that $s'_n$ contains $a_1, a_2, \ldots, a_N$ (i.e., the integers $1, 2, \ldots, N$ are in the set of integers $k_1, k_2, \ldots, k_n$ in the notation of Definition 8.9). Then the terms $a_1, a_2, \ldots, a_N$ cancel in the difference $s'_n - s_n$, which therefore contains only terms $\pm a_k$ with distinct indices $k > N$. Therefore, $|s'_n - s_n| < 2\varepsilon$ for all sufficiently large $n$. Since $s_n \to s$, it follows that $s'_n$ converges to a number $s'$ with $|s' - s| \le 2\varepsilon$, which shows that $s' = s$ because $\varepsilon > 0$ is arbitrary. □

Thus, an absolutely convergent series is much like a finite sum. The sum does not depend on the order in which the summation is carried out. In contrast, the sum of a conditionally convergent series depends on the summation order. This is the characteristic difference between these two classes of convergent series.

56.1. Strategy for Testing Series. It would not be wise to apply tests for convergence in a specific order to find one that finally works. Instead, a proper strategy, as with integration, is to classify the series according to its form. One should also keep in mind that a conclusion about the convergence of a series can be reached in different ways.

1. Special series. A series $\sum a_n$ coincides with (or is a combination of or is equivalent to) special series such as a p-series, alternating p-series, geometric series, telescopic series, and so on. Their convergence properties are known.

2. Series similar to special ones. If a series $\sum a_n$ has a form that is similar to one of the special series, then one of the comparison tests should be considered. For example, if $a_n$ is a rational or algebraic function (contains roots of polynomials), then the series should be compared with a p-series.

3. Necessary condition for convergence. It is always easier to check the condition $a_n \to 0$ as $n \to \infty$ than it is to investigate the series $\sum a_n$ for convergence. If the condition does not hold, the series diverges.

4. Alternating series. If $a_n = (-1)^n b_n$, $b_n \ge 0$, then the alternating series test is an obvious possibility.

5. Ratio and root tests. Absolute convergence implies convergence. So, if the ratio or root test shows convergence, then the series in question converges absolutely. If these tests give a limit greater than 1, then $a_n$ does not tend to 0 and the series diverges. If the tests are inconclusive (the limit equals 1), the series may converge absolutely, converge conditionally, or diverge, and a further investigation is required. The root test is convenient for series of the form $\sum (b_n)^n$. The ratio test is convenient when $a_n$ involves the factorial $n!$ or similar products of integers. The root test has a wider scope, but it is more difficult to use. The ratio test is often inconclusive if $a_n$ is a rational or algebraic function ($c_n = |a_{n+1}|/|a_n| \to 1$). In this case, the asymptotic behavior of $c_n$ is rather easy to find, $c_n \sim 1 + b/n$ as $n \to \infty$, and De Morgan's test can then be used.

6. Series of nonnegative terms. If $a_n = f(n) \ge 0$ and the integral $\int_1^\infty f(x)\,dx$ is easy to evaluate, then the integral test is effective. Also, it can be used in combination with the comparison test: if $0 \le a_n \le f(n)$ and $\int_1^\infty f(x)\,dx$ converges, then so does $\sum a_n$; if $f(n) \le a_n$ and $\int_1^\infty f(x)\,dx$ diverges, then so does $\sum a_n$.


Example 8.29. Test the series $\sum (n + 1)/(n^2 + n + 1)$ for convergence.

Solution: For large $n$, the leading terms of the top and bottom of the ratio are $n$ and $n^2$, respectively. So $a_n \sim 1/n$ asymptotically for large $n$. The series resembles the harmonic series, which diverges. It is natural, then, to try to prove the divergence of the series by comparing it with the harmonic series:

$$\frac{n + 1}{n^2 + n + 1} > \frac{n}{n^2 + n + 1} \ge \frac{n}{n^2 + n^2 + n^2} = \frac{1}{3n}.$$

Thus, the series indeed diverges by comparison with the harmonic series. □


Example 8.30. Test the series $\sum 3^n/(2 \cdot 4 \cdot 6 \cdots (2n))$ for convergence.

Solution: Each term $a_n$ involves a factorial-like product of integers, which suggests the use of the ratio test:

$$\frac{a_{n+1}}{a_n} = \frac{3^{n+1}}{2 \cdot 4 \cdots (2n) \cdot (2n + 2)} \cdot \frac{2 \cdot 4 \cdots (2n)}{3^n} = \frac{3}{2n + 2} \to 0.$$

So, the series converges. □


Example 8.31. Test the series $\sum \sin(n^2)\, e^{-n^{3/2}}$ for convergence.

Solution: One has

$$|a_n| = |\sin(n^2)|\, e^{-n^{3/2}} \le e^{-n^{3/2}} \le e^{-n}.$$

The series $\sum e^{-n}$ converges by the integral test: $\int_1^\infty e^{-x}\,dx = 1/e < \infty$. Hence, the series in question converges absolutely. Alternatively, the convergence of $\sum b_n$, where $b_n = e^{-n^{3/2}}$, can be established by the root test: $\sqrt[n]{b_n} = e^{-n^{1/2}} \to 0 < 1$ as $n \to \infty$. □

Example 8.32. Put $h_n = 1 + 1/2 + \cdots + 1/n$ (a partial sum of the harmonic series). Investigate the convergence of $\sum n^p e^{-q h_n}$, where $p \ne q$.

Solution: The ratio test is inconclusive:

$$\frac{a_{n+1}}{a_n} = \frac{(n + 1)^p e^{-q h_{n+1}}}{n^p e^{-q h_n}} = \frac{(n + 1)^p}{n^p}\, e^{-q(h_{n+1} - h_n)} = \Big(1 + \frac1n\Big)^p e^{-\frac{q}{n+1}} \to 1.$$

To apply De Morgan's test, the asymptotic behavior of $a_{n+1}/a_n$ has to be investigated. Put $f(x) = (1 + x)^p \exp(-qx/(1 + x))$ so that $a_{n+1}/a_n = f(1/n)$. Using the linearization near $x = 0$, $(1 + x)^p \sim 1 + px$ and $\exp(-qx/(1 + x)) \sim 1 - qx/(1 + x) \sim 1 - qx$, the asymptotic behavior is obtained:

$$f(x) \sim (1 + px)(1 - qx) \sim 1 + (p - q)x \quad \Longrightarrow \quad \frac{a_{n+1}}{a_n} \sim 1 + \frac{p - q}{n}.$$

Thus, the series converges if $p - q < -1$ or $p < q - 1$. Of course, one could simply calculate $f'(0) = p - q$, but this is a bit more involved than the above procedure for finding the asymptotic behavior. □
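The asymptotic relation a_{n+1}/a_n ∼ 1 + (p − q)/n can be checked numerically by evaluating n(a_{n+1}/a_n − 1) for large n. A small Python sketch under that reading follows; the function name is illustrative and not from the text.

```python
import math

def scaled_ratio_excess(p, q, n):
    """Return n * (a_{n+1}/a_n - 1) for a_n = n^p * exp(-q * h_n).

    The ratio is (1 + 1/n)^p * exp(-q/(n+1)), because h_{n+1} - h_n = 1/(n+1).
    """
    ratio = (1.0 + 1.0 / n) ** p * math.exp(-q / (n + 1))
    return n * (ratio - 1.0)

for n in (10, 1000, 100000):
    print(n, scaled_ratio_excess(1.0, 3.0, n))   # approaches p - q = -2
```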

56.2. Exercises.
In (1)–(15), test the series for convergence or divergence (here p is real).

(1)∞∑

n=1

1√n + 2n

(2)∞∑

n=1

(−1)n n

2n + 3(3)

∞∑n=1

npp2n

n!

(4)∞∑

n=1

n!2 · 5 · 8 · · · (3n− 1)

(5)∞∑

n=2

(−1)n ln n

np

(6)∞∑

n=1

(2n + 3)n

(3n2 + 1)n/2 (7)∞∑

n=1

(−1)n 1 · 3 · 5 · · · (2n− 1)2 · 4 · 6 · · · (2n)

(8)∞∑

n=1

tan(πn + 1/n) (9)∞∑

n=2

1(ln n)ln n

(10)∞∑

n=1

sin(1/n)np

(11)∞∑

n=1

( n

n + p

)n2

, p > 0 (12)∞∑

n=1

( n√

p2 − 1)n (13)∞∑

n=1

n!enp

(14)∞∑

n=1

npn−1

pn − (1− 1/n)n, |p| < 1 (15)

∞∑n=1

( n√

p− 1) , p ≥ 0


57. Power Series

Definition 8.10 (Power Series). Given a sequence $\{c_n\}$, the series

$$\sum_{n=0}^{\infty} c_n x^n = c_0 + c_1 x + c_2 x^2 + c_3 x^3 + \cdots$$

is called a power series in the variable $x$. The numbers $c_n$ are called the coefficients of the series.

In general, the series will converge or diverge, depending on the choice of $x$. The power series always converges for $x = 0$ to the number $c_0$.

Example 8.33. For what values of $x$ does the power series $\sum_{n=1}^{\infty} x^n/n$ converge?

Solution: By the root test,

$$\sqrt[n]{\frac{|x^n|}{n}} = \frac{|x|}{\sqrt[n]{n}} \to |x| \quad \text{as } n \to \infty.$$

So the series converges for all $-1 < x < 1$ and diverges if $x > 1$ or $x < -1$. The root test is inconclusive for $x = \pm 1$. These values have to be investigated by different means. For $x = 1$, the power series becomes the harmonic series $\sum 1/n$, which is divergent. For $x = -1$, the power series becomes the alternating harmonic series $\sum (-1)^n/n$, which is convergent. Thus, the power series converges if $x \in [-1, 1)$ and diverges otherwise. □

Given a number $a$, consider a power series in the variable $y = x - a$:

$$\sum_{n=0}^{\infty} c_n y^n = \sum_{n=0}^{\infty} c_n (x - a)^n.$$

It is also called a power series centered at $a$ or a power series about $a$. Let $S$ be the set of all values of $x$ for which a power series in $x$ converges and let $S_a$ be the set of all values of $x$ for which the corresponding power series in $(x - a)$ converges. What is the relation between $S$ and $S_a$? Since the series are obtained from one another by merely shifting the value of the variable by a number $a$, $x \to x - a$, the set $S_a$ is obtained by adding the number $a$ to every element of $S$:

$$x \in S_a \iff x - a \in S \quad \Longrightarrow \quad S_a = \{x \mid x - a \in S\}. \tag{8.18}$$

For example, the series $\sum_{n=1}^{\infty} (x - 2)^n/n$ converges if $x - 2 \in [-1, 1)$ or $x \in [1, 3)$ and diverges otherwise by Example 8.33. Thus, the problem of finding the set $S_a$ is equivalent to the problem of finding the set $S$.


57.1. Power Series as a Function. Suppose that a power series in $x$ converges on a set $S$. Then it defines a function on $S$:

$$f(x) = \sum_{n=0}^{\infty} c_n x^n, \quad x \in S.$$

The set $S$ is called the domain of such a function. Functions defined by power series are most common in applications. Many of them have special notations (like elementary functions sin, cos, exp, etc.). Their properties are well studied. In what follows, it will be shown that familiar elementary functions such as $\sin x$, $\cos x$, and $\exp x$ can also be represented as power series.

Example 8.34. Find the domain of the Bessel function of order 0 that is defined by the power series

$$J_0(x) = \sum_{n=0}^{\infty} \frac{(-1)^n}{2^{2n} (n!)^2}\, x^{2n},$$

where, by common convention, $0! = 1$.

Solution: Since $a_n = c_n x^{2n}$ contains the factorial, the ratio test is more convenient:

$$\frac{|a_{n+1}|}{|a_n|} = x^2 \frac{|c_{n+1}|}{|c_n|} = x^2 \frac{2^{2n} (n!)^2}{2^{2(n+1)} ((n + 1)!)^2} = \frac{x^2}{2^2 (n + 1)^2} \to 0$$

as $n \to \infty$. So the series converges for all $x$. □
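The term ratio found in the solution also gives a convenient way to sum the series numerically: each term is obtained from the previous one by multiplying by −x²/(4(n + 1)²). A minimal Python sketch follows, not part of the original text; the function name and the default number of terms are illustrative choices.

```python
def bessel_j0(x, n_terms=40):
    """Partial sum of the power series for J_0(x) with n_terms terms."""
    term = 1.0        # the n = 0 term, since 0! = 1
    total = term
    for n in range(n_terms - 1):
        # a_{n+1} = -a_n * x^2 / (2^2 (n+1)^2), the signed version of the ratio above
        term *= -x * x / (4.0 * (n + 1) ** 2)
        total += term
    return total

print(bessel_j0(0.0))   # the series reduces to c_0 = 1
print(bessel_j0(1.0))
print(bessel_j0(2.404825557695773))   # close to the first positive zero of J_0
```

Because the ratio tends to 0 for every x, the terms eventually decrease very fast, and a few dozen terms suffice for moderate values of |x|.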

Values of a function defined by a power series can be estimated by partial sums that are polynomials in the variable $x$:

$$f(x) \approx f_n(x) = \sum_{k=0}^{n} c_k x^k = c_0 + c_1 x + c_2 x^2 + \cdots + c_n x^n.$$

Thus, partial sums define a sequence of polynomials that converges to the function on $S$, $f_n(x) \to f(x)$ for all $x \in S$. The accuracy of the approximation is determined by the remainder $R_n(x) = f(x) - f_n(x)$. The accuracy assessment is discussed in Section 8.59. Since the remainder $R_n(x)$ is a function on $S$, the error of the approximation is not generally uniform; that is, it depends on $x$.

57.2. Radius of Convergence. The set $S$ on which a power series is convergent is an important characteristic and its properties have to be studied.

Lemma 8.2 (Properties of a Power Series).
(i) If a power series $\sum c_n x^n$ converges when $x = b \ne 0$, then it converges whenever $|x| < |b|$.
(ii) If a power series $\sum c_n x^n$ diverges when $x = d \ne 0$, then it diverges whenever $|x| > |d|$.

Proof. If $\sum c_n b^n$ converges, then, by the necessary condition for convergence, $c_n b^n \to 0$ as $n \to \infty$. This means, in particular, that, for $\varepsilon = 1$, there exists an integer $N$ such that $|c_n b^n| < \varepsilon = 1$ for all $n > N$. Thus, for $n > N$,

$$|c_n x^n| = \Big| \frac{c_n b^n x^n}{b^n} \Big| = |c_n b^n| \Big| \frac{x}{b} \Big|^n < \Big| \frac{x}{b} \Big|^n,$$

which shows that the series $\sum c_n x^n$ converges (absolutely) by comparison with the geometric series $\sum q^n$, where $q = |x/b| < 1$, that is, $|x| < |b|$.

Suppose that $\sum c_n d^n$ diverges. If $x$ is any number such that $|x| > |d|$, then $\sum c_n x^n$ cannot converge because, by part (i) of the lemma, the convergence of $\sum c_n x^n$ implies the convergence of $\sum c_n d^n$. Therefore, $\sum c_n x^n$ diverges. □

This lemma allows us to establish the following description of the set $S$.

Theorem 8.26 (Convergence Properties of a Power Series). For a power series $\sum c_n x^n$, there are only three possibilities:
(i) The series converges only when $x = 0$.
(ii) The series converges for all $x$.
(iii) There is a positive number $R$ such that the series converges if $|x| < R$ and diverges if $|x| > R$.

Proof. Suppose that neither case (i) nor case (ii) is true. Then there are numbers $b \ne 0$ and $d \ne 0$ such that the power series converges for $x = b$ and diverges for $x = d$. By Lemma 8.2, the series diverges whenever $|x| > |d|$, so $|x| \le |d|$ for all $x$ in the set of convergence $S$. This shows that $|d|$ is an upper bound for the set $S$. By the completeness axiom, $S$ has a least upper bound $R = \sup S$, and $R > 0$ because, by Lemma 8.2, $S$ contains every $x$ with $|x| < |b|$. If $|x| > R$, then $x \notin S$ (otherwise, by Lemma 8.2, $S$ would contain numbers greater than $R$), and $\sum c_n x^n$ diverges. If $|x| < R$, then $|x|$ is not an upper bound for $S$, and there exists a number $b \in S$ such that $b > |x|$. Since $b \in S$, $\sum c_n x^n$ converges by Lemma 8.2. □

Theorem 8.26 shows that a power series converges in a single open interval $(-R, R)$ and diverges outside this interval. The set $S$ may or may not include the points $x = \pm R$. This question requires a special investigation just like in Example 8.33. So the number $R$ is characteristic for convergence properties of a power series.

Definition 8.11 (Radius of Convergence). The radius of convergence of a power series $\sum c_n x^n$ is a positive number $R > 0$ such that the series converges in the open interval $(-R, R)$ and diverges outside it. A power series is said to have a zero radius of convergence, $R = 0$, if it converges only when $x = 0$. A power series is said to have an infinite radius of convergence, $R = \infty$, if it converges for all values of $x$.

The ratio or root test can be used to determine the radius of convergence.

Corollary 8.3 (Radius of Convergence of a Power Series). Given a power series $\sum c_n x^n$,

$$\text{if } \lim_{n \to \infty} \frac{|c_{n+1}|}{|c_n|} = \alpha \quad \Longrightarrow \quad R = \frac{1}{\alpha},$$

$$\text{if } \lim_{n \to \infty} \sqrt[n]{|c_n|} = \alpha \quad \Longrightarrow \quad R = \frac{1}{\alpha},$$

where $R = 0$ if $\alpha = \infty$ and $R = \infty$ if $\alpha = 0$.

Proof. Put $a_n = c_n x^n$ in the ratio test (Theorem 8.20). Then $|a_{n+1}|/|a_n| = |x||c_{n+1}|/|c_n| \to |x|\alpha$. The series converges if $|x|\alpha < 1$ and diverges if $|x|\alpha > 1$, which shows that $R = 1/\alpha$. Similarly, using the root test (Theorem 8.21), $\sqrt[n]{|a_n|} = |x| \sqrt[n]{|c_n|} \to |x|\alpha < 1$, which shows that $R = 1/\alpha$. □

Remark. If the sequences in Corollary 8.3 do not converge, then Theorem 8.22 should be used, where $a_n = c_n x^n$.

Once the radius of convergence has been found and $0 < R < \infty$, the cases $x = \pm R$ have to be investigated by some other means (as the root or ratio test is inconclusive in this case) to determine the interval of convergence $S$ of a power series.

Example 8.35. Find the radius of convergence and the interval of convergence of the power series $\sum c_n x^n$, where $c_n = (-q)^n/\sqrt{n + 1}$ and $q > 0$.

Solution:

$$\frac{|c_{n+1}|}{|c_n|} = \frac{q^{n+1}}{\sqrt{n + 2}} \cdot \frac{\sqrt{n + 1}}{q^n} = q \sqrt{\frac{n + 1}{n + 2}} = q \sqrt{\frac{1 + 1/n}{1 + 2/n}} \to q = \alpha.$$

Therefore, $R = 1/\alpha = 1/q$. If $x = 1/q$, then $c_n x^n = (-1)^n/\sqrt{n + 1} = (-1)^n b_n$. The sequence $b_n$ converges monotonically to 0 so that $\sum (-1)^n b_n$ converges by the alternating series test. If $x = -1/q$, then $c_n x^n = 1/\sqrt{n + 1} \ge 1/\sqrt{2n}$, $n \ge 1$. The p-series $\sum 1/n^{1/2}$ diverges ($p = 1/2 < 1$) so that $\sum 1/\sqrt{n + 1}$ diverges by the comparison test. Thus, the interval of convergence is $S = (-1/q, 1/q]$. □
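The behavior at the two endpoints can be illustrated numerically. The following Python sketch is not part of the original text; the function name and the choice q = 2 are illustrative assumptions. It evaluates partial sums of the series with c_n = (−q)^n/√(n + 1) directly from the definition of the terms.

```python
import math

def endpoint_partial_sum(q, x, n_terms):
    """Partial sum of sum_{n=0}^{n_terms-1} c_n x^n with c_n = (-q)^n / sqrt(n+1)."""
    total = 0.0
    for n in range(n_terms):
        total += (-q * x) ** n / math.sqrt(n + 1)
    return total

q = 2.0
for n_terms in (100, 10000):
    # x = 1/q: terms alternate in sign and the partial sums settle down
    # x = -1/q: all terms are positive and the partial sums keep growing
    print(n_terms,
          endpoint_partial_sum(q, 1.0 / q, n_terms),
          endpoint_partial_sum(q, -1.0 / q, n_terms))
```

At x = 1/q the product (−q)x equals −1, so the terms are (−1)^n/√(n + 1); at x = −1/q it equals 1, so the terms are 1/√(n + 1) and the partial sums grow roughly like 2√n.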


Example 8.36. Find the radius of convergence and the interval of convergence of the power series $\sum n^2 (x + 1)^n/q^n$, where $q > 0$.

Solution: Put $y = x + 1$. If $S$ is the interval of convergence of $\sum c_n y^n$, where $c_n = n^2/q^n$, then the interval of convergence in question is obtained by adding $-1$ to all numbers in $S$ according to the rule (8.18). By Corollary 8.3,

$$\sqrt[n]{|c_n|} = \frac{1}{q} \sqrt[n]{n^2} = \frac{1}{q} (\sqrt[n]{n})^2 \to \frac{1}{q} = \alpha.$$

So $R = 1/\alpha = q$. If $y = q$, then $c_n y^n = n^2$, and the series $\sum n^2$ diverges ($a_n = n^2$ does not converge to 0). If $y = -q$, then $c_n y^n = (-1)^n n^2$, and the series diverges because $a_n = (-1)^n n^2$ does not converge to 0. The series converges only if $|y| = |x + 1| < q$, and hence the interval of convergence is $x \in (-q - 1, q - 1)$ (the interval $(-q, q)$ shifted by $-1$). □

57.3. Exercises.
In (1)–(12), find the radius of convergence and the interval of convergence of the power series.

(1)∞∑

n=0

n3xn (2)∞∑

n=0

2n

n3 xn (3)∞∑

n=1

( n√

2− 1)nx2n

(4)∞∑

n=2

(−1)n 12n ln n

xn (5)∞∑

n=1

1np

(x− 2)n , p > 0

(6)∞∑

n=0

√n(x + 1)n (7)

∞∑n=1

(−1)n n2 + 1n3 + 3

(x− 1)n

(8)∞∑

n=1

(4x + 1)n

n2 (9)∞∑

n=1

xn

1 · 3 · 5 · · · (2n− 1)

(10).∞∑

n=1

n2x2n

2 · 4 · · · (2n)(11)

∞∑n=0

(n!)k

(kn)!xn , k > 0 (integer)

(12)∞∑

n=0

4n

n!(x + 3)n

(13) Let $p < q$ be real numbers. Give examples of power series whose intervals of convergence are $(p, q)$, $[p, q]$, $(p, q]$, and $[p, q)$.
(14) The Airy function is defined by the power series

$$A(x) = 1 + \frac{x^3}{2 \cdot 3} + \frac{x^6}{2 \cdot 3 \cdot 5 \cdot 6} + \frac{x^9}{2 \cdot 3 \cdot 5 \cdot 6 \cdot 8 \cdot 9} + \cdots .$$

Find its domain.
(15) A function $f$ is defined by the power series

$$f(x) = p + qx + px^2 + qx^3 + px^4 + qx^5 + \cdots ;$$

that is, its coefficients are $c_{2k} = p$ and $c_{2k-1} = q$, where $p$ and $q$ are real. Find the domain of $f$ and an explicit expression of $f(x)$ (the sum of the series).
(16) If $f(x) = \sum c_n x^n$, where $c_{n+4} = c_n$ for all $n \ge 0$, find the domain of $f$ and a formula for $f(x)$.
(17) Power series $\sum c_n x^n$ and $\sum b_n x^n$ have the radii of convergence $R_1$ and $R_2$, respectively. What is the radius of convergence of $\sum (c_n + b_n) x^n$?
(18) Suppose that the radius of convergence of $\sum c_n x^n$ is $R$. What is the radius of convergence of $\sum c_n x^{kn}$, where $k > 0$ is an integer?

58. Representation of Functions as Power Series

Consider a power series

$$1 - x^2 + x^4 - x^6 + x^8 - \cdots = \sum_{n=0}^{\infty} (-1)^n x^{2n}.$$

It is a geometric series with $q = -x^2$, and therefore it converges for all $|q| = x^2 < 1$ or $x \in (-1, 1)$. Using the formula for the sum of a geometric series, one infers that

$$\frac{1}{1 + x^2} = 1 - x^2 + x^4 - \cdots = \sum_{n=0}^{\infty} (-1)^n x^{2n} \quad \text{for all } -1 < x < 1.$$

This shows that the function $1/(1 + x^2)$ can be represented as a power series in the open interval $(-1, 1)$. Note that this representation is valid only in the interval of convergence of the power series despite the fact that the function $1/(1 + x^2)$ is defined on the entire real line.

In general, one can construct a representation of a function by a power series in $(x - a)$ for some $a$. The interval of validity of this representation depends on the choice of $a$.

Example 8.37. Find a representation of $1/x$ as a power series in $(x - a)$, $a > 0$, and determine the interval of its validity.

Solution: Put $y = x - a$. The function can be rewritten in a form that resembles the sum of a geometric series:

$$\frac{1}{x} = \frac{1}{a(1 + y/a)} = \frac{1}{a} \sum_{n=0}^{\infty} \Big( -\frac{y}{a} \Big)^n = \sum_{n=0}^{\infty} \frac{(-1)^n}{a^{n+1}} (x - a)^n, \quad x \in (0, 2a).$$


The geometric series converges if $|q| = |-y/a| = |y|/a < 1$, and hence this representation is valid only if $-a < y < a$ or $-a < x - a < a$ or $0 < x < 2a$. □
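The interval of validity can be probed numerically by comparing partial sums of the series with 1/x inside and outside (0, 2a). This is a hedged Python sketch, not part of the original text; the function name and the sample values a = 2 and x = 3, 0.5, 5 are illustrative.

```python
def reciprocal_series(x, a, n_terms):
    """Partial sum of sum_{n>=0} (-1)^n (x - a)^n / a^(n+1), the series for 1/x about a."""
    total = 0.0
    for n in range(n_terms):
        total += (-1) ** n * (x - a) ** n / a ** (n + 1)
    return total

a = 2.0
print(reciprocal_series(3.0, a, 200), 1.0 / 3.0)   # x inside (0, 2a)
print(reciprocal_series(0.5, a, 200), 1.0 / 0.5)   # x inside (0, 2a)
print(reciprocal_series(5.0, a, 200))               # x outside: partial sums blow up
```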

58.1. Differentiation and Integration of Power Series. The formula for the sum of a power series $\sum c_n x^n$ is often complicated and, in most cases, cannot even be found explicitly. How can functions defined by a power series be differentiated and integrated? If a function is a finite sum $f(x) = u_1(x) + \cdots + u_n(x)$, then the derivative is the sum of derivatives $f' = u'_1 + \cdots + u'_n$ and, similarly, the integral is the sum of integrals $\int f\,dx = \int u_1\,dx + \cdots + \int u_n\,dx$. This is not generally true for infinite sums. As an example, consider a function defined by the series

$$f(x) = \sum_{n=1}^{\infty} u_n(x) = \sum_{n=1}^{\infty} \frac{\sin(nx)}{n^2}.$$

By comparison with a p-series, $|u_n(x)| = |\sin(nx)|/n^2 \le 1/n^2$, this series converges for all $x$ because $\sum 1/n^2$ converges. If the series is differentiated just like a finite sum, that is, term-by-term, $u'_n(x) = \cos(nx)/n$, then the series $\sum u'_n(x)$ diverges for $x = 2\pi k$ for any integer $k$ as the harmonic series $\sum 1/n$. So $f'(2\pi k)$ does not exist. Thus, although the terms $u_n(x)$ are differentiable functions in the interval of convergence of the series $\sum u_n$, the series of derivatives $\sum u'_n$ may not converge and hence $f = \sum u_n$ may not be differentiable everywhere in its domain.

It turns out that if $u_n(x) = c_n (x - a)^n$, that is, $\sum u_n(x)$ is a power series, then the term-by-term differentiation or integration is justified. A proof of this assertion is beyond the scope of this course.

Theorem 8.27 (Differentiation and Integration of Power Series). If the power series $\sum c_n (x - a)^n$ has a nonzero radius of convergence $R > 0$, then the function $f$ defined by

$$f(x) = c_0 + c_1 (x - a) + c_2 (x - a)^2 + \cdots = \sum_{n=0}^{\infty} c_n (x - a)^n$$

is differentiable (and therefore continuous) on the interval $(a - R, a + R)$ and

$$f'(x) = c_1 + 2c_2 (x - a) + 3c_3 (x - a)^2 + \cdots = \sum_{n=1}^{\infty} n c_n (x - a)^{n-1},$$

$$\int f(x)\,dx = C + c_0 (x - a) + c_1 \frac{(x - a)^2}{2} + \cdots = C + \sum_{n=0}^{\infty} c_n \frac{(x - a)^{n+1}}{n + 1}.$$

The radii of convergence of these power series are both $R$.

Thus, for power series, the differentiation or integration and the summation can be carried out in any order:

$$\frac{d}{dx} \sum c_n (x - a)^n = \sum \frac{d}{dx} [c_n (x - a)^n], \qquad \int \Big( \sum c_n (x - a)^n \Big) dx = \sum \int [c_n (x - a)^n]\,dx.$$

Remark. Theorem 8.27 states that the radius of convergence of a power series does not change after differentiation or integration of the series. This does not mean that the interval of convergence does not change. It may happen that the original series converges at an endpoint, whereas the differentiated series diverges there.

Example 8.38. Find the intervals of convergence for $f$, $f'$, and $f''$ if $f(x) = \sum_{n=1}^{\infty} x^n/n^2$.

Solution: Here $c_n = 1/n^2$ and hence $\sqrt[n]{|c_n|} = 1/\sqrt[n]{n^2} = (1/\sqrt[n]{n})^2 \to 1 = \alpha$. So the radius of convergence is $R = 1/\alpha = 1$. For $x = \pm 1$, the series of absolute values is the p-series $\sum 1/n^2$ that converges ($p = 2 > 1$). Thus, $f(x)$ is defined on the closed interval $x \in [-1, 1]$. By Theorem 8.27, the derivatives $f'(x) = \sum_{n=1}^{\infty} x^{n-1}/n$ and $f''(x) = \sum_{n=2}^{\infty} (n - 1) x^{n-2}/n$ have the same radius of convergence $R = 1$. For $x = -1$, the series $f'(-1) = \sum (-1)^{n-1}/n$ is the alternating harmonic series that converges, whereas the series $f''(-1) = \sum (-1)^n (n - 1)/n$ diverges because the sequence of its terms does not converge to 0: $|(-1)^n (n - 1)/n| = 1 - 1/n \to 1 \ne 0$. For $x = 1$, the series $f'(1) = \sum 1/n$ is the harmonic series and hence diverges. The series $f''(1) = \sum (n - 1)/n$ also diverges ($(n - 1)/n$ does not converge to 0). Thus, the intervals of convergence for $f$, $f'$, and $f''$ are, respectively, $[-1, 1]$, $[-1, 1)$, and $(-1, 1)$. □

The term-by-term integration of a power series can be used to obtain a power series representation of antiderivatives.

Example 8.39. Find a power series representation for $\tan^{-1} x$.

Solution:

$$\tan^{-1} x = \int \frac{dx}{1 + x^2} = \int \Big( \sum_{n=0}^{\infty} (-x^2)^n \Big) dx = C + \sum_{n=0}^{\infty} (-1)^n \frac{x^{2n+1}}{2n + 1}.$$

Since $\tan^{-1} 0 = 0$, the integration constant $C$ satisfies the condition $0 = C + 0$ or $C = 0$. The geometric series with $q = -x^2$ converges if $|q| < 1$. Hence, the radius of convergence of the series for $\tan^{-1} x$ is $R = 1$ (the power series representation is valid for $x \in (-1, 1)$). □


In particular, the number $1/\sqrt{3}$ is less than the radius of convergence of the power series for $\tan^{-1} x$. So the number $\tan^{-1}(1/\sqrt{3}) = \pi/6$ can be written as a numerical series by substituting $x = 1/\sqrt{3}$ into the power series for $\tan^{-1} x$. This leads to the following representation of the number $\pi$:

$$\pi = 2\sqrt{3} \sum_{n=0}^{\infty} \frac{(-1)^n}{(2n + 1) 3^n}.$$
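Since the terms of this series decrease roughly like 3^{−n}, a few dozen terms already reproduce π to machine precision. A short Python sketch of the computation follows; it is not part of the original text and the function name is illustrative.

```python
import math

def pi_from_arctan_series(n_terms):
    """Approximate pi by 2*sqrt(3) * sum_{n < n_terms} (-1)^n / ((2n+1) 3^n)."""
    total = 0.0
    for n in range(n_terms):
        total += (-1) ** n / ((2 * n + 1) * 3 ** n)
    return 2.0 * math.sqrt(3.0) * total

for n_terms in (5, 10, 30):
    approx = pi_from_arctan_series(n_terms)
    print(n_terms, approx, abs(approx - math.pi))
```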

58.2. Power Series and Differential Equations. A power series representation is often used to solve differential equations. The relation between a function $f(x)$, its argument $x$, and its derivatives $f'(x)$, $f''(x)$, and so on is called a differential equation. A function $f(x)$ that satisfies a differential equation is generally difficult to find in a closed form. A power series representation turns out to be helpful. Since in this representation a function is defined by a sequence $\{c_n\}$, $f(x) = \sum c_n x^n$, and so are its derivatives $f^{(k)}(x)$, a differential equation imposes conditions on $c_n$ that are solved recursively.

Example 8.40. Find a power series representation of the solution of the equation $f'(x) = f(x)$ and determine its radius of convergence.

Solution: Put $f(x) = \sum c_n x^n$ and hence $f'(x) = \sum n c_n x^{n-1}$. Then the equation $f' = f$ gives

$$c_1 + 2c_2 x + 3c_3 x^2 + 4c_4 x^3 + \cdots = c_0 + c_1 x + c_2 x^2 + c_3 x^3 + \cdots .$$

By matching the coefficients at the monomial terms $1, x, x^2, x^3$, and so on, one finds:

$$c_1 = c_0, \quad c_2 = \frac{c_1}{2}, \quad c_3 = \frac{c_2}{3}, \quad \ldots, \quad c_n = \frac{c_{n-1}}{n}.$$

Using the latter relation recursively:

$$c_n = \frac{1}{n} c_{n-1} = \frac{1}{n(n - 1)} c_{n-2} = \frac{1}{n(n - 1)(n - 2)} c_{n-3} = \cdots = \frac{c_0}{n!}.$$

So $f(x) = c_0 \sum_{n=0}^{\infty} x^n/n!$, where $c_0$ is a constant (the equation is satisfied for any choice of $c_0$). By the ratio test, the series converges for all $x$ (so $R = \infty$). Indeed, $c_n = c_0/n!$ and $c_{n+1}/c_n = 1/(n + 1) \to 0 = \alpha$ and hence $R = 1/\alpha = \infty$. □
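The recursion c_n = c_{n−1}/n translates directly into a short computation. The Python sketch below is an illustration only, not part of the original text; the function names are hypothetical. It generates the coefficients from c_0 and evaluates the truncated series by Horner's scheme.

```python
import math

def exp_series_coefficients(c0, n_max):
    """Coefficients c_0, ..., c_{n_max} generated by the recursion c_n = c_{n-1} / n."""
    coeffs = [c0]
    for n in range(1, n_max + 1):
        coeffs.append(coeffs[-1] / n)
    return coeffs

def evaluate_power_series(coeffs, x):
    """Evaluate sum c_n x^n for the given finite list of coefficients (Horner's scheme)."""
    total = 0.0
    for c in reversed(coeffs):
        total = total * x + c
    return total

coeffs = exp_series_coefficients(1.0, 30)
print(evaluate_power_series(coeffs, 1.0), math.e)
print(evaluate_power_series(coeffs, -2.0), math.exp(-2.0))
```

With c_0 = 1 the partial sums agree with the exponential function, consistent with the closed form of the solution.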

For this simple differential equation, it is not difficult to find $f(x) = c_0 e^x$ by recalling the properties of the exponential function: $(e^x)' = e^x$. The condition $f(0) = e^0 = 1$ determines the constant $c_0 = 1$. Thus, the exponential function has the following power series representation:

$$e^x = 1 + \frac{x}{1!} + \frac{x^2}{2!} + \frac{x^3}{3!} + \cdots = \sum_{n=0}^{\infty} \frac{x^n}{n!}. \tag{8.19}$$

The series converges on the entire real line. In particular, the number $e$ has the following series representation:

$$e = 1 + \frac{1}{1!} + \frac{1}{2!} + \frac{1}{3!} + \cdots = \sum_{n=0}^{\infty} \frac{1}{n!}.$$

58.3. Approximation of Definite Integrals. If an indefinite integral of $f(x)$ is difficult to obtain, then the evaluation of the integral $\int_a^b f(x)\,dx$ poses a problem. A power series representation offers a simple way to approximate the value of the integral. Suppose that $f(x) = \sum c_n x^n$ for $-R < x < R$. By Theorem 8.27, for any $-R < a < b < R$,

$$\int_a^b f(x)\,dx = \sum c_n \int_a^b x^n\,dx = \sum c_n \frac{b^{n+1}}{n + 1} - \sum c_n \frac{a^{n+1}}{n + 1} \approx \sum_{k=0}^{n} c_k \frac{b^{k+1}}{k + 1} - \sum_{k=0}^{n} c_k \frac{a^{k+1}}{k + 1}.$$

Errors of the approximation of the series sum by finite sums have been discussed earlier.

Example 8.41. How many terms does one need in the power series approximation of the integral of $f(x) = e^{-x^2}$ over the interval $[0, 1]$ to make the absolute error smaller than $10^{-5}$?

Solution: Note first that the indefinite integral $\int e^{-x^2} dx$ cannot be expressed in elementary functions! So a direct use of the fundamental theorem of calculus becomes problematic. However, $\int e^{-x^2} dx$ can be represented as a power series that converges on the entire real line by replacing $x$ in (8.19) by $(-x^2)$. One has

$$\int_0^1 e^{-x^2} dx = \sum_{k=0}^{\infty} \frac{(-1)^k}{k!} \int_0^1 x^{2k}\,dx = \sum_{k=0}^{\infty} \frac{(-1)^k}{k!(2k + 1)} \approx \sum_{k=0}^{n} \frac{(-1)^k}{k!(2k + 1)}.$$

To determine $n$ in the finite sum approximation of the series, recall the alternating series estimation theorem (Theorem 8.18), where $b_n = 1/(n!(2n + 1))$:

$$\Big| \int_0^1 e^{-x^2} dx - \sum_{k=0}^{n} \frac{(-1)^k}{k!(2k + 1)} \Big| \le b_{n+1} = \frac{1}{(n + 1)!(2n + 3)} < 10^{-5}.$$

A direct calculation shows that $b_7 \approx 1.32 \cdot 10^{-5}$ and $b_8 \approx 1.46 \cdot 10^{-6}$. So $n = 7$ is sufficient to approximate the integral with the required accuracy. □
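The "direct calculation" is easy to reproduce. The following Python sketch is not part of the original text; the function names are illustrative. It finds the smallest n with b_{n+1} below the tolerance and compares the resulting finite sum with a reference value obtained from the error function in the standard library.

```python
import math

def b(n):
    """b_n = 1 / (n! (2n + 1)), the magnitude of the nth term of the series."""
    return 1.0 / (math.factorial(n) * (2 * n + 1))

def smallest_n(tol):
    """Smallest n such that the alternating series bound b_{n+1} is below tol."""
    n = 0
    while b(n + 1) >= tol:
        n += 1
    return n

def integral_approx(n):
    """Finite sum sum_{k=0}^{n} (-1)^k b_k approximating the integral over [0, 1]."""
    return sum((-1) ** k * b(k) for k in range(n + 1))

n = smallest_n(1e-5)
reference = 0.5 * math.sqrt(math.pi) * math.erf(1.0)   # integral of exp(-x^2) over [0, 1]
print(n, b(7), b(8))
print(integral_approx(n), reference, abs(integral_approx(n) - reference))
```

The actual error is smaller than the bound b_8, as the alternating series estimate guarantees.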

58.4. Exercises.
In (1)–(3), find a power series representation for the function and determine the interval of convergence.

(1) $f(x) = \dfrac{1}{1 - x^4}$ (2) $f(x) = \dfrac{x}{3x^2 + 2}$ (3) $f(x) = \dfrac{x + 1}{2x^2 - x - 1}$

In (4)–(6), use differentiation to find a power series representation for the function and determine the interval of convergence.

(4) $f(x) = \dfrac{1}{(1 + x)^2}$ (5) $f(x) = \dfrac{x^3}{(1 - 4x^2)^2}$ (6) $f(x) = \dfrac{1}{(1 + x^4)^3}$

In (7)–(9), use integration to find a power series representation for the function and determine the radius of convergence.

(7) $f(x) = \ln(1 + x)$ (8) $f(x) = \ln\Big( \dfrac{1 + x^2}{1 - x^2} \Big)$ (9) $f(x) = \tan^{-1}(3x)$

In (10)–(12), find a power series representation for the indefinite integral and determine the radius of convergence.

(10) $\displaystyle\int \frac{\ln(1 - x)}{x}\,dx$ (11) $\displaystyle\int \frac{e^x - 1 - x}{x^2}\,dx$ (12) $\displaystyle\int \tan^{-1}(x^2)\,dx$

(13) Find a power series representation for $\sin x$ and $\cos x$ using the differential equation $f'' + f = 0$. Determine the interval of convergence.
(14) Show that the Bessel function of order 0 defined in Example 8.34 satisfies the differential equation:

$$x^2 J_0''(x) + x J_0'(x) + x^2 J_0(x) = 0.$$

In (15)–(17), use differentiation or integration to find the sum of the series.

(15) $\displaystyle\sum_{n=1}^{\infty} n x^{n-1}$ (16) $\displaystyle\sum_{n=2}^{\infty} n(n - 1) x^{n-2}$ (17) $\displaystyle\sum_{n=0}^{\infty} \frac{x^{n+1}}{n + 1}$

In (18)–(20), how many terms does one need in a power series approximation to evaluate the integral with the absolute error not exceeding $10^{-6}$?

(18) $\displaystyle\int_0^1 \frac{dx}{1 + x^8}$ (19) $\displaystyle\int_0^1 \frac{e^{-x} - 1}{x}\,dx$ (20) $\displaystyle\int_0^{1/2} \ln(1 + x^4)\,dx$

(21) Find the radius of convergence of the hypergeometric series:

$$1 + \frac{ab}{1!\,c}\,x + \frac{a(a + 1)b(b + 1)}{2!\,c(c + 1)}\,x^2 + \frac{a(a + 1)(a + 2)b(b + 1)(b + 2)}{3!\,c(c + 1)(c + 2)}\,x^3 + \cdots ,$$

where $a$, $b$, and $c$ are reals. Use De Morgan's test to determine the interval of convergence.

59. Taylor Series

59.1. Real Analytic Functions. Suppose a function f is represented by a power series (R > 0):

\[ f(x) = c_0 + c_1(x-a) + c_2(x-a)^2 + \cdots = \sum_{n=0}^{\infty} c_n (x-a)^n, \qquad |x-a| < R. \]

By Theorem 8.27, its derivatives $f^{(k)}(x)$ can be obtained by term-by-term differentiation of the series, and the resulting series has the same radius of convergence R. Evidently, $f(a) = c_0$. What is the significance of the other coefficients $c_n$? The derivative $f'$ is given by

\[ f'(x) = c_1 + 2c_2(x-a) + 3c_3(x-a)^2 + \cdots + k c_k (x-a)^{k-1} + \cdots, \]

which shows that $f'(a) = c_1$. The second derivative is

\[ f''(x) = 2c_2 + 3 \cdot 2\, c_3 (x-a) + \cdots + k(k-1) c_k (x-a)^{k-2} + \cdots. \]

Therefore, $f''(a) = 2c_2$. After k such steps,

\[ f^{(k)}(x) = k(k-1)\cdots 2 \cdot 1\, c_k + (k+1)k(k-1)\cdots 2\, c_{k+1}(x-a) + \cdots, \]

and hence $f^{(k)}(a) = k!\,c_k$ or $c_k = f^{(k)}(a)/k!$. This proves the following theorem.

Theorem 8.28 (Significance of Power Series Coefficients). If f has a power series representation

\[ f(x) = \sum_{n=0}^{\infty} c_n (x-a)^n, \qquad |x-a| < R, \]

for some a and R > 0, then its coefficients are

\[ c_n = \frac{f^{(n)}(a)}{n!}. \]
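The relation $c_n = f^{(n)}(a)/n!$ can be illustrated on a polynomial, where term-by-term differentiation is exact. The sketch below takes a = 0 and uses exact rational arithmetic; the helper names `differentiate` and `kth_derivative_at_zero` and the sample coefficients are illustrative assumptions, not part of the text.

```python
from fractions import Fraction
from math import factorial

def differentiate(coeffs):
    """Term-by-term derivative of sum c_n x^n, given as the list [c_0, c_1, ...]."""
    return [n * c for n, c in enumerate(coeffs)][1:]

def kth_derivative_at_zero(coeffs, k):
    """Differentiate k times, then evaluate at x = 0 (the constant term)."""
    for _ in range(k):
        coeffs = differentiate(coeffs)
    return coeffs[0] if coeffs else Fraction(0)

# Sample coefficients c_0, ..., c_3 (chosen arbitrarily for the illustration).
c = [Fraction(3), Fraction(-1, 2), Fraction(5), Fraction(7, 3)]

# Recover each c_k as f^(k)(0) / k!.
recovered = [kth_derivative_at_zero(c, k) / factorial(k) for k in range(len(c))]
print(recovered == c)
```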

Definition 8.12 (Real Analytic Functions). A function f on an open interval I is said to be analytic if, for any a ∈ I, it has a power series representation $f(x) = \sum c_n (x-a)^n$ that converges in some open interval (a − δ, a + δ) ⊂ I, where δ > 0.


The class of analytic functions plays a significant role in applications. Their properties are discussed next.

Theorem 8.29 (Power Series Representation of Analytic Functions). A function f that is analytic on an open interval I has the power series representation

\[ f(x) = \sum_{n=0}^{\infty} \frac{f^{(n)}(a)}{n!}\,(x-a)^n \tag{8.20} \]

for any a ∈ I that converges in an open subinterval of I that includes a.

This theorem follows from Definition 8.12 and Theorem 8.28. In Example 8.37 it was found that

\[ \frac{1}{x} = \sum_{n=0}^{\infty} \frac{(-1)^n}{a^{n+1}}\,(x-a)^n, \qquad x \in (0, 2a). \tag{8.21} \]

This shows that the function f(x) = 1/x is analytic for all x > 0 because a can be any positive number; that is, the function has a power series representation that converges in an open subinterval of (0, ∞) containing any a > 0. Similarly, the analyticity of f(x) = 1/x can be established for all x < 0.

It is important to emphasize that a power series for an analytic function does not necessarily converge on the entire domain of the function. But an analytic function can always be represented by a convergent power series in a neighborhood of every point of its domain. Equation (8.21) illustrates the point.
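The behavior of (8.21) inside and outside the interval (0, 2a) can be observed numerically. This is a small sketch with a = 2 chosen arbitrarily; the function name `reciprocal_partial_sum` is an assumption made for the illustration.

```python
def reciprocal_partial_sum(x: float, a: float, terms: int) -> float:
    """Partial sum of sum_{n>=0} (-1)^n (x - a)^n / a^(n+1), the series in (8.21)."""
    return sum((-1) ** n * (x - a) ** n / a ** (n + 1) for n in range(terms))

a = 2.0
inside = reciprocal_partial_sum(3.0, a, 200)   # x = 3 lies in (0, 2a) = (0, 4)
outside = reciprocal_partial_sum(5.0, a, 200)  # x = 5 lies outside (0, 4)

print(inside, 1 / 3.0)
print(outside)
```

For x = 3 the partial sums settle near 1/3, while for x = 5 they grow without bound even though 1/x is perfectly well defined there. A series about a different center, say a = 4, would be needed to represent 1/x near x = 5.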

Theorem 8.30 (Properties of Analytic Functions).
(i) The sums and products of analytic functions are analytic.
(ii) The reciprocal 1/f of an analytic function f is analytic if f is nowhere zero.
(iii) The composition f(g(x)) of analytic functions f and g is analytic.
(iv) Analytic functions are differentiable infinitely many times.

A proof of properties (i)–(iii) is given in more advanced calculus courses. Property (iv) follows from Theorem 8.27. Its converse is not generally true; that is, there are functions that are differentiable infinitely many times at a point, but they cannot be represented by a power series that converges in an open interval that includes this point. As an example, consider the function

\[ f(x) = e^{-1/x^2} \ \text{ if } x \neq 0 \quad \text{and} \quad f(0) = 0. \]

The function is continuous at x = 0 because $\lim_{x\to 0} e^{-1/x^2} = \lim_{u\to\infty} e^{-u} = 0 = f(0)$. It is differentiable at x = 0 because

\[ f'(0) = \lim_{x\to 0} \frac{f(x) - f(0)}{x} = \lim_{x\to 0} \frac{e^{-1/x^2}}{x} = 0. \]

The first equality is the definition of $f'(0)$. The last limit is established by investigating the left and right limits $x \to 0^{\pm}$ with the help of the substitution $x = 1/u$, where $u \to \pm\infty$ as $x \to 0^{\pm}$; the left and right limits coincide because $u e^{-u^2} \to 0$ as $u \to \pm\infty$ (the exponential function decreases faster than any power function). In a similar fashion, it can be proved that $f^{(n)}(0) = 0$ for all n (see Exercise 59.5.24). Thus, f(x) has no power series representation $\sum c_n x^n$ in a neighborhood of x = 0 because, if it did, then, by Theorem 8.29, the function would have to be identically 0 in some interval (−δ, δ), δ > 0 (as $f^{(n)}(0) = 0$ for all n), which is not true ($f(x) \neq 0$ for all $x \neq 0$). Hence, the function is not analytic at x = 0.
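The extreme flatness of this function at the origin can be seen numerically. The sketch below evaluates the quotients $f(x)/x^n$ for a few small x; it is numerical evidence only, not a proof that $f^{(n)}(0) = 0$, and the sample points are arbitrary.

```python
import math

def f(x: float) -> float:
    """f(x) = exp(-1/x^2) for x != 0 and f(0) = 0."""
    return math.exp(-1.0 / (x * x)) if x != 0.0 else 0.0

# f(x) / x^n shrinks rapidly as x -> 0, for every fixed power n tried here.
for n in (1, 2, 5):
    print(n, [f(x) / x ** n for x in (0.5, 0.2, 0.1)])
```

Even after division by $x^5$, the values collapse toward zero, which is consistent with every Maclaurin coefficient of f vanishing while f itself is positive away from 0.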

59.2. Taylor and Maclaurin Series.

Definition 8.13 (Taylor and Maclaurin Series). The series in (8.20) is called the Taylor series of a function f at a (or about a, or centered at a). The special case of the Taylor series when a = 0 is called the Maclaurin series of a function f.

The Taylor series of the exponential function $e^x$ about x = 0 is given by (8.19). The series converges for all x; that is, its radius of convergence is R = ∞.

Trigonometric functions. Consider the Maclaurin series of f(x) = sin x. One has $f'(x) = (\sin x)' = \cos x$ and $f''(x) = (\cos x)' = -\sin x$. Hence,

\[ f^{(2n)}(x) = (-1)^n \sin x, \qquad f^{(2n+1)}(x) = (-1)^n \cos x, \]

and $f^{(2n)}(0) = 0$, $f^{(2n+1)}(0) = (-1)^n$. So

\[ \sin x = \sum_{n=0}^{\infty} \frac{(-1)^n x^{2n+1}}{(2n+1)!} = x - \frac{x^3}{3!} + \frac{x^5}{5!} - \cdots, \qquad R = \infty. \]

By the ratio test, $|c_{n+1}|/|c_n| = (2n+1)!/(2n+3)! = 1/[(2n+2)(2n+3)] \to 0 = \alpha$, and the radius of convergence is R = 1/α = ∞. The series converges on the entire real line.

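The Maclaurin series for f(x) = cos x is obtained by differentiating the series for sin x term by term:

\[ \cos x = (\sin x)' = \sum_{n=0}^{\infty} \frac{(-1)^n x^{2n}}{(2n)!} = 1 - \frac{x^2}{2!} + \frac{x^4}{4!} - \cdots, \qquad R = \infty. \]

By Theorem 8.27 it also converges on the entire real line.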
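The partial sums of these two Maclaurin series can be compared with library values. The following sketch uses the alternating signs $+, -, +, \ldots$ starting with the leading terms x and 1; the function names are assumptions made for the illustration.

```python
import math

def sin_maclaurin(x: float, terms: int) -> float:
    """Partial sum x - x^3/3! + x^5/5! - ... with the given number of terms."""
    return sum((-1) ** n * x ** (2 * n + 1) / math.factorial(2 * n + 1)
               for n in range(terms))

def cos_maclaurin(x: float, terms: int) -> float:
    """Partial sum 1 - x^2/2! + x^4/4! - ... with the given number of terms."""
    return sum((-1) ** n * x ** (2 * n) / math.factorial(2 * n)
               for n in range(terms))

for terms in (1, 2, 4, 10):
    print(terms, sin_maclaurin(1.0, terms), cos_maclaurin(1.0, terms))
print(math.sin(1.0), math.cos(1.0))
```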

Binomial series. Let $f(x) = (1+x)^p$, where p is any real number. Its derivatives are $f'(x) = p(1+x)^{p-1}$, $f''(x) = p(p-1)(1+x)^{p-2}$, and, in general,

\[ f^{(n)}(x) = p(p-1)\cdots(p-n+1)(1+x)^{p-n}. \]

The Maclaurin series for $(1+x)^p$ is called the binomial series. The traditional notation for its coefficients is

\[ c_n = \frac{f^{(n)}(0)}{n!} = \frac{p(p-1)\cdots(p-n+1)}{n!} = \binom{p}{n}. \]

These numbers are called the binomial coefficients. The binomial series and its radius of convergence are

\[ (1+x)^p = \sum_{n=0}^{\infty} \binom{p}{n} x^n = 1 + \frac{p}{1!}\,x + \frac{p(p-1)}{2!}\,x^2 + \cdots, \qquad R = 1. \]

The coefficients satisfy the recurrence relation $c_{n+1} = c_n (p-n)/(n+1)$. Therefore, by the ratio test, $|c_{n+1}|/|c_n| = |p-n|/(n+1) = |1 - p/n|/(1 + 1/n) \to 1 = \alpha$ as n → ∞. Hence, R = 1/α = 1.
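The recurrence $c_{n+1} = c_n (p-n)/(n+1)$ gives a convenient way to generate binomial coefficients for any real p. The sketch below is one possible implementation; the function names and the sample values p = 1/2, x = 0.2 are illustrative assumptions.

```python
def binomial_coefficients(p: float, count: int):
    """First `count` coefficients c_0, c_1, ... of (1 + x)^p via c_{n+1} = c_n (p - n)/(n + 1)."""
    coeffs = [1.0]
    for n in range(count - 1):
        coeffs.append(coeffs[-1] * (p - n) / (n + 1))
    return coeffs

def binomial_series(x: float, p: float, count: int) -> float:
    """Partial sum of the binomial series; meaningful for |x| < 1."""
    return sum(c * x ** n for n, c in enumerate(binomial_coefficients(p, count)))

print(binomial_coefficients(3, 5))             # series terminates for a nonnegative integer p
print(binomial_series(0.2, 0.5, 40), 1.2 ** 0.5)
```

For a nonnegative integer p the factor (p − n) eventually vanishes, so the series reduces to the usual finite binomial expansion.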

59.3. Taylor Series of Analytic Functions. Every analytic function in a neighborhood of any point is represented by the Taylor series about that point. If the Taylor series converges to the function on the entire real line, then the function is analytic everywhere. In particular, the exponential $e^x$ and trigonometric functions sin x and cos x are analytic everywhere. Moreover, the properties of analytic functions stated in Theorem 8.30 allow us to add, multiply, and compose Taylor series (on the common intervals of their convergence) just like ordinary sums to obtain the Taylor series representation of the sums, products, and compositions of analytic functions. These are extremely useful properties in applications.

Example 8.42. Find the first four terms of the Taylor series for the function $f(x) = \exp(\tan^{-1} x)$ about x = 0.

Solution: Calculation of the derivatives of such a function is rather tedious. Instead, note that $e^x$ and $\tan^{-1} x$ are both analytic in a neighborhood of x = 0. So the composition of their Taylor series (see (8.19) and Example 8.39) gives the sought-after Taylor series. Only monomials 1, x, $x^2$, and $x^3$ have to be retained when calculating the composition. This implies that it is sufficient to retain two leading terms in the Taylor series $\tan^{-1} x = x - x^3/3 + \cdots$ and four leading terms in the Taylor series (8.19) of the exponential function:

\begin{align*}
e^{\tan^{-1} x} &= 1 + \tan^{-1} x + \frac{1}{2}\left(\tan^{-1} x\right)^2 + \frac{1}{6}\left(\tan^{-1} x\right)^3 + \cdots \\
&= 1 + \left(x - \frac{x^3}{3} + \cdots\right) + \frac{1}{2}\left(x - \frac{x^3}{3} + \cdots\right)^2 + \frac{1}{6}\left(x - \frac{x^3}{3} + \cdots\right)^3 + \cdots \\
&= 1 + \left(x - \frac{x^3}{3}\right) + \frac{1}{2}x^2 + \frac{1}{6}x^3 + \cdots \\
&= 1 + x + \frac{1}{2}x^2 - \frac{1}{6}x^3 + \cdots.
\end{align*}
□
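The truncated composition in this example can be reproduced mechanically with coefficient lists and exact rational arithmetic. This is a sketch only: the helper `multiply`, the constant `ORDER`, and the list representation are assumptions made for the illustration.

```python
from fractions import Fraction

ORDER = 3  # keep only the monomials 1, x, x^2, x^3

def multiply(p, q):
    """Product of two coefficient lists, discarding powers above x^ORDER."""
    out = [Fraction(0)] * (ORDER + 1)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            if i + j <= ORDER:
                out[i + j] += a * b
    return out

# tan^{-1} x = x - x^3/3 + ...   (two leading terms)
u = [Fraction(0), Fraction(1), Fraction(0), Fraction(-1, 3)]
# e^u = 1 + u + u^2/2 + u^3/6 + ...   (four leading terms)
exp_coeffs = [Fraction(1), Fraction(1), Fraction(1, 2), Fraction(1, 6)]

result = [Fraction(0)] * (ORDER + 1)
power = [Fraction(1)] + [Fraction(0)] * ORDER  # u^0
for c in exp_coeffs:
    result = [r + c * t for r, t in zip(result, power)]
    power = multiply(power, u)  # next power of u, truncated

print(result)  # coefficients of 1, x, x^2, x^3
```

Truncating after every multiplication is safe here because u has no constant term, so higher powers of u cannot contribute to the retained monomials.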

59.4. Approximations by Taylor Polynomials. An analytic function f can be approximated by a finite sum of the Taylor series:

\[ f(x) \approx \sum_{k=0}^{n} \frac{f^{(k)}(a)}{k!}\,(x-a)^k = T_n(x). \]

The polynomial $T_n(x)$ is called the Taylor polynomial about a. The convergence of the Taylor series guarantees that the remainder converges to 0:

\[ R_n(x) = f(x) - T_n(x) \to 0 \quad \text{as } n \to \infty \text{ for all } |x-a| < R, \]

where R is the radius of convergence of the Taylor series. The accuracy of the Taylor polynomial approximation for a function is assessed in Taylor's theorem discussed in Calculus I. Here it is restated in a slightly different form.

Theorem 8.31 (Taylor's Theorem). Suppose a function f is analytic near a and let $T_n(x)$ be its Taylor polynomials about a. Then, for every n and any |x − a| < R, where R is the radius of convergence of the Taylor series for f about a, there exists a point ξ between a and x such that

\[ R_n(x) = f(x) - T_n(x) = \frac{f^{(n+1)}(\xi)}{(n+1)!}\,(x-a)^{n+1}. \]

Proof. Given a number x, |x − a| < R, let M be a number defined by

\[ f(x) = T_n(x) + M(x-a)^{n+1}. \]

Consider the function

\[ g(t) = f(t) - T_n(t) - M(t-a)^{n+1}, \quad \text{where } |t-a| < R. \]

Since the (n+1)th derivative of a polynomial of degree n vanishes, $g^{(n+1)}(t) = f^{(n+1)}(t) - (n+1)!\,M$. Hence the proof will be complete if one can show that $g^{(n+1)}(\xi) = 0$ for some ξ between x and a (the latter would imply that $M = f^{(n+1)}(\xi)/(n+1)!$). By the definition of Taylor polynomials, $f^{(k)}(a) = T_n^{(k)}(a)$ for k = 0, 1, ..., n, and hence $g(a) = g'(a) = \cdots = g^{(n)}(a) = 0$. The function g(t) is differentiable and g(x) = g(a) = 0 by the choice of M; therefore, by Rolle's theorem, there is a number $t_1$ between x and a such that $g'(t_1) = 0$. Similarly, the function $g'(t)$ is differentiable and $g'(t_1) = g'(a) = 0$; hence, there is a number $t_2$ between $t_1$ and a such that $g''(t_2) = 0$. After n + 1 steps of this procedure, one arrives at the conclusion that $g^{(n+1)}(t_{n+1}) = 0$ for some number $t_{n+1} = \xi$ between $t_n$ and a, that is, between x and a. □

Corollary 8.4 (Taylor's Inequality). If $|f^{(n+1)}(x)| \le M_n$ for |x − a| ≤ d < R, then the remainder of the Taylor series satisfies the inequality

\[ |R_n(x)| \le \frac{M_n}{(n+1)!}\,|x-a|^{n+1} \quad \text{for } |x-a| \le d < R. \]

Since ξ in Taylor's theorem lies between x and a, one has $|f^{(n+1)}(\xi)| \le M_n$ for |x − a| ≤ d, and the conclusion of the corollary follows. All derivatives of an analytic function are continuous and, hence, attain their maximal and minimal values on any closed interval |x − a| ≤ d. So one can take $M_n = \max |f^{(n+1)}(x)|$ on |x − a| ≤ d.

Example 8.43. Find an upper bound on the error of the Taylor polynomial approximation about x = 0 for the function f(x) = sin x.

Solution: The Maclaurin series for sin x contains only odd powers of x, and so do its Taylor polynomials. Since $f^{(2n+2)}(x) = (-1)^{n+1} \sin x$, one has $|f^{(2n+2)}(x)| = |\sin x| \le 1 = M_{2n+1}$ uniformly for all x and all n, and hence

\[ \left| \sin x - T_{2n+1}(x) \right| \le \frac{|x|^{2n+2}}{(2n+2)!}. \]
□

This example shows that, although the error converges to 0 for all x, for a fixed n the bound grows with increasing |x|. This implies that Taylor polynomials of higher degrees are needed to achieve the same accuracy for large |x| as for smaller |x| (see Fig. 8.9). To avoid using high-degree Taylor polynomials to approximate the function at large |x|, one can use Taylor polynomials about some a close to the range of x in which the approximation is needed.
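The bound of Example 8.43 can be compared with the actual error for a few values of x and n. The sketch below is illustrative; the function names and the sample points are assumptions.

```python
import math

def taylor_sin(x: float, n: int) -> float:
    """T_{2n+1}(x): Taylor polynomial of sin about 0 up to the power x^(2n+1)."""
    return sum((-1) ** k * x ** (2 * k + 1) / math.factorial(2 * k + 1)
               for k in range(n + 1))

def error_bound(x: float, n: int) -> float:
    """Right-hand side |x|^(2n+2) / (2n+2)! of Taylor's inequality for sin."""
    return abs(x) ** (2 * n + 2) / math.factorial(2 * n + 2)

for x in (0.5, 2.0, 6.0):
    for n in (1, 3, 6):
        actual = abs(math.sin(x) - taylor_sin(x, n))
        print(x, n, actual, error_bound(x, n))
```

For a fixed n, both the actual error and the bound grow quickly with |x|, which is the reason for re-centering the polynomial when x is far from 0.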


Figure 8.9. An illustration of an approximation of f(x) = sin x (the dashed red curve) by its Taylor polynomials at x = 0 (the solid blue curve). As n increases, $T_n(x)$ approaches f(x) = sin x. The approximation becomes better in a larger interval for a larger n in accordance with the analysis of Example 8.43.

59.5. Exercises. In (1)–(5), find the Maclaurin series for the function and the radius of convergence.

(1) $\ln(1+x)$   (2) $\tan x$   (3) $\sinh x$   (4) $\cosh x$   (5) $x^6 + 2x^5 - x^3 + x - 3$

In (6)–(9), find the Taylor series for the function about a and the radius of convergence.

(6) $\cos x$, $a = \pi$   (7) $1/\sqrt{x}$, $a = 4$

(8) $\sin x$, $a = \pi/2$   (9) $(1+x)^{2/3}$, $a = 7$

In (10)–(13), use Maclaurin series for basic functions to find the Maclaurin series for the function.

(10) x cos(x2/2) (11)x

3√

1 + x4(12)

x− sin x

x3 (13) x2 tan−1(x2)


In (14)–(17), use the products and composition of the Maclaurin series for basic functions to find the first three nonvanishing terms of the Maclaurin series for the function.

(14) $\sin(\pi \cos x)$   (15) $e^{\sin x}$   (16) $\tan^{-1} x \,\ln(1+x)$   (17) $\ln(\cos x)$

(18) Find the first five nonvanishing terms of the Maclaurin series for $f(x) = e^x/\cos x$.
Hint: Put $f(x) = c_0 + c_1 x + \cdots + c_4 x^4 + \cdots$, then use the product of the Maclaurin series to find the coefficients from $e^x = f(x) \cos x$.

In (19)–(21), find the degree of a Taylor polynomial to approximate the integrand so that the error of approximating the integral does not exceed $10^{-4}$.

(19) $\displaystyle\int_0^{1/2} \tan^{-1}(x^2)\,dx$   (20) $\displaystyle\int_0^1 \frac{e^x - 1}{x}\,dx$   (21) $\displaystyle\int_0^{1/2} (1 + x^4)^{1/4}\,dx$

In (21)–(23), find the sum of the series.

(21) $\displaystyle\sum_{n=0}^{\infty} \frac{(-1)^n \pi^{2n}}{6^{2n}(2n)!}$   (22) $\displaystyle\sum_{n=0}^{\infty} \frac{(-1)^n 3^n}{2^n n!}$   (23) $1 - \ln 2 + \dfrac{(\ln 2)^2}{2!} - \dfrac{(\ln 2)^3}{3!} + \cdots$

(24) (i) For the function $f(x) = e^{-1/x^2}$ if $x \neq 0$ and f(0) = 0, show that $f^{(n)}(0) = 0$ for all n and hence f cannot be represented as a power series near 0. (ii) Let $f(x) = e^{-1/x}$ if x > 0 and f(x) = 0 if x ≤ 0. Is this function analytic everywhere?

CHAPTER 9

Further Applications of Integration

60. Arc Length

60.1. The Length of a Curve. We have seen various applications of integration to the computation of the area of a domain and to the computation of the volume of a solid. It is perhaps more surprising that we can also use integration to compute the length of a curve between two given points. This may sound counterintuitive at first, since in the applications we have seen so far, integration was used to compute some parameter of an object that existed in a higher dimension than the function that was being integrated.

Let f be a function so that, on the interval [a, b], the derivative $f'$ of f exists and is a continuous function. We would like to know the length of the curve of f, starting at the point A = (a, f(a)) and ending at the point B = (b, f(b)).

Intuitively, we can imagine that we lay a rope over the graph of f between the two endpoints, mark A and B on the rope, then straighten that rope out, and measure the distance between them.

A more formal definition, which is useful in the actual computation of the length of the curve, is the following. Cut the interval [a, b] into n equal parts, using points $a = x_0 < x_1 < \cdots < x_n = b$. Let $P_i = (x_i, f(x_i)) = (x_i, y_i)$. Let $|P_{i-1}P_i|$ denote the length of the straight line segment from $P_{i-1}$ to $P_i$. Then the sum

\[ K_n = \sum_{i=1}^{n} |P_{i-1}P_i| \tag{9.1} \]

does not exceed the length of the curve, since the points $P_{i-1}$ and $P_i$ are on the curve and the straight line is the shortest path between them.

If we keep refining the subdivision of the interval [a, b] by having n go to infinity, then it can be proved that $\lim_{n\to\infty} K_n$ exists. We define that limit to be the length of the curve of f from A to B. See Figure 9.1 for an illustration. Note that in the case when the graph of f is a straight line segment between A and B, this definition is just the length of that segment, so our definition extends our previous notion of length.


Figure 9.1. Arc length as a limit.

Let us now return to (9.1) in order to compute $\lim_{n\to\infty} K_n$. Let $\Delta x = (b-a)/n = x_i - x_{i-1}$. Note that then

\begin{align*}
|P_{i-1}P_i| &= \sqrt{(x_i - x_{i-1})^2 + (y_i - y_{i-1})^2} \\
&= \sqrt{(\Delta x)^2 + (\Delta x)^2\,\frac{(y_i - y_{i-1})^2}{(x_i - x_{i-1})^2}} \\
&= \Delta x \sqrt{1 + \frac{(y_i - y_{i-1})^2}{(x_i - x_{i-1})^2}}.
\end{align*}

Now observe that since f is differentiable on [a, b], the mean value theorem implies that there is a real number $x_i^* \in [x_{i-1}, x_i]$ such that $\dfrac{y_i - y_{i-1}}{x_i - x_{i-1}} = f'(x_i^*)$. Hence, the previous chain of equalities yields

\[ |P_{i-1}P_i| = \Delta x \sqrt{1 + f'(x_i^*)^2}. \]

Summing over all i, we get

\[ K_n = \sum_{i=1}^{n} \Delta x \sqrt{1 + f'(x_i^*)^2}. \]

As n goes to infinity, the left-hand side, by definition, converges to the length of the curve of f between A and B, while the right-hand side, being a Riemann sum of the continuous function $\sqrt{1 + f'(x)^2}$, converges to $\int_a^b \sqrt{1 + f'(x)^2}\,dx$. Hence, we have proved the following theorem.

Theorem 9.1. If $f'$ is a continuous function on the interval [a, b], then the length of the graph of f(x) from the point (a, f(a)) to the point (b, f(b)) is equal to

\[ L = \int_a^b \sqrt{1 + f'(x)^2}\,dx. \]

Figure 9.2. The curve of $f(x) = \frac{2}{3}x^{3/2}$.

Example 9.1. Find the length of the curve of $f(x) = \frac{2}{3}x^{3/2}$ from (0, 0) to (1, 2/3). See Figure 9.2 for an illustration.

Solution: We have $f'(x) = \sqrt{x}$, so $f'$ is a continuous function on [0, 1], and therefore Theorem 9.1 applies. Using that theorem, we obtain

\begin{align*}
L &= \int_0^1 \sqrt{1 + (\sqrt{x})^2}\,dx \\
&= \int_0^1 \sqrt{1 + x}\,dx \\
&= \left[\frac{2}{3}(1 + x)^{3/2}\right]_0^1 \\
&= \frac{4\sqrt{2}}{3} - \frac{2}{3}.
\end{align*}
□

Note that the result, approximately 1.219, is remarkably close to the length of the straight line that connects the two points in question, which is $\sqrt{13}/3 \approx 1.202$.
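The definition of arc length as the limit of the polygonal sums $K_n$ in (9.1) can be tried directly on the curve of Example 9.1. This is a minimal sketch; the function name `polygonal_length` and the chosen values of n are illustrative assumptions.

```python
import math

def polygonal_length(f, a: float, b: float, n: int) -> float:
    """K_n: total length of the n chords joining equally spaced points on the graph of f."""
    xs = [a + (b - a) * i / n for i in range(n + 1)]
    return sum(math.hypot(xs[i] - xs[i - 1], f(xs[i]) - f(xs[i - 1]))
               for i in range(1, n + 1))

def f(x: float) -> float:
    return (2.0 / 3.0) * x ** 1.5

exact = (4 * math.sqrt(2) - 2) / 3  # value given by Theorem 9.1
for n in (1, 10, 100, 1000):
    print(n, polygonal_length(f, 0.0, 1.0, n), exact)
```

With n = 1 the sum is just the chord length $\sqrt{13}/3$; as n grows, $K_n$ increases toward the value of the integral.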

We can use our new technique to verify a classic formula.

Example 9.2. Use Theorem 9.1 to compute the circumference of a circle of radius 1.

Solution: Let us place the center of the unit circle at the origin. Then the boundary of the circle is the set of points satisfying $x^2 + y^2 = 1$.


Figure 9.3. One quarter of the unit circle.

We want to use Theorem 9.1, so we need a part of the circle that satisfies the vertical line test (so that y is a function of x) and where the tangent line to the circle is never vertical (so that $f'(x)$ exists). For instance, we can choose the quarter of the circle that starts at the point $\left(-\frac{\sqrt{2}}{2}, \frac{\sqrt{2}}{2}\right)$ and ends at the point $\left(\frac{\sqrt{2}}{2}, \frac{\sqrt{2}}{2}\right)$. See Figure 9.3 for an illustration. On that part of the curve, $f(x) = \sqrt{1 - x^2}$ and $f'(x) = -\dfrac{x}{\sqrt{1 - x^2}}$ is continuous, so Theorem 9.1 implies

\begin{align*}
L &= \int_{-\sqrt{2}/2}^{\sqrt{2}/2} \sqrt{1 + f'(x)^2}\,dx \\
&= \int_{-\sqrt{2}/2}^{\sqrt{2}/2} \sqrt{1 + \frac{x^2}{1 - x^2}}\,dx \\
&= \int_{-\sqrt{2}/2}^{\sqrt{2}/2} \frac{1}{\sqrt{1 - x^2}}\,dx \\
&= \left[\sin^{-1} x\right]_{-\sqrt{2}/2}^{\sqrt{2}/2} \\
&= \pi/2.
\end{align*}

This implies that the circumference of the full circle is four times this much, that is, 2π. □


60.2. Remarks. Recall that in the first paragraph of this section, we discussed why it may seem counterintuitive that integration plays a role in the computation of arc lengths. Now we can see that the purported contradiction explained there is resolved by the fact that the integrand in Theorem 9.1 contains $f'$, not f.

Compared to other formulas we learned in our earlier studies of integration, it is relatively rare that the formula given by Theorem 9.1 can be explicitly computed, since $\int_a^b \sqrt{1 + f'(x)^2}\,dx$ is often difficult to handle. Therefore, we must often resort to approximate integration while computing arc lengths.
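As an illustration of approximate integration applied to arc length, here is a sketch based on the composite Simpson's rule. The choice of Simpson's rule, the function names, and the default n = 100 are assumptions made for this example; any other method of approximate integration could be substituted.

```python
import math

def simpson(g, a: float, b: float, n: int = 100) -> float:
    """Composite Simpson's rule with an even number n of subintervals."""
    if n % 2:
        raise ValueError("n must be even")
    h = (b - a) / n
    total = g(a) + g(b)
    total += 4 * sum(g(a + i * h) for i in range(1, n, 2))
    total += 2 * sum(g(a + i * h) for i in range(2, n, 2))
    return total * h / 3

def arc_length(df, a: float, b: float, n: int = 100) -> float:
    """Approximate the integral of sqrt(1 + f'(x)^2) over [a, b], given the derivative df."""
    return simpson(lambda x: math.sqrt(1.0 + df(x) ** 2), a, b, n)

# Sanity check on a curve with a known length: f'(x) = sqrt(x) on [0, 1].
check = arc_length(math.sqrt, 0.0, 1.0)
# A curve whose length is estimated numerically here: f(x) = x^3, f'(x) = 3x^2.
cubic = arc_length(lambda x: 3 * x * x, 0.0, 1.0)
print(check, (4 * math.sqrt(2) - 2) / 3, cubic)
```

Checking the routine first on a curve whose length is known in closed form is a cheap way to gain confidence before applying it to an integral that cannot be evaluated by hand.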

60.3. Exercises.

(1) Find the length of the curve $f(x) = x^2/2$ between the points given by x = 0 and x = 1.

(2) Find the length of the curve f(x) = x3+ 34x

between the pointsgiven by x = 2 and x = 4.

(3) Find the length of the curve $f(x) = \ln(\cos x)$ between the points given by x = 0 and x = π/4.

(4) Prove that Theorem 9.1 provides the correct value for the arc length of f when f is a linear function.

(5) Use a method of approximate integration to estimate the length of the curve of $f(x) = e^x$ from (0, 1) to (1, e).

(6) Use a method of approximate integration to estimate the length of f(x) = sin x from (0, 0) to (π, 0).

61. Surface Area

61.1. The Definition of Surface Area. In the last section, we defined the length of a curve, and deduced a formula for the computation of that length. Let us now take a curve, say of a function f(x) = y, where x ∈ [a, b] and $f'$ is continuous on [a, b]. Let us rotate this curve around the horizontal axis, as shown in Figure 9.4. What is the area of the obtained surface of revolution?

The definition of the area in question, and its computation, will be quite similar to what we have discussed in the previous section for the arc length.

Cut the interval [a, b] into n equal parts, using points

\[ a = x_0 < x_1 < \cdots < x_n = b. \]

Let $P_i = (x_i, f(x_i)) = (x_i, y_i)$ and let $l_i = |P_{i-1}P_i|$ denote the length of the straight line segment from $P_{i-1}$ to $P_i$. As we rotate the curve of f around the horizontal axis, the rotation of the segment $P_{i-1}P_i$


Figure 9.4. A surface obtained by rotating a curve.

Figure 9.5. Approximating a surface of revolution.

results in the lateral surface $S_{n,i}$ of a truncated cone with slant height $l_i$ and radii $y_{i-1}$ and $y_i$. See Figure 9.5 for an illustration. It is then not difficult to prove that the area of $S_{n,i}$ is equal to

\[ A(S_{n,i}) = \pi (y_{i-1} + y_i)\, l_i. \tag{9.2} \]

As n goes to infinity, the sum of the areas of the surfaces $S_{n,i}$ approximates what we intuitively think of as the area of the surface obtained by rotating the curve.


In fact, it can be proved that the limit

\[ A(S) = \lim_{n\to\infty} \sum_{i=1}^{n} A(S_{n,i}) = \lim_{n\to\infty} \sum_{i=1}^{n} \pi (y_{i-1} + y_i)\, l_i \tag{9.3} \]

exists. We define this limit to be the area of the surface of revolution obtained when the curve of f is rotated around the horizontal axis, with x ∈ [a, b].

In order to compute this surface area, recall from the last section that there exists a real number $x_i^* \in [x_{i-1}, x_i]$ such that $l_i = |P_{i-1}P_i| = \Delta x \sqrt{1 + f'(x_i^*)^2}$. Also note that, since f is continuous, small changes in x lead to small changes in f(x) = y, so if n is large enough, then $f(x_i^*) \approx f(x_i) = y_i$ and $f(x_i^*) \approx f(x_{i-1}) = y_{i-1}$. Therefore, (9.3) implies

\[ A(S) = \lim_{n\to\infty} \sum_{i=1}^{n} 2\pi f(x_i^*)\, \Delta x \sqrt{1 + f'(x_i^*)^2}, \tag{9.4} \]

where $\Delta x = (b-a)/n$. Now notice that the last expression obtained for A(S) is the limit of a Riemann sum, that is, an integral (of the function $2\pi f(x)\sqrt{1 + f'(x)^2}$). This means that we proved the following theorem.

Theorem 9.2. Let f be a function such that $f'$ is continuous on the interval [a, b]. Let S be the surface obtained by rotating the curve y = f(x), where x ∈ [a, b], around the horizontal axis.

Then the area of S is

\[ A(S) = \int_a^b 2\pi f(x) \sqrt{1 + f'(x)^2}\,dx. \]

Example 9.3. Compute the surface area of a sphere of radius r.

Solution: Such a sphere can be obtained by rotating the semicircle given by the equation $f(x) = \sqrt{r^2 - x^2}$ around the horizontal axis. See Figure 9.6 for an illustration.


Figure 9.6. A sphere as a surface of revolution.

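Theorem 9.2 then yields

\begin{align*}
A(S) &= \int_{-r}^{r} 2\pi \sqrt{r^2 - x^2} \cdot \sqrt{1 + \frac{x^2}{r^2 - x^2}}\,dx \\
&= \int_{-r}^{r} 2\pi \sqrt{r^2 - x^2} \cdot \sqrt{\frac{r^2}{r^2 - x^2}}\,dx \\
&= \int_{-r}^{r} 2r\pi\,dx \\
&= \left[2r\pi x\right]_{-r}^{r} \\
&= 4r^2\pi.
\end{align*}
□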
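The definition of surface area as a limit of sums of truncated-cone areas, (9.2) and (9.3), can be tested on the sphere. The sketch below sums $\pi(y_{i-1} + y_i)\,l_i$ for the semicircle; the function name `frustum_area_sum`, the radius r = 2, and n = 1000 are illustrative assumptions.

```python
import math

def frustum_area_sum(f, a: float, b: float, n: int) -> float:
    """Sum over i of pi * (y_{i-1} + y_i) * l_i for n equal subintervals of [a, b]."""
    xs = [a + (b - a) * i / n for i in range(n + 1)]
    ys = [f(x) for x in xs]
    return sum(
        math.pi * (ys[i - 1] + ys[i])
        * math.hypot(xs[i] - xs[i - 1], ys[i] - ys[i - 1])
        for i in range(1, n + 1)
    )

r = 2.0

def semicircle(x: float) -> float:
    # max(..., 0.0) guards against tiny negative values from rounding at x = +-r.
    return math.sqrt(max(r * r - x * x, 0.0))

approx = frustum_area_sum(semicircle, -r, r, 1000)
print(approx, 4 * math.pi * r * r)
```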

61.2. Variations. If we rotate our curve around the vertical axis instead of the horizontal axis, then most of the previous argument remains valid. The only difference is that when the point $P_i = (x_i, y_i)$ is rotated, it is rotated in a circle of radius $x_i$, not $y_i$. This leads to the following theorem.

Theorem 9.3. Let f be a function such that $f'$ is continuous on the interval [a, b]. Let S be the surface obtained by rotating the curve y = f(x), where x ∈ [a, b], around the vertical axis.

Then the area of S is

\[ A(S) = \int_a^b 2\pi x \sqrt{1 + f'(x)^2}\,dx. \]

Note that the f(x) term in the integrand of Theorem 9.2 is replaced by x.


Example 9.4. Rotate the curve given by $y = f(x) = x^2/2$, with x ∈ [0, 1], around the vertical axis. Find the area of the obtained surface.

Figure 9.7. The curve of $y = x^2/2$ and the surface obtained by its rotation.

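Solution: Since $f'(x) = x$, Theorem 9.3 implies

\begin{align*}
A(S) &= \int_0^1 2\pi x \sqrt{1 + x^2}\,dx \\
&= 2\pi \left[\frac{1}{3}(x^2 + 1)^{3/2}\right]_0^1 \\
&= 2\pi \cdot \frac{2\sqrt{2} - 1}{3} \\
&\approx 3.8294.
\end{align*}
□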
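The value of this integral can be cross-checked with a simple midpoint Riemann sum. This is a sketch; the function name and the number of subintervals are assumptions made for the illustration.

```python
import math

def surface_area_about_vertical_axis(df, a: float, b: float, n: int = 10000) -> float:
    """Midpoint Riemann sum for the integral of 2*pi*x*sqrt(1 + f'(x)^2) over [a, b]."""
    dx = (b - a) / n
    total = 0.0
    for i in range(n):
        x = a + (i + 0.5) * dx  # midpoint of the i-th subinterval
        total += 2 * math.pi * x * math.sqrt(1.0 + df(x) ** 2) * dx
    return total

numeric = surface_area_about_vertical_axis(lambda x: x, 0.0, 1.0)  # f(x) = x^2/2
closed_form = 2 * math.pi * (2 * math.sqrt(2) - 1) / 3
print(numeric, closed_form)
```

Note that the factor $(2\sqrt{2} - 1)/3 \approx 0.6095$ still has to be multiplied by 2π to obtain the area.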

Note that another way of writing the result of Theorem 9.2 is

\[ A(S) = \int_a^b 2\pi f(x) \sqrt{1 + \left(\frac{dy}{dx}\right)^2}\,dx. \]

By interchanging the roles of x and y, this implies that, for curves given by an equation g(y) = x, the following holds.

Theorem 9.4. Let g be a function such that $g'$ is continuous on the interval [a, b]. Let S be the surface obtained by rotating the curve x = g(y), where y ∈ [a, b], around the vertical axis.

Then the area of S is given by

\[ A(S) = \int_a^b 2\pi g(y) \sqrt{1 + \left(\frac{dx}{dy}\right)^2}\,dy. \]

While the surface area of a cone can be computed by elementary methods, it is elucidating to compute it with our new method and see that the result is what we expect it to be.

Example 9.5. Find the surface area of a right cone of base radius R and height h.

Solution: The base circle of the cone has area $R^2\pi$. In order to compute the lateral surface, note that the lateral surface can be obtained by rotating the line segment given by the equation $x = -\dfrac{Ry}{h} + R = g(y)$, where y ∈ [0, h], around the vertical axis. Therefore, Theorem 9.4 applies, and for the lateral surface area, it yields

\begin{align*}
A(S) &= \int_0^h 2\pi \cdot \left(-\frac{Ry}{h} + R\right) \sqrt{1 + \frac{R^2}{h^2}}\,dy \\
&= 2\pi R \sqrt{\frac{R^2 + h^2}{h^2}} \left[-\frac{y^2}{2h} + y\right]_0^h \\
&= \pi R \sqrt{R^2 + h^2} \\
&= \pi R s,
\end{align*}

where $s = \sqrt{R^2 + h^2}$ is the slant height of the cone.

So the total surface area of the cone is the sum of the area of its base and the area of its lateral surface, that is, $R^2\pi + \pi R s = \pi R(R + s)$. □

Theorem 9.4 also has a version that applies to curves given as functions of y that are rotated around the horizontal axis.

Theorem 9.5. Let g be a function such that $g'$ is continuous on the interval [a, b]. Let S be the surface obtained by rotating the curve x = g(y), where y ∈ [a, b], around the horizontal axis.

Then the area of S is given by

\[ A(S) = \int_a^b 2\pi y \sqrt{1 + \left(\frac{dx}{dy}\right)^2}\,dy. \]

Note that for the computation of some surface areas, we will have a choice of two theorems discussed in this section. The reader is encouraged to describe the curves whose rotations lead to such surfaces.


61.3. Exercises.

(1) Compute the surface area obtained by rotating the curve $y = e^x$, for x ∈ [0, 1], around the horizontal axis.

(2) Rotate the curve of $y = f(x) = \sqrt{x}$, where x ∈ [0, 1], around the vertical axis. Find the surface area.

(3) Rotate the curve of y = f(x) = tan x, where x ∈ [0, π/4], around the vertical axis. Find the surface area.

(4) Rotate the curve of $y = f(x) = x^3$, where x ∈ [0, 1], around the horizontal axis. Find the surface area.

(5) Rotate the curve of $x = g(y) = \frac{2}{3}(y + 1)^{3/2}$, where y ∈ [2, 3], around the horizontal axis. Find the surface area.

(6) Solve Example 9.5 using Theorem 9.3.

62. Applications to Physics and Engineering

62.1. Center of Mass.

62.1.1. One-Dimensional Systems. Let us assume that we have two objects of mass $m_1$ and $m_2$ placed on the line of real numbers, at points $x_1$ and $x_2$, respectively. We want to find the point $x_g$ such that if we place a fulcrum under the interval $[x_1, x_2]$ at $x_g$, the objects at the endpoints of the interval will balance. See Figure 9.8 for an illustration. We assume that the interval $[x_1, x_2]$, or the stick representing it, has negligible mass.

If $m_1 = m_2$, then we clearly have $x_g = (x_1 + x_2)/2$. Otherwise, we make use of the well-known fact of physics that the interval will balance if the moments on the two sides of the fulcrum are equal, that is, when

\[ m_1 (x_g - x_1) = m_2 (x_2 - x_g) \tag{9.5} \]

holds. Solving (9.5) for $x_g$, we get

\[ x_g = \frac{m_1 x_1 + m_2 x_2}{m_1 + m_2}. \tag{9.6} \]

Figure 9.8. Center of mass.


The point $x_g$ of the real line is called the center of mass or center of gravity of the system described above, that is, the system of an object of mass $m_1$ at $x_1$ and an object of mass $m_2$ at $x_2$. The moment of an object with respect to a point P is the mass of the object times the distance of the object from P. In particular, in the above system, the two objects have moments $m_1 x_1$ and $m_2 x_2$ with respect to the origin. So the total system has moment $m_1 x_1 + m_2 x_2$. Note that if we replace the two objects by a single object of mass $m_1 + m_2$ placed at $x_g$, then the moment of the system about the origin does not change. This is an important property that only the center of mass has, and therefore we repeat it. If we concentrate the total mass of the system at the center of mass, the moment of the system with respect to the origin will not change.

If we consider a system of k distinct objects of mass $m_1, m_2, \ldots, m_k$ placed at points $x_1, x_2, \ldots, x_k$ along the horizontal axis, then we can use an analogous argument to show that the center of mass of the system is at

\[ x_g = \frac{\sum_{i=1}^{k} m_i x_i}{\sum_{i=1}^{k} m_i}. \tag{9.7} \]

62.2. Two-Dimensional Systems.

62.2.1. Discrete Two-Dimensional Systems. Let us now consider the more general case when the k objects of mass $m_1, m_2, \ldots, m_k$ are placed at points $(x_1, y_1), (x_2, y_2), \ldots, (x_k, y_k)$ of the plane. We would like to find the center of mass $(x_g, y_g)$ of this system. In other words, we assume that a plate of negligible mass is placed under our system, and we want to find the point $(x_g, y_g)$ with the property that if we place a fulcrum under the plate at that point, the plate will balance.

Using methods similar to the one-dimensional case, it can be proved that the plate will balance if the fulcrum is placed at $(x_g, y_g)$ with

\[ x_g = \frac{\sum_{i=1}^{k} m_i x_i}{\sum_{i=1}^{k} m_i} \quad \text{and} \quad y_g = \frac{\sum_{i=1}^{k} m_i y_i}{\sum_{i=1}^{k} m_i}. \tag{9.8} \]

This corresponds to the intuitively appropriate concept that the plate will balance if it balances both "horizontally" and "vertically."

The sum $M_x = \sum_{i=1}^{k} m_i y_i$ is called the moment of the system with respect to the x axis. This name is due to the fact that if we tried to balance the system on the x axis, the larger the number $M_x$, the more the weights of the system would rotate the plate. Similarly, the sum $M_y = \sum_{i=1}^{k} m_i x_i$ is called the moment of the system with respect to the y axis.
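Formula (9.8) translates directly into a few lines of code. The sketch below computes the moments and the center of mass of a small system; the function name `center_of_mass` and the sample masses and points are made up for the illustration.

```python
def center_of_mass(masses, points):
    """Return (x_g, y_g) for point masses m_i placed at points (x_i, y_i)."""
    total = sum(masses)
    moment_y = sum(m * x for m, (x, y) in zip(masses, points))  # M_y = sum of m_i x_i
    moment_x = sum(m * y for m, (x, y) in zip(masses, points))  # M_x = sum of m_i y_i
    return moment_y / total, moment_x / total

# Three hypothetical masses at the vertices of a right triangle.
xg, yg = center_of_mass([1.0, 2.0, 3.0], [(0.0, 0.0), (3.0, 0.0), (0.0, 2.0)])
print(xg, yg)
```

Note the crossing of names: the x-coordinate of the center of mass comes from the moment with respect to the y axis, and vice versa.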

62.2.2. Symmetry Lines. Now let us consider the continuous version of the problem. Let P be a plate and let us try to find the center of mass of P. (We no longer assume that the mass of the plate is negligible; in fact, that mass is the object of our study now.) Let us assume, for the rest of this chapter, that the mass of P is uniformly distributed over P. Let us also assume that the density of the material of which P is made is 1. That is, the mass of a unit square within P is 1.

Sometimes we can find the center of mass of P without computation. A symmetry line of P is a straight line t such that the image of P when reflected through t is P itself. That implies that the two parts into which t cuts P are congruent, and the plate balances on the line t. Consequently, the center of mass C of P must be on t, since if we concentrate the entire mass of P in C, it still has to balance on the line t.

The argument of the previous paragraph shows that the center of mass of P must be on every symmetry line of P. So if P has more than one symmetry line, then these symmetry lines must all intersect in one point, namely, in the center of mass of P. In this case, we obtain the center of mass of P as the intersection of any two symmetry lines of P. For example, we can find the center of mass of a circle, ellipse, rectangle, or rhombus in this way.

62.2.3. A Formula for Continuous Two-Dimensional Systems. Let us keep the conditions from the previous section and let us impose the new condition that P is a "domain under a curve"; that is, the borders of P are the vertical lines x = a and x = b, the horizontal axis, and the graph of the continuous function f(x) = y.

We would like to use formula (9.8) to find the approximate location of the center of mass of P. Let us cut the interval [a, b] into n equal parts, using the intermediate points $a = x_0 < x_1 < x_2 < \cdots < x_n = b$, and let $\Delta x = (b-a)/n$. The vertical lines $x = x_i$ cut P into n vertical stripes. Let $S_i$ be the ith such stripe. The area, and hence the mass, of $S_i$ is close to $\Delta x \cdot f(x_i^*)$, where $x_i^*$ is the midpoint $(x_{i-1} + x_i)/2$ of the interval $[x_{i-1}, x_i]$. So we are approximating $S_i$ by a rectangle $R_i$. Let us concentrate the entire mass of $R_i$ in the center of mass of $R_i$, that is, at $(x_i^*, f(x_i^*)/2)$.

Now we can compute the moment of the obtained system of n objects with respect to the x axis. Note that the mass of $R_i$ is equal to the area of $R_i$, that is, $\Delta x\, f(x_i^*)$. So we have

\[ M_x(n) = \sum_{i=1}^{n} m_i y_i = \sum_{i=1}^{n} \Delta x \cdot f(x_i^*) \cdot f(x_i^*)/2. \]

The right-hand side is a Riemann sum, so, as n goes to infinity, it will converge to the corresponding integral, while the left-hand side will converge to the moment $M_x$ of the original plate with respect to the x axis.

This yields

\[ M_x = \int_a^b \frac{1}{2} f(x)^2\,dx. \tag{9.9} \]

A similar argument, using the x-coordinates $x_i^*$ of the centers of mass of the same rectangles $R_i$ in place of their y-coordinates, shows that

\[ M_y = \int_a^b x f(x)\,dx. \tag{9.10} \]

Finally, now that the moments of P are known, it is straightforward to compute the coordinates of the center of mass of P. Indeed, the center of mass is the unique point $(x_g, y_g)$ with the property that if the entire mass $A = \int_a^b f(x)\,dx$ of P is placed in that point, then the moments of this one-object system are identical to the moments of P. In other words, $M_x = y_g A$ and $M_y = x_g A$. Therefore, $x_g = M_y/A$ and $y_g = M_x/A$, which means that formulas (9.9) and (9.10) imply the following theorem.

Theorem 9.6. Let f be a function such that f(x) ≥ 0 if x ∈ [a, b]. Let D be a domain whose borders are the vertical lines x = a and x = b, the horizontal axis, and the curve of the function f(x) = y. Let A(D) denote the area of D.

Let $x_g$ and $y_g$ be the coordinates of the center of mass of D. Then we have

\[ x_g = \frac{\int_a^b x f(x)\,dx}{A(D)} \quad \text{and} \quad y_g = \frac{\int_a^b \frac{1}{2}[f(x)]^2\,dx}{A(D)}. \]

Example 9.6. Find the coordinates of the center of mass of the quarter of the unit circle that is in the northeastern quadrant.

Note that if we asked the same question for the entire unit circle, the answer would obviously be that \(x_g = y_g = 0\), since the center of gravity of any domain must be on all symmetry lines. If we asked the same question for the half of the unit circle that is in the northern half-plane, then \(x_g = 0\) would clearly hold, since the vertical axis is a symmetry line of that semicircle.

Solution (of Example 9.6): Note that the domain D in question has a symmetry line, namely, the line determined by the equation x = y. So the center of gravity of D is on that line, that is, \(x_g = y_g\). Therefore, it suffices to compute one of \(x_g\) and \(y_g\). We have \(f(x) = \sqrt{1 - x^2} = y\), so \(y_g\) is somewhat easier to compute. Theorem 9.6 yields

\[ y_g = \frac{\int_0^1 \frac{1}{2}(1 - x^2)\,dx}{\pi/4} = \frac{2}{\pi} \cdot \left[x - \frac{x^3}{3}\right]_0^1 = \frac{2}{\pi} \cdot \frac{2}{3} = \frac{4}{3\pi}. \]

So the center of gravity of the quarter of the unit circle in the northeastern quadrant is at \(\left(\frac{4}{3\pi}, \frac{4}{3\pi}\right)\). See Figure 9.9 for an illustration. □

Note that 4/(3π) ≈ 0.424. This makes perfect sense since this shows that the center of gravity of D is closer to the horizontal axis (the bottom of the quarter circle) than to the y = 1 line (the top of the quarter circle). That is reasonable, since the bottom of D is wider than the top of D, so it constitutes a larger portion of the total weight of D than the top of D.

Figure 9.9. The center of mass of a quarter of the unit circle.

Example 9.7. Find the center of gravity of the domain D whose borders are the vertical lines x = 0 and x = 1, the horizontal axis, and the graph of the function \(f(x) = x^2 = y\).

Solution: The domain D in question does not have a symmetry line, so we must use Theorem 9.6 to compute both \(x_g\) and \(y_g\). We have

\[ x_g = \frac{\int_a^b x f(x)\,dx}{A(D)} = \frac{\int_0^1 x \cdot x^2\,dx}{\int_0^1 x^2\,dx} = \frac{[x^4/4]_0^1}{[x^3/3]_0^1} = \frac{3}{4} \]

and

\[ y_g = \frac{\int_a^b \frac{1}{2}[f(x)]^2\,dx}{A(D)} = \frac{\int_0^1 (x^4/2)\,dx}{1/3} = \frac{[x^5/10]_0^1}{1/3} = \frac{3}{10}. \]

So the center of gravity of D is at (0.75, 0.3). This agrees with our intuition, since the bottom of D is larger than its top, and the left-hand side of D is smaller than the right-hand side of D. See Figure 9.10 for an illustration. □
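The stripe argument that led to Theorem 9.6 can also be carried out numerically. The following is a minimal sketch, not part of the original text: the helper name centroid and the number n of stripes are choices made here for illustration. It approximates the center of mass with a midpoint Riemann sum and applies it to the domains of Examples 9.6 and 9.7.

```python
import math

def centroid(f, a, b, n=100_000):
    """Approximate the center of mass of the region under y = f(x) on [a, b]
    with a midpoint Riemann sum over n vertical stripes."""
    dx = (b - a) / n
    area = moment_x = moment_y = 0.0
    for i in range(n):
        x_mid = a + (i + 0.5) * dx         # midpoint x_i^* of the ith stripe
        mass = f(x_mid) * dx               # area (= mass) of the rectangle R_i
        area += mass
        moment_y += mass * x_mid           # contribution to M_y
        moment_x += mass * f(x_mid) / 2    # contribution to M_x
    return moment_y / area, moment_x / area

quarter_disk = centroid(lambda x: math.sqrt(1 - x * x), 0.0, 1.0)
parabola = centroid(lambda x: x * x, 0.0, 1.0)
print(quarter_disk)  # both coordinates should be close to 4/(3*pi)
print(parabola)      # should be close to (0.75, 0.3)
```

Increasing n should bring the printed values closer to the exact coordinates obtained from the integrals.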

62.3. Exercises.

(1) Find the center of mass of the unit semicircle that lies in the northern half-plane.

(2) Find the center of mass of the plate whose borders are the lines x = 0 and x = π/2, the graph of the function f(x) = sin x, and the horizontal axis.

(3) Find the center of mass of the plate whose borders are the vertical lines x = 1 and x = 2, the horizontal axis, and the graph of the function f(x) = ln x.

(4) Find the center of mass of the plate whose borders are the vertical lines x = 1 and x = 2, the horizontal axis, and the graph of the function \(f(x) = e^x\).

(5) Find the center of mass of the trapezoid whose vertices are at (0, 0), (15, 0), (1, 1), and (8, 1).

(6) An object consists of two squares. The first is the square with vertices (0, 0), (0, 2), (2, 0), and (2, 2), and the other is the square with vertices (0, 2), (1, 2), (0, 3), and (1, 3). The density of the material of the small square is twice the density of the material of the large square. Where is the center of mass of this object?

63. Applications to Economics and the Life Sciences

63.1. Consumer Surplus. Let us consider the problem of pricing some merchandise whose value is highly subjective; that is, it is worth more to some customers than to others. Examples of this could be tickets for various sporting events, airline tickets to vacation destinations, or popular books.

Let p(x) be the demand function of this commodity. That is, p(x) is the price that will result in selling x units of the commodity. Lower prices usually lead to higher sales; therefore, p(x) is usually a decreasing function, as illustrated in Figure 9.11.

The area under the graph of p represents the total revenue the company could possibly have, if it managed to charge each customer the maximum price that that customer is willing to pay. Indeed, if the highest amount anyone is willing to pay for one unit is \(p(x_1)\), and \(x_1\) customers are willing to pay that price, then the revenue coming from these most enthused customers is \(x_1 p(x_1)\), which is the area of the domain under the graph of p that is between the vertical lines x = 0 and \(x = x_1\). We could continue in this way, noting that if the second highest price that some customers are willing to pay is \(p(x_2)\), and there are \(x_2 - x_1\) people who are willing to pay this price (not including those who are willing to pay even \(p(x_1)\)), then the revenue from them will be \((x_2 - x_1)p(x_2)\). This is the area of the domain under the graph of p that is between the lines \(x = x_1\) and \(x = x_2\), and so on.

Figure 9.10. Center of mass of the area defined in Example 9.7.

Figure 9.11. The demand function p(x) of a commodity.

If the seller decides to set one fixed price p(z), then the seller will sell z items, for a total revenue of zp(z) (the area of the rectangle R bordered by the two coordinate axes and the lines x = z and y = p(z)). This means that the customers who would have paid an even higher price for these goods have saved money. Besides losing that potential revenue, the seller also loses revenue by not getting any purchases from customers who were willing to pay some amount, but not p(z), for one unit.

Let \(x_n\) be the number of items that the seller can sell at the lowest price at which the seller is still willing to sell these items. It is a direct consequence of the above discussion that the total amount saved by all customers who bought the item at p(z) dollars is the area under the curve of p but above the rectangle R, that is,

(9.11) \[ \sum_{i \,:\, x_i \le z} (p(x_i) - p(z))(x_i - x_{i-1}), \]

where \(x_0 = 0\) and the sum runs over the sales levels \(x_i\) that do not exceed z.

If the number n of prices at which various customers are willing to buy goes to infinity, then the Riemann sum in (9.11) approaches the definite integral

(9.12) \[ CS = \int_0^z (p(x) - p(z))\,dx. \]

In economics, CS is called the consumer surplus for the given commodity.

Similarly, the integral

\[ \int_z^{\infty} p(x)\,dx \]

is the amount of missed revenue, that is, the money the company could have received from buyers who found the product too expensive. Note that this is the area of the domain under the graph of p, but on the right of R.

Example 9.8. Tickets for a certain flight are normally priced at $300, and in an average month, 500 tickets are sold. Research shows that, for every $10 that the price is reduced, the number of tickets sold goes up by 20. Find the demand curve and compute the consumer surplus for these tickets if the price is set at $240.

Solution: If the airline wants to sell x tickets, then the price that the airline needs to charge is

\[ p(x) = 300 - 10 \cdot \frac{x - 500}{20} = 300 - \frac{x - 500}{2}. \]

Indeed, in order to sell x − 500 extra tickets, the airline needs to decrease its price by $10 for each pack of 20 extra tickets.

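If the price is set at $240, then the number z of tickets sold satisfies p(z) = 240, that is, 300 − (z − 500)/2 = 240, so z = 620. Formula (9.12) then shows that the consumer surplus is

\[ CS = \int_0^{620} (p(x) - 240)\,dx = \int_0^{620} \left(60 - \frac{x - 500}{2}\right) dx = \int_0^{620} \left(310 - \frac{x}{2}\right) dx = \left[310x - \frac{x^2}{4}\right]_0^{620} = 96{,}100. \]

So customers would save a total of $96,100 in an average month if the price of the tickets were set at $240. □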
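The computation of a consumer surplus can be checked numerically. The sketch below is an illustration only: the function names demand and consumer_surplus are introduced here, and it assumes that the surplus is taken over the units actually sold, that is, that the integral runs from 0 to the quantity X for which p(X) equals the fixed price.

```python
def demand(x):
    """Price needed to sell x tickets in Example 9.8: p(x) = 300 - (x - 500)/2."""
    return 300 - (x - 500) / 2

def consumer_surplus(p, quantity, n=100_000):
    """Midpoint Riemann sum for the integral of p(x) - p(quantity) over [0, quantity]."""
    dx = quantity / n
    price = p(quantity)
    return sum((p((i + 0.5) * dx) - price) * dx for i in range(n))

price = 240
quantity = 500 + 2 * (300 - price)   # solve p(x) = price for x
surplus = consumer_surplus(demand, quantity)
print(quantity, round(surplus, 2))
```

Since the demand function is linear, the midpoint sum agrees with the exact integral up to rounding.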


63.2. Survival and Renewal.

Example 9.9. Let us assume that there are currently 30,000 people in the United States who have a certain illness. Let us also assume that we know that the fraction of that population who will still have the illness t months from now is given by the function \(f(t) = e^{-0.05t}\). We also know that every month 1000 new patients will get the illness. How many people in the United States will have the illness in 20 months?

Solution: Clearly, a fraction \(f(20) = e^{-1} \approx 0.368\) of the people who currently have the illness will still have it 20 months from now. Now we have to compute the number of people who will get the illness between now (t = 0) and 20 months from now (t = 20) and will still be ill 20 months from now.

Subdivide the interval [0, 20] into n equal subintervals using the points

\[ 0 = t_0 < t_1 < \cdots < t_n = 20. \]

Set \(\Delta t = 20/n\). Then, for all i, there are \(1000 \cdot \Delta t\) people who will get the illness during the time period \([t_{i-1}, t_i)\). That means that 20 months from now, in other words, approximately \(20 - t_{i-1}\) months from getting the illness, the fraction of them who will still have the illness will be \(f(20 - t_{i-1})\). So their number will be about \(1000 \cdot \Delta t \cdot f(20 - t_{i-1})\). Summing over all allowed values of i, we get that the total number of people in the United States who will have the illness 20 months from now is

\[ 30{,}000 \cdot f(20) + \sum_{i=1}^{n} 1000 \cdot \Delta t \cdot f(20 - t_{i-1}). \]

We recognize that the above sum is a Riemann sum, so, as n goes to infinity, the above expression converges to

\[ D = 30{,}000 \cdot f(20) + \int_0^{20} 1000 f(20 - t)\,dt = 30{,}000 \cdot e^{-1} + 1000 \int_0^{20} e^{0.05t - 1}\,dt \approx 11{,}036.38 + 12{,}642.41 \approx 23{,}679. \]

So 20 months from now, about 23,679 people in the United States will still have the illness. □

Note that the result of the previous example shows that the number of people in the United States who have the illness will decrease during the next 20 months. Try to find an intuitive explanation for that fact that does not involve integration.

63.3. Exercises.

(1) A country currently has a population of 80 million people and a natural growth rate of 1.5 percent. The natural growth g of the population in a given year is computed as the difference between the number of births and the number of deaths in that year, while the natural growth rate for that year is g divided by the size of the population at the beginning of that year.

Let us assume that each year 1.1 million people emigrate from this country. If the current trends continue, how large will the population of this country be in 20 years?

(2) A country currently has a population of 80 million people and a natural growth rate of −0.5 percent. Let us assume that each year 0.35 million people immigrate to this country. If the current trends continue, how large will the population of this country be in 20 years?

(3) We deposit $100,000 into a bank account, where it will earn an annual interest of 5 percent. The interest is compounded continuously, so in t years, the original deposit will be worth f(t) = $100,000 · \(1.05^t\). Each year, we deposit $2000 to this same account in a continuous manner. What will our account balance be in 15 years?

(4) We deposit $100,000 into a bank account, where it will earn an annual interest of 5 percent. The interest is compounded continuously. Each year, we withdraw a total of $4000 in a continuous manner. What will our account balance be in 20 years?

(5) Tickets to a certain section of the arena for a basketball game usually cost $50. This results in the sales of 1000 tickets. For every dollar that the price is dropped, the number of tickets sold goes up 1 percent. Find the demand function for these tickets and compute the consumer surplus if the tickets are sold at $40.

(6) Let S(x) be the supply function for a certain commodity. That is, S(x) is the price that one unit of the commodity has to cost in order to attract enough sellers to provide x units for sale.

Note that S(x) is an increasing function, since a higher price is needed to attract more sellers.

Let us assume that the units are sold at a fixed price T = S(t). That means that the sellers who would be willing to sell at a lower price are making a profit. The total amount of the profit made by all sellers is sometimes called the producer surplus. Prove that the producer surplus for this commodity can be computed by the formula

\[ \int_0^t (T - S(x))\,dx. \]

64. Probability

The word “probability” is often used in informal conversations, even if it is sometimes not clear what the speaker means by that word. It turns out that there are two distinct concepts of probability. These two concepts complement each other in that they are applicable in different circumstances and use very different methods.

64.1. Discrete Probability. Let us say that we are tossing a fair coin four times. What is the probability that we will get at least three heads?

This is a situation in which the experiment that we study, that is, the sequence of four coin tosses, has only a finite number of outcomes. Indeed, there are \(2^4 = 16\) possible outcomes, since each coin toss has two outcomes (heads or tails), and the coin is tossed four times.

Among these 16 possible outcomes, five are favorable outcomes, namely, HHHH, HHHT, HHTH, HTHH, and THHH. Furthermore, each single outcome (favorable or not) is equally likely to occur, since the coin is fair, and the result of each coin toss is equally likely to be heads or tails.

In this situation, that is, when the number of all possible outcomes is finite, and each outcome is equally likely to occur, define

(9.13) \[ \text{Probability of event} = \frac{\text{Number of favorable outcomes}}{\text{Number of all outcomes}}, \]

which, in our example, shows that the probability of getting at least three heads is 5/16.
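For a small experiment such as this one, formula (9.13) can be applied by listing all outcomes. The following short sketch does so with Python's standard library; the variable names are chosen here for illustration.

```python
from fractions import Fraction
from itertools import product

outcomes = list(product("HT", repeat=4))               # all sequences of four tosses
favorable = [o for o in outcomes if o.count("H") >= 3]  # at least three heads
probability = Fraction(len(favorable), len(outcomes))
print(len(outcomes), len(favorable), probability)       # 16 5 5/16
```

Listing outcomes in this way quickly becomes impractical as the number of tosses grows, which is why counting techniques are needed in more complicated situations.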

Probabilities defined by formula (9.13) are called discrete probabilities. The formula is applicable only when the number of possible outcomes is finite. If we want to apply this formula in complicated situations, we need advanced techniques to count the number of all outcomes and the number of favorable outcomes. The fascinating discipline studying those techniques is called enumerative combinatorics, and will not be discussed in this book.

64.2. Continuous Probability. Let us say we want to know the probability that during the next calendar year the city of Gainesville, Florida, will have more than 40 inches of precipitation, or we want to know the probability that Gainesville will have less than 50 inches of precipitation, or that Gainesville will have at least 42 but at most 48 inches of precipitation.

In this case, formula (9.13) is not applicable, since both the number of favorable outcomes and the number of all outcomes are infinite. Indeed, the amount of precipitation in Gainesville next year can be any nonnegative real number. Furthermore, not all outcomes are equally likely. Receiving very little or very much precipitation is far less likely than receiving close to the usual amount. We need a totally different approach. Our approach, while different from the one in the previous section, shares some of the most important features of that approach. For instance, the probability of an event that is certain to happen will be 1, while the probability of an event that never happens will be 0.

Similar situations occur when we want to know the probability that a certain device will work for more than t years, or that a randomly selected person weighs more than p pounds but less than q pounds, or that the blood pressure of a randomly selected person is below a given value. The quantities mentioned here are called random variables.

We would like to define the probability F(a) that the amount X of precipitation in Gainesville next year will be at most a inches. This probability will sometimes be denoted by P(X ≤ a).

While we do not yet know how to compute F(a), we know that the function F has to satisfy the following requirements:

(1) We will have \(F(a_1) \le F(a_2)\) if \(a_1 < a_2\). In other words, F is nondecreasing. Indeed, if \(X \le a_1\), then \(X \le a_2\) since \(a_1 < a_2\).

(2) We will have \(\lim_{a \to \infty} F(a) = 1\), since the amount of precipitation is always finite.

(3) The function F(a) is a continuous function, since a little bit of change in a will only mean a little bit of change in F(a) = P(X ≤ a). We sometimes refer to this fact by saying that X is a continuous random variable.

Note that the function F is called the distribution of the random variable X.


Figure 9.12. The probability P (a ≤ X ≤ b) as an area.

It can be proved that if F has all these properties, then there exists a unique function f : R → R that has the following properties:

(a) For all a ∈ R, we have \(\int_{-\infty}^{a} f(x)\,dx = F(a) = P(X \le a)\).

(b) The equality \(\int_{-\infty}^{\infty} f(x)\,dx = 1\) holds.

(c) For all real numbers x, the inequality f(x) ≥ 0 holds.

If f is the unique function described by the three properties above, then f is called the probability density function, or simply density function, of the continuous random variable X.

Note that property (a) above implies that, for all real numbers a < b, the equality

(9.14) \[ P(a \le X \le b) = \int_a^b f(x)\,dx \]

holds. In other words, P(a ≤ X ≤ b) is equal to the area of the domain between the graph of the density function f, the horizontal axis, and the vertical lines x = a and x = b. See Figure 9.12 for an illustration.

Indeed, we have

\[ P(a \le X \le b) = P(X \le b) - P(X \le a) = F(b) - F(a) = \int_{-\infty}^{b} f(x)\,dx - \int_{-\infty}^{a} f(x)\,dx = \int_a^b f(x)\,dx. \]

Figure 9.13. The graph of the function f in Example 9.10.

Example 9.10. Let the continuous random variable X have density function

\[ f(x) = \begin{cases} 0 & \text{if } x < 0, \\ 6x(1 - x) & \text{if } 0 \le x \le 1, \\ 0 & \text{if } x > 1. \end{cases} \]

Verify that f is indeed a density function and compute the probability P(0.3 ≤ X ≤ 0.6).

Solution: The inequality f(x) ≥ 0 clearly holds for all x. So, in order to see that f is indeed a density function, we must verify that its definite integral, taken over the entire line of real numbers, is equal to 1. This is not difficult, since f(x) = 0 outside the interval [0, 1]. This leads to

\[ \int_{-\infty}^{\infty} f(x)\,dx = 6 \int_0^1 (x - x^2)\,dx = 6 \left[\frac{x^2}{2} - \frac{x^3}{3}\right]_0^1 = 6 \cdot \frac{1}{6} = 1. \]

So f is indeed a valid density function.

We can use formula (9.14) to compute the requested probability. We get

\[ P(0.3 \le X \le 0.6) = \int_{0.3}^{0.6} f(x)\,dx = 6 \int_{0.3}^{0.6} (x - x^2)\,dx = 6 \left[\frac{x^2}{2} - \frac{x^3}{3}\right]_{0.3}^{0.6} = 6(0.108 - 0.036) = 0.432. \]


Figure 9.14. λ = 1 (red), λ = 1.5 (blue), λ = 2 (orange).

□
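Both computations of Example 9.10 can be confirmed with a simple numerical integration. This is only a sketch: the helper integrate is a plain midpoint Riemann sum written here for illustration, not a library routine.

```python
def density(x):
    """Density from Example 9.10: 6x(1 - x) on [0, 1], zero elsewhere."""
    return 6 * x * (1 - x) if 0 <= x <= 1 else 0.0

def integrate(f, a, b, n=100_000):
    """Midpoint Riemann sum of f over [a, b] with n subintervals."""
    dx = (b - a) / n
    return sum(f(a + (i + 0.5) * dx) for i in range(n)) * dx

total = integrate(density, -1.0, 2.0)   # f vanishes outside [0, 1]
prob = integrate(density, 0.3, 0.6)     # P(0.3 <= X <= 0.6)
print(round(total, 4), round(prob, 4))
```

The first value checks property (b) of a density function, and the second is an application of formula (9.14).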

64.2.1. Exponential Distribution. Consider the following density function. Let λ be a positive real number and let

(9.15) \[ f(x) = \begin{cases} 0 & \text{if } x < 0, \\ \lambda e^{-\lambda x} & \text{if } 0 \le x. \end{cases} \]

We see that f is a decreasing function on the interval [0, ∞). Figure 9.14 shows how the speed at which f decreases depends on the parameter λ.

It turns out that this density function is a very frequently occurring one. Therefore, it has a name. It is called the exponential density function with parameter λ. Using the right constant λ and under the right circumstances, it can be used in many scenarios, typically connected to waiting times. For instance, it could be used to measure the probability that, given a starting moment, a given cell phone will ring in less than t minutes, or that, at a given location, it will start raining in less than h hours, or that, given a random store, a customer will enter in less than s seconds. The exponential density function will give a good approximation to compute these probabilities if the mentioned processes take place at a roughly constant rate. That is, we should choose a part of the day when that given cell phone receives calls at a roughly constant frequency, a season when it starts raining at that location at a roughly constant rate, or a time of day when customers enter that store at a roughly constant rate.

Example 9.11. The number of years before a certain kind of new refrigerator needs a major repair is a random variable whose density function is the exponential density function with parameter λ = 1/9. What is the probability that a new refrigerator will not need a major repair for 10 years?

Solution: First, we compute the probability that the refrigerator will need a major repair in 10 years. Let X denote the number of years passing before the first major repair is needed. Then that probability is

\[ P(X < 10) = \int_{-\infty}^{10} f(x)\,dx = \int_0^{10} \frac{1}{9} e^{-x/9}\,dx = \left[-e^{-x/9}\right]_0^{10} = 1 - e^{-10/9} \approx 0.671. \]

Therefore, the probability that the refrigerator will not need a major repair in 10 years is P(X ≥ 10) = 1 − 0.671 = 0.329. □
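Since the exponential density has the elementary antiderivative \(-e^{-\lambda x}\), probabilities of this kind are easy to evaluate directly. A minimal sketch follows; the function name exponential_cdf is chosen here for illustration.

```python
import math

def exponential_cdf(t, lam):
    """P(X <= t) for the exponential density with parameter lam, for t >= 0."""
    return 1 - math.exp(-lam * t)

lam = 1 / 9
p_repair = exponential_cdf(10, lam)                 # repair needed within 10 years
print(round(p_repair, 3), round(1 - p_repair, 3))   # 0.671 0.329
```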

64.2.2. Mean. If we want to compute the average weight of a person selected from a given population of n people, we can simply take the weights \(a_1, a_2, \ldots, a_n\) of those people and compute their arithmetic mean, or average, that is, the real number

\[ A = \frac{a_1 + a_2 + \cdots + a_n}{n}. \]

This could take a very long time if n is a very large number. If the data are given in a more organized form, we may be able to save some time. In particular, if we know that there are \(b_1\) people in the population whose weight is \(x_1\), there are \(b_2\) people whose weight is \(x_2\), and so on, then we can compute the average weight of the population as

(9.16) \[ A = \frac{b_1 x_1 + b_2 x_2 + \cdots + b_k x_k}{b_1 + b_2 + \cdots + b_k}, \]

since this fraction is the total weight of the population divided by the number of people in the population.

Now note that \(p_i = b_i/(b_1 + b_2 + \cdots + b_k)\) is just the probability that a randomly selected person of this population has weight \(x_i\). Therefore, (9.16) is equivalent to

(9.17) \[ p_1 x_1 + p_2 x_2 + \cdots + p_k x_k. \]

Theoretically, the weight of a person can take infinitely many values since the measuring scale can always be more precise. It is not difficult to prove that, as k goes to infinity, the sum in (9.17) will turn into a Riemann sum, and the weights \(x_i\) will be measured by a continuous random variable X, and the probabilities \(p_i\) will be expressible by the definite integrals of a density function. This leads to the following definition.

Definition 9.1. Let f be the density function of the continuous random variable X. Then the value of

\[ \mu(X) = \int_{-\infty}^{\infty} t f(t)\,dt \]

is called the average value or mean or expected value of X.

Example 9.12. Let X be the continuous random variable whose density function is the exponential density function with parameter λ that we defined in (9.15). Then µ(X) = 1/λ.

Solution: Using Definition 9.1, we have

\[ \mu(X) = \int_{-\infty}^{\infty} t f(t)\,dt = \int_0^{\infty} \lambda t e^{-\lambda t}\,dt = \left[-t e^{-\lambda t} - \frac{1}{\lambda} e^{-\lambda t}\right]_0^{\infty} = \frac{1}{\lambda}. \]

□

In other words, the parameter and the mean of an exponential distribution are reciprocals of each other. In view of this, we can reformulate the result of Example 9.11 as follows. If the average time before a new refrigerator needs a major repair is 9 years, then the probability that a refrigerator will not need a major repair for 10 years is 0.329.
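The reciprocal relation between the parameter and the mean can also be observed numerically. The sketch below approximates the integral of Definition 9.1 for the exponential density by a midpoint Riemann sum. Replacing the infinite upper limit by 50/λ is an assumption of this sketch (the integrand is negligible beyond that point), and the function name exponential_mean is chosen here for illustration.

```python
import math

def exponential_mean(lam, n=200_000):
    """Midpoint Riemann sum for the integral of t * lam * e^(-lam t) over [0, 50/lam].
    The upper limit 50/lam stands in for infinity (an assumption of this sketch)."""
    upper = 50 / lam
    dt = upper / n
    total = 0.0
    for i in range(n):
        t = (i + 0.5) * dt
        total += t * lam * math.exp(-lam * t)
    return total * dt

for lam in (0.5, 1 / 9, 2.0):
    print(lam, round(exponential_mean(lam), 4), round(1 / lam, 4))
```

In each printed row, the last two numbers should agree to the displayed precision.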

64.2.3. Normal Distribution. Let µ be a real number and let σ be a positive real number. Consider the density function

\[ f(x) = \frac{1}{\sigma\sqrt{2\pi}} \exp\left(-\frac{(x - \mu)^2}{2\sigma^2}\right). \]

The distribution defined by this density function is called the normal distribution with parameters µ and σ. This distribution is denoted by N(µ, σ). In particular, if µ = 0 and σ = 1, then the obtained distribution N(0, 1) is called the standard normal distribution.

Plotting the graph of f for various values of µ and σ, we see that the graph is a bell curve; its highest point is reached when x = µ, and it increases on the left of that and decreases on the right of that. The smaller the value of σ, the steeper is the rise and fall of the graph of f. See Figure 9.15 for an illustration.

Figure 9.15. σ = 1 (red), σ = 1.5 (orange), σ = 2 (green), and σ = 2.5 (blue).

It can be proved that µ is precisely the mean of N(µ, σ). The constant σ is called the standard deviation of N(µ, σ). It measures how spread out the values of our variable X are. (The precise definition is that σ is the square root of the mean of \((X - \mu)^2\).)

Many scenarios are modeled by a normal distribution, such as test scores, athletic results, or annual snowfall at a given location.

Example 9.13. In an average year, Northtown gets 10 feet of snow, with a standard deviation of 2 feet. What is the probability that, in a random year, Northtown gets between 9 and 12 feet of snow if snowfall is modeled by a normal distribution?

Solution: Let X denote the snowfall in a random year in Northtown. We need to find the probability P(9 ≤ X ≤ 12). As snowfall is modeled by a normal distribution, the given parameters imply that that must be the distribution N(10, 2). Therefore, by formula (9.14), we have

\[ P(9 \le X \le 12) = \int_9^{12} \frac{1}{2\sqrt{2\pi}} \exp\left(-\frac{(x - 10)^2}{8}\right) dx \approx 0.5328, \]

where the definite integral has to be computed by some approximation method (or a software package) since \(e^{-x^2}\) has no antiderivative among elementary functions. □
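As an illustration of the software route, the probability in Example 9.13 can be evaluated with the error function erf from Python's standard math module, using the standard identity P(X ≤ x) = (1 + erf((x − µ)/(σ√2)))/2 for the distribution N(µ, σ). The function name normal_cdf is chosen here and is not part of the text.

```python
import math

def normal_cdf(x, mu, sigma):
    """P(X <= x) for the normal distribution N(mu, sigma), via the error function."""
    return 0.5 * (1 + math.erf((x - mu) / (sigma * math.sqrt(2))))

# Probability that the snowfall is between 9 and 12 feet for N(10, 2)
p = normal_cdf(12, 10, 2) - normal_cdf(9, 10, 2)
print(round(p, 4))
```

The same function can be used for the exercises below that involve normal distributions.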


64.3. Exercises.

(1) For which value of c will \(f(x) = cx^4\) be a density function on [0, 1]?

(2) Let X be a random variable whose density function is 0 outside the interval [0, 1] and satisfies f(x) = 2x for x ∈ [0, 1]. Prove that f is indeed a density function, and compute the mean of X.

(3) Let us say that the lifetime of a bicycle tire (measured in months) has an exponential distribution with λ = −7. What is the probability that a tire will last between five and eight months?

(4) The average score on an exam is 100 points. In order to pass, a student cannot be more than 2 standard deviations below the average. If the scores have a normal distribution with a standard deviation of 6, how large a fraction of the students will pass the exam?

(5) Using the conditions of Example 9.13, what is the probability that Northtown will get less than 5 feet of snow in a given year?

(6) Let X be the random variable that counts the goals scored by an offensive soccer player of a certain elite league during an entire season. An offensive player is considered exceptional if the number of goals he scores exceeds the average of all offensive players by at least 3 standard deviations. Let us say that X has distribution N(33, 3). What percentage of offensive players is considered exceptional?

CHAPTER 10

Planar Curves

65. Parametric Curves

Every point in a plane can be defined as an ordered pair of real numbers (x, y) called the rectangular or Cartesian coordinates. A graph of a function f is the set of points in a plane whose coordinates satisfy the condition y = f(x). The graph gives a simple example of a planar curve. More generally, a planar curve can be defined as the set of points whose coordinates satisfy the condition F(x, y) = 0, called the Cartesian equation of a curve. In many instances, an equation F(x, y) = 0 has multiple solutions for every given x. For example, consider the circle of unit radius:

(10.1) \[ x^2 + y^2 = 1 \implies y = \pm\sqrt{1 - x^2}, \quad x \in [-1, 1]. \]

The two solutions represent two semicircles. The graph \(y = \sqrt{1 - x^2}\) is the semicircle above the x axis, while the graph \(y = -\sqrt{1 - x^2}\) is the semicircle below the x axis. The union of the two graphs is the full circle. This example shows a deficiency in describing planar curves by the graph of a function because the curves cannot always be represented as the graph of a single function.

On the other hand, (10.1) admits a different solution:

(10.2) \[ x^2 + y^2 = 1 \implies x = \cos t, \quad y = \sin t, \quad t \in [0, 2\pi], \]

which immediately follows from the trigonometric identity \(\cos^2 t + \sin^2 t = 1\) for all values of t.

This representation means that a point of the coordinate plane is assigned to every value of t ∈ [0, 2π] by the rule (x, y) = (cos t, sin t). The coordinates of points of the circle are functions of a third variable called a parameter. As t changes, the point (cos t, sin t) traces out the circle of unit radius centered at the origin in the plane. The parameter t has a simple geometrical interpretation. It is the angle counted counterclockwise from the positive x axis to a ray from the origin on which the point (cos t, sin t) lies. This observation admits a natural generalization.

Definition 10.1 (Parametric curves). Let x(t) and y(t) be continuous functions on [a, b]. A parametric curve in the coordinate plane is the set of points satisfying the conditions, called the parametric equations,

x = x(t) , y = y(t) , t ∈ [a, b].

The points (x(a), y(a)) and (x(b), y(b)) are called the initial and terminal points of the curve, respectively.

Figure 10.1. Circle: x(t) = cos t, y(t) = sin t.

Figure 10.2. Parametric curve. As t increases from a to b, the point (x(t), y(t)) traces out a curve in the xy plane.

The graph of a function f is a particular example of a parametric curve: x = t, y = f(t).

Figure 10.3. The spiral x = t cos t, y = t sin t, t ∈ [0, 2π]. The distance from the origin \(R = \sqrt{x^2 + y^2} = t\) increases linearly as the angle t, counted counterclockwise from the positive x axis, increases from 0 to 2π.

Parametric curves are common in everyday life. The position of a particle in a plane is defined by its rectangular coordinates (x, y) in the plane. When the particle moves, its coordinates become functions of time t so that the parametric curve x = x(t), y = y(t) is the trajectory of the particle. The particle moves in a specific direction along its trajectory. The particle may repeat its trajectory (or some portions of it) multiple times.

Example 10.1. Sketch the curve with the parametric equations x = t cos t, y = t sin t, t ∈ [0, 2π].

Solution: A basic approach to visualize the shape of a parametric curve is to plot points \((x(t_k), y(t_k))\), k = 1, 2, ..., n, corresponding to successive values of t: \(t_1 < t_2 < \cdots < t_n\). For n large enough, a fairly good picture of the curve emerges. This approach can be followed here, and the reader is advised to do so, for example, with \(t_k = 2\pi k/n\), k = 0, 1, ..., n. However, there is another way in part specific to this very problem. Note that \(x^2 + y^2 = t^2\) so that the equations may be written in the form of the parametric equations of the circle x = R cos t, y = R sin t, where the radius increases linearly with t, R = R(t) = t. The parameter t can be viewed as the angle between a ray from the origin and the positive x axis counted counterclockwise. Thus, the curve has the following interpretation. As the point (x(t), y(t)) rotates about the origin, the distance R between it and the origin increases linearly with the rotation angle. Such a motion occurs along an unwinding spiral. In the interval t ∈ [0, 2π], the spiral makes one full turn from the initial point (0, 0) to the terminal point (2π, 0). □
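The point-plotting approach described in the solution can be sketched in a few lines. The code below only tabulates the points for \(t_k = 2\pi k/n\) and checks that the distance from the origin equals t; the function name spiral_points is chosen here for illustration, and drawing the points is left to any plotting tool.

```python
import math

def spiral_points(n):
    """Triples (t_k, x_k, y_k) with x_k = t_k cos t_k, y_k = t_k sin t_k
    for t_k = 2*pi*k/n, k = 0, 1, ..., n."""
    ts = [2 * math.pi * k / n for k in range(n + 1)]
    return [(t, t * math.cos(t), t * math.sin(t)) for t in ts]

for t, x, y in spiral_points(8):
    # R = sqrt(x^2 + y^2) should coincide with the parameter t
    print(f"t = {t:.3f}  (x, y) = ({x:7.3f}, {y:7.3f})  R = {math.hypot(x, y):.3f}")
```

With a larger n, the tabulated points trace out one full turn of the spiral.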

65.1. Parametric Curves and Curves as Point Sets. If a curve is defined as a point set in the coordinate plane, for example, by the Cartesian equation F(x, y) = 0, then there are many parametric equations that describe it. For example, the circle (10.1) may also be described by the following parametric equations:

(10.3) \[ x^2 + y^2 = 1 \implies x = \cos(3\tau), \quad y = -\sin(3\tau), \quad \tau \in [0, 2\pi]. \]

What is the difference between (10.2) and (10.3)? First, note that, as the parameter t in (10.2) increases, the point (cos t, sin t) traces out the circle counterclockwise (the initial point (1, 0) moves upward as y = sin t > 0 for 0 < t < π/2). In contrast, the point (cos(3τ), − sin(3τ)) does so clockwise with increasing τ (the initial point (1, 0) moves downward as y = − sin(3τ) < 0 for 0 < τ < π/6). Second, as t ranges over the interval [0, 2π], the point (cos t, sin t) traces out the circle only once, while the point (cos(3τ), − sin(3τ)) winds about the origin three times because the period of the trigonometric functions involved is 2π/3, so the point returns to the initial point (τ = 0) three times, when τ = 2π/3, τ = 4π/3, and τ = 6π/3 = 2π. Third, there is a relation between the parameters t and τ: t = −3τ.

This example illustrates the main differences between curves defined as a point set by the Cartesian equation F(x, y) = 0 and parametric curves.

• A parametric curve C is oriented; that is, the point (x(t), y(t)) traces out C in a particular direction (from the initial to the terminal point).
• A parametric curve may repeat itself multiple times.
• Parametric equations describing the same point set in the plane differ by the choice of parameter; that is, if (x(t), y(t)) and (X(τ), Y(τ)) trace out the same point set C in the plane, then there is a function g(τ) such that X(τ) = x(g(τ)) and Y(τ) = y(g(τ)). The change of the parameter t = g(τ) is called a reparameterization of a curve C.

A good mechanical analogy is the motion of a particle. A parametric curve describes the actual motion, that is, how fast and in which direction the particle moves along its trajectory defined as a point set.

Example 10.2. Suppose a curve C is described by the parametricequations x = x(t), y = y(t) if t ∈ [a, b]. Find the parametric equationsof C such that the curve is traced out backward, that is, from the point(x(b), y(b)) to (x(a), y(a)) (the initial and terminal points are swapped).

Solution: One has to find a new parameter τ , t = g(τ), such thatg(b) = a and g(a) = b. When τ increases from a to b, the parametert decreases from b to a, and the sought-after parametric equations areobtained by the composition X(τ) = x(g(τ)) and Y (τ) = y(g(τ)). Thesimplest possibility is to look for a linear relation between t and τ ,g(τ) = c + dτ . The coefficients c and d are fixed by the conditionsg(a) = b or b = c + da and g(b) = a or a = c + db. Therefore, bysubtracting these equations, b− a = (c + da)− (c + db) = −(b− a)d ord = −1. By adding these equations with d = −1, b+a = (c−a)+(c−b)or c = a + b. Hence, t = (a + b)− τ , so that the parametric equationsof C with reversed orientation are

x = x(a + b− τ) , y = y(a + b− τ) , τ ∈ [a, b].

For example, if C is the circle oriented counterclockwise as in (10.2), then the same circle oriented clockwise is described by x = cos(2π − τ) = cos τ, y = sin(2π − τ) = −sin τ, τ ∈ [0, 2π]. □
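The reversal rule t = a + b − τ can be checked numerically. The following is a minimal sketch, not part of the original text; the helper name `reverse_orientation` is hypothetical and chosen only for illustration.

```python
import math

def reverse_orientation(x, y, a, b):
    """Return (X, Y) with X(tau) = x(a + b - tau), Y(tau) = y(a + b - tau)."""
    def X(tau):
        return x(a + b - tau)

    def Y(tau):
        return y(a + b - tau)

    return X, Y

# Unit circle traced counterclockwise for t in [0, 2*pi].
a, b = 0.0, 2.0 * math.pi
X, Y = reverse_orientation(math.cos, math.sin, a, b)

# The reversed curve should agree with (cos tau, -sin tau) up to rounding.
for tau in (0.0, 0.5, 1.0, 2.0, 5.0):
    print(tau, X(tau), Y(tau), math.cos(tau), -math.sin(tau))
```

The same helper applies to any pair of coordinate functions on [a, b], since only the substitution t = a + b − τ is used.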

65.2. The Cycloid. The curve traced by a fixed point on the circumference of a circle as the circle rolls along a straight line is called a cycloid (see Figure 10.4). To find its parametric equations, suppose that the circle has radius R and rolls along the x axis. Let the fixed point P on the circumference be initially at the origin so that the center of the circle is positioned at the point (0, R) (on the y axis). Let CP denote the straight line segment between the center of the circle C and P. Initially, CP is perpendicular to the x axis. As the circle rolls, the segment CP rotates about the center of the circle. Therefore, it is natural to choose the angle of rotation θ as a parameter. The coordinates of P are functions of θ to be found. If the circle rolls a distance D so that its center is at (D, R), then the arc length Rθ of the part of the circle between P and the touch point T has to be equal to D, that is, D = Rθ. Let Q be a point on the segment CT such that PQ and CT are perpendicular. Consider the right-angled triangle CPQ. Its hypotenuse CP has length |CP| = R, and the lengths of its catheti are |CQ| = |CP| cos θ = R cos θ and |PQ| = |CP| sin θ = R sin θ. Let (x, y) be the coordinates of the point P. The parametric equations of the cycloid are

x = D − |PQ| = Rθ − R sin θ = R(θ − sin θ),
y = R − |CQ| = R − R cos θ = R(1 − cos θ).

The cycloid looks like an upward arc over the interval 0 ≤ x ≤ 2πR, with maximal height ymax = 2R (attained at θ = π), and the arc repeats itself over the next interval of length 2πR (the circumference of the circle), and so on.

Figure 10.4. Definition of a cycloid. A disk of radius R is rolling along the x axis. A curve traced out by a fixed point on its edge is called a cycloid.

Figure 10.5. Overall shape of a cycloid.
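The shape of one arch can be checked by sampling the parametric equations. This is a small illustrative sketch; the function name `cycloid`, the radius R = 2, and the sample count are assumptions made for the example only.

```python
import math

def cycloid(theta, R=1.0):
    """Point of the cycloid for a circle of radius R rolled through angle theta."""
    return R * (theta - math.sin(theta)), R * (1.0 - math.cos(theta))

R = 2.0
n = 1000
thetas = [2.0 * math.pi * k / n for k in range(n + 1)]  # one arch, 0 <= theta <= 2*pi
points = [cycloid(th, R) for th in thetas]

# Locate the highest sampled point and the right end of the arch.
k_max = max(range(n + 1), key=lambda k: points[k][1])
y_max = points[k_max][1]
theta_at_max = thetas[k_max]
x_end = points[-1][0]

print(y_max, theta_at_max, x_end)
```

Under these assumptions the highest sample sits at height 2R for θ = π, and the arch ends at x = 2πR.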

Remark. In 1696, the Swiss mathematician Johann Bernoulli posed the brachistochrone problem: Find the curve along which a particle will slide (without friction) in the shortest time (under the influence of gravity) from a point A to a lower point B not directly beneath A. The particle will take the least time sliding from A to B if the curve is a part of an inverted arch of a cycloid.


65.3. Families of Curves. Different values of R define different cycloids. In general, if the parametric equations contain a numerical parameter, then the parametric equations define a family of curves; each family member corresponds to a particular value of the numerical parameter.

Example 10.3. Investigate the family of curves with the parametric equations x = a cos t, y = b sin(2t), t ∈ [0, 2π], where a and b are positive numbers.

Solution: Consider first the simplest case a = b = 1. The function x(t) = cos t has a period of 2π, and y(t) = sin(2t) has a period of π. The initial point is (x(0), y(0)) = (1, 0). As t increases, the point moves upward so that x(t) decreases (becomes less than 1), while y(t) increases, reaching its maximum value 1 at t = π/4. After that, y(t) begins to decrease, while x(t) continues to decrease. At t = π/2, the point arrives at the origin and passes through it into the third quadrant so that x(t) and y(t) continue to decrease. When t = 3π/4, y(t) attains the minimum value −1 and begins to increase for t > 3π/4, while x(t) = cos t is still decreasing toward its minimal value −1, which is reached at t = π, and the curve crosses the x axis moving into the second quadrant. In the second quadrant, π < t < 3π/2, x(t) increases toward 0, while y(t) first reaches its maximum value 1 at t = π + π/4 (the curve touches the horizontal line y = 1) and then decreases to 0. The curve passes the origin again at t = 3π/2 and moves into the fourth quadrant, where it again touches the horizontal line y = −1 at t = 3π/2 + π/4, and at t = 2π it arrives at the initial point. The shape of the curve resembles the infinity sign (∞) embedded into the square bounded by the lines y = ±1 and x = ±1 so that it touches each of the horizontal sides y = ±1 twice and each of the vertical sides x = ±1 once. If a and b are arbitrary, the transformation x → ax stretches (a > 1) or compresses (a < 1) any geometrical set horizontally in the coordinate plane. The transformation y → by does the same but in the vertical direction. So the family of curves consists of curves of the ∞ shape stretched to fit into the rectangle bounded by the lines x = ±a and y = ±b. □
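The bounding rectangle and the passage through the origin can be confirmed by sampling one member of the family. The sketch below is only illustrative; the values a = 3, b = 0.5 and the helper name `family_member` are assumptions, not taken from the text.

```python
import math

def family_member(t, a, b):
    """Point of the curve x = a cos t, y = b sin(2t)."""
    return a * math.cos(t), b * math.sin(2.0 * t)

a, b = 3.0, 0.5
n = 3600
pts = [family_member(2.0 * math.pi * k / n, a, b) for k in range(n + 1)]

# Largest horizontal and vertical excursions over one period.
x_extent = max(abs(x) for x, _ in pts)
y_extent = max(abs(y) for _, y in pts)

# The curve passes through the origin at t = pi/2 (and again at t = 3*pi/2).
origin_pass = family_member(math.pi / 2.0, a, b)

print(x_extent, y_extent, origin_pass)
```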

65.4. Exercises.
In (1)–(9), sketch the curve by plotting its points. Include an arrow showing the orientation of the curve. Eliminate the parameter to find a Cartesian equation of the curve.

(1) $x = 1 + 2t$, $y = 3 - t$
(2) $x = \sqrt{t}$, $y = 2 - t$
(3) $x = t^2$, $y = t^3$
(4) $x = 1 + 2e^t$, $y = 3 - e^t$
(5) $x = \cosh t$, $y = \sinh t$
(6) $x = 2\sin t$, $y = 3\cos t$
(7) $x = 2 - 3\cos t$, $y = -1 + \sin t$
(8) $x = \cos t$, $y = \sin(4t)$
(9) $x = t^2\sin t$, $y = t^2\cos t$

(10) The curves x = a sin(nt), y = b cos t, where n is a positive integer, are called Lissajous figures. Investigate how these curves depend on a, b, and n.
(11) Consider a disk of radius R. Let P be a point on the disk at a distance b from its center. Find the parametric equations of the curve traced out by the point P as the disk rolls along a straight line. The curve is called a trochoid. Are the equations well defined if b > R? Sketch the curve for b < R, b = R, and b > R.
(12) The swallowtail catastrophe curves are defined by the parametric equations $x = 2ct - 4t^3$, $y = -ct^2 + 3t^4$. Sketch these curves for a few values of c. What features do the curves have in common? How do they change when c increases?

66. Calculus with Parametric Curves

66.1. Tangent Line to a Parametric Curve. Consider a parametric curve x = x(t), y = y(t), where the functions x(t) and y(t) are continuously differentiable and the derivatives x′(t) and y′(t) do not vanish simultaneously for any t. Such parametric curves are called smooth.

Theorem 10.1 (Tangent Line to a Smooth Curve). A smooth parametric curve x = x(t), y = y(t) has a tangent line at any point (x0, y0), and its equation is

(10.4) x′(t0)(y − y0)− y′(t0)(x− x0) = 0,

where (x0, y0) = (x(t0), y(t0)).

Proof. Take a point of the curve (x0, y0) = (x(t0), y(t0)) corresponding to a particular value t = t0. Suppose that x′(t0) ≠ 0. Then, by the continuity of x′(t), there is a neighborhood Iδ = (t0 − δ, t0 + δ) for some δ > 0 such that x′(t) ≠ 0 for all t ∈ Iδ; that is, the derivative is either positive or negative in Iδ. By the inverse function theorem (studied in Calculus I), there is an inverse function t = f(x) that is differentiable in some open interval that contains x0. Substituting t = f(x) into the second parametric equation y = y(t), one obtains that near the point (x0, y0) the curve can be represented as a part of the graph y = F(x) such that y0 = F(x0). The function F is differentiable as the composition of two differentiable functions. The derivative F′(x0) determines the slope of the tangent to the graph, and the equation of the tangent line reads

(10.5) y = y0 + F ′(x0)(x− x0).

By construction,

y = F (x) =⇒ y(t) = F (x(t)) for all t ∈ Iδ.

Differentiation of this equation with respect to t by means of the chain rule yields

$$y'(t) = F'(x(t))\,x'(t) \implies F'(x(t)) = \frac{y'(t)}{x'(t)} \implies F'(x_0) = \frac{y'(t_0)}{x'(t_0)}.$$

Substituting this equation into (10.5), the latter can be written in the form (10.4). If x′(t0) = 0, then y′(t0) ≠ 0 by the definition of a smooth curve so that there is a differentiable inverse t = g(y) and hence x = G(y) = x(g(y)). Similar arguments lead to the conclusion that the tangent line to the graph x = G(y) has the form (10.4). The details are left to the reader as an exercise. □

The rule for calculating the slope of the tangent line can also be obtained by means of the concept of the differential. Recall that the differentials of two related quantities y = F(x) are proportional: dy = F′(x) dx. On the other hand, x = x(t), y = y(t) and therefore dx = x′(t) dt and dy = y′(t) dt. Hence,

$$F'(x) = \frac{dy}{dx} = \frac{dy/dt}{dx/dt} = \frac{y'(t)}{x'(t)}.$$

These manipulations with differentials are based on a tacit assumption that, for a smooth curve x = x(t), y = y(t), there exists a differentiable function F such that y = F(x). In the proof of the tangent line theorem, this has been shown to be true as a consequence of the inverse function theorem. The use of the differentials establishes the following helpful rules to calculate the derivatives:

$$\frac{d}{dx} = \frac{d/dt}{dx/dt} = \frac{1}{x'(t)}\frac{d}{dt} \quad\text{and}\quad \frac{d}{dy} = \frac{d/dt}{dy/dt} = \frac{1}{y'(t)}\frac{d}{dt}.$$

66.2. Concavity of a Parametric Curve. The concavity of a graph y = F(x) is determined by the sign of the second derivative F′′(x). If F′′(x) > 0, the graph is concave upward, and it is concave downward if F′′(x) < 0. If y(t) and x(t) are twice differentiable, then the concavity of the curve can be determined from

$$\frac{d^2y}{dx^2} = \frac{d}{dx}\frac{dy}{dx} = \frac{1}{x'}\frac{d}{dt}\left(\frac{dy}{dx}\right) = \frac{(y'/x')'}{x'} = \frac{y''x' - x''y'}{(x')^3}.$$

Example 10.4. A curve C is defined by the parametric equations $x = t^2$, $y = t^3 - 3t$.
(i) Show that C has two tangent lines at the point (3, 0).
(ii) Find the points on C where the tangent line is horizontal or vertical.
(iii) Determine where the curve is concave upward or downward.

Solution: (i) Note that $y(t) = t(t^2 - 3) = 0$ has three solutions, $t = 0$ and $t = \pm\sqrt{3}$. But the curve has only two points of intersection with the x axis, (0, 0) and (3, 0), because $x(\pm\sqrt{3}) = 3$; that is, the curve is self-intersecting at the point (3, 0). This explains why the curve may have two tangent lines. One has $x'(t) = 2t$ and $y'(t) = 3t^2 - 3$ so that $x'(\pm\sqrt{3}) = \pm 2\sqrt{3}$ and $y'(\pm\sqrt{3}) = 6$. So the slopes of the tangent lines are $(y'/x')(\pm\sqrt{3}) = \pm\sqrt{3}$, and the equations of the lines read

$$y = \sqrt{3}\,(x - 3) \quad\text{and}\quad y = -\sqrt{3}\,(x - 3).$$

(ii) The tangent line becomes horizontal when $y'(t) = 3t^2 - 3 = 0$ (see Eq. (10.4)). This happens when t = ±1. Thus, the tangent line is horizontal at the points (1, ±2). The tangent line is vertical if x′(t) = 2t = 0 or t = 0. So the tangent line is vertical at the origin (0, 0).
(iii) The second derivative is

$$\frac{d^2y}{dx^2} = \frac{1}{x'}\frac{d}{dt}\frac{dy}{dx} = \frac{1}{2t}\frac{d}{dt}\,\frac{3t^2 - 3}{2t} = \frac{3}{4t}\frac{d}{dt}\left(t - \frac{1}{t}\right) = \frac{3}{4t}\left(1 + \frac{1}{t^2}\right).$$

This equation shows that the curve is concave upward if t > 0 (the second derivative is positive) and concave downward if t < 0 (the second derivative is negative). □
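The slope rule y′/x′ and the second-derivative formula (y′′x′ − x′′y′)/(x′)³ can be evaluated directly for this example. The following sketch is an illustration only; the function names are hypothetical, and the derivatives are entered by hand rather than computed symbolically.

```python
import math

# Hand-computed derivatives of x = t^2, y = t^3 - 3t.
def xp(t):
    return 2.0 * t

def yp(t):
    return 3.0 * t * t - 3.0

def xpp(t):
    return 2.0

def ypp(t):
    return 6.0 * t

def slope(t):
    """dy/dx = y'(t) / x'(t), valid where x'(t) != 0."""
    return yp(t) / xp(t)

def second_derivative(t):
    """d^2y/dx^2 = (y'' x' - x'' y') / (x')^3, valid where x'(t) != 0."""
    return (ypp(t) * xp(t) - xpp(t) * yp(t)) / xp(t) ** 3

s3 = math.sqrt(3.0)
print(slope(s3), slope(-s3))          # slopes of the two tangents at (3, 0)
print(second_derivative(2.0), second_derivative(-2.0))
```

The sign of the second derivative follows the sign of t, in agreement with the closed form (3/(4t))(1 + 1/t²).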

66.3. Cusps of Planar Curves. Consider a curve defined by the Cartesian equation $x^2 - y^3 = 0$. This equation can be solved for y, $y = x^{2/3}$, such that $dy/dx = \tfrac{2}{3}x^{-1/3}$. For x > 0, the slope of the tangent line diverges, y′(x) → ∞ as x → 0+ (as x approaches 0 from the right). For x < 0, it also diverges, y′(x) → −∞ as x → 0− (as x approaches 0 from the left). The two branches of the curve (x > 0 and x < 0) are joined at x = 0 and have a common tangent line, which is the vertical line x = 0 (the y axis) in this case, but the slope suffers a jump discontinuity (from −∞ to ∞). So the curve is not smooth at x = 0 and exhibits a hornlike shape near x = 0. Such a point of a planar curve is called a cusp.

Figure 10.6. Plot of $y = x^{2/3}$. The curve has a cusp at the origin.

A parametric curve x = x(t), y = y(t) may have cusps even though both derivatives x′(t) and y′(t) are continuous for all t. For example, consider the parametric curve $x = t^3$, $y = t^2$. For all values of t, $x^2 - y^3 = 0$. So this curve coincides with that discussed above and has a cusp at the origin (t = 0). The derivatives $x'(t) = 3t^2$ and $y'(t) = 2t$ are continuous everywhere, and, in particular, x′(0) = y′(0) = 0 at the cusp point. Despite the continuity of the derivatives, the slope of the curve is not defined since dy/dx = y′/x′ is an undetermined form 0/0. A closer investigation shows that the slope $y'(t)/x'(t) = \tfrac{2}{3}t^{-1}$ suffers a jump discontinuity (from −∞ to +∞ as t changes from negative to positive). The definition of a smooth parametric curve requires that the derivatives x′(t) and y′(t) are continuous and do not vanish simultaneously at any t. A rationale for the latter condition is to eliminate possible cusps that may occur at points where both derivatives vanish.

Furthermore, consider the curve $x = t^2$, $y = t^3$. The slope $dy/dx = y'/x' = \tfrac{3}{2}t$ is continuous everywhere and, in particular, at t = 0, where x′(0) = y′(0) = 0. Nevertheless, the curve has a cusp at the origin. To see this, let us investigate the Cartesian equation of this curve, $x^3 - y^2 = 0$, which can be solved for x, $x = y^{2/3}$. Therefore, the derivative $dx/dy = \tfrac{2}{3}y^{-1/3}$ exhibits a jump discontinuity from −∞ to ∞ as y changes from negative to positive. The two branches of the curve (y > 0 and y < 0) have a common tangent line (the horizontal line y = 0), but at their joining point a cusp is formed. Note also that the rate $dx/dy = x'/y' = \tfrac{2}{3}t^{-1}$ suffers a familiar infinite jump discontinuity, thus indicating a cusp. This example shows that both rates dy/dx = y′/x′ and dx/dy = x′/y′ must be studied to determine whether there is a cusp at the point where y′ = x′ = 0.

Figure 10.7. The curve $x = t^2$, $y = t^3$. The derivatives x′(0) = y′(0) = 0 vanish at t = 0. The curve has a cusp at the point (x(0), y(0)) = (0, 0).
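The need to inspect both rates can be seen numerically for the curve x = t², y = t³. In this minimal sketch (the step `eps` and the function names are assumptions for illustration), dy/dx stays small on both sides of t = 0, while dx/dy is large with opposite signs.

```python
# Rates for x = t^2, y = t^3, written as ratios of the derivatives.
def dy_dx(t):
    return (3.0 * t * t) / (2.0 * t)   # y'/x' = (3/2) t, t != 0

def dx_dy(t):
    return (2.0 * t) / (3.0 * t * t)   # x'/y' = (2/3) / t, t != 0

eps = 1e-3
print(dy_dx(-eps), dy_dx(eps))   # both close to 0: no jump in dy/dx
print(dx_dy(-eps), dx_dy(eps))   # large, opposite signs: jump in dx/dy
```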

Example 10.5. Find the tangent line to the astroid defined by the parametric equations $x = a\cos^3 t$, $y = a\sin^3 t$, t ∈ [0, 2π], at the point t = π/4. Determine the points where the tangent line is horizontal or vertical. Is the curve smooth? Specify the regions of upward and downward concavity. Use the results to sketch the curve.

Solution: The slope of the tangent line at a generic point is

$$\frac{dy}{dx} = \frac{1}{x'}\frac{dy}{dt} = -\frac{1}{3a\cos^2 t\,\sin t}\;3a\sin^2 t\,\cos t = -\tan t.$$

The value t = π/4 corresponds to the point $x = a/2^{3/2}$, $y = a/2^{3/2}$ because $\sin(\pi/4) = \cos(\pi/4) = 1/\sqrt{2}$, and the slope at this point is −1. So the tangent line is

$$y = \frac{a}{2\sqrt{2}} - \left(x - \frac{a}{2\sqrt{2}}\right) \quad\text{or}\quad y = \frac{a}{\sqrt{2}} - x.$$

The slope dy/dx = −tan t vanishes at t = 0 and t = π, so the tangent line is horizontal (y = 0) at the points (±a, 0). However, the derivatives $x'(t) = -3a\cos^2 t\,\sin t$ and $y'(t) = 3a\sin^2 t\,\cos t$ vanish simultaneously at t = 0 and t = π. The inverse slope dx/dy = 1/(dy/dx) = −cot t exhibits an infinite jump discontinuity at t = 0 and t = π, so the curve has cusps at (±a, 0) and hence it is not smooth at these points. The slope dy/dx is infinite at t = π/2 and t = 3π/2. Therefore, the curve has a vertical tangent line (x = 0) at (0, ±a). However, the slope dy/dx = −tan t has an infinite jump discontinuity at t = π/2 and t = 3π/2. So the curve has cusps and is not smooth at (0, ±a). Note also that both derivatives x′ and y′ vanish at these points. Thus, the curve consists of four smooth pieces, and the curve has cusps at the joining points of its smooth pieces. The second derivative

$$\frac{d^2y}{dx^2} = \frac{1}{x'}\frac{d}{dt}\frac{dy}{dx} = \frac{1}{3a\cos^2 t\,\sin t}\,(\tan t)' = \frac{1}{3a\sin t\,\cos^4 t}$$

is positive if sin t > 0 (or y > 0) and negative if sin t < 0 (or y < 0). So the two branches of the curve above the x axis are concave upward, while the two branches below it are concave downward. The curve looks like a square with vertices (±a, 0), (0, ±a) whose sides are bent inward toward the origin. □
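The tangent line at t = π/4 and the sign of the second derivative can be verified numerically. This is a sketch under the assumption a = 2; the function names are hypothetical and the derivatives are the ones stated in the solution.

```python
import math

a = 2.0  # assumed value of the astroid parameter

def x(t):
    return a * math.cos(t) ** 3

def y(t):
    return a * math.sin(t) ** 3

def xp(t):
    return -3.0 * a * math.cos(t) ** 2 * math.sin(t)

def yp(t):
    return 3.0 * a * math.sin(t) ** 2 * math.cos(t)

def slope(t):
    return yp(t) / xp(t)              # equals -tan t where x'(t) != 0

def second_derivative(t):
    return 1.0 / (3.0 * a * math.sin(t) * math.cos(t) ** 4)

t0 = math.pi / 4.0
m = slope(t0)
intercept = y(t0) - m * x(t0)         # tangent line: y = intercept + m * x

print(m, intercept, a / math.sqrt(2.0))
print(second_derivative(math.pi / 3.0), second_derivative(-math.pi / 3.0))
```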

66.4. Exercises.
In (1) and (2), find an equation of the tangent line(s) to the curve at the given point. Sketch the curve and the tangent(s).

(1) $x = t^2 + t$, $y = 4\sin t$, (0, 0)
(2) $x = \sin t + \sin(2t)$, $y = \cos t + \cos(2t)$, (1, −1)

In (3)–(6), investigate the concavity of the curve.

(3) $x = t^3 - 12t$, $y = t^2 - 1$
(4) $x = \sin(2t)$, $y = \cos t$
(5) $x = t^2 - \ln t$, $y = t^2 + \ln t$
(6) $x = 3\sin t^3$, $y = 2\cos t^3$

(7) Investigate the slope of the trochoid x = Rφ − b sin φ, y = R − b cos φ in terms of φ. Find the condition on the parameters R and b such that the trochoid has vertical tangent lines.
(8) At what points on the curve $x = 2t^3$, $y = 1 + 4t - t^2$ does the tangent line have slope 1?
(9) Find equations of the tangents to the curve $x = 2t^3 + 1$, $y = 3t^2 + 1$ that pass through the point (3, 4).

In (10)–(12), investigate whether the curve has cusps. If it does, find their positions. Sketch the curve.

(10) $x = t^3$, $y = t^3$
(11) $x = t^5$, $y = t^2$
(12) $x = (t^2 - 1)^3$, $y = (t^3 - 1)^2$


67. Polar Coordinates

A point on a plane is described by an ordered pair of numbers (x0, y0) in the rectangular coordinate system. This description implies a geometrical procedure to obtain the point as the intersection of two mutually perpendicular lines x = x0 and y = y0. The set of vertical and horizontal lines forms a rectangular grid in a plane. There are other possibilities to label points on a plane by ordered pairs of numbers. Here the polar coordinate system is introduced, which is more convenient for many purposes.

Fix a point O on a plane. Let P be a point on the plane. A horizontal ray from O is called the polar axis, and the point O is called the origin or pole. Let θ be the angle between the polar axis and the ray OP from O through P. The angle θ is counted counterclockwise from the polar axis. The position of the point P on the ray OP is uniquely determined by the distance r = |OP|. Thus, any point P on a plane is uniquely associated with the ordered pair (r, θ), and r, θ are called the polar coordinates of P. The coordinate r is called the radial variable, and θ is called the polar angle.

All points on the plane that have the same value of the radial variable form a circle of radius r centered at the origin (all points that have the same distance from the origin). All points on the plane that have the same value of the polar angle form a ray (a half-line bounded by the origin). So a point P with polar coordinates (r, θ) is the intersection of the circle of radius r and the ray that makes the angle θ with the polar axis. Concentric circles and rays originating from the center of the circles form a polar grid in a plane (see Figure 10.9).

Figure 10.8. Definition of the polar coordinates in a plane. r is the distance |OP|, and θ is the angle counted counterclockwise from the horizontal ray outgoing from O to the right. The rectangular coordinates of a point P are related to the polar ones as x = r cos θ, y = r sin θ.

To represent all points of a plane, the radial variable has to range over the interval r ∈ [0, ∞), while the polar angle takes its values in the interval [0, 2π) because any ray from the origin does not change after rotation about the origin through the angle 2π. It is convenient, though, to let θ range over the whole real line. Positive values of θ correspond to rotation angles counted counterclockwise, while negative values of θ are associated with rotation angles counted clockwise. All pairs (r, θ) with a fixed value of r and values of θ different by integer multiples of 2π represent the same point of the plane. For example, the ordered pairs (r, θ) = (1, −π) and (1, π) correspond to the same point. Indeed, both points are on the circle of unit radius. The ray θ = π is obtained from the polar axis by counterclockwise rotation of the latter through the angle π. But the same ray is obtained by rotating the polar axis through the angle π clockwise; that is, the rays θ = π and θ = −π coincide.

Figure 10.9. Polar grid. Coordinate curves of the polar coordinates. The curves of constant values of r are concentric circles. The curves of constant values of θ are rays outgoing from the origin.


Furthermore, the meaning of the radial variable r can be extended to the case in which r is negative by agreeing that the pairs (−r, θ) and (r, θ + π), r > 0, represent the same point. Geometrically, the points (±r, θ) lie on a line through the origin at the same distance |r| from the origin but on opposite sides of the origin. With this agreement on extending the meaning of the polar coordinates, each point on a plane may be represented by countably many pairs:

(10.6) (r, θ) ⇐⇒ (r, θ + 2πn) or (−r, θ + (2n + 1)π),

where n is an integer.
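The equivalence (10.6) can be confirmed by mapping each representation to rectangular coordinates with x = r cos θ, y = r sin θ. The sketch below is illustrative; the sample values r = 1.5, θ = 0.7 and the helper names are assumptions.

```python
import math

def to_rect(r, theta):
    """Rectangular coordinates of the polar pair (r, theta); r may be negative."""
    return r * math.cos(theta), r * math.sin(theta)

def max_deviation(r, theta, n_values):
    """Largest coordinate difference between (r, theta) and its equivalents in (10.6)."""
    x0, y0 = to_rect(r, theta)
    worst = 0.0
    for n in n_values:
        for rr, tt in ((r, theta + 2.0 * math.pi * n),
                       (-r, theta + (2 * n + 1) * math.pi)):
            x, y = to_rect(rr, tt)
            worst = max(worst, abs(x - x0), abs(y - y0))
    return worst

deviation = max_deviation(1.5, 0.7, range(-3, 4))
print(deviation)  # of the order of rounding error
```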

67.1. Rectangular and Polar Coordinates. Suppose that the polar axis is set so that it coincides with the positive x axis of the rectangular coordinate system. Every point on the plane is described either by the rectangular coordinates (x, y) or by the polar coordinates (r, θ). It is easy to find the relation between the polar and rectangular coordinates of a point P by examining the rectangle with the diagonal OP. Its horizontal and vertical sides have lengths x and y, respectively. The length of the diagonal is r. The angle between the horizontal side and the diagonal is θ. Therefore, cos θ = x/r and sin θ = y/r, or

$$x = r\cos\theta, \quad y = r\sin\theta \iff r^2 = x^2 + y^2, \quad \tan\theta = \frac{y}{x}.$$

These relations allow us to convert the polar coordinates of a point to rectangular coordinates and vice versa.

Example 10.6. Find the rectangular coordinates of a point whose polar coordinates are (2, π/6). Find the polar coordinates of a point with rectangular coordinates (−1, 1).

Solution: For r = 2 and θ = π/6, one has $x = 2\cos(\pi/6) = 2\cdot\sqrt{3}/2 = \sqrt{3}$ and $y = 2\sin(\pi/6) = 2/2 = 1$, so $(x, y) = (\sqrt{3}, 1)$. For x = −1 and y = 1, one has $r^2 = 2$ or $r = \sqrt{2}$ and tan θ = −1. The point (−1, 1) lies in the second quadrant, that is, π/2 ≤ θ ≤ π. Therefore, θ = 3π/4. Alternatively, one can take θ = 3π/4 − 2π = −5π/4. □
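The two conversions can be reproduced with the standard `math` module. This is a minimal sketch; the function names are hypothetical. Using `math.atan2` rather than the arctangent of y/x selects the quadrant automatically, which is the step done by hand in the solution.

```python
import math

def polar_to_rect(r, theta):
    return r * math.cos(theta), r * math.sin(theta)

def rect_to_polar(x, y):
    """Return (r, theta) with r >= 0 and theta in (-pi, pi]."""
    return math.hypot(x, y), math.atan2(y, x)

x, y = polar_to_rect(2.0, math.pi / 6.0)
r, theta = rect_to_polar(-1.0, 1.0)

print(x, y)        # approximately (sqrt(3), 1)
print(r, theta)    # approximately (sqrt(2), 3*pi/4)
```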

67.2. Polar Graphs. A polar graph is a curve defined by the equation r = f(θ) or, more generally, F(r, θ) = 0. It consists of all points that have at least one polar representation (r, θ) that satisfies the equation. Here polar coordinates are understood in the extended sense of (10.6) when they are allowed to take any value.

The simplest polar graph is defined by a constant function r = a, where a is real. Since |r| represents the distance from the origin, the pairs (|a|, θ) form a circle of radius |a| centered at the origin. Similarly, the graph θ = b, where b is real, is the set of all points (r, b), where r ranges over the real axis, which is the line through the origin that makes an angle of b radians with the polar axis. Notice that the points (r, b), r > 0, and (r, b), r < 0, lie in opposite quadrants relative to the origin as the pairs (r, b) and (−r, b + π) represent the same point.

In general, the shape of a polar graph can be determined by plotting points $(f(\theta_k), \theta_k)$, k = 1, 2, ..., n, for a set of successively increasing values of θ, $\theta_1 < \theta_2 < \cdots < \theta_n$; that is, one takes a set of rays $\theta = \theta_k$ and marks the point on each ray at a distance $r_k = f(\theta_k)$ from the origin.

Example 10.7. Describe the curve r = 2 cos θ.

Solution: By converting the polar graph equation to rectangular coordinates, one finds:

$$r = 2\cos\theta \iff r^2 = 2r\cos\theta \iff x^2 + y^2 = 2x \iff (x - 1)^2 + y^2 = 1.$$

The latter equation is obtained by completing the square. It represents a circle with center (1, 0) and radius 1. Note also that by looking at the graph of the cosine function, one can see that the point (2 cos θ, θ) gets closer to the origin when θ changes from 0 to π/2 (the first quadrant), reaching the origin at θ = π/2. This gives the upper part of the circle. A similar behavior is observed when θ changes from 0 to −π/2 (the lower part of the circle in the fourth quadrant). In the intervals (−π, −π/2) and (π/2, π), the radial variable is negative. The representation (r, θ) is equivalent to (−r, θ ± π). Therefore, the points (2 cos θ, θ) and (−2 cos θ, θ + π) = (2 cos(θ + π), θ + π) are the same for θ ∈ [−π, −π/2]. But the latter set can also be described by the pairs (2 cos θ, θ) if θ ∈ [0, π/2]. Similarly, the set traced out by the pair (2 cos θ, θ) for θ ∈ [π/2, π] is the same as when θ ∈ [−π/2, 0]. So the pair (2 cos θ, θ) traces out the same set (the circle) each time θ ranges over an interval of length π. □
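The conversion can be checked by sampling the polar graph, including the angles where r is negative. The sketch below is illustrative; the helper name `point_on_graph` and the sampling grid are assumptions.

```python
import math

def point_on_graph(theta):
    """Rectangular coordinates of the point (r, theta) with r = 2 cos(theta)."""
    r = 2.0 * math.cos(theta)   # negative for pi/2 < |theta| < pi
    return r * math.cos(theta), r * math.sin(theta)

# Largest violation of (x - 1)^2 + y^2 = 1 over theta in [-pi, pi].
worst = 0.0
for k in range(361):
    theta = -math.pi + 2.0 * math.pi * k / 360
    x, y = point_on_graph(theta)
    worst = max(worst, abs((x - 1.0) ** 2 + y ** 2 - 1.0))

# Angles that differ by pi give the same point of the circle.
p = point_on_graph(0.3)
q = point_on_graph(0.3 + math.pi)

print(worst, p, q)
```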

Example 10.8. Describe the shape of the curve r = θ, θ ≥ 0.

Solution: The point (θ, θ) lies on the ray that makes an angle θ with the polar axis and is a distance r = θ from the origin. As the ray rotates counterclockwise about the origin with increasing θ, the distance of the point from the origin increases proportionally. So the curve is a spiral unwinding counterclockwise. □

Figure 10.10. Polar curve r = θ. It is a spiral because the distance from the origin r increases with the angle θ as the point rotates about the origin through the angle θ.

67.3. Symmetry of Polar Graphs. When sketching polar graphs, it is helpful to take advantage of symmetry, just as when plotting graphs y = f(x) of symmetric (f(−x) = f(x)) or skew-symmetric (f(−x) = −f(x)) functions.

(i) If a polar equation is unchanged when θ is replaced by −θ, the curve is symmetric about the polar axis. Note that the transformation (r, θ) → (r, −θ) means that (x, y) → (x, −y), which is the reflection about the x axis (or the polar axis).

(ii) If a polar equation is unchanged when (r, θ) is replaced by (−r, θ) or by (r, θ + π), the curve is symmetric about the origin. Again, these transformations are equivalent to (x, y) → (−x, −y), which is the reflection about the origin.

(iii) If the equation is unchanged under the transformation (r, θ) → (r, π − θ), then the curve is symmetric about the vertical line θ = π/2. In the rectangular coordinates, this transformation is (x, y) → (−x, y), which is the reflection about the y axis.

Example 10.9. Describe the cardioid r = 1 + sin θ.

Solution: The equation is unchanged under θ → π − θ, so the curve is symmetric about the vertical axis (the y axis). It is sufficient to investigate the curve in the interval θ ∈ [−π/2, π/2] (in the fourth and first quadrants). Consider a ray that rotates counterclockwise from θ = −π/2 to θ = π/2 (from the negative y axis to the positive y axis). When θ = −π/2, r = 0. As θ increases from −π/2 to 0 (the fourth quadrant), the distance from the origin r = 1 + sin θ increases monotonically from 0 to 1 (r = 1 on the polar axis). In the interval [0, π/2] (the first quadrant), the distance from the origin r continues to increase monotonically and reaches its maximal value 2 on the vertical axis. □

Figure 10.11. The cardioid r = 1 + sin θ.

67.4. Tangent to a Polar Graph. To find a tangent line to a polar graph r = f(θ), the polar angle is viewed as a parameter so that the parametric equations of the graph are

x = r cos θ = f(θ) cos θ , y = r sin θ = f(θ) sin θ.

By the product rule for the derivative,

$$\frac{dy}{dx} = \frac{dy/d\theta}{dx/d\theta} = \frac{f'(\theta)\sin\theta + f(\theta)\cos\theta}{f'(\theta)\cos\theta - f(\theta)\sin\theta}.$$

In particular, if the curve passes through the origin, r = 0, the equation for the slope at the origin is simplified:

$$\frac{dy}{dx} = \tan\theta \quad\text{if}\quad \frac{dr}{d\theta} = f'(\theta) \neq 0.$$

Note that if f′(θ) = 0, then the slope is an undetermined form 0/0 because x′(θ) = y′(θ) = 0 for any value of θ such that f′(θ) = f(θ) = 0. This means that the curve may have a cusp at the origin and hence may fail to be smooth there.

Example 10.10. Find the slope of the cardioid r = 1 + sin θ in terms of θ. Investigate the behavior of the cardioid near the origin.


Solution: Here f(θ) = 1 + sin θ and f′(θ) = cos θ. This leads to the slope

$$\frac{dy}{dx} = \frac{\cos\theta\,\sin\theta + (1 + \sin\theta)\cos\theta}{\cos^2\theta - (1 + \sin\theta)\sin\theta} = \frac{\cos\theta\,(1 + 2\sin\theta)}{(1 + \sin\theta)(1 - 2\sin\theta)},$$

where the identity $\cos^2\theta = 1 - \sin^2\theta$ has been used to transform the denominator. The cardioid passes through the origin as θ passes through −π/2. The slope dy/dx is undetermined because the numerator and denominator of the ratio vanish at θ = −π/2 (both derivatives dx/dθ and dy/dθ vanish). The left and right limits have to be investigated to see if the slope has a jump discontinuity, thus indicating a cusp. The numerator vanishes because of the factor cos θ, while the denominator vanishes because of the factor (1 + sin θ). Hence,

$$\lim_{\theta\to(-\pi/2)^\pm}\frac{dy}{dx} = -\frac{1}{3}\lim_{\theta\to(-\pi/2)^\pm}\frac{\cos\theta}{1 + \sin\theta} = -\frac{1}{3}\lim_{\theta\to(-\pi/2)^\pm}\frac{-\sin\theta}{\cos\theta} = \mp\infty,$$

where l'Hospital's rule has been used to resolve the undetermined form 0/0 and the property that tan θ → ∓∞ as θ → (−π/2)± has been invoked to find the limit. The cardioid has a vertical tangent line at the origin. The slope has an infinite jump discontinuity, meaning that the cardioid has a cusp at the origin (see Figure 10.11). □
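The general slope formula for a polar graph and the one-sided behavior of the cardioid near θ = −π/2 can be examined numerically. This is a sketch only; `polar_slope`, `closed_form`, and the offset `eps` are names and values assumed for the illustration.

```python
import math

def polar_slope(f, fprime, theta):
    """dy/dx for r = f(theta): (f' sin + f cos) / (f' cos - f sin)."""
    num = fprime(theta) * math.sin(theta) + f(theta) * math.cos(theta)
    den = fprime(theta) * math.cos(theta) - f(theta) * math.sin(theta)
    return num / den

def f(theta):
    return 1.0 + math.sin(theta)      # cardioid

def fprime(theta):
    return math.cos(theta)

def closed_form(theta):
    s, c = math.sin(theta), math.cos(theta)
    return c * (1.0 + 2.0 * s) / ((1.0 + s) * (1.0 - 2.0 * s))

eps = 1e-3
right = polar_slope(f, fprime, -math.pi / 2.0 + eps)  # theta -> (-pi/2)+
left = polar_slope(f, fprime, -math.pi / 2.0 - eps)   # theta -> (-pi/2)-

print(right, left)                     # large negative, large positive
print(polar_slope(f, fprime, 0.4), closed_form(0.4))
```

The large values of opposite sign on the two sides of θ = −π/2 are the numerical counterpart of the limit ∓∞.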

67.5. Exercises.
In (1)–(3), convert the polar graph equation to a Cartesian equation and sketch the curve.

(1) $r = 4\sin\theta$
(2) $r = \tan\theta\,\sec\theta$
(3) $r = 2\sin\theta - 4\cos\theta$

In (4)–(12), sketch the curve with the given polar equation.

(4) $r = \theta$, $\theta \le 0$
(5) $r = \ln\theta$, $\theta \ge 1$
(6) $r^2 - 3r + 2 = 0$
(7) $r = 4\cos(6\theta)$
(8) $r^2 = 9\sin(2\theta)$
(9) $r = 1 + 2\cos(2\theta)$
(10) $r = 2 + \sin(3\theta)$
(11) $r = 1 + 2\sin(3\theta)$
(12) $r^2\theta = 1$

(13) Sketch the curve $(x^2 + y^2)^2 = 4x^2y^2$. Hint: Use polar coordinates.
(14) Investigate the dependence of the shape of the curve r = cos(nθ) as the integer n increases. What happens if n is not an integer?
(15) Show that the curve r = 1 + a sin θ has an inner loop when |a| > 1 and find the range of θ that corresponds to the inner loop.
(16) For what values of a is the curve r = 1 + a sin θ smooth?

In (17) and (18), find the slope of the tangent line to the given curve at the point specified by the value of θ and give an equation of the tangent line.

(17) $r = 2\sin\theta$, $\theta = \pi/3$
(18) $r = 1 - 2\cos\theta$, $\theta = \pi/6$

(19) Show that the curves r = a sin θ and r = a cos θ intersect at right angles.

68. Parametric Curves: The Arc Length and Surface Area

68.1. Arc Length of a Smooth Curve. Let C be a smooth curve defined by the parametric equations x = x(t), y = y(t), where t ∈ [a, b]. Suppose that C is traversed exactly once as t increases from a to b and consider a partition of the interval [a, b] such that $t_0 = a$ and $t_k = t_0 + k\,\Delta t$, k = 0, 1, 2, ..., n, are the endpoints of the partition intervals of width Δt = (b − a)/n. Then the points $P_k$ with coordinates $(x(t_k), y(t_k))$ lie on the curve so that $P_0$ and $P_n$ are the initial and terminal points, respectively. The curve C can be approximated by a polygonal path with vertices $P_k$. By definition, the length L of C is the limit of the lengths of these approximating polygons as n → ∞:

(10.7) $L = \lim_{n\to\infty}\sum_{k=1}^{n} |P_{k-1}P_k|,$

provided the limit exists, and, in this case, the curve is called measurable.

Figure 10.12. The arc length of a smooth parametric curve is approximated by the length of n straight line segments connecting points on the curve. The arc length is defined in (10.7) as the limit n → ∞.


By the mean value theorem, when applied to the functions x(t) and y(t) on the interval $[t_{k-1}, t_k]$, there are numbers $t_k^*$ and $t_k^{**}$ in $(t_{k-1}, t_k)$ such that

$$\Delta x_k = x(t_k) - x(t_{k-1}) = x'(t_k^*)\,\Delta t, \qquad \Delta y_k = y(t_k) - y(t_{k-1}) = y'(t_k^{**})\,\Delta t.$$

Therefore,

$$|P_{k-1}P_k| = \sqrt{(\Delta x_k)^2 + (\Delta y_k)^2} = \sqrt{(x'(t_k^*))^2 + (y'(t_k^{**}))^2}\;\Delta t.$$

The sum in (10.7) resembles a Riemann sum for the function $F(t) = \sqrt{(x'(t))^2 + (y'(t))^2}$. It is not exactly a Riemann sum because $t_k^* \neq t_k^{**}$ in general. However, if x′(t) and y′(t) are continuous, it can be shown that the limit (10.7) is the same as if $t_k^*$ and $t_k^{**}$ were equal, namely, L is the integral of F(t) over [a, b].

Theorem 10.2 (Arc Length of a Curve). If a curve C is described by the parametric equations x = x(t), y = y(t), t ∈ [a, b], where x′(t) and y′(t) are continuous on [a, b] and C is traversed exactly once as t increases from a to b, then the length of C is

$$L = \int_a^b \sqrt{\left(\frac{dx}{dt}\right)^2 + \left(\frac{dy}{dt}\right)^2}\;dt.$$
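The polygonal definition (10.7) and the integral of the theorem can be compared numerically. The sketch below uses a circle of assumed radius R = 3 and a midpoint rule for the integral; the function names and the number of subintervals are illustrative assumptions.

```python
import math

def polygon_length(x, y, a, b, n):
    """Length of the polygonal path through n + 1 equally spaced parameter values."""
    dt = (b - a) / n
    total = 0.0
    px, py = x(a), y(a)
    for k in range(1, n + 1):
        t = a + k * dt
        qx, qy = x(t), y(t)
        total += math.hypot(qx - px, qy - py)
        px, py = qx, qy
    return total

def integral_length(xp, yp, a, b, n):
    """Midpoint-rule approximation of the integral of sqrt(x'^2 + y'^2) over [a, b]."""
    dt = (b - a) / n
    return sum(math.hypot(xp(a + (k + 0.5) * dt), yp(a + (k + 0.5) * dt))
               for k in range(n)) * dt

R = 3.0
L_poly = polygon_length(lambda t: R * math.cos(t), lambda t: R * math.sin(t),
                        0.0, 2.0 * math.pi, 2000)
L_int = integral_length(lambda t: -R * math.sin(t), lambda t: R * math.cos(t),
                        0.0, 2.0 * math.pi, 2000)

print(L_poly, L_int, 2.0 * math.pi * R)
```

The polygon length approaches the integral from below as n grows, since each chord is shorter than the arc it spans.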

If C is a graph y = f(x), then x = t, y = f(t), and dx = dt, and the length is given by the familiar expression

$$L = \int_a^b \sqrt{1 + \left(\frac{dy}{dx}\right)^2}\;dx.$$

It is convenient to introduce the arc length of an infinitesimal segment of a curve (the differential of the arc length)

$$ds = \sqrt{(dx)^2 + (dy)^2} \implies L = \int_C ds = \int_a^b \frac{ds}{dt}\,dt.$$

The symbol $\int_C$ means the summation over infinitesimal segments of the curve C (the integral along a curve C) and expresses the simple fact that the total length is the sum of the lengths of its (infinitesimal) pieces.

68.2. Independence of Parameterization. By its very definition, the arc length is independent of the parameterization of the curve. If a curve $C$ is defined as a point set, then any parametric equations can be used to evaluate the arc length. Let $C$ be traced out only once by $x = x(t)$, $y = y(t)$, where $t \in [a, b]$, and by $x = X(\tau)$, $y = Y(\tau)$, where $\tau \in [\alpha, \beta]$. As noted, there is a relation between the parameters $t$ and $\tau$, $\tau = g(t)$, such that $g(t)$ increases from $\alpha$ to $\beta$ as $t$ increases from $a$ to $b$, that is, $d\tau/dt = g'(t) \ge 0$, and such that $x(t) = X(g(t))$ and $y(t) = Y(g(t))$. Therefore, the integrals (10.7) corresponding to different parametric equations of the same curve are related by a change of the integration variable:

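\[
\begin{aligned}
L &= \int_a^b \sqrt{\left(\frac{dx}{dt}\right)^2 + \left(\frac{dy}{dt}\right)^2}\,dt
   = \int_a^b \sqrt{\left(\frac{dx}{d\tau}\frac{d\tau}{dt}\right)^2 + \left(\frac{dy}{d\tau}\frac{d\tau}{dt}\right)^2}\,dt \\
  &= \int_a^b \sqrt{\left(\frac{dx}{d\tau}\right)^2 + \left(\frac{dy}{d\tau}\right)^2}\;\frac{d\tau}{dt}\,dt
   = \int_\alpha^\beta \sqrt{\left(\frac{dx}{d\tau}\right)^2 + \left(\frac{dy}{d\tau}\right)^2}\,d\tau.
\end{aligned}
\]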

Thus, the arc length is independent of the curve parameterization and can be computed in any suitable parameterization of the curve.
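The independence of parameterization can be illustrated numerically. The following Python sketch is not part of the original text; the parabola arc, the relation $\tau = \ln(1 + t)$, and the helper names `simpson` and `arc_length` are chosen here only for illustration. It evaluates the arc length integral for two parameterizations of the same curve.

```python
import math

def simpson(f, a, b, n=1000):
    """Composite Simpson rule with an even number n of subintervals."""
    h = (b - a) / n
    total = f(a) + f(b)
    for k in range(1, n):
        total += (4 if k % 2 else 2) * f(a + k * h)
    return total * h / 3

def arc_length(dx, dy, a, b):
    """Integrate sqrt(x'(t)^2 + y'(t)^2) over [a, b]."""
    return simpson(lambda t: math.hypot(dx(t), dy(t)), a, b)

# Parameterization 1 (illustrative): x = t, y = t^2, t in [0, 1].
L_t = arc_length(lambda t: 1.0, lambda t: 2.0 * t, 0.0, 1.0)

# Parameterization 2: tau = g(t) = ln(1 + t), so x = e^tau - 1, y = (e^tau - 1)^2,
# tau in [0, ln 2]; here g'(t) = 1/(1 + t) > 0.
L_tau = arc_length(
    lambda s: math.exp(s),
    lambda s: 2.0 * (math.exp(s) - 1.0) * math.exp(s),
    0.0,
    math.log(2.0),
)

print(L_t, L_tau)  # the two values should agree up to quadrature error
```

Both integrals approximate the same length, as the change of variable above predicts.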

A circle of radius $R$ is described by the parametric equations $x = R\cos t$, $y = R\sin t$, $t \in [0, 2\pi]$. Then $dx = -R\sin t\,dt$ and $dy = R\cos t\,dt$. Hence, $ds^2 = (-R\sin t\,dt)^2 + (R\cos t\,dt)^2 = R^2(\sin^2 t + \cos^2 t)\,dt^2 = R^2\,dt^2$, or $ds = R\,dt$, and

\[ L = \int_C ds = \int_0^{2\pi} R\,dt = R\int_0^{2\pi} dt = 2\pi R. \]

Example 10.11. Find the length of one arch of the cycloid $x = R(\varphi - \sin\varphi)$, $y = R(1 - \cos\varphi)$.

Solution: According to the description of the cycloid, one arch corresponds to the interval $\varphi \in [0, 2\pi]$. The arc length differential $ds$ is found as follows:

\[ dx = R(1 - \cos\varphi)\,d\varphi, \qquad dy = R\sin\varphi\,d\varphi, \]

\[
\begin{aligned}
ds^2 = dx^2 + dy^2 &= [(1 - \cos\varphi)^2 + \sin^2\varphi]\,R^2\,d\varphi^2 \\
&= [1 - 2\cos\varphi + \cos^2\varphi + \sin^2\varphi]\,R^2\,d\varphi^2 = (2 - 2\cos\varphi)\,R^2\,d\varphi^2,
\end{aligned}
\]

\[ ds = \sqrt{2(1 - \cos\varphi)}\;R\,d\varphi. \]

To evaluate the integral of $\sqrt{2(1 - \cos\varphi)}$, the half-angle identity $\sin^2(\varphi/2) = (1 - \cos\varphi)/2$ is invoked. Since $0 \le \varphi/2 \le \pi$ when $\varphi \in [0, 2\pi]$, the sine is nonnegative, $\sin(\varphi/2) \ge 0$, on the integration interval, and hence, after taking the square root ($\sqrt{u^2} = |u|$), the absolute value can be omitted. Thus,

\[ L = R\int_0^{2\pi} \sqrt{4\sin^2(\varphi/2)}\,d\varphi = 2R\int_0^{2\pi} \sin(\varphi/2)\,d\varphi = 2R\,[-2\cos(\varphi/2)]\Big|_0^{2\pi} = 8R. \]

□
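The limit definition of the arc length can be compared with the result $L = 8R$. The Python sketch below is an illustration added here, not the authors' code; the value of `R` and the function names are assumptions. It sums the lengths of $n$ straight line segments joining points of the cycloid at equally spaced values of $\varphi$.

```python
import math

R = 1.0  # radius of the rolling circle (illustrative value)

def point(phi):
    """Point on the cycloid x = R(phi - sin phi), y = R(1 - cos phi)."""
    return R * (phi - math.sin(phi)), R * (1.0 - math.cos(phi))

def polyline_length(n):
    """Total length of n chords joining points at equally spaced phi in [0, 2*pi]."""
    pts = [point(2.0 * math.pi * k / n) for k in range(n + 1)]
    return sum(math.dist(p, q) for p, q in zip(pts, pts[1:]))

for n in (4, 16, 64, 256, 1024):
    print(n, polyline_length(n))  # the values increase toward 8R
```

Each chord is no longer than the arc it spans, so the polyline lengths approach $8R$ from below as $n$ grows.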


68.3. Area of a Planar Region. The area under the curve $y = f(x)$ and above the interval $x \in [a, b]$ is given by $A = \int_a^b f(x)\,dx$, where $f(x) \ge 0$. Suppose that the curve is also described by parametric equations $x = x(t)$, $y = y(t)$, so that the function $x(t)$ is one-to-one. Then, by changing the integration variable, $dx = x'(t)\,dt$ and

\[ A = \int_a^b y\,dx = \int_\alpha^\beta y(t)\,x'(t)\,dt. \]

The new integration limits are found as usual. When $x = a$, $t$ is either $\alpha$ or $\beta$, and when $x = b$, $t$ is the remaining value.

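Example 10.12. Find the area under one arch of the cycloid $x = R(\varphi - \sin\varphi)$, $y = R(1 - \cos\varphi)$.

Solution: When $\varphi \in [0, 2\pi]$, $x \in [0, 2\pi R]$ for one arch of the cycloid, and $y(\varphi) \ge 0$. Using the differential $dx$ found in the previous example,

\[
\begin{aligned}
A = \int_0^{2\pi R} y\,dx &= R^2\int_0^{2\pi} (1 - \cos\varphi)^2\,d\varphi
  = R^2\int_0^{2\pi} (1 - 2\cos\varphi + \cos^2\varphi)\,d\varphi \\
&= R^2\int_0^{2\pi} \left[1 + \tfrac12\big(1 + \cos(2\varphi)\big)\right] d\varphi
  = R^2\int_0^{2\pi} \left(1 + \tfrac12\right) d\varphi = 3\pi R^2,
\end{aligned}
\]

where $\int_0^{2\pi} \cos\varphi\,d\varphi = 0$ and $\int_0^{2\pi} \cos(2\varphi)\,d\varphi = 0$ by the $2\pi$ periodicity of the cosine function. □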

68.4. Surface Area of Axially Symmetric Surfaces. An axially symmetric surface is a surface symmetric relative to rotations about a line. Such a line is called the symmetry axis. For example, a cylinder is symmetric relative to rotations about its axis, a sphere is symmetric relative to rotations about its diameter, and so on. An axially symmetric surface is swept by a planar curve when the latter is rotated about a line. A cylinder of radius $R$ and height $h$ is obtained by revolving a straight line segment of length $h$ about a line parallel to the segment at a distance $R$. A sphere of radius $R$ is obtained by revolving a circle of radius $R$ about its diameter.

Let $ds$ be the arc length of an infinitesimal segment of a smooth curve $C$ positioned at a point $(x, y)$. If the distance between the point $(x, y)$ and the symmetry axis is $R(x, y)$, then the area $dA$ of the part of the surface swept by the curve segment when the latter is rotated about the symmetry axis is the area of a cylinder of radius $R(x, y)$ and height $ds$:

\[ dA = 2\pi R(x, y)\,ds. \]

The total surface area is the sum of areas of all such parts of the surface:

\[ A = 2\pi\int_C R(x, y)\,ds = 2\pi\int_a^b R(x(t), y(t))\,\frac{ds}{dt}\,dt, \tag{10.8} \]

where $x = x(t)$, $y = y(t)$, $a \le t \le b$ are parametric equations of $C$. Here it is again assumed that the point $(x(t), y(t))$ traces out the curve $C$ only once as $t$ increases from $a$ to $b$. In particular, if the symmetry axis coincides with the $x$ axis, then $R(x, y) = |y|$ (the distance from the point $(x, y)$ to the $x$ axis) and

\[ A = 2\pi\int_C |y|\,ds = 2\pi\int_a^b |y(t)|\,\sqrt{\left(\frac{dx}{dt}\right)^2 + \left(\frac{dy}{dt}\right)^2}\,dt. \]

Figure 10.13. A surface is obtained by rotation of a smooth curve about a vertical line. If $ds$ is the arc length of an infinitesimal segment of the curve at a point $P$ and $R$ is the distance of the point $P$ from the rotation axis, then the surface area swept by the curve segment is $dA = 2\pi R\,ds$ (the surface area of a cylinder of radius $R$ and height $ds$).

Example 10.13. Find the area of the surface obtained by revolving one arch of the cycloid $x = R(\varphi - \sin\varphi)$, $y = R(1 - \cos\varphi)$ about the $x$ axis.

Solution: The differential of the arc length of the cycloid has been computed in Example 10.11. Since $y(\varphi) \ge 0$ here, the absolute value may be omitted and

\[
\begin{aligned}
A = 2\pi\int_C y\,ds &= 2\pi\int_0^{2\pi} R(1 - \cos\varphi)\,\sqrt{2(1 - \cos\varphi)}\;R\,d\varphi \\
&= 8\pi R^2\int_0^{2\pi} \sin^3(\varphi/2)\,d\varphi = 16\pi R^2\int_0^{\pi} \sin^3 u\,du \\
&= 16\pi R^2\int_0^{\pi} (1 - \cos^2 u)\sin u\,du = 16\pi R^2\int_{-1}^{1} (1 - z^2)\,dz \\
&= 16\pi R^2\left(z - \frac{z^3}{3}\right)\Big|_{-1}^{1} = \frac{64\pi R^2}{3},
\end{aligned}
\]

where the half-angle identity $\sin^2(\varphi/2) = (1 - \cos\varphi)/2$ has been used again, and then two successive changes of the integration variable have been made to evaluate the integral, $u = \varphi/2 \in [0, \pi]$ and $z = \cos u \in [-1, 1]$. □
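Formula (10.8) with $R(x, y) = |y|$ can also be evaluated numerically without any trigonometric simplification. The sketch below is an added illustration with an assumed value of `R` and an assumed helper `simpson`; it integrates $2\pi\,y\,\sqrt{(x')^2 + (y')^2}$ directly for one arch of the cycloid.

```python
import math

def simpson(f, a, b, n=1000):
    """Composite Simpson rule with an even number n of subintervals."""
    h = (b - a) / n
    total = f(a) + f(b)
    for k in range(1, n):
        total += (4 if k % 2 else 2) * f(a + k * h)
    return total * h / 3

R = 1.5  # illustrative radius

def integrand(phi):
    y = R * (1.0 - math.cos(phi))    # distance to the x axis
    dx = R * (1.0 - math.cos(phi))   # x'(phi)
    dy = R * math.sin(phi)           # y'(phi)
    return y * math.hypot(dx, dy)    # y * ds/dphi

surface_area = 2.0 * math.pi * simpson(integrand, 0.0, 2.0 * math.pi)

print(surface_area, 64.0 * math.pi * R**2 / 3.0)
```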

68.5. Exercises.
In (1)–(3), find the arc length of the curve.
(1) $x = 2 + 3t^2$, $y = 1 - 2t^3$ between the points $(2, 1)$ and $(5, -1)$.
(2) $x = 3\sin t - \sin(3t)$, $y = 3\cos t - \cos(3t)$, $0 \le t \le \pi$.
(3) $x = t/(1 + t)$, $y = \ln(1 + t)$ between the points $(0, 0)$ and $(2/3, \ln 3)$.
(4) Find the area of the region enclosed by the curve $x = a\cos^3 t$, $y = a\sin^3 t$ (the astroid).
(5) Find the area of the region enclosed by the curve $x = a\cos t$, $y = b\sin t$ (an ellipse).
(6) Find the area enclosed by the curve $x = t^2 - 2t$, $y = \sqrt{t}$ and the $y$ axis.
In (7)–(10), find the area of the surface generated by rotating the given curve about the specified axis. Sketch the surface.
(7) $x = a\cos^3 t$, $y = a\sin^3 t$ (about the $x$ axis).
(8) $x^2 + y^2 = a^2$ (about the $y$ axis).
(9) $x = t^3$, $y = t^2$, $0 \le t \le 1$ (about the $x$ axis).
(10) $x = e^t - t$, $y = 4e^{t/2}$, $0 \le t \le 1$ (about the $y$ axis).
(11) Let $V$ be the volume of a solid bounded by an axially symmetric surface. Show that $V = \pi\int_C [R(x, y)]^2\,ds$, where $C$ is the curve whose revolution about the symmetry axis gives the boundary surface and $R(x, y)$ is defined in (10.8). Find the volume of the solid bounded by the surface described in Example 10.13.

69. Areas and Arc Lengths in Polar Coordinates

69.1. Area of a Planar Region.

Theorem 10.3 (Area of a Planar Region in Polar Coordinates). Let a planar region $D$ be bounded by two rays from the origin, $\theta = a$ and $\theta = b$, and a polar graph $r = f(\theta)$, where $f(\theta) \ge 0$, that is,

\[ D = \{(r, \theta) \mid 0 \le r \le f(\theta),\; a \le \theta \le b\}. \]

Then the area of $D$ is

\[ A = \frac12\int_a^b [f(\theta)]^2\,d\theta. \]

Proof. Consider a partition of the interval $[a, b]$ by points $\theta_k = a + k\,\Delta\theta$, $k = 0, 1, \ldots, n$, where $\Delta\theta = (b - a)/n$. Let $m_k$ and $M_k$ be the minimum and maximum values of $f$ on $[\theta_{k-1}, \theta_k]$. Recall that a continuous function $f$ always attains its maximum and minimum values on a closed interval. The area $\Delta A_k$ of the planar region bounded by the rays $\theta = \theta_{k-1}$, $\theta = \theta_k$ and the polar graph $r = f(\theta)$ is not less than the area of the disk sector with radius $r = m_k$ and angle $\Delta\theta$ and is not greater than the area of the disk sector with radius $r = M_k$ and angle $\Delta\theta$. The area of a sector of a disk with radius $R$ and angle $\varphi$ radians is $A = \frac12 R^2\varphi$. Therefore,

\[ \tfrac12 m_k^2\,\Delta\theta \le \Delta A_k \le \tfrac12 M_k^2\,\Delta\theta, \]

and the total area $A$ of the planar region in question satisfies the inequality

\[ A_n^L = \frac12\sum_{k=1}^n m_k^2\,\Delta\theta \le A \le \frac12\sum_{k=1}^n M_k^2\,\Delta\theta = A_n^U, \qquad A = \sum_{k=1}^n \Delta A_k, \]

which is true for any $n$. Let $F(\theta) = \frac12[f(\theta)]^2$. The function $F$ is continuous on $[a, b]$. Then $\frac12 m_k^2$ and $\frac12 M_k^2$ are the minimum and maximum values of $F$ on the partition interval $[\theta_{k-1}, \theta_k]$. This shows that the lower and upper bounds, $A_n^L$ and $A_n^U$, are lower and upper sums for the function $F$ on $[a, b]$. By the definition of the definite integral and integrability of a continuous function (see Calculus I), the upper and lower sums converge to the integral of $F$ over $[a, b]$: $A_n^L \to \int_a^b F\,d\theta$ and $A_n^U \to \int_a^b F\,d\theta$ as $n \to \infty$. The conclusion of the theorem, $A = \int_a^b F\,d\theta$, follows from the squeeze principle. □

Example 10.14. Find the area enclosed by one loop of the four-leaf rose $r = \cos(2\theta)$.

Solution: Note that $r = 1$ when $\theta = 0$, which is the maximal value of $r$. The function $\cos(2\theta)$ has two roots $\theta = \pm\pi/4$ that are the nearest to $\theta = 0$. Hence, one loop corresponds to the interval $\theta \in [-\pi/4, \pi/4]$. The area is

\[
A = \frac12\int_{-\pi/4}^{\pi/4} \cos^2(2\theta)\,d\theta = \frac14\int_{-\pi/4}^{\pi/4} [1 + \cos(4\theta)]\,d\theta
  = \frac14\left[\theta + \tfrac14\sin(4\theta)\right]\Big|_{-\pi/4}^{\pi/4} = \frac{\pi}{8}.
\]

□
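The squeeze argument used in the proof of Theorem 10.3 can be watched at work on this example. The Python sketch below is an added illustration; the function names and the way the extrema are located are assumptions that rely on the shape of $\cos(2\theta)$ on $[-\pi/4, \pi/4]$. It computes the lower and upper sector sums $A_n^L$ and $A_n^U$ for one loop of the rose.

```python
import math

def f(theta):
    return math.cos(2.0 * theta)

def sector_sums(n, a=-math.pi / 4, b=math.pi / 4):
    """Lower and upper sums of sector areas (1/2) m_k^2 dtheta and (1/2) M_k^2 dtheta."""
    d = (b - a) / n
    lower = upper = 0.0
    for k in range(n):
        t0, t1 = a + k * d, a + (k + 1) * d
        # On [-pi/4, pi/4], cos(2*theta) increases up to theta = 0 and then decreases,
        # so the minimum is at an endpoint and the maximum is f(0) = 1 if 0 lies inside.
        m = min(f(t0), f(t1))
        M = 1.0 if t0 <= 0.0 <= t1 else max(f(t0), f(t1))
        lower += 0.5 * m * m * d
        upper += 0.5 * M * M * d
    return lower, upper

for n in (10, 100, 1000):
    print(n, sector_sums(n), math.pi / 8)
```

The two sums bracket $\pi/8$ and the gap between them shrinks as $n$ grows, in line with the proof.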

Let $D$ be a planar region that lies between two polar graphs $r = f(\theta)$ and $r = g(\theta)$ such that $f(\theta) \ge g(\theta) \ge 0$ if $\theta \in [a, b]$ and $0 < b - a \le 2\pi$; that is, $D$ is the set of points whose polar coordinates satisfy the inequalities

\[ D = \{(r, \theta) \mid 0 \le g(\theta) \le r \le f(\theta),\; a \le \theta \le b\}. \]

Then the area of $D$ is given by

\[ A = \frac12\int_a^b [f(\theta)]^2\,d\theta - \frac12\int_a^b [g(\theta)]^2\,d\theta = \frac12\int_a^b \left([f(\theta)]^2 - [g(\theta)]^2\right) d\theta. \]

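Example 10.15. Find the area of the region $D$ bounded by the cardioid $r = 1 + \sin\theta$ and the circle $r = 3/2$ that lies above the polar axis (in the first and second quadrants).

Solution: The polar graphs $r = 1 + \sin\theta = f(\theta)$ and $r = 3/2 = g(\theta)$ intersect when $f(\theta) = g(\theta)$, that is, $1 + \sin\theta = 3/2$ or $\sin\theta = 1/2$. Since the region $D$ lies in the first two quadrants, that is, $0 \le \theta \le \pi$, the values of $\theta$ for the points of intersection have to be chosen as $\theta = \pi/6 = a$ and $\theta = \pi - \pi/6 = b$. Therefore,

\[ D = \{(r, \theta) \mid 3/2 \le r \le 1 + \sin\theta,\; \pi/6 \le \theta \le 5\pi/6\}, \]

and hence the area of $D$ is

\[
\begin{aligned}
A &= \frac12\int_a^b \left[(1 + \sin\theta)^2 - \tfrac94\right] d\theta
   = \frac12\int_a^b \left[-\tfrac54 + 2\sin\theta + \sin^2\theta\right] d\theta \\
  &= \frac12\int_a^b \left[-\tfrac54 + 2\sin\theta + \tfrac12\big(1 - \cos(2\theta)\big)\right] d\theta
   = \frac12\int_a^b \left[-\tfrac34 + 2\sin\theta - \tfrac12\cos(2\theta)\right] d\theta \\
  &= \frac12\left[-\tfrac34\theta - 2\cos\theta - \tfrac14\sin(2\theta)\right]\Big|_{\pi/6}^{5\pi/6}
   = \frac{9\sqrt3 - 2\pi}{8}.
\end{aligned}
\]

□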

Remark. When finding points of intersection of two polar graphs, $r = f(\theta)$ and $r = g(\theta)$, by solving the equation $f(\theta) = g(\theta)$, one has to keep in mind that a single point has many representations as described in (10.6). So some of the pairs $(f(\theta), \theta)$, where $\theta$ ranges over solutions of the equation $f(\theta) = g(\theta)$, may correspond to the same point. To select distinct points, all pairs $(f(\theta), \theta)$ satisfying the intersection condition can be transformed by means of (10.6) so that $r \in [0, \infty)$ and $\theta \in [0, 2\pi)$. In this range of polar coordinates, there is a one-to-one correspondence between points on a plane and pairs $(r, \theta)$ with just one exception when $r = 0$; all the pairs $(0, \theta)$ correspond to the origin of the polar coordinate system.

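69.2. Arc Length. Suppose that a curve $C$ is traversed by the point $(r, \theta) = (f(\theta), \theta)$ only once as $\theta$ increases from $a$ to $b$. Choosing $\theta$ as a parameter, the curve is described by the parametric equations $x = r\cos\theta$, $y = r\sin\theta$, where $r = f(\theta)$. To find the arc length of $C$, one has to find the relation between the arc length differential $ds$ and $d\theta$. One has

\[ dx = \left(\frac{dr}{d\theta}\cos\theta - r\sin\theta\right) d\theta, \qquad dy = \left(\frac{dr}{d\theta}\sin\theta + r\cos\theta\right) d\theta. \]

Therefore,

\[
\begin{aligned}
ds^2 = dx^2 + dy^2 &= \left[\left(\frac{dr}{d\theta}\cos\theta - r\sin\theta\right)^2 + \left(\frac{dr}{d\theta}\sin\theta + r\cos\theta\right)^2\right] d\theta^2 \\
&= \left[\left(\frac{dr}{d\theta}\right)^2(\cos^2\theta + \sin^2\theta) + r^2(\cos^2\theta + \sin^2\theta)\right] d\theta^2 \\
&= \left[\left(\frac{dr}{d\theta}\right)^2 + r^2\right] d\theta^2.
\end{aligned}
\]

The arc length of the curve $C$ is

\[ L = \int_C ds = \int_a^b \frac{ds}{d\theta}\,d\theta = \int_a^b \sqrt{r^2 + \left(\frac{dr}{d\theta}\right)^2}\,d\theta, \]

where $r = f(\theta)$ and $b > a$.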

Example 10.16. Find the length of the cardioid $r = 1 + \sin\theta$.

Solution: One has

\[ r^2 + \left(\frac{dr}{d\theta}\right)^2 = (1 + \sin\theta)^2 + (\cos\theta)^2 = 2(1 + \sin\theta), \]

where the trigonometric identity $\sin^2\theta + \cos^2\theta = 1$ has been used. The cardioid is traversed once if $\theta \in [-\pi, \pi]$. Therefore, the length is

\[ L = \sqrt2\int_{-\pi}^{\pi} \sqrt{1 + \sin\theta}\,d\theta = 2\sqrt2\int_{-\pi/2}^{\pi/2} \sqrt{1 + \sin\theta}\,d\theta, \]

since the cardioid is symmetric about the vertical line (the $y$ axis). This integral can be evaluated by the substitution $u = 1 + \sin\theta \in [0, 2]$ so that $du = \cos\theta\,d\theta$, where $\cos\theta = \sqrt{1 - \sin^2\theta} = \sqrt{1 - (u - 1)^2} = \sqrt{u(2 - u)}$. Hence,

\[ L = 2\sqrt2\int_0^2 \frac{\sqrt{u}}{\sqrt{u(2 - u)}}\,du = 2\sqrt2\int_0^2 \frac{du}{\sqrt{2 - u}} = -4\sqrt2\,\sqrt{2 - u}\,\Big|_0^2 = 8. \]

□
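The polar arc length formula can be checked numerically on the cardioid. The following Python sketch is an added illustration, not the authors' method; `simpson` and `polar_arc_length` are hypothetical helper names. It integrates $\sqrt{r^2 + (dr/d\theta)^2}$ over $[-\pi, \pi]$ without using the substitution.

```python
import math

def simpson(f, a, b, n=1000):
    """Composite Simpson rule with an even number n of subintervals."""
    h = (b - a) / n
    total = f(a) + f(b)
    for k in range(1, n):
        total += (4 if k % 2 else 2) * f(a + k * h)
    return total * h / 3

def polar_arc_length(r, dr, a, b):
    """Integrate sqrt(r(theta)^2 + r'(theta)^2) over [a, b]."""
    return simpson(lambda th: math.hypot(r(th), dr(th)), a, b)

# Cardioid r = 1 + sin(theta), with dr/dtheta = cos(theta).
length = polar_arc_length(lambda th: 1.0 + math.sin(th), math.cos, -math.pi, math.pi)

print(length)  # should be close to 8
```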

69.3. Surface Area. If a surface is obtained by rotating a polar graph $r = f(\theta)$ about a line, then (10.8) can be used to find the area of the surface, where the distance $R(x, y)$ and the arc length differential $ds$ have to be expressed in polar coordinates with $r = f(\theta)$.

Example 10.17. Find the area of the surface obtained by rotating the cardioid $r = 1 + \sin\theta$ about its symmetry axis.

Solution: The symmetry axis of the cardioid is the $y$ axis. So the distance from the $y$ axis to a point $(x, y)$ is $R(x, y) = |x|$. The surface can be obtained by rotating the part of the cardioid that lies in the fourth and first quadrants, that is, $x \ge 0$ or $\theta \in [-\pi/2, \pi/2]$. Since $x = r\cos\theta$, the surface area is

\[ A = 2\pi\int_C |x|\,ds = 2\pi\int_{-\pi/2}^{\pi/2} r\cos\theta\,\frac{ds}{d\theta}\,d\theta = 2\pi\int_{-\pi/2}^{\pi/2} r\cos\theta\,\sqrt{r^2 + \left(\frac{dr}{d\theta}\right)^2}\,d\theta. \]

The derivative $ds/d\theta$ has been calculated in the previous example. Therefore,

\[
A = 2\pi\sqrt2\int_{-\pi/2}^{\pi/2} (1 + \sin\theta)^{3/2}\cos\theta\,d\theta = 2\pi\sqrt2\int_{-1}^{1} (1 + u)^{3/2}\,du
  = 2\pi\sqrt2\,\frac{(1 + u)^{5/2}}{5/2}\,\Big|_{-1}^{1} = \frac{32\pi}{5},
\]

where the substitution $u = \sin\theta$ has been made to evaluate the integral. □

69.4. Exercises.
In (1)–(4), sketch the curve and find the area that it encloses.

(1) $r = 4\cos(2\theta)$ (2) $r = a(1 + \cos\theta)$ (3) $r = 2 - \cos(2\theta)$

(4) $r^2 = 4\cos(2\theta)$

In (5)–(7), sketch the curve and find the area of one loop of the curve.

(5) $r = 9\sin(3\theta)$ (6) $r = 1 + 2\sin\theta$ (inner loop) (7) $r = 2\cos\theta - \sec\theta$

In (8) and (9), find the area of the region that lies inside the first curve and outside the second curve. Sketch the curves.

(8) $r = 2\sin\theta$, $r = 1$ (9) $r = 3\cos\theta$, $r = 1 + \cos\theta$

In (10)–(13), find the area of the region bounded by the curves. Sketch the region.

(10) $r = 2\theta$, $r = \theta$, $\theta \in [0, 2\pi]$
(11) $r^2 = \sin(2\theta)$, $r^2 = \cos(2\theta)$
(12) $r = 2a\sin\theta$, $r = 2b\cos\theta$, $a, b > 0$
(13) $r = 3 + 2\cos\theta$, $r = 3 + 2\sin\theta$

(14) Find the area inside the larger loop and outside the smaller loop of the limaçon $r = 1/2 - \cos\theta$.

In (15)–(17), sketch the curve and find its length.

(15) $r = 2a\sin\theta$ (16) $r = \theta$, $\theta \in [0, 2\pi]$ (17) $r = a + \cos\theta$, $a \ge 1$

In (18)–(21), find the area of the surface obtained by rotating the curve about the specified axis. Sketch the surface.
(18) $r = a > 0$ about a line through the origin.
(19) $r = 2a\cos\theta$, $a > 0$, (i) about the $y$ axis and (ii) about the $x$ axis.
(20) $r^2 = \cos(2\theta)$ about the polar axis.
(21) $\theta = a$ and $0 \le r \le R$, where $0 < a < \pi/2$ and $R > 0$, about the polar axis.


70. Conic Sections

Consider two intersecting lines in space, $L_1$ and $L_2$. A surface swept by the line $L_2$ when it is rotated about the line $L_1$ is a circular double cone. The line $L_1$ is the symmetry axis of the cone. The point of intersection of the lines is called the vertex of a cone. Any plane that does not pass through the vertex intersects the cone along a curve. It appears that all such curves fall into three types as shown in Figure 10.14. If the curve of intersection is a loop, then it is an ellipse. If the plane is parallel to the line $L_2$, then the curve is a parabola. If the plane is parallel to the axis of the cone, then the curve is a hyperbola. The curves of intersection of a plane and a cone are called conic sections, or conics. They have a purely geometrical description, which will be presented here.

Remark. A trajectory of any massive object in the solar system (e.g., comet, asteroid, planet) is a conic section, that is, a parabola, hyperbola, or ellipse. This fact follows from Newton’s Law of Gravity and will be proved in Calculus III.

Figure 10.14. Conic sections are curves that are intersections of a cone with various planes. The shape of a conic section depends on the orientation of the plane relative to the cone symmetry axis.

70.1. Parabolas. A parabola is the set of points in a plane that are equidistant from a fixed point $F$ (called the focus) and a fixed line (called the directrix). Let $P$ be a point in a plane. Consider the line through $P$ that is perpendicular to the directrix and let $Q$ be the point of their intersection. Then $P$ lies on a parabola if $|FP| = |QP|$. This condition is used to derive the equation of a parabola.

Figure 10.15. Left: Geometrical description of a parabola as a set of points $P$ in a plane that are equidistant from a fixed point $F$, called the focus, and a fixed line called the directrix (a horizontal line in the figure). Right: A circular paraboloid is the surface obtained by rotating a parabola about the line through its focus and perpendicular to its directrix.

A particularly simple equation of a parabola is obtained if the coordinate system is set so that the $y$ axis coincides with the line through the focus and perpendicular to the directrix. The origin $O$ is chosen so that $F = (0, p)$ and hence the parabola contains the origin $O$, while the directrix is the line $y = -p$ parallel to the $x$ axis (the origin is at distance $|p|$ from $F$ and from the directrix). If $P = (x, y)$, then $|FP| = \sqrt{x^2 + (y - p)^2}$, the point $Q$ has the coordinates $(x, -p)$, and $|PQ| = \sqrt{(y + p)^2}$. An equation of the parabola with focus $(0, p)$ and directrix $y = -p$ is

\[ |FP|^2 = |PQ|^2 \quad\Longrightarrow\quad x^2 + (y - p)^2 = (y + p)^2 \quad\Longleftrightarrow\quad x^2 = 4py. \]
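The defining property can be verified directly for points on $x^2 = 4py$. The short Python sketch below is an added illustration; the value of `p`, the sample abscissas, and the function name `distances` are assumptions. It compares the distance to the focus with the distance to the directrix.

```python
import math

p = 0.75  # illustrative value; focus (0, p), directrix y = -p

def distances(x):
    """Distances from the point (x, x^2/(4p)) to the focus and to the directrix."""
    y = x * x / (4.0 * p)
    dist_focus = math.hypot(x, y - p)
    dist_directrix = abs(y + p)
    return dist_focus, dist_directrix

for x in (-3.0, -1.0, 0.0, 0.5, 2.0):
    print(x, distances(x))  # the two distances in each pair should coincide
```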

In the 17th century, Galileo showed that the path of a projectile that is shot into the air at an angle to the ground is a parabola. The surface obtained by rotating a parabola about its symmetry axis is called a paraboloid. If a source of light is placed at the focus of a paraboloid mirror, then, after the reflection, the light forms a beam parallel to the symmetry axis. This fact is used to design flashlights, headlights, and so on. Conversely, a beam of light parallel to the symmetry axis of a paraboloid mirror will be concentrated at the focus after the reflection, which is used to design reflecting telescopes.

70.2. Ellipses. An ellipse is the set of points in a plane, the sum of whose distances from two fixed points $F_1$ and $F_2$ is a constant. The fixed points are called foci (plural of focus). Let $P$ be a point on a plane. Then $P$ belongs to an ellipse if $|PF_1| + |PF_2| = 2a$, where $a > 0$ is a constant (the factor 2 is chosen for convenience to be seen later). Evidently, $|F_1F_2| < 2a$; otherwise, no ellipse exists.

A particularly simple equation of an ellipse is obtained when the coordinate system is set so that the foci lie on the $x$ axis and have the coordinates $F_1 = (-c, 0)$ and $F_2 = (c, 0)$, where $|c| < a$. Let $P = (x, y)$ be a point in a plane. Then $|PF_1| = \sqrt{(x + c)^2 + y^2}$ and $|PF_2| = \sqrt{(x - c)^2 + y^2}$. The point $P$ is on an ellipse if

\[
\begin{aligned}
&|PF_1| + |PF_2| = 2a \quad\Longleftrightarrow\quad |PF_2| = 2a - |PF_1|, \\
&|PF_2|^2 = (2a - |PF_1|)^2 = 4a^2 - 4a|PF_1| + |PF_1|^2, \\
&16a^2|PF_1|^2 = \left(4a^2 + |PF_1|^2 - |PF_2|^2\right)^2.
\end{aligned}
\tag{10.9}
\]

These transformations serve only one purpose, that is, to get rid of the square roots. Note that now all the distances are squared. So $|PF_1|^2 - |PF_2|^2 = (x + c)^2 + y^2 - (x - c)^2 - y^2 = 4cx$. The substitution of the latter into the condition (10.9) yields

\[ 16a^2[(x + c)^2 + y^2] = (4a^2 + 4cx)^2 \quad\Longleftrightarrow\quad (a^2 - c^2)x^2 + a^2y^2 = a^2(a^2 - c^2). \]

By dividing both sides of this equation by $a^2(a^2 - c^2)$, an equation of an ellipse with foci $(\pm c, 0)$ becomes

\[ \frac{x^2}{a^2} + \frac{y^2}{b^2} = 1, \]

where $b^2 = a^2 - c^2$ so that $a \ge b > 0$. The ellipse intersects the $x$ axis at $(\pm a, 0)$ and the $y$ axis at $(0, \pm b)$ (called the vertices of an ellipse). The line segment joining the points $(\pm a, 0)$ is called the major axis. If the foci of an ellipse are located on the $y$ axis, then $x$ and $y$ are swapped in this equation, and the major axis lies on the $y$ axis. This shows that the restriction $a \ge b$ can be dropped in the ellipse equation. In particular, an ellipse becomes a circle of radius $a$ if $a = b$.
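The relation $b^2 = a^2 - c^2$ can be tested by computing the sum of focal distances for points on the ellipse. In the Python sketch below, which is an added illustration, the semi-axes `a` and `b`, the sample parameter values, and the use of the parameterization $x = a\cos t$, $y = b\sin t$ are choices made here.

```python
import math

a, b = 5.0, 3.0               # illustrative semi-axes
c = math.sqrt(a * a - b * b)  # foci at (-c, 0) and (c, 0); here c = 4

def focal_sum(t):
    """|PF1| + |PF2| for the point P = (a cos t, b sin t) on the ellipse."""
    x, y = a * math.cos(t), b * math.sin(t)
    return math.hypot(x + c, y) + math.hypot(x - c, y)

for t in (0.0, 0.7, 1.5708, 2.5, 4.0):
    print(t, focal_sum(t))  # each value should be close to 2a = 10
```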

One of Kepler’s laws is that the orbits of the planets in the solar system are ellipses with the Sun at one focus.

Figure 10.16. Left: An ellipse is the set of points in a plane, the sum of whose distances from two fixed points $F_1$ and $F_2$ (the foci) is a constant. Right: A circular ellipsoid is the surface obtained by rotating an ellipse about the line through its foci.

70.3. Hyperbolas. A hyperbola is the set of all points in a plane, the difference of whose distances from two fixed points $F_1$ and $F_2$ (the foci) is a constant. For any point $P$ on a hyperbola, $|PF_1| - |PF_2| = \pm 2a$ (as the difference of the distances can be negative). Let the foci be at $(\pm c, 0)$. Following the same procedure used to derive an equation of an ellipse, an equation of a hyperbola with foci $(\pm c, 0)$ is found to be

\[ \frac{x^2}{a^2} - \frac{y^2}{b^2} = 1, \]

where $c^2 = a^2 + b^2$. The details are left to the reader as an exercise. This equation shows that $x^2/a^2 \ge 1$ for any $y$, that is, $x \ge a$ or $x \le -a$. A hyperbola therefore has two branches. The branch in $x \le -a$ intersects the $x$ axis at $x = -a$, while the branch in $x \ge a$ does so at $x = a$. The points $(\pm a, 0)$ are called vertices. Furthermore, in the asymptotic region $|x| \to \infty$, a hyperbola has slant asymptotes $y = \pm(b/a)x$. Indeed,

\[ y = \pm b\sqrt{\frac{x^2}{a^2} - 1} = \pm\frac{b|x|}{a}\sqrt{1 - \frac{a^2}{x^2}} \approx \pm\frac{b|x|}{a}\left(1 - \frac{a^2}{2x^2}\right) \to \pm\frac{b|x|}{a} \]

as $|x| \to \infty$. Here the linearization $\sqrt{1 + u} \approx 1 + u/2$ has been used to obtain the asymptotic behavior for small $u = -a^2/x^2 \to 0$.


Figure 10.17. Left: A hyperbola is the set of all points in a plane, the difference of whose distances from two fixed points $F_1$ and $F_2$ (the foci) is a constant. Top right: A circular hyperboloid of one sheet is the surface obtained by rotating a hyperbola about the line through the midpoint of the segment $F_1F_2$ and perpendicular to it (the vertical line in the left panel). Bottom right: A circular hyperboloid of two sheets is the surface obtained by rotating a hyperbola about the line through its foci (the horizontal line in the left panel).

If the foci of a hyperbola are on the $y$ axis, then, by reversing the roles of $x$ and $y$, it follows that the hyperbola

\[ \frac{y^2}{a^2} - \frac{x^2}{b^2} = 1 \]

has foci $(0, \pm c)$, where $c^2 = a^2 + b^2$, vertices $(0, \pm a)$, and slant asymptotes $y = \pm(a/b)x$.

70.4. Shifted Conics. Consider a curve defined by a quadratic Cartesian equation

\[ Ay^2 + Bx^2 + \alpha y + \beta x + \gamma = 0. \]

Suppose that $A \neq 0$ and $B \neq 0$. By completing the squares, this equation can be transformed to the standard form

\[ A\left(y + \frac{\alpha}{2A}\right)^2 + B\left(x + \frac{\beta}{2B}\right)^2 = \frac{\alpha^2}{4A} + \frac{\beta^2}{4B} - \gamma = d \]

or

\[ \frac{(y - y_0)^2}{d/A} + \frac{(x - x_0)^2}{d/B} = 1, \]

where $x_0 = -\beta/(2B)$ and $y_0 = -\alpha/(2A)$, provided $d \neq 0$. Depending on the signs of $A/d$ and $B/d$, this equation describes either an ellipse or a hyperbola as if the origin was moved to the point $(x_0, y_0)$. If $A/d$ and $B/d$ are both negative, then the equation has no solution. If either $A$ or $B$ vanishes, but not both, then the quadratic Cartesian equation describes a parabola (the details are left to the reader as an exercise). If $A = B = 0$, then the equation describes a straight line. If $d = 0$, solutions of the equation form a set of two straight lines, $y - y_0 = \pm\sqrt{-B/A}\,(x - x_0)$, through the point $(x_0, y_0)$, provided $AB < 0$. When solutions of the Cartesian equation form a hyperbola ($d \neq 0$, $AB < 0$), these lines are its slant asymptotes.
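The case analysis for $A \neq 0$, $B \neq 0$ can be written as a small routine. The Python sketch below is an added illustration, not part of the original text. It assumes that completing the squares gives the center $x_0 = -\beta/(2B)$, $y_0 = -\alpha/(2A)$ and $d = \alpha^2/(4A) + \beta^2/(4B) - \gamma$; the function name `classify` is hypothetical, and the label for $d = 0$, $AB > 0$ (a single point) is a case the text does not discuss. Exact rational arithmetic is used so that the test $d = 0$ is reliable.

```python
from fractions import Fraction

def classify(A, B, alpha, beta, gamma):
    """Classify A*y^2 + B*x^2 + alpha*y + beta*x + gamma = 0 for A != 0 and B != 0."""
    A, B, alpha, beta, gamma = map(Fraction, (A, B, alpha, beta, gamma))
    if A == 0 or B == 0:
        raise ValueError("this sketch assumes A != 0 and B != 0")
    # Completing the squares: A(y - y0)^2 + B(x - x0)^2 = d.
    x0 = -beta / (2 * B)
    y0 = -alpha / (2 * A)
    d = alpha**2 / (4 * A) + beta**2 / (4 * B) - gamma
    if d == 0:
        # AB > 0 with d = 0 leaves only the point (x0, y0); not covered in the text.
        kind = "two lines" if A * B < 0 else "single point"
    elif A / d > 0 and B / d > 0:
        kind = "ellipse"
    elif A / d < 0 and B / d < 0:
        kind = "no solution"
    else:
        kind = "hyperbola"
    return kind, (x0, y0), d

# 9x^2 - 18x + 4y^2 = 27, i.e. 4y^2 + 9x^2 - 18x - 27 = 0
print(classify(4, 9, 0, -18, -27))
# y^2 - 4x^2 + 2y + 16x + 3 = 0
print(classify(1, -4, 2, 16, 3))
```

For the first equation the routine reports an ellipse centered at $(1, 0)$ with $d = 36$, and for the second a hyperbola centered at $(2, -1)$ with $d = -18$.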

70.5. Conic Sections in Polar Coordinates. The following theorem offers a uniform description of conic sections.

Theorem 10.4 (Conic Sections). Let $F$ be a fixed point (called the focus) and $L$ be a fixed line (called the directrix) in a plane. Let $e$ be a fixed positive number (called the eccentricity). The set of points $P$ in the plane for which the ratio of the distance from $F$ to the distance from $L$ is the constant $e$,

\[ e = \frac{|PF|}{|PL|}, \]

is a conic section. The conic is

(1) an ellipse if $e < 1$
(2) a parabola if $e = 1$
(3) a hyperbola if $e > 1$

Proof. Set the coordinate system so that $F$ is at the origin and the directrix is parallel to the $y$ axis and $d$ units to the right. Thus, the directrix has the equation $x = d > 0$ and is perpendicular to the polar axis. If the point $P$ has polar coordinates $(r, \theta)$ and rectangular coordinates $(x, y)$, then $|PF| = r = \sqrt{x^2 + y^2}$ and $|PL| = d - x = d - r\cos\theta$. The condition $|PF| = e|PL|$ yields the equation $r = e(d - x)$. By squaring it, one infers a quadratic Cartesian equation

\[ x^2 + y^2 = e^2(d - x)^2 \quad\Longleftrightarrow\quad (1 - e^2)x^2 + y^2 + 2e^2d\,x - e^2d^2 = 0, \]

which has been investigated in the preceding section. If $e = 1$, then the equation describes the shifted parabola $y^2 = -2d(x - d/2)$. When $e \neq 1$, by completing the squares, this equation is brought to the standard form

\[ \left(x + \frac{e^2d}{1 - e^2}\right)^2 + \frac{y^2}{1 - e^2} = \frac{e^2d^2}{(1 - e^2)^2}. \]

If $e < 1$, then all the coefficients are positive, and the equation describes a shifted ellipse

\[ \frac{(x - x_0)^2}{a^2} + \frac{y^2}{b^2} = 1, \qquad a^2 = \frac{b^2}{1 - e^2}, \quad b^2 = \frac{e^2d^2}{1 - e^2}, \quad x_0 = \frac{e^2d}{e^2 - 1} = -c, \]

where $c$ is the distance from the center of the ellipse to each of its foci, $c^2 = a^2 - b^2$ (one focus is at the origin). The eccentricity is then $e = c/a$. Similarly, if $e > 1$, then the coefficients have opposite signs, and the equation describes a shifted hyperbola

\[ \frac{(x - x_0)^2}{a^2} - \frac{y^2}{b^2} = 1, \qquad e = \frac{c}{a}, \quad c^2 = a^2 + b^2. \]

□

At the beginning of the proof, the polar equation for conic sections was given as $r = e(d - r\cos\theta)$. If the directrix is chosen to be to the left of the focus as $x = -d$, then $\cos\theta$ is replaced by $-\cos\theta$ in the polar equation. If the directrix is chosen to be parallel to the polar axis as $y = \pm d$, then the conic sections are $r = e(d \mp y) = e(d \mp r\sin\theta)$. These equations can be solved for $r$ to obtain conic sections as polar graphs.

Corollary 10.5 (Conics in Polar Coordinates). A polar equation of the form

\[ r = \frac{ed}{1 \pm e\cos\theta} \qquad\text{or}\qquad r = \frac{ed}{1 \pm e\sin\theta} \]

represents a conic section of eccentricity $e$. The conic section is an ellipse if $e < 1$, a parabola if $e = 1$, and a hyperbola if $e > 1$.
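The focus-directrix property behind the polar equation can be checked numerically. The Python sketch below is an added illustration; it takes the form $r = ed/(1 + e\cos\theta)$ with the focus at the origin and the directrix $x = d$, and the values of `d`, the sample eccentricities, and the sample angles are assumptions. It computes the ratio $|PF|/|PL|$ for several points.

```python
import math

def conic_point(e, d, theta):
    """Cartesian point on r = e*d / (1 + e*cos(theta)); the focus is at the origin."""
    r = e * d / (1.0 + e * math.cos(theta))
    return r * math.cos(theta), r * math.sin(theta)

def ratio(e, d, theta):
    """|PF| / |PL| for the focus F at the origin and the directrix L: x = d."""
    x, y = conic_point(e, d, theta)
    return math.hypot(x, y) / abs(d - x)

d = 3.0  # illustrative distance from the focus to the directrix
for e in (0.5, 1.0, 2.0):            # ellipse, parabola, hyperbola
    for theta in (0.3, 1.0, 2.0, 4.0):
        print(e, theta, ratio(e, d, theta))  # each ratio should equal e
```

For $e > 1$ some angles give negative $r$; the corresponding points lie on the second branch of the hyperbola and satisfy the same ratio.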

70.6. Exercises.
In (1)–(9), classify the conic section. Find the vertices, foci (or focus), directrix, and asymptotes (if the curve is a hyperbola). Sketch the curve.

(1) $y^2 = 16x$ (2) $x^2 = -4y$ (3) $y + 12x - 2x^2 = 18$

(4) $x^2 + 4y^2 = 16$ (5) $9x^2 - 18x + 4y^2 = 27$

(6) $x^2 + 3y^2 + 2x - 12y + 10 = 0$ (7) $4x^2 - 9y^2 = 36$

(8) $y^2 - 2y = 4x^2 + 3$ (9) $y^2 - 4x^2 + 2y + 16x + 3 = 0$

(10) A long-range radio navigation system uses two radio stations, located at points $A$ and $B$ along the coastline, that transmit simultaneous signals to a ship located at point $P$ in the sea. The onboard computer converts the time difference in receiving these signals into a distance difference $|PA| - |PB|$. This locates the ship on one branch of a hyperbola. Suppose that station $B$ is located $D$ miles from station $A$. A ship receives the signal from $B$ $\tau$ microseconds (µs) before it receives the signal from $A$. The signal travels with the speed of light, $c = 980$ ft/µs. How far off the coastline is the ship? If the coordinate system is set so that the line $AB$ coincides with the $x$ axis and $A$ is at the origin, find the coordinates of the ship as functions of $\tau$.

In (11)–(13), classify the conic section. Find the eccentricity and an equation of the directrix, and sketch the conic.

(11) $r = \dfrac{8}{4 + \sin\theta}$ (12) $r = \dfrac{10}{2 - 5\cos\theta}$ (13) $r = \dfrac{1}{3 + 3\cos\theta}$

(14) Show that the conic sections $r = a/(1 - \cos\theta)$ and $r = b/(1 + \cos\theta)$ intersect at right angles.
(15) The orbit of Halley’s Comet, last seen in 1986 and due to return in 2061, is an ellipse with eccentricity 0.97 and one focus at the Sun. The length of its major axis is 36.18 AU. An astronomical unit (AU) is the mean distance between the Earth and the Sun, about 93 million miles. Find a polar equation for the orbit of Halley’s Comet. What are the maximal and minimal distances from the comet to the Sun?
