light and matter - textbookrevolution.orgtextbookrevolution.org/files/crowell-calc.pdf · 5...

88
1

Upload: phungnguyet

Post on 07-Feb-2018

213 views

Category:

Documents


0 download

TRANSCRIPT

1

2

Light and Matter

Fullerton, Californiawww.lightandmatter.com

copyright 2005 Benjamin Crowell

rev. December 31, 2005

This book is licensed under the Creative Com-mons Attribution-ShareAlike license, version 1.0,http://creativecommons.org/licenses/by-sa/1.0/, exceptfor those photographs and drawings of which I am notthe author, as listed in the photo credits. If you agreeto the license, it grants you certain privileges that youwould not otherwise have, such as the right to copy thebook, or download the digital version free of charge fromwww.lightandmatter.com. At your option, you may also copythis book under the GNU Free Documentation License version1.2, http://www.gnu.org/licenses/fdl.txt, with no invariantsections, no front-cover texts, and no back-cover texts.

3

1 Rates of Change1.1 Change in discrete steps 7

Two sides of the same coin,7.—Some guesses, 9.

1.2 Continuous change . . 10A derivative, 12.—Propertiesof the derivative, 13.—Higher-order polynomials,14.—The second derivative,14.—Maxima and minima,16.

Problems. . . . . . . . 18

2 To infinity — andbeyond!2.1 Infinitesimals. . . . . 212.2 Safe use of infinitesimals 242.3 The product rule . . . 282.4 The chain rule . . . . 302.5 Exponentials andlogarithms . . . . . . . 31

The exponential, 31.—Thelogarithm, 33.

2.6 Quotients . . . . . . 342.7 Differentiation on acomputer . . . . . . . . 352.8 Continuity . . . . . . 38Problems. . . . . . . . 39

3 Integration3.1 Definite and indefiniteintegrals . . . . . . . . 413.2 The fundamental theoremof calculus . . . . . . . 443.3 Properties of the integral 453.4 Applications . . . . . 46

Averages, 46.—Work, 47.

Problems. . . . . . . . 48

4 Techniques andapplications4.1 Newton’s method . . . 514.2 Implicit differentiation . 524.3 Taylor series . . . . . 534.4 Methods of integration . 58

Change of variable, 58.

4.5 Integration by parts . . 604.6 Partial fractions. . . . 61

5 Complex numbertechniques5.1 Review of complexnumbers . . . . . . . . 655.2 Euler’s formula . . . . 685.3 Partial fractions revisited 70

6 Improper integrals6.1 Integrating a function thatblows up . . . . . . . . 736.2 Limits of integration atinfinity . . . . . . . . . 74

7 Iterated integrals7.1 Integrals inside integrals 757.2 Applications . . . . . 77

A Answers to self-checks 79

B Detours 81

C Photo Credits 87

4

5

Calculus isn’t a hard subject.

Algebra is hard. I still remem-ber my encounter with algebra. Itwas my first taste of abstraction inmathematics, and it gave me quitea few black eyes and bloody noses.

Geometry is hard. For most peo-ple, geometry is the first time theyhave to do proofs using formal, ax-iomatic reasoning.

I teach physics for a living. Physicsis hard. There’s a reason that peo-ple believed Aristotle’s bogus ver-sion of physics for centuries: it’sbecause the real laws of physics arecounterintuitive.

Calculus, on the other hand, is avery straightforward subject thatrewards intuition, and can be eas-ily visualized. Silvanus Thompson,author of one of the most popularcalculus texts ever written, opinedthat “considering how many foolscan calculate, it is surprising thatit should be thought either a diffi-cult or a tedious task for any otherfool to master the same tricks.”

Since I don’t teach calculus, I can’trequire anyone to read this book.For that reason, I’ve written it sothat you can go through it andget to the dessert course with-out having to eat too many Brus-sels sprouts and Lima beans alongthe way. The development of anymathematical subject involves alarge number of boring details thathave little to do with the mainthrust of the topic. These details

I’ve relegated to a chapter in theback of the book, and the readerwho has an interest in mathemat-ics as a career — or who enjoys anice heavy pot roast before movingon to dessert — will want to readthose details when the main textsuggests the possibility of a detour.

6

1 Rates of Change1.1 Change in

discrete stepsToward the end of the eighteenthcentury, a German elementaryschool teacher decided to keep hispupils busy by assigning them along, boring arithmetic problem.To oversimplify a little bit (whichis what textbook authors alwaysdo when they tell you about his-tory), I’ll say that the assignmentwas to add up all the numbersfrom one to a hundred. The chil-dren set to work on their slates,and the teacher lit his pipe, con-fident of a long break. But al-most immediately, a boy namedCarl Friedrich Gauss brought uphis answer: 5,050.

a / Adding the numbersfrom 1 to 7.

Figure a suggests one way of solv-ing this type of problem. Thefilled-in columns of the graph rep-resent the numbers from 1 to 7,and adding them up means find-

b / A trick for finding thesum.

ing the area of the shaded region.Roughly half the square is shadedin, so if we want only an approxi-mate solution, we can simply cal-culate 72/2 = 24.5.

But, as suggested in figure b, it’snot much more work to get an ex-act result. There are seven saw-teeth sticking out out above the di-agonal, with a total area of 7/2,so the total shaded area is (72 +7)/2 = 28. In general, the sum ofthe first n numbers will be (n2 +n)/2, which explains Gauss’s re-sult: (1002 + 100)/2 = 5, 050.

Two sides of the same coin

Problems like this come up fre-quently. Imagine that each house-hold in a certain small town sendsa total of one ton of garbage to thedump every year. Over time, thegarbage accumulates in the dump,taking up more and more space.

7

8 CHAPTER 1. RATES OF CHANGE

c / Carl Friedrich Gauss(1777-1855), a long timeafter graduating from ele-mentary school.

Let’s label the years as n = 1, 2,3, . . ., and let the function1 x(n)represent the amount of garbagethat has accumulated by the endof year n. If the population isconstant, say 13 households, thengarbage accumulates at a constantrate, and we have x(n) = 13n.

But maybe the town’s populationis growing. If the population startsout as 1 household in year 1, andthen grows to 2 in year 2, and soon, then we have the same kindof problem that the young Gausssolved. After 100 years, the accu-mulated amount of garbage will be5,050 tons. The pile of refuse growsmore and more every year; the rateof change of x is not constant. Tab-ulating the examples we’ve done sofar, we have this:

1Recall that when x is a function, thenotation x(n) means the output of thefunction when the input is n. It doesn’trepresent multiplication of a number x bya number n.

rate of change accumulatedresult

13 13nn (n2 + n)/2

The rate of change of the functionx can be notated as x. Given thefunction x, we can always deter-mine the function x for any valueof n by doing a running sum.

Likewise, if we know x, we can de-termine x by subtraction. In theexample where x = 13n, we canfind x = x(n) − x(n − 1) = 13n −13(n − 1) = 13. Or if we knewthat the accumulated amount ofgarbage was given by (n2 + n)/2,we could calculate the town’s pop-ulation like this:

n2 + n

2− (n− 1)2 + (n− 1)

2

=n2 + n−

(n2 + 2n− 1− n + 1

)2

= n

30

20

10

7 6 5 4 3 2 1n

x

d / x is the slope of x .

The graphical interpretation ofthis is shown in figure d: on a

1.1. CHANGE IN DISCRETE STEPS 9

graph of x = (n2 + n)/2, the slopeof the line connecting two succes-sive points is the value of the func-tion x.

In other words, the functions x andx are like different sides of the samecoin. If you know one, you can findthe other — with two caveats.

First, we’ve been assuming im-plicitly that the function x startsout at x(0) = 0. That mightnot be true in general. For in-stance, if we’re adding water to areservoir over a certain period oftime, the reservoir probably didn’tstart out completely empty. Thus,if we know x, we can’t find outeverything about x without somefurther information: the startingvalue of x. If someone tells youx = 13, you can’t conclude x =13n, but only x = 13n+ c, where cis some constant. There’s no suchambiguity if you’re going the op-posite way, from x to x. Even ifx 6= 0, we still have x = 13n + c−[13(n− 1) + c] = 13.

Second, it may be difficult, or evenimpossible, to find a formula forthe answer when we want to de-termine the running sum x givena formula for the rate of change x.Gauss had a flash of insight thatled him to the result (n2 + n)/2,but in general we might only beable to use a computer spreadsheetto calculate a number for the run-ning sum, rather than an equationthat would be valid for all valuesof n.

Some guesses

Even though we lack Gauss’s ge-nius, we can recognize certain pat-terns. One pattern is that if x is afunction that gets bigger and big-ger, it seems like x will be a func-tion that grows even faster thanx. In the example of x = n andx = (n2+n)/2, consider what hap-pens for a large value of n, like100. At this value of n, x = 100,which is pretty big, but even with-out pawing around for a calculator,we know that x is going to turn outreally really big. Since n is large,n2 is quite a bit bigger than n, soroughly speaking, we can approxi-mate x ≈ n2/2 = 5, 000. 100 maybe a big number, but 5,000 is a lotbigger. Continuing in this way, forn = 1000 we have x = 1000, butx ≈ 500, 000 — now x has far out-stripped x. This can be a fun gameto play with a calculator: look atwhich functions grow the fastest.For instance, your calculator mighthave an x2 button, an ex button,and a button for x! (the factorialfunction, defined as x! = 1·2·. . .·x,e.g., 4! = 1 · 2 · 3 · 4 = 24). You’llfind that 502 is pretty big, but e50

is incomparably greater, and 50! isso big that it causes an error.

All the x and x functions we’veseen so far have been polynomials.If x is a polynomial, then of coursewe can find a polynomial for x aswell, because if x is a polynomial,then x(n)−x(n−1) will be one too.It also looks like every polynomial

10 CHAPTER 1. RATES OF CHANGE

we could choose for x might alsocorrespond to an x that’s a poly-nomial. And not only that, but itlooks as though there’s a patternin the power of n. Suppose x is apolynomial, and the highest powerof n it contains is a certain num-ber — the “order” of the polyno-mial. Then x is a polynomial ofthat order minus one. Again, it’sfairly easy to prove this going oneway, passing from x to x, but moredifficult to prove the opposite rela-tionship: that if x is a polynomialof a certain order, then x must bea polynomial with an order that’sgreater by one.

We’d imagine, then, that the run-ning sum of x = n2 would be apolynomial of order 3. If we cal-culate x(100) = 12 + 22 + . . . +1002 on a computer spreadsheet,we get 338,350, which looks sus-piciously close to 1, 000, 000/3. Itlooks like x(n) = n3/3+ . . ., wherethe dots represent terms involvinglower powers of n such as n2.

1.2 Continuouschange

Did you notice that I sneakedsomething past you in the exampleof water filling up a reservoir? Thex and x functions I’ve been usingas examples have all been functionsdefined on the integers, so theyrepresent change that happens indiscrete steps, but the flow of waterinto a reservoir is smooth or con-tinuous. Or is it? Water is made

e / Isaac Newton (1643-1727)

out of molecules, after all. It’s justthat water molecules are so smallthat we don’t notice them as in-dividuals. Figure f shows a graphthat is discrete, but almost ap-pears continous because the scalehas been chosen so that the pointsblend together visually.

500

400

300

200

100

30 20 10n

x

f / On this scale, thegraph of (n2 + n)/2 ap-pears almost continuous.

The physicist Isaac Newton startedthinking along these lines in the1660’s, and figured out ways of an-alyzing x and x functions that were

1.2. CONTINUOUS CHANGE 11

truly continuous. The notation xis due to him (and he only used itfor continuous functions). Becausehe was dealing with the continousflow of change, he called his newset of mathematical techniques themethod of fluxions, but nowadaysit’s known as the calculus.

2

1

2 1t

x

g / The function x(t) =t2/2, and its tangent lineat the point (1, 1/2).

Newton was a physicist, and heneeded to invent the calculus aspart of his study of how objectsmove. If an object is moving inone dimension, we can specify itsposition with a variable x, and xwill then be a function of time, t.The rate of change of its position,x, is its speed, or velocity. Ear-lier experiments by Galileo had es-tablished that when a ball rolleddown a slope, its position was pro-portional to t2, so Newton inferredthat a graph like figure g wouldbe typical for any object movingunder the influence of a constantforce. (It could be 7t2, or t2/42,or anything else proportional to t2,depending on the force acting onthe object and the object’s mass.)

2

1

2 1t

x

h / This line isn’t a tan-gent line: it crosses thegraph.

Because the functions are continu-ous, not discrete, we can no longerdefine the relationship between xand x by saying x is a running sumof x’s, or that x is the difference be-tween two successive x’s. But wealready found a geometrical rela-tionship between the two functionsin the discrete case, and that canserve as our definition for the con-tinuous case: x is the area underthe graph of x, or, if you like, x isthe slope of the tangent line on thegraph of x. For now we’ll concen-trate on the slope idea.

The tangent line is defined as theline that passes through the graphat a certain point, but, unlike theone in figure h, doesn’t cut acrossthe graph.2 By measuring witha ruler on figure g, we find thatthe slope is very close to 1, so evi-dently x(1) = 1. To prove this, weconstruct the function representingthe line: `(t) = t − 1/2. We wantto prove that this line doesn’t crossthe graph of x(t) = t2/2. The dif-

2For a more formal definition, seepage 81.

12 CHAPTER 1. RATES OF CHANGE

ference between the two functions,x− `, is the polynomial t2/2− t +1/2, and this polynomial will bezero for any value of t where theline touches or crosses the curve.We can use the quadratic formulato find these points, and the resultis that there is only one of them,which is t = 1. Since x− ` is posi-tive for at least some points to theleft and right of t = 1, and it onlyequals zero at t = 1, it must neverbe negative, which means that theline always lies below the curve,never crossing it.

A derivative

That proves that x(1) = 1, butit was a lot of work, and we stilldon’t want to that amount of workto evaluate x at every value of t.There’s a way to avoid all that,and find a formula for x. Com-pare figures g and i. They’re bothgraphs of the same function, andthey both look the same. What’sdifferent? The only difference isthe scales: in figure i, the t axishas been shrunk by a factor of 2,and the x axis by a factor of 4. Thegraph looks the same, because dou-bling t quadruples t2/2. The tan-gent line here is the tangent lineat t = 2, not t = 1, and althoughit looks like the same line as theone in figure g, it isn’t, because thescales are different. The line in fig-ure g had a slope of rise/run =1/1 = 1, but this one’s slope is4/2 = 2. That means x(2) = 2.

In general, this scaling argumentshows that x(t) = t for any t.

8 7 6 5 4 3 2 1

4 3 2 1t

x

i / The function t2/2again. How is thisdifferent from figure g?

This is called differenting : findinga formula for the function x, givena formula for the function x. Theterm comes from the idea that fora discrete function, the slope is thedifference between two successivevalues of the function. The func-tion x is referred to as the deriva-tive of the function x, and the artof differentiating is differential cal-culus. The opposite process, com-puting a formula for x when givenx, is called integrating, and makesup the field of integral calculus;this terminology is based on theidea that computing a running sumis like putting together (integrat-ing) many little pieces.

Note the similarity between this re-sult for continuous functions,

x = t2/2 x = t ,

and our earlier result for discreteones,

x = (n2 + n)/2 x = n .

1.2. CONTINUOUS CHANGE 13

The similarity is no coincidence.A continuous function is just asmoothed-out version of a discreteone. For instance, the continuousversion of the staircase functionshown in figure b on page 7 wouldsimply be a triangle without thesaw teeth sticking out; the area ofthose ugly sawteeth is what’s rep-resented by the n/2 term in the dis-crete result x = (n2 + n)/2, whichis the only thing that makes it dif-ferent from the continuous resultx = t2/2.

Properties of the derivative

It follows immediately from thedefinition of the derivative thatmultiplying a function by a con-stant multiplies its derivative bythe same constant, so for examplesince we know that the derivativeof t2/2 is t, we can immediately tellthat the derivative of t2 is 2t, andthe derivative of t2/17 is 2t/17.

Also, if we add two functions, theirderivatives add. To give a goodexample of this, we need to haveanother function that we can dif-ferentiate, one that isn’t just somemultiple of t2. An easy one is t:the derivative of t is 1, since theslope of the graph of x = t is a linewith a slope of 1, and the tangentline lies right on top of the originalline.

The derivative of a constant iszero, since a constant function’sgraph is a horizontal line, with a

slope of zero.

Example 1The derivative of 5t2 +2t is the deriva-tive of 5t2 plus the derivative of 2t ,since derivatives add. The derivativeof 5t2 is 5 times the derivative of t2,and the derivative of 2t is 2 times thederivative of t , so putting everythingtogether, we find that the derivative of5t2 + 2t is (5)(2t) + (2)(1) = 10t + 2.

Example 2. An insect pest from the United

States is inadvertently released in avillage in rural China. The pestsspread outward at a rate of s kilome-ters per year, forming a widening cir-cle of contagion. Find the number ofsquare kilometers per year that be-come newly infested. Check that theunits of the result make sense. Inter-pret the result.

. Let t be the time, in years, sincethe pest was introduced. The radiusof the circle is r = st , and its area isa = πr 2 = π(st)2. To make this looklike a polynomial, we have to rewritethis as a = (πs2)t2. The derivative is

a = (πs2)(2t)

a = (2πs2)t

The units of s are km/year, so squar-ing it gives km2/year2. The 2 and theπ are unitless, and multiplying by tgives units of km2/year, which is whatwe expect for a, since it represents thenumber of square kilometers per yearthat become infested.

Interpreting the result, we notice acouple of things. First, the rate ofinfestation isn’t constant; it’s propor-tional to t , so people might not pay

14 CHAPTER 1. RATES OF CHANGE

so much attention at first, but later onthe effort required to combat the prob-lem will grow more and more quickly.Second, we notice that the result isproportional to s2. This suggests thatanything that could be done to reduces would be very helpful. For instance,a measure that cut s in half would re-duce a by a factor of four.

Higher-order polynomials

So far, we have the following re-sults for polynomials up to order2:

function derivative1 0t 1t2 2t

Interpreting 1 as t0, we detect whatseems to be a general rule, whichis that the derivative of tk is ktk−1.The proof is straightforward butnot very illuminating if carried outwith the methods developed in thischapter, so I’ve relegated it to page81. It can be proved much moreeasily using the methods of chap-ter 2.

Example 3. If x = 2t7 − 4t + 1, find x .

. This is similar to example 1, the onlydifference being that we can now han-dle higher powers of t . The derivativeof t7 is 7t6, so we have

x = (2)(7t6) + (−4)(1) + 0

= 14t6 +−4

The second derivative

I described how Galileo and New-ton found that an object subjectto an external force, starting fromrest, would have a velocity x thatwas proportional to t, and a posi-tion x that varied like t2. The pro-portionality constant for the veloc-ity is called the acceleration, a, sothat x = at and x = at2/2. Forexample, a sports car acceleratingfrom a stop sign would have a largeacceleration, and its velocity at ata given time would therefore bea large number. The accelerationcan be thought of as the deriva-tive of the derivative of x, writ-ten x, with two dots. In our ex-ample, x = a. In general, the ac-celeration doesn’t need to be con-stant. For example, the sports carwill eventually have to stop accel-erating, perhaps because the back-ward force of air friction becomesas great as the force pushing it for-ward. The total force acting on thecar would then be zero, and the carwould continue in motion at a con-stant speed.

Example 4Suppose the pilot of a blimp has justturned on the motor that runs its pro-peller, and the propeller is spinningup. The resulting force on the blimpis therefore increasing steadily, andlet’s say that this causes the blimp tohave an acceleration x = 3t , which in-creases steadily with time. We wantto find the blimp’s velocity and positionas functions of time.

For the velocity, we need a polynomial

1.2. CONTINUOUS CHANGE 15

whose derivative is 3t . We know thatthe derivative of t2 is 2t , so we need touse a function that’s bigger by a factorof 3/2: x = (3/2)t2. In fact, we couldadd any constant to this, and make itx = (3/2)t2 + 14, for example, wherethe 14 would represent the blimp’sinitial velocity. But since the blimphas been sitting dead in the air un-til the motor started working, we canassume the initial velocity was zero.Remember, any time you’re workingbackwards like this to find a functionwhose derivative is some other func-tion (integrating, in other words), thereis the possibility of adding on a con-stant like this.

Finally, for the position, we needsomething whose derivative is (3/2)t2.The derivative of t3 would be 3t2, sowe need something half as big as this:x = t3/2.

The second derivative can be in-

8 7 6 5 4 3 2 1

-1 2 1-1-2

t

x

j / The functions 2t , t2

and 7t2.

terpreted as a measure of the cur-vature of the graph, as shown infigure j. The graph of the functionx = 2t is a line, with no curvature.Its first derivative is 2, and its sec-ond derivative is zero. The func-tion t2 has a second derivative of 2,

and the more tightly curved func-tion 7t2 has a bigger second deriva-tive, 14.

4

3

2

1

-1 2 1-1-2

t

x

k / The functions t2 and3− t2.

Positive and negative signs of thesecond derivative indicate concav-ity. In figure k, the function t2 islike a cup with its mouth pointingup. We say that it’s “concave up,”and this corresponds to its posi-tive second derivative. The func-tion 3−t2, with a second derivativeless than zero, is concave down.Another way of saying it is that ifyou’re driving along a road shapedlike t2, going in the direction of in-creasing t, then your steering wheelis turned to the left, whereas on aroad shaped like 3 − t2 it’s turnedto the right.

Figure l shows a third possibility.The function t3 has a derivative3t2, which equals zero at t = 0.This called a point of inflection.The concavity of the graph is downon the left, up on the right. The

16 CHAPTER 1. RATES OF CHANGE

8 6 4 2

-2-4-6-8

2 1-1-2t

x

l / The functions t3 hasan inflection point at t =0.

inflection point is where it switchesfrom one concavity to the other. Inthe alternative description in termsof the steering wheel, the inflectionpoint is where your steering wheelis crossing from left to right.

Maxima and minima

When a function goes up and thensmoothly turns around and comesback down again, it has zero slopeat the top. A place where x = 0,then, could represent a place wherex was at a maximum. On the otherhand, it could be concave up, inwhich case we’d have a minimum.

Example 5. Fred receives a mysterious e-mail tiptelling him that his investment in a cer-tain stock will have a value given byx = −2t4 + (6.4577 × 1010)t , wheret ≥ 2005 is the year. Should he sell atsome point? If so, when?

. If the value reaches a maximum atsome time, then the derivative shouldbe zero then. Taking the derivative

and setting it equal to zero, we have

0 = −8t3 + 6.4577× 1010

t =„

6.4577× 1010

8

«1/3

t = ±2006.0 .

Obviously the solution at t = −2006.0is bogus, since the stock market didn’texist four thousand years ago, and thetip only claimed the function would bevalid for t ≥ 2005.

Should Fred sell on New Year’s eve of2006?

But this could be a maximum, a mini-mum, or an inflection point. Fred defi-nitely does not want to sell at t = 2006if it’s a minimum! To check which ofthe three possibilities hold, Fred takesthe second derivative:

x = −24t2 .

Plugging in t = 2006.0, we find thatthe second derivative is negative atthat time, so it is indeed a maximum.

Implicit in this whole discussionwas the assumption that the maxi-mum or minimum where the func-tion was smooth. There are someother possibilities.

In figure m, the function’s mini-mum occurs at an end-point of itsdomain.

Another possibility is that thefunction can have a minimum ormaximum at some point where itsderivative isn’t well defined. Fi-grue n shows such a situation.There is a kink in the function att = 0, so a wide variety of linescould be placed through the graph

1.2. CONTINUOUS CHANGE 17

0.9

-0.1 2 1-1

t

x

m / The function x =√

thas a minimum at t =0, which is not a placewhere x = 0. This point isthe edge of the function’sdomain.

there, all with different slopes andall staying on one side of the graph.There is no uniquely defined tan-gent line, so the derivative is unde-fined.

Example 6. Rancher Rick has a length of cy-clone fence L with which to enclose arectangular pasture. Show that he canenclose the greatest possible area byforming a square with sides of lengthL/4.

. If the width and length of the rect-angle are t and u, and Rick is go-ing to use up all his fencing material,then the perimeter of the rectangle,2t + 2u, equals L, so for a given width,t , the length is u = L/2 − t . The areais a = tu = t(L/2 − t). The func-tion only means anything realistic for0 ≤ t ≤ L/2, since for values of t out-side this region either the width or theheight of the rectangle would be neg-ative. The function a(t) could there-fore have a maximum either at a placewhere a = 0, or at the endpoints of thefunction’s domain. We can eliminate

2

1

1-1t

x

n / The function x = |t |has a minimum at t =0, which is not a placewhere x = 0. This is apoint where the functionisn’t differentiable.

the latter possibility, because the areais zero at the endpoints.

To evaluate the derivative, we firstneed to reexpress a as a polynomial:

a = −t2 +L2

t .

The derivative is

a = −2t +L2

.

Setting this equal to zero, we find t =L/4, as claimed. This is a maximum,not a minimum or an inflection point,because the second derivative is theconstant a = −2, which is negative forall t , including t = L/4.

18 CHAPTER 1. RATES OF CHANGE

Problems1 Graph the function t2 in theneighborhood of t = 3, draw a tan-gent line, and use its slope to verifythat the derivative equals 2t at thispoint.

2 Graph the function sin et in theneighborhood of t = 0, draw atangent line, and use its slope toestimate the derivative. Answer:0.5403023058. (You will of coursenot get an answer this precise us-ing this technique.)

3 Differentiate the followingfunctions with respect to t:1, 7, t, 7t, t2, 7t2, t3, 7t3.

4 Differentiate 3t7 − 4t2 + 6 withrespect to t.

5 Differentiate at2 + bt + c withrespect to t. [Thompson, 1919]

6 Find two different functionswhose derivatives are the constant3, and give a geometrical interpre-tation.

7 Find a function x whose deriva-tive is x = 3t7 − 4t2 + 6. In otherwords, integrate the given func-tion.

8 Let t be the time that haselapsed since the Big Bang. In thattime, light, traveling at speed c,has been able to travel a maximumdistance ct. The portion of the uni-verse that we can observe is there-fore a sphere of radius ct, with vol-ume v = (4/3)πr3 = (4/3)π(ct)3.Compute the rate v at which theobservable universe is expanding,

and check that your answer has theright units, as in example 2 on page13.

9 Kinetic energy is a measureof an object’s quantity of motion;when you buy gasoline, the energyyou’re paying for will be convertedinto the car’s kinetic energy (actu-ally only some of it, since the en-gine isn’t perfectly efficient). Thekinetic energy of an object withmass m and velocity v is given byK = (1/2)mv2. For a car acceler-ating at a steady rate, with v = at,find the rate K at which the en-gine is required to put out kineticenergy. K, with units of energyover time, is known as the power.Check that your answer has theright units, as in example 2 on page13.

10 A metal square expands andcontracts with temperature, thelengths of its sides varying accord-ing to the equation ` = (1+αT )`o.Find the rate of change of its sur-face area with respect to temper-ature. That is, find ˙, wherethe variable with respect to whichyou’re differentiating is the tem-perature, T . Check that your an-swer has the right units, as in ex-ample 2 on page 13.

11 Find the second derivative of2t3 − t.

12 Locate any poins of inflectionof the function t3 + t2. Verifyby graphing that the concavity ofthe function reverses itself at thispoint.

PROBLEMS 19

13 Two atoms will interact viaelectrical forces between their pro-tons and electrons. To put themat a distance r from one another(measured from nucleus to nu-cleus), a certain amount of energyE is required, and the minimumenergy occurs when the atoms arein equilibrium, forming a molecule.Often a fairly good approximationto the energy is the Lennard-Jonesexpression

E(r) = k

[(a

r

)12

− 2(a

r

)6]

,

where k and a are constants. Showthat there is an equilibrium at r =a. Verify (either by graphing or bytesting the second derivative) thatthis is a minimum, not a maximumor a point of inflection.

14 Prove that the total numberof maxima and minima possessedby a third-order polynomial is atmost two.

20 CHAPTER 1. RATES OF CHANGE

2 To infinity — andbeyond!

a / Gottfried Leibniz(1646-1716)

Little kids readily pick up the ideaof infinity. “When I grow up,I’m gonna have a million Barbies.”“Oh yeah? Well I’m gonna havea billion in my house.” “Well I’mgonna have infinity Barbies.” “Sowhat? I’ll have two infinity ofthem.” Adults laugh, convincedthat infinity, ∞, is the biggestnumber, so 2∞ can’t be any big-ger. This is the idea behind thejoke in the movie Toy Story: BuzzLightyear’s slogan is “To infinity— and beyond!” We assume thereisn’t any beyond. Infinity is sup-posed to be the biggest there is,so by definition there can’t be any-thing bigger, right?

2.1 InfinitesimalsActually mathematicians have in-

vented several many different log-ical systems for working with in-finity, and in most of them in-finity does come in different sizesand flavors. Newton, as well asthe German mathematician Leib-niz who invented calculus inde-pendently,1 had a strong intuitiveidea that calculus was really aboutnumbers that were infinitely small:infinitesimals, the opposite of in-finities. For instance, consider thenumber 1.12 = 1.21. That 2 in thefirst decimal place is the same 2that appears in the expression 2tfor the derivative of t2.

0.6

0.8

1

1.2

1.4

0.6 0.8 1 1.2 1.4

t

x

b / A close-up view of thefunction x = t2, show-ing the line that con-nects the points (1, 1)and (1.1, 1.21).

1There is some dispute over this point.Newton and his supporters claimed thatLeibniz plagiarized Newton’s ideas, andmerely invented a new notation for them.

21

22 CHAPTER 2. TO INFINITY — AND BEYOND!

Figure b shows the idea visually.The line connecting the points(1, 1) and (1.1, 1.21) is almost in-distinguishable from the tangentline on this scale. Its slope is(1.21 − 1)/(1.1 − 1) = 2.1, whichis very close to the tangent line’sslope of 2. It was a good approx-imation because the points wereclose together, separated by only0.1 on the t axis.

If we needed a better approxi-mation, we could try calculating1.012 = 1.0201. The slope of theline connecting the points (1, 1)and (1.01, 1.0201) is 2.01, which iseven closer to the slope of the tan-gent line.

Another method of visualizing theidea is that we can interpret x = t2

as the area of a square with sidesof length t, as suggested in fig-ure c. We increase t by an in-finitesimally small number dt. Thed is Leibniz’s notation for a verysmall difference, and dt is to beread is a single symbol, “dee-tee,”not as a number d multiplied by

c / A geometrical inter-pretation of the derivativeof t2.

a number t. The idea is that dtis smaller than any ordinary num-ber you could imagine, but it’s notzero. The area of the square is in-creased by dx = 2tdt + dt2, whichis analogous to the finite numbers0.21 and 0.0201 we calculated ear-lier. Where before we divided bya finite change in t such as 0.1 or0.01, now we divide by dt, produc-ing

dx

dt=

2t dt + dt2

dt= 2t + dt

for the derivative. On a graph likefigure b, dx/dt is the slope of thetangent line: the change in x di-vided by the changed in t.

But adding an infinitesimal num-ber dt onto 2t doesn’t really changeit by any amount that’s even the-oretically measurable in the realworld, so the answer is really 2t.Evaluating it at t = 1 gives theexact result, 2, that the earlierapproximate results, 2.1 and 2.01,were getting closer and closer to.

Example 7To show the power of infinitesimals

and the Leibniz notation, let’s provethat the derivative of t3 is 3t2:

dxdt

=(t + dt)3 − t3

dt

=3t2 dt + 3t dt + dt3

dt= 3t2 + . . . ,

where the dots indicate infinitesimalterms that we can neglect.

2.1. INFINITESIMALS 23

This result required significantsweat and ingenuity when provedon page 81 by the methods of chap-ter 1, and not only that but theold method would have requireda completely different method ofproof for a function that wasn’t apolynomial, whereas the new onecan be applied more generally, asshown in the following example.

Example 8The derivative of x = sin t , with t in

units of radians, is

dxdt

=sin(t + dt)− sin t

dt,

and with the trig identity sin(α + β) =sin α cos β+ cos α sin β, this becomes

=sin t cos dt + cos t sin dt − sin t

dt.

Applying the small-angle approxima-tions sin u ≈ u and cos u ≈ 1, wehave

dxdt

=cos t dt

dt= cos t .

But are the approximations goodenough? The situation is similar to theone we encountered earlier, in whichwe computed (t + dt)2, and neglectedthe dt2 term represented by the smallsquare in figure c. Being a little lesscavalier, I should demonstrate explic-itly that the error introduced by thesmall-angle approximations is really ofthe same order of magnitude as dt2,i.e., a number that is infinitesimallysmall compared even to the infinites-imal size of dt ; I’ve done this on page82.

Figure d shows the graphs of the func-tion and its derivative. Note how the

two graphs correspond. At t = 0,the slope of sin t is at its largest, andis positive; this is where the deriva-tive, cos t , attains its maximum posi-tive value of 1. At t = π/2, sin t hasreached a maximum, and has a slopeof zero; cos t is zero here. At t = π,in the middle of the graph, sin t has itsmaximum negative slope, and cos t isat its most negative extreme of −1.

Physically, sin t could represent theposition of a pendulum as it movedback and forth from left to right, andcos t would then be the pendulum’svelocity.

-1

1

1 2 3 4 5 6t

x

d / Graphs of sin t , andits derivative cos t .

Example 9What about the derivative of the co-

sine? The cosine and the sine are re-ally the same function, shifted to theleft or right by π/4. If the derivativeof the sine is the same as itself, butshifted to the left by π/4, then thederivative of the cosine must be a co-sine shifted to the left by π/4:

d cos tdt

= cos(t + π/4)

= − sin t .

24 CHAPTER 2. TO INFINITY — AND BEYOND!

e / Bishop GeorgeBerkeley (1685-1753)

2.2 Safe use ofinfinitesimals

The idea of infinitesimally smallnumbers has always irked purists.One prominent critic of the cal-culus was Newton’s contemporaryGeorge Berkeley, the Bishop ofCloyne. Although some of hiscomplaints are clearly wrong (hedenied the possibility of the sec-ond derivative), there was clearlysomething to his criticism of theinfinitesimals. He wrote sarcas-tically, “They are neither finitequantities, nor quantities infinitelysmall, nor yet nothing. May we notcall them ghosts of departed quan-tities?”

Infinitesimals seemed scary, be-cause if you mishandled them, youcould prove absurd things. Forexample, let du be an infinitesi-mal. Then 2du is also infinites-imal. Therefore both 1/du and1/(2du) equal infinity, so 1/du =1/(2du). Multiplying by du onboth sides, we have a proof that1 = 1/2.

In the eighteenth century, the useof infinitesimals became like adul-

tery: commonly practiced, butshameful to admit to in polite cir-cles. Those who used them learnedcertain rules of thumb for handlingthem correctly. For instance, theywould identify the flaw in my proofof 1 = 1/2 as my assumption thatthere was only one size of infinity,when actually 1/du should be in-terpreted as an infinity twice as bigas 1/(2du). The use of the sym-bol ∞ played into this trap, be-cause the use of a single symbolfor infinity implied that infinitesonly came in one size. However,the practitioners of infinitesimalshad trouble articulating a clearset of principles for their properuse, and couldn’t prove that a self-consistent system could be builtaround them.

By the twentieth century, whenI learned calculus, a clear con-sensus had existed that infiniteand infinitesimal numbers weren’tnumbers at all. A notation likedx/dt, my calculus teacher toldme, wasn’t really one number di-vided by another, it was merely asymbol for the limit

lim∆t→0

∆x

∆t,

where ∆x and ∆t represented fi-nite changes. That satisfied me un-til we got to a certain topic (im-plicit differentiation) in which wewere encouraged to break the dxaway from the dt, leaving them onopposite sides of the equation. Ibuttonholed my teacher after classand asked why he was now doing

2.2. SAFE USE OF INFINITESIMALS 25

what he’d told me you couldn’t re-ally do, and his response was thatdx and dt weren’t really numbers,but most of the time you could getaway with treating them as if theywere, and you would get the rightanswer in the end. Most of thetime!? That bothered me. Howwas I supposed to know when itwasn’t “most of the time?”

f / Abraham Robinson(1918-1974)

But unknown to me and myteacher, mathematician AbrahamRobinson had already shown in the1960’s that it was possible to con-struct a self-consistent number sys-tem that included infinite and in-finitesimal numbers. He called itthe hyperreal number system, andit included the real numbers as asubset.2

2The reader who wants to learn moreabout the hyperreal system might wantto start by skimming the Wikipedia ar-ticle Non-standard analysis for generalbackground, and then read the relevantparts of Keisler’s Elementary Calculus:

Moreover, the rules for what youcan and can’t do with the hy-perreals turn out to be extremelysimple. Take any true statementabout the real numbers. Supposeit’s possible to translate it into astatement about the hyperreals inthe most obvious way, simply byreplacing the word “real” with theword “hyperreal.” Then the trans-lated statement is also true. Thisis known as the transfer principle.

Let’s look back at my bogus proofof 1 = 1/2 in light of this sim-ple principle. The final step ofthe proof, for example, is perfectlyvalid: multiplying both sides of theequation by the same thing. Thefollowing statement about the realnumbers is true:

For any real numbers a, b, andc, if a = b, then ac = bc.

This can be translated in an obvi-ous way into a statement about thehyperreals:

For any hyperreal numbers a,b, and c, if a = b, then ac = bc.

However, what about the state-ment that both 1/du and 1/(2du)equal infinity, so they’re equal toeach other? This isn’t the trans-lation of a statement that’s trueabout the reals, so there’s no rea-son to believe it’s true — and in

An Approach Using Infinitesimals, anout-of-print calculus text that uses in-finitesimals, available for free from theauthor’s web site. The standard (diffi-cult) book on the subject is Robinson’sNon-Standard Analysis.

26 CHAPTER 2. TO INFINITY — AND BEYOND!

fact it’s false.

What the transfer principle tells usis that the real numbers as we nor-mally think of them are not uniquein obeying the ordinary rules of al-gebra. There are completely dif-ferent systems of numbers, suchas the hyperreals, that also obeythem.

How, then, are the hyperreals evendifferent from the reals, if every-thing that’s true of one is true ofthe other? But recall that thetransfer principle doesn’t guaran-tee that every statement about thereals is also true of the hyperre-als. It only works if the statementabout the reals can be translatedinto a statement about the hyper-reals in the most simple, straight-forward way imaginable, simply byreplacing the word “real” with theword “hyperreal.” Here’s an ex-ample of a true statement aboutthe reals that can’t be translatedin this way:

For any real number a, thereis an integer n that is greaterthan a.

This one can’t be translated sosimplemindedly, because it refersto a subset of the reals calledthe integers. It might be possi-ble to translate it somehow, butit would require some insight intothe correct way to translate thatword “integer.” The transfer prin-ciple doesn’t apply to this state-ment, which indeed is false for thehyperreals, because the hyperre-

als contain infinite numbers thatare greater than all the integers.In fact, the contradiction of thisstatement can be taken as a def-inition of what makes the hyper-reals special, and different fromthe reals: we assume that there isat least one hyperreal number, H,which is greater than all the inte-gers.

As an analogy from everyday life,consider the following statementsabout the student body of the highschool I attended:

1. Every student at my highschool had two eyes and a face.2. Every student at my highschool who was on the footballteam was a jerk.

Let’s try to translate these intostatements about the populationof California in general. The stu-dent body of my high school is likethe set of real numbers, and thepresent-day population of Califor-nia is like the hyperreals. State-ment 1 can be translated mind-lessly into a statement that ev-ery Californian has two eyes anda face; we simply substitute “ev-ery Californian” for “every studentat my high school.” But state-ment 2 isn’t so easy, because itrefers to the subset of studentswho were on the football team,and it’s not obvious what the cor-responding subset of Californianswould be. Would it include ev-erybody who played high school,college, or pro football? Maybe

2.2. SAFE USE OF INFINITESIMALS 27

it shouldn’t include the pros, be-cause they belong to an organiza-tion covering a region bigger thanCalifornia. Statement 2 is the kindof statement that the transfer prin-ciple doesn’t apply to.

Example 10As a nontrivial example of how to ap-

ply the transfer principle, let’s considerhow to handle expressions like theone that occurred when we wanted todifferentiate t2 using infinitesimals:

d`t2´

dt= 2t + dt .

I argued earlier than 2t +dt is so closeto 2t that for all practical purposes, theanswer is really 2t . But is it really validin general to say that 2t + dt is thesame hyperreal number as 2t? No.We can apply the transfer principle tothe following statement about the re-als:

For any real numbers a and b,with b 6= 0, a + b 6= a.

Since dt isn’t zero, 2t + dt 6= 2t .

More generally, example 10 leadsus to visualize every number as be-ing surrounded by a “halo” of num-bers that don’t equal it, but dif-fer from it by only an infinitesi-mal amount. Just as a magnify-ing glass would allow you to seethe fleas on a dog, you would needan infinitely strong microscope tosee this halo. This is similar tothe idea that every integer is sur-rounded by a bunch of fractionsthat would round off to that inte-ger. We can, however, define thestandard part of a finite hyperreal

number, which means the uniquereal number that differs from itinfinitesimally. For instance, thestandard part of 2t + dt, notatedst(2t + dt), equals 2t. The deriva-tive of a function should actuallybe defined as the standard part ofdx/dt, but we often write dx/dtto mean the derivative, and don’tworry about the distinction.

One of the things Bishop Berkeleydisliked about infinitesimals wasthe idea that they existed in akind of hierarchy, with dt2 beingnot just infinitesimally small, butinfinitesimally small compared tothe infinitesimal dt. If dt is theflea on a dog, then dt2 is a sub-microscopic flea that lives on theflea, as in Swift’s doggerel: “Bigfleas have little fleas/ On theirbacks to ride ’em,/ and little fleashave lesser fleas,/And so, ad in-finitum.” Berkeley’s criticism wasoff the mark here: there is such ahierarchy. Our basic assumptionabout the hyperreals was that theycontain at least one infinite num-ber, H, which is bigger than allthe integers. If this is true, then1/H must be less than 1/2, lessthan 1/100, less then 1/1, 000, 000— less than 1/n for any integer n.Therefore the hyperreals are guar-anteed to include infinitesimals aswell, and so we have at least threelevels to the hierarchy: infinitiescomparable to H, finite numbers,and infinitesimals comparable to1/H. If you can swallow that,then it’s not too much of a leap to

28 CHAPTER 2. TO INFINITY — AND BEYOND!

add more rungs to the ladder, likeextra-small infinitesimals that arecomparable to 1/H2. If this seemsa little crazy, it may comfort youto think of statements about thehyperreals as descriptions of limit-ing processes involving real num-bers. For instance, in the sequenceof numbers 1.12 = 1.21, 1.012 =1.0201, 1.0012 = 1.002001, . . . , it’sclear that the number representedby the digit 1 in the final decimalplace is getting smaller faster thanthe contribution due to the digit 2in the middle.

2.3 The product ruleWhen I first learned calculus, itseemed to me that if the deriva-tive of 3t was 3, and the deriva-tive of 7t was 7, then the deriva-tive of t multiplied by t ought tobe just plain old t, not 2t. Thereason there’s a factor of 2 in thecorrect answer is that t2 has tworeasons to grow as t gets bigger: itgrows because the first factor of tis increasing, but also because thesecond one is. In general, it’s pos-sible to find the derivative of theproduct of two functions any timewe know the derivatives of the in-dividual functions.

The product ruleIf x and y are both functions of t,then the derivative of their productis

d(xy)dt

=dx

dt· y + x · dy

dt.

The proof is easy. Changing t byan infinitesimal amount dt changesthe product xy by an amount

(x + dx)(y + dy)− xy

= ydx + xdy + dxdy ,

and dividing by dt gives,

=dx

dt· y + x · dy

dt+

dxdy

dt,

whose standard part is the resultto be proved.

Example 11. Find the derivative of the functiont sin t .

.

d(t sin t)dt

= t · d(sin t)dt

+dtdt· sin t

= t cos t + sin t

Figure g gives the geometrical in-terpretation of the product rule.Imagine that the king, in his cas-tle at the wouthwest corner of hisrectangular kingdom, sends out aline of infantry to expand his terri-tory to the north, and a line of cav-alry to take over more land to theeast. In a time interval dt, the cav-alry, which moves faster, covers adistance dx greater than that cov-ered by the infantry, dy. However,the strip of territory conquered bythe cavalry, ydx, isn’t as great asit could have been, because in ourexample y isn’t as big as x.

2.3. THE PRODUCT RULE 29

g / A geometrical interpretation of theproduct rule.

A helpful feature of the Leibniznotation is that one can easilyuse it to check whether the unitsof an answer make sense. If wemeasure distances in meters andtime in seconds, then xy has unitsof square meters (area), and sodoes the change in the area, d(xy).Dividing by dt gives the numberof square meters per second be-ing conquered. On the right-handside of the product rule, dx/dthas units of meters per second(velocity), and multiplying it byy makes the units square metersper second, which is consistentwith the left-hand side. The unitsof the second term on the rightlikewise check out. Some begin-ners might be tempted to guessthat the product rule would bed(xy)/dt = (dx/dt)(dy/dt), butthe Leibniz notation instantly re-veals that this can’t be the case,because then the units on the left,m2/s, wouldn’t match the ones onthe right, m2/s2.

Because this unit-checking featureis so helpful, there is a special way

of writing a second derivative inthe Leibniz notation. What New-ton called x, Leibniz wrote as

d2x

dt2.

Although the different placementof the 2’s on top and bottom seemsstrange and inconsistent to manybeginners, it actually works outnicely. If x is a distance, mea-sured in meters, and t is a time,in units of seconds, then the sec-ond derivative is supposed to haveunits of acceleration, in units ofmeters per second per second, alsowritten (m/s)/s, or m/s2. (Theacceleration of falling objects onEarth is 9.8 m/s2 in these units.)The Leibniz notation is meant tosuggest exactly this: the top of thefraction looks like it has units ofmeters, because we’re not squaringx, while the bottom of the fractionlooks like it has units of seconds,because it looks like we’re squar-ing dt. Therefore the units comeout right. It’s important to realize,however, that the symbol d isn’t anumber (not a real one, and not ahyperreal one, either), so we can’treally square it; the notation is notto be taken as a literal statementabout infinitesimals.

Example 12A tricky use of the product rule is to

find the derivative of√

t . Since√

t canbe written as t1/2, we might suspectthat the rule d(tk )/dt = ktk−1 wouldwork, giving a derivative 1

2 t−1/2 =1/(2

√t). However, the methods used

to prove that rule in chapter 1 only

30 CHAPTER 2. TO INFINITY — AND BEYOND!

work if k is an integer, so the best wecould do would be to confirm our con-jecture approximately by graphing.

Using the product rule, we can writef (t) = d

√t/dt for our unknown deriva-

tive, and back into the result using theproduct rule:

dtdt

=d(√

t√

t)dt

= f (t)√

t +√

t f (t)

= 2f (t)√

t

But dt/dt = 1, so f (t) = 1/(2√

t) asclaimed.

The trick used in example 12 canalso be used to prove that thepower rule d(xn)/dx = nxn−1 ap-plies to cases where n is an integerless than 0, but I’ll instead provethis on page 34 by a technique thatdoesn’t depend on a trick, and alsoapplies to values of n that aren’tintegers.

2.4 The chain rule

Figure h shows three clowns on see-saws. If the leftmost clown movesdown by a distance dx, the middleone will come up by dy, but thiswill also cause the one on the rightto move down by dz. If we wantto predict how much the rightmostclown will move in response to acertain amount of motion by theleftmost one, we have

dz

dx=

dz

dy· dy

dx.

This relation, called the chain rule,allows us to calculate a derivativeof a function defined by one func-tion inside another.

Example 13. Find the derivative of the functionz(x) = sin(x2).

. Let y (x) = x2, so that z(x) =sin(y (x)). Then

dzdx

=dzdy

· dydx

= cos(y ) · 2x

= 2x cos(x2)

The way people usually say it is thatthe chain rule tells you to take thederivative of the outside function, thesine in this case, and then multiplyby the derivative of “the inside stuff,”which here is the square. Once youget used to doing it, you don’t needto invent a third, intermediate variable,as we did here with y .

2.5. EXPONENTIALS AND LOGARITHMS 31

h / Three clowns on seesaws demonstrate the chain rule.

2.5 Exponentials andlogarithms

The exponential

An important application of thechain rule comes up when we wantto differentiate the omnipresentfunction ex, where e = 2.71828 . . .is the base of natural logarithms.We have

dex

dx=

ex+dx − ex

dx

=exedx − ex

dx

= ex edx − 1dx

The second factor,(edx − 1

)/dx,

doesn’t have x in it, so it mustjust be a constant. Therefore weknow that the derivative of ex issimply ex, multiplied by some un-

known constant,

dex

dx= c ex.

A rough check by graphing at, sayx = 0, shows that the slope is closeto 1, so c is close to 1. But howdo we know it’s exactly one? Theproof is given on page 83.

32 CHAPTER 2. TO INFINITY — AND BEYOND!

Example 14. The concentration of a foreign sub-

stance in the bloodstream generallyfalls off exponentially with time as c =coe−t/a, where co is the initial concen-tration, and a is a constant. For caf-feine in adults, a is typically about 7hours. An example is shown in figurei. Differentiate the concentration withrespect to time, and interpret the re-sult. Check that the units of the resultmake sense.

. Using the chain rule,

dcdt

= coe−t/a ·„−1

a

«= −co

ae−t/a

This can be interpreted as the rateat which caffeine is being removedfrom the blood and put into the per-son’s urine. It’s negative because theconcentration is decreasing. Accord-ing to the original expression for x ,a substance with a large a will takea long time to reduce its concentra-tion, since t/a won’t be very big un-less we have large t on top to com-pensate for the large a on the bottom.In other words, larger values of a rep-resent substances that the body hasa harder time getting rid of efficiently.The derivative has a on the bottom,and the interpretation of this is that fora drug that is hard to eliminate, therate at which it is removed from theblood is low.

It makes sense that a has units oftime, because the exponential func-tion has to have a unitless argument,so the units of t/a have to cancel out.The units of the result come from thefactor of co/a, and it makes sense that

the units are concentration divided bytime, because the result representsthe rate at which the concentration ischanging.

0.5

1

1.5

2

6 12 18 24t

c

i / Example 14. A typ-ical graph of the con-centration of caffeine inthe blood, in units of mil-ligrams per liter, as afunction of time, in hours.

Example 15. Find the derivative of the functiony = 10x .

. In general, one of the tricks to do-ing calculus is to rewrite functions informs that you know how to handle.This one can be rewritten as a base-10 logarithm:

y = 10x

ln y = ln`10x ´

ln y = x ln 10

y = ex ln 10

Applying the chain rule, we have thederivative of the exponential, which isjust the same exponential, multipliedby the derivative of the inside stuff:

dydx

= ex ln 10 · ln 10 .

In other words, the “c” referred to inthe discussion of the derivative of ex

2.5. EXPONENTIALS AND LOGARITHMS 33

becomes c = ln 10 in the case of thebase-10 exponential.

The logarithm

The natural logarithm is the func-tion that undoes the exponential.In a situation like this, we have

dy

dx=

1dx/dy

,

where on the left we’re thinking ofy as a function of x, and on theright we consider x to be a functionof y. Applying this to the naturallogarithm,

y = lnx

x = ey

dx

dy= ey

dy

dx=

1ey

=1x

d lnx

dx=

1x

.

This is noteworthy because itshows that there must be an ex-ception to the rule that the deriva-tive of xn is nxn−1, and the inte-gral of xn−1 is xn/n. (On page30 I remarked that this rule couldbe proved using the product rulefor negative integer values of k,but that I would give a simpler,less tricky, and more general prooflater. The proof is example 16 be-low.) The integral of x−1 is notx0/0, which wouldn’t make sense

anyway because it involves divi-sion by zero.3 Likewise the deriva-tive of x0 = 1 is 0x−1, which iszero. Figure j shows the idea. Thefunctions xn form a kind of ladder,with differetiation taking us downone rung, and integration taking usup. However, there are two specialcases where differentiation takes usoff the ladder entirely.

j / Differentiation and integration offunctions of the form xn. Constantsout in front of the functions are notshown, so keep in mind that, for ex-ample, the derivative of x2 isn’t x , it’s2x .

3Speaking casually, one can say thatdivision by zero gives infinity. This isoften a good way to think when try-ing to connect mathematics to reality.However, it doesn’t really work that wayaccording to our rigorous treatment ofthe hyperreals. Consider this statement:“For a nonzero real number a, there isno real number b such that a = 0b.” Thismeans that we can’t divide a by 0 and getb. Applying the transfer principle to thisstatement, we see that the same is truefor the hyperreals: division by zero is un-defined. However, we can divide a finitenumber by an infinitesimal, and get aninfinite result, which is almost the samething.

34 CHAPTER 2. TO INFINITY — AND BEYOND!

Example 16. Prove d(xn)/dx = nxn−1 for any realvalue of n.

.

y = xn

= en ln x

By the chain rule,

dydx

= en ln x · nx

= xn · nx

= nxn−1 .

(For n = 0, the result is zero.)

2.6 QuotientsSo far we’ve been successful witha divide-and-conquer approach todifferentiation: the product ruleand the chain rule offer meth-ods of breaking a function downinto simpler parts, and finding thederivative of the whole thing basedon knowledge of the derivatives ofthe parts. We know how to findthe derivatives of sums, differences,and products, so the obvious nextstep is to look for a way of handlingdivision. This is straightforward,since we know that the derivativeof the function function 1/u = u−1

is −u−2. Let u and v be functionsof x. Then by the product rule,

d(v/u)dx

=dv

dx· 1u

+ v · d(1/u)dx

and by the chain rule,

d(v/u)dx

=dv

dx· 1u− v · 1

u2

du

dx

This is so easy to rederive on de-mand that I suggest not memoriz-ing it.

By the way, notice how the no-tation becomes a little awkwardwhen we want to write a derivativelike d(v/u)/dx. When we’re differ-entiating a complicated function,it can be uncomfortable trying tocram the expression into the top ofthe d . . . /d . . . fraction. Thereforeit would be more common to writesuch an expression like this:

ddx

( v

u

)This could be considered an abuseof notation, making d look like anumber being divided by anothernumber dx, when actually d ismeaningless on its own. On theother hand, we can consider thesymbol d/dx to represent the op-eration of differentiation with re-spect to x; such an interpretationwill seem more natural to thosewho have been inculcated with thetaboo against considering infinites-imals as numbers in the first place.

Using the new notation, the quo-tient rule becomes

ddx

( v

u

)=

1u· dv

dx− v

u2· du

dx.

The interpretation of the minussign is that if u increases, v/u de-creases.

Example 17. Differentiate y = x/(1 + 3x), andcheck that the result makes sense.

2.7. DIFFERENTIATION ON A COMPUTER 35

. We identify v with x and u with 1+x .The result is

ddx

“ vu

”=

1u· dv

dx− v

u2 ·dudx

=1

1 + x− 3x

(1 + x)2

One way to check that the resultmakes sense it to consider extremevalues of x . For very large values ofx , the 1 on the bottom of x/(1 + x) be-comes negligible compared to the 3x ,and the function y approaches x/3x =1/3 as a limit. Therefore we expectthat the derivative dy/dx should ap-proach zero, since the derivative ofa constant is zero. It works: plug-ging in bigger and bigger numbers forx in the expression for the derivativedoes give smaller and smaller results.(In the second term, the denominatorgets bigger faster than the numerator,because it has a square in it.)

Another way to check the result is toverify that the units work out. Sup-pose arbitrarily that x has units of gal-lons. (If the 3 on the bottom is unitless,then the 1 would have to represent 1gallon, since you can’t add things thathave different units.) The function y isdefined by an expression with units ofgallons divided by gallons, so y is unit-less. Therefore the derivative dy/dxshould have units of inverse gallons.Both terms in the expression for thederivative do have those units, so theunits of the answer check out.

2.7 Differentiation ona computer

In this chapter you’ve learned aset of rules for evaluating deriva-

tives: derivatives of products, quo-tions, functions inside other func-tions, etc. Because these rules ex-ist, it’s always possible to find aformula for a function’s derivative,given the formula for the originalfunction. Not only that, but thereis no real creativity required, so acomputer can be programmed todo all the drudgery. For exam-ple, you can download a free, open-source program called Yacas fromyacas.sourceforge.net and in-stall it on a Windows or Linux ma-chine. A typical session with Yacaslooks like this:

Example 18

D(x) x^2

2*x

D(x) Exp(x^2)

2*x*Exp(x^2)

D(x) Sin(Cos(Sin(x)))

-Cos(x)*Sin(Sin(x))

*Cos(Cos(Sin(x)))

Upright type represents your in-put, and italicized type is the pro-gram’s output.

First I asked it to differentiate x2

with respect to x, and it told methe result was 2x. Then I didthe derivative of ex2

, which I alsocould have done fairly easily byhand. (If you’re trying this outon a computer as you real along,makre sure to capitalize functionslike Exp, Sin, and Cos.) FinallyI tried an example where I didn’tknow the answer off the top of myhead, and that would have been alittle tedious to calculate by hand.

36 CHAPTER 2. TO INFINITY — AND BEYOND!

Unfortunately things are a littleless rosy in the world of integrals.There are a few rules that can helpyou do integrals, e.g., that the inte-gral of a sum equals the sum of theintegrals, but the rules don’t coverall the possible cases. Using Ya-cas to evaluate the integrals of thesame functions, here’s what hap-pens.4

Example 19Integrate(x) x^2

x^3/3

Integrate(x) Exp(x^2)

Integrate(x)Exp(x^2)

Integrate(x)

Sin(Cos(Sin(x)))

Integrate(x)

Sin(Cos(Sin(x)))

The first one works fine, and Ican easily verify that the answeris correct, by taking the derivativeof x3/3, which is x2. (The an-swer could have been x3/3 + 7, orx3/3+c, where c was any constant,but Yacas doesn’t bother to tell usthat.) The second and third onesdon’t work, however; Yacas justspits back the input at us withoutmaking any progress on it. Andit may not be because Yacas isn’tsmart enough to figure out theseintegrals. The function ex2

can’tbe integrated at all in terms of aformula containing ordinary oper-ations and functions such as ad-dition, multiplication, exponentia-

4If you’re trying these on your owncomputer, note that the long input linefor the function sin cos sin x shouldn’t bebroken up into two lines as shown in thelisting.

tion, trig functions, exponentials,and so on.

That’s not to say that a programlike this is useless. For example,here’s an integral that I wouldn’thave known how to do, but thatYacas handles easily:

Example 20Integrate(x) Sin(Ln(x))

(x*Sin(Ln(x)))/2

-(x*Cos(Ln(x)))/2

This one is easy to check by dif-ferentiating, but I could have beenmarooned on a desert island for adecade before I could have figuredit out in the first place. There arevarious rules, then, for integration,but they don’t cover all possiblecases as the rules for differentiationdo, and sometimes it isn’t obviouswhich rule to apply. Yacas’s abilityto integrate sin lnx shows that ithad a rule in its bag of tricks thatI don’t know, or didn’t remember,or didn’t realize applied to this in-tegral.

Back in the 17th century, whenNewton and Leibniz invented cal-culus, there were no computers, soit was a big deal to be able to finda simple formula for your result.Nowadays, however, it may not besuch a big deal. Suppose I want tofind the derivative of sin cos sinx,evaluated at x = 1. I can do some-thing like this on a calculator:

Example 21sin cos sin 1 =

0.61813407

sin cos sin 1.0001 =

2.7. DIFFERENTIATION ON A COMPUTER 37

0.61810240

(0.61810240-0.61813407)

/.0001 =

-0.3167

I have the right answer, withplenty of precision for most realis-tic applications, although I mighthave never guessed that the myste-rious number −0.3167 was actually−(cos 1)(sin sin 1)(cos cos sin 1).This could get a little tedious if Iwanted to graph the function, forinstance, but then I could just usea computer spreadsheet, or writea little computer program. In thischapter, I’m going to show youhow to do derivatives and integralsusing simple computer programs,using Yacas. The following littleYacas program does the samething as the set of calculatoroperations shown above:

Example 221 f(x):=Sin(Cos(Sin(x)))

2 x:=1

3 dx:=.0001

4 N( (f(x+dx)-f(x))/dx )

-0.3166671628

(I’ve omitted all of Yacas’s outputexcept for the final result.) Line1 defines the function we want todifferentiate. Lines 2 and 3 givevalues to the variables x and dx.Line 4 computes the derivative; theN( ) surrounding the whole thingis our way of telling Yacas that wewant an approximate numerical re-sult, rather than an exact symbolicone.

An interesting thing to try now isto make dx smaller and smaller,

and see if we get better and bet-ter accuracy in our approximationto the derivative.

Example 235 g(x,dx):=

N( (f(x+dx)-f(x))/dx )

6 g(x,.1)

-0.3022356406

7 g(x,.0001)

-0.3166671628

8 g(x,.0000001)

-0.3160458019

9 g(x,.00000000000000001)

0

Line 5 defines the derivative func-tion. It needs to know both x anddx. Line 6 computes the derivativeusing dx = 0.1, which we expect tobe a lousy approximation, since dxis really supposed to be infinitesi-mal, and 0.1 isn’t even that small.Line 7 does it with the same valueof dx we used earlier. The two re-sults agree exactly in the first dec-imal place, and approximately inthe second, so we can be prettysure that the derivative is −0.32to two figures of precision. Line8 ups the ante, and produces a re-sult that looks accurate to at least3 decimal places. Line 9 attemptsto produce fantastic precision byusing an extremely small value ofdx. Oops — the result isn’t bet-ter, it’s worse! What’s happenedhere is that Yacas computed f(x)and f(x + dx), but they were thesame to within the precision it wasusing, so f(x+dx)−f(x) roundedoff to zero.5

5Yacas can do arithmetic to anyprecision you like, although you may

38 CHAPTER 2. TO INFINITY — AND BEYOND!

Example 23 demonstrates the con-cept of how a derivative can be de-fined in terms of a limit:

dy

dx= lim

∆x→0

∆y

∆x

The idea of the limit is that wecan theoretically make ∆y/∆x ap-proach as close as we like to dy/dx,provided we make ∆x sufficientlysmall. In reality, of course, weeventually run into the limits ofour ability to do the computation,as in the bogus result generated online 9 of the example.

2.8 ContinuityIntuitively, a continuous functionis one whose graph has no suddenjumps in it; the graph is all a singleconnected piece. Formally, f(x)is defined to be continuous if forany real x and any infinitesimal dx,f(x + dx)− f(x) is infinitesimal.

Example 24Let the function f be defined by f (x) =0 for x ≤ 0, and f (x) = 1 for x > 0.Then f (x) is discontinuous, since fordx > 0, f (0+dx)− f (0) = 1, which isn’tinfinitesimal.

If a function is discontinuous at agiven point, then it is not differen-tiable at that point. On the otherhand, a function like y = |x| shows

run into practical limits due to theamount of memory your computer hasand the speed of its CPU. For fun,try N(Pi,1000), which tells Yacas tocompute π numerically to 1000 decimalplaces.

that a function can be continuouswithout being differentiable.

Another way of thinking aboutcontinuous functions is given bythe intermediate value theorem.Intuitively, it says that if you aremoving continuously along a road,and you get from point A to pointB, then you must also visit everyother point along the road; only byteleporting (by moving discontin-uously) could you avoid doing so.More formally, the theorem statesthat if y is a continuous functionon the interval from a to b, andif y takes on values y1 and y2 atcertain points within this interval,then for any y3 between y1 and y2,there is some x in the interval forwhich y(x) = y3.6

6For a proof of the intermediate valuetheorem starting from our definitionof continuity, see Keisler’s Elemen-tary Calculus: An Approach UsingInfinitesimals, p. 162, available online athttp://www.math.wisc.edu/~keisler/

calc.html.

PROBLEMS 39

Problems1 Carry out a calculation likethe one in example 7 on page 22to show that the derivative of t4

equals 4t3.

2 Example 9 on page 23 gave atricky argument to show that thederivative of cos t is − sin t. Provethe same result using the methodof example 8 instead.

3 Suppose H is a big number.Experiment on a calculator to fig-ure out whether

√H + 1−

√H − 1

comes out big, normal, or tiny. Trymaking H bigger and bigger, andsee if you observe a trend. Basedon these numerical examples, forma conjecture about what happensto this expression when H is infi-nite.

4 Suppose dx is a small but finitenumber. Experiment on a calcu-lator to figure out how

√dx com-

pares in size to dx. Try making dxsmaller and smaller, and see if youobserve a trend. Based on thesenumerical examples, form a conjec-ture about what happens to thisexpression when dx is infinitesi-mal.

5 To which of the following state-ments can the transfer principle beapplied? If you think it can’t beapplied to a certain statement, tryto prove that the statement is falsefor the hyperreals, e.g., by giving acounterexample.

(a) For any real numbers x and y,x + y = y + x.

(b) The sine of any real number isbetween −1 and 1.(c) For any real number x, thereexists another real number y thatis greater than x.(d) For any real numbers x 6= y,there exists another real number zsuch that x < z < y.(e) For any real numbers x 6= y,there exists a rational number zsuch that x < z < y. (A ratio-nal number is one that can be ex-pressed as an integer divided byanother integer.)(f) For any real numbers x, y, andz, (x + y) + z = x + (y + z).(g) For any real numbers x and y,either x < y or x = y or x > y.(h) For any real number x, x+1 6=x.

6 Differentiate (2x + 3)100 withrespect to x.

7 Differentiate (x+1)100(x+2)200

with respect to x.

8 Differentiate the following withrespect to x: e7x, eex

.

9 Differentiate a sin(bx + c) withrespect to x.

10 Find a function whose deriva-tive with respect to x equalsa sin(bx + c). That is, find an inte-gral of the given function.

11 The range of a gun, when el-evated to an angle θ, is given by

R =2v2

gsin θ cos θ .

Find the angle that will producethe maximum range.

40 CHAPTER 2. TO INFINITY — AND BEYOND!

12 The hyperbolic cosine func-tion is defined by

coshx =ex + e−x

2.

Find any minima and maxima ofthis function.

13 Differentiate tan θ with re-spect to θ.

14 Differentiate 3√

x with respectto x.

15 Differentiate the followingwith respect to x:(a) y =

√x2 + 1

(b) y =√

x2 + a2

(c) y = 1/√

a + x(d) y = a/

√(a− x2)

[Thompson, 1919]

16 Differentiate ln(2t + 1) withrespect to t.

17 If you know the derivative ofsinx, it’s not necessary to use theproduct rule in order to differenti-ate 3 sinx, but show that using theproduct rule gives the right resultanyway.

18 The Γ function (capital Greekletter gamma) is a continuousmathematical function that hasthe property Γ(n) = 1·2·. . .·(n−1)for n an integer. Γ(x) is also welldefined for values of x that are notintegers, e.g., Γ(1/2) happens to be√

π. Use computer software that iscapable of evaluating the Γ func-tion to determine numerically thederivative of Γ(x) with respect tox, at x = 2. (In Yacas, the func-tion is called Gamma.)

19 For a cylinder of fixed surfacearea, what proportion of length toradius will give the maximum vol-ume?

20 Use a trick similar to the oneused in example 12 to prove thatthe power rule d(xk)/dx = kxk−1

applies to cases where k is an inte-ger less than 0. ?

3 Integration3.1 Definite and

indefiniteintegrals

Because any formula can be differ-entiated symbolically to find an-other formula, the main motiva-tion for doing derivatives numeri-cally would be if the function tobe differentiated wasn’t known insymbolic form. A typical exam-ple might be a two-person networkcomputer game, in which playerA’s computer needs to figure outplayer B’s velocity based on knowl-edge of how her position changesover time. But in most cases, it’snumerical integration that’s inter-esting, not numerical differentia-tion.

As a warm-up, let’s see how todo a running sum of a discretefunction using Yacas. The follow-ing program computers the sum1+2+. . .+100 discussed to on page7. Now that we’re writing realcomputer programs with Yacas, itwould be a good idea to enter eachprogram into a file before trying torun it. In fact, some of these exam-ples won’t run properly if you juststart up Yacas and type them inone line at a time. If you’re usingAdobe Reader to read this book,you can do Tools>Basic>Select,select the program, copy it into afile, and then edit out the line num-

bers.

Example 251 n := 1;

2 sum := 0;

3 While (n<=100) [

4 sum := sum+n;

5 n := n+1;

6 ];

7 Echo(sum);

The semicolons are to separate oneinstruction from the next, and theybecome necessary now that we’redoing real programming. Line 1of this program defines the vari-able n, which will take on all thevalues from 1 to 100. Line 2 saysthat we haven’t added anything upyet, so our running sum is zero dofar. Line 3 says to keep on re-peating the instructions inside thesquare brackets until n goes past100. Line 4 updates the runningsum, and line 5 updates the valueof n. If you’ve never done any pro-gramming before, a statement liken:=n+1 might seem like nonsense— how can a number equal itselfplus one? But that’s why we usethe := symbol; it says that we’reredefining n, not stating an equa-tion. If n was previously 37, thenafter this statement is executed, nwill be redefined as 38. To run theprogram on a Linux computer, dothis (assuming you saved the pro-gram in a file named sum.yacas):

% yacas -pc sum.yacas

41

42 CHAPTER 3. INTEGRATION

5050

Here the % symbol is the com-puter’s prompt. The result is5,050, as expected. One way ofstating this result is

100∑n=1

n = 5050 .

The capital Greek letter Σ, sigma,is used because it makes the “s”sound, and that’s the first sound inthe word “sum.” The n = 1 belowthe sigma says the sum starts at 1,and the 100 on top says it ends at100. The n is what’s known as adummy variable: it has no mean-ing outside the context of the sum.Figure a shows the graphical inter-pretation of the sum: we’re addingup the areas of a series of rectan-gular strips. (For clarity, the figureonly shows the sum going up to 7,rather than 100.)

a / Graphical interpreta-tion of the sum 1+2+. . .+7.

Now how about an integral? Fig-ure b shows the graphical inter-

pretation of what we’re trying todo: find the area of the shadedtriangle. This is an example weknow how to do symbolically, sowe can do it numerically as well,and check the answers against eachother. Symbolically, the area isgiven by the integral. To inte-grate the function x(t) = t, weknow we need some function witha t2 in it, since we want somethingwhose derivative is t, and differen-tiation reduces the power by one.The derivative of t2 would be 2trather than t, so what we want isx(t) = t2/2. Let’s compute thearea of the triangle that stretchesalong the t axis from 0 to 100:x(100) = 100/2 = 5000.

b / Graphical interpreta-tion of the integral of thefunction x(t) = t .

Figure c shows how to accomplishthe same thing numerically. Webreak up the area into a wholebunch of very skinny rectangles.Ideally, we’d like to make the widthof each rectange be an infinitesimalnumber dx, so that we’d be adding

3.1. DEFINITE AND INDEFINITE INTEGRALS 43

up an infinite number of infinites-imal areas. In reality, a computercan’t do that, so we divide up theinterval from t = 0 to t = 100into H rectangles, each with fi-nite width dt = 100/H. Insteadof making H infinite, we make itthe largest number we can withoutmaking the computer take too longto add up the areas of the rectan-gles.

c / Approxmating the in-tegral numerically.

Example 26

1 tmax := 100;

2 H := 1000;

3 dt := tmax/H;

4 sum := 0;

5 t := 0;

6 While (t<=tmax) [

7 sum := N(sum+t*dt);

8 t := N(t+dt);

9 ];

10 Echo(sum);

In example 26, we split the in-terval from t = 0 to 100 intoH = 1000 small intervals, eachwith width dt = 0.1. The result is5,005, which agrees with the sym-

bolic result to three digits of preci-sion. Changing H to 10,000 gives5, 000.5, which is one more digit.Clearly as we make the numberof rectangles greater and greater,we’re converging to the correct re-sult of 5,000.

In the Leibniz notation, the thingwe’ve just calculated, by two differ-ent techniques, is written like this:∫ 100

0

t dt = 5, 000

It looks a lot like the Σ notation,with the Σ replaces by a flattened-out letter “S.” The t is a dummyvariable. What I’ve been casuallyreferring to as an integral is re-ally two different but closely re-lated things, known as the definiteintegral and the indefinite integral.

Definition of the indefinite integralIf x is a function, then a functionx is an indefinite integral of x if, asimplied by the notation, dx/dt =x.Interpretation: Doing an indefi-nite integral means doing the op-posite of differentiation. All thepossible indefinite integrals are thesame function except for an addi-tive constant.

Example 27. Find the indefinite integral of thefunction x(t) = t .

. Any function of the form

x(t) = t2/2 + c ,

44 CHAPTER 3. INTEGRATION

where c is a constant, is an indefi-nite integral of this function, since itsderivative is t .

Definition of the definite integralIf x is a function, then the definiteintegral of x from a to b is definedas ∫ b

a

x(t)dt

= limH→∞

H∑i=0

x (a + i∆t) ,

where ∆t = (b− a)/H.Interpretation: What we’re calcu-lating is the area under the graphof x, from a to b. (If the graphdips below the t axis, we interpretthe area between it and the axis asa negative area.) The thing insidethe limit is a calculation like theone done in example 26, but gen-eralized to a 6= 0. If H was infinite,then ∆t would be an infinitesimalnumber dt.

3.2 The fundamentaltheorem ofcalculus

The fundamental theorem of calcu-lusLet x be an indefinite integral ofx, and let x be a continuous func-tion (one whose graph is a singleconnected curve). Then

∫ b

a

x(t)dt = x(b)− x(a) .

Interpretation: In the simple ex-amples we’ve been doing so far, wewere able to choose an indefiniteintegral such that x(0) = 0. Inthat case, x(t) is interpreted as thearea from 0 to t, so in the expres-sion x(b) − x(a), we’re taking thearea from 0 to a, but subtractingout the area from 0 to b, whichgives the area from a to b. If wechoose an indefinite integral witha different c, the c’s will just can-cel out anyway in the differencex(b)− x(a).

The fundamental theorem isproved on page 83.

Example 28. Interpret the indefinite integralZ 2

1

1t

dt .

graphically; then evaluate it it bothsymbolically and numerically, andcheck that the two results are consis-tent.

d / The indefinite integralR 21 (1/t)dt .

3.3. PROPERTIES OF THE INTEGRAL 45

. Figure d shows the graphical inter-pretation. The numerical calculationrequires a trivial variation on the pro-gram from example 26:

a := 1;

b := 2;

H := 1000;

dt := (b-a)/H;

sum := 0;

t := a;

While (t<=b) [

sum := N(sum+(1/t)*dt);

t := N(t+dt);

];

Echo(sum);

The result is 0.693897243, andincreasing H to 10,000 gives0.6932221811, so we can befairly confident that the result equals0.693, to 3 decimal places.

Symbolically, the indefinite integral isx = ln t . Using the fundamental the-orem of calculus, the area is ln 2 −ln 1 ≈ 0.693147180559945.

Judging from the graph, it looks plau-sible that the shaded area is about0.7.

This is an interesting example, be-cause the natural log blows up to neg-ative infinity as t approaches 0, so it’snot possible to add a constant ontothe indefinite integral and force it to beequal to 0 at t = 0. Nevertheless, thefundamental theorem of calculus stillworks.

3.3 Properties of theintegral

Let f and g be two functions of x,and let c be a constant. We already

know that for derivatives,

ddx

(f + g) =df

dx+

dg

dx

and

ddx

(cf) = cdf

dx.

But since the indefinite integral isjust the operation of undoing aderivative, the same kind of rulesmust hold true for indefinite inte-grals as well:∫

(f + g)dx =∫

fdx +∫

gdx

and ∫(cf)dx = c

∫fdx .

And since a definite integral can befound by plugging in the upper andlower limits of integration into theindefinite integral, the same prop-erties must be true of definite inte-grals as well.

Example 29. Evaluate the indefinite integralZ

(x + 2 sin x)dx .

. Using the additive property, the inte-gral becomesZ

xdx +Z

2 sin xdx .

Then the property of scaling by a con-stant lets us change this toZ

xdx + 2Z

sin xdx .

46 CHAPTER 3. INTEGRATION

We need a function whose derivativeis x , which would be x2/2, and onewhose derivative is sin x , which mustbe − cos x , so the result is

12

x2 − 2 cos x + c .

3.4 ApplicationsAverages

In the story of Gauss’s problem ofadding up the numbers from 1 to100, one interpretation of the re-sult, 5,050, is that the average ofall the numbers from 1 to 100 is50.5. This is the ordinary defini-tion of an average: add up all thethings you have, and divide by thenumber of things. (The result inthis example makes sense, becausehalf the numbers are from 1 to 50,and half are from 51 to 100, so theaverage is half-way between 50 and51.)

Similarly, a definite integral canalso be thought of as a kind of aver-age. In general, if y is a function ofx, then the average, or mean, valueof y on the interval from x = a tob can be defined as

y =1

b− a

∫ b

a

y dx .

In the continuous case, dividing byb− a accomplishes the same thingas dividing by the number of thingsin the discrete case.

Example 30. Show that the definition of the aver-

age makes sense in the case wherethe function is a constant.

. If y is a constant, then we can takeit outside of the integral, so

y =1

b − ay

Z b

a1 dx

=1

b − ay x |ba

=1

b − ay (b − a)

= y

Example 31. Find the average value of the func-

tion y = x2 for values of x ranging from0 to 1.

y =1

1− 0

Z 1

0x2 dx

=13

x3˛1

0

=13

The mean value theoremIf the continuous function y(x) hasthe average value y on the inter-val from x = a to b, then y at-tains its average value at least oncein that interval, i.e., there exists ξwith a < ξ < b such that y(ξ) = y.

The mean value theorem is provedon page 84.

Example 32. Verify the mean value theorem fory = x2 on the interval from 0 to 1.

3.4. APPLICATIONS 47

. The mean value is 1/3, as shown inexample 31. This value is achievedat x =

p1/3 = 1/

√3, which lies be-

tween 0 and 1.

Work

In physics, work is a measure ofthe amount of energy transferredby a force; for example, if a horsesets a wagon in motion, the horse’sforce on the wagon is putting someenergy of motion into the wagon.When a force F acts on an ob-ject that moves in the direction ofthe force by an infinitesimal dis-tance dx, the infinitesimal workdone is dW = Fdx. Integratingboth sides, we have W =

∫ b

aFdx,

where the force may depend on x,and a and b represent the initialand final positions of the object.

Example 33. A spring compressed by an amountx relative to its relaxed length providesa force F = kx . Find the amount ofwork that must be done in order tocompress the spring from x = 0 tox = a. (This is the amount of energystored in the spring, and that energywill later be released into the toy bul-let.)

.

W =Z a

0Fdx

=Z a

0kxdx

=12

kx2˛a

0

=12

ka2

The reason W grows like a2, not justlike a, is that as the spring is com-pressed more, more and more effortis required in order to compress it.

48 CHAPTER 3. INTEGRATION

Problems1 Write a computer program sim-ilar to the one in example 28 onpage 44 to evaluate the definite in-tegral ∫ 1

0

ex2.

2 Evaluate the integral∫ 2π

0

sinx dx ,

and draw a sketch to explain whyyour result comes out the way itdoes.

3 Sketch the graph that repre-sents the definite integral∫ 2

0

−x2 + 2x ,

and estimate the result roughlyfrom the graph. Then evaluate theintegral exactly, and check againstyour estimate.

4 Show that the mean value the-orem’s assumption of continuity isnecessary, by exhibiting a discon-tinuous function for which the the-orem fails.

5 Show that the fundamentaltheorem of calculus’s assumptionof continuity for x is necessary, byexhibiting a discontinuous functionfor which the theorem fails.

6 Find the average value of sinxfor 0 < x < π.

7 Sketch the graphs of y = x2

and y =√

x for 0 ≤ x ≤ 1. Graph-ically, what relationship should ex-ist between the integrals

∫ 1

0x2 dx

and∫ 1

0

√x dx? Compute both in-

tegrals, and verify that the resultsare related in the expected way.

8 In a gasoline-burning car en-gine, the exploding air-gas mixturemakes a force on the piston, andthe force tapers off as the pistonexpands, allowing the gas to ex-pand. (a) In the approximationF = k/x, where x is the positionof the piston, find the work doneon the piston as it travels fromx = a to x = b, and show thatthe result only depends on the ra-tio b/a. This ratio is known asthe compression ratio of the en-gine. (b) A better approximation,which takes into account the cool-ing of the air-gas mixture as it ex-pands, is F = kx−1.4. Computethe work done in this case.

Problem 8.

PROBLEMS 49

9 A perfectly elastic ballbounces up and down forever,always coming back up to thesame height h. Find its averageheight. ?

50 CHAPTER 3. INTEGRATION

4 Techniques andapplications

4.1 Newton’s methodIn the 1958 science fiction novelHave Space Suit — WillTravel, by Robert Heinlein, Kipis a high school student who wantsto be an engineer, and his father istrying to convince him to stretchhimself more if he wants to get any-thing out of his education:

“Why did Van Buren fail of re-election? How do you extract thecube root of eighty-seven?”

Van Buren had been a president;that was all I remembered. But Icould answer the other one. “Ifyou want a cube root, you look ina table in the back of the book.”

Dad sighed. “Kip, do you thinkthat table was brought down fromon high by an archangel?”

We no longer use tables to com-pute roots, but how does a pocketcalculator do it? A techniquecalled Newton’s method allows usto calculate the inverse of any func-tion efficiently, including cases thataren’t preprogrammed into a cal-culator. In the example from thenovel, we know how to calculatethe function y = x3 fairly accu-rately and quickly for any givenvalue of x, but we want to turn theequation around and find x when

y = 87. We start with a roughmental guess: since 43 = 64 is a lit-tle too small, and 53 = 125 is muchtoo big, we guess x ≈ 4.3. Test-ing our guess, we have 4.33 = 79.5.We want y to get bigger by 7.5, andwe can use calculus to find approx-imately how much bigger x needsto get in order to accomplish that:

∆x =∆x

∆y∆y

≈ dx

dy∆y

=∆y

dy/dx

=∆y

3x2

=∆y

3x2

= 0.14

Increasing our value of x to 4.3 +0.14 = 4.44, we find that 4.443 =87.5 is a pretty good approxima-tion to 87. If we need higher preci-sion, we can go through the processagain with ∆y = −0.5, giving

∆x ≈ ∆y

3x2

= 0.14x = 4.43

x3 = 86.9 .

This second iteration gives an ex-cellent approximation.

51

52 CHAPTER 4. TECHNIQUES AND APPLICATIONS

a / Example 34.

Example 34. Figure 34 shows the astronomer Jo-hannes Kepler’s analysis of the motionof the planets. The ellipse is the or-bit of the planet around the sun. Att = 0, the planet is at its closest ap-proach to the sun, A. At some latertime, the planet is at point B. The an-gle x (measured in radians) is definedwith reference to the imaginary circleencompassing the orbit. Kepler foundthe equation

2πtT

= x − e sin x ,

where the period, T , is the time re-quired for the planet to complete a fullorbit, and the eccentricity of the el-lipse, e, is a number that measureshow much it differs from a circle. Therelationship is complicated becausethe planet speeds up as it falls inwardtoward the sun, and slows down againas it swings back away from it.

The planet Mercury has e = 0.206.

Find the angle x when Mercury hascompleted 1/4 of a period.

. We have

y = x − (0.206) sin x ,

and we want to find x when y =2π/4 = 1.57. As a first guess, we tryx = π/2 (90 degrees), since the ec-centricity of Mercury’s orbit is actuallymuch smaller than the example shownin the figure, and therefore the planet’sspeed doesn’t vary all that much as itgoes around the sun. For this value ofx we have y = 1.36, which is too smallby 0.21.

∆x ≈ ∆ydy/dx

=0.21

1− (0.206) cos x= 0.21

(The derivative dy/dx happens to be1 at x = π/2.) This gives a new valueof x , 1.57+.21=1.78. Testing it, wehave y = 1.58, which is correct towithin rounding errors after only oneiteration. (We were only supplied witha value of e accurate to three signifi-cant figures, so we can’t get a resultwith precision better than about thatlevel.)

4.2 Implicitdifferentiation

We can differentiate any functionthat is written as a formula, andfind a result in terms of a formula.However, sometimes the originalproblem can’t be written in anynice way as a formula. For exam-ple, suppose we want to find dy/dx

4.3. TAYLOR SERIES 53

in a case where the relationship be-tween x and y is given by the fol-lowing equation:

y7 + y = x7 + x2 .

There is no equivalent of thequadratic formula for seventh-order polynomials, so we have noway to solve for one variable interms of the other in order to dif-ferentiate it. However, we can stillfind dy/dx in terms of x and y.Suppose we let x grow to x + dx.Then for example the x2 term willgrow to (x+dx)2 = x+2dx+dx2.The squared infinitesimal is negli-gible, so the increase in x2 was re-ally just 2dx, and we’ve really justcomputer the derivative of x2 withrespect to x and multiplied it bydx. In symbols,

d(x2) =d(x2)dx

· dx

= 2x dx .

That is, the change in x2 is 2xtimes the change in x. Doing thisto both sides of the original equa-tion, we have

d(y7 + y) = d(x7 + x2)

7y6 dy + 1 dy = 7x6 dx + 2x dx

(7y6 + 1)dy = (7x6 + 2x)dx

dy

dx=

7y6 + 17x6 + 2x

.

This still doesn’t give us a for-mula for the derivative in terms ofx alone, but it’s not entirely use-less. For instance, if we’re given

a numerical value of x, we can al-ways use Newton’s method to findy, and then evaluate the derivative.

4.3 Taylor seriesIf you calculate e0.1 on your calcu-lator, you’ll find that it’s very closeto 1.1. This is because the tangentline at x = 0 on the graph of ex

has a slope of 1 (dex/dx = ex = 1at x = 0), and the tangent line isa good approximation to the expo-nential curve as long as we don’tget too far away from the point oftangency.

3

2

1

1-1x

y

b / The function ex , andthe tangent line at x = 0.

How big is the error? Theactual value of e0.1 is1.10517091807565 . . ., whichdiffers from 1.1 by about 0.005.If we go farther from the pointof tangency, the approximationgets worse. At x = 0.2, the erroris about 0.021, which is aboutfour times bigger. In other words,doubling x seems to roughlyquadruple the error, so the erroris proportional to x2; it seems to

54 CHAPTER 4. TECHNIQUES AND APPLICATIONS

be about x2/2. Well, if we wanta handy-dandy, super-accurateestimate of ex for small values ofx, why not just account for thiserror. Our new and improvedestimate is

ex ≈ 1 + x +12x2

for small values of x.

3

2

1

1-1x

y

c / The function ex , andthe approximation 1 + x +x2/2.

Figure c shows that the approxi-mation is now extremely good forsufficiently small values of x. Thedifference is that whereas 1 + xmatched both the y-intercept andthe slope of the curve, 1+x+x2/2matches the curvature as well. Re-call that the second derivative is ameasure of curvature. The secondderivatives of the function and itsapproximation are

ddx

ex = 1

ddx

(1 + x +

12x2

)= 1

3

2

1

1-1x

y

d / The function ex , andthe approximation 1 + x +x2/2 + x3/6.

We can do even better. Supposewe want to match the third deriva-tives. All the derivatives of ex,evaluated at x = 0, are 1, so wejust need to add on a term pro-portional to x3 whose third deriva-tive is one. Taking the first deriva-tive will bring down a factor of 3in front, and taking and the sec-ond derivative will give a 2, so tocancel these out we need the third-order term to be (1/2)(1/3):

ex ≈ 1 + x +12x2 +

12 · 3

x3

Figure d shows the result. For asignificant range of x values closeto zero, the approximation is nowso good that we can’t even see thedifference between the two func-tions on the graph.

On the other hand, figure e showsthat the cubic approximation forsomewhat larger negative and pos-itive values of x is poor — worse,in fact, than the linear approxi-mation, or even the constant ap-proximation ex = 1. This is tobe expected, because any polyno-

4.3. TAYLOR SERIES 55

mial will blow up to either posi-tive or negative infinity as x ap-proaches negative infinity, whereasthe function ex is supposed to getvery close to zero for large negativex. The idea here is that derivativesare local things: they only measurethe properties of a function veryclose to the point at which they’reevaluated, and they don’t necessar-ily tell us anything about points faraway.

10

5

-5

-10

2 1-1-2-3-4x

y

e / The function ex , and the approxi-mation 1 + x + x2/2 + x3/6, on a widerscale.

It’s a remarkable fact, then, thatby taking enough terms in a poly-nomial approximation, we can al-ways get as good an approximationto ex as necessary — it’s just thata large number of terms may berequired for large values of x. Inother words, the infinite series

1 + x +12x2 +

12 · 3

x3 + . . .

always gives exactly ex. But what

is the pattern here that would al-lows us to figure out, say, thefourth-order and fifth-order termsthat were swapt under the rugwith the symbol “. . . ”? Let’s dothe fifth-order term as an example.The point of adding in a fifth-orderterm is to make the fifth derivativeof the approximation equal to thefifth derivative of ex, which is 1.The first, second, . . . derivatives ofx5 are

ddx

x5 = 5x4

d2

dx2x5 = 5 · 4x3

d3

dx3x5 = 5 · 4 · 3x2

d4

dx4x5 = 5 · 4 · 3 · 2x

d5

dx5x5 = 5 · 4 · 3 · 2 · 1

The notation for a product like 1 ·2 · . . . · n is n!, read “n factorial.”So to get a term for our polynomialwhose fifth derivative is 1, we needx5/5!. The result for the infiniteseries is

ex =∞∑

n=0

xn

n!,

where the special case of 0! = 1 isassumed.1 This is called the Tay-lor series for ex, evaluated aroundx = 0, and it’s true, although Ihaven’t proved it, that this partic-ular Taylor series always converges

1This makes sense, because, for exam-ple, 4!=5!/5, 3!=4!/4, etc., so we shouldhave 0!=1!/1.

56 CHAPTER 4. TECHNIQUES AND APPLICATIONS

to ex, no matter how far x is fromzero.

A Taylor series can be used to ap-proximate other functions besidesex, and when you ask your calcu-lator to evaluate a function suchas a sine or a cosine, it may ac-tually be using a Taylor series todo it. In general, the Taylor seriesaround x = 0 for a function y is

T0(x) =∞∑

n=0

anxn ,

where the condition for equality ofthe nth order derivative is

an =1n!

dny

dxn

∣∣∣∣x=0

.

Here the notation |x=0 means thatthe derivative is to be evaluated atx = 0.

Example 35The function y = e−1/x2

, shown in fig-ure f, never converges to its Taylor se-ries, except at x = 0. This is becausethe Taylor series for this function, eval-uated around x = 0 is exactly zero! Atx = 0, we have y = 0, dy/dx = 0,d2y/dx2 = 0, and so on for everyderivative. The zero function matchesthe function y (x) and all its derivativesto all orders, and yet is useless as anapproximation to y (x).

In general, every function’s Tay-lor series around x = 0 convergesto the function for all values of x inthe range defined by |x| < r, wherer is some number, known as the ra-dius of convergence. For the func-tion ex, the radius of convergence

1

4 2-2-4x

y

f / The function e−1/x2never con-

verges to its Taylor series.

happens to be infinite, whereas fore−1/x2

it’s zero.

A function’s Taylor series doesn’thave to be evaluated around x =0. The Taylor series around someother center x = c is given by

Tc(x) =∞∑

n=0

an(x− c)n ,

where

an

n!=

dny

dxn

∣∣∣∣x=c

.

Example 36. Find the Taylor series of y = sin x ,evaluated around x = 0.

4.3. TAYLOR SERIES 57

. The first few derivatives are

ddx

sin x = cos x

d2

dx2 sin x = − sin x

d3

dx3 sin x = − cos x

d4

dx4 sin x = sin x

d5

dx5 sin x = cos x

We can see that there will be a cy-cle of sin, cos, − sin, and − cos, re-peating indefinitely. Evaluating thesederivatives at x = 0, we have 0, 1, 0,−1, . . . . All the even-order terms ofthe series are zero, and all the odd-order terms are ±1/n!. The result is

sin x = x − 13!

x3 +15!

x5 − . . . .

The linear term is the familiar small-angle approximation sin x ≈ x .

The radius of convergence of this se-ries turns out to be infinite. Intuitivelythe reason for this is that the factori-als grow extremely rapidly, so that thesuccessive terms in the series even-tually start diminish quickly, even forlarge values of x .

Example 37. Find the Taylor series of y = 1/(1−x)around x = 0, and see what you cansay about its radius of convergence.

. Rewriting the function as y = (1 −x)−1 and applying the chain rule, we

have

y |x=0 = 1

dydx

˛x=0

= (1− x)−2˛x=0

= 1

d2ydx2

˛x=0

= 2(1− x)−3˛x=0

= 2

d3ydx3

˛x=0

= 2 · 3(1− x)−4˛x=0

= 2 · 3

. . .

The pattern is that the nth derivativeis n!. The Taylor series therefore hasan = n!/n! = 1:

11− x

= 1 + x + x2 + x3 + . . .

The radius of convergence of this se-ries definitely can’t be greater than 1,since for x = 1 the series is 1 + 1 +1 + . . ., which grows indefinitely with-out ever converging to a specific num-ber. Likewise for x = −1 the seriesbecomes 1 − 1 + 1 − 1 + . . ., whichoscillates back and forth rather thanconverging. Intuitively we can see acouple of hints as to why this hap-pens: (1) The function 1/(1 − x) it-self misbehaves at x = 1, blowing upto infinity; (2) The only way a seriescan get closer and closer to a finitevalue is if the absolute values of theterms decrease, and decrease suffi-ciently rapidly. But the coefficients an

of this Taylor series don’t decreasewith n, so so the only way the abso-lute values of the terms can decreaseis for |x | < 1.

58 CHAPTER 4. TECHNIQUES AND APPLICATIONS

4.4 Methods ofintegration

Change of variable

Sometimes and unfamiliar-lookingintegral can be made into a famil-iar one by substituting a a newvariable for an old one. For exam-ple, we know how to integrate 1/x— the answer is lnx — but whatabout ∫

dx

2x + 1?

Let u = 2x + 1. Differentiatingboth sides, we have du = 2dx, ordx = du/2, so∫

dx

2x + 1=

∫du/2

u

=12

lnu + c

=12

ln(2x + 1) + c .

In the case of a definite integral,we have to remember to change thelimits of integration to reflect thenew variable.

Example 38. Evaluate

R 43 dx/(2x + 1).

. As before, let u = 2x + 1.Z x=4

x=3

dx2x + 1

=Z u=9

u=7

du/2u

=12

ln u˛u=9

u=7

Here the notation | u = 7u=9 means toevaluate the function at 7 and 9, andsubtract the former from the latter.The result isZ x=4

x=3

dx2x + 1

=12

(ln 9− ln 7)

=12

ln97

.

Sometimes, as in the next example,a clever substitution is the secret todoing a seemingly impossible inte-gral.

Example 39. EvaluateZ

e√

x

√x

dx .

. The only hope for reducing this to aform we can do is to let u =

√x . Then

dx = d(u2) = 2udu, soZe√

x

√x

dx =Z

eu

u· 2u du

= 2Z

eudu

= 2eu

= 2e√

x .

Example 39 really isn’t so tricky,since there was only one logicalchoice for the substitution that hadany hope of working. The follow-ing is a little more dastardly.

Example 40. Evaluate Z

dx1 + x2 .

4.4. METHODS OF INTEGRATION 59

. The substitution that works is x =tan u. First let’s see what this doesto the expression 1 + x2. The familiaridentity

sin2 u + cos2 u = 1 ,

when divided by cos2 u, gives

tan2 u + 1 = sec2 u ,

so 1 + x2 becomes sec2 u. But differ-entiating both sides of x = tan u gives

dx = dhsin u(cos u)−1

i= (d sin u)(cos u)−1

+ (sin u)dh(cos u)−1

i=

“1 + tan2 u

”du

= sec2 u du ,

so the integral becomesZdx

1 + x2 =Z

sec2 udusec2 u

= u + c

= tan−1 x + c .

What mere mortal would everhave suspected that the substitu-tion x = tanu was the one thatwas needed in example 40? Onepossible answer is to give up anddo the integral on a computer:

Integrate(x) 1/(1+x^2)ArcTan(x)

Another possible answer is thatyou can usually smell the possi-bility of this type of substitution,involving a trig function, when

the thing to be integrated con-tains something reminiscent of thePythagoran theorem, as suggestedby figure g. The 1 + x2 looks likewhat you’d get if you had a righttriangle with legs 1 and x, and wereusing the Pythagorean theorem tofind its hypotenuse.

g / The substitution x =tan u.

Example 41. Evaluate

Rdx/

√1− x2.

. The√

1− x2 looks like what you’dget if you had a right triangle withhypotenuse 1 and a leg of length x ,and were using the Pythagorean the-orem to find the other leg, as in fig-ure h. This motivates us to try thesubstitution x = cos u, which givesdx = − sin u du and

√1− x2 =√

1− cos2 u = sin u. The result isZdx√

1− x2=

Z− sin u du

sin u

= u + c

= cos−1 x .

h / The substitution x =cos u.

60 CHAPTER 4. TECHNIQUES AND APPLICATIONS

4.5 Integration byparts

Figure i shows a technique calledintegration by parts. If the inte-gral

∫vdu is easier than the inte-

gral∫

udv, then we can calculatethe easier one, and then by sim-ple geometry determine the one wewanted. Identifying the large rect-angle that surrounds both shadedareas, and the small white rectan-gle on the lower left, we have∫

u dv =(area of large rectangle)

− (area of small rectangle)∫v du .

i / Integration by parts.

In the case of an indefinite integral,we have a similar relationship de-

rived from the product rule:

d(uv) = u dv + v du

u dv = d(uv)− v du

Integrating both sides, we have thefollowing relation.

Integration by parts∫u dv = uv −

∫v du .

Since a definite integral can al-ways be done by evaluating an in-definite integral at its upper andlower limits, one usually uses thisform. Integrals don’t usually comeprepackaged in a form that makesit obvious that you should use inte-gration by parts. What the equa-tion for integration by parts tellsus is that if we can split up theintegrand into two factors, one ofwhich (the dv) we know how tointegrate, we have the option ofchanging the integral into a newform in which that factor becomesits integral, and the other fac-tor becomes its derivative. If wechoose the right way of splitting upthe integrand into parts, the resultcan be a simplification.

Example 42. Evaluate Z

x cos x dx

. There are two obvious possibilitiesfor splitting up the integrand into fac-

4.6. PARTIAL FRACTIONS 61

tors,

u dv = (x)(cos x dx)

or

u dv = (cos x)(x dx) .

The first one is the one that lets usmake progress. If u = x , then du = dx ,and if dv = cos x dx , then integrationgives v = sin x .

Zx cos x dx =

Zu dv

= uv −Z

v du

= x sin x −Z

sin x dx

= x sin x + cos x

Of the two possibilities we consid-ered for u and dv , the reason thisone helped was that differentiating xgave dx , which was simpler, and in-tegrating cos xdx gave sin x , whichwas no more complicated than be-fore. The second possibility wouldhave made things worse rather thanbetter, because integrating xdx wouldhave given x2/2, which would havebeen more complicated rather thanless.

4.6 Partial fractions

Given a function like

−1x− 1

+1

x + 1,

we can rewrite it over a commondenominator like this:(

−1x− 1

) (x + 1x + 1

)+

(1

x + 1

) (x− 1x− 1

)=−x− 1 + x− 1(x− 1)(x + 1)

=−2

x2 − 1.

But note that the original form iseasily integrated to give∫ (

−1x− 1

+1

x + 1

)dx

= − ln(x−1)+ln(x+1)+c ,

while faced with the form−2/(x2 − 1), we wouldn’t haveknown how to integrate it.

The idea of the method of partialfractions is that if we want to doan integral of the form∫

dx

P (x),

where P (x) is an nth order poly-nomial, we can always rewrite 1/Pas

1P (x)

=A1

x− r1+ . . .

An

x− rn,

where r1 . . . rn are the roots of thepolynomial, i.e., the solutions ofthe equation P (r) = 0. If the poly-nomial is second-order, you canfind the roots r1 and r2 usingthe quadratic formula; I’ll assumefor the time being that they’re

62 CHAPTER 4. TECHNIQUES AND APPLICATIONS

real. For higher-order polynomi-als, there is no surefire, easy wayof finding the roots by hand, andyou’d be smart simply to use com-puter software to do it. In Yacas,you can find the real roots of apolynomial like this:

FindRealRoots(x^4-5*x^3-25*x^2+65*x+84){3.,7.,-4.,-1.}

(I assume it uses Newton’s methodto find them.) The constants Ai

can then be determined by alge-bra, or by the trick of evaluating1/P (x) for a value of x very closeto one of the roots. In the exam-ple of the polynomial x4 − 5x3 −25x2 + 65x + 84, let r1 . . . r4 bethe roots in the order in which theywere returned by Yacas. Then A1

can be found by evaluating 1/P (x)at x = 3.000001:

P(x):=x^4-5*x^3-25*x^2+65*x+84

N(1/P(3.000001))-8928.5702094768

We know that for x very close to3, the expression

1P

=A1

x− 3+

A2

x− 7+

A3

x + 4+

A4

x + 1

will be dominated by the A1 term,so

−8930 ≈ A1

3.000001− 3A1 ≈ (−8930)(10−6) .

By the same method we can findthe other four constants:

dx:=.000001N(1/P(7+dx),30)*dx

0.2840908276e-2

N(1/P(-4+dx),30)*dx-0.4329006192e-2

N(1/P(-1+dx),30)*dx0.1041666664e-1

(The N( ,30) construct is to tellYacas to do a numerical calcula-tion rather than an exact symbolicone, and to use 30 digits of pre-cision, in order to avoid problemswith rounding errors.) Thus,

1P

=−8.93× 10−3

x− 3

+2.84× 10−3

x− 7

− 4.33× 10−3

x + 4

+1.04× 10−2

x + 1.

The desired integral is∫dx

P (x)= −8.93× 10−3 ln(x− 3)

+ 2.84× 10−3 ln(x− 7)

− 4.33× 10−3 ln(x + 4)

+ 1.04× 10−2 ln(x + 1)+ c .

There are some possible complica-tions: (1) The same factor may oc-cur more than once, as in x3−5x2+7x−3 = (x−1)(x−1)(x−3). In this

4.6. PARTIAL FRACTIONS 63

example, we have to look for an an-swer of the form A/(x−1)+B/(x−1)2 +C/(x− 3), the solution being−.25/(x−1)−.5/(x−1)2+.25/(x−3). (2) The roots may be complex.This is no showstopper if you’reusing computer software that han-dles complex numbers gracefully.(You can choose a c that makes theresult real.) In fact, as discussed insection 5.3, some beautiful thingscan happen with complex roots.But as an alternative, any polyno-mial with real coefficients can befactored into linear and quadraticfactors with real coefficients. Foreach quadratic factor Q(x), wethen have a partial fraction of theform (A+Bx)/Q(x), where A andB can be determined by algebra.

64 CHAPTER 4. TECHNIQUES AND APPLICATIONS

5 Complex numbertechniques

5.1 Review ofcomplexnumbers

For a more detailed treatment ofcomplex numbers, see ch. 3 ofJames Nearing’s free book athttp://www.physics.miami.edu/nearing/mathmethods/.

a / Visualizing complex numbers aspoints in a plane.

We assume there is a number, i,such that i2 = −1. The squareroots of −1 are then i and −i. (Inelectrical engineering work, wherei stands for current, j is sometimesused instead.) This gives rise toa number system, called the com-plex numbers, containing the real

b / Addition of complex numbers isjust like addition of vectors, althoughthe real and imaginary axes don’t ac-tually represent directions in space.

numbers as a subset. Any com-plex number z can be written inthe form z = a + bi, where a andb are real, and a and b are thenreferred to as the real and imagi-nary parts of z. A number witha zero real part is called an imag-inary number. The complex num-bers can be visualized as a plane,figure a, with the real number lineplaced horizontally like the x axisof the familiar x−y plane, and theimaginary numbers running alongthe y axis. The complex num-bers are complete in a way that thereal numbers aren’t: every nonzerocomplex number has two squareroots. For example, 1 is a real

65

66 CHAPTER 5. COMPLEX NUMBER TECHNIQUES

c / A complex number and its conju-gate.

number, so it is also a memberof the complex numbers, and itssquare roots are −1 and 1. Like-wise, −1 has square roots i and −i,and the number i has square roots1/√

2 + i/√

2 and −1/√

2− i/√

2.

Complex numbers can be addedand subtracted by adding or sub-tracting their real and imaginaryparts, figure b. Geometrically, thisis the same as vector addition.

The complex numbers a + bi anda − bi, lying at equal distancesabove and below the real axis, arecalled complex conjugates. The re-sults of the quadratic formula areeither both real, or complex conju-gates of each other. The complexconjugate of a number z is notatedas z or z∗.

The complex numbers obey all thesame rules of arithmetic as the re-als, except that they can’t be or-dered along a single line. That is,

it’s not possible to say whether onecomplex number is greater thananother. We can compare themin terms of their magnitudes (theirdistances from the origin), buttwo distinct complex numbers mayhave the same magnitude, so, forexample, we can’t say whether 1 isgreater than i or i is greater than1.

Example 43. Prove that 1/

√2 + i/

√2 is a square

root of i .

. Our proof can use any ordinary rulesof arithmetic, except for ordering.

(1√2

+i√2

)2 =1√2· 1√

2+

1√2· i√

2

+i√2· 1√

2+

i√2· i√

2

=12

(1 + i + i − 1)

= i

Example 43 showed one methodof multiplying complex numbers.However, there is another nice in-terpretation of complex multiplica-tion. We define the argument ofa complex number, figure d, as itsangle in the complex plane, mea-sured counterclockwise from thepositive real axis. Multiplyingtwo complex numbers then corre-sponds to multiplying their magni-tudes, and adding their arguments,figure e.

Self-CheckUsing this interpretation of multiplica-tion, how could you find the square

5.1. REVIEW OF COMPLEX NUMBERS 67

d / A complex number can be de-scribed in terms of its magnitude andargument.

roots of a complex number? .

Answer, p. 79

Example 44The magnitude |z| of a complex num-ber z obeys the identity |z|2 = zz.To prove this, we first note that z hasthe same magnitude as z, since flip-ping it to the other side of the real axisdoesn’t change its distance from theorigin. Multiplying z by z gives a re-sult whose magnitude is found by mul-tiplying their magnitudes, so the mag-nitude of zz must therefore equal |z|2.Now we just have to prove that zz is apositive real number. But if, for exam-ple, z lies counterclockwise from thereal axis, then z lies clockwise fromit. If z has a positive argument, thenz has a negative one, or vice-versa.The sum of their arguments is there-fore zero, so the result has an argu-ment of zero, and is on the positivereal axis. 1

1I cheated a little. If z’s argument is

e / The argument of uv is the sum ofthe arguments of u and v .

This whole system was built upin order to make every numberhave square roots. What aboutcube roots, fourth roots, and soon? Does it get even more weirdwhen you want to do those as well?No. The complex number systemwe’ve already discussed is sufficientto handle all of them. The nicestway of thinking about it is in termsof roots of polynomials. In thereal number system, the polyno-mial x2 − 1 has two roots, i.e., twovalues of x (plus and minus one)that we can plug in to the polyno-mial and get zero. Because it hasthese two real roots, we can rewritethe polynomial as (x − 1)(x + 1).However, the polynomial x2+1 hasno real roots. It’s ugly that in thereal number system, some second-

30 degrees, then we could say z’s was -30,but we could also call it 330. That’s OK,because 330+30 gives 360, and an argu-ment of 360 is the same as an argumentof zero.

68 CHAPTER 5. COMPLEX NUMBER TECHNIQUES

order polynomials have two roots,and can be factored, while otherscan’t. In the complex number sys-tem, they all can. For instance,x2 + 1 has roots i and −i, and canbe factored as (x − i)(x + i). Ingeneral, the fundamental theoremof algebra states that in the com-plex number system, any nth-orderpolynomial can be factored com-pletely into n linear factors, andwe can also say that it has n com-plex roots, with the understand-ing that some of the roots may bethe same. For instance, the fourth-order polynomial x4 + x2 can befactored as (x− i)(x+ i)(x−0)(x−0), and we say that it has fourroots, i, −i, 0, and 0, two of whichhappen to be the same. This is asensible way to think about it, be-cause in real life, numbers are al-ways approximations anyway, andif we make tiny, random changes tothe coefficients of this polynomial,it will have four distinct roots, ofwhich two just happen to be veryclose to zero. I’ve given a proof ofthe fundamental theorem of alge-bra on page 85.

Example 45Find arg i , arg(−i), and arg 37, wherearg z denotes the argument of thecomplex number z.

Example 46Visualize the following multiplicationsin the complex plane using the inter-pretation of multiplication in terms ofmultiplying magnitudes and adding ar-guments: (i)(i) = −1, (i)(−i) = 1,(−i)(−i) = −1.

Example 47If we visualize z as a point in the com-plex plane, how should we visualize−z?

Example 48Find four different complex numbers zsuch that z4 = 1.

Example 49Compute the following:

|1 + i | , arg(1 + i) ,˛

11 + i

˛,

arg„

11 + i

«,

11 + i

5.2 Euler’s formulaHaving expanded our horizons toinclude the complex numbers, it’snatural to want to extend func-tions we knew and loved from theworld of real numbers so that theycan also operate on complex num-bers. The only really natural wayto do this in general is to use Tay-lor series. A particularly beautifulthing happens with the functionsex, sin x, and cos x:

ex = 1 +12!

x2 +13!

x3 + . . .

cos x = 1− 12!

x2 +14!

x4 − . . .

sinx = x− 13!

x3 +15!

x5 − . . .

If x = iφ is an imaginary number,we have

eiφ = cos φ + i sinφ ,

5.2. EULER’S FORMULA 69

a result known as Euler’s formula.The geometrical interpretation inthe complex plane is shown in fig-ure f.

f / The complex number eiφ lies on theunit circle.

Although the result may seem likesomething out of a freak show atfirst, applying the definition2 of theexponential function makes it clearhow natural it is:

ex = limn→∞

(1 +

x

n

)n

.

When x = iφ is imaginary, thequantity (1 + iφ/n) represents anumber lying just above 1 in thecomplex plane. For large n, (1 +iφ/n) becomes very close to theunit circle, and its argument is thesmall angle φ/n. Raising this num-ber to the nth power multiplies its

2See page 83 for an explanation ofwhere this definition comes from and whyit makes sense.

argument by n, giving a numberwith an argument of φ.

g / Leonhard Euler(1707-1783)

Euler’s formula is used frequentlyin physics and engineering.

Example 50. Write the sine and cosine functionsin terms of exponentials.

. Euler’s formula for x = −iφ givescos φ− i sin φ, since cos(−θ) = cos θ,and sin(−θ) = − sin θ.

cos x =eix + e−ix

2

sin x =eix − e−ix

2i

Example 51. Evaluate Z

ex cos xdx

. This seemingly impossible integralbecomes easy if we rewrite the cosine

70 CHAPTER 5. COMPLEX NUMBER TECHNIQUES

in terms of exponentials:

Zex cos xdx

=Z

ex„

eix + e−ix

2

«dx

=12

Z(e(1+i)x + e(1−i)x ) dx

=12

„e(1+i)x

1 + i+

e(1−i)x

1− i

«+ c

Since this result is the integral of areal-valued function, we’d like it to bereal, and in fact it is, since the first andsecond terms are complex conjuagesof one another. If we wanted to, wecould use Euler’s theorem to convertit back to a manifestly real result.3

5.3 Partial fractionsrevisited

Suppose we want to evaluate theintegral ∫

dx

x2 + 1

by the method of partial fractions.The quadratic formula tells us thatthe roots are i and −i, setting1/(x2 + 1) = A/(x + i) + B/(x− i)

3In general, the use of complex num-ber techniques to do an integral could re-sult in a complex number, but that com-plex number would be a constant, whichcould be subsumed within the usual con-stant of integration.

gives A = i/2 and B = −i/2, so∫dx

x2 + 1=

i

2

∫dx

x + i

− i

2

∫dx

x− i

=i

2ln(x + i)

− i

2ln(x− i)

=i

2ln

x + i

x− i.

The attractive thing about this ap-proach, compared with the methodused on page 58, is that it doesn’trequire any tricks. If you cameacross this integral ten years fromnow, you could pull out your oldcalculus book, flip through it, andsay, “Oh, here we go, there’s a wayto integrate one over a polynomial— partial fractions.” On the otherhand, it’s odd that we started outtrying to evaluate an integral thathad nothing but real numbers, andcame out with an answer that isn’teven obviously a real number.

But what about that expression(x+i)/(x−i)? Let’s give it a name,w. The numerator and denomina-tor are complex conjugates of oneanother. Since they have the samemagnitude, we must have |w| = 1,i.e., w is a complex number thatlies on the unit circle, the kind ofcomplex number that Euler’s for-mula refers to. The numeratorhas an argument of tan−1(1/x) =π/2 − tan−1 x, and the denomi-nator has the same argument butwith the opposite sign. Division

5.3. PARTIAL FRACTIONS REVISITED 71

means subtracting arguments, soarg w = π−2 tan−1 x. That meansthat the result can be rewritten us-ing Euler’s formula as∫

dx

x2 + 1=

i

2ln ei(π−2 tan−1 x)

=i

2· i(π − 2 tan−1 x)

= tan−1 x + c .

In other words, it’s the same resultwe found before, but found with-out the need for trickery.

72 CHAPTER 5. COMPLEX NUMBER TECHNIQUES

6 Improper integrals6.1 Integrating a

function thatblows up

When we integrate a function thatblows up to infinity at some pointin the interval we’re integrating,the result may be either finite orinfinite.

Example 52. Integrate the function y = 1/

√x

from x = 0 to x = 1.

. The function blows up to infinity atone end of the region of integration,but let’s just try evaluating it, and seewhat happens.

Z 1

0x−1/2 dx = 2x1/2

˛1

0

= 2

The result turns out to be finite. In-tuitively, the reason for this is that thespike at x = 0 is very skinny, and getsskinny fast as we go higher and higherup.

a / The integralR 10 dx/

√x is finite.

Example 53. Integrate the function y = 1/x2 fromx = 0 to x = 1.

. Z 1

0x−2 dx = −x−1

˛1

0

= −1 +10

Division by zero is undefined, so theresult is undefined.

Another way of putting it, using the hy-perreal number system, is that if wewere to integrate from ε to 1, where ε

was an infinitesimal number, then theresult would be −1 + 1/ε, which is infi-nite. The smaller we make ε, the big-ger the infinite result we get out.

Intuitively, the reason that this integralcomes out infinite is that the spike atx = 0 is fat, and doesn’t get skinnyfast enough.

73

74 CHAPTER 6. IMPROPER INTEGRALS

b / The integralR 1

0 dx/x2

is infinite.

These two examples were examplesof improper integrals.

6.2 Limits ofintegration atinfinity

Another type of improper integralis one in which one of the limits ofintegration is infinite. The nota-tion ∫ ∞

a

f(x) dx

means the limit of∫ H

af(x) dx,

where H is made to grow big-ger and bigger. Alternatively, wecan think of it as an integral inwhich the top end of the intervalof integration is an infinite hyper-real number. A similar interpreta-tion applies when the lower limit is−∞, or when both limits are infi-nite.

Example 54. Evaluate Z ∞

1x−2 dx

. Z H

1x−2 dx = −x−1

˛H

1

= − 1H

+ 1

As H gets bigger and bigger, the re-sult gets closer and closer to 1, so theresult of the improper integral is 1.

Note that this is the same graph asin example 52, but with the x and yaxes interchanged; this shows that thetwo different types of improper inte-grals really aren’t so different.

c / The integralR∞1 dx/x2 is finite.

7 Iterated integrals7.1 Integrals inside

integralsIn various applications, you needto do integrals stuck inside otherintegrals. These are known as it-erated integrals, or double inte-grals, triple integrals, etc. Simi-lar concepts crop up all the timeeven when you’re not doing cal-culus, so let’s start by imaginingsuch an example. Suppose youwant to count how many squaresthere are on a chess board, and youdon’t know how to multiply eighttimes eight. You could start fromthe upper left, count eight squaresacross, then continue with the sec-ond row, and so on, until youhow counted every square, givingthe result of 64. In slightly moreformal mathematical language, wecould write the following recipe:for each row, r, from 1 to 8, con-sider the columns, c, from 1 to 8,and add one to the count for eachone of them. Using the sigma no-tation, this becomes

8∑r=1

8∑c=1

1 .

If you’re familiar with computerprogramming, then you can thinkof this as a sum that could becalculated using a loop nested in-side another loop. To evaluate theresult (again, assuming we don’t

know how to multiply, so we haveto use brute force), we can firstevaluate the inside sum, whichequals 8, giving

8∑r=1

8 .

Notice how the “dummy” variablec has disappeared. Finally we dothe outside sum, over r, and findthe result of 64.

Now imagine doing the same thingwith the pixels on a TV screen.The electron beam sweeps acrossthe screen, painting the pixels ineach row, one at a time. This is re-ally no different than the exampleof the chess board, but because thepixels are so small, you normallythink of the image on a TV screenas continuous rather than discrete.This is the idea of an integral incalculus. Suppose we want to findthe area of a rectangle of width aand height b, and we don’t knowthat we can just multiply to getthe area ab. The brute force wayto do this is to break up the rect-angle into a grid of infinitesimallysmall squares, each having widthdx and height dy, and therefore theinfinitesimal area dA = dxdy. Forconvenience, we’ll imagine that therectangle’s lower left corner is atthe origin. Then the area is given

75

76 CHAPTER 7. ITERATED INTEGRALS

by this integral:

area =∫ b

y=0

∫ a

x=0

dA

=∫ b

y=0

∫ a

x=0

dx dy

Notice how the leftmost integralsign, over y, and the rightmostdifferential, dy, act like bookends,or the pieces of bread on a sand-wich. Inside them, we have the in-tegral sign that runs over x, andthe differential dx that matches iton the right. Finally, on the inner-most layer, we’d normally have thething we’re integrating, but here’sit’s 1, so I’ve omitted it. Writ-ing the lower limits of the integralswith x = and y = helps to keepit straight which integral goes withwith differential. The result is

area =∫ b

y=0

∫ a

x=0

dA

=∫ b

y=0

∫ a

x=0

dx dy

=∫ b

y=0

(∫ a

x=0

dx

)dy

=∫ b

y=0

a dy

= a

∫ b

y=0

dy

= ab .

Area of a triangle Example 55. Find the area of a 45-45-90 right tri-angle having legs a.

. Let the triangle’s hypotenuse runfrom the origin to the point (a, a), and

let its legs run from the origin to (0, a),and then to (a, a). In other words, thetriangle sits on top of its hypotenuse.Then the integral can be set up thesame way as the one before, but for aparticular value of y , values of x onlyrun from 0 (on the y axis) to y (on thehypotenuse). We then have

area =Z a

y=0

Z y

x=0dA

=Z a

y=0

Z y

x=0dx dy

=Z a

y=0

„Z y

x=0dx

«dy

=Z a

y=0y dy

=12

a2

Note that in this example, because theupper end of the x values dependson the value of y , it makes a differ-ence which order we do the integralsin. The x integral has to be on the in-side, and we have to do it first.

Volume of a cube Example 56. Find the volume of a cube with sidesof length a.

. This is a three-dimensional example,so we’ll have integrals nested threedeep, and the thing we’re integratingis the volume dV = dxdydz.

7.2. APPLICATIONS 77

volume =Z a

z=0

Z a

y=0

Z a

x=0dV

=Z a

z=0

Z a

y=0

Z a

x=0dx dy dz

=Z a

z=0

Z a

y=0a dy dz

= aZ a

z=0

Z a

y=0dy dz

= aZ a

z=0a dz

= a2Z a

z=0dz

= a3

Area of a circle Example 57. Find the area of a circle.

. To make it easy, let’s find the areaof a semicircle and then double it. Letthe circle’s radius be r , and let it becentered on the origin and boundedbelow by the x axis. Then the curvededge is given by the equation r 2 =x2 + y2, or y =

√r 2 − x2. Since the

y integral’s limit depends on x , the xintegral has to be on the outside. Thearea is

area =Z r

x=−r

Z √r2−x2

y=0dy dx

=Z r

x=−r

pr 2 − x2dx

= rZ r

x=−r

p1− (x/r )2 dx .

Substituting u = x/r ,

area = r 2Z 1

u=−1

p1− u2 du

The definite integal equals π, as youcan find using a trig substitution orsimply by looking it up in a table, andthe result is, as expected, πr 2/2 forthe area of the semicircle. Doublingit, we find the expected result of πr 2

for a full circle.

7.2 ApplicationsUp until now, the integrand of theinnermost integral has always been1, so we really could have done allthe double integrals as single inte-grals. The following example is onein which you really need to do it-erated integrals.

a / The famous tightropewalker Charles Blondinuses a long pole for itslarge moment of inertia.

Moments of inertia Example 58The moment of inertia is a measureof how difficult it is to start an ob-

78 CHAPTER 7. ITERATED INTEGRALS

ject rotating (or stop it). For example,tightrope walkers carry long poles be-cause they want something with a bigmoment of inertia. The moment of in-ertia is defined by I =

Rr 2dm, where

dm is the mass of an infinitesimallysmall portion of the object, and r is thedistance from the axis of rotation.

To start with, let’s do an example thatdoesn’t require iterated integrals. Let’scalculate the moment of inertia of athin rod of mass M and length L abouta line perpendicular to the rod andpassing through its center.

I =Z

r 2dm

=Z L/2

−L/2x2 M

Ldx [r = |x |, so r 2 = x2]

=112

ML2

Now let’s do one that requires iter-ated integrals: the moment of inertiaof a cube of side b, for rotation aboutan axis that passes through its centerand is parallel to four of its faces.

Let the origin be at the center of thecube, and let x be the rotation axis.

I =Z

r 2dm

= ρ

Zr 2dV

= ρ

Z b/2

b/2

Z b/2

b/2

Z b/2

b/2

“y2 + z2

”dx dy dz

= ρbZ b/2

b/2

Z b/2

b/2

“y2 + z2

”dy dz

The fact that the last step is a trivial in-tegral results from the symmetry of theproblem. The integrand of the remain-ing double integral breaks down into

two terms, each of which depends ononly one of the variables, so we breakit into two integrals,

I = ρbZ b/2

b/2

Z b/2

b/2y2dydz+ρb

Z b/2

b/2

Z b/2

b/2z2dydz

which we know have identical results.We therefore only need to evaluateone of them and double the result:

I = 2ρbZ b/2

b/2

Z b/2

b/2z2dy d z

= 2ρb2Z b/2

b/2z2dz

=16

ρb5

=16

Mb2

A Answers to self-checks

Answers to Self-Checks

Answers to self-checks for chapter 5

page 66, self-check 1: Say we’re looking for u =√

z, i.e., we want anumber u that, multiplied by itself, equals z. Multiplication multipliesthe magnitudes, so the magnitude of u can be found by taking the squareroot of the magnitude of z. Since multiplication also adds the argumentsof the numbers, squaring a number doubles its argument. Therefore wecan simply divide the argument of z by two to find the argument ofu. This results in one of the square roots of z. There is another one,which is −u, since (−u)2 is the same as u2. This may seem a little odd:if u was chosen so that doubling its argument gave the argument of z,then how can the same be true for −u? Well for example, suppose theargument of z is 4 ◦. Then arg u = 2 ◦, and arg(−u) = 182 ◦. Doubling182 gives 364, which is actually a synonym for 4 degrees.

79

80 A Answers to self-checks

B DetoursFormal definition of the tangent line

Let (a, b) be a point on the graph of the function x(t). A line `(t)through this point is said not to cut through the graph if there existssome real number d such that either x(t) ≥ `(t) or x(t) ≤ `(t) for all tbetween a− d and a + d. The line is said to be the tangent line at thispoint if it is the only line through this point that doesn’t cut throughthe graph.

Derivatives of polynomials

We want to prove that the derivative of tk is ktk−1. It suffices to provethat the derivative equals k when evaluated at t = 1, since we can thenapply the kind of scaling argument used on page 12 to find the derivativeof t2/2 was t. The tangent line at (1, 1) has the equation ` = k(t−1)+1,so we only need to prove that the polynomial tk− [k(t−1)+1] is greaterthan or equal to zero in some finite region around t = 1.

First, let’s change variables to u = t − 1. Then the polynomial inquestion becomes P (u) = (u+1)k− (ku+1), and we want to prove thatit’s nonnegative in some region around u = 0. (We assume k ≥ 2, sincewe’ve already found the derivatives in the cases of k = 0 and 1, and inthose cases P (u) is identically zero.)

Now the last two terms in the binomial series for (u + 1)k are just kuand 1, so P (u) is a polynomial whose lowest-order term is a u2 term.Also, all the nonzero coefficients of the polynomial are positive, so P ispositive for u ≥ 0.

To complete the proof we only need to establish that P is also positivefor sufficiently small negative values of u. For negative u, the even-orderterms of P are positive, and the odd-order terms negative. To make theidea clear, consider the k = 5 case, where P (u) = u5 + 5u4 + 10u3 +10u2. The idea is to pair off each positive term with the negative oneimmediately to its left. Although the coefficient of the negative termmay, in general, be greater than the coefficient of the positive termwith which we’ve paired it, a property of the binomial coefficients is theratio of successive coefficients is never greater than k. Thus for −1/k <

81

82 B Detours

u < 0, each positive term is guaranteed to dominate the negative termimmediately to its left.

Details of the proof of the derivative of the sine function Someideas in this proof are due to Jerome Keisler.

On page ??, I computed

dx = sin(t + dt)− sin t ,= sin t cos dt

+ cos t sin dt− sin t

≈ cos t dt .

Here I’ll prove that the error introduced by the small-angle approxima-tions really is of order dt2. We have

sin(t + dt) = sin t + cos tdt− E ,

where the error E introduced by the approximations is

E =sin t(1− cos dt)+ cos t(dt− sin dt) .

a / Geometrical interpre-tation of the error term.

Let the radius of the circle in figure a be one, so AD is cos dt and CD issin dt. The area of the shaded pie slice is dt/2, and the area of triangleABC is sin dt/2, so the error made in the approximation sin dt ≈ dtequals twice the area of the dish shape formed by line BC and arc BC.Therefore dt−sin dt is less than the area of rectangle CEBD. But CEBDhas both an infinitesimal width and an infinitesimal height, so this erroris of no more than order dt2.

For the approximation cos dt ≈ 1, the error (represented by BD) is1 − cos dt = 1 −

√1− sin2 dt, which is less than 1 −

√1− dt2, since

sin dt < dt. Therefore this error is of order dt2.

83

Derivative of ex

All of the reasoning on page 31 have applied equally well to any otherexponential function with a different base, such as 2x or 10x. Thosefunctions would have different values of c, so if we want to determinethe value of c for the base-e case, we need to bring in the definition ofe, or of the exponential function ex, somehow.

We can take the definition of ex to be

ex = limn→∞

(1 +

x

n

)n

.

The idea behind this relation is similar to the idea of compound interest.If the interest rate is 10%, compounded annually, then x = 0.1, andthe balance grows by a factor (1 + x) = 1.1 in one year. If, instead,we want to compound the interest monthly, we can set the monthlyinterest rate to 0.1/12, and then the growth of the balance over a yearis (1+x/12)12 = 1.1047, which is slightly larger because the interest fromthe earlier months itself accrues interest in the later months. Continuingthis limiting process, we find e1.1 = 1.1052.

If n is large, then we have a good approximation to the base-e ex-ponential, so let’s differentiate this finite-n approximation and try tofind an approximation to the derivative of ex. The chain rule tells isthat the derivative of (1 + x/n)n is the derivative of the raising-to-the-nth-power function, multiplied by the derivative of the inside stuff,d(1 + x/n)/dx = 1/n. We then have

d(1 + x

n

)n

dx=

[n

(1 +

x

n

)n−1]· 1n

=(1 +

x

n

)n−1

.

But evaluating this at x = 0 simply gives 1, so at x = 0, the approxi-mation to the derivative is exactly 1 for all values of n — it’s not evennecessary to imagine going to larger and larger values of n. This estab-lishes that c = 1, so we have

dex

dx= ex

for all values of x.

Proof of the fundamental theorem of calculus

There are three parts to the proof: (1) Take the equation that statesthe fundamental theorem, differentiate both sides with respect to b, and

84 B Detours

show that they’re equal. (2) Show that continuous functions with equalderivatives must be essentially the same function, except for an additiveconstant. (3) Show that the constant in question is zero.

1. By the definition of the indefinite integral, the derivative of x(b)−x(a)with respect to b equals x(b). We have to establish that this equals thefollowing:

ddb

∫ b

a

x(t)dt = st1db

[∫ b+db

a

x(t)dt−∫ b

a

x(t)dt

]

= st1db

∫ b+db

b

x(t)dt

= st1db

limH→∞

H∑i=0

x(b + i db/H)db

H

= st limH→∞

1H

H∑i=0

x(b + i db/H)

Since x is continuous, all the values of x occurring inside the sum candiffer only infinitesimally from x(b). Therefore the quantity inside thelimit differs only infinitesimally from x(b), and the standard part of itslimit must be x(b).1

2. Suppose f and g are two continuous functions whose derivatives areequal. Then d = f − g is a continuous function whose derivative is zero.But the only continuous function with a derivative of zero is a constant,so f and g differ by at most an additive constant.

3. I’ve established that the derivatives with respect to b of x(b)− x(a)and

∫ b

axdt are the same, so they differ by at most an additive constant.

But at b = a, they’re both zero, so the constant must be zero.

Proof of the mean value theorem

Suppose that the mean value theorem is violated. Let L by the set of allx in the interval from a to b such that y(x) < y, and likewise let M bethe set with y(x) > y. If the theorem is violated, then the union of thesetwo sets covers the entire interval from a to b. Neither one can be empty;if, for example, M was empty, then we would have y < y everywhereand also

∫ b

ay =

∫ b

ay, but it follows directly from the definition of the

1If you don’t want to use infinitesimals, then you can express the derivative as alimit, and in the final step of the argument use the mean value theorem, introducedlater in the chapter.

85

definite integral that when one function is less than another, its integralis also less than the other’s. Since y takes on values less than and greaterthan y, it follows from the intermediate value theorem that y takes onthe value y somewhere (intuitively, at a boundary between L and M).

Proof of the fundamental theorem of algebra

Theorem: In the complex number system, an nth-order polynomial hasexactly n roots, i.e., it can be factored into the form P (z) = (z−a1)(z−a2) . . . (z − an), where the ai are complex numbers.

Proof: The proofs in the cases of n = 0 and 1 are trivial, so our strategyis to reduce higher-n cases to lower ones. If an nth-degree polynomial Phas at least one root, a, then we can always reduce it to a polynomial ofdegree n− 1 by dividing it by (z − a). Therefore the theorem is provedby induction provided that we can show that every polynomial of degreegreater than zero has at least one root.

Suppose, on the contrary, that there was an nth order polynomial P (z),with n > 0, that had no roots at all. Then |P (z)| must have someminimum value, which is achieved at z = zo. (Polynomials don’t haveasymptotes, so the minimum really does have to occur for some specific,finite zo.) To make things more simple and concrete, we can constructanother polynomial Q(z) = P (z+zo)/P (zo), so that |Q| has a minimumvalue of 1, achieved at Q(0) = 1. This means that Q’s constant term is1. What about its other terms? Let Q(z) = 1+c1z+. . .+cnzn. Supposec1 was nonzero. Then for infinitesimally small values of z, the terms oforder z2 and higher would be negligible, and we could make Q(z) bea real number less than one by an appropriate choice of z’s argument.Therefore c1 must be zero. But that means that if c2 is nonzero, then forinfinitesimally small z, the z2 term dominates the z3 and higher terms,and again this would allow us to make Q(z) be real and less than onefor appropriately chosen values of z. Continuing this process, we findthat Q(z) has no terms at all beyond the constant term, i.e., Q(z) = 1.This contradicts the assumption that n was greater than zero, so we’veproved by contradiction that there is no P with the properties claimed.

86 B Detours

C Photo CreditsExcept as specifically noted below or in a parenthetical credit in the caption of afigure, all the illustrations in this book are under my own copyright, and are copyleftlicensed under the same license as the rest of the book.

In some cases it’s clear from the date that the figure is public domain, but I don’tknow the name of the artist or photographer; I would be grateful to anyone whocould help me to give proper credit. I have assumed that images that come fromU.S. government web pages are copyright-free, since products of federal agencies fallinto the public domain. I’ve included some public-domain paintings; photographicreproductions of them are not copyrightable in the U.S. (Bridgeman Art Library,Ltd. v. Corel Corp., 36 F. Supp. 2d 191, S.D.N.Y. 1999).

cover: Daniel Schwen, 2004; GFDL licensed 8 Gauss: C.A. Jensen (1792-1870).10 Newton: Godfrey Kneller, 1702. 21 Leibniz: Bernhard Christoph Francke,1700. 24 Berkeley: public domain?. 25 Robinson: source: www-groups.dcs.st-and.ac.uk, copyright status unknown. 69 Euler: Emanuel Handmann, 1753. 77tightrope walker: public domain, since Blondin died in 1897.

87

Indexcomplex numbers, 65continuous function, 38

fundamental theorem of alge-bra, 68

intermediate value theorem, 38

mean value theoremproof, 84

moment of inertia, 77

88