A Concrete Introduction to Real Analysis


Page 1: (Chapman & Hall_CRC Pure and Applied Mathematics) Robert Carlson-A Concrete Introduction to Real Analysis-Chapman and Hall_CRC (2006)


PURE AND APPLIED MATHEMATICS

A Program of Monographs, Textbooks, and Lecture Notes

EXECUTIVE EDITORS

Earl J. Taft, Rutgers University, Piscataway, New Jersey

Zuhair Nashed, University of Central Florida, Orlando, Florida

EDITORIAL BOARD

M. S. Baouendi, University of California, San Diego

Jane Cronin, Rutgers University

Jack K. Hale, Georgia Institute of Technology

S. Kobayashi, University of California, Berkeley

Marvin Marcus, University of California, Santa Barbara

W. S. Massey, Yale University

Anil Nerode, Cornell University

Freddy van Oystaeyen, University of Antwerp, Belgium

Donald Passman, University of Wisconsin, Madison

Fred S. Roberts, Rutgers University

David L. Russell, Virginia Polytechnic Institute and State University

Walter Schempp, Universität Siegen

Mark Teply, University of Wisconsin, Milwaukee


MONOGRAPHS AND TEXTBOOKS IN PURE AND APPLIED MATHEMATICS

Recent Titles

G. S. Ladde and M. Sambandham, Stochastic versus Deterministic Systems of Differential Equations (2004)

B. J. Gardner and R. Wiegandt, Radical Theory of Rings (2004)

J. Haluska, The Mathematical Theory of Tone Systems (2004)

C. Menini and F. Van Oystaeyen, Abstract Algebra: A Comprehensive Treatment (2004)

E. Hansen and G. W. Walster, Global Optimization Using Interval Analysis, Second Edition, Revised and Expanded (2004)

M. M. Rao, Measure Theory and Integration, Second Edition, Revised and Expanded (2004)

W. J. Wickless, A First Graduate Course in Abstract Algebra (2004)

R. P. Agarwal, M. Bohner, and W-T Li, Nonoscillation and Oscillation Theory for Functional Differential Equations (2004)

J. Galambos and I. Simonelli, Products of Random Variables: Applications to Problems of Physics and to Arithmetical Functions (2004)

Walter Ferrer and Alvaro Rittatore, Actions and Invariants of Algebraic Groups (2005)

Christof Eck, Jiri Jarusek, and Miroslav Krbec, Unilateral Contact Problems: Variational Methods and Existence Theorems (2005)

M. M. Rao, Conditional Measures and Applications, Second Edition (2005)

A. B. Kharazishvili, Strange Functions in Real Analysis, Second Edition (2006)

Vincenzo Ancona and Bernard Gaveau, Differential Forms on Singular Varieties: De Rham and Hodge Theory Simplified (2005)

Santiago Alves Tavares, Generation of Multivariate Hermite Interpolating Polynomials (2005)

Sergio Macías, Topics on Continua (2005)

Mircea Sofonea, Weimin Han, and Meir Shillor, Analysis and Approximation of Contact Problems with Adhesion or Damage (2006)

Marwan Moubachir and Jean-Paul Zolésio, Moving Shape Analysis and Control: Applications to Fluid Structure Interactions (2006)

Alfred Geroldinger and Franz Halter-Koch, Non-Unique Factorizations: Algebraic, Combinatorial and Analytic Theory (2006)

Kevin J. Hastings, Introduction to the Mathematics of Operations Research with Mathematica®, Second Edition (2006)

Robert Carlson, A Concrete Introduction to Real Analysis (2006)


Robert Carlson
University of Colorado at Colorado Springs
Colorado Springs, U.S.A.

A Concrete Introduction to Real Analysis

Boca Raton London New York

Chapman & Hall/CRC is an imprint of the Taylor & Francis Group, an informa business


CRC Press
Taylor & Francis Group
6000 Broken Sound Parkway NW, Suite 300
Boca Raton, FL 33487-2742

© 2006 by Taylor & Francis Group, LLC
CRC Press is an imprint of Taylor & Francis Group, an Informa business

No claim to original U.S. Government works
Version Date: 20110713

International Standard Book Number-13: 978-1-4200-1154-8 (eBook - PDF)

This book contains information obtained from authentic and highly regarded sources. Reasonable efforts have been made to publish reliable data and information, but the author and publisher cannot assume responsibility for the validity of all materials or the consequences of their use. The authors and publishers have attempted to trace the copyright holders of all material reproduced in this publication and apologize to copyright holders if permission to publish in this form has not been obtained. If any copyright material has not been acknowledged please write and let us know so we may rectify in any future reprint.

Except as permitted under U.S. Copyright Law, no part of this book may be reprinted, reproduced, transmitted, or utilized in any form by any electronic, mechanical, or other means, now known or hereafter invented, including photocopying, microfilming, and recording, or in any information storage or retrieval system, without written permission from the publishers.

For permission to photocopy or use material electronically from this work, please access www.copyright.com (http://www.copyright.com/) or contact the Copyright Clearance Center, Inc. (CCC), 222 Rosewood Drive, Danvers, MA 01923, 978-750-8400. CCC is a not-for-profit organization that provides licenses and registration for a variety of users. For organizations that have been granted a photocopy license by the CCC, a separate system of payment has been arranged.

Trademark Notice: Product or corporate names may be trademarks or registered trademarks, and are used only for identification and explanation without intent to infringe.

Visit the Taylor & Francis Web site at http://www.taylorandfrancis.com

and the CRC Press Web site at http://www.crcpress.com


Contents

1 Discrete Calculus
  1.1 Introduction
  1.2 Proof by induction
  1.3 A calculus of sums and differences
  1.4 Sums of powers
  1.5 Problems

2 Selected Area Computations
  2.1 Introduction
  2.2 Areas under power function graphs
  2.3 The computation of π
  2.4 Natural logarithms
  2.5 Stirling’s formula
  2.6 Problems

3 Limits and Taylor’s Theorem
  3.1 Introduction
  3.2 Limits of infinite sequences
    3.2.1 Basic ideas
    3.2.2 Properties of limits
  3.3 Series representations
  3.4 Taylor series
    3.4.1 Taylor polynomials
    3.4.2 Taylor’s Theorem
    3.4.3 The remainder
    3.4.4 Additional results
  3.5 Problems

4 Infinite Series
  4.1 Introduction
    4.1.1 Bounded monotone sequences
  4.2 Positive series
  4.3 General series
    4.3.1 Absolute convergence
    4.3.2 Alternating series
    4.3.3 Power series
  4.4 Grouping and rearrangement
  4.5 Problems

5 A Bit of Logic
  5.1 Some mathematical philosophy
  5.2 Propositional logic
  5.3 Predicates and quantifiers
  5.4 Proofs
    5.4.1 Axioms for propositional logic
    5.4.2 Additional rules of inference
    5.4.3 Adding hypotheses
    5.4.4 Proof by contradiction
  5.5 Problems

6 Real Numbers
  6.1 Field axioms
  6.2 Order axioms
  6.3 Completeness axioms
  6.4 Subsequences and compact intervals
  6.5 Products and fractions
    6.5.1 Infinite products
    6.5.2 Continued fractions
  6.6 Problems

7 Functions
  7.1 Introduction
  7.2 Basics
  7.3 Limits and continuity
    7.3.1 Limits
    7.3.2 Continuity
    7.3.3 Uniform continuity
  7.4 Derivatives
    7.4.1 Computation of derivatives
    7.4.2 The Mean Value Theorem
    7.4.3 Contractions
    7.4.4 Convexity
  7.5 Problems

8 Integrals
  8.1 Introduction
  8.2 Integrable functions
  8.3 Properties of integrals
  8.4 Numerical computation of integrals
    8.4.1 Endpoint Riemann sums
    8.4.2 More sophisticated integration procedures
  8.5 Problems

9 More Integrals
  9.1 Introduction
  9.2 Improper integrals
    9.2.1 Integration of positive functions
    9.2.2 Absolutely convergent integrals
    9.2.3 Conditionally convergent integrals
  9.3 Integrals with parameters
    9.3.1 Sample computations
    9.3.2 Some analysis in two variables
    9.3.3 Functions defined by Riemann integration
    9.3.4 Functions defined by improper integrals
  9.4 Problems

References

Index


Preface

This book is an introduction to real analysis, which might be briefly defined as the part of mathematics dealing with the theory of calculus and its more or less immediate extensions. Some of these extensions include infinite series, differential equations, and numerical analysis. This brief description is accurate, but somewhat misleading, since analysis is a huge subject which has been developing for more than three hundred years, and has deep connections with many subjects beyond mathematics, including physics, chemistry, biology, engineering, computer science, and even business and some of the social sciences.

The development of analytic (or coordinate) geometry and then calculus in the seventeenth century launched a revolution in science and world view. Within one or two lifetimes scientists developed successful mathematical descriptions of motion, gravitation, and the reaction of objects to various forces. The orbits of planets and comets could be predicted, tides explained, artillery shell trajectories optimized. Subsequent developments built on this foundation include the quantitative descriptions of fluid motion and heat flow. The ability to give many new and interesting quantitatively accurate predictions seems to have altered the way people conceived the world. What could be predicted might well be controlled.

During this initial period of somewhat over one hundred years, the foundations of calculus were understood on a largely intuitive basis. This seemed adequate for handling the physical problems of the day, and the very successes of the theory provided a substantial justification for the procedures. The situation changed considerably in the beginning of the nineteenth century. Two landmark events were the systematic use of infinite series of sines and cosines by Fourier in his analysis of heat flow, and the use of complex numbers and complex valued functions of a complex variable. Despite their ability to make powerful and accurate predictions of physical phenomena, these tools were difficult to understand intuitively. Particularly in the area of Fourier series, some nonsensical results resulted from reasonable operations. The resolution of these problems took decades of effort, and involved a careful reexamination of the foundations of calculus. The ancient Greek treatment of geometry, with its explicit axioms, careful definitions, and emphasis on proof as a reliable foundation for reasoning, was used successfully as a model for the development of analysis.

A modern course in analysis usually presents the material in an efficient but austere manner. The student is plunged into a new mathematical environment, replete with definitions, axioms, powerful abstractions, and an overriding emphasis on formal proof. Those students able to find their way in these new surroundings are rewarded with greatly increased sophistication, particularly in their ability to reason effectively about mathematics and its applications to such fields as physics, engineering and scientific computation. Unfortunately, the standard approach often produces large numbers of casualties, students with a solid aptitude for mathematics who are discouraged by the difficulties, or who emerge with only a vague impression of a theoretical treatment whose importance is accepted as a matter of faith.

This text is intended to remedy some of the drawbacks in the treatment of analysis, while providing the necessary transition from a view of mathematics focused on calculations to a view of mathematics where proofs have the central position. Our goal is to provide students with a basic understanding of analysis as they might need it to solve typical problems of science or engineering, or to explain calculus to a high school class. The treatment is designed to be rewarding for the many students who will never take another class in analysis, while also providing a solid foundation for those students who will continue in the ‘standard’ analysis sequence.

The book begins with a variety of concrete problems which introduce the estimation techniques central to our subject. In treating problems of area computation, the calculation of decimal expansions for the numbers e and π, approximation by Taylor polynomials, or consideration of infinite series, the techniques of calculus are presumed valid and are used freely. In a way that roughly mimics the historical development, the axiomatic foundations of analysis are considered only after experience helps develop familiarity with estimation and limits.

A more formal approach begins in chapter five with a brief discussion of logic. Arguments in propositional logic provide a model for rigorous proofs. The text then continues along more traditional lines with an axiomatic presentation of the real numbers. Continuing in the standard fashion, functions, limits, continuity, and differentiation are treated. Following the development of the Riemann integral, the book concludes with a discussion of improper integrals and integrals with parameters.

A note for instructors

The author uses this text for a two-semester course, normally covering most of chapters one through four in the first semester, and chapters five through eight in the second semester. Chapter five has been included because I wanted my students to have a more formal introduction to proofs than is normally presented in an analysis course, where the general attitude seems to be ‘do what I do’. This chapter can certainly be omitted, leaving time for chapter nine, which considers improper integrals and integrals with parameters, or allowing the class to explore supplementary topics like infinite products and continued fractions.

The first semester of this course serves as an optional transition from calculus to analysis. The optional nature means that quite a few students join the class in the second semester, without having taken the first. To accommodate the new students, the material on limits of sequences (section 3.2) can be reviewed before starting the discussion of the completeness property of the real numbers (section 6.3).


Chapter 1

Discrete Calculus

1.1 Introduction

There is a story from the childhood of the famous mathematical scientist Carl F. Gauss (1777-1855). His elementary school teacher, wanting to keep the class busy, assigns the problem of adding the numbers from 1 to 100. Gauss’s hand goes up more or less instantly, and the correct answer

(100 × 101)/2 = 10100/2 = 5050

is produced. Actually Gauss is supposed to have solved the more general problem of finding the sum

\[ \sum_{k=1}^{n} k = 1 + 2 + 3 + \cdots + n = \frac{n(n+1)}{2}. \]

This problem has the following elementary solution.

\[
\begin{array}{rl}
2 \times (1 + 2 + 3 + \cdots + n) = & (1 + 2 + \cdots + (n-1) + n) \\
 & {} + (n + (n-1) + \cdots + 2 + 1) \\
 = & n(n+1).
\end{array}
\]

Each vertical sum is n + 1, and there are n such sums.

What about adding higher powers of integers? For instance you could be planning to build a pyramid of height n with cubes of stone. At the top you will use 1 cube of stone, and as you move down each layer is a square of stones, with the k-th layer having k × k stones. In order to make labor and transportation allocation plans you want to know the total number of stones if the height is n, which amounts to

\[ \sum_{k=1}^{n} k^2. \]


You may have seen a formula for this sum,

\[ \sum_{k=1}^{n} k^2 = \frac{n(n+1)(2n+1)}{6}. \]
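Before any proof is attempted, the two closed forms can be compared with direct addition. The following is a minimal Python sketch; the function names `sum_of_powers`, `gauss_formula`, and `squares_formula` are illustrative choices, not notation from the text, and a finite check of this kind is evidence for the formulas rather than a proof.

```python
def sum_of_powers(n: int, m: int) -> int:
    """Brute-force value of 1**m + 2**m + ... + n**m (illustrative helper)."""
    return sum(k**m for k in range(1, n + 1))


def gauss_formula(n: int) -> int:
    """Closed form n(n + 1)/2 for the sum of the first n positive integers."""
    return n * (n + 1) // 2


def squares_formula(n: int) -> int:
    """Closed form n(n + 1)(2n + 1)/6 for the sum of the first n squares."""
    return n * (n + 1) * (2 * n + 1) // 6


# Compare the closed forms with direct addition for n = 1, ..., 200.
for n in range(1, 201):
    assert sum_of_powers(n, 1) == gauss_formula(n)
    assert sum_of_powers(n, 2) == squares_formula(n)

print(gauss_formula(100))   # 5050, the answer in the Gauss story
print(squares_formula(10))  # number of stones in a pyramid of height 10
```

Integer division is exact here because n(n + 1) is always even and n(n + 1)(2n + 1) is always divisible by 6.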

The problem of verifying this formula for every positive integer n, along with many other problems, can be solved by the technique called proof by induction. After introducing proofs by induction, along with several applications of the method, we will return to the more general problem of finding formulas for the sum of powers

\[ \sum_{k=1}^{n} k^m. \]

Along the way it is helpful to develop a calculus for functions defined on the nonnegative integers. Several ideas from the calculus for functions of a real variable have direct parallels in this new context. Additional summation formulas will also be derived.

1.2 Proof by induction

Proof by induction is one of the most fundamental methods of proof in mathematics, and it is particularly common in problems related to discrete mathematics and computer science. In many cases it is the method for establishing the validity of a formula, which may have been conjectured based on a pattern appearing when several examples are worked out. The following formulas provide a pair of classic illustrations:

\[ \sum_{k=1}^{n} k = 1 + 2 + \cdots + n = \frac{n(n+1)}{2}, \tag{1.1} \]

\[ \sum_{k=1}^{n} k^2 = 1 + 2^2 + \cdots + n^2 = \frac{n(n+1)(2n+1)}{6}. \tag{1.2} \]

The first formula (1.1) had an elementary noninductive proof, but the second (1.2) is more of a challenge. Let’s test some cases. If n = 1 then the sum is 1 and the right hand side is 1 · 2 · 3/6 = 1. If n = 2 then the sum is 5 and the right hand side is 2 · 3 · 5/6 = 5. The first two cases are fine, but how can the formula be checked for every positive integer n?

The brilliant induction idea establishes the truth of an infinite sequence of statements by verifying two things:

(i) that the first statement is true, and
(ii) whenever the n-th statement is true, so is statement n + 1.

Let’s try this out with the list of statements

\[ S(n) : \quad \sum_{k=1}^{n} k^2 = 1^2 + 2^2 + \cdots + n^2 = \frac{n(n+1)(2n+1)}{6}. \]

First, the first statement must be verified.

\[ \sum_{k=1}^{1} k^2 = 1^2 = 1 = \frac{1(2)(3)}{6}. \]

Yes, the first statement S(1) is true.

Now suppose that n is one of the numbers for which the n-th statement S(n) is true:

\[ \sum_{k=1}^{n} k^2 = 1^2 + 2^2 + \cdots + n^2 = \frac{n(n+1)(2n+1)}{6}. \]

The next case is a statement about the sum

\[ \sum_{k=1}^{n+1} k^2 = 1^2 + 2^2 + \cdots + n^2 + (n+1)^2. \]

Since the n-th statement is true, we may make use of it.

\[
\begin{aligned}
\sum_{k=1}^{n+1} k^2 &= 1^2 + 2^2 + \cdots + n^2 + (n+1)^2 \\
&= [1^2 + 2^2 + \cdots + n^2] + (n+1)^2 = \frac{n(n+1)(2n+1)}{6} + (n+1)^2 \\
&= (n+1)\left[ \frac{n(2n+1)}{6} + \frac{6(n+1)}{6} \right] = \frac{(n+1)(n+2)(2n+3)}{6},
\end{aligned}
\]

since the identity n(2n + 1) + 6(n + 1) = (n + 2)(2n + 3) is easily checked. This shows that if S(n) is true, then so is the statement

\[ S(n+1) : \quad \sum_{k=1}^{n+1} k^2 = 1^2 + 2^2 + \cdots + n^2 + (n+1)^2 = \frac{(n+1)(n+2)(2n+3)}{6}. \]

If the formula (1.2) is true in case n, then it is true in case n + 1. But since it is true in the case n = 1, it must be true for every positive integer n. Why does this follow? Suppose some statement S(n) is not true. Since S(1) is true, the first false statement comes after S(1), so there must be a smallest positive integer $n_0$ such that the statement $S(n_0)$ is true, but $S(n_0 + 1)$ is false. This is impossible, since it has been shown that whenever S(n) is true, so is S(n + 1). The false statement S(n) can’t exist!
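The algebra in the induction step is a natural place for a slip, so it can help to test it numerically. The sketch below, with hypothetical function names chosen for illustration, checks both the "easily checked" identity and the passage from S(n) to S(n + 1) for many values of n. It does not replace the proof: the induction argument is what covers every positive integer.

```python
def closed_form(n: int) -> int:
    """Right hand side of (1.2): n(n + 1)(2n + 1)/6."""
    return n * (n + 1) * (2 * n + 1) // 6


def key_identity_holds(n: int) -> bool:
    """Check n(2n + 1) + 6(n + 1) = (n + 2)(2n + 3)."""
    return n * (2 * n + 1) + 6 * (n + 1) == (n + 2) * (2 * n + 3)


def induction_step_holds(n: int) -> bool:
    """Adding (n + 1)^2 to the closed form for n gives the closed form for n + 1."""
    return closed_form(n) + (n + 1) ** 2 == closed_form(n + 1)


# Base case S(1), then the step from S(n) to S(n + 1) for n = 1, ..., 1000.
assert closed_form(1) == 1
assert all(key_identity_holds(n) for n in range(1, 1001))
assert all(induction_step_holds(n) for n in range(1, 1001))
```

Both sides of the identity expand to 2n² + 7n + 6, which is why the check never fails.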

Other proofs by induction proceed in a similar way. There is a sequence of statements S(n), usually indexed by the positive integers n = 1, 2, 3, . . . , although the case n = 0, 1, 2, . . . , is certainly okay. These statements should be either true or false (not all statements have a truth value). To prove that every statement is true it is enough to prove two things: (i) the statement S(1) is true, and (ii) whenever the statement S(n) is true, then the statement S(n + 1) is true. Often it is easy to check the truth of the statement S(1). The second or induction step usually requires a problem specific technique each time the method is used.

The technique of proof by induction commonly arises in questions about concrete mathematical formulas which are clearly either true or false. When trying to establish the general procedure for proof by induction a bit of care is required. There are statements which do not have a well-defined truth value (T or F). One example is "This statement is false." If the statement is true, then it is false, and if it is false, then it is true! We shouldn’t run across any of these self-referential statements.

Here are some additional formulas that can be proved by induction.

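\[ \sum_{k=0}^{n-1} x^k = \frac{1 - x^n}{1 - x}, \quad x \neq 1. \]

Case n = 1 is

\[ 1 = \frac{1 - x^1}{1 - x}. \]

Assuming case n is true, it follows that

\[
\begin{aligned}
\sum_{k=0}^{(n+1)-1} x^k &= \sum_{k=0}^{n-1} x^k + x^n \\
&= \frac{1 - x^n}{1 - x} + x^n = \frac{1 - x^n + x^n(1 - x)}{1 - x} \\
&= \frac{1 - x^{n+1}}{1 - x}, \quad x \neq 1.
\end{aligned}
\]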
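The geometric sum formula can be tested with exact rational arithmetic, which avoids any floating point rounding. This is a small sketch using Python's standard `fractions` module; the function names are illustrative, and the excluded value x = 1 is rejected explicitly.

```python
from fractions import Fraction


def geometric_sum(x, n: int) -> Fraction:
    """Direct sum x^0 + x^1 + ... + x^(n-1), computed exactly."""
    x = Fraction(x)
    return sum((x**k for k in range(n)), Fraction(0))


def geometric_closed_form(x, n: int) -> Fraction:
    """Closed form (1 - x^n)/(1 - x), valid only for x != 1."""
    x = Fraction(x)
    if x == 1:
        raise ValueError("the closed form requires x != 1")
    return (1 - x**n) / (1 - x)


# Compare the two for several rational x and n = 1, ..., 20.
for x in (Fraction(1, 3), Fraction(-5, 2), 2, 0, -1):
    for n in range(1, 21):
        assert geometric_sum(x, n) == geometric_closed_form(x, n)

print(geometric_closed_form(Fraction(1, 3), 5))  # 121/81
```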


It is a bit more challenging to establish the next result, the Binomial Theorem. Recall that n factorial, written n!, is the product of the integers from 1 to n:

\[ n! = 1 \cdot 2 \cdot 3 \cdots n. \]

It turns out to be extremely convenient to declare that 0! = 1, even though it is hard to make sense of this by means of the original description of n!.

It will also be necessary to introduce the binomial coefficients which are sufficiently important to get a special notation. For any integer n ≥ 0 and any integer k in the range 0 ≤ k ≤ n define the binomial coefficient

\[ \binom{n}{k} = \frac{n!}{k!(n-k)!} = \frac{1 \cdot 2 \cdots n}{(1 \cdot 2 \cdots k)(1 \cdot 2 \cdots [n-k])}. \]

The symbol on the left is read n choose k, because this is the number of distinct ways of choosing k objects from a set of n objects.
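The factorial definition translates directly into code. The sketch below defines a hypothetical helper `binomial` from the formula above and compares it with `math.comb` from the Python standard library (available in Python 3.8 and later). Note that `math.factorial(0)` returns 1, matching the convention 0! = 1.

```python
import math


def binomial(n: int, k: int) -> int:
    """Binomial coefficient n!/(k!(n - k)!) for 0 <= k <= n."""
    if not 0 <= k <= n:
        raise ValueError("binomial(n, k) is defined here only for 0 <= k <= n")
    # The quotient is always an integer, so integer division is exact.
    return math.factorial(n) // (math.factorial(k) * math.factorial(n - k))


# The definition agrees with the library routine on a range of inputs.
for n in range(0, 30):
    for k in range(0, n + 1):
        assert binomial(n, k) == math.comb(n, k)

print(binomial(5, 2))  # 10 ways to choose 2 objects from 5
```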

Theorem 1.2.1. (Binomial Theorem) For positive integers n, and any numbers a and b,

\[ (a + b)^n = \sum_{k=0}^{n} \binom{n}{k} a^k b^{n-k}. \]

Proof. The case n = 1 amounts to

\[ (a + b)^1 = a + b, \]

which looks fine.

Assuming the truth of S(n), the expression in S(n + 1) is

\[
\begin{aligned}
(a + b)^{n+1} &= (a + b)(a + b)^n = (a + b) \sum_{k=0}^{n} \binom{n}{k} a^k b^{n-k} \\
&= \sum_{k=0}^{n} \binom{n}{k} a^{k+1} b^{n-k} + \sum_{k=0}^{n} \binom{n}{k} a^k b^{(n+1)-k}.
\end{aligned}
\]

To give the sums a common form, replace k + 1 with j in the first of the last two sums to get

\[ (a + b)^{n+1} = \sum_{j=1}^{n+1} \binom{n}{j-1} a^j b^{n-(j-1)} + \sum_{k=0}^{n} \binom{n}{k} a^k b^{(n+1)-k}. \]


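Of course the variable of summation, like that of integration, is a ‘dummy’, so

\[
\begin{aligned}
(a + b)^{n+1} &= \sum_{k=1}^{n+1} \binom{n}{k-1} a^k b^{(n+1)-k} + \sum_{k=0}^{n} \binom{n}{k} a^k b^{(n+1)-k} \\
&= a^{n+1} + b^{n+1} + \sum_{k=1}^{n} \left[ \binom{n}{k-1} + \binom{n}{k} \right] a^k b^{(n+1)-k}.
\end{aligned}
\]

All that remains is to show that for 1 ≤ k ≤ n,

\[ \binom{n}{k-1} + \binom{n}{k} = \binom{n+1}{k}. \]

A straightforward computation gives

\[
\begin{aligned}
\binom{n}{k-1} + \binom{n}{k} &= \frac{n!}{(k-1)!(n+1-k)!} + \frac{n!}{k!(n-k)!} \\
&= \frac{n!\,k}{k!(n+1-k)!} + \frac{n!(n+1-k)}{k!(n+1-k)!} \\
&= \frac{n!}{k!(n+1-k)!}(k + n + 1 - k) \\
&= \binom{n+1}{k}.
\end{aligned}
\]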
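Both the addition rule for binomial coefficients used at the end of the proof and the Binomial Theorem itself can be spot-checked with integer arithmetic. This is a minimal sketch relying on `math.comb` from the Python standard library; the names `pascal_rule_holds` and `binomial_expansion` are illustrative choices.

```python
from math import comb


def pascal_rule_holds(n: int, k: int) -> bool:
    """Check C(n, k-1) + C(n, k) = C(n+1, k) for 1 <= k <= n."""
    return comb(n, k - 1) + comb(n, k) == comb(n + 1, k)


def binomial_expansion(a: int, b: int, n: int) -> int:
    """Right hand side of the Binomial Theorem: sum of C(n, k) a^k b^(n-k)."""
    return sum(comb(n, k) * a**k * b ** (n - k) for k in range(n + 1))


# The addition rule, for every 1 <= k <= n <= 40.
assert all(pascal_rule_holds(n, k) for n in range(1, 41) for k in range(1, n + 1))

# The theorem, for a grid of integer values of a and b.
for n in range(1, 13):
    for a in range(-4, 5):
        for b in range(-4, 5):
            assert binomial_expansion(a, b, n) == (a + b) ** n
```

Integer inputs keep every comparison exact; the theorem holds for any numbers a and b, but floating point inputs would need a tolerance.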

1.3 A calculus of sums and differences

A basic idea in traditional calculus is that integrals are like sums and derivatives are slopes. Motivated in part by the problem of understanding the sum of powers

\[ \sum_{k=1}^{n} k^m = 1^m + 2^m + 3^m + \cdots + n^m, \]

we will develop a bit of discrete calculus for functions f(n) which are defined for nonnegative integers n. Functions with integer arguments are common: in particular they arise when computer algorithms are designed to perform mathematical calculations, and in digital communications. Examples of such functions include polynomials like $f(n) = 5n^3 + n^2 + 3n$, or rational functions like $g(n) = n/(1 + n^2)$, which are defined only when n is a nonnegative integer. The graphs of the functions 1/(n + 1) and sin(πn/10) are shown in Figures 1.1 and 1.2.

Figure 1.1: $f(n) = \frac{1}{n+1}$

Of course the function a(n) whose domain is the set $\mathbb{N}_0$ of nonnegative integers is the same as an infinite sequence $a_0, a_1, a_2, \ldots$. We’ve just used functional notation rather than subscript notation for the index.

Functions f(n) of an integer variable have a calculus similar to the better known calculus of functions of a real variable. However, most of the limit problems that arise in real variable calculus are missing in the study of discrete calculus. As a consequence, many of the proofs are greatly simplified.

Figure 1.2: $g(n) = \sin(\pi n/10)$

In discrete calculus the role of the conventional integral

\[ \int_a^x f(t)\, dt, \quad a \le x, \]

will be replaced by the sum or discrete integral

\[ \sum_{k=m}^{n} f(k), \quad m \le n. \]

The problem of finding simple expressions for (signed) areas such as

\[ \int_0^x t^3\, dt \]

is replaced by the problem of finding a simple expression for a sum

\[ \sum_{k=1}^{n} k^3. \]


To find a replacement for the derivative

\[ f'(x) = \frac{df(x)}{dx} = \lim_{h \to 0} \frac{f(x + h) - f(x)}{h}, \]

use the approximation of the derivative by the slope of a secant line,

\[ f'(n) \approx \frac{f(n + 1) - f(n)}{1} = f(n + 1) - f(n). \]

Since the difference between consecutive integers is 1, the denominator conveniently drops out. It is helpful to introduce two notations for these differences, which will play the role of the derivative:

\[ f^{+}(n) = \Delta^{+} f(n) = f(n + 1) - f(n). \]

This new function $f^{+}(n)$ is called the forward difference of f(n).

Here are a few forward differences for simple functions.

\[ f(n) = n, \quad f^{+}(n) = (n + 1) - n = 1, \]

\[ g(n) = n^2, \quad g^{+}(n) = (n + 1)^2 - n^2 = 2n + 1. \]

Notice that $g^{+}(n)$ is not 2n, as we might expect from derivative calculations. Another calculation gives

\[ h(n) = 3^{-n}, \quad h^{+}(n) = 3^{-(n+1)} - 3^{-n} = [3^{-1} - 1]3^{-n} = -\frac{2 \cdot 3^{-n}}{3}. \]

In fact for any fixed number x,

\[ \Delta^{+} x^n = x^{n+1} - x^n = (x - 1)x^n. \tag{1.3} \]
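The forward difference is simple enough to write as a function that takes f and returns the new function n ↦ f(n + 1) − f(n). The sketch below is one possible rendering in Python; the name `forward_difference` is an assumption made for illustration, and exact fractions are used for $3^{-n}$ so that the comparison with the computed formula involves no rounding.

```python
from fractions import Fraction


def forward_difference(f):
    """Return the function n -> f(n + 1) - f(n)."""
    def f_plus(n):
        return f(n + 1) - f(n)
    return f_plus


def identity(n):
    return n


def square(n):
    return n**2


def h(n):
    """h(n) = 3^(-n), represented exactly."""
    return Fraction(1, 3) ** n


for n in range(20):
    assert forward_difference(identity)(n) == 1
    assert forward_difference(square)(n) == 2 * n + 1  # not 2n
    assert forward_difference(h)(n) == Fraction(-2, 3) * h(n)
    # Formula (1.3) with the fixed number x = 5.
    assert forward_difference(lambda j: 5**j)(n) == (5 - 1) * 5**n
```

Returning a function rather than a number mirrors the text, where $f^{+}$ is itself a function on the nonnegative integers.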

There are several theorems from real variable calculus with close parallels in discrete calculus. The first says that sums and differences are linear. To simplify notation, recall that $\mathbb{N}_0$ denotes the set of nonnegative integers 0, 1, 2, . . . . The notation $f : \mathbb{N}_0 \to \mathbb{R}$ means that the function f takes real values, and has domain $\mathbb{N}_0$.

Theorem 1.3.1. For any functions $f : \mathbb{N}_0 \to \mathbb{R}$ and $g : \mathbb{N}_0 \to \mathbb{R}$, and any real numbers a, b

\[ \Delta^{+}[af(n) + bg(n)] = a\Delta^{+}f(n) + b\Delta^{+}g(n) = af^{+}(n) + bg^{+}(n), \]

and

\[ \sum_{k=m}^{n} [af(k) + bg(k)] = a \sum_{k=m}^{n} f(k) + b \sum_{k=m}^{n} g(k), \quad m \le n. \]


Proof. These results follow from simple arithmetic. For the differences,

\[
\begin{aligned}
\Delta^{+}[af(n) + bg(n)] &= [af(n + 1) + bg(n + 1)] - [af(n) + bg(n)] \\
&= a[f(n + 1) - f(n)] + b[g(n + 1) - g(n)] = af^{+}(n) + bg^{+}(n).
\end{aligned}
\]

For the sums,

\[
\begin{aligned}
\sum_{k=m}^{n} [af(k) + bg(k)] &= af(m) + bg(m) + \cdots + af(n) + bg(n) \\
&= a[f(m) + \cdots + f(n)] + b[g(m) + \cdots + g(n)] \\
&= a \sum_{k=m}^{n} f(k) + b \sum_{k=m}^{n} g(k).
\end{aligned}
\]

There is a product rule only slightly different from what one might expect.

Theorem 1.3.2. For any functions $f : \mathbb{N}_0 \to \mathbb{R}$ and $g : \mathbb{N}_0 \to \mathbb{R}$,

\[ [f(n)g(n)]^{+} = f^{+}(n)g(n + 1) + f(n)g^{+}(n) = g^{+}(n)f(n + 1) + g(n)f^{+}(n). \]

Proof. The proof, whose first step is the judicious addition of 0, is againeasy.

[f(n)g(n)]+ = f(n + 1)g(n + 1) − f(n)g(n)

= f(n + 1)g(n + 1) − f(n)g(n + 1) + f(n)g(n + 1) − f(n)g(n)

= [f(n + 1) − f(n)]g(n + 1) + f(n)[g(n + 1) − g(n)]

= f+(n)g(n + 1) + f(n)g+(n).

Since f(n)g(n) = g(n)f(n), we may change the order to get

[f(n)g(n)]+ = g+(n)f(n + 1) + g(n)f+(n).
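A quick numerical check can make the shifted argument in the product rule concrete. The sketch below is an illustration only; the test functions `f` and `g` are arbitrary choices, not taken from the text. It confirms both forms of Theorem 1.3.2 and shows that the unshifted formula copied from derivative calculus fails for these functions.

```python
def delta(f, n):
    """Forward difference of f at n: f(n + 1) - f(n)."""
    return f(n + 1) - f(n)

def f(n):
    return n * n + 1      # arbitrary test function (assumption)

def g(n):
    return 3 * n - 2      # arbitrary test function (assumption)

def fg(n):
    return f(n) * g(n)

for n in range(10):
    lhs = delta(fg, n)
    # First form of Theorem 1.3.2: g is evaluated at n + 1.
    assert lhs == delta(f, n) * g(n + 1) + f(n) * delta(g, n)
    # Second form: f is evaluated at n + 1.
    assert lhs == delta(g, n) * f(n + 1) + g(n) * delta(f, n)
    # The unshifted rule misses the term f^+(n) g^+(n), which is nonzero here.
    naive = delta(f, n) * g(n) + f(n) * delta(g, n)
    assert lhs - naive == delta(f, n) * delta(g, n)
    assert naive != lhs
```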

One of the important steps in calculus is the introduction of the indefinite integral, or antiderivative,

\[ F(x) = \int_a^x f(t)\,dt. \]


The Fundamental Theorem of Calculus shows that differentiation and antidifferentiation are essentially inverse operations, in the sense that

\[ F'(x) = f(x), \quad \int_a^x F'(t)\,dt = F(x) - F(a). \]

To develop the analogous idea for sums we introduce the indefinite sum

\[ F(n) = \sum_{k=0}^{n} f(k), \quad n = 0, 1, 2, \ldots. \]

Notice that the indefinite sum is a way of producing new functions, rather than numbers. Here are a few examples which take advantage of our previous computations:

\[ f(n) = n, \quad F(n) = \sum_{k=0}^{n} k = 0 + 1 + \cdots + n = \frac{n(n+1)}{2}, \]

\[ f(n) = n^2, \quad F(n) = \sum_{k=0}^{n} k^2 = \frac{n(n+1)(2n+1)}{6}, \]

\[ f(n) = 1, \quad F(n) = \sum_{k=0}^{n} 1 = n + 1. \]

The Fundamental Theorem of Discrete Calculus relates indefinite sums and differences much as integrals and derivatives are related.

Theorem 1.3.3. (The Fundamental Theorem of Discrete Calculus) For any function $f : \mathbb{N}_0 \to \mathbb{R}$, and $n \ge 0$,

\[ \sum_{k=0}^{n} f^+(k) = f(n+1) - f(0), \]

and

\[ \Delta^+ \sum_{k=0}^{n} f(k) = f(n+1). \]

Proof. For the first part we have

\[ \sum_{k=0}^{n} f^+(k) = [f(n+1) - f(n)] + [f(n) - f(n-1)] + \cdots + [f(1) - f(0)]. \]

Adjacent terms cancel, leaving only the difference of the first and last terms.

For the second part,

\[ \Delta^+ \sum_{k=0}^{n} f(k) = \sum_{k=0}^{n+1} f(k) - \sum_{k=0}^{n} f(k) = f(n+1). \]

There are times when it is convenient to start additions from a number $m > 0$. A simple corollary of the Fundamental Theorem of Discrete Calculus is the formula

\[ \begin{aligned}
\sum_{k=m}^{n} f^+(k) &= \sum_{k=0}^{n} f^+(k) - \sum_{k=0}^{m-1} f^+(k) \\
&= [f(n+1) - f(0)] - [f(m) - f(0)] = f(n+1) - f(m).
\end{aligned} \tag{1.4} \]
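The telescoping in (1.4) can be confirmed by brute force. This is a small sketch with an arbitrarily chosen test function; the names `delta` and `sum_of_differences` are hypothetical helpers introduced here.

```python
def delta(f, k):
    """Forward difference f(k + 1) - f(k)."""
    return f(k + 1) - f(k)

def sum_of_differences(f, m, n):
    """Add f^+(k) term by term for k = m, ..., n."""
    return sum(delta(f, k) for k in range(m, n + 1))

def f(k):
    return k ** 3 - 5 * k  # arbitrary test function (assumption)

for m in range(0, 5):
    for n in range(m, 12):
        # Formula (1.4): the sum collapses to f(n + 1) - f(m).
        assert sum_of_differences(f, m, n) == f(n + 1) - f(m)
```

The term-by-term sum needs $n - m + 1$ evaluations of $f^+$, while the right side of (1.4) needs only two evaluations of $f$.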

As a consequence of the Fundamental Theorem of Discrete Calculus, every difference formula has a corresponding sum formula. For example, the calculation

\[ \Delta^+ \frac{1}{n+1} = \frac{1}{n+2} - \frac{1}{n+1} = \frac{-1}{(n+1)(n+2)} \]

gives

\[ \sum_{k=0}^{n} \frac{1}{(k+1)(k+2)} = 1 - \frac{1}{n+2}. \]

Recall from real variable calculus that the product rule and the Fundamental Theorem of Calculus combine to give integration by parts. Here is a similar result.

Theorem 1.3.4. (Summation by parts) For any functions $f : \mathbb{N}_0 \to \mathbb{R}$, $g : \mathbb{N}_0 \to \mathbb{R}$, and for $m \le n$,

\[ \sum_{k=m}^{n} f(k)g^+(k) = [f(n+1)g(n+1) - f(m)g(m)] - \sum_{k=m}^{n} f^+(k)g(k+1). \]

Proof. As with derivatives and integrals we start with the product rule.

[f(n)g(n)]+ = f+(n)g(n + 1) + f(n)g+(n).


Applying (1.4) yields

\[ [f(n+1)g(n+1) - f(m)g(m)] = \sum_{k=m}^{n} f^+(k)g(k+1) + \sum_{k=m}^{n} f(k)g^+(k), \]

which is equivalent to the desired formula.

As an application of summation by parts, let's find

\[ \sum_{k=0}^{n} \frac{k}{2^k}, \]

or more generally

\[ \sum_{k=0}^{n} kx^k, \quad x \ne 1, \]

for a fixed number $x$. It will prove fruitful to take

\[ f(n) = n, \quad g(n) = x^n/(x - 1). \]

The earlier computation (1.3) showed that $g^+(n) = x^n$. The summation by parts formula and the previous computation

\[ \sum_{k=0}^{n} x^k = \frac{1 - x^{n+1}}{1 - x} \]

now give

\[ \begin{aligned}
\sum_{k=0}^{n} kx^k &= \sum_{k=0}^{n} f(k)g^+(k) \\
&= [f(n+1)g(n+1) - f(0)g(0)] - \sum_{k=0}^{n} f^+(k)g(k+1) \\
&= \frac{(n+1)x^{n+1}}{x - 1} - \sum_{k=0}^{n} \frac{x^{k+1}}{x - 1} = \frac{(n+1)x^{n+1}}{x - 1} - \frac{x}{x - 1}\sum_{k=0}^{n} x^k,
\end{aligned} \]

so that

\[ \sum_{k=0}^{n} kx^k = \frac{(n+1)x^{n+1}}{x - 1} + \frac{x(1 - x^{n+1})}{(1 - x)^2}. \tag{1.5} \]
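Formula (1.5) is easy to get wrong by a sign, so a direct comparison is reassuring. The sketch below is an illustration with hypothetical helper names and arbitrarily chosen sample values of $x$; it compares the closed form with the sum added term by term, using exact fractions.

```python
from fractions import Fraction

def direct_sum(x, n):
    """Sum of k * x^k for k = 0, ..., n, added term by term."""
    return sum(k * x ** k for k in range(n + 1))

def closed_form(x, n):
    """Right side of (1.5); requires x != 1."""
    return (n + 1) * x ** (n + 1) / (x - 1) + x * (1 - x ** (n + 1)) / (1 - x) ** 2

# Sample values of x (assumptions); x = 1/2 corresponds to the sum of k / 2^k.
for x in (Fraction(1, 2), Fraction(-3, 4), Fraction(2), Fraction(5, 3)):
    for n in range(8):
        assert direct_sum(x, n) == closed_form(x, n)
```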


1.4 Sums of powers

This section begins with some additional information about differences and indefinite sums. The Fundamental Theorem of Discrete Calculus mimics the version from real variable calculus, showing in particular that if

\[ F(n) = \sum_{k=0}^{n} f(k), \]

then

\[ \Delta^+ F(n) = f(n+1). \]

This does not quite answer the question of whether for any function $f(n)$ there is a function $F_1(n)$ such that $\Delta^+ F_1(n) = f(n)$, and to what extent the 'antidifference' is unique. Following the resolution of these problems we consider whether the indefinite sum of a polynomial function is again a polynomial. This question will lead us back to the sums of powers formulas which appeared at the beginning of the chapter.

Lemma 1.4.1. Suppose the function f : N0 → R satisfies

Δ+f(n) = 0

for all n ≥ 0. Then f(n) is a constant.

Proof. Write

f(n) = f(0) + [f(1) − f(0)] + [f(2) − f(1)] + · · · + [f(n) − f(n − 1)],

to see that f(n) = f(0).

Theorem 1.4.2. If $f : \mathbb{N}_0 \to \mathbb{R}$ is any function, then there is a function $F_1 : \mathbb{N}_0 \to \mathbb{R}$ such that

Δ+F1(n) = f(n).

Moreover, if there are two functions F1(n) and F2(n) such that

Δ+F1(n) = f(n) = Δ+F2(n),

then for some constant C

F1(n) = F2(n) + C.


Proof. To show the existence of F1 take

\[ F_1(n) = \begin{cases} \sum_{k=0}^{n-1} f(k), & n \ge 1, \\ 0, & n = 0. \end{cases} \]

The forward difference of this function is

\[ \Delta^+ F_1(n) = F_1(n+1) - F_1(n) = \sum_{k=0}^{n} f(k) - \sum_{k=0}^{n-1} f(k) = f(n), \quad n \ge 1. \]

If $n = 0$ then

\[ \Delta^+ F_1(0) = F_1(1) - F_1(0) = f(0). \]

If there are two functions $F_1(n)$ and $F_2(n)$ whose forward differences agree at all points $n$, then

Δ+[F1(n) − F2(n)] = 0.

By the previous lemma there is a constant C such that

F1(n) − F2(n) = C.
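The construction in this proof translates directly into code. In the sketch below the helper name `antidifference` and the test function are assumptions made for illustration. It builds $F_1$ as in the proof, compares it with a second antidifference, and confirms that the two differ by a constant.

```python
def antidifference(f):
    """Return F1 with F1(0) = 0 and F1(n) = f(0) + ... + f(n - 1)."""
    def F1(n):
        return sum(f(k) for k in range(n))  # empty sum when n = 0
    return F1

def f(n):
    return 2 * n + 1  # arbitrary test function (assumption)

F1 = antidifference(f)

def F2(n):
    # Another antidifference of f, since (n + 1)^2 - n^2 = 2n + 1.
    return n * n + 7

for n in range(10):
    assert F1(n + 1) - F1(n) == f(n)
    assert F2(n + 1) - F2(n) == f(n)
    # Theorem 1.4.2: the two antidifferences differ by a constant.
    assert F2(n) - F1(n) == 7
```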

A nice feature of discrete calculus is that the differences of polynomials are again polynomials, but a drawback is that the formulas are more complex than the corresponding derivative formulas. Looking in particular at the power functions $n^m$, the Binomial Theorem gives

\[ \begin{aligned}
\Delta^+ n^m &= (n+1)^m - n^m = \sum_{k=0}^{m} \binom{m}{k} n^k - n^m \\
&= \sum_{k=0}^{m-1} \binom{m}{k} n^k = mn^{m-1} + \frac{m(m-1)}{2} n^{m-2} + \cdots + mn + 1.
\end{aligned} \]

This calculation is worth highlighting in the following lemma.

Lemma 1.4.3. If $m$ is a nonnegative integer then

\[ \Delta^+ n^m = \sum_{k=0}^{m-1} \binom{m}{k} n^k. \]


Since differences of the power functions are polynomials in $n$, Theorem 1.3.1 implies that the difference of a polynomial is always a polynomial. It is natural to then ask whether indefinite sums of polynomials are again polynomials, and if so, are there convenient formulas. This question was already considered by Euler [4, pp. 36–42]. It suffices to consider the sum of powers

\[ \sum_{k=0}^{n} k^m. \]

Theorem 1.4.4. For every nonnegative integer $m$ there is a polynomial $p_m(n)$ of degree $m + 1$ such that

\[ \sum_{k=0}^{n} k^m = p_m(n). \]

Moreover, these polynomials satisfy the recursion formula

\[ (m+1)p_m(n) = (m+1)\sum_{k=0}^{n} k^m = (n+1)^{m+1} - \sum_{j=0}^{m-1} \binom{m+1}{j} p_j(n). \]

Proof. The proof is by induction on $m$. The first case is $m = 0$, where

\[ \sum_{k=0}^{n} k^0 = \sum_{k=0}^{n} 1 = n + 1. \]

Note that $0^0$ is interpreted as 1 in this first case. If $m > 0$ then $0^m = 0$. For $m \ge 1$ consider two evaluations of the sum

\[ \sum_{k=0}^{n} \Delta^+ k^{m+1}. \tag{1.6} \]

On one hand, the Fundamental Theorem of Discrete Calculus gives

\[ \sum_{k=0}^{n} \Delta^+ k^{m+1} = (n+1)^{m+1} - 0^{m+1} = (n+1)^{m+1}. \]

On the other hand, Lemma 1.4.3 shows that

\[ \Delta^+ k^{m+1} = (k+1)^{m+1} - k^{m+1} = \sum_{j=0}^{m} \binom{m+1}{j} k^j. \]


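Putting this expression for $\Delta^+ k^{m+1}$ into the sum in (1.6) gives

\[ \sum_{k=0}^{n} \Delta^+ k^{m+1} = \sum_{k=0}^{n} \sum_{j=0}^{m} \binom{m+1}{j} k^j. \]

Equating the two expressions for

\[ \sum_{k=0}^{n} \Delta^+ k^{m+1} \]

gives

\[ (n+1)^{m+1} = \sum_{k=0}^{n} \sum_{j=0}^{m} \binom{m+1}{j} k^j. \]

Interchanging the order of summation leads to

\[ \begin{aligned}
(n+1)^{m+1} &= \sum_{j=0}^{m} \sum_{k=0}^{n} \binom{m+1}{j} k^j = \sum_{j=0}^{m} \Bigl[ \binom{m+1}{j} \sum_{k=0}^{n} k^j \Bigr] \\
&= (m+1)\sum_{k=0}^{n} k^m + \sum_{j=0}^{m-1} \Bigl[ \binom{m+1}{j} \sum_{k=0}^{n} k^j \Bigr] \\
&= (m+1)\sum_{k=0}^{n} k^m + \sum_{j=0}^{m-1} \binom{m+1}{j} p_j(n).
\end{aligned} \tag{1.7} \]

By virtue of the induction hypothesis

\[ p_m(n) = \sum_{k=0}^{n} k^m \]

is a polynomial in $n$ of degree $m + 1$ satisfying the given recursion formula.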

We make two comments about ideas arising in this proof. The first comment has to do with changing the order of summation in (1.7). Suppose we have any function $F(j, k)$ of the integer variables $j$, $k$. For the case in question $F(j, k) = \binom{m+1}{j} k^j$, and the values $F(j, k)$ are added for all $j$, $k$ in a rectangle in the $j$-$k$ plane (see Figure 1.3). For a finite sum, the sum does not depend on the order of summation, so we may add rows first, or columns first, whichever proves more convenient.

Figure 1.3: Adding F (j, k) for j = 0, . . . , 6, and k = 0, . . . , 4.

The second comment concerns a slight variation on induction which entered in this last proof. Rather than counting on the truth of the $m$-th statement to imply the truth of the $(m + 1)$-st statement, we actually assumed the truth of all statements with index less than or equal to $m$. A review of the logic behind induction shows that this variant is equally legitimate.

The recursion formula for the functions $p_m(n)$ means that in principle we can write down arbitrary sums of powers formulas, although they immediately look pretty messy. For instance the previously established formulas

\[ p_0(n) = n + 1, \quad p_1(n) = \frac{n(n+1)}{2}, \quad p_2(n) = \frac{n(n+1)(2n+1)}{6}, \]

lead to

\[ \begin{aligned}
4p_3(n) = 4\sum_{k=0}^{n} k^3 &= (n+1)^4 - \sum_{j=0}^{2} \binom{4}{j} p_j(n) \\
&= (n+1)^4 - p_0(n) - 4p_1(n) - 6p_2(n) \\
&= (n+1)^4 - (n+1) - 2n(n+1) - n(n+1)(2n+1) \\
&= (n+1)^2 n^2.
\end{aligned} \]

That is,

\[ \sum_{k=0}^{n} k^3 = \frac{(n+1)^2 n^2}{4}. \]
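The recursion of Theorem 1.4.4 is mechanical enough to hand to a computer. The following sketch is an illustration, not the author's method of presentation; the function name `p` simply mirrors the notation $p_m(n)$. It evaluates $p_m(n)$ from the recursion with exact arithmetic and compares the result with the formulas above and with the sums added directly.

```python
from fractions import Fraction
from functools import lru_cache
from math import comb

@lru_cache(maxsize=None)
def p(m, n):
    """Evaluate p_m(n) = 0^m + 1^m + ... + n^m by the recursion of Theorem 1.4.4."""
    if m == 0:
        return Fraction(n + 1)
    lower = sum(comb(m + 1, j) * p(j, n) for j in range(m))
    return ((n + 1) ** (m + 1) - lower) / (m + 1)

for n in range(12):
    assert p(1, n) == Fraction(n * (n + 1), 2)
    assert p(2, n) == Fraction(n * (n + 1) * (2 * n + 1), 6)
    assert p(3, n) == Fraction((n + 1) ** 2 * n ** 2, 4)
    # Compare with the sum added directly (0 ** 0 is 1 in Python, matching the text).
    for m in range(6):
        assert p(m, n) == sum(k ** m for k in range(n + 1))
```

The recursion uses all of $p_0, \ldots, p_{m-1}$, which is the same strong form of induction noted in the second comment above.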


1.5 Problems

1. Use induction to show that

\[ \sum_{k=1}^{n} k = \frac{n(n+1)}{2}. \]

2. Use induction to show that

\[ \sum_{k=0}^{n-1} \frac{1}{(k+1)(k+2)} = 1 - \frac{1}{n+1}. \]

3. Use induction to show that

\[ \sum_{k=0}^{n-1} k2^{-k} = 2 - (n+1)2^{1-n}. \]

4. Use the Binomial Theorem to show that

\[ 2^n = \sum_{k=0}^{n} \binom{n}{k}. \]

5. Find the flaw in the following logic.

Let's prove by induction that if you have a collection of $N$ horses, and at least 1 of them is white, then they are all white.

Clearly if the collection has only 1 horse, and at least 1 is white, then they are all white.

Suppose the statement is true for $K$ horses. Assume then that you have a collection of $K+1$ horses, and at least 1 is white. Throw out one horse, which is not the chosen white one. Now you have a collection of $K$ horses, and at least 1 is white, so all $K$ are white. Now bring back the ejected horse, toss out another one, repeat the argument, and all $K + 1$ horses are white.

Since there is a white horse somewhere in the world, all horses are white!!

6. Show that for any positive integer $n$ the number $n^2$ is the sum of the first $n$ odd numbers,

\[ n^2 = \sum_{k=1}^{n} (2k - 1). \]


7. Suppose that for nonnegative integers $m$ the function $T$ satisfies the recurrence formula

\[ T(2^m) \le aT(2^{m-1}) + b2^m, \quad m \ge 1, \]

\[ T(1) \le b. \]

Here $a$ and $b$ are nonnegative numbers. Use induction to show that for every positive integer $m$,

\[ T(2^m) \le b2^m \sum_{k=0}^{m} (a/2)^k = b2^m \, \frac{1 - (a/2)^{m+1}}{1 - (a/2)}. \]

Such recurrence formulas are often encountered in studying the execution time $T$ of computer algorithms as a function of the size $2^m$ of a set of inputs.

8. Find $f^+(n)$ if

\[ \text{a) } f(n) = n^3, \qquad \text{b) } f(n) = \frac{1}{(n+1)^2}. \]

9. Use the results of problem 8 to find

\[ \text{a) } \sum_{k=0}^{n-1} (3k^2 + 3k + 1), \qquad \text{b) } \sum_{k=0}^{n-1} \frac{2k+3}{(k+1)^2(k+2)^2}. \]

10. Use trigonometric identities to show that

\[ \Delta^+ \sin(an) = \sin(an)[\cos(a) - 1] + \sin(a)\cos(an) \]

\[ = 2\cos(a[n + 1/2])\sin(a/2). \]

11. Verify the quotient rule

\[ \Delta^+ \frac{f(n)}{g(n)} = \frac{f^+(n)g(n) - f(n)g^+(n)}{g(n)g(n+1)}. \]

12. Use the quotient rule to evaluate $f^+(n)$ if

\[ \text{a) } f(n) = \frac{n^2}{2n^2 + n + 1}, \qquad \text{b) } f(n) = \frac{n^2}{3^n}. \]

Use the Fundamental Theorem to derive summation formulas from these two calculations.


13. For fixed $n$ let

\[ f(k) = \binom{n}{k}. \]

Find $f^+(k)$ and determine when $f$ is an increasing, respectively decreasing, function of $k$.

14. For an integer $m \ge 2$, compute

\[ \sum_{k=0}^{n-1} \frac{m}{(k+1)(k+m+1)}. \]

(Hint: compute $\Delta^+\bigl[\frac{1}{n+1} + \frac{1}{n+2}\bigr]$, and generalize.)

15. Use the summation by parts formula to find

\[ \sum_{k=0}^{n-1} k^2 x^k. \]

(Hint: Follow the method used to derive (1.5).)

16. Show that

\[ \sum_{k=0}^{n-1} \bigl( \sin(k)[\cos(1) - 1] + \sin(1)\cos(k) \bigr) = \sin(n). \]

17. Express the function

\[ p_4(n) = \sum_{k=0}^{n} k^4 \]

as a polynomial in $n$.

18. Show that if $p(n)$ is a polynomial, then so is

\[ \sum_{k=0}^{n} p(k). \]

19. Use the formula

\[ \binom{n}{k-1} + \binom{n}{k} = \binom{n+1}{k} \]

to show that the binomial coefficients are integers.

20. For integers $k \ge 1$ define the function

\[ q_k(n) = n(n+1) \cdots (n+k-1). \]

(a) Show that $nq_k(n+1) = (n+k)q_k(n)$.

(b) Show that $nq_k^+(n) = kq_k(n)$.

(c) Show that

\[ \sum_{j=0}^{n-1} q_k(j) = \frac{n}{k} q_k(n) - \frac{1}{k} \sum_{j=0}^{n-1} q_k(j+1) = \frac{n}{k} q_k(n) - \frac{1}{k} \sum_{j=1}^{n} q_k(j), \]

so that

\[ \sum_{j=0}^{n-1} q_k(j) = \frac{n-1}{k+1} q_k(n). \]

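21. Consider the following series computations.

(a) For integer $m \ge 1$ and $x \ne 1$ show that

\[ \sum_{k=0}^{n-1} k^m x^k = \frac{n^m x^n}{x - 1} - \sum_{k=0}^{n-1} \Bigl[ \sum_{j=0}^{m-1} \binom{m}{j} k^j \Bigr] \frac{x^{k+1}}{x - 1}. \]

(b) Use part (a) to compute

\[ \sum_{k=0}^{n-1} k^2 x^k. \]

(c) Use derivatives and the formulas

\[ f(x) = \sum_{k=0}^{n-1} x^k = \frac{1 - x^n}{1 - x}, \qquad xf'(x) = \sum_{k=0}^{n-1} kx^k \]

to compute

\[ \sum_{k=0}^{n-1} kx^k. \]

(d) Define

\[ F_m(x) = \sum_{k=0}^{n-1} k^m x^k. \]

Show that

\[ xF_m'(x) = F_{m+1}(x). \]

Compute

\[ F_2(x) = \sum_{k=0}^{n-1} k^2 x^k. \]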


Chapter 2

Selected Area Computations

2.1 Introduction

Problems involving the computation of areas and volumes of geometric figures date back to some of the earliest writings [9, p. 10], but the subject was first extensively developed by the ancient Greeks. It was the Greeks (Eudoxus, 408 B.C.-355 B.C.) who developed the method of exhaustion, which computes the area within a geometric figure by tiling the figure with polygons whose areas are known.

We begin by defining the area of a rectangle to be the product of its length and width. Suppose then that $F$ is a figure whose area is desired. The area of $F$ can be estimated by comparing two constructions. First, cover the figure with a finite collection of rectangles so that the figure $F$ is a subset of the union of the rectangles. The area of $F$ will be no greater than the sum $A_o$ of areas of the covering rectangles. Second, find a finite collection of rectangles which do not overlap (except perhaps on the boundaries) and which lie inside $F$. The sum $A_i$ of the areas of these interior rectangles is no greater than the area of $F$. For any such collections of rectangles,

\[ A_i \le \operatorname{area}(F) \le A_o. \]

This idea can be effectively used to compute the areas of a variety of shapes.

Several specific area computations are discussed in this chapter. After some simple cases illustrating Riemann sum calculations, the classical problem of computing the area π of a circle whose radius is 1 is considered. This problem was studied by the ancient Greeks. The next problem, the geometric development of the natural logarithm, was considered about two thousand years later. The final topic is Stirling's formula, an approximation of $n!$ which may be developed by geometric considerations and a bit of calculus.


2.2 Areas under power function graphs

Figures 2.1 and 2.2 illustrate the computation of the area $A$ of a triangle using rectangles. Suppose that the triangle has base $b$ and height $h$, and that the equation of the linear function providing the upper boundary is $f(x) = hx/b$ for $0 \le x \le b$. Divide the $x$-axis between 0 and $b$ into $n$ subintervals of equal length $b/n$. The endpoints of the subintervals are then

\[ x_k = kb/n, \quad k = 0, \ldots, n. \]

Figure 2.1: Rectangles enclosing a triangle

In Figure 2.1 the union of the rectangles encloses the triangle. The height of the $k$-th rectangle is

\[ f(x_k) = \frac{kb}{n} \frac{h}{b} = k\frac{h}{n}, \quad k = 1, \ldots, n. \]


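The sum of the areas of the rectangles is

\[ A_o = \sum_{k=1}^{n} \frac{b}{n} k \frac{h}{n} = \frac{bh}{n^2} \sum_{k=1}^{n} k. \]

Since

\[ \sum_{k=1}^{n} k = \frac{n(n+1)}{2} \]

it follows that

\[ A_o = \frac{bh}{n^2} \frac{n(n+1)}{2} = \frac{bh}{2} \frac{n^2 + n}{n^2} = \frac{bh}{2} \Bigl[ 1 + \frac{1}{n} \Bigr]. \]

In Figure 2.2 the triangle encloses the union of nonoverlapping rectangles. Starting the count now with $k = 0$ rather than $k = 1$, the height of the $k$-th rectangle is

\[ f(x_k) = \frac{kb}{n} \frac{h}{b} = k\frac{h}{n}, \quad k = 0, \ldots, n - 1. \]

The sum of the rectangular areas is

\[ A_i = \sum_{k=0}^{n-1} \frac{b}{n} k \frac{h}{n} = \frac{bh}{n^2} \sum_{k=0}^{n-1} k = \frac{bh}{n^2} \frac{(n-1)n}{2}. \]

Thus

\[ A_i = \frac{bh}{2} \Bigl[ 1 - \frac{1}{n} \Bigr]. \]

Finally we have

\[ A_i = \frac{bh}{2} \Bigl[ 1 - \frac{1}{n} \Bigr] < A < A_o = \frac{bh}{2} \Bigl[ 1 + \frac{1}{n} \Bigr]. \]

Since this inequality is true for every positive integer $n$, the area $A$ is neither smaller nor larger than $bh/2$, so that

\[ A = bh/2. \]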
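The squeeze between $A_i$ and $A_o$ can be watched numerically. The sketch below is an illustration with a hypothetical helper `triangle_sums` and sample dimensions chosen arbitrarily. It adds the rectangle areas directly and checks them against the closed forms just derived.

```python
from fractions import Fraction

def triangle_sums(b, h, n):
    """Return (A_i, A_o): inner and outer rectangle sums for f(x) = h x / b on [0, b]."""
    width = Fraction(b, n)
    heights = [Fraction(h * k, n) for k in range(n + 1)]  # f(x_k) = k h / n
    inner = sum(width * heights[k] for k in range(0, n))      # left endpoints
    outer = sum(width * heights[k] for k in range(1, n + 1))  # right endpoints
    return inner, outer

b, h = 3, 4  # sample base and height (assumption)
for n in (1, 10, 100, 1000):
    inner, outer = triangle_sums(b, h, n)
    half_bh = Fraction(b * h, 2)
    assert inner == half_bh * (1 - Fraction(1, n))
    assert outer == half_bh * (1 + Fraction(1, n))
    print(n, float(inner), float(outer))
```

Both printed columns approach $bh/2$, one from below and one from above.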

Figure 2.2: Rectangles within a triangle

The same ideas may be applied to the computation of the area lying under the graph of $f(x) = x^2$ for $0 \le x \le b$ (see Figures 2.3 and 2.4). Since the function is increasing for $x \ge 0$, the minimum and maximum values of the function $x^2$ on any subinterval $[x_k, x_{k+1}]$ are at $x_k$ and $x_{k+1}$ respectively. In this new case the function values are

\[ f(x_k) = x_k^2 = \frac{k^2 b^2}{n^2} \]

and

\[ A_o = \sum_{k=1}^{n} \frac{b}{n} \frac{k^2 b^2}{n^2} = \frac{b^3}{n^3} \sum_{k=1}^{n} k^2 = \frac{b^3}{n^3} \frac{n(n+1)(2n+1)}{6} = \frac{b^3}{3} \Bigl[ 1 + \frac{3}{2n} + \frac{1}{2n^2} \Bigr]. \]

Similarly

\[ A_i = \sum_{k=0}^{n-1} \frac{b}{n} \frac{k^2 b^2}{n^2} = \frac{b^3}{n^3} \frac{(n-1)n(2n-1)}{6}, \]

or

\[ A_i = \frac{b^3}{3} \, \frac{2n^3 - 3n^2 + n}{2n^3} = \frac{b^3}{3} \Bigl[ 1 - \frac{3}{2n} + \frac{1}{2n^2} \Bigr]. \]

Figure 2.3: Right endpoint sums for a parabola

This time we get

\[ A_i = \frac{b^3}{3} \Bigl[ 1 - \frac{3}{2n} + \frac{1}{2n^2} \Bigr] < A < A_o = \frac{b^3}{3} \Bigl[ 1 + \frac{3}{2n} + \frac{1}{2n^2} \Bigr]. \]

Since this inequality is true for every positive integer $n$, the area under the parabola is bigger than any number smaller than $b^3/3$, and smaller than any number bigger than $b^3/3$, or

\[ A = b^3/3. \]

Figure 2.4: Left endpoint sums for a parabola

As a final note, the evaluation of

\[ \sum_{k=0}^{n-1} k^m \]

from the previous chapter may be used to determine the area under the graph of the function $f(x) = x^m$ for all positive integers $m$. The structure of the argument is the same as above. The rectangular areas have the form

\[ A_o = \sum_{k=1}^{n} \frac{b}{n} \frac{k^m b^m}{n^m} = \frac{b^{m+1}}{n^{m+1}} \sum_{k=1}^{n} k^m \]

and

\[ A_i = \sum_{k=0}^{n-1} \frac{b}{n} \frac{k^m b^m}{n^m} = \frac{b^{m+1}}{n^{m+1}} \sum_{k=0}^{n-1} k^m. \]

There is a polynomial $q_0$ of degree at most $m$ such that

\[ \sum_{k=1}^{n} k^m = \frac{(n+1)^{m+1}}{m+1} + q_0(n) \]

and

\[ \sum_{k=0}^{n-1} k^m = \frac{n^{m+1}}{m+1} + q_0(n-1). \]


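Notice that by the Binomial Theorem the ratio

\[ \frac{(n+1)^{m+1}}{n^{m+1}} \]

may be written as

\[ 1 + \frac{q_1(n)}{n^{m+1}} \]

for some new polynomial $q_1(n)$ of degree at most $m$. This gives

\[ A_i = \frac{b^{m+1}}{m+1} \Bigl[ 1 + \frac{(m+1)q_0(n-1)}{n^{m+1}} \Bigr] < A < A_o = \frac{b^{m+1}}{m+1} \Bigl[ 1 + \frac{q_2(n)}{n^{m+1}} \Bigr] \]

for some polynomial $q_2(n)$ of degree at most $m$ incorporating all the lower order terms in $A_o$. Since the numerators have degree at most $m$ while the denominators have degree $m + 1$, both bracketed factors approach 1 as $n$ grows. Thus the area under the graph of $x^m$ from $x = 0$ to $x = b$ is

\[ A = \frac{b^{m+1}}{m+1}, \]

which is of course well known from calculus.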
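The general argument can be tested for particular exponents. In this sketch the helper name `power_sums` and the sample values $m = 3$, $b = 2$ are assumptions. It computes $A_i$ and $A_o$ exactly and confirms that they trap $b^{m+1}/(m+1)$.

```python
from fractions import Fraction

def power_sums(m, b, n):
    """Return (A_i, A_o) for f(x) = x^m on [0, b] with n equal subintervals."""
    scale = Fraction(b, n) ** (m + 1)          # b^(m+1) / n^(m+1)
    inner = scale * sum(k ** m for k in range(0, n))
    outer = scale * sum(k ** m for k in range(1, n + 1))
    return inner, outer

m, b = 3, 2  # sample exponent and right endpoint (assumption)
target = Fraction(b ** (m + 1), m + 1)
for n in (10, 100, 1000):
    inner, outer = power_sums(m, b, n)
    assert inner < target < outer
    # For m >= 1 the gap A_o - A_i is exactly b^(m+1) / n, so it shrinks like 1/n.
    assert outer - inner == Fraction(b ** (m + 1), n)
    print(n, float(inner), float(outer))
```

The gap identity holds because the two sums differ only in the terms $k = 0$ and $k = n$.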

2.3 The computation of π

The number π is both the area of a circle of radius 1 and half the circumference of the same circle. A modern calculation gives the first ten digits,

\[ \pi = 3.141592654 \ldots. \]

The ancients confronted the computation of π with varying degrees of success [9]. The Babylonians seem to have used 3 for the area of the circle of radius 1, but $3\frac{1}{8}$ for the value computed as a circumference. The Egyptians calculated the area of the circle of radius 1 as $(16/9)^2$, which amounts to the value

\[ \pi \approx 3.1605. \]

The ancient Greeks understood that π could be approximated by calculating the perimeters of inscribed and circumscribed regular polygons; Archimedes (287-212 B.C.) calculated that

\[ 3\tfrac{10}{71} < \pi < 3\tfrac{1}{7}, \]

which in decimal form gives

\[ 3.140 < \pi < 3.143. \]

The basis for our investigation of the computation of π by geometric methods is a construction which appears in the Elements of Euclid, written about 300 B.C. The purpose there is to show that the area of a circle is proportional to the square of the diameter [9, p. 83].

Figure 2.5 illustrates a process which leads to increasingly refined estimates for π in terms of the areas of inscribed polygons. The radial segments $AB$ and $AC$ have length 1. The angle $BAC$ is 45 degrees, so the coordinates $(x_C, y_C)$ of $C$ are both $\sqrt{2}/2$. The triangle $ABC$ thus has area $\sqrt{2}/4$. The larger triangle $ABD$ has area 1/2. Since the sector of the disk $BAC$ has one-eighth the area of the entire disk, we first obtain the estimate

\[ 2\sqrt{2} < \pi < 4. \]

This process may be refined by constructing inscribed regular polygons $P_n$ with $2^n$ sides. The polygons are constructed by repeated angle bisections, as shown in Figure 2.6. One of the angular sectors of the polygon at stage $n - 1$ is $BAC$, and the line $AD$ bisects the angle $BAC$. The difference between the area of the disk, π, and the area of the polygon $P_{n-1}$ is $\epsilon_{n-1}$, where $\epsilon_{n-1}/2^{n-1}$ is the area bounded by the line segment $BC$ and the circular arc $BC$.

To refine one of these polygons, the edges of $P_{n-1}$, exemplified by $BC$, are replaced with pairs of edges $BD$ and $DC$. The full refined polygons are obtained from the triangles $ABD$ by repeatedly rotating the figures through angles of $360/2^n$ degrees. The difference between the area of the disk and the area of the refined polygon $P_n$ is $\epsilon_n$, where $\epsilon_n/2^n$ is the area bounded by the line segment $BD$ and the circular arc $BD$.

To compare the errors $\epsilon_{n-1}$ and $\epsilon_n$, a line $FG$ is constructed through $D$ and parallel to $BC$, and the lines $CG$ and $BF$ are constructed parallel to $ED$. More than half the area bounded by the line segment $BC$ and the circular arc $BC$ is now included in the triangle $BCD$, which is part of $P_n$. Thus

\[ \epsilon_n < \epsilon_{n-1}/2. \]

Figure 2.5: Estimating π

It is easy to obtain an estimate for $\epsilon_2$ from Figure 2.5, since the disk is trapped between the circumscribed square with area 4 and the inscribed square with area 2. Thus

\[ \epsilon_2 < 2, \]

and by induction

\[ \epsilon_n < \frac{8}{2^n}, \quad n \ge 2. \tag{2.1} \]

For practical error estimation, another inequality is helpful. Notice in Figure 2.6 that the polygon $P_n$ is obtained from $P_{n-1}$ by adjoining triangles exemplified by $CDE$. The triangle $CDG$ has the same area as $CDE$, and by adjoining $CDG$ to $P_n$ we obtain a figure which includes the portion of the disk $CAD$. If $A_n$ is the area of $P_n$, this implies the inequality

\[ A_n < \pi < A_n + [A_n - A_{n-1}]. \tag{2.2} \]

To obtain improved estimates of π, a bit of trigonometry will be used to provide a systematic way of computing the area of the triangles $CAD$. Since the length of $AD$ is the radius 1, the missing piece is the height of the segment $CE$. The coordinates of the point $C$ are $(\cos(\theta), \sin(\theta))$, where $\theta = 360/2^n$ degrees or $\theta = 2\pi/2^n$ radians. Thus the area of the triangle $CAD$ is

\[ \frac{A_n}{2^n} = \frac{1}{2} \sin(2\pi/2^n), \tag{2.3} \]

and our approximation $A_n$ of π is the sum of the areas of the $2^n$ triangles of this size,

\[ \pi \approx A_n = 2^{n-1} \sin(2\pi/2^n). \]

These values of $\sin(\theta)$ can be calculated using the half-angle formulas

\[ \sin(\theta/2) = \sqrt{\frac{1 - \cos(\theta)}{2}}, \quad \cos(\theta/2) = \sqrt{\frac{1 + \cos(\theta)}{2}}. \]

It will suffice to work primarily with $\cos(\theta)$. The idea is to start with $\cos(\pi/4) = \sqrt{2}/2$ and repeatedly use the half-angle formulas. If

\[ c_n = \cos(2\pi/2^n), \tag{2.4} \]

then the first few expressions are

\[ c_4 = \cos(2\pi/2^4) = [(1 + \sqrt{2}/2)/2]^{1/2} \approx .9239, \]

\[ c_5 = \cos(2\pi/2^5) = \sqrt{(1 + c_4)/2} \approx .9808. \]

Since $\sin^2(\theta) + \cos^2(\theta) = 1$, the values for $\sin(2\pi/2^n)$ may be computed, and then $A_n$ may be used to approximate π. The approximations of π corresponding to $c_4$ and $c_5$ are

\[ A_4 = 3.061, \quad A_5 = 3.12. \]

According to our earlier analysis of the error (2.1),

\[ 0 < \pi - A_4 < 8/2^4 = 1/2, \quad 0 < \pi - A_5 < 8/2^5 = 1/4. \]

As a practical matter it is more instructive to use a particular case of (2.2),

\[ A_5 < \pi < A_5 + [A_5 - A_4], \]

which gives

\[ 3.12 < \pi < 3.19. \tag{2.5} \]
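The half-angle iteration is short enough to carry much further by machine. The following sketch is an illustration under stated assumptions: the function name `polygon_areas` is hypothetical, the arithmetic is ordinary floating point rather than exact, and the value of π is never used in the computation itself. It starts from $c_3 = \sqrt{2}/2$ and prints the bounds from (2.2).

```python
import math

def polygon_areas(n_max):
    """Return {n: A_n} for n = 3, ..., n_max, where A_n = 2^(n-1) sin(2 pi / 2^n)."""
    areas = {}
    c = math.sqrt(2) / 2                 # c_3 = cos(2 pi / 2^3) = cos(pi / 4)
    for n in range(3, n_max + 1):
        s = math.sqrt(1 - c * c)         # sin from sin^2 + cos^2 = 1
        areas[n] = 2 ** (n - 1) * s
        c = math.sqrt((1 + c) / 2)       # half-angle step: c_{n+1} from c_n
    return areas

areas = polygon_areas(12)
for n in range(4, 13):
    lower = areas[n]
    upper = areas[n] + (areas[n] - areas[n - 1])   # inequality (2.2)
    print(n, round(lower, 6), round(upper, 6))
```

For large $n$ the subtraction $1 - c^2$ loses floating point accuracy, so this sketch should not be pushed far beyond the range shown.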

Figure 2.6: Polygonal approximation of a circle

We have used trigonometry to obtain this estimate of the value of π. This subject was developed as an outgrowth of Greek geometry in the period between 150 B.C. and 168 A.D. [9, pp. 119–126]. It is interesting to note that the estimate $3\frac{10}{71} < \pi < 3\frac{10}{70}$ developed by Archimedes (287-212 B.C.) predates the development of trigonometry. Additional notes on early computations of π may be found in [1].

2.4 Natural logarithms

Tables of logarithms were first published by John Napier in 1614. Equipped with a table of logarithms, calculations of products, quotients, and roots could be reduced to simpler sums, differences, or simple quotients. Shortly afterward these logarithm tables were mechanized with a popular computing machine, the slide rule.

Logarithms are usually introduced with a base of 10, with subsequent extensions to other (usually integer) bases. However, from the point of view of calculus the most useful logarithm function is the natural logarithm, which arises from geometric considerations.

The context is the problem of finding the area under the graph of the function $f(t) = 1/t$, say between two positive numbers $t = a$ and $t = b$. In the language of calculus the problem is to calculate

\[ A_1 = \int_a^b \frac{1}{t}\,dt. \]

Consider attacking this problem by constructing Riemann sum approximations of the area $A_1$ (see Figure 2.7). For instance we can use left and right endpoint Riemann sums, with $n$ subintervals of equal length,

\[ L_n = \sum_{k=0}^{n-1} f(t_k) \frac{b-a}{n} = \frac{b-a}{n} \sum_{k=0}^{n-1} \frac{1}{t_k}, \quad t_k = a + k\frac{b-a}{n}, \]

and

\[ R_n = \sum_{k=0}^{n-1} f(t_{k+1}) \frac{b-a}{n} = \frac{b-a}{n} \sum_{k=0}^{n-1} \frac{1}{t_{k+1}}. \]

Since the function $f(t) = 1/t$ is decreasing for $t > 0$,

\[ R_n < A_1 < L_n, \]

and

\[ \lim_{n \to \infty} R_n = A_1 = \lim_{n \to \infty} L_n. \]

Rather than actually coming up with the area, we are going to make a key observation about the relationships among various such areas. Let $m$ be a positive number, and consider the new area (Figure 2.8)

\[ A_2 = \int_{ma}^{mb} \frac{1}{t}\,dt, \quad m > 0. \]

Since the interval $[a, b]$ has been scaled by the number $m$, the same will be true of the $n$ subintervals of equal length. That is,

\[ T_k = ma + k\frac{mb - ma}{n} = m\Bigl[ a + k\frac{b-a}{n} \Bigr] = mt_k. \]

Figure 2.7: Area under f(x) = 1/x

Let us now write down the corresponding left and right endpoint Riemann sums,

\[ \tilde{L}_n = \sum_{k=0}^{n-1} f(T_k) \frac{mb - ma}{n} = m\frac{b-a}{n} \sum_{k=0}^{n-1} \frac{1}{mt_k}. \]

Notice that the factors $m$ and $1/m$ cancel, and we find that

\[ \tilde{L}_n = \sum_{k=0}^{n-1} f(T_k) \frac{mb - ma}{n} = \sum_{k=0}^{n-1} f(t_k) \frac{b-a}{n} = L_n. \]

The analogous result $\tilde{R}_n = R_n$ holds as well.

Our thinking now runs as follows. By picking $n$ large enough it is possible to make $L_n$ and $R_n$ as close to $A_1$ as we like. You can pick your favorite tiny number, say $10^{-j}$ for large $j$, and by making $n$ large enough $L_n$ and $R_n$ will be trapped in a tiny interval,

\[ A_1 - 10^{-j} < R_n < A_1 < L_n < A_1 + 10^{-j}. \]

Figure 2.8: Two areas under f(x) = 1/x

Since the Riemann sums for $A_2$ are exactly the same,

\[ A_1 - 10^{-j} < R_n = \tilde{R}_n < A_2 < L_n = \tilde{L}_n < A_1 + 10^{-j}, \tag{2.6} \]

which forces $A_1$ and $A_2$ to be the same.

It's worth thinking a bit more about this. Suppose $A_1 < A_2$. If the number $j$ is large enough, so that $10^{-j} < A_2 - A_1$, then

\[ A_1 < A_1 + 10^{-j} < A_2. \]

The inequality (2.6) above says this can never happen. Similarly, the inequality $A_1 > A_2$ is ruled out. It must be that $A_1 = A_2$, or

\[ \int_a^b \frac{1}{t}\,dt = \int_{ma}^{mb} \frac{1}{t}\,dt. \tag{2.7} \]
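The cancellation of $m$ and $1/m$ in the Riemann sums is exact, not approximate, and this can be seen with rational arithmetic. The sketch below uses hypothetical helpers `left_sum` and `right_sum` and arbitrarily chosen sample values of $a$, $b$, and $m$.

```python
from fractions import Fraction

def left_sum(a, b, n):
    """Left endpoint Riemann sum for 1/t on [a, b] with n equal subintervals."""
    step = Fraction(b - a) / n
    return sum(step / (a + k * step) for k in range(n))

def right_sum(a, b, n):
    """Right endpoint Riemann sum for 1/t on [a, b] with n equal subintervals."""
    step = Fraction(b - a) / n
    return sum(step / (a + (k + 1) * step) for k in range(n))

a, b = Fraction(1), Fraction(3)      # sample interval (assumption)
for m in (Fraction(2), Fraction(7, 5), Fraction(1, 10)):
    for n in (1, 4, 25):
        # Scaling [a, b] to [ma, mb] leaves both Riemann sums unchanged.
        assert left_sum(m * a, m * b, n) == left_sum(a, b, n)
        assert right_sum(m * a, m * b, n) == right_sum(a, b, n)
```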

The most interesting observation occurs when $a = 1$ and $b$ is the product $b = xy$, with $x > 1$ and $y > 1$. Integrating from 1 to $xy$, breaking the area into two parts, and then using (2.7) (with $x$ instead of $m$) gives

\[ \int_1^{xy} \frac{1}{t}\,dt = \int_1^{x} \frac{1}{t}\,dt + \int_x^{xy} \frac{1}{t}\,dt = \int_1^{x} \frac{1}{t}\,dt + \int_1^{y} \frac{1}{t}\,dt. \]


That is, for the function $g(x)$, which is defined to be the area under the graph of $1/t$ for $t$ between 1 and $x$,

\[ g(xy) = g(x) + g(y). \]

This is exactly the sort of equation expected of a logarithm; for instance

\[ \log_{10}(xy) = \log_{10}(x) + \log_{10}(y). \]

Notice that since there is zero area if $x = 1$,

\[ g(1) = 0. \]

The function

\[ g(x) = \int_1^x \frac{1}{t}\,dt \]

is then defined to be our natural logarithm. We've shown the following result, which was published in 1649 by Alfons A. de Sarasa (1618-67).

Theorem: The function

\[ \log(x) = \int_1^x \frac{1}{t}\,dt \]

satisfies $\log(1) = 0$ and

\[ \log(xy) = \log(x) + \log(y), \quad x, y \ge 1. \]

The logarithms developed by Napier were based on representing a number $y$ as $10^x$, or $b^x$ for some other base $b$. If $y = b^x$, then by definition $x = \log_b(y)$, and $b$ is called the base of the logarithm. This raises two questions about the natural logarithms. Is there a corresponding base, and if there is, what is it?

Let's turn the questions around. Suppose there is a base, which we will call $e$. To determine the base $b$ of a logarithm, start with the observation that $\log_b(b) = 1$, and find the number $x$ with $\log(x) = 1$.

Riemann sums provide some crude information about the value of $e$. Using left endpoint Riemann sums with subintervals of length 1/2 we find that

\[ \int_1^2 \frac{1}{t}\,dt < \frac{1}{2}\Bigl[ 1 + \frac{2}{3} \Bigr] = 5/6 < 1. \]

A bit more work with right endpoint Riemann sums and subintervals of length 1/2 leads to

\[ \int_1^4 \frac{1}{t}\,dt > 1. \]


The value of log(x) at x = 2 is less than 5/6, and the value at 4 is bigger than 1. Since 1/t > 0, the area under the curve between t = 1 and t = x will increase with x. A plot of log(x) (Figure 2.9) should be a smooth graph crossing the horizontal line at height 1 somewhere between x = 2 and x = 4. That is, the base e of natural logarithms satisfies 2 < e < 4. (Of course your calculator, which has more sophisticated knowledge embedded inside, will report e = 2.71828 . . . .)

Having convinced ourselves that there is some number e satisfying

$$\int_1^{e} \frac{1}{t}\,dt = 1,$$

it is reasonable to ask if this number behaves as the base of the natural logarithm. The main concern is whether

$$\log(e^x) = x, \quad x \ge 0.$$

A calculation with (2.7) and a positive integer k yields

$$\log(e^k) = \int_1^{e^k} \frac{1}{t}\,dt = \int_1^{e} \frac{1}{t}\,dt + \int_e^{e^2} \frac{1}{t}\,dt + \cdots + \int_{e^{k-1}}^{e^k} \frac{1}{t}\,dt = k.$$

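Another suggestive calculation is

$$1 = \log(e) = \log([e^{1/k}]^k) = \int_1^{e^{1/k}} \frac{1}{t}\,dt + \int_{e^{1/k}}^{e^{2/k}} \frac{1}{t}\,dt + \cdots + \int_{e^{(k-1)/k}}^{e^{k/k}} \frac{1}{t}\,dt = k \int_1^{e^{1/k}} \frac{1}{t}\,dt.$$

This gives

$$\int_1^{e^{1/k}} \frac{1}{t}\,dt = 1/k.$$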

The evidence is pretty good that this natural logarithm behaves in the way expected for $\log_e(x)$. So far we haven’t considered log(x) for numbers 0 < x < 1. That development is left for the problem section.

Figure 2.9: Graph of log(x)

2.5 Stirling’s formula

There are many problems in counting and probability where one needs to understand the size of n!. The basic estimate was discovered in 1730 thanks to a collaboration between Abraham DeMoivre and James Stirling. The result is

$$n! \sim \sqrt{2\pi}\, e^{-n} n^{n+1/2} = \sqrt{2\pi n}\,\Big(\frac{n}{e}\Big)^n. \tag{2.8}$$

Here the symbol ∼ means that

$$\lim_{n\to\infty} \frac{\sqrt{2\pi}\, e^{-n} n^{n+1/2}}{n!} = 1.$$
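To see what the limit statement means in practice, the ratio in the definition of ∼ can be tabulated for a few values of n. This is only an illustrative sketch: the function name is made up, and the computation goes through logarithms (using the standard library's `math.lgamma`, for which lgamma(n + 1) = log(n!)) purely to avoid overflow.

```python
import math

def stirling_ratio(n):
    """Return sqrt(2*pi) * e**(-n) * n**(n + 1/2) / n!, computed via logarithms."""
    log_stirling = 0.5 * math.log(2 * math.pi) - n + (n + 0.5) * math.log(n)
    log_factorial = math.lgamma(n + 1)   # log(n!)
    return math.exp(log_stirling - log_factorial)

for n in (1, 10, 100, 1000):
    print(n, stirling_ratio(n))
```

The printed ratios stay below 1 and move toward 1 as n grows, consistent with (2.8).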

It is possible to use very elementary methods to get a slightly inferior result. The first idea is to consider log(n!) rather than n!. Since the logarithm of a product is the sum of the logarithms,

$$\log(n!) = \sum_{k=1}^{n} \log(k). \tag{2.9}$$

Figure 2.10: Riemann sum for $\int_1^x \log(t)\,dt$

The plan is to compare this sum to a convenient integral. Thinking of (2.9) as a left endpoint Riemann sum (Figure 2.10) leads to the inequality

$$\sum_{k=1}^{n} \log(k) \le \int_1^{n+1} \log(x)\,dx. \tag{2.10}$$

The inequality follows from the fact that log(x) is an increasing function, so the left endpoint Riemann sums are smaller than the corresponding integral.

Recognizing that log(1) = 0 allows us to rewrite (2.9) as

$$\log(n!) = \sum_{k=2}^{n} \log(k). \tag{2.11}$$

Interpreting (2.11) as a right endpoint Riemann sum shows that

$$\log(n!) = \sum_{k=2}^{n} \log(k) \ge \int_1^{n} \log(x)\,dx. \tag{2.12}$$


Recall that log(x) has an elementary antiderivative,

$$\int_1^{n} \log(x)\,dx = (x \log(x) - x)\Big|_1^{n} = n \log(n) - (n - 1).$$

Using this formula with the inequalities (2.10) and (2.12) leads to

$$n \log(n) - (n - 1) \le \log(n!) \le (n + 1) \log(n + 1) - n.$$

Exponentiating, and using the fact that $n \log(n) = \log(n^n)$, we find

$$n^n e^{1-n} \le n! \le (n + 1)^{n+1} e^{-n}. \tag{2.13}$$

A comparison with Stirling’s formula (2.8) shows that (2.13) is close to the desired form, but a bit more precision is needed. To achieve this additional precision the above ideas need some modifications.
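The logarithmic form of (2.13) is easy to test numerically. The sketch below, with hypothetical helper names, computes log(n!) as the sum (2.9) and prints it between the two bounds; the gap between the bounds gives a feel for how much precision is still missing.

```python
import math

def log_factorial(n):
    """log(n!) computed as the sum of log(k), as in (2.9)."""
    return sum(math.log(k) for k in range(1, n + 1))

def bounds_2_13(n):
    """Return (lower, log(n!), upper) for the logarithmic form of (2.13)."""
    lower = n * math.log(n) - (n - 1)
    upper = (n + 1) * math.log(n + 1) - n
    return lower, log_factorial(n), upper

for n in (1, 5, 10, 50):
    lower, middle, upper = bounds_2_13(n)
    print(n, lower, middle, upper, upper - lower)
```

The last column grows roughly like log(n), which is the imprecision the midpoint and trapezoidal estimates are meant to reduce.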

The first modification is to replace the left and right endpoint Riemann sums with a midpoint Riemann sum (Figure 2.11). The following observation about midpoint Riemann sums will be important. On each subinterval the midpoint approximation

$$\int_{x_i}^{x_{i+1}} f(x)\,dx \simeq f\Big(\frac{x_i + x_{i+1}}{2}\Big)[x_{i+1} - x_i]$$

gives the same area as if we used the tangent line to the graph of f at the midpoint (Figure 2.12),

$$f\Big(\frac{x_i + x_{i+1}}{2}\Big)[x_{i+1} - x_i] = \int_{x_i}^{x_{i+1}} \Big( f\Big(\frac{x_i + x_{i+1}}{2}\Big) + f'\Big(\frac{x_i + x_{i+1}}{2}\Big)\Big[x - \frac{x_i + x_{i+1}}{2}\Big] \Big)\,dx.$$

Since the function log(x) is concave down (the first derivative is decreasing), the tangent line at the midpoint lies above the graph of the function, and the midpoint approximation is greater than the integral (see problem 14).

Interpreting (2.9) as a midpoint Riemann sum also requires shifting the integral by 1/2. That is,

$$\sum_{k=1}^{n} \log(k) \ge \int_{1/2}^{n+1/2} \log(x)\,dx = (x \log(x) - x)\Big|_{1/2}^{n+1/2} = (n + 1/2) \log(n + 1/2) - \frac{1}{2}\log(1/2) - n. \tag{2.14}$$

Figure 2.11: Midpoint Riemann sum for $\int_1^x \log(t)\,dt$

Another estimate for (2.9) results from using the trapezoidal rule for integrals (Figure 2.13), which is just averaging of the left and right endpoint Riemann sums. This time the fact that log(x) is concave down means that the trapezoidal sums underestimate the integral (see problem 14). Thus

$$\frac{1}{2}\sum_{k=1}^{n} [\log(k) + \log(k + 1)] \le \int_1^{n+1} \log(x)\,dx,$$

or, rewriting the left hand side,

$$\sum_{k=1}^{n} \log(k) + \frac{1}{2}\log(n + 1) \le \int_1^{n+1} \log(x)\,dx.$$

This is the same as

$$\sum_{k=1}^{n} \log(k) \le \int_1^{n+1} \log(x)\,dx - \frac{1}{2}\log(n + 1) = (n + 1/2) \log(n + 1) - n. \tag{2.15}$$
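The claim that, for the concave function log(x), trapezoidal sums fall below the integral while midpoint sums lie above it can be observed directly. The sketch below uses illustrative helper names and an arbitrary interval [1, 6] with unit subintervals; the exact integral comes from the antiderivative x log(x) − x used in the text.

```python
import math

def midpoint_sum(f, a, b, n):
    """Midpoint Riemann sum for f on [a, b] with n equal subintervals."""
    h = (b - a) / n
    return h * sum(f(a + (i + 0.5) * h) for i in range(n))

def trapezoid_sum(f, a, b, n):
    """Trapezoidal rule: the average of the left and right endpoint sums."""
    h = (b - a) / n
    interior = sum(f(a + i * h) for i in range(1, n))
    return h * (0.5 * f(a) + interior + 0.5 * f(b))

def exact_log_integral(a, b):
    """Integral of log(x) from a to b via the antiderivative x*log(x) - x."""
    antiderivative = lambda x: x * math.log(x) - x
    return antiderivative(b) - antiderivative(a)

a, b, n = 1.0, 6.0, 5
trap = trapezoid_sum(math.log, a, b, n)
exact = exact_log_integral(a, b)
mid = midpoint_sum(math.log, a, b, n)
print(trap, exact, mid)
```

The three printed numbers appear in increasing order, and both approximations are far closer to the integral than the endpoint sums would be.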

Figure 2.12: Midpoint tangent sum for $\int_1^x \log(t)\,dt$

Together the estimates (2.14) and (2.15) are

$$(n + 1/2) \log(n + 1/2) - \frac{1}{2}\log(1/2) - n \le \sum_{k=1}^{n} \log(k) \le (n + 1/2) \log(n + 1) - n. \tag{2.16}$$

Exponentiation gives

$$\sqrt{2}\,(n + 1/2)^{n+1/2} e^{-n} \le n! \le (n + 1)^{n+1/2} e^{-n},$$

which may also be written as

$$\sqrt{2}\, n^{n+1/2}\Big(1 + \frac{1}{2n}\Big)^{n+1/2} e^{-n} \le n! \le n^{n+1/2}\Big(1 + \frac{1}{n}\Big)^{n+1/2} e^{-n}. \tag{2.17}$$

The reader may recall from calculus that

$$\lim_{n\to\infty}\Big(1 + \frac{1}{n}\Big)^n = e, \qquad \lim_{n\to\infty}\Big(1 + \frac{1}{2n}\Big)^n = e^{1/2},$$

which suggests that our expressions can be simplified. That simplification is our next order of business.


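Notice that for x ≥ 0

$$\log(1 + x) \le x.$$

This follows from log(1) = 0 and

$$\frac{d}{dx}\log(1 + x) = \frac{1}{1 + x}$$

so that

$$\frac{d}{dx}\log(1 + x) \le 1 = \frac{d}{dx}x, \quad x \ge 0.$$

A simple logarithmic calculation now gives

$$(n + 1/2) \log\Big(1 + \frac{1}{n}\Big) \le (n + 1/2)\frac{1}{n} = 1 + \frac{1}{2n},$$

or

$$\Big(1 + \frac{1}{n}\Big)^{n+1/2} \le e^{1+1/(2n)}. \tag{2.18}$$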

Figure 2.13: Trapezoidal sum for $\int_1^x \log(t)\,dt$

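Similarly, the calculation

$$\log(1 + x) = \int_1^{1+x} \frac{1}{t}\,dt \ge \frac{x}{1 + x}, \quad x \ge 0$$

gives

$$(n + 1/2) \log\Big(1 + \frac{1}{2n}\Big) \ge \frac{2n + 1}{2}\cdot\frac{1}{2n}\cdot\frac{2n}{2n + 1} = 1/2,$$

or

$$e^{1/2} \le \Big(1 + \frac{1}{2n}\Big)^{n+1/2}. \tag{2.19}$$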

Using (2.18) and (2.19) in (2.17) produces the inequality

$$\sqrt{2e}\, n^{n+1/2} e^{-n} \le n! \le e^{1+(2n)^{-1}} n^{n+1/2} e^{-n}. \tag{2.20}$$
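Inequality (2.20) can be checked directly for moderate n. The sketch below is illustrative only: the function name is hypothetical, and plain floating point is used, so n should stay small enough (here at most 100) that $n^{n+1/2}$ does not overflow.

```python
import math

def stirling_bounds(n):
    """Return (lower, n!, upper) from inequality (2.20)."""
    core = n ** (n + 0.5) * math.exp(-n)          # n^(n+1/2) e^(-n)
    lower = math.sqrt(2 * math.e) * core
    upper = math.exp(1 + 1 / (2 * n)) * core
    return lower, math.factorial(n), upper

for n in (1, 5, 10, 20):
    lower, exact, upper = stirling_bounds(n)
    print(n, lower, exact, upper)
```

In each row the exact factorial sits between the two bounds, and the ratio of upper to lower bound stays close to $e/\sqrt{2e}$ for large n.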

Finally, consider the estimation error

$$E_n = \sum_{k=1}^{n} \log(k) - \Big[(n + 1/2) \log(n + 1/2) - \frac{1}{2}\log(1/2) - n\Big]. \tag{2.21}$$

The analysis of midpoint Riemann sums which led to (2.14) shows that this is an increasing sequence. It’s not hard to check that the sequence is bounded, implying the existence of a constant C such that

$$\lim_{n\to\infty} \frac{n!}{n^{n+1/2} e^{-n}} = C.$$

The actual value of C does not emerge from this technique, although we have good bounds. Our uncertainty regarding the actual constant $C = \sqrt{2\pi}$ is expressed by the inequalities

$$\sqrt{2e} \le C \le e.$$
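The behavior of the error sequence and of the constant C can be explored numerically. The following sketch uses hypothetical function names; `math.lgamma(n + 1)` supplies log(n!) so that large n can be used when estimating C. It is an experiment, not a substitute for the monotonicity and boundedness arguments requested in problem 15.

```python
import math

def estimation_error(n):
    """E_n from (2.21): log(n!) minus the midpoint lower bound of (2.14)."""
    log_factorial = sum(math.log(k) for k in range(1, n + 1))
    bound = (n + 0.5) * math.log(n + 0.5) - 0.5 * math.log(0.5) - n
    return log_factorial - bound

def c_estimate(n):
    """n! / (n^(n+1/2) e^(-n)), computed through logarithms."""
    return math.exp(math.lgamma(n + 1) - (n + 0.5) * math.log(n) + n)

errors = [estimation_error(n) for n in range(1, 201)]
print(errors[0], errors[9], errors[-1])
print(math.sqrt(2 * math.e), c_estimate(10**6), math.e, math.sqrt(2 * math.pi))
```

The errors increase slowly and level off, while the estimate of C lands between the two bounds and very near $\sqrt{2\pi}$.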


2.6 Problems

1. Following the example of $f(x) = x^2$ in the text, use left and right endpoint Riemann sums and the formula

$$\sum_{k=0}^{n} k^3 = n^2(n+1)^2/4$$

to show that

$$\int_0^{b} x^3\,dx = b^4/4.$$

2. We’ve shown geometrically that

$$\int_0^{b} x^m\,dx = \frac{b^{m+1}}{m + 1}, \quad b \ge 0.$$

Extend this result geometrically (not quoting a calculus result) to show that

$$\int_a^{b} x^m\,dx = \frac{b^{m+1}}{m + 1} - \frac{a^{m+1}}{m + 1}, \quad a \le b.$$

Start with 0 ≤ a ≤ b and then consider the case a ≤ 0 ≤ b.

3. Denote by $L_n$ and $R_n$ respectively the left and right endpoint Riemann sums for the integral

$$\int_a^{b} f(x)\,dx.$$

Assume that the interval [a, b] is divided into n subintervals of equal length. If the function f(x) is decreasing, then

$$R_n \le \int_a^{b} f(x)\,dx \le L_n.$$

(a) Determine which rectangular areas appear in both left and right endpoint sums, and use this observation to show that

$$L_n - R_n = \frac{b - a}{n}[f(a) - f(b)].$$

(b) Now show that

$$\Big|\int_a^{b} f(x)\,dx - L_n\Big| \le \frac{b - a}{n}[f(a) - f(b)],$$

and similarly for $R_n$.


(c) Show that either

$$\Big|\int_a^{b} f(x)\,dx - L_n\Big| \ge \frac{b - a}{2n}[f(a) - f(b)],$$

or

$$\Big|\int_a^{b} f(x)\,dx - R_n\Big| \ge \frac{b - a}{2n}[f(a) - f(b)].$$

(d) If you use left or right endpoint Riemann sums to compute

$$\log(10) = \int_1^{10} \frac{1}{x}\,dx,$$

how big should n be to ensure that the error in your computation is less than $10^{-6}$?

4. In the notation of (2.4), compute c6 and c7. Use (2.3) and (2.2)to obtain upper and lower bounds for π. (Use a calculator.)

5. By constructing a regular polygon with $2^n$ sides which circumscribes the unit circle of radius 1, show that

$$\pi < 2^{n-1} \tan\Big(\frac{2\pi}{2^n}\Big).$$

Combining this estimate and the previous estimate from the text then gives

$$2^{n-1} \sin\Big(\frac{2\pi}{2^n}\Big) < \pi < 2^{n-1} \tan\Big(\frac{2\pi}{2^n}\Big).$$

6. Use Figure 2.14 to help establish the following identities. Obtain the half-angle formulas used in computing approximate values of π.

(a) Find the length w in two different ways to establish cos(B) = cos(A + B) cos(A) + sin(A + B) sin(A).

(b) Show that

$$u = \cos(A + B) \tan(A), \qquad v = \frac{x \sin(B)}{\cos(A + B)} = \frac{\sin(B)}{\cos(A)}.$$

(c) Use u + v = sin(A + B) and the result of part a) to get cos(A + B) = cos(A) cos(B) − sin(A) sin(B).

Figure 2.14: Addition formula for cos(A + B)

7. Consider two curves in the upper half plane. The first is the semicircle $r = r_0$ for 0 ≤ θ ≤ π. The second satisfies

$$r(\theta) = r_0\,\frac{2\pi - \theta}{\pi}$$

for 0 ≤ θ ≤ π.

(a) Divide the outer curve into n equal angle subarcs with endpoints at angles $\theta_k$ so that $\theta_{k+1} - \theta_k = \pi/n$. Find expressions for $\theta_k$ and $r(\theta_k)$.

(b) Consider the area $A_k$ of the k-th angular sector bounded by the outer curve and lines from the origin at angles $\theta_k$ and $\theta_{k+1}$. Compare these areas to the areas of sectors of disks with radii $r(\theta_k)$ and $r(\theta_{k+1})$.

(c) Using a Riemann sum like calculation, calculate the area of the region between the two curves for 0 ≤ θ ≤ π. (This type of result was known to Archimedes (287-212 B.C.) [9, pp. 114–115].)

(d) Replace the formula $r(\theta) = r_0[2\pi - \theta]$ by others for which the analogous calculation can be made.

8. What modifications to the argument in the text are needed to show

$$\log(xy) = \log(x) + \log(y), \quad 0 < x, y < 1?$$

9. Show that if p/q is a positive rational number, then

$$\log(e^{p/q}) = p/q.$$

10. Estimate the sum

$$\sum_{k=1}^{n} \frac{1}{k^{1/3}} \tag{2.22}$$

by the following process.

(a) Sketch the graph of $x^{-1/3}$ for 1 ≤ x ≤ ∞. Include in your sketch rectangular boxes with base [k, k + 1] on the real axis, and height $k^{-1/3}$.

(b) Argue that

$$\sum_{k=1}^{n} \frac{1}{k^{1/3}} \ge \int_1^{n+1} x^{-1/3}\,dx.$$

(c) By a similar argument obtain an estimate of the form

$$\sum_{k=2}^{n} \frac{1}{k^{1/3}} \le \int_1^{n} x^{-1/3}\,dx.$$

(d) Evaluate the integrals and obtain upper and lower bounds for the sum (2.22).

(e) Is the sum (2.22) bounded as n → ∞? Give reasons.

11. Use the treatment of Stirling’s formula as a guide.

(a) Show that

$$\int_1^{n+1} \frac{1}{x}\,dx < \sum_{k=1}^{n} \frac{1}{k} < 1 + \int_1^{n} \frac{1}{x}\,dx,$$

or equivalently,

$$\log(n + 1) < \sum_{k=1}^{n} \frac{1}{k} < 1 + \log(n).$$

(b) Observe that

$$\log(k + 1) - \log(k) = \int_k^{k+1} \frac{1}{x}\,dx < \frac{1}{k},$$

and conclude that the function

$$f(n) = \Big(\sum_{k=1}^{n} \frac{1}{k}\Big) - \log(n + 1), \quad n \ge 1$$


increases with n.

(c) Show that f(n) ≤ 1 for all n ≥ 1. The number

$$\gamma = \lim_{n\to\infty}\Big(\sum_{k=1}^{n} \frac{1}{k} - \log(n + 1)\Big)$$

is called the Euler constant. The actual value is approximately γ = .577 . . . .

(d) Use midpoint and trapezoidal sums to show that

$$.5 \le \gamma \le \log(2) = .69\ldots.$$

12. Based on our treatment of Stirling’s formula

$$\sqrt{2e}\, n^{n+1/2} e^{-n} \le n! \le e^{2} n^{n+1/2} e^{-n}, \quad n \ge 1.$$

Use these inequalities to obtain good upper and lower bounds for $\binom{n}{k}$.

13. For fixed n we may consider $\binom{n}{k}$ as a function of k, with 0 ≤ k ≤ n. Problem 13 of chapter 1 shows that this function increases until k reaches the middle of its range. Use Stirling’s formula to estimate the size of $\binom{n}{n/2}$ when n is even and $\binom{n}{(n-1)/2}$ when n is odd.

14. Suppose that f′(x) is decreasing (f is concave down) for a ≤ x ≤ c. Suppose that a ≤ b ≤ c, and let y(x) be defined by the tangent line to f at b,

$$y(x) = f(b) + f'(b)(x - b).$$

(a) The Mean Value Theorem says that

$$f(x) - f(b) = f'(d)(x - b)$$

for some d between b and x. Use this to show that

$$y(x) \ge f(x), \quad a \le x \le c.$$

(b) Obtain the conclusion of part a) by using

$$f(x) - f(b) = \int_b^{x} f'(t)\,dt$$

instead of the Mean Value Theorem.

(c) Suppose that f′(x) is strictly decreasing. Let

$$m = \frac{f(c) - f(a)}{c - a}.$$

Show that the line

$$y(x) = f(a) + m(x - a)$$

satisfies y(x) ≤ f(x) for a ≤ x ≤ c. (Hint: The function g(x) = f(x) − y(x) vanishes at x = a and x = c. Use the Mean Value Theorem or Rolle’s Theorem to show that there are no other roots of g.)

15. Show that the sequence $E_n$ in (2.21) is increasing and bounded. (Hint: Consider (2.16).)

16. Consider extending the geometric calculation illustrated in Figure 2.8.

(a) Suppose that the numbers a, b, m are all positive, and a < b. Show that

$$m^{r+1} \int_a^{b} t^r\,dt = \int_{ma}^{mb} t^r\,dt.$$

Verify the geometric argument by explicitly calculating the integrals.

(b) Show that the function

$$g(x) = \int_1^{x} t^r\,dt, \quad x \ge 1$$

satisfies

$$g(xy) = g(x) + x^{r+1} g(y), \quad x, y \ge 1.$$


Chapter 3

Limits and Taylor’s Theorem

3.1 Introduction

In our discussion of the computation of π, the area of the circle with radius 1, we constructed a sequence of polygons $P_n$ with areas $A_n$. Each area $A_n$ was smaller than π, and the sequence of areas increases with n,

$$A_1 < A_2 < A_3 < \cdots < \pi.$$

The numbers $A_n$ are also good approximations of π in the following sense. For any positive number ε, no matter how small, the difference $|\pi - A_n|$ is smaller than ε for n sufficiently large. It seems to make sense to say that the numbers $A_n$ approach π as n increases, or in the more common terminology, π is the limit of the sequence $A_n$.

The idea of using infinite sequences to represent numbers is intrinsic in the current notion that real numbers can be expressed as possibly infinite decimals. Consultation with our calculator indicates that π = 3.14159 . . . or $\sqrt{2}$ = 1.414 . . . . These decimal representations require an infinite list of digits to achieve actual equality. The decimal expansion for π indicates that π is within $10^{-2}$ of 3.14, and is no more than $10^{-4}$ from 3.1415, etc.

A similar situation arises in the summation of the geometric series. For any number x ≠ 1,

$$1 + x + x^2 + \cdots + x^{n-1} = \sum_{k=0}^{n-1} x^k = \frac{1 - x^n}{1 - x} = \frac{1}{1 - x} - \frac{x^n}{1 - x}.$$

If |x| < 1 then the numbers $x^n$ shrink to 0 as n gets larger. Thus it is tempting to say

$$1 + x + x^2 + \cdots = \sum_{k=0}^{\infty} x^k = \frac{1}{1 - x}, \quad |x| < 1.$$


Taking x = 1/2, for instance, gives

$$2 = 1 + \frac{1}{2} + \frac{1}{4} + \frac{1}{8} + \ldots.$$
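For x = 1/2 the finite geometric sum formula and the approach to 2 can be verified with exact fractions. This is a small sketch with illustrative function names; the gap 2 − s_n printed in the last column should halve each time n increases by one.

```python
from fractions import Fraction

def geometric_partial_sum(x, n):
    """Return 1 + x + x**2 + ... + x**(n-1) by direct summation."""
    return sum(x ** k for k in range(n))

def closed_form(x, n):
    """Return (1 - x**n) / (1 - x), valid for x != 1."""
    return (1 - x ** n) / (1 - x)

x = Fraction(1, 2)
for n in (1, 2, 5, 10, 20):
    s = geometric_partial_sum(x, n)
    assert s == closed_form(x, n)    # the finite sum formula holds exactly
    print(n, s, float(2 - s))
```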

Infinite sequences, and particularly infinite series, are convenient ways to represent numbers and functions. But the use of infinite processes such as infinite sums requires extra diligence. For example, the geometric series formula with x = −1 suggests that

$$1 + (-1) + 1 + (-1) + \cdots = \sum_{k=0}^{\infty} (-1)^k = \frac{1}{2}.$$

Even more improbably, if x = 2 the formula states that

$$1 + 2 + 4 + \cdots = \sum_{k=0}^{\infty} 2^k = \frac{1}{-1} = -1.$$

These results are certainly false in the sense that the finite sums

$$\sum_{k=0}^{n-1} (-1)^k = \frac{1}{2} - \frac{(-1)^n}{2},$$

and

$$\sum_{k=0}^{n-1} 2^k = \frac{1}{-1} - \frac{2^n}{-1},$$

are not approaching the numbers 1/2 or −1 respectively.

The pitfalls associated with the use of infinite processes led the ancient Greeks to avoid them [2, p. 13–14], [9, p. 176]. As Calculus was developed [9, p. 436–467] the many successful calculations achieved with infinite processes, including infinite series, apparently reduced the influence of the cautious critics. A reconciliation between the success of cavalier calculations and the careful treatment of foundational issues was not achieved until the nineteenth century.

3.2 Limits of infinite sequences

3.2.1 Basic ideas

The first idea to single out is the infinite sequence. Intuitively, an infinite sequence is simply an infinite list of numbers. The k-th term of the sequence is denoted $c_k$ or c(k). Examples include the sequences

$$1, \frac{1}{2}, \frac{1}{3}, \frac{1}{4}, \ldots \qquad c_k = \frac{1}{k}, \quad k = 1, 2, 3, \ldots,$$

$$1, -1, 1, -1, \ldots, \qquad c_k = (-1)^k, \quad k = 0, 1, 2, \ldots,$$

and

$$3, 3.1, 3.14, 3.141, 3.1415, \ldots,$$

where $c_k$ is the part of the decimal expansion of π consisting of the first k digits.

In the usual mathematical language, an infinite sequence, or simply a sequence, is a function c whose domain is the set N of positive integers 1, 2, 3, . . . . The value c(k) of the function at k is called the k-th term of the sequence. For our purposes the values c(k) will typically be real numbers, although the idea extends to complex numbers and beyond. A slight extension of the idea allows the domain to be the set of nonnegative integers.

Although a sequence is a function, it is common to use a special notation for sequences. As noted above, the terms are often written $c_k$ instead of c(k). The sequence itself is denoted $\{c_k\}$. As an abbreviation, people often write ‘the sequence $c_k$’, instead of ‘the sequence $\{c_k\}$’, although this can create some confusion between the entire sequence and its k-th term.

The notion of a limit is the most important idea connected with sequences. Say that the sequence $\{c_k\}$ has the limit L, which is written

$$\lim_{k\to\infty} c_k = L,$$

if for any ε > 0 there is an integer N such that $|c_k - L| < \varepsilon$ whenever k ≥ N. To emphasize the dependence of N on ε we may write $N_\varepsilon$ or N(ε). An equivalent statement is that the sequence $\{c_k\}$ converges to L. This definition has a graphical interpretation which illustrates the utility of the function interpretation of a sequence. The statement that the sequence has the limit L is the same as saying that the graph of the function c(k) has a horizontal asymptote y = L, as shown in Figure 3.1 (where L = 2).

Figure 3.1: Limit of a function.

A substantial amount of both theoretical and applied mathematics is concerned with showing that sequences converge. To pick one elementary example, suppose that in the course of doing statistical work you need to evaluate an integral of the form

$$I = \int_a^{b} e^{-x^2}\,dx.$$

As you may recall from calculus, the function $e^{-x^2}$ does not have an elementary antiderivative. To obtain numerical values for the integral one would run to the computer and use a numerical integration scheme. One could use Riemann sums (admittedly a naive and rather inefficient approach), divide the interval [a, b] into k equal subintervals, and estimate the true value of I by a sequence $I_k$ of approximations obtained by summing the areas of rectangles. In this case we may interpret the number ε in the definition of limit as describing the error in approximating the integral I by the Riemann sum $I_k$. The statement that $\lim_{k\to\infty} I_k = I$ simply means that the approximation can be made as accurate as desired if k is chosen sufficiently large. In a practical application some explicit knowledge of the connection between the size of k and the accuracy ε of the approximation may also be required.

A few concrete examples will help us to get comfortable with the ideas. Suppose our goal is to show that

$$\lim_{k\to\infty} 3 + \frac{1}{k} = 3.$$

Pick any ε > 0, and ask how large N should be so that all of the numbers $c_k = 3 + 1/k$ will be within ε of 3 as long as k ≥ N. To ensure

$$|c_k - L| = |(3 + 1/k) - 3| = 1/k < \varepsilon$$

it suffices to take

$$k > \frac{1}{\varepsilon}.$$

To make a concrete choice, let N be the smallest integer at least as big as 1/ε + 1, that is

$$N = \lceil 1/\varepsilon + 1 \rceil.$$

As a second example, consider the sequence of numbers

$$s_k = \sum_{j=0}^{k-1} 2^{-j}.$$

This sequence comes from the geometric series and, as previously noted,

$$s_k = \frac{1}{1 - 1/2} - \frac{2^{-k}}{1 - 1/2} = 2 - 2^{1-k}.$$

It doesn’t strain the imagination to see that the limit is 2. However we can again ask how large k must be before $s_k$ stays within ε of 2. The desired inequality is

$$|s_k - L| = |(2 - 2^{1-k}) - 2| = 2^{1-k} < \varepsilon.$$

This is the same as

$$1 - k < \log_2(\varepsilon),$$

or

$$k > 1 - \log_2(\varepsilon).$$

In this case N may be chosen to be the smallest integer at least as big as $2 - \log_2(\varepsilon)$, that is

$$N = \lceil 2 - \log_2(\varepsilon) \rceil.$$

In many cases it is impractical or impossible to find a simple expression for the smallest possible N. Suppose $c_k = 1 + \sin(k^2 + e^k)/k$, which satisfies $L = \lim_{k\to\infty} c_k = 1$. Rather than struggling with the precise value of $c_k$, it is convenient to simply note that

$$|c_k - L| = |\sin(k^2 + e^k)/k| \le 1/k.$$

As in the first example, it suffices to pick

$$N = \lceil 1/\varepsilon + 1 \rceil.$$

Then if k ≥ N it follows that

$$|c_k - 1| < \varepsilon.$$
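The choices of N in the three examples can be spot-checked by computer. The sketch below uses hypothetical names and only tests finitely many indices k ≥ N, so it illustrates the definition rather than proving anything. The range of k is kept below about 700 because `math.exp(k)` overflows for larger arguments.

```python
import math

def n_first(eps):
    """N = ceil(1/eps + 1), used for c_k = 3 + 1/k and for 1 + sin(k^2 + e^k)/k."""
    return math.ceil(1 / eps + 1)

def n_second(eps):
    """N = ceil(2 - log2(eps)), used for s_k = 2 - 2**(1 - k)."""
    return math.ceil(2 - math.log2(eps))

def within(seq, limit, eps, start, count=100):
    """Check |seq(k) - limit| < eps for k = start, ..., start + count - 1."""
    return all(abs(seq(k) - limit) < eps for k in range(start, start + count))

first = lambda k: 3 + 1 / k
second = lambda k: 2 - 2.0 ** (1 - k)
third = lambda k: 1 + math.sin(k * k + math.exp(k)) / k

for eps in (0.1, 0.01):
    print(eps, n_first(eps), n_second(eps),
          within(first, 3, eps, n_first(eps)),
          within(second, 2, eps, n_second(eps)),
          within(third, 1, eps, n_first(eps)))
```

Each row reports the two values of N followed by three checks, all of which should come out True.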

3.2.2 Properties of limits

It is time to turn from the consideration of specific examples to general properties of sequences and limits. The first point to make is that a sequence can have at most one limit. In addition to establishing the result itself, the proof introduces three common techniques. The first is the judicious addition of 0. The second is the use of the triangle inequality

$$|a + b| \le |a| + |b|,$$

which is easily verified for real numbers a and b (see problem 2). The third is the observation that if a number r is nonnegative, but r < ε for every positive number ε, then r = 0.

Theorem 3.2.1. If

$$\lim_{k\to\infty} c_k = L_1 \quad \text{and} \quad \lim_{k\to\infty} c_k = L_2,$$

then $L_1 = L_2$.

Proof. Notice that by the triangle inequality

$$|L_1 - L_2| = |(L_1 - c_k) + (c_k - L_2)| \le |L_1 - c_k| + |c_k - L_2|.$$

Let ε be any positive number, and take $\varepsilon_1 = \varepsilon_2 = \varepsilon/2$. Since

$$\lim_{k\to\infty} c_k = L_1 \quad \text{and} \quad \lim_{k\to\infty} c_k = L_2,$$

there are numbers $N_1$ and $N_2$ such that $|L_1 - c_k| < \varepsilon_1$ for all $k \ge N_1$, and $|L_2 - c_k| < \varepsilon_2$ for all $k \ge N_2$.

Now pick k such that $k \ge N_1$ and $k \ge N_2$. Then

$$|L_1 - L_2| \le |L_1 - c_k| + |c_k - L_2| < \varepsilon_1 + \varepsilon_2 = \varepsilon.$$

Since the inequality $|L_1 - L_2| < \varepsilon$ holds for any positive number ε, it must be the case that $|L_1 - L_2| = 0$, or $L_1 = L_2$.

A sequence $\{c_k\}$ of real numbers is bounded if there is a number M such that $|c_k| \le M$ for all positive integers k.

Lemma 3.2.2. If

$$\lim_{k\to\infty} c_k = L,$$

then the sequence $c_k$ is bounded.

Proof. Choose ε = 1. According to the definition, there is an integer N such that

$$|c_k - L| < 1$$

provided that k ≥ N. This means that for k ≥ N

$$|c_k| < |L| + 1.$$

If

$$M = \max(|L| + 1, |c_1|, \ldots, |c_N|)$$

then $|c_k| \le M$ for all indices k.

Some sequences which are not bounded still have simple behavior. Say that $\lim_{k\to\infty} c_k = \infty$, or equivalently $c_k \to \infty$, if for any M > 0 there is an integer N such that $c_k > M$ for all k ≥ N. By the previous lemma a sequence satisfying $\lim_{k\to\infty} c_k = \infty$ cannot converge.

The next result describes the interplay between limits and arithmetic operations.

Theorem 3.2.3. Suppose that

$$\lim_{k\to\infty} a_k = L, \qquad \lim_{k\to\infty} b_k = M,$$

and C is a real number. Then

$$\lim_{k\to\infty} C a_k = CL, \tag{i}$$

$$\lim_{k\to\infty} a_k + b_k = L + M, \tag{ii}$$

$$\lim_{k\to\infty} a_k b_k = LM, \tag{iii}$$

and

$$\lim_{k\to\infty} a_k/b_k = L/M, \quad M \ne 0. \tag{iv}$$

Statement (iv) deserves a comment. It is possible to have M ≠ 0, but to have many $b_k$ which are 0, in which case some of the terms $a_k/b_k$ are not defined. This problem can be handled by adding the hypothesis that $b_k \ne 0$ for all k. Another option is to notice, as we will in the proof, that if M ≠ 0 then $b_k \ne 0$ for k sufficiently large. Thus $a_k/b_k$ is defined for k large enough, which is all that is really required to make sense of limits. The reader is free to choose either point of view.

Before launching into the proof, this may be a good time to recognize that writing a formal proof is usually preceded by some preliminary analysis. Let’s start with statement (i). To understand what needs to be done it helps to work backwards. Our goal is to conclude that

$$|C a_k - CL| < \varepsilon$$

for k large enough based on the fact that

$$|a_k - L| < \varepsilon_1$$

for k large enough. If |C| ≤ 1 there is no challenge, since we can take $\varepsilon_1 = \varepsilon$ and get

$$|C a_k - CL| = |C|\,|a_k - L| \le |a_k - L| < \varepsilon.$$

If |C| > 1, then take $\varepsilon_1 = \varepsilon/|C|$. As soon as k is large enough that

$$|a_k - L| < \varepsilon_1 = \varepsilon/|C|$$

it follows that

$$|C a_k - CL| = |C|\,|a_k - L| < |C|\varepsilon_1 = \varepsilon.$$

Now let’s examine the formal proof.

Proof. (i) Take any ε > 0. From the definition of

$$\lim_{k\to\infty} a_k = L$$

there is an $N_\varepsilon$ such that

$$|a_k - L| < \varepsilon$$

whenever $k \ge N_\varepsilon$. Consider two cases: |C| ≤ 1, and |C| > 1.

Suppose first that |C| ≤ 1. Then whenever $k \ge N_\varepsilon$ we have the desired inequality

$$|C a_k - CL| = |C|\,|a_k - L| \le |a_k - L| < \varepsilon.$$

Next suppose that |C| > 1. Let $\varepsilon_1 = \varepsilon/|C|$. Since

$$\lim_{k\to\infty} a_k = L$$

there is an $N_{\varepsilon_1}$ such that $k \ge N_{\varepsilon_1}$ implies

$$|a_k - L| < \varepsilon_1.$$

But this means that

$$|C a_k - CL| = |C|\,|a_k - L| < |C|\varepsilon_1 = |C|\varepsilon/|C| = \varepsilon.$$

Next consider the proof of (ii). If we start by assuming that k is so large that $|a_k - L| < \varepsilon$ and $|b_k - M| < \varepsilon$, then the triangle inequality gives

$$|(a_k + b_k) - (L + M)| \le |a_k - L| + |b_k - M| < 2\varepsilon.$$

On one hand this looks bad, since our goal is to show that $|(a_k + b_k) - (L + M)| < \varepsilon$, not 2ε. On the other hand the situation looks good since the value of $|(a_k + b_k) - (L + M)|$ can be made as small as we like by making k large enough. This issue can be resolved by the trick of splitting ε in two (see problem 7 for an alternative).

Proof. (ii) Take any ε > 0. Now take $\varepsilon_1 = \varepsilon/3$ and $\varepsilon_2 = 2\varepsilon/3$. From the limit definitions there are $N_1$ and $N_2$ such that if $k \ge N_1$ then

$$|a_k - L| < \varepsilon_1,$$

and if $k \ge N_2$ then

$$|b_k - M| < \varepsilon_2.$$

Take $N = \max(N_1, N_2)$. If k ≥ N, then

$$|(a_k + b_k) - (L + M)| \le |a_k - L| + |b_k - M| < \varepsilon_1 + \varepsilon_2 = \varepsilon/3 + 2\varepsilon/3 = \varepsilon.$$

Of course the choice of $\varepsilon_1 = \varepsilon/3$ and $\varepsilon_2 = 2\varepsilon/3$ was largely arbitrary. The choice $\varepsilon_1 = \varepsilon_2 = \varepsilon/2$ was certainly available.

To prove (iii), begin with the algebraic manipulation

$$|a_k b_k - LM| = |a_k[M + (b_k - M)] - LM| \le |(a_k - L)M| + |a_k(b_k - M)|.$$

The plan is to show that $|(a_k - L)M|$ and $|a_k(b_k - M)|$ can be made small.

Proof. (iii) Take any ε > 0, and let $\varepsilon_1 = \varepsilon/2$. Replacing C by M in part (i), there is an $N_1$ such that $k \ge N_1$ implies

$$|(a_k - L)M| = |M a_k - ML| < \varepsilon_1.$$

By Lemma 3.2.2 there is a constant $C_1$ such that $|a_k| \le C_1$. Thus

$$|a_k(b_k - M)| \le |C_1(b_k - M)| = |C_1 b_k - C_1 M|.$$

Again using part (i), there is an $N_2$ such that $k \ge N_2$ implies

$$|C_1 b_k - C_1 M| < \varepsilon_1.$$

Take $N = \max(N_1, N_2)$ so that for k ≥ N,

$$|a_k b_k - LM| = |a_k[M + (b_k - M)] - LM| \le |(a_k - L)M| + |a_k(b_k - M)| < \varepsilon_1 + \varepsilon_1 = \varepsilon.$$

The key to (iv) is also an algebraic manipulation.

$$\Big|\frac{a_k}{b_k} - \frac{L}{M}\Big| = \Big|\frac{a_k M - b_k L}{b_k M}\Big| = \Big|\frac{a_k(M - b_k) - b_k(L - a_k)}{b_k M}\Big|.$$

The other important observation is that for k large enough,

$$|b_k| \ge |M|/2.$$

Notice in the proof below how a mysterious factor $M^2$ enters when $N_2$ and $N_3$ are introduced. Its purpose is to arrange for a convenient cancellation at the end of the proof. This is another place where the idea of problem 7 could be used.

Proof. (iv) Suppose M ≠ 0. Notice that if we choose $\varepsilon_1 = |M|/2$ then there is an $N_1$ such that for $k \ge N_1$,

$$|b_k - M| < |M|/2,$$

or

$$|b_k| \ge |M|/2.$$

Now take any ε > 0. By parts (i) and (iii) of this theorem the sequence $a_k(M - b_k)$ has limit 0. Replacing the usual ε of the definition of limit with $M^2\varepsilon/4$, this means that there is an $N_2$ such that

$$|a_k(M - b_k)| < \frac{M^2\varepsilon}{4}, \quad k \ge N_2.$$

By analogous reasoning there is an $N_3$ such that

$$|b_k(L - a_k)| < \frac{M^2\varepsilon}{4}, \quad k \ge N_3.$$

Thus if $N = \max(N_1, N_2, N_3)$ and k ≥ N, it follows that

$$\Big|\frac{a_k}{b_k} - \frac{L}{M}\Big| = \Big|\frac{a_k(M - b_k) - b_k(L - a_k)}{b_k M}\Big| \le \frac{|a_k(M - b_k)| + |b_k(L - a_k)|}{|b_k M|} < \frac{M^2\varepsilon/4 + M^2\varepsilon/4}{M^2/2} = \varepsilon.$$

3.3 Series representations

Having provided an introduction to sequences, we now apply these ideas to the problem of making sense of infinite sums, and the representation of functions by means of power series.

Consider the formal sum

$$\sum_{k=0}^{\infty} a_k.$$

Depending on the values of the numbers $a_k$ such a sum may or may not make sense. For instance, our previous work with the geometric series suggests that

$$\sum_{k=0}^{\infty} 2^{-k} = 2,$$

but the sum

$$1 + 1 + 1 + \ldots$$

is unlikely to represent any real number.

The key to analyzing infinite series is to convert the problem to one involving sequences. The finite sum

$$s_n = \sum_{k=0}^{n-1} a_k, \quad n = 1, 2, 3, \ldots,$$

is called the n-th partial sum of the series

$$\sum_{k=0}^{\infty} a_k.$$

In many cases it is convenient to start the sequence $\{a_k\}$ at index k = 1, in which case the n-th partial sum is

$$s_n = \sum_{k=1}^{n} a_k.$$

The series $\sum_{k=0}^{\infty} a_k$ is said to converge if there is a number S such that

$$S = \lim_{n\to\infty} s_n.$$

In this case S is said to be the sum of the series $\sum a_k$.

Here is a simple example of a convergent series. Let

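$$a_k = \frac{1}{k(k + 1)}, \quad k = 1, 2, 3, \ldots.$$

It helps to observe that

$$a_k = \frac{1}{k(k + 1)} = \frac{1}{k} - \frac{1}{k + 1}.$$

The partial sums telescope, giving

$$s_n = \sum_{k=1}^{n} a_k = \sum_{k=1}^{n} \Big(\frac{1}{k} - \frac{1}{k + 1}\Big) = (1 - 1/2) + (1/2 - 1/3) + (1/3 - 1/4) + \cdots + (1/n - 1/(n + 1)) = 1 - \frac{1}{n + 1}.$$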

Clearly

$$\lim_{n\to\infty} s_n = 1,$$

so that

$$1 = \sum_{k=1}^{\infty} \frac{1}{k(k + 1)}.$$
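The telescoping identity for the partial sums can be confirmed exactly with rational arithmetic. The following is a short sketch with an illustrative function name; it sums the terms 1/(k(k + 1)) directly and compares the result with 1 − 1/(n + 1).

```python
from fractions import Fraction

def partial_sum(n):
    """s_n = sum of 1/(k(k+1)) for k = 1, ..., n, as an exact fraction."""
    return sum(Fraction(1, k * (k + 1)) for k in range(1, n + 1))

for n in (1, 2, 3, 10, 100):
    s = partial_sum(n)
    assert s == 1 - Fraction(1, n + 1)   # the telescoped closed form
    print(n, s, float(1 - s))
```

The last column is the distance from the partial sum to the limit 1, which equals 1/(n + 1).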

A limit theorem of the previous section immediately gives us a way to generate new convergent series from old ones (see problem 10).

Lemma 3.3.1. If the infinite series

$$\sum_{k=1}^{\infty} a_k, \quad \text{and} \quad \sum_{k=1}^{\infty} b_k,$$

converge, so does

$$\sum_{k=1}^{\infty} (C_1 a_k + C_2 b_k),$$

for any real numbers $C_1$, $C_2$.

A particularly important example is the geometric series. The partial sums of the geometric series are

\[ s_n(x) = \sum_{k=0}^{n-1} x^k = \frac{1 - x^n}{1 - x} \]

if $x$ is any number other than 1. If $x = 1$ the partial sums are $s_n = n$, which is an unbounded sequence, so the series cannot converge.

Since the denominator $1 - x$ for $s_n(x)$ is independent of $n$, the series converges (for $x \ne 1$) if and only if $\lim_{n\to\infty} (1 - x^n)$ exists. If $|x| < 1$ then (see problem 4)

\[ \lim_{n\to\infty} x^n = 0. \]

If $|x| > 1$ then $|x|^n \to \infty$, and the sequence $\{s_n\}$ is unbounded and so has no limit. Finally, if $x = -1$ then $x^n = (-1)^n$ takes the values $-1, 1, -1, 1, \dots$, which again has no limit. We conclude that the geometric series converges (to $S(x) = 1/(1 - x)$) if and only if $|x| < 1$.
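The behavior of the geometric partial sums in the different cases can be observed numerically. This is only an illustrative sketch; the function names and the sample values of $x$ and $n$ are assumptions made here, not taken from the text.

```python
def geometric_partial_sum(x, n):
    """Return s_n(x) = sum_{k=0}^{n-1} x**k by direct summation."""
    return sum(x**k for k in range(n))


def closed_form(x, n):
    """Return (1 - x**n) / (1 - x), valid for x != 1."""
    return (1 - x**n) / (1 - x)


for x in (0.5, -0.9, 1.5, -1.0):
    for n in (5, 20, 80):
        s = geometric_partial_sum(x, n)
        print(f"x = {x:5.2f}  n = {n:2d}  s_n = {s:.6g}  "
              f"closed form = {closed_form(x, n):.6g}")
```

For $|x| < 1$ the printed values settle near $1/(1-x)$, for $x = 1.5$ they grow without bound, and for $x = -1$ they alternate between 1 and 0.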


3.4 Taylor series

The example of the geometric series has been used to slip in another idea, that a function might be represented by an infinite series. In the case of the geometric series, the function is

\[ S(x) = \frac{1}{1 - x}, \]

with the corresponding series

\[ \sum_{k=0}^{\infty} x^k = 1 + x + x^2 + \dots. \]

Notice that the function $1/(1 - x)$ is defined for all values of $x$ except $x = 1$, but the infinite series only converges for $|x| < 1$.

The geometric series looks like a polynomial with infinitely many terms. More generally, an infinite series of the form

\[ \sum_{k=0}^{\infty} a_k (x - x_0)^k \]

is called a power series. The number $x_0$ is called the center of the series. In the case of the geometric series the center is 0.

A second example of a power series can be constructed by a simple modification of the geometric series. Replacing $x$ by $-x$ gives

\[ \frac{1}{1 + x} = \frac{1}{1 - (-x)} = \sum_{k=0}^{\infty} (-1)^k x^k, \quad |x| < 1, \]

where the coefficients are $a_k = (-1)^k$.

Since power series look so much like polynomials, it is tempting to treat them in the same way. Yielding to this temptation, we consider term by term integration of the last series, obtaining the conjectured formula

\[ \log(1 + x) = \int_0^x \frac{1}{1 + t}\,dt = \sum_{j=0}^{\infty} (-1)^j \int_0^x t^j\,dt \]
\[ = \sum_{j=0}^{\infty} (-1)^j \frac{x^{j+1}}{j + 1} = \sum_{k=1}^{\infty} (-1)^{k-1} \frac{x^k}{k}, \quad |x| < 1. \]

In this case $a_0 = 0$ and $a_k = (-1)^{k-1}/k$ for $k \ge 1$.

To the extent that a power series may be manipulated like a polynomial, such a representation for a function is extremely convenient. Differentiation is no challenge, and more importantly the term by term integration of power series is trivial.

This is quite different from the problem of trying to find elementary antiderivatives, where examples such as

\[ \int_0^x e^{-t^2}\,dt \]

prove to be an impossible challenge.

So far the discussion of power series has emphasized formal algebraic manipulations, with some analysis to help with the justification. In the remainder of this chapter, questions about power series will be considered more generally, and with more analytical precision. There are some basic questions to consider. Which functions may be represented by a power series, what are the coefficients of the power series, and how much error is made when a partial sum of the power series is used instead of the entire infinite series?

3.4.1 Taylor polynomials

Suppose that the function $f(x)$ has a power series representation

\[ f(x) = \sum_{k=0}^{\infty} a_k x^k = a_0 + a_1 x + a_2 x^2 + a_3 x^3 + \dots. \]

The first problem is to decide what the coefficients $a_k$ are. Notice that evaluation of the function $f(x)$ at $x = 0$ gives the first coefficient,

\[ f(0) = a_0. \]

To get $a_1$ we formally differentiate the power series,

\[ f'(x) = \sum_{k=1}^{\infty} k a_k x^{k-1} = a_1 + 2a_2 x + 3a_3 x^2 + \dots, \]

and evaluate the derivative at $x = 0$, obtaining the second coefficient,

\[ f'(0) = a_1. \]


Continuing in this manner leads to

\[ f^{(2)}(x) = \sum_{k=2}^{\infty} k(k-1) a_k x^{k-2} = 2a_2 + 6a_3 x + \dots, \quad f^{(2)}(0) = 2a_2. \]

Differentiating term by term $n$ times gives

\[ f^{(n)}(x) = \sum_{k=n}^{\infty} k(k-1)\cdots(k-(n-1))\, a_k x^{k-n}. \]

Now evaluate both sides at $x = 0$ to get

\[ f^{(n)}(0) = n!\,a_n, \quad a_n = f^{(n)}(0)/n!. \]

Similar computations may be carried out for more general power series of the form

\[ f(x) = \sum_{k=0}^{\infty} c_k (x - x_0)^k = c_0 + c_1(x - x_0) + c_2(x - x_0)^2 + c_3(x - x_0)^3 + \dots. \]

Recall that $x_0$ is the center or basepoint of the series. In this case $f(x_0) = c_0$, $f'(x_0) = c_1$, and in general (see problem 17)

\[ f^{(n)}(x_0) = n!\,c_n, \quad c_n = f^{(n)}(x_0)/n!. \]

The infinite series

\[ \sum_{k=0}^{\infty} \frac{f^{(k)}(x_0)}{k!} (x - x_0)^k \]

is called the Taylor series for the function $f(x)$ centered at $x_0$. This series is defined as long as $f(x)$ has derivatives of all orders at $x_0$, but the series may not converge except at $x_0$. If it does converge, it is possible that the series will not converge to the function $f(x)$.

Taylor polynomials are truncated versions of these series. Given a function $f(x)$ with at least $n$ derivatives at $x_0$, its Taylor polynomial of degree $n$ based at $x = x_0$ is the polynomial

\[ P_n(x) = \sum_{k=0}^{n} c_k (x - x_0)^k = c_0 + c_1(x - x_0) + c_2(x - x_0)^2 + \dots + c_n(x - x_0)^n, \]

with $c_k = f^{(k)}(x_0)/k!$, or

\[ P_n(x) = \sum_{k=0}^{n} \frac{f^{(k)}(x_0)}{k!} (x - x_0)^k. \]

This is the unique polynomial of degree at most $n$ whose 0 through n-th derivatives at $x_0$ agree with those of $f$.

Figure 3.2: First order Taylor polynomial for $e^x$. (Plot of $e^x$ and $1 + x$ for $-2 \le x \le 2$.)

It is worth considering a few examples to see in what sense the polynomials $P_n(x)$ ‘look like’ a function $f(x)$. First take $x_0 = 0$ and $f(x) = e^x$. Since

\[ \frac{d^k}{dx^k} e^x = e^x, \]

the coefficients of the Taylor series for $e^x$ based at $x = 0$ are

\[ a_k = \frac{1}{k!}, \]

and the Taylor polynomial of degree $n$ based at $x_0 = 0$ is

\[ P_n(x) = \sum_{k=0}^{n} \frac{x^k}{k!}. \]


That is,

\[ P_0(x) = 1, \quad P_1(x) = 1 + x, \quad P_2(x) = 1 + x + \frac{x^2}{2}, \quad P_3(x) = 1 + x + \frac{x^2}{2} + \frac{x^3}{3 \cdot 2}, \ \dots. \]

Figure 3.2 shows the graph of $e^x$ and the first order Taylor polynomial $P_1(x)$ with center $x_0 = 0$. Figure 3.3 is similar with the third order Taylor polynomial $P_3(x)$.
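The comparison made in the figures can also be tabulated. The sketch below evaluates $P_n(x)$ for $e^x$ directly from the formula $P_n(x) = \sum_{k=0}^{n} x^k/k!$; the function name `exp_taylor` and the sample points are illustrative assumptions.

```python
import math


def exp_taylor(x, n):
    """Evaluate P_n(x) = sum_{k=0}^{n} x**k / k!, the Taylor polynomial of e**x at 0."""
    return sum(x**k / math.factorial(k) for k in range(n + 1))


for x in (-2.0, -1.0, 0.0, 1.0, 2.0):
    err1 = abs(math.exp(x) - exp_taylor(x, 1))
    err3 = abs(math.exp(x) - exp_taylor(x, 3))
    print(f"x = {x:5.1f}  |e^x - P_1(x)| = {err1:.4f}  |e^x - P_3(x)| = {err3:.4f}")
```

At each sample point the third order polynomial is at least as close to $e^x$ as the first order one, and both errors shrink as $x$ approaches the center 0.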

Figure 3.3: Third order Taylor polynomial for $e^x$. (Plot of $e^x$ and $P_3(x)$ for $-2 \le x \le 2$.)

As a second example take $x_0 = \pi/2$ and $f(x) = \cos(x)$. In this case

\[ \cos'(x) = -\sin(x), \quad \cos^{(2)}(x) = -\cos(x), \]
\[ \cos^{(3)}(x) = \sin(x), \quad \cos^{(4)}(x) = \cos(x), \]

and in general

\[ \cos^{(2m)}(x) = (-1)^m \cos(x), \]
\[ \cos^{(2m+1)}(x) = (-1)^{m+1} \sin(x), \quad m = 0, 1, 2, \dots. \]

Evaluation of the derivatives at $x_0 = \pi/2$ gives

\[ \cos^{(2m)}(\pi/2) = (-1)^m \cos(\pi/2) = 0, \]
\[ \cos^{(2m+1)}(\pi/2) = (-1)^{m+1} \sin(\pi/2) = (-1)^{m+1}, \quad m = 0, 1, 2, \dots. \]

Since the coefficients with even index $k$ vanish we find

\[ P_0(x) = 0, \quad P_1(x) = -(x - \pi/2), \]
\[ P_2(x) = -(x - \pi/2), \quad P_3(x) = -(x - \pi/2) + \frac{(x - \pi/2)^3}{6}, \ \dots. \]

The general form of this Taylor polynomial of order $n$ based at $x_0 = \pi/2$ is

\[ P_n(x) = \sum_{k=0}^{\lfloor (n-1)/2 \rfloor} (-1)^{k+1} \frac{(x - \pi/2)^{2k+1}}{(2k + 1)!}. \]

Here $\lfloor x \rfloor$ is the largest integer less than or equal to $x$, and there is no summation (the sum is 0) if the upper limit is negative.

Figure 3.4 shows the graph of $\cos(x)$ and its third order Taylor polynomial with center $x_0 = \pi/2$.
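The general formula, including the convention for a negative upper limit, translates directly into code. This is a sketch under the assumption that Python's floor division `//` plays the role of $\lfloor (n-1)/2 \rfloor$; the function name is hypothetical.

```python
import math


def cos_taylor_at_half_pi(x, n):
    """Evaluate the order-n Taylor polynomial of cos centered at pi/2."""
    u = x - math.pi / 2
    upper = (n - 1) // 2  # floor((n-1)/2); equals -1 when n = 0
    # range(upper + 1) is empty when upper = -1, so the sum is 0 as in the text.
    return sum((-1) ** (k + 1) * u ** (2 * k + 1) / math.factorial(2 * k + 1)
               for k in range(upper + 1))


for x in (1.0, 1.5, 2.0, 3.0):
    print(f"x = {x:.1f}  cos(x) = {math.cos(x): .6f}  "
          f"P_3(x) = {cos_taylor_at_half_pi(x, 3): .6f}")
```

The output shows close agreement near the center $\pi/2 \approx 1.57$ and a visibly larger gap at $x = 3$.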

3.4.2 Taylor’s Theorem

Taylor’s Theorem provides an exact description of the difference between a function $f(x)$ and its Taylor polynomials. Let’s start with a simple motivating example,

\[ f(x) = e^x, \quad x_0 = 0. \]

By the Fundamental Theorem of Calculus,

\[ f(x) - f(x_0) = \int_{x_0}^{x} f'(t)\,dt. \]

For the example this gives

\[ e^x - e^0 = \int_0^x e^t\,dt. \]

Now recall the integration by parts formula

\[ \int_{x_0}^{x} h'(t) g(t)\,dt = h(x)g(x) - h(x_0)g(x_0) - \int_{x_0}^{x} h(t) g'(t)\,dt. \]

Figure 3.4: Third order Taylor polynomial for $\cos(x)$. The center is $x_0 = \pi/2$. (Plot of $\cos(x)$ and $-(x - \pi/2) + (x - \pi/2)^3/6$.)

For our example, take $g(t) = e^t$ and $h'(t) = 1$. A convenient choice of $h(t)$ with $h'(t) = 1$ is $h(t) = t - x$. Here we are thinking of $x$ as fixed for the moment.

Since $e^0 = 1$,

\[ e^x = 1 + \int_0^x e^t\,dt = 1 + (t - x)e^t \Big|_{t=0}^{t=x} - \int_0^x (t - x)e^t\,dt \]
\[ = 1 + (x - x)e^x - (0 - x)e^0 + \int_0^x (x - t)e^t\,dt. \]

That is,

\[ e^x = 1 + x + \int_0^x (x - t)e^t\,dt. \]

Now use the same idea again with $g(t) = e^t$ and $h'(t) = (x - t)$. Take

\[ h(t) = -\frac{(x - t)^2}{2} \]

to get

\[ e^x = 1 + x - \frac{(t - x)^2}{2} e^t \Big|_0^x + \int_0^x \frac{(x - t)^2}{2} e^t\,dt = 1 + x + \frac{x^2}{2} + \int_0^x \frac{(x - t)^2}{2} e^t\,dt. \]

Such formulas can give us information about the function. For instance, notice that in the last formula the integrand is nonnegative, so that the integral is nonnegative if $x \ge 0$. Dropping the last term therefore cannot make the remaining expression larger, which implies

\[ e^x \ge 1 + x + \frac{x^2}{2}, \quad x \ge 0. \]

The techniques that worked in the example also work in the general case. Start with the Fundamental Theorem of Calculus to write

\[ f(x) - f(x_0) = \int_{x_0}^{x} f'(t)\,dt. \]

Since $\frac{d}{dt}(t - x) = 1$,

\[ f(x) = f(x_0) + \int_{x_0}^{x} f'(t)\,dt = f(x_0) + \int_{x_0}^{x} f'(t) \frac{d}{dt}(t - x)\,dt. \]

Now use integration by parts to get

\[ f(x) = f(x_0) + \int_{x_0}^{x} f'(t) \frac{d}{dt}(t - x)\,dt \]
\[ = f(x_0) + (t - x) f'(t) \Big|_{x_0}^{x} - \int_{x_0}^{x} (t - x) f^{(2)}(t)\,dt \]
\[ = f(x_0) + f'(x_0)(x - x_0) + \int_{x_0}^{x} (x - t) f^{(2)}(t)\,dt. \]

Of course we are assuming here that $f(x)$ can be differentiated as many times as indicated, always with continuous derivatives. By repeatedly using the same integration by parts idea the following theorem is obtained.

Theorem 3.4.1. (Taylor’s Theorem) Suppose that the function $f(x)$ has $n + 1$ continuous derivatives on the open interval $(a, b)$, and that $x_0$ and $x$ are in this interval. Then

\[ f(x) = \sum_{k=0}^{n} f^{(k)}(x_0) \frac{(x - x_0)^k}{k!} + \int_{x_0}^{x} \frac{(x - t)^n}{n!} f^{(n+1)}(t)\,dt \]

or, rewriting in alternate notation,

\[ f(x) = f(x_0) + f'(x_0)(x - x_0) + f^{(2)}(x_0) \frac{(x - x_0)^2}{2!} + \dots + f^{(n)}(x_0) \frac{(x - x_0)^n}{n!} + \int_{x_0}^{x} \frac{(x - t)^n}{n!} f^{(n+1)}(t)\,dt. \]

Proof. The formal proof is by induction. Notice that the Fundamental Theorem of Calculus shows that the formula is true when $n = 0$. Suppose that the formula is true for $n \le K$. Then using integration by parts we find that

\[ f(x) = f(x_0) + f'(x_0)(x - x_0) + f^{(2)}(x_0) \frac{(x - x_0)^2}{2!} + \dots + f^{(K)}(x_0) \frac{(x - x_0)^K}{K!} + \int_{x_0}^{x} \frac{(x - t)^K}{K!} f^{(K+1)}(t)\,dt \]
\[ = f(x_0) + f'(x_0)(x - x_0) + f^{(2)}(x_0) \frac{(x - x_0)^2}{2!} + \dots + f^{(K)}(x_0) \frac{(x - x_0)^K}{K!} \]
\[ \quad + \frac{-(x - t)^{K+1}}{(K + 1)!} f^{(K+1)}(t) \Big|_{x_0}^{x} - \int_{x_0}^{x} \frac{-(x - t)^{K+1}}{(K + 1)!} f^{(K+2)}(t)\,dt \]
\[ = f(x_0) + f'(x_0)(x - x_0) + f^{(2)}(x_0) \frac{(x - x_0)^2}{2!} + \dots + \frac{(x - x_0)^{K+1}}{(K + 1)!} f^{(K+1)}(x_0) + \int_{x_0}^{x} \frac{(x - t)^{K+1}}{(K + 1)!} f^{(K+2)}(t)\,dt. \]

Thus if the formula is correct for $n \le K$ it is also true for $n = K + 1$. This establishes the formula in general.

3.4.3 The remainder

The last term in the formula of Taylor’s Theorem is called the remainder,

\[ R_n(x) = \int_{x_0}^{x} \frac{(x - t)^n}{n!} f^{(n+1)}(t)\,dt. \]

The size of this remainder is interesting since the error made in replacing $f(x)$ with

\[ f(x_0) + f'(x_0)(x - x_0) + f^{(2)}(x_0) \frac{(x - x_0)^2}{2!} + \dots + f^{(n)}(x_0) \frac{(x - x_0)^n}{n!} \]

is just

\[ |R_n(x)| = \left| f(x) - \sum_{k=0}^{n} f^{(k)}(x_0) \frac{(x - x_0)^k}{k!} \right|. \]

The definition of convergence of an infinite series immediately leads to the next result.

Theorem 3.4.2. Suppose that $f(x)$ has derivatives of all orders at $x_0$. The Taylor series

\[ \sum_{k=0}^{\infty} \frac{f^{(k)}(x_0)}{k!} (x - x_0)^k \]

converges to the value $f(x)$ if and only if

\[ \lim_{n\to\infty} R_n(x) = 0. \]

To determine how accurately $f(x)$ is approximated by its Taylor polynomials we usually employ an estimate for $|R_n(x)|$. The following lemma is essential for developing such estimates.

Lemma 3.4.3. Suppose that $f(x)$ is continuous on the interval $[a, b]$. Then

\[ \left| \int_a^b f(x)\,dx \right| \le \int_a^b |f(x)|\,dx. \]

Proof. The ideas sketched here will be reconsidered in chapter 7. The integral

\[ \int_a^b f(x)\,dx \]

is the signed area for the region between the graph of $f$ and the x-axis for $a \le x \le b$. Split the function $f$ into positive and negative parts: $f(x) = f^+(x) + f^-(x)$, where

\[ f^+(x) = \begin{cases} f(x), & f(x) > 0, \\ 0, & f(x) \le 0, \end{cases} \qquad f^-(x) = \begin{cases} f(x), & f(x) < 0, \\ 0, & f(x) \ge 0. \end{cases} \]

Then by the triangle inequality

\[ \left| \int_a^b f(x)\,dx \right| = \left| \int_a^b f^+(x)\,dx + \int_a^b f^-(x)\,dx \right| \le \int_a^b |f^+(x)|\,dx + \int_a^b |f^-(x)|\,dx = \int_a^b |f(x)|\,dx. \]


The lemma may now be used to obtain an estimate for the remainder.

Theorem 3.4.4. Under the hypotheses of Taylor’s Theorem, if

\[ M = \max_{x_0 \le t \le x} |f^{(n+1)}(t)|, \]

then

\[ |R_n(x)| \le M \frac{|x - x_0|^{n+1}}{(n + 1)!}. \]

Proof. Suppose first that $x \ge x_0$. The previous lemma indicates that the remainder satisfies the inequality

\[ |R_n(x)| = \left| \int_{x_0}^{x} \frac{(x - t)^n}{n!} f^{(n+1)}(t)\,dt \right| \le \int_{x_0}^{x} \left| \frac{(x - t)^n}{n!} f^{(n+1)}(t) \right| dt = \int_{x_0}^{x} \frac{|x - t|^n}{n!} |f^{(n+1)}(t)|\,dt. \]

The integrand can only become larger if $|f^{(n+1)}(t)|$ is replaced by

\[ M = \max_{x_0 \le t \le x} |f^{(n+1)}(t)|. \]

Since $M$ is a constant,

\[ |R_n(x)| \le M \int_{x_0}^{x} \frac{|x - t|^n}{n!}\,dt. \]

Because $x \ge x_0$ the term $x - t$ is nonnegative, and

\[ \int_{x_0}^{x} \frac{|x - t|^n}{n!}\,dt = \int_{x_0}^{x} \frac{(x - t)^n}{n!}\,dt = \frac{(x - x_0)^{n+1}}{(n + 1)!}. \]

This gives

\[ |R_n(x)| \le M \frac{(x - x_0)^{n+1}}{(n + 1)!} = M \frac{|x - x_0|^{n+1}}{(n + 1)!}, \]

as desired.

If $x < x_0$ the intermediate computations are the same except for a possible factor $-1$, and the final result is the same.
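The estimate of Theorem 3.4.4 can be tested numerically. The sketch below uses an example not taken from the text: $f(x) = \sin(x)$ with $x_0 = 0$, for which every derivative is bounded by $M = 1$. The function names and the sample point $x = 1.5$ are assumptions made for illustration.

```python
import math


def sin_taylor(x, n):
    """Degree-n Taylor polynomial of sin at 0 (only odd powers up to n contribute)."""
    return sum((-1) ** j * x ** (2 * j + 1) / math.factorial(2 * j + 1)
               for j in range((n + 1) // 2))


def remainder_bound(x, n, M=1.0):
    """Return M |x - 0|**(n+1) / (n+1)!, the bound from Theorem 3.4.4."""
    return M * abs(x) ** (n + 1) / math.factorial(n + 1)


x = 1.5
for n in (1, 3, 5, 7):
    actual = abs(math.sin(x) - sin_taylor(x, n))
    bound = remainder_bound(x, n)
    print(f"n = {n}  |R_n(x)| = {actual:.3e}  bound = {bound:.3e}")
```

In each row the actual error stays below the bound, and both shrink rapidly because of the factorial in the denominator.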

The next result uses a different argument to analyze the remainder.


Theorem 3.4.5. (Lagrange) Under the hypotheses of Taylor’s Theorem, there is some $c$ between $x_0$ and $x$ such that

\[ R_n(x) = \int_{x_0}^{x} \frac{(x - t)^n}{n!} f^{(n+1)}(t)\,dt = f^{(n+1)}(c) \frac{(x - x_0)^{n+1}}{(n + 1)!}. \]

Proof. Suppose for simplicity that $x_0 \le x$; the other case is similar. Since $f^{(n+1)}(t)$ is continuous on the closed interval from $x_0$ to $x$, it has a minimum $f^{(n+1)}(x_1)$ and a maximum $f^{(n+1)}(x_2)$, with $x_0 \le x_1 \le x$ and $x_0 \le x_2 \le x$. Since the values of the function $f^{(n+1)}$ at $x_1$ and $x_2$ are just constants, and

\[ \frac{(x - t)^n}{n!} \ge 0, \quad x_0 \le t \le x, \]

we have

\[ R_n(x) = \int_{x_0}^{x} \frac{(x - t)^n}{n!} f^{(n+1)}(t)\,dt \le \int_{x_0}^{x} \frac{(x - t)^n}{n!} f^{(n+1)}(x_2)\,dt \]
\[ = f^{(n+1)}(x_2) \int_{x_0}^{x} \frac{(x - t)^n}{n!}\,dt = f^{(n+1)}(x_2) \frac{(x - x_0)^{n+1}}{(n + 1)!}, \]

and similarly

\[ R_n(x) \ge f^{(n+1)}(x_1) \frac{(x - x_0)^{n+1}}{(n + 1)!}. \]

Now for $s$ between $x_1$ and $x_2$ look at the function

\[ f^{(n+1)}(s) \frac{(x - x_0)^{n+1}}{(n + 1)!}. \]

This is a continuous function of $s$, which for $s = x_1$ is no larger than $R_n(x)$, and for $s = x_2$ is no smaller than $R_n(x)$. By the Intermediate Value Theorem there must be some point $s = c$ where

\[ R_n(x) = f^{(n+1)}(c) \frac{(x - x_0)^{n+1}}{(n + 1)!}, \]

as desired.


3.4.3.1 Calculating e

As an example of the use of Taylor’s Theorem, the decimal expansion for the number $e$ will be considered. Using $x_0 = 0$, Taylor’s Theorem gives

\[ e^x = \sum_{k=0}^{n} \frac{x^k}{k!} + R_n(x). \]

In section 2.4 we determined that $2 < e < 4$. Since every derivative of $e^t$ is $e^t$, which is at most $4^{|x|}$ for $t$ between 0 and $x$, Theorem 3.4.4 then implies

\[ |R_n(x)| \le 4^{|x|} \frac{|x|^{n+1}}{(n + 1)!}. \]

Notice that for any fixed value of $x$,

\[ \lim_{n\to\infty} R_n(x) = 0, \]

so that the Taylor series for $e^x$ based at $x_0 = 0$ converges to $e^x$,

\[ e^x = \lim_{n\to\infty} \sum_{k=0}^{n} \frac{x^k}{k!}. \]

To compute the first few terms in the decimal representation of $e = e^1$, consider the case $n = 6$. Then

\[ |R_6(1)| \le \frac{4}{7!} = \frac{1}{1260}, \]

and

\[ \left| e - \left( 1 + 1 + \frac{1}{2!} + \frac{1}{3!} + \dots + \frac{1}{6!} \right) \right| < 10^{-3}. \]

Thus with an error less than $10^{-3}$,

\[ e \simeq 2 + \frac{1}{2} + \frac{1}{6} + \frac{1}{24} + \frac{1}{120} + \frac{1}{720} = 2.718\dots. \]

Observe that the accuracy of this Taylor series approximation for $e^x$ improves very rapidly as $n$ increases.
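The arithmetic for $n = 6$ can be reproduced exactly. This is a small sketch using rational arithmetic; the variable names are illustrative, and `math.e` is used only as a reference value for comparison.

```python
import math
from fractions import Fraction

n = 6
# Partial sum 1 + 1 + 1/2! + ... + 1/6! as an exact fraction.
partial = sum(Fraction(1, math.factorial(k)) for k in range(n + 1))
# Remainder bound 4 / (n+1)! from Theorem 3.4.4 with x = 1.
bound = Fraction(4, math.factorial(n + 1))

print("partial sum  =", partial, "=", float(partial))
print("error bound  =", bound, "=", float(bound))
print("actual error =", abs(math.e - float(partial)))
```

The partial sum is $1957/720$, and the actual error is comfortably inside the bound $1/1260$.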

3.4.3.2 Calculating π

The decimal expansion for the number $\pi$ may also be obtained by using a power series. In this example algebraic manipulations are emphasized. The starting point is the calculus formula

\[ \frac{d}{dx} \tan^{-1}(x) = \frac{1}{1 + x^2}. \]

Since $\tan^{-1}(0) = 0$ this derivative formula may be integrated to give

\[ \tan^{-1}(x) = \int_0^x \frac{1}{1 + t^2}\,dt. \]

Use the geometric series identity

\[ \frac{1}{1 - x} = \sum_{k=0}^{m-1} x^k + \frac{x^m}{1 - x} \]

to derive the identity

\[ \frac{1}{1 + t^2} = \frac{1}{1 - (-t^2)} = \sum_{k=0}^{m-1} (-1)^k t^{2k} + \frac{(-1)^m t^{2m}}{1 - (-t^2)}. \]

The sum coming from the geometric series is easily integrated, giving

\[ \int_0^x \sum_{k=0}^{m-1} (-1)^k t^{2k}\,dt = \sum_{k=0}^{m-1} (-1)^k \frac{x^{2k+1}}{2k + 1}. \]

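To estimate the additional term, notice that for any real number $t$,

\[ \left| \frac{(-1)^m t^{2m}}{1 - (-t^2)} \right| = \left| \frac{t^{2m}}{1 + t^2} \right| \le |t|^{2m}. \]

If $\tan^{-1}(x)$ is approximated by

\[ \sum_{k=0}^{m-1} (-1)^k \frac{x^{2k+1}}{2k + 1} \]

the error satisfies

\[ \left| \tan^{-1}(x) - \sum_{k=0}^{m-1} (-1)^k \frac{x^{2k+1}}{2k + 1} \right| \le \left| \int_0^x \left| \frac{(-1)^m t^{2m}}{1 + t^2} \right| dt \right| \tag{3.1} \]
\[ \le \left| \int_0^x |t^{2m}|\,dt \right| \le \frac{|x|^{2m+1}}{2m + 1}. \]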

The error estimate in (3.1) looks promising if $|x| < 1$. A convenient choice of $x$ is determined by observing that

\[ \tan(\pi/6) = \frac{1/2}{\sqrt{3}/2} = \frac{1}{\sqrt{3}}, \]

or $\pi/6 = \tan^{-1}(1/\sqrt{3})$. With $x = 1/\sqrt{3}$ and, for instance, $m = 6$ we find that

\[ \left| \pi - 6 \sum_{k=0}^{5} (-1)^k \frac{1}{2k + 1} \left( \frac{1}{\sqrt{3}} \right)^{2k+1} \right| \le 6 \cdot \frac{1}{13} \cdot \frac{1}{\sqrt{3}} \cdot \frac{1}{3^6} \le 4 \times 10^{-4}, \]

and the computed value is

\[ \pi \simeq \frac{6}{\sqrt{3}} \left[ 1 - \frac{1}{3 \cdot 3} + \frac{1}{9 \cdot 5} - \frac{1}{27 \cdot 7} + \frac{1}{81 \cdot 9} - \frac{1}{243 \cdot 11} \right] \simeq 3.1413. \]
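The same computation is easy to carry out in floating point. The following sketch evaluates the partial sum for $\tan^{-1}(1/\sqrt{3})$ with $m = 6$ and compares the actual error with the bound from (3.1); the function name `arctan_partial` is a hypothetical choice, and `math.pi` serves only as a reference value.

```python
import math


def arctan_partial(x, m):
    """Return sum_{k=0}^{m-1} (-1)**k x**(2k+1) / (2k+1)."""
    return sum((-1) ** k * x ** (2 * k + 1) / (2 * k + 1) for k in range(m))


x = 1 / math.sqrt(3)
m = 6
approx = 6 * arctan_partial(x, m)                  # approximation to pi
bound = 6 * abs(x) ** (2 * m + 1) / (2 * m + 1)    # 6 times the bound in (3.1)

print("approximation =", approx)
print("error bound   =", bound)
print("actual error  =", abs(math.pi - approx))
```

Increasing `m` tightens the bound by roughly a factor of 3 per extra term, since $x^2 = 1/3$.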

3.4.4 Additional results

3.4.4.1 Taylor series by algebraic manipulations

Taylor’s Theorem gives an estimate of the form

\[ \left| f(x) - \sum_{k=0}^{n} \frac{f^{(k)}(x_0)}{k!} (x - x_0)^k \right| \le C_1 |x - x_0|^{n+1}. \]

As the ad hoc method for expanding $\tan^{-1}(x)$ illustrated, it is sometimes possible, as in (3.1), to come up with estimates of similar form

\[ \left| f(x) - \sum_{k=0}^{n} a_k (x - x_0)^k \right| \le C_2 |x - x_0|^{n+1} \]

without directly computing the derivatives $f^{(k)}(x_0)$. The next theorem says that the coefficients $a_k$ must be the Taylor series coefficients.

First, a lemma about polynomials is developed. This lemma says that the magnitude of $p(x) = a_0 + a_1 x + \dots + a_n x^n$ can’t be smaller than a constant multiple of $|x|^{n+1}$ for all $x$ in some interval containing 0 unless all the coefficients $a_k$ are 0.

Lemma 3.4.6. Suppose there is a polynomial

\[ p(x) = \sum_{k=0}^{n} b_k (x - x_0)^k \]

and a number $\delta > 0$ such that

\[ |p(x)| \le C |x - x_0|^{n+1} \]

for $x_0 < x < x_0 + \delta$. Then $p(x) = 0$ for all $x$.


Proof. Suppose the polynomial $p(x)$ is not the zero function, which implies that one or more of the coefficients $b_k$ is not 0. Let $m$ be the smallest index such that $b_m \ne 0$. Throwing away the terms whose coefficients are known to be 0, the inequality for $p(x)$ becomes

\[ \left| \sum_{k=m}^{n} b_k (x - x_0)^k \right| \le C |x - x_0|^{n+1}. \]

When $x \ne x_0$ it is possible to divide by $|x - x_0|^m$ to get

\[ \left| \sum_{k=m}^{n} b_k (x - x_0)^{k-m} \right| = |b_m + b_{m+1}(x - x_0) + \dots| \le C |x - x_0|^{n+1-m}. \]

Picking $x$ sufficiently close to $x_0$ will force the expression $|b_m + b_{m+1}(x - x_0) + \dots|$ to be at least as big as $|b_m/2|$, while at the same time forcing the expression $C|x - x_0|^{n+1-m}$ to be as small as desired, say smaller than $|b_m/10|$. (Problem 2 can be useful for this analysis.) Since $b_m \ne 0$, it follows that $|b_m/2| < |b_m/10|$, or $1/2 < 1/10$. Since this is false, it must be that all the coefficients $b_k$ are actually 0.

Theorem 3.4.7. Suppose that the function $f(x)$ has $n + 1$ continuous derivatives on the open interval $(a, b)$, and that $x_0$ is in this interval. Suppose in addition that there is a polynomial

\[ \sum_{k=0}^{n} a_k (x - x_0)^k \]

and a number $\delta > 0$ such that

\[ \left| f(x) - \sum_{k=0}^{n} a_k (x - x_0)^k \right| \le C_2 |x - x_0|^{n+1}, \quad x_0 < x < x_0 + \delta < b. \]

Then

\[ a_k = \frac{f^{(k)}(x_0)}{k!}, \quad k = 0, \dots, n. \]

Proof. For $x$ in the interval $[x_0, x_0 + \delta]$ Theorem 3.4.4 says that

\[ \left| f(x) - \sum_{k=0}^{n} \frac{f^{(k)}(x_0)}{k!} (x - x_0)^k \right| \le C_1 |x - x_0|^{n+1}, \]

with

\[ C_1 = \max_{x_0 \le x \le x_0 + \delta} |f^{(n+1)}(x)|. \]

By the triangle inequality

\[ \left| \sum_{k=0}^{n} a_k (x - x_0)^k - \sum_{k=0}^{n} \frac{f^{(k)}(x_0)}{k!} (x - x_0)^k \right| \]
\[ = \left| \left[ \sum_{k=0}^{n} a_k (x - x_0)^k - f(x) \right] + \left[ f(x) - \sum_{k=0}^{n} \frac{f^{(k)}(x_0)}{k!} (x - x_0)^k \right] \right| \]
\[ \le (C_1 + C_2) |x - x_0|^{n+1}. \]

With $C_3 = C_1 + C_2$ and $b_k = a_k - f^{(k)}(x_0)/k!$, the last inequality says that the polynomial $\sum_{k=0}^{n} b_k (x - x_0)^k$ satisfies

\[ \left| \sum_{k=0}^{n} b_k (x - x_0)^k \right| \le C_3 |x - x_0|^{n+1}, \quad x_0 < x < x_0 + \delta. \]

By Lemma 3.4.6 the coefficients $b_k$ are all 0, or

\[ a_k = \frac{f^{(k)}(x_0)}{k!}, \quad k = 0, \dots, n. \]

This theorem may be applied to (3.1). The coefficients computed there using the geometric series are the Taylor coefficients for $\tan^{-1}(x)$ at $x_0 = 0$. In addition this analysis shows that

\[ \tan^{-1}(x) = \lim_{m\to\infty} \sum_{k=0}^{m-1} (-1)^k \frac{x^{2k+1}}{2k + 1}, \quad \text{if } |x| \le 1. \]

3.4.4.2 The binomial series

Let $\alpha$ be a real number and consider the Taylor series centered at $x_0 = 0$ for the function

\[ f(x) = (1 + x)^{\alpha}. \]

This function has derivatives of all orders at $x_0 = 0$, with induction showing

\[ f^{(1)}(x) = \alpha (1 + x)^{\alpha - 1}, \quad f^{(2)}(x) = \alpha(\alpha - 1)(1 + x)^{\alpha - 2}, \]
\[ \dots, \quad f^{(k)}(x) = \alpha(\alpha - 1) \cdots (\alpha - [k - 1])(1 + x)^{\alpha - k}. \]

The Taylor polynomials centered at $x_0 = 0$ for this function are

\[ P_n(x) = \sum_{k=0}^{n} \frac{f^{(k)}(0)}{k!} x^k = \sum_{k=0}^{n} \alpha(\alpha - 1) \cdots (\alpha - [k - 1]) \frac{x^k}{k!}. \]

The basic question is, for which values of $x$ does $R_n(x) \to 0$ as $n \to \infty$? This question will be treated by two methods: first with Lagrange’s form for the remainder, and second with the original integral form.

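The Lagrange form for the remainder in this case is

\[ R_n(x) = f^{(n+1)}(c) \frac{x^{n+1}}{(n + 1)!} = \frac{\alpha(\alpha - 1) \cdots (\alpha - n)}{(n + 1)!} (1 + c)^{\alpha - n - 1} x^{n+1}, \tag{3.2} \]

where $c$ lies between 0 and $x$. The main challenge in estimating the remainder is to understand the factor

\[ \frac{\alpha(\alpha - 1) \cdots (\alpha - n)}{(n + 1)!} = \frac{\alpha}{1} \cdot \frac{(\alpha - 1)}{2} \cdots \frac{(\alpha - n)}{(n + 1)}. \tag{3.3} \]

Notice that

\[ \lim_{n\to\infty} \frac{\alpha - n}{n + 1} = \lim_{n\to\infty} \left[ \frac{\alpha}{n + 1} - \frac{n}{n + 1} \right] = -1. \]

This means that for any $\varepsilon > 0$ there is an $M$ such that

\[ \left| \frac{\alpha - n}{n + 1} \right| < 1 + \varepsilon \quad \text{whenever } n \ge M. \]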

Going back to (3.3), if $n \ge M$ then

\[ \frac{\alpha}{1} \cdot \frac{(\alpha - 1)}{2} \cdots \frac{(\alpha - n)}{(n + 1)} = \left[ \frac{\alpha}{1} \cdot \frac{(\alpha - 1)}{2} \cdots \frac{(\alpha - [M - 1])}{M} \right] \left[ \frac{(\alpha - M)}{(M + 1)} \cdots \frac{(\alpha - n)}{(n + 1)} \right]. \]

The term

\[ \frac{\alpha}{1} \cdot \frac{(\alpha - 1)}{2} \cdots \frac{(\alpha - [M - 1])}{M} \]

is just some complicated constant, which we can call $c_M$. The second bracket contains $n - M + 1$ factors, each of magnitude less than $1 + \varepsilon$. Thus

\[ \left| \frac{\alpha}{1} \cdot \frac{(\alpha - 1)}{2} \cdots \frac{(\alpha - n)}{(n + 1)} \right| \le |c_M| (1 + \varepsilon)^{n - M + 1} = [\,|c_M| (1 + \varepsilon)^{1 - M}] (1 + \varepsilon)^n. \]

Again the term $|c_M| (1 + \varepsilon)^{1 - M}$ is an awkward constant, which will be denoted $C_M$. The bottom line of this analysis is that for any $\varepsilon > 0$ there is a constant $C_M$ such that

\[ \left| \frac{\alpha(\alpha - 1) \cdots (\alpha - n)}{(n + 1)!} \right| \le C_M (1 + \varepsilon)^n. \]

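The other factor in the remainder (3.2) was

\[ (1 + c)^{\alpha - n - 1} x^{n+1} = (1 + c)^{\alpha} (1 + c)^{-n-1} x^{n+1}, \tag{3.4} \]

where $c$ is between 0 and $x$. Notice that if $x \ge 0$ then $c \ge 0$ and

\[ (1 + c)^{-n-1} \le 1. \]

The part $(1 + c)^{\alpha}$ is another of those awkward constants: since $0 \le c \le x$, it is at most $\max(1, (1 + x)^{\alpha})$, independent of $n$.

Let’s put everything together. If $x \ge 0$, then

\[ |R_n(x)| = \left| \frac{\alpha(\alpha - 1) \cdots (\alpha - n)}{(n + 1)!} (1 + c)^{\alpha - n - 1} x^{n+1} \right| \le C (1 + \varepsilon)^n x^{n+1}. \]

Suppose that $0 \le x < 1$. Pick $\varepsilon > 0$ so that

\[ 0 \le (1 + \varepsilon) x < 1, \]

and so as $n \to \infty$

\[ (1 + \varepsilon)^n x^{n+1} = [(1 + \varepsilon) x]^n x \to 0. \]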

The case $x < 0$ remains untreated. This can be partially rectified using (3.4) (see problem 29), but a better result can be obtained by going back to the original integral form of the remainder. Since $x_0 = 0$,

\[ R_n(x) = \int_0^x \frac{(x - t)^n}{n!} f^{(n+1)}(t)\,dt = \int_0^x \frac{(x - t)^n}{n!} \alpha(\alpha - 1) \cdots (\alpha - n)(1 + t)^{\alpha - n - 1}\,dt \]
\[ = \frac{\alpha(\alpha - 1) \cdots (\alpha - n)}{(n + 1)!}\,(n + 1) \int_0^x (x - t)^n (1 + t)^{\alpha - n - 1}\,dt. \]

The factor $n + 1$ appears because $1/n! = (n + 1)/(n + 1)!$; it grows only linearly in $n$. The term $\alpha(\alpha - 1) \cdots (\alpha - n)/(n + 1)!$ is as before, so the only new piece is

\[ \int_0^x (x - t)^n (1 + t)^{\alpha - n - 1}\,dt = \int_0^x \left( \frac{x - t}{1 + t} \right)^n (1 + t)^{\alpha - 1}\,dt. \]

It will be important to understand the function

\[ \left| \frac{x - t}{1 + t} \right|. \]

Let’s focus on the case $-1 < x \le t \le 0$. Then

\[ \left| \frac{x - t}{1 + t} \right| = |x| \left| \frac{1 - t/x}{1 + t} \right| = |x| \frac{1 - t/x}{1 - |t|}. \]

Since $|x| < 1$,

\[ |t| \le t/x \le 1, \]

and

\[ \left| \frac{x - t}{1 + t} \right| \le |x|. \tag{3.5} \]

This last inequality also holds if $0 \le t \le x < 1$.

Using (3.5) in the remainder formula gives

\[ |R_n(x)| = \left| \frac{\alpha(\alpha - 1) \cdots (\alpha - n)}{(n + 1)!} \right| (n + 1) \left| \int_0^x \left( \frac{x - t}{1 + t} \right)^n (1 + t)^{\alpha - 1}\,dt \right| \]
\[ \le C (n + 1)(1 + \varepsilon)^n |x|^n \left| \int_0^x (1 + t)^{\alpha - 1}\,dt \right|. \]

Again picking $\varepsilon$ so small that $r = (1 + \varepsilon)|x| < 1$, and noting that the last integral is independent of $n$ while $(n + 1) r^n \to 0$ as $n \to \infty$, we have $\lim_{n\to\infty} |R_n| = 0$.

After all that work let’s celebrate our success with a small theorem.

Theorem 3.4.8. Suppose that $R_n$ is the error made in approximating the function $(1 + x)^{\alpha}$ by the partial sum of the binomial series

\[ P_n(x) = \sum_{k=0}^{n} \alpha(\alpha - 1) \cdots (\alpha - [k - 1]) \frac{x^k}{k!}. \]

Then as long as $-1 < x < 1$ we have

\[ \lim_{n\to\infty} |R_n| = 0. \]
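The convergence asserted in Theorem 3.4.8 can be observed numerically. The sketch below builds each term of the binomial series from the previous one using the ratio $(\alpha - k)x/(k + 1)$; the function name and the sample values of $\alpha$ and $x$ are assumptions chosen for illustration.

```python
def binomial_partial_sum(alpha, x, n):
    """Return P_n(x) = sum_{k=0}^{n} alpha(alpha-1)...(alpha-k+1) x**k / k!."""
    total, term = 0.0, 1.0  # the k = 0 term is 1
    for k in range(n + 1):
        total += term
        # Ratio of term k+1 to term k is (alpha - k) x / (k + 1).
        term *= (alpha - k) * x / (k + 1)
    return total


for alpha, x in ((0.5, 0.5), (0.5, -0.5), (-1.0, 0.5)):
    exact = (1 + x) ** alpha
    for n in (2, 10, 40):
        err = abs(exact - binomial_partial_sum(alpha, x, n))
        print(f"alpha = {alpha:4.1f}  x = {x:4.1f}  n = {n:2d}  |R_n| = {err:.3e}")
```

When $\alpha$ is a nonnegative integer the factor $\alpha - k$ eventually vanishes, so the series terminates and reproduces the usual binomial expansion.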


3.5 Problems

1. Determine whether or not the following sequences converge. If they do converge, find an $N_\varepsilon$. If they do not converge, explain why.

(a) $c_k = 5 - 1/k^2$, (b) $c_k = \log(k)$, (c) $c_k = (-1)^k/k$, (d) $c_k = \cos(k)/k$, (e) $c_k = 3^{-k}$, (f) $c_k = \log(1 + 1/k)$.

2. Prove that for all real numbers $a$, $b$,

(a) $|a + b| \le |a| + |b|$.

This result is called the triangle inequality. Also show that

(b) $|a - b| \ge |a| - |b|$.

(Hint: consider the equation $a = (a - b) + b$.)

3. Suppose there are two sequences $\{c_k\}$ and $\{a_k\}$, with the property that $a_k = c_k$ for all $k$ greater than some number $M$. Show that if

\[ \lim_{k\to\infty} c_k = L, \]

then

\[ \lim_{k\to\infty} a_k = L. \]

4. Show that

\[ \lim_{k\to\infty} x^k = 0 \]

if and only if $|x| < 1$.

5. Say $M$ is a cluster point of the sequence $\{c_k\}$ if for every $\varepsilon > 0$, and every positive integer $N$, there is a $k \ge N$ such that $|c_k - M| < \varepsilon$.

(a) Find an example of a sequence with two distinct cluster points.

(b) Show that if $M$ is a cluster point of the sequence $\{c_k\}$ then for every $\varepsilon > 0$ there are infinitely many $c_k$ satisfying $|c_k - M| < \varepsilon$.

6. A sequence $\{c_k\}$ of real numbers is bounded above if there is a number $M_1$ such that $c_k \le M_1$ for all positive integers $k$. A sequence $\{c_k\}$ of real numbers is bounded below if there is a number $M_2$ such that $c_k \ge M_2$ for all positive integers $k$. To show that a sequence of numbers is not bounded we have to show that for any number $M$ there is an integer $k$ such that $|c_k| > M$.

(a) What needs to be established to show that a sequence of numbers is not bounded above?

(b) Show that a sequence of numbers is bounded if and only if it is bounded above and below.

(c) Show that the sequences $c_k = k$, $c_k = \sqrt{k}$, and $c_k = \log(k)$ are not bounded.

7. Let $C > 0$ be a fixed real number. Suppose that for any $\varepsilon > 0$ there is an $N$ such that

\[ |a_k - L| \le C\varepsilon \]

whenever $k \ge N$. Show that

\[ \lim_{n\to\infty} a_n = L. \]

This provides an alternate way of handling cases where $2\varepsilon$ or $3\varepsilon$ comes up in trying to prove that limits exist.

8. Given that

\[ \lim_{k\to\infty} k \sin(1/k) = 1, \]

find

\[ \lim_{k\to\infty} k^{1/2} \sin(1/k) \quad \text{and} \quad \lim_{k\to\infty} k^2 \sin(1/k). \]

Justify your answers. Do not invoke l’Hôpital’s rule.

9. Suppose that

\[ \lim_{k\to\infty} c_k = L > 0. \]

Show that there is a positive integer $N$ such that

\[ L/2 < c_k < 2L, \quad k \ge N. \]

10. Prove Lemma 3.3.1.

11. Suppose that $\lim_{k\to\infty} c_k = L$, and $b_k = c_{k+1}$. Show that $\lim_{k\to\infty} b_k = L$. Now modify your argument to show that if $b_k = c_{k+m}$ for any fixed integer $m$, then $\lim_{k\to\infty} b_k = L$.

12. Suppose that $\lim_{n\to\infty} f(n)$ exists. Show that the series

\[ \sum_{k=0}^{\infty} f^+(k), \quad f^+(k) = f(k + 1) - f(k), \]

converges. Find the sum. Give an example.

13. Find the Taylor series for $e^x$ based at $x_0 = 1$.

14. Find the Taylor series for $\log(1 + x)$ based at $x_0 = 0$.


15. Assuming that $\alpha$ is a constant, find the Taylor series for $(1 + x)^{\alpha}$ based at $x_0 = 0$.

16. For those who have some exposure to complex numbers, show by formal manipulation of power series that if $i^2 = -1$, then

\[ e^{ix} = \cos(x) + i \sin(x). \]

(Hint: find the Taylor series based at $x_0 = 0$.)

17. Show by induction that if

\[ g(x) = \sum_{k=0}^{\infty} c_k (x - x_0)^k \]

then term-by-term differentiation gives

\[ g^{(n)}(x) = \sum_{k=n}^{\infty} k(k - 1) \cdots (k - (n - 1))\, c_k (x - x_0)^{k-n} = \sum_{k=n}^{\infty} \frac{k!}{(k - n)!} c_k (x - x_0)^{k-n}. \]

Now evaluate at $x = x_0$ to get a formula for $c_n$.

18. Find the Taylor polynomial $P_5(x)$ of degree 5 for $\sin(x)$ based at $x_0 = 0$. Suppose you are trying to calculate

\[ y(x) = \frac{\sin(x) - x}{x} \]

on a computer that stores 12 digits for each number. If the calculation of $y(x)$ is carried out as indicated, for which values of $x$ (approximately) will the computer tell you that $y(x) = 0$? How much can you improve this situation if you use $P_5(x)$ and carry out the division symbolically?

19. Use Theorem 3.4.1 to show that if $f^{(n+1)}(x) = 0$ at every point of the interval $(a, b)$, then $f$ is a polynomial of degree at most $n$ on $(a, b)$.

20. Use Theorem 3.4.1 to show that if $p(x)$ and $q(x)$ are two polynomials of degree $n$ with

\[ p^{(k)}(0) = q^{(k)}(0), \quad k = 0, \dots, n, \]

then $p(x) = q(x)$ for all values of $x$.


21. Use Theorem 3.4.1 to show that if $p(x)$ is a polynomial of degree at most $n$,

\[ p(x) = \sum_{k=0}^{n} c_k x^k, \]

then $p(x)$ may also be written in the form

\[ p(x) = \sum_{k=0}^{n} a_k (x - x_0)^k, \]

for any real number $x_0$. Can you express $a_k$ in terms of the coefficients $c_0, \dots, c_n$?

22. Suppose that $p(x)$ is a polynomial of degree at most $n$, and $p^{(k)}(x_0) = 0$ for $k = 0, \dots, m - 1$. Show that there is another polynomial $q(x)$ such that

\[ p(x) = (x - x_0)^m q(x). \]

23. Using the Taylor series remainder, how many terms of the Taylor series for $e^x$ centered at $x_0 = 0$ should you use to compute the number $e$ with an error at most $10^{-12}$?

24. Compare the number of terms of the Taylor series needed to compute $\sin(13)$ with an accuracy of $10^{-6}$ if the centers are $x_0 = 0$ and $x_0 = 4\pi$.

25. Use the result of Theorem 3.4.5 to give a second proof of Theorem 3.4.4.

26. From calculus we have

\[ \log(1 + x) = \int_0^x \frac{1}{1 + t}\,dt. \]

(a) Use the partial sums of the geometric series to obtain a Taylor-like formula, with remainder, for $\log(1 + x)$. For which values of $x$ can you show that $|R_n| \to 0$ as $n \to \infty$? (Be sure to check $x = 1$. The polynomials are the Taylor polynomials.)

(b) Use the algebraic identity

\[ \frac{1}{10 + x} = \frac{1}{10} \cdot \frac{1}{1 + x/10} \]

and the method of part (a) to obtain a Taylor-like series with remainder for $\log(10 + x)$. Again determine the values of $x$ for which $|R_n| \to 0$ as $n \to \infty$.


27. Find the Taylor-like series, with remainder, for

$$f(x) = \log\left(\frac{1 + x}{1 - x}\right)$$

centered at $x_0 = 0$. (Hint: Use algebraic manipulations.) Assuming that this series is the Taylor series, what is $f^{(11)}(0)$?

28. Beginning with the Taylor series with remainder for $e^x$ centered at $x_0 = 0$, then taking $x = -z^2$, find an expression for the Taylor series with remainder for $e^{-z^2}$. Use this expression to find

$$\int_0^x e^{-z^2}\, dz, \quad 0 \le x \le 1,$$

with an error no greater than $10^{-3}$.

29. Consider approximating $(1 + x)^\alpha$ by the partial sum of the binomial series

$$\sum_{n=0}^{N} \frac{\alpha(\alpha - 1) \cdots (\alpha - [n - 1])\, x^n}{n!}.$$

Use (3.4) to show that Lagrange's form of the remainder does imply

$$\lim_{N \to \infty} |R_N| = 0 \quad \text{if } -1/2 < x \le 0.$$

30. If $|x| < 1$ the function $\log(1 + x)$ may be written as the following series:

$$\log(1 + x) = \sum_{k=1}^{\infty} (-1)^{k+1} \frac{x^k}{k}.$$

You may assume that the sequence of partial sums

$$s_n = \sum_{k=1}^{n} (-1)^{k+1} \frac{x^k}{k}$$

has a limit $s$. If $0 \le x < 0.1$, how big should you take $n$ to make sure that

$$|s_n - s| < 10^{-10}?$$

What is the answer to the same question if $x = 0.5$? (Your reasoning is more important than the specific number.)


Chapter 4

Infinite Series

4.1 Introduction

In the last chapter we considered Taylor series, and were able to show in some cases that the remainder $R_n(x)$, which is just the difference between the function $f(x)$ and its $n$-th order Taylor polynomial, has a limit 0 as $n$ goes to $\infty$. A new problem arises when there is no known function $f(x)$ in the background.

As a concrete example, consider Airy's differential equation

$$\frac{d^2 y}{dx^2} - xy = 0. \tag{4.1}$$

A fruitful approach is to look for a power series solution

$$y(x) = \sum_{k=0}^{\infty} a_k x^k.$$

The equation (4.1) implies

$$\sum_{k=0}^{\infty} k(k - 1) a_k x^{k-2} - \sum_{k=0}^{\infty} a_k x^{k+1} = 0. \tag{4.2}$$

Writing out the first few powers of $x$ produces

$$2a_2 + [3 \cdot 2 a_3 - a_0]x + [4 \cdot 3 a_4 - a_1]x^2 + [5 \cdot 4 a_5 - a_2]x^3 + \cdots = 0.$$

Setting the coefficients of the various powers of $x$ to 0 gives

$$2a_2 = 0, \quad 3 \cdot 2 a_3 - a_0 = 0, \quad 4 \cdot 3 a_4 - a_1 = 0, \quad 5 \cdot 4 a_5 - a_2 = 0.$$

It appears that we are free to assign any numbers to the coefficients $a_0$ and $a_1$, which should represent the values of $y(0)$ and $y'(0)$, but then the equations for the coefficients will determine the rest. In fact (4.2) may be rewritten as

$$\sum_{j=0}^{\infty} (j + 2)(j + 1) a_{j+2} x^j - \sum_{j=1}^{\infty} a_{j-1} x^j = 0,$$

or

$$2a_2 + \sum_{j=1}^{\infty} [(j + 2)(j + 1) a_{j+2} - a_{j-1}] x^j = 0. \tag{4.3}$$

If the coefficients of each power of $x$ are set to 0 we obtain the recursion relations

$$(j + 2)(j + 1) a_{j+2} - a_{j-1} = 0.$$

This is equivalent to

$$a_{m+3} = \frac{a_m}{(m + 3)(m + 2)}, \quad m = 0, 1, 2, \dots. \tag{4.4}$$

Since $a_2 = 0$, all of the other coefficients are determined by the recursion relations (4.4) once $a_0$ and $a_1$ are fixed.

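Let's look for more explicit formulas for the $a_k$. The relation (4.4) together with $a_2 = 0$ shows that $a_{3k+2} = 0$ for $k = 0, 1, 2, \dots$. In addition,

$$a_3 = \frac{a_0}{3 \cdot 2}, \quad a_6 = \frac{a_3}{6 \cdot 5} = \frac{a_0}{6 \cdot 5 \cdot 3 \cdot 2},$$

and

$$a_4 = \frac{a_1}{4 \cdot 3}, \quad a_7 = \frac{a_4}{7 \cdot 6} = \frac{a_1}{7 \cdot 6 \cdot 4 \cdot 3}.$$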

An induction argument can then be used to show that

$$a_{3k} = \frac{a_0}{2 \cdot 3 \cdot 5 \cdot 6 \cdots (3k - 1) \cdot (3k)}, \tag{4.5}$$

$$a_{3k+1} = \frac{a_1}{3 \cdot 4 \cdot 6 \cdot 7 \cdots (3k) \cdot (3k + 1)},$$

$$a_{3k+2} = 0.$$
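The recursion (4.4) and the closed forms (4.5) can be cross-checked numerically. The following is a minimal sketch, not part of the original text; the function names are hypothetical, and exact rational arithmetic is used so the comparison is not blurred by rounding.

```python
from fractions import Fraction

def airy_coefficients(a0, a1, count):
    """Return [a_0, ..., a_{count-1}] using a_2 = 0 and the recursion (4.4)."""
    a = [Fraction(a0), Fraction(a1), Fraction(0)]
    for m in range(count - 3):
        # (4.4): a_{m+3} = a_m / ((m + 3)(m + 2))
        a.append(a[m] / ((m + 3) * (m + 2)))
    return a[:count]

def closed_form_a3k(a0, k):
    """a_{3k} from (4.5): a_0 / (2*3 * 5*6 * ... * (3k-1)*(3k))."""
    denom = 1
    for i in range(1, k + 1):
        denom *= (3 * i - 1) * (3 * i)
    return Fraction(a0) / denom

def closed_form_a3k1(a1, k):
    """a_{3k+1} from (4.5): a_1 / (3*4 * 6*7 * ... * (3k)*(3k+1))."""
    denom = 1
    for i in range(1, k + 1):
        denom *= (3 * i) * (3 * i + 1)
    return Fraction(a1) / denom

coeffs = airy_coefficients(1, 1, 13)
for k in range(4):
    print(k, coeffs[3 * k], coeffs[3 * k + 1], coeffs[3 * k + 2])
```

With $a_0 = a_1 = 1$ the printed rows list $a_{3k}$, $a_{3k+1}$, $a_{3k+2}$ side by side, so the vanishing of every third coefficient is visible directly.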

We find ourselves in a strange situation. The idea of using a power series to look for solutions of (4.1) seems very successful. There is a unique power series solution for every choice of $y(0)$ and $y'(0)$. On the other hand, it is not clear if the power series is actually a representation of a function!


Let's expand on this point. For the infinite series previously considered, we were generally able to identify an explicit number $S$ which was the limit of the sequence of partial sums. For instance, when Taylor series were computed for elementary functions $f(x)$, the series was expected to converge to $f(x)$. For the power series $y(x)$ coming from (4.2) there is no explicit target function $f(x)$.

If the value of $x$ is fixed, then the terms in the series $\sum a_k x^k$ are just numbers, and the following general problem confronts us. Given an infinite series of numbers,

$$\sum_{k=0}^{\infty} c_k = c_0 + c_1 + \dots,$$

what general procedures are available to determine if the series converges?

4.1.1 Bounded monotone sequences

Let's review briefly the ideas used to discuss infinite series

$$\sum_{k=0}^{\infty} c_k = c_0 + c_1 + \dots.$$

To relate the study of infinite series to infinite sequences, consider the sequence of partial sums

$$s_n = \sum_{k=0}^{n-1} c_k = c_0 + c_1 + \cdots + c_{n-1}.$$

The infinite series is then said to converge to the sum $S$ if

$$\lim_{n \to \infty} s_n = S.$$

If an infinite series does not converge, it diverges.

Rather than considering the most general series, let's simplify the discussion by considering series whose terms are nonnegative, $c_k \ge 0$. Notice that the sequence of partial sums is increasing, that is $s_n \ge s_m$ if $n \ge m$. This observation leads to the question of when increasing sequences have limits. Fortunately, this question has a simple answer, given by the Bounded Monotone Sequence (BMS) Theorem.


Theorem 4.1.1. (BMS) An increasing sequence of real numbers $\{s_n\}$ has a limit if and only if it is bounded.

This result will be discussed at greater length later in the book. For the moment the following informal discussion should provide some insight.

The first observation is that Lemma 3.2.2 established that any sequence with a limit must be bounded. The main point then is to show that a bounded increasing sequence has a limit. Let us assume for simplicity that $0 \le s_n < 1$.

Represent the real numbers $s_n$ by their decimal expansions. If the $k$-th digit of $s_n$ is $a_k(n)$, then

$$\begin{aligned} s_1 &= .a_1(1)a_2(1)a_3(1) \dots, \\ s_2 &= .a_1(2)a_2(2)a_3(2) \dots, \\ s_3 &= .a_1(3)a_2(3)a_3(3) \dots, \\ &\ \ \vdots \end{aligned} \tag{4.6}$$

There is a slight problem caused by the fact that some numbers have two different decimal expansions. For instance

$$1.000 \dots = .999 \dots.$$

This happens when a number may be represented by a decimal expansion which ends with an infinite sequence of 9's. Assume that for such numbers the decimal expansion used is the one having an infinite sequence of 0's, rather than 9's.

Since the sequence of numbers $s_n$ is increasing, so is the sequence of first digits $a_1(n)$. The possible digits are only $0, \dots, 9$, so there is some $N_1$ such that $a_1(n) = a_1(N_1)$ for all $n \ge N_1$. Once $n$ exceeds $N_1$ the digits $a_1(n)$ stop changing and the sequence $a_2(n)$ is increasing. Repeating the previous argument there is an $N_2$ such that $a_2(n) = a_2(N_2)$ for all $n \ge N_2$. More generally, there is an increasing sequence $\{N_k\}$ such that

$$a_1(n) = a_1(N_k), \ \dots, \ a_k(n) = a_k(N_k), \quad n \ge N_k.$$

Let $S$ be the number whose decimal expansion has the first $k$ digits $a_1(N_k), \dots, a_k(N_k)$. (This expansion may have an infinite sequence of 9's.) Let $\epsilon > 0$, and choose $k$ so that $10^{-k} < \epsilon$. If $n \ge N_k$ then $S$ and $s_n$ have the same initial sequence of $k$ digits in their decimal expansions, and so

$$|S - s_n| \le 10^{-k} < \epsilon, \quad n \ge N_k.$$

Thus

$$S = \lim_{n \to \infty} s_n.$$
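The digit-stabilization argument can be watched in action. The sketch below is an illustration only, with a sequence chosen here as an assumption: $s_n = n/(n+1)$, which is increasing with $0 \le s_n < 1$. Its limit is $1$, so this is exactly the case where the limiting expansion $.999\dots$ ends in an infinite sequence of 9's. The helper names are hypothetical.

```python
from fractions import Fraction

def first_digits(s, k):
    """Integer formed by the first k decimal digits of s, for 0 <= s < 1."""
    return int(s * 10 ** k)  # int() truncates, which is the floor for s >= 0

def s(n):
    """Increasing bounded sequence s_n = n / (n + 1), with 0 <= s_n < 1."""
    return Fraction(n, n + 1)

# For each n, show the first 1, 2, and 3 digits of s_n.
for n in (1, 9, 99, 999):
    print(n, [first_digits(s(n), k) for k in (1, 2, 3)])
```

For this particular sequence the first $k$ digits are all 9's as soon as $n + 1 \ge 10^k$, which plays the role of $N_k$ in the argument above.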

4.2 Positive series

With the Bounded Monotone Sequence Theorem at our disposal, it is now easy to obtain several results about infinite series whose terms are nonnegative.

Theorem 4.2.1. If $c_k \ge 0$ for $k = 0, 1, 2, \dots$, then the infinite series

$$\sum_{k=0}^{\infty} c_k = c_0 + c_1 + \dots$$

converges if and only if the sequence of partial sums $\{s_n\}$ is bounded.

Proof. If the series converges, then the sequence of partial sums has a limit. By Lemma 3.2.2 the sequence of partial sums is bounded.

Suppose the sequence of partial sums is bounded. Then $\{s_n\}$ is a bounded increasing sequence, so has a limit by the BMS Theorem.

Theorem 4.2.2. (Comparison Test) Suppose that $0 \le a_k \le c_k$ for $k = 0, 1, 2, \dots$. If the infinite series $\sum_{k=0}^{\infty} c_k = c_0 + c_1 + \dots$ converges, so does the series $\sum_{k=0}^{\infty} a_k = a_0 + a_1 + \dots$. If the infinite series $\sum_{k=0}^{\infty} a_k = a_0 + a_1 + \dots$ diverges, so does the series $\sum_{k=0}^{\infty} c_k = c_0 + c_1 + \dots$.

Proof. Look at the partial sums

$$s_n = \sum_{k=0}^{n-1} c_k = c_0 + c_1 + \cdots + c_{n-1}, \quad \sigma_n = \sum_{k=0}^{n-1} a_k = a_0 + a_1 + \cdots + a_{n-1}.$$

Since $0 \le a_k \le c_k$, the inequality $\sigma_n \le s_n$ holds for all positive integers $n$. If the series $\sum c_k$ converges, then the sequence of partial sums $\{s_n\}$ is bounded above, and so is the sequence of partial sums $\{\sigma_n\}$. By the BMS Theorem the series $\sum a_k$ converges.

Suppose that the series $\sum a_k$ diverges. By the BMS Theorem the sequence of partial sums $\{\sigma_n\}$ is not bounded above. Consequently, the sequence of partial sums $\{s_n\}$ is not bounded above, and the series $\sum c_k$ diverges.

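As an example, consider the two sequences

$$a_k = \frac{1}{2k^2}, \quad c_k = \frac{1}{k^2 + k}$$

and their corresponding series. Since $k^2 \ge k$ for $k \ge 1$,

$$0 \le \frac{1}{2k^2} \le \frac{1}{k^2 + k},$$

and the hypotheses of the Comparison Test are satisfied. Notice that

$$\sum_{k=1}^{\infty} \frac{1}{k^2 + k} = \sum_{k=1}^{\infty} \left( \frac{1}{k} - \frac{1}{k + 1} \right) = \lim_{n \to \infty} \left( 1 - \frac{1}{n + 1} \right).$$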

The last telescoping series converges to 1. The Comparison Test implies that the series

$$\sum_{k=1}^{\infty} \frac{1}{2k^2}$$

converges. Lemma 3.3.1 tells us that

$$\sum_{k=1}^{\infty} \frac{1}{k^2}$$

converges.

As a second example, let's apply these ideas to Airy's equation (4.1). Suppose for convenience that the initial coefficients $a_0$ and $a_1$ are nonnegative. By virtue of (4.5),

$$0 \le a_k \le M = \max(a_0, a_1).$$

For $x \ge 0$,

$$0 \le a_k x^k \le M x^k.$$

The series $\sum M x^k = M \sum x^k$ is a constant times the geometric series, so it converges for $|x| < 1$, and in particular when $0 \le x < 1$. By the comparison test the power series $\sum a_k x^k$ also converges when $0 \le x < 1$. (In fact the series converges for all $x$ without any constraints on the coefficients $a_k$.)
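The first comparison example can be checked with exact arithmetic. This is a small sketch with hypothetical function names: it confirms that the partial sums of $\sum 1/(k^2+k)$ equal $1 - 1/(n+1)$ and that the partial sums of $\sum 1/(2k^2)$ stay below them, hence below 1.

```python
from fractions import Fraction

def telescoping_partial_sum(n):
    """Sum of 1/(k^2 + k) for k = 1..n, computed exactly."""
    return sum(Fraction(1, k * k + k) for k in range(1, n + 1))

def half_inverse_square_partial_sum(n):
    """Sum of 1/(2 k^2) for k = 1..n, computed exactly."""
    return sum(Fraction(1, 2 * k * k) for k in range(1, n + 1))

for n in (1, 10, 100):
    t = telescoping_partial_sum(n)
    h = half_inverse_square_partial_sum(n)
    # The smaller series is bounded by the telescoping one, which is below 1.
    print(n, float(h), float(t))
```

Because the partial sums of the smaller series are increasing and bounded by 1, the BMS Theorem applies to them, which is the content of the Comparison Test in this case.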


Theorem 4.2.3. (Integral Test) Suppose $f(x)$ is a decreasing positive continuous function defined for $x \ge 0$, and that $c_k = f(k)$. Then the series $\sum_{k=0}^{\infty} c_k$ converges if and only if

$$\int_0^{\infty} f(x)\, dx < \infty.$$

Proof. In the manner of chapter 2, consider left and right endpoint Riemann sums approximating the integral, with subintervals $[x_k, x_{k+1}] = [k, k + 1]$. Since the function $f$ is decreasing, the left endpoint sums give

$$\int_0^n f(x)\, dx \le s_n = \sum_{k=0}^{n-1} f(k) = \sum_{k=0}^{n-1} c_k.$$

The right endpoint sums yield the estimate

$$c_0 + \sum_{k=1}^{n} c_k = c_0 + \sum_{k=1}^{n} f(k) \le c_0 + \int_0^n f(x)\, dx.$$

Thus the sequence of partial sums is bounded if and only if

$$\int_0^{\infty} f(x)\, dx = \lim_{n \to \infty} \int_0^n f(x)\, dx < \infty.$$

Apply this theorem to the functions

$$f(x) = \frac{1}{x + 1}, \quad g(x) = \frac{1}{(x + 1)^2}.$$

Since

$$\int_0^n \frac{1}{x + 1}\, dx = \log(x + 1) \Big|_0^n = \log(n + 1),$$

and

$$\lim_{n \to \infty} \log(n + 1) = \infty,$$

the corresponding series

$$\sum_{k=0}^{\infty} \frac{1}{k + 1} = \sum_{m=1}^{\infty} \frac{1}{m}$$

diverges. Since

$$\int_0^n \frac{1}{(x + 1)^2}\, dx = -(x + 1)^{-1} \Big|_0^n = 1 - \frac{1}{n + 1},$$

and

$$\lim_{n \to \infty} \left( 1 - \frac{1}{n + 1} \right) = 1,$$

the series

$$\sum_{k=0}^{\infty} \frac{1}{(k + 1)^2} = \sum_{m=1}^{\infty} \frac{1}{m^2}$$

converges.
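The two inequalities in the proof of the Integral Test sandwich each partial sum between the integral and the integral plus $c_0$. The sketch below, with hypothetical function names and floating point sums, checks this sandwich for the two examples just treated, using the antiderivatives computed above.

```python
import math

def harmonic_partial_sum(n):
    """s_n = sum_{k=0}^{n-1} 1/(k+1) = 1 + 1/2 + ... + 1/n."""
    return sum(1.0 / (k + 1) for k in range(n))

def inverse_square_partial_sum(n):
    """s_n = sum_{k=0}^{n-1} 1/(k+1)^2."""
    return sum(1.0 / (k + 1) ** 2 for k in range(n))

for n in (1, 10, 1000, 100000):
    integral_f = math.log(n + 1)      # integral of 1/(x+1) over [0, n]
    integral_g = 1.0 - 1.0 / (n + 1)  # integral of 1/(x+1)^2 over [0, n]
    # Left endpoint bound: integral <= s_n; right endpoint bound: s_n <= c_0 + integral.
    print(n, integral_f, harmonic_partial_sum(n), 1.0 + integral_f)
    print(n, integral_g, inverse_square_partial_sum(n), 1.0 + integral_g)
```

The harmonic partial sums are pushed to infinity by the lower bound $\log(n + 1)$, while the partial sums of $\sum 1/m^2$ are trapped below 2.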

Theorem 4.2.4. (Ratio Test) Suppose that $c_k > 0$ for $k = 0, 1, 2, \dots$, and that

$$\lim_{k \to \infty} \frac{c_{k+1}}{c_k} = L.$$

Then the series $\sum c_k$ converges if $L < 1$ and diverges if $L > 1$.

Proof. First suppose that $L < 1$, and let $L_1$ be another number satisfying $L < L_1 < 1$. Since

$$\lim_{k \to \infty} \frac{c_{k+1}}{c_k} = L,$$

there is an integer $N$ such that

$$0 < \frac{c_{k+1}}{c_k} < L_1, \quad k \ge N.$$

This implies that for $k \ge N$ we have

$$c_k = c_N \, \frac{c_{N+1}}{c_N} \, \frac{c_{N+2}}{c_{N+1}} \cdots \frac{c_k}{c_{k-1}} \le c_N L_1^{k-N}.$$

A comparison with the geometric series is now helpful. For $m > N$,

$$s_m = \sum_{k=1}^{m} c_k = \sum_{k=1}^{N-1} c_k + \sum_{k=N}^{m} c_k \le \sum_{k=1}^{N-1} c_k + c_N \sum_{k=N}^{m} L_1^{k-N} \le \sum_{k=1}^{N-1} c_k + c_N \sum_{j=0}^{\infty} L_1^j = \sum_{k=1}^{N-1} c_k + c_N \frac{1}{1 - L_1}.$$

Since the sequence of partial sums is bounded, the series converges.

If instead $L > 1$, then a similar argument can be made with $1 < L_1 < L$. This time the comparison with the geometric series shows that the sequence of partial sums is unbounded, and so the series diverges.

The ratio test provides an easy means of checking convergence of the usual Taylor series for $e^x$ when $x > 0$. The series is

$$\sum_{k=0}^{\infty} \frac{x^k}{k!},$$

with $c_k = x^k / k!$ and

$$\frac{c_{k+1}}{c_k} = \frac{x^{k+1}}{(k + 1)!} \cdot \frac{k!}{x^k} = \frac{x}{k + 1}.$$

Clearly $\lim_{k \to \infty} c_{k+1}/c_k = 0$ for any fixed $x > 0$, so the series converges.

As another example consider the series

$$\sum_{k=0}^{\infty} k x^k, \quad x > 0.$$

The ratios are

$$\frac{c_{k+1}}{c_k} = \frac{(k + 1) x^{k+1}}{k x^k} = x \, \frac{k + 1}{k}.$$

Since

$$\lim_{k \to \infty} \frac{c_{k+1}}{c_k} = x,$$

the series converges when $0 < x < 1$.
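The two ratio computations above are easy to tabulate. The following sketch (function names are hypothetical) evaluates $c_{k+1}/c_k$ directly from the terms and can be compared with the simplified expressions $x/(k+1)$ and $x(k+1)/k$.

```python
import math

def ratio_exp_terms(x, k):
    """c_{k+1}/c_k for c_k = x^k / k!."""
    c_k = x ** k / math.factorial(k)
    c_next = x ** (k + 1) / math.factorial(k + 1)
    return c_next / c_k

def ratio_kxk_terms(x, k):
    """c_{k+1}/c_k for c_k = k x^k, with k >= 1."""
    return ((k + 1) * x ** (k + 1)) / (k * x ** k)

for k in (1, 10, 100):
    # First column tends to 0 for fixed x; second column tends to x.
    print(k, ratio_exp_terms(5.0, k), ratio_kxk_terms(0.5, k))
```

The values of $x$ used here ($5$ and $0.5$) are arbitrary choices for illustration; for very large $k$ the floating point powers would eventually overflow or underflow.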

4.3 General series

We next consider convergence of series whose terms $c_k$ need not be positive. Other methods are now needed to show that the sequence of partial sums

$$s_n = \sum_{k=0}^{n-1} c_k$$

has a limit. To begin, note that if a series converges, then the terms $c_k$ must have 0 as their limit.

Lemma 4.3.1. If $\sum c_k$ converges then

$$\lim_{k \to \infty} c_k = 0.$$

Proof. Let $\epsilon > 0$, and define $\epsilon_1 = \epsilon/2$. Since there is a number $S$ such that

$$\lim_{n \to \infty} s_n = S,$$

there is an $N_{\epsilon_1}$ such that

$$|s_n - S| < \epsilon_1, \quad n \ge N_{\epsilon_1}.$$

Notice that for $n \ge N_{\epsilon_1}$,

$$|c_n| = |s_{n+1} - s_n| = |s_{n+1} - S + S - s_n| \le |s_{n+1} - S| + |S - s_n| < \epsilon,$$

which is what we wanted to show.

4.3.1 Absolute convergence

The main technique for establishing convergence of a general series $\sum c_k$ is to study the related positive series $\sum_{k=0}^{\infty} |c_k|$. Say that a series $\sum_{k=0}^{\infty} c_k$ converges absolutely if $\sum_{k=0}^{\infty} |c_k|$ converges.

Theorem 4.3.2. If a series $\sum_{k=0}^{\infty} c_k$ converges absolutely, then it converges.

Proof. If

$$s_n = \sum_{k=0}^{n-1} |c_k|$$

is the $n$-th partial sum for the series $\sum_{k=0}^{\infty} |c_k|$, then the sequence $\{s_n\}$ increases to its limit $S$. This means in particular that $s_n \le S$.

Let $\{a_j\}$ and $\{-b_j\}$ be respectively the sequences of nonnegative and negative terms from the sequence $\{c_k\}$, as illustrated below. (One of these sequences may be a finite list rather than an infinite sequence.)

$$c_0, c_1, c_2, \dots = |c_0|, |c_1|, -|c_2|, |c_3|, -|c_4|, |c_5|, |c_6|, \dots,$$

$$a_0 = c_0, \ a_1 = c_1, \ a_2 = c_3, \ a_3 = c_5, \ a_4 = c_6, \quad b_0 = |c_2|, \ b_1 = |c_4|, \dots.$$

If $\sigma_m$ is a partial sum of the series $\sum_j a_j$, then for some $n \ge m$ we have

$$\sigma_m \le s_n \le S.$$

Since the partial sums of the positive series $\sum_j a_j$ are bounded, the series converges to a number $A$. By a similar argument the positive series $\sum_j b_j$ converges to a number $B$.

Let's show that

$$\sum_{k=0}^{\infty} c_k = A - B.$$

Pick $\epsilon > 0$, and as usual let $\epsilon_1 = \epsilon/2$. There are numbers $N_1$ and $N_2$ such that

$$\Big| \sum_{j=0}^{n} a_j - A \Big| < \epsilon_1, \quad n \ge N_1, \qquad \Big| \sum_{j=0}^{n} b_j - B \Big| < \epsilon_1, \quad n \ge N_2.$$

Find a number $N_3$ such that the list $c_0, \dots, c_{N_3}$ contains at least the nonnegative terms $a_0, \dots, a_{N_1}$ and the negative terms $-b_0, \dots, -b_{N_2}$. For $n \ge N_3$ let $n_1$ and $n_2$ be the largest indices of the terms $a_j$ and $b_j$ appearing among $c_0, \dots, c_n$, so that $\sum_{k=0}^{n} c_k = \sum_{j=0}^{n_1} a_j - \sum_{j=0}^{n_2} b_j$. Then for $n \ge N_3$ we have

$$\Big| \sum_{k=0}^{n} c_k - (A - B) \Big| \le \Big| \sum_{j=0}^{n_1} a_j - A \Big| + \Big| B - \sum_{j=0}^{n_2} b_j \Big| < \epsilon_1 + \epsilon_1 = \epsilon,$$

since $n_1 \ge N_1$ and $n_2 \ge N_2$.

This theorem may be used in conjunction with the tests for convergence of positive series. As an example, reconsider the usual power series for $e^x$. The series is

$$\sum_{k=0}^{\infty} \frac{x^k}{k!}.$$

Replace the terms in this series by their absolute values,

$$\sum_{k=0}^{\infty} \frac{|x|^k}{k!}.$$

If $x = 0$ the series converges, and if $x \ne 0$ we may apply the ratio test with $c_k = |x|^k / k!$ to get

$$\frac{c_{k+1}}{c_k} = \frac{|x|^{k+1}}{(k + 1)!} \cdot \frac{k!}{|x|^k} = \frac{|x|}{k + 1}.$$

Clearly $\lim_{k \to \infty} c_{k+1}/c_k = 0$ for any fixed $x$, so the original series for $e^x$ converges since it converges absolutely.
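As a numerical companion to this example, the sketch below sums the series for a negative value of $x$, where the terms alternate in sign, and also sums the series of absolute values. The function name and the choices $x = -3$ and 60 terms are assumptions made for illustration; `math.exp` is used only as a reference value.

```python
import math

def exp_series(x, terms):
    """Partial sum of sum_{k=0}^{terms-1} x^k / k!, built term by term."""
    total, term = 0.0, 1.0
    for k in range(terms):
        total += term
        term *= x / (k + 1)  # turns x^k / k! into x^{k+1} / (k+1)!
    return total

x = -3.0
signed = exp_series(x, 60)         # the original series at x
absolute = exp_series(abs(x), 60)  # the series of absolute values
print(signed, math.exp(x))
print(absolute, math.exp(abs(x)))
```

The series of absolute values at $x$ is the original series at $|x|$, so its partial sums are bounded by $e^{|x|}$, in line with the absolute convergence argument.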


4.3.2 Alternating series

There is a more specialized convergence test which may be used to prove convergence of some series which may not be absolutely convergent.

Theorem 4.3.3. (Alternating series test) Suppose that $c_k > 0$, $c_{k+1} \le c_k$, and $\lim_{k \to \infty} c_k = 0$. Then the series

$$\sum_{k=1}^{\infty} (-1)^{k+1} c_k = c_1 - c_2 + c_3 - c_4 + \dots$$

converges. Furthermore, if $S$ is the sum of the series, and $\{s_n\}$ is its sequence of partial sums, then

$$s_{2m} \le S \le s_{2m-1}, \quad m \ge 1.$$

Proof. For $m \ge 1$ define new sequences $e_m = s_{2m}$ and $o_m = s_{2m-1}$. The proof is based on some observations about these sequences of partial sums with even and odd indices. Notice that

$$e_{m+1} = s_{2m+2} = \sum_{k=1}^{2m+2} (-1)^{k+1} c_k = s_{2m} + c_{2m+1} - c_{2m+2} = e_m + [c_{2m+1} - c_{2m+2}].$$

The assumption $c_{k+1} \le c_k$ means that $c_{2m+1} - c_{2m+2} \ge 0$, so that

$$e_{m+1} \ge e_m.$$

Essentially the same argument shows that

$$o_{m+1} \le o_m.$$

Since $c_k \ge 0$,

$$e_m = s_{2m-1} - c_{2m} = o_m - c_{2m} \le o_m \le o_1.$$

Similarly,

$$o_{m+1} = s_{2m} + c_{2m+1} = e_m + c_{2m+1} \ge e_m \ge e_1.$$

Thus the sequence $e_m$ is increasing and bounded above, while the sequence $o_m$ is decreasing and bounded below. By the BMS Theorem (Theorem 4.1.1) the sequences $e_m$ and $o_m$ have limits $E$ and $O$ respectively.

Now

$$O - E = \lim_{m \to \infty} o_m - \lim_{m \to \infty} e_m = \lim_{m \to \infty} (o_m - e_m) = \lim_{m \to \infty} (s_{2m-1} - s_{2m}) = \lim_{m \to \infty} c_{2m} = 0,$$

or $O = E$. Take $S = O = E$. Since the even partial sums $s_{2m}$ increase to $S$, and the odd partial sums $s_{2m-1}$ decrease to $S$, the conclusion

$$s_{2m} \le S \le s_{2m-1}, \quad m \ge 1$$

is established.

Finally, to see that $S = \lim_{n \to \infty} s_n$, let $\epsilon > 0$. Since the even partial sum sequence converges to $S$, there is a number $M_1$ such that

$$|e_m - S| = |s_{2m} - S| < \epsilon, \quad \text{if } m \ge M_1.$$

There is a corresponding number $M_2$ for the odd partial sum sequence,

$$|o_m - S| = |s_{2m-1} - S| < \epsilon, \quad \text{if } m \ge M_2.$$

Consequently if $N = \max(2M_1, 2M_2 - 1)$, then whenever $n \ge N$ we have

$$|s_n - S| < \epsilon.$$

When a series is alternating it is possible to be quite precise about the speed of convergence. Start with the inequality

$$s_{2m} \le S \le s_{2m-1}, \quad m \ge 1.$$

This implies that

$$S - s_{2m} \le s_{2m-1} - s_{2m} = c_{2m}$$

and similarly

$$s_{2m-1} - S \le s_{2m-1} - s_{2m} = c_{2m}.$$

On the other hand

$$(s_{2m-1} - S) + (S - s_{2m}) = s_{2m-1} - s_{2m} = c_{2m},$$

so either $|s_{2m-1} - S| \ge c_{2m}/2$ or $|S - s_{2m}| \ge c_{2m}/2$.
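These error estimates can be tested on the alternating harmonic series, $c_k = 1/k$. The sketch below assumes the classical value $S = \log 2$ for its sum, a fact not established in this section, and uses a hypothetical function name.

```python
import math

def alternating_harmonic_partial_sum(n):
    """s_n = sum_{k=1}^{n} (-1)^{k+1} / k."""
    return sum((-1) ** (k + 1) / k for k in range(1, n + 1))

S = math.log(2.0)  # assumed known sum of 1 - 1/2 + 1/3 - ...

for m in (1, 5, 50):
    even = alternating_harmonic_partial_sum(2 * m)     # s_{2m}
    odd = alternating_harmonic_partial_sum(2 * m - 1)  # s_{2m-1}
    c_2m = 1.0 / (2 * m)
    # Both errors are at most c_{2m}; at least one is at least c_{2m}/2.
    print(m, S - even, odd - S, c_2m)
```

In each printed row the two errors add up to $c_{2m}$, which is why neither can exceed $c_{2m}$ and both cannot be smaller than $c_{2m}/2$.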


As an example of an alternating series, take $c_k = 1/\log_{10}(k + 1)$. The alternating series $\sum_{k=1}^{\infty} (-1)^k / \log_{10}(k + 1)$ converges, but since $c_k \ge 1/(k + 1)$, comparison with the harmonic series shows that the series does not converge absolutely. The difference between the partial sum $s_n$ and the sum $S$ is on the order of the last term $c_n$, so to ensure that $|s_n - S| < 10^{-6}$, for instance, we would take $\log_{10}(n + 1) > 10^6$, or $n + 1 > 10^{1{,}000{,}000}$.

4.3.3 Power series

Now that the basic series-related weapons are in our arsenal, we return to questions of convergence of power series. The basic theorem is the following.

Theorem 4.3.4. Suppose the power series

$$\sum_{k=0}^{\infty} a_k (x - x_0)^k$$

converges for $x = x_1 \ne x_0$. Then the series converges absolutely for $|x - x_0| < |x_1 - x_0|$.

Proof. Since the series $\sum_{k=0}^{\infty} a_k (x_1 - x_0)^k$ converges, Lemma 4.3.1 says that

$$\lim_{k \to \infty} a_k (x_1 - x_0)^k = 0.$$

Therefore the sequence $a_k (x_1 - x_0)^k$ is bounded, and there must be a number $M$ such that

$$|a_k (x_1 - x_0)^k| \le M, \quad k = 0, 1, 2, \dots.$$

Testing for absolute convergence, suppose $|x - x_0| < |x_1 - x_0|$. Let

$$r = \frac{|x - x_0|}{|x_1 - x_0|} < 1.$$

Then

$$|a_k (x - x_0)^k| = |a_k (x_1 - x_0)^k| \, \Big| \frac{(x - x_0)^k}{(x_1 - x_0)^k} \Big| \le M r^k.$$

Since $0 \le r < 1$ the power series converges by comparison with the geometric series.


This theorem justifies defining the radius of convergence of a power series, which is the largest number $R$ such that the power series converges for all $|x - x_0| < R$. If the power series converges for all $x$ we say the radius of convergence is $\infty$. In some cases the radius of convergence of a power series may be readily computed.

Theorem 4.3.5. Suppose that $|a_k| \ne 0$, and

$$\lim_{k \to \infty} \frac{|a_k|}{|a_{k+1}|} = L.$$

Then the radius of convergence of the power series $\sum a_k (x - x_0)^k$ is $L$. (The case $L = \infty$ is included.)

Proof. When $L \ne 0$ and $L \ne \infty$ the ratio test may be applied to

$$\sum_{k=0}^{\infty} |a_k| |x - x_0|^k.$$

In this case

$$\lim_{k \to \infty} \Big| \frac{a_{k+1} (x - x_0)^{k+1}}{a_k (x - x_0)^k} \Big| = \frac{|x - x_0|}{L},$$

so by the ratio test the power series converges absolutely if $|x - x_0| < L$. On the other hand if $|x - x_0| > L$, the power series does not converge absolutely, so by the previous theorem it cannot converge at all for any $|x - x_0| > L$.

The same argument only requires slight modifications in case $L = 0$ or $L = \infty$.

Example 1: The series for $e^x$ centered at $x = 0$ is

$$\sum_{k=0}^{\infty} \frac{x^k}{k!}.$$

In this case

$$\frac{|a_k|}{|a_{k+1}|} = k + 1 \to \infty$$

and the series converges for all $x$.

Example 2: Suppose that our series has the form

$$\sum_{k=0}^{\infty} k^m x^k$$

for some positive integer $m$. Then

$$\frac{|a_k|}{|a_{k+1}|} = \frac{k^m}{(k + 1)^m} \to 1,$$

so the power series has radius of convergence 1.
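The quotients $|a_k|/|a_{k+1}|$ from Theorem 4.3.5 can be computed exactly for both examples. This is a small sketch with a hypothetical helper; the exponent $m = 3$ in the second example is an arbitrary choice.

```python
from fractions import Fraction
import math

def coefficient_ratio(a, k):
    """|a_k| / |a_{k+1}| for a coefficient function a(k) with nonzero values."""
    return abs(Fraction(a(k))) / abs(Fraction(a(k + 1)))

def exp_coefficient(k):
    """Coefficient 1/k! of x^k in the series for e^x (Example 1)."""
    return Fraction(1, math.factorial(k))

def power_coefficient(k, m=3):
    """Coefficient k^m of x^k (Example 2), here with m = 3."""
    return k ** m

for k in (1, 10, 100, 1000):
    # Example 1 ratios grow like k + 1; Example 2 ratios approach 1.
    print(k, coefficient_ratio(exp_coefficient, k),
          float(coefficient_ratio(power_coefficient, k)))
```

A finite table of quotients only suggests the limit $L$; the limit itself still has to be justified as in the examples.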

4.4 Grouping and rearrangement

To motivate some further developments in the theory of infinite series, recall the power series solutions of Airy's equation (4.1). In that case the formal solutions of the differential equation were given by $\sum a_k x^k$ where the coefficients $a_k$ came in three types:

$$a_{3k} = \frac{a_0}{2 \cdot 3 \cdot 5 \cdot 6 \cdots (3k - 1) \cdot (3k)},$$

$$a_{3k+1} = \frac{a_1}{3 \cdot 4 \cdot 6 \cdot 7 \cdots (3k) \cdot (3k + 1)},$$

$$a_{3k+2} = 0.$$

Here $k = 0, 1, 2, \dots$. Recall that $a_0$ and $a_1$ may be chosen arbitrarily.

Given the structure of this series, it is tempting to split it in three, looking at

$$\sum_{k=0}^{\infty} a_{3k} x^{3k}, \quad \text{and} \quad \sum_{k=0}^{\infty} a_{3k+1} x^{3k+1}$$

separately. Of course the third part is simply 0. Rewrite the first of these constituent series as

$$\sum_{k=0}^{\infty} a_{3k} x^{3k} = \sum_{j=0}^{\infty} \alpha_j (x^3)^j.$$

It is easy to check that $|\alpha_j| \le |a_0|/j!$, so by comparison with the series $\sum |a_0| |x^3|^j / j!$, which converges by the ratio test, this series will converge absolutely for all values of $x^3$, which means for all values of $x$. The second constituent series can be treated in the same fashion.
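The bound on the coefficients $\alpha_j = a_{3j}$ can be verified exactly for the first several $j$. The sketch below takes $a_0 = 1$ as an assumption and uses hypothetical function names; it also sums the first constituent series in floating point to show how quickly the partial sums settle.

```python
from fractions import Fraction
import math

def alpha(j, a0=1):
    """alpha_j = a_{3j} = a0 / (2*3 * 5*6 * ... * (3j-1)*(3j)), as in (4.5)."""
    denom = 1
    for i in range(1, j + 1):
        denom *= (3 * i - 1) * (3 * i)
    return Fraction(a0) / denom

def first_constituent(x, terms=30, a0=1):
    """Partial sum of sum_j alpha_j (x^3)^j, in floating point."""
    t = x ** 3
    return sum(float(alpha(j, a0)) * t ** j for j in range(terms))

# Each factor (3i-1)(3i) is at least i, so alpha_j <= 1/j! when a0 = 1.
print(all(alpha(j) <= Fraction(1, math.factorial(j)) for j in range(20)))
print(first_constituent(2.0, 30), first_constituent(2.0, 40))
```

The bound is far from sharp: the denominators of $\alpha_j$ grow much faster than $j!$, which is why a few dozen terms already agree to machine precision at $x = 2$.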

What is still missing is a license to shuffle the three constituent convergent series together, so that convergence of the original series may be determined. As more elementary examples, consider the convergence of the following variation on the alternating harmonic series,

$$1 + \frac{1}{2} - \frac{1}{3} - \frac{1}{4} + \frac{1}{5} + \frac{1}{6} - \frac{1}{7} - \frac{1}{8} + \dots, \tag{4.7}$$

the rearranged alternating harmonic series,

$$1 + \frac{1}{3} - \frac{1}{2} + \frac{1}{5} + \frac{1}{7} - \frac{1}{4} + \frac{1}{9} + \frac{1}{11} - \frac{1}{6} + \dots, \tag{4.8}$$

or the series with grouped terms

$$1 + \Big( -\frac{1}{2} + \frac{1}{3} \Big) + \Big( -\frac{1}{4} + \frac{1}{5} - \frac{1}{6} \Big) + \Big( \frac{1}{7} - \frac{1}{8} + \frac{1}{9} - \frac{1}{10} \Big) + \dots. \tag{4.9}$$

These series are closely related to a previously treated series; it would be nice to have some guidelines for analyzing their convergence.

As a first step toward this end, let's begin with the notion of a rearrangement of an infinite series. Suppose that $\{p(k), k = 1, 2, 3, \dots\}$ is a sequence of positive integers which includes each positive integer exactly once. The infinite series

$$\sum_{k=1}^{\infty} c_{p(k)}$$

is said to be a rearrangement of the series $\sum_k c_k$. If the series $\sum_k c_k$ converges absolutely, the story about rearrangements is simple and satisfying.

Theorem 4.4.1. If the series

$$\sum_{k=1}^{\infty} c_k$$

converges absolutely, then any rearrangement of the series also converges absolutely, with the same sum.

Proof. Suppose

$$A = \sum_{k=1}^{\infty} |c_k|.$$

Any partial sum of the rearranged series satisfies

$$\sum_{k=1}^{m} |c_{p(k)}| \le \sum_{k=1}^{n} |c_k| \le A$$

if $n$ is sufficiently large, since every term in the left sum will also appear in the right sum. The partial sums of the rearranged series form a bounded increasing sequence, so the rearranged series is absolutely convergent.

Let

$$S = \sum_{k=1}^{\infty} c_k.$$

For any $\epsilon > 0$ there is an $N$ such that $n > N$ implies

$$\sum_{k=N+1}^{n} |c_k| < \epsilon/2,$$

and

$$\Big| S - \sum_{k=1}^{N} c_k \Big| < \epsilon/2.$$

Find $M$ such that each of the terms $c_k$ for $k = 1, \dots, N$ appears in the partial sum

$$\sum_{k=1}^{M} c_{p(k)}.$$

For any $m \ge M$, and for $n$ sufficiently large,

$$\Big| S - \sum_{k=1}^{m} c_{p(k)} \Big| = \Big| \Big( S - \sum_{k=1}^{N} c_k \Big) + \Big( \sum_{k=1}^{N} c_k - \sum_{k=1}^{m} c_{p(k)} \Big) \Big| \le \Big| S - \sum_{k=1}^{N} c_k \Big| + \sum_{k=N+1}^{n} |c_k| < \epsilon.$$

If a series converges, but does not converge absolutely, the series is said to converge conditionally. The situation is less satisfactory for rearrangements of conditionally convergent series. In fact, as Riemann discovered [16, p. 67], given any real number $x$, there is a rearrangement of a conditionally convergent series which converges to $x$.
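The effect of rearranging a conditionally convergent series can be seen numerically on the rearranged alternating harmonic series (4.8). The sketch below sums (4.8) in blocks of three terms, $\frac{1}{4j-3} + \frac{1}{4j-1} - \frac{1}{2j}$, so it only samples every third partial sum. It relies on two classical facts that are assumptions here, not results proved in this section: the alternating harmonic series sums to $\log 2$, while the rearrangement (4.8) sums to $\frac{3}{2} \log 2$. The function name is hypothetical.

```python
import math

def rearranged_partial_sum(blocks):
    """Sum of the first `blocks` blocks of (4.8): 1/(4j-3) + 1/(4j-1) - 1/(2j)."""
    total = 0.0
    for j in range(1, blocks + 1):
        total += 1.0 / (4 * j - 3) + 1.0 / (4 * j - 1) - 1.0 / (2 * j)
    return total

for blocks in (10, 1000, 100000):
    # Compare with log 2 (original order) and 1.5 * log 2 (rearranged order).
    print(blocks, rearranged_partial_sum(blocks), math.log(2), 1.5 * math.log(2))
```

The same terms, taken in a different order, drift toward a different number, which is the phenomenon behind Riemann's result.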

Conditionally convergent series can be rearranged without changing the sum if a tight rein is kept on the rearrangement. Let us call a rearrangement bounded if there is a number $C$ such that

$$|p(k) - k| \le C.$$


The following result is mentioned in [17] along with a variety of more sophisticated tests.

Theorem 4.4.2. If the series

$$\sum_{k=1}^{\infty} c_k$$

converges, then any bounded rearrangement of the series converges to the same sum.

Proof. Consider the partial sums for the original series and the rearranged series,

$$s_n = \sum_{k=1}^{n} c_k, \quad \sigma_n = \sum_{k=1}^{n} c_{p(k)}.$$

Compare these sums by writing

$$\sigma_n = s_n - o_n + i_n,$$

where

$$o_n = \sum_{p(k) \le n, \ k > n} c_{p(k)}$$

is the sum of terms from $s_n$ omitted in $\sigma_n$, while

$$i_n = \sum_{p(k) > n, \ k \le n} c_{p(k)}$$

is the sum of terms from $\sigma_n$ which are not in $s_n$.

The condition $|p(k) - k| \le C$ implies that the sums $o_n$ and $i_n$ contain no more than $C$ terms each. Since $\lim_{k \to \infty} c_k = 0$ and each term $c_k$ appears exactly once as a term $c_{p(k)}$,

$$\lim_{n \to \infty} o_n = 0, \quad \lim_{n \to \infty} i_n = 0.$$

It follows that $\lim_n \sigma_n = \lim_n s_n$.

There are several circumstances in which convergent series may be altered without changing their sum if the original ordering of the terms is respected. The first illustration shows that blocks of terms from a convergent series may be added first, as in (4.9), without changing the sum. Recall that $\mathbb{N}$ denotes the set of positive integers $1, 2, 3, \dots$.


Theorem 4.4.3. Suppose that $m(k) : \mathbb{N} \to \mathbb{N}$ is a strictly increasing function, with $m(1) = 1$. If the series

$$\sum_{k=1}^{\infty} c_k$$

converges, then so does

$$\sum_{k=1}^{\infty} \Big( \sum_{j=m(k)}^{m(k+1)-1} c_j \Big),$$

and the sums are the same.

Proof. Let $s_n$ be the $n$-th partial sum for the original series, and let $t_n$ be the $n$-th partial sum for the series with grouped terms. Then

$$t_n = \sum_{k=1}^{n} \Big( \sum_{j=m(k)}^{m(k+1)-1} c_j \Big) = \sum_{k=1}^{m(n+1)-1} c_k = s_{m(n+1)-1}.$$

Since the sequence $\{t_n\}$ of partial sums of the series with grouped terms is a subsequence of the sequence $\{s_n\}$, the result is established.
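The identity $t_n = s_{m(n+1)-1}$ at the heart of this proof can be confirmed exactly on the grouped series (4.9). In the sketch below the grouping function $m(k) = 1 + k(k-1)/2$ is an assumption chosen so that the blocks have lengths $1, 2, 3, 4, \dots$ as in (4.9); the function names are hypothetical.

```python
from fractions import Fraction

def c(k):
    """Terms of the alternating harmonic series, c_k = (-1)^{k+1} / k."""
    return Fraction((-1) ** (k + 1), k)

def m(k):
    """Start index of the k-th block: m(1), m(2), ... = 1, 2, 4, 7, 11, ..."""
    return 1 + k * (k - 1) // 2

def s(n):
    """n-th partial sum of the original series."""
    return sum(c(k) for k in range(1, n + 1))

def t(n):
    """n-th partial sum of the grouped series; block k runs over m(k)..m(k+1)-1."""
    return sum(sum(c(j) for j in range(m(k), m(k + 1))) for k in range(1, n + 1))

for n in range(1, 6):
    # The grouped partial sums pick out a subsequence of the original ones.
    print(n, m(n + 1) - 1, t(n) == s(m(n + 1) - 1))
```

Grouping never reorders terms, so it only skips some of the original partial sums; that is why convergence of the original series is inherited.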

It is also possible to shuffle convergent series without losing convergence. Suppose there are $M$ series

$$\sum_{j=1}^{\infty} c_{j,m}, \quad m = 1, \dots, M.$$

Say that the series $\sum_k a_k$ is a shuffle of the series $\sum_j c_{j,m}$ if there are $M$ strictly increasing functions $k_m : \mathbb{N} \to \mathbb{N}$, for $m = 1, \dots, M$, such that every positive integer occurs exactly once as some $k_m(j)$ (that is, the sets $\{k_m(j)\}$ form a partition of the positive integers), and that $a_{k_m(j)} = c_{j,m}$. For example, the series (4.7) is a shuffle of the alternating series

$$1 - \frac{1}{3} + \frac{1}{5} - \frac{1}{7} + \dots, \quad \text{and} \quad \frac{1}{2} - \frac{1}{4} + \frac{1}{6} - \frac{1}{8} + \dots,$$

with $k_1(j) = 2j - 1$ and $k_2(j) = 2j$.


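Theorem 4.4.4. Suppose that the series

$$\sum_{j=1}^{\infty} c_{j,m}, \quad m = 1, \dots, M,$$

are convergent, with sums $S_m$. If the series

$$\sum_{k=1}^{\infty} a_k$$

is a shuffle of the series $\sum_j c_{j,m}$, then $\sum_k a_k$ converges to $S_1 + \cdots + S_M$.

Proof. Given $\epsilon > 0$ there is an integer $N$ such that

$$\Big| S_m - \sum_{j=1}^{n} c_{j,m} \Big| < \epsilon/M, \quad n \ge N.$$

Choose $L$ so large that all terms $c_{j,m}$ for $j \le N$ and $m = 1, \dots, M$ appear in the sum

$$\sum_{k=1}^{L} a_k.$$

For some $N_1 \ge N, \dots, N_M \ge N$,

$$\Big| S_1 + \cdots + S_M - \sum_{k=1}^{L} a_k \Big| = \Big| \Big[ S_1 - \sum_{j=1}^{N_1} c_{j,1} \Big] + \cdots + \Big[ S_M - \sum_{j=1}^{N_M} c_{j,M} \Big] \Big| < \epsilon.$$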
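The shuffle (4.7) gives a concrete test of this theorem. The sketch below assumes two classical sums that are not derived in this section: $1 - \frac{1}{3} + \frac{1}{5} - \dots = \pi/4$ and $\frac{1}{2} - \frac{1}{4} + \frac{1}{6} - \dots = \frac{1}{2} \log 2$. Under those assumptions the theorem predicts that (4.7) sums to $\pi/4 + \frac{1}{2} \log 2$. The function name is hypothetical.

```python
import math

def shuffled_partial_sum(n_terms):
    """Partial sum of (4.7), where a_k = c_{j,1} if k = 2j-1 and c_{j,2} if k = 2j."""
    total = 0.0
    for k in range(1, n_terms + 1):
        j = (k + 1) // 2                     # index within the constituent series
        sign = 1.0 if j % 2 == 1 else -1.0   # both constituents alternate in j
        total += sign / k
    return total

predicted = math.pi / 4 + math.log(2) / 2  # S_1 + S_2 under the stated assumptions
for n_terms in (10, 1000, 200000):
    print(n_terms, shuffled_partial_sum(n_terms), predicted)
```

Convergence is slow, with an error on the order of $1/n$ after $n$ terms, so the agreement improves only gradually as more terms are included.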


4.5 Problems

1. Following the treatment of Airy's equation (4.1) find a power series solution $\sum_{k=0}^{\infty} a_k x^k$ for the equation

$$\frac{dy}{dx} = y.$$

Express the coefficients $a_k$ in terms of the first coefficient $a_0$. Now treat the slightly more general equation

$$\frac{dy}{dx} = \alpha y,$$

where $\alpha$ is a constant.

2. Using the treatment of Airy's equation (4.1) as a model, find a power series solution $\sum_{k=0}^{\infty} a_k x^k$ for the Hermite equation

$$\frac{d^2 y}{dx^2} - 2x \frac{dy}{dx} + 2\alpha y = 0,$$

where $\alpha$ is a constant. Express the coefficients $a_k$ in terms of the first two coefficients $a_0$, $a_1$. What happens if $\alpha$ happens to be a positive integer?

3. Use Theorem 3.2.3 and a bit of algebra to extend the proof of the BMS Theorem from the case $0 \le s_n < 1$ to the general case.

4. Define a decreasing sequence, and state a version of the BMS Theorem for decreasing sequences. Use Theorem 3.2.3, a bit of algebra, and the statement of the original BMS Theorem to prove your new theorem.

5. Suppose that $c_k \ge 0$ for $k = 1, 2, 3, \dots$, and that $\sum c_k$ converges. Show that if $0 \le b_k \le M$ for some number $M$, then $\sum b_k c_k$ converges.

6. Consider the convergence of the following series.

(a) Show that the series

$$\sum_{k=1}^{\infty} \frac{1}{k 2^k}$$

converges.

(b) Show that the series

$$\sum_{k=0}^{\infty} e^{-k} |\sin(k)|$$

converges.

7. Show that the series $\sum k e^{-k}$ converges. Show that if $m$ is a fixed positive integer then the series $\sum k^m e^{-k}$ converges.

8. Consider the following series.

(a) Show that the series

$$\sum_{k=0}^{\infty} \frac{2^k}{k!}$$

converges.

(b) Show that the series

$$\sum_{k=0}^{\infty} \frac{k^k}{k!}$$

diverges.

9. Suppose that $c_k \ge 0$ and

$$\lim_{k \to \infty} c_k = r > 0.$$

Show that $\sum c_k$ diverges.

10. For which $p > 0$ does

$$\sum_{k=1}^{\infty} \frac{1}{k^p}$$

converge? Justify your answer.

11. Assume that $c_k \ge 0$ and $\sum_k c_k$ converges. Suppose that there is a sequence $\{a_k\}$, a positive integer $N$, and a positive real number $r$ such that $0 \le a_k \le r c_k$ for $k \ge N$. Show that $\sum_k a_k$ converges.

12. Assuming that $k^2 + ak + b \ne 0$ for $k = 1, 2, 3, \dots$, show that the series

$$\sum_{k=1}^{\infty} \frac{1}{k^2 + ak + b}$$

converges.

13. (Root Test) Suppose that $c_k \ge 0$ for $k = 1, 2, 3, \dots$, and that

$$\lim_{k \to \infty} c_k^{1/k} = L.$$

Show that the series $\sum c_k$ converges if $L < 1$ and diverges if $L > 1$.

14. Show that the series

$$\sum_{k=1}^{\infty} \frac{\sin(k)}{k^2 + 1}$$

converges.

15. Find an example of a divergent series $\sum c_k$ for which $\lim_{k \to \infty} c_k = 0$.

16. A series is said to converge conditionally if $\sum c_k$ converges, but $\sum |c_k|$ diverges. Suppose that the series $\sum c_k$ converges conditionally.

(a) Show that there are infinitely many positive and negative terms $c_k$.

(b) Let $a_j$ be the $j$-th nonnegative term in the sequence $\{c_k\}$, and let $b_j$ be the $j$-th negative term in the sequence $\{c_k\}$. Show that both series $\sum a_j$ and $\sum b_j$ diverge.

17. Consider the following series questions.

(a) Establish the convergence of the series

$$\sum_{k=1}^{\infty} \frac{k + 1}{k^3 + 6}.$$

(b) Suppose that $p(k)$ and $q(k)$ are polynomials. State and prove a theorem about the convergence of the series

$$\sum_{k=1}^{\infty} \frac{p(k)}{q(k)}.$$

18. Consider the convergence of the following series.

(a) Show that the series

$$\sum_{k=1}^{\infty} \sin(1/k^2)$$

converges, but that the series

$$\sum_{k=1}^{\infty} \sin(1/k)$$

diverges.

(b) Does the series

$$\sum_{k=1}^{\infty} [1 - \cos(1/k)]$$

converge?

(c) For what values of $p$ does the series

$$\sum_{k=1}^{\infty} [\log(k^p + 1) - \log(k^p)]$$

converge?

19. Prove convergence of the series

$$\sum_{k=1}^{\infty} \frac{(-1)^k}{k} + \frac{1}{k^2}.$$

20. In the paragraph following the alternating series test theorem we showed that

$$S - s_{2m} \le s_{2m-1} - s_{2m} = c_{2m}, \qquad s_{2m-1} - S \le s_{2m-1} - s_{2m} = c_{2m},$$

and either $|s_{2m-1} - S| \ge c_{2m}/2$ or $|S - s_{2m}| \ge c_{2m}/2$. Starting with

$$s_{2m+1} - s_{2m} = c_{2m+1},$$

develop similar estimates comparing the differences $|s_{2m+1} - S|$ and $|s_{2m} - S|$ to $c_{2m+1}$.

21. For which values of $x$ does the power series

$$\sum_{k=1}^{\infty} \frac{(x - 1)^k}{k}$$

converge, and for which does it diverge?

22. For which values of $x$ does the series

$$\sum_{k=1}^{\infty} (2x + 3)^k$$

converge, and for which does it diverge?

23. Show that if $p(k)$ is a (nontrivial) polynomial, then the power series

$$\sum_{k=1}^{\infty} p(k) x^k$$

converges if $|x| < 1$, but diverges if $|x| > 1$.


24. Find the radius of convergence for each of the following series:

(a) $\sum_{k=1}^{\infty} (2^k + 10) x^k$, (b) $\sum_{k=1}^{\infty} k! (x - 5)^k$,

(c) $\sum_{k=1}^{\infty} \frac{k^{k/2}}{k!} x^k$, (d) $\sum_{k=1}^{\infty} \tan^{-1}(k) x^k$.

25. Suppose $|c_k| > 0$ and

$$\lim_{k \to \infty} \frac{|c_{k+1}|}{|c_k|} = L.$$

Show that if $L > 1$, then $\lim_{k \to \infty} |c_k| = \infty$, so the series $\sum_k c_k$ diverges.

26. (Products of series) If we formally multiply power series and collect equal powers of $x$ we find

$$\Bigl(\sum_{k=0}^{\infty} a_k x^k\Bigr)\Bigl(\sum_{k=0}^{\infty} b_k x^k\Bigr) = (a_0 + a_1 x + a_2 x^2 + \dots)(b_0 + b_1 x + b_2 x^2 + \dots)$$

$$= a_0 b_0 + (a_0 b_1 + a_1 b_0) x + (a_0 b_2 + a_1 b_1 + a_2 b_0) x^2 + \dots .$$

This suggests defining the product of two power series by

$$\Bigl(\sum_{k=0}^{\infty} a_k x^k\Bigr)\Bigl(\sum_{k=0}^{\infty} b_k x^k\Bigr) = \Bigl(\sum_{k=0}^{\infty} c_k x^k\Bigr), \qquad c_k = \sum_{j=0}^{k} a_j b_{k-j}.$$

By setting $x = 1$ this leads to the definition

$$\Bigl(\sum_{k=0}^{\infty} a_k\Bigr)\Bigl(\sum_{k=0}^{\infty} b_k\Bigr) = \Bigl(\sum_{k=0}^{\infty} c_k\Bigr).$$

Prove that if the series $\sum_{k=0}^{\infty} a_k$ and $\sum_{k=0}^{\infty} b_k$ converge absolutely, with sums $A$ and $B$ respectively, then the series $\sum_{k=0}^{\infty} c_k$ converges absolutely, and its sum is $AB$.

27. Show that the series solutions (4.5) for Airy’s equation converge for all $x$.

28. Show that the series (4.7) and (4.9) converge. Show that (4.8) converges, but to a different sum than the alternating harmonic series.

29. State and prove a version of Theorem 4.4.2 which allows for certain rearrangements which are not bounded.

30. Suppose $c_k > 0$, $c_{k+1} \le c_k$, and $\lim_{k \to \infty} c_k = 0$. Prove the convergence of

$$\sum_{k=1}^{\infty} \sin(\pi k/N) c_k$$

for any positive integer $N$.


Chapter 5

A Bit of Logic

5.1 Some mathematical philosophy

Simple facts of arithmetic or geometry are often tested by common experience. If you throw six nuts into a basket, and then add ten more, you get the same total as if ten went in first, followed by six. The commutativity of addition is thus testable in a meaningful way.

The same cannot be said for many of the results of mathematics. What direct experience suggests that there are infinitely many prime numbers, or that the square root of two is not the quotient of two integers? Is the scarcity of solutions to the equation

$$x^n + y^n = z^n$$

convincing evidence that this equation has no positive integer solutions if $n$ is an integer bigger than 2?

The formula

$$1 + 2^2 + \cdots + n^2 = \frac{n(n + 1)(2n + 1)}{6},$$ (5.1)

can be checked for many values of $n$, and this might be considered as some evidence of its truth. But this evidence should be considered as comparable to the observations that giant reptiles do not stride across the land, or that Missouri is not subject to catastrophic earthquakes. Despite the temptation to believe in the persistence of patterns, they often fail to embody permanent truths. The reason for proving (5.1) is to establish exactly this type of enduring truth, which is not provided by consistent observations or tests of the material world.
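The kind of checking mentioned above can be sketched in a few lines of Python. This is only an illustration of testing (5.1) for many values of $n$; the function names `sum_of_squares` and `closed_form` and the cutoff of 1000 are assumptions made for the example, and agreement on finitely many cases is evidence, not a proof.

```python
def sum_of_squares(n):
    """Left side of (5.1): add 1^2 + 2^2 + ... + n^2 term by term."""
    return sum(k * k for k in range(1, n + 1))


def closed_form(n):
    """Right side of (5.1); n(n+1)(2n+1) is divisible by 6, so // is exact."""
    return n * (n + 1) * (2 * n + 1) // 6


# Compare both sides for n = 1, ..., 1000 (an arbitrary, hypothetical cutoff).
agrees = all(sum_of_squares(n) == closed_form(n) for n in range(1, 1001))
print(agrees)
```

However large the cutoff is made, the loop examines only finitely many cases, which is exactly why a proof of (5.1) is still needed.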

The reliable reasoning processes used by mathematicians to discover truths of mathematics have their counterparts in the ‘exact’ sciences and technologies. The design of new aircraft, or of rockets for landing people on the moon, is not the sort of project that can be sent back to the drawing board hundreds of times. The same is true for the design of large computer programs, the manufacture of computer chips, the development of routing algorithms for telephone calls, or the selection of economic strategies by a national government. If these systems, with thousands or millions of interacting components, are not more or less successfully developed on the first try, the consequences can be dire for a company or a country.

How then do we develop sound mathematical reasoning? After all, statements, ideas, or algorithms need not be valid simply because they are phrased in a precise way, have been tested in a few cases, or appeal to common intuition. In times past experts believed that every length could be represented as the ratio of two integers, and that squares of numbers are necessarily greater than or equal to 0. Brilliant minds believed, along with most Calculus students, that except for isolated exceptional points, all functions have derivatives of all orders at every point. Even professional logicians seemed unwary of the traps (Russell’s paradox) in statements such as “there is a barber in town who shaves someone if and only if they do not shave themselves”.

The development of a sophisticated system of reliable mathematical thinking was one of the greatest achievements of the ancient Greeks. The accomplishment had several components. One part is the development of logic, so that one has a way to construct valid arguments and to analyze arguments to assess their validity. In addition the Greeks were able to apply these ideas in the development of geometry, thereby creating a rich mathematical discipline with numerous applications which could serve as a model for subsequent mathematics. Comments on the development of these mathematical ideas may be found in [9, pp. 45, 50, 58–60, 171–172].

An essential element of logic and mathematics is that we agree on the precise meaning of words. Suppose three politicians are arguing about the best economic policies. The first takes this to mean that the total production of the economy is maximized. The second wants to avoid significant income disparities. The third expects to ensure a high minimum level of medical and educational services for all citizens. Unless the politicians can recognize that ‘best outcome’ has a number of plausible meanings, they are unlikely to agree, even with the best of intentions.

The need for precise definitions immediately raises a serious problem. Think of making a dictionary which contains these precise meanings. Each definition is itself composed of words, which need precise meanings. Let us assume that these definitions are not circular. We haven’t made much progress if our definition of ‘dog’ refers to ‘wolves’, and our definition of ‘wolves’ refers to ‘dogs’, as the author’s desktop dictionary does. Unfortunately, to avoid circularity our dictionary must be making use of words that are not defined in the dictionary. Initially this observation is distressing, but it turns out to be less disastrous than one might expect.

Rather than defining the basic concepts of our subject, there will be a collection of undefined terms whose behavior will be described by a set of axioms. For instance, in geometry the undefined terms include ‘point’ and ‘line’. In set theory, which is the current foundation for mathematics, undefined terms include ‘set’ and ‘is an element of’. In the next chapter the basic properties of the real numbers and their arithmetic functions will be detailed in a list of axioms.

The axioms which provide the foundation for mathematical proofs may be judged by our experience and intuition, but in the end it must be admitted that their truth is assumed. The same is true for the logical procedures which allow us to generate new results based on the axioms. The rigorous development of mathematics then uses such axioms and rules of logic as the building blocks and machinery for erecting the various structures of our subject.

The modern view of mathematical proof is quite mechanistic. In fact the ideal is to create a system whereby the validity of any proof can be checked by a computer, and in principle every proof can be generated by a computer. This development has gone quite a way beyond the original conception of the Greeks. It must also be admitted that in practice it is rare to see one of these ideal mathematical proofs. They tend to be long and tedious, with an astounding effort required to achieve results of mathematical significance. The mechanistic view is taken as a guide, but in practice proofs are provided in a more informal style.

There are some additional interesting notes regarding proofs, and the axiomatic foundations of mathematics. First, there still remain some questions and controversies about the selection of appropriate axioms; one example related to proof by contradiction will be mentioned later, but typically such issues will not arise in this text. Second, it is a historical fact that much of mathematics was developed without having this explicit axiomatic foundation. Although mathematicians admired and tried to emulate the logical development of geometry, the historical development of Calculus proceeded without a sound axiomatic foundation for several hundred years. That is not to say that no logical structure was in place, or that Newton had no idea of sound reasoning. It is true that the new ideas introduced in Calculus took many years to digest.

This chapter contains some basic material on mathematical logic. The main topic is propositional logic, which considers the use of logical connectives such as ‘not’, ‘and’, and ‘implies’ to construct composite statements from elementary building blocks with well defined truth values. The relationship between the logical connectives and their natural language counterparts will be discussed. Truth tables are used to define the action of propositional connectives. A brief discussion of logical predicates and quantifiers follows. Finally, the construction of mathematical proofs is presented in the context of propositional logic. The axioms of propositional logic have a very restricted form, a complete list of axioms is of manageable length, and only a single rule of inference is needed. These features make propositional logic a good first model for the proofs that will arise in analysis.

5.2 Propositional logic

In this section we will consider the construction of new statements from old ones using the following collection of propositional connectives:

Table 5.1: Propositional connectives

logical symbol    English equivalent
¬                 not
∧                 and
∨                 or
⇒                 implies
⇔                 is equivalent to

The use of propositional connectives (Table 5.1) to construct statements is based on a starting collection of statements, represented by letters A, B, C, etc., whose internal structure is not of concern. Since such statements are indivisible and provide the basic building blocks for more complex expressions, they are called atoms, or atomic statements, or atomic formulas. With proper use of the propositional connectives, other composite formulas may be generated. Suppose R and S are formulas, which may be atomic or composite. By using the propositional connectives, the following new formulas may be generated:

¬R, R∧S, R∨S, R⇒S, R⇔S.

Formulas generated by these rules are said to be well formed, to distinguish them from nonsense strings of symbols like

R⇔¬.

Starting from the atoms, repeated application of the rules allows us to generate complex formulas such as

[¬(A∨B)]⇔[(¬A)∧(¬B)].

Some of these constructions arise often enough to merit names. Thus ¬A is the negation of A, and B⇒A is the converse of A⇒B. The formula (¬B)⇒(¬A) is called the contrapositive of A⇒B.

To minimize the need for parentheses in composite formulas, the propositional connectives are ranked in the order

¬,∧,∨,⇒,⇔.

To interpret a formula, the connectives are applied in the order of this ranking (¬’s first, etc.) to well-formed subformulas, and from left to right within a particular expression. For example the statement

A∨B⇔¬(¬A∧¬B)

should be parsed as

[A∨B]⇔[¬((¬A)∧(¬B))],

while

A⇒B⇒C

should be parsed as

(A⇒B)⇒C.

In propositional logic the atomic formulas A, B, C, ..., are assumed to be either true or false, but not both. The truth of (well-formed) composite formulas may be determined from the truth of the constituent formulas by the use of truth tables. Tables 5.2 and 5.3 are the truth tables for the propositional connectives.

Table 5.2: Truth table for negation

A    ¬A
T    F
F    T

Table 5.3: Truth table for logical connectives

A    B    A∧B    A∨B    A⇒B    A⇔B
T    T    T      T      T      T
F    T    F      T      T      F
T    F    F      T      F      F
F    F    F      F      T      T

The truth tables of the propositional connectives are intended to have a close connection with natural language usage. In addition to ‘A implies B’, statements of the form ‘if A, then B’, or ‘B if A’, or ‘A only if B’ are interpreted formally as A⇒B. The formula A⇔B corresponds to the statement forms ‘A is equivalent to B’ or ‘A if and only if B’.

In several cases the natural language usage is more complex and context dependent than indicated by the corresponding truth table definitions. First notice that A∨B is true if either A is true, or B is true, or if both A and B are true. This connective is sometimes called the inclusive or to distinguish it from the exclusive or, which is false if both A and B are true. In English, sentence meaning often helps determine whether the ‘inclusive or’ or the ‘exclusive or’ is intended, as the following examples illustrate:

John eats peas or carrots. (inclusive)
Mary attends Harvard or Yale. (exclusive)

The logical meaning of implication can also have some conflict with common interpretations. Thus the logical implication

two wrongs make a right ⇒ all the world’s a stage

is true if the first statement is false, regardless of the truth or meaning of the second statement.

As an example of truth value analysis, consider the following natural language discussion of taxation.

If state revenue does not increase, then either police services or educational services will decline. If taxes are raised, then state revenue will increase. Therefore, if taxes are raised, police services or educational services, or both, will not decline.

As taxpayers concerned about police and educational services, we have an interest in understanding whether the statement

if taxes are raised, police services or educational services, or both, will not decline

follows from the premises. To analyze the question, let’s formalize the presentation. Use the letters A, B, C, D, to represent the statements

A: state revenue increases,
B: police services will decline,
C: educational services will decline,
D: taxes are raised.

For the purposes of logical analysis, a reasonable translation of the example into symbols is

([¬A⇒(B∨C)]∧[D⇒A])⇒(D⇒[¬B∨¬C]). (5.2)

As a shorter example, consider the composite formula

[A∨B]⇔[¬((¬A)∧(¬B))]. (5.3)

To determine how the truth value of this composite formula depends on the truth values of its constituents, a truth table analysis can be carried out. Introduce the abbreviation

C = ((¬A)∧(¬B)).

For example (5.3) the truth table is Table 5.4.

Table 5.4: Truth table for formula (5.3)

A    B    ¬A    ¬B    C    ¬C    A∨B    [A∨B]⇔[¬C]
T    T    F     F     F    T     T      T
F    T    T     F     F    T     T      T
T    F    F     T     F    T     T      T
F    F    T     T     T    F     F      T

Notice that the composite formula (5.3) is true for all truth values of its component propositions. Such a statement is called a tautology. The tautologies recorded in the next proposition are particularly important in mathematical arguments. The proofs are simple truth table exercises left to the reader.


Proposition 5.2.1. The following formulas are tautologies of propositional logic:

A∨¬A law of the excluded middle (5.4)

(A⇒B)⇔(¬B⇒¬A) contraposition (5.5)
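The truth table exercises for (5.3), (5.4), and (5.5) can also be carried out mechanically. The following is a minimal sketch, assuming Python booleans stand for the truth values T and F; the helper names `implies` and `is_tautology` are illustrative choices, not notation from the text.

```python
from itertools import product


def implies(p, q):
    """Truth table of A => B from Table 5.3: false only when p is T and q is F."""
    return (not p) or q


def is_tautology(formula, num_atoms):
    """True if formula is T on every row of its truth table."""
    rows = product([True, False], repeat=num_atoms)
    return all(formula(*row) for row in rows)


# Formula (5.3): [A or B] <=> [not((not A) and (not B))]
formula_5_3 = lambda a, b: (a or b) == (not ((not a) and (not b)))
# (5.4): law of the excluded middle
excluded_middle = lambda a: a or (not a)
# (5.5): contraposition
contraposition = lambda a, b: implies(a, b) == implies(not b, not a)
# For contrast: an implication is NOT equivalent to its converse.
converse_claim = lambda a, b: implies(a, b) == implies(b, a)

print(is_tautology(formula_5_3, 2))
print(is_tautology(excluded_middle, 1))
print(is_tautology(contraposition, 2))
print(is_tautology(converse_claim, 2))
```

The first three checks succeed on every row, while the last one fails on the rows where A and B have different truth values, which is why the converse and the contrapositive must not be confused.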

As another example, consider the statement

[(A∨B∨C)∧¬C]⇔[A∨B]. (5.6)

Adopting the abbreviation

D = (A∨B∨C)∧¬C,

the truth table is Table 5.5.

Table 5.5: Truth table for formula (5.6)

A    B    C    A∨B    A∨B∨C    ¬C    D    D⇔[A∨B]
T    T    T    T      T        F     F    F
F    T    T    T      T        F     F    F
T    F    T    T      T        F     F    F
F    F    T    F      T        F     F    T
T    T    F    T      T        T     T    T
F    T    F    T      T        T     T    T
T    F    F    T      T        T     T    T
F    F    F    F      F        T     F    T

Since the truth value of (5.6) is sometimes false, this is not a tautology. Notice however that if ¬C is true, then the statement (5.6) is always true. When a statement S is true whenever the list of propositions P is true, we say that S is a valid consequence of P. Thus (5.6) is a valid consequence of ¬C.
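The distinction between a tautology and a valid consequence can be illustrated by restricting the truth table to the rows where the premise holds. This sketch assumes the same boolean encoding of T and F; the names `formula_5_6`, `is_tautology_5_6`, and `valid_given_not_c` are hypothetical.

```python
from itertools import product


def formula_5_6(a, b, c):
    """[(A or B or C) and not C] <=> [A or B], formula (5.6)."""
    return ((a or b or c) and (not c)) == (a or b)


rows = list(product([True, False], repeat=3))

# Tautology test: (5.6) must be T on all 8 rows.
is_tautology_5_6 = all(formula_5_6(a, b, c) for a, b, c in rows)

# Valid consequence test: only rows where the premise "not C" is T matter.
valid_given_not_c = all(formula_5_6(a, b, c) for a, b, c in rows if not c)

print(is_tautology_5_6)
print(valid_given_not_c)
```

The first test fails because of the rows of Table 5.5 with C true, while the second succeeds on the four rows with C false.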

The valid consequence concept is typically employed when considering the soundness of arguments such as (5.2) presented in natural language. Such arguments are often initiated with a collection of premises such as

if state revenue does not increase, then either police services or educational services will decline

and

if taxes are raised, then state revenue will increase.

The argument is considered sound if the conclusion is true whenever the premises are true. Of course the truth of the premises, which may be disputable, should not be ignored; sound arguments are rarely interesting if based on false premises.

Let’s consider a truth table analysis of the question of taxation in example (5.2). In this example the truth values of the basic propositions A, ..., D are not given. Rather, it is claimed that the composite formulas

¬A⇒(B∨C), and D⇒A

are true. The question is whether the claim (5.2) is a valid consequence of these statements.

To show that the logic is faulty, it suffices to find truth values for A, ..., D for which these composite assertions are true, but (5.2) is false. Suppose that all the statements A, ..., D are true. Then the statement ¬A⇒(B∨C) is true since ¬A is false, and D⇒A is true since both D and A are true. Thus [¬A⇒(B∨C)]∧[D⇒A] is true, while D⇒[¬B∨¬C] is false. Consequently, the implication (5.2) is false, and the logic of the argument is flawed.

In this case an exhaustive analysis of the truth table was not needed. Since each of the propositions A, ..., D could be independently true or false, a complete truth table would have $2^4 = 16$ rows. More generally, a composite formula with $n$ atomic formulas would have a truth table with $2^n$ rows. The exponential growth of truth tables with the number of atomic formulas is a serious shortcoming.
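For four atoms the complete table is still small enough to search by machine. The sketch below enumerates all 16 rows for the taxation example and collects the rows where (5.2) is false; the function names `implies`, `premises`, `conclusion`, and `formula_5_2` are assumptions introduced for the illustration.

```python
from itertools import product


def implies(p, q):
    """A => B is false only when A is T and B is F."""
    return (not p) or q


def premises(a, b, c, d):
    """[not A => (B or C)] and [D => A]."""
    return implies(not a, b or c) and implies(d, a)


def conclusion(a, b, c, d):
    """D => [not B or not C]."""
    return implies(d, (not b) or (not c))


def formula_5_2(a, b, c, d):
    """The full implication (5.2): premises => conclusion."""
    return implies(premises(a, b, c, d), conclusion(a, b, c, d))


rows = list(product([True, False], repeat=4))  # 2**4 = 16 rows
counterexamples = [row for row in rows if not formula_5_2(*row)]

print(len(rows))
print(counterexamples)
```

Under this encoding the only falsifying row is the one with A, B, C, and D all true, which is the assignment used in the argument above. The same loop with `repeat=n` visits $2^n$ rows, so the approach becomes impractical quickly as the number of atoms grows.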

5.3 Predicates and quantifiers

In the propositional logic discussed above, the propositional connectives ¬, ∧, ∨, ⇒, and ⇔ were used to construct composite formulas from a collection of atomic formulas. There was no consideration of the internal structure of the basic statements; only their truth value was important. In many mathematical statements there are aspects of the internal structure that are quite important.

As examples of typical mathematical statements, consider the assertion

for every number $x$, $x^2 \ge 0$, (5.7)

or the statement of Fermat’s Last Theorem:

there are no positive integers $x, y, z, n$ with $n > 2$ such that $z^n = x^n + y^n$. (5.8)

There are three aspects of these statements to be considered: the domain of the variables, the predicate or relationship of the variables, and the quantifiers.

First, each statement has variables which are expected to come from some domain D. In (5.7) the variable is $x$, and its domain has not been specified. The statement is true if the domain D is the set of real numbers, but it is false if D is the set of complex numbers. Fermat’s Last Theorem (5.8) has a clear statement that $x, y, z, n$ are all positive integers, which may be taken as the domain D.

Second, each statement has a predicate. A predicate is a function of the variables whose value is true or false for each choice of values for the variables from the domain D. In (5.7) the predicate is

$$P(x) : (x^2 \ge 0),$$

while in (5.8) the predicate is more complex,

$$Q(x, y, z, n) : (n > 2) \wedge (z^n = x^n + y^n).$$

The third ingredient is the quantification. Are the predicates expected to be true for all values of the variables, or for only some values of the variables? The symbols ∀ and ∃ represent our two quantifiers. The symbol ∀ is read ‘for all’, and is called the universal quantifier. A statement of the form ∀xP(x) is true for the domain D if P(x) has the value T for all x in the domain D, otherwise the statement is false. The symbol ∃ is read ‘there exists’, and is called the existential quantifier. A statement of the form ∃xP(x) is true for the domain D if there is some x in D for which P(x) has the value T, otherwise the statement is false. The new symbols ∀ and ∃ are added to the previous set of propositional connectives to allow us to generate composite formulas. With the aid of these symbols we may formalize our mathematical statements as

$$\forall x\, P(x), \qquad P(x) : (x^2 \ge 0),$$ (5.9)

and

$$\neg(\exists (x, y, z, n)\, Q(x, y, z, n)), \qquad Q(x, y, z, n) : (n > 2) \wedge (z^n = x^n + y^n).$$ (5.10)

Just as with propositional logic, there is a collection of formulas that may be generated from variables, predicates, propositional connectives, and quantifiers. The atomic formulas are simply the predicates with the appropriate number of variables. For instance if P, Q, R are predicates with one, two, and three arguments respectively, then

P (x), Q(x, y), R(x, y, z)

are atomic formulas. Then, if S and T are formulas, so are

¬S, S∧T, S∨T, S⇒T, S⇔T,

as well as

∀xS, ∃xT,

where x is a variable.

When formulas involve quantifiers and predicates, there can be a question about the appropriate selection of variables. Consider the example

[∃xP (x)∧∃xQ(x)]⇒[∃x(P (x)∧Q(x))].

This formula has the same meaning as

[∃xP (x)∧∃yQ(y)]⇒[∃z(P (z)∧Q(z))],

since the introduction of the new variables does not change the truth value of the formulas. ∃xQ(x) and ∃yQ(y) have the same truth value in any domain. In contrast, the formula ∃x(P(x)∧Q(x)) is not equivalent to ∃x(P(x)∧Q(y)); in the second case the quantification of the variable y has not been specified.

The introduction of predicates adds a great deal of complexity to our formulas. For instance, in propositional logic it was possible, at least in principle, to consider the truth value of a formula as a function of the truth values of its atoms. In that context we singled out certain formulas, the tautologies, which were true regardless of the truth values of the arguments. There is an analogous idea in the predicate calculus. Say that a formula S is valid if the truth value of S is true for every assignment of its variables to values in every domain D. Since the domain D might be an infinite set such as the integers, it is not possible, even in principle, to construct and examine a complete truth table.

To show that a formula is not valid it is only necessary to find a single domain D and an assignment of the variables to elements of D such that the formula is false. But to establish the validity of a formula S we would have to argue, without an exhaustive table, that S is always true. This is not always difficult. For instance it is not hard to show that P(x)⇔P(x) is valid. In general, however, establishing which formulas are valid will be more of a challenge.

Here are some valid formulas involving quantifiers and predicates. The proofs are omitted.

[¬∃xP (x)]⇔[∀x¬P (x)] (5.11)

[¬∀xP (x)]⇔[∃x¬P (x)]

[∀xP (x)∧∀xQ(x)]⇔[∀x(P (x)∧Q(x))]

[∃xP (x)∨∃xQ(x)]⇔[∃x(P (x)∨Q(x))]

[∃x(P (x)∧Q(x))]⇒[∃xP (x)∧∃xQ(x)]

[∀xP (x)∨∀xQ(x)]⇒[∀x(P (x)∨Q(x))]
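On a finite domain the quantifiers ∀ and ∃ correspond to Python's built-in `all` and `any`, which gives a quick way to experiment with formulas such as these. The sketch below is only an illustration under stated assumptions: the domain (integers from −3 to 3) and the predicates `P` and `Q` are hypothetical choices, and agreement on one finite domain does not establish validity, which requires every domain.

```python
domain = range(-3, 4)  # hypothetical finite domain: -3, ..., 3


def P(x):
    """Sample predicate: x is positive."""
    return x > 0


def Q(x):
    """Sample predicate: x is negative."""
    return x < 0


# (5.11): [not exists x P(x)] <=> [forall x not P(x)]
law_5_11 = (not any(P(x) for x in domain)) == all(not P(x) for x in domain)

# [exists x (P(x) and Q(x))] => [exists x P(x) and exists x Q(x)]
left = any(P(x) and Q(x) for x in domain)
right = any(P(x) for x in domain) and any(Q(x) for x in domain)
forward = (not left) or right

# The converse implication fails for this choice of P and Q.
converse = (not right) or left

print(law_5_11, forward, converse)
```

Here some x is positive and some x is negative, but no single x is both, so the converse of the fifth formula is false on this domain. A single such domain is enough to show that the converse is not valid, which is why that formula is stated with ⇒ rather than ⇔.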

As a final topic in this discussion of predicate calculus, some brief remarks about equality are in order. Certainly one of the more common symbols in mathematics, equality is a two place predicate. To put it in the context of our previous discussion we might write E(x, y) instead of x = y. As a predicate, E(x, y) has a truth value when x and y represent elements of the domain D; E(x, y) is true if x and y are the same element, otherwise it is false.

Among the properties of equality are the following:

x = x, reflexive, (5.12)

(x = y)⇒(y = x), symmetric,

[(x = y)∧(y = z)]⇒[x = z], transitive.

It is common in mathematics to encounter two place predicates sharing the reflexive, symmetric, and transitive properties of (5.12). Such predicates are called equivalence relations. To construct an example of an equivalence relation P(x, y) which is distinct from equality, suppose our domain D is the set of integers. Define the predicate P(x, y) with the value T if x − y is even, and let P(x, y) = F if x − y is odd (see problem 11).
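The three properties in (5.12) can be spot-checked for this predicate on a finite range of integers. This is a sketch under stated assumptions: the range −5, ..., 5 is an arbitrary sample of the domain, the variable names are illustrative, and a finite check is not a substitute for the proof requested in the problem.

```python
from itertools import product


def P(x, y):
    """P(x, y) is T exactly when x - y is even."""
    return (x - y) % 2 == 0


sample = range(-5, 6)  # hypothetical finite sample of the integers

reflexive = all(P(x, x) for x in sample)
symmetric = all((not P(x, y)) or P(y, x)
                for x, y in product(sample, repeat=2))
transitive = all((not (P(x, y) and P(y, z))) or P(x, z)
                 for x, y, z in product(sample, repeat=3))

# P relates 0 and 2 although 0 != 2, so P is distinct from equality.
differs_from_equality = P(0, 2) and (0 != 2)

print(reflexive, symmetric, transitive, differs_from_equality)
```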


5.4 Proofs

The previous discussions of propositional logic, predicates, and quantifiers have introduced a number of important logical ideas and operations. The stakes will now be raised considerably with the introduction of the main game in mathematics, the construction of proofs. The idea of carefully reasoned mathematical proofs dates back to the ancient Greeks. In rough outline the plan is to present a sequence of statements in order to reach a correct conclusion. Each statement in the sequence should be true, and its truth can be established in two ways. First, a statement may be true because its truth was established before the current proof was started. Second, a statement may be true because it follows from true statements by a rule of inference.

Probably the simplest and most familiar examples of such arguments involve algebraic manipulations. As an example, consider the proof that the square of an odd integer is odd. If $t$ is an odd integer, then for some integer $n$,

$$t = 2n + 1.$$ s.1

Since both sides are equal, their squares are equal, so

$$t^2 = (2n + 1)(2n + 1).$$ s.2

The distributive law says that for all real numbers $x, y, z$,

$$(x + y)z = xz + yz.$$ s.3

Apply this property with $x = 2n$, $y = 1$, and $z = 2n + 1$ to find

$$t^2 = 2n(2n + 1) + (2n + 1) = 2[n(2n + 1) + n] + 1.$$ s.4

If

$$m = n(2n + 1) + n,$$ s.5

then $m$ is an integer and

$$t^2 = 2m + 1,$$ s.6

which is odd.

Notice that this argument makes use of a number of terms and results that are assumed to be known in advance. These include the representation of an ‘odd integer’, the existence of multiplication and addition, and the distributive law. To provide examples of this style of reasoning with much less vagueness about what is assumed in advance, it is helpful to return to propositional logic. Rather than working with truth tables, this treatment of propositional logic begins with a list of axioms which are assumed true.
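The algebra in steps s.1 through s.6 of the odd-square argument can be spot-checked numerically. The sketch below is a sanity check of the identity $t^2 = 2m + 1$ with $m = n(2n + 1) + n$, not a replacement for the proof; the function name `square_and_odd_form` and the tested range are assumptions.

```python
def square_and_odd_form(n):
    """Return (t^2, 2m + 1) where t = 2n + 1 and m = n(2n + 1) + n."""
    t = 2 * n + 1              # step s.1
    m = n * (2 * n + 1) + n    # step s.5
    return t * t, 2 * m + 1    # the two sides of step s.6


# Both sides should agree, and the common value should be odd.
pairs = [square_and_odd_form(n) for n in range(-100, 101)]
identity_holds = all(lhs == rhs for lhs, rhs in pairs)
squares_are_odd = all(lhs % 2 == 1 for lhs, _ in pairs)

print(identity_holds, squares_are_odd)
```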

Before actually writing down the axioms, let us consider what this entails. Our goal is to provide an alternative to truth tables which will allow us to determine which formulas are tautologies. Thus every provable statement of propositional logic should be a tautology. In particular the axioms themselves should be tautologies. Second, since the axiomatic approach will not make explicit reference to truth tables, the axioms must introduce and characterize the behavior of the individual logical connectives, and the interactions of the various connectives. This will require a fairly long list of axioms.

5.4.1 Axioms for propositional logic

Here are several of the axioms. Recall that A, B, and C can be any statements with a definite truth value. The first axiom is

A⇒(B⇒A),

which serves to introduce ⇒. The behavior of ∧ is partially captured by

(A∧B)⇒A,

while the symmetry of ∧ must be addressed explicitly with the axiom

(A∧B)⇒B.

In a similar fashion axioms

A⇒(A∨B),

and

B⇒(A∨B)

capture desired properties of ∨.

Table 5.6 is a standard set of axioms for propositional logic [8]. Many of them are straightforward, but a few look intimidating. See [13, pp. 33–46] for an alternative approach.


Table 5.6: Axioms for propositional logic

A⇒(B⇒A)                           a.1
[A⇒B]⇒[(A⇒(B⇒C))⇒(A⇒C)]           a.2
A⇒[B⇒(A∧B)]                       a.3
(A∧B)⇒A                           a.4
(A∧B)⇒B                           a.5
A⇒(A∨B)                           a.6
B⇒(A∨B)                           a.7
[A⇒C]⇒[(B⇒C)⇒((A∨B)⇒C)]           a.8
[A⇒B]⇒[(A⇒¬B)⇒¬A]                 a.9
¬¬A⇒A                             a.10
[A⇒B]⇒[(B⇒A)⇒(A⇔B)]               a.11
[A⇔B]⇒[A⇒B]                       a.12
[A⇔B]⇒[B⇒A]                       a.13
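The requirement that the axioms themselves be tautologies can be confirmed by a truth table check of each entry in Table 5.6. The following is a minimal sketch, assuming Python booleans for T and F; the helper names `imp` and `iff` and the dictionary `AXIOMS` are illustrative, with each axiom written as a function of the three letters A, B, C even when it uses fewer.

```python
from itertools import product


def imp(p, q):
    """Truth table of =>."""
    return (not p) or q


def iff(p, q):
    """Truth table of <=>."""
    return p == q


AXIOMS = {
    "a.1": lambda A, B, C: imp(A, imp(B, A)),
    "a.2": lambda A, B, C: imp(imp(A, B), imp(imp(A, imp(B, C)), imp(A, C))),
    "a.3": lambda A, B, C: imp(A, imp(B, A and B)),
    "a.4": lambda A, B, C: imp(A and B, A),
    "a.5": lambda A, B, C: imp(A and B, B),
    "a.6": lambda A, B, C: imp(A, A or B),
    "a.7": lambda A, B, C: imp(B, A or B),
    "a.8": lambda A, B, C: imp(imp(A, C), imp(imp(B, C), imp(A or B, C))),
    "a.9": lambda A, B, C: imp(imp(A, B), imp(imp(A, not B), not A)),
    "a.10": lambda A, B, C: imp(not (not A), A),
    "a.11": lambda A, B, C: imp(imp(A, B), imp(imp(B, A), iff(A, B))),
    "a.12": lambda A, B, C: imp(iff(A, B), imp(A, B)),
    "a.13": lambda A, B, C: imp(iff(A, B), imp(B, A)),
}

rows = list(product([True, False], repeat=3))
results = {name: all(f(*row) for row in rows) for name, f in AXIOMS.items()}

print(all(results.values()))
```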

Since the letters A, B, and C in the axioms can be any atomic or composite formula, it is possible to generate some variations by changing letters, or by specializing from the given general forms to more restricted forms. Thus a.1 could equally well be written

P⇒(Q⇒P ).

If the arbitrary formula Q is specialized to be the same as P, then a.1 implies

P⇒(P⇒P ).

As another example, the formula P can be assumed to have the particular form B⇒C, and Q can be replaced by A, giving

[B⇒C]⇒[A⇒(B⇒C)].

While these substitutions generate new formulas, they do not allow for any substantial interaction among different axioms. To provide such interaction at least one rule of inference is needed to generate new formulas from several previously established formulas. The most popular rule of inference in propositional logic is modus ponens, which says that from formulas A and A⇒B we may conclude B,

A
A⇒B
-----
B            modus ponens


To summarize, a proof of propositional logic will be a list of formulas. A formula may come from an application of one of the thirteen axioms, or may be deduced from previously established formulas using modus ponens. One typically provides a justification for each formula. The proof is often said to be a proof of its last formula.

Example 1 Here is a proof of A⇒A; the example then continues toprove A⇔A.

Start with a.1, but replace B by A to get

A⇒(A⇒A). s.1

Next write down a.2, but replace B with A⇒A and C with A, getting

[A⇒(A⇒A)]⇒[(A⇒((A⇒A)⇒A))⇒(A⇒A)]. s.2

Using modus ponens on these two formulas we deduce

[(A⇒((A⇒A)⇒A))⇒(A⇒A)]. s.3

Go back to a.1, but replace B with A⇒A, so

A⇒((A⇒A)⇒A) s.4

Finally use modus ponens again on s.4 and s.3 to get

A⇒A. s.5

While this concludes a proof of s.5, we may continue by using a.11 with B replaced by A,

[A⇒A]⇒[(A⇒A)⇒(A⇔A)]. s.6

Using modus ponens on s.5 and s.6 gives

(A⇒A)⇒(A⇔A), s.7

while another application of modus ponens on s.5 and s.7 proves

A⇔A. s.8
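The first five steps of Example 1 can be checked mechanically. The sketch below is an illustration only: formulas are encoded as nested tuples, and the helper names `imp`, `a1`, `a2`, and `modus_ponens` are hypothetical choices made here, not notation from the text.

```python
def imp(x, y):
    # Encode the formula x => y as a tuple; atomic formulas are plain strings.
    return ("imp", x, y)

def a1(A, B):
    # Instance of axiom a.1: A => (B => A).
    return imp(A, imp(B, A))

def a2(A, B, C):
    # Instance of axiom a.2: [A => B] => [(A => (B => C)) => (A => C)].
    return imp(imp(A, B), imp(imp(A, imp(B, C)), imp(A, C)))

def modus_ponens(premise, implication):
    # From X and X => Y conclude Y; anything else is rejected.
    if (not isinstance(implication, tuple)
            or implication[0] != "imp"
            or implication[1] != premise):
        raise ValueError("modus ponens does not apply")
    return implication[2]

A = "A"
s1 = a1(A, A)                # A => (A => A)
s2 = a2(A, imp(A, A), A)     # a.2 with B replaced by A => A and C replaced by A
s3 = modus_ponens(s1, s2)    # (A => ((A => A) => A)) => (A => A)
s4 = a1(A, imp(A, A))        # A => ((A => A) => A)
s5 = modus_ponens(s4, s3)    # A => A

print(s5)
```

Because `modus_ponens` raises an error unless the premise matches exactly, a step that does not follow from the earlier lines cannot slip into the list.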

Example 2 The next example is a proof of ¬(A∧¬A).

Start with a.9

[A⇒B]⇒[(A⇒¬B)⇒¬A] s.1


Replace A with A∧C and B with A to get

[(A∧C)⇒A]⇒[((A∧C)⇒¬A)⇒¬(A∧C)] s.2

Using a.4,

(A∧C)⇒A s.3

and modus ponens, conclude that

((A∧C)⇒¬A)⇒¬(A∧C). s.4

Now replace C with ¬A to get

((A∧¬A)⇒¬A)⇒¬(A∧¬A). s.5

Use a.5

(A∧¬A)⇒¬A s.6

and modus ponens to get

¬(A∧¬A). s.7

5.4.2 Additional rules of inference

In addition to modus ponens there are other valid rules of inference in propositional logic. Recall that truth table analysis showed the logical equivalence of A⇒B and ¬B⇒¬A. Suppose the two formulas A⇒B and ¬B are given. In terms of truth tables this is the same information as ¬B⇒¬A and ¬B, to which we may apply modus ponens to conclude ¬A. This motivates the rule of inference known as modus tollens.

¬B
A⇒B
-----
¬A

modus tollens

Another rule of inference is the disjunctive syllogism.

A∨B
¬A
-----
B

disjunctive syllogism

One may again use truth tables to establish the logical equivalence of A∨B and ¬A⇒B, and then employ modus ponens again.
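The truth table justification of these rules can be carried out by machine. In the sketch below a rule is taken to be valid when the conclusion is true in every row of the truth table in which all premises are true; the function name `rule_is_valid` is an assumption made for this illustration.

```python
from itertools import product

def implies(p, q):
    # Truth table of p => q.
    return (not p) or q

def rule_is_valid(premises, conclusion):
    # Valid: no assignment of truth values makes every premise true
    # while making the conclusion false.
    for A, B in product([True, False], repeat=2):
        if all(p(A, B) for p in premises) and not conclusion(A, B):
            return False
    return True

# Modus tollens: from not B and A => B, conclude not A.
modus_tollens = rule_is_valid(
    [lambda A, B: not B, lambda A, B: implies(A, B)],
    lambda A, B: not A,
)

# Disjunctive syllogism: from A or B and not A, conclude B.
disjunctive_syllogism = rule_is_valid(
    [lambda A, B: A or B, lambda A, B: not A],
    lambda A, B: B,
)

# For contrast, an invalid pattern: from B and A => B, conclude A.
affirming_consequent = rule_is_valid(
    [lambda A, B: B, lambda A, B: implies(A, B)],
    lambda A, B: A,
)

print(modus_tollens, disjunctive_syllogism, affirming_consequent)
```

The third pattern fails in the row where A is false and B is true, so it is not an acceptable rule of inference.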


5.4.3 Adding hypotheses

The format of typical proofs in mathematics involves an extension of the type of proof that has been considered so far. In addition to the axioms, additional hypotheses may be assumed. Examples of such hypotheses include ‘suppose n is an odd prime number’, or ‘assume that f(x) is a function with two continuous derivatives’. An easy way to extend the notion of proof is simply to consider that the list of axioms has been temporarily augmented by the addition of the hypotheses. In this more general setting the last line of the proof is called a theorem, and the added hypotheses employed in the proof are the hypotheses of the theorem.

Here are examples of such theorems in propositional logic. Notice that the added hypotheses are usually not tautologies, but represent some additional information.

Example 1 For this example, assume the existence of particular formulas A, B, and C such that

A⇒B, h.1

and

B⇒C. h.2

Since these formulas have been temporarily given the status of axioms, they may be used in the same way in the proof.

Taking advantage of the hypotheses, the proof starts with

A⇒B s.1

and

B⇒C. s.2

Continue with a.1 in the form

D⇒[A⇒D]. s.3

Now suppose that D has the form B⇒C. Then

[B⇒C]⇒[A⇒(B⇒C)] s.4

Using modus ponens with s.2 and s.4 we get

A⇒(B⇒C) s.5


Now recall a.2,

[A⇒B]⇒[(A⇒(B⇒C))⇒(A⇒C)]. s.6

Using modus ponens with s.1 and s.6 we get

(A⇒(B⇒C))⇒(A⇒C). s.7

Bringing in s.5 leads to

A⇒C.

Thus having a pair of formulas of the form

A⇒B, B⇒C,

allows us to conclude that A⇒C. The axioms a.4 and a.6 provide such a pair of formulas, leading to the conclusion

[A∧B]⇒[A∨B].

Example 2 Here is a proof of the contrapositive ¬B⇒¬A from the hypothesis A⇒B.

The first formula of the proof is

A⇒B. s.1

An application of a.1 yields

[A⇒B]⇒[¬B⇒(A⇒B)], s.2

and then modus ponens gives

¬B⇒[A⇒B]. s.3

From s.1 and a.9 we find

(A⇒¬B)⇒¬A. s.4

Applying a.1 again gives

¬B⇒[(A⇒¬B)⇒¬A]. s.5

Next record a.2 with the substitutions ¬B for A, (A⇒¬B) for B, and ¬A for C, yielding

[¬B⇒(A⇒¬B)]⇒[(¬B⇒((A⇒¬B)⇒¬A))⇒(¬B⇒¬A)]. s.6

Use a.1 in the form

¬B⇒[A⇒¬B]. s.7

Now use modus ponens on s.6 and s.7, obtaining

[¬B⇒((A⇒¬B)⇒¬A)]⇒[¬B⇒¬A]. s.8

Again use modus ponens on s.5 and s.8, obtaining the desired result

¬B⇒¬A. s.9

5.4.4 Proof by contradiction

The propositional tautology A∨¬A leads to a popular style of argument called proof by contradiction. Imagine trying to prove that a statement C follows from hypotheses h.1, . . . , h.k. One considers the modified collection of hypotheses h.1, . . . , h.k, ¬C. Suppose that from these hypotheses it is possible to derive a contradiction, that is, a statement of the form P∧¬P. Treating the hypotheses h.1, . . . , h.k as axioms, this means there is a proof of the assertion

¬C⇒[P∧¬P]. s.1

In conventional propositional logic the truth of C can be established by a truth table analysis or by a proof from the axioms. Consider truth values first. Since the statement P∧¬P is always false, the truth of the implication s.1 forces C to be true. As an alternative, use the viewpoint of proofs. Recall from a previous example that

¬[P∧¬P] s.2

is a consequence of the axioms. Applying modus tollens to statements s.1 and s.2 allows us to conclude

¬¬C,

at which point axiom a.10

¬¬C⇒C,

leads to C.

This style of argument is also called reductio ad absurdum or indirect proof. While this approach is valid within the context of propositional logic as we have presented it, some authors [8, pp. 195–197] object to this style of logic as a useful model for mathematics as a whole. The critics favor constructive proofs in mathematics, arguing that when the existence of some object (like a prime number larger than an integer N) is asserted, there should be a procedure for producing it. In particular, the argument goes, it is not safe to claim that there must be a constructive procedure for establishing either P or ¬P. One can modify the axioms of propositional logic to take these objections into account [8, p. 49], replacing the axiom ¬¬A⇒A with ¬A⇒(A⇒B). With such a replacement, every proof of the modified propositional logic would result in a tautology, but not every tautology (in particular A∨¬A) would be provable. This type of distinction between truth and provability plays a major role in deeper studies of mathematical logic.


5.5 Problems

1. Give the proof of Proposition 5.2.1.

2. By comparing truth tables, establish the following results.

(a) Show that A⇔B has the same truth table as (A⇒B)∧(B⇒A).

(b) Show that A∨B is equivalent to (¬A)⇒B in the same way.

3. The propositional connectives ∧, ∨, ⇒, ⇔ are each a function of an ordered pair of truth values, and the value of each of these functions is either true or false. How many distinct logical connectives of this type are possible? Can they all be constructed using the given four if in addition ¬ is available to negate one or both of the arguments? As an example consider f_1(A, B) = (¬A)∨B.

4. Show that the statement

A∨B⇒(C⇒A∧B)

is not a tautology, but is a valid consequence of A, B.

5. Consider the following argument.

Sam will keep his job only if he files a fraudulent corporate tax return. He will avoid jail only if he files an honest tax return. Since Sam must file a corporate return, which is either honest or fraudulent, he will either lose his job or go to jail.

Represent the argument using propositional logic, and decide whether or not the argument is sound. Use the letters A, B, and C to represent the statements

A: Sam will keep his job.
B: Sam will go to jail.
C: Sam files an honest tax return.

6. The situation in the previous problem becomes a bit more complex. Again, represent the argument using propositional logic, and decide whether the argument is sound.

Sam will keep his job if he files a fraudulent corporate tax return, or if his boss goes to jail. Sam will go to jail if he files a fraudulent return, or if he files an honest return and the prosecutor is related to his boss. If the prosecutor is related to his boss, his boss will not go to jail. If Sam is lucky enough to keep his job and avoid jail, then he must file an honest return and the prosecutor must be unrelated to his boss.

7. Consider the following narrative.


Jane and Mary each love either William or Harry, but not both. William will marry Jane if she loves him, and William will marry Mary if she loves him. (For the moment we allow the possibility of two wives.) Harry will marry Mary if she does not love William. If William or Harry will not marry, then either Mary loves William, or Jane and Mary love Harry.

(a) Represent the narrative using propositional logic, and determine the soundness of the argument. It may be helpful to introduce the symbol ⊕ to represent the exclusive ‘or’.

(b) Use propositional logic to represent the following premises: William will marry either Jane or Mary if she loves him, but if he is loved by both he will marry only one.

8. Use the predicates

P(x) : x is a car, Q(x) : x is a Cadillac,

to represent the sentences

all cars are not Cadillacs,

and

not all cars are Cadillacs,

with the predicate calculus. Do these sentences have the same meaning?

9. Solve the following problems in predicate logic.

(a) Suppose the predicate P(x) is x > 0 while the predicate Q(x) is x < 0. Show that the implication

[∃xP(x)∧∃xQ(x)]⇒[∃x(P(x)∧Q(x))]

is not valid; consider the domain D equal to the set of integers.

(b) Find an example showing that the following implication is not valid.

[∀x(P(x)∨Q(x))]⇒[∀xP(x)∨∀xQ(x)].

10. Suppose P(x, y) denotes the predicate

(x = 0)∨(y = 0)∨(x ⊗ y ≠ 0).

Let D be the set of integers {0, 1, 2}, and suppose the product x ⊗ y is multiplication modulo 3, so that if xy is normal integer multiplication and xy = 3n + r, with 0 ≤ r < 3, then x ⊗ y = r.

(a) Show that ∀x∀yP(x, y) is correct if the domain is D.


(b) Show that ∀x∀yP(x, y) is incorrect if D is the set of integers {0, 1, 2, 3}, and the product x ⊗ y is multiplication modulo 4.

11. Suppose D is the set of integers. Let P(x, y) be the predicate which is T if x − y is even and F if x − y is odd. Show that P is an equivalence relation. That is, show

P(x, x), P(x, y)⇒P(y, x),

and

[P(x, y)∧P(y, z)]⇒P(x, z).

12. Suppose D is a set and P(x, y) is an equivalence relation defined on D. For each element x of D let

S_x = {z ∈ D | P(x, z) = T},

that is, S_x is the set of elements equivalent to x. Show that for any choice of x and y in D, either S_x = S_y, or S_x ∩ S_y = ∅.

13. Use the axioms for propositional logic to prove the following theorems. You may use theorems of propositional logic proven from the axioms in the text, but do not use truth table arguments.

(a) Start with a.8 and take C = A to prove

[A∨A]⇒A.

(b) Start with a.8 and replace both B and C with A∨B to get

[A∨(A∨B)]⇒[A∨B].

14. Use the axioms for propositional logic to prove that

[A⇒B]⇒[(A∨B)⇒B].

Hint: Make use of B⇒B, which is already established, together with the following consequence of a.2:

[(A⇒B)⇒(B⇒B)]⇒[((A⇒B)⇒((B⇒B)⇒(A∨B⇒B)))⇒((A⇒B)⇒(A∨B⇒B))].

You may also find a.8 useful.


15. Modus ponens has the form

A
A⇒B
-----
B

Use truth tables to check that the related formula

[A∧(A⇒B)]⇒B

is a tautology. Perform a similar analysis of the following rules of inference.

(a)
A⇒B
A⇒C
-----
A⇒B∧C

(b)
[A∨B]⇒C
¬C
-----
¬A∧¬B

(c)
A∨B
A⇒C
B⇒D
-----
C∨D

16. Establish the following results.

(a) Prove

¬(A∨B)⇒¬B.

(b) Assume the hypothesis ¬(A∨B) and prove ¬A∧¬B.

17. Assume the hypothesis (A∧B)∧C and prove A∧(B∧C).

18. Consider the following problems.

(a) Assume the hypothesis A⇒B and prove A⇒A∧B.

(b) Assume the hypotheses A⇒B and B⇒C and prove A⇒B∧C.

19. Assume (see 16) that ¬(A∨B)⇒¬B and ¬(A∨B)⇒¬A. Use a.9 and a.10 to prove A∨¬A.


Chapter 6

Real Numbers

This chapter starts a formal development of the foundations of analysis, beginning with an axiomatic treatment of the real numbers. Logic provides our model for such a development. The essential building blocks of the subject, coming from intuition and vast experience, are presented as axioms. Except for the foundational axioms, results are incorporated into the mathematical edifice only when they are proven.

The axioms describing the properties of the real numbers fall into three categories: field axioms, order axioms, and completeness axioms. The more elementary field and order axioms will be presented first, along with some of their immediate consequences. The subsequent addition of the completeness property marks something of a shift in the character of the subject. It is here that the infinite processes, viewed with suspicion by the ancient Greeks, come into play. Three versions of the completeness property of the real numbers will be considered.

The completeness property is an invaluable tool for working with the infinite sequences that arise so commonly in analysis. Following the initial study of completeness, the compactness property for closed bounded intervals [a, b] is introduced. Completeness also plays a role in the concluding topics of the chapter, infinite products and continued fractions.

The treatment of proofs in propositional logic will serve as a guide as we move beyond the axioms. Conjectures about ideas and results that might be true can be inspired by examples, intuition, or brilliant guesswork. However, acceptance of such results will only come from rigorous proofs. Proofs should consist of a careful and complete sequence of arguments; each step should either be previously established, or be a logical consequence of previously established results.

To avoid getting hopelessly bogged down in technical details, a considerable amount of foundational material is assumed to be known in advance. Some of this material includes properties of sets, functions, the equality predicate =, and the elementary properties of the integers and rational numbers. Propositional logic, quantifiers, and predicates will also be exploited.

The formal treatment of mathematics, with its emphasis on careful proofs, is a very cautious and sometimes difficult approach, but the resulting structure has a durability and reliability rarely matched in other subjects. The choice of the real numbers as the focus for axiomatic characterization is efficient, but it should be noted that there are alternative treatments which place the emphasis on the integers and rational numbers. Such alternatives, which define real numbers in terms of rational numbers, are attractive because they minimize the assumptions at the foundations of mathematics. Such an alternative development may be found in [16, pp. 1–13] and in [18, pp. 35–45].

6.1 Field axioms

The first axioms for the real numbers concern the arithmetic functions addition + and multiplication ·. These properties of + and · are shared with the rational numbers and the complex numbers, along with other algebraic structures. A set F such as the real numbers R, with arithmetic functions + and ·, is called a field if for any a, b, c ∈ F the following axioms F.1 - F.10 hold.

There is an addition function + taking pairs of numbers a, b ∈ F to the number a + b ∈ F. Properties of addition are

a + b = b + a, commutativity (F.1)

(a + b) + c = a + (b + c), associativity (F.2)

There is a number 0 such that

a + 0 = a, existence of an additive identity (F.3)

For every number a there is a number b, written (−a), such that

a + b = 0, existence of an additive inverse. (F.4)

There is a multiplication function · taking pairs of numbers a, b ∈ F to the number a · b ∈ F. Properties of multiplication are

a · b = b · a, commutativity (F.5)


(a · b) · c = a · (b · c), associativity (F.6)

There is a number 1 such that

1 ≠ 0 (F.7)

a · 1 = a, existence of a multiplicative identity (F.8)

For every number a ≠ 0 there is a number b, written a^{-1} or 1/a, such that

a · b = 1, existence of a multiplicative inverse. (F.9)

Finally, there is an axiom describing the interplay of multiplication and addition.

a · (b + c) = a · b + a · c, distributive law (F.10)

Before turning to the order axioms, several consequences of the field axioms will be developed. These results will hold equally well for real numbers, rational numbers, complex numbers, and certain other algebraic structures (see problem 4).
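To make the phrase ‘other algebraic structures’ concrete, one can test the field axioms by brute force on a small finite set. The sketch below assumes, as an illustration not taken from the text above, that the set is {0, 1, ..., p − 1} with addition and multiplication taken modulo p; the function name `is_field_mod` is hypothetical.

```python
from itertools import product

def is_field_mod(p):
    """Brute-force check of F.1 - F.10 for {0, ..., p-1} with arithmetic mod p."""
    if p < 2:
        return False  # F.7 needs 1 != 0, which fails when p = 1
    F = range(p)
    add = lambda a, b: (a + b) % p
    mul = lambda a, b: (a * b) % p
    for a, b, c in product(F, repeat=3):
        if add(a, b) != add(b, a):                            # F.1
            return False
        if add(add(a, b), c) != add(a, add(b, c)):            # F.2
            return False
        if mul(a, b) != mul(b, a):                            # F.5
            return False
        if mul(mul(a, b), c) != mul(a, mul(b, c)):            # F.6
            return False
        if mul(a, add(b, c)) != add(mul(a, b), mul(a, c)):    # F.10
            return False
    for a in F:
        if add(a, 0) != a or mul(a, 1) != a:                  # F.3, F.8
            return False
        if not any(add(a, b) == 0 for b in F):                # F.4
            return False
        if a != 0 and not any(mul(a, b) == 1 for b in F):     # F.9
            return False
    return True

for p in (2, 3, 4, 5, 6, 7):
    print(p, is_field_mod(p))

# In arithmetic mod 2 the 'integer' 2 = 1 + 1 coincides with 0.
two_mod_2 = (1 + 1) % 2
```

In this experiment the axioms hold for p = 2, 3, 5, 7 and fail for p = 4 and p = 6, where the number 2 has no multiplicative inverse.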

Proposition 6.1.1. For every a ∈ F,

a · 0 = 0.

Proof. Axioms F.3 and F.10 lead to

a · 0 = a · (0 + 0) = a · 0 + a · 0.

By F.3 and F.4, adding the inverse −(a · 0) to both sides gives

0 = a · 0.

Next, let’s show the uniqueness of the multiplicative identity.

Proposition 6.1.2. Suppose that a, b ∈ F, a ≠ 0, and a · b = a. Then b = 1.

Proof. Multiply both sides of a · b = a by a^{-1}, which exists by F.9. Using the associativity of multiplication F.6 we find

(a^{-1} · a) · b = 1 · b = b = a^{-1} · a = 1.


Proposition 6.1.3. Suppose that a, b ∈ F and a ≠ 0. Then there is exactly one number x ∈ F such that a · x + b = 0.

Proof. For the equation

a · x + b = 0

it is easy to find a formula for the solution. First add −b to both sides, obtaining

a · x = −b.

Then multiply both sides by a^{-1} to get

a^{-1} · a · x = (a^{-1} · a) · x = 1 · x = x = a^{-1} · (−b).

Thus there is at most one solution of the equation, and it is simple to check that a^{-1} · (−b) is a solution.

Proposition 6.1.4. Suppose that x_1, b ∈ F, and x_1^2 = b. If x ∈ F and x^2 = b, then either x = x_1 or x = −x_1 (or both if x_1 = 0).

Proof. If x^2 = b then

(x + x_1) · (x − x_1) = x^2 − x_1^2 = b − b = 0.

A product in a field can vanish only if one of its factors does: if u · v = 0 and u ≠ 0, then v = u^{-1} · (u · v) = u^{-1} · 0 = 0 by Proposition 6.1.1. Hence either x + x_1 = 0 or x − x_1 = 0.

As is typical in arithmetic, write b/a for a^{-1}b, and ab for a · b. In any field the positive integers can be defined recursively, 2 = 1 + 1, 3 = 2 + 1, 4 = 3 + 1, etc., but (see problem 4) some of these ‘integers’ may not be different from 0 as they are in the rationals or reals. With this caveat in mind, here is the quadratic formula.

Proposition 6.1.5. (Quadratic formula): Suppose that b, c, x_1 ∈ F, and x_1 satisfies the equation

x^2 + bx + c = 0. (6.1)

Suppose in addition that 2 ≠ 0. Then there is a d ∈ F satisfying d^2 = b^2 − 4c such that the numbers

x_1 = (−b + d)/2, x_2 = (−b − d)/2,

are solutions of (6.1), and every solution x ∈ F is one of these.


Proof. Since 2 ≠ 0, the number 2 has a multiplicative inverse and 2^{-1} · b = b/2 is defined. For x ∈ F,

(x + b/2)^2 = x^2 + (2^{-1} + 2^{-1})bx + b^2/(2 · 2) = x^2 + bx + b^2/4,

and (6.1) is equivalent to

(x + b/2)^2 = (b^2 − 4c)/4. (6.2)

If

d = 2(x_1 + b/2),

then d ∈ F,

(d/2)^2 = (b^2 − 4c)/4,

and x_1 = (−b + d)/2 satisfies (6.2). Similarly, if x_2 = (−b − d)/2, then

(x_2 + b/2)^2 = (−d/2)^2 = d^2/4 = (b^2 − 4c)/4,

so x_2 also satisfies (6.2).

Finally, if x is any solution of (6.1), then

(x + b/2)^2 = (b^2 − 4c)/4 = (d/2)^2.

By Proposition 6.1.4

x + b/2 = ±d/2,

so x must have the form

x = (−b ± d)/2.
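Since the quadratic formula was derived from the field axioms alone, it should work in fields other than the reals. The sketch below tries it in the integers modulo an odd prime p, an assumed example field in which 2 ≠ 0; the function names are hypothetical, and the square root d is found by exhaustive search rather than by any clever method.

```python
def solve_quadratic_mod(b, c, p):
    """Solutions of x^2 + b x + c = 0 in the integers mod an odd prime p,
    computed from the quadratic formula x = (-b +/- d)/2 with d^2 = b^2 - 4c."""
    disc = (b * b - 4 * c) % p
    inv2 = pow(2, -1, p)  # 2^{-1} exists because 2 != 0 mod an odd prime
    roots = set()
    for d in range(p):
        if (d * d) % p == disc:          # d is a square root of b^2 - 4c
            roots.add(((-b + d) * inv2) % p)
            roots.add(((-b - d) * inv2) % p)
    return sorted(roots)

def brute_force_roots(b, c, p):
    # Direct search, used only to compare against the formula.
    return [x for x in range(p) if (x * x + b * x + c) % p == 0]

print(solve_quadratic_mod(3, 2, 7))   # roots of x^2 + 3x + 2 mod 7
print(brute_force_roots(3, 2, 7))
```

When b^2 − 4c has no square root in the field, both functions return an empty list, which matches the proposition: it assumes that some solution x_1 exists before producing d.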

6.2 Order axioms

The existence of an ordering relation ≤ for the field of real numbers is one of the ways to distinguish it from the field of complex numbers, as well as many other fields. For instance, in establishing the quadratic formula we had to worry about the possibility that 2 = 1 + 1 and 0 were the same number. This will not be the case if F satisfies the order axioms. By definition a field F is an ordered field if for any a, b, c ∈ F the following axioms O.1 - O.6 hold.

There is a relation ≤ satisfying

a ≤ a, (O.1)

a ≤ b and b ≤ a implies a = b, (O.2)

a ≤ b and b ≤ c implies a ≤ c, (O.3)

either a ≤ b or b ≤ a, (O.4)

a ≤ b implies a + c ≤ b + c, (O.5)

0 ≤ a and 0 ≤ b implies 0 ≤ a · b. (O.6)

As additional notation, write a < b if a ≤ b and a ≠ b. Also write a ≥ b if b ≤ a and a > b if b ≤ a and b ≠ a.

Proposition 6.2.1. If F is an ordered field, then 0 < 1.

Proof. The proof is by contradiction. First observe that by Proposition 6.1.1

0 = (−1) · (1 + (−1)) = −1 + (−1) · (−1),

and so (−1) · (−1) = 1. Suppose that 1 ≤ 0. Adding −1 to both sides and using O.5 implies 0 ≤ −1. By O.6 it then follows that 0 ≤ (−1) · (−1) = 1. This means that 1 ≤ 0 and 0 ≤ 1. It follows from O.2 that 0 = 1, contradicting F.7.

Proposition 6.2.2. Suppose F is an ordered field. If a, b, c ∈ F, a < b, and b < c, then a < c.

Proof. Axiom O.3 gives a ≤ c, so the case a = c must be ruled out. If a = c, then by O.5,

a − a = 0 < b − a = b − c.

This gives c − b ≤ 0, and the hypotheses give c − b ≥ 0, so O.2 leads to b = c, contradicting our assumptions.

Proposition 6.2.3. If F is an ordered field then all positive integers n satisfy n − 1 < n. In particular n ≠ 0.


Proof. Start with 0 < 1, the conclusion of Proposition 6.2.1. Adding 1 to both sides n − 1 times gives n − 1 < n. The combination of 0 < 1 < · · · < n − 1 < n and Proposition 6.2.2 gives 0 < n.

It is helpful to establish some facts about multiplication in ordered fields that supplement O.6.

Proposition 6.2.4. In an ordered field, suppose that 0 ≤ a ≤ b and 0 ≤ c ≤ d. Then 0 ≤ ac ≤ bd.

Proof. The inequality a ≤ b leads to 0 ≤ b − a. Since 0 ≤ c, it follows from O.6 that

0 ≤ bc − ac,

or

ac ≤ bc.

By the same reasoning,

bc ≤ bd.

Axioms O.3 and O.6 then give

0 ≤ ac ≤ bd.

Proposition 6.2.5. In the ordered field F, suppose b ≤ 0 ≤ a. Then a · b ≤ 0.

Proof. Proceed as follows.

0 ≤ −b,

by O.5, and so by O.6

0 ≤ a · (−b).

Again using O.5,

a · b ≤ 0.

Proposition 6.2.6. In an ordered field F, if 0 < a ≤ b, then 0 < 1/b ≤ 1/a.


Proof. Proposition 6.2.5 implies that the product of a positive and negative number is negative. Since a · a^{-1} = 1, a^{-1} > 0, and similarly for b^{-1}. Applying O.6 gives

b − a ≥ 0,

a^{-1}b^{-1}(b − a) ≥ 0,

a^{-1} − b^{-1} ≥ 0

or

a^{-1} ≥ b^{-1}.

It is also convenient to be able to compare numbers with integers. For this we need another order axiom, called the Archimedean Property.

For every a ∈ F there is an integer n such that a ≤ n. (O.7)

Proposition 6.2.7. Suppose F is an ordered field which satisfies the Archimedean property. If a > 0 and b > 0, then there is an integer k such that a ≤ k · b.

Proof. Since b ≠ 0, it has a positive multiplicative inverse. For some integer m then, 0 < b^{-1} ≤ m. Proposition 6.2.4 yields 1 ≤ b · m. Similarly, if a ≤ n then

a · 1 ≤ n · m · b,

and we may take k = n · m.

Another consequence of O.7 is used quite often in analysis.

Lemma 6.2.8. In an ordered field satisfying the Archimedean property, suppose a ≥ 0 and a < 1/n for every positive integer n. Then a = 0.

Proof. The only choices are a = 0 or a > 0. If a > 0 there is an integer n such that 1/a ≤ n, or a ≥ 1/n. Since this possibility is ruled out by the hypotheses, a = 0.

The axioms F.1−F.10 and O.1−O.7 discussed so far describe properties expected for the real numbers. However these axioms are also satisfied by the rational numbers. Since our axioms do not distinguish between the rational and real numbers, but these number systems are distinct, there must be some properties of the real numbers that have not yet been identified.

The first distinction between the real and rational numbers was discovered by the Greeks, probably in the fifth century B.C. [9, p. 32]. Recall that a number is rational if it can be written as the ratio of two integers, m/n. The Greeks were familiar with the length √2 from geometry, but for some time thought all lengths could be expressed as rational numbers. To establish that √2 is not rational requires a bit of number theory.

Recall that a positive integer p is said to be prime if p > 1 and whenever p is written as the product of two positive integers, p = j · k, one of the factors is p. When considering products of integers, a single number n is taken as a product with one factor. The following result is basic in arithmetic.

Theorem 6.2.9. (a) Every positive integer n ≥ 2 can be written as the product of prime factors. (b) If the (possibly repeated) factors are written in nondecreasing order, this prime factorization is unique.

Proof. (a): Let’s prove the theorem by induction on n, with the first case being n = 2. In this first case 2 is prime, so there is a single factor 2.

Now assume the result holds for all integers k with 2 ≤ k < n. If n is prime then the factorization has a single prime factor. If n is not prime, then n = p · k, where 2 ≤ p < n and 2 ≤ k < n. By the induction hypothesis both k and p have prime factorizations, and so n has a prime factorization.

It takes more work [7, p. 3] to show that the factorization is unique if the factors are listed in nondecreasing order.
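Part (a) of the theorem can be turned into a computation. The sketch below uses trial division, a standard method that is not the induction argument of the proof but produces the same kind of output: the prime factors of n in nondecreasing order. The function name `prime_factors` is chosen here for illustration.

```python
def prime_factors(n):
    """Return the prime factors of an integer n >= 2 in nondecreasing order."""
    if n < 2:
        raise ValueError("n must be at least 2")
    factors = []
    p = 2
    while p * p <= n:
        # Divide out p as often as possible; any p reached here that divides n
        # is prime, since all smaller prime factors have already been removed.
        while n % p == 0:
            factors.append(p)
            n //= p
        p += 1
    if n > 1:
        # What remains has no divisor p with p * p <= n, so it is prime.
        factors.append(n)
    return factors

print(prime_factors(360))
print(prime_factors(97))
```

A prime input such as 97 comes back as a product with one factor, in line with the convention stated before the theorem.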

Theorem 6.2.10. There is no rational number whose square is 2.

Proof. The argument is by contradiction. Suppose there is a rational number

√2 = m/n

whose square is two. There is no loss of generality if m and n are taken to be positive integers with no common prime factors. (If there are common prime factors, terms in the numerator and denominator can be cancelled until the desired form is obtained.) Multiplying by n and squaring both sides leads to

2n^2 = m^2.

Obviously m^2 is even.

Notice that if m = 2l + 1 is odd, then m^2 = 4l^2 + 4l + 1 is odd. Since m^2 is even, m must have a factor 2. Writing m = 2l, it follows however that n^2 = m^2/2 = 2l^2 is also even, so n has a factor 2 as well. This means m and n have the common factor 2, a contradiction.
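A finite search cannot replace the proof, but it is a quick way to see the theorem in action. The sketch below looks for positive integers m and n with m^2 = 2n^2 among denominators up to a chosen bound; the function name and the bound are assumptions made for this illustration.

```python
from math import isqrt

def rational_sqrt2_candidates(max_n):
    """Return all pairs (m, n) with 1 <= n <= max_n and m * m == 2 * n * n."""
    hits = []
    for n in range(1, max_n + 1):
        m = isqrt(2 * n * n)          # largest integer m with m * m <= 2 * n * n
        if m * m == 2 * n * n:        # exact integer test, no rounding involved
            hits.append((m, n))
    return hits

print(rational_sqrt2_candidates(10000))
```

The list is empty for every bound, as the theorem predicts; the computation uses exact integer arithmetic so that no floating point rounding can create a false match.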

6.3 Completeness axioms

Unless explicitly stated otherwise, assume from now on that F satisfies the field axioms F.1-F.10 and the order axioms O.1-O.7. There are several ways to describe the important property of the real numbers which has been omitted so far. These various descriptions all involve the convergence of sequences.

Before adding to our axioms, some definitions are needed. If x ∈ F, the absolute value of x, denoted |x|, is equal to x if x ≥ 0 and is equal to −x if x < 0. Next, recall the definition of an infinite sequence. An infinite sequence, or simply a sequence, is a function y whose domain is the set N of positive integers. The value y(k) of the function at k is called the k-th term of the sequence. The terms of a sequence are usually written y_k instead of y(k), and the whole sequence is denoted {y_k}. Intuitively, a sequence is just an infinite list of numbers y_1, y_2, y_3, . . . .

The notion of a limit is the most important idea connected with sequences. Suppose that y_k, ε, and L denote numbers in F. Say that the sequence {y_k} has the limit L, if for any ε > 0 there is an integer N such that |y_k − L| < ε whenever k ≥ N. For notational convenience the expression

lim_{k→∞} y_k = L

is used when the sequence {y_k} has the limit L. The dependence of N on ε may be emphasized by writing N_ε or N(ε). The phrase {y_k} converges to L is also used instead of lim_{k→∞} y_k = L.


As simple examples, consider the sequences x_k = (k + 1)/k, y_k = 2 + (−1)^k/k^2, and z_k = sin(k)/√k. In the first case

|x_k − 1| = 1/k,

so the sequence {x_k} converges to L = 1. The numbers y_k satisfy

|y_k − 2| = 1/k^2,

so the sequence {y_k} has limit 2. Finally, since |sin(k)| ≤ 1, it follows that

|z_k − 0| ≤ 1/√k,

and

lim_{k→∞} z_k = 0.
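The dependence of N on ε can be made explicit for the first example. Since |x_k − 1| = 1/k, the inequality |x_k − 1| < ε holds exactly when k > 1/ε, so one admissible choice is N(ε) = ⌊1/ε⌋ + 1. The sketch below checks this with exact rational arithmetic; the function names `x` and `N_for` are chosen here for illustration.

```python
from fractions import Fraction
from math import floor

def x(k):
    # The sequence x_k = (k + 1)/k as an exact fraction.
    return Fraction(k + 1, k)

def N_for(eps):
    # Smallest integer N with 1/k < eps for every k >= N, namely floor(1/eps) + 1.
    return floor(1 / eps) + 1

eps = Fraction(1, 100)
N = N_for(eps)
print(N)

# Every term from index N on lies within eps of the limit 1 ...
tail_ok = all(abs(x(k) - 1) < eps for k in range(N, N + 500))
# ... while the term just before N does not.
previous_fails = abs(x(N - 1) - 1) >= eps
print(tail_ok, previous_fails)
```

Only finitely many tail terms are tested, of course; the inequality for all k ≥ N comes from the formula |x_k − 1| = 1/k, not from the loop.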

A basic feature distinguishing the real numbers from the rational numbers is that well-behaved sequences of rational numbers may fail to converge because the number which should be the limit is not rational. For instance the sequence

x_1 = 1, x_2 = 1.4, x_3 = 1.41, x_4 = 1.414, . . . ,

of truncated decimal expansions for √2 is a sequence of rational numbers that wants to converge to √2, but, of course, √2 is not in the set of rational numbers. This sequence {x_k} has no limit in the rational numbers. This phenomenon does not occur for the real numbers.
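The truncated decimal expansions can be generated with integer arithmetic alone, which keeps every term an exact rational number. In the sketch below the k-th term is obtained from the integer square root of 2 · 10^{2(k−1)}; the function name `truncated_sqrt2` is an assumption made for this illustration.

```python
from fractions import Fraction
from math import isqrt

def truncated_sqrt2(k):
    """k-th term of the sequence: sqrt(2) cut off after k - 1 decimal places."""
    scale = 10 ** (k - 1)
    # isqrt gives the largest integer whose square is at most 2 * scale^2.
    return Fraction(isqrt(2 * scale * scale), scale)

terms = [truncated_sqrt2(k) for k in range(1, 8)]
for k, t in enumerate(terms, start=1):
    # Print the term and the exact gap 2 - t^2, which stays positive.
    print(k, t, 2 - t * t)
```

The terms never decrease and their squares stay strictly below 2, while the gap 2 − x_k^2 shrinks at each step. The sequence is bounded and monotone, yet no rational number can serve as its limit.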

The first way to characterize the good behavior of the reals involves bounded increasing or decreasing sequences. Such a sequence is illustrated in Figure 6.1. Say that a sequence x_k ∈ F is monotonically increasing if x_l ≥ x_k whenever l > k. Similarly x_k is monotonically decreasing if x_l ≤ x_k whenever l > k. If x_k is either monotonically increasing or decreasing, the sequence is said to be monotone. A set U ⊂ F is bounded if there is an M ∈ F such that |x| ≤ M for all x ∈ U. A set U ⊂ F is bounded above if there is an M ∈ F such that x ≤ M for all x ∈ U, while V ⊂ F is bounded below if there is an M ∈ F such that x ≥ M for all x ∈ V.

Here is a completeness axiom which gives one description of the good convergence properties of the real numbers. This particular property is called the bounded monotone sequence property.

Figure 6.1: A bounded monotone sequence

BMS Every bounded monotone sequence {x_n} has a limit L. (C.1)

There is a second completeness property of the real numbers which is closely related to C.1. For k ∈ N say that the intervals [x_k, y_k] ⊂ F are nested if [x_{k+1}, y_{k+1}] ⊂ [x_k, y_k]. The idea of nested intervals is illustrated in Figure 6.2. The next property is called the Nested Interval Principle.

NIP If {[x_k, y_k]} is a nested sequence of intervals with lim_{k→∞} (y_k − x_k) = 0, then there is exactly one real number L such that x_k ≤ L ≤ y_k for all k = 1, 2, 3, . . . .

The third completeness property of the real numbers involves the notion of a least upper bound. Say that y ∈ F is an upper bound for the set U ⊂ F if x ≤ y for every x ∈ U. Say that z ∈ F is a least upper bound for U if z is an upper bound for U, and if no number y < z is an upper bound for U. The final property to be considered is the least upper bound property,

LUB Every nonempty set U ⊂ R which is bounded above has a least upper bound.

Although the three properties BMS, NIP, and LUB have distinct descriptions, they are in fact equivalent. We will add axiom C.1 (BMS) to our previous collection of axioms; with this addition our characterization of the real numbers will be finished. That is, the real numbers are a set $\mathbb{R}$, together with the functions $+$ and $\cdot$, and the order relation $\le$, which satisfy the field axioms F.1-F.10, the order axioms O.1-O.7, and the completeness axiom C.1. Any two number systems which satisfy the axioms for the real numbers only differ by what amounts to a renaming of the elements, arithmetic functions, and order relation (the proof is omitted). This characterization does not say what the real numbers 'are'; instead, the behavior of the real numbers is described, as was our plan.

Figure 6.2: Nested intervals

Before addressing the equivalence of the completeness properties, let us show that a field $F$ satisfying the Nested Interval Principle contains $\sqrt{2}$. This will demonstrate that a completeness property distinguishes the real numbers from the rationals. The method, called bisection, is a popular computer algorithm.

Theorem 6.3.1. Suppose the Archimedean ordered field $F$ satisfies the Nested Interval Principle. Then there is a number $L > 0$ such that $L^2 = 2$.

Proof. Begin with the interval $[x_0, y_0] = [1, 2]$. Notice that $x_0^2 \le 2$ while $y_0^2 \ge 2$. Given an interval $[x_n, y_n]$ with $x_n^2 \le 2$ and $y_n^2 \ge 2$, the next interval $[x_{n+1}, y_{n+1}]$ is constructed as follows. First define $c_n = (x_n + y_n)/2$; notice that $x_n \le c_n \le y_n$ and

$$y_n - c_n = (y_n - x_n)/2 = c_n - x_n.$$

If $c_n^2 > 2$, take $[x_{n+1}, y_{n+1}] = [x_n, c_n]$; otherwise, let $[x_{n+1}, y_{n+1}] = [c_n, y_n]$.

By construction the intervals $[x_n, y_n]$ are nested, and $|y_n - x_n| = 2^{-n}$. The Nested Interval Principle says there is exactly one real number $L$ such that $x_n \le L \le y_n$. Also

$$0 \le y_n^2 - x_n^2 = (y_n + x_n)(y_n - x_n) \le \frac{4}{2^n}.$$

Since $x^2$ is strictly increasing if $x \in [1, 2]$, and $x_n^2 \le 2$ while $y_n^2 \ge 2$,

$$x_n^2 - y_n^2 \le x_n^2 - 2 \le L^2 - 2 \le y_n^2 - 2 \le y_n^2 - x_n^2.$$

Thus

$$|L^2 - 2| \le y_n^2 - x_n^2 \le \frac{4}{2^n}.$$

Since $n$ is arbitrary, Lemma 6.2.8 gives $L^2 = 2$.
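The bisection construction in this proof can be carried out exactly in rational arithmetic. The following is a minimal sketch, assuming Python's `fractions.Fraction` as a stand-in for the rationals; the function name `bisect_sqrt2` is illustrative and does not come from the text.

```python
from fractions import Fraction

def bisect_sqrt2(steps):
    """Return the interval [x_n, y_n] after `steps` bisections of [1, 2]."""
    x, y = Fraction(1), Fraction(2)
    for _ in range(steps):
        c = (x + y) / 2          # midpoint c_n
        if c * c > 2:
            y = c                # keep [x_n, c_n]
        else:
            x = c                # keep [c_n, y_n]
    return x, y

x, y = bisect_sqrt2(20)
print(float(x), float(y))        # both endpoints are rational approximations
```

Every endpoint produced this way is rational, so the intervals shrink toward a point whose square is 2 without ever reaching it inside the rationals.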

With a bit more work this result can be generalized. The proof of the resulting intermediate value theorem for polynomials is left to the reader.

Theorem 6.3.2. Suppose $F$ satisfies the field and order axioms, and the Nested Interval Principle. Let

$$p(x) = a_n x^n + \dots + a_1 x + a_0, \quad a_k \in F,$$

be a polynomial. If $r \in F$ and there are real numbers $x_0$ and $y_0$ such that

$$p(x_0) \le r, \quad p(y_0) \ge r,$$

then there is a number $x \in F$ satisfying $x_0 \le x \le y_0$ such that $p(x) = r$.

The equivalence of the completeness axiom C.1 to the NIP and LUB properties will now be established through a sequence of propositions. The following observation will be useful.

Lemma 6.3.3. Suppose that $x_k \ge M$ for $k = 1, 2, 3, \dots$, and that $\lim_{k\to\infty} x_k = L$. Then $L \ge M$.

Proof. The argument is by contradiction. Suppose that $L < M$, and take $\varepsilon = M - L$. The fact that $\lim_{k\to\infty} x_k = L$ means there is an $N$ such that

$$|x_k - L| < \varepsilon, \quad k \ge N.$$

Since $x_N \ge M$ and $L < M$, it follows that $x_N - L > 0$ and

$$x_N - L < \varepsilon = M - L.$$

This gives $x_N < M$, contradicting the hypotheses.

Proposition 6.3.4. The Bounded Monotone Sequence property implies the Nested Interval Principle.

Proof. Suppose there is a nested sequence of intervals $[x_k, y_k]$ for $k \in \mathbb{N}$, with $x_k \le x_{k+1} < y_{k+1} \le y_k$ and $\lim_{k\to\infty} (y_k - x_k) = 0$. The sequence $\{x_k\}$ is increasing, and $x_k \le y_1$ for all $k$. Since the sequence $\{x_k\}$ is increasing and bounded above, the Bounded Monotone Sequence property implies that there is a number $L_1$ such that

$$\lim_{k\to\infty} x_k = L_1.$$

Similarly, the sequence $\{y_k\}$ is decreasing and bounded below, so there is a number $L_2$ such that

$$\lim_{k\to\infty} y_k = L_2.$$

Notice that

$$L_2 - L_1 = \lim_{k\to\infty} y_k - \lim_{k\to\infty} x_k = \lim_{k\to\infty} (y_k - x_k) = 0,$$

or $L_2 = L_1$. Let $L = L_1 = L_2$.

For any fixed index $j$ we have $x_k \le y_j$, so by the previous lemma $x_j \le L \le y_j$. Suppose there were a second number $M \in [x_k, y_k]$ for each $k \in \mathbb{N}$. Then $|L - M| \le |y_k - x_k|$. Since $\lim_{k\to\infty} (y_k - x_k) = 0$, it follows that $|L - M| < \varepsilon$ for every $\varepsilon > 0$, so $L - M = 0$.

Proposition 6.3.5. The Nested Interval Principle implies the Least Upper Bound Property.

Proof. Suppose that $U \subset \mathbb{R}$ is a nonempty set which is bounded above by $z$. Pick a number $a_1$ which is not an upper bound for $U$, and a number $b_1$ which is an upper bound for $U$. Let $c_1 = (a_1 + b_1)/2$. If $c_1$ is an upper bound for $U$ define $b_2 = c_1$ and $a_2 = a_1$, otherwise define $a_2 = c_1$ and $b_2 = b_1$. Continuing in this fashion we obtain sequences $\{a_k\}$ and $\{b_k\}$ satisfying $a_k \le a_{k+1} < b_{k+1} \le b_k$ with $b_k - a_k = (b_1 - a_1)/2^{k-1}$. Moreover each point $b_k$ is an upper bound for $U$, and each point $a_k$ is not an upper bound for $U$.

By the Nested Interval Principle there is a number $L$ satisfying $a_k \le L \le b_k$ for all $k \in \mathbb{N}$; this implies

$$|L - a_k| \le b_k - a_k, \quad |b_k - L| \le b_k - a_k.$$

If $L$ were not an upper bound for $U$, then there would be $x \in U$, $x > L$. Write $x = L + (x - L)$. Since $x - L > 0$, the number $L + (x - L) > b_k$ for $k$ sufficiently large. This means $x > b_k$, which is impossible since $b_k$ is an upper bound for $U$. Thus $L$ is an upper bound for $U$.

Similarly, if $L$ were not the least upper bound for $U$ there would be some $y < L$ which was an upper bound. Since $y = L - (L - y)$, the number $L - (L - y) < a_k$, or $y < a_k$, for $k$ sufficiently large. This contradiction means that $L$ is the least upper bound for $U$.
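The halving construction in this proof can be sketched in code when membership in the set of upper bounds can be tested. The sketch below assumes a hypothetical predicate `is_upper_bound` supplied by the caller; the names `bisect_sup` and `is_upper_bound` are illustrative and are not from the text. The example set is $U = \{1 - 1/k : k = 1, 2, 3, \dots\}$, for which a number $c$ is an upper bound exactly when $c \ge 1$.

```python
from fractions import Fraction

def bisect_sup(is_upper_bound, a, b, steps):
    """Halve [a, b] `steps` times, keeping a a non-upper-bound and b an upper bound."""
    a, b = Fraction(a), Fraction(b)
    for _ in range(steps):
        c = (a + b) / 2
        if is_upper_bound(c):
            b = c                # c is an upper bound: shrink from the right
        else:
            a = c                # c is not an upper bound: shrink from the left
    return a, b

# For U = {1 - 1/k}, c is an upper bound exactly when c >= 1.
a, b = bisect_sup(lambda c: c >= 1, 0, 3, 30)
print(float(a), float(b))
```

The two endpoints squeeze the least upper bound, here 1, from below and above, with $b_k - a_k$ halving at each step.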

Another lemma will help complete the chain of logical equivalences for the completeness properties.

Lemma 6.3.6. Suppose $U \subset \mathbb{R}$ is nonempty and bounded above by $L$. Then $L$ is the least upper bound for $U$ if and only if for every $\varepsilon > 0$ there is an $x \in U$ such that $0 \le L - x < \varepsilon$.

Proof. First suppose that $L$ is the least upper bound. Then for any $\varepsilon > 0$ the number $L - \varepsilon$ is not an upper bound for $U$, and so there is an $x \in U$ with $L - \varepsilon < x \le L$.

Suppose now that $L$ is an upper bound for $U$ and that for every $\varepsilon > 0$ there is an $x \in U$ such that $0 \le L - x < \varepsilon$. Suppose $M < L$ is another upper bound for $U$. Take $\varepsilon = L - M$. By assumption there is an $x \in U$ such that

$$0 \le L - x < L - M,$$

or $x > M$. This contradicts the assumption that $M$ is an upper bound for $U$, so that $L$ is the least upper bound.

Proposition 6.3.7. The Least Upper Bound Property of the real numbers implies the Bounded Monotone Sequence Property.

Proof. Suppose that $\{x_k\}$ is an increasing sequence bounded above by $M$. The set of numbers in the sequence has a least upper bound $L$. Since $L$ is the least upper bound, for every $\varepsilon > 0$ there is an $x_N$ such that $L - x_N < \varepsilon$. Since the sequence is increasing we have $x_k \le x_{k+1} \le L$, so $L - x_k \le L - x_N < \varepsilon$ for all $k \ge N$. Thus the sequence $x_k$ converges to $L$.

6.4 Subsequences and compact intervals

One of the simplest examples of a sequence which does not converge has the terms

$$x_k = (-1)^k.$$

Notice that while there is no limit for the entire sequence, there are parts of the sequence which do have limits. In fact this example consists of two interleaved sequences, each of which converges. Looking at the terms with even and odd indices respectively,

$$y_k = x_{2k} = 1, \quad z_k = x_{2k+1} = -1,$$

it is easy to see that the sequences $\{y_k\}$ and $\{z_k\}$ converge. This simple example motivates the question of whether this type of behavior is typical; given a sequence $\{x_k\}$, is there a sequence $\{y_k\}$ consisting of some portion of the terms $x_k$ such that $\{y_k\}$ converges?

To address this question, it is necessary to define a subsequence, which roughly speaking will be an infinite sublist of the terms $x_k$. More precisely, if $\{x_k\}$ is a sequence, $k = 1, 2, 3, \dots$, say that $\{y_j\}$ is a subsequence of $\{x_k\}$ if there is a function $k(j) : \mathbb{N} \to \mathbb{N}$ which is strictly increasing, and such that $y_j = x_{k(j)}$. By strictly increasing we mean that $k(j + 1) > k(j)$ for all $j = 1, 2, 3, \dots$. Notice that the elements of a subsequence appear in the same order as the corresponding elements of the original sequence. Also notice that the sequence $y_j = 1$, or $1, 1, 1, \dots$, is not a subsequence of the sequence $x_k = 1/k$, or $1, 1/2, 1/3, \dots$, since a subsequence has the requirement $k(j+1) > k(j)$, forcing $k(2) > 1$, and so $y_2 = x_{k(2)} < 1$.
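The definition can be illustrated with a short sketch in which a sequence and an index function are both ordinary Python functions on the positive integers. The helper name `subsequence_terms` is hypothetical; it only inspects finitely many indices, so it can detect a failure of strict monotonicity but cannot certify it for all $j$.

```python
def subsequence_terms(x, k, count):
    """Return y_1, ..., y_count where y_j = x(k(j)).

    Raises ValueError if k is not strictly increasing on 1, ..., count + 1.
    """
    indices = [k(j) for j in range(1, count + 2)]
    for earlier, later in zip(indices, indices[1:]):
        if later <= earlier:
            raise ValueError("index function is not strictly increasing")
    return [x(i) for i in indices[:count]]

def x(k):
    return (-1) ** k

evens = subsequence_terms(x, lambda j: 2 * j, 5)       # terms x_2, x_4, ...
odds = subsequence_terms(x, lambda j: 2 * j + 1, 5)    # terms x_3, x_5, ...
print(evens, odds)
```

A constant index function such as `lambda j: 1` is rejected, which mirrors the remark that $1, 1, 1, \dots$ is not a subsequence of $1, 1/2, 1/3, \dots$.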

The sample sequence $x_k = (-1)^k$ is not convergent, but it does have convergent subsequences. The behavior of the sequence $y_k = k$ is different; this sequence has no convergent subsequences. In general a sequence can have subsequences with many different limits. However, the situation is simple if the sequence $\{x_k\}$ has a limit.

Lemma 6.4.1. Suppose $\{x_k\}$ is a sequence with a limit $L$. Then every subsequence $\{y_j\} = \{x_{k(j)}\}$ of $\{x_k\}$ converges to $L$.

Proof. Pick any $\varepsilon > 0$. By assumption there is an index $N$ such that $|x_k - L| < \varepsilon$ whenever $k \ge N$. Suppose that $j \ge N$. Since $k(j)$ is strictly increasing, $k(j) \ge j \ge N$ and

$$|y_j - L| = |x_{k(j)} - L| < \varepsilon.$$

It is an important fact that if a sequence of real numbers is bounded, then the sequence has a convergent subsequence.

Theorem 6.4.2. Suppose $\{x_k\}$ is a bounded sequence of real numbers. Then there is a subsequence $\{x_{k(j)}\}$ of $\{x_k\}$ which converges.

Proof. Suppose the elements of the sequence satisfy $-M \le x_k \le M$ for some $M > 0$. For convenience we may assume that $M$ is an integer. The plan is to construct a sequence of intervals satisfying the Nested Interval Principle, and arrange that the limit guaranteed by this result is also the limit of a subsequence of $\{x_k\}$.

Break up the interval $[-M, M]$ into subintervals $[n, n + 1]$ of length 1 with integer endpoints. Since the set of positive integers $\mathbb{N}$ is infinite, and the number of intervals $[n, n + 1]$ contained in $[-M, M]$ is finite, there must be at least one such interval $I_0$ containing $x_k$ for an infinite collection of indices $k$. Let $k(0)$ be the smallest of the indices $k$ for which $x_k \in I_0$.

In a similar way, partition $I_0$ into 10 nonoverlapping subintervals of length $10^{-1}$. At least one of these intervals $I_1 \subset I_0$ contains $x_k$ for an infinite collection of indices $k$. Let $k(1)$ be the smallest of the indices $k$ for which $x_k \in I_1$ and $k(0) < k(1)$.

Continue in this fashion (see Figure 6.3) for every positive integer $m$, partitioning $I_{m-1}$ into 10 nonoverlapping subintervals of length $10^{-m}$. At least one of these intervals $I_m \subset I_{m-1}$ contains $x_k$ for an infinite collection of indices $k$. Let $k(m)$ be the smallest of the indices $k$ for which $x_k \in I_m$ and $k(m - 1) < k(m)$.

The intervals $I_m$ are nested by construction, and the length of $I_m$ is $10^{-m}$. By the Nested Interval Principle there is a (unique) point $z$ which is in the intersection of all the intervals $I_m$. Since $x_{k(j)} \in I_m$ if $j \ge m$, it follows that $|z - x_{k(j)}| \le 10^{-m}$ for all $j \ge m$. Thus the subsequence $x_{k(j)}$ converges to $z$.

This proof actually shows that if $\{x_k\}$ is any sequence from the set $[-M, M]$, then there is a subsequence $\{x_{k(j)}\}$ which converges to a point $z \in [-M, M]$, since $I_m \subset [-M, M]$. With no essential change the argument shows that this observation may be extended to any closed interval $[a, b]$. Say that $K \subset \mathbb{R}$ is compact if every sequence $x_k \in K$ has a subsequence which converges to a point of $K$. The next result is a consequence of the previous theorem.

Figure 6.3: Constructing a convergent subsequence

Corollary 6.4.3. For any real numbers $a \le b$ the interval $[a, b]$ is compact.

It is certainly not true that arbitrary intervals are compact. For instance the interval $(-\infty, \infty)$ contains the sequence $x_k = k$, which has no convergent subsequence. Also, the interval $(0, 1)$ contains the sequence $x_k = 1/(k+1)$ for $k = 1, 2, 3, \dots$. This sequence, and all its subsequences, converge to the real number 0. Since $0 \notin (0, 1)$ the open interval $(0, 1)$ is not compact. By checking the various possibilities one can verify that the only compact intervals are those of the form $[a, b]$ where $a, b \in \mathbb{R}$.

6.5 Products and fractions

The extension of finite sums to infinite series is typically encountered in Calculus. At an elementary level it is less common to encounter the analogous constructions of infinite products and continued fractions. Infinite products will be introduced through some elementary problems in probability. Properties of the natural logarithm and exponential function will be used, including their continuity. Presenting this material before the formal development of continuity and integration is a bit at odds with our general axiom-based approach, but the appeal of some interesting applications of completeness proved irresistible. Continued fractions are more commonly encountered in number theory, where they play a role in the study of the approximation of real numbers by rational numbers.

6.5.1 Infinite products

Swindler Stan, the gambling man, comes to you with a pair of offers. "Let's play one of these games," he says. "Each game starts with me giving you $1,000." Needless to say, your interest is aroused.

In the first game you draw a ball from an urn once a day. On day $k$ the urn has $k$ white balls and 1 black ball. If you draw a black ball the game ends, and you pay Stan $2,000. If you never draw a black ball, you keep the $1,000 forever.

The second game has you throwing dice. On day $k$ you throw $k$ dice. If you throw $k$ ones the game ends, and you pay Stan $10,000. If you never throw all ones, you keep the $1,000 forever.

You remember a bit of elementary probability, and start to think about game two. Suppose you play for $n$ days. You keep the $1,000 at the end of $n$ days if you manage to avoid throwing all ones on each day. Since the probability of throwing $k$ ones is $(1/6)^k$, the probability of avoiding all ones on day $k$ is

$$p_k = 1 - (1/6)^k.$$

The throws on different days are independent, so the probability of hanging on to the money at the end of $n$ days is

$$P_n = p_1 \cdot p_2 \cdots p_n = \prod_{k=1}^{n} (1 - 6^{-k}).$$

Turning back to game one, the probability of avoiding the black ball on day $k$ is $q_k = 1 - 1/(k + 1)$. Reasoning in a similar fashion, you conclude that the probability of retaining the money after $n$ days in game one is

$$Q_n = q_1 \cdot q_2 \cdots q_n = \prod_{k=1}^{n} \Bigl(1 - \frac{1}{k + 1}\Bigr).$$

You reasonably conclude that the probability of retaining your money forever in game one is

$$Q = \lim_{n\to\infty} \prod_{k=1}^{n} \Bigl(1 - \frac{1}{k + 1}\Bigr),$$

while for game two the probability is

$$P = \lim_{n\to\infty} \prod_{k=1}^{n} (1 - 6^{-k}).$$

Just as an infinite series is defined by

$$\sum_{k=1}^{\infty} a_k = \lim_{n\to\infty} \sum_{k=1}^{n} a_k,$$

infinite products can be defined by

$$\prod_{k=1}^{\infty} p_k = \lim_{n\to\infty} \prod_{k=1}^{n} p_k,$$

if the limit exists. The first challenge is to develop some understanding of when such limits exist. Afterward (see problems 36 and 37) we can consider the attractiveness of Stan's offer.
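The partial products $Q_n$ and $P_n$ for the two games can be computed exactly for small $n$. The following is a rough numerical sketch, assuming Python's `fractions.Fraction` for exact arithmetic; the function names are illustrative and not from the text. It computes finitely many partial products only, so it suggests but does not prove the behavior of the limits.

```python
from fractions import Fraction

def partial_product(factor, n):
    """Exact product factor(1) * factor(2) * ... * factor(n)."""
    prod = Fraction(1)
    for k in range(1, n + 1):
        prod *= factor(k)
    return prod

def Q(n):
    # Game one: avoid the black ball on days 1..n.
    return partial_product(lambda k: 1 - Fraction(1, k + 1), n)

def P(n):
    # Game two: avoid throwing all ones on days 1..n.
    return partial_product(lambda k: 1 - Fraction(1, 6 ** k), n)

for n in (1, 10, 100):
    print(n, float(Q(n)), float(P(n)))
```

Since $1 - 1/(k+1) = k/(k+1)$, the product for game one telescopes to $Q_n = 1/(n+1)$, while the computed values of $P_n$ appear to level off.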


By making use of the logarithm function, it is straightforward to understand the limit process for products. Suppose the numbers $c_k$ are positive, so that $\log(c_k)$ is defined. Since the log of a product is the sum of the logs,

$$\log\Bigl(\prod_{k=1}^{n} c_k\Bigr) = \sum_{k=1}^{n} \log(c_k).$$

If the last series converges, then

$$\lim_{n\to\infty} \prod_{k=1}^{n} c_k = \lim_{n\to\infty} \exp\Bigl(\sum_{k=1}^{n} \log(c_k)\Bigr) = \exp\Bigl(\lim_{n\to\infty} \sum_{k=1}^{n} \log(c_k)\Bigr) = \exp\Bigl(\sum_{k=1}^{\infty} \log(c_k)\Bigr).$$

Here the continuity of the exponential function is used.

Recall (Lemma 4.3.1) that in order for a series to converge it is necessary (but not sufficient) to have

$$\lim_{k\to\infty} \log(c_k) = 0.$$

Since the exponential function is continuous this means that

$$\lim_{k\to\infty} c_k = \lim_{k\to\infty} e^{\log(c_k)} = e^{\lim_{k\to\infty} \log(c_k)} = e^0 = 1.$$

This being the case, we will write

$$c_k = 1 + a_k$$

and look for conditions on the sequence $a_k > -1$ which ensure the convergence of the series

$$\sum_{k=1}^{\infty} \log(1 + a_k).$$

The essential observation is that for $|a_k|$ small,

$$|\log(1 + a_k)| \approx |a_k|.$$

To make this more precise, start with the definition

$$\log(y) = \int_1^y \frac{1}{t}\, dt,$$

which implies

$$\log(1 + x) = \int_1^{1+x} \frac{1}{t}\, dt. \tag{6.3}$$

Some simple estimates coming from the interpretation of the integral as an area will establish the following result.

Lemma 6.5.1. For $|x| \le 1/2$,

$$|x|/2 \le |\log(1 + x)| \le 2|x|.$$

Proof. If $x \ge 0$ then $1/(1+x) \le 1/t \le 1$ for $1 \le t \le 1+x$. Using these estimates with (6.3) gives

$$\frac{x}{1 + x} \le \log(1 + x) \le x,$$

and the desired inequality holds if $0 \le x \le 1$. For $-1 < x < 0$ write

$$1 + x = 1 - |x|.$$

Using (6.3) again we find

$$|x| \le |\log(1 - |x|)| \le \frac{|x|}{1 - |x|},$$

and the desired inequality holds if $-1/2 \le x < 0$. Pasting these two cases together gives the final result.
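The two-sided estimate of Lemma 6.5.1 can be spot-checked numerically. This is only a sanity check on sample points, assuming the standard library function `math.log1p` for $\log(1+x)$; the helper name `lemma_holds` is illustrative. It is no substitute for the proof.

```python
import math

def lemma_holds(x):
    """Check |x|/2 <= |log(1 + x)| <= 2|x| at a single point x > -1."""
    size = abs(math.log1p(x))
    return abs(x) / 2 <= size <= 2 * abs(x)

# Sample points spread across the interval [-1/2, 1/2].
samples = [-0.5 + i / 1000 for i in range(1001)]
print(all(lemma_holds(x) for x in samples))
```

The restriction $|x| \le 1/2$ matters: at $x = -0.9$ the upper bound fails, since $|\log(0.1)|$ exceeds $1.8$.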

This prepares us for the main result on infinite products.

Theorem 6.5.2. Suppose that $1 + a_k > 0$. If the series

$$\sum_{k=1}^{\infty} |a_k| \tag{6.4}$$

converges, then the sequence

$$p_n = \prod_{k=1}^{n} (1 + a_k)$$

has a limit $0 < p < \infty$, where

$$p = \exp\Bigl(\sum_{k=1}^{\infty} \log(1 + a_k)\Bigr).$$


Proof. If the series in (6.4) converges, then

$$\lim_{k\to\infty} |a_k| = 0;$$

in particular there is an integer $N$ such that $|a_k| < 1/2$ for $k \ge N$. For $n \ge N$ the estimate of Lemma 6.5.1 gives

$$\sum_{k=N}^{n} |\log(1 + a_k)| \le 2 \sum_{k=N}^{n} |a_k|.$$

Consequently the series

$$\sum_{k=1}^{\infty} \log(1 + a_k)$$

converges absolutely, with sum $S \in \mathbb{R}$.

As noted above, the continuity of the exponential function gives

$$\lim_{n\to\infty} \prod_{k=1}^{n} (1 + a_k) = \lim_{n\to\infty} \exp\Bigl(\sum_{k=1}^{n} \log(1 + a_k)\Bigr) = \exp\Bigl(\lim_{n\to\infty} \sum_{k=1}^{n} \log(1 + a_k)\Bigr) = e^S.$$
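As an illustration of Theorem 6.5.2, consider the sample choice $a_k = -1/(k+1)^2$, which is not taken from the text. Here $1 + a_k > 0$ and $\sum |a_k|$ converges, and the partial products telescope to $(n+2)/(2(n+1))$. The sketch below, with illustrative function names, compares the exact partial product with the exponential of the sum of logarithms.

```python
from fractions import Fraction
import math

def a(k):
    # Sample terms a_k = -1/(k+1)^2 (an assumption made for illustration).
    return -Fraction(1, (k + 1) ** 2)

def partial_product(n):
    """Exact value of (1 + a_1)(1 + a_2)...(1 + a_n)."""
    prod = Fraction(1)
    for k in range(1, n + 1):
        prod *= 1 + a(k)
    return prod

def exp_of_log_sum(n):
    """exp of the partial sum of log(1 + a_k), in floating point."""
    return math.exp(sum(math.log1p(float(a(k))) for k in range(1, n + 1)))

n = 200
print(float(partial_product(n)), exp_of_log_sum(n))
```

Both numbers agree up to rounding, and as $n$ grows the exact values $(n+2)/(2(n+1))$ approach $1/2$, a limit strictly between 0 and $\infty$ as the theorem predicts.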

There are a few points to address before considering whether the converse of Theorem 6.5.2 is valid. If a single factor $1 + a_j$ is equal to 0, then all of the partial products $\prod_{k=1}^{n} (1 + a_k)$ for $n \ge j$ will be 0 and the sequence of partial products will converge regardless of the values of the other terms $1 + a_k$. Also, if

$$\lim_{n\to\infty} \sum_{k=1}^{n} \log(1 + a_k) = -\infty,$$

then the sequence of partial products will have limit 0. It would seem desirable to avoid these cases. Problems also arise if $1 + a_k < 0$ for an infinite set of indices $k$. In this case the sequence of partial products must have an infinite subsequence of positive terms and an infinite subsequence of negative terms. If the sequence of partial products has a limit in this setting, the limit must be 0.


Despite the difficulty that arises if even a single factor is 0, there are good reasons to want to allow at least a finite number of factors which are negative or 0. We therefore say that an infinite product $\prod_{k=1}^{\infty} (1 + a_k)$ converges if there is an integer $N$ such that $1 + a_k > 0$ for $k \ge N$, and if the modified sequence of partial products $\prod_{k=N}^{n} (1 + a_k)$ has a positive limit.

The proof of the next result is left as an exercise.

Theorem 6.5.3. Suppose that the sequence

$$p_n = \prod_{k=1}^{n} (1 + a_k)$$

has a limit $p$, with $0 < p < \infty$, and $1 + a_k > 0$ for $k \ge N$. Then the series

$$\sum_{k=N}^{\infty} \log(1 + a_k)$$

converges. If the terms $a_k$ are all positive or all negative, then the series $\sum_{k=1}^{\infty} |a_k|$ converges.

6.5.2 Continued fractions

One interesting way in which sequences can be generated is by iteration of a function. A function $f(x)$ and an initial value $x_0$ are given. For $n \ge 0$ the sequence is then defined by

$$x_{n+1} = f(x_n).$$

Our study of continued fractions begins by considering the sequence defined by

$$x_{n+1} = 2 + \frac{1}{x_n}, \quad x_0 = 2.$$

The first few terms in this sequence are

$$x_0 = 2, \quad x_1 = 2 + \frac{1}{2}, \quad x_2 = 2 + \cfrac{1}{2 + \cfrac{1}{2}}, \quad x_3 = 2 + \cfrac{1}{2 + \cfrac{1}{2 + \cfrac{1}{2}}}, \quad \dots.$$

Of course these fractions $x_n$ are rational numbers, so could be written as a quotient of integers. Instead, we will explore the limiting process suggested by the representation of numbers as such continued fractions.
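The iteration $x_{n+1} = 2 + 1/x_n$ is easy to carry out exactly. The sketch below assumes Python's `fractions.Fraction`; the function name `iterates` is illustrative. If the sequence has a limit $x$, that limit must satisfy $x = 2 + 1/x$, whose positive solution is $1 + \sqrt{2}$; the code only compares finitely many terms with that value.

```python
from fractions import Fraction

def iterates(n):
    """Return [x_0, x_1, ..., x_n] for x_{k+1} = 2 + 1/x_k with x_0 = 2."""
    xs = [Fraction(2)]
    for _ in range(n):
        xs.append(2 + 1 / xs[-1])
    return xs

for value in iterates(5):
    print(value, float(value))
print(1 + 2 ** 0.5)   # candidate limit, for comparison
```

The first few exact values are $2$, $5/2$, $12/5$ and $29/12$, the quotients of integers mentioned above.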


Suppose first that a finite sequence $a_1, \dots, a_N$ of positive real numbers is given. A finite continued fraction is the expression

$$a_0 + \cfrac{1}{a_1 + \cfrac{1}{a_2 + \cdots + \cfrac{1}{a_N}}}. \tag{6.5}$$

The first term $a_0$ is not required to be positive. Since the expression (6.5) is so awkward, the continued fraction is usually denoted $[a_0, a_1, \dots, a_N]$.

If $a_1, a_2, a_3, \dots$ is an infinite sequence of positive numbers, it is possible to consider the infinite continued fraction

$$[a_0, a_1, a_2, \dots] = a_0 + \cfrac{1}{a_1 + \cfrac{1}{a_2 + \cdots}}. \tag{6.6}$$

For $n = 0, 1, 2, \dots$, let $x_n$ denote the real number represented by the expression $[a_0, a_1, \dots, a_n]$. The infinite continued fraction $[a_0, a_1, a_2, \dots]$ is said to be convergent if the sequence of numbers $\{x_n\}$ has a limit.

For $n \le N$ the continued fractions $[a_0, \dots, a_n]$ are said to be convergents of $[a_0, \dots, a_n, \dots, a_N]$; the terminology is the same in the case of an infinite continued fraction. When the numbers $a_0, a_1, \dots$ are further restricted to be integers, the continued fractions are called simple. Simple continued fractions provide an alternative to the decimal representation for real numbers. They are particularly important for studying approximations of real numbers by rational numbers.

Our first goal is to try to understand a finite continued fraction when it is expressed as a simple ratio. Let

$$\frac{p_n}{q_n} = [a_0, \dots, a_n].$$

Evaluating the first few cases gives

$$\frac{p_0}{q_0} = \frac{a_0}{1}, \quad \frac{p_1}{q_1} = \frac{a_0 a_1 + 1}{a_1},$$

$$\frac{p_2}{q_2} = \frac{a_0(a_1 a_2 + 1) + a_2}{a_1 a_2 + 1} = \frac{(a_0 a_1 + 1)a_2 + a_0}{a_1 a_2 + 1},$$

$$\frac{p_3}{q_3} = \frac{(a_0 a_1 a_2 + a_2 + a_0)a_3 + a_0 a_1 + 1}{(a_1 a_2 + 1)a_3 + a_1} = \frac{[(a_0 a_1 + 1)a_2 + a_0]a_3 + [a_0 a_1 + 1]}{(a_1 a_2 + 1)a_3 + a_1}.$$

There is a recursive pattern which holds in general.


Theorem 6.5.4. Suppose $a_k \in \mathbb{R}$ and $a_k > 0$ for $k \ge 1$. If

$$p_0 = a_0, \quad q_0 = 1, \quad p_1 = a_0 a_1 + 1, \quad q_1 = a_1,$$

and

$$p_n = a_n p_{n-1} + p_{n-2}, \quad q_n = a_n q_{n-1} + q_{n-2}, \quad n \ge 2, \tag{6.7}$$

then $p_n/q_n = [a_0, \dots, a_n]$.

Proof. The proof is by induction, with cases $n = 0, \dots, 3$ already established. We will make use of the observation that

$$[a_0, \dots, a_n] = [a_0, \dots, a_{n-2}, a_{n-1} + 1/a_n],$$

so here it is important that the $a_k$ not be restricted to integer values. Assuming the identity holds for all partial fractions with $m \le n$ terms $a_0, \dots, a_{m-1}$, it follows that

$$[a_0, \dots, a_n] = [a_0, \dots, a_{n-2}, a_{n-1} + 1/a_n] = \frac{(a_{n-1} + 1/a_n)p_{n-2} + p_{n-3}}{(a_{n-1} + 1/a_n)q_{n-2} + q_{n-3}}$$

$$= \frac{(a_n a_{n-1} + 1)p_{n-2} + a_n p_{n-3}}{(a_n a_{n-1} + 1)q_{n-2} + a_n q_{n-3}} = \frac{a_n p_{n-1} - a_n p_{n-3} + p_{n-2} + a_n p_{n-3}}{a_n q_{n-1} - a_n q_{n-3} + q_{n-2} + a_n q_{n-3}}$$

$$= \frac{a_n p_{n-1} + p_{n-2}}{a_n q_{n-1} + q_{n-2}} = \frac{p_n}{q_n}.$$

The relations (6.7) give

$$p_n q_{n-1} - q_n p_{n-1} = (a_n p_{n-1} + p_{n-2})q_{n-1} - p_{n-1}(a_n q_{n-1} + q_{n-2}) = -[p_{n-1} q_{n-2} - q_{n-1} p_{n-2}], \quad n \ge 2.$$

Repeated use of this identity to reduce the index leads to

$$p_n q_{n-1} - q_n p_{n-1} = (-1)^{n-1}[p_1 q_0 - q_1 p_0] = (-1)^{n-1}[(a_0 a_1 + 1) - (a_0 a_1)] = (-1)^{n-1}.$$

This gives the next result, which expresses the difference between two consecutive convergents.


Theorem 6.5.5.

$$\frac{p_n}{q_n} - \frac{p_{n-1}}{q_{n-1}} = (-1)^{n-1} \frac{1}{q_{n-1} q_n}.$$
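The recursion (6.7) translates directly into code. The following is a minimal sketch, assuming integer or `Fraction` terms and illustrative function names; it computes the pairs $(p_n, q_n)$ and, separately, evaluates $[a_0, \dots, a_N]$ from the innermost fraction outward so that the two can be compared.

```python
from fractions import Fraction

def convergents(a):
    """Return [(p_0, q_0), ..., (p_N, q_N)] for a = [a_0, ..., a_N] using (6.7)."""
    p, q = [a[0]], [1]
    if len(a) > 1:
        p.append(a[0] * a[1] + 1)
        q.append(a[1])
    for n in range(2, len(a)):
        p.append(a[n] * p[n - 1] + p[n - 2])
        q.append(a[n] * q[n - 1] + q[n - 2])
    return list(zip(p, q))

def evaluate(a):
    """Evaluate [a_0, ..., a_N] directly, starting from the last term."""
    value = Fraction(a[-1])
    for term in reversed(a[:-1]):
        value = term + 1 / value
    return value

terms = [2, 2, 2, 2]
pairs = convergents(terms)
print(pairs, evaluate(terms))
```

For the terms $[2, 2, 2, 2]$ the pairs reproduce the fractions $2, 5/2, 12/5, 29/12$, and the quantity $p_n q_{n-1} - q_n p_{n-1}$ alternates between $1$ and $-1$.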

Theorem 6.5.6. Suppose $a_k \in \mathbb{N}$ for $k = 1, 2, 3, \dots$. Then for $n \ge 0$, the integers $p_n$ and $q_n$ have no common integer factors $m \ge 2$.

Proof. For $n = 0$, an appeal to the definitions of $p_0$ and $q_0$ is sufficient. For $n \ge 1$,

$$p_n q_{n-1} - q_n p_{n-1} = (-1)^{n-1}. \tag{6.8}$$

If an integer $m \ge 2$ divides $p_n$ and $q_n$, then $m$ divides $(-1)^{n-1}$, which is impossible.

The analysis of the convergence of infinite continued fractions resembles the analysis of the convergence of alternating series. In particular, the Nested Interval Principle is put to good use.

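Theorem 6.5.7. Suppose $a_k \in \mathbb{R}$ and $a_k > 0$ for $k = 1, 2, 3, \dots$. Then

$$\frac{p_{n+2}}{q_{n+2}} > \frac{p_n}{q_n}, \quad n \text{ even},$$

$$\frac{p_{n+2}}{q_{n+2}} < \frac{p_n}{q_n}, \quad n \text{ odd}.$$

In addition the odd convergents are greater than the even convergents.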

Proof. From Theorem 6.5.5,

$$\frac{p_{n+2}}{q_{n+2}} - \frac{p_n}{q_n} = \Bigl(\frac{p_{n+2}}{q_{n+2}} - \frac{p_{n+1}}{q_{n+1}}\Bigr) + \Bigl(\frac{p_{n+1}}{q_{n+1}} - \frac{p_n}{q_n}\Bigr) = (-1)^{n+1} \frac{1}{q_{n+1} q_{n+2}} + (-1)^n \frac{1}{q_n q_{n+1}}.$$

By (6.7) the $q_n$ are positive and increasing, so

$$\frac{1}{q_{n+1} q_{n+2}} < \frac{1}{q_n q_{n+1}},$$

and the first part of the result follows by checking the signs.

To show that the odd convergents are greater than the even convergents, first use Theorem 6.5.5 to show that

$$\frac{p_n}{q_n} > \frac{p_{n-1}}{q_{n-1}}, \quad n \text{ odd},$$

$$\frac{p_n}{q_n} < \frac{p_{n-1}}{q_{n-1}}, \quad n \text{ even}.$$

Then it suffices to note that the magnitudes of the differences between successive convergents

$$\Bigl|\frac{p_n}{q_n} - \frac{p_{n-1}}{q_{n-1}}\Bigr| = \frac{1}{q_{n-1} q_n}$$

are strictly decreasing.

Theorem 6.5.8. Every simple continued fraction is convergent.

Proof. In this case $a_k \ge 1$, so (6.7) gives $q_n \ge n$. The Nested Interval Principle may be immediately applied to the intervals

$$\Bigl[\frac{p_n}{q_n}, \frac{p_{n+1}}{q_{n+1}}\Bigr], \quad n \text{ even}.$$

Consider the representation of real numbers by continued fractions. Suppose $x \in \mathbb{R}$, and $a_0 = \lfloor x \rfloor$ is the greatest integer less than or equal to $x$. Let $e_0$ be the difference between $x$ and $a_0$, or

$$x = a_0 + e_0, \quad 0 \le e_0 < 1.$$

If $e_0 \ne 0$ let

$$a_1' = \frac{1}{e_0}, \quad a_1 = \lfloor a_1' \rfloor.$$

Since $0 < e_0 < 1$, it follows that the integer $a_1$ satisfies $a_1 \ge 1$. The process may be continued if $a_1'$ is not an integer, with

$$a_1' = a_1 + e_1, \quad a_2' = \frac{1}{e_1}, \quad a_2 = \lfloor a_2' \rfloor,$$

and generally if $e_{n-1} \ne 0$,

$$a_n' = \frac{1}{e_{n-1}}, \quad a_n = \lfloor a_n' \rfloor, \quad a_n' = a_n + e_n.$$

If a term $e_n = 0$ is encountered, the algorithm simply terminates with the sequence $a_0, \dots, a_n$. Of course if the algorithm terminates, then $x$ is rational.
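For a rational input the algorithm above can be run exactly. The following is a minimal sketch, assuming Python's `fractions.Fraction` and `math.floor`; the function name `simple_cf` and the cap `max_terms` are illustrative additions, the cap being a guard that a rational input never needs.

```python
from fractions import Fraction
from math import floor

def simple_cf(x, max_terms=50):
    """Return the terms [a_0, a_1, ...] produced by the floor/reciprocal algorithm."""
    x = Fraction(x)
    terms = []
    for _ in range(max_terms):
        a = floor(x)             # a_n = greatest integer <= a'_n
        terms.append(a)
        e = x - a                # e_n, with 0 <= e_n < 1
        if e == 0:
            break                # the algorithm terminates
        x = 1 / e                # a'_{n+1} = 1 / e_n
    return terms

print(simple_cf(Fraction(415, 93)))
print(simple_cf(Fraction(-7, 3)))
```

For $415/93$ the steps are $415/93 = 4 + 43/93$, $93/43 = 2 + 7/43$, $43/7 = 6 + 1/7$ and $7 = 7 + 0$, giving the terms $[4, 2, 6, 7]$. For a negative input only $a_0$ is negative, in line with the remark that $a_0$ need not be positive.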


Notice that

$$x = [a_0, a_1'] = \Bigl[a_0, a_1 + \frac{1}{a_2'}\Bigr] = [a_0, a_1, a_2'] = \Bigl[a_0, a_1, a_2 + \frac{1}{a_3'}\Bigr] = \dots.$$

If $x$ is irrational the algorithm cannot terminate, and so an infinite (convergent) simple continued fraction is obtained. Now apply Theorem 6.5.4 with

$$x = [a_0, a_1, \dots, a_n, a_{n+1}']$$

to get

$$x = \frac{p_{n+1}'}{q_{n+1}'} = \frac{a_{n+1}' p_n + p_{n-1}}{a_{n+1}' q_n + q_{n-1}}.$$

This leads to

$$x - \frac{p_n}{q_n} = \frac{q_n[a_{n+1}' p_n + p_{n-1}] - p_n[a_{n+1}' q_n + q_{n-1}]}{q_n[a_{n+1}' q_n + q_{n-1}]} = \frac{q_n p_{n-1} - p_n q_{n-1}}{q_n[a_{n+1}' q_n + q_{n-1}]}.$$

The identity (6.8), the inequality $a_{n+1}' > a_{n+1} \ge 1$, and the definition of $q_n$ combine to give

$$\Bigl|x - \frac{p_n}{q_n}\Bigr| \le \frac{1}{q_n[a_{n+1} q_n + q_{n-1}]} = \frac{1}{q_n q_{n+1}} \le \frac{1}{q_n^2} \le \frac{1}{n^2}.$$

This estimate and the associated algorithm prove the next result.

Theorem 6.5.9. Every real number has a simple continued fraction representation.

If $x$ is irrational, the inequality

$$\Bigl|x - \frac{p_n}{q_n}\Bigr| \le \frac{1}{q_n^2}$$

holds for infinitely many distinct rationals $p_n/q_n$ in lowest terms. This result is the first step in quantifying the approximation of real numbers by rational numbers; further developments are in [7].
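The estimate can be observed numerically for $x = \sqrt{2}$. The sketch below relies on the standard fact, not stated in the text, that $\sqrt{2} = [1, 2, 2, 2, \dots]$, and builds the convergents with the recursion (6.7). The function name is illustrative, and the comparison with `math.sqrt(2)` is in floating point, so it is only meaningful for the first several convergents.

```python
import math

def sqrt2_convergents(count):
    """First `count` convergents (p_n, q_n) of [1, 2, 2, 2, ...] via (6.7)."""
    p_prev, q_prev = 1, 1            # p_0 = a_0 = 1, q_0 = 1
    p, q = 1 * 2 + 1, 2              # p_1 = a_0 a_1 + 1, q_1 = a_1
    result = [(p_prev, q_prev), (p, q)]
    while len(result) < count:
        p_prev, p = p, 2 * p + p_prev
        q_prev, q = q, 2 * q + q_prev
        result.append((p, q))
    return result[:count]

for p, q in sqrt2_convergents(8):
    error = abs(math.sqrt(2) - p / q)
    print(p, q, error, 1 / q ** 2)
```

Each printed error sits below the corresponding bound $1/q_n^2$. An exact check that avoids floating point is that every pair satisfies $p_n^2 - 2q_n^2 = \pm 1$.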


6.6 Problems

1. Suppose that $a$ and $b$ are elements of a field $F$.

(a) Show that if $a \cdot b = 0$, then $a = 0$ or $b = 0$.

(b) Show that $(-a) \cdot b = -(a \cdot b)$.

(c) Show that $-(-a) = a$.

(d) Show that every element $a$ has a unique additive inverse.

2. Suppose that $a \ne 0$ and $b \ne 0$ are elements of a field $F$.

(a) Show that $a^{-1} \ne 0$ and $(a^{-1})^{-1} = a$.

(b) Show that $ab \ne 0$ and $(ab)^{-1} = a^{-1} b^{-1}$.

3. Suppose that $a$ is an element of an ordered field. Show that $a^2 \ge 0$.

4. Let $Z_p$ be the set of integers $\{0, 1, 2, \dots, p - 1\}$, and suppose that addition $x \oplus y$ and multiplication $x \otimes y$ are carried out modulo $p$. That is, if $xy$ is normal integer multiplication and $xy = pn + r$, with $0 \le r < p$, then $x \otimes y = r$. Addition modulo $p$ is similar.

(a) Construct addition and multiplication tables for $Z_2$ and $Z_3$. For instance, here is the addition table for $Z_2$:

$$\begin{array}{c|cc} \oplus & 0 & 1 \\ \hline 0 & 0 & 1 \\ 1 & 1 & 0 \end{array}$$

(b) Show that $Z_2$ and $Z_3$ are fields.

(c) Is $Z_4$ a field?

5. Show that if $2 = 1 + 1 \ne 0$ in a field $F$, then $4 = 1 + 1 + 1 + 1 \ne 0$ in $F$.

6. Suppose that $p, q \in F$, $F$ an ordered field. Show that $q \ge 1$ and $p \ge 0$ implies $pq \ge p$.

7. If $p$ is a prime number, prove that $\sqrt{p}$ is not a rational number. (You may assume the uniqueness of prime factorization.)

8. Consider the quadratic equation

$$x^2 + bx + c = 0, \quad x \in \mathbb{R}.$$

Suppose that $b$ and $c$ are rational, and $b^2 - 4c$ is prime. Show the equation has no rational solutions.

9. Suppose that $F$ is an ordered field. Show that if $0 < a < b$, then $0 < a^2 < b^2$.

10. Suppose that $F$ is an ordered field satisfying the Archimedean Property O.7.


(a) Show that if $a, b \in F$ and $a < b$, then there is a rational number $q$ satisfying $a < q < b$. (Hint: Consider the numbers $m/n$ where $1/n < b - a$.)

(b) Using the ideas of part (a), show that for every $x \in F$ there is a sequence of rational numbers $\{q_k\}$ such that

$$\lim_{k\to\infty} q_k = x.$$

11. Construct examples of sequences with the following behavior.

(a) Find an example of a bounded sequence without a limit.

(b) Find an example of a monotone sequence without a limit.

12. Find an example of a pair of sequences $\{x_k\}$ and $\{y_k\}$ such that the intervals $[x_k, y_k]$ are nested, but there are two distinct numbers $L_1$ and $L_2$ satisfying $x_k \le L_1 < L_2 \le y_k$.

13. Find the least upper bound for the following sets:

(a) $S_1 = \{x \in \mathbb{R} \mid -2 < x < 1\}$,

(b) $S_2 = \{x \in \mathbb{R} \mid |x - 3| \le 5\}$,

(c) $S_3$ is the set of rational numbers less than $\pi$,

(d) $S_4 = \{1 - 1/k, \ k = 2, 3, 4, \dots\}$.

14. Suppose that $x_{k+1} \ge x_k$, $y_{k+1} \le y_k$, and that $x_k \le y_k$ for each positive integer $k$. Show that if $j, k$ are any two positive integers, then $x_j \le y_k$.

15. Suppose $\lim_{n\to\infty} x_n = L$ and $\lim_{n\to\infty} y_n = L$. Define the sequence $\{z_n\}$ by interleaving these sequences,

$$z_{2n-1} = x_n, \quad z_{2n} = y_n, \quad n = 1, 2, 3, \dots.$$

Show that $\lim_{n\to\infty} z_n = L$.

16. Assume that $\lim_{n\to\infty} b_n = b$. Define

$$a_n = \frac{1}{n} \sum_{k=1}^{n} b_k.$$

Show that $\lim_{n\to\infty} a_n = b$.

17. Suppose that $S \subset \mathbb{R}$ is a set with least upper bound $L$. Show that if $L \notin S$ then there is a strictly increasing infinite sequence $x_k \in S$ that converges to $L$. Show by example that this conclusion may be false if $L \in S$.

18. Suppose that $A$ and $B$ are subsets of $\mathbb{R}$ with least upper bounds $L$ and $M$ respectively. Prove or give a counterexample:

(a) The least upper bound of $A \cup B$ is the maximum of $L$ and $M$.


(b) The least upper bound of $A \cap B$ is the minimum of $L$ and $M$.

19. Consider the following problems.

(a) Suppose that $x, y$ are in a field. Prove that for $n = 1, 2, 3, \dots$,

$$x^n - y^n = (x - y) \sum_{k=0}^{n-1} x^k y^{n-1-k}.$$

(b) Prove that if $F$ satisfies the field and order axioms, and the Nested Interval Principle, then for every $x \ge 0$ and every $n \in \mathbb{N}$ there is a $z \in F$ with $z^n = x$.

20. Show that any sequence $x_k \in \mathbb{R}$ has a monotone subsequence. (Hint: Handle the cases when $\{x_k\}$ is bounded and $\{x_k\}$ is unbounded separately. If $\{x_k\}$ is bounded it has a subsequence $\{x_{k(n)}\}$ converging to $x_0$. Either there are infinitely many $n$ with $x_{k(n)} \le x_0$ or infinitely many $n$ with $x_{k(n)} \ge x_0$.)

21. Give an example of an unbounded sequence which has a convergent subsequence.

22. Construct sequences with the following properties.

(a) Suppose that $z_1, \dots, z_K$ is a finite collection of real numbers. Construct a sequence $\{x_n\}$ such that each $z_k$ is the limit of a subsequence of the sequence $\{x_n\}$.

(b) Construct a sequence $\{x_n\}$ such that each real number $z \in [0, 1]$ is the limit of a subsequence of the sequence $\{x_n\}$.

23. Suppose you do not know the Nested Interval Principle. Show directly that the Bounded Monotone Sequence property implies the Least Upper Bound property.

24. Show that a compact set is bounded.

The next series of problems will make use of the following terminology. Let $B \subset \mathbb{R}$. A point $z \in \mathbb{R}$ is an accumulation point of $B$ if there is a sequence $\{x_k\}$ with $x_k \in B$, $x_k \ne z$ for all $k$, and $\lim_{k\to\infty} x_k = z$. A point $z \in \mathbb{R}$ is a limit point of $B$ if there is a sequence $\{x_k\}$ with $x_k \in B$ and $\lim_{k\to\infty} x_k = z$. A set $B \subset \mathbb{R}$ is closed if every limit point of $B$ is an element of $B$.

25. Let $B = (0, 1)$. What is the set of accumulation points for $B$? What is the set of limit points for $B$?

26. Let $Z$ denote the set of integers. What is the set of accumulation points for $Z$? What is the set of limit points for $Z$?

27. Let $B = \{1/k, \ k = 1, 2, 3, \dots\}$. What is the set of accumulation points for $B$? What is the set of limit points for $B$?


28. Suppose that $z$ is an accumulation point of $B$. Show that there is a sequence $\{x_k\}$ of distinct points ($j \ne k$ implies $x_j \ne x_k$) with $x_k \in B$, and $\lim_{k \to \infty} x_k = z$.

29. Show that the set $(0, 1)$ is not closed.

30. Show that the set $[0, 1]$ is closed.

31. Show that the set $\mathbb{R}$ is closed.

32. Suppose that for $n = 1, \dots, N$ the real numbers $a_n, b_n$ satisfy $a_n < b_n$. Show that

\[ K = \bigcup_{n=1}^{N} [a_n, b_n] \]

is compact.

33. Show that a compact set is closed.

34. Assume that the sets $K_n \subset \mathbb{R}$ are compact, $n = 1, 2, 3, \dots$. Show that

\[ K = \bigcap_{n=1}^{\infty} K_n \]

is compact. (Assume $K$ is not the empty set.)

35. Let $\{x_n\}$ be the sequence of points

\[ 1/10, \dots, 9/10, 1/100, \dots, 99/100, 1/1000, \dots, 999/1000, \dots \]

and let $L_n = 10^{-n}$. Define

\[ K_n = \{x \in [0, 1] \mid x \notin (x_n - L_n, x_n + L_n)\}. \]

Finally, define

\[ K = \bigcap_{n=1}^{\infty} K_n. \]

(a) Show that $K$ is compact.

(b) Argue that there is a positive number $M$ such that the length of $K$ is greater than $M$.

(c) Suppose that $0 \le a < b \le 1$. Show that some point in $[a, b]$ is not in $K$.

36. If

\[ Q = \lim_{N \to \infty} \prod_{n=1}^{N} \left(1 - \frac{1}{n+1}\right), \]

is $Q > 0$? Should you play game one with Swindler Stan?


37. If

\[ P = \lim_{N \to \infty} \prod_{n=1}^{N} (1 - 6^{-n}), \]

is $P > 0$? Should you play game two with Swindler Stan?

38. Prove Theorem 6.5.3.

39. What number $x$ is represented by the continued fraction

\[ x = [2, 2, 2, \dots]? \]

Recall that this continued fraction is generated by the recursion formula

\[ x_{n+1} = 2 + \frac{1}{x_n}. \]

40. What number $x$ is represented by the continued fraction

\[ x = [2, 3, 2, 3, \dots]? \]

(Hint: find a recursion formula.)

41. A continued fraction $[a_0, a_1, a_2, \dots]$ is said to be periodic if there is a positive integer $K$ such that $a_{n+K} = a_n$ for $n = 0, 1, 2, \dots$ (or more generally for all $n \ge N$). Prove that a periodic simple continued fraction satisfies a quadratic polynomial [7, p. 144].

42. Construct a number $x$ satisfying

\[ \left| x - \frac{p_n}{q_n} \right| < \frac{1}{q_n^3} \]

for infinitely many distinct rationals $p_n/q_n$ in lowest terms. (Hint: Let $q_n = 10^n$ and consider decimal expansions of $x$ with only zeroes and ones.)


Chapter 7

Functions

7.1 Introduction

One of the cornerstones of analysis is the study of real valued functions of a real variable. To the extent that functions appear in elementary mathematics, they tend to appear either as narrow classes related to arithmetic, such as polynomials, rational functions, or roots, or as specific examples of ‘transcendental’ functions. Thus a function of a real variable $x$ might be defined by operating on $x$ by the elementary arithmetic operations addition, subtraction, multiplication and division, yielding such examples as

\[ p(x) = x^2 + 3x + 7, \quad r(x) = \frac{x - 1}{x^3 + 7x^2}, \]

or by the use of such particular functions as

\[ \sin(x), \quad e^x, \quad \log(x). \]

This restricted view of functions was shared to a considerable extent by researchers when the concept of a function was developed during the seventeenth century [9, pp. 403–406] and [3]. Early in the historical development, the infinite repetition of such operations was allowed, providing for power series, infinite products, and continued fractions.

This expectation that functions will have an explicit procedural definition can and does lead to trouble. For instance, a polynomial

\[ p(x) = a_n x^n + \dots + a_1 x + a_0, \quad a_n \ne 0 \]

will have between 0 and $n$ real roots, and $n$ complex roots (when counted with appropriate multiplicity). Is it appropriate to say that the roots are a function of the coefficients $a_0, \dots, a_n$? If the degree $n$ is two, the quadratic formula provides a procedure for explicitly expressing the roots in terms of the coefficients. Such explicit elementary formulas are not available when $n \ge 5$, and more sophisticated procedures, possibly involving infinite processes like power series, are needed. However, the use of infinite processes such as infinite sums or limits can lead to unexpected problems. Infinite sums may not converge, and limits of sequences of well-behaved functions may not be continuous, or may fail to have derivatives at any point.

The modern view is to initially downplay the importance of any procedure when discussing functions. One usually sees definitions allowing a function to be any rule which produces a single output from any permissible input. In principle, one can simply write down the elements $x$ of the domain of the function and the corresponding values of $f(x)$. The constructive procedure is completely removed, having been replaced by a generalized version of a list or table of function values.

The main emphasis of this chapter is on the properties of functions that make them susceptible to mathematical study, and useful for applications to science and engineering. Starting with the existence of limits, the development continues with such important properties as continuity, uniform continuity, and differentiability. Consequences of these properties will include important elements of Calculus such as the Extreme Value Theorem, the Intermediate Value Theorem, and the Mean Value Theorem, along with the various theorems facilitating the calculation of derivatives.

7.2 Basics

Suppose that $A$ and $B$ are two sets. To define a function $f$, we first identify a subset $D_f \subset A$ called the domain of $f$. It is traditional, at least at an elementary level, to use the following definition: a function is a rule for assigning a unique element $f(x) \in B$ to each element $x \in D_f$. The range of $f$, denoted $R_f$, is the set of all $y \in B$ such that $y = f(x)$ for some $x \in D_f$.

While this definition is useful in practice, there are a few fine points worthy of attention. Suppose $f$ and $g$ are two functions with the same domain, $D_f = D_g$. Suppose too that $f$ and $g$ are defined by distinct rules, which happen to always give the same value for every $x \in D_f$. For example, the domain might be the set $\mathbb{R}$ of all real numbers, and the rules could be

\[ f(x) = (x + 1)^3, \quad g(x) = x^3 + 3x^2 + 3x + 1. \]

The rules are obviously different, but the result is always the same. In this case we agree to declare the functions equal.

To handle this technical detail, as well as to emphasize the generality of allowed ‘rules’, functions may also be defined in terms of sets. To define a function $f$, consider a set $G_f$ of ordered pairs $(a, b)$ with $a \in A$, $b \in B$, having the property that if $(a, b_1) \in G_f$ and $(a, b_2) \in G_f$, then $b_1 = b_2$. That is, the second element of the pair is uniquely determined by the first element. The set $G_f$ is sometimes called the graph of $f$, which is supposed to be the implicitly defined ‘rule’. Notice that in this definition there is no explicit mention of the rule which produces $b$ from $a$.

For those who have some familiarity with computing, it may help to describe functions with that vocabulary. Functions have inputs and outputs. The inputs are elements of the domain. In programming, the type of the input must usually be specified, and we can think of $A$ as defining the type (Exercise 1). Similarly, the collection of all outputs is the range of the function, and the type of the output is given by $B$. Two functions, or procedures, are said to be the same as long as the allowed inputs are the same, and the outputs agree whenever the inputs agree. The notation $f : D_f \to B$ is often used to name a function, its domain, and the type of output. The same notation $f : A \to B$ may also be used to merely specify the type of the inputs and outputs, leaving implicit the exact domain. For example, one might define the rational function $r : \mathbb{R} \to \mathbb{R}$ by $r(x) = 1/x$.
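The set-based definition and the computing vocabulary can be tried out on a small scale. The following Python sketch is only an illustration under stated assumptions: the domain is replaced by a finite sample of integers, and the helper names `is_function_graph` and `graph` are hypothetical, not part of the text. It stores a function as its graph $G_f$ and checks that the two rules $(x + 1)^3$ and $x^3 + 3x^2 + 3x + 1$ produce the same set of ordered pairs.

```python
# Illustrative sketch: a function on a finite domain stored as its graph,
# a set of ordered pairs (a, b). All helper names are hypothetical.

def is_function_graph(pairs):
    """Return True if no first element is paired with two different values."""
    seen = {}
    for a, b in pairs:
        if a in seen and seen[a] != b:
            return False
        seen[a] = b
    return True

def graph(rule, domain):
    """Build the graph {(x, rule(x))} of a rule restricted to a finite domain."""
    return {(x, rule(x)) for x in domain}

def f(x):
    return (x + 1) ** 3

def g(x):
    return x**3 + 3 * x**2 + 3 * x + 1

# A finite sample of integers standing in for the domain D_f (an assumption).
domain = range(-5, 6)

# Different rules, same graph: the functions agree on this sample domain.
same = graph(f, domain) == graph(g, domain)
print(same)

# A set of pairs that is not the graph of a function: 1 has two outputs.
print(is_function_graph({(1, 2), (1, 3)}))
```

Comparing graphs on a finite sample does not prove that the functions are equal on all of $\mathbb{R}$; that still requires the algebraic identity.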

In elementary analysis the domain and range of our functions are usually subsets of the real numbers $\mathbb{R}$, so we may take $A = \mathbb{R}$ and $B = \mathbb{R}$. In fact the domain of a function is often an interval $I$. A set $I \subset \mathbb{R}$ is an interval if for every pair $a, b \in I$, every number $x$ with $a \le x \le b$ is also in $I$. Important cases include the open intervals

\[ (a, b) = \{x \mid a < x < b\} \]

and the closed intervals

\[ [a, b] = \{x \mid a \le x \le b\}. \]

A function $f$ is a polynomial if it can be written in the form

\[ f(x) = \sum_{k=0}^{n} c_k x^k. \]


The coefficients $c_k$ will usually be real numbers, although even in elementary algebra it is not uncommon to allow the $c_k$ to be complex numbers. A function $g$ is a rational function if it can be written in the form

\[ g(x) = \frac{p(x)}{q(x)}, \]

where $p$ and $q$ are polynomials, and $q$ is not everywhere 0. The value of a polynomial may be computed whenever $x$ is a real number, and the value of a rational function may be computed whenever $x$ is a real number and $q(x) \ne 0$. When using familiar functions whose domain may be defined by virtue of the operations in the function’s rule, the explicit description of the domain is often omitted, with the understanding that the ‘natural’ domain is implied.
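As a small illustration of the ‘natural’ domain, the sketch below evaluates a polynomial from its coefficient list and refuses to evaluate a rational function where the denominator vanishes. The function names `poly_eval` and `rational_eval`, the coefficient-list representation, and the use of Horner's rule are choices made here for illustration; they are not taken from the text.

```python
def poly_eval(coeffs, x):
    """Evaluate sum_k coeffs[k] * x**k using Horner's rule."""
    result = 0
    for c in reversed(coeffs):
        result = result * x + c
    return result

def rational_eval(p, q, x):
    """Evaluate p(x)/q(x); x lies outside the natural domain when q(x) == 0."""
    denom = poly_eval(q, x)
    if denom == 0:
        raise ValueError(f"x = {x} is not in the natural domain: q(x) = 0")
    return poly_eval(p, x) / denom

# r(x) = (x - 1) / (x^3 + 7x^2), the example from Section 7.1.
p = [-1, 1]        # -1 + x
q = [0, 0, 7, 1]   # 7x^2 + x^3

print(rational_eval(p, q, 2))   # defined: q(2) = 36 is not zero

try:
    rational_eval(p, q, 0)      # q(0) = 0, so 0 is excluded
except ValueError as err:
    print(err)
```

Here the natural domain of $r$ is $\mathbb{R}$ with the roots $0$ and $-7$ of the denominator removed.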

7.3 Limits and continuity

7.3.1 Limits

7.3.1.1 Limit as x → ∞

One context in which limits seem natural is when the behavior of a function $f(x)$ is considered for large values of $x$. Starting with a very simple example, let

\[ f(x) = \frac{1}{x^2}. \]

If the graph in Figure 7.1 is to be trusted, it is obvious that $f(x) \to 0$ as $x \to \infty$, or $\lim_{x \to \infty} f(x) = 0$. The challenge is to develop techniques which will apply when the answer is not simply obvious.

Say that

\[ \lim_{x \to \infty} f(x) = L, \quad L \in \mathbb{R}, \]

(respectively $\lim_{x \to -\infty} f(x) = L$) if for every $\epsilon > 0$ there is an $N > 0$ such that

\[ |f(x) - L| < \epsilon \]

whenever $x \ge N$ (respectively $x \le -N$).

For $f(x) = 1/x^2$, the fact that $\lim_{x \to \infty} f(x) = 0$ can be established with some algebraic manipulation. Pick any number $\epsilon > 0$. Obtaining

\[ |f(x) - 0| = \left| \frac{1}{x^2} \right| < \epsilon, \]

is the same as requiring

\[ x^2 > \frac{1}{\epsilon}, \]

or

\[ |x| > \frac{1}{\sqrt{\epsilon}}. \]

One possible choice is

\[ N = \frac{2}{\sqrt{\epsilon}}. \]

[Figure 7.1: The graph of $1/x^2$]

In this case it was productive to work backwards, starting with the desired conclusion, and converting it to an equivalent statement about $x$. Having understood how big to take $x$, it is easy to find an $N$ such that whenever $x \ge N$, it follows that $|f(x) - L| < \epsilon$. In fact we have found an explicit formula for $N$ as a function of $\epsilon$. It will not always be possible to obtain such a convenient or explicit expression.
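The choice $N = 2/\sqrt{\epsilon}$ can be spot-checked numerically. The Python sketch below is only a sanity check at finitely many sample points, not a proof; the names `N_for` and `check`, and the particular sampling of points $x \ge N$, are assumptions made for this illustration.

```python
import math

def f(x):
    return 1.0 / x**2

def N_for(eps):
    """The choice N = 2/sqrt(eps) made in the text."""
    return 2.0 / math.sqrt(eps)

def check(eps, samples=1000):
    """Spot-check |f(x) - 0| < eps at sample points x >= N (not a proof)."""
    N = N_for(eps)
    xs = [N * (1 + 0.01 * k) for k in range(samples)]
    return all(abs(f(x)) < eps for x in xs)

for eps in (1.0, 1e-2, 1e-6):
    print(eps, N_for(eps), check(eps))
```

At $x = N$ the value of $f$ is $\epsilon/4$, which shows how much room the factor 2 leaves.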

Here is a second example. Let

\[ f(x) = \frac{x^2 + 1}{2x^2 + 3}. \]

We claim that $\lim_{x \to \infty} f(x) = 1/2$. Pick any real number $\epsilon > 0$. With $L = 1/2$, write

\[ |f(x) - L| = \left| \frac{x^2 + 1}{2x^2 + 3} - \frac{1}{2} \right| = \left| \frac{x^2 + 3/2}{2x^2 + 3} - \frac{1/2}{2x^2 + 3} - \frac{1}{2} \right| = \left| -\frac{1/2}{2x^2 + 3} \right| = \frac{1}{2} \left| \frac{1}{2x^2 + 3} \right|. \]

Suppose that $N = 1/\sqrt{\epsilon}$, and $x \ge N$. Then

\[ |f(x) - L| = \frac{1}{2} \left| \frac{1}{2x^2 + 3} \right| < \frac{1}{x^2} \le \frac{1}{1/\epsilon} = \epsilon. \]

If $x \ge N$, it follows that $|f(x) - 1/2| < \epsilon$, so that $\lim_{x \to \infty} f(x) = 1/2$. Notice that there was some flexibility in our choice of $N$.

7.3.1.2 Limit as x → x₀

Suppose $x_0$, $a$, and $b$ are real numbers, with $a < x_0 < b$. Assume that $f$ is a real valued function defined on the set $(a, x_0) \cup (x_0, b)$; that is, $f$ is defined on some open interval which contains the number $x_0$, except that $f$ may not be defined at $x_0$ itself. Say that

\[ \lim_{x \to x_0} f(x) = L, \quad L \in \mathbb{R} \]

if for every $\epsilon > 0$ there is a $\delta > 0$ such that

\[ |f(x) - L| < \epsilon \]

whenever $0 < |x - x_0| < \delta$.

To amplify on the possible omission of $x_0$, consider the function

\[ f(x) = \frac{\sin(x)}{x}. \]

This formula does not provide a value for $f$ at $x = 0$. Nonetheless we can consider

\[ \lim_{x \to 0} \frac{\sin(x)}{x}. \]

In fact this limit exists, and turns out to be 1.

Of course limits arise in the definition of derivatives. Suppose that $x_0$ is fixed. The following limit problem amounts to computing the derivative of $x^2$ at the point $x_0$.

\[ \lim_{x \to x_0} \frac{x^2 - x_0^2}{x - x_0} = \lim_{x \to x_0} \frac{(x - x_0)(x + x_0)}{x - x_0} = \lim_{x \to x_0} (x + x_0) = 2x_0. \]
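Both limits can be observed numerically by evaluating the quotients at points near, but not equal to, the excluded point. The sketch below is an illustration only; the function names `sin_ratio` and `diff_quotient` and the sample point $x_0 = 3$ are assumptions made here.

```python
import math

def sin_ratio(x):
    """sin(x)/x, which is not defined at x = 0."""
    return math.sin(x) / x

def diff_quotient(x, x0):
    """(x^2 - x0^2)/(x - x0), which is not defined at x = x0."""
    return (x**2 - x0**2) / (x - x0)

x0 = 3.0
for k in range(1, 6):
    h = 10.0 ** (-k)
    # The first column approaches 1 and the second approaches 2*x0 = 6.
    print(h, sin_ratio(h), diff_quotient(x0 + h, x0))
```

Calling either function at the excluded point raises a `ZeroDivisionError`, mirroring the fact that the formula gives no value there.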


Notice that the function

\[ f(x) = \frac{x^2 - x_0^2}{x - x_0} \]

is not defined at $x_0$ because division by 0 is not defined.

When considering limits as $x$ approaches $x_0$, it is sometimes convenient to restrict $x$ to those values satisfying $x > x_0$ or $x < x_0$. The definition for this version of limits simply reflects the restriction. To describe a limit from above, say that

\[ \lim_{x \to x_0^+} f(x) = L, \quad L \in \mathbb{R} \]

if for every $\epsilon > 0$ there is a $\delta > 0$ such that

\[ |f(x) - L| < \epsilon \]

whenever $0 < x - x_0 < \delta$. Similarly, to describe a limit from below, say that

\[ \lim_{x \to x_0^-} f(x) = L, \quad L \in \mathbb{R} \]

if for every $\epsilon > 0$ there is a $\delta > 0$ such that

\[ |f(x) - L| < \epsilon \]

whenever $0 < x_0 - x < \delta$.

It is also convenient to talk about functions growing without bound. The statement

\[ \lim_{x \to x_0} f(x) = \infty \]

means that for every $M > 0$ there is a number $\delta > 0$ such that

\[ f(x) > M \quad \text{whenever} \quad 0 < |x - x_0| < \delta. \]

The statement

\[ \lim_{x \to \infty} f(x) = \infty \]

means that for every $M > 0$ there is a number $N > 0$ such that

\[ f(x) > M \quad \text{whenever} \quad x > N. \]


7.3.1.3 Limit rules

Limits are well behaved with respect to arithmetic operations. The next theorem makes this point precise while also providing a good illustration of the use of existence statements. Since the theorem is a general assertion about limits, rather than an analysis of a particular case, the proof makes use of the general properties, not the details of an example. The proof for this theorem is quite similar to the proof of the analogous theorem for limits of sequences, so only part of the proof is provided (see problem 4).

Theorem 7.3.1. Suppose that $L$, $M$, and $c$ are real numbers, and that

\[ \lim_{x \to x_0} f(x) = L, \quad \lim_{x \to x_0} g(x) = M. \]

Then

\[ \lim_{x \to x_0} c f(x) = cL, \qquad \text{(i)} \]

\[ \lim_{x \to x_0} [f(x) + g(x)] = L + M, \qquad \text{(ii)} \]

\[ \lim_{x \to x_0} [f(x) g(x)] = LM, \qquad \text{(iii)} \]

and, if $M \ne 0$,

\[ \lim_{x \to x_0} f(x)/g(x) = L/M. \qquad \text{(iv)} \]

Proof. (i): Take any $\epsilon > 0$. From the definition of

\[ \lim_{x \to x_0} f(x) = L \]

there is a $\delta > 0$ such that

\[ |f(x) - L| < \epsilon \]

whenever $0 < |x - x_0| < \delta$. We consider two cases: $|c| \le 1$, and $|c| > 1$.

If $|c| \le 1$ the desired inequality holds whenever $0 < |x - x_0| < \delta$, since

\[ |cf(x) - cL| = |c|\,|f(x) - L| \le |f(x) - L| < \epsilon. \]

Next suppose that $|c| > 1$. Let $\epsilon_1 = \epsilon/|c|$. Since

\[ \lim_{x \to x_0} f(x) = L \]

there is a $\delta > 0$ such that $0 < |x - x_0| < \delta$ implies

\[ |f(x) - L| < \epsilon_1. \]

But this means that

\[ |cf(x) - cL| < |c|\epsilon_1 = |c|\epsilon/|c| = \epsilon. \]

Proof of (ii): Take any $\epsilon > 0$ and define $\epsilon_1 = \epsilon/2$. From the limit definitions there are $\delta_1$ and $\delta_2$ such that if $0 < |x - x_0| < \delta_1$ then

\[ |f(x) - L| < \epsilon_1, \]

and if $0 < |x - x_0| < \delta_2$ then

\[ |g(x) - M| < \epsilon_1. \]

Take $\delta = \min(\delta_1, \delta_2)$. If $0 < |x - x_0| < \delta$, then

\[ |(f(x) + g(x)) - (L + M)| \le |f(x) - L| + |g(x) - M| < \epsilon_1 + \epsilon_1 = \epsilon. \]

In Theorem 7.3.1 the statement

\[ \lim_{x \to x_0} f(x)/g(x) = L/M, \quad M \ne 0, \]

deserves a comment. One can show (see problem 6) that if

\[ \lim_{x \to x_0} g(x) = M, \quad M \ne 0, \]

then there is some $\delta > 0$ such that $g(x) \ne 0$ for $0 < |x - x_0| < \delta$. In this set the quotient $f(x)/g(x)$ will be defined, and the limit may be considered.

The limit rules of Theorem 7.3.1 for

\[ \lim_{x \to x_0} f(x) = L \]

also apply in the cases of

\[ \lim_{x \to \infty} f(x) = L, \]

and

\[ \lim_{x \to x_0^{\pm}} f(x) = L. \]


7.3.2 Continuity

Suppose that $I \subset \mathbb{R}$ is an interval. A function $f : I \to \mathbb{R}$ is continuous at $x_0 \in I$ if

\[ \lim_{x \to x_0} f(x) = f(x_0). \]

If $x_0$ is the left or right endpoint of the interval $I$, this limit is taken to be the limit from above or below, as appropriate. The function $f$ is said to be continuous, or continuous on $I$, if $f$ is continuous at every point of $I$. When $I = [a, b]$ is a closed interval, saying that $f$ is continuous on $I$ means that

\[ \lim_{x \to x_0} f(x) = f(x_0), \quad a < x_0 < b, \]

\[ \lim_{x \to a^+} f(x) = f(a), \quad \lim_{x \to b^-} f(x) = f(b). \]

Notice that if $f$ is a continuous function on an interval $I$, then $f$ is also continuous on every interval $I_1 \subset I$.

It is easy to check that the function $f(x) = x$ is continuous for any interval $I$. It follows immediately from Theorem 7.3.1 that polynomials are continuous on any interval $I$, and rational functions are continuous at each point where the denominator is not 0. In addition, Theorem 7.3.1 shows that if $f$ and $g$ are continuous at $x_0$, so are $f + g$ and $fg$. If $g(x_0) \ne 0$ the function $f/g$ is also continuous at $x_0$.

Before diving into the next theorem, it will help to make an observation. Suppose that $f$ is continuous at $x_0$. This means that $\lim_{x \to x_0} f(x)$ exists, that $f(x_0)$ is defined, and that these two numbers are the same. For a real valued function $f$ defined on an open interval $I$, the definition of continuity of $f$ at $x_0$ can thus be written as follows: for every $\epsilon > 0$ there is a $\delta > 0$ such that

\[ |f(x) - f(x_0)| < \epsilon \]

whenever

\[ |x - x_0| < \delta. \]

In the definition of limit the inequality $0 < |x - x_0| < \delta$ appeared; for the case of continuity the possibility $x = x_0$ is included.

It is sometimes helpful to have an alternate characterization of continuity which is provided by the next result. The theorem is stated for functions defined on open intervals, but the same result holds for arbitrary intervals if appropriate one-sided limits are used.


Theorem 7.3.2. Suppose that $I$ is an open interval, $x_0 \in I$, and $f : I \to \mathbb{R}$. Then $f$ is continuous at $x_0$ if and only if

\[ \lim_{n \to \infty} f(x_n) = f(x_0) \]

whenever $\{x_n\}$ is a sequence in $I$ with limit $x_0$.

Proof. First suppose that $f$ is continuous at $x_0$, that is

\[ \lim_{x \to x_0} f(x) = f(x_0). \]

Picking any $\epsilon > 0$, there is a $\delta > 0$ such that

\[ |f(x) - f(x_0)| < \epsilon \]

whenever

\[ 0 \le |x - x_0| < \delta. \]

Now assume that $\{x_n\}$ is a sequence in $I$ with limit $x_0$. Using $\delta$ from above, there is a positive integer $N$ such that

\[ |x_n - x_0| < \delta \]

whenever $n \ge N$. Of course this means that when $n \ge N$ the inequality

\[ |f(x_n) - f(x_0)| < \epsilon \]

holds, showing that

\[ \lim_{n \to \infty} f(x_n) = f(x_0). \]

To show the opposite implication, assume that

\[ \lim_{x \to x_0} f(x) \ne f(x_0). \]

Either the limit fails to exist, or the limit exists, but its value is different from $f(x_0)$. In either case there is some $\epsilon_1 > 0$ such that for any $\delta > 0$

\[ |f(z) - f(x_0)| \ge \epsilon_1 \]

for some $z$ satisfying

\[ 0 < |z - x_0| < \delta. \]

Since $I$ is an open interval, there is a number $r > 0$ such that $(x_0 - r, x_0 + r) \subset I$. For $k = 1, 2, 3, \dots$ let $\delta_k = \min(1/k, r)$. Pick $x_k$ such that

\[ 0 < |x_k - x_0| < \delta_k \]

and

\[ |f(x_k) - f(x_0)| \ge \epsilon_1. \]

By construction the sequence $\{x_k\}$ has limit $x_0$, but

\[ \lim_{k \to \infty} f(x_k) \ne f(x_0). \]

Theorem 7.3.3. (Extreme Value Theorem) Suppose that $f : [a, b] \to \mathbb{R}$ is a continuous function on the compact interval $[a, b]$. Then there are points $x_{min}, x_{max} \in [a, b]$ such that

\[ f(x_{min}) \le f(x), \quad a \le x \le b, \]

\[ f(x_{max}) \ge f(x), \quad a \le x \le b. \]

Proof. If the range of $f$ is bounded above, let $y_{max}$ denote the least upper bound of the range. If the range of $f$ is not bounded above, write $y_{max} = \infty$. Let $x_n$ be a sequence of points in $[a, b]$ such that

\[ \lim_{n \to \infty} f(x_n) = y_{max}. \]

Since the interval $[a, b]$ is compact, the sequence $x_n$ has a subsequence $x_{n(k)}$ which converges to $z \in [a, b]$. Since $f$ is continuous at $z$,

\[ \lim_{k \to \infty} f(x_{n(k)}) = f(z) = y_{max}. \]

Thus $y_{max} \in \mathbb{R}$ and we may take $x_{max} = z$. The existence of $x_{min}$ is proved analogously.

Theorem 7.3.4. (Intermediate Value Theorem) Suppose $f : [a, b] \to \mathbb{R}$ is a continuous function, and suppose that $f(a) < f(b)$. For every number $y \in [f(a), f(b)]$ there is an $x \in [a, b]$ such that $f(x) = y$.

Proof. The set $J = \{x \in [a, b] \mid f(x) \le y\}$ is nonempty, since $a \in J$, and has a least upper bound $z \le b$. Pick a sequence of points $x_n \in J$ converging to $z$. Since $f(x_n) \le y$ for each $n$, and $f$ is continuous,

\[ \lim_{n \to \infty} f(x_n) = f(z) \le y. \qquad (7.1) \]

It is possible that $z = b$. In this case $y \le f(b)$ by assumption, and $f(b) = f(z) \le y$ by (7.1), so $y = f(b)$ and the desired point $x$ is $b$.

If $z < b$, then $f(x) > y$ for every $x \in (z, b]$ by the definition of $J$. Pick a sequence of points $w_n \in (z, b]$ such that $\{w_n\}$ converges to $z$. Since $f(w_n) > y$,

\[ f(z) = \lim_{n \to \infty} f(w_n) \ge y. \]

Now $f(z) \ge y$ and $f(z) \le y$, so it follows that $f(z) = y$.

The sequential characterization of continuity is also useful for establishing the next result, which says that the composition of continuous functions is continuous.

Theorem 7.3.5. Suppose that $I_0$ and $I_1$ are open intervals, that $f : I_0 \to \mathbb{R}$, $g : I_1 \to \mathbb{R}$, and that $f(I_0) \subset I_1$. Assume that $f$ is continuous at $x_0 \in I_0$, and $g$ is continuous at $f(x_0) \in I_1$. Then $g(f(x))$ is continuous at $x_0$.

Proof. Suppose that $\{x_k\}$ is any sequence in $I_0$ with limit $x_0$. Since $f$ is continuous at $x_0$, the sequence $y_k = f(x_k)$ is a sequence in $I_1$ with limit $y_0 = f(x_0)$. Since $g$ is continuous at $y_0$, we also have

\[ \lim_{k \to \infty} g(y_k) = g(y_0), \]

or

\[ \lim_{k \to \infty} g(f(x_k)) = g(f(x_0)), \]

as desired.

7.3.2.1 Rootfinding 1

Suppose $f : I \to \mathbb{R}$ is a function defined on the interval $I$. The number $x$ is said to be a root of $f$ if $f(x) = 0$. The accurate approximation of roots is a common problem of computational mathematics. As an example one might consider finding solutions of the equations

\[ \tan(x) - x - 1 = 0, \quad 0 \le x < \pi/2, \]

or

\[ x^7 + 3x^6 + 17x^2 + 4 = 0. \]

The Intermediate Value Theorem can be used to justify a simple computational technique called the bisection method.


Suppose that $f$ is continuous on $I$, and there are points $a, b \in I$ such that $a < b$ and $f(a)f(b) < 0$. This last condition simply means that $f$ is positive at one of the two points, and negative at the other point. By the Intermediate Value Theorem there must be a root in the interval $[a, b]$. There is no loss of generality in assuming that $f(a) < 0$ and $f(b) > 0$ since if the signs are switched we can simply replace $f$ by the function $g = -f$. The functions $f$ and $g$ have the same roots.

We will now define two sequences of points $\{a_n\}$, $\{b_n\}$, starting with $a_0 = a$ and $b_0 = b$. The definition of the subsequent points in the sequence is given recursively. The points $a_n$ and $b_n$ will always be chosen so that $f(a_n)f(b_n) \le 0$, and if $f(a_n)f(b_n) = 0$ then either $a_n$ or $b_n$ is a root.

Let $c_n$ be the midpoint of the interval $[a_n, b_n]$, or

\[ c_n = \frac{a_n + b_n}{2}. \]

If $f(c_n) = 0$, we have a root and can stop. If $f(c_n) < 0$, define $a_{n+1} = c_n$, and $b_{n+1} = b_n$. If $f(c_n) > 0$, define $a_{n+1} = a_n$, and $b_{n+1} = c_n$. Since $f(a_{n+1}) < 0$ and $f(b_{n+1}) > 0$, the Intermediate Value Theorem implies that a root lies in the interval $[a_{n+1}, b_{n+1}]$.

Finally, notice that

\[ |b_{n+1} - a_{n+1}| = \frac{1}{2} |b_n - a_n|. \]

By induction this means that

\[ |b_n - a_n| = 2^{-n} |b - a|. \]

The intervals $[a_n, b_n]$ are nested, and the lengths $|b_n - a_n|$ have limit 0, so by the Nested Interval Principle there is a number $r$ such that

\[ \lim_{n \to \infty} a_n = r = \lim_{n \to \infty} b_n. \]

Since $f(a_n) < 0$,

\[ f(r) = \lim_{n \to \infty} f(a_n) \le 0. \]

On the other hand, $f(b_n) > 0$, so

\[ f(r) = \lim_{n \to \infty} f(b_n) \ge 0, \]

and $f(r) = 0$. Moreover

\[ |b_n - r| \le |b_n - a_n| = 2^{-n} |b - a|, \]

so an accurate estimate of the root is obtained rapidly.
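The recursion above translates almost line for line into code. The following Python function is a minimal sketch of the bisection method under the assumptions of the text ($f$ continuous on $[a, b]$ with $f(a)f(b) < 0$). The name `bisect`, the stopping tolerance `tol`, the iteration cap `max_iter`, and the bracketing intervals used in the examples are choices made for this illustration. It works in floating point, so it only approximates the exact sequences $\{a_n\}$ and $\{b_n\}$.

```python
import math

def bisect(f, a, b, tol=1e-10, max_iter=200):
    """Approximate a root of f in [a, b], assuming f(a) * f(b) < 0."""
    fa, fb = f(a), f(b)
    if fa * fb >= 0:
        raise ValueError("bisection needs f(a) and f(b) of opposite sign")
    # Work with g = sign * f so that g(a) < 0 < g(b), as in the text.
    sign = 1.0 if fa < 0 else -1.0
    for _ in range(max_iter):
        if b - a <= tol:
            break
        c = (a + b) / 2              # midpoint c_n
        gc = sign * f(c)
        if gc == 0:
            return c                 # exact root found
        if gc < 0:
            a = c                    # a_{n+1} = c_n, b_{n+1} = b_n
        else:
            b = c                    # a_{n+1} = a_n, b_{n+1} = c_n
    return (a + b) / 2

# tan(x) - x - 1 = 0 on [0, pi/2); the bracket [0, 1.5] is chosen by inspection.
r1 = bisect(lambda x: math.tan(x) - x - 1, 0.0, 1.5)

# x^7 + 3x^6 + 17x^2 + 4 = 0; the polynomial changes sign on [-4, -3].
def p(x):
    return x**7 + 3 * x**6 + 17 * x**2 + 4

r2 = bisect(p, -4.0, -3.0)
print(r1, r2)
```

Since the bracket length halves at every step, about $\log_2(|b - a|/\text{tol})$ iterations are needed, in line with the estimate $|b_n - r| \le 2^{-n}|b - a|$.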


7.3.3 Uniform continuity

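In discussing uniform continuity it is helpful to review the definition of continuity at a point $x_0$, which could have been phrased as follows. The function $f : I \to \mathbb{R}$ is continuous at a point $x_0 \in I$ if for every $\epsilon > 0$ there is a $\delta(\epsilon, x_0) > 0$ such that $|f(x) - f(x_0)| < \epsilon$ whenever $|x - x_0| < \delta(\epsilon, x_0)$. The new emphasis is on the possible dependence of $\delta$ on both $\epsilon$ and $x_0$.

To illustrate this point, consider the example $f(x) = 1/x$ on the interval $(0, \infty)$. The condition

\[ \left| \frac{1}{x} - \frac{1}{x_0} \right| < \epsilon, \]

for $x \in (0, \infty)$ means that

\[ \frac{1}{x_0} - \epsilon < \frac{1}{x} < \frac{1}{x_0} + \epsilon, \]

or

\[ \frac{1 - \epsilon x_0}{x_0} < \frac{1}{x} < \frac{1 + \epsilon x_0}{x_0}. \]

When $\epsilon x_0 < 1$, so that the left side is positive, this requires

\[ \frac{x_0}{1 + \epsilon x_0} < x < \frac{x_0}{1 - \epsilon x_0}. \]

If a fixed $\epsilon > 0$ is chosen, then the size of the interval

\[ |x - x_0| < \delta(\epsilon, x_0) \]

where

\[ \left| \frac{1}{x} - \frac{1}{x_0} \right| < \epsilon \]

shrinks to 0 as $x_0$ goes to 0.

For other functions or other intervals it may be possible to choose $\delta$ independent of the value of $x_0$. This leads to the notion of a uniformly continuous function. Say that $f : I \to \mathbb{R}$ is uniformly continuous on $I$ if for every $\epsilon > 0$ there is a $\delta(\epsilon) > 0$, such that $|f(y) - f(x)| < \epsilon$ whenever $x, y \in I$ and $|y - x| < \delta$.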
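For the example $f(x) = 1/x$ the dependence of $\delta$ on $x_0$ can be made explicit. The left endpoint $x_0/(1 + \epsilon x_0)$ of the admissible interval is the one closer to $x_0$, so the largest symmetric choice is $\delta(\epsilon, x_0) = x_0 - x_0/(1 + \epsilon x_0) = \epsilon x_0^2/(1 + \epsilon x_0)$. This formula is derived here from the inequalities above rather than quoted from the text, and the function name `delta` is hypothetical. The sketch tabulates it for a fixed $\epsilon$ as $x_0$ decreases.

```python
def delta(eps, x0):
    """Largest delta with |1/x - 1/x0| < eps whenever |x - x0| < delta (x0 > 0)."""
    return eps * x0**2 / (1 + eps * x0)

eps = 0.1
for x0 in (1.0, 0.1, 0.01, 0.001):
    d = delta(eps, x0)
    # Points just inside the delta-interval satisfy the epsilon condition.
    inside_ok = all(
        abs(1 / x - 1 / x0) < eps for x in (x0 - 0.999 * d, x0 + 0.999 * d)
    )
    print(x0, d, inside_ok)
```

For fixed $\epsilon$ the values of $\delta(\epsilon, x_0)$ behave like $\epsilon x_0^2$ as $x_0 \to 0$, so no single $\delta$ works on all of $(0, \infty)$.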

Theorem 7.3.6. If $f$ is continuous on a compact interval $I$, then $f$ is uniformly continuous on $I$.

Proof. The argument is by contradiction. If $f$ is not uniformly continuous then there is some $\epsilon_1 > 0$ such that for every $\delta > 0$ there are points $x$ and $y$ in $I$ satisfying $|y - x| < \delta$, but for which $|f(y) - f(x)| \ge \epsilon_1$. For this $\epsilon_1$ consider $\delta_n = 1/n$, and let $x_n$ and $a_n$ be points in $I$ such that $|x_n - a_n| < 1/n$, but $|f(x_n) - f(a_n)| \ge \epsilon_1$.

Since the interval $I$ is compact, the sequence $a_n$ has a convergent subsequence $a_{n(k)}$. Suppose that the limit of this subsequence is $c$. By the triangle inequality,

\[ |x_{n(k)} - c| \le |x_{n(k)} - a_{n(k)}| + |a_{n(k)} - c| \le 1/k + |a_{n(k)} - c|, \]

so that $c$ is also the limit of the subsequence $x_{n(k)}$.

The function $f$ is assumed to be continuous at $c$. Let $\epsilon = \epsilon_1/2$. There is a $\delta(\epsilon_1/2, c)$ such that $|f(x) - f(c)| < \epsilon_1/2$ whenever $|x - c| < \delta(\epsilon_1/2, c)$.

Now use the triangle inequality again to get

\[ |f(x_{n(k)}) - f(a_{n(k)})| \le |f(x_{n(k)}) - f(c)| + |f(c) - f(a_{n(k)})|, \]

or

\[ |f(x_{n(k)}) - f(c)| \ge |f(x_{n(k)}) - f(a_{n(k)})| - |f(c) - f(a_{n(k)})|. \qquad (7.2) \]

If $n(k)$ is large enough then both $|x_{n(k)} - c|$ and $|a_{n(k)} - c|$ are smaller than $\delta(\epsilon_1/2, c)$. This means that $|f(x_{n(k)}) - f(c)| < \epsilon_1/2$ and $|f(c) - f(a_{n(k)})| < \epsilon_1/2$. But (7.2) implies that

\[ |f(x_{n(k)}) - f(c)| > \epsilon_1 - \epsilon_1/2 = \epsilon_1/2. \]

This contradiction implies that $f$ must have been uniformly continuous.

Of course it is possible to have a uniformly continuous function on a noncompact interval. Suppose $f$ is uniformly continuous on the compact interval $[a, b]$. Then $f$ is also uniformly continuous on every interval $(c, d) \subset [a, b]$.

One striking consequence of Theorem 7.3.6 involves the approximation of a continuous function $f$ on a compact interval $[a, b]$ by functions of a particularly simple type. Say that a function $g : [a, b] \to \mathbb{R}$ is a step function if there is a finite collection of points $x_0, \dots, x_N$ such that

\[ a = x_0 < x_1 < \dots < x_N = b, \]

and $g(x)$ is constant on each of the intervals $(x_n, x_{n+1})$. The approximation of continuous functions by step functions is extremely important for developing the theory of integration using Riemann sums.

One way of getting a step function $g$ from an arbitrary function $f$ is to define $g$ using samples of $f$ from the intervals $[x_n, x_{n+1}]$, where $n = 0, \dots, N - 1$. For instance the left endpoint Riemann sums commonly seen in Calculus use the function

\[ g_l(x) = \begin{cases} f(x_n), & x_n \le x < x_{n+1}, \\ f(x_{N-1}), & x = b. \end{cases} \]

More generally, consider the sample points $\xi_n$, where $x_n \le \xi_n \le x_{n+1}$, and define

\[ g(x) = \begin{cases} f(\xi_n), & x_n \le x < x_{n+1}, \\ f(\xi_{N-1}), & x = b. \end{cases} \qquad (7.3) \]

A corollary of Theorem 7.3.6 is that it is always possible to approximate a continuous function $f$ on a compact interval as well as you like with a step function.

Corollary 7.3.7. Suppose that $f$ is continuous on a compact interval $[a, b]$, and $g$ is one of the step functions defined in (7.3). Then for any $\epsilon > 0$ there is a $\delta > 0$ such that

\[ |f(x) - g(x)| < \epsilon, \quad x \in [a, b] \]

if

\[ 0 < x_{n+1} - x_n < \delta, \quad n = 0, \dots, N - 1. \]

Proof. By Theorem 7.3.6 the function $f$ is uniformly continuous on $[a, b]$. Given $\epsilon > 0$, let $\delta$ be chosen so that

\[ |f(y) - f(x)| < \epsilon \quad \text{if} \quad |y - x| < \delta. \]

Pick a finite collection of points $a = x_0 < x_1 < \dots < x_N = b$ from $[a, b]$ and suppose that $x_n \le \xi_n \le x_{n+1}$.

Assume that $g(x)$ is defined as in (7.3), and

\[ |x_{n+1} - x_n| < \delta, \quad n = 0, \dots, N - 1. \]

For each $x \in [a, b)$, we have $x_n \le x < x_{n+1}$ for some $n$. Because

\[ |\xi_n - x| \le |x_{n+1} - x_n| < \delta, \]

it follows that

\[ |f(x) - g(x)| = |f(x) - f(\xi_n)| < \epsilon. \]

The argument is essentially the same for $x = b$.
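The corollary can be observed numerically for the left endpoint step function $g_l$. The sketch below is an illustration under assumptions made here: a uniform partition $x_n = a + n(b - a)/N$, the test function $\sin$ on $[0, \pi]$, and the helper names `left_step` and `max_error`. The maximum error is estimated on a finite grid of sample points, so it is a spot check rather than a proof.

```python
import math

def left_step(f, a, b, N):
    """Left endpoint step function g_l on the uniform partition with N pieces."""
    h = (b - a) / N
    def g(x):
        # Index n with x_n <= x < x_{n+1}; x = b is assigned to the last piece.
        n = min(int((x - a) / h), N - 1)
        return f(a + n * h)
    return g

def max_error(f, g, a, b, samples=10001):
    """Largest |f(x) - g(x)| over equally spaced sample points in [a, b]."""
    xs = [a + (b - a) * i / (samples - 1) for i in range(samples)]
    return max(abs(f(x) - g(x)) for x in xs)

a, b = 0.0, math.pi
errors = {}
for N in (10, 100, 1000):
    g = left_step(math.sin, a, b, N)
    errors[N] = max_error(math.sin, g, a, b)
    print(N, errors[N])
```

For $\sin$ the inequality $|\sin(y) - \sin(x)| \le |y - x|$ shows that $\delta = \epsilon$ works, so the sampled error should not exceed the mesh size $\pi/N$.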


7.4 Derivatives

The notion of the derivative of a function is essential for the study ofvarious basic problems: how to make sense of velocity and accelerationfor objects in motion, how to define and compute tangents to curves,and how to minimize or maximize functions. These problems werestudied with some success in the seventeenth century by a variety ofresearchers, including Roberval, Fermat, Descartes, and Barrow. Inthe later part of the seventeenth century derivatives became a centralfeature of the calculus developed by Newton and Leibniz [9, pp. 342-390].

Suppose that $(a, b)$ is an open interval and $f : (a, b) \to \mathbb{R}$. The derivative of $f$ at $x_0 \in (a, b)$ is

\[ f'(x_0) = \lim_{x \to x_0} \frac{f(x) - f(x_0)}{x - x_0} \]

if this limit exists. If the derivative exists, the function $f$ is said to be differentiable at $x_0$. The function $f$ is differentiable on the interval $(a, b)$ if it has a derivative at each $x \in (a, b)$.

When a function $f$ is differentiable on an open interval $(a, b)$, the derivative function $f'(x)$ may itself have derivatives at $x_0 \in (a, b)$. If $f$ is differentiable on $(a, b)$, the second derivative of $f$ at $x_0 \in (a, b)$ is

\[ f''(x_0) = \lim_{x \to x_0} \frac{f'(x) - f'(x_0)}{x - x_0} \]

if this limit exists. Denoting repeated differentiation of $f$ with more primes leads to an unwieldy notation. As an alternative, write $f^{(1)}(x_0)$ for $f'(x_0)$, and $f^{(2)}(x_0)$ for $f''(x_0)$. Continuing in this fashion, if $f, f^{(1)}, \dots, f^{(n-1)}$ are differentiable on $(a, b)$, the $n$-th derivative of $f$ at $x_0 \in (a, b)$ is

\[ f^{(n)}(x_0) = \lim_{x \to x_0} \frac{f^{(n-1)}(x) - f^{(n-1)}(x_0)}{x - x_0} \]

if this limit exists.

It is often desirable to talk about a function which is differentiable on a closed interval $[a, b]$. This will mean that there is some open interval $(c, d)$ such that $[a, b] \subset (c, d)$, and the function $f : (c, d) \to \mathbb{R}$ is differentiable. (Alternatively, we could ask for the existence of limits from above and below at $a$ and $b$ respectively.)

By defining $h = x - x_0$, which gives $x = x_0 + h$, the derivative may be defined in the equivalent form

\[ f'(x_0) = \lim_{h \to 0} \frac{f(x_0 + h) - f(x_0)}{h}. \]
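As a small numerical illustration of this limit, the sketch below evaluates the difference quotient for shrinking values of $h$. The function $f(x) = x^3$ and the point $x_0 = 2$ are illustrative assumptions, chosen because the exact value $f'(2) = 12$ is known; the helper name `difference_quotient` is hypothetical.

```python
def difference_quotient(f, x0, h):
    """(f(x0 + h) - f(x0)) / h, the quotient whose limit as h -> 0 is f'(x0)."""
    return (f(x0 + h) - f(x0)) / h

def f(x):
    return x ** 3  # illustrative choice; the exact derivative at 2 is 12

# Quotients for h = 1e-1, 1e-2, ..., 1e-5 and their distances from 12.
quotients = [difference_quotient(f, 2.0, 10.0 ** (-k)) for k in range(1, 6)]
gaps = [abs(q - 12.0) for q in quotients]
```

For this $f$ the quotient equals $12 + 6h + h^2$, so the gaps shrink roughly in proportion to $h$. In floating point arithmetic, taking $h$ extremely small eventually makes rounding error dominate, so such a computation suggests the limit rather than proving it.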

When it is notationally convenient the derivative is written as

\[ \frac{df}{dx}(x_0) = f'(x_0), \]

and higher derivatives are

\[ \frac{d^n f}{dx^n}(x_0) = f^{(n)}(x_0). \]

Occasionally one also encounters the notation

\[ D^n f(x_0) = f^{(n)}(x_0). \]

7.4.1 Computation of derivatives

By virtue of Theorem 7.3.1, sums and constant multiples of differentiable functions are differentiable.

Lemma 7.4.1. Suppose that $f : (a, b) \to \mathbb{R}$, $g : (a, b) \to \mathbb{R}$, and $c \in \mathbb{R}$. If $f$ and $g$ have derivatives at $x_0 \in (a, b)$, so do $cf$ and $f + g$, with

\[ (f + g)'(x_0) = f'(x_0) + g'(x_0), \]

\[ (cf)'(x_0) = c f'(x_0). \]

Proof. An application of Theorem 7.3.1 gives

\[
\begin{aligned}
(f + g)'(x_0) &= \lim_{x \to x_0} \frac{(f(x) + g(x)) - (f(x_0) + g(x_0))}{x - x_0} \\
&= \lim_{x \to x_0} \left( \frac{f(x) - f(x_0)}{x - x_0} + \frac{g(x) - g(x_0)}{x - x_0} \right) \\
&= \lim_{x \to x_0} \frac{f(x) - f(x_0)}{x - x_0} + \lim_{x \to x_0} \frac{g(x) - g(x_0)}{x - x_0} \\
&= f'(x_0) + g'(x_0),
\end{aligned}
\]

and

\[
\begin{aligned}
(cf)'(x_0) &= \lim_{x \to x_0} \frac{cf(x) - cf(x_0)}{x - x_0} \\
&= \lim_{x \to x_0} c\, \frac{f(x) - f(x_0)}{x - x_0} \\
&= c \lim_{x \to x_0} \frac{f(x) - f(x_0)}{x - x_0} = c f'(x_0).
\end{aligned}
\]

Having a derivative at $x_0$ is a stronger requirement than being continuous at $x_0$.

Theorem 7.4.2. If $f$ has a derivative at $x_0$, then $f$ is continuous at $x_0$.

Proof. Write $x = x_0 + h$, and consider the following calculation.

\[
\begin{aligned}
\lim_{x \to x_0} [f(x) - f(x_0)] &= \lim_{h \to 0} [f(x_0 + h) - f(x_0)] = \lim_{h \to 0} \frac{f(x_0 + h) - f(x_0)}{h}\, h \\
&= \lim_{h \to 0} \frac{f(x_0 + h) - f(x_0)}{h} \cdot \lim_{h \to 0} h = f'(x_0) \cdot 0 = 0.
\end{aligned}
\]

Thus

\[ \lim_{x \to x_0} f(x) = f(x_0). \]

It is also possible to differentiate products and quotients, with rules familiar from calculus.

Theorem 7.4.3. Suppose that $f : (a, b) \to \mathbb{R}$, and $g : (a, b) \to \mathbb{R}$. If $f$ and $g$ have derivatives at $x_0 \in (a, b)$, so does $fg$, with

\[ (fg)'(x_0) = f'(x_0) g(x_0) + f(x_0) g'(x_0). \]

If in addition $g(x_0) \neq 0$, then $f/g$ has a derivative at $x_0$, with

\[ \left( \frac{f}{g} \right)'(x_0) = \frac{f'(x_0) g(x_0) - f(x_0) g'(x_0)}{g^2(x_0)}. \]

Proof. The addition of 0 is helpful. First,

\[
\begin{aligned}
(fg)'(x_0) &= \lim_{x \to x_0} \frac{f(x)g(x) - f(x_0)g(x_0)}{x - x_0} \\
&= \lim_{x \to x_0} \frac{f(x)g(x) - f(x_0)g(x) + f(x_0)g(x) - f(x_0)g(x_0)}{x - x_0} \\
&= \lim_{x \to x_0} g(x) \frac{f(x) - f(x_0)}{x - x_0} + \lim_{x \to x_0} f(x_0) \frac{g(x) - g(x_0)}{x - x_0} \\
&= g(x_0) \lim_{x \to x_0} \frac{f(x) - f(x_0)}{x - x_0} + f(x_0) \lim_{x \to x_0} \frac{g(x) - g(x_0)}{x - x_0} \\
&= f'(x_0) g(x_0) + f(x_0) g'(x_0),
\end{aligned}
\]

since the limit of a product is the product of the limits by Theorem 7.3.1, and $g$ is continuous at $x_0$.

A similar technique establishes the quotient rule.

\[
\begin{aligned}
(f/g)'(x_0) &= \lim_{x \to x_0} \frac{1}{x - x_0} \left( \frac{f(x)}{g(x)} - \frac{f(x_0)}{g(x_0)} \right) \\
&= \lim_{x \to x_0} \frac{1}{x - x_0} \cdot \frac{f(x)g(x_0) - g(x)f(x_0)}{g(x)g(x_0)} \\
&= \lim_{x \to x_0} \frac{1}{x - x_0} \cdot \frac{(f(x) - f(x_0))g(x_0) - (g(x) - g(x_0))f(x_0)}{g(x)g(x_0)} \\
&= \lim_{x \to x_0} \frac{g(x_0)}{g(x)g(x_0)} \cdot \frac{f(x) - f(x_0)}{x - x_0} - \lim_{x \to x_0} \frac{f(x_0)}{g(x)g(x_0)} \cdot \frac{g(x) - g(x_0)}{x - x_0} \\
&= \frac{f'(x_0) g(x_0)}{g^2(x_0)} - \frac{f(x_0) g'(x_0)}{g^2(x_0)}.
\end{aligned}
\]

Another important differentiation rule is the chain rule, which tells us how to differentiate the composition of two functions. Recall the alternate notations for composition,

\[ f(g(x)) = (f \circ g)(x). \]

The chain rule says roughly that if $f$ and $g$ are differentiable, then

\[ (f \circ g)'(x_0) = f'(g(x_0)) g'(x_0). \]

In preparation for the proof of the chain rule, we establish a series of lemmas which follow quickly from the definition of the derivative. The first compares a function $g(x)$ with linear functions (see Figure 7.2).

Figure 7.2: Comparing $g(x)$ with linear functions near $x_0 = 1$ (curves shown: $x^2$, $0.5(x-1) + 1$, and $4(x-1) + 1$).

Lemma 7.4.4. Suppose that $g$ has a derivative at $x_0$. If $g'(x_0) \neq 0$, then there is a $\delta > 0$ such that $0 < |x - x_0| < \delta$ implies

\[ |g'(x_0)||x - x_0|/2 < |g(x) - g(x_0)| < 2|g'(x_0)||x - x_0|. \]

If $g'(x_0) = 0$, then for any $\varepsilon > 0$ there is a $\delta > 0$ such that $0 < |x - x_0| < \delta$ implies

\[ |g(x) - g(x_0)| \le \varepsilon |x - x_0|. \]

Proof. Assume that $g'(x_0) \neq 0$. There is no loss of generality in assuming that $g'(x_0) > 0$, since if $g'(x_0) < 0$ the function $-g$ can be considered instead.

Take $\varepsilon = g'(x_0)/2$. From the limit definition there is a $\delta > 0$ such that $0 < |x - x_0| < \delta$ implies

\[ \left| \frac{g(x) - g(x_0)}{x - x_0} - g'(x_0) \right| < g'(x_0)/2, \]

which is the same as

\[ g'(x_0) - g'(x_0)/2 < \frac{g(x) - g(x_0)}{x - x_0} < g'(x_0) + g'(x_0)/2. \]

Since the middle term is positive, and $3/2 < 2$,

\[ g'(x_0)/2 < \left| \frac{g(x) - g(x_0)}{x - x_0} \right| < 2 g'(x_0). \]

Multiply by $|x - x_0|$ to get the first result.

In case $g'(x_0) = 0$ the limit definition says that for any $\varepsilon > 0$ there is a $\delta > 0$ such that $0 < |x - x_0| < \delta$ implies

\[ \left| \frac{g(x) - g(x_0)}{x - x_0} \right| < \varepsilon. \]

Multiply by $|x - x_0|$ to get the desired inequality.

Lemma 7.4.5. Suppose that $g'(x_0) \neq 0$. Then there is a $\delta > 0$ such that $0 < |x - x_0| < \delta$ implies $g(x) \neq g(x_0)$.

Proof. Since $g'(x_0) \neq 0$, the previous lemma says there is a $\delta > 0$ such that $0 < |x - x_0| < \delta$ implies

\[ |g(x) - g(x_0)| > |x - x_0||g'(x_0)|/2. \]

Since $|x - x_0| \neq 0$, it follows that $g(x) - g(x_0) \neq 0$.

The last lemma develops an estimate valid for any value of $g'(x_0)$.

Lemma 7.4.6. Suppose that $g$ has a derivative at $x_0$. Then there is a $\delta > 0$ such that $0 < |x - x_0| < \delta$ implies

\[ |g(x) - g(x_0)| \le \left[ 1 + 2|g'(x_0)| \right] |x - x_0|. \]

Proof. One conclusion of Lemma 7.4.4 is that for $x$ close to $x_0$ either

\[ |g(x) - g(x_0)| \le 2|g'(x_0)||x - x_0|, \]

or for any $\varepsilon > 0$,

\[ |g(x) - g(x_0)| \le \varepsilon |x - x_0|, \]

depending on the value of $g'(x_0)$. In this last inequality take $\varepsilon = 1$. In any case $|g(x) - g(x_0)|$ will be smaller than the sum of the right hand sides, which is the claim.

Theorem 7.4.7. (Chain rule) Suppose that $f : (a, b) \to \mathbb{R}$, $g : (c, d) \to \mathbb{R}$, $g$ is differentiable at $x_0 \in (c, d)$, and $f$ is differentiable at $g(x_0) \in (a, b)$. Then $f \circ g$ is differentiable at $x_0$ and

\[ (f \circ g)'(x_0) = f'(g(x_0)) g'(x_0). \]

Proof. First suppose that $g'(x_0) \neq 0$. Lemma 7.4.5 assures us that $g(x) \neq g(x_0)$ for $x$ close to $x_0$, so

\[ \lim_{x \to x_0} \frac{f(g(x)) - f(g(x_0))}{x - x_0} = \lim_{x \to x_0} \frac{f(g(x)) - f(g(x_0))}{g(x) - g(x_0)} \cdot \frac{g(x) - g(x_0)}{x - x_0}. \]

The existence of the limit

\[ \lim_{x \to x_0} \frac{g(x) - g(x_0)}{x - x_0} = g'(x_0) \]

was part of the hypotheses. Let $y_0 = g(x_0)$ and let $h : (a, b) \to \mathbb{R}$ be the function

\[ h(y) = \begin{cases} \dfrac{f(y) - f(y_0)}{y - y_0}, & y \neq y_0, \\ f'(y_0), & y = y_0. \end{cases} \]

The assumption that $f$ is differentiable at $y_0$ is precisely the assumption that $h$ is continuous at $y_0$. Since $g$ is continuous at $x_0$, Theorem 7.3.5 says that $h(g(x))$ is continuous at $x_0$, or

\[ \lim_{x \to x_0} \frac{f(g(x)) - f(g(x_0))}{g(x) - g(x_0)} = f'(g(x_0)). \]

Since the product of the limits is the limit of the product, the pieces may be put together to give

\[ (f \circ g)'(x_0) = \lim_{x \to x_0} \frac{f(g(x)) - f(g(x_0))}{g(x) - g(x_0)} \cdot \lim_{x \to x_0} \frac{g(x) - g(x_0)}{x - x_0} = f'(g(x_0)) g'(x_0). \]

Now the case $g'(x_0) = 0$ is considered. Take any $\varepsilon > 0$. Since $g$ is continuous at $x_0$, and $f$ has a derivative at $g(x_0)$, Lemma 7.4.6 shows that for $x$ close enough to $x_0$

\[ |f(g(x)) - f(g(x_0))| \le \left[ 1 + 2|f'(g(x_0))| \right] |g(x) - g(x_0)|, \]

and

\[ |g(x) - g(x_0)| \le \varepsilon |x - x_0|. \]

Putting these estimates together yields

\[ |f(g(x)) - f(g(x_0))| \le \left[ 1 + 2|f'(g(x_0))| \right] \varepsilon |x - x_0|. \]

This is the same as

\[ \left| \frac{f(g(x)) - f(g(x_0))}{x - x_0} - 0 \right| \le \left[ 1 + 2|f'(g(x_0))| \right] \varepsilon, \]

or

\[ 0 = \lim_{x \to x_0} \frac{f(g(x)) - f(g(x_0))}{x - x_0} = (f \circ g)'(x_0) = f'(g(x_0)) g'(x_0). \]
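The two cases of the proof can be checked numerically. The sketch below is an illustration only: it assumes $f = \sin$ and $g(x) = x^2 + 1$ (choices not taken from the text), and compares a difference quotient of $f \circ g$ with $f'(g(x_0)) g'(x_0)$ at a point where $g'(x_0) \neq 0$ and at $x_0 = 0$, where $g'(x_0) = 0$.

```python
import math

def difference_quotient(func, x0, h=1e-6):
    """Quotient from the definition of the derivative, with a small fixed h."""
    return (func(x0 + h) - func(x0)) / h

def g(x):
    return x * x + 1.0       # illustrative inner function, g'(x) = 2x

def composite(x):
    return math.sin(g(x))    # f(g(x)) with f = sin, f' = cos

def chain_rule_value(x0):
    """f'(g(x0)) * g'(x0) for the choices above."""
    return math.cos(g(x0)) * 2.0 * x0

# Case g'(x0) != 0 (x0 = 0.7) and case g'(x0) = 0 (x0 = 0).
gap_nonzero = abs(difference_quotient(composite, 0.7) - chain_rule_value(0.7))
gap_zero = abs(difference_quotient(composite, 0.0) - chain_rule_value(0.0))
```

Both gaps come out on the order of $h$, which is consistent with the theorem but of course does not replace the proof.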

7.4.2 The Mean Value Theorem

A function $f : (a, b) \to \mathbb{R}$ is said to have a local maximum at $x_0 \in (a, b)$ if there is a $\delta > 0$ such that $f(x_0) \ge f(x)$ for all $x \in (x_0 - \delta, x_0 + \delta)$. A local minimum is defined analogously. The function $f$ is said to have a local extreme point at $x_0$ if there is either a local maximum or minimum at $x_0$.

Lemma 7.4.8. Suppose that $f : (a, b) \to \mathbb{R}$ has a local extreme point at $x_0 \in (a, b)$. If $f$ has a derivative at $x_0$, then $f'(x_0) = 0$.

Proof. The cases when $f$ has a local maximum and a local minimum are similar, so suppose that $f$ has a local maximum at $x_0$.

There is a $\delta > 0$ such that $f(x) \le f(x_0)$ for all $x \in (x_0 - \delta, x_0 + \delta)$. This means that for $x_0 < x < x_0 + \delta$

\[ \frac{f(x) - f(x_0)}{x - x_0} \le 0, \]

and for $x_0 - \delta < x < x_0$

\[ \frac{f(x) - f(x_0)}{x - x_0} \ge 0. \]

Since the derivative is the limit of these difference quotients, it follows that $f'(x_0) \le 0$ and $f'(x_0) \ge 0$. This forces $f'(x_0) = 0$.

Theorem 7.4.9. (Rolle's Theorem) Suppose that $f : [a, b] \to \mathbb{R}$ is continuous on $[a, b]$, and differentiable on the open interval $(a, b)$. If $f(a) = f(b) = 0$, then there is some point $x_0 \in (a, b)$ with $f'(x_0) = 0$.

Proof. By Theorem 7.3.3 the function $f$ has an extreme value at some point $x_0 \in [a, b]$. If the function $f$ is zero at every point of $[a, b]$, then $f'(x_0) = 0$ for every $x_0 \in (a, b)$. Otherwise $f$ must have a maximum or minimum at some point $x_0 \in (a, b)$. By Lemma 7.4.8 $f'(x_0) = 0$.

Rolle's Theorem looks special because of the requirement that $f(a) = f(b) = 0$, but it is easy to use it to produce a more flexible result.

Theorem 7.4.10. (Mean Value Theorem) Suppose that $g : [a, b] \to \mathbb{R}$ is continuous on $[a, b]$, and differentiable on the open interval $(a, b)$. Then there is some point $x_0 \in (a, b)$ with

\[ g'(x_0) = \frac{g(b) - g(a)}{b - a}. \]

Proof. The idea is to modify the function $g$ to obtain a new function $f$ to which Rolle's Theorem may be applied. The new function is

\[ f(x) = g(x) - g(a) - (x - a) \frac{g(b) - g(a)}{b - a}. \]

With this choice $f(a) = f(b) = 0$. By Rolle's Theorem there is an $x_0 \in (a, b)$ such that $f'(x_0) = 0$, or

\[ 0 = g'(x_0) - \frac{g(b) - g(a)}{b - a}, \]

as desired.
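The theorem asserts that a suitable point $x_0$ exists but does not say where it is. For a concrete function the point can be located numerically. The sketch below assumes $g(x) = x^3$ on $[0, 2]$ (an illustrative choice), where the secant slope is $4$ and $g'(x) = 3x^2$, and uses bisection on $g'(x) - 4$. Bisection is a swapped-in technique here: it works because $g'$ is continuous and $g'(x) - 4$ changes sign on $[0, 2]$, which is an assumption about this example rather than part of the theorem. The helper name `mean_value_point` is hypothetical.

```python
def mean_value_point(dg, slope, lo, hi, tol=1e-12):
    """Bisection on dg(x) - slope; assumes a sign change on [lo, hi]."""
    f_lo = dg(lo) - slope
    while hi - lo > tol:
        mid = (lo + hi) / 2
        f_mid = dg(mid) - slope
        # Keep the half interval on which the sign change persists.
        if (f_lo < 0) == (f_mid < 0):
            lo, f_lo = mid, f_mid
        else:
            hi = mid
    return (lo + hi) / 2

def g(x):
    return x ** 3

def dg(x):
    return 3 * x ** 2

a, b = 0.0, 2.0
slope = (g(b) - g(a)) / (b - a)      # secant slope, equal to 4 here
x0 = mean_value_point(dg, slope, a, b)
```

For this example the exact answer is $x_0 = 2/\sqrt{3}$, which lies strictly inside $(0, 2)$ as the theorem requires.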

The Mean Value Theorem may be used to show that functions with positive derivatives are increasing. Recall that $f : I \to \mathbb{R}$ is increasing if

\[ f(x_1) \le f(x_2) \quad \text{whenever } x_1 < x_2, \quad x_1, x_2 \in I. \]

The function $f$ is strictly increasing if

\[ f(x_1) < f(x_2) \quad \text{whenever } x_1 < x_2, \quad x_1, x_2 \in I. \]

Decreasing and strictly decreasing functions are defined in a similar fashion.

Theorem 7.4.11. Suppose $f : [a, b] \to \mathbb{R}$ is continuous, and $f$ is differentiable on $(a, b)$, with $f'(x) > 0$ for $x \in (a, b)$. Then $f$ is strictly increasing on $[a, b]$.

Proof. If $f$ is not strictly increasing, then there are points $x_1 < x_2$ in $[a, b]$ such that

\[ \frac{f(x_2) - f(x_1)}{x_2 - x_1} \le 0. \]

By the Mean Value Theorem there is a point $x_0 \in (a, b)$ such that $f'(x_0) \le 0$, contradicting the hypotheses.

The Mean Value Theorem also implies that the amount a function $f(x)$ can change over the interval $[x_1, x_2]$ is controlled by the magnitude of the derivative $|f'(x)|$ on that interval. Notice in particular that the following result shows that functions with bounded derivatives are uniformly continuous (see problem 20).

Theorem 7.4.12. Suppose that $f$ is differentiable and $m \le |f'(x)| \le M$ for $x \in (a, b)$. Then for all $x_1, x_2 \in (a, b)$

\[ m|x_2 - x_1| \le |f(x_2) - f(x_1)| \le M|x_2 - x_1|. \]

Proof. Beginning with the upper bound assumption $|f'(x)| \le M$, and arguing by contradiction, suppose there are two points $x_1$ and $x_2$ such that

\[ |f(x_2) - f(x_1)| > M|x_2 - x_1|. \]

Without loss of generality, assume that $x_1 < x_2$ and $f(x_1) < f(x_2)$. Then

\[ \frac{f(x_2) - f(x_1)}{x_2 - x_1} > M. \]

By the Mean Value Theorem there must be a point $x_0 \in (x_1, x_2)$ with

\[ f'(x_0) = \frac{f(x_2) - f(x_1)}{x_2 - x_1} > M, \]

contradicting the assumed bound on $|f'(x)|$.

Similarly, suppose that $m \le |f'(x)|$, but there are two points $x_1$ and $x_2$ such that

\[ m|x_2 - x_1| > |f(x_2) - f(x_1)|. \]

Again, it won't hurt to assume that $x_1 < x_2$ and $f(x_1) < f(x_2)$. Then

\[ \frac{f(x_2) - f(x_1)}{x_2 - x_1} < m, \]

so by the Mean Value Theorem there must be a point $x_0 \in (x_1, x_2)$ with

\[ f'(x_0) = \frac{f(x_2) - f(x_1)}{x_2 - x_1} < m, \]

again giving a contradiction.

Theorem 7.4.12 is helpful in studying inverse functions and their derivatives. Recall that a function $f$ is one-to-one if $x_1 \neq x_2$ implies $f(x_1) \neq f(x_2)$. A one-to-one function $f(x)$ has an inverse function $f^{-1}(y)$ defined on the range of $f$ by setting $f^{-1}(f(x)) = x$. The reader is invited to check that the identity $f(f^{-1}(y)) = y$ also holds.

Theorem 7.4.13. (Inverse Function Theorem) Suppose $f : (a, b) \to \mathbb{R}$ has a continuous derivative, and that $f'(x_1) > 0$ for some $x_1 \in (a, b)$. Let $f(x_1) = y_1$. Then there are numbers $x_0 < x_1 < x_2$, and $y_0 < y_1 < y_2$ such that $f : [x_0, x_2] \to \mathbb{R}$ is one-to-one, and the range of $f$ with this domain is the interval $[y_0, y_2]$. The inverse function $f^{-1} : [y_0, y_2] \to \mathbb{R}$ is differentiable at $y_1$, with

\[ (f^{-1})'(y_1) = \frac{1}{f'(x_1)}. \]

Proof. Since $f'(x_1) > 0$ and $f'(x)$ is continuous on $(a, b)$, there is (see problem 7) a $\delta > 0$ such that $f'(x) > 0$ for $x$ in the interval $I = (x_1 - \delta, x_1 + \delta)$. It follows that $f(x)$ is strictly increasing on $I$, hence one-to-one there, and so there is an inverse function $f^{-1}(y)$ on the range of $f : I \to \mathbb{R}$.

Pick points $x_0, x_2 \in I$ such that $x_0 < x_1 < x_2$, and define $y_0 = f(x_0)$, $y_2 = f(x_2)$. Since $f$ is strictly increasing on $[x_0, x_2]$, it follows that $y_0 < y_1 < y_2$. The Intermediate Value Theorem (Theorem 7.3.4) shows that the range of $f : [x_0, x_2] \to \mathbb{R}$ is the interval $[y_0, y_2]$.

To see that $f^{-1}$ has a derivative at $y_1$, examine

\[ \frac{f^{-1}(y) - f^{-1}(y_1)}{y - y_1} = \frac{x - x_1}{f(x) - f(x_1)}. \]

The right hand side has the limit $1/f'(x_1)$ as $x \to x_1$. We want to show that

\[ \lim_{y \to y_1} \frac{f^{-1}(y) - f^{-1}(y_1)}{y - y_1} = \lim_{x \to x_1} \frac{x - x_1}{f(x) - f(x_1)}. \]

To this end, suppose that $\varepsilon > 0$, and find $\delta$ such that $0 < |x - x_1| < \delta$ implies

\[ \left| \frac{x - x_1}{f(x) - f(x_1)} - \frac{1}{f'(x_1)} \right| < \varepsilon. \]

On the compact interval $[x_0, x_2]$, the continuous function $f'(x)$ has a positive minimum $m$ and maximum $M$. By Theorem 7.4.12 the inequality

\[ m|x - x_1| \le |f(x) - f(x_1)| \]

holds for $x \in (x_0, x_2)$. Thus if

\[ 0 < |y - y_1| = |f(x) - f(x_1)| < m\delta = \delta_1, \]

then $|x - x_1| < \delta$, so that

\[ \left| \frac{f^{-1}(y) - f^{-1}(y_1)}{y - y_1} - \frac{1}{f'(x_1)} \right| = \left| \frac{x - x_1}{f(x) - f(x_1)} - \frac{1}{f'(x_1)} \right| < \varepsilon. \]

This establishes the desired limit equality, and also provides the value of the derivative.

The assumption $f'(x_1) > 0$ in this theorem is for convenience in the proof. The hypothesis can be changed to $f'(x_1) \neq 0$.
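The formula $(f^{-1})'(y_1) = 1/f'(x_1)$ can be checked on an example where the inverse has no convenient closed form. The sketch below assumes $f(x) = x^3 + x$ and $x_1 = 1$ (illustrative choices), so that $y_1 = 2$ and $f'(x_1) = 4$. The inverse is evaluated by bisection on $[0, 2]$, which relies on $f$ being strictly increasing there; the helper name `inverse` is hypothetical.

```python
def f(x):
    return x ** 3 + x        # illustrative choice; f'(x) = 3x^2 + 1 > 0

def inverse(y, lo=0.0, hi=2.0):
    """Solve f(x) = y by bisection, assuming f is increasing on [lo, hi]
    and f(lo) <= y <= f(hi)."""
    for _ in range(200):
        mid = (lo + hi) / 2
        if f(mid) < y:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

x1 = 1.0
y1 = f(x1)                   # y1 = 2
h = 1e-6
# Difference quotient of the inverse function at y1.
quotient = (inverse(y1 + h) - inverse(y1)) / h
expected = 1.0 / (3 * x1 ** 2 + 1)   # 1 / f'(x1) = 0.25
```

The quotient agrees with $1/f'(x_1)$ up to an error of order $h$, as the theorem predicts.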

7.4.3 Contractions

This section considers an application of the ideas in Theorem 7.4.12 to the root-finding algorithm known as Newton's method. We will be interested in functions $f$ which map a compact interval $[a, b]$ back into itself, and for which all points $f(x_1)$ and $f(x_2)$ are closer to each other than $x_1$ and $x_2$ are. With this idea in mind, say that a function $f : [a, b] \to [a, b]$ is a contraction if there is a number $\alpha$ satisfying $0 \le \alpha < 1$ such that

\[ |f(x_2) - f(x_1)| \le \alpha |x_2 - x_1|, \quad \text{for all } x_1, x_2 \in [a, b]. \]

By Theorem 7.4.12, a function $f : [a, b] \to [a, b]$ will be a contraction if $f$ is differentiable and $|f'(x)| \le M < 1$ for all $x \in [a, b]$.

The first result is an easy exercise (see problem 20).

Lemma 7.4.14. If $f : [a, b] \to [a, b]$ is a contraction, then $f$ is uniformly continuous.

The second observation is also straightforward. If $f : [a, b] \to [a, b]$ is continuous, then the graph of $f$ must somewhere hit the line $y = x$.

Lemma 7.4.15. If $f : [a, b] \to [a, b]$ is continuous, then there is some $x_0 \in [a, b]$ such that $f(x_0) = x_0$.

Proof. If $f(a) = a$ or $f(b) = b$ then there is nothing more to show, so we may assume that $f(a) > a$ and $f(b) < b$. This means that the function $f(x) - x$ is positive when $x = a$, and negative when $x = b$. By the Intermediate Value Theorem there is some $x_0 \in [a, b]$ such that $f(x_0) = x_0$.

Solutions of the equation $f(x) = x$ are called fixed points of the function $f$. For functions $f : [a, b] \to [a, b]$ which are contractions there is a unique fixed point, and there is a constructive procedure for approximating the fixed point. The ideas in the next theorem generalize quite well, leading to many important applications.

Theorem 7.4.16. (Contraction Mapping Theorem) Suppose that $f : [a, b] \to [a, b]$ is a contraction. Then there is a unique point $z_0 \in [a, b]$ such that $f(z_0) = z_0$. Moreover, if $x_0$ is any point in $[a, b]$, and $x_n$ is defined by $x_{n+1} = f(x_n)$, then

\[ |x_n - z_0| \le \alpha^n |x_0 - z_0|, \]

so that $z_0 = \lim_{n \to \infty} x_n$.

Proof. By the previous lemma the function $f$ has at least one fixed point. Let's first establish that there cannot be more than one. Suppose that $f(z_0) = z_0$ and $f(z_1) = z_1$. Using the definition of a contraction,

\[ |z_1 - z_0| = |f(z_1) - f(z_0)| \le \alpha |z_1 - z_0|. \]

That is,

\[ (1 - \alpha)|z_1 - z_0| \le 0. \]

Since $0 \le \alpha < 1$ the factor $(1 - \alpha)$ is positive, and the factor $|z_1 - z_0|$ is nonnegative. Since the product is less than or equal to 0, it must be that $|z_1 - z_0| = 0$, or $z_1 = z_0$.

The inequality

\[ |x_n - z_0| \le \alpha^n |x_0 - z_0| \]

is proved by induction, with the first case $n = 0$ being trivial. Assuming the inequality holds in the $n$-th case, it follows that

\[ |x_{n+1} - z_0| = |f(x_n) - f(z_0)| \le \alpha |x_n - z_0| \le \alpha\, \alpha^n |x_0 - z_0| = \alpha^{n+1} |x_0 - z_0|. \]
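The constructive procedure in the theorem is easy to try out. The sketch below assumes $f = \cos$ on $[0, 1]$, an illustrative choice: $\cos$ maps $[0, 1]$ into itself and $|f'(x)| = |\sin x| \le \sin 1 < 1$ there, so by Theorem 7.4.12 it is a contraction with $\alpha = \sin 1$. The last iterate is used as a numerical stand-in for the fixed point $z_0$, which is an approximation rather than the exact value. The helper name `iterate` is hypothetical.

```python
import math

def iterate(f, x0, steps):
    """Return [x_0, x_1, ..., x_steps] with x_{n+1} = f(x_n)."""
    xs = [x0]
    for _ in range(steps):
        xs.append(f(xs[-1]))
    return xs

alpha = math.sin(1.0)        # bound on |f'| for f = cos on [0, 1]
xs = iterate(math.cos, 0.0, 100)
z0 = xs[-1]                  # numerical stand-in for the fixed point

# Check |x_n - z0| <= alpha^n |x_0 - z0|, with a small allowance for rounding.
bound_holds = all(
    abs(xs[n] - z0) <= alpha ** n * abs(xs[0] - z0) + 1e-12
    for n in range(60)
)
```

The observed convergence is somewhat faster than the bound $\alpha^n$ suggests, since near the fixed point the derivative of $\cos$ is smaller in magnitude than $\sin 1$.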

7.4.3.1 Rootfinding 2: Newton’s Method

Newton's method is an old, popular, and powerful technique for obtaining numerical solutions of root finding problems $f(x) = 0$. Geometrically, the idea is to begin with an initial guess $x_0$. One then approximates $f$ by the tangent line to the graph at $x_0$. If the slope $f'(x_0)$ is not 0, the point where the tangent line intersects the $x$-axis is taken as the next estimate $x_1$, and the process is repeated. Thus the algorithm starts with the initial guess $x_0$ and defines a sequence of points

\[ x_{n+1} = x_n - \frac{f(x_n)}{f'(x_n)}. \]

Theorem 7.4.17. Suppose that $f : (a, b) \to \mathbb{R}$ has two continuous derivatives on the interval $(a, b)$. Assume that $f(r) = 0$, but $f'(r) \neq 0$, for some $r \in (a, b)$. Then there is a $\delta > 0$ such that if $|x_0 - r| < \delta$ the sequence

\[ x_{n+1} = x_n - \frac{f(x_n)}{f'(x_n)} \]

will converge to $r$.

Proof. One approach to the proof uses the contraction idea. To that end, define

\[ g(x) = x - \frac{f(x)}{f'(x)}. \]

A calculation gives

\[ g(r) = r, \]

and

\[ g'(x) = 1 - \frac{(f'(x))^2 - f(x) f''(x)}{(f'(x))^2} = \frac{f(x) f''(x)}{(f'(x))^2}. \]

Since $g'(x)$ is continuous for $x$ near $r$, with $g'(r) = 0$, there is a $\delta > 0$ such that

\[ |g'(x)| \le \frac{1}{2}, \quad r - \delta \le x \le r + \delta. \]

For $x \in [r - \delta, r + \delta]$ the combination of $g(r) = r$ and Theorem 7.4.12 implies

\[ |g(x) - r| = |g(x) - g(r)| \le \frac{1}{2}|x - r| \le \frac{1}{2}\delta. \]

Since $g : [r - \delta, r + \delta] \to [r - \delta, r + \delta]$ is a contraction with fixed point $r$, and $x_{n+1} = g(x_n)$, Theorem 7.4.16 shows that

\[ r = \lim_{n \to \infty} x_n \]

for any $x_0 \in [r - \delta, r + \delta]$.

This proof showed that for $x_0$ close enough to $r$, the sequence $x_n$ comes from iteration of a contraction with $\alpha \le 1/2$. This already guarantees rapid convergence of the sequence $\{x_n\}$ to $r$. Actually, as $x_n$ gets close to $r$, the value of $\alpha$ will improve, further accelerating the rate of convergence. Additional information about Newton's method can be found in most basic numerical analysis texts.
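A short experiment shows how quickly the iteration settles down. The sketch below assumes $f(x) = x^2 - 3$ with initial guess $x_0 = 2$, both illustrative choices; the theorem only guarantees convergence for $x_0$ sufficiently close to the root $r = \sqrt{3}$, and the starting point here is simply assumed to be close enough. The helper name `newton` is hypothetical.

```python
def newton(f, df, x0, steps):
    """Return [x_0, ..., x_steps] with x_{n+1} = x_n - f(x_n) / f'(x_n)."""
    xs = [x0]
    for _ in range(steps):
        x = xs[-1]
        xs.append(x - f(x) / df(x))
    return xs

def f(x):
    return x * x - 3.0       # illustrative choice with root sqrt(3)

def df(x):
    return 2.0 * x

xs = newton(f, df, 2.0, 6)
errors = [abs(x - 3.0 ** 0.5) for x in xs]
```

The first two iterates are $x_1 = 1.75$ and $x_2 \approx 1.73214$, and the errors shrink much faster than the factor $1/2$ per step guaranteed by the contraction argument, in line with the remark that $\alpha$ improves as $x_n$ approaches $r$.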

7.4.4 Convexity

Calculus students spend a lot of time searching for extreme points of a function $f(x)$ by examining critical points, which are solutions of $f'(x) = 0$. Without additional information, such a critical point could be a local or global minimum or maximum, or none of these. This situation changes dramatically if $f$ satisfies the additional condition $f'' > 0$. The positivity of the second derivative is closely related to a geometric condition called convexity. A simple model for convex functions is provided by the function $f(x) = x^2$, which is shown in Figure 7.3, along with the tangent line to this graph at $x = 2$, and the secant line joining $(1, f(1))$ and $(3, f(3))$. Notice that the graph of $f$ lies below its secant line on the interval $[1, 3]$, and above the tangent line.

Given an interval $I$, a function $f : I \to \mathbb{R}$ is said to be convex if the graph of the function always lies beneath its secant lines. To make this precise, suppose that $a$ and $b$ are distinct points in $I$. The points on the line segment joining $(a, f(a))$ to $(b, f(b))$ may be written as

\[ (tb + (1 - t)a,\; t f(b) + (1 - t) f(a)), \quad 0 \le t \le 1. \]

The function $f$ is convex on $I$ if for all distinct pairs $a, b \in I$

\[ f(tb + (1 - t)a) \le t f(b) + (1 - t) f(a), \quad 0 \le t \le 1. \tag{7.4} \]

The function $f$ is strictly convex if the inequality is strict except at the endpoints,

\[ f(tb + (1 - t)a) < t f(b) + (1 - t) f(a), \quad 0 < t < 1. \]
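Inequality (7.4) can be probed numerically for a given pair of points. The sketch below is an illustration under stated assumptions: it samples finitely many values of $t$, so it can expose a failure of (7.4) but cannot prove convexity. The test functions $x^2$ on $[1, 3]$ and $x^3$ on $[-2, 0]$ are illustrative choices, and the helper name `violates_convexity` is hypothetical.

```python
def violates_convexity(f, a, b, samples=101, tol=1e-12):
    """True if some sampled t in [0, 1] breaks inequality (7.4) for a, b."""
    for k in range(samples):
        t = k / (samples - 1)
        graph_value = f(t * b + (1 - t) * a)
        secant_value = t * f(b) + (1 - t) * f(a)
        # tol absorbs rounding error in the comparison.
        if graph_value > secant_value + tol:
            return True
    return False

# x^2 stays below its secant line on [1, 3], as in the discussion of Figure 7.3.
square_ok = not violates_convexity(lambda x: x * x, 1.0, 3.0)
# x^3 rises above its secant line on [-2, 0], so it is not convex there.
cube_fails = violates_convexity(lambda x: x ** 3, -2.0, 0.0)
```

For $x^3$ the failure is already visible at $t = 1/2$, where $f(-1) = -1$ exceeds the secant value $-4$.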

The first result says that the graph of a convex function $f$ lies above its tangent lines.

Theorem 7.4.18. Suppose that $f : (c, d) \to \mathbb{R}$ is convex, and $f'(a)$ exists for some $a \in (c, d)$. Then

\[ f(a) + (b - a) f'(a) \le f(b) \quad \text{for all } b \in (c, d). \tag{7.5} \]

Figure 7.3: Convex functions (legend: secant line, tangent line).

Proof. After some algebraic manipulation (7.4) may be expressed as

\[ f(t[b - a] + a) - f(a) \le t[f(b) - f(a)], \quad 0 \le t \le 1, \]

or, for $a \neq b$ and $t \neq 0$,

\[ \frac{f(t[b - a] + a) - f(a)}{t[b - a]} (b - a) \le f(b) - f(a). \]

Take the limit as $h = t[b - a] \to 0$ to get

\[ f'(a)(b - a) \le f(b) - f(a), \]

which is equivalent to (7.5).

If $f$ is differentiable on $(c, d)$, then a converse to Theorem 7.4.18 holds.

Theorem 7.4.19. Suppose that $f$ is differentiable on $(c, d)$. If

\[ f(a) + (b - a) f'(a) \le f(b) \quad \text{for all } a, b \in (c, d), \]

then $f : (c, d) \to \mathbb{R}$ is convex.

Proof. Suppose that $f$ is not convex on $(c, d)$. Then there is a pair of points $a, b$ with $c < a < b < d$, and some $t_1 \in (0, 1)$, such that

\[ f(t_1 b + (1 - t_1)a) > t_1 f(b) + (1 - t_1) f(a). \tag{7.6} \]

Define the function $g(x)$ whose graph is the line joining $(a, f(a))$ and $(b, f(b))$,

\[ g(x) = f(a) + \frac{f(b) - f(a)}{b - a}(x - a), \quad a \le x \le b. \]

If $x_1 = t_1 b + (1 - t_1)a$, then (7.6) says that

\[ f(x_1) > g(x_1). \]

By the Mean Value Theorem there is a point $c_1 \in (a, b)$ such that

\[ f'(c_1) = \frac{f(b) - f(a)}{b - a}. \]

It is easy to see that there is such a $c_1$ also satisfying

\[ f(c_1) > g(c_1). \]

The tangent line to $f$ at $c_1$ has the form

\[ h(x) = f'(c_1)(x - c_1) + f(c_1) = \frac{f(b) - f(a)}{b - a}(x - c_1) + f(c_1). \]

Since $f(c_1) > g(c_1)$ and the lines $g(x)$ and $h(x)$ have the same slopes,

\[ h(b) > g(b). \]

That is,

\[ f'(c_1)(b - c_1) + f(c_1) > f(b), \]

so the inequality (7.5) is not valid for all pairs of points in $(c, d)$.

There is a simple second derivative test that can be used to recognize convex functions.

Theorem 7.4.20. Suppose that $f$ is continuous on $[c, d]$, and has two derivatives on $(c, d)$. If $f''(x) \ge 0$ for $x \in (c, d)$, then $f$ is convex on $[c, d]$. If $f''(x) > 0$ for $x \in (c, d)$, then $f$ is strictly convex on $[c, d]$.

Proof. It is convenient to work with the contrapositive statement: if $f$ is not convex on $[c, d]$, then there is some $x_1 \in (c, d)$ with $f''(x_1) < 0$.

Introduce the auxiliary function

\[ g(t) = f(tb + (1 - t)a) - [t f(b) + (1 - t) f(a)], \quad 0 \le t \le 1, \]

which satisfies $g(0) = 0 = g(1)$. If $f$ is not convex, then for some pair of distinct points $a, b \in [c, d]$ there is some point $t_1 \in (0, 1)$ such that $g(t_1) > 0$.

The continuous function $g : [0, 1] \to \mathbb{R}$ has a positive maximum at some point $t_2 \in (0, 1)$, with $g'(t_2) = 0$. In addition, an application of the Mean Value Theorem on the interval $[t_2, 1]$ shows that there is a point $t_3 \in (t_2, 1)$ with $g'(t_3) < 0$.

Now apply the Mean Value Theorem again, this time to the function $g'$ on the interval $[t_2, t_3]$, obtaining

\[ g''(t_4) = \frac{g'(t_2) - g'(t_3)}{t_2 - t_3} < 0, \quad \text{for some } t_4 \in (t_2, t_3). \]

Finally, a chain rule calculation shows that

\[ g''(t) = (b - a)^2 f''(tb + (1 - t)a), \]

so

\[ f''(t_4 b + (1 - t_4)a) = g''(t_4)/(b - a)^2 < 0. \]

The case $f'' > 0$ is handled in a similar fashion.

This last theorem has a converse of sorts.

Theorem 7.4.21. Suppose that $f$ has two derivatives on $(c, d)$, and $f''(x) < 0$ for all $x \in (c, d)$. Then $f$ is not convex on $(c, d)$.

Proof. Picking distinct points $a, b \in (c, d)$ with $a < b$, consider the function

\[ g(x) = f(x) - f(a) - f'(a)(x - a). \]

By Theorem 7.4.11 (applied to $-f'$) the function $f'(x)$ is strictly decreasing on $[a, b]$. This implies $g'(x) = f'(x) - f'(a) < 0$ for $x > a$. Since $g(a) = 0$, it follows that $g(b) < 0$. This means

\[ f(b) < f(a) + f'(a)(b - a), \]

so $f$ cannot be convex by Theorem 7.4.18.

Finally, here is the answer to a calculus student's prayers.

Theorem 7.4.22. Suppose that $f : (c, d) \to \mathbb{R}$ is convex, and $f'(a) = 0$ for some $a \in (c, d)$. Then $a$ is a global minimizer for $f$. If $f$ is strictly convex, then $f$ has at most one global minimizer.

Proof. To see that $a$ is a global minimizer, simply apply Theorem 7.4.18 to conclude that

\[ f(a) \le f(b), \quad \text{for all } b \in (c, d). \]

Suppose that $f$ is strictly convex, with a global minimizer $a$. If $b$ is distinct from $a$, and $f(a) = f(b)$, the defining inequality for strict convexity gives

\[ f(tb + (1 - t)a) < t f(b) + (1 - t) f(a) = f(a), \quad 0 < t < 1, \]

contradicting the assumption that $a$ is a global minimizer.


7.5 Problems

1. Suppose we want to talk about the set of real valued rational functions of a real variable. For instance, we might say that the sum of any finite collection of rational functions is another rational function. Discuss the problem of defining a common domain for all rational functions. What is the appropriate domain for a fixed finite collection of rational functions?

2. Suppose that $\{a_n\}$ is a sequence of real numbers. Show that a function is defined by the rule $f(n) = a_n$. What is the domain?

3. Show that if

\[ \lim_{x \to \infty} f(x) = \infty \]

then

\[ \lim_{x \to \infty} 1/f(x) = 0. \]

What can you say about the set of $x$ where $f(x) = 0$?

4. Complete the proofs of (iii) and (iv) in Theorem 7.3.1.

5. Suppose that $r(x) = p(x)/q(x)$ is a rational function, with

\[ p(x) = \sum_{k=0}^{m} a_k x^k, \quad a_m \neq 0, \]

and

\[ q(x) = \sum_{k=0}^{n} b_k x^k, \quad b_n \neq 0. \]

Show that $\lim_{x \to \infty} r(x) = 0$ if $n > m$, and $\lim_{x \to \infty} r(x) = a_m/b_m$ if $m = n$.

6. Suppose $\lim_{x \to x_0} f(x) = M$ and $M \neq 0$.
(a) Show there is a $\delta > 0$ such that $f(x) \neq 0$ for $0 < |x - x_0| < \delta$.
(b) State and prove an analogous result if $\lim_{x \to \infty} f(x) = M$ and $M \neq 0$.

7. Suppose $\lim_{x \to x_0} f(x) = M$ and $M > 0$.
(a) Show there is a $\delta > 0$ such that

\[ M/2 \le f(x) \le 2M \]

for $0 < |x - x_0| < \delta$.

(b) Take as a fact that

\[ \lim_{x \to 0} \frac{\sin(x)}{x} = 1. \]

Show there is a $\delta > 0$ such that

\[ x/2 \le \sin(x) \le 2x \]

for $0 < x < \delta$. What happens if $x < 0$?

8. Show that the function $f(x) = x$ is continuous at every point $x_0 \in \mathbb{R}$.

9. For each $\varepsilon > 0$, explicitly find a $\delta > 0$ such that $|x^2 - 1| < \varepsilon$ if $|x - 1| < \delta$.

10. Establish the continuity of $f(x) = |x|$ in two steps.
(a) Show that the function $f(x) = |x|$ is continuous at $x_0 = 0$.
(b) Show that $f(x) = |x|$ is continuous at every point $x_0 \in \mathbb{R}$.

11. Produce two proofs that the function $g(x) = 1/x$ is continuous on the interval $(0, \infty)$. The first should use Theorem 7.3.1, while the second should be based on the definition of continuity.

12. For $a < b < c$, suppose that the real valued function $f$ is continuous on the intervals $[a, b]$ and $[b, c]$. Show that $f$ is continuous on $[a, c]$. Is the conclusion still true if we only assume that $f$ is continuous on the intervals $(a, b)$ and $(b, c)$?

13. Let $g : \mathbb{R} \to \mathbb{R}$ be the function satisfying $g(x) = 0$ when $x$ is irrational, while $g(x) = x$ when $x$ is rational. Show that $g(x)$ is continuous at $x_0 = 0$, but at no other point.

14. Suppose that $g : \mathbb{R} \to \mathbb{R}$ is continuous. In addition, assume that the formula $g(x) = x^2$ holds for all rational values of $x$. Show that $g(x) = x^2$ for all $x \in \mathbb{R}$.

15. Show that any polynomial

\[ p(x) = \sum_{k=0}^{n} a_k x^k, \quad a_n \neq 0, \]

with odd degree $n$ and real coefficients $a_k$ has at least one real root.

16. Suppose $f : [0, 1] \to \mathbb{R}$ is a continuous function such that $f(0) < 0$ and $f(1) > 1$. Show that there is at least one point $x_0 \in [0, 1]$ such that $f(x_0) = x_0$.

17. Suppose $I_0$ and $I_1$ are open intervals, and that $f : I_0 \to \mathbb{R}$ is continuous at $x_0 \in I_0$. Show that if $x_k \in I_0$, $\lim_{k \to \infty} x_k = x_0$, and $f(x_0) \in I_1$, then there is an $N$ such that $f(x_k) \in I_1$ for $k \ge N$. How is this related to Theorem 7.3.5?

18. Use the bisection method to approximate $\sqrt{3}$ by taking $f(x) = x^2 - 3$, with $a = 0$ and $b = 2$. Compute $a_n$ and $b_n$ for $n \le 5$. (Use a calculator.) How many iterations are required before $|b_n - r| \le 10^{-10}$?

19. Suppose that $f$ is continuous on the interval $(-\infty, \infty)$. Assume in addition that $f(x) \ge 0$ for all $x \in \mathbb{R}$, and that

\[ \lim_{x \to \pm\infty} f(x) = 0. \]

Show that $f$ has a maximum at some $x_0 \in \mathbb{R}$. Find an example to show that $f$ may not have a minimum.

20. Suppose that $f : I \to \mathbb{R}$ and $f$ satisfies the inequality

\[ |f(x) - f(y)| \le C|x - y| \]

for some constant $C$ and all $x, y \in I$. Show that $f$ is uniformly continuous on the interval $I$.

21. Show that the two definitions of $f'(x_0)$,

\[ f'(x_0) = \lim_{x \to x_0} \frac{f(x) - f(x_0)}{x - x_0}, \]

and

\[ f'(x_0) = \lim_{h \to 0} \frac{f(x_0 + h) - f(x_0)}{h}, \]

are equivalent by showing that the existence of one limit implies the existence of the other, and that the two limits are the same.

22. Show that the function $f(x) = |x|$ does not have a derivative at $x_0 = 0$.

23. Prove directly from the definition that

\[ \frac{d}{dx} x^2 = 2x, \qquad \frac{d}{dx} x^3 = 3x^2. \]

24. Given a function $f$ defined on $[a, b]$, we sometimes wish to discuss the differentiability of $f$ at $a$ or $b$ without looking at a larger interval. How would you define $f'(a)$ and $f'(b)$ using $\lim_{x \to a^+}$ and $\lim_{x \to b^-}$?

25. Show that the function
$$f(x) = \begin{cases} x^2, & x \ge 0, \\ 0, & x < 0 \end{cases}$$
has a derivative at every real number x.

26. Consider the following problems about differentiability.

(a) Assume that f : [0, 1] → R is differentiable at $x_0 = 0$. Suppose that there is a sequence $x_n \in [0, 1]$ such that $f(x_n) = 0$ and $\lim_{n\to\infty} x_n = 0$. Prove that $f'(0) = 0$.

(b) Define the function
$$g(x) = \begin{cases} x \sin(1/x), & x \neq 0, \\ 0, & x = 0. \end{cases}$$
Is g differentiable at $x_0 = 0$?

27. Suppose $f'(x) = 0$ for all x ∈ (a, b). Show that f(x) is constant.

28. Here is another version of Theorem 7.4.11. Suppose f : [a, b] → R is continuous, and f is differentiable on (a, b) with $f'(x) \ge 0$ for x ∈ (a, b). Prove that f is increasing on [a, b]. Give an example to show that f may not be strictly increasing.

29. Suppose f and g are real valued functions defined on [a, b). Show that if f(a) = g(a) and if $f'(x) < g'(x)$ for x ∈ (a, b), then f(x) < g(x) for x ∈ (a, b).

30. Show that the function $f(x) = x^5 + x^3 + x + 1$ has exactly one real root.

31. Assume that f : R → R is differentiable, that |f(0)| ≤ 1, and that $|f'(x)| \le 1$. What is the largest possible value for |f(x)| if x ≥ 0? Provide an example that achieves your bound.

32. Suppose that $f'(x) = g'(x)$ for all x ∈ (a, b). Show that f(x) = g(x) for all x ∈ (a, b) if and only if there is some $x_0 \in (a, b)$ such that $f(x_0) = g(x_0)$.

33. Assume that $x_1 < x_2 < \cdots < x_N$, and define
$$p(x) = (x - x_1) \cdots (x - x_N).$$
Show that $p'(x)$ has exactly N − 1 real roots.

34. Suppose that $f^{(n)}(x) = 0$ for all x ∈ (a, b). Show that f(x) is a polynomial of degree at most n − 1.

35. Suppose that f is continuous on [a, b], and has n derivatives on (a, b). Assume that there are points
$$a \le x_0 < x_1 < \cdots < x_n \le b$$
such that $f(x_k) = 0$. Show that there is a point ξ ∈ (a, b) such that $f^{(n)}(\xi) = 0$.


36. Prove Theorem 7.4.13 if the hypothesis $f'(x_1) > 0$ is replaced by $f'(x_1) \neq 0$. Don’t work too hard.

37. Find the derivatives of $\sin^{-1}(x)$ and $\tan^{-1}(x)$ for x near 0.

38. Use Newton’s method and a calculator to approximate $\sqrt{2}$. Use Theorem 7.4.16 to estimate the accuracy of the approximations.

39. Construct the requested examples.

(a) Find an example of a function f : [c, d] → R such that $|f'(x)| \le \alpha < 1$ for x ∈ [c, d], but f has no fixed point.

(b) Find an example of a function f : (0, 1) → (0, 1) such that $|f'(x)| \le \alpha < 1$ for x ∈ (0, 1), but f has no fixed point in (0, 1).

40. For n ≥ 0, consider solving the equation
$$x^{n+1} - \sum_{k=0}^{n} a_k x^k = 0, \quad a_k > 0,$$
by recasting it as the fixed point problem
$$x = f(x) = \sum_{k=0}^{n} a_k x^{k-n}.$$

(a) Show that the problem has exactly one positive solution.

(b) Show that $f : [a_n, f(a_n)] \to [a_n, f(a_n)]$.

(c) Show that the sequence $x_0 = a_n$, $x_{m+1} = f(x_m)$ converges to the positive solution if
$$\Big| \sum_{k=0}^{n-1} \frac{(k - n) a_k}{(a_n)^{n-k+1}} \Big| < 1.$$

41. In addition to the hypotheses of Theorem 7.4.16, suppose that there is a constant C such that
$$|f(x_2) - f(x_1)| \le C r |x_2 - x_1|, \quad x_1, x_2 \in [z_0 - r, z_0 + r].$$
Proceeding as in the proof of Theorem 7.4.16, one first has
$$|x_1 - z_0| = |f(x_0) - f(z_0)| \le C r |x_0 - z_0| \le C |x_0 - z_0|^2.$$
Show that the convergence estimates improve to
$$|x_n - z_0| \le C^{2^n - 1} |x_0 - z_0|^{2^n}.$$

42. Consider the following problems.

(a) Suppose
$$f(x) = \sum_{k=1}^{n} c_k \exp(a_k x), \quad c_k > 0.$$
Show that if $\lim_{x\to\pm\infty} f(x) = \infty$, then f has a unique global minimum.

(b) Find a strictly convex function f : R → R with no global minimizer.

43. Suppose f and g are convex functions defined on an interval I. Show that f + g is also convex on I. Show that αf is convex if α ≥ 0. If h : R → R is convex and increasing, show that h(f(x)) is convex.

44. Assume that f : R → R is strictly convex. Show that there are no more than two distinct points $x_i$ satisfying $f(x_i) = 0$.


Chapter 8

Integrals

8.1 Introduction

One of the fundamental problems in calculus is the computation of the area between the graph of a function f : [a, b] → R and the x-axis. The essential ideas are illustrated in Figures 8.1 and 8.2. An interval [a, b] is divided into n subintervals $[x_k, x_{k+1}]$, with $a = x_0 < x_1 < \cdots < x_n = b$. On each subinterval the area is approximated by the area of a rectangle, whose height is usually the value of the function $f(t_k)$ at some point $t_k \in [x_k, x_{k+1}]$. In the left figure, the heights of the rectangles are given by the values $f(x_k)$, while on the right the heights are $f(x_{k+1})$.

In elementary treatments it is often assumed that each subinterval has length (b − a)/n. One would then like to argue that the sum of the areas of the rectangles has a limit as n → ∞. This limiting value will be taken as the area, which is denoted by the integral
$$\int_a^b f(x)\,dx.$$
In chapter 2 this idea was carried out for the elementary functions $x^m$ for m = 0, 1, 2, . . . .

In general, there are both practical and theoretical problems that arise in trying to develop this idea for area computation. Although calculus texts emphasize algebraic techniques for integration, there are many important integrals which cannot be evaluated by such techniques. One then has the practical problem of selecting and using efficient algorithms to calculate integrals with high accuracy. This will require fairly explicit descriptions of the errors made by approximate integration techniques. On the theoretical side, problems arise because there are examples for which the area computation does not seem meaningful. A major problem is to describe a large class of functions for which the integral makes sense.

[Figure 8.1: A lower Riemann sum. Axes: x, y.]

One example of a function whose integration is problematic is f(x) = 1/x. Consider an area computation on the interval [0, 1]. Fix n and choose
$$x_k = \frac{k}{n}, \quad k = 0, \dots, n.$$
Form rectangles with heights $f(x_{k+1})$, the value of the function at the right endpoint of each subinterval. Since 1/x is decreasing on the interval (0, 1], these rectangles will lie below the graph. The sum of the areas of the rectangles is
$$s_n = \sum_{k=0}^{n-1} \frac{1}{n} f(x_{k+1}) = \frac{1}{n}\Big[\frac{n}{1} + \frac{n}{2} + \frac{n}{3} + \cdots + \frac{n}{n}\Big] = \sum_{k=1}^{n} \frac{1}{k}.$$
The areas $s_n$ are the partial sums of the harmonic series, which diverges.

The problem becomes even worse if we consider the integral on [−1, 1],
$$\int_{-1}^{1} \frac{1}{x}\,dx.$$
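The growth of the right endpoint sums for 1/x on [0, 1] can be observed numerically. The following is a minimal Python sketch, not part of the original text; the function names `right_endpoint_sum` and `harmonic` are illustrative choices. It compares the rectangle sum with the harmonic partial sum and with log n, which the harmonic sums are known to exceed.

```python
import math

def right_endpoint_sum(n):
    """Right endpoint sum for f(x) = 1/x on [0, 1] with x_k = k/n."""
    return sum((1.0 / n) * (1.0 / ((k + 1) / n)) for k in range(n))

def harmonic(n):
    """Partial sum 1 + 1/2 + ... + 1/n of the harmonic series."""
    return sum(1.0 / k for k in range(1, n + 1))

for n in (10, 100, 1000, 10000):
    # The two sums agree up to rounding, and both keep growing with n.
    print(n, right_endpoint_sum(n), harmonic(n), math.log(n))
```

The printed values keep increasing as n grows, so no finite limiting area emerges from these sums.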

[Figure 8.2: An upper Riemann sum. Axes: x, y.]

Recall that a signed area is intended when the function f(x) is not positive; negative function values contribute negative area. While one is fairly safe in assigning the ‘value’ ∞ to $\int_0^1 1/x\,dx$, extreme caution is called for when trying to make sense of the expression
$$\int_{-1}^{1} \frac{1}{x}\,dx = \int_{-1}^{0} \frac{1}{x}\,dx + \int_{0}^{1} \frac{1}{x}\,dx = \infty - \infty.$$

A different sort of challenge is provided by the function
$$g(x) = \begin{cases} 1, & x \text{ is rational}, \\ 0, & x \text{ is irrational}. \end{cases} \tag{8.1}$$
Consider trying to compute $\int_0^1 g(x)\,dx$. Fix n, and take $x_k = k/n$ for k = 0, . . . , n. If the heights of the rectangles are given by $g(t_k)$, the sum of the areas of the rectangles is
$$S_n = \sum_{k=0}^{n-1} \frac{1}{n}\, g(t_k).$$
If $t_k$ is chosen to be the left endpoint $x_k$ of the k-th subinterval, then since $x_k$ is rational, $S_n = 1$. In contrast, if $t_k$ is chosen to be an irrational number in the k-th subinterval, then $S_n = 0$. Regardless of how small the subintervals $[x_k, x_{k+1}]$ are, some of our computations result in an area estimate of 1, while others give an area estimate of 0.

These examples indicate that a certain amount of care is needed when trying to determine a class of functions which can be integrated. The approach we will follow, usually referred to as Riemann’s theory of integration, was developed in the nineteenth century by Cauchy, Riemann and Darboux [9, pp. 956–961]. This development revived ideas of approximating areas under curves by sums of areas of simpler geometric figures that had antecedents in the work of ancient Greece, and then of Leibniz. A still more sophisticated approach, which will not be treated in this book, was developed in the early twentieth century by H. Lebesgue.

8.2 Integrable functions

Riemann’s theory of integration treats the integral of a bounded function f(x) defined on an interval [a, b]. Area computations are based on a process of estimation with two types of rectangles, as in Figure 8.1. The strategy is easiest to describe for positive functions f, although the process works without any sign restrictions. When f > 0, upper rectangles are constructed with heights greater than the corresponding function values, and lower rectangles are constructed with heights less than the function values. Functions are considered integrable when the areas computed using upper and lower rectangles agree. This method permits integration of an extremely large class of functions, including all continuous functions on [a, b], as well as a large variety of functions which are not continuous.

Some terminology will be needed to describe various subdivisions of the interval [a, b]. To subdivide the interval, introduce a partition $\mathcal{P}$ of [a, b], which is a finite set of real numbers $\{x_0, \dots, x_n\}$ satisfying
$$a = x_0 < x_1 < \cdots < x_n = b.$$
The interval [a, b] is divided into n subintervals $[x_k, x_{k+1}]$, for k = 0, . . . , n − 1. As a measure of the length of the subintervals in the partition, define the mesh of the partition,
$$\mu(\mathcal{P}) = \max_{k=0,\dots,n-1} |x_{k+1} - x_k|.$$

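A partition $\mathcal{P}_2$ is said to be a refinement of a partition $\mathcal{P}_1$ if $\mathcal{P}_1 \subset \mathcal{P}_2$. That is, $\mathcal{P}_2 = \{t_0, \dots, t_m\}$ is a refinement of $\mathcal{P}_1 = \{x_0, \dots, x_n\}$ if every $x_k \in \mathcal{P}_1$ appears in the list of points $t_j \in \mathcal{P}_2$. The partition $\mathcal{P}_3$ is said to be a common refinement of the partitions $\mathcal{P}_1$ and $\mathcal{P}_2$ if $\mathcal{P}_1 \subset \mathcal{P}_3$ and $\mathcal{P}_2 \subset \mathcal{P}_3$. The set $\mathcal{P}_3 = \mathcal{P}_1 \cup \mathcal{P}_2$ (with redundant points eliminated) is the smallest common refinement of $\mathcal{P}_1$ and $\mathcal{P}_2$.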
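The definitions of mesh and common refinement translate directly into a few lines of Python. This is an illustrative sketch only: partitions are represented as sorted lists of floats, and the helper names `mesh`, `common_refinement` and `is_refinement` are assumptions made for this example.

```python
def mesh(partition):
    """Largest subinterval length of a partition given as a sorted list."""
    return max(r - l for l, r in zip(partition, partition[1:]))

def common_refinement(p1, p2):
    """Smallest common refinement: the sorted union with duplicates removed."""
    return sorted(set(p1) | set(p2))

def is_refinement(fine, coarse):
    """True if every point of `coarse` also appears in `fine`."""
    return set(coarse) <= set(fine)

P1 = [0.0, 0.5, 1.0]
P2 = [0.0, 0.25, 0.75, 1.0]
P3 = common_refinement(P1, P2)

print(P3)                              # [0.0, 0.25, 0.5, 0.75, 1.0]
print(mesh(P1), mesh(P2), mesh(P3))    # 0.5 0.5 0.25
print(is_refinement(P3, P1), is_refinement(P3, P2))
```

Refining a partition can only shrink the mesh or leave it unchanged, which the sample values illustrate.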

Suppose that f is a bounded function which is defined on [a, b] and satisfies |f| ≤ N. Recall that the infimum, or inf, of a set U ⊂ R is another name for the greatest lower bound of U, and similarly the supremum, or sup, of U is the least upper bound of U. For each of the subintervals $[x_k, x_{k+1}]$, introduce the numbers
$$m_k = \inf\{f(t),\ x_k \le t \le x_{k+1}\}, \quad M_k = \sup\{f(t),\ x_k \le t \le x_{k+1}\}.$$
Even if f is not continuous, the numbers $m_k$ and $M_k$ will exist, and satisfy $-N \le m_k \le M_k \le N$. No matter how pathological the function f is, our sense of area demands that
$$m_k[x_{k+1} - x_k] \le \int_{x_k}^{x_{k+1}} f(x)\,dx \le M_k[x_{k+1} - x_k].$$
Adding up the contributions from the various subintervals, we obtain an upper sum
$$U(f, \mathcal{P}) = \sum_{k=0}^{n-1} M_k[x_{k+1} - x_k],$$
which will be larger than the integral, and a lower sum
$$L(f, \mathcal{P}) = \sum_{k=0}^{n-1} m_k[x_{k+1} - x_k],$$
which will be smaller than the integral. Since |f| ≤ N, and since the lower sums for a partition are always smaller than the upper sums for the same partition, the inequalities
$$-N[b - a] \le \sup_{\mathcal{P}} L(f, \mathcal{P}) \le \inf_{\mathcal{P}} U(f, \mathcal{P}) \le N[b - a]$$
are always valid.

The next lemma says that the lower sum for any partition is always smaller than the upper sum for any other partition.


Lemma 8.2.1. Suppose that f is a bounded function defined on [a, b], and that $\mathcal{P}_1$ and $\mathcal{P}_2$ are two partitions of [a, b]. Then
$$L(f, \mathcal{P}_1) \le U(f, \mathcal{P}_2).$$

Proof. Given partitions $\mathcal{P}_1 = \{x_k \mid k = 0, \dots, n\}$ and $\mathcal{P}_2 = \{y_j \mid j = 0, \dots, m\}$, let
$$\mathcal{P}_3 = \{z_0, \dots, z_r\} = \mathcal{P}_1 \cup \mathcal{P}_2$$
be their smallest common refinement (see Figure 8.3).

Let’s compare the upper sums $U(f, \mathcal{P}_1)$ and $U(f, \mathcal{P}_3)$. Since $\mathcal{P}_3$ is a refinement of $\mathcal{P}_1$, each interval $[x_k, x_{k+1}]$ may be written as the union of one or more intervals $[z_l, z_{l+1}]$,
$$[x_k, x_{k+1}] = \bigcup_{l=I(k)}^{J(k)} [z_l, z_{l+1}], \quad x_k = z_{I(k)} < \cdots < z_{J(k)+1} = x_{k+1}.$$
Comparing
$$\tilde{M}_l = \sup_{t\in[z_l, z_{l+1}]} f(t) \quad \text{and} \quad M_k = \sup_{t\in[x_k, x_{k+1}]} f(t),$$
we find that $M_k \ge \tilde{M}_l$, since $[z_l, z_{l+1}] \subset [x_k, x_{k+1}]$.

[Figure 8.3: A common refinement of partitions. Labeled points: $x_k$, $x_{k+1}$, $y_{i-1}$, $y_i$, $y_j$, $y_{j+1}$.]

Notice that the length of the interval $[x_k, x_{k+1}]$ is the sum of the lengths of the subintervals $[z_l, z_{l+1}]$ for l = I(k), . . . , J(k),
$$[x_{k+1} - x_k] = \sum_{l=I(k)}^{J(k)} [z_{l+1} - z_l].$$
It follows that
$$M_k[x_{k+1} - x_k] = \sum_{l=I(k)}^{J(k)} M_k[z_{l+1} - z_l] \ge \sum_{l=I(k)}^{J(k)} \tilde{M}_l[z_{l+1} - z_l].$$
This comparison extends to the upper sums,
$$U(f, \mathcal{P}_1) = \sum_{k=0}^{n-1} M_k[x_{k+1} - x_k] \ge \sum_{k=0}^{n-1} \sum_{l=I(k)}^{J(k)} \tilde{M}_l[z_{l+1} - z_l] = U(f, \mathcal{P}_3).$$
Thus refinement of a partition reduces the upper sum. By a similar argument, refinement of a partition increases the lower sum. Since $\mathcal{P}_3$ is a common refinement of $\mathcal{P}_1$ and $\mathcal{P}_2$, and since the upper sum of a partition exceeds the lower sum of the same partition, it follows that
$$L(f, \mathcal{P}_1) \le L(f, \mathcal{P}_3) \le U(f, \mathcal{P}_3) \le U(f, \mathcal{P}_2).$$
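The chain of inequalities in Lemma 8.2.1 can be checked on a small example. The sketch below is only illustrative and rests on a simplifying assumption: the test function $f(x) = x^2$ is increasing on [0, 1], so the infimum on each subinterval is attained at the left endpoint and the supremum at the right endpoint. The helper name `lower_upper_sums` is hypothetical.

```python
def lower_upper_sums(f, partition):
    """Lower and upper sums for an INCREASING function f.

    For increasing f, m_k = f(x_k) and M_k = f(x_{k+1}); this shortcut
    is not valid for general bounded functions.
    """
    pairs = list(zip(partition, partition[1:]))
    lower = sum(f(l) * (r - l) for l, r in pairs)
    upper = sum(f(r) * (r - l) for l, r in pairs)
    return lower, upper

f = lambda x: x * x                     # increasing on [0, 1]
P1 = [0.0, 0.5, 1.0]
P2 = [0.0, 0.25, 0.75, 1.0]
P3 = sorted(set(P1) | set(P2))          # smallest common refinement

L1, U1 = lower_upper_sums(f, P1)
L2, U2 = lower_upper_sums(f, P2)
L3, U3 = lower_upper_sums(f, P3)

# L(f,P1) <= L(f,P3) <= U(f,P3) <= U(f,P2)
print(L1, L3, U3, U2)                   # 0.125 0.21875 0.46875 0.546875
```

All the partition points here are dyadic fractions, so the floating point arithmetic is exact and the printed chain can be compared by hand.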

If our expectations about area are correct, then the upper and lower sums should approach a common value as the mesh of the partition approaches 0. This expectation will be realized for ‘nice’ functions, although pathological functions such as (8.1) will not fulfill our expectations. Say that a bounded function f : [a, b] → R is integrable if the infimum of its upper sums, taken over all partitions $\mathcal{P}$, is equal to the supremum of the lower sums, or in abbreviated notation
$$\inf_{\mathcal{P}} U(f, \mathcal{P}) = \sup_{\mathcal{P}} L(f, \mathcal{P}). \tag{8.2}$$
If the function f is integrable, the integral is taken to be this common value
$$\int_a^b f(x)\,dx = \inf_{\mathcal{P}} U(f, \mathcal{P}) = \sup_{\mathcal{P}} L(f, \mathcal{P}).$$

It is often convenient to work with an alternative characterization of integrable functions. The straightforward proof of the next lemma is left as an exercise.

Lemma 8.2.2. A bounded function f defined on the interval [a, b] is integrable if and only if for every ε > 0 there is a partition $\mathcal{P}$ such that
$$U(f, \mathcal{P}) - L(f, \mathcal{P}) < \varepsilon.$$

In some cases it is possible to show that f is integrable by explicitly bounding the difference of the upper and lower sums as a function of the mesh of the partition $\mathcal{P}$.


Theorem 8.2.3. Suppose that f : [a, b] → R is differentiable, and
$$|f'(x)| \le C, \quad x \in [a, b].$$
Then f is integrable, and
$$U(f, \mathcal{P}) - L(f, \mathcal{P}) \le C\mu(\mathcal{P})[b - a].$$

Proof. Since f is differentiable, it is continuous on the compact interval [a, b], and so bounded. In addition the continuity implies that for any partition $\mathcal{P} = \{x_0, \dots, x_n\}$, there are points $u_k, v_k$ in $[x_k, x_{k+1}]$ such that
$$m_k = \inf_{t\in[x_k, x_{k+1}]} f(t) = f(u_k), \quad M_k = \sup_{t\in[x_k, x_{k+1}]} f(t) = f(v_k).$$
For this partition the difference of the upper and lower sums is
$$U(f, \mathcal{P}) - L(f, \mathcal{P}) = \sum_{k=0}^{n-1} f(v_k)[x_{k+1} - x_k] - \sum_{k=0}^{n-1} f(u_k)[x_{k+1} - x_k] = \sum_{k=0}^{n-1} [f(v_k) - f(u_k)][x_{k+1} - x_k].$$
By the Mean Value Theorem
$$|f(v_k) - f(u_k)| \le C|v_k - u_k| \le C|x_{k+1} - x_k| \le C\mu(\mathcal{P}).$$
This gives the desired estimate of the difference of upper and lower sums,
$$U(f, \mathcal{P}) - L(f, \mathcal{P}) \le \sum_{k=0}^{n-1} C\mu(\mathcal{P})[x_{k+1} - x_k] = C\mu(\mathcal{P})[b - a].$$
By the previous lemma f is integrable, since the mesh $\mu(\mathcal{P})$ may be made arbitrarily small.
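The estimate of Theorem 8.2.3 can be compared with actual values of U − L. The following sketch is an illustration under stated assumptions, not part of the text: it uses $f(x) = x^2$ on [0, 1], where $|f'(x)| \le 2$, and relies on f being increasing so that $m_k = f(x_k)$ and $M_k = f(x_{k+1})$. The function name `gap_and_bound` is hypothetical.

```python
def gap_and_bound(n):
    """Return (U - L, C * mesh * (b - a)) for f(x) = x^2 on [0, 1]
    with n equal subintervals."""
    a, b, C = 0.0, 1.0, 2.0             # |f'(x)| = |2x| <= 2 on [0, 1]
    xs = [a + k * (b - a) / n for k in range(n + 1)]
    f = lambda x: x * x                  # increasing: m_k = f(x_k), M_k = f(x_{k+1})
    pairs = list(zip(xs, xs[1:]))
    gap = sum((f(r) - f(l)) * (r - l) for l, r in pairs)
    mesh = max(r - l for l, r in pairs)
    return gap, C * mesh * (b - a)

for n in (10, 100, 1000):
    gap, bound = gap_and_bound(n)
    print(n, gap, bound)                 # gap is close to 1/n, bound close to 2/n
```

In this example both quantities shrink like 1/n, so the bound captures the correct rate and is off only by a constant factor.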

When a computer is used to calculate integrals by geometric methods, it is important to relate the required number of arithmetic computations to the desired accuracy. Since
$$L(f, \mathcal{P}) \le \int_a^b f(x)\,dx \le U(f, \mathcal{P}),$$
Theorem 8.2.3 can be interpreted as a bound on the complexity of Riemann sum calculations. As a numerical technique the use of Riemann sums is rather inefficient.

The ideas used in the last proof may also be employed to show that continuous functions are integrable. In this case we lose the explicit connection between mesh size and the difference of the upper and lower sums.

Theorem 8.2.4. If f : [a, b] → R is continuous, then f is integrable. For any ε > 0 there is a $\mu_0 > 0$ such that $\mu(\mathcal{P}) < \mu_0$ implies
$$U(f, \mathcal{P}) - L(f, \mathcal{P}) < \varepsilon.$$

Proof. As in the proof of Theorem 8.2.3, f is bounded and there are points $u_k, v_k$ in $[x_k, x_{k+1}]$ such that
$$m_k = \inf_{t\in[x_k, x_{k+1}]} f(t) = f(u_k), \quad M_k = \sup_{t\in[x_k, x_{k+1}]} f(t) = f(v_k).$$
Since f is continuous on a compact interval, f is uniformly continuous. That is, for any η > 0 there is a δ such that
$$|f(x) - f(y)| < \eta \quad \text{whenever} \quad |x - y| < \delta.$$
Pick η = ε/(b − a), and let $\mu_0$ be the corresponding δ. If $\mathcal{P}$ is any partition with $\mu(\mathcal{P}) < \mu_0$, then
$$U(f, \mathcal{P}) - L(f, \mathcal{P}) = \sum_{k=0}^{n-1} [f(v_k) - f(u_k)][x_{k+1} - x_k] < \sum_{k=0}^{n-1} \frac{\varepsilon}{b - a}[x_{k+1} - x_k] = \varepsilon.$$

Theorem 8.2.5. Suppose that f(x) is integrable on [a, b]. If [c, d] ⊂ [a, b] then f is integrable on [c, d].

Proof. Any partition $\mathcal{P}$ of [a, b] has a refinement $\mathcal{P}_1 = \{x_0, \dots, x_n\}$ which includes the points c, d. Let $\mathcal{P}_2 \subset \mathcal{P}_1$ be the corresponding partition of [c, d].

Since
$$(M_k - m_k)[x_{k+1} - x_k] \ge 0,$$
it follows that
$$U(f, \mathcal{P}_2) - L(f, \mathcal{P}_2) \le U(f, \mathcal{P}_1) - L(f, \mathcal{P}_1) \le U(f, \mathcal{P}) - L(f, \mathcal{P}),$$
yielding the integrability of f on [c, d].

The next theorem allows us to construct examples of functions f which are integrable, but not continuous.

Theorem 8.2.6. Suppose that $\mathcal{P} = \{x_0, \dots, x_n\}$ is a partition of [a, b], and that $f_k : [x_k, x_{k+1}] \to R$ is an integrable function for k = 0, . . . , n − 1. Let g : [a, b] → R be a function satisfying
$$g(x) = f_k(x), \quad x \in (x_k, x_{k+1}),$$
and $g(x_k) = y_k$ for any values $y_k \in R$. Then g is integrable on [a, b] and
$$\int_a^b g(x)\,dx = \sum_{k=0}^{n-1} \int_{x_k}^{x_{k+1}} f_k(x)\,dx.$$

Proof. Since each of the functions $f_k$ is bounded, there is a constant C > 0 such that $|f_k(x)| \le C$ for all $x \in [x_k, x_{k+1}]$, and |g(x)| ≤ C for all x ∈ [a, b]. Given ε > 0, choose partitions $\mathcal{P}_k$ of $[x_k, x_{k+1}]$ such that
$$U(f_k, \mathcal{P}_k) - L(f_k, \mathcal{P}_k) < \frac{\varepsilon}{n}, \quad k = 0, \dots, n - 1.$$
Let $\mathcal{P}_k = \{y_0, \dots, y_m\}$ be the partition of $[x_k, x_{k+1}]$. Define refined partitions $\mathcal{Q}_k$ of $\mathcal{P}_k$ by adding points $t_k, t_{k+1}$ satisfying
$$x_k = y_0 < t_k < y_1 < \cdots < y_{m-1} < t_{k+1} < y_m = x_{k+1},$$
and such that
$$0 < t_k - x_k < \frac{\varepsilon}{nC}, \quad 0 < x_{k+1} - t_{k+1} < \frac{\varepsilon}{nC}, \quad k = 0, \dots, n - 1.$$
Since $\mathcal{Q}_k$ is a refinement of $\mathcal{P}_k$,
$$U(f_k, \mathcal{Q}_k) - L(f_k, \mathcal{Q}_k) < \frac{\varepsilon}{n}.$$
Since $f_k$ and g agree on $(x_k, x_{k+1})$, and
$$|f_k(x_k) - g(x_k)| \le 2C, \quad |f_k(x_{k+1}) - g(x_{k+1})| \le 2C,$$
we have
$$U(g, \mathcal{Q}_k) - U(f_k, \mathcal{Q}_k) < \frac{4\varepsilon}{n},$$
and
$$L(g, \mathcal{Q}_k) - L(f_k, \mathcal{Q}_k) < \frac{4\varepsilon}{n}.$$

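The union of the points in the collection $\{\mathcal{Q}_k \mid k = 0, \dots, n - 1\}$ defines a partition $\mathcal{Q}$ of [a, b]. Here we find
$$U(g, \mathcal{Q}) = \sum_{k=0}^{n-1} U(g, \mathcal{Q}_k) = \sum_{k=0}^{n-1} U(f_k, \mathcal{Q}_k) + \sum_{k=0}^{n-1} [U(g, \mathcal{Q}_k) - U(f_k, \mathcal{Q}_k)], \tag{8.3}$$
with
$$\Big| \sum_{k=0}^{n-1} [U(g, \mathcal{Q}_k) - U(f_k, \mathcal{Q}_k)] \Big| \le 4\varepsilon.$$
The lower sum $L(g, \mathcal{Q})$ may be treated in the same manner.

The integrability of g follows from the estimate
$$U(g, \mathcal{Q}) - L(g, \mathcal{Q}) = \sum_{k=0}^{n-1} [U(f_k, \mathcal{Q}_k) - L(f_k, \mathcal{Q}_k)] + \sum_{k=0}^{n-1} [U(g, \mathcal{Q}_k) - U(f_k, \mathcal{Q}_k)] - \sum_{k=0}^{n-1} [L(g, \mathcal{Q}_k) - L(f_k, \mathcal{Q}_k)] < \varepsilon + 4\varepsilon + 4\varepsilon.$$
The identity
$$\int_a^b g(x)\,dx = \sum_{k=0}^{n-1} \int_{x_k}^{x_{k+1}} f_k(x)\,dx$$
now follows from (8.3) and the analogous estimate
$$\Big| \sum_{k=0}^{n-1} [U(g, \mathcal{Q}_k) - U(f_k, \mathcal{Q}_k)] \Big| \le 4\varepsilon.$$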


When the functions $f_k$ in the statement of Theorem 8.2.6 are continuous on the intervals $[x_k, x_{k+1}]$, the function g : [a, b] → R is said to be piecewise continuous. Since continuous functions are integrable, so are piecewise continuous functions, which arise fairly often in applied mathematics.

It is often convenient, particularly for numerical calculations, to avoid the determination of the numbers $m_k$ and $M_k$. Instead, given a partition $\mathcal{P}$, the values $m_k$ and $M_k$ will be replaced by $f(t_k)$ for an arbitrary $t_k \in [x_k, x_{k+1}]$. To simplify the notation define
$$\Delta x_k = x_{k+1} - x_k.$$
As an approximation to the integral one considers Riemann sums, which are sums of the form
$$\sum_{k=0}^{n-1} f(t_k)\Delta x_k, \quad t_k \in [x_k, x_{k+1}].$$
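A Riemann sum is straightforward to compute once a rule for choosing the points $t_k$ is fixed. The sketch below is illustrative only; the function `riemann_sum` and the tag rules `left`, `mid` and `right` are names invented for this example, and the test function $f(x) = x^2$ on [0, 1] is an arbitrary choice.

```python
def riemann_sum(f, partition, tag):
    """Sum of f(t_k) * (x_{k+1} - x_k), where t_k = tag(x_k, x_{k+1})
    is expected to lie in [x_k, x_{k+1}]."""
    return sum(f(tag(l, r)) * (r - l) for l, r in zip(partition, partition[1:]))

# Three possible rules for choosing the sample point t_k.
left = lambda l, r: l
right = lambda l, r: r
mid = lambda l, r: (l + r) / 2

n = 100
P = [k / n for k in range(n + 1)]        # equal subintervals of [0, 1]
f = lambda x: x * x

for name, tag in (("left", left), ("mid", mid), ("right", right)):
    print(name, riemann_sum(f, P, tag))
```

Since this f is increasing, the left endpoint rule reproduces the lower sum and the right endpoint rule the upper sum, with every other choice of $t_k$ landing in between.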

Theorem 8.2.7. Suppose that f : [a, b] → R is continuous, $\mathcal{P} = \{x_0, \dots, x_n\}$ is a partition, and
$$\sum_{k=0}^{n-1} f(t_k)\Delta x_k, \quad t_k \in [x_k, x_{k+1}]$$
is a corresponding Riemann sum. For any ε > 0 there is a $\mu_0 > 0$ such that $\mu(\mathcal{P}) < \mu_0$ implies
$$\Big| \int_a^b f(x)\,dx - \sum_{k=0}^{n-1} f(t_k)\Delta x_k \Big| < \varepsilon.$$

Proof. On each interval $[x_k, x_{k+1}]$ we have
$$m_k \le f(t_k) \le M_k.$$
Multiplying by $\Delta x_k$ and adding gives
$$L(f, \mathcal{P}) = \sum_{k=0}^{n-1} m_k \Delta x_k \le \sum_{k=0}^{n-1} f(t_k)\Delta x_k \le \sum_{k=0}^{n-1} M_k \Delta x_k = U(f, \mathcal{P}).$$
In addition,
$$L(f, \mathcal{P}) \le \int_a^b f(x)\,dx \le U(f, \mathcal{P}).$$
These inequalities imply
$$\Big| \int_a^b f(x)\,dx - \sum_{k=0}^{n-1} f(t_k)\Delta x_k \Big| \le U(f, \mathcal{P}) - L(f, \mathcal{P}).$$
Finally, Theorem 8.2.4 says that for any ε > 0 there is a $\mu_0 > 0$ such that $\mu(\mathcal{P}) < \mu_0$ implies
$$U(f, \mathcal{P}) - L(f, \mathcal{P}) < \varepsilon.$$
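Theorem 8.2.7 places no restriction on how the points $t_k$ are chosen. As a rough illustration, the sketch below picks each $t_k$ at random in its subinterval and compares the resulting Riemann sum with the known value 1/3 of $\int_0^1 x^2\,dx$. The function name `random_tag_sum`, the seed, and the test function are assumptions made for this example; for this increasing f on equal subintervals, U − L equals 1/n, so the error cannot exceed 1/n whatever tags are drawn.

```python
import random

def random_tag_sum(f, a, b, n, rng):
    """Riemann sum on n equal subintervals of [a, b], with each sample
    point t_k drawn uniformly at random from [x_k, x_{k+1}]."""
    xs = [a + k * (b - a) / n for k in range(n + 1)]
    return sum(f(rng.uniform(l, r)) * (r - l) for l, r in zip(xs, xs[1:]))

f = lambda x: x * x
rng = random.Random(0)                   # fixed seed so runs are repeatable

for n in (10, 100, 1000):
    s = random_tag_sum(f, 0.0, 1.0, n, rng)
    # The error is bounded by U - L = 1/n for this f.
    print(n, abs(s - 1.0 / 3.0), 1.0 / n)
```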

8.3 Properties of integrals

Theorem 8.3.1. Suppose that f(x) and g(x) are integrable on [a, b]. For any constants $c_1, c_2$ the function $c_1 f(x) + c_2 g(x)$ is integrable, and
$$\int_a^b c_1 f(x) + c_2 g(x)\,dx = c_1 \int_a^b f(x)\,dx + c_2 \int_a^b g(x)\,dx.$$

Proof. It suffices to prove that
$$\int_a^b c_1 f(x)\,dx = c_1 \int_a^b f(x)\,dx, \qquad \int_a^b f(x) + g(x)\,dx = \int_a^b f(x)\,dx + \int_a^b g(x)\,dx.$$

Suppose that ε > 0, and $\mathcal{P} = \{x_0, \dots, x_n\}$ is a partition such that
$$U(f, \mathcal{P}) - L(f, \mathcal{P}) < \varepsilon.$$
If $c_1 \ge 0$ then
$$\inf_{x\in[x_k, x_{k+1}]} c_1 f(x) = c_1 \inf_{x\in[x_k, x_{k+1}]} f(x), \qquad \sup_{x\in[x_k, x_{k+1}]} c_1 f(x) = c_1 \sup_{x\in[x_k, x_{k+1}]} f(x),$$
while if $c_1 < 0$ then
$$\inf_{x\in[x_k, x_{k+1}]} c_1 f(x) = c_1 \sup_{x\in[x_k, x_{k+1}]} f(x), \qquad \sup_{x\in[x_k, x_{k+1}]} c_1 f(x) = c_1 \inf_{x\in[x_k, x_{k+1}]} f(x).$$
For $c_1 \ge 0$ it follows that
$$U(c_1 f, \mathcal{P}) = c_1 U(f, \mathcal{P}), \quad L(c_1 f, \mathcal{P}) = c_1 L(f, \mathcal{P}),$$
while for $c_1 < 0$
$$U(c_1 f, \mathcal{P}) = c_1 L(f, \mathcal{P}), \quad L(c_1 f, \mathcal{P}) = c_1 U(f, \mathcal{P}).$$
In either case
$$U(c_1 f, \mathcal{P}) - L(c_1 f, \mathcal{P}) < |c_1|\varepsilon.$$
This is enough to show that $c_1 f(x)$ is integrable, with
$$\int_a^b c_1 f(x)\,dx = c_1 \int_a^b f(x)\,dx.$$

Suppose that $\mathcal{P}_1$ is a second partition such that
$$U(g, \mathcal{P}_1) - L(g, \mathcal{P}_1) < \varepsilon.$$
By passing to a common refinement we may assume that $\mathcal{P} = \mathcal{P}_1$. For the function f let
$$m_k^f = \inf_{x\in[x_k, x_{k+1}]} f(x), \quad M_k^f = \sup_{x\in[x_k, x_{k+1}]} f(x),$$
with the analogous notation for g and f + g. Then for all $x \in [x_k, x_{k+1}]$,
$$m_k^f + m_k^g \le f(x) + g(x) \le M_k^f + M_k^g,$$
so that
$$m_k^{f+g} \ge m_k^f + m_k^g, \quad M_k^{f+g} \le M_k^f + M_k^g,$$
and
$$U(f + g, \mathcal{P}) \le U(f, \mathcal{P}) + U(g, \mathcal{P}), \quad L(f + g, \mathcal{P}) \ge L(f, \mathcal{P}) + L(g, \mathcal{P}).$$
These inequalities imply
$$U(f + g, \mathcal{P}) - L(f + g, \mathcal{P}) < 2\varepsilon,$$
so f + g is integrable. In addition both numbers
$$\int_a^b f(x) + g(x)\,dx \quad \text{and} \quad \int_a^b f(x)\,dx + \int_a^b g(x)\,dx$$
lie between $L(f, \mathcal{P}) + L(g, \mathcal{P})$ and $U(f, \mathcal{P}) + U(g, \mathcal{P})$, so that
$$\Big| \int_a^b f(x) + g(x)\,dx - \Big( \int_a^b f(x)\,dx + \int_a^b g(x)\,dx \Big) \Big| < 2\varepsilon.$$

The next result will show that the product of integrable functions is integrable. In the proof it will be necessary to discuss the length of a set of the intervals $[x_k, x_{k+1}]$ from the partition $\mathcal{P} = \{x_0, \dots, x_n\}$ of [a, b]. The obvious notion of length is used; if B is a subset of the indices {0, . . . , n − 1}, and
$$\mathcal{P}_B = \bigcup_{k\in B} [x_k, x_{k+1}],$$
then the length of $\mathcal{P}_B$ is
$$\text{length}(\mathcal{P}_B) = \sum_{k\in B} [x_{k+1} - x_k].$$

Theorem 8.3.2. If f(x) and g(x) are integrable on [a, b], then so is f(x)g(x).

Proof. The argument is somewhat simpler if f and g are assumed to be positive. It is a straightforward exercise to deduce the general case from this special case (see problem 16).

Pick ε > 0, and let $\mathcal{P}$ be a partition of [a, b] such that
$$U(f, \mathcal{P}) - L(f, \mathcal{P}) < \varepsilon^2, \quad U(g, \mathcal{P}) - L(g, \mathcal{P}) < \varepsilon^2. \tag{8.4}$$
Let $B_f$ be the set of indices k such that $M_k^f - m_k^f \ge \varepsilon$. On one hand, the contributions from the intervals $[x_k, x_{k+1}]$ with $k \in B_f$ tend to make the difference between upper and lower sums big:
$$U(f, \mathcal{P}) - L(f, \mathcal{P}) = \sum_{k=0}^{n-1} (M_k^f - m_k^f)\Delta x_k \ge \sum_{k\in B_f} (M_k^f - m_k^f)\Delta x_k \ge \varepsilon \sum_{k\in B_f} \Delta x_k.$$
On the other hand, (8.4) says the total difference between upper and lower sums is small. It follows that
$$\text{length}(\mathcal{P}_{B_f}) = \sum_{k\in B_f} \Delta x_k < \varepsilon.$$


The analogous definition for $B_g$ leads to the same conclusion.

Let J denote the set of indices k such that
$$M_k^f - m_k^f < \varepsilon, \quad M_k^g - m_k^g < \varepsilon, \quad k \in J.$$
The set of indices J is just the complement of the union of $B_f$ and $B_g$.

The functions f and g are bounded. Assume that
$$0 \le f(x) \le L, \quad 0 \le g(x) \le L, \quad x \in [a, b].$$
To estimate $U(fg, \mathcal{P}) - L(fg, \mathcal{P})$, note that since f(x) > 0 and g(x) > 0,
$$M_k^{fg} - m_k^{fg} \le M_k^f M_k^g - m_k^f m_k^g = M_k^f M_k^g - M_k^f m_k^g + M_k^f m_k^g - m_k^f m_k^g = M_k^f (M_k^g - m_k^g) + (M_k^f - m_k^f) m_k^g.$$
Thus for k ∈ J,
$$M_k^{fg} - m_k^{fg} < 2L\varepsilon,$$
while for any k
$$M_k^{fg} - m_k^{fg} < L^2.$$
Putting these estimates together, the proof is completed by noting that the difference between upper and lower sums for the function fg is small for the partition $\mathcal{P}$:
$$U(fg, \mathcal{P}) - L(fg, \mathcal{P}) = \sum_{k=0}^{n-1} (M_k^{fg} - m_k^{fg})\Delta x_k \le \sum_{k\in J} (M_k^{fg} - m_k^{fg})\Delta x_k + \sum_{k\in B_f} (M_k^{fg} - m_k^{fg})\Delta x_k + \sum_{k\in B_g} (M_k^{fg} - m_k^{fg})\Delta x_k < 2L[b - a]\varepsilon + L^2\varepsilon + L^2\varepsilon.$$

Theorem 8.3.3. If f is integrable, so is |f|, and
$$\Big| \int_a^b f(x)\,dx \Big| \le \int_a^b |f(x)|\,dx.$$

Proof. Again suppose that ε > 0, and $\mathcal{P} = \{x_0, \dots, x_n\}$ is a partition such that
$$U(f, \mathcal{P}) - L(f, \mathcal{P}) < \varepsilon.$$
For any numbers x, y
$$\big| |x| - |y| \big| \le |x - y|,$$
so that
$$M_k^{|f|} - m_k^{|f|} = \sup_{t_1, t_2} \big| |f(t_1)| - |f(t_2)| \big| \le \sup_{t_1, t_2} |f(t_1) - f(t_2)| = M_k^f - m_k^f.$$
This in turn implies
$$U(|f|, \mathcal{P}) - L(|f|, \mathcal{P}) \le U(f, \mathcal{P}) - L(f, \mathcal{P}) < \varepsilon,$$
and |f| is integrable.

One also checks easily that
$$|U(f, \mathcal{P})| \le U(|f|, \mathcal{P}),$$
leading to the desired inequality.

We turn now to the Fundamental Theorem of Calculus. By relating integrals and derivatives, this theorem provides the basis for most of the familiar integration techniques of calculus. The theorem will be split into two parts: the first considers the differentiability of integrals, the second describes the integration of derivatives.

As part of the proof of the next theorem, the integral $\int_b^a f(x)\,dx$ with a < b will be needed. This integral is defined by
$$\int_b^a f(x)\,dx = -\int_a^b f(x)\,dx, \quad a < b.$$

Theorem 8.3.4. (Fundamental Theorem of Calculus I) Suppose that f : (a, b) → R is continuous. If $x_0, x \in (a, b)$, then the function
$$F(x) = \int_{x_0}^{x} f(t)\,dt$$
is differentiable, and
$$F'(x) = f(x).$$


Proof. For h > 0 the computation runs as follows.
$$F(x + h) - F(x) = \int_{x_0}^{x+h} f(t)\,dt - \int_{x_0}^{x} f(t)\,dt = \int_{x}^{x+h} f(t)\,dt.$$
Since f is continuous on [x, x + h], there are u, v such that
$$f(u) = \min_{t\in[x, x+h]} f(t), \quad f(v) = \max_{t\in[x, x+h]} f(t).$$
The inequality
$$h f(u) \le \int_{x}^{x+h} f(t)\,dt \le h f(v)$$
then implies
$$f(u) \le \frac{F(x + h) - F(x)}{h} \le f(v).$$
The continuity of f at x means that for any ε > 0 there is an h such that
$$|f(y) - f(x)| < \varepsilon, \quad |y - x| < h.$$
Apply this inequality with u and v in place of y to obtain
$$f(x) - \varepsilon < f(u), \quad f(v) < f(x) + \varepsilon,$$
and so
$$f(x) - \varepsilon \le \frac{F(x + h) - F(x)}{h} \le f(x) + \varepsilon.$$
But this says that
$$\lim_{h\to 0^+} \frac{F(x + h) - F(x)}{h} = f(x).$$
For h < 0 the analogous computation begins with
$$F(x + h) - F(x) = -[F(x) - F(x + h)] = -\int_{x+h}^{x} f(t)\,dt = \int_{x}^{x+h} f(t)\,dt.$$
The rest of the computations, leading to
$$\lim_{h\to 0^-} \frac{F(x + h) - F(x)}{h} = f(x),$$
are left as an exercise.
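The squeeze used in this proof, $f(u) \le [F(x+h) - F(x)]/h \le f(v)$, can be observed on a concrete case. The sketch below is an assumed example rather than part of the text: it takes f = cos with $x_0 = 0$, so that F = sin, and uses the fact that cos is decreasing on [x, x + h] for the chosen x, which gives f(u) = cos(x + h) and f(v) = cos(x). The helper name `difference_quotient` is hypothetical.

```python
import math

def difference_quotient(F, x, h):
    """The quotient (F(x + h) - F(x)) / h from the definition of F'(x)."""
    return (F(x + h) - F(x)) / h

# Assumed example: f = cos and F(x) = integral of cos from 0 to x = sin(x).
x = 0.5
for h in (0.1, 0.01, 0.001):
    q = difference_quotient(math.sin, x, h)
    # cos is decreasing on [x, x + h]: min is cos(x + h), max is cos(x).
    print(h, math.cos(x + h), q, math.cos(x))
```

As h shrinks, the lower and upper values close in on cos(0.5), and the difference quotient is trapped between them.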


It is now easy to obtain the part of the Fundamental Theorem of Calculus which forms the basis for much of calculus.

Theorem 8.3.5. (Fundamental Theorem of Calculus II) Suppose that F : (c, d) → R has a continuous derivative, $F'(x) = f(x)$, and that [a, b] ⊂ (c, d). Then
$$\int_a^b f(t)\,dt = F(b) - F(a).$$

Proof. For x ∈ (c, d) the functions F(x) − F(a) and $\int_a^x f(t)\,dt$ have the same derivative by the first part of the Fundamental Theorem of Calculus. Consequently, these two functions differ by a constant C. Evaluation of the two functions at x = a shows that C = 0.

8.4 Numerical computation of integrals

In this section our attention turns to the practical computation of integrals, particularly integrals which do not have elementary antiderivatives. A good example is
$$\Phi(b) = \int_0^b e^{-x^2}\,dx,$$
an integral which occurs quite often in probability and statistics. At first glance it may appear that there is no problem, since Theorem 8.2.7 assures us that it is sufficient to partition the interval [0, b] into n equal length subintervals, and use any Riemann sum. By taking n large enough the integral can be evaluated with any desired degree of accuracy.

In practice the relationship between the number of computations, measured in this case by the number of subintervals n, and the accuracy of the computation is often extremely important. Computations have costs, and in some applications computation times have severe constraints. When accuracy requirements are stringent, or the allotted time for computations is limited, inefficient algorithms may have little value.

This discussion will begin with a look at the simplest Riemann sums, the left and right endpoint sums. It will become evident that these techniques are not very efficient. The alternative midpoint and trapezoidal rules for integration will be considered. Happily, these simple modifications of our computational technique will offer tremendous improvements in efficiency.

8.4.1 Endpoint Riemann sums

An upper bound on the cost of Riemann sum calculations may be obtained from Theorem 8.2.3. Start with a partition $\mathcal{P}$ which divides the interval [a, b] into n equal length subintervals,
$$x_k = a + k\,\frac{b - a}{n}, \quad k = 0, \dots, n.$$
The values of any Riemann sum using the partition $\mathcal{P}$, as well as the value of the integral $\int_a^b f(x)\,dx$, are bracketed by the values of the lower and upper sums for $\mathcal{P}$, so
$$\Big| \int_a^b f(x)\,dx - \sum_{k=0}^{n-1} f(x_k)\Delta x_k \Big| \le U(f, \mathcal{P}) - L(f, \mathcal{P}).$$
The inequality
$$U(f, \mathcal{P}) - L(f, \mathcal{P}) \le C\mu(\mathcal{P})[b - a], \quad C = \max_{x\in[a,b]} |f'(x)|$$
from Theorem 8.2.3 thus implies
$$\Big| \int_a^b f(x)\,dx - \sum_{k=0}^{n-1} f(x_k)\Delta x_k \Big| \le \max_{x\in[a,b]} |f'(x)|\,\frac{(b - a)^2}{n}.$$

Consider the particular example∫ 1

0e−x2

dx.

In this case a simple computation shows that

maxx∈[0,∞)

|f ′(x)| = |f ′(1/√

2)| =√

2e−1/2 � .86.

Since $b - a = 1$, the theorem assures us that an error bounded by $\epsilon$ can be achieved with $n$ no bigger than $.86/\epsilon$. This is not much comfort if the desired accuracy is $10^{-12}$, since then $n \simeq 10^{12}$, a daunting number of computations for even the fastest machines.
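The slow $1/n$ decay can be observed numerically. The following is a minimal Python sketch, not part of the original text; the helper name `left_endpoint_sum` is hypothetical, and the reference value is taken from the standard identity $\int_0^1 e^{-x^2}\,dx = (\sqrt{\pi}/2)\,\mathrm{erf}(1)$.

```python
import math

def left_endpoint_sum(f, a, b, n):
    """Left endpoint Riemann sum of f on [a, b] with n equal subintervals."""
    dx = (b - a) / n
    return dx * sum(f(a + k * dx) for k in range(n))

def f(x):
    return math.exp(-x * x)

# Reference value: int_0^1 exp(-x^2) dx = (sqrt(pi)/2) * erf(1).
exact = 0.5 * math.sqrt(math.pi) * math.erf(1.0)

# max |f'| on [0, infinity), about .86, as computed in the text.
bound_constant = math.sqrt(2.0) * math.exp(-0.5)

errors = {}
for n in (10, 100, 1000, 10000):
    errors[n] = abs(left_endpoint_sum(f, 0.0, 1.0, n) - exact)
    # Columns: n, observed error, theoretical bound .86/n
    print(n, errors[n], bound_constant / n)
```

In this experiment the observed error should shrink by roughly a factor of 10 each time $n$ grows by a factor of 10, in line with the bound.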

To see that left and right endpoint Riemann sums are typically poor approximations of an integral, it is convenient to consider monotonic functions. Notice that for decreasing functions the left and right endpoint Riemann sums are the upper and lower sums for a partition respectively. The situation is reversed for increasing functions.

The next result will show us a lower bound which is not much different from the upper bound.

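Theorem 8.4.1. Suppose $f : [a, b] \to \mathbb{R}$ is monotonic, and for the partition $P$

\[ x_k = a + k\,\frac{b - a}{n}, \quad \Delta x_k = \frac{b - a}{n}. \]

Then

\[ \Bigl| \sum_{k=0}^{n-1} f(x_k)\Delta x_k - \sum_{k=0}^{n-1} f(x_{k+1})\Delta x_k \Bigr| = \bigl| U(f,P) - L(f,P) \bigr| = \frac{b - a}{n}\,|f(b) - f(a)|. \]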

Proof. Suppose that $f$ is decreasing. Then

\[ U(f,P) - L(f,P) = \sum_{k=0}^{n-1} f(x_k)\Delta x_k - \sum_{k=0}^{n-1} f(x_{k+1})\Delta x_k = \frac{b - a}{n} \sum_{k=0}^{n-1} [f(x_k) - f(x_{k+1})] = \frac{b - a}{n}[f(a) - f(b)], \]

since the sum telescopes. The case of increasing $f$ is similar.
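The telescoping identity is easy to check numerically. The sketch below is an illustration only; the helper `endpoint_sums` and the choice of the decreasing function $e^{-x^2}$ on $[0, 1]$ are assumptions made for the example.

```python
import math

def endpoint_sums(f, a, b, n):
    """Return (left, right) endpoint Riemann sums on n equal subintervals."""
    dx = (b - a) / n
    xs = [a + k * dx for k in range(n + 1)]
    left = dx * sum(f(x) for x in xs[:-1])    # uses x_0, ..., x_{n-1}
    right = dx * sum(f(x) for x in xs[1:])    # uses x_1, ..., x_n
    return left, right

def f(x):
    return math.exp(-x * x)   # decreasing on [0, 1]

a, b, n = 0.0, 1.0, 50
left, right = endpoint_sums(f, a, b, n)
gap = abs(left - right)
predicted = (b - a) / n * abs(f(b) - f(a))
print(gap, predicted)   # the two numbers should agree up to rounding
```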

If $f$ is monotonic, as our example is, this result promises that at least one of the left or right endpoint Riemann sums is not that close to the integral, since either

\[ \Bigl| \int_a^b f(x)\,dx - \sum_{k=0}^{n-1} f(x_k)\Delta x_k \Bigr| \ge \frac{1}{2}\,\frac{b - a}{n}\,|f(b) - f(a)|, \]

or

\[ \Bigl| \int_a^b f(x)\,dx - \sum_{k=0}^{n-1} f(x_{k+1})\Delta x_k \Bigr| \ge \frac{1}{2}\,\frac{b - a}{n}\,|f(b) - f(a)|. \]

In fact neither Riemann sum is an efficient method for calculating this integral (see problem 21).

8.4.2 More sophisticated integration procedures

Since the use of left and right endpoint Riemann sums is generally inefficient for numerical calculation of integrals, it is desirable to find alternatives. One can view Riemann sums as an attempt to approximate the integral of $f$ from $x_k$ to $x_{k+1}$ by estimating $f$ with the constant function $f(t_k)$. Two improved techniques, the midpoint rule and the trapezoidal rule, can be interpreted as procedures which replace this constant function with a linear function.

8.4.2.1 Midpoint Riemann sums

Figure 8.4 illustrates a method for approximating $\int_a^b f(x)\,dx$ based on an approximation of $f(x)$ on each of the subintervals $[x_k, x_{k+1}]$ by the tangent line at the midpoint of the interval. If $t_k$ denotes the midpoint of the interval, the approximating linear function is

\[ l_k(x) = f(t_k) + (x - t_k)f'(t_k), \quad t_k = \frac{x_k + x_{k+1}}{2}. \]

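First compute the exact value of the integral of $l_k$,

\[ \int_{x_k}^{x_{k+1}} l_k(x)\,dx = \int_{x_k}^{x_{k+1}} [f(t_k) + (x - t_k)f'(t_k)]\,dx = f(t_k)(x_{k+1} - x_k) + f'(t_k)\int_{x_k}^{x_{k+1}} (x - t_k)\,dx. \]

Since $t_k$ is the midpoint of the interval $[x_k, x_{k+1}]$, the last integral should be 0. A short calculation gives

\[ \int_{x_k}^{x_{k+1}} (x - t_k)\,dx = \frac{(x - t_k)^2}{2}\Big|_{x_k}^{x_{k+1}} = \frac{1}{2}\Bigl[\Bigl(\frac{x_{k+1} - x_k}{2}\Bigr)^2 - \Bigl(\frac{x_k - x_{k+1}}{2}\Bigr)^2\Bigr] = 0. \]

Consequently,

\[ \int_{x_k}^{x_{k+1}} l_k(x)\,dx = f(t_k)\Delta x_k. \]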

If $\Delta x_k = (b - a)/n$, then adding the contributions from the various subintervals gives

\[ \int_a^b f(x)\,dx \simeq \sum_{k=0}^{n-1} \int_{x_k}^{x_{k+1}} l_k(x)\,dx = \frac{b - a}{n} \sum_{k=0}^{n-1} f\Bigl(\frac{x_k + x_{k+1}}{2}\Bigr). \tag{8.5} \]

Notice that this is simply a Riemann sum, with $t_k$ taken to be the midpoint of each subinterval. This procedure for numerical evaluation of an integral is called the midpoint rule.

[Figure 8.4: Midpoint rule]
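Formula (8.5) translates directly into a few lines of code. This is a minimal Python sketch under the assumption of equal length subintervals; the function name `midpoint_rule` is chosen here for illustration.

```python
import math

def midpoint_rule(f, a, b, n):
    """Midpoint rule (8.5): evaluate f at the midpoint of each of n equal subintervals."""
    dx = (b - a) / n
    return dx * sum(f(a + (k + 0.5) * dx) for k in range(n))

# Example: the integral of exp(-x^2) over [0, 1] considered earlier.
approx = midpoint_rule(lambda x: math.exp(-x * x), 0.0, 1.0, 100)
exact = 0.5 * math.sqrt(math.pi) * math.erf(1.0)
print(approx, exact)
```

With only 100 subintervals the two printed values should already agree to several decimal places, in contrast with the endpoint sums.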

To estimate the error made when using the midpoint rule on the subinterval $[x_k, x_{k+1}]$, use Taylor's Theorem in the form

\[ f(x) = f(t_k) + f'(t_k)(x - t_k) + f''(\xi_x)\frac{(x - t_k)^2}{2} = l_k(x) + f''(\xi_x)\frac{(x - t_k)^2}{2}, \]

where $\xi_x$ is some point in the subinterval $[x_k, x_{k+1}]$. Here it is assumed that $f$ has a continuous second derivative on some open interval containing $[a, b]$. Since $t_k - x_k = x_{k+1} - t_k = \Delta x_k/2$, and

\[ \int_{x_k}^{x_{k+1}} \frac{(x - t_k)^2}{2}\,dx = \frac{(x - t_k)^3}{6}\Big|_{x_k}^{x_{k+1}} = 2\,\frac{(\Delta x_k)^3}{48}, \]

the error in integration on the subinterval satisfies the inequality

\[ \begin{aligned} \Bigl| \int_{x_k}^{x_{k+1}} [f(x) - l_k(x)]\,dx \Bigr| &= \Bigl| \int_{x_k}^{x_{k+1}} f''(\xi_x)\frac{(x - t_k)^2}{2}\,dx \Bigr| \\ &\le \max_{\xi\in[a,b]} |f''(\xi)|\,\frac{(\Delta x_k)^3}{24} = \max_{\xi\in[a,b]} |f''(\xi)|\,\frac{(b - a)^3}{24n^3}. \end{aligned} \tag{8.6} \]

The total error in estimating the integral is no more than the sum of these subinterval errors. The result is presented as the next theorem.

Theorem 8.4.2. Suppose that $f : [a, b] \to \mathbb{R}$ has two continuous derivatives. If $x_k = a + k(b - a)/n$, then

\[ \Bigl| \int_a^b f(x)\,dx - \frac{b - a}{n} \sum_{k=0}^{n-1} f\Bigl(\frac{x_k + x_{k+1}}{2}\Bigr) \Bigr| \le \max_{\xi\in[a,b]} |f''(\xi)|\,\frac{(b - a)^3}{24n^2}. \]

The $O(n^{-2})$ error of the midpoint rule is dramatically better than left or right endpoint Riemann sums, whose errors are $O(n^{-1})$. It is striking that such an improvement can be achieved simply by changing the point where $f$ is evaluated from the endpoints of the subintervals to the midpoints.
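The bound of Theorem 8.4.2 can be compared with observed errors. The sketch below is illustrative only: it reuses the example $e^{-x^2}$ on $[0, 1]$, and it assumes the value $\max|f''| = 2$ on that interval, which follows from $f''(x) = (4x^2 - 2)e^{-x^2}$ being increasing there.

```python
import math

def midpoint_rule(f, a, b, n):
    """Midpoint rule on n equal subintervals of [a, b]."""
    dx = (b - a) / n
    return dx * sum(f(a + (k + 0.5) * dx) for k in range(n))

def f(x):
    return math.exp(-x * x)

a, b = 0.0, 1.0
exact = 0.5 * math.sqrt(math.pi) * math.erf(1.0)
max_f2 = 2.0   # max |f''| on [0, 1] for this f, attained at x = 0

errors = {}
bounds = {}
for n in (10, 20, 40, 80):
    errors[n] = abs(midpoint_rule(f, a, b, n) - exact)
    bounds[n] = max_f2 * (b - a) ** 3 / (24 * n * n)
    # Columns: n, observed error, bound from Theorem 8.4.2
    print(n, errors[n], bounds[n])
```

Doubling $n$ should cut the observed error by a factor close to 4, which is the signature of an $O(n^{-2})$ method.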

8.4.2.2 The trapezoidal rule

Figure 8.5 illustrates a method for approximating $\int_a^b f(x)\,dx$ with a trapezoidal region on each of the subintervals $[x_k, x_{k+1}]$. On this subinterval the original function $f(x)$ is approximated by the linear function

\[ L_k(x) = f(x_k) + (x - x_k)\,\frac{f(x_{k+1}) - f(x_k)}{x_{k+1} - x_k}. \]

Note that $L_k$ is simply the line joining $(x_k, f(x_k))$ and $(x_{k+1}, f(x_{k+1}))$. The next step is to compute

\[ \int_{x_k}^{x_{k+1}} L_k(x)\,dx = f(x_k)(x_{k+1} - x_k) + \frac{f(x_{k+1}) - f(x_k)}{x_{k+1} - x_k}\,\frac{(x_{k+1} - x_k)^2}{2} = \frac{1}{2}(x_{k+1} - x_k)[f(x_k) + f(x_{k+1})]. \]

If $x_{k+1} - x_k = (b - a)/n$, then adding the contributions from the various subintervals gives

\[ \begin{aligned} \int_a^b f(x)\,dx &\simeq \sum_{k=0}^{n-1} \int_{x_k}^{x_{k+1}} L_k(x)\,dx \\ &= \frac{b - a}{2n} \sum_{k=0}^{n-1} [f(x_k) + f(x_{k+1})] = \frac{b - a}{n}\Bigl[\frac{f(a)}{2} + \sum_{k=1}^{n-1} f(x_k) + \frac{f(b)}{2}\Bigr]. \end{aligned} \tag{8.7} \]

[Figure 8.5: Trapezoidal rule]

Based on the geometry, (8.7) is called the trapezoidal rule for numerical evaluation of the integral. Observe that the trapezoidal rule is the same as the left or right endpoint Riemann sums, except for a very slight modification at the endpoints.
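The remark about the endpoint modification can be made concrete. In the following Python sketch (function names are hypothetical, chosen for this illustration) the trapezoidal rule (8.7) is implemented directly and then compared with the left endpoint sum, from which it differs only by the correction $\frac{b-a}{2n}[f(b) - f(a)]$.

```python
import math

def left_endpoint_sum(f, a, b, n):
    """Left endpoint Riemann sum on n equal subintervals."""
    dx = (b - a) / n
    return dx * sum(f(a + k * dx) for k in range(n))

def trapezoidal_rule(f, a, b, n):
    """Trapezoidal rule (8.7): endpoints get weight 1/2, interior points weight 1."""
    dx = (b - a) / n
    interior = sum(f(a + k * dx) for k in range(1, n))
    return dx * (0.5 * f(a) + interior + 0.5 * f(b))

def f(x):
    return math.exp(-x * x)

a, b, n = 0.0, 1.0, 100
dx = (b - a) / n
left = left_endpoint_sum(f, a, b, n)
trap = trapezoidal_rule(f, a, b, n)
exact = 0.5 * math.sqrt(math.pi) * math.erf(1.0)

# The two rules differ only by an endpoint correction.
correction = 0.5 * dx * (f(b) - f(a))
print(trap, left + correction)
print(abs(left - exact), abs(trap - exact))
```

The second printed line shows how much that small correction improves the accuracy for this example.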

Obtaining bounds for the error of the trapezoidal rule requires a bit of work. The main step is presented in the next lemma.

Lemma 8.4.3. Suppose that $f(x)$ has two continuous derivatives on the interval $[a, b]$. Define a linear approximation

\[ L(x) = f(a) + (x - a)\,\frac{f(b) - f(a)}{b - a}, \]

and an approximation error

\[ e(x) = f(x) - L(x). \]

For any point $c \in [a, b]$, there is a point $\xi \in [a, b]$ such that

\[ e(c) = \frac{f''(\xi)}{2}(c - a)(c - b). \tag{8.8} \]

Proof. If $c$ is equal to either $a$ or $b$, both sides of (8.8) have the desired values of 0. So assume that $c$ is not equal to $a$ or $b$.

Define the auxiliary function

\[ g(x) = e(x) - e(c)\,\frac{(x - a)(x - b)}{(c - a)(c - b)}. \]

This function satisfies $g(a) = g(b) = g(c) = 0$. By Rolle's Theorem there are points $x_1 \in (a, c)$ and $x_2 \in (c, b)$ such that $g'(x_1) = g'(x_2) = 0$. Since $x_1$ and $x_2$ are distinct, Rolle's Theorem may be applied again, producing a point $\xi \in (a, b)$ with $g''(\xi) = 0$.

Since $L(x)$ is a polynomial of degree at most 1, $L''(\xi) = 0$, and

\[ e''(\xi) = f''(\xi). \]

The function $g(x)$ is the difference of $e(x)$ and a polynomial of degree 2. Differentiation thus gives

\[ g''(\xi) = 0 = f''(\xi) - e(c)\,\frac{2}{(c - a)(c - b)}. \]

The desired result (8.8) is obtained by solving for $e(c)$.
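A consequence of (8.8) is the pointwise bound $|e(c)| \le \frac{1}{2}\max|f''|\,|(c - a)(c - b)|$. The Python sketch below checks this bound on a grid for the sample function $\sin x$ on $[0, 1]$; the function choice, the grid, and the helper name `chord_error` are assumptions made for illustration, not part of the lemma.

```python
import math

def chord_error(f, a, b, c):
    """e(c) = f(c) - L(c), where L is the line through (a, f(a)) and (b, f(b))."""
    line = f(a) + (c - a) * (f(b) - f(a)) / (b - a)
    return f(c) - line

a, b = 0.0, 1.0
max_f2 = math.sin(1.0)   # max of |f''(x)| = |sin(x)| on [0, 1]

# Largest observed ratio |e(c)| / bound over interior grid points.
worst_ratio = 0.0
for j in range(1, 100):
    c = a + j * (b - a) / 100
    bound = 0.5 * max_f2 * abs((c - a) * (c - b))
    worst_ratio = max(worst_ratio, abs(chord_error(math.sin, a, b, c)) / bound)

print(worst_ratio)   # should not exceed 1 if the bound holds
```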

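The error for the trapezoidal rule may now be evaluated by applying the lemma when $a = x_k$, $b = x_{k+1}$, and $c = x$. Integration on the subinterval gives

\[ \begin{aligned} \Bigl| \int_{x_k}^{x_{k+1}} [f(x) - L_k(x)]\,dx \Bigr| &\le \int_{x_k}^{x_{k+1}} |e(x)|\,dx \\ &= \int_{x_k}^{x_{k+1}} \Bigl| \frac{f''(\xi_x)}{2}(x - x_k)(x - x_{k+1}) \Bigr|\,dx \\ &\le \frac{1}{2} \max_{t\in[a,b]} |f''(t)| \int_{x_k}^{x_{k+1}} |(x - x_k)(x - x_{k+1})|\,dx. \end{aligned} \]

Since the function $(x - x_k)(x - x_{k+1})$ does not change sign on the interval $[x_k, x_{k+1}]$,

\[ \int_{x_k}^{x_{k+1}} |(x - x_k)(x - x_{k+1})|\,dx = \Bigl| \int_{x_k}^{x_{k+1}} (x - x_k)(x - x_{k+1})\,dx \Bigr| = \frac{(\Delta x_k)^3}{6} \]

and

\[ \Bigl| \int_{x_k}^{x_{k+1}} [f(x) - L_k(x)]\,dx \Bigr| \le \max_{t\in[a,b]} |f''(t)|\,\frac{(\Delta x_k)^3}{12}. \tag{8.9} \]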

Summing these subinterval errors provides an estimate of the total error for the trapezoidal rule.

Theorem 8.4.4. Suppose that $f : (c, d) \to \mathbb{R}$ has a continuous second derivative, and that $[a, b] \subset (c, d)$. If $x_k = a + k(b - a)/n$, then

\[ \Bigl| \int_a^b f(x)\,dx - \frac{b - a}{n}\Bigl[\frac{f(a)}{2} + \sum_{k=1}^{n-1} f(x_k) + \frac{f(b)}{2}\Bigr] \Bigr| \le \max_{t\in[a,b]} |f''(t)|\,\frac{(b - a)^3}{12n^2}. \]
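As a rough check of Theorem 8.4.4, the following Python sketch compares observed trapezoidal rule errors with the bound. The test case $\int_0^\pi \sin x\,dx = 2$, for which $\max|f''| = 1$, is an assumption chosen here because both the exact value and the derivative bound are known.

```python
import math

def trapezoidal_rule(f, a, b, n):
    """Trapezoidal rule on n equal subintervals of [a, b]."""
    dx = (b - a) / n
    interior = sum(f(a + k * dx) for k in range(1, n))
    return dx * (0.5 * f(a) + interior + 0.5 * f(b))

a, b = 0.0, math.pi
exact = 2.0      # integral of sin over [0, pi]
max_f2 = 1.0     # max |f''| = max |sin| on [0, pi]

rows = []
for n in (4, 8, 16, 32):
    err = abs(trapezoidal_rule(math.sin, a, b, n) - exact)
    bound = max_f2 * (b - a) ** 3 / (12 * n * n)
    rows.append((n, err, bound))
    print(n, err, bound)
```

Each observed error should sit below its bound, and successive errors should shrink by a factor near 4 as $n$ doubles.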


8.5 Problems

1. Calculate $U(f,P) - L(f,P)$ if

\[ x_k = a + k\,\frac{b - a}{n}, \quad k = 0, \dots, n \]

and $f(x) = Cx$ for some constant $C$. How big should $n$ be if you want

\[ U(f,P) - L(f,P) < 10^{-6}? \]

2. Assume that $g : [a, b] \to \mathbb{R}$ is integrable. For $c \in \mathbb{R}$, show that $f(x) = g(x - c)$ is integrable on $[a + c, b + c]$ and

\[ \int_a^b g(x)\,dx = \int_{a+c}^{b+c} f(x)\,dx. \]

(Hint: draw a picture.)

3. Suppose that $f : [a, b] \to \mathbb{R}$ is continuous, and $f(x) \ge 0$ for all $x \in [a, b]$. Show that if

\[ \int_a^b f(x)\,dx = 0, \]

then $f(x) = 0$ for all $x \in [a, b]$. Is the same conclusion true if $f$ is merely integrable?

4. Show that for any $n \ge 1$ there is a partition of $[a, b]$ with $n + 1$ points $x_0, \dots, x_n$ such that

\[ \int_{x_j}^{x_{j+1}} t^2\,dt = \int_{x_k}^{x_{k+1}} t^2\,dt \]

for any $j$ and $k$ between 0 and $n - 1$. You may use Calculus to evaluate the integrals.

5. Show that it is possible to construct a sequence of partitions $P_n$ such that $P_n$ has $n$ points, $P_n \subset P_{n+1}$, and

\[ \min_{0\le k\le n-1} (x_{k+1} - x_k) \ge \frac{1}{2} \max_{0\le k\le n-1} (x_{k+1} - x_k), \quad P_n = \{x_0, \dots, x_n\}. \]

6. Prove Lemma 8.2.2.

7. Suppose that $f : [a, b] \to \mathbb{R}$ is continuous and

\[ \int_c^d f(x)\,dx = 0 \]

for all $a \le c < d \le b$. Show that $f(x) = 0$ for all $x \in [a, b]$.

8. Suppose that $f : [a, b] \to \mathbb{R}$ is integrable, that $g(x)$ is real valued, and that $g(x) = f(x)$ except at finitely many points $t_1, \dots, t_m \in [a, b]$. Show that $g$ is integrable.

9. Show that if $f$ and $g$ are integrable, and $f(x) \le g(x)$ for all $x \in [a, b]$, then

\[ \int_a^b f(x)\,dx \le \int_a^b g(x)\,dx. \]

10. Suppose that $f : [a, b] \to \mathbb{R}$ is integrable, and that $f(x) = 0$ for all rational numbers $x$. Show that

\[ \int_a^b f(x)\,dx = 0. \]

11. Prove that if $w(x) \ge 0$ is integrable, and $f(x)$ is continuous, then for some $\xi \in [a, b]$

\[ \int_a^b f(x)w(x)\,dx = f(\xi)\int_a^b w(x)\,dx. \]

Hint: Start by showing that for some $x_1, x_2 \in [a, b]$

\[ f(x_1)\int_a^b w(x)\,dx \le \int_a^b f(x)w(x)\,dx \le f(x_2)\int_a^b w(x)\,dx. \]

Now use the Intermediate Value Theorem.

12. Fill in the details of the proof for Theorem 8.3.4 when $h < 0$.

13. Is $f : [0, 1] \to \mathbb{R}$ integrable if

\[ f(x) = \begin{cases} \sin(1/x), & 0 < x \le 1, \\ 0, & x = 0 \end{cases} \ ? \]

14. Assume that $f : [a, b] \to \mathbb{R}$ is bounded, and that $f$ is continuous on $(a, b)$. Show that $f$ is integrable on $[a, b]$.

15. Recall that a function $f : [a, b] \to \mathbb{R}$ is increasing if $f(x) \le f(y)$ whenever $x \le y$. Show that every increasing (or decreasing) function is integrable.

16. Complete the proof of Theorem 8.3.2 for functions $f$ and $g$ which are not necessarily positive. (Hint: Find $c_f$ and $c_g$ such that $f + c_f \ge 0$ and $g + c_g \ge 0$. Now compute.)

17. Find functions $f : [0, 1] \to \mathbb{R}$ and $g : [0, 1] \to \mathbb{R}$ which are not integrable, but whose product $fg$ is integrable.

18. Show that Theorem 8.2.4 is valid if $f$ is merely assumed to be integrable by using the following outline. Given $\epsilon > 0$, start with a partition $P_1 = \{x_0, \dots, x_n\}$ such that $U(f,P_1) - L(f,P_1) < \epsilon$. Now consider a second partition $P_2 = \{t_0, \dots, t_m\}$, with $\mu(P_2)$ 'small'. Find subcollections $\{t_l \mid l = I(k), \dots, J(k)\}$ such that

\[ x_k \le t_{I(k)} < \cdots < t_{J(k)} \le x_{k+1}, \]

and

\[ |t_{I(k)} - x_k| \le \mu(P_2), \quad |x_{k+1} - t_{J(k)}| \le \mu(P_2). \]

Estimate the difference $U(f,P_2) - L(f,P_2)$ by using that $P_2$ is almost a refinement of $P_1$.

19. Rephrase Theorem 8.3.4, and give a proof, if the function $f(x)$ is merely assumed to be integrable on $[a, b]$, and continuous at the point $x$.

20. Show that the conclusions of Theorem 8.3.5 are still valid if $F'(x) = f(x)$ for $x \in (a, b)$, and if $f(x)$, $F(x)$ are continuous on $[a, b]$.

21. Suppose that $f'(x)$ is continuous and nonnegative for $x \in [a, b]$. Use the following approach to estimate the error made when using left endpoint Riemann sums to approximate the integral of $f$.

(a) Show that

\[ \int_a^b f(x)\,dx - \sum_{k=0}^{n-1} f(x_k)\Delta x_k = \sum_{k=0}^{n-1} \int_{x_k}^{x_{k+1}} [f(x) - f(x_k)]\,dx. \]

(b) If $C_k = \min_{t\in[x_k,x_{k+1}]} f'(t)$, show that

\[ \int_{x_k}^{x_{k+1}} [f(x) - f(x_k)]\,dx \ge C_k(\Delta x_k)^2/2. \]

(c) If $f'(x) \ge C$ for $x \in [c, d] \subset [a, b]$, and if $\Delta x_k = (b - a)/n$, show that

\[ \int_a^b f(x)\,dx - \sum_{k=0}^{n-1} f(x_k)\Delta x_k \ge C\,\frac{(d - c)(b - a)}{2n}. \]

(d) What is the analogous result if $f' \le 0$? Use this to obtain a lower bound on the error of approximating $\int_0^1 e^{-x^2}\,dx$ using left endpoint Riemann sums.

22. Suppose $f : [a, b] \to \mathbb{R}$ is convex and differentiable. Let $x_k = a + (b - a)k/n$, and let $M_n$ and $T_n$ respectively denote the midpoint and trapezoidal rule approximations to

\[ I = \int_a^b f(x)\,dx. \]

Show that $M_n \le I \le T_n$.

23. Use Riemann sums to establish the following limits.

(a) Show that

\[ \lim_{n\to\infty} \sum_{k=0}^{n-1} \frac{1}{n}\Bigl(\frac{k}{n}\Bigr)^m = \frac{1}{m + 1}, \quad m = 0, 1, 2, \dots. \]

(b) Show that

\[ \lim_{n\to\infty} \sum_{k=0}^{n-1} \frac{1}{n}\,\frac{1}{(1 + k/n)^m} = \frac{1 - 2^{1-m}}{m - 1}, \quad m = 2, 3, \dots. \]

24. Use Riemann sums to establish the following limits.

(a) Show that

\[ \lim_{n\to\infty} \sum_{k=0}^{n-1} \frac{n}{n^2 + k^2} = \pi/4. \]

(b) Show that

\[ \lim_{n\to\infty} \sum_{k=0}^{n-1} \frac{1}{n}\sin(k\pi/n) = 2/\pi. \]

25. Suppose that $f : [a, b] \to \mathbb{R}$ is integrable. Let $\epsilon > 0$, and define a set $D_\epsilon$ to be the set of points $t_0 \in [a, b]$ such that for every $\delta > 0$ there is a point $t_1 \in [a, b]$ such that $|t_1 - t_0| < \delta$, but $|f(t_1) - f(t_0)| \ge \epsilon$. Let $P = \{x_0, \dots, x_n\}$ be a partition of $[a, b]$, and let $B$ be the set of indices $k$ such that $(x_k, x_{k+1})$ contains a point of $D_\epsilon$. If $P_B = \bigcup_{k\in B} [x_k, x_{k+1}]$, show that for every $\sigma > 0$ there is a partition $P$ such that $\mathrm{length}(P_B) < \sigma$. Draw a conclusion about the set of points where an integrable function is not continuous.

Chapter 9

More Integrals

9.1 Introduction

The basic theory of integration presented in the last chapter provides a solid foundation for the analysis of integrals. Still, there are many problems, both practical and theoretical, whose resolution requires modifications or extensions of these ideas. This chapter addresses some of the more routine extensions: handling unbounded functions, unbounded intervals, and integrals which carry extra parameters.

Riemann's theory of integration works well for bounded functions and bounded intervals, but many integrals arising in practice involve unbounded intervals or unbounded functions. Simple examples include

\[ \int_{-\infty}^{\infty} \frac{1}{1 + x^2}\,dx, \tag{9.1} \]

and

\[ \int_0^1 \frac{1}{\sqrt{1 - x^2}}\,dx. \tag{9.2} \]

Understanding when these integrals make sense involves the use of limits in a way that more or less parallels the study of infinite series.

The second topic for this chapter is the study of functions defined through integration. An example is the Laplace transform of a function $f(x)$,

\[ F(s) = \int_0^{\infty} f(x)e^{-sx}\,dx, \]

which converts certain problems of calculus or differential equations to problems of algebra. When functions are defined through integration, one would like to know when the function is differentiable, and how to calculate the derivatives.

9.2 Improper integrals

As illustrated by the examples (9.1) and (9.2), one often encounters integrals where the interval of integration is unbounded, or the integrand itself is unbounded. In an example like

\[ \int_0^{\infty} \frac{e^{-x}}{\sqrt{x}}\,dx, \]

both the function and the interval of integration are unbounded. Integrals exhibiting such difficulties are often termed improper.

[Figure 9.1: Graph of $\tan^{-1}(x)$]

It is useful to draw analogies between the study of improper integrals and the study of infinite series. It is clear that some improper integrals such as

\[ \int_0^{\infty} 1\,dx \]

represent an infinite area, and so will not have a real number value. Examples such as

\[ \int_{-\infty}^{\infty} x\,dx \]

make even less sense. On the other hand, since an antiderivative of $1/(1 + x^2)$ is $\tan^{-1}(x)$, and an antiderivative of $1/\sqrt{1 - x^2}$ is $\sin^{-1}(x)$, it is reasonable to expect that

\[ \int_0^1 \frac{1}{\sqrt{1 - x^2}}\,dx = \sin^{-1}(1) - \sin^{-1}(0) = \pi/2, \tag{9.3} \]

and that (see Figure 9.1)

\[ \begin{aligned} \int_{-\infty}^{\infty} \frac{1}{1 + x^2}\,dx &= \lim_{N\to\infty} \int_{-N}^{N} \frac{1}{1 + x^2}\,dx \\ &= \lim_{N\to\infty} [\tan^{-1}(N) - \tan^{-1}(-N)] = \pi/2 - (-\pi/2) = \pi. \end{aligned} \tag{9.4} \]

As in the case of infinite series, where sums of positive terms provide the foundation of the theory, integration of positive functions has the central role for the study of improper integrals. After extending the theory of integration to handle improper integrals of positive functions, the ideas are extended to functions whose absolute values are well behaved. Finally, more general cases of conditionally convergent integrals are considered.

To fix some notation and standard assumptions, suppose $(\alpha, \beta) \subset \mathbb{R}$ is an open interval. The cases $\alpha = -\infty$ and $\beta = \infty$ are allowed. Riemann integration will be the basis for considering improper integrals. Thus for each function $f : (\alpha, \beta) \to \mathbb{R}$ for which

\[ \int_\alpha^\beta f(x)\,dx \]

is considered, it is assumed that $f$ is Riemann integrable on each compact interval $[a, b] \subset (\alpha, \beta)$. Notice that the function $f$ is not assumed to be bounded on the open interval $(\alpha, \beta)$, just on compact subintervals $[a, b] \subset (\alpha, \beta)$. As an example, the function $f(x) = x$ on the interval $(-\infty, \infty)$ falls into this class.

9.2.1 Integration of positive functions

In addition to the standard assumptions above, suppose that $f(x) \ge 0$. Say that the integral $\int_\alpha^\beta f(x)\,dx$ converges to the number $I$ if

\[ \sup_{[a,b]\subset(\alpha,\beta)} \int_a^b f(x)\,dx = I < \infty. \]

Otherwise the integral diverges. If the integral converges, the number $I$ is taken to be the value of the integral. Of course if $f$ is Riemann integrable on $[\alpha, \beta]$, then $\int_\alpha^\beta f(x)\,dx$ converges, and the value agrees with that of the Riemann integral.

An important role in the study of improper integrals is played by the least upper bound axiom. Recall that this axiom says that any set of real numbers with an upper bound has a least upper bound, or supremum. This axiom plays a role analogous to that of the Bounded Monotone Sequence Theorem in our study of positive series, in that we are able to establish the existence of limits without finding them explicitly.

The first lemma restates the definition of convergence of an integral in a more convenient form.

Lemma 9.2.1. Suppose that $f(x) \ge 0$ for $x \in (\alpha, \beta)$, and $f$ is integrable for every $[a, b] \subset (\alpha, \beta)$. Then the integral

\[ \int_\alpha^\beta f(x)\,dx \]

converges to the value $I$ if and only if the following two conditions hold:

i) for every $[a, b] \subset (\alpha, \beta)$,

\[ \int_a^b f(x)\,dx \le I, \]

ii) for every $\epsilon > 0$ there is an interval $[a_1, b_1] \subset (\alpha, \beta)$ such that

\[ I - \int_{a_1}^{b_1} f(x)\,dx < \epsilon. \]

As with the comparison test for infinite series of positive terms, the convergence of the improper integral of a positive function can be established by showing that the integral of a larger function converges.

Theorem 9.2.2. (Comparison test) Assume that $0 \le f(x) \le g(x)$ for all $x \in (\alpha, \beta)$. If $\int_\alpha^\beta g(x)\,dx$ converges to $I_g \in \mathbb{R}$, then $\int_\alpha^\beta f(x)\,dx$ converges to a number $I_f$, and $I_f \le I_g$. If $\int_\alpha^\beta f(x)\,dx$ diverges, so does $\int_\alpha^\beta g(x)\,dx$.

Proof. For every interval $[a, b] \subset (\alpha, \beta)$ the inequality

\[ \int_a^b f(x)\,dx \le \int_a^b g(x)\,dx \]

holds for the Riemann integrals. Thus

\[ \sup_{[a,b]\subset(\alpha,\beta)} \int_a^b f(x)\,dx \le \sup_{[a,b]\subset(\alpha,\beta)} \int_a^b g(x)\,dx = I_g. \]

Since the set of values $\int_a^b f(x)\,dx$ for $[a, b] \subset (\alpha, \beta)$ is bounded above, it has a supremum $I_f \le I_g$, which by definition is the integral $\int_\alpha^\beta f(x)\,dx$.

In the other direction, if $\int_\alpha^\beta f(x)\,dx$ diverges, then for any $M > 0$ there is an interval $[a, b]$ such that

\[ M \le \int_a^b f(x)\,dx \le \int_a^b g(x)\,dx. \]

It follows that

\[ \sup_{[a,b]\subset(\alpha,\beta)} \int_a^b g(x)\,dx = \infty. \]

In calculus, improper integrals are analyzed with limit computations; the next result connects limits with our definition.

Theorem 9.2.3. Suppose that $f(x)$ is positive and integrable on every subinterval $[a, b] \subset (\alpha, \beta)$.

If the integral $\int_\alpha^\beta f(x)\,dx$ converges, then for any point $c \in (\alpha, \beta)$ there are real numbers $I_1$ and $I_2$ such that

\[ \lim_{a\to\alpha^+} \int_a^c f(x)\,dx = I_1, \quad \text{and} \quad \lim_{b\to\beta^-} \int_c^b f(x)\,dx = I_2. \]

In the opposite direction, if there is any point $c \in (\alpha, \beta)$, and real numbers $I_1$ and $I_2$ such that

\[ \lim_{a\to\alpha^+} \int_a^c f(x)\,dx = I_1, \quad \text{and} \quad \lim_{b\to\beta^-} \int_c^b f(x)\,dx = I_2, \]

then the integral $\int_\alpha^\beta f(x)\,dx$ converges to $I_1 + I_2$.

Proof. Suppose that the integral $\int_\alpha^\beta f(x)\,dx$ converges to $I$. Take any point $c \in (\alpha, \beta)$. Suppose that $a \le c \le b$. Since $f(x) \ge 0$ for all $x \in (\alpha, \beta)$, the number $g(b) = \int_c^b f(x)\,dx$ increases as $b$ increases, and $h(a) = \int_a^c f(x)\,dx$ increases as $a$ decreases. In addition

\[ \int_a^c f(x)\,dx \le I, \quad \text{and} \quad \int_c^b f(x)\,dx \le I. \]

Since the numbers $\int_c^b f(x)\,dx$ are bounded above, they have a least upper bound $I_2$. For $\epsilon > 0$ there is a number $d$ with $c \le d < \beta$ such that

\[ I_2 - \epsilon < g(d) = \int_c^d f(x)\,dx \le I_2. \]

Since $g(b)$ increases as $b$ increases, we conclude that

\[ \lim_{b\to\beta^-} \int_c^b f(x)\,dx = I_2. \]

A similar argument applies to the integrals $\int_a^c f(x)\,dx$.

Now suppose that there is a point $c \in (\alpha, \beta)$, and real numbers $I_1$ and $I_2$ such that

\[ \lim_{a\to\alpha^+} \int_a^c f(x)\,dx = I_1, \quad \text{and} \quad \lim_{b\to\beta^-} \int_c^b f(x)\,dx = I_2. \]

Let $[a, b] \subset (\alpha, \beta)$. Since $\int_a^b f(x)\,dx$ increases as $b$ increases, or as $a$ decreases, there is no loss of generality in assuming that $a \le c \le b$, and we have

\[ \int_a^b f(x)\,dx = \int_a^c f(x)\,dx + \int_c^b f(x)\,dx \le I_1 + I_2. \tag{9.5} \]

Thus the integral $\int_\alpha^\beta f(x)\,dx$ converges.

Finally, for any $\epsilon > 0$ there are points $a_1 \le c$ and $b_1 \ge c$ such that

\[ (I_1 + I_2) - \int_{a_1}^{b_1} f(x)\,dx = \Bigl(I_1 - \int_{a_1}^{c} f(x)\,dx\Bigr) + \Bigl(I_2 - \int_c^{b_1} f(x)\,dx\Bigr) < \epsilon. \]

Together with (9.5), this shows that

\[ \sup_{[a,b]\subset(\alpha,\beta)} \int_a^b f(x)\,dx = I_1 + I_2. \]

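To illustrate the last theorem, consider the integrals of $x^{-p}$, which are convenient for comparisons. First look at

\[ \int_1^{\infty} \frac{1}{x^p}\,dx, \quad p \in \mathbb{R}. \]

For $p \ne 1$,

\[ \int_1^N \frac{1}{x^p}\,dx = \frac{x^{1-p}}{1 - p}\Big|_1^N = \frac{N^{1-p}}{1 - p} - \frac{1}{1 - p}, \]

while for $p = 1$,

\[ \int_1^N \frac{1}{x}\,dx = \log(N). \]

Thus the integral

\[ \int_1^{\infty} \frac{1}{x^p}\,dx \]

diverges if $p \le 1$ and converges if $p > 1$.

Next, examine

\[ \int_0^1 \frac{1}{x^p}\,dx, \quad p \in \mathbb{R}. \]

For $p \ne 1$,

\[ \int_\epsilon^1 \frac{1}{x^p}\,dx = \frac{x^{1-p}}{1 - p}\Big|_\epsilon^1 = \frac{1}{1 - p} - \frac{\epsilon^{1-p}}{1 - p}, \]

which has no limit as $\epsilon \to 0^+$ if $p > 1$. These integrals converge if $p < 1$, and diverge if $p = 1$.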
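The two families of integrals can be tabulated from the antiderivative formulas above. The Python sketch below is an illustration; the function names `tail_integral` and `head_integral` are hypothetical, and the sample exponents are chosen to show one value on each side of $p = 1$ together with $p = 1$ itself.

```python
import math

def tail_integral(p, N):
    """int_1^N x^(-p) dx, from the antiderivative formulas in the text."""
    if p == 1.0:
        return math.log(N)
    return (N ** (1.0 - p) - 1.0) / (1.0 - p)

def head_integral(p, eps):
    """int_eps^1 x^(-p) dx, from the antiderivative formulas in the text."""
    if p == 1.0:
        return -math.log(eps)
    return (1.0 - eps ** (1.0 - p)) / (1.0 - p)

for p in (0.5, 1.0, 2.0):
    tails = [tail_integral(p, 10.0 ** j) for j in (2, 4, 8)]
    heads = [head_integral(p, 10.0 ** (-j)) for j in (2, 4, 8)]
    print("p =", p, "tails:", tails, "heads:", heads)
```

For $p = 2$ the tail values settle near 1 while the head values blow up; for $p = 0.5$ the roles are reversed; for $p = 1$ both grow like a logarithm.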

It follows from these calculations that there are no values of $p$ for which $\int_0^{\infty} 1/x^p\,dx$ converges. To illustrate the convenience of $x^{-p}$ for comparisons, start with the inequality

\[ 0 \le \frac{1}{1 + x^2} < \frac{1}{x^2}. \]

Since $\int_1^{\infty} 1/x^2\,dx$ converges, so does $\int_1^{\infty} 1/(1 + x^2)\,dx$.
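For this particular comparison both truncated integrals have closed forms, so the inequality between them can be displayed directly. The following Python sketch is illustrative; the function names are hypothetical.

```python
import math

def integral_of_smaller(N):
    """int_1^N dx / (1 + x^2) = arctan(N) - pi/4."""
    return math.atan(N) - math.pi / 4

def integral_of_larger(N):
    """int_1^N dx / x^2 = 1 - 1/N."""
    return 1.0 - 1.0 / N

for N in (2.0, 10.0, 1000.0, 1.0e6):
    # The first value never exceeds the second, and both stay bounded.
    print(N, integral_of_smaller(N), integral_of_larger(N))
```

The first column of values increases toward $\pi/4$, which lies below the limit 1 of the second column, as the comparison test predicts.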

Mathematicians spend a considerable effort showing that various integrals converge. One of the most useful tools is the following inequality.

Theorem 9.2.4. (Cauchy-Schwarz) Suppose $f(x)$ and $g(x)$ are positive and integrable on every subinterval $[a, b] \subset (\alpha, \beta)$. If $\int_\alpha^\beta f^2(x)\,dx$ and $\int_\alpha^\beta g^2(x)\,dx$ both converge, then so does $\int_\alpha^\beta f(x)g(x)\,dx$, and

\[ \Bigl[\int_\alpha^\beta f(x)g(x)\,dx\Bigr]^2 \le \Bigl[\int_\alpha^\beta f^2(x)\,dx\Bigr]\Bigl[\int_\alpha^\beta g^2(x)\,dx\Bigr]. \tag{9.6} \]

Proof. For notational convenience, let $A$, $B$, and $C$ be the positive numbers defined by

\[ A^2 = \int_\alpha^\beta f^2(x)\,dx, \quad B^2 = \int_\alpha^\beta g^2(x)\,dx, \quad C = \int_\alpha^\beta f(x)g(x)\,dx. \]

If $B = 0$ then both sides of (9.6) are 0. Suppose that $B \ne 0$. Define the quadratic polynomial

\[ p(t) = \int_\alpha^\beta [f(x) - tg(x)]^2\,dx = A^2 - 2tC + t^2B^2. \]

The minimum, which is at least 0 since we are integrating the square of a function, occurs when

\[ t = \frac{C}{B^2}. \]

Thus

\[ 0 \le A^2 - 2\frac{C^2}{B^2} + \frac{C^2}{B^2} = A^2 - \frac{C^2}{B^2}, \]

or

\[ C^2 \le A^2B^2 \]

as desired.

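As an example illustrating the use of the Cauchy-Schwarz inequality, consider

\[ I = \int_1^{\infty} \frac{1}{x^p e^x}\,dx. \]

For $p > 1/2$ this integral may be estimated as follows.

\[ I^2 \le \int_1^{\infty} x^{-2p}\,dx \int_1^{\infty} e^{-2x}\,dx = \Bigl(\frac{1}{2p - 1}\Bigr)\Bigl(\frac{1}{2e^2}\Bigr). \]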
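The estimate can be compared with a direct numerical approximation of $I$. The Python sketch below is a rough illustration for the sample value $p = 1$; the truncation of the infinite interval at $x = 40$ and the use of a midpoint rule are assumptions of the sketch, justified informally by the rapid decay of $e^{-x}$.

```python
import math

def midpoint_rule(f, a, b, n):
    """Midpoint rule on n equal subintervals of [a, b]."""
    dx = (b - a) / n
    return dx * sum(f(a + (k + 0.5) * dx) for k in range(n))

def cauchy_schwarz_bound(p):
    """Right-hand side of the estimate: (1/(2p - 1)) * (1/(2 e^2)), valid for p > 1/2."""
    return (1.0 / (2.0 * p - 1.0)) * (1.0 / (2.0 * math.e ** 2))

p = 1.0
upper = 40.0   # truncation point; the neglected tail is smaller than exp(-40)
I_approx = midpoint_rule(lambda x: x ** (-p) * math.exp(-x), 1.0, upper, 40000)

# Compare the square of the approximate integral with the bound.
print(I_approx ** 2, cauchy_schwarz_bound(p))
```

The bound is not sharp here, but it required no information about $I$ beyond two elementary integrals.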

9.2.2 Absolutely convergent integrals

Infinite series containing both positive and negative terms behave well if the series converges absolutely. Integrals behave in a similar manner. Suppose that $f : (\alpha, \beta) \to \mathbb{R}$ is integrable on every compact subinterval $[a, b] \subset (\alpha, \beta)$. We no longer assume that $f$ is positive. Say that $\int_\alpha^\beta f(x)\,dx$ converges absolutely if $\int_\alpha^\beta |f(x)|\,dx$ converges.

Define the positive and negative parts of a real valued function $f$ as follows:

\[ f^+(x) = \begin{cases} f(x), & f(x) > 0 \\ 0, & f(x) \le 0 \end{cases}, \qquad f^-(x) = \begin{cases} f(x), & f(x) < 0 \\ 0, & f(x) \ge 0 \end{cases}. \]

Lemma 9.2.5. If $f : [a, b] \to \mathbb{R}$ is Riemann integrable, then so are $f^+$ and $f^-$. The integral $\int_\alpha^\beta f(x)\,dx$ converges absolutely if and only if the integrals

\[ \int_\alpha^\beta f^+(x)\,dx, \quad \text{and} \quad \int_\alpha^\beta [-f^-(x)]\,dx \]

converge.

Proof. Observe that

\[ f^+(x) = \frac{f(x) + |f(x)|}{2}, \quad f^-(x) = \frac{f(x) - |f(x)|}{2}. \tag{9.7} \]

The function $|f(x)|$ is integrable by Theorem 8.3.3. Since $f^+$ and $f^-$ can be written as the sum of two integrable functions, they are integrable by Theorem 8.3.1.

The remaining conclusions are straightforward consequences of (9.7) and the associated formula

\[ |f(x)| = f^+(x) - f^-(x). \]
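The formulas (9.7) are easy to exercise on sample values. In the Python sketch below the functions act on a single function value $v = f(x)$; the names `positive_part` and `negative_part` are chosen for this illustration. Note that with the convention used here the negative part is itself less than or equal to 0.

```python
def positive_part(v):
    """f^+ from (9.7), applied to the value v = f(x)."""
    return (v + abs(v)) / 2

def negative_part(v):
    """f^- from (9.7), applied to the value v = f(x); the result is <= 0."""
    return (v - abs(v)) / 2

for v in (-3.0, -0.5, 0.0, 0.5, 3.0):
    plus, minus = positive_part(v), negative_part(v)
    # Columns: v, f^+, f^-, f^+ + f^- (recovers v), f^+ - f^- (recovers |v|)
    print(v, plus, minus, plus + minus, plus - minus)
```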

If the integral $\int_\alpha^\beta f(x)\,dx$ converges absolutely, define

\[ \int_\alpha^\beta f(x)\,dx = \int_\alpha^\beta f^+(x)\,dx - \int_\alpha^\beta [-f^-(x)]\,dx. \]

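Theorem 9.2.6. Suppose that the integral $\int_\alpha^\beta f(x)\,dx$ converges absolutely. For any $c \in (\alpha, \beta)$ the limits

\[ I_1 = \lim_{a\to\alpha^+} \int_a^c f(x)\,dx \quad \text{and} \quad I_2 = \lim_{b\to\beta^-} \int_c^b f(x)\,dx \]

exist, and $\int_\alpha^\beta f = I_1 + I_2$.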

Proof. Suppose that $c \le b < \beta$. Then by Lemma 9.2.5 we have

\[ \int_c^b f(x)\,dx = \int_c^b f^+(x)\,dx - \int_c^b [-f^-(x)]\,dx \]

and similarly for $\int_a^c f$ if $a < c$. Now use Theorem 9.2.3 to conclude that

\[ \begin{aligned} \int_\alpha^\beta f &= \int_\alpha^\beta f^+ - \int_\alpha^\beta [-f^-] = \lim_{a\to\alpha^+} \int_a^c f^+(x)\,dx + \lim_{b\to\beta^-} \int_c^b f^+(x)\,dx \\ &\quad - \lim_{a\to\alpha^+} \int_a^c [-f^-(x)]\,dx - \lim_{b\to\beta^-} \int_c^b [-f^-(x)]\,dx = I_1 + I_2. \end{aligned} \]

9.2.3 Conditionally convergent integrals

Recall that there are infinite series such as

\[ \sum_{n=1}^{\infty} (-1)^n \frac{1}{n} \]

which are convergent (by the alternating series test), but not absolutely convergent. A similar situation arises in the study of improper integrals. For example, the integral

\[ \int_1^{\infty} \frac{\sin(x)}{x}\,dx \tag{9.8} \]

is not absolutely convergent, but there is a number

\[ L = \lim_{N\to\infty} \int_1^N \frac{\sin(x)}{x}\,dx. \]

Assume, as before, that $f$ is Riemann integrable on each compact subinterval $[a, b]$ of the open interval $(\alpha, \beta)$. Say that the integral $\int_\alpha^\beta f(x)\,dx$ converges if there is some number $c \in (\alpha, \beta)$ and numbers $I_1$ and $I_2$ such that the limits

\[ I_1 = \lim_{a\to\alpha^+} \int_a^c f(x)\,dx, \quad \text{and} \quad I_2 = \lim_{b\to\beta^-} \int_c^b f(x)\,dx \]

exist. We then define

\[ \int_\alpha^\beta f(x)\,dx = I_1 + I_2. \]

By Theorem 9.2.6 an absolutely convergent integral is convergent. An integral which is convergent, but not absolutely convergent, is said to be conditionally convergent.

It appears that the value of a conditionally convergent integral might depend on the choice of the point $c$. The first result will show that this is not the case.

Theorem 9.2.7. Suppose the integral $\int_\alpha^\beta f(x)\,dx$ converges. Let $d \in (\alpha, \beta)$, and let $I_1$ and $I_2$ be as above. There are real numbers $J_1$ and $J_2$ such that

\[ J_1 = \lim_{a\to\alpha^+} \int_a^d f(x)\,dx, \quad J_2 = \lim_{b\to\beta^-} \int_d^b f(x)\,dx, \]

and $J_1 + J_2 = I_1 + I_2$.

Proof. For ease of notation assume that $c < d$. Then

\[ J_2 = \lim_{b\to\beta^-} \int_d^b f(x)\,dx = \lim_{b\to\beta^-} \Bigl[\int_c^b f(x)\,dx - \int_c^d f(x)\,dx\Bigr] = I_2 - \int_c^d f(x)\,dx, \]

so $J_2$ exists. A similar argument gives

\[ J_1 = I_1 + \int_c^d f(x)\,dx, \]

which in turn leads to $J_1 + J_2 = I_1 + I_2$.

The next pair of results make a strong link between convergent integrals and convergent series.

Theorem 9.2.8. Assume that $f : [c, \beta) \to \mathbb{R}$ is integrable on every compact subinterval $[a, b] \subset [c, \beta)$. Suppose there is a sequence of points $\{x_k\}$ such that $x_1 = c$, $x_k < x_{k+1}$, $\lim_{k\to\infty} x_k = \beta$, and for $k = 1, 2, 3, \dots$ we have $f(x) \ge 0$ when $x \in [x_{2k-1}, x_{2k}]$, while $f(x) \le 0$ when $x \in [x_{2k}, x_{2k+1}]$. Then the integral $\int_c^\beta f(x)\,dx$ converges if and only if the series

\[ \sum_{k=1}^{\infty} \int_{x_k}^{x_{k+1}} f(x)\,dx \]

converges.

Proof. By definition the infinite series converges if the sequence of partial sums

\[ s_n = \sum_{k=1}^{n} \int_{x_k}^{x_{k+1}} f(x)\,dx = \int_c^{x_{n+1}} f(x)\,dx \]

has a limit as $n \to \infty$. This is clearly implied by the existence of the limit

\[ \lim_{b\to\beta^-} \int_c^b f(x)\,dx, \]

which is assumed if the integral $\int_c^\beta f(x)\,dx$ converges.

Now assume that the sequence of partial sums

\[ s_n = \sum_{k=1}^{n} \int_{x_k}^{x_{k+1}} f(x)\,dx = \int_c^{x_{n+1}} f(x)\,dx \]

converges to a limit $L$. Since the terms of the series are

\[ a_k = \int_{x_k}^{x_{k+1}} f(x)\,dx, \]

and the terms of a convergent series have limit 0, it follows that

\[ \lim_{k\to\infty} \int_{x_k}^{x_{k+1}} f(x)\,dx = 0. \]

For any $\epsilon > 0$ there is an $N$ such that

\[ |s_n - L| < \epsilon/2, \quad n \ge N, \]

and

\[ \Bigl| \int_{x_k}^{x_{k+1}} f(x)\,dx \Bigr| < \epsilon/2, \quad k \ge N. \]

Since $f$ does not change sign in the interval $[x_k, x_{k+1}]$, it follows that

\[ \Bigl| \int_{x_k}^{b} f(x)\,dx \Bigr| < \epsilon/2, \quad k \ge N, \quad x_k \le b \le x_{k+1}. \]

Suppose now that $x_m \le b \le x_{m+1}$, with $m > N$. Then by the triangle inequality

\[ \begin{aligned} \Bigl| \int_c^b f(x)\,dx - L \Bigr| &= \Bigl| \int_c^{x_m} f(x)\,dx + \int_{x_m}^{b} f(x)\,dx - L \Bigr| \\ &\le \Bigl| \int_c^{x_m} f(x)\,dx - L \Bigr| + \Bigl| \int_{x_m}^{b} f(x)\,dx \Bigr| < \epsilon. \end{aligned} \]

Thus

\[ \lim_{b\to\beta^-} \int_c^b f(x)\,dx = L. \]

This theorem, in conjunction with the alternating series test for infinite series, can be applied to integrals such as (9.8).

Theorem 9.2.9. Suppose that $f : [0, \infty) \to \mathbb{R}$ is integrable on every compact subinterval $[a, b] \subset [0, \infty)$. Assume that $f$ is positive, decreasing, and $\lim_{x\to\infty} f(x) = 0$. Then the integrals

\[ \int_0^{\infty} f(x)\sin(x)\,dx \quad \text{and} \quad \int_0^{\infty} f(x)\cos(x)\,dx \tag{9.9} \]

converge.

Proof. We will treat the first case; the second is similar. The sequence $x_k = (k-1)\pi$ will satisfy the hypotheses of Theorem 9.2.8 since the sign of $f(x)\sin(x)$ is the same as that of $\sin(x)$, which changes at the points $k\pi$.

The numbers

$$a_k = \int_{(k-1)\pi}^{k\pi} f(x)\sin(x)\,dx, \quad k = 1, 2, 3, \dots$$

have alternating signs and decreasing magnitudes. The second claim is verified with the following computation, which makes use of the identity $\sin(x + \pi) = -\sin(x)$:

$$\begin{aligned}
|a_k| - |a_{k+1}| &= \int_{(k-1)\pi}^{k\pi} |f(x)\sin(x)|\,dx - \int_{k\pi}^{(k+1)\pi} |f(x)\sin(x)|\,dx \\
&= \int_{(k-1)\pi}^{k\pi} \big(|f(x)\sin(x)| - |f(x+\pi)\sin(x+\pi)|\big)\,dx \\
&= \int_{(k-1)\pi}^{k\pi} \big[|f(x)| - |f(x+\pi)|\big]\,|\sin(x)|\,dx.
\end{aligned}$$

Since f(x) is decreasing, $|a_k| - |a_{k+1}| \ge 0$, so the sequence $\{|a_k|\}$ is decreasing.


Since

$$|a_k| = \int_{(k-1)\pi}^{k\pi} |f(x)\sin(x)|\,dx \le \pi f((k-1)\pi),$$

and $\lim_{x \to \infty} f(x) = 0$, it follows that $\lim_{k \to \infty} a_k = 0$. The series $\sum a_k = \sum (-1)^{k+1}|a_k|$ converges by the alternating series test. By Theorem 9.2.8 the integral $\int_0^\infty f(x)\sin(x)\,dx$ converges.
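The structure of this proof can be observed numerically. The following Python sketch is only an illustration under stated assumptions: the sample function f(x) = 1/(1 + x), the helper names, and the midpoint rule are our choices and are not part of the text. It computes the terms $a_k$ and the partial sums, so that one can see the alternating signs, the decreasing magnitudes, and the slowly settling partial sums.

```python
import math

def midpoint_rule(func, a, b, n=2000):
    """Approximate the integral of func over [a, b] by the midpoint rule."""
    h = (b - a) / n
    return h * sum(func(a + (i + 0.5) * h) for i in range(n))

def f(x):
    # Sample choice (an assumption): positive, decreasing, tending to 0.
    return 1.0 / (1.0 + x)

def a_k(k):
    """The term a_k: integral of f(x) sin(x) over [(k-1)pi, k pi]."""
    return midpoint_rule(lambda x: f(x) * math.sin(x),
                         (k - 1) * math.pi, k * math.pi)

terms = [a_k(k) for k in range(1, 41)]

# Partial sums s_n = a_1 + ... + a_n, i.e. the integral over [0, n pi].
partial_sums = []
running = 0.0
for term in terms:
    running += term
    partial_sums.append(running)

for k in (1, 2, 3, 10, 40):
    print(f"k={k:2d}  a_k={terms[k - 1]: .6f}  s_k={partial_sums[k - 1]:.6f}")
```

The printed partial sums oscillate around a common value with shrinking amplitude, which is the behavior the alternating series test predicts; the computation is of course evidence, not a proof.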

9.3 Integrals with parameters

9.3.1 Sample computations

In mathematics one often encounters integrals which depend on auxiliary parameters. Probably the two most important examples are the Laplace transform and the Fourier transform. The Laplace transform of a function f : [0,∞) → R is the function

$$F(s) = \int_0^\infty f(x)e^{-sx}\,dx. \tag{9.10}$$

The Fourier transform has several closely related forms. The Fourier sine and cosine transforms of a function f : R → R are respectively the functions

$$S(\omega) = \int_{-\infty}^{\infty} f(x)\sin(\omega x)\,dx, \qquad C(\omega) = \int_{-\infty}^{\infty} f(x)\cos(\omega x)\,dx. \tag{9.11}$$

The Laplace and Fourier transforms are usually first encountered as techniques which convert certain problems in differential equations into problems of algebra, which are often simpler. The Fourier transform is particularly important in more advanced studies of both pure and applied mathematics.

These examples have two features of interest. First, they provide a new means of defining functions. In both cases functions of two variables are integrated with respect to one variable, producing a function of the other variable. It is both natural and useful to ask whether such functions have derivatives, and how to compute them. Second, both examples involve improper integrals. One expects this feature to lead to some complications.


To begin our discussion, let’s look at a simpler problem which only involves Riemann integration. Consider the function $f(y) = \tan^{-1}(xy)$. The parameter x should be considered fixed for the purpose of these calculations. By the chain rule

$$\frac{df}{dy} = \frac{x}{1 + x^2y^2}.$$

The Fundamental Theorem of Calculus gives

$$f(b) - f(a) = \int_a^b f'(y)\,dy.$$

Taking a = 0 and using $\tan^{-1}(0) = 0$, we obtain

$$\tan^{-1}(xb) = \int_0^b \frac{x}{1 + x^2y^2}\,dy.$$

Now take b = 1 in this formula to obtain

$$\tan^{-1}(x) = \int_0^1 \frac{x}{1 + x^2y^2}\,dy. \tag{9.12}$$

Having obtained this integral formula for the function $\tan^{-1}(x)$, it is natural to ask how to relate the formula to the calculation of the derivative, which calculus tells us is

$$\frac{d}{dx}\tan^{-1}(x) = \frac{1}{1 + x^2}.$$

A reasonable guess is that the derivative can be obtained by differentiating the integrand in (9.12) with respect to the variable x. Differentiating under the integral sign in (9.12) gives

$$\int_0^1 \frac{d}{dx}\,\frac{x}{1 + x^2y^2}\,dy = \int_0^1 \frac{1 - x^2y^2}{(1 + x^2y^2)^2}\,dy.$$

This last form is not so transparent, but if

$$g(y) = \frac{y}{1 + x^2y^2},$$

then

$$g'(y) = \frac{1 - x^2y^2}{(1 + x^2y^2)^2}.$$


The Fundamental Theorem of Calculus then yields

$$\int_0^1 g'(y)\,dy = g(1) - g(0),$$

or

$$\int_0^1 \frac{1 - x^2y^2}{(1 + x^2y^2)^2}\,dy = \frac{1}{1 + x^2}.$$

At least in this case, differentiating under the integral sign is valid.

This example illustrates a more general situation. Given a function g(x, y) of two variables, consider defining a function f(x) by

$$f(x) = \int_a^b g(x, y)\,dy.$$

What conditions on g make f a differentiable function of x, and when is the formula

$$f'(x) = \int_a^b \frac{\partial g}{\partial x}(x, y)\,dy$$

valid?
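Before turning to the general question, the arctangent computation can be checked numerically. The Python sketch below is an illustration only; the helper `simpson`, the function names, and the sample values of x are assumptions of ours. It evaluates the integral in (9.12) and the integral of the x-derivative of its integrand, and compares them with $\tan^{-1}(x)$ and $1/(1 + x^2)$.

```python
import math

def simpson(func, a, b, n=1000):
    """Composite Simpson rule on [a, b] with n (even) subintervals."""
    h = (b - a) / n
    total = func(a) + func(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * func(a + i * h)
    return total * h / 3

def arctan_via_integral(x):
    # Formula (9.12): integral over [0, 1] of x / (1 + x^2 y^2) dy.
    return simpson(lambda y: x / (1 + x * x * y * y), 0.0, 1.0)

def derivative_under_integral(x):
    # Integral over [0, 1] of the x-derivative of the integrand in (9.12).
    return simpson(lambda y: (1 - x * x * y * y) / (1 + x * x * y * y) ** 2,
                   0.0, 1.0)

for x in (-2.0, 0.0, 0.5, 3.0):
    print(x,
          arctan_via_integral(x), math.atan(x),
          derivative_under_integral(x), 1 / (1 + x * x))
```

In each printed row the first pair of numbers and the second pair of numbers should agree to many digits, in line with the calculation above.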

9.3.2 Some analysis in two variables

We are going to require some definitions for discussing functions of two variables. These ideas, which may be familiar from multivariable calculus, extend to any number of variables. Let $\mathbb{R}^2$ denote the set of ordered pairs of real numbers. Suppose $P = (x_1, y_1)$ and $Q = (x_2, y_2)$ are points in $\mathbb{R}^2$. Define the distance between P and Q by

$$d(P, Q) = \sqrt{(x_1 - x_2)^2 + (y_1 - y_2)^2},$$

and the open ball of radius r centered at P by

$$B_r(P) = \{Q \in \mathbb{R}^2 \mid d(P, Q) < r\}.$$

A set $\Omega \subset \mathbb{R}^2$ is open if for every P ∈ Ω there is an r > 0 such that $B_r(P) \subset \Omega$.

Suppose that g : Ω → R is a real valued function defined on Ω (which is not necessarily open). With $P = (x_1, y_1)$ and $Q = (x_2, y_2)$, say that g is continuous at P ∈ Ω if for every ε > 0 there is a $\delta_P > 0$ such that

$$|g(x_2, y_2) - g(x_1, y_1)| < \varepsilon \quad \text{whenever} \quad d(P, Q) < \delta_P, \quad P, Q \in \Omega.$$


The function g is continuous on Ω if it is continuous at each point of Ω, and g is uniformly continuous on Ω if $\delta_P$ can be chosen independent of P. As with functions of one variable, sums and products of continuous functions of two variables are continuous. Note that if g(x, y) is continuous at $(x_0, y_0)$ as a function of two variables, then the function $g(x, y_0)$, respectively $g(x_0, y)$, is continuous at $x_0$, respectively $y_0$, as a function of one variable.

It is often convenient to work with functions defined on rectangles. Suppose that $I_1 = [a, b]$ and $I_2 = [c, d]$. Then

$$I_1 \times I_2 = \{(x, y) \in \mathbb{R}^2 \mid x \in I_1,\ y \in I_2\}.$$

By following the approach used in Theorem 7.3.6, one may establish the following result.

Theorem 9.3.1. If $g : I_1 \times I_2 \to \mathbb{R}$ is continuous, then g is uniformly continuous.

Suppose that Ω is an open set, and g : Ω → R. Say that g has partial derivatives $\frac{\partial g}{\partial x}$ with respect to x and $\frac{\partial g}{\partial y}$ with respect to y at $(x_0, y_0)$ if the following limits exist:

$$\frac{\partial g}{\partial x}(x_0, y_0) = \lim_{h \to 0} \frac{g(x_0 + h, y_0) - g(x_0, y_0)}{h},$$

$$\frac{\partial g}{\partial y}(x_0, y_0) = \lim_{h \to 0} \frac{g(x_0, y_0 + h) - g(x_0, y_0)}{h}.$$

For our treatment of functions defined by integration it will be convenient to use the Fundamental Theorem of Calculus in the following form. Assume that the line segment from (α, y) to (β, y) is a subset of Ω. If y is fixed, then a function f : [α, β] → R is defined by f(x) = g(x, y). The partial derivative of g with respect to x at (x, y) is just the derivative of f(x). Thus if $\frac{\partial g}{\partial x}$ is continuous on [α, β], then

$$g(\beta, y) - g(\alpha, y) = \int_\alpha^\beta \frac{\partial g}{\partial x}(x, y)\,dx.$$

Similar remarks apply for the partial derivative of g with respect to y.

The theory of Riemann integration extends from one variable to several variables. Instead of working on an interval [a, b], integrals are now computed over rectangles $R = [a, b] \times [c, d] = I_1 \times I_2$. Partitions are sets of ordered pairs $(x_i, y_j)$ where $a = x_0 < x_1 < \cdots < x_m = b$ is a partition of [a, b], and $c = y_0 < y_1 < \cdots < y_n = d$ is a partition of [c, d]. Such a partition divides the rectangle R into a collection of subrectangles

$$R_{ij} = [x_i, x_{i+1}] \times [y_j, y_{j+1}], \quad i = 0, \dots, m-1, \quad j = 0, \dots, n-1.$$

To establish the appropriate notation, define

$$\Delta x_i = x_{i+1} - x_i, \qquad \Delta y_j = y_{j+1} - y_j,$$

$$M_{ij} = \sup_{(x,y) \in R_{ij}} g(x, y), \qquad m_{ij} = \inf_{(x,y) \in R_{ij}} g(x, y).$$

For bounded real valued functions g(x, y) defined on $I_1 \times I_2$, the upper and lower sums $U(g, \mathcal{P})$ and $L(g, \mathcal{P})$ may be defined for a given partition $\mathcal{P}$ using the supremum and infimum of g on the subrectangles $R_{ij}$ of the partition,

$$U(g, \mathcal{P}) = \sum_{i,j} M_{ij}\,\Delta x_i\,\Delta y_j, \qquad L(g, \mathcal{P}) = \sum_{i,j} m_{ij}\,\Delta x_i\,\Delta y_j.$$

The function g is said to be integrable if the infimum of the upper sums is equal to the supremum of the lower sums, and this common value is the integral of g, which is denoted

$$\iint_R g(x, y) = \iint_{I_1 \times I_2} g(x, y).$$

Riemann sums corresponding to a partition have the form

$$\sum_{i=0}^{m-1} \sum_{j=0}^{n-1} g(s_i, t_j)\,\Delta x_i\,\Delta y_j, \qquad (s_i, t_j) \in R_{ij}.$$

Based on Theorem 9.3.1 there is a generalization of Theorem 8.2.4 which may be stated as follows.

Theorem 9.3.2. Suppose $g : I_1 \times I_2 \to \mathbb{R}$ is continuous. Then g is integrable. For any ε > 0 there is a $\mu_0 > 0$ such that

$$\Big|\iint_R g(x, y) - \sum_{i=0}^{m-1} \sum_{j=0}^{n-1} g(x_i, y_j)\,\Delta x_i\,\Delta y_j\Big| < \varepsilon$$

whenever

$$\max(\Delta x_i, \Delta y_j) \le \mu_0, \quad i = 0, \dots, m-1, \quad j = 0, \dots, n-1.$$
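As a concrete illustration of Theorem 9.3.2, the following Python sketch forms the Riemann sums with sample points $(x_i, y_j)$ at the lower left corner of each subrectangle. The function g(x, y) = xy on the unit square (whose integral is 1/4), the uniform partitions, and the name `riemann_sum_2d` are assumptions chosen for the example.

```python
def riemann_sum_2d(g, a, b, c, d, m, n):
    """Riemann sum of g over [a, b] x [c, d] using uniform partitions with
    m and n subintervals and sample points (x_i, y_j) at lower left corners."""
    dx = (b - a) / m
    dy = (d - c) / n
    total = 0.0
    for i in range(m):
        x_i = a + i * dx
        for j in range(n):
            y_j = c + j * dy
            total += g(x_i, y_j) * dx * dy
    return total

def g(x, y):
    # Sample integrand (an assumption); its integral over [0,1] x [0,1] is 1/4.
    return x * y

errors = []
for m in (10, 100, 400):
    approx = riemann_sum_2d(g, 0.0, 1.0, 0.0, 1.0, m, m)
    errors.append(abs(approx - 0.25))
    print(m, approx, errors[-1])
```

The errors shrink as the mesh $\max(\Delta x_i, \Delta y_j)$ decreases, which is what the theorem guarantees for continuous g.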


9.3.3 Functions defined by Riemann integration

For ease of exposition, assume that g(x, y) is a function of two variables which is defined and continuous for all $(x, y) \in \mathbb{R}^2$.

Theorem 9.3.3. Suppose that $g : \mathbb{R}^2 \to \mathbb{R}$ is continuous, and that a and b are real numbers. Then the function f : R → R defined by

$$f(x) = \int_a^b g(x, y)\,dy$$

is continuous.

Proof. Pick real numbers $x_0$, ε > 0, and σ > 0. Let $I_1 = [x_0 - \sigma, x_0 + \sigma]$, $I_2 = [a, b]$, and assume that

$$|x - x_0| < \sigma.$$

Consider

$$\begin{aligned}
|f(x) - f(x_0)| &= \Big|\int_a^b \big(g(x, y) - g(x_0, y)\big)\,dy\Big| \\
&\le \int_a^b \big|g(x, y) - g(x_0, y)\big|\,dy \le |b - a| \sup_{(x,y) \in I_1 \times I_2} |g(x, y) - g(x_0, y)|.
\end{aligned} \tag{9.13}$$

Since g is continuous, it is uniformly continuous on the rectangle $I_1 \times I_2$ by Theorem 9.3.1. Thus there is a δ > 0 such that for points $(x, y_2)$ and $(x_0, y_1)$ of $I_1 \times I_2$,

$$|g(x, y_2) - g(x_0, y_1)| < \frac{\varepsilon}{1 + |b - a|}$$

if the distance from $(x, y_2)$ to $(x_0, y_1)$ is less than δ. This distance inequality will certainly be satisfied if $y_2 = y_1 = y$ and $|x - x_0| < \delta$. By (9.13) the function f is continuous.

A similar direct analysis gives conditions ensuring the differentiability of f.

Theorem 9.3.4. Suppose that $g : \mathbb{R}^2 \to \mathbb{R}$ and $\frac{\partial g}{\partial x} : \mathbb{R}^2 \to \mathbb{R}$ are continuous, and that a and b are real numbers. Then the function f : R → R defined by

$$f(x) = \int_a^b g(x, y)\,dy$$

is differentiable, and

$$f'(x) = \int_a^b \frac{\partial g}{\partial x}(x, y)\,dy.$$

Proof. As in the last proof, pick real numbers $x_0$, ε > 0, and σ > 0. Let $I_1 = [x_0 - \sigma, x_0 + \sigma]$, $I_2 = [a, b]$, and assume that

$$|x - x_0| < \sigma.$$

Since $\frac{\partial g}{\partial x}(x, y)$ is continuous, the Fundamental Theorem of Calculus gives

$$g(x_1, y) - g(x_0, y) = \int_{x_0}^{x_1} \frac{\partial g}{\partial x}(x, y)\,dx.$$

This identity may be used to express the difference quotients for f as

$$\frac{f(x_0 + h) - f(x_0)}{h} = \int_a^b \frac{g(x_0 + h, y) - g(x_0, y)}{h}\,dy = \int_a^b \frac{1}{h} \int_{x_0}^{x_0 + h} \frac{\partial g}{\partial x}(x, y)\,dx\,dy.$$

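Now write

$$\frac{\partial g}{\partial x}(x, y) = \frac{\partial g}{\partial x}(x_0, y) + \Big(\frac{\partial g}{\partial x}(x, y) - \frac{\partial g}{\partial x}(x_0, y)\Big),$$

and use the fact that $\frac{\partial g}{\partial x}(x_0, y)$ is constant in x for each y to get

$$\frac{f(x_0 + h) - f(x_0)}{h} = \int_a^b \frac{\partial g}{\partial x}(x_0, y)\,dy + \int_a^b \frac{1}{h} \int_{x_0}^{x_0 + h} \Big(\frac{\partial g}{\partial x}(x, y) - \frac{\partial g}{\partial x}(x_0, y)\Big)\,dx\,dy.$$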

This identity leads to the inequality

$$\begin{aligned}
\Big|\frac{f(x_0 + h) - f(x_0)}{h} - \int_a^b \frac{\partial g}{\partial x}(x_0, y)\,dy\Big|
&\le \int_a^b \frac{1}{h} \int_{x_0}^{x_0 + h} \Big|\frac{\partial g}{\partial x}(x, y) - \frac{\partial g}{\partial x}(x_0, y)\Big|\,dx\,dy \\
&\le |b - a| \sup_{(x,y) \in I_1 \times I_2} \Big|\frac{\partial g}{\partial x}(x, y) - \frac{\partial g}{\partial x}(x_0, y)\Big|.
\end{aligned}$$

The function $\frac{\partial g}{\partial x}$ is uniformly continuous on $I_1 \times I_2$ by Theorem 9.3.1. This means there is a δ such that

$$\sup_{(x,y) \in I_1 \times I_2} \Big|\frac{\partial g}{\partial x}(x, y) - \frac{\partial g}{\partial x}(x_0, y)\Big| < \frac{\varepsilon}{1 + |b - a|}$$

when $|x - x_0| < \delta$, giving

$$\lim_{h \to 0} \frac{f(x_0 + h) - f(x_0)}{h} = \int_a^b \frac{\partial g}{\partial x}(x_0, y)\,dy.$$

Continuity of g(x, y) implied that

$$f(x) = \int_a^b g(x, y)\,dy$$

is continuous. The function f is then integrable, and the number

$$\int_c^d f(x)\,dx = \int_c^d \int_a^b g(x, y)\,dy\,dx$$

may be considered. This is called an iterated integral. The integrations may also be carried out in the reverse order. The next result says that the two iterated integrals have the same value as the Riemann integral of the function g(x, y).

Theorem 9.3.5. Let R = [a, b] × [c, d] and suppose that $g : \mathbb{R}^2 \to \mathbb{R}$ is continuous. Then

$$\iint_R g(x, y) = \int_a^b \int_c^d g(x, y)\,dy\,dx = \int_c^d \int_a^b g(x, y)\,dx\,dy.$$

Proof. Pick ε > 0 and let $x_0 < x_1 < \cdots < x_m$ and $y_0 < y_1 < \cdots < y_n$ be respectively partitions of [a, b] and [c, d] with equal length subintervals

$$x_{i+1} - x_i = \frac{b - a}{m}, \qquad y_{j+1} - y_j = \frac{d - c}{n}.$$

Let $\mu = \max_{i,j}(\Delta x_i, \Delta y_j)$. Theorem 9.3.2 tells us that there is a $\mu_0$ such that

$$D_1 = \Big|\iint_R g(x, y) - \sum_{i=0}^{m-1} \sum_{j=0}^{n-1} g(x_i, y_j)\,\Delta x_i\,\Delta y_j\Big| < \varepsilon/3$$

if $\mu < \mu_0$.

Now consider the approximation of $\int_a^b \int_c^d g(x, y)\,dy\,dx$ obtained by using Riemann sums instead of the integration with respect to x. Letting $D_2$ denote the error

$$D_2 = \Big|\int_a^b \int_c^d g(x, y)\,dy\,dx - \sum_{i=0}^{m-1} \int_c^d g(x_i, y)\,dy\,\Delta x_i\Big|,$$

Theorem 8.2.4 shows that $D_2 < \varepsilon/3$ if $\Delta x_i$ is small enough.

A similar argument applies for each of the m integrals $\int_c^d g(x_i, y)\,dy$. Fixing i, let

$$\varepsilon_1 = \Big|\int_c^d g(x_i, y)\,dy - \sum_{j=0}^{n-1} g(x_i, y_j)\,\Delta y_j\Big|.$$

Making $\Delta y_j$ small enough will give $\varepsilon_1 < \varepsilon/(3|b - a|)$, or

$$\begin{aligned}
D_3 &= \Big|\sum_{i=0}^{m-1} \int_c^d g(x_i, y)\,dy\,\Delta x_i - \sum_{i=0}^{m-1} \sum_{j=0}^{n-1} g(x_i, y_j)\,\Delta y_j\,\Delta x_i\Big| \\
&\le \sum_{i=0}^{m-1} \Big|\int_c^d g(x_i, y)\,dy - \sum_{j=0}^{n-1} g(x_i, y_j)\,\Delta y_j\Big|\,\frac{|b - a|}{m} < \varepsilon/3.
\end{aligned}$$

The triangle inequality then gives

$$\Big|\int_a^b \int_c^d g(x, y)\,dy\,dx - \iint_R g(x, y)\Big| \le D_2 + D_3 + D_1 < \varepsilon.$$

The same argument applies to the second iterated integral, finishing the proof.
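The equality of the two iterated integrals can be tested on an example. In the Python sketch below, the integrand $g(x, y) = x e^{xy}$ on [0, 1] × [0, 2] and the helper `simpson` are assumptions chosen for illustration. For this g the inner integral in y equals $e^{2x} - 1$, so the common value is $(e^2 - 3)/2$.

```python
import math

def simpson(func, a, b, n=200):
    """Composite Simpson rule on [a, b] with n (even) subintervals."""
    h = (b - a) / n
    total = func(a) + func(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * func(a + i * h)
    return total * h / 3

def g(x, y):
    # Sample integrand (an assumption), continuous on the whole plane.
    return x * math.exp(x * y)

a, b = 0.0, 1.0   # x ranges over [a, b]
c, d = 0.0, 2.0   # y ranges over [c, d]

# Integrate in y first, then in x.
dy_then_dx = simpson(lambda x: simpson(lambda y: g(x, y), c, d), a, b)
# Integrate in x first, then in y.
dx_then_dy = simpson(lambda y: simpson(lambda x: g(x, y), a, b), c, d)

exact = (math.exp(2) - 3) / 2
print(dy_then_dx, dx_then_dy, exact)
```

Both orders of integration should reproduce the exact value up to the quadrature error.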

In addition to functions defined by integration as in Theorem 9.3.4, one often encounters more general forms where the limits of integration are functions of x rather than constants. A case in point is the variation of parameters (or constants) formula for solving differential equations. In the most important case this result says that if $w_1(x)$ and $w_2(x)$ satisfy the equation

$$w'' + q(x)w = 0, \tag{9.14}$$

and the initial conditions

$$w_1(a) = 1, \quad w_1'(a) = 0,$$

$$w_2(a) = 0, \quad w_2'(a) = 1,$$

then the function

$$z(x) = \int_a^x \big[w_2(x)w_1(y) - w_1(x)w_2(y)\big]\,r(y)\,dy$$

is a solution to the equation

$$z'' + q(x)z = r(x). \tag{9.15}$$

The functions q and r are usually assumed to be continuous, and the theory of differential equations tells us that $w_1$ and $w_2$ will have two continuous derivatives.

Let’s formulate a theorem that will enable us to verify the variation of parameters formula. This formula will be explored further in the exercises.

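Theorem 9.3.6. Suppose that $g : \mathbb{R}^2 \to \mathbb{R}$ and $\frac{\partial g}{\partial x} : \mathbb{R}^2 \to \mathbb{R}$ are continuous. Then the function f : R → R defined by

$$f(x) = \int_a^x g(x, y)\,dy$$

is differentiable, and

$$f'(x) = g(x, x) + \int_a^x \frac{\partial g}{\partial x}(x, y)\,dy.$$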

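Proof. The main calculation is

$$\begin{aligned}
\frac{f(x + h) - f(x)}{h} &= \frac{1}{h} \int_a^{x+h} g(x + h, y)\,dy - \frac{1}{h} \int_a^x g(x, y)\,dy \\
&= \frac{1}{h} \int_x^{x+h} g(x + h, y)\,dy + \frac{1}{h} \int_a^x \big(g(x + h, y) - g(x, y)\big)\,dy.
\end{aligned}$$

The continuity of g(x, y) implies

$$\lim_{h \to 0} \frac{1}{h} \int_x^{x+h} g(x + h, y)\,dy = g(x, x), \tag{9.16}$$

while the argument of Theorem 9.3.4 may be applied to show that

$$\lim_{h \to 0} \frac{1}{h} \int_a^x \big(g(x + h, y) - g(x, y)\big)\,dy = \int_a^x \frac{\partial g}{\partial x}(x, y)\,dy. \tag{9.17}$$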
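The formula of Theorem 9.3.6 can be compared with a difference quotient on an example related to variation of parameters. In this Python sketch we assume a = 0 and g(x, y) = sin(x − y) y, which corresponds to the equation z″ + z = r(x) with the sample choice r(y) = y; the helper names are ours. For this g one can check by hand that $f(x) = x - \sin(x)$, so that $f'(x) = 1 - \cos(x)$.

```python
import math

def simpson(func, a, b, n=1000):
    """Composite Simpson rule on [a, b] with n (even) subintervals."""
    h = (b - a) / n
    total = func(a) + func(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * func(a + i * h)
    return total * h / 3

A = 0.0  # lower limit of integration (assumed value of a)

def g(x, y):
    # Sample integrand: sin(x - y) * r(y) with the assumed choice r(y) = y.
    return math.sin(x - y) * y

def dg_dx(x, y):
    # Partial derivative of g with respect to x.
    return math.cos(x - y) * y

def f(x):
    return simpson(lambda y: g(x, y), A, x)

def f_prime_formula(x):
    # g(x, x) plus the integral of the partial derivative, as in Theorem 9.3.6.
    return g(x, x) + simpson(lambda y: dg_dx(x, y), A, x)

def f_prime_difference(x, h=1e-4):
    # Symmetric difference quotient, used only as an independent check.
    return (f(x + h) - f(x - h)) / (2 * h)

for x in (0.5, 1.0, 2.5):
    print(x, f_prime_formula(x), f_prime_difference(x), 1 - math.cos(x))
```

The three printed derivative values in each row should agree closely; the boundary term g(x, x) happens to vanish for this particular g, so the example mainly exercises the integral term.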


Readers who recall the chain rule for functions of several variables are invited to consider a more general problem. First derive formulas for the partial derivatives of

$$f(x, a, b) = \int_a^b g(x, y)\,dy,$$

and then formulas for h′(x) if h(x) = f(x, a(x), b(x)).

9.3.4 Functions defined by improper integrals

As indicated earlier, the Laplace transform (9.10) and the Fourier transforms (9.11) are extremely important in mathematics and related fields. These functions are defined in terms of improper integrals, so the previous results will require some modification before they can be used to justify helpful calculations.

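Here is an example of one such calculation, which is treated formally for now. In several subjects one wants to evaluate the Fourier cosine transform of the function $f(x) = \exp(-x^2)$,

$$C(\omega) = \int_{-\infty}^{\infty} e^{-x^2} \cos(\omega x)\,dx.$$

This integral looks challenging for the standard techniques of calculus. However, differentiation of the function leads to

$$\frac{dC}{d\omega} = \int_{-\infty}^{\infty} -e^{-x^2} x \sin(\omega x)\,dx.$$

Now integrate by parts to get

$$\begin{aligned}
\frac{dC}{d\omega} &= \lim_{N \to \infty} \int_{-N}^{N} \Big(\frac{1}{2}\frac{d}{dx} e^{-x^2}\Big) \sin(\omega x)\,dx \\
&= \lim_{N \to \infty} \Big[\frac{1}{2} e^{-x^2} \sin(\omega x)\Big|_{-N}^{N} - \frac{1}{2} \int_{-N}^{N} e^{-x^2}\,\omega \cos(\omega x)\,dx\Big] \\
&= -\frac{\omega}{2} \int_{-\infty}^{\infty} e^{-x^2} \cos(\omega x)\,dx = -\frac{\omega}{2} C(\omega).
\end{aligned}$$

The reader may recall how to solve the differential equation

$$C'(\omega) = -\frac{\omega}{2} C(\omega).$$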


Begin by moving all expressions involving C to the left.

$$\frac{C'(\omega)}{C(\omega)} = \frac{d}{d\omega} \log\big(|C(\omega)|\big) = -\frac{\omega}{2}.$$

Now integrate to get

$$\log\big(|C(\omega)|\big) = K - \omega^2/4,$$

or

$$|C(\omega)| = e^K e^{-\omega^2/4}.$$

Combining the continuity of C(ω) with the fact that C(0) > 0, the last equation implies that

$$C(\omega) = e^K e^{-\omega^2/4}.$$

To find the constant $e^K$, consider the value of C(ω) at ω = 0,

$$C(0) = e^K = \int_{-\infty}^{\infty} e^{-x^2}\,dx.$$

This integral is usually encountered in multivariable calculus, where trickery involving polar coordinates shows that $C(0) = \sqrt{\pi}$. The final result is

$$C(\omega) = \sqrt{\pi}\,e^{-\omega^2/4},$$

which is nearly the same function we started with.

Turning to some general questions, let us consider continuity and differentiability for functions of the form

$$f(x) = \int_c^\infty g(x, y)\,dy.$$

Assume that

$$f(x) = \lim_{N \to \infty} \int_c^N g(x, y)\,dy$$

exists for x in some interval I ⊂ R. We say that the integral converges uniformly on I if for every ε > 0 there is an M such that

$$\Big|f(x) - \int_c^N g(x, y)\,dy\Big| < \varepsilon, \quad \text{whenever } N \ge M, \quad x \in I.$$
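The Fourier cosine transform of $\exp(-x^2)$ computed formally above gives a convenient test case for this definition. The Python sketch below is an illustration under assumptions of our own: the helper `simpson`, the truncation points N, and the sample values of ω. It compares the truncated integrals over [−N, N] with the closed form $\sqrt{\pi}\,e^{-\omega^2/4}$. Since $|e^{-x^2}\cos(\omega x)| \le e^{-x^2}$, the truncation error is at most $2\int_N^\infty e^{-x^2}\,dx = \sqrt{\pi}\,\mathrm{erfc}(N)$, a bound that does not depend on ω.

```python
import math

def simpson(func, a, b, n=4000):
    """Composite Simpson rule on [a, b] with n (even) subintervals."""
    h = (b - a) / n
    total = func(a) + func(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * func(a + i * h)
    return total * h / 3

def truncated_transform(omega, N):
    """Integral of exp(-x^2) cos(omega x) over [-N, N]."""
    return simpson(lambda x: math.exp(-x * x) * math.cos(omega * x), -N, N)

def closed_form(omega):
    # The value sqrt(pi) exp(-omega^2 / 4) obtained in the text.
    return math.sqrt(math.pi) * math.exp(-omega * omega / 4)

omegas = [0.0, 0.5, 1.0, 2.0, 4.0]   # sample parameter values (assumption)
cutoffs = [1.0, 2.0, 3.0, 6.0]       # sample truncation points N (assumption)

worst_errors = []
for N in cutoffs:
    worst = max(abs(closed_form(w) - truncated_transform(w, N)) for w in omegas)
    worst_errors.append(worst)
    # The tail bound sqrt(pi) * erfc(N) is the same for every omega.
    print(N, worst, math.sqrt(math.pi) * math.erfc(N))
```

For each N the largest error over the sampled ω stays below the ω-independent tail bound, which is the kind of estimate the definition of uniform convergence asks for (here applied to each of the two half-lines).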


Lemma 9.3.7. Suppose the integral

$$f(x) = \int_c^\infty g(x, y)\,dy$$

converges uniformly on I. Then for every ε > 0 there is an M such that

$$\Big|\int_M^N g(x, y)\,dy\Big| < \varepsilon, \quad N \ge M, \quad x \in I.$$

Proof. Take $\varepsilon_1 = \varepsilon/2$, and let M correspond to $\varepsilon_1$ as in the definition of a uniformly convergent integral. If N ≥ M, the triangle inequality gives

$$\begin{aligned}
\Big|\int_M^N g(x, y)\,dy\Big| &= \Big|\int_c^N g(x, y)\,dy - f(x) + f(x) - \int_c^M g(x, y)\,dy\Big| \\
&\le \Big|\int_c^N g(x, y)\,dy - f(x)\Big| + \Big|f(x) - \int_c^M g(x, y)\,dy\Big| < 2\varepsilon_1 = \varepsilon.
\end{aligned}$$

One method of establishing uniform convergence is to compare the functions g(x, y) to a single absolutely integrable function h(y).

Theorem 9.3.8. Suppose that the positive function h(y) is integrable on [c,∞), and for each fixed x ∈ I the function G(y) = g(x, y) is integrable. If |g(x, y)| ≤ h(y), then the integrals

$$f(x) = \int_c^\infty g(x, y)\,dy$$

converge uniformly on I.

Proof. Suppose that

$$J = \int_c^\infty h(y)\,dy = \sup_{[a,b] \subset (c,\infty)} \int_a^b h(y)\,dy.$$

Given any ε > 0 there is an M such that

$$J - \varepsilon < \int_c^M h(y)\,dy \le J,$$

and since h(y) is positive and integrable,

$$\int_M^N h(y)\,dy < \varepsilon, \quad N \ge M.$$

Since

$$f(x) = \lim_{N \to \infty} \int_c^N g(x, y)\,dy,$$

for each x ∈ I there is an $M_1$ such that

$$\Big|f(x) - \int_c^N g(x, y)\,dy\Big| < \varepsilon,$$

whenever $N \ge M_1$. There is no loss of generality if we assume that $M_1 \ge M$.

There are two cases to consider: $M \le M_1 \le N$, and $M \le N \le M_1$. In the first case

$$\Big|f(x) - \int_c^N g(x, y)\,dy\Big| = \Big|f(x) - \int_c^{M_1} g(x, y)\,dy - \int_{M_1}^N g(x, y)\,dy\Big|.$$

In this case the triangle inequality gives

$$\Big|f(x) - \int_c^N g(x, y)\,dy\Big| < \varepsilon + \int_M^N h(y)\,dy < 2\varepsilon.$$

In the second case

$$\Big|f(x) - \int_c^N g(x, y)\,dy\Big| = \Big|f(x) - \int_c^{M_1} g(x, y)\,dy + \int_N^{M_1} g(x, y)\,dy\Big|,$$

and so

$$\Big|f(x) - \int_c^N g(x, y)\,dy\Big| < \varepsilon + \int_M^{M_1} h(y)\,dy < 2\varepsilon.$$

Theorem 9.3.9. Assume that $g : \mathbb{R}^2 \to \mathbb{R}$ is continuous, and that the integrals

$$f(x) = \int_c^\infty g(x, y)\,dy$$

converge uniformly for x ∈ I. Then the function f(x) is continuous on I.


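Proof. Let $x_1 \in I$ and ε > 0. Since the integrals for f(x) converge uniformly, Lemma 9.3.7 says there is an M such that

$$\Big|\int_M^N g(x, y)\,dy\Big| < \varepsilon, \quad N \ge M, \quad x \in I.$$

Also, for each x and $x_1$ in I there is an $M_1$ such that

$$\Big|f(x) - \int_c^N g(x, y)\,dy\Big| < \varepsilon, \qquad \Big|f(x_1) - \int_c^N g(x_1, y)\,dy\Big| < \varepsilon, \qquad N \ge M_1.$$

As in the last theorem, it is safe to assume that $M_1 \ge M$. Now

$$\begin{aligned}
|f(x) - f(x_1)| &= \Big|\int_c^{M_1} \big(g(x, y) - g(x_1, y)\big)\,dy + f(x) - \int_c^{M_1} g(x, y)\,dy - f(x_1) + \int_c^{M_1} g(x_1, y)\,dy\Big| \\
&\le \Big|\int_c^{M} \big(g(x, y) - g(x_1, y)\big)\,dy\Big| + \Big|\int_M^{M_1} \big(g(x, y) - g(x_1, y)\big)\,dy\Big| + 2\varepsilon.
\end{aligned}$$

In addition,

$$\Big|\int_M^{M_1} \big(g(x, y) - g(x_1, y)\big)\,dy\Big| \le \Big|\int_M^{M_1} g(x, y)\,dy\Big| + \Big|\int_M^{M_1} g(x_1, y)\,dy\Big| < 2\varepsilon.$$

Finally, since $g : \mathbb{R}^2 \to \mathbb{R}$ is continuous, it is uniformly continuous on I × [c, M]. This means there is a δ > 0, independent of x and $x_1$, such that

$$|g(x, y) - g(x_1, y)| < \frac{\varepsilon}{1 + |M - c|}, \quad |x - x_1| < \delta.$$

This leads to

$$\Big|\int_c^M \big(g(x, y) - g(x_1, y)\big)\,dy\Big| \le \int_c^M |g(x, y) - g(x_1, y)|\,dy < \varepsilon,$$

concluding the proof.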


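Theorem 9.3.10. Suppose that $g : \mathbb{R}^2 \to \mathbb{R}$ and $\frac{\partial g}{\partial x} : \mathbb{R}^2 \to \mathbb{R}$ are continuous, that the integral

$$f(x) = \int_c^\infty g(x, y)\,dy$$

converges for every x ∈ (a, b), and that the integrals

$$h(x) = \int_c^\infty \frac{\partial g}{\partial x}(x, y)\,dy$$

converge uniformly for x ∈ (a, b). Then the function f(x) is differentiable, and

$$f'(x) = \int_c^\infty \frac{\partial g}{\partial x}(x, y)\,dy.$$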

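Proof. By the Fundamental Theorem of Calculus the difference quotients are

$$\begin{aligned}
\frac{f(x_0 + h) - f(x_0)}{h} &= \int_c^\infty \frac{g(x_0 + h, y) - g(x_0, y)}{h}\,dy \\
&= \int_c^\infty \frac{1}{h} \int_{x_0}^{x_0 + h} \frac{\partial g}{\partial x}(x, y)\,dx\,dy = \lim_{N \to \infty} \int_c^N \frac{1}{h} \int_{x_0}^{x_0 + h} \frac{\partial g}{\partial x}(x, y)\,dx\,dy.
\end{aligned}$$

Let ε > 0 and choose M such that

$$\Big|\int_M^N \frac{\partial g}{\partial x}(x, y)\,dy\Big| < \varepsilon, \quad N \ge M, \quad x \in (a, b).$$

By Theorem 9.3.5

$$\int_c^N \frac{1}{h} \int_{x_0}^{x_0 + h} \frac{\partial g}{\partial x}(x, y)\,dx\,dy = \int_c^M \frac{1}{h} \int_{x_0}^{x_0 + h} \frac{\partial g}{\partial x}(x, y)\,dx\,dy + \frac{1}{h} \int_{x_0}^{x_0 + h} \int_M^N \frac{\partial g}{\partial x}(x, y)\,dy\,dx.$$

The ‘tail’ is estimated by

$$\Big|\frac{1}{h} \int_{x_0}^{x_0 + h} \int_M^N \frac{\partial g}{\partial x}(x, y)\,dy\,dx\Big| \le \sup_{x_0 \le x \le x_0 + h} \Big|\int_M^N \frac{\partial g}{\partial x}(x, y)\,dy\Big| < \varepsilon$$

by the uniform convergence of these integrals.

The result

$$\lim_{h \to 0} \int_c^M \frac{1}{h} \int_{x_0}^{x_0 + h} \frac{\partial g}{\partial x}(x, y)\,dx\,dy = \int_c^M \frac{\partial g}{\partial x}(x_0, y)\,dy$$

is Theorem 9.3.4.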


As an application of the last theorem, consider the differentiation of the Laplace transforms

$$F(s) = \int_0^\infty e^{-st} f(t)\,dt.$$

Formal differentiation yields

$$F'(s) = -\int_0^\infty e^{-st}\,t\,f(t)\,dt. \tag{9.18}$$

This formula can be justified for s > a if f(t) is continuous and there are positive constants $C_1$ and a such that $|f(t)| \le C_1 \exp(at)$.

Suppose that s − a ≥ 2ε > 0. There is a constant $C_2$ such that

$$t \le C_2 e^{\varepsilon t}, \quad 0 \le t < \infty,$$

which implies

$$|e^{-st}\,t\,e^{at}| \le C_2 e^{-\varepsilon t}.$$

For s ≥ a + 2ε the integral F(s) converges, and by Theorem 9.3.8 the integral in (9.18) converges uniformly. The validity of the formula for F′(s) then follows by Theorem 9.3.10.
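Formula (9.18) can be checked on a familiar transform. The Python sketch below assumes the sample function f(t) = sin(t), for which $F(s) = 1/(1 + s^2)$ and hence $F'(s) = -2s/(1 + s^2)^2$ for s > 0; the truncation point T, the helper `simpson`, and the sample values of s are also assumptions of ours. Since |sin(t)| ≤ 1, the growth condition holds with $C_1 = 1$ and any a > 0.

```python
import math

def simpson(func, a, b, n=20000):
    """Composite Simpson rule on [a, b] with n (even) subintervals."""
    h = (b - a) / n
    total = func(a) + func(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * func(a + i * h)
    return total * h / 3

def f(t):
    # Sample function (an assumption); its Laplace transform is 1 / (1 + s^2).
    return math.sin(t)

T = 60.0  # truncation point replacing the upper limit infinity (assumption)

def F(s):
    """Truncated Laplace transform: integral of exp(-s t) f(t) over [0, T]."""
    return simpson(lambda t: math.exp(-s * t) * f(t), 0.0, T)

def F_prime_formula(s):
    """Right side of (9.18), truncated to [0, T]."""
    return -simpson(lambda t: math.exp(-s * t) * t * f(t), 0.0, T)

def F_prime_exact(s):
    # Derivative of 1 / (1 + s^2).
    return -2 * s / (1 + s * s) ** 2

for s in (0.5, 1.0, 3.0):
    print(s, F(s), 1 / (1 + s * s), F_prime_formula(s), F_prime_exact(s))
```

For these values of s the discarded tail beyond T is far smaller than the quadrature error, so the last two columns should agree to many digits.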


9.4 Problems

1. Show that if f ≥ 0 and f : [α, β] → R is Riemann integrable, then the value of the Riemann integral of f is equal to

$$\sup_{[a,b] \subset (\alpha,\beta)} \int_a^b f(x)\,dx.$$

2. Suppose that f and g are both positive and Riemann integrable on every compact subinterval [a, b] ⊂ [0,∞).

(a) Assume that there is a number c such that 0 ≤ f(x) ≤ g(x) for x ≥ c. Show that $\int_0^\infty f(x)\,dx$ converges if $\int_0^\infty g(x)\,dx$ does.

(b) Assume that

$$\lim_{x \to \infty} \frac{f(x)}{g(x)} = M, \quad 0 < M < \infty.$$

Show that $\int_0^\infty f(x)\,dx$ converges if $\int_0^\infty g(x)\,dx$ does.

3. Suppose that for i = 1, 2 the functions $f_i : (\alpha, \beta) \to \mathbb{R}$ are integrable on every compact subinterval [a, b] ⊂ (α, β), and that the integrals $\int_\alpha^\beta f_i(x)\,dx$ converge absolutely. If $c_1$, $c_2$ are any real numbers, show that the integral $\int_\alpha^\beta \big(c_1 f_1(x) + c_2 f_2(x)\big)\,dx$ converges absolutely, and

$$\int_\alpha^\beta \big(c_1 f_1(x) + c_2 f_2(x)\big)\,dx = c_1 \int_\alpha^\beta f_1(x)\,dx + c_2 \int_\alpha^\beta f_2(x)\,dx.$$

4. Suppose that $\int_{-\infty}^{\infty} f$ converges absolutely. Show that

$$\int_{-\infty}^{\infty} f = \lim_{N \to \infty} \int_{-N}^{N} f.$$

5. Show that the integral

$$\int_1^\infty \frac{\sin(x)}{x}\,dx$$

does not converge absolutely. (Hint: compare an integral with a series.)

6. Resolve the following paradox. The function $1/x^2$ has an antiderivative $-1/x$. By the Fundamental Theorem of Calculus

$$\int_{-1}^{1} \frac{1}{x^2}\,dx = -\frac{1}{x}\Big|_{-1}^{1} = -2.$$

However the integral of the positive function $1/x^2$ should be positive.

7. Prove that the following integrals converge:

$$\int_1^\infty \frac{\sin(x)}{x}\,dx, \qquad \int_0^\infty \frac{\sin(x)}{x}\,dx.$$

8. In Theorem 9.2.9, show that

$$\Big|\int_0^\infty f(x)\sin(x)\,dx\Big| \le \Big|\int_0^\pi f(x)\sin(x)\,dx\Big|.$$

9. Extend Theorem 9.2.9 to integrals of the form

$$\int_0^\infty f(x)\sin(\omega x)\,dx \quad \text{and} \quad \int_0^\infty f(x)\cos(\omega x)\,dx$$

for ω ≠ 0.

10. Show that if p > 1 then

$$\int_1^\infty \sin(x^p)\,dx$$

converges. (Hint: use some calculus.)

11. Suppose that f : [1,∞) → R is positive and decreasing.

(a) Show that

$$\int_1^\infty f(x)\,dx < \infty$$

if and only if

$$\sum_{k=0}^{\infty} 2^k f(2^k) < \infty.$$

(b) Show that the integral

$$\int_2^\infty \frac{1}{x\log(x)}\,dx$$

diverges.

12. Suppose that f is continuously differentiable on R and

$$\int_0^1 |f'(t)|^2\,dt \le 1.$$

Show that

$$|f(x) - f(0)| \le \sqrt{x}, \quad 0 \le x \le 1.$$


13. The equation y″ + y = 0 has solutions $y_1(x) = \cos(x)$ and $y_2(x) = \sin(x)$.

(a) Show that

$$z(x) = \int_0^x \big[\sin(x)\cos(y) - \cos(x)\sin(y)\big]\,r(y)\,dy$$

satisfies the equation y″ + y = r(x).

(b) Show by differentiation that

$$z(x) = \int_0^x \sin(x - y)\,r(y)\,dy$$

satisfies the same equation. (Another approach is to use a trigonometric identity.)

14. Find solutions $w_1(x)$ and $w_2(x)$ of the equation w″ − w = 0 satisfying

$$w_1(0) = 1, \quad w_1'(0) = 0,$$

$$w_2(0) = 0, \quad w_2'(0) = 1.$$

Continue to develop the variation of parameters formulas as in the previous problem.

15. Justify the claims (9.16) and (9.17).

16. For k = 0, 1, 2, . . . , establish the formulas

$$\Big(\frac{1 - \cos(\omega)}{\omega}\Big)^{(2k)} = \int_0^1 (-1)^k x^{2k} \sin(\omega x)\,dx,$$

$$\Big(\frac{1 - \cos(\omega)}{\omega}\Big)^{(2k+1)} = \int_0^1 (-1)^k x^{2k+1} \cos(\omega x)\,dx.$$

17. Define

$$F(s) = \int_0^\infty e^{-st} f(t)\,dt, \quad F_1(s) = \int_0^\infty e^{-st} f'(t)\,dt, \quad \dots, \quad F_k(s) = \int_0^\infty e^{-st} f^{(k)}(t)\,dt.$$

Assuming sufficient decay at ∞, integrate by parts to relate $F_1(s)$ and F(s). Generalize to $F_k(s)$. What hypotheses are needed to justify the computations?

18. If

$$F(s) = \int_0^\infty e^{-st} f(t)\,dt,$$

show that $\lim_{s \to \infty} F(s) = 0$ if

$$\int_0^\infty f(t)\,dt$$

converges absolutely.

19. Assume that f : [0,∞) → R is integrable on compact intervals [a, b] ⊂ [0,∞), and that there are positive constants $C_1$ and a such that $|f(t)| \le C_1 \exp(at)$. Show that for any positive integer k, the integral

$$\int_0^\infty e^{-st}\,t^k f(t)\,dt$$

converges for s > a.

20. Consider these integration by parts problems.

(a) What hypotheses do you need to justify the formula

$$\int_{-\infty}^{\infty} f^{(2)}(x)\cos(\omega x)\,dx = -\omega^2 \int_{-\infty}^{\infty} f(x)\cos(\omega x)\,dx?$$

Generalize from $f^{(2)}(x)$ to $f^{(k)}(x)$ for k = 1, 2, 3, . . . .

(b) Compute

$$\int_{-\infty}^{\infty} x^2 e^{-x^2} \cos(\omega x)\,dx.$$

21. Assume that f : (−∞,∞) → R is continuous and that the integral

$$\int_{-\infty}^{\infty} f(x)\,dx$$

converges absolutely. If

$$C(\omega) = \int_{-\infty}^{\infty} f(x)\cos(\omega x)\,dx,$$

show that $\lim_{\omega \to \pm\infty} C(\omega) = 0$. (Hint: Evaluate and use $\int_a^b \cos(\omega x)\,dx$.)

22. The Gamma function is defined by

$$\Gamma(z) = \int_0^\infty e^{-t}\,t^{z-1}\,dt.$$

(a) Show that this integral converges if z > 0. It may help to write

$$t^{z-1} = e^{(z-1)\log(t)}.$$

What happens if z ≤ 0?

(b) Show that Γ(1) = 1 and that Γ(z + 1) = zΓ(z), so that Γ(n + 1) = n! for n = 0, 1, 2, . . . .

(c) Compute Γ′(z) and justify the calculation.

23. For y > 0 and x ∈ R let

$$u(x, y) = \frac{1}{\pi} \int_{-\infty}^{\infty} \frac{y}{y^2 + (x - t)^2}\,f(t)\,dt.$$

(a) Show that the integral converges if f(t) is bounded.

(b) Show that

$$\frac{\partial^2 u}{\partial x^2} + \frac{\partial^2 u}{\partial y^2} = 0.$$

24. For t > 0 and x ∈ R let

$$u(x, t) = \frac{1}{\sqrt{4\pi t}} \int_{-\infty}^{\infty} \exp\Big(-\frac{(x - y)^2}{4t}\Big) f(y)\,dy.$$

(a) Find some conditions on the growth of f(y) which will ensure convergence of this integral.

(b) Show that

$$\frac{\partial u}{\partial t} = \frac{\partial^2 u}{\partial x^2}.$$

Page 298: (Chapman & Hall_CRC Pure and Applied Mathematics) Robert Carlson-A Concrete Introduction to Real Analysis-Chapman and Hall_CRC (2006)

References

[1] D. Blatner. The Joy of π. Walker Publishing Co., United States,1997.

[2] D. Bressoud. A Radical Approach to Real Analysis. The Mathe-matical Association of America, United States, 1994.

[3] L. Euler. Introduction to Analysis of the Infinite. Springer-Verlag,New York, 1988.

[4] L. Euler. Foundations of Differential Calculus. Springer-Verlag,New York, 2000.

[5] P. Fitzpatrick. Advanced Calculus. PWS, Boston, 1996.

[6] G. Folland. Advanced Calculus. Prentice Hall, Upper Saddle River,2002.

[7] G. Hardy and E. Wright. An Introduction to the Theory of Num-bers. Oxford University Press, Oxford, 1984.

[8] S. Kleene. Mathematical Logic. Dover Publications, Mineola, 2002.

[9] M. Kline. Mathematical Thought from Ancient to Modern Times.Oxford University Press, Oxford, 1972.

[10] S. Krantz. Real Analysis and Foundations. Chapman Hall/CRC,Boca Raton, 2004.

[11] J. Marsden and M. Hoffman. Elementary Classical Analysis. W.H. Freeman, New York, 1993.

[12] A. Mattuck. Introduction to Analysis. Prentice Hall, Upper SaddleRiver, 1999.

[13] E. Mendelson. Introduction to Mathematical Logic. Chapman andHall, Boca Raton, 1997.

[14] O. Neugebauer. The Exact Sciences in Antiquity. Dover, Mineola,1969.

291

Page 299: (Chapman & Hall_CRC Pure and Applied Mathematics) Robert Carlson-A Concrete Introduction to Real Analysis-Chapman and Hall_CRC (2006)

292 References

[15] A. Peressini, F. Sullivan, and J. Uhl. The Mathematics of Non-linear Programming. Springer-Verlag, New York, 1988.

[16] W. Rudin. Principles of Mathematical Analysis. McGraw-Hill,New York, 1964.

[17] P. Schaefer. Sum-preserving rearrangements of infinite series. TheAmerican Mathematical Monthly, 88(1):33–40, 1981.

[18] D. Smith. A Source Book in Mathematics. Dover, Mineola, 1959.

[19] P. Suppes and S. Hill. First Course in Mathematical Logic. Dover,Mineola, 2002.

[20] D. Widder. Advanced Calculus. Dover, Mineola, 1989.