Introduction to Real Analysis
Manfred Stoll
English reprint copyright © 2004 by Pearson Education North Asia Limited and China Machine Press. Original English language title: Introduction to Real Analysis, Second Edition, by Manfred Stoll, EISBN 0321046250. Copyright © 2001 by Addison Wesley Longman, Inc. All rights reserved. Published by arrangement with the original publisher, Pearson Education, Inc., publishing as Addison Wesley Longman, Inc.
For sale and distribution in the People's Republic of China exclusively ( except Taiwan, Hong Kong SAR and Macao SAR)
ISBN 7111147472
Contents

Preface
To the Student

1 The Real Number System
1.1 Sets and Operations on Sets
1.2 Functions
1.3 Mathematical Induction
1.4 The Least Upper Bound Property
1.5 Consequences of the Least Upper Bound Property
1.6 Binary and Ternary Expansions
1.7 Countable and Uncountable Sets
Notes
Miscellaneous Exercises
Supplemental Reading

2 Sequences of Real Numbers
2.1 Convergent Sequences
2.2 Limit Theorems
2.3 Monotone Sequences
2.4 Subsequences and the Bolzano-Weierstrass Theorem
2.5 Limit Superior and Inferior of a Sequence
2.6 Cauchy Sequences
2.7 Series of Real Numbers
Notes
Miscellaneous Exercises
Supplemental Reading

3 Structure of Point Sets
3.1 Open and Closed Sets
3.2 Compact Sets
3.3 The Cantor Set
Notes
Miscellaneous Exercises
Supplemental Reading

4 Limits and Continuity
4.1 Limit of a Function
4.2 Continuous Functions
4.3 Uniform Continuity
4.4 Monotone Functions and Discontinuities
Notes
Miscellaneous Exercises
Supplemental Reading

5 Differentiation
5.1 The Derivative
5.2 The Mean Value Theorem
5.3 L'Hospital's Rule
5.4 Newton's Method
Notes
Miscellaneous Exercises
Supplemental Reading

6 The Riemann and Riemann-Stieltjes Integral
6.1 The Riemann Integral
6.2 Properties of the Riemann Integral
6.3 Fundamental Theorem of Calculus
6.4 Improper Riemann Integrals
6.5 The Riemann-Stieltjes Integral
6.6 Numerical Methods
6.7 Proof of Lebesgue's Theorem
Notes
Miscellaneous Exercises
Supplemental Reading

7 Series of Real Numbers
7.1 Convergence Tests
7.2 The Dirichlet Test
7.3 Absolute and Conditional Convergence
7.4 Square Summable Sequences
Notes
Miscellaneous Exercises
Supplemental Reading

8 Sequences and Series of Functions
8.1 Pointwise Convergence and Interchange of Limits
8.2 Uniform Convergence
8.3 Uniform Convergence and Continuity
8.4 Uniform Convergence and Integration
8.5 Uniform Convergence and Differentiation
8.6 The Weierstrass Approximation Theorem
8.7 Power Series Expansions
8.8 The Gamma Function
Notes
Miscellaneous Exercises
Supplemental Reading

9 Orthogonal Functions and Fourier Series
9.1 Orthogonal Functions
9.2 Completeness and Parseval's Equality
9.3 Trigonometric and Fourier Series
9.4 Convergence in the Mean of Fourier Series
9.5 Pointwise Convergence of Fourier Series
Notes
Miscellaneous Exercises
Supplemental Reading

10 Lebesgue Measure and Integration
10.1 Introduction to Measure
10.2 Measure of Open Sets; Compact Sets
10.3 Inner and Outer Measure; Measurable Sets
10.4 Properties of Measurable Sets
10.5 Measurable Functions
10.6 The Lebesgue Integral of a Bounded Function
10.7 The General Lebesgue Integral
10.8 Square Integrable Functions
Notes
Miscellaneous Exercises
Supplemental Reading

Appendix: Logic and Proofs
A.1 Propositions and Connectives
A.2 Rules of Inference
A.3 Mathematical Proofs
A.4 Use of Quantifiers
Supplemental Reading

Bibliography
Hints and Solutions to Selected Exercises
Notation Index
Index
Preface
The subject of real analysis is one of the fundamental areas of mathematics, and is the foundation for the study of many advanced topics, not only in mathematics, but also in engineering and the physical sciences. A thorough understanding of the concepts of real analysis has also become increasingly important for the study of advanced topics in economics and the social sciences. Topics such as Fourier series, measure theory, and integration are fundamental in mathematics and physics as well as engineering, economics, and many other areas.
Due to the increased importance of real analysis in many diverse subject areas, the typical first semester course on this subject has a varied student enrollment in terms of both ability and motivation. From my own experience, the audience typically includes mathematics majors, for whom this course represents the only rigorous treatment of analysis in their collegiate career, and students who plan to pursue graduate study in mathematics. In addition, there are mathematics education majors who need a strong background in analysis in preparation for teaching high school calculus. Occasionally, the enrollment includes graduate students in economics, engineering, physics, and other areas, who need a thorough treatment of analysis in preparation for additional graduate study either in mathematics or their own subject area. In an ideal situation, it would be desirable to offer separate courses for each of these categories of students. Unfortunately, staffing and enrollment usually make such choices impossible.
In the preparation of the text there were several goals I had in mind. The first was to write a text suitable for a one-year sequence in real analysis at the junior or senior level, providing a rigorous and comprehensive treatment of the theoretical concepts of analysis.
The topics chosen for inclusion are based on my experience in teaching graduate courses in mathematics, and reflect what I feel are minimal requirements for successful graduate study. I get to the least upper bound property as quickly as possible, and emphasize this important property in the text. For this reason, the algebraic properties of the rational and real number systems are treated very informally, and the construction of the real number system from the rational numbers is included only as a miscellaneous exercise. I have attempted to keep the proofs as concise as possible, and to let the subject matter progress in a natural manner. Topics or sections that are not specifically required in subsequent chapters are indicated by a footnote.
My second goal was to make the text understandable to the typical student enrolled in the course, taking into consideration the variations in abilities, background, and motivation. For this reason, Chapters 1 through 6 have been written with the intent to be accessible to the average student, while at the same time challenging the more talented student through the exercises. The basic topological concepts of open, closed, and compact sets, as well as limits of sequences and functions, are introduced for the real numbers only. However, the proofs of many of the theorems, especially those involving topological concepts, are presented in a manner that permits easy extensions to more abstract settings. These chapters also include a large number of examples and more routine and computational exercises. Chapters 7 through 10 assume that the students have achieved some level of expertise in the subject. In these chapters, function spaces are introduced and studied in greater detail. The theorems, examples, and exercises require greater sophistication and mathematical maturity for full understanding. From my own experiences, these are not unrealistic expectations.
The book contains most of the standard topics one would expect to find in an introductory text on real analysis: limits of sequences, limits of functions, continuity, differentiation, integration, series, sequences and series of functions, and power series. These topics are basic to the study of real analysis and are included in most texts at this level. In addition, I have included a number of topics that are not always included in comparable texts. For instance, Chapter 6 contains a section on the Riemann-Stieltjes integral, and a section on numerical methods. Chapter 7 also includes a section on square summable sequences and a brief introduction to normed linear spaces. Both of these concepts appear again in later chapters of the text. In Chapter 8, to prove the Weierstrass approximation theorem, I use the method of approximate identities. This exposes the student to a very important technique in analysis that is used again in the chapter on Fourier series. The study of Fourier series, and the representation of functions in terms of series of orthogonal functions, has become increasingly important in many diverse areas. The inclusion of Fourier series in the text allows the student to gain some exposure to this important subject, without the necessity of taking a full semester course on partial differential equations. In the final chapter I have also included a detailed treatment of Lebesgue measure and the Lebesgue integral. The approach to measure theory follows the original method of Lebesgue, using inner and outer measure. This provides an intuitive and leisurely approach to this very important topic.
The exercises at the end of each section are intended to reinforce the concepts of the section and to help the students gain experience in developing their own proofs.
Although the text contains some routine and computational problems, many of the exercises are designed to make the students think about the basic concepts of analysis, and to challenge their creativity and logical thinking. Solutions and hints to selected exercises are included at the end of the text. These problems are marked by an asterisk (*).
At the end of each chapter I have also included a section of notes on the chapter, miscellaneous exercises, and a supplemental reading list. The notes in many cases provide historical comments on the development of the subject, or discuss topics not included in the chapter. The miscellaneous exercises are intended to extend the subject matter of the text or to cover topics that, although important, are not covered in the chapter itself. The supplemental reading list provides references to topics that relate to the subject under discussion. Some of the references provide historical information; others provide alternative solutions of results or interesting related problems. Most of the articles appear in the American Mathematical Monthly or Mathematics Magazine, and should be easily accessible for students' reference.
To cover all the chapters in a one-year sequence is perhaps overly ambitious. However, from my own experience in teaching the course, with a judicious choice of topics it is possible to cover most of the text in two semesters. A one-semester course should at a minimum include all or most of the first five chapters, and part or all of Chapter 6 or Chapter 7. The latter chapter can be taught independently of Chapter 6; the only dependence on Chapter 6 is the integral test, and this can be covered without a theoretical treatment of Riemann integration. The remaining topics should be more than sufficient for a full second semester. The only formal prerequisite for reading the text is a standard three- or four-semester sequence in calculus. Even though an occasional talented student has completed one semester of this course during their sophomore year, some mathematical maturity is expected, and the average student might be advised to take the course during their junior or senior year.
Features New to the Second Edition
In content, the second edition remains primarily unchanged from the first. The subject of real analysis has not changed significantly since publication of the first edition. In this edition I have incorporated many of the valuable suggestions from reviewers, instructors, and students. Some new topics have been included, and the presentation of others has been revised. New examples and revised explanations appear throughout this edition of the text. The second edition also contains additional illustrations and expanded problem sets. The problem sets in all sections of the first six chapters have been expanded to include more routine and computational problems. The challenging problems are still there. With the addition of more routine problems, instructors using this text will have greater flexibility in the assignment of exercises. The supplemental reading lists have all been updated to include relevant articles that have appeared since 1996.
Two of the more substantive changes are the inclusion of a proof of Lebesgue's theorem in Chapter 6, and the addition of an appendix on logic and proofs. In the first edition, Lebesgue's theorem was stated in Section 6.1 and then proved in Chapter 10. At the recommendation of my colleague Anton Schep, I have included a self-contained proof of Lebesgue's theorem as a separate section in Chapter 6. The proof is based on notes that he has used to supplement the text. In the proof, as in the statement of the theorem, the only reference to measure theory is the definition of a set of measure zero. With this change it is now possible not only to state but also to prove this important theorem without first having to develop the theory of Lebesgue measure and integration.
The greatest difficulty facing many students taking a course in real analysis is the ability to write and to understand proofs. Most have never had a course in mathematical logic. For this reason I have included a brief appendix on logic and proofs. The appendix is not intended to replace a formal course in logic; it is only intended to introduce the rules of logic that students need to know in order to better understand proofs. These rules are also crucial in helping students develop the ability to write their own proofs. The various methods of proof are discussed in detail, and examples of each method are included and analyzed. The appendix also includes a section on the use of quantifiers, with special emphasis on the proper negation of quantified sentences. The appendix itself is independent of the text; however, references to it are included throughout the first several chapters of the text. The appendix can be included as part of the course, or assigned as independent reading.
Acknowledgments
I would like to thank the students at South Carolina who have learned this material from me, or my colleagues, from preliminary versions of this text. Your criticisms, comments, and suggestions were appreciated. I am also indebted to those colleagues, especially the late Jeong Yang, who agreed to use the manuscript in their courses.
Special thanks are also due to the reviewers who examined the manuscript for the first edition and provided constructive criticisms and suggestions for its improvement: Joel Anderson, Pennsylvania State University; Bogdan Baishanski, Ohio State University; Robert Brown, University of Kansas; Donald Edmondson, University of Texas at Austin; Kevin Grasse, University of Oklahoma; Harvey Greenwald, California Polytechnic State University; Adam Helfer, University of Missouri, Columbia; Jan Kucera, Washington State University; Thomas Reidel, University of Louisville; Joel Robin, University of Wisconsin, Madison; Stuart Robinson, Cleveland State University; Dan Shea, University of Wisconsin, Madison; Richard B. Sher, University of North Carolina; Thomas Smith, Manhattan College. Your careful reading of the manuscript helped to turn the preliminary drafts into a polished text.
I would also like to thank Carolyn Lee-Davis and the staff at Addison Wesley Longman for their assistance in the preparation of the second edition, and the reviewers for this edition for their comments and recommendations: William Barnier, Sonoma State University; Rene Barrientos, Miami Dade Community College; Denine Burkett, Lock Haven University; Steve Deckelman, University of Wisconsin; Lyn Geisler, Randolph-Macon College; Constant Goutziers, State University of New York; Christopher Heil, Georgia Institute of Technology; William Stout, Salve Regina University. Special thanks go to my colleague, George McNulty, for his careful reading of the appendix. His constructive criticisms and suggestions were appreciated.
I am also grateful to the readers who informed me of errors in the first edition, and to the instructors who conveyed to me some of the difficulties encountered while using the book as a text. Hopefully all of the errors and shortcomings of the first edition have been corrected. Finally, I would especially like to thank my wife, Mary Lee, without whose encouragement this project might never have been completed.
Manfred Stoll
To the Student
The difference between a course on calculus and a course on real analysis is analogous to the difference in the approach to the subject prior to the nineteenth century and since that time. Most of the topics in calculus were developed in the late seventeenth and eighteenth centuries by such prominent mathematicians as Newton, Leibniz, Bernoulli, Euler, and many others. Newton and Leibniz developed the differential and integral calculus; their successors extended and applied the theory to many problems in mathematics and the physical sciences. They had phenomenal insight into the problems, and
were extremely proficient and ingenious in deriving complex formulas. What they lacked, however, were the tools to place the subject on a rigorous mathematical foundation. This did not occur until the nineteenth century with the contributions of Cauchy, Bolzano, Weierstrass, Cantor, and many others.
In calculus, the emphasis is primarily on developing expertise in computational techniques and applications. In real analysis, you will be expected to understand the concepts and to develop the ability to prove results using the definitions and previous theorems. Understanding the concept of a limit, and proving results about limits, will be significantly more important than computing limits. To accomplish this, it is essential that all definitions and statements of theorems be learned precisely. Most of the proofs of the theorems and solutions of the problems are logical consequences of the definitions and previous results; some, however, do require ingenuity and creativity. The text contains numerous examples and counterexamples to illustrate the particular topics under discussion. These are included to show why certain hypotheses are required, and to help develop a more thorough understanding of the subject. It is crucial that you not only learn what is true, but that you also have sufficient counterexamples
at your disposal. I have included hints and answers to selected exercises at the end of the text; these are indicated by an asterisk (*). For some of the problems I have provided complete details; for others I have provided only brief hints, leaving the details to you. As always, you are encouraged to first attempt the exercises, and to look at the hints or solutions only after repeated attempts have been unsuccessful.
At the end of each chapter I have included a supplemental reading list. The journal articles or books are all related to the topics in the chapter. Some provide historical information or extensions of the topics to more general settings; others provide alternative solutions of results in the text, or solutions of interesting related problems. All of the articles should be accessible in your library. They are included to encourage you to develop the habit of looking into the mathematical literature. An excellent source for additional historical information and biographies of famous mathematicians is the MacTutor History of Mathematics archive at the University of St. Andrews, Scotland. The URL of their webpage is http://www-history.mcs.st-andrews.ac.uk/
On reading the text you will inevitably encounter topics, formulas, or examples that may appear too technical and difficult to comprehend. Skip them for the moment; there will be plenty for you to understand in what follows. Upon later reading the section, you may be surprised that it is not nearly as difficult as previously imagined. Concepts that initially appear difficult become clearer once you develop a greater understanding of the subject. It is important to keep in mind that many of the examples and topics that appear difficult to you were most likely just as difficult to the mathematicians of the era in which they first appeared.
The material in the text is self-contained and independent of calculus. I do not use any results from calculus in the definitions and development of the subject matter. Occasionally, however, in the examples and exercises I do assume knowledge of the elementary functions and of notation and concepts that should have been encountered elsewhere. These concepts will be defined carefully at the appropriate place in the text.
Manfred Stoll
1 The Real Number System
1.1 Sets and Operations on Sets
1.2 Functions
1.3 Mathematical Induction
1.4 The Least Upper Bound Property
1.5 Consequences of the Least Upper Bound Property
1.6 Binary and Ternary Expansions
1.7 Countable and Uncountable Sets
The key to understanding many of the fundamental concepts of calculus, such as limits, continuity, and the integral, is the least upper bound property of the real number system $\mathbb{R}$. As we all know, the rational number system contains gaps. For example, there does not exist a rational number $r$ such that $r^2 = 2$, i.e., $\sqrt{2}$ is irrational. The fact that the rational numbers do contain gaps makes them inadequate for any meaningful discussion of the above concepts.
The standard argument used in proving that the equation $r^2 = 2$ does not have a solution in the rational numbers goes as follows: Suppose that there exists a rational number $r$ such that $r^2 = 2$. Write $r = m/n$ where $m$, $n$ are integers that are not both even. Thus $m^2 = 2n^2$. Therefore $m^2$ is even, and hence $m$ itself must be even. But then $m^2$, and hence also $2n^2$, are both divisible by 4. Therefore $n^2$ is even, and as a consequence $n$ is also even. This, however, contradicts our assumption that not both $m$ and $n$ are even. The method of proof used in this example is proof by contradiction; namely, we assume the negation of the conclusion and arrive at a logical contradiction. A discussion of the various methods of proof is included in Section A.3 of the Appendix.
The above argument shows that there does not exist a rational number $r$ such that $r^2 = 2$. This argument was known to Pythagoras (around 500 B.C.), and even the Greek mathematicians of this era noted that the straight line contains many more points than the rational numbers. It was not until the nineteenth century, however, when mathematicians became concerned with putting calculus on a firm mathematical footing, that the development of the real number system was accomplished. The construction of the real number system is attributed to Richard Dedekind (1831-1916) and Georg Cantor (1845-1917), both of whom published their results independently in 1872. Dedekind's aim was the construction of a number system, with the same completeness as the real line, using only the basic postulates of the integers and the principles of set theory. Instead of constructing the real numbers, we will assume their existence and examine the least upper bound property. As we will see, this property is the key to many basic facts about the real numbers that are usually taken for granted in the study of calculus.
In Chapter 1 we will assume a basic understanding of the concept of a set and also of both the rational and real number systems. In Section 1.4 we will briefly review the algebraic and order properties of both the rational and real number systems and discuss the least upper bound property. By example we will show that this property fails for the rational numbers. In the subsequent two sections we will prove several elementary consequences of the least upper bound property. In Section 1.7 we define the notion of a countable set and consider some of the basic properties of countable sets. Among the key results of this section are that the rational numbers are countable, whereas the real numbers are not.
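As a small computational companion to the argument that $r^2 = 2$ has no rational solution, the following Python sketch searches for positive integers $m$, $n$ with $m^2 = 2n^2$ up to a chosen bound on $n$. The function name and the search bound are illustrative assumptions and do not come from the text. A finite search of this kind cannot replace the proof by contradiction; it only shows that no counterexample turns up in the range examined.

```python
from math import isqrt


def has_rational_square_root_of_two(max_denominator: int) -> bool:
    """Return True if some m/n with 1 <= n <= max_denominator has m^2 = 2 n^2."""
    for n in range(1, max_denominator + 1):
        target = 2 * n * n
        m = isqrt(target)  # largest integer m with m*m <= target
        if m * m == target:
            return True  # would mean (m/n)^2 = 2 exactly
    return False


# The proof guarantees that this search never succeeds, whatever the bound.
print(has_rational_square_root_of_two(10_000))
```

Integer arithmetic is used throughout so that the comparison $m^2 = 2n^2$ is exact rather than subject to floating-point rounding.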
1.1 Sets and Operations on Sets
Sets are constantly encountered in mathematics. One speaks of sets of points, collections of real numbers, and families of functions. A set is conceived simply as a collection of definable objects. The words set, collection, and family are all synonymous. The notation $x \in A$ means that $x$ is an element of the set $A$; the notation $x \notin A$ means that $x$ is not an element of the set $A$. The set containing no elements is called the empty set and will be denoted by $\emptyset$.
A set can be described by listing its elements, usually within braces { }. For example,
$A = \{1, 2, 4, 5\}$
describes the set consisting of the numbers 1, 2, 4, and 5. More generally, a set $A$ may be defined as the collection of all elements $x$ in some larger collection satisfying a given property. Thus the notation
$A = \{x : P(x)\}$
defines $A$ to be the set of all objects $x$ having the property $P(x)$. This is usually read as "$A$ equals the set of all elements $x$ such that $P(x)$." For example, if $x$ ranges over all real numbers, the set $A$ defined by
A = {x:1 0, there exists a positive real number x such that
$x^2 = y$.
Intuitively we know that such an $x$ exists; namely, the square root of $y$. However, a rigorous proof of the existence of such an $x$ will require the least upper bound property of the real numbers. In Example 1.4.8 we will prove that for each $y > 0$ there exists a unique positive real number $x$ such that $x^2 = y$. The number $x$ is called the square root of $y$ and is denoted by $\sqrt{y}$. Thus the inverse function of $f$ is given by
$f^{-1}(y) = \sqrt{y}, \quad \operatorname{Dom} f^{-1} = \{y \in \mathbb{R} : y > 0\}.$

Composition of Functions
Suppose $f$ is a function from $A$ to $B$ and $g$ is a function from $B$ to $C$. If $a \in A$, then $f(a)$ is an element of $B$, the domain of $g$. Consequently we can apply the function $g$ to $f(a)$ to obtain the element $g(f(a))$ in $C$. This process, illustrated in Figure 1.7, gives a new function $h$ which maps $a \in A$ to $g(f(a))$ in $C$.

Figure 1.7 Composition of $g$ with $f$

1.2.10 DEFINITION If $f$ is a function from $A$ to $B$ and $g$ is a function from $B$ to $C$, then the function $g \circ f : A \to C$, defined by
$g \circ f = \{(x, z) \in A \times C : z = g(f(x))\}$
is called the composition of $g$ with $f$.

If $f$ is a one-to-one function from $A$ into $B$, then it can be shown that $(f^{-1} \circ f)(x) = x$ for all $x \in A$ and that $(f \circ f^{-1})(y) = y$ for all $y \in \operatorname{Range} f$ (Exercise 10). This is illustrated in (b) of the following example.
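1.2.11 EXAMPLES
(a) If $f(x) = \sqrt{1 + x}$ with $\operatorname{Dom} f = \{x \in \mathbb{R} : x \ge -1\}$ and $g(x) = x^2$, $\operatorname{Dom} g = \mathbb{R}$, then
$(g \circ f)(x) = g(f(x)) = (\sqrt{1 + x})^2 = 1 + x, \quad \operatorname{Dom}(g \circ f) = \{x \in \mathbb{R} : x \ge -1\}.$
Even though the equation $(g \circ f)(x) = 1 + x$ is defined for all real numbers $x$, the domain of the composite function $g \circ f$ is still only the set $\{x \in \mathbb{R} : x \ge -1\}$. For this example, since $\operatorname{Range} g \subset \operatorname{Dom} f$, we can also find $f \circ g$; namely,
$(f \circ g)(x) = f(g(x)) = \sqrt{1 + x^2}, \quad \operatorname{Dom}(f \circ g) = \mathbb{R}.$
(b) For the function $f$ in (a), the inverse function $f^{-1}$ is given by
$f^{-1}(y) = y^2 - 1, \quad \operatorname{Dom} f^{-1} = \operatorname{Range} f = \{y \in \mathbb{R} : y \ge 0\}.$
Thus for $x \in \operatorname{Dom} f$,
$(f^{-1} \circ f)(x) = f^{-1}(f(x)) = (f(x))^2 - 1 = (\sqrt{1 + x})^2 - 1 = x,$
and for $y \ge 0$,
$(f \circ f^{-1})(y) = f(f^{-1}(y)) = \sqrt{1 + (y^2 - 1)} = y.$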
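The compositions in Example 1.2.11 can be mirrored in a short Python sketch, which may help in seeing why the domain of $g \circ f$ is inherited from $f$ even though the formula $1 + x$ makes sense for every real $x$. The helper names `compose` and `f_inverse`, and the choice to signal "outside the domain" with a `ValueError`, are assumptions made for illustration; floating-point arithmetic is only an approximation of the real-number statements.

```python
import math


def f(x: float) -> float:
    """f(x) = sqrt(1 + x), with Dom f = {x : x >= -1}."""
    if x < -1:
        raise ValueError("x is outside Dom f")
    return math.sqrt(1 + x)


def g(x: float) -> float:
    """g(x) = x^2, with Dom g = R."""
    return x * x


def f_inverse(y: float) -> float:
    """f^{-1}(y) = y^2 - 1, with Dom f^{-1} = Range f = {y : y >= 0}."""
    if y < 0:
        raise ValueError("y is outside Dom f^{-1}")
    return y * y - 1


def compose(outer, inner):
    """Return the function x -> outer(inner(x))."""
    return lambda x: outer(inner(x))


g_of_f = compose(g, f)  # defined only where f is defined
f_of_g = compose(f, g)  # defined on all of R, since g(x) >= 0 >= -1
```

For instance, `g_of_f(3.0)` evaluates $1 + 3$, while `g_of_f(-2.0)` raises an error because $-2$ is not in $\operatorname{Dom} f$, even though $1 + (-2)$ is a perfectly good number.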
EXERCISES 1.2
1. Let $A = \{-1, 0, 1, 2\}$ and $B = \mathbb{N}$. Which of the following subsets of $A \times B$ is a function from $A$ into $B$? Explain your answer.
a. $f = \{(1, 2), (0, 3), (2, 5)\}$
b. $g = \{(1, 2), (0, 7), (1, 3), (2, 7)\}$
c. $h =$ ... 1)}
d. $k = \{(x, y) : y = 2x + 3, x \in A\}$
2. *a. Let $A = \{(x, y) \in \mathbb{R} \times \mathbb{R} : x^2 + y^2 = 1\}$. Is $A$ a function? Explain your answer.
b. Let $B = \{(x, y) \in A : y \ge 0\}$. Is $B$ a function? Explain your answer.
3. Let $f : \mathbb{N} \to \mathbb{N}$ be the function defined by $f(n) = 2n - 1$. Find $f(E)$ and $f^{-1}(E)$ for each of the following subsets $E$ of $\mathbb{N}$.
*a. $\{1, 2, 3, 4\}$
b. $\{1, 3, 5, 7\}$
c. $\mathbb{N}$
4. Let $f = \{(x, y) : x \in \mathbb{R}, y = x^3 + 1\}$.
*a. Let A = {x : 1 A be functions satisfying $(g \circ f)(x) = x$ for all $x \in A$. Show that $f$ is a one-to-one function. Must $f$ be onto $B$?
12. If $f : A \to B$ and $g : B \to C$ are one-to-one functions, show that $(g \circ f)^{-1} = f^{-1} \circ g^{-1}$ on $\operatorname{Range}(g \circ f)$.

1.3 Mathematical Induction
Throughout the text we will on occasion need to prove a statement, identity, or inequality involving the positive integer $n$. As an example, consider the following identity. For each $n \in \mathbb{N}$,
$r + r^2 + \cdots + r^n = \frac{r - r^{n+1}}{1 - r}, \quad r \ne 1.$
Mathematical induction is a very useful tool in establishing that such an identity is valid for all positive integers $n$.
1.3.1 THEOREM (Principle of Mathematical Induction) For each $n \in \mathbb{N}$, let $P(n)$ be a statement about the positive integer $n$. If
(a) $P(1)$ is true, and
(b) $P(k + 1)$ is true whenever $P(k)$ is true,
then $P(n)$ is true for all $n \in \mathbb{N}$.

The proof of this theorem depends on the fact that the positive integers are well-ordered; namely, every nonempty subset of $\mathbb{N}$ has a smallest element. This statement is usually taken as a postulate or axiom for the positive integers; we do so in this text. Since it will be used on several other occasions, we state it both for completeness and emphasis.

1.3.2 WELL-ORDERING PRINCIPLE Every nonempty subset of $\mathbb{N}$ has a smallest element.

The well-ordering principle can be restated as follows: If $A \subset \mathbb{N}$, $A \ne \emptyset$, then there exists $n \in A$ such that $n \le k$ for all $k \in A$.
To prove Theorem 1.3.1 we will use the method of proof by contradiction. Most theorems involve showing that the statement $P$ implies the statement $Q$; namely, if $P$ is true, then $Q$ is true. In a proof by contradiction one assumes that $P$ is true and $Q$ is false, and then shows that these two assumptions lead to a logical contradiction; namely, one shows that some statement $R$ is both true and false. Further details on the method of proof by contradiction are provided in Section A.3 of the Appendix.
Proof of Theorem 1.3.1. Assume that the hypotheses of Theorem 1.3.1 are true, but that the conclusion is false; that is, there exists a positive integer $n$ such that the statement $P(n)$ is false. Let
$A = \{k \in \mathbb{N} : P(k) \text{ is false}\}.$
By our assumption, the set $A$ is nonempty. Thus by the well-ordering principle $A$ has a smallest element $k_0$. Since $P(1)$ is true, $k_0 > 1$. Also, since $k_0$ is the smallest element of $A$, the positive integer $k_0 - 1$ is not in $A$, so $P(k_0 - 1)$ is true. But then by hypothesis (b), $P(k_0)$ is also true. Thus $k_0 \notin A$. This, however, is a contradiction. Consequently, $P(n)$ must be true for all $n \in \mathbb{N}$. □
1.3.3 EXAMPLES We now provide two examples to illustrate the method of proof by mathematical induction. The first example provides a proof of the identity in the introduction to the section. An alternative method of proof will be requested in the exercises (Exercise 7).
(a) To use mathematical induction, we let our statement $P(n)$, $n \in \mathbb{N}$, be as follows:
$r + \cdots + r^n = \frac{r - r^{n+1}}{1 - r}, \quad r \ne 1.$
When $n = 1$ we have
$r = \frac{r(1 - r)}{1 - r} = \frac{r - r^2}{1 - r}, \quad \text{provided } r \ne 1.$
Thus the identity is valid for $n = 1$. Assume $P(k)$ is true for $k \ge 1$; i.e.,
$r + \cdots + r^k = \frac{r - r^{k+1}}{1 - r}, \quad r \ne 1.$
We must now show that the statement $P(k + 1)$ is true; that is,
$r + \cdots + r^{k+1} = \frac{r - r^{(k+1)+1}}{1 - r}, \quad r \ne 1.$
But
$r + \cdots + r^{k+1} = r + \cdots + r^k + r^{k+1},$
which by the induction hypothesis
$= \frac{r - r^{k+1}}{1 - r} + r^{k+1} = \frac{r - r^{k+1} + (1 - r) r^{k+1}}{1 - r} = \frac{r - r^{(k+1)+1}}{1 - r}.$
Thus the identity is valid for $k + 1$, and hence by the principle of mathematical induction for all $n \in \mathbb{N}$.
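The identity in Example 1.3.3(a) can also be spot-checked numerically. The sketch below compares the sum $r + r^2 + \cdots + r^n$ with the closed form $(r - r^{n+1})/(1 - r)$ using exact rational arithmetic, so that equality can be tested without rounding error. The function names are hypothetical and chosen only for this illustration. Agreement for finitely many values of $r$ and $n$ is of course no proof; the induction argument is what establishes the identity for every $n$.

```python
from fractions import Fraction


def geometric_sum(r: Fraction, n: int) -> Fraction:
    """Compute r + r^2 + ... + r^n term by term."""
    return sum((r ** k for k in range(1, n + 1)), Fraction(0))


def closed_form(r: Fraction, n: int) -> Fraction:
    """Compute (r - r^(n+1)) / (1 - r); the formula requires r != 1."""
    if r == 1:
        raise ValueError("the closed form is undefined for r = 1")
    return (r - r ** (n + 1)) / (1 - r)


# Compare both sides for a few sample ratios and the first ten values of n.
for r in (Fraction(1, 2), Fraction(-3, 4), Fraction(5)):
    for n in range(1, 11):
        assert geometric_sum(r, n) == closed_form(r, n)
```

The excluded case $r = 1$ shows up here as a division by zero in the closed form, which is why the sketch rejects it explicitly.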
(b) For our second example, we use mathematical induction to prove Bernoulli's inequality. If $h > -1$, then
$(1 + h)^n \ge 1 + nh$ for all $n \in \mathbb{N}$.
When n = 1, $(1 + h)^1 = 1 + h$. Thus since equality holds, the inequality is certainly valid. Assume that the inequality is true when n = k, $k \ge 1$. Then for n = k + 1,
$(1 + h)^{k+1} = (1 + h)^k (1 + h),$
which by the induction hypothesis and the fact that (1 + h) > 0
$\ge (1 + kh)(1 + h) = 1 + (k + 1)h + kh^2 \ge 1 + (k + 1)h.$
Therefore the inequality holds for n = k + 1, and thus by the principle of mathematical induction for all $n \in \mathbb{N}$.
Although the statement of Theorem 1.3.1 starts with n = 1, the result is still true if we start with any integer $n_0 \in \mathbb{Z}$. The modified principle of mathematical induction is as follows: If for each $n \in \mathbb{Z}$, $n \ge n_0$, P(n) is a statement about the integer n satisfying
(a') $P(n_0)$ is true, and
(b') P(k + 1) is true whenever P(k) is true, $k \ge n_0$,
then P(n) is true for all $n \in \mathbb{Z}$, $n \ge n_0$. The proof of this follows from Theorem 1.3.1 by simply setting
$Q(n) = P(n + n_0 - 1), \quad n \in \mathbb{N},$
which is now a statement about the positive integer n.
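The two identities proved by induction in Examples 1.3.3 can also be spot-checked numerically. The following is only a sketch, not a proof: it assumes the geometric-sum identity and Bernoulli's inequality in the forms stated above (with $r \neq 1$ and $h > -1$), and the function names are illustrative choices. Exact rational arithmetic is used so that no rounding error enters the comparison.

```python
from fractions import Fraction


def geometric_sum(r, n):
    """Left-hand side: r + r^2 + ... + r^n, summed term by term."""
    return sum(r ** k for k in range(1, n + 1))


def closed_form(r, n):
    """Right-hand side: (r - r^(n+1)) / (1 - r); only meaningful for r != 1."""
    return (r - r ** (n + 1)) / (1 - r)


def bernoulli_holds(h, n):
    """Bernoulli's inequality (1 + h)^n >= 1 + n*h, as stated for h > -1."""
    return (1 + h) ** n >= 1 + n * h


# Check P(1), ..., P(20) for a few sample ratios (hypothetical test values).
ratios = [Fraction(1, 2), Fraction(-3, 7), Fraction(5)]
identity_ok = all(
    geometric_sum(r, n) == closed_form(r, n)
    for r in ratios
    for n in range(1, 21)
)
```

A finite check of this kind can expose a mistake in a conjectured formula, but only the induction argument establishes the statement for every $n \in \mathbb{N}$.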
Remark. In the principle of mathematical induction, the hypothesis that P(1) be true is essential. For example, consider the statement P(n):
$n + 1 = n, \quad n \in \mathbb{N}.$
This is clearly false! However, if we assume that P(k) is true, then we also obtain that P(k + 1) is true. Thus it is absolutely essential that $P(n_0)$ be true for at least one fixed value of $n_0$.
There is a second version of the principle of mathematical induction that is also quite useful.
1.3.4
THEOREM (Second Principle of Mathematical Induction) For each $n \in \mathbb{N}$, let P(n) be a statement about the positive integer n. If
(a) P(1) is true, and
(b) for k > 1, P(k) is true whenever P(j) is true for all positive integers j < k,
then P(n) is true for all $n \in \mathbb{N}$.
Proof. Exercise 3. □
Mathematical induction is also used in the recursive definition of functions defined for the positive integers. In this procedure, we give an initial value of the function f at n = 1; then, assuming that f has been defined for all integers k = 1, ..., n, the value of f at n + 1 is given in terms of the values of f at k, $k \le n$. This is illustrated by the following examples.
1.3.5
EXAMPLES
(a) Suppose $f : \mathbb{N} \to \mathbb{N}$ is defined by f(1) = 1 and f(n + 1) = n f(n), $n \in \mathbb{N}$. The values of f for n = 1, 2, 3, 4 are given as follows:
$f(1) = 1, \quad f(2) = 1 \cdot f(1) = 1, \quad f(3) = 2 f(2) = 2 \cdot 1, \quad f(4) = 3 f(3) = 3 \cdot 2 \cdot 1.$
Thus we conjecture that $f(n) = (n - 1)!$, where 0! is defined to be equal to one, and for $n \in \mathbb{N}$, n! (read n factorial) is defined as
$n! = n \cdot (n - 1) \cdots 2 \cdot 1.$
The conjecture is certainly true when n = 1. Thus assume that it is true for n = k, $k \ge 1$; that is, $f(k) = (k - 1)!$. Then for n = k + 1,
$f(k + 1) = k f(k),$
which by the induction hypothesis
$= k (k - 1)! = k!.$
Therefore the identity holds for n = k + 1, and thus by the principle of mathematical induction, for all $n \in \mathbb{N}$.
(b) For our second example, consider the function $f : \mathbb{N} \to \mathbb{R}$ defined by f(1) = 0, $f(2) = \tfrac13$, and for n > 2 by $f(n) = \left(\dfrac{n - 1}{n + 1}\right) f(n - 2)$. Computing the values of f for n = 3, 4, 5, and 6, we have
$f(3) = 0, \quad f(4) = \tfrac15, \quad f(5) = 0, \quad f(6) = \tfrac17.$
From these values we conjecture that
$f(n) = 0$ if n is odd, and $f(n) = \dfrac{1}{n + 1}$ if n is even.
To prove our conjecture we will use the second principle of mathematical induction. Our conjecture is certainly true for n = 1, 2. Suppose n > 2, and suppose our conjecture holds for all k < n. If n is odd, then so is (n − 2), and thus by the induction hypothesis f(n − 2) = 0. Therefore f(n) = 0. On the other hand, if n is even, so is (n − 2). Thus by the induction hypothesis $f(n - 2) = \dfrac{1}{n - 1}$. Therefore
$f(n) = \left(\dfrac{n - 1}{n + 1}\right) f(n - 2) = \left(\dfrac{n - 1}{n + 1}\right) \dfrac{1}{n - 1} = \dfrac{1}{n + 1}.$ □
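The recursive definitions in Examples 1.3.5 translate directly into short programs, which makes the conjectured closed forms easy to test before attempting a proof. The sketch below assumes the recursions read f(1) = 1, f(n + 1) = n f(n) for part (a), and f(1) = 0, f(2) = 1/3, f(n) = ((n − 1)/(n + 1)) f(n − 2) for part (b); the function names `f_a`, `f_b`, and `conjecture_b` are hypothetical.

```python
import math
from fractions import Fraction


def f_a(n):
    """Example (a): f(1) = 1 and f(k + 1) = k * f(k), unrolled iteratively."""
    value = 1
    for k in range(1, n):
        value = k * value  # apply f(k + 1) = k * f(k)
    return value


def f_b(n):
    """Example (b): f(1) = 0, f(2) = 1/3, f(m) = ((m - 1)/(m + 1)) * f(m - 2)."""
    values = {1: Fraction(0), 2: Fraction(1, 3)}
    for m in range(3, n + 1):
        values[m] = Fraction(m - 1, m + 1) * values[m - 2]
    return values[n]


def conjecture_b(n):
    """Conjectured closed form: 0 for odd n, 1/(n + 1) for even n."""
    return Fraction(0) if n % 2 == 1 else Fraction(1, n + 1)


# Compare the recursions against the conjectured formulas on an initial range.
a_matches = all(f_a(n) == math.factorial(n - 1) for n in range(1, 15))
b_matches = all(f_b(n) == conjecture_b(n) for n in range(1, 51))
```

Note that `f_b` looks two steps back, which is why the proof in part (b) uses the second principle of mathematical induction rather than the first.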
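EXERCISES 1.3
1. Use mathematical induction to prove that each of the following identities is valid for all $n \in \mathbb{N}$.
a. $1 + 2 + 3 + \cdots + n = \dfrac{n(n + 1)}{2}$
*b. $1^2 + 2^2 + \cdots + n^2 = \dfrac{n(n + 1)(2n + 1)}{6}$
c. $1 + 3 + 5 + \cdots + (2n - 1) = n^2$
d. $1^3 + 2^3 + \cdots + n^3 = \left[\tfrac12 n(n + 1)\right]^2$
e. $2 + 2^2 + 2^3 + \cdots + 2^n = 2(2^n - 1)$
*f. For $x, y \in \mathbb{R}$, $x^{n+1} - y^{n+1} = (x - y)(x^n + x^{n-1}y + \cdots + y^n)$
g. $\dfrac{1}{1(2)} + \dfrac{1}{2(3)} + \cdots + \dfrac{1}{n(n + 1)} = \dfrac{n}{n + 1}$
2. Use mathematical induction to establish the following inequalities.
*a. $2^n > n$ for all $n \in \mathbb{N}$
b. $2^n > n^2$ for all $n \in \mathbb{N}$, $n \ge 5$
*c. $n! > 2^n$ for all $n \in \mathbb{N}$, $n \ge 4$
d. $1^3 + 2^3 + \cdots + n^3 < n^4$ for all $n \in \mathbb{N}$, $n \ge 2$
e. $1^3 + 2^3 + \cdots + n^3 < \tfrac12 n^4$ for all $n \in \mathbb{N}$, $n \ge 3$
3. Prove Theorem 1.3.4.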
4. *Let $f : \mathbb{N} \to \mathbb{N}$ be defined by f(1) = 5, f(2) = 13, and for n > 2, f(n) = 2f(n − 2) + f(n − 1). Prove that
$f(n) = 3 \cdot 2^n + (-1)^n$ for all $n \in \mathbb{N}$.
5. For each of the following functions f with domain $\mathbb{N}$, determine a formula for f(n) and use mathematical induction to prove your conclusion.
a. f(1)=2,andforn> 1,f(n)=(n1)f(n1)n+1. *b. f(l)=1,1(2)=4, andforn>2,f(n)=2f(n1)f(n2)+2. c. f (l) = 1, and for n > 1, f (n) = * d f(1) = I f(2) = 0 and '
(n
3n 1) f (n  1).
for n > 2 f(n) _
 2)  f(n n(n1)*
C. For a,, a2 E R arbitrary, l et f ( l ) = a,, f(2 ) = a2, and for n
> 2 , f (n)
f. For a 1, a2 E R arbitrary, let f (l) = a,, f (2) = a2, and for n > 2,f(n) _ 6. Let f: N! *NI be defined by f(1) = 1, f(2) = 2, and
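$f(n + 2) = \tfrac12 \left(f(n + 1) + f(n)\right)$. Use Theorem 1.3.4 to prove that $1 \le f(n) \le 2$ for all $n \in \mathbb{N}$.
7. *Prove that
$r + r^2 + \cdots + r^n = \dfrac{r - r^{n+1}}{1 - r}, \quad r \neq 1, \quad n \in \mathbb{N},$
without using mathematical induction.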
An  2) n(n  1)'
(__4)f(n  2).
8. Use mathematical induction to prove the arithmetic-geometric mean inequality: If $a_1, a_2, \dots, a_n$, $n \in \mathbb{N}$, are nonnegative real numbers, then
$a_1 a_2 \cdots a_n \le \left(\dfrac{a_1 + a_2 + \cdots + a_n}{n}\right)^n$
with equality if and only if $a_1 = a_2 = \cdots = a_n$.
1.4
The Least Upper Bound Property
In this section we will consider the concept of the least upper bound of a set and introduce the least upper bound or supremum property of the real numbers $\mathbb{R}$. Prior to introducing these new ideas we briefly review the algebraic and order properties of $\mathbb{Q}$ and $\mathbb{R}$. Both the rational numbers $\mathbb{Q}$ and the real numbers $\mathbb{R}$ are algebraic systems known as fields. The key fact about a field is that it is a set F with two operations, addition (+) and multiplication (·), that satisfy the following axioms:
1. If $a, b \in F$, then $a + b \in F$ and $a \cdot b \in F$.
2. The operations are commutative; that is, for all $a, b \in F$,
$a + b = b + a$ and $a \cdot b = b \cdot a$.
3. The operations are associative; that is, for all $a, b, c \in F$,
$a + (b + c) = (a + b) + c$ and $a \cdot (b \cdot c) = (a \cdot b) \cdot c$.
4. There exists an element $0 \in F$ such that a + 0 = a for every $a \in F$.
5. Every $a \in F$ has an additive inverse; that is, there exists an element −a in F such that
$a + (-a) = 0$.
6. There exists an element $1 \in F$ with $1 \neq 0$ such that $a \cdot 1 = a$ for all $a \in F$.
7. Every $a \in F$ with $a \neq 0$ has a multiplicative inverse; that is, there exists an element $a^{-1}$ in F such that
$a \cdot a^{-1} = 1$.
8. The operation of multiplication is distributive over addition; that is, for all $a, b, c \in F$,
$a \cdot (b + c) = a \cdot b + a \cdot c$.
The element 0 is called the zero of F and the element 1 is called the unit of F. For $a \neq 0$, the element $a^{-1}$ is customarily written as $\dfrac{1}{a}$ or 1/a. Similarly, we write a − b instead of a + (−b), ab instead of $a \cdot b$, and a/b or $\dfrac{a}{b}$ instead of $a \cdot b^{-1}$. The real numbers $\mathbb{R}$ contain a subset P known as the positive real numbers satisfying the following:
(O1) If $a, b \in P$, then $a + b \in P$ and $a \cdot b \in P$.
(O2) If $a \in \mathbb{R}$ then one and only one of the following holds:
$a \in P, \quad -a \in P, \quad a = 0.$
Properties (O1) and (O2) are called the order properties of $\mathbb{R}$. Any field F with a nonempty subset satisfying (O1) and (O2) is called an ordered field. For the real numbers we assume the existence of a positive set P. For the rational numbers $\mathbb{Q}$, the set of positive rational numbers is given by $P \cap \mathbb{Q}$, which can be proved to be equal to
$\{p/q : p, q \in \mathbb{Z},\ q \neq 0,\ pq \in \mathbb{N}\}$.
Let a, b be elements of $\mathbb{R}$. If a − b is positive, i.e., $a - b \in P$, then we write a > b or b < a. In particular, the notation a > 0 (or 0 < a) means that a is a positive element. Also, $a \le b$ (or $b \ge a$) if a < b or a = b. The following useful results are immediate consequences of the order properties and the axioms for addition and multiplication. Let a, b, c be elements of $\mathbb{R}$.
(a) If a > b, then a + c > b + c.
(b) If a > b and c > 0, then ac > bc.
(c) If a > b and c < 0, then ac < bc.
(d) If $a \neq 0$, then $a^2 > 0$.
(e) If a > 0, then 1/a > 0; if a < 0, then 1/a < 0.
To illustrate the method of proof, we provide the proof of (b). Suppose a > b; i.e., a − b is positive. If c is positive, then by (O1), (a − b)c is positive. By the distributive law,
$(a - b)c = ac - bc$.
Therefore ac − bc is positive; that is, ac > bc. The proofs of the other results are left as exercises.
Upper Bound of a Set
We now turn our attention to the most important topic of this chapter; namely, the least upper bound or supremum property of $\mathbb{R}$. In Example 1.4.5(c) we will show that this property fails for the rational numbers $\mathbb{Q}$. First, however, we define the concept of an upper bound of a set.
1.4.1
DEFINITION A subset E of $\mathbb{R}$ is bounded above if there exists $\beta \in \mathbb{R}$ such that $x \le \beta$ for every $x \in E$. Such a $\beta$ is called an upper bound of E. The concepts bounded below and lower bound are defined similarly. A set E is bounded if E is bounded both above and below. We now consider several examples to illustrate these concepts.
1.4.2
EXAMPLES
(a) Let $A = \{0, \tfrac12, \tfrac23, \tfrac34, \dots\} = \{1 - \tfrac1n : n = 1, 2, 3, \dots\}$ (see Figure 1.8). Clearly A is bounded below by any real number $r \le 0$ and above by any real number $s \ge 1$.
[Figure 1.8]
(b) $\mathbb{N} = \{1, 2, 3, \dots\}$. This set is bounded below; e.g., 1 is a lower bound. Our intuition tells us that $\mathbb{N}$ is not bounded above. It is obvious that there is no positive integer n such that $j \le n$ for all $j \in \mathbb{N}$. However, what is not so obvious is that there is no real number $\beta$ such that $j \le \beta$ for all $j \in \mathbb{N}$. In fact, given $\beta \in \mathbb{R}$, the proof of the existence of a positive integer $n > \beta$ will require the least upper bound property of $\mathbb{R}$ (Theorem 1.5.1).
(c) $B = \{r \in \mathbb{Q} : r > 0 \text{ and } r^2 < 2\}$. Again it is clear that 0 is a lower bound for B, and that B is bounded above; e.g., 2 is an upper bound for B. What is not so obvious, however, is that B has no maximum. By the maximum or largest element of B we mean an element $\alpha \in B$ such that $p \le \alpha$ for all $p \in B$. Suppose $p \in B$. Define the rational number q by
$q = p + \dfrac{2 - p^2}{p + 2} = \dfrac{2p + 2}{p + 2}.$
With q as defined, a simple computation gives
$q^2 - 2 = \dfrac{2(p^2 - 2)}{(p + 2)^2}.$
Since $p^2 < 2$, q > p and $q^2 < 2$. Thus B has no largest element. Similarly, the set
$\{r \in \mathbb{Q} : r > 0 \text{ and } r^2 > 2\}$
has no minimum or smallest element. Intuitively, the largest element of B would satisfy $p^2 = 2$. However, as was shown in the introduction, there is no rational number p for which $p^2 = 2$.
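The construction in Example 1.4.2(c) can be iterated: starting from any p in B, the map p ↦ (2p + 2)/(p + 2) produces a strictly larger rational that is still in B. The sketch below illustrates this with exact rational arithmetic; the names `next_element` and `climb` and the starting value 1 are illustrative assumptions, not part of the text.

```python
from fractions import Fraction


def next_element(p):
    """Return q = p + (2 - p^2)/(p + 2) = (2p + 2)/(p + 2)."""
    return (2 * p + 2) / (p + 2)


def climb(p, steps):
    """Apply next_element repeatedly, recording every value visited."""
    chain = [p]
    for _ in range(steps):
        chain.append(next_element(chain[-1]))
    return chain


chain = climb(Fraction(1), 10)

# Every value is rational with square below 2, and each exceeds the previous one.
stays_in_B = all(q > 0 and q * q < 2 for q in chain)
strictly_increasing = all(a < b for a, b in zip(chain, chain[1:]))
```

Each step moves closer to $\sqrt{2}$ from below without ever reaching it, which is the computational face of the statement that B has no largest element.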
Least Upper Bound of a Set
1.4.3
DEFINITION Let E be a nonempty subset of $\mathbb{R}$ that is bounded above. An element $\alpha \in \mathbb{R}$ is called the least upper bound or supremum of E if
(i) $\alpha$ is an upper bound of E, and
(ii) if $\beta \in \mathbb{R}$ satisfies $\beta < \alpha$, then $\beta$ is not an upper bound of E.
Condition (ii) is equivalent to $\alpha \le \beta$ for all upper bounds $\beta$ of E. Also by (ii), the least upper bound of a set is unique. If the set E has a least upper bound, we write
$\alpha = \sup E$
to denote that $\alpha$ is the supremum or least upper bound of E. The greatest lower bound or infimum of a nonempty set E is defined similarly, and if it exists, is denoted by inf E.
There is one important fact about the supremum of a set that will be used repeatedly throughout the text. Due to its importance we state it as a theorem.
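1.4.4 THEOREM Let A be a nonempty subset of $\mathbb{R}$ that is bounded above. An upper bound $\alpha$ of A is the supremum of A if and only if for every $\beta < \alpha$, there exists an element $x \in A$ such that
$\beta < x \le \alpha.$
Proof. Suppose $\alpha = \sup A$. If $\beta < \alpha$, then $\beta$ is not an upper bound of A. Thus there exists an element x in A such that $x > \beta$. On the other hand, since $\alpha$ is an upper bound of A, $x \le \alpha$.
Conversely, if $\alpha$ is an upper bound of A satisfying the stated condition, then every $\beta < \alpha$ is not an upper bound of A. Thus $\alpha = \sup A$. □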
1.4.5
EXAMPLES In the following examples, let's consider again the three sets of the previous examples.
(a) As in Example 1.4.2(a), let $A = \{0, \tfrac12, \tfrac23, \tfrac34, \dots\}$. Since 0 is a lower bound of A and $0 \in A$, inf A = 0. We now prove that sup A = 1. Since $1 - \tfrac1n < 1$ for all n = 1, 2, ..., 1 is an upper bound. To show that 1 = sup A we need to show that if $\beta \in \mathbb{R}$ with $\beta < 1$, then $\beta$ is not an upper bound of A. Clearly if $\beta \le 0$, then $\beta$ is not an upper bound of A. Suppose, as in Figure 1.9, $0 < \beta < 1$. Then our intuition tells us that there exists an integer $n_0$ such that
$n_0 > \dfrac{1}{1 - \beta}$, or equivalently, $\beta < 1 - \dfrac{1}{n_0}$.
But $1 - \tfrac{1}{n_0} \in A$, and thus $\beta$ is not an upper bound. Therefore sup A = 1. The existence of such an integer $n_0$ will follow from Theorem 1.5.1. In this example, $\inf A \in A$ but $\sup A \notin A$.
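(b) For the set $\mathbb{N}$, inf $\mathbb{N}$ = 1. Since $\mathbb{N}$ is not bounded above, $\mathbb{N}$ does not have an upper bound in $\mathbb{R}$.
(c) In this example we prove that the supremum of the set
$B = \{r \in \mathbb{Q} : r > 0 \text{ and } r^2 < 2\}$
does not exist in $\mathbb{Q}$. Suppose that $\alpha = \sup B$ exists in $\mathbb{Q}$. Since no rational number has square equal to 2, either $\alpha^2 < 2$ or $\alpha^2 > 2$. But if $\alpha^2 < 2$, then $\alpha \in B$, and since B contains no largest element, there exists $q \in B$ such that $q > \alpha$. This contradicts the fact that $\alpha$ is an upper bound of B. Similarly, if $\alpha^2 > 2$, then there exists a $q < \alpha$ such that $q^2 > 2$. But then q is an upper bound of B, which is a contradiction of property (ii) of Definition 1.4.3. The least upper bound of B in $\mathbb{R}$ is $\sqrt{2}$ (Section 1.5, Exercise 9), which we know is not rational.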
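The argument in Example 1.4.5(a) is constructive once the integer $n_0$ is known to exist: for a given $\beta < 1$ one can compute an explicit element of A that exceeds $\beta$. The following sketch assumes A = {1 − 1/n : n = 1, 2, ...} as in that example; the function name `witness` is a hypothetical choice.

```python
import math
from fractions import Fraction


def witness(beta):
    """For beta < 1, return n0 with beta < 1 - 1/n0, so beta is not an upper bound of A."""
    beta = Fraction(beta)
    if beta >= 1:
        raise ValueError("beta must be strictly less than 1")
    if beta < 0:
        return 1  # the element 1 - 1/1 = 0 of A already exceeds beta
    # Any integer n0 > 1/(1 - beta) works; take the smallest such integer.
    return math.floor(1 / (1 - beta)) + 1


n0 = witness(Fraction(99, 100))
element = 1 - Fraction(1, n0)  # an element of A lying strictly above 99/100
```

The existence of such an $n_0$ for every real $\beta < 1$ is exactly what Theorem 1.5.1 supplies; the program merely evaluates it for rational inputs.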
Least Upper Bound Property of $\mathbb{R}$
The following property, also referred to as the completeness property of $\mathbb{R}$, distinguishes the real numbers from the rational numbers and forms the foundation for many of the results in real analysis.
1.4.6 SUPREMUM OR LEAST UPPER BOUND PROPERTY OF $\mathbb{R}$ Every nonempty subset of $\mathbb{R}$ that is bounded above has a supremum in $\mathbb{R}$.
For our later convenience we restate the supremum property of $\mathbb{R}$ as the infimum property of $\mathbb{R}$.
1.4.7
INFIMUM OR GREATEST LOWER BOUND PROPERTY OF $\mathbb{R}$ Every nonempty subset of $\mathbb{R}$ that is bounded below has an infimum in $\mathbb{R}$.
Although stated here as a property, which we will assume as a basic axiom about $\mathbb{R}$, the least upper bound property of $\mathbb{R}$ is really a theorem due to both Cantor and Dedekind, both of whom published their results independently in 1872. Dedekind, in the paper "Stetigkeit und irrationale Zahlen" (Continuity and irrational numbers), used algebraic techniques now known as the method of Dedekind cuts to construct the real number system $\mathbb{R}$ from the rational numbers $\mathbb{Q}$. He proved that the system $\mathbb{R}$ contained a natural subset of positive elements satisfying the order axioms (O1) and (O2), and furthermore, that $\mathbb{R}$ also satisfied the least upper bound property. The books by Burrill and by Spooner and Mentzer cited in the Supplemental Reading are devoted to number systems. Both texts contain Dedekind's construction of $\mathbb{R}$. Cantor, on the other hand, constructed $\mathbb{R}$ from $\mathbb{Q}$ using Cauchy sequences. In the miscellaneous exercises of Chapter 2 we will provide some of the key steps of this construction.
1.4.8
EXAMPLE In this example we show that for every positive real number y > 0, there exists a unique positive real number $\alpha$ such that $\alpha^2 = y$; i.e., $\alpha = \sqrt{y}$. The uniqueness of $\alpha$ was established in Example 1.2.9(b). We only prove the result for y > 1, leaving the case $0 < y \le 1$ to the exercises (Exercise 6). Let
$C = \{x \in \mathbb{R} : x > 0 \text{ and } x^2 < y\}$.
Since y > 1, $1 \in C$ and thus C is nonempty. Also since y > 1, $y^2 > y$, and thus y is an upper bound of C. Hence by the least upper bound property, C has a supremum in $\mathbb{R}$. Let $\alpha = \sup C$. We now prove that $\alpha^2 = y$. To accomplish this we show that the assumptions $\alpha^2 < y$ and $\alpha^2 > y$ lead to contradictions. Thus $\alpha^2 = y$.
Define the real number $\beta$ by
$\beta = \alpha + \dfrac{y - \alpha^2}{\alpha + y} = \dfrac{y(\alpha + 1)}{\alpha + y}.$   (1)
Then
$\beta^2 - y = \dfrac{y(y - 1)(\alpha^2 - y)}{(\alpha + y)^2}.$   (2)
If $\alpha^2 < y$, then by (1) $\beta > \alpha$, and by (2) $\beta^2 < y$. This contradicts that $\alpha$ is an upper bound for C. On the other hand, if $\alpha^2 > y$, then by (1) $\beta < \alpha$ and by (2), $\beta^2 > y$. Thus if $x \in \mathbb{R}$ with $x \ge \beta$, then $x^2 > y$. Therefore $\beta$ is an upper bound of C. This contradicts that $\alpha$ is the least upper bound of C.
Since $\beta$ defined by (1) may not be rational, the same proof will not work for the set B of Example 1.4.2(c). However, using Theorem 1.5.2 of the following section, it is possible to also prove that $\sup B = \sqrt{2}$.
For convenience, we extend the definition of supremum and infimum of a subset E of $\mathbb{R}$ to include the case where E is not necessarily bounded above or below.
1.4.9
DEFINITION If E is a nonempty subset of $\mathbb{R}$, we set
$\sup E = \infty$ if E is not bounded above, and
$\inf E = -\infty$ if E is not bounded below.
For the empty set $\emptyset$, every element of $\mathbb{R}$ is an upper bound of $\emptyset$. For this reason the supremum of the empty set $\emptyset$ is taken to be $-\infty$. Similarly, $\inf \emptyset = \infty$. Also, for the symbols $-\infty$ and $\infty$ we adopt the convention that $-\infty < x < \infty$ for every $x \in \mathbb{R}$.
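Intervals
Using the order properties of $\mathbb{R}$, we can define certain subsets of $\mathbb{R}$ known as intervals.
1.4.10
DEFINITION For $a, b \in \mathbb{R}$, $a \le b$, the open interval (a, b) is defined as
$(a, b) = \{x \in \mathbb{R} : a < x < b\}$.
1.5.1
THEOREM If $x, y \in \mathbb{R}$ and x > 0, then there exists a positive integer n such that
$nx > y$.
Proof. If $y \le 0$, then the result is true for all n. Thus assume that y > 0. We will again use the method of proof by contradiction. Let
$A = \{nx : n \in \mathbb{N}\}$.
If the result is false, that is, there does not exist an $n \in \mathbb{N}$ such that nx > y, then $nx \le y$ for all $n \in \mathbb{N}$. Thus y is an upper bound for A. Thus since $A \neq \emptyset$, A has a least upper bound in $\mathbb{R}$. Let $\alpha = \sup A$. Since x > 0, $\alpha - x < \alpha$. Therefore $\alpha - x$ is not an upper bound and thus there exists an element of A, say mx, such that
$\alpha - x < mx$.
But then $\alpha < (m + 1)x$, which contradicts the fact that $\alpha$ is an upper bound of A. Thus there exists a positive integer n such that nx > y. □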
Remark. One way in which the previous result is often used is as follows: given $\epsilon > 0$, there exists a positive integer $n_0$ such that $n_0 \epsilon > 1$. As a consequence,
$\dfrac{1}{n} < \epsilon$
for all integers n, $n \ge n_0$.
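For rational inputs the integer promised by Theorem 1.5.1 can be computed explicitly, which may help in seeing how the remark above is used in later limit arguments. This is only an illustrative sketch: the function name `archimedean_n` is hypothetical, and the floor-based formula is one convenient choice (it returns the smallest valid n), not the argument used in the proof.

```python
import math
from fractions import Fraction


def archimedean_n(x, y):
    """Return the smallest positive integer n with n * x > y, assuming x > 0."""
    x, y = Fraction(x), Fraction(y)
    if x <= 0:
        raise ValueError("x must be positive")
    if y <= 0:
        return 1  # every positive integer works when y <= 0
    return math.floor(y / x) + 1  # floor(y/x) + 1 > y/x, hence n * x > y


# The remark: given eps > 0, find n0 with n0 * eps > 1, so 1/n < eps for n >= n0.
eps = Fraction(1, 1000)
n0 = archimedean_n(eps, 1)
```

For example, with $\epsilon = 1/1000$ the call returns 1001, and indeed 1/n < 1/1000 for every n ≥ 1001.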
1.5
Consequences of the Least Upper Bound Property
29
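1.5.2 THEOREM If $x, y \in \mathbb{R}$ and x < y, then there exists $r \in \mathbb{Q}$ such that
$x < r < y$.
Proof. Assume first that $x \ge 0$. Since y − x > 0, by Theorem 1.5.1 there exists an integer n > 0 such that
$n(y - x) > 1$ or $ny > 1 + nx$.
Again by Theorem 1.5.1, $\{k \in \mathbb{N} : k > nx\}$ is nonempty. Thus by the well-ordering principle, there exists $m \in \mathbb{N}$ such that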
m  1 :5 nx n, = 1. Thus, the result is true for k = 1. Let k > I and assume that n; = I for
I
all j < k. By definition nk is the largest integer in {0, 1, 2) such that
1+
3
+
3k1
+nk k, then nk and thus x,,. Therefore f is onetoone. The function f is onto E since if x E E, then x = x, for some j. By construction, nk = j for some k. and thus f(k) = x.
1.7.7 THEOREM If f maps $\mathbb{N}$ onto A, then A is at most countable.
Proof. If A is finite, the result is certainly true. Suppose A is infinite. Since f maps $\mathbb{N}$ onto A, each $a \in A$ is of the form f(n) for some $n \in \mathbb{N}$. For each $a \in A$, by the well-ordering principle
$f^{-1}(\{a\}) = \{n \in \mathbb{N} : f(n) = a\}$
has a smallest integer, which we denote by $n_a$. Consider the mapping $a \to n_a$ of A into $\mathbb{N}$. If $a \neq b$, then since f is a function, $n_a \neq n_b$. Also, since A is infinite, $\{n_a : a \in A\}$ is an infinite subset of $\mathbb{N}$. Thus the mapping $a \to n_a$ is a one-to-one mapping of A onto an infinite subset of $\mathbb{N}$. Therefore by Theorem 1.7.6, A is countable.
Indexed Families of Sets
In Section 1.1 we defined the union and intersection of two sets. We now extend these definitions to larger collections of sets. Recall that if X is a set, $\mathcal{P}(X)$ denotes the set of all subsets of X.
1.7.8
DEFINITION Let A and X be nonempty sets. An indexed family of subsets of X with index set A is a function from A into $\mathcal{P}(X)$.
If $f : A \to \mathcal{P}(X)$, then for each $\alpha \in A$, we let $E_\alpha = f(\alpha)$. As for sequences, we denote this function by $\{E_\alpha\}_{\alpha \in A}$. If $A = \mathbb{N}$, then $\{E_n\}_{n \in \mathbb{N}}$ is called a sequence of subsets of X. In this instance, we adopt the more conventional notation $\{E_n\}_{n=1}^{\infty}$ to denote $\{E_n\}_{n \in \mathbb{N}}$.
1.7.9
EXAMPLES The following are all examples of indexed families of sets.
(a) The sequence {N.} 1, where N. = {1, 2.... , n}, is a sequence of subsets of N. Then {1n} 1 is a sequence of sub(b) For each n E N, set 1n = {x (E R : 0 < x < sets of N.
(c) For each x, 0 < x < 1, let
Ex= {rE0:0t= r x also leads to 'a contradiction of the definition of y as follows: Set
k=yy"_x Show that if tat yk,then t44;E. 3. Fix b > 1. a. Suppose m, n, p, q are integers with n > 0 and q > 0. If m/n = p/q, prove that
$(b^m)^{1/n} = (b^p)^{1/q}$.
Thus if r is rational, $b^r$ is well defined.
b. If r, s are rational, prove that $b^{r+s} = b^r b^s$.
c. If $x \in \mathbb{R}$, let $B(x) = \{b^t : t \in \mathbb{Q},\ t \le x\}$. Prove that $b^r = \sup B(r)$ when $r \in \mathbb{Q}$. Thus it now makes sense to define $b^x = \sup B(x)$ when $x \in \mathbb{R}$.
d. Prove that $b^{x+y} = b^x b^y$ for all real numbers x, y.
The following two exercises provide a detailed development of the field of complex numbers.
4. Definition. A complex number is an ordered pair (a, b) of real numbers. If z = (a, b) and w = (c, d), we write z = w if and only if a = c and b = d. For complex numbers z and w we define addition and multiplication as follows:
$z + w = (a + c,\ b + d),$
$zw = (ac - bd,\ ad + bc).$
The set of ordered pairs (a, b) of real numbers with the above operations of addition and multiplication is denoted by $\mathbb{C}$.
a. Find elements 0 and 1 in $\mathbb{C}$ such that 0 + z = z and 1z = z for all $z \in \mathbb{C}$.
b. Show that if z = (a, b), then −z = (−a, −b) is the additive inverse of z.
c. For $z \in \mathbb{C}$ with $z \neq 0$, find the multiplicative inverse $z^{-1}$.
d. Prove that the set of complex numbers $\mathbb{C}$ with addition and multiplication as defined is a field.
e. Set i = (0, 1). Show that $i^2 = -1$.
f. Show that every complex number z can be written as z = a + bi where $a, b \in \mathbb{R}$. The real numbers a and b are called the real part and the imaginary part of z, respectively. We write a = Re(z) and b = Im(z).
g. Prove that $\mathbb{C}$ is not an ordered field.
5. Definition. If $z = a + bi \in \mathbb{C}$, then the complex number $\bar{z} = a - bi$ is called the conjugate of z. The absolute value of z, denoted |z|, is defined by $|z| = \sqrt{a^2 + b^2}$.
a. Prove each of the following.
(i) $\overline{z + w} = \bar{z} + \bar{w}$
(ii) $\overline{zw} = \bar{z}\,\bar{w}$
(iii) $z + \bar{z} = 2\,\mathrm{Re}(z)$, $z - \bar{z} = 2i\,\mathrm{Im}(z)$
(iv) $z\bar{z} = |z|^2$
b. Prove each of the following.
(i) $|\bar{z}| = |z|$
(ii) $|zw| = |z||w|$
(iii) $|\mathrm{Re}(z)| \le |z|$, $|\mathrm{Im}(z)| \le |z|$
(iv) $|z + w|^2 = |z|^2 + |w|^2 + 2\,\mathrm{Re}(z\bar{w})$
(v) $|z - w|^2 = |z|^2 + |w|^2 - 2\,\mathrm{Re}(z\bar{w})$
(vi) $|z + w| \le |z| + |w|$
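The ordered-pair definition in Exercise 4 is concrete enough to experiment with before writing proofs. The sketch below models a complex number as a Python tuple (a, b) and implements only the two operations defined above, together with the conjugate and the squared absolute value from Exercise 5; the helper names are illustrative assumptions, and checking a few sample values is no substitute for the requested proofs.

```python
def add(z, w):
    """Addition of ordered pairs: (a, b) + (c, d) = (a + c, b + d)."""
    (a, b), (c, d) = z, w
    return (a + c, b + d)


def mul(z, w):
    """Multiplication of ordered pairs: (a, b)(c, d) = (ac - bd, ad + bc)."""
    (a, b), (c, d) = z, w
    return (a * c - b * d, a * d + b * c)


def conj(z):
    """Conjugate of z = (a, b), namely (a, -b)."""
    a, b = z
    return (a, -b)


def abs_sq(z):
    """Squared absolute value a^2 + b^2 (avoids square roots, so integers stay exact)."""
    a, b = z
    return a * a + b * b


i = (0, 1)
z, w = (1, 2), (3, -4)
i_squared = mul(i, i)  # Exercise 4(e) asserts this is (-1, 0)
```

Working with `abs_sq` rather than the absolute value itself keeps every comparison in integer arithmetic, so identities such as 5b(ii) can be tested without floating-point error.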
The following result, known as the Schröder-Bernstein theorem, is nontrivial, but very important. It is included as an exercise to motivate further thought and additional studies. A proof of the result can be found in the text by Halmos listed in the Supplemental Reading.
6. Let X and Y be infinite sets. If X is equivalent to a subset of Y, and Y is equivalent to a subset of X, prove that X is equivalent to Y.
7. As in Theorem 1.7.18, let A denote the set of all sequences of 0's and 1's. Use the previous result to prove that
$A \sim [0, 1]$.
SUPPLEMENTAL READING
Buck, R. C., "Mathematical induction and recursive definition," Amer. Math. Monthly 70 (1963), 128-135.
Burrill, Claude W., Foundations of Real Numbers, McGraw-Hill, Inc., New York, 1967.
Cantor, Georg, Contributions to the Founding of the Theory of Transfinite Numbers (translated by Philip E. B. Jourdain), Open Court Publ. Co., Chicago and London, 1915.
Dauben, Joseph W., Georg Cantor: His Mathematics and Philosophy of the Infinite, Princeton University Press, Princeton, N.J., 1979.
Gödel, Kurt, "What is Cantor's continuum problem?" Amer. Math. Monthly 54 (1947), 515-525.
Halmos, Paul, Naive Set Theory, Springer-Verlag, New York, Heidelberg, Berlin, 1974.
Richman, F., "Is 0.999... = 1?" Math. Mag. 72 (1999), 396-400.
Shrader-Frechette, M., "Complementary rational numbers," Math. Mag. 51 (1978), 90-98.
Spooner, George and Mentzer, Richard, Introduction to Number Systems, Prentice-Hall, Inc., Englewood Cliffs, N.J., 1968.
Sequences of Real Numbers
2.1 Convergent Sequences
2.2 Limit Theorems
2.3 Monotone Sequences
2.4 Subsequences and the Bolzano-Weierstrass Theorem
2.5 Limit Superior and Inferior of a Sequence
2.6 Cauchy Sequences
2.7 Series of Real Numbers
In our study of sequences of real numbers we encounter our first serious introduction to the limit process. The notion of convergence of a sequence dates back to the early nineteenth century and the work of Bolzano (1817) and Cauchy (1821). Some of the concepts and results included in this chapter have undoubtedly been encountered previously in the study of calculus. Our presentation, however, will be considerably more rigorous, emphasizing proofs rather than computations. We begin the chapter by introducing the notion of convergence of a sequence of real numbers and by proving the standard limit theorems for sequences normally encountered in calculus. In Section 2.3 we will use the least upper bound property of $\mathbb{R}$ to prove that every bounded monotone sequence of real numbers converges in $\mathbb{R}$. The study of subsequences and subsequential limits will be the topic of Section 2.4. In this section we also prove the well-known result of Bolzano and Weierstrass that every bounded sequence of real numbers has a convergent subsequence. This result will then be used to provide a short proof of the fact that every Cauchy sequence of real numbers converges. Although the study of series of real numbers is the main topic of Chapter 7, some knowledge of series will be required in the construction of certain examples in Chapters 4 and 6. For this reason, we include a brief introduction to series as the last section of this chapter. Even though our emphasis in this chapter is on sequences of real numbers, in subsequent chapters we will also encounter sequences of functions and convergence of sequences in normed linear spaces. A good understanding of sequences of real numbers will prove very helpful in providing insight into properties of sequences in more general settings.
2.1 Convergent Sequences
Before we begin our study of sequences we first introduce the absolute value of a real number.
2.1.1
DEFINITION For a real number x, the absolute value of x, denoted |x|, is defined by
$|x| = x$ if $x \ge 0$, and $|x| = -x$ if $x < 0$.
For example, |4| = 4 and |−5| = 5. From the definition, $|x| \ge 0$ for all $x \in \mathbb{R}$, and |x| = 0 if and only if x = 0. This last statement follows from the fact that if $x \neq 0$, then $-x \neq 0$ and thus |x| > 0. The following theorem, the proof of which is left to the exercises (Exercise 1), summarizes several well-known properties of absolute value.
2.1.2
THEOREM
(a) $|-x| = |x|$ for all $x \in \mathbb{R}$.
(b) $|xy| = |x||y|$ for all $x, y \in \mathbb{R}$.
(c) $|x| = \sqrt{x^2}$ for all $x \in \mathbb{R}$.
(d) If r > 0, then |x| < r if and only if −r < x < r.
(e) $-|x| \le x \le |x|$ for all $x \in \mathbb{R}$.
The following inequality is very important and will be used frequently throughout the text.
2.1.3
THEOREM (Triangle Inequality) For all $x, y \in \mathbb{R}$, we have
$|x + y| \le |x| + |y|$.
Proof. The triangle inequality is easily proved as follows: For $x, y \in \mathbb{R}$,
$0 \le (x + y)^2 = x^2 + 2xy + y^2 \le |x|^2 + 2|x||y| + |y|^2 = (|x| + |y|)^2.$
Thus by Theorem 2.1.2(c),
$|x + y| = \sqrt{(x + y)^2} \le \sqrt{(|x| + |y|)^2} = |x| + |y|.$ □
As a consequence of the triangle inequality, we obtain the following two useful inequalities.
2.1.4
COROLLARY For all $x, y, z \in \mathbb{R}$, we have
(a) $|x - y| \le |x - z| + |z - y|$, and
(b) $\big||x| - |y|\big| \le |x - y|$.
Proof. We provide the proof of (a), leaving the proof of (b) as an exercise (Exercise 2). If $x, y, z \in \mathbb{R}$, then by the triangle inequality,
$|x - y| = |(x - z) + (z - y)| \le |x - z| + |z - y|.$ □
The following example illustrates how properties of absolute value can be used to solve inequalities.
2.1.5
EXAMPLE Determine the set of all real numbers x that satisfy the inequality |2x + 4| < 8. By Theorem 2.1.2(d), |2x + 4| < 8 if and only if −8 < 2x + 4 < 8, or equivalently, −12 < 2x < 4. Thus the given inequality is satisfied by a real number x if and only if −6 < x < 2.
Geometrically, |x| represents the distance from x to the origin 0. More generally, for $x, y \in \mathbb{R}$, the euclidean distance d(x, y) between x and y is defined by
$d(x, y) = |x - y|$.
For example, d(1, −3) = |1 − (−3)| = |4| = 4, and d(5, −2) = |5 − (−2)| = 7. The distance d may be regarded as a function on $\mathbb{R} \times \mathbb{R}$ which satisfies the following properties:
$d(x, y) \ge 0$,
d(x, y) = 0 if and only if x = y,
d(x, y) = d(y, x), and
$d(x, y) \le d(x, z) + d(z, y)$
for all $x, y, z \in \mathbb{R}$. This last inequality, also referred to as the triangle inequality, follows from Corollary 2.1.4(a).
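The solution set found in Example 2.1.5 and the listed properties of the distance function can be spot-checked on a finite grid of rationals. This is a sketch under stated assumptions: the inequality is read as |2x + 4| < 8 with solution set −6 < x < 2, and the grid and function names are arbitrary illustrative choices.

```python
from fractions import Fraction


def d(x, y):
    """Euclidean distance on the real line: d(x, y) = |x - y|."""
    return abs(x - y)


def satisfies(x):
    """The inequality |2x + 4| < 8 from Example 2.1.5."""
    return abs(2 * x + 4) < 8


# Sample points -10, -9.75, ..., 10 (exact rationals, so endpoints are tested exactly).
grid = [Fraction(k, 4) for k in range(-40, 41)]

solution_matches = all(satisfies(x) == (-6 < x < 2) for x in grid)

coarse = grid[::8]  # a thinner grid keeps the triple loop small
triangle_ok = all(
    d(x, y) <= d(x, z) + d(z, y)
    for x in coarse for y in coarse for z in coarse
)
```

The endpoints x = −6 and x = 2 lie on the grid, so the check confirms that they are excluded, in agreement with the strict inequality.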
Neighborhood of a Point
The study of the convergence of a sequence or the limit of a function requires the notion of one real number being "close to" another. Since the euclidean distance between two points a and x is given by d(a, x) = |a − x|, saying that x is "close to" a is equivalent to saying that the distance |x − a| between them is "small." A convenient method for expressing this idea is with the concept of an $\epsilon$-neighborhood of a point. This concept will prove useful not only in the study of the limit of a sequence but also in our study of the limit of a function and the structure of point sets in $\mathbb{R}$.
2.1.6
DEFINITION
Let p E R and let e > 0. The set
N1(p)_{xER:Ix pl 0. By Theorem 2.1.2(d), for fixed p E R and E > 0,
N,(p)={x:pE 0, there exists a positive integer n, such that n,e > 1. Thus for all n _> no,
0, prove that lim
1
n 1 + nb
11. *a. If b > 1, prove that lim
l(+)nl
sin}
J
= 0.
b = 0.
b. If 0 0, prove that there exists n,, E N such that an > 0 for all
$n \ge n_0$. 17. Let $\{a_n\}$ be a sequence in $\mathbb{R}$ satisfying $|a_n - a_{n+1}| \ge c$ for some c > 0 and all $n \in \mathbb{N}$. Prove that the sequence $\{a_n\}$ diverges.
2.2
Limit Theorems
In this section we will emphasize some of the important properties of sequences of real numbers and investigate the limits of several basic sequences that are frequently encountered in the study of analysis. Our first result involves algebraic operations on convergent sequences.
2.2.1
THEOREM If $\{a_n\}$ and $\{b_n\}$ are convergent sequences of real numbers with
$\lim_{n \to \infty} a_n = a$ and $\lim_{n \to \infty} b_n = b$,
then
(a) $\lim_{n \to \infty} (a_n + b_n) = a + b$, and
(b) $\lim_{n \to \infty} a_n b_n = ab$.
(c) Furthermore, if $a \neq 0$, and $a_n \neq 0$ for all n, then $\lim_{n \to \infty} \dfrac{b_n}{a_n} = \dfrac{b}{a}$.
Proof. The proof of (a) is left to the exercises (Exercise 1). To prove (b), we add and subtract the term $a_n b$ to obtain
$|a_n b_n - ab| = |(a_n b_n - a_n b) + (a_n b - ab)| \le |a_n||b_n - b| + |b||a_n - a|.$
Since $\{a_n\}$ converges, by Theorem 2.1.10(b), $\{a_n\}$ is bounded. Thus there exists a constant M > 0 such that $|a_n| \le M$ for all $n \in \mathbb{N}$. Let $\epsilon > 0$ be given. Since $a_n \to a$, there exists a positive integer $n_1$ such that
$|a_n - a| < \dfrac{\epsilon}{2(|b| + 1)}$
for all $n \ge n_1$. Also, since $b_n \to b$, there exists a positive integer $n_2$ such that
Ibnbt a. Since p > 1, write p = (I + q), with q > 0. By the binomial theorem, for n > 2k,
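$p^n = (1 + q)^n > \dbinom{n}{k} q^k = \dfrac{n(n - 1) \cdots (n - k + 1)}{k!}\, q^k.$
Since $k < \tfrac12 n$, $n - k + 1 > \tfrac12 n + 1 > \tfrac12 n$. Therefore,
$\dfrac{n(n - 1) \cdots (n - k + 1)}{k!} > \dfrac{n^k}{2^k k!},$
and as a consequence,
$0 \le \dfrac{n^\alpha}{p^n} \le \dfrac{2^k k!}{q^k}\, n^{\alpha - k} = M n^{\alpha - k}, \quad \text{where } M = \dfrac{2^k k!}{q^k}.$
The result now follows by part (a) and Theorem 2.2.4.
(e) Write p as p = ±1/q, where q > 1. Then
$|p^n| = |p|^n = \dfrac{1}{q^n},$
which by (d) (with $\alpha = 0$) converges to 0 as $n \to \infty$.
(f) Fix $k \in \mathbb{N}$ such that k > |p|. For n > k,
$\left|\dfrac{p^n}{n!}\right| = \dfrac{|p|^n}{n!} \le \dfrac{k^{k-1}}{(k - 1)!} \left(\dfrac{|p|}{k}\right)^n.$
Since |p|/k < 1, the result follows by (e).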
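The limits established in parts (d) and (f) compare rates of growth: an exponential $p^n$ with p > 1 eventually dominates any power $n^\alpha$, and n! eventually dominates any exponential. A few sample terms make the "eventually" visible. The sketch below is only a numerical illustration with hypothetical function names and sample parameters; it does not replace the estimates in the proof.

```python
import math


def power_over_exponential(n, alpha, p):
    """The term n^alpha / p^n from part (d); tends to 0 when p > 1."""
    return n ** alpha / p ** n


def exponential_over_factorial(n, p):
    """The term p^n / n! from part (f); tends to 0 for every real p."""
    return p ** n / math.factorial(n)


# With alpha = 3 and p = 2 the terms first grow and only later decay.
early = [power_over_exponential(n, 3, 2) for n in (1, 5)]
late = power_over_exponential(50, 3, 2)

# With p = 10 the factorial takes over well before n = 100.
tiny = exponential_over_factorial(100, 10)
```

The initial growth of the terms in `early` shows why the proof restricts attention to n > 2k: the bound is an eventual statement, not one that holds from the first term.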
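2.2.7
EXAMPLES We now provide several examples to illustrate the previous theorems.
(a) As in Example 2.1.8(c) consider the sequence $\left\{\dfrac{2n + 1}{3n + 2}\right\}$. We write
$\dfrac{2n + 1}{3n + 2} = \dfrac{n\left(2 + \tfrac1n\right)}{n\left(3 + \tfrac2n\right)} = \dfrac{2 + \tfrac1n}{3 + \tfrac2n}.$
Since $\lim_{n \to \infty} \tfrac1n = \lim_{n \to \infty} \tfrac2n = 0$, by Corollary 2.2.2(a),
$\lim_{n \to \infty} \left(2 + \tfrac1n\right) = 2$ and $\lim_{n \to \infty} \left(3 + \tfrac2n\right) = 3$.
Therefore by Theorem 2.2.1,
$\lim_{n \to \infty} \dfrac{2n + 1}{3n + 2} = \lim_{n \to \infty} \left(2 + \tfrac1n\right) \dfrac{1}{3 + \tfrac2n} = 2 \cdot \dfrac13 = \dfrac23.$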
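(b) Consider the sequence $\left\{\dfrac{(-1)^n}{2\sqrt{n} + 7}\right\}$. We first note that
$0 \le \left|\dfrac{(-1)^n}{2\sqrt{n} + 7}\right| = \dfrac{1}{2\sqrt{n} + 7} < \dfrac{1}{2\sqrt{n}}.$
Thus by Theorem 2.2.4 and Theorem 2.2.6(a) with $p = \tfrac12$,
$\lim_{n \to \infty} \dfrac{(-1)^n}{2\sqrt{n} + 7} = 0.$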
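(c) For our next example we consider the sequence $\left\{\dfrac{2^n + n^3}{3^n + n^2}\right\}$. As in (a), we first want to factor out the dominant power in both the numerator and the denominator. By Theorem 2.2.6(d), $\lim_{n \to \infty} n^\alpha / p^n = 0$ for any $\alpha \in \mathbb{R}$ and p > 1. This simply states that $p^n$ (p > 1) grows faster than any power of n. Therefore the dominant terms in the numerator and denominator are $2^n$ and $3^n$, respectively. Thus
$\dfrac{2^n + n^3}{3^n + n^2} = \dfrac{2^n\left(1 + \tfrac{n^3}{2^n}\right)}{3^n\left(1 + \tfrac{n^2}{3^n}\right)} = \left(\dfrac23\right)^n \dfrac{1 + \tfrac{n^3}{2^n}}{1 + \tfrac{n^2}{3^n}}.$
By Theorem 2.2.1 and Theorem 2.2.6(d),
$\lim_{n \to \infty} \dfrac{1 + \tfrac{n^3}{2^n}}{1 + \tfrac{n^2}{3^n}} = 1.$
Finally, since $\lim_{n \to \infty} \left(\tfrac23\right)^n = 0$ (Theorem 2.2.6(e)), we have
$\lim_{n \to \infty} \dfrac{2^n + n^3}{3^n + n^2} = 0.$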
(d) As our final example we consider the sequence $\left\{n\left(\left(1 + \tfrac1n\right)^{-2} - 1\right)\right\}$. Before we can evaluate the limit of this sequence we must first simplify the nth term $x_n$ of the sequence. This is accomplished as follows:
$x_n = n\left(\left(1 + \tfrac1n\right)^{-2} - 1\right) = n\left(\dfrac{n^2}{(n + 1)^2} - 1\right) = n\,\dfrac{-2n - 1}{(n + 1)^2} = \dfrac{-2n^2 - n}{(n + 1)^2}.$
Now we can factor out an $n^2$ from both the numerator and the denominator. This gives
$x_n = \dfrac{-2 - \tfrac1n}{\left(1 + \tfrac1n\right)^2}.$
Using the limit theorems we now conclude that $\lim_{n \to \infty} x_n = -2$.
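The limits in Examples 2.2.7(a) and (d) can be tied back to the definition of convergence by computing, for a given $\epsilon$, an index beyond which the terms stay within $\epsilon$ of the limit. The sketch below assumes the sequences are $a_n = (2n + 1)/(3n + 2)$ with limit 2/3 and $x_n = n((1 + 1/n)^{-2} - 1)$ with limit −2; the function names and tolerances are hypothetical choices.

```python
from fractions import Fraction


def a(n):
    """Example (a): a_n = (2n + 1) / (3n + 2), with limit 2/3."""
    return Fraction(2 * n + 1, 3 * n + 2)


def x(n):
    """Example (d): x_n = n * ((1 + 1/n)^(-2) - 1), with limit -2."""
    return n * ((1 + Fraction(1, n)) ** -2 - 1)


def first_index_within(term, limit, eps):
    """Smallest n >= 1 with |term(n) - limit| < eps, found by direct search.

    For both sequences here the error decreases with n, so every later
    term is also within eps of the limit.
    """
    n = 1
    while abs(term(n) - limit) >= eps:
        n += 1
    return n


n_a = first_index_within(a, Fraction(2, 3), Fraction(1, 1000))
n_x = first_index_within(x, Fraction(-2), Fraction(1, 100))
```

For sequence (a) the error is exactly 1/(9n + 6), and for (d) it is (3n + 2)/(n + 1)², which is why the second sequence needs a much larger index even for a coarser tolerance.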
EXERCISES 2.2
1. Prove Theorem 2.2.1(a).
2. Let $\{a_n\}$ and $\{b_n\}$ be sequences of real numbers.
a. If $\{a_n\}$ and $\{a_n + b_n\}$ both converge, prove that the sequence $\{b_n\}$ converges.
b. Suppose $b_n \neq 0$ for all $n \in \mathbb{N}$. If $\{b_n\}$ and $\{a_n / b_n\}$ both converge, prove that the sequence $\{a_n\}$ also converges.
3. Prove Theorem 2.2.3.
4. Prove Theorem 2.2.4.
5.
a. If p > 0, prove that lim
1= v 0.
n
*b. If p > 0, prove that lim
= 1.
6. Find the limit of each of the following sequences.
(3n2+2n+l)°° 5n2  2n + 3 ".. ( Sc. 5
n
{\
°°
n
11+"
+a 
g.
n2+nn}.,
L{
 i)E
h. {(2" + 3)""r..,
7. For each of the following sequences, determine whether the given sequence converges or diverges. If the sequence converges, find its limit; if it diverges, explain why.
I + (1)"1°0 ((
n
1
sin n
"1
d.
Sc. 1n2\2n+3)2 } 00 a8n3 +51ao
L
e. fV9W 8. Prove that "lim R cos n
2
} 2
" l 2"+n2J.i
In cos nhr
2n+3
00
}
= 0.
9. Let $\{x_n\}$ be a sequence in $\mathbb{R}$ with $x_n \to 0$ and $x_n \ne 0$ for all $n$. Prove that $\lim_{n\to\infty} x_n \sin\dfrac{1}{x_n} = 0$.
10. Let $\{a_n\}$ be a sequence of positive real numbers such that $\lim_{n\to\infty} \dfrac{a_{n+1}}{a_n} = L$.
*a. If $L < 1$, prove that the sequence $\{a_n\}$ converges and that $\lim_{n\to\infty} a_n = 0$.
b. If $L > 1$, prove that the sequence $\{a_n\}$ is unbounded.
c. Give an example of a convergent sequence $\{a_n\}$ of positive real numbers for which $L = 1$.
d. Give an example of a divergent sequence $\{a_n\}$ of positive real numbers for which $L = 1$.
11. Use the previous exercise to determine convergence or divergence of each of the following sequences.
*a. $\{n^2 a^n\}$, $0 < a < 1$
b.
C'
{l n! }.0 oc as
As a consequence of Theorems 2.3.2 and 2.3.7, every monotone increasing sequence $\{a_n\}$ either converges to a real number (if the sequence is bounded above) or diverges to $\infty$. In either case,
$$\lim_{n\to\infty} a_n = \sup\{a_n : n \in \mathbb{N}\}.$$
Remarks. Although the definition of diverging to infinity is included in this section on monotone sequences, this should not give the impression that Definition 2.3.6 is applicable only to such sequences. In the following we give an example of a sequence that diverges to infinity but is not monotone. Also, it is important to remember that when we say that a sequence converges, we mean that it converges to a real number.
2.3.8
EXAMPLE Consider the sequence $\{n(2 + (-1)^n)\}$. If $n$ is even, then $n(2 + (-1)^n) = 3n$; if $n$ is odd, then $n(2 + (-1)^n) = n$. In either case,
$$n(2 + (-1)^n) \ge n,$$
and thus the sequence diverges to $\infty$. The sequence, however, is clearly not monotone.
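A few terms make both claims of Example 2.3.8 concrete. The sketch below assumes the sequence is $n(2+(-1)^n)$; the names `term` and `is_monotone_increasing` are illustrative only.

```python
def term(n):
    """n-th term of the sequence n(2 + (-1)^n)."""
    return n * (2 + (-1) ** n)


def is_monotone_increasing(values):
    """True if each value is at least as large as the one before it."""
    return all(a <= b for a, b in zip(values, values[1:]))


terms = [term(n) for n in range(1, 9)]
print(terms)                                                  # [1, 6, 3, 12, 5, 18, 7, 24]
print(all(t >= n for n, t in enumerate(terms, start=1)))      # every term is at least n
print(is_monotone_increasing(terms))                          # False: 6 is followed by 3
```

The lower bound $n$ drives the terms to $\infty$, while the alternation between $n$ and $3n$ rules out monotonicity.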
EXERCISES 2.3
1. Let $I_n = [a_n, b_n]$, $n \in \mathbb{N}$, be closed and bounded intervals satisfying $I_n \supset I_{n+1}$ for all $n$. Prove that
$$\bigcap_{n=1}^{\infty} I_n = [a, b],$$
where $a = \sup\{a_n : n \in \mathbb{N}\}$ and $b = \inf\{b_n : n \in \mathbb{N}\}$.
2. *Show by example that the conclusion of Corollary 2.3.3 is false if the intervals $I_n$ with $I_n \supset I_{n+1}$ are not bounded.
3. Show that each of the following sequences is monotone. Find a lower or upper bound if it exists; find the limit if you can.
t
It
b.
{ a+If)(a J)1,wherea> + cos
d. {sn}, where sn = cos'
2
1
2+
+ cos'
1117
4. Define the sequence $\{a_n\}$ as follows: $a_1 = \sqrt{2}$, and $a_{n+1} = \sqrt{2 + a_n}$.
a. Show that $a_n \le 2$ for all $n$.
b. Show that the sequence $\{a_n\}$ is monotone increasing.
c. Find $\lim_{n\to\infty} a_n$.
5. *Let $a_1 > 1$, and for $n \in \mathbb{N}$, define $a_{n+1} = 2 - 1/a_n$. Show that the sequence $\{a_n\}$ is monotone and bounded. Find $\lim_{n\to\infty} a_n$.
6. Let $0 < a < 1$. Set $t_1 = 2$, and for $n \in \mathbb{N}$, set $t_{n+1} = 2 - a/t_n$. Show that the sequence $\{t_n\}$ is monotone and bounded. Find $\lim_{n\to\infty} t_n$.
7. For each of the following, prove that the sequence $\{a_n\}$ converges and find the limit.
n
b. a"+, _ V, a, = 3
a. a"+, = 1(2a" + 5), a, = 2
*c.a"+ti2a, a,=1
*e. a"_, = 3a"  2, a, = 4 8. Set x, = a. where a > 0 and let x"
d.a"+,=V2.. + 3, a4
f. an+, = 3, a, = I
x" + (1/x"). Determine if the sequence {x"} converges or diverges.
9. Let $a > 0$. Choose $x_1 > \sqrt{a}$. For $n = 1, 2, 3, \ldots$, define
$$x_{n+1} = \frac{1}{2}\left(x_n + \frac{a}{x_n}\right).$$
*a. Show that the sequence $\{x_n\}$ is monotone and bounded.
b. Prove that $\lim_{n\to\infty} x_n = \sqrt{a}$.
c. Prove that $0 \le x_n - \sqrt{a} \le$ (x.
10. In Exercise 9, let $a = 3$ and $x_1 = 2$. Use part (c) to find $x_n$ such that $|x_n - \sqrt{3}| < 10^{-5}$.
11. Let $A$ be a nonempty subset of $\mathbb{R}$ that is bounded above and let $a = \sup A$. Show that there exists a monotone increasing sequence $\{a_n\}$ in $A$ such that $a = \lim_{n\to\infty} a_n$. Can the sequence $\{a_n\}$ be chosen to be strictly increasing?
12. Use Example 2.3.5 to find the limit of each of the following sequences.
*a. lI *c.
1
+
nb. d.
1
+ 2n/3n}
j(1
 n) +
'
jl l
n
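13. For each $n \in \mathbb{N}$, let
$$s_n = 1 + \frac{1}{2} + \cdots + \frac{1}{n}.$$
Show that $\{s_n\}$ is monotone increasing but not bounded above.
14. For each $n \in \mathbb{N}$, let
$$s_n = 1 + \frac{1}{\sqrt{2}} + \cdots + \frac{1}{\sqrt{n}}.$$
Show that the sequence $\{s_n\}$ is monotone increasing but not bounded above.
15. *For each $n \in \mathbb{N}$, let
$$s_n = \frac{1}{1^2} + \frac{1}{2^2} + \cdots + \frac{1}{n^2}.$$
Show that the sequence $\{s_n\}$ is monotone increasing and bounded above by 2.
16. Let $0 < b < 1$. For each $n \in \mathbb{N}$, let $s_n = 1 + b + b^2 + \cdots + b^n$. Prove that the sequence $\{s_n\}$ is monotone increasing and bounded above. Find $\lim_{n\to\infty} s_n$.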
17. Show that each of the following sequences diverges to $\infty$.
18. *Which of the sequences in the previous exercise are monotone? Explain your answer.
19. If $a_n \to \infty$ and $\{b_n\}$ converges in $\mathbb{R}$, prove that $\{a_n + b_n\}$ diverges to $\infty$.
20. If $a_n > 0$ for all $n \in \mathbb{N}$ and $\lim_{n\to\infty} a_n = 0$, prove that $1/a_n \to \infty$.
21. Suppose $a_1 > a_2 > 0$. For $n \ge 2$, set $a_{n+1} = \frac{1}{2}(a_n + a_{n-1})$. Prove that
a. $\{a_{2k+1}\}$ is monotone decreasing,
b. $\{a_{2k}\}$ is monotone increasing, and
c. $\{a_n\}$ converges.
22. Let $\{s_n\}$ be a bounded sequence of real numbers. For each $n \in \mathbb{N}$, let $a_n$ and $b_n$ be defined as follows:
$$a_n = \inf\{s_k : k \ge n\}, \qquad b_n = \sup\{s_k : k \ge n\}.$$
a. Prove that the sequences $\{a_n\}$ and $\{b_n\}$ are monotone and bounded.
b. Prove that $\lim_{n\to\infty} a_n = \lim_{n\to\infty} b_n$ if and only if the sequence $\{s_n\}$ converges.
23. *In Theorem 2.3.2 we used the supremum property of $\mathbb{R}$ to prove that every bounded monotone sequence converges. Prove that the converse is also true; namely, if every bounded monotone sequence in $\mathbb{R}$ converges, then every nonempty subset of $\mathbb{R}$ that is bounded above has a supremum in $\mathbb{R}$.
24. *Use the nested intervals property to prove that $[0, 1]$ is uncountable.
2.4
Subsequences and the Bolzano-Weierstrass Theorem
In this section we will consider subsequences and subsequential limits of a given sequence of real numbers. One of the key results of the section is that every bounded sequence of real numbers has a convergent subsequence. This result, known as the sequential version of the Bolzano-Weierstrass theorem, is one of the fundamental results of real analysis.
2.4.1
DEFINITION Given a sequence $\{p_n\}$ in $\mathbb{R}$, consider a sequence $\{n_k\}_{k=1}^{\infty}$ of positive integers such that $n_1 < n_2 < n_3 < \cdots$. Then the sequence $\{p_{n_k}\}_{k=1}^{\infty}$ is called a subsequence of the sequence $\{p_n\}$.
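$\beta - \varepsilon$ for infinitely many indices $n$.
2.5.3
THEOREM Let $\{s_n\}$ be a sequence in $\mathbb{R}$.
(a) Suppose $\limsup_{n\to\infty} s_n \in \mathbb{R}$. Then $\beta = \limsup_{n\to\infty} s_n$ if and only if for all $\varepsilon > 0$,
(i) there exists $n_0 \in \mathbb{N}$ such that $s_n < \beta + \varepsilon$ for all $n \ge n_0$, and
(ii) given $n \in \mathbb{N}$, there exists $k \in \mathbb{N}$ with $k \ge n$ such that $s_k > \beta - \varepsilon$.
(b) $\limsup_{n\to\infty} s_n = \infty$ if and only if given $M$ and $n \in \mathbb{N}$, there exists $k \in \mathbb{N}$ with $k \ge n$ such that $s_k \ge M$.
(c) $\limsup_{n\to\infty} s_n = -\infty$ if and only if $s_n \to -\infty$.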
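Proof of Theorem 2.5.3. We will only prove (a). The proofs of (b) and (c) are left to the exercises (Exercise 5).
(a) Suppose $\beta = \limsup_{n\to\infty} s_n = \lim_{k\to\infty} b_k$, where $b_k = \sup\{s_n : n \ge k\}$. Let $\varepsilon > 0$ be given. Since $\lim_{k\to\infty} b_k = \beta$, there exists a positive integer $n_0$ such that $b_k < \beta + \varepsilon$ for all $k \ge n_0$. Since $s_n \le b_k$ for all $n \ge k$,
$$s_n < \beta + \varepsilon \quad\text{for all } n \ge n_0,$$
which proves (i). Next, let $n \in \mathbb{N}$ be given. Since $b_n = \sup\{s_k : k \ge n\}$ and $\varepsilon > 0$, there exists an integer $k \ge n$ such that
$$s_k > b_n - \varepsilon \ge \beta - \varepsilon,$$
which proves (ii).
Conversely, assume that (i) and (ii) hold. Let $\varepsilon > 0$ be given. By (i) there exists $n_0 \in \mathbb{N}$ such that $s_n < \beta + \varepsilon$ for all $n \ge n_0$. Therefore,
$$b_{n_0} = \sup\{s_n : n \ge n_0\} \le \beta + \varepsilon.$$
Since the sequence $\{b_n\}$ is monotone decreasing, $b_n \le \beta + \varepsilon$ for all $n \ge n_0$, and hence $\limsup s_n = \lim b_n \le \beta + \varepsilon$. Since $\varepsilon > 0$ was arbitrary, $\limsup s_n \le \beta$. Suppose $\beta' = \limsup s_n < \beta$. Choose $\varepsilon > 0$ such that $\beta' < \beta - 2\varepsilon$. But then, by the first part of the proof applied to $\beta'$, there exists $n_0$ such that
$$s_n < \beta' + \varepsilon < \beta - \varepsilon \quad\text{for all } n \ge n_0,$$
which contradicts (ii). Hence $\limsup s_n = \beta$.
... $\varepsilon > 0$, there exists $n_0 \in \mathbb{N}$ such that $s_n < s + \varepsilon$ for all $n \ge n_0$, prove that $\limsup s_n \le s$.
5. a. Prove Theorem 2.5.3(b).
b. Prove Theorem 2.5.3(c).
6. Let $\{a_n\}$ and $\{b_n\}$ be bounded sequences in $\mathbb{R}$.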
*a. Prove that lim a" + limb"slim (a" + b") 1, let s" be defined by S2m =
S2mI
2
1
,
S2,"+I = 2 + s2,,,.
Find Iim s" and lim s".
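9. Let $a_n > 0$ for all $n$. Prove that
$$\limsup_{n\to\infty} \sqrt[n]{a_n} \le \limsup_{n\to\infty} \frac{a_{n+1}}{a_n}.$$
10. *Suppose $\{a_n\}$, $\{b_n\}$ are sequences of nonnegative real numbers with $\lim_{n\to\infty} b_n = b \ne 0$ and $\limsup_{n\to\infty} a_n = a$. Prove that
$$\limsup_{n\to\infty} a_n b_n = ab.$$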
2.6
Cauchy Sequences
In order to apply the definition to prove that a given sequence $\{p_n\}$ converges, we must know the limit of the sequence. For this reason, theorems that provide sufficient conditions for convergence, such as Theorem 2.3.2, are particularly useful. The drawback to Theorem 2.3.2 is that it applies only to monotone sequences of real numbers. In this section we consider another criterion that, for sequences in $\mathbb{R}$, is sufficient to ensure convergence of the sequence.
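2.6.1
DEFINITION A sequence $\{p_n\}_{n=1}^{\infty}$ in $\mathbb{R}$ is a Cauchy sequence if for every $\varepsilon > 0$, there exists a positive integer $n_0$ such that
$$|p_n - p_m| < \varepsilon$$
for all integers $n, m \ge n_0$.
... $\varepsilon > 0$, there exists $n_0 \in \mathbb{N}$ such that I1, kI $< \varepsilon$ for all $n > n_0$ and all $k \in \mathbb{N}$. Thus $|s_{n+k} - s_n| < \varepsilon$ for all $n \ge n_0$ and all $k \in \mathbb{N}$. Therefore the sequence $\{s_n\}$ is a Cauchy sequence.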
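(b) In this example we illustrate how the concept of a Cauchy sequence may be used to prove convergence of a given sequence. Additional applications will be given in the exercises. Let $a_1, a_2$ be arbitrary real numbers with $a_1 \ne a_2$. For $n \ge 3$, define $a_n$ inductively by
$$a_n = \tfrac{1}{2}(a_{n-1} + a_{n-2}). \tag{5}$$
Our first goal is to show that the sequence $\{a_n\}$ is Cauchy. We first note that
$$a_{n+1} - a_n = -\tfrac{1}{2}(a_n - a_{n-1}).$$
As a consequence, for $n \ge 2$,
$$a_{n+1} - a_n = \frac{(-1)^{n-1}}{2^{n-1}}(a_2 - a_1). \tag{6}$$
This last statement is most easily verified by induction (Exercise 5). For $m \ge 1$, consider $|a_{n+m} - a_n|$. By the triangle inequality,
$$|a_{n+m} - a_n| = \left|\sum_{k=0}^{m-1}(a_{n+k+1} - a_{n+k})\right| \le \sum_{k=0}^{m-1}|a_{n+k+1} - a_{n+k}|,$$
which by (6)
$$= |a_2 - a_1|\sum_{k=0}^{m-1}\frac{1}{2^{n+k-1}} = \frac{1}{2^{n-2}}|a_2 - a_1|\sum_{k=1}^{m}\frac{1}{2^k}.$$
By Example 1.3.3(a),
$$\sum_{k=1}^{m} r^k = \frac{r - r^{m+1}}{1 - r}, \qquad r \ne 1. \tag{7}$$
Thus with $r = \frac{1}{2}$,
$$\sum_{k=1}^{m}\frac{1}{2^k} = \frac{\frac{1}{2} - \left(\frac{1}{2}\right)^{m+1}}{1 - \frac{1}{2}} = 1 - \frac{1}{2^m} < 1.$$
Therefore,
$$|a_{n+m} - a_n| \le \frac{1}{2^{n-2}}|a_2 - a_1| \tag{8}$$
for all $n \ge 2$ and $m \in \mathbb{N}$. Let $\varepsilon > 0$ be given. Choose $n_0$ such that $|a_2 - a_1|/2^{n-2} < \varepsilon$ for all $n \ge n_0$. Then by (8),
$$|a_{n+m} - a_n| < \varepsilon$$
for all $m \in \mathbb{N}$, $n \ge n_0$. This, however, is just another way of stating that $|a_n - a_m| < \varepsilon$ for all $m, n \ge n_0$. Therefore the sequence $\{a_n\}$ is a Cauchy sequence in $\mathbb{R}$, and thus by Theorem 2.6.4,
$$a = \lim_{n\to\infty} a_n$$
exists in $\mathbb{R}$.
Can we find the limit $a$ here? If we follow the approach in Example 2.3.4(c), by taking the limit of both sides of equation (5), we only get $a = a$. To find the value of $a$, let us observe that
$$a_{n+1} - a_1 = (a_{n+1} - a_n) + (a_n - a_{n-1}) + \cdots + (a_2 - a_1) = \sum_{k=1}^{n}(a_{k+1} - a_k),$$
then use (6) to get
$$a_{n+1} - a_1 = (a_2 - a_1)\sum_{k=1}^{n}\left(-\frac{1}{2}\right)^{k-1} = \frac{2}{3}(a_2 - a_1)\left(1 - \left(-\frac{1}{2}\right)^{n}\right).$$
The last equality follows from formula (7). Since $a_{n+1} \to a$ and $\left(-\frac{1}{2}\right)^n \to 0$, upon taking the limit of both sides we obtain
$$a - a_1 = \tfrac{2}{3}(a_2 - a_1) \quad\text{or}\quad a = a_1 + \tfrac{2}{3}(a_2 - a_1).$$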
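The closed form for the limit can be compared against the recursion $a_n = \frac{1}{2}(a_{n-1} + a_{n-2})$ directly. The sketch below is illustrative only; `averaged_sequence` and `predicted_limit` are hypothetical helper names, and the starting values are arbitrary.

```python
def averaged_sequence(a1, a2, count):
    """Return the first `count` terms of a_n = (a_{n-1} + a_{n-2}) / 2."""
    terms = [a1, a2]
    while len(terms) < count:
        terms.append((terms[-1] + terms[-2]) / 2)
    return terms[:count]


def predicted_limit(a1, a2):
    """Limit a = a_1 + (2/3)(a_2 - a_1) obtained from the telescoping sum."""
    return a1 + 2 * (a2 - a1) / 3


terms = averaged_sequence(0.0, 1.0, 60)
print(terms[:6])                          # successive averages oscillate around the limit
print(terms[-1], predicted_limit(0.0, 1.0))
```

With $a_1 = 0$ and $a_2 = 1$ the terms settle near $2/3$, and consecutive differences halve in size and alternate in sign, as the signed form of the difference formula predicts.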
Contractive Sequences
One of the key properties of the sequence $\{a_n\}$ of the previous example was that
$$|a_{n+1} - a_n| \le \tfrac{1}{2}|a_n - a_{n-1}|$$
for all $n \ge 2$. This property was used to show that the sequence $\{a_n\}$ was a Cauchy sequence and thus converged. Such sequences are commonly referred to as contractive sequences. We make this precise in the following definition.
2.6.6
DEFINITION A sequence $\{p_n\}$ in $\mathbb{R}$ is contractive if there exists a real number $b$, $0 < b < 1$, such that
$$|p_{n+1} - p_n| \le b\,|p_n - p_{n-1}|$$
for all $n \in \mathbb{N}$, $n \ge 2$.
If $\{p_n\}$ is a contractive sequence, then an argument similar to the one used in the previous example shows that
$$|p_{n+1} - p_n| \le b^{n-1}|p_2 - p_1| \quad\text{for all } n \ge 1,$$
and that
$$|p_{n+m} - p_n| \le b^{n-1}|p_2 - p_1|\,(1 + b + \cdots + b^{m-1}) < \frac{b^{n-1}}{1 - b}|p_2 - p_1|$$
for all $n, m \in \mathbb{N}$. As a consequence, every contractive sequence is a Cauchy sequence. Therefore, every contractive sequence in $\mathbb{R}$ converges to a point in $\mathbb{R}$. We summarize this in the following theorem.
2.6.7
THEOREM Every contractive sequence in $\mathbb{R}$ converges in $\mathbb{R}$. Furthermore, if the sequence $\{p_n\}$ is contractive and $p = \lim_{n\to\infty} p_n$, then
(a) $|p - p_n| \le \dfrac{b^{n-1}}{1 - b}|p_2 - p_1|$, and
(b) $|p - p_n| \le \dfrac{b}{1 - b}|p_n - p_{n-1}|$,
where $0 < b < 1$ is the constant in Definition 2.6.6.
Proof. We leave the details of the proof to the exercises (Exercise 7). □
2.6.8
EXAMPLE Suppose we are given that the polynomial $p(x) = x^2 - 3x + 1$ has exactly one zero in the open interval $(0, 1)$. If $c \in (0, 1)$ is such that $p(c) = 0$, then $c = \frac{1}{3}(c^2 + 1)$. We start with $c_1 \in (0, 1)$ arbitrary, and for $n \ge 1$ we set
$$c_{n+1} = \tfrac{1}{3}(c_n^2 + 1).$$
Since $c_1 \in (0, 1)$ we have $c_2 \in (0, 1)$, and by induction, $c_n \in (0, 1)$ for all $n \in \mathbb{N}$. To prove that the sequence $\{c_n\}$ converges we prove that it is contractive. For $n \ge 2$ we have
$$|c_{n+1} - c_n| = \left|\tfrac{1}{3}(c_n^2 + 1) - \tfrac{1}{3}(c_{n-1}^2 + 1)\right| = \tfrac{1}{3}\left|(c_n - c_{n-1})(c_n + c_{n-1})\right| \le \tfrac{2}{3}|c_n - c_{n-1}|.$$
Thus the sequence is contractive with constant $b = \frac{2}{3}$. If $c = \lim_{n\to\infty} c_n$, then $c = \frac{1}{3}(c^2 + 1)$, or $p(c) = 0$. Suppose we begin with $c_1 = .5$ and we wish to determine the value of $n$ such that $|c - c_n| < 10^{-3}$. By Theorem 2.6.7(a) it suffices to determine $n$ such that
$$\frac{b^{n-1}}{1 - b}|c_2 - c_1| < 10^{-3}.$$
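The a priori estimate of Theorem 2.6.7(a) is easy to evaluate for this iteration. The following sketch assumes $b = 2/3$ and $c_1 = 0.5$ as in the example; the function names are invented for illustration, and the comparison value $(3 - \sqrt{5})/2$ is simply the root of $x^2 - 3x + 1$ in $(0, 1)$ given by the quadratic formula.

```python
import math


def iterate(c1, n):
    """Return c_n for the iteration c_{k+1} = (c_k**2 + 1) / 3 started at c_1."""
    c = c1
    for _ in range(n - 1):
        c = (c * c + 1) / 3
    return c


def a_priori_bound(b, c1, c2, n):
    """Bound b^(n-1) / (1 - b) * |c_2 - c_1| on |c - c_n|."""
    return b ** (n - 1) / (1 - b) * abs(c2 - c1)


def first_index_within(tol, b, c1, c2):
    """Smallest n for which the a priori bound is below tol."""
    n = 1
    while a_priori_bound(b, c1, c2, n) >= tol:
        n += 1
    return n


b, c1 = 2 / 3, 0.5
c2 = iterate(c1, 2)
n = first_index_within(1e-3, b, c1, c2)
root = (3 - math.sqrt(5)) / 2
print(n, iterate(c1, n), abs(iterate(c1, n) - root))
```

Under these assumptions the bound first drops below $10^{-3}$ at $n = 15$. The actual error at that index is far smaller, since the bound uses the worst-case constant $2/3$ rather than the true contraction rate near the root.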
n 2, n0.
Proof. Since
$$|s_m - s_n| = \left|\sum_{k=n+1}^{m} a_k\right|,$$
the result is an immediate consequence of Theorem 2.6.2 and Theorem 2.6.4. □
Remark. The previous theorem simply states that the series $\sum a_k$ converges if and only if the sequence $\{s_n\}$ of $n$th partial sums is a Cauchy sequence.
2.7.4
EXAMPLE In this example we show that the series $\sum_{k=1}^{\infty} \frac{1}{k}$ diverges. We accomplish this by showing that the sequence $\{s_n\}$ of partial sums is not a Cauchy sequence. Consider
$$s_{2n} - s_n = \frac{1}{n+1} + \cdots + \frac{1}{2n}, \qquad n \in \mathbb{N}.$$
There are exactly $n$ terms in the sum on the right, and each term is greater than or equal to $1/2n$. Therefore,
$$s_{2n} - s_n \ge n\left(\frac{1}{2n}\right) = \frac{1}{2}.$$
The sequence $\{s_n\}$ therefore fails to be a Cauchy sequence and thus the series diverges. The divergence of this series appears to have been first established by Nicole Oresme (1323?-1382) using a method of proof similar to that suggested in the solution of Exercise 13 of Section 2.3.
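The inequality $s_{2n} - s_n \ge \frac{1}{2}$ can be observed with exact fractions. This is a small illustrative check, not a substitute for the argument above; `partial_sum` is a hypothetical helper name.

```python
from fractions import Fraction


def partial_sum(n):
    """Return s_n = 1 + 1/2 + ... + 1/n exactly."""
    return sum(Fraction(1, k) for k in range(1, n + 1))


for n in (1, 10, 100):
    gap = partial_sum(2 * n) - partial_sum(n)
    print(n, float(gap), gap >= Fraction(1, 2))
```

No matter how far out one goes, the block of terms from $n + 1$ to $2n$ contributes at least $\frac{1}{2}$, so the partial sums cannot satisfy the Cauchy condition with $\varepsilon = \frac{1}{2}$.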
2.7.5
COROLLARY If $\sum_{k=1}^{\infty} a_k$ converges, then $\lim_{k\to\infty} a_k = 0$.
Proof. Since $a_k = s_k - s_{k-1}$, this is an immediate consequence of the Cauchy criterion. □
Remark. The condition $\lim_{k\to\infty} a_k = 0$ is not sufficient for the convergence of $\sum a_k$. For example, the series $\sum_{k=1}^{\infty} \frac{1}{k}$ diverges, yet $\lim_{k\to\infty} \frac{1}{k} = 0$.
2.7.6
THEOREM Suppose $a_k \ge 0$ for all $k \in \mathbb{N}$. Then $\sum_{k=1}^{\infty} a_k$ converges if and only if $\{s_n\}$ is bounded above.
Proof. Since $a_k \ge 0$ for all $k$, the sequence $\{s_n\}$ is monotone increasing. Thus by Theorem 2.3.2, the sequence $\{s_n\}$ converges if and only if it is bounded above. □
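EXERCISES 2.7
1. *Using the inequality
$$\frac{1}{k^2} \le \frac{1}{k(k-1)} = \frac{1}{k-1} - \frac{1}{k}$$
for $k \ge 2$, prove that the series $\sum_{k=1}^{\infty} \frac{1}{k^2}$ converges.
2. Prove that the series $\sum_{k=1}^{\infty} \frac{1}{k^2 + k}$ converges.
3. If $|r| \ge 1$, show that the series $\sum_{k=1}^{\infty} r^k$ diverges.
4. Prove that the series $\sum_{k=0}^{\infty} \frac{1}{k!}$ converges. (See Exercise 3 of Section 2.6.)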
5. *Suppose ak >_ 0 for all k. Prove that if B ak converges, then
k.,
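6. If $\sum_{k=1}^{\infty} a_k$ and $\sum_{k=1}^{\infty} b_k$ both converge, prove each of the following.
a. $\sum_{k=1}^{\infty} c\,a_k$ converges for all $c \in \mathbb{R}$.
b. $\sum_{k=1}^{\infty} (a_k + b_k)$ converges.
7. If $\sum_{k=1}^{\infty} (a_k + b_k)$ converges, does this imply that the series $a_1 + b_1 + a_2 + b_2 + \cdots$ converges?
8. Suppose $b_k \ge a_k \ge 0$ for all $k \in \mathbb{N}$.
a. If $\sum_{k=1}^{\infty} b_k$ converges, prove that $\sum_{k=1}^{\infty} a_k$ converges.
b. If $\sum_{k=1}^{\infty} a_k$ diverges, prove that $\sum_{k=1}^{\infty} b_k$ diverges.
9. Consider the series $\sum_{k=1}^{\infty} \frac{1}{k^p}$, $p \in \mathbb{R}$.
a. Prove that the series diverges for all $p \le 1$.
b. Prove that the series converges for all $p > 1$.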
, at converges if and only if
k
converges.
NOTES
This chapter provided our first serious introduction to the limit process. In subsequent chapters we will encounter limits of functions, the derivative, and the integral, all of which are further examples of the limit process. Of the many results proved in this chapter, it is difficult to select one or two for special emphasis. They are all important! Many of them will be encountered again, either directly or indirectly, throughout the text. Some of the concepts and results of this chapter have certainly been encountered previously; others undoubtedly are new. Two concepts that may not have been previously encountered are limit point of a set and the limit superior (inferior) of a sequence of real numbers. The notion of limit point of a set is one of the fundamental concepts of analysis. We will encounter limit points again when we characterize the closed subsets of $\mathbb{R}$. The notion of limit point will also be crucial in the definition of the limit of a function. The results of Theorem 2.4.7, although elementary, are very useful. The fact that every limit point of a set is the limit of a sequence of distinct points in the set will be exploited in several instances in subsequent chapters. The primary importance of the limit superior and inferior of a sequence is that these two limit operations always exist in $\mathbb{R} \cup \{-\infty, \infty\}$. As we will see in Chapter 7, this will allow us to present the correct statements of the root and ratio tests for convergence of a series. The limit superior will also be required to define the radius of convergence of a power series. There will be other instances in the text where these two limit operations will be encountered.
In this chapter we have proved several important consequences of the least upper bound property of $\mathbb{R}$. The least upper bound property was used to prove that every bounded monotone sequence converges. This result was subsequently used to prove the nested intervals property, which in turn was used to prove the Bolzano-Weierstrass theorem. By Exercise 23 of Section 2.3 and Exercises 13 and 14 of Section 2.4, each of these implies the least upper bound property of $\mathbb{R}$. Another property of the real numbers that is equivalent to the least upper bound property is the completeness property of $\mathbb{R}$; namely, every Cauchy sequence of real numbers converges. Other consequences of the least upper bound property will be encountered in subsequent chapters.
Cauchy sequences were originally studied by Cantor in the middle of the nineteenth century. He referred to them as fundamental sequences and used them in his construction of the real number system $\mathbb{R}$ (see Miscellaneous Exercises 4-11). The main reason that these sequences are attributed to Cauchy, rather than Cantor, is that his 1821 criterion for convergence of a series (Theorem 2.7.3) is equivalent to the statement that the sequence of partial sums is a Cauchy sequence. The fact that Cauchy was a more prominent mathematician than Cantor may also have been a factor. In later chapters we will encounter examples of spaces of functions that have defined on them a function, called a norm, having properties analogous to those of the absolute value function. For such spaces it will also be possible to define both convergence of a sequence and the notion of Cauchy sequence. Many of these spaces will also have the property that they are complete; that is, every Cauchy sequence in the space converges.
MISCELLANEOUS EXERCISES
The first three exercises involve the concept of an infinite product. Let $\{a_k\}$ be a sequence of nonzero real numbers. For each $n = 1, 2, \ldots$, define
$$p_n = \prod_{k=1}^{n} a_k = a_1 \cdot a_2 \cdots a_n.$$
If $p = \lim_{n\to\infty} p_n$ exists, then $p$ is the infinite product of the sequence $\{a_k\}_{k=1}^{\infty}$, and we write
$$p = \prod_{k=1}^{\infty} a_k.$$
If the limit does not exist, then the infinite product is said to diverge. Some authors require that $p \ne 0$. We will not make this requirement; rather we will specify $p \ne 0$ if this hypothesis is required in a result.
1. Determine whether each of the following infinite products converge. If it converges, find the infinite product.
c.frlll
a. H( I)k
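2. If $\prod_{k=1}^{\infty} a_k = p$ with $p \ne 0$, prove that $\lim_{k\to\infty} a_k = 1$.
3. If $a_n \ge 0$ for all $n \in \mathbb{N}$, prove that $\prod_{k=1}^{\infty} (1 + a_k)$ converges if and only if $\sum_{k=1}^{\infty} a_k$ converges. To prove the result, establish the following inequality:
$$a_1 + \cdots + a_n \le (1 + a_1)\cdots(1 + a_n) \le e^{a_1 + \cdots + a_n}.$$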
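CONSTRUCTION OF THE REAL NUMBERS
In the following exercises we outline the construction of the real number system from the rational number system using Cantor's method of Cauchy sequences.
Let $\mathbb{Q}$ denote the set of rational numbers. A sequence $\{a_n\}$ in $\mathbb{Q}$ is Cauchy if for every $r \in \mathbb{Q}$, $r > 0$, there exists a positive integer $n_0$ such that $|a_n - a_m| < r$ for all $n, m \ge n_0$. A sequence $\{a_n\}$ in $\mathbb{Q}$ is called a null sequence if for every $r \in \mathbb{Q}$, $r > 0$, there exists a positive integer $n_0$ such that $|a_n| < r$ for all $n \ge n_0$. Two Cauchy sequences $\{a_n\}$ and $\{b_n\}$ in $\mathbb{Q}$ are said to be equivalent, denoted $\{a_n\} \sim \{b_n\}$, provided $\{a_n - b_n\}$ is a null sequence.
4. Let $\{a_n\}$, $\{b_n\}$, $\{c_n\}$, and $\{d_n\}$ be Cauchy sequences in $\mathbb{Q}$. Prove the following.
a. $\{a_n\} \sim \{a_n\}$.
b. If $\{a_n\} \sim \{b_n\}$, then $\{b_n\} \sim \{a_n\}$.
c. If $\{a_n\} \sim \{b_n\}$ and $\{b_n\} \sim \{c_n\}$, then $\{a_n\} \sim \{c_n\}$.
d. If $\{a_n\} \sim \{b_n\}$, then $\{-a_n\} \sim \{-b_n\}$.
e. If $\{a_n\} \sim \{c_n\}$ and $\{b_n\} \sim \{d_n\}$, then $\{a_n + b_n\} \sim \{c_n + d_n\}$ and $\{a_n b_n\} \sim \{c_n d_n\}$.
Given a Cauchy sequence $\{a_n\}$ in $\mathbb{Q}$, let $[\{a_n\}]$ denote the set of all Cauchy sequences in $\mathbb{Q}$ equivalent to $\{a_n\}$. The set $[\{a_n\}]$ is called the equivalence class determined by $\{a_n\}$.
5. Given two Cauchy sequences $\{a_n\}$ and $\{b_n\}$ in $\mathbb{Q}$, prove that $[\{a_n\}] = [\{b_n\}]$ provided $\{a_n\} \sim \{b_n\}$, and $[\{a_n\}] \cap [\{b_n\}] = \emptyset$ otherwise.
Let $\mathcal{R}$ denote the set of equivalence classes of Cauchy sequences in $\mathbb{Q}$. We denote the elements of $\mathcal{R}$ by lowercase Greek letters $\alpha, \beta, \gamma, \ldots$. Thus if $\alpha \in \mathcal{R}$, $\alpha = [\{a_n\}]$ for some Cauchy sequence $\{a_n\}$ in $\mathbb{Q}$. The sequence $\{a_n\}$ is called a representative of the equivalence class $\alpha$. Suppose $\alpha = [\{a_n\}]$ and $\beta = [\{b_n\}]$. Define $-\alpha$, $\alpha + \beta$, and $\alpha \cdot \beta$ as follows:
$$-\alpha = [\{-a_n\}], \qquad \alpha + \beta = [\{a_n + b_n\}], \qquad \alpha \cdot \beta = [\{a_n b_n\}].$$
One needs to show that these operations are well defined; that is, independent of the representative of the equivalence class. For example, to prove that $-\alpha$ is well defined, we suppose that $\{a_n\}$ and $\{b_n\}$ are two representatives of $\alpha$; i.e., $\{a_n\} \sim \{b_n\}$. But by 4(d), $\{-a_n\} \sim \{-b_n\}$. Therefore, $[\{-a_n\}] = [\{-b_n\}]$. This shows that $-\alpha$ is well defined.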
6. Prove that the operations $+$ and $\cdot$ are well defined on $\mathcal{R}$.
For each $p \in \mathbb{Q}$, let $\{p\}$ denote the constant sequence $p$, and set $\alpha_p = [\{p\}]$. Also, we set $\theta = [\{0\}]$, $\iota = [\{1\}]$. As we will see, the element $\theta$ will be the zero of $\mathcal{R}$ and $\iota$ will be the unit of $\mathcal{R}$. A Cauchy sequence $\{a_n\}$ in $\mathbb{Q}$ belongs to $\theta$ if and only if $a_n \to 0$. Similarly, $\{a_n\} \in \iota$ if and only if $(a_n - 1) \to 0$. The following problem provides us with the multiplicative inverse of $\alpha \ne \theta$.
7. If $\alpha \ne \theta$, prove that there exists $\{a_n\} \in \alpha$ such that $a_n \ne 0$ for all $n \in \mathbb{N}$, and that $\{1/a_n\}$ is a Cauchy sequence. Define $\alpha^{-1} = [\{1/a_n\}]$.
8. Prove that $\mathcal{R}$ with operations $+$ and $\cdot$ is a field.
We now proceed to define an order relation on $\mathcal{R}$. A Cauchy sequence $\{a_n\}$ in $\mathbb{Q}$ is positive if there exist $r \in \mathbb{Q}$, $r > 0$, and $n_0 \in \mathbb{N}$ such that $a_n > r$ for all $n \ge n_0$. Let $\mathcal{P}$ be defined by
$$\mathcal{P} = \{[\{a_n\}] : \{a_n\} \text{ is a positive Cauchy sequence}\}.$$
9. Prove that the set $\mathcal{P}$ satisfies the order properties (O1) and (O2) of Section 1.4.
10. Show that the mapping $p \to \alpha_p$ is a one-to-one mapping of $\mathbb{Q}$ into $\mathcal{R}$ which satisfies $\alpha_p + \alpha_q = \alpha_{p+q}$ for all $p, q \in \mathbb{Q}$. Furthermore, if $p > 0$, then $\alpha_p \in \mathcal{P}$.
11. Prove that every nonempty subset of $\mathcal{R}$ which is bounded above has a least upper bound in $\mathcal{R}$.
The above exercises prove that $\mathcal{R}$ is an ordered field that satisfies the least upper bound property. One can show that any two complete ordered fields are in fact isomorphic; that is, there exists a one-to-one mapping of one onto the other that preserves the operations of addition, multiplication, and the order properties. Thus $\mathcal{R}$ is isomorphic to the real numbers $\mathbb{R}$.
3
Structure of Point Sets
3.1 Open and Closed Sets
3.2 Compact Sets
3.3 The Cantor Set
In this chapter we introduce some of the basic concepts fundamental to the study of limits and continuity, and study the structure of point sets in $\mathbb{R}$. The branch of mathematics concerned with the study of these topics (not only for the real numbers but also for more general sets) is known as topology. Modern point set topology dates back to the early part of this century; its roots, however, date back to the 1850s and 1860s and the studies of Bolzano, Cantor, and Weierstrass on sets of real numbers. Many important mathematical concepts depend on the concept of a limit point of a set and the limit process, and one of the primary goals of topology is to provide an appropriate setting for the study of these concepts.
Although we restrict our study to the topology of the real line, all of the concepts encountered in this chapter can be defined in the more general setting of metric spaces. A thorough understanding of these topics on the real line will prove invaluable when they are encountered again in more abstract settings. On first reading, the concepts introduced in this chapter may seem difficult and challenging. With perseverance, however, understanding will follow.
3.1
Open and Closed Sets
In the previous two chapters we used the terms open and closed in describing intervals in $\mathbb{R}$. The purpose of this section is to give a precise meaning to the adjectives open and closed, not only for intervals, but also for arbitrary subsets of $\mathbb{R}$. Before defining what we mean by an open set, we first define the concept of an interior point of a set.
3.1.1
DEFINITION Let $E$ be a subset of $\mathbb{R}$. A point $p \in E$ is called an interior point of $E$ if there exists an $\varepsilon > 0$ such that $N_\varepsilon(p) \subset E$. The set of interior points of $E$ is denoted by $\operatorname{Int}(E)$, and is called the interior of $E$.
Recall that for $p \in \mathbb{R}$ and $\varepsilon > 0$, the $\varepsilon$-neighborhood $N_\varepsilon(p)$ of $p$ is defined as
$$N_\varepsilon(p) = \{x \in \mathbb{R} : |x - p| < \varepsilon\}.$$
3.1.2
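EXAMPLES
(a) Let $E = (a, b]$ with $a < b$. Every $p$ satisfying $a < p < b$ is an interior point of $E$. If $\varepsilon$ is chosen such that $0 < \varepsilon < \min\{p - a, b - p\}$, then $N_\varepsilon(p) \subset E$. The point $b \in E$, however, is not an interior point of $E$: for every $\varepsilon > 0$, $N_\varepsilon(b) = (b - \varepsilon, b + \varepsilon)$ contains points that are not in $E$. Any $x$ satisfying $b < x < b + \varepsilon$ is not in $E$. This is illustrated in Figure 3.1. For this example, $\operatorname{Int}(E) = (a, b)$.
[Figure 3.1: the neighborhoods $N_\varepsilon(p)$ and $N_\varepsilon(b)$ for the interval $(a, b]$]
(b) Let $E$ denote the set of irrational real numbers, i.e., $E = \mathbb{R} \setminus \mathbb{Q}$. If $p \in E$, then by Theorem 1.5.2, for every $\varepsilon > 0$ there exists $r \in \mathbb{Q} \cap N_\varepsilon(p)$. Thus $N_\varepsilon(p)$ always contains a point of $\mathbb{R}$ not in $E$. Therefore no point of $E$ is an interior point of $E$; i.e., $\operatorname{Int}(E) = \emptyset$. Using the fact that between any two real numbers there exists an irrational number (Exercise 6, Section 1.5), a similar argument also proves that $\operatorname{Int}(\mathbb{Q}) = \emptyset$.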
Open and Closed Sets
Using the notion of an interior point, we now define what we mean by an open set.
3.1.3
DEFINITION
(a) A subset $O$ of $\mathbb{R}$ is open if every point of $O$ is an interior point of $O$.
(b) A subset $F$ of $\mathbb{R}$ is closed if $F^c = \mathbb{R} \setminus F$ is open.
Remark. From the definition of an interior point it should be clear that a set $O \subset \mathbb{R}$ is open if and only if for every $p \in O$ there exists an $\varepsilon > 0$ (depending on $p$) so that $N_\varepsilon(p) \subset O$. In Theorem 3.1.9 we will provide a characterization of closed sets in terms of limit points.
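3.1.4
EXAMPLES
(a) The entire set $\mathbb{R}$ is open. For any $p \in \mathbb{R}$ and $\varepsilon > 0$, $N_\varepsilon(p) \subset \mathbb{R}$. Since $\mathbb{R}$ is open, by definition the empty set $\emptyset$ is closed. However, the empty set is also open. Since $\emptyset$ contains no points at all, Definition 3.1.3(a) is vacuously satisfied. Consequently $\mathbb{R}$ is also closed.
(b) Every $\varepsilon$-neighborhood is open. Suppose $p \in \mathbb{R}$ and $\varepsilon > 0$. If $q \in N_\varepsilon(p)$, then $|p - q| < \varepsilon$. Choose $\delta$ so that $0 < \delta \le \varepsilon - |p - q|$. If $x \in N_\delta(q)$, then
$$|x - p| \le |p - q| + |x - q| < |p - q| + \delta \le \varepsilon,$$
so that $N_\delta(q) \subset N_\varepsilon(p)$.
... $\varepsilon_i > 0$ such that $N_{\varepsilon_i}(p) \subset O_i$. Let $\varepsilon = \min\{\varepsilon_1, \ldots, \varepsilon_n\}$. Then $\varepsilon > 0$ and $N_\varepsilon(p) \subset O_i$ for all $i$. Therefore $N_\varepsilon(p) \subset O$; i.e., $p$ is an interior point of $O$. Since $p \in O$ was arbitrary, $O$ is open. □
As a consequence of the previous theorem every closed interval $[a, b]$, $a, b \in \mathbb{R}$ with $a \le b$, is a closed subset of $\mathbb{R}$. Since $\mathbb{R} \setminus [a, b] = (-\infty, a) \cup (b, \infty)$ is the union of two open intervals, by the previous theorem $\mathbb{R} \setminus [a, b]$ is an open subset of $\mathbb{R}$. Thus $[a, b]$ is a closed set. For closed subsets of $\mathbb{R}$ we have the following analogue of the previous result.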
3.1.7
THEOREM
(a) For any collection $\{F_\alpha\}_{\alpha \in A}$ of closed subsets of $\mathbb{R}$, $\bigcap_{\alpha \in A} F_\alpha$ is closed.
(b) For any finite collection $\{F_1, \ldots, F_n\}$ of closed subsets of $\mathbb{R}$, $\bigcup_{j=1}^{n} F_j$ is closed.
Proof. The proofs of (a) and (b) follow from the previous theorem and De Morgan's laws:
$$\left(\bigcap_{\alpha \in A} F_\alpha\right)^c = \bigcup_{\alpha \in A} F_\alpha^c, \qquad \left(\bigcup_{j=1}^{n} F_j\right)^c = \bigcap_{j=1}^{n} F_j^c. \qquad \square$$
Remark. The fact that the intersection of a finite number of open sets is open is due to the fact that the minimum of a finite number of positive numbers is positive. This guarantees the existence of an $\varepsilon > 0$ such that the $\varepsilon$-neighborhood of $p$ is contained in the intersection. For an infinite number of open sets, the choice of a positive $\varepsilon$ may no longer be possible. This is illustrated by the following two examples.
3.1.8
EXAMPLES We now provide two examples to show that part (b) of Theorem 3.1.6 is, in general, false for a countable collection of open sets. Likewise, part (b) of Theorem 3.1.7 is, in general, also false for an arbitrary union of closed sets (Exercise 6b).
(a) For each $n = 1, 2, \ldots$, let $O_n = \left(-\frac{1}{n}, \frac{1}{n}\right)$. Then each $O_n$ is open, but
$$\bigcap_{n=1}^{\infty} O_n = \{0\},$$
which is not open.
(b) Alternatively, if we let $G_n = \left(0, 1 + \frac{1}{n}\right)$, $n = 1, 2, \ldots$, then again each $G_n$ is open, but
$$\bigcap_{n=1}^{\infty} G_n = (0, 1],$$
which is neither open nor closed.
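Example 3.1.8(a) rests on the observation that every $x \ne 0$ is eventually excluded from $O_n = (-\frac{1}{n}, \frac{1}{n})$, while 0 is never excluded. A short sketch with exact rationals makes this concrete; the helper names `in_O` and `first_excluding_index` are assumptions introduced only for illustration.

```python
from fractions import Fraction
from math import ceil


def in_O(x, n):
    """True if x lies in the open interval O_n = (-1/n, 1/n)."""
    return -Fraction(1, n) < x < Fraction(1, n)


def first_excluding_index(x):
    """Smallest n with x not in O_n, for a nonzero rational x."""
    if x == 0:
        raise ValueError("0 belongs to every O_n")
    # x is outside O_n exactly when 1/n <= |x|, i.e. n >= 1/|x|.
    return ceil(1 / abs(Fraction(x)))


x = Fraction(1, 1000)
n = first_excluding_index(x)
print(n, in_O(x, n - 1), in_O(x, n))
```

Since no $\varepsilon$-neighborhood of 0 consists of 0 alone, the intersection $\{0\}$ has no interior point, which is why it fails to be open.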
The following theorem provides a characterization of the closed subsets of $\mathbb{R}$. Before stating the theorem we recall the definition of a limit point (Definition 2.4.5). For $E \subset \mathbb{R}$, a point $p \in \mathbb{R}$ is a limit point of $E$ if for every $\varepsilon > 0$,
$$(N_\varepsilon(p) \setminus \{p\}) \cap E \ne \emptyset.$$
3.1.9
THEOREM A subset $F$ of $\mathbb{R}$ is closed if and only if $F$ contains all its limit points.
Proof. Suppose $F$ is closed. Then by definition $F^c$ is open and thus for every $p \in F^c$ there exists $\varepsilon > 0$ such that $N_\varepsilon(p) \subset F^c$; that is, $N_\varepsilon(p) \cap F = \emptyset$. Consequently, no point of $F^c$ is a limit point of $F$. Therefore $F$ must contain all its limit points.
Conversely, let $F$ be a subset of $\mathbb{R}$ that contains all its limit points. To show $F$ is closed we must show $F^c$ is open. Let $p \in F^c$. Since $F$ contains all its limit points, $p$ is not a limit point of $F$. Thus there exists an $\varepsilon > 0$ such that $N_\varepsilon(p) \cap F = \emptyset$. Hence $N_\varepsilon(p) \subset F^c$ and $p$ is an interior point of $F^c$. Since $p \in F^c$ was arbitrary, $F^c$ is open and therefore $F$ is closed. □
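Closure of a Set
3.1.10
DEFINITION If $E$ is a subset of $\mathbb{R}$, let $E'$ denote the set of limit points of $E$. The closure of $E$, denoted $\overline{E}$, is defined as
$$\overline{E} = E \cup E'.$$
3.1.11
THEOREM If $E$ is a subset of $\mathbb{R}$, then
(a) $\overline{E}$ is closed.
(b) $E = \overline{E}$ if and only if $E$ is closed.
(c) $\overline{E} \subset F$ for every closed set $F \subset \mathbb{R}$ such that $E \subset F$.
Proof. (a) To show that $\overline{E}$ is closed, we must show that $\overline{E}^c$ is open. Let $p \in \overline{E}^c$. Then $p \notin E$ and $p$ is not a limit point of $E$. Thus there exists an $\varepsilon > 0$ such that
$$N_\varepsilon(p) \cap E = \emptyset.$$
We complete the proof by showing that $N_\varepsilon(p) \cap E'$ is also empty and thus $N_\varepsilon(p) \cap \overline{E} = \emptyset$. Therefore $N_\varepsilon(p) \subset \overline{E}^c$; i.e., $p$ is an interior point of $\overline{E}^c$.
Suppose $N_\varepsilon(p) \cap E' \ne \emptyset$. Let $q \in N_\varepsilon(p) \cap E'$, and choose $\delta > 0$ such that $N_\delta(q) \subset N_\varepsilon(p)$. Since $q \in E'$, $q$ is a limit point of $E$ and thus $N_\delta(q) \cap E \ne \emptyset$. But this implies that $N_\varepsilon(p) \cap E \ne \emptyset$, which is a contradiction. Therefore $N_\varepsilon(p) \cap E' = \emptyset$, which proves the result.
(b) If $E = \overline{E}$, then $E$ is closed. Conversely, if $E$ is closed, then $E' \subset E$ and thus $\overline{E} = E$.
(c) If $E \subset F$ and $F$ is closed, then $E' \subset F$. Thus $\overline{E} \subset F$. □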
3.1.12
DEFINITION A subset $D$ of $\mathbb{R}$ is dense in $\mathbb{R}$ if $\overline{D} = \mathbb{R}$.
The rationals $\mathbb{Q}$ are dense in $\mathbb{R}$. By Example 2.4.6(c), every point of $\mathbb{R}$ is a limit point of $\mathbb{Q}$. Hence $\overline{\mathbb{Q}} = \mathbb{R}$. This explains the comment following Theorem 1.5.2. The rationals are not only dense; they are also countable. Countable dense subsets play a very important role in analysis. They allow us to approximate arbitrary elements in a set by elements chosen from a countable subset of $\mathbb{R}$. Since the rationals are dense in $\mathbb{R}$, given any $p \in \mathbb{R}$ and $\varepsilon > 0$, there exists $r \in \mathbb{Q}$ such that $|p - r| < \varepsilon$. Additional examples of this will occur elsewhere in the text.
Characterization of the Open Subsets of $\mathbb{R}$ [1]
If $\{I_n\}$ is any finite or countable collection of open intervals, then by Theorem 3.1.6, $U = \bigcup_n I_n$ is an open subset of $\mathbb{R}$. Conversely, every open subset of $\mathbb{R}$ can be expressed as a finite or countable union of open intervals (see Exercise 17). However, a much stronger result is true. We now prove that every open set can be expressed as a finite or countable union of pairwise disjoint open intervals. A collection $\{I_n\}$ of subsets of $\mathbb{R}$ is pairwise disjoint if $I_n \cap I_m = \emptyset$ whenever $n \ne m$.
3.1.13 THEOREM If $U$ is an open subset of $\mathbb{R}$, then there exists a finite or countable collection $\{I_n\}$ of pairwise disjoint open intervals such that
$$U = \bigcup_n I_n.$$
Proof. Let $x \in U$. Since $U$ is open, there exists an $\epsilon > 0$ such that $(x - \epsilon, x + \epsilon) \subset U$. In particular, $(s, x]$ and $[x, t)$ are subsets of $U$ for some $s < x$ and some $t > x$. Define $r_x$ and $l_x$ as follows:
$$r_x = \sup\{t : t > x \text{ and } [x, t) \subset U\}, \qquad l_x = \inf\{s : s < x \text{ and } (s, x] \subset U\}.$$
Then $x < r_x \le \infty$ and $-\infty \le l_x < x$. For each $x \in U$, let $I_x = (l_x, r_x)$. Then (a) $I_x \subset U$, and (b) if $x, y \in U$, then either $I_x = I_y$ or $I_x \cap I_y = \emptyset$. The proofs of (a) and (b) are left as exercises (Exercise 16).
To complete the proof, we let $\mathcal{I} = \{I_x : x \in U\}$. For each interval $I \in \mathcal{I}$, choose $r_I \in \mathbb{Q}$ such that $r_I \in I$. If $I, J \in \mathcal{I}$ are distinct intervals, then $r_I \ne r_J$. Therefore the mapping $I \to r_I$ is a one-to-one mapping of $\mathcal{I}$ into $\mathbb{Q}$. Thus the collection $\mathcal{I}$ is at most countable and therefore can be enumerated as $\{I_j\}_{j \in A}$, where $A$ is either a finite subset of $\mathbb{N}$, or $A = \mathbb{N}$. Clearly
$$U = \bigcup_{j \in A} I_j,$$
and by (b), if $n \ne j$, then $I_n \cap I_j = \emptyset$. Thus the collection $\{I_j\}_{j \in A}$ is pairwise disjoint.

1. This topic can be omitted upon first reading of the text. The structure of open sets will only be required in Chapter 10 when defining the measure of an open subset of $\mathbb{R}$.
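For an open set that is already given as a finite union of open intervals, the decomposition of Theorem 3.1.13 can be computed directly. The sketch below is illustrative only: the name `disjoint_components` and the representation of an interval $(a, b)$ as a pair are assumptions, not notation from the text. Two intervals are merged only when they genuinely overlap; $(0, 1)$ and $(1, 2)$ stay separate because the point $1$ is not in their union.

```python
def disjoint_components(intervals):
    """Given open intervals (a, b) as pairs, return the pairwise disjoint
    open intervals whose union equals the union of the inputs."""
    ivs = sorted((a, b) for a, b in intervals if a < b)  # drop empty intervals
    out = []
    for a, b in ivs:
        # Strict overlap test: (c, d) and (a, b) with a < d share points.
        if out and a < out[-1][1]:
            out[-1] = (out[-1][0], max(out[-1][1], b))
        else:
            out.append((a, b))
    return out


components = disjoint_components([(0, 2), (1, 3), (3, 4), (5, 6)])
print(components)
```

Each output interval plays the role of one $I_x$ in the proof: it is the largest open interval inside $U$ containing any of its points.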
Relatively Open and Closed Sets
One of the reasons for studying topological concepts is to enable us to study properties of continuous functions. In most instances, the domain of a function is not all of $\mathbb{R}$, but rather a proper subset of $\mathbb{R}$, as is the case with $f(x) = \sqrt{x}$, for which $\operatorname{Dom} f = [0, \infty)$. When discussing a particular function we will always restrict our attention to the domain of the function rather than all of $\mathbb{R}$. With this in mind we make the following definition.
3.1.14 DEFINITION Let $X$ be a subset of $\mathbb{R}$.
(a) A subset $U$ of $X$ is open in (or open relative to) $X$ if for every $p \in U$, there exists $\epsilon > 0$ such that $N_\epsilon(p) \cap X \subset U$.
(b) A subset $C$ of $X$ is closed in (or closed relative to) $X$ if $X \setminus C$ is open in $X$.

3.1.15 EXAMPLE Let $X = [0, \infty)$ and let $U = [0, 1)$. Then $U$ is not open in $\mathbb{R}$ but is open in $X$. (Why?)

The following theorem, the proof of which is left as an exercise (Exercise 21), provides a simple characterization of what it means for a set to be open or closed in $X$.

3.1.16 THEOREM Let $X$ be a subset of $\mathbb{R}$.
(a) A subset $U$ of $X$ is open in $X$ if and only if $U = X \cap O$ for some open subset $O$ of $\mathbb{R}$.
(b) A subset $C$ of $X$ is closed in $X$ if and only if $C = X \cap F$ for some closed subset $F$ of $\mathbb{R}$.
Connected Sets²
Our final topic in this section involves the notion of a "connected set." The idea of connectedness is just one more of the many mathematical concepts that have their roots in Cantor's studies on the structure of subsets of $\mathbb{R}$. When we use the term "connected subset" of $\mathbb{R}$, intuitively we are inclined to think of an interval as opposed to sets such as the positive integers $\mathbb{N}$ or $(0, 1) \cup \{2\}$. We make this precise with the following definition.

3.1.17 DEFINITION A subset $A$ of $\mathbb{R}$ is connected if there do not exist two disjoint open sets $U$ and $V$ such that
(a) $A \cap U \ne \emptyset$ and $A \cap V \ne \emptyset$, and
(b) $(A \cap U) \cup (A \cap V) = A$.

2. This concept, though important and used implicitly in several instances in the text, will not be required specifically in subsequent chapters except in a few exercises. Thus the topic of connectedness can be omitted upon first reading of the text.

The definition for a connected set differs from most definitions in that it defines connectedness by negation; i.e., defining what it means for a set not to be connected. According to the definition, a set $A$ is not connected if there exist disjoint open sets $U$ and $V$ satisfying both (a) and (b). As an example of a subset of $\mathbb{R}$ that is not connected, consider the set of positive integers $\mathbb{N}$. If we let $U = (\tfrac{1}{2}, \tfrac{3}{2})$ and $V = (\tfrac{3}{2}, \infty)$, then $U$ and $V$ are disjoint open subsets of $\mathbb{R}$ with
$$U \cap \mathbb{N} = \{1\} \quad \text{and} \quad V \cap \mathbb{N} = \{2, 3, \ldots\}$$
that also satisfy $(U \cap \mathbb{N}) \cup (V \cap \mathbb{N}) = \mathbb{N}$. That the interval $(a, b)$ is connected is a consequence of the following theorem, the proof of which is left to the exercises (Exercise 25).

3.1.18 THEOREM A subset of $\mathbb{R}$ is connected if and only if it is an interval.
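EXERCISES 3.1
1. Prove Theorem 3.1.5.
2. *Show that every finite subset of $\mathbb{R}$ is closed.
3. Show that the intervals $(-\infty, a]$ and $[a, \infty)$ are closed subsets of $\mathbb{R}$.
4. For the following subsets $E$ of $\mathbb{R}$, fill in the chart.

| $E$ | $\operatorname{Int}(E)$ | $E'$ | Isol. pts. of $E$ | $\overline{E}$ | Open? | Closed? |
| $(0, 1) \cup \{2\}$ | | | | | | |
| $(a, b)$ | | | | | | |
| $(a, b]$ | | | | | | |
| $[a, b]$ | | | | | | |
| $[0, 1] \cap \mathbb{Q}$ | | | | | | |

5. a. Let $F$ be a closed subset of $\mathbb{R}$ and let $\{p_n\}$ be a sequence in $F$ which converges to $p \in \mathbb{R}$. Prove that $p \in F$.
b. Show by example that the conclusion is false if $F$ is not closed.
6. *a. Prove Theorem 3.1.6(a).
b. Give an example of a countable collection $\{F_n\}$ of closed subsets of $\mathbb{R}$ such that $\bigcup_{n=1}^{\infty} F_n$ is not closed.
7. Let $A$, $B$ be subsets of $\mathbb{R}$.
a. If $A \subset B$, show that $\operatorname{Int}(A) \subset \operatorname{Int}(B)$.
b. Show that $\operatorname{Int}(A \cap B) = \operatorname{Int}(A) \cap \operatorname{Int}(B)$.
c. Is $\operatorname{Int}(A \cup B) = \operatorname{Int}(A) \cup \operatorname{Int}(B)$?
8. Let $E$ be a subset of $\mathbb{R}$.
*a. Prove that $\operatorname{Int}(E)$ is open.
b. Prove that $E$ is open if and only if $E = \operatorname{Int}(E)$.
c. If $G \subset E$ and $G$ is open, prove that $G \subset \operatorname{Int}(E)$.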
9. Let $A$, $B$ be subsets of $\mathbb{R}$.
*a. Show that $\overline{A \cup B} = \overline{A} \cup \overline{B}$.
b. Show that $\overline{A \cap B} \subset \overline{A} \cap \overline{B}$.
c. Give an example for which the containment in part (b) is proper.
10. Prove that the set of limit points of a set is closed.
11. Let $E \subset \mathbb{R}$. A point $p \in \mathbb{R}$ is a boundary point of $E$ if for every $\epsilon > 0$, $N_\epsilon(p)$ contains both points of $E$ and points of $E^c$. Find the boundary points of each of the following sets.
*a. $(a, b)$
b. $E = \{\tfrac{1}{n} : n \in \mathbb{N}\}$
c. $\mathbb{N}$
d. $\mathbb{Q}$
12. a. Prove that a set $E \subset \mathbb{R}$ is open if and only if $E$ does not contain any of its boundary points.
b. Prove that a set $E \subset \mathbb{R}$ is closed if and only if $E$ contains all its boundary points.
13. *Prove that the set of irrational numbers is dense in $\mathbb{R}$.
14. *If $D$ is dense in $\mathbb{R}$, prove that for every $p \in \mathbb{R}$ there exists a sequence $\{p_n\}$ in $D$ with $\lim p_n = p$.
15. Let $D_0 = \{0, 1\}$, and for each $n \in \mathbb{N}$, let $D_n = \{a/2^n : a \in \mathbb{N},\ a \text{ is odd},\ 0 < a < 2^n\}$. Let $D = \bigcup_{n=0}^{\infty} D_n$. Prove that $D$ is a countable dense subset of $[0, 1]$.
16. Prove statements (a) and (b) of Theorem 3.1.13.
17. *Prove that there exists a countable collection $\mathcal{I}$ of open intervals such that if $U$ is an open subset of $\mathbb{R}$ and $p \in U$, there exists $I \in \mathcal{I}$ with $p \in I \subset U$.
18. Let $X = (0, \infty)$. For each of the following subsets of $X$ determine whether the given set is open in $X$, closed in $X$, or neither.
*a. $(0, 1]$
b. $(0, 1)$
*c. $(0, 1] \cup (2, 3)$
d. $(0, 1] \cup \{2\}$
e. $\{\tfrac{1}{n} : n \in \mathbb{N}\}$
19. For each of the following subsets of $\mathbb{Q}$, determine whether the set is open in $\mathbb{Q}$, closed in $\mathbb{Q}$, both open and closed in $\mathbb{Q}$, or neither.
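a. $A = \{p \in \mathbb{Q} :$ …

… $\mathbb{R}^n = \{(x_1, \ldots, x_n) : x_i \in \mathbb{R},\ i = 1, \ldots, n\}$. For $p = (p_1, \ldots, p_n)$, $q = (q_1, \ldots, q_n)$ in $\mathbb{R}^n$ and $c \in \mathbb{R}$, define $p + q = (p_1 + q_1, \ldots, p_n + q_n)$ and $cp = (cp_1, \ldots, cp_n)$. Also, let $0 = (0, \ldots, 0)$. For $p, q \in \mathbb{R}^n$, the inner product of $p$ and $q$, denoted $\langle p, q \rangle$, is defined as
$$\langle p, q \rangle = p_1 q_1 + \cdots + p_n q_n.$$
1. Prove each of the following. For $p, q, r \in \mathbb{R}^n$,
a. $\langle p, p \rangle \ge 0$ with equality if and only if $p = 0$.
b. $\langle p, q \rangle = \langle q, p \rangle$.
c. $\langle ap + bq, r \rangle = a\langle p, r \rangle + b\langle q, r \rangle$ for all $a, b \in \mathbb{R}$.
d. $|\langle p, q \rangle| \le \sqrt{\langle p, p \rangle}\,\sqrt{\langle q, q \rangle}$.
This last inequality is usually called the Cauchy-Schwarz inequality. As a hint on how to prove part (d), for $\lambda \in \mathbb{R}$, expand $\langle p - \lambda q, p - \lambda q \rangle$ and then choose $\lambda$ appropriately. Note that by part (a), $\langle p - \lambda q, p - \lambda q \rangle \ge 0$ for all $\lambda \in \mathbb{R}$.
2. For $p = (p_1, \ldots, p_n) \in \mathbb{R}^n$, set
$$\|p\|_2 = \sqrt{\langle p, p \rangle} = \sqrt{p_1^2 + \cdots + p_n^2}.$$
The quantity $\|p\|_2$ is called the norm or the euclidean length of the vector $p$.
a. Use the result of Exercise 1(d) to prove that $\|p + q\|_2 \le \|p\|_2 + \|q\|_2$ for all $p, q \in \mathbb{R}^n$.
b. Using the result of part (a), prove that $d_2(p, q) = \|p - q\|_2$ is a metric on $\mathbb{R}^n$.
c. In $\mathbb{R}^2$ sketch the 1-neighborhood of $0$.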
3. For $p = (p_1, \ldots, p_n)$ and $q = (q_1, \ldots, q_n)$ in $\mathbb{R}^n$, set
$$d_1(p, q) = \sum_{i=1}^{n} |p_i - q_i|.$$
a. Prove that $d_1$ is a metric on $\mathbb{R}^n$.
b. In $\mathbb{R}^2$ sketch the 1-neighborhood of $0$.
c. Suppose that $\{p_k\}$ is a sequence in $\mathbb{R}^n$, where for each $k \in \mathbb{N}$, $p_k = (p_{k,1}, \ldots, p_{k,n})$. Prove that the sequence $\{p_k\}$ converges to $p = (p_1, \ldots, p_n)$ if and only if $\lim_{k \to \infty} p_{k,i} = p_i$ for all $i = 1, \ldots, n$.
d. Prove that a sequence $\{p_k\}$ in $\mathbb{R}^n$ converges to $p \in \mathbb{R}^n$ with respect to the metric $d_1$ if and only if it converges to $p$ with respect to the metric $d_2$. Specifically, prove that $\lim_{k \to \infty} d_1(p_k, p) = 0$ if and only if $\lim_{k \to \infty} d_2(p_k, p) = 0$.
4. a. If $(X, d)$ is a metric space, prove that $\rho(x, y) = d(x, y)/(1 + d(x, y))$ is also a metric on $X$.
b. Prove that a subset $U$ of $X$ is open in $(X, d)$ if and only if it is open in $(X, \rho)$.
5. If $E$ is an uncountable subset of $\mathbb{R}$, prove that some point of $E$ is a limit point of $E$. (Hint: Use Exercise 17 of Section 3.1.)
6. Let $\{O_n\}$ be a countable collection of dense open subsets of $\mathbb{R}$. Prove that $\bigcap_{n} O_n$ is dense in $\mathbb{R}$.

The following exercise is designed to prove the converse of Theorem 3.2.6; namely, if $K$ is a subset of a metric space $(X, d)$ having the property that every infinite subset of $K$ has a limit point in $K$, then $K$ is compact.

7. Let $K$ be a subset of a metric space $(X, d)$ that has the property that every infinite subset of $K$ has a limit point in $K$.
a. Prove that there exists a countable subset $D$ of $K$ which is dense in $K$. (Hint: Fix $n \in \mathbb{N}$. Let $p_1 \in K$ be arbitrary. Choose $p_2 \in K$, if possible, such that $d(p_1, p_2) \ge \tfrac{1}{n}$. Suppose $p_1, \ldots, p_j$ have been chosen. Choose $p_{j+1}$, if possible, such that $d(p_i, p_{j+1}) \ge \tfrac{1}{n}$ for all $i = 1, \ldots, j$. Use the assumption about $K$ to prove that this process must terminate after a finite number of steps. Let $P_n$ denote this finite collection of points, and let $D = \bigcup_{n \in \mathbb{N}} P_n$. Prove that $D$ is countable and dense in $K$.)
b. Let $D$ be as in (a), and let $U$ be an open subset of $X$ such that $U \cap K \ne \emptyset$. Prove that there exists $p \in D$ and $n \in \mathbb{N}$ such that $N_{1/n}(p) \subset U$.
c. Using the result of (b), prove that for every open cover $\mathcal{U}$ of $K$, there exists a finite or countable collection $\{U_n\}_n \subset \mathcal{U}$ such that $K \subset \bigcup_n U_n$.
d. Prove that every countable open cover of $K$ has a finite subcover. (Hint: If $\{U_n\}$ is a countable open cover of $K$, for each $n \in \mathbb{N}$, let $W_n = \bigcup_{j=1}^{n} U_j$. Prove that $K \subset W_n$ for some $n \in \mathbb{N}$. Assume that the result is false, and obtain an infinite subset of $K$ with no limit point in $K$, which is a contradiction.)
SUPPLEMENTAL READING
Asic, M. D. and Adamovic, D. D., "Limit points of sequences in metric spaces," Amer. Math. Monthly 77 (1970), 613–616.
Corazza, P., "Introduction to metric-preserving functions," Amer. Math. Monthly 106 (1999), 309–323.
Dubeau, Francis, "Cauchy-Bunyakowski-Schwarz inequality revisited," Amer. Math. Monthly 99 (1990), 419–421.
Espelie, M. S. and Joseph, J. E., "Compact subsets of the Sorgenfrey line," Math. Mag. 49 (1976), 250–251.
Fleron, Julian F., "A note on the history of the Cantor set and Cantor function," Math. Mag. 67 (1994), 136–140.
Geissinger, Ladnor, "Pythagoras and the Cauchy-Schwarz inequality," Amer. Math. Monthly 83 (1976), 40–41.
Kaplansky, Irving, Set Theory and Metric Spaces, Chelsea Publ. Co., New York, 1977.
Kraft, R. L., "A golden Cantor set," Amer. Math. Monthly 105 (1998), 718–725.
Labarre, Jr., A. E., "Structure theorem for open sets of real numbers," Amer. Math. Monthly 72 (1965), 1114.
Nathanson, M. B., "Round metric spaces," Amer. Math. Monthly 82 (1975), 738–741.
Limits and Continuity
4.1 Limit of a Function
4.2 Continuous Functions
4.3 Uniform Continuity
4.4 Monotone Functions and Discontinuities

The concept of limit dates back to the late seventeenth century and the work of Isaac Newton (1642–1727) and Gottfried Leibniz (1646–1716). Both of these mathematicians are given historical credit for inventing the differential and integral calculus. Although the idea of limit occurs in Newton's work Philosophiae Naturalis Principia Mathematica of 1687, he never expressed the concept algebraically; rather, he used the phrase "ultimate ratios of evanescent quantities" to describe the limit process involved in computing the derivatives of functions. The subject of limits lacked mathematical rigor until 1821 when Augustin-Louis Cauchy (1789–1857) published his Cours d'Analyse in which he offered the following definition of limit: "If the successive values attributed to the same variable approach indefinitely a fixed value, such that finally they differ from it by as little as desired, this latter is called the limit of all the others." Even this statement does not resemble the modern delta-epsilon version of limit given in Section 4.1. Although Cauchy gave a strictly verbal definition of limit, he did use epsilons, deltas, and inequalities in his proofs. For this reason, Cauchy is credited for putting calculus on the rigorous basis we are familiar with today.
Based on the previous study of calculus, the student should have an intuitive notion of what it means for a function to be continuous. This most likely compares to how mathematicians of the eighteenth century perceived a continuous function; namely, one that can be expressed by a single formula or equation involving a variable $x$. Mathematicians of this period certainly accepted functions that failed to be continuous at a finite number of points. However, even they might have difficulty envisaging a function that is continuous at every irrational number and discontinuous at every rational number in its domain. Such a function is given in Example 4.2.2(g). An example of an increasing function having the same properties will also be given in Section 4 of this chapter.
4.1 Limit of a Function
The basic idea underlying the concept of the limit of a function $f$ at a point $p$ is to study the behavior of $f$ at points close to, but not equal to, $p$. We illustrate this with the following simple examples. Suppose that the velocity $v$ (ft/sec) of a falling object is given as a function $v = v(t)$ of time $t$. If the object hits the ground in $t = 2$ seconds, then $v(2) = 0$. Thus to find the velocity at the time of impact, we investigate the behavior of $v(t)$ as $t$ approaches 2, but is not equal to 2. Neglecting air resistance, the function $v(t)$ is given as follows:
$$v(t) = 32t, \quad 0 \le t < 2, \quad \ldots$$
… given $\epsilon > 0$, there exists a $\delta > 0$ for which
$$|f(x) - L| < \epsilon$$
for all points $x \in E$ satisfying $0 < |x - p| < \delta$.
The definition of the limit of a function can also be stated in terms of $\epsilon$ and $\delta$ neighborhoods as follows: If $E \subset \mathbb{R}$, $f : E \to \mathbb{R}$, and $p$ is a limit point of $E$, then
$$\lim_{x \to p} f(x) = L$$
if and only if given $\epsilon > 0$, there exists a $\delta > 0$ such that
$$f(x) \in N_\epsilon(L) \quad \text{for all } x \in E \cap (N_\delta(p) \setminus \{p\}).$$
This is illustrated graphically in Figure 4.1.

Figure 4.1 $\lim_{x \to p} f(x) = L$
Remarks
(a) In the definition of limit, the choice of $\delta$ for a given $\epsilon$ may depend not only on $\epsilon$ and the function, but also on the point $p$. This will be illustrated in Example 4.1.2(g).
(b) If $p$ is not a limit point of $E$, then for $\delta$ sufficiently small, there do not exist any $x \in E$ so that $0 < |x - p| < \delta$. Thus if $p$ is an isolated point of $E$, the concept of the limit of a function at $p$ has no meaning.
(c) In the definition of limit, it is not required that $p \in E$, only that $p$ is a limit point of $E$. Even if $p \in E$, and $f$ has a limit at $p$, we may very well have that
$$\lim_{x \to p} f(x) \ne f(p).$$
This will be the case in Example 4.1.2(c).
(d) Let $E \subset \mathbb{R}$ and $p$ a limit point of $E$. To show that a given function $f$ does not have a limit at $p$, we must show that for every $L \in \mathbb{R}$, there exists an $\epsilon > 0$, such that for every $\delta > 0$, there exists an $x \in E$ with $0 < |x - p| < \delta$, for which
$$|f(x) - L| \ge \epsilon.$$
We will illustrate this in Example 4.1.2(e).
4.1.2 EXAMPLES
(a) Let $E$ be a nonempty subset of $\mathbb{R}$ and let $f$, $g$, and $h$ be functions on $E$ defined by $f(x) = c$ ($c \in \mathbb{R}$), $g(x) = x$, and $h(x) = x^2$, respectively. If $p$ is a limit point of $E$, then
$$\lim_{x \to p} f(x) = c, \qquad \lim_{x \to p} g(x) = p, \qquad \lim_{x \to p} h(x) = p^2.$$
These limits are also expressed as $\lim_{x \to p} c = c$, $\lim_{x \to p} x = p$, and $\lim_{x \to p} x^2 = p^2$.
Even though we may feel that these limits are obvious, they still have to be proved. We illustrate the method of proof by using the definition to prove that $\lim_{x \to p} h(x) = p^2$. The proofs of the other two limits are left to the exercises (Exercise 2). For $x \in E$,
$$|h(x) - p^2| = |x^2 - p^2| = |x - p|\,|x + p| \le (|x| + |p|)\,|x - p|.$$
If $|x - p| < 1$, then $|x| < |p| + 1$. Hence for all $x \in E$ with $|x - p| < 1$,
$$|h(x) - p^2| < (2|p| + 1)\,|x - p|.$$
This last term will be less than $\epsilon$ provided $|x - p| < \epsilon/(2|p| + 1)$. Thus given $\epsilon > 0$, we choose $\delta = \min\{1, \epsilon/(2|p| + 1)\}$. With this choice of $\delta$, if $x \in E$ with $0 < |x - p| < \delta$, we first of all have $|x - p| < 1$, and therefore also
$$|h(x) - p^2| < (2|p| + 1)\,|x - p| < (2|p| + 1)\,\frac{\epsilon}{2|p| + 1} = \epsilon.$$
Thus $\lim_{x \to p} x^2 = p^2$.
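The choice $\delta = \min\{1, \epsilon/(2|p| + 1)\}$ in Example 4.1.2(a) can be spot-checked numerically. The following is only a sketch under stated assumptions: the names `delta_for_square` and `check_square_limit` are hypothetical, floats stand in for real numbers, and sampling finitely many points illustrates the estimate but does not prove it.

```python
def delta_for_square(p, eps):
    """The delta chosen in the text for h(x) = x**2 at the point p."""
    return min(1.0, eps / (2 * abs(p) + 1))


def check_square_limit(p, eps, samples=1000):
    """Test |x**2 - p**2| < eps at sample points with 0 < |x - p| < delta."""
    delta = delta_for_square(p, eps)
    for k in range(1, samples + 1):
        offset = delta * k / (samples + 1)   # 0 < offset < delta
        for x in (p - offset, p + offset):
            if not abs(x * x - p * p) < eps:
                return False
    return True


print(delta_for_square(3.0, 0.01), check_square_limit(3.0, 0.01))
```

Note how the returned $\delta$ shrinks as $|p|$ grows, which is the dependence on the point $p$ mentioned in Remark (a).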
(b) For $x \ne 2$, let $f(x)$ be defined by
$$f(x) = \frac{x^2 - 4}{x - 2}.$$
The domain of $f$ is $E = (-\infty, 2) \cup (2, \infty)$, and 2 is clearly a limit point of $E$. We now show that $\lim_{x \to 2} f(x) = 4$. For $x \ne 2$,
$$|f(x) - 4| = \left|\frac{x^2 - 4}{x - 2} - 4\right| = |x + 2 - 4| = |x - 2|.$$
Thus given $\epsilon > 0$, the choice $\delta = \epsilon$ works in the definition.
(c) Consider the following variation of (b). Let $g$ be defined on $\mathbb{R}$ by
$$g(x) = \begin{cases} \dfrac{x^2 - 4}{x - 2}, & x \ne 2, \\ 2, & x = 2. \end{cases}$$
For this example, 2 is a point in the domain of $g$, and it is still the case that $\lim_{x \to 2} g(x) = 4$. However, the limit does not equal $g(2) = 2$. The graph of $g$ is given in Figure 4.2.
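Figure 4.2 Graph of $g$

(d) Let $E = (-1, 0) \cup (0, \infty)$. For $x \in E$, let $h(x)$ be defined by
$$h(x) = \frac{\sqrt{x + 1} - 1}{x}.$$
We claim that $\lim_{x \to 0} h(x) = \tfrac{1}{2}$. This result is obtained as follows: For $x \ne 0$,
$$\frac{\sqrt{x + 1} - 1}{x} = \frac{\sqrt{x + 1} - 1}{x} \cdot \frac{\sqrt{x + 1} + 1}{\sqrt{x + 1} + 1} = \frac{x}{x(\sqrt{x + 1} + 1)} = \frac{1}{\sqrt{x + 1} + 1}.$$
From this last term we now conjecture that $h(x) \to \tfrac{1}{2}$ as $x \to 0$. By the above,
$$\left|h(x) - \frac{1}{2}\right| = \left|\frac{1}{\sqrt{x + 1} + 1} - \frac{1}{2}\right| = \left|\frac{1 - \sqrt{x + 1}}{2(\sqrt{x + 1} + 1)}\right| = \frac{|(1 - \sqrt{x + 1})(1 + \sqrt{x + 1})|}{2(\sqrt{x + 1} + 1)^2} = \frac{|x|}{2(\sqrt{x + 1} + 1)^2}.$$
For $x \in E$ we have $(\sqrt{x + 1} + 1)^2 > 1$, and thus
$$\left|h(x) - \frac{1}{2}\right| < \frac{|x|}{2}.$$
Given $\epsilon > 0$, let $\delta = \epsilon$. Then for all $x \in E$ with $0 < |x| < \delta$,
$$\left|h(x) - \frac{1}{2}\right| < \frac{|x|}{2} < \frac{\delta}{2} < \epsilon,$$
and thus $\lim_{x \to 0} h(x) = \tfrac{1}{2}$.
(e) Let $f$ be defined on $\mathbb{R}$ as follows:
$$f(x) = \begin{cases} 1, & x \in \mathbb{Q}, \\ 0, & x \notin \mathbb{Q}. \end{cases}$$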
We will show that for this function, $\lim_{x \to p} f(x)$ fails to exist for every $p \in \mathbb{R}$. Fix $p \in \mathbb{R}$. Let $L \in \mathbb{R}$ and let
$$\epsilon = \max\{|L - 1|, |L|\}.$$
Suppose $\epsilon = |L - 1|$. By Theorem 1.5.2, for any $\delta > 0$, there exists an $x \in \mathbb{Q}$ such that $0 < |p - x| < \delta$. For such an $x$,
$$|f(x) - L| = |1 - L| = \epsilon.$$
If $\epsilon = |L|$, then by Exercise 6, Section 1.5, for any $\delta > 0$, there exists an irrational number $x$ with $0 < |x - p| < \delta$. Again, for such an $x$, $|f(x) - L| = \epsilon$. Thus with $\epsilon$ as defined, for any $\delta > 0$, there exists an $x$ with $0 < |x - p| < \delta$ such that $|f(x) - L| \ge \epsilon$. Since this works for every $L \in \mathbb{R}$, $\lim_{x \to p} f(x)$ does not exist.
(f) Let $f : \mathbb{R} \to \mathbb{R}$ be defined by
$$f(x) = \begin{cases} 0, & x \in \mathbb{Q}, \\ x, & x \notin \mathbb{Q}. \end{cases}$$
Then $\lim_{x \to 0} f(x) = 0$. Since $|f(x)| \le |x|$ for all $x$, given $\epsilon > 0$, any $\delta$, $0 < \delta \le \epsilon$, will work in the definition of the limit. A modification of the argument given in (e) shows that for any $p \ne 0$, $\lim_{x \to p} f(x)$ does not exist. An alternative proof will be provided in Example 4.1.5(b).
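(g) Our final example shows dramatically how the choice of $\delta$ will generally depend not only on $\epsilon$, but also on the point $p$. Let $E = (0, \infty)$ and let $f : E \to \mathbb{R}$ be defined by
$$f(x) = \frac{1}{x}.$$
We will prove that for $p \in (0, \infty)$,
$$\lim_{x \to p} \frac{1}{x} = \frac{1}{p}.$$
If $x > p/2$, then
$$\left|\frac{1}{x} - \frac{1}{p}\right| = \frac{|x - p|}{xp} < \frac{2}{p^2}\,|x - p|.$$
Given $\epsilon > 0$, let $\delta = \min\{p/2, p^2\epsilon/2\}$. Then if $0 < |x - p| < \delta$, $x > p/2$, and
$$\left|\frac{1}{x} - \frac{1}{p}\right| < \frac{2}{p^2}\,\delta \le \epsilon.$$
The $\delta$ as defined depends on both $p$ and $\epsilon$. This suggests that any $\delta$ that works for a given $p$ and $\epsilon$ must depend on both $p$ and $\epsilon$. Suppose on the contrary that for a given $\epsilon > 0$, the choice of $\delta$ is independent of $p \in (0, \infty)$. Then with $\epsilon = 1$, there exists a $\delta > 0$ such that
$$\left|\frac{1}{x} - \frac{1}{p}\right| < 1$$
for all $x, p \in (0, \infty)$ with $0 < |x - p| < \delta$. Since any smaller $\delta$ also works, we may assume $\delta \le 1$. Taking $p = \delta/2$ and $x = \delta$ gives $0 < |x - p| = \delta/2 < \delta$, and yet
$$\left|\frac{1}{x} - \frac{1}{p}\right| = \frac{1}{\delta} \ge 1.$$
This contradiction proves that the choice of $\delta$ must depend on both $p$ and $\epsilon$.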
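The dependence of $\delta$ on $p$ in Example 4.1.2(g) is easy to observe numerically. The sketch below is illustrative only: the helper names are hypothetical and floats stand in for reals. It samples points with $0 < |x - p| < \delta$ and reports the largest observed value of $|1/x - 1/p|$, first with the $\delta$ from the text and then with one fixed $\delta$ used near $0$.

```python
def delta_for_reciprocal(p, eps):
    """The delta chosen in the text for f(x) = 1/x at p > 0."""
    return min(p / 2, p * p * eps / 2)


def worst_gap(p, delta, steps=1000):
    """Largest sampled |1/x - 1/p| over x > 0 with 0 < |x - p| < delta."""
    worst = 0.0
    for k in range(1, steps + 1):
        offset = delta * k / (steps + 1)      # 0 < offset < delta
        for x in (p - offset, p + offset):
            if x > 0:                         # stay inside E = (0, inf)
                worst = max(worst, abs(1 / x - 1 / p))
    return worst


eps = 0.1
for p in (0.01, 1.0, 100.0):
    d = delta_for_reciprocal(p, eps)
    print(p, d, worst_gap(p, d))

# One fixed delta cannot serve every p: near 0 the gap exceeds 1.
print(worst_gap(0.05, 0.1))
```

The last line mirrors the contradiction argument: with $\delta = 0.1$ held fixed and $p = \delta/2$, points $x$ within $\delta$ of $p$ already give $|1/x - 1/p| > 1$.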
Sequential Criterion for Limits
Our first theorem allows us to reduce the question of the existence of the limit of a function to one concerning the existence of limits of sequences. As we will see, this result will be very useful in subsequent proofs, and also in showing that a given function does not have a limit at a point $p$.

4.1.3 THEOREM Let $E$ be a subset of $\mathbb{R}$, $p$ a limit point of $E$, and $f$ a real-valued function defined on $E$. Then $\lim_{x \to p} f(x) = L$ if and only if
$$\lim_{n \to \infty} f(p_n) = L$$
for every sequence $\{p_n\}$ in $E$, with $p_n \ne p$ for all $n$, and $\lim_{n \to \infty} p_n = p$.

Remark. Since $p$ is a limit point of $E$, Theorem 2.4.7 guarantees the existence of a sequence $\{p_n\}$ in $E$ with $p_n \ne p$ for all $n \in \mathbb{N}$ and $p_n \to p$.

Proof. Suppose $\lim_{x \to p} f(x) = L$. Let $\{p_n\}$ be any sequence in $E$ with $p_n \ne p$ for all $n$ and $p_n \to p$. Let $\epsilon > 0$ be given. Since $\lim_{x \to p} f(x) = L$, there exists a $\delta > 0$ such that
$$|f(x) - L| < \epsilon \quad \text{for all } x \in E,\ 0 < |x - p| < \delta. \tag{1}$$
Since $\lim_{n \to \infty} p_n = p$, for the above $\delta$, there exists a positive integer $n_0$ such that
$$0 < |p_n - p| < \delta \quad \text{for all } n \ge n_0.$$
Thus if $n \ge n_0$, by (1), $|f(p_n) - L| < \epsilon$. Therefore, $\lim_{n \to \infty} f(p_n) = L$.
Conversely, suppose $f(p_n) \to L$ for every sequence $\{p_n\}$ in $E$ with $p_n \ne p$ for all $n$ and $p_n \to p$. Suppose $\lim_{x \to p} f(x) \ne L$. Then there exists an $\epsilon > 0$ such that for every $\delta > 0$ there exists an $x \in E$ with $0 < |x - p| < \delta$ …

… $\to p$, by Theorem 4.1.3,
$$\lim_{x \to p} \frac{1}{g(x)} = \frac{1}{B}. \qquad \square$$
The proofs of the following two theorems are easy consequences of Theorem 4.1.3 and the corresponding theorems for sequences (2.2.3 and 2.2.4). First, however, we give the following definition.

4.1.7 DEFINITION A real-valued function $f$ defined on a set $E$ is bounded on $E$ if there exists a constant $M$ such that $|f(x)| \le M$ for all $x \in E$.

4.1.8 THEOREM Suppose $E \subset \mathbb{R}$, $p$ is a limit point of $E$, and $f$, $g$ are real-valued functions on $E$. If $g$ is bounded on $E$ and $\lim_{x \to p} f(x) = 0$, then
$$\lim_{x \to p} f(x)g(x) = 0.$$
Proof. Exercise 12.

4.1.9 THEOREM Suppose $E \subset \mathbb{R}$, $p$ is a limit point of $E$, and $f$, $g$, $h$ are functions from $E$ into $\mathbb{R}$ satisfying
$$g(x) \le f(x) \le h(x) \quad \text{for all } x \in E.$$
If $\lim_{x \to p} g(x) = \lim_{x \to p} h(x) = L$, then $\lim_{x \to p} f(x) = L$.
Proof. Exercise 13.

We now provide several examples to illustrate the previous theorems.
4.1.10 EXAMPLES
(a) By Example 4.1.2(a), $\lim_{x \to c} x = c$. Thus, using mathematical induction and Theorem 4.1.6(b), $\lim_{x \to c} x^n = c^n$ for all $n \in \mathbb{N}$. If $p(x)$ is a polynomial function of degree $n$, that is,
$$p(x) = a_n x^n + \cdots + a_1 x + a_0,$$
where $n$ is a nonnegative integer and $a_0, \ldots, a_n \in \mathbb{R}$ with $a_n \ne 0$, then a repeated application of Theorem 4.1.6(a) gives $\lim_{x \to c} p(x) = p(c)$.
(b) Consider
$$\lim_{x \to -2} \frac{x^3 + 2x^2 - 2x - 4}{x^2 - 4}.$$
By part (a), $\lim_{x \to -2} (x^3 + 2x^2 - 2x - 4) = 0$ and $\lim_{x \to -2} (x^2 - 4) = 0$. Since the denominator has limit zero, Theorem 4.1.6(c) does not apply. In this example, however, for $x \ne -2$,
$$\frac{x^3 + 2x^2 - 2x - 4}{x^2 - 4} = \frac{(x + 2)(x^2 - 2)}{(x + 2)(x - 2)} = \frac{x^2 - 2}{x - 2}.$$
Since $\lim_{x \to -2} (x - 2) = -4$, which is nonzero, we can now apply Theorem 4.1.6(c) to conclude that
$$\lim_{x \to -2} \frac{x^3 + 2x^2 - 2x - 4}{x^2 - 4} = \lim_{x \to -2} \frac{x^2 - 2}{x - 2} = -\frac{1}{2}.$$
(c) Let $E = \mathbb{R} \setminus \{0\}$, and let $f : E \to \mathbb{R}$ be defined by
$$f(x) = x \sin \frac{1}{x}.$$
Since $|\sin(1/x)| \le 1$ …

… $\epsilon > 0$ be given. Then with $M = 1/\epsilon$, $|f(x)| < \epsilon$ for all $x > M$. Therefore,
$$\lim_{x \to \infty} \frac{\sin x}{x} = 0.$$
(b) For our second example consider $f(x) = x \sin \pi x$. If we set $p_n = (n + \tfrac{1}{2})$, $n \in \mathbb{N}$, then
$$f(p_n) = (n + \tfrac{1}{2}) \sin (n + \tfrac{1}{2})\pi = (-1)^n (n + \tfrac{1}{2}).$$
Thus the sequence $\{f(p_n)\}_{n=1}^{\infty}$ is unbounded, and as a consequence, $\lim_{x \to \infty} x \sin \pi x$ does not exist.
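The values $f(p_n) = (-1)^n (n + \tfrac{1}{2})$ in the example above can be tabulated directly. This is a small numerical sketch, not part of the text; it uses floating-point `math.sin`, so the agreement is only up to rounding error.

```python
import math


def f(x):
    """f(x) = x * sin(pi * x), as in the example."""
    return x * math.sin(math.pi * x)


# Evaluate f along the sequence p_n = n + 1/2.
values = [f(n + 0.5) for n in range(1, 7)]
print(values)  # alternating signs with growing absolute value
```

Along this single sequence the function values alternate in sign and grow without bound, which is all that is needed to rule out a limit at $\infty$.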
EXERCISES 4.1 1. Use the definition to establish each of the following limits. b6 fim2 (3x + 5) = 1 *a. lim (2x  7) = 3
*c. I m
_
x
l+ x
I
d.
2
i
e.. I x + 1  3
lim 2x23x4=
x+i
L lim
x32x4 = 5 x24
2
2. Use the definition to establish each of the following limits.
a. lim c=C xv
b. lim x = p
C. lim x3 = p3
d. lim x0 x" = p", n E RI
x+v
v
I
p>0
v
x +o
x
2Vp
, p>0
3. For each of the following, determine whether the indicated limit exists in R. Justify your answer!
*a x.° limWx
b.x.i limx2x + I
*C. lim cos
d. lim
I
XO
e. lim
x
f. lim
14
cos x
)2 (x + lx
1
4. *Define $f : (-1, 1) \to \mathbb{R}$ by
$$f(x) = \frac{x^2 - x - 2}{x + 1}.$$
Determine the limit $L$ of $f$ at $-1$ and prove, using $\epsilon$ and $\delta$, that $f$ has limit $L$ at $-1$.
5. *a. Using Figure 4.5, prove that $|\sin h| \le |h|$ for all $h \in \mathbb{R}$.
b. Using the trigonometric identity $1 - \cos h = 2\sin^2 \tfrac{h}{2}$, prove that
(i) $\lim_{h \to 0} \cos h = 1$.
(ii) $\lim_{h \to 0} \dfrac{1 - \cos h}{h} = 0$.
6. Let $E \subset \mathbb{R}$, $p$ a limit point of $E$, and $f : E \to \mathbb{R}$. Suppose there exist a constant $M > 0$ and $L \in \mathbb{R}$ such that $|f(x) - L| \le M|x - p|$ for all $x \in E$. Prove that $\lim_{x \to p} f(x) = L$.
7. Suppose $f : E \to \mathbb{R}$, $p$ is a limit point of $E$, and $\lim_{x \to p} f(x) = L$.
*a. Prove that $\lim_{x \to p} |f(x)| = |L|$.
b. If, in addition, $f(x) > 0$ for all $x \in E$, prove that $\lim_{x \to p} \sqrt{f(x)} = \sqrt{L}$.
c. Prove that $\lim_{x \to p} (f(x))^n = L^n$ for each $n \in \mathbb{N}$.
8. Use the limit theorems, examples, and previous exercises to find each of the following limits. State which theorems, examples, or exercises are used in each case.
.a, littt 5x2 + 3x  2
b. lim
se lint
d.
x1
_'+
m 2x + 5
x.t
x3 _ x2 + 2 x+1 Ix + 13n
xm2
x + 2)
f lim
x4
I

.r(
'1 1
Ix+21
9. *Suppose $f : (a, b) \to \mathbb{R}$, $p \in [a, b]$, and $\lim_{x \to p} f(x) > 0$. Prove that there exists a $\delta > 0$ such that $f(x) > 0$ for all $x \in (a, b)$ with $0 < |x - p| < \delta$.
10. Suppose $E \subset \mathbb{R}$, $p$ is a limit point of $E$, and $f : E \to \mathbb{R}$. Prove that if $f$ has a limit at $p$, then there exists a positive constant $M$ and a $\delta > 0$, such that $|f(x)| \le M$ for all $x \in E$, $0 < |x - p| < \delta$.
11. a. Prove Theorem 4.1.6(a).
b. Prove Theorem 4.1.6(b).
12. *Prove Theorem 4.1.8.
13. Prove Theorem 4.1.9.
14. Let $f$, $g$ be real-valued functions defined on $E \subset \mathbb{R}$ and let $p$ be a limit point of $E$.
a. If $\lim_{x \to p} f(x)$ and $\lim_{x \to p} (f(x) + g(x))$ exist, prove that $\lim_{x \to p} g(x)$ exists.
b. If $\lim_{x \to p} f(x)$ and $\lim_{x \to p} (f(x)g(x))$ exist, does it follow that $\lim_{x \to p} g(x)$ exists?
15. Let $E$ be a nonempty subset of $\mathbb{R}$ and let $p$ be a limit point of $E$. Suppose $f$ is a bounded real-valued function on $E$ having the property that $\lim_{x \to p} f(x)$ does not exist. Prove that there exist sequences $\{p_n\}$ and $\{q_n\}$ in $E$ with $\lim_{n \to \infty} p_n = \lim_{n \to \infty} q_n = p$ such that $\lim_{n \to \infty} f(p_n)$ and $\lim_{n \to \infty} f(q_n)$ exist, but are not equal.
16. *Let $f$ be a real-valued function defined on $(a, \infty)$ for some $a > 0$. Define $g$ on $(0, \tfrac{1}{a})$ by $g(t) = f(\tfrac{1}{t})$. Prove that $\lim_{x \to \infty} f(x) = L$ if and only if $\lim_{t \to 0} g(t) = L$.
17. Investigate the limits at oo of each of the following functions defined on (0, oo).
a. f(x) =
c. f(x) =
3x2
2+ +x
!
V47+ 1
e. f( x) =
X
b. f (x) =
1 + x2
d. f(x)  2x + 3 x+l V x  2x AX) =
2NA + 3x
g. f(x) = x cos x
h. f(x) = x sin
18. Let $f : (a, \infty) \to \mathbb{R}$ be such that $\lim_{x \to \infty} x f(x) = L$ where $L \in \mathbb{R}$. Prove that $\lim_{x \to \infty} f(x) = 0$.
19. Let $f : \mathbb{R} \to \mathbb{R}$ satisfy $f(x + y) = f(x) + f(y)$ for all $x, y \in \mathbb{R}$. If $\lim_{x \to 0} f(x)$ exists, prove that
a. $\lim_{x \to 0} f(x) = 0$, and
b. $\lim_{x \to p} f(x)$ exists for every $p \in \mathbb{R}$.
4.2 Continuous Functions
The notion of continuity dates back to Leonhard Euler (1707–1783). To Euler, a continuous curve (function) was one that could be expressed by a single formula or equation of the variable $x$. If the definition of the curve was made up of several parts, it was called discontinuous. This definition was sufficient to convey the concept of continuity in Euler's time as mathematicians were primarily concerned with elementary functions; namely, functions built up from the trigonometric and exponential functions, and inverses of these functions, using algebraic operations and composition. The more modern version of continuity is credited to Bolzano (1817) and Cauchy (1821). Both men were motivated to provide a clear and precise definition of continuity in order to prove the intermediate value theorem (Theorem 4.2.11). Cauchy's definition of continuity was as follows: "The function $f(x)$ will be, between two assigned values of the variable $x$, a continuous function of this variable if for each value of $x$ between these limits, the numerical value [i.e., absolute value] of the difference $f(x + \alpha) - f(x)$ decreases indefinitely with $\alpha$."¹ Even this definition appears strange in comparison with the more modern definition in use today. Both Bolzano and Cauchy were concerned with continuity on an interval, rather than continuity at a point.

4.2.1 DEFINITION Let $E$ be a subset of $\mathbb{R}$ and $f$ a real-valued function with domain $E$. The function $f$ is continuous at a point $p \in E$, if for every $\epsilon > 0$, there exists a $\delta > 0$ such that
$$|f(x) - f(p)| < \epsilon \quad \text{for all } x \in E \text{ with } |x - p| < \delta.$$
The function $f$ is continuous on $E$ if and only if $f$ is continuous at every point $p \in E$.

This definition can be rephrased as follows: A function $f : E \to \mathbb{R}$ is continuous at $p \in E$ if and only if given $\epsilon > 0$, there exists a $\delta > 0$ such that
$$f(x) \in N_\epsilon(f(p)) \quad \text{for all } x \in N_\delta(p) \cap E.$$
This is illustrated in Figure 4.7.

Figure 4.7

Remarks
(a) If $p \in E$ is a limit point of $E$, then $f$ is continuous at $p$ if and only if
$$\lim_{x \to p} f(x) = f(p).$$
Also, as a consequence of Theorem 4.1.3, $f$ is continuous at $p$ if and only if
$$\lim_{n \to \infty} f(p_n) = f(p)$$
for every sequence $\{p_n\}$ in $E$ with $p_n \to p$.
(b) If $p \in E$ is an isolated point, then every function $f$ on $E$ is continuous at $p$. This follows immediately from the fact that for an isolated point $p$ of $E$, there exists a $\delta > 0$ such that $N_\delta(p) \cap E = \{p\}$.

We now consider several of the functions given in previous examples, and also some new examples.

1. Cauchy, Cours d'Analyse, p. 43.
4.2.2 EXAMPLES
(a) Let $g$ be defined as in Example 4.1.2(c), i.e.,
$$g(x) = \begin{cases} \dfrac{x^2 - 4}{x - 2}, & x \ne 2, \\ 2, & x = 2. \end{cases}$$
At the point $p = 2$, $\lim_{x \to 2} g(x) = 4 \ne g(2)$. Thus $g$ is not continuous at $p = 2$. However, if we redefine $g$ at $p = 2$ so that $g(2) = 4$, then this function is now continuous at $p = 2$.
(b) Let $f$ be as defined in Example 4.1.2(f); i.e.,
$$f(x) = \begin{cases} 0, & x \in \mathbb{Q}, \\ x, & x \notin \mathbb{Q}. \end{cases}$$
Since
$$\lim_{x \to 0} f(x) = 0 = f(0),$$
$f$ is continuous at $p = 0$. On the other hand, since $\lim_{x \to p} f(x)$ fails to exist for every $p \ne 0$, $f$ is discontinuous at every $p \in \mathbb{R}$, $p \ne 0$.
(c) The function $f$ defined by
$$f(x) = \begin{cases} 1, & x \in \mathbb{Q}, \\ 0, & x \notin \mathbb{Q}, \end{cases}$$
of Example 4.1.2(e) is discontinuous at every $p \in \mathbb{R}$.
(d) As in Example 4.1.2(g), the function $f(x) = 1/x$ is continuous at every $p \in (0, \infty)$. Thus, $f$ is continuous on $(0, \infty)$.
(e) Let $f$ be defined by
$$f(x) = \begin{cases} 0, & x = 0, \\ x \sin \dfrac{1}{x}, & x \ne 0. \end{cases}$$
By Example 4.1.10(c),
$$\lim_{x \to 0} f(x) = 0 = f(0).$$
Thus $f$ is continuous at $x = 0$.
(f) In this example we show that $f(x) = \sin x$ is continuous on $\mathbb{R}$. Let $x, y \in \mathbb{R}$. Then
$$|f(y) - f(x)| = |\sin y - \sin x| = 2\left|\cos \tfrac{1}{2}(y + x)\, \sin \tfrac{1}{2}(y - x)\right| \le 2\left|\sin \tfrac{1}{2}(y - x)\right|.$$
By Exercise 5 of the previous section, $|\sin h| \le |h|$. Therefore,
$$|f(y) - f(x)| \le |y - x|,$$
from which it follows that $f$ is continuous on $\mathbb{R}$.
(g) We now consider a function on $(0, 1)$ that is discontinuous at every rational number in $(0, 1)$ and continuous at every irrational number in $(0, 1)$. For $x \in (0, 1)$ define
$$f(x) = \begin{cases} 0, & \text{if } x \text{ is irrational}, \\ \dfrac{1}{n}, & \text{if } x \text{ is rational with } x = \dfrac{m}{n} \text{ in lowest terms}. \end{cases}$$
The graph of $f$, at least for a few rational numbers, is given in Figure 4.8. To establish our claim we will show that
$$\lim_{x \to p} f(x) = 0$$
for every $p \in (0, 1)$. As a consequence, since $f(p) = 0$ for every irrational number $p \in (0, 1)$, $f$ is continuous at every irrational number. Also, since $f(p) \ne 0$ when $p \in \mathbb{Q} \cap (0, 1)$, $f$ is discontinuous at every rational number in $(0, 1)$.

Figure 4.8

Fix $p \in (0, 1)$ and let $\epsilon > 0$ be given. To prove that $\lim_{x \to p} f(x) = 0$ we need to show that there exists a $\delta > 0$ such that
$$|f(x)| < \epsilon \quad \text{for all } x \in N_\delta(p) \cap (0, 1),\ x \ne p.$$
This is certainly the case for any irrational number $x$.
On the other hand, if $x$ is rational with $x = m/n$ (in lowest terms), then $f(x) = 1/n$. Choose $n_0 \in \mathbb{N}$ such that $1/n_0 < \epsilon$. There exist only a finite number of rational numbers $m/n$ (in lowest terms) in $(0, 1)$ with denominator less than $n_0$. Denote these by $r_1, \ldots, r_k$, and let
$$\delta = \min\{|r_i - p| : i = 1, \ldots, k,\ r_i \ne p\}.$$
(Note: Since $p$ may be a rational number and thus possibly equal to $r_i$ for some $i = 1, \ldots, k$, we take the minimum of $\{|r_i - p|\}$ only for those $i$ for which $r_i \ne p$.) Thus $\delta > 0$, and if $r \in \mathbb{Q} \cap N_\delta(p) \cap (0, 1)$, $r \ne p$, with $r = m/n$ in lowest terms, then $n \ge n_0$. Therefore,
$$|f(r)| = \frac{1}{n} \le \frac{1}{n_0} < \epsilon.$$
Thus $|f(x)| < \epsilon$ for all $x \in N_\delta(p) \cap (0, 1)$, $x \ne p$.

If $f$ and $g$ are real-valued functions defined on a set $E$, we define the sum $f + g$, the difference $f - g$, the product $fg$, and the absolute value $|f|$ of $f$ on $E$ as follows: For $x \in E$,
$$(f + g)(x) = f(x) + g(x),$$
$$(f - g)(x) = f(x) - g(x),$$
$$(fg)(x) = f(x)g(x),$$
$$|f|(x) = |f(x)|.$$
Furthermore, if $g(x) \ne 0$ for all $x \in E$, we define the quotient $f/g$ by
$$\left(\frac{f}{g}\right)(x) = \frac{f(x)}{g(x)}.$$
More generally, if $f$ and $g$ are real-valued functions defined on a set $E$, the quotient $f/g$ can always be defined on $E_1 = \{x \in E : g(x) \ne 0\}$. As an application of Theorem 4.1.6 we prove that continuity is preserved under the algebraic operations defined above. The proof that $|f|$ is continuous whenever $f$ is continuous is left as an exercise (Exercise 6).
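Looking back at Example 4.2.2(g), the $\delta$ constructed in that argument can be computed exactly for rational inputs. The sketch below is an illustration under stated assumptions: the function names are hypothetical, and since irrational numbers cannot be represented exactly, the code only evaluates $f$ at rational points using `fractions.Fraction`, which always stores a fraction in lowest terms.

```python
from fractions import Fraction


def thomae(x):
    """f(m/n) = 1/n for a rational x = m/n in (0, 1), in lowest terms."""
    return Fraction(1, x.denominator)


def delta_for(p, eps):
    """The delta from the text: distance from p to the nearest rational
    r != p in (0, 1) whose denominator is less than n0, where 1/n0 < eps."""
    n0 = int(1 / eps) + 1                      # n0 > 1/eps, so 1/n0 < eps
    small = {Fraction(m, n) for n in range(2, n0) for m in range(1, n)}
    dists = [abs(r - p) for r in small if r != p]
    return min(dists) if dists else Fraction(1)  # no such r: any delta works


p, eps = Fraction(1, 3), Fraction(1, 10)
delta = delta_for(p, eps)
print(delta)

# Every rational r != p within delta of p has a denominator >= n0.
near = {Fraction(m, n) for n in range(2, 61) for m in range(1, n)}
print(all(thomae(r) < eps for r in near if r != p and abs(r - p) < delta))
```

The set comprehension removes duplicates such as $2/4 = 1/2$ automatically, matching the "in lowest terms" convention of the example.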
4.2.3 THEOREM If $E \subset \mathbb{R}$ and $f, g : E \to \mathbb{R}$ are continuous at $p \in E$, then
(a) $f + g$ and $f - g$ are continuous at $p$, and
(b) $fg$ is continuous at $p$.
(c) If $g(x) \ne 0$ for all $x \in E$, then $f/g$ is continuous at $p$.

Proof. If $p$ is an isolated point of $E$, then the result is true since every function on $E$ is continuous at $p$. If $p$ is a limit point of $E$, then the conclusions follow from Theorem 4.1.6. $\square$

Composition of Continuous Functions
In the following theorem we prove that continuity is also preserved under composition of functions.

4.2.4 THEOREM Let $A, B \subset \mathbb{R}$ and let $f : A \to \mathbb{R}$ and $g : B \to \mathbb{R}$ be functions such that Range $f \subset B$. If $f$ is continuous at $p \in A$ and $g$ is continuous at $f(p)$, then $h = g \circ f$ is continuous at $p$.
Proof. Let $\epsilon > 0$ be given. Since $g$ is continuous at $f(p)$, there exists a $\delta_1 > 0$ such that
$$|g(y) - g(f(p))| < \epsilon \quad \text{for all } y \in B \cap N_{\delta_1}(f(p)). \tag{2}$$
Since $f$ is continuous at $p$, for this $\delta_1$, there exists a $\delta > 0$ such that $|f(x) - f(p)| < \delta_1$ [...]

[...] Let $\epsilon > 0$ be given. Then by hypothesis $f^{-1}(N_\epsilon(f(p)))$ is open in $E$. Thus there exists a $\delta > 0$ such that
$$E \cap N_\delta(p) \subset f^{-1}(N_\epsilon(f(p)));$$
that is, $f(x) \in N_\epsilon(f(p))$ for all $x \in N_\delta(p) \cap E$. Therefore $f$ is continuous at $p$. $\square$

4.2.7
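EXAMPLES
(a) We illustrate the previous theorem for the function $f(x) = \sqrt{x}$, Dom $f = [0, \infty)$. Suppose first that $V$ is an open interval $(a, b)$ with $a < b$. Then
$$f^{-1}(V) = \begin{cases} \emptyset, & b \le 0, \\ [0, b^2), & a < 0 < b, \\ (a^2, b^2), & 0 \le a < b. \end{cases}$$
[...] $\delta > 0$ such that $N_\delta(p) \cap [0, \infty) \subset f^{-1}(I) \subset f^{-1}(V)$. Since $p \in f^{-1}(V)$ was arbitrary, $f^{-1}(V)$ is open in $[0, \infty)$.

(b) In this example we show that if $f : E \to \mathbb{R}$ is continuous on $E$ and $V \subset E$ is open in $E$, then $f(V)$ is not necessarily open in Range $f$. Consider the function $f : \mathbb{R} \to \mathbb{R}$ given by
$$f(x) = \begin{cases} x^2, & x \le 1, \\ 3 - 2x, & x > 1. \end{cases}$$
Then $f$ is continuous on $\mathbb{R}$ and Range $f = \mathbb{R}$ (Exercise 10). However, $f((-1, 1)) = [0, 1)$, and this set is not open in $\mathbb{R}$. (See Figure 4.9.)

Figure 4.9 Graph of $f(x) = x^2$ for $x \le 1$, $f(x) = 3 - 2x$ for $x > 1$.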
Continuity and Compactness
We now consider several consequences of continuity. In our first result we prove that the continuous image of a compact set is compact. In the proof of the theorem we use only continuity and the definition of a compact set. An alternative proof using the Heine-Borel-Bolzano-Weierstrass theorem (Theorem 3.2.9) is suggested in the exercises (Exercise 25).

4.2.8 THEOREM If $K$ is a compact subset of $\mathbb{R}$ and if $f : K \to \mathbb{R}$ is continuous on $K$, then $f(K)$ is compact.
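Proof. Let $\{V_\alpha\}_{\alpha \in A}$ be an open cover of $f(K)$. Since $f$ is continuous on $K$, $f^{-1}(V_\alpha)$ is open in $K$ for every $\alpha \in A$. By Theorem 3.1.16, for each $\alpha$ there exists an open subset $U_\alpha$ of $\mathbb{R}$ such that
$$f^{-1}(V_\alpha) = K \cap U_\alpha.$$
We claim that $\{U_\alpha\}_{\alpha \in A}$ is an open cover of $K$. If $p \in K$, then $f(p) \in f(K)$ and thus $f(p) \in V_\alpha$ for some $\alpha \in A$. But then $p$ is in $f^{-1}(V_\alpha)$ and hence also in $U_\alpha$. Since each $U_\alpha$ is also open, the collection $\{U_\alpha\}_{\alpha \in A}$ is an open cover of $K$. Since $K$ is compact, there exist $\alpha_1, \dots, \alpha_n \in A$ such that
$$K \subset \bigcup_{j=1}^{n} U_{\alpha_j}.$$
Therefore,
$$K = \bigcup_{j=1}^{n} (U_{\alpha_j} \cap K) = \bigcup_{j=1}^{n} f^{-1}(V_{\alpha_j}),$$
and by Theorem 1.7.14(a),
$$f(K) = \bigcup_{j=1}^{n} f(f^{-1}(V_{\alpha_j})).$$
Since $f(f^{-1}(V_{\alpha_j})) \subset V_{\alpha_j}$, $f(K) \subset \bigcup_{j=1}^{n} V_{\alpha_j}$. Thus $f(K)$ is compact. $\square$

As a corollary of the previous theorem we obtain the following generalization of the usual maximum-minimum theorem encountered in calculus.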
4.2.9 COROLLARY Let $K$ be a compact subset of $\mathbb{R}$ and let $f : K \to \mathbb{R}$ be continuous. Then there exist $p, q \in K$ such that
$$f(q) \le f(x) \le f(p) \quad \text{for all } x \in K.$$
[...]

20. Let $f$ be a real-valued function on $\mathbb{R}$ satisfying $f(x + y)$ [...]
[...] $> 0$, prove that there exists an $\alpha > 0$ and a $\delta > 0$ such that $f(x) \ge \alpha$ for all $x \in N_\delta(p) \cap E$.
22. *Let $f : E \to \mathbb{R}$ be continuous at $p \in E$. Prove that there exists a positive constant $M$ and $\delta > 0$ such that $|f(x)| \le M$ for all $x \in E \cap N_\delta(p)$.
23. Let $f : (0, 1) \to \mathbb{R}$ be defined by
$$f(x) = \begin{cases} 0, & \text{if } x \text{ is irrational}, \\ n, & \text{if } x \text{ is rational with } x = m/n \text{ in lowest terms}. \end{cases}$$
a. Prove that $f$ is unbounded on every open interval $I \subset (0, 1)$.
b. Use part (a) and the previous exercise to conclude that $f$ is discontinuous at every point of $(0, 1)$.
24. Suppose $E$ is a subset of $\mathbb{R}$ and $f, g : E \to \mathbb{R}$ are continuous on $E$. Show that $\{x \in E : f(x) > g(x)\}$ is open in $E$.
25. *Let $K$ be a compact subset of $\mathbb{R}$ and let $f : K \to \mathbb{R}$ be continuous on $K$. Prove that $f(K)$ is compact by showing that $f(K)$ is closed and bounded.
26. Let $E \subset \mathbb{R}$ and let $f$ be a real-valued function on $E$. Prove that $f$ is continuous on $E$ if and only if $f^{-1}(F)$ is closed in $E$ for every closed subset $F$ of $\mathbb{R}$.
27. Let $A, B \subset \mathbb{R}$ and let $f : A \to \mathbb{R}$ and $g : B \to \mathbb{R}$ be functions such that Range $f \subset B$.
a. If $V \subset \mathbb{R}$, prove that $(g \circ f)^{-1}(V) = f^{-1}(g^{-1}(V))$.
b. If $f$ and $g$ are continuous on $A$ and $B$ respectively, use Theorem 4.2.6 to prove that $g \circ f$ is continuous on $A$.
28. Suppose $I$ is a connected subset of $\mathbb{R}$ and $f : I \to \mathbb{R}$ is continuous on $I$. Prove, using only the properties of continuity and the definition of connected set, that $f(I)$ is connected.
29. *Let $K \subset \mathbb{R}$ be compact and let $f$ be a real-valued function on $K$. Suppose that for each $x \in K$ there exists $\epsilon_x > 0$ such that $f$ is bounded on $N_{\epsilon_x}(x) \cap K$. Prove that $f$ is bounded on $K$.
30. Let $A \subset \mathbb{R}$. For $p \in \mathbb{R}$, the distance from $p$ to the set $A$, denoted $d(p, A)$, is defined by $d(p, A) = \inf\{|p - x| : x \in A\}$.
a. Prove that $d(p, A) = 0$ if and only if $p \in \overline{A}$.
b. For $x, y \in \mathbb{R}$, prove that $|d(x, A) - d(y, A)| \le |x - y|$.
c. Prove that the function $x \mapsto d(x, A)$ is continuous on $\mathbb{R}$.
d. If $A, B$ are disjoint closed subsets of $\mathbb{R}$, prove that
$$f(x) = \frac{d(x, A)}{d(x, A) + d(x, B)}$$
is a continuous function on $\mathbb{R}$ satisfying $0 \le f(x) \le 1$ for all $x \in \mathbb{R}$, and
$$f(x) = \begin{cases} 0, & x \in A, \\ 1, & x \in B. \end{cases}$$
31. Let $f$ be a continuous real-valued function on $\mathbb{R}$ satisfying $f(0) = 1$ and $f(x + y) = f(x)f(y)$ for all $x, y \in \mathbb{R}$. Prove that $f(x) = a^x$ for some $a \in \mathbb{R}$, $a > 0$.
Uniform Continuity
In the previous section we discussed continuity of a function at a point and on a set. By Definition 4.2.1, a function $f : E \to \mathbb{R}$ is continuous on $E$ if for each $p \in E$, given any $\epsilon > 0$, there exists a $\delta > 0$ such that $|f(x) - f(p)| < \epsilon$ for all $x \in E \cap N_\delta(p)$. In general, for a given $\epsilon > 0$, the choice of $\delta$ that works depends not only on $\epsilon$ and the function $f$, but also on the point $p$. This was illustrated in Example 4.1.2(g) for the function $f(x) = 1/x$, $x \in (0, \infty)$. Functions for which a choice of $\delta$ independent of $p$ is possible are given a special name.

4.3.1 DEFINITION Let $E \subset \mathbb{R}$ and $f : E \to \mathbb{R}$. The function $f$ is uniformly continuous on $E$ if given $\epsilon > 0$, there exists a $\delta > 0$ such that
$$|f(x) - f(y)| < \epsilon \quad \text{for all } x, y \in E \text{ with } |x - y| < \delta.$$
[...]

[...] Let $\epsilon > 0$ be given. Take $\delta = \epsilon/2C$. If $x, y \in E$ with $|x - y| < \delta$, then by the above
$$|f(x) - f(y)| \le 2C|x - y| < 2C\delta = \epsilon.$$
Therefore, $f$ is uniformly continuous on $E$. In this example, the choice of $\delta$ depends both on $\epsilon$ and the set $E$. In the exercises you will be asked to show that this result is false if the set $E$ is an unbounded interval.

(b) Let $f(x) = \sin x$. As in Example 4.2.2(f),
$$|f(y) - f(x)| \le |y - x| \quad \text{for all } x, y \in \mathbb{R}.$$
Consequently, $f$ is uniformly continuous on $\mathbb{R}$.
(c) In this example we show that the function $f(x) = 1/x$, $x \in (0, \infty)$, is not uniformly continuous on $(0, \infty)$. Suppose, on the contrary, that $f$ is uniformly continuous on $(0, \infty)$. Then, as in Example 4.1.2(g), if we take $\epsilon = 1$, there exists a $\delta > 0$ such that
$$|f(x) - f(y)| = \left|\frac{1}{x} - \frac{1}{y}\right| < 1$$
for all $x, y \in (0, \infty)$ with $|x - y| < \delta$. Choose $n_0 \in \mathbb{N}$ such that $1/n_0 < \delta$, and for $n \in \mathbb{N}$ set $x_n = 1/n$. Then for all $n \ge n_0$,
$$|x_n - x_{n+1}| = \frac{1}{n} - \frac{1}{n+1}$$
[...]

[...] $> 0$, if we choose $\delta$ such that $0 < \delta < a^2\epsilon$, then as a consequence of the above inequality, $|f(x) - f(y)| < \epsilon$ for all $x, y \in [a, \infty)$ with $|x - y| < \delta$.
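The sequence used in part (c) can be checked with exact arithmetic. The following is a minimal sketch, not part of the original text: the helper name `gap_and_jump` is hypothetical, and the code only illustrates that the points $x_n = 1/n$ and $x_{n+1}$ become arbitrarily close while the values of $f(x) = 1/x$ at those points stay exactly one unit apart, so no single $\delta$ can serve $\epsilon = 1$ on all of $(0, \infty)$.

```python
from fractions import Fraction


def f(x):
    """The function f(x) = 1/x on (0, oo)."""
    return 1 / x


def gap_and_jump(n):
    """Return (|x_n - x_{n+1}|, |f(x_n) - f(x_{n+1})|) for x_n = 1/n.

    Hypothetical helper: exact rational arithmetic avoids rounding noise.
    """
    x_n = Fraction(1, n)
    x_next = Fraction(1, n + 1)
    return abs(x_n - x_next), abs(f(x_n) - f(x_next))


if __name__ == "__main__":
    for n in (1, 10, 100, 1000):
        gap, jump = gap_and_jump(n)
        # The gap 1/(n(n+1)) shrinks to 0, but the jump is always 1.
        print(f"n = {n}: gap = {gap}, jump = {jump}")
```

The gap equals $1/(n(n+1))$, which tends to zero, whereas the difference of function values does not shrink at all.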
Lipschitz Functions
Both of the functions in Example 4.3.2(a) and (b), and the function $f(x) = 1/x$ with Dom $f = [a, \infty)$, $a > 0$, are examples of an extensive class of functions. If $E \subset \mathbb{R}$, a function $f : E \to \mathbb{R}$ satisfies a Lipschitz condition on $E$ if there exists a positive constant $M$ such that
$$|f(x) - f(y)| \le M|x - y| \quad \text{for all } x, y \in E.$$
Functions satisfying the above inequality are usually referred to as Lipschitz functions. As we will see in the next chapter, functions for which the derivative is bounded are Lipschitz functions. As a consequence of the following theorem, every Lipschitz function is uniformly continuous. However, not every uniformly continuous function is a Lipschitz function. For example, the function $f(x) = \sqrt{x}$ is uniformly continuous on $[0, \infty)$, but $f$ does not satisfy a Lipschitz condition on $[0, \infty)$ (see Exercise 5).
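The failure of the Lipschitz condition for $\sqrt{x}$ near 0 can be seen numerically. The sketch below is an illustration only, with a hypothetical helper `sqrt_quotient`; it evaluates the quotient $|\sqrt{x} - \sqrt{y}| / |x - y|$, which any Lipschitz constant $M$ would have to bound. With $y = 0$ the quotient equals $1/\sqrt{x}$ and grows without bound, while for $x, y \ge a > 0$ it stays below $1/(2\sqrt{a})$.

```python
import math


def sqrt_quotient(x, y):
    """Return |sqrt(x) - sqrt(y)| / |x - y| for distinct x, y >= 0.

    Hypothetical helper; a Lipschitz constant M must bound this quotient.
    """
    return abs(math.sqrt(x) - math.sqrt(y)) / abs(x - y)


if __name__ == "__main__":
    # Near 0 the quotient is 1/sqrt(x), which is unbounded.
    for k in range(1, 6):
        x = 10.0 ** (-2 * k)
        print(f"x = {x:.0e}: quotient against 0 = {sqrt_quotient(x, 0.0):.1f}")

    # On [a, oo) with a = 0.25 the quotient never exceeds 1/(2*sqrt(a)) = 1.
    a = 0.25
    samples = [a + 0.1 * i for i in range(50)]
    worst = max(sqrt_quotient(x, y) for x in samples for y in samples if x != y)
    print(f"largest sampled quotient on [{a}, oo): {worst:.4f}")
```

The sampled maximum on $[a, \infty)$ is only evidence, not a proof; the bound itself follows from $|\sqrt{x} - \sqrt{y}| = |x - y| / (\sqrt{x} + \sqrt{y})$.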
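4.3.3 THEOREM Suppose $E \subset \mathbb{R}$ and $f : E \to \mathbb{R}$. If there exists a positive constant $M$ such that $|f(x) - f(y)|$ [...]

Proof. [...] Let $\epsilon > 0$ be given. Since $f$ is continuous, for each $p \in K$, there exists a $\delta_p > 0$ such that
$$|f(x) - f(p)| < \frac{\epsilon}{2} \quad \text{for all } x \in K \cap N_{2\delta_p}(p). \tag{3}$$
The collection $\{N_{\delta_p}(p)\}_{p \in K}$ is an open cover of $K$. Since $K$ is compact, a finite number of these will cover $K$. Thus there exist a finite number of points $p_1, \dots, p_n$ in $K$ such that
$$K \subset \bigcup_{i=1}^{n} N_{\delta_{p_i}}(p_i).$$
Let
$$\delta = \min\{\delta_{p_i} : i = 1, \dots, n\}.$$
Then $\delta > 0$. Suppose $x, y \in K$ with $|x - y| < \delta$. Since $x \in K$, $x \in N_{\delta_{p_i}}(p_i)$ for some $i$. Furthermore, since $|x - y| < \delta \le \delta_{p_i}$, we have $x, y \in N_{2\delta_{p_i}}(p_i)$. Thus by the triangle inequality and inequality (3),
$$|f(x) - f(y)| \le |f(x) - f(p_i)| + |f(p_i) - f(y)| < \frac{\epsilon}{2} + \frac{\epsilon}{2} = \epsilon. \qquad \square$$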
4.3.5 COROLLARY A continuous real-valued function on a closed and bounded interval $[a, b]$ is uniformly continuous.

The definition of uniform continuity and the proof of Corollary 4.3.5 appeared in a paper by Eduard Heine in 1872.

4.3.6 EXAMPLE In this example, we show that the properties closed and bounded are both required in Corollary 4.3.5. The interval $[0, \infty)$ is closed, but not bounded. The function $f(x) = x^2$ is continuous on $[0, \infty)$, but not uniformly continuous on $[0, \infty)$ (Exercise 2). On the other hand, the interval $(0, 1)$ is bounded, but not closed. The function $f(x) = 1/x$ is continuous on $(0, 1)$, but is not uniformly continuous on $(0, 1)$.
EXERCISES 4.3
1. Prove Theorem 4.3.3.
2. Show that the following functions are not uniformly continuous on the given domain.
*a. $f(x) = x^2$, Dom $f = [0, \infty)$
b, g(x) =
I.
Dom g = (0, oo)
c. h(x) = sin x,
Dom h = (0, co)
3. Prove that each of the following functions is uniformly continuous on the indicated set.
*a. AX)
1 + x , x E [0. oo)
c. h(x) = x2
xER
+
e. e(x) = x + 1, x E (0, oo)
b. g(x) _ .r',
x E hi
d. k(x) = cos x, x E R
*1 f(x) =
sinx,
x E (0, I)
4. Show that each of the following functions is a Lipschitz function.
*a. f(x) =
,
Dom f = [a, oo), a > 0
c. h(x) = sin x, Dom h = [a, oo), a > 0
b. g(x) = x2 + 1, Dom g = (0. oo)
d. p(x) a polynomial, Dom p = [ a, a], a > 0
5. *a. Show that $f(x) = \sqrt{x}$ satisfies a Lipschitz condition on $[a, \infty)$, $a > 0$.
b. Prove that $\sqrt{x}$ is uniformly continuous on $[0, \infty)$.
c. Show that $f$ does not satisfy a Lipschitz condition on $[0, \infty)$.
6. Suppose $E \subset \mathbb{R}$ and $f, g$ are Lipschitz functions on $E$.
a. Prove that $f + g$ is a Lipschitz function on $E$.
b. If in addition $f$ and $g$ are bounded on $E$, or the set $E$ is compact, prove that $fg$ is a Lipschitz function on $E$.
7. Suppose $E \subset \mathbb{R}$ and $f, g$ are uniformly continuous real-valued functions on $E$.
a. Prove that $f + g$ is uniformly continuous on $E$.
*b. If, in addition, $f$ and $g$ are bounded, prove that $fg$ is uniformly continuous on $E$.
c. Is part (b) still true if only one of the two functions is bounded?
8. Suppose $E \subset \mathbb{R}$ and $f : E \to \mathbb{R}$ is uniformly continuous. If $\{x_n\}$ is a Cauchy sequence in $E$, prove that $\{f(x_n)\}$ is a Cauchy sequence.
9. Let $f : (a, b) \to \mathbb{R}$ be uniformly continuous on $(a, b)$. Use the previous exercise to show that $f$ can be defined at $a$ and $b$ such that $f$ is continuous on $[a, b]$.
10. Suppose that $E$ is a bounded subset of $\mathbb{R}$ and $f : E \to \mathbb{R}$ is uniformly continuous on $E$. Prove that $f$ is bounded on $E$.
11. Suppose $-\infty \le a < c < b$ [...] $\mathbb{R}$ is periodic if there exists $p \in \mathbb{R}$ such that $f(x + p) = f(x)$ for all $x \in \mathbb{R}$. Prove that a continuous periodic function on $\mathbb{R}$ is bounded and uniformly continuous on $\mathbb{R}$.
4.4 Monotone Functions and Discontinuities
In this section we take a closer look at both limits and continuity for real-valued functions defined on an interval $I \subset \mathbb{R}$. More specifically, however, we will be interested in classifying the types of discontinuities that such a function may have. We will also investigate properties of monotone functions defined on an interval $I$. These functions will play a crucial role in Chapter 6 on Riemann-Stieltjes integration. First, however, we begin with the right and left limits of a real-valued function defined on a subset $E$ of $\mathbb{R}$.

Right and Left Limits
4.4.1 DEFINITION Let $E \subset \mathbb{R}$ and let $f$ be a real-valued function defined on $E$. Suppose $p$ is a limit point of $E \cap (p, \infty)$. The function $f$ has a right limit at $p$ if there exists a number $L \in \mathbb{R}$ such that given any $\epsilon > 0$, there exists a $\delta > 0$ for which
$$|f(x) - L| < \epsilon \quad \text{for all } x \in E \text{ satisfying } p < x < p + \delta.$$
The right limit of $f$, if it exists, is denoted by $f(p+)$, and we write
$$f(p+) = \lim_{x \to p^+} f(x) = \lim_{\substack{x \to p \\ x > p}} f(x).$$
Similarly, if $p$ is a limit point of $E \cap (-\infty, p)$, the left limit of $f$ at $p$, if it exists, is denoted by $f(p-)$, and we write
$$f(p-) = \lim_{x \to p^-} f(x) = \lim_{\substack{x \to p \\ x < p}} f(x).$$
[...] $> 0$, $E \cap (p, p + \delta) \ne \emptyset$. If $E$ is an open interval $(a, b)$, $-\infty \le a < b \le \infty$, then any $p$ satisfying $a \le p < b$ is a limit point of $E \cap (p, \infty)$. Similarly, if $-\infty$ [...] $\mathbb{R}$, then $f$ has a limit at $p \in \operatorname{Int}(I)$ if and only if
(a) $f(p+)$ and $f(p-)$ both exist, and
(b) $f(p+) = f(p-)$.
The hypothesis that $p \in \operatorname{Int}(I)$ guarantees that $p$ is a limit point of both $(-\infty, p) \cap I$ and $I \cap (p, \infty)$. If $p$ is the left endpoint of the interval $I$, then the right limit of $f$ at $p$ coincides with the limit of $f$ at $p$. The analogous statement is also true if $p$ is the right endpoint of $I$. We also define right and left continuity of a function at a point $p$ as follows.

4.4.2 DEFINITION Let $E \subset \mathbb{R}$ and let $f$ be a real-valued function on $E$. The function $f$ is right continuous (left continuous) at $p \in E$ if for any $\epsilon > 0$, there exists a $\delta > 0$ such that
$$|f(x) - f(p)| < \epsilon$$
for all $x \in E$ with $p$ [...] $\mathbb{R}$ is right continuous at $p$. In particular, if $E$ is a closed interval $[a, b]$, then every $f : [a, b] \to \mathbb{R}$ is right continuous at $b$. Also, $f$ is left continuous at $b$ if and only if $f$ is continuous at $b$. The following theorem, the proof of which is left to the exercises, is an immediate consequence of the definitions.

4.4.3 THEOREM A function $f : (a, b) \to \mathbb{R}$ is right continuous at $p \in (a, b)$ if and only if $f(p+)$ exists and equals $f(p)$. Similarly, $f$ is left continuous at $p$ if and only if $f(p-)$ exists and equals $f(p)$.

Proof. Exercise 1. $\square$
Types of Discontinuities
By the previous theorem a function $f$ is continuous at $p \in (a, b)$ if and only if
(a) $f(p+)$ and $f(p-)$ both exist, and
(b) $f(p+) = f(p-) = f(p)$.
A real-valued function $f$ defined on an interval $I$ can fail to be continuous at a point $p \in \overline{I}$ (the closure of $I$) for several reasons. One possibility is that $\lim_{x \to p} f(x)$ exists but either does not equal $f(p)$, or $f$ is not defined at $p$. Such a function can easily be made continuous at $p$ by either defining or redefining $f$ at $p$ as follows:
$$f(p) = \lim_{x \to p} f(x).$$
For this reason, such a discontinuity is called a removable discontinuity. For example, the function
$$g(x) = \begin{cases} \dfrac{x^2 - 4}{x - 2}, & x \ne 2, \\ 2, & x = 2, \end{cases}$$
of Example 4.2.2(a) is not continuous at 2 since
$$\lim_{x \to 2} g(x) = 4 \ne g(2).$$
By redefining $g$ such that $g(2) = 4$, the resulting function is then continuous at 2. Another example is given by $f(x) = x \sin(1/x)$, $x \in (0, \infty)$, which is not defined at 0. If we define $f$ on $[0, \infty)$ by
$$f(x) = \begin{cases} 0, & x = 0, \\ x \sin \dfrac{1}{x}, & x > 0, \end{cases}$$
then by Example 4.2.2(e), $f$ is now continuous at 0.
Another possibility is that $f(p+)$ and $f(p-)$ both exist, but are not equal. This type of discontinuity is called a jump discontinuity. (See Figure 4.13.)

4.4.4 DEFINITION Let $f$ be a real-valued function defined on an interval $I$. The function $f$ has a jump discontinuity at $p \in \operatorname{Int}(I)$ if $f(p+)$ and $f(p-)$ both exist, but $f$ is not continuous at $p$. If $p \in I$ is the left (right) endpoint of $I$, then $f$ has a jump discontinuity at $p$ if $f(p+)$ ($f(p-)$) exists, but $f$ is not continuous at $p$.

Figure 4.13 Jump Discontinuity of $f$ at $p$

Jump discontinuities are also referred to as simple discontinuities, or discontinuities of the first kind. All other discontinuities are said to be of second kind. If $f(p+)$ and $f(p-)$ both exist, but $f$ is not continuous at $p$, then either
(a) $f(p+) \ne f(p-)$, or
(b) $f(p+) = f(p-) \ne f(p)$.
In case (a) $f$ has a jump discontinuity at $p$, whereas in case (b) the discontinuity is removable. All discontinuities for which $f(p+)$ or $f(p-)$ does not exist are discontinuities of the second kind.

4.4.5 EXAMPLES
(a) Let $f$ be defined by
$$f(x) = \begin{cases} x, & 0 < x \le 1, \\ 3 - x^2, & x > 1. \end{cases}$$
The graph of $f$ is given in Figure 4.14. If $x < 1$, then $f(x) = x$. Therefore,
$$f(1-) = \lim_{x \to 1^-} f(x) = \lim_{x \to 1^-} x = 1 = f(1).$$
Likewise, the right limit of $f$ at 1 is
$$f(1+) = \lim_{x \to 1^+} f(x) = \lim_{x \to 1^+} (3 - x^2) = 2.$$

Figure 4.14

Therefore, $f(1-) = f(1) = 1$, and $f(1+) = 2$. Thus $f$ is left continuous at 1, but not continuous. Since both right and left limits exist at 1, but are not equal, the function $f$ has a jump discontinuity at 1.
(b) Let $[x]$ denote the greatest integer function; that is, for each $x$, $[x]$ = largest integer $n$ that is less than or equal to $x$. For example, $[2.9] = 2$, $[3.1] = 3$, and $[-1.5] = -2$. The graph of $y = [x]$ is given in Figure 4.15. It is clear that for each $n \in \mathbb{Z}$,
$$\lim_{x \to n^-} [x] = n - 1 \quad \text{and} \quad \lim_{x \to n^+} [x] = n.$$
Thus $f$ has a jump discontinuity at each $n \in \mathbb{Z}$. Also, since $f(n) = [n] = n$, $f(x) = [x]$ is right continuous at each integer. Finally, since $f$ is constant on each interval $(n - 1, n)$, $n \in \mathbb{Z}$, $f$ is continuous at every $x \in \mathbb{R} \setminus \mathbb{Z}$.
Figure 4.15 Graph of $[x]$
(c) Let f be defined on R by if x 0.
Then $f(0-) = 0$, but $f(0+)$ does not exist. Thus the discontinuity is of second kind.

(d) Consider the function $g : \mathbb{R} \to \mathbb{R}$ defined by $g(x) = \sin(2\pi x[x])$.
For $x \in (n, n + 1)$, $n \in \mathbb{Z}$, $x[x] = nx$, and thus $g(x)$ is continuous on every interval $(n, n + 1)$, $n \in \mathbb{Z}$. On the other hand, for $n \in \mathbb{Z}$,
$$\lim_{x \to n^+} \sin(2\pi x[x]) = \sin(2\pi n^2) = 0,$$
and
$$\lim_{x \to n^-} \sin(2\pi x[x]) = \sin(2\pi n(n - 1)) = 0.$$
Since $g(n) = \sin(2\pi n^2) = 0$, $g$ is also continuous at each $n \in \mathbb{Z}$. Thus $g$ is a bounded continuous function on $\mathbb{R}$. The function $g$, however, is not uniformly continuous on $\mathbb{R}$ (Exercise 7). The graph of $g$ for $x \in (-4, 4)$ is given in Figure 4.16.
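A numerical look at example (d) suggests why uniform continuity fails. The sketch below is an illustration under stated assumptions, not the argument intended for Exercise 7: the helper `witness` and the choice of points $x = n + 1/(4n)$ are hypothetical. For such $x$ one has $x[x] = n^2 + 1/4$, so $g(x) = \sin(\pi/2) = 1$ while $g(n) = 0$, even though the distance $1/(4n)$ from $x$ to $n$ tends to zero.

```python
import math


def g(x):
    """g(x) = sin(2*pi*x*[x]), where [x] is the greatest integer <= x."""
    return math.sin(2 * math.pi * x * math.floor(x))


def witness(n):
    """Return (|x - n|, |g(x) - g(n)|) for the hypothetical choice x = n + 1/(4n)."""
    x = n + 1 / (4 * n)
    return x - n, abs(g(x) - g(n))


if __name__ == "__main__":
    for n in (1, 4, 16, 64):
        gap, jump = witness(n)
        # The gap shrinks like 1/(4n); the jump stays (numerically) at 1.
        print(f"n = {n}: gap = {gap:.6f}, jump = {jump:.9f}")
```

Floating-point evaluation of the sine introduces small rounding errors, so the printed jumps are only approximately 1.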
Figure 4.16 Graph of $g(x) = \sin(2\pi x[x])$, $x \in (-4, 4)$

Monotone Functions
4.4.6 DEFINITION Let $f$ be a real-valued function defined on an interval $I$.
(a) $f$ is monotone increasing (increasing, nondecreasing) on $I$ if $f(x) \le f(y)$ for all $x, y \in I$ with $x < y$.
(b) $f$ is monotone decreasing (decreasing, nonincreasing) on $I$ if $f(x) \ge f(y)$ for all $x, y \in I$ with $x < y$.
(c) $f$ is monotone on $I$ if $f$ is monotone increasing on $I$ or monotone decreasing on $I$.

A function $f$ is strictly increasing on $I$ if $f(x) < f(y)$ for all $x, y \in I$ with $x < y$. The concept of strictly decreasing is defined similarly. Also, $f$ is strictly monotone on $I$ if $f$ is strictly increasing on $I$ or strictly decreasing on $I$. Our main result for monotone functions is as follows.
4.4.7 THEOREM Let $I \subset \mathbb{R}$ be an open interval and let $f : I \to \mathbb{R}$ be monotone increasing on $I$. Then $f(p+)$ and $f(p-)$ exist for every $p \in I$ and
$$\sup_{x < p} f(x) = f(p-) \le f(p) \le f(p+) = \inf_{p < x} f(x).$$
[...]

[...] $> 0$ such that $|f(x) - f(y)| < \epsilon$ for all $x, y \in$ Dom $f$ with $|x - y| < \delta$. In Chapter 6 we will use this to prove that every continuous real-valued function [...]

[...] every positive real number $x$ and $n \in \mathbb{N}$, there exists [...]

[...] satisfies $f(a) < y < f(b)$, then $y \in I$, and hence there exists $c \in [a, b]$ such that $f(c) = y$. That the continuous image of a connected set is connected follows from the definition. However, the proof that the connected subsets of $\mathbb{R}$ are the intervals requires the least upper bound property.
MISCELLANEOUS EXERCISES
1. Let $f$ be a continuous real-valued function on $[a, b]$ with $f(a) < 0 < f(b)$. Let $c_1 = \frac{1}{2}(a + b)$. If $f(c_1) > 0$, let $c_2 = \frac{1}{2}(a + c_1)$. If $f(c_1) < 0$, let $c_2 = \frac{1}{2}(c_1 + b)$. Continue this process inductively to obtain a sequence $\{c_n\}$ in $(a, b)$ which converges to a point $c \in (a, b)$ for which $f(c) = 0$.
2. Let $E \subset \mathbb{R}$, $p$ a limit point of $E$, and $f$ a real-valued function defined on $E$. The limit superior of $f$ at $p$, denoted $\limsup_{x \to p} f(x)$, is defined by
$$\limsup_{x \to p} f(x) = \inf_{\delta > 0} \sup\{f(x) : x \in (N_\delta(p) \setminus \{p\}) \cap E\}.$$
Similarly, the limit inferior of $f$ at $p$, denoted $\liminf_{x \to p} f(x)$, is defined by
$$\liminf_{x \to p} f(x) = \sup_{\delta > 0} \inf\{f(x) : x \in (N_\delta(p) \setminus \{p\}) \cap E\}.$$
Prove each of the following:
a. $\lim f(x)$ [...] $> 0$, there exists a $\delta > 0$ such that $f(x) < L + \epsilon$ for all $x \in E$, $0$ [...] $> 0$, there exists $x \in E$ with $0 < |x - p| < \delta$ such that $f(x) > L - \epsilon$.
c. If $\limsup_{x \to p} f(x) = L$, then for any sequence $\{x_n\}$ in $E$ with $x_n \ne p$ for all $n \in \mathbb{N}$ and $\lim x_n = p$, $\limsup_{n \to \infty} f(x_n) \le L$.
d. There exists a sequence $\{x_n\}$ in $E$ with $x_n \ne p$ for all $n \in \mathbb{N}$, such that $\lim x_n = p$ and $\lim f(x_n) = \limsup_{x \to p} f(x)$.
3. Let $X \subset \mathbb{R}$ and $f$ a real-valued function on $X$. For $p \in X$, the oscillation of $f$ at $p$, denoted $\omega(f; p)$, is defined as
$$\omega(f; p) = \inf_{\delta > 0} \sup\{|f(x) - f(y)| : x, y \in N_\delta(p) \cap X\}.$$
Prove each of the following:
a. The function $f$ is continuous at $p$ if and only if $\omega(f; p) = 0$.
b. For every $s \in \mathbb{R}$, the set $\{x \in X : \omega(f; x) < s\}$ is open in $X$.
c. The set $\{x \in X : f \text{ is continuous at } x\}$ is the intersection of at most countably many sets that are open in $X$.
4. Find $\omega(f; x)$ for the functions $f$ of Example 4.1.2(e) and Example 4.2.2(g).

The following set of exercises involves the Cantor ternary function. Let $P$ denote the Cantor ternary set of Section 3.3. For each $x \in (0, 1]$, let $x = .a_1a_2a_3\cdots$ denote the ternary expansion of $x$. Define $N$ as follows:
$$N = \begin{cases} \infty, & \text{if } a_n \ne 1 \text{ for all } n \in \mathbb{N}, \\ \min\{n : a_n = 1\}, & \text{otherwise}. \end{cases}$$
Define $b_n = \frac{1}{2}a_n$ for $n < N$, and $b_N = 1$, if $N$ is finite. (Note: $b_n \in \{0, 1\}$ for all $n$.)
5. If $x \in (0, 1)$ has two ternary expansions, show that
$$\sum_{n=1}^{N} \frac{b_n}{2^n}$$
is independent of the expansion of $x$.

The Cantor ternary function $f$ on $[0, 1]$ is defined as follows: $f(0) = 0$, and if $x \in (0, 1]$ with ternary expansion $x = .a_1a_2a_3\cdots$, set
$$f(x) = \sum_{n=1}^{N} \frac{b_n}{2^n},$$
where $N$ and $b_n$ are defined as above.
6. Prove each of the following:
a. $f$ is monotone increasing on $[0, 1]$.
b. $f$ is constant on each interval in the complement of the Cantor set in $[0, 1]$.
c. $f$ is continuous on $[0, 1]$.
d. $f(P) = [0, 1]$.
e. Sketch the graph of $f$.
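The definition of the Cantor ternary function translates directly into a digit-by-digit computation. The following is a minimal sketch, assuming exact rational inputs: the function name `cantor` and the cutoff `max_digits` are hypothetical choices, and when $N = \infty$ the series is truncated, so the result is then only an approximation with error at most $2^{-\texttt{max\_digits}}$. The greedy digit extraction always picks the terminating expansion when two expansions exist, which by Exercise 5 does not affect the value.

```python
from fractions import Fraction


def cantor(x, max_digits=60):
    """Approximate the Cantor ternary function at x in [0, 1].

    Follows the text: read ternary digits a_n, stop at the first digit
    equal to 1 (index N, with b_N = 1), and otherwise use b_n = a_n / 2.
    max_digits is a hypothetical cutoff for the case N = infinity.
    """
    x = Fraction(x)
    if x <= 0:
        return Fraction(0)
    if x >= 1:
        return Fraction(1)  # 1 = .222... gives the sum of 1/2^n, which is 1
    total = Fraction(0)
    for n in range(1, max_digits + 1):
        x *= 3
        a = int(x)  # next ternary digit, one of 0, 1, 2
        x -= a
        if a == 1:
            return total + Fraction(1, 2 ** n)  # b_N = 1 ends the sum
        total += Fraction(a // 2, 2 ** n)  # b_n = a_n / 2
        if x == 0:
            break  # expansion terminates; remaining digits are 0
    return total


if __name__ == "__main__":
    for x in (Fraction(1, 3), Fraction(2, 3), Fraction(1, 4), Fraction(1, 2)):
        print(f"f({x}) is approximately {float(cantor(x)):.6f}")
```

For instance, $1/3 = .1$ and $2/3 = .2$ in base 3 both give the value $1/2$, consistent with $f$ being constant on the removed middle third.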
SUPPLEMENTAL READING
Bryant, J., Kuzmanovich, J. and Pavlichenkov, A., "Functions with compact preimages of compact sets," Math. Mag. 70 (1997), 362-364.
Bumcrot, R. and Sheingorn, M., "Variations on continuity: Sets of infinite limits," Math. Mag. 47 (1974), 41-43.
Cauchy, A. L., Cours d'Analyse, Paris, 1821, in Oeuvres complètes d'Augustin Cauchy, series 2, vol. 3, Gauthier-Villars, Paris, 1899.
Fleron, Julian F., "A note on the history of the Cantor set and Cantor function," Math. Mag. 67 (1994), 136-140.
Grabiner, Judith V., "Who gave you the epsilon? Cauchy and the origins of rigorous calculus," Amer. Math. Monthly 90 (1983), 185-194.
Martelli, M., Dang, M. and Seph, T., "Defining chaos," Math. Mag. 71 (1999), 112-122.
Snipes, Ray F., "Is every continuous function uniformly continuous?" Math. Mag. 57 (1994), 169-173.
Straffin, Jr., Philip D., "Periodic points of continuous functions," Math. Mag. 51 (1978), 99-105.
Velleman, D. J., "Characterizing continuity," Amer. Math. Monthly 104 (1997), 318-322.
Differentiation
5.1 The Derivative
5.2 The Mean Value Theorem
5.3 L'Hospital's Rule
5.4 Newton's Method
The development of differential and integral calculus by Isaac Newton (1642-1727) and Gottfried Wilhelm Leibniz (1646-1716) in the mid-seventeenth century constitutes one of the great advances in mathematics. In the two years following his degree from Cambridge in 1664, Newton invented the method of fluxions (derivatives) and fluents (integrals) to solve problems in physics involving velocity and motion. During the same period, he also discovered the laws of universal gravitation and made significant contributions to the study of optics. Leibniz, on the other hand, whose contributions came ten years later, was led to the invention of calculus through the study of tangents to curves and the problem of area. The first published account of Newton's calculus appeared in his 1687 treatise Philosophia Naturalis Principia Mathematica. Unfortunately, however, much of Newton's work on calculus did not appear until 1737, ten years after his death, in a work entitled Methodus fluxionum et serierum infinitorum.

Mathematicians prior to the time of Newton and Leibniz knew how to compute tangents to specific curves and velocities in particular situations. They also knew how to compute areas under elementary curves. What distinguished the work of Newton and Leibniz from that of their predecessors was that they realized that the problems of finding the tangent to a curve and the area under a curve were inversely related. More importantly, they also developed the notation and a set of techniques (a calculus) to solve these problems for arbitrary functions, whether algebraic or transcendental. In Newton's presentation of his infinitesimal calculus, he looked upon $y$ as a flowing quantity, or fluent, of which the quantity $\dot{y}$ was the fluxion or rate of change. Newton's notation is still in use in physics and differential geometry, whereas every student of calculus learns the $d$ (for difference) and $\int$ (for sum) notation of Leibniz to denote differentiation and integration.

Many of the basic rules and formulas of the differential calculus were developed by these two remarkable mathematicians. In the paper A New Method for Maxima and Minima, and also for Tangents, which is not Obstructed by Irrational Quantities, published in 1684, Leibniz gave correct rules for differentiation of sums, products, quotients, powers, and roots. In addition to his many contributions to the subject, Leibniz also disseminated his results in publications and correspondence with colleagues throughout Europe. Newton and Leibniz, with their invention of the calculus, had created a tool of such novel subtlety that its utility was proved for over 150 years before its limitations forced mathematicians to clarify its foundations. The rigorous formulation of the derivative did not occur until 1821 when Cauchy provided a formal definition of limit. This helped to place the theory on a firm mathematical footing. Cauchy's contributions to the rigorous development of calculus will be evident in both this and subsequent chapters.

In this chapter we develop the theory of differentiation based on the definition of Cauchy, with special emphasis on the mean value theorem and consequences thereof. The first section presents the standard results concerning derivatives of functions obtained by means of algebraic operations and composition. In the examples and exercises we will derive the derivatives of some of the basic algebraic and trigonometric functions. However, throughout the chapter we will assume that the reader is already familiar with standard techniques of differentiation and some of its applications. Therefore, we will concentrate on the mathematical concepts of the derivative, emphasizing many of its more subtle properties.
5.1 The Derivative
In an elementary calculus course, the derivative is usually introduced by considering the problem of the tangent line to a curve or finding the velocity of an object moving in a straight line. Suppose $y = f(x)$ is a real-valued function defined on an interval $[a, b]$. Fix $p \in [a, b]$. For $x \in [a, b]$, $x \ne p$, the quantity
$$Q(x) = \frac{f(x) - f(p)}{x - p}$$
represents the slope of the straight line (secant line) joining the points $(p, f(p))$ and $(x, f(x))$ on the graph of $f$ (see Figure 5.1). The function $Q(x)$ is defined for all values of $x \in [a, b]$, $x \ne p$. The limit of $Q(x)$ as $x$ approaches $p$, provided this limit exists, is defined as the slope of the tangent line to the curve $y = f(x)$ at the point $(p, f(p))$.

A similar type of limit occurs if we consider the problem of defining the velocity of a moving object. Suppose that an object is moving in a straight line and that its distance $s$ from a fixed point $P$ is given as a function of $t$, namely, $s = s(t)$. If $t_0$ is fixed, then the average velocity over the time interval from $t_0$ to $t$, $t \ne t_0$, is defined as
$$\frac{s(t) - s(t_0)}{t - t_0}.$$

Figure 5.1

The limit of this quantity as $t$ approaches $t_0$, again provided that the limit exists, is taken as the definition of the velocity of the object at time $t_0$. Both of the previous examples involve identical limits; namely,
$$\lim_{x \to p} \frac{f(x) - f(p)}{x - p} \quad \text{and} \quad \lim_{t \to t_0} \frac{s(t) - s(t_0)}{t - t_0}.$$
These limits, if they exist, are called the derivatives of the functions $f$ and $s$ at $p$ and $t_0$ respectively. The term derivative comes from the French fonction dérivée.
5.1.1 DEFINITION Let $I \subset \mathbb{R}$ be an interval and let $f$ be a real-valued function with domain $I$. For fixed $p \in I$, the derivative of $f$ at $p$, denoted $f'(p)$, is defined to be
$$f'(p) = \lim_{x \to p} \frac{f(x) - f(p)}{x - p},$$
provided the limit exists. If $f'(p)$ is defined at a point $p \in I$, we say that $f$ is differentiable at $p$. If the derivative $f'$ is defined at every point of a set $E \subset I$, we say that $f$ is differentiable on $E$.

If $p$ is an interior point of $I$, then $p + h \in I$ for all $h$ sufficiently small. If we set $x = p + h$, $h \ne 0$, then the definition of the derivative of $f$ at $p$ can be expressed as
$$f'(p) = \lim_{h \to 0} \frac{f(p + h) - f(p)}{h},$$
provided the limit exists. This formulation of the derivative is sometimes easier to use.

In the definition of the derivative we do not exclude the possibility that $p$ is an endpoint of $I$. If $p \in I$ is the left endpoint of $I$, then
$$f'(p) = \lim_{x \to p^+} \frac{f(x) - f(p)}{x - p} = \lim_{h \to 0^+} \frac{f(p + h) - f(p)}{h},$$
provided, of course, that the limit exists. The analogous formula also holds if $p \in I$ is the right endpoint of $I$. In analogy with the right and left limit of a function, we also define the right and left derivative of a function.
5.1.2 DEFINITION Let $I \subset \mathbb{R}$ be an interval and let $f$ be a real-valued function with domain $I$. If $p \in I$ is such that $I \cap (p, \infty) \ne \emptyset$, then the right derivative of $f$ at $p$, denoted $f'_+(p)$, is defined as
$$f'_+(p) = \lim_{h \to 0^+} \frac{f(p + h) - f(p)}{h},$$
provided the limit exists. Similarly, if $p \in I$ satisfies $(-\infty, p) \cap I \ne \emptyset$, then the left derivative of $f$ at $p$, denoted $f'_-(p)$, is given by
$$f'_-(p) = \lim_{h \to 0^-} \frac{f(p + h) - f(p)}{h},$$
provided the limit exists.
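The one-sided difference quotients in Definition 5.1.2 can be tabulated numerically. The sketch below is an illustration only: the helper `one_sided_quotients` is hypothetical, and the test function $f(x) = |x|$ at $p = 0$ is chosen here because its right and left quotients are constant ($1$ and $-1$), so the right and left derivatives exist but differ.

```python
def one_sided_quotients(f, p, steps):
    """Return the difference quotients (f(p + h) - f(p)) / h for each h in steps.

    Hypothetical helper: positive steps probe the right derivative,
    negative steps probe the left derivative.
    """
    return [(f(p + h) - f(p)) / h for h in steps]


if __name__ == "__main__":
    right_steps = [10.0 ** -k for k in range(1, 8)]
    left_steps = [-h for h in right_steps]

    # For f(x) = |x| at p = 0 the quotient is |h| / h: +1 from the right, -1 from the left.
    print("right:", one_sided_quotients(abs, 0.0, right_steps))
    print("left: ", one_sided_quotients(abs, 0.0, left_steps))
```

A finite table of quotients can only suggest the value of a limit; here the quotients are constant, so the limits are evident.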
Remarks
(a) If $p \in \operatorname{Int}(I)$, then $f'(p)$ exists if and only if both $f'_+(p)$ and $f'_-(p)$ exist and are equal. On the other hand, if $p \in I$ is the left (right) endpoint of $I$, then $f'(p)$ exists if and only if $f'_+(p)$ ($f'_-(p)$) exists. In this case, $f'(p) = f'_+(p)$ ($f'_-(p)$). The reader should note the distinction between $f'_+(p)$ and $f'(p+)$. The first denotes the right derivative of $f$ at $p$, whereas the latter is the right limit of the derivative; i.e.,
$$f'(p+) = \lim_{x \to p^+} f'(x).$$
Here, of course, we are assuming that $f'$ is defined for all $x \in (p, p + \delta)$ for some $\delta > 0$.

(b) If $f$ is a differentiable function on an interval $I$, we will also occasionally use Leibniz's notation
$$\frac{d}{dx} f(x), \quad \frac{df}{dx}, \quad \text{or} \quad \frac{dy}{dx},$$
to denote the derivative of $y = f(x)$.

(c) If $f$ is differentiable on an interval $I$, then the derivative $f'(x)$ is itself a function on $I$. Therefore, we can consider the existence of the derivative of the function $f'$ at a point $p \in I$. If the function $f'$ has a derivative at a point $p \in I$, we refer to this quantity as the second derivative of $f$ at $p$, which we denote $f''(p)$. Thus
$$f''(p) = \lim_{h \to 0} \frac{f'(p + h) - f'(p)}{h}.$$
In a similar fashion we can define the third derivative of $f$ at $p$, denoted $f'''(p)$ or $f^{(3)}(p)$. In general, for $n \in \mathbb{N}$, $f^{(n)}(p)$ denotes the $n$th derivative of $f$ at $p$. In order to discuss the existence of the $n$th derivative of $f$ at $p$, we require the existence of the $(n - 1)$st derivative of $f$ on an interval containing $p$.
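5.1.3 EXAMPLES
(a) In the exercises (Exercise 2) you will be asked to prove that if $f(x) = x^n$, $n \in \mathbb{Z}$, then $f'(x) = nx^{n-1}$ for all $x \in \mathbb{R}$ ($x \ne 0$ if $n$ is negative). For the function $f(x) = x^2$, the result is obtained as follows:
$$f'(x) = \lim_{h \to 0} \frac{(x + h)^2 - x^2}{h} = \lim_{h \to 0} (2x + h) = 2x.$$
A similar computation shows that $f''(x) = 2$.

(b) Consider $f(x) = \sqrt{x}$, $x > 0$. We first note that for $h \ne 0$,
$$\frac{f(x + h) - f(x)}{h} = \frac{\sqrt{x + h} - \sqrt{x}}{h} = \frac{(\sqrt{x + h} - \sqrt{x})(\sqrt{x + h} + \sqrt{x})}{h(\sqrt{x + h} + \sqrt{x})} = \frac{1}{\sqrt{x + h} + \sqrt{x}}.$$
Since $\lim_{h \to 0} \sqrt{x + h} = \sqrt{x}$, we have
$$f'(x) = \lim_{h \to 0} \frac{1}{\sqrt{x + h} + \sqrt{x}} = \frac{1}{2\sqrt{x}}.$$

(c) Consider $f(x) = \sin x$. From the identity
$$\sin(x + h) = \sin x \cos h + \cos x \sin h,$$
we obtain
$$\frac{\sin(x + h) - \sin x}{h} = \sin x \left[\frac{\cos h - 1}{h}\right] + \cos x \left[\frac{\sin h}{h}\right].$$
By Example 4.1.10(d) and Exercise 5, Section 4.1,
$$\lim_{h \to 0} \frac{\sin h}{h} = 1 \quad \text{and} \quad \lim_{h \to 0} \frac{\cos h - 1}{h} = 0.$$
Therefore,
$$f'(x) = \lim_{h \to 0} \frac{\sin(x + h) - \sin x}{h} = \sin x \lim_{h \to 0} \left[\frac{\cos h - 1}{h}\right] + \cos x \lim_{h \to 0} \left[\frac{\sin h}{h}\right] = \cos x.$$
In Exercise 3 you will be asked to prove that
$$\frac{d}{dx}(\cos x) = -\sin x.$$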
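The derivatives computed in examples (a) through (c) can be compared against difference quotients with a small step. This is a rough numerical check under stated assumptions, not a proof: the names `difference_quotient` and `EXAMPLES`, the evaluation point, and the step sizes are all hypothetical choices made for illustration.

```python
import math


def difference_quotient(f, x, h):
    """Return (f(x + h) - f(x)) / h, the quotient whose limit defines f'(x)."""
    return (f(x + h) - f(x)) / h


# Each entry pairs a function from the examples with the derivative derived in the text.
EXAMPLES = [
    ("x^2", lambda x: x * x, lambda x: 2 * x),
    ("sqrt(x)", math.sqrt, lambda x: 1 / (2 * math.sqrt(x))),
    ("sin(x)", math.sin, math.cos),
]

if __name__ == "__main__":
    x = 2.0
    for name, f, fprime in EXAMPLES:
        for h in (1e-2, 1e-4, 1e-6):
            q = difference_quotient(f, x, h)
            print(f"{name}: h = {h:.0e}, quotient = {q:.8f}, derivative = {fprime(x):.8f}")
```

As $h$ shrinks the quotients approach the stated derivatives; for very small $h$ floating-point cancellation would eventually dominate, which is why the step stops at $10^{-6}$.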
(d) Let $f$ be defined by
$$f(x) = |x| = \begin{cases} x, & x \ge 0, \\ -x, & x < 0. \end{cases}$$
[...]
b.g(x)=L(x2), x#0 *c. h(x) = [L.(x)]', x > 0 d. k(x) = L(L(x)), x E {x > 0 : L(x) > 0} 13. For b real, let f be defined by
$$f(x) = \begin{cases} x^b \sin \dfrac{1}{x}, & x > 0, \\ 0, & x \le 0. \end{cases}$$
Prove the following:
a. $f$ is continuous at 0 if and only if $b > 0$.
b. $f$ is differentiable at 0 if and only if $b > 1$.
c. $f'$ is continuous at 0 if and only if $b > 2$.
14. a. If $f$ is differentiable at $x_0$, prove that
$$\lim_{h \to 0} \frac{f(x_0 + h) - f(x_0 - h)}{2h} = f'(x_0).$$
b. If $\lim_{h \to 0} \dfrac{f(x_0 + h) - f(x_0 - h)}{2h}$ exists, is $f$ differentiable at $x_0$?
15. If $f : (a, b) \to \mathbb{R}$ is differentiable at $p \in (a, b)$, prove that $f'(p) = \lim_{n \to \infty} n[f(p + \frac{1}{n}) - f(p)]$. Show by example that the existence of the limit of the sequence $\{n[f(p + \frac{1}{n}) - f(p)]\}$ does not imply the existence of $f'(p)$.
16. Leibniz's Rule: Suppose $f$ and $g$ have $n$th order derivatives on $(a, b)$. Prove that
$$(fg)^{(n)}(x) = \sum_{k=0}^{n} \binom{n}{k} f^{(k)}(x) g^{(n-k)}(x).$$
5.2 The Mean Value Theorem
In this section we will prove the mean value theorem and give several consequences of this important result. Even though the proof itself is elementary, the theorem is one of the most useful results of analysis. Its importance is based on the fact that it allows us to relate the values of a function to values of its derivative. We begin the section with a discussion of local maxima and minima.
Local Maxima and Minima
5.2.1 DEFINITION Suppose $E \subset \mathbb{R}$ and $f$ is a real-valued function with domain $E$. The function $f$ has a local maximum at a point $p \in E$ if there exists a $\delta > 0$ such that $f(x) \leq f(p)$ for all $x \in E \cap N_\delta(p)$. The function $f$ has an absolute maximum at $p \in E$ if $f(x) \leq f(p)$ for all $x \in E$.
Similarly, $f$ has a local minimum at a point $q \in E$ if there exists a $\delta > 0$ such that $f(x) \geq f(q)$ for all $x \in E \cap N_\delta(q)$, and $f$ has an absolute minimum at $q \in E$ if $f(x) \geq f(q)$ for all $x \in E$.
Remark. As a consequence of Corollary 4.2.9, every continuous real-valued function defined on a compact subset $K$ of $\mathbb{R}$ has an absolute maximum and minimum on $K$.
The function f, illustrated in Figure 5.2, has a local maximum at a, p,, and p, and a local minimum at and b. The points (p4,f(p4)) and (p,.f(p,)) are absolute maxima and absolute minima, respectively.
Figure 5.2
The following theorem gives the relationship between local maxima or minima of a function defined on an interval and the values of its derivative.
5.2.2 THEOREM Let $f$ be a real-valued function defined on an interval $I$, and suppose $f$ has either a local minimum or local maximum at $p \in \operatorname{Int}(I)$. If $f$ is differentiable at $p$, then $f'(p) = 0$.
Proof. If $f$ is differentiable at $p \in \operatorname{Int}(I)$, then $f'_+(p)$ and $f'_-(p)$ both exist and are equal. Suppose $f$ has a local maximum at $p$. Then there exists a $\delta > 0$ such that $f(t) \leq f(p)$ for all $t \in I$ with $|t - p| < \delta$. In particular, if $p < t < p + \delta$, $t \in I$, then
$$\frac{f(t) - f(p)}{t - p} \leq 0.$$
Thus $f'_+(p) \leq 0$. Similarly, if $p - \delta < t < p$,
$$\frac{f(t) - f(p)}{t - p} \geq 0,$$
and therefore $f'_-(p) \geq 0$. Finally, since $f'_+(p) = f'_-(p) = f'(p)$, we have $f'(p) = 0$. The proof of the case where $f$ has a local minimum at $p$ is similar. $\square$
As a consequence of the previous theorem we have the following corollary.
5.2.3 COROLLARY Let $f$ be a continuous real-valued function on $[a, b]$. If $f$ has a local maximum or minimum at $p \in (a, b)$, then either the derivative of $f$ at $p$ does not exist, or $f'(p) = 0$.
Remark. The conclusion of Theorem 5.2.2 is not valid if $p \in I$ is an endpoint of the interval. For example, if $f : [a, b] \to \mathbb{R}$ has a local maximum at $a$, and if $f$ is differentiable at $a$, then we can only conclude that $f'(a) = f'_+(a) \leq 0$. This is illustrated in the following examples.
5.2.4 EXAMPLES
(a) The function
$$f(x) = \left(x - \tfrac{1}{2}\right)^2, \qquad 0 \leq x \leq 2,$$
has a local maximum at $p = 0$ and $p = 2$, and an absolute minimum at $q = \tfrac{1}{2}$. By computation, we have $f'_+(0) = -1$, $f'_-(2) = 3$, and $f'\left(\tfrac{1}{2}\right) = 0$. The graph of $f$ is given in Figure 5.3.
(b) The function $f(x) = |x|$, $x \in [-1, 1]$, has an absolute minimum at $p = 0$. However, by Example 5.1.3(d) the derivative does not exist at $p = 0$.
Rolle's Theorem
Prior to stating and proving the mean value theorem, we first state and prove the following theorem credited to Michel Rolle (1652-1719).
Figure 5.3 Graph of $f(x) = \left(x - \tfrac{1}{2}\right)^2$, $0 \leq x \leq 2$
5.2.5 THEOREM (Rolle's Theorem) Suppose $f$ is a continuous real-valued function on $[a, b]$ with $f(a) = f(b)$, and that $f$ is differentiable on $(a, b)$. Then there exists $c \in (a, b)$ such that $f'(c) = 0$.
Since the derivative of $f$ at $c$ gives the slope of the tangent line at $(c, f(c))$, a geometric interpretation of Rolle's theorem is that if $f$ satisfies the hypothesis of the theorem, then there exists at least one value of $c \in (a, b)$ for which the tangent line to the graph of $f$ is horizontal. For the function $f$ depicted in Figure 5.4, there are exactly two such points.
Proof. If $f$ is constant on $[a, b]$, then $f'(x) = 0$ for all $x \in (a, b)$. Thus, we assume that $f$ is not constant. Since the closed interval $[a, b]$ is compact, by Corollary 4.2.9, $f$ has a maximum and a minimum on $[a, b]$. If $f(t) > f(a)$ for some $t$, then $f$ has a maximum at some $c \in (a, b)$. Thus by Theorem 5.2.2, $f'(c) = 0$. If $f(t) < f(a)$ for some $t$, then $f$ has a minimum at some $c \in (a, b)$, and thus again $f'(c) = 0$.
Remarks
(a) Continuity of $f$ on $[a, b]$ is required in the proof of Rolle's theorem. The function
$$f(x) = \begin{cases} x, & 0 \leq x < 1, \\ 0, & x = 1, \end{cases}$$
is differentiable on $(0, 1)$ and satisfies $f(0) = f(1) = 0$; yet $f'(x) \neq 0$ for all $x \in (0, 1)$. The function $f$ fails to be continuous at 1.
(b) For Rolle's theorem, differentiability of $f$ at $a$ and $b$ is not required. For example, the function $f(x) = \sqrt{4 - x^2}$, $x \in [-2, 2]$, satisfies the hypothesis of Rolle's theorem, yet the derivative does not exist at $-2$ and $2$. For $x \in (-2, 2)$,
$$f'(x) = \frac{-x}{\sqrt{4 - x^2}},$$
and the conclusion of Rolle's theorem is satisfied with $c = 0$.
Figure 5.4 Rolle's Theorem
The Mean Value Theorem
As a consequence of Rolle's theorem we obtain the mean value theorem. This result is usually attributed to Joseph Lagrange (1736-1813).
5.2.6 THEOREM (Mean Value Theorem) If $f : [a, b] \to \mathbb{R}$ is continuous on $[a, b]$ and differentiable on $(a, b)$, then there exists $c \in (a, b)$ such that
$$f(b) - f(a) = f'(c)(b - a).$$
Graphically, the mean value theorem states that there exists at least one point $c \in (a, b)$ such that the slope of the tangent line to the graph of the function $f$ is equal to the slope of the straight line passing through $(a, f(a))$ and $(b, f(b))$. For the function of Figure 5.5, there are two such values of $c$, namely $c_1$ and $c_2$.
Proof. Consider the function $g$ defined on $[a, b]$ by
$$g(x) = f(x) - f(a) - \left[\frac{f(b) - f(a)}{b - a}\right](x - a).$$
Then $g$ is continuous on $[a, b]$, differentiable on $(a, b)$, with $g(a) = g(b)$. Thus by Rolle's theorem there exists $c \in (a, b)$ such that $g'(c) = 0$. But
$$g'(x) = f'(x) - \frac{f(b) - f(a)}{b - a}$$
for all $x \in (a, b)$. Taking $x = c$ gives $f'(c) = \dfrac{f(b) - f(a)}{b - a}$, from which the conclusion now follows.
Figure 5.5 Mean Value Theorem
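The mean value theorem asserts that a point $c$ exists but does not say how to find it. As a hedged illustration, the sketch below locates such a point numerically for $f(x) = x^3$ on $[0, 2]$. The function name `mvt_point` is hypothetical, and the use of bisection rests on two extra assumptions that are not part of the theorem: $f'$ is continuous, and $f'(x)$ minus the secant slope changes sign between the endpoints.

```python
def mvt_point(f, fprime, a, b, iterations=100):
    """Find c in (a, b) with f'(c) = (f(b) - f(a)) / (b - a) by bisection.

    Assumes fprime is continuous and that fprime(x) - slope changes sign
    between a and b; neither assumption is part of the theorem itself.
    """
    slope = (f(b) - f(a)) / (b - a)
    g = lambda x: fprime(x) - slope
    lo, hi = a, b
    if g(lo) * g(hi) > 0:
        raise ValueError("no sign change at the endpoints; bisection does not apply")
    for _ in range(iterations):
        mid = (lo + hi) / 2.0
        if g(lo) * g(mid) <= 0:
            hi = mid
        else:
            lo = mid
    return (lo + hi) / 2.0


# f(x) = x^3 on [0, 2]: the secant slope is 4, so 3c^2 = 4 and c = 2 / sqrt(3).
c = mvt_point(lambda x: x ** 3, lambda x: 3 * x ** 2, 0.0, 2.0)
print(c)
```

Here the secant slope is $(8 - 0)/(2 - 0) = 4$, so the computed value can be compared with the exact solution $c = 2/\sqrt{3}$ of $3c^2 = 4$.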
The mean value theorem is one of the fundamental results of differential calculus. Its importance lies in the fact that it enables us to obtain information about a function $f$ from its derivative $f'$. In Example 5.2.7 we will illustrate how the mean value theorem can be used to derive inequalities. Other applications will be given later in this section and in the exercises. It will also be used in many other instances in the text.
5.2.7 EXAMPLE In this example we illustrate how the mean value theorem may be used in proving elementary inequalities. We will use it to prove that
$$\frac{x}{1 + x} \leq \ln(1 + x) \leq x \qquad \text{for all } x > -1,$$
where $\ln x$ denotes the natural logarithm function on $(0, \infty)$. This function is defined and considered in detail in Example 6.3.5 of the next chapter. There it is proved that the derivative of $\ln x$ is $1/x$. Let $f(x) = \ln(1 + x)$, $x \in (-1, \infty)$. Then $f(0) = 0$. If $x > 0$, then by the mean value theorem, there exists $c \in (0, x)$ such that
$$\ln(1 + x) = f(x) - f(0) = f'(c)\,x.$$
But $f'(c) = (1 + c)^{-1}$ and $(1 + x)^{-1} < (1 + c)^{-1} < 1$ for all $c \in (0, x)$. Therefore,
$$\frac{x}{1 + x} < \ln(1 + x) < x \qquad \text{for all } x > 0.$$
...
(b) If $f'(x) > 0$ for all $x \in I$, then $f$ is strictly increasing on $I$.
(c) If $f'(x) \leq 0$ for all $x \in I$, then $f$ is monotone decreasing on $I$.
(d) If $f'(x) < 0$ for all $x \in I$, then $f$ is strictly decreasing on $I$.
(e) If $f'(x) = 0$ for all $x \in I$, then $f$ is constant on $I$.
Proof. Suppose $x_1, x_2 \in I$ with $x_1 < x_2$. By the mean value theorem applied to $f$ on $[x_1, x_2]$,
$$f(x_2) - f(x_1) = f'(c)(x_2 - x_1)$$
for some $c \in (x_1, x_2)$. If $f'(c) \geq 0$, then $f(x_2) \geq f(x_1)$. Thus, if $f'(x) \geq 0$ for all $x \in I$, we have $f(x_2) \geq f(x_1)$ for all $x_1, x_2 \in I$ with $x_1 < x_2$. Thus $f$ is monotone increasing on $I$. The other results follow similarly. $\square$
Remark. It needs to be emphasized that if the derivative of a function $f$ is positive at a point $c$, then this does not imply that $f$ is increasing on an interval containing $c$. The function $f$ of Exercise 18 satisfies $f'(0) = 1$, but $f'(x)$ assumes both negative and positive values in every neighborhood of 0. Thus $f$ is not monotone on any interval containing 0. If $f'(c) > 0$, the only conclusion that can be reached is that there exists a $\delta > 0$ such that $f(x) < f(c)$ for all $x \in (c - \delta, c)$ and $f(x) > f(c)$ for all $x \in (c, c + \delta)$ (Exercise 15). This, however, does not mean that $f$ is increasing on $(c - \delta, c + \delta)$. However, if $f'(c) > 0$ and $f'$ is continuous at $c$, then there exists a $\delta > 0$ such that $f'(x) > 0$ for all $x \in (c - \delta, c + \delta)$. Thus $f$ is increasing on $(c - \delta, c + \delta)$.
Theorem 5.2.9 is often used to determine maxima and minima of functions as follows: Suppose $f$ is a real-valued continuous function on $(a, b)$, and $c \in (a, b)$ is such that $f'(c) = 0$ or $f'(c)$ does not exist. Suppose $f$ is differentiable on $(a, c)$ and $(c, b)$. If $f'(x) < 0$ for all $x \in (a, c)$ and $f'(x) > 0$ for all $x \in (c, b)$, then by Theorem 5.2.9, $f$ is decreasing on $(a, c)$ and increasing on $(c, b)$. As a consequence, one concludes that $f$ has a local minimum at $c$. This method is usually referred to as the first derivative test for local maxima or minima. The natural inclination is to think that the converse is also true; namely, if $f$ has a local minimum at $c$, then $f$ is decreasing to the left of $c$ and increasing to the right of $c$. As the following example shows, however, this is false!
5.2.10 EXAMPLE Let $f$ be defined by
$$f(x) = \begin{cases} x^4\left(2 + \sin \dfrac{1}{x}\right), & x \neq 0, \\ 0, & x = 0. \end{cases}$$
The function $f$ has an absolute minimum at $x = 0$; however, $f'(x)$ has both negative and positive values in every neighborhood of 0. The details are left as an exercise (Exercise 19). The graph of $f'(x) = 4x^3\left(2 + \sin \tfrac{1}{x}\right) - x^2 \cos \tfrac{1}{x}$, $x \neq 0$, for $x$ in a neighborhood of zero is given in Figure 5.7.
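The sign changes of $f'$ near 0 can be observed numerically. The sketch below is only an illustration and is no substitute for the argument requested in Exercise 19; the sample points $x = 1/(2\pi k)$ and $x = 1/((2k+1)\pi)$ are our own choice, made so that $\sin(1/x) \approx 0$ and $\cos(1/x) \approx \pm 1$.

```python
import math


def f(x):
    """f(x) = x^4 (2 + sin(1/x)) for x != 0, and f(0) = 0."""
    return x ** 4 * (2.0 + math.sin(1.0 / x)) if x != 0 else 0.0


def fprime(x):
    """f'(x) = 4x^3 (2 + sin(1/x)) - x^2 cos(1/x) for x != 0, and f'(0) = 0."""
    if x == 0:
        return 0.0
    return 4.0 * x ** 3 * (2.0 + math.sin(1.0 / x)) - x ** 2 * math.cos(1.0 / x)


k = 10
x_neg = 1.0 / (2 * math.pi * k)        # cos(1/x) is close to +1, so f'(x) < 0
x_pos = 1.0 / ((2 * k + 1) * math.pi)  # cos(1/x) is close to -1, so f'(x) > 0

print(fprime(x_neg), fprime(x_pos))
print(f(x_neg), f(x_pos))  # both positive, while f(0) = 0
```

Increasing `k` moves both sample points closer to 0 while the signs of $f'$ persist, which is the behavior the example describes.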
Figure 5.7 Graph of $f'(x) = 4x^3\left(2 + \sin \tfrac{1}{x}\right) - x^2 \cos \tfrac{1}{x}$, $x \neq 0$
The following theorem, besides being useful in computing right or left derivatives at a point, also states that the derivative (if it exists everywhere on an interval) can only have discontinuities of the second kind.
5.2.11 THEOREM Suppose $f : [a, b) \to \mathbb{R}$ is continuous on $[a, b)$ and differentiable on $(a, b)$. If $\lim_{x \to a^+} f'(x)$ exists, then $f'_+(a)$ exists and
$$f'_+(a) = \lim_{x \to a^+} f'(x).$$
Proof. Let $L = \lim_{x \to a^+} f'(x)$, which is assumed to exist. Given $\epsilon > 0$, there exists a $\delta > 0$ such that $|f'(x) - L|$ ...
... As in the remark following Theorem 5.2.9, since $g'(a) < 0$, there exists an $x_1 > a$ such that $g(x_1) < g(a)$. Also, since $g'(b) > 0$, there exists an $x_2 < b$ such that $g(x_2) < g(b)$. As a consequence, $g$ has an absolute minimum at some point $c \in (a, b)$. But then
$$g'(c) = f'(c) - \lambda = 0,$$
i.e., $f'(c) = \lambda$. $\square$
The previous theorem is often used in calculus to determine where a function is increasing or decreasing. Suppose it has been determined that the derivative $f'$ is zero at $c_1$ and $c_2$ with $c_1 < c_2$, and that $f'(x) \neq 0$ for all $x \in (c_1, c_2)$. Then by the previous theorem, it suffices to check the sign of the derivative at a single point in the interval $(c_1, c_2)$ to determine whether $f'$ is positive or negative on the whole interval $(c_1, c_2)$. Theorem 5.2.9 then allows us to determine whether $f$ is increasing or decreasing on $(c_1, c_2)$.
Inverse Function Theorem
We conclude this section with the following version of the inverse function theorem.
5.2.14 THEOREM (Inverse Function Theorem) Suppose $I \subset \mathbb{R}$ is an interval and $f : I \to \mathbb{R}$ is differentiable on $I$ with $f'(x) \neq 0$ for all $x \in I$. Then $f$ is one-to-one on $I$, the inverse function $f^{-1}$ is continuous and differentiable on $J = f(I)$ with
$$(f^{-1})'(f(x)) = \frac{1}{f'(x)}$$
for all $x \in I$.
Proof. Since $f'(x) \neq 0$ for all $x \in I$, by Theorem 5.2.13, $f'$ is either positive or negative on $I$. Assume that $f'(x) > 0$ for all $x \in I$. Then by Theorem 5.2.9, $f$ is strictly increasing on $I$ and by Theorem 4.4.12, $f^{-1}$ is continuous on $J = f(I)$. It remains to be shown that $f^{-1}$ is differentiable on $J$. Let $y_0 \in J$, and let $\{y_n\}$ be any sequence in $J$ with $y_n \to y_0$, and $y_n \neq y_0$ for all $n$. For each $n$, there exists $x_n \in I$ such that $f(x_n) = y_n$. Since $f^{-1}$ is continuous, $x_n \to x_0 = f^{-1}(y_0)$. Hence
$$\lim_{n \to \infty} \frac{f^{-1}(y_n) - f^{-1}(y_0)}{y_n - y_0} = \lim_{n \to \infty} \frac{x_n - x_0}{f(x_n) - f(x_0)} = \frac{1}{f'(x_0)}.$$
Since this holds for any sequence $\{y_n\}$ with $y_n \to y_0$, $y_n \neq y_0$, by Theorem 4.1.3 and the definition of the derivative,
$$(f^{-1})'(y_0) = \frac{1}{f'(x_0)}. \qquad \square$$
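The formula $(f^{-1})'(f(x)) = 1/f'(x)$ can be checked numerically for a function whose inverse has no convenient closed form. The sketch below is an illustration under stated assumptions: the test function $f(x) = x^3 + x$ (for which $f'(x) = 3x^2 + 1 > 0$), the bracket $[-10, 10]$, and the helper name `f_inverse` are all our own choices and do not appear in the text.

```python
def f(x):
    """A strictly increasing test function: f'(x) = 3x^2 + 1 > 0 everywhere."""
    return x ** 3 + x


def fprime(x):
    return 3.0 * x ** 2 + 1.0


def f_inverse(y, lo=-10.0, hi=10.0, iterations=200):
    """Invert f by bisection; assumes the solution of f(x) = y lies in [lo, hi]."""
    for _ in range(iterations):
        mid = (lo + hi) / 2.0
        if f(mid) < y:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2.0


x0 = 1.0
y0 = f(x0)
h = 1e-5
# Central difference quotient of the inverse at y0, compared with 1 / f'(x0).
numeric = (f_inverse(y0 + h) - f_inverse(y0 - h)) / (2.0 * h)
predicted = 1.0 / fprime(x0)
print(numeric, predicted)
```

With $x_0 = 1$ the predicted value is $1/f'(1) = 1/4$.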
Remark. The hypothesis that $f'(x) \neq 0$ for all $x \in I$ is crucial. For example, the function $f(x) = x^3$ is strictly increasing on $[-1, 1]$ with $f'(0) = 0$. The inverse function $f^{-1}(y) = y^{1/3}$, however, is not differentiable at $y = 0$.
5.2.15 EXAMPLES
(a) As an application of the previous theorem, we show that $f(x) = x^{1/n}$, $x \in (0, \infty)$, $n \in \mathbb{N}$, is differentiable on $(0, \infty)$ with
$$f'(x) = \frac{1}{n} x^{1/n - 1}$$
for all $x \in (0, \infty)$. Consider the function $g(x) = x^n$, $n \in \mathbb{N}$, $\operatorname{Dom} g = (0, \infty)$. Then $g'(x) = n x^{n-1}$ and $g'(x) > 0$ for all $x \in (0, \infty)$. By the previous theorem, $g^{-1}$ is differentiable on $J = g((0, \infty)) = (0, \infty)$ with
$$(g^{-1})'(g(x)) = \frac{1}{g'(x)} = \frac{1}{n x^{n-1}}.$$
If we set $y = g(x) = x^n$, then $x = y^{1/n}$ and
$$(g^{-1})'(y) = \frac{1}{n (y^{1/n})^{n-1}} = \frac{1}{n} y^{1/n - 1}.$$
Since $f = g^{-1}$, the desired result follows.
(b) As in Example 5.2.7, let $L(x) = \ln x$ denote the natural logarithm function on $(0, \infty)$. Since $L'(x) = 1/x$ is strictly positive on $(0, \infty)$, the function $L$ is one-to-one, the inverse function $L^{-1}$ is continuous on $\mathbb{R} = \operatorname{Range} L$, and by Theorem 5.2.14,
$$(L^{-1})'(L(x)) = \frac{1}{L'(x)} = x.$$
If we set $E = L^{-1}$, then $E'(L(x)) = x$, or $E'(y) = E(y)$ where $y = L(x)$. The function $E(x)$, $x \in \mathbb{R}$, is called the natural exponential function on $\mathbb{R}$ and is usually denoted by $e^x$, where $e$ is Euler's number of Example 2.3.5. The exponential function $E(x)$ is considered in greater detail in Example 8.7.20.
(c) In this example we consider the inverse function of $g(x) = \cos x$, $x \in [0, \pi]$. Since $g'(x) = -\sin x$ is strictly negative for $x \in (0, \pi)$, the function $g$ is strictly decreasing on $[0, \pi]$ with $g([0, \pi]) = [-1, 1]$. Therefore its inverse function $g^{-1}$, which we denote by Arccos, exists on $[-1, 1]$. Thus for $y \in [-1, 1]$, $x = \operatorname{Arccos} y$ if and only if $y = \cos x$. Finally, since $g'(x) \neq 0$ for $x \in (0, \pi)$, by the inverse function theorem,
$$(g^{-1})'(g(x)) = \frac{1}{g'(x)} = \frac{-1}{\sin x} = \frac{-1}{\sqrt{1 - \cos^2 x}},$$
or since $y = \cos x$,
$$\frac{d}{dy} \operatorname{Arccos} y = \frac{-1}{\sqrt{1 - y^2}}.$$
The graphs of both $\cos x$, $x \in [0, \pi]$, and $\operatorname{Arccos} x$, $x \in [-1, 1]$, are given in Figure 5.8.
Figure 5.8 Graphs of $\cos x$, $x \in [0, \pi]$, and $\operatorname{Arccos} x$, $x \in [-1, 1]$
EXERCISES 5.2 1. For each of the following functions, determine the interval(s) where the function is increasing or decreasing, and find all local maxima and minima.
*a f(x)=x3+Gx5, xER
b.g(x)=4xx°, xER
c. h(x) = 1+x2x2, x e R
d. k(x) _ V
e.l(x)=x+7, x#0 2. Let f (x) _
2
x,
xz0
f f(x)xb, a*b,x*b
(x  a,)2, where a,, a2, ... , a are constants. Find the value of x where f is a minimum.
3. As in Example 5.2.7, use the mean value theorem to establish each of the following inequalities.
a. $\sqrt{1 + x} \leq 1 + \tfrac{1}{2}x$, $x > -1$
b. $e^x \geq 1 + x$, $x \in \mathbb{R}$
c. $(1 + x)^{\alpha} \geq 1 + \alpha x$, $x > -1$, $\alpha > 1$. For $\alpha \in \mathbb{N}$, this inequality was proved by mathematical induction in Example 1.3.3(b). In this exercise, and in Exercise 4(b), you may assume that for $\alpha \in \mathbb{R}$,
$$\frac{d}{dx} x^{\alpha} = \alpha x^{\alpha - 1}.$$
4. Prove each of the following inequalities.
a.all. b""0,0 0 there exists a S > 0 such that
f(t)  f(x)
tx
R with f. (a) > 0. Prove that there exists a S > 0 such that f (x) > f(a) for all
x, a 0, let f(x) = x'. Prove that f'(x) = rx''. 22. Suppose L : (0, oo)
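$\to \mathbb{R}$ is a differentiable function satisfying $L'(x) = 1/x$ with $L(1) = 0$. Prove each of the following.
a. $L(ab) = L(a) + L(b)$ for all $a, b \in (0, \infty)$
b. $L(1/b) = -L(b)$, $b > 0$
c. $L(b^r) = rL(b)$, $b > 0$, $r \in \mathbb{R}$
d. $L(e) = 1$, where $e$ is Euler's number of Example 2.3.5
e. $\operatorname{Range} L = \mathbb{R}$
23. Let $g(x) = \tan x$, $-\tfrac{\pi}{2} < x < \tfrac{\pi}{2}$.
a. Show that $g$ is one-to-one on $\left(-\tfrac{\pi}{2}, \tfrac{\pi}{2}\right)$ with $\operatorname{Range} g = \mathbb{R}$.
*b. Let $\operatorname{Arctan} x$, $x \in \mathbb{R}$, denote the inverse function of $g$. Use Theorem 5.2.14 to prove that
$$\frac{d}{dx} \operatorname{Arctan} x = \frac{1}{1 + x^2}.$$
24. a. Show that $f(x) = \sin x$ is one-to-one on $\left[-\tfrac{\pi}{2}, \tfrac{\pi}{2}\right]$ with $f\left(\left[-\tfrac{\pi}{2}, \tfrac{\pi}{2}\right]\right) = [-1, 1]$.
b. For $x \in [-1, 1]$, let $\operatorname{Arcsin} x$ denote the inverse function of $f$. Show that $\operatorname{Arcsin} x$ is differentiable on $(-1, 1)$, and find the derivative of $\operatorname{Arcsin} x$.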
25. Let $f : (0, \infty) \to \mathbb{R}$ be differentiable on $(0, \infty)$ and suppose that $\lim_{x \to \infty} f'(x) = L$.
a. Show that for any $h > 0$, $\lim_{x \to \infty} \dfrac{f(x + h) - f(x)}{h} = L$.
b. Show that $\lim_{x \to \infty} \dfrac{f(x)}{x} = L$.
5.3 L'Hospital's Rule
As another application of the mean value theorem, we now prove l'Hospital's rule for evaluating limits. Although the theorem is named after the Marquis de l'Hospital (1661-1704), it should be called Bernoulli's rule. The story is that in 1691, l'Hospital asked Johann Bernoulli (1667-1748) to provide, for a fee, lectures on the new subject of calculus. L'Hospital subsequently incorporated these lectures into the first calculus text, L'Analyse des infiniment petits (Analysis of infinitely small quantities), published in 1696. The initial version (stated without the use of limits) of what is now known as l'Hospital's rule first appeared in this text.
Infinite Limits
Since l'Hospital's rule allows for infinite limits, we provide the following definitions.
5.3.1 DEFINITION Let $f$ be a real-valued function defined on a subset $E$ of $\mathbb{R}$ and let $p$ be a limit point of $E$. We say that $f$ tends to $\infty$, or diverges to $\infty$, as $x$ approaches $p$, denoted
$$\lim_{x \to p} f(x) = \infty,$$
if for every $M \in \mathbb{R}$, there exists a $\delta > 0$ such that $f(x) > M$ for all $x \in E$ with $0 < |x - p| < \delta$. Similarly,
$$\lim_{x \to p} f(x) = -\infty,$$
if for every $M \in \mathbb{R}$, there exists a $\delta > 0$ such that $f(x) < M$ for all $x \in E$ with $0 < |x - p| < \delta$.
For $f$ defined on an appropriate subset $E$ of $\mathbb{R}$, it is also possible to define each of the following limits:
$$\lim_{x \to p^+} f(x) = \pm\infty, \qquad \lim_{x \to p^-} f(x) = \pm\infty, \qquad \lim_{x \to \infty} f(x) = \pm\infty, \qquad \lim_{x \to -\infty} f(x) = \pm\infty.$$
Since these definitions are similar to Definitions 4.1.11 and 4.4.1, they are left to the exercises (Exercise 1).
Remark. Since we now allow the possibility of a function having infinite limits, it needs to be emphasized that when we say that a function $f$ has a limit at $p \in \mathbb{R}$ (or at $\pm\infty$), we mean a finite limit.
L'Hospital's Rule
L'Hospital's rule is useful for evaluating limits of the form
$$\lim_{x \to p} \frac{f(x)}{g(x)},$$
where either (a) $\lim_{x \to p} f(x) = \lim_{x \to p} g(x) = 0$ or (b) $f$ and $g$ tend to $\pm\infty$ as $x \to p$. If (a) holds, then $\lim_{x \to p} (f(x)/g(x))$ is usually referred to as indeterminate of form $0/0$, whereas in (b) the limit is referred to as indeterminate of form $\infty/\infty$. The reason that (a) and (b) are indeterminate is that previous methods may no longer apply. In (a), if either $\lim_{x \to p} f(x)$ or $\lim_{x \to p} g(x)$ is nonzero, then previous methods discussed in Section 4.1 apply. For example, if both $f$ and $g$ have limits at $p$ and $\lim_{x \to p} g(x) \neq 0$, then by Theorem 4.1.6(c),
$$\lim_{x \to p} \frac{f(x)}{g(x)} = \frac{\lim_{x \to p} f(x)}{\lim_{x \to p} g(x)}.$$
On the other hand, if $\lim_{x \to p} f(x) = A \neq 0$ and $g(x) > 0$ with $\lim_{x \to p} g(x) = 0$, then as $x \to p$, $f(x)/g(x)$ tends to $\infty$ if $A > 0$, and to $-\infty$ if $A < 0$ (Exercise 5). However, if $\lim_{x \to p} f(x) = \lim_{x \to p} g(x) = 0$, then unless the quotient $f(x)/g(x)$ can somehow be simplified, previous methods may no longer be applicable.
5.3.2 THEOREM (L'Hospital's Rule) Suppose $f$, $g$ are real-valued differentiable functions on $(a, b)$, with $g'(x) \neq 0$ for all $x \in (a, b)$, where $-\infty \leq a < b \leq \infty$. Suppose
$$\lim_{x \to a^+} \frac{f'(x)}{g'(x)} = L, \qquad \text{where } L \in \mathbb{R} \cup \{-\infty, \infty\}.$$
If
(a) $\lim_{x \to a^+} f(x) = \lim_{x \to a^+} g(x) = 0$, or
(b) $\lim_{x \to a^+} g(x) = \pm\infty$,
then
$$\lim_{x \to a^+} \frac{f(x)}{g(x)} = L.$$
Remark. The analogous result where $x \to b^-$ is obviously also true. A more elementary version of l'Hospital's rule, which relies only on the definition of the derivative, is given in Exercise 2. Also, Exercise 7 provides examples of two functions $f$ and $g$ satisfying (a) for which $\lim_{x \to a} (f(x)/g(x))$ exists but $\lim_{x \to a} (f'(x)/g'(x))$ does not exist.
Proof. (a) Suppose (a) holds. We first prove the case where $a$ is finite. Let $\{x_n\}$ be a sequence in $(a, b)$ with $x_n \to a$ and $x_n \neq a$ for all $n$. Since we want to apply the generalized mean value theorem to $f$ and $g$ on the interval $[a, x_n]$, we need both $f$ and $g$ continuous at $a$. This is accomplished by setting $f(a) = g(a) = 0$. Then by hypothesis (a), $f$ and $g$ are continuous at $a$. Thus by the generalized mean value theorem, for each $n \in \mathbb{N}$ there exists $c_n$ between $a$ and $x_n$ such that
$$[f(x_n) - f(a)]\, g'(c_n) = [g(x_n) - g(a)]\, f'(c_n),$$
or
$$\frac{f(x_n)}{g(x_n)} = \frac{f'(c_n)}{g'(c_n)}.$$
Note, since $g'(x) \neq 0$ for all $x \in (a, b)$, $g(x_n) \neq g(a)$ for all $n$. As $n \to \infty$, $c_n \to a^+$. Thus by Theorem 4.1.3 and the hypothesis,
$$\lim_{n \to \infty} \frac{f(x_n)}{g(x_n)} = \lim_{x \to a^+} \frac{f'(x)}{g'(x)} = L.$$
Since the above holds for every sequence $\{x_n\}$ with $x_n \to a^+$, the result follows.
Suppose $a = -\infty$. To handle this case, we make the substitution $x = -1/t$. Then as $t \to 0^+$, $x \to -\infty$. Define the functions $\varphi(t)$ and $\psi(t)$ on $(0, c)$ for some $c > 0$ by
$$\varphi(t) = f\left(-\frac{1}{t}\right) \qquad \text{and} \qquad \psi(t) = g\left(-\frac{1}{t}\right).$$
We leave it as an exercise (Exercise 3) to verify that
$$\lim_{t \to 0^+} \frac{\varphi'(t)}{\psi'(t)} = \lim_{x \to -\infty} \frac{f'(x)}{g'(x)} = L,$$
and that $\lim_{t \to 0^+} \varphi(t) = \lim_{t \to 0^+} \psi(t) = 0$. Thus by the above,
$$\lim_{x \to -\infty} \frac{f(x)}{g(x)} = \lim_{t \to 0^+} \frac{\varphi(t)}{\psi(t)} = L.$$
(b) Suppose $\lim_{x \to a^+} g(x) = \infty$. The case in which $g(x) \to -\infty$ is treated similarly. Rather than treating the finite case and infinite case separately, we provide a proof that works for both.
Suppose first that $-\infty \leq L < \infty$, and $\beta \in \mathbb{R}$ satisfies $\beta > L$. Choose $r$ such that $L < r < \beta$. Since
$$\lim_{x \to a^+} \frac{f'(x)}{g'(x)} = L < r,$$
there exists $c_1 \in (a, b)$ such that
$$\frac{f'(\zeta)}{g'(\zeta)} < r \qquad \text{for all } \zeta,\ a < \zeta < c_1.$$
Fix a $y$, $a < y < c_1$. Since $g(x) \to \infty$ as $x \to a^+$, there exists a $c_2$, $a < c_2 < y$, such that $g(x) > g(y)$ and $g(x) > 0$ for all $x$, $a < x < c_2$. Let $x \in (a, c_2)$ be arbitrary. Then by the generalized mean value theorem, there exists $\zeta \in (x, y)$ such that
$$\frac{f(x) - f(y)}{g(x) - g(y)} = \frac{f'(\zeta)}{g'(\zeta)} < r. \qquad (4)$$
Multiplying inequality (4) by $(g(x) - g(y))/g(x)$, which is positive, we obtain
$$\frac{f(x) - f(y)}{g(x)} < r\left(1 - \frac{g(y)}{g(x)}\right),$$
or
$$\frac{f(x)}{g(x)} < r - r\,\frac{g(y)}{g(x)} + \frac{f(y)}{g(x)}.$$
...
$a > 0$ and consider the function
$$f(x) = x^2 - a.$$
If $a > 1$, then $f$ has exactly one zero on $[0, a]$, namely $\sqrt{a}$. If $0 < a < 1$, then the zero of $f$ lies in $[0, 1]$. Let $c_1$ be an initial guess to $\sqrt{a}$. Then by formula (8), for $n \geq 1$,
$$c_{n+1} = c_n - \frac{c_n^2 - a}{2c_n} = \frac{1}{2}\left(c_n + \frac{a}{c_n}\right).$$
This is exactly the sequence of Exercise 9 of Section 2.3, where the reader was asked to prove that the sequence converges to $\sqrt{a}$. With $a = 2$, taking $c_1 = 1.4$ as an initial guess yields
$$c_2 = 1.4142857, \qquad c_3 = 1.4142135,$$
which is already correct to at least seven decimal places.
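The iteration $c_{n+1} = \tfrac{1}{2}(c_n + a/c_n)$ is short enough to run directly. The following sketch reproduces the first iterates for $a = 2$ and $c_1 = 1.4$; the function name `newton_sqrt` is a hypothetical label of our own, and the printed digits depend on floating-point rounding.

```python
import math


def newton_sqrt(a, c1, steps):
    """Return [c_1, c_2, ..., c_{steps+1}] for c_{n+1} = (c_n + a / c_n) / 2."""
    iterates = [c1]
    for _ in range(steps):
        c = iterates[-1]
        iterates.append(0.5 * (c + a / c))
    return iterates


iterates = newton_sqrt(2.0, 1.4, 2)  # [c_1, c_2, c_3]
for n, c in enumerate(iterates, start=1):
    print(f"c_{n} = {c:.7f}   error = {abs(c - math.sqrt(2.0)):.2e}")
```

The error column makes the rapid convergence visible: each step roughly squares the previous error.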
5.4.2 THEOREM Let $f$ be a real-valued function on $[a, b]$ which is twice differentiable on $[a, b]$. Suppose that $f(a)f(b) < 0$ and that there exist constants $m$ and $M$ such that $|f'(x)| \geq m > 0$ and $|f''(x)| \leq M$ for all $x \in [a, b]$. Then there exists a subinterval $I$ of $[a, b]$ containing a zero $c$ of $f$ such that for any $c_1 \in I$, the sequence $\{c_n\}$ defined by
$$c_{n+1} = c_n - \frac{f(c_n)}{f'(c_n)}, \qquad n \in \mathbb{N},$$
is in $I$, and $\lim_{n \to \infty} c_n = c$. Furthermore,
$$|c_{n+1} - c| \leq \frac{M}{2m} |c_n - c|^2. \qquad (9)$$
Prior to proving Theorem 5.4.2 we first state and prove the following lemma. The result is in fact a special case of Taylor's theorem (8.7.16), which will be discussed in Chapter 8.
5.4.3 LEMMA Suppose $f : [a, b] \to \mathbb{R}$ is such that $f$ and $f'$ are continuous on $[a, b]$ and $f''(x)$ exists for all $x \in (a, b)$. Let $x_0 \in [a, b]$. Then for any $x \in [a, b]$, there exists a real number $\zeta$ between $x_0$ and $x$ such that
$$f(x) = f(x_0) + f'(x_0)(x - x_0) + \tfrac{1}{2} f''(\zeta)(x - x_0)^2.$$
Proof. For $x \in [a, b]$, let $\alpha \in \mathbb{R}$ be determined by
$$f(x) = f(x_0) + f'(x_0)(x - x_0) + \alpha (x - x_0)^2.$$
Define $g$ on $[a, b]$ by
$$g(t) = f(t) - f(x_0) - f'(x_0)(t - x_0) - \alpha (t - x_0)^2.$$
If $x = x_0$, then the conclusion is true with $\zeta = x_0$. Assume that $x > x_0$. Then $g$ is continuous and differentiable on $[x_0, x]$ with $g(x_0) = g(x) = 0$. Thus by Rolle's theorem there exists $c \in (x_0, x)$ such that $g'(c) = 0$. But
$$g'(t) = f'(t) - f'(x_0) - 2\alpha (t - x_0).$$
By hypothesis, $g'$ is continuous on $[x_0, c]$, differentiable on $(x_0, c)$, and satisfies $g'(x_0) = g'(c) = 0$. Thus by Rolle's theorem again, there exists $\zeta \in (x_0, c)$ such that $g''(\zeta) = 0$. But
$$g''(t) = f''(t) - 2\alpha.$$
Therefore, $\alpha = \tfrac{1}{2} f''(\zeta)$.
Proof of Theorem 5.4.2. Since $f(a)f(b) < 0$ and $f'(x) \neq 0$ for all $x \in [a, b]$, $f$ has exactly one zero $c$ in the interval $(a, b)$. Let $x_0 \in [a, b]$ be arbitrary. By Lemma 5.4.3 there exists a point $\zeta$ between $c$ and $x_0$ such that
$$0 = f(c) = f(x_0) + f'(x_0)(c - x_0) + \tfrac{1}{2} f''(\zeta)(c - x_0)^2,$$
or
$$-f(x_0) = f'(x_0)(c - x_0) + \tfrac{1}{2} f''(\zeta)(c - x_0)^2. \qquad (10)$$
If $x_1$ is defined by
$$x_1 = x_0 - \frac{f(x_0)}{f'(x_0)},$$
then by equation (10),
$$x_1 = x_0 + (c - x_0) + \frac{1}{2} \frac{f''(\zeta)}{f'(x_0)} (c - x_0)^2.$$
Therefore,
$$|x_1 - c| = \frac{1}{2} \frac{|f''(\zeta)|}{|f'(x_0)|} |c - x_0|^2 \leq \frac{M}{2m} |c - x_0|^2. \qquad (11)$$
Choose s>0sothat 8 0 was arbitrary, we have j' If = fo f. Thereforef is integrable on [0, 11 with f, f = 2.
(c) We now provide another example to illustrate how tedious even a trivial integral can be if one relies only on the definition of the integral. Luckily, the fundamental theorem of calculus (Theorem 6.3.2) will allow us to avoid such tedious computations. Let $f(x) = x$, $x \in [a, b]$. For the purpose of illustration we take $a > 0$ (Figure 6.3). Interpreting the integral as the area under the curve, we intuitively see that
$$\int_a^b x\,dx = \tfrac{1}{2}(b - a)(b + a) = \tfrac{1}{2}(b^2 - a^2).$$
This is obtained from the formula for the area of a trapezoid. Let $\mathcal{P} = \{x_0, x_1, \ldots, x_n\}$ be any partition of $[a, b]$. Since $f(x) = x$ is increasing on $[a, b]$,
$$m_i = f(x_{i-1}) = x_{i-1} \qquad \text{and} \qquad M_i = f(x_i) = x_i.$$
Figure 6.3
Therefore,
$$\mathcal{L}(\mathcal{P}, f) = \sum_{i=1}^{n} x_{i-1}\, \Delta x_i \qquad \text{and} \qquad \mathcal{U}(\mathcal{P}, f) = \sum_{i=1}^{n} x_i\, \Delta x_i.$$
For each index $i$, $x_{i-1} < \tfrac{1}{2}(x_{i-1} + x_i)$ ...
... $\epsilon > 0$, there exists a partition $\mathcal{P}$ of $[a, b]$ such that
$$\mathcal{U}(\mathcal{P}, f) - \mathcal{L}(\mathcal{P}, f) < \epsilon. \qquad (2)$$
Furthermore, if $\mathcal{P}$ is a partition of $[a, b]$ for which inequality (2) holds, then the inequality also holds for all refinements of $\mathcal{P}$.
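For the increasing function $f(x) = x$ the lower and upper sums have the explicit form given above, so inequality (2) can be observed directly. The sketch below uses uniform partitions, which is our own simplifying assumption (the text allows arbitrary partitions), and the function name `lower_upper_sums` is hypothetical. For a uniform partition with $n$ subintervals the difference of the two sums is $(b - a)^2/n$.

```python
def lower_upper_sums(a, b, n):
    """Lower and upper sums of f(x) = x on [a, b] for the uniform partition
    with n subintervals, using m_i = x_{i-1} and M_i = x_i."""
    xs = [a + (b - a) * i / n for i in range(n + 1)]
    lower = sum(xs[i - 1] * (xs[i] - xs[i - 1]) for i in range(1, n + 1))
    upper = sum(xs[i] * (xs[i] - xs[i - 1]) for i in range(1, n + 1))
    return lower, upper


a, b = 1.0, 3.0
exact = 0.5 * (b ** 2 - a ** 2)  # area of the trapezoid under y = x
lower, upper = lower_upper_sums(a, b, 100)
print(lower, exact, upper, upper - lower)
```

Doubling `n` halves `upper - lower`, so any prescribed $\epsilon > 0$ is eventually met, in line with inequality (2).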
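Proof. Suppose inequality (2) holds for a given $\epsilon > 0$. Then
$$0 \leq \overline{\int_a^b} f - \underline{\int_a^b} f$$ ...
... $\epsilon > 0$, choose $n \in \mathbb{N}$ such that
$$\frac{(b - a)}{n} [f(b) - f(a)] < \epsilon.$$
For this $n$ and corresponding partition $\mathcal{P}$, $\mathcal{U}(\mathcal{P}, f) - \mathcal{L}(\mathcal{P}, f) < \epsilon$. Thus $f$ is integrable on $[a, b]$.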
The Composition Theorem
We next prove that the composition $\varphi \circ f$, of a continuous function $\varphi$ with a Riemann integrable function $f$, is again Riemann integrable. As an application of Lebesgue's theorem we will present a much shorter proof of this result later in the section.
6.1.9 THEOREM Let $f$ be a bounded Riemann integrable function on $[a, b]$ with $\operatorname{Range} f \subset [c, d]$. If $\varphi$ is continuous on $[c, d]$, then $\varphi \circ f$ is Riemann integrable on $[a, b]$.
Proof. Since $\varphi$ is continuous on the closed and bounded interval $[c, d]$, $\varphi$ is bounded and uniformly continuous on $[c, d]$. Let $K = \sup\{|\varphi(t)| : t \in [c, d]\}$, and let $\epsilon > 0$ be given. Set $\epsilon' = \epsilon/(b - a + 2K)$. Since $\varphi$ is uniformly continuous on $[c, d]$, there exists $\delta$, $0 < \delta < \epsilon'$, such that
$$|\varphi(s) - \varphi(t)| < \epsilon' \qquad (4)$$
for all $s, t \in [c, d]$ with $|s - t| < \delta$. Furthermore, since $f \in \mathcal{R}[a, b]$, by Theorem 6.1.7 there exists a partition $\mathcal{P} = \{x_0, \ldots, x_n\}$ of $[a, b]$ such that
$$\mathcal{U}(\mathcal{P}, f) - \mathcal{L}(\mathcal{P}, f) < \delta^2.$$
To complete the proof we will show that
$$\mathcal{U}(\mathcal{P}, \varphi \circ f) - \mathcal{L}(\mathcal{P}, \varphi \circ f) \leq \epsilon. \qquad (5)$$
By Theorem 6.1.7 it then follows that $\varphi \circ f \in \mathcal{R}[a, b]$. For each $k = 1, 2, \ldots, n$, let $m_k$ and $M_k$ denote the infimum and supremum of $f$ on $[x_{k-1}, x_k]$. Also, set
$$m_k^* = \inf\{\varphi(f(t)) : t \in [x_{k-1}, x_k]\} \qquad \text{and} \qquad M_k^* = \sup\{\varphi(f(t)) : t \in [x_{k-1}, x_k]\}.$$
We partition the set $\{1, 2, \ldots, n\}$ into disjoint sets $A$ and $B$ as follows:
$A = \{k : M_k - m_k$ ...
... $c_n > 0$ are such that $\sum c_n$ converges, then by Theorem 4.4.10,
$$f(x) = \sum_{n=1}^{\infty} c_n I(x - x_n)$$
is monotone increasing on $[0, 1]$, and thus is Riemann integrable on $[0, 1]$. By Theorem 4.4.10, the function $f$ is continuous at every irrational number and discontinuous at every rational number in $[0, 1]$.
We now state the beautiful result of Lebesgue that provides necessary and sufficient conditions that a bounded real-valued function on $[a, b]$ be Riemann integrable. To properly state Lebesgue's result we need to introduce the idea of a set of measure zero. The concept of measure of a set will be treated in detail in Chapter 10. The basic idea is that the measure of an interval is its length. This is then used to define what we mean by measurable set and the measure of a measurable set. At this point we only need to know what it means for a set to have measure zero.
6.1.11 DEFINITION A subset $E$ of $\mathbb{R}$ has measure zero if given any $\epsilon > 0$, there exists a finite or countable collection $\{I_n\}_n$ of open intervals such that
$$E \subset \bigcup_n I_n \qquad \text{and} \qquad \sum_n l(I_n) < \epsilon,$$
where $l(I_n)$ denotes the length of the interval $I_n$.
EXAMPLES
(a) Every finite set E has measure zero. Suppose E = {x1,
. .
. , xN} is a finite subset of
R. For each n = 1, 2, ... , N, as in Figure 6.4 (with N = 6), let
/
E
Nxn+2N, .
Then N
N
ECU /
> l(!n) = e.
and
n=1
n=1
Therefore, E has measure zero. it
14
12
(0) (0) Xt
X4
X2
16
l5
X3 X6
XS
13
Figure 6.4
220
Chapter 6
The Riemann and RiemannStleltjea Integral
(b) Every countable subset of R has measure zero. Suppose E = {x,}00 I is a countable
subset of R. Let f > 0 be given. For each n E N. let E l
E
Since xn E 1 for all n, E C U. 11,,. Thus since 1(I4) = E12n, 00
j l(In) = E
w=1
2n = E. n=1
As an example, the set Q of rational numbers has measure zero.
(c) The Cantor set Pin [0, 1] has measure zero (Exercise 21). We now state the following theorem of Henri Lebesgue, the proof of which will be given in Section 6.7. This result appeared in 1902 and provides the most succinct form of necessary and sufficient conditions for Riemann integrability.
6.1.13 THEOREM (Lebesgue) A bounded real-valued function $f$ on $[a, b]$ is Riemann integrable if and only if the set of discontinuities of $f$ has measure zero.
Remark. If $f$ is continuous on $[a, b]$, then clearly $f$ satisfies the hypothesis of Theorem 6.1.13 and thus is Riemann integrable. If $f$ is a bounded function that is continuous except at a finite number of points, then by Example 6.1.12(a) the set of discontinuities of $f$ has measure zero. Hence $f \in \mathcal{R}[a, b]$. If $f$ is monotone on $[a, b]$, then by Corollary 4.4.8, the set of discontinuities of $f$ is at most countable, and thus by Example 6.1.12(b), has measure zero. Hence again $f \in \mathcal{R}[a, b]$.
As an application of Lebesgue's theorem we give the following short proof of Theorem 6.1.9.
Proof of Theorem 6.1.9 Using Lebesgue's Theorem. As in Theorem 6.1.9, suppose $f \in \mathcal{R}[a, b]$ with $\operatorname{Range} f \subset [c, d]$, and suppose $\varphi : [c, d] \to \mathbb{R}$ is continuous. Let
$$E = \{x \in [a, b] : f \text{ is not continuous at } x\} \qquad \text{and} \qquad F = \{x \in [a, b] : \varphi \circ f \text{ is not continuous at } x\}.$$
By Theorem 4.2.4, $F \subset E$. Since $f$ is Riemann integrable on $[a, b]$, the set $E$ has measure zero, and as a consequence so does the set $F$. Therefore, $\varphi \circ f \in \mathcal{R}[a, b]$. $\square$
6.1.14 EXAMPLES
(a) As in Example 4.2.2(g), let $f$ be defined on $[0, 1]$ by
$$f(x) = \begin{cases} 1, & x = 0, \\ 0, & \text{if } x \text{ is irrational}, \\ \dfrac{1}{n}, & \text{if } x = \dfrac{m}{n} \text{ in lowest terms, } x \neq 0. \end{cases}$$
Since $f$ is continuous except at the rational numbers, which have measure zero, $f$ is Riemann integrable on $[0, 1]$. Furthermore, since $\mathcal{L}(\mathcal{P}, f) = 0$ for all partitions $\mathcal{P}$ of $[0, 1]$,
$$\int_0^1 f(x)\,dx = 0.$$
(b) Let $f$ be the Riemann integrable function on $[0, 1]$ given in (a), and let $g : [0, 1] \to \mathbb{R}$ be defined by
$$g(x) = \begin{cases} 0, & x = 0, \\ 1, & x \in (0, 1]. \end{cases}$$
Since $g$ is continuous except at 0, $g \in \mathcal{R}[0, 1]$. But for $x \in [0, 1]$,
$$(g \circ f)(x) = \begin{cases} 1, & \text{if } x \text{ is rational}, \\ 0, & \text{if } x \text{ is irrational}. \end{cases}$$
By Example 6.1.6(a), $g \circ f \notin \mathcal{R}[0, 1]$.
EXERCISES 6.1 1. Let f (x) = 1  x2, x E [ 1, 2]. Find Sf(9, f) and qt(9J) for each of the following partitions of [ 1, 2].
a. 9 = {1,0, 1,2) b.9={1,12,12,1,2,2} 2. Show that each of the following functions is Riemann integrable on [0, 2], and use the definition to find f02 I
*a f(x) =
2, 15 x 5 2 '
b f (x)
1,
0, sx < 1
3,
25 x 0 be given, and let S > 0 be such that inequality ( 9 ) holds f o r all partitions 9 = {x0, xl, ... , xn} of [a, b) with 11211 < S, and all ti E [xi_ 1, xi]. By the definition of Mi, for each i = 1, ... , n, there exists i E [xi_ 1, xi] such that f M,  e. Thus n
OU(911f) =
I  E[ 1 + b  a]. Therefore, °U.(9, f)  2(91,f) < 2E[ 1 + b  a]. Thus as a consequence of Theorem 6.1.7, f E 9t[a, b) with fa f = I. Conversely, suppose f r= Jt[a, b]. Let M > 0 be such that I f(x)I s M for all x E [a, b]. Let e > 0 be given. Since f E 9t[a, b], by Theorem 6.1.7 there exists a partition . of [a, b] such that Jb
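... $h > 0$. Then by Theorem 6.2.3,
$$F(c + h) - F(c) = \int_a^{c+h} f(t)\,dt - \int_a^{c} f(t)\,dt = \int_c^{c+h} f(t)\,dt.$$
Therefore,
$$\frac{F(c + h) - F(c)}{h} - f(c) = \frac{1}{h} \int_c^{c+h} f(t)\,dt - f(c) = \frac{1}{h} \int_c^{c+h} [f(t) - f(c)]\,dt.$$
Let $\epsilon > 0$ be given. Since $f$ is continuous at $c$, there exists a $\delta > 0$ such that $|f(t) - f(c)| < \epsilon$ for all $t$, $|t - c| < \delta$. Therefore, if $0 < h < \delta$,
$$\left|\frac{F(c + h) - F(c)}{h} - f(c)\right| \leq \frac{1}{h} \int_c^{c+h} |f(t) - f(c)|\,dt$$ ...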
0. Furthermore, since $L'(x) > 0$ for all $x \in (0, \infty)$, $L$ is strictly increasing on $(0, \infty)$. We now prove that the function $L(x)$ satisfies the usual properties of a logarithm function; namely,
(a) $L(ab) = L(a) + L(b)$ for all $a, b > 0$,
(b) $L(1/b) = -L(b)$, $b > 0$, and
(c) $L(b^r) = rL(b)$, $b > 0$, $r \in \mathbb{R}$.
To prove (a), consider the function $L(ax)$, $x > 0$. By the chain rule (Theorem 5.1.6),
$$\frac{d}{dx} L(ax) = \frac{1}{ax} \cdot a = \frac{1}{x} = L'(x).$$
Thus by Theorem 5.2.9, $L(ax) = L(x) + C$ for some constant $C$. From the definition of $L$ we have $L(1) = 0$. Therefore, $L(a) = L(1) + C = C$.
Hence $L(ax) = L(a) + L(x)$ for all $x > 0$, which proves (a). The proof of (b) proceeds analogously. It is worth noting that for the proof of (a) and (b) we only used the fact that $L'(x) = 1/x$ and $L(1) = 0$. To prove (c), if $n \in \mathbb{N}$, then by (a) $L(b^n) = nL(b)$. Also by (b),
$$L(b^{-n}) = L\left(\left(\tfrac{1}{b}\right)^{n}\right) = nL\left(\tfrac{1}{b}\right) = -nL(b).$$
Therefore, $L(b^n) = nL(b)$ for all $n \in \mathbb{Z}$. Consider $L(\sqrt[n]{b})$ where $n \in \mathbb{N}$. Since $nL(\sqrt[n]{b}) = L(b)$, $L(\sqrt[n]{b}) = \frac{1}{n}L(b)$. Therefore, $L(b^r) = rL(b)$ for all $r \in \mathbb{Q}$. Since $L$ is continuous, the above holds for all $r \in \mathbb{R}$.
Our final step will be to prove that $L(e) = 1$, where $e$ is Euler's number of Example 2.3.5. To accomplish this we use the definition to compute the derivative of $L$ at 1. Since $L'(1)$ exists,
$$1 = L'(1) = \lim_{n\to\infty} \frac{L\left(1 + \frac{1}{n}\right) - L(1)}{\frac{1}{n}} = \lim_{n\to\infty} nL\left(1 + \frac{1}{n}\right) = \lim_{n\to\infty} L\left(\left(1 + \frac{1}{n}\right)^{n}\right) = L(e).$$
The last equality follows by the continuity of $L$ and the definition of $e$. Therefore, $L(e) = 1$ and the function $L(x)$ is the logarithm function to the base $e$. This function is usually denoted by $\log_e x$ or $\ln x$, and is called the natural logarithm function.
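The properties just derived can be checked numerically. The following is a minimal sketch, not part of the text: it approximates $L(x) = \int_1^x dt/t$ by the midpoint rule (the function names `L` and `n_times_L` and the step count are illustrative assumptions) and compares both sides of (a), (b), (c), and the limit used to show $L(e) = 1$.

```python
import math

def L(x, n=20000):
    """Midpoint-rule approximation of L(x) = integral from 1 to x of dt/t."""
    if x <= 0:
        raise ValueError("L is defined only for x > 0")
    h = (x - 1.0) / n  # signed step, so L(x) < 0 when 0 < x < 1
    return h * sum(1.0 / (1.0 + (i + 0.5) * h) for i in range(n))

def n_times_L(n):
    """Difference quotient of L at 1 with step 1/n, i.e. n * L(1 + 1/n)."""
    return n * L(1.0 + 1.0 / n)

if __name__ == "__main__":
    a, b = 2.0, 3.5
    print(L(a * b), L(a) + L(b))      # property (a)
    print(L(1.0 / b), -L(b))          # property (b)
    print(L(b ** 0.5), 0.5 * L(b))    # property (c) with r = 1/2
    print(L(math.e))                  # should be close to 1
    print([n_times_L(n) for n in (1, 10, 100, 1000)])  # increases toward 1
```

The quadrature only approximates the integral, so agreement is up to the discretization error of the midpoint rule.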
Consequences of the Fundamental Theorem of Calculus
We now prove several other consequences of Theorem 6.3.4. Our first result is the mean value theorem for integrals.
6.3.6 THEOREM (Mean Value Theorem for Integrals) Let $f$ be a continuous real-valued function on $[a,b]$. Then there exists $c \in [a,b]$ such that
$$\int_a^b f = f(c)(b - a).$$
Proof. Let $F(x) = \int_a^x f$. Since $f$ is continuous on $[a,b]$, $F'(x) = f(x)$ for all $x \in [a,b]$. Thus by the mean value theorem (Theorem 5.2.6), there exists $c \in [a,b]$ such that
$$\int_a^b f = F(b) - F(a) = F'(c)(b - a) = f(c)(b - a). \quad \square$$
An alternative proof of the above can also be based on the intermediate value theorem using the continuity of $f$. This alternative method will be used in the proof of the analogous result for the Riemann-Stieltjes integral.
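As an illustration of Theorem 6.3.6, the sketch below locates a point $c$ with $f(c)(b - a) = \int_a^b f$ numerically. It follows the intermediate value idea mentioned above rather than the proof given in the text, and it assumes in addition that $f$ is monotone on $[a,b]$ so that bisection applies directly; the helper names are hypothetical.

```python
def midpoint_integral(f, a, b, n=10000):
    """Midpoint-rule approximation of the integral of f over [a, b]."""
    h = (b - a) / n
    return h * sum(f(a + (i + 0.5) * h) for i in range(n))

def mean_value_point(f, a, b, tol=1e-12):
    """Bisection for c in [a, b] with f(c) equal to the average value of f.

    Assumes f is continuous and monotone on [a, b], so that
    f(a) - avg and f(b) - avg have opposite signs (or one is zero).
    """
    avg = midpoint_integral(f, a, b) / (b - a)
    lo, hi = a, b
    g_lo, g_hi = f(lo) - avg, f(hi) - avg
    if g_lo == 0:
        return lo
    if g_hi == 0:
        return hi
    if g_lo * g_hi > 0:
        raise ValueError("no sign change; f may not be monotone on [a, b]")
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        g_mid = f(mid) - avg
        if g_mid == 0:
            return mid
        if g_lo * g_mid < 0:
            hi = mid
        else:
            lo, g_lo = mid, g_mid
    return 0.5 * (lo + hi)

if __name__ == "__main__":
    # f(x) = x^2 on [0, 3]: the integral is 9, so f(c) = 3 and c = sqrt(3).
    print(mean_value_point(lambda x: x * x, 0.0, 3.0))
```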
6.3.7 THEOREM (Integration by Parts Formula) Let $f, g$ be differentiable functions on $[a,b]$ with $f', g' \in \mathcal{R}[a,b]$. Then
$$\int_a^b fg' = f(b)g(b) - f(a)g(a) - \int_a^b gf'.$$
Proof. Since $f, g$ are differentiable on $[a,b]$, they are continuous and thus also integrable on $[a,b]$. Therefore by Theorem 6.2.1(c), $fg'$ and $gf'$ are integrable on $[a,b]$. Since
$$(fg)' = gf' + fg',$$
the function $(fg)' \in \mathcal{R}[a,b]$. By the fundamental theorem of calculus (Theorem 6.3.2),
$$f(b)g(b) - f(a)g(a) = \int_a^b (fg)' = \int_a^b gf' + \int_a^b fg',$$
from which the result follows. $\square$
6.3.8 THEOREM (Change of Variable Theorem) Let $\varphi$ be differentiable on $[a,b]$ with $\varphi' \in \mathcal{R}[a,b]$. If $f$ is continuous on $I = \varphi([a,b])$, then
$$\int_a^b f(\varphi(t))\varphi'(t)\,dt = \int_{\varphi(a)}^{\varphi(b)} f(x)\,dx.$$
Proof. Since $\varphi$ is continuous, $I = \varphi([a,b])$ is a closed and bounded interval. Also, since $f \circ \varphi$ is continuous and $\varphi' \in \mathcal{R}[a,b]$, by Theorem 6.2.1(c), $(f \circ \varphi)\varphi' \in \mathcal{R}[a,b]$. If $I = \varphi([a,b])$ is a single point, then $\varphi$ is constant on $[a,b]$. In this case $\varphi'(t) = 0$ for all $t$ and both integrals above are zero. Otherwise, for $x \in I$ define
$$F(x) = \int_{\varphi(a)}^{x} f(s)\,ds.$$
Since $f$ is continuous, $F'(x) = f(x)$ for all $x \in I$. By the chain rule,
$$\frac{d}{dt} F(\varphi(t)) = F'(\varphi(t))\varphi'(t) = f(\varphi(t))\varphi'(t)$$
for all $t \in [a,b]$. Therefore by Theorem 6.3.2,
$$\int_a^b f(\varphi(t))\varphi'(t)\,dt = F(\varphi(b)) - F(\varphi(a)) = \int_{\varphi(a)}^{\varphi(b)} f(s)\,ds. \quad \square$$
Remark. Another version of the change of variable theorem is given in Exercise 10.
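6.3.9 EXAMPLE To illustrate the change of variable theorem, consider $\int_0^2 t/(1 + t^2)\,dt$. If we let $\varphi(t) = 1 + t^2$ and $f(x) = 1/x$, then
$$\int_0^2 \frac{t}{1 + t^2}\,dt = \frac{1}{2}\int_0^2 f(\varphi(t))\varphi'(t)\,dt,$$
which, by Theorem 6.3.8,
$$= \frac{1}{2}\int_1^5 \frac{1}{x}\,dx = \frac{1}{2}\ln 5.$$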
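The substitution in Example 6.3.9 can be sanity-checked numerically. The sketch below is an illustration only: it evaluates both sides with a midpoint rule (the helper `midpoint_integral` and the function names `lhs`, `rhs` are assumptions, not from the text) and compares them with $\frac12 \ln 5$.

```python
import math

def midpoint_integral(f, a, b, n=20000):
    """Midpoint-rule approximation of the integral of f over [a, b]."""
    h = (b - a) / n
    return h * sum(f(a + (i + 0.5) * h) for i in range(n))

def lhs():
    """Integral of t / (1 + t^2) over [0, 2]."""
    return midpoint_integral(lambda t: t / (1.0 + t * t), 0.0, 2.0)

def rhs():
    """(1/2) * integral of 1/x over [phi(0), phi(2)] = [1, 5]."""
    return 0.5 * midpoint_integral(lambda x: 1.0 / x, 1.0, 5.0)

if __name__ == "__main__":
    print(lhs(), rhs(), 0.5 * math.log(5.0))
```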
EXERCISES 6.3
1. *Let $f \in \mathcal{R}[a,b]$. For $x \in [a,b]$, set $F(x) = \int_a^x f$. Prove that $F$ is continuous on $[a,b]$.
2. For x E [0, 1 ], find F(x) = fo f(t) dt for each of the following functions f defined on [0, 11. In each case verify that F is continuous on [0, 11. and that F'(x) = f(x) at all points where f is continuous. 1,
a. f(x)=xz3x+5
0 s x v. This function has the property that the improper integral of $f$ on $[\pi, \infty)$ converges, but the improper integral of $|f|$ diverges. The proof of the convergence of the improper integral of $f$ is found in Exercise 7. Here we will show that
$$\int_\pi^\infty |f| = \int_\pi^\infty \frac{|\sin x|}{x}\,dx = \infty.$$
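For $n \in \mathbb{N}$, consider
$$\int_\pi^{(n+1)\pi} \frac{|\sin x|}{x}\,dx = \sum_{k=1}^{n} \int_{k\pi}^{(k+1)\pi} \frac{|\sin x|}{x}\,dx.$$
Since the integrand is nonnegative,
$$\int_{k\pi}^{(k+1)\pi} \frac{|\sin x|}{x}\,dx \ge \int_{(k+1/4)\pi}^{(k+3/4)\pi} \frac{|\sin x|}{x}\,dx.$$
On the interval $[(k + \tfrac14)\pi, (k + \tfrac34)\pi]$, $|\sin x| \ge \sqrt{2}/2$. Also,
$$\frac{1}{x} \ge \frac{1}{(k+1)\pi} \quad \text{for all } x \in \left[\left(k + \tfrac14\right)\pi, \left(k + \tfrac34\right)\pi\right].$$
Therefore,
$$\int_{(k+1/4)\pi}^{(k+3/4)\pi} \frac{|\sin x|}{x}\,dx \ge \frac{\sqrt{2}}{2(k+1)\pi} \cdot \frac{\pi}{2} = \frac{\sqrt{2}}{4} \cdot \frac{1}{k+1},$$
and as a consequence,
$$\int_\pi^{(n+1)\pi} \frac{|\sin x|}{x}\,dx \ge \frac{\sqrt{2}}{4} \sum_{k=1}^{n} \frac{1}{k+1}.$$
By Example 2.7.4, the series $\sum_{k=1}^{\infty} \frac{1}{k}$ diverges. Therefore,
$$\int_\pi^\infty \frac{|\sin x|}{x}\,dx = \lim_{n\to\infty} \int_\pi^{(n+1)\pi} \frac{|\sin x|}{x}\,dx = \infty.$$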
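The lower bound used in this argument can be observed numerically. The following sketch, an illustration under the stated assumptions, approximates $\int_\pi^{(n+1)\pi} |\sin x|/x\,dx$ with a midpoint rule and compares it with $\frac{\sqrt2}{4}\sum_{k=1}^n \frac{1}{k+1}$; the function names and the number of subintervals per period are arbitrary choices.

```python
import math

def integral_abs_sin_over_x(n, steps_per_period=2000):
    """Midpoint approximation of the integral of |sin x| / x over [pi, (n+1) pi]."""
    h = math.pi / steps_per_period
    total = 0.0
    for i in range(n * steps_per_period):
        x = math.pi + (i + 0.5) * h
        total += abs(math.sin(x)) / x
    return total * h

def harmonic_lower_bound(n):
    """(sqrt(2)/4) * sum_{k=1}^{n} 1/(k+1), the bound derived in the text."""
    return (math.sqrt(2.0) / 4.0) * sum(1.0 / (k + 1) for k in range(1, n + 1))

if __name__ == "__main__":
    for n in (1, 5, 20, 80):
        print(n, integral_abs_sin_over_x(n), harmonic_lower_bound(n))
```

Both columns grow without bound, but only logarithmically, so the divergence is slow to show up in a table of values.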
As the previous example shows, the convergence of the improper integral of $f$ does not imply the convergence of the improper integral of $|f|$. If $f$ is a real-valued function on $[a, \infty)$ such that $f \in \mathcal{R}[a,c]$ for every $c > a$ and the improper integral of $|f|$ converges on $[a, \infty)$, then $f$ is said to be absolutely integrable on $[a, \infty)$. An analogous definition can be given for unbounded functions on a finite interval. We leave it as an exercise to prove that if $f$ is absolutely integrable on $[a, \infty)$, then the improper integral of $f$ also converges on $[a, \infty)$ (Exercise 5). We conclude this section with the following useful comparison test for improper integrals.
6.4.5 THEOREM (Comparison Test) Let $g : [a, \infty) \to \mathbb{R}$ be a nonnegative function satisfying $g \in \mathcal{R}[a,c]$ for every $c > a$ and $\int_a^\infty g(x)\,dx < \infty$. If $f : [a, \infty) \to \mathbb{R}$ satisfies
(a) $f \in \mathcal{R}[a,c]$ for every $c > a$, and
(b) $|f(x)| \le g(x)$ for all $x \in [a, \infty)$,
then the improper integral of $f$ on $[a, \infty)$ converges, and
$$\left|\int_a^\infty f(x)\,dx\right| \le \int_a^\infty g(x)\,dx.$$
Proof. The proof is left to the exercises (Exercise 6).
N EXERCISES 6.4 1. For each of the following functions f defined on (0, 1), determine whether the improper integral off converges. If it converges, find fo f.
'a f(x) = Xp> 0 < p < I
1>
*d. f(x) = x In x
f(x) _
c. f(x) =
x
e. f(x) = (I + x) I
Inxx
f. f(x) = tan
+ x)
(Zx)
2. For each of the following, determine whether the improper integral converges or diverges. If it converges, evaluate the integral. *a.
fee*dx
b.
°O In.r
X ax
d.
Jx_'dx. p > I
C.
x
i
o
°D
dx
2
f'xZ+,dx h.
g.
dx
°G
*e. E x l n x
2
f(x2+ 1)pdx. P> I
x(l n x)P"
P> 1
f IV
I.
(x2+ 1)(x+
1)dx
3. For each of the following, determine the values of p and q for which the improper integral converges. *a. f I/2xpjlnxjpdx
c.
b.
0
Jaxp[ln(l +x)]°dx 0
2
4. Let $f$ be defined on $(0,1]$ by
$$f(x) = \frac{d}{dx}\left(x^2 \sin\frac{1}{x^2}\right) = 2x \sin\frac{1}{x^2} - \frac{2}{x}\cos\frac{1}{x^2}.$$
Show that the improper Riemann integral of $f$ converges on $(0,1]$, but that the improper integral of $|f|$ diverges on $(0,1]$.
5. *If f is absolutely integrable on [a. oo) and integrable on [a, c] for every c > a, prove that the improper Riemann
integral off on [a, oo) converges.
6. Prove Theorem 6.4.5.
7. Let $f(x) = (\cos x)/x^2$, $x \in [\pi, \infty)$.
*a. Show that the improper integral of $|f|$ converges on $[\pi, \infty)$.
*b. Use integration by parts on $[\pi, c]$, $c > \pi$, to show that $\int_\pi^\infty \frac{\sin x}{x}\,dx$ exists.
8. Show that $\int_0^\infty x^{-p} \sin x\,dx$ converges for all $p$, $0 < p < 2$.
9. For $x > 0$, set
$$\Gamma(x) = \int_0^\infty t^{x-1} e^{-t}\,dt.$$
The function $\Gamma$ is called the Gamma function.
a. Show that the improper integral converges for all $x > 0$.
b. Use integration by parts to show that $\Gamma(x + 1) = x\Gamma(x)$, $x > 0$.
c. Show that $\Gamma(1) = 1$.
d. For $n \in \mathbb{N}$, prove that $\Gamma(n + 1) = n!$.
6.5 The Riemann-Stieltjes Integral
In this section we consider the Riemann-Stieltjes integral, which, as we will see, is an extension of the Riemann integral. To motivate the Riemann-Stieltjes integral we consider the following example from physics involving the moment of inertia.
6.5.1 EXAMPLE Consider $n$ masses, each of mass $m_i$, $i = 1, \ldots, n$, located along the $x$-axis at distances $r_i$ from the origin with $0 < r_1 < \cdots < r_n$ (Figure 6.8). The moment of inertia $I$, about an axis through the origin at right angles to the system of masses, is given by
$$I = \sum_{i=1}^{n} r_i^2 m_i.$$
[Figure 6.8]
3. Since the results of this section are not specifically required in subsequent chapters, this topic can be omitted on first reading of the text.
On the other hand, if we have a wire of length $l$ along the $x$-axis with one end at the origin, then the moment of inertia $I$ is given by
$$I = \int_0^l x^2 \rho(x)\,dx,$$
where for each $x \in [0, l]$, $\rho(x)$ denotes the cross-sectional density at $x$.
Although these two problems are totally different, the first being discrete and the second continuous, the Riemann-Stieltjes integral will allow us to express both of these formulas as a single integral. In the definition of the Riemann integral, we used the length $\Delta x_i$ of the $i$th interval to define the upper and lower Riemann sums of a bounded function $f$. The only difference between the Riemann and Riemann-Stieltjes integral is that we replace $\Delta x_i$ by
$$\Delta\alpha_i = \alpha(x_i) - \alpha(x_{i-1}),$$
where $\alpha$ is a nondecreasing function on $[a,b]$. Taking $\alpha(x) = x$ will give the usual Riemann integral. Although the modification in the definition is only minor, the consequences are far-reaching. Not only will we obtain a more extensive theory of integration, but also an integral that has broad applications in the mathematical sciences.
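The claim that one integral covers both the discrete and the continuous moment of inertia can be made concrete with a small computation. The sketch below is an illustration, not the text's construction: it forms tagged sums $\sum f(t_i)\,\Delta\alpha_i$, which always lie between the lower and upper sums for the same partition. The masses, positions, and helper names are made up for the example.

```python
def uniform_partition(a, b, n):
    """Partition points a = x_0 < x_1 < ... < x_n = b with equal spacing."""
    return [a + (b - a) * i / n for i in range(n + 1)]

def stieltjes_sum(f, alpha, partition, tags):
    """Sum of f(t_i) * (alpha(x_i) - alpha(x_{i-1})) with t_i in [x_{i-1}, x_i]."""
    return sum(f(t) * (alpha(x1) - alpha(x0))
               for x0, x1, t in zip(partition, partition[1:], tags))

def step_alpha(positions, masses):
    """Nondecreasing step function: total mass located at positions <= x."""
    def alpha(x):
        return sum(m for r, m in zip(positions, masses) if r <= x)
    return alpha

def discrete_moment():
    """Point masses (hypothetical data): alpha jumps by m_i at r_i."""
    r, m = [1.0, 2.0, 3.0], [2.0, 1.0, 0.5]
    p = uniform_partition(0.0, 4.0, 4000)
    tags = p[1:]  # right endpoints
    return stieltjes_sum(lambda x: x * x, step_alpha(r, m), p, tags)

def continuous_moment():
    """Wire of length 2 with density 1: alpha(x) = x gives a Riemann sum."""
    p = uniform_partition(0.0, 2.0, 2000)
    tags = [0.5 * (x0 + x1) for x0, x1 in zip(p, p[1:])]  # midpoints
    return stieltjes_sum(lambda x: x * x, lambda x: x, p, tags)

if __name__ == "__main__":
    print(discrete_moment())    # compare with 1*2 + 4*1 + 9*0.5
    print(continuous_moment())  # compare with 8/3
```

With the step function, only the subintervals containing a jump contribute, which is how the sum collapses to $\sum r_i^2 m_i$.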
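Definition of the Riemann-Stieltjes Integral
Let $\alpha$ be a monotone increasing function on $[a,b]$, and let $f$ be a bounded real-valued function on $[a,b]$. For each partition $\mathcal{P} = \{x_0, x_1, \ldots, x_n\}$ of $[a,b]$, set
$$\Delta\alpha_i = \alpha(x_i) - \alpha(x_{i-1}), \quad i = 1, \ldots, n.$$
Since $\alpha$ is monotone increasing, $\Delta\alpha_i \ge 0$ for all $i$. As in Section 6.1, let
$$m_i = \inf\{f(t) : t \in [x_{i-1}, x_i]\}, \qquad M_i = \sup\{f(t) : t \in [x_{i-1}, x_i]\}.$$
As for the Riemann integral, the upper Riemann-Stieltjes sum of $f$ with respect to $\alpha$ and the partition $\mathcal{P}$, denoted $\mathcal{U}(\mathcal{P}, f, \alpha)$, is defined by
$$\mathcal{U}(\mathcal{P}, f, \alpha) = \sum_{i=1}^{n} M_i \Delta\alpha_i.$$
Similarly, the lower Riemann-Stieltjes sum of $f$ with respect to $\alpha$ and the partition $\mathcal{P}$, denoted $\mathcal{L}(\mathcal{P}, f, \alpha)$, is defined by
$$\mathcal{L}(\mathcal{P}, f, \alpha) = \sum_{i=1}^{n} m_i \Delta\alpha_i.$$
Since $m_i \le M_i$ and $\Delta\alpha_i \ge 0$, we always have $\mathcal{L}(\mathcal{P}, f, \alpha) \le \mathcal{U}(\mathcal{P}, f, \alpha)$. Furthermore, if $m \le f(x) \le M$ for all $x \in [a,b]$, then $m[\alpha(b) - \alpha(a)]$
0,
$$\int_a^b \alpha\,d\beta \le \alpha(b)\beta(b) - \alpha(a)\beta(a) - \int_a^b \beta\,d\alpha.$$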
A similar argument using the lower sum proves the reverse inequality.
We conclude this section with two results that represent the extremes encountered in Riemann-Stieltjes integration. As in Example 6.5.4(a), let $I(x - c)$ be the unit jump function at $c \in \mathbb{R}$. Suppose $\{s_n\}_{n=1}^{N}$ is a finite subset of $(a,b]$ and $\{c_n\}_{n=1}^{N}$ are nonnegative real numbers. Define the monotone increasing function $\alpha$ on $[a,b]$ by
$$\alpha(x) = \sum_{n=1}^{N} c_n I(x - s_n).$$
If $f$ is continuous on $[a,b]$, then by Example 6.5.4(a) and Theorem 6.5.8(b),
$$\int_a^b f\,d\alpha = \sum_{n=1}^{N} c_n \int_a^b f(x)\,dI(x - s_n) = \sum_{n=1}^{N} c_n f(s_n). \tag{12}$$
Suppose $\{s_n\}_{n=1}^{\infty}$ is a countable subset of $(a,b]$ and $\{c_n\}_{n=1}^{\infty}$ is a sequence of nonnegative real numbers for which $\sum c_n$ converges. As in Theorem 4.4.10, define $\alpha$ on $[a,b]$ by
$$\alpha(x) = \sum_{n=1}^{\infty} c_n I(x - s_n). \tag{13}$$
Since 0 : I (x  sn) 0 be
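given. Choose a positive integer $N$ such that
$$\sum_{n=N+1}^{\infty} c_n < \epsilon.$$
Define $\beta_1$ and $\beta_2$ as follows:
$$\beta_1(x) = \sum_{n=1}^{N} c_n I(x - s_n), \qquad \beta_2(x) = \sum_{n=N+1}^{\infty} c_n I(x - s_n).$$
Then $\alpha = \beta_1 + \beta_2$, and by identity (12),
$$\int_a^b f\,d\beta_1 = \sum_{n=1}^{N} c_n f(s_n).$$
Let $M = \max\{|f(x)| : x \in [a,b]\}$. Then by Theorem 6.5.8(b) and (e),
$$\left|\int_a^b f\,d\alpha - \sum_{n=1}^{N} c_n f(s_n)\right| = \left|\int_a^b f\,d\beta_2\right| \le M[\beta_2(b) - \beta_2(a)] \le M \sum_{n=N+1}^{\infty} c_n$$
0 be given. Since $\alpha' \in \mathcal{R}[a,b]$, by Theorem 6.1.7 there exists a partition $\mathcal{P}$ of $[a,b]$ such that
$$\mathcal{U}(\mathcal{P}, \alpha') - \mathcal{L}(\mathcal{P}, \alpha') < \epsilon. \tag{14}$$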
Let °. 2. _ {xo, ... , xn} be any refinement of 91. As in Theorem 6.2.6, for each i = 1, ... , n, we can choose s, E [xi_,, x,] such that f(s,)Aa; + e.
° L(°9 f, a)
1, q = 1, q < 1.) 6. *Prove Corollary 7.1.3. 7. If I ak converges and B bk = oo, prove that E (ak + bk) = oo. is a sequence in R with a > 0 for all n E N. For each k E 101 set 8. Suppose bk =
1
k
 ± a,,. k n=I
Prove that Jk , bk diverges. 9. Suppose that the series E ak converges and {n;} is a strictly increasing sequence of positive integers. Define the sequence {bk} as follows:
b,=a,+
+a
,
bk=a.,_,+,+...+a,,:
Prove that E bk converges and that $k , bk = k I ak. (This exercise proves that if the series E ak converges, then any series obtained from Eat by inserting parentheses also converges to the same sum. The following exercise shows that removing parentheses may lead to difficulties.)
10. Give an example of a series F. ak such that Ek , (a2k _, + a2k) converges, but E ak diverges. 11. Prove that p(k) k1
converges for any polynomial p and a > 1. 12. *Suppose that the series E ak of positive real numbers converges by virtue of the root or ratio test. Show that the series Yt , k"ak converges for all is E N. 13. *Show that the series 1
1
+23+32++... I
1
converges, but that both the ratio and root tests are inconclusive.
14. Apply the root and ratio tests to the series I ak where 2k.
when k is even,
2t+2'
when k is odd.
at
0 for all k E N. Prove that the series Ik , ak converges if and only if some subsequence {sj of the
15. Suppose ak sequence
of partial sums converges.
16. *Cauchy Condensation Test: Suppose that a, z a2 z a3 a
? 0. Use the previous exercise to prove that
P 2kag converges.
7, R , ak converges if and only if
17. Use the Cauchy condensation test to show that $4 , 1/n^ converges for all p > 1, and diverges for all p,
00 k'
kal
kk
00
* d.
(k + 1) k
k1 si
6.
kkt rER,>0
00 (1)klnk
b. k2 7
k
!. k1D 1) *h.
*G kk
k+1
k Cs kt
(k + I)k+1
*f
(Ir Mklnk sink k_Z In k
IE R ,p > 0
Given that
w (i)k+1 k.l
k2
jr2 12'
determine how large n E N must be chosen so that I nrz/12  s^ I < 10', where s^ is the nth partial sum of the 7.
series. If p and q are positive real numbers, show that 00
(ln k)a
converges. &
*Suppose that E ak converges. Prove that
lkak=0.
lim 000 n k.1
7.3
9.
Absolute and Conditional Convergence
+
As in Exercise 19 of the previous section. let ck = I + ; +
b = I  l2 +
1 3
 ... _ 1 = 2n
299
 In k. Set
(1)k+1 k=1
k
Show that lim b = In 2. (Hint: b, = c_  c + In 2.)
7.3 Absolute and Conditional Convergence
In this section we introduce the concept of absolute convergence of a series. As we will see in this and subsequent sections of the text, the notion of absolute convergence is very important in the study of series. We begin with the definition of absolute and conditional convergence.
7.3.1 DEFINITION A series $\sum a_k$ of real numbers is said to be absolutely convergent (or converges absolutely) if $\sum |a_k|$ converges. The series is said to be conditionally convergent if it is convergent but not absolutely convergent.
We illustrate these two definitions with the following examples.
7.3.2 EXAMPLES
(a) Since the sequence $\{1/k\}$ decreases to zero, by Theorem 7.2.3 the series $\sum (-1)^{k+1}/k$ converges. However,
$$\sum_{k=1}^{\infty} \left|\frac{(-1)^{k+1}}{k}\right| = \sum_{k=1}^{\infty} \frac{1}{k} = \infty.$$
Thus the series $\sum (-1)^{k+1}/k$ is conditionally convergent.
(b) Consider the series $\sum (-1)^{k+1}/k^2$. By Theorem 7.2.3 the alternating series converges. Furthermore, since $\sum 1/k^2 < \infty$, the series is absolutely convergent. $\square$
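The difference between the two examples shows up clearly in partial sums. The sketch below is a numerical illustration only; the helper names are assumptions, and the reference values $\ln 2$ and $\pi^2/12$ for the two alternating series are standard facts used here for comparison, not results proved at this point.

```python
import math

def partial_sum(term, n):
    """Sum of term(k) for k = 1, ..., n."""
    return sum(term(k) for k in range(1, n + 1))

def alt_harmonic(k):
    return (-1) ** (k + 1) / k

def harmonic(k):
    return 1.0 / k

def alt_squares(k):
    return (-1) ** (k + 1) / k ** 2

def squares(k):
    return 1.0 / k ** 2

if __name__ == "__main__":
    for n in (10, 100, 1000, 10000):
        print(n,
              partial_sum(alt_harmonic, n),  # settles near ln 2
              partial_sum(harmonic, n),      # keeps growing
              partial_sum(alt_squares, n),   # settles near pi^2 / 12
              partial_sum(squares, n))       # stays bounded
```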
Our first result for absolutely convergent series is as follows.
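7.3.3 THEOREM If $\sum a_k$ converges absolutely, then $\sum a_k$ converges and
$$\left|\sum_{k=1}^{\infty} a_k\right| \le \sum_{k=1}^{\infty} |a_k|.$$
Proof. Suppose $\sum a_k$ converges absolutely; i.e., $\sum |a_k| < \infty$. By the triangle inequality, for $1 \le p \le q$,
$$\left|\sum_{k=p}^{q} a_k\right| \le \sum_{k=p}^{q} |a_k|.$$
Thus by the Cauchy criterion (Theorem 2.7.3) the series $\sum a_k$ converges. Finally, with $p = 1$ in the above,
$$\left|\sum_{k=1}^{\infty} a_k\right| = \lim_{q\to\infty} \left|\sum_{k=1}^{q} a_k\right| \le \lim_{q\to\infty} \sum_{k=1}^{q} |a_k| = \sum_{k=1}^{\infty} |a_k|. \quad \square$$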
Remark. To test a series $\sum a_k$ for absolute convergence we can apply any of the appropriate convergence tests of Section 7.1 to the series $\sum |a_k|$. There is, however, one important fact which needs to be emphasized. If the series $\sum |a_k|$ diverges by virtue of the ratio or root test, i.e.,
$$r = \liminf_{n\to\infty} \frac{|a_{n+1}|}{|a_n|} > 1 \quad \text{or} \quad \alpha = \limsup_{n\to\infty} \sqrt[n]{|a_n|} > 1,$$
then not only does $\sum |a_k|$ diverge, but $\sum a_k$ also diverges. To see this, suppose $\alpha > 1$. Then as in the proof of the root test, $|a_k| > 1$ for infinitely many $k$. Hence the sequence $\{a_k\}$ does not converge to zero, and thus by Corollary 2.7.5, $\sum a_k$ diverges. Similarly, if $r > 1$ and if $1 < c < r$, then as in the proof of Theorem 7.1.7(b), there exists a positive integer $n_0$ and constant $M$ such that
$$|a_n| \ge Mc^n \quad \text{for all } n \ge n_0.$$
Thus again, since $c > 1$, $\{a_n\}$ does not converge to zero, and the series $\sum a_k$ diverges. We summarize this as follows.
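7.3.4 THEOREM (Ratio and Root Test) Let $\sum a_k$ be a series of real numbers, and let
$$\alpha = \limsup_{k\to\infty} \sqrt[k]{|a_k|}.$$
Also, if $a_k \ne 0$ for all $k \in \mathbb{N}$, let
$$R = \limsup_{k\to\infty} \frac{|a_{k+1}|}{|a_k|} \quad \text{and} \quad r = \liminf_{k\to\infty} \frac{|a_{k+1}|}{|a_k|}.$$
(a) If $\alpha < 1$ or $R < 1$, then the series $\sum a_k$ is absolutely convergent.
(b) If $\alpha > 1$ or $r > 1$, then the series $\sum a_k$ is divergent.
(c) If $\alpha = 1$ or $r \le 1$
and diverges for all $q \le \tfrac12$. Thus $\{1/k^q\} \in \ell^2$ if and only if $q > \tfrac12$.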
Cauchy-Schwarz Inequality
Our main goal in this section is to prove the Cauchy-Schwarz inequality for sequences in $\ell^2$. First, however, we prove the finite version of this inequality.
7.4.3 THEOREM (Cauchy-Schwarz Inequality) If $n \in \mathbb{N}$, and $a_1, \ldots, a_n$ and $b_1, \ldots, b_n$ are real numbers, then
$$\sum_{k=1}^{n} |a_k b_k| \le \sqrt{\sum_{k=1}^{n} a_k^2}\,\sqrt{\sum_{k=1}^{n} b_k^2}.$$
Proof. Let $\lambda \in \mathbb{R}$ and consider
$$0 \le \sum_{k=1}^{n} (|a_k| - \lambda|b_k|)^2 = \sum_{k=1}^{n} a_k^2 - 2\lambda \sum_{k=1}^{n} |a_k b_k| + \lambda^2 \sum_{k=1}^{n} b_k^2.$$
The above can be written as 0 0, there would exist a positive integer n, such that I "(j)  f(x) 1 < e for all n >_ n" and all x E [0, 1 ]. In particular,
x m, n
$$|S_n(x) - S_m(x)| = \left|\sum_{k=m+1}^{n} f_k(x)\right| \le \sum_{k=m+1}^{n} |f_k(x)| \le \sum_{k=m+1}^{n} M_k.$$
Uniform convergence now follows by the Cauchy criterion. That $\sum |f_k(x)|$ also converges is clear.
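The estimate at the heart of the Weierstrass M-test can be observed on a concrete series. The sketch below is a hedged illustration, using $f_k(x) = \cos(kx)/k^2$ with $M_k = 1/k^2$ as an assumed example: it compares the largest gap $|S_n(x) - S_m(x)|$ over a grid of points with the tail $\sum_{k=m+1}^n M_k$, which does not depend on $x$. The grid only samples the supremum, so this is evidence rather than proof.

```python
import math

def partial_sum(x, n):
    """S_n(x) = sum_{k=1}^{n} cos(k x) / k^2."""
    return sum(math.cos(k * x) / k ** 2 for k in range(1, n + 1))

def tail_bound(m, n):
    """sum_{k=m+1}^{n} M_k with M_k = 1 / k^2 (independent of x)."""
    return sum(1.0 / k ** 2 for k in range(m + 1, n + 1))

def max_gap(m, n, grid=200):
    """Largest |S_n(x) - S_m(x)| over a uniform grid on [-pi, pi]."""
    xs = [-math.pi + 2.0 * math.pi * i / grid for i in range(grid + 1)]
    return max(abs(partial_sum(x, n) - partial_sum(x, m)) for x in xs)

if __name__ == "__main__":
    for m, n in ((5, 20), (10, 50), (50, 200)):
        print(m, n, max_gap(m, n), tail_bound(m, n))
```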
8.2.8 EXAMPLES
(a) If $\sum a_k$ converges absolutely, then since $|a_k \cos kx| \le |a_k|$ for all $x \in \mathbb{R}$, by the Weierstrass M-test, the series $\sum a_k \cos kx$ converges uniformly on $\mathbb{R}$. Similarly for the series $\sum a_k \sin kx$. In particular, the series
$$\sum_{k=1}^{\infty} \frac{\cos kx}{k^p}, \qquad \sum_{k=1}^{\infty} \frac{\sin kx}{k^p}, \qquad p > 1,$$
converge uniformly on $\mathbb{R}$.
(b) Consider the series $\sum (x/2)^k$. This is a geometric series that converges for all $x \in \mathbb{R}$ satisfying $|x| < 2$. If $0 < a < 2$ and $|x| \le a$, then
$$\left|\left(\frac{x}{2}\right)^k\right| \le \left(\frac{a}{2}\right)^k.$$
Since $a/2 < 1$ the series $\sum (a/2)^k$ converges. Thus by the Weierstrass M-test, the series $\sum (x/2)^k$ converges uniformly on $[-a, a]$ for any $a$, $0 < a < 2$. The series, however, does not converge uniformly on $(-2, 2)$ (Exercise 11).
Although the Weierstrass M-test automatically implies absolute convergence, the following example shows that uniform convergence as a general rule does not imply absolute convergence.
8.2.9 EXAMPLE Consider the series
$$\sum_{k=1}^{\infty} (-1)^{k+1} \frac{x^k}{k}, \qquad 0 \le x \le 1.$$
For each $k \in \mathbb{N}$, set $a_k(x) = x^k/k$. For $x \in [0,1]$, we have
$$a_1(x) \ge a_2(x) \ge \cdots \ge 0 \quad \text{and} \quad \lim_{k\to\infty} a_k(x) = 0.$$
Thus by Theorem 7.2.3, the series $\sum (-1)^{k+1} a_k(x)$ converges for all $x \in [0,1]$. Let
$$S(x) = \sum_{k=1}^{\infty} (-1)^{k+1} a_k(x).$$
If $S_n(x)$ is the $n$th partial sum of the series, then by Theorem 7.2.4
$$|S(x) - S_n(x)| \le a_{n+1}(x) \le \frac{1}{n+1} \quad \text{for all } x \in [0,1].$$
Thus $\{S_n\}$ converges uniformly to $S$ on $[0,1]$. However, the given series does not converge absolutely when $x = 1$.
The converse is also false; absolute convergence need not imply uniform convergence! As an example, consider the series $\sum_{k=0}^{\infty} x^2(1 + x^2)^{-k}$ of Example 8.1.2(b). Since all the terms are nonnegative, the series converges absolutely to
$$f(x) = \begin{cases} 0, & x = 0,\\ 1 + x^2, & x \ne 0, \end{cases}$$
on $\mathbb{R}$. However, as a consequence of Corollary 8.3.2 of the next section, since $f$ is not continuous at 0, the convergence cannot be uniform on any interval containing 0. The series $\sum (-1)^{k+1} x^k/k$, $x \in [0,1]$, also provides an example of a series that converges uniformly on $[0,1]$ but for which the Weierstrass M-test fails.
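The uniform error bound $1/(n+1)$ from Example 8.2.9 can be checked against sampled values. The sketch below is illustrative: it uses the identity $S(x) = \ln(1 + x)$ on $[0,1]$ (this is the content of Example 8.7.6, taken here as a known fact), samples the error on a grid, and also prints the partial sums of the absolute values at $x = 1$, which are harmonic sums. Function names are assumptions.

```python
import math

def S_n(x, n):
    """n-th partial sum of sum_{k>=1} (-1)^(k+1) x^k / k."""
    return sum((-1) ** (k + 1) * x ** k / k for k in range(1, n + 1))

def sup_error(n, grid=500):
    """Largest |ln(1 + x) - S_n(x)| over a uniform grid on [0, 1]."""
    xs = [i / grid for i in range(grid + 1)]
    return max(abs(math.log(1.0 + x) - S_n(x, n)) for x in xs)

def abs_partial_sum_at_one(n):
    """Partial sum of the absolute values at x = 1, i.e. a harmonic sum."""
    return sum(1.0 / k for k in range(1, n + 1))

if __name__ == "__main__":
    for n in (5, 20, 100):
        print(n, sup_error(n), 1.0 / (n + 1), abs_partial_sum_at_one(n))
```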
EXERCISES 8.2 1. Prove Theorem 8.2.5. a. If (f.) and converge uniformly on a set E, prove that {f. + converges uniformly on E. +b. If (f.1 and converge uniformly on a set E, and there exist constants M and N such that I f.(x)I 1.JxIs2
10. Show that each of the following series converge uniformly on (a, oo) for any a > 0, but do not converge uniformly on (0, oo). I
M
'a. 2
00
I + k2x
I
b. I ii+;
11. Show that the series E, , (x/2)k does not converge uniformly on (2, 2). 12. If Jk o ak converges absolutely, prove that Ik o akxk converges uniformly on [ 1. I ]. 13. If I .oak converges, prove that Iku akxk converges uniformly on [0, 1]. 14. Let {ck) be a sequence of real numbers satisfying 7, Ickl < on, and let {xk} be a countable subset of (a, b]. Prove that the' series lk , ck !(x  xk) converges uniformly on [a, b]. Here ! is the unit jump function defined in 4.4.9. 15. Dirichlet'17est for Uniform Convergence: Suppose {)k) and {gk} are sequences of functions on a set E satisfying Ik., gk(x) are uniformly bounded on E; i.e., there exists M > 0 such that (a) the partial sums
 fk I(x) > 0 for all k E hi and x E E, and
(c) jim fk(x) = 0 uniformly on E. Prove that I fk(x)gk(x) converges uniformly on E. 16. Prove that
. sin kx . t os kz k=1
kP
(p > 0)
k1
converge uniformly on any closed interval that does not contain an integer multiple of 21r.
17. Define a sequence of functions If.) on [0, I ] by
if
f(x)= 0,
2"
ii < x `
2"
elsewhere.
Prove that 1
1 fn(x) converges uniformly on [0, 1 ], but that the Weierstrass Mtest fails.
18. 'Let F0 be a bounded Riemann integrable function on [0. I ]. For n E 101, define F .(x) on [0, 1 ] by F (x) = fo F. 1(t) dt. Prove that 7,; o Fk(x) converges uniformly on (0. 1 ].
8.3 Uniform Convergence and Continuity
In this section we will prove that the limit of a uniformly convergent sequence of continuous functions is again continuous. Prior to proving this result, we first prove a stronger result that will have additional applications later.
8.3.1 THEOREM Suppose $\{f_n\}$ is a sequence of real-valued functions that converges uniformly to a function $f$ on a subset $E$ of $\mathbb{R}$. Let $p$ be a limit point of $E$, and suppose that for each $n \in \mathbb{N}$,
$$\lim_{x\to p} f_n(x) = A_n.$$
Then the sequence $\{A_n\}$ converges and
$$\lim_{x\to p} f(x) = \lim_{n\to\infty} A_n.$$
Remark. The last statement can be rewritten as
$$\lim_{x\to p}\left(\lim_{n\to\infty} f_n(x)\right) = \lim_{n\to\infty}\left(\lim_{x\to p} f_n(x)\right).$$
It should be noted that $p$ is not required to be a point of $E$; only a limit point of $E$.
Proof. Let $\epsilon > 0$ be given. Since the sequence $\{f_n\}$ converges uniformly to $f$ on $E$, there exists a positive integer $n_0$ such that
$$|f_n(x) - f_m(x)| < \epsilon \tag{2}$$
for all $n, m \ge n_0$ and all $x \in E$. Since inequality (2) holds for all $x \in E$, letting $x \to p$ gives
$$|A_n - A_m| \le \epsilon \quad \text{for all } n, m \ge n_0.$$
Thus $\{A_n\}$ is a Cauchy sequence in $\mathbb{R}$, which as a consequence of Theorem 2.6.4 converges. Let $A = \lim_{n\to\infty} A_n$.
It remains to be shown that lim f(x) = A. Again, let e > 0 be given. First, by the uniform convergence of the sequence { fn(x)} and the convergence of the sequence there exists a positive integer m such that
1f(X)  fm(x)I
0 such that
lfm(x)  Anl < 3 for all x E E, 0 0, there exists a positive integer n, such that Ilxn  xmll < E
for all integers $n, m \ge n_0$.
(b) A normed linear space $(X, \|\cdot\|)$ is complete if every Cauchy sequence in $X$ converges in norm to an element of $X$.
As for sequences of real numbers, every sequence $\{x_n\}$ in $X$ that converges in norm to $x \in X$ is a Cauchy sequence. In Theorem 2.6.4 we proved that the normed linear space $(\mathbb{R}, |\cdot|)$ is complete. The following theorem proves that $(\mathcal{C}[a,b], \|\cdot\|_u)$ is also complete.
8.3.11 THEOREM The normed linear space $(\mathcal{C}[a,b], \|\cdot\|_u)$ is complete.
Proof. Let $\{f_n\}$ be a Cauchy sequence in $\mathcal{C}[a,b]$; i.e., given $\epsilon > 0$, there exists a positive integer $n_0$ such that $\|f_n - f_m\|_u < \epsilon$ for all $n, m \ge n_0$. But then
$$|f_n(x) - f_m(x)| \le \|f_n - f_m\|_u < \epsilon$$
for all $x \in [a,b]$ and all $n, m \ge n_0$. Thus by Theorem 8.2.3 and Corollary 8.3.2, the sequence $\{f_n\}$ converges uniformly to a continuous function $f$ on $[a,b]$. Finally, since the convergence is uniform, given $\epsilon > 0$, there exists an integer $n_0$ such that
$$|f_n(x) - f(x)| < \epsilon \quad \text{for all } x \in [a,b] \text{ and } n \ge n_0.$$
As a consequence, we have $\|f_n - f\|_u < \epsilon$ for all $n \ge n_0$. Therefore, the sequence $\{f_n\}$ converges to $f$ in the norm $\|\cdot\|_u$. $\square$
Contraction Mappings
In Exercise 13 of Section 4.3 we defined the notion of a contractive function on a subset $E$ of $\mathbb{R}$. We now extend this to normed linear spaces.
8.3.12 DEFINITION Let $(X, \|\cdot\|)$ be a normed linear space. A mapping (function) $T : X \to X$ is called a contraction mapping (function) if there exists a constant $c$, $0 < c < 1$, such that
$$\|T(x) - T(y)\| \le c\|x - y\|$$
for all $x, y \in X$.
Clearly every contraction mapping on $X$ is continuous, in fact uniformly continuous on $X$. As in Exercise 13 of Section 4.3, we now prove that if $T$ is a contraction mapping on a complete normed linear space $(X, \|\cdot\|)$, then $T$ has a unique fixed point in $X$.
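The fixed point of a contraction can be approximated by repeatedly applying $T$, starting from any point. The sketch below is a minimal illustration in the complete normed linear space $(\mathbb{R}, |\cdot|)$, with the assumed example $T(x) = \tfrac12 \cos x$; by the mean value theorem $|T(x) - T(y)| \le \tfrac12|x - y|$, so $T$ is a contraction with $c = \tfrac12$. The function names and tolerances are illustrative choices.

```python
import math

def fixed_point(T, x0, tol=1e-12, max_iter=1000):
    """Iterate x_{n+1} = T(x_n) until successive iterates differ by less than tol."""
    x = x0
    for _ in range(max_iter):
        nxt = T(x)
        if abs(nxt - x) < tol:
            return nxt
        x = nxt
    raise RuntimeError("iteration did not converge within max_iter steps")

def T(x):
    """Contraction on R with constant c = 1/2 (assumed example)."""
    return 0.5 * math.cos(x)

if __name__ == "__main__":
    # Different starting points lead to the same fixed point.
    print(fixed_point(T, 0.0), fixed_point(T, 10.0), fixed_point(T, -7.5))
```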
8.3.13 THEOREM Let $(X, \|\cdot\|)$ be a complete normed linear space and let $T : X \to X$ be a contraction mapping. Then there exists a unique point $x \in X$ such that $T(x) = x$.
Proof. Suppose $T : X \to X$ satisfies $\|T(x) - T(y)\|$
0 be given. Since $\{f_n(x_0)\}$ converges and $\{f_n'\}$ converges uniformly, there exists $n_0 \in \mathbb{N}$ such that
$$|f_n(x_0) - f_m(x_0)| < \frac{\epsilon}{2} \quad \text{for all } n, m \ge n_0 \tag{4}$$
and
$$|f_n'(t) - f_m'(t)| < \frac{\epsilon}{2(b - a)} \quad \text{for all } t \in [a,b] \text{ and all } n, m \ge n_0. \tag{5}$$
Apply the mean value theorem to the functions $f_n - f_m$ with $n, m \ge n_0$ fixed. Then for $x, y \in [a,b]$, there exists $t$ between $x$ and $y$ such that
$$|(f_n(x) - f_m(x)) - (f_n(y) - f_m(y))| = |[f_n'(t) - f_m'(t)](x - y)|.$$
Thus by inequality (5),
$$|(f_n(x) - f_m(x)) - (f_n(y) - f_m(y))| \le \frac{\epsilon}{2(b - a)}|x - y| \le \frac{\epsilon}{2}. \tag{6}$$
Take $y = x_0$ in inequality (6). Then by inequalities (4) and (6), for all $x \in [a,b]$ and $n, m \ge n_0$,
$$|f_n(x) - f_m(x)| \le |(f_n(x) - f_m(x)) - (f_n(x_0) - f_m(x_0))| + |f_n(x_0) - f_m(x_0)|$$
21r
a
a2 2
2
(
\2J"[3 Since i > 1, we obtain f(x + h")  f(x) h
f  0o
as n + oo,
n
provided a is an odd positive integer satisfying a 21r 2 3
>0;
i.e., a > 31r + 2. Since 1r < 3.15, we need a z 13. N
Remark. The above proof is based on the proof of a more general result given in the text by E. Hewitt and K. Stromberg. There it is proved (Theorem 17.7) that
)
f( x) = '
°C cos akrrx bk
k=0
has the desired property if a is an odd positive integer, and b is any real number with
b > I satisfying
6> 1+3ir. The above function was carefully examined by G. H. Hardy [Trans. Amer. Math. Soc., 17, 301325 (1916)] who proved that the above f has the stated properties provided
1 : 0 is chosen such that Q"(t) tit = 1. 12,
Thus the sequence {Q"} satisfies property (a) of Definition 8.6.4. To show that it also satisfies (b) we need an estimate on the magnitude of c". Since I
I
1=c"J(l t2rdt=2c"J(l i
t2rdt
0
?2c"J
(1t2)"dt
o
2c,, o
(1nt2)dt=2c" (
1

1
37 /
4c"
3\fn we obtain
In the above we have used the inequality (1  t2)" z 1  nt2 valid for all t E [0, 1] (Example 1.3.3(b)). Finally, for any 5, 0 < S < 1,
Q"(t)=c"(1t2rsV(152r for all t,SsIt 1 0.
7. Let f be a continuous realvalued function on [0, 1). Prove that given e > 0, there exists a polynomial P with rational coefficients such that 1f(x)  P(x) I < e for all x E [0, 1 ].
8. Suppose $f$ is a continuous real-valued function on $[0,1]$ satisfying
$$\int_0^1 f(x)x^n\,dx = 0 \quad \text{for all } n = 0, 1, 2, \ldots$$
Prove that $f(x) = 0$ for all $x \in [0,1]$. (Hint: First show that $\int_0^1 f(x)P(x)\,dx = 0$ for every polynomial $P$, then use the Weierstrass theorem to show that $\int_0^1 f^2(x)\,dx = 0$.)
8.7 Power Series Expansions
In this section we turn our attention to the study of power series and the representation of functions by means of power series. Because of their special nature, power series possess certain properties that are not valid for series of functions. We begin with the following definition.
8.7.1 DEFINITION Let $\{a_k\}_{k=0}^{\infty}$ be a sequence of real numbers, and let $c \in \mathbb{R}$. A series of the form
$$\sum_{k=0}^{\infty} a_k(x - c)^k = a_0 + a_1(x - c) + a_2(x - c)^2 + a_3(x - c)^3 + \cdots$$
is called a power series in $(x - c)$. When $c = 0$, the series is called a power series in $x$. The numbers $a_k$ are called the coefficients of the power series.
Even though the study of representation of functions by means of power series dates back to the mid-seventeenth century, the rigorous study of convergence is much more recent. Certainly Newton and his successors were concerned with questions involving the convergence of a power series to its defining function. It was Cauchy, however, who, with his formal development of series, brought mathematical rigor to the subject. As an application of his root and ratio test, Cauchy was among the first to use these tests to determine the interval of convergence of a power series. This is accomplished as follows: Consider a power series $\sum a_k(x - c)^k$. Applying the root test to this series gives
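$$\limsup_{k\to\infty} \sqrt[k]{|a_k|\,|x - c|^k} = |x - c| \limsup_{k\to\infty} \sqrt[k]{|a_k|} = |x - c|\,\alpha,$$
where $\alpha = \limsup_{k\to\infty} \sqrt[k]{|a_k|}$. Thus by Theorem 7.3.4, the series converges absolutely if $\alpha|x - c| < 1$, and diverges if $\alpha|x - c| > 1$. If $\alpha = 0$, then $\alpha|x - c| < 1$ for all $x \in \mathbb{R}$. If $0 < \alpha < \infty$, then
$$\alpha|x - c| < 1 \quad \text{if and only if} \quad |x - c| < \frac{1}{\alpha}.$$
8.7.2 DEFINITION
defined by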
R. (c) Furthermore, if 0 < p < R, then the series converges uniformly for all x with
Ix  ci :S P. Proof. Statements (a) and (b) were proved in the discussion preceding the statement of the theorem. Suppose 0 < p < R. Choose /3 such that p < 6 < R. Since
= RI < I,
{ ak I
0 such that $1 - \delta < x < 1$ implies that $(1 - x)M < \epsilon$, then $|f(x) - s| < \epsilon$ for all $x$, $1 - \delta < x < 1$. Thus
$$\lim_{x\to 1^-} f(x) = s. \quad \square$$
8.7.6 EXAMPLE To illustrate Abel's theorem, consider the series $\sum_{k=0}^{\infty} (-1)^k t^k$. This series has radius of convergence $R = 1$. Furthermore, the series converges to $f(t) = 1/(1 + t)$ for all $t$, $|t| < 1$. Since the convergence is uniform on $|t| \le |x|$ where $|x| < 1$, by Corollary 8.4.2,
$$\ln(1 + x) = \int_0^x \frac{dt}{1 + t} = \sum_{k=0}^{\infty} (-1)^k \int_0^x t^k\,dt = \sum_{k=0}^{\infty} \frac{(-1)^k}{k + 1} x^{k+1} = \sum_{k=1}^{\infty} \frac{(-1)^{k+1}}{k} x^k$$
for all $x$, $|x| < 1$. The series
$$\sum_{k=1}^{\infty} \frac{(-1)^{k+1}}{k} x^k$$
has radius of convergence $R = 1$, and also converges when $x = 1$. Thus by Abel's theorem,
$$\ln 2 = \sum_{k=1}^{\infty} \frac{(-1)^{k+1}}{k} = 1 - \frac{1}{2} + \frac{1}{3} - \frac{1}{4} + \cdots.$$
Differentiation of Power Series
Suppose the power series $\sum_{k=0}^{\infty} a_k(x - c)^k$ has radius of convergence $R > 0$. If we differentiate the series term-by-term we obtain the new power series,
$$\sum_{k=1}^{\infty} k a_k(x - c)^{k-1} = \sum_{k=0}^{\infty} (k + 1)a_{k+1}(x - c)^k. \tag{15}$$
The obvious question is what is the radius of convergence of the differentiated series (15)? Furthermore, if $f$ is defined by $f(x) = \sum_{k=0}^{\infty} a_k(x - c)^k$, $|x - c| < R$, does the series (15) converge to $f'(x)$? The answers to both of these questions are provided by the following theorem.
8.7.7 THEOREM Suppose $\sum_{k=0}^{\infty} a_k(x - c)^k$ has radius of convergence $R > 0$, and
$$f(x) = \sum_{k=0}^{\infty} a_k(x - c)^k, \quad |x - c| < R.$$
Thus the radius of convergence of $\sum_{k=1}^{\infty} k a_k x^{k-1}$ is also $R$.
Furthermore, for any $\rho$, $0 < \rho < R$, by Theorem 8.7.3 the series $\sum k a_k x^{k-1}$ converges uniformly for all $x$, $|x| \le \rho$. Thus by Theorem 8.5.1, the series (15), obtained by term-by-term differentiation, converges to $f'(x)$, i.e.,
$$f'(x) = \sum_{k=1}^{\infty} k a_k x^{k-1} \quad \text{for all } x,\ |x| < R.$$
8.7.8 COROLLARY Suppose $\sum_{k=0}^{\infty} a_k(x - c)^k$ has radius of convergence $R > 0$, and
$$f(x) = \sum_{k=0}^{\infty} a_k(x - c)^k, \quad |x - c| < R.$$
Then $f$ has derivatives of all orders in $|x - c| < R$, and for each $n \in \mathbb{N}$,
$$f^{(n)}(x) = \sum_{k=n}^{\infty} k(k - 1)\cdots(k - n + 1)a_k(x - c)^{k-n}. \tag{16}$$
In particular,
$$f^{(n)}(c) = n!\,a_n. \tag{17}$$
Proof. The result is obtained by successively applying the previous theorem to $f$, $f'$, $f''$, etc. Equation (17) follows by setting $x = c$ in equation (16).
8.7.9 DEFINITION A real-valued function $f$ defined on an open interval $I$ is said to be infinitely differentiable on $I$ if $f^{(n)}(x)$ exists on $I$ for all $n \in \mathbb{N}$. The set of infinitely differentiable functions on an open interval $I$ is denoted by $C^{\infty}(I)$.
As a consequence of Corollary 8.7.8, if $\sum a_k(x - c)^k$ has radius of convergence $R > 0$ and if $f$ is defined by $f(x) = \sum_{k=0}^{\infty} a_k(x - c)^k$ for $|x - c| < R$, then the function $f$ is infinitely differentiable on $(c - R, c + R)$ and its $n$th derivative is given by equation (16). We illustrate this with the following example.
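8.7.10 EXAMPLE For $|x| < 1$,
$$\frac{1}{1-x} = \sum_{k=0}^{\infty} x^k.$$
Thus by the previous corollary,
$$\frac{1}{(1-x)^2} = \sum_{k=1}^{\infty} k x^{k-1} = \sum_{k=0}^{\infty} (k+1) x^k,$$
$$\frac{2}{(1-x)^3} = \sum_{k=2}^{\infty} k(k-1) x^{k-2} = \sum_{k=0}^{\infty} (k+2)(k+1) x^k,$$
and for arbitrary $n \in \mathbb{N}$,
$$\frac{(n-1)!}{(1-x)^n} = \sum_{k=0}^{\infty} (k+n-1)\cdots(k+1)\, x^k.$$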
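The closed form in Example 8.7.10 can be checked numerically. The following is a minimal sketch, not part of the original text: the function names `derivative_series` and `closed_form` are illustrative, and the number of terms is an arbitrary cutoff chosen so that the tail is negligible for the sample values of $x$.

```python
import math


def derivative_series(x, n, terms=200):
    """Partial sum of sum_{k>=0} (k+n-1)...(k+1) x^k for |x| < 1."""
    total = 0.0
    for k in range(terms):
        # (k+n-1)(k+n-2)...(k+1) equals (k+n-1)! / k!
        coeff = math.factorial(k + n - 1) // math.factorial(k)
        total += coeff * x**k
    return total


def closed_form(x, n):
    """The value (n-1)! / (1-x)^n predicted by Example 8.7.10."""
    return math.factorial(n - 1) / (1.0 - x) ** n


if __name__ == "__main__":
    for x, n in [(0.5, 3), (-0.3, 4), (0.2, 1)]:
        print(x, n, derivative_series(x, n), closed_form(x, n))
```

For $n = 1$ the coefficient product is empty and the series reduces to the geometric series itself.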
Uniqueness Theorem for Power Series
The following uniqueness result for power series is another consequence of Corollary 8.7.8.
8.7.11 COROLLARY Suppose $\sum a_k(x-c)^k$ and $\sum b_k(x-c)^k$ are two power series which converge for all $x$, $|x-c| < R$, for some $R > 0$. Then
$$\sum_{k=0}^{\infty} a_k(x-c)^k = \sum_{k=0}^{\infty} b_k(x-c)^k, \qquad |x-c| < R,$$
if and only if $a_k = b_k$ for all $k = 0, 1, 2, \ldots$.
Proof. Clearly, if $a_k = b_k$ for all $k$, then the two power series are equal and converge to the same function. Conversely, set
$$f(x) = \sum_{k=0}^{\infty} a_k(x-c)^k \quad \text{and} \quad g(x) = \sum_{k=0}^{\infty} b_k(x-c)^k.$$
If $f(x) = g(x)$ for all $x$, $|x-c| < R$, then $f^{(n)}(x) = g^{(n)}(x)$ for all $n = 0, 1, 2, \ldots$ and all $x$, $|x-c| < R$. In particular, $f^{(n)}(c) = g^{(n)}(c)$ for all $n = 0, 1, 2, \ldots$. Thus by equation (17), $a_n = b_n$ for all $n$. □
Representation of a Function by a Power Series
Up to this point we have shown that if a function $f$ is defined by a power series, that is,
$$f(x) = \sum_{k=0}^{\infty} a_k(x-c)^k, \qquad |x-c| < R,$$
with radius of convergence $R > 0$, then by Corollary 8.7.8, $f$ is infinitely differentiable on $(c-R, c+R)$ and the coefficients $a_k$ are given by $a_k = f^{(k)}(c)/k!$. We now consider the converse question. Given an infinitely differentiable function $f$ on an open interval $I$ and $c \in I$, can $f$ be expressed as a power series in a neighborhood of the point $c$? Specifically, does there exist an $\epsilon > 0$ such that
$$f(x) = \sum_{k=0}^{\infty} a_k(x-c)^k$$
for all $x$, $|x-c| < \epsilon$, with $a_k = f^{(k)}(c)/k!$ for all $k = 0, 1, 2, \ldots$? The following example from Cauchy shows that this is not always possible.
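8.7.12 EXAMPLE Let $f$ be defined on $\mathbb{R}$ by
$$f(x) = \begin{cases} e^{-1/x^2}, & x \ne 0, \\ 0, & x = 0. \end{cases}$$
Since $\lim_{x \to 0} e^{-1/x^2} = \lim_{t \to \infty} e^{-t} = 0$, $f$ is continuous at 0. For $x \ne 0$,
$$f'(x) = \frac{2}{x^3}\, e^{-1/x^2}.$$
When $x = 0$, we have, with $t = 1/h$,
$$f'(0) = \lim_{h \to 0} \frac{f(h) - f(0)}{h} = \lim_{h \to 0} \frac{e^{-1/h^2}}{h} = \lim_{t \to \pm\infty} \frac{t}{e^{t^2}} = 0.$$
The last step follows from l'Hospital's rule. Thus,
$$f'(x) = \begin{cases} \dfrac{2}{x^3}\, e^{-1/x^2}, & x \ne 0, \\ 0, & x = 0. \end{cases}$$
By induction, it follows as above that for each $n \in \mathbb{N}$,
$$f^{(n)}(x) = \begin{cases} P(1/x)\, e^{-1/x^2}, & x \ne 0, \\ 0, & x = 0, \end{cases}$$
where $P$ is a polynomial of degree $3n$. The details are left to the exercises (Exercise 15).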
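Thus the function $f$ is infinitely differentiable on $\mathbb{R}$. If there exists $R > 0$ such that $f(x) = \sum_{k=0}^{\infty} a_k x^k$ for all $x$, $|x| < R$, then by equation (17), $a_k = f^{(k)}(0)/k! = 0$ for all $k$, so $f$ would vanish identically on $(-R, R)$. Since $f(x) > 0$ for every $x \ne 0$, this is impossible. As a consequence, $f$ cannot be represented by a power series that converges to $f$ in a neighborhood of 0.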
Taylor Polynomials and Taylor Series
We now consider the problem of representing a function $f$ in terms of a power series in greater detail. Newton derived the power series expansion of many of the elementary functions by algebraic techniques or term-by-term integration. For example, the series expansion of $1/(1+x)$ can easily be obtained by long division, which upon term-by-term integration gives the power series expansion of $\ln(1+x)$. Maclaurin and Taylor were among the first mathematicians to use Newton's calculus in determining the coefficients in the power series expansion of a function. Both realized that if a function $f(x)$ had a power series expansion $\sum a_k(x-c)^k$, then the coefficients $a_k$ had to be given by $f^{(k)}(c)/k!$.
8.7.13 DEFINITION Let $f$ be a real-valued function defined on an open interval $I$, and let $c \in I$ and $n \in \mathbb{N}$. Suppose $f^{(n)}(x)$ exists for all $x \in I$. The polynomial
$$T_n(f, c)(x) = \sum_{k=0}^{n} \frac{f^{(k)}(c)}{k!} (x-c)^k$$
is called the Taylor polynomial of order $n$ of $f$ at the point $c$. If $f$ is infinitely differentiable on $I$, the series
$$\sum_{k=0}^{\infty} \frac{f^{(k)}(c)}{k!} (x-c)^k$$
is called the Taylor series of $f$ at $c$.
For the special case $c = 0$, the Taylor series of a function $f$ is often referred to as the Maclaurin series. The first three Taylor polynomials $T_0$, $T_1$, $T_2$ are given specifically by
$$T_0(f, c)(x) = f(c),$$
$$T_1(f, c)(x) = f(c) + f'(c)(x-c),$$
$$T_2(f, c)(x) = f(c) + f'(c)(x-c) + \frac{f''(c)}{2!}(x-c)^2.$$
The Taylor polynomial $T_1(f, c)$ is the linear approximation to $f$ at $c$; that is, the equation of the straight line passing through $(c, f(c))$ with slope $f'(c)$. In general, the Taylor polynomial $T_n$ of $f$ is a polynomial of degree less than or equal to $n$ that satisfies
$$T_n^{(k)}(f, c)(c) = f^{(k)}(c)$$
for all $k = 0, 1, \ldots, n$. Since $f^{(n)}(c)$ might possibly be zero, $T_n$ (as the next example shows) could very well be a polynomial of degree strictly less than $n$.
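8.7.14 EXAMPLES In the following examples we compute the Taylor series of several functions. At this stage nothing is implied about the convergence of the series to the function.
(a) Let $f(x) = \sin x$ and take $c = \frac{\pi}{2}$. Then
$$f(\tfrac{\pi}{2}) = \sin\tfrac{\pi}{2} = 1, \quad f'(\tfrac{\pi}{2}) = \cos\tfrac{\pi}{2} = 0, \quad f''(\tfrac{\pi}{2}) = -\sin\tfrac{\pi}{2} = -1, \quad f^{(3)}(\tfrac{\pi}{2}) = -\cos\tfrac{\pi}{2} = 0.$$
Thus
$$T_3(f, \tfrac{\pi}{2})(x) = 1 - \frac{1}{2!}\left(x - \frac{\pi}{2}\right)^2,$$
which is a polynomial of degree 2. In general, if $n$ is odd, $f^{(n)}(\frac{\pi}{2}) = 0$, and if $n = 2k$ is even, $f^{(2k)}(\frac{\pi}{2}) = (-1)^k$. Therefore, if $n$ is even,
$$T_n(f, \tfrac{\pi}{2})(x) = T_{n+1}(f, \tfrac{\pi}{2})(x) = \sum_{k=0}^{n/2} \frac{(-1)^k}{(2k)!}\left(x - \frac{\pi}{2}\right)^{2k}.$$
The Taylor expansion of $f(x) = \sin x$ about $c = \frac{\pi}{2}$ is given by
$$\sum_{k=0}^{\infty} \frac{(-1)^k}{(2k)!}\left(x - \frac{\pi}{2}\right)^{2k}.$$
(b) For the function $f(x) = e^{-1/x^2}$ of Example 8.7.12, $f^{(n)}(0) = 0$ for all $n \in \mathbb{N}$.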
Thus the Taylor series of $f$ at $c = 0$ converges for all $x \in \mathbb{R}$; namely, to the zero function. It, however, does not converge to $f$.
(c) In many instances, the Taylor expansion of a given function can be computed from a known series. As an example, we find the Taylor series expansion of $f(x) = 1/x$ about $c = 2$. This could be done by computing the derivatives of $f$ and evaluating them at $c = 2$. However, it would still remain to be shown that the given series converges to $f(x)$. An easier method is as follows: We first write
$$\frac{1}{x} = \frac{1}{2 - (2-x)} = \frac{1}{2} \cdot \frac{1}{1 - \frac{2-x}{2}}.$$
For $|w| < 1$,
$$\frac{1}{1-w} = \sum_{k=0}^{\infty} w^k.$$
Setting $w = (2-x)/2$, we have
$$\frac{1}{x} = \frac{1}{2} \sum_{k=0}^{\infty} \frac{(2-x)^k}{2^k} = \sum_{k=0}^{\infty} \frac{(-1)^k}{2^{k+1}} (x-2)^k$$
for all $x$, $|x-2| < 2$. By uniqueness, the given series must be the Taylor series of $f(x) = 1/x$. In this instance, the power series also converges to the function $f(x)$ for all $x$ satisfying $|x-2| < 2$. □
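The expansion in part (c) is easy to test numerically. The sketch below is illustrative only: `reciprocal_series` is a hypothetical helper name, and the cutoff of 200 terms is an assumption that is adequate for points well inside the interval $|x - 2| < 2$ but not near its endpoints.

```python
def reciprocal_series(x, terms=200):
    """Partial sum of sum_{k>=0} (-1)^k (x-2)^k / 2^(k+1), valid for |x-2| < 2."""
    total = 0.0
    for k in range(terms):
        total += (-1) ** k * (x - 2.0) ** k / 2.0 ** (k + 1)
    return total


if __name__ == "__main__":
    for x in (1.0, 2.5, 3.0):
        # Compare the partial sum with 1/x at points inside the interval.
        print(x, reciprocal_series(x), 1.0 / x)
```

Convergence slows as $x$ approaches 0 or 4, since the ratio $|x-2|/2$ of the underlying geometric series approaches 1.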
Remainder Estimates
To investigate when the Taylor series of a function $f$ converges to $f(x)$, we consider
$$R_n(x) = R_n(f, c)(x) = f(x) - T_n(f, c)(x). \qquad (18)$$
The function $R_n$ is called the remainder or error function between $f$ and $T_n(f, c)$. Clearly,
$$f(x) = \lim_{n \to \infty} T_n(f, c)(x) \quad \text{if and only if} \quad \lim_{n \to \infty} R_n(f, c)(x) = 0.$$
Since the Taylor polynomial $T_n$ is the $n$th partial sum of the Taylor series of $f$, the Taylor series converges to $f$ at a point $x$ if and only if $\lim_{n \to \infty} R_n(f, c)(x) = 0$. To emphasize this fact, we state it as a theorem.
8.7.15 THEOREM Suppose $f$ is an infinitely differentiable real-valued function on the open interval $I$ and $c \in I$. Then for $x \in I$,
$$f(x) = \sum_{k=0}^{\infty} \frac{f^{(k)}(c)}{k!} (x-c)^k$$
if and only if $\lim_{n \to \infty} R_n(f, c)(x) = 0$.
The formula
$$f(x) = f(c) + f'(c)(x-c) + \frac{f''(c)}{2!}(x-c)^2 + \cdots + \frac{f^{(n)}(c)}{n!}(x-c)^n + R_n(f, c)(x)$$
is known as Taylor's formula with remainder. We now proceed to derive several formulas for the remainder term $R_n$. These can be used to show convergence of $T_n$ to $f$.
Lagrange Form of the Remainder
Our first result, attributed to Joseph Lagrange (1736-1813), is called the Lagrange form of the remainder. This result, sometimes also referred to as Taylor's theorem, was previously proved for the special case $n = 2$ in Lemma 5.4.3.
8.7.16 THEOREM Suppose $f$ is a real-valued function on an open interval $I$, $c \in I$ and $n \in \mathbb{N}$. If $f^{(n+1)}(t)$ exists for every $t \in I$, then for any $x \in I$, there exists a $\zeta$ between $x$ and $c$ such that
$$R_n(x) = R_n(f, c)(x) = \frac{f^{(n+1)}(\zeta)}{(n+1)!} (x-c)^{n+1}. \qquad (19)$$
Remark. Continuity of $f^{(n+1)}$ is not required.
Proof. Fix $x \in I$, and let $M$ be defined by
$$f(x) = T_n(f, c)(x) + M(x-c)^{n+1}.$$
To prove the result, we need to show that $(n+1)!\, M = f^{(n+1)}(\zeta)$ for some $\zeta$ between $x$ and $c$. To accomplish this, set
$$g(t) = f(t) - T_n(f, c)(t) - M(t-c)^{n+1} = R_n(t) - M(t-c)^{n+1}.$$
First, since $T_n$ is a polynomial of degree less than or equal to $n$,
$$g^{(n+1)}(t) = f^{(n+1)}(t) - (n+1)!\, M.$$
Also, since $T_n^{(k)}(f, c)(c) = f^{(k)}(c)$, $k = 0, 1, \ldots, n$,
$$g(c) = g'(c) = \cdots = g^{(n)}(c) = 0.$$
For convenience, let's assume $x > c$. By the choice of $M$, $g(x) = 0$. By the mean value theorem applied to $g$ on the interval $[c, x]$, there exists $x_1$, $c < x_1 < x$, such that
$$0 = g(x) - g(c) = g'(x_1)(x-c).$$
Thus $g'(x_1) = 0$. Since $g'(c) = 0$, by the mean value theorem applied to $g'$ on the interval $[c, x_1]$, $g''(x_2) = 0$ for some $x_2$, $c < x_2 < x_1$. Continuing in this manner, we obtain a point $x_n$ satisfying $c < x_n < x$, such that $g^{(n)}(x_n) = 0$. Applying the mean value theorem once more to the function $g^{(n)}$ on the interval $[c, x_n]$, we obtain the existence of a $\zeta \in (c, x_n)$ such that
$$0 = g^{(n)}(x_n) - g^{(n)}(c) = g^{(n+1)}(\zeta)(x_n - c).$$
Thus $g^{(n+1)}(\zeta) = 0$; i.e., $f^{(n+1)}(\zeta) - (n+1)!\, M = 0$, for some $\zeta$ between $x$ and $c$. □
In Example 8.7.20 we will give several examples to show how the remainder estimates may be used to prove convergence of the Taylor series to its defining function. In the following example we show how the previous theorem may be used to derive simple estimates and inequalities.
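8.7.17 EXAMPLES
(a) In this example we use Theorem 8.7.16 with $n = 2$ to approximate $f(x) = \sqrt{1+x}$, $x > -1$. With $c = 0$ we find that
$$f(0) = 1, \qquad f'(0) = \tfrac{1}{2}, \qquad f''(0) = -\tfrac{1}{4}.$$
Therefore, $T_2(f, 0)(x) = 1 + \frac{1}{2}x - \frac{1}{8}x^2$, and thus
$$\sqrt{1+x} = 1 + \tfrac{1}{2}x - \tfrac{1}{8}x^2 + R_2(f, 0)(x).$$
By formula (19),
$$R_2(f, 0)(x) = \frac{f^{(3)}(\zeta)}{3!}\, x^3 = \frac{1}{16}(1+\zeta)^{-5/2} x^3$$
for some $\zeta$ between 0 and $x$. If $x > 0$, then $\zeta > 0$, and thus $(1+\zeta)^{-5/2} < 1$. Therefore, we have
$$\left| \sqrt{1+x} - T_2(f, 0)(x) \right| < \tfrac{1}{16} x^3 \qquad \text{for any } x > 0.$$
If we let $x = 0.4$, then $T_2(f, 0)(0.4) = 1.18$, and by the above, $|\sqrt{1.4} - 1.18| < 0.004$, so that two-decimal-place accuracy is assured. In fact, to five decimal places $\sqrt{1.4} = 1.18322$.
(b) The error estimates can also be used to derive inequalities. As in the previous example,
$$\sqrt{1+x} = 1 + \tfrac{1}{2}x - \tfrac{1}{8}x^2 + R_2(f, 0)(x).$$
For $x > 0$ we have $0 < R_2(f, 0)(x) < \frac{1}{16}x^3$. Thus
$$1 + \tfrac{1}{2}x - \tfrac{1}{8}x^2 < \sqrt{1+x} < 1 + \tfrac{1}{2}x - \tfrac{1}{8}x^2 + \tfrac{1}{16}x^3$$
for all $x > 0$.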
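The estimates of Example 8.7.17 can be confirmed with a few lines of arithmetic. This is a minimal sketch under the example's own assumptions ($c = 0$, $n = 2$, $x > 0$); the helper names `t2_sqrt` and `lagrange_bound` are illustrative and do not come from the text.

```python
import math


def t2_sqrt(x):
    """Taylor polynomial T_2 of sqrt(1 + x) at c = 0."""
    return 1.0 + x / 2.0 - x * x / 8.0


def lagrange_bound(x):
    """Upper bound x^3 / 16 on the remainder R_2 for x > 0."""
    return x**3 / 16.0


if __name__ == "__main__":
    x = 0.4
    error = math.sqrt(1.0 + x) - t2_sqrt(x)
    # The error should be positive and below the Lagrange bound.
    print("T2:", t2_sqrt(x), "error:", error, "bound:", lagrange_bound(x))
```

The same two functions also express the inequality of part (b): for $x > 0$ the true value lies strictly between `t2_sqrt(x)` and `t2_sqrt(x) + lagrange_bound(x)`.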
Integral Form of the Remainder
Another formula for $R_n(f, c)$ is given by the following integral form of the remainder. This, however, does require the additional hypothesis that the $(n+1)$st derivative is Riemann integrable.
8.7.18 THEOREM Suppose $f$ is a real-valued function on an open interval $I$, $c \in I$ and $n \in \mathbb{N}$. If $f^{(n+1)}(t)$ exists for every $t \in I$ and is Riemann integrable on every closed and bounded subinterval of $I$, then
$$R_n(x) = R_n(f, c)(x) = \frac{1}{n!} \int_c^x f^{(n+1)}(t)(x-t)^n \, dt, \qquad x \in I. \qquad (20)$$
Proof. The result is proved by induction on $n$. Suppose $n = 1$. Then
$$R_1(x) = f(x) - f(c) - f'(c)(x-c),$$
which by the fundamental theorem of calculus
$$= \int_c^x f'(t)\, dt - f'(c) \int_c^x dt = \int_c^x [f'(t) - f'(c)]\, dt.$$
From the integration by parts formula (Theorem 6.3.7) with
$$u(t) = f'(t) - f'(c), \quad v'(t) = 1, \quad u'(t) = f''(t), \quad v(t) = (t - x),$$
we obtain
$$\int_c^x [f'(t) - f'(c)]\, dt = [f'(t) - f'(c)](t-x) \Big|_c^x - \int_c^x f''(t)(t-x)\, dt = \int_c^x (x-t) f''(t)\, dt.$$
To complete the proof, we assume that the result holds for $n = k$, and prove that this implies the result for $n = k+1$. Thus assume $R_k(x)$ is given by equation (20). Then
$$R_{k+1}(x) = f(x) - T_{k+1}(f, c)(x) = f(x) - T_k(f, c)(x) - \frac{f^{(k+1)}(c)}{(k+1)!}(x-c)^{k+1}$$
$$= R_k(x) - \frac{f^{(k+1)}(c)}{(k+1)!}(x-c)^{k+1}$$
$$= \frac{1}{k!} \int_c^x (x-t)^k f^{(k+1)}(t)\, dt - \frac{1}{k!} f^{(k+1)}(c) \int_c^x (x-t)^k\, dt$$
$$= \frac{1}{k!} \int_c^x \left[ f^{(k+1)}(t) - f^{(k+1)}(c) \right] (x-t)^k\, dt.$$
As for the case $n = 1$, we again use integration by parts with
$$u(t) = f^{(k+1)}(t) - f^{(k+1)}(c) \quad \text{and} \quad v(t) = -\frac{1}{k+1}(x-t)^{k+1},$$
which upon simplification gives
$$R_{k+1}(x) = \frac{1}{(k+1)!} \int_c^x (x-t)^{k+1} f^{(k+2)}(t)\, dt. \qquad \Box$$
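Formula (20) can be compared numerically with the definition $R_n(x) = f(x) - T_n(f, c)(x)$. The following sketch takes $f(x) = e^x$ and $c = 0$, a choice made here for illustration because every derivative of $f$ is again $e^x$. The composite Simpson rule and all function names are assumptions of this sketch, not part of the text.

```python
import math


def taylor_poly_exp(x, n):
    """T_n(exp, 0)(x) = sum_{k=0}^{n} x^k / k!."""
    return sum(x**k / math.factorial(k) for k in range(n + 1))


def simpson(g, a, b, panels=1000):
    """Composite Simpson approximation of the integral of g from a to b (panels even)."""
    h = (b - a) / panels
    total = g(a) + g(b)
    for i in range(1, panels):
        total += (4 if i % 2 else 2) * g(a + i * h)
    return total * h / 3.0


def integral_remainder_exp(x, n):
    """Right side of (20) for f = exp, c = 0: (1/n!) * integral_0^x e^t (x-t)^n dt."""
    return simpson(lambda t: math.exp(t) * (x - t) ** n, 0.0, x) / math.factorial(n)


if __name__ == "__main__":
    for x, n in [(1.5, 3), (-1.0, 2), (0.7, 5)]:
        direct = math.exp(x) - taylor_poly_exp(x, n)
        print(x, n, direct, integral_remainder_exp(x, n))
```

The two columns agree up to quadrature and rounding error. For $x < c$ the step size is negative, so the quadrature returns the signed integral from $c$ to $x$, as the formula requires.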
Cauchy's Form for the Remainder
Under the additional assumption of continuity of $f^{(n+1)}$ we obtain Cauchy's form for the remainder as follows.
8.7.19 COROLLARY Let $f$ be a real-valued function on an open interval $I$, $c \in I$ and $n \in \mathbb{N}$. If $f^{(n+1)}$ is continuous on $I$, then for each $x \in I$, there exists a $\zeta$ between $c$ and $x$ such that
$$R_n(x) = R_n(f, c)(x) = \frac{f^{(n+1)}(\zeta)}{n!} (x-c)(x-\zeta)^n. \qquad (21)$$
Proof. Since $f^{(n+1)}(t)(x-t)^n$ is continuous on the interval from $c$ to $x$, by the mean value theorem for integrals (Theorem 6.3.6), there exists a $\zeta$ between $c$ and $x$ such that
$$\int_c^x f^{(n+1)}(t)(x-t)^n\, dt = (x-c)\, f^{(n+1)}(\zeta)(x-\zeta)^n.$$
The result now follows by equation (20). □
We now compute the Taylor series for several elementary functions, and use the previous formulas for the remainder to show that the series converges to the function.
8.7.20 EXAMPLES
(a) As our first example, we prove the binomial theorem (Theorem 2.2.5). For $n \in \mathbb{N}$ let $f(x) = (1+x)^n$, $x \in \mathbb{R}$. Since $f$ is a polynomial of degree $n$, if $k > n$ then $f^{(k)}(x) = 0$ for all $x \in \mathbb{R}$. Therefore, by Theorem 8.7.16,
$$f(x) = \sum_{k=0}^{n} \frac{f^{(k)}(0)}{k!}\, x^k.$$
But $f^{(k)}(0) = n!/(n-k)!$ for $k = 0, 1, \ldots, n$. Therefore,
$$(1+x)^n = \sum_{k=0}^{n} \frac{n!}{k!(n-k)!}\, x^k.$$
The series expansion of $(1+x)^{\alpha}$ for $\alpha \in \mathbb{R}$ with $\alpha < 0$ is given in Theorem 8.8.4, whereas the expansion for $\alpha > 0$ is given in Exercise 7 of the next section. For rational numbers $\alpha$, the expansion of $(1+x)^{\alpha}$ was known to Newton as early as 1664.
(b) Let $f(x) = \sin x$ with $c = 0$. Then
$$f^{(n)}(x) = \begin{cases} (-1)^k \cos x, & n = 2k+1, \\ (-1)^k \sin x, & n = 2k. \end{cases}$$
Thus $f^{(n)}(0) = 0$ for all even $n \in \mathbb{N}$, and $f^{(n)}(0) = (-1)^k$ whenever $n = 2k+1$, $k = 0, 1, 2, \ldots$. Therefore, the Taylor series of $f$ at $c = 0$ is given by
$$\sum_{k=0}^{\infty} \frac{(-1)^k}{(2k+1)!}\, x^{2k+1}.$$
To show convergence of the series to $\sin x$ we consider the remainder term $R_n(x)$. By Theorem 8.7.16, for each $x \in \mathbb{R}$ there exists a $\zeta$ such that
$$R_n(x) = \frac{f^{(n+1)}(\zeta)}{(n+1)!}\, x^{n+1}.$$
Since $|f^{(n+1)}(x)| \le 1$ for all $x$, we have
$$|R_n(x)| \le \frac{|x|^{n+1}}{(n+1)!}.$$
By Theorem 2.2.6(f), $\lim_{n \to \infty} |x|^{n+1}/(n+1)! = 0$ for any $x \in \mathbb{R}$. As a consequence, $\lim_{n \to \infty} R_n(x) = 0$ for all $x \in \mathbb{R}$, and thus
$$\sin x = \sum_{k=0}^{\infty} \frac{(-1)^k}{(2k+1)!}\, x^{2k+1}, \qquad x \in \mathbb{R}.$$
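The remainder bound for the sine series lends itself to a direct numerical check. The following is a small sketch with illustrative names (`taylor_sin`, `remainder_bound`); it compares the actual error of $T_n$ with the bound $|x|^{n+1}/(n+1)!$ derived above.

```python
import math


def taylor_sin(x, n):
    """T_n(sin, 0)(x): sum of the terms (-1)^k x^(2k+1) / (2k+1)! with 2k+1 <= n."""
    total = 0.0
    k = 0
    while 2 * k + 1 <= n:
        total += (-1) ** k * x ** (2 * k + 1) / math.factorial(2 * k + 1)
        k += 1
    return total


def remainder_bound(x, n):
    """The Lagrange bound |x|^(n+1) / (n+1)! on |R_n(x)|."""
    return abs(x) ** (n + 1) / math.factorial(n + 1)


if __name__ == "__main__":
    x = 2.0
    for n in (1, 3, 5, 9, 15):
        error = abs(math.sin(x) - taylor_sin(x, n))
        print(n, error, remainder_bound(x, n))
```

For fixed $x$ the bound eventually decreases faster than any geometric sequence, which is the content of Theorem 2.2.6(f) as used in the example.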
The sine function, as well as the cosine function, can be defined strictly in terms of power series. For further details, see Miscellaneous Exercise 3.
(c) As our third example, we derive the Taylor series for $f(x) = \ln(1+x)$, where, as in Example 6.3.5,
$$\ln x = \int_1^x \frac{1}{t}\, dt, \qquad x > 0,$$
denotes the natural logarithm function on $(0, \infty)$. Then $f(0) = \ln(1) = 0$, and by the fundamental theorem of calculus, $f'(x) = 1/(1+x)$. Thus for $n = 1, 2, \ldots$,
$$f^{(n)}(x) = (-1)^{n+1} \frac{(n-1)!}{(1+x)^n}.$$
In particular, $f^{(n)}(0) = (-1)^{n+1}(n-1)!$, and the Taylor series of $f$ at 0 becomes
$$\sum_{n=1}^{\infty} \frac{(-1)^{n+1}}{n}\, x^n.$$
Although we have already proved that this series converges to $\ln(1+x)$ for all $x$, $-1 < x \le 1$ (Example 8.7.6), we will prove this again to illustrate the use of the remainder formulas. Suppose first that $0 < x$ ...

... $> 0$. Prove the following.
a. $f(x)$ is even if and only if $a_k = 0$ for all odd $k$.
b. $f(x)$ is odd if and only if $a_k = 0$ for all even $k$.
14. Suppose $f(x) = \sum_{k=0}^{\infty} a_k x^k$, $|x| < R_1$, and $g(x) = \sum_{k=0}^{\infty} b_k x^k$, $|x| < R_2$. Prove that $f(x)g(x) = \sum_{k=0}^{\infty} c_k x^k$, $|x| < \min\{R_1, R_2\}$, where $c_k = \sum_{j=0}^{k} a_j b_{k-j}$.
15. Let $f: \mathbb{R} \to \mathbb{R}$ be defined by $f(x) = e^{-1/x^2}$ for $x \ne 0$, and $f(0) = 0$. Prove that for each $n \in \mathbb{N}$,
$$f^{(n)}(x) = \begin{cases} P(1/x)\, e^{-1/x^2}, & x \ne 0, \\ 0, & x = 0, \end{cases}$$
where $P$ is a polynomial of degree $3n$.
16. Suppose $b > 1$. For $x \in \mathbb{R}$ define $b(x) = E(x \ln b)$, where $E$ is the natural exponential function.
a. Prove that $b(r) = b^r$ for all $r \in \mathbb{Q}$.
b. For $x \in \mathbb{R}$, prove that $b(x) = \sup\{b^r : r \in \mathbb{Q},\ r < x\}$.
8.8 The Gamma Function
We close this chapter with a brief discussion of the Beta and Gamma functions, both of which are attributed to Euler. The Gamma function is closely related to factorials, and arises in many areas of mathematics. The origin, history, and development of the Gamma function are described very nicely in the article by Philip Davis listed in the supplemental reading. Our primary application of the Gamma function will be in the Taylor expansion of $(1-x)^{-\alpha}$, where $\alpha > 0$ is arbitrary.
8.8.1 DEFINITION For $0 < x < \infty$, the Gamma function $\Gamma(x)$ is defined by
$$\Gamma(x) = \int_0^{\infty} t^{x-1} e^{-t}\, dt. \qquad (22)$$
When $0 < x < 1$, the integral in equation (22) is an improper integral not only at $\infty$, but also at 0. The convergence of the improper integral defining $\Gamma(x)$, $x > 0$, was given as an exercise (Exercise 9) in Section 6.4. The graph of $\Gamma(x)$ for $0 < x \le 5$ is given in Figure 8.8. The following properties of the Gamma function show that it is closely related to factorials.
Figure 8.8 Graph of $\Gamma(x)$, $0 < x \le 5$
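8.8.2 THEOREM
(a) For each $x$, $0 < x < \infty$, $\Gamma(x+1) = x\,\Gamma(x)$.
(b) For $n \in \mathbb{N}$, $\Gamma(n+1) = n!$.
Proof. Let $0 < c < R < \infty$. We apply integration by parts to
$$\int_c^R t^x e^{-t}\, dt.$$
With $u = t^x$ and $v' = e^{-t}$,
$$\int_c^R t^x e^{-t}\, dt = -t^x e^{-t} \Big|_c^R + x \int_c^R t^{x-1} e^{-t}\, dt = \frac{c^x}{e^c} - \frac{R^x}{e^R} + x \int_c^R t^{x-1} e^{-t}\, dt.$$
Since $\lim_{c \to 0^+} c^x e^{-c} = 0$ and $\lim_{R \to \infty} R^x e^{-R} = 0$, taking the appropriate limits in the above yields
$$\Gamma(x+1) = \int_0^{\infty} t^x e^{-t}\, dt = x \int_0^{\infty} t^{x-1} e^{-t}\, dt = x\,\Gamma(x).$$
This proves (a). For the proof of (b) we first note that
$$\Gamma(1) = \int_0^{\infty} e^{-t}\, dt = 1.$$
Thus by induction, $\Gamma(n+1) = n!$. □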
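Both parts of Theorem 8.8.2 can be spot-checked with Python's standard library. The sketch below assumes that `math.gamma` evaluates the same function as the integral (22) for positive arguments; the helper names are illustrative.

```python
import math


def satisfies_recurrence(x, rel_tol=1e-12):
    """Check Gamma(x + 1) = x * Gamma(x) for a positive real x, up to rounding."""
    return math.isclose(math.gamma(x + 1.0), x * math.gamma(x), rel_tol=rel_tol)


def matches_factorial(n, rel_tol=1e-12):
    """Check Gamma(n + 1) = n! for a positive integer n, up to rounding."""
    return math.isclose(math.gamma(n + 1), math.factorial(n), rel_tol=rel_tol)


if __name__ == "__main__":
    print([satisfies_recurrence(x) for x in (0.3, 1.7, 4.25)])
    print([matches_factorial(n) for n in range(1, 10)])
```

A relative tolerance is used because the values of $\Gamma$ grow factorially, so an absolute tolerance would be meaningless for larger arguments.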
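8.8.3 EXAMPLE Since the value of $\Gamma(\frac{1}{2})$ occurs frequently, we now show that $\Gamma(\frac{1}{2}) = \sqrt{\pi}$. By definition,
$$\Gamma\left(\tfrac{1}{2}\right) = \int_0^{\infty} t^{-1/2} e^{-t}\, dt.$$
With the substitution $t = s^2$,
$$\Gamma\left(\tfrac{1}{2}\right) = 2 \int_0^{\infty} e^{-s^2}\, ds.$$
To complete the result, we need to evaluate the so-called probability integral $\int_0^{\infty} e^{-s^2}\, ds$. This can be accomplished by the following trick using the change of variables theorem from multivariable calculus. Consider the double integral
$$J = \int_0^{\infty} \int_0^{\infty} e^{-(x^2+y^2)}\, dx\, dy.$$
By changing to polar coordinates
$$x = r\cos\theta, \qquad y = r\sin\theta,$$
with $0 < r < \infty$, $\theta \in (0, \frac{\pi}{2})$,
$$J = \int_0^{\pi/2} \int_0^{\infty} e^{-r^2}\, r\, dr\, d\theta = \frac{\pi}{2} \int_0^{\infty} e^{-r^2}\, r\, dr = \frac{\pi}{4}.$$
On the other hand,
$$J = \int_0^{\infty} \int_0^{\infty} e^{-x^2} e^{-y^2}\, dx\, dy = \left[ \int_0^{\infty} e^{-x^2}\, dx \right]^2.$$
Therefore,
$$\int_0^{\infty} e^{-x^2}\, dx = \frac{\sqrt{\pi}}{2},$$
from which the result follows.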
The Binomial Series
As an application of the Gamma function, we will derive the power series expansion of $f(x) = (1-x)^{-\alpha}$, where $\alpha > 0$ is real. The coefficients of this expansion are expressed very nicely in terms of the Gamma function. By Example 8.7.10, for $n \in \mathbb{N}$,
$$\frac{1}{(1-x)^n} = \frac{1}{(n-1)!} \sum_{k=0}^{\infty} \frac{(k+n-1)!}{k!}\, x^k,$$
which in terms of the Gamma function gives
$$\frac{1}{(1-x)^n} = \frac{1}{\Gamma(n)} \sum_{k=0}^{\infty} \frac{\Gamma(k+n)}{k!}\, x^k.$$
We will now prove that this formula is still valid for all $\alpha \in \mathbb{R}$ with $\alpha > 0$.
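8.8.4 THEOREM (Binomial Series) For $\alpha > 0$,
$$\frac{1}{(1-x)^{\alpha}} = \frac{1}{\Gamma(\alpha)} \sum_{n=0}^{\infty} \frac{\Gamma(n+\alpha)}{n!}\, x^n, \qquad |x| < 1.$$
Proof. We first show that the radius of convergence of the series is $R = 1$. Set $a_n = \Gamma(n+\alpha)/n!$. Then
$$\frac{a_{n+1}}{a_n} = \frac{\Gamma(n+1+\alpha)}{(n+1)!} \cdot \frac{n!}{\Gamma(n+\alpha)}.$$
But by Theorem 8.8.2, $\Gamma(n+1+\alpha) = (n+\alpha)\,\Gamma(n+\alpha)$. Therefore,
$$\lim_{n \to \infty} \frac{a_{n+1}}{a_n} = \lim_{n \to \infty} \frac{n+\alpha}{n+1} = 1,$$
and as a consequence of Theorem 7.1.10, we have $R = 1$.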
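To show that the series actually converges to $(1-x)^{-\alpha}$, we set
$$f_{\alpha}(x) = \frac{1}{\Gamma(\alpha)} \sum_{n=0}^{\infty} \frac{\Gamma(n+\alpha)}{n!}\, x^n, \qquad |x| < 1.$$
Since a power series can be differentiated term-by-term,
$$f_{\alpha}'(x) = \frac{1}{\Gamma(\alpha)} \sum_{n=1}^{\infty} \frac{n\,\Gamma(n+\alpha)}{n!}\, x^{n-1}.$$
Multiplying by $(1-x)$ gives
$$(1-x) f_{\alpha}'(x) = \frac{1}{\Gamma(\alpha)} \sum_{n=1}^{\infty} \frac{n\,\Gamma(n+\alpha)}{n!} (1-x)\, x^{n-1}$$
$$= \frac{1}{\Gamma(\alpha)} \left[ \sum_{n=1}^{\infty} \frac{\Gamma(n+\alpha)}{(n-1)!}\, x^{n-1} - \sum_{n=1}^{\infty} \frac{n\,\Gamma(n+\alpha)}{n!}\, x^n \right]$$
$$= \frac{1}{\Gamma(\alpha)} \sum_{n=0}^{\infty} \frac{\Gamma(n+1+\alpha) - n\,\Gamma(n+\alpha)}{n!}\, x^n.$$
But $\Gamma(n+1+\alpha) - n\,\Gamma(n+\alpha) = \alpha\,\Gamma(n+\alpha)$. Therefore,
$$(1-x) f_{\alpha}'(x) = \alpha f_{\alpha}(x).$$
As a consequence,
$$\frac{d}{dx}\left[ (1-x)^{\alpha} f_{\alpha}(x) \right] = -\alpha(1-x)^{\alpha-1} f_{\alpha}(x) + (1-x)^{\alpha} f_{\alpha}'(x)$$
$$= -\alpha(1-x)^{\alpha-1} f_{\alpha}(x) + \alpha(1-x)^{\alpha-1} f_{\alpha}(x) = 0.$$
Therefore, $(1-x)^{\alpha} f_{\alpha}(x)$ is equal to a constant for all $x$, $|x| < 1$. But $f_{\alpha}(0) = 1$. Thus $(1-x)^{\alpha} f_{\alpha}(x) = 1$; that is,
$$f_{\alpha}(x) = (1-x)^{-\alpha},$$
which proves the result. □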
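The binomial series can be summed numerically without evaluating the Gamma function at all, because the proof shows that consecutive coefficients satisfy $a_{n+1}/a_n = (n+\alpha)/(n+1)$. The sketch below uses that ratio; the function name and the cutoff of 400 terms are assumptions of this illustration, suitable for $|x|$ not too close to 1.

```python
import math


def binomial_series(x, alpha, terms=400):
    """Partial sum of (1/Gamma(alpha)) * sum_{n>=0} Gamma(n+alpha)/n! * x^n for |x| < 1."""
    term = 1.0  # n = 0 term: Gamma(alpha) / (Gamma(alpha) * 0!) = 1
    total = 0.0
    for n in range(terms):
        total += term
        # Next term via the coefficient ratio (n + alpha) / (n + 1) from Theorem 8.8.2.
        term *= (n + alpha) / (n + 1) * x
    return total


if __name__ == "__main__":
    for x, alpha in [(0.5, 0.5), (-0.7, 2.3), (0.25, 3.0)]:
        print(x, alpha, binomial_series(x, alpha), (1.0 - x) ** (-alpha))
```

Building each term from the previous one avoids the overflow that the quotient $\Gamma(n+\alpha)/n!$ would eventually cause if numerator and denominator were computed separately for large $n$.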
The Beta Function There are a number of important integrals that can be expressed in terms of the Gamma function. Some of these, which can be obtained by a change of variables, are given in the exercises. There is one integral, however, that is very important and thus we state it as a theorem. Since the proof is nontrivial and would take us too far astray, we state the result without proof. For a proof of the theorem, see Theorem 8.20 in the text by Rudin.
8.8.5 THEOREM For $x > 0$, $y > 0$,
$$\int_0^1 t^{x-1}(1-t)^{y-1}\, dt = \frac{\Gamma(x)\,\Gamma(y)}{\Gamma(x+y)}.$$
The function
$$B(x, y) = \frac{\Gamma(x)\,\Gamma(y)}{\Gamma(x+y)}, \qquad x, y > 0,$$
is called the Beta function.
EXERCISES 8.8 1. 'a. Compute r(z), r('2222).
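b. Prove that for $n \in \mathbb{N}$,
$$\Gamma\left(n + \tfrac{1}{2}\right) = \frac{(2n)!\,\sqrt{\pi}}{4^n\, n!}.$$
2. By making a change of variable, prove that
$$\Gamma(x) = \int_0^1 \left( \ln\frac{1}{t} \right)^{x-1} dt, \qquad 0 < x < \infty.$$
...
a. Prove that the series $\sum_{k=0}^{\infty} \binom{\alpha}{k} x^k$ converges uniformly and absolutely for $x \in [-1, 1]$.
b. Prove that $\sum_{k=0}^{\infty} \binom{\alpha}{k} x^k = (1+x)^{\alpha}$, $x \in [-1, 1]$.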
NOTES
Without question the most important concept of this chapter is that of uniform convergence of a sequence or series of functions. It is the additional hypothesis required in proving that the limit function of a sequence of continuous or integrable functions is again continuous or integrable. As was shown by numerous examples, pointwise convergence is not sufficient. For differentiation, uniform convergence of $\{f_n\}$ is not sufficient; uniform convergence of the sequence of derivatives $\{f_n'\}$ is also required.
The example of Weierstrass (Example 8.5.3) is interesting for several reasons. First, it provides an example of a continuous function which is nowhere differentiable on $\mathbb{R}$. Furthermore, it provides an example of a sequence of infinitely differentiable functions that converges uniformly on $\mathbb{R}$, but for which the limit function is nowhere differentiable. Exercise 7 of Section 8.5 provides another construction of a continuous function $f$ that is nowhere differentiable. Although this construction is much easier, the partial sums of the series defining the function $f$ are themselves not differentiable everywhere. Thus it is not so surprising that $f$ itself is not differentiable anywhere on $\mathbb{R}$.
The proof of the Weierstrass approximation theorem presented in the text is only one of the many proofs available. A constructive proof by S. N. Bernstein using the so-called Bernstein polynomials can be found on page 107 of the text by Natanson listed in the Bibliography. The proof in the text, using approximate identities, was chosen because the technique involved is very important in analysis and will be encountered later in the text. In Theorem 9.4.5 we will prove a variation of the Weierstrass approximation theorem. At that point we will show that every continuous real-valued function on $[-\pi, \pi]$ with $f(-\pi) = f(\pi)$ can be uniformly approximated to within a given $\epsilon > 0$ by a finite sum of a trigonometric series.
MISCELLANEOUS EXERCISES 1. Using Miscellaneous Exercise I of Chapter 6 and the Weierstrass approximation theorem, prove the following: If f E 9t.[a, b] and e > 0 is given, then there exists a polynomial P such that
tk PI
0 is chosen so that f f(x) dx = 1. For A > 0, set fx(x) = i f(Ax). a. Prove that fa E C(R) for all A > 0. b. Prove that fa(x) = 0 for all x E R, I x I z A, and that f ,, fx (x) dx = 1. c. Prove that for every 8 > 0, flasl1l} fa(r) dr = 0. 3. In this exercise we show how the trigonometric functions may be defined by means of power series. Define the functions S and C on R by S(x)
$$= \sum_{k=0}^{\infty} \frac{(-1)^k}{(2k+1)!}\, x^{2k+1}, \qquad C(x) = \sum_{k=0}^{\infty} \frac{(-1)^k}{(2k)!}\, x^{2k}.$$
a. Show that the power series defining $S$ and $C$ converge for all $x \in \mathbb{R}$.
b. Show that $S'(x) = C(x)$ and $C'(x) = -S(x)$, $x \in \mathbb{R}$.
c. Show that $S''(x) = -S(x)$ and $C''(x) = -C(x)$.
d. Show that if $f: \mathbb{R} \to \mathbb{R}$ satisfies $f''(x) = -f(x)$ with $f(0) = 0$, $f'(0) = 1$, then $f(x) = S(x)$ for all $x \in \mathbb{R}$.
e. If $f: \mathbb{R} \to \mathbb{R}$ satisfies $f''(x) = -f(x)$, prove that there exist constants $c_1$, $c_2$ such that $f(x) = c_1 S(x) + c_2 C(x)$.
f. Show that $(S(x))^2 + (C(x))^2 = 1$. (Hint: Consider the function $f(x) = (S(x))^2 + (C(x))^2$.)
g. Show that $C(x+y) = C(x)C(y) - S(x)S(y)$ and $S(x+y) = S(x)C(y) + C(x)S(y)$ for all $x, y \in \mathbb{R}$.
SUPPLEMENTAL READING
Andrushkiw, J. W., "A note on multiple series of positive terms," Amer. Math. Monthly 68 (1961), 253-258.
Billingsley, P., "Van der Waerden's continuous nowhere differentiable function," Amer. Math. Monthly 89 (1982), 691.
Blank, A. A., "A simple example of a Weierstrass function," Amer. Math. Monthly 73 (1966), 515-519.
Boas, Jr., R. P., "Partial sums of infinite series and how they grow," Amer. Math. Monthly 84 (1977), 237-258.
Boas, Jr., R. P. and Pollard, H., "Continuous analogues of series," Amer. Math. Monthly 80 (1973), 18-25.
Cunningham, Jr., F., "Taking limits under the integral sign," Math. Mag. 40 (1967), 179-186.
Davis, P. J., "Leonhard Euler's integral: A historical profile of the Gamma function," Amer. Math. Monthly 66 (1959), 849-869.
French, A. P., "The integral definition of the logarithm and the logarithmic series," Amer. Math. Monthly 85 (1978), 580-582.
Kestleman, H., "Riemann integration of limit functions," Amer. Math. Monthly 77 (1970), 182-187.
Lewin, J. W., "Some applications of the bounded convergence theorem for an introductory course in analysis," Amer. Math. Monthly 94 (1987), 988-993.
Mathé, P., "Approximation of Hölder continuous functions by Bernstein polynomials," Amer. Math. Monthly 106 (1999), 568725.
Miller, K. S., "Derivatives of noninteger order," Math. Mag. 68 (1995), 183-192.
Minassian, D. P. and Gaisser, J. W., "A simple Weierstrass function," Amer. Math. Monthly 91 (1984), 254-256.
Patin, J. M., "A very short proof of Stirling's formula," Amer. Math. Monthly 96 (1989), 41-42.
Roy, Ranjan, "The discovery of the series formula for $\pi$ by Leibniz, Gregory and Nilakantha," Math. Mag. 63 (1990), 291-306.
Sagan, H., "An elementary proof that Schoenberg's space filling curve is nowhere differentiable," Math. Mag. 65 (1992), 125-128.
Schenkman, Eugene, "The Weierstrass approximation theorem," Amer. Math. Monthly 79 (1972), 65-66.
Weinstock, Robert, "Elementary evaluations of $\int_0^{\infty} e^{-x^2}\, dx$, $\int_0^{\infty} \cos x^2\, dx$, and $\int_0^{\infty} \sin x^2\, dx$," Amer. Math. Monthly 97 (1990), 39-42.
9 Orthogonal Functions and Fourier Series
9.1 Orthogonal Functions
9.2 Completeness and Parseval's Equality
9.3 Trigonometric and Fourier Series
9.4 Convergence in the Mean of Fourier Series
9.5 Pointwise Convergence of Fourier Series
In this chapter we consider the problem of expressing a real-valued periodic function of period $2\pi$ in terms of a trigonometric series
$$\frac{1}{2} a_0 + \sum_{n=1}^{\infty} (a_n \cos nx + b_n \sin nx),$$
where the $a_n$ and $b_n$ are real numbers. As we will see, such series afford much greater generality in the type of functions that can be represented as opposed to Taylor series. The study of trigonometric series has its origins in the monumental work of Joseph Fourier (1768-1830) on heat conduction in solids. His 1807 presentation to the French Academy introduced a whole new subject area in mathematics while at the same time providing very useful techniques for solving physical problems.
Fourier's work is the source of all modern methods in mathematical physics involving boundary value problems and has been a source of new ideas in mathematical analysis for the past two centuries. To see how greatly mathematics has been influenced by the studies of Fourier, one only needs to look at the two-volume work Trigonometric Series by A. Zygmund (Cambridge University Press, 1968). In addition to trigonometric series, Fourier's original method of separation of variables leads naturally to the study of orthogonal functions and the representation of functions in terms of a series of orthogonal functions. All of these have many applications in mathematical physics and engineering.
Fourier initially claimed and tried to show, with no success, that the Fourier series expansion of a function actually represented the function. Although his claim is false, in view of the eighteenth-century concept of a function this was not an unrealistic expectation. Fourier's claim had an immediate impact on nineteenth-century mathematics. It caused mathematicians to reconsider the definition of "function." The question of what type of function has a Fourier series expansion also led Riemann to the development of the theory of the integral and the notion of an integrable function. The first substantial progress on the convergence of a Fourier series to its defining function is due to Dirichlet in 1829. Instead of trying to prove, like Fourier, that the Fourier series always converges to its defining function, Dirichlet considered the more restrictive problem of finding sufficient conditions on the function $f$ for which the Fourier series converges pointwise to the function.
In the first section, we provide a brief introduction to the theory of orthogonal functions and to the concept of approximation in the mean. In Section 9.2 we also introduce the notion of a complete sequence of orthogonal functions and show that this is equivalent to convergence in the mean of the sequence of partial sums of the Fourier series to its defining function. The proof of the completeness of the trigonometric system $\{1, \sin nx, \cos nx\}_{n=1}^{\infty}$ will be presented in Section 9.4. In this section we also prove Fejér's theorem on the uniform approximation of a continuous function by the $n$th partial sum of a trigonometric series. In the final section, we present Dirichlet's contributions to the pointwise convergence problem.
9.1 Orthogonal Functions
In this section we provide a brief introduction to orthogonal functions and the question of representing a function by means of a series of orthogonal functions. Although these topics have their origins in the study of partial differential equations and boundary value problems,¹ they are closely related to concepts normally encountered in the study of vector spaces. If $X$ is a vector space over $\mathbb{R}$ (see Definition 7.4.7), a function $\langle\,,\rangle : X \times X \to \mathbb{R}$ is an inner product on $X$ if
(a) $\langle x, x \rangle \ge 0$ for all $x \in X$,
(b) $\langle x, x \rangle = 0$ if and only if $x = 0$,
(c) $\langle x, y \rangle = \langle y, x \rangle$ for all $x, y \in X$, and
(d) $\langle ax + by, z \rangle = a\langle x, z \rangle + b\langle y, z \rangle$ for all $x, y, z \in X$ and $a, b \in \mathbb{R}$.
In $\mathbb{R}^n$, the usual inner product is given by
$$\langle a, b \rangle = \sum_{j=1}^{n} a_j b_j$$
for $a = (a_1, \ldots, a_n)$ and $b = (b_1, \ldots, b_n)$ in $\mathbb{R}^n$. If $\langle\,,\rangle$ is an inner product on $X$, then two nonzero vectors $x, y \in X$ are orthogonal if $\langle x, y \rangle = 0$. The term "orthogonal" is synonymous with "perpendicular," and comes from geometric considerations in $\mathbb{R}^n$. Two nonzero vectors $a$ and $b$ in $\mathbb{R}^n$ are orthogonal if and only if they are mutually perpendicular; that is, the angle $\theta$ between the two vectors $a$ and $b$ is $\frac{\pi}{2}$ or 90° (see Exercise 10, Section 7.4).

¹ For a detailed treatment of this subject see the texts by Berg and McGregor or Weinberger listed in the Bibliography.
In the study of analysis we typically encounter vector spaces whose elements are functions. For example, in previous sections we have shown that the space $\ell^2$ of square summable sequences and the space $\mathcal{C}[a, b]$ of continuous real-valued functions on $[a, b]$ are vector spaces over $\mathbb{R}$. With the usual rules of addition and scalar multiplication, $\mathcal{R}[a, b]$, the set of Riemann integrable functions on $[a, b]$, is also a vector space over $\mathbb{R}$. If for $f, g \in \mathcal{R}[a, b]$ we define
$$\langle f, g \rangle = \int_a^b f(x) g(x)\, dx,$$
then it is easily shown that $\langle\,,\rangle$ satisfies (a), (c), and (d) of the definition of an inner product. It does not, however, satisfy (b). If $a < b$ and $c_1, \ldots, c_n$ are a finite number of points in $[a, b]$, then the function
$$f(x) = \begin{cases} 0, & x \ne c_i, \\ 1, & x = c_i, \end{cases}$$
is in $\mathcal{R}[a, b]$ satisfying $\langle f, f \rangle = 0$, but $f$ is not the zero function. Thus technically $\langle\,,\rangle$ is not an inner product on $\mathcal{R}[a, b]$; a minor difficulty which can easily be overcome by defining two Riemann integrable functions $f$ and $g$ to be equal if $f(x) = g(x)$ for all $x \in [a, b]$ except on a set of measure zero. This will be explored in greater detail in Chapter 10. Alternatively, if we restrict ourselves to the subset $\mathcal{C}[a, b]$ of $\mathcal{R}[a, b]$, then $\langle f, g \rangle$ as defined above is an inner product on $\mathcal{C}[a, b]$ (Exercise 11).
Orthogonal Functions
We now define orthogonality with respect to the above inner product on $\mathcal{R}[a, b]$.
9.1.1 DEFINITION A finite or countable collection of Riemann integrable functions $\{\phi_n\}$ on $[a, b]$ satisfying $\int_a^b \phi_n^2 \ne 0$ is orthogonal on $[a, b]$ if
$$\langle \phi_n, \phi_m \rangle = \int_a^b \phi_n(x)\phi_m(x)\, dx = 0 \qquad \text{for all } n \ne m.$$
9.1.2 EXAMPLES
(a) For our first example, we consider the two functions $\phi(x) = 1$ and $\psi(x) = x$, $x \in [-1, 1]$. Since $\int_{-1}^{1} \phi(x)\psi(x)\, dx = \int_{-1}^{1} x\, dx = 0$, the functions $\phi$ and $\psi$ are orthogonal on the interval $[-1, 1]$.
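(b) In this example we show that the sequence of functions $\{\sin nx\}_{n=1}^{\infty}$ is orthogonal on $[-\pi, \pi]$. By the trigonometric identity
$$\sin A \sin B = \tfrac{1}{2}[\cos(A-B) - \cos(A+B)],$$
for $n \ne m$,
$$\int_{-\pi}^{\pi} \sin nx \sin mx\, dx = \frac{1}{2} \int_{-\pi}^{\pi} [\cos(n-m)x - \cos(n+m)x]\, dx = \frac{1}{2} \left[ \frac{\sin(n-m)x}{n-m} - \frac{\sin(n+m)x}{n+m} \right]_{-\pi}^{\pi} = 0.$$
For future reference, when $n = m$,
$$\int_{-\pi}^{\pi} \sin^2 nx\, dx = \frac{1}{2} \int_{-\pi}^{\pi} (1 - \cos 2nx)\, dx = \frac{1}{2} \left[ x - \frac{1}{2n} \sin 2nx \right]_{-\pi}^{\pi} = \pi.$$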
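(c) As our final example, we consider the collection $\{1, \sin\frac{n\pi x}{L}, \cos\frac{n\pi x}{L}\}_{n=1}^{\infty}$ on the interval $[-L, L]$ where $L > 0$. As in (b), if $n \ne m$, then
$$\int_{-L}^{L} \sin\frac{n\pi x}{L}\,\sin\frac{m\pi x}{L}\,dx = 0.$$
Thus the collection $\{\sin\frac{n\pi x}{L}\}$ is orthogonal on $[-L, L]$. Also, by the trigonometric identities
$$\cos A \cos B = \tfrac{1}{2}[\cos(A - B) + \cos(A + B)], \qquad \sin A \cos B = \tfrac{1}{2}[\sin(A - B) + \sin(A + B)],$$
we have for $n \ne m$,
$$\int_{-L}^{L} \cos\frac{n\pi x}{L}\,\cos\frac{m\pi x}{L}\,dx = \int_{-L}^{L} \sin\frac{n\pi x}{L}\,\cos\frac{m\pi x}{L}\,dx = 0.$$
Thus the functions in the collection $\{\cos\frac{n\pi x}{L}\}$ are all orthogonal on $[-L, L]$, as are the functions $\sin\frac{n\pi x}{L}$ and $\cos\frac{m\pi x}{L}$ for all $n, m \in \mathbb{N}$ with $n \ne m$. For $m = n$,
$$\int_{-L}^{L} \sin\frac{n\pi x}{L}\,\cos\frac{n\pi x}{L}\,dx = \frac{L}{2n\pi}\,\sin^2\frac{n\pi x}{L}\,\Big|_{-L}^{L} = 0.$$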
9.1
Orthogonal Functions
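This last identity shows that the functions $\sin\frac{n\pi x}{L}$ and $\cos\frac{n\pi x}{L}$ are also orthogonal on $[-L, L]$ for all $n \in \mathbb{N}$. Finally, since
$$\int_{-L}^{L} \sin\frac{n\pi x}{L}\,dx = \int_{-L}^{L} \cos\frac{n\pi x}{L}\,dx = 0$$
for all $n \in \mathbb{N}$, the constant function 1 is orthogonal to $\sin\frac{n\pi x}{L}$ and $\cos\frac{n\pi x}{L}$ for all $n \in \mathbb{N}$. In this example we also have
$$\int_{-L}^{L} \sin^2\frac{n\pi x}{L}\,dx = \int_{-L}^{L} \cos^2\frac{n\pi x}{L}\,dx = L.$$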
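The orthogonality relations of Example 9.1.2(c) can be checked numerically. The following is only a sketch: the helper names `integrate`, `inner`, `s`, `c`, and `one`, the use of composite Simpson's rule, and the value L = 2 are illustrative assumptions, not part of the text.

```python
import math

def integrate(f, a, b, n=2000):
    """Composite Simpson's rule on [a, b] with n (even) subintervals."""
    h = (b - a) / n
    total = f(a) + f(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * f(a + i * h)
    return total * h / 3

def inner(f, g, a, b):
    """Approximate the inner product <f, g> = integral of f*g over [a, b]."""
    return integrate(lambda x: f(x) * g(x), a, b)

L = 2.0  # assumed half-length of the interval [-L, L]; any L > 0 works

def s(n):
    """The function sin(n*pi*x/L)."""
    return lambda x: math.sin(n * math.pi * x / L)

def c(n):
    """The function cos(n*pi*x/L)."""
    return lambda x: math.cos(n * math.pi * x / L)

one = lambda x: 1.0  # the constant function 1

# Print <sin_n, sin_m>, <cos_n, cos_m>, <sin_n, cos_m> for small n, m.
for n in range(1, 4):
    for m in range(1, 4):
        ss = inner(s(n), s(m), -L, L)
        cc = inner(c(n), c(m), -L, L)
        sc = inner(s(n), c(m), -L, L)
        print(n, m, round(ss, 6), round(cc, 6), round(sc, 6))
```

Up to rounding, the diagonal entries for the sine and cosine families should equal L and all other entries should vanish, in agreement with the identities above.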
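If in Example 9.1.2(b) we define $\phi_n(x) = \frac{1}{\sqrt{\pi}}\sin nx$, then the sequence $\{\phi_n(x)\}_{n=1}^{\infty}$ satisfies
$$\int_{-\pi}^{\pi} \phi_n(x)\phi_m(x)\,dx = \begin{cases} 0, & \text{when } n \ne m,\\ 1, & \text{when } n = m.\end{cases}$$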
Such a sequence of orthogonal functions is given a special name.
9.1.3 DEFINITION A finite or countable collection of Riemann integrable functions $\{\phi_n\}$ is orthonormal on $[a, b]$ if
$$\int_a^b \phi_n(x)\phi_m(x)\,dx = \begin{cases} 0, & \text{when } n \ne m,\\ 1, & \text{when } n = m.\end{cases}$$
Given a collection $\{\phi_n\}$ of orthogonal functions on $[a, b]$, we can always construct a family $\{\psi_n\}$ of orthonormal functions on $[a, b]$ by setting
$$\psi_n(x) = \frac{1}{c_n}\,\phi_n(x),$$
where $c_n$ is defined by
$$c_n^2 = \int_a^b \phi_n^2(x)\,dx.$$
Approximation in the Mean Let $\{\phi_n\}$ be a finite or countable family of orthogonal functions defined on an interval $[a, b]$. For each $N \in \mathbb{N}$ and $c_1, \dots, c_N \in \mathbb{R}$, consider the $N$th partial sum
$$S_N(x) = \sum_{n=1}^{N} c_n\phi_n(x). \tag{1}$$
A natural question is, given a real-valued function $f$ on $[a, b]$, how must the coefficients $c_n$ be chosen so that $S_N$ gives the best approximation to $f$ on $[a, b]$? In the Weierstrass
approximation theorem we have already encountered one form of approximation; namely, uniform approximation or approximation in the uniform norm. However, for the study of orthogonal functions there is another type of norm approximation that turns out to be more useful. If $X$ is a vector space over $\mathbb{R}$ with inner product $\langle\,,\rangle$, then there is a natural norm on $X$ associated with this inner product. If for $x \in X$ we define
$$\|x\| = \sqrt{\langle x, x\rangle},$$
then $\|\ \|$ is a norm on $X$ as defined in Definition 7.4.8. The details that $\|\ \|$ is a norm are left to the exercises (Exercise 12). The crucial step in proving the triangle inequality for $\|\ \|$ is the following version of the Cauchy-Schwarz inequality: For all $x, y \in X$,
$$|\langle x, y\rangle| \le \|x\|\,\|y\|.$$
The proof of this inequality follows verbatim the proof of Theorem 7.4.3. For the vector space $\mathcal{R}[a, b]$ with inner product $\langle f, g\rangle = \int_a^b f(x)g(x)\,dx$, the norm of a function $f$, denoted $\|f\|_2$, is given by
$$\|f\|_2 = \left[\int_a^b f^2(x)\,dx\right]^{1/2}.$$
Given $f \in \mathcal{R}[a, b]$, the problem to consider is, how must the constants $c_n$ be chosen in order to minimize the quantity
$$\|f - S_N\|_2^2 = \int_a^b [f(x) - S_N(x)]^2\,dx?$$
This type of norm approximation is referred to as approximation in the mean or least squares approximation. The following theorem specifies the choice of $\{c_n\}$ so that $S_N$ provides the best approximation to $f$ in the mean.
9.1.4 THEOREM Let $f \in \mathcal{R}[a, b]$ and let $\{\phi_n\}$ be a finite or countable collection of orthogonal functions on $[a, b]$. For $N \in \mathbb{N}$, let $S_N$ be defined by equation (1). Then the quantity
$$\int_a^b [f(x) - S_N(x)]^2\,dx$$
is minimal if and only if
$$c_n = \frac{\int_a^b f(x)\phi_n(x)\,dx}{\int_a^b \phi_n^2(x)\,dx}, \qquad n = 1, 2, \dots, N. \tag{2}$$
Furthermore, for this choice of $c_n$,
$$\int_a^b [f(x) - S_N(x)]^2\,dx = \int_a^b f^2(x)\,dx - \sum_{n=1}^{N} c_n^2 \int_a^b \phi_n^2(x)\,dx. \tag{3}$$
Prior to proving the result, we give the following alternative statement of the previous theorem.
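9.1.5 COROLLARY Let $f \in \mathcal{R}[a, b]$ and let $S_N(x) = \sum_{n=1}^{N} c_n\phi_n(x)$ where the $c_n$ are defined by equation (2). If $T_N(x) = \sum_{n=1}^{N} a_n\phi_n(x)$, $a_n \in \mathbb{R}$, then
$$\int_a^b [f(x) - S_N(x)]^2\,dx \le \int_a^b [f(x) - T_N(x)]^2\,dx$$
for any choice of $a_n$, $n = 1, 2, \dots, N$.
Proof of Theorem 9.1.4. For fixed $N \in \mathbb{N}$,
$$0 \le \int_a^b [f(x) - S_N(x)]^2\,dx = \int_a^b f^2(x)\,dx - 2\int_a^b f(x)S_N(x)\,dx + \int_a^b S_N^2(x)\,dx. \tag{4}$$
By linearity of the integral (Theorem 6.2.1),
$$\int_a^b f(x)S_N(x)\,dx = \sum_{n=1}^{N} c_n \int_a^b f(x)\phi_n(x)\,dx.$$
Also,
$$\int_a^b S_N^2(x)\,dx = \int_a^b S_N(x)\left(\sum_{n=1}^{N} c_n\phi_n(x)\right)dx = \sum_{n=1}^{N} c_n \int_a^b S_N(x)\phi_n(x)\,dx.$$
But
$$\int_a^b S_N(x)\phi_n(x)\,dx = \sum_{k=1}^{N} c_k \int_a^b \phi_k(x)\phi_n(x)\,dx,$$
which by orthogonality,
$$= c_n \int_a^b \phi_n^2(x)\,dx.$$
Therefore,
$$\int_a^b S_N^2(x)\,dx = \sum_{n=1}^{N} c_n^2 \int_a^b \phi_n^2(x)\,dx.$$
Upon substituting into equation (4) we obtain
$$0 \le \int_a^b [f(x) - S_N(x)]^2\,dx = \int_a^b f^2(x)\,dx - 2\sum_{n=1}^{N} c_n \int_a^b f(x)\phi_n(x)\,dx + \sum_{n=1}^{N} c_n^2 \int_a^b \phi_n^2(x)\,dx,$$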
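which upon completing the square
$$= \int_a^b f^2(x)\,dx + \sum_{n=1}^{N} \left(\int_a^b \phi_n^2\right)\left[c_n - \frac{\int_a^b f\phi_n}{\int_a^b \phi_n^2}\right]^2 - \sum_{n=1}^{N} \frac{\left(\int_a^b f\phi_n\right)^2}{\int_a^b \phi_n^2}.$$
The coefficients $c_n$ occur only in the middle term. Since this term is nonnegative, the right side is a minimum if and only if
$$c_n = \frac{\int_a^b f\phi_n}{\int_a^b \phi_n^2}.$$
With this choice of $c_n$, we also obtain formula (3) upon substitution. $\square$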
9.1.6 EXAMPLE As was previously shown, the functions $\phi_1(x) = 1$ and $\phi_2(x) = x$ are orthogonal on $[-1, 1]$. Let $f(x) = x^3 + 1$. Then
$$c_1 = \frac{\int_{-1}^{1} f(x)\phi_1(x)\,dx}{\int_{-1}^{1} \phi_1^2(x)\,dx} = \frac{1}{2}\int_{-1}^{1} (x^3 + 1)\,dx = 1$$
and
$$c_2 = \frac{\int_{-1}^{1} f(x)\phi_2(x)\,dx}{\int_{-1}^{1} \phi_2^2(x)\,dx} = \frac{3}{2}\int_{-1}^{1} (x^4 + x)\,dx = \frac{3}{5}.$$
Therefore, $S_2(x) = 1 + \frac{3}{5}x$ is the best approximation in the mean to $f(x) = 1 + x^3$ on $[-1, 1]$. The graphs of $f$ and $S_2$ are given in Figure 9.1.
Figure 9.1 Graphs of $f$ and $S_2$
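The coefficients of Example 9.1.6 can be reproduced with exact rational arithmetic. This is a minimal sketch in which polynomials are stored as coefficient lists; the helper names `poly_mul`, `poly_integral`, and `coefficient` are assumptions made for illustration only.

```python
from fractions import Fraction

def poly_mul(p, q):
    """Multiply polynomials given as coefficient lists (index = power of x)."""
    out = [Fraction(0)] * (len(p) + len(q) - 1)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            out[i + j] += Fraction(a) * Fraction(b)
    return out

def poly_integral(p, a, b):
    """Exact integral of the polynomial p over [a, b]."""
    a, b = Fraction(a), Fraction(b)
    return sum(Fraction(coef) * (b ** (k + 1) - a ** (k + 1)) / (k + 1)
               for k, coef in enumerate(p))

def coefficient(f, phi, a, b):
    """c = <f, phi> / <phi, phi>, as in formula (2) of Theorem 9.1.4."""
    return (poly_integral(poly_mul(f, phi), a, b)
            / poly_integral(poly_mul(phi, phi), a, b))

f = [1, 0, 0, 1]   # f(x) = 1 + x^3
phi1 = [1]         # phi_1(x) = 1
phi2 = [0, 1]      # phi_2(x) = x

c1 = coefficient(f, phi1, -1, 1)
c2 = coefficient(f, phi2, -1, 1)

# Mean-square error from identity (3): int f^2 - sum c_n^2 int phi_n^2.
err = (poly_integral(poly_mul(f, f), -1, 1)
       - c1 ** 2 * poly_integral(poly_mul(phi1, phi1), -1, 1)
       - c2 ** 2 * poly_integral(poly_mul(phi2, phi2), -1, 1))

print(c1, c2, err)  # 1 3/5 8/175
```

The value 8/175 is the minimal mean-square error predicted by identity (3); any other choice of the two coefficients gives a larger error.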
9.1.7 DEFINITION Let $\{\phi_n\}_{n=1}^{\infty}$ be a sequence of orthogonal functions on $[a, b]$ and let $f \in \mathcal{R}[a, b]$. For each $n \in \mathbb{N}$, the number
$$c_n = \frac{\int_a^b f(x)\phi_n(x)\,dx}{\int_a^b \phi_n^2(x)\,dx} \tag{5}$$
is called the Fourier coefficient of $f$ with respect to the system $\{\phi_n\}$. The series $\sum_{n=1}^{\infty} c_n\phi_n(x)$ is called the Fourier series of $f$. This is denoted by
$$f(x) \sim \sum_{n=1}^{\infty} c_n\phi_n(x). \tag{6}$$
Remark. The notation "$\sim$" in formula (6) means only that the coefficients in the series are given by formula (5). Nothing is implied about convergence of the series.
9.1.8 EXAMPLE In Example 9.1.2(b) it was shown that the sequence of functions $\{\sin nx\}_{n=1}^{\infty}$ is orthogonal on $[-\pi, \pi]$. Since
$$\int_{-\pi}^{\pi} \sin^2 nx\,dx = \pi, \qquad n = 1, 2, \dots,$$
if $f \in \mathcal{R}[-\pi, \pi]$, the Fourier coefficients $c_n$, $n = 1, 2, \dots$, of $f$ with respect to the orthogonal system $\{\sin nx\}$ are given by
$$c_n = \frac{1}{\pi}\int_{-\pi}^{\pi} f(x)\sin nx\,dx,$$
and the Fourier series of $f$ becomes
$$f(x) \sim \sum_{n=1}^{\infty} c_n \sin nx.$$
As indicated above, nothing is implied about convergence. Even if the series does converge, it need not converge to the function $f$. Since the terms of the series are odd functions of $x$, the series, if it converges, defines an odd function on $[-\pi, \pi]$. Thus unless $f$ itself is odd, the series could not converge to $f$. For example, if $f(x) = 1$, then
$$c_n = \frac{1}{\pi}\int_{-\pi}^{\pi} \sin nx\,dx = -\frac{1}{n\pi}\cos nx\,\Big|_{-\pi}^{\pi} = 0.$$
In this case, the series converges for all $x$, but clearly not to $f(x) = 1$.
Bessel's Inequality For each $N \in \mathbb{N}$, let $S_N(x)$ denote the $N$th partial sum of the Fourier series of $f$, i.e.,
$$S_N(x) = \sum_{n=1}^{N} c_n\phi_n(x),$$
where the $c_n$ are the Fourier coefficients of $f$ with respect to the sequence $\{\phi_n\}$ of orthogonal functions on $[a, b]$. Then by identity (3) of Theorem 9.1.4,
$$\int_a^b f^2(x)\,dx - \sum_{n=1}^{N} c_n^2 \int_a^b \phi_n^2(x)\,dx \ge 0.$$
Therefore,
$$\sum_{n=1}^{N} c_n^2 \int_a^b \phi_n^2(x)\,dx \le \int_a^b f^2(x)\,dx.$$
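The inequality just derived can be illustrated numerically. The sketch below assumes the orthogonal system $\{\sin nx\}$ on $[-\pi, \pi]$ from Example 9.1.8 and the sample function $f(x) = x$; the function choice, the helper names, and the use of Simpson's rule are illustrative assumptions rather than part of the text.

```python
import math

def integrate(f, a, b, n=4000):
    """Composite Simpson's rule on [a, b] with n (even) subintervals."""
    h = (b - a) / n
    total = f(a) + f(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * f(a + i * h)
    return total * h / 3

f = lambda x: x  # sample function, chosen only for illustration

# Right-hand side of the inequality: the integral of f^2 over [-pi, pi].
norm_sq = integrate(lambda x: f(x) ** 2, -math.pi, math.pi)

def fourier_coefficient(n):
    """c_n = <f, sin nx> / <sin nx, sin nx>, as in formula (5)."""
    num = integrate(lambda x: f(x) * math.sin(n * x), -math.pi, math.pi)
    den = integrate(lambda x: math.sin(n * x) ** 2, -math.pi, math.pi)
    return num / den

# Left-hand side: partial sums of c_n^2 * integral of sin^2(nx), which is pi.
partial = 0.0
for n in range(1, 21):
    c_n = fourier_coefficient(n)
    partial += c_n ** 2 * math.pi
    if n in (1, 5, 10, 20):
        print(n, partial, norm_sq)
```

The partial sums increase with N but stay below the integral of $f^2$, as the inequality requires.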
$\alpha$, there exist a finite number of intervals $I_1, \dots, I_N$ such that $\sum_{j=1}^{N} m(I_j) > \alpha$. For each $j$, choose a closed and bounded interval $J_j \subset I_j$ such that $\sum_{j=1}^{N} m(J_j) > \alpha$. Let $K = \bigcup_{j=1}^{N} J_j$. Then $K$ is a compact subset of $U$ and thus $\lambda_*(U) \ge m(K)$. Finally, since the intervals $\{J_j\}$ are pairwise disjoint, by Exercise 6 of Section 10.2,
$$m(K) = \sum_{j=1}^{N} m(J_j) > \alpha.$$
Therefore $\lambda_*(U) > \alpha$. If $\lambda^*(U) = \infty$, then by the above $\lambda_*(U) > \alpha$ for every $\alpha \in \mathbb{R}$; that is, $\lambda_*(U) = \infty$. On the other hand, if $\lambda^*(U)$ is finite, take $\alpha = \lambda^*(U) - \epsilon$, where $\epsilon > 0$ is arbitrary. But then $\lambda^*(U) \ge \lambda_*(U) > \lambda^*(U) - \epsilon$ for every $\epsilon > 0$. From this it now follows that $\lambda_*(U) = \lambda^*(U) = m(U)$.
Measurable Sets In both of the previous examples, the inner and outer measures of the sets are equal. As we shall see, all subsets of R built out of open sets or closed sets by countable unions, intersections, and complementation will have this property. This includes most sets encountered in practice. In fact, the explicit construction of a set whose inner and outer measures are different requires use of an axiom from set theory, the Axiom of Choice,
which we have not discussed. The construction of such a set is outlined in the miscellaneous exercises.
10.3.4 DEFINITION
(a) A bounded subset $E$ of $\mathbb{R}$ is said to be Lebesgue measurable or measurable if
$$\lambda_*(E) = \lambda^*(E).$$
If this is the case, then the measure of $E$, denoted $\lambda(E)$, is defined as
$$\lambda(E) = \lambda_*(E) = \lambda^*(E).$$
(b) An unbounded set $E$ is measurable if $E \cap [a, b]$ is measurable for every closed and bounded interval $[a, b]$. If this is the case, we define
$$\lambda(E) = \lim_{k \to \infty} \lambda(E \cap [-k, k]).$$
Remarks (a) If $E$ is unbounded and $E \cap I$ is measurable for every closed and bounded interval $I$, then by Theorem 10.3.2 the sequence $\{\lambda(E \cap [-k, k])\}_{k=1}^{\infty}$ is nondecreasing, and as a consequence $\lambda(E) = \lim_{k \to \infty} \lambda(E \cap [-k, k])$ exists.
(b) There is no discrepancy between the two parts of the definition. We will shortly prove in Theorem 10.4.1 that if $E$ is a bounded measurable set, then $E \cap I$ is measurable for every interval $I$. Conversely, if $E$ is a bounded set for which
$$\lambda_*(E \cap [a, b]) = \lambda^*(E \cap [a, b])$$
for all $a, b \in \mathbb{R}$, then by choosing $a$ and $b$ sufficiently large such that $E \subset [a, b]$, we have $\lambda_*(E) = \lambda^*(E)$. The two separate definitions are required due to the existence of unbounded nonmeasurable sets $E$ for which $\lambda_*(E) = \lambda^*(E) = \infty$. An example of such a set will be given in Exercise 5 of Section 10.4.
10.3.5 THEOREM Every set $E$ of outer measure zero is measurable with $\lambda(E) = 0$.
Proof. Suppose $E \subset \mathbb{R}$ with $\lambda^*(E) = 0$. Then for any closed and bounded interval $I$,
$$\lambda_*(E \cap I) \le \lambda^*(E \cap I) \le \lambda^*(E) = 0.$$
Thus $\lambda_*(E \cap I) = \lambda^*(E \cap I) = 0$, and hence $E \cap I$ is measurable for every closed and bounded interval $I$. Since $\lambda(E \cap [-k, k]) = 0$ for every $k \in \mathbb{N}$, $\lambda(E) = 0$. $\square$
As a consequence of the previous theorem and Example 10.3.3(a), every countable set $E$ is measurable with $\lambda(E) = 0$. In particular, $\mathbb{Q}$ is measurable with $\lambda(\mathbb{Q}) = 0$. Another consequence of Theorem 10.3.5 is that every subset of a set of measure zero is measurable.
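The fact that a countable set has outer measure zero rests on a covering argument: the $k$th point is covered by an open interval of length $\epsilon/2^k$, so the total length of the cover is less than $\epsilon$. The sketch below carries this out for the first 50 rationals in $[0, 1]$. It assumes this is the standard argument behind Example 10.3.3(a), which is not reproduced in this excerpt; the enumeration and the helper names are illustrative.

```python
from fractions import Fraction
from itertools import islice

def rationals_in_unit_interval():
    """Yield each rational in [0, 1] exactly once, ordered by denominator."""
    yield Fraction(0)
    yield Fraction(1)
    q = 2
    while True:
        for p in range(1, q):
            x = Fraction(p, q)
            if x.denominator == q:  # skip fractions not in lowest terms
                yield x
        q += 1

def cover(points, eps):
    """Cover the k-th point (k = 1, 2, ...) by an interval of length eps / 2**k."""
    intervals = []
    for k, x in enumerate(points, start=1):
        half = eps / 2 ** (k + 1)
        intervals.append((x - half, x + half))
    return intervals

eps = Fraction(1, 100)
points = list(islice(rationals_in_unit_interval(), 50))
intervals = cover(points, eps)
total_length = sum(b - a for a, b in intervals)

print(total_length < eps)  # True
```

Since the total length is $\epsilon(1 - 2^{-50}) < \epsilon$ for the first 50 points, and the same bound holds for every finite stage, the outer measure of the whole countable set is at most $\epsilon$ for every $\epsilon > 0$.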
10.3
Inner and Outer Measure; Measurable Sets
10.3.6 THEOREM Every interval $I$ is measurable with $\lambda(I) = m(I)$.
Proof. By Example 10.3.3(b), if $I$ is a bounded interval, $\lambda_*(I) = \lambda^*(I) = m(I)$. Thus $I$ is measurable with $\lambda(I) = m(I)$. On the other hand, if $I$ is unbounded, then $I \cap [a, b]$ is a bounded interval for every $a, b \in \mathbb{R}$, and thus measurable. In this case,
$$\lambda(I) = \lim_{k \to \infty} \lambda(I \cap [-k, k]) = \lim_{k \to \infty} m(I \cap [-k, k]) = \infty.$$
10.3.7 THEOREM For any $a, b \in \mathbb{R}$ and $E \subset \mathbb{R}$,
$$\lambda^*(E \cap [a, b]) + \lambda_*(E^c \cap [a, b]) = b - a.$$
Proof. Let $U$ be any open subset of $\mathbb{R}$ with $E \cap [a, b] \subset U$. Then $U^c \cap [a, b]$ is compact with $U^c \cap [a, b] \subset E^c \cap [a, b]$. Therefore,
$$m(U) + \lambda_*(E^c \cap [a, b]) \ge m(U \cap [a, b]) + m(U^c \cap [a, b]) = b - a.$$
The last equality follows by Theorem 10.2.15. Taking the infimum over all open sets $U$ containing $E \cap [a, b]$ gives
$$\lambda^*(E \cap [a, b]) + \lambda_*(E^c \cap [a, b]) \ge b - a.$$
To prove the reverse inequality, let $K$ be a compact subset of $E^c \cap [a, b]$. Then $K^c$ is open with $K^c \cap [a, b] \supset E \cap [a, b]$. Therefore, $\lambda^*(E \cap [a, b]) + m(K \cap [a, b])$ …
… $m(U_1) + m(U_2)$, which by Theorem 10.2.9
$$= m(U_1 \cup U_2) + m(U_1 \cap U_2) \ge \lambda^*(E_1 \cup E_2) + \lambda^*(E_1 \cap E_2).$$
The last inequality follows from the definition of outer measure. Since $\epsilon > 0$ was arbitrary, inequality (a) follows.
(b) Let $a, b \in \mathbb{R}$ be arbitrary. By (a) applied to $[a, b] \cap E_i^c$, we have
$$\lambda^*([a, b] \cap E_1^c) + \lambda^*([a, b] \cap E_2^c) \ge \lambda^*([a, b] \cap (E_1^c \cup E_2^c)) + \lambda^*([a, b] \cap (E_1^c \cap E_2^c))$$
$$= \lambda^*([a, b] \cap (E_1 \cap E_2)^c) + \lambda^*([a, b] \cap (E_1 \cup E_2)^c).$$
But by Theorem 10.3.7, for any $E \subset \mathbb{R}$,
$$\lambda^*([a, b] \cap E^c) = (b - a) - \lambda_*(E \cap [a, b]).$$
Therefore, $\lambda_*(E_1 \cap [a, b]) + \lambda_*(E_2 \cap [a, b])$ …
… $s\}$ is measurable. More generally, if $E$ is a measurable subset of $\mathbb{R}$, a function $f : E \to \mathbb{R}$ is measurable if
$$\{x \in E : f(x) > s\}$$
is measurable for every $s \in \mathbb{R}$. Since $f^{-1}((s, \infty)) = \{x : f(x) > s\}$, $f$ is measurable if and only if $f^{-1}((s, \infty))$ is a measurable set for every $s \in \mathbb{R}$. We illustrate the idea of a measurable function with the following examples.
10.5.2 EXAMPLES
(a) Let $A$ be a measurable subset of $\mathbb{R}$ and let $\chi_A$ denote the characteristic function of $A$. Then
$$\{x : \chi_A(x) > s\} = \begin{cases} \mathbb{R}, & s < 0,\\ A, & 0 \le s < 1,\\ \emptyset, & s \ge 1.\end{cases}$$
… $> s\} = \mathbb{Q}^c \cap (s, 1)$, if $0$ …
… $s\}$ is measurable for every $s \in \mathbb{R}$. (b) $\{x : f(x) \ge s\}$ is measurable for every $s \in \mathbb{R}$.
10.5
Measurable Functions
(c) $\{x : f(x) < s\}$ is measurable for every $s \in \mathbb{R}$. (d) $\{x : f(x) \le s\}$ is measurable for every $s \in \mathbb{R}$.
Proof. The set of (d) is the complement of the set in (a). Thus by Corollary 10.4.3, one is measurable if and only if the other is. Similarly for the sets of (b) and (c). Thus it suffices to prove that (a) is equivalent to (b). Suppose (a) holds. For each $n \in \mathbb{N}$, let
$$E_n = \{x : f(x) > s - \tfrac{1}{n}\}.$$
By (a), $E_n$ is measurable for all $n \in \mathbb{N}$. But
$$\{x : f(x) \ge s\} = \bigcap_{n=1}^{\infty} E_n,$$
which is measurable by Theorem 10.4.5. Conversely, since
$$\{x : f(x) > s\} = \bigcup_{n=1}^{\infty} \{x : f(x) \ge s + \tfrac{1}{n}\},$$
if (b) holds, then by Theorem 10.4.5, (a) also holds. $\square$
10.5.4 THEOREM Suppose $f, g$ are measurable real-valued functions defined on a measurable set $E$. Then
(a) $f + c$ and $cf$ are measurable for every $c \in \mathbb{R}$,
(b) $f + g$ is measurable,
(c) $fg$ is measurable, and
(d) $1/g$ is measurable provided $g(x) \ne 0$ for all $x \in E$.
Proof. The proof of (a) is straightforward and is omitted. (b) Let $s \in \mathbb{R}$. Then $f(x) + g(x) > s$ if and only if $f(x) > s - g(x)$. If $x \in E$ is such that $f(x) > s - g(x)$, then there exists $r \in \mathbb{Q}$ such that
$$f(x) > r > s - g(x).$$
Let $\{r_n\}_{n=1}^{\infty}$ be an enumeration of $\mathbb{Q}$. Then
$$\{x : f(x) + g(x) > s\} = \bigcup_{n=1}^{\infty} \left(\{x : f(x) > r_n\} \cap \{x : r_n > s - g(x)\}\right).$$
Since $f$ and $g$ are measurable functions,
$$\{x : f(x) > r_n\} \quad\text{and}\quad \{x : r_n > s - g(x)\}$$
are measurable sets for every $n \in \mathbb{N}$. Thus their intersection and the resulting union are also measurable. Therefore, $f + g$ is measurable. To prove (c) we first show that $f^2$ is measurable. If $s < 0$, then
$$\{x \in E : f^2(x) > s\} = E,$$
which is measurable. Assume $s \ge 0$. Then
$$\{x : f^2(x) > s\} = \{x : f(x) > \sqrt{s}\} \cup \{x : f(x) < -\sqrt{s}\}.$$
Chapter 10
Lebesgue Measure and Integration
But each of these two sets is measurable. Thus their union is measurable. Since
$$fg = \tfrac{1}{4}\left[(f + g)^2 - (f - g)^2\right],$$
the function $fg$ is measurable. The proof of (d) is left as an exercise (Exercise 5). $\square$
10.5.5
THEOREM Every continuous real-valued function on $[a, b]$ is measurable.
Proof. Exercise 7.
A Property Holding Almost Everywhere A very important concept in the study of measure theory involves the idea of a property being true for all $x$ except for a set of measure zero. This idea was previously encountered in the statement and proof of Lebesgue's theorem in Chapter 6; namely, a bounded real-valued function $f$ on $[a, b]$ is Riemann integrable if and only if $\{x : f \text{ is not continuous at } x\}$ has measure zero. An equivalent formulation is that $f$ is continuous except on a set of measure zero. In this section we will encounter several other properties that are assumed to hold except on sets of measure zero.
10.5.6
DEFINITION A property $P$ is said to hold almost everywhere (abbreviated a.e.) if the set of points where $P$ does not hold has measure zero, i.e., $\lambda(\{x : P \text{ does not hold}\}) = 0$.
Remark The assertion that a set is of measure zero includes the assertion that it is measurable. This, however, is not necessary. If instead we only require that $\lambda^*(\{x : P \text{ does not hold}\}) = 0$, then by Theorem 10.3.5 the set $\{x : P \text{ does not hold}\}$ is, in fact, measurable.
We will illustrate the concept of a property holding almost everywhere by means of the following examples.
10.5.7
EXAMPLES
(a) Suppose $f$ and $g$ are real-valued functions defined on $[a, b]$. The functions $f$ and $g$ are said to be equal almost everywhere, denoted $f = g$ a.e., if
$$\{x \in [a, b] : f(x) \ne g(x)\}$$
has measure zero. For example, if $g(x) = 1$ for all $x \in [0, 1]$ and
$$f(x) = \begin{cases} 1, & x \in [0, 1] \setminus \mathbb{Q},\\ 0, & x \in [0, 1] \cap \mathbb{Q},\end{cases}$$
then $\{x \in [0, 1] : f(x) \ne g(x)\} = [0, 1] \cap \mathbb{Q}$, which has measure zero. Therefore, $f = g$ a.e.
(b) In Theorem 10.5.4 we proved that if $g$ is a real-valued measurable function on $[a, b]$ with $g(x) \ne 0$ for all $x \in [a, b]$, then $1/g$ is also measurable on $[a, b]$. Suppose we replace the hypothesis $g(x) \ne 0$ for all $x \in [a, b]$ with $g \ne 0$ a.e.; that is, the set
$$E = \{x \in [a, b] : g(x) = 0\}$$
has measure zero. If we now define $f$ by
$$f(x) = \begin{cases} g(x), & x \in [a, b] \setminus E,\\ 1, & x \in E,\end{cases}$$
then $f(x) \ne 0$ for all $x \in [a, b]$ and $f(x) = g(x)$ except for $x \in E$, which has measure zero. Thus $f = g$ a.e. on $[a, b]$. As a consequence of our next theorem, the function $f$ will also be measurable on $[a, b]$.
(c) A real-valued function $f$ on $[a, b]$ is continuous almost everywhere if $\{x \in [a, b] : f \text{ is not continuous at } x\}$ has measure zero. As in Example 6.1.14, consider the function $f$ on $[0, 1]$ defined by
$$f(x) = \begin{cases} 1, & \text{if } x = 0,\\ 0, & \text{if } x \text{ is irrational},\\ \dfrac{1}{n}, & \text{if } x = \dfrac{m}{n} \text{ in lowest terms, } x \ne 0.\end{cases}$$
As was shown in Example 4.2.2(g), the function $f$ is continuous at every irrational number in $[0, 1]$, and discontinuous at every rational number in $[0, 1]$. Therefore,
$$\lambda(\{x \in [0, 1] : f \text{ is not continuous at } x\}) = \lambda(\mathbb{Q} \cap [0, 1]) = 0.$$
Thus $f$ is continuous a.e. on $[0, 1]$.
(d) Let $f$ and $f_n$, $n = 1, 2, \dots$, be real-valued functions defined on $[a, b]$. The sequence $\{f_n\}$ is said to converge almost everywhere to $f$, denoted $f_n \to f$ a.e., if
$$\{x \in [a, b] : \{f_n(x)\} \text{ does not converge to } f(x)\}$$
has measure zero. To illustrate this, consider the sequence $\{f_n\}$ defined in Example 8.1.2(c) as follows: Let $\{x_k\}$ be an enumeration of $\mathbb{Q} \cap [0, 1]$. For each $n \in \mathbb{N}$, define $f_n$ on $[0, 1]$ by
$$f_n(x) = \begin{cases} 0, & \text{if } x = x_k,\ 1 \le k \le n,\\ 1, & \text{otherwise.}\end{cases}$$
…
$$\{x : g(x) > s\} = \{x \in B : g(x) > s\} \cup \{x \in E : g(x) > s\} = \{x \in B : f(x) > s\} \cup \{x \in E : g(x) > s\}.$$
Since $E_1 = \{x \in E : g(x) > s\}$ is a subset of $E$ and $\lambda(E) = 0$, the set $E_1$ is measurable. Also, since $f$ is measurable, $\{x \in B : f(x) > s\}$ is measurable. Therefore, $\{x : g(x) > s\}$, and thus $g$, is measurable.
10.5.9 THEOREM Let $\{f_n\}_{n=1}^{\infty}$ be a sequence of real-valued measurable functions defined on a measurable set $A$ such that $\{f_n(x)\}_{n=1}^{\infty}$ is bounded for every $x \in A$. Let
$$\varphi(x) = \sup\{f_n(x) : n \in \mathbb{N}\} \quad\text{and}\quad \psi(x) = \inf\{f_n(x) : n \in \mathbb{N}\}.$$
Then $\varphi$ and $\psi$ are measurable on $A$.
Proof. The result follows by Theorem 10.4.5, and the fact that for every $s \in \mathbb{R}$,
$$\{x : \varphi(x) > s\} = \bigcup_{n=1}^{\infty} \{x : f_n(x) > s\} \quad\text{and}\quad \{x : \psi(x) < s\} = \bigcup_{n=1}^{\infty} \{x : f_n(x) < s\}. \qquad\square$$
10.5.10 COROLLARY Let $\{f_n\}$ be a sequence of real-valued measurable functions defined on a measurable set $A$, and let $f$ be a real-valued function on $A$. If $f_n \to f$ a.e. on $A$, then $f$ is measurable on $A$.
Proof. Let $E = \{x : \{f_n(x)\} \text{ does not converge to } f(x)\}$. By hypothesis $\lambda(E) = 0$. Set
$$g_n(x) = \begin{cases} f_n(x), & x \in A \setminus E,\\ 0, & x \in E.\end{cases}$$
Then $g_n = f_n$ a.e., and thus is measurable. Also, $\lim_{n \to \infty} g_n(x) = g(x)$ exists for all $x \in A$. But
$$g(x) = \lim_{n \to \infty} g_n(x) = \varlimsup_{n \to \infty} g_n(x) = \inf_n \sup\{g_k(x) : k \ge n\}.$$
By the previous theorem, each of the functions
$$F_n(x) = \sup\{g_k(x) : k \ge n\} \quad\text{and}\quad g(x) = \inf\{F_n(x) : n \in \mathbb{N}\}$$
is measurable on $A$. Finally, since $f = g$ a.e., $f$ itself is measurable. $\square$
Suppose $\{f_n\}$ is a sequence of measurable functions on $[a, b]$ such that $f_n \to f$ a.e. Then by definition there exists a subset $E$ of $[a, b]$ such that $\lambda([a, b] \setminus E) = 0$, and $\lim_{n \to \infty} f_n(x) = f(x)$ for all $x \in E$. Exercise 15 provides a significant strengthening of this result. There you will be asked to prove that given $\epsilon > 0$, there exists a measurable set $E \subset [a, b]$, such that $\lambda([a, b] \setminus E) < \epsilon$, and $\{f_n\}$ converges uniformly to $f$ on $E$. This result is known as Egorov's theorem.
EXERCISES 10.5 1. *Let f be defined on [0, 1) by
x=0,
0,
f(x)=
x, :. qtL(?*.f)
and
Therefore, the lower Riemann integral of $f$ satisfies
$$\underline{\int_a^b} f = \sup\{\mathcal{L}(\mathcal{P}, f) : \mathcal{P} \text{ is a partition of } [a, b]\} \le \sup\{\mathcal{L}_L(\mathcal{Q}, f) : \mathcal{Q} \text{ is a measurable partition of } [a, b]\}.$$
Similarly, for the upper Riemann integral of $f$ we have
$$\overline{\int_a^b} f \ge \inf\{\mathcal{U}_L(\mathcal{Q}, f) : \mathcal{Q} \text{ is a measurable partition of } [a, b]\}.$$
If $f$ is Riemann integrable on $[a, b]$, then the upper and lower Riemann integrals of $f$ are equal, and thus
$$\int_a^b f(x)\,dx \le \sup_{\mathcal{Q}} \mathcal{L}_L(\mathcal{Q}, f) \le \inf_{\mathcal{Q}} \mathcal{U}_L(\mathcal{Q}, f) \le \int_a^b f(x)\,dx,$$
where the supremum and infimum are taken over all measurable partitions $\mathcal{Q}$ of $[a, b]$. As a consequence of Theorem 10.6.2, this proves the following result.
10.6.8 COROLLARY If $f$ is Riemann integrable on $[a, b]$, then $f$ is Lebesgue integrable on $[a, b]$, and
$$\int_{[a, b]} f\,d\lambda = \int_a^b f(x)\,dx.$$
The converse, however, is false! This is illustrated by the following example.
10.6.9 EXAMPLE Let $E = [0, 1] \setminus \mathbb{Q}$, and set
$$f(x) = \chi_E(x) = \begin{cases} 1, & \text{when } x \text{ is irrational},\\ 0, & \text{when } x \text{ is rational}.\end{cases}$$
By Example 6.1.6(a) the function $f$ is not Riemann integrable. On the other hand, since $f$ is a simple function, $f$ is Lebesgue integrable, and by Lemma 10.6.7,
$$\int_{[0, 1]} f\,d\lambda = \lambda(E) = 1.$$
10.6
The Lebesgue Integral of a Bounded Function
Properties of the Lebesgue Integral for Bounded Functions The following theorem summarizes some basic properties of the Lebesgue integral for bounded functions.
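10.6.10 THEOREM Suppose $f, g$ are bounded real-valued measurable functions on $[a, b]$. Then
(a) for all $\alpha, \beta \in \mathbb{R}$,
$$\int_{[a, b]} (\alpha f + \beta g)\,d\lambda = \alpha \int_{[a, b]} f\,d\lambda + \beta \int_{[a, b]} g\,d\lambda.$$
(b) If $A_1, A_2$ are disjoint measurable subsets of $[a, b]$, then
$$\int_{A_1 \cup A_2} f\,d\lambda = \int_{A_1} f\,d\lambda + \int_{A_2} f\,d\lambda.$$
(c) If $f \ge g$ a.e. on $[a, b]$, then
$$\int_{[a, b]} f\,d\lambda \ge \int_{[a, b]} g\,d\lambda.$$
(d) If $f = g$ a.e. on $[a, b]$, then
$$\int_{[a, b]} f\,d\lambda = \int_{[a, b]} g\,d\lambda.$$
(e)
$$\left|\int_{[a, b]} f\,d\lambda\right| \le \int_{[a, b]} |f|\,d\lambda.$$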
Proof. Since the proof of (a) is similar to the proof of the corresponding result for the Riemann integral we leave it as an exercise (Exercise 4). For the proof of (b), by definition
$$\int_{A_1 \cup A_2} f\,d\lambda = \int_{[a, b]} f\chi_{A_1 \cup A_2}\,d\lambda.$$
Since $A_1 \cap A_2 = \emptyset$, $f\chi_{A_1 \cup A_2} = f\chi_{A_1} + f\chi_{A_2}$, and the result now follows by (a).
(c) Consider the function $h(x) = f(x) - g(x)$. By hypothesis $h \ge 0$ a.e. on $[a, b]$. Let
$$E_1 = \{x : h(x) \ge 0\} \quad\text{and}\quad E_2 = [a, b] \setminus E_1.$$
Consider the measurable partition $\mathcal{P} = \{E_1, E_2\}$ of $[a, b]$. Then
$$\int_{[a, b]} h\,d\lambda \ge \mathcal{L}_L(\mathcal{P}, h) = m_1\lambda(E_1) + m_2\lambda(E_2).$$
Since $h(x) \ge 0$ for all $x \in E_1$, $m_1 = \inf\{h(x) : x \in E_1\} \ge 0$. On the other hand, since $h \ge 0$ a.e., $\lambda(E_2) = 0$. Therefore, $\int_{[a, b]} h\,d\lambda \ge 0$. The result now follows by (a). The result (d) is an immediate consequence of (c), and (e) is left for the exercises. The measurability of $|f|$ follows from Exercise 8 or 10 of the previous section. $\square$
Bounded Convergence Theorem One of the main advantages of the Lebesgue theory of integration involves the interchange of limits. If $\{f_n\}$ is a sequence of Riemann integrable functions on $[a, b]$ such that $f_n(x)$ converges to a function $f(x)$ for all $x \in [a, b]$, then there is no guarantee that $f$ is Riemann integrable on $[a, b]$. An example of such a sequence was given in Example 8.1.2(c). For the Lebesgue integral, however, we have the following very useful result.
10.6.11 THEOREM (Bounded Convergence Theorem) Suppose $\{f_n\}$ is a sequence of real-valued measurable functions on $[a, b]$ for which there exists a positive constant $M$ such that $|f_n(x)| \le M$ for all $n \in \mathbb{N}$ and all $x \in [a, b]$. If
$$\lim_{n \to \infty} f_n(x) = f(x) \quad\text{a.e. on } [a, b],$$
then $f$ is Lebesgue integrable on $[a, b]$ and
$$\int_{[a, b]} f\,d\lambda = \lim_{n \to \infty} \int_{[a, b]} f_n\,d\lambda.$$
Remark. Although we state and prove the bounded convergence theorem for a closed and bounded interval $[a, b]$, the conclusion is still valid if the sequence $\{f_n\}$ is defined on a bounded measurable set $A$. The necessary modifications to the proof are left to the exercises.
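Proof. Since $f_n \to f$ a.e., $f$ is measurable by Corollary 10.5.10, and thus Lebesgue integrable. Let
$$E = \{x \in [a, b] : f_n(x) \text{ does not converge to } f(x)\}.$$
Define the functions $g$ and $g_n$, $n \in \mathbb{N}$, on $[a, b]$ as follows:
$$g_n(x) = \begin{cases} f_n(x), & x \in [a, b] \setminus E,\\ 0, & x \in E,\end{cases} \quad\text{and}\quad g(x) = \begin{cases} f(x), & x \in [a, b] \setminus E,\\ 0, & x \in E.\end{cases}$$
Since $\lambda(E) = 0$, $g_n = f_n$ a.e. and $g = f$ a.e. Therefore,
$$\int_{[a, b]} g_n\,d\lambda = \int_{[a, b]} f_n\,d\lambda \quad\text{and}\quad \int_{[a, b]} g\,d\lambda = \int_{[a, b]} f\,d\lambda.$$
Furthermore, $g_n(x) \to g(x)$ for all $x \in [a, b]$. Let $\epsilon > 0$ be given. For $m \in \mathbb{N}$, set
$$E_m = \{x \in [a, b] : |g(x) - g_n(x)| < \epsilon \text{ for all } n \ge m\}.$$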
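Then $E_1 \subset E_2 \subset \cdots$ with $\bigcup_{m=1}^{\infty} E_m = [a, b]$. Therefore, $\bigcap_{m=1}^{\infty} E_m^c = \emptyset$. Here $E_m^c = [a, b] \setminus E_m$. Thus by Theorem 10.4.6, $\lim_{m \to \infty} \lambda(E_m^c) = 0$. Choose $m \in \mathbb{N}$ such that $\lambda(E_m^c) < \epsilon$. Then $|g(x) - g_n(x)| < \epsilon$ for all $n \ge m$ and all $x \in E_m$. Therefore, for all $n \ge m$,
$$\left|\int_{[a, b]} f\,d\lambda - \int_{[a, b]} f_n\,d\lambda\right| = \left|\int_{[a, b]} g\,d\lambda - \int_{[a, b]} g_n\,d\lambda\right| \le \int_{[a, b]} |g - g_n|\,d\lambda$$
$$= \int_{E_m} |g - g_n|\,d\lambda + \int_{E_m^c} |g - g_n|\,d\lambda \le \epsilon\,\lambda(E_m) + 2M\,\lambda(E_m^c) < \epsilon\,[b - a + 2M].$$
Since $\epsilon > 0$ was arbitrary, we have $\lim_{n \to \infty} \int_{[a, b]} f_n\,d\lambda = \int_{[a, b]} f\,d\lambda$. $\square$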
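A concrete instance may help fix the statement of the bounded convergence theorem. The sketch below uses the sequence $f_n(x) = x^n$ on $[0, 1]$, which is bounded by $M = 1$ and converges to 0 for every $x$ except $x = 1$, hence a.e. The choice of sequence is an assumption made for illustration. Since each $f_n$ is continuous, Corollary 10.6.8 lets us compute the Lebesgue integrals as Riemann integrals, here approximated with Simpson's rule.

```python
def integrate(f, a, b, n=2000):
    """Composite Simpson's rule on [a, b] with n (even) subintervals."""
    h = (b - a) / n
    total = f(a) + f(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * f(a + i * h)
    return total * h / 3

def f_n(n):
    """The function x -> x**n, bounded by 1 on [0, 1]."""
    return lambda x: x ** n

# The limit function is 0 a.e., so the integrals should tend to 0.
exponents = (1, 5, 25, 125)
integrals = [integrate(f_n(n), 0.0, 1.0) for n in exponents]
for n, value in zip(exponents, integrals):
    print(n, value, 1 / (n + 1))  # numerical value next to the exact 1/(n+1)
```

The computed integrals track the exact values $1/(n+1)$ and decrease to 0, which is the integral of the a.e. limit.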
Combining the bounded convergence theorem with Corollary 10.6.8, we obtain the bounded convergence theorem for Riemann integrable functions previously stated in Chapter 8. The theorem does require the additional hypothesis that the limit function f is Riemann integrable.
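THEOREM 8.4.3 Let $f$ and $f_n$, $n \in \mathbb{N}$, be Riemann integrable functions on $[a, b]$ with $\lim_{n \to \infty} f_n(x) = f(x)$ for all $x \in [a, b]$. Suppose there exists a positive constant $M$ such that $|f_n(x)| \le M$ for all $x \in [a, b]$ and all $n \in \mathbb{N}$. Then
$$\lim_{n \to \infty} \int_a^b f_n(x)\,dx = \int_a^b f(x)\,dx.$$
10.6.12 EXAMPLES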
(a) In the first example, we show that the conclusion of the bounded convergence theorem is false if the sequence $\{f_n\}$ is not bounded; that is, there does not exist a finite constant $M$ such that $|f_n(x)|$ …
… $P \Rightarrow Q$ (read "P implies Q") is the proposition "If P, then Q." The statement $P \Rightarrow Q$ is a true statement unless $P$ is true and $Q$ is false, in which case it is a false statement. (ii) The biconditional sentence $P \Leftrightarrow Q$ is the proposition "P if and only if Q." The sentence $P \Leftrightarrow Q$ is true exactly when $P$ and $Q$ have the same truth values, otherwise it is false. The symbol "$\Rightarrow$" is referred to as the implication or conditional symbol, whereas "$\Leftrightarrow$" is the biconditional symbol. In "$P \Rightarrow Q$," the proposition $P$ is the hypothesis or antecedent and $Q$ is the conclusion or consequent. The truth values for $P \Rightarrow Q$ and $P \Leftrightarrow Q$ are given in the following table.
P    Q    P ⇒ Q    P ⇔ Q
T    T      T        T
T    F      F        F
F    T      T        F
F    F      T        T
In the truth table for $P \Rightarrow Q$, the only line where $P \Rightarrow Q$ is true and $P$ is true is the first line, where $Q$ is also true. Thus the conditional statement $P \Rightarrow Q$ is often also expressed by saying that $P$ is a sufficient condition for $Q$ (if $P$ is true then $Q$ follows), or that $Q$ is a necessary condition for $P$ ($P$ cannot be true unless $Q$ is true). That $Q$ is a necessary condition for $P$ is sometimes also expressed by the phrase "P only if Q." Since the biconditional $P \Leftrightarrow Q$ is true exactly when $P$ and $Q$ have the same truth values, this is often also verbally expressed by "P is necessary and sufficient for Q." To illustrate the truth values assigned in the conditional sentence, let us consider the following example. Your professor agrees: "If you earn an A on the final, I will assign you an A for the course."
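The truth table for the conditional and the biconditional can be generated mechanically. The following is a small sketch; the function names `implies`, `iff`, and `fmt` are illustrative choices, not notation from the text.

```python
from itertools import product

def implies(p, q):
    """P => Q is false only when P is true and Q is false."""
    return (not p) or q

def iff(p, q):
    """P <=> Q is true exactly when P and Q have the same truth value."""
    return p == q

def fmt(value):
    """Render a Boolean as T or F."""
    return "T" if value else "F"

# Enumerate the rows in the same order as the table: TT, TF, FT, FF.
print("P  Q  P=>Q  P<=>Q")
for p, q in product([True, False], repeat=2):
    print(fmt(p), "", fmt(q), "", fmt(implies(p, q)), "   ", fmt(iff(p, q)))
```

Note that the conditional is true in both rows where $P$ is false, which matches the professor example: the promise is broken only when the hypothesis holds and the conclusion fails.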
Appendix
Logic and Proofs
Here the antecedent $P$ is "You earn an A on the final" and the consequent $Q$ is "He assigns you an A for the course." The only case in which you have reason to be angry (the sentence is false) is when $P$ is true and $Q$ is false. If both $P$ and $Q$ are false, you may not be happy, but you have no cause to be angry with your professor. On the other hand, if $P$ is false and $Q$ is true, you certainly will not be angry. Closely related to the conditional sentence are the converse and the contrapositive of $P \Rightarrow Q$.
A.1.4 DEFINITION For propositions $P$ and $Q$, the converse of $P \Rightarrow Q$ is $Q \Rightarrow P$, and the contrapositive of $P \Rightarrow Q$ is $\sim Q \Rightarrow \sim P$.
The truth values for each of these are given in the following table.
P    Q    P ⇒ Q    Q ⇒ P    ~Q ⇒ ~P
T    T      T        T         T
T    F      F        T         F
F    T      T        F         T
F    F      T        T         T
A propositional formula is an expression involving finitely many logical connectives (such as $\wedge$, $\vee$, $\sim$, $\Rightarrow$, and $\Leftrightarrow$) and variables (such as $P$, $Q$, $R$, etc.). For example,
$$(P \wedge (Q \vee R)) \vee R$$
is a propositional formula; it becomes a proposition when the letters $P$, $Q$, and $R$ represent propositions and thus have either the truth value T or F. Two propositional formulas are equivalent if and only if they have the same truth values for all assignments of truth values to the simple propositions making up the propositional formulas. Simply stated, two propositional formulas are equivalent if and only if they have the same truth table. For example, the formulas $P \Rightarrow Q$ and $\sim P \vee Q$ are equivalent. This is verified by the following truth table.
P    Q    P ⇒ Q    ~P ∨ Q
T    T      T         T
T    F      F         F
F    T      T         T
F    F      T         T
Also, from the table for the contrapositive we see that $P \Rightarrow Q$ is equivalent to $\sim Q \Rightarrow \sim P$. To emphasize this we state it as a theorem.
A.1.5 THEOREM The proposition $P \Rightarrow Q$ is equivalent to its contrapositive $\sim Q \Rightarrow \sim P$.
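Equivalence of two propositional formulas can be tested by comparing them on every assignment of truth values. The sketch below represents formulas as Python functions of Boolean arguments; this representation and the names `equivalent`, `conditional`, `converse`, `contrapositive`, and `disjunctive_form` are assumptions made for illustration.

```python
from itertools import product

def implies(p, q):
    """P => Q is false only when P is true and Q is false."""
    return (not p) or q

def equivalent(f, g, nvars):
    """Two formulas are equivalent when their truth tables agree row by row."""
    return all(f(*values) == g(*values)
               for values in product([True, False], repeat=nvars))

conditional = lambda p, q: implies(p, q)                  # P => Q
converse = lambda p, q: implies(q, p)                     # Q => P
contrapositive = lambda p, q: implies(not q, not p)       # ~Q => ~P
disjunctive_form = lambda p, q: (not p) or q              # ~P v Q

print(equivalent(conditional, contrapositive, 2))    # True
print(equivalent(conditional, disjunctive_form, 2))  # True
print(equivalent(conditional, converse, 2))          # False
```

The last line is the computational counterpart of the observation that a conditional and its converse have different truth tables, so proving $Q \Rightarrow P$ says nothing about $P \Rightarrow Q$.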
Exercise 1 contains several pairs of equivalent propositional formulas. These are important and should be memorized. Some propositional formulas have the property that they are always true regardless of the assignment of T or F to the simple propositions making up the formula. For example, it is easily verified by a truth table that each of the propositional formulas
$$P \vee \sim P \quad\text{and}\quad (P \wedge Q) \Rightarrow P$$
A.1
Propositions and Connectives
have the value T for any assignment of T or F to $P$ and $Q$. Such propositions are called tautologies. A tautology is a propositional formula that is true for every assignment of truth value to its components. A contradiction is the negation of a tautology. Thus $\sim(P \vee \sim P)$ is a contradiction. By means of a truth table it is easily verified that $\sim(P \vee \sim P)$ is equivalent to $\sim P \wedge P$, which simply states that "P" and "not P" cannot both be true simultaneously. This is sometimes referred to as the law of the excluded middle. Exercise 3 contains several basic tautologies that will be very useful in our discussion on rules of inference. From the definitions of equivalence of propositional formulas and the biconditional it should be clear that two propositional formulas $P$ and $Q$ are equivalent if and only if $P \Leftrightarrow Q$ is a tautology. To emphasize its importance we state it as a theorem.
A.1.6 THEOREM Two propositional formulas $P$ and $Q$ are equivalent if and only if $P \Leftrightarrow Q$ is a tautology.
To illustrate this we consider the equivalent propositional formulas $P \Rightarrow Q$ and $\sim P \vee Q$. The truth table for $(P \Rightarrow Q) \Leftrightarrow (\sim P \vee Q)$ is as follows:
P    Q    P ⇒ Q    ~P ∨ Q    (P ⇒ Q) ⇔ (~P ∨ Q)
T    T      T         T               T
T    F      F         F               T
F    T      T         T               T
F    F      T         T               T
Thus $(P \Rightarrow Q) \Leftrightarrow (\sim P \vee Q)$ is a tautology.
EXERCISES A.1
1. For propositions $P$, $Q$, and $R$, verify by means of a truth table that each of the following pairs of propositional formulas are equivalent.
a. $\sim(\sim P)$ and $P$
b. $P \wedge Q$ and $Q \wedge P$; $P \vee Q$ and $Q \vee P$
c. $P \wedge (Q \vee R)$ and $(P \wedge Q) \vee (P \wedge R)$
d. $P \vee (Q \wedge R)$ and $(P \vee Q) \wedge (P \vee R)$
e. $\sim(P \wedge Q)$ and $\sim P \vee \sim Q$
f. $\sim(P \vee Q)$ and $\sim P \wedge \sim Q$
g. $P \Leftrightarrow Q$ and $(P \Rightarrow Q) \wedge (Q \Rightarrow P)$
Parts (c) and (d) above are referred to as the distributive laws for $\wedge$ and $\vee$, whereas parts (e) and (f) are De Morgan's laws for $\wedge$ and $\vee$.
2. By means of a truth table show that $(P \vee Q) \wedge \sim(P \wedge Q)$ is true if exactly one of $P$ or $Q$ is true, and false otherwise.
3. Prove that each of the following is a tautology.
a. $P \vee \sim P$ (Excluded Middle)
b. $\sim(\sim P) \Leftrightarrow P$ (Double Negative)
c. $P \Leftrightarrow [\sim P \Rightarrow (R \wedge \sim R)]$ (Contradiction)
d. $(P \wedge Q) \Rightarrow P$ (Conjunctive Simplification)
e. $P \Rightarrow (P \vee Q)$ (Disjunction)
f. $[P \wedge (P \Rightarrow Q)] \Rightarrow Q$ (Modus Ponens)
g. $[(P \Rightarrow Q) \wedge \sim Q] \Rightarrow \sim P$ (Modus Tollens)
h. $[(P \Rightarrow Q) \wedge (Q \Rightarrow R)] \Rightarrow (P \Rightarrow R)$ (Transitivity)
4. For each of the following determine whether it is a tautology, a contradiction, or neither.
a.PA P
b.P''PA(PvQ)
c. (P A Q) A(' PV  Q)
d. (PAQ)''(P ' Q)
e. IQ A (P=* Q)] 'P
f. [P=* (Q A R)]
'[(Q=R) V (R 'P))
5. In this exercise no knowledge about sequences is required. Let $M$, $B$, and $C$ denote the following statements.
M: "The sequence is monotone."
B: "The sequence is bounded."
C: "The sequence converges."
Express each of the following sentences symbolically, using the convention that "divergent" is the negation of "convergent" and "unbounded" is the negation of "bounded." (In each of the statements the term sequence refers to a sequence of real numbers.)
a. The sequence is monotone and bounded.
b. A convergent sequence is bounded.
c. A sequence converges only if it is bounded.
d. In order that a sequence is bounded, it is necessary that it converges.
e. If the sequence diverges, then it is unbounded.
6. Provide an appropriate negation of each of the sentences in Exercise 5. Express your answer first in symbolic form and then in English.
A.2 Rules of Inference Given a propositional formula $R$, the method of truth tables provides a reliable means to determine whether $R$ is a tautology. This method can even be turned into a computer program which would accept any formula $R$ as an input and would determine whether $R$ is a tautology. However, checking formulas of even modest length turns out to consume inordinate amounts of time, since a formula with $n$ variables has a truth table with $2^n$ rows. What is worse, this situation is not expected to be improved substantially by clever computer programming. In computational complexity, a branch of mathematical computer science, the problem of determining whether a propositional formula is a tautology is known to be coNP-complete, and it is widely believed, though not proven, to be intractable in the sense that any program for its solution will place insurmountable demands on computational resources. Where computational methods fail, mathematical ingenuity can still succeed. Confronted with a propositional formula $R$ it may be possible to offer a proof that it is a tautology. Roughly speaking, a proof of $R$ is a sequence of sentences or propositional formulas, the last one being $R$, such that each of the sentences in the sequence is either an axiom, a hypothesis, or a statement that follows from the previous sentences in the sequence by some principle of logical inference. To make this unambiguous, we have to specify these principles of logical inference. A principle of logical inference may have some premises (like the previous sentences referred to above) and a conclusion.
The Form of a Rule of Inference: From $P_1, \ldots, P_n$ one can infer $Q$.

Symbolically this can be expressed as

$$P_1, \ldots, P_n \ \therefore\ Q,$$

or as

$$\begin{array}{l} P_1 \\ \ \vdots \\ P_n \\ \hline \therefore\ Q. \end{array}$$

Here the formulas $P_1, \ldots, P_n$ are the premises and $Q$ is the conclusion. (The symbol $\therefore$ is used in mathematics to denote therefore.) We even allow the case when no premises are present. In that case, Q can be regarded as an axiom. The most important thing about a rule of inference is that it should be logically valid; that is,

$$P_1 \wedge \cdots \wedge P_n \Rightarrow Q \quad \text{(or just } Q \text{ if no premises are given)}$$

should be a tautology. For example, to show that $P,\ P \Rightarrow Q \ \therefore\ Q$ is logically valid, it suffices to show that $(P \wedge (P \Rightarrow Q)) \Rightarrow Q$ is a tautology. This is easily verified by means of a truth table as follows.
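$$\begin{array}{cc|c|c|c}
P & Q & P \Rightarrow Q & P \wedge (P \Rightarrow Q) & [P \wedge (P \Rightarrow Q)] \Rightarrow Q \\ \hline
T & T & T & T & T \\
T & F & F & F & T \\
F & T & T & F & T \\
F & F & T & F & T
\end{array}$$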
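The truth-table method used above can be mechanized in a few lines. The following Python sketch is only an illustration: the helper names `implies` and `is_tautology` are our own, and formulas are modeled as ordinary Boolean-valued functions. It enumerates all $2^n$ rows of the table, which is why the method becomes expensive as the number of variables grows.

```python
from itertools import product

def implies(p, q):
    """Material implication: P => Q is false only when P is true and Q is false."""
    return (not p) or q

def is_tautology(formula, num_vars):
    """Return True if `formula` is true in every row of its truth table.

    `formula` is a function of `num_vars` Boolean arguments; all
    2**num_vars assignments of truth values are checked.
    """
    return all(formula(*row) for row in product([True, False], repeat=num_vars))

# Modus ponens: [P and (P => Q)] => Q
modus_ponens = lambda p, q: implies(p and implies(p, q), q)

# For contrast, [Q and (P => Q)] => P is not a tautology:
# it fails in the row where P is false and Q is true.
affirming_consequent = lambda p, q: implies(q and implies(p, q), p)

print(is_tautology(modus_ponens, 2))          # True
print(is_tautology(affirming_consequent, 2))  # False
```

The second formula is included only to show that the checker does reject a formula that resembles modus ponens but is not logically valid.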
Since it becomes impractical to always verify by means of a truth table whether $P_1 \wedge \cdots \wedge P_n \Rightarrow Q$ is a tautology, we utilize tautologies to construct rules of inference. We can take any tautology of the form above and convert it into a rule of inference. Of course, we would gain nothing if we allowed ourselves to have a rule of inference for every tautology. We still have to use truth tables to verify that each rule of inference comes from a tautology. Fortunately, a handful of fairly simple tautologies is all that is needed. Each of the following tautologies can be verified by the method of truth tables without too much effort.
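$P \vee \neg P$   (Excluded Middle)
$P \Rightarrow P \vee Q$   (Disjunction)
$[P \wedge (P \Rightarrow Q)] \Rightarrow Q$   (Modus Ponens)
$[(P \Rightarrow Q) \wedge \neg Q] \Rightarrow \neg P$   (Modus Tollens)
$P \wedge Q \Rightarrow P$ (or $Q$)   (Conjunctive Simplification)
$[(P \Rightarrow Q) \wedge (Q \Rightarrow R)] \Rightarrow (P \Rightarrow R)$   (Transitivity)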
From these tautologies we obtain the rules of inference listed below. Some additional useful and important tautologies are as follows.
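$\neg(\neg P) \Leftrightarrow P$   (Double Negative)
$P \vee Q \Leftrightarrow Q \vee P$, $\quad P \wedge Q \Leftrightarrow Q \wedge P$   (Commutative Laws)
$P \vee (Q \vee R) \Leftrightarrow (P \vee Q) \vee R$, $\quad P \wedge (Q \wedge R) \Leftrightarrow (P \wedge Q) \wedge R$   (Associative Laws)
$\neg(P \wedge Q) \Leftrightarrow \neg P \vee \neg Q$, $\quad \neg(P \vee Q) \Leftrightarrow \neg P \wedge \neg Q$   (De Morgan's Laws)
$[P \wedge (Q \vee R)] \Leftrightarrow [(P \wedge Q) \vee (P \wedge R)]$, $\quad [P \vee (Q \wedge R)] \Leftrightarrow [(P \vee Q) \wedge (P \vee R)]$   (Distributive Laws)
$(P \Rightarrow Q) \Leftrightarrow (\neg Q \Rightarrow \neg P)$   (Contrapositive)
$P \Leftrightarrow [\neg P \Rightarrow (R \wedge \neg R)]$   (Contradiction)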
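As a cross-check on the two lists of tautologies, each formula can be run through a brute-force truth table. The sketch below is self-contained; the dictionary name `TAUTOLOGIES` and the helpers `implies`, `iff`, and `is_tautology` are our own labels, chosen for illustration.

```python
from itertools import product

def implies(p, q):
    return (not p) or q

def iff(p, q):
    return p == q

def is_tautology(formula, num_vars):
    """True if `formula` holds under all 2**num_vars truth assignments."""
    return all(formula(*row) for row in product([True, False], repeat=num_vars))

# name -> (number of variables, formula)
TAUTOLOGIES = {
    "excluded middle": (1, lambda p: p or (not p)),
    "disjunction": (2, lambda p, q: implies(p, p or q)),
    "modus ponens": (2, lambda p, q: implies(p and implies(p, q), q)),
    "modus tollens": (2, lambda p, q: implies(implies(p, q) and (not q), not p)),
    "conjunctive simplification": (2, lambda p, q: implies(p and q, p)),
    "transitivity": (3, lambda p, q, r:
                     implies(implies(p, q) and implies(q, r), implies(p, r))),
    "double negative": (1, lambda p: iff(not (not p), p)),
    "De Morgan (and)": (2, lambda p, q: iff(not (p and q), (not p) or (not q))),
    "De Morgan (or)": (2, lambda p, q: iff(not (p or q), (not p) and (not q))),
    "distributive (and over or)": (3, lambda p, q, r:
                                   iff(p and (q or r), (p and q) or (p and r))),
    "distributive (or over and)": (3, lambda p, q, r:
                                   iff(p or (q and r), (p or q) and (p or r))),
    "contrapositive": (2, lambda p, q: iff(implies(p, q), implies(not q, not p))),
    "contradiction": (2, lambda p, r: iff(p, implies(not p, r and (not r)))),
}

results = {name: is_tautology(f, n) for name, (n, f) in TAUTOLOGIES.items()}
for name, ok in results.items():
    print(f"{name}: {ok}")   # every entry prints True
```

The commutative and associative laws are omitted only for brevity; they can be added to the dictionary in the same way.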
One of the most fundamental rules of inference, in fact usually taken as an axiom, is the following.
Rule of the Excluded Middle: One can infer $P \vee \neg P$, for any statement P.

This rule simply states that the propositional formula $P \vee \neg P$ can be inferred, for any proposition P. For example, if an argument involves a real number x, one can assert at any time that "Either $x = 0$ or $x \neq 0$."
Rule of Conjunction: From P and Q, one can infer $P \wedge Q$.

Rule of Disjunction: From P, one can infer $P \vee Q$.

Modus Ponens Rule or Rule of Detachment: From P and $P \Rightarrow Q$, one can infer Q.

Modus Tollens Rule or Rule of Contrapositive Inference: From $\neg Q$ and $P \Rightarrow Q$, one can infer $\neg P$.
Rule of Conjunctive Simplification: From $P \wedge Q$, one can infer both P and Q.

Rule of Transitive Inference or Hypothetical Syllogism: From $P \Rightarrow Q$ and $Q \Rightarrow R$, one can infer $P \Rightarrow R$.
The rule of conjunction, although not listed as a tautology, simply asserts that $P,\ Q \ \therefore\ P \wedge Q$ is logically valid; that is, $P \wedge Q \Rightarrow P \wedge Q$ is a tautology. Likewise, the rule of disjunction follows from the fact that $P \Rightarrow P \vee Q$ is also a tautology. The modus ponens rule is again a verbal statement of the modus ponens tautology $[P \wedge (P \Rightarrow Q)] \Rightarrow Q$. The implication $P \Rightarrow Q$ by itself, even if known to be true, allows us to infer nothing about Q. The implication $P \Rightarrow Q$ is also true when both P and Q are false. However, if both P and $P \Rightarrow Q$ are true, then Q must also be true.

The rule of conjunctive simplification is a restatement of the corresponding tautology. However, it also follows as a consequence of the modus ponens rule. Since $(P \wedge Q) \Rightarrow P$ (or $Q$) is a tautology, by the rules of conjunction and modus ponens

$$(P \wedge Q) \wedge [(P \wedge Q) \Rightarrow P] \Rightarrow P \quad (\text{or } Q).$$

The modus tollens rule follows from the modus tollens tautology. It is also a simple consequence of the contrapositive law and modus ponens. Since $P \Rightarrow Q$ is equivalent to $\neg Q \Rightarrow \neg P$, it is easily verified by a truth table that the formula $\neg Q \wedge (P \Rightarrow Q)$ is equivalent to $\neg Q \wedge (\neg Q \Rightarrow \neg P)$. Thus by the modus ponens rule $\neg P$ can be inferred.

The above argument illustrates the use of the replacement rule. In the formula $\neg Q \wedge (P \Rightarrow Q)$, the expressions $\neg Q$ and $P \Rightarrow Q$ are subformulas of the original formula. The new formula $\neg Q \wedge (\neg Q \Rightarrow \neg P)$ was obtained from the original formula by replacing the subformula $P \Rightarrow Q$ by its equivalent formula $\neg Q \Rightarrow \neg P$. The resulting formula is then equivalent to the original. As a rule of inference, this is stated as follows.
Simple Replacement: If $P'$ is a subformula of a formula $P$ and $P' \Leftrightarrow Q'$, then from $P$ one can infer any formula $Q$ that results from replacing an occurrence of $P'$ in $P$ with $Q'$.
There are several additional rules that are worthy of mention. The justification of these rules is left to the exercises.
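Disjunctive Syllogism: From $P \vee Q$ and $\neg P$, one can infer Q.

Rule of Inference by Cases: From $P \Rightarrow Q$ and $R \Rightarrow Q$, one can infer $(P \vee R) \Rightarrow Q$.

Rules of Biconditional Inference: From $P \Rightarrow Q$ and $Q \Rightarrow P$, one can infer $P \Leftrightarrow Q$. From $P \Leftrightarrow Q$, one can infer $P \Rightarrow Q$ and $Q \Rightarrow P$.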
A more detailed discussion of the propositional proof system would have taken us too far astray from our main goal; namely, to provide basic rules of inference with which to construct valid arguments. A formal discussion of the propositional proof system can be found in the text by Bums listed in the Supplemental Reading section. In that text the author proves that every tautology, and only tautologies, can be derived from the listed rules of inference.

To prove the validity of an inference $P_1, \ldots, P_n \ \therefore\ Q$, we simply have to verify that using the rules of inference, we can infer the conclusion Q from the given premises $P_1, \ldots, P_n$. Each statement in the proof should be either a premise or an axiom, or should follow from previous statements by one of the accepted rules of inference. Devising a proof is a feat of mathematical ingenuity. However, once a purported proof is in hand, it can be easily checked step by step for validity. We illustrate the use of the rules of inference with the following examples.
A.2.1 EXAMPLES

(a) As our first example we consider the following verbal argument.

If John is a Democrat, he associates with Democrats.
But John does not associate with Democrats.
Therefore, John is not a Democrat.

This argument can be written in symbolic form as

$$P \Rightarrow Q,\ \neg Q \ \therefore\ \neg P.$$

By the rule of contrapositive inference the argument is valid, whatever the truth or falsity of the statements in it may be.

(b) As our second example we consider the following.

(a) $P$
(b) $P \Rightarrow Q$
(c) $P \Rightarrow (Q \Rightarrow R)$
$\therefore\ R$

By the modus ponens rule, from (a) and (b) we can infer Q. Likewise from (a) and (c) we can also infer $Q \Rightarrow R$. But now from Q and $Q \Rightarrow R$, we can infer R. Thus the inference is logically valid. In symbolic form, the above proof can be written as follows:

1. $P$   (premise)
2. $P \Rightarrow Q$   (premise)
3. $P \Rightarrow (Q \Rightarrow R)$   (premise)
4. $Q$   (modus ponens 1 & 2)
5. $Q \Rightarrow R$   (modus ponens 1 & 3)
$\therefore\ R$   (modus ponens 4 & 5)
(c) For our final example we consider the following argument concerning Cauchy sequences. For the purposes of illustration we need to know nothing about sequences, with the exception that they are usually denoted as $\{p_n\}$ and that divergence is the negation of convergence.

The sequence $\{p_n\}$ diverges.
A bounded sequence has a convergent subsequence.
Every Cauchy sequence that has a convergent subsequence converges.
Therefore, the sequence $\{p_n\}$ is not Cauchy.

In this particular example the conclusion happens to be true, but the argument is not valid. To see this we will write the argument in symbolic form. Let P, Q, R, and S denote the following statements.

P: "The sequence $\{p_n\}$ is Cauchy."
Q: "The sequence $\{p_n\}$ converges."
R: "The sequence $\{p_n\}$ is bounded."
S: "The sequence $\{p_n\}$ has a convergent subsequence."

In symbolic form the above argument is expressed as follows:

(a) $\neg Q$
(b) $R \Rightarrow S$
(c) $(P \wedge S) \Rightarrow Q$
$\therefore\ \neg P$

By (a) and (c) and the modus tollens rule we can infer $\neg(P \wedge S)$. But by De Morgan's law, $\neg(P \wedge S)$ is equivalent to the statement $\neg P \vee \neg S$. However, from $\neg P \vee \neg S$ we can infer neither $\neg P$ nor $\neg S$. If we knew that S (i.e., $\neg(\neg S)$) was true, then by the rule of disjunctive syllogism we could infer $\neg P$. Unfortunately, from the given premises nothing can be inferred about S. If S has the truth value F, then $\neg P \vee \neg S$ is true regardless of the truth value of P. This allows us to obtain an assignment of truth values that makes the premises true and the conclusion false. Thus the argument is not valid.
If the statement R had been included as a premise, then the resulting inference would be valid. As is often the case in proofs written by the beginning student, statement (b) of the proof is totally extraneous.
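The validity claims in Examples (b) and (c) can also be checked semantically, by searching for an assignment of truth values that makes every premise true and the conclusion false. The sketch below does this by brute force; the function name `counterassignments` and the dictionary-based encoding of statements are our own choices for illustration. Note that this is a truth-table test of validity, not a derivation using the rules of inference.

```python
from itertools import product

def implies(p, q):
    return (not p) or q

def counterassignments(names, premises, conclusion):
    """All truth assignments making every premise true and the conclusion false.

    The argument is valid exactly when the returned list is empty.
    """
    found = []
    for values in product([True, False], repeat=len(names)):
        env = dict(zip(names, values))
        if all(prem(env) for prem in premises) and not conclusion(env):
            found.append(env)
    return found

# Example (b): P, P => Q, P => (Q => R), therefore R
example_b = counterassignments(
    ["P", "Q", "R"],
    [lambda e: e["P"],
     lambda e: implies(e["P"], e["Q"]),
     lambda e: implies(e["P"], implies(e["Q"], e["R"]))],
    lambda e: e["R"],
)

# Example (c): not Q, R => S, (P and S) => Q, therefore not P
premises_c = [
    lambda e: not e["Q"],
    lambda e: implies(e["R"], e["S"]),
    lambda e: implies(e["P"] and e["S"], e["Q"]),
]
not_P = lambda e: not e["P"]
example_c = counterassignments(["P", "Q", "R", "S"], premises_c, not_P)

# Example (c) again, with R added as a premise
example_c_with_R = counterassignments(
    ["P", "Q", "R", "S"], premises_c + [lambda e: e["R"]], not_P)

print(example_b)         # [] : the inference is valid
print(example_c)         # one assignment: P true; Q, R, S false
print(example_c_with_R)  # [] : valid once R is a premise
```

The single assignment found for Example (c) has S false, in agreement with the discussion above.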
EXERCISES A.2

1. Construct a truth table for $[(P \Rightarrow Q) \wedge \neg Q] \Rightarrow \neg P$ to verify the validity of the argument in Example A.2.1(a).
2. a. Justify the rule of disjunctive syllogism. b. Justify the rule of inference by cases.
3. Justify the rules of biconditional inference.

In Exercises 4-12 use the rules of inference or a truth table to test the validity of each of the following.
4.
If L, is 11 to L2 and L2 is 11 to L3, then L, is 11 to L,.
L,istltoL2L2 is to L3.
L, is to L3If m is even, then 2 divides m.
5.
If 2 divides m, then 4 divides m2. If m is even and 4 divides m2, then m2 is even. m2 is even. 6.
SQ
7.
S
P V S .P 8.
R
SaQ (RVS)*Q Q
P
9.
PAR RCS (R A S)mo,Q Q
P
10.
R
PAS (RAS) .Q .. 11.
Q
P
Q
R*S P V R
QVS
12. The sequence $\{p_n\}$ diverges.
The sequence $\{p_n\}$ is bounded.
A bounded sequence has a convergent subsequence.
Every Cauchy sequence that has a convergent subsequence converges.
$\therefore$ The sequence $\{p_n\}$ is not Cauchy.

A.3 Mathematical Proofs

In mathematics a proof is a logically valid deduction of a theorem from the premises of the theorem, the axioms, or previously proved statements or theorems. The truth of any statement in a proof should be traceable back to some initial set of axioms or postulates that are assumed true. A proof should not be just a string of symbols. Every step in a proof should express a complete sentence, including the justification of the step. In this section we look at several methods that are commonly used to prove a theorem. Most theorems in mathematics are stated in the form "If P, then Q"; that is, $P \Rightarrow Q$. Any theorem stated as a biconditional sentence "P, if and only if Q" is proved by first proving $P \Rightarrow Q$, and then $Q \Rightarrow P$.
Direct Proof

The most straightforward type of proof of "$P \Rightarrow Q$" is the direct proof; namely, we assume the hypothesis P and use the axioms, computations, or other theorems and the rules of logic to infer Q.

Direct Proof of $P \Rightarrow Q$
Proof. Assume P.
$\vdots$
Therefore Q. Thus $P \Rightarrow Q$. $\square^1$
We illustrate the method of direct proof and the use of the rules of inference in justifying the validity of an argument with the following examples.
A.3.1 EXAMPLES

(a) In Section 2.6 of the text the following theorem about Cauchy sequences is proved.

Theorem 2.6.4. Every Cauchy sequence of real numbers converges.

1 In the text the symbol $\square$ is used to mark the end of a proof. Some authors prefer to use QED, which is an abbreviation of the Latin "quod erat demonstrandum," meaning "which was to be proved."

Let P and Q denote the following statements respectively.

P: "The sequence $\{p_n\}$ is a Cauchy sequence of real numbers."
Q: "The sequence $\{p_n\}$ converges."

The theorem to be proved is "If P, then Q"; that is, "If $\{p_n\}$ is a Cauchy sequence of real numbers, then the sequence $\{p_n\}$ converges." Within Section 2.6 and in previous sections the following related theorems are proved.
Th1 "Every Cauchy sequence is bounded."
Th2 "Every bounded sequence of real numbers has a convergent subsequence."
Th3 "If $\{p_n\}$ is a Cauchy sequence of real numbers that has a convergent subsequence, then the sequence $\{p_n\}$ converges."

Let R and S denote the statements "The sequence $\{p_n\}$ is bounded" and "The sequence $\{p_n\}$ has a convergent subsequence" respectively. Using P, Q, R, and S, theorems Th1, Th2, and Th3 can be written symbolically as

Th1   $P \Rightarrow R$,
Th2   $R \Rightarrow S$,
Th3   $(P \wedge S) \Rightarrow Q$.

From Th1 and Th2, by the transitive rule we can infer $P \Rightarrow S$. Thus from our assumption P, by modus ponens we can infer S. Hence by the rule of conjunctive inference we can infer $P \wedge S$. But now by Th3 and the modus ponens rule we have
$$\{(P \wedge S) \wedge [(P \wedge S) \Rightarrow Q]\} \Rightarrow Q.$$

It is important to again emphasize that the fact that $(P \wedge S) \Rightarrow Q$ is true does not allow us to infer anything about Q. It is also required that $P \wedge S$ must be true. In symbolic form, the above proof can be written as follows:

1. $P$   (hypothesis)
2. $P \Rightarrow R$   (Th1)
3. $R \Rightarrow S$   (Th2)
4. $P \Rightarrow S$   (transitive rule)
5. $S$   (modus ponens)
6. $P \wedge S$   (conjunctive inference)
7. $(P \wedge S) \Rightarrow Q$   (Th3)
$\therefore\ Q$   (modus ponens)
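Because every line of such a symbolic proof is either a given statement or the result of a named rule, the proof can be checked mechanically. The following Python sketch illustrates the idea for this one proof. The encoding of formulas as nested tuples, the function names `follows` and `check_proof`, and the restriction to three rules are all our own simplifying assumptions, not part of the text.

```python
# A propositional variable is a string such as "P";
# ("imp", A, B) stands for A => B and ("and", A, B) for A ∧ B.
def imp(a, b):
    return ("imp", a, b)

def conj(a, b):
    return ("and", a, b)

def is_imp(f):
    return isinstance(f, tuple) and f[0] == "imp"

def follows(rule, formula, cited):
    """Check that `formula` follows from the formulas in `cited` by `rule`."""
    if rule == "modus ponens":      # from A and A => B infer B
        return any(imp(a, formula) in cited for a in cited)
    if rule == "transitive":        # from A => B and B => C infer A => C
        return is_imp(formula) and any(
            is_imp(x) and x[1] == formula[1] and imp(x[2], formula[2]) in cited
            for x in cited)
    if rule == "conjunction":       # from A and B infer A ∧ B
        return (isinstance(formula, tuple) and formula[0] == "and"
                and formula[1] in cited and formula[2] in cited)
    return False

def check_proof(givens, steps):
    """Each step is (formula, rule, cited_lines), with lines numbered from 1.

    A step is accepted if it is a given (hypothesis or quoted theorem) or
    follows from strictly earlier lines by one of the rules above.
    """
    for number, (formula, rule, cites) in enumerate(steps, start=1):
        if rule == "given":
            ok = formula in givens
        else:
            ok = (all(1 <= c < number for c in cites)
                  and follows(rule, formula, [steps[c - 1][0] for c in cites]))
        if not ok:
            return False
    return True

th1, th2, th3 = imp("P", "R"), imp("R", "S"), imp(conj("P", "S"), "Q")
givens = ["P", th1, th2, th3]
proof = [
    ("P", "given", []),                       # 1. hypothesis
    (th1, "given", []),                       # 2. Th1
    (th2, "given", []),                       # 3. Th2
    (imp("P", "S"), "transitive", [2, 3]),    # 4.
    ("S", "modus ponens", [1, 4]),            # 5.
    (conj("P", "S"), "conjunction", [1, 5]),  # 6.
    (th3, "given", []),                       # 7. Th3
    ("Q", "modus ponens", [6, 7]),            # 8. conclusion
]
# Trying to detach Q from Th3 with only P in hand is rejected.
bad_proof = [("P", "given", []), (th3, "given", []), ("Q", "modus ponens", [1, 2])]

print(check_proof(givens, proof))      # True
print(check_proof(givens, bad_proof))  # False
```

The rejected second proof mirrors the remark above: knowing that $(P \wedge S) \Rightarrow Q$ is true is not enough to infer Q unless $P \wedge S$ has also been established.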
The above provides a very methodical argument illustrating the validity of the implication $P \Rightarrow Q$. For a short proof this works very nicely. However, there is not a single proof in this text that is written in such detail using symbolic logic to proceed from P to Q. Mathematical proofs should be written in complete sentences, including justifications. The truth of any statement in the proof must follow from the initial hypothesis, the axioms, or previously proved theorems. A typical proof of the theorem, using the facts introduced above, would proceed as follows. The comments in parentheses are not part of the proof; they are included as explanations of the statements.

Proof. Let $\{p_n\}$ be a Cauchy sequence of real numbers. (This asserts the truth of the hypothesis P.) By Theorem 2.6.2 the sequence $\{p_n\}$ is bounded. (This asserts the implication $P \Rightarrow R$.) Thus by Corollary 2.4.1 the sequence $\{p_n\}$ has a convergent subsequence. (This is the implication $R \Rightarrow S$, which by the transitive rule gives $P \Rightarrow S$. In the next step we invoke Theorem 2.6.3; namely, that $P \wedge S \Rightarrow Q$.) The result now follows by Theorem 2.6.3. $\square$
A better and more careful way, especially for the novice, to express the last sentence would be as follows: "Since the Cauchy sequence $\{p_n\}$ has a convergent subsequence, by Theorem 2.6.3 the sequence $\{p_n\}$ converges."

(b) For our second example we prove the following statement. "If m is an even integer, then $m^2$ is divisible by 4." Let P be the statement "The integer m is an even integer," and Q the statement "The integer $m^2$ is divisible by 4." Thus we wish to prove that $P \Rightarrow Q$. Note: If m and n are integers, we say that m divides n, or n is divisible by m, if $n = km$ for some integer k.

Proof. (Assume P.) Suppose m is an even integer. Then $m = 2k$ for some integer k. (Here we use the definition of an even integer.) Then $m^2 = 4k^2$. Thus $m^2$ is divisible by 4. (The conclusion Q.) $\square$
Proof by Contraposition

Since the implication $P \Rightarrow Q$ is equivalent to its contrapositive $\neg Q \Rightarrow \neg P$, we can prove the implication $P \Rightarrow Q$ by assuming $\neg Q$ and deriving $\neg P$. Such a proof is called a contrapositive proof or proof by contraposition.

Contrapositive Proof of $P \Rightarrow Q$
Proof. Assume $\neg Q$.
$\vdots$
Conclude $\neg P$ by means of a direct proof. Thus $\neg Q \Rightarrow \neg P$. Therefore $P \Rightarrow Q$. $\square$
We illustrate the method of proof by contraposition with the following elementary example.
A.3.2 EXAMPLE Let n be an integer. If $n^2$ is even, then n is even.

Proof. Suppose n is not even. Then n is odd. (Here, and below, we use the fact that an integer is odd if and only if it can be expressed as $2k + 1$ for some integer k.) Thus $n = 2k + 1$ for some integer k. But then

$$n^2 = (2k + 1)^2 = 4k^2 + 4k + 1 = 2(2k^2 + 2k) + 1.$$

(In the above we have used the rules concerning algebraic operations on the integers, i.e., proved theorems.) Since $m = 2k^2 + 2k$ is an integer, we have $n^2 = 2m + 1$ for some integer m. Thus $n^2$ is odd; that is, the hypothesis P, "$n^2$ is even," is false ($\neg P$). $\square$
Indirect Proof or Proof by Contradiction

Our next method of proof is the indirect proof or proof by contradiction. A proof by contradiction makes use of the tautology

$$P \Leftrightarrow [\neg P \Rightarrow (R \wedge \neg R)].$$

The statement $R \wedge \neg R$ is a contradiction. It is worthy of mention that R does not appear anywhere on the left side of the tautology. Thus any proposition R that will do the job will suffice. As opposed to our previous two methods of proof, proof by contradiction can be applied to any statement "P." Direct proofs and contrapositive proofs only apply to implications "$P \Rightarrow Q$."

Proof of P by Contradiction
Proof. Assume $\neg P$.
$\vdots$
Therefore R.
$\vdots$
Therefore $\neg R$. Thus P is true. $\square$

A.3.3 EXAMPLE To illustrate the method of proof by contradiction we consider the following classical proof that $\sqrt{2}$ is irrational. This proof is traditionally attributed to the school of Pythagoras, and is often cited as the first known proof using contradiction. The statement P to be proved is as follows:

P: "$\sqrt{2}$ is irrational."

Proof. (We assume $\neg P$; that is, $\sqrt{2}$ is not irrational, i.e., rational.) Suppose that $\sqrt{2}$ is rational. Then $\sqrt{2} = m/n$, where m and n are integers, with m and n not both even. (The sentence "m and n not both even" will be our statement R.) But then $m^2 = 2n^2$. Therefore $m^2$ is even. Thus by Example A.3.2 the integer m is even. Since m is even, by Example A.3.1(b) the integer $m^2$ is divisible by 4. Since $m^2 = 2n^2$, $2n^2$ is divisible by 4. Thus $n^2$ is even, and by Example A.3.2 the integer n is also even. Thus m and n are both even. (This is our statement $\neg R$. The negation of "not both are even" is "both are even.") This is a contradiction. Thus $\sqrt{2}$ is irrational. $\square$
The method of proof by contradiction can also be used to prove the implication $P \Rightarrow Q$. Using the law of contradiction we have that

$$(P \Rightarrow Q) \Leftrightarrow [\neg(P \Rightarrow Q) \Rightarrow (R \wedge \neg R)].$$

Thus to prove "$P \Rightarrow Q$" by contradiction we must prove that $\neg(P \Rightarrow Q)$ implies a contradiction $R \wedge \neg R$. Since $(P \Rightarrow Q) \Leftrightarrow (\neg P \vee Q)$, by De Morgan's law and double negation

$$\neg(P \Rightarrow Q) \Leftrightarrow (P \wedge \neg Q).$$

Thus in the proof of $P \Rightarrow Q$ by contradiction we must show that the assumption $P \wedge \neg Q$ logically implies a contradiction $R \wedge \neg R$, for some appropriate statement R.

Proof of $P \Rightarrow Q$ by Contradiction
Proof. Suppose $P \wedge \neg Q$.
$\vdots$
Therefore R.
$\vdots$
Therefore $\neg R$. Thus $P \Rightarrow Q$. $\square$
A.3.4 EXAMPLES

(a) To illustrate the method of proof by contradiction we will prove the following theorem about the integers.

Theorem. If a, b, c are integers satisfying $a^2 + b^2 = c^2$, then a or b is an even integer.

Let P denote the statement "a, b, c are integers satisfying $a^2 + b^2 = c^2$" and Q denote the statement "a or b is an even integer." In the proof by contradiction we assume $P \wedge \neg Q$. Since a and b are assumed to be integers, from the assumption $P \wedge \neg Q$ we conclude that a and b are odd integers. Thus we can write $a = 2m + 1$ and $b = 2n + 1$, where m and n are integers. But then

$$c^2 = a^2 + b^2 = (2m + 1)^2 + (2n + 1)^2,$$

which upon simplification gives $c^2 = 4k + 2$ for some integer k (namely $k = m^2 + m + n^2 + n$). Thus $c^2$ is an even integer. But then by Example A.3.2 the integer c itself is even. Hence $c = 2p$ for some integer p. This gives

$$c^2 = 4p^2 = 4k + 2 \quad \text{or} \quad p^2 - k = \tfrac{1}{2}.$$

Since p and k are integers, $p^2 - k$ is an integer. (This is our statement R.) On the other hand, $\tfrac{1}{2}$ is not an integer. (This is our statement $\neg R$.) From our assumption $P \wedge \neg Q$ we derived a contradiction $R \wedge \neg R$.

(b) For our second example we prove the theorem of Example A.3.2 by contradiction. For an integer n, let P and Q denote "$n^2$ is even" and "n is even" respectively. To prove $P \Rightarrow Q$ we assume P and $\neg Q$; that is, $n^2$ is even, and n is odd. Since n is odd, $n = 2k + 1$ for some integer k. Therefore,

$$n^2 = (2k + 1)^2 = 2(2k^2 + 2k) + 1.$$

Therefore $n^2$ is odd ($\neg P$). This is a contradiction ($P \wedge \neg P$).
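A numerical search cannot replace the proof in part (a), but it is a quick sanity check on the statement and on the key step $c^2 = 4k + 2$. The sketch below is illustrative only; the function names and the search bound of 200 are arbitrary choices of ours.

```python
from math import isqrt

def is_square(n):
    """True if the nonnegative integer n is a perfect square."""
    r = isqrt(n)
    return r * r == n

def both_odd_solutions(limit):
    """Pairs of odd a <= b up to `limit` for which a^2 + b^2 is a perfect square."""
    return [(a, b)
            for a in range(1, limit + 1, 2)
            for b in range(a, limit + 1, 2)
            if is_square(a * a + b * b)]

# For odd a and b, a^2 + b^2 leaves remainder 2 on division by 4 (c^2 = 4k + 2).
odd_sum_residues = {(a * a + b * b) % 4
                    for a in range(1, 50, 2) for b in range(1, 50, 2)}
# A perfect square leaves remainder 0 or 1 on division by 4, never 2.
square_residues = {(c * c) % 4 for c in range(100)}

print(both_odd_solutions(200))  # []
print(odd_sum_residues)         # {2}
print(square_residues)          # {0, 1}
```

The empty list says that no counterexample to the theorem exists among odd a and b up to 200; the two residue sets show the same obstruction the proof uses.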
In the previous example we assumed $P \wedge \neg Q$ and showed that $\neg Q$ leads to $\neg P$. Thus P itself plays the role of R in the method of proof by contradiction. This is certainly permissible. However, if one assumes $P \wedge \neg Q$ and derives $\neg P$ without using P, then the proof is in fact a proof by contraposition, rather than a proof by contradiction.

A WORD OF CAUTION! It has been my experience that students have a tendency to overuse, and even misuse, proofs by contradiction. Quite often, in proving $P \Rightarrow Q$, the student will assume $P \wedge \neg Q$ and derive Q, thereby obtaining the contradiction Q and $\neg Q$. Inevitably, if the proof is correct, buried in the details is a direct proof of $P \Rightarrow Q$. Similarly, if the assumption $P \wedge \neg Q$ leads to $\neg P$ then the proof is in all likelihood a proof by contraposition. This was the case in part (b) of the previous example. Another problem with indirect proofs or proofs by contraposition is that they involve the negation of statements. This is not always easy in analysis, especially if the statements themselves are complicated and involve one or more quantifiers. (See Examples A.4.4 in the next section.) Before attempting an indirect proof or even a proof by contraposition, the student is advised to first attempt to find a direct proof. A direct proof often has the advantage of being more constructive. If the statement involves the existence of a certain object, a direct proof may in fact provide a method for constructing the given object.
Proof by Cases

The method of proof by cases is based on the rule of inference by cases. Thus to prove that $(P \vee Q) \Rightarrow R$ it suffices to prove that $P \Rightarrow R$ and that $Q \Rightarrow R$. We illustrate this with the following examples.

A.3.5 EXAMPLES

(a) For our first example we prove the following: "If n is a positive integer, then $n^2 + n + 1$ is odd." If n is a positive integer, then n can be either even or odd. Thus if P and Q represent the statements "n is an even positive integer" and "n is an odd positive integer" respectively, then the statement we wish to prove is $(P \vee Q) \Rightarrow R$, where R represents the statement "$n^2 + n + 1$ is odd." By the rule of inference by cases it suffices to prove $P \Rightarrow R$ and $Q \Rightarrow R$. The details are left to the exercises (Exercise 6).
(b) For our second illustration of the method of proof by cases we consider the following theorem.

Theorem. There are two irrational numbers a and b such that $a^b$ is rational.

Proof. First case: $\sqrt{2}^{\sqrt{2}}$ is rational. If this is the case we take $a = b = \sqrt{2}$. Second case: $\sqrt{2}^{\sqrt{2}}$ is irrational. In this case we take $a = \sqrt{2}^{\sqrt{2}}$ and $b = \sqrt{2}$. So

$$a^b = \left(\sqrt{2}^{\sqrt{2}}\right)^{\sqrt{2}} = \left(\sqrt{2}\right)^{\sqrt{2}\cdot\sqrt{2}} = \left(\sqrt{2}\right)^2 = 2,$$

which is rational. Therefore, there must be irrational numbers a and b so that $a^b$ is rational. $\square$

In the proof, we let R denote the statement of the theorem and P the statement "$\sqrt{2}^{\sqrt{2}}$ is rational." In Case 1 we have $P \Rightarrow R$, and in Case 2 we have $\neg P \Rightarrow R$. Thus by the rule of inference by cases we have $(P \vee \neg P) \Rightarrow R$. But $P \vee \neg P$ is true regardless of P. Hence R follows by the modus ponens rule. This example not only illustrates the method of proof by cases, it also illustrates how $P \vee \neg P$ can always be asserted in a proof.
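The arithmetic in the second case can be illustrated numerically. The short sketch below uses floating-point approximations, so it shows only that $(\sqrt{2}^{\sqrt{2}})^{\sqrt{2}}$ is 2 up to rounding error; it says nothing about which of the two cases actually occurs, and the variable names are ours.

```python
import math

root2 = math.sqrt(2)
a = root2 ** root2   # about 1.63; the proof never needs to decide whether it is rational
b = root2

print(a)
print(a ** b)        # agrees with 2 up to floating-point rounding
print(math.isclose(a ** b, 2.0, rel_tol=1e-12))
```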
Counterexamples

Some conjectures in mathematics are simply statements that are either true or false. For example, "$\sqrt{2}$ is irrational." Other conjectures are general in that they assert something about a whole class of objects. For example,

"Every Cauchy sequence of real numbers converges."
"If n is an even integer, then $n^2$ is even."

The first makes an assertion about all Cauchy sequences, whereas the second makes an assertion about all even integers. Both of these statements are true. If we are presented with a statement about a class of objects, then such a statement is true if and only if it is true for every object in the class. Thus to conclude that such a statement is false it suffices to exhibit one object in the class for which the statement is not true. Such an object is called a counterexample. To illustrate this we consider the following conjecture:

"If n is a positive integer, then $n^2 - n + 5$ is prime."

With a little bit of thought, most students will immediately conclude that this conjecture is false. If we check for n = 1, 2, 3, 4, then $n^2 - n + 5$ becomes 5, 7, 11, and 17, which are indeed all prime. However, when n = 5, $n^2 - n + 5$ is equal to 25, which is certainly not prime. Since we have exhibited an object (n = 5) for which the hypothesis P is true but the conclusion Q is false, the implication $P \Rightarrow Q$ is false.

The previous example was very elementary, and to find a counterexample was not very difficult. This, however, is not always the case. As an example, consider the following conjecture.

"Every continuous function on a closed and bounded interval is differentiable except perhaps at a finite number of points."
Most students who have completed a basic calculus sequence might be inclined to believe that this conjecture is true. Certainly, on the basis of most examples encountered in calculus such a conjecture seems reasonable. In fact, many mathematicians through the mid-nineteenth century accepted this, or a variation of it, as true. It was not until 1874, when Weierstrass constructed an example of a continuous function that was nowhere differentiable (see Section 8.5), that the above conjecture was proved false.
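For a conjecture about the positive integers, such as the one about $n^2 - n + 5$ above, the hunt for a counterexample can be automated. The following sketch is a minimal illustration; `is_prime` is a simple trial-division test and `first_counterexample` is a helper name of our own.

```python
def is_prime(n):
    """Trial-division primality test, adequate for the small values used here."""
    if n < 2:
        return False
    d = 2
    while d * d <= n:
        if n % d == 0:
            return False
        d += 1
    return True

def first_counterexample(claim, candidates):
    """Return the first candidate for which `claim` is false, or None if none is found."""
    for n in candidates:
        if not claim(n):
            return n
    return None

conjecture = lambda n: is_prime(n * n - n + 5)

print([n * n - n + 5 for n in range(1, 6)])             # [5, 7, 11, 17, 25]
print(first_counterexample(conjecture, range(1, 100)))  # 5
```

A result of None would only mean that no counterexample lies in the range searched; it would not prove the conjecture.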
Helpful Hints

The most common complaint heard from students when asked to prove a theorem is "I don't know where to start." Unfortunately, there are no easy rules that can be used to tell someone how to prove a theorem. Many of the problems and theorems in the exercises follow from the definitions or from previous theorems. Some, however, require insight and creativity, and for these the students must devise their own arguments. Some helpful hints in constructing proofs follow.
(1) Make a list of all hypotheses and of what you want to prove. Do not ignore any of the hypotheses. As a general rule they are all required. If you have not used all the hypotheses then most likely your proof is incorrect.

(2) Refresh your memory with the pertinent definitions. If necessary, write them out. This will help you to memorize them and also to understand them. It is very important that you know and understand all the definitions. They are the foundations upon which the theory is built.

(3) Search for theorems that have similar hypotheses or similar conclusions. Proved theorems are not just results; they are also tools that enable you to develop the theory further. Suppose you are given P and R as hypotheses and are asked to prove Q. If you have a theorem that $P \Rightarrow S$ and you can prove that $(R \wedge S) \Rightarrow Q$, then you have your desired proof. An alternative approach is to work backward; that is, to find a theorem that has the same conclusion Q with given hypothesis S, and see if you can prove S from the given hypotheses P and R.

(4) Learn the statements and the proofs of theorems. When I teach real analysis, I always require students to memorize the statements of all the theorems and the proofs of selected theorems. Contrary to student beliefs, this is not done to torture them. The statements of the theorems are the tools used in proving other theorems; the proofs provide useful techniques that may be used elsewhere. They also provide a good model of how to write a correct proof.

Some common errors committed by students include the following.

(1) Using theorems for which not all the hypotheses are satisfied.

(2) Making extra assumptions beyond those given in the statement of the problem or theorem.

(3) When asked to prove something for a whole class of objects, for example all continuous functions f, proving the theorem only for a particular function such as $f(x) = x$. Even though this is incorrect, by attempting to prove the theorem for a special case, the student may in fact gain insight on how to prove the result for the general case.
EXERCISES A.3

1. As in Example A.3.1(a), write the proof of Example A.3.1(b) in symbolic form.
2. Same as the previous exercise for Example A.3.2.
3. Construct proofs of each of the following statements about the positive integers. Suppose k, m, and n are positive integers.
a. If m and n are odd, then mn is odd.
b. If m is odd, then $m^2$ is odd.
c. If $m^2$ is odd, then m is odd.
d. If k divides m and m divides n, then k divides n.
e. If n is odd, then $n^2 + 1$ is even.
4. Consider the following statement. "If n is a positive integer, then ... > ..." Prove the statement by providing
a. a direct proof, b. a proof by contradiction.
5. Prove that ... is irrational by contradiction.
6. Complete the details of the proof in Example A.3.5.
7. Prove that the equation $a^2 = 4b + 3$ has no integer solutions.
8. Provide a counterexample to each of the following statements.
a. If n is a positive integer, then $n^2 - n + 41$ is prime.
b. If n is a positive integer, then $n! < 2^n$.
c. If n is a positive integer, then $n^{n+1} < (n + 1)^n$.
d. Every continuous function is differentiable.
A.4 Use of Quantifiers$^2$

We have already touched on the notion of quantifiers while discussing counterexamples in the previous section. There we discussed the difference between sentences that are simply statements that are either true or false, and sentences that make assertions about a collection of objects. In this section we make the latter more precise.

2 Since this section requires some basic knowledge of the terminology of sets it is best postponed until Section 1.1 has been read.

Quantified Sentences

The sentence "$x^2 = 4$" is not a proposition as it is neither true nor false. If we replace x by specific values then the statement "$x^2 = 4$" becomes a proposition. For example, the sentence is true for x = 2 and false for x = 3. Likewise, the sentence "x is a rational number" is neither true nor false until x is replaced by a specific quantity. In the sentences "$x^2 = 4$" and "x is a rational number," the "x" is called a variable and the sentences themselves are called formulas or open sentences in the variable x. Specifically, a formula (in logic) is a statement containing one or more variables which becomes a proposition when the variables are replaced by particular objects.

We will use the notation P(x) to denote that P is a formula in the variable x. Likewise, a formula in the variables $x_1, \ldots, x_k$ will be denoted by $P(x_1, \ldots, x_k)$. For example, "$x_1 = x_2 + x_3$" is a formula in the variables $x_1, x_2, x_3$. Before the truth of a formula P(x) can be determined we must specify what objects are available for discussion. This is called the universe U for P(x). For example, for the formula $x^2 = 4$ an appropriate choice for the universe U may be either the set of positive integers $\mathbb{N}$, the set of all integers $\mathbb{Z}$, or even the set of real numbers $\mathbb{R}$. It is not enough, however, to just specify the universe. For example, in the formula $x \cdot y = x$, a meaning must also be given to "$\cdot$" and "$=$". If we are discussing $2 \times 2$ matrices, then in addition to specifying the universe U as the set of $2 \times 2$ matrices, we must also define matrix multiplication and equality of matrices. For logical considerations, we must also require that the universe be nonempty.

For a formula P(x) with specified universe U, the truth set of P(x) is the set of all $x \in U$ such that P(x) is a true proposition. In the notation of sets, the truth set is simply

$$\{x \in U : P(x)\}.$$

This is read as "the set of x in U such that P(x)." For example, $\{x \in \mathbb{N} : x^2 = 4\} = \{2\}$, whereas $\{x \in \mathbb{Z} : x^2 = 4\} = \{-2, 2\}$.

Consider the two formulas P(x) and Q(x) given by "$x^2 = 4$" and "$(x + 1)^2 = x^2 + 2x + 1$" respectively. If we take as our universe the set of real numbers $\mathbb{R}$, the truth set for P(x) is nonempty. This is expressed by saying that there exists an $x \in \mathbb{R}$ such that P(x) is true. On the other hand, the truth set for Q(x) is all of $\mathbb{R}$, and this is expressed by saying that Q(x) is true for all $x \in \mathbb{R}$. We make this precise with the following definition.
A.4.1 DEFINITION Suppose P(x) is a formula in the variable x with universe U.

(i) The sentence $(\forall x)P(x)$ is read "for all x, P(x)" and is true precisely when $\{x \in U : P(x)\} = U$. The symbol $\forall$ is called the universal quantifier.

(ii) The sentence $(\exists x)P(x)$ is read "there exists x such that P(x)" and is true precisely when $\{x \in U : P(x)\} \neq \emptyset$. The symbol $\exists$ is called the existential quantifier.$^3$

The expressions $(\forall x)P(x)$ and $(\exists x)P(x)$ are called quantified sentences. The phrase "for every" is synonymous with "for all." If we wish to emphasize the universe U we write $(\forall x \in U)P(x)$ and $(\exists x \in U)P(x)$. These are read as "for all x in U, P(x)" and "there exists x in U, such that P(x)" respectively. For example, with the formulas $(x + 1)^2 = x^2 + 2x + 1$ and $x^2 = 4$ we have

$$(\forall x \in \mathbb{R})[(x + 1)^2 = x^2 + 2x + 1],$$

$$(\exists x \in \mathbb{R})(x^2 = 4).$$

Most mathematicians avoid using $\forall$ and $\exists$ in publications. In fact, with the exception of this section, they are used nowhere else in the text. Expressing mathematical statements in quantified form is first of all not always easy, and second makes for awkward reading. Quantifiers, however, are crucial when it comes to negating complicated statements.
A.4.2
EXAMPLES
(a) As our first example we express the following theorem of the text as a quantified sentence.
Theorem 2.6.4. Every Cauchy sequence of real numbers converges.
Clearly, the quantifier to be used is $\forall$. For our universe we take $U$ to be the set of all sequences of real numbers, and let $C(x)$ and $Q(x)$ denote the open sentences "$x$ is a Cauchy sequence" and "$x$ converges," respectively. Consider first the sentence
3 The symbol $\exists!$ is often used to denote the existence of a unique $x$ for which $P(x)$ is true.
A.4
Use of Quantifiers
$(\forall x)[C(x) \wedge Q(x)]$. This sentence would be translated as follows: "For all sequences $x$, $x$ is a Cauchy sequence and $x$ converges." This sentence, however, is the same as "every sequence is a convergent Cauchy sequence," and this is clearly not the intent of the original statement.

In other texts, Theorem 2.6.4 is sometimes expressed as "if $\{a_n\}$ is a Cauchy sequence of real numbers, then $\{a_n\}$ converges." This statement can also be rewritten as "For all sequences $\{a_n\}$ of real numbers, if $\{a_n\}$ is a Cauchy sequence, then $\{a_n\}$ converges." This version, although somewhat awkward grammatically, is now easily written in symbolic form as $(\forall x \in U)[C(x) \Rightarrow Q(x)]$. In general, a sentence of the form "All $P(x)$ are $Q(x)$" is expressed in symbolic form as $(\forall x)[P(x) \Rightarrow Q(x)]$.

(b) As another example consider the statement "Some bounded sequences converge." As in (a) let $Q(x)$ be the sentence "$x$ converges" and $B(x)$ be the sentence "$x$ is a bounded sequence." Since "some" is taken to mean at least one, the proper quantifier is $\exists$. However, should the statement be expressed symbolically as $(\exists x \in U)[B(x) \Rightarrow Q(x)]$ or as $(\exists x \in U)[B(x) \wedge Q(x)]$? The first would be interpreted as "There exists a sequence $x$ such that if $x$ is bounded, then $x$ converges." This clearly is not the intent of the sentence. It does not ensure the existence of a bounded sequence that converges. The second, $(\exists x)[B(x) \wedge Q(x)]$, reads "There exists a sequence $x$ such that $x$ is bounded and $x$ converges," and this is the correct interpretation. In general, the statement "Some $P(x)$ are $Q(x)$" is expressed symbolically as $(\exists x)[P(x) \wedge Q(x)]$.

(c) Most expressions in mathematics require the use of many quantifiers. To illustrate this, consider the definition of convergence of a sequence of real numbers as given in Section 2.1 of the text.
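Definition 2.1.7. A sequence $\{p_n\}$ in $\mathbb{R}$ is said to converge if there exists a point $p \in \mathbb{R}$ such that for every $\varepsilon > 0$, there exists a positive integer $n_0$ such that $|p_n - p| < \varepsilon$ for all $n \ge n_0$.

This definition uses the quantifiers $\forall$ and $\exists$, not only once, but several times. We have "$\exists p \in \mathbb{R}$," "$\exists n_0 \in \mathbb{N}$," "$(\forall \varepsilon)(\varepsilon > 0)$," and "$(\forall n)(n \ge n_0)$." To write this statement in symbolic form, we begin with $(\exists p \in \mathbb{R})[\cdots]$. Consider the sentence "for every $\varepsilon > 0$, there exists ...." This phrase, properly stated, should be expressed as "for all $\varepsilon \in \mathbb{R}$, if $\varepsilon > 0$, then ...," which in symbolic form would be written as $(\forall \varepsilon)[(\varepsilon > 0) \Rightarrow (\cdots)]$.

This leaves us with the final phrase "there exists a positive integer $n_0$ such that $|p_n - p| < \varepsilon$ for all $n \ge n_0$." This would be written as $(\exists n_0 \in \mathbb{N})$("$|p_n - p| < \varepsilon$ for all $n \ge n_0$"). The statement "$|p_n - p| < \varepsilon$ for all $n \ge n_0$," again if properly stated, would read as "for all $n \in \mathbb{N}$, if $n \ge n_0$, then $|p_n - p| < \varepsilon$," or in symbolic form, $(\forall n \in \mathbb{N})((n \ge n_0) \Rightarrow |p_n - p| < \varepsilon)$. Combining all of the above finally gives "The sequence $\{p_n\}$ in $\mathbb{R}$ is said to converge" if
$$(\exists p \in \mathbb{R})[(\forall \varepsilon)\{(\varepsilon > 0) \Rightarrow [(\exists n_0 \in \mathbb{N})\{(\forall n \in \mathbb{N})((n \ge n_0) \Rightarrow |p_n - p| < \varepsilon)\}]\}].$$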
4 In mathematics the term "some" is taken to mean at least one. This differs from the colloquial interpretation where "some" is occasionally taken to mean two or more.
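This can also be expressed as
$$(\exists p)(\forall \varepsilon)[\varepsilon > 0 \Rightarrow (\exists n_0)(\forall n)(n \ge n_0 \Rightarrow |p_n - p| < \varepsilon)],$$
where the universe for each of the quantifiers is understood.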
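The two translation rules from parts (a) and (b), "All $P(x)$ are $Q(x)$" as $(\forall x)[P(x) \Rightarrow Q(x)]$ and "Some $P(x)$ are $Q(x)$" as $(\exists x)[P(x) \wedge Q(x)]$, can be compared with the tempting wrong translations on a small finite universe. The sketch below is only an illustration; the universe and the predicates `mult4`, `even`, and `big` are hypothetical choices, not taken from the text.

```python
U = range(1, 11)   # a small finite universe, chosen only for illustration

def implies(p, q):
    """Material implication: false only when p is true and q is false."""
    return (not p) or q

mult4 = lambda x: x % 4 == 0   # "x is a multiple of 4"
even = lambda x: x % 2 == 0    # "x is even"
big = lambda x: x > 100        # "x exceeds 100" (no element of U qualifies)

# "All multiples of 4 are even": the implication form is the right one.
all_correct = all(implies(mult4(x), even(x)) for x in U)
# The conjunction form instead claims every x is an even multiple of 4.
all_wrong = all(mult4(x) and even(x) for x in U)

# "Some numbers exceeding 100 are even": the conjunction form is the right one.
some_correct = any(big(x) and even(x) for x in U)
# The implication form is satisfied vacuously by any x that is not big.
some_wrong = any(implies(big(x), even(x)) for x in U)

print(all_correct, all_wrong)     # True False
print(some_correct, some_wrong)   # False True
```

The last line shows the point made in (b): the implication form can be true even though the universe contains no object with the first property at all.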
Negation of Quantified Sentences

The next thing we want to consider is how to negate a quantified statement. First, however, we need to define what it means for two quantified sentences to be equivalent. Recall that two propositional formulas $P$ and $Q$ are equivalent if and only if $P \Leftrightarrow Q$ is a tautology. Suppose $P(x)$ and $Q(x)$ are two formulas with nonempty universe $U$. Then $P(x)$ and $Q(x)$ are equivalent in $U$ if and only if $P(x)$ and $Q(x)$ have the same truth value for all $x \in U$; that is, $(\forall x \in U)(P(x) \Leftrightarrow Q(x))$. Two quantified sentences $P(x)$ and $Q(x)$ are equivalent if and only if they are equivalent in every universe. For example, since $P \Rightarrow Q$ is equivalent to $\neg P \vee Q$, if $P(x)$ and $Q(x)$ are formulas or open sentences in $x$ with universe $U$, then
$$(\forall x \in U)[(P(x) \Rightarrow Q(x)) \Leftrightarrow (\neg P(x) \vee Q(x))].$$
Thus the quantified sentences $(\forall x)(P(x) \Rightarrow Q(x))$ and $(\forall x)(\neg P(x) \vee Q(x))$ are equivalent.
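The propositional equivalence used here, that "$P$ implies $Q$" has the same truth value as "not $P$, or $Q$" in every case, can be confirmed by running through the four rows of the truth table. The sketch below is illustrative only; `IMPLIES_TABLE` and `disjunctive_form` are hypothetical names.

```python
from itertools import product

# Implication given directly by its truth table: the only row in which
# "P implies Q" is false is P true, Q false.
IMPLIES_TABLE = {
    (False, False): True,
    (False, True): True,
    (True, False): False,
    (True, True): True,
}

def disjunctive_form(p, q):
    """The claimed equivalent form: (not P) or Q."""
    return (not p) or q

rows = list(product([False, True], repeat=2))
equivalent = all(IMPLIES_TABLE[(p, q)] == disjunctive_form(p, q) for p, q in rows)
print(equivalent)   # True
```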
Consider now the quantified sentences $(\forall x)P(x)$ and $(\exists x)P(x)$. The sentence $\neg(\forall x)P(x)$ is true in a given universe $U$ if and only if $(\forall x)P(x)$ is false; that is, if and only if the truth set $\{x \in U : P(x)\}$ is not equal to $U$. But this is true if and only if $\{x \in U : \neg P(x)\}$ is nonempty,5 i.e., $(\exists x)\neg P(x)$. Since this argument holds for any universe $U$, the quantified statement $\neg(\forall x)P(x)$ is equivalent to $(\exists x)\neg P(x)$. A similar argument also proves that $\neg(\exists x)P(x)$ is equivalent to $(\forall x)\neg P(x)$. To emphasize these we state them as a theorem.
A.4.3 THEOREM Suppose $P(x)$ is a formula in the variable $x$. Then (a) $\neg(\forall x)P(x)$ is equivalent to $(\exists x)\neg P(x)$. (b) $\neg(\exists x)P(x)$ is equivalent to $(\forall x)\neg P(x)$.

In the following examples we find the negation of each of the quantified sentences of Examples A.4.2.
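Theorem A.4.3 can be sanity-checked on small finite universes by running through every possible formula, that is, every assignment of truth values to the elements. The sketch below is a finite illustration under that assumption and not a proof; `check_quantifier_negation` is a hypothetical helper name.

```python
from itertools import product

def check_quantifier_negation(size):
    """Check both parts of the theorem for every formula P on {0, ..., size-1}."""
    universe = range(size)
    for values in product([False, True], repeat=size):
        P = lambda x, values=values: values[x]   # one formula per truth assignment
        # (a) not (for all x) P(x)   versus   (there exists x) not P(x)
        part_a = (not all(P(x) for x in universe)) == any(not P(x) for x in universe)
        # (b) not (there exists x) P(x)   versus   (for all x) not P(x)
        part_b = (not any(P(x) for x in universe)) == all(not P(x) for x in universe)
        if not (part_a and part_b):
            return False
    return True

results = {n: check_quantifier_negation(n) for n in range(1, 5)}
print(results)   # {1: True, 2: True, 3: True, 4: True}
```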
A.4.4
EXAMPLES
(a) As in Example A.4.2(a) we consider the statement "Every Cauchy sequence converges." Using the same notation, in symbolic form this sentence was expressed as $(\forall x)[C(x) \Rightarrow Q(x)]$. The negation of the statement becomes $(\exists x)\neg[C(x) \Rightarrow Q(x)]$. Now $\neg[C(x) \Rightarrow Q(x)]$ is equivalent to $\neg(\neg C(x) \vee Q(x))$, and by De Morgan's law the latter is equivalent to $C(x) \wedge \neg Q(x)$. Thus the negation of $(\forall x \in U)[C(x) \Rightarrow Q(x)]$ is $(\exists x \in U)[C(x) \wedge \neg Q(x)]$. This last statement would be read as "There exists a Cauchy
5 Here it is required that the universe U itself is not empty.
sequence in $\mathbb{R}$ that diverges." (Note: This statement, however, is false in the real number system $\mathbb{R}$, but true in the rational number system $\mathbb{Q}$.)
(b) For our next example consider the negation of the statement "Some bounded sequences converge." In symbolic form this was expressed as $(\exists x)[B(x) \wedge Q(x)]$ (see Example A.4.2(b)). The negation of this statement becomes $(\forall x)\neg[B(x) \wedge Q(x)]$, which by De Morgan's law is equivalent to $(\forall x)[\neg B(x) \vee \neg Q(x)]$. But $\neg B(x) \vee \neg Q(x)$ is equivalent to $B(x) \Rightarrow \neg Q(x)$. Hence the negation of $(\exists x)[B(x) \wedge Q(x)]$ is equivalent to $(\forall x)[B(x) \Rightarrow \neg Q(x)]$. This last statement would read as "For all sequences $\{a_n\}$ of real numbers, if $\{a_n\}$ is bounded, then $\{a_n\}$ diverges," or more simply as "All bounded sequences in $\mathbb{R}$ diverge."

(c) As our final example we undertake the negation of the definition of convergence of a sequence of real numbers. In symbolic form, a sequence $\{p_n\}$ in $\mathbb{R}$ is said to converge if
$$(\exists p \in \mathbb{R})[(\forall \varepsilon)\{(\varepsilon > 0) \Rightarrow [(\exists n_0 \in \mathbb{N})\{(\forall n \in \mathbb{N})((n \ge n_0) \Rightarrow |p_n - p| < \varepsilon)\}]\}].$$
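We proceed to negate this sentence step by step. First, the negation of $(\exists p \in \mathbb{R})[\cdots]$ becomes $(\forall p \in \mathbb{R})\neg[\cdots]$. Now inside the bracket $[\cdots]$ we have $(\forall \varepsilon)\{P(\varepsilon) \Rightarrow Q(\varepsilon, n_0, n)\}$, where $P(\varepsilon)$ denotes "$\varepsilon > 0$" and $Q(\varepsilon, n_0, n)$ denotes "$(\exists n_0 \in \mathbb{N})\{(\forall n \in \mathbb{N})((n \ge n_0) \Rightarrow |p_n - p| < \varepsilon)\}$." But $\neg(\forall \varepsilon)\{P(\varepsilon) \Rightarrow Q(\varepsilon, n_0, n)\}$ is equivalent to $(\exists \varepsilon)\{P(\varepsilon) \wedge \neg Q(\varepsilon, n_0, n)\}$. It is left as an exercise (Exercise 7) to show that the negation of $Q(\varepsilon, n_0, n)$ becomes
$$(\forall n_0 \in \mathbb{N})\{(\exists n \in \mathbb{N})[(n \ge n_0) \wedge (|p_n - p| \ge \varepsilon)]\}.$$
Combining all the above gives us the following. "A sequence $\{p_n\}$ in $\mathbb{R}$ is said to diverge" if
$$(\forall p \in \mathbb{R})(\exists \varepsilon)\{(\varepsilon > 0) \wedge (\forall n_0 \in \mathbb{N})[(\exists n \in \mathbb{N})(n \ge n_0) \wedge (|p_n - p| \ge \varepsilon)]\},$$
or
$$(\forall p \in \mathbb{R})(\exists \varepsilon > 0)(\forall n_0 \in \mathbb{N})(\exists n \ge n_0)(|p_n - p| \ge \varepsilon).$$
Translating everything into English gives "A sequence $\{p_n\}$ in $\mathbb{R}$ is said to diverge if for all $p \in \mathbb{R}$, there exists $\varepsilon > 0$ such that for all $n_0 \in \mathbb{N}$, there exists $n \ge n_0$ such that $|p_n - p| \ge \varepsilon$."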
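The negated definition can be read as a recipe: for each candidate limit $p$, produce an $\varepsilon > 0$, and then for each $n_0$ produce an $n \ge n_0$ with $|p_n - p| \ge \varepsilon$. The sketch below follows that recipe for the sequence $p_n = (-1)^n$ with $\varepsilon = 1$. The sequence, the choice of $\varepsilon$, the helper `witness`, and the finite grid of candidates are all assumptions made for illustration; the finite check only illustrates the quantifier structure and does not prove divergence.

```python
from fractions import Fraction

def p_seq(n):
    """The illustrative sequence p_n = (-1)^n, in exact arithmetic."""
    return Fraction((-1) ** n)

EPS = Fraction(1)   # the epsilon offered for every candidate limit p

def witness(p, n0):
    """Return some n >= n0 with |p_n - p| >= EPS, or None if n0 and n0 + 1 both fail."""
    for n in (n0, n0 + 1):
        if abs(p_seq(n) - p) >= EPS:
            return n
    return None

# Candidate limits p on a finite grid in [-3, 3], and starting indices n0 = 1..50.
candidates = [Fraction(k, 10) for k in range(-30, 31)]
ok = all(witness(p, n0) is not None for p in candidates for n0 in range(1, 51))
print(ok)   # True
```

Two consecutive terms of this sequence are a distance 2 apart, so no $p$ can be within distance less than 1 of both, which is why trying only $n_0$ and $n_0 + 1$ suffices here.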
EXERCISES A.4
1. Express each of the following sentences in symbolic form. Specify an appropriate universe for each.
a. All men are mortal.
b. Not all mortals are men.
c. Some isosceles triangles are equilateral triangles.
d. Some triangles are isosceles triangles and some are equilateral triangles.
e. Some triangles are isosceles and equilateral triangles.
f. Not all isosceles triangles are equilateral triangles.
g. Between any two distinct real numbers there is a rational number.
2. In the following, let the universe $U$ be the set of all sequences of real numbers, and let $M(x)$, $B(x)$, and $C(x)$ denote the sentences "$x$ is monotone," "$x$ is bounded," and "$x$ converges," respectively. Express each of the following statements in symbolic form using the quantifiers $\exists$ and $\forall$.
a. All convergent sequences are bounded.
b. There exists an unbounded monotone sequence.
c. Some monotone sequences are unbounded.
d. Every bounded monotone sequence converges.
e. Not all bounded sequences converge.
f. Every divergent sequence is unbounded.
3. Determine which of the following quantified sentences are equivalent.
a. $(\forall x)[P(x) \wedge Q(x)]$ and $[(\forall x)P(x) \wedge (\forall x)Q(x)]$
b. $(\forall x)[P(x) \vee Q(x)]$ and $[(\forall x)P(x) \vee (\forall x)Q(x)]$
c. $(\forall x)[P(x) \Rightarrow Q(x)]$ and $(\forall x)P(x) \Rightarrow (\forall x)Q(x)$
d. $(\forall x)(\forall y)P(x, y)$ and $(\forall y)(\forall x)P(x, y)$
e. $(\exists x)[(\exists y)P(x, y)]$ and $(\exists y)[(\exists x)P(x, y)]$
f. $(\exists x)[(\forall y)P(x, y)]$ and $(\forall y)[(\exists x)P(x, y)]$
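4. Find the negation of each of the following quantified sentences.
a. $(\forall x)[P(x) \vee Q(x)]$
b. $(\forall x)[P(x) \vee \neg Q(x)]$
c. $(\exists x)[P(x) \Rightarrow Q(x)]$
d. $(\exists x)[(P(x) \Rightarrow Q(x)) \wedge (R(x) \Rightarrow Q(x))]$
e. $(\forall x)[\{P(x) \wedge (P(x) \Rightarrow Q(x))\} \Rightarrow Q(x)]$
f. $(\forall x)[(\exists y)(P(x, y) \Rightarrow Q(x, y))]$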
5. Find the negation of each of the following quantified sentences.
a. $(\forall x \in \mathbb{R})(2x > x)$
b. $(\forall x \in \mathbb{R})[(x > 0) \Rightarrow (2x > x)]$
c. $(\exists x \in \mathbb{N})(5x + 11 = 3x + 14)$
d. $(\forall x \in \mathbb{R})(\exists y \in \mathbb{R})(x + y = 0)$
e. $(\forall \varepsilon \in \mathbb{R})[\varepsilon > 0 \Rightarrow (\exists n \in \mathbb{N})(1/n < \varepsilon)]$
6. Find the negation of each of the statements in Exercise 2. Express your answer first in symbolic form and then in English.
7. Show that the negation of $(\exists n_0)((\forall n)((n \ge n_0) \Rightarrow |p_n - p| < \varepsilon))$ is $(\forall n_0)(\exists n)[(n \ge n_0) \wedge (|p_n - p| \ge \varepsilon)]$.
8. In the following, $f$ is a real-valued function defined on an open interval $(a, b)$ and $p$ is a point in $(a, b)$. The definition of the limit of the function $f$ at $p$ is as follows: "The function $f$ has a limit at $p$ if there exists a number $L \in \mathbb{R}$ such that given any $\varepsilon > 0$, there exists a $\delta > 0$ for which $|f(x) - L| < \varepsilon$ for all $x \in (a, b)$ with
0 …

Hints and Solutions to Selected Exercises

… $(n + 1)2^n > 2 \cdot 2^n = 2^{n+1}$. Thus by the modified principle of mathematical induction the inequality holds for all $n \in \mathbb{N}$, $n \ge 4$. 4. For $n \in \mathbb{N}$ let $P(n)$ be the statement $f(n) = 3 \cdot 2^n + (-1)^n$. Then $P(n)$ is true for $n = 1, 2$. For $k \ge 3$, assume that $P(j)$ is true for all $j \in \mathbb{N}$, $j < k$. Use the fact that $f(k) = 2f(k - 2) + f(k - 1)$ and the induction hypothesis to show that $P(k)$ is true. Thus by the second principle of mathematical induction, the result holds for all $n \in \mathbb{N}$.

5. (b) $f(n) = n^2$ (d) $f(n) = 0$ if $n$ is even, and $f(n) = (-1)^{(n-1)/2}/n!$ if $n$ is odd. 7. For each $n \in \mathbb{N}$ let $S_n = r + r^2 + \cdots + r^n$. Then $S_n - rS_n = r - r^{n+1}$, from which the result follows. 8. Hint: Let $A = \frac{1}{n}(a_1 + \cdots + a_n)$ and write $a_{n+1} = xA$ for some $x \ge 0$. Use the induction hypothesis to prove that $(a_1 \cdots a_{n+1})^{1/(n+1)} \le x^{1/(n+1)}A$. Now use Bernoulli's inequality to prove that $x^{1/(n+1)} \le (n + x)/(n + 1)$. From this it now follows that
$$x^{1/(n+1)}A \le \frac{n + x}{n + 1}A = \frac{1}{n + 1}(a_1 + \cdots + a_n + a_{n+1}).$$
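The claim in hint 4 can be checked numerically before attempting the induction. The sketch below assumes the base values $f(1)$ and $f(2)$ are the ones given by the closed form (the exercise statement itself is not reproduced here), applies the recurrence $f(k) = 2f(k - 2) + f(k - 1)$, and compares against $3 \cdot 2^n + (-1)^n$. The function names are hypothetical.

```python
def closed_form(n):
    """The formula to be proved: f(n) = 3 * 2^n + (-1)^n."""
    return 3 * 2**n + (-1)**n

def by_recurrence(n_max):
    """Build f(1), ..., f(n_max) from the recurrence f(k) = 2 f(k-2) + f(k-1)."""
    # Base cases P(1) and P(2); assumed here to agree with the closed form.
    f = {1: closed_form(1), 2: closed_form(2)}
    for k in range(3, n_max + 1):
        f[k] = 2 * f[k - 2] + f[k - 1]
    return f

values = by_recurrence(30)
agrees = all(values[n] == closed_form(n) for n in values)
print(values[1], values[2], values[3])   # 5 13 23
print(agrees)                            # True
```

Agreement for the first thirty terms is evidence only; the second principle of mathematical induction is what establishes the formula for all $n$.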
EXERCISES 1A page 26 4. Consider (a  b)2. S. (a) inf A = 0, sup A = I (e) inf C = oo, sup C = 00 (e) inf E = 1, sup E = 3 (g) inf C = 0. sup C = no (i) inf I =  2, sup ! = 2 14. (b) Since A and B are nonempty and bounded above, a = supA and is = sup B both exist in R. Since a = supA we have a s a for all a E A. Similarly b :5 0 for all b E B. Therefore, a + b 5 a + $ for all a E A, b E B. Thus a + /3 is an upper bound for A + B, and thus
y=sup(A+B)5a+J3. To prove the reverse inequality, we first note that since y is an upper bound for A + B, a + b s y forall a E A. b r= B. Let b E B be arbitrary, but fixed. Then a c y  b for all a E A. Thus y  b is an upper bound for A and hence a q. If p > q then n = 1 works. If p s q, consider (q + I )p. 6. (a) Use the fact V2/2 is irrational. (b) Use Theorem 1.5.2 and (a).
EXERCISES 1.6 page 34 2. (a) .0022 = + 51 + i + 1..0202020 (d) .101010
_ +;+ +5 +,+
= 27 + 81 =
_i
00
1
ar=13
1
1 ,] _
3. .0101010
9
EXERCISES 1.7 page 42 2. Let f : 101s 0 be defined by f (n) = 2n  1. 4. (a) g(x) = a + x(b  a) is a onetoone mapping of (0, 1) quo (a, b). 6. (a) Since A  X. there exists a onetoone function It from A onto X. Similarly, there exists a onetoone function g from B onto Y. To prove the result, show that F : A X B X x Y defined by F(a, b) = (h(a), g(b)) is onetoone and onto. 8. (a) U A" = R, f1 A. = {x :  I < x < 1) (c) U A. _ (1, 2), fl A. _ [0, 1 18. (a) Consider the function on (0, 1) that for each n E N, n ? 2, maps to "=_ ,, and is the identity mapping + a,x + ao, consider the height h of the polynomial defined by elsewhere. 19. For a polynomial p(x) = ax" +
$h = n + |a_0| + |a_1| + \cdots + |a_n|$. Prove that there are only a finite number of polynomials with integer coefficients of a given height $h$, and therefore only a finite number of algebraic numbers arising from polynomials of a given height $h$. 22. If $f$ is a function from $A$ to $\mathcal{P}(A)$, show that $f$ is not onto by considering the set $\{x \in A : x \notin f(x)\}$. 23. For $a, b \in [0, 1)$ with decimal expansions $a = .a_1 a_2 \cdots$ and $b = .b_1 b_2 \cdots$, consider the function
$f : [0, 1) \times [0, 1) \to [0, 1)$ defined by $f(a, b) = .a_1 b_1 a_2 b_2 \cdots$.
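The interleaving map in hint 23 can be pictured on finite digit strings. The sketch below works with truncated expansions written as strings of digits, which is an assumption made for illustration; it ignores the question of numbers with two decimal expansions, which the full argument has to address. The names `interleave` and `split` are hypothetical.

```python
def interleave(a_digits, b_digits):
    """Merge digits a1 a2 ... and b1 b2 ... into a1 b1 a2 b2 ..."""
    if len(a_digits) != len(b_digits):
        raise ValueError("digit strings must have the same length")
    return "".join(x + y for x, y in zip(a_digits, b_digits))

def split(c_digits):
    """Recover the two digit strings from an interleaved one."""
    return c_digits[0::2], c_digits[1::2]

c = interleave("1415", "7182")
print(c)          # 17411852
print(split(c))   # ('1415', '7182')
```

Since `split` undoes `interleave`, two different pairs of digit strings can never produce the same interleaved string, which is the finite analogue of the map being one-to-one.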
CHAPTER 2 EXERCISES 2.1 page 52 2. Ix I = Ix  y + y 1 5 Ix  y I + I y 1. Therefore, I x I I y 15 I x  y 1. Interchanging x and y gives I y i Ix 1 5 I y x I= I x y I Now use the definition of II x I I y I I. 6. (a) 3:5,x:5 13/3 (c) 1 0. By Example 1.3.3(b), b" ? I + no. Now use the previous problem. 12. First show that I a2.  a21 5 (I a" I + I a I) I a"  a I. Now use the fact that since {a"} converges, there exists a positive constant M such that I a" 15 M for all n e N. 14. Consider a = 0 and a > 0
separately. For a > 0, V  V = (a"  a)/(V + Va). EXERCISES 2.2 page 59
S. (b) If p > 1, let x" = Vp  1. Apply the inequality of Example 1.3.3(b) to (I + x")". 6. (a) Is 7. (a) Converges to 1. 1
n2 + 1
n2 \2n +
2
3
(c) 1 (e) 2
(c) Since
(1 + 1/n2)2
(2 + 3/n)2'
by use of the limit theorems the sequence converges to 4.
8. Use the fact that I cos x 1 5 1 for all x E R. 10. (a) Suppose lim a"+ ,/a" = L < 1. Choose e > 0 such that L + e < 1. For this a there exists n. E 101 such that a"+ ,/a" < L + e for all n a n,. From this one obtains that for i > no, 0 < a" 1 +
" ; = x.'+,. Thus x2. > x,2,+, > 1. From this it now follows that x" > x"+1 > 1 for all n E N. and that lim x" = 1. (c) Since a > 1, a"+' = a a" > a". Thus {a"} is monotone increasing. Let or = sup{a" : n E N!}. If a < oo, then a = lim a"+ = a lim a" = as > a. This, however, is a contradiction. Thus a" 4 oo. S. Use mathematical induction to show that a" > 1 for all n (= N. From the inequality 2ab s a2 + 62, a, b >_ 0, we have 2a" 5 a.2 + I or a"+, = 2  1/a" s a". Therefore, {a"} is monotone decreasing. Finally, if a = Jim a", then
$$a = \lim_{n \to \infty} a_{n+1} = \lim_{n \to \infty}\left(2 - \frac{1}{a_n}\right) = 2 - \frac{1}{a}.$$
Therefore, $a = 1$. 7. (c) By induction, the sequence $\{a_n\}$ is monotone increasing and bounded above by 3. If $a = \lim a_n$, then $a^2 = \lim a_{n+1}^2 = \lim(2a_n + 3) = 2a + 3$. Thus $a$ is a solution of $a^2 - 2a - 3 = 0$. The two solutions
are  I and 3. Since a must be positive, we have a = 3. (e) lim a" = 2. 9. (a) Use the inequality ab !< i (a2 + b2) to prove that x"., > \ for all n. To show that {x"} is monotone decreasing, consider x",,  x" and simplify. 12. (a) e2 (e) e312 13. To show that {s"} is unbounded, show that s2. > I + "I. Hint: First show it for n = 1. 2. and 3; then use mathematical induction to prove the result for all n E N. 15. Hint: Fork = 2. t < k,k ! l) = k i  ;. 17. (b) n + n(n  1) (d) Hint: Write a = (1 + b) with b > 0. Now use the binomial theorem to show that a"In° ? cn for some positive constant c and all n sufficiently large. 18. (c) The sequence is not monotone increasing: If x" = n + (1)'Vn, then xu,, < x2,,. 23. This problem is somewhat tricky. It is not sufficient to just choose a monotone increasing sequence in the set; one also has to guarantee that the sequence converges to the least upper bound of the set. Let E be a nonempty subset of R that is bounded above. Let aqt denote the set of upper bounds of E. Since E * 0, we can choose an element x, E E. Also, since E is bounded above, OIL * 46. Choose /3, E V. Let a, = 12 (x, + 6, ), and consider the two intervals [x,, a, ] and (a,,19, ). Since x, E E, one or both of these intervals have nonempty intersection with E. If (a,, /3,] fl E * 0. choose x2 E E such that a, < x2 s S,. In this case, set $2 = /3,. If (a,, /3, ] fl E _ 0, choose x2 E E such that x, < x2 a,, and set /32 = a,. In this case, $2 E OU.. Proceeding inductively, construct two monotone sequences {)"} and {/3"} such that (a) {x"} C E with x" 5 x"+, for all n, (b) 0"  x" a  e/2 for all n ? n° and (a" + b") < y + e/2 for all n ? n°. Therefore. b" < y + e/2  a" < y  a + e. From this it now follows that limb" s y  a + e. Since e > 0 was arbitrary, we have limb" 5 y  a; i.e.. lim a" + Fim b" s lira (a" + Q. The other inequality is proved similarly. {s2.,} and {s2i,,+,}.
8. 1. 1. Hint: Consider the subsequences 10. By Theorem 2.5.7 there exists a subsequence {a",} of {a"} such that a", +a. Since {b"}
converges to b, a,,,b,,, + ab. Therefore, ab is a subsequential limit of inequality follows similarly. The fact that b * 0 is crucial.
Thus Tim
z ab. The reverse
EXERCISES 2.6 page 85 2. (a) The sequence {(n + 1)In} converges and thus is Cauchy. (d) Convergent sequence; thus Cauchy 4. (a) Show that stn  s" Z. & For n 3, I a"+,  a" I < Ia,  a"_, I. Therefore, {a"} is contractive. If a = Iim a,,, then
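$0 < a < 1$ and is a solution of $a^2 + 2a - 1 = 0$. 11. (b) Since $(a_{n+1} - a_n) = (b - 1)(a_n - a_{n-1})$, by induction $(a_{n+1} - a_n) = (b - 1)^{n-1}(a_2 - a_1)$. Therefore,
$$a_{n+1} - a_1 = \sum_{k=1}^{n}(a_{k+1} - a_k) = (a_2 - a_1)\sum_{k=0}^{n-1}(b - 1)^k = (a_2 - a_1)\,\frac{1 - (b - 1)^n}{2 - b}.$$
Letting $n \to \infty$ gives $a = a_1 + \frac{1}{2 - b}(a_2 - a_1)$.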
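The partial-sum formula in hint 11(b) can be checked in exact arithmetic. The sketch below builds the sequence from the increment relation $a_{n+1} - a_n = (b - 1)(a_n - a_{n-1})$ and compares it with the closed form. The starting values and the value of $b$ are arbitrary illustrative choices, with $0 < b < 2$ assumed so that $|b - 1| < 1$ and the limit exists.

```python
from fractions import Fraction

def terms(a1, a2, b, count):
    """Return [a_1, ..., a_count] using a_{n+1} = a_n + (b - 1)(a_n - a_{n-1})."""
    seq = [a1, a2]
    while len(seq) < count:
        seq.append(seq[-1] + (b - 1) * (seq[-1] - seq[-2]))
    return seq

def closed_form(a1, a2, b, n):
    """a_{n+1} from the partial-sum formula in the hint."""
    return a1 + (a2 - a1) * (1 - (b - 1) ** n) / (2 - b)

# Illustrative values only (assumed): a_1 = 0, a_2 = 1, b = 3/2.
a1, a2, b = Fraction(0), Fraction(1), Fraction(3, 2)
seq = terms(a1, a2, b, 25)            # seq[n] holds a_{n+1}
matches = all(seq[n] == closed_form(a1, a2, b, n) for n in range(1, 25))
limit = a1 + (a2 - a1) / (2 - b)      # the value obtained by letting n tend to infinity
print(matches)                         # True
print(limit, float(limit - seq[-1]))   # the limit and the remaining gap after 25 terms
```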
EXERCISES 2.7 page 89
1. Lets"=
Sincek sk
1,k>2,forna2,s"{ 1 + Ik=2(k
l  i)=22.
Therefore, {s"} is bounded above and hence converges by Theorem 2.7.6. 5. Use the inequality ab s 2 (a2 + b2), a, b ? 0. 9. (b) Since the sequence (s"} of partial sums is monotone increasing, it suffices to show that some
+
subsequence is bounded. Consider the subsequence {s,,) where nk = 2k  I. and prove that s,,. < I + a + where a = 2tP1l. For example, when k = 2, nk = 3 and
s3=1+2P+
3P
=1+( +]1 0 be given. By definition there exists M > 0 such that I1(x)  L I < e for all XM x E (a, oo) with x > M. Let S = 1/M. Then for all t E (0, 1/a) with t < S, 1 It E (a, oc) and I It > M. Therefore. IS(r)  LI = If(',)  LI < e. The proof that limo g(r) = L implies iim f(x) = L is similar. 17. (a) 22 (c) 2 M (e) 2 (g) Limit does not exist. For all x > ;, cos ; > 1. Thus x cos x > 1 x and x cos ;  oo as x  oo. 2(
).
'v
EXERCISES 4.2 page 141 1. (c) Since I  cos x = 2 sin2(x/2), for x # 0, g(x)
sin2(x/2). Now use the fact that I sin t I s 11
4. If $p > 0$,
$$|f(x) - f(p)| = |\sqrt{x} - \sqrt{p}| = |x - p|/(\sqrt{x} + \sqrt{p}) < \frac{1}{\sqrt{p}}|x - p|.$$
Let $\varepsilon > 0$ be given. Set $\delta = \min\{p, \sqrt{p}\,\varepsilon\}$. Then $|x - p| < \delta$ implies that $|f(x) - f(p)| < \varepsilon$. Therefore, $f$ is continuous at $p$. If $p = 0$, set $\delta = \varepsilon^2$. Alternatively use Theorem 4.1.3 and Exercise 14 of Section 2.1. 6. (a) See Exercise 7a of Section 4.1. 7. Use Theorem 4.2.4, and the fact that $x^n$ is continuous on $\mathbb{R}$ for all $n \in \mathbb{N}$. 9. (a) $\mathbb{R} \setminus \{-2, 0, 2\}$. (c) $\mathbb{R}$. 12. (a) Use the fact that $\max\{f(x), g(x)\} = \frac{1}{2}(f(x) + g(x) + |f(x) - g(x)|)$. 16. Consider $g(x) = f(x) - f(x - 1)$, $x \in (0, 1]$. 18. Let $p \in E$ be a limit point of $F$. Use Theorem 2.4.7 and continuity of $f$ to show that $p \in F$. 20. (b) By induction, $f(nx) = nf(x)$ for all $n \in \mathbb{N}$ and $x \in \mathbb{R}$. In particular, $f(n) = cn$ where $c = f(1)$. Also, $c = f(1) = f(n \cdot \frac{1}{n}) = nf(\frac{1}{n})$. Therefore, $f(\frac{1}{n}) = c/n$. Since $f$ is continuous, letting $n \to \infty$ gives $f(0) = 0$. From this it now follows that $f(-x) = -f(x)$ for all $x \in \mathbb{R}$. Thus $f(n) = cn$ for all $n \in \mathbb{Z}$ and $f(r) = cr$ for all $r \in \mathbb{Q}$ (write $r = m/n$, $m, n \in \mathbb{Z}$). Finally, by continuity $f(x) = cx$ for all $x \in \mathbb{R}$. 22. Take $\varepsilon = 1$. Then for this choice of $\varepsilon$ there exists a $\delta > 0$ such that $|f(x) - f(p)| \le 1$ for all $x \in N_\delta(p) \cap E$. Show that this implies that $|f(x)| \le (|f(p)| + 1)$ for all $x \in N_\delta(p) \cap E$. 25. Theorem 2.4.7 and Theorem 3.2.10 should prove helpful. 29. By hypothesis, for each $x \in K$ there exist $\varepsilon_x > 0$ and $M_x > 0$ such that $|f(y)| \le M_x$ for all $y \in N_{\varepsilon_x}(x) \cap K$. The collection $\{N_{\varepsilon_x}(x)\}_{x \in K}$ is an open cover of $K$. Now use compactness of $K$ to show that there exists a positive constant $M$ such that $|f(y)| \le M$ for all $y \in K$.
EXERCISES 4.3 page 147 2. (a) Suppose $f(x) = x^2$ is uniformly continuous on $[0, \infty)$. Then with $\varepsilon = 1$, there exists a $\delta > 0$ such that $|f(x) - f(y)| < 1$ for all $x, y \in [0, \infty)$ satisfying $|x - y| < \delta$. Set $x_n = n$ and $y_n = n + \frac{1}{n}$. If $n_0 \in \mathbb{N}$ is such that $n_0\delta > 1$, then $|y_n - x_n| = \frac{1}{n} < \delta$ for all $n \ge n_0$. But $|f(y_n) - f(x_n)| = 2 + \frac{1}{n^2} \ge 2$ for all $n$. This is a contradiction!
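The two quantities in this argument can be tabulated exactly. The sketch below, an illustration only, evaluates the gap $y_n - x_n$ and the jump $f(y_n) - f(x_n)$ for $f(x) = x^2$ with $x_n = n$ and $y_n = n + 1/n$; the helper name `gap_and_jump` is hypothetical.

```python
from fractions import Fraction

def gap_and_jump(n):
    """Return (y_n - x_n, y_n^2 - x_n^2) for x_n = n and y_n = n + 1/n."""
    x = Fraction(n)
    y = x + Fraction(1, n)
    return y - x, y**2 - x**2

for n in (1, 10, 100, 1000):
    gap, jump = gap_and_jump(n)
    print(n, gap, jump)   # the gap shrinks like 1/n while the jump stays above 2

# The jump equals 2 + 1/n^2 exactly, so it never drops below 2.
always_at_least_two = all(gap_and_jump(n)[1] >= 2 for n in range(1, 201))
print(always_at_least_two)   # True
```

However small $\delta$ is chosen, some pair $x_n, y_n$ lies within $\delta$ of each other while their images stay at least 2 apart, which is exactly what rules out uniform continuity with $\varepsilon = 1$.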
3. (a) For all x, y E [0, oo),
If(x)f(y)I =
l+x
l+y
(I+x)(ly+Y) 0, the choice 6 = e will work. (f) Set g(x) = sin x/x, x E (0. 1 ] and g(0) = 1. Then g is continuous on [0, 1 ], and thus by Theorem 4.3.4 uniformly continuous on [0, 1 ]. From this it now follows that f is uniformly con
tinuous on (0, 1). 4. (a) Show that If(x)  f(y)I 5
1 x  yI for all x, y E [a, oo). a3
5. (a) For x, y E [a, 00),
Imo VY I = IxyI/(V + VY )52:rIxyl. 7. (b) Suppose I f I and I g I are bounded by M, and M2, respectively. Then
I f(x)g(x)  f(y)g(y) I
I AX) I I g(x)  g(y) I + I g(y) I I f (x)  f(y) I
5 MI I g(x)  g(y) I + M2I f(x)  f(y) I
13. (a) Let x, E E be arbitrary. For it = 1 set x,,.
Now use the uniform continuity off and g. the sequence {x"} is contractive.
EXERCISES 4.4 page 160 0; i.e, b > I. not exist.
13. (b) f;(0) = hliw, ht'''t sin ,which exists and
EXERCISES 5.2 page 187 1. (a) Increasing on R. (c) Decreasing on (oo, 0) and increasing on (0, oo), with an absolute minimum at x = 0. 4. (a) Show that the function f (x) = xv"  (x  1)'r" is decreasing on the interval 1 0 for all x. I x  c I < S. Therefore, f (x) < 0 on (c  S, c) and f (x) > 0 on (c, c + S). Thus f has a local minimum at c. 7. Since P(2) = 0 we can assume that P(x) = a(x  2)2 + b(x  2). Now use the fact that P must satisfy P(I) = I and P'(I) = 2 to determine a and b.
12. (a) Let t, c. Since f'(c) exists, lim (f(r")  f(c))/(t  c) = f'(c). Now apply the mean value theorem.
15. Since f+(a) = lira (f(x)  f(a))/(x  a;70, there exists a S > 0 such that (f(x)  f(a))/(x  a) > 0 for all x, a < x < a + S. 17. Hint: Consider f (x). 22. (a) For fixed a > 0 consider f(x) = L(ax), x E (0, x). (c) By (a) and (b), L(b") = nL(b) for all n E Z and It E (0, oo). But then L(b) = L((b"")") = it L(V"). From this it
now follows that $L(b^r) = rL(b)$ for all $r \in \mathbb{Q}$. Now use the continuity of $L$ to prove that $L(b^x) = xL(b)$ for all $x \in \mathbb{R}$, where $b^x = \sup\{b^r : r \in \mathbb{Q}, r \le x\}$. 23. (b) Since $\tan(\operatorname{Arctan} x) = x$, by Theorem 5.2.14 and the chain rule, $\frac{d}{dx}\tan(\operatorname{Arctan} x) = (\sec^2(\operatorname{Arctan} x))(\frac{d}{dx}\operatorname{Arctan} x) = 1$. The result now follows from the identity $\sec^2(\operatorname{Arctan} x) = 1 + x^2$. To prove this, consider the right triangle with sides of length $1$, $|x|$, and $\sqrt{1 + x^2}$, respectively.
EXERCISES 5.3 page 196
f(x) f(x,')/g(x)  g(xo) 2f(x) . = x  X. x  X. g(x)
Now use Theorem 4.1.6(c) and the definition of the derivative.
4. Use the fact that since ,ilitf(x) exists, f(x) is bounded on (a, a + 8) for some S > 0. 1 Inx x3+2x3 = lint 5x4+2 = 7 (e) By L'Hospital's rule. lim  = =lim= x = 0. 6. (a) lim +'= x ! 2x3  x2  1 =+I 6x2  2x 4"
(e) Make the substitution x = 1/t.
EXERCISES 5.4 page 203 (2c; + ar)/3c2, 2. (a) f(0) = I and f (1) _ 1. Therefore. 1. Let c, > 0 be arbitrary. By Newton's method c", f has a zero on the interval (0, 1 ]. With c, = 0.5. c2 = 0.33333333, f(c2) = .037037037, c3 = 0.34722222, f (c,3) = 0.000195587, c4 _ .34729635,f(c4) = 0.000000015.
CHAPTER 6 EXERCISES 6.1 page 221
(a) f(x)
_
1, 0Sx< 1, Let 91= xa { x j, ,, ... , x"} be any partition of [0, 2] and let k E {l.... , n} be such
2, 1 s x 0. Use equations (7), (8), and the fact that f E 01 [c, b] for every c E (a, b), to prove that (b
(b
('c
O S J f J f_ J f
rc
f 17,778. The value n = 12 will work. This value of n will guarantee that E12(f) < 0.0000086. Compare your answer with the exact answer of V + 2ln(2 + V5).
EXERCISES 6.7 page 276 4. No. Consider g = Xc on [0, 11 and let f be the zero function.
CHAPTER 7 EXERCISES 7.1 page 291 2. (a) Diverges (c) Converges (e) Diverges (g) Converges by the ratio test (I) Converges (m) Converges for p > 1; diverges for 0 < p s 1 3. (a) Converges to 1 /(1  sin p) for all p E R for which sin p I < 1; that is, for all p * (2k + 1)j, k E Z 4. (b) Since 7, ak converges, lim ak = 0. Thus there exists k, E N such that 0 s ak s 1 for all k  k,. But then 0 5 ak s ak for all k a k and 7, ak converges by the comparison test. (d) Take ak = I/k2. 5. The series diverges for all q < 1, p E R, and converges for all q > 1, p E R. If q = 1, the series diverges for L, where 0 < L < oo. Take e = IL. For this e, there p s 1 and converges for p > 1. 6. Suppose exist k, E N such that #L 5 ak/bk s }L for all k z k,. The result now follows by the comparison test. 12. The proof Vk = 1 for all n r= Z. 13. The given series is the sum of the two series u++ses the fact that lim(LPY' + + a,,, and tk = a1 + 2a1 + Ijt and Mk t ?Jp, each of which converges. 16. let s, = a, + a2 + `k 1 , show that if n < 22, then s s tk, and if is > 2k, 2ka2. By writing s, = ak + (a2 + a3) + (a4  as + a6 + a7) + then s Z I'tk. From these two inequalities it now follows that F ak < oo if and only if I 2ka2, < oo. 1& (a) Diverges. If ak  1/(k In k), then 2ka2. = 1/(k In 2). 19. Use Example 5.2.7 to show that ck  ck+ 1 a 0 for all k. Thus {ck} is monotone decreasing. Use the definition of In k and the method of proof of the integral test to show that ck =' 0 for all k. 21. Write ak+I/ak  I  xk/k where xk = (q  p)(k/(q + k + 1)). 22. (c) When p = 2,
kfI.3...(2k 1)12
2.4... 2k )
112 Tk r =11`12,J zl2k ( 1)a
Now use the fact that M9(1 + h)I"k = e. 23. (a) Set s, = L4., ak and let s = lim s,,. Consider the series E bk
where b, = (1<  W771 ) and fork z 2, bk = (Vs_,  V). EXERCISES 7.2 page 298 1. If {b,} is monotone increasing to b. consider $00 , (b  bk)ak. coos kt, then 1 even. 4. If D. = E[SiO(k 1 sin
r
1
2
I
2
2
2. Take bk = 1/k fork odd, and bk = i/k2 for k
\
1
sin ?tJ.
5. (a) Converges (c) Converges (d) Diverges; kim , , = * 0 (f) Converges (h) Converges for all t * 2nsr, n E Z. If t  2nar, then the series converges for p > 1 and diverges for 0 < p s 1. & Use the partial summation formula to prove that n
n1
n
I kak = nA,  Y, Ak, where Ak = Y, al
k1 k1 Now use Exercise 14 of Section 2.2.
kI
EXERCISES 7.3 page 305 2. Use the inequality lab 1 s 2 (a2 + b2), a, b E R. 6. (a) Converges conditionally (c) Converges absolutely for p > I and conditionally for 0 < p 1, and conditionally for 0 < p s I S. Fast note that nI 1 I
S.  
3k + I + 3k + 2

I
3k +
3)'
Now show that Si,, +oo as n , oo. 10. By Theorem 7.2.6 the series converges. To show that S ' t = oo, show that for any three consecutive integers, at least one satisfies I sin k I
EXERCISES 7.4 page 312 1. (a) 11{l/(In k)}112 = Moko2 1/(In k)2, which diverges (Exercise 5, Section 7.1) 2. (a) Ip I < 1 (c) p z 2 4. Since { 1/k} E 12, the result follows by the CauchySchwarz inequality. 10. If we interpret the vectors a and b as
forming two sides of a triangle, with the third side given by $b - a$, then by the law of cosines, $\|b - a\|_2^2 = \|b\|_2^2 + \|a\|_2^2 - 2\|a\|_2\|b\|_2\cos\theta$. Now apply Exercise 9e.
CHAPTER 8 EXERCISES 8.1 page 322 nx _ 0, x = 0, 1. (a) lim 3. (c)
rf n = j
10
1, x>0
(c) lira (cos x)2n __
".o,
0, 1,
x * k,r, k E Z,
x=kar,kEZ
2/
iM
(2n  n2x) dx = i + = 1 S. (a) If x = 0, fn(0) = 0 for all n e N. If x > 0,
n2x d x + f
0
1/n
then 0 < fn(x) < x/n, f r o m which the result follows. (b) F o r each n E N& (x) has a maximum of e ' at x = n.
6. Use the fact that forN,MEN,
N(M
"I
M1
M an. m
M1
M
N
Rt
an. m
00
M1 R=t
00
a, m
00
(
an. m 111
The above inequalities hold since a0, m > 0 for all n, m E N. Now first let M  oo, and then N b oo, to obtain an. m
an. m
n=1
mil
m=1
nil
The same argument also proves the reverse inequality.
EXERCISES 8.2 page 328 2. (b) Suppose $\{f_n\}$ and $\{g_n\}$ converge uniformly to $f$ and $g$, respectively, on $E$. Then $|f_n(x)g_n(x) - f(x)g(x)| \le |g_n(x)||f_n(x) - f(x)| + |f(x)||g_n(x) - g(x)|$. By hypothesis, $|g_n(x)| \le N$ for all $x \in E$, $n \in \mathbb{N}$. Also, since $|f_n(x)| \le M$ for all $x \in E$, $n \in \mathbb{N}$, $|f(x)| \le M$ for all $x \in E$. Therefore,
$$|f_n(x)g_n(x) - f(x)g(x)| \le N|f_n(x) - f(x)| + M|g_n(x) - g(x)|.$$
Now use the definition of uniform convergence of $\{f_n\}$ and $\{g_n\}$ to show that given $\varepsilon > 0$, there exists $n_0 \in \mathbb{N}$ such that $|f_n(x)g_n(x) - f(x)g(x)| < \varepsilon$ for all $x \in E$ and $n \ge n_0$. 4. Find $M_n = \max\{f_n(x) : x \in [0, 1]\}$, and show that $M_n \to \infty$. 5. (a) For $x \in [0, a]$, $|f_n(x)| \le a^n$. If $0 < a < 1$, then $\lim a^n = 0$. Thus given $\varepsilon > 0$, there exists $n_0 \in \mathbb{N}$ so that $a^n < \varepsilon$ for all $n \ge n_0$; that is, $|f_n(x)| < \varepsilon$ for all $x \in [0, a]$, $n \ge n_0$. Therefore, $\{f_n\}$ converges uniformly to 0 on $[0, a]$ whenever $a < 1$.
8. (a)
ii +
< co. the series 2
+ converges uniformly by the Weierstrass k' j? (c) For x , 1, k2ek: s k2(1/e)'t. Since 1/e < 1. the series 7, k2(1/e)k converges.
Mtest.
9. (a)
for all x E It Since .1
X
2
I I
(2k +))3n
(2k
1
1)3/2
5 C R for all x E R. Since 7
uniformly for all x E R by the Weierstrass Mtest.
(c) Hint: Let S,(x) =
< co, the given series converges 1
ku(kz + 2
1
 (k + 1)x + 2)
10. (a) For x ? a > 0. 1 + k2x ? ak2. Thus since 1/k2 < no. by the Weierstrass Mtest, the given series converges uniformly on (a, oo) for every a > 0. To show that it does not converge uniformly on (0. oo), consider M for all x E [0, 1 J. Show (S,  5_t)(1/n2), where S. is the nth partial sum of the series. 1& Suppose I Fa(x) for all x E [0, 1], n E N. Now use the Weierstrass Mtest.
that I F,(x) I < M n.
EXERCISES 8.3 page 336 1. Show that
kLx(1  xyt = {?'
Thus by Corollary 8.3.2, the convergence cannot be uniform on
$[0, 1]$. 4. Since $f$ is uniformly continuous on $\mathbb{R}$, given $\varepsilon > 0$, there exists a $\delta > 0$ such that $|f(x) - f(y)| < \varepsilon$ for all $x, y \in \mathbb{R}$, $|x - y| < \delta$. Choose $n_0 \in \mathbb{N}$ such that $1/n_0 < \delta$. Then for all $n \ge n_0$, $|f(x) - f_n(x)| < \varepsilon$ for all $x \in \mathbb{R}$. 6. Let $\varepsilon > 0$ be given. Since $\{f_n\}$ converges uniformly on $D$, there exists $n_0 \in \mathbb{N}$ such that $|f_n(x) - f_m(x)| < \varepsilon$ for all $x \in D$, $n, m \ge n_0$. Use continuity of the functions and the fact that $D$ is dense in $E$ to prove that $|f_n(y) - f_m(y)| \le \varepsilon$ for all $y \in E$, $n, m \ge n_0$.
show that given e > 0, there exists c E R, c > 0, so that f,'g < Ie. Now show that
If fJf.1 5Jlffa[+2Js. to f on [0, c] to finish the proof. 11. (b) To show that (19[a, b), n I6) is not Use the uniform convergence of complete, it suffices to find a sequence If.) of continuous functions that converges in the norm II 1i to a Riemann integrable function f that is not continuous.
EXERCISES 8.5 page 345
2. By the fundamental theorem of calculus, $f_n(x) = f_n(x_o) + \int_{x_o}^{x} f_n'(t)\,dt$ for all $x \in [a, b]$. If $\{f_n'\}$ converges uniformly to $g$ on $[a, b]$, use Theorems 6.3.4 and 8.4.1 to prove that $\{f_n\}$ converges uniformly to a function $f$ on $[a, b]$ with $f'(x) = g(x)$ for all $x \in [a, b]$.
4. Let $x \in (a, b)$ be arbitrary, and choose $c, d$ such that $a < c < x < d < b$. Now apply Theorem 8.5.1 to the sequence $\{f_n\}$ on $[c, d]$, to obtain that $f$ is differentiable at $x$ with $f'(x) = \lim_{n \to \infty} f_n'(x)$.
6. (a) Use the comparison test to show that the given series converges for all $x > 0$. Let $S(x) = \sum_{k=1}^{\infty} (1 + kx)^{-2}$ and $S_n(x) = \sum_{k=1}^{n} (1 + kx)^{-2}$. Then $S_n'(x) = -2\sum_{k=1}^{n} k(1 + kx)^{-3}$. Use the Weierstrass M-test and the comparison test to show that the sequences $\{S_n(x)\}$ and $\{S_n'(x)\}$ converge uniformly on $[a, \infty)$ for every $a > 0$. Thus by Theorem 8.5.1,
$$S'(x) = \lim_{n \to \infty} S_n'(x) = -2\sum_{k=1}^{\infty} k(1 + kx)^{-3}$$
for all $x \in (a, \infty)$. Since this holds for every $a > 0$, the result holds for all $x \in (0, \infty)$.
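As a numerical sanity check of the term-by-term differentiation in Exercise 6(a), the sketch below compares a central-difference derivative of a partial sum $S_n$ with the differentiated partial sum $-2\sum k(1+kx)^{-3}$. The evaluation point $x = 0.5$, the truncation $n = 2000$, and the step size are arbitrary choices made for illustration, and a finite check of this kind illustrates the identity without proving it.

```python
def partial_sum(x, n):
    """S_n(x) = sum_{k=1}^{n} (1 + k x)^(-2)."""
    return sum((1.0 + k * x) ** -2 for k in range(1, n + 1))


def partial_sum_derivative(x, n):
    """Term-by-term derivative: -2 * sum_{k=1}^{n} k (1 + k x)^(-3)."""
    return -2.0 * sum(k * (1.0 + k * x) ** -3 for k in range(1, n + 1))


def central_difference(func, x, h=1e-5):
    """Symmetric difference quotient approximating func'(x)."""
    return (func(x + h) - func(x - h)) / (2.0 * h)


x, n = 0.5, 2000
numeric = central_difference(lambda t: partial_sum(t, n), x)
termwise = partial_sum_derivative(x, n)
print(numeric, termwise)
```

The two values should agree to several decimal places, reflecting that differentiation and summation commute for each finite partial sum; the uniform convergence in the hint is what carries this over to the limit.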
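EXERCISES 8.6 page 352
2. First show that translation by $p$ does not change upper sums. Let $\mathcal{P} = \{x_0, x_1, \dots, x_n\}$ be a partition of $[c + p, d + p]$, and set $y_i = x_i - p$. Then $\mathcal{P}^* = \{y_0, y_1, \dots, y_n\}$ is a partition of $[c, d]$. If $t \in [x_{i-1}, x_i]$, then $t = s + p$ for some $s \in [y_{i-1}, y_i]$. Since $f$ is periodic of period $p$, $f(t) = f(s + p) = f(s)$. Therefore, $\sup\{f(t) : t \in [x_{i-1}, x_i]\} = \sup\{f(s) : s \in [y_{i-1}, y_i]\}$, and as a consequence, $\mathcal{U}(\mathcal{P}, f) = \mathcal{U}(\mathcal{P}^*, f)$. Hence the upper integrals of $f$ over $[c + p, d + p]$ and $[c, d]$ agree. For $0 \le a \le p$, split $[a, a + p]$ at $p$ and translate $[p, a + p]$ onto $[0, a]$; the general case follows by translating by multiples of $p$. From this it now follows that
$$\overline{\int_a^{a+p}} f = \overline{\int_0^{p}} f.$$
The proof for the lower integral is similar. Thus $f \in \mathcal{R}[0, p]$ if and only if $f \in \mathcal{R}[a, a + p]$.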
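The conclusion of Exercise 2, that the integral of a periodic function over any interval of length $p$ equals the integral over $[0, p]$, can be illustrated numerically. The sketch below uses a midpoint rule and a hypothetical sample function of period $2\pi$; the function, the shifts, and the step count are assumptions chosen only for the illustration.

```python
import math


def midpoint_integral(func, lower, upper, steps=20000):
    """Midpoint-rule approximation of the integral of func over [lower, upper]."""
    width = (upper - lower) / steps
    return width * sum(func(lower + (i + 0.5) * width) for i in range(steps))


def sample(t):
    """Hypothetical periodic function with period 2*pi."""
    return math.sin(t) ** 2 + 0.5 * math.cos(3.0 * t)


period = 2.0 * math.pi
base = midpoint_integral(sample, 0.0, period)
shifted = {a: midpoint_integral(sample, a, a + period) for a in (-1.3, 0.7, 5.0)}
print(base, shifted)
```

For this sample the exact value over one period is $\pi$, since $\sin^2$ integrates to $\pi$ and $\cos 3t$ integrates to $0$.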
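4. (a) $c_n = \tfrac{1}{2}(n + 1)$
6. Set $A_n(\delta) = \sup\{Q_n(x) : x \in [-\delta, \delta]\}$. Then $0 < \delta_1 < \delta_2$ implies $A_n(\delta_1) \le A_n(\delta_2)$. Suppose $\lim A_n(\delta_1) < \infty$ for some $\delta_1 > 0$. Then there exists a finite constant $C$ and $n_o \in \mathbb{N}$ such that $A_n(\delta) \le C$ for all $n \ge n_o$, $0 < \delta \le \delta_1$. Use this fact to obtain a contradiction to the hypothesis that $\{Q_n\}$ is an approximate identity.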
EXERCISES 8.7 page 370
1. (b) $R = 2$ (d) $R = e$
2. (b) By the root test the series converges absolutely for all $x$, -2 < x < t, and diverges for all other $x \in \mathbb{R}$.
3. (a) $x/(1 - x)^2$
5. (a) $\dfrac{1}{1 + x^2} = \dfrac{1}{1 - (-x^2)} = \sum_{k=0}^{\infty} (-1)^k x^{2k}$, $|x| < 1$. Use the previous exercise and the fact that $\operatorname{Arctan} x = \int_0^x (1 + t^2)^{-1}\,dt$ to find the Taylor series expansion of $\operatorname{Arctan} x$ at $c = 0$. (c) Use Theorem 7.2.4.
12. (b) By Example 8.7.20(c), $\ln x = \ln(1 + (x - 1)) = \sum_{k=1}^{\infty} \dfrac{(-1)^{k+1}}{k}(x - 1)^k$, which converges for all $x$, $0 < x \le 2$. (d) By computation, $f^{(n)}(x) = \dfrac{1 \cdot 3 \cdots (2n - 1)}{2^n}(1 - x)^{-(2n+1)/2}$. Therefore, the Taylor series expansion of $(1 - x)^{-1/2}$ is given by
$$1 + \sum_{k=1}^{\infty} \frac{1 \cdot 3 \cdots (2k - 1)}{2^k\, k!}\, x^k.$$
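For $-1 < x \le 0$, use Theorem 8.7.16 to show that
$$|R_n(x)| \le \frac{1 \cdot 3 \cdots (2n + 1)}{2^{n+1}(n + 1)!}\,|x|^{n+1}.$$
Use convergence of the series $\sum \dfrac{1 \cdot 3 \cdots (2n + 1)}{2^{n+1}(n + 1)!}\,|x|^{n+1}$, $|x| < 1$, to conclude that $\lim R_n(x) = 0$, $-1 < x \le 0$. If $0 < x < 1$, use Corollary 8.7.19 to show that
$$R_n(x) = \frac{1 \cdot 3 \cdots (2n + 1)}{2^{n+1}\, n!}\left(\frac{x - \zeta}{1 - \zeta}\right)^{n} \frac{x}{(1 - \zeta)^{3/2}}$$
for some $\zeta$, $0 < \zeta < x$. Now use the method of Example 8.7.20(c) to show that $\lim R_n(x) = 0$ for all $x$, $0 < x < 1$. Thus the series converges to $(1 - x)^{-1/2}$ for all $x$, $|x| < 1$. (e) Use the fact that $\operatorname{Arcsin} x = \int_0^x \dfrac{1}{\sqrt{1 - t^2}}\,dt$, $|x| < 1$.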
EXERCISES 8.8 page 376
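1. (a) $\Gamma(\tfrac{3}{2}) = \Gamma(\tfrac{1}{2} + 1) = \tfrac{1}{2}\Gamma(\tfrac{1}{2}) = \tfrac{1}{2}\sqrt{\pi}$
3. (a) foe'r'ldt = r(2) = WW
5. (a) $\displaystyle\int_0^{\pi/2} (\sin x)^{2n}\,dx = \frac{\Gamma(n + \tfrac{1}{2})\,\Gamma(\tfrac{1}{2})}{2\,\Gamma(n + 1)}$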
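The Gamma-function values in Exercises 1(a) and 5(a) can be spot-checked with the standard library. The sketch below assumes the readings $\Gamma(\tfrac32) = \tfrac12\sqrt{\pi}$ and $\int_0^{\pi/2} (\sin x)^{2n}\,dx = \Gamma(n + \tfrac12)\Gamma(\tfrac12) / (2\Gamma(n + 1))$; the helper names and the midpoint rule are illustrative choices, not part of the text.

```python
import math


def sine_power_integral(n, steps=20000):
    """Midpoint-rule approximation of the integral of (sin x)^(2n) over [0, pi/2]."""
    width = (math.pi / 2.0) / steps
    return width * sum(math.sin((i + 0.5) * width) ** (2 * n) for i in range(steps))


def gamma_formula(n):
    """Closed form Gamma(n + 1/2) * Gamma(1/2) / (2 * Gamma(n + 1))."""
    return math.gamma(n + 0.5) * math.gamma(0.5) / (2.0 * math.gamma(n + 1.0))


gamma_three_halves = math.gamma(1.5)
comparison = {n: (sine_power_integral(n), gamma_formula(n)) for n in range(0, 5)}
print(gamma_three_halves, math.sqrt(math.pi) / 2.0)
print(comparison)
```

For $n = 0$ both sides reduce to $\pi/2$, which gives a quick consistency check on the normalization.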
CHAPTER 9 EXERCISES 9.1 page 388
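1. $\int_{-1}^{1} \phi_1^2 = \int_{-1}^{1} 1 = 2$ and $\int_{-1}^{1} \phi_2^2 = \int_{-1}^{1} x^2 = \tfrac{2}{3}$. Therefore, $c_1 = \tfrac{1}{2}\int_{-1}^{1} \sin \pi x\,dx = 0$ and $c_2 = \tfrac{3}{2}\int_{-1}^{1} x \sin \pi x\,dx = \tfrac{3}{\pi}$. Thus by Theorem 9.1.4, $S_2(x) = \tfrac{3}{\pi}x$ gives the best approximation in the mean to $\sin \pi x$ on $[-1, 1]$.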
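The coefficients in Exercise 1 can be recomputed numerically. The sketch below assumes the orthogonal system is $\phi_1(x) = 1$, $\phi_2(x) = x$ on $[-1, 1]$ and that each coefficient is $c_k = \int f\phi_k / \int \phi_k^2$; the helper names and the midpoint rule are illustrative choices. It also compares the mean-square error of $c_2 x$ with that of two nearby slopes, which should be larger if $c_2$ is optimal.

```python
import math


def midpoint_integral(func, lower, upper, steps=20000):
    """Midpoint-rule approximation of the integral of func over [lower, upper]."""
    width = (upper - lower) / steps
    return width * sum(func(lower + (i + 0.5) * width) for i in range(steps))


def coefficient(phi, target, lower=-1.0, upper=1.0):
    """Return (integral of target * phi) / (integral of phi^2) over [lower, upper]."""
    numerator = midpoint_integral(lambda x: target(x) * phi(x), lower, upper)
    denominator = midpoint_integral(lambda x: phi(x) ** 2, lower, upper)
    return numerator / denominator


def mean_square_error(slope, target):
    """Integral of (target(x) - slope * x)^2 over [-1, 1]."""
    return midpoint_integral(lambda x: (target(x) - slope * x) ** 2, -1.0, 1.0)


def target(x):
    return math.sin(math.pi * x)


c1 = coefficient(lambda x: 1.0, target)
c2 = coefficient(lambda x: x, target)
best = mean_square_error(c2, target)
nearby = [mean_square_error(c2 + shift, target) for shift in (-0.1, 0.1)]
print(c1, c2, 3.0 / math.pi, best, nearby)
```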
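3. (a) $a_1 = 1$, $a_2 = 1$, $a_3 = 6$
5. (c) $b_n = \dfrac{2}{\pi}\int_0^{\pi} x \sin nx\,dx = -\dfrac{2}{n}\cos n\pi = \dfrac{2}{n}(-1)^{n+1}$. Therefore, $f(x) \sim 2\sum_{k=1}^{\infty} \dfrac{(-1)^{k+1}}{k}\sin kx$.
6. (c) $x \sim \dfrac{\pi}{2} - \dfrac{4}{\pi}\sum_{k=0}^{\infty} \dfrac{1}{(2k + 1)^2}\cos(2k + 1)x$
12. (a) As in the proof of Theorem 7.4.3, for $\lambda \in \mathbb{R}$, $0 \le \|x - \lambda y\|^2 = \|x\|^2 - 2\lambda\langle x, y\rangle + \lambda^2\|y\|^2$. If $y \ne 0$, take $\lambda = \langle x, y\rangle / \|y\|^2$ to derive the inequality.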
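EXERCISES 9.2 page 394
4. (a) For the orthogonal system $\{\sin nx\}_{n=1}^{\infty}$ on $[0, \pi]$, $\int_0^{\pi} \sin^2 nx\,dx = \pi/2$. Therefore, $b_n = \dfrac{2}{\pi}\int_0^{\pi} f(x)\sin nx\,dx$. Thus Parseval's equality for the orthogonal system $\{\sin nx\}$ becomes
$$\sum_{n=1}^{\infty} b_n^2 = \frac{2}{\pi}\int_0^{\pi} f^2(x)\,dx.$$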
(b) (1) g ,
5. Use Parseval's equality and the fact that $fg = \tfrac{1}{2}[(f + g)^2 - f^2 - g^2]$.
6. Any function $f$ that is identically zero except at a finite number of points will satisfy $\int_a^b f\,dx = 0$.
EXERCISES 9.3 page 403
1. (a) If $f$ is even on $[-\pi, \pi]$, then $f(x)\sin nx$ is odd and $f(x)\cos nx$ is even. Thus $b_n = 0$ for all $n = 1, 2, \dots$, and $a_n = \dfrac{2}{\pi}\int_0^{\pi} f(x)\cos nx\,dx$, $n = 0, 1, 2, \dots$.
3. (a) $f(x)$ (c) $|x| \sim$
5. (a) 1,
2
1,
(1
(1)k) k
IT
 ,rr F, (2k + 1)2
sin kx =
a k0 2k + l
sin(2k + 1)x
cos(2k + 1)x. (e) 1 + x  I + 2 1
1 4 ao sin(2k + 1)x a k 0.
3. First show that $m(P_n) = \tfrac{2}{3}\,m(P_{n-1})$ for all $n \in \mathbb{N}$. From this it now follows that $m(P) \le (\tfrac{2}{3})^n$ for all $n \in \mathbb{N}$.
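6. First show that there exist disjoint bounded open sets $U_1, U_2$ with $U_1 \supset K_1$ and $U_2 \supset K_2$. Then $m(K_1 \cup K_2) = m(U_1 \cup U_2) - m((U_1 \cup U_2) \setminus (K_1 \cup K_2))$. But $(U_1 \cup U_2) \setminus (K_1 \cup K_2) = (U_1 \setminus K_1) \cup (U_2 \setminus K_2)$. Now use Theorem 10.2.9.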
EXERCISES 10.3 page 448
1. (b) First show that if $U$ is any open set, then $U + x$ is open and $m(U + x) = m(U)$. Use this and the definition to prove that $\lambda^*(E + x) = \lambda^*(E)$. If $K$ is compact and $U$ is a bounded open set containing $K$, show that $(U + x) \setminus (K + x) = (U \setminus K) + x$. Use this to show that $m(K + x) = m(K)$ and $\lambda_*(E + x) = \lambda_*(E)$.
3. Since $E_1 \cap E_2 \subset E_1$ and $\lambda^*(E_1) = 0$, $\lambda^*(E_1 \cap E_2) = 0$. Thus by Theorem 10.3.5, $E_1 \cap E_2$ is measurable. For $E_1 \cup E_2$ apply Theorem 10.3.9.
6. If $\lambda^*(E) < \infty$, then for each $k \in \mathbb{N}$ there exists an open set $U_k$ with $U_k \supset E$ such that $m(U_k) < \lambda^*(E) + 1/k$. Now use the fact that $E \subset \bigcap U_n \subset U_k$ for all $k \in \mathbb{N}$.
8. Set $E_k = E \cap (-k, k)$, $k \in \mathbb{N}$. Then $\{\lambda^*(E_k)\}$ is monotone increasing with $\lambda^*(E_k) \le \beta$. Use this to show that there exists $k_o \in \mathbb{N}$ such that $\lambda^*(E_k) > \beta$ for all $k \ge k_o$, which is a contradiction.
EXERCISES 10.4 page 455
2. If $E$ is bounded, the result follows from the definition of $\lambda^*(E)$ and $\lambda_*(E)$, and Theorem 10.4.5(b) (for a finite union). If $E$ is unbounded, let $E_n = E \cap (-n, n)$. Given $\varepsilon > 0$, choose $U_n$ open such that $E_n \subset U_n$ and $\lambda(U_n \setminus E_n) < \varepsilon/2^n$. Let $U = \bigcup U_n$. Show that $U \setminus E \subset \bigcup (U_n \setminus E_n)$. Now use Theorem 10.4.5 to show that $\lambda(U \setminus E) < \varepsilon$. To obtain a closed set $F \subset E$ satisfying $\lambda(E \setminus F) < \varepsilon$, apply the result for open sets to $E^c$.
4. First show that $\lambda(E_1 \cup E_2) = 1$; then use Theorem 10.4.1.
6. If $E$ satisfies $\lambda^*(E \cap T) + \lambda^*(E^c \cap T) = \lambda^*(T)$ for every $T \subset \mathbb{R}$, then $E$ satisfies Theorem 10.4.2 and thus is measurable. Conversely, suppose $E$ is measurable and $T \subset \mathbb{R}$. If $\lambda^*(T) = \infty$, the result is true. Assume $\lambda^*(T) < \infty$. Let $\varepsilon > 0$ be arbitrary. Then there exists an open set $U \supset T$ such that $\lambda(U) < \lambda^*(T) + \varepsilon$. Since $E$ and $U$ are measurable, $E \cap U$ and $E^c \cap U$ are disjoint measurable sets with $(E \cap U) \cup (E^c \cap U) = U$. Furthermore, $E \cap U \supset E \cap T$ and $E^c \cap U \supset E^c \cap T$. Thus by Theorem 10.3.9,
$$\lambda^*(T) \le \lambda^*(E \cap T) + \lambda^*(E^c \cap T) \le \lambda(E \cap U) + \lambda(E^c \cap U) = \lambda(U) < \lambda^*(T) + \varepsilon.$$
Since the above holds for every $\varepsilon > 0$, we have $\lambda^*(E \cap T) + \lambda^*(E^c \cap T) = \lambda^*(T)$.
EXERCISES 10.5 page 461
1. {x : f(x) > c} a
[0,1],
if c < 0.
(0, 1 ],
(0, 1) U {1},
if OS c< 1, if l s c < 2.
(0, 1).
if 2 s c.
5. If $c > 0$, then $\{x : 1/g(x) > c\} = \{x : g(x) > 0\} \cap \{x : g(x) < 1/c\}$. Since $g$ is measurable, each of the sets $\{g(x) > 0\}$ and $\{g(x) < 1/c\}$ is measurable. = E. Since each of the sets $\{f(x) > c\}$ and $E$ are measurable, f' is measurable. (c) Not in general. If $E$ is a nonmeasurable set, consider the function that is $1$ on $E$ and $-1$ on $E^c$.
14. Since $f$ is differentiable on $[a, b]$, $f'(x) = \lim_{n \to \infty} n\left(f(x + \tfrac{1}{n}) - f(x)\right)$ for all $x \in [a, b]$. For each $n \in \mathbb{N}$, $g_n(x) = n\left[f(x + \tfrac{1}{n}) - f(x)\right]$ is measurable (justify). Thus by Corollary 10.5.10, the function $f'$ is measurable.
15. First show that given $\varepsilon, \delta > 0$, there exists a measurable set $E \subset [a, b]$ and $n_o \in \mathbb{N}$ such that $\lambda([a, b] \setminus E) < \varepsilon$ and $|f_n(x) - f(x)| < \delta$ for all $x \in E$ and $n \ge n_o$. To accomplish this, for each $k \in \mathbb{N}$ consider $A_k = \{x : |f_n(x) - f(x)| < \delta \text{ for all } n \ge k\}$. Now show that $\lim \lambda(A_k^c) = 0$. Here $A_k^c = [a, b] \setminus A_k$. Complete the proof of Egorov's theorem as follows: By the above, for each $k \in \mathbb{N}$, there exists a measurable set $E_k$ and integer $n_k$ such that $\lambda(E_k^c) < \varepsilon/2^k$ and $|f(x) - f_n(x)| < 1/k$ for all $x \in E_k$ and $n \ge n_k$. The set $E = \bigcap E_k$ will have the desired properties.
EXERCISES 10.6 page 472
1. For each $n \in \mathbb{N}$, let $\varphi_n = \sum_{j=1}^{n} \left(m + (j - 1)\tfrac{\beta}{n}\right)\chi_{E_j}$. Then $\varphi_n$ is a simple function on $[a, b]$ with $\int \varphi_n\,d\lambda = S_n(f)$. Furthermore, for each $x \in [a, b]$, $0 \le f(x) - \varphi_n(x) \le \beta/n$. Therefore, $\lim_{n \to \infty} \varphi_n(x) = f(x)$ for all $x \in [a, b]$. Now apply the bounded convergence theorem.
5. By Theorem 10.6.10(b), $\int_F f\,d\lambda = \int_E f\,d\lambda + \int_{F \setminus E} f\,d\lambda \ge \int_E f\,d\lambda$.
7. For each $n \in \mathbb{N}$, let $E_n = \{x : f(x) > 1/n\}$. Then $\bigcup E_n = \{x : f(x) > 0\}$. Use the previous exercise to show $\lambda(E_n) = 0$. Now use Theorem 10.4.5.
12. The function $\varphi_n$ defined in the solution to Exercise 1 satisfies $|f(x) - \varphi_n(x)| \le \beta/n$ for all $x \in [a, b]$. Thus $\{\varphi_n\}$ converges uniformly to $f$ on $[a, b]$.
15. Suppose first that $\varphi = \chi_A$ where $A$ is a measurable subset of $[a, b]$. By Exercise 2, Section 10.4, there exists an open set $U \supset A$ such that $\lambda(U \setminus A) < \varepsilon/2$. Use the set $U$ to show that there exists a finite number of disjoint closed intervals $\{J_n\}_{n=1}^{N}$ such that $V = \bigcup J_n \subset U$ and $\lambda(U \setminus V) < \varepsilon/2$. Let $h = \chi_V$. Then $h$ is a step function on $[a, b]$ and $\{x : h(x) \ne \varphi(x)\} \subset (U \setminus V) \cup (U \setminus A)$. If $\varphi = \sum_{j=1}^{n} a_j \chi_{A_j}$, where the $A_j$ are disjoint measurable subsets of $[a, b]$, approximate each $\chi_{A_j}$ by a step function $h_j$ that agrees with $\chi_{A_j}$ except on a set of measure less than $\varepsilon/n$.
EXERCISES 10.7 page 482
2. (a) Assume first that $A_1$ and $A_2$ are bounded measurable sets. For each $n \in \mathbb{N}$, set $f_n = \min\{f, n\}$. By Theorem 10.6.10(b),
$$\int_{A_1 \cup A_2} f_n\,d\lambda = \int_{A_1} f_n\,d\lambda + \int_{A_2} f_n\,d\lambda.$$
Since each of the sequences $\{\int_{A_i} f_n\}_{n=1}^{\infty}$, $i = 1, 2$, is monotone increasing, they converge either to a finite number, or diverge to $\infty$. In either case,
$$\int_{A_1 \cup A_2} f\,d\lambda = \lim_{n \to \infty}\left(\int_{A_1} f_n\,d\lambda + \int_{A_2} f_n\,d\lambda\right) = \int_{A_1} f\,d\lambda + \int_{A_2} f\,d\lambda.$$
If either $A_1$ or $A_2$ is unbounded, consider the integral of $f$ over $(A_1 \cup A_2) \cap [-n, n]$, and use the above.
4. For
f(x) = x P, x E (0,1), f,(x) = min{f(x), n} =
x'v
n o