Real Variables Alberto Torchinsky University of Indiana, Bloomington
Addison-Wesley Publishing Company, Inc. The Advanced Book Program Redwood City, California · Menlo Park, California · Reading, Massachusetts · New York · Amsterdam · Don Mills, Ontario · Sydney · Bonn · Madrid · Singapore · Tokyo · Bogota · Santiago · San Juan · Wokingham, United Kingdom
Publisher: Allan M. Wylde Production Administrator: Karen L. Garrison Editorial Coordinator: Pearline Randall Electronic Production Consultant: Mona Zeftel Promotions Manager: Celina Gonzales
Library of Congress Cataloging-in-Publication Data Torchinsky, Alberto. Real variables / Alberto Torchinsky. p. cm. Includes index. ISBN 0-201-15675-x : $39.95 1. Functions of real variables. I. Title QA331.5.T58& 1987 515.8-dc19
87-18629 CIP
Copyright ©1988 by Addison-Wesley Publishing Company. All rights reserved. No part of this publication may be reproduced, stored in a retrieval system, or transmitted, in any form or by any means, electronic, mechanical, photocopying, recording, or otherwise, without the prior written permission of the publisher. Published simultaneously in Canada. Printed in the United States of America. This book was typeset in MicroTEX using a Leading Edge Model D computer. Camera-ready output from an Apple Laserwriter Plus Printer. ABCDEFGHIJ-AL-8987 0-201-15675-x
To Massi, Kurosh, and Darius
Author's Foreword
During the academic year 1985-1986 I gave a course on Real Variables at Indiana University. The main source of reference for the course was a set of class notes prepared by the students as we went along; this book is based on those notes. One of the purposes in those lectures was to present to students who are beginning a deeper study of the fairly esoteric subject of Real Variables an overview of how the familiar results covered in Advanced Calculus develop into a rich theory. Motivation is an essential ingredient in this endeavour, as are convincing examples and interesting applications. Now, in teaching a course at this level two facts become quickly apparent, to wit: (i) The background of the students is quite varied, as first year graduate and upper division undergraduate Math students, as well as various science and economics majors, enroll in it, and, (ii) Even those students with a strong background are not entirely at ease with proofs involving either an abstract new concept or an $\varepsilon$-$\delta$ argument. My idea of a course at this level is one that presents to the students a modern introduction to the theory of real variables without subjecting them to undue stress. Although the material is not presented here in a radically different way than in other textbooks, this book offers a conceptually different approach. First, it takes into account, both in placement and content of the topics discussed, the uneven nature of the background of the students. Second, an attempt has been made to motivate the material discussed, and always the most "natural" rather than the most elegant proof of a result is given. We also stress the unity of the subject matter rather than individual results. Third, we go from the particular to the general, discussing each definition and result rather carefully, closer to the way a mathematician first thinks about a new concept.
Finally, students are not "talked down," but rather feel that the issues at hand are addressed in a forthright manner and in a direct language, one they can understand. It is important that readers have no difficulty in following the actual arguments presented and spend their time instead in considering questions such as: What is the role or roles of a given result? What is it good for? What are the important ideas, and which are the secondary ones? What are the basic problems in this area and how are they approached and solved?
In fact, we expect the serious students at this level to learn to ask these questions and this text will serve as a guide to ask them at the appropriate time. How does the text present the material? An important consideration is that the students see the "big picture" rather than isolated theorems, and basic ideas rather than generality are stressed. Each chapter starts with a short reader's guide stating the goals of the chapter. Specific examples are discussed, and general concepts are developed through particular cases. There are 599 problems and questions that are used to motivate the material as well as to round out the development of the subject matter. The reader will be pleasantly surprised to find out that problems are in fact problems, and not further theorems to be proved. Problems are thought-provoking, and there is a mixture of routine to difficult, and concrete to theoretical. Because I wanted this book to be essentially self-contained for those students with a good Advanced Calculus background as well as an elementary knowledge of the theory of metric spaces, the point of departure is an informal discussion of the theory of sets and cardinal numbers in Chapter I, and ordinal numbers and Zorn's Lemma in Chapter II. These topics give the student the opportunity to work with abstract, possibly new, concepts. Chapter III introduces the Riemann-Stieltjes integral and the limitations of the Riemann integral become quickly apparent; $\varepsilon$-$\delta$ proofs are discussed here. At the completion of these chapters the background of the students has been essentially equalized. Chapter IV is the exception that proves the rule. It develops the abstract concept of measure, a particular case of which, the Lebesgue measure on $R^n$, is discussed in Chapter V. Anyone objecting to this treatment can plainly, and almost painlessly, read these chapters in the opposite order.
The construction of the Lebesgue measure is a favorite among the students, as it allows them to discover where measures come from and how they are constructed. In Chapter VI we return to a somewhat abstract setting, although for reasons of simplicity Lusin's theorem is presented in the line where all the difficulties are already apparent. An important feature of this chapter is working with "good" and "bad" sets; this is an indispensable tool in other areas, including the Calderón-Zygmund decomposition of integrable functions discussed in Chapter VIII. The proof of Egorov's theorem illustrates our point of view: It is longer than the usual proof, but it is clear and understandable. In Chapter VII we introduce the notion of the integral and the role of almost everywhere convergence. I am confident that the path that leads to the various convergence theorems is direct and motivational. The material described thus far constitutes a solid first semester of a yearlong course.
Chapter VIII presents new properties of integrable functions, including the Lebesgue Differentiation Theorem. The proof given here makes use of the Hardy-Littlewood maximal function, and is one that most experts agree should have worked its way into the standard treatment of this topic by now. Chapter IX constructs important new examples of measures on the line, the Borel measures. The correspondence between these measures and their distribution functions, a subject that lies at the heart of the theory of Probability, is established in an elementary and computational manner. Chapter X discusses properties of absolutely continuous functions, including the Lebesgue decomposition of functions of bounded variation and the characterization of those functions on the line that may be recovered by integrating their derivatives. The abstract setting of these results is presented in detail in Chapter XI, where the Radon-Nikodym theorem is discussed. The basic theory of the Lebesgue $L^p$ spaces, including duality and the notion of weak convergence, is covered in Chapter XII. Chapter XIII deals with product measures and Fubini's theorem in the following manner: In the first section we discuss the version dealing with Lebesgue integrals in Euclidean space; the second section discusses some important applications, including convolutions and approximate identities; and, finally, the third section presents Fubini's theorem in an abstract setting. This is a concrete example of how to proceed from the particular to the general. However, if preferred, the third and second sections can be covered, and the first section assigned for reading. Chapter XIV deals with normed linear spaces, an abstraction of the notion of the $L^p$ spaces, and the Hahn-Banach theorems. Students are happy to see both the geometric and analytic forms of this result and their applications.
Chapter XV covers the basic principles of Functional Analysis, to wit, the Uniform Boundedness Principle, the Closed Graph Theorem, and the Open Mapping Theorem; each principle is given individual attention. In Chapter XVI we consider those Banach spaces whose norm comes from an inner product, or Hilbert spaces. The discussion of the geometry of Hilbert spaces and the spectral decomposition of compact self-adjoint operators are some of the features of this chapter. Brief historical references concerning the origin of some of the concepts introduced in the text have been made throughout the text, and Chapter XVII presents these remarks in their natural setting, namely, the theory of Fourier series. Finally, Chapter XVIII contains suggestions and comments to some of the problems and questions posed in the book; they are not meant, however, to make the learning of the material effortless. The notations used throughout the book are either standard or else they are explained as they are introduced. "Theorem 3.2" means that the result alluded to appears as the second item in Section 3 of the present chapter, and "Theorem 3.2 in Chapter X" means that it appears as the
second item of the third section in Chapter X. The same convention is used for formulas and problems. A word about where the text fits into the existing literature. It is more advanced than Rudin's book Principles of Mathematical Analysis, a good reference for the material on Advanced Calculus and metric spaces. It is also more abstract than the treatise Measure and Integral by Wheeden and Zygmund. I learned much of the material on integration from Antoni Zygmund, and some of the topics discussed, including the construction of the Lebesgue measure and the outlook on the Euclidean version of Fubini's theorem, have his imprint. Then, there are the classics. They include Natanson's Theory of Functions of a Real Variable, Saks' Theory of the Integral, F. Riesz and Sz.-Nagy's Leçons d'Analyse Fonctionnelle, Halmos' Measure Theory, Hewitt and Stromberg's Real and Abstract Analysis, and Dunford and Schwartz's Linear Operators. Anyone consulting these books will gain the perspective of the masters. Where do we go from here? I am confident that the reading of this book will adequately prepare the student to venture into diverse fields of Mathematics. Specifically, books such as Billingsley's Probability and Measure, Conway's A Course in Functional Analysis, Stein's Singular Integrals and Differentiability Properties of Functions and Zygmund's Trigonometric Series are now within reach.
Acknowledgments
It is always a pleasure to acknowledge the contribution of those who make a project of this nature possible. My friends and colleagues Hari Bercovici and Ron Kerman read the complete manuscript and made valuable suggestions and comments. The opportunity to create this manuscript with the MicroTEX version of TEX was an unexpected pleasure and challenge. Elena Fraboschi and George Springer were my mentors in this endeavour, and I owe them much. Pam Cunningham Pierce contributed with the illustrations.
My largest debt, though, is to the students who attended the course and kept a keen interest in learning throughout the ordeal. Many examples and solutions to the problems are due to them, particularly to Steve Rowe. Steve Blakeman, Nick Kernene, and Shilin Wang were also very helpful. The manuscript was cheerfully typed by Storme Day. The staff at Addison-Wesley handled all my questions efficiently. Mona Zeftel provided the much needed technical assistance, and Allan Wylde was the best publisher this ambitious project could have had.
Contents
Foreword
Chapter I. Cardinal Numbers
  Sets
  Functions and Relations
  Equivalent Sets
  Cardinals
  Problems and Questions
Chapter II. Ordinal Numbers
  Ordered Sets
  Well-ordered Sets and Ordinals
  Applications of Zorn's Lemma
  The Continuum Hypothesis
  Problems and Questions
Chapter III. The Riemann-Stieltjes Integral
  Functions of Bounded Variation
  Existence of the Riemann-Stieltjes Integral
  The Riemann-Stieltjes Integral and Limits
  Problems and Questions
Chapter IV. Abstract Measures
  Algebras and $\sigma$-algebras of Sets
  Additive Set Functions and Measures
  Properties of Measures
  Problems and Questions
Chapter V. The Lebesgue Measure
  Lebesgue Measure on $R^n$
  The Cantor Set
  Problems and Questions
Chapter VI. Measurable Functions
  Elementary Properties of Measurable Functions
  Structure of Measurable Functions
  Sequences of Measurable Functions
  Problems and Questions
Chapter VII. Integration
  The Integral of Nonnegative Functions
  The Integral of Arbitrary Functions
  Riemann and Lebesgue Integrals
  Problems and Questions
Chapter VIII. More About $L^1$
  Metric Structure of $L^1$
  The Lebesgue Differentiation Theorem
  Problems and Questions
Chapter IX. Borel Measures
  Regular Borel Measures
  Distribution Functions
  Problems and Questions
Chapter X. Absolute Continuity
  Vitali's Covering Lemma
  Differentiability of Monotone Functions
  Absolutely Continuous Functions
  Problems and Questions
Chapter XI. Signed Measures
  Absolute Continuity
  The Lebesgue and Radon-Nikodym Theorems
  Problems and Questions
Chapter XII. $L^p$ Spaces
  The Lebesgue $L^p$ Spaces
  Functionals on $L^p$
  Weak Convergence
  Problems and Questions
Chapter XIII. Fubini's Theorem
  Iterated Integrals
  Convolutions and Approximate Identities
  Abstract Fubini Theorem
  Problems and Questions
Chapter XIV. Normed Spaces and Functionals
  Normed Spaces
  The Hahn-Banach Theorem
  Applications
  Problems and Questions
Chapter XV. The Basic Principles
  The Baire Category Theorem
  The Space $B(X, Y)$
  The Uniform Boundedness Principle
  The Open Mapping Theorem
  The Closed Graph Theorem
  Problems and Questions
Chapter XVI. Hilbert Spaces
  The Geometry of Inner Product Spaces
  Projections
  Orthonormal Sets
  Spectral Decomposition of Compact Operators
  Problems and Questions
Chapter XVII. Fourier Series
  The Dirichlet Kernel
  The Fejer Kernel
  Pointwise Convergence
Chapter XVIII. Remarks on Problems and Questions
Index
CHAPTER I
Cardinal Numbers
We open our discussion by introducing, in a naive fashion, the notion of set. We are particularly interested in operating with sets and in the concept of "number of elements" in a set, or cardinal number. We consider various cases of infinite cardinals and do some cardinal arithmetic.
1. SETS
What is a set? According to G. Cantor (1845-1918), who initiated the theory of sets in the last part of the nineteenth century: "A set is a collection into a whole of definite, distinct objects of our intuition or our thought. The objects are called the elements (members) of the set." The origin of the theory of sets, like that of many of the basic notions and results that are covered in this book, can be traced back to the theory of trigonometric and Fourier series. The theory of sets was created by Cantor to address the problem of uniqueness for trigonometric series. We refer to the "whole of distinct objects" in Cantor's definition as the universal set. We denote sets by capital letters $A, \ldots$ and elements by small letters $a, \ldots$, say. The notation $a \in A$, which reads $a$ belongs to $A$, indicates the fact that $a$ is a member of $A$. Most of the sets we consider are of the following form: If $X$ is the universal set, then $A$ is the set of those $x$ in $X$ for which the property $P(x)$ is true. The convenient, and descriptive, notation we adopt in this instance is $A = \{x \in X : P(x)\}$, or plainly $A = \{x : P(x)\}$ or even $A = \{P(x)\}$. $N$ or $Z_+$ is the set of natural numbers $\{1, 2, \ldots\}$, $Z = \{\ldots, -1, 0, 1, \ldots\}$ is the set of integers, $Q = \{r : r = m/n,\ m, n \in Z,\ n \neq 0\}$ is the set of rational numbers, $I$ is the set of irrational numbers and $R$ is the (universal) set of real numbers. $Q_+ = \{r \in Q : r \geq 0\}$ and $Q_-$ denote the sets of
nonnegative and negative rational numbers respectively; similarly for $I_+$, $I_-$, $R_+$ and $R_-$. If $a$ is not a member of $A$ we write $a \notin A$, which reads $a$ does not belong to $A$. The complement $B \setminus A$ of a set $A$ relative to a set $B$ is defined as
$$B \setminus A = \{b \in B : b \notin A\}.$$
We call $X \setminus A$ the complement of $A$. For instance, in the universal set $R$, the complement of $Q$ is $I$ and that of $I$ is $Q$. It is not clear at this point what the complement of the universal set $X$ should be. For this, and other important reasons, we postulate the existence of a particular set. We say that $\emptyset$ is the empty set if $x \in \emptyset$ holds for no element $x$. For instance, for every set $A$, $A \setminus A = \emptyset$. If every element of a set $A$ also belongs to a set $B$ we say that $A$ is a subset of $B$ and we write $A \subseteq B$ or $B \supseteq A$; these expressions read $A$ is contained in or equal to $B$ and $B$ contains or is equal to $A$, respectively. For instance, $Z \subseteq Q \subseteq R$ and $\emptyset \subseteq A$ for any $A$. We say that sets $A$ and $B$ are equal, and we write $A = B$, if $A \subseteq B$ and $B \subseteq A$. Although this definition seems a bit cumbersome, it often represents the only practical way we have to determine whether two sets are equal. To emphasize that $A$ is a proper subset of $B$, i.e., $A \subseteq B$ and $A \neq B$, we write $A \subset B$ or $B \supset A$. Given a set $A$, we let $\mathcal{P}(A)$, or parts of $A$, be the set consisting of all the subsets of $A$, i.e., $\mathcal{P}(A) = \{B : B \subseteq A\}$. For instance, if $A = \{a, b\}$, then $\mathcal{P}(A) = \{\emptyset, \{a\}, \{b\}, \{a, b\}\}$. What operations can we perform with sets, and what new sets are generated? We begin by introducing the union and intersection. Let $A$ and $B$ be any two sets. By the union $A \cup B$ of $A$ and $B$ we mean the set consisting of those elements which belong to either $A$ or $B$. Thus $A \cup B = \{x : x \in A \text{ or } x \in B\}$. By the intersection $A \cap B$ of $A$ and $B$ we mean the set consisting of all elements which belong to both $A$ and $B$, i.e., $A \cap B = \{x : x \in A \text{ and } x \in B\}$. In case $A \cap B = \emptyset$ we say that the sets $A$ and $B$ are disjoint. For instance, $Q \cup I = R$ and $Q \cap I = \emptyset$. How do we operate with more than two sets? A set whose elements are sets is referred to as a collection, a class or a family. Families are denoted by script letters $\mathcal{A}, \ldots$ For the question we posed it often suffices to consider a family $\mathcal{A}$ of indexed sets. More precisely, if $I$ is a nonempty set and $\mathcal{A} = \{A_i : i \in I\}$, then we put
$$\bigcup_{i \in I} A_i = \{x : x \in A_i \text{ for some } i \text{ in } I\}$$
and
$$\bigcap_{i \in I} A_i = \{x : x \in A_i \text{ for all } i \text{ in } I\}.$$
It is quite straightforward to operate with these concepts, cf. 5.1 below. If $\mathcal{A} = \{A_i : 1 \leq i \leq n\}$ is a family of $n$ sets, we define the Cartesian product $\prod_{i=1}^{n} A_i$, or product, of the $A_i$'s as the set of ordered $n$-tuples
$$\prod_{i=1}^{n} A_i = \{(a_1, \ldots, a_n) : a_i \in A_i,\ 1 \leq i \leq n\}.$$
This set is named after Descartes (1596-1650), who introduced the rectangular coordinates for the plane; the analogy of the concepts is clear. A familiar product is $R^n = \{(x_1, \ldots, x_n) : x_i \in R,\ 1 \leq i \leq n\}$. A product of two sets $A$ and $B$, say, is denoted by $A \times B$. A useful application of the notion of product is the following: If $x \notin A \cup B$, then the sets $A \times \{x\}$ and $\{x\} \times B$ look essentially like $A$ and $B$, and yet are disjoint.
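The operations of this section can be tried out on small finite sets. The following is only a sketch in Python; the function names (`power_set`, `union_of_family`, `intersection_of_family`, `disjoint_copies`) are illustrative choices and do not come from the text, and an indexed family is modeled, by assumption, as a dictionary from indices to sets.

```python
from itertools import chain, combinations, product

def power_set(a):
    """Return P(A), the set of all subsets of A, as a set of frozensets."""
    items = list(a)
    subsets = chain.from_iterable(
        combinations(items, k) for k in range(len(items) + 1)
    )
    return {frozenset(s) for s in subsets}

def union_of_family(family):
    """Elements that belong to A_i for some index i."""
    result = set()
    for member in family.values():
        result |= member
    return result

def intersection_of_family(family):
    """Elements that belong to A_i for all indices i (index set nonempty)."""
    members = list(family.values())
    result = set(members[0])
    for member in members[1:]:
        result &= member
    return result

def disjoint_copies(a, b, x):
    """Return A x {x} and {x} x B; these are disjoint when x is in neither set."""
    return set(product(a, [x])), set(product([x], b))

# A family indexed by I = {1, 2, 3} (hypothetical example data).
family = {1: {1, 2, 3}, 2: {2, 3, 4}, 3: {3, 4, 5}}
print(len(power_set({"a", "b"})))
print(sorted(union_of_family(family)))
print(sorted(intersection_of_family(family)))
```

The last helper mirrors the closing remark of the section: even when $A$ and $B$ overlap, the two products share no element as long as the tag $x$ lies outside $A \cup B$.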
2. FUNCTIONS AND RELATIONS
Various fields of human endeavour have to do with relationships that exist between sets of objects. Graphs and formulas, for instance, are devices for describing special relations in a quantitative way. We start by defining a particular kind of relation, namely, a function. The terminology goes back to Leibniz (1646-1716) who used the term primarily to refer to certain kinds of mathematical formulas. The notion of function generally accepted today was first formulated in 1837 by Dirichlet (1805-1859) in a memoir dealing with the convergence of Fourier series. Given two sets $A$ and $B$, say, a function $f$ from $A$ into $B$ is a correspondence which associates with each element $a$ of $A$, in some manner, an element, and only one, $b$ in $B$, which we denote by $f(a)$. We refer to $f$ as a function (or map, mapping, correspondence or transformation) of $A$ into $B$. $A$ is called the domain of $f$ and those elements of $B$ of the form $f(a)$ form a subset of $B$, denoted by $f(A)$, called the range of $f$. Any letter in the English or Greek alphabets, capital or small, may be used to denote a function. The symbol $f : A \to B$ means that $f$ is a function with domain $A$ and range contained in $B$. If $f : A \to B$ and $g : B \to C$, then the mapping $g \circ f : A \to C$ is defined by $g \circ f(a) = g(f(a))$ for $a$ in $A$. The function $g \circ f$ is called the composition of $f$ and $g$. A function $F$ is said to be
an extension of a function $f$, and $f$ a restriction of the function $F$, if the domain of $F$ contains that of $f$ and $F(a) = f(a)$ for every $a$ in the domain of $f$. The restriction of $F$ to a subset $A$ of its domain is denoted by $F|A$. The function $f$ is said to map $A$ onto $B$ if $f(A) = B$; we also say that $f$ is surjective. The function $f$ is said to be a one-to-one mapping of $A$ into $B$, or plainly one-to-one or injective, if $f(a_1) \neq f(a_2)$ whenever $a_1 \neq a_2$ for all $a_1, a_2$ in $A$. Suppose $f : A \to B$ is one-to-one and onto. Then we can define the mapping $g : B \to A$ by means of $g(b) = a$ whenever $f(a) = b$. The function $g$ is called the inverse of $f$ and is denoted by $f^{-1}$. For example, the function $f : (-1, 1) \to R$ given by $f(x) = \tan(\pi x/2)$ is one-to-one and onto, and its inverse $f^{-1} : R \to (-1, 1)$ is $f^{-1}(x) = 2\arctan(x)/\pi$. Although somewhat inconsistent, we conform to tradition and adopt the following notation: If $f : A \to B$ and $C \subseteq B$, the set $\{a \in A : f(a) \in C\}$ is called the inverse image of $C$ by $f$ and is denoted by $f^{-1}(C)$. This set should not be confused with $(f^{-1})(C) = \{a : a = f^{-1}(b),\ b \in C\}$ which is only defined when $f^{-1}$ exists. Two particular functions have a specific name. They are the identity function $1 : A \to A$, $1(a) = a$ for all $a$ in $A$, and the characteristic function $\chi_E$ of a set $E$, i.e., the function defined by the equation $\chi_E(x) = 1$ if $x \in E$ and $0$ otherwise. We often work with families of functions. The collection of all the functions $f : A \to B$ from a set $A$ into a set $B$ is denoted by $B^A$. For example, $R^N$ denotes the family of all real sequences $\{r_1, r_2, \ldots\}$. We visualize a function $f$ from $A$ into $B$ as a particular subset of $A \times B$. Indeed, we think of $f$ as the subset of $A \times B$ consisting of the ordered pairs $(a, f(a))$; in other words there is a natural identification between $f$ and its graph. This notion can be extended considerably. An arbitrary subset $R$ of $A \times B$ is called a relation. To emphasize this correspondence we often write $aRb$ to indicate that $(a, b) \in R$.
In addition to functions, an important instance of relations are the so-called equivalence relations. In this particular case we have $A = B$ and the equivalence relation $R$ satisfies the following three properties: R (reflexivity) $aRa$, all $a$ in $A$, S (symmetry) $aRb$ iff (if and only if) $bRa$, T (transitivity) If $aRb$ and $bRc$, then $aRc$. The equivalence class $\mathcal{R}(a)$ of an element $a \in A$ is the set $\mathcal{R}(a) = \{b \in A : aRb\}$; $A$ is then the disjoint union of these equivalence classes. For instance, let $A$ be the collection of all the straight lines $L$ in $R^2$. Then the relation $L_1 R L_2$ iff $L_1$ and $L_2$ are parallel is an equivalence
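For finite sets the notions just introduced can be checked mechanically. The sketch below assumes a function on a finite domain is stored as a Python dictionary; the helper names are illustrative and not from the text. Note that `inverse_image` makes sense for any function, whereas `inverse` requires the function to be one-to-one, which is the distinction drawn above between the inverse image of a set and the inverse function.

```python
def is_injective(f):
    """One-to-one: distinct elements of the domain have distinct images."""
    return len(set(f.values())) == len(f)

def is_surjective(f, codomain):
    """Onto: the range f(A) is all of the codomain B."""
    return set(f.values()) == set(codomain)

def compose(g, f):
    """Return g o f, defined by (g o f)(a) = g(f(a))."""
    return {a: g[f[a]] for a in f}

def inverse(f):
    """Return the inverse of f, defined on the range f(A); f must be one-to-one."""
    if not is_injective(f):
        raise ValueError("only a one-to-one function has an inverse")
    return {b: a for a, b in f.items()}

def inverse_image(f, c):
    """Return {a in A : f(a) in C}; defined whether or not f has an inverse."""
    return {a for a, b in f.items() if b in c}

def graph(f):
    """The subset of A x B consisting of the ordered pairs (a, f(a))."""
    return set(f.items())

f = {1: "x", 2: "y", 3: "x"}   # not one-to-one, since f(1) = f(3)
g = {"x": 10, "y": 20}
print(compose(g, f))
print(sorted(inverse_image(f, {"x"})))
```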
relation, and the equivalence class $\mathcal{R}(L)$ of any line $L$ consists precisely of all the lines parallel to it.
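The way an equivalence relation splits a set into disjoint classes can be illustrated on a finite collection of lines. In the sketch below, which is an illustration and not part of the text, a non-vertical line $y = sx + t$ is encoded by the pair `(s, t)`; this encoding, the sample data and the function names are all assumptions made for the example.

```python
def equivalence_classes(elements, related):
    """Partition `elements` into classes of the equivalence relation `related`.

    Comparing with a single representative of each class is enough because
    the relation is assumed reflexive, symmetric and transitive.
    """
    classes = []
    for a in elements:
        for cls in classes:
            if related(a, cls[0]):
                cls.append(a)
                break
        else:
            classes.append([a])
    return classes

def parallel(line1, line2):
    """Two non-vertical lines (slope, intercept) are parallel iff slopes agree."""
    return line1[0] == line2[0]

lines = [(1, 0), (1, 2), (2, 0), (0, 5), (2, -1)]
for cls in equivalence_classes(lines, parallel):
    print(cls)
```

Each line lands in exactly one class, so the classes are pairwise disjoint and their union is the whole collection.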
3. EQUIVALENT SETS
Suppose $A$ and $B$ are two sets for which there is a function $f : A \to B$ which is one-to-one and onto. Intuitively, the sets $A$ and $B$ are interchangeable provided we are interested in some property that does not concern the specific nature of their elements. Therefore, in this case we say that $A$ and $B$ are equivalent, with equivalence function $f$, and we write $A \sim B$. It is readily seen that $\sim$ is an equivalence relation among sets. Indeed, $\sim$ verifies the following three properties: R. $A \sim A$, S. If $A \sim B$, then $B \sim A$, T. If $A \sim B$ and $B \sim C$, then $A \sim C$. First, in the case of R, the identity equivalence function will do. As for S, if $f : A \to B$ is an equivalence function, then $f^{-1} : B \to A$ establishes an equivalence between $B$ and $A$. Finally, if $f : A \to B$ and $g : B \to C$ are equivalence functions, so is $g \circ f : A \to C$, cf. 5.10 below. By means of this equivalence relation we are able to sort sets as follows: A finite set is any set that is either empty or equivalent to $\{1, \ldots, n\}$ for some $n \in N$. Any set that is not finite is called infinite. For instance $N$, and any set equivalent to $N$, is infinite. Sets equivalent to $N$ are called countable; it is easy to see why. If $A$ is a countable set and $f : N \to A$ is an equivalence function, then each element $a \in A$ is of the form $a = f(n)$, $n \in N$, and can be identified with $n$. Thus $A$ can be explicitly written as the sequence $(a_1, a_2, \ldots)$, where $a_n = f(n)$, $n \in N$. A set which is either finite or countable is said to be at most countable. An uncountable set is one which is not at most countable. It is not hard to see that there are uncountable sets. Indeed, let $I_0 = [0, 1]$ be the unit interval of the real line; we claim that $I_0$ is uncountable. Suppose not, then $I_0$ can be expressed as $r_1, r_2, \ldots$, say. Dividing $I_0$ into three closed intervals, each of length $1/3$ (they may have common endpoints), it is clear that one of the intervals, $I_1$ say, does not contain $r_1$; if there is more than one interval just choose any.
Next we divide $I_1$ into three closed intervals of equal length and choose a second subinterval, $I_2$ say, which does not contain $r_2$. Proceeding in this fashion we construct a nested sequence $I_n$ of closed intervals, each of which is one-third of the preceding in length and such that $r_n \notin I_n$, all $n$ in $N$. By the well-known nested interval principle, the intersection $\bigcap_{n \in N} I_n$ is not empty
and consists of a single real number $r$, say. Clearly $0 \leq r \leq 1$. Since our assumption is that all the real numbers in $I_0$ are listed in the sequence $r_1, r_2, \ldots$, $r$ must be one of the $r_n$'s. But since by construction $r_n \notin I_n$, then $r \neq r_n$ for all $n$, and we have reached a contradiction. This result is so interesting that it deserves another proof, cf. 5.15 below. It is often helpful to "picture" a proof and once this is achieved to translate this proof into one that can be written out. For instance, let us show that any two closed, bounded intervals are equivalent. If the intervals are $[a, b]$ and $[c, d]$, say, and if $(b - a) < (d - c)$, then an equivalence function can be readily obtained as indicated in Figure 1. In fact, the picture hints that an explicit expression for $f$ might be $f(x) = ((d - c)/(b - a))(x - a) + c$. A similar picture can be used to establish that any two bounded, open intervals in the line are equivalent. Combining this result with the above observation that $(-1, 1) \sim R$, it readily follows that any two open intervals, bounded or not, are equivalent. It is natural to consider whether $[0, 1]$ and $[0, 1)$ are equivalent. This is a slightly more complicated question since a proof by pictures is not easy to come by. Let $A = \{r_1, r_2, \ldots\}$ consist of a decreasing sequence of distinct points in $[0, 1]$ such that $r_1 = 1$ and $\lim_{n \to \infty} r_n = 1/2$. Then the function $f : [0, 1] \to [0, 1)$ given by $f(r) = r$ if $r \notin A$ and $f(r_n) = r_{n+1}$, $r_n \in A$, establishes the desired equivalence. Now, that $[0, 1)$ and $[0, 1]$ are equivalent is not so surprising since $[0, 1/2] \subseteq [0, 1) \subseteq [0, 1]$ and $[0, 1/2] \sim [0, 1]$. The remarkable fact is that a similar result, conjectured by Cantor, is true for arbitrary sets.
Figure 1
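The two equivalence functions discussed before Figure 1 can be written out concretely. The sketch below uses exact rational arithmetic; the particular sequence $r_n = 1/2 + 2^{-n}$ (so that $r_1 = 1$ and $r_n$ decreases to $1/2$) is one admissible choice made here for illustration, not the one prescribed by the text, and the function names are hypothetical.

```python
from fractions import Fraction

def affine_map(a, b, c, d):
    """Return f(x) = ((d - c)/(b - a))(x - a) + c, mapping [a, b] onto [c, d].

    The endpoints are assumed to be integers or Fractions with a < b and c < d.
    """
    slope = Fraction(d - c, b - a)
    def f(x):
        return slope * (Fraction(x) - a) + c
    return f

def shift_map(r):
    """Map [0, 1] one-to-one onto [0, 1), assuming r_n = 1/2 + 2**(-n).

    Points of A = {r_1, r_2, ...} are moved one step along the sequence,
    f(r_n) = r_(n+1); every other point is left fixed.
    """
    r = Fraction(r)
    t = r - Fraction(1, 2)
    # r belongs to A exactly when t = 1/2**n for some n >= 1.
    is_power_of_two = t.denominator & (t.denominator - 1) == 0
    if t > 0 and t.numerator == 1 and t.denominator >= 2 and is_power_of_two:
        return Fraction(1, 2) + t / 2
    return r

f = affine_map(0, 1, 2, 6)
print(f(0), f(Fraction(1, 2)), f(1))
print(shift_map(1), shift_map(Fraction(3, 4)), shift_map(Fraction(1, 3)))
```

Under this choice the point $1$ is sent to $3/4$, $3/4$ to $5/8$, and so on, so the value $1$ is never attained while every point of $[0, 1)$ still is.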
Theorem 3.1 (Schröder-Bernstein). Let $A_0, A_1, A_2$ be distinct sets such that $A_2 \subset A_1 \subset A_0$ and suppose that $A_0 \sim A_2$. Then also $A_0 \sim A_1$.
Proof. Let $f : A_0 \to A_2$ be an equivalence function and consider $f|A_1$, the restriction of $f$ to $A_1$. If we let $A_3 = f(A_1)$, then clearly $(f|A_1) : A_1 \to A_3$ is one-to-one and onto and it establishes an equivalence between $A_1$ and $A_3$. By a proof by pictures it also follows that $A_0 \setminus A_1 \sim A_2 \setminus A_3$ and that $f|(A_0 \setminus A_1)$ is an equivalence function in this case. Next we put $g = f|A_1$ and we repeat the above argument with $A_2$ and $g$ in place of $A_1$ and $f$. That is, let $A_4 = g(A_2)$ and note that $g|A_2 : A_2 \to A_4$ is an equivalence function. It is also readily seen that $g|(A_1 \setminus A_2)$ establishes the equivalence of $A_1 \setminus A_2$ and $A_3 \setminus A_4$. We repeat this procedure and thus obtain a decreasing sequence $\{A_n\}$ of subsets of $A_0$ which satisfies the following properties:
(i) $A_1 \sim A_3 \sim A_5 \sim \cdots$
(ii) $A_2 \sim A_4 \sim A_6 \sim \cdots$
(iii) $A_n \setminus A_{n+1} \sim A_{n+2} \setminus A_{n+3}$, all $n \geq 0$.
Furthermore, note that
$$A_0 = (A_0 \setminus A_1) \cup (A_1 \setminus A_2) \cup (A_2 \setminus A_3) \cup \cdots \cup \Big(\bigcap_{n} A_n\Big) \tag{3.1}$$
and
$$A_1 = (A_1 \setminus A_2) \cup (A_2 \setminus A_3) \cup (A_3 \setminus A_4) \cup \cdots \cup \Big(\bigcap_{n} A_n\Big). \tag{3.2}$$
The equivalence of $A_0$ and $A_1$ follows now readily since the sets on the right-hand side of (3.1) and of (3.2) are pairwise disjoint, the sets located at the odd spots in (3.1) are equivalent to the sets located at the even spots in (3.2) and the remaining sets are the same. ■
This result has many important consequences, and we mention some.
Corollary 3.2. Let $A$, $B$ be arbitrary sets, and let $A_1 \subseteq A$ and $B_1 \subseteq B$ be such that $A_1 \sim B$ and $B_1 \sim A$. Then $A \sim B$.
Proof. Simply observe that by assumption $B_1 \subseteq B \sim A_1 \subseteq A$, and $B_1 \sim A$. Then (a simple variant of) Theorem 3.1 applies with $A_2 \sim B_1$ and $A_0 = A$. ■
An interesting application of Theorem 3.1 is to show that $Q_+$ is countable. Since $N \subseteq Q_+$ it is enough to show that $Q_+$ is equivalent to a subset
of $N$. But this is not hard: If $r = m/n$, $m, n \in N$, is the relatively prime expression of $r \in Q_+$, then put $f(r) = f(m/n) = 2^m 3^n$. It is clear that $f$ is one-to-one, and consequently, it is an equivalence between $Q_+$ and a subset of $N$, as we wanted to show. There are at least two other ways to verify that $Q_+$ is countable. For instance, we may exhibit $Q_+$, including repetitions, as the sequence $1/1, 2/1, 1/2, 1/3, 2/2, 3/1, 4/1, 3/2, 2/3, 1/4, \ldots$, ordered by the increasing magnitude of the sum of the numerator and denominator of each rational number. A proof by pictures leading to the above sequence is also available; we leave it to the reader to set it up. These observations may be cast in a more general setting.
Proposition 3.3. Let $\mathcal{A} = \{A_n\}$ be a family of countable sets. Then $A = \bigcup_{n \in N} A_n$ is also countable.
Proof. List the elements of each $A_n = \{a_{n,1}, a_{n,2}, \ldots\}$ and introduce the mapping $f : A \to N$ given by $f(a_{n,m}) = 2^n 3^m$. Since $f$ is one-to-one, $A$ is at most countable. Also, since $A \supseteq A_1$, say, $A$ is actually countable. The argument can be readily modified to show that a finite, or countable, union of at most countable sets is again at most countable. ■
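Both devices used above, the coding $m/n \mapsto 2^m 3^n$ and the listing of fractions by the sum of numerator and denominator, are easy to experiment with. The following sketch is illustrative only; the function names `code` and `zigzag` are hypothetical, and the alternating direction within each group is inferred from the first ten terms of the sequence displayed in the text.

```python
from fractions import Fraction

def code(r):
    """Send a positive rational m/n in lowest terms to 2**m * 3**n."""
    r = Fraction(r)              # Fraction always stores lowest terms
    return 2 ** r.numerator * 3 ** r.denominator

def zigzag(max_sum):
    """List pairs (m, n), repetitions included, by increasing m + n.

    Within each group the direction alternates, which reproduces the order
    1/1, 2/1, 1/2, 1/3, 2/2, 3/1, 4/1, 3/2, 2/3, 1/4, ...
    """
    pairs = []
    for s in range(2, max_sum + 1):
        numerators = range(1, s)
        if s % 2 == 1:
            numerators = reversed(numerators)
        pairs.extend((m, s - m) for m in numerators)
    return pairs

print(zigzag(5))
print(code(Fraction(1, 2)), code(Fraction(3, 1)))
```

Because the factorization of a natural number into primes is unique, distinct reduced fractions receive distinct codes, which is the one-to-one property the argument relies on.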
4. CARDINALS
As pointed out above, sets which are equivalent cannot be told apart by purely set-theoretic properties. This observation leads to the following definition. Given a set $A$, we associate with it its cardinal number, with the property that any two sets $A$ and $B$ have the same cardinal number, or cardinality, provided that they are equivalent. We denote the cardinal number of $A$ by card $A$ and it is clear that card $A$ = card $B$ whenever $A \sim B$. This definition is somewhat imprecise, but it will do for the applications we have in mind. The cardinal number of the class of sets equivalent to $\emptyset$ is denoted by $0$, that of $\{1, \ldots, n\}$ by $n$, and that of $N$ by $\aleph_0$. Thus $\aleph_0$ is the first infinite cardinal. The cardinal number of the uncountable set $[0, 1]$, or that of $R$ for that matter, is denoted by $c$ (for continuum). Small letters often are used to denote cardinal numbers. The inclusion relation between sets translates into a comparison relation for cardinal numbers. More precisely, given cardinal numbers, or plainly cardinals, $a$ and $b$, we say that $a$ precedes $b$, or that $a$ is less than or equal to $b$, and we write $a \leq b$, if there are sets $A$ and $B$ and a function $f : A \to B$ such that card $A = a$, card $B = b$ and $f$ is one-to-one. In other
words, and with the above notation, $a \le b$ if and only if $A \sim B_1$, where $B_1 \subseteq B$. It is clear that $\aleph_0 \le c$, and that $n \le m$ (in the cardinal sense) iff $n \le m$ (in the usual sense). Inspired by the concept of equivalent sets we say that the cardinals $a$ and $b$ are equal, and we write $a = b$, if $a \le b$ and $b \le a$. We say that $a < b$ if $a \le b$ and $a \ne b$. For instance, $\aleph_0 < c$. Next we develop the arithmetic of cardinal numbers, including the operations of addition, multiplication and exponentiation. We do addition first. Given cardinals $a, b$, we define the sum $a + b$ of $a$ and $b$ as the cardinal number obtained as follows: Let $A, B$ be disjoint sets such that card $A = a$ and card $B = b$. Then put $a + b$ = card $(A \cup B)$. It is not hard to see that addition is commutative (since $A \cup B = B \cup A$) and associative (since $A \cup (B \cup C) = (A \cup B) \cup C$). For example, if $n, m$ are finite cardinals then $n + m$ is, as it should be, $(n + m)$ (let $A = \{1, \ldots, n\}$, $B = \{n+1, \ldots, n+m\}$). On the other hand, $n + \aleph_0 = \aleph_0$ (choose $A = \{1, \ldots, n\}$, $B = \{n+1, \ldots\}$ and note that $A \cap B = \emptyset$ and $A \cup B = N$) and $\aleph_0 + \aleph_0 = \aleph_0$ ($A$ = even natural numbers, $B$ = odd natural numbers). Also $\aleph_0 + c = c$, cf. 5.18 below, and $c + c = c$ ($A = [0,1/2)$, $B = [1/2,1]$). As for the multiplication of cardinal numbers, given cardinals $a$ and $b$, we define the product $ab$ of $a$ and $b$ as the cardinal obtained as follows: Let $A, B$ be sets such that card $A = a$ and card $B = b$. Then put $ab$ = card $(A \times B)$. Multiplication of cardinal numbers is commutative and associative, and distributive with respect to addition, cf. 5.3 below. For example, in the case of finite cardinals $n$ and $m$, the product $nm$ is, as it should be, $(nm)$, i.e., the cardinal of $\{1, \ldots, n, \ldots, 2n, \ldots, nm\}$, and the product $\aleph_0 \aleph_0$ is $\aleph_0$ (put $A = N$, $B = \{1/n: n \in N\}$). Finally we consider exponentiation. Given cardinals $a$ and $b$, we define the cardinal $b^a$ as follows: Let $A, B$ be sets with card $A = a$ and card $B = b$. Then we set $b^a$ = card $B^A$. The usual properties of exponentiation are not hard to check, cf. 5.21, 5.22 below. There is at least one exponential that is readily computed, and it corresponds to the case $b = 2$, since it is not hard to identify $2^A$. More precisely, we have
Proposition 4.1. Given any set $A$, $2^A \sim \mathcal{P}(A)$.
Proof. Let $\psi: 2^A \to \mathcal{P}(A)$ be defined as follows: If $f: A \to \{0,1\}$, then let $\psi(f)$ be the subset of $A$ corresponding to $f^{-1}(\{1\})$, i.e., put $\psi(f) = f^{-1}(\{1\})$. We claim that $\psi$ is an equivalence function. First note that if $\psi(f) = \psi(g)$, then $f^{-1}(\{1\}) = g^{-1}(\{1\})$, and consequently also $f^{-1}(\{0\}) = g^{-1}(\{0\})$ and $f = g$; thus $\psi$ is one-to-one. Next suppose that
$B \in \mathcal{P}(A)$ and let $f = \chi_B$. Then $\psi(f) = f^{-1}(\{1\}) = B$ and $\psi$ is also onto. Thus $\psi$ is an equivalence function. • This result explains why $\mathcal{P}(A)$ is also referred to as the power set of $A$, and it can be used to show that there is no largest cardinal number.
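For a small finite set the equivalence between functions $A \to \{0,1\}$ and subsets of $A$ can be enumerated directly. The sketch below is illustrative only; the names `psi` and `chi` are hypothetical stand-ins for the map of the proof and for the characteristic function, and functions are represented as Python dictionaries.

```python
from itertools import product

def psi(f: dict) -> frozenset:
    """Send f: A -> {0, 1} to the subset of A where f takes the value 1."""
    return frozenset(x for x, value in f.items() if value == 1)

def chi(B: frozenset, A: list) -> dict:
    """Characteristic function of B, viewed as a function on A."""
    return {x: 1 if x in B else 0 for x in A}

A = ["a", "b", "c"]
# All 2**3 functions from A into {0, 1}.
functions = [dict(zip(A, values)) for values in product((0, 1), repeat=len(A))]
subsets = {psi(f) for f in functions}
```

Since `psi` and `chi` undo each other, the eight functions correspond to eight distinct subsets, which is the finite case of the statement.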
Proposition 4.2. For any set $A$, card $A$ < card $2^A$.
Proof. Since all the singletons of $A$ belong to $\mathcal{P}(A)$ it is clear that card $A \le$ card $\mathcal{P}(A)$. Let $\psi$ be a (one-to-one if you wish) map from $A$ into $\mathcal{P}(A)$; we show that $\psi$ cannot be onto. This is not hard; suppose that $\psi$ is onto. Now, for each $x \in A$, $\psi(x)$ is a subset of $A$ and consequently the set $B = \{x \in A: x \notin \psi(x)\}$ is well defined. Since by assumption $\psi$ is onto, there exists $a \in A$ such that $\psi(a) = B$. Now, if $a \in B$, then by the definition of $B$, $a \notin \psi(a) = B$, and this cannot happen. If, on the other hand, $a \notin B$, then also $a \notin \psi(a) = B$ and consequently, by the definition of $B$, $a \in B$, which is also a contradiction. In other words, $\psi$ cannot be onto. • Proposition 4.1 in particular implies that for finite cardinals $n$, $2^n$ is as expected. How about $2^{\aleph_0}$? This requires a new idea. Each real number $r$ in $[0,1]$ can be expressed as
$$r = \sum_{n=1}^{\infty} a_n 2^{-n} = .a_1 a_2 \ldots, \qquad a_n = 0, 1, \quad \text{all } n.$$
This is the so-called dyadic expansion of $r$. A minor inconvenience arises since expansions are not necessarily unique. For instance $1/2 = .011\ldots = .100\ldots$, one dyadic expansion terminating in 0's and one in 1's. But the set of such $r$'s is countable, cf. 5.14 below. In other words, if we consider all dyadic expansions, there are, counting repetitions, $c + \aleph_0 = c$ of them. Furthermore, the set of dyadic expansions is clearly equivalent to the set $A$ of all sequences which assume the values 0 and 1, and this set in turn is equivalent to $2^N$, $2 = \{0,1\}$. Now, by definition, card $A$ = card $2^N = 2^{\aleph_0}$, and by the above remarks card $A = c$. Thus $2^{\aleph_0} = c$. A similar argument allows us to compute $cc$. On the other hand, to compute this product it suffices to note that $cc = 2^{\aleph_0} 2^{\aleph_0} = 2^{2\aleph_0} = 2^{\aleph_0} = c$. One point remains open. Given two cardinals $a$ and $b$, we cannot be sure that they are comparable. In order to answer this question we need a new concept, namely that of an ordered set, which we discuss in the next chapter.
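The non-uniqueness of dyadic expansions mentioned above can be seen with exact rational arithmetic. This is a small sketch under stated assumptions: the helper names `dyadic_digits` and `partial_sum` are hypothetical, and the digit routine always produces the expansion terminating in 0's when two expansions exist.

```python
from fractions import Fraction

def dyadic_digits(r: Fraction, count: int) -> list:
    """First `count` digits a_1, a_2, ... of a dyadic expansion of r in [0, 1)."""
    digits = []
    for _ in range(count):
        r *= 2
        digit = int(r >= 1)   # next binary digit
        digits.append(digit)
        r -= digit
    return digits

def partial_sum(digits) -> Fraction:
    """Value of the finite sum a_1/2 + a_2/4 + ... for the given digits."""
    return sum((Fraction(a, 2 ** n) for n, a in enumerate(digits, start=1)),
               Fraction(0))

half = Fraction(1, 2)
terminating = dyadic_digits(half, 6)        # the expansion .100000
repeating = [0, 1, 1, 1, 1, 1]              # first digits of .011111...
gap = half - partial_sum(repeating)         # shrinks like 2**-6
```

The partial sums of $.0111\ldots$ fall short of $1/2$ by exactly $2^{-k}$ after $k$ digits, so both digit strings represent the same real number in the limit.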
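5. PROBLEMS AND QUESTIONS
5.1 Show that union and intersection are distributive with respect to intersection and union respectively. In other words, show that
$$C \cap \Big(\bigcup_{i \in I} A_i\Big) = \bigcup_{i \in I} (C \cap A_i) \quad \text{and} \quad C \cup \Big(\bigcap_{i \in I} A_i\Big) = \bigcap_{i \in I} (C \cup A_i).$$
In addition, de Morgan's laws also hold, to wit,
$$X \setminus \Big(\bigcup_{i \in I} A_i\Big) = \bigcap_{i \in I} (X \setminus A_i) \quad \text{and} \quad X \setminus \Big(\bigcap_{i \in I} A_i\Big) = \bigcup_{i \in I} (X \setminus A_i).$$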
5.2 Let $\mathcal{A} = \{A_n: n \in N\}$ be a family of sets and let $A = \bigcup_{n \in N} A_n$. Show that there is a family $\mathcal{B}$ consisting of pairwise disjoint sets, $\mathcal{B} = \{B_n: n \in N\}$, such that $B_n \subseteq A_n$ and $A = \bigcup_{n \in N} B_n$.
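5.3 Show that $A \times (B \cup C) = (A \times B) \cup (A \times C)$ and that, in general, $A \cup (B \times C) \ne (A \cup B) \times (A \cup C)$.
5.4 Show that if $A \ne \emptyset$ and $A \times B = A \times C$, then $B = C$.
5.5 Suppose that $B \cap C = \emptyset$ and show that $A^{B \cup C} \sim A^B \times A^C$.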
5.6 Suppose that AB = BA and show that A = B.
5.7 Suppose that $A$ is a set of $n$ elements; how many relations are there in $A \times A$?
5.8 Let $f: A \to B$ and suppose that for all $i \in I$, $A_i \subseteq A$ and $B_i \subseteq B$. Discuss the (inclusion) relations between
and do the same for
Also, what are the inclusion relations between
and between
5.9 Let $f: A \to B$. Show that $f(f^{-1}(B)) \subseteq B$ and $f^{-1}(f(A)) \supseteq A$.
By means of examples show that the inclusions may indeed be proper.
5.10 Show that the composition of one-to-one functions is one-to-one, and that of onto functions is onto.
5.11 Let $f: A \to B$ and $g: B \to A$, and suppose that $g \circ f = 1$ (identity in $A$) and $f \circ g = 1$ (identity in $B$). Show that $f$ and $g$ are one-to-one and onto and that $g = f^{-1}$.
5.12 Suppose that $A \sim B$ and show that for any set $C$, $A^C \sim B^C$.
5.13 Show that if $f: [0,1] \times [0,1] \to [0,1]$ is one-to-one and onto, then it cannot be continuous.
5.14 Show that the set of real numbers in $[0,1]$ which have two decimal expansions (one terminating in 9's and one in 0's) is countable.
5.15 This is a sketch of a proof that $[0,1]$ is uncountable: Any countable listing $.a_{11}a_{12}a_{13}\ldots$, $.a_{21}a_{22}a_{23}\ldots$, $.a_{31}a_{32}a_{33}\ldots$, $\ldots$ of the real numbers in $[0,1]$ cannot be complete. Indeed, put
$$r = .b_1 b_2 b_3 \ldots, \qquad b_n \ne a_{nn}, \quad \text{all } n,$$
and note that the real number $r$ cannot be in the above listing. The actual proof requires some care; 5.14 is relevant here. The method of proof uses a "Cantor diagonal selection process."
5.16 Prove the following restatement of the corollary to the Schröder-Bernstein theorem: If $a$ and $b$ are cardinal numbers such that $a \le b$ and $b \le a$, then $a = b$. This result does not require the Axiom of Choice as do the results of Chapter II, but it merely asserts that both $a < b$ and $b < a$ cannot occur.
5.17 Show that if $A$ is an infinite set, then $A$ contains a countable subset.
5.18 Prove these two corollaries to 5.17: (i) A set is infinite iff it is equivalent to a proper subset, and, (ii) If $a$ is an infinite cardinal, then $a + \aleph_0 = a$.
5.19 Suppose $a_1$, $a_2$ are infinite cardinals, and that that for every cardinal $b$, $a_1 + b < a_2 + b$?
5.20 If $a$ and $b$ are cardinal numbers so that follow that $b = c$?
a
al
O if r ~ o.
These are called the positive and negative parts of $r$ and satisfy the relations $r^+, r^- \ge 0$, $r = r^+ - r^-$, and $|r| = r^+ + r^-$.
Show that if $f$ is BV on $I = [a,b]$, and if $f' \in \mathcal{R}(I)$ and $f(x) = \int_a^x f'(t)\,dt$ for $x \in I$ (this condition is not redundant), then for $a \le x \le b$, we have
$$P(x) = \int_a^x (f'(t))^+\,dt, \qquad N(x) = \int_a^x (f'(t))^-\,dt, \qquad \text{and} \qquad V(x) = \int_a^x |f'(t)|\,dt.$$
4.17 Assume that $g$, $f$ are bounded functions defined on $I = [a,b]$ which are discontinuous from the right at $x \in (a,b)$. Show that $g \notin \mathcal{R}(f, I)$.
4.18 Assume that $f$ is a nondecreasing real-valued function defined on an interval $I$, and that $g \in \mathcal{R}(f, I)$. Show that $g^2 \in \mathcal{R}(f, I)$. Show that the converse is not true, i.e., there are functions $f, g$ such that $g^2 \in \mathcal{R}(f, I)$ but $g \notin \mathcal{R}(f, I)$.
4.19 Assume that $f$ is a nondecreasing real-valued function defined on an interval $I$ and that $g, h \in \mathcal{R}(f, I)$. Show that $gh \in \mathcal{R}(f, I)$.
4.20 Suppose that $g \in \mathcal{R}(f, I)$, and that $f$ has a bounded derivative on $I = [a,b]$. Show that $gf' \in \mathcal{R}(I)$ and
$$\int_a^b g\,df = \int_a^b g(x) f'(x)\,dx.$$
4.21 Let $f$ be BV on $I = [a,b]$, and let, as usual, $V$ denote its variation on $I$, $V(a) = 0$. Prove that if $g$ is bounded on $I$ and $g \in \mathcal{R}(f, I)$, then $g \in \mathcal{R}(V, I)$.
4.22 Suppose $f$ is BV on $I = [a,b]$ and the bounded function $g \in \mathcal{R}(f, I)$. For $x \in I$ put $G(x) = \int_a^x g\,df$, and show that $G$ is BV on $I$ and continuous at those points of $I$ where $f$ is continuous.
4.23 Let $f$ be a nondecreasing bounded function on $I = [a,b]$ and let $g \in \mathcal{R}(f, I)$, $m \le g(x) \le M$ for all $x \in I$. Show that there is a real number $c$, $m \le c \le M$, so that
$$\int_a^b g\,df = c\,(f(b) - f(a)).$$
4.24 Let $f, f_1$ be nondecreasing functions defined on an interval $I = [a,b]$ of the line with the property that $f(a) = f_1(a)$ and
$$\int_a^b g\,df = \int_a^b g\,df_1, \quad \text{all } g \text{ continuous on } I.$$
Prove that if $x \in I$ is a point of continuity of both $f$ and $f_1$, then $f(x) = f_1(x)$.
4.25 Let $f_1, f_2$ be two real-valued nondecreasing functions defined on $I = [a,b]$ and suppose there is a value $c \in R$ such that the set $D = \{x \in I: f_1(x) = f_2(x) + c\}$ is dense in $I$. Show that
$$\int_a^b g\,df_1 = \int_a^b g\,df_2, \quad \text{all } g \text{ continuous on } I.$$
4.26 Let $I = [0,1]$ and suppose $f$ is a BV function defined on $I$. Let $h$ be the function defined on $I$ as follows: $h(0) = 0$, $h(x) = f(x+0) - f(0)$ if $0 < x < 1$, and $h(1) = f(1) - f(0)$. Show that $h$ is BV on $I$, and that for each continuous function $g$ we have $\int_0^1 g\,df = \int_0^1 g\,dh$.
4.27 Let $f$ be a continuous function defined on $I = [a,b]$ and suppose that $g$ is nondecreasing there. Show that there is a point $x_0 \in I$ such that
$$\int_a^b g\,df = g(a) \int_a^{x_0} df + g(b) \int_{x_0}^b df.$$
4.28 (Change of variable) Let $f, g$ be bounded on $I = [a,b]$ and suppose that $g \in \mathcal{R}(f, I)$. Furthermore, assume there are an interval $J = [c,d]$ and a continuous, strictly monotone function $\phi$ such that $I = \phi(J)$, $\phi(c) = a$, $\phi(d) = b$. Show that the functions $F(x) = f(\phi(x))$ and $G(x) = g(\phi(x))$ are well-defined on $J$, $G \in \mathcal{R}(F, J)$, and that
$$\int_a^b g\,df = \int_c^d G\,dF.$$
4.29 Let $\{f_n\}$ be a sequence of BV functions on $I = [a,b]$ and suppose there exists a BV function $f$ defined on $I$ such that the variation $V(f - f_n; a, b)$ tends to 0 as $n \to \infty$. Assume also that $f_n(a) = f(a) = 0$ for each $n = 1, 2, \ldots$ If $g$ is continuous on $I$, prove that $g \in \mathcal{R}(f, I)$ and
$$\lim_{n \to \infty} \int_a^b g\,df_n = \int_a^b g\,df.$$
CHAPTER IV
Abstract Measures
In this chapter we study the notions of measure and of sets of "content" zero. These concepts are essential to measure the level sets of the new class of functions to be integrated and in the characterization of Riemann-Stieltjes integrable functions. A successful approach to these problems requires that we operate freely with sets, including taking limits. This we achieve with the introduction of algebras and $\sigma$-algebras of sets.
1. ALGEBRAS AND $\sigma$-ALGEBRAS OF SETS
A class $\mathcal{A}$ of subsets of a (universal) set $X$ is called an algebra of sets, or plainly an algebra, provided the following three properties hold: (i) $\mathcal{A}$ is nonempty. (ii) If $E \in \mathcal{A}$, then $X \setminus E \in \mathcal{A}$. (iii) If $\{E_k\}_{k=1}^n \subseteq \mathcal{A}$, then $\bigcup_{k=1}^n E_k \in \mathcal{A}$. Some sets of an algebra $\mathcal{A}$ are easily identified, namely $\emptyset$ and $X$. In fact, $\mathcal{A} = \{\emptyset, X\}$ is the most economical algebra. On the other hand, $\mathcal{A} = \mathcal{P}(X)$ is also an algebra. Another interesting example is $\mathcal{E} = \{E \subseteq R: E$ can be written as a finite pairwise disjoint union of half-open intervals $(a,b]$, with $a, b$ in $R\}$. Also, it is not hard to check that if $\{\mathcal{A}_i\}_{i \in I}$ is a collection of algebras, then $\mathcal{A} = \bigcap_{i \in I} \mathcal{A}_i$ is an algebra. If $\mathcal{A}$ is an algebra of subsets of $X$ and $E \subseteq X$, then the family $\mathcal{A}_E = \{E \cap A: A \in \mathcal{A}\}$ is an algebra of subsets of $E$. What operations can we perform with the sets of an algebra $\mathcal{A}$ and still remain in $\mathcal{A}$?
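When $X$ is finite, properties (i) through (iii) can be checked mechanically. The sketch below is an illustration only, with hypothetical helper names `is_algebra` and `trace`; it assumes that sets are given as Python sets and relies on the fact that closure under unions of two sets yields closure under all finite unions by induction.

```python
from itertools import combinations

def is_algebra(family, X) -> bool:
    """Check properties (i)-(iii) for a family of subsets of a finite set X."""
    X = frozenset(X)
    family = {frozenset(E) for E in family}
    if not family:                                   # (i) nonempty
        return False
    if any(X - E not in family for E in family):     # (ii) complements
        return False
    # (iii) pairwise unions suffice for all finite unions
    return all(E | F in family for E, F in combinations(family, 2))

def trace(family, E):
    """The family of all intersections of E with members of `family`."""
    E = frozenset(E)
    return {E & frozenset(A) for A in family}

X = {1, 2, 3, 4}
A = [set(), {1, 2}, {3, 4}, X]
not_closed = [set(), {1}, X]     # the complement {2, 3, 4} is missing
```

Here `is_algebra(A, X)` holds, `not_closed` fails property (ii), and the trace of `A` on `{1, 3}` is again an algebra of subsets of `{1, 3}`.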
Proposition 1.1. Suppose $\mathcal{A}$ is an algebra of sets, and $E_1, E_2 \in \mathcal{A}$. Then $E_1 \cap E_2$ and $E_1 \setminus E_2$ belong to $\mathcal{A}$.
Proof. Since by 5.1 in Chapter I
$$X \setminus (E_1 \cap E_2) = (X \setminus E_1) \cup (X \setminus E_2), \tag{1.1}$$
by (ii) and (iii) the set on the right-hand side of (1.1) is in $\mathcal{A}$ and consequently, the complement of $E_1 \cap E_2$ belongs to $\mathcal{A}$. By (ii) again, $E_1 \cap E_2 \in \mathcal{A}$. Moreover, since $E_1 \setminus E_2 = E_1 \cap (X \setminus E_2)$, by (ii) and the first part of the proof, we have $E_1 \setminus E_2 \in \mathcal{A}$. • In applications it is often convenient to replace (iii) by the seemingly weaker condition that $\mathcal{A}$ be closed under the union of pairwise disjoint sets, namely: (iii') If $\{E_k\}_{k=1}^n$ is a collection of pairwise disjoint sets in $\mathcal{A}$, then $\bigcup_{k=1}^n E_k \in \mathcal{A}$. However, as an argument using 5.2 in Chapter I and Proposition 1.1 readily shows, (iii) and (iii') are actually equivalent. We consider the taking of limits next. Given a sequence $\{A_n\}$, we define the sets $\limsup A_n = \{x: x$ belongs to infinitely many $A_n$'s$\}$ and $\liminf A_n = \{x: x$ belongs to all but finitely many $A_n$'s$\}$. It is not hard to see that
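$$\limsup A_n = \bigcap_{m=1}^{\infty} \Big(\bigcup_{n=m}^{\infty} A_n\Big), \tag{1.2}$$
and
$$\liminf A_n = \bigcup_{m=1}^{\infty} \Big(\bigcap_{n=m}^{\infty} A_n\Big). \tag{1.3}$$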
For instance, if $A_n = [0,1]$, $n$ odd, and $A_n = [1,2]$, $n$ even, then $\liminf A_n = \{1\}$, and $\limsup A_n = [0,2]$. When the limits are equal we say that the sequence $\{A_n\}$ converges and the common value is denoted by $\lim A_n$. From the expressions for the limits it is apparent that limiting operations are not necessarily closed in an algebra of sets; we are thus led to the concept of $\sigma$-algebra. We say that an algebra $\mathcal{A}$ of subsets of $X$ is a $\sigma$-algebra of sets, or plainly a $\sigma$-algebra, if it satisfies the additional property (iv) If $\{E_k\}_{k=1}^{\infty} \subseteq \mathcal{A}$, then $\bigcup_{k=1}^{\infty} E_k \in \mathcal{A}$. As before, (iv) is equivalent to the condition obtained by requiring that the $E_k$'s be pairwise disjoint.
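A discrete analogue of the alternating example can be computed exactly. The sketch below is an illustration under an added assumption: the sequence is periodic, so every tail union in (1.2) equals the union over one period and every tail intersection in (1.3) equals the intersection over one period. The names `lim_sup` and `lim_inf` are hypothetical.

```python
from functools import reduce

def lim_sup(period):
    """lim sup of the periodic sequence A_1, ..., A_p, A_1, ..., A_p, ..."""
    # Each tail contains a full period, so every tail union is the same set.
    return reduce(frozenset.union, map(frozenset, period))

def lim_inf(period):
    """lim inf of the same periodic sequence."""
    # Each tail intersection likewise reduces to one period.
    return reduce(frozenset.intersection, map(frozenset, period))

# Integer stand-ins for [0, 1] (n odd) and [1, 2] (n even).
period = [{0, 1}, {1, 2}]
upper = lim_sup(period)
lower = lim_inf(period)
```

The two limits differ here, so this sequence does not converge, in agreement with the interval example in the text.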
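$\mathcal{P}(X)$ is a $\sigma$-algebra, and the algebra $\mathcal{E}$ introduced above is not. Also, if $\{\mathcal{A}_i\}_{i \in I}$ is a family of $\sigma$-algebras, then $\mathcal{A} = \bigcap_{i \in I} \mathcal{A}_i$ is a $\sigma$-algebra as well. If $\mathcal{A}$ is a $\sigma$-algebra of subsets of $X$ and $E \subseteq X$, the collection $\mathcal{A}_E = \{A \cap E: A \in \mathcal{A}\}$ is a $\sigma$-algebra of subsets of $E$. As for the consideration of the limits we have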
Proposition 1.2. Suppose $\{A_n\}$ is a sequence of sets of a $\sigma$-algebra $\mathcal{A}$. Then $\liminf A_n$ and $\limsup A_n$ belong to $\mathcal{A}$.
Proof. Since $A_n = X \setminus (X \setminus A_n)$, by the relations 5.1 in Chapter I we have that $\limsup A_n$ equals $\bigcap_{m=1}^{\infty} \big(\bigcup_{n=m}^{\infty} (X \setminus (X \setminus A_n))\big) = \bigcap_{m=1}^{\infty} \big(X \setminus \bigcap_{n=m}^{\infty} (X \setminus A_n)\big) = X \setminus \bigcup_{m=1}^{\infty} \big(\bigcap_{n=m}^{\infty} (X \setminus A_n)\big)$, i.e., $\limsup A_n = X \setminus \liminf (X \setminus A_n)$. Thus, if we prove the conclusion for the $\limsup A_n$, it will follow for the $\liminf A_n$, and vice versa. Now, since $\mathcal{A}$ is a $\sigma$-algebra, $B_m = \bigcup_{n=m}^{\infty} A_n \in \mathcal{A}$ for all $m$, and by the countable version of Proposition 1.1 (which holds for $\sigma$-algebras and which is proved in a similar fashion), $\limsup A_n = \bigcap_{m=1}^{\infty} B_m \in \mathcal{A}$, and we are done. • That the notion of $\sigma$-algebra is the natural one to deal with limits is also expressed by
Proposition 1.3. Suppose $\mathcal{A}$ is an algebra. Then $\mathcal{A}$ is a $\sigma$-algebra iff for every sequence $\{A_k\} \subseteq \mathcal{A}$, $\limsup A_k \in \mathcal{A}$.
Proof. Since Proposition 1.2 gives the necessity of the condition we only do the sufficiency. We must only show that property (iv) holds. Let $\{A_k\} \subseteq \mathcal{A}$, and set $B_n = \bigcup_{k=1}^{n} A_k$. Since $\mathcal{A}$ is an algebra, $B_n \in \mathcal{A}$ for all $n$, and consequently, by assumption, $\limsup B_n \in \mathcal{A}$. But every $x$ in $\bigcup_{k=1}^{\infty} A_k$ belongs to infinitely many $B_n$'s (in fact, if $x \in A_k$, then $x \in B_n$ for all $n \ge k$), so $\bigcup_{k=1}^{\infty} A_k = \limsup B_n \in \mathcal{A}$, and we have finished. • Next we consider the following question: Given a family $\mathcal{C}$ of subsets of $X$, what is the smallest family of subsets of $X$ that contains the limits of all sequences of sets in $\mathcal{C}$? Or equivalently, which is the smallest $\sigma$-algebra of subsets of $X$ that contains $\mathcal{C}$? If $\mathcal{C}$ is a $\sigma$-algebra, then the answer is $\mathcal{C}$. Otherwise, let $\mathcal{F}$ be the family of all the $\sigma$-algebras of subsets of $X$ which contain $\mathcal{C}$. Since $\mathcal{P}(X) \in \mathcal{F}$,
$\mathcal{F} \ne \emptyset$. As observed above, the intersection of an arbitrary family of $\sigma$-algebras is again a $\sigma$-algebra, and the intersection of all the $\sigma$-algebras in $\mathcal{F}$ is the smallest $\sigma$-algebra that contains $\mathcal{C}$. This $\sigma$-algebra is called the $\sigma$-algebra generated by $\mathcal{C}$ and it is denoted by $\mathcal{S}(\mathcal{C})$. For instance, if $\mathcal{C}$ is the family of all the singletons $\{x\}$ of $X$, then $\mathcal{S}(\mathcal{C}) = \{E \subseteq X:$ either $E$ is at most countable, or else $X \setminus E$ is at most countable$\}$. Of course, if $X$ is at most countable, then $\mathcal{S}(\mathcal{C}) = \mathcal{P}(X)$. There are two other examples which are useful in applications and we discuss them next.
Example 1.4. Let $\mathcal{A}_1$ be a $\sigma$-algebra of subsets of $X_1$ and $\mathcal{A}_2$ a $\sigma$-algebra of subsets of $X_2$. We are interested in constructing a $\sigma$-algebra of subsets of $X_1 \times X_2$ in terms of $\mathcal{A}_1$ and $\mathcal{A}_2$. Our first candidate is the family $\mathcal{C} = \{E_1 \times E_2: E_1 \in \mathcal{A}_1$ and $E_2 \in \mathcal{A}_2\}$, but it is not hard to see that $\mathcal{C}$ is not necessarily closed under complementation, and consequently, $\mathcal{C}$ is not even an algebra. Therefore we define the product $\sigma$-algebra $\mathcal{A}_1 \times \mathcal{A}_2 = \mathcal{S}(\mathcal{C})$. This is the natural thing to do, since $\mathcal{S}(\mathcal{C})$ is the smallest $\sigma$-algebra containing all the "rectangles" $E_1 \times E_2$. In some cases it is possible to give a concrete description of the product $\sigma$-algebra. For instance, if $\mathcal{A}_1 = \mathcal{S}(\mathcal{C}_1)$ and $\mathcal{A}_2 = \mathcal{S}(\mathcal{C}_2)$, then $\mathcal{A}_1 \times \mathcal{A}_2 = \mathcal{S}(\mathcal{C}_1 \times \mathcal{C}_2)$, cf. 4.22 below.
Example 1.5. Suppose $X = R^n$ is the Euclidean $n$-dimensional space endowed with the usual topology, and let $\mathcal{O}$ denote the family of open sets of $R^n$. Then $\mathcal{S}(\mathcal{O})$ is called the $\sigma$-algebra of Borel subsets of $R^n$ and it is denoted by $\mathcal{B}_n$. There is yet a simpler way to generate $\mathcal{B}_n$. When $n = 1$, let $\mathcal{I}$ denote the collection of all open intervals of $R$. Since every open set is a disjoint, at most countable union of intervals in $\mathcal{I}$, it is clear that also $\mathcal{S}(\mathcal{I}) = \mathcal{B}_1$. In fact, since each interval $(a,b]$ of $\mathcal{E}$ can be expressed as the intersection
$$(a,b] = \bigcap_{n=1}^{\infty} (a, b + 1/n),$$
it also follows that $\mathcal{S}(\mathcal{E}) = \mathcal{B}_1$. When $n > 1$ we use the fact that every open set can be written as the countable union of nonoverlapping closed cubes, i.e., if the cubes intersect it is only along the faces, cf. 4.18 below. Thus, if $\mathcal{C}$ denotes now the family of all the closed cubes in $R^n$, we also have $\mathcal{S}(\mathcal{C}) = \mathcal{B}_n$. Finally, by 4.21 below, $\mathcal{B}_n \times \mathcal{B}_m = \mathcal{B}_{n+m}$.
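On a finite set the generated $\sigma$-algebra can be found by brute force, which gives a concrete feel for the abstract definition as an intersection. The sketch below is only an illustration and assumes $X$ is finite, in which case closing under complements and unions of two sets already produces a $\sigma$-algebra; the function name `generated_sigma_algebra` is hypothetical.

```python
def generated_sigma_algebra(C, X):
    """Smallest family containing C that is closed under complements and unions.

    Assumes X is finite, so finite unions are all that is needed.
    """
    X = frozenset(X)
    family = {frozenset(), X} | {frozenset(E) for E in C}
    while True:
        new = {X - E for E in family}                       # complements
        new |= {E | F for E in family for F in family}      # pairwise unions
        if new <= family:                                   # nothing new: closed
            return family
        family |= new

X = {1, 2, 3, 4}
from_one_set = generated_sigma_algebra([{1}], X)
from_two_sets = generated_sigma_algebra([{1}, {2}], X)
from_singletons = generated_sigma_algebra([{x} for x in X], X)
```

With all singletons as generators the result is the full power set, matching the remark that the generated $\sigma$-algebra is $\mathcal{P}(X)$ when $X$ is at most countable.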
2. ADDITIVE SET FUNCTIONS AND MEASURES
A measure on $R^n$ is a natural generalization of such elementary notions as the length of a line segment, the area of a rectangle and the volume of a parallelepiped. In 1898, Borel formulated the following four postulates for defining measures of sets: (i) A measure is always nonnegative. (ii) The measure of the union of a finite number of pairwise disjoint sets is equal to the sum of their measures. (iii) The measure of the complement of a set relative to another equals the difference of their measures. (iv) Every set whose measure is not 0 is uncountable. Based on these postulates we introduce the notion of an additive set function. Given a set $X$ and an algebra $\mathcal{A}$ of subsets of $X$, a set function $\psi$ on $\mathcal{A}$ is a function which assigns to each set of $\mathcal{A}$ a real value, or $\pm\infty$. To avoid technical difficulties we assume that if $\psi$ takes infinite values, they are all of the same sign. A set function $\psi$ is said to be additive provided the following property holds: If $\{E_k\}_{k=1}^n \subseteq \mathcal{A}$ and the $E_k$'s are pairwise disjoint, then
$$\psi\Big(\bigcup_{k=1}^n E_k\Big) = \sum_{k=1}^n \psi(E_k). \tag{2.1}$$
This property corresponds to Borel's postulate (ii). Because $\psi$ only takes infinite values of one sign, the right-hand side of (2.1) always makes sense under the usual arithmetic rules:
$$r + \infty = \infty, \quad r + (-\infty) = -\infty, \quad \infty + \infty = \infty, \quad (-\infty) + (-\infty) = -\infty.$$
These are two examples of additive set functions.
Example 2.1. Let $X$ be an infinite set and let $\mathcal{A} = \{E \subseteq X:$ either $E$ is finite or $X \setminus E$ is finite$\}$. Then $\mathcal{A}$ is an algebra and the mapping $\psi: \mathcal{A} \to [0,\infty]$ given by $\psi(E) = 0$ if $E$ is finite and $\psi(E) = \infty$ if $X \setminus E$ is finite, is an additive set function.
Example 2.2. Let $X = (0,1]$, put $\mathcal{E} = \{E \subseteq X: E$ can be written as a finite disjoint union of half-open intervals $(a,b]\}$, and let $f: X \to R$ be nondecreasing. Then $\mathcal{E}$ is an algebra of sets and $\psi: \mathcal{E} \to [f(0+), f(1)]$ given by $\psi((a,b]) = f(b) - f(a)$ and extended additively otherwise, is an additive set function; one must check, of course, that the value $\psi(E)$ does not depend on the way in which $E$ is represented as a disjoint union of
half-open intervals. There is more to this example than meets the eye, and we will return to it in Chapter IX. Additive set functions have the following property: If $E_1, E_2 \in \mathcal{A}$, $E_1 \subseteq E_2$ and $\psi(E_1)$ is a finite value, then Borel's postulate (iii) holds, to wit
$$\psi(E_2 \setminus E_1) = \psi(E_2) - \psi(E_1). \tag{2.2}$$
Indeed, by (2.1) we have that $\psi(E_2) = \psi(E_1) + \psi(E_2 \setminus E_1)$. Now, $\psi(E_2)$ is either a finite value or not. In the former case we subtract $\psi(E_1)$ from both sides of the above equality and (2.2) follows; in the latter case we observe that $\psi(E_2 \setminus E_1)$ is infinite of the same sign as $\psi(E_2)$ and (2.2) still holds. An immediate consequence of (2.2) is that $\psi(\emptyset) = 0$. As a matter of fact, an additive set function $\psi$ is either identically infinite, or else $\psi(\emptyset) = 0$. From this point on we consider nonnegative set functions (Borel's postulate (i)), but we will have more to say concerning "signed" set functions, cf. 4.8 - 4.12 below and Chapter XI. How do additive set functions behave with respect to limits? For instance, if in Example 2.1 the set $X = \{x_1, x_2, \ldots\}$ is countable and we put $E_n = \{x_1, \ldots, x_n\}$, $n \ge 1$, then it readily follows that $\psi(\lim E_n) = \psi(X) = \infty \ne \lim \psi(E_n) = 0$. To deal with this inconvenience we restrict the domain of an additive set function to a $\sigma$-algebra and require an additional compatibility condition, the $\sigma$-additivity. More precisely, given a set $X$ and a $\sigma$-algebra $\mathcal{M}$ of subsets of $X$, we say that a set function $\mu$ defined on $\mathcal{M}$ is a measure provided the following three properties hold:
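(i) $\mu: \mathcal{M} \to [0,\infty]$.
(ii) $\mu(\emptyset) = 0$.
(iii) If $\{E_k\}_{k=1}^{\infty} \subseteq \mathcal{M}$ is a sequence of pairwise disjoint sets, then
$$\mu\Big(\bigcup_{k=1}^{\infty} E_k\Big) = \sum_{k=1}^{\infty} \mu(E_k).$$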
Condition (ii) is assumed in order to exclude the possibility that $\mu$ is identically $\infty$. To emphasize the interrelation among these objects we say that $\mu$ is a measure on $(X, \mathcal{M})$, or that the triplet $(X, \mathcal{M}, \mu)$ is a measure space. The $\sigma$-algebra $\mathcal{M}$ is called the family of $\mu$-measurable, or plainly measurable, sets. A measure $\mu$ is said to be finite if $\mu(X) < \infty$. In this case, by (2.2), it follows that for every measurable set $E$, $\mu(E) \le \mu(X)$, and $\mu$ only takes
finite values. When $\mu$ is a finite measure we may rescale, i.e., consider $\mu_1(E) = \mu(E)/\mu(X)$, $E \in \mathcal{M}$, instead, and assume that $\mu(X) = 1$; these measures are called probability measures. The measure space $(X, \mathcal{M}, \mu)$ is said to be $\sigma$-finite if $X$ is the countable union of measurable sets, each of finite $\mu$ measure. Informally, we also say that $\mu$ is $\sigma$-finite. Some examples will clarify these concepts. Let $\mathcal{M}$ be the $\sigma$-algebra of subsets of an uncountable set $X$ generated by the singletons of $X$. Then the set function $\psi(E) = 0$ when $E$ is finite, and $\psi(E) = \infty$ otherwise, is an additive set function which is not a measure. On the other hand, the set functions $\mu(E)$ = number of elements of $E$ when $E$ is finite and $\mu(E) = \infty$ when $E$ is measurable and infinite, and $\nu(E) = 0$ when $E$ is at most countable and $\nu(E) = 1$ when $X \setminus E$ is at most countable, are measures. $\mu$ is not $\sigma$-finite and $\nu$ is a probability measure. Also, if $X$ is countable, the measure $\mu$ on $(X, \mathcal{P}(X))$ given by $\mu(E)$ = number of elements of $E$ if $E$ is finite and $\mu(E) = \infty$ otherwise, is $\sigma$-finite. Next suppose that $X$ is a nonempty set and that $\mathcal{M} = \mathcal{P}(X)$. Let $f: X \to [0,\infty]$, and for $E \in \mathcal{M}$ put
$$\mu(E) = \sum_{x \in E} f(x).$$
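The set function just defined is easy to experiment with when $X$ is finite. The following sketch is illustrative only: the weight table `weights` plays the role of $f$, it is restricted to finite weights (the text also allows the value $\infty$), and all names are hypothetical.

```python
from fractions import Fraction

# Hypothetical weight function f on the three-point set X = {a, b, c}.
weights = {"a": Fraction(1, 2), "b": Fraction(1, 4), "c": Fraction(1, 4)}

def mu(E):
    """mu(E) = sum of f(x) over x in E, for a finite subset E of X."""
    return sum((weights[x] for x in E), Fraction(0))

X = set(weights)
E1, E2 = {"a", "b"}, {"c"}          # two disjoint sets with union X
total = mu(X)
```

Because the weights add up to 1, this particular choice gives a probability measure on the three-point set, and additivity on the disjoint pair `E1`, `E2` can be read off directly.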
As usual the sum is defined as
$$\sum_{x \in E} f(x) = \sup \sum_{k=1}^{n} f(x_k),$$
where the sup is taken over all finite subsets $\{x_1, \ldots, x_n\}$ of $E$. To verify that $(X, \mathcal{M}, \mu)$ is a measure space, the only step that offers any difficulty is the $\sigma$-additivity of $\mu$; we do this next. Let $\{E_k\}_{k=1}^{\infty}$ be a sequence of pairwise disjoint measurable sets, and let $E$ denote its union. Given a finite subset $\{x_1, \ldots, x_n\}$ of $E$, suppose that $\{x_1, \ldots, x_n\} \subseteq \bigcup_{i=1}^{m} E_{k_i}$, $m \le n$, and note that
$$\sum_{k=1}^{n} f(x_k) \le \sum_{i=1}^{m} \mu(E_{k_i}) \le \sum_{k=1}^{\infty} \mu(E_k). \tag{2.3}$$
Thus, taking the supremum of the left-hand side of (2.3) over all finite subsets of $E$, we get that
$$\mu(E) \le \sum_{k=1}^{\infty} \mu(E_k). \tag{2.4}$$
We show the opposite inequality next. If $\mu(E_k) = \infty$ for some $k$, since $\mu(E) = \mu(E \setminus E_k) + \mu(E_k) \ge \mu(E_k)$, $\mu(E)$ is also infinite and there is nothing to prove. Otherwise, given $\varepsilon > 0$, let $\{x_{1,k}, \ldots, x_{n(k),k}\} \subseteq E_k$ be such that
$$\mu(E_k) \le \sum_{i=1}^{n(k)} f(x_{i,k}) + \varepsilon 2^{-k}, \qquad k = 1, 2, \ldots \tag{2.5}$$
For each integer $m$, $\{x_{1,1}, \ldots, x_{n(m),m}\}$ is a finite subset of $E$, and, by (2.5),
$$\sum_{k=1}^{m} \mu(E_k) \le \sum_{k=1}^{m} \sum_{i=1}^{n(k)} f(x_{i,k}) + \varepsilon \sum_{k=1}^{m} 2^{-k} \le \mu(E) + \varepsilon. \tag{2.6}$$
Since the right-hand side of (2.6) is independent of $m$, we may let $m \to \infty$ in the left-hand side there, and thus obtain
$$\sum_{k=1}^{\infty} \mu(E_k) \le \mu(E) + \varepsilon.$$
But since $\varepsilon$ above is arbitrary, the inequality opposite to (2.4) holds, and $\mu$ is $\sigma$-additive. Three particular instances of this example are of interest. If
$$\sum_{x \in X} f(x) = 1,$$
$\mu$ is a probability measure. On the other hand, if $f(x) = 1$ for all $x \in X$, $\mu$ is called, for obvious reasons, the counting measure on $X$. The counting measure is finite if $X$ itself is finite and it is $\sigma$-finite if $X$ is countable. Finally, if $f(x_0) = 1$ for some fixed $x_0 \in X$ and $f(x) = 0$ for $x \ne x_0$, $\mu$ is called the Dirac measure supported at $x_0$ and is denoted by $\delta_{x_0}$; clearly $\delta_{x_0}(E) = 1$ or $0$, according as to whether $x_0$ belongs to $E$ or not. The interesting question of when, in general, $\mu$ is finite or $\sigma$-finite, is left for the reader to answer, cf. 4.24 below. We close this section with a simple criterion that enables us to determine when an additive set function is a measure.
Theorem 2.3. Let $\psi$ be an additive, finitely valued set function defined on a $\sigma$-algebra $\mathcal{A}$. Then $\psi$ is a measure iff for any nonincreasing sequence $\{E_k\}_{k=1}^{\infty} \subseteq \mathcal{A}$ with $\bigcap_{k=1}^{\infty} E_k = \emptyset$, we have $\lim_{k \to \infty} \psi(E_k) = 0$.
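Proof. Assume that $\psi$ is a measure and let $\{E_k\}$ be a nonincreasing sequence of sets in $\mathcal{A}$ with $\bigcap_{k=1}^{\infty} E_k = \emptyset$. Then by the $\sigma$-additivity of the finite set function $\psi$ we have
$$\psi(E_1) = \sum_{k=1}^{\infty} \psi(E_k \setminus E_{k+1}) = \lim_{n \to \infty} \sum_{k=1}^{n-1} \big(\psi(E_k) - \psi(E_{k+1})\big) = \psi(E_1) - \lim_{n \to \infty} \psi(E_n).$$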
Since $\psi(E_1)$ is finite, $\lim_{n \to \infty} \psi(E_n) = 0$, and the necessity follows. As for the sufficiency, let $\{E_k\}$ be a disjoint sequence of measurable sets with union $E$ and let $A_n = E \setminus (E_1 \cup \ldots \cup E_n)$, $n = 1, 2, \ldots$ Then $\{A_n\}$ is a nonincreasing sequence of measurable sets with $\bigcap_{n=1}^{\infty} A_n = \emptyset$, and $\lim_{n \to \infty} \psi(A_n) = 0$. Now, since $\psi$ is additive we have
$$\psi(E) = \sum_{k=1}^{n} \psi(E_k) + \psi(A_n), \qquad n = 1, 2, \ldots$$
Whence taking the limit as $n \to \infty$ it follows that
$$\psi(E) = \lim_{n \to \infty} \sum_{k=1}^{n} \psi(E_k) + \lim_{n \to \infty} \psi(A_n) = \sum_{k=1}^{\infty} \psi(E_k),$$
and the $\sigma$-additivity of $\psi$ has been established. •
3. PROPERTIES OF MEASURES
How do measures behave with respect to the usual set operations, and with respect to the limiting operations? Some of the basic properties are given in
Proposition 3.1. Suppose $(X, \mathcal{M}, \mu)$ is a measure space. Then the following properties hold:
(i) (Monotonicity) If $E, F$ are measurable and $E \subseteq F$, then $\mu(E) \le \mu(F)$. Moreover, if $\mu(E)$ is finite, then
$$\mu(F \setminus E) = \mu(F) - \mu(E). \tag{3.1}$$
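(ii) ($\sigma$-subadditivity) If $\{E_k\}$ is a sequence of measurable sets, then
$$\mu\Big(\bigcup_{k=1}^{\infty} E_k\Big) \le \sum_{k=1}^{\infty} \mu(E_k). \tag{3.2}$$
(iii) (Continuity from below) If $E_1 \subseteq E_2 \subseteq \ldots$ is a nondecreasing sequence of measurable sets, then
$$\mu\Big(\bigcup_{k=1}^{\infty} E_k\Big) = \lim_{k \to \infty} \mu(E_k). \tag{3.3}$$
(iv) (Continuity from above) If $E_1 \supseteq E_2 \supseteq \ldots$ is a nonincreasing sequence of measurable sets and for some $k$, $\mu(E_k) < \infty$, then
$$\mu\Big(\bigcap_{k=1}^{\infty} E_k\Big) = \lim_{k \to \infty} \mu(E_k). \tag{3.4}$$
Proof. The monotonicity was essentially established in (2.2) above, so we say no more. As for the $\sigma$-subadditivity, first note that on account of 5.2 in Chapter I and the properties of measurable sets, we may rewrite $\bigcup E_k = \bigcup F_k$, where $\{F_k\}$ is a sequence of pairwise disjoint measurable sets with $F_k \subseteq E_k$ for all $k$. Consequently, by the $\sigma$-additivity and the monotonicity of $\mu$ it follows that
$$\mu\Big(\bigcup_{k} E_k\Big) = \mu\Big(\bigcup_{k} F_k\Big) = \sum_{k} \mu(F_k) \le \sum_{k} \mu(E_k),$$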
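which is precisely (3.2). Next note that if $\{E_k\}$ is nondecreasing, then $\lim E_k = \bigcup E_k$, so that in this particular instance the measure of the limit is the limit of the measures. Now, if $\mu(E_1) = \infty$, by the monotonicity of $\mu$, $\mu(E_k) = \infty$ for all $k$, and also $\mu(\bigcup E_k) = \infty$; in this case there is nothing to prove. Otherwise, since the sequence in question is nondecreasing, put $E_0 = \emptyset$, and note that $\bigcup_{k=1}^{\infty} E_k = \bigcup_{k=1}^{\infty} (E_k \setminus E_{k-1})$, where the sequence $\{E_k \setminus E_{k-1}\}$ is pairwise disjoint. Whence
$$\mu\Big(\bigcup_{k=1}^{\infty} E_k\Big) = \sum_{k=1}^{\infty} \mu(E_k \setminus E_{k-1}) = \lim_{n \to \infty} \sum_{k=1}^{n} \mu(E_k \setminus E_{k-1}).$$
Thus, by (3.1), we obtain that
$$\sum_{k=1}^{n} \mu(E_k \setminus E_{k-1}) = \sum_{k=1}^{n} \big(\mu(E_k) - \mu(E_{k-1})\big) = \mu(E_n) - \mu(E_0) = \mu(E_n),$$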
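and (3.3) follows. Finally, the idea to prove the continuity from above is to reduce the problem to one of continuity from below and to invoke (3.3). Replacing $E_k$ by $E_k \cap E_{k_0}$ if necessary, where $\mu(E_{k_0}) < \infty$, we may assume that $\mu(E_1) < \infty$. Since $\{E_k\}$ is nonincreasing, the sequence $\{E_1 \setminus E_k\}$ is nondecreasing, and, by (3.3),
$$\mu\Big(\bigcup_{k=1}^{\infty} (E_1 \setminus E_k)\Big) = \lim_{k \to \infty} \mu(E_1 \setminus E_k). \tag{3.5}$$
Since $\bigcup_{k=1}^{\infty} (E_1 \setminus E_k) = E_1 \setminus \bigcap_{k=1}^{\infty} E_k$, and since by (3.1) it follows that $\mu\big(\bigcup_{k=1}^{\infty} (E_1 \setminus E_k)\big) = \mu(E_1) - \mu\big(\bigcap_{k=1}^{\infty} E_k\big)$, and that $\mu(E_1 \setminus E_k) = \mu(E_1) - \mu(E_k)$, by (3.5) we get that
$$\mu(E_1) - \mu\Big(\bigcap_{k=1}^{\infty} E_k\Big) = \mu(E_1) - \lim_{k \to \infty} \mu(E_k).$$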
Moreover, $\mu(E_1) < \infty$, and this quantity may be cancelled in the above equality to give (3.4), and to complete the proof. • The restriction $\mu(E_1) < \infty$ is necessary for (3.4) to hold. Indeed, let $\mu$ be the measure on $(N, \mathcal{P}(N))$ given by $\mu(E)$ = number of elements of $E$ if $E$ is finite and $\mu(E) = \infty$ otherwise, and let $A_k = \{k, k+1, \ldots\}$. Then $\mu(A_k) = \infty$ for all $k$, but $\mu(\bigcap A_k) = \mu(\emptyset) = 0$. In working with measures a useful result is the following
Theorem 3.2 (Borel-Cantelli Lemma). Suppose $(X, \mathcal{M}, \mu)$ is a measure space and let $\{E_n\}$ be a sequence of measurable sets with the property that $\sum_{n=1}^{\infty} \mu(E_n) < \infty$. Then
$$\mu(\limsup E_n) = 0. \tag{3.6}$$
Proof.
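First observe that by the $\sigma$-subadditivity of $\mu$,
$$\mu\Big(\bigcup_{n=1}^{\infty} E_n\Big) \le \sum_{n=1}^{\infty} \mu(E_n) < \infty.$$
Now, the sequence consisting of $A_m = \bigcup_{n=m}^{\infty} E_n$, $m \ge 1$, is nonincreasing and $\mu(A_1) < \infty$. Whence, by (3.4)
$$\mu\Big(\bigcap_{m=1}^{\infty} A_m\Big) = \lim_{m \to \infty} \mu(A_m). \tag{3.7}$$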
The set on the left-hand side of (3.7) is precisely lim sup En. As for the right-hand side, note that the measure of each set there does not exceed
$\sum_{n=m}^{\infty} \mu(E_n)$, and since these are the tails of a convergent series, we have $\lim_{m \to \infty} \mu(A_m) = 0$. Consequently, (3.7) gives at once (3.6). •
Measures, or additive set functions for that matter, can be restricted or extended. More precisely, if M1 ~ M2 are a-algebras of subsets of X, we say that a measure J.l1 on (X, M 1) is the restriction to M 1 of the measure J.l2 on (X,M 2), and we write J.l1 = J.l2IM 1, if J.l1(E) = J.l2(E) for every E E M 1. In this case we also say that J.l2 is an extension of J.l1 to M2. For instance, given a measure space (X,M,J.l) and A E M, J.lIMA can intuitively be thought of as the restriction of J.l to A. Sets of measure 0 play a special role in many questions of interest to us. Given a measure space (X, M, J.l), any measurable set of measure 0 is called a null set. Null sets are often denoted by N. If {Nd is a sequence of null sets, then by the a-subaddivity of J.l it readily follows that UNk is also a null set. Also if N is a null set and A ~ N, then by the monotonicity of J.l it follows that J.l(A) = 0, provided that A is measurable. This, of course, is not true of all measures. Consider, for instance, the following simple example: Let X = {a,b,c},M = {0,{a},{b,c},X}, and J.l({a}) = 1, and J.l( {b,c}) = o. Then J.l is a probability measure and N = {b,c} is a null set, but {b}, {c} C {b,c} are not measurable. This motivates our next definition. A measure space (X,M,J.l) is said to be complete if whenever N E M is a null set and A ~ N, then A is also a measurable null set. In this case we also say, plainly, that J.l is complete. Since it is quite inconvenient to work with measure spaces which are not complete, we consider next whether a measure which is not complete can be extended, in a natural way, to complete measure. Let, then, (X,M,J.l) be a measure space and put N = {N EM: J.l( N) = O}. If J.l is not complete, then there is a set in P(N) which is not measurable, and the first step in constructing an extension of J.l which is complete is to find a a-algebra M1 which contains both M and P(N). 
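The natural choice for $\mathcal{M}_1$ is $\mathcal{S}(\mathcal{M} \cup \mathcal{P}(\mathcal{N}))$; fortunately there is a simpler way to characterize $\mathcal{M}_1$. Indeed, if

$$\mathcal{A} = \{E \cup F : E \in \mathcal{M},\ F \in \mathcal{P}(\mathcal{N})\},$$

then we claim that $\mathcal{M}_1 = \mathcal{A}$. Clearly $\mathcal{A} \subseteq \mathcal{M}_1$. If we can show that $\mathcal{A}$ is a $\sigma$-algebra of sets, then, since $\mathcal{A}$ contains $\mathcal{M}$ and $\mathcal{P}(\mathcal{N})$, it also contains the $\sigma$-algebra generated by $\mathcal{M} \cup \mathcal{P}(\mathcal{N})$, that is, $\mathcal{M}_1$, and we are done. First, $\mathcal{A} \ne \emptyset$. Next we show that if $A \in \mathcal{A}$, then also $X \setminus A \in \mathcal{A}$. Let $A = E \cup F$, where $E$ is measurable and $F \subseteq N$, $N \in \mathcal{N}$. Since $N \setminus E \in \mathcal{N}$ and $E \cup F = E \cup (F \setminus E)$, replacing $F$ by $F \setminus E$ and $N$ by $N \setminus E$ if necessary, we may assume that $E \cap F = E \cap N = \emptyset$. Now, since $E$ and $N$ are disjoint, we have that $E \cup F = (E \cup N) \cap (F \cup (X \setminus N))$, and consequently,

$$X \setminus (E \cup F) = (X \setminus (E \cup N)) \cup (X \setminus (F \cup (X \setminus N))) = (X \setminus (E \cup N)) \cup ((X \setminus F) \cap N) = E_1 \cup F_1,$$

say. Since $E \cup N$ is measurable, $E_1 \in \mathcal{M}$, and since $F_1 \subseteq N$, $F_1 \in \mathcal{P}(\mathcal{N})$; in other words $X \setminus (E \cup F) \in \mathcal{A}$, as we wanted to show. Finally we check that if $\{A_k\}$ is a sequence of sets in $\mathcal{A}$, then also $\bigcup_{k=1}^{\infty} A_k$ belongs to $\mathcal{A}$. This is not hard: Since $A_k = E_k \cup F_k$, $E_k \in \mathcal{M}$, $F_k \in \mathcal{P}(\mathcal{N})$ for all $k$, it readily follows that $\bigcup_{k=1}^{\infty} A_k = \left(\bigcup_{k=1}^{\infty} E_k\right) \cup \left(\bigcup_{k=1}^{\infty} F_k\right) = E \cup F$, say. But as it is evident that $E \in \mathcal{M}$ and $F \in \mathcal{P}(\mathcal{N})$, our verification that $\mathcal{A}$ is a $\sigma$-algebra is finished. We are now ready to prove

Theorem 3.3. Given a measure space $(X,\mathcal{M},\mu)$, consider $\mathcal{N} = \{N \in \mathcal{M} : \mu(N) = 0\}$ and $\mathcal{M}_1 = \{E \cup F : E \in \mathcal{M}, F \in \mathcal{P}(\mathcal{N})\}$. Then there is a unique extension $\mu_1$ of $\mu$ to $(X,\mathcal{M}_1)$ so that $(X,\mathcal{M}_1,\mu_1)$ is complete.

Proof. If $\mu$ is complete, $\mathcal{M}_1 = \mathcal{M}$, and by putting $\mu_1 = \mu$ we are done. Otherwise, if $\mu$ is not complete, define $\mu_1$ on $(X,\mathcal{M}_1)$ as follows: If $A = E \cup F \in \mathcal{M}_1$, let

$$\mu_1(A) = \mu(E). \tag{3.8}$$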
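First we show that $\mu_1$ is a well-defined set function on $(X,\mathcal{M}_1)$, i.e., if $A = E_1 \cup F_1 = E_2 \cup F_2$, then we have $\mu(E_1) = \mu(E_2)$. This is not hard; indeed if $F_2 \subseteq N_2 \in \mathcal{N}$, note that

$$E_1 \subseteq E_1 \cup F_1 = E_2 \cup F_2 \subseteq E_2 \cup N_2 \in \mathcal{M},$$

and, by monotonicity, $\mu(E_1) \le \mu(E_2 \cup N_2) \le \mu(E_2) + \mu(N_2) = \mu(E_2)$. Reversing the roles of $E_1$ and $E_2$ we get that $\mu(E_2) \le \mu(E_1)$, and $\mu_1$ is well-defined. Next we check that $\mu_1$ is a measure on $(X,\mathcal{M}_1)$. The only property that is not obvious is the $\sigma$-additivity. Let $\{A_k\}$ be a sequence of pairwise disjoint sets in $\mathcal{M}_1$, $A_k = E_k \cup F_k$, $F_k \subseteq N_k \in \mathcal{N}$ for all $k$. Then $F = \bigcup_k F_k \subseteq \bigcup_k N_k = N \in \mathcal{N}$, and since the $E_k$'s are pairwise disjoint we have

$$\mu_1\Big(\bigcup_k A_k\Big) = \mu_1\Big(\Big(\bigcup_k E_k\Big) \cup F\Big) = \mu\Big(\bigcup_k E_k\Big) = \sum_k \mu(E_k) = \sum_k \mu_1(A_k),$$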
IV.
58
Abstract Measures
and the $\sigma$-additivity follows. Finally we verify that $\mu_1|\mathcal{M} = \mu$ and that $\mu_1$ is the only complete measure on $(X,\mathcal{M}_1)$ with this property. Since for $E \in \mathcal{M}$ we have $E = E \cup \emptyset$ and $\emptyset \in \mathcal{N}$, it is clear that $\mu_1(E) = \mu(E)$ and $\mu_1$ is an extension of $\mu$. To check that $\mu_1$ is complete, let $N_1 = E_1 \cup F_1 \in \mathcal{M}_1$, $\mu_1(N_1) = 0$, and let $M \subseteq N_1$. Since $N_1$ is a null set we get that $\mu(E_1) = 0$ and $E_1 \cup F_1 \subseteq N \in \mathcal{N}$. Thus $M \subseteq N$, $M \in \mathcal{P}(\mathcal{N})$, $\mu_1(M) = 0$ and $\mu_1$ is complete. Further, to show that $\mu_1$ is unique, suppose that $\mu_2$ is another extension of $\mu$ to $\mathcal{M}_1$ and note that for $F \subseteq N \in \mathcal{N}$ we have $\mu_2(F) \le \mu_2(N) = \mu(N) = 0$. Thus, if $A = E \cup F \in \mathcal{M}_1$, it readily follows that

$$\mu_2(A) \le \mu_2(E) + \mu_2(F) = \mu(E) = \mu_1(A).$$

But since for $F \in \mathcal{P}(\mathcal{N})$ also $\mu_1(F) = 0$, we may essentially reverse the roles of $\mu_1$ and $\mu_2$ above and obtain that $\mu_1(A) \le \mu_2(A)$ as well. In other words, $\mu_1(A) = \mu_2(A)$ for every $A \in \mathcal{M}_1$, and $\mu_1$ is unique. ∎
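The construction in Theorem 3.3 can be tried out on the three-point example given earlier. The following is only a minimal sketch for finite spaces: the function names `powerset` and `completion` are illustrative choices, not notation from the text, and sets are modelled as Python `frozenset`s. It forms $\mathcal{M}_1 = \{E \cup F\}$, checks that (3.8) is well-defined, and returns $\mu_1$.

```python
from itertools import chain, combinations

def powerset(s):
    """All subsets of the finite set s, as frozensets."""
    items = list(s)
    return [frozenset(c) for c in chain.from_iterable(
        combinations(items, r) for r in range(len(items) + 1))]

# The example from the text: X = {a, b, c}, M = {0, {a}, {b, c}, X}.
X = frozenset("abc")
mu = {frozenset(): 0, frozenset("a"): 1, frozenset("bc"): 0, X: 1}

def completion(mu):
    """Return mu_1 on M_1 = {E | F : E in M, F a subset of a null set}."""
    null_sets = [N for N, m in mu.items() if m == 0]
    subsets_of_null_sets = {F for N in null_sets for F in powerset(N)}
    values = {}
    for E, m in mu.items():
        for F in subsets_of_null_sets:
            # (3.8): mu_1(E | F) = mu(E); collect every value a set receives.
            values.setdefault(E | F, set()).add(m)
    # Well-definedness: each set in M_1 must receive exactly one value.
    assert all(len(v) == 1 for v in values.values())
    return {A: next(iter(v)) for A, v in values.items()}

mu1 = completion(mu)
```

In this example $\mathcal{M}_1$ turns out to be the full power set of $X$, so every subset of the null set $\{b,c\}$ becomes a measurable null set, as the theorem asserts.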
4. PROBLEMS AND QUESTIONS

4.1 Prove that $(\limsup A_n) \cap (\limsup B_n) \supseteq \limsup (A_n \cap B_n)$, and that $(\limsup A_n) \cup (\limsup B_n) = \limsup (A_n \cup B_n)$. What are the corresponding statements for the lim inf?

4.2 Let $\mathcal{A}$ be an algebra of subsets of $X$ and let $\psi$ be a finite additive set function defined on $\mathcal{A}$. Given $\{A_k\}_{k=1}^{n} \subseteq \mathcal{A}$, no two of the $A_k$'s being the same, let $C_m = \{x : x \text{ belongs to exactly } m \text{ of the } A_k\text{'s}\}$, $m = 1,2,\ldots,n$. Show that $\sum_{k=1}^{n} \psi(A_k) = \sum_{m=1}^{n} m\,\psi(C_m)$.

4.3 In the notation of problem 4.2, let $B_m = \{x : x \text{ belongs to at least } m \text{ of the } A_k\text{'s}\}$, $m = 1,2,\ldots,n$. Show that $\sum_{k=1}^{n} \psi(A_k) = \sum_{m=1}^{n} \psi(B_m)$.

4.4 Let $\psi$ be a finite additive set function defined on an algebra $\mathcal{A}$ of subsets of $X$, and let $A_1, A_2 \in \mathcal{A}$. Show that

$$\psi(A_1 \cap A_2) + \psi(A_1 \cup A_2) = \psi(A_1) + \psi(A_2).$$
4.5 It is possible to extend 4.4 to include more than two sets. Let {Ak}k=l ~ A and for an integer m ~ n put Tm
=
L ~< ... 0, there are open intervals I~ d Ik, such that v(ID ::; (1 + ",)V(Ik), all k. Since I ~ UI~, and since I is compact and the Irs are open, by the Heine-Borel theorem there is a finite subcovering, which for simplicity we also denote by {In, such that I ~ U:=1 I k. In this case it is intuitively clear, although involved to prove, that v(I) ::; 2::=1 v(I~), and consequently we also have v(I) ::; (1 + ",)
$\sum_{k=1}^{N} v(I_k) \le (1+\eta) \sum_{k} v(I_k) \le (1+\eta)(|I|_e + \varepsilon)$. (1.3)

Since $\varepsilon$ and $\eta$ are both arbitrary, from (1.3) it readily follows that $v(I) \le |I|_e$, and we have finished. ∎
66
V.
The Lebesgue Measure
Along the same lines it is not hard to see that the following result holds: Let $\{I_k\}_{k=1}^{N}$ be a finite collection of nonoverlapping closed intervals in $R^n$, then

Corollary 1.2. Let $I$ be an open interval. Then $|I|_e = v(I)$.

Proof. Let $\bar{I}$ denote the closure of $I$; then we have $|I|_e \le v(\bar{I}) = v(I)$, and one inequality holds. Also, if $J$ is any closed interval contained in $I$, monotonicity implies that $v(J) = |J|_e \le |I|_e$. But since $v(I) = \sup v(J)$, where the sup is taken over the family of the closed subintervals $J$ of $I$, we get that $v(I) \le |I|_e$, and the opposite inequality holds. ∎

Next we show that the outer measure is $\sigma$-subadditive.

Proposition 1.3. Let $\{E_k\}_{k=1}^{\infty}$ be any sequence of subsets of $R^n$. Then

$$\Big|\bigcup_k E_k\Big|_e \le \sum_k |E_k|_e. \tag{1.4}$$
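Proof. If $|E_k|_e = \infty$ for some $k$, then by monotonicity also the union has infinite outer measure, and we have equality in (1.4). Otherwise, suppose that $|E_k|_e < \infty$ for all $k$, and let $\varepsilon > 0$ be arbitrary. For each $k$ let $\{I_{k,j}\}$ be a covering of $E_k$ by closed intervals with the property that

$$\sum_j v(I_{k,j}) \le |E_k|_e + \varepsilon/2^k. \tag{1.5}$$

Clearly

$$\bigcup_k E_k \subseteq \bigcup_k \bigcup_j I_{k,j}, \tag{1.6}$$

and consequently, the closed intervals on the right-hand side of (1.6) form a covering of $\bigcup_k E_k$. Whence, by definition, we have

$$\Big|\bigcup_k E_k\Big|_e \le \sum_{k,j} v(I_{k,j}), \tag{1.7}$$

and since the summands on the right-hand side of (1.7) are nonnegative we may interchange freely the order of summation and estimate that expression by

$$\sum_k \sum_j v(I_{k,j}) \le \sum_k \big(|E_k|_e + \varepsilon/2^k\big) = \sum_k |E_k|_e + \varepsilon.$$

But since $\varepsilon > 0$ is arbitrary, we also have $\big|\bigcup_k E_k\big|_e \le \sum_k |E_k|_e$. ∎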
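The $\varepsilon/2^k$ device used in this proof is worth seeing in the simplest case. As an illustration only (the helper names `rationals_in_unit_interval` and `cover` are hypothetical, and the example works in $R$ rather than $R^n$), the sketch below covers the $k$-th rational of $[0,1]$ by a closed interval of length $\varepsilon/2^k$, so that the total length of the covering stays below $\varepsilon$ however many points are taken. Exact rational arithmetic is used so that the sums can be compared exactly.

```python
from fractions import Fraction
from itertools import islice

def rationals_in_unit_interval():
    """Enumerate the rationals in [0, 1] without repetition."""
    yield Fraction(0)
    yield Fraction(1)
    q = 2
    while True:
        for p in range(1, q):
            r = Fraction(p, q)
            if r.denominator == q:  # keep p/q only when it is in lowest terms
                yield r
        q += 1

def cover(points, eps):
    """Cover the k-th point (k = 1, 2, ...) by a closed interval of length eps / 2**k."""
    intervals = []
    for k, x in enumerate(points, start=1):
        half = eps / 2 ** (k + 1)
        intervals.append((x - half, x + half))
    return intervals

eps = Fraction(1, 100)
points = list(islice(rationals_in_unit_interval(), 20))
intervals = cover(points, eps)
total = sum(b - a for a, b in intervals)
```

Here `total` equals $\varepsilon(1 - 2^{-20})$, which is less than $\varepsilon$; letting the number of points grow shows that a countable set has outer measure 0.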
In contrast to the case of measures, it is interesting to point out that strict inequality may occur in (1.4), even if the $E_k$'s are pairwise disjoint. To see this observe that the Lebesgue outer measure is translation invariant, and that with the notation of (1.1) above, $|A|_e > 0$. Then, again by (1.1), $\big|\bigcup_k A_k\big|_e \le 3 < \sum_k |A_k|_e = \infty$. Since open sets can be expressed as the union of closed intervals, it is reasonable to attempt to compute the outer measure of subsets of $R^n$ in terms of the outer measure of open sets. Specifically, we have

Proposition 1.4. Let $E$ be any subset of $R^n$. Then

$$|E|_e = \inf\{|O|_e : O \text{ is open, and } O \supseteq E\}. \tag{1.8}$$
Proof. If $|E|_e = \infty$, by monotonicity every open set which contains $E$ (and this class is nonempty since $R^n$ is one such set) also has infinite outer measure and (1.8) holds. On the other hand, if $|E|_e$ is finite, given $\varepsilon > 0$, let $\{I_k\}_{k=1}^{\infty}$ be a covering of $E$ by closed intervals such that

$$\sum_k v(I_k) \le |E|_e + \varepsilon/2.$$

Furthermore, for each $k$, let $I'_k$ be an open interval containing $I_k$ such that $v(I'_k) \le v(I_k) + \varepsilon/2^{k+1}$, and put $O = \bigcup_k I'_k$. By construction $O$ is open and it contains $E$. Moreover, by Proposition 1.3 and Corollary 1.2 it readily follows that

$$|E|_e \le |O|_e \le \sum_k |I'_k|_e = \sum_k v(I'_k) \le \sum_k \big(v(I_k) + \varepsilon/2^{k+1}\big) = \sum_k v(I_k) + \varepsilon/2 \le |E|_e + \varepsilon.$$

Thus, for any $\varepsilon > 0$, there is an open set $O \supseteq E$ such that

$$|E|_e \le |O|_e \le |E|_e + \varepsilon, \tag{1.9}$$

and (1.8) holds. ∎
Proposition 1.4 is important in applications; to state some we need a definition. We say that a subset $H$ of $R^n$ is a $G_\delta$ set if $H$ is the intersection of an at most countable family of open sets. The complement of a $G_\delta$ set is an $F_\sigma$ set, i.e., an at most countable union of closed sets.

Corollary 1.5. Let $E$ be an arbitrary subset of $R^n$. Then there is a $G_\delta$ set $H$ which contains $E$ and such that $|H|_e = |E|_e$.
Proof. If $|E|_e = \infty$, put $H = R^n$. If $|E|_e < \infty$, by (1.9) there is a sequence $\{O_k\}_{k=1}^{\infty}$ of open sets containing $E$ such that $|O_k|_e \le |E|_e + 1/k$. Let now $H = \bigcap_k O_k$; by construction $H$ is a $G_\delta$ set, and $E \subseteq H \subseteq O_k$ for all $k$. Thus by monotonicity it follows that

$$|E|_e \le |H|_e \le |O_k|_e \le |E|_e + 1/k. \tag{1.10}$$

Since $k$ is arbitrary we get that $|E|_e = |H|_e$. ∎
A closer look at inequality (1.10) for sets $E$ with finite outer measure indicates that the following property is true: Given $\varepsilon > 0$, there is an open set $O \supseteq E$ such that $|O|_e - |E|_e < \varepsilon$. This estimate does not make sense when $|E|_e = \infty$, but a closely related result holds for any subset of $R^n$, to wit, there exists an open set $O \supseteq E$ such that $|O|_e \le |E|_e + |O \setminus E|_e$. This inequality is a simple consequence of the subadditivity of the outer measure, and it hints that rather than seeking to control $|O|_e - |E|_e$, which is meaningless when $|E|_e = \infty$, we may try to control $|O \setminus E|_e$. This control is, indeed, all that is needed. We say that $E \subseteq R^n$ is Lebesgue measurable if for any $\varepsilon > 0$, there exists an open set $O \supseteq E$ such that
10 \ Ele
0, let k ~ l/E, and note that
$$|O_k \setminus E|_e \le |O_k|_e \le 1/k \le \varepsilon.$$
In order to show that $\mathcal{L}$ is a $\sigma$-algebra we must verify that $\mathcal{L}$ is closed under countable unions, which is easy, and under complementation, which requires some work. We begin with the easier part.

Proposition 1.6. Let $\{E_k\}_{k=1}^{\infty}$ be a sequence of sets in $\mathcal{L}$; then $\bigcup_k E_k \in \mathcal{L}$.

Proof. For a given $\varepsilon > 0$, we must find an open set $O \supseteq \bigcup_k E_k$ so that $\big|O \setminus \bigcup_k E_k\big|_e < \varepsilon$. Since each $E_k$ belongs to $\mathcal{L}$, there are open sets $O_k \supseteq E_k$ such that $|O_k \setminus E_k|_e \le \varepsilon/2^k$ for all $k$. Hence, $O = \bigcup_k O_k$ is an open set which contains $\bigcup_k E_k$, and since, as is readily seen, $\big(\bigcup_k O_k\big) \setminus \big(\bigcup_k E_k\big) \subseteq \bigcup_k (O_k \setminus E_k)$, by Proposition 1.3 we get that

$$\Big|O \setminus \bigcup_k E_k\Big|_e \le \sum_k |O_k \setminus E_k|_e \le \sum_k \varepsilon/2^k = \varepsilon.$$

Thus, the union of the $E_k$'s belongs to $\mathcal{L}$. ∎
To complete the verification that $\mathcal{L}$ is a $\sigma$-algebra, we begin by showing that closed sets are indeed Lebesgue measurable. This requires some preliminary results.

Lemma 1.7. Let $E_1, E_2$ be subsets of $R^n$ with the property that

$$d(E_1, E_2) > 0.$$

Then, $|E_1 \cup E_2|_e = |E_1|_e + |E_2|_e$. In particular, this is true if $E_1, E_2$ are compact and disjoint.

Proof. If either $|E_1|_e$ or $|E_2|_e$ is infinite, then the same is true of $|E_1 \cup E_2|_e$, and we are done. Now, if both are finite, since the outer measure is subadditive, it suffices to show that $|E_1|_e + |E_2|_e \le |E_1 \cup E_2|_e$. But in this case we also have that $|E_1 \cup E_2|_e < \infty$, and consequently, given $\varepsilon > 0$, there is a covering $\{I_k\}$ of $E_1 \cup E_2$ consisting of closed intervals such that

$$\sum_k v(I_k) \le |E_1 \cup E_2|_e + \varepsilon.$$

There are three relevant kinds of $I_k$'s, to wit:
(i) Those $I_k$'s such that $I_k \cap E_1 \ne \emptyset$, $I_k \cap E_2 = \emptyset$, call them $I'_k$'s;
(ii) Those $I_k$'s such that $I_k \cap E_1 = \emptyset$, $I_k \cap E_2 \ne \emptyset$, call them $I''_k$'s;
(iii) Those $I_k$'s which intersect both $E_1$ and $E_2$.
The intervals in the third class above may be subdivided into nonoverlapping closed subintervals of diameter less than $d(E_1, E_2)$. Each subinterval thus obtained either belongs to the first family (i), or to the second family (ii), or it does not intersect $E_1 \cup E_2$ and it can be discarded. Therefore, we divide the $I_k$'s into a covering of $E_1$, a covering of $E_2$, and throw away the rest. By definition we have

$$|E_1|_e + |E_2|_e \le \sum v(I'_k) + \sum v(I''_k) \le \sum_k v(I_k) \le |E_1 \cup E_2|_e + \varepsilon,$$

which implies, since $\varepsilon$ is arbitrary, that, as asserted, $|E_1|_e + |E_2|_e \le |E_1 \cup E_2|_e$. ∎
We are now ready to prove Theorem 1.8.
Closed subsets of Rn are Lebesgue measurable.
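Proof. Suppose first that the closed set $F$ in question is bounded, and hence compact. Then, given $\varepsilon > 0$, by inequality (1.10) there is an open set $O \supset F$ such that $|O|_e \le |F|_e + \varepsilon$; we would like to show that $|O \setminus F|_e \le \varepsilon$ as well. Now, $O \setminus F$ is also open, and consequently it can be expressed as the countable union of nonoverlapping closed intervals, $\bigcup_k I_k$, say. By Proposition 1.3, it follows that $|O \setminus F|_e \le \sum_k v(I_k)$. On the other hand, since

$$F \cup \Big(\bigcup_{k=1}^{N} I_k\Big) \subseteq O, \quad N = 1, 2, \ldots,$$

by monotonicity we get that

$$\Big|F \cup \Big(\bigcup_{k=1}^{N} I_k\Big)\Big|_e \le |O|_e \le |F|_e + \varepsilon.$$

Furthermore, since $F$ and $\bigcup_{k=1}^{N} I_k$ are both compact and disjoint and the $I_k$'s are nonoverlapping, by Lemma 1.7 it readily follows that

$$\Big|F \cup \Big(\bigcup_{k=1}^{N} I_k\Big)\Big|_e = |F|_e + \Big|\bigcup_{k=1}^{N} I_k\Big|_e = |F|_e + \sum_{k=1}^{N} v(I_k).$$

In particular

$$\sum_{k=1}^{N} v(I_k) \le \varepsilon.$$

But since this inequality holds with a bound independent of $N$, we have $\sum_k v(I_k) \le \varepsilon$, and consequently also $|O \setminus F|_e \le \varepsilon$. Thus, in this case, $F$ is Lebesgue measurable. For general closed subsets $F$ of $R^n$, let $F_k = \{x \in F : |x| \le k\}$, and observe that each $F_k$ is closed and bounded, and that $F = \bigcup_k F_k$. By the above argument each $F_k$ is measurable, and by Proposition 1.6 so is their union, $F$. ∎

Theorem 1.9.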
Suppose $E \in \mathcal{L}$; then $R^n \setminus E \in \mathcal{L}$.

Proof. Let $O_k \supset E$ be a family of open sets such that $|O_k \setminus E|_e \le 1/k$, $k = 1, 2, \ldots$; each $R^n \setminus O_k$ is closed, and hence measurable. Furthermore, since $R^n \setminus O_k \subseteq R^n \setminus E$ for all $k$, $H = \bigcup_k (R^n \setminus O_k)$ is an $F_\sigma$ measurable subset of $R^n \setminus E$. Let $A = (R^n \setminus E) \setminus H$; since $R^n \setminus E = H \cup A$, we will be done once we check that $A$ is measurable. To do this we show that $|A|_e = 0$. Indeed, since for each $k$, $A \subseteq (R^n \setminus E) \setminus (R^n \setminus O_k) = O_k \setminus E$, it readily follows that

$$|A|_e \le |O_k \setminus E|_e \le 1/k, \quad \text{all } k.$$

Thus $|A|_e = 0$, and we are done. ∎
Theorem 1.9 completes the verification that $\mathcal{L}$ is a $\sigma$-algebra and, at the same time, it provides a description of the Lebesgue measurable sets. Indeed, the argument of Theorem 1.9 applied to the (measurable) complement $R^n \setminus E$ of a measurable set $E$ gives that $E = R^n \setminus (R^n \setminus E) = H \cup N$, where $H$ is an $F_\sigma$ set and $|N|_e = 0$. It also gives that the Lebesgue measurable sets are precisely those subsets $E$ of $R^n$ which satisfy the following property: Given $\varepsilon > 0$, there is a closed set $F \subseteq E$ such that $|E \setminus F|_e \le \varepsilon$. Next we construct the Lebesgue measure on $(R^n, \mathcal{L})$; it is the restriction of the Lebesgue outer measure to $\mathcal{L}$. More precisely, we have

Theorem 1.10. The set function $|\cdot|_e$ restricted to $\mathcal{L}$ is a measure. We call this measure the Lebesgue measure on $R^n$ and denote it by $|\cdot|$.

Proof. The proof amounts to showing that the set function $|\cdot|_e$ is $\sigma$-additive on $\mathcal{L}$. For this purpose, let $\{E_k\}_{k=1}^{\infty}$ be a sequence of pairwise disjoint measurable sets, and let $E$ denote its union. Note that, by Proposition 1.3, $|E| \le \sum_k |E_k|$. As for the opposite inequality, assume first that the $E_k$'s are bounded. Given $\varepsilon > 0$, let $F_k \subseteq E_k$ be a sequence of closed sets such that $|E_k \setminus F_k| \le \varepsilon/2^k$ for all $k$. Since $E_k = F_k \cup (E_k \setminus F_k)$ we also have $|E_k| \le |F_k| + \varepsilon/2^k$. Furthermore, since the $E_k$'s are pairwise disjoint and bounded, the sequence of $F_k$'s is composed of pairwise disjoint compact subsets of $R^n$. Fix $N$, and note that by (a simple extension of) Lemma 1.7, $\big|\bigcup_{k=1}^{N} F_k\big| = \sum_{k=1}^{N} |F_k|$. Thus, since $\bigcup_{k=1}^{N} F_k \subseteq E$ for all $N$, it follows that $\sum_{k=1}^{N} |F_k| \le |E|$, all $N$, and consequently also $\sum_{k=1}^{\infty} |F_k| \le |E|$. Whence

$$\sum_{k=1}^{\infty} |E_k| \le \sum_{k=1}^{\infty} \big(|F_k| + \varepsilon/2^k\big) \le |E| + \varepsilon,$$

and, since $\varepsilon$ is arbitrary, we are done in this case. In the general case, fix an increasing sequence $\{I_j\}$ of bounded intervals so that $\bigcup_j I_j = R^n$, $I_0 = \emptyset$, and put $S_j = I_j \setminus I_{j-1}$, $j = 1, 2, \ldots$ Then, the sets $E_{k,j} = E_k \cap S_j$ are measurable, pairwise disjoint and bounded, and for each $k$ we have $\bigcup_j E_{k,j} = E_k$. Thus, on the one hand, $\big|\bigcup_{j,k} E_{k,j}\big| = \big|\bigcup_k E_k\big|$, and, on the other hand,

$$\Big|\bigcup_{j,k} E_{k,j}\Big| = \sum_{k,j} |E_{k,j}| = \sum_k \sum_j |E_{k,j}| = \sum_k |E_k|.$$

In other words, $|\cdot|$ is $\sigma$-additive on $\mathcal{L}$, and we have finished. ∎
The following characterization of $\mathcal{L}$, due to Carathéodory (1873-1950), highlights the interplay between the Lebesgue measurable sets and the Lebesgue measure, and it is very interesting since it can be used to define $\mathcal{L}$, and more general $\sigma$-algebras of sets, cf. 3.40 below.

Theorem 1.11 (Carathéodory). A subset $E$ of $R^n$ is Lebesgue measurable iff for every subset $A$ of $R^n$ we have

$$|A|_e = |A \cap E|_e + |A \setminus E|_e. \tag{1.11}$$

Proof. We begin by assuming that $E$ is measurable and $A$ is any subset of $R^n$. Since $A = (A \cap E) \cup (A \setminus E)$, by the subadditivity of the Lebesgue outer measure it follows that

$$|A|_e \le |A \cap E|_e + |A \setminus E|_e.$$

To prove the opposite inequality, note that by Corollary 1.5 there is a $G_\delta$ measurable set $H \supseteq A$ such that $|A|_e = |H|$. Now, since $H$ is also measurable, we have $|H| = |H \cap E| + |H \setminus E|$. Whence, by monotonicity

$$|A|_e = |H| = |H \cap E| + |H \setminus E| \ge |A \cap E|_e + |A \setminus E|_e,$$

and (1.11) holds. Next assume that (1.11) is true for every subset $A$ of $R^n$; we distinguish the cases $|E|_e < \infty$ and $|E|_e = \infty$. In the former case, by Corollary 1.5 there is a $G_\delta$ set $H \supseteq E$ such that $|H| = |E|_e$. By (1.11) we have

$$|H| = |H \cap E|_e + |H \setminus E|_e = |E|_e + |H \setminus E|_e. \tag{1.12}$$

Since $|E|_e$ is finite, (1.12) gives at once that $H \setminus E$ is a measurable set of measure 0, and that $E = H \setminus (H \setminus E)$ is also measurable. As for the latter case, let $E_k = \{x \in E : |x| \le k\}$, so that $|E_k|_e < \infty$, and let $H_k$ be a $G_\delta$ set containing $E_k$ such that $|H_k| = |E_k|_e$ for all $k$. By (1.11), with $A = H_k$ there, we get

$$|H_k| = |H_k \cap E|_e + |H_k \setminus E|_e \ge |E_k|_e + |H_k \setminus E|_e.$$

Since $|H_k| = |E_k|_e$, we have $|H_k \setminus E|_e = 0$ for all $k$. Whence, setting $H = \bigcup_k H_k$, it readily follows that $H$ is a measurable set which contains $E$, and that $H \setminus E = \bigcup_k (H_k \setminus E)$ is a measurable set of measure 0. Thus, $E = H \setminus (H \setminus E)$ is also measurable. ∎
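Condition (1.11) makes sense for any outer measure, and on a finite set it can be checked by brute force. The sketch below is a toy analogue and not the Lebesgue case: it assumes the three-point measure space $X = \{a,b,c\}$, $\mathcal{M} = \{\emptyset,\{a\},\{b,c\},X\}$ of Chapter IV, defines a hypothetical outer measure `outer(A)` as the least measure of a set of $\mathcal{M}$ containing $A$, and lists the sets that satisfy the Carathéodory condition. The names `subsets`, `outer` and `caratheodory_measurable` are illustrative.

```python
from itertools import chain, combinations

def subsets(s):
    """All subsets of the finite set s, as frozensets."""
    items = sorted(s)
    return [frozenset(c) for c in chain.from_iterable(
        combinations(items, r) for r in range(len(items) + 1))]

X = frozenset("abc")
mu = {frozenset(): 0, frozenset("a"): 1, frozenset("bc"): 0, X: 1}

def outer(A):
    """Toy outer measure: least mu(E) over the sets E of M that contain A."""
    return min(m for E, m in mu.items() if A <= E)

def caratheodory_measurable(E):
    """Test the splitting condition (1.11) against every test set A."""
    return all(outer(A) == outer(A & E) + outer(A - E) for A in subsets(X))

measurable = [E for E in subsets(X) if caratheodory_measurable(E)]
```

In this toy case every subset of $X$ passes the test, including $\{b\}$ and $\{c\}$, which are not in $\mathcal{M}$; the measurable sets are exactly those of the completion discussed in Theorem 3.3 of Chapter IV.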
2. THE CANTOR SET
It is easy to see that there are uncountable sets of measure 0 in $R^n$, $n \ge 2$; indeed, the boundary of any interval is such a set. How about $R$? The Cantor set is such an example, and we construct it next. Consider the closed interval $C_0 = [0,1]$. The first stage of the construction is to trisect $C_0$ and to remove the interior of the middle interval, $(1/3, 2/3)$. Each successive step is essentially the same. Let $C_1 = [0,1/3] \cup [2/3,1]$; $C_1$ is the union of $2^1 = 2$ closed disjoint intervals. At the second stage we subdivide each of the closed intervals of $C_1$ into thirds and remove from each one the middle open thirds, $(1/9, 2/9)$ and $(7/9, 8/9)$. Suppose that $C_n$ has been constructed and that it consists of $2^n$ closed disjoint intervals, each of length $3^{-n}$. Subdivide each of the closed intervals of $C_n$ into thirds and remove from each one of them the interior of the middle intervals. What is left from $C_n$ is $C_{n+1}$; note that $C_{n+1}$ is the union of $2^{n+1}$ closed intervals, each of length $3^{-(n+1)}$. The Cantor set $C$ is now defined as $C = \bigcap_{n=0}^{\infty} C_n$.

Some of the elementary properties of $C$ are the following: It is closed, it contains the endpoints of all intervals in $C_n$, and any point of $C$ is the limit of a nondecreasing (and a nonincreasing) sequence of endpoints of the intervals of the $C_n$'s. It is not hard to give an analytical description of the elements of $C$. Let $x = \sum_{n=1}^{\infty} a_n 3^{-n}$ be the triadic expansion of an arbitrary $x \in C$. We observe that since $x \notin (1/3, 2/3)$, $a_1 \ne 1$; similarly, since $x \notin (1/9, 2/9) \cup (7/9, 8/9)$, $a_2 \ne 1$, and so on. In other words, by induction we see that $a_n \ne 1$ for all $n$, and $C$ consists precisely of those points with $a_n = 0, 2$ in their triadic expansion. For example, the number $1/4 = \sum_{n=1}^{\infty} 2 \cdot 3^{-2n}$ is in $C$, but is not an endpoint of any of the intervals of the $C_n$'s.
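For rational points the membership question can be settled mechanically. The following is a sketch under the assumption that $x$ is given as an exact rational; the function name `in_cantor_set` is illustrative. It follows the construction stage by stage: a point of $[0,1/3]$ is rescaled by $x \mapsto 3x$ (digit 0), a point of $[2/3,1]$ by $x \mapsto 3x - 2$ (digit 2), and a point of the removed middle third is rejected. Since the denominators never grow, the orbit of a rational point eventually repeats, at which stage the point is known to survive every stage.

```python
from fractions import Fraction

def in_cantor_set(x):
    """Decide whether the rational number x belongs to the Cantor set."""
    x = Fraction(x)
    seen = set()
    while x not in seen:
        seen.add(x)
        if 0 <= x <= Fraction(1, 3):
            x = 3 * x          # left third: triadic digit 0
        elif Fraction(2, 3) <= x <= 1:
            x = 3 * x - 2      # right third: triadic digit 2
        else:
            return False       # x lies in a removed open middle third (or outside [0, 1])
    return True                # the orbit repeats, so x is never removed
```

For instance, `in_cantor_set(Fraction(1, 4))` returns `True`, in agreement with the expansion of $1/4$ given above, while `in_cantor_set(Fraction(1, 2))` returns `False`.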
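Figure 2. [First stages of the construction: $C_0 = [0,1]$; $C_1$ with division points $1/3$, $2/3$; $C_2$ with division points $1/9$, $2/9$, $1/3$, $2/3$, $7/9$, $8/9$.]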
As for the cardinality of $C$, we have

Proposition 2.1. $C$ is uncountable.

Proof. The idea is to show that $C \sim 2^N$, $2 = \{0,1\}$. If $(x_n) \in 2^N$, let $y_n = 2x_n$, and put $f((x_n)) = \sum_{n=1}^{\infty} y_n 3^{-n}$. Since $y_n \ne 1$ for all $n$, $f$ maps $2^N$ into $C$; we want to show that $f$ is one-to-one and onto. Suppose that $(x_n) \ne (x'_n)$, and let $m = \min\{n : x_n \ne x'_n\}$; we may assume that $x_m = 0$ and $x'_m = 1$. Since $2\sum_{n=m+1}^{\infty} 3^{-n} = 3^{-m}$, it follows that

$$f((x'_n)) = 2\sum_{n=1}^{\infty} x'_n 3^{-n} \ge 2\sum_{n=1}^{m-1} x_n 3^{-n} + 2 \cdot 3^{-m} > 2\sum_{n=1}^{\infty} x_n 3^{-n} = f((x_n)),$$

and $f$ is one-to-one. Since given $x = \sum_{n=1}^{\infty} a_n 3^{-n}$ in $C$, we have $f((a_n/2)) = x$, $f$ is also onto. ∎

Is $C$ measurable, and if so, what is its measure? Since $C$ is covered by the intervals in any $C_n$ we have $|C|_e \le 2^n 3^{-n}$ for all $n$, and consequently, $|C| = 0$. Thus $C$ is an example of an uncountable set of measure 0 in the line.
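The bound $|C|_e \le 2^n 3^{-n}$ can be seen concretely by generating the stages $C_n$. The sketch below is one straightforward way to do so with exact fractions; the names `cantor_stage` and `total_length` are illustrative and not taken from the text.

```python
from fractions import Fraction

def cantor_stage(n):
    """Return the closed intervals of C_n as a list of (left, right) endpoints."""
    intervals = [(Fraction(0), Fraction(1))]          # C_0 = [0, 1]
    for _ in range(n):
        next_stage = []
        for a, b in intervals:
            third = (b - a) / 3
            next_stage.append((a, a + third))          # keep the left third
            next_stage.append((b - third, b))          # keep the right third
        intervals = next_stage
    return intervals

def total_length(intervals):
    """Sum of the lengths of the given intervals."""
    return sum(b - a for a, b in intervals)
```

For example, `cantor_stage(5)` consists of $2^5 = 32$ intervals of total length $(2/3)^5$, and the total length tends to 0 as $n$ grows.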
3. PROBLEMS AND QUESTIONS

3.1 Suppose $A$, $B$ are not Lebesgue measurable; is the same true of $A \cup B$?

3.2 Suppose $|A|_e = 0$ and show that for every subset $B$ of $R^n$ we have $|B \cup A|_e = |B \setminus A|_e = |B|_e$.

3.3 Let $A, B \subseteq R^n$. Show that

3.4 Suppose $\{E_k\}$ is a nondecreasing sequence of subsets of $R^n$ and let $E = \bigcup_k E_k$. Is it true that $\lim_{k\to\infty} |E_k|_e = |E|_e$?

3.5 Does the notion of outer measure change if we replace the coverings by intervals by coverings with balls? How about parallelepipeds with a fixed orientation?

3.6 Show that if $\sum |E_k|_e < \infty$, then $|\limsup E_k|_e = 0$.

3.7 Suppose $A, B \subseteq R^n$, $A \in \mathcal{L}$, and $|A \,\triangle\, B|_e = 0$. Show that $B \in \mathcal{L}$ and $|A| = |B|$.

3.8 Assume $\{E_k\}$ is a sequence of pairwise disjoint Lebesgue measurable sets and let $A$ be any set. Is it true that

$$\Big|A \cap \Big(\bigcup_{k=1}^{\infty} E_k\Big)\Big|_e = \sum_{k=1}^{\infty} |A \cap E_k|_e \,?$$

3.9 Consider the transformation $\phi(x) = \eta x + \delta$ from $R$ into itself, where $\eta \ne 0$ and $\delta$ are real numbers. Show that: (i) For any set $E$, $|\phi(E)|_e = |\eta|\,|E|_e$. (ii) $E$ is Lebesgue measurable iff $\phi(E)$ is Lebesgue measurable, and in this case $|\phi(E)| = |\eta|\,|E|$. Can you think of extensions of this result to $R^n$?

3.10 A mapping $\phi$ from $R$ into itself is said to be an isometry if for any $x, x'$ in $R$ we have $|\phi(x) - \phi(x')| = |x - x'|$. Show that if $\phi$ is an isometry and $E \in \mathcal{L}$, then $\phi(E) \in \mathcal{L}$ and $|\phi(E)| = |E|$.

3.11 Assume that $|N| = 0$ and show that $\{x^3 : x \in N\}$ is a null Lebesgue set.

3.12 Suppose $|E|_e < \infty$ and show that $E \in \mathcal{L}$ iff for any $\varepsilon > 0$, we can write $E = (A \cup A_1) \setminus A_2$, where $A$ is the union of a finite collection of nonoverlapping intervals and $|A_1|_e, |A_2|_e < \varepsilon$.

3.13 Is the set of irrational numbers in the line a $G_\delta$ set?

3.14 Show that $E \in \mathcal{L}$ iff $E = H \setminus N$, where $H$ is a $G_\delta$ set and $|N| = 0$.

3.15 Does there exist a function $f : R \to [0,1]$ such that the set $D$ of its discontinuities has $|D| = 0$ and $D \cap I$ is uncountable for every interval $I$ of $R$?

3.16 Assume $A$ is a Lebesgue measurable subset of $R$ of finite measure and put $\phi(x) = |A \cap (-\infty, x]|$. Show that $\phi$ is continuous at each $x$ of $R$.

3.17 Let $A$ be a Lebesgue measurable subset of $R$ and let $0 < \eta < |A|$. Show that there exists a Lebesgue measurable set $B$ so that $B \subseteq A$ and $|B| = \eta$.

3.18 Given $\varepsilon > 0$, show that there exists a dense open subset $O$ of $[0,1]$ with $|O| < \varepsilon$ so that its boundary $\partial O$ satisfies $|\partial O| \ge 1 - \varepsilon$.

3.19 Let $A = \{x \in [0,1] : x = .a_1 a_2 \ldots,\ a_n \ne 7, \text{ all } n\}$. Prove that $|A| = 0$. Generalize this result to different configurations of $a_n$'s and to dyadic, triadic expansions.
3.20 Let $A = \{x \in [0,1] : x = .a_1 a_2 \ldots,\ a_n = 2 \text{ or } 3, \text{ all } n\}$. Show that $A$ is measurable and compute $|A|$.

3.21 Let $A = \{x \in R :$ there exist infinitely many pairs of integers $p, q$ such that $|x - p/q| \le 1/q^3\}$. Show that $|A| = 0$.

3.22 Suppose $I_1, \ldots, I_n$ are open intervals in $R$, so that if $Q_1 = Q \cap [0,1]$ denotes the rational numbers in $[0,1]$, then $Q_1 \subset \bigcup_{j=1}^{n} I_j$. Prove that $\sum_{j=1}^{n} |I_j| \ge 1$. Is the conclusion true if the $I_j$'s are measurable sets rather than intervals? What if we allow the collection of intervals to be infinite rather than finite?

3.23 Let $r_1, r_2, \ldots$ be an enumeration of $Q$. Show that

On the other hand, also show that

may, or may not, be empty.

3.24 Suppose $E$ is a bounded measurable subset of $R$, $|E| > 0$. Prove that there exist $x_1, x_2 \in E$ so that $x_1 - x_2 \in Q$.

3.25 Show that if $B$ is a Hamel basis for $R$, then $|B| = 0$.

3.26 Construct Lebesgue null subsets $B_1, B_2$ of $R$ such that

3.27 Construct a Cantor-type subset $C_\eta$ of $[0,1]$ by removing at the $n$th stage a "middle" interval of length $(1-\eta)3^{-n}$, $0 < \eta < 1$. Show that $C_\eta$ enjoys all the properties of $C$, but it has Lebesgue measure $|C_\eta| = \eta$.

3.28 If $-1 \le r \le 1$, show there exist $x, y \in C$ such that $y - x = r$.

3.29 Does the Cantor set contain a Hamel basis for $R$?

3.30 Construct a Cantor-like subset of $[0,1]$ which consists entirely of irrational numbers.

3.31 Let $(a_n)$ be a fixed decreasing sequence of real numbers such that $a_0 = 1$, and $0 < 2a_n < a_{n-1}$, and define the sequence $(d_n)$ by $d_n = a_{n-1} - 2a_n$, $n \ge 1$. Now let $I_{0,1} = [0,1]$, $I_{1,1} = [0,a_1]$, $I_{1,2} = [1-a_1, 1]$, $I_{2,2} = [a_1 - a_2, a_1]$, $I_{3,3} = [a_1 - a_2, a_1 - a_2 + a_3]$, $I_{3,4} = [a_1 - a_3, a_1]$, and so on; this definition can be made precise by induction. Now put

$$F_n = \bigcup_{k=1}^{2^n} I_{n,k}, \quad \text{and} \quad P = \bigcap_{n=1}^{\infty} F_n.$$

Show that $P \in \mathcal{L}$ and $|P| = \lim_{n\to\infty} 2^n a_n$. Moreover, if $0 \le \eta < 1$, the $a_n$'s can be chosen so that $|P| = \eta$. Also, if $r_n = a_{n-1} - a_n$, then the elements of $P$ are precisely those real numbers of the form $\sum_{k=1}^{\infty} \varepsilon_k r_k$, $\varepsilon_k = 0$ or $1$.

3.32 Show that the class of Lebesgue measurable subsets of $R^n$ has cardinality $2^c$. Consider now the following relation on $\mathcal{L}$: Given $E_1, E_2$ in $\mathcal{L}$, we say that $E_1 \sim E_2$ if $|E_1 \,\triangle\, E_2| = 0$. Show that $\sim$ is an equivalence relation on $\mathcal{L}$ and that the family of the equivalence classes has cardinality $c$.
3.33 Prove that there is no Lebesgue measurable subset $A$ of $R$ such that $a|I| \le |A \cap I| \le b|I|$ for all open intervals $I$ of the line. More precisely, prove the following two assertions: (a) If $|A \cap I| \le b|I|$ for all open intervals $I \subset R$ and $b < 1$, then $|A| = 0$, and, (b) If $a|I| \le |A \cap I|$ for all open intervals $I \subset R$ and $a > 0$, then $|A| = 1$.

3.34 Prove there exists a Lebesgue measurable set $E \subset R$ such that

$$0 < |E \cap I| < |I|, \quad \text{all bounded intervals } I \subset R.$$

3.35 Does there exist a measurable subset $E$ of $R$ such that

$$0 < |E \cap I|, \quad 0 < |I \setminus E|, \quad \text{all intervals } I \subset R\,?$$

3.36 A measurable subset $A$ of $R$ is said to have a well-defined density, if the limit

$$D(A) = \lim_{\lambda\to\infty} \frac{|A \cap (-\lambda, \lambda)|}{2\lambda}$$

exists. In this case $D(A)$ is called the density of $A$. Give an example of a measurable set whose density is defined, and one whose density is not defined. Further, prove that if $A_1$ and $A_2$ are disjoint and have well-defined density, then $A_1 \cup A_2$ also has a well-defined density, and $D(A_1 \cup A_2) = D(A_1) + D(A_2)$.

3.37 Prove that the Lebesgue measure enjoys the following property, known as regularity: Given a measurable set $A$ we have

$$|A| = \sup\{|K| : K \text{ is compact, and } K \subseteq A\}.$$
3.38 It is difficult to approximate sets that are not Lebesgue measurable with measurable ones. Specifically, suppose $E \subset R^n$ is not Lebesgue measurable. Show there is $\eta > 0$ such that if $E \subseteq A$ and $R^n \setminus E \subseteq B$, and if $A$ and $B$ are Lebesgue measurable, then $|A \cap B| \ge \eta$.

3.39 Decide whether the following statement is true: $A \subseteq R^n$ is Lebesgue measurable iff for every open subset $G$ of $R^n$ we have

$$|G| = |G \cap A|_e + |G \setminus A|_e.$$

3.40 Suppose $\mu^*$ is a nonnegative $\sigma$-subadditive monotone set function defined for all the subsets of a set $X$ such that $\mu^*(\emptyset) = 0$. We say that $E \subseteq X$ is measurable with respect to $\mu^*$ if for every subset $A \subseteq X$ we have

$$\mu^*(A) = \mu^*(A \cap E) + \mu^*(A \setminus E).$$

Let $\mathcal{M}$ be the class of subsets of $X$ which are measurable with respect to $\mu^*$. Show that $\mathcal{M}$ is a $\sigma$-algebra of subsets of $X$ and that the restriction of $\mu^*$ to $\mathcal{M}$ defines a measure on $(X, \mathcal{M})$. This construction is known as the Carathéodory extension of an outer measure.
CHAPTER
VI
Measurable Functions
In this chapter we introduce the class of measurable functions, for which the integral will be defined, and discuss some of its basic properties.
1. ELEMENTARY PROPERTIES OF MEASURABLE FUNCTIONS

Let $\mathcal{M}$ be a $\sigma$-algebra of (measurable) subsets of $X$ and suppose $f$ is an extended real-valued function defined on $X$; by this we mean that, in addition to real values, $f$ may also assume the values $\pm\infty$. We say that $f$ is measurable if for any real number $\lambda$, $\{x \in X : f(x) > \lambda\} = \{f > \lambda\} \in \mathcal{M}$; that is to say, all the level sets of $f$ are measurable. For instance, for any $\mathcal{M}$, $f = \chi_A$ is measurable iff $A \in \mathcal{M}$. If $\mathcal{M} = \{\emptyset, X\}$, only constant functions are measurable, and if $\mathcal{M} = \mathcal{P}(X)$, all functions are measurable. We begin by exploring some simple properties of measurable functions.

Proposition 1.1. Suppose $\mathcal{M}$ is a $\sigma$-algebra of subsets of $X$ and let $f$ be an extended real-valued function defined on $X$. Then, the following statements are equivalent:
(i) $f$ is measurable.
(ii) For any real $\lambda$, $\{f \ge \lambda\} \in \mathcal{M}$.
(iii) For any real $\lambda$, $\{f < \lambda\} \in \mathcal{M}$.
(iv) For any real $\lambda$, $\{f \le \lambda\} \in \mathcal{M}$.

Proof. (i) implies (ii). Fix $\lambda$, and for $n \ge 1$ let $A_n = \{f > \lambda - 1/n\}$; by assumption $A_n \in \mathcal{M}$, all $n$. Now, since $\{f \ge \lambda\}$ is the intersection of the $A_n$'s, it also belongs to $\mathcal{M}$, and (ii) holds.
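(ii) implies (iii). $\{f < \lambda\} = X \setminus \{f \ge \lambda\}$.
(iii) implies (iv). $\{f \le \lambda\} = \bigcap_{n=1}^{\infty} \{f < \lambda + 1/n\}$.
(iv) implies (i). $\{f > \lambda\} = X \setminus \{f \le \lambda\}$. ∎

In working with measurable functions it is essential to know whether certain sets are measurable. Since these sets are readily obtained from those introduced in Proposition 1.1 we merely indicate how their measurability is established.

$$\{f = \infty\} = \bigcap_{n=1}^{\infty} \{f > n\}, \quad \{f = -\infty\} = \bigcap_{n=1}^{\infty} \{f < -n\}. \tag{1.1}$$

$$\{f < \infty\} = \bigcup_{n=1}^{\infty} \{f < n\}, \quad \{f > -\infty\} = \bigcup_{n=1}^{\infty} \{f > -n\}. \tag{1.2}$$

$$\{-\infty < f < \infty\} = \{f > -\infty\} \cap \{f < \infty\}. \tag{1.3}$$

Also, for any real numbers $\lambda, \mu$, we will have the occasion to deal with the measurable sets

$$\{\lambda < f < \infty\} = \{f > \lambda\} \cap \{f < \infty\}, \quad \{-\infty < f < \lambda\} = \{f < \lambda\} \cap \{f > -\infty\}. \tag{1.4}$$

$$\{\lambda < f < \mu\} = \{f > \lambda\} \cap \{f < \mu\}, \quad \{\lambda < f \le \mu\} = \{f > \lambda\} \cap \{f \le \mu\}. \tag{1.5}$$

$$\{\lambda \le f < \mu\} = \{f \ge \lambda\} \cap \{f < \mu\}, \quad \{\lambda \le f \le \mu\} = \{f \ge \lambda\} \cap \{f \le \mu\}. \tag{1.6}$$

$$\{f = \lambda\} = \{f \ge \lambda\} \cap \{f \le \lambda\}, \quad \{f \ne \lambda\} = X \setminus \{f = \lambda\}. \tag{1.7}$$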
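On a finite set the definition of measurability can be tested directly. The sketch below assumes the small $\sigma$-algebra $\mathcal{M} = \{\emptyset, \{a\}, \{b,c\}, X\}$ on $X = \{a,b,c\}$ used in Chapter IV; functions are modelled as Python dictionaries and the names `level_set` and `is_measurable` are illustrative. Since $f$ takes finitely many values, the sets $\{f > \lambda\}$ change only when $\lambda$ crosses a value of $f$, so it is enough to test those thresholds together with one threshold below the minimum.

```python
X = frozenset("abc")
M = {frozenset(), frozenset("a"), frozenset("bc"), X}

def level_set(f, lam):
    """The set {f > lam} for a function f given as a dict on X."""
    return frozenset(x for x in X if f[x] > lam)

def is_measurable(f):
    """Check that {f > lam} lies in M for every real lam (finitely many cases)."""
    thresholds = set(f.values()) | {min(f.values()) - 1}
    return all(level_set(f, lam) in M for lam in thresholds)

f = {"a": 2.0, "b": -1.0, "c": -1.0}   # constant on {b, c}
g = {"a": 0.0, "b": 1.0, "c": 0.0}     # characteristic function of {b}
```

Here `is_measurable(f)` holds because $f$ is constant on the atom $\{b,c\}$, whereas `is_measurable(g)` fails, in line with the remark that $\chi_A$ is measurable iff $A \in \mathcal{M}$.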
Our next result indicates how to handle the infinite values of a measurable function.

Proposition 1.2. Let $\mathcal{M}$ be a $\sigma$-algebra of subsets of $X$ and let $f$ be an extended real-valued function defined on $X$. Then, $f$ is measurable iff $\{f = -\infty\} \in \mathcal{M}$ and for each real $\lambda$, $\{\lambda < f < \infty\} \in \mathcal{M}$.

Proof. The necessity has been established in (1.1) and (1.4). As for the sufficiency, first observe that

$$\{f < \infty\} = \{f = -\infty\} \cup \Big(\bigcup_{n=-\infty}^{\infty} \{n < f < \infty\}\Big)$$

belongs to $\mathcal{M}$ by assumption. Whence $\{f = \infty\} = X \setminus \{f < \infty\}$ is also in $\mathcal{M}$, and since for each real $\lambda$ we have

$$\{f > \lambda\} = \{\lambda < f < \infty\} \cup \{f = \infty\} \in \mathcal{M},$$

the level sets of $f$ are measurable and $f$ is measurable. ∎
In fact, a more general statement is true.
Proposition 1.3. Let $\mathcal{M}$ be a $\sigma$-algebra of subsets of $X$ and suppose $f$ is an extended real-valued function defined on $X$. Then, $f$ is measurable iff $\{f = -\infty\} \in \mathcal{M}$ and for each open subset $O \subseteq R$, $f^{-1}(O) \in \mathcal{M}$.

Proof. Since for each real $\lambda$, $(\lambda, \infty)$ is open, the sufficiency follows from Proposition 1.2. As for the necessity, suppose $O$ is an open subset of $R$ and write $O = \bigcup_k I_k$, where the $I_k$'s are an at most countable collection of pairwise disjoint open intervals, one or two of which are possibly unbounded. By 5.8 in Chapter I

$$f^{-1}(O) = \bigcup_k f^{-1}(I_k). \tag{1.8}$$

Now, by (1.3), (1.4) and (1.5), the sets in the union on the right-hand side of (1.8) all belong to $\mathcal{M}$ and $f^{-1}(O)$ is measurable. Since $\{f = -\infty\} \in \mathcal{M}$ whenever $f$ is measurable, we have finished. ∎

So far the role of measures on $(X, \mathcal{M})$ is not apparent, but in dealing with measurable functions sets of measure 0 are important and the following concept essential. Given a measure space $(X, \mathcal{M}, \mu)$, we say that a property $P(x)$ is true $\mu$-almost everywhere on a measurable subset $E$ of $X$, and denote this by $\mu$-a.e. on $E$, if $\mu(\{x \in E : P(x) \text{ is not true}\}) = 0$. For instance, we say that a measurable function $f$ is finite $\mu$-a.e. on $E$ if $\mu(\{x \in E : f(x) = \pm\infty\}) = 0$. It is natural to expect that measurable functions that coincide $\mu$-a.e. on $X$ be, in some sense, equivalent. A more precise statement is
Theorem 1.4. Let $\mu$ be a complete measure on $(X, \mathcal{M})$, and let $f, g$ be extended real-valued functions defined on $X$. If $f$ is measurable and $g = f$ $\mu$-a.e., then $g$ is also measurable and

$$\mu(\{g > \lambda\}) = \mu(\{f > \lambda\}), \quad \text{all real } \lambda. \tag{1.9}$$

Proof. Let $N = \{g \ne f\}$; by assumption $N$ is a null, measurable, set. Now, for each real $\lambda$ we have

$$\{g > \lambda\} \cup N = \{f > \lambda\} \cup N. \tag{1.10}$$

Since $f$ is measurable, the set on the right-hand side of (1.10) is measurable, and so is the set on the left-hand side there. Moreover, since $\mu$ is complete and $N_0 = \{x \in N : g(x) \le \lambda\} \subseteq N$, then $N_0$ is also a null, measurable set and consequently,

$$\{g > \lambda\} = (\{g > \lambda\} \cup N) \setminus N_0 \tag{1.11}$$

is also measurable. Next observe that since $N$ is null, we have $\mu(\{f > \lambda\} \cup N) = \mu(\{f > \lambda\})$ for all real $\lambda$. Whence, by (1.11) and (an argument similar to) 3.2 in Chapter V, we get

$$\mu(\{g > \lambda\}) = \mu(\{g > \lambda\} \cup N) - \mu(N_0) = \mu(\{f > \lambda\} \cup N) = \mu(\{f > \lambda\}). \ ∎$$
Theorem 1.4 states that functions that coincide $\mu$-a.e. are roughly interchangeable; this property is essential in operating with extended real-valued functions. Consider, for instance, addition: $f(x) + g(x)$ is undefined for those $x$'s where $f$ and $g$ assume infinite values of opposite sign. The idea is to work with functions $\tilde f$ and $\tilde g$ which are closely related to $f$ and $g$, and for which the sum makes sense. We proceed as follows: Let the (bad) set

$$B = \{x \in X : f(x) = \infty,\ g(x) = -\infty\} \cup \{x \in X : f(x) = -\infty,\ g(x) = \infty\}.$$

Since $B$ is measurable, $\mathcal{M}_{X \setminus B} = \{E \cap (X \setminus B) : E \in \mathcal{M}\}$ is a $\sigma$-algebra of subsets of $X \setminus B$. Observe that $\tilde f = f|(X \setminus B)$ and $\tilde g = g|(X \setminus B)$ are also measurable on $(X \setminus B, \mathcal{M}_{X \setminus B})$, and that $\tilde f(x) + \tilde g(x)$ is defined for any $x \in X \setminus B$. In fact, as we shall prove below, $\tilde f + \tilde g$ is also measurable.

To avoid having to go through various technical considerations each time we discuss an operation involving measurable functions, we sort the functions out into equivalence classes and operate at the level of classes. Let $(X, \mathcal{M}, \mu)$ be a measure space. We consider the collection $\mathcal{F}$ consisting of those functions $f$ that satisfy the following properties:
(i) $f$ is an extended real-valued function defined on $X \setminus N$, where $N$ is a null subset of $X$.
(ii) $f$ is measurable, as a function on $(X \setminus N, \mathcal{M}_{X \setminus N})$.
Note that we only require functions in $\mathcal{F}$ to be defined $\mu$-a.e. on $X$. Next we identify those measurable functions which coincide $\mu$-a.e.; more precisely, given $f, g \in \mathcal{F}$, we say that $f \sim g$ iff there is a null subset $N$ of $X$ such that $f(x) = g(x)$ for $x \in X \setminus N$. It is clear that $\sim$ is an equivalence relation on $\mathcal{F}$; the only property that offers any difficulty is the transitivity, and this follows at once from the fact that the union of null sets is null.

We return to the addition: Given equivalence classes $\bar f, \bar g$ in $\mathcal{F}$ corresponding to the finite $\mu$-a.e. functions $f, g$, by removing the bad set $B$ we readily see that $h(x) = f(x) + g(x)$ is a finite quantity for $x \in X \setminus N$, $N$ null, and we put $\bar f + \bar g = \bar h$. It is straightforward to verify that $\bar h$ is well-defined, i.e., it is independent of the representatives of the classes $\bar f, \bar g$, and this completes our discussion.

Now that we know how things should be done we agree to denote the equivalence class $\bar f$ of a function $f$ once again by $f$, and to operate with the classes as if they were functions. This should cause no undue stress, and the reader should keep in mind that a statement such as "a function $f$ defined on $X$" actually means "an equivalence class $\bar f$ of a function $f$ defined $\mu$-a.e. on $X$." To deal with the usual arithmetic operations we need a preliminary result.
Lemma 1.5. Let $(X, \mathcal{M}, \mu)$ be a measure space, and suppose $f, g$ are extended real-valued measurable functions defined on $X$. Then $\{f > g\} \in \mathcal{M}$.

Proof. Let $\{r_k\}$ be an enumeration of the rational numbers and observe that by Proposition 1.1 each set

$$E_k = \{f > r_k\} \cap \{g < r_k\}$$

is measurable. Thus $\{f > g\} = \bigcup_k E_k$ is also measurable. ∎

We are now ready to prove
Theorem 1.6. Let $(X, \mathcal{M}, \mu)$ be a measure space and $f, g$ be extended real-valued measurable functions defined on $X$. Then $f \pm g$ is also measurable.

Proof. We only do the addition. Observe that for any real $\lambda$, $\lambda - g$ is measurable. Since

$$\{f + g > \lambda\} = \{f > \lambda - g\}, \quad \text{real } \lambda,$$

the conclusion follows at once from Lemma 1.5. ∎
The other operations of interest to us are covered by the following result.

Theorem 1.7. Let $(X, \mathcal{M}, \mu)$ be a measure space, assume $f$ is a measurable, finite $\mu$-a.e. function defined on $X$ and let $\phi$ be a real-valued continuous function defined on $R$. Then the composition $\phi \circ f$ is measurable.

Proof. Since $\{f = \pm\infty\}$ is a null set we may assume that $\phi \circ f$ is well-defined and that $\{\phi \circ f = -\infty\} = \emptyset$. By Proposition 1.2 the measurability of $\phi \circ f$ will be established once we show that

$$(\phi \circ f)^{-1}((\lambda, \infty)) = f^{-1}(\phi^{-1}((\lambda, \infty))) \in \mathcal{M}, \quad \text{all real } \lambda. \qquad (1.12)$$

But this is not hard; indeed, since $\phi$ is continuous, $\phi^{-1}((\lambda, \infty)) = O$ is an open subset of $R$, and, by Proposition 1.3, $f^{-1}(O) \in \mathcal{M}$. Thus (1.12) holds and we have finished. ∎

Theorem 1.7 shows that the composition $\phi \circ f$ of a measurable function $f$ with a continuous function $\phi$ is measurable; it is not intuitively apparent that the composition $f \circ \phi$ should also be measurable. In fact, it is not, as the following example shows. Let $\{K_n\}$ be a sequence of Cantor-like sets, $|K_n| = 1 - 1/n$, $n = 2, 3, \ldots$, and let $A = \bigcup_n K_n$. Since

$$|[0,1] \setminus A| \le |[0,1] \setminus K_n| \le 1/n, \quad n = 2, 3, \ldots,$$

it readily follows that $[0,1] = \bigcup_n K_n \cup Z$, $|Z| = 0$, and consequently for any subset $B$ of $[0,1]$ we have

$$B = \bigcup_n (B \cap K_n) \cup (B \cap Z).$$
In particular, if $B$ is not Lebesgue measurable, there is an index $N$ so that $B \cap K_N$ is not Lebesgue measurable. Referring to the construction of the Cantor set, let $D_n = [0,1] \setminus C_n$, where as usual $C_n$ denotes the union of the intervals remaining after $n$ steps. $D_n$ consists of $2^n - 1$ open intervals, $I_k^n$ say, ordered from left to right by $k$, removed in the first $n$ steps of the construction of $C$. Since $K_N$ is a Cantor-like set, there also is a sequence of open intervals, $J_k^n$ say, $1 \le k \le 2^n - 1$, ordered from left to right by $k$, removed in the first $n$ steps of the construction of $K_N$.

Figure 3

We define now a function $h$ from $[0,1]$ onto $[0,1]$ as follows: Construct $K_N$ in the interval $[0,1]$ corresponding to the domain of $h$, and $C$ in the interval $[0,1]$ that corresponds to the range of $h$. Then $h$ is the function that maps the left-end point of $J_k^n$ into the left-end point of $I_k^n$, the right-end point of $J_k^n$ into the right-end point of $I_k^n$, and is extended to $[0,1] \setminus K_N$ by continuity. It is not hard to check that $h$ is well-defined, one-to-one (if this were not the case $K_N$ would contain an interval, and this is not possible) and onto $[0,1]$.

Let $B \cap K_N$ be a non-Lebesgue measurable subset of $K_N$ and put $A = h(B \cap K_N) \subseteq C$. Then $A$ is null, and consequently measurable; in other words the image of this non-Lebesgue measurable set by a continuous function is Lebesgue measurable. Another way to express this situation is the following: If $\psi = h^{-1}$ is the continuous inverse of $h$, then $\psi(A) = B \cap K_N$, and the image of a Lebesgue measurable set by a continuous function is not necessarily Lebesgue measurable. Also $\psi(C) = K_N$, and $\psi$ takes a null set onto a set of positive Lebesgue measure.

Returning to the question at hand, let $f = \chi_A$; since $A$ is null $f$ is Lebesgue measurable. Consider now the composition $f \circ \psi$. The inverse image $(f \circ \psi)^{-1}((1/2, 3/2))$ is readily seen to equal
I!,
which is not measurable. Thus, as asserted, $f \circ \psi$ is not measurable.
The measurability of several expressions involving $f$ follows at once from Theorem 1.7, with an appropriate choice of $\phi$. ...

... the sets $B_{n,\eta} = \{x \in M : f_n(x) > \eta\}$ satisfy $\mu(\limsup B_{n,\eta}) = 0$.

Proof. Let $f_n \to 0$ $\mu$-a.e. on $M$ and suppose that for some $\eta > 0$ we have $\mu(\limsup B_{n,\eta}) > 0$. Then each $x \in \limsup B_{n,\eta}$ belongs to infinitely many of the $B_{n,\eta}$'s, and consequently, there is a sequence $n_k \to \infty$ such that $f_{n_k}(x) > \eta$. Whence, $\limsup f_n(x) \ge \eta > 0$, and the $f_n$'s do not converge to 0 on a subset of $M$ of positive measure; this is a contradiction. Conversely, given $\varepsilon > 0$, pick $0 < \eta < \varepsilon$, and consider a point $x$ in $M \setminus (\limsup B_{n,\eta})$. Since $x$ belongs to at most finitely many of the $B_{n,\eta}$'s there is an $n_0$ such that $f_n(x) \le \eta < \varepsilon$ for all $n \ge n_0$. In other words, $f_n(x) \to 0$ for $x \in M \setminus N$, $\mu(N) = 0$, which is precisely what we wanted to show. ∎

Since $\limsup B_{n,\eta} = \bigcap_{k=1}^{\infty} \left( \bigcup_{n=k}^{\infty} B_{n,\eta} \right)$, if $\mu\left(\bigcup_{n=k}^{\infty} B_{n,\eta}\right) < \infty$ for some $k$, by (3.4) in Chapter IV the conditions of Proposition 3.1 are satisfied iff

$$\lim_{k \to \infty} \mu\Big( \bigcup_{n=k}^{\infty} B_{n,\eta} \Big) = 0, \quad \text{all } \eta > 0. \qquad (3.1)$$
In particular, if $\mu(X) < \infty$, (3.1) describes convergence $\mu$-a.e. The relation (3.1) points to a possible limitation of the concept of $\mu$-a.e. convergence, namely, we require the control of all the $B_{n,\eta}$'s, from one index on. To illustrate this point, consider the sequence of (dyadic) subintervals of $I = [0,1]$ defined as follows: $I_0 = I$, $I_1 = [0,1/2]$, $I_2 = [1/2,1]$, $I_3 = [0,1/4]$, and so on. In other words, the sequence consists of successive blocks of $2^n$ nonoverlapping intervals, each of length $2^{-n}$, and the union of the intervals in each block is $I$. Let $\{f_n\}$ be the sequence consisting of the characteristic functions of the $I_n$'s. Clearly $\{f_n\}$ does not converge to 0 anywhere on $I$, yet in some sense the $f_n$'s are getting close to 0. Specifically,

$$\lim_{n \to \infty} |\{f_n > \eta\}| = 0, \quad \text{all real } \eta > 0. \qquad (3.2)$$
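The dyadic example lends itself to a quick numerical check. The Python sketch below is only an illustration: the helper names `dyadic_interval`, `f` and `measure_of_level_set` are hypothetical, and the indexing assumes the enumeration $I_0 = I$, $I_1 = [0,1/2]$, $I_2 = [1/2,1]$, $I_3 = [0,1/4]$ described in the text. It shows that the length of $\{f_n > \eta\}$, $0 < \eta < 1$, shrinks to 0 while the fixed point $x = 1/3$ is still covered once in every block, so that $f_n(x)$ does not tend to 0.

```python
from fractions import Fraction

def dyadic_interval(n):
    """Endpoints (a, b) of I_n, n = 0, 1, 2, ...

    Block j consists of the 2**j intervals of length 2**-j and
    occupies the indices 2**j - 1, ..., 2**(j + 1) - 2.
    """
    j = (n + 1).bit_length() - 1       # block number
    i = n + 1 - 2**j                   # position of I_n inside block j
    length = Fraction(1, 2**j)
    return i * length, (i + 1) * length

def f(n, x):
    """Characteristic function of I_n evaluated at x."""
    a, b = dyadic_interval(n)
    return 1 if a <= x <= b else 0

def measure_of_level_set(n):
    """|{f_n > eta}| for any 0 < eta < 1, i.e. the length of I_n."""
    a, b = dyadic_interval(n)
    return b - a

x = Fraction(1, 3)
# Number of indices n in block j with f_n(x) = 1, for the first eight blocks.
hits_per_block = [
    sum(f(n, x) for n in range(2**j - 1, 2**(j + 1) - 1)) for j in range(8)
]
# Lengths of I_0, ..., I_14; they decrease to 0 block by block, as in (3.2).
lengths = [measure_of_level_set(n) for n in range(15)]
```

Exact rational arithmetic is used so that membership in the closed intervals is decided without rounding.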
Notice that in contrast to (3.1), we are dealing with one $B_{n,\eta}$ at a time. Motivated by this remark we introduce the following definition. Given a measure space $(X, \mathcal{M}, \mu)$ and a sequence $\{f_n\}$ of measurable nonnegative extended real-valued finite $\mu$-a.e. functions defined on $X$, we say that $f_n$ converges to 0 in $\mu$-measure iff

$$\lim_{n \to \infty} \mu(\{f_n > \eta\}) = 0, \quad \text{all real } \eta > 0. \qquad (3.3)$$
If $\mu(X) < \infty$ we refer to convergence in $\mu$-measure as convergence in the sense of probability, or convergence in probability. Thus, $\mu$-a.e. convergence implies convergence in probability, but the opposite is not true. Also, $\mu$-a.e. convergence does not, in general, imply convergence in $\mu$-measure. To see this consider $(R, \mathcal{L}, |\cdot|)$, and observe that the sequence $f_n(x) = \chi_{[n,\infty)}(x)$ tends to 0 everywhere, but $|\{f_n > 1/2\}| = \infty$ for all $n$. Nevertheless, a closer look at the first example indicates that there is a subsequence $\{f_{n_k}\}$ of the $f_n$'s, specifically that consisting of the characteristic functions of the intervals $[0, 1/2^n]$, $n \ge 1$, with the property that $\lim_{k \to \infty} f_{n_k}(x) = 0$ for all $x \in (0,1]$. The remarkable fact is that this property is true for arbitrary sequences; before we prove this we need a bit of information concerning convergence in $\mu$-measure.

Proposition 3.2. Let $(X, \mathcal{M}, \mu)$ be a measure space and suppose $\{f_n\}$ is a sequence of measurable nonnegative extended real-valued finite $\mu$-a.e. functions defined on $X$. Then, $f_n \to 0$ in $\mu$-measure iff for any $\varepsilon, \delta > 0$, there exists a constant $N = N_{\varepsilon,\delta}$ such that

$$\mu(\{f_n > \delta\}) < \varepsilon, \quad \text{all } n \ge N. \qquad (3.4)$$
Proof. The necessity of the condition is obvious. As for the sufficiency, if $f_n \not\to 0$ in $\mu$-measure, then by (3.3) there exist $\eta > 0$ and a sequence $n_k \to \infty$ such that

$$L = \limsup \mu(\{f_{n_k} > \eta\}) > 0.$$

Thus, (3.4) cannot hold for $\varepsilon = L$ and $\delta = \eta$, and this contradiction completes the proof. ∎
Next we show that convergence in probability implies $\mu$-a.e. convergence along a subsequence.
Proposition 3.3. Let $(X, \mathcal{M}, \mu)$ be a finite measure space, and assume $\{f_n\}$ is a sequence of measurable, nonnegative, extended real-valued, finite $\mu$-a.e. functions defined on $X$. If $f_n \to 0$ in probability, then there is an increasing sequence $n_k \to \infty$ such that

$$\lim_{k \to \infty} f_{n_k} = 0 \quad \mu\text{-a.e. on } X.$$

Proof. By Proposition 3.2 it follows that for each $k$ we may find an index $n_k < n_{k+1}$ with the property that

$$\mu(\{f_{n_k} > 1/2^k\}) < 1/2^k. \qquad (3.5)$$

Let $B_k = \{f_{n_k} > 1/2^k\}$ and consider the (bad) set $B = \limsup B_k$. Since by (3.5) $\sum_{k=1}^{\infty} \mu(B_k) < \infty$, by the Borel-Cantelli Lemma we have $\mu(B) = 0$. Now, it is not hard to see that

$$\lim_{k \to \infty} f_{n_k}(x) = 0, \quad x \in X \setminus B. \qquad (3.6)$$

Indeed, if $x \notin B$, then $x$ belongs to at most finitely many of the $B_k$'s and (3.6) holds. ∎
Sometimes we have to deal with questions of convergence when no limit is in evidence. For $\mu$-a.e. convergence this can be reduced to the numerical case, where the Cauchy criterion is available. Specifically, let $(X, \mathcal{M}, \mu)$ be a measure space, and let $\{f_n\}$ be a sequence of measurable extended real-valued finite $\mu$-a.e. functions defined on $X$. We say that $\{f_n\}$ is Cauchy $\mu$-a.e. if for $\mu$-almost every $x \in X$, given $\varepsilon > 0$, there is an integer $n_0 = n_0(x)$ such that

$$|f_n(x) - f_{n'}(x)| \le \varepsilon, \quad \text{all } n, n' \ge n_0.$$

By the Cauchy criterion of convergence of numerical sequences, if $\{f_n\}$ is Cauchy $\mu$-a.e., then $\lim_{n \to \infty} f_n(x) = f(x)$ exists $\mu$-a.e. The same is true for convergence in probability. Let $(X, \mathcal{M}, \mu)$ be a finite measure space, and let $\{f_n\}$ be a sequence of measurable, extended real-valued, finite $\mu$-a.e. functions defined on $X$. We say that $\{f_n\}$ is Cauchy in probability if given $\varepsilon, \delta > 0$, there is an integer $n_0$ such that

$$\mu(\{|f_n - f_{n'}| > \delta\}) < \varepsilon, \quad \text{all } n, n' \ge n_0.$$
Sequences which are Cauchy in probability converge in the sense of probability, cf. 4.30 below, and convergence in probability corresponds to a notion of "metric" convergence, cf. 4.31 below.
Next we discuss the concept of uniform convergence. Let $(X, \mathcal{M}, \mu)$ be a measure space and assume $\{f_n\}$ is a sequence of measurable nonnegative extended real-valued finite $\mu$-a.e. functions defined on $X$. We say that $f_n \to 0$ almost uniformly if given $\varepsilon > 0$, we can find a measurable subset $B$ of $X$ such that $\mu(B) < \varepsilon$ and

$$\lim_{n \to \infty} f_n(x) = 0, \quad \text{uniformly for } x \in X \setminus B.$$

Proposition 3.4. Let $(X, \mathcal{M}, \mu)$ be a measure space, let $\{f_n\}$ be a sequence of measurable nonnegative extended real-valued finite $\mu$-a.e. functions defined on $X$, and suppose that $f_n \to 0$ almost uniformly. Then $f_n \to 0$ $\mu$-a.e.

Proof. For every positive integer $k$ there is a measurable subset $B_k$ of $X$ such that $\mu(B_k) < 1/k$ and $f_n(x) \to 0$, uniformly for $x \in X \setminus B_k$. Clearly $f_n \to 0$ pointwise on the (good) set $G = \bigcup_{k=1}^{\infty} (X \setminus B_k)$. It only remains to check that $\mu(X \setminus G) = 0$; this is not hard. Since $X \setminus G$ equals

$$\bigcap_{k=1}^{\infty} B_k$$

and $\mu(B_1) < \infty$, it readily follows that

$$\mu(X \setminus G) = \lim_{n \to \infty} \mu\Big( \bigcap_{k=1}^{n} B_k \Big) \le \lim_{n \to \infty} \frac{1}{n} = 0. \quad ∎$$
How about the converse to Proposition 3.4? To decide whether it is true we investigate the rate at which arbitrary sequences converge pointwise to 0. First we show that the convergence occurs at a fairly rapid rate.

Theorem 3.5. Let $(X, \mathcal{M}, \mu)$ be a finite measure space, and suppose $\{f_n\}$ is a sequence of measurable nonnegative extended real-valued finite $\mu$-a.e. functions defined on $X$ so that $f_n \to 0$ $\mu$-a.e. Then there exists a nondecreasing sequence of integers $\lambda_n \to \infty$ with the property that

$$\lim_{n \to \infty} \lambda_n f_n = 0 \quad \mu\text{-a.e. on } X. \qquad (3.7)$$
Proof. Redefining the $f_n$'s if necessary on a set of measure 0, we may assume that the $f_n$'s are finite everywhere and that $f_n(x) \to 0$ for every $x \in X$. Let

$$g_n(x) = \sup_{k \ge n} f_k(x), \quad x \in X.$$

Clearly the $g_n$'s are measurable, $g_n(x) \ge f_n(x)$, all $n$, $x \in X$, and

$$\lim_{n \to \infty} g_n(x) = 0.$$

In other words, working with the $g_n$'s instead, we may also assume that, in fact, the $f_n$'s decrease to 0 everywhere. The first step is to construct the $\lambda_n$'s. Let $n_1 = 1$, and note that since $f_n \to 0$ in probability, for each integer $k = 2, 3, \ldots$, there exist integers $n_k > n_{k-1}$ such that

$$\mu(\{f_{n_k} > 1/k^2\}) \le 2^{-k}. \qquad (3.8)$$

Put now

$$\lambda_n = k, \quad \text{for } n_k \le n < n_{k+1}, \quad k = 1, 2, \ldots \qquad (3.9)$$
In other words, the sequence of $\lambda_n$'s is defined in blocks: The first $(n_2 - n_1)$ entries are 1's, the next $(n_3 - n_2)$ entries are 2's, and so on. Furthermore, since $n_k \to \infty$ as $k \to \infty$, also $\lambda_n \to \infty$ as $n \to \infty$. Next we deal with the convergence of the sequence $\{\lambda_n f_n\}$. Let

$$B_m = \bigcup_{n \ge n_m} \{\lambda_n f_n > 1/m\}, \quad m = 1, 2, \ldots \qquad (3.10)$$

It is not hard to estimate $\mu(B_m)$. First observe that since the $\lambda_n$'s are constant on blocks we have

$$B_m = \bigcup_{k=m}^{\infty} \bigcup_{n=n_k}^{n_{k+1}-1} \{\lambda_n f_n > 1/m\} = \bigcup_{k=m}^{\infty} \bigcup_{n=n_k}^{n_{k+1}-1} \{k f_n > 1/m\}. \qquad (3.11)$$
Furthermore, since the sequence $\{f_n\}$ is nonincreasing and since $k \ge m$, the innermost union in (3.11) is contained in $\{f_{n_k} > 1/k^2\}$, and

$$B_m \subseteq \bigcup_{k=m}^{\infty} \{f_{n_k} > 1/k^2\}. \qquad (3.12)$$

Whence, by (3.12) and (3.8)

$$\mu(B_m) \le \sum_{k=m}^{\infty} \mu(\{f_{n_k} > 1/k^2\}) \le 2^{-m+1}.$$

Let the (bad) set $B = \limsup B_m$. Since $\sum \mu(B_m) < \infty$, by the Borel-Cantelli Lemma we have $\mu(B) = 0$. It only remains to check that for any $x \in X \setminus B$, we have $\lim_{n \to \infty} \lambda_n f_n(x) = 0$. But this is not hard: Given $\varepsilon > 0$, let $m$ be so large that $1/m \le \varepsilon$ and $x \notin B_m$; such a choice is always possible since $x$ belongs to at most finitely many of the $B_m$'s. Then by (3.10) there exists $n_m$ so that

$$\lambda_n f_n(x) \le 1/m \le \varepsilon, \quad \text{all } n \ge n_m. \quad ∎$$
Our next result is an interesting interpretation of Theorem 3.5.

Theorem 3.6. Let $(X, \mathcal{M}, \mu)$ be a finite measure space, and let $\{f_n\}$ be a sequence of measurable nonnegative extended real-valued finite $\mu$-a.e. functions defined on $X$ such that $f_n \to 0$ $\mu$-a.e. Then there exist a measurable nonnegative finite $\mu$-a.e. function $f$ defined on $X$ and a sequence of real numbers $\eta_n \to 0$ such that

$$f_n \le \eta_n f \quad \mu\text{-a.e. on } X. \qquad (3.13)$$

Proof. In the notation of Theorem 3.5, let

$$f(x) = \sup_n \{\lambda_n f_n(x)\}, \quad x \in X.$$

Clearly $f$ is measurable and nonnegative, and since $\lambda_n f_n \to 0$ $\mu$-a.e., $f$ is also finite $\mu$-a.e. Put now $\eta_n = 1/\lambda_n$, and note that $\eta_n \to 0$ and $f_n \le \eta_n f$ $\mu$-a.e., with $\eta_n$ as asserted. ∎
We are now ready to show that convergence $\mu$-a.e. implies almost uniform convergence; this result is due to Egorov (1869-1931).

Theorem 3.7 (Egorov). Let $(X, \mathcal{M}, \mu)$ be a finite measure space, and let $\{f_n\}$ be a sequence of measurable nonnegative finite $\mu$-a.e. functions defined on $X$ such that $f_n \to 0$ $\mu$-a.e. Then $f_n \to 0$ almost uniformly.

Proof. We must show that given $\varepsilon > 0$, there exists $B \in \mathcal{M}$ such that $\mu(B) < \varepsilon$ and $f_n(x) \to 0$ uniformly on $X \setminus B$. Let $f$ be the $\mu$-a.e. finite function corresponding to the sequence $\{f_n\}$ constructed in Theorem 3.6. By Proposition 2.1 there is a constant $M$ such that $f(x) \le M$ for $x \in X \setminus B$, $\mu(B) < \varepsilon$. By Theorem 3.6

$$f_n(x) \le \eta_n M, \quad x \in X \setminus B, \quad \mu(B) < \varepsilon,$$

and $f_n(x) \to 0$ uniformly for $x \in X \setminus B$. ∎

The measurability of the $f_n$'s is essential to the validity of Egorov's theorem, cf. 4.38 below, as is the assumption $\mu(X) < \infty$. Indeed, in the measure space $(R, \mathcal{L}, |\cdot|)$ the sequence $f_n = \chi_{[n,\infty)}$ tends to 0 everywhere on $R$, but not uniformly on any unbounded subset of $R$.
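A concrete instance may help fix the statement of Egorov's theorem. The Python sketch below uses the sequence $f_n(x) = x^n$ on $[0,1]$ with Lebesgue measure; this choice of sequence, the bad set $B = (1 - \varepsilon, 1]$, and the helper names are assumptions made for illustration and are not taken from the text. Here $f_n \to 0$ on $[0,1)$, hence a.e., the convergence is not uniform on $[0,1)$, yet after removing a set of measure $\varepsilon$ the supremum of $f_n$ is $(1 - \varepsilon)^n$, which tends to 0.

```python
def sup_on_good_set(n, eps):
    """sup of x**n over [0, 1 - eps]; x**n is increasing, so it is attained at 1 - eps."""
    return (1.0 - eps) ** n

def first_index_below(eps, tol):
    """Smallest n >= 1 with sup over [0, 1 - eps] of x**n <= tol.

    Assumes 0 < eps < 1 and tol > 0, otherwise the loop need not terminate.
    """
    n = 1
    while sup_on_good_set(n, eps) > tol:
        n += 1
    return n

# On [0, 1) itself the convergence is not uniform: points close to 1 keep x**n large.
near_one = (1.0 - 1e-9) ** 1000
```

For instance, `first_index_below(0.5, 0.125)` asks how far one must go in the sequence before $f_n \le 1/8$ on all of $[0, 1/2]$.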
4. PROBLEMS AND QUESTIONS

The setting of the first thirteen problems and questions is the following: $\mathcal{M}$ is a $\sigma$-algebra of (measurable) subsets of $X$ and $f$ is an extended real-valued function defined on $X$.

4.1 Suppose $\{f > \lambda\} \in \mathcal{M}$ for each rational number $\lambda$; is $f$ measurable?

4.2 Suppose $f$ is a measurable real-valued function defined on $X$, and put $g(x) = 0$ if $f(x)$ is rational and $g(x) = 1$ if $f(x)$ is irrational; is $g$ measurable?

4.3 Suppose $f$ is measurable and $B \in \mathcal{B}_1$ is a Borel subset of $R$; does it follow that $f^{-1}(B) \in \mathcal{M}$?

4.4 Suppose $f$ is a measurable real-valued function defined on $X$, and let $\phi$ be a real-valued Borel measurable function defined on $R$. Show that the composition $\phi \circ f$ is measurable.

4.5 Suppose $f$ is measurable and show that for each real $r, s > 0$, the truncations

$$f_{r,s}(x) = \begin{cases} r & \text{if } f(x) > r \\ f(x) & \text{if } -s \le f(x) \le r \\ -s & \text{if } f(x) < -s, \end{cases}$$

are measurable.

4.6 If $f, g$ are measurable real-valued functions defined on $X$ and
... $= \psi$ $\mu$-a.e., then

$$\int_X \phi \, d\mu = \int_X \psi \, d\mu.$$

(iii) It is positively linear, i.e., if $\phi, \psi$ are nonnegative simple functions and $\lambda > 0$, then

$$\int_X (\phi + \lambda\psi) \, d\mu = \int_X \phi \, d\mu + \lambda \int_X \psi \, d\mu. \qquad (1.2)$$

(iv) It is monotone, i.e., if $0 \le \phi \le \psi$ are simple functions, then

$$\int_X \phi \, d\mu \le \int_X \psi \, d\mu. \qquad (1.3)$$

(v) For each nonnegative simple function $\phi$, the set function $\nu$ given by

$$\nu(E) = \int_X \phi \chi_E \, d\mu = \int_E \phi \, d\mu, \quad E \in \mathcal{M},$$
is a measure on $(X, \mathcal{M})$.

Proof. Since the proof of (i) follows along the lines of that of (iii), we only do (iii). Let, then, $\phi = \sum a_n \chi_{A_n}$ and $\psi = \sum b_m \chi_{B_m}$ be nonnegative simple functions; $\phi + \lambda\psi$ is then the simple function that takes the value $a_n + \lambda b_m$ on the set $A_n \cap B_m \in \mathcal{M}$. Note that the $a_n + \lambda b_m$'s are not necessarily distinct, but that the $A_n \cap B_m$'s are pairwise disjoint. Thus, by the definition of the integral, we have

$$\int_X (\phi + \lambda\psi) \, d\mu = \sum_{n,m} (a_n + \lambda b_m) \, \mu(A_n \cap B_m). \qquad (1.4)$$

If the sum in (1.4) is infinite there are indices $n, m$ such that $a_n + \lambda b_m \ne 0$, $\mu(A_n \cap B_m) = \infty$, and either $a_n \mu(A_n) = \infty$ or $b_m \mu(B_m) = \infty$. In the former case we have $\int_X \phi \, d\mu = \infty$, in the latter case it follows that $\int_X \psi \, d\mu = \infty$, and in either case (1.2) holds. On the other hand, if the sum in (1.4) is finite, since the summands there are nonnegative they may be rearranged freely and we obtain at once that the sum equals

$$\sum_n a_n \sum_m \mu(A_n \cap B_m) + \lambda \sum_m b_m \sum_n \mu(A_n \cap B_m).$$

By the additivity of $\mu$ this expression is

$$\sum_n a_n \mu(A_n) + \lambda \sum_m b_m \mu(B_m) = \int_X \phi \, d\mu + \lambda \int_X \psi \, d\mu,$$

and (iii) holds. We prove (ii) next. Since $\phi = \psi$ $\mu$-a.e., there are nonnegative simple functions $\zeta$, $\phi'$ and $\psi'$ and a null set $A$ such that $\phi'$ and $\psi'$ vanish off $A$ and $\phi = \zeta + \phi'$, $\psi = \zeta + \psi'$. By (i) it readily follows that

$$\int_X \phi \, d\mu = \int_X \zeta \, d\mu = \int_X \psi \, d\mu,$$
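and (ii) is true. As for (iv), let $\phi = \sum a_n \chi_{A_n}$, $\psi = \sum b_m \chi_{B_m}$, and observe that if $\mu(A_n \cap B_m) \ne 0$, then $a_n \le b_m$. Whence, since $\bigcup_{n,m} (A_n \cap B_m) = X$, we have

$$\int_X \phi \, d\mu = \sum_{n,m} a_n \mu(A_n \cap B_m) \le \sum_{n,m} b_m \mu(A_n \cap B_m) = \int_X \psi \, d\mu,$$

and (iv) is true. (v) is a useful property and, among other things, it gives new examples of measures on $(X, \mathcal{M})$. Clearly $\nu$ is a nonnegative set function and $\nu(\emptyset) = 0$; only the $\sigma$-additivity requires some work. First observe that if $\phi$ is a nonnegative simple function and $E \in \mathcal{M}$, then $\phi\chi_E$ is also a nonnegative simple function and

$$\nu(E) = \int_X \phi\chi_E \, d\mu = \sum_n a_n \mu(A_n \cap E).$$

Suppose, then, that $\{E_k\}$ is a sequence of pairwise disjoint, measurable subsets of $X$, and let $E$ denote its union. Now, since $\mu$ is a measure, the right-hand side of the above expression equals

$$\sum_n a_n \sum_k \mu(A_n \cap E_k) = \sum_k \sum_n a_n \mu(A_n \cap E_k) = \sum_k \nu(E_k).$$

Whence $\nu$ is a measure on $(X, \mathcal{M})$. ∎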
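The positive linearity (1.2) can be checked by hand on a small example. The Python sketch below works on a hypothetical four-point measure space with point masses; the dictionary `mu`, the simple functions `phi` and `psi`, and the helper names are all assumptions made for illustration. A simple function is stored as a list of pairs (value, set), and the integral of $\phi + \lambda\psi$ is computed on the common refinement $A_n \cap B_m$, as in the proof of (iii).

```python
from fractions import Fraction

# Point masses of a measure on X = {1, 2, 3, 4} (illustrative values).
mu = {1: Fraction(1, 2), 2: Fraction(1, 4), 3: Fraction(1, 8), 4: Fraction(1, 8)}

def measure(E):
    """mu(E) for a subset E of X."""
    return sum(mu[x] for x in E)

def integral(simple):
    """Integral of a simple function given as a list of (value, set) pairs."""
    return sum(a * measure(A) for a, A in simple)

def combine(phi, psi, lam):
    """phi + lam * psi, represented on the pairwise disjoint sets A_n & B_m."""
    return [(a + lam * b, A & B) for a, A in phi for b, B in psi]

phi = [(Fraction(2), {1, 2}), (Fraction(5), {3, 4})]   # phi = 2 on {1,2}, 5 on {3,4}
psi = [(Fraction(1), {1, 3}), (Fraction(3), {2, 4})]   # psi = 1 on {1,3}, 3 on {2,4}

lam = 2
left = integral(combine(phi, psi, lam))          # integral of phi + lam * psi
right = integral(phi) + lam * integral(psi)      # right-hand side of (1.2)
```

Both `left` and `right` are computed exactly with rational arithmetic, so they can be compared with `==`.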
How about the integral of arbitrary nonnegative measurable functions? By Theorem 1.12 in Chapter VI, these functions are limits of nondecreasing sequences of simple functions, and this fact suggests a way of defining the integral. Let $(X, \mathcal{M}, \mu)$ be a measure space, $f$ a nonnegative measurable function defined on $X$, and set

$$\mathcal{F}_f = \{\phi : \phi \text{ is simple, and } 0 \le \phi \le f\}.$$

The integral of $f$ over $X$ with respect to $\mu$ is denoted by $\int_X f(x) \, d\mu(x)$, or simply $\int_X f \, d\mu$, and it is defined as the quantity

$$\int_X f \, d\mu = \sup_{\phi \in \mathcal{F}_f} \int_X \phi \, d\mu. \qquad (1.5)$$

By Theorem 1.12 in Chapter VI, $\mathcal{F}_f \ne \emptyset$, and consequently, $\int_X f \, d\mu$ is a well-defined nonnegative real number or $\infty$. (1.5) is similar to the definition of the lower Riemann integral of a nonnegative function $f$, but with a crucial difference: Rather than considering partitions of the "domain of integration" $X$, we work with partitions of the "range" of $f$, in a manner compatible with each individual function $f$. More precisely, $\mathcal{F}_f$ contains the $\phi$'s constructed in Theorem 1.12 in Chapter VI, and these simple functions are closely related to the level sets of $f$.

Before we go on we must check that if $f$ is simple, then the definitions in (1.1) and (1.5) coincide. If $\int_X f \, d\mu$ denotes the expression given by (1.1), since $f \in \mathcal{F}_f$ it readily follows that

$$\int_X f \, d\mu \le \sup_{\phi \in \mathcal{F}_f} \int_X \phi \, d\mu. \qquad (1.6)$$

On the other hand, if $\phi \in \mathcal{F}_f$, then by (1.3),

$$\int_X \phi \, d\mu \le \int_X f \, d\mu, \quad \text{all } \phi \in \mathcal{F}_f,$$

the inequality opposite to (1.6) holds, and the definitions given by (1.1) and (1.5) coincide. We are now ready to consider some elementary properties of the integral.
Theorem 1.2. Assume $(X, \mathcal{M}, \mu)$ is a measure space, and let $f, g$ be nonnegative measurable functions defined on $X$. We then have
(i) If $f = g$ $\mu$-a.e., then $\int_X f \, d\mu = \int_X g \, d\mu$.
(ii) $\int_X (f + g) \, d\mu \ge \int_X f \, d\mu + \int_X g \, d\mu$. (1.7)
(iii) If $f \le g$, then $\int_X f \, d\mu \le \int_X g \, d\mu$.
(iv) If $A \subseteq B$ are measurable, then

$$\int_A f \, d\mu \le \int_B f \, d\mu. \qquad (1.8)$$

Proof. (i) follows at once since for any $\phi \in \mathcal{F}_f$ there exists a $\psi \in \mathcal{F}_g$ such that $\phi = \psi$ $\mu$-a.e., and, by property (ii) in Theorem 1.1, the integrals of $\phi$ and $\psi$ over $X$ with respect to $\mu$ are equal. As for (ii), observe that since for any $\phi \in \mathcal{F}_f$ and $\psi \in \mathcal{F}_g$ we have $\phi + \psi \in \mathcal{F}_{f+g}$, by (1.2) it follows that

$$\int_X \phi \, d\mu + \int_X \psi \, d\mu = \int_X (\phi + \psi) \, d\mu \le \int_X (f + g) \, d\mu.$$

Whence taking the sup of the left-hand side in the above inequality over $\phi \in \mathcal{F}_f$ and $\psi \in \mathcal{F}_g$ gives (1.7), and (ii) holds. As for (iii), note that in this case we have $\mathcal{F}_f \subseteq \mathcal{F}_g$, and consequently,

$$\int_X \phi \, d\mu \le \int_X g \, d\mu, \quad \text{all } \phi \in \mathcal{F}_f.$$

Taking the sup over the $\phi$'s above gives (iii). To verify (1.8) it suffices to note that in this case $\mathcal{F}_{f\chi_A} \subseteq \mathcal{F}_{f\chi_B}$. Thus (iv) holds, and the proof is complete. ∎

It is of interest to determine whether equality holds in (1.7). To address this question, and to investigate the behavior of the integral with respect to limits, we consider the following result.
Theorem 1.3 (Beppo Levi). Let $(X, \mathcal{M}, \mu)$ be a measure space and assume $\{f_n\}$ is a nondecreasing sequence of nonnegative finite $\mu$-a.e. measurable functions defined on $X$. Then, $\lim_{n \to \infty} f_n(x) = f(x)$ exists everywhere on $X$, $f(x)$ is nonnegative and measurable and

$$\int_X f \, d\mu = \lim_{n \to \infty} \int_X f_n \, d\mu \quad \Big( = \sup_n \int_X f_n \, d\mu \Big). \qquad (1.9)$$
Proof. That $f$ is nonnegative and measurable is clear. By monotonicity, the numerical sequence $\int_X f_n \, d\mu$, $n = 1, 2, \ldots$, is nondecreasing, and consequently, it has a limit $L$, say. Also, by monotonicity, $\int_X f_n \, d\mu \le \int_X f \, d\mu$ for all $n$, and

$$L = \sup_n \int_X f_n \, d\mu \le \int_X f \, d\mu. \qquad (1.10)$$

If $L = \infty$, the right-hand side of (1.10) is also $\infty$, and (1.9) holds in this case. On the other hand, if $L$ is finite we must show the inequality opposite to (1.10), and this requires some work. Given $0 < \eta < 1$ and $\phi \in \mathcal{F}_f$, let $E_n = \{f_n \ge \eta\phi\}$; $\{E_n\}$ is a sequence of measurable sets and since the $f_n$'s are nondecreasing and $\phi \le f$, it readily follows that $E_n \subseteq E_{n+1}$ for all $n$, and that $\lim E_n = X$. Consider now the measure $\nu(E) = \int_E \phi \, d\mu$, $E \in \mathcal{M}$, and observe that by monotonicity we have

$$\int_X f_n \, d\mu \ge \int_{E_n} f_n \, d\mu \ge \eta \int_{E_n} \phi \, d\mu = \eta \, \nu(E_n). \qquad (1.11)$$

By (1.10) and (v) in Theorem 1.1 both sides of (1.11) have a finite limit as $n \to \infty$. Whence, taking limits there we obtain at once

$$L \ge \lim_{n \to \infty} \eta \, \nu(E_n) = \eta \, \nu(X) = \eta \int_X \phi \, d\mu. \qquad (1.12)$$

Now, (1.12) holds for each $\phi \in \mathcal{F}_f$, and taking sup over $\mathcal{F}_f$ we get

$$\eta \int_X f \, d\mu \le L. \qquad (1.13)$$

Since $\eta < 1$ is arbitrary it is clear that (1.13) is also true with $\eta = 1$ and the inequality opposite to (1.10) holds. ∎
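To see (1.9) at work in the simplest possible setting, one may take counting measure on $\{1, 2, \ldots\}$, where integrals are sums. The Python sketch below is an illustration under that assumption; the sequence $f_n(k) = 2^{-k}$ for $k \le n$ and $0$ otherwise, and the helper names, are hypothetical choices not found in the text. The $f_n$'s increase to $f(k) = 2^{-k}$, whose integral is the geometric series $\sum_k 2^{-k} = 1$, and the integrals of the $f_n$'s increase to that value.

```python
from fractions import Fraction

def f_n(n, k):
    """f_n(k) = 2**-k for k <= n and 0 otherwise; nondecreasing in n for each k."""
    return Fraction(1, 2**k) if k <= n else Fraction(0)

def integral_f_n(n):
    """Integral of f_n with respect to counting measure on {1, 2, ...}: a finite sum."""
    return sum(f_n(n, k) for k in range(1, n + 1))

integrals = [integral_f_n(n) for n in range(1, 11)]
# Distance from each integral to 1, the integral of the limit function.
gaps = [1 - value for value in integrals]
```

The list `gaps` records how far each $\int f_n \, d\mu$ is from $\int f \, d\mu$; it halves at every step.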
This result of Beppo Levi (1875-1961), also known as the Monotone Convergence Theorem, or MCT, has many important applications; before we discuss them we present a simple extension of MCT, also useful in applications.

Corollary 1.4 ($\mu$-a.e. version of MCT). Let $(X, \mathcal{M}, \mu)$ be a measure space and $\{f_n\}$ a $\mu$-a.e. nondecreasing sequence of nonnegative finite $\mu$-a.e. measurable functions defined on $X$. Then $\lim_{n \to \infty} f_n(x) = f(x)$ exists $\mu$-a.e. on $X$, $f(x)$ is nonnegative and measurable and

$$\lim_{n \to \infty} \int_X f_n \, d\mu = \int_X f \, d\mu.$$

Proof. Let $N$ be the null set outside of which the $f_n$'s increase to $f$, and put $g_n = f_n$ on $X \setminus N$, and $g_n = 0$ on $N$, and $g = f$ on $X \setminus N$, and $g = 0$ on $N$. The point is that now the $g_n$'s converge to $g$ everywhere, and they coincide with the $f_n$'s and $f$ at the level of integrals. More precisely, by property (i) of Theorem 1.2,

$$\int_X f \, d\mu = \int_X g \, d\mu, \quad \text{and} \quad \int_X f_n \, d\mu = \int_X g_n \, d\mu, \quad \text{all } n,$$

and consequently

$$\lim_{n \to \infty} \int_X f_n \, d\mu = \lim_{n \to \infty} \int_X g_n \, d\mu = \int_X g \, d\mu = \int_X f \, d\mu. \quad ∎$$
As for the applications, we do the additivity and $\sigma$-additivity of the integral first.

Proposition 1.5. Suppose $(X, \mathcal{M}, \mu)$ is a measure space and let $f, g$ be nonnegative extended real-valued measurable functions defined on $X$. Then

$$\int_X (f + g) \, d\mu = \int_X f \, d\mu + \int_X g \, d\mu. \qquad (1.14)$$

Proof. Let ...

... $A_\lambda = \{|f| > \lambda\}$; $A_\lambda$ is measurable and $\lambda\chi_{A_\lambda} \le |f|$ $\mu$-a.e. Thus, by Theorem 1.2 (iii), it readily follows that
Corollary 2.3. Let $(X, \mathcal{M}, \mu)$ be a measure space and let $f \in L(\mu)$. Then $f$ is finite $\mu$-a.e. Further, if $f$ is nonnegative and $\int_X f \, d\mu = 0$, then $f = 0$ $\mu$-a.e.

Proof. By Chebychev's inequality we have

$$\mu(\{|f| > n\}) \le \frac{1}{n} \int_X |f| \, d\mu \to 0 \quad \text{as } n \to \infty.$$

Whence

$$\mu(\{|f| = \infty\}) = \lim_{n \to \infty} \mu(\{|f| > n\}) = 0,$$

and $f$ is finite $\mu$-a.e. Moreover, if $f$ is nonnegative and its integral over $X$ vanishes, then, also by Chebychev's inequality, $\mu(\{f > \lambda\}) = 0$ for all $\lambda > 0$, and $f$ vanishes $\mu$-a.e. ∎

How does the integral behave with respect to addition?

Proposition 2.4. Let $(X, \mathcal{M}, \mu)$ be a measure space and suppose $\lambda$ is a real number and $f, g \in L(\mu)$. Then $f + \lambda g$ is integrable and

$$\int_X (f + \lambda g) \, d\mu = \int_X f \, d\mu + \lambda \int_X g \, d\mu. \qquad (2.7)$$
Proof. The integrability of $f + \lambda g$ follows at once from the estimate $|f + \lambda g| \le |f| + |\lambda||g|$. Now, since $h = f + \lambda g$ is integrable, the integral of $h$ over $X$ with respect to $\mu$ is a well-defined finite number. Furthermore, we have

$$h^+ - h^- = f^+ - f^- + (\lambda g)^+ - (\lambda g)^-,$$

and consequently, also

$$h^+ + f^- + (\lambda g)^- = h^- + f^+ + (\lambda g)^+. \qquad (2.8)$$

All the summands in (2.8) are nonnegative, and by Corollary 2.3 finite $\mu$-a.e. By Proposition 1.5 then, it readily follows that

$$\int_X h^+ \, d\mu + \int_X f^- \, d\mu + \int_X (\lambda g)^- \, d\mu = \int_X h^- \, d\mu + \int_X f^+ \, d\mu + \int_X (\lambda g)^+ \, d\mu,$$

and since all the integrals are finite we may move them freely and obtain

$$\int_X h^+ \, d\mu - \int_X h^- \, d\mu = \int_X f^+ \, d\mu - \int_X f^- \, d\mu + \int_X (\lambda g)^+ \, d\mu - \int_X (\lambda g)^- \, d\mu.$$

Thus (2.7) holds. ∎
The following variant of Proposition 2.4 is important in applications since it allows us to consider arbitrary functions for which the integral is defined.
Proposition 2.5. Let $(X, \mathcal{M}, \mu)$ be a measure space and suppose $f, g$ are extended real-valued measurable functions defined on $X$ which satisfy the following conditions: The integral of $f$ over $X$ with respect to $\mu$ is defined, and $g$ is integrable. Then the integral of $f + g$ over $X$ with respect to $\mu$ is defined and

$$\int_X (f + g) \, d\mu = \int_X f \, d\mu + \int_X g \, d\mu. \qquad (2.9)$$

Proof. By Proposition 2.4, (2.9) is only novel when $f$ is not integrable. If this is the case one of the quantities, $\int_X f^+ \, d\mu$ or $\int_X f^- \, d\mu$, is infinite and the other one is finite. To fix ideas suppose that $\int_X f^- \, d\mu = \infty$, and observe that with the notation of Proposition 2.4, recalling that $h = f + g$ and setting $\lambda = 1$ in (2.8), we have

$$h^+ + f^- + g^- = h^- + f^+ + g^+. \qquad (2.10)$$

Since $\int_X f^+ \, d\mu, \int_X g^+ \, d\mu < \infty$, from (2.10) it follows that $\int_X h^- \, d\mu = \infty$. Furthermore, since $h^- = 0$ when $h^+ \ne 0$, by (2.10) we also get that $h^+ \le f^+ + g^+$, and consequently $\int_X h^+ \, d\mu < \infty$. Thus the integral of $h$ over $X$ with respect to $\mu$ exists and it equals $-\infty$. Whence the left-hand side of (2.9) equals $-\infty$, and so does the right-hand side. Thus (2.9) holds, and we are done. ∎
We are now in a position to explore what Fatou's Lemma says in the general context of functions of arbitrary sign.

Theorem 2.6 (Fatou's Lemma). Let $(X, \mathcal{M}, \mu)$ be a measure space and suppose $\{f_n\}$ is a sequence of extended real-valued measurable functions defined on $X$ which satisfy the following property: There exists an integrable function $g$ such that

$$g \le f_n, \quad \text{all } n. \qquad (2.11)$$

Then the integrals of $\liminf f_n$ and $f_n$ over $X$ with respect to $\mu$ exist, $n = 1, 2, \ldots$, and we have

$$\int_X \liminf f_n \, d\mu \le \liminf \int_X f_n \, d\mu. \qquad (2.12)$$
Proof. Since by (2.11) $f_n - g \ge 0$, the integral of the functions $f_n - g$ over $X$ with respect to $\mu$ is well-defined for $n = 1, 2, \ldots$; similarly for $\liminf f_n - g$. Now, by Fatou's Lemma for nonnegative functions we have

$$\int_X \liminf (f_n - g) \, d\mu = \int_X (\liminf f_n - g) \, d\mu \le \liminf \int_X (f_n - g) \, d\mu. \qquad (2.13)$$

First we consider the left-hand side of (2.13). By Proposition 2.5 with $f = \liminf f_n - g$ and $g = g$ there, we get that $f + g = \liminf f_n$ has a well-defined integral over $X$ with respect to $\mu$ which satisfies

$$\int_X \liminf f_n \, d\mu = \int_X (\liminf f_n - g) \, d\mu + \int_X g \, d\mu. \qquad (2.14)$$

Since $g$ is integrable, by (2.14) it readily follows that the left-hand side of (2.13) equals

$$\int_X \liminf f_n \, d\mu - \int_X g \, d\mu. \qquad (2.15)$$

A similar argument gives that the integral of $f_n$ over $X$ with respect to $\mu$ exists, $n = 1, 2, \ldots$, and that the integral that appears on the right-hand side of (2.13) is equal to

$$\int_X f_n \, d\mu - \int_X g \, d\mu, \quad n = 1, 2, \ldots \qquad (2.16)$$

Thus combining (2.15) and (2.16) we may rewrite (2.13) as

$$\int_X \liminf f_n \, d\mu - \int_X g \, d\mu \le \liminf \int_X f_n \, d\mu - \int_X g \, d\mu.$$
Since $g$ is integrable we may now cancel $\int_X g \, d\mu$ in the above inequality. Whence (2.12) holds, and we have finished. ∎

There is a version of Theorem 2.6 with the inequality (2.12) reversed, but with the lim inf replaced by the lim sup.

Corollary 2.7. Let $(X, \mathcal{M}, \mu)$ be a measure space and suppose $\{f_n\}$ is a sequence of extended real-valued measurable functions defined on $X$ which have the following property: There exists an integrable function $g$ such that

$$f_n \le g, \quad \text{all } n. \qquad (2.17)$$

Then the integrals of $\limsup f_n$ and $f_n$ over $X$ with respect to $\mu$ exist, $n = 1, 2, \ldots$, and we have

$$\limsup \int_X f_n \, d\mu \le \int_X \limsup f_n \, d\mu. \qquad (2.18)$$

Proof.
Observe that (2.17) is equivalent to

$$-g \le -f_n, \quad \text{all } n,$$

and that $g$ is integrable iff $-g$ is integrable; in other words, the hypotheses of Fatou's Lemma are satisfied by $\{-f_n\}$ and $-g$. Since $\liminf(-f_n) = -\limsup f_n$, by Theorem 2.6 we have

$$-\int_X \limsup f_n \, d\mu = \int_X \liminf(-f_n) \, d\mu \le \liminf \int_X (-f_n) \, d\mu = \liminf \Big( -\int_X f_n \, d\mu \Big) = -\limsup \int_X f_n \, d\mu. \qquad (2.19)$$
Now, (2.19), whether involving finite quantities or not, is equivalent to (2.18), and we have finished. ∎

Some remarks concerning these results: Clearly we may assume that (2.11) and (2.17) hold $\mu$-a.e. and obtain the same conclusion; also strict inequality may occur in estimates (2.12) and (2.18). For instance, for the Lebesgue measure on $I = [0,1]$ and the sequence $f_n = \chi_{[0,3/4)}$, $n$ odd, and $f_n = \chi_{[3/4,1]}$, $n$ even, we have

$$\int_I \liminf f_n \, d\mu = 0 < 1/4 = \liminf \int_I f_n \, d\mu$$

and

$$\limsup \int_I f_n \, d\mu = 3/4 < 1 = \int_I \limsup f_n \, d\mu.$$
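The four numbers in this example can be recomputed mechanically. The Python sketch below is an illustration only: it represents each function by its values on the two pieces $[0,3/4)$ and $[3/4,1]$ of $I$, and the names `pieces`, `f_odd`, `f_even` and `integral` are hypothetical. Since the sequence alternates between two functions, its pointwise lim inf and lim sup are the pointwise minimum and maximum of the two.

```python
from fractions import Fraction

# Lebesgue measure of the two pieces [0, 3/4) and [3/4, 1] of I = [0, 1].
pieces = [Fraction(3, 4), Fraction(1, 4)]

f_odd = [1, 0]    # characteristic function of [0, 3/4), listed piece by piece
f_even = [0, 1]   # characteristic function of [3/4, 1]

def integral(values):
    """Integral over I of a function that is constant on each piece."""
    return sum(v * length for v, length in zip(values, pieces))

# The sequence alternates between f_odd and f_even, so lim inf and lim sup
# are the pointwise minimum and maximum of the two functions.
lim_inf_f = [min(a, b) for a, b in zip(f_odd, f_even)]
lim_sup_f = [max(a, b) for a, b in zip(f_odd, f_even)]

integral_of_lim_inf = integral(lim_inf_f)
lim_inf_of_integrals = min(integral(f_odd), integral(f_even))
lim_sup_of_integrals = max(integral(f_odd), integral(f_even))
integral_of_lim_sup = integral(lim_sup_f)
```

The strict inequalities in (2.12) and (2.18) then read `integral_of_lim_inf < lim_inf_of_integrals` and `lim_sup_of_integrals < integral_of_lim_sup`.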
Compute $\int_I f(x) \, dx$.

4.9 Prove that the sum $\sum_{n=0}^{\infty} \int_{[0,\pi]} (1 - \sqrt{\sin x})^n \cos x \, dx$ converges to a finite limit, and find its value.

4.10 Let $f$ be a nonnegative measurable function defined on $R$. Prove that if $\sum_{n=-\infty}^{\infty} f(x + n)$ is integrable, then $f = 0$ a.e. On the other hand, if $f$ is integrable, then $\phi(x) = \sum_{n=1}^{\infty} f(2^n x + 1/n)$ is finite a.e. and integrable, and $\int_R \phi(x) \, dx = \int_R f(x) \, dx$.
4.11 Let $(X,\mathcal{M},\mu)$ be a measure space and $f \in L(\mu)$. Show that the set $\{f \ne 0\}$ is $\sigma$-finite, i.e., the at most countable union of sets of finite measure.
4.12 Referring to Proposition 1.7, decide when the measure $\nu$ introduced there is: (a) finite, and (b) $\sigma$-finite.
4.13 Referring to Proposition 1.7 again, suppose that $g$ is a nonnegative measurable function defined on $X$. Show that $\int_X g\,d\nu = \int_X g f\,d\mu$.
4.14 Suppose that the assumptions of 4.27 in Chapter IV hold and let $f$ be a nonnegative real-valued measurable function defined on $Y$. Prove that the following "change of variable" formula holds:
4.15 Show that Fatou's Lemma is also true for functions that depend on a continuous parameter. More precisely, under the relevant assumptions, the following is true: $\int_X \liminf_{t \in I} f_t\,d\mu \le \liminf_{t \in I} \int_X f_t\,d\mu$.
4.16 Prove the following variant of Fatou's Lemma: If $\{f_n\}$ is a sequence of nonnegative measurable functions which converges to $f$ $\mu$-a.e. and $\int_X f_n\,d\mu \le M < \infty$ for all $n$, then $f$ is integrable and $\int_X f\,d\mu \le M$.
4.17 Decide whether the following Fatou-like statements are true: (a) If $\{f_n\}$ is a sequence of nonnegative measurable functions and $f_n$ converges to $f$ in probability, then $\int_X f\,d\mu \le \liminf \int_X f_n\,d\mu$, and, (b) Same result with convergence in probability replaced by convergence in measure.
4.18 Show that the following extension of Fatou's Lemma is true: Rather than assuming $g \in L(\mu)$, we may assume that $\int_X g^-\,d\mu < \infty$ in Theorem 2.6, and that $\int_X g^+\,d\mu < \infty$ in Corollary 2.7.
4.19 Describe the relation of 4.26 in Chapter IV to Fatou's Lemma. 4.20 Let (X,M,JL) be a measure space and { n.
4.47 Discuss for what values of 'fJ,£,
4.48 Prove the Vitali-Carathéodory theorem: If $f \in L(R)$ and $\varepsilon > 0$, then there exist functions $\phi$ and $\psi$ defined everywhere on $R$ which satisfy:
(i) $\phi$ is upper semicontinuous and bounded above, and $\psi$ is lower semicontinuous and bounded below.
(ii) $\phi, \psi \in L(R)$.
(iii) At every $x$ where $f$ is defined we have $\phi(x) \le f(x) \le \psi(x)$.
(iv) $\int_R (\psi(x) - \phi(x))\,dx \le \varepsilon$.
Do properties (i)-(iv) characterize integrable functions?
4.49 Let $(X,\mathcal{M},\mu)$ be a measure space and $f$ a complex-valued measurable function defined on $X$, cf. 4.43 in Chapter VI. Show that also the modulus $|f|$ is measurable and introduce $L(\mu) = \{f : f \text{ is measurable and } \int_X |f|\,d\mu < \infty\}$. This is another open ended question: Discuss the properties of $L(\mu)$.
CHAPTER VIII

More About $L^1$

In this chapter we discuss the metric properties of $L^1$, including completeness, and some local properties of integrable functions, such as the Lebesgue Differentiation Theorem.

1. METRIC STRUCTURE OF $L^1$
Let $(X,\mathcal{M},\mu)$ be a measure space and $f, g \in L(\mu)$. We measure the distance from $f$ to $g$ by the expression
$$ d(f,g) = \int_X |f - g|\,d\mu. \tag{1.1} $$
Is $d(f,g)$ a metric on $L(\mu)$? Clearly $d(f,g) \ge 0$ and by 4.33 in Chapter VII, $d(f,g) = 0$ iff $f = g$ $\mu$-a.e. Since we identify those functions that coincide $\mu$-a.e., it is true that $d(f,g) = 0$ iff $f = g$. Also $d(f,g) = d(g,f)$. Finally, since for $f, g, h \in L(\mu)$ we have
$$ |f - g| \le |f - h| + |h - g| \quad \mu\text{-a.e.}, $$
it readily follows that $d(f,g) \le d(f,h) + d(h,g)$, and $d$ indeed is a distance function.

The interesting question to consider is whether endowed with this metric $L(\mu)$ is a complete metric space. The answer is affirmative.

Theorem 1.1. Let $(X,\mathcal{M},\mu)$ be a measure space. The distance function introduced in (1.1) above turns $L(\mu)$ into a complete metric space.
Proof. Assuming $\{f_n\}$ is a Cauchy sequence of integrable functions, we must show that there is a function $f \in L(\mu)$ so that $\lim_{n\to\infty} d(f_n, f) = 0$. First observe that since $\{f_n\}$ is Cauchy we can find an increasing sequence $n_{k+1} > n_k$, such that
$$ d(f_{n_{k+1}}, f_{n_k}) = \int_X |f_{n_{k+1}} - f_{n_k}|\,d\mu \le 2^{-k}, \quad k = 1,2,\ldots $$
Whence, by Theorem 1.6 in Chapter VII we get
$$ \int_X \sum_{k=1}^{\infty} |f_{n_{k+1}} - f_{n_k}|\,d\mu = \sum_{k=1}^{\infty} \int_X |f_{n_{k+1}} - f_{n_k}|\,d\mu \le 1. \tag{1.2} $$
Now, by Corollary 2.3 in Chapter VII, the integrand on the left-hand side of (1.2) is finite $\mu$-a.e., and consequently, the series with terms $f_{n_{k+1}} - f_{n_k}$, $k = 1,2,\ldots$, converges absolutely to a finite sum $\mu$-a.e. In particular, we have that the limit
$$ \lim_{m\to\infty} \sum_{k=1}^{m} (f_{n_{k+1}} - f_{n_k}) \tag{1.3} $$
exists, and is measurable and finite $\mu$-a.e. Moreover, since the sum in (1.3) telescopes to $f_{n_{m+1}} - f_{n_1}$, it readily follows that
$$ \lim_{m\to\infty} f_{n_m} = f, \tag{1.4} $$
say, is measurable and finite $\mu$-a.e.; we want to show that the convergence is also in the metric of $L(\mu)$. Let $\phi = \sum_{k=1}^{\infty} |f_{n_{k+1}} - f_{n_k}|$; by (1.2), $\phi \in L(\mu)$. Also, since $f_{n_k} = f_{n_k} - f_{n_1} + f_{n_1}$, we get that
$$ |f_{n_k}| \le \sum_{m=1}^{k-1} |f_{n_{m+1}} - f_{n_m}| + |f_{n_1}| \le \phi + |f_{n_1}| \in L(\mu), \quad \text{all } k. $$
By (1.4) a similar estimate holds with $|f|$ on the left-hand side above, and consequently, by LDCT
$$ \lim_{k\to\infty} \int_X |f_{n_k} - f|\,d\mu = 0. $$
To complete the proof we invoke the well-known fact that if a Cauchy sequence in a metric space has a convergent subsequence, then the sequence itself converges to the same limit. •
Corollary 1.2. Let $(X,\mathcal{M},\mu)$ be a measure space, $f, f_n \in L(\mu)$, $n = 1,2,\ldots$, and suppose that $\lim_{n\to\infty} \int_X |f_n - f|\,d\mu = 0$. Then there is a subsequence $\{f_{n_k}\}$ such that $\lim_{k\to\infty} f_{n_k} = f$ $\mu$-a.e.

Proof. Since the sequence $\{f_n\}$ converges it is Cauchy, and, as in the proof of Theorem 1.1, we can find a subsequence $\{f_{n_k}\}$ which converges pointwise $\mu$-a.e., and in the $L(\mu)$ metric, to an integrable function $g$, say. But then it is clear that $d(f,g) = 0$ and $f = g$. Thus $\lim_{k\to\infty} f_{n_k} = f$ $\mu$-a.e. •
The converse to Corollary 1.2 is false, namely, there is a sequence $\{f_n\}$ of integrable functions and an $f \in L(\mu)$ such that $\lim_{n\to\infty} f_n = f$ $\mu$-a.e., and yet $\lim_{n\to\infty} d(f_n, f) \ne 0$, cf. 4.24 in Chapter VII.

Suppose next that $X$ is a topological space. It is natural to consider whether integrable functions can be approximated by continuous functions in the metric of $L(\mu)$. We only consider the case $X = R^n$, $\mathcal{M} = \mathcal{L}$ here, but the proof can be readily extended to more general settings.
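Theorem 1.3. $C_0(R^n)$, the space of continuous functions which vanish off a compact set, is dense in $L(R^n)$. More precisely, for any $f \in L(R^n)$, given $\varepsilon > 0$, there is a continuous function $g$ which vanishes off a bounded set, and such that
$$ d(f,g) = \int_{R^n} |f - g|\,dx \le \varepsilon. $$
Proof. Since $f = f^+ - f^-$, we may assume that $f$ is nonnegative. Then there is a nondecreasing sequence $\{\phi_k\}$ of simple functions, $0 \le \phi_k \le f$, such that $\lim_{k\to\infty} \phi_k = f$ a.e. Whence, by MCT, $\lim_{k\to\infty} \int_{R^n} \phi_k\,dx = \int_{R^n} f\,dx$, and since all the quantities involved are finite we also have
$$ \lim_{k\to\infty} \int_{R^n} (f - \phi_k)\,dx = 0. $$
Let now $\varepsilon > 0$ be given, and choose one of the $\phi_k$'s, call it $\phi$, so that
$$ 0 \le \phi \le f, \quad \text{and} \quad \int_{R^n} (f - \phi)\,dx < \varepsilon/2. $$
We have thus reduced the problem at hand to one of approximating simple integrable functions by continuous functions in the metric of $L(R^n)$.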
Suppose $\phi = \sum_{k=1}^{m} c_k \chi_{E_k}$, where for each $k$, $c_k > 0$ and $E_k$ is a measurable set of finite measure, $1 \le k \le m$. It suffices now to approximate each summand that appears in the definition of $\phi$, or equivalently, the characteristic function of a measurable set $E$, say, of finite measure. By the regularity of the Lebesgue measure, for any $\eta > 0$, there is an open set $O$ of finite measure such that $|O \setminus E|$ [...] $\varepsilon > 0$, there exists $\delta > 0$, such that
$$ \int_{R^n} |f(x+h) - f(x)|\,dx \le \varepsilon, \quad \text{whenever } |h| \le \delta. \tag{1.5} $$
This is not hard. First let $g \in C_0(R^n)$ be such that $d(f,g) \le \varepsilon/3$, and observe that $|f(x+h) - f(x)|$ may be estimated by
$$ |f(x+h) - g(x+h)| + |g(x+h) - g(x)| + |g(x) - f(x)|. \tag{1.6} $$
Whence, integrating (1.6) over $R^n$, we get that the integral on the left-hand side of (1.5) does not exceed
$$ \int_{R^n} |f(x+h) - g(x+h)|\,dx + \int_{R^n} |g(x+h) - g(x)|\,dx + \int_{R^n} |f(x) - g(x)|\,dx = A + B + C, $$
say. By 4.6 in Chapter VII we have $A = C$ for all $h$, and, by our choice of $g$, $A, C \le \varepsilon/3$. As for $B$, note that $g$ is actually a uniformly continuous function that vanishes off a bounded interval of $R^n$. Thus, given $\eta > 0$, there exists $\delta > 0$ such that $|g(x+h) - g(x)| \le \eta$ for all $|h| \le \delta$, $x \in R^n$. Moreover, since for any fixed $h$, $|h| \le 1$, also $g(x+h) - g(x)$ vanishes off a bounded interval $I$ of $R^n$, it is clear that
$$ B \le |I|\,\eta, \quad \text{whenever } |h| \le \min(1,\delta). $$
Thus, by choosing $\eta$ small enough, we also have $B \le \varepsilon/3$ whenever $|h|$ is sufficiently small, and $A + B + C \le \varepsilon$. •
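The continuity in the mean just proved can be observed on a simple example. The sketch below is an illustration only: the choice of $f$ as the characteristic function of $[0,1]$, the function name `translation_distance`, and the integration window are assumptions, not taken from the text. For this $f$ and $|h| \le 1$ the integral of $|f(x+h) - f(x)|$ equals $2|h|$, so it tends to $0$ with $h$.

```python
def translation_distance(h, cells=10_000, lo=-2.0, hi=3.0):
    """Midpoint Riemann sum for the integral of |f(x+h) - f(x)| over [lo, hi],
    where f is the indicator function of [0, 1] (an illustrative choice).
    The window [lo, hi] contains the support of the integrand when |h| <= 1."""
    def f(x):
        return 1.0 if 0.0 <= x <= 1.0 else 0.0

    step = (hi - lo) / cells
    total = 0.0
    for i in range(cells):
        x = lo + (i + 0.5) * step
        total += abs(f(x + h) - f(x))
    return total * step

for h in (0.5, 0.1, 0.01):
    print(h, translation_distance(h))  # each value is close to 2*h
```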
2. THE LEBESGUE DIFFERENTIATION THEOREM

Given $x = (x_1,\ldots,x_n) \in R^n$ and $r > 0$, let $I(x,r) = \{y : |x_i - y_i| < r,\ i = 1,2,\ldots,n\}$ denote the open interval of sidelength $2r$ centered at $x$. The question we address in this section is: If $f$ is an integrable function, for what $x$'s does
$$ \lim_{r\to 0} \frac{1}{|I(x,r)|} \int_{I(x,r)} f\,dy = f(x)\,? \tag{2.1} $$
At those points $x$ where (2.1) holds we say that the (indefinite) integral of $f$ differentiates to $f(x)$. In case $n = 1$, the question is whether
$$ \lim_{r\to 0} \frac{1}{2r} \int_{(x-r,x+r)} f\,dy = f(x), $$
which is equivalent to
$$ \lim_{h\to 0,\ h\ne 0} \frac{1}{h} \int_{[x,x+h)} f\,dy = f(x). \tag{2.2} $$
If we set $F(x) = \int_{[0,x)} f\,dy$, (2.2) reads precisely $F'(x) = f(x)$, and this justifies the terminology.

In fact, there are two questions implicit in (2.1): When does the limit exist, and, if it exists, when does it equal $f(x)$. For instance, when $I = [0,1]$ and $f = \chi_I$, we have
$$ \frac{1}{2r} \int_{(-r,r)} f\,dy = 1/2, \quad \text{all } 0 < r \le 1. \tag{2.3} $$
Thus the limit of the left-hand side of (2.3) exists and it equals $1/2 \ne f(0) = 1$.

Some observations are in order. First, the question we posed is "local" in nature, i.e., since we take limits as $r \to 0$ only the values of $f$ near $x$ are relevant. Thus, we may assume that $x \in I(0,1)$ and that $f$ vanishes off $I(0,2)$. Next, since the example given in (2.3) is not very reassuring, we consider an instance where (2.1) is true. Suppose, then, that $f$ is continuous at $x$ and note that
$$ \frac{1}{|I(x,r)|} \int_{I(x,r)} f\,dy - f(x) = \frac{1}{|I(x,r)|} \int_{I(x,r)} (f - f(x))\,dy. \tag{2.4} $$
Given $\varepsilon > 0$, let $r$ be so small that $|f(x) - f(y)| < \varepsilon$ for $y \in I(x,r)$; clearly we may assume $r < 1$. By (2.4) it follows that
$$ \Big| \frac{1}{|I(x,r)|} \int_{I(x,r)} f\,dy - f(x) \Big| \le \frac{1}{|I(x,r)|} \int_{I(x,r)} |f - f(x)|\,dy \le \varepsilon, $$
and (2.1) is true in this case.

Since continuous functions are dense in the metric of $L(R^n)$, we expect that the good behaviour of continuous functions will somehow translate into an a.e. good behaviour of integrable functions. The idea of Hardy (1877-1947) and Littlewood (1885-1977) is to seek the control of all the averages of $f$. They devised this procedure to study the convergence of Fourier series and were inspired by the averages in the game of cricket. To control the averages of $f$ we introduce the so-called Hardy-Littlewood maximal function. Specifically, suppose $f$ is an integrable function which vanishes off $I(0,2)$, and for $x \in R^n$ put
$$ Mf(x) = \sup_{r>0} \frac{1}{|I(x,r)|} \int_{I(x,r)} |f|\,dy. \tag{2.5} $$
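To get a feeling for (2.3) and (2.5), the averages and a crude approximation of the maximal function can be computed for $f = \chi_{[0,1]}$ in dimension $n = 1$. This is a sketch under stated assumptions: the function names `average` and `maximal`, the grid of radii, and the cutoff `r_max` are hypothetical choices, and the supremum over all $r > 0$ is replaced by a maximum over finitely many radii.

```python
def average(x, r):
    """Average over I(x, r) = (x - r, x + r) of f = indicator of [0, 1] (n = 1)."""
    overlap = min(1.0, x + r) - max(0.0, x - r)
    return max(0.0, overlap) / (2.0 * r)

def maximal(x, r_max=20.0, steps=20_000):
    """Crude approximation of Mf(x): maximum of the averages over a grid of radii."""
    return max(average(x, r_max * k / steps) for k in range(1, steps + 1))

print(average(0.0, 0.5), average(0.0, 0.01))  # both equal 1/2, as in (2.3)
print(maximal(3.0))                           # about 1/6, attained at r = 3
```

For this particular $f$ one can check by hand that $Mf(x) = 1/(2x)$ for $x \ge 1$, which is consistent with the decay rate $|x|^{-n}$ discussed next.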
What can we say about $Mf$? We claim that $Mf$ is a nonnegative lower semicontinuous, and hence measurable, function, which tends to $0$ as $|x| \to \infty$ at the rate of $|x|^{-n}$.

To show that $Mf$ is lower semicontinuous we must verify that for each $\lambda > 0$, the set $\{Mf > \lambda\}$ is open; this is not hard. Working with complements we show that for each $\lambda > 0$, $\{Mf \le \lambda\}$ is closed. Fix $\lambda > 0$, then, and suppose $\{x_k\}$ is a sequence of points in $\{Mf \le \lambda\}$ such that $x_k \to x$; we show that $x \in \{Mf \le \lambda\}$ as well. In other words, we check that all the averages of $|f|$ about $x$ are less than or equal to $\lambda$. First observe that since $x_k \to x$,
$$ \lim_{k\to\infty} |I(x_k,r) \,\triangle\, I(x,r)| = 0, \quad \text{all } r. $$
Therefore, if $\chi_k$ denotes the characteristic function of $I(x_k,r) \,\triangle\, I(x,r)$ and $f_k = f\chi_k$, it follows that
$$ |f_k(y)| \le |f(y)| \quad \text{and} \quad \lim_{k\to\infty} f_k = 0 \ \text{a.e.} $$
Thus, by LDCT
$$ \lim_{k\to\infty} \int_{I(x_k,r) \triangle I(x,r)} |f|\,dy = \lim_{k\to\infty} \int_{R^n} |f_k|\,dy = 0. \tag{2.6} $$
Next consider the average $\frac{1}{|I(x,r)|} \int_{I(x,r)} |f|\,dy$. Since
$$ I(x,r) \subseteq \big(I(x,r) \,\triangle\, I(x_k,r)\big) \cup I(x_k,r) \quad \text{and} \quad |I(x,r)| = |I(x_k,r)|, $$
the average in question does not exceed
$$ \frac{1}{|I(x,r)|} \int_{I(x,r) \triangle I(x_k,r)} |f|\,dy + \frac{1}{|I(x_k,r)|} \int_{I(x_k,r)} |f|\,dy = A + B, $$
say. By (2.6), $A \to 0$ as $k \to \infty$. As for $B$, since $Mf(x_k) \le \lambda$ for all $k$, we also have $B \le \lambda$. Thus all averages of $|f|$ about $x$ are less than or equal to $\lambda$, and $Mf(x) \le \lambda$.

Next we show that $Mf(x) \sim |x|^{-n}$ for $|x|$ large. To see this take $x \in R^n$ with $|x|$ large, $x \in R^n \setminus I(0,10)$ will do, and observe that unless $r > c|x|$, where $c$ is a dimensional constant independent of $x$, the average of $|f|$ about $x$ vanishes. Thus, with a dimensional constant $c$ which may differ at different occurrences even in the same chain of inequalities, we have
$$ \frac{1}{|I(x,r)|} \int_{I(x,r)} |f|\,dy \le \frac{c}{r^n} \int_{I(0,2)} |f|\,dy \le \frac{c}{|x|^n} \int_{I(0,2)} |f|\,dy, $$
and consequently,
$$ Mf(x) \le \frac{c}{|x|^n} \int_{I(0,2)} |f|\,dy. \tag{2.7} $$
Since there is a dimensional constant $c$ such that for $|x|$ large we have $I(x,c|x|) \supseteq I(0,2)$, it readily follows that
$$ Mf(x) \ge \frac{1}{|I(x,c|x|)|} \int_{I(x,c|x|)} |f|\,dy \ge \frac{c}{|x|^n} \int_{I(0,2)} |f|\,dy, $$
the inequality opposite to (2.7) holds, and, as asserted, $Mf(x) \sim |x|^{-n}$ for $|x|$ large.

It is then apparent that $Mf$ is not integrable, cf. 4.46 in Chapter VII, but just barely. As for the function $|x|^{-n}$, it satisfies a weak integrability condition reminiscent of Chebychev's inequality. More precisely, there is a constant $c$ such that
$$ \lambda\,|\{|x|^{-n} > \lambda\}| \le c, \quad \text{all } \lambda > 0. \tag{2.8} $$
Indeed, if $|x|^{-n} > \lambda$, then there is a dimensional constant $c$ so that $x \in I(0, c\lambda^{-1/n})$, and
$$ \lambda\,|\{|x|^{-n} > \lambda\}| \le \lambda\,|I(0, c\lambda^{-1/n})| = (2c)^n. $$
The class of those measurable functions $f$ which satisfy the estimate
$$ \lambda\,|\{|f| > \lambda\}| \le c, \quad \text{all } \lambda > 0, \tag{2.9} $$
was studied by Marcinkiewicz (1910-1940). It is called the weak-$L^1$ class of Marcinkiewicz and it is denoted by wk-$L(R^n)$. By Chebychev's inequality, $L(R^n) \subseteq$ wk-$L(R^n)$, and for integrable functions $f$, (2.9) is true with $c = \int_{R^n} |f(y)|\,dy$. On the other hand, $|x|^{-n} \in$ wk-$L(R^n) \setminus L(R^n)$. The remarkable fact that Hardy and Littlewood proved is that although for $f$ integrable $Mf$ is not necessarily integrable, it belongs to wk-$L(R^n)$; in a sense this gives the next best result.
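The statement that $|x|^{-n}$ is in the weak class but is not integrable can be made concrete in dimension $n = 1$. The following sketch uses closed-form expressions rather than numerical integration; the function names `level_set_measure` and `truncated_integral` are assumptions introduced for illustration.

```python
import math

def level_set_measure(lam):
    """Lebesgue measure of {x in R : 1/|x| > lam}, i.e. of (-1/lam, 1/lam) minus the origin."""
    return 2.0 / lam

def truncated_integral(delta):
    """Integral of 1/|x| over {delta < |x| < 1}, computed in closed form."""
    return 2.0 * math.log(1.0 / delta)

for lam in (0.1, 1.0, 10.0, 1000.0):
    # the product stays equal to 2 (up to rounding): estimate (2.9) holds with c = 2
    print(lam, lam * level_set_measure(lam))

for delta in (1e-1, 1e-3, 1e-6):
    # grows without bound as delta -> 0: 1/|x| is not integrable near the origin
    print(delta, truncated_integral(delta))
```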
Theorem 2.1 (Hardy-Littlewood). Suppose $f$ is an integrable function which vanishes off $I(0,2)$. Then $Mf \in$ wk-$L(R^n)$, and for any $\lambda > 0$ we have
$$ \lambda\,|\{Mf > \lambda\}| \le 3^n \int_{R^n} |f|\,dy. \tag{2.10} $$
Proof. Given $\lambda > 0$, let $O_\lambda = \{Mf > \lambda\}$; we want to show that the open set $O_\lambda$ has finite measure and that (2.10) holds. Since by (2.7) $Mf(x) \to 0$ as $|x| \to \infty$, $O_\lambda$ is a bounded set of finite measure; to show that (2.10) holds requires some work. The following line of reasoning is a prototype of the so-called "covering arguments" and it is due to Wiener (1894-1964).

If $O_\lambda = \emptyset$ there is nothing to prove. Otherwise, let $x \in O_\lambda$, and observe that by the definition of $Mf(x)$ there exists $r = r_x$ such that
$$ \frac{1}{|I(x,r_x)|} \int_{I(x,r_x)} |f|\,dy > \lambda. \tag{2.11} $$
Clearly
$$ O_\lambda \subseteq \bigcup_{x \in O_\lambda} I(x,r_x). \tag{2.12} $$
Although the set on the right-hand side of (2.12) appears to be quite cumbersome, with the overlaps and all, things are not as complicated as a first impression might indicate. Since by the regularity of the Lebesgue measure, cf. 3.37 in Chapter V,
$$ |O_\lambda| = \sup\{|K| : K \subseteq O_\lambda,\ K \text{ compact}\}, \tag{2.13} $$
it suffices to estimate $|K|$ for each compact subset $K$ of $O_\lambda$. Now, for each such compact set $K$ we also have
$$ K \subseteq \bigcup_{x \in O_\lambda} I(x,r_x), $$
and, since the $I(x,r_x)$'s are open, by the Heine-Borel Theorem there exist finitely many intervals $I(x_1,r_1), \ldots, I(x_m,r_m)$, say, so that
$$ K \subseteq \bigcup_{i=1}^{m} I(x_i,r_i). \tag{2.14} $$
Clearly we may assume that no interval that appears on the right-hand side of (2.14) is contained in the union of all the others, that is to say, each interval there contributes something to the union. Let $r = \max\{r_1,\ldots,r_m\}$ and, by renaming the intervals if necessary, suppose that $r_1 = r$; if more than one $r_i$ equals $r$ just choose any. At this juncture of the argument the geometry of the situation takes over: Observe that if
$$ I(x_1,r_1) \cap I(x_j,r_j) \ne \emptyset, \quad \text{then} \quad I(x_j,r_j) \subseteq I(x_1,3r_1). $$
Because of this property we discard all the intervals $I(x_j,r_j)$, $j \ne 1$, which intersect $I(x_1,r_1)$, and repeat the same procedure with the remaining intervals, i.e., the family of those intervals $I(x_j,r_j)$ which are disjoint with $I(x_1,r_1)$. In other words, we separate an interval with largest sidelength, and then discard all the intervals which intersect it. Since the original family of intervals is finite, after a finite number of steps we are left with a pairwise disjoint family of open intervals $I(x_1,r_1), \ldots, I(x_k,r_k)$, say, which by (2.14) has the property that
$$ K \subseteq \bigcup_{i=1}^{k} I(x_i,3r_i). $$
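The selection procedure just described is a greedy algorithm, and it may help to see it written out. The sketch below works in dimension $n = 1$ only and is an illustration, not the author's formulation: the names `select_disjoint` and `covered_by_dilates` and the sample family are assumptions. Intervals are given as (center, radius) pairs.

```python
def select_disjoint(intervals):
    """Greedy selection from the covering argument, in dimension n = 1.

    `intervals` is a finite list of (center, radius) pairs describing the open
    intervals (center - radius, center + radius). Repeatedly keep an interval
    of largest radius and discard every remaining interval that intersects it.
    """
    remaining = sorted(intervals, key=lambda cr: cr[1], reverse=True)
    chosen = []
    while remaining:
        c, r = remaining.pop(0)  # an interval of largest radius among those left
        chosen.append((c, r))
        # two open intervals are disjoint iff their centers are at least r + s apart
        remaining = [(d, s) for (d, s) in remaining if abs(c - d) >= r + s]
    return chosen

def covered_by_dilates(intervals, chosen):
    """Check that each original interval lies in the 3-fold dilate of a chosen one."""
    return all(
        any(c - 3 * r <= d - s and d + s <= c + 3 * r for (c, r) in chosen)
        for (d, s) in intervals
    )

family = [(0.0, 1.0), (1.5, 0.75), (2.5, 0.5), (4.0, 0.25), (-3.0, 0.5)]
picked = select_disjoint(family)
print(picked)                              # the selected pairwise disjoint intervals
print(covered_by_dilates(family, picked))  # the dilated intervals still cover the family
```

In the sample family only the interval centered at 1.5 is discarded, since it meets the largest interval; it is contained in the dilate $(-3, 3)$ of that interval.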
Whence
$$ |K| \le \sum_{i=1}^{k} |I(x_i,3r_i)| = 3^n \sum_{i=1}^{k} |I(x_i,r_i)|. \tag{2.15} $$
But the intervals that appear on the right-hand side of (2.15) are special: They are pairwise disjoint and they all satisfy (2.11). Therefore the sum on the right-hand side of (2.15) is less than or equal to
$$ \sum_{i=1}^{k} \frac{1}{\lambda} \int_{I(x_i,r_i)} |f|\,dy = \frac{1}{\lambda} \int_{\bigcup_{i=1}^{k} I(x_i,r_i)} |f|\,dy \le \frac{1}{\lambda} \int_{R^n} |f|\,dy, $$
and consequently,
$$ |K| \le \frac{3^n}{\lambda} \int_{R^n} |f|\,dy. \tag{2.16} $$
Taking the sup of the left-hand side of (2.16) over those $K \subseteq O_\lambda$, by (2.13) it follows that (2.10) holds. •

We are now ready to address the questions raised in (2.1), to wit, the existence of the limit there, and its precise value.

Theorem 2.2 (Lebesgue Differentiation Theorem). Suppose $f$ is an integrable function which vanishes off $I(0,2)$. Then
$$ \lim_{r\to 0} \frac{1}{|I(x,r)|} \int_{I(x,r)} f(y)\,dy = f(x) \quad \text{a.e. on } I(0,1). \tag{2.17} $$
Proof. First we show that the limit on the left-hand side of (2.17) exists a.e. For this purpose consider the function
$$ \Omega(f,x) = \limsup_{r\to 0} \frac{1}{|I(x,r)|} \int_{I(x,r)} f\,dy - \liminf_{r\to 0} \frac{1}{|I(x,r)|} \int_{I(x,r)} f\,dy. $$
Although $\Omega$ is measurable, we need not make use of this fact to proceed with the proof. However we point out that $\Omega$ is a well-defined nonnegative function, and show that $\Omega(f,x) = 0$ a.e. on $I(0,1)$. The idea is to control $\Omega$ by the Hardy-Littlewood maximal function. Now, if $g$ is continuous it is clear that $\Omega(g,x) = 0$ everywhere and consequently, $\Omega(f,x) = \Omega(f-g,x)$. Whence it readily follows that for any continuous function $g$ we have
$$ \Omega(f,x) = \Omega(f-g,x) \le 2M(f-g)(x), \quad \text{all } x. $$
Let now $\lambda > 0$, and note that the above inequality gives
$$ \{\Omega(f,\cdot) > \lambda\} \subseteq \{M(f-g) > \lambda/2\}. $$
Thus, by the monotonicity of the Lebesgue outer measure and Theorem 2.1, we obtain
$$ |\{\Omega(f,\cdot) > \lambda\}|_e \le |\{M(f-g) > \lambda/2\}| \le \frac{2 \cdot 3^n}{\lambda} \int_{R^n} |f - g|\,dy. \tag{2.18} $$
Now, since continuous functions are dense in the metric of $L(R^n)$, the integral on the right-hand side of (2.18) may be made arbitrarily small, and consequently,
$$ |\{x \in R^n : \Omega(f,x) > \lambda\}|_e = 0, \quad \text{all } \lambda > 0. $$
This can only be true if $\Omega(f,x) = 0$ a.e., in other words the lim sup is equal to the lim inf and the limit on the left-hand side of (2.17) exists a.e.

Next we show that the limit equals $f$ a.e. Let
$$ W(f,x) = \Big| \lim_{r\to 0} \frac{1}{|I(x,r)|} \int_{I(x,r)} f(y)\,dy - f(x) \Big|. $$
$W$ is an a.e. well-defined nonnegative function, and it is apparent that for continuous $g$, $W(f,x) = W(f-g,x)$, all $x$. Whence it readily follows that $W(f,x) \le M(f-g)(x) + |f(x) - g(x)|$, and consequently, for $\lambda > 0$ we have
$$ \{x \in R^n : W(f,x) > \lambda\} \subseteq \{x \in R^n : M(f-g)(x) > \lambda/2\} \cup \{x \in R^n : |f(x) - g(x)| > \lambda/2\}. $$
Thus by the monotonicity of the Lebesgue outer measure, the Hardy-Littlewood maximal theorem and Chebychev's inequality, it follows that
$$ |\{W(f,\cdot) > \lambda\}|_e \le |\{M(f-g) > \lambda/2\}| + |\{|f-g| > \lambda/2\}| \le \frac{2 \cdot 3^n}{\lambda} \int_{R^n} |f-g|\,dy + \frac{2}{\lambda} \int_{R^n} |f-g|\,dy. $$
Since $g$ is an arbitrary continuous function, we get that
$$ |\{x \in R^n : W(f,x) > \lambda\}|_e = 0, \quad \text{all } \lambda > 0. $$
Whence $W(f,x) = 0$ a.e., and (2.17) holds. •
Although we have only discussed a "local" version of (2.1), it is not hard to obtain the "global" version as well. One way to go about this is to use the general Hardy-Littlewood maximal theorem, cf. 3.23 below, but a simpler way to proceed is this: First observe that $R^n = \bigcup_{k=1}^{\infty} I(0,2k)$. Now, an argument entirely analogous to Theorems 2.1 and 2.2 gives that if $f \in L(I(0,2k))$, then the integral of $f$ differentiates to $f(x)$ a.e. on $I(0,k)$. Given an arbitrary integrable function $f$ note that $f_k = f\chi_{I(0,2k)}$ is integrable and it vanishes off $I(0,2k)$, and consequently there exists a null set $N_k$ such that the integral of $f_k$ differentiates to $f_k(x) = f(x)$ on $I(0,k) \setminus N_k$, $k = 1,2,\ldots$ Clearly the integral of $f$ differentiates to $f(x)$ off the null set $N = \bigcup_{k=1}^{\infty} N_k$, and the global version of the differentiation theorem also holds.
3. PROBLEMS AND QUESTIONS

In what follows $(X,\mathcal{M},\mu)$ is a measure space.

3.1 By means of an example show that if $f, g \in L(\mu)$ it is not necessarily true that $fg$ is integrable. However, if $f^2, g^2 \in L(\mu)$, prove that $fg \in L(\mu)$ and
$$ \int_X |fg|\,d\mu \le \Big(\int_X f^2\,d\mu\Big)^{1/2} \Big(\int_X g^2\,d\mu\Big)^{1/2}. \tag{3.1} $$
(3.1) is known as the Cauchy-Schwarz inequality and it has many interesting applications. One of them is the following: Show that if $\{f_n\}$, $\{g_n\}$ are sequences with the property that $f_n^2, g_n^2 \in L(\mu)$, $n = 1,2,\ldots$, and if
$$ \lim_{n\to\infty} \int_X (f_n - f)^2\,d\mu = \lim_{n\to\infty} \int_X (g_n - g)^2\,d\mu = 0, $$
then
$$ \lim_{n\to\infty} \int_X f_n g_n\,d\mu = \int_X fg\,d\mu. $$
3.2 In the spirit of 3.1, show that if $f \in L(\mu)$ and $\{f_n\}$ is a sequence of integrable functions so that $\lim_{n\to\infty} d(f_n, f) = 0$, and if $\{g_n\}$ is a sequence of measurable functions such that
$$ |g_n| \le M \ \mu\text{-a.e.}, \quad \lim g_n = g \ \mu\text{-a.e.}, $$
then
$$ \lim_{n\to\infty} \int_X f_n g_n\,d\mu = \int_X fg\,d\mu. $$
3.3 Suppose that $f, f_n \in L(\mu)$, $n = 1,2,\ldots$, and $\lim_{n\to\infty} d(f_n, f) = 0$. Show that there exist an integrable function $F$ and a sequence $\{n_k\}$ such that
$$ |f_{n_k}| \le F \ \mu\text{-a.e.} \quad \text{and} \quad \lim_{k\to\infty} f_{n_k} = f \ \mu\text{-a.e.} $$
3.4 Construct a Lebesgue integrable function $f \in L(R)$ with the following property: For any interval $I \subseteq R$ and $M > 0$,
$$ |\{x \in I : |f(x)| > M\}| > 0. $$
3.5 Suppose $f$ is a Lebesgue integrable function on $R^+$ and put
$$ g(x) = \int_{[0,\infty)} \frac{f(t)}{x+t}\,dt, \quad x > 0. $$
Is $g$ continuous? Does $g$ have a limit as $x \to \infty$? Is $g$ differentiable?
3.6 Suppose $f \in L([a,b])$ and put
$$ F(x) = \int_{[a,x]} f\,dy, \quad a \le x \le b. $$
Show that $F$ is a continuous BV function on $[a,b]$ that satisfies the following property: Given $\varepsilon > 0$, there exists $\delta > 0$, such that for any finite collection $\{[a_i,b_i]\}$ of nonoverlapping subintervals of $[a,b]$, we have
$$ \sum_i |F(b_i) - F(a_i)| < \varepsilon, \quad \text{whenever } \sum_i (b_i - a_i) < \delta. \tag{3.2} $$
Functions which satisfy (3.2) are said to be "absolutely continuous" in the sense of Vitali; we will have more to say about absolutely continuous functions in Chapter X.
3.7 Show that the integral of an arbitrary $f \in L(\mu)$ is "absolutely continuous" in the following sense: Given $\varepsilon > 0$, there exists $\delta > 0$, such that
$$ \int_E |f|\,d\mu < \varepsilon, \quad \text{whenever } \mu(E) < \delta. $$
We will have more to say about this notion of absolute continuity in Chapter XI.
3.8 Let $(X,\mathcal{M},\mu)$ be a finite measure space, and $f$ a measurable extended real-valued function defined on $X$. Show that $f \in L(\mu)$ iff $\sum_{k=1}^{\infty} \mu(\{|f| \ge k\}) < \infty$.
3.9 Show that if $f \in L(\mu)$, then
$$ \lim_{\lambda\to\infty} \int_{\{|f| > \lambda\}} |f|\,d\mu = 0. \tag{3.3} $$
3.10 A sequence $\{f_n\}$ of integrable functions is said to be "uniformly integrable" if (3.3) holds uniformly in $n$. More precisely, $\{f_n\}$ is uniformly integrable iff
$$ \lim_{\lambda\to\infty} \sup_n \int_{\{|f_n| > \lambda\}} |f_n|\,d\mu = 0. $$
Show that if $\mu(X) < \infty$ and $\{f_n\}$ is uniformly integrable and $\lim_{n\to\infty} f_n = f$ $\mu$-a.e., then $f \in L(\mu)$ and
$$ \lim_{n\to\infty} \int_X f_n\,d\mu = \int_X f\,d\mu. $$
3.11 If
$$ \sup_n \int_X |f_n|^{1+\eta}\,d\mu < \infty, \quad \eta > 0, $$
show that $\{f_n\}$ is uniformly integrable. The same conclusion is true if there exists $g \in L(\mu)$ so that $|f_n| \le g$ $\mu$-a.e.; on the other hand the sequence
$$ f_n = (n/\ln n)\,\chi_{(0,1/n)}, \quad n = 2,3,\ldots $$
is uniformly integrable with respect to the Lebesgue measure of $I = [0,1]$. In fact, $\int_I f_n\,dx \to 0$, and yet the $f_n$'s are not dominated by any integrable function.
3.12 Suppose $\mu(X)$ [...] $\varepsilon > 0$, there exists $\delta > 0$ such that $\mu(E) < \delta$ implies $\int_E |f_n|\,d\mu < \varepsilon$ for all $n$.
3.13 Suppose $\mu(X) < \infty$ and $f_n \ge 0$ $\mu$-a.e. for all $n$. Show that if the sequence $\{f_n\}$ is uniformly integrable, then
$$ \limsup \int_X f_n\,d\mu \le \int_X \limsup f_n\,d\mu. $$
3.14 Suppose $f, f_n \in L(R^n)$, $n = 1,2,\ldots$, and $\lim_{n\to\infty} d(f_n, f) = 0$. Show that if $|h_n| \to 0$, then $\lim_{n\to\infty} \int_{R^n} |f_n(y + h_n) - f(y)|\,dy = 0$.
3.15 Suppose that $f \in L(R^n)$ and compute
$$ \lim_{|h|\to\infty} \int_{R^n} |f(y+h) + f(y)|\,dy. $$
3.16 Prove that if $A$ is a Lebesgue measurable subset of $R^n$ of positive measure, then the difference set $A - A = \{x \in R^n : x = y_1 - y_2,\ y_1, y_2 \in A\}$ contains a neighbourhood of the origin. State and prove a similar statement for $A + A$, and for $A \pm B$, where $B$ is another Lebesgue measurable set of positive measure.
3.17 Suppose $f$ is a measurable function defined on $R$ which assumes finite values on a set of positive measure and such that $f(x+y) = f(x) + f(y)$ for all real $x, y$. Show that $f$ is of the form $f(x) = cx$.
3.18 Prove that $C_0^1(R^n) = \{g : g$ is real-valued, it vanishes off a compact set, and its first order partial derivatives are continuous$\}$ is dense in $L(R^n)$. Also, $C_0^k(R^n) = \{g : g$ and all its partial derivatives of order $\le (k-1)$ belong to $C_0^1(R^n)\}$, $k = 2,3,\ldots$, and $C_0^\infty(R^n) = \bigcap_{k=1}^{\infty} C_0^k(R^n)$, are dense in $L(R^n)$.
3.19 If $f \in L(R^n)$, prove that there exists a sequence $\{f_k\}$ of continuous functions such that $\lim_{k\to\infty} f_k = f$ a.e. Further show that we may also require that each $f_k$ vanish off a compact set, and that it belong to $C_0^1(R^n)$, or even to $C_0^\infty(R^n)$.
3.20 Given $f \in L(R^n)$, put
$$ F(x,r) = \frac{1}{|I(x,r)|} \int_{I(x,r)} f(y)\,dy, \quad x \in R^n,\ r > 0. $$
Is $F$ continuous as a function of $x$, for each $r$ fixed? Is $F$ continuous as a function of $r$, for each $x$ fixed?
3.21 In the notation of 3.20, show that
$$ \limsup_{r\to 0} F(x,r) \quad \text{and} \quad \liminf_{r\to 0} F(x,r) $$
are Lebesgue measurable.
3.22 Prove that Theorem 2.1 is true if we replace intervals by balls, i.e., (2.10) holds with $Mf$ there replaced by
$$ M_1 f(x) = \sup_{r>0} \frac{1}{|B(x,r)|} \int_{B(x,r)} |f|\,dy, $$
where $B(x,r) = \{y \in R^n : |x - y| < r\}$ denotes the ball of radius $r$ centered at $x$.
3.23 Prove the general version of the Hardy-Littlewood maximal theorem, i.e., remove the assumption that the integrable function $f$ vanishes off a bounded set.
3.24 A point $x$ at which
$$ \lim_{r\to 0} \frac{1}{|I(x,r)|} \int_{I(x,r)} |f(y) - f(x)|\,dy = 0 $$
is called a Lebesgue point of $f$, and the collection of all such points is called the Lebesgue set of $f$. Prove that if $f$ is integrable, then almost every point is in the Lebesgue set of $f$. This notion is extremely important in the convergence of Fourier series, cf. Theorem 3.1 in Chapter XVII.
3.25 A family $\mathcal{R} = \{R\}$ of subsets of $R^n$ is said to be regular at $x$ provided that: (i) The diameters of the sets $R$ tend to $0$, and, (ii) There is a constant $c$ such that if $I(x,r)$ denotes the smallest interval centered at $x$ containing $R$, then $|I(x,r)| \le c|R|$; the sets $R$ need not contain $x$. Show that if $f$ is integrable, $\mathcal{R}$ is regular at $x$, and $x$ is in the Lebesgue set of $f$, then
$$ \lim_{\operatorname{diam}(R)\to 0} \frac{1}{|R|} \int_R |f(y) - f(x)|\,dy = 0. $$
3.26 Let $E$ be a measurable subset of $R^n$; a point $x \in R^n$ for which
$$ \lim_{r\to 0} \frac{|E \cap I(x,r)|}{|I(x,r)|} = 1 $$
is called a point of density of $E$. If the above limit equals $0$, $x$ is called a point of dispersion of $E$. Prove that almost every point of $E$ is a point of density of $E$ and a point of dispersion of $R^n \setminus E$.
3.27 Suppose that
$$ \int_I f(y)\,dy = 0 \tag{3.4} $$
for every subinterval $I$ of $R$, and show that $f = 0$ a.e. In fact, the same conclusion is true if (3.4) holds for every $I$ with $|I| = c > 0$, $c$ a fixed constant.
3.28 Suppose that $f \in L(R)$ vanishes off a bounded interval $I = [a,b]$, and let
$$ F(x) = \int_{[a,x]} f\,dy. $$
Is it true that
$$ \lim_{h\to 0} \int_{[a,b]} \Big| \frac{F(x+h) - F(x)}{h} - f(x) \Big|\,dx = 0\,? $$

Calderón and Zygmund, while considering problems related to the "norm" convergence of Fourier series of functions of several variables, introduced a decomposition of an integrable function into an essentially bounded "good" part and a "bad" part. The decomposition is described in the next four problems.

3.29 Let $I$ be a finite interval in $R$ and suppose $f$ is a nonnegative integrable function which vanishes off $I$. Show that for any $\lambda$ satisfying
$$ \lambda \ge \frac{1}{|I|} \int_I f\,dy $$
there is a sequence $\{I_k\}$ of nonoverlapping intervals contained in $I$ such that
(i) $\lambda < \frac{1}{|I_k|} \int_{I_k} f\,dy \le 2\lambda$, $k = 1,2,\ldots$
(ii) $f \le \lambda$ a.e. on $I \setminus \bigcup_k I_k$.
(iii) $|\bigcup_k I_k| \le \frac{1}{\lambda} \int_{\bigcup_k I_k} f\,dy \le \frac{1}{\lambda} \int_I f\,dy$.
3.30 Referring to 3.29, consider the "good" function
$$ g(x) = \begin{cases} f(x), & x \in I \setminus \bigcup_k I_k, \\ \frac{1}{|I_k|} \int_{I_k} f\,dy, & x \in I_k, \end{cases} $$
and the "bad" function $b = f - g$. Show that these functions enjoy the following properties:
(i) $0 \le g \le \lambda$ a.e. on $I \setminus \bigcup_k I_k$.
(ii) $0 \le g \le 2\lambda$ on $\bigcup_k I_k$.
(iii) $b = 0$ in $I \setminus \bigcup_k I_k$; $\int_{I_k} b = 0$ for all $k$.
(iv) $|b| \le f + 2\lambda$.
3.31 Show that 3.29 and 3.30 are valid for $I = R$, $f \in L(R)$, and any $\lambda > 0$.
3.32 The reader is invited to prove the $n$-dimensional version of 3.29, 3.30 and 3.31.
CHAPTER IX

Borel Measures

In this chapter we study Borel measures on Euclidean space, their regularity properties, and the distribution functions associated with them.

1. REGULAR BOREL MEASURES

A measure $\mu$ on $(R^n, \mathcal{B}_n)$ is called a Borel measure; the restriction of the Lebesgue measure to $\mathcal{B}_n$ is a familiar example of a Borel measure. In working with measures it is apparent that "regularity" plays an essential role. Now, in the case of the Lebesgue measure, regularity is built into its definition. We begin by showing that the same is true for those Borel measures which are finite on bounded sets; first the precise definition of regularity. A Borel measure $\mu$ is said to be regular if for any $E \in \mathcal{B}_n$, $\mu(E)$ may be computed by either of the expressions
$$ \mu(E) = \sup\{\mu(K) : K \text{ is compact, and } K \subseteq E\}, \tag{1.1} $$
$$ \mu(E) = \inf\{\mu(O) : O \text{ is open, and } O \supseteq E\}. \tag{1.2} $$
These conditions roughly state that $\mu$ is determined by the compact, or open, sets in $R^n$. A convenient way to verify these conditions is to consider the following equivalent formulations. For (1.1) we have: If $\mu(E) < \infty$, given $\varepsilon > 0$, there exists a compact set $K \subseteq E$ such that
$$ \mu(E \setminus K) = \mu(E) - \mu(K) \le \varepsilon, \tag{1.3} $$
and if $\mu(E) = \infty$, given $M > 0$, we can find a compact set $K \subseteq E$ such that
$$ \mu(K) \ge M. \tag{1.4} $$
As for (1.2), if $\mu(E) < \infty$, given $\varepsilon > 0$, there exists an open set $O \supseteq E$ such that
$$ \mu(O \setminus E) = \mu(O) - \mu(E) \le \varepsilon, \tag{1.5} $$
and if $\mu(E) = \infty$, by monotonicity any open set containing $E$ also has infinite measure.

We consider the regularity of finite Borel measures first; although the idea for proving this assertion is clear, it takes some time to carry out the details.

Theorem 1.1. Suppose $\mu$ is a finite Borel measure, then $\mu$ is regular.

Proof. Let
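$$ \mathcal{A} = \{E \in \mathcal{B}_n : \text{(1.1) and (1.2) are true for } E\}. $$
The idea of the proof is to show that $\mathcal{A}$ is a $\sigma$-algebra that contains the closed intervals, and which therefore coincides with $\mathcal{B}_n$. It is not hard to check that $\mathcal{A}$ contains the closed intervals of $R^n$: Let $I$ be a closed interval, then $I$ is also compact and (1.1) holds. As for (1.2), let $\{I_k\}$ be a decreasing sequence of open intervals that converges to $I$. By (iv) in Proposition 3.1 in Chapter IV it follows that
$$ \lim_{k\to\infty} \mu(I_k) = \mu(I), $$
and (1.2) holds as well.

Next, if $\{E_m\} \subseteq \mathcal{A}$ and $E = \bigcup_m E_m$, then we have $E \in \mathcal{A}$. Indeed, suppose that $\varepsilon > 0$ has been given, and invoke (1.3) to find a sequence $\{K_m\}$ of compact sets such that
$$ K_m \subseteq E_m, \quad \mu(E_m \setminus K_m) \le \varepsilon/2^{m+1}, \quad m = 1,2,\ldots $$
Furthermore, since $E \setminus \bigcup_m K_m \subseteq \bigcup_m (E_m \setminus K_m)$, we have
$$ \mu\Big(E \setminus \bigcup_m K_m\Big) \le \sum_m \mu(E_m \setminus K_m) \le \varepsilon/2. \tag{1.6} $$
Now, since $\mu(\bigcup_m K_m)$ [...] $\varepsilon > 0$ is given. Since $\mu$ is finite, there exist an open set $O \supseteq E$ and a compact set $K \subseteq E$ such that
$$ \mu(O \setminus E) \le \varepsilon/2, \quad \text{and} \quad \mu(E \setminus K) \le \varepsilon. $$
Moreover, since $E \setminus K = (R^n \setminus K) \setminus (R^n \setminus E)$ and $R^n \setminus K = O'$ is open, we also have
$$ \mu(O' \setminus (R^n \setminus E)) \le \varepsilon, \quad O' \supseteq (R^n \setminus E), $$
and (1.2) holds for $R^n \setminus E$. A similar argument gives that
$$ \mu\big((R^n \setminus E) \setminus (R^n \setminus O)\big) = \mu(O \setminus E) \le \varepsilon/2, $$
but we can only assert that $R^n \setminus O$ is closed. This is not a major difficulty: Since $R^n = \bigcup_m B_m$, where $B_m = \{x : |x| \le m\}$, say, the sequence of compact sets $\{(R^n \setminus O) \cap B_m\}$ converges to $R^n \setminus O$. Whence, by Proposition 3.1 (iii) in Chapter IV, we get that
$$ \lim_{m\to\infty} \mu\big((R^n \setminus O) \cap B_m\big) = \mu(R^n \setminus O), $$
and consequently, for $m$ sufficiently large it follows that
$$ \mu\big((R^n \setminus O) \setminus ((R^n \setminus O) \cap B_m)\big) \le \varepsilon/2. $$
Let $K' = (R^n \setminus O) \cap B_m$; $K'$ is a compact subset of $R^n \setminus E$ and since
$$ (R^n \setminus E) \setminus K' = \big((R^n \setminus E) \setminus (R^n \setminus O)\big) \cup \big((R^n \setminus O) \setminus K'\big), $$
by the above estimates it follows that $\mu(R^n \setminus E) - \mu(K') \le \varepsilon$. Thus, (1.1) holds for $R^n \setminus E$, and $\mathcal{A}$ is closed under the taking of complements. Whence $\mathcal{A}$ is a $\sigma$-algebra that coincides with $\mathcal{B}_n$, and we have finished. •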
Since $R^n$ is $\sigma$-compact it is possible to extend Theorem 1.1 to more general Borel measures. More precisely, we have

Theorem 1.2. Suppose $\mu$ is a Borel measure which is finite on bounded subsets of $R^n$. Then $\mu$ is regular.

Proof. Since $R^n = \bigcup_m \{x : |x| \le m\} = \bigcup_m B_m$, say, for each $E$ in $\mathcal{B}_n$ we have
$$ E = \bigcup_m (E \cap B_m), \quad \text{with } E \cap B_m \in \mathcal{B}_n, \quad m = 1,2,\ldots $$
Thus, from Proposition 3.1 (iii) in Chapter IV we get
$$ \mu(E) = \lim_{m\to\infty} \mu(E \cap B_m). \tag{1.8} $$
The idea is to approximate the measure of the sets that appear on the right-hand side of (1.8), and this will be achieved by "restricting" $\mu$ to $B_m$. More precisely, consider the sequence of Borel measures given by
$$ \mu_m(E) = \mu(E \cap B_m), \quad m = 1,2,\ldots \tag{1.9} $$
Since $\mu$ is finite on bounded sets, the $\mu_m$'s are finite Borel measures, and, by Theorem 1.1, they are regular. Fix now $E \in \mathcal{B}_n$, let $\varepsilon > 0$ be given, and put $E_m = E \cap B_m$, $m = 1,2,\ldots$ By regularity, there exist compact sets $F_m$ and open sets $G_m$ such that [...] (1.10) and [...] (1.11). We rewrite (1.10) and (1.11) in terms of $\mu$. Since $K_m = F_m$ is a compact subset of $E_m$, (1.10) reads [...] (1.12).

As for (1.11), let $\{I_{m,k}\}$ be a decreasing sequence of bounded open balls which converges to $B_m$. Since $\mu(I_{m,1}) < \infty$, by (iv) in Proposition 3.1 in Chapter IV, we have [...] and consequently we can find a sequence of $k_m$'s such that [...]. Now, $O_m = G_m \cap I_{m,k_m}$ is an open set that contains $E_m$, and (1.11) becomes [...] (1.13).

We are now ready to show that $\mu(E)$ may be computed by both (1.1) and (1.2); we do (1.1) first. Combining (1.8) and (1.12) it readily follows that we may find a sequence of compact sets $\{K_m\}$ with the property that
$$ K_m \subseteq E, \quad \lim_{m\to\infty} \mu(K_m) = \mu(E); $$
this gives (1.1) whether $\mu(E)$ is finite or not. As for (1.2), we must only do the case $\mu(E) < \infty$. If $\{O_m\}$ is the sequence of open sets introduced in (1.13), put $O = \bigcup_m O_m$ and observe that $O$ is an open set which contains $E$. Furthermore, since $O \setminus E \subseteq \bigcup_m (O_m \setminus E_m)$, by (1.13) it follows that
$$ \mu(O) - \mu(E) = \mu(O \setminus E) \le \sum_m \mu(O_m \setminus E_m) \le \varepsilon. $$
Thus (1.2) is also true, and $\mu$ is regular. •
2. DISTRIBUTION FUNCTIONS

Borel measures on the line which are finite on bounded sets are important in applications and there is a useful way to describe them. Let BB denote the collection of those Borel measures which are finite on bounded sets, assume that $\mu \in$ BB is finite and, referring to 4.28 in Chapter IV, let $F_y$ be a distribution function induced by $\mu$. Since $\mu$ is finite, a way to normalize the $F_y$'s is to consider not the expression given there but rather the distribution function $F$ corresponding to $y = -\infty$, namely
\[ F(x) = \mu((-\infty, x]). \tag{2.1} \]
IX.
154
Borel Measures
$F$ is called the distribution function of $\mu$; it is nondecreasing and right-continuous, and it satisfies
\[ \lim_{x \to -\infty} F(x) = 0, \]
lim F(x)
(E) $\in \mathcal{L}$, $\mu$ is well-defined. We claim that $\mu$ is a Borel measure which is finite on the bounded sets of $R$. That $\mu(\emptyset) = 0$ is obvious. Next let $\{E_n\}$ be a sequence of pairwise disjoint Borel subsets of the line and let $E$ denote their union. In general the $\Phi(E_n)$'s are not pairwise disjoint, as there may be overlaps created by the intervals of constancy of $F$; this is not a serious inconvenience. First recall that since $B$ is at most countable, for any $A \in \mathcal{L}$ we have $|A \setminus B| = |A|$, cf. 3.2 in Chapter V. Next observe that the sets $\Phi(E_n) \setminus B$, $n = 1, 2, \ldots$, are pairwise disjoint and Lebesgue measurable. Thus
\[ \mu(E) = |\Phi(E)| = \Big|\bigcup_n \Phi(E_n)\Big| = \Big|\Big(\bigcup_n \Phi(E_n)\Big) \setminus B\Big| = \Big|\bigcup_n (\Phi(E_n) \setminus B)\Big| = \sum_n |\Phi(E_n) \setminus B| = \sum_n |\Phi(E_n)| = \sum_n \mu(E_n). \]
Whence, $\mu$ is a Borel measure on the line. It only remains to show that $\mu$ is finite on bounded sets, and that $T\mu = F$, i.e., that (2.3) holds. Let $E \subseteq [-n, n]$ be a bounded Borel set. Since $\Phi(E) \subseteq \Phi([-n, n])$ we obtain
\[ \mu(E) = |\Phi(E)| \le |\Phi([-n, n])| < \infty, \]
and $\mu$ is finite on bounded sets. Finally, let $I = (x, y]$. Since
\[ \lim_{z \to x^+} F(z-) = F(x+) = F(x), \]
by (2.5) it is apparent that either
\[ \Phi(I) = (F(x), F(y)], \]
or
\[ \Phi(I) = [F(x), F(y)]. \]
In either case we have
\[ \mu(I) = |\Phi(I)| = F(y) - F(x), \]
and (2.4) holds. It is now easy to see that (2.3) holds as well, with $c = F(0)$ there. •
To emphasize the fact that $T\mu = F$, and that, consequently, $\mu$ and $F$ are related by (2.3), or (2.4), we denote $\mu$ by $\mu_F$; $\mu_F$ is called the Lebesgue-Stieltjes measure induced by $F$, and
\[ \int_R f \, d\mu_F, \qquad f \text{ Borel measurable}, \]
is called the Lebesgue-Stieltjes integral of $f$ over $R$ with respect to $d\mu_F$.

[Figure 4]

The following is a natural question to ponder: If $F \in \mathcal{D}$ has a measurable derivative $F'$ a.e., under what circumstances is $d\mu_F(x) = F'(x)\,dx$? That equality does not always hold follows by considering the Cantor-Lebesgue function, which we construct next. Referring to the construction of the Cantor set and to the construction in Section 1 of Chapter VI, let $I_k^n$ be the $2^n - 1$ open intervals whose union is $D_n$, and for $n = 1, 2, \ldots$, let $f_n$ be the continuous function defined on $I = [0,1]$ which satisfies $f_n(0) = 0$, $f_n(1) = 1$, $f_n(x) = k2^{-n}$ for $x \in I_k^n$, and which is linear on each interval of $C_n$. It is clear that each $f_n$ is monotone nondecreasing, and since the values only change on the $C_n$'s, that $f_n = f_{n+1}$ on $I_k^n$, $k = 1, \ldots, 2^n - 1$, and
\[ |f_n(x) - f_{n+1}(x)| \le 2^{-n}, \qquad \text{all } x \text{ in } I. \]
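The approximating functions can be explored numerically. The following is a minimal sketch, not part of the original text: it assumes the standard self-similar description of the $f_n$'s (with a helper $f_0(x) = x$ introduced only to start the recursion), and the names `cantor_approx` and `max_gap` are hypothetical. Exact rational arithmetic is used so that the comparison with $2^{-n}$ is not affected by rounding.

```python
from fractions import Fraction


def cantor_approx(n, x):
    """Return f_n(x) for x in [0, 1].

    Assumption for illustration: f_0(x) = x, and f_n is obtained from
    f_{n-1} by the usual self-similar rule (scaled copy on [0, 1/3],
    constant 1/2 on the middle third, shifted scaled copy on [2/3, 1]).
    """
    x = Fraction(x)
    if n == 0:
        return x
    if x <= Fraction(1, 3):
        return cantor_approx(n - 1, 3 * x) / 2
    if x >= Fraction(2, 3):
        return Fraction(1, 2) + cantor_approx(n - 1, 3 * x - 2) / 2
    return Fraction(1, 2)


def max_gap(n, grid):
    """Largest value of |f_n(x) - f_{n+1}(x)| over the sample points in grid."""
    return max(abs(cantor_approx(n, x) - cantor_approx(n + 1, x)) for x in grid)


if __name__ == "__main__":
    # Sample points j / 3^5, which include all endpoints used up to stage 5.
    grid = [Fraction(j, 3**5) for j in range(3**5 + 1)]
    for n in range(5):
        print(n, max_gap(n, grid), Fraction(1, 2**n))
```

On the sampled points the gap between consecutive approximations stays below $2^{-n}$, in agreement with the estimate above; a finite grid can of course only illustrate the bound, not prove it.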
This estimate allows us to show that the $f_n$'s converge uniformly on $I$. Indeed, for any $k < m$ we have
\[ |f_m(x) - f_k(x)| = \Big| \sum_{n=k}^{m-1} (f_n(x) - f_{n+1}(x)) \Big| \le \sum_{n=k}^{\infty} |f_n(x) - f_{n+1}(x)| \le \sum_{n=k}^{\infty} 2^{-n} = 2^{-k+1}. \]
Thus, given $\varepsilon > 0$, we may choose $N$ so large that
\[ |f_m(x) - f_k(x)| \le \varepsilon, \qquad m, k \ge N, \text{ all } x \in I, \]
and the sequence $\{f_n\}$ is uniformly Cauchy on $I$. Whence $\lim_{n \to \infty} f_n(x) = f(x)$ exists on $I$, and since $I$ is compact and the $f_n$'s continuous and nondecreasing, $f$ is also continuous and nondecreasing. Also note that $f(0) = 0$, $f(1) = 1$, and that $f$ is constant on every interval removed in the construction of $C$; $f$ is called the Cantor-Lebesgue function. Let now
\[ F(x) = \begin{cases} 0 & \text{if } x < 0 \\ f(x) & \text{if } 0 \le x \le 1 \\ 1 & \text{if } x > 1. \end{cases} \]
Clearly $F \in \mathcal{D}$, and since $F$ is not constant, $\mu_F$ is not the 0 measure. On the other hand it is apparent that $F'(x) = 0$ for $x \notin C$, that is a.e. Whence it readily follows that
\[ \int_I d\mu_F = \mu_F([0,1]) = F(1) - F(0) = 1 \ne \int_I F' \, dx = 0. \]
It is also clear that for the Cantor-Lebesgue function the following is true:
\[ \int_{[0,1]} f' \, dx = 0 \ne f(1) - f(0) = 1. \]
This puzzling fact will be discussed in full detail in Chapter X, where we characterize those functions which coincide with the integral of their derivatives. We close this section with two interesting remarks; since the proofs follow along familiar lines we leave it to the reader to carry them out, cf. Section 3 in Chapter VII.

Theorem 2.2. Let $g$ be a bounded real-valued function defined on $I = [a,b]$, $f$ a nondecreasing right-continuous function defined on $I$, and put
\[ F(t) = \begin{cases} f(a) & \text{if } t \le a \\ f(t) & \text{if } a < t < b \\ f(b) & \text{if } t \ge b. \end{cases} \]
Then, $g \in \mathcal{R}(f)$ iff $\mu_F(\{x \in I : g \text{ is not continuous at } x\}) = 0$. Furthermore, if $\int_a^b g \, df$ exists, then $g \in L(\mu_F)$, and
3. PROBLEMS AND QUESTIONS

3.1 What is the cardinal number of $\mathcal{B}_n$?

3.2 Suppose $\mu$ is a Borel measure on the line such that $\mu([0,1]) = 1$, and for each real $x$,

Show that $\mu$ coincides with the restriction of the Lebesgue measure to $\mathcal{B}_1$.

3.3 Suppose that $\mu$ is a probability Borel measure on $[0,1]$, and that for each Borel set $E \subset [0,1]$ with $|E| = 1/2$ we also have $\mu(E) = 1/2$. Does it follow that $\mu$ coincides with the restriction of the Lebesgue measure to $\mathcal{B}_1$?

3.4 Suppose that $\mu$ is a finite Borel measure on $R^n$, and that $A \subset R^n$ is closed. Show that $\phi(x) = \mu(x + A)$ is upper semicontinuous, and consequently, measurable.

3.5 Let $\mu$ be a finite Borel measure on a bounded interval of the line such that for each real $x$, $\mu(\{x\}) = 0$. Show that given $\varepsilon > 0$, there exists $\delta = \delta(\varepsilon) > 0$ with the property that if $E \in \mathcal{B}_1$ and $\operatorname{diam}(E) < \delta$, then $\mu(E) < \varepsilon$.

3.6 Suppose $\mu$ is a nonzero Borel measure on the line which is finite on bounded sets. Show that if

then $\mu$ is a Dirac measure.
3.7 Let $\phi$ be a nonnegative additive set function defined on $\mathcal{B}_n$, and suppose that for $E$ in $\mathcal{B}_n$ we have
\[ \phi(E) = \sup\{\phi(K) : K \text{ compact}, K \subseteq E\}. \]
Show that $\phi$ is $\sigma$-additive, and hence a Borel measure.

3.8 Suppose $\mu$ is a regular Borel measure on $R^n$, and let $E \in \mathcal{B}_n$. Show that there exist a $G_\delta$ set $U$ and an $F_\sigma$ set $V$ such that
\[ V \subseteq E \subseteq U, \qquad \mu(U \setminus V) = 0. \]

3.9 Suppose $\mu$ is a regular Borel measure on $R^n$, and let $f$ be a nonnegative integrable function. Show that the set function
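is also a regular Borel measure on $R^n$.

3.10 Suppose $\mu$, $\lambda$ are Borel measures on $R^n$ and let
\[ (\mu \vee \lambda)(E) = \sup\{\mu(A) + \lambda(E \setminus A) : A \subseteq E, A \in \mathcal{B}_n\}, \]
and
\[ (\mu \wedge \lambda)(E) = \inf\{\mu(A) + \lambda(E \setminus A) : A \subseteq E, A \in \mathcal{B}_n\}. \]
Show that $\mu \vee \lambda$ and $\mu \wedge \lambda$ are Borel measures and that
\[ (\mu \vee \lambda)(E) + (\mu \wedge \lambda)(E) = \mu(E) + \lambda(E), \qquad \text{all } E \in \mathcal{B}_n. \]
If $\mu$ and $\lambda$ are regular, are $\mu \vee \lambda$ and $\mu \wedge \lambda$ also regular?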
3.11 An atom of a Borel measure $\mu$ is a singleton $\{x\}$ such that $\mu(\{x\}) > 0$. Show that the number of atoms of a $\sigma$-finite Borel measure $\mu$ is at most countable.

3.12 Suppose $\mu_F$ is the Borel measure on the line induced by $F \in \mathcal{D}$. Show that $\mu_F(\{x\}) = 0$ iff $F$ is continuous at $x$. Moreover, if $\{x\}$ is an atom for $\mu_F$, then we have
3.13 Show that a regular Borel measure $\mu$ on the line is a probability measure iff there exists $F \in \mathcal{D}$ such that $\mu = \mu_F$ and
\[ \lim_{x \to -\infty} F(x) = 0, \quad\text{and}\quad \lim_{x \to \infty} F(x) = 1. \]

3.14 We say that a Borel measure $\mu_F$ is atomic, or discrete, if $\mu_F(E) = 0$ whenever $E \in \mathcal{B}_1$ does not contain any atom of $\mu_F$; in this case the associated distribution function $F$ is said to be discrete. On the other hand, if $F$ is continuous, or equivalently when $\mu_F$ has no atoms, we say that the Borel measure is continuous. Show that if $\mu$ is a regular Borel measure on the line, then there exist a discrete measure $\mu_d$ and a continuous measure $\mu_c$ such that $\mu = \mu_d + \mu_c$. Is this decomposition unique?

3.15 Suppose $\mu$ is a regular Borel measure on the line with no atoms, and let $0 < \eta < \mu(R)$. Show that there exists $E \in \mathcal{B}_1$ such that $\mu(E) = \eta$.

3.16 Let $\mu$ be a regular Borel measure on $R^n$. Show that there exists a unique closed subset $C$ of $R^n$ with the following two properties: (a) $\mu(R^n \setminus C) = 0$, and (b) If $O$ is an open set such that $C \cap O \ne \emptyset$, then $\mu(C \cap O) \ne 0$. This closed set $C$ is called the support of $\mu$, and we also say that $\mu$ is supported in $C$, and denote this relation by $\operatorname{supp} \mu = C$. What is the support of the Dirac $\delta_x$ measure? Given a compact subset $K$ of $R$, construct a measure $\mu$ such that $\operatorname{supp} \mu = K$.

3.17 Show that
\[ \operatorname{supp}(\mu \vee \lambda) = \operatorname{supp} \mu \cup \operatorname{supp} \lambda, \]
and that
\[ \operatorname{supp}(\mu \wedge \lambda) \subseteq \operatorname{supp} \mu \cap \operatorname{supp} \lambda. \tag{3.1} \]
By means of an example show that the inclusion in (3.1) may be proper.

3.18 Suppose $\mu$ is a regular Borel measure on the plane such that for any horizontal or vertical line $L$ we have $\mu(L) = 0$. Show that the function $\phi(x) = \mu(I(x,1))$, $x \in R^2$, is continuous. Can you think of n-dimensional extensions?
3.19 Let $\mu$ be a finite Borel measure on the plane, and suppose that for any line $L$ we have $\mu(L) = 0$. Show that if $E \in \mathcal{B}_2$ and $0 < \eta < \mu(E)$, there is a Borel set $A \subset E$ such that $\mu(A) = \eta$.

3.20 Suppose $\mu_F$ is the Borel measure induced by the distribution function $F \in \mathcal{D}$ given by
F(x) =
o { x2
ifx o. Consider then the open set
\[ G_n = O \setminus \bigcup_{k=1}^{n} I_k \ne \emptyset, \]
and the class of those intervals of $\mathcal{V}$ totally contained in $G_n$; the idea is to select $I_{n+1}$ as a largest interval in this class. Thus, if
\[ k_n = \sup\{|I| : I \in \mathcal{V} \text{ and } I \subset G_n\} > 0, \]
let $I_{n+1}$ be any subinterval of $\mathcal{V}$ contained in $G_n$ such that
\[ |I_{n+1}| > k_n/2; \]
by construction it is clear that $I_{n+1} \cap (I_1 \cup \cdots \cup I_n) = \emptyset$. Either the selection process stops after a finite number of steps, and if this is the case we have finished, or else there exists a pairwise disjoint sequence $\{I_k\}$ of intervals of $\mathcal{V}$ such that
\[ \bigcup_k I_k \subseteq O, \quad\text{and}\quad \sum_k |I_k| \le |O| < \infty. \]
In this case, given $\eta > 0$, we may find $N$ such that $\sum_{k=N+1}^{\infty} |I_k| < \eta$. We consider $R_N = E \setminus \bigcup_{k=1}^{N} I_k$ and estimate its Lebesgue outer measure in terms of $\eta$. Since each $x$ in $R_N$ belongs to the open set $G_N$, by assumption there is an interval $I \in \mathcal{V}$ containing $x$ such that $I \cap (I_1 \cup \cdots \cup I_N) = \emptyset$. We claim that there is an index $n > N$ so that $I \cap I_n \ne \emptyset$. Indeed, if $I \in \mathcal{V}$ and for all $m$ we have $I \cap I_m = \emptyset$, it follows that
\[ |I| \le k_m < 2|I_{m+1}| \to 0, \qquad \text{as } m \to \infty, \]
which is impossible. Let $n$ be the smallest index such that $I \cap I_{n+1} \ne \emptyset$; clearly $n \ge N$. Furthermore, since by the way the $I_n$'s were selected we have $|I| \le k_n < 2|I_{n+1}|$, by simple geometric considerations we obtain
\[ d(x, \text{midpoint of } I_{n+1}) \le |I| + |I_{n+1}|/2 < 2|I_{n+1}| + |I_{n+1}|/2 = 5|I_{n+1}|/2. \]
Let $J_{n+1}$ denote the interval concentric with $I_{n+1}$ with sidelength 5 times that of $I_{n+1}$. By the above estimate it follows that $x \in J_{n+1}$ and consequently, $R_N \subseteq \bigcup_{k=N+1}^{\infty} J_k$. Thus
\[ |R_N|_e \le \sum_{k=N+1}^{\infty} |J_k| = 5 \sum_{k=N+1}^{\infty} |I_k| < 5\eta. \tag{1.2} \]
Whence by (1.2) we have
\[ \Big| E \setminus \bigcup_{k=1}^{\infty} I_k \Big|_e \le \Big| E \setminus \bigcup_{k=1}^{N} I_k \Big|_e < 5\eta, \]
and, since $\eta$ is arbitrary, (1.1) holds. •
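The selection process in the proof can be imitated for a finite family of closed intervals. The sketch below is only an illustration under simplifying assumptions: the family is finite, so the process always stops, and intervals are pairs `(a, b)` with `a < b`. The function names `greedy_disjoint` and `enlarge` are hypothetical. At each step an interval of length greater than half the current supremum $k_n$ is picked among those disjoint from the earlier picks, and every discarded interval ends up inside the concentric 5-fold enlargement of some selected one.

```python
def greedy_disjoint(family):
    """Select pairwise disjoint closed intervals from a finite family.

    Mirrors the selection in the proof: among the intervals disjoint from
    those already chosen, pick one whose length exceeds half of the
    largest available length (the k_n of the text).
    """
    chosen = []
    candidates = list(family)
    while candidates:
        k = max(b - a for a, b in candidates)
        # Any candidate longer than k/2 is admissible; take the first one.
        pick = next(iv for iv in candidates if (iv[1] - iv[0]) > k / 2)
        chosen.append(pick)
        # Keep only the intervals disjoint from the new pick.
        candidates = [iv for iv in candidates
                      if iv[1] < pick[0] or iv[0] > pick[1]]
    return chosen


def enlarge(interval, factor=5):
    """Return the concentric interval with `factor` times the length."""
    a, b = interval
    mid, half = (a + b) / 2, factor * (b - a) / 2
    return (mid - half, mid + half)


if __name__ == "__main__":
    family = [(0, 1), (0.5, 2), (1.5, 1.75), (3, 3.2), (3.1, 4), (10, 10.5)]
    chosen = greedy_disjoint(family)
    print(chosen)
    print([enlarge(iv) for iv in chosen])
```

The factor 5 is the same one that appears in the estimate of $|R_N|_e$: a discarded interval meets a selected interval at least half as long, so it lies within distance $5/2$ of that interval's length from its midpoint.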
Corollary 1.2. Under the assumptions of Theorem 1.1, given $\varepsilon > 0$, there exists a finite family $I_1, \ldots, I_N$ of pairwise disjoint intervals of $\mathcal{V}$ such that
(1.3)
Proof. Pick $\eta = \varepsilon/5$ in the proof of Theorem 1.1; then (1.2) gives the desired conclusion. •

Note that whereas the validity of Corollary 1.2 requires that $|E|_e < \infty$, the conclusion of the Vitali covering lemma is true for an arbitrary subset $E$ of $R$.
2. DIFFERENTIABILITY OF MONOTONE FUNCTIONS

Suppose $f$ is a real-valued function defined on $I = (a,b)$, and for $x \in I$ and $h \ne 0$ with $x + h \in I$ put
\[ Df(x,h) = \frac{f(x+h) - f(x)}{h}. \]
Whether $f$ is differentiable at $x \in I$, or has a one-sided derivative at $x$, or not, the following four quantities, called the Dini numbers of $f$ at $x$, are well-defined:
\[ D^+f(x) = \limsup_{h \to 0^+} Df(x,h), \qquad D_+f(x) = \liminf_{h \to 0^+} Df(x,h), \]
\[ D^-f(x) = \limsup_{h \to 0^-} Df(x,h), \qquad D_-f(x) = \liminf_{h \to 0^-} Df(x,h). \]
Clearly $D_+f(x) \le D^+f(x)$ and $D_-f(x) \le D^-f(x)$, and $f'(x)$ exists iff all four Dini numbers of $f$ at $x$ are equal. The stage is now set for the following result.
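To get a feeling for the Dini numbers, one can sample the difference quotients $Df(x,h)$ along a sequence $h \to 0$. The following sketch is a crude numerical proxy, not a computation of the actual limsup and liminf: it takes the maximum and minimum of $Df(x,h)$ over finitely many sample values of $h$. The test function $f(x) = x \sin(1/x)$, $f(0) = 0$, and the names `difference_quotient` and `sampled_dini` are chosen here for illustration. For this $f$ one has $Df(0,h) = \sin(1/h)$, so the four Dini numbers at $0$ are $D^+f(0) = D^-f(0) = 1$ and $D_+f(0) = D_-f(0) = -1$, and $f'(0)$ does not exist.

```python
import math


def difference_quotient(f, x, h):
    """Df(x, h) = (f(x + h) - f(x)) / h for h != 0."""
    return (f(x + h) - f(x)) / h


def sampled_dini(f, x, hs):
    """Max and min of Df(x, h) over the samples hs.

    These are only proxies for the limsup and liminf: a finite sample
    bounds the true upper number from below and the lower one from above.
    """
    quotients = [difference_quotient(f, x, h) for h in hs]
    return max(quotients), min(quotients)


def f(x):
    """Illustrative example: x * sin(1/x), extended by 0 at the origin."""
    return x * math.sin(1.0 / x) if x != 0 else 0.0


# Sample points h_k = 1 / ((k + 1/2) * pi), where sin(1/h_k) = (-1)^k.
right = [1.0 / ((k + 0.5) * math.pi) for k in range(100, 200)]
left = [-h for h in right]

upper_right, lower_right = sampled_dini(f, 0.0, right)
upper_left, lower_left = sampled_dini(f, 0.0, left)

if __name__ == "__main__":
    print(upper_right, lower_right, upper_left, lower_left)
```

The sample points were chosen where $\sin(1/h)$ attains its extreme values; with a generic sample the proxies would only approach $\pm 1$.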
Theorem 2.1 (Lebesgue). Let $I$ be an open subinterval of the line and suppose $f$ is a monotone real-valued function defined on $I$. Then $f'$ exists a.e. on $I$.

Proof. We may assume that $f$ is nondecreasing, and consider first the case when $I$ is bounded. We will be done once we show that
\[ D_-f \le D^-f \le D_+f \le D^+f \le D_-f, \qquad \text{a.e. on } I, \tag{2.1} \]
x.
168
Absolute Continuity
for then all the Dini numbers of $f$ are equal at those $x$'s where (2.1) holds, and $f'$ exists a.e. on $I$. As noted above, the first and third inequalities in (2.1) are always true, so we only need to establish the second and fourth inequalities there. Now, this amounts to showing that the (bad) sets
\[ B = \{x \in I : D^+f(x) > D_-f(x)\} \quad\text{and}\quad B' = \{x \in I : D^-f(x) > D_+f(x)\} \]
are null. Since the proof for both sets follows along similar lines we only consider $B$. First observe that all Dini numbers are nonnegative, and if for rational numbers $u > v > 0$ we put
\[ B_{u,v} = \{x \in I : D^+f(x) > u > v > D_-f(x)\}, \]
then we have $B = \bigcup_{u,v} B_{u,v}$. Thus the desired conclusion will follow once we show that each of the $B_{u,v}$'s is null. So we suppose that $|B_{u,v}|_e = \eta$, and show that $\eta = 0$.

The idea of the proof is to approximate $B_{u,v}$ by a simpler set consisting of pairwise disjoint intervals (here we use the fact that $D_-f < v$ and the Vitali covering lemma) and then to further approximate the part of $B_{u,v}$ which lies within those intervals by another family of intervals (here we use the fact that $D^+f > u$ and Vitali's covering lemma again). First observe that since $B_{u,v} \subseteq I$ we have $\eta < \infty$, and, by (1.8) in Chapter V, given $\varepsilon > 0$, there exists an open set $O \supseteq B_{u,v}$ such that

Moreover, since for each $x$ in $B_{u,v}$ we have $D_-f(x) < v$, there exists a sequence $h_{x,n} > 0$ approaching 0 such that the intervals $[x - h_{x,n}, x] \subset O$ and
\[ f(x) - f(x - h_{x,n}) < v h_{x,n}, \qquad \text{all } n. \tag{2.2} \]
Clearly $\mathcal{V} = \{[x - h_{x,n}, x]\}$ is a covering of $B_{u,v}$ in the sense of Vitali and consequently, by Theorem 1.1 there is a finite collection $I_1 = [x_1 - h_1, x_1], \ldots, I_n = [x_n - h_n, x_n]$, say, of pairwise disjoint intervals of $\mathcal{V}$ such that
(2.3)
Next let $I_j^\circ$ denote the interior of $I_j$, $1 \le j \le n$, and observe that since for each $x$ in $B'_{u,v} = B_{u,v} \cap \big(\bigcup_{j=1}^{n} I_j^\circ\big)$ we have $D^+f(x) > u$, there is a sequence $k_{x,m} > 0$ tending to 0 such that $[x, x + k_{x,m}] \subseteq I_j$ for some $1 \le j \le n$, and
\[ f(x + k_{x,m}) - f(x) > u k_{x,m}, \qquad \text{all } m. \tag{2.4} \]
Since $\mathcal{V}_1 = \{[x, x + k_{x,m}]\}$ is a covering of $B'_{u,v}$ in the sense of Vitali, we can find a finite collection $J_1 = [x'_1, x'_1 + k_1], \ldots, J_m = [x'_m, x'_m + k_m]$, say, of pairwise disjoint intervals of $\mathcal{V}_1$ with the property that
(2.5)
Let now
\[ 0 \le \Delta_J = \sum_{i=1}^{m} \big(f(x'_i + k_i) - f(x'_i)\big) \]
denote the increase of $f$ along the $J_i$'s, and, similarly, let $\Delta_I$ denote the increase of $f$ along the $I_j$'s. It is not hard to check that
(2.6)
Indeed, suppose that $J_1, \ldots, J_{m_1}$ are ordered from left to right and are contained in $I_1$; that $J_{m_1+1}, \ldots, J_{m_2}$ are ordered from left to right and are contained in $I_2$, and so on. Since $\{J_i\}_{i=1}^{m_1}$ is a pairwise disjoint collection of intervals contained in $I_1$ it readily follows that
\[ \begin{aligned} \sum_{i=1}^{m_1} \big(f(x'_i + k_i) - f(x'_i)\big) &= f(x'_{m_1} + k_{m_1}) - \cdots - \big(f(x'_2) - f(x'_1 + k_1)\big) - f(x'_1) \\ &\le f(x'_{m_1} + k_{m_1}) - f(x'_1) \le f(x_1) - f(x_1 - h_1). \end{aligned} \]
Whence, by adding up the increase of $f$ along these blocks of $J_i$'s it follows that (2.6) holds. Now, by (2.2), and since the $I_j$'s are all contained in $O$, it is clear that
D..]
U
2::1IJil,
(2.8)
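and consequently, we need a lower bound for the right-hand side of (2.8). Since $B_{u,v} = \big(B_{u,v} \setminus \bigcup_{j=1}^{n} I_j\big) \cup B'_{u,v} \cup N$, where $N = \{\text{endpoints of the } I_j\text{'s}\}$ is a finite set, it is clear that
\[ B_{u,v} \subseteq \Big(B_{u,v} \setminus \bigcup_{j=1}^{n} I_j\Big) \cup \Big(B'_{u,v} \setminus \bigcup_{i=1}^{m} J_i\Big) \cup \Big(\bigcup_{i=1}^{m} J_i\Big) \cup N. \]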
Thus, on account of (2.3) and (2.5) it readily follows that
\[ \eta = |B_{u,v}|_e \le \varepsilon + \varepsilon + \sum_{i=1}^{m} |J_i|, \quad\text{or}\quad \sum_{i=1}^{m} |J_i| \ge \eta - 2\varepsilon. \]
Substituting this estimate in (2.8), and combining it with (2.6) and (2.7), we see that
\[ (\eta - 2\varepsilon)u \;\ldots \tag{2.9} \]
Since $\varepsilon$ in (2.9) is arbitrary, we also have $\eta u \le \eta v$, and since $0 < v < u$, this can only hold if, as asserted, $\eta = 0$. This completes the proof when $I$ is bounded. On the other hand, if the interval $I$ is unbounded, observe that

say, where $I_k$ is a bounded open interval, $k = 1, 2, \ldots$ Thus $f'$ exists a.e. on each $I_k$, and consequently also a.e. on $I$. •
Not only does the derivative of a monotone function exist a.e., but it also is integrable. More precisely, we have Lebesgue's theorem,

Theorem 2.2. Suppose $f$ is a nondecreasing real-valued function defined on $I = (a,b)$. Then $f' \in L(I)$ and
\[ \int_I f' \, dx \le f(b-) - f(a+). \tag{2.10} \]
Proof. Extend $f$ to $R$ by setting $f(x) = f(a+)$ if $x \le a$, and $f(x) = f(b-)$ if $x \ge b$. Put now
\[ f_n(x) = n\big(f(x + 1/n) - f(x)\big), \qquad x \in R, \; n = 1, 2, \ldots, \]
and observe that since by Theorem 2.1 $f'$ exists a.e., we have
\[ \lim_{n \to \infty} f_n = f' \qquad \text{a.e. on } I. \]
Thus, by Fatou's Lemma, it follows that
\[ \int_I f' \, dx \le \liminf \int_I f_n \, dx. \tag{2.11} \]
It is rather straightforward to compute the integral on the right-hand side of (2.11). Indeed, by 4.6 in Chapter VII, and with $n$ sufficiently large, we have
\[ \int_I f_n \, dx = n \int_I f(x + 1/n) \, dx - n \int_I f \, dx = n \int_{[b, b+1/n)} f \, dx - n \int_{(a, a+1/n]} f \, dx = A + B, \]
say. Clearly $A = f(b-)$, and $B \le -f(a+)$. Whence, for all sufficiently large $n$ the integral in question is dominated by $f(b-) - f(a+)$, and (2.10) holds. •
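The inequality in (2.10) may be strict even for very simple functions. As a rough numerical sketch (an illustration only, with the hypothetical names `step` and `integral_of_quotient`, and a midpoint Riemann sum standing in for the Lebesgue integral), take $I = (0,1)$ and the nondecreasing step function $f = \chi_{[1/2, \infty)}$. Then $f' = 0$ a.e., so $\int_I f' \, dx = 0$, while $f(b-) - f(a+) = 1$; the integrals of the difference quotients $f_n$ used in the proof equal $1$ for every $n > 2$.

```python
def step(x):
    """Nondecreasing step function with a unit jump at x = 1/2."""
    return 1.0 if x >= 0.5 else 0.0


def integral_of_quotient(f, n, a=0.0, b=1.0, m=100_000):
    """Midpoint-rule approximation of the integral over (a, b) of
    f_n(x) = n * (f(x + 1/n) - f(x)).

    Assumes f is already defined on the whole line, as in the proof,
    where f is extended by f(a+) to the left and f(b-) to the right.
    """
    h = (b - a) / m
    total = 0.0
    for j in range(m):
        x = a + (j + 0.5) * h
        total += n * (f(x + 1.0 / n) - f(x))
    return total * h


if __name__ == "__main__":
    for n in (4, 10, 100):
        print(n, integral_of_quotient(step, n))
```

Here $f_n = n$ on $[1/2 - 1/n, 1/2)$ and $0$ elsewhere in $I$, so the mass of the jump is kept by every $f_n$ but lost in the a.e. limit, which is exactly the situation Fatou's Lemma allows.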
Corollary 2.3. Suppose $f$ is BV on a bounded interval $I = [a,b]$. Then $f' \in L(I)$ and
\[ \int_I |f'| \, dx \le V(f; a, b). \tag{2.12} \]
Our aim now is to discover when equality holds in (2.10); for this we need the concept of absolutely continuous functions.
3. ABSOLUTELY CONTINUOUS FUNCTIONS

Absolutely continuous, or AC, functions were introduced in 3.6 in Chapter VIII. They are continuous functions whose increment along any collection of pairwise disjoint intervals of sufficiently small total length is arbitrarily small. This concept excludes the Cantor-Lebesgue function $f$ which, although locally constant on a subset of $[0,1]$ of full measure, nevertheless increases from 0 to 1 there. To see this we cover the Cantor set $C$ by a union $\bigcup_n (a_n, b_n)$, say, of pairwise disjoint open intervals with $\sum_n (b_n - a_n)$ arbitrarily small. Extend $f$ so that $f(x) = 0$ for $x < 0$ and $f(x) = 1$ for $x > 1$. Then it is not hard to verify that $\sum_n (f(b_n) - f(a_n)) = 1$, and consequently, $\sum_{n=1}^{N} (f(b_n) - f(a_n)) > 1/2$ for sufficiently large $N$, while at the same time, $\sum_{n=1}^{N} (b_n - a_n)$ is arbitrarily small. So, which among the continuous functions are AC, and what properties do AC functions satisfy?

Proposition 3.1. Let $I = [a,b]$ and suppose $f$ is AC on $I$. Then $f$ is BV on $I$, and consequently, by Corollary 2.3, $f'$ exists a.e. and it is integrable there.
Proof. Let $\delta$ be the real number that corresponds to the choice $\varepsilon = 1$ in the AC definition of $f$, and let the integer $N > (b - a)/\delta$. Note that, in particular, we have
\[ V(f; x, x + \eta) \le 1, \qquad \text{any } x \in I, \; 0 < \eta \le \delta. \tag{3.1} \]
The idea now is to use (3.1) to put together the estimates along the partitions of $I$. So, let $\mathcal{P} = \{a = x_0 < \cdots < x_n = b\}$ be a partition of $I$ and let $\mathcal{P}'$ be the partition of $I$ obtained by adjoining the points $a + (b-a)/N$, $a + 2(b-a)/N$, $\ldots$, $b$ to $\mathcal{P}$. Since $\mathcal{P}'$ is finer than $\mathcal{P}$ it readily follows that
\[ \sum_{\text{over } \mathcal{P}} |\Delta_k f| \le \sum_{\text{over } \mathcal{P}'} |\Delta_k f|. \tag{3.2} \]
It is not hard to estimate the right-hand side of (3.2); indeed, by (3.1) it does not exceed
\[ V(f; a, a + (b-a)/N) + \cdots + V(f; a + (N-1)(b-a)/N, b) \le N. \]
Since $\mathcal{P}$ is arbitrary, by (3.2) it follows that $V(f; a, b) \le N$ and $f$ is BV on $I$. •
So, AC functions are continuous and BV, but is the converse to this statement true? It is partially true, and to discuss it we need some preliminary results.
Lemma 3.2. Suppose $A \subseteq I = [a,b]$, and let $f$ be a real-valued function defined on $I$ such that
\[ |f'(x)| \le M, \qquad x \in A. \]
Then
\[ |f(A)|_e \le M|A|_e. \tag{3.3} \]
Proof.
Given $\varepsilon > 0$, let $O$ be an open set such that
\[ |O| \le |A|_e + \varepsilon, \qquad O \supseteq A. \tag{3.4} \]
Break up $A$ into two disjoint parts, $A_1 = \{x \in A : f \text{ is constant in a neighbourhood of } x\}$, and $A_2 = \{x \in A : f \text{ is not constant in any neighbourhood of } x\}$, say. We construct a covering $\mathcal{V}$ of $f(A)$ in the sense of Vitali as follows: If $f(x) \in f(A)$ and $x \in A_1$, then there is an interval $J \subseteq O$ such that $f(x) = f(y)$ for all $y \in J$. Now, if $I(w, w')$ denotes the closed interval with endpoints $w$ and $w'$ (note that $w' < w$ is possible), then we can find $h$ so that $I(x, x + Mh) \subset O$ and $f$ is constant on $I(x, x + Mh)$. Such values $f(x)$ are then assigned the intervals $I(f(x), f(x) + Mb)$, where $b$ satisfies
\[ I(x, x + Mb) \subseteq I(x, x + Mh). \]
On the other hand, if $f(x) \in f(A)$ and $x \in A_2$, then there is a sequence $h_{x,n} \ne 0$, $n = 1, 2, \ldots$, converging to 0, such that $I(x, x + h_{x,n}) \subset O$ for all $n$, and
\[ 0 < |f(x + h_{x,n}) - f(x)| \le (M + \varepsilon)|h_{x,n}|, \qquad n = 1, 2, \ldots \]
To these values $f(x)$ we assign the intervals $I(f(x), f(x + h_{x,n}))$, $n = 1, 2, \ldots$ Now, the collection $\mathcal{V}$ of all the intervals introduced above is a covering of $f(A)$ in the sense of Vitali and consequently, there is an at most countable family consisting of pairwise disjoint intervals $I_1, \ldots, I_k, \ldots$, say, such that

Whence, we also have
(3.5)
To estimate the right-hand side of (3.5) we separate the $I_k$'s into two families: Those that correspond to $f(x)$ with $x \in A_1$, call them $I_k^1$'s, and those corresponding to $f(x)$ with $x \in A_2$, call them $I_k^2$'s. Furthermore, if $I_k^1 = I(f(x_k), f(x_k) + Mb_k)$, let $J_k^1 = I(x_k, x_k + b_k)$, and if $I_k^2 = I(f(x_k), f(x_k + h_k))$, let $J_k^2 = I(x_k, x_k + h_k)$. Since the $I_k$'s are pairwise disjoint, so are the $J_k$'s, and they are also contained in $O$. Consequently, by (3.4) we have
\[ \begin{aligned} \sum |I_k| &= \sum |I_k^1| + \sum |I_k^2| \le \sum M|b_k| + \sum (M + \varepsilon)|h_k| \\ &= M \sum |J_k^1| + (M + \varepsilon) \sum |J_k^2| \le (M + \varepsilon) \sum |J_k| \le (M + \varepsilon)|O| \le (M + \varepsilon)(|A|_e + \varepsilon), \end{aligned} \]
which substituted into (3.5) gives $|f(A)|_e \le (M + \varepsilon)(|A|_e + \varepsilon)$. Moreover, since $\varepsilon$ is arbitrary, the above inequality is also true with $\varepsilon = 0$, and (3.3) holds. •

An interesting consequence of Lemma 3.2 is
Lemma 3.3. Suppose $f$ is a real-valued function defined on $I = [a,b]$, and let $A$ be a measurable subset of $I$ so that $f'(x)$ exists everywhere on $A$ and is measurable there. Then
\[ |f(A)|_e \le \int_A |f'| \, dx. \tag{3.6} \]
Proof. First suppose that for some integer $M$ we have $|f'(x)| \le M$ on $A$ and consider the level sets
\[ A_{k,n} = \{x \in A : (k-1)/2^n \le |f'(x)| < k/2^n\}, \qquad k = 1, \ldots, M2^n, \; n = 1, 2, \ldots \]
Since for each $n$ we have $A = \bigcup_k A_{k,n}$, by Lemma 3.2 and Chebychev's inequality it readily follows that
\[ \begin{aligned} |f(A)|_e &\le \sum_k |f(A_{k,n})|_e \le \sum_k (k/2^n)|A_{k,n}| \\ &= \sum_k ((k-1)/2^n)|A_{k,n}| + \frac{1}{2^n} \sum_k |A_{k,n}| \\ &\le \sum_k \int_{A_{k,n}} |f'| \, dx + \frac{1}{2^n}|A| \le \int_A |f'| \, dx + \frac{1}{2^n}|A|. \end{aligned} \]
Since this estimate holds for every $n$, (3.6) is true in this case. As for the general case, note that
\[ A = \bigcup_{k=1}^{\infty} \{x \in A : k - 1 \le |f'| < k\} = \bigcup_{k=1}^{\infty} A_k, \]
say, where the $A_k$'s are pairwise disjoint and $|f'|$ exists and is bounded and measurable on each $A_k$. Then, by the first part of the proof we have
\[ |f(A)|_e \le \sum_k |f(A_k)|_e \le \sum_k \int_{A_k} |f'| \, dx = \int_A |f'| \, dx, \]
and (3.6) holds. •
We are now ready for
Theorem 3.4 (Banach-Zarecki). Suppose $f$ is a continuous, BV, real-valued function defined on $I = [a,b]$. Then $f$ is AC on $I$ iff $f$ maps null sets into null sets, i.e.,
\[ |A| = 0 \quad\text{implies}\quad |f(A)| = 0. \]
Proof. We show the necessity first: It is enough to prove that given a null set $A \subseteq (a,b)$ and $\varepsilon > 0$, we have
\[ |f(A)|_e \le \varepsilon. \tag{3.7} \]
We invoke (iii) in 4.14 below: From the hypothesis of AC there exists $\delta > 0$ such that no matter what finite pairwise disjoint family $\{I_k = (a_k, b_k)\}$ of subintervals of $(a,b)$ we take, with the notation $\omega(f, J) = \sup_J f - \inf_J f$, we have
\[ \sum (b_k - a_k) < \delta \quad\text{implies}\quad \sum \omega(f, I_k) < \varepsilon. \tag{3.8} \]
Also observe that since $|f(I_k)|_e \le \omega(f, I_k)$ for each $k$, by (3.8) it readily follows that

Choose now an open set $O$ with $|O| < \delta$ so that $A \subset O = \bigcup_{k=1}^{\infty} (a_k, b_k) \subseteq (a,b)$, where the $(a_k, b_k)$'s are pairwise disjoint, and note that the above estimate implies that

(3.7) follows at once from this.

As for the sufficiency, suppose that $\varepsilon > 0$ is given, and let $\{(a_k, b_k)\}$ be a finite pairwise disjoint family of subintervals of $I$. Then, if $A_k = \{x \in [a_k, b_k] : f'(x) \text{ exists}\}$, we have $|[a_k, b_k] \setminus A_k| = 0$ for all $k$. Furthermore, since $f$ is continuous we also have

Whence, combining these remarks, and by our assumption and Lemma 3.3, we obtain
\[ \begin{aligned} \sum_k |f(b_k) - f(a_k)| &\le \sum_k |f([a_k, b_k])|_e \le \sum_k |f([a_k, b_k] \setminus A_k)|_e + \sum_k |f(A_k)|_e \\ &\le \sum_k \int_{A_k} |f'| \, dx = \int_{\bigcup A_k} |f'| \, dx. \end{aligned} \tag{3.9} \]
Now, since $f$ is BV on $I$, by Corollary 2.3 $f'$ is integrable on $I$, and we are in a position to invoke 3.7 in Chapter VIII: Choose $\delta > 0$ so that the conclusion there holds for the $\varepsilon$ we fixed at the beginning of the argument, and observe that since $\bigcup_k A_k \subseteq \bigcup_k [a_k, b_k]$, we also have
\[ \Big|\bigcup_k A_k\Big| \le \delta \quad\text{whenever}\quad \sum_k (b_k - a_k) \le \delta. \]
Therefore, by (3.9) it follows at once that
\[ \sum_k |f(b_k) - f(a_k)| \le \varepsilon \quad\text{whenever}\quad \sum_k (b_k - a_k) \le \delta, \]
and $f$ is AC on $I$. •
That the assumption that $f$ is BV is necessary for the validity of Theorem 3.4 follows from a construction which is reminiscent of the discussion preceding (1.5) in Chapter III. Consider $I = [0,1]$ and a Cantor-like subset $K$ of $I$; the measure of $K$ may be positive or not. Write the set $I \setminus K = \bigcup_n (a_n, b_n)$ as the at most countable pairwise disjoint union of open intervals, and let $c_n$ denote the midpoint of $(a_n, b_n)$. If $d_n$ is a sequence of positive numbers with limit 0, define $f$ on $I$ as follows: $f(x) = 0$ for $x \in K$, $f(c_n) = d_n$ for all $n$, and $f$ is linear in $[a_n, c_n]$ and $[c_n, b_n]$. Then $f$ is continuous, and $V(f; 0, 1) = 2 \sum_{n=1}^{\infty} d_n$. To see that $f$ maps null sets into null sets, consider a null subset $A$ of $I$. By 5.8 in Chapter I, we have $f(A) = f(A \cap K) \cup \bigcup_n f(A \cap (a_n, b_n))$. Since $f$ is linear in $[a_n, c_n]$ and in $[c_n, b_n]$, it readily follows that $|f(A \cap (a_n, b_n))| = 0$ for all $n$, and so $|f(A)| \le |\{0\}| + \sum_n |f(A \cap (a_n, b_n))| = 0$. If $\sum_n d_n = \infty$, then $f$ fails to be AC on $I$ since it is not BV there.

In order to establish further properties of AC functions we introduce the following definition: Suppose $f$ is a real-valued a.e. differentiable function on an interval $I$. We then say that $f$ is singular if $f' = 0$ a.e. on $I$. How do AC singular functions look?
Proposition 3.5. Suppose $f$ is an AC singular function defined on an interval $I$. Then $f$ is constant.

Proof. Let $A$ be a subset of $I$ of full measure so that $f'(x) = 0$ for $x \in A$. By Lemma 3.3 we have
\[ |f(A)|_e \le \int_A |f'| \, dx = 0. \tag{3.10} \]
Further, since $|I \setminus A| = 0$, by the necessity of Theorem 3.4 it follows that
\[ |f(I \setminus A)| = 0. \tag{3.11} \]
Whence combining (3.10) and (3.11) we get
\[ |f(I)|_e \le |f(A)|_e + |f(I \setminus A)|_e = 0. \tag{3.12} \]
Now, since $f$ is continuous, unless $f$ is constant, $f(I)$ contains an interval and (3.12) does not hold. Therefore $f$ must be constant. •

We are now ready to characterize, following Lebesgue, the class of functions which may be reconstructed by integrating their derivatives.

Theorem 3.6. Suppose $f$ is a real-valued function defined on $I = [a,b]$. Then, $f'$ exists a.e. in $(a,b)$, it is integrable there, and
\[ f(x) - f(a) = \int_{[a,x]} f'(t) \, dt, \qquad a \le x \le b, \tag{3.13} \]
iff $f$ is AC on $I$.

Proof. We do the sufficiency first: Since $f$ is AC on $I$, $f'$ exists a.e. on $I$ and it is integrable, therefore we only need to show that (3.13) holds. For this purpose put $F(x) = \int_{[a,x]} f'(t) \, dt$, $a \le x \le b$, and observe that by 3.6 in Chapter VIII also $F$ is AC on $I$, and, by the Lebesgue differentiation theorem, $F' = f'$ a.e. Let $g = F - f$; $g$ is AC and singular on $I$, and, by Proposition 3.5, $g$ is constant there. More precisely, $F(x) - f(x) = F(a) - f(a)$, $a \le x \le b$, and since $F(a) = 0$, it readily follows that
\[ F(x) = \int_{[a,x]} f'(t) \, dt = f(x) - f(a), \qquad a \le x \le b, \]
as we wanted to show. Conversely, since
\[ f(x) = f(a) + \int_{[a,x]} f'(t) \, dt, \qquad a \le x \le b, \]
the necessity follows from Theorem 2.2 in Chapter VIII. •
Implicit in the proof of Theorem 3.6 is the following important result concerning BV functions.

Theorem 3.7 (Lebesgue). Suppose $f$ is BV on $I = [a,b]$. Then there exist an AC function $g$ and a singular function $h$ such that
\[ f(x) = g(x) + h(x), \qquad x \in I. \tag{3.14} \]
Up to constants, the decomposition in (3.14) is unique.

Proof. Since $f$ is BV on $I$, $f'$ exists a.e. there, and it is integrable. Let $g(x) = \int_{[a,x]} f' \, dt$, $a \le x \le b$, and set
\[ h(x) = f(x) - g(x), \qquad a \le x \le b. \]
Then $g$ is AC on $I$ and by the Lebesgue differentiation theorem, $h' = f' - g' = 0$ a.e. on $I$. Thus $f = g + h$ is a desired decomposition. As for the uniqueness (modulo constants), suppose that also $f = g_1 + h_1$, where $g_1$ is AC on $I$ and $h_1$ is singular. We then have
\[ g - g_1 = h_1 - h, \tag{3.15} \]
where the expression on the left-hand side of (3.15) is AC and that on the right-hand side is singular. By Proposition 3.5 it readily follows that this function is constant, $c$ say. Thus $g = g_1 + c$, and $h = h_1 - c$. •
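A simple instance of the decomposition may help fix ideas. The following worked example is an illustration added here, not taken from the text; it writes $\varphi$ for the Cantor-Lebesgue function on $[0,1]$ (the symbol $\varphi$ is chosen only for this example) and uses the fact that $\varphi' = 0$ off the Cantor set, which is null.

```latex
% Worked example (illustrative): Lebesgue decomposition of f(x) = x + \varphi(x)
% on I = [0,1], where \varphi denotes the Cantor-Lebesgue function.
\begin{align*}
  f(x)  &= x + \varphi(x), \qquad 0 \le x \le 1,
        && \text{nondecreasing, hence BV on } [0,1], \\
  f'(x) &= 1 + \varphi'(x) = 1
        && \text{a.e. on } [0,1], \\
  g(x)  &= \int_{[0,x]} f'(t)\,dt = x
        && \text{the AC part}, \\
  h(x)  &= f(x) - g(x) = \varphi(x)
        && \text{the singular part, } h' = 0 \text{ a.e.}
\end{align*}
% Consistency check against (2.10):
%   \int_{[0,1]} f' \, dx = 1 < f(1) - f(0) = 2,
% and the deficit 1 is exactly the total increase h(1) - h(0) of the singular part.
```

In this example the strict inequality in (2.10) is accounted for entirely by the singular part, in line with Theorem 3.6: equality holds precisely when $h$ is constant.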
4. PROBLEMS AND QUESTIONS

4.1 Show that in Vitali's covering lemma we may also demand that given $\varepsilon > 0$, $\sum_k |I_k| \le (1 + \varepsilon)|E|_e$.

4.2 State and prove Vitali's covering lemma, including the conclusion of 4.1, for subsets $E$ of $R^n$ with $|E|_e < \infty$, and covered in the sense of Vitali by closed n-dimensional intervals.

4.3 Let $E$ be a subset of $R^n$ that is the union of sets, each being an open interval together with any of its edges. Prove that $E$ is Lebesgue measurable.

4.4 A measure $\mu$ on $(R^n, \mathcal{L})$ is said to be doubling provided there exists an absolute constant $c$ such that
\[ \mu(I(x, 2r)) \le c\mu(I(x, r)), \qquad \text{all } x \in R^n, \; r > 0. \]
Given a doubling measure $\mu$, a family $\mathcal{V}$ of closed intervals of $R^n$ is said to be a covering of $E \in \mathcal{L}$ in the sense of Vitali if for any $x \in E$ and $\varepsilon > 0$, there is an interval $I \in \mathcal{V}$ which contains $x$ and so that $\mu(I) \le \varepsilon$. Show that if $\mathcal{V}$ is a covering of $E$ in the sense of Vitali, and $\mu(E) < \infty$, then there exists an at most countable family $\{I_k\}$ of pairwise disjoint intervals of $\mathcal{V}$ such that $\mu(E \setminus \bigcup_k I_k) = 0$.

4.5 Suppose $f$ is a real-valued function defined on an interval $I$, and that all the Dini numbers of $f$ for $x$ in $I$ lie between $-k$ and $k$,
where $k$ is some positive constant. Must $f$ be Lipschitz on $I$? If so, what is the relation between the Lipschitz constant of $f$ and $k$?

4.6 Let $f$ be a real-valued function defined on $I$, and suppose there exist real constants $u$, $v$ such that
u~
n+ I( x)
Is it then true that for all a
~ v,
~ x
0 are given. Show that there exist a
Lebesgue measurable set $B \subset I$, and an AC function $F$ defined on $I$ such that $\int_{I \setminus B} |f - F| < \varepsilon$ and $|B| < \eta$.

4.25 Let $E$ be a bounded Lebesgue measurable set in the line and let
\[ f_n(x) = n \int_{[x, x+1/n]} \chi_E \, dy, \qquad n = 1, 2, \ldots \]
Show that each $f_n$ is AC on every bounded interval of the line, that $0 \le f_n \le 1$, that $\lim_{n \to \infty} f_n = \chi_E$ a.e., and that $\lim_{n \to \infty} d(f_n, \chi_E) = 0$. Is this result sufficient to prove that AC functions are dense in the metric of $L(R)$?

4.26 Let $g$ be a continuous function defined on $I = [a,b]$ and suppose $f$ is AC there. Prove that $\int_a^b g \, df = \int_I g f' \, dy$.

4.27 Let $I = [a,b]$, and suppose $f$ is BV on $I$. Prove that for each Borel set $E \subseteq I$ we have $\int_E |f'| \, dy \le |V(f)(E)|_e$, and that there is equality here provided that $f$ is AC on $I$. A related result is the following: Suppose $f$ is a strictly monotone AC function defined on $I$, and let $f(I) = J$. Show that for every Borel subset $E$ of $J$ we have $\int_{f^{-1}(E)} f'(y) \, dy = |E|$.

4.28 (Change of Variable) Let $I = [a,b]$, and $g : I \to R$, $g(I) \subset J \subset R$, be continuous there. Furthermore, if $J = [c,d]$ and $f : J \to R$ is integrable, put $F(x) = \int_{[c,x]} f \, dy$, $c \le x \le d$. Now, suppose that $g$ and $F \circ g$ are a.e. differentiable on their domains of definition, and prove that the relation $(F \circ g)' = (f \circ g) g'$ holds a.e. on $I$. Finally, show that $F \circ g$ is AC on $I$ iff (i) $(f \circ g) g' \in L([a,b])$. (ii) For each subinterval $I' = [a', b']$ of $I$ we have
\[ \int_{g(a')}^{g(b')} f \, dy = \int_{I'} (f \circ g) g' \, dy. \]

4.29 (Integration by Parts) Let $I = [a,b]$ be a bounded interval, and suppose $f, g \in L(I)$. Show that if
\[ F(x) = \int_{[a,x]} f \, dy, \qquad G(x) = \int_{[a,x]} g \, dy, \qquad a \le x \le b, \]
then $fG, gF \in L(I)$ and
\[ \int_I fG \, dy + \int_I Fg \, dy = F(b)G(b) - F(a)G(a). \]
182
Absolute Continuity
Also, if I,g are AC on [, we have
iI9'dY+ il'9dY=/(b)9(b)-/(a)g(a). 4.30 Suppose I, I' E L(R). Prove that
JR I' dy = O.
4.31 Let [ = [a, b], and suppose I is a continuous real-valued function defined 'on [. The estimate (1.5) in Chapter III implies that if V(x) is AC on [, then also I is AC on [. Discuss whether the converse is true, to wit, does the assumption that I is AC on [ imply that V(x) is AC on I? 4.32 Let {In} be a sequence of AC functions defined on [ = [0,1] such that In(O) = 0 for all n. Assume that the sequence of derivatives {/~} is Cauchy in L(I), i.e., limn,m-+ooJII/~(x) - 1~(x)ldx = o. Prove that {In} converges uniformly to a function I, and that I is AC in [. 4.33 Prove that a real-valued function I(x)
=j
(-oo,x]
dy,
I defined on R is of the form where E L(R),
iff (a) I is AC on [-n,n] for all n, (b) V(J; -n,n) n, and, (c) limlxl-+oo I(x) = O. Prove it.
~ k
there is a family {[ak,b k]} of nonoverlapping subintervals of I so that
°
L IF(bk) -
(1.4)
F(ak)1 ;?: e,
k
Now, since as is readily seen, cf. Theorem 2.1 in Chapter IX,
(1.4) implies that the set $B = \bigcup_k [a_k,b_k]$ verifies (1.5). We now invoke (1.5) with $\delta = 1/2^n$, $n = 1,2,\ldots$, and construct a sequence of (bad) sets $\{B_n\}$ so that
$$\mu_F(B_n) > \varepsilon, \quad |B_n| \le 1/2^n, \quad B_n \subseteq I, \quad \text{all } n.$$
Thus, on the one hand by the Borel-Cantelli Lemma it follows that $|\limsup B_n| = 0$, and, on the other hand, by 4.25 in Chapter IV, we have $\mu_F(\limsup B_n) > 0$, contradicting the fact that $\mu_F$ is absolutely continuous with respect to the Lebesgue measure.
As for the sufficiency, suppose that $|A| = 0$, and given $\varepsilon > 0$, let $\delta > 0$ be the number that corresponds to the choice of $\varepsilon$ in the AC definition of $F$ on $[-2n,2n]$, $n > 0$. Observe that since $|A \cap [-n,n]| = 0$, there exists an open set $O = \bigcup_k (a_k,b_k) \subseteq [-2n,2n]$ such that
$$A \cap [-n,n] \subset O, \quad |O| < \delta.$$
Also note that by (2.4) in Theorem 2.1 in Chapter IX, and since $\sum_{k=1}^N (b_k - a_k) \le |O| \le \delta$, we get
$$\mu_F\Big(\bigcup_{k=1}^N (a_k,b_k)\Big) \le \sum_{k=1}^N \big(F(b_k) - F(a_k)\big) \le \varepsilon, \quad \text{all } N.$$
Since this inequality holds for all $N$, it follows that $\mu_F(O) \le \varepsilon$, and since $\varepsilon$ is arbitrary and $\mu_F$ is regular, we have
$$\mu_F(A \cap [-n,n]) = 0, \quad \text{all } n.$$
But this can only be true if $\mu_F(A) = 0$, and consequently, $\mu_F$ is absolutely continuous with respect to the Lebesgue measure. ∎
Also corresponding to the $\varepsilon$-$\delta$ definition of AC functions, there is the following result.
Proposition 1.2. Suppose $\mu$ is a measure and $\nu$ is a signed measure on $(X,\mathcal{M})$ so that
$$|\nu(E)| < \infty \quad \text{whenever} \quad \mu(E) < \infty. \tag{1.6}$$
Then, $\nu \ll \mu$ iff given $\varepsilon > 0$, there exists $\delta > 0$, such that
$$|\nu(E)| < \varepsilon \quad \text{whenever} \quad \mu(E) < \delta. \tag{1.7}$$
Proof. If $\mu(E) = 0$, then $\mu(E) < \delta$ for every $\delta > 0$, and (1.7) gives that $\nu(E) = 0$. As for the necessity, suppose that $\nu \ll \mu$ and that (1.7) is false. Then there exist a sequence $\{B_n\} \subseteq \mathcal{M}$ and $\varepsilon > 0$, such that
$$|\nu(B_n)| > \varepsilon \quad \text{and} \quad \mu(B_n) \le 1/2^n, \quad n = 1,2,\ldots$$
Pick now a subsequence $\{B_{n_k}\}$, say, so that all the $\nu(B_{n_k})$'s are of the same sign, and observe that since $\mu(\bigcup_k B_{n_k}) < \infty$, by (1.6) it also follows that $|\nu(\bigcup_k B_{n_k})| < \infty$. The proof may now be finished in a stroke: By 4.25 in Chapter IV, $|\nu(\limsup B_{n_k})| > 0$, and by the Borel-Cantelli Lemma, $\mu(\limsup B_{n_k}) = 0$; this contradicts the fact that $\nu \ll \mu$. ∎
Observe that even if $\nu$ is a measure, (1.6) is necessary for Proposition 1.2 to hold. Consider, for example,
$$\nu(E) = \int_E x^2\,dx, \quad E \in \mathcal{L}.$$
Then $|E| = 0$ implies $\nu(E) = 0$, but since for $|x|$ large the set $E = (x,x+\eta)$ has $|E| = \eta$ and $\nu(E)$ is large, (1.7) fails.
Next suppose that $\mu$ is a probability measure and $\nu$ is a signed measure defined on $(X,\mathcal{M})$, and that
$$|\nu(E)| \le \mu(E), \quad E \in \mathcal{M}. \tag{1.8}$$
Clearly $\nu$ [...]
$$\lambda(E) = \int_E f\,d\mu, \quad \text{all } E \in \mathcal{M}. \tag{1.25}$$
Furthermore, if $g$ is a measurable extended real-valued function defined on $X$, then
$$\int_X g\,d\lambda = \int_X g f\,d\mu. \tag{1.26}$$
The equality in (1.26) is understood as follows: If the integral on either side of (1.26) exists, then the integral on the other side also exists and they are equal.
Proof. Write $X = \bigcup_k X_k$, where the union is pairwise disjoint and $\mu(X_k) < \infty$ for all $k$. By rescaling if necessary, assume that $\mu(X_k) = 1$, and invoke Theorem 1.4 for the measures $\lambda_k, \mu_k$ on $(X,\mathcal{M})$ given by
The function $f$ in (1.25) is obtained as $f = \sum_k f_k \chi_{X_k}$, where the $f_k$'s are the (unique) functions that satisfy (1.20) for the measures $\lambda_k$ and $\mu_k$. As for (1.26), suppose first that $g \ge 0$, and let [...] $|g_n|$ $\mu$-a.e. Thus, by Fatou's Lemma and (2.22) it follows that [...] (2.25) and $g$ is an $L^q(\mu)$ function with norm less than or equal to $\|L\|$. Next note that since for each $f \in L^p(\mu)$ we have
$$f_n = f\chi_{X_n} \to f \ \ \mu\text{-a.e.}, \quad \text{and} \quad |f_n| \le |f| \ \ \mu\text{-a.e.},$$
by LDCT it follows that $\lim_{n\to\infty} \|f_n - f\|_p = 0$, and consequently, by the continuity of $L$ we obtain $\lim_{n\to\infty} Lf_n = Lf$. Moreover, since by (2.23) $Lf_n = \int_{X_n} f_n g_n\,d\mu = \int_X f_n g\,d\mu$, by Hölder's inequality we also have $\lim_{n\to\infty} \int_X f_n g\,d\mu = \int_X fg\,d\mu$, and consequently, $Lf = \int_X fg\,d\mu$, as we wanted to show. It thus only remains to verify that $\|L\| = \|g\|_q$, and by (2.25) we only need to check that $\|L\| \le \|g\|_q$. But since $L = L_g$, this is an easy consequence of Hölder's inequality. ∎
Two natural questions arise from this result: How can we go about representing the bounded linear functionals on $L^\infty(\mu)$, and, is the assumption concerning the $\sigma$-finiteness of $\mu$ necessary? The former question will be addressed in Chapter XIV, and the latter question has two answers, to wit: If $1 < p < \infty$, it is not necessary that $\mu$ be $\sigma$-finite, and if $p = 1$, it is. We do the case $p = 1$ first. Let $X = (0,1)$, let $\mathcal{M}$ be the $\sigma$-algebra of those subsets $E$ of $X$ which are either countable or such that $X \setminus E$ is countable, and assume $\mu$ is the
counting measure on $(X,\mathcal{M})$; $\mu$ is not $\sigma$-finite. If $\nu$ denotes the counting measure on $(X,\mathcal{P}(X))$, put
$$Lf = \int_X f\chi_{(0,1/2)}\,d\nu, \quad f \in L^1(\mu).$$
Since for $f \in L^1(\mu)$ the set $\{f \ne 0\}$ is $\mu$ and $\nu$ $\sigma$-finite, it is clear that $|Lf| \le \|f\|_1$, and $L$ is a bounded linear functional on $L^1(\mu)$ of norm less than or equal to 1. It is intuitively clear that if $Lf = L_g f$, then $g$ must be the function $\chi_{(0,1/2)}$, which is measurable with respect to the $\sigma$-algebra $\mathcal{P}(X)$, but not measurable with respect to the $\sigma$-algebra $\mathcal{M}$; the verification of this observation is left to the reader. On the other hand, the situation is quite different if $1 < p < \infty$, for then $\chi_{(0,1/2)} \notin L^q(\nu)$ for any $q < \infty$.
Finally, to see that in the case $1 < p < \infty$ the $\sigma$-finiteness of $\mu$ is not needed in the Riesz representation theorem, note that, in the notation of that theorem, given a $\sigma$-finite subset $E$ of $X$, there is a unique function $g = g_E$ vanishing off $E$ so that $g_E \in L^q(E,\mu)$ and
$$Lf = \int_E fg\,d\mu, \quad \text{all } f \in L^p(E,\mu).$$
Furthermore, if $L|_E$ denotes the restriction of $L$ to $L^p(E,\mu)$, we also have
$$\|g_E\|_q \le \|L|_E\| \le \|L\|, \quad \text{all } \sigma\text{-finite } E.$$
Also (a simple variant of) the argument in (2.24) above gives that if $E_1 \supseteq E$ are $\sigma$-finite subsets of $X$, then we have $g_{E_1} = g_E$ $\mu$-a.e. on $E$, and $\|g_E\|_q \le \|g_{E_1}\|_q$. Let now $\eta$ be the finite quantity
$$\eta = \sup\{\|g_E\|_q : E \text{ is a } \sigma\text{-finite subset of } X\},$$
and let $\{E_n\}$ be a sequence of $\sigma$-finite subsets of $X$ with the property that $\lim_{n\to\infty} \|g_{E_n}\|_q = \eta$. Observe that if $E = \bigcup_n E_n$, then $E$ is also a $\sigma$-finite subset of $X$, and since
$$\|g_E\|_q \ge \|g_{E_n}\|_q, \quad \text{all } n,$$
it readily follows that $\|g_E\|_q = \eta$. Now, life outside $E$ is uneventful. Indeed, let $A$ be a $\sigma$-finite subset of $X$, and put $A_1 = (A \setminus E) \cup E$. Then $A_1$ is also a $\sigma$-finite subset of $X$, and since $q < \infty$ and
$$\int_{A_1} |g_{A_1}|^q\,d\mu = \int_{A\setminus E} |g_A|^q\,d\mu + \int_E |g_E|^q\,d\mu = \int_{A\setminus E} |g_A|^q\,d\mu + \eta^q \le \eta^q,$$
it readily follows that $g_A = 0$ $\mu$-a.e. on $A \setminus E$. This is all we need to know: If $f$ is an arbitrary function in $L^p(\mu)$, then the set $A = \{f \ne 0\}$ is $\sigma$-finite, cf. 4.11 in Chapter VII, and
$$Lf = \int_A f g_A\,d\mu = \int_{A\setminus E} f g_A\,d\mu + \int_{A\cap E} f g_E\,d\mu = \int_X f g_E\,d\mu,$$
which is the desired representation of $L$.
3. WEAK CONVERGENCE
Assume $(X,\mathcal{M},\mu)$ is a measure space, and let $f, f_n \in L^p(\mu)$, $n = 1,2,\ldots$, $1 \le p < \infty$. We say that the sequence $\{f_n\}$ converges weakly to $f$ in $L^p(\mu)$, if, with $1/p + 1/q = 1$, we have
$$\lim_{n\to\infty} \int_X f_n g\,d\mu = \int_X fg\,d\mu, \quad \text{all } g \in L^q(\mu). \tag{3.1}$$
We now give a few examples to show that there is no connection between weak convergence and any of the other forms of convergence, unless further assumptions are made on either the sequence itself or the measure space involved. For instance, in $\ell^p$, consider the sequence $\{e_n\}$ consisting of those sequences $e_n = (0,\ldots,1,0,\ldots)$ with 1 in the $n$th place and zeroes elsewhere. If $1 < p < \infty$, and $x = (x_1,\ldots,x_n,\ldots) \in \ell^q$, then the functional $L_x$ has the property that
$$\lim_{n\to\infty} L_x e_n = \lim_{n\to\infty} x_n = 0, \tag{3.2}$$
and since by the Riesz representation theorem these are all the functionals on $\ell^p$, the sequence $\{e_n\}$ converges weakly to 0. Nevertheless, since $\|e_n - e_m\|_p = 2^{1/p}$ for all $n \ne m$, neither the sequence itself nor any of its subsequences converges to 0. Neither does $\{e_n\}$ converge to 0 in measure, nor uniformly, nor even in the pointwise sense. Note however that $\{e_n\}$ does not converge weakly to 0 in $\ell^1$; this is clear since the sequence $x = (1,1,\ldots)$ is bounded and $L_x e_n = 1$ for all $n$. Now, in the case of $\ell^1$ we have the following interesting result.
Proposition 3.1 (Schur). If the sequence $\{x_n\}$ converges weakly to $x$ in $\ell^1$, then
$$\lim_{n\to\infty} \|x_n - x\|_1 = 0. \tag{3.3}$$
Proof. By considering, if necessary, the sequence $\{x_n - x\}$ we may assume that $\{x_n\}$ converges weakly to 0. Suppose that $\lim_{n\to\infty} \|x_n\|_1 \ne 0$; passing to a subsequence, if needed, we may assume that $\|x_n\|_1 \ge \eta > 0$ for all $n$. In this case also $(1/\|x_n\|_1)\,x_n$ converges weakly to 0, and so we may as well assume that $\|x_n\|_1 = 1$ for all $n$. In addition, if $x_n = (x_{n,1},\ldots,x_{n,m},\ldots)$, $n = 1,2,\ldots$, by the weak convergence it follows that
$$\lim_{n\to\infty} x_{n,m} = 0, \quad \text{all } m. \tag{3.4}$$
Observe that since $\|x_1\|_1 = 1$, we can find $m_1$ so that $\sum_{m=1}^{m_1} |x_{1,m}| > 3/4$. Further, by (3.4) it readily follows that there exists an index $n_2 > 1$ such that $\sum_{m=1}^{m_1} |x_{n_2,m}| < 1/4$, and consequently, since $\|x_{n_2}\|_1 = 1$, we can find an index $m_2 > m_1$ so that $\sum_{m=m_1+1}^{m_2} |x_{n_2,m}| > 3/4$. The pattern is now clear: Having chosen sequences $m_0 = 0 < m_1 < m_2 < \cdots < m_k$ and $n_1 = 1 < n_2 < \cdots < n_k$, choose first $n_{k+1}$ with the property that
$$\sum_{m=1}^{m_k} |x_{n_{k+1},m}| < 1/4,$$
and then $m_{k+1}$ so that
$$\sum_{m=m_k+1}^{m_{k+1}} |x_{n_{k+1},m}| > 3/4.$$
Consider now the sequence $y \in \ell^\infty$ with terms
$$y_m = \operatorname{sgn} x_{n_k,m}, \quad m_{k-1} < m \le m_k, \quad k = 1,2,\ldots$$
Since $|y_m| \le 1$ for all $m$, the functional $L_y$ on $\ell^1$ satisfies
$$|L_y(x_{n_k})| = \Big|\sum_{m=1}^{\infty} x_{n_k,m}\,y_m\Big| \ge \sum_{m=m_{k-1}+1}^{m_k} |x_{n_k,m}| - \Big(\sum_{m=1}^{m_{k-1}} + \sum_{m=m_k+1}^{\infty}\Big) |x_{n_k,m}\,y_m| \ge 2\sum_{m=m_{k-1}+1}^{m_k} |x_{n_k,m}| - \|x_{n_k}\|_1 > 2\cdot 3/4 - 1 = 1/2, \quad k = 1,2,\ldots$$
Thus, $\limsup |L_y(x_n)| \ge 1/2$, which contradicts the weak convergence of $\{x_n\}$ to 0. ∎
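The behaviour of the unit vectors $e_n$ discussed before Proposition 3.1 can be checked numerically. The following Python sketch is only an illustration under stated assumptions: sequences are truncated to finitely many coordinates so that every sum is finite, and the helper names `unit_vector`, `pairing`, `p_norm`, as well as the choice $x_m = 1/m$ (which lies in $\ell^q$ for every $q > 1$), are ours and do not come from the text.

```python
def unit_vector(n, length):
    """First `length` coordinates of e_n (1 in the n-th place, 1-based)."""
    return [1.0 if i == n else 0.0 for i in range(1, length + 1)]


def pairing(x, y):
    """Truncated value of the functional L_y at x, i.e. the sum of x_m * y_m."""
    return sum(a * b for a, b in zip(x, y))


def p_norm(x, p):
    """The l^p norm of a finitely supported sequence."""
    return sum(abs(a) ** p for a in x) ** (1.0 / p)


LENGTH = 2000  # truncation length (an assumption of this sketch)
p = 2.0

# x_m = 1/m belongs to l^q for every q > 1; L_x e_n = x_n = 1/n tends to 0.
x = [1.0 / m for m in range(1, LENGTH + 1)]
pairings = [pairing(unit_vector(n, LENGTH), x) for n in (1, 10, 100, 1000)]

# The distance between two distinct unit vectors is 2^(1/p), so no
# subsequence of {e_n} can converge in norm.
difference = [a - b for a, b in zip(unit_vector(3, LENGTH), unit_vector(7, LENGTH))]
distance = p_norm(difference, p)

# In l^1 the bounded sequence (1, 1, ...) gives L_x e_n = 1 for all n,
# so {e_n} does not converge weakly to 0 there.
ones = [1.0] * LENGTH
l1_pairings = [pairing(unit_vector(n, LENGTH), ones) for n in (1, 10, 100, 1000)]

print(pairings, distance, l1_pairings)
```

The pairings against $x$ shrink like $1/n$, the mutual distance stays at $2^{1/p}$, and the pairings against $(1,1,\ldots)$ stay equal to 1, in agreement with the discussion of $\ell^p$, $1 < p < \infty$, versus $\ell^1$.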
In a different direction, an interesting example is the sequence $\{f_n\} \subseteq L^p([0,1])$, $1 \le p$, given by $f_n = n\chi_{[0,1/n)}$, which converges to 0 in measure and a.e., and does not converge to 0 weakly in $L^p([0,1])$ for any $p \ge 1$. (Just consider the functional induced by $\chi_{[0,1]}$.) Finally, the sequence $\{f_n\} \subseteq L^p(R)$, $1 \le p$, given by $f_n = (1/n)\chi_{[1,e^n)}$, converges uniformly to 0, yet it does not converge weakly to 0 in $L^p(R)$ for any $p \ge 1$. (The functional induced by $(1/x)\chi_{[1,\infty)}(x)$ will do.)
There are additional assumptions that we may impose on weakly convergent sequences to ensure that they also are convergent. Again we discuss the $\ell^p$ case; the result is also true for general $L^p(\mu)$ spaces, but the proof is more complicated.
Proposition 3.2. Suppose the sequence $\{x_n\}$ converges weakly to $x$ in $\ell^p$, $1 \le p < \infty$, and that, in addition,
$$\lim_{n\to\infty} \|x_n\|_p = \|x\|_p. \tag{3.5}$$
Then we also have
$$\lim_{n\to\infty} \|x_n - x\|_p = 0. \tag{3.6}$$
Proof. As before it follows that if
$$x = (x_1,\ldots,x_m,\ldots), \quad x_n = (x_{n,1},\ldots,x_{n,m},\ldots), \quad n = 1,2,\ldots$$
then
$$\lim_{n\to\infty} x_{n,m} = x_m, \quad \text{all } m. \tag{3.7}$$
Also, by (3.5) and (3.7), for each fixed $M$ we have
$$\lim_{n\to\infty} \Big(\sum_{m=M}^{\infty} |x_{n,m}|^p\Big)^{1/p} = \Big(\sum_{m=M}^{\infty} |x_m|^p\Big)^{1/p}. \tag{3.8}$$
Whence, for each fixed $M$ we get,
$$\|x_n - x\|_p \le \Big(\sum_{m=1}^{M-1} |x_{n,m} - x_m|^p\Big)^{1/p} + \Big(\sum_{m=M}^{\infty} |x_{n,m} - x_m|^p\Big)^{1/p} = A + B,$$
say. It is not hard to estimate $B$. By Minkowski's inequality and (3.8) it follows that for all sufficiently large $n$,
$$B \le \Big(\sum_{m=M}^{\infty} |x_{n,m}|^p\Big)^{1/p} + \Big(\sum_{m=M}^{\infty} |x_m|^p\Big)^{1/p} \le 3\Big(\sum_{m=M}^{\infty} |x_m|^p\Big)^{1/p},$$
which, since $x \in \ell^p$, can be made arbitrarily small provided $M$ is sufficiently large. Once $M$ is fixed, it is clear that, on account of (3.7), $A$ also can be made arbitrarily small provided $n$ is large enough. Thus (3.6) holds, and we have finished. ∎
As noted above, $L^p$ spaces do not have the Bolzano-Weierstrass property: There are bounded sequences for which no convergent subsequence may be found. The concept of weak convergence is also relevant in this context.
Theorem 3.3. Let $\{f_k\}$ be a bounded sequence in $L^p(R^n)$, $1 < p < \infty$, with bound $M$. Then there exist a subsequence $k_m \to \infty$ and a function $f \in L^p(R^n)$ with $\|f\|_p \le M$, such that $\{f_{k_m}\}$ converges weakly to $f$ in $L^p(R^n)$.
Proof. We divide the proof into a number of steps, and begin by showing that there is a subsequence $k_m \to \infty$ with the property that $\lim_{m\to\infty} \int_{R^n} f_{k_m} g\,dx$ exists, provided $g$ is any function in a fixed countable family $\{g_h\} \subseteq L^q(R^n)$, $1/p + 1/q = 1$. This is not hard: Let $c_{k,h} = \int_{R^n} f_k g_h$ for $k, h \ge 1$, and note that by Hölder's inequality we have
$$|c_{k,h}| \le \|f_k\|_p \|g_h\|_q \le M \|g_h\|_q. \tag{3.9}$$
Fix $h = 1$ now. By (3.9), $(c_{k,1})$ is a bounded sequence and consequently, there is a subsequence $k_1 \to \infty$ such that $\lim_{k_1\to\infty} c_{k_1,1}$ exists. Repeating this argument with $(c_{k_1,2})$ in place of $(c_{k,1})$ above we obtain a new subsequence $k_2 \to \infty$, say, such that $\lim_{k_2\to\infty} c_{k_2,h}$ exists for $h = 1,2$. These are the first steps of the by now familiar Cantor diagonal process which ensures the existence of a subsequence $k_m \to \infty$ so that $\lim_{k_m\to\infty} c_{k_m,h}$ exists for each $h$.
We choose now for the $g_h$'s a dense family in $L^q(R^n)$, which, since $L^q(R^n)$ is separable, is clearly possible, and define the functional $L$ on the $g_h$'s by means of the expression $Lg = \lim_{m\to\infty} \int_{R^n} f_{k_m} g\,dx$. Now, $L$ is decidedly linear over these $g_h$'s, i.e., $L(g_{h_1} + \lambda g_{h_2}) = Lg_{h_1} + \lambda Lg_{h_2}$ for all scalars $\lambda$, and by Hölder's inequality it also satisfies $|Lg| \le M\|g\|_q$. We claim that $L$ can be extended linearly and continuously to all of $L^q(R^n)$. Indeed, to each $g \in L^q(R^n)$ there corresponds a sequence of $g_h$'s such that $\lim_{h\to\infty} \|g - g_h\|_q = 0$ and $\lim \|g_h\|_q = \|g\|_q$. Now, for these $g_h$'s the sequence of scalars $(Lg_h)$ is Cauchy, and consequently, convergent. Putting $Lg = \lim_{h\to\infty} Lg_h$, $L$ turns out to be a well-defined linear functional on $L^q(R^n)$, and since
$$|Lg| \le \limsup |Lg - Lg_h| + \limsup |Lg_h| \le M\|g\|_q,$$
$L$ is also bounded and has norm $\|L\| \le M$. By Theorem 2.5 there exists a function $f \in L^p(R^n)$ with $\|f\|_p \le M$ such that $Lg = L_f g$; the function $f$ satisfies all the required conditions. ∎
The proof of this interesting result relies on the fact that the functional $L$, originally defined on a subset of $L^q(R^n)$, can be extended to all of $L^q(R^n)$ without an increase of its "norm." A more general setting where this is also true is described in Chapter XIV.
4. PROBLEMS AND QUESTIONS In what follows (X,M,JL) denotes a measure space, and we don't find it necessary to stress this point at each instance. 4.1 Suppose 0 < p < q ~ 00. Give examples of functions I defined on R such that I E Lr(R) iff (a) p < r < q, (b) p ~ r ~ q, and, (c) r =p. 4.2 Let I be a bounded interval of R. By means of an example show that, in general, no t} )dt, Jx J[O,oo)
0 ,x}) ~ I{g>>.} I dp for all ,x > O. Show that IIglip ~ p'lI/l1p, where lip + lip' = 1.
f
4.22 Suppose $0 < r < p < q \le \infty$, and let $f \in L^p(\mu)$. Show that $f$ can be written $f = g + h$, where $g \in L^r(\mu)$ and $h \in L^q(\mu)$. Further, given $t > 0$, we can choose $g$ and $h$ so that $\|g\|_r^r \le t^{r-p}\|f\|_p^p$ and $\|h\|_q^q \le t^{q-p}\|f\|_p^p$.
4.23 Suppose $f \in \text{wk-}L(R^n)$ is such that $|\{f \ne 0\}| < \infty$. Prove that $f \in L^p(R^n)$ for each $0 < p < 1$. Also, if $f \in \text{wk-}L(R^n) \cap L^\infty(R^n)$, show that $f \in L^p(R^n)$ for $1 < p < \infty$.
4.24 The Hardy-Littlewood maximal operator takes LP(Rn ) functions into LP(Rn) functions, 1 < p < 00. More precisely, show there is a constant C = cn,p such that
4.25 Show that if $f \in L^p(R^n)$, $1 \le p \le \infty$, then the integral of $f$ differentiates to $f(x)$ for almost every $x$ in $R^n$.
4.26 Given an interval $I = [a,b]$ in the line, show that a necessary and sufficient condition for a function $F$ to be the integral of $f \in L^p(I)$, $1 < p < \infty$, is that the sums
$$\sum_{k=1}^{n} \frac{|F(x_k) - F(x_{k-1})|^p}{(x_k - x_{k-1})^{p-1}}$$
formed for every partition $\{a \le x_0 < \cdots < x_n \le b\}$ be bounded. The sup of these sums is then $\int_I |f(x)|^p\,dx$.
4.27 Suppose $\mu$ is a finite measure and $\nu$ [...] $\lambda\}$, and hence they are also measurable. If $f = \chi_G$, $G$ a $G_\delta$ subset in $R$, then we have $G = \bigcap_k O_k$, $O_k$ open, and consequently $f(x - y)$ is the limit $f(x-y) = \lim_{k\to\infty} \chi_{O_k}(x-y)$ of measurable functions, and hence measurable. The same is true if $f = \chi_N$, $N$ a null subset of $R^n$. Indeed, let $G$ be a $G_\delta$ subset in $R$ such that $N \subseteq G$, $|G| = 0$, and let $\mathcal{G} = \{(x,y) \in R^2 : x - y \in G\}$. As pointed out above, we have $|\mathcal{G}| = 0$ iff $|\mathcal{G}_x| = 0$ for all $x \in R$. Now,
$$\mathcal{G}_x = \{y \in R : x - y \in G\} = \{y \in R : y = x - g,\ g \in G\} = x - G,$$
and by the translation invariance of the Lebesgue measure we have $|\mathcal{G}_x| = |x - G| = |G| = 0$ for all $x \in R$. Thus $|\mathcal{G}| = 0$. Consider now
$$\mathcal{N} = \{(x,y) \in R^2 : x - y \in N\} \subseteq \mathcal{G}.$$
By the completeness of the Lebesgue measure it follows that also $|\mathcal{N}| = 0$, and so $\chi_N(x-y)$ is measurable. Now, if $f = \chi_E$, where $E$ is a measurable subset of $R$, write $E = G \setminus N$, with $G$ a $G_\delta$ subset of $R$ and $N$ a null set, and note that in this case $f(x-y) = \chi_G(x-y) - \chi_N(x-y)$ is also measurable. Whence simple functions $\phi$ also enjoy the property that $\phi(x-y)$ is measurable, and the same is true of the limits of simple functions, to wit, arbitrary measurable functions. This is precisely what we set out to prove.
Returning to the properties of convolutions, we have
Theorem 2.1.
Suppose I,g E L(Rn). Then
I I/(x JRB
y)llg(y)1 dy
that vanishes off $|x| \le 1$. To construct such a function, let $\psi$ be defined on the line by
$$\psi(t) = \begin{cases} e^{1/t} & \text{if } t < 0, \\ 0 & \text{if } t \ge 0; \end{cases}$$
$\phi$ is now defined by letting $\phi(x) = \psi(|x|^2 - 1)$. Before we continue we need a simple observation concerning convolutions: By the translation invariance of the Lebesgue measure it readily follows that at those $x$'s where $f * g(x)$ is defined, we also have
$$f * g(x) = \int_{R^n} f(y)\,g(x-y)\,dy.$$
We are now ready to prove
Theorem 2.3. Suppose $f \in L^p(R^n)$, $1 \le p \le \infty$, and $\phi \in C_0^m(R^n)$, $m \ge 1$. Then $f * \phi \in L^p(R^n) \cap C^m(R^n)$.
Proof. Since, by Theorem 2.2, $f * \phi \in L^p(R^n)$, only the smoothness of the convolution needs to be proved. We first show that $f * \phi$ is continuous. Indeed, given $x, h \in R^n$, note that
$$|f * \phi(x+h) - f * \phi(x)| = \Big|\int_{R^n} f(y)\big(\phi(x+h-y) - \phi(x-y)\big)\,dy\Big| \le \int_{R^n} |f(y)|\,|\phi(x+h-y) - \phi(x-y)|\,dy \le \|f\|_p\,\|\phi(x+h-\cdot) - \phi(x-\cdot)\|_q, \tag{2.7}$$
where $p, q$ are conjugate indices. Now, if $q < \infty$, since $\phi \in L^q(R^n)$ and since translations are continuous in $L^q$, the right-hand side of (2.7) goes to 0 as $|h| \to 0$, and so does the left-hand side; thus $f * \phi(x)$ is continuous. On the other hand, if $q = \infty$, by the first inequality above it readily follows that
$$|f * \phi(x+h) - f * \phi(x)| \le \|f\|_1 \sup |\phi(\cdot + h) - \phi(\cdot)|.$$
Now, by the uniform continuity of $\phi$ we get that the right-hand side of the above inequality goes to 0 with $h$, and so does the left-hand side there.
As for the smoothness, let $h = (h,0,\ldots,0)$ denote the vector with scalar $h \ne 0$ in the first position and zeros elsewhere, and put
$$\Phi(x,y,h) = \frac{\phi(x+h-y) - \phi(x-y)}{h} - \Big(\frac{\partial\phi}{\partial x_1}\Big)(x-y).$$
By the conditions of the theorem it is clear that for each fixed $x \in R^n$, $\Phi(x,y,h) \to 0$, uniformly and boundedly in $y$ as $h \to 0$. Whence
$$\lim_{h\to 0} \int_{R^n} f(y)\,\Phi(x,y,h)\,dy = 0, \quad x \in R^n. \tag{2.8}$$
Moreover, since the integral in (2.8) equals
it readily follows that [...] $\phi \in C_0^\infty(R^n)$ with integral 1, and let $r = (f\chi_{B(0,M)}) * \phi_\varepsilon$. Now, by Theorem 2.3, $r \in L^p(R^n) \cap C^\infty(R^n)$; but there is something else we can say. Indeed, since both $f\chi_{B(0,M)}$ and $\phi$ vanish off a compact set $K$, say, the convolution $r(x) = \int_{R^n} (f\chi_{B(0,M)})(x-y)\,\phi_\varepsilon(y)\,dy$ vanishes unless there are points $x$ and $y$ such that $x - y \in K$ and $y/\varepsilon \in K$. Hence, $r(x) = 0$ unless $x$ is of the form
$$x = (x - y) + \varepsilon\,(y/\varepsilon) \in K + \varepsilon K,$$
and this is a bounded set of points in $R^n$. Thus $r \in C_0^\infty(R^n)$. Finally,
$$\|f - r\|_p \le \|f\chi_{B(0,M)} - r\|_p + \|f(1 - \chi_{B(0,M)})\|_p \le \|f\chi_{B(0,M)} - r\|_p + \eta,$$
and by Theorem 2.5 the right-hand side above can be made arbitrarily small with $\varepsilon$. ∎
There are substitute results for $f \in L^\infty(R^n)$; one of them is
Theorem 2.7. Suppose $\phi$ is a nonnegative integrable function with integral 1, and let $f \in L^\infty(R^n)$. Then
$$\lim_{\varepsilon\to 0} f * \phi_\varepsilon(x) = f(x), \tag{2.14}$$
at every point $x$ of continuity of $f$.
Proof. As before, by (2.12) we have
$$|f * \phi_\varepsilon(x) - f(x)| \le \int_{R^n} |f(x-y) - f(x)|\,\phi_\varepsilon(y)\,dy \le \Big(\int_{\{|y| \le M\}} + \int_{\{|y| > M\}}\Big) |f(x-y) - f(x)|\,\phi_\varepsilon(y)\,dy = A + B,$$
say. Now, if $f$ is continuous at $x$, given $\eta > 0$, there exists $M > 0$ such that $|f(x-y) - f(x)| \le \eta$ if $|y| \le M$. With this choice of $M$ we have
$$A + B \le \eta + 2\|f\|_\infty \int_{\{|y| > M\}} \phi_\varepsilon(y)\,dy,$$
where the integral on the right-hand side above tends to 0 as $\varepsilon \to 0$. Since $\eta$ is arbitrary, this readily implies that (2.14) is true. ∎
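The conclusion of Theorem 2.7 can be illustrated numerically in dimension $n = 1$. The sketch below is not part of the text's argument: the bounded function `f`, the box kernel `phi`, and the midpoint-rule helper `convolve_at` are all choices made here for illustration. At the continuity point $x = 1$ the error $|f * \phi_\varepsilon(1) - f(1)|$ shrinks with $\varepsilon$, while at the discontinuity $x = 0$ the convolutions approach $1/2$, which is not $f(0)$.

```python
import math


def f(x):
    """Bounded function, continuous at x = 1 and discontinuous at x = 0."""
    return math.cos(x) if x > 0 else 0.0


def phi(y):
    """Nonnegative kernel with integral 1: half the indicator of [-1, 1]."""
    return 0.5 if -1.0 <= y <= 1.0 else 0.0


def phi_eps(y, eps):
    """Dilation phi_eps(y) = eps^(-1) * phi(y / eps) in dimension n = 1."""
    return phi(y / eps) / eps


def convolve_at(x, eps, steps=20000):
    """Midpoint-rule value of the integral of f(x - y) * phi_eps(y) over [-eps, eps]."""
    h = 2.0 * eps / steps
    total = 0.0
    for k in range(steps):
        y = -eps + (k + 0.5) * h
        total += f(x - y) * phi_eps(y, eps)
    return total * h


epsilons = (0.5, 0.1, 0.02)
errors_at_1 = [abs(convolve_at(1.0, eps) - f(1.0)) for eps in epsilons]
values_at_0 = [convolve_at(0.0, eps) for eps in epsilons]

print(errors_at_1)  # decreasing toward 0
print(values_at_0)  # approaching 1/2, while f(0) = 0
```

For this particular kernel the value at $x = 1$ is $\cos(1)\sin(\varepsilon)/\varepsilon$, so the error behaves like $\cos(1)\,\varepsilon^2/6$; the failure at $x = 0$ shows that continuity at the point cannot simply be dropped from the hypotheses.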
Still we must address the harder question concerning the pointwise convergence to $f$ of the convolutions of $f$ with the approximate identities $\phi_\varepsilon$ for integrable, or more generally, $p$-integrable functions. We begin by proving
Theorem 2.8. Suppose $\phi$ is a nonnegative integrable function with integral equal to 1, and let $f \in L^p(R^n)$, $1 \le p < \infty$. If in addition $\phi$ satisfies
$$\phi(y) \le c/|y|^{n+\eta}, \quad \eta > 0, \quad \text{all } y \in R^n, \tag{2.15}$$
then at each point $x$ of continuity of $f$ we have
$$\lim_{\varepsilon\to 0} f * \phi_\varepsilon(x) = f(x). \tag{2.16}$$
Proof. The proof follows along the lines of that of Theorem 2.7. For, in the notation of that theorem, and with the same choice of $M$ as there, we still have that $A$ can be made arbitrarily small at a point of continuity $x$ of $f$. As for $B$, it is majorized by
$$\int_{\{|y| > M\}} |f(x-y)|\,\phi_\varepsilon(y)\,dy + |f(x)| \int_{\{|y| > M\}} \phi_\varepsilon(y)\,dy = B_1 + B_2,$$
say. (2.10) establishes that $\lim_{\varepsilon\to 0} B_2 = 0$. Also, if $p > 1$, by Hölder's inequality with indices $p$ and its conjugate $q$, we have
$$B_1 \le \Big(\int_{R^n} |f(x-y)|^p\,dy\Big)^{1/p} \Big(\int_{\{|y| > M\}} \phi_\varepsilon(y)^q\,dy\Big)^{1/q} = \|f\|_p\,\varepsilon^{-n+n/q} \Big(\int_{\{|y| > M/\varepsilon\}} \phi(y)^q\,dy\Big)^{1/q} = \|f\|_p\,\Phi(\varepsilon),$$
say. Consequently, any condition on $\phi$ that ensures that $\lim_{\varepsilon\to 0} \Phi(\varepsilon) = 0$ will give (2.16); we show next that (2.15) is one such condition. Indeed, if (2.15) holds, then
$$\Phi(\varepsilon) \le c\,\varepsilon^{-n+n/q} \Big(\int_{\{|y| > M/\varepsilon\}} |y|^{-(n+\eta)q}\,dy\Big)^{1/q} = c'\,\varepsilon^{\eta},$$
where $c'$ is a constant independent of $\varepsilon$, which clearly tends to 0 with $\varepsilon$. Finally, if $p = 1$, then
$$B_2 \le |f(x)| \int_{\{|y| > M/\varepsilon\}} \phi(y)\,dy,$$
which, since $\phi$ is integrable, also tends to 0 with $\varepsilon$. ∎
To complete the analogy with the Lebesgue Differentiation Theorem, we discuss the a.e. convergence of $f * \phi_\varepsilon$ to $f$. The results are now more complicated, cf. 4.18 below and Theorem 3.1 in Chapter XVII, but are surprisingly simple in case $\phi$ vanishes off a compact set.
Theorem 2.10. Suppose $\phi$ is a nonnegative bounded integrable function with integral equal to 1, which vanishes off $B(0,1)$, and let $f \in L^p(R^n)$, $1 \le p \le \infty$. Then, at each point $x$ of the Lebesgue set of $f$, and in particular a.e., we have
$$\lim_{\varepsilon\to 0} f * \phi_\varepsilon(x) = f(x).$$
Proof. As before, and in the notation of Theorem 2.7, since $\phi_\varepsilon$ vanishes off the set $\{|y| \le \varepsilon\}$, the choice $M = \varepsilon$ gives that $B = 0$. As for $A$, since $\phi$ is bounded, it is dominated by
$$A \le c\,\frac{1}{\varepsilon^n} \int_{\{|y| \le \varepsilon\}} |f(x-y) - f(x)|\,dy,$$
which goes to 0 with $\varepsilon$ at precisely those points $x$ in the Lebesgue set of $f$. ∎
The reader will note that the convergence results presented above may be extended to the following setting: We may assume that $\phi$ is an integrable function with integral one such that $|\phi(x)| \le \psi(x)$ for all $x \in R^n$, where $\psi$ satisfies the conditions that we required the previously nonnegative function $\phi$ to verify.
3. ABSTRACT FUBINI'S THEOREM
In this section we present the abstract version of Fubini's theorem. Now, in the case of Euclidean space the problem at hand was facilitated by the fact that the Lebesgue measure is defined on the various spaces involved. Thus, given measures $\mu, \nu$ defined on $(X,\mathcal{M})$ and $(Y,\mathcal{N})$ respectively, the first order of business is to construct a "product measure" on $\mathcal{M} \times \mathcal{N}$, the $\sigma$-algebra introduced in Chapter IV, one that will make statements such as Fubini's theorem true. First some definitions.
A measurable rectangle is any subset of $X \times Y$ of the form $A \times B$, $A \in \mathcal{M}$, $B \in \mathcal{N}$. Finite unions of pairwise disjoint measurable rectangles are called elementary sets, and are often denoted by $Q$. If $E \subseteq X \times Y$, we define the section $E_x$ of $E$ (at level $x \in X$) as the subset of $Y$ given by
$$E_x = \{y \in Y : (x,y) \in E\}, \quad x \in X. \tag{3.1}$$
Similarly, the section $E^y$ of $E$ (at level $y \in Y$) is defined as
$$E^y = \{x \in X : (x,y) \in E\}, \quad y \in Y. \tag{3.2}$$
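How do sections behave with respect to measurability?
Proposition 3.1. Every section of a measurable set $E \in \mathcal{M} \times \mathcal{N}$ is measurable. Specifically, $E_x \in \mathcal{N}$ for all $x \in X$, and $E^y \in \mathcal{M}$ for every $y \in Y$.
Proof. Let $\mathcal{F}$ denote the class of those $E \in \mathcal{M} \times \mathcal{N}$ such that $E_x \in \mathcal{N}$ for all $x \in X$; we intend to show that $\mathcal{F}$ is a $\sigma$-algebra of subsets of $X \times Y$ that contains all measurable rectangles and which therefore coincides with $\mathcal{M} \times \mathcal{N}$. First note that if $E = A \times B$ is a measurable rectangle, then $E_x = B$ when $x \in A$ and $E_x = \emptyset$ otherwise, and consequently, every measurable rectangle belongs to $\mathcal{F}$. In particular $X \times Y \in \mathcal{F}$. Further, since $\mathcal{N}$ is a $\sigma$-algebra, it readily follows that if $E \in \mathcal{F}$, then
$$((X \times Y) \setminus E)_x = \{y \in Y : (x,y) \notin E\} = Y \setminus E_x \in \mathcal{N}, \quad \text{all } x \in X,$$
and $\mathcal{F}$ is closed under complementation. Finally, if $E_n \in \mathcal{F}$, $n = 1,2,\ldots$, and $E = \bigcup_n E_n$, since
$$E_x = \bigcup_n (E_n)_x \in \mathcal{N}, \quad \text{all } x \in X,$$
$\mathcal{F}$ is also closed under countable unions. Thus $\mathcal{F}$ is the $\sigma$-algebra $\mathcal{M} \times \mathcal{N}$.
The proof for the $E^y$'s is the same. ∎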
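The identities for sections used in the proof of Proposition 3.1 are purely set-theoretic, and they can be tried out on finite sets. In the following Python sketch the sets `X`, `Y`, `E1`, `E2` and the helpers `x_section`, `y_section` are hypothetical examples chosen here; measurability plays no role in this finite toy setting, only the behaviour of sections under complements, unions, and rectangles.

```python
from itertools import product

X = {0, 1, 2}
Y = {"a", "b"}


def x_section(E, x):
    """E_x = {y : (x, y) in E}, the section of E at level x."""
    return {y for (u, y) in E if u == x}


def y_section(E, y):
    """E^y = {x : (x, y) in E}, the section of E at level y."""
    return {x for (x, v) in E if v == y}


full = set(product(X, Y))
E1 = {(0, "a"), (1, "a"), (1, "b")}
E2 = {(1, "a"), (2, "b")}
rectangle = set(product({0, 1}, {"b"}))  # A x B with A = {0, 1}, B = {"b"}

# ((X x Y) \ E)_x = Y \ E_x for every x
complement_ok = all(x_section(full - E1, x) == Y - x_section(E1, x) for x in X)

# (E1 u E2)_x = (E1)_x u (E2)_x for every x
union_ok = all(
    x_section(E1 | E2, x) == x_section(E1, x) | x_section(E2, x) for x in X
)

# (A x B)_x = B if x is in A, and the empty set otherwise
rectangle_ok = all(
    x_section(rectangle, x) == ({"b"} if x in {0, 1} else set()) for x in X
)

print(complement_ok, union_ok, rectangle_ok)
```

All three checks succeed, mirroring the three steps of the proof: rectangles belong to the class, and the class is closed under complements and unions.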
The statement of Proposition 3.1 is one about characteristic functions of measurable sets. For arbitrary measurable functions the situation is as follows: If $f$ is a function defined on $X \times Y$, we call the function
$$f_x(y) = f(x,y), \quad x \in X,$$
the $X$-section of $f$ at level $x$. Similarly, the $Y$-section of $f$ at level $y$ is defined by
$$f^y(x) = f(x,y), \quad y \in Y.$$
We begin by showing that sections of measurable functions are measurable; the measurability of functions defined on $X \times Y$ is always understood to be with respect to the $\sigma$-algebra $\mathcal{M} \times \mathcal{N}$.
Proposition 3.2. The sections of measurable functions are measurable. More precisely, if $f$ is a measurable function defined on $X \times Y$, $f_x$ is a measurable function on $(Y,\mathcal{N})$ for every $x \in X$, and $f^y$ is a measurable function on $(X,\mathcal{M})$ for every $y \in Y$.
Proof. Let $f$ be measurable, and given an open set $O$ note that for each $x \in X$ we have
$$f_x^{-1}(O) = \{y \in Y : f_x(y) \in O\} = \{y \in Y : f(x,y) \in O\} = \{y \in Y : (x,y) \in f^{-1}(O)\} = (f^{-1}(O))_x.$$
Now, since $f^{-1}(O) \in \mathcal{M} \times \mathcal{N}$, the measurability of the set on the right-hand side above follows from Proposition 3.1, and that of $f_x$ from Proposition 1.3 in Chapter VI. The proof for $f^y$ is the same. ∎
We are now ready to prove the basic result needed to introduce the product measure.
Theorem 3.3. Let $(X,\mathcal{M},\mu)$, $(Y,\mathcal{N},\nu)$ be $\sigma$-finite measure spaces, and suppose $E \in \mathcal{M} \times \mathcal{N}$. Then for each $x \in X$, $\nu(E_x)$ is a measurable function on $(X,\mathcal{M})$, and for each $y \in Y$, $\mu(E^y)$ is a measurable function on $(Y,\mathcal{N})$. Furthermore
$$\int_X \nu(E_x)\,d\mu = \int_Y \mu(E^y)\,d\nu. \tag{3.3}$$
Proof. The measurability of $E_x$ and $E^y$ has been established in Proposition 3.1, so we begin by computing the integrands of the integrals in (3.3). Now, as noted above, if $E = \bigcup_n (A_n \times B_n)$ is an elementary set, then $\chi_{E_x}(y) = \sum_n \chi_{A_n}(x)\chi_{B_n}(y)$, and by the additivity of $\nu$ it readily follows that
$$\nu(E_x) = \sum_n \chi_{A_n}(x)\,\nu(B_n).$$
Clearly $\nu(E_x)$ is a measurable function on $(X,\mathcal{M})$, and
$$\int_X \nu(E_x)\,d\mu = \sum_n \mu(A_n)\,\nu(B_n). \tag{3.4}$$
In a similar fashion it follows that $\mu(E^y)$ is a measurable function on $(Y,\mathcal{N})$ and that its integral over $Y$ with respect to $\nu$ is equal to the right-hand side of (3.4). Therefore the assertion of the theorem is true for all elementary sets in $\mathcal{M} \times \mathcal{N}$; we now show that the collection $\mathcal{F}$ of sets in $\mathcal{M} \times \mathcal{N}$ for which (3.3) is true is a monotone class which, on account of 4.19 in Chapter IV, coincides with $\mathcal{M} \times \mathcal{N}$.
Let $\{E_n\} \subseteq \mathcal{F}$ be a nondecreasing sequence, and write $E = \bigcup_n E_n$; we must show that $E \in \mathcal{F}$ as well. First note that $\{(E_n)_x\} \subseteq \mathcal{N}$ is a nondecreasing sequence that converges to $E_x \in \mathcal{N}$, and consequently by (3.3) in Chapter IV,
$$\lim_{n\to\infty} \nu((E_n)_x) = \nu(E_x), \quad \text{all } x \in X.$$
Whence $\nu(E_x)$ is a limit of measurable functions on $(X,\mathcal{M})$, and is therefore measurable. Further, by MCT it readily follows that
$$\lim_{n\to\infty} \int_X \nu((E_n)_x)\,d\mu = \int_X \nu(E_x)\,d\mu. \tag{3.5}$$
Similarly $\mu(E^y)$ is measurable on $(Y,\mathcal{N})$, and
$$\lim_{n\to\infty} \int_Y \mu((E_n)^y)\,d\nu = \int_Y \mu(E^y)\,d\nu. \tag{3.6}$$
Now, since for each $n$ the integrals that appear on the left-hand side of (3.5) and (3.6) are equal, so are their limits. In other words, the integrals on the right-hand side of (3.5) and (3.6) are equal, and (3.3) holds in this case.
Suppose next that $\{E_n\} \subseteq \mathcal{F}$ is a nonincreasing sequence of sets, and put $E = \bigcap_n E_n$; we must show that $E \in \mathcal{F}$. The preceding argument, invoking now (3.4) in Chapter IV instead, certainly goes through if $X$ and $Y$ have finite measure. To see that the same is true in the $\sigma$-finite case, let $\{X_k\}$, $\{Y_k\}$ be sequences of sets of finite measure such that $X = \bigcup X_k$ and $Y = \bigcup Y_k$. Then, since for each $k$ the nonincreasing sequence $\{E_n \cap (X_k \times Y_k)\}$ converges to $E \cap (X_k \times Y_k)$ and $\mu(X_k), \nu(Y_k) < \infty$, (3.3) is true for $E \cap (X_k \times Y_k)$, $k = 1,2,\ldots$ But since the nondecreasing sequence $\{E \cap (X_k \times Y_k)\}$ converges to $E$, the conclusion of the theorem also holds for $E$. We have thus shown that $\mathcal{F}$ is a monotone class, and the proof is complete. ∎
The following example shows that the $\sigma$-finiteness of the measures was necessary. Let $I = [0,1]$ and consider the measure spaces $(I,\mathcal{L},|\cdot|)$ and $(I,\mathcal{P}(I),\nu)$, where $\nu$ denotes the counting measure on $(I,\mathcal{P}(I))$. Further, let $E = \{(x,y) \in I \times I : x = y\}$ be the "diagonal set" in $I \times I$; it is not difficult to show that $E \in \mathcal{L} \times \mathcal{P}(I)$. Now, since for each $x, y$ in $I$ the sets $E_x$ and $E^y$ consist of a single point, we have $\nu(E_x) = 1$ and $|E^y| = 0$, and consequently
$$\int_I \nu(E_x)\,dx = 1, \quad \text{and} \quad \int_I |E^y|\,d\nu = 0.$$
If $(X,\mathcal{M},\mu)$ and $(Y,\mathcal{N},\nu)$ are as in Theorem 3.3, we define the set function $\mu \times \nu$ on $(X \times Y, \mathcal{M} \times \mathcal{N})$ by
$$(\mu \times \nu)(E) = \int_X \nu(E_x)\,d\mu = \int_Y \mu(E^y)\,d\nu, \quad E \in \mathcal{M} \times \mathcal{N}. \tag{3.7}$$
The equality of the integrals in (3.7) is assured by Theorem 3.3. We call $\mu \times \nu$ the "product" of the measures $\mu$ and $\nu$, and it follows without much difficulty from MCT that $\mu \times \nu$ indeed is a measure. Observe that also $\mu \times \nu$ is $\sigma$-finite. Now, since $\nu(E_x) = \int_Y \chi_E(x,\cdot)\,d\nu$ and $\mu(E^y) = \int_X \chi_E(\cdot,y)\,d\mu$, (3.7) actually states that a Tonelli-like identity is true for the characteristic functions of measurable sets. In fact, the general statement holds as well, to wit,
Theorem 3.4. Let $(X,\mathcal{M},\mu)$, $(Y,\mathcal{N},\nu)$ be $\sigma$-finite measure spaces, and let $f$ be a nonnegative extended real-valued measurable function defined on $(X \times Y, \mathcal{M} \times \mathcal{N})$. Then $\int_Y f_x(y)\,d\nu$ is a measurable function on $(X,\mathcal{M})$, $\int_X f^y(x)\,d\mu$ is a measurable function on $(Y,\mathcal{N})$, and
$$\int_{X \times Y} f\,d(\mu \times \nu) = \int_X \int_Y f_x(y)\,d\nu\,d\mu = \int_Y \int_X f^y(x)\,d\mu\,d\nu. \tag{3.8}$$
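On finite sets every function is measurable and all integrals are finite weighted sums, so (3.7) and (3.8) can be verified by direct computation. The Python sketch below is an illustration under that assumption only; the point masses `mu` and `nu`, the function `f`, and the set `E` are hypothetical data chosen here, and exact rational arithmetic is used so that the equalities hold without rounding.

```python
from fractions import Fraction

# Point masses defining finite measures mu on X = {0, 1, 2} and nu on Y = {"a", "b"}.
mu = {0: Fraction(1, 2), 1: Fraction(3), 2: Fraction(1, 4)}
nu = {"a": Fraction(2), "b": Fraction(1, 3)}


def f(x, y):
    """A nonnegative function on X x Y."""
    return Fraction(x + 1) if y == "a" else Fraction(x * x)


def product_measure(E):
    """(mu x nu)(E) as in (3.7): integrate nu(E_x) with respect to mu."""
    return sum(mu[x] * sum(nu[y] for y in nu if (x, y) in E) for x in mu)


def product_measure_other_order(E):
    """The second expression in (3.7): integrate mu(E^y) with respect to nu."""
    return sum(nu[y] * sum(mu[x] for x in mu if (x, y) in E) for y in nu)


E = {(0, "a"), (1, "b"), (2, "a"), (2, "b")}

# The three expressions in (3.8).
double_integral = sum(f(x, y) * mu[x] * nu[y] for x in mu for y in nu)
iterated_dy_dx = sum(mu[x] * sum(f(x, y) * nu[y] for y in nu) for x in mu)
iterated_dx_dy = sum(nu[y] * sum(f(x, y) * mu[x] for x in mu) for y in nu)

print(product_measure(E), product_measure_other_order(E))
print(double_integral, iterated_dy_dx, iterated_dx_dy)
```

In this finite setting the agreement is just a rearrangement of a finite double sum; the content of Theorems 3.3 and 3.4 is that the same rearrangement remains valid for $\sigma$-finite measure spaces and nonnegative measurable functions.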
Proof. By (3.7) the theorem is true for characteristic functions of measurable sets, and hence (3.8) holds for all nonnegative simple functions. By Theorem 1.12 in Chapter VI we know that $f$ is the limit of a nondecreasing sequence of simple functions, and consequently (3.8) follows by MCT as in Theorem 1.8. ∎
Corollary 3.5.
Under the assumptions of Theorem 3.4, if
Ix [ Iflx(Y) dv dp,
' dy,
x E R,
is integrable over F, and so finite a.e. there, and 00 if x rI. F. M>. is called the Marcinkiewicz function corresponding to F, and it is an indispensable tool in the theory of Fourier series. The particular case I = Xl! where I is a bounded interval of the line, is of interest. 4.10 If I is a nonnegative extended real-valued Borel measurable function on Rn+m, show that iRn I(x, y) dx and iR"" I(x, y) dy are Borel measurable, and
$$\int_{R^{n+m}} f = \int_{R^m} \int_{R^n} f(x,y)\,dx\,dy = \int_{R^n} \int_{R^m} f(x,y)\,dy\,dx.$$
4.11 Let $E$ be a domain in the plane bounded by the continuous curves $y = \phi(x)$, $y = \psi(x)$ for $x \in I = [a,b]$, where $\phi(x) < \psi(x)$. Prove that if $f$ is a Borel measurable, integrable function defined on $E$, then
$$\int_E f = \int_I \int_{[\phi(x),\psi(x)]} f(x,y)\,dy\,dx.$$
4.12 Let $I = I(0,1)$ denote the unit interval in $R^n$, $0 < \eta < n$, and suppose $b(x,y)$ is an essentially bounded function defined on $I \times I$. Show that if $f \in L(I)$, the function
$$F(x) = \int_I \frac{b(x,y)\,f(y)}{|x - y|^{\eta}}\,dy, \quad x \in I,$$
is finite a.e. on I. In fact, F E L(I). What is an estimate of in terms of 1I/11t ? 4.13 Prove that LP(Rn ) * Lq(Rn) l/p + l/q = 1.
~
Loo(Rn) n C(Rn), 1
E C([r, RD. Then, if F(p) = J{r~lxl~p} I(x )¢>(Ixl) dx, r ~ p ~ R, show that
f
l(x)¢>(lxl)dx =
J{r~lxl~R}
lR
¢>(p)dF(p) ,
r
the integral on the right-hand side above being a Riemann-Stieltjes integral.
4.18 Suppose $\phi \in L^1(R^n) \cap L^\infty(R^n)$ has the property that $\int_{R^n} \phi = 1$ and that for some $\eta > 0$, $|\phi(x)|\,|x|^{n+\eta} \le c$ for all $|x|$ large. Now, if $f \in L^p(R^n)$, $1 \le p < \infty$, show that at each point $x$ in the Lebesgue set of $f$, we have $\lim_{\varepsilon\to 0} f * \phi_\varepsilon(x) = f(x)$.
4.19 A sequence $\{\phi_k\} \subseteq L(R^n)$ is called an "approximate unit" if: (a) $\phi_k \ge 0$, for all $k$, (b) $\|\phi_k\|_1 = 1$, for all $k$, and (c) For each neighbourhood $G$ of 0 we have $\lim_{k\to\infty} \int_{R^n \setminus G} \phi_k = 0$. If $\{\phi_k\}$ is an approximate unit, and if $1 \le p < \infty$, prove that $\lim_{k\to\infty} \|f * \phi_k - f\|_p = 0$ for all $f \in L^p(R^n)$.
4.20 Show that $E = R^2 \setminus \{(x,y) \in R^2 : x - y \text{ is rational}\}$ contains no measurable rectangle of positive Lebesgue measure.
4.21 Prove that the operation of convolution is associative in $L(R^n)$. Specifically, if $f$, $g$, and $h$ are integrable, show that $f * (g * h)(x) = (f * g) * h(x)$ a.e.
4.22 Suppose $f$ and $g$ are nonnegative integrable functions defined on $R^n$ so that both $f$ and $g$ are strictly positive on some set of positive measure (not necessarily the same for $f$ and $g$). Prove that $f * g > 0$ on a set of positive measure.
4.23 (Minkowski's Integral Inequality). Under all appropriate measurability conditions on $f$, show that if $1 \le p < \infty$ we have
$$\Big(\int_Y \Big(\int_X |f(x,y)|\,d\mu\Big)^p d\nu\Big)^{1/p} \le \int_X \Big(\int_Y |f(x,y)|^p\,d\nu\Big)^{1/p} d\mu.$$
If we write this inequality in the form
$$\Big\| \int_X |f(x,\cdot)|\,d\mu \Big\|_p \le \int_X \|f(x,\cdot)\|_p\,d\mu,$$
then it is also true for $p = \infty$.
4. Problems and Questions
4.24 Concerning the example following Theorem 3.3, show that the integral $\int_{I \times I} \chi_E\,d(\mu \times \nu)$, where $E$ is the "diagonal" set given there, is different from either of the iterated integrals.
In what follows $(X,\mathcal{M},\mu)$ and $(Y,\mathcal{N},\nu)$ are measure spaces.
4.25 If $\lambda$ is a measure on $\mathcal{M} \times \mathcal{N}$ such that $\lambda(A \times B) = \mu(A)\nu(B)$ for all measurable rectangles $A \times B$, show that $\lambda = \mu \times \nu$.
4.26 If the measure spaces involved are complete and $\sigma$-finite, and if $\mu \times \nu(E) = 0$, show that for every $F \subseteq E$ we have
$$\mu(F^y) = 0 \ \ \nu\text{-a.e.}, \quad \text{and} \quad \nu(F_x) = 0 \ \ \mu\text{-a.e.}$$
4.27 The measure space $(X \times Y, \mathcal{M} \times \mathcal{N}, \mu \times \nu)$ is seldom complete, even when the measure spaces involved are both complete. Prove that this is the case if there exists a set $A \subset X$ such that $A \notin \mathcal{M}$, and a nonempty set $B \in \mathcal{N}$ such that $\nu(B) = 0$. In particular, if $\mu$ denotes the Lebesgue measure on the line, $(R^2, \mathcal{L} \times \mathcal{L}, \mu \times \mu)$ is an incomplete measure space.
4.28 An alternative statement of the Fubini-Tonelli theorem is the following: Suppose $(X,\mathcal{M},\mu)$ and $(Y,\mathcal{N},\nu)$ are complete $\sigma$-finite measure spaces, and let $(X \times Y, \mathcal{F}, \lambda)$ denote the completion of $(X \times Y, \mathcal{M} \times \mathcal{N}, \mu \times \nu)$, cf. Theorem 3.3 in Chapter V. If $f$ is measurable (with respect to $\mathcal{F}$) and either (a) $f \ge 0$ or (b) $f \in L^1(\lambda)$, then $f_x$ is $\mathcal{N}$-measurable $\mu$-a.e., $f^y$ is $\mathcal{M}$-measurable $\nu$-a.e., and, in case (b) holds, also $f_x \in L^1(\nu)$ and $f^y \in L^1(\mu)$, in the a.e. sense. Moreover, the functions $\int_Y f_x\,d\nu$ and $\int_X f^y\,d\mu$ are measurable and
$$\int_{X \times Y} f\,d\lambda = \int_X \int_Y f_x\,d\nu\,d\mu = \int_Y \int_X f^y\,d\mu\,d\nu.$$
Prove it.
4.29 The requirement that $f$ be measurable cannot be dispensed with for the validity of Fubini's theorem. To see this let $X = Y$ be well-ordered sets with ordinal $\Omega$, $\mathcal{M} = \mathcal{N}$ be the $\sigma$-algebra consisting of those sets which are either at most countable or so that their complement is at most countable, and let $\mu = \nu$ be the measure defined for $A \in \mathcal{M}$ by $\mu(A) = 0$ if $A$ is at most countable and $\mu(A) = 1$ otherwise. Show that if $E = \{(x,y) \in X \times Y: x \prec y\}$, then $E_x$ and $E^y$ are measurable for all $x$, $y$, and that both iterated integrals of $\chi_E$ exist and are unequal. Hence $E \notin \mathcal{M} \times \mathcal{N}$, and Fubini's theorem does not hold in this case.
XIII. Fubini's Theorem
4.30 If one accepts the Continuum Hypothesis the construction in 4.29 leads to the following situation: There is a subset $E$ of $X = [0,1] \times [0,1]$ such that $E_x$ is at most countable for all $x \in [0,1]$, $[0,1] \setminus E^y$ is at most countable for all $y \in [0,1]$, but $E$ is not Lebesgue measurable.
4.31 The following result describes the behaviour of absolute continuity and singularity with respect to product measures. Let $\mu$ and $\mu^*$ be $\sigma$-finite measures on $(X,\mathcal{M})$, and $\nu$ and $\nu^*$ $\sigma$-finite measures on $(Y,\mathcal{N})$. Prove that if $\mu \ll \mu^*$ and $\nu \ll \nu^*$, we have $\mu \times \nu \ll \mu^* \times \nu^*$ and
$$\frac{d(\mu \times \nu)}{d(\mu^* \times \nu^*)}(x,y) = \frac{d\mu}{d\mu^*}(x)\,\frac{d\nu}{d\nu^*}(y), \quad \text{all } (x,y) \in X \times Y.$$
Also, if $\mu \perp \mu^*$ or $\nu \perp \nu^*$, then $\mu \times \nu \perp \mu^* \times \nu^*$.
4.32 In the notation of 4.31, prove that if $\mu = \mu_a + \mu_s$ is the Lebesgue decomposition of $\mu$ with respect to $\mu^*$, and similarly $\nu = \nu_a + \nu_s$ that of $\nu$ with respect to $\nu^*$, then the Lebesgue decomposition of $\mu \times \nu$ is given by $(\mu \times \nu)_a = \mu_a \times \nu_a$ and
$$(\mu \times \nu)_s = \mu_a \times \nu_s + \mu_s \times \nu_a + \mu_s \times \nu_s.$$
4.33 Let $\mu_1$ be a finite Borel measure on $R^{n_1}$, and $\mu_2$ a finite Borel measure on $R^{n_2}$. If $\mu_1 \times \mu_2$ is absolutely continuous with respect to the Lebesgue measure $\lambda$ on $R^{n_1+n_2}$, does it necessarily follow that $d(\mu_1 \times \mu_2)/d\lambda = f \cdot g$, where $f$ is a Lebesgue measurable function defined on $R^{n_1}$, and $g$ is a Lebesgue measurable function defined on $R^{n_2}$?
CHAPTER XIV
Normed Spaces and Functionals
In this chapter we study the basic properties of linear spaces, and in particular of those spaces which are normed, and of those which are complete in the metric induced by the norm, or Banach spaces. The existence of continuous linear functionals on these spaces is established by the Hahn-Banach Theorem.
1. NORMED SPACES
The time has come to set up a general framework to address some of the important questions we have posed, including the existence of bounded linear functionals on various linear spaces. We begin by introducing the necessary definitions. Suppose $X$ is a vector space over the field of real, or complex, scalars; since the theory in both cases follows along similar lines we consider them simultaneously. A scalar valued function defined on $X$ is called a functional. We are first interested in a particular kind of functional, namely a seminorm. A nonnegative functional $p$ defined on $X$ is called a seminorm provided the following two properties are satisfied:
(i) (Triangle Inequality) $p(x + y) \le p(x) + p(y)$, $x, y \in X$.
(ii) (Absolute Homogeneity) $p(\lambda x) = |\lambda|\,p(x)$, $\lambda$ scalar, $x \in X$.
Of course, in (ii) above, $|\lambda|$ denotes the absolute value of $\lambda$ when $X$ is a vector space over the reals and the modulus of $\lambda$ when the scalar field is the complex numbers. It follows from (ii) that $p(0) = 0$. We say that the seminorm $p$ is a norm provided that
(iii) (Uniqueness) $p(x) = 0$ implies $x = 0$.
Norms are often denoted by $\|\cdot\|$, or variants thereof. To emphasize that $X$ is endowed with a norm we call $X$ a normed linear space. We have already encountered many instances of normed linear spaces. The finite-dimensional spaces $R^n$ and $C^n$ may, of course, be normed in different ways. For instance, if $z = (z_1, \dots, z_n) \in C^n$, then the expressions ... $1 \le p$ ...
...
$$p_C(x) = \inf\{1/\lambda: \lambda > 0 \text{ and } \lambda x \in C\}. \tag{2.3}$$
We claim that $p_C$ satisfies (i)-(iv) above. First note that since by the continuity of the norm $\|\lambda x\| \to 0$ as $\lambda \to 0$, and since $0$ is an interior point of $C$, $\lambda x \in C$ for sufficiently small $\lambda$. Thus the inf in (2.3) is finite and (i) holds. As for (ii), since $\lambda 0 \in C$ for every $\lambda$, it follows that $p_C(0) = 0$, and consequently, we may assume that $\eta \ne 0$ and $x \ne 0$. Now,
$$p_C(\eta x) = \inf\{1/\lambda: \lambda > 0, \lambda\eta x \in C\} = \inf\{\eta/\lambda: \lambda > 0, \lambda x \in C\} = \eta \inf\{1/\lambda: \lambda > 0, \lambda x \in C\} = \eta\,p_C(x),$$
which is precisely (ii). To prove (iii), given $\varepsilon > 0$, let $\mu, \nu > 0$ be such that $\mu x \in C$, $\nu y \in C$,
$$1/\mu \le p_C(x) + \varepsilon/2, \tag{2.4}$$
$$1/\nu \le p_C(y) + \varepsilon/2, \tag{2.5}$$
and put
$$1/\lambda = 1/\mu + 1/\nu. \tag{2.6}$$
Now, since $0 < \eta = \lambda/\mu < 1$ and $C$ is convex, it readily follows that $\lambda(x + y) = \eta(\mu x) + (1 - \eta)(\nu y) \in C$, and by (2.3), (2.6), (2.4) and (2.5) we have $p_C(x + y) \le 1/\lambda \le p_C(x) + p_C(y) + \varepsilon$. But $\varepsilon > 0$ is arbitrary, and consequently, (iii) holds. Finally, if $x \in X \setminus C$, since $C$ is convex we cannot have $\lambda x \in C$ for some $\lambda \ge 1$. Whence, if $\lambda x \in C$, it follows that $\lambda < 1$ and, as asserted, $p_C(x) \ge 1$, thus proving (iv).
By means of the Minkowski functional we may answer the question posed above concerning convex sets of the plane. The idea, after some simple arguments, is to consider the linear functional $L$ defined on the one-dimensional subspace of $X$ consisting of all elements of the form $\lambda x_0$, $x_0 \notin C$, by the formula $L(\lambda x_0) = \lambda\,p_{C_1}(x_0)$, where $C_1$ is a convex set related to $C$, and to observe that in this case $L(\lambda x_0) \le p_{C_1}(\lambda x_0)$ for all real $\lambda$. This expression exhibits the domination alluded to above, and the question is whether $L$ can be extended to the plane satisfying the same inequality. We make these remarks precise with the aid of the following result.
Theorem 2.1 (Hahn-Banach Theorem). Suppose $X$ is a real linear space and $p$ is a functional on $X$ which satisfies the triangle inequality, and so that $p(\lambda x) = \lambda p(x)$ for all $x \in X$ and $\lambda > 0$. Further, let $X_0$ be a linear subspace of $X$ and $L_0$ a linear functional on $X_0$ such that
$$L_0 x \le p(x), \quad \text{all } x \in X_0. \tag{2.7}$$
Then there is a linear functional $L$ defined on $X$ that extends $L_0$, i.e., $Lx = L_0 x$ for $x \in X_0$, and so that
$$Lx \le p(x), \quad \text{all } x \in X. \tag{2.8}$$
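As a small illustration of the statement of Theorem 2.1, consider the following worked case. The space, the functional $p$, the subspace and $L_0$ are chosen here for concreteness and are assumptions of the example, not taken from the text.

```latex
% Illustrative setting (an assumption of this example, not from the text):
% X = R^2, p(x) = |x_1| + |x_2|, X_0 = {(t,0) : t real}, L_0(t,0) = t.
\begin{align*}
  &L_0(t,0) = t \le |t| = p(t,0),
     && \text{so (2.7) holds on } X_0,\\
  &L(x_1,x_2) = x_1 + c\,x_2
     && \text{is the general linear extension of } L_0,\\
  &L(0,\pm 1) = \pm c \le p(0,\pm 1) = 1
     && \text{forces } |c| \le 1,\\
  &L(x_1,x_2) \le |x_1| + |c|\,|x_2| \le p(x_1,x_2)
     && \text{whenever } |c| \le 1.
\end{align*}
```

In this setting (2.8) holds exactly when $|c| \le 1$, so an extension exists but is far from unique.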
Proof. The idea of the proof is to invoke Zorn's Lemma to construct a maximal extension of $L_0$, and then to show that this extension is defined on all of $X$ and satisfies (2.8). Let $\mathcal{X}$ be the collection of all pairs of the form $(Y, L)$ where
(i) $Y$ is a linear subspace of $X$ and $X_0 \subseteq Y \subseteq X$.
(ii) $L$ is a linear functional on $Y$, $L|X_0 = L_0$, and $Lx \le p(x)$ for all $x \in Y$.
Note that $(X_0, L_0) \in \mathcal{X}$. On $\mathcal{X}$ we introduce a partial ordering as follows: We say that $(Y, L)$ precedes $(Y', L')$, and we write $(Y, L) \prec (Y', L')$, if $Y \subseteq Y'$ and $L'|Y = L$. In order to apply Zorn's Lemma we must first check that any linearly ordered family $\{(Y_s, L_s)\}$, say, of elements in $\mathcal{X}$ has an upper bound. But this is not hard: Indeed, put $Y = \bigcup_s Y_s$, and consider the functional $L$ on $Y$ so that $L|Y_s = L_s$. Since the family is linearly ordered, $L$ is well defined, and it readily follows that $(Y, L)$ is an upper bound, and we are in a position to invoke the conclusion of Zorn's Lemma, to wit, $\mathcal{X}$ has a maximal element $(X_1, L_1)$, say. There are two possibilities: Either $X_1 = X$, and in this case we are done, or else $X_1$ is a proper linear subspace of $X$. Next we show that the latter possibility does not occur, for otherwise we would reach a contradiction. Indeed, if the latter possibility occurs, let $x_0 \in X \setminus X_1$ and consider the linear subspace $X_2$ of $X$ spanned by $X_1$ and $\{x_0\}$, i.e., $X_2$ consists of all linear combinations of the form $x_1 + \lambda x_0$, where $x_1 \in X_1$ and $\lambda$ is a real number. We claim that $L_1$ may be extended to a linear functional $L_2$ on $X_2$ which satisfies $L_2 x \le p(x)$ for all $x \in X_2$, thus contradicting the maximality of $(X_1, L_1)$. Denote by $L_2$ a candidate for such an extension of $L_1$ to $X_2$, and observe that if $L_2 x_0 = \eta$, an arbitrary real scalar, we have
If we can produce a scalar $\eta$ so that for all $x_1$ in $X_1$ and scalars $\lambda$ the inequality (2.9) is true, it then follows that $(X_2, L_2) \in \mathcal{X}$, and that $(X_1, L_1)$ strictly precedes it, thus contradicting its assumed maximality. Observe that if (2.9) holds, we also have
$$\eta \le \big(p(x_1 + \lambda x_0) - p(x_1)\big)/\lambda, \quad \lambda > 0, \tag{2.10}$$
$$\eta \ge \big(p(x_1 + \lambda x_0) - p(x_1)\big)/\lambda, \quad \lambda < 0. \tag{2.11}$$
The Hahn-Banach Theorem
275
By setting A = -1',1' > 0, in (2.11), (2.10) and (2.11) may be combined into the single expression
p(XI) - p(XI - JLXO) I'
< < P(XI + AXO) - p(XI) A
-T/-
'
(2.12)
which should now hold for all A, I' > o. Thus, the existence of T/ is equivalent to the validity of the inequality
P(XI) - P(XI - JLXo) I'
::;
P(XI - AXo) - p(xt} A '
alIA,JL>O,
(2.13)
for then T/ may be chosen to be any real number lying between the sup of the left-hand side of (2.13) and the inf of the right-hand side there. To show that (2.13) holds is not hard. First observe that it is equivalent to
( ) P Xl
JLP(XI
::;
+ AXo) + Ap(XI (A+JL)
JLXo)
(2.14)
.
Next note that since
and since P is subadditive and positively homogeneous, it follows that I'
P(XI) ::; (A + 1') P(XI
A
+ AXo) + (A + 1') P(XI -
JLXo) ,
which is precisely (2.14). Thus, reversing the steps, also (2.13) holds and L can be extended to a subspace of X containing Y and satisfying (2.8), thus contradicting the maximality of (Y, L). Whence Y is actually X, and the proof is complete. • Next we consider the Hahn-Banach Theorem for complex linear spaces, the proof presented here is due to Bohnenblust and Sobczyk. We begin by exploring the relationship between real and complex functionals.
Lemma 2.2. Suppose $X$ is a complex linear space, and let $L$ be a (complex) linear functional defined on $X$. Then $L_1 x = \Re(Lx)$ is a real linear functional defined on $X$ and
$$Lx = L_1 x - iL_1(ix). \tag{2.15}$$
Conversely, if $L_1$ is a real linear functional defined on $X$, then the functional $L$ defined by (2.15) is a complex linear functional on $X$.
Proof. That $L_1$ is a real linear functional on $X$ if $L$ is a complex linear functional on $X$ is a simple verification left to the reader. Now, since for any complex number $z$ we have that $\Re(iz) = -\Im(z)$, it readily follows that
$$Lx = \Re(Lx) + i\Im(Lx) = L_1 x + i(-\Re(iLx)) = L_1 x - i\Re(L(ix)) = L_1 x - iL_1(ix),$$
and (2.15) follows. Finally, if $L_1$ is a real linear functional defined on $X$ and $L$ is given by (2.15), in order to show that $L$ is a complex linear functional on $X$ it suffices to check that $L(ix) = iLx$ for all $x \in X$. But
$$L(ix) = L_1(ix) - iL_1(i(ix)) = L_1(ix) - iL_1(-x) = i(L_1 x - iL_1(ix)) = iLx. \ ∎$$
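Formula (2.15) can be checked by hand in the simplest case. In the sketch below the choice $X = C$, with $L$ given by multiplication by a fixed complex number $c = a + ib$, is an assumption made for illustration and does not appear in the text.

```latex
% Illustrative check on X = C (chosen here for concreteness):
% L z = c z with c = a + ib, and z = x + iy with a, b, x, y real.
\begin{align*}
  L_1 z &= \Re(cz) = ax - by,\\
  L_1(iz) &= \Re\big(c(-y + ix)\big) = -ay - bx,\\
  L_1 z - iL_1(iz) &= (ax - by) + i(ay + bx) = (a + ib)(x + iy) = Lz.
\end{align*}
```

The real part alone therefore recovers the whole complex functional, which is the point of the lemma.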
We are now ready to present the complex version of the Hahn-Banach Theorem.
Theorem 2.3. Let $X$ be a complex linear space, $p$ a seminorm on $X$, $X_0$ a linear subspace of $X$ and $L_0$ a complex linear functional defined on $X_0$ such that
$$|L_0 x| \le p(x), \quad \text{all } x \in X_0. \tag{2.16}$$
Then there is a linear functional $L$ defined on $X$ which extends $L_0$, i.e., $Lx = L_0 x$ for $x \in X_0$, and so that
$$|Lx| \le p(x), \quad \text{all } x \in X. \tag{2.17}$$
Proof. Let $L_1 = \Re L_0$; by Lemma 2.2, $L_1$ is a real linear functional defined on $X_0$, and by (2.16) we have $L_1 x \le |L_0 x| \le p(x)$ for $x \in X_0$. We are now in a position to invoke Theorem 2.1 and extend $L_1$ to a real linear functional $L_2$ defined on $X$ with the property that $L_2 x = L_1 x$, $x \in X_0$, and
$$L_2 x \le p(x), \quad \text{all } x \in X. \tag{2.18}$$
Since $p$ is a seminorm, replacing $x$ by $-x$ if necessary in (2.18), we note that we have $|L_2 x| \le p(x)$ as well. Inspired by Lemma 2.2, let
$$Lx = L_2 x - iL_2(ix).$$
$L$ is a complex linear functional defined on $X$, and since $L_2$ extends $L_1$, the restriction of $L$ to $X_0$ coincides with $L_0$. It only remains to check (2.17): Since for each $x \in X$ so that $Lx \ne 0$ we have $|Lx| = \lambda Lx = L(\lambda x)$, where $\lambda = |Lx|/Lx$ is a complex number of modulus 1, it follows that $L(\lambda x)$ is real, and consequently we also have
$$|Lx| = L(\lambda x) = L_2(\lambda x) \le p(\lambda x) = p(x),$$
and (2.17) holds. ∎
We focus our discussion next on normed linear spaces; we begin with some definitions. A functional $L$ defined on a normed linear space $X$ is said to be continuous if
$$\|x_n - x\| \to 0 \quad \text{implies} \quad |Lx_n - Lx| \to 0.$$
For linear functionals $L$, which are the functionals of interest to us, the notion of continuity is equivalent to that of continuity at a single point of $X$. For, suppose that $L$ is continuous at a point $x_0$ of $X$ and let $x_n \to x \in X$. Then we have $x_n - x + x_0 \to x_0$, and consequently, $|L(x_n - x + x_0) - Lx_0| \to 0$. But, since $L$ is linear, it is obvious that $L(x_n - x + x_0) - Lx_0 = Lx_n - Lx$, and our assertion follows. Also, for linear functionals, the concepts of boundedness and continuity are interchangeable.
Proposition 2.4. Suppose $L$ is a linear functional on a normed linear space $X$. Then $L$ is bounded iff $L$ is continuous.
Proof. Suppose first that $L$ is bounded; since $L$ is also linear we have, with $c$ denoting a bound for $L$,
$$|Lx_n - Lx| = |L(x_n - x)| \le c\,\|x_n - x\|, \tag{2.19}$$
and the right-hand side, and consequently also the left-hand side, of (2.19) goes to 0 with $\|x_n - x\|$. Whence, $L$ is continuous. Conversely, suppose that $L$ is a continuous linear functional on $X$ which is not bounded. Then, by (2.2), for each positive integer $n$ there is $y_n \ne 0$ in $X$, so that $|Ly_n| > n\|y_n\|$. Put $x_n = (1/n\|y_n\|)\,y_n$, and observe that the sequence $\{x_n\} \subseteq X$ satisfies
$$\|x_n\| \to 0 \quad \text{and yet} \quad |Lx_n| > 1 \text{ for all } n.$$
But this is not possible if $L$ is continuous. ∎
Although not intuitively apparent, there are linear functionals that are not bounded. To see this consider an infinite dimensional linear space $X$ and, referring to Section 3 in Chapter II, let $H$ be a Hamel basis for $X$ over the ambient scalar field. It is a straightforward application of Zorn's Lemma to prove that any linearly independent subset of $X$ is contained in a Hamel basis for $X$. In particular, any linear space has a Hamel basis. Now, for each $x$ in $X$ we can find unique elements $h_1, \dots, h_n$ in $H$ and scalars $\lambda_1, \dots, \lambda_n$, say, such that $x = \sum_{i=1}^n \lambda_i h_i$. Define $\|x\|_\infty$ to be the maximum of the numbers $|\lambda_i|$, $i = 1, \dots, n$. Clearly $\|x\|_\infty$ is a norm in $X$, and consequently, any linear space over the real or complex scalar field can be given a norm.
There is an interesting case for which a Hamel basis can be exhibited. Let $I = [0,1]$ and let $P(I) \subset C(I)$ denote the class of all polynomial functions on $I$. Then $H = \{1, x, x^2, \dots\}$ is a Hamel basis for $P(I)$. Let now $H_1$ be a Hamel basis for $C(I)$ that contains $H$ and choose any element $h_1 \in H_1 \setminus H$. Put $Lh_1 = 1$ and $Lh = 0$ for all $h$ in $H_1$, $h \ne h_1$, and extend $L$ to all of $C(I)$ by requiring that it be linear. It is clear that $L$ cannot be continuous with respect to the uniform norm on $C(I)$. Indeed, if this were the case, then by 4.15 below the set $\{f \in C(I): Lf = 0\}$ would be a closed subspace of $C(I)$. But this set contains $P(I)$, which, by the Weierstrass theorem, cf. Corollary 2.3 in Chapter XVII, is dense in $C(I)$. Hence if $L$ were continuous, it would have to be identically 0, contrary to the fact that $Lh_1 = 1$.
We are now ready to introduce the conjugate, or dual, space $X^*$ of a normed linear space $X$, i.e., the space consisting of all continuous linear functionals defined on $X$. More precisely, given a normed linear space $X$, let
$$X^* = \{L: L \text{ is a continuous linear functional on } X\}.$$
It is readily seen that $X^*$ is itself a linear space over the scalar field of $X$; $L_1 + \lambda L_2$ is defined as the continuous linear functional on $X$ given by
$$(L_1 + \lambda L_2)(x) = L_1 x + \lambda L_2 x, \quad \text{all } x \in X.$$
We also have
Proposition 2.5. Suppose $X$ is a normed space. Then $X^*$, normed by
$$\|L\| = \sup_{x \ne 0} \frac{|Lx|}{\|x\|}, \tag{2.20}$$
is a Banach space.
Proof. It is clear that the expression in (2.20) is a seminorm on $X^*$. Now, if $\|L\| = 0$, it follows that $|Lx| = 0$ for each $x \in X$, $L$ is the 0 functional and consequently, $\|L\|$ is a norm on $X^*$.
To show that normed by (2.20) $X^*$ becomes a Banach space, by Theorem 1.1 it suffices to prove that if $\sum \|L_n\| < \infty$, then $\sum L_n$ converges in $X^*$. First observe that for each $x \in X$ we have
$$\sum_n |L_n x| \le \Big(\sum_n \|L_n\|\Big)\|x\| < \infty. \tag{2.21}$$
Thus the numerical series with terms $L_n x$ converges absolutely for each $x$ in $X$, and since the scalar field is complete, also $\sum L_n x$ converges, even unconditionally, to a sum $Lx$, say. First we show that $L$ is a bounded linear functional on $X$. Indeed, given $x$, $y$ in $X$ and a scalar $\lambda$, we have
$$L(x + \lambda y) = \sum_n L_n(x + \lambda y) = \lim_{m \to \infty} \sum_{n=1}^m L_n(x + \lambda y) = \lim_{m \to \infty} \sum_{n=1}^m (L_n x + \lambda L_n y) = Lx + \lambda Ly,$$
and the linearity of $L$ follows. Moreover, by (2.21) it also readily follows that $|Lx|/\|x\| \le \sum \|L_n\|$, $x \ne 0$, and consequently, $L$ is bounded. Finally, since for $x \in X$ we have
$$\Big|Lx - \sum_{n=1}^m L_n x\Big| = \Big|\sum_{n=m+1}^\infty L_n x\Big| \le \Big(\sum_{n=m+1}^\infty \|L_n\|\Big)\|x\|,$$
we get that
$$\Big\|L - \sum_{n=1}^m L_n\Big\| \le \sum_{n=m+1}^\infty \|L_n\| \to 0 \quad \text{as } m \to \infty,$$
and $L = \lim_{m \to \infty} \sum_{n=1}^m L_n$ (in $X^*$). Since all the assumptions of Theorem 1.1 are now satisfied, we get that $X^*$ is complete. ∎
It is interesting to point out that the conclusion of Proposition 2.5 holds whether $X$ itself is complete or not, as the proof only makes use of the completeness of the field of scalars.
After this brief digression we turn to prove a version of the Hahn-Banach Theorem that deals with continuous linear functionals.
Theorem 2.6 (Hahn-Banach Theorem). Suppose $X$ is a normed linear space, and let $L_0$ be a bounded linear functional defined on a subspace $X_0$ of $X$. Then there exists a bounded linear functional $L$ defined on $X$ such that
$$L|X_0 = L_0 \quad \text{and} \quad \|L\| = \|L_0\|. \tag{2.22}$$
Proof. We consider first the case when $X$ is a real linear space. Since $L_0$ is a bounded linear functional on $X_0$, it follows that
$$L_0 x \le |L_0 x| \le \|L_0\|\,\|x\|, \quad \text{all } x \in X_0. \tag{2.23}$$
Note that the expression on the right-hand side of (2.23) may be thought of as a seminorm on $X$. More precisely, if for $x$ in $X$ we put $p(x) = \|L_0\|\,\|x\|$, then $p$ is a seminorm on $X$, and (2.23) actually states that the assumptions of Theorem 2.1 are satisfied. By the conclusion of that theorem there is a linear functional $L$ defined on $X$ such that
$$Lx = L_0 x, \quad x \in X_0, \quad \text{and} \quad Lx \le p(x), \quad \text{all } x \in X. \tag{2.24}$$
The estimate in (2.24) may be rewritten as
$$Lx \le \|L_0\|\,\|x\|, \tag{2.25}$$
and since $L$ is linear we also have
$$-Lx = L(-x) \le \|L_0\|\,\|{-x}\| = \|L_0\|\,\|x\|. \tag{2.26}$$
Thus combining (2.25) and (2.26) it follows that $|Lx| \le \|L_0\|\,\|x\|$ for all $x \in X$, and consequently, $\|L\| \le \|L_0\|$. Furthermore, since the restriction of $L$ to $X_0$ is $L_0$, we also have
$$\|L\| \ge \sup_{x \ne 0,\, x \in X_0} \frac{|L_0 x|}{\|x\|} = \|L_0\|,$$
and $\|L\| = \|L_0\|$. This completes the proof in the real case. As for the complex case, the proof follows along similar lines once we invoke Theorem 2.3. ∎
Many important topics in the theory of linear spaces rely on the notion of convexity; as a first application of the Hahn-Banach Theorem we formalize the discussion preceding Theorem 2.1; first a definition. Given subsets $X_0$ and $X_1$ of a linear space $X$, a linear functional $L$ defined on $X$ is said to separate $X_0$ and $X_1$ if
$$\sup_{x \in X_1} Lx \le \inf_{x \in X_0} Lx.$$
The lack of symmetry in this definition is only apparent as the roles of $X_0$ and $X_1$ are interchanged when $L$ is replaced by $-L$. It follows at once from this definition that $L$ separates $X_0$ and $X_1$ iff $L$ separates $X_0 - X_1 = \{z: z = x_0 - x_1, x_0 \in X_0, x_1 \in X_1\}$ and $\{0\}$ iff $L$ separates $X_0 - x = \{z: z = x_0 - x, x_0 \in X_0\}$ and $X_1 - x$ for every $x \in X$.
We then have
Theorem 2.7. Let $C_0$, $C_1$ be two disjoint, nonempty convex subsets of a real normed linear space $X$, and suppose that at least one of the sets, $C_0$ say, has a nonempty interior. Then there exists a nontrivial linear functional $L$ on $X$ that separates $C_0$ and $C_1$.
Proof. Let $x_0$ be an interior point of $C_0$; by considering if necessary $C_0 - x_0$ and $C_1 - x_0$, which are also convex, we may assume that 0 is an interior point of $C_0$. Let $x_1$ be a point of $C_1$; then $-x_1$ is an interior point of the convex set $C_0 - C_1 = \{z: z = x - y, x \in C_0, y \in C_1\}$ and 0 is an interior point of the convex set $C = C_0 - C_1 + x_1 = \{z: z = x + x_1, x \in C_0 - C_1\}$. Moreover, since $C_0$ and $C_1$ are disjoint we also have
$$0 \notin C_0 - C_1, \quad x_1 \notin C = C_0 - C_1 + x_1. \tag{2.27}$$
Let $p_C$ be the Minkowski functional corresponding to $C$; from (2.27) it follows that $p_C(x_1) \ge 1$. Let $X_1$ be the one-dimensional subspace of $X$ spanned by $x_1$, so that $X_1$ consists of all elements of the form $\lambda x_1$, $\lambda$ real, and consider the linear functional $L_1$ defined on $X_1$ by
$$L_1(\lambda x_1) = \lambda\,p_C(x_1).$$
Since $p_C(\lambda x_1) = \lambda\,p_C(x_1)$ if $\lambda \ge 0$, while
$$L_1(\lambda x_1) = \lambda\,p_C(x_1) < 0 \le p_C(\lambda x_1) \quad \text{if } \lambda < 0,$$
we also have $L_1(\lambda x_1) \le p_C(\lambda x_1)$, for all real $\lambda$. We are now in a position to invoke the Hahn-Banach Theorem, and extend $L_1$ to a linear functional $L$ defined on the whole space $X$ satisfying the condition
$$Lx \le p_C(x), \quad \text{all } x \in X. \tag{2.28}$$
Since $p_C(x) \le 1$ on $C$, while $Lx_1 = L_1 x_1 \ge 1$, by (2.28) it follows that $L$ separates $C$ and $\{x_1\}$. But as observed above this is equivalent to the statement that $L$ separates $C_0 - C_1$ and $\{0\}$, which is in turn equivalent to the fact that $L$ separates $C_0$ and $C_1$. ∎
We discuss next further applications of the Hahn-Banach Theorem to different settings.
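The Minkowski functional $p_C$ on which the separation argument rests can also be explored numerically. What follows is a minimal sketch and not part of the text: the function name `minkowski`, the membership-oracle interface `contains`, and the two sample sets are assumptions made for illustration, and the bisection relies on $C$ being convex with 0 in its interior.

```python
def minkowski(x, contains, lam_max=1e6, tol=1e-10):
    """Approximate p_C(x) = inf{1/lam : lam > 0 and lam*x in C}, cf. (2.3).

    `contains` is a hypothetical membership oracle for C. We assume C is
    convex and contains 0, so {lam >= 0 : lam*x in C} is an interval
    starting at 0 and bisection on its right endpoint is valid.
    """
    def scaled(lam):
        return tuple(lam * t for t in x)

    if contains(scaled(lam_max)):
        # lam*x stays in C up to lam_max: report 0 (exact for x = 0,
        # otherwise only accurate to within 1/lam_max).
        return 0.0
    lo, hi = 0.0, lam_max          # lo*x is in C, hi*x is not
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if contains(scaled(mid)):
            lo = mid
        else:
            hi = mid
    return 1.0 / hi


# Two sample convex sets in the plane with 0 as an interior point.
def open_disc(v):
    return v[0] ** 2 + v[1] ** 2 < 1.0


def open_square(v):
    return max(abs(v[0]), abs(v[1])) < 1.0


x, y = (3.0, 4.0), (-1.0, 2.0)
print(minkowski(x, open_disc))     # close to 5, the Euclidean norm of x
print(minkowski(x, open_square))   # close to 4, the max norm of x
```

For these two sets $p_C$ reproduces, up to the bisection tolerance, the norm whose open unit ball is $C$; positive homogeneity and subadditivity can be spot-checked in the same way on sample vectors.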
3. APPLICATIONS
We begin by discussing three interesting applications of the Hahn-Banach Theorem: the determination of when a linear subspace is dense in a linear space, the general form of the converse to Hölder's inequality, and the construction of a natural embedding of a normed space into a Banach space. First we prove
Proposition 3.1. Let $Y$ be a linear subspace of a normed linear space $X$, and suppose $x \in X$ is such that $d(x, Y) = \inf_{y \in Y} \|x - y\| = \eta > 0$. Then there is a bounded linear functional $L$ on $X$ with norm $\|L\| = 1/\eta$ which separates $x$ from $Y$. More precisely, we have
$$Lx = 1, \quad \text{and} \quad Ly = 0 \quad \text{for all } y \in Y.$$
Proof. Let $Y_1$ be the subspace of $X$ spanned by $Y$ and $\{x\}$; each element $y_1$ of $Y_1$ can be written uniquely as $y_1 = y + \lambda x$, with $y \in Y$ and a scalar $\lambda$. Now, if $y_1 = y + \lambda x$, note that
$$|\lambda|\,\eta \le \|y_1\|. \tag{3.1}$$
Indeed, if $\lambda = 0$ there is nothing to prove. Otherwise, if $\lambda \ne 0$, since $(-1/\lambda)y \in Y$, it readily follows that $\eta \le \|x - (-1/\lambda)y\| = \|y_1\|/|\lambda|$, and (3.1) holds. We define now the linear functional $L_1$ on $Y_1$ as follows: If $y_1 = y + \lambda x$, then put $L_1 y_1 = \lambda$. By (3.1) it follows that $|L_1 y_1| \le \|y_1\|/\eta$, and consequently, we have $\|L_1\| \le 1/\eta$. To show that equality actually holds here let $\{y_n\} \subseteq Y$ be such that $\lim_{n \to \infty} \|x - y_n\| = \eta$. It is clear that $1 = L_1(x - y_n) \le \|L_1\|\,\|x - y_n\|$ where the right-hand side above tends to $\|L_1\|\,\eta$ as $n \to \infty$. Thus we also have $1/\eta \le \|L_1\|$, and consequently, $\|L_1\| = 1/\eta$. We are now in a position to invoke Theorem 2.6. By that result there exists a linear functional $L$ defined on $X$ with $\|L\| = 1/\eta$ that extends $L_1$. Since it is also clear that $Lx = L_1 x = 1$ and $Ly = L_1 y = 0$ for $y \in Y$, the functional $L$ does the job. ∎
Corollary 3.2. Let $X$ be a normed linear space. For any $0 \ne x \in X$ there exists a linear functional $L$ defined on $X$ with $\|L\| = 1$ and such that $Lx = \|x\|$. In particular, if $x$ and $y$ are distinct points of $X$, there exists $L \in X^*$ such that $Lx \ne Ly$.
Proof. Suppose $x \ne 0$. Then by Proposition 3.1, with $Y = \{0\}$ there, there exists a functional $L' \in X^*$ such that $\|L'\| = 1/\|x\|$ and $L'x = 1$. The first part of the conclusion follows now upon setting $L = \|x\|L'$. As for the second part, it follows from the first with $x$ replaced by $x - y \ne 0$. ∎
Next we show a "density" result; it roughly states that if $Y$ is a dense subspace of $X$, then the only bounded linear functional that vanishes on $Y$ is the trivial, or zero, functional.
Proposition 3.3. Suppose $X$ is a normed linear space, and let $Y$ be a subspace of $X$ which is not dense in $X$. Then there exists a nontrivial linear functional $L$ defined on $X$ which vanishes on $Y$.
Proof. Since $Y$ is not dense in $X$, there is $x \in X$ which satisfies $\inf_{y \in Y} \|x - y\| > 0$. To obtain $L$ apply now Proposition 3.1. ∎
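Proposition 3.1, which drives both results above, can be verified directly in a finite-dimensional case. The setting below (the Euclidean plane, a coordinate axis, and one specific point) is an assumption made for illustration and is not taken from the text.

```latex
% Illustrative setting (an assumption of this example, not from the text):
% X = R^2 with the Euclidean norm, Y = {(u,0) : u real}, x = (0,2).
\begin{align*}
  \eta &= d(x,Y) = \inf_{u}\,\|(0,2) - (u,0)\| = \inf_{u} \sqrt{u^2 + 4} = 2,\\
  L(u,v) &= v/2, \qquad Lx = 1, \qquad L(u,0) = 0,\\
  \|L\| &= \sup_{(u,v) \ne 0} \frac{|v|/2}{\sqrt{u^2 + v^2}} = \frac{1}{2} = \frac{1}{\eta}.
\end{align*}
```

Here the functional can be written down explicitly; the Hahn-Banach Theorem is what guarantees its existence when no such formula is available.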
The next result we discuss is an extension of the converse to Hölder's inequality in the spirit of Proposition 2.3 in Chapter XII.
Proposition 3.4. Suppose $X$ is a normed linear space, and let $x \in X$. Then we have
$$\|x\| = \sup_{L \ne 0} \frac{|Lx|}{\|L\|} = \sup_{\|L\| = 1} |Lx|. \tag{3.2}$$
Proof. Since for each $L \in X^*$ we have $|Lx| \le \|L\|\,\|x\|$, it readily follows that either sup in (3.2) above is less than or equal to $\|x\|$. As for the opposite inequality, note that by Corollary 3.2 there is a bounded linear functional $L$ of norm 1 defined on $X$ so that $Lx = \|x\|$. For this functional we have $\|x\| = |Lx|/\|L\|$, and we have finished. ∎
Next we discuss the embedding of a normed space into a Banach space, but first a definition. The natural map, denoted by $J_X$, of a normed linear space $X$ into its second conjugate space $X^{**}$ (the Banach space of bounded linear functionals on $X^*$) is defined by
$$(J_X x)L = Lx, \quad \text{all } L \in X^*. \tag{3.3}$$
It is not hard to check that for each $x \in X$, $J_X x$ is a bounded linear functional on $X^*$. To show that $J_X x$ is a linear functional on $X^*$, let $L_1, L_2 \in X^*$, $\lambda$ a scalar, and note that by (3.3) we have
$$(J_X x)(L_1 + \lambda L_2) = (L_1 + \lambda L_2)(x) = L_1 x + \lambda L_2 x = (J_X x)L_1 + \lambda (J_X x)L_2.$$
To show that $J_X x$ is actually bounded we make use of (3.2): If $\|J_X x\|$ denotes the norm of $J_X x$ as an element of $X^{**}$, then by (3.3) and (3.2) it follows that
$$\|J_X x\| = \sup_{L \ne 0} \frac{|(J_X x)L|}{\|L\|} = \sup_{L \ne 0} \frac{|Lx|}{\|L\|} = \|x\|.$$
In fact, we have shown that $J_X$ is also norm preserving, and consequently, one-to-one. In other words, the natural map establishes a linear isometric embedding from $X$ into $X^{**}$. These properties of the natural map lead to a simple proof of the following result.
Theorem 3.5. Every normed linear space is a dense subspace of a Banach space.
Proof. Given a normed linear space $X$, let $X_1 = J_X(X) \subseteq X^{**}$ denote the image of $X$ in $X^{**}$ under the natural map. Since, as established above, $X$ and $X_1$ are isometrically isomorphic, we may think of $X$ as $X_1$, and prove the conclusion for $X_1$ instead. Let $X_2$ denote the closure of $X_1$ in $X^{**}$; $X_2$ is a closed subspace of a complete space, and consequently it is also complete. Moreover, since by construction $X_1$ is dense in $X_2$, we are done. ∎
If the range of the natural map $J_X$ is all of $X^{**}$, then $X$ is said to be reflexive. For instance, from the definition of the natural map and the representation of the dual space to the Lebesgue $L^p$ spaces given in Theorem 2.5 in Chapter XII, it follows that $L^p(\mu)$ is reflexive when $1 < p < \infty$. The reader should be warned that, in general, the equivalence of a normed linear space with its second conjugate does not guarantee the reflexivity of the space. On the other hand, $L^1(\mu)$ is not in general reflexive, and to see this we make use of the following observation. A normed linear space $X$ is said to be separable if there exists a countable dense subset of $X$. For instance, $L^p(R^n)$ is separable if $1 \le p < \infty$, and is not separable if $p = \infty$. We then have
Proposition 3.6. If the conjugate $X^*$ of a normed linear space $X$ is separable, then $X$ is also separable.
Proof. Let $\{L_n\}$ be an at most countable dense subset of $X^*$, and $\{x_n\}$ a sequence of elements in $X$ such that $|L_n x_n| \ge \|L_n\|/2$, $\|x_n\| = 1$, for all $n$. We claim that the linear subspace $Y$ of $X$ spanned by the $x_n$'s is dense in $X$. Suppose this is not the case. Then, by Proposition 3.3, there is a nontrivial linear functional $L \in X^*$ such that $Lx = 0$ for every $x \in Y$. Since by assumption $\{L_n\}$ is dense in $X^*$, there is a sequence $\{L_{n_m}\}$ that converges to $L$. Now, since for each $m$ we have
$$\|L_{n_m}\|/2 \le |L_{n_m} x_{n_m}| = |(L_{n_m} - L)x_{n_m}| \le \|L_{n_m} - L\|,$$
it readily follows that $\lim_{m \to \infty} L_{n_m} = 0$. But this is impossible since $\lim_{m \to \infty} L_{n_m} = L \ne 0$, and consequently, $Y$ is dense in $X$. Finally, since the set consisting of all finite linear combinations of the $x_n$'s with rational coefficients is countable and dense in $X$, $X$ is separable. ∎
Since $\ell^\infty = (\ell^1)^*$ is not separable but $\ell^1$ is, the converse to the above proposition is not true. Nevertheless, we have
Corollary 3.7. The conjugate space of a reflexive separable space is also separable.
Proof. Suppose $X$ is a normed linear space which is reflexive and separable. Then $X^{**} = J_X(X)$ is also separable and, by Proposition 3.6, $X^*$ is separable. ∎
Since as pointed out above $(\ell^1)^* = \ell^\infty$ and $\ell^1$ is separable but $\ell^\infty$ is not, $\ell^1$ is not reflexive. It is therefore of interest to describe the dual space to $L^\infty(\mu)$, a task we left open in Chapter XII. We begin by discussing a related result of independent interest, namely the dual space to $C(I)$. Let $I = [0,1]$, and $L$ be a continuous linear functional defined on $C(I)$. Since $C(I)$ is a (closed) linear subspace of $L^\infty(I)$, by Theorem 2.6 there is a bounded linear functional $L_1$ defined on $L^\infty(I)$ that satisfies
$$L_1 f = Lf \ \text{ if } f \in C(I), \quad \text{and} \quad \|L_1\| = \|L\|. \tag{3.4}$$
Now, for each $x \in I$ we define a bounded function ...
... where $\phi = \sum_{i=1}^n c_i(\phi_{x_i} - \phi_{x_{i-1}})$ is a bounded function with $\|\phi\|_\infty = 1$. Now, by (3.4) we get $A \le \|L_1\|\,\|\phi\|_\infty = \|L\|$, and consequently, $g$ is BV on $I$, and $V(g; 0,1) \le \|L\|$. Given $f \in C(I)$, define the bounded functions
$$f_n = \sum_{k=1}^n f(k/n)\big(\phi_{k/n} - \phi_{(k-1)/n}\big), \quad n = 1, 2, \dots$$
and note that since $f$ is uniformly continuous it follows that $\|f_n - f\|_\infty \to 0$ as $n \to \infty$. Whence by the continuity of $L_1$ and (3.4) we have
$$\lim_{n \to \infty} L_1 f_n = L_1 f = Lf. \tag{3.5}$$
On the other hand, since $L_1 f_n$ may be rewritten as
$$L_1 f_n = \sum_{k=1}^n f(k/n)\big(g(k/n) - g((k-1)/n)\big),$$
by Theorem 2.6 in Chapter III we obtain
$$\lim_{n \to \infty} L_1 f_n = \int_0^1 f\,dg. \tag{3.6}$$
Thus combining (3.5) and (3.6) we conclude that
$$Lf = \int_0^1 f\,dg. \tag{3.7}$$
Furthermore, by (3.2) in Chapter III, we also have
$$|Lf| \le \max_{x \in I} |f(x)|\,V(g; 0,1),$$
and consequently
$$\|L\| = V(g; 0,1). \tag{3.8}$$
Two observations: First, since $g(0) = 0$, it follows that $V(g; 0,1) = \|g\|$, the norm on BV introduced in (1.5). Also, by (3.8), for BV functions $g$ with $g(0) = 0$, the integral in (3.7) determines a bounded linear functional $L$ on $C(I)$ with $\|L\| \le \|g\|$. The only difficulty here is that the expression in (3.7) does not uniquely determine the functional $L$, cf. 4.24-4.26 in Chapter III, and, as in the case of the Lebesgue $L^p$ spaces, some kind of normalization is needed. The details are left to the reader, cf. 4.36 below.
We close this section with the description of the conjugate space to $L^\infty(\mu)$. It is not intuitively clear how the bounded linear functionals on $L^\infty(\mu)$ look. On the one hand, it is obvious that functions $g \in L^1(\mu)$ induce such functionals by means of
$$Lf = \int_X f g\,d\mu, \tag{3.9}$$
but it is not hard to see that not all functionals are of this form. Indeed, let $J = [-1,1]$, and
$$Y = \Big\{f \in L^\infty(J): \lim_{r \to 0^+} \frac{1}{r} \int_{(0,r)} f\,dy \text{ exists}\Big\}.$$
Then $Y$ is a nonempty subspace of $L^\infty(J)$, and
$$Lf = \lim_{r \to 0^+} \frac{1}{r} \int_{(0,r)} f\,dy$$
is a bounded linear functional on $Y$ with $\|L\| = 1$. Now, by the Hahn-Banach Theorem, $L$ can be extended to a bounded linear functional on $L^\infty(J)$, also of norm 1. For simplicity denote this extension also by $L$ and observe that $L$ cannot be of the form (3.9) for any $g \in L^1(J)$. Indeed, if (3.9) is true for an integrable function $g$, let $f_\eta = \chi_{[\eta,1]}\,\mathrm{sgn}\,g$, where $0 < \eta \le 1$; it is clear that $f_\eta \in Y$ and $Lf_\eta = 0$. It then readily follows that
$$Lf_\eta = \int_{[\eta,1]} |g|\,dy = 0, \quad \text{all } \eta > 0.$$
But this implies that $g = 0$ a.e. on $(0,1]$, and a similar argument gives that $g = 0$ a.e. on $[-1,0]$. In other words, $g = 0$ a.e., and $L$ is then the zero functional, contrary to the fact that $\|L\| = 1$.
The analytic representation of the conjugate space to $L^\infty(\mu)$ requires that we extend the notion of integral to include integration with respect to a signed additive set function. Because it suffices for this application, we restrict our attention to the case when both the function to be integrated and the set function with respect to which the integration is carried out are bounded.

First some definitions. Let $\mathcal{A}$ be an algebra of subsets of $X$ and $\psi$ a bounded nonnegative set function defined on $\mathcal{A}$. Given a bounded function $g\colon X \to \mathbb{R}$ and a partition $P$ of $X$ consisting of pairwise disjoint measurable sets $E_1, \dots, E_n$, put $m_k = \inf_{E_k} g$, $M_k = \sup_{E_k} g$, and consider the lower and upper sums of $g$ corresponding to $P$ with respect to $\psi$, defined by the expressions
$$s(g,\psi,P) = \sum_{k=1}^{n} m_k\,\psi(E_k) \quad\text{and}\quad S(g,\psi,P) = \sum_{k=1}^{n} M_k\,\psi(E_k)$$
respectively. The usual properties of lower and upper sums hold in this case as well. They are: (i) If a partition $P'$ refines a partition $P$, then we have
$$s(g,\psi,P) \le s(g,\psi,P') \quad\text{and}\quad S(g,\psi,P') \le S(g,\psi,P).$$
(ii) No lower sum exceeds an upper sum, even when they are formed with two different partitions.

In case the quantities
$$\sup_P s(g,\psi,P) \quad\text{and}\quad \inf_P S(g,\psi,P)$$
coincide, we define the integral $\int_X g\,d\psi$ of $g$ over $X$ with respect to $\psi$ as that common value. The class of functions for which the integral exists is rather wide and, as we now show, it includes the bounded measurable functions. By the way, since $\mathcal{A}$ is not necessarily a $\sigma$-algebra, we say that a function is measurable provided all four conditions in Proposition 1.1 in Chapter VI are satisfied.

Proposition 3.8. If $\psi(X) < \infty$ and $g$ is bounded and measurable, then $\int_X g\,d\psi$ exists.
Proof. By (i) and (ii) above it suffices to show that there are partitions $P$ of $X$ for which the lower and upper sums are arbitrarily close to each other. Let $m < M$ be real numbers such that $m < g(x) < M$ for all $x \in X$, suppose $\eta$ is an arbitrary constant, $0 < \eta < M - m$, and divide
the interval $(m, M)$ by means of the points $m = t_0 < t_1 < \dots < t_n = M$ into a finite number of subintervals, each of length less than or equal to $\eta$. Form now the sets
$$E_k = \{x \in X : t_{k-1} \le g(x) < t_k\}, \quad k = 1, \dots, n,$$
and observe that they are pairwise disjoint and measurable. Let $P$ denote the partition of $X$ into the $E_k$'s; if any $E_k$ is empty, simply drop it. Further note that since this family is finite we have $\psi(X) = \sum_k \psi(E_k)$. Moreover, since for each $k$ we have
$$M_k - m_k \le t_k - t_{k-1} \le \eta,$$
it readily follows that
$$S(g,\psi,P) - s(g,\psi,P) = \sum_k (M_k - m_k)\,\psi(E_k) \le \eta \sum_k \psi(E_k) = \eta\,\psi(X).$$
Thus, by means of an appropriate choice of $\eta$, the difference between the upper and lower sums above can be made arbitrarily small, which is what we set out to prove. ∎
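The level-set partition used in the proof can be tried out on a finite set. The following is a rough sketch, assuming a hypothetical set $X = \{1, \dots, 200\}$, a set function given by point masses $\psi(\{x\}) = 1/x^2$, and $g = \sin$; none of these choices comes from the text, and the helper names are illustrative only.

```python
import math


def level_set_partition(points, g, m, M, eta):
    """Group points into cells {t_{k-1} <= g < t_k} for a uniform grid of mesh <= eta on (m, M)."""
    n = math.ceil((M - m) / eta)       # number of subintervals
    step = (M - m) / n                 # actual mesh, at most eta
    cells = {}
    for x in points:
        k = int((g(x) - m) // step)    # index of the cell containing g(x)
        k = min(k, n - 1)              # guard against rounding at the top end
        cells.setdefault(k, []).append(x)
    return list(cells.values())        # empty cells are never created


def lower_upper_sums(partition, g, psi):
    """Lower and upper sums of g over the partition; psi maps points to nonnegative masses."""
    lower = sum(min(g(x) for x in E) * sum(psi[x] for x in E) for E in partition)
    upper = sum(max(g(x) for x in E) * sum(psi[x] for x in E) for E in partition)
    return lower, upper


points = list(range(1, 201))               # hypothetical finite X
psi = {x: 1.0 / (x * x) for x in points}   # hypothetical point masses, psi(E) = sum over E
total = sum(psi.values())                  # psi(X)
eta = 0.01

P = level_set_partition(points, math.sin, -1.5, 1.5, eta)
lower, upper = lower_upper_sums(P, math.sin, psi)
print(upper - lower, eta * total)
```

Up to rounding, the printed gap between the upper and lower sums should stay below $\eta\,\psi(X)$, which is the estimate obtained in the proof.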
It is interesting to point out that, in general, the class of functions for which the integral exists includes functions that are not measurable. Indeed, if $X = \mathbb{N}$ and $\mathcal{A}$ is the algebra of those subsets $E$ of $\mathbb{N}$ which are either finite or so that $\mathbb{N} \setminus E$ is finite, then
$$\psi(E) = \begin{cases} 0 & \text{if } E \text{ is finite,} \\ \infty & \text{if } \mathbb{N}\setminus E \text{ is finite,} \end{cases}$$
is an additive set function defined on $\mathcal{A}$, the function $g$ = characteristic function of the odd integers is not measurable, and yet $\int_{\mathbb{N}} g\,d\psi = \infty$ exists.

It is possible to define the integral of $g$ with respect to a signed additive set function $\psi$ over $\mathcal{A}$ as follows: If $\psi_+$ and $\psi_-$ denote the positive and negative variations of $\psi$ respectively, cf. 4.8 in Chapter IV, let
$$\int_X g\,d\psi = \int_X g\,d\psi_+ - \int_X g\,d\psi_-, \tag{3.10}$$
provided the expression on the right-hand side of (3.10) is well-defined. Now, from (3.10) it follows that the basic properties of the Riemann-Stieltjes and Lebesgue integrals hold in this context, with slight or no
change. We need two specific properties of the integral, to wit, linearity and boundedness; we state them next, their proof is left to the reader. If the integral of $g_1$ and that of $g_2$ with respect to $\psi$ exist and $\lambda$ is a scalar, then the integral of $g_1 + \lambda g_2$ with respect to $\psi$ exists, and we have
$$\int_X (g_1 + \lambda g_2)\,d\psi = \int_X g_1\,d\psi + \lambda \int_X g_2\,d\psi.$$
Also, if $|g(x)| \le M$ for all $x \in X$, then
$$\Bigl|\int_X g\,d\psi\Bigr| \le M\,|\psi|(X). \tag{3.11}$$

We are now ready to give a description of the dual to $L^\infty(\mu)$. Suppose $(X, \mathcal{M}, \mu)$ is a measure space, let $L$ be a bounded linear functional defined on $L^\infty(\mu)$, and for $E \in \mathcal{M}$ put
$$\psi(E) = L\chi_E. \tag{3.12}$$
From the linearity of $L$ it readily follows that $\psi$ is an additive set function defined on $\mathcal{M}$. Moreover, since $L$ is bounded we also have
$$|\psi(E)| \le \|L\|\,\|\chi_E\|_\infty. \tag{3.13}$$
Now, if $\mu(E) = 0$ we have $\|\chi_E\|_\infty = 0$, and consequently, by (3.13) we obtain $\psi(E) = 0$. Moreover, since we also have that $\mu(A) = 0$ for any $A \subseteq E$, $A \in \mathcal{M}$, it follows that $\psi(A) = 0$ for those sets, and, by 4.8 in Chapter IV, we get that $\psi_+(E) = \psi_-(E) = 0$, and $|\psi|(E) = 0$.

Suppose now that $f \in L^\infty(\mu)$ and consider a representative in the equivalence class of $f$, which we call $f$ again, which is bounded everywhere by $\|f\|_\infty$. Let $M > \|f\|_\infty$, and divide the interval $[-M, M]$ by means of the points $-M = t_0 < t_1 < \dots < t_n = M$ into a finite number of subintervals, each of length less than or equal to an arbitrary real number $\eta > 0$. Form now the partition of $X$ consisting of the measurable sets
$$E_k = \{t_{k-1} \le f < t_k\}, \quad k = 1, \dots, n,$$
and let $h$ be the measurable function $h = \sum_{k=1}^{n} t_{k-1}\chi_{E_k}$. Now, if $x \in E_k$, then $|f(x) - h(x)| = |f(x) - t_{k-1}| \le \eta$, and consequently, we have
$$\|f - h\|_\infty = \sup_{x\in X} |f(x) - h(x)| \le \eta. \tag{3.14}$$
Further, since $L$ is linear and bounded, by (3.14) it follows that
$$|Lf - Lh| = |L(f - h)| \le \|L\|\,\|f - h\|_\infty \le \|L\|\,\eta. \tag{3.15}$$
On the other hand, both $f$ and $h$ have an integral with respect to $\psi$ on $X$, and since $\int_X h\,d\psi = \sum_k t_{k-1}\psi(E_k) = Lh$, it follows that
$$\Bigl|\int_X f\,d\psi - Lh\Bigr| = \Bigl|\int_X (f - h)\,d\psi\Bigr| \le \|f - h\|_\infty\,|\psi|(X) \le \eta\,|\psi|(X),$$
which combined with (3.15) allows us to estimate $\bigl|Lf - \int_X f\,d\psi\bigr|$ by
$$|Lf - Lh| + \Bigl|Lh - \int_X f\,d\psi\Bigr| \le \bigl(\|L\| + |\psi|(X)\bigr)\,\eta.$$
Since $\eta$ is arbitrary this can only happen if
$$Lf = \int_X f\,d\psi, \quad \text{all bounded } f. \tag{3.16}$$
This is the first step in obtaining the representation of $L$. There is, of course, the question of the uniqueness of the representation: We must be sure that for each $f \in L^\infty(\mu)$ the right-hand side in (3.16) above is independent of the bounded representative we choose in the equivalence class of $f$. This amounts to proving that if $f$ is equivalent to 0, then we have $\int_X f\,d\psi = 0$. Now, in this case, $X$ can be partitioned into two disjoint measurable sets $E_1$ and $E_2$, say, so that $f = 0$ on $E_1$ and $\mu(E_2) = 0$. By the definition of the integral it is clear that $\int_{E_1} f\,d\psi = 0$, and since as observed above we also have $|\psi|(E_2) = 0$, if $c$ is a bound for $f$, by (3.11) we get that
$$\Bigl|\int_{E_2} f\,d\psi\Bigr| \le c\,|\psi|(E_2) = 0.$$
By the linearity of the integral it now follows that $\int_X f\,d\psi = 0$, which ensures that the right-hand side of (3.16) is well-defined on equivalence classes of bounded functions in $L^\infty(\mu)$. The stage is now set for
Theorem 3.9. Let $(X, \mathcal{M}, \mu)$ be a measure space. The dual to $L^\infty(\mu)$ can be described as follows: Each bounded linear functional $L$ on $L^\infty(\mu)$ is of the form
$$Lf = \int_X f\,d\psi, \quad f \in L^\infty(\mu), \tag{3.17}$$
where $\psi$ is a bounded additive set function defined on $\mathcal{M}$ satisfying the condition $|\psi|(E) = 0$ whenever $\mu(E) = 0$. Furthermore, the norm $\|L\|$ is
$$\|L\| = |\psi|(X). \tag{3.18}$$
Proof. As discussed above, if $L$ is a bounded linear functional defined on $L^\infty(\mu)$, then there is a bounded additive function $\psi$ defined on $\mathcal{M}$ such that (3.17) holds. Moreover, this representation is independent of the bounded representative of each $f \in L^\infty(\mu)$, and (3.11) implies that $\|L\| \le |\psi|(X)$. Thus to verify (3.18) it suffices to prove that we also have $|\psi|(X) \le \|L\|$. Now, by 4.9 in Chapter IV, given $\varepsilon > 0$, there exist disjoint measurable subsets $E_1, E_2$ of $X$ such that
$$\psi(E_1) - \psi(E_2) \ge |\psi|(X) - \varepsilon.$$
Put $f = \chi_{E_1} - \chi_{E_2}$ and observe that since $f$ only takes the values 0 and $\pm 1$, we have $\|f\|_\infty \le 1$ and consequently,
$$\|L\| \ge |Lf| \ge Lf = L\chi_{E_1} - L\chi_{E_2} = \psi(E_1) - \psi(E_2) \ge |\psi|(X) - \varepsilon.$$
Since $\varepsilon > 0$ is arbitrary, the above estimate implies that $\|L\| \ge |\psi|(X)$, (3.18) holds, and the integral representation of $L$ has been established.

On the other hand, if $\psi$ is a bounded additive set function defined on $\mathcal{M}$ with the property that $|\psi|(E) = 0$ whenever $\mu(E) = 0$, then it is not hard to see that (3.17) defines a bounded linear functional $L$ on $L^\infty(\mu)$ with norm $\|L\| = |\psi|(X)$. Indeed, if $f$ is a bounded representative of a function $f \in L^\infty(\mu)$, consider the measurable partition of $X$ consisting of the sets $B = \{|f| > \|f\|_\infty\}$, and $X \setminus B$. Since $\mu(B) = 0$, it follows that $|\psi|(B) = 0$ and by (3.11) we get that $\int_B f\,d\psi = 0$. Whence we have $\int_X f\,d\psi = \int_{X\setminus B} f\,d\psi$ and, by (3.11) again, it follows that
$$|Lf| \le |\psi|(X \setminus B)\,\|f\|_\infty \le |\psi|(X)\,\|f\|_\infty.$$
These observations imply at once that $L$ is a bounded linear functional defined on $L^\infty(\mu)$ with $\|L\| \le |\psi|(X)$. The opposite inequality holds as before, $L$ also satisfies (3.18), and we have finished. ∎
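On a finite set the norm identity (3.18) can be verified directly. The sketch below assumes a hypothetical five-point space with counting measure, so that $L^\infty$ is just $\mathbb{R}^5$ with the maximum norm, and a signed set function given by point masses; the weights are invented for illustration. The extremal function is built as $\chi_{E_1} - \chi_{E_2}$, as in the proof.

```python
import random

# Hypothetical signed point masses psi({i}); not taken from the text.
weights = [0.5, -1.25, 0.0, 2.0, -0.75]


def L(f):
    """The functional f -> sum_i f(i) * psi({i}), i.e. the integral of f against psi."""
    return sum(fi * wi for fi, wi in zip(f, weights))


# Total variation |psi|(X) for point masses.
total_variation = sum(abs(w) for w in weights)

# E1 and E2 carry the positive and negative masses; f_star = chi_E1 - chi_E2.
E1 = {i for i, w in enumerate(weights) if w > 0}
E2 = {i for i, w in enumerate(weights) if w < 0}
f_star = [1.0 if i in E1 else -1.0 if i in E2 else 0.0 for i in range(len(weights))]

# Random test functions with sup norm at most 1.
rng = random.Random(0)
trials = [[rng.uniform(-1.0, 1.0) for _ in weights] for _ in range(1000)]
largest = max(abs(L(f)) for f in trials)

print(L(f_star), total_variation, largest)
```

In this finite setting the supremum defining $\|L\|$ is attained at `f_star`; in general the proof only produces functions that come within $\varepsilon$ of $|\psi|(X)$.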
4. PROBLEMS AND QUESTIONS

4.1 Is every metric on a linear space induced by a norm?

4.2 Let $X$ be a normed linear space, and $B = \{x \in X : \|x\| < 1\}$. Show that the closure of $B$ is the set $\{x \in X : \|x\| \le 1\}$.

4.3 Suppose $M$ is a closed subspace of a normed linear space $X$ and define an equivalence relation $R$ on $X \times X$ by $xRy$ iff $x - y$ belongs to $M$. If $X/M$ denotes the set of equivalence classes and $x + M$ the equivalence class corresponding to $x \in X$, show that $X/M$ is a linear space over the scalar field of $X$ with the operations
$$(x + M) + (\lambda y + M) = (x + \lambda y) + M, \quad x, y \in X,\ \lambda \text{ scalar}.$$
For future reference, the dimension of $X/M$ is called the codimension of $M$. Further, $X/M$ is also a normed space with norm $\|x + M\| = d(x, M)$. Are these conclusions true if $M$ is not necessarily closed?

4.4 Let $X$ be a normed linear space, and $M$ be a closed subspace of $X$. Prove that $X$ is complete iff $M$ and $X/M$ are complete. Also, show that $X$ is separable iff $M$ and $X/M$ are separable.

4.5 If $M$ is a finite-dimensional proper subspace of a normed linear space $X$, prove there exists an element $x \in X$, $\|x\| = 1$, such that $d(x, M) = 1$.

4.6 (F. Riesz) Let $X$ be a normed linear space and $M$ be a proper closed linear subspace of $X$. Show that given $\varepsilon > 0$, there exists an element $x \in X$, $\|x\| = 1$, such that $d(x, M) > 1 - \varepsilon$.

4.7 Let $X$ be a normed linear space and suppose $\lim_{n\to\infty} \|x_n - x\| = 0$. Show that $\lim_{n\to\infty} \|x_n\| = \|x\|$.

4.8 Suppose $x_1, \dots, x_n$ are linearly independent elements of a normed linear space $X$. Show that there exists a constant $c > 0$ with the property that for every choice of scalars $\lambda_1, \dots, \lambda_n$ we have

4.9 Referring to the construction of the norm on a linear space following Proposition 2.4, suppose we put $\|x\|_p = \bigl(\sum_{i=1}^{n} |\lambda_i|^p\bigr)^{1/p}$, $1 \le p < \infty$, there. Is $\|\cdot\|_p$ a norm?
4.10 Let $(X, \mathcal{M}, \mu)$ be a measure space and $1 \le p, q \le \infty$. Show that $L^p(\mu) + L^q(\mu) = \{f : f \text{ can be written as } f = g + h,\ g \in L^p(\mu),\ h \in L^q(\mu)\}$ is a linear space. Further, normed by
$$\|f\| = \inf\{\|g\|_p + \|h\|_q : f = g + h\},$$
$L^p(\mu) + L^q(\mu)$ is a Banach space. Along similar lines, $L^p(\mu) \cap L^q(\mu)$ normed by $\|f\| = \max\{\|f\|_p, \|f\|_q\}$ is also a Banach space. Can you characterize the conjugate space to $L^p(\mu) + L^q(\mu)$? to $L^p(\mu) \cap L^q(\mu)$?

4.11 A sequence $(x_n)$ of elements of a Banach space $X$ is said to be a Schauder basis for $X$ if for each $x \in X$ there is a unique sequence of scalars $(\lambda_n)$ such that $\lim_{m\to\infty} \bigl\|x - \sum_{n=1}^{m} \lambda_n x_n\bigr\| = 0$. Show that $\ell^p$ has a Schauder basis if $1 \le p < \infty$, but $\ell^\infty$ does not.

4.12 Prove that if a Banach space has a Schauder basis, then it is separable.

4.13 Let $c_0 = \{(x_n) \in \ell^\infty : \lim_{n\to\infty} x_n = 0\}$. Show that $c_0$ is a closed linear subspace of $\ell^\infty$, and that it has a Schauder basis. Is $c_0$ reflexive?

4.14 For each positive integer $n$ let $e_n$ be the sequence with 1 in the $n$th place and zeros elsewhere. Prove that $\{e_n\}$ is a Schauder basis for $\ell^1$, but it is not a Hamel basis for $\ell^1$.

4.15 Let $X$ be a linear normed space, and $L$ a nontrivial linear functional on $X$. Prove the following three conditions are equivalent: (a) $L$ is continuous, (b) The null space of $L$ is a proper, closed linear subspace of $X$, and, (c) The null space of $L$ is not dense in $X$.

4.16 Let $X$ be a linear normed space over $\mathbb{C}$. If a linear functional $L$ on $X$ is not continuous, prove that $\{Lx : \|x\| \le 1\}$ is all of $\mathbb{C}$.

4.17 Let $L \ne 0$ be a linear functional on a linear space $X$ and $x_0$ any fixed element of $X \setminus N$, where $N$ is the null space of $L$. Show that any $x \in X$ has a unique representation $x = \lambda x_0 + y$, where $\lambda$ is a scalar and $y \in N$.

4.18 Referring to 4.17, show that any two elements $x_1, x_2 \in X$ belong to the same element of $X/N$ iff $Lx_1 = Lx_2$. Further, the codimension of $N$ is equal to 1.

4.19 Show that two linear functionals $L_1, L_2$ which are defined on the same linear space and have the same null space are proportional.

4.20 If $Y$ is a subspace of a linear space $X$ and the codimension of $Y$ is equal to 1, then every element of $X/Y$ is called a hyperplane
parallel to $Y$. Show that for any linear functional $L \ne 0$ on $X$ the set $Y_1 = \{x \in X : Lx = 1\}$ is a hyperplane parallel to the null space $N$ of $L$. Further, show that the norm $\|L\|$ of $L$ can be interpreted geometrically as the reciprocal of the distance of the hyperplane $Y_1$ to the origin.

4.21 Let $X$ be a normed linear space, and suppose $L$ is a bounded linear functional on $X$ with norm 1. Given $\varepsilon > 0$, show there exists $x_\varepsilon \in X$ such that $\|x_\varepsilon\| = 1$ and $Lx_\varepsilon > 1 - \varepsilon$. Give an example to show that there need not exist $x \in X$ such that $\|x\| = 1$ and $Lx = 1$.

4.22 Let $X$ be a normed linear space and let $\{x_n\} \subseteq X$. Prove that $x \in X$ is the limit of finite linear combinations of the $x_n$'s iff $Lx = 0$ for all continuous linear functionals $L$ on $X$ such that $Lx_n = 0$ for all $n$.

4.23 Let $Y$ be a subset of a (real) normed linear space $X$, and $L_0$ a functional defined on $Y$. Show that a necessary and sufficient condition for $L_0$ to have a bounded linear extension to $X$ is that there exists a constant $k$ with the property that $\bigl|\sum_{i=1}^{n} \lambda_i L_0 x_i\bigr| \le k\,\bigl\|\sum_{i=1}^{n} \lambda_i x_i\bigr\|$ for any $x_1, \dots, x_n$ in $Y$ and scalars $\lambda_1, \dots, \lambda_n$.

4.24 Let $1 < p, q < \infty$ be conjugate indices, i.e., $1/p + 1/q = 1$. Suppose $g \in L^q(\mathbb{R}^n)$ has the property that $\int_{\mathbb{R}^n} fg\,dx = 0$ for each $f$ in $D = \{f \in L^p(\mathbb{R}^n) \cap L(\mathbb{R}^n) : \int_{\mathbb{R}^n} f\,dx = 0\}$. Prove that $g = 0$ a.e. As a consequence, show that $D$ is dense in $L^p(\mathbb{R}^n)$. Is a similar statement true if we consider a bounded interval $I$ instead of $\mathbb{R}^n$? Also, what can we say about the case $p = 1$?

4.25 Suppose $1 < p, q < \infty$ are conjugate indices, and $f \notin L^p(X, \mu)$. Show that the set $\{g \in L^q(X, \mu) : fg \in L(X, \mu) \text{ and } \int_X fg\,d\mu = 0\}$ is dense in $L^q(X, \mu)$.

4.26 Let $X = L^2(\mu) \times L^2(\mu) = \{(f, g) : f, g \in L^2(\mu)\}$ be the linear space normed by $\|(f, g)\| = \bigl(\|f\|_2^3 + \|g\|_2^3\bigr)^{1/3}$. Show that $X$ is a Banach space and describe $X^*$.

4.27 Let $X \ne \{0\}$ be a normed linear space. Show that $X^* \ne \{0\}$. Moreover, prove that if $X$ has $n$ linearly independent elements, so does $X^*$.

4.28 Show that if $Lx = Ly$ for every $L \in X^*$, then $x = y$.

4.29 Prove that if a normed linear space $X$ is reflexive, so is $X^*$.
4.30 Prove that the "completion" of the normed linear space described in Theorem 3.5 is unique up to isomorphisms.

4.31 Let $Y$ be a closed subspace of a normed linear space $X$. Show that if every $L \in X^*$ which vanishes on $Y$ vanishes also on $X$, then $Y = X$.

4.32 Let $X$ be a normed linear space. A sequence $\{x_n\} \subseteq X$ is said to converge weakly to an element $x \in X$ if $\lim_{n\to\infty} Lx_n = Lx$ for all $L \in X^*$. Prove that no sequence can have two distinct weak limits. Further, a sequence $\{x_n\} \subseteq X$ is said to be weakly Cauchy if $\{Lx_n\}$ is a Cauchy (scalar) sequence for every $L \in X^*$, and $X$ is said to be weakly sequentially complete if every weakly Cauchy sequence converges weakly. Prove that if $X$ is weakly sequentially complete, then it is complete.

4.33 Prove that a reflexive Banach space is weakly sequentially complete.

4.34 Show that any closed subspace of a weakly sequentially complete Banach space is itself weakly sequentially complete.

4.35 Show that $\ell^1$ is weakly sequentially complete.

4.36 Describe a normalization of BV functions that allows for the identification of the dual of $C(I)$.

4.37 If $I$ is a compact interval of $\mathbb{R}^n$ and $\mu$ is a finite Borel measure on $I$, then $Lf = \int_I f\,d\mu$ is a positive bounded linear functional on $C(I)$. Positive here means that $Lf \ge 0$ whenever $f \ge 0$. Prove now the following result, a particular case of the so-called Riesz Representation Theorem: Suppose $I$ is a compact interval of $\mathbb{R}^n$ and $L$ is a positive bounded linear functional on $C(I)$. Then there is a unique Borel measure $\mu$ such that $Lf = \int_I f\,d\mu$ for every $f \in C(I)$.

4.38 Fix an integer $N$. Does there exist a function $f$ which is BV in $I = [0,1]$ such that $\int_0^1 p(x)\,df(x) = \sum_{n=1}^{N} p^{(n)}(n/N)$ for all polynomials $p$ of degree less than or equal to $N$?

4.39 Let $I = [0,1]$ and consider the sequence $\{f_n\} \subset C(I)$ defined by $f_n(x) = nx$ if $0 \le x \le 1/n$, $= 2 - nx$ if $1/n \le x \le 2/n$, and $= 0$ otherwise. Show that $\{f_n\}$ converges weakly to 0 in $C(I)$, but that $\lim_{n\to\infty} \|f_n\| \ne 0$.
CHAPTER XV

The Basic Principles

In this chapter we consider the three basic principles concerning continuous linear transformations that provide the foundation for many results in linear analysis. These principles are: The Uniform Boundedness Principle, The Open Mapping Theorem, and the Closed Graph Theorem.

1. THE BAIRE CATEGORY THEOREM
Baire's theorem concerning the structure of complete metric spaces is an essential ingredient in proving the validity of the basic principles alluded to above. To state it we need some definitions. Let $(X, d)$ be a metric space. A set $E \subseteq X$ is said to be nowhere dense if its closure $\overline{E}$ has empty interior. The sets of first category in $X$ are those that are countable unions of nowhere dense sets; these sets are also called meager. All other sets are said to be of second category in $X$, or nonmeager. For instance, the rational numbers $\mathbb{Q}$ are of first category in $\mathbb{R}$, and the irrational numbers are of second category in $\mathbb{R}$. We begin by proving

Theorem 1.1 (Baire's Category Theorem). A complete metric space $X$ is of second category in itself.

Proof. Suppose, to the contrary, that $X$ is of first category in itself and let
$$X = \bigcup_{n=1}^{\infty} X_n, \quad X_n \text{ nowhere dense, all } n.$$
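Take a point $x_0$ in $X$ and consider the (nonempty) open ball $B(x_0, 1)$ centered at $x_0$ of radius 1. Since the interior of $\overline{X_1}$ is empty, $\overline{X_1}$ does not contain $B(x_0, 1)$; let then $x_1$ be a point in $B(x_0, 1) \setminus \overline{X_1}$ and $0 < r_1 < 1/2$ be such that
$$\overline{B(x_1, r_1)} \subset B(x_0, 1), \quad\text{and}\quad \overline{B(x_1, r_1)} \cap X_1 = \emptyset.$$
Similarly, since $X_2$ is nowhere dense, $\overline{X_2}$ does not contain $B(x_1, r_1)$ and, as before, there are a point $x_2 \in B(x_1, r_1) \setminus \overline{X_2}$ and $0 < r_2 < 1/4$ such that
$$\overline{B(x_2, r_2)} \subset B(x_1, r_1), \quad\text{and}\quad \overline{B(x_2, r_2)} \cap X_2 = \emptyset.$$
Continuing in this fashion step by step we get a decreasing sequence of closed balls $\{\overline{B(x_n, r_n)}\}$ with the property that
$$\overline{B(x_n, r_n)} \cap X_n = \emptyset, \quad 0 < r_n < 1/2^n, \quad \text{all } n.$$
Now, by a well-known result in the theory of metric spaces, actually an extension of the Nested Sequence Theorem on the real line, since the $\overline{B(x_n, r_n)}$'s form a monotone decreasing sequence of nonempty closed sets whose diameters tend to 0, and since $(X, d)$ is complete, there exists one, and only one, point $x \in X$ so that
$$\bigcap_{n=1}^{\infty} \overline{B(x_n, r_n)} = \{x\}.$$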
By construction $x \notin X_n$ for all $n$; thus $x \notin \bigcup_n X_n = X$, and this is a contradiction. ∎
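The nested construction in the proof can be imitated on $X = [0,1]$ with each $X_n$ a single point. The sketch below is a simplified variant, not the argument of the text: it assumes a finite list of rationals, and it shrinks closed intervals by thirds instead of choosing balls of radius less than $1/2^n$. The function name is hypothetical.

```python
from fractions import Fraction


def avoid_points(points):
    """Shrink [0, 1] to nested closed intervals, the n-th one missing points[n-1]."""
    a, b = Fraction(0), Fraction(1)
    for q in points:
        third = (b - a) / 3
        # The closed left and right thirds are disjoint, so at least one misses q.
        if not (a <= q <= a + third):
            b = a + third          # keep the left third
        else:
            a = b - third          # q lies in the left third; keep the right third
    return a, b


# Hypothetical enumeration: rationals in [0, 1] with denominator at most 6 (with repeats).
points = [Fraction(p, d) for d in range(1, 7) for p in range(0, d + 1)]
a, b = avoid_points(points)
print(a, b, b - a)
```

Every point of the final interval avoids all the listed rationals, and the lengths decrease geometrically; with an infinite enumeration, completeness is what supplies the limit point $x$.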
Theorem 1.1 is often cast in the following form: If $O_n = X \setminus \overline{X_n}$ denotes the complement of $\overline{X_n}$, then each $O_n$ is open and dense in $X$, and the conclusion of Baire's Category Theorem is that $\bigcap_n O_n \ne \emptyset$. More precisely, the intersection of every countable family of dense open subsets of $X$ is dense in $X$.

The Baire Category Theorem is useful in proving that a set is nonempty. In fact, the category method furnishes a whole class of examples, and it often makes it possible to construct an explicit example by successive approximations. We exemplify this by showing that, in the sense of category, almost all continuous functions are nowhere differentiable. In fact, as we prove below, it is exceptional for a continuous function to have a finite one-sided derivative anywhere in an interval.
To make this precise let $I = [0,1]$, consider $C(I)$ with the uniform metric, and let $E_n$ denote the class of functions $f \in C(I)$ such that for some $x$ in $[0, 1 - 1/n]$ we have $|f(x+h) - f(x)| \le nh$, all 0

0, put $\varepsilon_n = \varepsilon/2^n$, $n = 1, 2, \dots$, and for each $n$ let $\eta_n$ be the choice of $\eta$ corresponding to $\varepsilon_n$ in (4.5) above. Whence
$$\overline{T(B_{\varepsilon_n})} \supseteq B_{\eta_n}, \quad n = 1, 2, \dots \tag{4.6}$$
Clearly we may, and do, assume that $\lim_{n\to\infty} \eta_n = 0$.

Put now $\eta = \eta_1$, and suppose $y \in B_\eta$. By (4.6) also $y \in \overline{T(B_{\varepsilon_1})}$, and consequently, $y$ can be approximated as close as we want by points in $T(B_{\varepsilon_1})$. In particular, there exists $x_1 \in B_{\varepsilon_1}$ such that $\|y - Tx_1\| < \eta_2$. In this case, since $y - Tx_1 \in B_{\eta_2}$, we can find $x_2 \in B_{\varepsilon_2}$ such that $\|y - Tx_1 - Tx_2\| < \eta_3$. In general, having chosen $x_i \in B_{\varepsilon_i}$, $1 \le i \le k$, pick $x_{k+1} \in B_{\varepsilon_{k+1}}$ with the property that
$$\|y - Tx_1 - \cdots - Tx_{k+1}\| < \eta_{k+2}. \tag{4.7}$$
We claim that $\sum x_k$ converges to a point $x \in B_\varepsilon$, and that $y = Tx$. If this is the case, then we have $T(B_\varepsilon) \supseteq B_\eta$, and the second assertion is also true.
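To see that $\sum x_k$ converges, since $X$ is complete it suffices to check that $\sum \|x_k\| < \infty$. Now, since $x_k \in B_{\varepsilon_k}$ for all $k$, it readily follows that $\sum_{k=1}^{\infty} \|x_k\| < \sum_{k=1}^{\infty} \varepsilon/2^k = \varepsilon$. Whence $\sum_{k=1}^{\infty} x_k$ converges to an element $x \in X$ with $\|x\| < \varepsilon$. Also, by the continuity of $T$ we get that $\sum_k Tx_k = T\bigl(\sum_k x_k\bigr) = Tx$, and, since $\lim_{k\to\infty} \eta_k = 0$, from (4.7) it is clear that
$$\lim_{k\to\infty} \Bigl\| y - \sum_{i=1}^{k} Tx_i \Bigr\| = 0.$$
Thus $y = Tx$, and we have finished. ∎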
A word about the hypothesis of Theorem 4.1. The assumption that T is continuous is not essential, cf. 6.29 below, and the assumption that T is onto may be replaced by the assumption that the range R(T) of T is of second category in Y. On the other hand, this last assumption, as well as the completeness of X, are necessary for the map T to be open. We postpone the presentation of specific examples until after the proof of the Closed Graph Theorem. Now, concerning the inverses, we have
Corollary 4.2. Let $X, Y$ be Banach spaces and assume $T \in B(X, Y)$ is a one-to-one mapping from $X$ onto $Y$. Then $T^{-1}$ is a well-defined bounded linear mapping from $Y$ onto $X$.

Proof. By Theorem 4.1 there exists $\eta > 0$ such that
$$T(B_1) \supseteq B_\eta. \tag{4.8}$$
Now, since $T^{-1}$ is well-defined, (4.8) is equivalent to
$$B_1 \supseteq T^{-1}(B_\eta). \tag{4.9}$$
Thus, if $0 \ne y \in Y$, we have $(\eta/2\|y\|)\,y \in B_\eta$, and by (4.9) it follows that
$$\bigl\|T^{-1}\bigl((\eta/2\|y\|)\,y\bigr)\bigr\| \le 1.$$
By the linearity of $T^{-1}$ we get $\|T^{-1}y\| \le (2/\eta)\|y\|$, and consequently, $T^{-1} \in B(Y, X)$, and $\|T^{-1}\| \le 2/\eta$. ∎
Corollary 4.3. Suppose that the linear space $X$ is normed by $\|\cdot\|$ and by $\|\cdot\|_1$, and that, endowed with both norms, $X$ is complete. Then if for some constant $c$ we have
$$\|x\| \le c\,\|x\|_1, \quad \text{all } x \in X, \tag{4.10}$$
there is a constant $k$ such that
$$\|x\|_1 \le k\,\|x\|, \quad \text{all } x \in X. \tag{4.11}$$
More precisely, the norms are equivalent.
Proof. Let $T$ denote the identity map from $X$, normed by $\|\cdot\|_1$, into $X$, normed by $\|\cdot\|$; clearly $T$ is linear, one-to-one and onto, and, by (4.10), it is also continuous. By Corollary 4.2, $T^{-1}$, which is again the identity map, now from $X$ normed by $\|\cdot\|$ onto $X$ normed by $\|\cdot\|_1$, is also continuous, and (4.11) holds. ∎

The completeness of $X$ under both norms is essential for (4.11) to be true. Referring to the construction following Proposition 2.4 in Chapter XIV, if $X$ is a Banach space, and if we introduce the norm $\|\cdot\|_\infty$ in $X$, then (4.10) holds. Now, if (4.11) were to hold, then $X$ normed by $\|\cdot\|_\infty$ would become a complete normed space, which it is not.
5. THE CLOSED GRAPH THEOREM

Many important operators in Analysis enjoy the following property: They are well-defined on a dense subspace of a normed linear space $X$, and yet fail to be continuous. For instance, put $I = [0,1]$, and let $X = Y = C(I)$, and $X_1 = C^1(I) \subset X$; $X_1$ is dense in the uniform norm of $X$, a proof of this will be given in Corollary 2.3 in Chapter XVII. Consider now the linear operator $T\colon X_1 \to Y$ given by
$$Tf = f', \quad\text{or}\quad Tf(x) = f'(x), \ \text{all } x \in I.$$
Thus $T$ is a densely defined operator, and it is clear that it is not bounded since the sequence $\{f_n\}$ consisting of the functions
$$f_n(x) = x^n, \quad n = 1, 2, \dots$$
satisfies $\|Tf_n\| = n$ and $\|f_n\| = 1$ for all $n$. The challenge is to incorporate operators such as $T$ into the theory we are developing, and to discover what properties they satisfy.

Referring to the differentiation operator $T$, we are interested in considering sequences $\{f_n\} \subseteq C^1(I)$, $f_n \to f \in C(I)$, and the corresponding sequences $\{Tf_n\} \subseteq C(I)$. As observed above, $\{Tf_n\}$ need not converge, but when it does, i.e., if $\lim_{n\to\infty} Tf_n = g \in C(I)$, then the following is true: Since the sequence $\{f_n'\}$ converges uniformly to $g$ on $I$, the sequence $\{f_n\}$ converges uniformly to an antiderivative of $g$ on $I$. But since by assumption also $\{f_n\}$ converges uniformly to $f$ on $I$, it follows that $f \in C^1(I)$ and that $Tf = g$.

Because of the importance of this example we formalize these considerations into a definition. Let $X, Y$ be normed spaces, and let $T\colon D(T) \to Y$ be a linear mapping. We say that $T$ is closed in $X$, if for any sequence $\{x_n\} \subseteq D(T)$,
$$\lim_{n\to\infty} x_n = x, \quad\text{and}\quad \lim_{n\to\infty} Tx_n = y \tag{5.1}$$
imply
$$x \in D(T), \quad\text{and}\quad Tx = y. \tag{5.2}$$
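The unboundedness of the differentiation operator discussed above is easy to observe numerically. The following is a small sketch, assuming the sup norm is estimated on a uniform grid of $[0,1]$ that contains the endpoint 1, where both $x^n$ and its derivative attain their maxima; the helper name and grid size are illustrative assumptions.

```python
def sup_norm(func, samples=1001):
    """Maximum of |func| over a uniform grid on [0, 1] including both endpoints."""
    return max(abs(func(k / (samples - 1))) for k in range(samples))


ratios = {}
for n in (1, 5, 25, 125):
    f_n = lambda x, n=n: x ** n               # f_n(x) = x^n
    df_n = lambda x, n=n: n * x ** (n - 1)    # (T f_n)(x) = n x^(n-1)
    ratios[n] = sup_norm(df_n) / sup_norm(f_n)

print(ratios)
```

The ratio $\|Tf_n\| / \|f_n\|$ grows with $n$, so no constant $c$ can satisfy $\|Tf\| \le c\,\|f\|$ on $C^1(I)$.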
As the differentiation mapping shows, not all closed operators are continuous. The opposite is also true; namely, not all continuous operators are closed. For instance, if $X_1$ is a proper dense subspace of a normed space $X = Y$, then the identity map $T\colon X_1 \to Y$ is obviously bounded, but not closed. The Closed Graph Theorem, a close relative of the Open Mapping Theorem, establishes when a closed mapping is bounded. In order to prove it we find it convenient to consider a more "geometric" setting.

First a definition. Let $X, Y$ be normed linear spaces and let $X \times Y$ be the linear space normed by $\|(x, y)\| = \|x\| + \|y\|$. Given a linear mapping $T\colon D(T) \to Y$, the graph $G(T)$ of $T$ is the set
$$G(T) = \{(x, Tx) : x \in D(T)\} \subseteq X \times Y.$$
Since $T$ is linear, $G(T)$ is a linear subspace of $X \times Y$. Now, when $T$ is closed, (5.1) and (5.2) imply that $G(T)$ is a closed subspace of $X \times Y$. The converse is also true: If $G(T)$ is closed in $X \times Y$, then $T$ is closed in $X$. Thus, the concepts $T$ closed and $G(T)$ closed are interchangeable. It is also clear that if $D(T)$ is a closed subspace of $X$ and $T$ is continuous, then $T$ is closed in $X$. The remarkable fact is that for Banach spaces the converse to this statement is also true. More precisely, we have
Theorem 5.1 (Closed Graph Theorem). Let $X, Y$ be Banach spaces, and suppose $T\colon X \to Y$ is linear. If $T$ is closed in $X$, then $T$ is continuous in $X$.

Proof. Since $X \times Y$ is a complete normed space, and since by assumption $G(T)$ is a closed subspace of $X \times Y$, $G(T)$ is also a Banach space in the norm induced by that of $X \times Y$. Consider now the (projection) linear mapping $P\colon G(T) \to X$ given by
$$P((x, Tx)) = x, \quad x \in X.$$
Note that in addition to being linear, $P$ is one-to-one and onto $X$. Moreover, since also
$$\|P((x, Tx))\| = \|x\| \le \|x\| + \|Tx\| = \|(x, Tx)\|,$$
$P$ is also bounded. Thus by Corollary 4.2, the inverse $P^{-1}\colon X \to G(T)$ of $P$ given by
$$P^{-1}x = (x, Tx), \quad x \in X,$$
is also bounded. Specifically, there exists a constant $c$ such that
$$\|(x, Tx)\| = \|x\| + \|Tx\| \le c\,\|x\|, \quad x \in X.$$
This clearly implies that $\|Tx\| \le c\,\|x\|$, i.e., that $T$ is bounded. ∎
For the validity of the Closed Graph Theorem it is essential that both the domain $X$ and the target space $Y$ be complete, as may be seen by the following examples. As pointed out above, the differential operator $T\colon C^1(I) \to C(I)$ is closed but not bounded; in this case the domain $X = C^1(I)$ is not complete. An example along similar lines, roughly speaking it corresponds to differentiation of Fourier series, is the following: Let $\ell^1 = \{(a_n) : \|(a_n)\|_1 = \sum_{n=1}^{\infty} |a_n| < \infty\}$, and
Since $X$ is a proper dense subspace of $\ell^1$, it is not complete. Let now $T\colon X \to \ell^1$ be the mapping defined by
It is easy to check that $T$ is well-defined and closed, and since for the sequence $e_n \in \ell^1$ which has a 1 in the $n$th position and zeros elsewhere we have
it follows that $T$ is not bounded. Now, since $T$ is also one-to-one and onto $\ell^1$, it has a well-defined inverse $T^{-1}\colon \ell^1 \to X$; in fact, we have
Observe that indeed $T^{-1}$ is defined on the whole of $\ell^1$ since $x \in \ell^1$ implies $T^{-1}x \in X$, and
In fact, the above remark shows that $T^{-1}$ is bounded, and putting $x = e_1$ there we also get that $\|T^{-1}\| = 1$. Moreover, since $x = T^{-1}(Tx)$, it
follows that $T^{-1}$ is onto $X$. Now, $T^{-1}$ is not open, for if it were open, then $(T^{-1})^{-1} = T$ would be continuous, and, as we saw above, this is not the case. Thus, for the validity of the Open Mapping Theorem it is essential that the target space be complete.

Consider next an infinite dimensional Banach space $X$, let $H$ be a (necessarily infinite) Hamel basis for $X$ such that $\|h\| = 1$ for all $h \in H$, and let $\|\cdot\|_1$ be the norm on $X$ given by
$$\|x\|_1 = \sum_{i=1}^{n} |a_i|, \quad x = \sum_{i=1}^{n} a_i h_i, \quad h_i \in H,\ 1 \le i \le n.$$
Let $Y$ denote the linear space $X$ endowed with the norm $\|\cdot\|_1$, and let $T\colon X \to Y$ be the identity map. $T$ is one-to-one, onto and, as it is readily verified, closed. However $T$ is not bounded, for otherwise the fact that $X$ is complete would imply that $Y$ is also complete, and this is not the case. Thus, for the Closed Graph Theorem to be true it is essential that the target space $Y$ be complete. Now, $T^{-1}\colon Y \to X$ is also the identity, and as such it is one-to-one and onto. Moreover, since
$$\|x\| \le \|x\|_1, \quad \text{all } x \in X,$$
$T^{-1}$ is also bounded. However, $T^{-1}$ is not open, for if it were open, then $(T^{-1})^{-1} = T$ would be continuous, and as pointed out above, this is not the case. Thus, for the Open Mapping Theorem to be true, the domain must be complete.
6. PROBLEMS AND QUESTIONS

6.1 Show that any continuous function on $[0,1]$ can be approximated uniformly and arbitrarily closely by a piecewise linear continuous function.

6.2 Let $\phi(x)$ denote the function that assigns to each real $x$ the distance from $x$ to the nearest integer. Prove that for appropriately chosen sequences $(c_n)$ and $(k_n)$ the function given by the uniformly convergent series $\sum_{n=1}^{\infty} c_n \phi(k_n x)$ is nowhere differentiable.

6.3 Let $X, Y$ be Banach spaces and $L(x, y)$ be a functional on $X \times Y$, continuous and linear in each variable separately. Prove that $L$ is continuous at $(0,0)$, and consequently everywhere.
6.4 If $X$ is a finite dimensional normed linear space, prove that every linear operator $T\colon X \to X$ is bounded.

6.5 Let $X, Y \ne \{0\}$ be normed linear spaces and suppose the dimension of $X$ is infinite. Show that there is at least one unbounded linear operator $T\colon X \to Y$.

6.6 If $T \in B(X, Y)$, $T \ne 0$, and $\|x\| < 1$, then $\|Tx\| < \|T\|$. Is it also true that $\|Tx\| < \|x\|$?

6.7 Suppose $T, T_n \in B(X, Y)$, $n = 1, 2, \dots$, $\sup \|T_n\| < \infty$, and $\lim_{n\to\infty} \|T_n x - Tx\| = 0$ for every $x$ in a dense subset of $X$. Does it follow that $\lim_{n\to\infty} \|T_n - T\| = 0$?

6.8 Suppose $0 \ne T \in B(X)$ and $\{x_n\} \subseteq X$ has the property that $\lim_{n\to\infty} \|x_n\| = \infty$. Does it follow that $\lim_{n\to\infty} \|Tx_n\| = \infty$?

6.9 Suppose $T_1, T_2 \in B(X)$. Show that $T_1 T_2 \in B(X)$, and

6.10 Prove that if $X$ is a Banach space and $T \in B(X)$ and $\|T\| < 1$, then the "geometric series" $I + T + \cdots + T^n + \cdots$ converges in $B(X)$. What does it converge to?

6.11 Referring to 6.10, the condition $\|T\| < 1$ is not necessary for $I + T + \cdots + T^n + \cdots$ to converge in $B(X)$. For, suppose
$$\lim_{n\to\infty} \sqrt[n]{\|T^n\|} = L$$
exists. Show that if $L < 1$ the above series converges and if $L > 1$ it does not. Further, prove that a necessary and sufficient condition for the series to converge is that for some $k$ we have $\|T^k\| < 1$.

6.12 (Banach) Let $X$ be a Banach space and suppose $T \in B(X)$ is such that $\|T\| \le \eta < 1$. Prove that the operator $I - T$ has a continuous inverse $(I - T)^{-1}$ and $\|(I - T)^{-1}\| \le 1/(1 - \eta)$.

6.13 Let $T_0 \in B(X, Y)$, where $X$ and $Y$ are Banach spaces, and suppose $T_0$ has a bounded inverse $T_0^{-1} \in B(Y, X)$. Show that if an operator $T \in B(Y, X)$ satisfies $\|T\| < 1/\|T_0^{-1}\|$, then the operator $U = T_0 + T\colon Y \to X$ has a continuous inverse and
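6.14 Let $X, Y$ be normed linear spaces and let $T_0$ be a mapping from a subset $M$ of $X$ into $Y$. Show that a necessary and sufficient condition for $T_0$ to have a bounded linear extension to the span of $M$ is that there exists a constant $k$ such that
$$\Big\| \sum_{i=1}^{m} \lambda_i T_0 x_i \Big\| \le k\, \Big\| \sum_{i=1}^{m} \lambda_i x_i \Big\|$$
for any $x_1, \ldots, x_m$ in $M$ and scalars $\lambda_1, \ldots, \lambda_m$.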
6.15 Let $X, Y$ be normed linear spaces and suppose $Y$ is complete. Show that every continuous linear operator $T_0$ from a subset $M$ of $X$ into $Y$ has a unique continuous linear extension $T$ to the closure of $M$ into $Y$, and $\|T\| = \|T_0\|$. In particular, prove that if a continuous linear operator $T$ from a normed linear space $X$ into a Banach space $Y$ maps a dense subset of $X$ into $0$, then $Tx = 0$ for all $x \in X$.
6.16 Let $X, Y$ be Banach spaces and $T: X \to Y$, $T$ linear. Prove that if $L \circ T \in X^*$ for every $L \in Y^*$, then $T \in B(X,Y)$.
6.17 Suppose $X, Y$ are Banach spaces and $\{T_n\} \subseteq B(X,Y)$. Prove that if for each $L \in Y^*$ we have $\sup_n |L T_n x| < \infty$, all $x \in X$, then $\sup_n \|T_n\| < \infty$.
6.18 Assume $\{T_n\} \subseteq B(X)$ and $\lim_{n\to\infty} T_n x = Tx$ exists for each $x$ in $X$. Show that $T$ is a bounded linear operator on $X$ and that $\|T\| \le \limsup_{n\to\infty} \|T_n\|$.
6.19 Let $X$ be a Banach space, and $T, T_1, T_2, \ldots$ be bounded linear operators defined on $X$ with the property that $\lim_{n\to\infty} T_n x = Tx$ for all $x \in X$. Prove that there exists a constant $c > 0$ such that $\sup_n \|T_n\| \le c$.
6.20 As an application of the Uniform Boundedness Principle show that the space $X$ of polynomials $p(x) = \sum_{n=0}^{\infty} a_n x^n$, where $a_n = 0$ for all but finitely many $n$'s, normed by $\|p\| = \max_n |a_n|$, is not complete.
6.21 Let $X$ be a normed linear space, $Y$ a Banach space, and suppose $T \in B(X,Y)$. If $N$ denotes the null space of $T$ we may define a map $T^*: X/N \to Y$ as follows: For each class $x + N$, let $T^*(x + N) = Tx$. Prove that $T^* \in B(X/N, Y)$, and that if $X$ is a Banach space, then $T^*$ is an isomorphism.
6.22 Let $X$ be an infinite-dimensional Banach space, and $\{x_n\}$ a linearly independent set in $X$. Show that for each $n = 1,2,\ldots$, the linear span of $\{x_1, \ldots, x_n\}$ is a nowhere dense subset of $X$. As a consequence of this result prove that the dimension of every infinite-dimensional Banach space is at least $\aleph_1$.
6.23 Prove that if $X$ is an infinite-dimensional Banach space, there is an embedding of $\ell^\infty$ into $X$.
6.24 Prove that every separable Banach space is isomorphic to some quotient space of $\ell^1$.
6.25 Let $1 < p < \infty$. If $\sum_n a_n b_n$ converges for every sequence $(b_n)$ such that $\sum_n |b_n|^p < \infty$, prove that $\sum_n |a_n|^q < \infty$, where $q$ is the index conjugate to $p$.
6.26 Prove that there is no sequence of positive real numbers $(a_n)$ such that $\sum_n a_n |b_n|$ converges iff the sequence $(b_n)$ is bounded.
6.27 Let $X$ be a Banach space, and assume $Y$ and $Z$ are closed subspaces of $X$. If each $x \in X$ has a unique representation of the form $x = y + z$ with $y \in Y$ and $z \in Z$, show that there exists a constant $c$ such that for all $x = y + z$ we have $\|y\|, \|z\| \le c\|x\|$.
6.28 Let $I = [0,1]$. Show that $L^p(I)$ is properly contained in $L(I)$ for each $p > 1$, and conclude that in the metric of $L(I)$, $L^p(I)$ is a set of first category in $L(I)$.
6.29 Let $X$ be a Banach space and $Y$ a normed linear space of second category. Prove that if the linear mapping $T: X \to Y$ is closed and onto, then $T$ takes open sets into open sets.
6.30 Let $I$ be a compact interval of the line, and $K$ a closed subinterval of $I$ with the following property: For every function $g \in C(K)$ there exists a function $f \in C(I)$ such that $f|_K = g$. Show that there exists a constant $c > 0$ with the property that for every $g \in C(K)$ there exists $f \in C(I)$ such that $f|_K = g$, and $\max_I |f| \le c \max_K |g|$.
6.31 Let $I = [0,1]$ and suppose $A$ is a closed subspace of $C(I)$. Suppose that for each $f \in A$, $Tf = \phi f \in A$, where $\phi$ is a real-valued function defined on $I$. Show that $T$ is continuous in the norm of $A$. Is $\phi$ necessarily continuous?
6.32 Let $X$ be a Banach space, and suppose that $T_1, T_2$ are linear operators from $X$ into itself, and that $T_2 \in B(X)$ is also one-to-one. Show that $T_1 \in B(X)$ iff $T_2 T_1 \in B(X)$.
6.33 Let $I = [0,1]$, $1 < p < \infty$, and suppose $T$ is the linear operator defined on $L^p(I)$ by $Tf(x) = \int_I k(x,y) f(y)\,dy$, $x \in I$. Specifically, assume that for almost every $x \in I$, the function $k(x,y)f(y)$ is integrable as a function of $y \in I$, and $Tf \in L^p(I)$. Prove that $T$ is bounded.
6.34 Let $X, Y$ be normed linear spaces and suppose $T: X \to Y$. Prove that if $T$ is closed and one-to-one, then $T^{-1}$ is also closed. Also, if $T \in B(X,Y)$ and the domain $D(T)$ of definition of $T$ is closed, then $T$ is closed.
6.35 Suppose $X, Y$ are Banach spaces, and let $T \in B(X,Y)$. Show that $R(T)$ is closed in $Y$ iff there exists a constant $c > 0$ such that $\inf\{\|x - y\| : Ty = 0\} \le c\|Tx\|$ for all $x \in X$.
6.36 Let $Y$ be a subspace of a normed linear space $X$. A mapping $P \in B(X,Y)$ is said to be a projection of $X$ onto $Y$ if $P$ maps $X$ onto $Y$ and $P^2 = P$. Suppose now $Y$ is a closed subspace of a Banach space $X$. Show that there exists a projection $P$ of $X$ onto $Y$ iff there exists a closed subspace $Z$ of $X$ such that $X = Y \oplus Z$, i.e., $Y \cap Z = \{0\}$ and every $x \in X$ can be written uniquely as $x = y + z$ with $y$ in $Y$ and $z$ in $Z$. If this is the case, there also exists a constant $k > 0$ such that $\|y + z\| \ge k\|y\|$, $y \in Y$, $z \in Z$.
6.37 Let $X, Y$ be normed linear spaces and suppose $T: D(T) \subseteq X \to Y$. $T$ is said to be closable if there exists a linear extension of $T$ to all of $X$ which is closed in $X$; the domain $D(T)$ of $T$ is not required to be dense in $X$. Prove that the following are equivalent:
(i) $T$ is closable.
(ii) For any $y \ne 0$ in $Y$, $(0,y)$ is not in the closure of the graph of $T$.
(iii) $T$ has a minimal closed linear extension, i.e., there exists a closed linear extension $T^*$ of $T$ such that any closed linear extension of $T$ is a closed linear extension of $T^*$.
6.38 Let $X, Y$ be Banach spaces and $T: X \to Y$ be closed. Prove that $T$ has a bounded inverse iff $T$ is one-to-one and has a closed inverse.
CHAPTER
XVI
Hilbert Spaces
In this chapter we consider those normed spaces where, as in the case of the Euclidean space $\mathbb{R}^n$, there is an inner product which is connected with the norm by a simple relation: The square of the norm of an element is the inner product of that element with itself. Some examples to keep in mind are, of course, $\mathbb{R}^n$ and $\mathbb{C}^n$, and what we will have the occasion to verify is a universal model of a Hilbert space, $L^2(\mu)$.
1. THE GEOMETRY OF INNER PRODUCT SPACES
A complex vector space $X$ is said to be an inner product space provided there is a complex-valued map defined on $X \times X$, denoted by $\langle\cdot,\cdot\rangle$ and called an inner product on $X$, or plainly an inner product, which satisfies the following properties:
(i) $\langle x_1 + \lambda x_2, y\rangle = \langle x_1, y\rangle + \lambda\langle x_2, y\rangle$, all $x_1, x_2$ in $X$, and $\lambda \in \mathbb{C}$.
(ii) $\langle x, y\rangle = \overline{\langle y, x\rangle}$, all $x, y$ in $X$.
(iii) $\langle x, x\rangle \ge 0$, and $\langle x, x\rangle = 0$ iff $x = 0$.
Of course we may also consider real inner product spaces, and in this case we restrict our attention to $\lambda \in \mathbb{R}$, and require that $\langle\cdot,\cdot\rangle$ be real valued. Property (i) is known as the linearity of the inner product, and together with (ii) it implies "conjugate linearity" in the second variable, to wit,
(iv) $\langle x, \lambda y\rangle = \overline{\lambda}\langle x, y\rangle$, all $x, y \in X$ and $\lambda \in \mathbb{C}$.
An immediate consequence of these properties is that $\langle x, 0\rangle = \langle 0, x\rangle = 0$ for all $x \in X$. Inner product spaces satisfy an important inequality, which we prove next.
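Proposition 1.1 (The Cauchy-Schwarz Inequality). For any $x, y$ in $X$ we have
$$|\langle x, y\rangle|^2 \le \langle x, x\rangle \langle y, y\rangle. \tag{1.1}$$
Proof. Note that for any $x, y$ in $X$ and scalars $\lambda$, by (iii), (iv) and (ii) above it follows that
$$0 \le \langle x + \lambda y, x + \lambda y\rangle = \langle x, x\rangle + \langle x, \lambda y\rangle + \langle \lambda y, x\rangle + \langle \lambda y, \lambda y\rangle$$
$$= \langle x, x\rangle + \overline{\lambda}\langle x, y\rangle + \lambda\langle y, x\rangle + \lambda\overline{\lambda}\langle y, y\rangle$$
$$= \langle x, x\rangle + \overline{\lambda}\langle x, y\rangle + \lambda\overline{\langle x, y\rangle} + |\lambda|^2\langle y, y\rangle$$
$$= \langle x, x\rangle + 2\,\Re\big(\overline{\lambda}\langle x, y\rangle\big) + |\lambda|^2\langle y, y\rangle.$$
Now, if $y = 0$, (1.1) is trivially true since both sides there are $0$. On the other hand, if $y \ne 0$, then $\langle y, y\rangle > 0$, and putting $\lambda = -\langle x, y\rangle/\langle y, y\rangle$ the above estimate becomes
$$\langle x, x\rangle - \frac{2|\langle x, y\rangle|^2}{\langle y, y\rangle} + \frac{|\langle x, y\rangle|^2}{\langle y, y\rangle^2}\langle y, y\rangle = \langle x, x\rangle - \frac{|\langle x, y\rangle|^2}{\langle y, y\rangle} \ge 0,$$
which is clearly equivalent to (1.1). •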
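As a numerical illustration of the proof above, the following minimal sketch works in $\mathbb{C}^n$ with the standard inner product (linear in the first variable, conjugate linear in the second). The function names `inner`, `cauchy_schwarz_gap` and `value_at_optimal_lambda` are hypothetical helpers introduced only for this example; the sketch checks that the quadratic expression evaluated at $\lambda = -\langle x, y\rangle/\langle y, y\rangle$ agrees with $\langle x, x\rangle - |\langle x, y\rangle|^2/\langle y, y\rangle$.

```python
import random

def inner(x, y):
    """Standard inner product on C^n: linear in x, conjugate linear in y."""
    return sum(a * b.conjugate() for a, b in zip(x, y))

def cauchy_schwarz_gap(x, y):
    """<x,x><y,y> - |<x,y>|^2; inequality (1.1) says this is nonnegative."""
    return inner(x, x).real * inner(y, y).real - abs(inner(x, y)) ** 2

def value_at_optimal_lambda(x, y):
    """<x + lam*y, x + lam*y> at lam = -<x,y>/<y,y>; assumes y != 0."""
    lam = -inner(x, y) / inner(y, y)
    z = [a + lam * b for a, b in zip(x, y)]
    return inner(z, z).real

rng = random.Random(0)  # fixed seed so that the run is reproducible

def random_vector(n):
    """A vector in C^n with real and imaginary parts drawn from [-1, 1]."""
    return [complex(rng.uniform(-1, 1), rng.uniform(-1, 1)) for _ in range(n)]

x, y = random_vector(4), random_vector(4)
gap = cauchy_schwarz_gap(x, y)
# The proof shows the value at the optimal lambda is <x,x> - |<x,y>|^2 / <y,y>,
# which is the gap divided by <y,y>.
agrees = abs(value_at_optimal_lambda(x, y) - gap / inner(y, y).real) < 1e-12
print(gap >= 0, agrees)
```

When $x$ is a scalar multiple of $y$ the gap vanishes, which corresponds to the case of equality in (1.1).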
If in an inner product space $X$ we set
$$\|x\| = \sqrt{\langle x, x\rangle}, \quad x \in X, \tag{1.2}$$
then $X$ is turned into a normed space. Indeed, with the exception of the triangle inequality, all other properties of the norm follow at once from (1.2) and the properties of the inner product. As for the triangle inequality, given $x, y \in X$, observe that
$$\|x + y\|^2 = \langle x + y, x + y\rangle = \|x\|^2 + 2\,\Re\langle x, y\rangle + \|y\|^2, \tag{1.3}$$
which, by the Cauchy-Schwarz inequality, is dominated by
$$\|x\|^2 + 2\|x\|\|y\| + \|y\|^2 = (\|x\| + \|y\|)^2.$$
In other words, we get $\|x + y\|^2 \le (\|x\| + \|y\|)^2$, which is equivalent to the triangle inequality.
An easy consequence of the Cauchy-Schwarz inequality is the continuity of the inner product. More precisely, if $x_n \to x$ and $y_n \to y$, then $\langle x_n, y_n\rangle \to \langle x, y\rangle$. This observation follows from the estimate
$$|\langle x_n, y_n\rangle - \langle x, y\rangle| = |\langle x_n, y_n\rangle \pm \langle x_n, y\rangle - \langle x, y\rangle| \le |\langle x_n, y_n - y\rangle| + |\langle x_n - x, y\rangle| \le \|x_n\|\|y_n - y\| + \|x_n - x\|\|y\|,$$
and the fact that convergent sequences are bounded.
An inner product space $X$ endowed with the norm introduced in (1.2) is said to be a pre-Hilbert space. A complete pre-Hilbert space is called a Hilbert space. For instance, $L^2(\mu)$ is a Hilbert space with inner product given by
$$\langle f, g\rangle = \int_X f\,\overline{g}\,d\mu, \quad f, g \in L^2(\mu).$$
$\ell^2$ is the prototype of a Hilbert space. It was introduced by Hilbert in the early 1900's in his work on integral equations. The axiomatic definition of a Hilbert space was not given until much later, by J. von Neumann (1903-1957) in the mid 1920's, in a paper dealing with the mathematical foundations of Quantum Mechanics.
In what follows we restrict our attention to Hilbert spaces since, as we show next, every inner product space can be "completed", and the completion is a Hilbert space, unique up to isomorphisms. In the present context, a linear mapping $T: X \to Y$ from an inner product space $X$ onto another inner product space $Y$ over the same field of scalars is said to be an isomorphism if it preserves inner products, i.e.,
$$\langle Tx, Ty\rangle = \langle x, y\rangle, \quad \text{all } x, y \in X.$$
Thus, isomorphisms of inner product spaces preserve their whole structure, including inner products and norms.
Proposition 1.2. Suppose $X$ is an inner product space. Then there exist a Hilbert space $Y$ and an isomorphism $T$ of $X$ onto a dense subspace of $Y$. The space $Y$ is unique up to isomorphisms.
Proof. Because the proof follows along familiar lines we only sketch it. By Theorem 3.5 in Chapter XIV there exist a unique, up to isometries, Banach space $Y$ and an isometric (in the linear sense) isomorphism $T$ of $X$ onto a dense subspace of $Y$. Consider now the complex-valued map $\langle\cdot,\cdot\rangle_1$ defined on $Y \times Y$ as follows: If $x, y \in Y$, and $\{x_n\}$ and $\{y_n\}$ are sequences of elements in $X$ that converge in $Y$ to $x$ and $y$ respectively, let
$$\langle x, y\rangle_1 = \lim_{n\to\infty} \langle x_n, y_n\rangle.$$
By the continuity of the inner product it is not hard to see that $\langle\cdot,\cdot\rangle_1$ is well-defined, i.e., it is independent of the approximating sequences chosen, and that it is an inner product on $Y$. Also, by (1.2), it follows that $T$ is an isomorphism of $X$ onto a dense subspace of $Y$. •
The notion of Hilbert space is an immediate generalization of Euclidean space, so its "geometry" approaches Euclidean geometry more closely than that of other Banach spaces. For instance, the "parallelogram law" holds: For any $x, y \in X$ we have
$$\|x + y\|^2 + \|x - y\|^2 = 2\|x\|^2 + 2\|y\|^2. \tag{1.4}$$
Indeed, (1.4) follows at once by adding (1.3) and the expression we obtain replacing $y$ by $-y$ there. In fact, even more is true.
Proposition 1.3. A normed linear space $X$ is an inner product space iff the "parallelogram law" holds.
Proof. It only remains to check that if (1.4) holds, then $X$ is an inner product space. The proof of this result is entirely computational since we can exhibit explicitly the inner product $\langle\cdot,\cdot\rangle$ associated to the norm $\|\cdot\|$ in $X$: It is given by the expression
$$\langle x, y\rangle = \frac{1}{4}\big(\|x + y\|^2 - \|x - y\|^2\big) + \frac{i}{4}\big(\|x + iy\|^2 - \|x - iy\|^2\big). \tag{1.5}$$
When $X$ is a real normed space the second summand above is omitted. The details of the straightforward, and tedious, computation needed to verify the properties of the inner product are left to the reader. •
An interesting application of Proposition 1.3 is that the Lebesgue $L^p$ spaces are inner product spaces only if $p = 2$. For instance, in the case of $\ell^p$, consider $x = (1,1,0,0,\ldots)$, and $y = (1,-1,0,0,\ldots)$. We then have
$$\|x\|_p = \|y\|_p = 2^{1/p}, \quad \|x + y\|_p = \|x - y\|_p = 2,$$
and for these elements (1.4) holds iff $p = 2$.
Another important notion is that of orthogonality. Elements $x, y$ in an inner product space $X$ are said to be orthogonal, and we write $x \perp y$, when $\langle x, y\rangle = 0$. If $x \in X$ is orthogonal to each element of a subset $A$ of $X$, then $x$ is said to be orthogonal to $A$, and we write $x \perp A$. In this context we have
Proposition 1.4 (Pythagorean Theorem). Suppose $\{x_i\}_{i=1}^{n}$ is a collection of pairwise orthogonal elements of $X$. Then
$$\Big\|\sum_{i=1}^{n} x_i\Big\|^2 = \sum_{i=1}^{n} \|x_i\|^2. \tag{1.6}$$
Proof.
By definition, the left-hand side of (1.6) equals
$$\Big\langle \sum_{i=1}^{n} x_i, \sum_{j=1}^{n} x_j \Big\rangle = \sum_{i,j=1}^{n} \langle x_i, x_j\rangle.$$
Also, since the $x_i$'s are pairwise orthogonal, the sum on the right-hand side above equals $\sum_{i=1}^{n} \|x_i\|^2$. •
Our next goal is to explore whether, given a subspace $M$ of a Hilbert space $X$ and $x \in X \setminus M$, we can find $y \in M$ such that
$$d(x, M) = \inf\{\|x' - x\| : x' \in M\} = \|x - y\|. \tag{1.7}$$
This question is related to that of dropping a perpendicular from $x$ to $M$, or "projecting" $x$ onto $M$. Simple examples in $\mathbb{R}^2$ when $M$ is an open segment or an arc show that there may exist no points $y \in M$ which satisfy (1.7), or that there may exist infinitely many such $y$'s. The following result handles the difficulties raised by these examples.
Proposition 1.5 (Existence of the Minimizing Element). Let $X$ be an inner product space and $M$ a nonempty, complete, convex subset of $X$. Then for every $x \in X$ there exists a unique $y \in M$ such that
$$d(x, M) = \|x - y\|. \tag{1.8}$$
Proof. If $x \in M$, then (1.8) holds with $y = x$. Otherwise, if $x \notin M$, let $\{y_n\} \subseteq M$ be a minimizing sequence, i.e.,
$$\lim_{n\to\infty} \|x - y_n\| = d(x, M). \tag{1.9}$$
We claim that the sequence $\{y_n\}$ is Cauchy. Indeed, by the parallelogram law we have
$$\|y_n - y_m\|^2 = \|(y_n - x) - (y_m - x)\|^2 = 2\|y_n - x\|^2 + 2\|y_m - x\|^2 - \|(y_n - x) + (y_m - x)\|^2$$
$$= 2\|y_n - x\|^2 + 2\|y_m - x\|^2 - 4\|(y_n + y_m)/2 - x\|^2.$$
Now, since $M$ is convex it follows that $(y_n + y_m)/2 \in M$, and consequently, $d(x, M) \le \|(y_n + y_m)/2 - x\|$. Thus the above estimate becomes
$$\|y_n - y_m\|^2 \le 2\|y_n - x\|^2 + 2\|y_m - x\|^2 - 4\,d(x, M)^2, \tag{1.10}$$
and, in view of the choice of the $y_n$'s, the right-hand side of (1.10) goes to $0$ as $n, m \to \infty$. But this implies that the sequence $\{y_n\} \subseteq M$ is Cauchy and, since $M$ is complete, that it converges to a limit $y \in M$, say. Furthermore, passing to the limit in (1.9) it follows at once that (1.8) holds.
It only remains to check the uniqueness of $y$. Suppose $y$ and $y'$ are elements of $M$ that satisfy (1.8). Since (1.10) is actually true for arbitrary elements of $M$, by setting $y_n = y$ and $y_m = y'$ there, we get that $\|y - y'\|^2 \le 0$. Thus $y = y'$. •
Turning from arbitrary convex sets to subspaces we obtain the result alluded to above concerning projections. But first a definition. Given a subset $A$ of an inner product space $X$, let the orthogonal complement $A^\perp$ of $A$ be the set of all elements of $X$ orthogonal to $A$, to wit,
$$A^\perp = \{x \in X : x \perp y \text{ for all } y \in A\}. \tag{1.11}$$
An elementary and important property of the orthogonal complement is
Proposition 1.6. $A^\perp$ is a closed subspace of $X$.
Proof. First observe that if $x_1, x_2 \in A^\perp$, $\lambda$ is a scalar and $y \in A$, then we have
$$\langle x_1 + \lambda x_2, y\rangle = \langle x_1, y\rangle + \lambda\langle x_2, y\rangle = 0,$$
and consequently, $x_1 + \lambda x_2 \in A^\perp$. Next suppose that $\{x_n\} \subseteq A^\perp$ and $\lim_{n\to\infty} x_n = x$. Now, if $y \in A$, we get that
$$|\langle x, y\rangle| = |\langle x - x_n, y\rangle| \le \|x - x_n\|\|y\| \to 0 \quad \text{as } n \to \infty.$$
Thus, $x \in A^\perp$, and $A^\perp$ is closed. •
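Proposition 1.5 can be illustrated in the Euclidean space $\mathbb{R}^n$ with a concrete closed convex set. The sketch below assumes $M$ is the box $[0,1]^n$, for which the minimizing element is obtained by clamping each coordinate; the helper names `dist` and `nearest_in_box` are hypothetical and chosen only for this example. Random points of $M$ are then compared against the candidate minimizer.

```python
import random

def dist(x, y):
    """Euclidean distance in R^n."""
    return sum((a - b) ** 2 for a, b in zip(x, y)) ** 0.5

def nearest_in_box(x, lo=0.0, hi=1.0):
    """Minimizing element of the closed convex box [lo, hi]^n for the point x.

    The squared distance is a sum over coordinates, so each coordinate can be
    minimized separately by clamping it to the interval [lo, hi].
    """
    return [min(max(a, lo), hi) for a in x]

rng = random.Random(1)  # fixed seed so that the run is reproducible
x = [2.0, -0.5, 0.3]
y = nearest_in_box(x)

# Sample points of the box; none of them should be closer to x than y is.
samples = [[rng.uniform(0.0, 1.0) for _ in x] for _ in range(1000)]
no_sample_is_closer = all(dist(x, y) <= dist(x, s) + 1e-12 for s in samples)
print(y, no_sample_is_closer)
```

If the box is replaced by a set that is not convex, or not closed, the clamping argument no longer applies, in line with the examples in $\mathbb{R}^2$ mentioned before the proposition.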
We are now ready to prove a result of fundamental importance in the theory of Hilbert spaces, the projection theorem.
Theorem 1.7. Let $X$ be a Hilbert space and $M$ a complete subspace of $X$. Then every element $x \in X$ can be expressed in the form
$$x = x_1 + x_2, \quad x_1 \in M,\ x_2 \in M^\perp. \tag{1.12}$$
Furthermore, the representation is unique, and
$$\|x\|^2 = \|x_1\|^2 + \|x_2\|^2. \tag{1.13}$$
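Proof. If $x \in M$ we put $x_1 = x$ and $x_2 = 0$. Otherwise, let $x_1$ be the unique element of $M$ which satisfies (1.8), i.e.,
$$\|x - x_1\| = d(x, M). \tag{1.14}$$
Next we verify that $x_2 = x - x_1$ is orthogonal to $M$, and so it belongs to $M^\perp$. Let $0 \ne y \in M$ and observe that since for each scalar $\lambda$ also $x_1 + \lambda y \in M$, it readily follows that
$$\|x_2\|^2 = \|x - x_1\|^2 \le \|x - (x_1 + \lambda y)\|^2 = \|x_2 - \lambda y\|^2. \tag{1.15}$$
Now, by a familiar argument using (1.14), (1.15) may be rewritten
$$-\overline{\lambda}\langle x_2, y\rangle - \lambda\langle y, x_2\rangle + |\lambda|^2\langle y, y\rangle \ge 0.$$
In particular, when $\lambda = \langle x_2, y\rangle/\langle y, y\rangle$ we conclude that
$$-\frac{|\langle x_2, y\rangle|^2}{\langle y, y\rangle} - \frac{|\langle x_2, y\rangle|^2}{\langle y, y\rangle} + \frac{|\langle x_2, y\rangle|^2}{\langle y, y\rangle} \ge 0,$$
that is, $|\langle x_2, y\rangle|^2 \le 0$. But this can only happen when $x_2 \perp y$, and consequently $x_2 \in M^\perp$. By the Pythagorean theorem, (1.13) is then true.
Finally we prove that the representation in (1.12) is unique. For, if we also have
$$x = x_1' + x_2', \quad x_1' \in M,\ x_2' \in M^\perp,$$
then, comparing this with (1.12), we get
$$x_1 - x_1' = x_2' - x_2.$$
Now, the element on the left-hand side above belongs to $M$, while that on the right-hand side belongs to $M^\perp$. Thus,
$$\langle x_1 - x_1', x_2' - x_2\rangle = \langle x_1 - x_1', x_1 - x_1'\rangle = 0,$$
and $x_1 - x_1' = 0$. This implies that also $x_2' - x_2 = 0$, and we are done. •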
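The decomposition (1.12) can be computed explicitly when $M$ is a finite dimensional subspace of $\mathbb{C}^n$. The following sketch is one way to do so, under the assumption that $M$ is given by a finite spanning set: it orthonormalizes the spanning vectors by the Gram-Schmidt process and sums the components of $x$ along the resulting vectors. The names `orthonormalize` and `decompose` are hypothetical helpers for this illustration only.

```python
def inner(x, y):
    """Standard inner product on C^n: linear in x, conjugate linear in y."""
    return sum(a * b.conjugate() for a, b in zip(x, y))

def norm(x):
    return abs(inner(x, x)) ** 0.5

def orthonormalize(vectors, tol=1e-12):
    """Gram-Schmidt: an orthonormal basis of span(vectors); dependent vectors are dropped."""
    basis = []
    for v in vectors:
        w = list(v)
        for e in basis:
            c = inner(w, e)
            w = [a - c * b for a, b in zip(w, e)]
        n = norm(w)
        if n > tol:
            basis.append([a / n for a in w])
    return basis

def decompose(x, spanning_vectors):
    """Return (x1, x2) with x = x1 + x2, x1 in M = span(spanning_vectors), x2 orthogonal to M."""
    x1 = [0.0] * len(x)
    for e in orthonormalize(spanning_vectors):
        c = inner(x, e)
        x1 = [a + c * b for a, b in zip(x1, e)]
    x2 = [a - b for a, b in zip(x, x1)]
    return x1, x2

M = [[1, 1, 0], [0, 1, 1]]   # M is a plane in R^3
x = [1, 2, 4]
x1, x2 = decompose(x, M)
# x2 should be orthogonal to M, and (1.13) should hold up to rounding.
print(x1, x2)
print(abs(norm(x) ** 2 - norm(x1) ** 2 - norm(x2) ** 2) < 1e-9)
```

In this example the vector $(1,-1,1)$ is normal to the plane $M$, so $x_2$ is the component of $x$ along that normal and $x_1$ is what remains.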
The elements $x_1 \in M$ and $x_2 \in M^\perp$ uniquely determined by $x$ are called the projections of $x$ onto $M$ and $M^\perp$ respectively. The operator $P_M: X \to M$ given by $P_M x = x_1$ is called the projection on $M$; it is not hard to see that $P_M$ is a bounded linear operator onto $M$, and that $\|P_M\| \le 1$. As we describe in Section 2, projection operators play an important role in the description of basic properties of the subspaces of $X$.
Theorem 1.7 may be applied to characterize the bounded linear functionals on a Hilbert space.
Theorem 1.8 (F. Riesz). Let $X$ be a Hilbert space, and suppose $L$ is a bounded linear functional defined on $X$. Then there exists a unique $y \in X$ such that
$$Lx = \langle x, y\rangle, \quad \text{all } x \in X. \tag{1.16}$$
Furthermore, $\|L\| = \|y\|$.
Proof. Let $M$ be the null space of $L$, i.e., $M = \{x \in X : Lx = 0\}$; since $L$ is continuous, $M$ is a closed subspace of $X$, cf. 4.15 in Chapter XIV. If $M = X$, then we choose $y = 0$, and we are done. Otherwise, let $x_0 \notin M$, and note that by Theorem 1.7 there is an element $z = (1/\|x_0 - P_M x_0\|)(x_0 - P_M x_0)$ in $M^\perp$ with $\|z\| = 1$. Now, given $x \in X$, put $x_1 = (Lz)x - (Lx)z$, and observe that
$$Lx_1 = L((Lz)x) - L((Lx)z) = Lz\,Lx - Lx\,Lz = 0,$$
i.e., $x_1 \in M$. Furthermore, since $z \in M^\perp$ and $\|z\| = 1$, we get
$$\langle (Lz)x - (Lx)z, z\rangle = Lz\langle x, z\rangle - Lx\langle z, z\rangle = Lz\langle x, z\rangle - Lx = 0.$$
In other words, we have
$$Lx = Lz\langle x, z\rangle = \langle x, \overline{(Lz)}\,z\rangle,$$
and (1.16) holds with $y = \overline{(Lz)}\,z$.
Suppose now there is another point $y'$, say, such that $Lx = \langle x, y'\rangle$ for all $x \in X$. Then $\langle x, y - y'\rangle = 0$ for all $x \in X$, and by taking $x = y - y'$ we find that $\|y - y'\| = 0$, and so $y = y'$.
Finally, about the norm. Since $\|z\| = 1$ we have $\|y\| = |Lz|\|z\| \le \|L\|$. On the other hand,
$$|Lx| = |\langle x, y\rangle| \le \|x\|\|y\|, \quad \text{all } x \in X,$$
and consequently, $\|L\| \le \|y\|$. We have thus proved that $\|L\| = \|y\|$, and we have finished. •
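In $\mathbb{C}^n$ the content of Theorem 1.8 can be checked by hand. The sketch below assumes the functional is given by coefficients, $Lx = \sum_i a_i x_i$; its representing vector is then $y = (\overline{a_1}, \ldots, \overline{a_n})$. The helper names `apply_functional` and `representer` are hypothetical. The last lines retrace the construction in the proof: with $z = y/\|y\|$, a unit vector orthogonal to the null space of $L$, the vector $\overline{(Lz)}\,z$ recovers $y$.

```python
def inner(x, y):
    """Standard inner product on C^n: linear in x, conjugate linear in y."""
    return sum(a * b.conjugate() for a, b in zip(x, y))

def apply_functional(coeffs, x):
    """L(x) = sum_i a_i x_i, a bounded linear functional on C^n."""
    return sum(a * b for a, b in zip(coeffs, x))

def representer(coeffs):
    """The vector y with L(x) = <x, y>: the entrywise conjugate of the coefficients."""
    return [complex(a).conjugate() for a in coeffs]

a = [1 + 2j, -1j, 3]
y = representer(a)
x = [2 - 1j, 1 + 1j, -1]

norm_y = abs(inner(y, y)) ** 0.5
z = [c / norm_y for c in y]                      # unit vector orthogonal to the null space of L
Lz = apply_functional(a, z)                      # equals ||y||, so the norm of L is attained at z
y_from_proof = [Lz.conjugate() * c for c in z]   # the vector conj(Lz) z from the proof

print(abs(apply_functional(a, x) - inner(x, y)) < 1e-12)
print(abs(Lz - norm_y) < 1e-12)
```

Since $|Lx| \le \|x\|\|y\|$ for all $x$ and $Lz = \|y\|$ with $\|z\| = 1$, the example also exhibits the equality $\|L\| = \|y\|$.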
A well-known property of finite-dimensional linear spaces is that of being algebraically reflexive. It is, therefore, natural to consider whether arbitrary Hilbert spaces are reflexive. In order to answer this question we begin by showing that $X^*$, which we already know to be complete, is an inner product space as well.
Proposition 1.9. If $X$ is a Hilbert space, then $X^*$ is a Hilbert space.
Proof. Consider the mapping $T: X \to X^*$ defined by $Tx = \langle\cdot, x\rangle$, i.e., $Tx$ is the bounded linear functional on $X$ given by
$$(Tx)y = \langle y, x\rangle, \quad \text{all } y \in X. \tag{1.17}$$
It is apparent that $T$ is one-to-one and that, by Theorem 1.8, it is also norm preserving and onto. Observe that (1.17) may be rewritten
$$Ly = \langle y, T^{-1}L\rangle, \quad \text{for each } L \in X^*,\ y \in X. \tag{1.18}$$
Now, $T$ establishes an equivalence between $X$ and $X^*$ at the level of sets, but not as linear spaces, since $T$ is not linear but rather conjugate linear. More precisely, we have
$$T(x_1 + \lambda x_2) = Tx_1 + \overline{\lambda}\,Tx_2, \quad \text{all } x_1, x_2 \text{ in } X, \text{ scalars } \lambda.$$
Nevertheless, this property of $T$ enables us to introduce an inner product $\langle\cdot,\cdot\rangle_*$ on $X^*$ as follows: Given $L_1, L_2 \in X^*$, let
$$\langle L_1, L_2\rangle_* = \langle T^{-1}L_2, T^{-1}L_1\rangle. \tag{1.19}$$
A straightforward computation gives that $\langle\cdot,\cdot\rangle_*$ is an inner product on $X^*$, and that the norm and inner product on $X^*$ are related by (1.2). Thus $X^*$ is a Hilbert space. •
We are now ready to show
Proposition 1.10. Suppose $X$ is a Hilbert space. Then $X$ is reflexive.
Proof. Along the lines of the proof of Proposition 1.9 above, let $\tau: X^* \to X^{**}$ be the mapping on $X^*$ given by
$$\tau L = \langle\cdot, L\rangle_*, \quad L \in X^*. \tag{1.20}$$
It then readily follows that $\tau$ is one-to-one, onto and norm-preserving, and consequently, (1.20) may be rewritten
$$x^{**}L = \langle L, \tau^{-1}x^{**}\rangle_*, \quad \text{all } x^{**} \in X^{**},\ L \in X^*.$$
But then, by (1.19) and (1.18), for any $x^{**} \in X^{**}$ and $L \in X^*$ we have
$$x^{**}L = \langle T^{-1}\tau^{-1}x^{**}, T^{-1}L\rangle = L(T^{-1}\tau^{-1}x^{**}) = J_X(T^{-1}\tau^{-1}x^{**})L.$$
Since $L$ is arbitrary this can only mean that $x^{**} = J_X(T^{-1}\tau^{-1}x^{**})$, and consequently $T^{-1}\tau^{-1}x^{**} \in X$. Thus, the natural map $J_X$ is onto, and $X$ is reflexive. •
2. PROJECTIONS
We consider now some of the connections between the class of projections and the geometry of subspaces of $X$. We begin by noting an obvious property of projections: If $P_M: X \to M$ is the projection of $X$ onto the subspace $M$, then $P_M$ is a bounded linear operator and, if $M \ne \{0\}$, we have $\|P_M\| = 1$. Indeed, for $x, y \in X$ and scalars $\lambda$, by Theorem 1.7 it follows that
$$x = x_1 + x_2, \quad y = y_1 + y_2, \quad x_1, y_1 \in M,\ x_2, y_2 \in M^\perp,$$
and
$$x + \lambda y = (x_1 + \lambda y_1) + (x_2 + \lambda y_2), \quad x_1 + \lambda y_1 \in M,\ x_2 + \lambda y_2 \in M^\perp.$$
Whence, we see that
$$P_M(x + \lambda y) = x_1 + \lambda y_1 = P_M x + \lambda P_M y,$$
and $P_M$ is linear. Further, by (1.13), $\|P_M x\|^2 \le \|x\|^2$, so that $\|P_M\| \le 1$. Provided that $M \ne \{0\}$, we can choose $x \in M$ with $\|x\| = 1$. Then $\|P_M\| \ge \|P_M x\| = \|x\| = 1$, and $\|P_M\| = 1$. It is also possible to characterize the projections.
Proposition 2.1. Let $P: X \to X$ be a linear map from a Hilbert space $X$ into itself. Then $P$ is a projection iff
(i) $\langle Px, y\rangle = \langle x, Py\rangle$, all $x, y \in X$. (2.1)
(ii) $P^2 x = P(Px) = Px$, all $x \in X$. (2.2)
Proof. We do the necessity first. Suppose $P = P_M$ is a projection onto a closed subspace $M$, and, given $x, y \in X$, write $x = x_1 + x_2$, $y = y_1 + y_2$, with $x_1, y_1 \in M$ and $x_2, y_2 \in M^\perp$. We then have
$$\langle Px, y\rangle = \langle x_1, y_1 + y_2\rangle = \langle x_1, y_1\rangle = \langle x_1 + x_2, y_1\rangle = \langle x, Py\rangle,$$
which is precisely (2.1). Moreover, since for $y \in M$ we have $Py = y$, it readily follows that $P(Px) = Px$, $x \in X$, and (2.2) is true.
Conversely, suppose $P$ is a linear mapping which satisfies both the conditions in the proposition, and first note that $P$ is bounded. Indeed, by (2.1), (2.2) and the Cauchy-Schwarz inequality we have
$$\|Px\|^2 = \langle Px, Px\rangle = \langle x, P^2 x\rangle = \langle x, Px\rangle \le \|x\|\|Px\|,$$
and consequently, $P$ is bounded, and $\|P\| \le 1$.
Next let $M = P(X)$ be the image of $X$ under $P$. It is clear that $M$ is a linear subspace of $X$; it is also closed. Indeed, if $\{y_n\} \subseteq M$ and $y_n \to y$, let $\{x_n\} \subseteq X$ be such that $Px_n = y_n$, and note that by (2.2) it follows that $Py_n = P(Px_n) = Px_n = y_n$, all $n$. Hence by the continuity of $P$ we get
$$y = \lim_{n\to\infty} y_n = \lim_{n\to\infty} Py_n = Py \in M,$$
and $M$ is the closed subspace of $X$, $M = \{x \in X : Px = x\}$.
Finally we show that $P$ is the projection $P_M$ of $X$ onto $M$. Take any $x \in X$ and write it as $x = Px + (x - Px)$; we want to show that $Px \in M$ and $(x - Px) \in M^\perp$. The first assertion obtains since $Px = P(Px) \in M$. Also, if $y \in M$, then $Py = y$ and hence, by (2.1) and (2.2),
$$\langle x - Px, y\rangle = \langle x, y\rangle - \langle Px, Py\rangle = \langle x, y\rangle - \langle x, P^2 y\rangle = \langle x, y\rangle - \langle x, Py\rangle = \langle x, y\rangle - \langle x, y\rangle = 0,$$
and $x - Px \in M^\perp$. •
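For real matrices acting on $\mathbb{R}^n$, condition (2.1) says the matrix is symmetric and condition (2.2) says it is idempotent. The sketch below, with hypothetical helper names, contrasts the orthogonal projection of $\mathbb{R}^2$ onto the line spanned by $(1,1)$ with an idempotent matrix that fails (2.1); the latter maps onto the first coordinate axis along a direction that is not perpendicular to it, so it is not a projection in the sense of Proposition 2.1.

```python
def matmul(A, B):
    """Product of two matrices given as lists of rows."""
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

def close(A, B, tol=1e-12):
    """Entrywise comparison of two matrices of the same shape."""
    return all(abs(a - b) <= tol for ra, rb in zip(A, B) for a, b in zip(ra, rb))

def is_symmetric(P):
    """Condition (2.1) for a real matrix: <Px, y> = <x, Py> for all x, y."""
    return close(P, [list(col) for col in zip(*P)])

def is_idempotent(P):
    """Condition (2.2): P(Px) = Px for all x."""
    return close(matmul(P, P), P)

orthogonal = [[0.5, 0.5], [0.5, 0.5]]   # projection onto span{(1, 1)}
oblique = [[1.0, 1.0], [0.0, 0.0]]      # idempotent, but not symmetric

print(is_symmetric(orthogonal), is_idempotent(orthogonal))
print(is_symmetric(oblique), is_idempotent(oblique))
```

The oblique map sends $(1,1)/\sqrt{2}$ to a vector of norm $\sqrt{2}$, so it also violates the bound $\|P\| \le 1$ derived in the proof.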
Two closed subspaces $M, N$ of a Hilbert space $X$ are said to be orthogonal, and we write $M \perp N$, if
$$\langle x, y\rangle = 0, \quad \text{all } x \in M,\ y \in N.$$
It is not hard to characterize orthogonal subspaces in terms of the projections they determine.
Proposition 2.2. Suppose $X$ is a Hilbert space, and $M, N$ are closed subspaces of $X$. Then, $M \perp N$ iff $P_M P_N = P_N P_M = 0$.
Proof. First suppose $M$ and $N$ are orthogonal and let $x, y \in X$. Then $P_M x \in M$, $P_N y \in N$, and
$$\langle P_N y, P_M x\rangle = \langle P_M P_N y, x\rangle = 0.$$
Since $x, y$ are arbitrary this can only happen if $P_M P_N = 0$; similarly $P_N P_M = 0$. Conversely, suppose that $P_M P_N = 0$, and let $x \in M$, $y \in N$. Then,
$$\langle x, y\rangle = \langle P_M x, P_N y\rangle = \langle x, P_M P_N y\rangle = 0,$$
and $M \perp N$. Note that the above assumptions are somewhat redundant in that $P_M P_N = 0$ iff $P_N P_M = 0$. •
How about the sum of projections? First a definition. Let $M, N$ be closed linear subspaces of a Hilbert space $X$, and assume that every element in the vector sum $M + N$ has a unique representation of the form $x + y$, where $x \in M$, $y \in N$. Then we call $M + N$ the direct sum of $M$ and $N$. If $M \perp N$, we denote this direct sum by $M \oplus N$. We leave it to the reader to verify that $M \oplus N$ is also a closed subspace of $X$. If $Y = M \oplus N$, then we say that $N$ is the orthogonal complement of $M$ in $Y$, and we write $N = Y \ominus M$; symmetrically, $M = Y \ominus N$ denotes the fact that $M$ is the orthogonal complement of $N$ in $Y$. For instance, in the projection theorem we have $X = M \oplus M^\perp$, $M = X \ominus M^\perp$, and $M^\perp = X \ominus M$.
Proposition 2.3. Suppose $X$ is a Hilbert space, and $M, N$ are closed subspaces of $X$. Then the sum $P_M + P_N$ of the projections $P_M$ and $P_N$ is a projection iff $P_M P_N = P_N P_M = 0$. In this case, $P_M + P_N = P_{M \oplus N}$.
Proof. First assume that $P = P_M + P_N$ is a projection. By Proposition 2.1 we have
$$\|Px\|^2 = \langle Px, Px\rangle = \langle P^2 x, x\rangle = \langle Px, x\rangle, \quad \text{all } x \in X,$$
and similarly,
$$\|P_M x\|^2 = \langle P_M x, x\rangle, \quad \|P_N x\|^2 = \langle P_N x, x\rangle, \quad \text{all } x \in X.$$
Whence, we get
$$\|P_M x\|^2 + \|P_N x\|^2 = \langle P_M x, x\rangle + \langle P_N x, x\rangle = \langle Px, x\rangle = \|Px\|^2 \le \|x\|^2. \tag{2.3}$$
Consider now an arbitrary element $y$ in $X$ and put $x = P_N y$ in (2.3). Since $P_N x = P_N^2 y = P_N y$, this gives
$$\|P_M P_N y\|^2 + \|P_N y\|^2 \le \|P_N y\|^2,$$
which can only be true if $P_M P_N = 0$. By the way, this is equivalent to $P_N P_M = 0$.
Conversely, we verify that $P = P_M + P_N$ satisfies the conditions of Proposition 2.1, and so it is a projection. Since $P$ is a sum of operators that satisfy (2.1), it also satisfies (2.1). As for (2.2), since
$$P^2 = (P_M + P_N)^2 = (P_M + P_N)(P_M + P_N) = P_M^2 + P_M P_N + P_N P_M + P_N^2 = P_M^2 + P_N^2 = P_M + P_N = P,$$
we have that $P^2 = P$, and $P$ is a projection.
Finally, it is clear that $Px = P_M x + P_N x$ varies over $M \oplus N$ as $x$ varies over $X$. Conversely, if $x = x_1 + x_2 \in M \oplus N$, $x_1 \in M$, $x_2 \in N$, since $P P_M = P_M$, $P P_N = P_N$, and since $x_1 = P_M x_1 = P P_M x_1$, $x_2 = P_N x_2 = P P_N x_2$, we have
$$Px = P P_M x_1 + P P_N x_2 = x_1 + x_2 = x.$$
Hence $P = P_{M \oplus N}$. •
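Proposition 2.3 can be checked on small matrices. In the sketch below, which uses hypothetical helper names and works in $\mathbb{R}^3$, `rank_one_projection(u)` builds the matrix of the orthogonal projection onto the line spanned by `u`. The sum of the projections onto two orthogonal lines is again a projection, while the sum for two lines that are not orthogonal is not.

```python
def rank_one_projection(u):
    """Matrix of the orthogonal projection of R^n onto span{u}, for u != 0."""
    s = sum(a * a for a in u)
    return [[a * b / s for b in u] for a in u]

def matmul(A, B):
    """Product of two matrices given as lists of rows."""
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

def add(A, B):
    return [[a + b for a, b in zip(ra, rb)] for ra, rb in zip(A, B)]

def close(A, B, tol=1e-12):
    return all(abs(a - b) <= tol for ra, rb in zip(A, B) for a, b in zip(ra, rb))

def is_projection(P):
    """Conditions (2.1) and (2.2) for a real matrix: symmetric and idempotent."""
    return close(P, [list(col) for col in zip(*P)]) and close(matmul(P, P), P)

zero = [[0.0] * 3 for _ in range(3)]
PM = rank_one_projection([1.0, 0.0, 0.0])
PN = rank_one_projection([0.0, 1.0, 0.0])   # N is orthogonal to M
PK = rank_one_projection([1.0, 1.0, 0.0])   # K is not orthogonal to M

print(close(matmul(PM, PN), zero), is_projection(add(PM, PN)))
print(close(matmul(PM, PK), zero), is_projection(add(PM, PK)))
```

In the first case the sum is the projection onto the coordinate plane spanned by the first two basis vectors, which is $M \oplus N$.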
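How about the product, or composition, and the difference of projections?
Proposition 2.4. Suppose $X$ is a Hilbert space, and $M$ and $N$ are closed subspaces of $X$. Then the composition $P = P_M P_N$ of the projections $P_M$ and $P_N$ is a projection iff they commute, i.e., $P_M P_N = P_N P_M$. In that case $P = P_{M \cap N}$.
Proof. If $P$ is a projection, then
$$\langle P_M P_N x, y\rangle = \langle x, P_M P_N y\rangle, \quad \text{all } x, y \in X.$$
Moreover, since $P_M$ and $P_N$ are projections we also have
$$\langle P_M P_N x, y\rangle = \langle P_N x, P_M y\rangle = \langle x, P_N P_M y\rangle, \quad \text{all } x, y \in X,$$
and consequently, we get
$$\langle x, P_M P_N y\rangle = \langle x, P_N P_M y\rangle, \quad \text{all } x, y \in X.$$
But this can only be true if $P_M P_N = P_N P_M$, and the necessity has been established.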
Conversely, if $P_N$ and $P_M$ commute, then essentially reversing the above steps we get that $P = P_M P_N = P_N P_M$ satisfies
$$\langle Px, y\rangle = \langle x, Py\rangle, \quad \text{all } x, y \in X.$$
Moreover, since also
$$P^2 = (P_M P_N)(P_M P_N) = P_M(P_N P_M)P_N = P_M(P_M P_N)P_N = P_M^2 P_N^2 = P_M P_N = P,$$
by Proposition 2.1 $P$ is a projection. Furthermore, since
$$Px = P_M(P_N x) = P_N(P_M x) \in M \cap N, \quad \text{all } x \in X,$$
$P$ projects $X$ into $M \cap N$. On the other hand, if $x \in M \cap N$, then
$$Px = P_M(P_N x) = P_M x = x,$$
and $P = P_{M \cap N}$ is the projection onto $M \cap N$. •
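The role of commutativity in Proposition 2.4 shows up already in $\mathbb{R}^3$. In the following sketch (helper names are hypothetical) the projections onto two coordinate planes commute, and their product is the projection onto the coordinate axis where the planes meet; the projections onto the first axis and onto the line spanned by $(1,1,0)$ do not commute, and their product fails the conditions of Proposition 2.1.

```python
def diag(entries):
    """Diagonal matrix with the given diagonal entries."""
    n = len(entries)
    return [[float(entries[i]) if i == j else 0.0 for j in range(n)] for i in range(n)]

def matmul(A, B):
    """Product of two matrices given as lists of rows."""
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

def close(A, B, tol=1e-12):
    return all(abs(a - b) <= tol for ra, rb in zip(A, B) for a, b in zip(ra, rb))

def is_projection(P):
    """Conditions (2.1) and (2.2) for a real matrix: symmetric and idempotent."""
    return close(P, [list(col) for col in zip(*P)]) and close(matmul(P, P), P)

# Commuting case: M = span{e1, e2}, N = span{e2, e3}, so M ∩ N = span{e2}.
PM, PN = diag([1, 1, 0]), diag([0, 1, 1])
product = matmul(PM, PN)

# Non-commuting case: A = span{e1}, B = span{(1, 1, 0)}.
PA = diag([1, 0, 0])
PB = [[0.5, 0.5, 0.0], [0.5, 0.5, 0.0], [0.0, 0.0, 0.0]]

print(close(product, matmul(PN, PM)), close(product, diag([0, 1, 0])))
print(close(matmul(PA, PB), matmul(PB, PA)), is_projection(matmul(PA, PB)))
```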
Before we consider the question of the difference of projections we need a preliminary result.
Lemma 2.5. Suppose $X$ is a Hilbert space and let $P_M, P_N$ be projections onto the subspaces $M$ and $N$, respectively. Then the following four conditions are equivalent:
(i) $\langle P_M x, x\rangle \ge \langle P_N x, x\rangle$, all $x \in X$.
(ii) $M \supseteq N$.
(iii) $P_M P_N = P_N$.
(iv) $P_N P_M = P_N$.
Proof. (i) implies (ii). Let $x \in X$. Since $x - P_M x \in M^\perp$, we have
$$\|x\|^2 = \|P_M x\|^2 + \|x - P_M x\|^2.$$
Now, if $x \in N$, then $P_N x = x$, and $\langle P_N x, x\rangle = \|x\|^2$. Whence, by (i) we get
$$\|x\|^2 = \langle P_N x, x\rangle \le \langle P_M x, x\rangle = \langle P_M x, P_M x\rangle = \|P_M x\|^2 \le \|x\|^2.$$
Thus $\|x\| = \|P_M x\|$, and by (1.13) we conclude that $\|x - P_M x\| = 0$. Consequently, $x = P_M x \in M$, and (ii) holds.
(ii) implies (iii). Since $P_N x \in N \subseteq M$, we have $P_M P_N x = P_N x$.
(iii) implies (iv). By Proposition 2.4, since the composition $P_M P_N = P_N$ is a projection, the projections commute. Specifically, $P_N P_M = P_M P_N = P_N$, which is precisely (iv).
(iv) implies (i). It is a straightforward computation: For $x \in X$ we have
$$\langle P_N x, x\rangle = \|P_N x\|^2 = \|P_N P_M x\|^2 \le \|P_M x\|^2 = \langle P_M x, x\rangle,$$
and (i) holds. •
Proposition 2.6. Suppose $X$ is a Hilbert space, and let $P_M, P_N$ be projections onto the subspaces $M$ and $N$, respectively. The difference $P = P_M - P_N$ is a projection iff $N \subseteq M$. If this is the case, then $P = P_{M \ominus N}$.
Proof. Suppose that $P$ is a projection. Then, since $P_M = P + P_N$ is also a projection, by Proposition 2.3 it follows that $P P_N = 0$. Whence
$$P_M P_N - P_N = (P_M - P_N)P_N = P P_N = 0,$$
and (iii) of Lemma 2.5, and consequently, also (ii) in that lemma, are true.
Conversely, if $N \subseteq M$, it is clear that $P_N P_M = P_M P_N = P_N$, and
$$(P_M - P_N)^2 = (P_M - P_N)(P_M - P_N) = P_M^2 - P_N P_M - P_M P_N + P_N^2 = P_M - P_N - P_N + P_N = P_M - P_N.$$
Also,
$$\langle (P_M - P_N)x, y\rangle = \langle P_M x, y\rangle - \langle P_N x, y\rangle = \langle x, P_M y\rangle - \langle x, P_N y\rangle = \langle x, (P_M - P_N)y\rangle,$$
and by Proposition 2.1, $P$ is a projection. Moreover, since as observed above $(P_M - P_N)P_N = 0$, by Proposition 2.3 the subspace $Y$ of the projection $P_M - P_N$ satisfies $Y \oplus N = M$. Therefore $Y = M \ominus N$. •
3. ORTHONORMAL SETS
A general question we address in this section is the following: Given a Hilbert space $X$ and a subset $Y$ of $X$, how can we best approximate elements of $X$ by those of $Y$? A good measure of the approximation is given by the quantity
$$d(x, Y) = \inf_{y \in Y} \|x - y\|,$$
so we are naturally interested in whether the inf above is actually achieved. We begin by considering a simple example, namely, the case when $Y$ is the (finite dimensional) subspace of $X$ spanned by $\{x_1, \ldots, x_n\}$. Given $x \in X$, we seek to minimize the expression
$$\Big\|x - \sum_{i=1}^{n} \lambda_i x_i\Big\|. \tag{3.1}$$
Clearly we may assume the $x_i$'s are linearly independent, and, by the Gram-Schmidt process, cf. 5.21 below, orthonormal. More precisely, we may assume that each $x_i$ has norm $1$, and that $x_i \perp x_j$ for $1 \le i \ne j \le n$. The $x_i$'s are then said to constitute an orthonormal system, or ONS, in $X$. A closer look at (3.1) and (1.13) suggests that we consider the projection of $x$ onto $Y$, and in order to invoke the projection theorem we begin by showing that $Y$ is closed.
Proposition 3.1. If $M$ is a closed subspace of a normed space $X$ and if $x \in X$, then the span $\{M, x\}$ of $M$ and $\{x\}$ is also a closed subspace of $X$.
Proof. If $x \in M$, then the span $\{M, x\} = M$, and we are done. Otherwise, assume $x \notin M$ and suppose a sequence $\{m_n + \lambda_n x\}$ of elements of $\{M, x\}$ converges to an element $y \in X$; we must show that actually $y \in \{M, x\}$. First note that since the sequence converges it is bounded, i.e., there is a constant $c$ such that
$$\|m_n + \lambda_n x\| \le c, \quad \text{all } n.$$
We claim that $\{|\lambda_n|\}$ is also bounded. If this is not the case, there is a subsequence $n_k \to \infty$ such that $|\lambda_{n_k}| \to \infty$ as $k \to \infty$, and consequently,
$$\|\lambda_{n_k}^{-1} m_{n_k} + x\| \le c/|\lambda_{n_k}| \to 0 \quad \text{as } k \to \infty.$$
Whence $x$ belongs to the closure of $M$, which is $M$ since $M$ is closed, and this is a contradiction. Now, since $\{|\lambda_n|\}$ is bounded, passing to a subsequence if necessary, we may assume that the $\lambda_n$'s converge to a limit $\lambda$, say. But then
$$m_n = (m_n + \lambda_n x) - \lambda_n x \to y - \lambda x,$$
and since $M$ is closed, $y - \lambda x \in M$. Thus $y = (y - \lambda x) + \lambda x$ belongs to $\{M, x\}$. •
Corollary 3.2. Let $X$ be a normed space. Then the subspace $Y$ spanned by $\{x_1, \ldots, x_n\}$ is closed.
Proof. Observe that the span $\{x_1\} = \{\lambda x_1 : \lambda \text{ is a scalar}\}$ is closed in $X$ and apply Proposition 3.1 as many times as necessary. •
The stage is now set to invoke the projection theorem, and to obtain the answer to our question: It is $P_Y x$. In fact, by the projection theorem
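we can find the $\lambda_i$'s as follows: Since $x - P_Y x \in Y^\perp$, if $P_Y x = \sum_{i=1}^{n} \lambda_i x_i$, we have
$$\Big\langle x - \sum_{i=1}^{n} \lambda_i x_i, x_k \Big\rangle = 0, \quad k = 1, \ldots, n.$$
Moreover, since $\{x_i\}$ is an ONS in $X$, it readily follows that
$$\lambda_k = \langle x, x_k\rangle, \quad k = 1, \ldots, n,$$
and consequently, the minimum value of the expression (3.1) equals
$$\Big\|x - \sum_{i=1}^{n} \langle x, x_i\rangle x_i\Big\|. \tag{3.2}$$
Furthermore, we may compute the exact value of the expression (3.2): Its square equals
$$\|x\|^2 - \Big\langle \sum_{i=1}^{n} \langle x, x_i\rangle x_i, x \Big\rangle - \Big\langle x, \sum_{k=1}^{n} \langle x, x_k\rangle x_k \Big\rangle + \sum_{i=1}^{n} |\langle x, x_i\rangle|^2 = \|x\|^2 - \sum_{i=1}^{n} |\langle x, x_i\rangle|^2. \tag{3.3}$$
The minimizing values of the $\lambda_k$'s, to wit, the scalars $\langle x, x_k\rangle$, are called the Fourier coefficients of $x$ with respect to the ONS $\{x_k\}$. In addition to providing a complete answer to the question posed above, (3.3) implies
Proposition 3.3 (Bessel's Inequality). Suppose $\{x_\alpha\}_{\alpha \in A}$ is an ONS in a Hilbert space $X$. Then Bessel's inequality holds, i.e.,
$$\sum_{\alpha \in A} |\langle x, x_\alpha\rangle|^2 \le \|x\|^2, \quad \text{all } x \in X. \tag{3.4}$$
In particular, for each $x$ in $X$, all but an at most countable number of the Fourier coefficients $\langle x, x_\alpha\rangle$ of $x$ with respect to the ONS $\{x_\alpha\}$ vanish.
Proof. Let $\alpha_1, \ldots, \alpha_n$ be a finite subset of $A$. Then, given $x \in X$, by (3.3) we have
$$\Big\|x - \sum_{i=1}^{n} \langle x, x_{\alpha_i}\rangle x_{\alpha_i}\Big\|^2 = \|x\|^2 - \sum_{i=1}^{n} |\langle x, x_{\alpha_i}\rangle|^2 \ge 0.$$
It then readily follows that
$$\sum_{i=1}^{n} |\langle x, x_{\alpha_i}\rangle|^2 \le \|x\|^2$$
for every finite subset of $A$, and (3.4) holds. •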
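The identity (3.3) and Bessel's inequality are easy to test numerically. The sketch below assumes a two-element orthonormal system in $\mathbb{R}^3$ chosen for illustration, with hypothetical helper names; it computes the Fourier coefficients of a vector, compares the squared residual with $\|x\|^2 - \sum_i |\langle x, x_i\rangle|^2$, and checks that perturbing the coefficients makes the approximation worse.

```python
def inner(x, y):
    """Standard inner product: linear in x, conjugate linear in y."""
    return sum(a * b.conjugate() for a, b in zip(x, y))

def fourier_coefficients(x, ons):
    """The scalars <x, x_k> with respect to the orthonormal system ons."""
    return [inner(x, e) for e in ons]

def residual_sq(x, ons, coeffs):
    """Squared norm of x - sum_k coeffs[k] * ons[k], as in (3.1)."""
    approx = [sum(c * e[k] for c, e in zip(coeffs, ons)) for k in range(len(x))]
    diff = [a - b for a, b in zip(x, approx)]
    return inner(diff, diff).real

ons = [[1.0, 0.0, 0.0], [0.0, 0.6, 0.8]]   # an orthonormal system in R^3
x = [1.0, 2.0, 3.0]

coeffs = fourier_coefficients(x, ons)
bessel_sum = sum(abs(c) ** 2 for c in coeffs)
best = residual_sq(x, ons, coeffs)
worse = residual_sq(x, ons, [coeffs[0] + 0.1, coeffs[1] - 0.2])

print(bessel_sum <= inner(x, x).real)                      # Bessel's inequality
print(abs(best - (inner(x, x).real - bessel_sum)) < 1e-9)  # identity (3.3)
print(worse > best)                                        # Fourier coefficients minimize
```

Because the system is orthonormal, changing the coefficients by $\delta_1, \delta_2$ increases the squared residual by exactly $\delta_1^2 + \delta_2^2$, which is why the Fourier coefficients are the unique minimizers.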
There is yet another way to interpret the inequality (3.4): Suppose $A$ is endowed with the counting measure, and let $T$ be the linear mapping from $X$ into the space of sequences $(c_\alpha)_{\alpha \in A}$ which takes $x$ into its sequence of Fourier coefficients,
$$Tx = \big((x, x_\alpha)\big)_{\alpha \in A}, \qquad x \in X. \tag{3.5}$$
Bessel's inequality asserts that $T$ is a bounded mapping from $X$ into $\ell^2(A)$ with norm $\|T\| \le 1$. Since we are interested in describing $X$ in terms of $\ell^2(A)$, we must decide when $T$ is one-to-one and onto. We begin settling the "onto" question.
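Theorem 3.4 (F. Riesz-Fischer). Suppose $\{x_\alpha\}_{\alpha \in A}$ is an ONS in a Hilbert space $X$, and let $(c_\alpha) \in \ell^2(A)$. If $T$ is given by (3.5), then there exists $y \in X$ such that $Ty = (c_\alpha)_{\alpha \in A}$. More precisely, $T$ is onto.
Proof. Since $(c_\alpha) \in \ell^2(A)$, there are at most countably many $\alpha$'s, $\alpha_1, \ldots, \alpha_j, \ldots$, say, so that $c_{\alpha_j} \ne 0$. Put now
$$y_n = \sum_{j=1}^n c_{\alpha_j} x_{\alpha_j}, \qquad n = 1, 2, \ldots$$
We claim that the sequence $\{y_n\}$ is Cauchy in $X$, and consequently, it converges. This is not hard; indeed, let $n < m$, and note that by Proposition 1.4 we have
$$\|y_m - y_n\|^2 = \Big\| \sum_{j=n+1}^m c_{\alpha_j} x_{\alpha_j} \Big\|^2 = \sum_{j=n+1}^m |c_{\alpha_j}|^2.$$
Now, since $(c_\alpha) \in \ell^2(A)$, the right-hand side of the above equality is dominated by the tail of a convergent series, and it tends to $0$ as $n \to \infty$. Whence the same is true of the left-hand side, and $\{y_n\}$ is Cauchy. Moreover, if $y \in X$ is the limit of the $y_n$'s, from the continuity of the inner product we get
$$(y, x_\alpha) = \lim_{n \to \infty} (y_n, x_\alpha) = \begin{cases} 0 & \text{if } \alpha \ne \alpha_k, \text{ all } k, \\ c_{\alpha_k} & \text{if } \alpha = \alpha_k. \end{cases}$$
Thus, $Ty = (c_\alpha)_{\alpha \in A}$. ∎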
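The Cauchy estimate in the proof can be illustrated with a concrete square-summable sequence. The sketch below assumes the hypothetical coefficients $c_j = 1/j$; since the $x_{\alpha_j}$'s are orthonormal, only the coefficients matter, and $\|y_m - y_n\|^2$ is computed as the corresponding piece of the tail of $\sum_j 1/j^2$.

```python
def tail_norm_squared(c, n, m):
    """||y_m - y_n||^2 = sum_{j=n+1}^{m} |c_j|^2 for partial sums along an ONS."""
    return sum(abs(c(j)) ** 2 for j in range(n + 1, m + 1))

def c(j):
    """Hypothetical square-summable coefficients c_j = 1/j."""
    return 1.0 / j

starts = (10, 100, 1000)
# Distance (squared) between y_n and the much later partial sum y_{10 n}.
gaps = [tail_norm_squared(c, n, 10 * n) for n in starts]

for n, gap in zip(starts, gaps):
    print(n, gap)
```

The printed values decrease with $n$ and stay below $1/n$, which bounds the tail $\sum_{j > n} 1/j^2$.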
Still the question as to whether $T$ is one-to-one remains open; first a definition. We say that an ONS $\{x_\alpha\}_{\alpha \in A}$ in a Hilbert space $X$ is maximal, or complete, if no nonzero element can be added to it so that the resulting collection of elements is still an ONS in $X$. Note that given a Hilbert space $X$, we can always find a maximal orthonormal system in $X$; this is a simple consequence of Zorn's Lemma, cf. 5.23 below. The stage is now set for
Theorem 3.5. Suppose $\{x_\alpha\}_{\alpha \in A}$ is an ONS in a Hilbert space $X$. Then the following properties are equivalent:
(i) $\{x_\alpha\}_{\alpha \in A}$ is maximal in $X$.
(ii) The collection of all finite linear combinations of the $x_\alpha$'s is dense in $X$.
(iii) (Plancherel's Equality) Equality holds in Bessel's inequality, i.e.,
$$\|x\|^2 = \sum_{\alpha \in A} |(x, x_\alpha)|^2, \qquad \text{all } x \in X. \tag{3.6}$$
(iv) (Parseval's Identity) For all $x, y \in X$, we have
$$(x, y) = \sum_{\alpha \in A} (x, x_\alpha)\, \overline{(y, x_\alpha)}. \tag{3.7}$$
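Proof. (i) implies (ii). Let $M$ be the closure of the subspace of $X$ consisting of all finite linear combinations of the $x_\alpha$'s; $M$ is then a closed subspace of $X$. Now, if $X \setminus M \ne \emptyset$, by Theorem 1.7 we can find an element $x \in M^\perp$ with $\|x\| = 1$. In particular, we have
$$(x, x_\alpha) = 0, \qquad \text{all } \alpha \in A,$$
thus contradicting the assumed maximality of the $x_\alpha$'s.
(ii) implies (iii). By Bessel's inequality, it suffices to prove the "$\le$" inequality in (3.6). Given $x \in X$ and $\varepsilon > 0$, there exist a finite subset $\{\alpha_1, \ldots, \alpha_n\}$ of $A$ and scalars $\lambda_1, \ldots, \lambda_n$ such that
$$\Big\| x - \sum_{i=1}^n \lambda_i x_{\alpha_i} \Big\| \le \varepsilon. \tag{3.8}$$
Now, since the best approximation to $x$ in the subspace spanned by $\{x_{\alpha_1}, \ldots, x_{\alpha_n}\}$ is given by $\sum_{i=1}^n (x, x_{\alpha_i}) x_{\alpha_i}$, we also have
$$\Big\| x - \sum_{i=1}^n (x, x_{\alpha_i})\, x_{\alpha_i} \Big\| \le \varepsilon. \tag{3.9}$$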
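Moreover, since by Proposition 1.4 we have
$$\Big\| \sum_{i=1}^n (x, x_{\alpha_i})\, x_{\alpha_i} \Big\|^2 = \sum_{i=1}^n |(x, x_{\alpha_i})|^2,$$
from (3.9) it follows that
$$\|x\| \le \Big\| x - \sum_{i=1}^n (x, x_{\alpha_i}) x_{\alpha_i} \Big\| + \Big\| \sum_{i=1}^n (x, x_{\alpha_i}) x_{\alpha_i} \Big\| \le \varepsilon + \Big( \sum_{i=1}^n |(x, x_{\alpha_i})|^2 \Big)^{1/2} \le \varepsilon + \Big( \sum_{\alpha \in A} |(x, x_\alpha)|^2 \Big)^{1/2}. \tag{3.10}$$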
Thus, since $\varepsilon > 0$ is arbitrary, the inequality opposite to Bessel's inequality holds, and (3.6) is true.
(iii) implies (iv). Identity (3.7) is one of inner products and (3.6) one of norms; we derive the former from the latter via (1.5). Specifically, given $x, y \in X$, using (3.6) we compute $(x, y)$ by evaluating the norms that appear on the right-hand side of (1.5). In fact, adding up the relation
$$\|x + y\|^2 = \sum_{\alpha \in A} |(x, x_\alpha) + (y, x_\alpha)|^2$$
to those corresponding to $x - y$ and $x \pm iy$, by (1.5) again, it readily follows that if $((x, x_\alpha))$ and $((y, x_\alpha))$ are the sequences of Fourier coefficients of $x$ and $y$ respectively, then
$$(x, y) = \sum_{\alpha \in A} (x, x_\alpha)\, \overline{(y, x_\alpha)}.$$
This is precisely (3.7). The argument involved in this step is known as "polarization."
(iv) implies (i). Suppose $(x, x_\alpha) = 0$ for all $\alpha \in A$. Then, by (3.7) we get $\|x\|^2 = (x, x) = \sum_{\alpha \in A} |(x, x_\alpha)|^2 = 0$, and consequently, $x = 0$. Thus $\{x_\alpha\}$ is a maximal ONS in $X$. ∎
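Plancherel's equality, Parseval's identity and the polarization step can be verified by hand in a finite-dimensional case. The sketch below is an illustration under our own assumptions: the space is $C^2$, the ONS `basis` and the vectors `x`, `y` are hypothetical, and the polarization identity is taken in the form $4(x,y) = \|x+y\|^2 - \|x-y\|^2 + i\|x+iy\|^2 - i\|x-iy\|^2$, valid for an inner product that is linear in the first variable; this is presumably what (1.5) states.

```python
import math

def inner(u, v):
    """Inner product on C^n, linear in u and conjugate-linear in v."""
    return sum(a * b.conjugate() for a, b in zip(u, v))

def norm_sq(u):
    return inner(u, u).real

def add(u, v, t=1):
    """The vector u + t v."""
    return [a + t * b for a, b in zip(u, v)]

s = 1 / math.sqrt(2)
basis = [[s + 0j, s * 1j], [s + 0j, -s * 1j]]   # hypothetical maximal ONS in C^2
x = [1 + 2j, 3 + 0j]
y = [-1 + 0j, 2 - 1j]

cx = [inner(x, e) for e in basis]   # Fourier coefficients of x
cy = [inner(y, e) for e in basis]   # Fourier coefficients of y

plancherel = sum(abs(c) ** 2 for c in cx)                   # right side of (3.6)
parseval = sum(a * b.conjugate() for a, b in zip(cx, cy))   # right side of (3.7)

# Polarization: recover (x, y) from four norms.
polarized = (norm_sq(add(x, y)) - norm_sq(add(x, y, -1))
             + 1j * norm_sq(add(x, y, 1j)) - 1j * norm_sq(add(x, y, -1j))) / 4

# Dropping an element destroys maximality, and equality in (3.6) is lost.
partial = abs(cx[0]) ** 2

print(plancherel, norm_sq(x))
print(parseval, inner(x, y), polarized)
print(partial)
```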
Observe that, in particular, (iii) above implies that $T$ is an isometry of $X$ onto $\ell^2(A)$, and consequently, $T$ is one-to-one. Thus $T$ establishes an isomorphism between $X$ and $\ell^2(A)$ provided $\{x_\alpha\}_{\alpha \in A}$ is a maximal ONS in $X$. We reiterate that, by Zorn's Lemma, all Hilbert spaces have a maximal ONS. The question is then, how to produce concrete examples of such systems. In the familiar case of $L^2(I)$, where $I$ is an interval of the line, we construct such an example following Corollary 2.3 in Chapter XVII.
It is also apparent that (3.6) gives
$$x = \sum_{\alpha \in A} (x, x_\alpha)\, x_\alpha, \qquad \text{all } x \in X, \tag{3.11}$$
where the sum is understood to converge in the norm of $X$. Thus, in this setting, each element of $X$ is represented by its Fourier series. It is also customary to call a maximal ONS in $X$ a basis. An interesting question we are able to settle at this time is when $X$ has an at most countable basis.
Proposition 3.6. Suppose $X$ is a Hilbert space. Then $X$ has an at most countable basis iff $X$ is separable, and in this case all bases are at most countable.
Proof. We do the sufficiency first. Let $\{x_n\}$ be an at most countable dense subset of $X$. First discard any $x_n$ which is a linear combination of the $x_i$'s with $1 \le i \le n - 1$, and then, by the Gram-Schmidt process, orthonormalize the remaining elements. Call this ONS $\{x_n\}$ again, and note that its span coincides with that of the original dense set, and consequently, the closure of its span is $X$. By (ii) in Theorem 3.5, the set $\{x_n\}$ is an at most countable basis for $X$. Conversely, suppose $\{x_n\}$ is an ON basis for $X$, and note that by (ii) in Theorem 3.5, finite linear combinations of the $x_n$'s are dense in $X$. Let now $\{\lambda_n\}$ be a countable dense subset of the field of scalars, and observe that
$$S = \Big\{ \sum_{\text{finite sums}} \lambda_k y_k : y_k = x_n, \text{ some } n \Big\}$$
is a countable dense subset of $X$, and consequently, $X$ is separable. It only remains to show that any other basis $\{y_\alpha\}_{\alpha \in A}$ for $X$ is also at most countable. For each integer $n$ consider the set
$$A_n = \{\alpha \in A : (y_\alpha, x_n) \ne 0\}, \qquad n = 1, 2, \ldots$$
Since by (3.6)
$$\sum_{\alpha \in A} |(x_n, y_\alpha)|^2 = \|x_n\|^2 = 1,$$
each $A_n$ is at most countable. Thus $A' = \bigcup_n A_n$ is also at most countable. We claim that, unless $\alpha \in A'$, $y_\alpha = 0$; since no element of a basis vanishes, this gives $A = A'$. For, if $y_\alpha \ne 0$, then by the maximality of the $x_n$'s there exists an index $n$ such that $(y_\alpha, x_n) \ne 0$. In this case $\alpha \in A_n \subseteq A'$, and we have finished. ∎
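The first half of the proof, discarding the dependent elements and then orthonormalizing, can be sketched in a few lines. This is only an illustration under our own assumptions: the space is $R^3$, the list `vectors` is a hypothetical input, and the two steps of the proof are merged into a single pass that drops an element when its residual after projection is numerically zero (the tolerance `tol` is our choice).

```python
import math

def inner(u, v):
    """Real inner product on R^n."""
    return sum(a * b for a, b in zip(u, v))

def orthonormalize(vectors, tol=1e-9):
    """Gram-Schmidt, skipping any vector that depends linearly on the earlier ones."""
    ons = []
    for v in vectors:
        r = list(v)
        for e in ons:
            c = inner(r, e)                      # component of r along e
            r = [a - c * b for a, b in zip(r, e)]
        length = math.sqrt(inner(r, r))
        if length > tol:                         # otherwise v is discarded
            ons.append([a / length for a in r])
    return ons

# Hypothetical input: the second vector is twice the first,
# and the fourth equals the first minus the third.
vectors = [[1.0, 1.0, 0.0], [2.0, 2.0, 0.0], [1.0, 0.0, 1.0], [0.0, 1.0, -1.0]]
ons = orthonormalize(vectors)

print(len(ons))
for e in ons:
    print(e)
```

Two elements survive, and their span coincides with that of the four original vectors.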
4. SPECTRAL DECOMPOSITION OF COMPACT OPERATORS
As in the case of the Lebesgue $L^2$ spaces, there is a notion of weak convergence in Hilbert spaces. More precisely, inspired by the Riesz Representation Theorem, we say that a sequence $\{x_n\}$ in a Hilbert space $X$ converges weakly to $x \in X$, provided that
$$\lim_{n \to \infty} (x_n, y) = (x, y), \qquad \text{all } y \in X. \tag{4.1}$$
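To get a feeling for definition (4.1), consider the unit sequences $e_n$ in $\ell^2$: for each fixed $y \in \ell^2$ we have $(e_n, y) \to 0$, although $\|e_n\| = 1$ for all $n$. The sketch below is a finite truncation only, under our own assumptions: sequences are cut off at `N` terms and `y` is the hypothetical element $(1, 1/2, 1/3, \ldots)$.

```python
import math

N = 2000  # truncation level; an assumption made only to keep the example finite

def unit(n):
    """The n-th unit sequence e_n (1-indexed), truncated to N terms."""
    return [1.0 if j == n else 0.0 for j in range(1, N + 1)]

def inner(u, v):
    """Real inner product of two truncated sequences."""
    return sum(a * b for a, b in zip(u, v))

y = [1.0 / j for j in range(1, N + 1)]   # truncation of y = (1, 1/2, 1/3, ...)

indices = (1, 10, 100, 1000)
pairings = [inner(unit(n), y) for n in indices]                 # (e_n, y) = 1/n
norms = [math.sqrt(inner(unit(n), unit(n))) for n in indices]   # ||e_n - 0|| = 1

print(pairings)
print(norms)
```

The pairings tend to $0$ while the norms do not, which is the behavior of a sequence that converges weakly, but not in norm, to $0$.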
As before, convergent sequences are weakly convergent, but the converse is not true. Also, bounded sequences have weakly convergent subsequences. Given a bounded sequence $\{y_n\}$ in $X$, let $\{x_{\alpha_m}\}$ be the at most countable subset of a basis $\{x_\alpha\}_{\alpha \in A}$ for $X$ consisting of those elements for which the Fourier coefficient of at least one of the $y_n$'s does not vanish. Setting now
$$c_{n,m} = (x_{\alpha_m}, y_n), \qquad \text{all } m, n,$$
we obtain a weakly convergent subsequence along the lines of the proof of Theorem 3.3 in Chapter XII.
How do weakly convergent sequences look?
Proposition 4.1. Let $X$ be a Hilbert space, and suppose the sequence $\{x_n\} \subseteq X$ converges weakly to $x \in X$. Then
(i) The set $\{x_n\}$ is bounded.
(ii) The weak limit $x$ lies in the subspace of $X$ spanned by $\{x_1, x_2, \ldots\}$.
(iii) $\|x\| \le \liminf \|x_n\|$.
Proof.
Let $y \in X$; since by assumption we have
$$\lim_{n \to \infty} L_y(x_n - x) = \lim_{n \to \infty} (x_n - x, y) = 0,$$
it readily follows that for each $y \in X$ we have
$$\sup_n |L_y(x_n - x)| \le c_y$$
… $\eta > 0$, such that
$$\|Tx_{n_k} - Tx\| \ge \eta, \qquad k = 1, 2, \ldots$$
Note that the sequence $\{x_{n_k}\}$ is weakly convergent and therefore also bounded. By the assumptions on $T$ there are yet another subsequence,
which we call $\{x_{n_k}\}$ again, and $y \in X$ such that $\lim_{k \to \infty} Tx_{n_k} = y$. But then $\{Tx_{n_k}\}$ is also weakly convergent to $y$, and since weak limits are unique, cf. 4.32 in Chapter XIV, it follows that $Tx = y$. Hence,
$$\lim_{k \to \infty} \|Tx_{n_k} - Tx\| = 0,$$
which is a contradiction. ∎
What are some completely continuous operators? The following is a useful sufficient condition to identify them.
Proposition 4.4. Let $X$ be a Hilbert space, $\{x_\alpha\}_{\alpha \in A}$ a basis for $X$, and $T \in B(X)$. If
$$\sum_{\alpha, \beta \in A} |(Tx_\alpha, x_\beta)|^2 < \infty, \tag{4.3}$$
then $T$ is completely continuous.
Proof.
By translations it suffices to show that if the sequence $\{y_n\} \subseteq X$ converges weakly to $0$, then $\{Ty_n\}$ converges to $0$. First observe that by Theorem 3.5 there are at most countably many $\alpha$'s so that the corresponding terms in (4.3) do not vanish. We may, therefore, consider $\alpha$ as an integer index, and note that from (3.6) we obtain
$$\|Ty_n\|^2 = \sum_{\alpha=1}^{N} |(Ty_n, x_\alpha)|^2 + \sum_{\alpha=N+1}^{\infty} |(Ty_n, x_\alpha)|^2 = A + B,$$
say. Now, since
$$y_n = \sum_{\beta=1}^{\infty} (y_n, x_\beta)\, x_\beta$$
and $T$ is continuous, we get at once that
$$(Ty_n, x_\alpha) = \sum_{\beta=1}^{\infty} (y_n, x_\beta)(Tx_\beta, x_\alpha),$$
and consequently, by the Cauchy-Schwarz inequality we get
$$B = \sum_{\alpha=N+1}^{\infty} \Big| \sum_{\beta=1}^{\infty} (y_n, x_\beta)(Tx_\beta, x_\alpha) \Big|^2$$
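$$\sum_{m=1}^{\infty} |(T\phi_m, \phi_n)|^2 = \sum_{m=1}^{\infty} |(\phi_m, T^*\phi_n)|^2 = \|T^*\phi_n\|^2, \qquad n = 1, 2, \ldots$$
Thus summing over $n$ above we get
$$\sum_{m,n=1}^{\infty} |(T\phi_m, \phi_n)|^2 = \sum_{n=1}^{\infty} \|T^*\phi_n\|^2 = \sum_{n=1}^{\infty} \int_I \Big| \int_I \overline{k(s,x)}\, \phi_n(s)\, ds \Big|^2 dx = \int_I \sum_{n=1}^{\infty} |(\phi_n, k(\cdot, x))|^2\, dx = \int_I \int_I |k(s,x)|^2\, ds\, dx < \infty,$$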
(4.3) holds, and $T$ is compact.
We are now ready to give a detailed description of a compact operator $T$ with the additional property that $T^* = T$; any operator which satisfies this relation is said to be self-adjoint. For example, in the case of the integral operator described above, this corresponds to those kernels $k$ that satisfy $k(x,t) = \overline{k(t,x)}$, $x, t \in I$. As we shall have the opportunity to verify in the case of compact operators, the structure of such an operator is reminiscent of that of a symmetric matrix. And, as in the case of matrices, eigenvalues and related concepts play an important role in determining the properties of a compact self-adjoint operator. An eigenvalue of a linear operator $T$ defined on $X$ is a scalar $\lambda$ such that there exists $0 \ne x \in X$, with the property that
$$Tx = \lambda x. \tag{4.5}$$
An element $x \in X$ for which (4.5) holds is called an eigenvector associated, or corresponding, to the eigenvalue $\lambda$. The collection of all eigenvectors associated to an eigenvalue $\lambda$ is a subspace $X_\lambda$ of $X$, called the eigenspace corresponding to $\lambda$. The relevant facts concerning a self-adjoint operator are included in our next result.
Proposition 4.5. Let $X$ be a Hilbert space, and suppose $T$ is a self-adjoint mapping defined on $X$. Then $T$ satisfies the following properties:
(i) $T$ is bounded.
(ii) For each $x \in X$, $(Tx, x)$ is a real number.
(iii) The norm of $T$ is given by
$$\|T\| = \sup_{\|x\|=1} |(Tx, x)|. \tag{4.6}$$
(iv) All eigenvalues of $T$ are real.
(v) Eigenspaces corresponding to different eigenvalues are orthogonal.
(vi) If $P_\lambda$ denotes the projection onto the eigenspace $X_\lambda$ corresponding to the eigenvalue $\lambda$ of $T$, then
$$TP_\lambda = P_\lambda T = \lambda P_\lambda. \tag{4.7}$$
Proof. (i) Assume that $x_n \to x$, and that $Tx_n \to y$; by the closed graph theorem it suffices to show that $x \in D(T)$ and $Tx = y$. Let $z \in X$, and observe that on the one hand,
$$(Tx_n, z) = (x_n, Tz) \to (x, Tz),$$
and, on the other hand,
$$(Tx_n, z) \to (y, z).$$
Thus, combining the above relations we have
$$(Tx, z) = (x, Tz) = (y, z), \qquad \text{all } z \in X,$$
which, since $z$ is arbitrary, means that $x \in D(T)$ and that $Tx = y$.
(ii) Since for $x \in X$ we have $(Tx, x) = (x, Tx) = \overline{(Tx, x)}$, as anticipated, $(Tx, x)$ is real.
(iii) Let $\eta = \sup_{\|x\|=1} |(Tx, x)|$. Since for $\|x\| = 1$ we have $|(Tx, x)| \le \|Tx\|\,\|x\| \le \|T\|$, it follows that $\eta \le \|T\|$. On the other hand, since by a simple computation it readily follows that
$$4\Re(Tx, y) = (T(x + y), x + y) - (T(x - y), x - y),$$
by putting $x_1 = (1/\|x + y\|)(x + y)$, $x_2 = (1/\|x - y\|)(x - y)$, $\|x_1\| = \|x_2\| = 1$, we get that
$$4\Re(Tx, y) = \|x + y\|^2 (Tx_1, x_1) - \|x - y\|^2 (Tx_2, x_2) \le \eta \|x + y\|^2 + \eta \|x - y\|^2 = 2\eta\, (\|x\|^2 + \|y\|^2). \tag{4.8}$$
Pick now $x \in X$ such that $Tx \ne 0$ and $\|x\| = 1$, and put $y = (1/\|Tx\|)\,Tx$, $\|y\| = 1$, in (4.8). That inequality then becomes
$$4\Re(Tx, Tx)/\|Tx\| \le 4\eta,$$
or, equivalently, $\|Tx\| \le \eta$. This gives $\|T\| \le \eta$, and (4.6) holds.
(iv) If $\lambda$ is an eigenvalue of $T$ and $x$ is an eigenvector corresponding to $\lambda$ with $\|x\| = 1$, we have
$$\lambda = \lambda (x, x) = (Tx, x),$$
which by (ii) above is real.
(v) Let $\lambda \ne \mu$ be eigenvalues of $T$, and let $X_\lambda$ and $X_\mu$ denote the eigenspaces corresponding to $\lambda$ and $\mu$, respectively. Now, assume $x \in X_\lambda$, $y \in X_\mu$, and note that since the eigenvalues are real we get
$$\lambda (x, y) = (Tx, y) = (x, Ty) = \mu (x, y),$$
which can only be true if $(x, y) = 0$.
(vi) Since $P_\lambda x \in X_\lambda$ for all $x \in X$, it readily follows that
$$TP_\lambda x = \lambda P_\lambda x, \qquad x \in X.$$
As for the commutation relation in (4.7), observe that for $x, y \in X$ we have $(TP_\lambda x, y) = (\lambda P_\lambda x, y) = (x, \lambda P_\lambda y) = (x, TP_\lambda y) = (Tx, P_\lambda y) = (P_\lambda Tx, y)$. Since $x$ and $y$ are arbitrary, this can only happen if $TP_\lambda = P_\lambda T$. ∎
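Several of the properties just proved can be observed on a $2 \times 2$ Hermitian matrix. The sketch below is an illustration under our own assumptions: the matrix `T` is a hypothetical example, its eigenvalues are found from the characteristic polynomial, and the supremum in (4.6) is only sampled on a few unit vectors rather than computed.

```python
import cmath
import math

def inner(u, v):
    """Inner product on C^2, linear in u and conjugate-linear in v."""
    return sum(a * b.conjugate() for a, b in zip(u, v))

def apply(T, x):
    return [sum(T[i][j] * x[j] for j in range(2)) for i in range(2)]

T = [[2 + 0j, 1 - 1j], [1 + 1j, 3 + 0j]]   # hypothetical Hermitian matrix

# Roots of the characteristic polynomial t^2 - trace * t + det.
trace = (T[0][0] + T[1][1]).real
det = (T[0][0] * T[1][1] - T[0][1] * T[1][0]).real
root = math.sqrt(trace ** 2 - 4 * det)
eigenvalues = [(trace - root) / 2, (trace + root) / 2]

def unit_eigenvector(lam):
    # Solves the first row of (T - lam I) v = 0; usable because T[0][1] != 0.
    v = [-T[0][1], T[0][0] - lam]
    length = math.sqrt(inner(v, v).real)
    return [a / length for a in v]

v1, v2 = (unit_eigenvector(lam) for lam in eigenvalues)

def quadratic_form(x):
    """The number (Tx, x)."""
    return inner(apply(T, x), x)

# A few unit vectors (cos t, e^{ip} sin t) on which to sample (Tx, x).
samples = [[math.cos(t) + 0j, cmath.exp(1j * p) * math.sin(t)]
           for t in (0.0, 0.4, 0.8, 1.2, 1.5) for p in (0.0, 1.0, 2.0, 4.0)]
values = [quadratic_form(x) for x in samples]

print(eigenvalues)                         # (iv): real eigenvalues
print(abs(inner(v1, v2)))                  # (v): orthogonal eigenvectors
print(max(abs(q.imag) for q in values))    # (ii): (Tx, x) is real
print(max(abs(q) for q in values), quadratic_form(v2).real)  # (iii)
```

The sampled values of $|(Tx, x)|$ never exceed the larger eigenvalue in absolute value, and that bound is attained at the corresponding unit eigenvector.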
Referring to the finite dimensional case, it is possible to represent a self-adjoint mapping $T: C^n \to C^n$ in terms of projections. First choose a basis for $C^n$ and represent $T$ by a Hermitian matrix, which we also denote by $T$; for simplicity assume that the matrix $T$ has $n$ different, necessarily real, eigenvalues $\lambda_1 < \cdots < \lambda_n$, say. Then $T$ has an ONS of $n$ eigenvectors $x_1, \ldots, x_n$, say, where $x_i$ corresponds to the eigenvalue $\lambda_i$, $1 \le i \le n$. But this is a basis for $C^n$, so that
$$x = \sum_{i=1}^n (x, x_i)\, x_i, \qquad \text{all } x \in C^n.$$
Consequently, $Tx$ is equal to
$$\sum_{i=1}^n (x, x_i)\, Tx_i = \sum_{i=1}^n \lambda_i (x, x_i)\, x_i, \qquad x \in C^n.$$
Thus, if $P_i: C^n \to \{x_i\}$ denotes the projection of $C^n$ onto the eigenspace of $T$ corresponding to the eigenvalue $\lambda_i$, $1 \le i \le n$, the above expression becomes
$$T = \sum_{i=1}^n \lambda_i P_i.$$
This is a representation of $T$ in terms of projections; a natural question is whether a similar representation holds for compact self-adjoint operators on a Hilbert space. The first step is to show that such operators do have eigenvalues.
Lemma 4.6. Let $X$ be a Hilbert space and $T$ a compact self-adjoint mapping defined on $X$. Then $T$ has a nonzero eigenvalue.
Proof. Since the conclusion of the lemma is obvious when $T = 0$, we assume that $T \ne 0$. Let
$$\mu = \inf_{\|x\|=1} (Tx, x) \qquad \text{and} \qquad \eta = \sup_{\|x\|=1} (Tx, x);$$
by (iii) in Proposition 4.5, we have
$$\|T\| = \max\{|\mu|, \eta\}.$$
We claim that
$$\lambda = \begin{cases} \mu & \text{if } \|T\| = |\mu|, \\ \eta & \text{if } \|T\| = \eta, \end{cases}$$
is an eigenvalue of $T$. Consider, for instance, the case when $\|T\| = |\mu| > 0$. By the definition of $\mu$ there exists a sequence $\{x_n\}$ such that
$$\lim_{n \to \infty} (Tx_n, x_n) = \mu, \qquad \|x_n\| = 1, \text{ all } n.$$
Since $T$ is compact we can choose a subsequence, which we call $\{x_n\}$ again, such that $\lim_{n \to \infty} Tx_n = y$, say. Moreover, since
$$\|Tx_n - \mu x_n\|^2 = \|Tx_n\|^2 - 2\mu (Tx_n, x_n) + \mu^2 \le \|T\|^2 - 2\mu (Tx_n, x_n) + \mu^2,$$
and, because $\|T\|^2 = \mu^2$, the right-hand side above tends to $0$ as $n \to \infty$, we get that
$$\lim_{n \to \infty} (Tx_n - \mu x_n) = 0.$$
But then, writing
$$x_n = \frac{1}{\mu} \big( Tx_n - (Tx_n - \mu x_n) \big), \qquad n = 1, 2, \ldots$$
it follows that $\lim_{n \to \infty} x_n = (1/\mu)\, y$; in particular, $\|y\| = |\mu| > 0$, so that $y \ne 0$. In this case we have
$$y = \lim_{n \to \infty} Tx_n = (1/\mu)\, Ty,$$
and consequently, $\mu$ is an eigenvalue for $T$, and $y$ is an eigenvector corresponding to $\mu$. ∎
It is interesting to point out that the assumption that $T$ is self-adjoint is necessary for the validity of Lemma 4.6. Indeed, let $X = \ell^2$ and suppose $T: X \to X$ is given by
T is compact, but not self-adjoint. Also, as the reader can readily verify, T has no nonzero eigenvalues.
Corollary 4.7. Let X be a Hilbert space, and suppose T is a compact self-adjoint operator defined on X. If T has no nonzero eigenvalues, then T = O.
Observe that the eigenvalue we just found in Lemma 4.6 is one with largest absolute value. Indeed, if $\lambda$ is an eigenvalue of $T$, and $x$ is an eigenvector corresponding to $\lambda$ with $\|x\| = 1$, then we have
$$|\lambda| = |\lambda| (x, x) = |(Tx, x)| \le \|T\| = |\mu|,$$
which is precisely our remark. The stage is now set for the description of the action of a compact self-adjoint mapping in terms of projections.
Theorem 4.8 (Spectral Decomposition of Compact Operators). Let $X$ be a Hilbert space, and $T$ a compact self-adjoint mapping defined on $X$. Then, the set of distinct eigenvalues $\{\lambda_n\}$ of $T$ is at most countable, and if $P_{\lambda_n}$ denotes the projection of $X$ onto the eigenspace $X_{\lambda_n}$ corresponding to the eigenvalue $\lambda_n$, we have
$$T = \sum_n \lambda_n P_{\lambda_n}. \tag{4.9}$$
The convergence of the series in (4.9) is understood to be in the sense of the norm of $B(X)$, i.e.,
$$\lim_{m \to \infty} \Big\| T - \sum_{n=1}^m \lambda_n P_{\lambda_n} \Big\| = 0.$$
Proof. Unless $T = 0$, by Lemma 4.6, $T$ has an eigenvalue $\lambda_1$, say, with the largest absolute value. If $T_1 = T$ and if $P_{\lambda_1}$ denotes the projection of $X$ onto $X_{\lambda_1}$, the eigenspace corresponding to the eigenvalue $\lambda_1$, consider the mapping $T_2$ given by $T_2 = T_1 - \lambda_1 P_{\lambda_1}$. In view of (vi) in Proposition 4.5 we can rewrite $T_2$ as
$$T_2 = T_1 (I - P_{\lambda_1}) = (I - P_{\lambda_1})\, T_1. \tag{4.10}$$
Since $T_1$ is compact and self-adjoint, by 5.48 below, also $T_2$ is compact and self-adjoint. Moreover, since $I - P_{\lambda_1}$ is a projection, and as such $\|I - P_{\lambda_1}\| \le 1$, we have
$$\|T_2\| \le \|T_1\|\, \|I - P_{\lambda_1}\| \le \|T_1\|. \tag{4.11}$$
Whence applying Lemma 4.6 to $T_2$ now, unless $T_2 = 0$ we can find an eigenvalue $0 \ne \lambda_2$ of $T_2$ with largest absolute value. By (4.11), $|\lambda_1| \ge |\lambda_2|$. Moreover, we claim that the following is also true: $\lambda_1$ is not an eigenvalue of $T_2$, and every nonzero eigenvalue of $T_2$ is at the same time an eigenvalue of $T_1$, and the corresponding eigenspaces coincide.
First we show that $\lambda_1$ is not an eigenvalue of $T_2$. For if it were, let $0 \ne x$ be an eigenvector corresponding to $\lambda_1$, and note that by (4.10) and (4.7) we have
$$T_1 x - \lambda_1 P_{\lambda_1} x = \lambda_1 x. \tag{4.12}$$
Now, applying $P_{\lambda_1}$ to both sides, again by (4.7), we get
$$\lambda_1 P_{\lambda_1} x - \lambda_1 P_{\lambda_1} x = \lambda_1 P_{\lambda_1} x = 0, \tag{4.13}$$
which substituted into (4.12) gives $T_1 x = \lambda_1 x$, i.e., $x \in X_{\lambda_1}$. But if this is the case, then by (4.13) we also have $x = P_{\lambda_1} x = 0$, which gives the desired contradiction.
We now show that every nonzero eigenvalue of $T_2$ is at the same time an eigenvalue of $T_1$, and the corresponding eigenspaces coincide. In fact, let $\lambda \ne 0$ be an eigenvalue of $T_2$, and $x \ne 0$ an eigenvector of $T_2$ corresponding to $\lambda$. By the definition of $T_2$ we have
$$T_1 (I - P_{\lambda_1})\, x = \lambda x, \tag{4.14}$$
and consequently, it follows that
$$(I - P_{\lambda_1})\, T_1 (I - P_{\lambda_1})\, x = \lambda (I - P_{\lambda_1})\, x. \tag{4.15}$$
Moreover, since $T_1$ commutes with $(I - P_{\lambda_1})$, the left-hand side of (4.15) equals $T_1 (I - P_{\lambda_1})\, x$, which happens to be the left-hand side of (4.14).
Whence equating the right-hand sides of (4.14) and (4.15), and since $\lambda \ne 0$, we get $x = (I - P_{\lambda_1})\, x$, which gives
$$T_1 x = T_1 (I - P_{\lambda_1})\, x = \lambda x.$$
Thus, $\lambda$ is also an eigenvalue of $T_1$ with eigenvector $x$, and consequently the eigenspace corresponding to $\lambda$ as an eigenvalue of $T_2$ is contained in that of $\lambda$ as an eigenvalue of $T_1$. Now, since $\lambda \ne \lambda_1$, by (v) in Proposition 4.5 it follows that the eigenspaces $X_\lambda$ and $X_{\lambda_1}$, corresponding to $T_1$, are orthogonal. Whence if $y$ is an eigenvector of $T_1$ corresponding to $\lambda$ it follows that $P_{\lambda_1} y = 0$, and by (4.10) we have
$$T_2 y = T_1 (I - P_{\lambda_1})\, y = T_1 y = \lambda y.$$
Thus, $y$ is also an eigenvector of $T_2$ corresponding to $\lambda$, and the eigenspaces coincide.
Repeating this process we construct compact self-adjoint operators $T_1 = T, T_2, \ldots, T_n$, and eigenvalues $\lambda_1, \ldots, \lambda_n$ of these operators, such that
$$T_n = T - \sum_{k=1}^{n-1} \lambda_k P_{\lambda_k}$$
and
$$|\lambda_1| \ge |\lambda_2| \ge \cdots \ge |\lambda_n|.$$
Further, by what has been shown above, the $\lambda_k$'s are distinct eigenvalues of $T_1$. Now, if for some $n$ we have $T_n = 0$, then the sum in (4.9) is finite, and the conclusion follows. However, if $T_n \ne 0$ for every $n$, then the process described above leads to a sequence $\{T_n\}$ of compact self-adjoint operators and a corresponding sequence $\{\lambda_n\}$ of eigenvalues. Next we show that in this case $\lambda_n \to 0$ as $n \to \infty$. Suppose not, then it follows that
$$|\lambda_n| \ge \varepsilon > 0, \qquad \text{all } n.$$
Choose now an ONS $\{x_n\}$ consisting of eigenvectors associated to the $\lambda_n$'s. By the Pythagorean theorem we get
$$\|Tx_m - Tx_n\|^2 = \|\lambda_m x_m - \lambda_n x_n\|^2 = |\lambda_m|^2 + |\lambda_n|^2 \ge 2\varepsilon^2, \qquad m \ne n,$$
and consequently, neither the sequence $\{Tx_n\}$ nor any of its subsequences converges, thus contradicting the fact that $T$ is compact. Moreover, since $\|T_n\| = |\lambda_n|$ for all $n$, we also have $\lim_{n \to \infty} \|T_n\| = 0$, and (4.9) holds.
There is yet a last detail to be checked, namely, that $T$ has no nonzero eigenvalues apart from the $\lambda_n$'s. For, if $\lambda \ne 0$ is such an eigenvalue and if $x \ne 0$ is an eigenvector corresponding to $\lambda$, then by (4.9) we get
$$\lambda x = Tx = \sum_n \lambda_n P_{\lambda_n} x. \tag{4.16}$$
Now, by (v) in Proposition 4.5 the elements $P_{\lambda_n} x$, $n = 1, 2, \ldots$ are pairwise orthogonal. Hence by (4.16) it follows that
$$\lambda P_{\lambda_m} x = P_{\lambda_m} \Big( \sum_n \lambda_n P_{\lambda_n} x \Big) = \lambda_m P_{\lambda_m} x, \qquad \text{all } m,$$
and since $\lambda \ne \lambda_m$, we have that $P_{\lambda_m} x = 0$ for all $m$. Again by (4.16) we get that $\lambda x = 0$, which, in turn, implies that $x = 0$, a contradiction. ∎
Theorem 4.8 has many important applications, including the development of a functional calculus for compact self-adjoint operators. For instance, if $T$ is such an operator, to represent $T^2$ observe that on account of (4.9) we have
$$T^2 x = \sum_n \lambda_n P_{\lambda_n} Tx = \sum_n \lambda_n^2 P_{\lambda_n} x, \qquad \text{all } x \in X.$$
It is then apparent that for polynomials $p$, we also have
$$p(T)\, x = \sum_n p(\lambda_n)\, P_{\lambda_n} x, \qquad \text{all } x \in X,$$
where the notation $p(T)$ is self-explanatory. Moreover, since functions $f$ that are continuous on $[\mu, \eta]$ are uniform limits of polynomials (we prove this in Corollary 2.3 in Chapter XVII), we also have
$$f(T)\, x = \sum_n f(\lambda_n)\, P_{\lambda_n} x, \qquad \text{all } x \in X.$$
In fact, the expression on the right-hand side above defines $f(T)$.
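In the finite dimensional case the functional calculus can be carried out by hand. The sketch below is an illustration under our own assumptions: `T` is a hypothetical real symmetric $2 \times 2$ matrix whose eigenvalues $1$ and $3$ and unit eigenvectors are entered directly, and `f_of_T` is our name for the sum $\sum_n f(\lambda_n) P_{\lambda_n}$.

```python
import math

def outer(v):
    """Matrix of the projection x -> (x, v) v onto the line spanned by a real unit vector v."""
    return [[a * b for b in v] for a in v]

def combine(coeffs, mats):
    """The matrix sum_n coeffs[n] * mats[n]."""
    return [[sum(c * M[i][j] for c, M in zip(coeffs, mats)) for j in range(2)]
            for i in range(2)]

def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def close(A, B, tol=1e-9):
    return all(abs(A[i][j] - B[i][j]) < tol for i in range(2) for j in range(2))

T = [[2.0, 1.0], [1.0, 2.0]]                 # hypothetical symmetric matrix
s = 1 / math.sqrt(2)
spectrum = [(1.0, [s, -s]), (3.0, [s, s])]   # eigenvalues with unit eigenvectors
projections = [outer(v) for _, v in spectrum]

def f_of_T(f):
    """sum_n f(lambda_n) P_{lambda_n}, the right-hand side defining f(T)."""
    return combine([f(lam) for lam, _ in spectrum], projections)

identity_case = f_of_T(lambda t: t)       # should reproduce T, as in (4.9)
square_case = f_of_T(lambda t: t * t)     # should reproduce T^2
square_root = f_of_T(math.sqrt)           # a square root of T

print(identity_case)
print(square_case, matmul(T, T))
print(matmul(square_root, square_root))
```

The last line shows that the operator obtained from $f(t) = \sqrt{t}$ squares back to $T$, which is what one expects of a reasonable definition of $f(T)$.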
There is yet another way to express the identity in (4.9). Consider the operators $E_\lambda = \sum_{\lambda_n \le \lambda} P_{\lambda_n}$ and, for $x, y \in X$, put $\phi(\lambda) = (E_\lambda x, y)$; $\phi$ is a right-continuous function that vanishes for $\lambda < \mu$, and that is constant for $\lambda > \eta$. We then have
$$(Tx, y) = \mu\, \phi(\mu) + \int_\mu^\eta \lambda\, d\phi(\lambda),$$
where the integral is an ordinary Riemann-Stieltjes integral. A similar representation is true for arbitrary self-adjoint operators, not necessarily compact; we will not discuss it here.
5. PROBLEMS AND QUESTIONS
5.1 Suppose $X$ is a real inner product space and that the elements $x, y \in X$ satisfy $\|x + y\|^2 = \|x\|^2 + \|y\|^2$. Show that $x \perp y$. Is the result true for complex inner product spaces?
5.2 Suppose $X$ is an inner product space and show that the elements $x, y \in X$ satisfy $\|x + y\| = \|x\| + \|y\|$ iff there exists a scalar $\lambda > 0$ such that $x = \lambda y$.
5.3 If $X$ is a real inner product space and $x, y \in X$ satisfy $\|x\| = \|y\|$, then $(x - y) \perp (x + y)$. What does this mean geometrically? What does the assumption imply in case $X$ is a complex inner product space?
5.4 Suppose $x, y, z$ are elements of an inner product space $X$. Show that Apollonius' identity holds, to wit,
$$\|z - x\|^2 + \|z - y\|^2 = \|x - y\|^2/2 + 2\,\|z - (x + y)/2\|^2.$$
5.5 Can we obtain the norm $\|z\| = |z_1| + |z_2|$ in $C^2$ from an inner product?
5.6 Discuss under what conditions equality holds in the Cauchy-Schwarz inequality.
5.7 Show that if $y, x, x_n$ are elements of an inner product space $X$, $n = 1, 2, \ldots$, and if $y \perp x_n$ for all $n$, and $\lim_{n \to \infty} x_n = x$, then $y \perp x$.
5.8 Show that in an inner product space $X$, $x \perp y$ iff $\|x + \lambda y\| = \|x - \lambda y\|$ for all scalars $\lambda$. Further, $x \perp y$ iff $\|x + \lambda y\| \ge \|x\|$ for all scalars $\lambda$.
5.9 Prove that the span of a subset $M$ of a Hilbert space $X$ is dense in $X$ iff $M^\perp = \{0\}$.
5.10 Let $M_1 \supseteq M_2$ be nonempty subsets of an inner product space $X$. Show that (a) $M_1 \subseteq M_1^{\perp\perp}$, (b) $M_1^\perp \subseteq M_2^\perp$, and (c) $M_1^{\perp\perp\perp} = M_1^\perp$. Further, show that a subspace $Y$ of a Hilbert space $X$ is closed iff $Y = Y^{\perp\perp}$.
5.11 Suppose $M$ is a closed subspace of a Hilbert space $X$ and let $x$ be an element of $X$. Prove that
$$\min\{\|x - y\| : y \in M\} = \max\{|(x, y)| : y \in M^\perp, \|y\| = 1\}.$$
5.12 Let $I = [0,1]$. Compute $\max \int_I x^3 f\, dx$, and $\min_{a,b} \int_I (x^5 - a - bx)^2\, dx$, subject to the conditions
$$\int_I x^k f(x)\, dx = 0, \quad k = 0, 1, 2, \qquad \text{and} \qquad \int_I |f(x)|^2\, dx = 1.$$
5.13 The following extension of Theorem 1.8 is called the Lax-Milgram lemma. Let $X$ be a Hilbert space. Let $B(x, y)$ be a complex-valued functional defined on $X \times X$ which satisfies the following four conditions:
(i) $B(x_1 + \lambda x_2, y) = B(x_1, y) + \lambda B(x_2, y)$ for all $x_1, x_2, y$ in $X$ and scalars $\lambda$.
(ii) $B(x, y_1 + \lambda y_2) = B(x, y_1) + \bar{\lambda} B(x, y_2)$ for all $x, y_1, y_2$ in $X$ and scalars $\lambda$.
(iii) There is a positive constant $k$ such that $|B(x, y)| \le k \|x\|\, \|y\|$ for all $x, y$ in $X$.
(iv) There exists a positive constant $c$ such that $|B(x, x)| \ge c \|x\|^2$ for all $x \in X$.
Then for every $L \in X^*$ there exists a unique element $y \in X$ such that $Lx = B(x, y)$ for all $x \in X$. More precisely, there exists a uniquely determined bounded linear operator $T$ with a bounded inverse $T^{-1}$ such that $(x, y) = B(x, Ty)$ for all $x, y \in X$, and
$$\|T\| \le 1/c, \qquad \|T^{-1}\| \le k.$$
5.14 Let $X$ be a complex inner product space. A complex-valued functional $H(x, y)$ defined on $X \times X$ is said to be a Hermitian form provided that the following two conditions hold:
(i) $H(x_1 + \lambda x_2, y) = H(x_1, y) + \lambda H(x_2, y)$ for all $x_1, x_2, y$ in $X$, and scalars $\lambda$.
(ii) $H(x, y) = \overline{H(y, x)}$ for all $x, y \in X$.
$H$ is said to be positive semidefinite if
(iii) $H(x, x) \ge 0$ for all $x \in X$.
Show that if (i), (ii) and (iii) hold, then $H$ satisfies the following Cauchy-Schwarz inequality
$$|H(x, y)|^2 \le H(x, x)\, H(y, y), \qquad \text{all } x, y \in X.$$
Further, show that $p(x) = \sqrt{H(x, x)}$ defines a seminorm on $X$.
5.15 Let $X, Y$ be inner product spaces and $T: X \to Y$ be a bounded linear operator. Then, (a) $T = 0$ iff $(Tx, y) = 0$ for all $x$ in $X$ and $y$ in $Y$, and, (b) If $X = Y$ is a complex linear space, then $T = 0$ iff $(Tx, x) = 0$ for all $x$ in $X$.
5.16 Suppose $X$ is a Hilbert space, and let $T \in B(X)$. Show that
$$\|T\| = \sup_{\|x\| = \|y\| = 1} |(Tx, y)|.$$
5.17 Suppose $P: X \to X$ is a linear map that satisfies (2.1) in Proposition 2.1 and so that $P^2$ is a projection. Is $P$ a projection?
5.18 Let $I = [0,1]$ and consider the linear mapping $T: L^2(I) \to L^2(I)$ given by $Tf(x) = a(x) f(x)$, $f \in L^2(I)$. Find necessary and sufficient conditions on $a(x)$ for $T$ to be a projection.
5.19 Let $X$ be a Hilbert space and suppose $P_1, \ldots, P_n$ are projections on $X$. Find necessary and sufficient conditions for $P = P_1 + \cdots + P_n$ to be a projection. What does the subspace of $X$ onto which $P$ projects look like?
5.20 Let $M$ be a closed subspace of an infinite-dimensional Hilbert space $X$. Show that the projection $P_M$ is compact iff $M$ is finite dimensional.
5.21 (Gram-Schmidt) Given an arbitrary linearly independent sequence of elements $\{y_n\}$ in an inner product space $X$ there is an ONS of elements $\{x_n\}$ in $X$ such that
$$\text{span}\{y_1, \ldots, y_n\} = \text{span}\{x_1, \ldots, x_n\}$$
for every $n$.
5.22 Suppose $\|x\| = 1$. Show that at most $1/\varepsilon^2$ of the Fourier coefficients of $x$ with respect to any ONS in $X$ exceed, in modulus, any $\varepsilon > 0$.
5.23 Suppose $X$ is a Hilbert space, show that $X$ has an orthonormal basis. Moreover, in case $X$ is separable, the existence of such a basis may be established without invoking Zorn's Lemma.
5.24 Suppose $\{x_n\}$ is a basis for a Hilbert space $X$, and let $y_n = x_n - x_{n+1}$, $n \ge 1$. Show that the system $\{y_n\}$, although not orthonormal, is nevertheless complete in $X$.
5.25 Suppose $\{x_n\}$ is a basis for a Hilbert space $X$, and $\{y_n\} \subseteq X$ is such that $\sum_n \|x_n - y_n\|^2 < 1$. Show that $\{y_n\}$ is complete in $X$. Is the same conclusion true if $\sum_n \|x_n - y_n\|^2 < \infty$ instead?
5.26 (Paley-Wiener) Suppose $\{x_n\}$ is a basis for a Hilbert space $X$, and suppose the sequence $\{y_n\} \subseteq X$ satisfies
$$\Big\| \sum_{m=1}^n \lambda_m (x_m - y_m) \Big\| \le A\, \Big\| \sum_{m=1}^n \lambda_m x_m \Big\|,$$
for any scalars $\lambda_m$, $m \ge 1$, and a constant $A$, $0 \le A < 1$, independent of the choice of scalars and $n$. Show that each $x \in X$ can be written as $x = \sum_{n=1}^\infty c_n y_n$, where the $c_n$'s are scalars and the sum converges in $X$.
5.27 Suppose $I = [a,b]$ is an interval of the line. Show that an ONS {