Classical and Multilinear Harmonic Analysis, Volume 1




Classical and Multilinear Harmonic Analysis

This two-volume text in harmonic analysis introduces a wealth of analytical results and techniques. It is largely self-contained and is intended for graduates and researchers in pure and applied analysis. Numerous exercises and problems make the text suitable for self-study and the classroom alike.

This first volume starts with classical one-dimensional topics: Fourier series; harmonic functions; Hilbert transforms. Then the higher-dimensional Calderón–Zygmund and Littlewood–Paley theories are developed. Probabilistic methods and their applications are discussed, as are applications of harmonic analysis to partial differential equations. The volume concludes with an introduction to the Weyl calculus.

The second volume goes beyond the classical to the highly contemporary and focuses on multilinear aspects of harmonic analysis: the bilinear Hilbert transform; Coifman–Meyer theory; Carleson’s resolution of the Lusin conjecture; Calderón’s commutators and the Cauchy integral on Lipschitz curves. The material in this volume has not been collected previously in book form.

Camil Muscalu is Associate Professor in the Department of Mathematics at Cornell University. Wilhelm Schlag is Professor in the Department of Mathematics at the University of Chicago.

CAMBRIDGE STUDIES IN ADVANCED MATHEMATICS

Editorial Board: B. Bollobás, W. Fulton, A. Katok, F. Kirwan, P. Sarnak, B. Simon, B. Totaro

All the titles listed below can be obtained from good booksellers or from Cambridge University Press. For a complete series listing visit:

Already published
93 D. Applebaum Lévy processes and stochastic calculus (1st Edition)
94 B. Conrad Modular forms and the Ramanujan conjecture
95 M. Schechter An introduction to nonlinear analysis
96 R. Carter Lie algebras of finite and affine type
97 H. L. Montgomery & R. C. Vaughan Multiplicative number theory, I
98 I. Chavel Riemannian geometry (2nd Edition)
99 D. Goldfeld Automorphic forms and L-functions for the group GL(n,R)
100 M. B. Marcus & J. Rosen Markov processes, Gaussian processes, and local times
101 P. Gille & T. Szamuely Central simple algebras and Galois cohomology
102 J. Bertoin Random fragmentation and coagulation processes
103 E. Frenkel Langlands correspondence for loop groups
104 A. Ambrosetti & A. Malchiodi Nonlinear analysis and semilinear elliptic problems
105 T. Tao & V. H. Vu Additive combinatorics
106 E. B. Davies Linear operators and their spectra
107 K. Kodaira Complex analysis
108 T. Ceccherini-Silberstein, F. Scarabotti & F. Tolli Harmonic analysis on finite groups
109 H. Geiges An introduction to contact topology
110 J. Faraut Analysis on Lie groups: an introduction
111 E. Park Complex topological K-theory
112 D. W. Stroock Partial differential equations for probabilists
113 A. Kirillov, Jr An introduction to Lie groups and Lie algebras
114 F. Gesztesy et al. Soliton equations and their algebro-geometric solutions, II
115 E. de Faria & W. de Melo Mathematical tools for one-dimensional dynamics
116 D. Applebaum Lévy processes and stochastic calculus (2nd Edition)
117 T. Szamuely Galois groups and fundamental groups
118 G. W. Anderson, A. Guionnet & O. Zeitouni An introduction to random matrices
119 C. Perez-Garcia & W. H. Schikhof Locally convex spaces over non-Archimedean valued fields
120 P. K. Friz & N. B. Victoir Multidimensional stochastic processes as rough paths
121 T. Ceccherini-Silberstein, F. Scarabotti & F. Tolli Representation theory of the symmetric groups
122 S. Kalikow & R. McCutcheon An outline of ergodic theory
123 G. F. Lawler & V. Limic Random walk: a modern introduction
124 K. Lux & H. Pahlings Representations of groups
125 K. S. Kedlaya p-adic differential equations
126 R. Beals & R. Wong Special functions
127 E. de Faria & W. de Melo Mathematical aspects of quantum field theory
128 A. Terras Zeta functions of graphs
129 D. Goldfeld & J. Hundley Automorphic representations and L-functions for the general linear group, I
130 D. Goldfeld & J. Hundley Automorphic representations and L-functions for the general linear group, II
131 D. A. Craven The theory of fusion systems
132 J. Väänänen Models and games
133 G. Malle & D. Testerman Linear algebraic groups and finite groups of Lie type
134 P. Li Geometric analysis
135 F. Maggi Sets of finite perimeter and geometric variational problems
136 M. Brodmann & R. Y. Sharp Local cohomology (2nd Edition)
137 C. Muscalu & W. Schlag Classical and multilinear harmonic analysis, I
138 C. Muscalu & W. Schlag Classical and multilinear harmonic analysis, II
139 B. Helffer Spectral theory and its applications

Classical and Multilinear Harmonic Analysis
Volume I

CAMIL MUSCALU
Cornell University

WILHELM SCHLAG
University of Chicago

CAMBRIDGE UNIVERSITY PRESS
Cambridge, New York, Melbourne, Madrid, Cape Town, Singapore, São Paulo, Delhi, Mexico City

Cambridge University Press
The Edinburgh Building, Cambridge CB2 8RU, UK

Published in the United States of America by Cambridge University Press, New York

Information on this title:

© Camil Muscalu and Wilhelm Schlag 2013

This publication is in copyright. Subject to statutory exception and to the provisions of relevant collective licensing agreements, no reproduction of any part may take place without the written permission of Cambridge University Press.

First published 2013

Printed and bound in the United Kingdom by the MPG Books Group

A catalogue record for this publication is available from the British Library

Library of Congress Cataloguing in Publication data
Muscalu, C. (Camil), author.
Classical and multilinear harmonic analysis / C. Muscalu and W. Schlag.
volumes cm. – (Cambridge studies in advanced mathematics ; 137–)
Includes bibliographical references.
ISBN 978-0-521-88245-3 (v. 1 : hardback)
1. Harmonic analysis. I. Schlag, Wilhelm, 1969– author. II. Title.
QA403.M87 2013
515 .2422 – dc23 2012024828

ISBN 978-0-521-88245-3 Hardback

Cambridge University Press has no responsibility for the persistence or accuracy of URLs for external or third-party internet websites referred to in this publication, and does not guarantee that any content on such websites is, or will remain, accurate or appropriate.


Contents

Preface
Acknowledgements

1 Fourier series: convergence and summability
1.1 The basics: partial sums and the Dirichlet kernel
1.2 Approximate identities, Fejér kernel
1.3 The $L^p$ convergence of partial sums
1.4 Regularity and Fourier series
1.5 Higher dimensions
1.6 Interpolation of operators
Notes
Problems

2 Harmonic functions; Poisson kernel
2.1 Harmonic functions
2.2 The Poisson kernel
2.3 The Hardy–Littlewood maximal function
2.4 Almost everywhere convergence
2.5 Weighted estimates for maximal functions
Notes
Problems

3 Conjugate harmonic functions; Hilbert transform
3.1 Hardy spaces of analytic functions
3.2 Riesz theorems
3.3 Definition and simple properties of the conjugate function
3.4 The weak-$L^1$ bound on the maximal function
3.5 The Hilbert transform
3.6 Convergence of Fourier series in $L^p$
Notes
Problems

4 The Fourier transform on $\mathbb{R}^d$ and on LCA groups
4.1 The Euclidean Fourier transform
4.2 Method of stationary or nonstationary phases
4.3 The Fourier transform on locally compact Abelian groups
Notes
Problems

5 Introduction to probability theory
5.1 Probability spaces; independence
5.2 Sums of independent variables
5.3 Conditional expectations; martingales
Notes
Problems

6 Fourier series and randomness
6.1 Fourier series on $L^1(\mathbb{T})$: pointwise questions
6.2 Random Fourier series: the basics
6.3 Sidon sets
Notes
Problems

7 Calderón–Zygmund theory of singular integrals
7.1 Calderón–Zygmund kernels
7.2 The Laplacian: Riesz transforms and fractional integration
7.3 Almost everywhere convergence; homogeneous kernels
7.4 Bounded mean oscillation space
7.5 Singular integrals and $A_p$ weights
7.6 A glimpse of $H^1$–BMO duality and further remarks
Notes
Problems

8 Littlewood–Paley theory
8.1 The Mikhlin multiplier theorem
8.2 Littlewood–Paley square-function estimate
8.3 Calderón–Zygmund decomposition, Hölder spaces, and Schauder estimates
8.4 The Haar functions; dyadic harmonic analysis
8.5 Oscillatory multipliers
Notes
Problems

9 Almost orthogonality
9.1 Cotlar’s lemma
9.2 Calderón–Vaillancourt theorem
9.3 Hardy’s inequality
9.4 The $T(1)$ theorem via Haar functions
9.5 Carleson measures, BMO, and $T(1)$
Notes
Problems

10 The uncertainty principle
10.1 Bernstein’s bound and Heisenberg’s uncertainty principle
10.2 The Amrein–Berthier theorem
10.3 The Logvinenko–Sereda theorem
10.4 Solvability of constant-coefficient linear PDEs
Notes
Problems

11 Fourier restriction and applications
11.1 The Tomas–Stein theorem
11.2 The endpoint
11.3 Restriction and PDE; Strichartz estimates
11.4 Optimal two-dimensional restriction
Notes
Problems

12 Introduction to the Weyl calculus
12.1 Motivation, definitions, basic properties
12.2 Adjoints and compositions
12.3 The $L^2$ theory
12.4 A phase-space transform
Notes
Problems

References
Index


Preface

Harmonic analysis is an old subject. It originated with the ideas of Fourier in the early nineteenth century (which were preceded by work of Euler, Bernoulli, and others). These ideas were revolutionary at the time and could not be understood by means of the mathematics available to Fourier and his contemporaries. However, it was clear even then that the idea of representing any function as a superposition of elementary harmonics (sine and cosine) from an arithmetic sequence of frequencies had important applications to the partial differential equations of physics that were being investigated at the time, such as the heat and wave equations. In fact, it was precisely the desire to solve these equations that led to this bold idea in the first place. Research into the precise mathematical meaning of such Fourier series consumed the efforts of many mathematicians during the entire nineteenth century as well as much of the twentieth century. Many ideas that took their beginnings and motivations from Fourier series research became disciplines in their own right. Set theory (Cantor) and measure theory (Lebesgue) are clear examples, but others, such as functional analysis (Hilbert and Banach spaces), the spectral theory of operators, and the theory of compact and locally compact groups and their representations, all exhibit clear and immediate connections with Fourier series and integrals. Furthermore, soon after Fourier proposed representing every function on a compact interval as a trigonometric series, his idea was generalized by Sturm and Liouville to expansions with respect to the eigenfunctions of very general second-order differential operators subject to natural boundary conditions – a groundbreaking result in its own right. Not surprisingly harmonic analysis is therefore a vast discipline of mathematics, which continues to be a vibrant research area to this day. 
In addition, over the past 60 years Euclidean harmonic analysis, as represented by the schools associated with A. Calderón and A. Zygmund at the University of Chicago as well as those associated with C. Fefferman and E. Stein at Princeton University, has been inextricably linked with partial differential equations (PDEs). While applications to the theory of elliptic PDEs and pseudodifferential operators were a driving force in the development of the Calderón–Zygmund school from the very beginning, the past 25 years have also seen an influx of harmonic analysis techniques to the theory of nonlinear dispersive equations such as the Schrödinger and wave equations. These developments continue to this day.

The basic “divide and conquer” idea of harmonic analysis can be stated as follows: that we should study those classes of functions that arise in interesting contexts (for example, as solutions of differential equations; as measured data; as audio or video signals on DVDs, CDs, or possibly transmitted across glass fiber cables or great distances such as that between Mars and Earth; as samples of random processes) by breaking them into basic constituent parts and that (a) these basic parts are both as simple as possible and amenable to study and (b) ideally, reflect some structure inherent to the problem at hand. In a classical manifestation these basic parts are given by the standing waves used in Fourier series, but over the past 30 years wavelets (as well as curvelets and ringlets) have revolutionized applied harmonic analysis, especially as used in image processing.

This fundamental idea is ubiquitous in science and engineering. Examples where it arises and is put to use include the following: all types of medical imaging (such as magnetic resonance imaging (MRI), computed tomography, ultrasound, echocardiography, and positron emission tomography (PET)); signal processing, especially through the methods used in compressed sensing; and inverse problems such as those that arise in remote sensing, medical imaging, geophysics, and oil exploration.
In addition, advances in electrical engineering – and with it essentially the whole of modern industry as we know it – have only been possible through the systematic use and implementation of mathematics, often in the form of Fourier analysis and its ramifications. To go even further, Nature appears to carry in herself the blueprint of the basic harmonics. Indeed, electrons and other elementary particles are understood as spherical waves, and the discrete energy levels so characteristic of quantum mechanics are dictated by the necessity of fitting such a standing wave onto a two-dimensional spherical surface. String theory takes this concept to an entirely different level of abstraction by reducing everything to the vibration of tiny strings. Harmonic analysis has the advantage, over other subjects in mathematics, that it has never been completely isolated or divorced from applications; rather, a significant part of it has been steadily guided and inspired by them. For example, the study of the Cauchy kernel on Lipschitz curves might have arisen
as a seemingly academic exercise and a rather mindless generalization but for the fact that there are so many important problems in materials science where Nature produces just such non-smooth boundaries. Conceptual developments in harmonic analysis are at the center of many important scientific and technological advances. It is rather remarkable that wavelets are actually used in the JPEG 2000 standard. The technical details will be of interest mainly to specialists, but the conceptual framework has a much greater reach and perhaps significance. Theorems have hypotheses that may not be exactly satisfied in many real physical situations; in fact, engineering is not an exact science but rather one of good enough approximation. So, while theorems may not apply in a strict sense the thinking that went into them can still be extremely useful. It is precisely this thinking that our book wishes to present. Classic monographs and textbooks in harmonic analysis include those by Stein [108], [110], and Stein and Weiss [112]; amongst the older literature there is the timeless encyclopedic work on Fourier series by Zygmund [131] and the more accessible introduction to Fourier analysis by Katznelson [65]. Various excellent, more specialized texts are also available, such as Folland [41], which focuses on phase-space analysis and the Weyl calculus, as well as Sogge [105], which covers oscillatory integrals. Wolff’s lecture notes [128] can serve as an introduction to the ideas associated with the Kakeya problem. This is a more geometric, as well as combinatorial, aspect of harmonic analysis that is still rather poorly understood. Our intention was not to compete with any of these well-known texts. Rather, our book is designed as a teaching tool, both in a traditional classroom setting as well as in the setting of independent study by an advanced undergraduate or beginning graduate student. 
In addition, the authors hope that it will also be useful for any mathematician or mathematically inclined scientist who wishes to acquire a working knowledge of select topics in (mostly Euclidean) Fourier analysis. The two volumes of our book are different in both scope and character, although they should be perceived as forming a natural unit. In this first volume we introduce the reader to a broad array of results and techniques, starting from the beginnings of the field in Fourier series and then developing the theory along what the authors hope are natural avenues motivated by certain basic questions. The selection of the material in this volume is of course partly a reflection of the authors’ tastes, but it also follows a specific purpose: to introduce the reader to sufficiently many topics in classical Fourier analysis, and in enough detail, to allow them to continue a guided study of more advanced material possibly leading to original research in analysis, pure or applied.



All the material in this volume should be considered basic. It can be found in many texts but to the best of our knowledge not in a single place. The authors feel that this volume presents the course that they should have taken in graduate school, on the basis of hindsight. To what extent the entirety of this volume constitutes a reasonable course is up to the individual teacher to decide. It is more likely that selections will need to be made, and it is of course also to be expected that lecturers will wish to supplement the material with certain topics of their own choosing that are not covered here. However, the authors feel, particularly since this has been tested on individual students, that Vol. I could be covered by a beginning graduate student over the course of a year in independent, but guided, study. This would then culminate in some form of qualifying or “topic” exam, after which the student would be expected to begin independent research. Particular emphasis has been placed on the inclusion of exercises and problems. The former are dispersed throughout the main body of the text and are for the most part an integral part of the theory. As a rule they are less difficult than the end-of-chapter problems. The latter serve to develop the theory further and to give the reader the opportunity to try his or her hand at the occasional hard problem. An old and commonplace principle, to which the authors adhere, is that any piece of mathematics can only be learned by active work, and this is reflected in our book. In addition, the authors have striven to emphasize intuition and ideas over both generality and technique, without, however, sacrificing rigor, elegance, or for that matter, relevance. While Vol. I presents developments in harmonic analysis up to the mid to late 1980s, Vol. II picks up from there and focuses on more recent aspects. 
In comparison with the first volume the second is of necessity much more selective, and many topics of current interest could not be included. Examples of the many omitted areas that come to mind are the oscillatory integrals related to the Kakeya, restriction, and Bochner–Riesz conjectures, multilinear Strichartz estimates, and geometric measure theory and its relations to combinatorics and number theory. The selection of topics in Vol. II can roughly be described as phase-space oriented; it comprises material that either grew out of or is closely related to the David–Journé $T(1)$ theorem, the Cauchy integral on Lipschitz curves, and the Calderón commutators on the one hand and the solution to Lusin’s problem by Carleson on the other hand. The latter work, also through Fefferman’s reworking of Carleson’s proof, greatly influenced the resolution of the bilinear Hilbert-transform boundedness problem by Lacey and Thiele in the mid-1990s. Generally speaking, Carleson’s work has had a profound influence on the more combinatorially oriented analysis of phase space that lies at the heart of Vol. II.



To be more specific, Vol. II concentrates on paraproducts (which make an appearance in Vol. I), the bilinear Hilbert transform, Carleson’s theorem, and Calderón commutators and the Cauchy integral on Lipschitz curves. In fact, the analysis of paraproducts can be seen as the most foundational material for Vol. II; paraproducts are developed on polydisks and as flag paraproducts in their own right. In this sense, Vol. II is more research oriented in character, as some results are quite recent and the organization and development of the material are in part original and specific to Vol. II. In terms of presentation, in the second volume the authors have generally chosen not the shortest proofs but those that are most robust as well as (to their taste) most illuminating. For example, there are simpler ways of approaching the Coifman–Meyer theorem on paraproducts but these do not carry over to other contexts such as flag paraproducts.

Much emphasis has been placed on motivation, and so the authors have included applications to PDEs, for example in the form of Strichartz estimates and their use in nonlinear equations in Vol. I and in the form of fractional Leibnitz rules with application to the KdV equation in Vol. II. The chapter entitled “Iterated Fourier series and physical reality” in Vol. II is entirely devoted to motivation and to an explanation of the larger framework in which much of that volume sits. Throughout both volumes the authors have striven to emphasize intuition and ideas, and often figures have been used as an important part of an explanation or proof. This is particularly the case in Vol. II, which is more demanding in terms of technique and with often longer and more complicated proofs than those in Vol. I.

Volume II overlaps considerably with the recent research literature, especially with papers by Lacey and Thiele on the one hand and by the first author, Tao, and Thiele on the other hand.
We shall not give an account of these papers here, since a discussion can be found in the end-of-chapter notes (we have avoided placing citations and references in the main body of the text, so as not to distract the reader). We shall now comment briefly on the relation of Vol. II to classic textbooks in the area. The influential work by Coifman and Meyer [24] overlaps the first and fourth chapters of Vol. II. However, not only is the technical approach developed in [24] completely different from that in this book but it is also designed for a different purpose: it is less a textbook than a rapid review of many deep topics such as Hardy spaces on Lipschitz domains, Murai’s proof of the Cauchy integral boundedness, and commutators and the Cauchy integral. Finally, [24] develops wavelets, which are an essential tool in many real-world applications but make no appearance in this book.



Another well-known text that overlaps Vol. II contains Christ’s CBMS lectures [21]. These are centered around the $T(1)$ and $T(b)$ theorems and their applications. While some of this material does make an appearance in our book, Vol. II does not use $T(1)$ or $T(b)$.

We conclude this preface with a discussion of prerequisites. Essential is a grounding in basic analysis, beginning with multivariable calculus (including writing a hypersurface as a graph and the notion of Gaussian curvature on a hypersurface) going on to measure theory and integration and the functional analysis relevant to basic Hilbert spaces (orthogonal projections, bases, completeness) as well as to Banach spaces (weak and strong convergence, bases, Hahn–Banach, uniform boundedness), and finally the basics of complex analysis (including holomorphic functions and conformal maps). In Vol. I probability theory also makes an appearance, and it might be helpful if the reader has had some exposure to the notions of independence, expected value, variance, and distribution functions. The second volume requires very little more in terms of preparation other than a fairly mature understanding of the above topics. The authors do not recommend, however, that one should attempt Vol. II before Vol. I.

As is customary in analysis, interpolation is frequently used. To be more specific, we rely on both the Riesz–Thorin and Marcinkiewicz interpolation theorems. We state these facts at the end of the first chapter but omit the proofs, as they can readily be found in the texts of Stein and Weiss [112] and Katznelson [65]. Finally, in Chapter 11 of the present volume, on the restriction theorem, we also make use of Stein’s generalization of the Riesz–Thorin theorem to analytic families of operators. Volume II uses multilinear interpolation, and the facts needed are collected in an appendix.
A standard reference for interpolation theory, especially as it relates to Besov, Sobolev, and Lorentz spaces, is the text by Bergh and Löfström [7].

As we have already pointed out, many important topics are omitted from our two volumes. Apart from classical topics such as inner and outer functions, which we could not include in Vol. I owing to space considerations, an aspect of modern harmonic analysis that is not covered here is the vast area dealing with oscillatory integrals. This field touches upon several other areas including geometric measure theory, combinatorial geometry, and number theory and is relevant to nonlinear dispersive PDEs in several ways, such as through bilinear restriction estimates (see for example Wolff’s paper [127] as well as his lecture notes [128]). This material would naturally comprise a third volume, which would need to present the research that has been done since the work of Sogge [105] and Stein [109].



How to use this volume. Ideally, the reader should work through the chapters in order. A reader or class familiar with the Fourier transform in $\mathbb{R}^d$ could start with Chapter 7, then move on to Chapter 8 and subsequently choose any of the remaining four chapters in this volume according to taste and time constraints. Chapters 7 through 11 constitute the backbone of real-variable harmonic analysis. Of those, Chapter 10 can be regarded as an optional extra; however, the authors feel that it is of importance and that students should be exposed to this material. As an application of the Logvinenko–Sereda theorem of Section 10.3, we prove the local solvability of constant coefficient PDEs (the Malgrange–Ehrenpreis theorem) in Section 10.4.

Another such outlier is Chapter 12, in which we introduce the reader to an area that is itself the subject of many books; see for example Taylor’s book on pseudodifferential operators [120] as well as Hörmander’s treatise [58]. The present authors decided to include a very brief account of this story, since it is an essential part of harmonic analysis and also since it originates in Calderón’s work as part of his investigations of singular integrals and the Cauchy problem for elliptic operators. In principle, Chapter 12 can be read separately by a mature reader who is familiar with Cotlar’s lemma from Chapter 9. In the writing of this chapter a difficult decision had to be made, namely which of the two main incarnations of pseudodifferential operators to use, that of Kohn–Nirenberg or that of Weyl. While the former is somewhat simpler technically, and therefore often used for elliptic PDEs, the older Weyl quantization is very natural owing to its symmetry and is the one that is normally used in the so-called semiclassical calculus. We therefore chose to follow the latter route; Kohn–Nirenberg pseudodifferential operators make only a very brief appearance in this text.
In Chapter 5 we introduce the reader to probability theory, which is also often omitted in a more traditional harmonic analysis presentation. However, the authors felt that the ideas developed in that chapter (which are very elementary for the most part) are an essential part of modern analysis and of mathematics in general. They appear in many different settings and should be in the toolbox of any working analyst, pure or applied. Chapter 6 contains several examples of how probabilistic thinking and results appear in harmonic analysis. Section 6.3 on Sidon sets can be omitted on first reading, as it is somewhat specialized (it contains, in particular, Rider’s theorem, which we prove there).

The first four chapters of this volume are intended for a reader who has had no or very little prior exposure to Fourier series and integrals, harmonic functions, and their conjugates. A basic introductory advanced-undergraduate or beginning-graduate course would cover the first three chapters, omitting Section 2.5, and would then move on to the first section of the fourth chapter (here, the material on locally compact Abelian groups could be omitted entirely, since it is used only in a non-Euclidean setting in the proof of Rider’s theorem in Section 6.3, and the stationary phase method is used only in the final two chapters of this volume). After that, an instructor could then select any topics from Chapters 5–8 as desired, the most traditional choice being the Calderón–Zygmund, Mikhlin, and Littlewood–Paley theorems. The last of these theorems, at least as presented here, does require a minimal knowledge of probability, namely in the form of Khinchine’s inequality.

Feedback. The authors welcome comments on this book and ask that they be sent to [email protected].


Wilhelm Schlag expresses his gratitude to Rowan Killip for detailed comments on his old harmonic analysis notes from 2000, from which the first volume of this book eventually emerged. Furthermore, he thanks Serguei Denissov, Charles Epstein, Burak Erdogan, Patrick G´erard, David Jerison, Carlos Kenig, Andrew Lawrie, Gerd Mockenhaupt, Paul M¨uller, Casey Rodriguez, Barry Simon, Chris Sogge, Wolfgang Staubach, Eli Stein, and Bobby Wilson for many helpful suggestions and comments on a preliminary version of Vol. I. Finally, he thanks the many students and listeners who attended his lectures and classes at Princeton University, the California Institute of Technology, the University of Chicago, and the Erwin Schr¨odinger Institute in Vienna over the past ten years. Their patience, interest, and helpful comments have led to numerous improvements and important corrections. The second volume of the book is partly based on two graduate courses given by Camil Muscalu at S¸coala Normalˇa Superioarˇa, Bucures¸ti, in the summer of 2004 and at Cornell University in the fall of 2007. First and foremost he would like to thank Wilhelm Schlag for the idea of writing this book together. Then, he would like to thank all the participants of those classes for their passion for analysis and for their questions and remarks. In addition, he would like to thank his graduate students Cristina Benea, Joeun Jung, and Pok Wai Fong for their careful reading of the manuscript and for making various corrections and suggestions and Pierre Germain and Rapha¨el Cˆote for their meticulous comments. He would also like to thank his collaborators Terry Tao and Christoph Thiele. Many ideas that came out of this collaboration are scattered through the pages of the second volume of the book. 
Last but not least, he would like to express his gratitude to Nicolae Popa from the Institute of Mathematics of the Romanian Academy for introducing him to the world of harmonic analysis and for his unconditional support and friendship over the years.



Many thanks go to our long-suffering editors at Cambridge University Press, Roger Astley and David Tranah, who continued to believe in this project and support it even when it might have been more logical not to do so. Their cheerful patience and confidence is gratefully acknowledged. Barry Simon at the California Institute of Technology deserves much credit for first suggesting to David Tranah roughly ten years ago that Wilhelm Schlag’s harmonic analysis notes should be turned into a book. The authors were partly supported by the National Science Foundation during the preparation of this book.

1 Fourier series: convergence and summability

1.1. The basics: partial sums and the Dirichlet kernel

1.1.1. Definitions

We begin with a basic object in analysis, namely the Fourier series associated with a function or a measure on the circle. To be specific, let $\mathbb{T} = \mathbb{R}/\mathbb{Z}$ be the one-dimensional torus (in other words, the circle). We will consider various function spaces on the torus $\mathbb{T}$, namely the space of continuous functions $C(\mathbb{T})$, the space of Hölder continuous functions $C^\alpha(\mathbb{T})$ where $0 < \alpha \le 1$, and the Lebesgue spaces $L^p(\mathbb{T})$ where $1 \le p \le \infty$. The space of complex Borel measures on $\mathbb{T}$ will be denoted by $M(\mathbb{T})$. Any $\mu \in M(\mathbb{T})$ has associated with it a Fourier series
$$\mu \sim \sum_{n=-\infty}^{\infty} \hat\mu(n)\, e(nx), \tag{1.1}$$
where $e(x) := e^{2\pi i x}$ and
$$\hat\mu(n) := \int_0^1 e(-nx)\,\mu(dx) = \int_{\mathbb{T}} e(-nx)\,\mu(dx).$$

The symbol $\sim$ in (1.1) is formal and simply means that the series on the right-hand side is associated with $\mu$. If $\mu(dx) = f(x)\,dx$ where $f \in L^1(\mathbb{T})$, then we may write $\hat f(n)$ instead of $\hat\mu(n)$. The central question which we wish to explore in this chapter is the following: when does $\mu$ equal the right-hand side in (1.1), that is, when does it represent $f$ in a suitable sense? Note that if we start from a trigonometric polynomial
$$f(x) = \sum_{n=-\infty}^{\infty} a_n\, e(nx),$$
where all but finitely many $a_n$ are zero, then we see that
$$\hat f(n) = a_n \quad \forall n \in \mathbb{Z}. \tag{1.2}$$
In other words, we have the pointwise equality
$$f(x) = \sum_{n=-\infty}^{\infty} \hat f(n)\, e_n(x),$$
with $e_n(x) := e(nx)$. Property (1.2) is equivalent to the basic orthogonality relation
$$\int_{\mathbb{T}} e_n(x)\,\overline{e_m(x)}\,dx = \delta_0(n-m), \tag{1.3}$$

where $\delta_0(j) = 1$ if $j = 0$ and $\delta_0(j) = 0$ otherwise. It is therefore natural to explore the question of the convergence of the Fourier series for more general functions. Of course, the precise meaning of the convergence of infinite series needs to be specified before this fundamental question can be answered. It is fair to say that much modern analysis (including functional analysis) arose out of the struggle with this question. For example, the notion of the Lebesgue integral was developed in order to overcome the deficiencies in the older Riemann definition of the integral that had been revealed through the study of Fourier series. The reader will note the recurring theme of the convergence of Fourier series throughout both volumes of this book.

1.1.2. Dirichlet kernel

It is natural to start from the most basic notion of convergence, namely that of pointwise convergence, in the case where the measure $\mu$ is of the form $\mu(dx) = f(x)\,dx$ with $f(x)$ continuous or of even better regularity. The partial sums of $f \in L^1(\mathbb{T})$ are defined as
$$S_N f(x) = \sum_{n=-N}^{N} \hat f(n)\, e(nx) = \sum_{n=-N}^{N} \int_{\mathbb{T}} e(-ny) f(y)\,dy\; e(nx) = \int_{\mathbb{T}} \sum_{n=-N}^{N} e(n(x-y)) f(y)\,dy = \int_{\mathbb{T}} D_N(x-y) f(y)\,dy, \tag{1.4}$$
where $D_N(x) := \sum_{n=-N}^{N} e(nx)$ is the Dirichlet kernel. In other words, we have shown that the partial sum operator $S_N$ is given by convolution with the Dirichlet kernel $D_N$: $S_N f(x) = (D_N * f)(x)$.


Figure 1.1. The Dirichlet kernel $D_N$ and the upper envelope $\min(2N+1, |\pi x|^{-1})$ for $N = 9$. See Exercise 1.1.

In order to understand basic properties of this convolution, we first sum the geometric series defining $D_N(x)$ to obtain an explicit expression for the Dirichlet kernel.

Exercise 1.1 Verify that, for each integer $N \ge 0$,
$$D_N(x) = \frac{\sin((2N+1)\pi x)}{\sin(\pi x)} \tag{1.5}$$
and draw the graph of $D_N$ for several different values of $N$, say $N = 2$ and $N = 5$; cf. Figure 1.1. Prove the bound
$$|D_N(x)| \le C \min\Bigl(N, \frac{1}{|x|}\Bigr) \tag{1.6}$$
for all $N \ge 1$ and some absolute constant $C$. Finally, prove the bound
$$C^{-1} \log N \le \|D_N\|_{L^1(\mathbb{T})} \le C \log N \tag{1.7}$$
for all $N \ge 2$, where $C$ is another absolute constant.
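As a numerical companion to Exercise 1.1 (not a proof), the following minimal sketch compares the defining sum of $D_N$ with the closed form (1.5) and approximates the $L^1$ norm appearing in (1.7). The function names, the midpoint rule, and the sample count are illustrative assumptions and are not taken from the text.

```python
import cmath
import math


def dirichlet_sum(N, x):
    """D_N(x) as the defining sum of e(nx) = exp(2*pi*i*n*x), n = -N..N."""
    total = sum(cmath.exp(2j * math.pi * n * x) for n in range(-N, N + 1))
    return total.real  # the imaginary parts cancel in pairs n, -n


def dirichlet_closed(N, x):
    """Closed form sin((2N+1)*pi*x) / sin(pi*x); the limit 2N+1 is used at integers."""
    s = math.sin(math.pi * x)
    if abs(s) < 1e-12:
        return float(2 * N + 1)
    return math.sin((2 * N + 1) * math.pi * x) / s


def dirichlet_l1_norm(N, samples=20000):
    """Midpoint-rule approximation of the L^1 norm of D_N over [-1/2, 1/2]."""
    h = 1.0 / samples
    return h * sum(
        abs(dirichlet_closed(N, -0.5 + (k + 0.5) * h)) for k in range(samples)
    )
```

Calling `dirichlet_l1_norm` for increasing `N` should show slow growth of the kind described by (1.7); the sketch only illustrates this and says nothing about the constants.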



The growth of the bound in (1.7), as well as the oscillatory nature of $D_N$ as given by (1.5), indicates that to understand the pointwise or almost everywhere convergence properties of $S_N f$ may be a very delicate matter. This will become clearer as we develop the theory.

1.1.3. Convolution

In order to study (1.4) we need to establish some basic properties of the convolution of two functions $f, g$ on $\mathbb{T}$. If $f$ and $g$ are continuous, say, then define
$$(f * g)(x) := \int_{\mathbb{T}} f(x-y) g(y)\,dy = \int_{\mathbb{T}} g(x-y) f(y)\,dy. \tag{1.8}$$
It is helpful to think of $f * g$ as an average of translates of $f$ by the measure $g(y)\,dy$ (or the same statement but with $f$ and $g$ interchanged). In particular, convolution commutes with the translation operator $\tau_z$, which is defined for any $z \in \mathbb{T}$ by its action on functions, i.e., $(\tau_z f)(x) = f(x - z)$. Indeed, one may immediately verify that
$$\tau_z(f * g) = (\tau_z f) * g = f * (\tau_z g). \tag{1.9}$$
In passing, we mention the important relation between the Fourier transform and translations:
$$\widehat{\tau_z \mu}(n) = e(-zn)\,\hat\mu(n) \quad \forall n \in \mathbb{Z}.$$

In what follows, we will abbreviate almost everywhere or almost every by a.e.

Lemma 1.1 The operation of convolution as defined in (1.8) satisfies the following properties.
(i) Let $f, g \in L^1(\mathbb{T})$. Then, for a.e. $x \in \mathbb{T}$, one has that $f(x-y)g(y)$ is $L^1$ in $y$. Thus, the integral in (1.8) is well defined for a.e. $x \in \mathbb{T}$ (but not necessarily for every $x$), and the bound $\|f * g\|_1 \le \|f\|_1 \|g\|_1$ holds.
(ii) More generally, $\|f * g\|_r \le \|f\|_p \|g\|_q$ for all $1 \le r, p, q \le \infty$, $1 + \frac{1}{r} = \frac{1}{p} + \frac{1}{q}$, $f \in L^p$, $g \in L^q$. This is called Young’s inequality.
(iii) If $f \in C(\mathbb{T})$, $\mu \in M(\mathbb{T})$ then $f * \mu$ is well defined. For $1 \le p \le \infty$, $\|f * \mu\|_p \le \|f\|_p \|\mu\|$; this allows one to extend $f * \mu$ to arbitrary $f \in L^p$.
(iv) If $f \in L^p(\mathbb{T})$ and $g \in L^{p'}(\mathbb{T})$, where $1 \le p \le \infty$ and $\frac{1}{p} + \frac{1}{p'} = 1$, then $f * g$, originally defined only a.e., extends to a continuous function on $\mathbb{T}$, and
$$\|f * g\|_\infty \le \|f\|_p \|g\|_{p'}. \tag{1.10}$$
(v) For $f, g \in L^1(\mathbb{T})$ one has, for all $n \in \mathbb{Z}$, $\widehat{f * g}(n) = \hat f(n)\,\hat g(n)$.

Proof (i) is an immediate consequence of Fubini’s theorem since $f(x-y)g(y)$ is jointly measurable on $\mathbb{T} \times \mathbb{T}$ and belongs to $L^1(\mathbb{T} \times \mathbb{T})$. For (ii), first let $q = 1$. Then (ii) can be obtained by interpolating between the case $p = 1$ covered in (i) and the easy bound for $p = \infty$. Alternatively, one can use Minkowski’s inequality,
$$\|f * g\|_p \le \int_{\mathbb{T}} \|f(\cdot - y)\|_p\, |g(y)|\,dy \le \|f\|_p \|g\|_1,$$
which also implies (iii). The other extreme is $q = p'$, which is covered by (iv). The remaining choices of $q$ follow by interpolation relative to $g$. The bound (1.10) is Hölder’s inequality. Part (iv) follows from the fact that $C(\mathbb{T})$ is dense in $L^p(\mathbb{T})$ for $1 \le p < \infty$ and from the translation invariance (1.9). Indeed, one may verify from the uniform continuity of functions in $C(\mathbb{T})$ and (1.9) that $f * \mu \in C(\mathbb{T})$ for any $f \in C(\mathbb{T})$ and $\mu \in M(\mathbb{T})$. Since uniform limits of continuous functions are continuous, (iv) now follows from the aforementioned denseness of $C(\mathbb{T})$ and (1.10). Finally, (v) is a consequence of Fubini’s theorem and the homomorphism property of the exponentials $e(n(x+y)) = e(nx)e(ny)$.

The following exercise introduces the convolution as an operation acting on the Fourier coefficients of functions, rather than on the functions themselves. This is in the context of the largest class of functions where the respective Fourier series are absolutely convergent. This class of functions is necessarily a subalgebra of $C(\mathbb{T})$, called the Wiener algebra.

Exercise 1.2 Let $\mu \in M(\mathbb{T})$ have the property that
$$\sum_{n \in \mathbb{Z}} |\hat\mu(n)| < \infty.$$
Show that $\mu(dx) = f(x)\,dx$, where $f \in C(\mathbb{T})$. Denote the space of all measures with this property by $A(\mathbb{T})$ and identify these measures with their respective densities. Show that $A(\mathbb{T})$ is an algebra under multiplication and that
$$\widehat{fg}(n) = \sum_{m \in \mathbb{Z}} \hat f(m)\,\hat g(n-m) \quad \forall n \in \mathbb{Z},$$
where the sum on the right-hand side is absolutely convergent for every $n \in \mathbb{Z}$ and indeed is absolutely convergent over all $n$. Moreover, show that $\|fg\|_A \le \|f\|_A \|g\|_A$ where $\|f\|_A := \|\hat f\|_{\ell^1}$. Finally, verify that if $f, g \in L^2(\mathbb{T})$ then $f * g \in A(\mathbb{T})$.
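The coefficient formula in Exercise 1.2 can be tried out on trigonometric polynomials, where all sums are finite. The sketch below is only an illustration under that assumption: it stores the Fourier coefficients in a dictionary, forms the discrete convolution, and lets one compare the result with the pointwise product. The helper names are hypothetical.

```python
import cmath
import math


def convolve_coefficients(fhat, ghat):
    """Fourier coefficients of the pointwise product fg, as the discrete
    convolution of two coefficient dictionaries {n: coefficient}."""
    out = {}
    for m, a in fhat.items():
        for k, b in ghat.items():
            out[m + k] = out.get(m + k, 0) + a * b
    return out


def evaluate(coeffs, x):
    """Evaluate the trigonometric polynomial sum_n coeffs[n] * e(n x)."""
    return sum(c * cmath.exp(2j * math.pi * n * x) for n, c in coeffs.items())


def wiener_norm(coeffs):
    """The A(T) norm, i.e. the l^1 norm of the coefficient sequence."""
    return sum(abs(c) for c in coeffs.values())
```

For two such dictionaries `fhat` and `ghat`, `evaluate(convolve_coefficients(fhat, ghat), x)` should agree with `evaluate(fhat, x) * evaluate(ghat, x)` up to rounding, and `wiener_norm` should be submultiplicative.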

Note that the Wiener algebra has a unit, namely the constant function 1. It is clear that if $f$ has an inverse in $A(\mathbb{T})$ then this is $1/f$, which in particular requires that $f \ne 0$ everywhere on $\mathbb{T}$. A remarkable theorem due to Norbert Wiener states that the converse holds, too; that is, if $f \in A(\mathbb{T})$ does not vanish anywhere on $\mathbb{T}$ then $1/f \in A(\mathbb{T})$. We present this result in Section 4.3 as an easy corollary to Gelfand’s theory of commutative Banach algebras; see Corollary 4.27.

One of the most basic as well as oldest results on the pointwise convergence of Fourier series is the following theorem. We shall see later that it fails for functions that are merely continuous.

Theorem 1.2 If $f \in C^\alpha(\mathbb{T})$ with $0 < \alpha \le 1$ then $\|S_N f - f\|_\infty \to 0$ as $N \to \infty$.

Proof One has, with $\delta \in (0, \frac{1}{2})$ to be determined,
$$S_N f(x) - f(x) = \int_0^1 (f(x-y) - f(x)) D_N(y)\,dy = \int_{|y| \le \delta} (f(x-y) - f(x)) D_N(y)\,dy + \int_{1/2 > |y| > \delta} (f(x-y) - f(x)) D_N(y)\,dy. \tag{1.12}$$
Here we have exploited the fact that
$$\int_{\mathbb{T}} D_N(y)\,dy = 1.$$

We now use the bound from (1.6), i.e.,
$$|D_N(y)| \le C \min\Bigl(N, \frac{1}{|y|}\Bigr).$$
Here and in what follows, $C$ will denote a numerical constant that can change from line to line. The first integral in (1.12) can be estimated as follows:
$$\int_{|y| \le \delta} |f(x) - f(x-y)|\,\frac{1}{|y|}\,dy \le [f]_\alpha \int_{|y| \le \delta} |y|^{\alpha-1}\,dy \le C [f]_\alpha \delta^\alpha, \tag{1.13}$$
with the usual $C^\alpha$ semi-norm
$$[f]_\alpha = \sup_{x,y} \frac{|f(x) - f(x-y)|}{|y|^\alpha}.$$
To bound the second term in (1.12) one needs to invoke the oscillation of $D_N(y)$. In fact, we have
$$B := \int_{1/2 > |y| > \delta} (f(x-y) - f(x)) D_N(y)\,dy = \int_{1/2 > |y| > \delta} \frac{f(x-y) - f(x)}{\sin(\pi y)} \sin((2N+1)\pi y)\,dy = -\int_{1/2 > |y| > \delta} h_x(y) \sin\Bigl((2N+1)\pi\Bigl(y + \frac{1}{2N+1}\Bigr)\Bigr)\,dy,$$
where
$$h_x(y) := \frac{f(x-y) - f(x)}{\sin(\pi y)}.$$
Therefore, with all integrals understood to be in the interval $(-\frac{1}{2}, \frac{1}{2})$,
$$2B = \int_{|y| > \delta} h_x(y) \sin((2N+1)\pi y)\,dy - \int_{|y - 1/(2N+1)| > \delta} h_x\Bigl(y - \frac{1}{2N+1}\Bigr) \sin((2N+1)\pi y)\,dy$$
$$= \int_{|y| > \delta} \Bigl(h_x(y) - h_x\Bigl(y - \frac{1}{2N+1}\Bigr)\Bigr) \sin((2N+1)\pi y)\,dy - \int_{[-\delta, -\delta + 1/(2N+1)]} h_x\Bigl(y - \frac{1}{2N+1}\Bigr) \sin((2N+1)\pi y)\,dy + \int_{[\delta, \delta + 1/(2N+1)]} h_x\Bigl(y - \frac{1}{2N+1}\Bigr) \sin((2N+1)\pi y)\,dy.$$
These integrals are estimated by putting absolute values inside. To do so we use the bounds
$$|h_x(y)| < C\,\frac{\|f\|_\infty}{\delta}, \qquad |h_x(y) - h_x(y + \tau)| < C\Bigl(\frac{|\tau|^\alpha [f]_\alpha}{\delta} + \frac{\|f\|_\infty |\tau|}{\delta^2}\Bigr),$$
if $|y| > \delta > 2\tau$. In view of the preceding discussion, one obtains
$$|B| \le C\Bigl(\frac{N^{-\alpha} [f]_\alpha}{\delta} + \frac{N^{-1} \|f\|_\infty}{\delta^2}\Bigr), \tag{1.14}$$
provided that $\delta > 1/N$. Choosing $\delta = N^{-\alpha/3}$ one concludes from (1.12), (1.13), and (1.14) that
$$|(S_N f)(x) - f(x)| \le C\bigl(N^{-\alpha^2/3} + N^{-2\alpha/3} + N^{-1+2\alpha/3}\bigr) \tag{1.15}$$
for any function with $\|f\|_\infty + [f]_\alpha \le 1$, which proves the theorem.

Figure 1.2. The Fejér kernel $K_N$ and the upper envelope $\min(N, (N(\pi x)^2)^{-1})$ for $N = 9$.

The reader is invited to optimize the rate of decay derived in (1.15). In other words, the challenge is to obtain the largest $\beta > 0$ in terms of $\alpha$ such that the bound in (1.15) becomes $CN^{-\beta}$ for any $f$ with $\|f\|_\infty + [f]_\alpha \le 1$.

1.2. Approximate identities, Fejér kernel

1.2.1. Cesàro means of partial sums

The difficulties with the Dirichlet kernel (see Figure 1.1), such as its slow $1/|x|$ decay, can be regarded as a result of the “discontinuity” in $\widehat{D_N} = \chi_{[-N,N]}$: this indicator function on the lattice $\mathbb{Z}$ jumps at $\pm N$. Therefore we may hope to obtain a kernel that is easier to analyze – in a sense that will be made precise below by means of the notion of approximate identity – by substituting for $D_N$ a suitable average whose Fourier transform does not exhibit such jumps. An elementary way of carrying this out is given by the Cesàro mean, i.e.,
$$\sigma_N f := \frac{1}{N} \sum_{n=0}^{N-1} S_n f.$$
Setting
$$K_N := \frac{1}{N} \sum_{n=0}^{N-1} D_n,$$
where $K_N$ is called the Fejér kernel, one therefore has $\sigma_N f = K_N * f$.

Exercise 1.3 Let $K_N$ be a Fejér kernel with $N$ a positive integer.
(a) Verify that $\widehat{K_N}$ looks like a triangle (see Figure 1.3), i.e., for all $n \in \mathbb{Z}$,
$$\widehat{K_N}(n) = \Bigl(1 - \frac{|n|}{N}\Bigr)^{+}. \tag{1.16}$$
(b) Show that
$$K_N(x) = \frac{1}{N} \Bigl(\frac{\sin(N\pi x)}{\sin(\pi x)}\Bigr)^{2}. \tag{1.17}$$
(c) Conclude that
$$0 \le K_N(x) \le C N^{-1} \min(N^2, x^{-2}). \tag{1.18}$$
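The three descriptions of the Fejér kernel in Exercise 1.3 can be compared numerically. The following sketch, with hypothetical function names, evaluates $K_N$ as the average of Dirichlet kernels, via the closed form (1.17), and from the triangular coefficients (1.16); it is meant as a sanity check and not as a solution of the exercise.

```python
import math


def dirichlet(n, x):
    """Closed form of the Dirichlet kernel D_n(x), with the limit 2n+1 at integers."""
    s = math.sin(math.pi * x)
    if abs(s) < 1e-12:
        return float(2 * n + 1)
    return math.sin((2 * n + 1) * math.pi * x) / s


def fejer_average(N, x):
    """K_N(x) as the Cesaro average (1/N) * sum_{n=0}^{N-1} D_n(x)."""
    return sum(dirichlet(n, x) for n in range(N)) / N


def fejer_closed(N, x):
    """K_N(x) via (1/N) * (sin(N*pi*x) / sin(pi*x))**2, with the limit N at integers."""
    s = math.sin(math.pi * x)
    if abs(s) < 1e-12:
        return float(N)
    return (math.sin(N * math.pi * x) / s) ** 2 / N


def fejer_from_coefficients(N, x):
    """K_N(x) summed from the triangular coefficients (1 - |n|/N)^+."""
    return sum(
        (1 - abs(n) / N) * math.cos(2 * math.pi * n * x) for n in range(-N + 1, N)
    )
```

The closed form makes the nonnegativity of $K_N$ visible at once, which is the property that the Dirichlet kernel lacks.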


We remark that the square and thus the positivity in (1.17) are not entirely surprising, since the triangle in (1.16) can be written as the convolution of two rectangles (the convolution is now at the level of the Fourier coefficients on the lattice $\mathbb{Z}$). Therefore we should expect $K_N$ to have the form of the square of a version of $D_M$, where $M$ is about half the size of $N$, suitably normalized. The properties established in Exercise 1.3 ensure that the $K_N$ form what is called an approximate identity.

Definition 1.3 The family $\{\Phi_N\}_{N=1}^{\infty} \subset L^\infty(\mathbb{T})$ forms an approximate identity provided that
(A1) $\int_0^1 \Phi_N(x)\,dx = 1$ for all $N$,
(A2) $\sup_N \int_0^1 |\Phi_N(x)|\,dx < \infty$,
(A3) for all $\delta > 0$ one has $\int_{|x| > \delta} |\Phi_N(x)|\,dx \to 0$ as $N \to \infty$.

The term “approximate identity” derives from the fact that $\Phi_N * f \to f$ as $N \to \infty$ in any reasonable sense; see Proposition 1.5. In other words, $\Phi_N \rightharpoonup \delta_0$ in the weak-∗ sense of measures. Clearly, the so-called box kernels

N (x) = Nχ[|x|N 0, let δ > 0 be such that sup sup |f (x − y) − f (x)| < ε. x

|y| N . Using the Fejér kernel we can now give a simpler proof of Theorem 1.2.

Second proof of Theorem 1.2 For simplicity, we take $\alpha < 1$. The case $\alpha = 1$ is left to the reader. We use the quantitative estimates
$$\|\sigma_N f - f\|_\infty \le \int_0^1 \min\bigl(N, (Ny^2)^{-1}\bigr) |y|^\alpha\,dy\; [f]_\alpha \le C N^{-\alpha} [f]_\alpha.$$
Since $S_N \sigma_N f = \sigma_N f$ we have
$$S_N f - f = S_N(f - \sigma_N f) + \sigma_N f - f,$$
whence
$$\|S_N f - f\|_\infty \le (\|D_N\|_1 + 1)\, \|f - \sigma_N f\|_\infty \le C N^{-\alpha} \log N\, [f]_\alpha,$$
and we are done.

1.3. The $L^p$ convergence of partial sums

We now turn our attention to whether the partial sums $S_N f$ converge in the sense of $L^p(\mathbb{T})$ or in the sense of $C(\mathbb{T})$. Observe that it makes no sense to ask about the uniform convergence of $S_N f$ for general $f \in L^\infty(\mathbb{T})$, because the uniform limits of continuous functions are continuous.

Proposition 1.9 The following statements are equivalent for any $1 \le p \le \infty$:
(i) for every $f \in L^p(\mathbb{T})$ (or $f \in C(\mathbb{T})$ if $p = \infty$) one has $\|S_N f - f\|_p \to 0$ as $N \to \infty$;
(ii) $\sup_N \|S_N\|_{p \to p} < \infty$.

Proof The implication (ii) $\Longrightarrow$ (i) follows from the fact that trigonometric polynomials are dense in the respective norms; see Corollary 1.6. The implication (i) $\Longrightarrow$ (ii) can be deduced immediately from the uniform boundedness principle of functional analysis. Alternatively, one can prove this directly by the method of the “gliding hump”. Suppose that $\sup_N \|S_N\|_{p \to p} = \infty$. For every positive integer $\ell$ one can therefore find a large integer $N_\ell$ such that $\|S_{N_\ell} f_\ell\|_p > 2^\ell$, where $f_\ell$ is a trigonometric polynomial with $\|f_\ell\|_p = 1$. Now let
$$f(x) = \sum_{\ell=1}^{\infty} \frac{1}{\ell^2}\, e(M_\ell x) f_\ell(x),$$
with integers $\{M_\ell\}$ to be specified. Notice that
$$\|f\|_p \le \sum_{\ell=1}^{\infty} \frac{1}{\ell^2}\, \|f_\ell\|_p < \infty.$$
Now choose a sequence $\{M_\ell\}$ tending to infinity so rapidly that the Fourier support of $e(M_\ell x) f_\ell(x)$ lies to the right of the Fourier support of
$$g_\ell(x) := \sum_{j=1}^{\ell-1} \frac{1}{j^2}\, e(M_j x) f_j(x)$$
for every $\ell \ge 2$ (here “Fourier support” means those integers for which the corresponding Fourier coefficients are nonzero). We also demand that $M_\ell - N_\ell \to \infty$ as $\ell \to \infty$ and that $S_{M_\ell - N_\ell - 1} f = g_\ell$ and $S_{M_\ell + N_\ell} f = g_{\ell+1}$. Then
$$\|(S_{M_\ell + N_\ell} - S_{M_\ell - N_\ell - 1}) f\|_p = \frac{1}{\ell^2}\, \|S_{N_\ell} f_\ell\|_p > \frac{2^\ell}{\ell^2},$$
which tends to $\infty$ as $\ell \to \infty$. However, since $N_\ell + M_\ell \to \infty$ and $M_\ell - N_\ell - 1 \to \infty$ the left-hand side tends to 0 as $\ell \to \infty$. This contradiction finishes the proof.

Exercise 1.4 Let $\{c_n\}_{n \in \mathbb{Z}}$ be an arbitrary sequence of complex numbers and associate with it formally a Fourier series
$$f(x) \sim \sum_{n} c_n\, e(nx).$$
Show that there exists $\mu \in M(\mathbb{T})$ with the property that $\hat\mu(n) = c_n$ for all $n \in \mathbb{Z}$ if and only if $\{\sigma_n f\}_{n \ge 1}$ is bounded in $M(\mathbb{T})$. Discuss also the case of $L^p(\mathbb{T})$ with $1 \le p < \infty$ and $C(\mathbb{T})$.


1.3.1. Failure of uniform convergence

We can now settle the question of the uniform convergence of $S_N$ on functions in $C(\mathbb{T})$.

Corollary 1.10 Fourier series do not converge on $C(\mathbb{T})$ and $L^1(\mathbb{T})$, i.e., there exists $f \in C(\mathbb{T})$ such that $\|S_N f - f\|_\infty \not\to 0$ and $g \in L^1(\mathbb{T})$ such that $\|S_N g - g\|_1 \not\to 0$ as $N \to \infty$.

Proof By Proposition 1.9 it suffices to verify the limits
$$\sup_N \|S_N\|_{\infty \to \infty} = \infty, \qquad \sup_N \|S_N\|_{1 \to 1} = \infty. \tag{1.19}$$
Both properties follow from the fact that $\|D_N\|_1 \to \infty$ as $N \to \infty$; see (1.7). To deduce (1.19) from this, notice that
$$\|S_N\|_{\infty \to \infty} = \sup_{\|f\|_\infty = 1} \|D_N * f\|_\infty \ge \sup_{\|f\|_\infty = 1} |(D_N * f)(0)| = \|D_N\|_1.$$
We remark that the inequality sign here can be replaced with an equality, in view of the translation invariance (1.9). Furthermore, with Fejér kernels $\{K_M\}_{M=1}^{\infty}$,
$$\|S_N\|_{1 \to 1} \ge \|D_N * K_M\|_1 \to \|D_N\|_1$$
as $M \to \infty$.

Exercise 1.5 The previous proof was indirect. Construct $f \in C(\mathbb{T})$ and $g \in L^1(\mathbb{T})$ such that $S_N f$ does not converge to $f$ uniformly as $N \to \infty$ and such that $S_N g$ does not converge to $g$ in $L^1(\mathbb{T})$. Hint: The method of the gliding hump, from the proof of Proposition 1.9, may be used.

We shall see below that, for $1 < p < \infty$,
$$\sup_N \|S_N\|_{p \to p} < \infty.$$
Thus, by Proposition 1.9, for any $f \in L^p(\mathbb{T})$ with $1 < p < \infty$,
$$\|S_N f - f\|_p \to 0 \quad \text{as } N \to \infty. \tag{1.20}$$
The case $p = 2$ is clear, see Corollary 1.6, but the result for $p \ne 2$ is much deeper. We will need to develop the theory of the conjugate function to obtain it; see Chapter 3. Note that, unlike in Theorem 1.2, here there is no explicit rate of convergence in terms of some expression involving $N$; clearly this cannot be expected, given only the size of $\|f\|_p$, since $g(x) := e(-mx) f(x)$ has the same $L^p$ norm as $f$ but $\hat g(n) = \hat f(n + m)$ for all $n \in \mathbb{Z}$. This suggests that we may hope to obtain such a rate if we remove this freedom of translation in the Fourier coefficients. One way of accomplishing this is by imposing a little regularity, as expressed for example in terms of the standard Sobolev spaces. Since we do not yet have at our disposal the $L^p$ convergence in (1.20) for general $p$, we need to restrict ourselves to $p = 2$; this case is particularly simple.

Exercise 1.6 For any $s \in \mathbb{R}$ define the Hilbert space $H^s(\mathbb{T})$ by means of the norm
$$\|f\|_{H^s}^2 := |\hat f(0)|^2 + \sum_{n \in \mathbb{Z}} |n|^{2s} |\hat f(n)|^2. \tag{1.21}$$
Obtain the following quantitative improvements in certain qualitative convergence properties.
(a) Show that for any $0 \le s \le 1$ one has $\|f(\cdot + \theta) - f\|_2 \le 2\pi \|f\|_{H^s} |\theta|^s$.
(b) Derive a rate of convergence for $\|S_N f - f\|_2$ in terms of $N$ alone, assuming that $\|f\|_{H^s} \le 1$ where $s > 0$ is fixed.

1.4. Regularity and Fourier series

1.4.1. Bernstein’s inequality

We now investigate further the connection between the regularity of a function and its associated Fourier series. The following estimate, which builds upon the fact that $(d/dx)\,e(nx) = 2\pi i n\, e(nx)$, is known as Bernstein’s inequality.

Proposition 1.11 Let $f$ be a trigonometric polynomial with $\hat f(k) = 0$ for all $|k| > n$. Then $\|f'\|_p \le Cn \|f\|_p$ for any $1 \le p \le \infty$. The constant $C$ is absolute.

Proof Let $V_n(x) := (1 + e(nx) + e(-nx)) K_n(x)$ be de la Vallée Poussin’s kernel. We leave it to the reader to check that
$$\widehat{V_n}(j) = 1 \quad \text{if } |j| \le n$$
as well as $\|V_n'\|_1 \le Cn$. Then $f = V_n * f$ and thus $f' = V_n' * f$ so that, by Young’s inequality,
$$\|f'\|_p \le \|V_n'\|_1 \|f\|_p \le Cn \|f\|_p$$
as claimed.
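The coefficient claim for de la Vallée Poussin’s kernel can be checked by exact arithmetic. The sketch below assumes the triangular Fejér coefficients from (1.16) and uses the fact that multiplying by $e(\pm nx)$ shifts Fourier coefficients by $\pm n$; the function names are illustrative only.

```python
from fractions import Fraction


def fejer_hat(n, j):
    """Fourier coefficient of the Fejer kernel K_n at frequency j: (1 - |j|/n)^+."""
    return max(Fraction(0), 1 - Fraction(abs(j), n))


def vallee_poussin_hat(n, j):
    """Fourier coefficient of V_n = (1 + e(nx) + e(-nx)) K_n at frequency j.

    Multiplication by e(nx) shifts coefficients to the right by n and
    multiplication by e(-nx) shifts them to the left by n.
    """
    return fejer_hat(n, j) + fejer_hat(n, j - n) + fejer_hat(n, j + n)
```

For $0 \le j \le n$ the first two terms are $1 - j/n$ and $j/n$ while the third vanishes, which is why the sum equals 1 on the whole range $|j| \le n$.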

1.4.2. Convex Fourier coefficients

It is interesting to note that there exists a converse to Bernstein’s inequality due to Bohr. We ask the reader to verify a special case of this converse in Exercise 1.7 below; see Problem 1.8 at the end of the chapter for a more difficult case. For the proofs of these converses it is useful to invoke the following general fact.

Lemma 1.12 Let $\{a_n\}_{n \in \mathbb{Z}}$ be an even sequence of nonnegative numbers that tend to zero, which is convex in the following sense:
$$a_{n+1} + a_{n-1} - 2a_n \ge 0 \quad \forall n > 0.$$
Then there exists $f \in L^1(\mathbb{T})$ with $f \ge 0$ and $\hat f(n) = a_n$.

Proof To understand the construction, we start from the simple observation (based on integration by parts) that
$$\int_x^\infty \varphi''(y)(y - x)\,dy = \varphi(x)$$
for any $C^2$ function on the line such that $\varphi'(x)x \to 0$ and $\varphi(x) \to 0$ as $x \to \infty$. In the discrete setting, this becomes
$$\sum_{n > m} (a_{n+1} + a_{n-1} - 2a_n)(n - m) = a_m \tag{1.23}$$
for every sequence such that $a_n \to 0$ and $n(a_n - a_{n-1}) \to 0$ as $n \to \infty$. In particular, for such sequences (1.23) can be rewritten as (with Fejér kernel $K_n$)
$$\sum_{n > m} n(a_{n+1} + a_{n-1} - 2a_n)\, \widehat{K_n}(m) = a_m,$$
which means that the required function $f$ is given by
$$\sum_{n=1}^{\infty} n(a_{n+1} + a_{n-1} - 2a_n) K_n =: f.$$
Note that, by the convexity assumption, $a_n - a_{n+1}$ is a decreasing sequence, whence $n(a_n - a_{n+1}) \to 0$ and
$$\sum_{n=1}^{N} n(a_{n+1} + a_{n-1} - 2a_n) = a_0 - a_N - N(a_N - a_{N+1}) \to a_0$$
as $N \to \infty$.
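The discrete identity (1.23) is easy to test in exact arithmetic on a finitely supported convex sequence, for which the infinite sum reduces to a finite one. The sequence chosen below, $a_n = ((M - |n|)^+)^2 / M^2$ with $M = 8$, is an assumption made purely for illustration; the helper names are hypothetical.

```python
from fractions import Fraction


def example_sequence(n, M=8):
    """An even, nonnegative, convex sequence vanishing for |n| >= M."""
    return Fraction(max(0, M - abs(n)) ** 2, M * M)


def second_difference(a, n):
    """The discrete second derivative a(n+1) + a(n-1) - 2*a(n)."""
    return a(n + 1) + a(n - 1) - 2 * a(n)


def reconstruct(a, m, cutoff):
    """Right-hand side sum of (1.23), truncated at n = cutoff.

    For a sequence vanishing beyond some index, the truncation is exact
    once cutoff exceeds that index by at least one.
    """
    return sum(second_difference(a, n) * (n - m) for n in range(m + 1, cutoff + 1))
```

With `m = 0` the same sum is the total weight $\sum_n n(a_{n+1} + a_{n-1} - 2a_n)$, which the proof identifies with $a_0$.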

Exercise 1.7 Suppose that $f \in L^1(\mathbb{T})$ satisfies $\hat f(j) = 0$ for all $j$ with $|j| < n$. Then show that $\|f''\|_p \ge Cn^2 \|f\|_p$ holds for all $1 \le p \le \infty$, where $C$ is independent of $n \in \mathbb{Z}^+$ and of the choices of $f$ and $p$.

1.4.3. Smoothness and Fourier coefficients

Now we will turn to the question of how the smoothness of a function is reflected in the decay of its Fourier coefficients. We begin with the easy observation, based on integration by parts, that for any $f \in C^1(\mathbb{T})$ one has
$$\widehat{f'}(n) = 2\pi i n\, \hat f(n) \quad \forall n \in \mathbb{Z}. \tag{1.24}$$
This shows that we have not only $\hat f(n) = O(n^{-1})$ but also $C^1(\mathbb{T}) \hookrightarrow H^1(\mathbb{T})$, where $H^1$ is the Sobolev space defined in (1.21). If $f \in C^k(\mathbb{T})$ with $k \ge 2$ then we may iterate this relation to obtain a decay of the form $O(n^{-k})$. The following exercise establishes the connection between rapid decay of Fourier coefficients and the infinite regularity of the corresponding function.

Exercise 1.8 Let $f \in L^1(\mathbb{T})$.
(a) Show that $f \in C^\infty(\mathbb{T})$ if and only if $\hat f$ decays rapidly, i.e., for every $M \ge 1$ one has $\hat f(n) = O(|n|^{-M})$ as $|n| \to \infty$.
(b) Show that $f(x) = F(e(x))$, where $F$ is analytic on some neighborhood of $\{|z| = 1\}$, if and only if $\hat f(n)$ decays exponentially, i.e., $\hat f(n) = O(e^{-\varepsilon |n|})$ as $|n| \to \infty$ for some $\varepsilon > 0$.

1.4.4. Smoothness and decay

In the previous subsection we used integration by parts to find the decay of $\hat f(n)$ provided that $f \in C^1(\mathbb{T})$. If $f$ lies in a Hölder continuous class $C^\alpha(\mathbb{T})$ with $0 \le \alpha \le 1$ then the decay may be obtained as follows. Starting from
$$\hat f(n) = -\int_{\mathbb{T}} e\Bigl(-n\Bigl(y + \frac{1}{2n}\Bigr)\Bigr) f(y)\,dy = -\int_{\mathbb{T}} e(-ny)\, f\Bigl(y - \frac{1}{2n}\Bigr)\,dy$$
we obtain
$$\hat f(n) = \frac{1}{2} \int_{\mathbb{T}} e(-ny) \Bigl(f(y) - f\Bigl(y - \frac{1}{2n}\Bigr)\Bigr)\,dy,$$
which implies that if $f \in C^\alpha(\mathbb{T})$ then
$$\hat f(n) = O(|n|^{-\alpha}) \quad \text{as } |n| \to \infty. \tag{1.25}$$

Note that (1.24) is strictly stronger than this bound, since it allows for $L^2$ summation and thus an embedding into $H^1(\mathbb{T})$, whereas (1.25) does not. However, since we have already observed that $C^1(\mathbb{T}) \hookrightarrow H^1(\mathbb{T})$ and since clearly $C^0(\mathbb{T}) \hookrightarrow H^0(\mathbb{T}) = L^2(\mathbb{T})$, one could invoke some interpolation machinery at this point to conclude that $C^\alpha(\mathbb{T}) \hookrightarrow H^\alpha(\mathbb{T})$ for all $0 \le \alpha \le 1$; nevertheless we shall not make use of this fact. Next, we ask how much regularity on $f$ it takes to achieve
$$\sum_{n=-\infty}^{\infty} |\hat f(n)| < \infty. \tag{1.26}$$
In other words, we ask for a sufficient condition for $f$ to lie in the Wiener algebra $A(\mathbb{T})$ of Exercise 1.2. There are a number of ways to interpret the requirement of “regularity”. It is easy to settle this embedding question at the level of the spaces $H^s(\mathbb{T})$. In fact, applying Cauchy–Schwarz yields
$$\sum_{n \ne 0} |\hat f(n)| \le \Bigl(\sum_{n \ne 0} |\hat f(n)|^2 |n|^{1+\varepsilon}\Bigr)^{1/2} \Bigl(\sum_{n \ne 0} |n|^{-1-\varepsilon}\Bigr)^{1/2},$$
so that
$$\sum_{n=-\infty}^{\infty} |\hat f(n)|^2 |n|^{1+\varepsilon} < \infty \tag{1.27}$$
is a sufficient condition for (1.26) to hold. In other words, we have shown that $H^s(\mathbb{T}) \hookrightarrow A(\mathbb{T})$ for any $s > \frac{1}{2}$. We leave it to the reader to check that this fails for $s = \frac{1}{2}$.

1.4.5. Sobolev spaces and embeddings

Next, we would like to address the more challenging question as to the minimal value of $\alpha > 0$ such that $C^\alpha(\mathbb{T}) \hookrightarrow A(\mathbb{T})$. In fact, we first ask, for which values of $\alpha$ does $C^\alpha(\mathbb{T}) \hookrightarrow H^s(\mathbb{T})$ for some $s > \frac{1}{2}$? We have already observed that $\alpha > \frac{1}{2}$ suffices for this, but this observation required some interpolation theory. Instead, we prefer to give a direct proof in Theorem 1.13 below. It introduces an important idea that we shall see repeatedly throughout this book, namely the grouping of the Fourier coefficients $\hat f(n)$ into blocks having approximately

the same $n$; more precisely, we introduce the partial sums
$$(P_j f)(x) := \sum_{2^{j-1} \le |n| < 2^j} \hat f(n)\, e(nx).$$

Theorem 1.13 For any $\alpha > 0$ one has $C^\alpha(\mathbb{T}) \hookrightarrow H^\beta(\mathbb{T})$ for arbitrary $0 < \beta < \alpha$. In particular, for any $f \in C^\alpha(\mathbb{T})$ with $\alpha > \frac{1}{2}$ one has
$$\sum_{n \in \mathbb{Z}} |\hat f(n)| < \infty$$
and thus $C^\alpha(\mathbb{T}) \hookrightarrow A(\mathbb{T})$ for any $\alpha > \frac{1}{2}$.

Proof As in the second proof of Theorem 1.2, for simplicity, we take $\alpha < 1$; the case $\alpha = 1$ is left to the reader. Let $[f]_\alpha \le 1$. We claim that, for every $j \ge 0$,
$$\sum_{2^j \le |n| < 2^{j+1}} |\hat f(n)|^2 \le C 2^{-2j\alpha}. \tag{1.29}$$
… α. Hint: To bound $|f(x + y) - f(x)|$ distinguish the cases $2^k |y| < 1$ and $2^k |y| \ge 1$.

The second statement of Theorem 1.13, concerning $A(\mathbb{T})$, is more difficult.

Proposition 1.14 There is no embedding from $C^{1/2}(\mathbb{T})$ into $A(\mathbb{T})$. In fact, there exists a function $f \in C^{1/2}(\mathbb{T}) \setminus A(\mathbb{T})$.

Proof We claim that there exists a sequence of trigonometric polynomials $P_n(x) = \sum_{\ell=0}^{2^n - 1} a_{n,\ell}\, e(\ell x)$ such that, with $N = 2^n$,
$$\|P_n\|_\infty \asymp \sqrt{N}, \qquad \|\widehat{P_n}\|_{\ell^1} \asymp N \tag{1.34}$$
for each $n \ge 1$ (the $P_n(x)$ are the Rudin–Shapiro polynomials). Here $a \asymp b$ means $C^{-1} a \le b \le Ca$, where $C$ is an absolute constant, and $\widehat{P_n}$ refers to the sequence of Fourier coefficients. Assuming such a sequence for now, we set
$$T_n(x) := 2^{-n} e(2^n x) P_n(x), \qquad f := \sum_{n=1}^{\infty} T_n.$$
Note that the above series converges uniformly to $f \in C(\mathbb{T})$ since $\|T_n\|_\infty \le C 2^{-n/2}$. Moreover, the Fourier supports of the $T_n$ are pairwise disjoint for distinct $n$. Thus, $\|f\|_{A(\mathbb{T})} = \infty$ and $f \notin A(\mathbb{T})$. Finally, let $h \asymp 2^{-m}$ for some positive integer $m$. Then
$$|f(x + h) - f(x)| \le \sum_{n \le m} |T_n(x + h) - T_n(x)| + \sum_{n > m} 2 \|T_n\|_\infty \le \sum_{n \le m} C |h|\, \|T_n'\|_\infty + \sum_{n > m} C 2^{-n/2} \le C \sum_{n \le m} 2^{n/2} |h| + C 2^{-m/2} \le C |h|^{1/2}$$
and thus $f \in C^{1/2}(\mathbb{T})$ as desired. To pass to the last line we used Bernstein’s inequality, Proposition 1.11. It thus remains to establish the existence of the Rudin–Shapiro polynomials (1.34). Define them inductively by $P_0(x) = Q_0(x) = 1$ and
$$P_{n+1}(x) = P_n(x) + e(2^n x) Q_n(x), \qquad Q_{n+1}(x) = P_n(x) - e(2^n x) Q_n(x),$$
for each $n \ge 0$. Since
$$|P_{n+1}(x)|^2 + |Q_{n+1}(x)|^2 = 2\bigl(|P_n(x)|^2 + |Q_n(x)|^2\bigr) = 2^{n+2}$$
we see that $\|P_{n+1}\|_\infty$ is of the desired magnitude. Furthermore, all coefficients of both $P_n$ and $Q_n$ are either $+1$ or $-1$, and the only exponentials with nonzero coefficients are $e(\ell x)$ with $0 \le \ell \le 2^n - 1$. Hence $\{P_n\}_{n=0}^{\infty}$ is the desired family of polynomials.

Exercise 1.10 Show that any trigonometric polynomial $P$ with $\|\hat P\|_{\ell^1} = N$ and the property that the cardinality of its Fourier support is at most $N$ satisfies $\sqrt{N} \le \|P\|_2 \le \|P\|_\infty \le N$. Hence the polynomials constructed in the previous proof are “extremal” for the lower bound on $\|P\|_\infty$, whereas the Dirichlet kernel (or Fejér kernel) is extremal for the upper bound.
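The inductive definition of the Rudin–Shapiro polynomials translates directly into a recursion on coefficient lists: since $P_n$ and $Q_n$ have exactly $2^n$ coefficients, multiplying $Q_n$ by $e(2^n x)$ amounts to appending its coefficients after those of $P_n$. The following minimal sketch, with hypothetical helper names and an arbitrarily chosen sampling grid, lets one observe the $\pm 1$ coefficients and the identity $|P_n|^2 + |Q_n|^2 = 2^{n+1}$ numerically.

```python
import cmath
import math


def rudin_shapiro(n):
    """Coefficient lists of P_n and Q_n; entry l is the coefficient of e(l x)."""
    p, q = [1], [1]
    for _ in range(n):
        # P_{k+1} = P_k + e(2^k x) Q_k and Q_{k+1} = P_k - e(2^k x) Q_k:
        # the shift by 2^k places the coefficients of Q_k after those of P_k.
        p, q = p + q, p + [-c for c in q]
    return p, q


def evaluate(coeffs, x):
    """Evaluate sum_l coeffs[l] * e(l x) at the point x."""
    return sum(c * cmath.exp(2j * math.pi * l * x) for l, c in enumerate(coeffs))


def sup_norm_estimate(coeffs, samples=512):
    """Maximum of |P| over an equispaced grid (a lower estimate of the sup norm)."""
    return max(abs(evaluate(coeffs, k / samples)) for k in range(samples))
```

Because $|P_n(x)|^2 \le 2^{n+1}$ pointwise, the grid maximum never exceeds $\sqrt{2^{n+1}}$, while the $\ell^1$ norm of the coefficients is exactly $2^n$.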

1.5. Higher dimensions

We conclude this chapter with a brief discussion of the Fourier series associated with functions on $\mathbb{T}^d = \mathbb{R}^d/\mathbb{Z}^d$ with $d \ge 2$. The exponential basis in this case is $\{e(\nu \cdot x)\}_{\nu \in \mathbb{Z}^d}$, where the dot indicates a scalar product in $\mathbb{R}^d$, and this is an orthonormal family in the usual $L^2(\mathbb{T}^d)$ sense. Thus, with every measure $\mu \in M(\mathbb{T}^d)$ we associate a Fourier series
$$\sum_{\nu \in \mathbb{Z}^d} \hat\mu(\nu)\, e(\nu \cdot x), \qquad \hat\mu(\nu) := \int_{\mathbb{T}^d} e(-\nu \cdot x)\,\mu(dx).$$
As before, a special role is played by the trigonometric polynomials
$$\sum_{\nu \in \mathbb{Z}^d} a_\nu\, e(\nu \cdot x),$$
where all but finitely many $a_\nu$ vanish. In contrast with the one-dimensional torus, in higher dimensions we face a nontrivial ambiguity in the definition of partial sums. In fact we can pose the convergence problem relative to any exhaustion of $\mathbb{Z}^d$ by finite sets $A_k$ that are increasing and whose union is the whole of $\mathbb{Z}^d$. A possible choice here is the squares $[-k, k]^d$, but one could also take more general rectangles, or balls relative to the Euclidean metric, or other shapes. Even though this distinction may seem innocuous, it has given rise to very important developments in harmonic analysis such as Fefferman’s ball multiplier theorem and the Bochner–Riesz conjecture, which is still unresolved in dimensions 3 and higher. For now, we content ourselves with some basic results.

Proposition 1.15 The space of trigonometric polynomials is dense in $C(\mathbb{T}^d)$, and one has Parseval’s identity for $L^2(\mathbb{T}^d)$, i.e.,
$$\|f\|_2^2 = \sum_{\nu \in \mathbb{Z}^d} |\hat f(\nu)|^2 \quad \forall f \in L^2(\mathbb{T}^d).$$

If $f \in C^\infty(\mathbb{T}^d)$ then the Fourier series associated with $f$ converges uniformly to $f$ irrespective of the way in which the partial sums are formed.

Proof We base the proof on the fact that the products of Fejér kernels with respect to the individual coordinate axes form an approximate identity. In other words, the family
$$K_{N,d}(x) := \prod_{j=1}^{d} K_N(x_j) \quad \forall N \ge 1$$
forms an approximate identity on $\mathbb{T}^d$. Here $x = (x_1, \ldots, x_d)$. This follows immediately from the one-dimensional analysis above. Hence, for any $f \in C(\mathbb{T}^d)$ one has $\|K_{N,d} * f - f\|_\infty \to 0$ as $N \to \infty$. By inspection $K_{N,d} * f$ is a trigonometric polynomial, which implies the claimed denseness and thus also the Plancherel theorem. Suppose that $f \in C^\infty(\mathbb{T}^d)$. Repeated integration by parts yields the estimate $\hat f(\nu) = O(|\nu|^{-m})$ as $|\nu| \to \infty$ for arbitrary but fixed $m \ge 1$. The convergence statement now follows from $K_{N,d} * f \to f$ in $C(\mathbb{T}^d)$ as $N \to \infty$ and the rapid decay of the coefficients.

A useful corollary to the previous result is that tensor functions are dense in $C(\mathbb{T}^d)$. A tensor function on $\mathbb{T}^d$ is a linear combination of functions of the form $\prod_{j=1}^{d} f_j(x_j)$ where $f_j \in C(\mathbb{T})$. In particular, trigonometric polynomials are tensor functions, whence the claim.

1.6. Interpolation of operators

Throughout this book we shall make use of the following two fundamental interpolation theorems. We will merely state the results and for more details refer the reader to standard references; see the notes below. The first result is due to Riesz and Thorin. Here $L^p$ spaces are scalar-valued (real or complex), and throughout the book all measure spaces are $\sigma$-finite.

Theorem 1.16 Let $(X, \mu)$ be a measure space. Suppose that we have $1 \le p_1, p_2 \le \infty$ and assume that $Y \subset L^{p_1}(X, \mu) \cap L^{p_2}(X, \mu)$ is a subspace that is dense in both $L^{p_1}(X, \mu)$ and $L^{p_2}(X, \mu)$. Let $T$ be a linear operator defined on $Y$ that takes its values in the measurable functions on some other space $(\tilde X, \tilde\mu)$ in such a way that for all $f \in Y$ one has
$$\|Tf\|_{L^{q_j}(\tilde X, \tilde\mu)} \le A_j \|f\|_{L^{p_j}(X, \mu)}, \quad j = 1, 2,$$
where $1 \le q_1, q_2 \le \infty$. Then
$$\|Tf\|_{L^{q}(\tilde X, \tilde\mu)} \le A_1^{1-\theta} A_2^{\theta} \|f\|_{L^{p}(X, \mu)} \quad \forall f \in Y,$$
where
$$\frac{1}{p} = \frac{1-\theta}{p_1} + \frac{\theta}{p_2}, \qquad \frac{1}{q} = \frac{1-\theta}{q_1} + \frac{\theta}{q_2} \tag{1.35}$$
for all $0 \le \theta \le 1$.

Like Hölder’s inequality this interpolation result is based on convexity, to be precise log-convexity. The standard proof relies on the three-lines theorem from complex analysis, which states that the maximum of the absolute value along vertical lines of a function holomorphic in a vertical strip is log-convex.

In addition to operators that are bounded on Lebesgue spaces, we shall also need the following weak-type boundedness property. Let $T$ be a map from $L^p(X, \mu)$ to the measurable functions on $(\tilde X, \tilde\mu)$. For $1 \le q < \infty$ we say that $T$ is weak-type $(p, q)$ if and only if
$$\tilde\mu\bigl(\{x \in \tilde X : |(Tf)(x)| > \lambda\}\bigr) \le A \lambda^{-q} \|f\|_{L^p(X, \mu)}^{q} \quad \forall \lambda > 0$$
for all $f \in L^p(X, \mu)$. Further, we define weak-type $(p, \infty)$ to be the same as strong-type $(p, \infty)$, which simply means “bounded in the usual Lebesgue sense”. In the following Marcinkiewicz interpolation theorem we allow the operators $T$ to be quasilinear, which means that for some constant $\kappa > 0$ one has $|T(f_1 + f_2)| \le \kappa(|Tf_1| + |Tf_2|)$ for all step functions $f_1, f_2$.

Theorem 1.17 Suppose that $1 \le p_1 < p_2 \le \infty$ and $q_j \ge p_j$ with $q_1 \ne q_2$. Let $(p, q)$ be as in (1.35) with $0 < \theta < 1$. If $T$ is a quasilinear operator that is weak-type $(p_j, q_j)$ for $j = 1, 2$ then $T$ is strong-type $(p, q)$.

This is proved by breaking functions up according to the sizes of their level sets. It applies in the wider context of Lorentz spaces. To be more specific, one can weaken the hypotheses of Theorem 1.17 further by requiring only weak-type bounds when $T$ is applied to indicator functions of sets. This is referred to as restricted weak-type $(p, q)$ and the resulting interpolation theorem can be very helpful in applications.

Notes

An encyclopedic treatment of Fourier series is found in the classic treatise by Zygmund [133]. A less formidable but still comprehensive account of many fundamental results is given in Katznelson [65]. Both these references contain the interpolation results in Section 1.6; see in particular [65, Chapter IV]. A comprehensive account of interpolation theory is given in the monograph by Bergh and Löfström [7], which goes far beyond what we need here. Stein and Weiss [112, Chapter V] presents restricted weak-type interpolation as well as Lorentz spaces. For the construction in Proposition 1.14, see [65, p. 36, Exercise 6.6]. For gap series, as well as other constructions involving Fourier series with an arithmetic flavor, see the classic paper of Rudin [94], which introduced the $\Lambda(p)$-set problem later solved by Bourgain [10].

Problems

Problem 1.1 Suppose that $f \in L^1(\mathbb{T})$ and that $\{S_n f\}_{n=1}^\infty$ (the sequence of partial sums of the Fourier series for $f$) converges in $L^p(\mathbb{T})$ to $g$ for some $p \in [1,\infty]$ and some $g \in L^p$. Prove that $f = g$. If $p = \infty$ conclude that $f$ is continuous.

Problem 1.2 Let $T(x) = \sum_{n=0}^{N} (a_n \cos(2\pi n x) + b_n \sin(2\pi n x))$ be an arbitrary trigonometric polynomial with real coefficients $a_0, \dots, a_N, b_0, \dots, b_N$. Show that there is a polynomial $P(z) = \sum_{\ell=0}^{2N} c_\ell z^\ell \in \mathbb{C}[z]$ such that $T(x) = e^{-2\pi i N x} P(e^{2\pi i x})$ and such that $P(z) = z^{2N}\, \overline{P(\bar z^{-1})}$. How are the zeros of $P$ distributed in the complex plane?

Problem 1.3 Suppose that $T(x) = \sum_{n=0}^{N} (a_n \cos(2\pi n x) + b_n \sin(2\pi n x))$ is such that $T \ge 0$ everywhere on $\mathbb{T}$ and $a_n, b_n \in \mathbb{R}$ for all $n = 0, 1, \dots, N$. Show that there are $c_0, \dots, c_N \in \mathbb{C}$ such that
\[
T(x) = \Bigl|\sum_{n=0}^{N} c_n e^{2\pi i n x}\Bigr|^2. \tag{1.36}
\]
Find the $c_n$ for the Fejér kernel.

Problem 1.4 Suppose that $T(x) = a_0 + \sum_{h=1}^{H} a_h \cos(2\pi h x)$ satisfies $T(x) \ge 0$ for all $x \in \mathbb{T}$ and $T(0) = 1$. Show that, for any complex numbers $y_1, \dots, y_N$,
\[
\Bigl|\sum_{n=1}^{N} y_n\Bigr|^2 \le (N+H)\Bigl( a_0 \sum_{n=1}^{N} |y_n|^2 + \sum_{h=1}^{H} |a_h|\, \Bigl|\sum_{n=1}^{N-h} y_{n+h}\bar y_n\Bigr| \Bigr).
\]
Hint: Write (1.36) with $\sum_n c_n = 1$. Then apply the Cauchy–Schwarz inequality to $\sum_n y_n = \sum_{m,n} y_n c_m$. The above result is called van der Corput's inequality; see Montgomery [84], p. 18. It plays a role in the theory of the uniform distribution of sequences; see the Chapter 6 problems.

Problem 1.5 Suppose that $\sum_{n=1}^\infty n |a_n|^2 < \infty$ and $\sum_{1}^\infty a_n$ is Cesàro summable. Show that $\sum_{n=1}^\infty a_n$ converges. Use this to prove that any $f \in C(\mathbb{T}) \cap H^{1/2}(\mathbb{T})$ satisfies $S_n f \to f$ uniformly. Note that $H^{1/2}(\mathbb{T})$ does not embed into $A(\mathbb{T})$, so this convergence does not follow trivially.



Problem 1.6 Show that there exists an absolute constant $C$ such that
\[
C^{-1} \sum_{n \ne 0} |n|\, |\hat f(n)|^2 \le \iint_{\mathbb{T}^2} \frac{|f(x) - f(y)|^2}{\sin^2(\pi(x-y))}\, dx\, dy \le C \sum_{n \ne 0} |n|\, |\hat f(n)|^2
\]
for any $f \in H^{1/2}(\mathbb{T})$. Does this generalize to $H^s(\mathbb{T})$ and, if so, for which values of $s$?

Problem 1.7 Use the previous two problems to prove the following theorem of Pál and Bohr: for any real function $f \in C(\mathbb{T})$ there exists a homeomorphism $\varphi : \mathbb{T} \to \mathbb{T}$ such that $S_n(f \circ \varphi) \to f \circ \varphi$ uniformly. Hint: Without loss of generality let $f > 0$. Consider the domain defined in terms of polar coordinates by means of $r(\theta) = f(\theta/2\pi)$. Then apply the Riemann mapping theorem to the unit disk.

Problem 1.8 Suppose that $f \in L^1(\mathbb{T})$ satisfies $\hat f(j) = 0$ for all $j$ with $|j| < n$. Show that $\|f'\|_p \ge C n \|f\|_p$ for all $1 \le p \le \infty$, where $C$ is independent of $n \in \mathbb{Z}^+$ and of the choices of $f$ and $p$.

Problem 1.9 Show that $\|f * g\|^2_{L^2(\mathbb{T})} \le \|f * f\|_{L^2(\mathbb{T})}\, \|g * g\|_{L^2(\mathbb{T})}$ for all $f, g \in L^2(\mathbb{T})$.

Problem 1.10 Show that for every $p > 0$ there exists an approximate identity $K_{N,p}$ on $\mathbb{T}$ with the following properties:
• $\hat K_{N,p}(\nu) = 1$ for all $|\nu| \le N$,
• $\hat K_{N,p}(\nu) = 0$ for all $|\nu| > CN$,
• $|K_{N,p}(\theta)| \le C N^{1-p} \min(N^p, |\theta|^{-p})$,
where $C = C(p)$ is some constant and $N \ge 1$ is arbitrary.

Problem 1.11 Given $N$ disjoint arcs $\{I_\alpha\}_{\alpha=1}^{N} \subset \mathbb{T}$, set $f = \sum_{\alpha=1}^{N} \chi_{I_\alpha}$. Show that
\[
\sum_{|\nu| > k} |\hat f(\nu)|^2 \le \frac{CN}{k}.
\]
Hint: The bound $N^2/k$ is much easier and should be obtained first. Going from $N^2$ to $N$ then requires one to exploit orthogonality in a suitable fashion.

Problem 1.12 Given any function $\psi : \mathbb{Z}^+ \to \mathbb{R}^+$ such that $\psi(n) \to 0$ as $n \to \infty$, show that one can find a measurable set $E \subset \mathbb{T}$ for which
\[
\limsup_{n \to \infty} \frac{|\hat\chi_E(n)|}{\psi(n)} = \infty.
\]



Problem 1.13 In this problem the reader is asked to analyze some well-known partial differential equations in terms of Fourier series.
(a) Solve the heat equation $u_t - u_{\theta\theta} = 0$, $u(0) = u_0$ (the data at time $t = 0$) on $\mathbb{T}$ using a Fourier series. Show that if $u_0 \in L^2(\mathbb{T})$ then $u(t,\theta)$ is analytic in $\theta$ for every $t > 0$ and solves the heat equation. Prove that $\|u(t) - u_0\|_2 \to 0$ as $t \to 0$. Write $u(t) = G_t * u_0$ and show that $G_t$ is an approximate identity for $t > 0$. Conclude that $u(t) \to u_0$ as $t \to 0$ in the $L^p$ or $C(\mathbb{T})$ sense. Repeat for higher-dimensional tori.
(b) Solve the Schrödinger equation $i u_t - u_{\theta\theta} = 0$, $u(0) = u_0$ on $\mathbb{T}$ with $u_0 \in L^2(\mathbb{T})$, using a Fourier series. In what sense does this Fourier series “solve” the equation? Show that $\|u(t)\|_2 = \|u_0\|_2$ for all $t$. Discuss the limit of $u(t)$ as $t \to 0$. Repeat for higher-dimensional tori.
(c) Solve the wave equation $u_{tt} - u_{\theta\theta} = 0$ on $\mathbb{T}$ by Fourier series. Discuss the Cauchy problem as in (a) and (b). Show that if $u_t(0) = 0$ then, with $u(0) = f$, $u(t,\theta) = \tfrac12 (f(\theta + t) + f(\theta - t))$.
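As a numerical companion to part (a) of Problem 1.13, the following Python sketch evaluates the Fourier-series solution of the heat equation for a trigonometric polynomial datum, using the convention $e(\theta) = e^{2\pi i \theta}$, so that the $n$th mode decays like $e^{-4\pi^2 n^2 t}$. It is only an illustration under stated assumptions: the datum `u0_coeffs`, the helper names, and the finite-difference step sizes are hypothetical choices made here and do not appear in the text, and a numerical check of this kind does not replace the proofs the problem asks for.

```python
import cmath
import math


def e(x):
    """Character e(x) = exp(2*pi*i*x) on T = R/Z."""
    return cmath.exp(2j * math.pi * x)


def heat_solution(coeffs, t, theta):
    """u(t, theta) = sum_n c_n * exp(-4*pi^2*n^2*t) * e(n*theta)."""
    return sum(c * math.exp(-4 * math.pi ** 2 * n * n * t) * e(n * theta)
               for n, c in coeffs.items())


def l2_distance_to_data(coeffs, t):
    """||u(t) - u0||_2, computed mode by mode via Parseval's identity."""
    return math.sqrt(sum(
        abs(c) ** 2 * (1 - math.exp(-4 * math.pi ** 2 * n * n * t)) ** 2
        for n, c in coeffs.items()))


def pde_residual(coeffs, t, theta, ht=1e-5, hx=1e-4):
    """Central-difference approximation of u_t - u_theta_theta."""
    u_t = (heat_solution(coeffs, t + ht, theta)
           - heat_solution(coeffs, t - ht, theta)) / (2 * ht)
    u_xx = (heat_solution(coeffs, t, theta + hx)
            - 2 * heat_solution(coeffs, t, theta)
            + heat_solution(coeffs, t, theta - hx)) / hx ** 2
    return u_t - u_xx


# Hypothetical datum: u0(theta) = cos(2*pi*theta) + 0.5*sin(4*pi*theta),
# written through its Fourier coefficients.
u0_coeffs = {1: 0.5, -1: 0.5, 2: -0.25j, -2: 0.25j}

if __name__ == "__main__":
    print("PDE residual:", abs(pde_residual(u0_coeffs, 0.01, 0.3)))
    for t in (1e-2, 1e-4, 1e-6):
        print("t =", t, " ||u(t) - u0||_2 =", l2_distance_to_data(u0_coeffs, t))
```

For a general $u_0 \in L^2(\mathbb{T})$ the same formulas apply with infinitely many modes; the Gaussian decay of the multipliers for $t > 0$ is what makes the series analytic in $\theta$.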

2 Harmonic functions; Poisson kernel

2.1. Harmonic functions

In this chapter we shall investigate harmonic functions on the disk. This class of functions is not only of fundamental importance to analysis in general but also essential to the resolution of the $L^p$-convergence problem of Fourier series, i.e., to the question whether $\|S_N f - f\|_p \to 0$ as $N \to \infty$ when $1 < p < \infty$ (see Chapter 3). We now briefly review some basic facts about harmonic functions on general domains $\Omega \subset \mathbb{R}^2$ (in fact, we could also consider $\mathbb{R}^d$ here for most properties, with the exception of any reference to holomorphic functions). As usual, a domain is open and connected. We say that $u \in C^2(\Omega)$ is harmonic provided that $\Delta u = 0$ on $\Omega$, where $\Delta$ is the Laplacian, i.e., we require that $u_{xx} + u_{yy} = 0$ on $\Omega$. Examples of such functions are easily constructed; they include all linear functions and $u(x,y) = x^2 - y^2$. More generally, for any holomorphic $F$ on $\Omega$, both real and imaginary parts of $F$ are harmonic, as follows from the Cauchy–Riemann equations
\[
F = u + iv, \qquad u_x - v_y = 0, \qquad u_y + v_x = 0.
\]
We recall the important converse from complex function theory: if $u$ is harmonic on a simply connected domain $\Omega$ then there exists a holomorphic function $F$ on $\Omega$ with $\operatorname{Re} F = u$. To see this, define
\[
v(z) = \int_\gamma -u_y\, dx + u_x\, dy,
\]
where $\gamma$ is any path connecting a fixed point $z_0 \in \Omega$ with the variable point $z \in \Omega$. By the harmonicity of $u$ and the simple connectivity of $\Omega$ we see that the path integral does not depend on the specific choice of $\gamma$. By inspection, $F := u + iv$ is $C^2$ and the Cauchy–Riemann equations hold, whence $F$ is holomorphic.



This implies that harmonic functions $u$ are analytic in the sense that their infinite Taylor series converge locally and are equal to $u$. In particular, harmonic functions are $C^\infty$. In a discrete setting, harmonic functions $u$ on the lattice $\mathbb{Z}^2$ are given by
\[
u(n,m) = \tfrac14 \bigl[ u(n+1,m) + u(n-1,m) + u(n,m+1) + u(n,m-1) \bigr] \qquad \forall (n,m) \in \mathbb{Z}^2.
\]
Note that this means that the value of $u$ at every point $(n,m)$ is the average of the values of $u$ at the four nearest neighbors of $(n,m)$. This motivates the mean-value property enjoyed by harmonic functions in the continuum.

2.1.1. Mean-value property and maximum principle

The mean-value property is given as follows.

Lemma 2.1 Let $u$ be harmonic on some domain $\Omega \subset \mathbb{R}^2$. Then, for every $z \in \Omega$ and for all $0 < r < \operatorname{dist}(z, \partial\Omega)$, one has
\[
u(z) = \int_{\mathbb{T}} u(z + re(\theta))\, d\theta = \frac{1}{\pi r^2} \iint_{z + r\mathbb{D}} u(x,y)\, dx\, dy. \tag{2.1}
\]
Proof By the divergence theorem, with $dm$ the Lebesgue measure in the plane,
\[
\frac{d}{dr} \int_{\mathbb{T}} u(z + re(\theta))\, d\theta = \frac{1}{2\pi r} \int_{z + r\partial\mathbb{D}} \partial_n u(w)\, \sigma(dw) = \frac{1}{2\pi r} \iint_{z + r\mathbb{D}} \Delta u\, dm = 0.
\]
Thus, the circular means (with $\sigma$ the arc length)
\[
M(r) := \int_{\mathbb{T}} u(z + re(\theta))\, d\theta = \frac{1}{2\pi r} \int_{|w - z| = r} u(w)\, \sigma(dw)
\]
are constant and, since $M(r) \to u(z)$ as $r \to 0$, we see that $u(z) = M(r)$ as claimed. Integrating $s\,u(z) = s\,M(s)$ over $0 < s < r$ implies the mean-value property over solid disks.

The special role of circles here is no coincidence. In fact, not only is the Laplacian $\Delta$ invariant under rotations (which means that for every rotation $\rho$ in the plane and every $C^2$ function $u$ one has $\Delta(u \circ \rho) = (\Delta u) \circ \rho$) but, up to constant multiples, $\Delta$ is the only second-order operator of the following form with this property: if $L = a\partial_{xx} + b\partial_{xy} + c\partial_{yy}$ enjoys this commutation property with constant $a, b, c$ then $a = c$ and $b = 0$. An immediate consequence of the mean-value property is the maximum principle.

Corollary 2.2 Let $u$ be harmonic in $\Omega \subset \mathbb{R}^2$. If $u$ attains an extremum in $\Omega$ then $u$ must be constant.



Proof Suppose that $u(z) \le u(z_0)$ for all $z \in \Omega$, where $z_0 \in \Omega$ is fixed. Define $S := \{z \in \Omega \mid u(z) = u(z_0)\}$. Then $S \ne \emptyset$, and $S$ is closed in $\Omega$. Moreover, by the mean-value property, $S$ is also an open set, whence $S = \Omega$ as claimed, since $\Omega$ is connected.

Exercise 2.1 Show that it suffices to assume that $u$ attains a local extremum in the previous result.

Another common formulation of the maximum principle on bounded domains relies on boundary values.

Corollary 2.3 Let $\Omega$ be bounded and suppose that $u \in C(\bar\Omega)$ is harmonic on $\Omega$. Then
\[
\min_{\partial\Omega} u \le u(z) \le \max_{\partial\Omega} u \qquad \forall z \in \Omega,
\]
and equality can occur only if $u$ is a constant.

Proof Since $\bar\Omega$ is compact, $u$ attains both its maximum and its minimum on that set. We may assume that $u$ is not constant, but then Corollary 2.2 implies that the extrema are not attained in $\Omega$, whence the claim.
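The mean-value property of Lemma 2.1 and its lattice analogue can be checked numerically on the example $u(x,y) = x^2 - y^2$ mentioned at the start of the section. The Python sketch below is only an illustration: the function names, the sample count, and the comparison function $x^2 + y^2$ (which is not harmonic) are assumptions introduced here. The circular mean is approximated by an equally spaced Riemann sum.

```python
import math


def circle_mean(f, x0, y0, r, samples=360):
    """Average of f over equally spaced points of the circle |w - z| = r."""
    total = 0.0
    for k in range(samples):
        angle = 2 * math.pi * k / samples
        total += f(x0 + r * math.cos(angle), y0 + r * math.sin(angle))
    return total / samples


def neighbor_average(f, n, m):
    """Average of f over the four nearest lattice neighbors of (n, m)."""
    return 0.25 * (f(n + 1, m) + f(n - 1, m) + f(n, m + 1) + f(n, m - 1))


def harmonic(x, y):
    return x * x - y * y      # u_xx + u_yy = 2 - 2 = 0


def not_harmonic(x, y):
    return x * x + y * y      # u_xx + u_yy = 4, so the mean exceeds the center value


if __name__ == "__main__":
    z = (0.7, -0.2)
    print(harmonic(*z), circle_mean(harmonic, z[0], z[1], 1.5))
    print(not_harmonic(*z), circle_mean(not_harmonic, z[0], z[1], 1.5))
    print(harmonic(3, -5), neighbor_average(harmonic, 3, -5))
```

For the non-harmonic comparison function the circular mean exceeds the value at the center by exactly $r^2$, which is the kind of behavior that the sub-mean-value property in Chapter 3 formalizes.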

2.2. The Poisson kernel

There is a close connection between Fourier series and analytic or harmonic functions on the disk $\mathbb{D} := \{z \in \mathbb{C} \mid |z| < 1\}$. In fact, at least formally, Fourier series can be viewed as the “boundary values” of a Laurent series
\[
\sum_{n=-\infty}^{\infty} a_n z^n; \tag{2.2}
\]
this can be seen by setting $z = x + iy = e(\theta)$. Alternatively, suppose that we are given a continuous function $f$ on $\mathbb{T}$ and wish to find the harmonic extension $u$ of $f$ into $\mathbb{D}$, i.e., a solution to
\[
\Delta u = 0 \ \text{ in } \mathbb{D}, \qquad u = f \ \text{ on } \partial\mathbb{D} = \mathbb{T}. \tag{2.3}
\]
The term “solution” here refers to a pointwise solution, i.e., a function $u \in C^2(\mathbb{D}) \cap C(\bar{\mathbb{D}})$. However, as we shall see, it is also important to investigate other notions of solutions of (2.3), with less regular functions $f$.



2.2.1. Derivation of the Poisson kernel

Note that we cannot use negative powers of $z$ in (2.2) in an ansatz for $u$. However, we can use complex conjugates instead. Indeed, since $\Delta z^n = 0$ and $\Delta \bar z^n = 0$ for every integer $n \ge 0$, we are led to define
\[
u(z) = \sum_{n=0}^{\infty} \hat f(n) z^n + \sum_{n=-\infty}^{-1} \hat f(n) \bar z^{|n|}, \tag{2.4}
\]
which, at least formally, satisfies $u(e(\theta)) = \sum_{n=-\infty}^{\infty} \hat f(n) e(n\theta) = f(\theta)$. Inserting $z = re(\theta)$ and
\[
\hat f(n) = \int_{\mathbb{T}} e(-n\varphi) f(\varphi)\, d\varphi
\]
into (2.4) yields
\[
u(re(\theta)) = \int_{\mathbb{T}} \sum_{n \in \mathbb{Z}} r^{|n|} e(n(\theta - \varphi)) f(\varphi)\, d\varphi.
\]
This resembles the derivation of the Dirichlet kernel in Chapter 1, and we now ask the reader to find a closed form for the sum.

Exercise 2.2 Check that, for $0 \le r < 1$,
\[
P_r(\theta) := \sum_{n \in \mathbb{Z}} r^{|n|} e(n\theta) = \frac{1 - r^2}{1 - 2r\cos(2\pi\theta) + r^2}.
\]

This is the Poisson kernel.

2.2.2. The Poisson kernel as an approximate identity

Based on our formal calculation above, we therefore expect to obtain the harmonic extension of a sufficiently well-behaved function $f$ on $\mathbb{T}$ by means of the convolution
\[
u(re(\theta)) = \int_{\mathbb{T}} P_r(\theta - \varphi) f(\varphi)\, d\varphi = (P_r * f)(\theta)
\]
for $0 \le r < 1$. Note that $P_r(\theta)$, for $0 \le r < 1$, is a harmonic function of the variables $x + iy = re(\theta)$. Moreover, for any finite measure $\mu \in M(\mathbb{T})$ the expression $(P_r * \mu)(\theta)$ is not only well defined but in fact defines a harmonic function on $\mathbb{D}$. The remainder of this chapter will therefore be devoted to analyzing the boundary behavior of $P_r * \mu$. Clearly Proposition 1.5 will play an important role in this investigation, but the fact that we are dealing with harmonic functions will enter in a crucial way (such as through the maximum principle).
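The closed form for the Poisson kernel and the convolution formula for the harmonic extension lend themselves to a quick numerical sanity check. The following Python sketch is illustrative only: the function names, the truncation level, and the Riemann-sum quadrature are assumptions introduced here, not constructions from the text. It compares the truncated series $\sum_{|n| \le K} r^{|n|} e(n\theta)$ with the closed form and approximates $(P_r * f)(\theta)$ for the test function $f(\varphi) = \cos(2\pi\varphi)$, whose harmonic extension is $r\cos(2\pi\theta)$.

```python
import math


def poisson_series(r, theta, terms=400):
    """Truncated sum over |n| <= terms of r^{|n|} e(n*theta), in real form."""
    return 1.0 + 2.0 * sum(r ** n * math.cos(2 * math.pi * n * theta)
                           for n in range(1, terms + 1))


def poisson_closed(r, theta):
    """Closed form (1 - r^2) / (1 - 2 r cos(2 pi theta) + r^2)."""
    return (1 - r * r) / (1 - 2 * r * math.cos(2 * math.pi * theta) + r * r)


def harmonic_extension(f, r, theta, samples=2000):
    """Riemann-sum approximation of (P_r * f)(theta) over T = [0, 1)."""
    return sum(poisson_closed(r, theta - k / samples) * f(k / samples)
               for k in range(samples)) / samples


def boundary_data(phi):
    """Hypothetical test datum f(phi) = cos(2 pi phi)."""
    return math.cos(2 * math.pi * phi)


if __name__ == "__main__":
    r, theta = 0.9, 0.1
    print(poisson_series(r, theta), poisson_closed(r, theta))
    print(harmonic_extension(boundary_data, r, theta),
          r * math.cos(2 * math.pi * theta))
```

Taking $f \equiv 1$ in `harmonic_extension` returns approximately $1$, reflecting the normalization $\int_{\mathbb{T}} P_r = 1$ that any approximate identity must satisfy.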



Figure 2.1. Graphs of $P_r$ for $r = 0.70, 0.83, 0.90$.

In the following we use the notion of approximate identity in a more general form than in Chapter 1. However, the reader will have no difficulty in transferring this notion, including Proposition 1.5, to the present context. Exercise 2.3 Check that {Pr }0 12 ε}| + θ ∈ T |h(θ)| > 12 ε √ ≤ C ε. To pass to the final inequality we have used Proposition 2.9 as well as Markov’s inequality (recall that h 1 < ε); see the start of Section 2.3. 



As a corollary we obtain not only the classic Lebesgue differentiation theorem (by considering the box kernel) but also the a.e. convergence of the Cesàro means $\sigma_N f$ (via the Fejér kernel) as well as of the Poisson integrals $P_r * f$ to $f$ for any $f \in L^1(\mathbb{T})$. A theorem of Kolmogorov states that this fails for the partial sums $S_N f$ on $L^1(\mathbb{T})$. We will present this example in Chapter 6.

2.4.1. The case of measures

It is natural to ask whether there is an analogue of Theorem 2.12 for measures $\mu \in M(\mathbb{T})$. We begin with the following general fact from measure theory.

Lemma 2.13 If $\mu \in M(\mathbb{T})$ is a positive measure that is singular with respect to Lebesgue measure $m$ (symbolically, $\mu \perp m$) then for a.e. $\theta \in \mathbb{T}$ with respect to Lebesgue measure we have
\[
\frac{\mu([\theta - \varepsilon, \theta + \varepsilon])}{2\varepsilon} \to 0 \qquad \text{as } \varepsilon \to 0.
\]
Proof For every $\lambda \ge 0$ let
\[
E(\lambda) := \Bigl\{ \theta \in \mathbb{T} \Bigm| \limsup_{\varepsilon \to 0} \frac{\mu([\theta - \varepsilon, \theta + \varepsilon])}{2\varepsilon} > \lambda \Bigr\}.
\]
By assumption there exists a Borel set $A \subset \mathbb{T}$ with $|A| = 0$ and such that $\mu(E) = \mu(E \cap A)$ for every Borel set $E \subset \mathbb{T}$. Suppose first that $A$ is compact. Then it follows that $E(0) \subset A$, whence $|E(0)| = 0$ as desired. In general $A$ does not need to be compact, but, for every $\delta > 0$, there exists a compact $K_\delta$ such that $\mu(A \setminus K_\delta) < \delta$. Denote by $\mu_\delta$ the measure $\mu$ localized to $A \setminus K_\delta$. Then, by the preceding, we have for every $\lambda > 0$
\[
|E(\lambda)| \le |\{\theta \in \mathbb{T} \mid M(\mu_\delta)(\theta) > \lambda\}|.
\]
However, by the weak-$L^1$ estimate for the Hardy–Littlewood maximal function, see (2.9), one has that the measure of the set on the right-hand side satisfies
\[
|\{\theta \in \mathbb{T} \mid M(\mu_\delta)(\theta) > \lambda\}| \le C\lambda^{-1} \mu_\delta(\mathbb{T}) < C\lambda^{-1}\delta.
\]
Since $\delta > 0$ is arbitrary, we are done.

Exercise 2.6 Let { n }∞ n=1 satisfy (A1)–(A3) of Definition 1.3 and also (A4) from Definition 2.10, and assume that the {n }∞ n=1 from Definition 2.10 also satisfy sup δ λ}) ≤ C(A)λ−1 f L1 (μ) Mμ f Lp (μ) ≤ C(p, A) f Lp (μ)

∀λ > 0, ∀1 < p ≤ ∞,


for any functions f for which the respective right-hand sides are finite. 1

If the two maximal functions are denoted (Mf )1 and (Mf )2 then “comparable by multiplicative constants” means that c(Mf )1 < (Mf )2 < C(Mf )1 , where c, C given by c < 1 < C are constants.



Proof This is essentially identical with the proof of Proposition 2.9. The Wiener covering lemma applies equally well in $\mathbb{R}^d$, and we have $\mu(3B) \le A^2 \mu(B)$ by (2.13). This gives the weak-$L^1$ bound, and the $L^p$ bound follows by interpolation as before.

Note that this establishes the Lebesgue differentiation theorem for general doubling measures.

2.5.2. Weighted estimates for the maximal function

Next we wish to explore a somewhat different question, namely whether the standard maximal operator remains bounded on weighted spaces. More precisely, we would like to characterize all measurable functions $w \ge 0$ in $\mathbb{R}^d$ with the property that, for fixed $1 < p < \infty$, one has the bound
\[
\int_{\mathbb{R}^d} (Mf)^p(x)\, w(x)\, dx \le C(p) \int_{\mathbb{R}^d} |f(x)|^p\, w(x)\, dx, \tag{2.15}
\]
with constant $C(p)$ and all $f \in L^p(w)$, or even those functions having the weak-$L^p$ version (now also allowing $p = 1$),
\[
w(\{Mf > \lambda\}) \le C(p)\, \lambda^{-p} \int_{\mathbb{R}^d} |f(x)|^p\, w(x)\, dx, \tag{2.16}
\]
where $w(E) := \int_E w(x)\, dx$. Here $M$ is as in (2.12) but with $\mu$ the Lebesgue measure. Assume that (2.16) holds with $1 \le p < \infty$ fixed. Let $B$ be any ball and $f \ge 0$ be such that $f(B) := \int_B f(y)\, dy > 0$. Let $\lambda \in (0, f(B)/|B|)$. Then
\[
B \subset \{x \mid M(f\chi_B)(x) > \lambda\}
\]
and thus, from the weak-$L^p$ bound, we have
\[
w(B) \le C(p)\, \lambda^{-p} \int_B |f(y)|^p\, w(y)\, dy.
\]
Maximizing over $\lambda$ implies that
\[
w(B) \Bigl(\frac{f(B)}{|B|}\Bigr)^p \le C(p) \int_B |f(y)|^p\, w(y)\, dy. \tag{2.17}
\]
Setting $f := \chi_E$ for some measurable $E \subset B$ implies that
\[
w(B) \Bigl(\frac{|E|}{|B|}\Bigr)^p \le C(p)\, w(E). \tag{2.18}
\]
Exercise 2.7 Deduce the following dichotomies from (2.18):
(a) either $w > 0$ a.e. or $w = 0$ a.e.;
(b) either $w \in L^1_{\mathrm{loc}}(\mathbb{R}^d)$ or $w = \infty$ a.e.
Also verify that any $w$ satisfying (2.18) defines a doubling measure.



2.5 Weighted estimates for maximal functions


We shall now deduce conditions on $w$ from (2.17); they are called $A_p$ conditions on $w$. As we shall see, they are also sufficient for (2.16) (and, in fact, the strong bounds when $1 < p < \infty$) to hold. In what follows, $\fint_B := |B|^{-1} \int_B$, and we shall often omit the infinitesimal $dx$ from integrals relative to Lebesgue measure.

Proposition 2.15 If the estimate (2.16) holds with $p = 1$ then
\[
Mw \le C(1)\, w \quad \text{a.e.}, \tag{2.19}
\]
where $C(1)$ is the constant from (2.16) with $p = 1$. If (2.16) holds for $1 < p < \infty$ then
\[
\Bigl(\fint_B w\Bigr) \Bigl(\fint_B w^{1-p'}\Bigr)^{p-1} \le C(p) \qquad \forall B \subset \mathbb{R}^d, \tag{2.20}
\]
where $B$ is a ball.

Proof Because $w\,dx$ is a doubling measure, the Lebesgue differentiation theorem holds, as we verified above. Hence, taking $x$ to be a Lebesgue point of $w$, we infer from (2.18) with $p = 1$ that $w(B)/|B| \le C(1) w(x)$ for every ball $B \ni x$. This is equivalent to $Mw(x) \le C w(x)$. If $1 < p < \infty$ then we deduce from (2.17), by setting $f = w^{1-p'} \chi_B$, that
\[
w(B) \Bigl(\fint_B w^{1-p'}\Bigr)^p \le C(p) \int_B w^{1-p'},
\]
which implies (2.20). Strictly speaking, we need to replace $w$ here with $\min(w, n)$ and then let $n \to \infty$.

Exercise 2.8 Show that, for $p > 1$, any $w > 0$ that satisfies (2.20) also satisfies (2.17) and therefore (2.18). Hint: Use Hölder's inequality.

The conditions (2.19) and (2.20) are called $A_1$ and $A_p$ conditions, respectively. Any $w$ satisfying these conditions is referred to as an $A_p$ weight (with $1 \le p < \infty$); we write $w \in A_p$. These conditions are not only necessary for (2.16), as we have just shown, but are also sufficient. We shall now prove the following theorem.

Theorem 2.16 If $w \in A_p$ then (2.16) holds.

Exercise 2.9 Verify the following properties of $A_p$ classes:
(a) $A_p \subset A_q$ if $1 \le p < q$;
(b) $w \in A_p$ if and only if $w^{1-p'} \in A_{p'}$;
(c) if $w_0, w_1 \in A_1$ then $w_0 w_1^{1-p} \in A_p$.


Hint: These properties follow immediately from the definitions of the quantities and Hölder's inequality (the latter is needed for the first property only if $p > 1$).

For $p = 2$ the $A_p$-condition takes the following form:
\[
\Bigl(\fint_B w\Bigr) \Bigl(\fint_B w^{-1}\Bigr) \le C \qquad \forall B \subset \mathbb{R}^d.
\]

2.5.3. The Calderón–Zygmund decomposition

To prove Theorem 2.16, we shall use a fundamental device, the Calderón–Zygmund decomposition in $L^1(\mathbb{R}^d)$. The construction is an example of a stopping-time argument.

Lemma 2.17 Let $f \in L^1(\mathbb{R}^d)$ and $\lambda > 0$. Then one can write $f = g + b$ with $|g| \le \lambda$ and $b = \sum_{Q} \chi_Q f$, where the sum runs over a collection $\mathcal{B} = \{Q\}$ of disjoint cubes such that for each $Q$ one has
\[
\lambda < \frac{1}{|Q|} \int_Q |f| \le 2^d \lambda. \tag{2.22}
\]
Furthermore,
\[
\Bigl|\bigcup_{Q \in \mathcal{B}} Q\Bigr| < \frac{1}{\lambda} \|f\|_1. \tag{2.23}
\]
Proof For each $\ell \in \mathbb{Z}$ we define a collection $\mathcal{D}_\ell$ of dyadic cubes by
\[
\mathcal{D}_\ell = \Bigl\{ \prod_{i=1}^{d} [2^\ell m_i, 2^\ell (m_i + 1)) \Bigm| m_1, \dots, m_d \in \mathbb{Z} \Bigr\}. \tag{2.24}
\]
Notice that if $Q \in \mathcal{D}_\ell$ and $Q' \in \mathcal{D}_{\ell'}$ then $Q \cap Q' = \emptyset$ or $Q \subset Q'$ or $Q' \subset Q$. In other words, either any two dyadic cubes are disjoint or one cube is contained inside the other. Now pick $\ell_0$ large enough that
\[
\frac{1}{|Q|} \int_Q |f|\, dx \le \lambda
\]
for every $Q \in \mathcal{D}_{\ell_0}$. For each such cube consider its $2^d$ “children” of size $2^{\ell_0 - 1}$. Any such cube $Q'$ will then have the property that
\[
\text{either } \frac{1}{|Q'|} \int_{Q'} |f(x)|\, dx \le \lambda \quad \text{or} \quad \frac{1}{|Q'|} \int_{Q'} |f(x)|\, dx > \lambda. \tag{2.25}
\]
In the latter case we stop and include $Q'$ in a family $\mathcal{B}$. Observe that in this case
\[
\frac{1}{|Q'|} \int_{Q'} |f(x)|\, dx \le \frac{2^d}{|Q|} \int_Q |f(x)|\, dx \le 2^d \lambda,
\]
where $Q$ denotes the parent of $Q'$. Thus (2.22) holds.



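If, however, the first inequality in (2.25) holds then subdivide $Q'$ again into its children, with half the size of $Q'$. Continuing in this fashion produces a collection of disjoint (dyadic) cubes $\mathcal{B}$ satisfying (2.22). Consequently (2.23) also holds, since
\[
\Bigl|\bigcup_{\mathcal{B}} Q\Bigr| \le \sum_{\mathcal{B}} |Q| < \sum_{\mathcal{B}} \frac{1}{\lambda} \int_Q |f(x)|\, dx \le \frac{1}{\lambda} \int_{\mathbb{R}^d} |f(x)|\, dx.
\]
Now let $x_0 \in \mathbb{R}^d \setminus \bigcup_{\mathcal{B}} Q$. Then $x_0$ is contained in a decreasing sequence $\{Q_j\}$ of dyadic cubes, each of which satisfies
\[
\frac{1}{|Q_j|} \int_{Q_j} |f(x)|\, dx \le \lambda.
\]
By Lebesgue's theorem, $|f(x_0)| \le \lambda$ for a.e. such $x_0$. Since, moreover, $\mathbb{R}^d \setminus \bigcup_{\mathcal{B}} Q$ and $\mathbb{R}^d \setminus \bigcup_{\mathcal{B}} \bar Q$ differ only by a set of measure zero, we can set
\[
g := f - \sum_{Q \in \mathcal{B}} \chi_Q f
\]
so that $|g| \le \lambda$ a.e. as desired.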
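The stopping-time construction in the proof can be sketched in a discrete one-dimensional model. In the Python illustration below, a list `f` of length $2^J$ stands for a step function on $[0, 2^J)$ with unit cells, and dyadic intervals are aligned blocks of length $2^j$. This model, the function name `cz_decompose`, and the sample data are assumptions made for illustration only; in particular, stopping at single cells is the discrete substitute for the Lebesgue differentiation step, and the top interval plays the role of a cube in $\mathcal{D}_{\ell_0}$.

```python
def cz_decompose(f, lam):
    """Discrete 1D Calderon-Zygmund decomposition of f at height lam.

    Returns (g, b, bad), where f = g + b, |g| <= lam everywhere, and `bad`
    lists the selected dyadic intervals as (start, length) pairs.
    """
    n = len(f)
    if n == 0 or n & (n - 1):
        raise ValueError("len(f) must be a power of two")
    if sum(abs(v) for v in f) > lam * n:
        raise ValueError("the average of |f| over the top interval must be <= lam")
    bad = []

    def split(start, length):
        # Invariant: the average of |f| over [start, start + length) is <= lam.
        if length == 1:
            return
        half = length // 2
        for s in (start, start + half):
            mass = sum(abs(v) for v in f[s:s + half])
            if mass > lam * half:
                bad.append((s, half))   # average exceeds lam: stop, "bad" interval
            else:
                split(s, half)          # average still <= lam: keep subdividing

    split(0, n)
    b = [0.0] * n
    for s, length in bad:
        b[s:s + length] = f[s:s + length]
    g = [fi - bi for fi, bi in zip(f, b)]
    return g, b, bad


if __name__ == "__main__":
    sample = [0.5, 0, 0, 0, 9, 1, 0, 0, 0.5, 0, 0, 0, 0, 0, 0.5, 2]
    g, b, bad = cz_decompose(sample, 1.0)
    print("bad intervals:", bad)
    print("good part:", g)
```

Each selected interval has a parent whose average is at most $\lambda$, which is why its own average lies in $(\lambda, 2\lambda]$, the case $d = 1$ of (2.22); summing the lower bound over the disjoint intervals gives the discrete analogue of (2.23).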

We refer to $\lambda$ in the above decomposition as the height. In view of (2.22) there exists a strong connection between the Hardy–Littlewood maximal function and this decomposition; in fact,
\[
\{Mf > c(d)\lambda\} \supset \bigcup_{Q \in \mathcal{B}} Q.
\]
In the following exercise, we ask the reader to establish a reverse inclusion.

Exercise 2.10 Given $f \in L^1(\mathbb{R}^d)$, perform a Calderón–Zygmund decomposition at height $\lambda$ that results in the collection $\mathcal{B}$ of cubes. Show that there exists a constant $C(d)$ such that
\[
\{Mf > C(d)\lambda\} \subset \bigcup_{Q \in \mathcal{B}} Q^*, \tag{2.26}
\]
where $Q^*$ is the double of $Q$.

In the proof of Theorem 2.16, the following corollary to (2.26) will play a key role.

Proposition 2.18 For any measurable $w \ge 0$ and $1 \le p < \infty$ there exist constants $C_p$, depending also on the dimension, such that
\[
\begin{aligned}
\int_{\{Mf > \lambda\}} w(x)\, dx &\le C_1 \lambda^{-1} \int_{\mathbb{R}^d} |f(x)|\, (Mw)(x)\, dx, \\
\int_{\mathbb{R}^d} (Mf)^p(x)\, w(x)\, dx &\le C_p \int_{\mathbb{R}^d} |f(x)|^p\, (Mw)(x)\, dx;
\end{aligned} \tag{2.27}
\]
the latter estimate requires that $1 < p < \infty$.



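Proof We shall prove only the weak-$L^1$ part of (2.27), as the other part follows by Marcinkiewicz interpolation with an easy $L^\infty$ estimate. We write
\[
\begin{aligned}
\int_{\{Mf > C(d)\lambda\}} w(x)\, dx &\le 2^d \sum_{Q \in \mathcal{B}} |Q| \fint_{Q^*} w(x)\, dx \\
&\le 2^d \lambda^{-1} \sum_{Q \in \mathcal{B}} \int_Q |f(y)|\, dy \fint_{Q^*} w(x)\, dx \\
&\le 2^d \lambda^{-1} \int_{\mathbb{R}^d} |f(y)|\, (Mw)(y)\, dy,
\end{aligned}
\]
as claimed. The first inequality follows from (2.26), the second from (2.22), and the third from the definition of the maximal function and the disjointness of the cubes in $\mathcal{B}$.

Exercise 2.11 Given $f \in L^1(\mathbb{R}^d)$ and $\lambda > 0$ show that there exists $E \subset \mathbb{R}^d$ with $|E| \le \lambda^{-1}$ and such that
\[
\int_{\mathbb{R}^d \setminus E} |f(x)|^2\, dx \le \lambda \|f\|_1^2.
\]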

State the analogous result for the torus.

2.5.4. The weak bound for $A_p$ weights

We can now prove Theorem 2.16.

Proof of Theorem 2.16 If $p = 1$ then combining the first estimate in (2.27) with the condition (2.19) concludes the proof. If $1 < p < \infty$ then we know from Exercise 2.8 that property (2.17), and therefore also (2.18), holds. Performing a Calderón–Zygmund decomposition of $f$ at height $c(d)\lambda$, where $c(d)$ is a small constant, one argues as in Exercise 2.10 that
\[
\{Mf > \lambda\} \subset \bigcup_{Q \in \mathcal{B}} Q^*.
\]
Therefore
\[
\begin{aligned}
w(\{Mf > \lambda\}) &\le \sum_{Q \in \mathcal{B}} w(Q^*) \le C \sum_{Q \in \mathcal{B}} w(Q) \\
&\le C \sum_{Q \in \mathcal{B}} \Bigl(\frac{|Q|}{f(Q)}\Bigr)^p \int_Q |f|^p w \\
&\le C \lambda^{-p} \int_{\mathbb{R}^d} |f|^p w.
\end{aligned}
\]





Here, we used (2.18) to obtain the second inequality, then (2.17), (2.22), and the disjointness of the cubes in $\mathcal{B}$.

In fact, one has a strong $L^p$ bound for $p > 1$. The standard proof of the above theorem is typically based on the reverse Hölder inequality. This remarkable property essentially means that any $w \in A_p$ with $p > 1$ also belongs to some $A_q$ where $1 < q < p$. By interpolation, Theorem 2.16 then implies the above strong bound on $L^p(\mathbb{R}^d)$. We shall present the reverse Hölder inequality in Chapter 7 in the context of singular integral bounds relative to $A_p$ weights. For the maximal function, however, we shall now give an argument based on a Calderón–Zygmund decomposition and essentially no more than the definition of $A_p$ weights. Before presenting the details, we introduce another maximal function, namely the dyadic maximal function:
\[
(M_{\mathrm{dyad}} f)(x) := \sup_{Q \ni x} \fint_Q |f(y)|\, dy,
\]
where the supremum ranges over all dyadic cubes defined in (2.24). By inspection of the proof of the Calderón–Zygmund decomposition at height $\lambda$, we see that
\[
\{M_{\mathrm{dyad}} f > \lambda\} = \bigcup_{Q \in \mathcal{B}} Q, \tag{2.28}
\]
where $\mathcal{B}$ is the family of “bad” cubes. Note that the dyadic maximal function is not comparable in the pointwise sense with the usual maximal function, since the dyadic maximal function of a function that vanishes on $x_1 > 0$ also vanishes on that half-space. In other words, while $Mf$ dominates $M_{\mathrm{dyad}} f$ the converse does not hold. However, one has the following property of the level sets, which is sufficient for many purposes.

Exercise 2.12 Show that there exists a constant $C(d)$ such that
\[
|\{Mf > C(d)\lambda\}| \le C(d)\, |\{M_{\mathrm{dyad}} f > \lambda\}|
\]
for all $\lambda > 0$. In fact, for any doubling measure $\mu$ (such as $w\,dx$ where $w \in A_p$), one has
\[
\mu(\{Mf > C(d)\lambda\}) \le C(d)\, \mu(\{M_{\mathrm{dyad}} f > \lambda\})
\]
for all $\lambda > 0$.

2.5.5. The strong bound for $A_p$ weights

Now we can formulate and prove the strong $L^p$ bound for $A_p$ weights.



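Theorem 2.19 Any weight $w \in A_p$ with $1 < p < \infty$ satisfies the strong $L^p$-boundedness property (2.15).

Proof By Exercise 2.12 it suffices to prove
\[
\int_{\mathbb{R}^d} (M_{\mathrm{dyad}} f)^p(x)\, w(x)\, dx \le C \int_{\mathbb{R}^d} |f(x)|^p\, w(x)\, dx, \tag{2.29}
\]
where the constant $C = C(p)$ is bounded for $p_0 \le p \le p_1$ with $1 < p_0 < p_1 < \infty$ arbitrary but fixed. To this end we perform a Calderón–Zygmund decomposition of $f$ at height $C_0^k$ for each integer $k \in \mathbb{Z}$, where $C_0 := C_0(d)$ is a suitable constant. Denote the totality of the “bad” cubes generated in this process for all $k$ by $\mathcal{B}$. With each $Q \in \mathcal{B}$ associate $E(Q)$, defined by
\[
E(Q) := Q - \bigcup_{Q' \in \mathcal{B},\ Q' \subsetneq Q} Q'.
\]
Then we have
\[
\int_{\mathbb{R}^d} (M_{\mathrm{dyad}} f)^p(x)\, w(x)\, dx \le C \sum_{Q \in \mathcal{B}} \Bigl(\fint_Q |f|\Bigr)^p w(E(Q)). \tag{2.30}
\]
By construction, if $C_0$ is large enough then $|E(Q)| > \tfrac12 |Q|$ for each $Q \in \mathcal{B}$. Hence, if we set $\sigma := w^{1-p'}$ then, by Exercises 2.9 and 2.8 as well as (2.18), we may conclude that $\sigma(E(Q)) > c_0\, \sigma(Q)$ for all cubes, with some constant $c_0$. Hence we can bound the right-hand side of (2.30) by
\[
\begin{aligned}
C \sum_{Q \in \mathcal{B}} \Bigl(\fint_Q |f|\Bigr)^p w(E(Q))
&= C \sum_{Q \in \mathcal{B}} \Bigl(\frac{1}{\sigma(Q)} \int_Q |f| \sigma^{-1}\, \sigma\Bigr)^p \sigma(E(Q)) \Bigl[ \frac{w(E(Q))}{\sigma(E(Q))} \Bigl(\frac{\sigma(Q)}{|Q|}\Bigr)^p \Bigr] \\
&\le C \sum_{Q \in \mathcal{B}} \Bigl(\frac{1}{\sigma(Q)} \int_Q |f| \sigma^{-1}\, \sigma\Bigr)^p \sigma(E(Q)) \Bigl[ \frac{w(Q)}{|Q|} \Bigl(\frac{\sigma(Q)}{|Q|}\Bigr)^{p-1} \Bigr] \\
&\le C \int_{\mathbb{R}^d} \bigl(M_\sigma(f\sigma^{-1})\bigr)^p \sigma.
\end{aligned}
\]
To pass to the third line we removed the expression in brackets since, by the definition of the $A_p$-class, it is bounded uniformly in $Q$. However, we can now invoke Proposition 2.14 to bound the last line by
\[
C \int_{\mathbb{R}^d} |f|^p \sigma^{1-p} = C \int_{\mathbb{R}^d} |f|^p w,
\]
as desired.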




Notes Standard references on harmonic functions on the disk and on the Poisson kernel are the classic book by Hoffman [55], as well as those by Garnett [46] and Koosis [71]. For a more systematic development of Ap weights, see for example Stein’s book [111]; we shall return to them later in Chapter 7 in the context of singular integrals. The proof of Theorem 2.19 is due to Christ and Fefferman [22]. Harmonic functions play a very important role in higher dimensions also, and notions such as the mean value property, the maximum principle, and the Poisson kernel carry over to Rd for d > 2. The book by Han and Lin [52] begins with a discussion of harmonic functions in general dimensions and then continues with a rapid development of scalar second-order elliptic PDEs.

Problems

Problem 2.1 Let $(X, \mu)$ be a general measure space. We say a bounded sequence $\{f_n\}_{n=1}^\infty$ in $L^1(\mu)$ is uniformly integrable if for every $\varepsilon > 0$ there exists $\delta > 0$ such that for any measurable $E$ one has
\[
\mu(E) < \delta \implies \sup_n \int_E |f_n|\, d\mu < \varepsilon
\]
and there exists $E_0 \subset X$ with $\sup_{n \ge 1} \int_{X \setminus E_0} |f_n|\, d\mu < \varepsilon$. The same applies to any subset of $L^1$, not just to sequences. For simplicity, suppose now that $\mu$ is a finite measure.
(a) Let $\phi : [0,\infty) \to [0,\infty)$ be a continuous increasing function with $\lim_{t \to \infty} \phi(t)/t = +\infty$. Prove that
\[
\sup_n \int \phi(|f_n(x)|)\, \mu(dx) < \infty
\]
implies that $\{f_n\}$ is uniformly integrable. Conversely, show that this inequality is also necessary for uniform integrability and in particular that bounded sequences in $L^p(\mu)$ with $p > 1$ are uniformly integrable.
(b) Show that $\{f_n\}_{n=1}^\infty$ is uniformly integrable if and only if
\[
\lim_{A \to \infty} \sup_{n \ge 1} \int_{[|f_n| > A]} |f_n(x)|\, \mu(dx) = 0.
\]
(c) Show that for an arbitrary sequence $\{f_n\}_{n=1}^\infty$ in $L^1(\mu)$ the following are equivalent:
(i) $f_n \to f$ in $L^1(\mu)$ as $n \to \infty$;
(ii) $f_n \to f$ in measure, with $\{f_n\}_{n \ge 1}$ uniformly integrable.

Problem 2.2 Suppose that $\{f_n\}_{n=1}^\infty$ is a sequence in $L^1(\mathbb{T})$. Show that there is a subsequence $\{f_{n_j}\}_{j=1}^\infty$ and a measure $\mu$ with $f_{n_j} \xrightarrow{\sigma^*} \mu$ provided that $\sup_n \|f_n\|_1 < \infty$. Here $\sigma^*$ is the weak-* convergence of measures. Show that in general $\mu \notin L^1(\mathbb{T})$. However, if we assume in addition that $\{f_n\}_{1}^\infty$ is uniformly integrable then $\mu(d\theta) = f(\theta)\, d\theta$ for some $f \in L^1(\mathbb{T})$. Can we conclude anything about strong convergence (i.e., in the $L^1$ norm) of $\{f_n\}$? Consider the analogous question on $L^p(\mathbb{T})$, $p > 1$.



Problem 2.3 Let $\mu$ be a positive finite Borel measure on $\mathbb{R}^d$ or $\mathbb{T}^d$. Set
\[
M\mu(x) := \sup_{r > 0} \frac{\mu(B(x,r))}{m(B(x,r))},
\]
where $m$ is the Lebesgue measure. This problem examines the behavior of $M\mu$ when $\mu$ is a singular measure; see Problem 3.5 for more on this case.
(a) Show that $\mu \perp m$ implies $\mu(\{x \mid M\mu(x) < \infty\}) = 0$.
(b) Show that if $\mu \perp m$ then
\[
\limsup_{r \to 0} \frac{\mu(B(x,r))}{m(B(x,r))} = \infty
\]
for $\mu$-a.e. $x$. Also show that this limit vanishes $m$-almost-everywhere.

Problem 2.4 For any $f \in L^1(\mathbb{R}^d)$ and $1 \le k \le d$, let
\[
M_k f(x) = \sup_{r > 0} r^{-k} \int_{B(x,r)} |f(y)|\, dy.
\]
Show that, for every $\lambda > 0$,
\[
m_L(\{x \in L \mid M_k f(x) > \lambda\}) \le \frac{C}{\lambda} \|f\|_{L^1(\mathbb{R}^d)},
\]
where $L$ is an arbitrary affine $k$-dimensional subspace and $m_L$ stands for Lebesgue measure (i.e., $k$-dimensional measure) on this subspace; $C$ is an absolute constant.

Problem 2.5 Prove the Besicovitch covering lemma on $\mathbb{T}$. Suppose that $\{I_j\}$ are finitely many arcs with $|I_j| < 1$. Then there is a sub-collection $\{I_{j_k}\}$ such that the following properties hold:
(a) $\bigcup_k I_{j_k} = \bigcup_j I_j$;
(b) no point belongs to more than $C$ of the $I_{j_k}$, where $C$ is an absolute constant.
What is the optimal value of $C$? What can you say about the situation for higher dimensions (see for example Füredi and Loeb [44] and the references cited therein)?

Problem 2.6 Let $F \ge 0$ be a harmonic function on $\mathbb{D}$. Show that there exists a positive measure $\mu$ on $\mathbb{T}$ with $F(re(\theta)) = (P_r * \mu)(\theta)$ for $0 \le r < 1$ and with $\|\mu\| = F(0)$.

Problem 2.7 A sequence of complex numbers $\{a_n\}_{n \in \mathbb{Z}}$ is called positive definite if
(a) $a_n = \overline{a_{-n}}$ for all $n \in \mathbb{Z}$,
(b) $\sum_{n,k} a_{n-k} z_n \bar z_k \ge 0$ for all complex sequences $\{z_n\}_{n \in \mathbb{Z}}$.
Show that any positive definite sequence satisfies $|a_n| \le a_0$ for all $n \in \mathbb{Z}$. Show that if $\mu$ is a positive measure then $a_n = \int_{\mathbb{T}} e(-n\theta)\, \mu(d\theta)$ is such a sequence. Now prove that every positive definite sequence is of this form. Hint: Apply the previous problem to the harmonic function $\sum_n a_n r^{|n|} e(n\theta)$. Check the positivity of this sum by representing it as in property (b) above.

Problem 2.8 Verify the mean-value property and the maximum principle for harmonic functions on domains in $\mathbb{R}^d$ for any $d \ge 2$.



Problem 2.9 Let $w \in A_p$ for any $1 \le p < \infty$. Show that for $f \in L^1_{\mathrm{loc}}$ with $f \ge 0$ one has
\[
\fint_Q f \le C \Bigl( \frac{1}{w(Q)} \int_Q |f(x)|^p\, w(x)\, dx \Bigr)^{1/p}
\]
for any cube $Q$.

Problem 2.10 Show that $w(x) = |x|^a$ belongs to $A_p(\mathbb{R}^d)$ with $p > 1$ if and only if $-d < a < d(p-1)$. Also show that $w(x)\, dx$ is a doubling measure if and only if $a > -d$.

Problem 2.11 Verify Green's identity for a bounded domain $\Omega \subset \mathbb{R}^d$ with $C^\infty$ boundary and functions $u, v \in C^\infty(\bar\Omega)$:
\[
\int_\Omega (u \Delta v - v \Delta u)\, dx = \int_{\partial\Omega} \Bigl( u \frac{\partial v}{\partial n} - v \frac{\partial u}{\partial n} \Bigr)\, d\sigma, \tag{2.31}
\]
where $\sigma$ is the surface measure on $\partial\Omega$, and $\partial/\partial n$ refers to the normal derivative with respect to the outward-pointing normal vector. Use this identity to verify that a solution to $\Delta u = f$ with $f \in \mathcal{S}(\mathbb{R}^d)$ is given by
\[
u(z) = \frac{1}{2\pi} \int_{\mathbb{R}^2} \log|z - \zeta|\, f(\zeta)\, m(d\zeta)
\]
in $\mathbb{R}^2$, with $m$ the Lebesgue measure in the plane, and
\[
u(x) = c(d) \int_{\mathbb{R}^d} |x - y|^{2-d} f(y)\, dy
\]
in $\mathbb{R}^d$ for $d \ge 3$, with a suitable constant $c(d)$. Discuss the uniqueness of these solutions. Hint: Apply (2.31) to a domain $\Omega$ equal to a big ball minus a small ball around the point at which $u$ is being evaluated. Then pass to suitable limits.

Problem 2.12 Let $f \in L^1(\mathbb{T})$ satisfy $\hat f(n) = 0$ if $|n| > N$, where $N$ is some positive integer. Show that there exists $E \subset \mathbb{T}$ with $|E| < \lambda^{-1}$ and with
\[
\int_{\mathbb{T} \setminus E} \int_{\mathbb{T}} k_N(\theta)\, |f(\varphi - \theta)|^2\, d\theta\, d\varphi \le C \lambda \|f\|_1^2,
\]
where $k_N(\theta) := N \chi_{[2|\theta| N \le 1]}$ is the box kernel and $C$ is some absolute constant. Compare this with Exercise 2.11.

3 Conjugate harmonic functions; Hilbert transform

3.1. Hardy spaces of analytic functions

In the previous chapter, we dealt with harmonic functions on the disk satisfying various boundedness properties. We now turn to functions $F = u + iv \in h^1(\mathbb{D})$ that are analytic in $\mathbb{D}$. The usage of $h^1$ here is legitimate, since analytic functions form a subset of the class of functions that are complex-valued and harmonic. Thus analytic functions in $h^1(\mathbb{D})$ form the class $H^1(\mathbb{D})$, the “big” Hardy space. We showed in the previous chapter that, for functions in this class, $F_r = P_r * \mu$ for some $\mu \in M(\mathbb{T})$. It is important to note that, by analyticity, $\hat\mu(n) = 0$ if $n < 0$. A theorem by F. and M. Riesz asserts that such measures $\mu$ are absolutely continuous. From the example
\[
F(z) := \frac{1+z}{1-z} = P_r(\theta) + iQ_r(\theta), \qquad z = re(\theta),
\]
one sees an important difference between the analytic and the harmonic cases. Indeed, while $P_r \in h^1(\mathbb{D})$, clearly $F \notin h^1(\mathbb{D})$. The boundary measure associated with $P_r$ is $\delta_0$, whereas $F$ is not associated with any boundary measure in the sense of the previous chapter.

3.1.1. Subharmonic functions

An important technical device that will allow us to obtain more information in the analytic case is provided by subharmonic functions. Loosely speaking, they allow us to exploit algebraic properties of analytic functions that harmonic functions do not have (such as the fact that $F^2$ is analytic if $F$ is analytic, a property that fails for harmonic functions).

Definition 3.1 Let $\Omega \subset \mathbb{R}^2$ be an open and connected region and let $f : \Omega \to \mathbb{R} \cup \{-\infty\}$, where we extend the topology to $\mathbb{R} \cup \{-\infty\}$ in an obvious way. We say that $f$ is subharmonic if:



(i) $f$ is continuous;
(ii) for all $z \in \Omega$ there exists $r_z > 0$ such that
\[ f(z) \le \int_0^1 f(z + re(\theta)) \, d\theta \]

for all $0 < r < r_z$; we refer to this as the (local) sub-mean-value property.

Usually one requires only that $f$ be upper semicontinuous, but the stronger condition (i) is sufficient for our purposes. It is helpful to keep in mind that in one dimension harmonic implies linear and subharmonic implies convex. Subharmonic functions derive their name from the fact that they lie below harmonic functions, in the same way that convex functions lie below linear functions. We will make this precise by means of the important maximum principle, which subharmonic functions obey. We begin by deriving some basic properties of this class.

Lemma 3.2 Subharmonic functions satisfy the following properties.
(i) If $f$ and $g$ are subharmonic then $f \vee g := \max(f, g)$ is subharmonic.
(ii) If $f \in C^2(\Omega)$ then $f$ is subharmonic if and only if $\Delta f \ge 0$ in $\Omega$.
(iii) If $F$ is analytic then $\log |F|$ and $|F|^\alpha$ with $\alpha > 0$ are subharmonic.
(iv) If $f$ is subharmonic and $\varphi$ is increasing and convex then $\varphi \circ f$ is subharmonic (we set $\varphi(-\infty) := \lim_{x \to -\infty} \varphi(x)$).

Proof (i) is immediate. For (ii) use Jensen’s formula,
\[ \int_{\mathbb{T}} f(z + re(\theta)) \, d\theta - f(z) = \frac{1}{2\pi} \iint_{D(z,r)} \log \frac{r}{|w - z|} \, \Delta f(w) \, m(dw), \qquad (3.1) \]


where $m$ stands for two-dimensional Lebesgue measure and $D(z, r) = \{ w \in \mathbb{C} \mid |w - z| < r \}$. The reader is asked to verify this formula in the following exercise. If $\Delta f \ge 0$, then the sub-mean-value property holds. If $\Delta f(z_0) < 0$ then let $r_0 > 0$ be sufficiently small that $\Delta f < 0$ on $D(z_0, r_0)$. Since $\log (r_0 / |w - z_0|) > 0$ on this disk, Jensen’s formula implies that the sub-mean-value property fails. Next we verify (iv) by means of Jensen’s inequality for convex functions:
\[ \varphi(f(z)) \le \varphi\Bigl( \int_{\mathbb{T}} f(z + re(\theta)) \, d\theta \Bigr) \le \int_{\mathbb{T}} \varphi(f(z + re(\theta))) \, d\theta. \]


The first inequality uses the fact that ϕ is increasing, whereas the second uses the convexity of ϕ. If F is analytic then log |F | is continuous, with values in



$\mathbb{R} \cup \{-\infty\}$. If $F(z_0) \ne 0$ then $\log |F(z)|$ is harmonic on some disk $D(z_0, r_0)$. Thus, one has the stronger, mean-value, property on this disk. If $F(z_0) = 0$ then $\log |F(z_0)| = -\infty$, and there is nothing to prove. To see that $|F|^\alpha$ is subharmonic, apply (iv) to $\log |F(z)|$ with $\varphi(x) = \exp(\alpha x)$. □

Exercise 3.1 Prove Jensen’s formula (3.1) for $C^2$ functions.

Now we can derive the aforementioned domination of subharmonic functions by harmonic functions.

Lemma 3.3 Let $\Omega$ be a bounded region. Suppose that $f$ is subharmonic on $\Omega$, $f \in C(\bar\Omega)$, and let $u$ be harmonic on $\Omega$, $u \in C(\bar\Omega)$. If $f \le u$ on $\partial\Omega$ then $f \le u$ on $\Omega$.

Proof We may take $u = 0$, so that $f \le 0$ on $\partial\Omega$. Let $M = \max_{\bar\Omega} f$ and assume that $M > 0$. Set $S = \{ z \in \bar\Omega \mid f(z) = M \}$. Then $S$ is nonempty, $S \subset \Omega$ (since $f \le 0 < M$ on $\partial\Omega$), and $S$ is closed in $\Omega$. If $z \in S$ then, by the sub-mean-value property, there exists $r_z > 0$ such that $D(z, r_z) \subset S$. Hence $S$ is also open. Since $\Omega$ is assumed to be connected, one obtains $S = \Omega$. This is a contradiction. □

3.1.2. Sub-mean-value property

The following lemma shows that the sub-mean-value property holds on any disk in $\Omega$. The point here is that we are upgrading the local sub-mean-value property to a true sub-mean-value property, using the largest possible disks.

Lemma 3.4 Let $f$ be subharmonic in $\Omega$, $z_0 \in \Omega$, $\overline{D(z_0, r)} \subset \Omega$. Then
\[ f(z_0) \le \int_{\mathbb{T}} f(z_0 + re(\theta)) \, d\theta. \]

Proof Let $g_n = \max(f, -n)$, where $n \ge 1$. Without loss of generality, set $z_0 = 0$. Define $u_n(z)$ to be the harmonic extension of $g_n$ restricted to $\partial D(z_0, r)$, where $r > 0$ is as in the statement of the lemma. By the previous lemma
\[ f(0) \le g_n(0) \le u_n(0) = \int_{\mathbb{T}} u_n(re(\theta)) \, d\theta, \]

the last equality expressing the mean-value property of harmonic functions. Since
\[ \max_{|z| \le r} u_n(z) \le \max_{|z| \le r} f(z), \]
the monotone convergence theorem for decreasing sequences yields
\[ f(0) \le \int_{\mathbb{T}} f(re(\theta)) \, d\theta \]
as claimed. □
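The sub-mean-value property of Lemma 3.4, applied to the functions $\log|F|$ and $|F|^\alpha$ from Lemma 3.2(iii), can be observed numerically. The Python sketch below is only an illustration under stated assumptions: it takes $e(\theta) = e^{2\pi i\theta}$, uses the hypothetical test function $F(z) = z^2 - 1/4$ (chosen here, not in the text), and approximates circle averages by a midpoint rule. The quantity `gap` is the circle average minus the value at the center and should be nonnegative up to rounding.

```python
import cmath
import math

def e(theta):
    """Assumed normalization: e(theta) = exp(2*pi*i*theta)."""
    return cmath.exp(2j * math.pi * theta)

def circle_mean(f, z0, r, n=4000):
    """Midpoint-rule approximation of the integral over T of f(z0 + r e(theta))."""
    return sum(f(z0 + r * e((k + 0.5) / n)) for k in range(n)) / n

def F(z):
    # Hypothetical analytic test function with zeros at +0.5 and -0.5.
    return z * z - 0.25

def log_abs_F(z):
    """log|F|, with the value -infinity at the zeros of F."""
    a = abs(F(z))
    return math.log(a) if a > 0 else -math.inf

def abs_F_power(alpha):
    """Return the function z -> |F(z)|^alpha."""
    return lambda z: abs(F(z)) ** alpha

def gap(f, z0, r):
    """Circle average minus the value at the center."""
    return circle_mean(f, z0, r) - f(z0)

if __name__ == "__main__":
    # Centers and radii are arbitrary; the second center is a zero of F.
    for z0, r in [(0.1 + 0.2j, 0.3), (0.5 + 0.0j, 0.4), (-0.2 - 0.1j, 0.6)]:
        print(z0, r, gap(log_abs_F, z0, r), gap(abs_F_power(0.5), z0, r))
```

For the first disk, which contains no zero of $F$, the gap for $\log|F|$ vanishes up to rounding, reflecting the mean-value property of harmonic functions; for the other two disks it is strictly positive or infinite.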

Corollary 3.5 If $g$ is subharmonic on $\mathbb{D}$ then, for all $\theta$, $g(rse(\theta)) \le (P_r * g_s)(\theta)$ for any $0 < r, s < 1$.

Proof If $g > -\infty$ everywhere on $\mathbb{D}$ then this follows from Lemma 3.3. If not then set $g_n = g \vee (-n) := \max(g, -n)$. Thus
\[ g(rse(\theta)) \le g_n(rse(\theta)) \le (P_r * (g_n)_s)(\theta) \]
and consequently
\[ g(rse(\theta)) \le \limsup_{n \to \infty} (P_r * (g_n)_s)(\theta) \le (P_r * g_s)(\theta), \]

where the final inequality follows from Fatou’s lemma (which can be applied in the “reverse form” here since the $g_n$ have a uniform upper bound). □

Note that if $g_s \notin L^1(\mathbb{T})$ then $g \equiv -\infty$ on $D(0, s)$ and so $g \equiv -\infty$ on $D(0, 1)$.

3.1.3. Maximal function $F^*$

We now introduce the radial maximal function associated with any function on the disk. It, and the more general nontangential maximal function, where the supremum is taken over a cone in $\mathbb{D}$, are of central importance in the analysis discussed in this chapter.

Definition 3.6 Let $F$ be any complex-valued function on $\mathbb{D}$; then $F^* : \mathbb{T} \to \mathbb{R}$ is defined as follows:
\[ F^*(\theta) = \sup_{0 < r < 1} |F(re(\theta))|. \]


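[...] (1) $\omega_\lambda(x, y) \ge \tfrac12$ if $|x| > \lambda$; (2) $\omega_\lambda(0, y) \le \dfrac{2y}{\pi\lambda}$.

For the first property it suffices to note that
\[ \int_0^\infty \frac{1}{\pi} \frac{y}{x^2 + y^2} \, dx = \frac12 \qquad \forall y > 0. \]
For the second property, compute
\[ \omega_\lambda(0, y) = \frac{1}{\pi} \int_{(-\infty,-\lambda)\cup(\lambda,\infty)} \frac{y}{t^2 + y^2} \, dt \le \frac{2}{\pi} \int_{\lambda/y}^\infty \frac{dt}{1 + t^2} \le \frac{2y}{\pi\lambda}, \]
as claimed. Observe now that $\omega_\lambda \circ F$ is harmonic and that $\theta \in E_\lambda$ implies
\[ (\omega_\lambda \circ F)(re(\theta)) \ge \tfrac12 \]
for some $0 < r < 1$. Thus
\[ |E_\lambda| \le \bigl| \{ \theta \in \mathbb{T} \mid (\omega_\lambda \circ F)^*(\theta) \ge \tfrac12 \} \bigr| \le 3 \times 2 \, |||\omega_\lambda \circ F|||_1, \qquad (3.5) \]
by Proposition 3.7. Since $\omega_\lambda \circ F \ge 0$, the mean-value property implies that
\[ |||\omega_\lambda \circ F|||_1 = (\omega_\lambda \circ F)(0) = \omega_\lambda(iu(0)) \le \frac{2}{\pi} \frac{u(0)}{\lambda} = \frac{2}{\pi} \frac{|||u|||_1}{\lambda}. \]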



Figure 3.1. The angle $\alpha$ subtended by the line segment $[-\lambda, \lambda]$ at the point $(x, y)$ is given by the harmonic measure $\omega_\lambda(x, y)$ in the proof of Theorem 3.15: $\omega_\lambda(x, y) = 1 - \alpha/\pi$.

Combining this with (3.5) yields
\[ |E_\lambda| \le \frac{12}{\pi} \frac{|||u|||_1}{\lambda}, \]
and we are done. □
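The two properties of $\omega_\lambda$ used in this proof can be checked numerically. The Python sketch below rests on assumptions that should be read as such: it takes $\omega_\lambda(x, y)$ to be the Poisson integral over the upper half-plane of the indicator of $(-\infty,-\lambda) \cup (\lambda,\infty)$, which is consistent with the computation of $\omega_\lambda(0, y)$ above, and it uses the closed form $1 - \alpha/\pi$ with $\alpha = \arctan((\lambda - x)/y) + \arctan((\lambda + x)/y)$. The function names are illustrative only.

```python
import math

def subtended_angle(lam, x, y):
    """Angle at (x, y), y > 0, subtended by the segment [-lam, lam] of the real axis."""
    return math.atan((lam - x) / y) + math.atan((lam + x) / y)

def omega(lam, x, y):
    """Assumed closed form of the harmonic measure: 1 - alpha / pi."""
    return 1.0 - subtended_angle(lam, x, y) / math.pi

def omega_by_quadrature(lam, x, y, n=20000):
    """1 minus the Poisson integral over [-lam, lam], by the midpoint rule.

    Uses the fact that the Poisson kernel of the upper half-plane has total mass 1.
    """
    h = 2.0 * lam / n
    inner = 0.0
    for k in range(n):
        t = -lam + (k + 0.5) * h
        inner += y / ((x - t) ** 2 + y * y)
    return 1.0 - inner * h / math.pi

if __name__ == "__main__":
    lam = 1.0
    # The closed form and the quadrature should agree.
    print(omega(lam, 0.3, 0.5), omega_by_quadrature(lam, 0.3, 0.5))
    # Property (2): omega(0, y) is at most 2 y / (pi lam).
    for y in (0.01, 0.1, 1.0, 10.0):
        print(y, omega(lam, 0.0, y), 2 * y / (math.pi * lam))
    # On the semicircle of radius lam the value is 1/2.
    print(omega(lam, math.cos(0.7), math.sin(0.7)))
```

The last line illustrates why the threshold $\tfrac12$ appears: the segment $[-\lambda, \lambda]$ subtends a right angle at every point of the semicircle of radius $\lambda$.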

Exercise 3.3 Show that $\omega_\lambda(x, y) = 1 - \alpha/\pi$, where $\alpha$ is the angle subtended by the line segment $[-\lambda, \lambda]$ at the point $z = x + iy$. Figure 3.1 shows two possible positions of $z$. Use this to show that $\omega_\lambda(x, y) \ge \tfrac12$ provided that $(x, y)$ lies outside the semicircle with radius $\lambda$ and center 0. Furthermore, show that $\omega_\lambda$ is the unique harmonic function in the upper half-plane that equals 1 on $\Gamma := (-\infty, -\lambda] \cup [\lambda, \infty)$ and 0 on $(-\lambda, \lambda) = \mathbb{R} \setminus \bar\Gamma$ and which is globally bounded. It is called the harmonic measure of $(-\infty, -\lambda] \cup [\lambda, \infty)$ and can be defined in the same fashion on general domains $\Omega$ and any open $\Gamma \subset \partial\Omega$ (in fact, more general $\Gamma$ than that). It turns out that the harmonic measure of $\Gamma$ relative to $\Omega$ equals the probability that Brownian motion starting at $z \in \Omega$ hits the boundary for the first time at $\Gamma$ rather than at $\partial\Omega \setminus \Gamma$. Of particular importance is the conformal invariance of this notion, but this probabilistic connection lies beyond our scope in this book.

3.5. The Hilbert transform

In the following result we introduce the Hilbert transform and establish a weak-$L^1$ bound for it. Formally speaking, the Hilbert transform $H\mu$ of a measure $\mu \in M(\mathbb{T})$ is defined by
\[ \mu \mapsto u_\mu \mapsto \tilde u_\mu \mapsto \lim_{r \to 1} (\tilde u_\mu)_r =: H\mu, \]



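i.e., the Hilbert transform of a function on $\mathbb{T}$ equals the boundary values of the conjugate function of its harmonic extension. By Corollary 3.14 this is well defined if $\mu(d\theta) = f(\theta)\, d\theta$, $f \in L^2(\mathbb{T})$. We now consider the case $f \in L^1(\mathbb{T})$.

3.5.1. The weak-$L^1$ bound

Corollary 3.16 Given $u \in h^1(\mathbb{D})$, the limit $\lim_{r \to 1} \tilde u(re(\theta))$ exists for a.e. $\theta$. With $u = P_r * \mu$, $\mu \in M(\mathbb{T})$, this limit is denoted by $H\mu$. There exists the weak-$L^1$ bound
\[ \bigl| \{ \theta \in \mathbb{T} \mid |H\mu(\theta)| > \lambda \} \bigr| \le \frac{C}{\lambda} \|\mu\|. \]

Proof If $\mu(d\theta) = f(\theta)\, d\theta$ with $f \in L^2(\mathbb{T})$ then $\lim_{r \to 1} \tilde u_f(re(\theta))$ exists for a.e. $\theta$, by Corollary 3.14. If $f \in L^1(\mathbb{T})$ and $\varepsilon > 0$ then let $g \in L^2(\mathbb{T})$ be such that $\|f - g\|_1 < \varepsilon$. Denote, for any $\delta > 0$,
\[ E_\delta := \Bigl\{ \theta \in \mathbb{T} \Bigm| \limsup_{r,s \to 1} \bigl| \tilde u_f(re(\theta)) - \tilde u_f(se(\theta)) \bigr| > \delta \Bigr\}, \]
\[ F_\delta := \Bigl\{ \theta \in \mathbb{T} \Bigm| \limsup_{r,s \to 1} \bigl| \tilde u_h(re(\theta)) - \tilde u_h(se(\theta)) \bigr| > \delta \Bigr\}, \]
where $h = f - g$. In view of the preceding theorem and the $L^2$ case,
\[ |E_\delta| = |F_\delta| \le \bigl| \{ \theta \in \mathbb{T} \mid (\tilde u_h)^*(\theta) > \tfrac12 \delta \} \bigr| \le \frac{C}{\delta} |||u_h|||_1 \le \frac{C}{\delta} \|f - g\|_1 \to 0 \]
as $\varepsilon \to 0$. This finishes the case where $\mu$ is absolutely continuous relative to Lebesgue measure.

To treat singular measures, we first consider measures $\mu = \nu$ that satisfy $|\mathrm{supp}(\nu)| = 0$. Here
\[ \mathrm{supp}(\nu) := \mathbb{T} \setminus \bigcup \{ I \subset \mathbb{T} \mid \nu(I) = 0 \}, \]
$I$ being an arc. Observe that for any $\theta \notin \mathrm{supp}(\nu)$ the limit $\lim_{r \to 1} \tilde u_r(\theta)$ exists since the analytic function $u + i\tilde u$ can be continued across an interval $J$ on $\mathbb{T}$ for which $\mu(J) = 0$ and which contains $\theta$. Hence $\lim_{r \to 1} \tilde u_r$ exists a.e. by the assumption $|\mathrm{supp}(\nu)| = 0$. If $\mu \in M(\mathbb{T})$ is an arbitrary singular measure then use inner regularity to observe that for every $\varepsilon > 0$ there exists $\nu \in M(\mathbb{T})$ with $\|\mu - \nu\| < \varepsilon$ and $|\mathrm{supp}(\nu)| = 0$. Indeed, set $\nu(A) := \mu(A \cap K)$ for all Borel sets $A$, where $K$ is compact with $|K| = 0$ and $|\mu|(\mathbb{T} \setminus K) < \varepsilon$. The theorem now follows on passing from the statement for $\nu$ to one for $\mu$ by means of the same argument as that used in the absolutely continuous case above. □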



3.5.2. The $L^p$ bound

It is now easy to obtain the $L^p$-boundedness of the Hilbert transform for $1 < p < \infty$. This result is due to Marcel Riesz, who gave a different proof; see the exercise following the theorem for his original argument.

Theorem 3.17 If $1 < p < \infty$ then $\|Hf\|_p \le C_p \|f\|_p$. Consequently, if $u \in h^p(\mathbb{D})$ with $1 < p < \infty$ then $\tilde u \in h^p(\mathbb{D})$ and $|||\tilde u|||_p \le C_p |||u|||_p$.

Proof By Proposition 3.13, $\|Hu\|_2 \le \|u\|_2$; equality occurs if and only if
\[ \int_{\mathbb{T}} u(\theta) \, d\theta = 0. \]

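Interpolating this with the weak-$L^1$ bound from Corollary 3.16 by means of the Marcinkiewicz interpolation theorem finishes the case $1 < p \le 2$. If $2 < p < \infty$ then we use duality. More precisely, if $f, g \in L^2(\mathbb{T})$ then
\[ \langle f, Hg \rangle = \sum_{n \in \mathbb{Z}} \hat f(n) \overline{\widehat{Hg}(n)} = \sum_{n \in \mathbb{Z}} i\,\mathrm{sign}(n) \hat f(n) \overline{\hat g(n)} = -\sum_{n \in \mathbb{Z}} \widehat{Hf}(n) \overline{\hat g(n)} = -\langle Hf, g \rangle. \]
This shows that $H^* = -H$. Hence, if $f \in L^p(\mathbb{T}) \subset L^2(\mathbb{T})$ and $g \in L^2(\mathbb{T}) \subset L^{p'}(\mathbb{T})$ then
\[ |\langle Hf, g \rangle| = |\langle f, Hg \rangle| \le \|f\|_p \|Hg\|_{p'} \le C_{p'} \|f\|_p \|g\|_{p'} \]
and thus $\|Hf\|_p \le C_{p'} \|f\|_p$, as claimed. □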
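The duality computation can be tried out on trigonometric polynomials. The Python sketch below is an illustration under assumptions: it reads off from the displayed sums that $H$ acts on Fourier coefficients as multiplication by $-i\,\mathrm{sign}(n)$, and it stores a trigonometric polynomial as a dictionary mapping frequencies to coefficients, a representation chosen here for convenience and not taken from the text. The sample coefficients are arbitrary.

```python
def sign(n):
    """Sign of an integer: -1, 0 or 1."""
    return (n > 0) - (n < 0)

def hilbert(coeffs):
    """Apply the assumed multiplier -i * sign(n) to each Fourier coefficient."""
    return {n: -1j * sign(n) * c for n, c in coeffs.items()}

def inner(f, g):
    """L^2(T) inner product via Parseval: sum over n of f^(n) * conj(g^(n))."""
    return sum(c * g.get(n, 0).conjugate() for n, c in f.items())

# Arbitrary sample trigonometric polynomials (frequency -> coefficient).
f = {-2: 1 + 2j, 0: 3.0, 1: -1j, 3: 0.5}
g = {-2: 2j, 1: 1.0, 2: -1 + 1j, 3: 4.0}

if __name__ == "__main__":
    # Skew-adjointness: <f, Hg> + <Hf, g> should vanish.
    print(inner(f, hilbert(g)) + inner(hilbert(f), g))
    # ||Hf||_2^2 equals ||f||_2^2 minus |f^(0)|^2, so equality needs mean zero.
    print(inner(hilbert(f), hilbert(f)).real, inner(f, f).real - abs(f[0]) ** 2)
```

The second printed pair makes the equality case of $\|Hu\|_2 \le \|u\|_2$ visible: only the zeroth coefficient is lost.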

Exercise 3.4 Give a complex-variable proof of the $L^{2m}(\mathbb{T})$-boundedness of the Hilbert transform by applying the Cauchy integral formula to $(u + i\tilde u)^{2m}$ when $m$ is a positive integer; cf. the proof of Proposition 3.13. Obtain Theorem 3.17 from this by interpolation and duality.

Consider the analytic mapping $F = u + iv$ that takes $\mathbb{D}$ onto the strip $\{ z \mid |\mathrm{Re}\, z| < 1 \}$. Then $u \in h^\infty(\mathbb{D})$ but clearly $v \notin h^\infty(\mathbb{D})$, so that Theorem 3.17 fails on $L^\infty(\mathbb{T})$. By duality, it also fails on $L^1(\mathbb{T})$. The correct substitute for $L^1$ in this context is the space of real parts of functions in $H^1(\mathbb{D})$. This is a deep result that goes much further than the F. and M. Riesz theorem. The statement is that
\[ \|Hf\|_1 \le C \|u_f^*\|_1, \qquad (3.6) \]


where $u_f^*$ is the nontangential maximal function of the harmonic extension $u_f$ of $f$. Recall that by the Burkholder–Gundy–Silverstein theorem the right-hand side of (3.6) is finite if and only if $f$ is the real part of an analytic $L^1$-bounded function. A more modern approach to (3.6) is given by the real-variable theory of Hardy spaces, which subsumes statements such as (3.6) by the boundedness of singular integral operators on such spaces.

3.5.3. Kernel representation of the Hilbert transform

Next, we turn to the problem of expressing $Hf$ in terms of a kernel. By Exercise 2.4 it is clear that one would expect that
\[ (H\mu)(\theta) = \int_{\mathbb{T}} \cot(\pi(\theta - \varphi)) \, \mu(d\varphi) \qquad (3.7) \]

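for any $\mu \in M(\mathbb{T})$. This, however, requires justification as the integral on the right-hand side is not necessarily convergent.

Proposition 3.18 If $\mu \in M(\mathbb{T})$ then
\[ \lim_{\varepsilon \to 0} \int_{|\theta - \varphi| > \varepsilon} \cot(\pi(\theta - \varphi)) \, \mu(d\varphi) = (H\mu)(\theta) \qquad (3.8) \]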



for a.e. $\theta \in \mathbb{T}$. In other words, (3.7) holds in the principal value sense.

Exercise 3.5 Verify that the limit in (3.8) exists for all $d\mu = f\, d\theta + d\nu$ where $f \in C^1(\mathbb{T})$ and $|\mathrm{supp}(\nu)| = 0$. Furthermore, show that these measures are dense in $M(\mathbb{T})$.

Proof of Proposition 3.18 The idea is to represent a general measure as a limit of measures of the kind given by Exercise 3.5. As we saw in the proof of the a.e. convergence result Theorem 2.12, the double limit appearing in such an argument requires a bound on an appropriate maximal function. In this case the natural bound is of the form
\[ \Bigl| \Bigl\{ \theta \in \mathbb{T} \Bigm| \sup_{0 < \varepsilon < \cdots} \Bigl| \int_{|\theta - \varphi| > \varepsilon} \cot(\pi(\theta - \varphi)) \, \mu(d\varphi) \Bigr| > \lambda \Bigr\} \Bigr| \le \frac{C}{\lambda} \|\mu\| \qquad (3.9) \]
for all $\lambda > 0$. We leave it to the reader to check that (3.9) implies the theorem. In order to prove (3.9) we invoke our strongest result on the conjugate function, namely Theorem 3.15. More precisely, we claim that
\[ \sup_{0 < \cdots} \Bigl| (Q_r * \mu)(\theta) - \int_{\cdots} \cot(\pi(\theta - \varphi)) \, \mu(d\varphi) \Bigr| \le C M\mu(\theta), \qquad (3.10) \]