Stochastic Calculus: A Practical Introduction
Richard Durrett
PROBABILITY AND STOCHASTICS SERIES Edited by Richard Durrett and Mark Pinsky
Probability and Stochastics Series
Linear Stochastic Control Systems, Guanrong Chen, Goong Chen, and Shia-Hsun Hsu
Advances in Queueing: Theory, Methods, and Open Problems, Jewgeni H. Dshalalow
Stochastic Calculus: A Practical Introduction, Richard Durrett
Chaos Expansion, Multiple Wiener-Ito Integrals and Applications, Christian Houdre and Victor Perez-Abreu
White Noise Distribution Theory, Hui-Hsiung Kuo
Topics in Contemporary Probability, J. Laurie Snell
CRC Press Boca Raton New York London Tokyo
Preface
Tim Pletscher: Acquiring Editor
Dawn Boyd: Cover Designer
Susie Carlisle: Marketing Manager
Arline Massey: Associate Marketing Manager
Paul Gottehrer: Assistant Managing Editor, EDP
Kevin Luong: Pre-Press
Sheri Schwartz: Manufacturing Assistant
Library of Congress Cataloging-in-Publication Data Catalog record is available from the Library of Congress This book contains information obtained from authentic and highly regarded sources. Reprinted material is quoted with permission, and sources are indicated. A wide variety of references are listed. Reasonable efforts have been made to publish reliable data and information, but the author and the
publisher cannot assume responsibility for the validity of all materials or for the consequences of their use.
Neither this book nor any part may be reproduced or transmitted in any form or by any means, electronic
or mechanical, including photocopying, microfilming, and recording, or by any information storage or retrieval system, without prior permission in writing from the publisher.
CRC Press, Inc.'s consent does not extend to copying for general distribution, for promotion, for creating new works, or for resale. Specific permission must be obtained in writing from CRC Press for such copying.
Direct all inquiries to CRC Press, Inc., 2000 Corporate Blvd., N.W., Boca Raton, Florida 33431.
© 1996 by CRC Press, Inc. No claim to original U.S. Government works
International Standard Book Number 0-8493-8071-5
Printed in the United States of America 1 2 3 4 5 6 7 8 9 0
Printed on acid-free paper
This book is the re-incarnation of my first book Brownian Motion and Martingales in Analysis, which was published by Wadsworth in 1984. For more than a decade I have used Chapters 1, 2, 8, and 9 of that book to give "reading courses" to graduate students who have completed the first year graduate probability course and were interested in learning more about processes that move continuously in space and time. Taking the advice from biology that "form follows function" I have taken that material on stochastic integration, stochastic differential equations, Brownian motion and its relation to partial differential equations to be the core of this book (Chapters 1-5). To this I have added other practically important topics: one dimensional diffusions, semigroups and generators, Harris chains, and weak convergence. I have struggled with this material for almost twenty years. I now think that I understand most of it, so to help you master it in less time, I have tried to explain it as simply and clearly as I can.
My students' motivations for learning this material have been diverse: some have wanted to apply ideas from probability to analysis or differential geometry, others have gone on to do research on diffusion processes or stochastic partial differential equations, some have been interested in applications of these ideas to finance, or to problems in operations research. My motivation for writing this book, like that for Probability: Theory and Examples, was to simplify my life as a teacher by bringing together in one place useful material that is scattered in a variety of sources. An old joke says that "if you copy from one book that is plagiarism, but if you copy from ten books that is scholarship." From that viewpoint this is a scholarly book.
Its main contributors for the various subjects are (a) stochastic integration and differential equations: Chung and Williams (1990), Ikeda and Watanabe (1981), Karatzas and Shreve (1991), Protter (1990), Revuz and Yor (1991), Rogers and Williams (1987), Stroock and Varadhan (1979); (b) partial differential equations: Folland (1976), Friedman (1964), (1975), Port and Stone (1978), Chung and Zhao (1995); (c) one dimensional diffusions: Karlin and Taylor (1981); (d) semi-groups and generators: Dynkin (1965), Revuz and Yor (1991); (e) weak convergence: Billingsley (1968), Ethier and Kurtz (1986), Stroock and Varadhan (1979). If you bought all those books you would spend more than $1000 but for a fraction of that cost you can have this book, the intellectual equivalent of the ginzu knife.
Shutting off the laugh-track and turning on the violins, the road from this book's first publication in 1984 to its rebirth in 1996 has been a long and winding one. In the second half of the 80's I accumulated an embarrassingly long list of typos from the first edition. Some time at the beginning of the 90's I talked to the editor who brought my first three books into the world, John Kimmel, about preparing a second edition. However, after the work was done, the second edition was personally killed by Bill Roberts, the President of Brooks/Cole. At the end of 1992 I entered into a contract with Wayne Yuhasz at CRC Press to produce this book. In the first few months of 1993, June Meyerman typed most of the book into TeX. In the Fall Semester of 1993 I taught a course from this material and began to organize it into the current form. By the summer of 1994 I thought I was almost done. At this point I had the good (and bad) fortune of having Nora Guertler, a student from Lyon, visit for two months. When she was through making an average of six corrections per page, it was clear that the book was far from finished.
During the 1994-95 academic year most of my time was devoted to preparing the second edition of my first year graduate textbook Probability: Theory and Examples. After that experience my brain cells could not bear to work on another book for another several months, but toward the end of 1995 they decided "it is now or never." The delightful Cornell tradition of a long winter break, which for me stretched from early December to late January, provided just enough time to finally finish the book.
I am grateful to my students who have read various versions of this book and also made numerous comments: Don Allers, Hassan Allouba, Robert Battig, Marty Hill, Min-jeong Kang, Susan Lee, Gang Ma, and Nikhil Shah.
Earlier in the process, before I started writing, Heike Dengler, David Lando and I spent a semester reading Protter (1990) and Jacod and Shiryaev (1987), an enterprise which contributed greatly to my education. The ancient history of the revision process has unfortunately been lost. At the time of the proposed second edition, I transferred a number of lists of typos to my copy of the book, but I have no record of the people who supplied the lists. I remember getting a number of corrections from Mike Brennan and Ruth Williams, and it is impossible to forget the story of Robin Pemantle who took Brownian Motion, Martingales and Analysis as his only math book for a year long trek through the South Seas and later showed me his fully annotated copy. However, I must apologize to others whose contributions were recorded but whose names were lost. Flame me at [email protected] and I'll have something to say about you in the next edition. Rick Durrett
About the Author
Rick Durrett received his Ph.D. in Operations Research from Stanford University in 1976. He taught in the Mathematics Department at UCLA for nine years before becoming a Professor of Mathematics at Cornell University. He was a Sloan Fellow 1981-83, Guggenheim Fellow 1988-89, and spoke at the International Congress of Math in Kyoto 1990. Durrett is the author of a graduate textbook, Probability: Theory and Examples, and an undergraduate one, The Essentials of Probability. He has written almost 100 papers with a total of 38 co-authors and seen 19 students complete their Ph.D.'s under his direction. His recent research focuses on the applications of stochastic spatial models to various problems in biology.
Stochastic Calculus: A Practical Introduction
1. Brownian Motion
1.1 Definition and Construction
1.2 Markov Property, Blumenthal's 0-1 Law
1.3 Stopping Times, Strong Markov Property
1.4 First Formulas

2. Stochastic Integration
2.1 Integrands: Predictable Processes
2.2 Integrators: Continuous Local Martingales
2.3 Variance and Covariance Processes
2.4 Integration w.r.t. Bounded Martingales
2.5 The Kunita-Watanabe Inequality
2.6 Integration w.r.t. Local Martingales
2.7 Change of Variables, Ito's Formula
2.8 Integration w.r.t. Semimartingales
2.9 Associative Law
2.10 Functions of Several Semimartingales
Chapter Summary
2.11 Meyer-Tanaka Formula, Local Time
2.12 Girsanov's Formula

3. Brownian Motion, II
3.1 Recurrence and Transience
3.2 Occupation Times
3.3 Exit Times
3.4 Change of Time, Levy's Theorem
3.5 Burkholder Davis Gundy Inequalities
3.6 Martingales Adapted to Brownian Filtrations

4. Partial Differential Equations
A. Parabolic Equations
4.1 The Heat Equation
4.2 The Inhomogeneous Equation
4.3 The Feynman-Kac Formula
B. Elliptic Equations
4.4 The Dirichlet Problem
4.5 Poisson's Equation
4.6 The Schrodinger Equation
C. Applications to Brownian Motion
4.7 Exit Distributions for the Ball
4.8 Occupation Times for the Ball
4.9 Laplace Transforms, Arcsine Law

5. Stochastic Differential Equations
5.1 Examples
5.2 Ito's Approach
5.3 Extension
5.4 Weak Solutions
5.5 Change of Measure
5.6 Change of Time

6. One Dimensional Diffusions
6.1 Construction
6.2 Feller's Test
6.3 Recurrence and Transience
6.4 Green's Functions
6.5 Boundary Behavior
6.6 Applications to Higher Dimensions

7. Diffusions as Markov Processes
7.1 Semigroups and Generators
7.2 Examples
7.3 Transition Probabilities
7.4 Harris Chains
7.5 Convergence Theorems

8. Weak Convergence
8.1 In Metric Spaces
8.2 Prokhorov's Theorems
8.3 The Space C
8.4 Skorohod's Existence Theorem for SDE
8.5 Donsker's Theorem
8.6 The Space D
8.7 Convergence to Diffusions
8.8 Examples

Solutions to Exercises
References
Index
1. Brownian Motion

1.1. Definition and Construction
In this section we will define Brownian motion and construct it. This event, like the birth of a child, is messy and painful, but after a while we will be able to have fun with our new arrival. We begin by reducing the definition of a d-dimensional Brownian motion with a general starting point to that of a one dimensional Brownian motion starting at 0. The first two statements, (1.1) and (1.2), are part of our definition.
(1.1) Translation invariance. $\{B_t - B_0, t \ge 0\}$ is independent of $B_0$ and has the same distribution as a Brownian motion with $B_0 = 0$.

(1.2) Independence of coordinates. If $B_0 = 0$ then $\{B_t^1, t \ge 0\}, \ldots, \{B_t^d, t \ge 0\}$ are independent one dimensional Brownian motions starting at 0.

Now we define a one dimensional Brownian motion starting at 0 to be a process $B_t$, $t \ge 0$ taking values in $\mathbf{R}$ that has the following properties:

(a) If $t_0 < t_1 < \ldots < t_n$ then $B(t_0), B(t_1) - B(t_0), \ldots, B(t_n) - B(t_{n-1})$ are independent.

(b) If $s, t \ge 0$ then
$$P(B(s+t) - B(s) \in A) = \int_A (2\pi t)^{-1/2} \exp(-x^2/2t)\,dx$$

(c) With probability one, $B_0 = 0$ and $t \to B_t$ is continuous.

(a) says that $B_t$ has independent increments. (b) says that the increment $B(s+t) - B(s)$ has a normal distribution with mean 0 and variance $t$. (c) is self-explanatory. The reader should note that above we have sometimes written $B_s$ and sometimes written $B(s)$, a practice we will continue in what follows.
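Properties (a) and (b) can be illustrated numerically. The following is a minimal sketch, not part of the construction: it assumes NumPy is available, and the function name `simulate_brownian_paths`, the grid size, and the sample count are illustrative choices. It builds discretized paths from independent normal increments and checks that an increment over a time span of length 0.5 has variance close to 0.5 and is nearly uncorrelated with an earlier increment.

```python
import numpy as np

def simulate_brownian_paths(n_paths, n_steps, t_max, seed=0):
    """Return (times, paths); paths[k, i] approximates B(times[i]) with B(0) = 0."""
    rng = np.random.default_rng(seed)
    dt = t_max / n_steps
    # Property (b): the increment over a step of length dt is N(0, dt).
    # Property (a): increments over disjoint steps are drawn independently.
    increments = rng.normal(0.0, np.sqrt(dt), size=(n_paths, n_steps))
    start = np.zeros((n_paths, 1))
    paths = np.concatenate([start, np.cumsum(increments, axis=1)], axis=1)
    times = np.linspace(0.0, t_max, n_steps + 1)
    return times, paths

times, paths = simulate_brownian_paths(n_paths=20000, n_steps=100, t_max=1.0)

early = paths[:, 25] - paths[:, 0]    # B(0.25) - B(0)
later = paths[:, 75] - paths[:, 25]   # B(0.75) - B(0.25)

var_later = later.var()                   # should be close to 0.5
corr = np.corrcoef(early, later)[0, 1]    # should be close to 0
print(var_later, corr)
```

A simulation on a grid says nothing about property (c); path continuity is the subject of the construction that follows.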
An immediate consequence of the definition that will be useful many times is:

(1.3) Scaling relation. If $B_0 = 0$ then for any $t > 0$,
$$\{B_{st}, s \ge 0\} \stackrel{d}{=} \{t^{1/2} B_s, s \ge 0\}$$
To be precise, the two families of random variables have the same finite dimensional distributions, i.e., if $s_1 < \ldots < s_n$ then
$$(B_{s_1 t}, \ldots, B_{s_n t}) \stackrel{d}{=} (t^{1/2} B_{s_1}, \ldots, t^{1/2} B_{s_n})$$
In view of (1.2) it suffices to prove this for a one dimensional Brownian motion. To check this when $n = 1$, we note that $t^{1/2}$ times a normal with mean 0 and variance $s$ is a normal with mean 0 and variance $st$. The result for $n > 1$ follows from independent increments.

A second equivalent definition of one dimensional Brownian motion starting from $B_0 = 0$, which we will occasionally find useful, is that $B_t$, $t \ge 0$, is a real valued process satisfying

(a') $B_t$ is a Gaussian process (i.e., all its finite dimensional distributions are multivariate normal),

(b') $EB_s = 0$, $EB_sB_t = s \wedge t = \min\{s, t\}$,

(c) With probability one, $t \to B_t$ is continuous.

It is easy to see that (a) and (b) imply (a'). To get (b') from (a) and (b) suppose $s < t$ and write
$$EB_sB_t = E(B_s^2) + E(B_s(B_t - B_s)) = s + EB_s\,E(B_t - B_s) = s$$
The converse is even easier. (a') and (b') specify the finite dimensional distributions of $B_t$, which by the last calculation must agree with the ones defined in (a) and (b).

The first question that must be addressed in any treatment of Brownian motion is, "Is there a process with these properties?" The answer is "Yes," of course, or this book would not exist. For pedagogical reasons we will pursue an approach that leads to a dead end and then retreat a little to rectify the difficulty. In view of our definition we can restrict our attention to a one dimensional Brownian motion starting from a fixed $x \in \mathbf{R}$. We could take $x = 0$ but will not for reasons that become clear in the remark after (1.5).

For each $0 < t_1 < \ldots < t_n$ define a measure on $\mathbf{R}^n$ by
$$\mu_{x,t_1,\ldots,t_n}(A_1 \times \cdots \times A_n) = \int_{A_1} dx_1 \cdots \int_{A_n} dx_n \prod_{m=1}^n p_{t_m - t_{m-1}}(x_{m-1}, x_m)$$
where $x_0 = x$, $t_0 = 0$,
$$p_t(a, b) = (2\pi t)^{-1/2} \exp(-(b-a)^2/2t)$$
and $A_i \in \mathcal{R}$, the Borel subsets of $\mathbf{R}$. From the formula above it is easy to see that for fixed $x$ the family $\mu$ is a consistent set of finite dimensional distributions (f.d.d.'s), that is, if $\{s_1, \ldots, s_{n-1}\} \subset \{t_1, \ldots, t_n\}$ and $t_j \notin \{s_1, \ldots, s_{n-1}\}$ then
$$\mu_{x,s_1,\ldots,s_{n-1}}(A_1 \times \cdots \times A_{n-1}) = \mu_{x,t_1,\ldots,t_n}(A_1 \times \cdots \times A_{j-1} \times \mathbf{R} \times A_j \times \cdots \times A_{n-1})$$
This is clear when $j = n$. To check the equality when $1 \le j < n$, it is enough to show that
$$\int p_{t_j - t_{j-1}}(x, y)\, p_{t_{j+1} - t_j}(y, z)\, dy = p_{t_{j+1} - t_{j-1}}(x, z)$$
By translation invariance, we can without loss of generality assume $x = 0$, but all this says in that case is that the sum of independent normals with mean 0 and variances $t_j - t_{j-1}$ and $t_{j+1} - t_j$ has a normal distribution with mean 0 and variance $t_{j+1} - t_{j-1}$. With the consistency of f.d.d.'s verified we get our first construction of Brownian motion:

(1.4) Theorem. Let $\Omega_o = \{\text{functions } \omega : [0, \infty) \to \mathbf{R}\}$ and $\mathcal{F}_o$ be the $\sigma$-field generated by the finite dimensional sets $\{\omega : \omega(t_i) \in A_i \text{ for } 1 \le i \le n\}$ where $A_i \in \mathcal{R}$. For each $x \in \mathbf{R}$, there is a unique probability measure $\nu_x$ on $(\Omega_o, \mathcal{F}_o)$ so that $\nu_x\{\omega : \omega(0) = x\} = 1$ and when $0 < t_1 < \cdots < t_n$
$$(*) \qquad \nu_x\{\omega : \omega(t_i) \in A_i\} = \mu_{x,t_1,\ldots,t_n}(A_1 \times \cdots \times A_n)$$

This follows from a generalization of Kolmogorov's extension theorem. We will not bother with the details since at this point we are at the dead end referred to above. If $C = \{\omega : t \to \omega(t) \text{ is continuous}\}$ then $C \notin \mathcal{F}_o$, that is, $C$ is not a measurable set. The easiest way of proving $C \notin \mathcal{F}_o$ is to do

Exercise 1.1. $A \in \mathcal{F}_o$ if and only if there is a sequence of times $t_1, t_2, \ldots$ in $[0, \infty)$ and a $B \in \mathcal{R}^{\{1,2,\ldots\}}$ (the infinite product $\sigma$-field $\mathcal{R} \times \mathcal{R} \times \cdots$) so that $A = \{\omega : (\omega(t_1), \omega(t_2), \ldots) \in B\}$. In words, all events in $\mathcal{F}_o$ depend on only countably many coordinates.
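The consistency check for the finite dimensional distributions comes down to the fact that the normal kernel $p_t$ composes under integration: integrating out the middle point of two successive steps gives a single step over the combined time. As a rough numerical sketch (assuming NumPy; the helper names, the grid, and the particular values of the times and endpoints are illustrative assumptions), the integral can be approximated by a Riemann sum and compared with the closed form.

```python
import numpy as np

def p(t, a, b):
    """Normal transition density p_t(a, b) = (2 pi t)^(-1/2) exp(-(b - a)^2 / 2t)."""
    return (2.0 * np.pi * t) ** -0.5 * np.exp(-((b - a) ** 2) / (2.0 * t))

def integrate_out_middle(s, t, x, z, half_width=12.0, n_grid=20001):
    """Riemann-sum approximation of the integral of p_s(x, y) p_t(y, z) over y."""
    y = np.linspace(x - half_width, x + half_width, n_grid)
    dy = y[1] - y[0]
    return float(np.sum(p(s, x, y) * p(t, y, z)) * dy)

# Two steps of lengths 0.3 and 0.7 should compose to one step of length 1.0.
lhs = integrate_out_middle(0.3, 0.7, x=0.0, z=0.5)
rhs = float(p(1.0, 0.0, 0.5))
print(lhs, rhs)
```

The grid is truncated to a window around the starting point, which is harmless here because the integrand decays like a Gaussian.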
The above problem is easy to solve. Let $Q_2 = \{m2^{-n} : m, n \ge 0\}$ be the dyadic rationals. If $\Omega_q = \{\omega : Q_2 \to \mathbf{R}\}$ and $\mathcal{F}_q$ is the $\sigma$-field generated by the finite dimensional sets, then enumerating the rationals $q_1, q_2, \ldots$ and applying Kolmogorov's extension theorem shows that we can construct a probability $\nu_x$ on $(\Omega_q, \mathcal{F}_q)$ so that $\nu_x\{\omega : \omega(0) = x\} = 1$ and $(*)$ in (1.4) holds when the $t_i \in Q_2$. To extend the definition from $Q_2$ to $[0, \infty)$ we will show:

(1.5) Theorem. Let $T < \infty$ and $x \in \mathbf{R}$. $\nu_x$ assigns probability one to paths $\omega : Q_2 \to \mathbf{R}$ that are uniformly continuous on $Q_2 \cap [0, T]$.

Remark. It will take quite a bit of work to prove (1.5). Before taking on that task, we will attend to the last measure theoretic detail: we tidy things up by moving our probability measures to $(C, \mathcal{C})$ where $C = \{\text{continuous } \omega : [0, \infty) \to \mathbf{R}\}$ and $\mathcal{C}$ is the $\sigma$-field generated by the coordinate maps $t \to \omega(t)$. To do this, we observe that the map $\psi$ that takes a uniformly continuous point in $\Omega_q$ to its extension in $C$ is measurable, and we set
$$P_x = \nu_x \circ \psi^{-1}$$
Our construction guarantees that $B_t(\omega) = \omega_t$ has the right finite dimensional distributions for $t \in Q_2$. Continuity of paths and a simple limiting argument shows that this is true when $t \in [0, \infty)$. As mentioned earlier the generalization to $d > 1$ is straightforward since the coordinates are independent. In this generality $C = \{\text{continuous } \omega : [0, \infty) \to \mathbf{R}^d\}$ and $\mathcal{C}$ is the $\sigma$-field generated by the coordinate maps $t \to \omega(t)$. The reader should note that the result of our construction is one set of random variables $B_t(\omega) = \omega(t)$, and a family of probability measures $P_x$, $x \in \mathbf{R}^d$, so that under $P_x$, $B_t$ is a Brownian motion with $P_x(B_0 = x) = 1$. It is enough to construct the Brownian motion starting from an initial point $x$, since if we want a Brownian motion starting from an initial measure $\mu$ (i.e., have $P_\mu(B_0 \in A) = \mu(A)$) we simply set
$$P_\mu(A) = \int \mu(dx)\, P_x(A)$$

Proof of (1.5) By (1.1) and (1.3), we can without loss of generality suppose $B_0 = 0$ and prove the result for $T = 1$. In this case, part (b) of the definition and the scaling relation (1.3) imply
$$E_0(|B_t - B_s|^4) = E_0|B_{t-s}|^4 = C(t-s)^2$$
where $C = E|B_1|^4 < \infty$. From the last observation we get the desired uniform continuity by using a result due to Kolmogorov. In this proof, we do not use the independent increments property of Brownian motion; the only thing we use is the moment condition. In Section 2.11 we will need this result when the $X_t$ take values in a space $S$ with metric $\rho$; so, we will go ahead and prove it in that generality.

(1.6) Theorem. Suppose that $E\rho(X_s, X_t)^\beta \le K|t - s|^{1+\alpha}$ where $\alpha, \beta > 0$. If $\gamma < \alpha/\beta$ then with probability one there is a constant $C(\omega)$ so that
$$\rho(X_q, X_r) \le C|q - r|^\gamma \quad \text{for all } q, r \in Q_2 \cap [0, 1]$$

Proof Let $\gamma < \alpha/\beta$, $\eta > 0$, $I_n = \{(i, j) : 0 \le i \le j \le 2^n,\ 0 < j - i \le 2^{n\eta}\}$ and $G_n = \{\rho(X(j2^{-n}), X(i2^{-n})) \le ((j-i)2^{-n})^\gamma \text{ for all } (i, j) \in I_n\}$. Since $a^\beta P(|Y| > a) \le E|Y|^\beta$ we have
$$P(G_n^c) \le \sum_{(i,j) \in I_n} ((j-i)2^{-n})^{-\beta\gamma}\, E\rho(X(j2^{-n}), X(i2^{-n}))^\beta \le K \sum_{(i,j) \in I_n} ((j-i)2^{-n})^{-\beta\gamma + 1 + \alpha}$$
by our assumption. Now the number of $(i, j) \in I_n$ is $\le 2^n 2^{n\eta}$, so
$$P(G_n^c) \le K \cdot 2^n 2^{n\eta} \cdot (2^{n\eta} 2^{-n})^{-\beta\gamma + 1 + \alpha} = K 2^{-n\lambda}$$
where $\lambda = (1 - \eta)(1 + \alpha - \beta\gamma) - (1 + \eta)$. Since $\gamma < \alpha/\beta$, we can pick $\eta$ small enough so that $\lambda > 0$. To complete the proof now we will show

(1.7) Lemma. Let $A = 3 \cdot 2^{(1-\eta)\gamma}/(1 - 2^{-\gamma})$. On $H_N = \bigcap_{n=N}^\infty G_n$ we have
$$\rho(X_q, X_r) \le A|q - r|^\gamma \quad \text{for } q, r \in Q_2 \cap [0, 1] \text{ with } |q - r| \le 2^{-(1-\eta)N}$$

(1.6) follows easily from (1.7):
$$P(H_N^c) \le \sum_{n=N}^\infty P(G_n^c) \le K \sum_{n=N}^\infty 2^{-n\lambda} = \frac{K 2^{-N\lambda}}{1 - 2^{-\lambda}}$$
This shows $\rho(X_q, X_r) \le A|q - r|^\gamma$ for $|q - r| \le \delta(\omega)$ and implies that we have $\rho(X_q, X_r) \le C(\omega)|q - r|^\gamma$ for $q, r \in [0, 1]$.

Proof of (1.7) Let $q, r \in Q_2 \cap [0, 1]$ with $0 < r - q < 2^{-(1-\eta)N}$. Pick $m \ge N$ so that
$$2^{-(m+1)(1-\eta)} \le r - q < 2^{-m(1-\eta)}$$
and write
$$r = j2^{-m} + 2^{-r(1)} + \cdots + 2^{-r(\ell)}, \qquad q = i2^{-m} - 2^{-q(1)} - \cdots - 2^{-q(k)}$$
where $m < r(1) < \cdots < r(\ell)$ and $m < q(1) < \cdots < q(k)$. Bounding on $H_N$ the three pieces of the path, from $q$ to $i2^{-m}$, from $i2^{-m}$ to $j2^{-m}$, and from $j2^{-m}$ to $r$, and writing $C_\gamma = 1/(1 - 2^{-\gamma})$, gives
$$\rho(X_q, X_r) \le 3C_\gamma 2^{-\gamma m(1-\eta)} \le 3C_\gamma 2^{(1-\eta)\gamma} |r - q|^\gamma$$
since $2^{-m(1-\eta)} \le 2^{1-\eta}|r - q|$. This completes the proof of (1.7) and hence of (1.6) and (1.5).

The scaling relation (1.3) implies $E|B_t - B_s|^{2m} = C_m|t - s|^m$ where $C_m = E|B_1|^{2m}$. So using (1.6) with $\beta = 2m$ and $\alpha = m - 1$ and then letting $m \to \infty$ gives:

(1.8) Theorem. Brownian paths are Holder continuous with exponent $\gamma$ for any $\gamma < 1/2$.

It is easy to show:

(1.9) Theorem. With probability one, Brownian paths are not Lipschitz continuous (and hence not differentiable) at any point.

Proof Let $A_n = \{\omega : \text{there is an } s \in [0, 1] \text{ so that } |B_t - B_s| \le C|t - s| \text{ when } |t - s| \le 3/n\}$. For $1 \le k \le n - 2$ let
$$Y_{k,n} = \max\left\{ \left| B\left(\tfrac{k+j}{n}\right) - B\left(\tfrac{k+j-1}{n}\right) \right| : j = 0, 1, 2 \right\}$$
$$G_n = \{\text{at least one } Y_{k,n} \text{ is } \le 5C/n\}$$

and use the Borel-Cantelli lemma to conclude that $\sum_{m \le 2^n} \Delta_{m,n}^2 \to t$ a.s. as $n \to \infty$.

Remark. The last result is true if we consider a sequence of partitions $\Pi_1 \subset \Pi_2 \subset \ldots$ with mesh $\to 0$. See Freedman (1970) p.42-46. The true quadratic variation, defined as the sup over all partitions, is $\infty$ for Brownian motion.
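The convergence of sums of squared increments along dyadic partitions can be seen in a small experiment. The sketch below is illustrative only: it assumes NumPy, and it assumes that $\Delta_{m,n}$ denotes the increment of the path over the $m$-th of $2^n$ equal subintervals of $[0, t]$, which is an interpretation since the definition is not shown in the surrounding text. One path is simulated on the finest grid and then coarsened, so that the partitions are nested.

```python
import numpy as np

def dyadic_squared_increment_sums(t, n_max, seed=0):
    """Sum of squared increments of one simulated path over 2^n pieces of [0, t], n = 1..n_max."""
    rng = np.random.default_rng(seed)
    fine = rng.normal(0.0, np.sqrt(t * 2.0 ** -n_max), size=2 ** n_max)
    path = np.concatenate([[0.0], np.cumsum(fine)])  # path on the finest dyadic grid
    sums = {}
    for n in range(1, n_max + 1):
        step = 2 ** (n_max - n)
        coarse_increments = np.diff(path[::step])    # 2^n increments at level n
        sums[n] = float(np.sum(coarse_increments ** 2))
    return sums

qv = dyadic_squared_increment_sums(t=2.0, n_max=16)
for n in (2, 6, 10, 16):
    print(n, qv[n])
```

With these choices the sums at the finer levels settle near $t = 2$, while the coarse levels fluctuate noticeably.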
1.2. Markov Property, Blumenthal's 0-1 Law
Intuitively the Markov property says

"given the present state, $B_s$, any other information about what happened before time $s$ is irrelevant for predicting what happens after time $s$."

Since Brownian motion is translation invariant, the Markov property can be more simply stated as:

"if $s \ge 0$ then $B_{t+s} - B_s$, $t \ge 0$ is a Brownian motion that is independent of what happened before time $s$."

This should not be surprising: the independent increments property ((a) in the definition) implies that if $s_1 \le s_2 \ldots \le s_m = s$ and $0 < t_1 \ldots < t_n$ then
$$(B_{t_1+s} - B_s, \ldots, B_{t_n+s} - B_s) \text{ is independent of } (B_{s_1}, \ldots, B_{s_m})$$
The major obstacle in proving the Markov property for Brownian motion then is to introduce the measure theory necessary to state and prove the result. The first step in doing this is to explain what we mean by "what happened before time $s$." The technical name for what we are about to define is a filtration, a fancy term for an increasing collection of $\sigma$-fields, $\mathcal{F}_s$, i.e., if $s \le t$ then $\mathcal{F}_s \subset \mathcal{F}_t$. Since we want $B_s \in \mathcal{F}_s$, i.e., $B_s$ is measurable with respect to $\mathcal{F}_s$, the first thing that comes to mind is
$$\mathcal{F}_s^o = \sigma(B_r : r \le s)$$
For technical reasons, it is convenient to replace $\mathcal{F}_s^o$ by
$$\mathcal{F}_s^+ = \bigcap_{t > s} \mathcal{F}_t^o$$
The fields $\mathcal{F}_s^+$ are nicer because they are right continuous. That is,
$$\bigcap_{t > s} \mathcal{F}_t^+ = \bigcap_{t > s} \Big( \bigcap_{u > t} \mathcal{F}_u^o \Big) = \bigcap_{u > s} \mathcal{F}_u^o = \mathcal{F}_s^+$$
In words the $\mathcal{F}_s^+$ allow us an "infinitesimal peek at the future," i.e., $A \in \mathcal{F}_s^+$ if it is in $\mathcal{F}_{s+\epsilon}^o$ for any $\epsilon > 0$. If $B_t$ is a Brownian motion and $f$ is any measurable function with $f(u) > 0$ when $u > 0$ the random variable
$$\limsup_{t \downarrow s} (B_t - B_s)/f(t - s)$$
is measurable with respect to $\mathcal{F}_s^+$ but not $\mathcal{F}_s^o$. Exercises 2.9 and 2.10 consider what happens when we take $f(u) = \sqrt{u}$ and $f(u) = \sqrt{u \log\log(1/u)}$. However, as we will see in (2.6), there are no interesting examples of sets that are in $\mathcal{F}_s^+$ but not in $\mathcal{F}_s^o$. The two $\sigma$-fields are the same (up to null sets).

To state the Markov property we need some notation. First, recall that we have a family of measures $P_x$, $x \in \mathbf{R}^d$, on $(C, \mathcal{C})$ so that under $P_x$, $B_t(\omega) = \omega(t)$ is a Brownian motion with $B_0 = x$. In order to define "what happens after time $s$," it is convenient to define the shift transformations $\theta_s : C \to C$, for $s \ge 0$ by
$$(\theta_s \omega)(t) = \omega(s + t) \quad \text{for } t \ge 0$$
In words, we cut off the part of the path before time $s$ and then shift the path so that time $s$ becomes time 0. To prepare for the next result, we note that if $Y : C \to \mathbf{R}$ is $\mathcal{C}$ measurable then $Y \circ \theta_s$ is a function of the future after time $s$. To see this, consider the simple example $Y(\omega) = f(\omega(t))$. In this case
$$Y \circ \theta_s = f(\theta_s \omega(t)) = f(\omega(s + t)) = f(B_{s+t})$$
Likewise, if $Y(\omega) = f(\omega_{t_1}, \ldots, \omega_{t_n})$ then
$$Y \circ \theta_s = f(B_{s+t_1}, \ldots, B_{s+t_n})$$

(2.1) The Markov property. If $s \ge 0$ and $Y$ is bounded and $\mathcal{C}$ measurable then for all $x \in \mathbf{R}^d$
$$E_x(Y \circ \theta_s \mid \mathcal{F}_s^+) = E_{B_s} Y$$
where the right hand side is $\varphi(y) = E_y Y$ evaluated at $y = B(s)$.

Explanation. In words, this says that the conditional expectation of $Y \circ \theta_s$ given $\mathcal{F}_s^+$ is just the expected value of $Y$ for a Brownian motion starting at $B_s$. To explain why this implies "given the present state, $B_s$, any other information about what happened before time $s$ is irrelevant for predicting what happens after time $s$," we begin by recalling (see (1.1) in Chapter 5 of Durrett (1995)) that if $\mathcal{G} \subset \mathcal{F}$ and $E(Z \mid \mathcal{F}) \in \mathcal{G}$ then $E(Z \mid \mathcal{F}) = E(Z \mid \mathcal{G})$. Applying this with $\mathcal{F} = \mathcal{F}_s^+$ and $\mathcal{G} = \sigma(B_s)$ we have
$$E_x(Y \circ \theta_s \mid \mathcal{F}_s^+) = E_x(Y \circ \theta_s \mid B_s)$$
If we recall that $X = E(Z \mid \mathcal{F})$ is our best guess at $Z$ given the information in $\mathcal{F}$ (in the sense of minimizing $E(Z - X)^2$ over $X \in \mathcal{F}$, see (1.4) in Chapter 4 of Durrett (1995)) then we see that our best guess at $Y \circ \theta_s$ given $B_s$ is the same as our best guess given $\mathcal{F}_s^+$, i.e., any other information in $\mathcal{F}_s^+$ is irrelevant for predicting $Y \circ \theta_s$.

Proof By the definition of conditional expectation, what we need to show is
$$\text{(MP)} \qquad E_x(Y \circ \theta_s; A) = E_x(E_{B_s} Y; A) \quad \text{for all } A \in \mathcal{F}_s^+$$
We will begin by proving the result for a special class of $Y$'s and a special class of $A$'s. Then we will use the $\pi$-$\lambda$ theorem and the monotone class theorem to extend to the general case. Suppose
$$Y(\omega) = \prod_{1 \le m \le n} f_m(\omega(t_m))$$
where $0 < t_1 < \ldots < t_n$ and the $f_m : \mathbf{R}^d \to \mathbf{R}$ are bounded and measurable. Let $0 < h < t_1$, let $0 < s_1 \ldots < s_k \le s + h$, and let $A = \{\omega : \omega(s_j) \in A_j, 1 \le j \le k\}$ where $A_j \in \mathcal{R}^d$ for $1 \le j \le k$. We will call these $A$'s the finite dimensional or f.d. sets in $\mathcal{F}_{s+h}^o$.

From the definition of Brownian motion it follows that if $0 = u_0 < u_1 < \ldots < u_\ell$ then the joint density of $(B_{u_1}, \ldots, B_{u_\ell})$ is given by
$$P_x(B_{u_1} = y_1, \ldots, B_{u_\ell} = y_\ell) = \prod_{i=1}^\ell p_{u_i - u_{i-1}}(y_{i-1}, y_i)$$
where $y_0 = x$. From this it follows that
$$E_x\Big( \prod_{i=1}^\ell g_i(B_{u_i}) \Big) = \int dy_1\, p_{u_1 - u_0}(y_0, y_1) g_1(y_1) \cdots \int dy_\ell\, p_{u_\ell - u_{\ell-1}}(y_{\ell-1}, y_\ell) g_\ell(y_\ell)$$
Applying this result with the $u_i$ given by $s_1, \ldots, s_k, s + h, s + t_1, \ldots, s + t_n$ and the obvious choices for the $g_i$ we have
$$E_x(Y \circ \theta_s; A) = E_x\Big( \prod_{j=1}^k 1_{A_j}(B_{s_j}) \cdot 1_{\mathbf{R}^d}(B_{s+h}) \cdot \prod_{m=1}^n f_m(B_{s+t_m}) \Big)$$
$$= \int_{A_1} dx_1\, p_{s_1}(x, x_1) \cdots \int_{A_k} dx_k\, p_{s_k - s_{k-1}}(x_{k-1}, x_k) \int_{\mathbf{R}^d} dy\, p_{s+h-s_k}(x_k, y) \cdots$$

Using (MP-2) and the bounded convergence theorem now gives the desired result.

(MP-3) shows that (MP) holds for $Y = \prod_{1 \le m \le n} f_m(\omega(t_m))$. To extend to general bounded measurable $Y$, we use the monotone class theorem, (2.3).
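The identity (MP) can be sanity-checked by Monte Carlo in one dimension for a single choice of $Y$ and $A$. In the sketch below, everything specific is an assumption made for illustration: $Y(\omega) = \omega(t)^2$, so that $\varphi(y) = E_y Y = y^2 + t$, the event $A = \{B_s > 0\}$, the parameter values, and the function name. Since the simulation generates $B_{s+t}$ by adding an independent increment to $B_s$, it builds the Markov property in; it is a consistency check on the formula, not evidence for the theorem.

```python
import numpy as np

def markov_property_check(x=0.5, s=1.0, t=2.0, n_samples=200_000, seed=0):
    """Estimate both sides of (MP) for Y(w) = w(t)^2 and A = {B_s > 0}, d = 1."""
    rng = np.random.default_rng(seed)
    b_s = x + np.sqrt(s) * rng.standard_normal(n_samples)           # B_s under P_x
    b_s_plus_t = b_s + np.sqrt(t) * rng.standard_normal(n_samples)  # B_{s+t}
    on_a = b_s > 0.0                                                # indicator of A
    lhs = np.mean(b_s_plus_t ** 2 * on_a)   # E_x(Y o theta_s; A)
    rhs = np.mean((b_s ** 2 + t) * on_a)    # E_x(phi(B_s); A), phi(y) = y^2 + t
    return float(lhs), float(rhs)

lhs, rhs = markov_property_check()
print(lhs, rhs)
```

The two estimates should agree up to Monte Carlo error.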
$$= \int p_t(0, y)\, P_y(T_0 > 1 - t)\, dy$$

Exercise 2.5. Let $G$ be an open set and let $T = \inf\{t : B_t \notin G\}$. Let $K$ be a closed subset of $G$ and suppose that for all $x \in K$ we have $P_x(T > 1, B_1 \in K) \ge \alpha$, then for all integers $n \ge 1$ and $x \in K$ we have $P_x(T > n, B_n \in K) \ge \alpha^n$.

The next two exercises prepare for calculations in Chapter 4.

Exercise 2.6. Let $0 < s < t$. If $h : \mathbf{R} \times \mathbf{R}^d \to \mathbf{R}$ is bounded and measurable
$$E_x\Big( \int_0^t h(r, B_r)\, dr \,\Big|\, \mathcal{F}_s \Big) = \int_0^s h(r, B_r)\, dr + E_{B(s)} \int_0^{t-s} h(s + u, B_u)\, du$$

Exercise 2.7. Let $0 < s < t$. If $f : \mathbf{R}^d \to \mathbf{R}$ and $h : \mathbf{R} \times \mathbf{R}^d \to \mathbf{R}$ are bounded and measurable then
$$E_x\Big( f(B_t) \exp\Big( \int_0^t h(r, B_r)\, dr \Big) \,\Big|\, \mathcal{F}_s \Big) = \exp\Big( \int_0^s h(r, B_r)\, dr \Big)\, E_{B_s}\Big\{ f(B_{t-s}) \exp\Big( \int_0^{t-s} h(s + u, B_u)\, du \Big) \Big\}$$

The reader will see many applications of the Markov property below, so we turn our attention now to a "triviality" that has surprising consequences. Since
$$E_x(Y \circ \theta_s \mid \mathcal{F}_s^+) = E_{B_s} Y \in \mathcal{F}_s^o$$
it follows (see, e.g., (1.1) in Chapter 5 of Durrett (1995)) that
$$E_x(Y \circ \theta_s \mid \mathcal{F}_s^+) = E_x(Y \circ \theta_s \mid \mathcal{F}_s^o)$$
From the last equation it is a short step to

(2.6) Theorem. If $Z \in \mathcal{C}$ is bounded then for all $s \ge 0$ and $x \in \mathbf{R}^d$,
$$E_x(Z \mid \mathcal{F}_s^+) = E_x(Z \mid \mathcal{F}_s^o)$$

Proof By the monotone class theorem, (2.3), it suffices to prove the result when
$$Z = \prod_{m=1}^n f_m(B(t_m))$$
and the $f_m$ are bounded and measurable. In this case $Z = X(Y \circ \theta_s)$ where $X \in \mathcal{F}_s^o$ and $Y \in \mathcal{C}$, so using a property of conditional expectation (see, e.g., (1.3) in Chapter 4 of Durrett (1995)) and the Markov property (2.1) gives
$$E_x(Z \mid \mathcal{F}_s^+) = X E_x(Y \circ \theta_s \mid \mathcal{F}_s^+) = X E_{B_s} Y \in \mathcal{F}_s^o$$
and the proof is complete.

If we let $Z \in \mathcal{F}_s^+$ then (2.6) implies $Z = E_x(Z \mid \mathcal{F}_s^o) \in \mathcal{F}_s^o$, so the two $\sigma$-fields are the same up to null sets. At first glance, this conclusion is not exciting. The fun starts when we take $s = 0$ in (2.6) to get

(2.7) Blumenthal's 0-1 law. If $A \in \mathcal{F}_0^+$ then for all $x \in \mathbf{R}^d$, $P_x(A) \in \{0, 1\}$.

Proof Using (i) the fact that $A \in \mathcal{F}_0^+$, (ii) (2.6), (iii) $\mathcal{F}_0^o = \sigma(B_0)$ is trivial under $P_x$, and (iv) if $\mathcal{G}$ is trivial $E(X \mid \mathcal{G}) = EX$ gives
$$1_A = E_x(1_A \mid \mathcal{F}_0^+) = E_x(1_A \mid \mathcal{F}_0^o) = P_x(A) \quad P_x \text{ a.s.}$$
This shows that the indicator function $1_A$ is a.s. equal to the number $P_x(A)$ and it follows that $P_x(A) \in \{0, 1\}$.

In words, the last result says that the germ field, $\mathcal{F}_0^+$, is trivial. This result is very useful in studying the local behavior of Brownian paths. Until further notice we will restrict our attention to one dimensional Brownian motion.

(2.8) Theorem. If $\tau = \inf\{t \ge 0 : B_t > 0\}$ then $P_0(\tau = 0) = 1$.

Proof $P_0(\tau \le t) \ge P_0(B_t > 0) = 1/2$ since the normal distribution is symmetric about 0. Letting $t \downarrow 0$ we conclude
$$P_0(\tau = 0) = \lim_{t \downarrow 0} P_0(\tau \le t) \ge 1/2$$
so it follows from (2.7) that $P_0(\tau = 0) = 1$.

Once Brownian motion must hit $(0, \infty)$ immediately starting from 0, it must also hit $(-\infty, 0)$ immediately. Since $t \to B_t$ is continuous, this forces:

(2.9) Theorem. If $T_0 = \inf\{t > 0 : B_t = 0\}$ then $P_0(T_0 = 0) = 1$.
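The inequality $P_0(\tau \le t) \ge P_0(B_t > 0) = 1/2$ used for (2.8) has a visible counterpart on a grid. The following rough sketch assumes NumPy; the function name, the horizon $t = 1$, and the grid sizes are illustrative assumptions. It estimates the fraction of simulated paths that are positive at time $t$ and the fraction that are positive at some grid time up to $t$. A grid can only detect positivity at grid times, so the second fraction is a lower estimate of $P_0(\tau \le t)$, and refining the grid should push it upward.

```python
import numpy as np

def positivity_fractions(t, n_steps, n_paths=4000, seed=0):
    """Return (fraction positive at some grid time <= t, fraction positive at time t)."""
    rng = np.random.default_rng(seed)
    increments = rng.normal(0.0, np.sqrt(t / n_steps), size=(n_paths, n_steps))
    paths = np.cumsum(increments, axis=1)      # values at grid times t/n_steps, ..., t
    hit = np.mean(paths.max(axis=1) > 0.0)     # positive at some grid time
    end_positive = np.mean(paths[:, -1] > 0.0) # positive at time t
    return float(hit), float(end_positive)

hit_coarse, end_coarse = positivity_fractions(t=1.0, n_steps=10)
hit_fine, end_fine = positivity_fractions(t=1.0, n_steps=1000)
print(hit_coarse, end_coarse)
print(hit_fine, end_fine)
```

The simulation cannot show that the probability equals one; that step is exactly what the 0-1 law supplies.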
Combining (2.8) and (2.9) with the Markov property you can prove

Exercise 2.8. If $a < b$ then with probability one there is a local maximum of $B_t$ in $(a, b)$. Since with probability one this holds for all rational $a < b$, the local maxima of Brownian motion are dense.

Another typical application of (2.7) is

Exercise 2.9. Let $f(t)$ be a function with $f(t) > 0$ for all $t > 0$. Use (2.7) to conclude that $\limsup_{t \downarrow 0} B(t)/f(t) = c$ $P_0$ a.s. where $c \in [0, \infty]$ is a constant.

In the next exercise we will see that $c = \infty$ when $f(t) = t^{1/2}$. The law of the iterated logarithm (see Section 7.9 of Durrett (1995)) shows that $c = 2^{1/2}$ when $f(t) = (t \log\log(1/t))^{1/2}$.

Exercise 2.10. Show that $\limsup_{t \downarrow 0} B(t)/t^{1/2} = \infty$ $P_0$ a.s., so with probability one Brownian paths are not Holder continuous of order 1/2 at 0.

Remark. Let $\mathcal{H}_\gamma(\omega)$ be the set of times at which the path $\omega \in C$ is Holder continuous of order $\gamma$. (1.6) shows that $P(\mathcal{H}_\gamma = [0, \infty)) = 1$ for $\gamma < 1/2$. Exercise 1.2 shows that $P(\mathcal{H}_\gamma = \emptyset) = 1$ for $\gamma > 1/2$. The last exercise shows $P(t \in \mathcal{H}_{1/2}) = 0$ for each $t$, but B. Davis (1983) has shown $P(\mathcal{H}_{1/2} \ne \emptyset) = 1$.

Comic Relief. There is a wonderful way of expressing the complexity of Brownian paths that I learned from Wilfrid Kendall. "If you run Brownian motion in two dimensions for a positive amount of time, it will write your name." Of course, on top of your name it will write everybody else's name, as well as all the works of Shakespeare, several pornographic novels, and a lot of nonsense. Thinking of the function $g$ as our signature we can make a precise statement as follows:

(2.10) Theorem. Let $g : [0, 1] \to \mathbf{R}^d$ be a continuous function with $g(0) = 0$, let $\epsilon > 0$ and let $t_n \downarrow 0$. Then $P_0$ almost surely,
$$\sup_{0 \le \theta \le 1} \left| \frac{B(\theta t_n)}{\sqrt{t_n}} - g(\theta) \right| < \epsilon \quad \text{for infinitely many } n$$

Proof In view of Blumenthal's 0-1 law, (2.7), and the scaling relation (1.3), we can prove this if we can show that
$$(*) \qquad P_0\Big( \sup_{0 \le \theta \le 1} |B(\theta) - g(\theta)| < \epsilon \Big) > 0 \quad \text{for any } \epsilon > 0$$
Show that if e > 0 and t < then Po (sup 0sup I B! I < e) > 0 i � •9 Ingetdoing this youresult mayfrom find this Exercise 2.5 helpful. In (5.4) of Chapter 5 we wiil the general one by change of measure. Can the reader find a simple ·direct proof of (*)? With our discussion of Blumenthal's 0-1 law complete, the distinction be tween :Ff and :F: is no longer important, so we will make one final improvement in our u-fields and remove the superscripts. Let N:c = {A : A B with P:c(B) = 0} :F: = u(:Ff U N:c) oo
Exercise 2. 11.
c
1 .2
Markov Property, Blumenthal's 0-1 Law
17
$$\mathcal{F}_s = \cap_x \mathcal{F}_s^x$$
$N_x$ are the null sets and $\mathcal{F}_s^x$ are the completed σ-fields for $P_x$. Since we do not want the filtration to depend on the initial state we take the intersection of all the completed σ-fields. This technicality will be mentioned at one point in the next section but can otherwise be ignored.

(2.7) concerns the behavior of $B_t$ as $t \to 0$. By using a trick we can use this result to get information about the behavior as $t \to \infty$.

(2.11) Theorem. If $B_t$ is a Brownian motion starting at 0 then so is the process defined by $X_0 = 0$ and $X_t = t B(1/t)$ for $t > 0$.

Proof By (1.2) it suffices to prove the result in one dimension. We begin by observing that the strong law of large numbers implies $X_t \to 0$ as $t \to 0$, so $X$ has continuous paths and we only have to check that $X$ has the right f.d.d.'s. By the second definition of Brownian motion, it suffices to show that (i) if $0 < t_1 < \ldots < t_n$ then $(X(t_1), \ldots, X(t_n))$ has a multivariate normal distribution with mean 0 (which is obvious) and (ii) if $s < t$ then
$$E(X_s X_t) = st\,E(B(1/s)B(1/t)) = s$$
□

(2.11) allows us to relate the behavior of $B_t$ as $t \to \infty$ to the behavior as $t \to 0$. Combining this idea with Blumenthal's 0-1 law leads to a very useful result. Let
$$\mathcal{F}'_t = \sigma(B_s : s \ge t) = \text{the future after time } t$$
$$\mathcal{T} = \cap_{t \ge 0} \mathcal{F}'_t = \text{the tail σ-field}$$

(2.12) Theorem. If $A \in \mathcal{T}$ then either $P_x(A) \equiv 0$ or $P_x(A) \equiv 1$.

Remark. Notice that this is stronger than the conclusion of Blumenthal's 0-1 law (2.7). The examples $A = \{\omega : \omega(0) \in B\}$ show that for $A$ in the germ σ-field $\mathcal{F}_0^+$ the value of $P_x(A)$ may depend on $x$.

Proof Since the tail σ-field of $B$ is the same as the germ σ-field for $X$, it follows that $P_0(A) \in \{0, 1\}$. To improve this to the conclusion given observe that $A \in \mathcal{F}'_1$, so $1_A$ can be written as $1_B \circ \theta_1$. Applying the Markov property, (2.1), gives
$$P_x(A) = E_x(1_B \circ \theta_1) = E_x(E_x(1_B \circ \theta_1 \mid \mathcal{F}_1)) = E_x(E_{B_1} 1_B) = \int (2\pi)^{-d/2} \exp(-|y - x|^2/2)\, P_y(B)\, dy$$
Taking $x = 0$ we see that if $P_0(A) = 0$ then $P_y(B) = 0$ for a.e. $y$ with respect to Lebesgue measure, and using the formula again shows $P_x(A) = 0$ for all $x$. To handle the case $P_0(A) = 1$ observe that $A^c \in \mathcal{T}$ and $P_0(A^c) = 0$, so the last result implies $P_x(A^c) = 0$ for all $x$. □

The next result is a typical application of (2.12). The argument here is a close relative of the one for (2.8).

(2.13) Theorem. Let $B_t$ be a one dimensional Brownian motion and let $A = \cap_n \{B_t = 0 \text{ for some } t \ge n\}$. Then $P_x(A) = 1$ for all $x$.

In words, one dimensional Brownian motion is recurrent. It will return to 0 "infinitely often," i.e., there is a sequence of times $t_n \uparrow \infty$ so that $B_{t_n} = 0$. We have to be careful with the interpretation of the phrase in quotes since starting from 0, $B_t$ will return to 0 infinitely many times by time $\epsilon > 0$.

Proof We begin by noting that under $P_x$, $B_t/\sqrt{t}$ has a normal distribution with mean $x/\sqrt{t}$ and variance 1, so if we use $\chi$ to denote a standard normal,
$$P_x(B_t < 0) = P_x(B_t/\sqrt{t} < 0) = P(\chi < -x/\sqrt{t})$$
and $\lim_{t \to \infty} P_x(B_t < 0) = 1/2$. If we let $T_0 = \inf\{t : B_t = 0\}$ then the last result and the fact that Brownian paths are continuous imply that for all $x > 0$
$$P_x(T_0 < \infty) \ge 1/2$$
Symmetry implies that the last result holds for $x < 0$, while (2.9) (or the Markov property) covers the last case $x = 0$. Combining the fact that $P_x(T_0 < \infty) \ge 1/2$ for all $x$ with the Markov property shows that
$$P_x(B_t = 0 \text{ for some } t \ge n) = E_x(P_{B_n}(T_0 < \infty)) \ge 1/2$$
Letting $n \to \infty$ it follows that $P_x(A) \ge 1/2$ but $A \in \mathcal{T}$ so (2.12) implies
$$P_x(B_t = 0 \text{ i.o.}) \equiv 1$$
□

1.3. Stopping Times, Strong Markov Property

We call a random variable $S$ taking values in $[0, \infty]$ a stopping time if for all $t \ge 0$, $\{S < t\} \in \mathcal{F}_t$. To bring this definition to life think of $B_t$ as giving the price of a stock and $S$ as the time we choose to sell it. Then the decision to sell before time $t$ should be measurable with respect to the information known at time $t$.

In the last definition we have made a choice between $\{S < t\}$ and $\{S \le t\}$. This makes a big difference in discrete time but none in continuous time (for a right continuous filtration $\mathcal{F}_t$):

If $\{S \le t\} \in \mathcal{F}_t$ then $\{S < t\} = \cup_n \{S \le t - 1/n\} \in \mathcal{F}_t$.

If $\{S < t\} \in \mathcal{F}_t$ then $\{S \le t\} = \cap_n \{S < t + 1/n\} \in \mathcal{F}_t$.

The first conclusion requires only that $t \to \mathcal{F}_t$ is increasing. The second relies on the fact that $t \to \mathcal{F}_t$ is right continuous. (3.2) and (3.3) below show that when checking something is a stopping time it is nice to know that the two definitions are equivalent.

(3.1) Theorem. If $G$ is an open set and $T = \inf\{t \ge 0 : B_t \in G\}$ then $T$ is a stopping time.

Proof Since $G$ is open and $t \to B_t$ is continuous, $\{T < t\} = \cup_{q < t} \{B_q \in G\}$ where the union is over all rational $q$, so $\{T < t\} \in \mathcal{F}_t$. Here, we need to use the rationals so we end up with a countable union. □

(3.2) Theorem. If $T_n$ is a sequence of stopping times and $T_n \downarrow T$ then $T$ is a stopping time.

Proof $\{T < t\} = \cup_n \{T_n < t\}$. □

(3.3) Theorem. If $T_n$ is a sequence of stopping times and $T_n \uparrow T$ then $T$ is a stopping time.

Proof $\{T \le t\} = \cap_n \{T_n \le t\}$. □
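(3.4) Theorem. If $K$ is a closed set and $T = \inf\{t \ge 0 : B_t \in K\}$ then $T$ is a stopping time.

Exercise 3.2. Let $S$ be a stopping time and let $S_n = ([2^n S] + 1)/2^n$ where $[x]$ = the largest integer $\le x$. In words, we stop at the first time of the form $k/2^n$ after $S$ (i.e., $> S$). From the verbal description it should be clear that $S_n$ is a stopping time. Prove that it is.

Exercise 3.3. If $S$ and $T$ are stopping times, then $S \wedge T = \min\{S, T\}$, $S \vee T = \max\{S, T\}$, and $S + T$ are also stopping times. In particular, if $t \ge 0$, then $S \wedge t$, $S \vee t$, and $S + t$ are stopping times.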
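The dyadic approximation $S_n = ([2^n S] + 1)/2^n$ is used again in the proof of the strong Markov property, so it may help to see it computed. The following is a minimal sketch, assuming a single hypothetical value `s` of the stopping time on one sample path; the function name `dyadic_upper` is ours, not the book's.

```python
import math


def dyadic_upper(s: float, n: int) -> float:
    """Return S_n = (floor(2^n * s) + 1) / 2^n, the first time k/2^n strictly after s."""
    if math.isinf(s):
        return math.inf  # on {S = infinity} the approximation is also infinite
    return (math.floor(2**n * s) + 1) / 2**n


s = 0.3  # hypothetical value of the stopping time on one sample path
approximations = [dyadic_upper(s, n) for n in range(1, 8)]
for n, s_n in enumerate(approximations, start=1):
    print(n, s_n)
```

Each $S_n$ takes only countably many values, lies strictly above $S$, and decreases to $S$ as $n$ grows, which is what lets the discrete-valued case of an argument be passed to the limit.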
Exercise 3.4. Let $T_n$ be a sequence of stopping times. Show that $\inf_n T_n$, $\limsup_n T_n$, $\liminf_n T_n$ are stopping times.

Our next goal is to state and prove the strong Markov property. To do this, we need to generalize two definitions from Section 1.2. Given a nonnegative random variable $S(\omega)$ we define the random shift $\theta_S$ which "cuts off the part of $\omega$ before $S(\omega)$ and then shifts the path so that time $S(\omega)$ becomes time 0."
$$(\theta_S \omega)(t) = \begin{cases} \omega(S(\omega) + t) & \text{on } \{S < \infty\} \\ \Delta & \text{on } \{S = \infty\} \end{cases}$$
Here $\Delta$ is an extra point we add to $C$ to cover the case in which the whole path gets shifted away. Some authors like to adopt the convention that all functions have $f(\Delta) = 0$ to take care of the second case. However, we will usually explicitly restrict our attention to $\{S < \infty\}$ so that the second half of the definition will not come into play.

The second quantity $\mathcal{F}_S$, "the information known at time $S$," is a little more subtle. We could have defined
$$\mathcal{F}_s = \cap_{\epsilon > 0}\, \sigma(B_{t \wedge (s + \epsilon)}, t \ge 0)$$
so by analogy we could set
$$\mathcal{F}_S = \cap_{\epsilon > 0}\, \sigma(B_{t \wedge (S + \epsilon)}, t \ge 0)$$
The definition we will now give is less transparent but easier to work with.
$$\mathcal{F}_S = \{A : A \cap \{S \le t\} \in \mathcal{F}_t \text{ for all } t \ge 0\}$$
In words, this makes the reasonable demand that the part of $A$ that lies in $\{S \le t\}$ should be measurable with respect to the information available at time $t$. Again we have made a choice between $\le t$ and $< t$ but as in the case of stopping times, this makes no difference and it is useful to know that the two definitions are equivalent.

Exercise 3.5. When $\mathcal{F}_t$ is right continuous, the definition of $\mathcal{F}_S$ is unchanged if we replace $\{S \le t\}$ by $\{S < t\}$.

For practice with the definition of $\mathcal{F}_S$ do

Exercise 3.6. Let $S$ be a stopping time and let $A \in \mathcal{F}_S$. Show that
$$R = \begin{cases} S & \text{on } A \\ \infty & \text{on } A^c \end{cases}$$
is a stopping time.

Exercise 3.7. Let $S$ and $T$ be stopping times.
(i) $\{S < t\}$, $\{S > t\}$, $\{S = t\}$ are in $\mathcal{F}_S$.
(ii) $\{S < T\}$, $\{S > T\}$, and $\{S = T\}$ are in $\mathcal{F}_S$ (and in $\mathcal{F}_T$).

Two properties of $\mathcal{F}_S$ that will be useful below are:

(3.5) Theorem. If $S \le T$ are stopping times then $\mathcal{F}_S \subset \mathcal{F}_T$.

Proof If $A \in \mathcal{F}_S$ then $A \cap \{T \le t\} = (A \cap \{S \le t\}) \cap \{T \le t\} \in \mathcal{F}_t$. □

(3.6) Theorem. If $T_n \downarrow T$ are stopping times then $\mathcal{F}_T = \cap_n \mathcal{F}(T_n)$.

Proof (3.5) implies $\mathcal{F}(T_n) \supset \mathcal{F}_T$ for all $n$. To prove the other inclusion let $A \in \cap_n \mathcal{F}(T_n)$. Since $A \cap \{T_n < t\} \in \mathcal{F}_t$ and $T_n \downarrow T$, it follows that $A \cap \{T < t\} \in \mathcal{F}_t$, so $A \in \mathcal{F}_T$. □

The last result and Exercises 3.2 and 3.7 allow us to prove something that is obvious from the verbal definition.

Exercise 3.8. $B_S \in \mathcal{F}_S$, i.e., the value of $B_S$ is measurable with respect to the information known at time $S$! To prove this let $S_n = ([2^n S] + 1)/2^n$ be the stopping times defined in Exercise 3.2. Show $B(S_n) \in \mathcal{F}_{S_n}$, then let $n \to \infty$ and use (3.6).

The next result goes in the opposite direction from (3.6). Here $\mathcal{G}_n \uparrow \mathcal{G}$ means $n \to \mathcal{G}_n$ is increasing and $\mathcal{G} = \sigma(\cup_n \mathcal{G}_n)$.

Exercise 3.9. Let $S < \infty$ and $T_n$ be stopping times and suppose that $T_n \uparrow \infty$ as $n \uparrow \infty$. Show that $\mathcal{F}_{S \wedge T_n} \uparrow \mathcal{F}_S$ as $n \uparrow \infty$.

We are now ready to state the strong Markov property, which says that the Markov property holds at stopping times.

(3.7) Strong Markov property. Let $(s, \omega) \to Y(s, \omega)$ be bounded and $\mathcal{R} \times \mathcal{C}$ measurable. If $S$ is a stopping time then for all $x \in \mathbf{R}^d$
$$E_x(Y_S \circ \theta_S \mid \mathcal{F}_S) = E_{B(S)} Y_S \quad \text{on } \{S < \infty\}$$
where the right-hand side is $\varphi(y, t) = E_y Y_t$ evaluated at $y = B(S)$, $t = S$.
Remark. In most applications the function that we apply to the shifted path will not depend on $s$ but this flexibility is important in Example 3.3. The verbal description of this equation is much like that of the ordinary Markov property:

"the conditional expectation of $Y \circ \theta_S$ given $\mathcal{F}_S$ is just the expected value of $Y_S$ for a Brownian motion starting at $B_S$."

Proof We first prove the result under the assumption that there is a sequence of times $t_n \uparrow \infty$, so that $P_x(S < \infty) = \sum_n P_x(S = t_n)$. In this case we simply break things down according to the value of $S$, apply the Markov property and put the pieces back together. If we let $Z_n = Y_{t_n}(\omega)$ and $A \in \mathcal{F}_S$ then
$$E_x(Y_S \circ \theta_S ; A \cap \{S < \infty\}) = \sum_{n=1}^{\infty} E_x(Z_n \circ \theta_{t_n} ; A \cap \{S = t_n\})$$
Now if $A \in \mathcal{F}_S$, $A \cap \{S = t_n\} = (A \cap \{S \le t_n\}) - (A \cap \{S \le t_{n-1}\}) \in \mathcal{F}(t_n)$, so it follows from the Markov property that the above sum is
$$= \sum_{n=1}^{\infty} E_x(E_{B(t_n)} Z_n ; A \cap \{S = t_n\}) = E_x(E_{B(S)} Y_S ; A \cap \{S < \infty\})$$
To prove the result in general we let $S_n = ([2^n S] + 1)/2^n$ where $[x]$ = the largest integer $\le x$. In Exercise 3.2 you showed that $S_n$ is a stopping time. To be able to let $n \to \infty$ we restrict our attention to $Y$'s of the form
$$(*) \qquad Y_s(\omega) = f_0(s) \prod_{m=1}^{n} f_m(\omega(t_m))$$
where $0 < t_1 < \ldots < t_n$ and $f_0, \ldots, f_n$ are real valued, bounded and continuous. If $f$ is bounded and continuous then the dominated convergence theorem implies that
$$x \to \int dy\, p_t(x, y) f(y)$$
is continuous. From this and induction it follows that
$$\varphi(x, s) = E_x Y_s = f_0(s) \int dy_1\, p_{t_1}(x, y_1) f_1(y_1) \cdots \int dy_n\, p_{t_n - t_{n-1}}(y_{n-1}, y_n) f_n(y_n)$$
is bounded and continuous. Having assembled the necessary ingredients we can now complete the proof. Let $A \in \mathcal{F}_S$. Since $S \le S_n$, (3.5) implies $A \in \mathcal{F}(S_n)$. Applying the special case of (3.7) proved above to $S_n$ and observing that $\{S_n < \infty\} = \{S < \infty\}$ gives
$$E_x(Y_{S_n} \circ \theta_{S_n} ; A \cap \{S < \infty\}) = E_x(\varphi(B(S_n), S_n) ; A \cap \{S < \infty\})$$
Now as $n \to \infty$, $S_n \downarrow S$, $B(S_n) \to B(S)$, $\varphi(B(S_n), S_n) \to \varphi(B(S), S)$ and
$$Y_{S_n} \circ \theta_{S_n} \to Y_S \circ \theta_S$$
so the bounded convergence theorem implies that (3.7) holds when $Y$ has the form given in (*).

To complete the proof now we use the monotone class theorem, (2.3). Let $\mathcal{H}$ be the collection of bounded functions for which (3.7) holds. Clearly $\mathcal{H}$ is a vector space that satisfies (ii) if $Y_n \in \mathcal{H}$ are nonnegative and increase to a bounded $Y$, then $Y \in \mathcal{H}$. To check (i) now, let $\mathcal{A}$ be the collection of sets of the form $\{\omega : \omega(t_j) \in G_j\}$ where $G_j$ is an open set. If $G$ is open the function $1_G$ is an increasing limit of the continuous functions $f_k(x) = 1 \wedge k\,\mathrm{dist}(x, G^c)$, where $\mathrm{dist}(x, G^c)$ is the distance from $x$ to the complement of $G$, so if $A \in \mathcal{A}$ then $1_A \in \mathcal{H}$. This shows (i) holds and the desired conclusion follows from (2.3). □

Example 3.1. Zeros of Brownian motion. Consider one dimensional Brownian motion, let $R_t = \inf\{u > t : B_u = 0\}$ and let $T_0 = \inf\{u > 0 : B_u = 0\}$. Now (2.13) implies $P_x(R_t < \infty) = 1$, so $B(R_t) = 0$ and the strong Markov property and (2.9) imply
$$P_x(T_0 \circ \theta_{R_t} > 0 \mid \mathcal{F}_{R_t}) = P_0(T_0 > 0) = 0$$
Taking the expected value of the last equation we see that
$$P_x(T_0 \circ \theta_{R_t} > 0 \text{ for some rational } t) = 0$$
From this it follows that with probability one, if a point $u \in Z(\omega) \equiv \{t : B_t(\omega) = 0\}$ is isolated on the left (i.e., there is a rational $t < u$ so that $(t, u) \cap Z(\omega) = \emptyset$) then it is a decreasing limit of points in $Z(\omega)$. This shows that the closed set $Z(\omega)$ has no isolated points and hence must be uncountable. For the last step see Hewitt and Stromberg (1969), page 72.

If we let $|Z(\omega)|$ denote the Lebesgue measure of $Z(\omega)$ then Fubini's theorem implies
$$E_x(|Z(\omega) \cap [0, T]|) = \int_0^T P_x(B_t = 0)\, dt = 0$$
So $Z(\omega)$ is a set of measure zero. □
Example 3.2. Let $G$ be an open set, $x \in G$, let $T = \inf\{t : B_t \notin G\}$, and suppose $P_x(T < \infty) = 1$. Let $A \subset \partial G$, the boundary of $G$, and let $u(x) = P_x(B_T \in A)$. I claim that if we let $\delta > 0$ be chosen so that $D(x, \delta) = \{y : |y - x| < \delta\} \subset G$ and let $S = \inf\{t \ge 0 : B_t \notin D(x, \delta)\}$, then
$$u(x) = E_x u(B_S)$$

Intuition Since $D(x, \delta) \subset G$, $B_t$ cannot exit $G$ without first exiting $D(x, \delta)$ at $B_S$. When $B_S = y$, the probability of exiting $G$ in $A$ is $u(y)$ independent of how $B_t$ got to $y$.

Proof To prove the desired formula, we will apply the strong Markov property, (3.7), to
$$Y = 1_{(B_T \in A)}$$
To check that this leads to the right formula, we observe that since $D(x, \delta) \subset G$, we have $B_T \circ \theta_S = B_T$ and $1_{(B_T \in A)} \circ \theta_S = 1_{(B_T \in A)}$. In words, $\omega$ and the shifted path $\theta_S \omega$ must exit $G$ at the same place.

Since $S \le T$ and we have supposed $P_x(T < \infty) = 1$, it follows that $P_x(S < \infty) = 1$. Using the strong Markov property now gives
$$E_x(1_{(B_T \in A)} \circ \theta_S \mid \mathcal{F}_S) = E_{B(S)} 1_{(B_T \in A)} = u(B_S)$$
Using the definition of $u$, $1_{(B_T \in A)} \circ \theta_S = 1_{(B_T \in A)}$, and taking expected value of the previous display, we have
$$u(x) = E_x 1_{(B_T \in A)} = E_x(1_{(B_T \in A)} \circ \theta_S) = E_x E_x(1_{(B_T \in A)} \circ \theta_S \mid \mathcal{F}_S) = E_x u(B_S)$$
□

Exercise 3.10. Let $G$, $T$, $D(x, \delta)$ and $S$ be as above, but now suppose that $E_y T < \infty$ for all $y \in G$. Let $g$ be a bounded function and let $u(y) = E_y\big(\int_0^T g(B_s)\, ds\big)$. Show that for $x \in G$
$$u(x) = E_x\Big(\int_0^S g(B_s)\, ds\Big) + E_x u(B_S)$$

Our third application shows why we want to allow the function $Y$ that we apply to the shifted path to depend on the stopping time $S$.

Example 3.3. Reflection principle. Let $B_t$ be a one dimensional Brownian motion, let $a > 0$ and let $T_a = \inf\{t : B_t = a\}$. Then
$$(3.8) \qquad P_0(T_a < t) = 2P_0(B_t > a)$$

Intuitive proof We observe that if $B_s$ hits $a$ at some time $s < t$ then the strong Markov property implies that $B_t - B(T_a)$ is independent of what happened before time $T_a$. The symmetry of the normal distribution and $P_0(B_t = a) = 0$ then imply
$$(3.9) \qquad P_0(T_a < t, B_t > a) = \tfrac{1}{2} P_0(T_a < t)$$
Multiplying by 2, then using $\{B_t > a\} \subset \{T_a < t\}$ we have
$$P_0(T_a < t) = 2P_0(T_a < t, B_t > a) = 2P_0(B_t > a)$$

Proof To make the intuitive proof rigorous we only have to prove (3.9). To extract this from the strong Markov property (3.7), we let
$$Y_s(\omega) = \begin{cases} 1 & \text{if } s < t,\ \omega(t - s) > a \\ 0 & \text{otherwise} \end{cases}$$
We do this so that if we let $S = \inf\{s < t : B_s = a\}$ with $\inf \emptyset = \infty$ then
$$Y_S(\theta_S \omega) = 1_{(B_t > a)} \quad \text{on } \{S < \infty\} = \{T_a < t\}$$
If we let $\varphi(x, s) = E_x Y_s$ the strong Markov property implies
$$E_0(Y_S \circ \theta_S \mid \mathcal{F}_S) = \varphi(B_S, S) \quad \text{on } \{S < \infty\} = \{T_a < t\}$$
Now $B_S = a$ on $\{S < \infty\}$ and $\varphi(a, s) = 1/2$ if $s < t$, so
$$P_0(T_a < t, B_t > a) = E_0(1/2 ; T_a < t)$$
which proves (3.9). □

Exercise 3.11. Generalize the proof of (3.9) to conclude that if $u < v \le a$ then
$$(3.10) \qquad P_0(T_a < t, u < B_t < v) = P_0(2a - v < B_t < 2a - u)$$
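The reflection principle has an exact discrete analogue that can be checked by brute force. The sketch below is not the Brownian statement (3.8) itself: it assumes the symmetric simple random walk $S_k$ in place of $B_t$, for which the corresponding identity is $P(\max_{k \le n} S_k \ge a) = 2P(S_n > a) + P(S_n = a)$ for integers $a \ge 1$. The extra term appears because the walk can end exactly at $a$, an event that has probability 0 for Brownian motion. The function name `reflection_check` is ours.

```python
from fractions import Fraction
from itertools import accumulate, product


def reflection_check(n: int, a: int):
    """Enumerate all 2^n walks and return both sides of the discrete reflection identity."""
    hit = 0    # walks whose running maximum reaches level a
    above = 0  # walks that end strictly above a
    at = 0     # walks that end exactly at a
    for steps in product((-1, 1), repeat=n):
        path = list(accumulate(steps))
        if max(path) >= a:
            hit += 1
        if path[-1] > a:
            above += 1
        elif path[-1] == a:
            at += 1
    total = 2**n
    return Fraction(hit, total), Fraction(2 * above + at, total)


lhs, rhs = reflection_check(10, 3)
print(lhs, rhs)
```

Exact rational arithmetic is used so that the two sides can be compared for equality rather than up to sampling error.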
To explain our interest in (3.10), let $M_t = \max_{0 \le s \le t} B_s$ to rewrite it as
$$P_0(M_t > a, u < B_t < v) = P_0(2a - v < B_t < 2a - u)$$
Letting the interval $(u, v)$ shrink to $x$ we see that
$$P_0(M_t > a, B_t = x) = P_0(B_t = 2a - x) = \frac{1}{\sqrt{2\pi t}}\, e^{-(2a - x)^2/2t}$$
Differentiating with respect to $a$ now we get the joint density
$$(3.11) \qquad P_0(M_t = a, B_t = x) = \frac{2(2a - x)}{\sqrt{2\pi t^3}}\, e^{-(2a - x)^2/2t}$$
1.4. First Formulas

We have two interrelated aims in this section: the first to understand the behavior of the hitting times $T_a = \inf\{t : B_t = a\}$ for a one dimensional Brownian motion $B_t$; the second to study the behavior of Brownian motion in the upper half space $H = \{x \in \mathbf{R}^d : x_d > 0\}$. We begin with $T_a$ and observe that the reflection principle (3.8) implies
$$P_0(T_a < t) = 2P_0(B_t > a) = 2\int_a^\infty (2\pi t)^{-1/2} \exp(-x^2/2t)\, dx$$
Here, and until further notice, we are dealing with a one dimensional Brownian motion. To find the probability density of $T_a$, we change variables $x = t^{1/2}a/s^{1/2}$, $dx = -t^{1/2}a/2s^{3/2}\, ds$ to get
$$(4.1) \qquad P_0(T_a < t) = 2\int_t^0 (2\pi t)^{-1/2} \exp(-a^2/2s)\,(-t^{1/2}a/2s^{3/2})\, ds = \int_0^t (2\pi s^3)^{-1/2}\, a \exp(-a^2/2s)\, ds$$
Using the last formula we can compute the distribution of $L = \sup\{t \le 1 : B_t = 0\}$ and $R = \inf\{t \ge 1 : B_t = 0\}$, completing work we started in Exercises 2.3 and 2.4. By (2.5) if $0 < s < 1$ then
$$P_0(L \le s) = \int_{-\infty}^{\infty} p_s(0, x) P_x(T_0 > 1 - s)\, dx = 2\int_0^\infty (2\pi s)^{-1/2} \exp(-x^2/2s) \int_{1-s}^\infty (2\pi r^3)^{-1/2}\, x \exp(-x^2/2r)\, dr\, dx$$
$$= \frac{1}{\pi} \int_{1-s}^\infty (s r^3)^{-1/2} \int_0^\infty x \exp(-x^2(r + s)/2rs)\, dx\, dr = \frac{1}{\pi} \int_{1-s}^\infty (s r^3)^{-1/2}\, \frac{rs}{r + s}\, dr$$
Our next step is to let $t = s/(r + s)$ to convert the integral over $r \in [1 - s, \infty)$ into one over $t \in [0, s]$. $dt = -s/(r + s)^2\, dr$, so to make the calculations easier we first rewrite the integral as
$$= \frac{1}{\pi} \int_{1-s}^\infty \left( \frac{(r + s)^2}{rs} \right)^{1/2} \frac{s}{(r + s)^2}\, dr$$
Changing variables as indicated above and then again with $t = u^2$ we get
$$(4.2) \qquad P_0(L \le s) = \frac{1}{\pi} \int_0^s (t(1 - t))^{-1/2}\, dt = \frac{2}{\pi} \int_0^{\sqrt{s}} (1 - u^2)^{-1/2}\, du = \frac{2}{\pi} \arcsin(\sqrt{s})$$
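The closed form in (4.2) can be checked numerically. The sketch below is an illustration under our own choices (midpoint rule, the step count, and the function names are assumptions, not part of the text): it integrates the middle expression of (4.2), where the substitution $t = u^2$ has already removed the singularity at 0, and compares the result with $\frac{2}{\pi}\arcsin(\sqrt{s})$.

```python
import math


def arcsine_cdf(s: float) -> float:
    """Closed form from (4.2): P_0(L <= s) = (2/pi) * arcsin(sqrt(s))."""
    return (2.0 / math.pi) * math.asin(math.sqrt(s))


def arcsine_cdf_by_quadrature(s: float, steps: int = 20000) -> float:
    """Midpoint rule for (2/pi) * integral of (1 - u^2)^(-1/2) over [0, sqrt(s)]."""
    upper = math.sqrt(s)
    h = upper / steps
    total = 0.0
    for k in range(steps):
        u = (k + 0.5) * h  # midpoint of the k-th subinterval
        total += 1.0 / math.sqrt(1.0 - u * u)
    return (2.0 / math.pi) * total * h


for s in (0.25, 0.5, 0.9):
    print(s, arcsine_cdf(s), arcsine_cdf_by_quadrature(s))
```

The identity `arcsine_cdf(s) + arcsine_cdf(1 - s) == 1` (up to rounding) reflects the symmetry of the distribution of $L$ about 1/2.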
The density of $L$, $P_0(L = t) = \frac{1}{\pi}(t(1 - t))^{-1/2}$ for $0 < t < 1$, is symmetric about 1/2 and blows up near 0 and 1. This is one of two arcsine laws associated with Brownian motion. We will encounter the other one in Section 4.9. The computation for $R$ is much easier and is left to the reader.

Exercise 4.1. Show that the probability density for $R$ is given by
$$P_0(R = 1 + t) = 1/(\pi t^{1/2}(1 + t)) \quad \text{for } t \ge 0$$

Notation. In the last two displays and in what follows we will often write $P(T = t) = f(t)$ as shorthand for $T$ has density function $f(t)$.

As our next application of (4.1) we will compute the distribution of $B_\tau$ where $\tau = \inf\{t : B_t \notin H\}$ and $H = \{z : z_d > 0\}$ and, of course, $B_t$ is a $d$ dimensional Brownian motion. Since the exit time depends only on the last coordinate, it is independent of the first $d - 1$ coordinates and we can compute the distribution of $B_\tau$ by writing for $x, \theta \in \mathbf{R}^{d-1}$, $y \in \mathbf{R}$
$$P_{(x,y)}(B_\tau = (\theta, 0)) = \int_0^\infty ds\, P_{(x,y)}(\tau = s)\,(2\pi s)^{-(d-1)/2} e^{-|x - \theta|^2/2s} = \frac{y}{(2\pi)^{d/2}} \int_0^\infty ds\, s^{-(d+2)/2} e^{-(|x - \theta|^2 + y^2)/2s}$$

.Ta)) = exp (- KVA)
Exercise 4.4. Adapt the argument for cp8 (s)
()
.
=
a
The representation of the Cauchy process given above allows us to see that its sample paths $a \to T_a$ and $s \to C_s$ are very bad.

Exercise 4.5. If $u < v$ then $P_0(a \to T_a \text{ is discontinuous in } (u, v)) = 1$.

Exercise 4.6. If $u < v$ then $P_0(s \to C_s \text{ is discontinuous in } (u, v)) = 1$.

Hint. By independent increments the probabilities in Exercises 4.5 and 4.6 only depend on $v - u$ but then scaling implies that they do not depend on the size of the interval.

The discussion above has focused on how $B_t$ leaves $H$. The rest of the section is devoted to studying where $B_t$ goes before it leaves $H$. We begin with the case $d = 1$.

(4.9) Theorem. If $x, y > 0$, then $P_x(B_t = y, T_0 > t) = p_t(x, y) - p_t(x, -y)$ where
$$p_t(x, y) = (2\pi t)^{-1/2} e^{-(y - x)^2/2t}$$

Proof The proof is a simple extension of the argument we used in Section 1.3 to prove that $P_0(T_a \le t) = 2P_0(B_t \ge a)$. Let $f \ge 0$ with $f(x) = 0$ when $x \le 0$. Clearly
$$E_x(f(B_t) ; T_0 > t) = E_x f(B_t) - E_x(f(B_t) ; T_0 \le t)$$
If we let $\bar{f}(x) = f(-x)$, then it follows from the strong Markov property and symmetry of Brownian motion that
$$E_x(f(B_t) ; T_0 \le t) = E_x[E_0 f(B_{t - T_0}) ; T_0 \le t] = E_x[E_0 \bar{f}(B_{t - T_0}) ; T_0 \le t] = E_x(\bar{f}(B_t) ; T_0 \le t) = E_x \bar{f}(B_t)$$
since $\bar{f}(y) = 0$ for $y \ge 0$. Combining this with the first equality shows
$$E_x(f(B_t) ; T_0 > t) = E_x f(B_t) - E_x \bar{f}(B_t) = \int (p_t(x, y) - p_t(x, -y)) f(y)\, dy$$
□

The last formula generalizes easily to $d \ge 2$.

(4.10) Theorem. Let $\tau = \inf\{t : B_t^d = 0\}$. If $x, y \in H$,
$$P_x(B_t = y, \tau > t) = p_t(x, y) - p_t(x, \bar{y})$$
where $\bar{y} = (y_1, \ldots, y_{d-1}, -y_d)$.

Exercise 4.7. Prove (4.10).
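As a consistency check on (4.9), integrating the killed density over $y > 0$ should give $P_x(T_0 > t)$, which by the reflection principle (3.8) and symmetry equals $1 - 2P_0(B_t > x) = \mathrm{erf}(x/\sqrt{2t})$. The sketch below verifies this numerically; the midpoint rule, the truncation of the integral at $x + 10\sqrt{t}$, and all function names are our assumptions.

```python
import math


def p(t: float, x: float, y: float) -> float:
    """One dimensional Brownian transition density p_t(x, y)."""
    return math.exp(-((y - x) ** 2) / (2.0 * t)) / math.sqrt(2.0 * math.pi * t)


def killed_density(t: float, x: float, y: float) -> float:
    """Density from (4.9) of B_t on {T_0 > t}, for x, y > 0."""
    return p(t, x, y) - p(t, x, -y)


def survival_by_integration(t: float, x: float, steps: int = 20000) -> float:
    """Midpoint rule for the integral of the killed density over (0, x + 10 sqrt(t))."""
    upper = x + 10.0 * math.sqrt(t)  # the tail beyond this point is negligible
    h = upper / steps
    return h * sum(killed_density(t, x, (k + 0.5) * h) for k in range(steps))


def survival_by_reflection(t: float, x: float) -> float:
    """P_x(T_0 > t) = 1 - 2 P_0(B_t > x) = erf(x / sqrt(2 t))."""
    return math.erf(x / math.sqrt(2.0 * t))


for t, x in ((1.0, 0.5), (2.0, 1.0)):
    print(t, x, survival_by_integration(t, x), survival_by_reflection(t, x))
```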
2 Stochastic Integration
In this chapter we will define our stochastic integral $I_t = \int_0^t H_s\, dX_s$. To motivate the developments, think of $X_s$ as being the price of a stock at time $s$ and $H_s$ as the number of shares we hold, which may be negative (selling short). The integral $I_t$ then represents the net profits at time $t$, relative to our wealth at time 0. To check this note that the infinitesimal rate of change of the integral $dI_t = H_t\, dX_t$ = the rate of change of the stock times the number of shares we hold.

In the first section we will introduce the integrands $H_s$, the "predictable processes," a mathematical version of the notion that the number of shares held must be based on the past behavior of the stock and not on the future performance. In the second section, we will introduce our integrators $X_s$, the "continuous local martingales." Intuitively, martingales are fair games, while the "local" refers to the fact that we reduce the integrability requirements to admit a wider class of examples. We restrict our attention to the case of martingales with continuous paths $t \to X_t$ to have a simpler theory.

2.1. Integrands: Predictable Processes
To motivate the class of integrands we consider, we will discuss integration w.r.t. discrete time martingales. Here, we will assume that the reader is familiar with the basics of martingale theory, as taught for example in Chapter 4 of Durrett (1995). However, we will occasionally present results whose proofs can be found there.

Let $X_n$, $n \ge 0$, be a martingale w.r.t. $\mathcal{F}_n$. If $H_n$, $n \ge 1$, is any process, we can define
$$(H \cdot X)_n = \sum_{m=1}^{n} H_m (X_m - X_{m-1})$$
To motivate the last formula and the restriction we are about to place on the $H_m$, we will consider a concrete example. Let $\xi_1, \xi_2, \ldots$ be independent with $P(\xi_i = 1) = P(\xi_i = -1) = 1/2$, and let $X_n = \xi_1 + \cdots + \xi_n$. $X_n$ is the symmetric simple random walk and is a martingale with respect to $\mathcal{F}_n = \sigma(\xi_1, \ldots, \xi_n)$.

If we consider a person flipping a fair coin and betting \$1 on heads each time then $X_n$ gives their net winnings at time $n$. Suppose now that the person bets an amount $H_m$ on heads at time $m$ (with $H_m < 0$ interpreted as a bet of $-H_m$ on tails). I claim that $(H \cdot X)_n$ gives her net winnings at time $n$. To check this note that if $H_m > 0$ our gambler wins her bet at time $m$ and increases her fortune by $H_m$ if and only if $X_m - X_{m-1} = 1$.

The gambling interpretation of the stochastic integral suggests that it is natural to let the amount bet at time $n$ depend on the outcomes of the first $n - 1$ flips but not on the flip we are betting on, or on later flips. A process $H_n$ that has $H_n \in \mathcal{F}_{n-1}$ for all $n \ge 1$ (here $\mathcal{F}_0 = \{\emptyset, \Omega\}$, the trivial σ-field) is said to be predictable since its value at time $n$ can be predicted (with certainty) at time $n - 1$. The next result shows that we cannot make money by gambling on a fair game.

(1.1) Theorem. Let $X_n$ be a martingale. If $H_n$ is predictable and each $H_n$ is bounded, then $(H \cdot X)_n$ is a martingale.

Proof It is easy to check that $(H \cdot X)_n \in \mathcal{F}_n$. The boundedness of the $H_n$ implies $E|(H \cdot X)_n| < \infty$ for each $n$. With this established, we can compute conditional expectations to conclude
$$E((H \cdot X)_{n+1} \mid \mathcal{F}_n) = (H \cdot X)_n + E(H_{n+1}(X_{n+1} - X_n) \mid \mathcal{F}_n) = (H \cdot X)_n + H_{n+1} E(X_{n+1} - X_n \mid \mathcal{F}_n) = (H \cdot X)_n$$
since $H_{n+1} \in \mathcal{F}_n$ and $E(X_{n+1} - X_n \mid \mathcal{F}_n) = 0$. □

The last theorem can be interpreted as: you can't make money by gambling on a fair game. This conclusion does not hold if we only assume that $H_n$ is optional, that is, $H_n \in \mathcal{F}_n$, since then we can base our bet on the outcome of the coin we are betting on.

Example 1.1. If $X_n$ is the symmetric simple random walk considered above and $H_n = \xi_n$ then
$$(H \cdot X)_n = \sum_{m=1}^{n} \xi_m \cdot \xi_m = n$$
since $\xi_m^2 = 1$. □
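The contrast between (1.1) and Example 1.1 can be verified exactly by enumerating all coin sequences. The sketch below is an illustration with hypothetical strategy names: `double_after_loss` is predictable because its bet on flip $m$ looks only at $\xi_1, \ldots, \xi_{m-1}$, while `peek` is the optional choice $H_m = \xi_m$ from Example 1.1.

```python
from fractions import Fraction
from itertools import product


def stochastic_sum(H, X):
    """(H . X)_n = sum over m of H_m (X_m - X_{m-1}); H[m-1] is the bet on flip m."""
    return sum(h * (X[m + 1] - X[m]) for m, h in enumerate(H))


def walk(xi):
    """Partial sums X_0 = 0, X_m = xi_1 + ... + xi_m."""
    X = [0]
    for step in xi:
        X.append(X[-1] + step)
    return X


def expected_winnings(n, strategy):
    """Average (H . X)_n over all 2^n equally likely coin sequences."""
    total = 0
    for xi in product((-1, 1), repeat=n):
        H = [strategy(m, xi) for m in range(1, n + 1)]
        total += stochastic_sum(H, walk(xi))
    return Fraction(total, 2**n)


def double_after_loss(m, xi):
    """Predictable: the bet on flip m uses only xi_1, ..., xi_{m-1}."""
    losses = 0
    for step in reversed(xi[: m - 1]):
        if step != -1:
            break
        losses += 1
    return 2**losses


def peek(m, xi):
    """Optional but not predictable: the bet on flip m looks at xi_m itself."""
    return xi[m - 1]


print(expected_winnings(6, double_after_loss))  # mean of a martingale started at 0
print(expected_winnings(6, peek))               # every path wins every bet
```

With six flips the predictable doubling strategy has expected winnings 0, as (1.1) requires, while peeking at the current flip yields $n = 6$ on every path.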
In continuous time, we still want the metatheorem "you can't make money gambling on a fair game" to hold, i.e., we want our integrals to be martingales. However, since the present (t) and past ( < t) are not separated, the definition of the class of allowable integrands is more subtle. We will begin by considering a simple example that indicates one problem that must be dealt with.
Example 1.2. Let $(\Omega, \mathcal{F}, P)$ be a probability space on which there is defined a random variable $T$ with $P(T \le t) = t$ for $0 \le t \le 1$ and an independent random variable $\xi$ with $P(\xi = 1) = P(\xi = -1) = 1/2$. Let $X_t =$
{0 e
t H(s, w) as n --> 00 .
o
The distinction between the optional and predictable σ-fields is not important for Brownian motion since in that case the two σ-fields coincide. Our last fact is that, in general, $\Pi \subset \Lambda$.

Exercise 1.1. Show that if $H(s, \omega) = 1_{(a,b]}(s) 1_A(\omega)$ where $A \in \mathcal{F}_a$, then $H$ is the limit of a sequence of optional processes; therefore, $H$ is optional and $\Pi \subset \Lambda$.
2.2. Integrators: Continuous Local Martingales

In Section 2.1 we described the class of integrands that we will consider: the predictable processes. In this section, we will describe our integrators: continuous local martingales. Continuous, of course, means that for almost every $\omega$, the sample path $s \to X_s(\omega)$ is continuous. To define local martingale we need some notation. If $T$ is a nonnegative random variable and $Y_t$ is any process we define
$$Y_t^T = \begin{cases} Y_{T \wedge t} & \text{on } \{T > 0\} \\ 0 & \text{on } \{T = 0\} \end{cases}$$

(2.1) Definition. $X_t$ is said to be a local martingale (w.r.t. $\{\mathcal{F}_t, t \ge 0\}$) if there are stopping times $T_n \uparrow \infty$ so that $X_t^{T_n}$ is a martingale (w.r.t. $\{\mathcal{F}_{t \wedge T_n} : t \ge 0\}$). The stopping times $T_n$ are said to reduce $X$.

We need to set $X_t^T \equiv 0$ on $\{T = 0\}$ to deal with the fact that $X_0$ need not be integrable. In most of our concrete examples $X_0$ is a constant and we can take $T_1 > 0$ a.s. However, the more general definition is convenient in a number of situations: for example (i) below and the definition of the variance process in (3.1).

In the same way we can define local submartingale, locally bounded, locally of bounded variation, etc. In general, we say that a process $Y$ is locally A if there is a sequence of stopping times $T_n \uparrow \infty$ so that the stopped process $Y_t^{T_n}$ has property A. (Of course, strictly speaking this means we should say locally a submartingale but we will continue to use the other term.)
Why local martingales?

There are several reasons for working with local martingales rather than with martingales.

(i) This frees us from worrying about integrability. For example, let $X_t$ be a martingale with continuous paths, and let $\varphi$ be a convex function. Then $\varphi(X_t)$ is always a local submartingale (see Exercise 2.3). However, we can conclude that $\varphi(X_t)$ is a submartingale only if we know $E|\varphi(X_t)| < \infty$ for each $t$, a fact that may be either difficult to check or false in some cases.

(ii) Often we will deal with processes defined on a random time interval $[0, \tau)$. If $\tau < \infty$, then the concept of martingale is meaningless, since for large $t$, $X_t$ is not defined on the whole space. However, it is trivial to define a local martingale: there are stopping times $T_n \uparrow \tau$ so that . . .

(iii) Since most of our theorems will be proved by introducing stopping times $T_n$ to reduce the problem to a question about nice martingales, the proofs are no harder for local martingales defined on a random time interval than for ordinary martingales.

Reason (iii) is more than just a feeling. There is a construction that makes it almost a theorem. Let $X_t$ be a local martingale defined on $[0, \tau)$ and let $T_n \uparrow \tau$ be a sequence of stopping times that reduces $X$. Let $T_0 = 0$, suppose $T_1 > 0$ a.s., and for $k \ge 1$ let
$$\gamma(t) = \begin{cases} t - (k - 1) & T_{k-1} + (k - 1) \le t \le T_k + (k - 1) \\ T_k & T_k + (k - 1) \le t \le T_k + k \end{cases}$$

(2.2) Theorem. $\{X_{\gamma(t)}, \mathcal{F}_{\gamma(t)}, t \ge 0\}$ is a martingale.

First we need to show:

The next step is to take conditional expectation with respect to $\mathcal{F}_s$ and use the first formula with $r = t_{k(s)}$ and $t = t_{k(s)+1}$. To neaten up the result, define a sequence $u_n$, $k(s) - 1 \le n \le k(t) + 1$ by letting $u_{k(s)-1} = s$, $u_i = t_i$ for $k(s) \le i \le k(t)$, and $u_{k(t)+1} = t$.
E(Xf - Qf"(X)!:F.)
where the second equality follows from (3.4). □

Looking at (a) and noticing that $Q_t^\Delta(X)$ is increasing except for the last term, the reader can probably leap to the conclusion that we will construct the process $A_t$ desired in (3.1) by taking a sequence of partitions with mesh going to 0 and proving that the limit exists. Lemma (c) is the heart (of darkness) of the proof. To prepare for its proof we note that using (a), taking expected value and using (3.4) gives
$$(b) \qquad E(Q_t^\Delta(X) - Q_s^\Delta(X) \mid \mathcal{F}_s) = E(X_t^2 - X_s^2 \mid \mathcal{F}_s) = E((X_t - X_s)^2 \mid \mathcal{F}_s)$$
To inspire you for the proof of (c), we promise that once (c) is established the rest of the proof of (3.1) is routine.

(c) Lemma. Let $X_t$ be a bounded continuous martingale. Fix $r > 0$ and let $\Delta_n$ be a sequence of partitions $0 = t_0^n < \ldots < t_{k_n}^n = r$ of $[0, r]$ with mesh $|\Delta_n| = \sup_k |t_k^n - t_{k-1}^n| \to 0$. Then $Q_r^{\Delta_n}(X)$ converges to a limit in $L^2$.
(3.5)
t -+
for any real numbers
where $j(k) = \sup\{j : t_j \le s_k\}$. By the Cauchy-Schwarz inequality
$$E\big(Q_r^{\Delta\Delta'}(Q^\Delta)\big) \le \big\{E\, Q_r^{\Delta\Delta'}(X)^2\big\}^{1/2} \Big\{E \sup_k \big(X_{s_{k+1}} + X_{s_k} - 2X_{t_{j(k)}}\big)^4\Big\}^{1/2}$$
Whenever $|\Delta| + |\Delta'| \to 0$ the second factor goes to 0, since $X$ is bounded and continuous, so it is enough to prove

(e) If $|X_t| \le M$ for all $t$ then $E\, Q_r^{\Delta\Delta'}(X)^2 \le 12M^4$.

The value 12 is not important but it does make it clear that the bound does not depend on $r$, $\Delta$, or $\Delta'$.

Proof of (e)
If r = ��I is the partition 0 = so
Q;(x )' =
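$2a^2 + 2b^2 - (a + b)^2 = (a - b)^2 \ge 0$) we have
$$Q_r^{\Delta\Delta'}(Y) \le 2Q_r^{\Delta\Delta'}(Q^\Delta) + 2Q_r^{\Delta\Delta'}(Q^{\Delta'})$$
Combining the last two results about $Q$ we see that it is enough to prove that
$$E\, Q_r^{\Delta\Delta'}(Q^\Delta) \to 0 \quad \text{as } |\Delta| + |\Delta'| \to 0$$
To do this let $s_k \in \Delta\Delta'$ and $t_j \in \Delta$ so that $t_j \le s_k < s_{k+1} \le t_{j+1}$. Recalling the definition of $Q_r^\Delta(X)$ and doing a little arithmetic we have
$$Q_{s_{k+1}}^\Delta - Q_{s_k}^\Delta = (X_{s_{k+1}} - X_{t_j})^2 - (X_{s_k} - X_{t_j})^2$$
$$= (X_{s_{k+1}} - X_{s_k})^2 + 2(X_{s_{k+1}} - X_{s_k})(X_{s_k} - X_{t_j})$$
$$= (X_{s_{k+1}} - X_{s_k})(X_{s_{k+1}} + X_{s_k} - 2X_{t_j})$$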
(f,
(x - x, _ _ , )' ••
)
'
O) - (X)TnAt is � so Xl - (X)t is a local a martmgale martingale. o
0
0
End of Existence Proof
Finally, we have the following extension of (c) that will be useful later.

(3.8) Theorem. Let $X_t$ be a continuous local martingale. For every $t$ and sequence of subdivisions $\Delta_n$ of $[0, t]$ with mesh $|\Delta_n| \to 0$ we have
$$\sup_{s \le t} |Q_s^{\Delta_n}(X) - \langle X \rangle_s| \to 0 \quad \text{in probability}$$

Proof Let $\delta, \epsilon > 0$. We can find a stopping time $S$ so that $X_t^S$ is a bounded martingale and $P(S \le t) \le \delta$. It is clear that $Q^\Delta(X)$ and $Q^\Delta(X^S)$ coincide on $[0, S]$. From the definition and (3.7) it follows that $\langle X \rangle$ and $\langle X^S \rangle$ are equal on $[0, S]$. So we have
$$P\Big(\sup_{s \le t} |Q_s^\Delta(X) - \langle X \rangle_s| > \epsilon\Big) \le \delta + P\Big(\sup_{s \le t} |Q_s^\Delta(X^S) - \langle X^S \rangle_s| > \epsilon\Big)$$
Since $X^S$ is a bounded martingale (c) implies that the last term goes to 0 as $|\Delta| \to 0$. □

(3.9) Definition. If $X$ and $Y$ are two continuous local martingales, we let
$$\langle X, Y \rangle_t = \tfrac{1}{4}\big(\langle X + Y \rangle_t - \langle X - Y \rangle_t\big)$$

Remark. If $X$ and $Y$ are random variables with mean zero,
$$\mathrm{cov}(X, Y) = EXY = \tfrac{1}{4}\big(E(X + Y)^2 - E(X - Y)^2\big) = \tfrac{1}{4}\big(\mathrm{var}(X + Y) - \mathrm{var}(X - Y)\big)$$
so it is natural, I think, to call $\langle X, Y \rangle_t$ the covariance of $X$ and $Y$.

Given the definitions of $\langle X \rangle_t$ and $\langle X, Y \rangle_t$ the reader should not be surprised that if for a given partition $\Delta = \{0 = t_0 < t_1 < t_2 \ldots\}$ with $\lim_n t_n = \infty$, we let $k(t) = \sup\{k : t_k < t\}$ and define
$$Q_t^\Delta(X, Y) = \sum_{k=1}^{k(t)} (X_{t_k} - X_{t_{k-1}})(Y_{t_k} - Y_{t_{k-1}}) + (X_t - X_{t_{k(t)}})(Y_t - Y_{t_{k(t)}})$$
then we have

(3.10) Theorem. Let $X$ and $Y$ be continuous local martingales. For every $t$ and sequence of partitions $\Delta_n$ of $[0, t]$ with mesh $|\Delta_n| \to 0$ we have
$$\sup_{s \le t} |Q_s^{\Delta_n}(X, Y) - \langle X, Y \rangle_s| \to 0 \quad \text{in probability}$$

Proof Since
$$Q_s^\Delta(X, Y) = \tfrac{1}{4}\big(Q_s^\Delta(X + Y) - Q_s^\Delta(X - Y)\big)$$
this follows immediately from the definition of $\langle X, Y \rangle$ and (3.8). □

Our next result is useful for computing $\langle X, Y \rangle_t$.

(3.11) Theorem. Suppose $X_t$ and $Y_t$ are continuous local martingales. $\langle X, Y \rangle_t$ is the unique continuous predictable process $A_t$ that is locally of bounded variation, has $A_0 = 0$, and makes $X_t Y_t - A_t$ a local martingale.

Proof From the definition, it is easy to see that
$$X_t Y_t - \langle X, Y \rangle_t = \tfrac{1}{4}\big[(X_t + Y_t)^2 - \langle X + Y \rangle_t - \{(X_t - Y_t)^2 - \langle X - Y \rangle_t\}\big]$$
is a local martingale. To prove uniqueness, observe that if $A_t$ and $A'_t$ are two processes with the desired properties, then $A_t - A'_t = (X_t Y_t - A'_t) - (X_t Y_t - A_t)$ is a continuous local martingale that is locally of bounded variation and hence $\equiv 0$ by (3.3). □

From (3.11) we get several useful properties of the covariance. In what follows, $X$, $Y$ and $Z$ are continuous local martingales. First since $X_t Y_t = Y_t X_t$,
$$\langle X, Y \rangle_t = \langle Y, X \rangle_t$$

Exercise 3.2. $\langle X + Y, Z \rangle_t = \langle X, Z \rangle_t + \langle Y, Z \rangle_t$.

Exercise 3.3. $\langle X - X_0, Z \rangle_t = \langle X, Z \rangle_t$.

Exercise 3.4. If $a, b$ are real numbers then $\langle aX, bY \rangle_t = ab\langle X, Y \rangle_t$. Taking $a = b$, $X = Y$, and noting $\langle X, X \rangle_t = \langle X \rangle_t$ it follows that $\langle aX \rangle_t = a^2 \langle X \rangle_t$.

Exercise 3.5. $\langle X^T, Y^T \rangle = \langle X, Y \rangle^T$

It is also true that
$$\langle X^T, Y \rangle = \langle X, Y \rangle^T$$
Since we are lazy we will wait until Section 2.5 to prove this. We invite the reader to derive this directly from the definitions. The next two exercises prepare for the third which will be important in what follows:
Exercise 3.6. If $X_t$ is a bounded martingale then $X_t^2 - \langle X \rangle_t$ is a uniformly integrable martingale.

Exercise 3.7. If $X$ is a bounded martingale and $S$ is a stopping time then $Y_t = X_{S+t} - X_S$ is a martingale with respect to $\mathcal{F}_{S+t}$ and $\langle Y \rangle_t = \langle X \rangle_{S+t} - \langle X \rangle_S$.

Exercise 3.8. If $S \le T$ are stopping times and $\langle X \rangle_S = \langle X \rangle_T$, then $X$ is constant on $[S, T]$.

Exercise 3.9. Conversely, if $S \le T$ are stopping times and $X$ is constant on $[S, T]$ then $\langle X \rangle_S = \langle X \rangle_T$.
2.4. Integration w.r.t. Bounded Martingales
In this section we will explain how to integrate predictable processes w.r.t. bounded continuous martingales. Once we establish the Kunita-Watanabe inequality in Section 2.5, we will be able to treat a general continuous local martingale in Section 2.6. As in the case of the Lebesgue integral (i) we will begin with the simplest functions and then take limits to get the general case, and (ii) the sequence of steps used in defining the integral will also be useful in proving results later on.
Step 1: Basic Integrands
We say $H(s,\omega)$ is a basic predictable process if $H(s,\omega) = 1_{(a,b]}(s)C(\omega)$ where $C \in \mathcal{F}_a$. Let $\Pi_0$ = the set of basic predictable processes. If $H = 1_{(a,b]}C$ and $X$ is continuous, then it is clear that we should define
$$\int H_s\,dX_s = C(\omega)(X_b(\omega) - X_a(\omega))$$
Here we restrict our attention to integrating over $[0,\infty)$, since once we know how to do that then we can define the integral over $[0,t]$ by
$$(H\cdot X)_t \equiv \int_0^t H_s\,dX_s = \int H_s 1_{[0,t]}(s)\,dX_s$$
To extend our integral we will need to keep proving versions of the next three results. Under suitable assumptions on $H$, $K$, $X$ and $Y$
(a) $(H\cdot X)_t$ is a continuous local martingale
(b) $((H+K)\cdot X)_t = (H\cdot X)_t + (K\cdot X)_t$, $(H\cdot(X+Y))_t = (H\cdot X)_t + (H\cdot Y)_t$
(c) $\langle H\cdot X, K\cdot Y\rangle_t = \int_0^t H_sK_s\,d\langle X,Y\rangle_s$
In this step we will only prove a version of (a). The other two results will make their first appearance in Step 2. Before plunging into the details recall that at the end of Section 2.2 we defined $H_t$ to be bounded if there was an $M$ so that with probability one $|H_t| \le M$ for all $t \ge 0$.
(4.1.a) Theorem. If $X$ is a continuous martingale and $H \in b\Pi_0 = \{H \in \Pi_0 : H \text{ is bounded}\}$, then $(H\cdot X)_t$ is a continuous martingale.
Proof
$$(H\cdot X)_t = \begin{cases} 0 & 0 \le t \le a \\ C(X_t - X_a) & a \le t \le b \\ C(X_b - X_a) & b \le t < \infty \end{cases}$$
From this it is clear that $(H\cdot X)_t$ is continuous, $(H\cdot X)_t \in \mathcal{F}_t$ and $E|(H\cdot X)_t| < \infty$. Since $(H\cdot X)_t$ is constant for $t \notin [a,b]$ we can check the martingale property by considering only $a \le s < t \le b$. (See Exercise 4.1 below.) In this case,
$$E((H\cdot X)_t \mid \mathcal{F}_s) - (H\cdot X)_s = E((H\cdot X)_t - (H\cdot X)_s \mid \mathcal{F}_s) = E(C(X_t - X_s) \mid \mathcal{F}_s) = C\,E(X_t - X_s \mid \mathcal{F}_s) = 0$$
where the last two equalities follow from $C \in \mathcal{F}_s$ and $E(X_t \mid \mathcal{F}_s) = X_s$. $\square$
Exercise 4.1. If $Y_t$ is constant for $t \notin [a,b]$ and $E(Y_t \mid \mathcal{F}_s) = Y_s$ when $a \le s < t \le b$ then $Y_t$ is a martingale.
Step 2: Simple Integrands
We say $H(s,\omega)$ is a simple predictable process and write $H \in \Pi_1$ if $H$ can be written as the sum of a finite number of basic predictable processes. It is easy to see that if $H \in \Pi_1$ then $H$ can be written as
$$H(s,\omega) = \sum_{i=1}^m 1_{(t_{i-1},t_i]}(s)\,C_i(\omega)$$
where $t_0 < t_1 < \ldots < t_m$ and $C_i \in \mathcal{F}_{t_{i-1}}$. In this case we let
$$\int H_s\,dX_s = \sum_{i=1}^m C_i\,(X_{t_i} - X_{t_{i-1}})$$
The representation in the definition of $H$ is not unique, since one may subdivide the intervals into two or more pieces, but it is easy to see that the right-hand side does not depend on the representation of $H$ chosen.
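The definition for simple integrands can be sketched in a few lines of code. This is only an illustration for a single fixed outcome $\omega$: the function name, the deterministic sample path $s \mapsto s^2$ and the numerical values are hypothetical, and the sum is truncated at time $t$ in the way the formula $(H\cdot X)_t = \int H_s 1_{[0,t]}(s)\,dX_s$ suggests.

```python
def simple_integral(times, coeffs, path, t):
    """(H . X)_t for H = sum_i coeffs[i] * 1_{(times[i], times[i+1]]}.

    `path` is a function s -> X_s for one fixed outcome omega, and coeffs[i]
    stands for the value C_i(omega), which must be known at time times[i].
    """
    total = 0.0
    for i, c in enumerate(coeffs):
        left, right = min(times[i], t), min(times[i + 1], t)
        total += c * (path(right) - path(left))
    return total


def path(s):
    """A hypothetical continuous path, used for illustration only."""
    return s * s


coarse = simple_integral([0.0, 1.0, 2.0], [2.0, 3.0], path, t=1.5)
# Subdividing (1, 2] into (1, 1.25] and (1.25, 2] changes the representation
# of H but not the value of the integral.
fine = simple_integral([0.0, 1.0, 1.25, 2.0], [2.0, 3.0, 3.0], path, t=1.5)
```

Both calls return $2(1 - 0) + 3(2.25 - 1) = 5.75$, which is the independence of representation noted above.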
(4.2.b) Theorem. Suppose $X$ and $Y$ are continuous martingales. If $H, K \in \Pi_1$ then
$$((H+K)\cdot X)_t = (H\cdot X)_t + (K\cdot X)_t$$
$$(H\cdot(X+Y))_t = (H\cdot X)_t + (H\cdot Y)_t$$
Proof Let $H^1 = H$ and $H^2 = K$. By subdividing the intervals in the definitions we can suppose $H^j_s = \sum_{i=1}^m 1_{(t_{i-1},t_i]}(s)\,C_i^j$ for $j = 1, 2$. In this case it is
v�riation it defines a 0"-finite signed measure, so we fix an w and then integrate With respect to that measure. In the case under consideration, the integrand is piecewise constant so the integral is trivial. Proof To prove these results it suffices to show that
is a martingale, for then the first result follows from (3.11), the second by taking expected values, and the third formula follows from the second by taking H = J{ and X = Y. To prove Zt is a martingale we begin by noting that
((H1 + H 2 ) X)t(K Y) t = (H1 · X)t(I< · Y)1 + (H 2 X)t (I< · Y)t t t [t (Hi + H;)Ksd(X, Y}s = [ Hi I 0 for Z = X, Y, X + Y . (4.2.b) implies that (Hn · (X + Y))t = (Hn · X)t + (Hn · Y)t Now let n --> and use ( Hn · Z)t --> (H · Z)t for Z = X, Y, X + Y . If X , Y are bounded continuous martingales, H E IT 2 (X), K(5.4)E ITTheorem. then (Y) 2 t (H · X , K · Y}t = 1 H,K,d(X, Y} .
D
oo
Proof
By (3.11 ) , it suffices to show that
Hn and f{n be sequences of elements of bll1 that converge respectively, and let Zf be the quantity IT2H(X)n andandJ{ITn2 (Y), replace H and J{ in (t) . By ( 4 .2.c) , Zf is a
is a martingale. Let and I< in to that results when
H
Section 2.6 Integration w.r.t. Local Martingales
martingale. By (3.6), we can complete the proof by showing zr; -1- z. in L1. The triangle inequality and (4.3.b) imply that E sup I (Hn
t
· X)t(I N (recall !Xt I $ N and use (11.3)) this proves the result for g and completes the proof of (11.4). o g(Xt) - g(Xo) =
l
1
For the next two results let $S$ be a set, let $\mathcal{S}$ be a $\sigma$-field of subsets of $S$, and let $\mu$ be a finite measure on $(S,\mathcal{S})$. For our purposes it would be enough to take $S = \mathbf{R}$, but it is just as easy to treat a general set. The first step in justifying the interchange of the order of integration is to deal with a measurability issue: if we have a family $H_t^a(\omega)$ of integrands for $a \in S$ then we can define the integrals $\int_0^t H_s^a\,dX_s$ to be a measurable function of $(a,t,\omega)$.
(11.5) Lemma. Let $X$ be a continuous semimartingale with $X_0 = 0$ and let $H_t^a(\omega) = H(a,t,\omega)$ be bounded and $\mathcal{S}\times\Pi$ measurable. Then there is a
Z(a, t,w) E S x II such that for J.L-almost every a Z(a, t, w) is a continuous version of I; H� aX3 • Proof Let Xt = Mt + At be the decomposition of the semimartingale. By stopping we can suppose that I A it � N, !Mt l � N and (X}t = (M}t � N for all t. Let 1-l be the collection of bounded H E S x II for which the conclusion holds. We will check the assumptions of the Monotone Class Theorem, (2.3) in Chapter 1. Clearly 1-l is a vector space. To check (i) of the MCT suppose H(a, t , w) = f(a)I a } dM. and Jf J; 1 {X, > a} dA. then the argument showsassumptions that (a, t) that If isimply continuous, so we will get the desired conclusion ifabove we adopt (a, t) J,a is continuous.
0
2.12. Girsanov's Formula
In this section, we will show that the collection of semimartingales and the definition of the stochastic integral are not affected by a locally equivalent change of measure. For concreteness, we will work on the canonical probability space $(C, \mathcal{C})$, with $\mathcal{F}_t$ the filtration generated by the coordinate maps $X_t(\omega) = \omega_t$. Two measures $Q$ and $P$ defined on a filtration $\mathcal{F}_t$ are said to be locally equivalent if for each $t$ their restrictions to $\mathcal{F}_t$, $Q_t$ and $P_t$, are equivalent, i.e., mutually absolutely continuous. In this case we let $\alpha_t = dQ_t/dP_t$. The reasons for our interest in this quantity will become clear as the story unfolds.
(12.1) Lemma. $Y_t$ is a (local) martingale/$Q$ if and only if $\alpha_tY_t$ is a (local) martingale/$P$.
Proof The parentheses are meant to indicate that the statement is true if the two locals are removed. Let $Y_t$ be a martingale/$Q$, $s < t$ and $A \in \mathcal{F}_s$. Now if $Z \in \mathcal{F}_t$, then (i) $\int Z\,dP = \int Z\,dP_t$, and (ii) $\int Z\alpha_t\,dP_t = \int Z\,dQ_t$. So using (i), (ii), the fact that $Y$ is a martingale/$Q$, (ii), and (i) we have
$$\int_A \alpha_tY_t\,dP = \int_A \alpha_tY_t\,dP_t = \int_A Y_t\,dQ_t = \int_A Y_s\,dQ_s = \int_A \alpha_sY_s\,dP_s = \int_A \alpha_sY_s\,dP$$
This shows $\alpha_tY_t$ is a martingale/$P$. If $Y$ is a local martingale/$Q$ then there is a sequence of stopping times $T_n \uparrow \infty$ so that $Y_{t\wedge T_n}$ is a martingale/$Q$ and hence $\alpha_tY_{t\wedge T_n}$ is a martingale/$P$. The optional stopping theorem implies that $\alpha_{t\wedge T_n}Y_{t\wedge T_n}$ is a martingale/$P$ and it follows that $\alpha_tY_t$ is a local martingale/$P$. To prove the converse, observe that (a) interchanging the roles of $P$ and $Q$ and applying the last result shows that if $\beta_t = dP_t/dQ_t$, and $Z_t$ is a (local) martingale/$P$, then $\beta_tZ_t$ is a (local) martingale/$Q$ and (b) $\beta_t = \alpha_t^{-1}$, so letting $Z_t = \alpha_tY_t$ we have the desired result. $\square$
Since 1 is a martingale/$Q$ we have
(12.2) Corollary. $\alpha_t = dQ_t/dP_t$ is a martingale/$P$.
There is a converse of (12.2) which will be useful in constructing examples.
(12.3) Lemma. Given $\alpha_t$, a nonnegative martingale/$P$, there is a unique locally equivalent probability measure $Q$ so that $dQ_t/dP_t = \alpha_t$.
Proof The last equation in the lemma defines the restriction of $Q$ to $\mathcal{F}_t$ for any $t$. To see that this defines a unique measure on $\mathcal{C}$, let $t_1 < t_2 \ldots < t_n$, let $A_i$ be Borel subsets of $\mathbf{R}^d$, and let $B = \{\omega : \omega(t_i) \in A_i \text{ for } 1 \le i \le n\}$. Define the finite dimensional distributions of a measure $Q$ by setting
$$Q(B) = \int_B \alpha_t\,dP$$
whenever $t \ge t_n$. The martingale property of $\alpha_t$ implies that the finite dimensional distributions are consistent so we have defined a unique measure on $(C,\mathcal{C})$. $\square$
We are ready to prove the main result of the section.
(12.4) Girsanov's formula. If $X$ is a local martingale/$P$ and we let $A_t = \int_0^t \alpha_s^{-1}\,d\langle\alpha,X\rangle_s$, then $X_t - A_t$ is a local martingale/$Q$.
Proof Although the formula for $A$ looks a little strange, it is easy to see that it must be the right answer. If we suppose that $A$ is locally of b.v. and has $A_0 = 0$, then integrating by parts, i.e., using (10.1), and noting $\langle\alpha,A\rangle_t \equiv 0$ since $A$ has bounded variation gives
$$\alpha_t(X_t - A_t) - \alpha_0X_0 = \int_0^t (X_s - A_s)\,d\alpha_s + \int_0^t \alpha_s\,dX_s - \int_0^t \alpha_s\,dA_s + \langle\alpha,X\rangle_t \tag{12.5}$$
At this point, we need the assumption made in Section 2.2 that our filtration only admits continuous martingales, so there is a continuous version of $\alpha$ to which we can apply our integration-by-parts formula. Since $\alpha_s$ and $X_s$ are local martingales/$P$, the first two terms on the right in (12.5) are local martingales/$P$. In view of (12.1), if we want $X_t - A_t$ to be a local martingale/$Q$, we need to choose $A$ so that the sum of the third and fourth terms $\equiv 0$, that is,
$$\int_0^t \alpha_s\,dA_s = \langle\alpha,X\rangle_t$$
From the last equation and the associative law (9.6), it is clear that it is necessary and sufficient that
$$A_t = \int_0^t \alpha_s^{-1}\,d\langle\alpha,X\rangle_s$$
The last detail remaining is to prove that the integral that defines $A_t$ exists. Let $T_n = \inf\{t : \alpha_t \le n^{-1}\}$. If $t \le T_n$, then
[...]
by the Kunita-Watanabe inequality. So if $T = \lim_{n\to\infty} T_n$ then $A_t$ is well defined for $t \le T$. The optional stopping theorem implies that $E\alpha_{t\wedge T_n} = E\alpha_t$, so noting $\alpha_{t\wedge T_n} = \alpha_t$ on $T_n > t$ we have
[...]
Since $\alpha_t \ge 0$ is continuous and $T_n \le T$, it follows that
[...]
Letting $n \to \infty$, we see that $\alpha_t = 0$ a.s. on $\{T \le t\}$, so
[...]
But $P_t$ is equivalent to $Q_t$, so $0 = P_t(T \le t) = P(T \le t)$. Since $t$ is arbitrary $P(T < \infty) = 0$, and the proof is complete. $\square$
(12.4) shows that the collection of semimartingales is not affected by change of measure. Our next goal is to show that if $X$ is a semimartingale/$P$, $Q$ is locally equivalent to $P$, and $H \in \ell b\Pi$, then the integral $(H\cdot X)_t$ is the same under $P$ and $Q$. The first step is
(12.6) Theorem. The quadratic variation $\langle X\rangle_t$ and hence the covariance $\langle X,Y\rangle_t$ is the same under $P$ and $Q$.
Proof The second conclusion follows from the first and (3.9). To prove the first we recall (8.6) shows that if $\Delta_n = \{0 = t_0^n < t_1^n \ldots < t_{k_n}^n = t\}$ is a sequence of partitions of $[0,t]$ with mesh $|\Delta_n| \to 0$ then
$$\sum_i (X_{t_{i+1}^n} - X_{t_i^n})^2 \to \langle X\rangle_t$$
in probability for any semimartingale/$P$. Since convergence in probability is not affected by locally equivalent change of measure, the desired conclusion follows. $\square$
(12.7) Theorem. If $H \in \ell b\Pi$ then $(H\cdot X)_t$ is the same under $P$ and $Q$.
Proof The value is clearly the same for simple integrands. Let $M$ and $N$ be the local martingale parts and $A$ and $B$ be the locally bounded variation parts of $X$ under $P$ and $Q$ respectively. Note that (12.6) implies $\langle M\rangle_t = \langle N\rangle_t$. Let $T_n = \inf\{t : \langle M\rangle_t, |A|_t \text{ or } |B|_t \ge n\}$. If $H \in \ell b\Pi$ and $H_t = 0$ for $t \ge T_n$ then by (4.5) we can find simple $H^m$ so that
[...]
These conditions allow us to pass to the limit in the equality
$$(H^m\cdot(M+A))_t = (H^m\cdot(N+B))_t$$
(the left-hand side being computed under $P$ and the right under $Q$) to conclude that for any $H \in \ell b\Pi$ the integrals under $P$ and $Q$ agree up to time $T_n$. Since $n$ is arbitrary and $T_n \to \infty$ the proof is complete. $\square$
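Lemma (12.1) and Corollary (12.2) can be checked exactly in a discrete-time analogue. The sketch below is not the continuous setting of this section: it assumes a random walk with $\pm 1$ steps, lets $P$ make the steps fair and $Q$ make an up-step have probability $q = 2/3$ (a hypothetical choice), so that $\alpha_n = dQ_n/dP_n$ is a product of one-step likelihood ratios. Under $Q$ the process $Y_n = S_n - n(2q-1)$ is a martingale, and the code verifies by enumerating paths that $\alpha_nY_n$ is a martingale under $P$ while $Y_n$ itself is not.

```python
from fractions import Fraction
from itertools import product

q = Fraction(2, 3)     # probability of an up-step under Q (hypothetical choice)
half = Fraction(1, 2)  # probability of an up-step under P


def alpha(path):
    """alpha_n = dQ_n/dP_n evaluated on a path of +1/-1 steps."""
    a = Fraction(1)
    for step in path:
        a *= (q if step == 1 else 1 - q) / half
    return a


def y(path):
    """Y_n = S_n - n(2q - 1), a martingale under Q."""
    return sum(path) - len(path) * (2 * q - 1)


def is_p_martingale(process, horizon):
    """Check E_P[process_{n+1} | F_n] = process_n on every path of length < horizon."""
    for n in range(horizon):
        for path in product((1, -1), repeat=n):
            up, down = path + (1,), path + (-1,)
            if half * process(up) + half * process(down) != process(path):
                return False
    return True


alpha_y_is_martingale = is_p_martingale(lambda p: alpha(p) * y(p), horizon=6)
alpha_is_martingale = is_p_martingale(alpha, horizon=6)
y_is_p_martingale = is_p_martingale(y, horizon=6)
```

Exact rational arithmetic is used so that the martingale identities can be tested with equality rather than a tolerance.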
3
Brownian Motion, II
In this chapter we will use Ito's formula to deepen our understanding of Brownian motion or, more generally, continuous local martingales.
3.1. Recurrence and Transience
If $B_t$ is a $d$-dimensional Brownian motion and $B_t^i$ is the $i$th component then $B_t^i$ is a martingale with $\langle B^i\rangle_t = t$. If $i \ne j$ Exercise 2.2 in Chapter 1 tells us that $B_t^iB_t^j$ is a martingale, so (3.11) in Chapter 2 implies $\langle B^i,B^j\rangle_t = 0$. Using this information in Ito's formula we see that if $f : \mathbf{R}^d \to \mathbf{R}$ is $C^2$ then
$$f(B_t) - f(B_0) = \sum_{i=1}^d \int_0^t D_if(B_s)\,dB_s^i + \frac12\sum_{i=1}^d \int_0^t D_{ii}f(B_s)\,ds$$
Writing $\nabla f = (D_1f, \ldots, D_df)$ for the gradient of $f$, and $\Delta f = \sum_{i=1}^d D_{ii}f$ for the Laplacian of $f$, we can write the last equation more neatly as
$$f(B_t) - f(B_0) = \int_0^t \nabla f(B_s)\cdot dB_s + \frac12\int_0^t \Delta f(B_s)\,ds \tag{1.1}$$
Here, the dot in the first term stands for the inner product of two vectors and the precise meaning of that is given in the previous equation. Functions with $\Delta f = 0$ are called harmonic. (1.1) shows that if we compose a harmonic function with Brownian motion the result is a local martingale. The next result (and judicious choices of harmonic functions) is the key to deriving properties of Brownian motion from Ito's formula.
(1.2) Theorem. Let $G$ be a bounded open set and $\tau = \inf\{t : B_t \notin G\}$. If $f \in C^2$ and $\Delta f = 0$ in $G$, and $f$ is continuous on the closure of $G$, $\bar G$, then for $x \in G$ we have $f(x) = E_xf(B_\tau)$.
Proof Our first step is to prove that $P_x(\tau < \infty) = 1$. Let $K = \sup\{|x - y| : x, y \in G\}$ be the diameter of $G$. If $x \in G$ and $|B_1 - x| > K$ then $B_1 \notin G$ and $\tau < 1$. Thus
$$P_x(\tau < 1) \ge P_x(|B_1 - x| > K) = P_0(|B_1| > K) = \epsilon_K > 0$$
This shows $\sup_x P_x(\tau \ge k) \le (1 - \epsilon_K)^k$ holds when $k = 1$. To prove the last result by induction on $k$ we observe that if $p_t(x,y)$ is the transition probability for Brownian motion, the Markov property implies
$$P_x(\tau \ge k) \le \int_G p_1(x,y)\,P_y(\tau \ge k-1)\,dy \le (1 - \epsilon_K)^{k-1}P_x(B_1 \in G) \le (1 - \epsilon_K)^k$$
where the last two inequalities follow from the induction assumption and the fact that $|B_1 - x| > K$ implies $B_1 \notin G$. The last result implies that $P_x(\tau < \infty) = 1$ for all $x \in G$ and moreover that
$$\text{for all } 0 < p < \infty, \quad E_x\tau^p < \infty \tag{1.3}$$
To get the last conclusion recall that (see e.g., (5.7) in Chapter 1 of Durrett (1995))
$$E_x\tau^p = \int_0^\infty p\,t^{p-1}\,P_x(\tau > t)\,dt$$
When $\Delta f = 0$ in $G$, (1.1) implies that $f(B_t)$ is a local martingale on $[0,\tau)$. We have assumed that $f$ is continuous on $\bar G$ and $G$ is bounded, so $f$ is bounded and if we apply the time change $\gamma$ defined in Section 2.2, $X_t = f(B_{\gamma(t)})$ is a bounded martingale (with respect to $\mathcal{G}_t = \mathcal{F}_{\gamma(t)}$). Being a bounded martingale $X_t$ converges almost surely to a limit $X_\infty$ which has $X_t = E_x(X_\infty \mid \mathcal{G}_t)$ and hence $E_xX_t = E_xX_\infty$. Since $\tau < \infty$ and $f$ is continuous on $\bar G$, $X_\infty = f(B_\tau)$. Taking $t = 0$ it follows that $f(x) = E_xX_0 = E_xX_\infty = E_xf(B_\tau)$. $\square$
In the rest of this section we will use (1.2) to prove some results concerning the range of Brownian motion $\{B_t : t \ge 0\}$. We start with the one-dimensional case.
(1.4) Theorem. Let $a < x < b$ and $T = \inf\{t : B_t \notin (a,b)\}$.
$$P_x(B_T = a) = \frac{b - x}{b - a} \qquad P_x(B_T = b) = \frac{x - a}{b - a}$$
Proof $f(x) = (b - x)/(b - a)$ has $f'' = 0$ in $(a,b)$, is continuous on $[a,b]$, and has $f(a) = 1$, $f(b) = 0$, so (1.2) implies $f(x) = E_xf(B_T) = P_x(B_T = a)$. $\square$
Exercise 1.1 Deduce the last result by noting that $B_t$ is a martingale and using the optional stopping theorem at time $T$.
Let $T_x = \inf\{t : B_t = x\}$. From (1.4), it follows immediately that
(1.5) Theorem. For all $x$ and $y$, $P_x(T_y < \infty) = 1$.
Proof Since $P_x(T_y < \infty) = P_{x-y}(T_0 < \infty)$, it suffices to prove the result when $y = 0$. A little reflection (pun intended) shows we can also suppose $x > 0$. Now using (1.4) $P_x(T_0 < T_{Mx}) = (M - 1)/M$, and the right-hand side approaches 1 as $M \to \infty$. $\square$
It is trivial to improve (1.5) to conclude that
(1.6) Theorem. For any $s < \infty$, $P_x(B_t = y \text{ for some } t \ge s) = 1$.
Proof By the Markov property,
$$P_x(B_t = y \text{ for some } t \ge s) = E_x\left(P_{B_s}(T_y < \infty)\right) = 1 \qquad \square$$
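The exit probabilities in (1.4) have an exact discrete counterpart that is easy to compute. The sketch below replaces Brownian motion by a simple symmetric random walk on the integers (an assumption made only for this illustration), for which the probability $p(x)$ of reaching $a$ before $b$ satisfies the averaging equation $p(x) = (p(x-1) + p(x+1))/2$, the discrete form of $f'' = 0$. The function name and the choice $a = 0$, $b = 10$ are hypothetical.

```python
def exit_at_a_probabilities(a, b, sweeps=5000):
    """Solve p(x) = (p(x-1) + p(x+1)) / 2 on a < x < b with p(a) = 1, p(b) = 0.

    p(x) is the probability that a simple symmetric random walk started at x
    reaches a before b; the averaging equation is the discrete form of f'' = 0.
    """
    p = {x: 0.0 for x in range(a, b + 1)}
    p[a] = 1.0
    for _ in range(sweeps):
        for x in range(a + 1, b):
            p[x] = 0.5 * (p[x - 1] + p[x + 1])  # Gauss-Seidel relaxation
    return p


a, b = 0, 10
p = exit_at_a_probabilities(a, b)
# Compare with the linear function (b - x)/(b - a) from (1.4).
max_error = max(abs(p[x] - (b - x) / (b - a)) for x in range(a, b + 1))
```

The relaxation converges to the linear function $(b-x)/(b-a)$, so `max_error` is at the level of rounding error.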
The conclusion of (1.6) implies (argue by contradiction) that for any $y$ with probability 1 there is a sequence of times $t_n \uparrow \infty$ (which will depend on the outcome $\omega$) so that $B_{t_n} = y$, a conclusion we will hereafter abbreviate as "$B_t = y$ infinitely often" or "$B_t = y$ i.o." In the terminology of the theory of Markov processes, what we have shown is that one-dimensional Brownian motion is recurrent.
Exercise 1.2 Use (1.6) to conclude
$$\limsup_{t\to\infty} B_t = \infty \qquad \liminf_{t\to\infty} B_t = -\infty$$
In order to study Brownian motion in $d \ge 2$, we need to find some appropriate harmonic functions. In view of the spherical symmetry of Brownian motion, an obvious way to do this is to let $\varphi(x) = f(|x|^2)$ and try to pick $f : \mathbf{R} \to \mathbf{R}$ so that $\Delta\varphi = 0$. We use $|x|^2 = x_1^2 + \cdots + x_d^2$ rather than $|x|$ since it is easier to differentiate:
$$D_if(|x|^2) = f'(|x|^2)\,2x_i$$
$$D_{ii}f(|x|^2) = f''(|x|^2)\,4x_i^2 + 2f'(|x|^2)$$
Therefore, for $\Delta\varphi = 0$ we need
$$0 = \sum_i \left\{f''(|x|^2)\,4x_i^2 + 2f'(|x|^2)\right\}$$
Letting $y = |x|^2$, we can write the above as $4yf''(y) + 2df'(y) = 0$ or, if $y > 0$,
$$f''(y) = \frac{-d}{2y}f'(y)$$
Taking $f'(y) = cy^{-d/2}$ guarantees $\Delta\varphi = 0$ for $x \ne 0$, so by choosing $c$ appropriately we can let
$$\varphi(x) = \begin{cases} \log|x| & d = 2 \\ |x|^{2-d} & d \ge 3 \end{cases}$$
We are now ready to use (1.2) in $d \ge 2$. Let $S_r = \inf\{t : |B_t| = r\}$ and $r < R$. Since $\varphi$ has $\Delta\varphi = 0$ in $G = \{x : r < |x| < R\}$, and is continuous on $\bar G$, (1.2) implies
$$\varphi(x) = E_x\varphi(B_\tau) = \varphi(r)P_x(S_r < S_R) + \varphi(R)(1 - P_x(S_r < S_R))$$
where $\varphi(r)$ is short for the value of $\varphi(x)$ on $\{x : |x| = r\}$. Solving now gives
$$P_x(S_r < S_R) = \frac{\varphi(R) - \varphi(x)}{\varphi(R) - \varphi(r)} \tag{1.7}$$
In $d = 2$, the last formula says
$$P_x(S_r < S_R) = \frac{\log R - \log|x|}{\log R - \log r} \tag{1.8}$$
If we fix $r$ and let $R \to \infty$ in (1.8), the right-hand side goes to 1. So
$$P_x(S_r < \infty) = 1 \quad\text{for any } x \text{ and any } r > 0$$
and repeating the proof of (1.6) shows that
(1.9) Theorem. Two-dimensional Brownian motion is recurrent in the sense that if $G$ is any open set, then $P_x(B_t \in G \text{ i.o.}) = 1$.
If we fix $R$, let $r \to 0$ in (1.8), and let $S_0 = \inf\{t > 0 : B_t = 0\}$, then for $x \ne 0$
$$P_x(S_0 < S_R) \le \lim_{r\to 0} P_x(S_r < S_R) = 0$$
Since this holds for all $R$ and since the continuity of Brownian paths implies $S_R \uparrow \infty$ as $R \uparrow \infty$, we have $P_x(S_0 < \infty) = 0$ for all $x \ne 0$. To extend the last result to $x = 0$ we note that the Markov property implies
$$P_0(B_t = 0 \text{ for some } t \ge \epsilon) = E_0[P_{B_\epsilon}(T_0 < \infty)] = 0$$
for all $\epsilon > 0$, so $P_0(B_t = 0 \text{ for some } t > 0) = 0$, and thanks to our definition of $S_0 = \inf\{t > 0 : B_t = 0\}$, we have
$$P_x(S_0 < \infty) = 0 \quad\text{for all } x \tag{1.10}$$
Thus, in $d \ge 2$ Brownian motion will not hit 0 at a positive time even if it starts there.
Exercise 1.3 Use the continuity of the Brownian path and $P_x(S_0 = \infty) = 1$ to conclude that if $x \ne 0$ then $P_x(S_r \uparrow \infty \text{ as } r \downarrow 0) = 1$.
For $d \ge 3$, formula (1.7) says
$$P_x(S_r < S_R) = \frac{R^{2-d} - |x|^{2-d}}{R^{2-d} - r^{2-d}} \tag{1.11}$$
There is no point in fixing $R$ and letting $r \to 0$, here. The fact that two dimensional Brownian motion does not hit points implies that three dimensional Brownian motion does not hit points and indeed will not hit the line $\{x : x_1 = x_2 = 0\}$. If we fix $r$ and let $R \to \infty$ in (1.11) we get
$$P_x(S_r < \infty) = (r/|x|)^{d-2} < 1 \quad\text{if } |x| > r \tag{1.12}$$
From the last result it follows easily that for $d \ge 3$, Brownian motion is transient, i.e. it does not return infinitely often to any bounded set.
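Formulas (1.7), (1.8), (1.11) and (1.12) are easy to evaluate, and the harmonicity of $\varphi$ away from 0 can be checked with finite differences. The following is a minimal sketch; the function names, the sample points and the step size `h` are hypothetical choices made for illustration.

```python
import math


def phi(radius, d):
    """Radial harmonic function: log|x| in d = 2, |x|^(2-d) in d >= 3."""
    return math.log(radius) if d == 2 else radius ** (2 - d)


def hit_inner_first(x_norm, r, R, d):
    """P_x(S_r < S_R) from (1.7), for r < |x| < R and d >= 2."""
    return (phi(R, d) - phi(x_norm, d)) / (phi(R, d) - phi(r, d))


def norm(x):
    """Euclidean length of a point given as a list of coordinates."""
    return math.sqrt(sum(c * c for c in x))


def laplacian(f, point, h=1e-3):
    """Central-difference approximation to the Laplacian of f at `point`."""
    total = 0.0
    for i in range(len(point)):
        up = list(point)
        down = list(point)
        up[i] += h
        down[i] -= h
        total += (f(up) + f(down) - 2.0 * f(point)) / (h * h)
    return total


lap_2d = laplacian(lambda x: phi(norm(x), 2), [1.0, 2.0])
lap_3d = laplacian(lambda x: phi(norm(x), 3), [1.0, 2.0, 2.0])

p_2d = hit_inner_first(10.0, 1.0, 100.0, d=2)    # (1.8)
p_3d = hit_inner_first(2.0, 1.0, 4.0, d=3)       # (1.11)
p_3d_far = hit_inner_first(2.0, 1.0, 1e12, d=3)  # close to (1.12): (r/|x|)^(d-2)
```

With these inputs (1.8) gives $1/2$, (1.11) gives $1/3$, and the value for very large $R$ is close to $(r/|x|)^{d-2} = 1/2$ from (1.12); both finite-difference Laplacians are near 0.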
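(1.13) Theorem. As $t \to \infty$, $|B_t| \to \infty$ a.s.
Proof Let $A_n = \{|B_t| > n^{1/2} \text{ for all } t \ge S_n\}$ and note that $S_n < \infty$ by (1.3). The strong Markov property and (1.12) imply
$$P_x(A_n^c) = E_x\left(P_{B(S_n)}(S_{n^{1/2}} < \infty)\right) = (n^{1/2}/n)^{d-2} \to 0$$
as $n \to \infty$. Now $\limsup A_n = \cap_{N=1}^\infty \cup_{n=N}^\infty A_n$ has
$$P(\limsup A_n) \ge \limsup P(A_n) = 1$$
So infinitely often the Brownian path never returns to $\{x : |x| \le n^{1/2}\}$ after time $S_n$ and this implies the desired result. $\square$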
Dvoretsky and Erdos (1951) have proved the following result about how fast Brownian motion goes to $\infty$ in $d \ge 3$.
(1.14) Theorem. Suppose $g(t)$ is positive and decreasing. Then
$$P_0(|B_t| \le g(t)\sqrt{t} \text{ i.o. as } t \uparrow \infty) = 1 \text{ or } 0$$
according as $\int^\infty g(t)^{d-2}/t\;dt = \infty$ or $< \infty$.
Here the absence of the lower limit implies that we are only concerned with the behavior of the integral "near $\infty$." A little calculus shows that
$$\int^\infty t^{-1}(\log t)^{-\alpha}\,dt = \infty \text{ or } < \infty$$
according as $\alpha \le 1$ or $\alpha > 1$, so $B_t$ goes to $\infty$ faster than $\sqrt{t}/(\log t)^{\alpha/(d-2)}$ for any $\alpha > 1$. Note that in view of the Brownian scaling relationship $B_t =_d t^{1/2}B_1$ we could not sensibly expect escape at a faster rate than $\sqrt{t}$. The last result shows that the escape rate is not much slower.
Review. At this point, we have derived the basic facts about the recurrence and transience of Brownian motion. What we have found is that
(i) $P_x(|B_t| < 1 \text{ for some } t \ge 0) \equiv 1$ if and only if $d \le 2$
(ii) $P_x(B_t = 0 \text{ for some } t > 0) \equiv 0$ in $d \ge 2$.
The reader should observe that these facts can be traced to properties of what we have called $\varphi$, the (unique up to linear transformations) spherically symmetric function that has $\Delta\varphi(x) = 0$ for all $x \ne 0$, that is:
$$\varphi(x) = \begin{cases} |x| & d = 1 \\ \log|x| & d = 2 \\ |x|^{2-d} & d \ge 3 \end{cases}$$
and the features relevant for (i) and (ii) above are
(i) $\varphi(x) \to \infty$ as $|x| \to \infty$ if and only if $d \le 2$
(ii) $|\varphi(x)| \to \infty$ as $x \to 0$ in $d \ge 2$.
3.2. Occupation Times
Let $D = B(0,r) = \{y : |y| < r\}$ the ball of radius $r$ centered at 0. In Section 3.1, we learned that $B_t$ will return to $D$ i.o. in $d \le 2$ but not in $d \ge 3$. In this section we will investigate the occupation time $\int_0^\infty 1_D(B_t)\,dt$ and show that for any $x$
$$P_x\left(\int_0^\infty 1_D(B_t)\,dt = \infty\right) = 1 \quad\text{in } d \le 2 \tag{2.1}$$
$$E_x\int_0^\infty 1_D(B_t)\,dt < \infty \quad\text{in } d \ge 3 \tag{2.2}$$
Proof of (2.1) Let $T_0 = 0$ and $G = B(0,2r)$. For $k \ge 1$, let
$$S_k = \inf\{t > T_{k-1} : B_t \in D\} \qquad T_k = \inf\{t > S_k : B_t \notin G\}$$
Writing $\tau$ for $T_1$ and using the strong Markov property, we get for $k \ge 1$
[...]
From this and (4.5) in Chapter 1 it follows that
$$\int_{S_k}^{T_k} 1_D(B_t)\,dt \quad\text{are i.i.d.}$$
Since these random variables have positive mean it follows from the strong law of large numbers that
$$\int_0^\infty 1_D(B_t)\,dt \ge \sum_{k=1}^\infty \int_{S_k}^{T_k} 1_D(B_t)\,dt = \infty \quad\text{a.s.}$$
proving the desired result. $\square$
Proof of (2.2) If $f$ is a nonnegative function, then Fubini's theorem implies
$$E_x\int_0^\infty f(B_t)\,dt = \int_0^\infty E_xf(B_t)\,dt = \int_0^\infty\!\!\int p_t(x,y)f(y)\,dy\,dt = \int\!\!\int_0^\infty p_t(x,y)\,dt\,f(y)\,dy$$
where $p_t(x,y) = (2\pi t)^{-d/2}e^{-|x-y|^2/2t}$ is the transition density for Brownian motion. As $t \to \infty$, $p_t(x,y) \sim (2\pi t)^{-d/2}$, so if $d \le 2$ then $\int p_t(x,y)\,dt = \infty$.
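The dichotomy between $d \le 2$ and $d \ge 3$ can be seen numerically by truncating $\int_0^\infty p_t(x,y)\,dt$ at a large upper limit. The sketch below is an illustration under stated assumptions: the trapezoid rule in the variable $v = \log t$, the cutoffs and the step size are hypothetical numerical choices, and the closed form used for comparison in $d = 3$ is the value $\Gamma(d/2 - 1)\,|x-y|^{2-d}/(2\pi^{d/2})$ that appears as (2.3) later in this section.

```python
import math


def transition_density(t, dist, d):
    """p_t(x, y) for d-dimensional Brownian motion with |x - y| = dist."""
    return (2.0 * math.pi * t) ** (-d / 2.0) * math.exp(-dist * dist / (2.0 * t))


def truncated_green(dist, d, upper, log_lower=-12.0, step=0.01):
    """Trapezoid rule for the integral of p_t over (0, upper], in v = log t."""
    n = int(round((math.log(upper) - log_lower) / step))
    h = (math.log(upper) - log_lower) / n
    total = 0.0
    for k in range(n + 1):
        t = math.exp(log_lower + k * h)
        weight = 0.5 if k in (0, n) else 1.0
        total += weight * transition_density(t, dist, d) * t * h  # dt = t dv
    return total


# d = 2: the truncated integral keeps growing like (1/2 pi) log(upper).
growth_2d = truncated_green(1.0, 2, 1e6) - truncated_green(1.0, 2, 1e3)

# d = 3: the integral converges; compare with (2.3) for d = 3, |x - y| = 1.
green_3d = truncated_green(1.0, 3, 1e16)
closed_form_3d = math.gamma(0.5) / (2.0 * math.pi ** 1.5)
```

In $d = 2$ each factor of $10^3$ in the upper limit adds roughly $\log(10^3)/2\pi \approx 1.1$ to the integral, while in $d = 3$ the value settles at $1/(2\pi) \approx 0.159$.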
When $d \ge 3$, changing variables $t = |x - y|^2/2s$ gives
$$\int_0^\infty p_t(x,y)\,dt = \int_0^\infty \frac{1}{(2\pi t)^{d/2}}\,e^{-|y-x|^2/2t}\,dt = \int_0^\infty \left(\frac{s}{\pi|x-y|^2}\right)^{d/2} e^{-s}\,\frac{|x-y|^2}{2s^2}\,ds$$
$$= \frac{|x-y|^{2-d}}{2\pi^{d/2}}\int_0^\infty s^{(d/2)-2}e^{-s}\,ds = \frac{\Gamma(\frac{d}{2} - 1)}{2\pi^{d/2}}\,|x-y|^{2-d} \tag{2.3}$$
where $\Gamma(\alpha) = \int_0^\infty s^{\alpha-1}e^{-s}\,ds$ is the usual gamma function. If we define
G(x, y) = then in d ;:::
(2.4)
1
)
(
)
100 Pt(x, y) dt
To complete the proof of (2.2) now we observe that taking B (O, r) and changing to polar coordinates
j G(x, y)f(y) dy When d =
=
loo E,f(Bt) dt
1, we let at = Pt (O, 0). With this choice, 1 G(x , y) = rn= (e - (y - x )2 / 2t - 1)r 1 12 dt y 27l' 0
1 00
and the integral converges , since the integrand is ::; 0 and ...... -(y - x ) 2 j2t3f2 as t __... oo. Changing variables t = (y - x)2 /2u gives
1oo (e-u - 1) ( (y -2ux)2 ) 1/2 - (y2u2- x)2 du - �;1 1 00 (lu e - & as) u- 312du I Y - xl loo ds e - & joo u - 3/2 0 2 du $ oo - IY - x l l ds e s - 1 / 2 = - I y - x I y 7l' 0 =
(2.5)
j
0
v'2i �
= - --
..fo
- - --c-
f = 1D with D =
--
-s
since
roo ds e - & s- 1 12 = r oo dr e - r 2 f 2 h .!.}2 . ..J 0} C H we have E:c 1T f(Bt) dt = 100 Ex{f(Bt); > t) dt = 100 L (Pt(x,y) - Pt(x, y))f(y)dydt = 100 L (Pt(x,y) - at)f(y)dydt - 100 L (Pt(x, y) - at)f(y)dydt = j G(x, y)f(y) dy- j G(x, y)f(y) dy T
.
-1 · lx - yl
.
Y
T
The compact support of $f$ and the formulas for $G$ imply that $\int |G(x,y)f(y)|\,dy$ and $\int |G(\bar x,y)f(y)|\,dy$ are finite, so the last two equalities are valid. $\square$
The proof given above simplifies considerably in the case $d \ge 3$; however, part of the point of the proof above is that, with the definition we have chosen for $G$ in the recurrent case, the formulas and proofs can be the same for all $d$. Let $G_H(x,y) = G(x,y) - G(\bar x,y)$, where $\bar x$ is the reflection of $x$ through the boundary of $H$. We think of $G_H(x,y)$ as the "expected occupation time (density) at $y$ for a Brownian motion starting at $x$ and killed when it leaves $H$." The rationale for this interpretation is that (for suitable $f$)
$$E_x\int_0^\tau f(B_t)\,dt = \int G_H(x,y)f(y)\,dy$$
With this interpretation for $G_H$ introduced, we invite the reader to pause for a minute and imagine what $y \to G_H(x,y)$ looks like in one dimension. If you don't already know the answer, you will probably not guess the behavior as $y \to \infty$. So much for small talk. The computation is easier than guessing the answer: $G(x,y) = -|x - y|$ so $G_H(x,y) = -|x - y| + |x + y|$. Separating things into cases, we see that
$$G_H(x,y) = \begin{cases} -(x - y) + (x + y) = 2y & \text{when } 0 < y < x \\ -(y - x) + (x + y) = 2x & \text{when } x < y \end{cases} \tag{2.9}$$
It is somewhat surprising that $y \to G_H(x,y)$ is constant $= 2x$ for $y \ge x$, that is, all points $y > x$ have the same expected occupation time!
3.3. Exit Times
In this section we investigate the moments of the exit times $\tau = \inf\{t : B_t \notin G\}$ for various open sets. We begin with $G = \{x : |x| < r\}$ in which case $\tau = S_r$ in the notation of Section 3.1.
(3.1) Theorem. If $|x| \le r$ then $E_xS_r = (r^2 - |x|^2)/d$
Proof The key is the observation that
$$|B_t|^2 - dt = \sum_{i=1}^d \{(B_t^i)^2 - t\}$$
being the sum of $d$ martingales is a martingale. Using the optional stopping theorem at the bounded stopping time $S_r \wedge t$ we have
$$|x|^2 = E_x\{|B_{S_r\wedge t}|^2 - (S_r \wedge t)d\}$$
(1.3) tells us that $E_xS_r < \infty$, and we have $|B_{S_r\wedge t}|^2 \le r^2$, so letting $t \to \infty$ and using the dominated convergence theorem gives $|x|^2 = E_x(r^2 - S_rd)$ which implies the desired result. $\square$
Exercise 3.1. Let $a, b > 0$ and $T = \inf\{t : B_t \notin (-a,b)\}$. Show $E_0T = ab$.
To get more formulas like (3.1) we need more martingales. Applying Ito's formula, (10.2) in Chapter 2, with $X_t^1 = X_t$, a continuous local martingale, and $X_t^2 = \langle X\rangle_t$ we obtain
$$f(X_t,\langle X\rangle_t) - f(X_0,0) = \int_0^t D_1f(X_s,\langle X\rangle_s)\,dX_s + \int_0^t D_2f(X_s,\langle X\rangle_s)\,d\langle X\rangle_s + \frac12\int_0^t D_{11}f(X_s,\langle X\rangle_s)\,d\langle X\rangle_s \tag{3.2}$$
From (3.2) we see that if $(\frac12D_{11} + D_2)f = 0$, then $f(X_t,\langle X\rangle_t)$ is a local martingale. Examples of such functions are
$$f(x,y) = x,\quad x^2 - y,\quad x^3 - 3xy,\quad x^4 - 6x^2y + 3y^2,\ \ldots$$
or to expose the pattern
$$f_n(x,y) = \sum_{0 \le m \le [n/2]} c_{n,m}\,x^{n-2m}y^m$$
where $[n/2]$ denotes the largest integer $\le n/2$, $c_{n,0} = 1$ and for $0 \le m < [n/2]$ we pick
$$\tfrac12\,c_{n,m}(n - 2m)(n - 2m - 1) = -(m+1)\,c_{n,m+1}$$
so that $D_{xx}/2$ of the $m$th term is cancelled by $D_y$ of the $(m+1)$th. ($D_{xx}$ of the $[n/2]$th term is 0.)
The first two of our functions give us nothing new ($X_t$ and $X_t^2 - \langle X\rangle_t$ are local martingales), but after that we get some new local martingales:
$$X_t^3 - 3X_t\langle X\rangle_t,\quad X_t^4 - 6X_t^2\langle X\rangle_t + 3\langle X\rangle_t^2,\ \ldots$$
These local martingales are useful for computing expectations for one dimensional Brownian motion.
(3.3) Theorem. Let $\tau_a = \inf\{t : |B_t| \ge a\}$. Then (i) $E_0\tau_a = a^2$, (ii) $E_0\tau_a^2 = 5a^4/3$.
The dependence of the moments on $a$ is easy to explain: the Brownian scaling relationship $B_{ct} =_d c^{1/2}B_t$ implies that $\tau_a/a^2 =_d \tau_1$.
Proof (i) follows from (3.1), so we will only prove (ii). To do this let $X_t = B_t^4 - 6B_t^2t + 3t^2$ and $T_n \le n$ be stopping times so that $T_n \uparrow \infty$ and $X_{t\wedge T_n}$ is a martingale. Since $T_n \le n$ the optional stopping theorem implies
$$0 = E_0\left\{B_{\tau_a\wedge T_n}^4 - 6B_{\tau_a\wedge T_n}^2(\tau_a \wedge T_n) + 3(\tau_a \wedge T_n)^2\right\}$$
Now $|B_{\tau_a\wedge T_n}| \le a$, so using (1.3) and the dominated convergence theorem we can let $n \to \infty$ to conclude $0 = a^4 - 6a^2E_0\tau_a + 3E_0\tau_a^2$. Using (i) and rearranging gives (ii). $\square$
Exercise 3.2 Find $a$, $b$, $c$ so that $B_t^6 - aB_t^4t + bB_t^2t^2 - ct^3$ is a (local) martingale and use this to compute $E_0\tau_a^3$.
Our next result is a special case of the Burkholder Davis Gundy inequalities, (5.1), but is needed to prove (4.4) which is a key step in our proof of (5.1).
(3.4) Theorem. If $X_t$ is a continuous local martingale with $X_0 = 0$ then
[...]
Proof First suppose that $|X_t|$ and $\langle X\rangle_t$ are $\le M$ for all $t$. In this case (2.5) in Chapter 2 implies $X_t^4 - 6X_t^2\langle X\rangle_t + 3\langle X\rangle_t^2$ is a martingale, so its expectation is 0. Rearranging and using the Cauchy-Schwarz inequality
[...]
Using the $L^4$ maximal inequality ((4.3) in Chapter 4 of Durrett (1995)) and the fact that $(4/3)^4 \le 3.1605 < 19/6$ we have
[...]
Since $|X_s| \le M$ for all $s$ we can divide each side by $E(\sup_{s\le t} X_s^4)^{1/2}$ then square to get
[...]
The last inequality holds for a general martingale if we replace $t$ by $T_n = \inf\{t : t, |X_t|, \text{ or } \langle X\rangle_t \ge n\}$. Using that conclusion, letting $n \to \infty$ and using the monotone convergence theorem, we have the desired result. $\square$
If we notice that $f(x,y) = \exp(x - y/2)$ satisfies $(\frac12D_{11} + D_2)f = 0$, then we get another useful result.
(3.5) The Exponential Local Martingale. If $X$ is a continuous local martingale, then $\mathcal{E}(X)_t = \exp(X_t - \frac12\langle X\rangle_t)$ is a local martingale.
If we let $Y_t = \exp(X_t - \frac12\langle X\rangle_t)$, then (3.2) says that
$$Y_t - Y_0 = \int_0^t Y_s\,dX_s \tag{3.6}$$
or, in stochastic differential notation, that $dY_t = Y_t\,dX_t$. This property gives $Y_t$ the right to be called the martingale exponential of $X_t$. As in the case of the ordinary differential equation
$$f'(t) = f(t)a(t) \qquad f(0) = 1$$
which for a given continuous function $a(t)$ has unique solution $f(t) = \exp(A_t)$, where $A_t = \int_0^t a(s)\,ds$, it is possible to prove (under suitable assumptions) that $Y$ is the only solution of (3.6). See Doleans-Dade (1970) for details.
The exponential local martingale will play an important role in Section 5.3. Then (and now) it will be useful to know when the exponential local martingale is a martingale. The next result is not very sophisticated (see (1.14) and (1.15) in Chapter VIII of Revuz and Yor (1991) for better results) but is enough for our purposes.
(3.7) Theorem. Suppose $X_t$ is a continuous local martingale with $\langle X\rangle_t \le Mt$ and $X_0 = 0$. Then $Y_t = \exp(X_t - \frac12\langle X\rangle_t)$ is a martingale.
Proof Let $Z_t = \exp(2X_t - \frac12\langle 2X\rangle_t)$, which is a local martingale by (3.5). Now,
$$Y_t^2 = \exp(2X_t - \langle X\rangle_t) = Z_t\exp(\langle X\rangle_t)$$
So if $T_n$ is a sequence of times that reduces $Z_t$, the $L^2$ maximal inequality applied to $Y_{t\wedge T_n}$ gives
[...]
Letting $n \uparrow \infty$ and using the monotone convergence theorem we have
[...]
by Jensen's inequality. Using (2.5) in Chapter 2 now, we see that $Y_t$ is a martingale. $\square$
Remark. It follows from the last proof that if $X_t$ is a continuous local martingale with $X_0 = 0$ and $\langle X\rangle_t \le M$ for all $t$ then $Y_t = \exp(X_t - \frac12\langle X\rangle_t)$ is a martingale in $\mathcal{M}^2$.
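The recursion for the coefficients $c_{n,m}$ in the polynomial family $f_n$ of this section is mechanical, so it can be sketched in code. The snippet below computes the coefficients with exact rational arithmetic and then applies $\frac12D_{xx} + D_y$ term by term to confirm that every coefficient of the result vanishes. The function names and the dictionary representation of polynomials are hypothetical choices made for this illustration.

```python
from fractions import Fraction


def coefficients(n):
    """c_{n,0..[n/2]} from (1/2) c_{n,m}(n-2m)(n-2m-1) = -(m+1) c_{n,m+1}."""
    c = [Fraction(1)]
    for m in range(n // 2):
        c.append(-Fraction(1, 2) * c[m] * (n - 2 * m) * (n - 2 * m - 1) / (m + 1))
    return c


def heat_operator(n, c):
    """Coefficients of ((1/2) D_xx + D_y) f_n, keyed by exponents (i, j) of x^i y^j."""
    out = {}
    for m, cm in enumerate(c):
        i = n - 2 * m
        if i >= 2:  # (1/2) D_xx applied to c x^i y^m
            key = (i - 2, m)
            out[key] = out.get(key, 0) + Fraction(1, 2) * cm * i * (i - 1)
        if m >= 1:  # D_y applied to c x^i y^m
            key = (i, m - 1)
            out[key] = out.get(key, 0) + cm * m
    return out


c4 = coefficients(4)  # coefficients of x^4 - 6 x^2 y + 3 y^2
c5 = coefficients(5)
residual4 = heat_operator(4, c4)
residual5 = heat_operator(5, c5)
```

For $n = 4$ this reproduces $x^4 - 6x^2y + 3y^2$, and for $n = 5$ it gives $x^5 - 10x^3y + 15xy^2$; the same call with larger $n$ produces further members of the family.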
Letting $\theta \in \mathbf{R}$ and setting $X_t = \theta B_t$ in (3.6), where $B_t$ is a one dimensional Brownian motion, gives us a family of martingales $\exp(\theta B_t - \theta^2t/2)$. These martingales are useful for computing the distribution of hitting times associated with Brownian motion.
(3.8) Theorem. Let $T_a = \inf\{t : B_t = a\}$. Then for $a > 0$ and $\lambda \ge 0$
$$E_0\exp(-\lambda T_a) = e^{-a\sqrt{2\lambda}}$$
Remark. If you are good at inverting Laplace transforms, you can use (3.8) to prove (4.1) in Chapter 1:
[...]
Proof $P_0(T_a < \infty) = 1$ by (1.5). Let $X_t = \exp(\theta B_t - \theta^2t/2)$ and $S_n \le n$ be stopping times so that $S_n \uparrow \infty$ and $X_{t\wedge S_n}$ is a martingale. Since $S_n \le n$, the optional stopping theorem implies
$$1 = E_0\exp\left(\theta B_{T_a\wedge S_n} - \theta^2(T_a \wedge S_n)/2\right)$$
If $\theta \ge 0$ the right-hand side is $\le \exp(\theta a)$ so letting $n \to \infty$ and using the bounded convergence theorem we have $1 = E_0\exp(\theta a - \theta^2T_a/2)$. Taking $\theta = \sqrt{2\lambda}$ now gives the desired result. $\square$
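Formula (3.8) can be checked numerically against the density of $T_a$. The sketch below assumes the standard form of that density, $a\,(2\pi t^3)^{-1/2}e^{-a^2/2t}$ for $t > 0$, which is not displayed in the surviving text, and integrates $e^{-\lambda t}$ against it with a trapezoid rule in the variable $v = \log t$. The function names, the integration limits and the step size are hypothetical numerical choices.

```python
import math


def hitting_density(t, a):
    """Assumed density of T_a at t > 0: a (2 pi t^3)^(-1/2) exp(-a^2 / 2t)."""
    return a / math.sqrt(2.0 * math.pi * t ** 3) * math.exp(-a * a / (2.0 * t))


def laplace_transform(lam, a, v_min=-8.0, v_max=6.0, step=0.001):
    """Trapezoid rule for E exp(-lam T_a), integrating in the variable v = log t."""
    n = int(round((v_max - v_min) / step))
    total = 0.0
    for k in range(n + 1):
        t = math.exp(v_min + k * step)
        weight = 0.5 if k in (0, n) else 1.0
        total += weight * math.exp(-lam * t) * hitting_density(t, a) * t * step
    return total


numeric = laplace_transform(lam=1.0, a=1.0)
exact = math.exp(-1.0 * math.sqrt(2.0 * 1.0))  # right-hand side of (3.8)
```

For $a = \lambda = 1$ both quantities are approximately $e^{-\sqrt2}$, in agreement with (3.8).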
(3.9) Theorem. Let $\tau_a = \inf\{t : |B_t| \ge a\}$. Then for $a > 0$ and $\lambda \ge 0$,
$$E_0 \exp(-\lambda \tau_a) = 2e^{-a\sqrt{2\lambda}}/(1 + e^{-2a\sqrt{2\lambda}})$$

Proof Let $\psi_a(\lambda) = E_0 \exp(-\lambda T_a)$. Applying the strong Markov property at time $\tau_a$ (and dropping the subscript $a$ to make the formula easier to typeset) gives
$$E_0 \exp(-\lambda T_a) = E_0(\exp(-\lambda\tau) ; B_\tau = a) + E_0(\exp(-\lambda\tau)\,\psi_{2a}(\lambda) ; B_\tau = -a)$$
Symmetry dictates that $(\tau, B_\tau) =_d (\tau, -B_\tau)$. Since $B_\tau \in \{-a, a\}$ it follows that $\tau$ and $B_\tau$ are independent, and we have
$$\psi_a(\lambda) = \tfrac12 \bigl(1 + \psi_{2a}(\lambda)\bigr) E_0 \exp(-\lambda \tau_a)$$
Using the expression for $\psi_a(\lambda)$ given in (3.8) now gives the desired result. $\Box$

Another consequence of (3.8) is a bit of calculus that will come in handy in Section 7.2.

(3.10) Theorem. If $\theta > 0$ then
$$\int_0^\infty \frac{1}{\sqrt{2\pi t}}\, e^{-z^2/2t} e^{-\theta t}\, dt = \frac{1}{\sqrt{2\theta}} \exp(-|z|\sqrt{2\theta})$$

Proof Changing variables $t = 1/2s$, $dt = -ds/2s^2$ the integral above
$$= \int_0^\infty \sqrt{s/\pi}\; e^{-z^2 s} e^{-\theta/2s}\, \frac{ds}{2s^2} = \frac{1}{\sqrt{2\theta}} \int_0^\infty \frac{1}{\sqrt{2\pi s^3}}\, \sqrt{\theta}\, e^{-\theta/2s} e^{-z^2 s}\, ds$$
Using (3.8) now with $a = \sqrt{\theta}$ and $\lambda = z^2$ and consulting the remark after (3.8) for the density function of $T_a$ we see that the last expression is equal to the right-hand side of (3.10). $\Box$

The exponential martingale can also be used to study a one dimensional Brownian motion $B_t$ plus drift. Let $Z_t = \sigma B_t + \mu t$ where $\sigma > 0$ and $\mu$ is real. $X_t = Z_t - \mu t$ is a martingale with $\langle X \rangle_t = \sigma^2 t$ so (3.7) implies that
$$\exp\bigl(\theta(Z_t - \mu t) - \theta^2\sigma^2 t/2\bigr)$$
is a martingale. If $\theta = -2\mu/\sigma^2$ then $-\theta\mu - \theta^2\sigma^2/2 = 0$ and $\exp(-(2\mu/\sigma^2) Z_t)$ is a local martingale. Repeating the proof of (3.8) one gets

Exercise 3.3. Let $T_{-a} = \inf\{t : Z_t = -a\}$. If $a, \mu > 0$ then
$$P_0(T_{-a} < \infty) = \exp(-2\mu a/\sigma^2)$$

3.4. Change of Time, Lévy's Theorem

In this section we will prove Lévy's characterization of Brownian motion (4.1) and use it to show that every continuous local martingale is a time change of Brownian motion.

(4.1) Theorem. If $X_t$ is a continuous local martingale with $X_0 = 0$ and $\langle X \rangle_t = t$, then $X_t$ is a one dimensional Brownian motion.

Proof By (4.5) in Chapter 1 it suffices to show

(4.2) Lemma. For any $s$ and $t$, $X_{s+t} - X_s$ is independent of $\mathcal{F}_s$ and has a normal distribution with mean 0 and variance $t$.

Proof of (4.2) Applying the complex version of Itô's formula, (7.9) in Chapter 2, to $X'_r = X_{s+r} - X_s$ and $f(x) = e^{i\theta x}$, we get
$$e^{i\theta X'_t} - 1 = i\theta \int_0^t e^{i\theta X'_u}\, dX'_u - \frac{\theta^2}{2} \int_0^t e^{i\theta X'_u}\, du$$
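Let $\mathcal{F}'_r = \mathcal{F}_{s+r}$ and let $A \in \mathcal{F}_s = \mathcal{F}'_0$. The first term on the right, which we will call $Y_t$, is a local martingale with respect to $\mathcal{F}'_t$. To get rid of that term let $T_n \uparrow \infty$ be a sequence of stopping times that reduces $Y_t$, replace $t$ by $t \wedge T_n$, and integrate over $A$. The definition of conditional expectation implies
$$E(Y_{t \wedge T_n} ; A) = E\bigl(E(Y_{t \wedge T_n} \mid \mathcal{F}'_0) ; A\bigr) = E(Y_0 ; A) = 0$$
since $Y_0 = 0$. So we have
$$E\bigl(e^{i\theta X'_{t \wedge T_n}} ; A\bigr) - P(A) = -\frac{\theta^2}{2}\, E\Bigl(\int_0^{t \wedge T_n} e^{i\theta X'_u}\, du ; A\Bigr)$$
Since $|e^{i\theta x}| = 1$, letting $n \to \infty$ and using the bounded convergence theorem gives
$$E\bigl(e^{i\theta X'_t} ; A\bigr) - P(A) = -\frac{\theta^2}{2}\, E\Bigl(\int_0^t e^{i\theta X'_u}\, du ; A\Bigr) = -\frac{\theta^2}{2} \int_0^t E\bigl(e^{i\theta X'_u} ; A\bigr)\, du$$
by Fubini's theorem. (The integrand is bounded and the two measures are finite.) Writing $j(t) = E(e^{i\theta X'_t} ; A)$, the last equality says
$$j(t) - P(A) = -\frac{\theta^2}{2} \int_0^t j(u)\, du$$
Since we know that $|j(s)| \le 1$, it follows that $|j(t) - j(u)| \le |t - u|\,\theta^2/2$, so $j$ is continuous and we can differentiate the last equation to conclude $j$ is differentiable with
$$j'(t) = -\frac{\theta^2}{2}\, j(t)$$
Together with $j(0) = P(A)$, this shows that $j(t) = P(A) e^{-\theta^2 t/2}$, or
$$E\bigl(e^{i\theta X'_t} ; A\bigr) = P(A)\, e^{-\theta^2 t/2}$$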
Since this holds for all $A \in \mathcal{F}'_0$ it follows that

(4.3) $\quad E\bigl(e^{i\theta X'_t} \mid \mathcal{F}'_0\bigr) = e^{-\theta^2 t/2}$

or in words, the conditional characteristic function of $X'_t$ is that of the normal distribution with mean 0 and variance $t$. To get from this to (4.2) we first take expected values of both sides to conclude that $X'_t$ has a normal distribution with mean 0 and variance $t$. The fact that the conditional characteristic function is a constant suggests that $X'_t$ is independent of $\mathcal{F}'_0$. To turn this intuition into a proof let $g$ be a $C^1$ function with compact support, and let
$$\varphi(\theta) = \int e^{i\theta x} g(x)\, dx$$
be its Fourier transform. We have assumed more than enough to conclude that $\varphi$ is integrable and hence
$$g(x) = \frac{1}{2\pi} \int e^{i\theta x} \varphi(-\theta)\, d\theta$$
Multiplying each side of (4.3) by $\varphi(-\theta)$ and integrating we see that $E(g(X'_t) \mid \mathcal{F}'_0)$ is a constant and hence
$$E(g(X'_t) \mid \mathcal{F}'_0) = E g(X'_t)$$
A monotone class argument now shows that the last conclusion is true for any bounded measurable $g$. Taking $g = 1_B$ and integrating the last equality over $A \in \mathcal{F}'_0$ we have
$$P(A) P(X'_t \in B) = \int_A E(1_B(X'_t) \mid \mathcal{F}'_0)\, dP = P(A \cap \{X'_t \in B\})$$
by the definition of conditional expectation, and we have proved the desired independence. $\Box$

Exercise 4.1. Suppose $X^i_t$, $1 \le i \le d$ are continuous local martingales with $X_0 = 0$ and
$$\langle X^i, X^j \rangle_t = \begin{cases} t & \text{if } i = j \\ 0 & \text{otherwise} \end{cases}$$
then $X_t = (X^1_t, \ldots, X^d_t)$ is a $d$-dimensional Brownian motion.

An immediate consequence of (4.1) is:

(4.4) Theorem. Every continuous local martingale with $X_0 = 0$ and having $\langle X \rangle_\infty = \infty$ is a time change of Brownian motion. To be precise if we let $\gamma(u) = \inf\{t : \langle X \rangle_t > u\}$ then $B_u = X_{\gamma(u)}$ is a Brownian motion and $X_t = B_{\langle X \rangle_t}$.

Proof Since $\gamma(\langle X \rangle_t) = t$ the second equality is an immediate consequence of the first. To prove that we note Exercise 3.8 of Chapter 2 implies that $u \to B_u$ is continuous, so it suffices to show that $B_u$ and $B_u^2 - u$ are local martingales.
(4.5) Lemma. $B_u$, $\mathcal{F}_{\gamma(u)}$, $u \ge 0$ is a local martingale.

Proof of (4.5) Let $T_n = \inf\{t : |X_t| > n\}$. The optional stopping theorem implies that if $u < v$ then
$$E(X_{\gamma(v) \wedge T_n} \mid \mathcal{F}_{\gamma(u)}) = X_{\gamma(u) \wedge T_n}$$
where we have used Exercise 2.1 in Chapter 2 to replace $\mathcal{F}_{\gamma(u) \wedge T_n}$ by $\mathcal{F}_{\gamma(u)}$. To let $n \to \infty$ we observe that the $L^2$ maximal inequality, the fact that $X^2_{\gamma(v) \wedge T_n} - \langle X \rangle_{\gamma(v) \wedge T_n}$ is a martingale, and the definition of $\gamma(v)$ imply
$$E \sup_n X^2_{\gamma(v) \wedge T_n} \le 4 \sup_n E X^2_{\gamma(v) \wedge T_n} = 4 \sup_n E \langle X \rangle_{\gamma(v) \wedge T_n} \le 4v$$
The last result and the dominated convergence theorem imply that as $n \to \infty$, $X_{\gamma(t) \wedge T_n} \to X_{\gamma(t)}$ in $L^2$ for $t = u, v$. Since conditional expectation is a contraction in $L^2$ it follows that $E(X_{\gamma(v) \wedge T_n} \mid \mathcal{F}_{\gamma(u)}) \to E(X_{\gamma(v)} \mid \mathcal{F}_{\gamma(u)})$ in $L^2$ and the proof is complete. $\Box$

To complete the proof of (4.4) now it remains to show

(4.6) Lemma. $B_u^2 - u$, $\mathcal{F}_{\gamma(u)}$, $u \ge 0$ is a local martingale.

Proof of (4.6) As in the proof of (4.5), the optional stopping theorem implies that if $u < v$ then
$$E\bigl(X^2_{\gamma(v) \wedge T_n} - \langle X \rangle_{\gamma(v) \wedge T_n} \mid \mathcal{F}_{\gamma(u)}\bigr) = X^2_{\gamma(u) \wedge T_n} - \langle X \rangle_{\gamma(u) \wedge T_n}$$
To let $n \to \infty$ we observe that using $(a + b)^2 \le 2a^2 + 2b^2$, (3.4), and the definition of $\gamma(v)$ then
$$E \sup_n \bigl(X^2_{\gamma(v) \wedge T_n} - \langle X \rangle_{\gamma(v) \wedge T_n}\bigr)^2 \le 2E \sup_n X^4_{\gamma(v) \wedge T_n} + 2E \langle X \rangle^2_{\gamma(v)} \le C E \langle X \rangle^2_{\gamma(v)} \le C v^2$$
The proof can now be completed as in (4.5) by using the dominated convergence theorem, and the fact that conditional expectation is a contraction in $L^2$. $\Box$

Our next goal is to extend (4.4) to $X_t$ with $P(\langle X \rangle_\infty < \infty) > 0$. In this case $X_{\gamma(u)}$ is a Brownian motion run for an amount of time $\langle X \rangle_\infty$. The first step in making this precise is to prove

(4.7) Lemma. $\lim_{t \uparrow \infty} X_t$ exists almost surely on $\{\langle X \rangle_\infty < \infty\}$.

Proof Let $T_n = \inf\{t : \langle X \rangle_t \ge n\}$. (3.7) in Chapter 2 implies that
$$\langle X^{T_n} \rangle_t = \langle X \rangle_{t \wedge T_n} \le n$$
Using this with Exercise 4.3 in Chapter 2 we get $X_{t \wedge T_n} \in \mathcal{M}^2$ so $\lim_{t \to \infty} X_{t \wedge T_n}$ exists almost surely and in $L^2$. The last statement shows $\lim_{t \to \infty} X_t$ exists almost surely on $\{T_n = \infty\} \supset \{\langle X \rangle_\infty < n\}$. Letting $n \to \infty$ now gives (4.7). $\Box$

To prove the promised extension of (4.4) now, let $\gamma(u) = \inf\{t : \langle X \rangle_t > u\}$ when $u < \langle X \rangle_\infty$, let $X_\infty = \lim_{t \to \infty} X_t$ on $\{\langle X \rangle_\infty < \infty\}$, let $B_t$ be a Brownian motion which is independent of $\{X_t, t \ge 0\}$, and let
$$Y_u = \begin{cases} X_{\gamma(u)} & u < \langle X \rangle_\infty \\ X_\infty + B(u - \langle X \rangle_\infty) & u \ge \langle X \rangle_\infty \end{cases}$$

(4.8) Theorem. $Y_u$ is a Brownian motion.

Proof By (4.1) it suffices to show that $Y_u$ and $Y_u^2 - u$ are local martingales with respect to the filtration $\sigma(Y_t : t \le u)$. This holds on $[0, \langle X \rangle_\infty]$ for reasons indicated in the proof of (4.4). It holds on $[\langle X \rangle_\infty, \infty)$ because $B$ is a Brownian motion independent of $X$. $\Box$

The reason for our interest in
$\{T < \infty\}$ we have
$$\limsup_{t \uparrow T} (H \cdot X)_t = \infty \qquad \liminf_{t \uparrow T} (H \cdot X)_t = -\infty$$
and there is no reasonable way to continue to define $(H \cdot X)_t$ for $t \ge T$.
Convergence is not the only property of local martingales that can be studied using (4.9). Almost any almost-sure property concerning the Brownian path can be translated into a corresponding result for local martingales. This immediately gives us a number of theorems about the behavior of paths of local martingales. We will state only:

(4.10) Law of the Iterated Logarithm. Let $L(t) = \sqrt{2t \log\log t}$ for $t \ge e$. Then on $\{\langle X \rangle_\infty = \infty\}$,
$$\limsup_{t \to \infty} X_t / L(\langle X \rangle_t) = 1 \quad \text{a.s.}$$

Proof This follows from (4.9) and the result for Brownian motion proved in Section 7.9 of Durrett (1995). $\Box$
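The time change in (4.4) can be illustrated with a small simulation. The sketch below is a hedged example rather than anything taken from the text: it picks a hypothetical deterministic integrand $h(s) = 1 + s$, approximates $X_1 = \int_0^1 h_s\, dB_s$ by a Riemann sum, and compares the sample variance with $\langle X \rangle_1 = \int_0^1 h_s^2\, ds = 7/3$, which is the variance one expects if $X_1 = B_{\langle X \rangle_1}$. The step count, path count, and seed are arbitrary choices.

```python
import math
import random


def h(s):
    """Hypothetical deterministic integrand used only for this illustration."""
    return 1.0 + s


def simulate_integral(t=1.0, n_steps=50, rng=None):
    """Riemann-sum approximation of X_t = int_0^t h(s) dB_s.

    h is nonrandom, so evaluating it at the midpoint of each step is harmless.
    """
    rng = rng or random.Random()
    dt = t / n_steps
    x = 0.0
    for k in range(n_steps):
        increment = rng.gauss(0.0, math.sqrt(dt))  # Brownian increment over one step
        x += h((k + 0.5) * dt) * increment
    return x


def sample_mean_and_variance(n_paths=5000, seed=0):
    """Sample mean and unbiased sample variance of X_1 over independent paths."""
    rng = random.Random(seed)
    xs = [simulate_integral(rng=rng) for _ in range(n_paths)]
    mean = sum(xs) / n_paths
    var = sum((x - mean) ** 2 for x in xs) / (n_paths - 1)
    return mean, var


if __name__ == "__main__":
    mean, var = sample_mean_and_variance()
    print("sample mean:", mean)
    print("sample variance:", var, "target:", 7.0 / 3.0)
```

With a few thousand paths the sample variance should land near 7/3 up to Monte Carlo error; the comparison is a sanity check, not a proof of normality.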
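Finally we have a distributional result that can be derived by time change.

Exercise 4.2. Suppose $h : [0, \infty) \to \mathbf{R}$ is measurable and locally bounded. Use (4.4) to generalize Exercise 6.7 in Chapter 2 and conclude that
$$X_t = \int_0^t h_s\, dB_s \quad \text{is normal with mean 0 and variance} \quad \int_0^t h_s^2\, ds$$

3.5. Burkholder Davis Gundy Inequalities

Let $X_t$ be a local martingale with $X_0 = 0$ and let $X^*_t = \sup_{s \le t} |X_s|$.

(5.1) Theorem. For any $0 < p < \infty$ there are constants $0 < c, C < \infty$ so that
$$c\, E \langle X \rangle_t^{p/2} \le E (X^*_t)^p \le C\, E \langle X \rangle_t^{p/2}$$

Remark. This result should be contrasted with the $L^p$ maximal inequality for martingales that only holds for $1 < p < \infty$.

Proof Lévy's Theorem, (4.4), tells us that any continuous local martingale is a time change of Brownian motion $B_t$, so it suffices to let $B^*_t = \sup_{s \le t} |B_s|$ and show that if $\tau$ is a stopping time then

(5.2) $\quad c\, E \tau^{p/2} \le E (B^*_\tau)^p \le C\, E \tau^{p/2}$

(5.3) Lemma. Let $\beta > 1$ and $\delta > 0$. Then for any $\lambda > 0$
(a) $P(B^*_\tau > \beta\lambda,\ \tau^{1/2} \le \delta\lambda) \le \dfrac{\delta^2}{(\beta - 1)^2}\, P(B^*_\tau > \lambda)$
(b) $P(\tau^{1/2} > \beta\lambda,\ B^*_\tau \le \delta\lambda) \le \dfrac{\delta^2}{\beta^2 - 1}\, P(\tau^{1/2} > \lambda)$

Remark. The inequalities above are called "good $\lambda$" inequalities, although the reason for the name is obscured by our formulation (which is from Burkholder (1973)). The name "good $\lambda$" comes from the fact that early versions of this and similar inequalities (see Theorems 3.1 and 4.1 in Burkholder, Gundy, and Silverstein (1971)) were formulated as $P(f > \lambda) \le C_{\beta,K} P(g > \lambda)$ for all $\lambda$ that satisfy $P(g > \lambda) \le K P(g > \beta\lambda)$. Here $\beta, K > 1$.

Proof It is enough to prove the result for bounded $\tau$ for if the result holds for $\tau \wedge n$ for all $n$, it also holds for $\tau$. Let
$$S_1 = \inf\{t : |B(t \wedge \tau)| > \lambda\}$$
$$S_2 = \inf\{t : |B(t \wedge \tau)| > \beta\lambda\}$$
$$T = \inf\{t : (t \wedge \tau)^{1/2} > \delta\lambda\}$$
Since $B^*_\tau > \beta\lambda$ implies $S_1 < S_2 < \infty$ and $\tau^{1/2} \le \delta\lambda$ implies $T = \infty$ we have
$$P(B^*_\tau > \beta\lambda,\ \tau^{1/2} \le \delta\lambda) \le P\bigl(|B(\tau \wedge S_2 \wedge T) - B(\tau \wedge S_1 \wedge T)| \ge (\beta - 1)\lambda\bigr)$$
$$\le (\beta - 1)^{-2} \lambda^{-2} E\{(B(\tau \wedge S_2 \wedge T) - B(\tau \wedge S_1 \wedge T))^2\}$$
where the second inequality is due to Chebyshev. Now if $R_1 \le R_2$ are bounded stopping times then
$$E(B(R_1) B(R_2)) = E\bigl(B(R_1) E(B(R_2) \mid \mathcal{F}_{R_1})\bigr) = E B(R_1)^2$$
So we have
$$E(B(R_2) - B(R_1))^2 = E B(R_2)^2 - 2E B(R_1) B(R_2) + E B(R_1)^2 = E B(R_2)^2 - E B(R_1)^2 = E(R_2 - R_1)$$
since $B_t^2 - t$ is a martingale. Resuming our first computation we find
$$= (\beta - 1)^{-2} \lambda^{-2} E\{(\tau \wedge S_2 \wedge T) - (\tau \wedge S_1 \wedge T)\} \le (\beta - 1)^{-2} \lambda^{-2} (\delta\lambda)^2 P(S_1 < \infty) = (\beta - 1)^{-2} \delta^2 P(B^*_\tau > \lambda)$$
since $T \wedge \tau \le (\delta\lambda)^2$, proving (a).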
To prove (b) we interchange the roles of $B(t \wedge \tau)$ and $(t \wedge \tau)^{1/2}$ in the first set of definitions and let
$$S_1 = \inf\{t : (\tau \wedge t)^{1/2} > \lambda\}$$
$$S_2 = \inf\{t : (\tau \wedge t)^{1/2} > \beta\lambda\}$$
$$T = \inf\{t : |B(\tau \wedge t)| > \delta\lambda\}$$
Reversing the roles of $B_s$ and $s^{1/2}$ in the proof, it is easy to check that
$$P(\tau^{1/2} > \beta\lambda,\ B^*_\tau \le \delta\lambda) \le P\bigl((\tau \wedge S_2 \wedge T) - (\tau \wedge S_1 \wedge T) \ge (\beta^2 - 1)\lambda^2\bigr)$$
$$\le (\beta^2 - 1)^{-1} \lambda^{-2} E\{(\tau \wedge S_2 \wedge T) - (\tau \wedge S_1 \wedge T)\}$$
and using the stopping time result mentioned in the proof of (a) it follows that the above is
$$= (\beta^2 - 1)^{-1} \lambda^{-2} E\{B(\tau \wedge S_2 \wedge T)^2 - B(\tau \wedge S_1 \wedge T)^2\} \le (\beta^2 - 1)^{-1} \lambda^{-2} (\delta\lambda)^2 P(S_1 < \infty) \le (\beta^2 - 1)^{-1} \delta^2 P(\tau^{1/2} > \lambda)$$
since $|B(\tau \wedge T)|^2 \le (\delta\lambda)^2$, proving (b). $\Box$

It will take one more lemma to extract (5.2) from (5.3). First, we need a definition. A function $\varphi$ is said to be moderately increasing if $\varphi$ is a nondecreasing function with $\varphi(0) = 0$ and if there is a constant $K$ so that $\varphi(2\lambda) \le K \varphi(\lambda)$. It is easy to see that $\varphi(x) = x^p$ is moderately increasing ($K = 2^p$) but $\varphi(x) = e^{ax} - 1$ is not for any $a > 0$. To complete the proof of (5.2) now it suffices to show (take $\beta = 2$ in (5.3) and note $1/3 \le 1$)

(5.4) Lemma. If $X, Y \ge 0$ satisfy $P(X > 2\lambda, Y \le \delta\lambda) \le \delta^2 P(X > \lambda)$ for all $\delta \ge 0$ and $\varphi$ is a moderately increasing function, then there is a constant $C$ that only depends on the growth rate $K$ so that
$$E\varphi(X) \le C E\varphi(Y)$$

Proof It is enough to prove the result for bounded $\varphi$ for if the result holds for $\varphi \wedge n$ for all $n \ge 1$, it also holds for $\varphi$. Now $\varphi$ is the distribution function of a measure on $[0, \infty)$ that has
$$\varphi(h) = \int_0^h d\varphi(\lambda) = \int_0^\infty 1_{(h > \lambda)}\, d\varphi(\lambda)$$
Replacing $h$ by a nonnegative random variable $Z$, taking expectations and using Fubini's theorem gives

(5.5) $\quad E\varphi(Z) = \int_0^\infty P(Z > \lambda)\, d\varphi(\lambda)$

From our assumption it follows that
$$P(X > 2\lambda) = P(X > 2\lambda, Y \le \delta\lambda) + P(X > 2\lambda, Y > \delta\lambda) \le \delta^2 P(X > \lambda) + P(Y > \delta\lambda)$$
Integrating $d\varphi(\lambda)$ and using (5.5) with $Z = X/2$, $X$, $Y/\delta$
$$E\varphi(X/2) \le \delta^2 E\varphi(X) + E\varphi(Y/\delta)$$
Pick $\delta$ so that $K\delta^2 < 1$ and then pick $N \ge 0$ so that $2^N \ge \delta^{-1}$. From the growth condition and the monotonicity of $\varphi$, it follows that
$$E\varphi(Y/\delta) \le E\varphi(2^N Y) \le K^N E\varphi(Y)$$
Combining this with the previous inequality and using the growth condition again gives
$$E\varphi(X) \le K E\varphi(X/2) \le K\delta^2 E\varphi(X) + K^{N+1} E\varphi(Y)$$
Solving for $E\varphi(X)$ now gives
$$E\varphi(X) \le \frac{K^{N+1}}{1 - K\delta^2}\, E\varphi(Y)$$
$\Box$

381 Revisited. To compare with (3.4), suppose $p = 4$. In this case $K = 2^4 = 16$. Taking $\delta = 1/8$ to make $1 - K\delta^2 = 3/4$ and $N = 3$, we get a constant which is
$$16^4 \cdot 4/3 = 87{,}381.333\ldots$$
Of course we really don't care about the value, just that positive finite constants $0 < c, C < \infty$ exist in (5.1).
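The arithmetic in the last paragraph can be reproduced mechanically. The following sketch evaluates the constant $K^{N+1}/(1 - K\delta^2)$ produced by the proof of (5.4) for $\varphi(x) = x^p$ with integer $p$, using exact rational arithmetic. The function name `good_lambda_constant` and its argument checks are hypothetical choices made for this illustration.

```python
from fractions import Fraction


def good_lambda_constant(p, delta, N):
    """Constant K^(N+1) / (1 - K delta^2) from the proof of (5.4), with K = 2^p.

    p is a nonnegative integer, delta a positive Fraction, N a nonnegative integer.
    The two requirements of the proof are checked explicitly.
    """
    K = Fraction(2) ** p
    if K * delta ** 2 >= 1:
        raise ValueError("need K * delta^2 < 1")
    if Fraction(2) ** N < 1 / delta:
        raise ValueError("need 2^N >= 1/delta")
    return K ** (N + 1) / (1 - K * delta ** 2)


if __name__ == "__main__":
    C = good_lambda_constant(4, Fraction(1, 8), 3)
    print(C, float(C))
```

For $p = 4$, $\delta = 1/8$, $N = 3$ this gives the exact fraction $262144/3$, which matches the decimal value quoted above.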
3.6. Martingales Adapted to Brownian Filtrations

Let $\{\mathcal{B}_t, t \ge 0\}$ be the filtration generated by a $d$-dimensional Brownian motion $B_t$ with $B_0 = 0$, defined on some probability space $(\Omega, \mathcal{F}, P)$. In this section we will show (i) all local martingales adapted to $\{\mathcal{B}_t, t \ge 0\}$ are continuous and (ii) every random variable $X \in L^2(\Omega, \mathcal{B}_\infty, P)$ can be written as a stochastic integral.

(6.1) Theorem. All local martingales adapted to $\{\mathcal{B}_t, t \ge 0\}$ are continuous.

Proof Let $X_t$ be a local martingale adapted to $\{\mathcal{B}_t, t \ge 0\}$ and let $T_n \le n$ be a sequence of stopping times that reduces $X$. It suffices to show that for each $n$, $X(t \wedge T_n)$ is continuous, or in other words, it is enough to show that the result holds for martingales of the form $Y_t = E(Y \mid \mathcal{B}_t)$ where $Y \in \mathcal{B}_n$. We build up to this level of generality in three steps.

Step 1. Let $Y = f(B_n)$ where $f$ is a bounded continuous function. If $t \ge n$, then $Y_t = f(B_n)$, so $t \to Y_t$ is trivially continuous for $t > n$. If $t < n$, the Markov property implies $Y_t = E(Y \mid \mathcal{B}_t) = h(n - t, B_t)$ where
$$h(s, x) = \int \frac{1}{(2\pi s)^{d/2}}\, e^{-|y - x|^2/2s} f(y)\, dy$$
It is easy to see that $h(s, x)$ is a continuous function on $(0, \infty) \times \mathbf{R}^d$, so $Y_t$ is continuous for $t < n$. To check continuity at $t = n$, observe that changing variables $y = x + z\sqrt{s}$, $dy = s^{d/2}\, dz$ gives
$$h(s, x) = \int \frac{1}{(2\pi)^{d/2}}\, e^{-|z|^2/2} f(x + z\sqrt{s})\, dz$$
so the dominated convergence theorem implies that as $t \uparrow n$, $h(n - t, B_t) \to f(B_n)$.

Step 2. Let $Y = f_1(B_{t_1}) f_2(B_{t_2})$ where $t_1 < t_2 \le n$ and $f_1, f_2$ are bounded and continuous. If $t \ge t_1$, then
$$Y_t = f_1(B_{t_1})\, E(f_2(B_{t_2}) \mid \mathcal{B}_t)$$
so the argument from step 1 implies that $Y_t$ is continuous on $[t_1, \infty)$. On the other hand, if $t \le t_1$, then $Y_t = E(Y_{t_1} \mid \mathcal{B}_t)$ and
$$Y_{t_1} = f_1(B_{t_1})\, g(B_{t_1})$$
where
$$g(x) = \int \frac{1}{(2\pi(t_2 - t_1))^{d/2}}\, e^{-|y - x|^2/2(t_2 - t_1)} f_2(y)\, dy$$
is a bounded continuous function, so $Y_{t_1}$ is a bounded continuous function of $B_{t_1}$, and it follows from step 1 that $Y_t$ is continuous on $[0, t_1]$. Repeating the argument above and using induction, it follows that the result holds if $Y = f_1(B_{t_1}) \cdots f_k(B_{t_k})$ where $t_1 < t_2 < \cdots < t_k \le n$ and $f_1, \ldots, f_k$ are bounded continuous functions.

Step 3. Let $Y \in \mathcal{B}_n$ with $E|Y| < \infty$. It follows from a standard application of the monotone class theorem that for any $\epsilon > 0$, there is a random variable $X^\epsilon$ of the form considered in step 2 that has $E|X^\epsilon - Y| < \epsilon$. Now
$$|E(X^\epsilon \mid \mathcal{B}_t) - E(Y \mid \mathcal{B}_t)| \le E(|X^\epsilon - Y| \mid \mathcal{B}_t)$$
and if we let $Z_t = E(|X^\epsilon - Y| \mid \mathcal{B}_t)$, it follows from Doob's inequality (see e.g., (4.2) in Chapter 4 of Durrett (1995)) that
$$\lambda P\Bigl(\sup_{t \le n} Z_t > \lambda\Bigr) \le E Z_n = E|X^\epsilon - Y| < \epsilon$$
Now $X^\epsilon(t) = E(X^\epsilon \mid \mathcal{B}_t)$ is continuous, so letting $\epsilon \to 0$ we see that for a.e. $\omega$, $Y_t(\omega)$ is a uniform limit of continuous functions, so $Y_t$ is continuous. $\Box$

Remark. I would like to thank Michael Sharpe for telling me about the proof given above.
We now turn to our second goal. Let $B_t = (B^1_t, \ldots, B^d_t)$ with $B_0 = 0$ and let $\{\mathcal{B}^i_t, t \ge 0\}$ be the filtrations generated by the coordinates.

(6.2) Theorem. For any $X \in L^2(\Omega, \mathcal{B}_\infty, P)$ there are unique $H^i \in \Pi_2(B^i)$ with
$$X = EX + \sum_{i=1}^d \int_0^\infty H^i_s\, dB^i_s$$

Proof We follow Section V.3 of Revuz and Yor (1991). The uniqueness is immediate since the isometry property, Exercise 6.2 of Chapter 2, implies there is only one way to write the zero random variable, i.e., using $H^i(s, \omega) \equiv 0$. To prove the existence of $H^i$, the first step is to reduce to one dimension by proving:

(6.3) Lemma. Let $X \in L^2(\Omega, \mathcal{B}_\infty, P)$ with $EX = 0$, and let $X_i = E(X \mid \mathcal{B}^i_\infty)$. Then $X = X_1 + \cdots + X_d$.

Proof Geometrically, see (1.4) in Chapter 4 of Durrett (1995), $X_i$ is the projection of $X$ onto $L^2_i$, the subspace of mean zero random variables measurable with respect to $\mathcal{B}^i_\infty$. If $Y \in L^2_i$ and $Z \in L^2_j$ then $Y$ and $Z$ are independent so $EYZ = EY \cdot EZ = 0$. The last result shows that the spaces $L^2_i$ and $L^2_j$ are orthogonal. Since $L^2_1, \ldots, L^2_d$ together span all the elements in $L^2$ with mean 0, (6.3) follows. $\Box$

Proof in one dimension Let $I$ be the set of integrands which can be written as $\sum_{j=1}^n \lambda_j 1_{(s_{j-1}, s_j]}$ where the $\lambda_j$ and $s_j$ are (nonrandom!) real numbers and $0 = s_0 < s_1 < \ldots < s_n$. For any integrand $H \in I \subset \Pi_2(B)$, we can define the stochastic integral $Y_t = \int_0^t H_s\, dB_s$, and use the isometry property, Exercise 6.2 of Chapter 2, to conclude $Y_t \in \mathcal{M}^2$. Using (3.7) and the remark after its proof, we can further define the martingale exponential of the integral, $Z_t = \mathcal{E}(Y)_t$, and conclude that $Z_t \in \mathcal{M}^2$. Itô's formula, see (3.6), implies that $Z_t - Z_0 = \int_0^t Z_s\, dY_s$. Recalling $Y_t = (H \cdot B)_t$ and using the associative law, (9.6) in Chapter 2, we have
$$Z_t - Z_0 = \int_0^t Z_s H_s\, dB_s$$
$Y_0 = 0$ and $\langle Y \rangle_0 = 0$, so $Z_0 = 1$. Since $Z_t \in \mathcal{M}^2$ we have $E Z_t = 1 = Z_0$ and it follows that
$$Z_t = E Z_t + \int_0^\infty Z_s H_s 1_{[0,t]}(s)\, dB_s$$
i.e., the desired representation holds for each of the random variables in the set $\mathcal{J} = \{\mathcal{E}(H \cdot B)_t : H \in I, t \ge 0\}$. To complete the proof of (6.2) now, it suffices to show

(6.4) Lemma. If $W \in L^2$ has $E(ZW) = 0$ for all $Z \in \mathcal{J}$ then $W \equiv 0$.

(6.5) Lemma. Let $\mathcal{G}$ be the collection of $L^2$ random variables $X$ that can be written as $EX + \int_0^\infty H_s\, dB_s$. Then $\mathcal{G}$ is a closed subspace of $L^2$.

Proof of (6.4) We will show that the measure $W\, dP$ (i.e., the measure $\mu$ with $d\mu/dP = W$) is the zero measure. To do this, it suffices to show that $W\, dP$ is the zero measure on the $\sigma$-field $\sigma(B_{t_1}, \ldots, B_{t_n})$ for any finite sequence $0 = t_0 < t_1 < t_2 < \ldots < t_n$. Let $\lambda_j$, $1 \le j \le n$ be real numbers and $z$ be a complex number. The function
$$\varphi(z) = E\Bigl(W \exp\Bigl(z \sum_{j=1}^n \lambda_j (B_{t_j} - B_{t_{j-1}})\Bigr)\Bigr)$$
is easily seen to be analytic in $\mathbf{C}$, i.e., $\varphi(z)$ is represented by an absolutely convergent power series. By the assumed orthogonality, we have $\varphi(x) = 0$ for all real $x$, so complex variable theory tells us that $\varphi$ must vanish identically. In particular $\varphi(i) = 0$, when $i = \sqrt{-1}$. The last equality implies that the image of $W\, dP$ under the map
$$\omega \to (B_{t_1}(\omega) - B_{t_0}(\omega), \ldots, B_{t_n}(\omega) - B_{t_{n-1}}(\omega))$$
is the zero measure since its Fourier transform vanishes. This shows that $W\, dP$ vanishes on $\sigma(B_{t_1} - B_{t_0}, \ldots, B_{t_n} - B_{t_{n-1}}) = \sigma(B_{t_1}, \ldots, B_{t_n})$ and the proof of (6.4) is complete. $\Box$

Proof of (6.5) It is clear that $\mathcal{G}$ is a subspace. Let $X_n \in \mathcal{G}$ with

(6.6) $\quad X_n = E X_n + \int_0^\infty H^n_s\, dB_s$

and suppose that $X_n \to X$ in $L^2$. Using an elementary fact about the variance, then (6.6), and the isometry property of the stochastic integral, Exercise 6.2 in Chapter 2, we have
$$E(X_n - X_m)^2 - (E X_n - E X_m)^2 = E\{(X_n - X_m) - (E X_n - E X_m)\}^2 = E \int_0^\infty (H^n_s - H^m_s)^2\, ds$$
Since $X_n \to X$ in $L^2$ implies $E(X_n - X_m)^2 \to 0$ and $E X_n \to EX$, it follows that $\|H^m - H^n\|_B \to 0$. The completeness of $\Pi_2(B)$, see the remark at the beginning of Step 3 in Section 2.4, implies that there is a predictable $H$ so that $\|H^n - H\|_B \to 0$. Another use of the isometry property implies that $H^n \cdot B$ converges to $H \cdot B$ in $\mathcal{M}^2$. Taking limits in (6.6) gives the representation for $X$. This completes the proof of (6.5) and thus of (6.2). $\Box$
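A concrete instance may help fix ideas about the representation in (6.2). The worked example below is an illustration added under stated assumptions, not part of the original argument: it takes $d = 1$, a fixed time $T > 0$ (a symbol introduced here), and the random variable $X = B_T^2$, and uses only Itô's formula for $f(x) = x^2$ and the isometry property.

```latex
% Worked example (assumption: d = 1, T > 0 fixed, X = B_T^2).
% Ito's formula applied to f(x) = x^2 gives
\[
  B_T^2 = 2\int_0^T B_s \, dB_s + T ,
\]
% so in the notation of (6.2)
\[
  EX = T, \qquad H_s = 2 B_s 1_{[0,T]}(s), \qquad
  X = EX + \int_0^\infty H_s \, dB_s .
\]
% Consistency check with the isometry property:
\[
  E\int_0^\infty H_s^2 \, ds = \int_0^T 4s \, ds = 2T^2
  = E B_T^4 - (E B_T^2)^2 = \operatorname{var}(B_T^2).
\]
```

The last line uses $E B_T^4 = 3T^2$, so the variance of $X$ agrees with the $L^2$ norm of the integrand, as the isometry requires.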
4
Partial Differential Equations

A. Parabolic Equations

In the first third of this chapter, we will show how Brownian motion can be used to construct (classical) solutions of the following equations:
$$u_t = \tfrac12 \Delta u$$
$$u_t = \tfrac12 \Delta u + g$$
$$u_t = \tfrac12 \Delta u + cu$$
in $(0, \infty) \times \mathbf{R}^d$ subject to the boundary condition: $u$ is continuous at each point of $\{0\} \times \mathbf{R}^d$ and $u(0, x) = f(x)$ for $x \in \mathbf{R}^d$. Here,
$$\Delta u = \frac{\partial^2 u}{\partial x_1^2} + \cdots + \frac{\partial^2 u}{\partial x_d^2}$$
and by a classical solution, we mean one that has enough derivatives for the equation to make sense. That is, $u \in C^{1,2}$, the functions that have one continuous derivative with respect to $t$ and two with respect to each of the $x_i$. The continuity of $u$ in the boundary condition is needed to establish a connection between the equation which holds in $(0, \infty) \times \mathbf{R}^d$ and $u(0, x) = f(x)$ which holds on $\{0\} \times \mathbf{R}^d$. Note that the boundary condition cannot possibly hold unless $f : \mathbf{R}^d \to \mathbf{R}$ is continuous.

We will see that the solutions to these equations are (under suitable assumptions) given by
$$E_x f(B_t)$$
$$E_x\Bigl(f(B_t) + \int_0^t g(t - s, B_s)\, ds\Bigr)$$
$$E_x\Bigl(f(B_t) \exp\Bigl(\int_0^t c(t - s, B_s)\, ds\Bigr)\Bigr)$$
In words, the solutions may be described as follows:
(i) To solve the heat equation, run a Brownian motion and let $u(t, x) = E_x f(B_t)$.

(ii) To introduce a term $g(x)$ add the integral of $g$ along the path.

(iii) To introduce $cu$, multiply $f(B_t)$ by $m_t = \exp\bigl(\int_0^t c(t - s, B_s)\, ds\bigr)$ before taking expected values. Here, we think of the Brownian particle as having mass 1 at time 0 and changing mass according to $m'_s = c(t - s, B_s) m_s$, and when we take expected values, we take the particle's mass into account. An alternative interpretation when $c \le 0$ is that $\exp\bigl(\int_0^t c(t - s, B_s)\, ds\bigr)$ is the probability the particle survives until time $t$, or $-c(r, x)$ gives the killing rate when the particle is at $x$ at time $r$.

In the first three sections of this chapter, we will say more about why the expressions we have written above solve the indicated equations. In order to bring out the similarities and differences between these equations and their elliptic counterparts discussed in Sections 4.4 to 4.6, we have adopted a rather robotic style. Formulas (m.2) through (m.6) and their proofs have been developed in parallel in the first six sections, and at the end of most sections we discuss what happens when something becomes unbounded.

4.1. The Heat Equation
In this section, we will consider the following equation:

(1.1a) $u \in C^{1,2}$ and $u_t = \tfrac12 \Delta u$ in $(0, \infty) \times \mathbf{R}^d$.
(1.1b) $u$ is continuous at each point of $\{0\} \times \mathbf{R}^d$ and $u(0, x) = f(x)$.

This equation derives its name, the heat equation, from the fact that if the units of measurement are chosen suitably then the solution $u(t, x)$ gives the temperature at the point $x \in \mathbf{R}^d$ at time $t$ when the temperature profile at time 0 is given by $f(x)$. The first step in solving (1.1), as it will be six times below, is to find a local martingale.

(1.2) Theorem. If $u$ satisfies (1.1a), then $M_s = u(t - s, B_s)$ is a local martingale on $[0, t)$.

Proof Applying Itô's formula, (10.2) in Chapter 2, to $u(x_0, \ldots, x_d)$ with $X^0_s = t - s$ and $X^i_s = B^i_s$ for $1 \le i \le d$ gives
$$u(t - s, B_s) - u(t, B_0) = \int_0^s -u_t(t - r, B_r)\, dr + \int_0^s \nabla u(t - r, B_r) \cdot dB_r + \frac12 \int_0^s \Delta u(t - r, B_r)\, dr$$
To check this note that $dX^0_r = -dr$ and $X^0_r$ has bounded variation, while the $X^i_r$ with $1 \le i \le d$ are independent Brownian motions, so
$$\langle X^i, X^j \rangle_r = \begin{cases} r & \text{if } i = j \ge 1 \\ 0 & \text{otherwise} \end{cases}$$
(1.2) follows easily from the Itô's formula equation since $-u_t + \tfrac12 \Delta u = 0$ and the second term on the right-hand side is a local martingale. $\Box$

Our next step is to prove a uniqueness theorem.

(1.3) Theorem. If there is a solution of (1.1) that is bounded then it must be
$$v(t, x) \equiv E_x f(B_t)$$
Here $\equiv$ means that the last equation defines $v$. We will always use $u$ for a generic solution of the equation and $v$ for our special solution.

Proof If we now assume that $u$ is bounded, then $M_s$, $0 \le s < t$, is a bounded martingale. The martingale convergence theorem implies that
$$M_t \equiv \lim_{s \uparrow t} M_s \quad \text{exists a.s.}$$
If $u$ satisfies (1.1b), this limit must be $f(B_t)$. Since $M_s$ is uniformly integrable, it follows that
$$u(t, x) = E_x M_0 = E_x M_t = v(t, x)$$
$\Box$

Now that (1.3) has told us what the solution must be, the next logical step is to find conditions under which $v$ is a solution. It is (and always will be) easy to show that if $v$ is smooth enough then it is a classical solution.

(1.4) Theorem. Suppose $f$ is bounded. If $v \in C^{1,2}$ then it satisfies (1.1a).
Proof The Markov property implies, see Exercise 2.1 in Chapter 1, that
$$E_x(f(B_t) \mid \mathcal{F}_s) = E_{B_s}(f(B_{t-s})) = v(t - s, B_s)$$
The left-hand side is a martingale, so the right-hand side is also. If $v \in C^{1,2}$, then repeating the calculation in the proof of (1.2) shows that
$$v(t - s, B_s) - v(t, B_0) = \int_0^s \bigl(-v_t + \tfrac12 \Delta v\bigr)(t - r, B_r)\, dr + \text{a local martingale}$$
The left-hand side is a local martingale, so the integral on the right-hand side is also. However, the integral is continuous and locally of bounded variation, so by (3.3) in Chapter 2 it must be $\equiv 0$ almost surely. Since $v_t$ and $\Delta v$ are continuous, it follows that $-v_t + \tfrac12 \Delta v \equiv 0$. For if it were $\ne 0$ at some point $(t, x)$, then it would be $\ne 0$ on an open neighborhood of that point, and, hence, with positive probability the integral would not be $\equiv 0$, a contradiction. $\Box$

It is easy to give conditions that imply that $v$ satisfies (1.1b). In order to keep the exposition simple, we first consider the situation when $f$ is bounded.

(1.5) Theorem. If $f$ is bounded and continuous, then $v$ satisfies (1.1b).

Proof $(B_t - B_0) =_d t^{1/2} N$, where $N$ has a normal distribution with mean 0 and variance 1, so if $t_n \to 0$ and $x_n \to x$, the bounded convergence theorem implies that $v(t_n, x_n) = E f(x_n + t_n^{1/2} N) \to f(x)$. $\Box$

By definition, $v(t, x) = E_x f(B_t) = \int p_t(x, y) f(y)\, dy$ where $p_t(x, y) = (2\pi t)^{-d/2} e^{-|x - y|^2/2t}$. Writing $D_i = \partial/\partial x_i$ and $D_t = \partial/\partial t$, a little calculus gives
$$D_i p_t(x, y) = -\frac{x_i - y_i}{t}\, p_t(x, y)$$
$$D_{ii} p_t(x, y) = \frac{(x_i - y_i)^2 - t}{t^2}\, p_t(x, y)$$
$$D_{ij} p_t(x, y) = \frac{(x_i - y_i)(x_j - y_j)}{t^2}\, p_t(x, y) \qquad i \ne j$$
$$D_t p_t(x, y) = \Bigl(-\frac{d}{2t} + \frac{|x - y|^2}{2t^2}\Bigr) p_t(x, y)$$
If $f$ is bounded, then it is easy to see that for $a = i$, $ij$, or $t$
$$\int |D_a p_t(x, y) f(y)|\, dy < \infty$$
and is continuous in $\mathbf{R}^d$, so (1.6) follows from the next result on differentiating under the integral sign. This result is lengthy to state but short to prove since we assume everything we need for the proof to work. Nonetheless we will see that this result is useful.
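The derivative formulas for $p_t(x, y)$ can be sanity-checked numerically. The sketch below is an illustration restricted to $d = 1$; it compares the closed-form derivatives with central finite differences and checks that the kernel satisfies $D_t p = \tfrac12 D_{ii} p$. The function names, the test point, and the step size `eps` are hypothetical choices.

```python
import math


def p(t, x, y):
    """One dimensional Brownian transition density p_t(x, y)."""
    return math.exp(-(x - y) ** 2 / (2.0 * t)) / math.sqrt(2.0 * math.pi * t)


def d_x(t, x, y):
    """Closed form for D_i p_t(x, y) with d = 1."""
    return -(x - y) / t * p(t, x, y)


def d_xx(t, x, y):
    """Closed form for D_ii p_t(x, y) with d = 1."""
    return ((x - y) ** 2 - t) / t ** 2 * p(t, x, y)


def d_t(t, x, y):
    """Closed form for D_t p_t(x, y) with d = 1."""
    return (-1.0 / (2.0 * t) + (x - y) ** 2 / (2.0 * t ** 2)) * p(t, x, y)


def central_diff(f, z, eps=1e-5):
    """Second-order central finite difference of f at z."""
    return (f(z + eps) - f(z - eps)) / (2.0 * eps)


if __name__ == "__main__":
    t, x, y = 0.7, 0.3, -0.4  # arbitrary test point
    print(d_x(t, x, y), central_diff(lambda z: p(t, z, y), x))
    print(d_xx(t, x, y), central_diff(lambda z: d_x(t, z, y), x))
    print(d_t(t, x, y), central_diff(lambda s: p(s, x, y), t))
    print(d_t(t, x, y) - 0.5 * d_xx(t, x, y))  # heat equation residual
```

Each printed pair should agree closely, and the last line should be zero up to rounding, which is the statement that $p_t$ itself solves the heat equation in $x$.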
(1.7) Lemma. Let $(S, \mathcal{S}, m)$ be a $\sigma$-finite measure space, and $g : S \to \mathbf{R}$ be measurable. Suppose that for $x \in G$ an open subset of $\mathbf{R}^d$ and some $h_0 > 0$ we have:
(a) $u(x) = \int_S K(x, y) g(y)\, m(dy)$ where $K$ and $\partial K$

The last two examples show that if $G^c$ is too small near $y$, then $y$ may be irregular. The next result shows that if $G^c$ is not too small near $y$ then $y$ is regular.

(4.5c) Cone Condition. If there is a cone $V$ having vertex $y$ and an $r > 0$ such that $V \cap D(y, r) \subset G^c$, then $y$ is a regular point.

Proof The first thing to do is to define a cone with vertex $y$, pointing in direction $v$, with opening $a$ as follows:
$$V(y, v, a) = \{x : x = y + \theta(v + z) \text{ where } \theta \in (0, \infty),\ z \perp v, \text{ and } |z| < a\}$$
Now that we have defined a cone, the rest is easy. Since the normal distribution is spherically symmetric,
$$P_y(B_t \in V(y, v, a)) = \epsilon_a > 0$$
where $\epsilon_a$ is a constant that depends only on the opening $a$. Let $r > 0$ be such that $V(y, v, a) \cap D(y, r) \subset G^c$. The continuity of Brownian paths implies
$$\lim_{t \to 0} P_y\Bigl(\sup_{s \le t} |B_s - y| > r\Bigr) = 0$$
Combining the last two results with a trivial inequality we have
$$\epsilon_a \le \liminf_{t \downarrow 0} P_y(B_t \in G^c) \le \lim_{t \downarrow 0} P_y(\tau \le t) \le P_y(\tau = 0)$$
and it follows from Blumenthal's zero-one law that $P_y(\tau = 0) = 1$. $\Box$

(4.5c) is sufficient for most cases.
$$\limsup_{x \to y} P_x(\tau = \infty) \le \limsup_{x \to y} P_x(\tau > 1) = 0$$
The last two observations show that $h$ is a solution of (4.1) with $f \equiv 0$, which completes the proof. $\Box$

By working a little harder, we can show that adding $a P_x(\tau = \infty)$ to $v(x)$ is the only way to produce new bounded solutions.

(4.7c) Theorem. Suppose that $f$ is bounded and continuous and that each point of $\partial G$ is regular. If $u$ is bounded and satisfies (4.1) in $G$, then there is a constant $a$ such that
$$u(x) = E_x(f(B_\tau) ; \tau < \infty) + a P_x(\tau = \infty)$$

We will warm up for this by proving the following special case in which $G = \mathbf{R}^d$.

(4.7d) Theorem. If $u$ is bounded and harmonic in $\mathbf{R}^d$ then $u$ is constant.

Proof (4.2) above and (2.6) in Chapter 2 imply that $u(B_t)$ is a bounded martingale. So the martingale convergence theorem implies that as $t \to \infty$, $u(B_t) \to U_\infty$. Since $U_\infty$ is measurable with respect to the tail $\sigma$-field, it follows from (2.12) in Chapter 1 that $P_x(a < U_\infty < b)$ is either $\equiv 0$ or $\equiv 1$ for any $a < b$. The last result implies that there is a constant $c$ independent of $x$ so that $P_x(U_\infty = c) \equiv 1$. Taking expected values it follows that $u(x) = E_x U_\infty = c$. $\Box$

Proof of (4.7c) From the proof of (4.7d) we see that $u(B_t)$ is a bounded local martingale on $[0, \tau)$ so $U_\tau = \lim_{t \uparrow \tau} u(B_t)$ exists. On $\{\tau < \infty\}$, we have $U_\tau = f(B_\tau)$ so what we need to show is that there is a constant $a$ independent of the starting point $B_0$ so that $U_\tau = a$ on $\{\tau = \infty\}$. Intuitively, this is a consequence of the triviality of the asymptotic $\sigma$-field, but the fact that $0 < P_x(\tau = \infty) < 1$ makes it difficult to extract this from (2.12) in Chapter 1.

To get around the difficulty identified in the previous paragraph, we will extend $u$ to the whole space. The two steps in doing this are to

(a) Let $h(x) = u(x) - E_x(f(B_\tau) ; \tau < \infty)$. (4.6) and (4.5) imply that $h$ is bounded and satisfies (4.1) with boundary function $f \equiv 0$.

(b) Let $M = \|h\|_\infty$ and look at
$$w(x) = \begin{cases} h(x) + M P_x(\tau = \infty) & x \in G \\ 0 & x \in G^c \end{cases}$$

To complete the proof now, we will show in four steps that $w(x) = \beta P_x(\tau = \infty)$, from which the desired result follows immediately.

(i) When restricted to $G$, $w$ satisfies (4.1) with boundary function $f \equiv 0$. The proof of (4.7b) implies that $M P_x(\tau = \infty)$ satisfies (4.1) with boundary function $f \equiv 0$. Combining this with (a) the desired result follows.

(ii) $w \ge 0$. To do this we use the optional stopping theorem on the martingale $h(B_t)$ at time $\tau \wedge t$ and note $h(B_\tau) = 0$ to get
$$h(x) = E_x(h(B_t) ; \tau > t) \ge -M P_x(\tau > t)$$
Letting $t \to \infty$ proves $h(x) \ge -M P_x(\tau = \infty)$.

(iii) $w(B_t)$ is a submartingale. Because of the Markov property it is enough to show that for all $x$ and $t$ we have $w(x) \le E_x w(B_t)$. Since $w \ge 0$ this is trivial if $x \notin G$. To prove this for $x \in G$, we note that (i) implies $W_t = w(B_t)$ is a bounded local martingale on $[0, \tau)$ and $W_\tau = 0$ on $\{\tau < \infty\}$, so using the optional stopping theorem
$$w(x) = E_x(w(B_t) ; \tau > t) \le E_x w(B_t)$$

(iv) There is a constant $\beta$ so that $w(x) = \beta P_x(\tau = \infty)$. Since $w$ is a bounded submartingale it follows that as $t \to \infty$, $w(B_t)$ converges to a limit $W_\infty$. The argument in (4.7d) implies there is a constant $\beta$ so that $P_x(W_\infty = \beta) = 1$ for all $x$. Letting $t \to \infty$ in
$$w(x) = E_x(w(B_t) ; \tau > t)$$
and using the bounded convergence theorem gives (iv). $\Box$
4. 5. Poisso n's Equation
In this section, we will see what happens when we add a function of equation considered in the last section. That is, we will study: (5.1a)
u E C2 and �.t1u =
-g
(5. 1b) At each point of oG,
x to the
in G.
u is continuous and = 0. u
0
As in Section 4.2, we can add a solution of (4.1) to replace u = in (5.1b) by = . As always, the first step in solving (5.1) is to find a local martingale.
u f
152
T
Chapter 4 Partial Differential Equations
Section 4.5 Poisson's Equation

(5.2) Theorem. Let $\tau = \inf\{t > 0 : B_t \notin G\}$. If $u$ satisfies (5.1a), then
$$M_t = u(B_t) + \int_0^t g(B_s)\,ds$$
is a local martingale on $[0,\tau)$.

Proof. Applying Itô's formula as we did in the last section gives
$$u(B_t) - u(B_0) = \int_0^t \nabla u(B_s)\cdot dB_s + \frac{1}{2}\int_0^t \Delta u(B_s)\,ds$$
for $t < \tau$. This proves (5.2), since $\frac{1}{2}\Delta u = -g$ and the first term on the right-hand side is a local martingale on $[0,\tau)$. □

The next step is to prove a uniqueness result.

(5.3) Theorem. Suppose that $G$ and $g$ are bounded. If there is a solution of (5.1) that is bounded, it must be
$$v(x) \equiv E_x \int_0^\tau g(B_t)\,dt$$

Proof. If $u$ satisfies (5.1a) then $M_t$ defined in (5.2) is a local martingale on $[0,\tau)$. If $G$ is bounded, then (1.3) in Chapter 3 implies $E_x\tau < \infty$ for all $x \in G$. If $u$ and $g$ are bounded then for $t < \tau$
$$|M_t| \le \|u\|_\infty + \tau\|g\|_\infty$$
Since the right-hand side is integrable, (2.7) in Chapter 2 and (5.1b) imply
$$M_\tau \equiv \lim_{t\uparrow\tau} M_t = \int_0^\tau g(B_t)\,dt$$
$$u(x) = E_x M_0 = E_x(M_\tau) = v(x)$$
□

As usual, it is easy to show

(5.4) Theorem. Suppose that $G$ is bounded and $g$ is continuous. If $v \in C^2$, then it satisfies (5.1a).

Proof. The Markov property implies that on $\{\tau > s\}$,
$$E_x\left(\int_0^\tau g(B_r)\,dr \,\Big|\, \mathcal{F}_s\right) = \int_0^s g(B_r)\,dr + v(B_s)$$
The left-hand side is a local martingale on $[0,\tau)$, so the right-hand side is also. If $v \in C^2$, then repeating the calculation in the proof of (5.2) shows that for $s \in [0,\tau)$,
$$v(B_s) - v(B_0) + \int_0^s g(B_r)\,dr = \int_0^s \left(\tfrac{1}{2}\Delta v + g\right)(B_r)\,dr + \text{a local martingale}$$
The left-hand side is a local martingale on $[0,\tau)$, so the integral on the right-hand side is also. However, the integral is continuous and locally of bounded variation, so by (3.3) in Chapter 2 it must be $\equiv 0$. Since $\frac{1}{2}\Delta v + g$ is continuous in $G$, it follows that it is $\equiv 0$ in $G$, for if it were $\ne 0$ at some point then we would have a contradiction. □

After the extensive discussion in the last section, the conditions needed to guarantee that the boundary conditions hold should come as no surprise.

(5.5) Theorem. Suppose that $G$ and $g$ are bounded. Let $y$ be a regular point of $\partial G$. If $x_n \in G$ and $x_n \to y$, then $v(x_n) \to 0$.

Proof. We begin by observing: (i) It follows from (4.5a) that if $\epsilon > 0$, then $P_{x_n}(\tau > \epsilon) \to 0$. (ii) If $G$ is bounded, then (1.3) in Chapter 3 implies $C = \sup_x E_x\tau < \infty$ and, hence, $\|v\|_\infty \le C\|g\|_\infty < \infty$. Let $\epsilon > 0$. Beginning with some elementary inequalities then using the Markov property we have
$$|v(x_n)| \le E_{x_n}\left(\int_0^{\tau\wedge\epsilon} |g(B_s)|\,ds\right) + \left|E_{x_n}\left(\int_\epsilon^\tau g(B_s)\,ds;\ \tau > \epsilon\right)\right|$$
$$\le \epsilon\|g\|_\infty + E_{x_n}(|v(B_\epsilon)|;\ \tau > \epsilon) \le \epsilon\|g\|_\infty + \|v\|_\infty P_{x_n}(\tau > \epsilon)$$
Letting $n \to \infty$, and using (i) and (ii) proves (5.5) since $\epsilon$ is arbitrary. □
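As a sanity check on (5.3)-(5.5), the following is a minimal sketch (not part of the original text) for the one-dimensional case $G = (-1,1)$, $g \equiv 1$, where $v(x) = E_x\tau = 1 - x^2$. The function name `solve_poisson_1d`, the grid size, and the central-difference scheme are illustrative assumptions.

```python
import numpy as np

def solve_poisson_1d(g, n=201):
    """Solve (1/2) u'' = -g on (-1, 1) with u(-1) = u(1) = 0 by central differences."""
    x = np.linspace(-1.0, 1.0, n)
    h = x[1] - x[0]
    m = n - 2  # number of interior grid points
    main = -2.0 * np.ones(m)
    off = np.ones(m - 1)
    # Matrix of the discrete operator (1/2) * (second difference) / h^2
    A = 0.5 * (np.diag(main) + np.diag(off, 1) + np.diag(off, -1)) / h**2
    u = np.zeros(n)  # boundary values u(-1) = u(1) = 0 stay at zero
    u[1:-1] = np.linalg.solve(A, -g(x[1:-1]))
    return x, u

# Assumed test case: g = 1, so the solution should be 1 - x^2 (the expected exit time).
x, u = solve_poisson_1d(lambda y: np.ones_like(y))
max_err = np.max(np.abs(u - (1.0 - x**2)))
print(max_err)
```

Central differences are exact for quadratics, so the discrete solution should agree with $1 - x^2$ up to rounding error.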
Last, but not least, we come to the question of smoothness. For these developments we will assume that $g$ is defined on $R^d$ not just in $G$ and has compact support. Recall we are supposing $G$ is bounded and notice that the values of $g$ on $G^c$ are irrelevant for (5.1), so there will be no loss of generality if we later want to suppose that $\int g(x)\,dx = 0$. We begin with the case $d \ge 3$, because in this case (2.2) in Chapter 3 implies
$$\bar w(x) = E_x \int_0^\infty |g(B_t)|\,dt < \infty$$
and, moreover, is a bounded function of $x$. This means we can define
$$w(x) = E_x \int_0^\infty g(B_t)\,dt$$
use the strong Markov property to conclude
$$w(x) = E_x \int_0^\tau g(B_t)\,dt + E_x w(B_\tau)$$
and change notation to get
$$(*)\qquad v(x) = w(x) - E_x w(B_\tau)$$
(4.6) tells us that the second term is $C^\infty$ in $G$, so to verify that $v \in C^2$ we need only prove that $w$ is, a task that is made simple by the fact that (2.4) and (2.3) in Chapter 3 give the following explicit formula
$$w(x) = C_d \int |x - y|^{2-d} g(y)\,dy$$
The first derivative is easy.

(5.6a) Theorem. If $g$ is bounded and has compact support, then $w$ is $C^1$ and there is a constant $C$ which only depends on $d$ so that
$$|D_i w(x)| \le C\|g\|_\infty \int_{\{g \ne 0\}} \frac{dy}{|x - y|^{d-1}}$$

Proof. We will content ourselves to show that the expression we get by differentiating under the integral sign converges and leave it to the reader to apply (1.7) to make the argument rigorous. Now
$$D_i |x - y|^{2-d} = \frac{2-d}{2}\Big(\sum_j (x_j - y_j)^2\Big)^{-d/2}\, 2(x_i - y_i)$$
So differentiating under the integral sign
$$D_i w(x) = C_d \int (2-d)\,\frac{x_i - y_i}{|x - y|^d}\, g(y)\,dy$$
The integral on the right-hand side is convergent since
$$\int \left|\frac{x_i - y_i}{|x - y|^d}\, g(y)\right| dy \le \|g\|_\infty \int_{\{g \ne 0\}} \frac{dy}{|x - y|^{d-1}} < \infty$$
□

As in Section 4.2, trouble starts when we consider second derivatives. If $i \ne j$, then
$$D_{ij} |x - y|^{2-d} = (2-d)(-d)\,|x - y|^{-d-2}(x_i - y_i)(x_j - y_j)$$
In this case, the estimate used above leads to
$$|D_{ij} |x - y|^{2-d}| \le C|x - y|^{-d}$$
which is (just barely) not locally integrable. As in Section 4.2, if $g$ is Hölder continuous of order $\alpha$, we can get an extra $|x - y|^\alpha$ to save the day. The details are tedious, so we will content ourselves to state the result.

(5.6b) Theorem. If $g$ is Hölder continuous, then $w$ is $C^2$.

The reader can find a proof either in Port and Stone (1978), pages 116-117, or in Gilbarg and Trudinger (1977), pages 53-55. Combining (*) with (5.6b) gives

(5.6) Theorem. Suppose that $G$ is bounded. If $g$ is Hölder continuous, then $v \in C^2$ and hence satisfies (5.1a).

The last result settles the question of smoothness in $d \ge 3$. To extend the result to $d \le 2$, we need to find a substitute for (*). To do this, we let
$$w(x) = \int G(x,y) g(y)\,dy$$
where $G$ is the potential kernel defined in (2.7) of Chapter 3, that is,
$$G(x,y) = \begin{cases} -\frac{1}{\pi}\log(|x - y|) & d = 2 \\ -|x - y| & d = 1 \end{cases}$$
$G$ was defined as
$$G(x,y) = \int_0^\infty \{p_t(x,y) - a_t\}\,dt$$
where the $a_t$ were chosen to make the integral converge. So if $\int g\,dx = 0$, we see that
$$\int G(x,y) g(y)\,dy = \lim_{T\to\infty} E_x \int_0^T g(B_t)\,dt$$
Using this interpretation of $w$, we can easily show that (*) holds, so again our problem is reduced to proving that $w$ is $C^2$, which is a problem in calculus. Once all the computations are done, we find that (5.6) holds in $d \le 2$ and that in $d = 1$, it is sufficient to assume that $g$ is continuous. The reader can find details for $d = 1$ in (4.5) of Chapter 6, and for $d = 2$ in either of the sources cited above.
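The derivative formula used in the proof of (5.6a), $D_i|x-y|^{2-d} = (2-d)(x_i - y_i)/|x-y|^d$, can be checked numerically. The sketch below is illustrative only: the helper names, the test points, and the finite-difference step are assumptions, not part of the text. It also confirms that the gradient has size $|d-2|\,|x-y|^{1-d}$, which is the source of the $|x-y|^{-(d-1)}$ bound.

```python
import numpy as np

def kernel(x, y):
    """|x - y|^(2 - d), the kernel in the explicit formula for w when d >= 3."""
    d = x.size
    return np.linalg.norm(x - y) ** (2 - d)

def grad_kernel(x, y):
    """Closed-form gradient in x: (2 - d) (x - y) / |x - y|^d."""
    d = x.size
    return (2 - d) * (x - y) / np.linalg.norm(x - y) ** d

def numeric_grad(x, y, h=1e-6):
    """Central-difference approximation of the gradient in x."""
    g = np.zeros_like(x)
    for i in range(x.size):
        e = np.zeros_like(x)
        e[i] = h
        g[i] = (kernel(x + e, y) - kernel(x - e, y)) / (2 * h)
    return g

# Arbitrary test points in d = 3 (an assumption for illustration).
x = np.array([0.3, -0.2, 0.5])
y = np.array([-0.4, 0.1, 0.2])
grad_err = np.max(np.abs(numeric_grad(x, y) - grad_kernel(x, y)))
# |grad| * |x - y|^(d - 1) should equal |d - 2| = 1 in d = 3.
ratio = np.linalg.norm(grad_kernel(x, y)) * np.linalg.norm(x - y) ** (x.size - 1)
print(grad_err, ratio)
```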
4.6. The Schrödinger Equation

In this section, we will consider what happens when we add $cu$ to the left-hand side of the equation considered in Section 4.4. That is, we will study

(6.1a) $u \in C^2$ and $\frac{1}{2}\Delta u + cu = 0$ in $G$.
(6.1b) At each point of $\partial G$, $u$ is continuous and $u = f$.

As always, the first step in solving (6.1) is to find a local martingale.

(6.2) Theorem. Let $\tau = \inf\{t > 0 : B_t \notin G\}$. If $u$ satisfies (6.1a) then
$$M_t = u(B_t)\exp\left(\int_0^t c(B_s)\,ds\right)$$
is a local martingale on $[0,\tau)$.

Proof. Let $c_t = \int_0^t c(B_s)\,ds$. Applying Itô's formula gives
$$u(B_t)\exp(c_t) - u(B_0) = \int_0^t \exp(c_s)\nabla u(B_s)\cdot dB_s + \int_0^t u(B_s)\exp(c_s)\,dc_s + \frac{1}{2}\int_0^t \Delta u(B_s)\exp(c_s)\,ds$$
for $t < \tau$. This proves (6.2), since $dc_s = c(B_s)\,ds$, $\frac{1}{2}\Delta u + cu = 0$, and the first term on the right-hand side is a local martingale on $[0,\tau)$. □

At this point, the reader might expect that the next step is to assume that everything is bounded and conclude that a bounded solution must be $v(x) = E_x(f(B_\tau)\exp(c_\tau))$. We will not do this, however, because the following simple example shows that this result is false.

Example 6.1. Let $d = 1$, $G = (-a,a)$, $c \equiv \gamma > 0$, and $f \equiv 1$. The equation we are considering is
$$\frac{1}{2}u'' + \gamma u = 0 \qquad u(a) = u(-a) = 1$$
The general solution is $A\cos bx + B\sin bx$, where $b = \sqrt{2\gamma}$. So if we want the boundary condition to be satisfied we must have
$$1 = A\cos ba + B\sin ba$$
$$1 = A\cos(-ba) + B\sin(-ba) = A\cos ba - B\sin ba$$
Adding the two equations and then subtracting them it follows that
$$2 = 2A\cos ba \qquad 0 = 2B\sin ba$$
From this we see that $B = 0$ always works and we may or may not be able to solve for $A$. If $\cos ba = 0$ then there is no solution. If $\cos ba \ne 0$ then $u(x) = \cos bx/\cos ba$ is a solution. We will see later (in Example 9.1) that if $ab < \pi/2$ then
$$v(x) = \cos bx/\cos ba$$
However this cannot possibly hold for $ab > \pi/2$ since $v(x) \ge 0$ while the right-hand side is $< 0$ for some values of $x$.

We will see below (again in Example 9.1) that the trouble with the last example is that if $ab > \pi/2$ then $c \equiv \gamma$ is too large, or to be precise, if we let
$$w(x) = E_x \exp\left(\int_0^\tau c(B_s)\,ds\right)$$
then $w \equiv \infty$ in $(-a,a)$.

(6.3a) Lemma. Let $\theta > 0$. There is a $\mu > 0$ so that if $H$ is an open set with Lebesgue measure $|H| \le \mu$ and $\tau_H = \inf\{t > 0 : B_t \notin H\}$ then
$$\sup_x E_x(\exp(\theta\tau_H)) \le 2$$

Proof. Pick $\gamma > 0$ so that $e^{\theta\gamma} \le 4/3$. Clearly,
$$P_x(\tau_H > \gamma) \le P_x(B_\gamma \in H) \le \frac{|H|}{(2\pi\gamma)^{d/2}} \le \frac{1}{4}$$
if we pick $\mu$ so that $\mu/(2\pi\gamma)^{d/2} \le 1/4$. Using the Markov property as in the proof of (1.2) in Chapter 3 we can conclude that
$$P_x(\tau_H > k\gamma) = E_x\big(P_{B_\gamma}(\tau_H > (k-1)\gamma);\ \tau_H > \gamma\big) \le \frac{1}{4}\sup_y P_y(\tau_H > (k-1)\gamma)$$
So it follows by induction that for all integers $k \ge 0$ we have
$$\sup_x P_x(\tau_H > k\gamma) \le \frac{1}{4^k}$$
Since $e^x$ is increasing, and $e^{\theta\gamma} \le 4/3$ by assumption, we have
$$E_x \exp(\theta\tau_H) \le \sum_{k=1}^\infty \exp(\theta\gamma k)\,P_x\big((k-1)\gamma < \tau_H \le k\gamma\big) \le \sum_{k=1}^\infty \left(\frac{4}{3}\right)^k \frac{1}{4^{k-1}} = \frac{4}{3}\sum_{k=1}^\infty \frac{1}{3^{k-1}} = 2$$
□

Let $c^* = \sup_x |c(x)|$ and $T_r = \inf\{t : |B_t - B_0| > r\}$. By (6.3a), we can pick $r_0$ so small that if $r \le r_0$ then $E_x\exp(c^* T_r) \le 2$ for all $x$.

(6.3b) Lemma. Let $2\delta \le r_0$. If $D(x, 2\delta) \subset G$ and $y \in D(x,\delta)$, then
$$w(y) \le 2^{d+2} w(x)$$

The reason for our interest in this is that it shows $w(x) < \infty$ implies $w(y) < \infty$ for $y \in D(x,\delta)$.

Proof. If $D(y,r) \subset G$, and $r \le r_0$ then the strong Markov property implies
$$w(y) = E_y[\exp(c_{T_r}) w(B(T_r))] \le E_y[\exp(c^* T_r) w(B(T_r))] = E_y[\exp(c^* T_r)] \int_{\partial D(y,r)} w(z)\,\pi(dz) \le 2\int_{\partial D(y,r)} w(z)\,\pi(dz)$$
where $\pi$ is surface measure on $\partial D(y,r)$ normalized to be a probability measure, since the exit time, $T_r$, and exit location, $B(T_r)$, are independent. If $\delta \le r_0$ and $D(y,\delta) \subset G$, multiplying the last inequality by $r^{d-1}$ and integrating from 0 to $\delta$ gives
$$\frac{\delta^d}{d}\, w(y) \le \frac{2}{\sigma_d}\int_{D(y,\delta)} w(z)\,dz$$
where $\sigma_d$ is the surface area of $\{x : |x| = 1\}$. Rearranging we have
$$(*)\qquad \int_{D(y,\delta)} w(z)\,dz \ge 2^{-1}\,\frac{\delta^d}{C_0}\, w(y)$$
where $C_0 = d/\sigma_d$ is a constant that depends only on $d$. Repeating the first argument in the proof with $y = x$ and using the fact that $c_{T_r} \ge -c^* T_r$ gives
$$w(x) = E_x[\exp(c_{T_r}) w(B(T_r))] \ge E_x[\exp(-c^* T_r) w(B(T_r))] = E_x[\exp(-c^* T_r)] \int_{\partial D(x,r)} w(z)\,\pi(dz)$$
Since $1/x$ is convex, Jensen's inequality implies
$$E_x[\exp(-c^* T_r)] \ge 1/E_x[\exp(c^* T_r)] \ge 1/2$$
Combining the last two displays, multiplying by $r^{d-1}$, and integrating from 0 to $2\delta$ we get the lower bound
$$\frac{(2\delta)^d}{d}\, w(x) \ge 2^{-1}\,\frac{1}{\sigma_d}\int_{D(x,2\delta)} w(z)\,dz$$
Rearranging and using $D(x,2\delta) \supset D(y,\delta)$, $w \ge 0$ we have
$$(**)\qquad w(x) \ge 2^{-1}\,\frac{C_0}{(2\delta)^d}\int_{D(y,\delta)} w(z)\,dz$$
where again $C_0 = d/\sigma_d$. Combining (**) and (*) it follows that
$$w(x) \ge 2^{-1}\,\frac{C_0}{(2\delta)^d}\int_{D(y,\delta)} w(z)\,dz \ge 2^{-1}\,\frac{C_0}{(2\delta)^d}\cdot 2^{-1}\,\frac{\delta^d}{C_0}\, w(y) = 2^{-(d+2)} w(y)$$
□
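Returning to Example 6.1, the following minimal sketch (an illustration, not part of the original text) checks by finite differences that $u(x) = \cos bx/\cos ba$ with $b = \sqrt{2\gamma}$ satisfies $\frac{1}{2}u'' + \gamma u = 0$ and $u(\pm a) = 1$, and that it takes negative values once $ab > \pi/2$, so it cannot equal the nonnegative quantity $v$. The parameter values and helper names are assumptions chosen for the demonstration.

```python
import math

def u(x, a, gamma):
    """Candidate solution cos(bx)/cos(ba) with b = sqrt(2*gamma)."""
    b = math.sqrt(2.0 * gamma)
    return math.cos(b * x) / math.cos(b * a)

def residual(x, a, gamma, h=1e-4):
    """Finite-difference value of (1/2) u'' + gamma * u at x."""
    second = (u(x + h, a, gamma) - 2 * u(x, a, gamma) + u(x - h, a, gamma)) / h**2
    return 0.5 * second + gamma * u(x, a, gamma)

a = 1.0
small_gamma = 0.5  # b = 1, so ab = 1 < pi/2
large_gamma = 2.0  # b = 2, so ab = 2 > pi/2

grid = [k / 10 for k in range(-9, 10)]
res_small = max(abs(residual(x, a, small_gamma)) for x in grid)
res_large = max(abs(residual(x, a, large_gamma)) for x in grid)
min_small = min(u(k / 10, a, small_gamma) for k in range(-10, 11))
min_large = min(u(k / 10, a, large_gamma) for k in range(-10, 11))
print(res_small, res_large, min_small, min_large)
```

In both cases the residual is at the level of discretization error, but only for the smaller $\gamma$ does the solution stay positive on $[-a,a]$.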
(6.3b) and a simple covering argument lead to

(6.3c) Theorem. Let $G$ be a connected open set. If $w \not\equiv \infty$ then
$$w(x) < \infty \quad\text{for all } x \in G$$

Proof. From (6.3b), we see that if $w(x) < \infty$, $2\delta \le r_0$, and $D(x,2\delta) \subset G$, then $w < \infty$ on $D(x,\delta)$. From this result, it follows that $G_0 = \{x : w(x) < \infty\}$ is an open subset of $G$. To argue now that $G_0$ is also closed (when considered as a subset of $G$) we observe that if $2\delta < r_0$, $D(y,3\delta) \subset G$, and we have $x_n \in G_0$ with $x_n \to y \in G$ then for $n$ sufficiently large, $y \in D(x_n,\delta)$ and $D(x_n,2\delta) \subset G$, so $w(y) < \infty$. □

Before we proceed to the uniqueness result, we want to strengthen the last conclusion.

(6.3d) Theorem. Let $G$ be a connected open set with finite Lebesgue measure, $|G| < \infty$. If $w \not\equiv \infty$ then
$$\sup_x w(x) < \infty$$

Proof. Let $K \subset G$ be compact so that $|G - K| < \mu$, the constant in (6.3a) for $c^*$. For each $x \in K$ we can pick a $\delta_x$ so that $2\delta_x \le r_0$ and $D(x,2\delta_x) \subset G$. The open sets $D(x,\delta_x)$ cover $K$ so there is a finite subcover $D(x_i,\delta_{x_i})$, $1 \le i \le I$. Clearly,
$$\sup_{1 \le i \le I} w(x_i) < \infty$$
(6.3b) implies $w(y) \le 2^{d+2} w(x_i)$ for $y \in D(x_i,\delta_{x_i})$, so
$$M = \sup_{y \in K} w(y) < \infty$$
If $y \in H = G - K$, then $E_y(\exp(c^*\tau_H)) \le 2$ by (6.3a) so using the strong Markov property
$$w(y) = E_y(\exp(c_{\tau_H});\ \tau_H = \tau) + E_y(\exp(c_{\tau_H}) w(B_{\tau_H});\ B_{\tau_H} \in K) \le 2 + M E_y(\exp(c_{\tau_H});\ B_{\tau_H} \in K) \le 2 + 2M$$
□

With (6.3d) established, we are now more than ready to prove our uniqueness result. To simplify the statements of the results that follow we will now list the assumptions that we will make for the rest of the section.

(A1) $G$ is a bounded connected open set.
(A2) $f$ and $c$ are bounded and continuous.
(A3) $w \not\equiv \infty$.

(6.3) Theorem. If there is a solution of (6.1) that is bounded, it must be
$$v(x) \equiv E_x(f(B_\tau)\exp(c_\tau))$$

Proof. (6.2) implies that $M_s = u(B_s)\exp(c_s)$ is a local martingale on $[0,\tau)$. Since $f$, $c$, and $u$ are bounded, letting $s \uparrow \tau\wedge t$ and using the bounded convergence theorem gives
$$u(x) = E_x(f(B_\tau)\exp(c_\tau);\ \tau \le t) + E_x(u(B_t)\exp(c_t);\ \tau > t)$$
Since $f$ is bounded and $w(x) = E_x\exp(c_\tau) < \infty$, the dominated convergence theorem implies that as $t \to \infty$, the first term converges to $E_x(f(B_\tau)\exp(c_\tau))$. To show that the second term $\to 0$, we begin with the observation that since $\{\tau > t\} \in \mathcal{F}_t$, the definition of conditional expectation and the Markov property imply
$$E_x(u(B_t)\exp(c_\tau);\ \tau > t) = E_x(E_x(u(B_t)\exp(c_\tau) \mid \mathcal{F}_t);\ \tau > t) = E_x(u(B_t)\exp(c_t) w(B_t);\ \tau > t)$$
Now we claim that for all $y \in G$
$$w(y) \ge \exp(-c^*)P_y(\tau \le 1) \ge \epsilon > 0$$
The first inequality is trivial. The last two follow easily from (A1). See the first display in the proof of (1.2) in Chapter 3. Replacing $w(B_t)$ by $\epsilon$,
$$E_x(|u(B_t)|\exp(c_t);\ \tau > t) \le \epsilon^{-1}E_x(|u(B_t)|\exp(c_\tau);\ \tau > t) \le \epsilon^{-1}\|u\|_\infty E_x(\exp(c_\tau);\ \tau > t) \to 0$$
as $t \to \infty$, by the dominated convergence theorem since $w(x) = E_x\exp(c_\tau) < \infty$ and $P_x(\tau < \infty) = 1$. Going back to the first equation in the proof, we have shown $u(x) = v(x)$ and the proof is complete. □

This completes our consideration of uniqueness. The next stage in our program, fortunately, is as easy as it always has been. Recall that here and in what follows we are assuming (A1)-(A3).

(6.4) Theorem. If $v \in C^2$, then it satisfies (6.1a) in $G$.

Proof. The Markov property implies that on $\{\tau > s\}$,
$$E_x(\exp(c_\tau)f(B_\tau) \mid \mathcal{F}_s) = \exp(c_s)E_{B_s}(\exp(c_\tau)f(B_\tau)) = \exp(c_s)v(B_s)$$
The left-hand side is a local martingale on $[0,\tau)$, so the right-hand side is also. If $v \in C^2$, then repeating the calculation in the proof of (6.2) shows that for $s \in [0,\tau)$,
$$v(B_s)\exp(c_s) - v(B_0) = \int_0^s \left(\tfrac{1}{2}\Delta v + cv\right)(B_r)\exp(c_r)\,dr + \text{a local martingale}$$
The left-hand side is a local martingale on $[0,\tau)$, so the integral on the right-hand side is also. However, the integral is continuous and locally of bounded variation, so by (3.3) in Chapter 2 it must be $\equiv 0$. Since $v \in C^2$ and $c$ is continuous, it follows that $\frac{1}{2}\Delta v + cv \equiv 0$, for if it were $\ne 0$ at some point then we would have a contradiction. □

Having proved (6.4), the next step is to consider the boundary condition. As in the last two sections, we need the boundary to be regular.

(6.5) Theorem. $v$ satisfies (6.1b) at each regular point of $\partial G$.

Proof. Let $y$ be a regular point of $\partial G$. We showed in (4.5a) and (4.5b) that if $x_n \to y$, then $P_{x_n}(\tau \le \delta) \to 1$ and $P_{x_n}(B_\tau \in D(y,\delta)) \to 1$ for all $\delta > 0$. Since $c$ is bounded and $f$ is bounded and continuous, the bounded convergence theorem implies that
$$E_{x_n}(\exp(c_\tau)f(B_\tau);\ \tau \le 1) \to f(y)$$
To control the contribution from the rest of the space, we observe that if $|c| \le M$ then using the Markov property and the boundedness of $w$ established in (6.3d) we have
$$E_{x_n}(\exp(c_\tau)f(B_\tau);\ \tau > 1) \le e^M\|f\|_\infty E_{x_n}(w(B_1);\ \tau > 1) \le e^M\|f\|_\infty\|w\|_\infty P_{x_n}(\tau > 1) \to 0$$
□

This brings us finally to the problem of determining when $v$ is smooth enough to be a solution. We use the same trick used in Section 4.3 to reduce to the previous case. We begin with the identity established there, which holds for all $t$ and Brownian paths $\omega$ and, hence, holds when $t = \tau(\omega)$
$$\exp\left(\int_0^\tau c(B_s)\,ds\right) = 1 + \int_0^\tau c(B_s)\exp\left(\int_s^\tau c(B_r)\,dr\right)ds$$
Multiplying by $f(B_\tau)$ and taking expected values gives
$$v(x) = E_x f(B_\tau) + E_x\int_0^\tau c(B_s) f(B_\tau)\exp\left(\int_s^\tau c(B_r)\,dr\right)ds$$
Conditioning on $\mathcal{F}_s$ and using the Markov property, we can write the above as
$$v(x) = E_x f(B_\tau) + \int_0^\infty E_x(c(B_s)v(B_s);\ \tau > s)\,ds = v_1(x) + v_2(x)$$
The first term, $v_1(x)$, is $C^\infty$ by (4.6). The second term is
$$v_2(x) = E_x\int_0^\tau c(B_s)v(B_s)\,ds$$
so if we let $g(x) = c(x)v(x)$ then we can apply results from the last section. If $c$ and $f$ are bounded and $w \not\equiv \infty$, then $v$ is bounded by (6.3d), so it follows from results in the last section that $v_2$ is $C^1$ and has a bounded derivative. Since $v_1 \in C^\infty$ and $G$ is bounded, it follows that $v$ is $C^1$ and has a bounded derivative. If $c$ is Hölder continuous, then $g(x) = c(x)v(x)$ is Hölder continuous, and we can use (5.6b) from the last section to conclude $v_2 \in C^2$ and hence

(6.6) Theorem. If in addition to (A1)-(A3), $c$ is Hölder continuous, then $v \in C^2$ and, hence, satisfies (6.1a).

C. Applications to Brownian Motion

In the next three sections we will use the p.d.e. results proved in the last three to derive some formulas for Brownian motion. The first two sections are closely related but the third can be read independently.
4.7. Exit Distributions for the Ball

In this section, we will use results for the Dirichlet problem proved in Section 4.4 to find the exit distributions for $D = \{x : |x| < 1\}$. Our main result is

(7.1) Theorem. If $f$ is bounded and measurable, then
$$(*)\qquad E_x f(B_\tau) = \int_{\partial D} \frac{1 - |x|^2}{|x - y|^d}\, f(y)\,\pi(dy)$$
where $\pi$ is surface measure on $\partial D$ normalized to be a probability measure.

Proof. An application of the monotone class theorem shows that if (*) holds for $f \in C^\infty$ it is valid for bounded measurable $f$. In view of (4.3), we can prove (*) for $f \in C^\infty$ by showing that if $k_y(x) = (1 - |x|^2)/|x - y|^d$ and $v(x) = \int k_y(x) f(y)\,\pi(dy)$ then [...] then the convergence is uniform for $z \in B_0(\delta) = \partial D - D(y,\delta)$. Thus, if we let $B_1(\delta) = \partial D - B_0(\delta)$, then
$$\int_{B_0(\delta)} k_z(x)\,\pi(dz) \to 0 \quad\text{and}\quad \int_{B_1(\delta)} k_z(x)\,\pi(dz) \to 1$$
since $I(x) \equiv 1$. To prove that $v(x) \to f(y)$ now we observe
$$|v(x) - f(y)| = \left|\int k_z(x) f(z)\,\pi(dz) - f(y)\int k_z(x)\,\pi(dz)\right| \le 2\|f\|_\infty \int_{B_0(\delta)} k_z(x)\,\pi(dz) + \sup_{z \in B_1(\delta)} |f(z) - f(y)|$$
The first term $\to 0$ as $x \to y$, while the sup in the second is small if $\delta$ is. This shows that (7.2b) holds and completes the proof of (7.1). □

The derivation given above is a little unsatisfying, since it starts with the answer and then verifies it, but it is simpler than messing around with Kelvin's transformations (see pages 100-103 in Port and Stone (1978)). It also has the merit of explaining why $k_y(x)$ is the probability density of exiting at $y$: $k_y$ is a nonnegative harmonic function that has $k_y(0) = 1$ and $k_y(x) \to 0$ when $x \to z \in \partial D$ and $z \ne y$.

Exercise 7.1. The point of this exercise is to apply the reasoning of this section to $\tau = \inf\{t : B_t \notin H\}$ where $H = \{(x,y) : x \in R^{d-1}, y > 0\}$. For $\theta \in R^{d-1}$ let
$$h_\theta(x,y) = \frac{C_d\, y}{(|x - \theta|^2 + y^2)^{d/2}}$$
where $C_d$ is chosen so that $\int d\theta\, h_\theta(0,1) = 1$ and let
$$u(x,y) = \int d\theta\, h_\theta(x,y) f(\theta,0)$$
where $f$ is bounded and continuous.
(a) Show that $\Delta h_\theta = 0$ in $H$ and use (1.7) to conclude $\Delta u = 0$ in $H$.
(b) Show $I(x,y) \equiv \int d\theta\, h_\theta(x,y) = 1$.
(c) Show that if $x_n \to x$, $y_n \to 0$ then $u(x_n,y_n) \to f(x,0)$.
(d) Conclude that $E_{(x,y)} f(B_\tau) = \int d\theta\, h_\theta(x,y) f(\theta,0)$.

4.8. Occupation Times for the Ball

In the last section we considered how $B_t$ leaves $D = \{x : |x| < 1\}$. In this section we will investigate how it spends its time before it leaves. Let $\tau = \inf\{t : B_t \notin D\}$ and let $G(x,y)$ be the potential kernel defined in (2.7) of Chapter 3. That is,
$$G(x,y) = \begin{cases} C_d |x - y|^{2-d} & d \ge 3 \\ -\frac{1}{\pi}\log(|x - y|) & d = 2 \\ -|x - y| & d = 1 \end{cases}$$
where $C_d = \Gamma(d/2 - 1)/2\pi^{d/2}$.

(8.1) Theorem. If $g$ is bounded and measurable then
$$E_x\int_0^\tau g(B_t)\,dt = \int G_D(x,y) g(y)\,dy$$
where
$$G_D(x,y) = G(x,y) - \int \frac{1 - |x|^2}{|x - z|^d}\, G(z,y)\,\pi(dz)$$

Proof. Combine (*) in Section 4.5 with (7.1). □

We think of $G_D(x,y)$ as the expected amount of time a Brownian motion starting at $x$ spends at $y$ before exiting $D$. To be precise, if $A \subset D$ then the expected amount of time $B_t$ spends in $A$ before exiting $D$ is $\int_A G_D(x,y)\,dy$. Our task in this section is to compute $G_D(x,y)$. In $d = 1$, where $D = (-1,1)$, (8.1) tells us that
$$G_D(x,y) = G(x,y) - \frac{x+1}{2}\,G(1,y) - \frac{1-x}{2}\,G(-1,y) = -|x - y| + \frac{x+1}{2}(1 - y) + \frac{1-x}{2}(y + 1) = -|x - y| + 1 - xy$$
Considering the two cases $x \ge y$ and $x \le y$ leads to
$$(8.2)\qquad G_D(x,y) = \begin{cases} (1 - x)(1 + y) & -1 \le y \le x \le 1 \\ (1 - y)(1 + x) & -1 \le x \le y \le 1 \end{cases}$$
Geometrically, if we fix $y$ then $x \to G_D(x,y)$ is determined by the conditions that it is linear on $[-1,y]$ and on $[y,1]$ with $G_D(-1,y) = G_D(1,y) = 0$, and $G_D(y,y) = 1 - y^2$.
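The one-dimensional formula (8.2) can be tested against (8.1) directly. The sketch below is an illustration under assumed names and grid sizes: it integrates $G_D(x,y)g(y)$ over $(-1,1)$ by the trapezoid rule. For $g \equiv 1$ the result should be the expected exit time $1 - x^2$, and for $g(y) = y$ it should be $(x - x^3)/3$, the function with $\frac{1}{2}u'' = -x$ and $u(\pm 1) = 0$.

```python
import numpy as np

def green_interval(x, y):
    """G_D(x, y) on D = (-1, 1), as in formula (8.2)."""
    return np.where(y <= x, (1 - x) * (1 + y), (1 - y) * (1 + x))

def occupation_integral(x, g, n=200001):
    """Trapezoid approximation of the integral of G_D(x, y) g(y) over (-1, 1)."""
    y = np.linspace(-1.0, 1.0, n)
    vals = green_interval(x, y) * g(y)
    h = y[1] - y[0]
    return h * (vals.sum() - 0.5 * (vals[0] + vals[-1]))

x0 = 0.5
exit_time = occupation_integral(x0, np.ones_like)    # should be close to 1 - x0^2
linear_case = occupation_integral(x0, lambda y: y)   # should be close to (x0 - x0^3)/3
print(exit_time, linear_case)
```

The same function also reproduces the intermediate expression $-|x - y| + 1 - xy$ pointwise, which is a quick way to confirm the case analysis leading to (8.2).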
In $d \ge 2$, (8.1) works well when $y = 0$ for then $G(z,0)$ is constant on $\{|z| = 1\}$ and the expression in (8.1) reduces to
$$(8.3)\qquad G_D(x,0) = \begin{cases} -\frac{1}{\pi}\log|x| & d = 2 \\ C_d(|x|^{2-d} - 1) & d \ge 3 \end{cases}$$
To get an explicit formula for $G_D(x,y)$ in $d \ge 2$ for $y \ne 0$ we will cheat and look up the answer. This weakness of character has the advantage of making the point that $G_D(x,y)$ is nothing more than the Green's function for $D$ with Dirichlet boundary conditions. Folland (1976), page 109, defines the Green's function for $D$ to be the function $K(x,y)$ on $D \times D$ determined by the following properties:

(A) For each $y \in D$, $K(\cdot,y) - G(\cdot,y)$ is $C^2$ and harmonic in $D$.
(B) For each $y \in D$, if $x_n \to x \in \partial D$, $K(x_n,y) \to 0$.

Remark. For convenience, we have changed Folland's notation to conform to ours and interchanged the roles of $x$ and $y$. This interchange makes no difference, since the Green's function is symmetric, that is, $K(x,y) = K(y,x)$. (See Folland (1976), page 110.)

(8.4) Lemma. The occupation time density $G_D$ is equal to the Green's function defined above.

Proof. Using (8.1) and (7.1) we have
$$G_D(x,z) - G(x,z) = -\int_{\partial D} \frac{1 - |x|^2}{|x - y|^d}\, G(y,z)\,\pi(dy) = -E_x G(B_\tau,z)$$
This is harmonic in $D$ by (4.6). (4.5) implies if $x_n \to x \in \partial D$, then
$$G_D(x_n,z) - G(x_n,z) \to -G(x,z)$$
□

Having made the connection in (8.4), we can now find $G_D$ by "guessing and verifying" functions with properties (A) and (B). To "guess," we turn to page 123 in Folland (1976) and find

(8.5) Theorem. In $d \ge 3$, if $0 < |y| < 1$ then
$$G_D(x,y) = C_d\left\{|x - y|^{2-d} - \big|x|y| - y|y|^{-1}\big|^{2-d}\right\}$$

Proof. To check (A) and (B), we observe that if we let $\Delta_x$ denote the Laplacian acting in the $x$ variable for fixed $y$ then
$$(*)\qquad \Delta_x G(x,y) = 0 \quad\text{for } x \ne y$$
(a) Since $y/|y|^2 \notin D$, (*) implies that the second term is harmonic for $x \in D$.
(b) Let $x \in \partial D$. Clearly, the right-hand side is continuous at $x$. To see that it vanishes, we note
$$\big|x|y| - y|y|^{-1}\big|^{2-d} = |x - y|^{2-d}$$
□

The last equality follows from a fact useful for the next proof:

(8.6) Lemma. If $|x| = 1$ then $\big|x|y| - y|y|^{-1}\big| = |x - y|$.

Proof. Using $|z|^2 = z\cdot z$ and then $|x|^2 = 1$ we have
$$\big|x|y| - y|y|^{-1}\big|^2 = |x|^2|y|^2 - 2x\cdot y + 1 = |y|^2 - 2x\cdot y + |x|^2 = |x - y|^2$$
□

Again turning to page 123 of Folland (1976) we "guess"

(8.7) Theorem. In $d = 2$ if $0 < |y| < 1$ then
$$G_D(x,y) = \frac{-1}{\pi}\left(\ln|x - y| - \ln\big|x|y| - y|y|^{-1}\big|\right)$$

Proof. Again we need to check (A) and (B).
(a) $$G_D(x,y) - G(x,y) = \frac{1}{\pi}\left(\ln\left|x - \frac{y}{|y|^2}\right| + \ln|y|\right)$$
Again the first term is harmonic by (*) since $y/|y|^2 \notin D$. The second term does not depend on $x$ and hence is trivially harmonic.
(b) Let $x \in \partial D$. Clearly, the right-hand side is continuous at $x$. The fact that it vanishes follows immediately from (8.6). □
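Lemma (8.6) and the boundary condition (B) for the formulas in (8.5) and (8.7) lend themselves to a quick numerical check. The sketch below is illustrative only: the function names, the random seed, and the sampling ranges are assumptions. The constant $C_d$ is dropped in $d \ge 3$ since it does not affect where the function vanishes.

```python
import numpy as np

def reflected_distance(x, y):
    """| x|y| - y/|y| |, the quantity appearing in (8.5)-(8.7)."""
    ny = np.linalg.norm(y)
    return np.linalg.norm(x * ny - y / ny)

def green_ball(x, y):
    """Formula (8.7) in d = 2; formula (8.5) without the constant C_d in d >= 3."""
    d = x.size
    r, s = np.linalg.norm(x - y), reflected_distance(x, y)
    if d == 2:
        return -(np.log(r) - np.log(s)) / np.pi
    return r ** (2 - d) - s ** (2 - d)

rng = np.random.default_rng(0)
max_gap = 0.0       # largest violation of Lemma (8.6) on the unit sphere
max_boundary = 0.0  # largest |G_D(x, y)| found for |x| = 1
for d in (2, 3, 5):
    for _ in range(100):
        x = rng.normal(size=d)
        x /= np.linalg.norm(x)                          # |x| = 1
        y = rng.normal(size=d)
        y *= rng.uniform(0.1, 0.9) / np.linalg.norm(y)  # 0 < |y| < 1
        max_gap = max(max_gap, abs(reflected_distance(x, y) - np.linalg.norm(x - y)))
        max_boundary = max(max_boundary, abs(green_ball(x, y)))

# An interior point, where the occupation density should be strictly positive.
interior_value = green_ball(np.array([0.2, 0.0, 0.0]), np.array([0.0, 0.5, 0.0]))
print(max_gap, max_boundary, interior_value)
```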
4.9. Laplace Transforms, Arcsine Law

In this section we will apply results from Section 4.6 to do three things:
(a) Complete the discussion of Example 6.1.
(b) Prove a remarkable observation of Ciesielski and Taylor (1962) that for Brownian motion starting at 0, the distribution of the exit time from $D = \{x : |x| < 1\}$ in dimension $d$ is the same as that of the total occupation time of $D$ in dimension $d + 2$.
(c) Prove Lévy's arcsine law for the occupation time of $(0,\infty)$.
The third topic is independent of the first two.

Example 9.1. Take $d = 1$, $G = (-a,a)$, $c(x) \equiv \gamma > 0$, and $f \equiv 1$, the problem considered in Section 4.6.

(9.1) Theorem. Let $\tau = \inf\{t : B_t \notin (-a,a)\}$. If 0
oo. Indeed r = 1/2 is the value for which the probability density 1/7rJr(1 - r) is smallest. Proof For the last equality see the other arcsine law, prove the first we start with
(9.3) Lemma. Let c(x) = bo1:1nded, C1 , and satisfies for all x :f.:
0. Then
(4.2) in Chapter 1. To
-a - f3 1[o, oo) (x) with a, {J ;:::: 0. Suppose v
IS
1 -.Q.v 2 + cv = -1
Having "found" the solution we want, we can now forget about its origins and simply check that it works. Differentiating we have
C(a) co h(ar) - C(a) smh . (ar) J'(r) = -arr s 2C(a) smh C(a)a sinh(ar) - 2C(a) co h(ar) + � . (ar) j"(1· ) = s r rar 2-f'(r) = C( a)a . h(ar) = a-j (r) j"(1· ) + r sm -?
--
r
?
-
--
?
Proof Our assumptions about v imply that v"(x) = -2(1 + c(x)v(x)) dx in the sense of distribution. So two results from Chapter 2, the Meyer-Tanaka formula, (11.4), and (11.7) imply
Letting $c_t = \int_0^t c(B_s)\,ds$ and using the integration by parts formula, (10.1) in Chapter 2, with $X_t = v(B_t)$ and $Y_t = \exp(c_t)$, which is locally of bounded variation, we have
$$v(B_t)\exp(c_t) - v(B_0) = \int_0^t \exp(c_s)v'(B_s)\,dB_s - \int_0^t \exp(c_s)\{1 + c(B_s)v(B_s)\}\,ds + \int_0^t v(B_s)\exp(c_s)\,dc_s$$
So $M_t = v(B_t)\exp(c_t) + \int_0^t \exp(c_s)\,ds$ is a local martingale. Since $v$ is bounded and $c_t \le 0$, $M_t$ is bounded. As $t \to \infty$, $\exp(c_t) \le e^{-\alpha t} \to 0$, so using the martingale and bounded convergence theorems gives
$$v(x) = E_x M_0 = E_x M_\infty = E_x\int_0^\infty \exp(c_s)\,ds$$
A = va+lJ - fo (a + [J)fo
1
(a + (J) v = 2 v + 1 1/
x>O
1 -v" + 1
x R}. Suppose that Bt is Brownian motion w.r.t. :Ft and that Yi and Zt satisfy (*) on [O, r(R)] . Then Yi = Zt on [0, r(R)] .
(2.8) Lemma.
Yo = Zo =
0
Since Yi and Zt are solutions,
To see that Xt" is a solution, we let Yi = Xf and Zt = Xt00 in (2.3) to get T E 0 sup 1 Xf +1 - Xt" 1 2 :::; BE I X: - xr: 1 2 ds 9�T
(
by
)
(2.6), so Xt"
1
::=; B T · E
= limXf +l = Xt" .
(O� • �T IX: - X';" l2) sup
--+ 0
Uniqueness. The key to the proof is ·
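(2.7) Gronwall's Inequality. Suppose $\varphi(t) \le A + B\int_0^t \varphi(s)\,ds$ for all $t \ge 0$ and $\varphi(t)$ is continuous. Then $\varphi(t) \le Ae^{Bt}$.

Proof. If we let $\psi(t) = (A + \epsilon)e^{Bt}$ then $\psi'(t) = B\psi(t)$ so
$$\psi(t) = A + \epsilon + B\int_0^t \psi(s)\,ds$$
Let $\tau = \inf\{t : \varphi(t) \ge \psi(t)\}$. Since $\psi$ and $\varphi$ are continuous, if $\tau < \infty$ then
$$\varphi(\tau) = \psi(\tau) = A + \epsilon + B\int_0^\tau \psi(s)\,ds > A + B\int_0^\tau \varphi(s)\,ds \ge \varphi(\tau)$$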
x
Proof Let
n
n
The last contradiction implies that $\varphi(t) \le (A + \epsilon)e^{Bt}$, but $\epsilon > 0$ is arbitrary, so the desired result follows. □
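Gronwall's inequality (2.7) can be illustrated numerically. In the sketch below, the constants and the test function $\varphi(t) = A(1 + Bt)$ are assumptions chosen for the demonstration: this $\varphi$ satisfies the integral hypothesis, and the code confirms both the hypothesis and the conclusion $\varphi(t) \le Ae^{Bt}$ on a grid.

```python
import numpy as np

A, B, T = 2.0, 1.5, 3.0
t = np.linspace(0.0, T, 3001)
phi = A * (1.0 + B * t)  # assumed test function; linear, so the trapezoid rule is exact
h = t[1] - t[0]

# Cumulative trapezoid approximation of the integral of phi over [0, t]
integral = np.concatenate(([0.0], np.cumsum(0.5 * h * (phi[1:] + phi[:-1]))))

# Hypothesis of (2.7): phi(t) <= A + B * integral of phi over [0, t]
hypothesis_ok = bool(np.all(phi <= A + B * integral + 1e-12))
# Conclusion of (2.7): phi(t) <= A * exp(B t)
conclusion_ok = bool(np.all(phi <= A * np.exp(B * t) + 1e-12))
print(hypothesis_ok, conclusion_ok)
```

The bound is sharp: $\varphi(t) = Ae^{Bt}$ itself satisfies the hypothesis with equality.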
cp(t) ::=; B E ::=; B E so
(2.3) implies
10 IYs - Zs l2 1 IY.AT - Zsi\T I 2 t i\T(R)
t
ds
. ds ::=; B
(2.7) holds with A = 0 and it follows that
1. We like the last proof since it only depends on the asymptotic behavior of the coefficients. One can also solve the equation explicitly, which has the advantage of giving the exact explosion time. Let $Y_t = 1 + X_t$. $dY_t = Y_t^\delta\,dt$ so if we guess $Y_t = (1 - at)^{-p}$ then
$$dY_t/dt = \frac{ap}{(1 - at)^{p+1}}$$
Setting $a = 1/p$ and then $p = 1/(\delta - 1)$ to have $p + 1 = \delta p$ we get a solution that explodes at time $1/a = p = 1/(\delta - 1)$.
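The explicit solution above is easy to verify numerically. The following sketch is an illustration, with the value $\delta = 3$, the sample times, and the function names all being assumptions: it checks by central differences that $Y_t = (1 - at)^{-p}$ with $a = 1/p$, $p = 1/(\delta - 1)$ satisfies $dY_t/dt = Y_t^\delta$, and that it blows up as $t$ approaches $1/(\delta - 1)$.

```python
def explosion_solution(t, delta):
    """Y_t = (1 - a t)^(-p) with p = 1/(delta - 1) and a = 1/p."""
    p = 1.0 / (delta - 1.0)
    a = 1.0 / p
    return (1.0 - a * t) ** (-p)

def ode_residual(t, delta, h=1e-6):
    """Central-difference value of dY/dt - Y^delta at time t."""
    dy = (explosion_solution(t + h, delta) - explosion_solution(t - h, delta)) / (2 * h)
    return dy - explosion_solution(t, delta) ** delta

delta = 3.0                            # assumed example: p = 1/2, a = 2
explosion_time = 1.0 / (delta - 1.0)   # = 0.5
residuals = [abs(ode_residual(t, delta)) for t in (0.0, 0.1, 0.2, 0.3, 0.4)]
near_explosion = explosion_solution(explosion_time - 1e-6, delta)
print(max(residuals), near_explosion)
```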
Example 3.3. Suppose $b = 0$, and $\sigma = (1 + |x|)^\delta I$. (2.2) implies there is no explosion when $\delta \le 1$. Later we will show (see Example 6.2 and (6.3)) that

in $d \ge 3$ explosion occurs for $\delta > 1$
in $d \le 2$ no explosion for any $\delta < \infty$

b. Yamada and Watanabe in $d = 1$

In one dimension one can get by with less smoothness in $\sigma$.

(3.3) Theorem. Suppose that (i) there is a strictly increasing continuous function $\rho$ with $|\sigma(x) - \sigma(y)| \le \rho(|x - y|)$ where $\rho(0) = 0$ and for all $\epsilon > 0$
$$\int_0^\epsilon \rho^{-2}(u)\,du = \infty$$
(ii) there is a strictly increasing and concave function $\kappa$ with $|b(x) - b(y)| \le \kappa(|x - y|)$ where $\kappa(0) = 0$ and for all $\epsilon > 0$
$$\int_0^\epsilon \kappa^{-1}(u)\,du = \infty$$
Then pathwise uniqueness holds.
Remark. Note that there is no mention here of existence of solutions. That is taken care of by a result in Section 8.4.

Proof. The first step is to define a sequence $\varphi_n$ of smooth approximations of $|x|$. Let $a_n \downarrow 0$ be defined by $a_0 = 1$ and
$$\int_{a_n}^{a_{n-1}} \rho^{-2}(u)\,du = n$$
Let $0 \le \psi_n(u) \le 2\rho^{-2}(u)/n$ be a continuous function with support in $(a_n, a_{n-1})$ and
$$\int_{a_n}^{a_{n-1}} \psi_n(u)\,du = 1$$
Since the upper bound in the definition of $\psi_n$ integrates to 2 over $(a_n, a_{n-1})$, such a function exists. Let
0 and show that g'(t)
solution. However that proof is more complicated and the conclusion is not important for our purposes so we refer the reader to Revuz and Yor (1991), p. 340-341. Proof Suppose
(X 1 , B 1 ) and (X2, B2) are two solutions with XJ = X6 = x.
(As remarked above the solutions are specified by giving the joint distributions
Cx log(1/x) for x < e-2 + x) for x � e -2
(b) Consider g (t) = exp(-1/tP ) with where 1/Jp ( Y) = py{log(1/y)} (p+l )/p .
(4.1) Theorem. If pathwise uniqueness holds then there is uniqueness in distribution. Remark. Pathwise uniqueness also implies that every solution is a strong
3.2. The main reason for interest in (3.3) is the weakening of Lips
n.
Example 4.1. Since $\mathrm{sgn}(x)^2 = 1$, any solution to the equation in Example 2.1 is a local martingale with $\langle X\rangle_t = t$. So Lévy's theorem, (4.1) in Chapter 3, implies that $X_t$ is a Brownian motion. The last result asserts that uniqueness in distribution holds, but (2.1) shows that pathwise uniqueness fails. The next result, also due to Yamada and Watanabe (1971), shows that the other implication is true.
(p - 1) = p8 i.e. , p = 1/(1 Cp = C6 i.e., C = p- 1 /( 1 - o) Xt = { C(t
the Brownian motion and the solution at the same time. In signal processing applications one must deal with the noisy signal that one is given, so strong solutions are required. However, if the aim is to construct diffusion processes then there is nothing wrong with a weak solution. Now if we do not insist on staying on the original space $(\Omega, \mathcal{F}, P)$ then we can use the map $\omega \to (X_t(\omega), B_t(\omega))$ to move from the original space $(\Omega, \mathcal{F})$ to $(C \times C, \mathcal{C} \times \mathcal{C})$ where we can use the filtration generated by the coordinates $\omega_1(s), \omega_2(s)$, $s \le t$. Thus a weak solution is completely specified by giving the joint distribution $(X_t, B_t)$. Reflecting our interest in constructing the process $X_t$ and sneaking in some notation we will need in a minute, we say that there is uniqueness in distribution if whenever $(X,B)$ and $(X',B')$ are solutions of SDE$(b,\sigma)$ with $X_0 = X_0' = x$ then $X$ and $X'$ have the same distribution, i.e., $P(X \in A) = P(X' \in A)$ for all $A \in \mathcal{C}$. Here a solution of SDE$(b,\sigma)$ is something that satisfies (*) in the sense defined in Section 5.2.
=
1/Jp ( g (t))
5.4. Weak Solutions
Intuitively, a strong solution corresponds to solving the SDE for a given Brow nian motion, while in producing a weak solution we are allowed to construct
$(X^i, B^i)$.) Since $(C, \mathcal{C})$ is a complete separable metric space, we can find regular conditional distributions $Q_i(\omega, A)$ defined for $\omega \in C$ and $A \in \mathcal{C}$ (see Section 4.1 in Durrett (1995)) so that
$$P(X^i \in A \mid B^i) = Q_i(B^i, A)$$
Let $P_0$ be the measure on $(C, \mathcal{C})$ for the standard Brownian motion and define a measure $\pi$ on $(C^3, \mathcal{C}^3)$ by
$$\pi(A_0 \times A_1 \times A_2) = \int_{A_0} dP_0(\omega_0)\, Q_1(\omega_0, A_1)\, Q_2(\omega_0, A_2)$$
If we let
�i (wo, w2) = w;(t) then it is clear that Wt ,
A2 = C A 1 = C Y1 = Y 2
=
y2
4
x2
a(:z:) = uuT (:z:) and recall from Section 5.1 that (Xi , Xi ) t = 11 a;i (X.) ds We say that X is a solution to the martingale problem for simply X solves MP(b,a), if for each i and j, 1 1 xf - 1 b; (X.) ds and x;x{ - 1 a;j (X.) ds
0
b
and
a, or
:F1
To be precise in the definition above we need to specify a filtration for the local martingales. In posing the martingale problem we are only interested in the process so we can assume without loss of generality that the underlying space is and the filtration is the one generated by C) with In the formulation above and are the basic data for the mar tingale problem. This presents us with the problem of obtaining from . To solve this problem we begin by introducing some properties of Since then Also, if z E is symmetric, that is,
(Xi ,Xi )1 = (Xi ,Xi) 1 , a
a;j = aii ·
see Stroock and
T ;::: al 2 and lla(:z:) - a(y)l l :::; Cl :z: - Y l for all :z:, y al 1 12(:z:) - a1 12e(y)lla(:z::::;)O(C/2a81 112 )l:z: - Yl for all :z:,y. When a can degenerate we need to suppose more smoothness to get a Lipschitz continuous square root. (4.4) Theorem. Suppose aij E C2 and 1rgft-d l eT D;;a(:z:)B.I :::; C IBI 2 then ll a 1 12 (:z: ) - a 1 12 (y)l l :::; d(2C) 1 12 I :z: - Y l for all :z:, y. To see why we need a E C2 to get a Lipschitz a 1 12 , consider d = 1 and a(x) = lxl>- . In this case u(x) = l :z: l>./2 is Lipschitz continuous at 0 if and only if >. ;::: 2. (4.3) Theorem. If then
are local martingales. The second condition is, of course, equivalent to
a = uuT
0"
Varadhan (1979), Section 5.2.
Let
X (C, 1 , Xt (w) = w1b
;:::
• • •
Take or respectively in the definition. Thus, we have two solu tions of the equation driven by the same Brownian motion. Pathwise uniqueness implies with probability one. (i) follows since
xt 4 yt
and A with diagonal entries >.1 ;::: >.2 ;::: >.d 0 so that a = UaTdiagonal AU. a ismatrix invertible if and only if >. d > 0. (4.2 }_tells us a = uT AU, so if we want a = O"O"T we can take = uT .;A, where VA is the diagonal matrix with entries .;>:i. (4.2) tells us how to find the square root of a given a. The usual algorithms for finding U are such that if we apply them to a measurable a( :z:) then the resulting square root u( :z:) will also. be measurable. It takes more sophistication to start with a smooth a(:z: ) and construct a smooth u(:z: ) . As in the case of (4.2), we will content ourselves to just state the result. For detailed proofs of (4.3) and (4.4)
X1 . ua. a. Rd
Returning to probability theory, our first step is to make the connection between solutions to the martingale problem and solutions of the SDE.
a
(4.2) Lemma. For any symmetric nonnegative definite matrix a, we can find an orthogonal matrix U (i.e., its rows are perpendicular vectors of length 1)
MP(b,a) u Bt , u). (X, B) If u is invertible at each X this is easy. Let �i = Xf - J� b; (X.) ds
a.
Proof
and let So is nonnegative definite and linear algebra provides us with the following useful information.
X
{4.1)) Theorem. Let be a solution to and a measurable square root of Then there is a Brownian motion possibly defined on an enlarge ment of the original probability space, so that solves SDE(b,
(a)
From the last two definitions and the associative law it is immediate that (b )
1t u(X.) dB. = yt - Yo = X1 - Xo - 11 b(X.)ds
so (*) holds. The $B^i$ are local martingales with $\langle B^i, B^j \rangle_t = \delta_{ij} t$ so Exercise 4.1 in Chapter 3 implies that $B$ is a Brownian motion.

To deal with the general case in one dimension, we let $I_s = 1$ if $\sigma(X_s) \ne 0$ and 0 otherwise, let $J_s = 1 - I_s$, let $W_s$ be an independent Brownian motion (to define this we may need to enlarge the probability space) and let

(a') $$B_t = \int_0^t \frac{I_s}{\sigma(X_s)}\,dY_s + \int_0^t J_s\,dW_s$$

The two integrals are well defined since their variance processes are $\le t$. $B_t$ is a local martingale with $\langle B \rangle_t = t$ so (4.1) in Chapter 3 implies that $B_t$ is a Brownian motion. To extract the SDE now, we observe that the associative law and the fact that $J_s \sigma(X_s) = 0$ imply

$$\int_0^t \sigma(X_s)\,dB_s = \int_0^t I_s\,dY_s = Y_t - Y_0 - \int_0^t J_s\,dY_s$$

In view of (b) the proof will be complete when we show $\int_0^t J_s\,dY_s = 0$. To do this we note that

$$E\left( \int_0^t J_s\,dY_s \right)^2 = E \int_0^t J_s^2\, a(X_s)\,ds = 0$$

To handle higher dimensions we let $U(x)$ be the orthogonal matrices constructed in (4.2) and let

$$Z_t = \int_0^t U(X_s)\,dY_s$$

Introducing Kronecker's $\delta_{ij} = 1$ if $i = j$ and 0 otherwise, we have

$$\langle Z^i, Z^j \rangle_t = \delta_{ij} \int_0^t \lambda_i(X_s)\,ds$$

Our result for one dimension implies that the $Z^i_t$ may be realized as stochastic integrals with respect to independent Brownian motions. Tracing back through the definitions gives the desired representation. For more details see Ikeda and Watanabe (1981), p. 89-91, Revuz and Yor (1991), p. 190-191, or Karatzas and Shreve (1991), p. 315-317. □

(4.5) shows that there is a 1-1 correspondence between distributions of solutions of the SDE and solutions of the martingale problem, which by definition are distributions on $(C, \mathcal{C})$. Thus there is uniqueness in distribution for the SDE if and only if the martingale problem has a unique solution. Our final topic is an important reason to be interested in uniqueness.

(4.6) Theorem. Suppose $a$ and $b$ are locally bounded functions and MP(b, a) has a unique solution. Then the strong Markov property holds.

Proof. For each $x \in R^d$ let $P_x$ be the probability measure on $(C, \mathcal{C})$ that gives the distribution of the unique solution starting from $X_0 = x$. If you talk fast the proof is easy. If $T$ is a stopping time then the conditional distribution of $\{X_{T+s}, s \ge 0\}$ given $\mathcal{F}_T$ is a solution of the martingale problem starting from $X_T$ and so by uniqueness it must be $P_{X(T)}$. Thus, if we let $\theta_T$ be the random shift defined in Section 1.3, then for any bounded measurable $Y : C \to R$ we have the strong Markov property

(4.7) $$E(Y \circ \theta_T \mid \mathcal{F}_T) = E_{X(T)} Y$$

To begin to turn the last paragraph into a proof, fix a starting point $X_0 = x_0$. To simplify this rather complicated argument, we will follow the standard practice (see Stroock and Varadhan (1979), p. 145-146, Karatzas and Shreve (1991), p. 321-322, or Rogers and Williams (1987), p. 162-163) of giving the details only for the case in which the coefficients are bounded and hence

$$M^{i0}_t = X^i_t - \int_0^t b_i(X_s)\,ds, \qquad M^{ij}_t = M^{i0}_t M^{j0}_t - \int_0^t a_{ij}(X_s)\,ds$$

are martingales. We have added an extra index in the first case so we can refer to all the martingales at once by saying "the $M^{ij}_t$." Consider a bounded stopping time $T$ and let $Q(\omega, A)$ be a regular conditional distribution for $\{X_{T+t}, t \ge 0\}$ given $\mathcal{F}_T$. We want to claim that

(4.8) Lemma. For $P_{x_0}$ a.e. $\omega$, all the $M^{ij}_t \circ \theta_T$
are martingales under $Q(\omega, \cdot)$.

To prove this we consider rational $s < t$ and $B \in \mathcal{F}_s$, and note that the optional stopping theorem implies that if $A \in \mathcal{F}_T$ then

$$E_{x_0}\left( \{ (M^{ij}_t - M^{ij}_s) 1_B \} \circ \theta_T \; 1_A \right) = 0$$

Letting $B$ run through a countable collection that is closed under intersection and generates $\mathcal{F}_s$, it follows that $P_{x_0}$ a.s., the expected value of $(M^{ij}_t - M^{ij}_s) 1_B$ is 0, which implies (4.8). Using uniqueness now gives (4.7) and completes the proof. □
5.5. Change of Measure

Our next step is to use Girsanov's formula from Section 2.12 to solve martingale problems or more precisely to change the drift in an existing solution. To accommodate Example 5.2 below we must consider the case in which the added drift $b$ depends on time as well as space.

(5.1) Theorem. Suppose $X_t$ is a solution of MP($\beta$, a), which, for concreteness and without loss of generality, we suppose is defined on the canonical probability space $(C, \mathcal{C}, P)$ with $\mathcal{F}_t$ the filtration generated by $X_t(\omega) = \omega_t$. Suppose $a^{-1}(x)$ exists for all $x$ and that $b(s, x)$ is measurable and has $|b^T a^{-1} b(s, x)| \le M$. We can define a probability measure $Q$ locally equivalent to $P$, so that under $Q$, $X_t$ is a solution to MP($\beta + b$, a).

Proof. Let $\bar X_t = X_t - \int_0^t \beta(X_s)\,ds$, let $c(s, x) = a^{-1}(x) b(s, x)$ and let

$$Y_t = \int_0^t c(s, X_s) \cdot d\bar X_s$$

which exists since

$$\langle Y \rangle_t = \int_0^t \sum_{i,j} c_i(s, X_s) c_j(s, X_s)\, d\langle \bar X^i, \bar X^j \rangle_s = \int_0^t c^T a c(s, X_s)\,ds = \int_0^t b^T a^{-1} b(s, X_s)\,ds \le Mt$$

by our assumption that $|b^T a^{-1} b(s, x)| \le M$. Letting $Q_t$ and $P_t$ be the restrictions of $Q$ and $P$ to $\mathcal{F}_t$ we define our Radon-Nikodym derivative to be

$$\frac{dQ_t}{dP_t} = \alpha_t = \exp\left( Y_t - \tfrac{1}{2} \langle Y \rangle_t \right)$$

Since $\langle Y \rangle_t \le Mt$, (3.7) in Chapter 3 implies that $\alpha_t$ is a martingale, and invoking (12.3) in Chapter 2 we have defined $Q$. To apply Girsanov's formula (12.4) from Chapter 2 we have to compute $A^i_t = \int_0^t \alpha_s^{-1}\, d\langle \alpha, \bar X^i \rangle_s$. Ito's formula and the associative law imply that

$$\alpha_t - \alpha_0 = \int_0^t \alpha_s\,dY_s$$

So by the formula for the covariance of stochastic integrals

$$\langle \alpha, \bar X^i \rangle_t = \sum_j \int_0^t \alpha_s c_j(s, X_s)\, d\langle \bar X^j, \bar X^i \rangle_s = \int_0^t \alpha_s (c^T a)_i(s, X_s)\,ds = \int_0^t \alpha_s b_i(s, X_s)\,ds$$

since $c(s, x) = a^{-1}(x) b(s, x)$. It follows that

$$A^i_t = \int_0^t b_i(s, X_s)\,ds$$

Using Girsanov's formula now, it follows that

$$\bar X^i_t - \int_0^t b_i(s, X_s)\,ds$$

is a local martingale/$Q$. This is half of the desired conclusion but the other half is easy. We note that under $P$

$$\langle X^i, X^j \rangle_t = \int_0^t a_{ij}(X_s)\,ds$$

and (12.6) in Chapter 2 implies that the covariance of $X^i$ and $X^j$ is the same under $Q$. □
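The two linear algebra identities that drive the proof of (5.1), namely $c^T a c = b^T a^{-1} b$ and $(c^T a)_i = b_i$ when $c = a^{-1} b$, are easy to confirm numerically. The sketch below is only an illustration at a single point $(s, x)$: the matrix `a`, the vector `b`, the dimension and the random seed are arbitrary choices made here, not anything taken from the text.

```python
import numpy as np

rng = np.random.default_rng(0)          # fixed seed, illustrative only
d = 3
root = rng.normal(size=(d, d))
a = root @ root.T + d * np.eye(d)       # symmetric positive definite, so a^{-1} exists
b = rng.normal(size=d)                  # the added drift evaluated at one (s, x)

c = np.linalg.solve(a, b)               # c = a^{-1} b

quad_c = c @ a @ c                      # integrand of <Y>_t written in terms of c
quad_b = b @ np.linalg.solve(a, b)      # b^T a^{-1} b, the quantity assumed bounded by M
girsanov_drift = a @ c                  # (c^T a)_i, the drift that appears under Q
```

Here `quad_c` and `quad_b` should agree, and `girsanov_drift` should reproduce `b`, which is why the boundedness assumption controls $\langle Y \rangle_t$ and why the drift added under $Q$ is exactly $b$.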
Exercise 5.1. To check your understanding of the formulas, consider the case in which $b$ does not depend on $s$, start with the $Q$ constructed above, change to a measure $R$ to remove the added drift $b$, and show that $dR_t/dQ_t = 1/\alpha_t$ so $R = P$.

Example 5.1. Suppose $X_t$ is a solution of MP(0, I), i.e., a standard Brownian motion and consider the special situation in which $b(x) = \nabla U(x)$. In this case
recalling the definition of $Y_t$ in the proof of (5.1) and applying Ito's formula to $U(X_t)$ we have

$$Y_t = \int_0^t \nabla U(X_s) \cdot dX_s = U(X_t) - U(X_0) - \frac{1}{2} \int_0^t \Delta U(X_s)\,ds$$

Thus, we can get rid of the stochastic integral in the definition of the Radon-Nikodym derivative:

$$\frac{dQ_t}{dP_t} = \exp\left( U(X_t) - U(X_0) - \frac{1}{2} \int_0^t \left\{ \Delta U(X_s) + |\nabla U(X_s)|^2 \right\} ds \right)$$

In the special case of the Ornstein-Uhlenbeck process, $b(x) = -2\alpha x$, where $\alpha$ is a positive real number, we have $U(x) = -\alpha |x|^2$, $\Delta U(x) = -2d\alpha$, and the expression above simplifies to

$$\frac{dQ_t}{dP_t} = \exp\left( -\alpha |X_t|^2 + \alpha |X_0|^2 + d\alpha t - \int_0^t 2\alpha^2 |X_s|^2\,ds \right)$$
In the trivial case of constant drift, i.e., $b(x) = \mu = \nabla U(x)$ we have $U(x) = \mu \cdot x$ and $\Delta U(x) = 0$ so

$$\frac{dQ_t}{dP_t} = \exp\left( \mu \cdot X_t - \mu \cdot X_0 - \frac{1}{2} |\mu|^2 t \right)$$

It is comforting to note that when $X_0 = x$, the one dimensional density functions satisfy

$$Q_t(X_t = y) = \exp\left( \mu \cdot y - \mu \cdot x - \tfrac{1}{2} |\mu|^2 t \right) P_t(X_t = y) = (2\pi t)^{-d/2} \exp\left( -|y - x - \mu t|^2 / 2t \right)$$

as it should be since under $Q$, $X_t$ has the same distribution as $B_t + \mu t$. □

(5.1) deals with existence of solutions, our next result with uniqueness.

(5.2) Theorem. Under the hypotheses of (5.1) there is a 1-1 correspondence between solutions of MP($\beta$, a) and MP($\beta + b$, a).

Proof. Let $P_1$ and $P_2$ be solutions of MP($\beta$, a). By change of measure we can produce solutions $Q_1$ and $Q_2$ of MP($\beta + b$, a). We claim that if $P_1 \ne P_2$ then $Q_1 \ne Q_2$. To prove this we consider two cases:

CASE 1. Suppose $P_1$ and $P_2$ are not locally equivalent measures. Then since $Q_i$ is locally equivalent to $P_i$, $Q_1$ and $Q_2$ are not locally equivalent and cannot be equal.

CASE 2. Suppose $P_1$ and $P_2$ are locally equivalent. In the definition of $\alpha_t$, $\langle Y \rangle_t$ and $Y_t$ are independent of $P_i$ by (12.6) and (12.7) in Chapter 2, so $dQ_1/dP_1 = dQ_2/dP_2$ and $Q_1 \ne Q_2$. □

One can considerably relax the boundedness condition in (5.1). The next result will cover many examples. If it does not cover yours, note that the main ingredients of the proof are that the conditions of (5.1) hold locally, and we know that solutions of MP(b, a) do not explode.

(5.3) Theorem. Suppose $X$ is a solution of MP(0, a) constructed on $(C, \mathcal{C})$. Suppose that $a^{-1}(x)$ exists, $b^T a^{-1} b(x)$ is locally bounded and measurable, and the quadratic growth condition from (3.2) holds. That is,

$$\sum_{i=1}^d \left\{ 2 x_i b_i(x) + a_{ii}(x) \right\} \le A(1 + |x|^2)$$

Then there is a 1-1 correspondence between solutions to MP(b, a) and MP(0, a).

Remark. Combining (5.3) with (3.1) and (4.1) we see that if in addition to the conditions in (5.3) we have $a = \sigma\sigma^T$ where $\sigma$ is locally Lipschitz, then MP(b, a) has a unique solution. By using (3.3) instead of (3.1) in the previous sentence we can get a better result in one dimension.

Proof. Let $Y_t$ and $\alpha_t$ be as in the proof of (5.1) and let $T_n = \inf\{t : |X_t| > n\}$. From the proof of (5.1) it follows that $\alpha(t \wedge T_n)$ is a martingale. So if we let $P^n_t$ be the restriction of $P$ to $\mathcal{F}_{t \wedge T_n}$ and let $dQ^n_t/dP^n_t = \alpha(t \wedge T_n)$ then under $Q^n$, $X_t$ is a solution of MP(b, a) up to time $T_n$. The quadratic growth condition implies that solutions of MP(b, a) do not explode. We will now show that in the absence of explosions, $\alpha_t$ is a martingale. First, to show $\alpha_t$ is integrable, we note that Fatou's lemma implies

(a) $$E\alpha_t \le \liminf_{n \to \infty} E(\alpha(t \wedge T_n)) = 1$$

where $E$ denotes expected value with respect to $P$. Next we note that

(b) $$E|\alpha_t - \alpha_{t \wedge T_n}| = E(|\alpha_t - \alpha(T_n)|; T_n \le t) \le E(\alpha(T_n); T_n \le t) + E\alpha_t - E(\alpha_t; T_n > t)$$

The absence of explosions implies that as $n \to \infty$

(c) $$E(\alpha(T_n); T_n \le t) \to 0$$

This and the fact that $\alpha(t \wedge T_n)$ is a martingale implies

(d) $$E(\alpha_t; T_n > t) \to 1$$

as $n \to \infty$. (b), (c), and (d) imply that $\alpha(t \wedge T_n) \to \alpha_t$ in $L^1$. It follows (see e.g., (5.6) in Chapter 4 of Durrett (1995)) that $\alpha(t \wedge T_n) = E(\alpha_t \mid \mathcal{F}_{t \wedge T_n})$ and from this that $\alpha_s$, $0 \le s \le t$ is a martingale. The rest of the proof is the same as that of (5.1) and (5.2). □
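The constant-drift case of Example 5.1 lends itself to a quick numerical sanity check: multiplying the Brownian transition density by the Radon-Nikodym factor $\exp(\mu \cdot y - \mu \cdot x - |\mu|^2 t/2)$ should give the Gaussian density centered at $x + \mu t$. The sketch below does this at a single point; the function names and the particular values of `x`, `mu`, `y` and `t` are illustrative choices made here, not part of the text.

```python
import numpy as np

def gaussian_density(y, mean, t):
    """Density at y of a d-dimensional normal with the given mean and covariance t*I."""
    y = np.asarray(y, dtype=float)
    mean = np.asarray(mean, dtype=float)
    d = y.size
    return (2 * np.pi * t) ** (-d / 2) * np.exp(-np.sum((y - mean) ** 2) / (2 * t))

def tilted_density(y, x, mu, t):
    """Brownian density from x, times the factor exp(mu.y - mu.x - |mu|^2 t / 2)."""
    y, x, mu = (np.asarray(v, dtype=float) for v in (y, x, mu))
    factor = np.exp(mu @ y - mu @ x - 0.5 * (mu @ mu) * t)
    return factor * gaussian_density(y, x, t)

x = np.array([0.3, -1.0])     # starting point X_0 = x
mu = np.array([0.5, 2.0])     # constant drift
y = np.array([1.2, 0.7])      # point at which densities are compared
t = 1.5

lhs = tilted_density(y, x, mu, t)             # density of X_t under Q via the change of measure
rhs = gaussian_density(y, x + mu * t, t)      # density of B_t + mu t started at x
```

The agreement of `lhs` and `rhs` is just the completion of the square in the exponent, evaluated in floating point.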
Our last chore in this section is to complete the proof of (2.10) in Chapter 1 by showing

(5.4) Theorem. Let $B_t$ be a Brownian motion starting at 0, let $g$ be a continuous function with $g(0) = 0$, and let $\epsilon > 0$. Then

$$P\left( \sup_{0 \le t \le 1} |B_t - g(t)| < \epsilon \right) > 0$$

Proof. We can find a $C^1$ function $h$ with $h(0) = 0$ and $|h(s) - g(s)| < \epsilon/2$ for $s \le 1$ so we can suppose without loss of generality that $g \in C^1$ and let $b_s = g'(s)$. If we let $Y_t = \int_0^t b_s\,dB_s$ and define $Q_t$ by

$$\frac{dQ_t}{dP_t} = \exp\left( \int_0^t b_s\,dB_s - \frac{1}{2} \int_0^t |b_s|^2\,ds \right)$$

then it follows from (5.1) that under $Q$, $B_t - g(t)$ is a standard Brownian motion. Let $G = \{ |B_s - g(s)| < \epsilon \text{ for } s \le t \}$. Exercise 2.11 in Chapter 1 implies that $Q_t(G) > 0$. It follows from the Cauchy-Schwarz inequality that

$$Q_t(G) = \int \frac{dQ_t}{dP_t} \cdot 1_G \, dP_t \le \left( \int \left( \frac{dQ_t}{dP_t} \right)^2 dP_t \right)^{1/2} P_t(G)^{1/2}$$

To complete the proof now we let $E$ denote expectation with respect to $P_t$ and observe that if $|b_s| \le M$ for $s \le t$ then

$$\int \left( \frac{dQ_t}{dP_t} \right)^2 dP_t = E \exp\left( \int_0^t 2 b_s\,dB_s - \int_0^t |b_s|^2\,ds \right) = \exp\left( \int_0^t |b_s|^2\,ds \right) \le e^{M^2 t}$$

since $\exp\left( \int_0^t 2 b_s\,dB_s - (1/2) \int_0^t |2 b_s|^2\,ds \right)$ is a martingale and hence has expected value 1. For this step it is important to remember that $b_s = g'(s)$ is a non-random vector. □

5.6. Change of Time

In the last section we learned that we could alter the $b$ in the SDE by changing the measure. However, we could not alter $\sigma$ with this technique because the quadratic variation is not affected by absolutely continuous change of measure. In this section we will introduce a new technique, change of time, that will allow us to alter the diffusion coefficient. To prepare for developments in Section 6.1 we will allow for the possibility that our processes are defined on a random time interval.
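(6.1) Theorem. Let $X_t$ be a solution of MP(b, a) for $t < \zeta$. That is, for each $i$ and $j$,

$$\tilde X^i_t = X^i_t - \int_0^t b_i(X_s)\,ds \quad \text{and} \quad \tilde X^i_t \tilde X^j_t - \int_0^t a_{ij}(X_s)\,ds$$

are local martingales on $[0, \zeta)$. Let $g$ be a positive function and suppose that

$$\sigma_t = \int_0^t g(X_s)\,ds < \infty \quad \text{for all } t < \zeta$$

Define the inverse of $\sigma$ by $\gamma_s = \inf\{t : \sigma_t > s \text{ or } t \ge \zeta\}$ and let $Y_s = X(\gamma_s)$ for $s < \xi = \sigma_\zeta$. Then $Y_s$ is a solution of MP(b/g, a/g) for $s < \xi$.

Proof. We begin by noting that $X_t = Y(\sigma_t)$, then in the second step change variables $r = \sigma_s$, $dr = g(X_s)\,ds$, $X_s = X_{\gamma(r)} = Y_r$, to get

$$X^i_t - \int_0^t b_i(X_s)\,ds = Y^i_{\sigma_t} - \int_0^t b_i(Y(\sigma_s))\,ds = Y^i_{\sigma_t} - \int_0^{\sigma_t} \frac{b_i}{g}(Y_r)\,dr$$

Changing variables $t = \gamma_u$ gives

$$X^i_{\gamma_u} - \int_0^{\gamma_u} b_i(X_s)\,ds = Y^i_u - \int_0^u \frac{b_i}{g}(Y_r)\,dr$$

Since the $\gamma_u$ are stopping times, the right-hand side is a local martingale on $[0, \xi)$ and we have checked one of the two conditions to be a solution of MP(b/g, a/g).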
Then MP(0, a) is well posed, i.e., uniqueness in distribution holds and there is no explosion.

Proof. We start with $X_t = B_t$, a Brownian motion, which solves MP(0, I) and take $g(x) = h(x)^{-1}$. Since $g(B_t) \le \epsilon_R^{-1}$ for $t \le T_R = \inf\{t : |B_t| > R\}$ we have $\int_0^t g(B_s)\,ds < \infty$ for any $t$. On the other hand

$$\int_0^\infty g(B_s)\,ds = \infty$$

since Brownian motion is recurrent, so applying (6.1) we have a solution of MP(0, a). Uniqueness follows from (6.2) so the proof is complete. □

Combining (6.3) with (5.2) we can construct solutions of MP(b, a) in one dimension for very general $b$ and $a$.

(6.4) Theorem. Consider dimension $d = 1$ and suppose
(i) $0 < \epsilon_R \le a(y) \le C_R < \infty$ when $|y| \le R$
(ii) $b(x)$ is locally bounded
(iii) $x \cdot b(x) + a(x) \le A(1 + x^2)$
Then MP(b, a) is well posed.

6 One Dimensional Diffusions

6.1. Construction

In this section we will take an approach to constructing solutions of

$$dX_t = b(X_t)\,dt + \sigma(X_t)\,dB_t$$

that is special to one dimension, but that will allow us to obtain a very detailed understanding of this case. To obtain results with a minimum of fuss and in a generality that encompasses most applications we will assume:

(1D) $b$ and $\sigma$ are continuous and $a(x) = \sigma^2(x) > 0$ for all $x$

The purpose of this section is to answer the following questions: Does a solution exist when (1D) holds? Is it unique? Does it explode? Our approach may be outlined as follows:

(i) We define a function $\varphi$ so that if $X_t$ is a solution of the SDE on $[0, \xi)$ then $Y_t = \varphi(X_t)$ is a local martingale on $[0, \xi)$.
(ii) Our $Y_t$ has $\langle Y \rangle_t = \int_0^t h(Y_s)\,ds$ so we construct $Y_t$ to be a solution of MP(0, h) by time changing a Brownian motion.
(iii) We define $X_t = \varphi^{-1}(Y_t)$ and check that $X_t$ is a solution of the SDE.

To begin to carry out our plan, suppose that $X_t$ is a solution of MP(b, a) for $t < \xi$. If $f \in C^2$, Ito's formula implies that for $t < \xi$

(1.1) $$f(X_t) - f(X_0) = \int_0^t f'(X_s)\,dX_s + \frac{1}{2} \int_0^t f''(X_s)\,d\langle X \rangle_s = \text{local mart.} + \int_0^t Lf(X_s)\,ds$$
where

(1.2) $$Lf(x) = \frac{1}{2} a(x) f''(x) + b(x) f'(x)$$

and $a(x) = \sigma^2(x)$. From this we see that $f(X_t)$ is a local martingale on $[0, \zeta)$ if and only if $Lf(x) = 0$. Setting $Lf = 0$ and noticing (1D) implies $a(x) > 0$ gives a first order differential equation for $f'$

(1.3) $$(f')' = \frac{-2b}{a} f'$$

Solving this equation we find,

$$f'(y) = B \exp\left( \int_0^y \frac{-2b(z)}{a(z)}\,dz \right)$$

and it follows that

$$f(x) = A + \int_0^x B \exp\left( \int_0^y \frac{-2b(z)}{a(z)}\,dz \right) dy$$

Any of these functions can be called the natural scale. We will usually take $A = 0$ and $B = 1$ to get

(1.4) $$\varphi(x) = \int_0^x \exp\left( \int_0^y \frac{-2b(z)}{a(z)}\,dz \right) dy$$
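Formula (1.4) can be evaluated on a grid when no closed form is at hand. The following is a minimal sketch using the trapezoid rule for both the inner and the outer integral; the function name `natural_scale`, the grid size, and the test case $b(x) = \mu$, $a(x) = 1$ (for which $\varphi(x) = (1 - e^{-2\mu x})/(2\mu)$) are assumptions made here for illustration.

```python
import numpy as np

def natural_scale(b, a, x, n=20001):
    """Approximate phi(x) = int_0^x exp(-int_0^y 2 b(z)/a(z) dz) dy for x > 0."""
    y = np.linspace(0.0, x, n)
    integrand = 2.0 * b(y) / a(y)
    # running trapezoid rule for the inner integral int_0^y 2b/a dz
    steps = 0.5 * (integrand[1:] + integrand[:-1]) * np.diff(y)
    inner = np.concatenate(([0.0], np.cumsum(steps)))
    outer = np.exp(-inner)                      # this is phi'(y)
    # trapezoid rule for the outer integral int_0^x phi'(y) dy
    return float(np.sum(0.5 * (outer[1:] + outer[:-1]) * np.diff(y)))

mu = 0.7                                        # hypothetical constant drift, with a = 1
phi_numeric = natural_scale(lambda z: mu * np.ones_like(z),
                            lambda z: np.ones_like(z), x=1.5)
phi_exact = (1.0 - np.exp(-2.0 * mu * 1.5)) / (2.0 * mu)
```

Replacing the two lambdas by other continuous coefficients with $a > 0$ gives the scale function for that diffusion, up to discretization error.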
However, in some situations it will be convenient to replace the 0's at the lower limits by some other point. Note that our assumption (1D) implies $\varphi$ is $C^2$ so our use of Ito's formula in (1.1) is justified.

By (1.1), $Y_t = \varphi(X_t)$ is a local martingale with

(1.5) $$\langle Y \rangle_t = \int_0^t h(Y_s)\,ds$$

where

$$h(y) = \{ \varphi'(\varphi^{-1}(y)) \}^2 a(\varphi^{-1}(y)) > 0 \text{ is continuous}$$

So if we let $\tau_t = \inf\{s : \langle Y \rangle_s > t\}$ then $W_t = Y_{\tau_t}$ is a Brownian motion run for an amount of time $\langle Y \rangle_\xi$.

Since $\varphi'(x) > 0$ for all $x$, the image of $(\alpha, \beta)$ under $\varphi$ is an open interval $(\ell, r)$ with $-\infty \le \ell < 0 < r \le \infty$. Letting $W_t$ be a Brownian motion, $\zeta = \inf\{t : W_t \notin (\ell, r)\}$, $g = 1/h$,

$$\sigma_t = \int_0^t g(W_s)\,ds \text{ for } t < \zeta \quad \text{and} \quad \gamma_s = \inf\{t : \sigma_t > s \text{ or } t \ge \zeta\}$$

Using (6.1) in Chapter 5, we see that $Y_s = W(\gamma_s)$ is a solution of MP(0, h) for $s < \xi = \sigma_\zeta$.
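The time change just described can be mimicked on a discretized path. The sketch below builds the clock $\sigma_t = \int_0^t g(W_s)\,ds$ by a left-endpoint sum and reads off $Y_s = W(\gamma_s)$ on the same grid. It is only a rough illustration: the helper name `time_change`, the step size, the seed, and the choice $g(x) = 1/(1 + x^2)$ (that is, a hypothetical $h(y) = 1 + y^2$) are assumptions made here, and the exit time $\zeta$ is ignored because this $g$ is defined on the whole line.

```python
import numpy as np

def time_change(path, dt, g):
    """Time-change a discretized path by the clock sigma_t = int_0^t g(path_s) ds.

    Returns (sigma, s_grid, Y), where Y[k] approximates path(gamma_s) at s = s_grid[k]
    and gamma_s = inf{t : sigma_t > s} is evaluated on the time grid.
    """
    rate = g(path[:-1])                                   # left-endpoint rule on each step
    sigma = np.concatenate(([0.0], np.cumsum(rate * dt)))
    s_grid = np.arange(0.0, sigma[-1], dt)                # new clock, stays below the final sigma
    idx = np.searchsorted(sigma, s_grid, side="right")    # first grid index with sigma > s
    idx = np.minimum(idx, len(path) - 1)
    return sigma, s_grid, path[idx]

rng = np.random.default_rng(1)                            # fixed seed, illustrative only
dt = 1e-3
W = np.concatenate(([0.0], np.cumsum(rng.normal(scale=np.sqrt(dt), size=5000))))
sigma, s_grid, Y = time_change(W, dt, lambda x: 1.0 / (1.0 + x**2))
```

Because $g \le 1$ here, the new clock runs slower than the original one, so `Y` covers a shorter time interval than `W` while visiting the same values in the same order.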
To define $X_t$ now, we let $\psi$ be the inverse of $\varphi$ and let $X_t = \psi(Y_t)$. To check that $X$ solves MP(b, a) until it exits from $(\alpha, \beta)$ at time $\xi$, we differentiate $\varphi(\psi(x)) = x$ and rearrange, then differentiate again to get

$$\psi'(x) = \frac{1}{\varphi'(\psi(x))}, \qquad \psi''(x) = -\frac{1}{\{\varphi'(\psi(x))\}^2}\, \varphi''(\psi(x))\, \psi'(x)$$

Using the first equality and $\varphi''(y) = -(2b(y)/a(y)) \varphi'(y)$, which follows from (1.3), we have

$$\psi''(x) = \frac{1}{\{\varphi'(\psi(x))\}^2} \cdot \frac{2b(\psi(x))}{a(\psi(x))}$$

These calculations show $\psi \in C^2$ so Ito's formula, (1.1), and (1.5) imply that for $t < \xi$

$$\psi(Y_t) - \psi(Y_0) = \int_0^t \psi'(Y_s)\,dY_s + \frac{1}{2} \int_0^t \psi''(Y_s) h(Y_s)\,ds$$

For the second term we note that combining the formulas for $\psi''$ and $h$ gives (recall $\psi = \varphi^{-1}$)

$$\frac{1}{2} \psi''(y) h(y) = b(\psi(y))$$

To deal with the first term observe that $Y_t$ is a solution of MP(0, h) up to time $\xi$ so by (4.5) in Chapter 5 there is a Brownian motion $B_s$ with $dY_s = \sqrt{h(Y_s)}\,dB_s$. Since $\psi'(y) \sqrt{h(y)} = \sigma(\psi(y))$, letting $X_t = \psi(Y_t)$ we have for $t < \xi$

$$X_t - X_0 = \int_0^t \sigma(X_s)\,dB_s + \int_0^t b(X_s)\,ds$$
The last computation shows that if $Y_t$ is a solution of MP(0, h) then $X_t = \psi(Y_t)$ is a solution of MP(b, a), while (1.1) shows that if $X_t$ is a solution of MP(b, a) then $Y_t = \varphi(X_t)$ is a solution of MP(0, h). This establishes there is a 1-1 correspondence between solutions of MP(0, h) and MP(b, a). Using (6.2) of Chapter 5 now, we see that

(1.6) Theorem. Consider $(C, \mathcal{C})$ and let $X_t(\omega) = \omega_t$. Let $\alpha < \beta$ and $T_{(\alpha,\beta)} = \inf\{t : X_t(\omega) \notin (\alpha, \beta)\}$. Under (1D), uniqueness in distribution holds for MP(b, a) on $[0, T_{(\alpha,\beta)})$.

Using the uniqueness result with (4.6) of Chapter 5 we get

(1.7) Theorem. For each $x \in (\alpha, \beta)$ let $P_x$ be the law of $X_t$, $t < T_{(\alpha,\beta)}$ from (1.6). If we set $X_t = \Delta$ for $t \ge T_{(\alpha,\beta)}$, where $\Delta$ is the cemetery state of Section 1.3, then the resulting process has the strong Markov property.

6.2. Feller's Test
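In the previous section we showed that under (1D) there were unique solutions to MP(b, a) on $[0, T_{(\alpha,\beta)})$ where $T_{(\alpha,\beta)} = \inf\{t : X_t \notin (\alpha, \beta)\}$. The next result gives necessary and sufficient conditions for no explosions, i.e., $T_{(\alpha,\beta)} = \infty$ a.s. Let

$$T_y = \inf\{t : X_t = y\} \text{ for } y \in (\alpha, \beta), \qquad T_\alpha = \lim_{y \downarrow \alpha} T_y \quad \text{and} \quad T_\beta = \lim_{y \uparrow \beta} T_y$$

In stating (2.1) we have without loss of generality supposed $0 \in (\alpha, \beta)$. If you are confronted by an $(\alpha, \beta)$ that does not have this property pick your favorite $\gamma$ in the interval and translate the system by $-\gamma$. Changing variables in the integrals we see that (2.1) holds when all of the 0's are replaced by $\gamma$'s.

(2.1) Feller's test. Let $\varphi(x)$ be the natural scale defined by (1.4) and let $m(x) = 1/(\varphi'(x) a(x))$.

(a) $P_x(T_\beta < T_0)$ is positive for some (all) $x \in (0, \beta)$ if and only if

$$\int_0^\beta dx\, m(x) (\varphi(\beta) - \varphi(x)) < \infty$$

(b) $P_x(T_\alpha < T_0)$ is positive for some (all) $x \in (\alpha, 0)$ if and only if

$$\int_\alpha^0 dx\, m(x) (\varphi(x) - \varphi(\alpha)) < \infty$$

(c) If both integrals are infinite then $P_x(T_{(\alpha,\beta)} < \infty) = 0$ for all $x \in (\alpha, \beta)$.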
Proof. The key to the proof of our sufficient condition for no explosions given in (3.1) and (3.2) of Chapter 5 was the fact that if $A$ is sufficiently large

$$S_t = (1 + |X_t|^2) e^{-At} \text{ is a supermartingale}$$

To get the optimal result on explosions we have to replace $1 + |x|^2$ by a function that is tailor-made for the process. A natural choice that gives up nothing is a function $g \ge 0$ so that

$$e^{-t} g(X_t) \text{ is a local martingale}$$

To find such a $g$ it is convenient to look at things on the natural scale, i.e., let $Y_t = \varphi(X_t)$. Let $\ell = \lim_{x \downarrow \alpha} \varphi(x)$ and let $r = \lim_{x \uparrow \beta} \varphi(x)$. We will find a $C^2$ function $f(x)$ so that $f$ is decreasing on $(\ell, 0)$, $f$ is increasing on $(0, r)$ and

$$e^{-t} f(Y_t) \text{ is a local martingale}$$

Denouement. Once we have $f$ the conclusion of the argument is easy, so to explain our motivations we begin with the end. Let

$$f(\ell) = \lim_{y \downarrow \ell} f(y), \qquad f(r) = \lim_{y \uparrow r} f(y)$$

(2.6) and the calculations at the end of the proof will show that

(2.2a) The integral in (a) is finite if and only if $f(r) < \infty$.
(2.2b) The integral in (b) is finite if and only if $f(\ell) < \infty$.

Once these are established the conclusions of (2.1) follow easily. Let

$$T_y = \inf\{t : Y_t = y\} \text{ for } y \in (\ell, r), \qquad T_r = \lim_{y \uparrow r} T_y \quad \text{and} \quad T_\ell = \lim_{y \downarrow \ell} T_y, \qquad T_{(\ell, r)} = T_\ell \wedge T_r$$
Proof of (c). Suppose $f(\ell) = f(r) = \infty$. Let $a_n < 0 < b_n$ be chosen so that $f(a_n) = f(b_n) = n$ and let $T_n = T_{a_n} \wedge T_{b_n}$. Since $e^{-s} f(Y_s)$ is bounded before time $T_n \wedge t$ we can apply the optional stopping theorem at that time to conclude that if $y \in (a_n, b_n)$, then

$$P_y(T_n < t) \le \frac{e^t f(y)}{n} \to 0 \quad \text{as } n \to \infty$$

Letting $t \to \infty$ we conclude $0 = P_y(T_{(\ell, r)} < \infty) = P_{\varphi^{-1}(y)}(T_{(\alpha,\beta)} < \infty)$.

Proof of (a). Let $0 < y < r$, let $b_n \uparrow r$ with $b_1 > y$, and let $T_n = T_0 \wedge T_{b_n}$. If $f(r) = \infty$, applying the optional stopping theorem at time $T_n \wedge t$, we conclude that

$$P_y(T_{b_n} < T_0 \wedge t) \le \frac{e^t f(y)}{f(b_n)} \to 0 \quad \text{as } n \to \infty$$

Letting $t \to \infty$ we conclude $0 = P_y(T_r < T_0) = P_{\varphi^{-1}(y)}(T_\beta < T_0)$. This shows that if (a2) is false then (a1) is false, i.e., (a1) implies (a2). If $f(r) < \infty$, applying the optional stopping theorem at time $T_n$ (which is justified since $e^{-t} f(Y_t) \le f(b_n)$ for $t \le T_n$) we conclude that

$$1 < f(y) = E_y\left( e^{-T_n} f(Y_{T_n}) \right) \le 1 + f(r)\, E_y\left( e^{-T_{b_n}}; T_{b_n} < T_0 \right)$$

Rearranging gives

$$E_y\left( e^{-T_{b_n}}; T_{b_n} < T_0 \right) \ge \frac{f(y) - 1}{f(r)} > 0$$

Noting $T_r = T_0 < \infty$ is impossible, and letting $n \to \infty$ we have

$$E_y\left( e^{-T_{b_n}}; T_{b_n} < T_0 \right) \downarrow E_y\left( e^{-T_r}; T_r < T_0 \right)$$

which is a contradiction, unless $0 < P_y(T_r < T_0) = P_{\varphi^{-1}(y)}(T_\beta < T_0)$. This shows that (a2) implies (a3). (a3) implies (a1) is trivial so the proof of (a) is complete. Proof of (b) is identical to that of (a).
The search for $f$. To find a function $f$ so that $e^{-t} f(Y_t)$ is a local martingale, we apply Ito's formula to $f(x_1, x_2) = e^{-x_1} f(x_2)$ with $X^1_t = t$, $X^2_t = Y_t$, and recall $Y_t$ is a local martingale with variance process given by (1.5), so

$$e^{-t} f(Y_t) - f(Y_0) = \int_0^t \left\{ -e^{-s} f(Y_s) + \frac{1}{2} e^{-s} f''(Y_s) h(Y_s) \right\} ds + \text{local mart.}$$

so we want

(2.3) $$\frac{1}{2} h(x) f''(x) - f(x) = 0$$

To solve this equation we let $f_0 = 1$ and define for $n \ge 1$

(2.4) $$f_n(x) = \int_0^x dy \int_0^y dz\, \frac{2 f_{n-1}(z)}{h(z)}$$

(1.8) in Chapter 4 implies that if we set $f(x) = \sum_{n=0}^\infty f_n(x)$ and

(2.5) $$\sum_{n=0}^\infty \sup_{|x| \le R} |f_n(x)| < \infty \quad \text{for any } R < \infty$$

then $f''(x) = \sum_{n=0}^\infty f_n''(x)$. Using $f_n''(x) = 2 f_{n-1}(x)/h(x)$ for $n \ge 1$ and $f_0''(x) = 0$ it follows that

$$f''(x) = \sum_{n=0}^\infty f_n''(x) = \sum_{n=1}^\infty \frac{2 f_{n-1}(x)}{h(x)} = \frac{2 f(x)}{h(x)}$$

i.e., $f$ satisfies (2.3). To prove (2.5), we begin with some simple properties of the $f_n$.
(2.6a) $f_n \ge 0$
(2.6b) $f_n$ is convex with $f_n'(0) = 0$
(2.6c) $f_n$ is increasing on $(0, r)$ and decreasing on $(\ell, 0)$

Proof. Since $h(z) > 0$, (2.6a) follows easily by induction from the definition in (2.4). Using (2.6a) in the definition, (2.6b) follows. (2.6c) is an immediate consequence of (2.6b). □

Our next step is to use induction to show

(2.7) Lemma. $f_n(x) \le (f_1(x))^n / n!$, (2.5) holds, and $1 + f_1 \le f \le \exp(f_1)$.

Proof. Using (2.6c) and (2.6a) the second and third conclusions are immediate consequences of the first one. The inequality $f_n(x) \le (f_1(x))^n / n!$ is obvious if $n = 1$. Using (i) the definition of $f_n$ in (2.4), (ii) (2.6c), (iii) the definition of $f_1$ and $f_0$, (iv) the result for $n - 1$, and (v) doing a little calculus, we have

$$f_n(x) = \int_0^x dy \int_0^y dz\, \frac{2 f_{n-1}(z)}{h(z)} \le \int_0^x dy\, f_{n-1}(y) \int_0^y dz\, \frac{2}{h(z)} = \int_0^x dy\, f_{n-1}(y) f_1'(y) \le \int_0^x dy\, \frac{(f_1(y))^{n-1}}{(n-1)!} f_1'(y) = \frac{(f_1(x))^n}{n!}$$
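The recursion (2.4) and the bounds in (2.7) can be tried out numerically in the simplest case $h \equiv 1$, where one can check by hand that $f_n(x) = 2^n x^{2n}/(2n)!$ and hence $f(x) = \cosh(\sqrt{2}\,x)$, which indeed solves $\frac{1}{2} f'' = f$. The sketch below is illustrative only: the helper names, the grid, the number of terms, and the choice $h \equiv 1$ are assumptions made here, and the integrals are replaced by trapezoid sums on $[0, 1]$.

```python
import math
import numpy as np

def cumulative_trapezoid(values, grid):
    """Running trapezoid-rule integral of `values` over `grid`, starting at 0."""
    steps = 0.5 * (values[1:] + values[:-1]) * np.diff(grid)
    return np.concatenate(([0.0], np.cumsum(steps)))

def feller_series(h, x_max, n_terms=12, n_grid=4001):
    """Return the grid and [f_0, ..., f_{n_terms}] from the recursion (2.4) on [0, x_max]."""
    x = np.linspace(0.0, x_max, n_grid)
    terms = [np.ones_like(x)]                                     # f_0 = 1
    for _ in range(n_terms):
        inner = cumulative_trapezoid(2.0 * terms[-1] / h(x), x)   # int_0^y 2 f_{n-1}/h dz
        terms.append(cumulative_trapezoid(inner, x))              # int_0^x (...) dy
    return x, terms

x, terms = feller_series(lambda z: np.ones_like(z), x_max=1.0)
f = np.sum(terms, axis=0)                                         # truncated series for f
```

On this grid `f` should be close to `np.cosh(np.sqrt(2) * x)`, and each `terms[n]` should sit below `terms[1]**n / math.factorial(n)` up to discretization error, in line with (2.7).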
$-1$ is the index of the Bessel process. To explain the restriction on $\gamma$ and to prepare for the results we will derive, note that Example 1.2 in Chapter 5 shows that the radial part of $d$ dimensional Brownian motion is a Bessel process with $\gamma = (d - 1)$. The natural scale is

$$\varphi(x) = \int_1^x \exp\left( -\int_1^y \gamma/z\,dz \right) dy = \int_1^x y^{-\gamma}\,dy = \begin{cases} \ln x & \text{if } \gamma = 1 \\ (x^{1-\gamma} - 1)/(1 - \gamma) & \text{if } \gamma \ne 1 \end{cases}$$

From the last computation we see that if $\gamma \ge 1$ then $\varphi(0) = -\infty$ and $I = \infty$. To handle $-1 < \gamma < 1$ we observe that the speed measure

$$m(z) = \frac{1}{\varphi'(z) a(z)} = z^\gamma$$

So taking $q = 1$ in the definition of $I$

$$I = \int_0^1 \frac{z^{1-\gamma}}{1 - \gamma}\, z^\gamma\,dz < \infty$$

To compute $J$ we observe that for any $\gamma > -1$, $M(z) = z^{\gamma+1}/(\gamma + 1)$ and

$$J = \int_0^1 \frac{z^{\gamma+1}}{\gamma + 1}\, z^{-\gamma}\,dz < \infty$$

Combining the last two conclusions we see that 0 is an

entrance boundary if $\gamma \in [1, \infty)$
regular boundary if $\gamma \in (-1, 1)$
on $(0, \infty)$. The natural scale is $\varphi(x) = x$ and the speed measure is $m(x) = 1/(\varphi'(x) a(x)) = x^{-2\delta}$ so

$$I = \int_0^1 x^{1 - 2\delta}\,dx \quad \begin{cases} < \infty & \text{if } \delta < 1 \\ = \infty & \text{if } \delta \ge 1 \end{cases}$$

When $\delta \ge 1/2$, $M(0) = -\infty$ and hence $J = \infty$. When $\delta < 1/2$

$$J = \int_0^1 \frac{z^{1 - 2\delta}}{1 - 2\delta}\,dz < \infty$$

Combining the last two conclusions we see that the boundary point 0 is

natural if $\delta \in [1, \infty)$
absorbing if $\delta \in [1/2, 1)$
regular if $\delta \in (0, 1/2)$

The fact that we can start at 0 if and only if $\delta < 1/2$ is suggested by (3.3) and Example 6.1 in Chapter 5. The new information here is that the solution can reach 0 if and only if $\delta < 1$. For another proof of that result, recall the time change recipe in Example 6.1 of Chapter 5 for constructing solutions of the equation above and do
Exercise
5.3. Consider
Show that (a) for any 5 > 0, P1 (H� {c) P1 (H6 = oo) = 1 when 5 ;:::) .
< oo) = 1. (b) E1 H6 < oo when 5 < 1.
Exercise 5.4. Show that the boundary at $\infty$ in Example 5.4 is natural if $\delta \le 1$ and entrance if $\delta > 1$.
Remark. The fact that we cannot reach $\infty$ is expected. (5.3) in Chapter 5 implies that these processes do not explode for any $\delta$. The second conclusion is surprising at first glance since the process starting at $\infty$ is a time change of a Brownian motion starting from $\infty$. However the function we use in (5.1) of Chapter 5 is $g(x) = x^{-2\delta}$ so (2.9) in Chapter 3 implies

$$E_M \int_0^{T_1} g(B_s) 1_{(B_s \le M)}\,ds = \int_1^M 2(x - 1)\, x^{-2\delta}\,dx$$

which stays bounded as $M \to \infty$ if and only if $\delta > 1$.

Exercise 5.5. Wright-Fisher diffusion. Recall the definition given in Example 1.7 in Chapter 5. Show that the boundary point 0 is

absorbing if $\beta = 0$
regular if $\beta \in (0, 1/2)$
entrance if $\beta \ge 1/2$

Hint: simplify calculations by first considering the case $\alpha = 0$ and then arguing that the value of $\alpha$ is not important.

6.6. Applications to Higher Dimensions

In this section we will use the technique of comparing a multidimensional diffusion with a one dimensional one to obtain sufficient conditions (a) for no explosion, (b) for recurrence and transience, and (c) for diffusions to hit points or not. Let $S_r = \inf\{t : |X_t| = r\}$ and $S_\infty = \lim_{r \to \infty} S_r$. Throughout this section we will suppose that $X_t$ is a solution to MP(b, a) on $[0, S_\infty)$, and that the coefficients satisfy

(i) $b$ is measurable and locally bounded
(ii) $a$ is continuous and nondegenerate for each $x$, i.e., $\sum_{i,j} y_i a_{ij}(x) y_j > 0$ when $y \ne 0$

The first detail is to show that $X_t$ will eventually exit from any ball.

(6.1) Theorem. Let $S_r = \inf\{t : |X_t| = r\}$. Then there is a constant $C < \infty$ so that $E_x S_r \le C$ for all $x$ with $|x| \le r$.

Proof. Assumptions (i) and (ii) imply that we can pick an $\alpha$ large so that if $|x| \le r$

$$\frac{\alpha}{2} a_{ii}(x) \ge 1 + |b_i(x)|$$

for all $x$ with $|x| \le r$. Let $h(x) = \sum_{i=1}^d \cosh(\alpha x_i)$. Using Ito's formula we have

$$h(X_t) - h(X_0) = \int_0^t \sum_{i=1}^d \alpha \sinh(\alpha X^i_s) b_i(X_s)\,ds + \text{local mart.} + \frac{1}{2} \int_0^t \sum_{i=1}^d \alpha^2 \cosh(\alpha X^i_s) a_{ii}(X_s)\,ds$$

Noting that $\cosh(y) \ge 1$ and

$$\cosh(y) \ge \max\left( \frac{e^y}{2}, \frac{e^{-y}}{2} \right) \ge |\sinh(y)|$$

our choice of $\alpha$ implies

$$\frac{\alpha^2}{2} \cosh(\alpha X^i_s) a_{ii}(X_s) + \alpha \sinh(\alpha X^i_s) b_i(X_s) \ge \alpha \cosh(\alpha X^i_s) \left( \frac{\alpha}{2} a_{ii}(X_s) - |b_i(X_s)| \right) \ge \alpha \cosh(\alpha X^i_s) \ge \alpha$$

so $h(X_t) - t\alpha d$ is a local submartingale on $[0, S_r)$. Using the optional stopping theorem at time $S_r \wedge t$ we have

$$0 \le h(x) \le E_x\{ h(X_{S_r \wedge t}) - \alpha d (S_r \wedge t) \}$$

Since $h(X_{S_r \wedge t}) \le d \cosh(\alpha r)$ it follows that

$$E_x(S_r \wedge t) \le \cosh(\alpha r)/\alpha$$

Letting $t \to \infty$ gives the desired result. □

Remark. Tracing back through the proof shows $C = \cosh(\alpha r)/\alpha$ where $\alpha$ is chosen so that

$$\frac{\alpha}{2} a_{ii}(x) \ge 1 + |b_i(x)|$$

To see that the last estimate is fairly sharp consider a special case

Example 6.1. Suppose $d = 1$, $a(x) = 1$, and $b(x) = -\beta\, \mathrm{sgn}(x)$ with $\beta > 0$. The natural scale has

$$\varphi(x) = \int_0^x e^{2\beta y}\,dy = \frac{1}{2\beta} (e^{2\beta x} - 1)$$
for x � 0 and cp (-x) = - 1. In this part we use the formulation of Meyers and Serrin (1960). Let a(x) =
{ !��:>
}
' (R. ) + g" (R. ) a(X. )
:::; - g (R. ) + { v (R.)g ' (R. ) + g" (R.) } a(X. ) g (R,) p(R. ) :::; 0 :::; -g (R. ) + p(R. )
{ }
1:1
. a(x)
1:1
$$d_e(x) = \frac{2x \cdot b(x) + \mathrm{tr}(a(x))}{\alpha(x)}$$

where $\mathrm{tr}(a(x)) = \sum_i a_{ii}(x)$ is the trace of $a$. We call $d_e(x)$ the effective dimension at $x$ because (6.3) and (6.4) will show that there will be recurrence or transience if the dimension is $< 2$ or $> 2$ in a neighborhood of infinity. To formulate a result that allows $d_e(x)$ to approach 2 as $|x| \to \infty$ we need a definition: $\delta(t)$ is said to be a Dini function if and only if

$$\int_1^\infty \frac{\delta(t)}{t}\,dt < \infty$$

In the next two results we will apply the definition to

$$\delta(t) = \exp\left( -\int_1^t \frac{\epsilon(s)}{s}\,ds \right)$$
e -• g " (R. )4X!X�a;i (X. ) ds
•
- g (R. ) +
Combining (d) and (e) shows that e - t g (Rt) is a local supermartingale while o Rt ;?: 1 and completes the proof.
c5(t) = exp -
t
To see what the conditions say note that $\delta(t) = (\log t)^{-p}$ is a Dini function if $p > 1$ but not if $p \le 1$ and this corresponds to $\epsilon(s) = p/\log s$.

(6.3) Theorem. Suppose $d_e(x) \ge 2(1 + \epsilon(|x|))$ for $|x| \ge R$ and $\delta(t)$ is a Dini function, then $X_t$ is transient. To be precise, if $|x| > R$ then $P_x(S_R < \infty) < 1$. Here $S_r = \inf\{t : |X_t| = r\}$, and being transient includes the possibility of explosion.

(6.4) Theorem. Suppose $d_e(x) \le 2(1 + \epsilon(|x|))$ for $|x| \ge R$ and $\delta(t)$ is not a Dini function, then $X_t$ is recurrent. To be precise, if $|x| > R$ then $P_x(S_R < \infty) = 1$.

Proof of (6.3)
The first step is to let
(-lr � ) ( ], d )
which has
The upper bound is independent of s and
c t) dt ::; 0 0 for lxl < TJ -+
Using ( 3.2 ) and the Markov property now extends the conclusion to
lxl 2: 7J·
0
(6.8) Corollary. If (i) $d \ge 3$ or (ii) $d = 2$ and $a$ is Hölder continuous then $P_x(T_0 < \infty) = 0$ for all $x$.

Proof. Referring to (4.2) in Chapter 5, we can write $a(0) = U^T \Lambda U$ where $\Lambda$ is a diagonal matrix with entries $\lambda_i$. Let $\Gamma$ be the diagonal matrix with entries $1/\sqrt{\lambda_i}$ and let $V = U^T \Gamma$. The formula for the covariance of stochastic integrals implies that $\bar X_t = V^T X_t$ has $\bar a(0) = V^T a(0) V = I$. So we can without loss of generality suppose that $a(0) = I$. In $d \ge 3$, the continuity of $a$ implies $d_e(x) \to d$ as $x \to 0$ so we can take $\epsilon(r) = 0$ and the desired result follows from (6.6). If $d = 2$ and $a$ is Hölder continuous with exponent $\delta$ we can take $\epsilon(r) = C r^\delta$. To extract the desired result from (6.6) we note that

$$\int_0^1 \exp\left( -\int_y^1 C z^{\delta - 1}\,dz \right) \frac{dy}{y} = \int_0^1 \exp\left( -\frac{C}{\delta} (1 - y^\delta) \right) \frac{dy}{y} = \infty$$

since the exponential in the integrand converges to a positive limit as $y \to 0$. □
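The normalization used in the proof of (6.8), together with the square root from (4.2) in Chapter 5, can be carried out with a standard symmetric eigendecomposition. The sketch below assumes a strictly positive definite matrix and uses NumPy's `numpy.linalg.eigh`; the function name `sqrt_and_whitener` and the sample matrix are illustrative choices made here. Note that `eigh` returns eigenvalues in ascending order, whereas (4.2) lists them in descending order; the ordering does not affect either identity.

```python
import numpy as np

def sqrt_and_whitener(a):
    """Return (sigma, V) with sigma @ sigma.T = a and V.T @ a @ V = I, for symmetric positive definite a."""
    lam, vecs = np.linalg.eigh(a)            # a = vecs @ diag(lam) @ vecs.T
    U = vecs.T                               # rows of U are orthonormal, so a = U.T @ diag(lam) @ U
    sigma = U.T @ np.diag(np.sqrt(lam))      # sigma = U^T sqrt(Lambda)
    V = U.T @ np.diag(1.0 / np.sqrt(lam))    # V = U^T Gamma, Gamma = diag(1 / sqrt(lambda_i))
    return sigma, V

a = np.array([[2.0, 0.5],
              [0.5, 1.0]])                   # hypothetical value of a(0)
sigma, V = sqrt_and_whitener(a)
```

With these definitions `sigma @ sigma.T` recovers `a`, and `V.T @ a @ V` is the identity, which is the reduction to $a(0) = I$ used in the proof.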
It follows from (6.7) that a two dimensional diffusion can hit 0.

Example 6.1. Suppose $b(x) = 0$, $a(0) = I$, and for $0 < |x| < 1/e$ let $a(x)$ be
7
Diffusions as Markov Processes
the matrix with eigenvectors
and associated eigenvalues
$$\lambda_2 = 1 - 2p/\log|x|$$

These definitions are chosen so that $d_e(x) = 2 - 2p/\log(|x|)$ and hence we can take $\epsilon(r) = p/\log(r)$. Plugging into the integral
1 0
1 /e
exp
(- 1 y
1 / e p dz 1
)
dy z og z y
-
1 11/ e =
=
1/ e
0
o
dy exp (p log l log yl) y dy y I Iog y I P
Remark. The problem of whether or not diffusions can hit points has been investigated in a different guise by Gilbarg and Serrin (1956), see page 315.

In this chapter we will assume that the martingale problem MP(b, a) has a unique solution and hence gives rise to a strong Markov process. As the title says, we will be interested in diffusions as Markov processes, in particular, in their asymptotic behavior as $t \to \infty$.

7.1. Semigroups and Generators

The results in this section hold for any Markov process, $X_t$. For any bounded measurable $f$, and $t \ge 0$, let

$$T_t f(x) = E_x f(X_t)$$

The Markov property implies that

$$E_x(f(X_{s+t}) \mid \mathcal{F}_s) = T_t f(X_s)$$

So taking expected values gives the semigroup property

(1.1) $$T_{s+t} f(x) = T_s(T_t f)(x)$$

for $s, t \ge 0$. Introducing the norm $\|f\| = \sup |f(x)|$ and $L^\infty = \{f : \|f\| < \infty\}$ we see that $T_t$ is a contraction semigroup on $L^\infty$, i.e.,

(1.2) $$\|T_t f\| \le \|f\|$$

Following Dynkin (1965), but using slightly different notation, we will define the domain of $T$ to be

$$\mathcal{D}(T) = \{ f \in L^\infty : \|T_t f - f\| \to 0 \text{ as } t \to 0 \}$$

As the next result shows, this is a reasonable choice.

(1.3) Theorem. $\mathcal{D}(T)$ is a vector space, i.e., if $f_1, f_2 \in \mathcal{D}(T)$ and $c_1, c_2 \in R$, $c_1 f_1 + c_2 f_2 \in \mathcal{D}(T)$. $\mathcal{D}(T)$ is closed. If $f \in \mathcal{D}(T)$ then $T_s f \in \mathcal{D}(T)$ and $s \to T_s f$ is continuous.
Remark. Here and in what follows, limits are always taken in, and hence
11 · 11· Proof Since T't (cd1 + c2!2) = c1Tt ft + c2T't h , the first conclusion follows from the triangle inequality. To prove 1J(T) is closed, let fn E 1J(T) with we pick n large llfn - fl l < f/3. fn E 1J(T) so l fnfn- -fnll fll 0 . f/3Letforf t>. e - >.s ll /11 e - 8 11/11
and compare the last two displays to conclude (using
e >.h II Ah U>. f - (AU>./ - 1) 11 $ h- 1 - A h h (1 - e - >. • ) 11/11 II Td - /II
+X1
0.
1
ll ds + X 1
as h -+ The formula for AU>. / implies that and its inverse is A - A.
ds + A 1 ds -+ 0
ds
(A - A)U>. f = f. Thus U>. is 1-1
To prove that $U_\lambda$ is onto, let $f \in \mathcal{D}(A)$ and note (1.5) implies

$$U_\lambda(\lambda f - Af) = \int_0^\infty e^{-\lambda t} T_t(\lambda f - Af)\,dt = \int_0^\infty \lambda e^{-\lambda t} T_t f\,dt - \int_0^\infty e^{-\lambda t} \frac{d}{dt} T_t f\,dt$$

Integrating by parts

$$\int_0^\infty e^{-\lambda t} \frac{d}{dt} T_t f\,dt = -f + \int_0^\infty \lambda e^{-\lambda t} T_t f\,dt$$

Combining the last two displays shows $U_\lambda(\lambda f - Af) = f$ and the proof is complete. □

7.2. Examples

In this section we will consider two families of Markov processes and compute their generators.

Example 2.1. Pure jump Markov processes. Let $(S, \mathcal{S})$ be a measurable space and let $Q(x, A) : S \times \mathcal{S} \to [0, \infty)$ be a transition kernel. That is,

(i) for each $A \in \mathcal{S}$, $x \to Q(x, A)$ is measurable
(ii) for each $x \in S$, $A \to Q(x, A)$ is a finite measure
(iii) $Q(x, \{x\}) = 0$

Intuitively $Q(x, A)$ gives the rate at which our Markov chain makes jumps from $x$ into $A$ and we exclude jumps from $x$ to $x$ since they are invisible. To be able to construct a process $X_t$ with a minimum of fuss, we will suppose that $\lambda = \sup_{x \in S} Q(x, S) < \infty$. Define a transition probability $P$ by

$$P(x, A) = \frac{1}{\lambda} Q(x, A) \quad \text{if } x \notin A$$

$$P(x, \{x\}) = \frac{1}{\lambda}(\lambda - Q(x, S))$$

Let $t_1, t_2, \ldots$ be i.i.d. exponential with parameter $\lambda$, that is, $P(t_i > t) = e^{-\lambda t}$. Let $T_n = t_1 + \cdots + t_n$ for $n \ge 1$ and let $T_0 = 0$. Let $Y_n$ be a Markov chain with transition probability $P(x, dy)$ (see e.g., Section 5.1 of Durrett (1995) for a definition), and let

$$X_t = Y_n \quad \text{for } t \in [T_n, T_{n+1})$$

$X_t$ defines our pure jump Markov process. To compute its generator we note that $N_t = \sup\{n : T_n \le t\}$ has a Poisson distribution with mean $\lambda t$ so if $f$ is bounded and measurable

$$T_t f(x) = \sum_{n=0}^\infty e^{-\lambda t} \frac{(\lambda t)^n}{n!} \int P^n(x, dy) f(y)$$

Doing a little arithmetic it follows that

$$\frac{T_t f(x) - f(x)}{t} - \lambda \int P(x, dy)(f(y) - f(x)) = t^{-1}(e^{-\lambda t} - 1 + \lambda t) f(x) + \lambda(e^{-\lambda t} - 1) \int P(x, dy) f(y) + O(t)$$

Since $t^{-1}(e^{-\lambda t} - 1 + \lambda t) \to 0$, $e^{-\lambda t} \to 1$, and $P(x, dy)$ is a transition probability it follows that

$$\left\| \frac{T_t f(x) - f(x)}{t} - \lambda \int P(x, dy)(f(y) - f(x)) \right\| \to 0$$

as $t \to 0$. Recalling the definition of $P(x, dy)$ it follows that

(2.1) Theorem. $\mathcal{D}(A) = \mathcal{D}(T) = L^\infty$ and

$$Af(x) = \int Q(x, dy)(f(y) - f(x))$$

We are now ready for the main event.
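The construction of Example 2.1 can be tried out on a small state space. The sketch below assumes a hypothetical three-state rate matrix `Q` (not from the text), builds $\lambda$ and $P$ as in the example, evaluates $T_t f$ by summing the Poisson series, and compares the difference quotient $(T_t f - f)/t$ with $\int Q(x, dy)(f(y) - f(x))$. The function names are illustrative only.

```python
import math

# Hypothetical 3-state example: Q[x][y] is the jump rate from x to y, with Q[x][x] = 0.
Q = [[0.0, 1.0, 2.0],
     [0.5, 0.0, 0.5],
     [4.0, 0.0, 0.0]]
n = len(Q)
lam = max(sum(row) for row in Q)          # lambda = sup_x Q(x, S)

# P(x, {y}) = Q(x, {y}) / lambda for y != x, and P(x, {x}) = (lambda - Q(x, S)) / lambda
P = [[Q[x][y] / lam if y != x else (lam - sum(Q[x])) / lam for y in range(n)]
     for x in range(n)]

def apply(M, f):
    """Apply the kernel M to the function f: (M f)(x) = sum_y M(x, y) f(y)."""
    return [sum(M[x][y] * f[y] for y in range(n)) for x in range(n)]

def T(t, f, terms=60):
    """T_t f = sum_k e^{-lam t} (lam t)^k / k! * P^k f, truncated after `terms` terms."""
    total = [0.0] * n
    weight = math.exp(-lam * t)           # Poisson weight for k = 0
    g = list(f)                           # P^0 f
    for k in range(terms):
        total = [s + weight * v for s, v in zip(total, g)]
        g = apply(P, g)
        weight *= lam * t / (k + 1)
    return total

def A(f):
    """Generator from (2.1): A f(x) = sum_y Q(x, y) (f(y) - f(x))."""
    return [sum(Q[x][y] * (f[y] - f[x]) for y in range(n)) for x in range(n)]

f = [1.0, -2.0, 0.5]

def generator_error(t):
    """Sup norm of (T_t f - f)/t - A f."""
    Tf, Af = T(t, f), A(f)
    return max(abs((Tf[x] - f[x]) / t - Af[x]) for x in range(n))

errors = [generator_error(t) for t in (1e-1, 1e-2, 1e-3)]
```

The errors shrink roughly in proportion to $t$, which matches the $O(t)$ remainder in the arithmetic above.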
Example 2.2. Diffusion processes. Suppose we have a family of measures $P_x$ on $(C, \mathcal{C})$ so that under $P_x$ the coordinate maps $X_t(\omega) = \omega(t)$ give the unique solution to the MP(b, a) starting from $x$, and $\mathcal{F}_t$ is the filtration generated by the $X_t$. Let $C_K^2$ be the $C^2$ functions with compact support.

(2.2) Theorem. Suppose a and b are continuous and MP(b, a) is well posed. Then $C_K^2 \subset \mathcal{D}(A)$ and for all $f \in C_K^2$

$$Af(x) = \frac{1}{2} \sum_{i,j} a_{ij}(x) D_{ij} f(x) + \sum_i b_i(x) D_i f(x)$$
Proof If $f \in C^2$ then Ito's formula implies

$$f(X_t) - f(X_0) = \sum_i \int_0^t D_i f(X_s) b_i(X_s)\,ds + \frac{1}{2} \sum_{i,j} \int_0^t D_{ij} f(X_s) a_{ij}(X_s)\,ds + \text{local mart.}$$

If we let $T_n$ be a sequence of times that reduce the local martingale, stopping at time $t \wedge T_n$ and taking expected value gives

$$E_x f(X_{t \wedge T_n}) - f(x) = E_x \int_0^{t \wedge T_n} \sum_i D_i f(X_s) b_i(X_s)\,ds + \frac{1}{2} \sum_{i,j} E_x \int_0^{t \wedge T_n} D_{ij} f(X_s) a_{ij}(X_s)\,ds$$

If $f \in C_K^2$ and the coefficients are continuous then the integrands are bounded and the bounded convergence theorem implies

$$E_x f(X_t) - f(x) = E_x \int_0^t Af(X_s)\,ds \tag{2.3}$$

(2.4) Lemma. Let $r > 0$, $K \subset \mathbf{R}^d$ be compact, and $K_r = \{y : |x - y| \le r \text{ for some } x \in K\}$. Suppose $|b(x)| \le B$ and $\sum_i a_{ii}(x) \le A$ for all $x \in K_r$. If we let $S_r = \inf\{t : |X_t - X_0| \ge r\}$ then for all $x \in K$

$$P_x(S_r \le t) \le 2t(rB + A)/r^2$$

Proof Let $h(y) = \sum_i (y_i - x_i)^2$. Using Ito's formula we have

$$h(X_t) - h(X_0) = \sum_i \int_0^t 2(X_s^i - x_i) b_i(X_s)\,ds + \text{local mart.} + \frac{1}{2} \sum_i \int_0^t 2 a_{ii}(X_s)\,ds$$

Letting $T_n$ be a sequence of times that reduce the local martingale, stopping at $t \wedge S_r \wedge T_n$, taking expected value, then letting $n \to \infty$ and invoking the bounded convergence theorem we have

$$E_x h(X_{t \wedge S_r}) = E_x \int_0^{t \wedge S_r} \sum_i \{2(X_s^i - x_i) b_i(X_s) + a_{ii}(X_s)\}\,ds \le 2t(rB + A)$$

Since $h \ge 0$ and $h(X_{S_r}) = r^2$ it follows that $r^2 P_x(S_r \le t) \le 2t(rB + A)$ which is the desired result. □

Proof of (2.2) Pick $R$ so that $H = \{x : |x| \le R - 1\}$ contains the support of $f$ and let $K = \{x : |x| \le R\}$. Since $f \in C_K^2$, and the coefficients are continuous, $Af$ is continuous and has compact support. This implies $Af$ is uniformly continuous, that is, given $\epsilon > 0$ we can pick $\delta \in (0, 1]$ so that if $|x - y| < \delta$ then $|Af(x) - Af(y)| < \epsilon$. If we let $C = \sup_z |Af(z)|$ and use (2.3) it follows that for $x \in K$

$$\left| \frac{E_x f(X_t) - f(x)}{t} - Af(x) \right| \le \frac{1}{t} E_x \int_0^t |Af(X_s) - Af(x)|\,ds$$

On $\{S_\delta > t\}$ the integrand is at most $\epsilon$, and it is never larger than $2C$, so the right-hand side is at most $\epsilon + 2C P_x(S_\delta \le t)$, and by (2.4) the last probability tends to 0 as $t \to 0$, uniformly for $x \in K$. [...] Since $\epsilon > 0$ is arbitrary, the desired result follows. □
For general b and a it is a hopelessly difficult problem to compute $\mathcal{D}(A)$. To make this point we will now use (1.7) to compute the exact domains of the semigroup and of the generator for one dimensional Brownian motion. Let $C_u$ be the functions $f$ on $\mathbf{R}$ that are bounded and uniformly continuous. Let $C_u^2$ be the functions $f$ on $\mathbf{R}$ so that $f, f', f'' \in C_u$.
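For one dimensional Brownian motion the generator acts on smooth bounded functions as $f''/2$. This can be seen numerically before any domains are computed. The sketch below is an illustration under stated assumptions: it approximates $T_t f(x) = E f(x + \sqrt{t} Z)$, with $Z$ standard normal, by a trapezoid rule on $[-8, 8]$, and uses the test function $f(x) = e^{-x^2}$, which is chosen here for convenience and is not from the text.

```python
import math

def heat_semigroup(f, t, x, half_width=8.0, steps=4000):
    """Approximate T_t f(x) = E f(x + sqrt(t) Z), Z standard normal, by the trapezoid rule."""
    h = 2.0 * half_width / steps
    total = 0.0
    for k in range(steps + 1):
        z = -half_width + k * h
        w = 0.5 if k in (0, steps) else 1.0     # trapezoid end-point weights
        total += w * f(x + math.sqrt(t) * z) * math.exp(-0.5 * z * z)
    return total * h / math.sqrt(2.0 * math.pi)

def f(x):
    """Bounded smooth test function (an assumption made for this example)."""
    return math.exp(-x * x)

def f2(x):
    """Second derivative of f, computed by hand."""
    return (4.0 * x * x - 2.0) * math.exp(-x * x)

x0 = 0.4
# Difference quotients (T_t f - f)/t for decreasing t
quotients = [(heat_semigroup(f, t, x0) - f(x0)) / t for t in (1e-1, 1e-2, 1e-3)]
target = 0.5 * f2(x0)                            # f''(x0) / 2
```

The quotients approach `target` as $t$ decreases, with an error of order $t$.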
(2.5) Theorem. For Brownian motion in d = 1, $\mathcal{D}(T) = C_u$, $\mathcal{D}(A) = C_u^2$, and $Af(x) = f''(x)/2$ for all $f \in \mathcal{D}(A)$.

Remark. For Brownian motion in d > 1, $\mathcal{D}(A)$ is larger than $C_u^2$ and consists of the functions so that $\Delta f$ exists in the sense of distribution and lies in $C_u$ (see Revuz and Yor (1991), p. 226).

Proof If $t > 0$ and $f$ is bounded then writing $p_t(x, z)$ for the Brownian transition probability and using the triangle inequality we have

$$|T_t f(x) - T_t f(y)| \le \|f\| \int |p_t(x, z) - p_t(y, z)|\,dz$$

The right-hand side only depends on $|x - y|$ and converges to 0 as $|x - y| \to 0$ so $T_t f \in C_u$. If $f \in \mathcal{D}(T)$ then $\|T_t f - f\| \to 0$. We claim that this implies $f \in C_u$. To check this, pick $\epsilon > 0$, let $t > 0$ so that $\|T_t f - f\| < \epsilon/3$ then pick $\delta$ so that if $|x - y| < \delta$ then $|T_t f(x) - T_t f(y)| < \epsilon/3$. Using the triangle inequality now it follows that if $|x - y| < \delta$ then

$$|f(x) - f(y)| \le |f(x) - T_t f(x)| + |T_t f(x) - T_t f(y)| + |T_t f(y) - f(y)| < \epsilon$$

To prove that $\mathcal{D}(A) = C_u^2$ we begin by observing that (3.10) in Chapter 3 implies

$$U_\lambda f(x) = \int \frac{1}{\sqrt{2\lambda}} e^{-|x - y|\sqrt{2\lambda}} f(y)\,dy$$

Our first task is to show that if $f \in C_u$ then $U_\lambda f \in C_u^2$. Letting $g(x) = U_\lambda f(x)$ we have

$$g(x) = \frac{1}{\sqrt{2\lambda}} \int_x^\infty e^{(x - y)\sqrt{2\lambda}} f(y)\,dy + \frac{1}{\sqrt{2\lambda}} \int_{-\infty}^x e^{(y - x)\sqrt{2\lambda}} f(y)\,dy$$

Differentiating once with respect to $x$ and noting that the terms from differentiating the limits cancel, we have

$$g'(x) = \int_x^\infty e^{(x - y)\sqrt{2\lambda}} f(y)\,dy - \int_{-\infty}^x e^{(y - x)\sqrt{2\lambda}} f(y)\,dy$$

Differentiating again gives

$$g''(x) = -2f(x) + \sqrt{2\lambda} \int_x^\infty e^{(x - y)\sqrt{2\lambda}} f(y)\,dy + \sqrt{2\lambda} \int_{-\infty}^x e^{(y - x)\sqrt{2\lambda}} f(y)\,dy$$

It is easy to use the dominated convergence theorem to justify differentiating the integrals. From the formulas it is easy to see that $g, g', g'' \in C_u$ so $g \in C_u^2$ and

$$g''(x) = 2\lambda g(x) - 2f(x)$$

Since $Ag(x) = \lambda g(x) - f(x)$ by the proof of (1.7), it follows that $Ag(x) = g''(x)/2$.

To complete the proof now we need to show $\mathcal{D}(A) \supset C_u^2$. Let $h \in C_u^2$ and define a function $f \in C_u$ by

$$f(x) = \lambda h(x) - \frac{1}{2} h''(x)$$

The function $y = h - U_\lambda f$ satisfies

$$\frac{1}{2} y'' - \lambda y = 0$$

All solutions of this differential equation have the form $Ae^{x\sqrt{2\lambda}} + Be^{-x\sqrt{2\lambda}}$. Since $h - U_\lambda f$ is bounded it follows that $h = U_\lambda f$ and the proof is complete. □

7.3. Transition Probabilities

In the last section we saw that if a and b are continuous and MP(b, a) is well posed then

$$\frac{d}{dt} T_t f(x) = A T_t f(x)$$

for $f \in C_K^2$, or changing notation $v(t, x) = T_t f(x)$, we have

$$\frac{dv}{dt} = Av(t, x) \tag{3.1}$$

where $A$ acts in the $x$ variable. In the previous discussion we have gone from the process to the p.d.e. We will now turn around and go the other way. Consider
(3.2)
(a) $u \in C^{1,2}$ and $u_t = Lu$ in $(0, \infty) \times \mathbf{R}^d$
(b) $u$ is continuous on $[0, \infty) \times \mathbf{R}^d$ and $u(0, x) = f(x)$

Here we have returned to our old notation

$$Lf(x) = \frac{1}{2} \sum_{i,j} a_{ij}(x) D_{ij} f(x) + \sum_i b_i(x) D_i f(x)$$

To explain the dual notation note that (3.1) says that $v(t, \cdot) \in \mathcal{D}(A)$ and satisfies the equality, while (a) of (3.2) asks for $u \in C^{1,2}$. As for the boundary condition (b) note that $f \in \mathcal{D}(A) \subset \mathcal{D}(T)$ implies that $\|T_t f - f\| \to 0$ as $t \to 0$.

To prove the existence of solutions to (3.2) we turn to Friedman (1975), Section 6.4.

(3.3) Theorem. Suppose a and b are bounded and Hölder continuous. That is, there are constants $0 < \delta, C < \infty$ so that

$$|a_{ij}(x) - a_{ij}(y)| \le C|x - y|^\delta \qquad |b_i(x) - b_i(y)| \le C|x - y|^\delta \tag{HC}$$

Suppose also that a is uniformly elliptic, that is, there is a constant $\lambda$ so that for all x, y

$$\sum_{i,j} y_i a_{ij}(x) y_j \ge \lambda |y|^2 \tag{UE}$$

Then there is a function $p_t(x, y) > 0$ jointly continuous in $t > 0$, $x$, $y$, and $C^2$ as a function of $x$ which satisfies $dp/dt = Lp$ (with $L$ acting on the $x$ variable) and so that if $f$ is bounded and continuous

$$u(t, x) = \int p_t(x, y) f(y)\,dy$$

satisfies (3.2). The maximum principle implies

$$|u(t, x)| \le \|f\|$$
To make the connection with the SDE we prove

(3.4) Theorem. Suppose $X_t$ is a nonexplosive solution of MP(b, a). If $u$ is a bounded solution of (3.2) then

$$u(t, x) = E_x f(X_t)$$

Proof Ito's formula implies

$$u(t - s, X_s) - u(t, X_0) = \int_0^s -\frac{\partial u}{\partial t}(t - r, X_r)\,dr + \sum_i \int_0^s D_i u(t - r, X_r) b_i(X_r)\,dr + \text{local mart.} + \frac{1}{2} \sum_{i,j} \int_0^s D_{ij} u(t - r, X_r) a_{ij}(X_r)\,dr$$

so (a) tells us that $u(t - s, X_s)$ is a bounded martingale on $[0, t)$. (b) implies that $u(t - s, X_s) \to f(X_t)$ as $s \to t$, so the martingale convergence theorem implies

$$u(t, x) = E_x f(X_t)$$ □

$p_t(x, y)$ is called a fundamental solution of the parabolic equation $u_t = Lu$ since it can be used to produce solutions for any bounded continuous initial data $f$. Its importance for us is that (3.4) implies that

$$P_x(X_t \in B) = \int_B p_t(x, y)\,dy$$

i.e., $p_t(x, y)$ is the transition density for our diffusion process. Let $D = \{x : |x| < R\}$. To obtain results for unbounded coefficients, we will consider

(3.5)
(a) $u \in C^{1,2}$ and $du/dt = Lu$ in $(0, \infty) \times D$
(b) $u$ is continuous on $[0, \infty) \times \bar{D}$ with $u(0, x) = f(x)$, and $u(t, y) = 0$ when $t > 0$, $y \in \partial D$

To prove existence of solutions to (3.5) we turn to Dynkin (1965), Vol. II, pages 230-231.

(3.6) Theorem. Let $D = \{x : |x| \le R\}$ and suppose that (HC) and (UE) hold in $D$. Then there is a function $p_t^R(x, y) > 0$ jointly continuous in $t > 0$, $x, y \in D$, $C^2$ as a function of $x$ which satisfies $dp^R/dt = Lp^R$ (with $L$ acting on the $x$ variable) and so that if $f$ is bounded and continuous

$$u(t, x) = \int p_t^R(x, y) f(y)\,dy$$

satisfies (3.5). To make the connection with the SDE we prove

(3.7) Theorem. Suppose $X_t$ is any solution of MP(b, a) and let $\tau = \inf\{t : X_t \notin D\}$. If $u$ is any solution of (3.5) then

$$u(t, x) = E_x(f(X_t); \tau > t)$$
Proof The fact that $u$ is continuous in $[0, t] \times \bar{D}$ implies it is bounded there. Ito's formula implies that for $0 \le s < t \wedge \tau$

$$u(t - s, X_s) - u(t, X_0) = \int_0^s -\frac{\partial u}{\partial t}(t - r, X_r)\,dr + \sum_i \int_0^s D_i u(t - r, X_r) b_i(X_r)\,dr + \text{local mart.} + \frac{1}{2} \sum_{i,j} \int_0^s D_{ij} u(t - r, X_r) a_{ij}(X_r)\,dr$$

so (a) tells us that $u(t - s, X_s)$ is a bounded local martingale on $[0, t \wedge \tau)$. (b) implies that $u(t - s, X_s) \to 0$ as $s \uparrow \tau$ and $u(t - s, X_s) \to f(X_t)$ as $s \to t$ on $\{\tau > t\}$. Using (2.7) from Chapter 2 now, we have

$$u(t, x) = E_x(f(X_t); \tau > t)$$ □

Combining (3.6) and (3.7) we see that

$$E_x(f(X_t); \tau > t) = \int p_t^R(x, y) f(y)\,dy$$

Letting $R \uparrow \infty$ and $p_t(x, y) = \lim_{R \to \infty} p_t^R(x, y)$, which exists since (3.7) implies $R \to p_t^R(x, y)$ is increasing, we have

(3.8) Theorem. Suppose that the martingale problem for MP(b, a) is well posed and that a and b satisfy (HC) and (UE) locally. That is, they hold in $\{x : |x| \le R\}$ for any $R < \infty$. Then for each $t > 0$ there is a lower semicontinuous function $p_t(x, y) > 0$ so that if $f$ is bounded and continuous then

$$E_x f(X_t) = \int p_t(x, y) f(y)\,dy$$
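For one dimensional Brownian motion on $D = (-R, R)$ the killed density $p_t^R$ can be written down by the reflection principle, which makes both (3.7) and the monotonicity in $R$ easy to test. The sketch below is an illustration, not the construction used in the text: the image-sum formula in `killed_density`, the truncation at 20 images, and the test function $\cos(\pi y / 2R)$ (an eigenfunction of $\frac{1}{2} d^2/dx^2$ vanishing at $\pm R$) are all assumptions made for this example.

```python
import math

def gauss(t, z):
    """Brownian transition density evaluated at displacement z."""
    return math.exp(-z * z / (2.0 * t)) / math.sqrt(2.0 * math.pi * t)

def killed_density(t, x, y, R, images=20):
    """Density of Brownian motion started at x and killed on leaving (-R, R), by reflection."""
    total = 0.0
    for k in range(-images, images + 1):
        shift = 4.0 * k * R
        total += gauss(t, y - x + shift) - gauss(t, y + x + 2.0 * R + shift)
    return total

def killed_expectation(f, t, x, R, steps=2000):
    """Midpoint-rule value of the integral of p_t^R(x, y) f(y) over (-R, R)."""
    h = 2.0 * R / steps
    return h * sum(killed_density(t, x, -R + (j + 0.5) * h, R) * f(-R + (j + 0.5) * h)
                   for j in range(steps))

t, x, y = 1.0, 0.3, -0.2
R0 = 1.5

def f(z):
    """cos(pi z / (2 R0)), which vanishes at the boundary points -R0 and R0."""
    return math.cos(math.pi * z / (2.0 * R0))

# u(t, x) = exp(-pi^2 t / (8 R0^2)) f(x) solves (3.5), so (3.7) says it equals E_x(f(X_t); tau > t)
numeric = killed_expectation(f, t, x, R0)
exact = math.exp(-math.pi ** 2 * t / (8.0 * R0 ** 2)) * f(x)

# R -> p_t^R(x, y) is increasing and converges to the free density p_t(x, y)
densities = [killed_density(t, x, y, R) for R in (1.0, 2.0, 4.0, 8.0)]
free_density = gauss(t, y - x)
```

In this special case the limit $p_t(x, y)$ is the Gaussian density, so it is continuous.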
Remark. The energetic reader can probably show that $p_t(x, y)$ is continuous. However, lower semicontinuity implies that $p_1(x, y)$ is bounded away from 0 on compact sets and this will be enough for our results in Section 7.5.

7.4. Harris Chains

In this section we will give a quick treatment of the theory of Harris chains to prepare for applications to diffusions in the next section. In this section and the next, we will assume that the reader is familiar with the basic theory of Markov chains on a countable state space as explained for example in the first five sections of Chapter 5 of Durrett (1995). This section is a close relative of Section 5.6 there. We will formulate the results here for a transition probability defined on a measurable space $(S, \mathcal{S})$. Intuitively, $P(X_{n+1} \in A \mid X_n = x) = p(x, A)$. Formally, it is a function $p : S \times \mathcal{S} \to [0, 1]$ that satisfies

for fixed $A \in \mathcal{S}$, $x \to p(x, A)$ is measurable
for fixed $x \in S$, $A \to p(x, A)$ is a probability measure

In our applications to diffusions we will take $S = \mathbf{R}^d$, and let

$$p(x, A) = \int_A p_1(x, y)\,dy$$

where $p_1(x, y)$ is the transition probability introduced in the previous section. By taking $t = 1$ we will be able to investigate the asymptotic behavior of $X_n$ as $n \to \infty$ through the integers but this and the Markov property will allow us to get results for $t \to \infty$ through the real numbers.

We say that a Markov chain $X_n$ with transition probability $p$ is a Harris chain if we can find sets $A, B \in \mathcal{S}$, a function $q$ with $q(x, y) \ge \epsilon > 0$ for $x \in A$, $y \in B$, and a probability measure $\rho$ concentrated on $B$ so that:

(i) If $\tau_A = \inf\{n \ge 0 : X_n \in A\}$, then $P_z(\tau_A < \infty) > 0$ for all $z \in S$
(ii) If $x \in A$ and $C \subset B$ then $p(x, C) \ge \int_C q(x, y) \rho(dy)$

In the diffusions we consider we can take $A = B$ to be a ball with radius $r$, $\rho$ to be a constant $c$ times Lebesgue measure restricted to $B$, and $q(x, y) = p_1(x, y)/c$. See (5.2) below. It is interesting to note that the new theory still contains most of the old one as a special case.

Example 4.1. Countable State Space. Suppose $X_n$ is a Markov chain on a countable state space $S$. In order for $X_n$ to be a Harris chain it is necessary and sufficient that there be a state $u$ with $P_x(X_n = u \text{ for some } n \ge 0) > 0$ for all $x \in S$.

Proof To prove sufficiency, pick $v$ so that $p(u, v) > 0$. If we let $A = \{u\}$ and $B = \{v\}$ then (i) and (ii) hold. To prove necessity, let $\{u\}$ be a point with $\rho(\{u\}) > 0$ and note that for all $x$

$$P_x(X_n = u \text{ for some } n \ge 0) \ge E_x\{q(X_{\tau_A}, u) \rho(\{u\})\} > 0$$ □

The developments in this section are based on two simple ideas. (i) To make the theory of Markov chains on a countable state space work, all we need
is to have one point in the state space which is hit. (ii) In a Harris chain we can manufacture such a point (called $\alpha$ below) which corresponds to "being in $B$ with distribution $\rho$." The notation needed to carry out this plan is occasionally obnoxious but we hope these words of wisdom will help the reader stay focused on the simple ideas that underlie these developments.

Given a Harris chain on $(S, \mathcal{S})$, we will construct a Markov chain $\bar{X}_n$ with transition probability $\bar{p}$ on $(\bar{S}, \bar{\mathcal{S}})$ where $\bar{S} = S \cup \{\alpha\}$ and $\bar{\mathcal{S}} = \{B, B \cup \{\alpha\} : B \in \mathcal{S}\}$. Thinking of $\alpha$ as corresponding to being on $B$ with distribution $\rho$, we define the modified transition probability as follows:

If $x \in S - A$, $\bar{p}(x, C) = p(x, C)$ for $C \in \mathcal{S}$
If $x \in A$, $\bar{p}(x, \{\alpha\}) = \epsilon$ and $\bar{p}(x, C) = p(x, C) - \epsilon \rho(C)$ for $C \in \mathcal{S}$
If $x = \alpha$, $\bar{p}(\alpha, D) = \int \rho(dx) \bar{p}(x, D)$ for $D \in \bar{\mathcal{S}}$

Here and in what follows, we will reserve $A$ and $B$ for the special sets that occur in the definition and use $C$ and $D$ for generic elements of $\mathcal{S}$. We will often simplify notation by writing $\bar{p}(x, \alpha)$ instead of $\bar{p}(x, \{\alpha\})$, $\mu(\alpha)$ instead of $\mu(\{\alpha\})$, etc.

Our first step is to prove three technical lemmas that will help us carry out the proofs below. Define a transition probability $v$ by

$$v(x, \{x\}) = 1 \text{ if } x \in S \qquad v(\alpha, C) = \rho(C)$$

In words, $v$ leaves mass in $S$ alone but returns the mass at $\alpha$ to $S$ and distributes it according to $\rho$.

(4.1) Lemma. (a) $v\bar{p} = \bar{p}$ and (b) $\bar{p}v = p$.

Proof Before giving the proof we would like to remind the reader that measures multiply the transition probability on the left, i.e., in the first case we want to show $\mu v \bar{p} = \mu \bar{p}$. If we first make a transition according to $v$ and then one according to $\bar{p}$, this amounts to one transition according to $\bar{p}$, since only mass at $\alpha$ is affected by $v$ and

$$\bar{p}(\alpha, D) = \int \rho(dx) \bar{p}(x, D)$$

The second equality also follows easily from the definition. In words, if $\bar{p}$ acts first and then $v$, then this is the same as one transition according to $p$ since $v$ returns the mass at $\alpha$ to where it came from. □

From (4.1) it follows easily that we have:
(4.2) Lemma. Let $Y_n$ be an inhomogeneous Markov chain with $p_{2k} = v$ and $p_{2k+1} = \bar{p}$. Then $\bar{X}_n = Y_{2n}$ is a Markov chain with transition probability $\bar{p}$ and $X_n = Y_{2n+1}$ is a Markov chain with transition probability $p$.

(4.2) shows that there is an intimate relationship between the asymptotic behavior of $X_n$ and of $\bar{X}_n$. To quantify this we need a definition. If $f$ is a bounded measurable function on $S$, let $\bar{f} = vf$, i.e.,

$$\bar{f}(x) = f(x) \text{ for } x \in S \qquad \bar{f}(\alpha) = \int f\,d\rho$$

(4.3) Lemma. If $\mu$ is a probability measure on $(S, \mathcal{S})$ then

$$E_\mu f(X_n) = E_\mu \bar{f}(\bar{X}_n)$$

Proof Observe that if $X_n$ and $\bar{X}_n$ are constructed as in (4.2), and $P(\bar{X}_0 \in S) = 1$ then $X_0 = \bar{X}_0$ and $X_n$ is obtained from $\bar{X}_n$ by making a transition according to $v$. □
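On a finite state space the kernels $\bar{p}$ and $v$ are matrices, so the two identities of (4.1) can be checked by matrix multiplication. The sketch below is an illustration with hypothetical data: a three-state chain `p`, the set `A = {0, 1}`, a measure `rho` on `B = {1, 2}`, and `eps = 0.4`, chosen so that $p(x, C) \ge \epsilon \rho(C)$ for $x \in A$. Index 3 stands for the extra point $\alpha$. None of these values come from the text.

```python
# Hypothetical 3-state chain; index 3 plays the role of the extra point alpha.
p = [[0.5, 0.3, 0.2],
     [0.2, 0.5, 0.3],
     [0.1, 0.1, 0.8]]
A = {0, 1}
rho = [0.0, 0.5, 0.5]    # probability measure concentrated on B = {1, 2}
eps = 0.4                # requires p(x, {y}) >= eps * rho[y] for every x in A
n = len(p)
alpha = n

def build_p_bar():
    """Modified transition probability on S together with alpha, following the three cases."""
    p_bar = [[0.0] * (n + 1) for _ in range(n + 1)]
    for x in range(n):
        for y in range(n):
            p_bar[x][y] = p[x][y] - (eps * rho[y] if x in A else 0.0)
        p_bar[x][alpha] = eps if x in A else 0.0
    for y in range(n + 1):
        # p_bar(alpha, .) is the rho-average of the rows p_bar(x, .)
        p_bar[alpha][y] = sum(rho[x] * p_bar[x][y] for x in range(n))
    return p_bar

def build_v():
    """v leaves points of S alone and sends alpha back to S with distribution rho."""
    v = [[1.0 if x == y else 0.0 for y in range(n + 1)] for x in range(n + 1)]
    v[alpha] = rho + [0.0]
    return v

def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(len(b))) for j in range(len(b[0]))]
            for i in range(len(a))]

p_bar, v = build_p_bar(), build_v()
v_p_bar = matmul(v, p_bar)     # (a): should equal p_bar
p_bar_v = matmul(p_bar, v)     # (b): rows indexed by S should reproduce p
```

Alternating the two kernels as in (4.2) then gives both chains from a single sequence $Y_n$.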
Before developing the theory we give one example to explain why some of the statements to come will be messy.
Example 4.2. Perverted Brownian motion. For $x$ that is not an integer $\ge 2$, let $p(x, \cdot)$ be the transition probability of a one dimensional Brownian motion. When $x \ge 2$ is an integer let

$$p(x, \{x + 1\}) = 1 - x^{-2}$$

$$p(x, C) = x^{-2} |C \cap [0, 1]| \quad \text{if } x + 1 \notin C$$

$p$ is the transition probability of a Harris chain (take $A = B = (0, 1)$, $\rho$ = Lebesgue measure on $B$) but

$$P_2(X_n = n + 2 \text{ for all } n) > 0$$

I can sympathize with the reader who thinks that such crazy chains will not arise "in applications," but it seems easier (and better) to adapt the theory to include them than to modify the assumptions to exclude them.

a. Recurrence and transience
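We begin with the dichotomy between recurrence and transience. Let $R = \inf\{n \ge 1 : \bar{X}_n = \alpha\}$. If $P_\alpha(R < \infty) = 1$ then we call the chain recurrent, otherwise we call it transient. Let $R_1 = R$ and for $k \ge 2$, let $R_k = \inf\{n > R_{k-1} : \bar{X}_n = \alpha\}$ be the time of the kth return to $\alpha$. The strong Markov property implies $P_\alpha(R_k < \infty) = P_\alpha(R < \infty)^k$ so $P_\alpha(\bar{X}_n = \alpha \text{ i.o.}) = 1$ in the recurrent case and is 0 in the transient case. From this the next result follows easily.

(4.4) Theorem. Let $\lambda(C) = \sum_n 2^{-n} \bar{p}^n(\alpha, C)$. In the recurrent case if $\lambda(C) > 0$ then $P_\alpha(\bar{X}_n \in C \text{ i.o.}) = 1$. For $\lambda$ a.e. $x$, $P_x(R < \infty) = 1$.

Remark. Here and in what follows $\bar{p}^n(x, C) = P_x(\bar{X}_n \in C)$ is the nth iterate of the transition probability defined inductively by $\bar{p}^1 = \bar{p}$ and for $n \ge 1$

$$\bar{p}^{n+1}(x, C) = \int \bar{p}(x, dy) \bar{p}^n(y, C)$$

Proof The first conclusion follows from the following fact (see e.g., (2.3) in Chapter 5 of Durrett (1995)). Let $X_n$ be a Markov chain and suppose

$$P\left(\cup_{m=n+1}^\infty \{X_m \in B_m\} \mid X_n\right) \ge \delta > 0 \quad \text{on } \{X_n \in A_n\}$$

Then

$$P(\{X_n \in A_n \text{ i.o.}\} - \{X_n \in B_n \text{ i.o.}\}) = 0$$

Taking $A_n = \{\alpha\}$ and $B_n = C$ gives the desired result. To prove the second conclusion, let $D = \{x : P_x(R < \infty) < 1\}$ and observe that if $\bar{p}^n(\alpha, D) > 0$ for some $n$ then

$$P_\alpha(\bar{X}_m = \alpha \text{ i.o.}) \le \int \bar{p}^n(\alpha, dx) P_x(R < \infty) < 1,$$

contradicting recurrence. □

Remark. Example 4.2 shows that we cannot expect to have

$$P_x(R < \infty) = 1$$

for all $x$. To see that this can occur even when the state space is countable, consider a branching process in which the offspring distribution has $p_0 = 0$ and $\sum k p_k > 1$. If we take $A = B = \{0\}$ this is a Harris chain by Example 4.1. Since $p^n(0, 0) = 1$ it is recurrent.

b. Stationary measures

(4.5) Theorem. Let $R = \inf\{n \ge 1 : \bar{X}_n = \alpha\}$. In the recurrent case,

$$\bar{\mu}(C) = \sum_{n=0}^\infty P_\alpha(\bar{X}_n \in C, R > n)$$

defines a σ-finite stationary measure for $\bar{p}$, with $\bar{\mu} \ll \lambda$, and $\mu = \bar{\mu} v$ is a σ-finite stationary measure for $p$.

Proof If $\lambda(C) = 0$ then the definition of $\lambda$ implies $P_\alpha(\bar{X}_n \in C) = 0$, so $P_\alpha(\bar{X}_n \in C, R > n) = 0$ for all $n$, and $\bar{\mu}(C) = 0$. We next check that $\bar{\mu}$ is σ-finite. Let $G_{k,\delta} = \{x : \bar{p}^k(x, \alpha) \ge \delta\}$. Let $T_0 = 0$ and let $T_n = \inf\{m \ge T_{n-1} + k : \bar{X}_m \in G_{k,\delta}\}$. The definition of $G_{k,\delta}$ implies

$$P(T_n < T_\alpha \mid T_{n-1} < T_\alpha) \le (1 - \delta)$$

so if we let $N = \inf\{n : T_n \ge T_\alpha\}$ then $EN \le 1/\delta$. Since we can only have $\bar{X}_m \in G_{k,\delta}$ with $R > m$ when $T_n \le m < T_n + k$ for some $0 \le n < N$ it follows that $\bar{\mu}(G_{k,\delta}) \le k/\delta$. Part (i) of the definition of a Harris chain implies $S \subset \cup_{k,m \ge 1} G_{k,1/m}$ and σ-finiteness follows.

Next we show that $\bar{\mu}\bar{p} = \bar{\mu}$.

CASE 1. Let $C$ be a set that does not contain $\alpha$. Using the definition of $\bar{\mu}$ and Fubini's theorem,

$$\int \bar{\mu}(dy) \bar{p}(y, C) = \sum_{n=0}^\infty \int P_\alpha(\bar{X}_n \in dy, R > n) \bar{p}(y, C) = \sum_{n=0}^\infty P_\alpha(\bar{X}_{n+1} \in C, R > n + 1) = \bar{\mu}(C)$$

since $\alpha \notin C$ and $P_\alpha(\bar{X}_0 = \alpha) = 1$.

CASE 2. To complete the proof now it suffices to consider $C = \{\alpha\}$.

$$\int \bar{\mu}(dy) \bar{p}(y, \alpha) = \sum_{n=0}^\infty \int P_\alpha(\bar{X}_n \in dy, R > n) \bar{p}(y, \alpha) = \sum_{n=0}^\infty P_\alpha(R = n + 1) = 1 = \bar{\mu}(\alpha)$$

where in the last two equalities we have used recurrence and the fact that when $C = \{\alpha\}$ only the $n = 0$ term contributes in the definition.

Turning to the properties of $\mu$, we note that $\mu = \bar{\mu} v$ and $\bar{\mu}(\alpha) = 1$ so $\mu$ is σ-finite. To check $\mu p = \mu$, we note that (i) using the definition $\mu = \bar{\mu} v$ and (b) in (4.1), (ii) using (a) in (4.1), (iii) using $\bar{\mu}\bar{p} = \bar{\mu}$, and (iv) using the definition $\mu = \bar{\mu} v$

$$\mu p = \bar{\mu} v \bar{p} v = \bar{\mu} \bar{p} v = \bar{\mu} v = \mu$$ □

(4.6) Lemma. If $\nu$ is a σ-finite stationary measure for $p$, then $\nu(A) < \infty$ and $\bar{\nu} = \nu \bar{p}$ is a stationary measure for $\bar{p}$ with $\bar{\nu}(\alpha) < \infty$.

Proof We will first show that $\nu(A) < \infty$. If $\nu(A) = \infty$ then part (ii) of the definition implies $\nu(C) = \infty$ for all sets $C$ with $\rho(C) > 0$. If $B = \cup_i B_i$ with $\nu(B_i) < \infty$ then $\rho(B_i) = 0$ by the last observation and $\rho(B) = 0$ by countable subadditivity, a contradiction. So $\nu(A) < \infty$ and $\bar{\nu}(\alpha) = \nu \bar{p}(\alpha) = \epsilon \nu(A) < \infty$. Using the fact that $\nu p = \nu$, we find

$$\bar{\nu}(C) = \nu \bar{p}(C) = \nu(C) - \epsilon \nu(A) \rho(B \cap C)$$

the last subtraction being well defined since $\nu(A) < \infty$. From the last equality it is clear that $\bar{\nu}$ is σ-finite. Since $\bar{\nu}(\alpha) = \epsilon \nu(A)$, it also follows that $\bar{\nu} v = \nu$. To check $\bar{\nu}\bar{p} = \bar{\nu}$, we observe that (a) of (4.1), the last result, and the definition of $\bar{\nu}$ imply

$$\bar{\nu}\bar{p} = \bar{\nu} v \bar{p} = \nu \bar{p} = \bar{\nu}$$ □

(4.7) Theorem. Suppose $p$ is recurrent. If $\nu$ is a σ-finite stationary measure then $\nu = \bar{\nu}(\alpha)\mu$ where $\mu$ is the measure constructed in the proof of (4.5).

Proof By (4.6) it suffices to prove that if $\bar{\nu}$ is a σ-finite stationary measure for $\bar{p}$ with $\bar{\nu}(\alpha) < \infty$ then $\bar{\nu} = \bar{\nu}(\alpha)\bar{\mu}$. Our first step is to observe that

$$\bar{\nu}(C) = \bar{\nu}(\alpha)\bar{p}(\alpha, C) + \int_{S - \{\alpha\}} \bar{\nu}(dy) \bar{p}(y, C)$$

Using the last identity to replace $\bar{\nu}(dy)$ on the right-hand side we have

$$\bar{\nu}(C) = \bar{\nu}(\alpha)\bar{p}(\alpha, C) + \int_{S - \{\alpha\}} \bar{\nu}(\alpha)\bar{p}(\alpha, dy)\bar{p}(y, C) + \int_{S - \{\alpha\}} \bar{\nu}(dx) \int_{S - \{\alpha\}} \bar{p}(x, dy)\bar{p}(y, C)$$

$$= \bar{\nu}(\alpha) P_\alpha(\bar{X}_1 \in C) + \bar{\nu}(\alpha) P_\alpha(\bar{X}_1 \ne \alpha, \bar{X}_2 \in C) + P_{\bar{\nu}}(\bar{X}_0 \ne \alpha, \bar{X}_1 \ne \alpha, \bar{X}_2 \in C)$$

Continuing in the obvious way we have

$$\bar{\nu}(C) = \bar{\nu}(\alpha) \sum_{m=1}^n P_\alpha(\bar{X}_k \ne \alpha \text{ for } 1 \le k < m, \bar{X}_m \in C) + P_{\bar{\nu}}(\bar{X}_k \ne \alpha \text{ for } 0 \le k < n + 1, \bar{X}_{n+1} \in C)$$

The last term is nonnegative, so letting $n \to \infty$ we have

$$\bar{\nu}(C) \ge \bar{\nu}(\alpha)\bar{\mu}(C)$$

To turn the $\ge$ into an $=$ we observe that

$$\bar{\nu}(\alpha) = \int \bar{\nu}(dx) \bar{p}^n(x, \alpha) \ge \bar{\nu}(\alpha) \int \bar{\mu}(dx) \bar{p}^n(x, \alpha) = \bar{\nu}(\alpha)\bar{\mu}(\alpha) = \bar{\nu}(\alpha)$$

since $\bar{\mu}(\alpha) = 1$. Let $S_n = \{x : \bar{p}^n(x, \alpha) > 0\}$. By assumption $\cup_n S_n = S$. If $\bar{\nu}(D) > \bar{\nu}(\alpha)\bar{\mu}(D)$ for some $D$, then $\bar{\nu}(D \cap S_n) > \bar{\nu}(\alpha)\bar{\mu}(D \cap S_n)$ for some $n$ and it follows that $\bar{\nu}(\alpha) > \bar{\nu}(\alpha)$, a contradiction. □

(4.5) and (4.7) show that a recurrent Harris chain has a σ-finite stationary measure that is unique up to constant multiples. The next result, which goes in the other direction, will be useful in the proof of the convergence theorem and in checking recurrence of concrete examples.

(4.8) Theorem. If there is a stationary probability distribution then the chain is recurrent.

Proof Let $\bar{\pi} = \pi \bar{p}$, which is a stationary distribution by (4.6), and note that part (i) of the definition of a Harris chain implies $\bar{\pi}(\alpha) > 0$. Suppose that the chain is transient, i.e., $P_\alpha(R < \infty) = q < 1$. If $R_k$ is the time of the kth return then $P_\alpha(R_k < \infty) = q^k$ so if $x \ne \alpha$

$$\sum_{n=0}^\infty \bar{p}^n(x, \alpha) = \sum_{k=1}^\infty P_x(R_k < \infty) \le \sum_{k=1}^\infty q^{k-1} = \frac{1}{1 - q}$$

The last upper bound is also valid when $x = \alpha$ since

$$\sum_{n=0}^\infty \bar{p}^n(\alpha, \alpha) = 1 + \sum_{k=1}^\infty q^k = \frac{1}{1 - q}$$

Integrating with respect to $\bar{\pi}$ and using the fact that $\bar{\pi}$ is a stationary distribution we have a contradiction that proves the result. □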
c. Convergence theorem

If $I$ is a set of positive integers, we let g.c.d.($I$) be the greatest common divisor of the elements of $I$. We say that a recurrent Harris chain $X_n$ is aperiodic if g.c.d.$(\{n \ge 1 : \bar{p}^n(\alpha, \alpha) > 0\}) = 1$. This occurs, for example, if we can take $A = B$ in the definition for then $\bar{p}(\alpha, \alpha) > 0$. A well known consequence of aperiodicity is

(4.9) Lemma. There is an $m_0 < \infty$ so that $\bar{p}^m(\alpha, \alpha) > 0$ for $m \ge m_0$.

For a proof see (5.4) of Chapter 5 in Durrett (1995). We are now ready to prove

(4.10) Theorem. Let $X_n$ be an aperiodic recurrent Harris chain with stationary distribution $\pi$. If $P_x(R < \infty) = 1$ then as $n \to \infty$,

$$\|p^n(x, \cdot) - \pi(\cdot)\| \to 0$$

Remark. Here $\|\cdot\|$ denotes the total variation distance between the measures. (4.4) guarantees that $\lambda$ a.e. $x$ satisfies the hypothesis, while (4.5), (4.7), and (4.8) imply $\pi$ is absolutely continuous with respect to $\lambda$.

Proof In view of (4.3) and (4.6) it suffices to show that if $\bar{\pi} = \pi \bar{p}$ then

$$\|\bar{p}^n(x, \cdot) - \bar{\pi}(\cdot)\| \to 0$$

Let $\hat{p}((x_1, x_2), C_1 \times C_2) = \bar{p}(x_1, C_1)\bar{p}(x_2, C_2)$ be the transition probability of the product chain on $\bar{S} \times \bar{S}$, in which the two coordinates move independently. We claim that the product chain is a Harris chain with $A = \{(\alpha, \alpha)\}$ and $B = B \times B$. Clearly (ii) is satisfied. To check (i) we need to show that for all $(x_1, x_2)$ there is an $N$ so that $\hat{p}^N((x_1, x_2), (\alpha, \alpha)) > 0$. To prove this let $K$ and $L$ be such that $\bar{p}^K(x_1, \alpha) > 0$ and $\bar{p}^L(x_2, \alpha) > 0$, let $M \ge m_0$, the constant in (4.9), and take $N = K + L + M$. From the definitions it follows that

$$\bar{p}^{K+L+M}(x_1, \alpha) \ge \bar{p}^K(x_1, \alpha)\bar{p}^{L+M}(\alpha, \alpha) > 0$$

$$\bar{p}^{K+L+M}(x_2, \alpha) \ge \bar{p}^L(x_2, \alpha)\bar{p}^{K+M}(\alpha, \alpha) > 0$$

and hence $\hat{p}^N((x_1, x_2), (\alpha, \alpha)) > 0$.

The next step is to show that the product chain is recurrent. To do this, note that (4.6) implies $\bar{\pi} = \pi \bar{p}$ is a stationary distribution for $\bar{p}$, and the two coordinates move independently, so

$$(\bar{\pi} \times \bar{\pi})(C_1 \times C_2) = \bar{\pi}(C_1)\bar{\pi}(C_2)$$

defines a stationary probability distribution for $\hat{p}$, and the desired conclusion follows from (4.8).

To prove the convergence theorem now we will write $Z_n = (X_n, Y_n)$ and consider $Z_n$ with initial distribution $\delta_\alpha \times \bar{\pi}$. That is $X_0 = \alpha$ with probability 1 and $Y_0$ has the stationary distribution $\bar{\pi}$. Let $T = \inf\{n : Z_n = (\alpha, \alpha)\}$. (4.8), (4.7), and (4.5) imply $\bar{\pi} \times \bar{\pi} \ll \lambda \times \lambda$ so (4.4) implies

$$P_{\delta_\alpha \times \bar{\pi}}(T < \infty) = 1$$

On $\{T \le n\}$ the two coordinates have the same distribution at time $n$, so

$$P(X_n \in C) = P(Y_n \in C, T \le n) + P(X_n \in C, T > n) \le P(Y_n \in C) + P(T > n)$$

Interchanging the roles of $X$ and $Y$ we have

$$P(Y_n \in C) \le P(X_n \in C) + P(T > n)$$

and it follows that
$$\|\bar{p}^n(\alpha, \cdot) - \bar{\pi}(\cdot)\| = \|P(X_n \in \cdot) - P(Y_n \in \cdot)\| = \sup_C |P(X_n \in C) - P(Y_n \in C)| \le P(T > n) \to 0$$

as $n \to \infty$. □

7.5. Convergence Theorems

In this section we will apply the theory of Harris chains developed in the previous section to diffusion processes. To be able to use that theory we will suppose throughout this section that

(5.1) Assumption. MP(b, a) is well posed and the coefficients a and b satisfy (HC) and (UE) locally.

(5.2) Theorem. If $A = B = \{x : |x| \le r\}$ and $\rho$ is Lebesgue measure on $B$ normalized to be a probability measure then $X_n$ is an aperiodic Harris chain.

Proof By (3.8) there is a lower semicontinuous function $p_t(x, y) > 0$ so that

$$P_x(X_t \in A) = \int_A p_t(x, y)\,dy$$

From this we see that $P_x(R = 1) > 0$ for each $x$ so (i) holds. Lower semicontinuity implies $p_1(x, y) \ge c > 0$ when $x, y \in A$ so (ii) holds and we have $\bar{p}(\alpha, \alpha) > 0$. □

To finish checking the hypotheses of the convergence theorem (4.10), we have to show that a stationary distribution $\pi$ exists. Our approach will be to show that $E_\alpha R < \infty$, so the construction in (4.5) produces a stationary measure with finite total mass. To check $E_\alpha R < \infty$ we will use a slight generalization of Lemma 5.1 of Khasminskii (1960).

(5.3) Theorem. Suppose $u \ge 0$ is $C^2$ and has $Lu \le -1$ for all $x \notin K$ where $K$ is a compact set. Let $T_K = \inf\{t > 0 : X_t \in K\}$. Then $u(x) \ge E_x T_K$.

Proof Ito's formula implies that $V_t = u(X_t) + t$ is a local supermartingale on $[0, T_K)$. Letting $T_n \uparrow T_K$ be a sequence of times that reduce $V_t$, we have

$$u(x) \ge E_x(u(X_{T_n}) + T_n) \ge E_x T_n$$

Letting $n \to \infty$ and using the monotone convergence theorem the desired result follows. □

Taking $u(x) = |x|^2/\epsilon$ gives the following

(5.4) Corollary. Suppose $\text{tr}\,a(x) + 2x \cdot b(x) \le -\epsilon$ for $x \in K^c = \{x : |x| > r\}$ then $E_x T_K \le |x|^2/\epsilon$.

Though (5.4) was easy to prove, it is remarkably sharp.

Example 5.1. Consider d = 1. Suppose $a(x) = 1$ and $b(x) = -c/x$ for $|x| \ge 1$ where $c \ge 0$ and set $K = [-1, 1]$. If $c > 1/2$ then (5.4) holds with $\epsilon = 2c - 1$. To compute $E_x T_K$, suppose $c \ne 1/2$, let $\epsilon = 2c - 1$, and let $u(x) = x^2/\epsilon$. $Lu = -1$ in $(1, b)$ so $u(X_t) + t$ is a local martingale until time $\tau_{(1,b)} = \inf\{t : X_t \notin (1, b)\}$. Using the optional stopping theorem at time $\tau_{(1,b)} \wedge t$ we have

$$\frac{x^2}{\epsilon} = E_x \frac{X^2_{\tau_{(1,b)} \wedge t}}{\epsilon} + E_x(\tau_{(1,b)} \wedge t)$$

Rearranging, then letting $t \to \infty$ and using the monotone and bounded convergence theorems, we have

$$E_x \tau_{(1,b)} = \frac{x^2}{\epsilon} - \frac{E_x X^2_{\tau_{(1,b)}}}{\epsilon}$$

To evaluate the right-hand side, we need the natural scale and doing this it is convenient to let it start from 1 rather than 0:

$$\varphi(x) = \int_1^x \exp\left(-\int_1^y \frac{-2c}{z}\,dz\right) dy = \int_1^x y^{2c}\,dy = C'(x^{2c+1} - 1)$$

Using (3.2) in Chapter 6 with a = 1 it follows that

$$E_x \tau_{(1,b)} = \frac{x^2}{\epsilon} - \frac{1}{\epsilon} \cdot \frac{b^{2c+1} - x^{2c+1}}{b^{2c+1} - 1} - \frac{b^2}{\epsilon} \cdot \frac{x^{2c+1} - 1}{b^{2c+1} - 1}$$

The second term on the right always converges to $-1/\epsilon$ as $b \to \infty$. The third term on the right converges to 0 when $c > 1/2$ and to $\infty$ when $c < 1/2$ (recall that $\epsilon < 0$ in this case).

Exercise 5.1. Use (4.8) in Chapter 6 to show that $E_x T_K = \infty$ when $c \le 1/2$.
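The exit time formula of Example 5.1 can be checked directly, since $u(x) = E_x \tau_{(1,b)}$ should satisfy $\frac{1}{2}u'' - (c/x)u' = -1$ on $(1, b)$ with $u(1) = u(b) = 0$. The sketch below is an illustration with assumed values $c = 1.5$ (so $\epsilon = 2$), $b = 10$, and $x = 3$; the function names and the finite-difference step are choices made here, not part of the text.

```python
def mean_exit_time(x, b, c):
    """E_x tau_(1,b) for a(x) = 1, b(x) = -c/x, from the formula in Example 5.1 (c != 1/2)."""
    eps = 2.0 * c - 1.0
    hit_b = (x ** (2 * c + 1) - 1.0) / (b ** (2 * c + 1) - 1.0)   # P_x(exit at b)
    # x^2/eps - (1/eps) P_x(exit at 1) - (b^2/eps) P_x(exit at b)
    return (x * x - (1.0 - hit_b) - b * b * hit_b) / eps

def generator(u, x, c, h=1e-3):
    """L u(x) = u''(x)/2 - (c/x) u'(x), approximated by central differences."""
    d1 = (u(x + h) - u(x - h)) / (2.0 * h)
    d2 = (u(x + h) - 2.0 * u(x) + u(x - h)) / (h * h)
    return 0.5 * d2 - (c / x) * d1

c, b, x = 1.5, 10.0, 3.0

def u(z):
    return mean_exit_time(z, b, c)

Lu = generator(u, x, c)                    # should be close to -1
boundary = (u(1.0), u(b))                  # should both vanish
bound = (x * x) / (2.0 * c - 1.0)          # |x|^2 / eps, the bound in (5.4)
large_b = mean_exit_time(x, 1e6, c)        # approximates E_x T_K when c > 1/2
```

With these values the limit as $b \to \infty$ is $(x^2 - 1)/\epsilon = 4$, which sits just below the bound $|x|^2/\epsilon = 4.5$ from (5.4).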
Exercise 5.2. Consider d = 1. Suppose $a(x) = 1$ and $b(x) \le -\epsilon/x^\delta$ for $x \ge 1$ where $\epsilon > 0$ and $0 < \delta < 1$. Let $T_1 = \inf\{t : X_t = 1\}$ and show that for $x \ge 1$, $E_x T_1 \le (x^{1+\delta} - 1)/\epsilon$.

Example 5.2. When d > 1 and $a(x) = I$ the condition in (5.4) becomes

$$x \cdot b(x) \le -(d + \epsilon)/2$$

To see this is sharp, let $Y_t = |X_t|$, and use Ito's formula with $f(x) = |x|$ which has $D_i f = x_i/|x|$ to get

$$Y_t - Y_0 = \int_0^t \frac{X_s \cdot b(X_s)}{|X_s|}\,ds + \int_0^t \frac{X_s}{|X_s|} \cdot dW_s + \frac{1}{2} \int_0^t \frac{d - 1}{|X_s|}\,ds$$

We have written $W$ here for a d dimensional Brownian motion, so that we can let

$$B_t = \int_0^t \frac{X_s}{|X_s|} \cdot dW_s$$

be a one dimensional Brownian motion $B$ (it is a local martingale with $\langle B \rangle_t = t$) and write

$$dY_t = \frac{1}{Y_t}\left(X_t \cdot b(X_t) + \frac{d - 1}{2}\right) dt + dB_t$$

If $x \cdot b(x) = -(d - 1 + \epsilon)/2$ this reduces to the previous example and shows that the condition $x \cdot b(x) \le -(d + \epsilon)/2$ is sharp.

The last detail is to make the transition from $E_x T_K < \infty$ to $E_\alpha R < \infty$.

(5.5) Theorem. Let [...] $V_m = \inf\{t > U_{m-1} : X_t \in K\}$, $U_m = \inf\{t > V_m : X_t \notin H \text{ or } t \in \mathbf{Z}\}$, and $M = \inf\{m \ge 1 : U_m \in \mathbf{Z}\}$. Since $X(U_m) \in A$, $R \le U_M$. To estimate $E_\alpha R$, we note that $X(U_{m-1}) \in A$ so

$$E(V_m - U_{m-1} \mid \mathcal{F}_{U_{m-1}}) \le C_0 = \sup_{x \in A} E_x T_K$$

To estimate $M$, we observe that if $T_H$ is the exit time from $H$ then (3.6) implies

$$\inf_{x \in K} P_x(T_H > 1) \ge |K| \inf_{x,y \in K} p_1^{r+1}(x, y) = c_0 > 0$$

where $p_t^{r+1}(x, y)$ is the transition probability for the process killed when it leaves the ball of radius $r + 1$. From this it follows that $P(M > m) \le (1 - c_0)^m$. Since $U_m - V_m \le 1$ we have $EU_M \le (1 + C_0)/c_0$ and the proof is complete. □
8
Weak Convergence
8.1. In Metric Spaces

We begin with a treatment of weak convergence on a general space $S$ with a metric $\rho$, i.e., a function with (i) $\rho(x, x) = 0$, (ii) $\rho(x, y) = \rho(y, x)$, and (iii) $\rho(x, y) + \rho(y, z) \ge \rho(x, z)$. Open balls are defined by $\{y : \rho(x, y) < r\}$ with $r > 0$ and the Borel sets $\mathcal{S}$ are the σ-field generated by the open balls. Throughout this section we will suppose $S$ is a metric space and $\mathcal{S}$ is the collection of Borel sets, even though several results are true in much greater generality. For later use, recall that $S$ is said to be separable if there is a countable dense set, and complete if every Cauchy sequence converges.

A sequence of probability measures $\mu_n$ on $(S, \mathcal{S})$ is said to converge weakly if for each bounded continuous function $f$, $\int f(x) \mu_n(dx) \to \int f(x) \mu(dx)$. In this case we write $\mu_n \Rightarrow \mu$. In many situations it will be convenient to deal directly with random variables $X_n$ rather than with the associated distributions $\mu_n(A) = P(X_n \in A)$. We say that $X_n$ converges weakly to $X$ and write $X_n \Rightarrow X$ if for any bounded continuous function $f$, $Ef(X_n) \to Ef(X)$. Our first result, sometimes called the Portmanteau Theorem, gives five equivalent definitions of weak convergence.
(1.1) Theorem. The following statements are equivalent.
(i) $Ef(X_n) \to Ef(X)$ for any bounded continuous function $f$.
(ii) For all closed sets [...]

[...] $(x_1, x_2, \ldots)$ of real numbers. To define a metric on $\mathbf{R}^\infty$ we introduce the bounded metric $\rho_0(x, y) = |x - y|/(1 + |x - y|)$ on $\mathbf{R}$ and let

$$\rho(x, y) = \sum_{i=1}^\infty 2^{-i} \rho_0(x_i, y_i)$$

It is easy to see that the induced topology is that of coordinatewise convergence (i.e., $x^n \to x$ if and only if $x_i^n \to x_i$ for each $i$) and hence this metric makes $\mathbf{R}^\infty$ complete and separable. A countable dense set is the collection of points with finitely many nonzero coordinates, all of which are rational numbers.
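The way the weights $2^{-i}$ turn coordinatewise convergence into convergence in $\rho$ can be seen in a small computation. The sketch below is an illustration only: sequences are represented as Python functions of the index $i$, the sum is truncated at 60 terms (the neglected tail is at most $2^{-60}$ because $\rho_0 \le 1$), and the sequences `limit` and `approximant(n)` are hypothetical examples.

```python
def rho0(a, b):
    """Bounded metric |a - b| / (1 + |a - b|) on the real line."""
    d = abs(a - b)
    return d / (1.0 + d)

def rho(x, y, terms=60):
    """Sum of 2^{-i} rho0(x_i, y_i) for i = 1..terms; the omitted tail is at most 2^{-terms}."""
    return sum(2.0 ** (-i) * rho0(x(i), y(i)) for i in range(1, terms + 1))

def limit(i):
    """The point x = (1, 1/2, 1/3, ...) of R^infinity."""
    return 1.0 / i

def approximant(n):
    """A point that agrees with `limit` in the first n coordinates and is far away afterwards."""
    return lambda i: 1.0 / i if i <= n else 1.0 / i + 1000.0

distances = [rho(approximant(n), limit) for n in (1, 5, 10, 20)]
```

The distances decay like $2^{-n}$ even though the later coordinates stay 1000 apart, which is the sense in which only finitely many coordinates matter at any given accuracy.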
(2.5) Lemma. The Borel sets $\mathcal{S} = \mathcal{R}^\infty$, the product σ-field generated by the finite dimensional sets $\{x : x_i \in A_i \text{ for } 1 \le i \le k\}$.

Proof To argue that $\mathcal{S} \supset \mathcal{R}^\infty$ observe that if $G_i$ are open then $\{x : x_i \in G_i \text{ for } 1 \le i \le k\}$ is open. For the other inclusion note that

$$\{y : \rho(x, y) \le \delta\} = \cap_{n=1}^\infty \left\{y : \sum_{m=1}^n 2^{-m} \rho_0(x_m, y_m) \le \delta\right\} \in \mathcal{R}^\infty$$

so

$$\{y : \rho(x, y) < \gamma\} = \cup_{n=1}^\infty \{y : \rho(x, y) \le \gamma - 1/n\} \in \mathcal{R}^\infty$$ □

Let $\pi_d : \mathbf{R}^\infty \to \mathbf{R}^d$ be the projection $\pi_d(x) = (x_1, \ldots, x_d)$. Our first claim is that if $\Pi$ is a tight family on $(\mathbf{R}^\infty, \mathcal{R}^\infty)$ then $\{\mu \circ \pi_d^{-1} : \mu \in \Pi\}$ is a tight family on $(\mathbf{R}^d, \mathcal{R}^d)$. This follows from the following general result.
(2.6) Lemma. If $\Pi$ is a tight family on $(S, \mathcal{S})$ and if $h$ is a continuous map from $S$ to $S'$ then $\{\mu \circ h^{-1} : \mu \in \Pi\}$ is a tight family on $(S', \mathcal{S}')$.

Proof Given $\epsilon$, choose in $S$ a compact set $K$ so that $\mu(K) \ge 1 - \epsilon$ for all $\mu \in \Pi$. If [...] $\delta > 0$ so that $B(y, \delta) = \{x : \rho(x, y) < \delta\} \subset G$. Pick $\delta$ so large that $B(y, 2\delta) \cap G^c \ne \emptyset$. ($G = \mathbf{R}^\infty \in \mathcal{A}$ so we can suppose $G^c \ne \emptyset$.) If we pick $N$ so that $2^{-N} < \delta/2$ then it follows from the definition of the metric that for $r < \delta/2$ the finite dimensional set

$$A(y, r, N) = \{x : |x_i - y_i| < r \text{ for } 1 \le i \le N\} \subset B(y, \delta) \subset G$$

The boundaries of these sets are disjoint for different values of $r$ so we can pick $r > \delta/4$ so that the boundary of $A(y, r, N)$ has $\nu$ measure 0.

As noted above $\mathbf{R}^\infty$ is separable. Let $y_1, y_2, \ldots$ be an enumeration of the members of the countable dense set that lie in $G$ and let $A_i = A(y_i, r_i, N_i)$ be the sets chosen in the last paragraph. To prove that $\cup_i A_i = G$ suppose $x \in G - \cup_i A_i$, let $r > 0$ so that $B(x, r) \subset G$, and pick a point $y_i$ so that $\rho(x, y_i) < r/9$. We claim that $x \in A_i$. To see this, note that the triangle inequality implies $B(y_i, 8r/9) \subset G$, so $\delta_i \ge 4r/9$ and $r_i > \delta_i/4 \ge r/9$, which completes the proof of $\mu_{n_k} \Rightarrow \nu$. □

Before passing to the next case we would like to observe that we have shown
Section 8.2 Prokhorov's Theorems

If the points $x_n \to x$ in $S$ then $\lim \rho(x_n, q_i) = \rho(x, q_i)$ and hence $h(x_n) \to h(x)$. Suppose $x \ne x'$ and let $\varepsilon = \rho(x, x')$. If $\rho(x, q_i) < \varepsilon/2$, which must be true for some $i$, then $\rho(x', q_i) > \varepsilon/2$ or the triangle inequality would lead to the contradiction $\rho(x, x') < \varepsilon$. This shows that $h$ is 1-1. Our final task is to show that $h^{-1}$ is continuous. If $x_n$ does not converge to $x$ then $\limsup \rho(x_n, x) = \varepsilon > 0$. If $\rho(x, q_i) < \varepsilon/2$, which must be true for some $q_i$, then $\limsup \rho(x_n, q_i) \ge \varepsilon/2$, or again the triangle inequality would give a contradiction, and hence $h(x_n)$ does not converge to $h(x)$. This shows that if $h(x_n) \to h(x)$ then $x_n \to x$ so $h^{-1}$ is continuous. □
It follows from (2.6) that if $\Pi$ is tight on $S$ then $\{\mu \circ h^{-1} : \mu \in \Pi\}$ is a tight family of measures on $R^\infty$. Using the result for $R^\infty$ we see that if $\mu_n$ is a sequence of measures in $\Pi$ and we let $\nu_n = \mu_n \circ h^{-1}$ then there is a convergent subsequence $\nu_{n_k}$. Applying the continuous mapping theorem, (1.2), now to the function $\varphi = h^{-1}$ it follows that $\mu_{n_k} = \nu_{n_k} \circ h$ converges weakly.

The general case Whatever $S$ is, if $\Pi$ is tight, and we let $K_i$ be so that $\mu(K_i) \ge 1 - 1/i$ for all $\mu \in \Pi$ then all the measures are supported on the $\sigma$-compact set $S_0 = \cup_i K_i$ and this case reduces to the previous one. □

Proof of (2.2) We begin by introducing the intermediate statement:
(H) For each $\varepsilon, \delta > 0$ there is a finite collection $A_1, \ldots, A_n$ of balls of radius $\delta$ so that $\mu(\cup_{i \le n} A_i) \ge 1 - \varepsilon$ for all $\mu \in \Pi$.
We will show that (i) if (H) holds then $\Pi$ is tight and (ii) if (H) fails then $\Pi$ is not relatively compact. Combining (ii) and (i) gives (2.2).

Proof of (i) Fix $\varepsilon$ and choose for each $k$ finitely many balls $A^k_1, \ldots, A^k_{n_k}$ of radius $1/k$ so that $\mu(\cup_{i \le n_k} A^k_i) \ge 1 - \varepsilon/2^k$ for all $\mu \in \Pi$. Let $K$ be the closure of $\cap_{k=1}^{\infty} \cup_{i \le n_k} A^k_i$. Clearly $\mu(K) \ge 1 - \varepsilon$. $K$ is totally bounded since if $B^k_j$ is a ball with the same center as $A^k_j$ and radius $2/k$ then $B^k_j$, $1 \le j \le n_k$ covers $K$. Being a closed and totally bounded subset of a complete space, $K$ is compact and the proof is complete. □

(2.7) Theorem. In $R^\infty$, weak convergence is equivalent to convergence of finite dimensional distributions.

Proof for the $\sigma$-compact case We will prove the result in this case by reducing it to the previous one. We start by observing

(2.8) Lemma. If $S$ is $\sigma$-compact then $S$ is separable.

Proof Since a countable union of countable sets is countable, it suffices to prove this if $S$ is compact. To do this cover $S$ by balls of radius $1/n$, let $x^n_m$, $m \le m_n$ be the centers of the balls of a finite subcover, and check that the $x^n_m$ are a countable dense set. □
(2.9) Lemma. If $S$ is a separable metric space then it can be embedded homeomorphically into $R^\infty$.

Proof Let $q_1, q_2, \ldots$ be a sequence of points dense in $S$ and define a mapping from $S$ into $R^\infty$ by
$$h(x) = (\rho(x, q_1), \rho(x, q_2), \ldots)$$
Proof of (ii) Suppose (H) fails for some $\varepsilon$ and $\delta$. Enumerate the members of the countable dense set $q_1, q_2, \ldots$, let $A_i = B(q_i, \delta)$, and $G_n = \cup_{i=1}^{n} A_i$. For each $n$ there is a measure $\mu_n \in \Pi$ so that $\mu_n(G_n) < 1 - \varepsilon$. We claim that $\mu_n$ has no convergent subsequence. To prove this, suppose $\mu_{n_k} \Rightarrow \nu$. (iii) of (1.1) implies that for each $m$
$$\nu(G_m) \le \limsup_{k\to\infty} \mu_{n_k}(G_m) \le \limsup_{k\to\infty} \mu_{n_k}(G_{n_k}) \le 1 - \varepsilon$$
However, $G_m \uparrow S$ as $m \uparrow \infty$ so we have $\nu(S) \le 1 - \varepsilon$, a contradiction. □

Chapter 8 Weak Convergence
8.3. The Space C

Let $C = C([0, 1], R^d)$ be the space of continuous functions from $[0, 1]$ to $R^d$ equipped with the norm $\|\omega\| = \sup_{t \le 1} |\omega(t)|$. Let $\mathcal{B}$ be the collection of Borel subsets of $C$. Introduce the coordinate random variables $X_t(\omega) = \omega(t)$ and define the finite dimensional sets by
$$\{\omega : X_{t_i}(\omega) \in A_i \text{ for } 1 \le i \le k\}$$
where $0 \le t_1 < t_2 < \ldots < t_k \le 1$ and $A_i \in \mathcal{R}^d$, the Borel subsets of $R^d$.

(3.1) Lemma. $\mathcal{B}$ is the same as the $\sigma$-field $\mathcal{C}$ generated by the finite dimensional sets.

Proof Observe that if $\xi$ is a given continuous function
$$\{\omega : \|\omega - \xi\| \le \varepsilon - 1/n\} = \bigcap_q \{\omega : |\omega(q) - \xi(q)| \le \varepsilon - 1/n\}$$
where the intersection is over all rationals $q \in [0, 1]$. Letting $n \to \infty$ shows $\{\omega : \|\omega - \xi\| < \varepsilon\} \in \mathcal{C}$ and $\mathcal{B} \subset \mathcal{C}$. To prove the reverse inclusion observe that if the $A_i$ are open the finite dimensional set $\{\omega : \omega(t_i) \in A_i\}$ is open so the $\pi$-$\lambda$ theorem implies $\mathcal{C} \subset \mathcal{B}$. □

Let $0 \le t_1 < t_2 < \ldots < t_n \le 1$ and $\pi_t : C \to (R^d)^n$ be defined by
$$\pi_t(\omega) = (\omega(t_1), \ldots, \omega(t_n))$$
Given a measure $\mu$ on $(C, \mathcal{C})$, the measures $\mu \circ \pi_t^{-1}$, which give the distribution of the vectors $(X_{t_1}, \ldots, X_{t_n})$ under $\mu$, are called the finite dimensional distributions or f.d.d.'s for short. On the basis of (3.1) one might hope that convergence of the f.d.d.'s might be enough for weak convergence of the $\mu_n$. However, a simple example shows this is false.

Example 3.1. Let $a_n = 1/2 - 1/2n$, $b_n = 1/2 - 1/4n$, and let $\mu_n$ be the point mass on the function
$$f_n(x) = \begin{cases} 0 & x \in [0, a_n] \\ 4n(x - a_n) & x \in [a_n, b_n] \\ 4n(1/2 - x) & x \in [b_n, 1/2] \\ 0 & x \in [1/2, 1] \end{cases}$$
As $n \to \infty$, $f_n(t) \to f_\infty \equiv 0$ but not uniformly. To see that $\mu_n$ does not converge weakly to $\mu_\infty$, note that $h(\omega) = \sup_{0 \le t \le 1} \omega(t)$ is a continuous function but
$$\int h(\omega)\,\mu_n(d\omega) = 1 \not\to 0 = \int h(\omega)\,\mu_\infty(d\omega)$$

(3.2) Theorem. Let $\mu_n$, $1 \le n \le \infty$ be probability measures on $C$. If the finite dimensional distributions of $\mu_n$ converge to those of $\mu_\infty$ and if the $\mu_n$ are tight then $\mu_n \Rightarrow \mu_\infty$.

Proof If $\mu_n$ is tight then by (2.1) it is relatively compact and hence each subsequence $\mu_{n_m}$ has a further subsequence $\mu_{n'_m}$ that converges to a limit $\nu$. If $f : R^k \to R$ is bounded and continuous then $f(X_{t_1}, \ldots, X_{t_k})$ is bounded and continuous from $C$ to $R$ and hence
$$\int f(X_{t_1}, \ldots, X_{t_k})\,d\mu_{n'_m} \to \int f(X_{t_1}, \ldots, X_{t_k})\,d\nu$$
The last conclusion implies that $\mu_{n'_m} \circ \pi_t^{-1} \Rightarrow \nu \circ \pi_t^{-1}$, so from the assumed convergence of the f.d.d.'s we see that the f.d.d.'s of $\nu$ are determined and hence there is only one subsequential limit. To see that this implies that the whole sequence converges to $\nu$, we use the following result, which as the proof shows, is valid for weak convergence on any space. To prepare for the proof we ask the reader to do

Exercise 3.1. Let $r_n$ be a sequence of real numbers. If each subsequence of $r_n$ has a further subsequence that converges to $r$ then $r_n \to r$.
(3.3) Lemma. If each subsequence of $\mu_n$ has a further subsequence that converges to $\nu$ then $\mu_n \Rightarrow \nu$.

Proof Note that if $f$ is a bounded continuous function, the sequence of real numbers $\int f(\omega)\,\mu_n(d\omega)$ has the property that every subsequence has a further subsequence that converges to $\int f(\omega)\,\nu(d\omega)$. Exercise 3.1 implies that the whole sequence of real numbers converges to the indicated limit. Since this holds for any bounded continuous $f$ the desired result follows. □
As Example 3.1 suggests, $\mu_n$ will not be tight if it concentrates on paths that oscillate too much. To find conditions that guarantee tightness, we introduce the modulus of continuity
$$w_\delta(\omega) = \sup\{ |\omega(s) - \omega(t)| : |s - t| \le \delta \}$$
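To see the modulus of continuity at work on the paths of Example 3.1, here is a small sketch that approximates $w_\delta$ on a uniform grid. It assumes the tent functions $f_n$ with $a_n = 1/2 - 1/2n$ and $b_n = 1/2 - 1/4n$; the grid approximation and the names `tent`, `modulus`, `grid` and `osc` are illustrative choices, not part of the text.

```python
import numpy as np


def tent(n, t):
    """f_n from Example 3.1: 0 up to a_n, linear up to 1 at b_n, back to 0 at 1/2."""
    a_n = 0.5 - 1.0 / (2 * n)
    t = np.asarray(t, dtype=float)
    rising = 4 * n * (t - a_n)
    falling = 4 * n * (0.5 - t)
    # The minimum of the two lines, cut off below at 0, is exactly the tent.
    return np.clip(np.minimum(rising, falling), 0.0, None)


def modulus(values, grid_step, delta):
    """Grid approximation of w_delta = sup{|w(s) - w(t)| : |s - t| <= delta}."""
    max_lag = int(round(delta / grid_step))
    best = 0.0
    for lag in range(1, max_lag + 1):
        best = max(best, float(np.max(np.abs(values[lag:] - values[:-lag]))))
    return best


step = 1.0 / 1024
grid = np.arange(1025) * step
delta = 1.0 / 16
# Oscillation of f_n within windows of length delta, for several n.
osc = {n: modulus(tent(n, grid), step, delta) for n in (1, 4, 16, 64)}
```

For fixed $\delta$ the computed oscillation equals 1 as soon as $b_n - a_n = 1/4n \le \delta$, so it does not shrink as $n$ grows, even though $f_n(t) \to 0$ for every fixed $t$.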
(3.4) Theorem. The sequence $\mu_n$ is tight if and only if for each $\varepsilon > 0$ there are $n_0$, $M$ and $\delta$ so that
(i) $\mu_n(|\omega(0)| > M) \le \varepsilon$ for all $n \ge n_0$
(ii) $\mu_n(w_\delta > \varepsilon) \le \varepsilon$ for all $n \ge n_0$

Remark. Of course by increasing $M$ and decreasing $\delta$ we can always check the condition with $n_0 = 1$ but the formulation in (3.4) eliminates the need for that final adjustment. Also by taking $\varepsilon = \eta \wedge \zeta$ it follows that if $\mu_n$ is tight then there is a $\delta$ and an $n_0$ so that $\mu_n(w_\delta > \eta) \le \zeta$ for $n \ge n_0$.

Proof We begin by recalling (see e.g., Royden (1988), page 169)

(3.5) The Arzela-Ascoli Theorem. A subset $A$ of $C$ has compact closure if and only if $\sup_{\omega \in A} |\omega(0)| < \infty$ and $\lim_{\delta \to 0} \sup_{\omega \in A} w_\delta(\omega) = 0$.

To prove the necessity of (i) and (ii), we note that if $\mu_n$ is tight and $\varepsilon > 0$ we can choose a compact set $K$ ...

To prove the sufficiency of (i) and (ii), choose $M$ so that $\mu_n(|X(0)| > M) \le \varepsilon/2$ for all $n$ and choose $\delta_k$ so that $\mu_n(w_{\delta_k} > 1/k) \le \varepsilon/2^{k+1}$ for all $n$. If we let $K$ be the closure of $\{|X(0)| \le M, w_{\delta_k} \le 1/k \text{ for all } k\}$ then (3.5) implies $K$ is compact and $\mu_n(K)$ ...

... then (ii) in (3.4) holds.

Remark. The condition should remind the reader of Kolmogorov's continuity criterion, (1.6) in Chapter 1. The key to our proof is the observation that the proof of (1.6) gives quantitative estimates on the modulus of continuity. For a much different approach to a slightly more general result, see Theorem 12.3 in Billingsley (1968).
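Proof Let $\gamma < \alpha/\beta$, and pick $\eta > 0$ small enough so that
$$\lambda = (1 - \eta)(1 + \alpha - \beta\gamma) - (1 + \eta) > 0$$
From (1.7) in Chapter 1 it follows that if $A = 3 \cdot 2^{(1-\eta)\gamma}/(1 - 2^{-\gamma})$ then with probability $\ge 1 - K2^{-N\lambda}/(1 - 2^{-\lambda})$ we have
$$|X_n(q) - X_n(r)| \le A|q - r|^\gamma \quad \text{for } q, r \in Q_2 \cap [0, 1] \text{ with } |q - r| \le 2^{-(1-\eta)N}$$
Pick $N$ so that $K2^{-N\lambda}/(1 - 2^{-\lambda}) \le \varepsilon$, then $\delta \le 2^{-(1-\eta)N}$ so that $A\delta^\gamma < \varepsilon$, and it follows that $P(w_\delta > \varepsilon) \le \varepsilon$. □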
8.4. Skorokhod's Existence Theorem for SDE
In this section, we will describe Skorokhod's approach to constructing solutions of stochastic differential equations. We will consider the special case
$$dX_t = \sigma(X_t)\,dB_t$$
where $\sigma$ is bounded and continuous, since we can introduce the term $b(X_t)\,dt$ by change of measure. The new feature here is that $\sigma$ is only assumed to be continuous, not Lipschitz or even Hölder continuous. Examples at the end of Section 5.3 show that we cannot hope to have uniqueness in this generality.

Skorokhod's idea for solving stochastic differential equations was to discretize time to get an equation that is trivial to solve, and then pass to the limit and extract subsequential limits to solve the original equation. For each $n$, define $X_n(t)$ by setting $X_n(0) = x$ and, for $m2^{-n} < t \le (m + 1)2^{-n}$,
$$X_n(t) = X_n(m2^{-n}) + \sigma(X_n(m2^{-n}))\,(B_t - B(m2^{-n}))$$
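The discretization can be sketched in a few lines of code. This is only an illustration under stated assumptions: it takes the scheme to freeze $\sigma$ at the left endpoint of each dyadic interval, that is $X_n((m+1)2^{-n}) = X_n(m2^{-n}) + \sigma(X_n(m2^{-n}))(B((m+1)2^{-n}) - B(m2^{-n}))$, works in one dimension on $[0, 1]$, and records the path only at the dyadic times. The function names and the coefficient `sigma_example` are hypothetical.

```python
import numpy as np


def skorokhod_euler(sigma, x0, n, rng):
    """Values of X_n at the dyadic times m 2^{-n}, m = 0, ..., 2^n.

    On each interval the coefficient is frozen at sigma(X_n(m 2^{-n})),
    so every step is the old value plus sigma times a Brownian increment.
    """
    steps = 2 ** n
    dt = 1.0 / steps
    increments = rng.normal(0.0, np.sqrt(dt), size=steps)
    path = np.empty(steps + 1)
    path[0] = x0
    for m in range(steps):
        path[m + 1] = path[m] + sigma(path[m]) * increments[m]
    return path


def sigma_example(x):
    """Hypothetical coefficient: bounded and continuous, not Lipschitz at 0."""
    return min(1.0, abs(x) ** 0.5)


rng = np.random.default_rng(0)
path = skorokhod_euler(sigma_example, x0=0.5, n=8, rng=rng)
```

Each step only needs the current value and one Gaussian increment, which is the sense in which the discretized equation is trivial to solve; the work in this section lies in passing to the limit.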
Since $X_n$ is a stochastic integral with respect to Brownian motion, the formula for covariance of stochastic integrals implies
$$\langle X^i_n, X^j_n \rangle_t = \sum_k \int_0^t \sigma_{ik}\sigma_{jk}(X_n([2^n s]/2^n))\,ds = \int_0^t a_{ij}(X_n([2^n s]/2^n))\,ds$$
where as usual $a = \sigma\sigma^T$. If we suppose that $a_{ij}(x) \le M$ for all $i, j$ and $x$, it follows that if $s < t$, then
$$|\langle X^i_n \rangle_t - \langle X^i_n \rangle_s| \le M(t - s)$$
so (5.1) in Chapter 3 implies
$$E \sup_{u \in [s,t]} |X^i_n(u) - X^i_n(s)|^p \le C_p E|\langle X^i_n \rangle_t - \langle X^i_n \rangle_s|^{p/2} \le C_p \{M(t - s)\}^{p/2}$$
Taking $p = 4$, we see that
$$E \sup_{u \in [s,t]} |X^i_n(u) - X^i_n(s)|^4 \le CM^2(t - s)^2$$
Using (3.6) to check (ii) in (3.4) and then noting that $X_n(0) = x$ so (i) in (3.4) is trivial, it follows that the sequence $X_n$ is tight. Invoking Prokhorov's Theorem (2.1) now, we can conclude that there is a subsequence $X_{n(k)}$ that converges weakly to a limit $X$. We claim that $X$ satisfies MP$(0, a)$. To prove this, we will show

(4.1) Lemma. If $f \in C^2$ and $Lf = \frac{1}{2} \sum_{ij} a_{ij} D_{ij} f$ then
$$f(X_t) - f(X_0) - \int_0^t Lf(X_s)\,ds \quad \text{is a local martingale}$$
Once this is done the desired conclusion follows by applying the result to $f(x) = x_i$ and $f(x) = x_i x_j$.

Proof It suffices to show that if $f$, $D_i f$, and $D_{ij} f$ are bounded, then the process above is a martingale. Itô's formula implies that
$$f(X_n(t)) - f(X_n(s)) = \sum_i \int_s^t D_i f(X_n(r))\,dX^i_n(r) + \frac{1}{2} \sum_{ij} \int_s^t D_{ij} f(X_n(r))\,d\langle X^i_n, X^j_n \rangle_r$$
So it follows from the definition of $X_n$ that
Skorokhod's representation theorem, (1.4), implies that we can construct processes $Y^k$ with the same distributions as the $X_{n(k)}$ on some probability space in such a way that with probability 1 as $k \to \infty$, $Y^k(t)$ converges to $Y(t)$ uniformly on $[0, T]$ for any $T < \infty$. If $s < t$ and $g : C \to R$ is a bounded continuous function that is measurable with respect to $\mathcal{F}_s$, then
$$E\left( g(Y) \cdot \left\{ f(Y_t) - f(Y_s) - \int_s^t Lf(Y_r)\,dr \right\} \right) = \lim_{k\to\infty} E\left( g(Y^k) \cdot \left\{ f(Y^k_t) - f(Y^k_s) - \int_s^t L_n f(Y^k_r)\,dr \right\} \right) = 0$$
Since this holds for any continuous $g$, an application of the monotone class theorem shows
$$E\left( f(Y_t) - f(Y_s) - \int_s^t Lf(Y_r)\,dr \,\Big|\, \mathcal{F}_s \right) = 0$$
which proves (4.1). □
8.5. Donsker's Theorem
Let $\xi_1, \xi_2, \ldots$ be i.i.d. with $E\xi_i = 0$ and $E(\xi_i^2) = 1$. Let $S_n = \xi_1 + \cdots + \xi_n$ be the $n$th partial sum. The most natural way to turn $S_m$, $0 \le m \le n$ into a process indexed by $0 \le t \le 1$ is to let $\hat B^n_t = S_{[nt]}/\sqrt{n}$ where $[x]$ is the largest integer $\le x$. To have a continuous trajectory we will instead let
$$B^n_t = \begin{cases} S_m/\sqrt{n} & \text{if } t = m/n \\ \text{linear} & \text{if } t \in [m/n, (m + 1)/n] \end{cases}$$

(5.1) Donsker's Theorem. As $n \to \infty$, $B^n \Rightarrow B$, where $B$ is a standard Brownian motion.

Proof We will prove this result using (3.2). Thus there are two things to do:

Convergence of finite dimensional distributions. Since $S_{[nt]}$ is the sum of $[nt]$ independent random variables, it follows from the central limit theorem that if $0 < t_1 < \ldots < t_n \le 1$ then
$$(\hat B^n_{t_1}, \hat B^n_{t_2} - \hat B^n_{t_1}, \ldots, \hat B^n_{t_n} - \hat B^n_{t_{n-1}}) \Rightarrow (B_{t_1}, B_{t_2} - B_{t_1}, \ldots, B_{t_n} - B_{t_{n-1}})$$
so if we let
$$L_n f(r) = \tfrac{1}{2} \sum_{ij} a_{ij}(X_n([2^n r]2^{-n}))\, D_{ij} f(X_n(r))$$
then $f(X_n(t)) - f(X_n(s)) - \int_s^t L_n f(r)\,dr$ is a local martingale.
To extend the last conclusion from $\hat B^n$ to $B^n$, we begin by observing that if $r_n = nt - [nt]$ then
$$B^n_t = \hat B^n_t + r_n \xi_{[nt]+1}/\sqrt{n}$$
Since $0 \le r_n < 1$, we have $r_n \xi_{[nt]+1}/\sqrt{n} \to 0$ in probability and the converging together lemma, (1.3), implies that the individual random variables $B^n_t \Rightarrow B_t$. To treat the vector we observe that
$$(B^n_{t_m} - B^n_{t_{m-1}}) - (\hat B^n_{t_m} - \hat B^n_{t_{m-1}}) = (B^n_{t_m} - \hat B^n_{t_m}) - (B^n_{t_{m-1}} - \hat B^n_{t_{m-1}}) \to 0$$
in probability, so it follows from (1.3) that
(x1 , x2 ,
C1 = 32 {fl� + iiM } and C2 = 24 {ji._L + iiM }. Remark. We give the explicit values of C1 and C2 only to make it clear that they only depend on fi.M and iiM . Proof Only (iii) needs to be proved. Since E(ei - fi.M ) = 0 and the ei are independent where
( --+) (xi> X 1 x 2 , ... , x 1
Using the fact that . . . , Xn) + · · · + X m ) is a + continuous mapping and invoking 1. 2 it follows that the finite dimensional distributions of B n converge to those of B .
L2
Tightness. The maximal inequality for martingales (see e.g., Chapter 4 of Durrett (1995)) implies
(4.3) in
2 -< 4ES'f s E (0max ) i 5,i5.l
Taking £ =
If p is an even integer (we only care about p =
E(ej - fi.M )P ::; 2P(ji.�{ + Eef )
From this, (ii) , and our assumptions it follows that
njm and using Chebyshev's inequality it follows that P (05,j5,maxn/m lSi I > ivfn) ::; 4/me2
C2£2 . To estimate the fourth moment we notice that E(ej) = 1M 4x3P(I{t l > x)dx ::; 2M2 1M 2xP( Iei 1 > x) dx = 2M2 iiM
This gives the term
or writing things in terms of Bf
m
So using our inequality for the pth power again, we have
1/m
Since we need intervals of length to cover [0, 1] the last estimate is not . good enough to prove tightness . To improve this, we will truncate and compute fourth moments. To isolate these details we will formulate a general result.
.6,=. . . be i.i.d. with mean Let {i = �i l ( I€15: M ) , _let � 1 2 , let > St el +et o:(M) E(l�d ; l�d M), fi.M = E�i , and iiM = E(([) . P(�i ei) ::; M- 2E(I�i l 2 ; 1�1 > M) = o:(M)/M2 (ii) I J.l - fi.M I ::; E(l�d; 1�1 > M) ::; o: (M ) /M (iii) If M � 1 and o: (M) ::; 1 then (5.2) Lemma. Let = +. let Then we have ( i) :f::
2, 4) then
f.l ·
M � 1, this gives the term C1£M2 and the proof is We will apply ( 5 2 ) with M = 8 vfn. Part (i) implies (5.3) nP(�i :f:: �i)- ::; n o:(882nvfn) --+ 0 Since we have supposed complete.
o
.
·
by the dominated convergence theorem. Since J.l = 0 part (ii) implies
(5.4)
n oo. Using the L4 maximal inequality for martingales (again see e.g., (4.3) in Chapter 4 of Durrett (1995)) and part (ii9 , it follows tha� if i = .nfm and n is large then (here and in what follows C Will change from hne to hne) E sup I Sk - kiiMI 4 � GE I St - liiMI 4
as
--.
O�k9
- P
2 Sk - kiiMI � c Vn/9) � Cc-4 ( 6m + � (osup I m) �k�l
Let Bf = S[n1] /Vn and Ik ,m = probabilities in (5.5), it follows that
[kjm, (k + 1)/m].
(
maxm max I B� - Bl:; m I lim sup P O�k< •Elk,m n-oo
Adding up
)
(
m of the
> c/9 � Cc -4 o2 + _!_ m
)
To turn this into an estimate of the modulus of continuity, we note that maxm max lf( s) - f (k/m) l w1 ; m (f) � 3 O�k< •Eh,m
� s � (k + 1)/m � t � (k + 2)/m lf(t) - f (s) l � lf (t) - f ((k + 1)/m) l + lf((k + 1)/m) - f (kjm) l + if(kjm) - f (s) i If we pick m and o so that Cc4 ( o 2 + 1/m) � f. it follows that limsup P(wl fm(f:J n ) > c/3) � f. (5.6) n-oo
since for example if k/m
To convert this into an estimate the modulus of continuity of iJn we begin by observing that (5.3) and (5.4) imply
P as
n
--.
(0�sup1 9 IB� - B� l > c/3) -o 0
oo so the triangle inequality implies limsup P(wt ; m ( iJn ) n-oo
> c) � f.
Now the maximum oscillation of $B^n$ over $[s, t]$ is smaller than the maximum oscillation of $\hat B^n$ over $[[ns]/n, ([nt] + 1)/n]$ so

(5.7) ...

and it follows that
$$\limsup_{n\to\infty} P(w_{1/2m}(B^n) > \varepsilon) \le \varepsilon$$
This verifies (ii) in (3.4) and completes the proof of Donsker's theorem. □
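The process in Donsker's theorem is easy to build numerically. The sketch below assumes the definition $B^n_{m/n} = S_m/\sqrt{n}$ with linear interpolation between the points $m/n$, and uses $\pm 1$ coin flips as one admissible choice of steps with mean 0 and variance 1; the name `donsker_path` and the specific parameters are illustrative.

```python
import numpy as np


def donsker_path(xi):
    """Return the function t -> B^n_t built from the steps xi_1, ..., xi_n.

    B^n equals S_m / sqrt(n) at t = m/n and is linear in between,
    which is exactly what np.interp does with these knots.
    """
    n = len(xi)
    knots = np.arange(n + 1) / n
    heights = np.concatenate(([0.0], np.cumsum(xi))) / np.sqrt(n)
    return lambda t: np.interp(t, knots, heights)


rng = np.random.default_rng(1)
n = 400
xi = rng.choice([-1.0, 1.0], size=n)   # mean 0, variance 1
B_n = donsker_path(xi)

# Evaluate the continuous path on a fine grid, e.g. to plot it or to
# compute a functional such as the running maximum.
grid = np.linspace(0.0, 1.0, 2001)
running_max = np.max(B_n(grid))
```

Because the path is piecewise linear, its maximum over $[0, 1]$ is attained at one of the points $m/n$, so functionals of $B^n$ translate directly into statements about the random walk.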
The main motivation for proving Donsker's theorem is that it gives as corollaries a number of interesting facts about random walks. The key to the vault is the continuous mapping theorem, (1.2), which with (5.1) implies:
Using Chebyshev's inequality now it follows that
(5.5)
(5.8) Theorem. If $\psi : C[0, 1] \to R$ has the property that it is continuous $P_0$-a.s. then $\psi(B^n) \Rightarrow \psi(B)$.

Example 5.1. Let $\psi(\omega) = \max\{\omega(t) : 0 \le t \le 1\}$. It is easy to see that $|\psi(\omega) - \psi(\xi)| \le \|\omega - \xi\|$ so $\psi : C[0, 1] \to R$ is continuous, and (5.8) implies
$$\max_{0 \le m \le n} S_m/\sqrt{n} \Rightarrow M_1 = \max_{0 \le t \le 1} B_t$$
To complete the picture, we observe that by (3.8) in Chapter 1 the distribution of the right-hand side is
$$P_0(M_1 \ge a) = P_0(T_a \le 1) = 2P_0(B_1 \ge a)$$

Example 5.2. Let $\psi(\omega) = \sup\{t \le 1 : \omega(t) = 0\}$. This time $\psi$ is not continuous, for if $\omega_\varepsilon$ has $\omega_\varepsilon(0) = 0$, $\omega_\varepsilon(1/3) = 1$, $\omega_\varepsilon(2/3) = \varepsilon$, $\omega_\varepsilon(1) = 2$, and is linear on each interval $[j/3, (j + 1)/3]$ then $\psi(\omega_0) = 2/3$ but $\psi(\omega_\varepsilon) = 0$ for $\varepsilon > 0$. It is easy to see that if $\psi(\omega) < 1$ and $\omega(t)$ has positive and negative values in each interval $(\psi(\omega) - \delta, \psi(\omega))$ then $\psi$ is continuous at $\omega$. By arguments in Example 3.1 of Chapter 1, the last set has $P_0$ measure 1. (If the zero at $\psi(\omega)$ was isolated on the left, it would not be isolated on the right.) Using (5.8) now
$$\sup\{m \le n : S_{m-1} \cdot S_m \le 0\}/n \Rightarrow L = \sup\{t \le 1 : B_t = 0\}$$
The distribution of $L$, given in (4.2) in Chapter 1, is an arcsine law.

Example 5.3. Let $\psi(\omega) = |\{t \in [0, 1] : \omega(t) > a\}|$. The point $\omega \equiv a$ shows that $\psi$ is not continuous but it is easy to see that $\psi$ is continuous at paths $\omega$ with $|\{t \in [0, 1] : \omega(t) = a\}| = 0$. Fubini's theorem implies that
$$E_0|\{t \in [0, 1] : B_t = a\}| = \int_0^1 P_0(B_t = a)\,dt = 0$$
so 7/J is continuous Po-a.s. With a little work (5.8) implies
l {m :::; n : Sm > a vfn}l/n l{t E [0, 1] : Bt > a} l Before doing that work we would like to observe that (9.2) in Chapter 4 shows that l {t E [0, 1] : Bt > O}l has an arcsine law under P0 • Proof Application of (5.8) gives that for any a, l {t E [0, 1] : Bf > avfn}l l {t E [0, 1] : Bt > a}l To convert this into a result about l { m :::; n : Sm > afo}l we note that if e > 0 then =>
=>
(5.9)
{ m n IXml :::; ey'Ti}, we have l {t E [0, 1] : Bf > (a + e) vfn}l :::; -n1 l {m :::; n : Sm > avfn}l :::; l {t E [0, 1] : Bf (a - e)vfn}l Combining this with the first conclusion of the proof and using the fact that b -+ l{t E [0, 1] : Bt > b} I is continuous at b = a with probability one, we arrive easily at the desired conclusion. 0 Example 5.4. Let 1/J(w) = Jr0 , 11 w(t) k dt where k > 0 is an integer. 7/J is
by dominated convergence, and on max
.::;
>
continuous so applying (5.8) gives
IJ.'
8.6. The Space D

In the previous section, forming the piecewise linear approximation was an annoying bookkeeping detail. In the next section when we consider processes that jump at random times, making them piecewise linear will be a genuine nuisance. To deal with that problem and to educate the reader, we will introduce the space $D([0, 1], R^d)$ of functions from $[0, 1]$ into $R^d$ that are right continuous and have left limits. Since we only need two simple results, (6.4) and (6.5) below, and one can find this material in Chapter 3 of Billingsley (1968), Chapter 3 of Ethier and Kurtz (1986), or Chapter VI of Jacod and Shiryaev (1987), we will content ourselves to simply state the results. We begin by defining the Skorokhod topology on $D$. To motivate this consider
To convert this into a result about the original sequence, we begin by observing that if y with and then
Example 6.1. For $1 \le n \le \infty$ let
$$f_n(t) = \begin{cases} 0 & t \in [0, (n + 1)/2n) \\ 1 & t \in [(n + 1)/2n, 1] \end{cases}$$
where $(n + 1)/2n = 1/2$ for $n = \infty$. We certainly want $f_n \to f_\infty$ but $\|f_n - f_\infty\| = 1$ for all $n$.

Let $\Lambda$ be the class of strictly increasing continuous mappings of $[0, 1]$ onto itself. Such functions necessarily have $\lambda(0) = 0$ and $\lambda(1) = 1$. For $f, g \in D$ define $d(f, g)$ to be the infimum of those positive $\varepsilon$ for which there is a $\lambda \in \Lambda$ so that
$$\sup_t |\lambda(t) - t| \le \varepsilon \quad \text{and} \quad \sup_t |f(t) - g(\lambda(t))| \le \varepsilon$$
It is easy to see that $d$ is a metric. If we consider $f = f_n$ and $g = f_m$ in Example 6.1 then for $\varepsilon < 1$ we must take $\lambda((n + 1)/2n) = (m + 1)/2m$ so
$$d(f_n, f_m) = \left| \frac{n + 1}{2n} - \frac{m + 1}{2m} \right| = \left| \frac{1}{2n} - \frac{1}{2m} \right|$$
When $m = \infty$ we have $d(f_n, f_\infty) = 1/2n$ so $f_n \to f_\infty$ in the metric $d$. We will see in (6.2) that $d$ defines the correct topology on $D$. However, in view of (2.2), it is unfortunate that the metric $d$ is not complete.

Example 6.2. For ... The pointwise limit of $g_n$ is $g_\infty \equiv 0$ but $d(g_n, g_\infty) = 1$. We leave it to the reader to show ...
To fix the problem with completeness we require that $\lambda$ be close to the identity in a more stringent sense: the slopes of all of its chords are close to 1. If $\lambda \in \Lambda$ let
$$\|\lambda\| = \sup_{s \ne t} \left| \log\left( \frac{\lambda(t) - \lambda(s)}{t - s} \right) \right|$$
For $f, g \in D$ define $d_0(f, g)$ to be the infimum of those positive $\varepsilon$ for which there is a $\lambda \in \Lambda$ so that
$$\|\lambda\| \le \varepsilon \quad \text{and} \quad \sup_t |f(t) - g(\lambda(t))| \le \varepsilon$$
It is easy to see that $d_0$ is a metric. The functions $g_n$ in Example 6.2 have
$$d_0(g_n, g_m) = \min\{1, |\log(n/m)|\}$$
so they no longer form a Cauchy sequence. In fact, there are no more problems.

(6.1) Theorem. The space $D$ is complete under the metric $d_0$.

For the reader who is curious why we discussed the simpler metric $d$ we note:

(6.2) Theorem. The metrics $d$ and $d_0$ are equivalent, i.e., they give rise to the same topology on $D$.
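To make the role of the time change $\lambda$ concrete, the following sketch revisits the step functions of Example 6.1, assuming $f_n$ jumps from 0 to 1 at $c_n = (n + 1)/2n$. It evaluates the two suprema in the definition of $d$ on a grid for one particular piecewise linear $\lambda$ that sends $c_n$ to $c_m$. This exhibits a single admissible $\lambda$ and therefore only an upper bound for $d(f_n, f_m)$; the names `step`, `time_change` and the grid are illustrative.

```python
import numpy as np


def step(n):
    """f_n from Example 6.1: 0 on [0, c_n), 1 on [c_n, 1], c_n = (n+1)/(2n)."""
    c = (n + 1) / (2 * n)
    return lambda t: (np.asarray(t) >= c).astype(float)


def time_change(n, m):
    """Piecewise linear lambda with lambda(0) = 0, lambda(c_n) = c_m, lambda(1) = 1."""
    c_n = (n + 1) / (2 * n)
    c_m = (m + 1) / (2 * m)
    return lambda t: np.interp(t, [0.0, c_n, 1.0], [0.0, c_m, 1.0])


n, m = 2, 8
t = np.linspace(0.0, 1.0, 4097)
f_n, f_m, lam = step(n), step(m), time_change(n, m)

uniform_gap = np.max(np.abs(f_n(t) - f_m(t)))      # sup norm distance
warped_gap = np.max(np.abs(f_n(t) - f_m(lam(t))))  # after the time change
time_gap = np.max(np.abs(lam(t) - t))              # cost of the time change
```

On this grid the uniform distance is 1, while after the time change the two functions agree and the cost $\sup_t |\lambda(t) - t|$ equals $|1/2n - 1/2m|$, which matches the value of $d(f_n, f_m)$ computed in Example 6.1.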
S C [O, I]. For 0 < 8 < I put w�(f) = {inft i} O u = min rn Tn - 1 : 1 n () = 0
o and 8 $
E/4 then w0(Xh ) $ f..
Proof To check this we note that if
cases to consider 1.
CASE
Con vergence to Diffusions
u > o and 0 < t - s < o there are two
Tn $ < t < Tn+1 lf(t) - f(s)l $ lf(t) - f(rn)l + lf(rn) - f(s)l $ 2E/4 S
l f(t) - f(s)l :5 l f(t) - f(rn)l + l f(rn) - f(rn-) 1 + l f (Tn-) - f(Tn - 1 ) 1 + l f(Tn - 1 ) - f(s)l $ f.
Combining (7.10) and (7. 11) we have
Py(u $ o) $ k sup Px (r $ o) + Py(N X
>
k) $ C•14k0 + e.Xk
If We pick k SO that e,Xk $ E/3 and then pick 0 SO that C E) :::; f.. This verifies (ii) in (6.4). Since X8 = (i) is trivial and the proof of tightness is complete in discrete time. 0
Py(w6(Xh )
Xh ,
0
Tightness proof, discrete time. One of the probabilities in (7.8) is
Tightness proof, continuous time. One of the probabilities in (7.8)
E/4 occur at a rate smaller than llh (x, B(x, (7.9b) Py(O ;::: E/4) $ 1 - exp(- sup Q h (x, B(x, Ej4t)) $ sup ��14 (x) --+ 0 by (iii'). The first step in estimating Py(u > o) is to estimate Py(r1 $ o). To do this we begin by observing that (7.2b) and (7.3) imply
is trivial to estimate. Since jumps of size supx E/4Y) and e-z ;::: 1 - z
>
X
(7.9a)
Py(u
_xn
(7.11)
(7.12)
trivial to estimate
by (iii'). The first step in estimating > o) is to estimate do this we begin by observing that (7.2a) and (7.3) imply
Iterating and using the strong Markov property it follows that Ey( e- r" ) :::; and
To relate these definitions to tightness, we will now prove (7.8) Lemma. If
8. 7
Py(r1 $ o). To
fy •f4 (Xfh ) + C.t4kh, k = 0, 1, 2, . . . is a submartingale ,
!y,
>
=:
(7.13) Lemma. If f E C'k then
I Lh f - Lfl loo --+ 0.
Proof Using (7.4) then integrating with respect to
IY - xl � 1 we have Lh f (x) = I )f (x)Dd(x)
Kh(x, dy)
over the set
+ 1l y- xl9 .L:( Yi - x;)(Yj - Xj )D;jf (zx,y ) Il {f(y) - f(x)} I n - 1 12 , 6.! / n (x) = 0 for all x so (iii) holds. To check conditions (i) and (ii) we note that n - x..jii __!__ . n - x ..jii } = -x . blfn (x) = n { __!__ 2n ..jii 2n Vn a1/n ( x) - n { -n1 · n -2nx ..jii + -n1 · n -2nx..jii } = 1 The limiting coefficients b(x) = -x and a(x) = 1 are Lipschitz continuous, so n n the martingale problem is well posed and (A) holds. Letting x1l/ = y,[nl/1]/ n . (7.1) now it follows that and applymg n (8.1) Theorem. As n -�o oo, X11/ converges weakly to an Ornstein-Uhlenbeck process x1 , i.e. , the solution of n
-
t:
_
We have supposed that the martingale problem for and has a unique nonexplosive solution In view of (3.3), to prove it suffices to show that given any sequence =? To -�o 0 there is a subsequence so that prove this we note that since each sequence is tight, a diagonal argument =? shows that we can select a subsequence so that for each k, Let is an open set so if w : w(t) E B(O, k) , 0 ::; t ::; { , =? and H c C is open then implies that
G Xhn ,k Xk0·k=
Examples
Examples
In this section we will give applications of very simple situation.
c. Localization
0• X hn
8 .8.
8.8
_
Next we state a lemma that will help us check the hypotheses in the next two examples. The point here is to replace the truncated moments by ordinary moments that are easier to compute.
lifi (x) = j(y; - x;)(yj - Xj)I1
� IY - xl 2
i (y; - x;)(Yi - Xj) I I 0, Lk >on k2 pA: = 0 for large n Then we will argue that if n is large then with high probability we never see any families of size larger than n6. The limiting coefficients satisfy (3.3) in Chapter 5, so using (4.1) there, we see that the martingale problem is well posed. Calculations in Example 1.6 of Chapter 5 show that (a) and (b) hold. To check (c) with p = 4, we begin by noting that when zr; = f., zr is the sum of f. independent random variables �r with mean 1 + f3n/n ;:::: 0 and variance IT�. Let 6 > 0. Under (A3') we have in addition l�f I � n6 for large n. Using (a + b )4 � 24a4 + 24b4 we have �� / n (x) = nE(n-4 {Z1 - nx} 4 I Z� = nx) � 16n -3(xf3n)4 + 16n -3E( {Zf - nx (1 + f3n/n)}4 I Z� = nx) To bound the second term we note that
{z:;. , m ;:::: 0 } in which the probability of k children Pk has mean 1 + (f3n/n) and variance IT� . Suppose that (A1) f3n -+ (3 E ( -oo, oo), (A2) ITn -+ IT E (O, oo), (A3) for any 6 > 0, Lk>o n k2 pA: -+ 0 n Following the motivation in Example 1.6 of Chapter 5, we let Xt1 / = Z�t] fn.
processes
307
the result, see Sections 3.3-3.4 of Jagers (1975). For an approach based on semigroups and generators see Section 9.1 of Ethier and Kurtz (1986).
then (i) , (ii), and (iii) of (7.1) hold.
lat (x) - afj (x) l �
Examples
References. For a classical generating function approach and the history of
(c) limh!O SUP j xj� R /; (x) = 0
Proof Taking the conditions in reverse order, we note
8. 8
E({Zf - nx(1 + f3nfn)} 4 I Z� = nx ) = nxE(�i - (1 + f3n/n))4 + 6 n2x E(�i - (1 + f3n/n))2
()
E(ei - (1 + f3n fn))4, we note that if 1 + f3nfn � n6 no 4x3 P( l�f - (1 + f3n/n) i > x ) dx E(�i - (1 + f3n/n))4 = no � 2(n6) 2 2x P( l�i - (1 + f3n/n) l > x) dx � 262 1T�n2
To bound
1
1
::; R then �� /n (x) � 3262 1T�R + 961T;R2 n - 1 + 16( xf3n/n)4
Combining the last three estimates we see that if lxl
Since 6 > 0 is arbitrary we have established (c) and the desired conclusion follows from (8.2) and (7.1). To replace (A3') by (A3) now, we observe that the convergence for the special case . and use of the continuous mapping theorem as in Example 5.4 imply
n-1 1 2 n 2.:::: z:;. => 1 X. ds m=O
0
(A3) implies that the probability of a family of size larger than $n\delta$ is $o(n^{-2})$. Thus if we truncate the original sequence by not allowing more than $b_n n$ children where $b_n \to 0$ slowly, then (A1), (A2) and (A3') will hold and the probability of a difference between the two systems will converge to 0.
0
Example 8.3. Wright-Fisher Diffusion. Recall the setup of Example 1.7 in Chapter 5. We have an urn with $n$ letters in it which may be $A$ or $a$. To build up the urn at time $m + 1$ we sample with replacement from the urn at time $m$ but with probability $\alpha/n$ we ignore the draw and place an $a$ in, and with probability $\beta/n$ we ignore the draw and place an $A$ in. Let $Z^n_m$ be the number of $A$'s in the urn at time $m$, and let $X^{1/n}_t = Z^n_{[nt]}/n$. Our goal is to show

(8.4) Theorem. As $n \to \infty$, $X^{1/n}_t$ converges weakly to the Wright-Fisher diffusion $X_t$, i.e., the solution of
References. Again the first results are due to Feller (1951). Chapter 10 of Ethier and Kurtz (1986) treats this problem and generalizations allowing more than 2 alleles.
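The urn scheme of Example 8.3 can be simulated directly. The sketch below assumes that, given a fraction $x$ of $A$'s in the current urn, each of the $n$ new letters is independently an $A$ with probability $p = x(1 - \alpha/n) + (1 - x)\beta/n$, so that the new count is Binomial$(n, p)$; the function names and parameter values are illustrative only.

```python
import numpy as np


def wright_fisher_step(z, n, alpha, beta, rng):
    """One generation of the urn: returns the new number of A's.

    With x = z/n, a new letter is A with probability
    p = x (1 - alpha/n) + (1 - x) beta/n, independently for each of the n slots.
    """
    x = z / n
    p = x * (1.0 - alpha / n) + (1.0 - x) * beta / n
    return rng.binomial(n, p)


def wright_fisher_path(z0, n, alpha, beta, generations, rng):
    """Fractions Z_m / n for m = 0, ..., generations."""
    counts = np.empty(generations + 1, dtype=int)
    counts[0] = z0
    for m in range(generations):
        counts[m + 1] = wright_fisher_step(counts[m], n, alpha, beta, rng)
    return counts / n


rng = np.random.default_rng(2)
n = 200
# Running n generations corresponds to one unit of time for X^{1/n}_t.
fractions = wright_fisher_path(z0=n // 2, n=n, alpha=1.0, beta=2.0,
                               generations=n, rng=rng)
```

Indexing the output by $t = m/n$ gives a sample path of $X^{1/n}_t$ on $[0, 1]$; with $\alpha = \beta = 0$ the states 0 and 1 are absorbing, as one would expect for pure resampling.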
Proof The limiting coefficients satisfy (3.3) in Chapter 5 so using (4.1) there we see that the martingale problem is well posed. Calculations in Example 1.7 in Chapter 5 show that (a) and (b) hold. To check condition (c) with $p = 4$ now, we note that if $Z^n_0 = nx$ then the distribution of $Z^n_1$ is the same as that of $S_n = \xi_1 + \cdots + \xi_n$ where $\xi_1, \ldots, \xi_n \in \{0, 1\}$ are i.i.d. with $P(\xi_i = 1) = x(1 - \alpha/n) + (1 - x)\beta/n$.
E(Sn - np)4 = E (t �i - p) ] =1 = nE(�i - p)4 + 6 (;) E(�i - p) 2 � Cn2
E(€i - p) m � 1 for all m ;::: 0. 24a4 + 24b4 it follows that
From this and the inequality
Proof The new {a) and (b) imply the ones in {8.2). To check (c) of (8.2) with
p = 2 we note that
� af; (x) = j l y - xi 2I (s + 1, i) (s, i) -> (s - 1, i + 1) ( i) ( i - 1) 81
->
81
a(n - s - i) f3sifn Ji rate a and is
In wor�s, each immune individual dies at replaced by a new susc�ptrble. Each susceptible individual becomes infected at rate f3 times the fractiOn of the population that is infected, ifn. Finally, each infected individual recovers at rate 1 and enters the immune class. n Let Xtl / = (Sf!n, Ifln). To check (a) in (8.5) we note that
1 {an(1 a111/ n (x1 , x 2 ) = ? x1 - x 2 ) + f3nx1x 2 } n1 f3n a 11 /2n (x1, x 2 ) = ? n- x1x 2 1 {f3nx1x + a 221 / n (x1 , x 2 ) = 2 2 1nx1x 2 } n · and all three terms converge to 0 uniformly in r = {(xr, x 2 ) : xr, x 2 ;::: 0, x1 + x 2 � 1}. To check (b) in (8.5) we observe b 11 / n (xr, x 2 ) = an(1 - x1.- x 2 ) n1 f3nx1x 2 1 n = a(1 - xr - x 2) - f3x1x 2 = br(x) b21/ n (xr, x 2 ) = f3n x1x 2 -n1 - 1nx 2 n1 = f3xrx 2 - 'YX 2 = b2 (x) ·
(a + b)4
�
0
Our final example has a deterministic limit. In such situations the following lemma is useful.
(8.5) Lemma. If for all $R < \infty$ we have
(a) $\lim_{h \downarrow 0} \sup_{|x| \le R} |a^h_{ij}(x)| = 0$
(b) $\lim_{h \downarrow 0} \sup_{|x| \le R} |b^h_i(x) - b_i(x)| = 0$
then (i), (ii), and (iii) of (7.1) hold with $a_{ij}(x) \equiv 0$.
· - -
uniformly for x E [0, 1) and the proof is complete.
Examples
·
4
since
·
8.8
·
·
-
·
The limiting coefficients are Lipschitz continuous on $\Gamma$. Their values outside $\Gamma$ are irrelevant so we extend them to be Lipschitz continuous. It follows that the martingale problem is well posed and we have

(8.6) Theorem. As $n \to \infty$, $X^{1/n}_t$ converges weakly to $X_t$, the solution of the ordinary differential equation
$$dX_t = b(X_t)\,dt$$

Remark. If we set $\alpha = 0$ then our epidemic model reduces to one considered in Sections 11.1-11.3 of Ethier and Kurtz (1986). In addition to (8.6) they prove a central limit theorem for $\sqrt{n}(X^{1/n}_t - X_t)$.

Solutions to Exercises

Chapter 1
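1.1. Let $\mathcal{A} = \{A = \{\omega : (\omega(t_1), \omega(t_2), \ldots) \in B\} : B \in \mathcal{R}^{\{1,2,\ldots\}}\}$. Clearly, any $A \in \mathcal{A}$ is in the $\sigma$-field generated by the finite dimensional sets.
To complete the proof, we only have to check that $\mathcal{A}$ is a $\sigma$-field. The first and easier step is to note that if $A = \{\omega : (\omega(t_1), \omega(t_2), \ldots) \in B\}$ then $A^c = \{\omega : (\omega(t_1), \omega(t_2), \ldots) \in B^c\} \in \mathcal{A}$. To check that $\mathcal{A}$ is closed under countable unions, let $A_n = \{\omega : (\omega(t_1^n), \omega(t_2^n), \ldots) \in B_n\}$, let $t_1, t_2, \ldots$ be an ordering of $\{t_m^n : n, m \ge 1\}$ and note that we can write $A_n = \{\omega : (\omega(t_1), \omega(t_2), \ldots) \in E_n\}$ so $\cup_n A_n = \{\omega : (\omega(t_1), \omega(t_2), \ldots) \in \cup_n E_n\} \in \mathcal{A}$.

1.2. Let $A_n = \{\omega :$ there is an $s \in [0, 1]$ so that $|B_t - B_s| \le C|t - s|^\gamma$ when $|t - s| \le k/n\}$. For $1 \le i \le n - k + 1$ let
$$Y_{i,n} = \max\left\{ \left| B\!\left(\frac{i+j}{n}\right) - B\!\left(\frac{i+j-1}{n}\right) \right| : j = 0, 1, \ldots, k-1 \right\}$$
$$B_n = \{\text{at least one } Y_{i,n} \text{ is } \le (2k-1)C/n^\gamma\}$$
Again $A_n \subset B_n$ but this time if $\gamma > 1/2 + 1/k$, i.e., $k(1/2 - \gamma) < -1$, then
$$P(B_n) \le n P(|B(1/n)| \le (2k-1)C/n^\gamma)^k \le n P(|B(1)| \le (2k-1)C n^{1/2-\gamma})^k \le C' n^{k(1/2-\gamma)+1} \to 0$$

1.3. The first step is to observe that the scaling relationship (1.3) implies
$$(*) \qquad \Delta_{m,n} \stackrel{d}{=} 2^{-n/2} \Delta_{1,0}$$
while the definition of Brownian motion shows $E\Delta_{1,0}^2 = t$, and $E(\Delta_{1,0}^2 - t)^2 = C < \infty$. Using (*) and the definition of Brownian motion, it follows that if $k \ne m$ then $\Delta_{k,n}^2 - t2^{-n}$ and $\Delta_{m,n}^2 - t2^{-n}$ are independent and have mean 0 so
$$E\left( \sum_{1 \le m \le 2^n} \Delta_{m,n}^2 - t \right)^2 = \sum_{1 \le m \le 2^n} E\left( \Delta_{m,n}^2 - t2^{-n} \right)^2 = 2^n \cdot C 2^{-2n}$$
where in the last equality we have used (*) again. The last result and Chebyshev's inequality imply
$$P\left( \left| \sum_{1 \le m \le 2^n} \Delta_{m,n}^2 - t \right| \ge 1/n \right) \le C n^2 2^{-n}$$
The right-hand side is summable, so the Borel-Cantelli lemma implies that $\sum_{1 \le m \le 2^n} \Delta_{m,n}^2 \to t$ almost surely.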
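The variance computation in 1.3 is easy to test by simulation. The sketch below is an illustration under stated assumptions: it takes the increments over $2^n$ equal subintervals of $[0, t]$ to be independent normals with variance $t2^{-n}$, and it uses the fact that the squared increment of a normal with variance $t$ has variance $2t^2$, so that $C = 2t^2$. The function names, the seed and the number of repetitions are hypothetical.

```python
import math
import random


def dyadic_quadratic_variation(n, t, rng):
    """Sum of squared Brownian increments over 2**n equal subintervals of [0, t]."""
    sd = math.sqrt(t / 2 ** n)  # standard deviation of one increment
    return sum(rng.gauss(0.0, sd) ** 2 for _ in range(2 ** n))


def mean_squared_error(n, t=1.0, reps=2000, seed=0):
    """Monte Carlo estimate of E(sum of squared increments - t)^2."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(reps):
        total += (dyadic_quadratic_variation(n, t, rng) - t) ** 2
    return total / reps


def predicted_error(n, t=1.0):
    """2^n * C * 2^(-2n) with C = 2 t^2, the variance of a squared N(0, t) variable."""
    return 2.0 * t * t * 2.0 ** (-n)


if __name__ == "__main__":
    for n in (2, 4, 6, 8):
        print(n, mean_squared_error(n), predicted_error(n))
```

The estimated and predicted columns should agree up to Monte Carlo error, and both halve each time n increases by one.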
Let $\sigma_n = \inf\{t > 0 : B_t \notin D(y, 1/n)\}$. $\sigma_n \downarrow 0$ as $n \uparrow \infty$ so $P_y(\sigma_n < \tau) \to 1$.
$$1 - \epsilon = E_y f(B_\tau) = E_y(f(B_\tau); \tau \le \sigma_n) + E_y(v(B_{\sigma_n}); \tau > \sigma_n)$$

$$\int_0^1 \nabla g(y + \theta(z - y)) \cdot \theta(z - y)\,d\theta$$
Continuity of $\nabla g$ implies that if $|z - y| < r$ and $z \in U$ then $\nabla g(y + \theta(z - y)) \cdot \theta(z - y) \ge 0$ so $g(z) \ge 0$. This shows that for small $r$ the truncated cone $U \cap D(y, r) \subset G^c$ so the regularity of $y$ follows from (4.5c).
4.4. Let $\tau = \inf\{t > 0 : B_t \in V(y, v, a)\}$. By translation and rotation we can suppose $y = 0$, $v = (1, 0, \ldots, 0)$, and the $d-1$ dimensional hyperplane is $z_d = 0$. Let $T_0 = \inf\{t > 0 : B_t^d = 0\}$. (2.9) in Chapter 1 implies that $P_0(T_0 = 0) = 1$. If $a > 0$ then for some $k < \infty$ the hyperplane can be covered by $k$ rotations of $V$ so $P_0(\tau = 0) \ge 1/k$ and it follows from the 0-1 law that $P_0(\tau = 0) = 1$.

7.1. (a) Ignoring $C_d$ and differentiating gives
$$D_{x_i} h_\theta = -\frac{d}{2} \cdot \frac{2(x_i - \theta_i)\,y}{(|x - \theta|^2 + y^2)^{(d+2)/2}}$$
$$D_{x_i x_i} h_\theta = -d \cdot \frac{y}{(|x - \theta|^2 + y^2)^{(d+2)/2}} + \frac{d(d+2)(x_i - \theta_i)^2\,y}{(|x - \theta|^2 + y^2)^{(d+4)/2}}$$
$$D_y h_\theta = \frac{1}{(|x - \theta|^2 + y^2)^{d/2}} - d \cdot \frac{y^2}{(|x - \theta|^2 + y^2)^{(d+2)/2}}$$
$$D_{yy} h_\theta = -d \cdot \frac{2y + y}{(|x - \theta|^2 + y^2)^{(d+2)/2}} + \frac{d(d+2)\,y^3}{(|x - \theta|^2 + y^2)^{(d+4)/2}}$$
Adding up we see that
$$\sum_{i=1}^{d-1} D_{x_i x_i} h_\theta + D_{yy} h_\theta = \frac{\{(-d)(d-1) + (-d) \cdot 3\}\,y}{(|x - \theta|^2 + y^2)^{(d+2)/2}} + \frac{d(d+2)\,y\,(|x - \theta|^2 + y^2)}{(|x - \theta|^2 + y^2)^{(d+4)/2}} = 0$$
The fact that $\Delta u(x) = 0$ follows from (1.7) as in the proof of (7.1).
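As a sanity check on part (a), the Laplacian of the kernel can be approximated by central differences. This is a sketch only: it assumes $h_\theta(x, y) = y/(|x - \theta|^2 + y^2)^{d/2}$ with the constant $C_d$ dropped and $x, \theta \in \mathbf{R}^{d-1}$, and the function names, evaluation points and step size are hypothetical.

```python
def kernel(x, y, theta, d):
    """y / (|x - theta|^2 + y^2)^(d/2), with x and theta in R^(d-1) and y > 0."""
    r2 = sum((xi - ti) ** 2 for xi, ti in zip(x, theta)) + y * y
    return y / r2 ** (d / 2)


def second_difference_y(x, y, theta, d, eps=1e-3):
    """Central-difference approximation of the second derivative in y."""
    base = kernel(x, y, theta, d)
    return (kernel(x, y + eps, theta, d) - 2.0 * base
            + kernel(x, y - eps, theta, d)) / eps ** 2


def laplacian(x, y, theta, d, eps=1e-3):
    """Sum of the d-1 second differences in x plus the second difference in y."""
    base = kernel(x, y, theta, d)
    total = second_difference_y(x, y, theta, d, eps)
    for i in range(d - 1):
        up, down = list(x), list(x)
        up[i] += eps
        down[i] -= eps
        total += (kernel(up, y, theta, d) - 2.0 * base
                  + kernel(down, y, theta, d)) / eps ** 2
    return total


if __name__ == "__main__":
    x, theta, y, d = (0.3, -0.2), (1.0, 0.5), 0.7, 3
    print("second derivative in y:", second_difference_y(x, y, theta, d))
    print("full Laplacian:", laplacian(x, y, theta, d))
```

The individual second derivatives are of order one, while their sum should be zero up to discretization error, in line with the cancellation shown above.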
(b) Clearly $\int d\theta\, h_\theta(x, y)$ is independent of $x$. To show that it is independent of $y$, let $x = 0$ and change variables $\theta_i = y\varphi_i$ for $1 \le i \le d-1$ to get
$$\int d\theta\, h_\theta(0, y) = \int d\varphi\, h_\varphi(0, 1)$$
(c) Changing variables $\theta_i = x_i + r_i y$ and using dominated convergence
$$\int_{D(x,\epsilon)^c} d\theta\, h_\theta(x, y) = \int_{D(0,\epsilon/y)^c} dr\, h_r(0, 1) \to 0$$
as $y \to 0$.
(d) Since $P_x(\tau < \infty) = 1$ for all $x \in H$, this follows from (4.3).

9.1. $c(x) \equiv -\beta < 0$ so $w(x) = E_x e^{-\beta\tau} \le 1$ and it follows from (6.3) that $v(x) = E_x e^{-\beta\tau}$ is the unique solution of
$$\tfrac{1}{2} u'' - \beta u = 0 \qquad u(-a) = u(a) = 1$$
Guessing $u(x) = B\cosh(bx)$ with $b > 0$ we find
$$\tfrac{1}{2} u'' - \beta u = \left( \tfrac{1}{2} B b^2 - \beta B \right) \cosh(bx) = 0$$
if $b = \sqrt{2\beta}$. Then we take $B = 1/\cosh(a\sqrt{2\beta})$ to satisfy the boundary condition.

Chapter 5

3.1. We first prove that $h$ is Lipschitz continuous with constant $C_2 = 2C_1 + R^{-1}|h(0)|$ on $D_2 = \{R \le |x| \le 2R\}$. Let $f(x) = (2R - |x|)/R$ and $g(x) = h(Rx/|x|)$. If $x, y \in D_2$ then
$$|f(x) - f(y)| = \frac{\big|\,|y| - |x|\,\big|}{R} \le \frac{|x - y|}{R}$$
Since $h$ is Lipschitz continuous on $D_1 = \{|x| \le R\}$, and $Rx/|x|$ is the projection of $x$ onto $D_1$ we have
$$|g(x) - g(y)| = |h(Rx/|x|) - h(Ry/|y|)| \le C_1 \left| \frac{Rx}{|x|} - \frac{Ry}{|y|} \right| \le C_1 |x - y|$$
Combining the last two results, introducing $\|k\|_{\infty,i} = \sup\{|k(x)| : x \in D_i\}$, and using the triangle inequality:
$$|h(x) - h(y)| \le \|f\|_{\infty,2}\,|g(x) - g(y)| + |f(x) - f(y)|\,\|g\|_{\infty,2}$$
$$\le 1 \cdot C_1 |x - y| + R^{-1} |x - y|\,\|h\|_{\infty,1} \le (2C_1 + |h(0)|/R)\,|x - y| = C_2 |x - y|$$
since for $x \in D_1$, $|h(x)| \le |h(0)| + C_1 R$.
To extend the last result to Lipschitz continuity on $\mathbf{R}^d$ we begin by observing that if $x \in D_1$, $y \in D_2$, and $z$ is the point of $\partial D_1$ on the line segment between $x$ and $y$ then
$$|h(x) - h(y)| \le |h(x) - h(z)| + |h(z) - h(y)| \le C_1 |x - z| + C_2 |z - y| \le C_2 |x - y|$$
since $C_1 \le C_2$ and $|x - z| + |z - y| = |x - y|$. This shows that $h$ is Lipschitz continuous with constant $C_2$ on $|z| \le 2R$. Repeating the last argument taking $|x| \le 2R$ and $|y| > 2R$ completes the proof.

3.2. (a) Differentiating we have
$$\kappa(x) = -Cx \log x$$
$$\kappa'(x) = -C \log x - C > 0 \quad \text{if } 0 < x < e^{-1}$$
$$\kappa''(x) = -C/x < 0 \quad \text{if } x > 0$$
so $\kappa$ is strictly increasing and concave on $[0, e^{-2}]$. Since $\kappa(e^{-2}) = 2Ce^{-2}$ and $\kappa'(e^{-2}) = C$, $\kappa$ is strictly increasing and concave on $[0, \infty)$. When $\epsilon \le e^{-2}$
$$\int_0^\epsilon \frac{-1}{Cx \log x}\,dx = -C^{-1} \log|\log x|\,\Big|_0^\epsilon = \infty$$
(b) If $g(t) = \exp(-1/t^p)$ with $p > 0$ then
$$g'(t) = \frac{p}{t^{p+1}} \exp(-1/t^p) = p\,g(t)\,\{\log(1/g(t))\}^{(p+1)/p}$$

5.1. Under $Q$ the coordinate maps $X_t(\omega)$ satisfy MP$(\beta + b, a)$. Let
$$\bar{X}_t = X_t - \int_0^t \beta(X_s) + b(X_s)\,ds = \tilde{X}_t - \int_0^t b(X_s)\,ds$$
Since we are interested in adding drift $-b$ we let $c = a^{-1}b$ as before, but let
$$\bar{Y}_t = -\int_0^t c(X_s) \cdot d\bar{X}_s = -Y_t + \int_0^t c(X_s) \cdot b(X_s)\,ds = -Y_t + \int_0^t b a^{-1} b(X_s)\,ds$$
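Returning to 3.2 (a), the divergence of the integral of $1/\kappa$ near 0 is very slow, which a short computation makes visible. The sketch below assumes $\kappa(x) = -Cx\log x$ on $(0, e^{-2}]$ and compares a midpoint-rule value of the integral over $[\delta, \epsilon]$ with the antiderivative used in the solution; the function names, step count and values of $\delta$ are hypothetical.

```python
import math


def osgood_integral(delta, eps, C=1.0, steps=200_000):
    """Midpoint-rule value of the integral of 1/kappa over [delta, eps].

    Substituting u = log x turns dx / (-C x log x) into du / (-C u).
    """
    lo, hi = math.log(delta), math.log(eps)
    width = (hi - lo) / steps
    return sum(width / (-C * (lo + (k + 0.5) * width)) for k in range(steps))


def closed_form(delta, eps, C=1.0):
    """(log|log delta| - log|log eps|) / C, from the antiderivative -log|log x| / C."""
    return (math.log(-math.log(delta)) - math.log(-math.log(eps))) / C


if __name__ == "__main__":
    eps = math.exp(-2)
    for delta in (1e-10, 1e-100, 1e-300):
        print(delta, osgood_integral(delta, eps), closed_form(delta, eps))
```

Both columns grow without bound as $\delta \to 0$, but only at the rate $\log\log(1/\delta)$, so the integral is infinite even though truncated versions look small.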
exp
·
(- 1Y 2b(z)fa(z) dz) ::; e-2 0, M(z) - M(O) ....., Cz2� as z -+ 0 so J < oo if f3 > 0. Comparing the possibilities for I and J gives the desired result.
= Cy- 2�
1/