Linear Algebra: A Modern Introduction

Second Edition

David Poole, Trent University

Thomson Brooks/Cole


Pages 736 Page size 1274 x 1649 pts Year 2011


Australia · Canada · Mexico · Singapore · Spain · United Kingdom · United States

David Poole

Executive Publisher: Curt Hinrichs
Executive Editor: Jennifer Laugier
Editor: John-Paul Ramin
Assistant Editor: Stacy Green
Editorial Assistant: Leata Holloway
Technology Project Manager: Earl Perry
Marketing Manager: Tom Ziolkowski
Marketing Assistant: Erin Mitchell
Advertising Project Manager: Bryan Vann
Project Manager, Editorial Production: Kelsey McGee
Art Director: Vernon Boes
Print Buyer: Judy Inouye
Permissions Editor: Joohee Lee
Production Service: Matrix Productions
Text Designer: John Rokusek
Photo Research: Sarah Evertson/Image Quest
Copy Editor: Connie Day
Illustration: Scientific Illustrators
Cover Design: Kim Rokusek
Cover Images: Getty Images
Cover Printing, Printing and Binding: Transcontinental Printing/Louiseville
Compositor: Interactive Composition Corporation

© 2006 Thomson Brooks/Cole, a part of The Thomson Corporation. Thomson, the Star logo, and Brooks/Cole are trademarks used herein under license.

ALL RIGHTS RESERVED. No part of this work covered by the copyright hereon may be reproduced or used in any form or by [...]

Asia (including India)
Thomson Learning
5 Shenton Way #01-01
UIC Building
Singapore 068808

[...]

\[ \frac{1}{\|\mathbf{v}\|}\,\mathbf{v} = \frac{1}{\sqrt{14}} \begin{bmatrix} 2 \\ -1 \\ 3 \end{bmatrix} = \begin{bmatrix} 2/\sqrt{14} \\ -1/\sqrt{14} \\ 3/\sqrt{14} \end{bmatrix} \]

Since property (b) of Theorem 1.3 describes how length behaves with respect to scalar multiplication, natural curiosity suggests that we ask whether length and vector addition are compatible. It would be nice if we had an identity such as $\|\mathbf{u} + \mathbf{v}\| = \|\mathbf{u}\| + \|\mathbf{v}\|$, but for almost any choice of vectors $\mathbf{u}$ and $\mathbf{v}$ this turns out to be false. [See Exercise 42(a).] However, all is not lost, for it turns out that if we replace the $=$ sign by $\le$, the resulting inequality is true. The proof of this famous and important result, the Triangle Inequality, relies on another important inequality, the Cauchy-Schwarz Inequality, which we will prove and discuss in more detail in Chapter 7.

Theorem 1.4 The Cauchy-Schwarz Inequality

For all vectors $\mathbf{u}$ and $\mathbf{v}$ in $\mathbb{R}^n$,

\[ |\mathbf{u} \cdot \mathbf{v}| \le \|\mathbf{u}\|\,\|\mathbf{v}\| \]

[Figure 1.26: The Triangle Inequality]

See Exercises 55 and 56 for algebraic and geometric approaches to the proof of this inequality. In $\mathbb{R}^2$ or $\mathbb{R}^3$, where we can use geometry, it is clear from a diagram such as Figure 1.26 that $\|\mathbf{u} + \mathbf{v}\| \le \|\mathbf{u}\| + \|\mathbf{v}\|$ for all vectors $\mathbf{u}$ and $\mathbf{v}$. We now show that this is true more generally.

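Theorem 1.5 The Triangle Inequality

For all vectors $\mathbf{u}$ and $\mathbf{v}$ in $\mathbb{R}^n$,

\[ \|\mathbf{u} + \mathbf{v}\| \le \|\mathbf{u}\| + \|\mathbf{v}\| \]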

Chapter 1 Vectors

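Proof Since both sides of the inequality are nonnegative, showing that the square of the left-hand side is less than or equal to the square of the right-hand side is equivalent to proving the theorem. (Why?) We compute

\[
\begin{aligned}
\|\mathbf{u} + \mathbf{v}\|^2 &= (\mathbf{u} + \mathbf{v}) \cdot (\mathbf{u} + \mathbf{v}) \\
&= \mathbf{u} \cdot \mathbf{u} + 2(\mathbf{u} \cdot \mathbf{v}) + \mathbf{v} \cdot \mathbf{v} && \text{By Example 1.9} \\
&\le \|\mathbf{u}\|^2 + 2|\mathbf{u} \cdot \mathbf{v}| + \|\mathbf{v}\|^2 \\
&\le \|\mathbf{u}\|^2 + 2\|\mathbf{u}\|\,\|\mathbf{v}\| + \|\mathbf{v}\|^2 && \text{By Cauchy-Schwarz} \\
&= (\|\mathbf{u}\| + \|\mathbf{v}\|)^2
\end{aligned}
\]

as required.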
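The two inequalities above can be spot-checked numerically. The following is a minimal sketch, not part of the text: the helper names `dot`, `norm` and `add` and the sample vectors are assumptions chosen for illustration, and a numerical check on one pair of vectors is of course not a proof.

```python
import math

def dot(u, v):
    """Dot product of two vectors with the same number of components."""
    return sum(a * b for a, b in zip(u, v))

def norm(v):
    """Euclidean length ||v|| = sqrt(v . v)."""
    return math.sqrt(dot(v, v))

def add(u, v):
    """Componentwise sum u + v."""
    return [a + b for a, b in zip(u, v)]

# Hypothetical sample vectors in R^3 (not taken from the text).
u = [1.0, -2.0, 3.0]
v = [4.0, 0.0, -1.0]

# Cauchy-Schwarz: |u . v| <= ||u|| ||v||
cauchy_schwarz_holds = abs(dot(u, v)) <= norm(u) * norm(v)

# Triangle Inequality: ||u + v|| <= ||u|| + ||v||
triangle_holds = norm(add(u, v)) <= norm(u) + norm(v)

print(cauchy_schwarz_holds, triangle_holds)
```

Trying vectors that are positive scalar multiples of each other is a useful variation, since that is the case in which both inequalities become equalities.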

Distance

The distance between two vectors is the direct analogue of the distance between two points on the real number line or two points in the Cartesian plane. On the number line (Figure 1.27), the distance between the numbers $a$ and $b$ is given by $|a - b|$. (Taking the absolute value means that we do not need to know which of $a$ or $b$ is larger.) This distance is also equal to $\sqrt{(a - b)^2}$, and its two-dimensional generalization is the familiar formula for the distance $d$ between points $(a_1, a_2)$ and $(b_1, b_2)$, namely,

\[ d = \sqrt{(a_1 - b_1)^2 + (a_2 - b_2)^2} \]

[Figure 1.27: $d = |a - b| = |-2 - 3| = 5$]

In terms of vectors, if $\mathbf{a} = \begin{bmatrix} a_1 \\ a_2 \end{bmatrix}$ and $\mathbf{b} = \begin{bmatrix} b_1 \\ b_2 \end{bmatrix}$, then $d$ is just the length of $\mathbf{a} - \mathbf{b}$, as shown in Figure 1.28. This is the basis for the next definition.

[Figure 1.28: $d = \sqrt{(a_1 - b_1)^2 + (a_2 - b_2)^2} = \|\mathbf{a} - \mathbf{b}\|$]

Definition The distance $d(\mathbf{u}, \mathbf{v})$ between vectors $\mathbf{u}$ and $\mathbf{v}$ in $\mathbb{R}^n$ is defined by

\[ d(\mathbf{u}, \mathbf{v}) = \|\mathbf{u} - \mathbf{v}\| \]

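Section 1.2 Length and Angle: The Dot Product

Example 1.13

Find the distance between $\mathbf{u} = \begin{bmatrix} \sqrt{2} \\ 1 \\ -1 \end{bmatrix}$ and $\mathbf{v} = \begin{bmatrix} 0 \\ 2 \\ -2 \end{bmatrix}$.

Solution We compute $\mathbf{u} - \mathbf{v} = \begin{bmatrix} \sqrt{2} \\ -1 \\ 1 \end{bmatrix}$, so

\[ d(\mathbf{u}, \mathbf{v}) = \|\mathbf{u} - \mathbf{v}\| = \sqrt{(\sqrt{2})^2 + (-1)^2 + 1^2} = \sqrt{4} = 2 \]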
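The definition of distance translates directly into code. The sketch below is illustrative only: the function name `distance` is an assumption, and the vectors are those of Example 1.13 as read here, u = [√2, 1, −1] and v = [0, 2, −2]. Because √2 is stored as a floating-point number, the result is only approximately 2.

```python
import math

def distance(u, v):
    """d(u, v) = ||u - v||, the length of the difference vector."""
    if len(u) != len(v):
        raise ValueError("vectors must have the same number of components")
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))

# Vectors of Example 1.13 (as read from the text).
u = [math.sqrt(2), 1, -1]
v = [0, 2, -2]
d = distance(u, v)
print(d)  # approximately 2

# The one-dimensional case reduces to |a - b| on the number line.
d_line = distance([-2], [3])
print(d_line)
```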

The dot product can also be used to calculate the angle between a pair of vectors. In $\mathbb{R}^2$ or $\mathbb{R}^3$, the angle between the nonzero vectors $\mathbf{u}$ and $\mathbf{v}$ will refer to the angle $\theta$ determined by these vectors that satisfies $0 \le \theta \le 180^\circ$ (see Figure 1.29).

[Figure 1.29: The angle between $\mathbf{u}$ and $\mathbf{v}$]

In Figure 1.30, consider the triangle with sides $\mathbf{u}$, $\mathbf{v}$, and $\mathbf{u} - \mathbf{v}$, where $\theta$ is the angle between $\mathbf{u}$ and $\mathbf{v}$. Applying the law of cosines to this triangle yields

\[ \|\mathbf{u} - \mathbf{v}\|^2 = \|\mathbf{u}\|^2 + \|\mathbf{v}\|^2 - 2\|\mathbf{u}\|\,\|\mathbf{v}\| \cos\theta \]

[Figure 1.30]

Expanding the left-hand side and using $\|\mathbf{v}\|^2 = \mathbf{v} \cdot \mathbf{v}$ several times, we obtain

\[ \|\mathbf{u}\|^2 - 2(\mathbf{u} \cdot \mathbf{v}) + \|\mathbf{v}\|^2 = \|\mathbf{u}\|^2 + \|\mathbf{v}\|^2 - 2\|\mathbf{u}\|\,\|\mathbf{v}\| \cos\theta \]

which, after simplification, leaves us with $\mathbf{u} \cdot \mathbf{v} = \|\mathbf{u}\|\,\|\mathbf{v}\| \cos\theta$. From this we obtain the following formula for the cosine of the angle $\theta$ between nonzero ve[...]

[...], determine whether $\ell$ and $\mathcal{P}$ are parallel, perpendicular, or neither:
(a) $2x + 3y - z = 1$
(b) $4x - y + 5z = 0$
(c) $x - y - z = 3$
(d) $4x + 6y - 2z = 0$

Section 1.3 Lines and Planes

19. The plane $\mathcal{P}_1$ has the equation $4x - y + 5z = 2$. For each of the planes $\mathcal{P}$ in Exercise 18, determine whether $\mathcal{P}_1$ and $\mathcal{P}$ are parallel, perpendicular, or neither.

20. Find the vector form of the equation of the line in $\mathbb{R}^2$ that passes through $P = (2, -1)$ and is perpendicular to the line with general equation $2x - 3y = 1$.

21. Find the vector form of the equation of the line in $\mathbb{R}^2$ that passes through $P = (2, -1)$ and is parallel to the line with general equation $2x - 3y = 1$.

22. Find the vector form of the equation of the line in $\mathbb{R}^3$ that passes through $P = (-1, 0, 3)$ and is perpendicular to the plane with general equation $x - 3y + 2z = 5$.

23. Find the vector form of the equation of the line in $\mathbb{R}^3$ that passes through $P = (-1, 0, 3)$ and is parallel to the line with parametric equations

\[ x = 1 - t, \quad y = 2 + 3t, \quad z = -2 - t \]

24. Find the normal form of the equation of the plane that passes through $P = (0, -2, 5)$ and is parallel to the plane with general equation $6x - y + 2z = 3$.

[...]

28. $Q = (0, 1, 0)$, $\ell$ with equation $\begin{bmatrix} x \\ y \\ z \end{bmatrix} = \begin{bmatrix} 1 \\ 1 \\ 1 \end{bmatrix} + t \begin{bmatrix} -2 \\ 0 \\ 3 \end{bmatrix}$

In Exercises 29 and 30, find the distance from the point $Q$ to the plane $\mathcal{P}$.

29. $Q = (2, 2, 2)$, $\mathcal{P}$ with equation $x + y - z = 0$

30. $Q = (0, 0, 0)$, $\mathcal{P}$ with equation $x - 2y + 2z = 1$

Figure 1.63 suggests a way to use vectors to locate the point $R$ on $\ell$ that is closest to $Q$.

[Figure 1.63: $\mathbf{r} = \mathbf{p} + \overrightarrow{PR}$]

31. Find the point $R$ on $\ell$ that is closest to $Q$ in Exercise 27.

32. Find the point $R$ on $\ell$ that is closest to $Q$ in Exercise 28.

Figure 1.64 suggests a way to use vectors to locate the point $R$ on $\mathcal{P}$ that is closest to $Q$.

[Figure 1.64: $\mathbf{r} = \mathbf{p} + \overrightarrow{PQ} + \overrightarrow{QR}$]

33. Find the point $R$ on $\mathcal{P}$ that is closest to $Q$ in Exercise 29.

34. Find the point $R$ on $\mathcal{P}$ that is closest to $Q$ in Exercise 30.

In Exercises 35 and 36, find the distance between the parallel lines.

35. $\begin{bmatrix} x \\ y \end{bmatrix} = \begin{bmatrix} 1 \\ 1 \end{bmatrix} + s \begin{bmatrix} -2 \\ 3 \end{bmatrix}$ and [...]

36. $\begin{bmatrix} x \\ y \\ z \end{bmatrix} = \begin{bmatrix} 1 \\ 0 \\ -1 \end{bmatrix} + s \begin{bmatrix} 1 \\ 1 \\ 1 \end{bmatrix}$ and $\begin{bmatrix} x \\ y \\ z \end{bmatrix} = \begin{bmatrix} 0 \\ 1 \\ 1 \end{bmatrix} + t \begin{bmatrix} 1 \\ 1 \\ 1 \end{bmatrix}$

In Exercises 37 and 38, find the distance between the parallel planes.

37. $2x + y - 2z = 0$ and $2x + y - 2z = 5$

38. $x + y + z = 1$ and $x + y + z = 3$

39. Prove equation 3 on page 40.

40. Prove equation 4 on page 41.

41. Prove that, in $\mathbb{R}^2$, the distance between parallel lines with equations $\mathbf{n} \cdot \mathbf{x} = c_1$ and $\mathbf{n} \cdot \mathbf{x} = c_2$ is given by $\dfrac{|c_1 - c_2|}{\|\mathbf{n}\|}$.

42. Prove that the distance between parallel planes with equations $\mathbf{n} \cdot \mathbf{x} = d_1$ and $\mathbf{n} \cdot \mathbf{x} = d_2$ is given by $\dfrac{|d_1 - d_2|}{\|\mathbf{n}\|}$.

If two nonparallel planes $\mathcal{P}_1$ and $\mathcal{P}_2$ have normal vectors $\mathbf{n}_1$ and $\mathbf{n}_2$ and $\theta$ is the angle between $\mathbf{n}_1$ and $\mathbf{n}_2$, then we define the angle between $\mathcal{P}_1$ and $\mathcal{P}_2$ to be either $\theta$ or $180^\circ - \theta$, whichever is an acute angle. (Figure 1.65)

[Figure 1.65]

In Exercises 43-44, find the acute angle between the planes with the given equations.

43. $x + y + z = 0$ and $2x + y - 2z = 0$

44. $3x - y + 2z = 5$ and $x + 4y - z = 2$

In Exercises 45-46, show that the plane and line with the given equations intersect, and then find the acute angle of intersection between them.

45. The plane given by $x + y + 2z = 0$ and the line given by $x = 2 + t$, $y = 1 - 2t$, $z = 3 + t$

46. The plane given by $4x - y - z = 6$ and the line given by $x = t$, $y = 1 + 2t$, $z = 2 + 3t$

Exercises 47-48 explore one approach to the problem of finding the projection of a vector onto a plane. As Figure 1.66 shows, if $\mathcal{P}$ is a plane through the origin in $\mathbb{R}^3$ with normal vector $\mathbf{n}$, and $\mathbf{v}$ is a vector in $\mathbb{R}^3$, then $\mathbf{p} = \operatorname{proj}_{\mathcal{P}}(\mathbf{v})$ is a vector in $\mathcal{P}$ such that $\mathbf{v} - c\mathbf{n} = \mathbf{p}$ for some scalar $c$.

[Figure 1.66: Projection onto a plane]

47. Using the fact that $\mathbf{n}$ is orthogonal to every vector in $\mathcal{P}$ (and hence to $\mathbf{p}$), solve for $c$ and thereby find an expression for $\mathbf{p}$ in terms of $\mathbf{v}$ and $\mathbf{n}$.

48. Use the method of Exercise 47 to find the projection of $\mathbf{v} = \begin{bmatrix} 1 \\ 0 \\ -2 \end{bmatrix}$ onto the planes with the following equations:
(a) $x + y + z = 0$
(b) $3x - y + z = 0$
(c) $x - 2z = 0$
(d) $2x - 3y + z = 0$

The Cross Product

It would be convenient if we could easily convert the vector form $\mathbf{x} = \mathbf{p} + s\mathbf{u} + t\mathbf{v}$ of the equation of a plane to the normal form $\mathbf{n} \cdot \mathbf{x} = \mathbf{n} \cdot \mathbf{p}$. What we need is a process that, given two nonparallel vectors $\mathbf{u}$ and $\mathbf{v}$, produces a third vector $\mathbf{n}$ that is orthogonal to both $\mathbf{u}$ and $\mathbf{v}$. One approach is to use a construction known as the cross product of vectors. Only valid in $\mathbb{R}^3$, it is defined as follows:

Definition The cross product of $\mathbf{u} = \begin{bmatrix} u_1 \\ u_2 \\ u_3 \end{bmatrix}$ and $\mathbf{v} = \begin{bmatrix} v_1 \\ v_2 \\ v_3 \end{bmatrix}$ is the vector $\mathbf{u} \times \mathbf{v}$ defined by

\[ \mathbf{u} \times \mathbf{v} = \begin{bmatrix} u_2 v_3 - u_3 v_2 \\ u_3 v_1 - u_1 v_3 \\ u_1 v_2 - u_2 v_1 \end{bmatrix} \]

A shortcut that can help you remember how to calculate the cross product of two vectors is illustrated below. Under each complete vector, write the first two components of that vector. Ignoring the two components on the top line, consider each block of four: Subtract the products of the components connected by dashed lines from the products of the components connected by solid lines. (It helps to notice that the first component of $\mathbf{u} \times \mathbf{v}$ has no 1s as subscripts, the second has no 2s, and the third has no 3s.)

[Diagram: the components $u_1, u_2, u_3, u_1, u_2$ and $v_1, v_2, v_3, v_1, v_2$ written in two columns, yielding $u_2 v_3 - u_3 v_2$, $u_3 v_1 - u_1 v_3$, and $u_1 v_2 - u_2 v_1$]
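The componentwise definition of the cross product is short enough to code directly. The sketch below is illustrative: the function names `cross` and `dot` and the sample vectors are assumptions, not taken from the text. It also checks, for this one pair of vectors, that the result is orthogonal to both inputs.

```python
def cross(u, v):
    """Cross product of two vectors in R^3, following the definition above."""
    u1, u2, u3 = u
    v1, v2, v3 = v
    return [u2 * v3 - u3 * v2,
            u3 * v1 - u1 * v3,
            u1 * v2 - u2 * v1]

def dot(u, v):
    """Dot product, used here to test orthogonality."""
    return sum(a * b for a, b in zip(u, v))

# Hypothetical sample vectors (not from the text).
u = [1, 2, 3]
v = [4, 5, 6]
n = cross(u, v)

print(n)          # [-3, 6, -3]
print(dot(n, u))  # 0, so n is orthogonal to u
print(dot(n, v))  # 0, so n is orthogonal to v
```

Swapping the arguments, `cross(v, u)`, negates every component, which is a quick way to see the order dependence of the construction.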

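The following problems briefly explore the cross product.

1. Compute $\mathbf{u} \times \mathbf{v}$.
(a) [...]
(b) [...]
(c) [...]
(d) [...]

[Figure 1.67: $\mathbf{u} \times \mathbf{v}$]

2. Show that $\mathbf{e}_1 \times \mathbf{e}_2 = \mathbf{e}_3$, $\mathbf{e}_2 \times \mathbf{e}_3 = \mathbf{e}_1$, and $\mathbf{e}_3 \times \mathbf{e}_1 = \mathbf{e}_2$.

3. Using the definition of a cross product, prove that $\mathbf{u} \times \mathbf{v}$ (as shown in Figure 1.67) is orthogonal to $\mathbf{u}$ and $\mathbf{v}$.

4. Use the cross product to help find the normal form of the equation of the plane.
(a) The plane passing through $P = (1, 0, -2)$, parallel to $\mathbf{u} = \begin{bmatrix} 0 \\ 1 \\ 1 \end{bmatrix}$ and $\mathbf{v} = \begin{bmatrix} 3 \\ -1 \\ 2 \end{bmatrix}$
(b) The plane passing through $P = (0, -1, 1)$, $Q = (2, 0, 2)$, and $R = (1, 2, -1)$

5. Prove the following properties of the cross product:
(a) $\mathbf{v} \times \mathbf{u} = -(\mathbf{u} \times \mathbf{v})$
(b) $\mathbf{u} \times \mathbf{0} = \mathbf{0}$
(c) $\mathbf{u} \times \mathbf{u} = \mathbf{0}$
(d) $\mathbf{u} \times k\mathbf{v} = k(\mathbf{u} \times \mathbf{v})$
(e) $\mathbf{u} \times k\mathbf{u} = \mathbf{0}$
(f) $\mathbf{u} \times (\mathbf{v} + \mathbf{w}) = \mathbf{u} \times \mathbf{v} + \mathbf{u} \times \mathbf{w}$

6. Prove the following properties of the cross product:
(a) $\mathbf{u} \cdot (\mathbf{v} \times \mathbf{w}) = (\mathbf{u} \times \mathbf{v}) \cdot \mathbf{w}$
(b) $\mathbf{u} \times (\mathbf{v} \times \mathbf{w}) = (\mathbf{u} \cdot \mathbf{w})\mathbf{v} - (\mathbf{u} \cdot \mathbf{v})\mathbf{w}$
(c) $\|\mathbf{u} \times \mathbf{v}\|^2 = \|\mathbf{u}\|^2 \|\mathbf{v}\|^2 - (\mathbf{u} \cdot \mathbf{v})^2$

7. Redo Problems 2 and 3, this time making use of Problems 5 and 6.

8. Let $\mathbf{u}$ and $\mathbf{v}$ be vectors in $\mathbb{R}^3$ and let $\theta$ be the angle between $\mathbf{u}$ and $\mathbf{v}$.
(a) Prove that $\|\mathbf{u} \times \mathbf{v}\| = \|\mathbf{u}\|\,\|\mathbf{v}\| \sin\theta$. [Hint: Use Problem 6(c).]
(b) Prove that the area $A$ of the triangle determined by $\mathbf{u}$ and $\mathbf{v}$ (as shown in Figure 1.68) is given by

\[ A = \tfrac{1}{2} \|\mathbf{u} \times \mathbf{v}\| \]

[Figure 1.68]

(c) Use the result in part (b) to compute the area of the triangle with vertices $A = (1, 2, 1)$, $B = (2, 1, 0)$, and $C = (5, -1, 3)$.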

Section 1.4 Code Vectors and Modular Arithmetic

The modern theory of codes originated with the work of the American mathematician and computer scientist Claude Shannon (1916-2001), whose 1937 thesis showed how algebra could play a role in the design and analysis of electrical circuits. Shannon would later be instrumental in the formation of the field of information theory and give the theoretical basis for what are now called error-correcting codes.

Throughout history, people have transmitted information using codes. Sometimes the intent is to disguise the message being sent, such as when each letter in a word is replaced by a different letter according to a substitution rule. Although fascinating, these secret codes, or ciphers, are not of concern here; they are the focus of the field of cryptography. Rather, we will concentrate on codes that are used when data must be transmitted electronically. A familiar example of such a code is Morse code, with its system of dots and dashes. The advent of digital computers in the 20th century led to the need to transmit massive amounts of data quickly and accurately. Computers are designed to encode data as sequences of 0s and 1s. Many recent technological advancements depend on codes, and we encounter them every day without being aware of them: satellite communications, compact disc players, the universal product codes (UPC) associated with the bar codes found on merchandise, and the international standard book numbers (ISBN) found on every book published today are but a few examples. In this section, we will use vectors to design codes for detecting errors that may occur in the transmission of data. In later chapters, we will construct codes that can not only detect but also correct errors. The vectors that arise in the study of codes are not the familiar vectors of $\mathbb{R}^n$ but vectors with only a finite number of choices for the components. These vectors depend on a different type of arithmetic, modular arithmetic, which will be introduced in this section and used throughout the book.

Binary Codes

Since computers represent data in terms of 0s and 1s (which can be interpreted as off/on, closed/open, false/true, or no/yes), we begin by considering binary codes, which consist of vectors each of whose components is either a 0 or a 1. In this setting, the usual rules of arithmetic must be modified, since the result of each calculation involving scalars must be a 0 or a 1. The modified rules for addition and multiplication are given below.

\[
\begin{array}{c|cc} + & 0 & 1 \\ \hline 0 & 0 & 1 \\ 1 & 1 & 0 \end{array}
\qquad
\begin{array}{c|cc} \cdot & 0 & 1 \\ \hline 0 & 0 & 0 \\ 1 & 0 & 1 \end{array}
\]

The only curiosity here is the rule that $1 + 1 = 0$. This is not as strange as it appears; if we replace 0 with the word "even" and 1 with the word "odd," these tables simply summarize the familiar parity rules for the addition and multiplication of even and odd integers. For example, $1 + 1 = 0$ expresses the fact that the sum of two odd integers is an even integer. With these rules, our set of scalars $\{0, 1\}$ is denoted by $\mathbb{Z}_2$ and is called the set of integers modulo 2.

In $\mathbb{Z}_2$, $1 + 1 + 0 + 1 = 1$ and $1 + 1 + 1 + 1 = 0$. (These calculations illustrate the parity rules again: The sum of three odds and an even is odd; the sum of four odds is even.)

We are using the term length differently from the way we used it in $\mathbb{R}^n$. This should not be confusing, since there is no geometric notion of length for binary vectors.

With $\mathbb{Z}_2$ as our set of scalars, we now extend the above rules to vectors. The set of all $n$-tuples of 0s and 1s (with all arithmetic performed modulo 2) is denoted by $\mathbb{Z}_2^n$. The vectors in $\mathbb{Z}_2^n$ are called binary vectors of length $n$.

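Example 1.28

The vectors in $\mathbb{Z}_2^2$ are $[0, 0]$, $[0, 1]$, $[1, 0]$, and $[1, 1]$. (How many vectors does $\mathbb{Z}_2^n$ contain, in general?)

Example 1.29

Let $\mathbf{u} = [1, 1, 0, 1, 0]$ and $\mathbf{v} = [0, 1, 1, 1, 0]$ be two binary vectors of length 5. Find $\mathbf{u} \cdot \mathbf{v}$.

Solution The calculation of $\mathbf{u} \cdot \mathbf{v}$ takes place in $\mathbb{Z}_2$, so we have

\[
\begin{aligned}
\mathbf{u} \cdot \mathbf{v} &= 1 \cdot 0 + 1 \cdot 1 + 0 \cdot 1 + 1 \cdot 1 + 0 \cdot 0 \\
&= 0 + 1 + 0 + 1 + 0 \\
&= 0
\end{aligned}
\]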
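Arithmetic in Z_2 is ordinary integer arithmetic followed by reduction modulo 2, so the calculation of Example 1.29 can be reproduced in a few lines. This is a sketch for illustration; the function names `dot_mod2` and `add_mod2` are assumptions, not notation from the text.

```python
def dot_mod2(u, v):
    """Dot product of two binary vectors, computed in Z_2."""
    if len(u) != len(v):
        raise ValueError("binary vectors must have the same length")
    return sum(a * b for a, b in zip(u, v)) % 2

def add_mod2(u, v):
    """Componentwise sum of two binary vectors, computed in Z_2."""
    if len(u) != len(v):
        raise ValueError("binary vectors must have the same length")
    return [(a + b) % 2 for a, b in zip(u, v)]

# The vectors of Example 1.29.
u = [1, 1, 0, 1, 0]
v = [0, 1, 1, 1, 0]

print(dot_mod2(u, v))  # 0
print(add_mod2(u, v))  # [1, 0, 1, 0, 0]
```

Note that the sum u + v has a 1 exactly in the positions where u and v differ, which is what makes addition in Z_2 convenient for describing transmission errors.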

In practice, we have a message (consisting of words, numbers, or symbols) that we wish to transmit. We begin by encoding each "word" of the message as a binary vector.

Definition A binary code is a set of binary vectors (of the same length) called code vectors. The process of converting a message into code vectors is called encoding, and the reverse process is called decoding.

As we will see, it is highly desirable that a code have other properties as well, such as the ability to spot when an error has occurred in the transmission of a code vector and, if possible, to suggest how to correct the error.

Error-Detecting Codes

Suppose that we have already encoded a message as a set of binary code vectors. We now want to send the binary code vectors across a channel (such as a radio transmitter, a telephone line, a fiber optic cable, or a CD laser). Unfortunately, the channel may be "noisy" (because of electrical interference, competing signals, or dirt and scratches). As a result, errors may be introduced: Some of the 0s may be changed to 1s, and vice versa. How can we guard against this problem?

Example 1.30

We wish to encode and transmit a message consisting of one of the words up, down, left, or right. We decide to use the four vectors in $\mathbb{Z}_2^2$ as our binary code, as shown in Table 1.4. If the receiver has this table too and the encoded message is transmitted without error, decoding is trivial.

Table 1.4
Message | up | down | left | right
Code | [0, 0] | [0, 1] | [1, 0] | [1, 1]

However, let's suppose that a single error occurred. (By an error, we mean that one component of the code vector changed.) For example, suppose we sent the message "down" encoded as $[0, 1]$ but an error occurred in the transmission of the first component and the 0 changed to a 1. The receiver would then see $[1, 1]$ instead and decode the message as "right." (We will only concern ourselves with the case of single errors such as this one. In practice, it is usually assumed that the probability of multiple errors is negligibly small.) Even if the receiver knew (somehow) that a single error had occurred, he or she would not know whether the correct code vector was $[0, 1]$ or $[1, 0]$. But suppose we sent the message using a code that was a subset of $\mathbb{Z}_2^3$, in other words, a binary code of length 3, as shown in Table 1.5.

Table 1.5
Message | up | down | left | right
Code | [0, 0, 0] | [0, 1, 1] | [1, 0, 1] | [1, 1, 0]

This code can detect any single error. For example, if "down" was sent as $[0, 1, 1]$ and an error occurred in one component, the receiver would read either $[1, 1, 1]$ or $[0, 0, 1]$ or $[0, 1, 0]$, none of which is a code vector. So the receiver would know that an error had occurred (but not where) and could ask that the encoded message be retransmitted. (Why wouldn't the receiver know where the error was?)
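The claim that the length-3 code detects every single error can be confirmed by brute force: flip each component of each code vector and check that the result is never a code vector. The sketch below assumes Table 1.5 reads up = [0, 0, 0], down = [0, 1, 1], left = [1, 0, 1], right = [1, 1, 0]; the names `CODE`, `DECODE`, `flip` and `receive` are hypothetical helpers introduced for illustration.

```python
# Table 1.5 as read from the text: message -> code vector.
CODE = {
    "up": (0, 0, 0),
    "down": (0, 1, 1),
    "left": (1, 0, 1),
    "right": (1, 1, 0),
}
DECODE = {vec: word for word, vec in CODE.items()}

def flip(vec, i):
    """Return a copy of vec with component i changed (0 <-> 1)."""
    return tuple(bit ^ 1 if j == i else bit for j, bit in enumerate(vec))

def receive(vec):
    """Decode a received vector, or return None if it is not a code vector."""
    return DECODE.get(tuple(vec))

# Collect every single-component error that would go unnoticed.
undetected = [
    (word, i)
    for word, vec in CODE.items()
    for i in range(len(vec))
    if receive(flip(vec, i)) is not None
]

print(undetected)          # [] : every single error is detected
print(receive((0, 1, 1)))  # down
print(receive((1, 1, 1)))  # None (error detected, position unknown)
```

Running the same loop over the length-2 code of Table 1.4 gives a nonempty list, since there every vector of length 2 is a code vector.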

The term parity comes from the Latin word par, meaning "equal or even." Two integers are said to have the same parity if they are both even or both odd.

The code in Table 1.5 is an example of an error-detecting code. Until the 1940s, this was the best that could be achieved. The advent of digital computers led to the development of codes that could correct as well as detect errors. We will consider these in Chapters 3, 6, and 7. The message to be transmitted may itself consist of binary vectors. In this case, a simple but useful error-detecting code is a parity check code, which is created by appending an extra component, called a check digit, to each vector so that the parity (the total number of 1s) is even.

Example 1.31

If the message to be sent is the binary vector $[1, 0, 0, 1, 0, 1]$, which has an odd number of 1s, then the check digit will be 1 (in order to make the total number of 1s in the code vector even) and the code vector will be $[1, 0, 0, 1, 0, 1, 1]$. Note that a single error will be detected, since it will cause the parity of the code vector to change from even to odd. For example, if an error occurred in the third component, the code vector would be received as $[1, 0, 1, 1, 0, 1, 1]$, whose parity is odd because it has five 1s.

Let's look at this concept a bit more formally. Suppose the message is the binary vector $\mathbf{b} = [b_1, b_2, \ldots, b_n]$ in $\mathbb{Z}_2^n$. Then the parity check code vector is $\mathbf{v} = [b_1, b_2, \ldots, b_n, d]$ in $\mathbb{Z}_2^{n+1}$, where the check digit $d$ is chosen so that

\[ b_1 + b_2 + \cdots + b_n + d = 0 \quad \text{in } \mathbb{Z}_2 \]

or, equivalently, so that

\[ \mathbf{1} \cdot \mathbf{v} = 0 \]

where $\mathbf{1} = [1, 1, \ldots, 1]$, a vector whose every component is 1. The vector $\mathbf{1}$ is called a check vector. If vector $\mathbf{v}'$ is received and $\mathbf{1} \cdot \mathbf{v}' = 1$, then we can be certain that an error has occurred. (Although we are not considering the possibility of more than one error, observe that this scheme will not detect an even number of errors.) Parity check codes are a special case of the more general check digit codes, which we will consider after first extending the foregoing ideas to more general settings.
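The parity check scheme just described can be sketched as two small functions. The names `encode` and `check` are assumptions made for illustration; the message is the one used in Example 1.31. The last lines show the limitation noted in the text: two errors leave the parity unchanged and go undetected.

```python
def encode(bits):
    """Append a check digit d so that b1 + ... + bn + d = 0 in Z_2."""
    d = sum(bits) % 2
    return list(bits) + [d]

def check(received):
    """Compute 1 . v' in Z_2: 0 means the parity is even, 1 signals an error."""
    return sum(received) % 2

message = [1, 0, 0, 1, 0, 1]
v = encode(message)
print(v)         # [1, 0, 0, 1, 0, 1, 1]
print(check(v))  # 0

# A single error in the third component changes the parity.
one_error = v.copy()
one_error[2] ^= 1
print(one_error, check(one_error))  # [1, 0, 1, 1, 0, 1, 1] 1

# Two errors cancel out as far as parity is concerned.
two_errors = one_error.copy()
two_errors[3] ^= 1
print(check(two_errors))  # 0, so this corruption is not detected
```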

Modular Arithmetic

It is possible to generalize what we have just done for binary vectors to vectors whose components are taken from a finite set $\{0, 1, 2, \ldots, k\}$ for $k \ge 2$. To do so, we must first extend the idea of binary arithmetic.

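Example 1.32

The integers modulo 3 consist of the set $\mathbb{Z}_3 = \{0, 1, 2\}$ with addition and multiplication as given below:

\[
\begin{array}{c|ccc} + & 0 & 1 & 2 \\ \hline 0 & 0 & 1 & 2 \\ 1 & 1 & 2 & 0 \\ 2 & 2 & 0 & 1 \end{array}
\qquad
\begin{array}{c|ccc} \cdot & 0 & 1 & 2 \\ \hline 0 & 0 & 0 & 0 \\ 1 & 0 & 1 & 2 \\ 2 & 0 & 2 & 1 \end{array}
\]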
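The addition and multiplication tables for the integers modulo 3 can be regenerated mechanically by reducing ordinary sums and products modulo 3. The sketch below is illustrative; the function name `tables` is an assumption, and passing a different modulus `m` produces the corresponding tables for the integers modulo m.

```python
def tables(m):
    """Return the addition and multiplication tables of the integers modulo m.

    Entry [a][b] of each table is (a + b) mod m or (a * b) mod m.
    """
    add = [[(a + b) % m for b in range(m)] for a in range(m)]
    mul = [[(a * b) % m for b in range(m)] for a in range(m)]
    return add, mul

add3, mul3 = tables(3)

for row in add3:
    print(row)
# [0, 1, 2]
# [1, 2, 0]
# [2, 0, 1]

for row in mul3:
    print(row)
# [0, 0, 0]
# [0, 1, 2]
# [0, 2, 1]
```

Calling `tables(2)` reproduces the binary tables given earlier in the section.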

Observe that the result of each addition and multiplication belongs to the set $\{0, 1, 2\}$; we say that $\mathbb{Z}_3$ is closed with respect to the operations of addition and multiplication. It is perhaps easiest to think of this set in terms of a three-hour clock with 0, 1, and 2 on its face, as shown in Figure 1.69. The calculation $1 + 2 = 0$ translates as follows: 2 hours after 1 o'clock, it is 0 o'clock. Just as 24:00 and 12:00 are the same on a 12-hour clock, so 3 and 0 are equivalent on this 3-hour clock. Likewise, all multiples of 3, positive and negative, are equivalent to 0 here; 1 is equivalent to any number that is 1 more than a multiple of 3 (such as $-2$, 4, and 7); and 2 is equivalent to any number that is 2 more than a multiple of 3 (such as $-1$, 5, and 8). We can visualize the number line as wrapping around a circle, as shown in Figure 1.70.

[Figure 1.69: Arithmetic modulo 3]

[Figure 1.70: the number line wrapped around a circle, with $\ldots, -3, 0, 3, \ldots$ at 0; $\ldots, -2, 1, 4, \ldots$ at 1; and $\ldots, -1, 2, 5, \ldots$ at 2]

Example 1.33

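To whi[...]

[...]nation of the vectors $\begin{bmatrix} 1 \\ 0 \\ 3 \end{bmatrix}$ and $\begin{bmatrix} -1 \\ 1 \\ -3 \end{bmatrix}$?

Solution (a) We want to find scalars $x$ and $y$ such that

\[ x \begin{bmatrix} 1 \\ 0 \\ 3 \end{bmatrix} + y \begin{bmatrix} -1 \\ 1 \\ -3 \end{bmatrix} = \begin{bmatrix} 1 \\ 2 \\ 3 \end{bmatrix} \]

Expanding, we obtain the system

\[
\begin{aligned}
x - y &= 1 \\
y &= 2 \\
3x - 3y &= 3
\end{aligned}
\]

whose augmented matrix is

\[ \left[ \begin{array}{rr|r} 1 & -1 & 1 \\ 0 & 1 & 2 \\ 3 & -3 & 3 \end{array} \right] \]

(Observe that the columns of the augmented matrix are just the given vectors; notice the order of the vectors, in particular, which ve[...]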

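we have [...]

Thus, $\mathbb{R}^3 = \operatorname{span}(\mathbf{e}_1, \mathbf{e}_2, \mathbf{e}_3)$. You should have no difficulty seeing that, in general, $\mathbb{R}^n = \operatorname{span}(\mathbf{e}_1, \mathbf{e}_2, \ldots, \mathbf{e}_n)$.

When the span of a set of vectors in $\mathbb{R}^n$ is not all of $\mathbb{R}^n$, it is reasonable to ask for a description of the vectors' span.

Example 2.21

Find the span of $\begin{bmatrix} 1 \\ 0 \\ 3 \end{bmatrix}$ and $\begin{bmatrix} -1 \\ 1 \\ -3 \end{bmatrix}$. (See Example 2.18.)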

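Chapter 2 Systems of Linear Equations

Solution Thinking geometrically, we can see that the set of all linear combinations of $\begin{bmatrix} 1 \\ 0 \\ 3 \end{bmatrix}$ and $\begin{bmatrix} -1 \\ 1 \\ -3 \end{bmatrix}$ is just the plane through the origin with $\begin{bmatrix} 1 \\ 0 \\ 3 \end{bmatrix}$ and $\begin{bmatrix} -1 \\ 1 \\ -3 \end{bmatrix}$ as direction vectors. The vector equation of this plane is

\[ \begin{bmatrix} x \\ y \\ z \end{bmatrix} = s \begin{bmatrix} 1 \\ 0 \\ 3 \end{bmatrix} + t \begin{bmatrix} -1 \\ 1 \\ -3 \end{bmatrix} \]

which is just another way of saying that $\begin{bmatrix} x \\ y \\ z \end{bmatrix}$ is in the span of $\begin{bmatrix} 1 \\ 0 \\ 3 \end{bmatrix}$ and $\begin{bmatrix} -1 \\ 1 \\ -3 \end{bmatrix}$.

Two nonparallel ve[...]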

[...]

In Exercises 13-16, describe the span of the given vectors (a) geometrically and (b) algebraically.

13. [...]

14. [...]

15. [...]

16. [...]

17. The general equation of the plane that contains the points $(1, 0, 3)$, $(-1, 1, -3)$, and the origin is of the form $ax + by + cz = 0$. Solve for $a$, $b$, and $c$.

18. Prove that $\mathbf{u}$, $\mathbf{v}$, and $\mathbf{w}$ are all in $\operatorname{span}(\mathbf{u}, \mathbf{v}, \mathbf{w})$.

19. Prove that $\mathbf{u}$, $\mathbf{v}$, and $\mathbf{w}$ are all in $\operatorname{span}(\mathbf{u}, \mathbf{u} + \mathbf{v}, \mathbf{u} + \mathbf{v} + \mathbf{w})$.

20. (a) Prove that if $\mathbf{u}_1, \ldots, \mathbf{u}_m$ are vectors in $\mathbb{R}^n$, $S = \{\mathbf{u}_1, \mathbf{u}_2, \ldots, \mathbf{u}_k\}$, and $T = \{\mathbf{u}_1, \ldots, \mathbf{u}_k, \mathbf{u}_{k+1}, \ldots, \mathbf{u}_m\}$, then $\operatorname{span}(S) \subseteq \operatorname{span}(T)$. (Hint: Rephrase this question in terms of linear combinations.)
(b) Deduce that if $\mathbb{R}^n = \operatorname{span}(S)$, then $\mathbb{R}^n = \operatorname{span}(T)$ also.

21. (a) Suppose that vector $\mathbf{w}$ is a linear combination of vectors $\mathbf{u}_1, \ldots, \mathbf{u}_k$ and that each $\mathbf{u}_i$ is a linear combination of vectors $\mathbf{v}_1, \ldots, \mathbf{v}_m$. Prove that $\mathbf{w}$ is a linear combination of $\mathbf{v}_1, \ldots, \mathbf{v}_m$ and therefore $\operatorname{span}(\mathbf{u}_1, \ldots, \mathbf{u}_k) \subseteq \operatorname{span}(\mathbf{v}_1, \ldots, \mathbf{v}_m)$.

[...]

22-31. [...]

In Exercises 32-41, determine if the sets of vectors in the given exercise are linearly independent by converting the vectors to row vectors and using the method of Example 2.25 and Theorem 2.7. For any sets that are linearly dependent, find a dependence relationship among the vectors.

32. Exercise 22
33. Exercise 23
34. Exercise 24
35. Exercise 25
36. Exercise 26
37. Exercise 27
38. Exercise 28
39. Exercise 29
40. Exercise 30
41. Exercise 31

42. (a) If the columns of an $n \times n$ matrix $A$ are linearly independent as vectors in $\mathbb{R}^n$, what is the rank of $A$? Explain.
(b) If the rows of an $n \times n$ matrix $A$ are linearly independent as vectors in $\mathbb{R}^n$, what is the rank of $A$? Explain.

43. (a) If vectors $\mathbf{u}$, $\mathbf{v}$, and $\mathbf{w}$ are linearly independent, will $\mathbf{u} + \mathbf{v}$, $\mathbf{v} + \mathbf{w}$, and $\mathbf{u} + \mathbf{w}$ also be linearly independent? Justify your answer.
(b) If vectors $\mathbf{u}$, $\mathbf{v}$, and $\mathbf{w}$ are linearly independent, will $\mathbf{u} - \mathbf{v}$, $\mathbf{v} - \mathbf{w}$, and $\mathbf{u} - \mathbf{w}$ also be linearly independent? Justify your answer.

44. Prove that two vectors are linearly dependent if and only if one is a scalar multiple of the other. (Hint: Separately consider the case where one of the vectors is $\mathbf{0}$.)

45. Give a "row vector proof" of Theorem 2.8.

46. Prove that every subset of a linearly independent set is linearly independent.

47. Suppose that $S = \{\mathbf{v}_1, \ldots, \mathbf{v}_k, \mathbf{v}\}$ is a set of vectors in some $\mathbb{R}^n$ and that $\mathbf{v}$ is a linear combination of $\mathbf{v}_1, \ldots, \mathbf{v}_k$. If $S' = \{\mathbf{v}_1, \ldots, \mathbf{v}_k\}$, prove that $\operatorname{span}(S) = \operatorname{span}(S')$. [Hint: Exercise 21(b) is helpful here.]

48. Let $\{\mathbf{v}_1, \ldots, \mathbf{v}_k\}$ be a linearly independent set of vectors in $\mathbb{R}^n$, and let $\mathbf{v}$ be a vector in $\mathbb{R}^n$. Suppose that $\mathbf{v} = c_1\mathbf{v}_1 + c_2\mathbf{v}_2 + \cdots + c_k\mathbf{v}_k$ with $c_1 \ne 0$. Prove that $\{\mathbf{v}, \mathbf{v}_2, \ldots, \mathbf{v}_k\}$ is linearly independent.

Applications

There are too many applications of systems of linear equations to do them justice in a single section. This section will introduce a few applications, to illustrate the diverse settings in which they arise.

Allocation of Resources

A great many applications of systems of linear equations involve allocating limited resources subject to a set of constraints.

Example 2.27

A biologist has placed three strains of bacteria (denoted I, II, and III) in a test tube, where they will feed on three different food sources (A, B, and C). Each day 2300 units of A, 800 units of B, and 1500 units of C are placed in the test tube, and each bacterium consumes a certain number of units of each food per day, as shown in Table 2.2. How many bacteria of each strain can coexist in the test tube and consume all of the food?

Table 2.2
 | Bacteria Strain I | Bacteria Strain II | Bacteria Strain III
Food A | 2 | 2 | 4
Food B | 1 | 2 | 0
Food C | 1 | 3 | 1

Solution Let $x_1$, $x_2$, and $x_3$ be the numbers of bacteria of strains I, II, and III, respectively. Since each of the $x_1$ bacteria of strain I consumes 2 units of A per day, strain I consumes a total of $2x_1$ units per day. Similarly, strains II and III consume a total of $2x_2$ and $4x_3$ units of food A daily. Since we w[...]
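Requiring that all of each food be consumed gives one linear equation per food, and the resulting 3 × 3 system can be solved by Gauss-Jordan elimination. The following is a sketch under stated assumptions, not the text's own working: it reads Table 2.2 as A: 2, 2, 4; B: 1, 2, 0; C: 1, 3, 1 with daily supplies 2300, 800 and 1500, the function name `solve` is hypothetical, and the routine assumes the system has a unique solution. Exact rational arithmetic from the standard `fractions` module avoids rounding.

```python
from fractions import Fraction

def solve(A, b):
    """Solve the square system Ax = b by Gauss-Jordan elimination.

    Assumes a unique solution exists; a singular system makes the
    pivot search fail with StopIteration.
    """
    n = len(A)
    # Build the augmented matrix [A | b] with exact rational entries.
    M = [[Fraction(x) for x in row] + [Fraction(rhs)] for row, rhs in zip(A, b)]
    for col in range(n):
        # Find a row at or below the diagonal with a nonzero entry in this column.
        pivot = next(r for r in range(col, n) if M[r][col] != 0)
        M[col], M[pivot] = M[pivot], M[col]
        # Scale the pivot row so the leading entry is 1.
        M[col] = [x / M[col][col] for x in M[col]]
        # Eliminate this column's entry from every other row.
        for r in range(n):
            if r != col:
                factor = M[r][col]
                M[r] = [x - factor * y for x, y in zip(M[r], M[col])]
    return [row[-1] for row in M]

# Rows: foods A, B, C. Columns: strains I, II, III (Table 2.2 as read here).
A = [[2, 2, 4],
     [1, 2, 0],
     [1, 3, 1]]
b = [2300, 800, 1500]

x = solve(A, b)
print([int(value) for value in x])  # [100, 350, 350]
```

Under these readings the system has the single solution of 100, 350 and 350 bacteria of strains I, II and III, which can be checked by substituting back into each row.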

[...]

Hence, there is a unique solution: $x_1 = 2$, $x_2 = 1$, $x_3 = 1$. In other words, we must push switch A twice and the other two switches once each. (Check this.)

Exercises 2.4

Allocation of Resources

1. Suppose that, in Example 2.27, 400 units of food A, 600 units of B, and 600 units of C are placed in the test tube each day and the data on daily food consumption by the bacteria (in units per day) are as shown in Table 2.4. How many bacteria of each strain can coexist in the test tube and consume all of the food?

Table 2.4
 | Bacteria Strain I | Bacteria Strain II | Bacteria Strain III
Food A | 1 | 2 | 0
Food B | 2 | 1 | 1
Food C | 1 | 1 | 2

2. Suppose that in Example 2.27, 400 units of food A, 500 units of B, and 600 units of C are placed in the test tube each day and the data on daily food consumption by the bacteria (in units per day) are as shown in Table 2.5. How many bacteria of each strain can coexist in the test tube and consume all of the food?

Table 2.5
 | Bacteria Strain I | Bacteria Strain II | Bacteria Strain III
Food A | 1 | 2 | 0
Food B | 2 | 1 | 3
Food C | 1 | 1 | 1

3. A florist offers three sizes of flower arrangements containing roses, daisies, and chrysanthemums. Each small arrangement contains one rose, three daisies, and three chrysanthemums. Each medium arrangement contains two roses, four daisies, and six chrysanthemums. Each large arrangement contains four roses, eight daisies, and six chrysanthemums. One day, the florist noted that she used a total of 24 roses, 50 daisies, and 48 chrysanthemums in filling orders for these three types of arrangements. How many arrangements of each type did she make?

4. (a) In your pocket you have some nickels, dimes, and quarters. There are 20 coins altogether and exactly twice as many dimes as nickels. The total value of the coins is $3.00. Find the number of coins of each type.
(b) Find all possible combinations of 20 coins (nickels, dimes, and quarters) that will make exactly $3.00.

5. A coffee merchant sells three blends of coffee. A bag of the house blend contains 300 grams of Colombian beans and 200 grams of French roast beans. A bag of the special blend contains 200 grams of Colombian beans, 200 grams of Kenyan beans, and 100 grams of French roast beans. A bag of the gourmet blend contains 100 grams of Colombian beans, 200 grams of Kenyan beans, and 200 grams of French roast beans. The merchant has on hand 30 kilograms of Colombian beans, 15 kilograms of Kenyan beans, and 25 kilograms of French roast beans. If he wishes to use up all of the beans, how many bags of each type of blend can be made?

6. Redo Exercise 5, assuming that the house blend contains 300 grams of Colombian beans, 50 grams of Kenyan beans, and 150 grams of French roast beans and the gourmet blend contains 100 grams of Colombian beans, 350 grams of Kenyan beans, and 50 grams of French roast beans. This time the merchant has on hand 30 kilograms of Colombian beans, 15 kilograms of Kenyan beans, and 15 kilograms of French roast beans. Suppose one bag of the house blend produces a profit of $0.50, one bag of the special blend produces a profit of $1.50, and one bag of the gourmet blend produces a profit of $2.00. How many bags of each type should the merchant prepare if he wants to use up all of the beans and maximize his profit? What is the maximum profit?

Balancing Chemical Equations

In Exercises 7-14, balance the chemical equation for each reaction.

7. $\mathrm{FeS_2} + \mathrm{O_2} \longrightarrow \mathrm{Fe_2O_3} + \mathrm{SO_2}$

8. $\mathrm{CO_2} + \mathrm{H_2O} \longrightarrow \mathrm{C_6H_{12}O_6} + \mathrm{O_2}$ (This reaction takes place when a green plant converts carbon dioxide and water to glucose and oxygen during photosynthesis.)

9. $\mathrm{C_4H_{10}} + \mathrm{O_2} \longrightarrow \mathrm{CO_2} + \mathrm{H_2O}$ (This reaction occurs when butane, $\mathrm{C_4H_{10}}$, burns in the presence of oxygen to form carbon dioxide and water.)

10. $\mathrm{C_7H_6O_2} + \mathrm{O_2} \longrightarrow \mathrm{H_2O} + \mathrm{CO_2}$

11. $\mathrm{C_5H_{11}OH} + \mathrm{O_2} \longrightarrow \mathrm{H_2O} + \mathrm{CO_2}$ (This equation represents the combustion of amyl alcohol.)

12. $\mathrm{HClO_4} + \mathrm{P_4O_{10}} \longrightarrow \mathrm{H_3PO_4} + \mathrm{Cl_2O_7}$

13. $\mathrm{Na_2CO_3} + \mathrm{C} + \mathrm{N_2} \longrightarrow \mathrm{NaCN} + \mathrm{CO}$

14. $\mathrm{C_2H_2Cl_4} + \mathrm{Ca(OH)_2} \longrightarrow \mathrm{C_2HCl_3} + \mathrm{CaCl_2} + \mathrm{H_2O}$

Network Analysis

15. Figure 2.18 shows a network of water pipes with flows measured in liters per minute. (a) Set up and solve a system of linear equations to find the possible flows. (b) If the flow through AB is restricted to 5 L/min, what will the flows through the other two branches be? (c) What are the minimum and maximum possible flows through each branch? (d) We have been assuming that flow is always positive. What would negative flow mean, assuming we allowed it? Give an illustration for this example.

[Figure 2.18]

16. The downtown core of Gotham City consists of one-way streets, and the traffic flow has been measured at each intersection. For the city block shown in Figure 2.19, the numbers represent the average numbers of vehicles per minute entering and leaving intersections A, B, C, and D during business hours. (a) Set up and solve a system of linear equations to find the possible flows $f_1, \ldots, f_4$. (b) If traffic is regulated on CD so that $f_4 = 10$ vehicles per minute, what will the average flows on the other streets be? (c) What are the minimum and maximum possible flows on each street? (d) How would the solution change if all of the directions were reversed?

[Figure 2.19]

17. A network of irrigation ditches is shown in Figure 2.20, with flows measured in thousands of liters per day. (a) Set up and solve a system of linear equations to find the possible flows $f_1, \ldots, f_5$. (b) Suppose DC is closed. What range of flow will need to be maintained through DB? (c) From Figure 2.20 it is clear that DB cannot be closed. (Why not?) How does your solution in part (a) show this? (d) From your solution in part (a), determine the minimum and maximum flows through DB.

[Figure 2.20]

18. (a) Set up and solve a system of linear equations to find the possible flows in the network shown in Figure 2.21. (b) Is it possible for $f_1 = 100$ and $f_6 = 150$? (Answer this question first with reference to your solution in part (a) and then directly from Figure 2.21.) (c) If $f_4 = 0$, what will the range of flow be on each of the other branches?

[Figure 2.21]

Electrical Networks

For Exercises 19 and 20, determine the currents for the given electrical networks.

19. [Circuit diagram]

20. [Circuit diagram]

21. (a) Find the currents $I, I_1, \ldots, I_5$ in the bridge circuit in Figure 2.22. (b) Find the effective resistance of this network. (c) Can you change the resistance in branch BC (but leave everything else unchanged) so that the current through branch CE becomes 0?

[Figure 2.22]

22. The networks in parts (a) and (b) of Figure 2.23 show two resistors coupled in series and in parallel, respectively. We wish to find a general formula for the effective resistance of each network, that is, find $R_{\text{eff}}$ such that $E = R_{\text{eff}} I$.

(a) Show that the effective resistance $R_{\text{eff}}$ of a network with two resistors coupled in series [Figure 2.23(a)] is given by

$$R_{\text{eff}} = R_1 + R_2$$

(b) Show that the effective resistance $R_{\text{eff}}$ of a network with two resistors coupled in parallel [Figure 2.23(b)] is given by

$$R_{\text{eff}} = \frac{1}{\frac{1}{R_1} + \frac{1}{R_2}}$$

Figure 2.23 Resistors in series and in parallel

Finite Linear Games

23. (a) In Example 2.33, suppose all the lights are initially off. Can we push the switches in some order so that only the second and fourth lights will be on? (b) Can we push the switches in some order so that only the second light will be on?

24. (a) In Example 2.33, suppose the fourth light is initially on and the other four lights are off. Can we push the switches in some order so that only the second and fourth lights will be on? (b) Can we push the switches in some order so that only the second light will be on?

25. In Example 2.33, describe all possible configurations of lights that can be obtained if we start with all the lights off.

26. (a) In Example 2.34, suppose that all of the lights are initially off. Show that it is possible to push the switches in some order so that the lights are off, dark blue, and light blue, in that order. (b) Show that it is possible to push the switches in some order so that the lights are light blue, off, and light blue, in that order. (c) Prove that any configuration of the three lights can be achieved.

27. Suppose the lights in Example 2.33 can be off, light blue, or dark blue and the switches work as described in Example 2.34. (That is, the switches control the same lights as in Example 2.33 but cycle through the colors as in Example 2.34.) Show that it is possible to start with all of the lights off and push the switches in some order so that the lights are dark blue, light blue, dark blue, light blue, and dark blue, in that order.

28. For Exercise 27, describe all possible configurations of lights that can be obtained, starting with all the lights off.

29. Nine squares, each one either black or white, are arranged in a 3×3 grid. Figure 2.24 shows one possible arrangement. When touched, each square changes its own state and the states of some of its neighbors (black → white and white → black). Figure 2.25 shows how the state changes work. (Touching the square whose number is circled causes the states of the squares marked * to change.) The object of the game is to turn all nine squares black. [Exercises 29 and 30

Figure 2.24 The nine squares puzzle


0) can be established using mathematical induction (see Appendix B). One way to make an educated guess as to what the formulas are, though, is to observe that we can rewrite the two formulas above as

(b) (-3, 1), (-2, 2), and (-1, 5)

40. Through any three noncollinear points there also passes a unique circle. Find the circles (whose general equations are of the form $x^2 + y^2 + ax + by + c = 0$) that pass through the sets of points in Exercise 39. (To check the validity of your answer, find the center and radius of each circle and draw a sketch.)

The process of adding rational functions (ratios of polynomials) by placing them over a common denominator is the analogue of adding rational numbers. The reverse process of taking a rational function apart by writing it as a sum of simpler rational functions is useful in several areas of mathematics;

respectively. This leads to the conjecture that the sum of $p$th powers of the first $n$ natural numbers is a polynomial of degree $p + 1$ in the variable $n$.

45. Assuming that $1 + 2 + \cdots + n = an^2 + bn + c$, find $a$, $b$, and $c$ by substituting three values for $n$ and thereby obtaining a system of linear equations in $a$, $b$, and $c$.

46. Assume that $1^2 + 2^2 + \cdots + n^2 = an^3 + bn^2 + cn + d$. Find $a$, $b$, $c$, and $d$. (Hint: It is legitimate to use $n = 0$. What is the left-hand side in that case?)

47. Show that $1^3 + 2^3 + \cdots + n^3 = \left(\frac{n(n + 1)}{2}\right)^2$.
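The substitution idea behind Exercises 45 and 46 can be sketched in a few lines of Python. This is only an illustration, assuming the conjecture that the sum is a polynomial of degree $p + 1$; the helper names `solve_exact` and `power_sum_coefficients` are hypothetical and are not part of the text. The sketch substitutes $n = 0, 1, \ldots, p + 1$ and solves the resulting linear system with exact fractions.

```python
from fractions import Fraction

def solve_exact(A, b):
    """Solve a square linear system exactly by Gauss-Jordan elimination."""
    n = len(A)
    # Build the augmented matrix [A | b] with exact rational entries.
    M = [[Fraction(v) for v in row] + [Fraction(rhs)] for row, rhs in zip(A, b)]
    for col in range(n):
        # Find a row with a nonzero entry in this column and swap it into place.
        pivot = next(r for r in range(col, n) if M[r][col] != 0)
        M[col], M[pivot] = M[pivot], M[col]
        # Scale the pivot row so that the pivot becomes a leading 1.
        M[col] = [v / M[col][col] for v in M[col]]
        # Clear the rest of the column.
        for r in range(n):
            if r != col and M[r][col] != 0:
                factor = M[r][col]
                M[r] = [v - factor * p for v, p in zip(M[r], M[col])]
    return [row[-1] for row in M]

def power_sum_coefficients(p):
    """Coefficients (highest power first) of the assumed degree p+1 polynomial
    that equals 1^p + 2^p + ... + n^p, found by substituting n = 0, ..., p+1."""
    ns = range(p + 2)
    A = [[n ** k for k in range(p + 1, -1, -1)] for n in ns]
    b = [sum(i ** p for i in range(1, n + 1)) for n in ns]
    return solve_exact(A, b)

print(power_sum_coefficients(1))  # coefficients a, b, c of Exercise 45
print(power_sum_coefficients(2))  # coefficients a, b, c, d of Exercise 46
```

Using `Fraction` rather than floating point keeps the coefficients exact, which makes it easy to compare them with a closed formula by hand.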

The Global Positioning System

The Global Positioning System (GPS) is used in a variety of situations for determining geographical locations. The military, surveyors, airlines, shipping companies, and hikers all make use of it. GPS technology is becoming so commonplace that some automobiles, cellular phones, and various handheld devices are now equipped with it. The basic idea of GPS is a variant on three-dimensional triangulation: A point on Earth's surface is uniquely determined by knowing its distances from three other points. Here the point we wish to determine is the location of the GPS receiver, the other points are satellites, and the distances are computed using the travel times of radio signals from the satellites to the receiver. We will assume that Earth is a sphere on which we impose an xyz-coordinate system with Earth centered at the origin and with the positive z-axis running through the north pole and fixed relative to Earth. For simplicity, let's take one unit to be equal to the radius of Earth. Thus Earth's surface becomes the unit sphere with equation $x^2 + y^2 + z^2 = 1$. Time will be measured in hundredths of a second. GPS finds distances by knowing how long it takes a radio signal to get from one point to another. For this we need to know the speed of light, which is approximately equal to 0.47 (Earth radii per hundredths of a second). Let's imagine that you are a hiker lost in the woods at point $(x, y, z)$ at some time $t$. You don't know where you are, and furthermore, you have no watch, so you don't know what time it is. However, you have your GPS device, and it receives simultaneous signals from four satellites, giving their positions and times as shown in Table 2.6. (Distances are measured in Earth radii and time in hundredths of a second past midnight.)

This application is based on the article "An Underdetermined Linear System for GPS" by Dan Kalman in The College Mathematics Journal, 33 (2002), pp. 384-390.

For a more in-depth treatment of the ideas introduced here, see G. Strang and K. Borre, Linear Algebra, Geodesy, and GPS (Wellesley-Cambridge Press, MA, 1997).

Let $(x, y, z)$ be your position, and let $t$ be the time when the signals arrive. The goal is to solve for $x$, $y$, $z$, and $t$. Your distance from Satellite 1 can be computed as follows. The signal, traveling at a speed of 0.47 Earth radii per hundredth of a second, was sent at time 1.29 and arrived at time $t$, so it took $t - 1.29$ hundredths of a second to reach you. Distance equals velocity multiplied by (elapsed) time, so

$$d = 0.47(t - 1.29)$$

Table 2.6 Satellite Data

Satellite    Position              Time
1            (1.11, 2.55, 2.14)    1.29
2            (2.87, 0.00, 1.43)    1.31
3            (0.00, 1.08, 2.29)    2.75
4            (1.54, 1.01, 1.23)    4.06

We can also express $d$ in terms of $(x, y, z)$ and the satellite's position (1.11, 2.55, 2.14) using the distance formula:

$$d = \sqrt{(x - 1.11)^2 + (y - 2.55)^2 + (z - 2.14)^2}$$

Combining these results leads to the equation

$$(x - 1.11)^2 + (y - 2.55)^2 + (z - 2.14)^2 = 0.47^2 (t - 1.29)^2 \tag{1}$$

Expanding, simplifying, and rearranging, we find that equation (1) becomes

$$2.22x + 5.10y + 4.28z - 0.57t = x^2 + y^2 + z^2 - 0.22t^2 + 11.95$$

Similarly, we can derive a corresponding equation for each of the other three satellites. We end up with a system of four equations in $x$, $y$, $z$, and $t$:

$$\begin{aligned}
2.22x + 5.10y + 4.28z - 0.57t &= x^2 + y^2 + z^2 - 0.22t^2 + 11.95 \\
5.74x \phantom{{}+ 0.00y} + 2.86z - 0.58t &= x^2 + y^2 + z^2 - 0.22t^2 + 9.90 \\
2.16y + 4.58z - 1.21t &= x^2 + y^2 + z^2 - 0.22t^2 + 4.74 \\
3.08x + 2.02y + 2.46z - 1.79t &= x^2 + y^2 + z^2 - 0.22t^2 + 1.26
\end{aligned}$$
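As a check on the expansion step, the coefficients of all four equations can be recomputed from Table 2.6. The following Python sketch is an illustration only; the names `SATELLITES`, `SPEED`, and `expanded_coefficients` are hypothetical. It uses the fact that $(x - a)^2 + (y - b)^2 + (z - c)^2 = 0.47^2 (t - s)^2$ rearranges to $2ax + 2by + 2cz - 2(0.47^2)s\,t = x^2 + y^2 + z^2 - 0.47^2 t^2 + (a^2 + b^2 + c^2 - 0.47^2 s^2)$.

```python
# Satellite data from Table 2.6: (position, time at which the signal was sent).
SATELLITES = [
    ((1.11, 2.55, 2.14), 1.29),
    ((2.87, 0.00, 1.43), 1.31),
    ((0.00, 1.08, 2.29), 2.75),
    ((1.54, 1.01, 1.23), 4.06),
]
SPEED = 0.47  # speed of light in Earth radii per hundredth of a second

def expanded_coefficients(position, sent):
    """Return (cx, cy, cz, ct, const) so that the satellite's equation reads
    cx*x + cy*y + cz*z - ct*t = x^2 + y^2 + z^2 - SPEED^2 * t^2 + const."""
    a, b, c = position
    c2 = SPEED ** 2
    const = a * a + b * b + c * c - c2 * sent * sent
    return (2 * a, 2 * b, 2 * c, 2 * c2 * sent, const)

# Round to two decimal places, as in the text.
rows = [tuple(round(v, 2) for v in expanded_coefficients(p, s)) for p, s in SATELLITES]
for row in rows:
    print(row)
```

Each printed row lists the coefficients of $x$, $y$, $z$, the coefficient subtracted from $t$, and the constant term, in that order.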

These are not linear equations, but the nonlinear terms are the same in each equation. If we subtract the first equation from each of the other three equations, we obtain a linear system:

$$\begin{aligned}
3.52x - 5.10y - 1.42z - 0.01t &= -2.05 \\
-2.22x - 2.94y + 0.30z - 0.64t &= -7.21 \\
0.86x - 3.08y - 1.82z - 1.22t &= -10.69
\end{aligned}$$

The augmented matrix row reduces as

$$\left[\begin{array}{rrrr|r}
3.52 & -5.10 & -1.42 & -0.01 & -2.05 \\
-2.22 & -2.94 & 0.30 & -0.64 & -7.21 \\
0.86 & -3.08 & -1.82 & -1.22 & -10.69
\end{array}\right]
\longrightarrow
\left[\begin{array}{rrrr|r}
1 & 0 & 0 & 0.36 & 2.97 \\
0 & 1 & 0 & 0.03 & 0.81 \\
0 & 0 & 1 & 0.79 & 5.91
\end{array}\right]$$

from which we see that

$$\begin{aligned}
x &= 2.97 - 0.36t \\
y &= 0.81 - 0.03t \\
z &= 5.91 - 0.79t
\end{aligned} \tag{2}$$

with $t$ free. Substituting these equations into (1), we obtain

$$(2.97 - 0.36t - 1.11)^2 + (0.81 - 0.03t - 2.55)^2 + (5.91 - 0.79t - 2.14)^2 = 0.47^2 (t - 1.29)^2$$

which simplifies to the quadratic equation

$$0.54t^2 - 6.65t + 20.32 = 0$$

There are two solutions:

$$t = 6.74 \quad \text{and} \quad t = 5.60$$

Substituting into (2), we find that the first solution corresponds to $(x, y, z) = (0.55, 0.61, 0.56)$ and the second solution to $(x, y, z) = (0.96, 0.65, 1.46)$. The second solution is clearly not on the unit sphere (Earth), so we reject it. The first solution produces $x^2 + y^2 + z^2 = 0.99$, so we are satisfied that, within acceptable roundoff error, we have located your coordinates as (0.55, 0.61, 0.56). In practice, GPS takes significantly more factors into account, such as the fact that Earth's surface is not exactly spherical, so additional refinements are needed involving such techniques as least squares approximation (see Chapter 7). In addition, the results of the GPS calculation are converted from rectangular (Cartesian) coordinates into latitude and longitude, an interesting exercise in itself and one involving yet other branches of mathematics.

Section 2.5 Iterative Methods for Solving Linear Systems

Table 2.7

n        0    1       2       3       4       5       6
$x_1$    0    0.714   0.914   0.976   0.993   0.998   0.999
$x_2$    0    1.400   1.829   1.949   1.985   1.996   1.999

The successive vectors $\begin{bmatrix} x_1 \\ x_2 \end{bmatrix}$ are called iterates, so, for example, when $n = 4$, the fourth iterate is $\begin{bmatrix} 0.993 \\ 1.985 \end{bmatrix}$. We can see that the iterates in this example are approaching $\begin{bmatrix} 1 \\ 2 \end{bmatrix}$, which is the exact solution of the given system. (Check this.) We say in this case that Jacobi's method converges.

Jacobi's method calculates the successive iterates in a two-variable system according to the crisscross pattern shown in Table 2.8.

[Table 2.8: the crisscross pattern linking $x_1$ and $x_2$ for n = 0, 1, 2, 3]

Before we consider Jacobi's method in the general case, we will look at a modification of it that often converges faster to the solution. The Gauss-Seidel method is the same as the Jacobi method except that we use each new value as soon as we can. So in our example, we begin by calculating $x_1 = (5 + 0)/7 = \frac{5}{7} \approx 0.714$ as before, but we now use this value of $x_1$ to get the next value of $x_2$:

$$x_2 = \frac{7 + 3 \cdot \frac{5}{7}}{5} \approx 1.829$$

We then use this value of $x_2$ to recalculate $x_1$, and so on. The iterates this time are shown in Table 2.9. We observe that the Gauss-Seidel method has converged faster to the solution. The iterates this time are calculated according to the zigzag pattern shown in Table 2.10.

Table 2.9

n        0    1       2       3       4       5
$x_1$    0    0.714   0.976   0.998   1.000   1.000
$x_2$    0    1.829   1.985   1.999   2.000   2.000

[Table 2.10: the zigzag pattern linking $x_1$ and $x_2$ for n = 0, 1, 2, 3]

The Gauss-Seidel method also has a nice geometric interpretation in the case of two variables. We can think of $x_1$ and $x_2$ as the coordinates of points in the plane. Our starting point is the point corresponding to our initial approximation, (0, 0). Our first calculation gives $x_1 = \frac{5}{7}$, so we move to the point $(\frac{5}{7}, 0) \approx (0.714, 0)$. Then we compute $x_2 = \frac{64}{35} \approx 1.829$, which moves us to the point $(\frac{5}{7}, \frac{64}{35}) \approx (0.714, 1.829)$. Continuing in this fashion, our calculations from the Gauss-Seidel method give rise to a sequence of points, each one differing from the preceding point in exactly one coordinate. If we plot the lines $7x_1 - x_2 = 5$ and $3x_1 - 5x_2 = -7$ corresponding to the two given equations, we find that the points calculated above fall alternately on the two lines, as shown in Figure 2.27. Moreover, they approach the point of intersection of the lines, which corresponds to the solution of the system of equations. This is what convergence means!

Figure 2.27 Converging iterates

The general cases of the two methods are analogous. Given a system of $n$ linear equations in $n$ variables,

$$\begin{aligned}
a_{11}x_1 + a_{12}x_2 + \cdots + a_{1n}x_n &= b_1 \\
a_{21}x_1 + a_{22}x_2 + \cdots + a_{2n}x_n &= b_2 \\
&\;\;\vdots \\
a_{n1}x_1 + a_{n2}x_2 + \cdots + a_{nn}x_n &= b_n
\end{aligned} \tag{2}$$

we solve the first equation for $x_1$, the second for $x_2$, and so on. Then, beginning with an initial approximation, we use these new equations to iteratively update each variable. Jacobi's method uses all of the values at the $k$th iteration to compute the $(k + 1)$st iterate, whereas the Gauss-Seidel method always uses the most recent value of each variable in every calculation. Example 2.37 below illustrates the Gauss-Seidel method in a three-variable problem.

At this point, you should have some questions and concerns about these iterative methods. (Do you?) Several come to mind: Must these methods converge? If not, when do they converge? If they converge, must they converge to the solution? The answer to the first question is no, as Example 2.36 illustrates.
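The difference between the two update rules can be made concrete with a short Python sketch. It is a minimal illustration under stated assumptions, not a production solver: the function names `jacobi_step`, `gauss_seidel_step`, and `iterate` are hypothetical, the coefficient matrix is stored as a list of rows, every diagonal entry is assumed to be nonzero, and the stopping rule (successive iterates agree within a tolerance) is the one used in the exercises. It works in full floating-point precision rather than with a fixed number of significant digits.

```python
def jacobi_step(A, b, x):
    """One Jacobi update: every new entry is computed from the previous iterate x."""
    n = len(b)
    return [
        (b[i] - sum(A[i][j] * x[j] for j in range(n) if j != i)) / A[i][i]
        for i in range(n)
    ]

def gauss_seidel_step(A, b, x):
    """One Gauss-Seidel update: each new entry is used as soon as it is available."""
    n = len(b)
    new = list(x)
    for i in range(n):
        s = sum(A[i][j] * new[j] for j in range(n) if j != i)
        new[i] = (b[i] - s) / A[i][i]
    return new

def iterate(step, A, b, tol=0.001, max_iter=100):
    """Start from the zero vector and repeat `step` until two successive
    iterates agree within tol in every variable. Returns (iterate, count)."""
    x = [0.0] * len(b)
    for k in range(1, max_iter + 1):
        new = step(A, b, x)
        if max(abs(u - v) for u, v in zip(new, x)) < tol:
            return new, k
        x = new
    raise RuntimeError("no convergence within max_iter iterations")

# The system 7x1 - x2 = 5, 3x1 - 5x2 = -7 discussed in the text.
A = [[7, -1], [3, -5]]
b = [5, -7]
print(iterate(jacobi_step, A, b))
print(iterate(gauss_seidel_step, A, b))
```

The only difference between the two step functions is whether the sum reads from the old vector `x` or from the partially updated vector `new`.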

Example 2.36

Apply the Gauss-Seidel method to the system

$$\begin{aligned}
x_1 - x_2 &= 1 \\
2x_1 + x_2 &= 5
\end{aligned}$$

with initial approximation $\begin{bmatrix} 0 \\ 0 \end{bmatrix}$.

Solution  We rearrange the equations to get

$$\begin{aligned}
x_1 &= 1 + x_2 \\
x_2 &= 5 - 2x_1
\end{aligned}$$

The first few iterates are given in Table 2.11. (Check these.) The actual solution to the given system is $\begin{bmatrix} x_1 \\ x_2 \end{bmatrix} = \begin{bmatrix} 2 \\ 1 \end{bmatrix}$. Clearly, the iterates in Table 2.11 are not approaching this point, as Figure 2.28 makes graphically clear in an example of divergence.
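The divergence in Example 2.36 is easy to reproduce numerically. The sketch below is an illustration only (the function name `gauss_seidel_2_36` is hypothetical); it applies the rearranged equations $x_1 = 1 + x_2$ and $x_2 = 5 - 2x_1$ starting from (0, 0), always using the newest value of $x_1$ when computing $x_2$.

```python
def gauss_seidel_2_36(steps):
    """Gauss-Seidel iterates for x1 = 1 + x2, x2 = 5 - 2*x1, starting at (0, 0)."""
    x1, x2 = 0, 0
    iterates = []
    for _ in range(steps):
        x1 = 1 + x2        # uses the most recent x2
        x2 = 5 - 2 * x1    # uses the x1 just computed
        iterates.append((x1, x2))
    return iterates

for n, (x1, x2) in enumerate(gauss_seidel_2_36(5), start=1):
    # The exact solution has x1 = 2; print how far each iterate is from it.
    print(n, x1, x2, abs(x1 - 2))
```

Substituting the second equation into the first gives $x_1^{\text{new}} - 2 = -2(x_1^{\text{old}} - 2)$, so the distance of $x_1$ from the exact value 2 doubles at every step, which is why the iterates move away from the solution.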

So when do these iterative methods converge? Unfortunately, the answer to this question is rather tricky. We will answer it completely in Chapter 7, but for now we will give a partial answer, without proof. Let $A$ be the $n \times n$ matrix

$$A = \begin{bmatrix}
a_{11} & a_{12} & \cdots & a_{1n} \\
a_{21} & a_{22} & \cdots & a_{2n} \\
\vdots & \vdots & \ddots & \vdots \\
a_{n1} & a_{n2} & \cdots & a_{nn}
\end{bmatrix}$$

We say that $A$ is strictly diagonally dominant if

$$\begin{aligned}
|a_{11}| &> |a_{12}| + |a_{13}| + \cdots + |a_{1n}| \\
|a_{22}| &> |a_{21}| + |a_{23}| + \cdots + |a_{2n}| \\
&\;\;\vdots \\
|a_{nn}| &> |a_{n1}| + |a_{n2}| + \cdots + |a_{n,n-1}|
\end{aligned}$$

That is, the absolute value of each diagonal entry $a_{11}, a_{22}, \ldots, a_{nn}$ is greater than the sum of the absolute values of the remaining entries in that row.
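The definition translates directly into a row-by-row test. The following Python sketch is illustrative only; the function name `is_strictly_diagonally_dominant` is hypothetical and the matrix is assumed to be square and stored as a list of rows.

```python
def is_strictly_diagonally_dominant(A):
    """True if, in every row, |diagonal entry| exceeds the sum of the
    absolute values of the remaining entries in that row."""
    return all(
        abs(row[i]) > sum(abs(v) for j, v in enumerate(row) if j != i)
        for i, row in enumerate(A)
    )

# Coefficient matrix of 7x1 - x2 = 5, 3x1 - 5x2 = -7.
print(is_strictly_diagonally_dominant([[7, -1], [3, -5]]))
# Coefficient matrix of x1 - x2 = 1, 2x1 + x2 = 5 (Example 2.36).
print(is_strictly_diagonally_dominant([[1, -1], [2, 1]]))
```

The comparison is strict, so a row in which the diagonal entry merely equals the sum of the other absolute values (as in the first row of the second matrix) fails the test.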

Theorem 2.9

If a system of $n$ linear equations in $n$ variables has a strictly diagonally dominant coefficient matrix, then it has a unique solution and both the Jacobi and the Gauss-Seidel method converge to it.

Remark  Be warned! This theorem is a one-way implication. The fact that a system is not strictly diagonally dominant does not mean that the iterative methods diverge. They may or may not converge. (See Exercises 15-19.) Indeed, there are examples in which one of the methods converges and the other diverges. However, if either of these methods converges, then it must converge to the solution; it cannot converge to some other point.

Theorem 2.10

If the Jacobi or the Gauss-Seidel method converges for a system of $n$ linear equations in $n$ variables, then it must converge to the solution of the system.

Proof  We will illustrate the idea behind the proof by sketching it out for the case of Jacobi's method, using the system of equations in Example 2.35. The general proof is similar. Convergence means that from some iteration on, the values of the iterates remain the same. This means that $x_1$ and $x_2$ converge to $r$ and $s$, respectively, as shown in Table 2.12. We must prove that $\begin{bmatrix} x_1 \\ x_2 \end{bmatrix} = \begin{bmatrix} r \\ s \end{bmatrix}$ is the solution of the system of equations. In other words, at the $(k + 1)$st iteration, the values of $x_1$ and $x_2$ must stay the same as at the $k$th iteration.

Table 2.12

n        ...   k    k+1   k+2   ...
$x_1$    ...   r    r     r     ...
$x_2$    ...   s    s     s     ...

But the calculations give $x_1 = (5 + x_2)/7 = (5 + s)/7$ and $x_2 = (7 + 3x_1)/5 = (7 + 3r)/5$. Therefore,

$$\frac{5 + s}{7} = r \quad \text{and} \quad \frac{7 + 3r}{5} = s$$

Rearranging, we see that

$$\begin{aligned}
7r - s &= 5 \\
3r - 5s &= -7
\end{aligned}$$

Thus, $x_1 = r$, $x_2 = s$ satisfy the original equations, as required.

By now you may be wondering: If iterative methods don't always converge to the solution, what good are they? Why don't we just use Gaussian elimination? First, we have seen that Gaussian elimination is sensitive to roundoff errors, and this sensitivity can lead to inaccurate or even wildly wrong answers. Also, even if Gaussian elimination does not go astray, we cannot improve on a solution once we have found it. For example, if we use Gaussian elimination to calculate a solution to two decimal places, there is no way to obtain the solution to four decimal places except to start over again and work with increased accuracy. In contrast, we can achieve additional accuracy with iterative methods simply by doing more iterations. For large systems, particularly those with sparse coefficient matrices, iterative methods are much faster than direct methods when implemented on a computer. In many applications, the systems that arise are strictly diagonally dominant, and thus iterative methods are guaranteed to converge. The next example illustrates one such application.

Example 2.37

Suppose we heat each edge of a metal plate to a constant temperature, as shown in Figure 2.29.

Figure 2.29 A heated metal plate

Eventually the temperature at the interior points will reach equilibrium, where the following property can be shown to hold:

The temperature at each interior point P on a plate is the average of the temperatures on the circumference of any circle centered at P inside the plate (Figure 2.30).

[Figure 2.30]

To apply this property in an actual example requires techniques from calculus. As an alternative, we can approximate the situation by overlaying the plate with a grid, or mesh, that has a finite number of interior points, as shown in Figure 2.31.

Figure 2.31 The discrete version of the heated plate problem

The discrete analogue of the averaging property governing equilibrium temperatures is stated as follows:

The temperature at each interior point P is the average of the temperatures at the points adjacent to P.

For the example shown in Figure 2.31, there are three interior points, and each is adjacent to four other points. Let the equilibrium temperatures of the interior points be $t_1$, $t_2$, and $t_3$, as shown. Then, by the temperature-averaging property, we have

$$\begin{aligned}
t_1 &= \frac{100 + 100 + t_2 + 50}{4} \\
t_2 &= \frac{t_1 + t_3 + 0 + 50}{4} \\
t_3 &= \frac{100 + 100 + 0 + t_2}{4}
\end{aligned} \tag{3}$$

or

$$\begin{aligned}
4t_1 - t_2 &= 250 \\
-t_1 + 4t_2 - t_3 &= 50 \\
-t_2 + 4t_3 &= 200
\end{aligned}$$

Notice that this system is strictly diagonally dominant. Notice also that equations (3) are in the form required for Jacobi or Gauss-Seidel iteration. With an initial approximation of $t_1 = 0$, $t_2 = 0$, $t_3 = 0$, the Gauss-Seidel method gives the following iterates.

Iteration 1:

$$\begin{aligned}
t_1 &= \frac{100 + 100 + 0 + 50}{4} = 62.5 \\
t_2 &= \frac{62.5 + 0 + 0 + 50}{4} = 28.125 \\
t_3 &= \frac{100 + 100 + 0 + 28.125}{4} = 57.031
\end{aligned}$$

Iteration 2:

$$\begin{aligned}
t_1 &= \frac{100 + 100 + 28.125 + 50}{4} = 69.531 \\
t_2 &= \frac{69.531 + 57.031 + 0 + 50}{4} = 44.141 \\
t_3 &= \frac{100 + 100 + 0 + 44.141}{4} = 61.035
\end{aligned}$$

Continuing, we find the iterates listed in Table 2.13. We work with five-significant-digit accuracy and stop when two successive iterates agree within 0.001 in all variables. Thus, the equilibrium temperatures at the interior points are (to an accuracy of 0.001) $t_1 = 74.108$, $t_2 = 46.430$, and $t_3 = 61.607$. (Check these calculations.) By using a finer grid (with more interior points), we can get as precise information as we like about the equilibrium temperatures at various points on the plate.

Table 2.13

n        0    1        2        3        ...   7        8
$t_1$    0    62.500   69.531   73.535   ...   74.107   74.107
$t_2$    0    28.125   44.141   46.143   ...   46.429   46.429
$t_3$    0    57.031   61.035   61.536   ...   61.607   61.607
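The iteration in Example 2.37 can be reproduced with a few lines of Python. This is a sketch under stated assumptions rather than the text's own procedure: the function name `heated_plate_gauss_seidel` is hypothetical, and the code works in full floating-point precision instead of five significant digits, so later digits and the number of iterations needed may differ slightly from Table 2.13.

```python
def heated_plate_gauss_seidel(tol=0.001, max_iter=100):
    """Gauss-Seidel iteration for equations (3), starting from t1 = t2 = t3 = 0.
    Returns the list of iterates, beginning with the initial approximation."""
    t1 = t2 = t3 = 0.0
    history = [(t1, t2, t3)]
    for _ in range(max_iter):
        old = (t1, t2, t3)
        t1 = (100 + 100 + t2 + 50) / 4   # uses the latest t2
        t2 = (t1 + t3 + 0 + 50) / 4      # uses the t1 just computed
        t3 = (100 + 100 + 0 + t2) / 4    # uses the t2 just computed
        history.append((t1, t2, t3))
        # Stop when successive iterates agree within tol in every variable.
        if max(abs(a - b) for a, b in zip(history[-1], old)) < tol:
            break
    return history

history = heated_plate_gauss_seidel()
for n, (t1, t2, t3) in enumerate(history):
    print(n, round(t1, 3), round(t2, 3), round(t3, 3))
```

Solving the equivalent linear system exactly gives $t_2 = 650/14$, $t_1 = (250 + t_2)/4$, and $t_3 = (200 + t_2)/4$, which is a convenient check on the final iterate.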

Exercises 2.5

In Exercises 1-6, apply Jacobi's method to the given system. Take the zero vector as the initial approximation and work with four-significant-digit accuracy until two successive iterates agree within 0.001 in each variable. In each case, compare your answer with the exact solution found using any direct method you like.

1. $7x_1 - x_2 = 6$
   $x_1 - 5x_2 = -4$

2. $2x_1 + x_2 = 5$
   $x_1 - x_2 = 1$

3. $4.5x_1 - 0.5x_2 = 1$
   $x_1 - 3.5x_2 = -1$

4. $20x_1 + x_2 - x_3 = 17$
   $x_1 - 10x_2 + x_3 = 13$
   $-x_1 + x_2 + 10x_3 = 18$

5. $3x_1 + x_2 = 1$
   $x_1 + 4x_2 + x_3 = 1$
   $x_2 + 3x_3 = 1$

6. $3x_1 - x_2 = 1$
   $-x_1 + 3x_2 - x_3 = 0$
   $-x_2 + 3x_3 - x_4 = 1$
   $-x_3 + 3x_4 = 1$

In Exercises 7-12, repeat the given exercise using the Gauss-Seidel method. Take the zero vector as the initial approximation and work with four-significant-digit accuracy until two successive iterates agree within 0.001 in each variable. Compare the number of iterations required by the Jacobi and Gauss-Seidel methods to reach such an approximate solution.

7. Exercise 1
8. Exercise 2
9. Exercise 3
10. Exercise 4
11. Exercise 5
12. Exercise 6

In Exercises 13 and 14, draw diagrams to illustrate the convergence of the Gauss-Seidel method with the given system.

13. The system in Exercise 1
14. The system in Exercise 2

In Exercises 15 and 16, compute the first four iterates, using the zero vector as the initial approximation, to show that the Gauss-Seidel method diverges. Then show that the equations can be rearranged to give a strictly diagonally dominant coefficient matrix, and apply the Gauss-Seidel method to obtain an approximate solution that is accurate to within 0.001.

15. $x_1 - 2x_2 = 3$
    $3x_1 + 2x_2 = 1$

16. $x_1 - 4x_2 + 2x_3 = 2$
    $2x_2 + 4x_3 = 1$
    $6x_1 - x_2 - 2x_3 = 1$

17. Draw a diagram to illustrate the divergence of the Gauss-Seidel method in Exercise 15.

In Exercises 18 and 19, the coefficient matrix is not strictly diagonally dominant, nor can the equations be rearranged to make it so. However, both the Jacobi and the Gauss-Seidel method converge anyway. Demonstrate that this is true of the Gauss-Seidel method, starting with the zero vector as the initial approximation and obtaining a solution that is accurate to within 0.01.

18. $-4x_1 + 5x_2 = 14$
    $x_1 - 3x_2 = -7$

19. $5x_1 - 2x_2 + 3x_3 = -8$
    $x_1 + 4x_2 - 4x_3 = 102$
    $-2x_1 - 2x_2 + 4x_3 = -90$

20. Continue performing iterations in Exercise 18 to obtain a solution that is accurate to within 0.001.

21. Continue performing iterations in Exercise 19 to obtain a solution that is accurate to within 0.001.

In Exercises 22-24, the metal plate has the constant temperatures shown on its boundaries. Find the equilibrium temperature at each of the indicated interior points by setting up a system of linear equations and applying either the Jacobi or the Gauss-Seidel method. Obtain a solution that is accurate to within 0.001.

22. [Figure: metal plate with boundary temperatures and labeled interior points]

SeCiion 2.5

o·

CY'

23.

"

o· IOCY'

"

24.

o·

o·

'.

IOCY'

100"

CY'

2CY'

"

'2

40'

4CY'

27. A narrow strip of paper I unit long is placed along a number line so that its ends are at 0 and I. The paper IS folded In half, right end over left, so that its ends are now at 0 and t. Next, it is fo lded in half again, this time left end over right, so that It S ends a fC at ~ and ~ . Figure 2.32 shows this process. We con tinue fo lding the paper in half, alternating flght -over-Ieft and leftover-right. If we could con ti nue indefinitely, il IS dear that the ends of the paper would converge to a poi nt. It is thjs point that we want 10 find .

2CY'

(a) lei XI co rrespond to the left -hand end of the paper

100'

'.

"

131

Exercises 27 ali(I 28 demonstrtltc that soll/ctimes, if we arc lucky, the form of an iterative problelllll1ay allow [IS 10 lise a little IIIsight to olJtaiu all exact soilltion.

'2

IO(f

Iterati ve Methods fo r Solvmg Linear Systems

IOCY'

111 Exercises 25 (lI1d 26, we refille the gnd used In Exercises 22 mid 24 to obtai" more accllrate iltjormatloll about IIII? eqllilibrlllm lcmperarures almtcrior poims of the plates. ObU/ili solUliollS Ihat are accurate to WIt/1111 0.00], J/SlIlg eitiler the Jacoul or tile G(It/S5-Seidel method. 25.

and Xz to the right-h:llld end. Make a table with the first six values of [ XI' X21and plol the corresponding pomts o n Xl-X:! coordinate axes. (b) Find two linear equulions of the form X:! = (lX I + b and XI = £'Xl + d that determine the new values o f the endpoints at each iteration. Draw the correspondlllg lines on your coordinate axes and show that thiS d iagram would result from applying the Gauss-Seidel method to the system of linear equations you have found. (Your diagram should resemble Figure 2.27 on page 124.) (c) Switching to decimal representation, continue applying th e Gauss-Seidel method to approximate the

point to which the ends of the paper are converging to within 0.00 1 accuracy. (d) Solve the system of equations exactly and compare your answers. 28. An ant is standin g on a number line al poilu A. It

o· 5~

26.

5°

0°

o·

"

'2

'. '.

" 4CY' 40'

00

'.

'10

'13

•

5°

20°

20 0

"

'.

"

'.

'n

'12

'"

'16

20°

2CY'

100'

walks halfway to point B and turns around. Then it walks halfway back to point A, turns around again, and walks halfway to point B. It continues to do this indefinitely. Let point A be at 0 and point B be at 1. The ant's walk is made up of a sequence of overlapping line segments. Let $x_1$ record the positions of the left-hand endpoints of these segments and $x_2$ their right-hand endpoints. (Thus, we begin with $x_1 = 0$ and $x_2 = \frac{1}{2}$. Then we have $x_1 = \frac{1}{4}$ and $x_2 = \frac{5}{8}$, and so on.) Figure 2.33 shows the start of the ant's walk.

(a) Make a table with the first six values of $[x_1, x_2]$ and plot the corresponding points on $x_1$-$x_2$ coordinate axes.
(b) Find two linear equations of the form $x_2 = ax_1 + b$ and $x_1 = cx_2 + d$ that determine the new values of the endpoints at each iteration. Draw the corresponding lines on your coordinate axes and show that this diagram would result from applying the Gauss-Seidel method to the system of linear equations you have found. (Your diagram should resemble Figure 2.27 on page 124.)
(c) Switching to decimal representation, continue applying the Gauss-Seidel method to approximate the values to which $x_1$ and $x_2$ are converging to within 0.001 accuracy.
(d) Solve the system of equations exactly and compare your answers. Interpret your results.

Figure 2.32 Folding a strip of paper

Figure 2.33 The ant's walk


Key Definitions and Concepts

augmented matrix, 62
back substitution, 62
coefficient matrix, 68
consistent system, 61
convergence, 123-124
divergence, 125
elementary row operations, 70
free variable, 75
Gauss-Jordan elimination, 76
Gauss-Seidel method, 123
Gaussian elimination, 72
homogeneous system, 80
inconsistent system, 61
iterate, 123
Jacobi's method, 122
leading variable (leading 1), 75-76
linear equation, 59
linearly dependent vectors, 95
linearly independent vectors, 95
pivot, 70
rank of a matrix, 75
Rank Theorem, 75
reduced row echelon form, 76
row echelon form, 68
row equivalent matrices, 72
span of a set of vectors, 92
spanning set, 92
system of linear equations, 59

Review Questions

1. Mark each of the following statements true or false:

(a) Every system of linear equations has a solution.
(b) Every homogeneous system of linear equations has a solution.
(c) If a system of linear equations has more variables than equations, then it has infinitely many solutions.
(d) If a system of linear equations has more equations than variables, then it has no solution.
(e) Determining whether $\mathbf{b}$ is in $\text{span}(\mathbf{a}_1, \ldots, \mathbf{a}_n)$ is equivalent to determining whether the system $[A \mid \mathbf{b}]$ is consistent, where $A = [\mathbf{a}_1 \cdots \mathbf{a}_n]$.
(f) In $\mathbb{R}^3$, $\text{span}(\mathbf{u}, \mathbf{v})$ is always a plane through the origin.
(g) In $\mathbb{R}^3$, if nonzero vectors $\mathbf{u}$ and $\mathbf{v}$ are not parallel, then they are linearly independent.
(h) In $\mathbb{R}^3$, if a set of vectors can be drawn head to tail, one after the other so that a closed path (polygon) is formed, then the vectors are linearly dependent.
(i) If a set of vectors has the property that no two vectors in the set are scalar multiples of one another, then the set of vectors is linearly independent.
(j) If there are more vectors in a set of vectors than the number of entries in each vector, then the set of vectors is linearly dependent.

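2. Find the rank of the matrix
\[ \begin{bmatrix} 1 & -2 & 0 & 3 & 2 \\ 3 & -1 & 1 & 3 & 4 \\ 3 & 4 & 2 & -3 & 2 \\ 0 & -5 & -1 & 6 & 2 \end{bmatrix} \]

3. Solve the linear system
\[ \begin{aligned} x + y - 2z &= 4 \\ x + 3y - z &= 7 \\ 2x + y - 5z &= 7 \end{aligned} \]

4. Solve the linear system
\[ \begin{aligned} 3w + 8x - 18y + z &= 35 \\ w + 2x - 4y &= 11 \\ w + 3x - 7y + z &= 10 \end{aligned} \]

5. Solve the linear system
\[ \begin{aligned} 2x + 3y &= 4 \\ x + 2y &= 3 \end{aligned} \]
over $\mathbb{Z}_7$.

6. Solve the linear system
\[ \begin{aligned} 3x + 2y &= 1 \\ x + 4y &= 2 \end{aligned} \]

7. For what value(s) of $k$ is the linear system with augmented matrix $\left[\begin{array}{cc|c} k & 2 & 1 \\ 1 & 2k & 1 \end{array}\right]$ inconsistent?

8. Find parametric equations for the line of intersection of the planes $x + 2y + 3z = 4$ and $5x + 6y + 7z = 8$.

9. Find the point of intersection of the following lines, if it exists.
\[ \begin{bmatrix} x \\ y \\ z \end{bmatrix} = \begin{bmatrix} 1 \\ 2 \\ 3 \end{bmatrix} + s\begin{bmatrix} 1 \\ -1 \\ 2 \end{bmatrix} \quad \text{and} \quad \begin{bmatrix} x \\ y \\ z \end{bmatrix} = \begin{bmatrix} 5 \\ -2 \\ -4 \end{bmatrix} + t\begin{bmatrix} -1 \\ 1 \\ 1 \end{bmatrix} \]

10. Determine whether $\begin{bmatrix} 3 \\ 5 \\ 1 \end{bmatrix}$ is in the span of $\begin{bmatrix} 1 \\ 1 \\ 3 \end{bmatrix}$ and $\begin{bmatrix} 1 \\ 2 \\ -2 \end{bmatrix}$.

11. Find the general equation of the plane spanned by $\begin{bmatrix} 1 \\ 1 \\ 1 \end{bmatrix}$ and $\begin{bmatrix} 3 \\ 2 \\ 1 \end{bmatrix}$.

12. Determine whether [...], [...], and [...] are linearly independent.

13. Determine whether $\mathbb{R}^3 = \text{span}(\mathbf{u}, \mathbf{v}, \mathbf{w})$ if:
(a) $\mathbf{u}$ = [...], $\mathbf{v}$ = [...], $\mathbf{w}$ = [...]
(b) $\mathbf{u}$ = [...], $\mathbf{v}$ = [...], $\mathbf{w}$ = [...]

14. Let $\mathbf{a}_1, \mathbf{a}_2, \mathbf{a}_3$ be linearly independent vectors in $\mathbb{R}^3$, and let $A = [\mathbf{a}_1 \; \mathbf{a}_2 \; \mathbf{a}_3]$. Which of the following statements are true?
(a) The reduced row echelon form of $A$ is $I_3$.
(b) The rank of $A$ is 3.
(c) The system $[A \mid \mathbf{b}]$ has a unique solution for any vector $\mathbf{b}$ in $\mathbb{R}^3$.
(d) (a), (b), and (c) are all true.
(e) (a) and (b) are both true, but not (c).

15. Let $\mathbf{a}_1, \mathbf{a}_2, \mathbf{a}_3$ be linearly dependent vectors in $\mathbb{R}^3$, not all zero, and let $A = [\mathbf{a}_1 \; \mathbf{a}_2 \; \mathbf{a}_3]$. What are the possible values of the rank of $A$?

16. What is the maximum rank of a $5 \times 3$ matrix? What is the minimum rank of a $5 \times 3$ matrix?

17. Show that if $\mathbf{u}$ and $\mathbf{v}$ are linearly independent vectors, then so are $\mathbf{u} + \mathbf{v}$ and $\mathbf{u} - \mathbf{v}$.

18. Show that $\text{span}(\mathbf{u}, \mathbf{v}) = \text{span}(\mathbf{u}, \mathbf{u} + \mathbf{v})$ for any vectors $\mathbf{u}$ and $\mathbf{v}$.

19. In order for a linear system with augmented matrix $[A \mid \mathbf{b}]$ to be consistent, what must be true about the ranks of $A$ and $[A \mid \mathbf{b}]$?

20. Are the matrices $\begin{bmatrix} 1 & 1 & 1 \\ 2 & 3 & -1 \\ -1 & 4 & 1 \end{bmatrix}$ and $\begin{bmatrix} 1 & 0 & -1 \\ 1 & 1 & 1 \\ 0 & 1 & 3 \end{bmatrix}$ row equivalent? Why or why not?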

Matrices

We [Halmos and Kaplansky] share a philosophy about linear algebra: we think basis-free, we write basis-free, but when the chips are down we close the office door and compute with matrices like fury.
- Irving Kaplansky, in Paul Halmos: Celebrating 50 Years of Mathematics, J. H. Ewing and F. W. Gehring, eds., Springer-Verlag, 1991, p. 88

3.0 Introduction: Matrices in Action

In this chapter, we will study matrices in their own right. We have already used matrices, in the form of augmented matrices, to record information about and to help streamline calculations involving systems of linear equations. Now you will see that matrices have algebraic properties of their own, which enable us to calculate with them, subject to the rules of matrix algebra. Furthermore, you will observe that matrices are not static objects, recording information and data; rather, they represent certain types of functions that "act" on vectors, transforming them into other vectors. These "matrix transformations" will begin to play a key role in our study of linear algebra and will shed new light on what you have already learned about vectors and systems of linear equations. Furthermore, matrices arise in many forms other than augmented matrices; we will explore some of the many applications of matrices at the end of this chapter.

In this section, we will consider a few simple examples to illustrate how matrices can transform vectors. In the process, you will get your first glimpse of "matrix arithmetic." Consider the equations

\[ \begin{aligned} y_1 &= x_1 + 2x_2 \\ y_2 &= 3x_2 \end{aligned} \qquad (1) \]

We can view these equations as describing a transformation of the vector $\mathbf{x} = \begin{bmatrix} x_1 \\ x_2 \end{bmatrix}$ into the vector $\mathbf{y} = \begin{bmatrix} y_1 \\ y_2 \end{bmatrix}$. If we denote the matrix of coefficients of the right-hand side by $F$, then $F = \begin{bmatrix} 1 & 2 \\ 0 & 3 \end{bmatrix}$, and we can rewrite the transformation as

\[ \begin{bmatrix} y_1 \\ y_2 \end{bmatrix} = \begin{bmatrix} 1 & 2 \\ 0 & 3 \end{bmatrix}\begin{bmatrix} x_1 \\ x_2 \end{bmatrix} \]

or, more succinctly, $\mathbf{y} = F\mathbf{x}$. (Think of this expression as analogous to the functional notation $y = f(x)$ you are used to: $\mathbf{x}$ is the independent "variable" here, $\mathbf{y}$ is the dependent "variable," and $F$ is the name of the "function.")

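Thus, if $\mathbf{x} = \begin{bmatrix} -2 \\ 1 \end{bmatrix}$, then the equations (1) give

\[ \begin{aligned} y_1 &= -2 + 2 \cdot 1 = 0 \\ y_2 &= 3 \cdot 1 = 3 \end{aligned} \]

or $\mathbf{y} = \begin{bmatrix} 0 \\ 3 \end{bmatrix}$. We can write this expression as

\[ \begin{bmatrix} 0 \\ 3 \end{bmatrix} = \begin{bmatrix} 1 & 2 \\ 0 & 3 \end{bmatrix}\begin{bmatrix} -2 \\ 1 \end{bmatrix} \]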
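The calculation above can be checked numerically. The following is a minimal Python sketch; the helper name `mat_vec` is our own choice for illustration, and the matrix is stored as a list of rows read off from the coefficients of equations (1).

```python
def mat_vec(F, x):
    """Return the product Fx, where F is a list of rows and x is a list."""
    return [sum(f_ij * x_j for f_ij, x_j in zip(row, x)) for row in F]

# Coefficient matrix of equations (1): y1 = x1 + 2*x2, y2 = 3*x2
F = [[1, 2],
     [0, 3]]

x = [-2, 1]
y = mat_vec(F, x)
print(y)  # [0, 3]
```

Each entry of the result is one equation of (1) evaluated at the given vector, which is the sense in which $F$ "acts" on $\mathbf{x}$.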

Problem 1 Compute $F\mathbf{x}$ for the following vectors $\mathbf{x}$:

Problem 2 The heads of the four vectors $\mathbf{x}$ in Problem 1 locate the four corners of a square in the $x_1x_2$ plane. Draw this square and label its corners A, B, C, and D, corresponding to parts (a), (b), (c), and (d) of Problem 1. On separate coordinate axes (labeled $y_1$ and $y_2$), draw the four points determined by $F\mathbf{x}$ in Problem 1. Label these points A', B', C', and D'. Let's make the (reasonable) assumption that the line segment AB is transformed into the line segment A'B', and likewise for the other three sides of the square ABCD. What geometric figure is represented by A'B'C'D'?

Problem 3 The center of square ABCD is the origin $\mathbf{0} = \begin{bmatrix} 0 \\ 0 \end{bmatrix}$. What is the center of A'B'C'D'? What algebraic calculation confirms this?

Now consider the equations

\[ \begin{aligned} z_1 &= y_1 - y_2 \\ z_2 &= -2y_1 \end{aligned} \qquad (2) \]

that transform a vector $\mathbf{y} = \begin{bmatrix} y_1 \\ y_2 \end{bmatrix}$ into the vector $\mathbf{z} = \begin{bmatrix} z_1 \\ z_2 \end{bmatrix}$. We can abbreviate this transformation as $\mathbf{z} = G\mathbf{y}$, where

\[ G = \begin{bmatrix} 1 & -1 \\ -2 & 0 \end{bmatrix} \]

Problem 4 We are going to find out how $G$ transforms the figure A'B'C'D'. Compute $G\mathbf{y}$ for each of the four ve [...]

[...] and if $m = n$ (that is, if $A$ has the same number of rows as columns), then $A$ is called a square matrix. A square matrix whose nondiagonal entries are all zero is called a diagonal matrix. A diagonal matrix all of whose diagonal entries are the same is called a scalar matrix. If the scalar on the diagonal is 1, the scalar matrix is called an identity matrix. For example, let

\[ A = \begin{bmatrix} 2 & 5 & 0 \\ -1 & 4 & 1 \end{bmatrix}, \quad B = \begin{bmatrix} 3 & 1 \\ 4 & 5 \end{bmatrix}, \quad C = \begin{bmatrix} 3 & 0 & 0 \\ 0 & 6 & 0 \\ 0 & 0 & 2 \end{bmatrix}, \quad D = \begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{bmatrix} \]

The diagonal entries of $A$ are 2 and 4, but $A$ is not square; $B$ is a square matrix of size $2 \times 2$ with diagonal entries 3 and 5; $C$ is a diagonal matrix; $D$ is a $3 \times 3$ identity matrix. The $n \times n$ identity matrix is denoted by $I_n$ (or simply $I$ if its size is understood).

Since we can view matrices as generalizations of vectors (and, indeed, matrices can and should be thought of as being made up of both row and column vectors), many of the conventions and operations for vectors carry through (in an obvious way) to matrices. Two matrices are equal if they have the same size and if their corresponding entries are equal. Thus, if $A = [a_{ij}]_{m \times n}$ and $B = [b_{ij}]_{r \times s}$, then $A = B$ if and only if $m = r$ and $n = s$ and $a_{ij} = b_{ij}$ for all $i$ and $j$.

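Example 3.1

Consider the matrices

\[ A = \begin{bmatrix} a & b \\ c & d \end{bmatrix}, \quad B = \begin{bmatrix} 2 & 0 \\ 5 & 3 \end{bmatrix}, \quad C = \begin{bmatrix} 2 & 0 & x \\ 5 & 3 & y \end{bmatrix} \]

Neither $A$ nor $B$ can be equal to $C$ (no matter what the values of $x$ and $y$), since $A$ and $B$ are $2 \times 2$ matrices and $C$ is $2 \times 3$. However, $A = B$ if and only if $a = 2$, $b = 0$, $c = 5$, and $d = 3$.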

Example 3.2

Consider the matrices

\[ R = \begin{bmatrix} 1 & 4 & 3 \end{bmatrix} \quad \text{and} \quad C = \begin{bmatrix} 1 \\ 4 \\ 3 \end{bmatrix} \]

Despite the fact that $R$ and $C$ have the same entries in the same order, $R \neq C$ since $R$ is $1 \times 3$ and $C$ is $3 \times 1$. (If we read $R$ and $C$ aloud, they both sound the same: "one, four, three.") Thus, our distinction between row matrices/vectors and column matrices/vectors is an important one.

Matrix Addition and Scalar Multiplication

Generalizing from vector addition, we define matrix addition componentwise. If $A = [a_{ij}]$ and $B = [b_{ij}]$ are $m \times n$ matrices, their sum $A + B$ is the $m \times n$ matrix obtained by adding the corresponding entries. Thus,

\[ A + B = [a_{ij} + b_{ij}] \]

[We could equally well have defined $A + B$ in terms of vector addition by specifying that each column (or row) of $A + B$ is the sum of the corresponding columns (or rows) of $A$ and $B$.] If $A$ and $B$ are not the same size, then $A + B$ is not defined.

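Example 3.3

Let

\[ A = \begin{bmatrix} 1 & 4 & 0 \\ -2 & 6 & 5 \end{bmatrix}, \quad B = \begin{bmatrix} -3 & 1 & -1 \\ 3 & 0 & 2 \end{bmatrix}, \quad \text{and} \quad C = \begin{bmatrix} 4 & 3 \\ 2 & 1 \end{bmatrix} \]

Then

\[ A + B = \begin{bmatrix} -2 & 5 & -1 \\ 1 & 6 & 7 \end{bmatrix} \]

but neither $A + C$ nor $B + C$ is defined.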

The componentwise definition of scalar multiplication will come as no surprise. If $A$ is an $m \times n$ matrix and $c$ is a scalar, then the scalar multiple $cA$ is the $m \times n$ matrix obtained by multiplying each entry of $A$ by $c$. More formally, we have

\[ cA = c[a_{ij}] = [ca_{ij}] \]

[In terms of vectors, we could equivalently stipulate that each column (or row) of $cA$ is $c$ times the corresponding column (or row) of $A$.]

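Example 3.4

For matrix $A$ in Example 3.3,

\[ 2A = \begin{bmatrix} 2 & 8 & 0 \\ -4 & 12 & 10 \end{bmatrix}, \quad \tfrac{1}{2}A = \begin{bmatrix} \tfrac{1}{2} & 2 & 0 \\ -1 & 3 & \tfrac{5}{2} \end{bmatrix}, \quad \text{and} \quad (-1)A = \begin{bmatrix} -1 & -4 & 0 \\ 2 & -6 & -5 \end{bmatrix} \]

The matrix $(-1)A$ is written as $-A$ and called the negative of $A$. As with vectors, we can use this fact to define the difference of two matrices: If $A$ and $B$ are the same size, then

\[ A - B = A + (-B) \]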

Example 3.5

For matrices $A$ and $B$ in Example 3.3,

\[ A - B = \begin{bmatrix} 1 & 4 & 0 \\ -2 & 6 & 5 \end{bmatrix} - \begin{bmatrix} -3 & 1 & -1 \\ 3 & 0 & 2 \end{bmatrix} = \begin{bmatrix} 4 & 3 & 1 \\ -5 & 6 & 3 \end{bmatrix} \]

A matrix all of whose entries are zero is called a zero matrix and denoted by $O$ (or $O_{m \times n}$ if it is important to specify its size). It should be clear that if $A$ is any matrix and $O$ is the zero matrix of the same size, then

\[ A + O = A = O + A \]

and

\[ A - A = O = -A + A \]
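The componentwise definitions above translate almost word for word into code. The following Python sketch is illustrative only: the function names are our own, and the entries of `A`, `B`, and `C` are the values of Example 3.3 as best they can be read from this copy, so treat them as assumptions.

```python
def same_size(A, B):
    """True if A and B have the same number of rows and columns."""
    return len(A) == len(B) and all(len(ra) == len(rb) for ra, rb in zip(A, B))

def mat_add(A, B):
    """Componentwise sum; defined only for matrices of the same size."""
    if not same_size(A, B):
        raise ValueError("A + B is not defined for matrices of different sizes")
    return [[a + b for a, b in zip(ra, rb)] for ra, rb in zip(A, B)]

def scalar_mul(c, A):
    """Multiply every entry of A by the scalar c."""
    return [[c * a for a in row] for row in A]

def mat_sub(A, B):
    """A - B defined as A + (-1)B."""
    return mat_add(A, scalar_mul(-1, B))

# Assumed entries (Example 3.3)
A = [[1, 4, 0], [-2, 6, 5]]
B = [[-3, 1, -1], [3, 0, 2]]
C = [[4, 3], [2, 1]]
O = [[0, 0, 0], [0, 0, 0]]   # zero matrix of the same size as A

print(mat_add(A, B))      # [[-2, 5, -1], [1, 6, 7]]
print(scalar_mul(2, A))   # [[2, 8, 0], [-4, 12, 10]]
print(mat_sub(A, B))      # [[4, 3, 1], [-5, 6, 3]]
```

Calling `mat_add(A, C)` raises `ValueError`, mirroring the statement that $A + C$ is not defined.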

Matrix Multiplication

Mathematicians are sometimes like Lewis Carroll's Humpty Dumpty: "When I use a word," Humpty Dumpty said, "it means just what I choose it to mean, neither more nor less" (from Through the Looking Glass).

The Introduction in Section 3.0 suggested that there is a "product" of matrices that is analogous to the composition of functions. We now make this notion more precise. The definition we are about to give generalizes what you should have discovered in Problems 5 and 7 in Section 3.0. Unlike the definitions of matrix addition and scalar multiplication, the definition of the product of two matrices is not a componentwise definition. Of course, there is nothing to stop us from defining a product of matrices in a componentwise fashion; unfortunately such a definition has few applications and is not as "natural" as the one we now give.

If $A$ is an $m \times n$ matrix and $B$ is an $n \times r$ matrix, then the product $C = AB$ is an $m \times r$ matrix. The $(i, j)$ entry of the product is computed as follows:

\[ c_{ij} = a_{i1}b_{1j} + a_{i2}b_{2j} + \cdots + a_{in}b_{nj} \]
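The entry formula for the product can be sketched directly in Python. This is a minimal illustration, not an efficient implementation; the function name `mat_mul` and the two sample matrices are hypothetical and chosen only so that the sizes $2 \times 3$ and $3 \times 2$ are compatible.

```python
def mat_mul(A, B):
    """Product of an m x n matrix A and an n x r matrix B (lists of rows)."""
    n = len(B)
    if any(len(row) != n for row in A):
        raise ValueError("number of columns of A must equal number of rows of B")
    r = len(B[0])
    # (i, j) entry: a_i1*b_1j + a_i2*b_2j + ... + a_in*b_nj
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(r)]
            for i in range(len(A))]

# Hypothetical sample matrices
A = [[1, 3, -1],
     [-2, -1, 1]]          # 2 x 3
B = [[-4, 0],
     [5, -2],
     [-1, 2]]              # 3 x 2

print(mat_mul(A, B))       # [[12, -8], [2, 4]]
```

Here `mat_mul(A, B)` is $2 \times 2$ while `mat_mul(B, A)` is $3 \times 3$, and `mat_mul(A, A)` raises `ValueError` because the inner sizes do not match.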

Remarks Notice that $A$ and $B$ need not be the same size. However, the number of columns of $A$ must be the same as the number of rows of $B$. If we write the sizes of $A$, $B$, and $AB$ in order, we can see at a gl [...]

\[ A'(A\mathbf{x}) = A'\mathbf{b} \;\Rightarrow\; (A'A)\mathbf{x} = A'\mathbf{b} \;\Rightarrow\; I\mathbf{x} = A'\mathbf{b} \;\Rightarrow\; \mathbf{x} = A'\mathbf{b} \]

(Why would each of these steps be justified?) Our goal in this section is to determine precisely when we can find such a matrix $A'$. In fact, we are going to insist on a bit more: We want not only $A'A = I$ but also $AA' = I$. This requirement forces $A$ and $A'$ to be square matrices. (Why?)

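Definition

If $A$ is an $n \times n$ matrix, an inverse of $A$ is an $n \times n$ matrix $A'$ with the property that

\[ AA' = I \quad \text{and} \quad A'A = I \]

where $I = I_n$ is the $n \times n$ identity matrix. If such an $A'$ exists, then $A$ is called invertible.

Example 3.22

If $A = \begin{bmatrix} 2 & 5 \\ 1 & 3 \end{bmatrix}$, then $A' = \begin{bmatrix} 3 & -5 \\ -1 & 2 \end{bmatrix}$ is an inverse of $A$, since

\[ AA' = \begin{bmatrix} 2 & 5 \\ 1 & 3 \end{bmatrix}\begin{bmatrix} 3 & -5 \\ -1 & 2 \end{bmatrix} = \begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix} \quad \text{and} \quad A'A = \begin{bmatrix} 3 & -5 \\ -1 & 2 \end{bmatrix}\begin{bmatrix} 2 & 5 \\ 1 & 3 \end{bmatrix} = \begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix} \]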
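The two products in Example 3.22 are easy to confirm by machine. A minimal Python sketch follows; the names `mat_mul`, `identity`, and `is_inverse` are our own and are not notation from the text.

```python
def mat_mul(A, B):
    """Matrix product of A and B, both given as lists of rows."""
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

def identity(n):
    """The n x n identity matrix."""
    return [[1 if i == j else 0 for j in range(n)] for i in range(n)]

def is_inverse(A, A_prime):
    """Check the defining property AA' = I and A'A = I."""
    I = identity(len(A))
    return mat_mul(A, A_prime) == I and mat_mul(A_prime, A) == I

A = [[2, 5], [1, 3]]
A_prime = [[3, -5], [-1, 2]]

print(is_inverse(A, A_prime))  # True
```

Note that the check multiplies in both orders, exactly as the definition requires.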

Example 3.23

Show that the following matrices are not invertible:

(a) $O = \begin{bmatrix} 0 & 0 \\ 0 & 0 \end{bmatrix}$ (b) $B = \begin{bmatrix} 1 & 2 \\ 2 & 4 \end{bmatrix}$

Solution (a) It is easy to see that the zero matrix $O$ does not have an inverse. If it did, then there would be a matrix $O'$ such that $OO' = I = O'O$. But the product of the zero matrix with any other matrix is the zero matrix, and so $OO'$ could never equal the identity matrix $I$. (Notice that this proof makes no reference to the size of the matrices and so is true for $n \times n$ matrices in general.)

(b) Suppose $B$ has an inverse $B' = \begin{bmatrix} w & x \\ y & z \end{bmatrix}$. The equation $BB' = I$ gives

\[ \begin{bmatrix} 1 & 2 \\ 2 & 4 \end{bmatrix}\begin{bmatrix} w & x \\ y & z \end{bmatrix} = \begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix} \]

from which we get the equations

\[ \begin{aligned} w + 2y &= 1 \\ x + 2z &= 0 \\ 2w + 4y &= 0 \\ 2x + 4z &= 1 \end{aligned} \]

Subtracting twice the first equation from the third yields $0 = -2$, which is clearly absurd. Thus, there is no solution. (Row reduction gives the same result but is not really needed here.) We deduce that no such matrix $B'$ exists; that is, $B$ is not invertible. (In fact, it does not even have an inverse th [...]

\[ \begin{aligned} [\ldots]\; BXA = B^{-3}AB^{-3}A &\;\Rightarrow\; B^{-1}BXAA^{-1} = B^{-1}B^{-3}AB^{-3}AA^{-1} \\ &\;\Rightarrow\; IXI = B^{-4}AB^{-3}I \\ &\;\Rightarrow\; X = B^{-4}AB^{-3} \end{aligned} \]

(Can you justify each step?) Note the careful use of Theorem 3.9(c) and the expansion of $(A^{-1}B^3)^2$. We have also made liberal use of the associativity of matrix multiplication to simplify the placement (or elimination) of parentheses.

Elementary Matrices

We are going to use matrix multiplication to take a different perspective on the row reduction of matrices. In the process, you will discover many new and important insights into the nature of invertible matrices.

If

\[ E = \begin{bmatrix} 1 & 0 & 0 \\ 0 & 0 & 1 \\ 0 & 1 & 0 \end{bmatrix} \quad \text{and} \quad A = \begin{bmatrix} 5 & 7 \\ -1 & 0 \\ 8 & 3 \end{bmatrix} \]

we find that

\[ EA = \begin{bmatrix} 5 & 7 \\ 8 & 3 \\ -1 & 0 \end{bmatrix} \]

In other words, multiplying $A$ by $E$ (on the left) has the same effect as interchanging rows 2 and 3 of $A$. What is significant about $E$? It is simply the matrix we obtain by applying the same elementary row operation, $R_2 \leftrightarrow R_3$, to the identity matrix $I_3$. It turns out that this always works.

Definition

An elementary matrix is any matrix that can be obtained by performing an elementary row operation on an identity matrix.

Since there are three types of elementary row operations, there are three corresponding types of elementary matrices. Here are some more elementary matrices.

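Example 3.27

Let

\[ E_1 = \begin{bmatrix} 1 & 0 & 0 & 0 \\ 0 & 3 & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \end{bmatrix}, \quad E_2 = \begin{bmatrix} 0 & 0 & 1 & 0 \\ 0 & 1 & 0 & 0 \\ 1 & 0 & 0 & 0 \\ 0 & 0 & 0 & 1 \end{bmatrix}, \quad \text{and} \quad E_3 = \begin{bmatrix} 1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & -2 & 0 & 1 \end{bmatrix} \]

Each of these matrices has been obtained from the identity matrix $I_4$ by applying a single elementary row operation. The matrix $E_1$ corresponds to $3R_2$, $E_2$ to $R_1 \leftrightarrow R_3$, and $E_3$ to $R_4 - 2R_2$. Observe that when we left-multiply a $4 \times n$ matrix by one of these elementary matrices, the corresponding elementary row operation is performed on the matrix. For example, if

\[ A = \begin{bmatrix} a_{11} & a_{12} & a_{13} \\ a_{21} & a_{22} & a_{23} \\ a_{31} & a_{32} & a_{33} \\ a_{41} & a_{42} & a_{43} \end{bmatrix} \]

then

\[ E_1A = \begin{bmatrix} a_{11} & a_{12} & a_{13} \\ 3a_{21} & 3a_{22} & 3a_{23} \\ a_{31} & a_{32} & a_{33} \\ a_{41} & a_{42} & a_{43} \end{bmatrix}, \quad E_2A = \begin{bmatrix} a_{31} & a_{32} & a_{33} \\ a_{21} & a_{22} & a_{23} \\ a_{11} & a_{12} & a_{13} \\ a_{41} & a_{42} & a_{43} \end{bmatrix}, \]

and

\[ E_3A = \begin{bmatrix} a_{11} & a_{12} & a_{13} \\ a_{21} & a_{22} & a_{23} \\ a_{31} & a_{32} & a_{33} \\ a_{41} - 2a_{21} & a_{42} - 2a_{22} & a_{43} - 2a_{23} \end{bmatrix} \]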

Example 3.27 and Exercises 24-30 should convince you that any elementary row operation on any matrix can be accomplished by left-multiplying by a suitable elementary matrix. We record this fact as a theorem, the proof of which is omitted.

Theorem 3.10

Let $E$ be the elementary matrix obtained by performing an elementary row operation on $I_n$. If the same elementary row operation is performed on an $n \times r$ matrix $A$, the result is the same as the matrix $EA$.

From a computational point of view, it is not a good idea to use elementary matrices to perform elementary row operations; just do them directly. However, elementary matrices can provide some valuable insights into invertible matrices and the solution of systems of linear equations. We have already observed that every elementary row operation can be "undone," or "reversed." This same observation applied to elementary matrices shows us that they are invertible.
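To see Theorem 3.10 in action, the following Python sketch performs each type of row operation directly and compares the result with left-multiplication by the matching elementary matrix. The helper names are our own, rows are indexed from 0 (so row 2 of the text is index 1), and the sample matrix is the $3 \times 2$ matrix used at the start of this discussion.

```python
def identity(n):
    return [[1 if i == j else 0 for j in range(n)] for i in range(n)]

def mat_mul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

def swap_rows(M, i, j):
    """R_i <-> R_j (returns a new matrix)."""
    M = [row[:] for row in M]
    M[i], M[j] = M[j], M[i]
    return M

def scale_row(M, i, c):
    """c * R_i (returns a new matrix)."""
    M = [row[:] for row in M]
    M[i] = [c * v for v in M[i]]
    return M

def add_multiple(M, i, j, c):
    """R_i + c * R_j (returns a new matrix)."""
    M = [row[:] for row in M]
    M[i] = [a + c * b for a, b in zip(M[i], M[j])]
    return M

A = [[5, 7], [-1, 0], [8, 3]]

# Build each elementary matrix by applying the operation to I_3,
# then compare E*A with the operation applied directly to A.
E_swap = swap_rows(identity(3), 1, 2)
E_scale = scale_row(identity(3), 1, 3)
E_add = add_multiple(identity(3), 2, 0, -2)

print(mat_mul(E_swap, A))                                 # [[5, 7], [8, 3], [-1, 0]]
print(mat_mul(E_scale, A) == scale_row(A, 1, 3))          # True
print(mat_mul(E_add, A) == add_multiple(A, 2, 0, -2))     # True
```

The direct row operations touch one or two rows, whereas the matrix products compute every entry, which is one way to see why the direct approach is preferred in practice.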

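Example 3.28

Let

\[ E_1 = \begin{bmatrix} 1 & 0 & 0 \\ 0 & 0 & 1 \\ 0 & 1 & 0 \end{bmatrix}, \quad E_2 = \begin{bmatrix} 1 & 0 & 0 \\ 0 & 4 & 0 \\ 0 & 0 & 1 \end{bmatrix}, \quad \text{and} \quad E_3 = \begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ -2 & 0 & 1 \end{bmatrix} \]

Then $E_1$ corresponds to $R_2 \leftrightarrow R_3$, which is undone by doing $R_2 \leftrightarrow R_3$ again. Thus, $E_1^{-1} = E_1$. (Check by showing that $E_1^2 = E_1E_1 = I$.) The matrix $E_2$ comes from $4R_2$, which is undone by performing $\frac{1}{4}R_2$. Thus,

\[ E_2^{-1} = \begin{bmatrix} 1 & 0 & 0 \\ 0 & \frac{1}{4} & 0 \\ 0 & 0 & 1 \end{bmatrix} \]

which can be easily checked. Finally, $E_3$ corresponds to the elementary row operation $R_3 - 2R_1$, which can be undone by the elementary row operation $R_3 + 2R_1$. So, in this case,

\[ E_3^{-1} = \begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 2 & 0 & 1 \end{bmatrix} \]

(Again, it is easy to check this by confirming that the product of this matrix and $E_3$, in both orders, is $I$.)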

Notice that not only is each elementary matrix invertible, but its inverse is another elementary matrix of the same type. We record this finding as the next theorem.

Theorem 3.11

Each elementary matrix is invertible, and its inverse is an elementary matrix of the same type.
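The "undo" argument can be checked with exact arithmetic. The sketch below uses Python's `fractions.Fraction` so that $\frac{1}{4}$ is represented exactly; the matrices are the three elementary matrices of Example 3.28 as reconstructed from the row operations named there ($R_2 \leftrightarrow R_3$, $4R_2$, $R_3 - 2R_1$), and the helper names are our own.

```python
from fractions import Fraction

def mat_mul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

I3 = [[1, 0, 0], [0, 1, 0], [0, 0, 1]]

E1 = [[1, 0, 0], [0, 0, 1], [0, 1, 0]]            # R2 <-> R3
E2 = [[1, 0, 0], [0, 4, 0], [0, 0, 1]]            # 4 * R2
E3 = [[1, 0, 0], [0, 1, 0], [-2, 0, 1]]           # R3 - 2 * R1

E1_inv = E1                                            # swap again
E2_inv = [[1, 0, 0], [0, Fraction(1, 4), 0], [0, 0, 1]]  # (1/4) * R2
E3_inv = [[1, 0, 0], [0, 1, 0], [2, 0, 1]]             # R3 + 2 * R1

for E, E_inv in [(E1, E1_inv), (E2, E2_inv), (E3, E3_inv)]:
    # The product must be the identity in both orders.
    assert mat_mul(E, E_inv) == I3
    assert mat_mul(E_inv, E) == I3
print("all three inverses check out")
```

In each case the inverse is itself obtained from the identity by a single row operation of the same type, as the theorem states.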

The Fundamental Theorem of Invertible Matrices

We are now in a position to prove one of the main results in this book: a set of equivalent characterizations of what it means for a matrix to be invertible. In a sense, much of linear algebra is connected to this theorem, either in the development of these characterizations or in their application. As you might expect, given this introduction, we will use this theorem a great deal. Make it your friend! We refer to Theorem 3.12 as the first version of the Fundamental Theorem, since we will add to it in subsequent chapters. You are reminded that, when we say that a set of statements about a matrix $A$ are equivalent, we mean that, for a given $A$, the statements are either all true or all false.

Theorem 3.12 The Fundamental Theorem of Invertible Matrices: Version 1

Let $A$ be an $n \times n$ matrix. The following statements are equivalent:

a. $A$ is invertible.
b. $A\mathbf{x} = \mathbf{b}$ has a unique solution for every $\mathbf{b}$ in $\mathbb{R}^n$.
c. $A\mathbf{x} = \mathbf{0}$ has only the trivial solution.
d. The reduced row echelon form of $A$ is $I_n$.
e. $A$ is a product of elementary matrices.

Proof We will establish the theorem by proving the circular chain of implications

\[ (a) \Rightarrow (b) \Rightarrow (c) \Rightarrow (d) \Rightarrow (e) \Rightarrow (a) \]

(a) $\Rightarrow$ (b) We have already shown that if $A$ is invertible, then $A\mathbf{x} = \mathbf{b}$ has the unique solution $\mathbf{x} = A^{-1}\mathbf{b}$ for any $\mathbf{b}$ in $\mathbb{R}^n$ (Theorem 3.7).

(b) $\Rightarrow$ (c) Assume that $A\mathbf{x} = \mathbf{b}$ has a unique solution for any $\mathbf{b}$ in $\mathbb{R}^n$. This implies, in particular, that $A\mathbf{x} = \mathbf{0}$ has a unique solution. But a homogeneous system $A\mathbf{x} = \mathbf{0}$ always has $\mathbf{x} = \mathbf{0}$ as one solution. So in this case, $\mathbf{x} = \mathbf{0}$ must be the solution.

(c) $\Rightarrow$ (d) Suppose that $A\mathbf{x} = \mathbf{0}$ has only the trivial solution. The corresponding system of equations is

\[ \begin{aligned} a_{11}x_1 + a_{12}x_2 + \cdots + a_{1n}x_n &= 0 \\ a_{21}x_1 + a_{22}x_2 + \cdots + a_{2n}x_n &= 0 \\ &\;\;\vdots \\ a_{n1}x_1 + a_{n2}x_2 + \cdots + a_{nn}x_n &= 0 \end{aligned} \]

and we are assuming that its solution is

\[ x_1 = 0, \quad x_2 = 0, \quad \ldots, \quad x_n = 0 \]

In other words, Gauss-Jordan elimination applied to the augmented matrix of the system gives

\[ [A \mid \mathbf{0}] = \left[\begin{array}{cccc|c} a_{11} & a_{12} & \cdots & a_{1n} & 0 \\ a_{21} & a_{22} & \cdots & a_{2n} & 0 \\ \vdots & \vdots & \ddots & \vdots & \vdots \\ a_{n1} & a_{n2} & \cdots & a_{nn} & 0 \end{array}\right] \longrightarrow \left[\begin{array}{cccc|c} 1 & 0 & \cdots & 0 & 0 \\ 0 & 1 & \cdots & 0 & 0 \\ \vdots & \vdots & \ddots & \vdots & \vdots \\ 0 & 0 & \cdots & 1 & 0 \end{array}\right] = [I_n \mid \mathbf{0}] \]

Thus, the reduced row echelon form of $A$ is $I_n$.

(d) $\Rightarrow$ (e) If we assume that the reduced row echelon form of $A$ is $I_n$, then $A$ can be reduced to $I_n$ using a finite sequence of elementary row operations. By Theorem 3.10, each one of these elementary row operations can be achieved by left-multiplying by an appropriate elementary matrix. If the appropriate sequence of elementary matrices is $E_1, E_2, \ldots, E_k$ (in that order), then we have

\[ E_k \cdots E_2E_1A = I_n \]

According to Theorem 3.11, these elementary matrices are all invertible. Therefore, so is their product, and we have

\[ A = (E_k \cdots E_2E_1)^{-1}I_n = (E_k \cdots E_2E_1)^{-1} = E_1^{-1}E_2^{-1} \cdots E_k^{-1} \]

Again, each $E_i^{-1}$ is another elementary matrix, by Theorem 3.11, so we have written $A$ as a product of elementary matrices, as required.

(e) $\Rightarrow$ (a) If $A$ is a product of elementary matrices, then $A$ is invertible, since elementary matrices are invertible and products of invertible matrices are invertible.

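Example 3.29

If possible, express $A = \begin{bmatrix} 2 & 3 \\ 1 & 3 \end{bmatrix}$ as a product of elementary matrices.

Solution We row reduce $A$ as follows:

\[ A = \begin{bmatrix} 2 & 3 \\ 1 & 3 \end{bmatrix} \xrightarrow{R_1 \leftrightarrow R_2} \begin{bmatrix} 1 & 3 \\ 2 & 3 \end{bmatrix} \xrightarrow{R_2 - 2R_1} \begin{bmatrix} 1 & 3 \\ 0 & -3 \end{bmatrix} \xrightarrow{R_1 + R_2} \begin{bmatrix} 1 & 0 \\ 0 & -3 \end{bmatrix} \xrightarrow{-\frac{1}{3}R_2} \begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix} = I_2 \]

Thus, the reduced row echelon form of $A$ is the identity matrix, so the Fundamental Theorem assures us that $A$ is invertible and can be written as a product of elementary matrices. We have $E_4E_3E_2E_1A = I$, where

\[ E_1 = \begin{bmatrix} 0 & 1 \\ 1 & 0 \end{bmatrix}, \quad E_2 = \begin{bmatrix} 1 & 0 \\ -2 & 1 \end{bmatrix}, \quad E_3 = \begin{bmatrix} 1 & 1 \\ 0 & 1 \end{bmatrix}, \quad E_4 = \begin{bmatrix} 1 & 0 \\ 0 & -\frac{1}{3} \end{bmatrix} \]

are the elementary matrices corresponding to the four elementary row operations used to reduce $A$ to $I$. As in the proof of the theorem, we have

\[ A = (E_4E_3E_2E_1)^{-1} = E_1^{-1}E_2^{-1}E_3^{-1}E_4^{-1} = \begin{bmatrix} 0 & 1 \\ 1 & 0 \end{bmatrix}\begin{bmatrix} 1 & 0 \\ 2 & 1 \end{bmatrix}\begin{bmatrix} 1 & -1 \\ 0 & 1 \end{bmatrix}\begin{bmatrix} 1 & 0 \\ 0 & -3 \end{bmatrix} \]

as required.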

Remark

Because the sequence of elementary row operations that transforms $A$ into $I$ is not unique, neither is the representation of $A$ as a product of elementary matrices. (Find a different way to express $A$ as a product of elementary matrices.)
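A factorization of this kind is easy to verify by multiplying it out. The Python sketch below assumes the matrix and the four row operations of Example 3.29 as reconstructed here ($R_1 \leftrightarrow R_2$, $R_2 - 2R_1$, $R_1 + R_2$, $-\frac{1}{3}R_2$); the helper names are illustrative.

```python
from fractions import Fraction
from functools import reduce

def mat_mul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

def product(*matrices):
    """Multiply matrices left to right."""
    return reduce(mat_mul, matrices)

A = [[2, 3], [1, 3]]

# Elementary matrices for the four row operations, in the order applied
E1 = [[0, 1], [1, 0]]                 # R1 <-> R2
E2 = [[1, 0], [-2, 1]]                # R2 - 2 R1
E3 = [[1, 1], [0, 1]]                 # R1 + R2
E4 = [[1, 0], [0, Fraction(-1, 3)]]   # (-1/3) R2

# Their inverses undo each operation
E1_inv = [[0, 1], [1, 0]]
E2_inv = [[1, 0], [2, 1]]
E3_inv = [[1, -1], [0, 1]]
E4_inv = [[1, 0], [0, -3]]

print(product(E4, E3, E2, E1, A) == [[1, 0], [0, 1]])   # True
print(product(E1_inv, E2_inv, E3_inv, E4_inv))          # [[2, 3], [1, 3]]
```

Note the reversal of order: the operations are applied as $E_4E_3E_2E_1$, but $A$ is recovered as $E_1^{-1}E_2^{-1}E_3^{-1}E_4^{-1}$.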

"Never bring a cannon on stage in Act I unless you intend to fire it by the last aC1." - Anton Chekhov

The Fundamental Theorem is surprisingly powerful. To illustrate its power, we consider two of its consequences. The first is that, although the definition of an invertible matrix states that a matrix $A$ is invertible if there is a matrix $B$ such that both $AB = I$ and $BA = I$ are satisfied, we need only check one of these equations. Thus, we can cut our work in half!

Theorem 3.13

Let $A$ be a square matrix. If $B$ is a square matrix such that either $AB = I$ or $BA = I$, then $A$ is invertible and $B = A^{-1}$.

Proof Suppose $BA = I$. Consider the equation $A\mathbf{x} = \mathbf{0}$. Left-multiplying by $B$, we have $BA\mathbf{x} = B\mathbf{0}$. This implies that $\mathbf{x} = I\mathbf{x} = \mathbf{0}$. Thus, the system represented by $A\mathbf{x} = \mathbf{0}$ has the unique solution $\mathbf{x} = \mathbf{0}$. From the equivalence of (c) and (a) in the Fundamental Theorem, we know that $A$ is invertible. (That is, $A^{-1}$ exists and satisfies $AA^{-1} = I = A^{-1}A$.) If we now right-multiply both sides of $BA = I$ by $A^{-1}$, we obtain

\[ BAA^{-1} = IA^{-1} \;\Rightarrow\; BI = A^{-1} \;\Rightarrow\; B = A^{-1} \]

(The proof in the case of $AB = I$ is left as Exercise 41.)

The next consequence of the Fundamental Theorem is the basis for an efficient method of computing the inverse of a matrix.

Theorem 3.14

Let $A$ be a square matrix. If a sequence of elementary row operations reduces $A$ to $I$, then the same sequence of elementary row operations transforms $I$ into $A^{-1}$.

Proof If $A$ is row equivalent to $I$, then we can achieve the reduction by left-multiplying by a sequence $E_1, E_2, \ldots, E_k$ of elementary matrices. Therefore, we have $E_k \cdots E_2E_1A = I$. Setting $B = E_k \cdots E_2E_1$ gives $BA = I$. By Theorem 3.13, $A$ is invertible and $A^{-1} = B$. Now applying the same sequence of elementary row operations to $I$ is equivalent to left-multiplying $I$ by $E_k \cdots E_2E_1 = B$. The result is

\[ E_k \cdots E_2E_1I = BI = B = A^{-1} \]

Thus, $I$ is transformed into $A^{-1}$ by the same sequence of elementary row operations.

The Gauss-Jordan Method for Computing the Inverse

We can perform row operations on $A$ and $I$ simultaneously by constructing a "superaugmented matrix" $[A \mid I]$. Theorem 3.14 shows that if $A$ is row equivalent to $I$ [which, by the Fundamental Theorem (d) $\Leftrightarrow$ (a), means that $A$ is invertible], then elementary row operations will yield

\[ [A \mid I] \longrightarrow [I \mid A^{-1}] \]

If $A$ cannot be reduced to $I$, then the Fundamental Theorem guarantees us that $A$ is not invertible.

The procedure just described is simply Gauss-Jordan elimination performed on an $n \times 2n$, instead of an $n \times (n + 1)$, augmented matrix. Another way to view this procedure is to look at the problem of finding $A^{-1}$ as solving the matrix equation $AX = I_n$ for an $n \times n$ matrix $X$. (This is sufficient, by the Fundamental Theorem, since a right inverse of $A$ must be a two-sided inverse.) If we denote the columns of $X$ by $\mathbf{x}_1, \ldots, \mathbf{x}_n$, then this matrix equation is equivalent to solving for the columns of $X$, one at a time. Since the columns of $I_n$ are the standard unit vectors $\mathbf{e}_1, \ldots, \mathbf{e}_n$, we thus have $n$ systems of linear equations, all with coefficient matrix $A$:

\[ A\mathbf{x}_1 = \mathbf{e}_1, \quad \ldots, \quad A\mathbf{x}_n = \mathbf{e}_n \]

Since the same sequence of row operations is needed to bring $A$ to reduced row echelon form in each case, the augmented matrices for these systems, $[A \mid \mathbf{e}_1], \ldots, [A \mid \mathbf{e}_n]$, can be combined as

\[ [A \mid \mathbf{e}_1 \; \mathbf{e}_2 \; \cdots \; \mathbf{e}_n] = [A \mid I_n] \]

We now apply row operations to try to reduce $A$ to $I_n$, which, if successful, will simultaneously solve for the columns of $A^{-1}$, transforming $I_n$ into $A^{-1}$. We illustrate this use of Gauss-Jordan elimination with three examples.
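The superaugmented-matrix procedure can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions: it uses exact `Fraction` arithmetic, clears each column completely as it goes (the "easier by hand" variant), and returns `None` when $A$ cannot be reduced to $I$. The function name `inverse` is our own. The two test matrices are the $2 \times 2$ matrices met earlier in this section.

```python
from fractions import Fraction

def inverse(A):
    """Row reduce [A | I] to [I | A^-1]; return None if A is not invertible."""
    n = len(A)
    # Build the n x 2n superaugmented matrix [A | I].
    M = [[Fraction(v) for v in row] + [Fraction(int(i == j)) for j in range(n)]
         for i, row in enumerate(A)]
    for col in range(n):
        # Find a row at or below the diagonal with a nonzero entry in this column.
        pivot = next((r for r in range(col, n) if M[r][col] != 0), None)
        if pivot is None:
            return None                     # left block cannot become I
        M[col], M[pivot] = M[pivot], M[col]  # row interchange
        p = M[col][col]
        M[col] = [v / p for v in M[col]]     # create the leading 1
        for r in range(n):
            if r != col and M[r][col] != 0:
                factor = M[r][col]
                M[r] = [a - factor * b for a, b in zip(M[r], M[col])]
    return [row[n:] for row in M]            # right block is A^-1

print(inverse([[2, 5], [1, 3]]) == [[3, -5], [-1, 2]])  # True
print(inverse([[1, 2], [2, 4]]))                         # None
```

Every step inside the loop is one of the three elementary row operations, applied to both halves of $[A \mid I]$ at once, which is exactly why the right half ends up as $A^{-1}$.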

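Example 3.30

Find the inverse of

\[ A = \begin{bmatrix} 1 & 2 & -1 \\ 2 & 2 & 4 \\ 1 & 3 & -3 \end{bmatrix} \]

if it exists.

Solution Gauss-Jordan elimination produces

\[ [A \mid I] = \left[\begin{array}{rrr|rrr} 1 & 2 & -1 & 1 & 0 & 0 \\ 2 & 2 & 4 & 0 & 1 & 0 \\ 1 & 3 & -3 & 0 & 0 & 1 \end{array}\right] \]

\[ \xrightarrow{\substack{R_2 - 2R_1 \\ R_3 - R_1}} \left[\begin{array}{rrr|rrr} 1 & 2 & -1 & 1 & 0 & 0 \\ 0 & -2 & 6 & -2 & 1 & 0 \\ 0 & 1 & -2 & -1 & 0 & 1 \end{array}\right] \]

\[ \xrightarrow{-\frac{1}{2}R_2} \left[\begin{array}{rrr|rrr} 1 & 2 & -1 & 1 & 0 & 0 \\ 0 & 1 & -3 & 1 & -\frac{1}{2} & 0 \\ 0 & 1 & -2 & -1 & 0 & 1 \end{array}\right] \]

\[ \xrightarrow{R_3 - R_2} \left[\begin{array}{rrr|rrr} 1 & 2 & -1 & 1 & 0 & 0 \\ 0 & 1 & -3 & 1 & -\frac{1}{2} & 0 \\ 0 & 0 & 1 & -2 & \frac{1}{2} & 1 \end{array}\right] \]

\[ \xrightarrow{\substack{R_1 + R_3 \\ R_2 + 3R_3}} \left[\begin{array}{rrr|rrr} 1 & 2 & 0 & -1 & \frac{1}{2} & 1 \\ 0 & 1 & 0 & -5 & 1 & 3 \\ 0 & 0 & 1 & -2 & \frac{1}{2} & 1 \end{array}\right] \]

\[ \xrightarrow{R_1 - 2R_2} \left[\begin{array}{rrr|rrr} 1 & 0 & 0 & 9 & -\frac{3}{2} & -5 \\ 0 & 1 & 0 & -5 & 1 & 3 \\ 0 & 0 & 1 & -2 & \frac{1}{2} & 1 \end{array}\right] \]

Therefore,

\[ A^{-1} = \begin{bmatrix} 9 & -\frac{3}{2} & -5 \\ -5 & 1 & 3 \\ -2 & \frac{1}{2} & 1 \end{bmatrix} \]

(You should always check that $AA^{-1} = I$ by direct multiplication. By Theorem 3.13, we do not need to check that $A^{-1}A = I$ too.)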

Remark

Notice that we have used the variant of Gauss-Jordan elimination that first introduces all of the zeros below the leading 1s, from left to right and top to bottom, and then creates zeros above the leading 1s, from right to left and bottom to top. This approach saves on calculations, as we noted in Chapter 2, but you may find it easier, when working by hand, to create all of the zeros in each column as you go. The answer, of course, will be the same.

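Example 3.31

Find the inverse of

\[ A = \begin{bmatrix} 2 & 1 & -4 \\ -4 & -1 & 6 \\ -2 & 2 & -2 \end{bmatrix} \]

if it exists.

Solution We proceed as in Example 3.30, adjoining the identity matrix to $A$ and then trying to manipulate $[A \mid I]$ into $[I \mid A^{-1}]$.

\[ [A \mid I] = \left[\begin{array}{rrr|rrr} 2 & 1 & -4 & 1 & 0 & 0 \\ -4 & -1 & 6 & 0 & 1 & 0 \\ -2 & 2 & -2 & 0 & 0 & 1 \end{array}\right] \]

\[ \xrightarrow{\substack{R_2 + 2R_1 \\ R_3 + R_1}} \left[\begin{array}{rrr|rrr} 2 & 1 & -4 & 1 & 0 & 0 \\ 0 & 1 & -2 & 2 & 1 & 0 \\ 0 & 3 & -6 & 1 & 0 & 1 \end{array}\right] \]

\[ \xrightarrow{R_3 - 3R_2} \left[\begin{array}{rrr|rrr} 2 & 1 & -4 & 1 & 0 & 0 \\ 0 & 1 & -2 & 2 & 1 & 0 \\ 0 & 0 & 0 & -5 & -3 & 1 \end{array}\right] \]

At this point, we see that it is not possible to reduce $A$ to $I$, since there is a row of zeros on the left-hand side of the augmented matrix. Consequ [...]

[...] in other words, working from top to bottom in each column. (Why?) In this example, we have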

[: 1~

1 - I

fl. •.111,

1 -I

0 1 3 -3

R. - ~R ,

0

1

0

0

0

9

4

5

•

1 - 1

(c) $\Rightarrow$ (h) If $\text{rank}(A) = n$, then the reduced row echelon form of $A$ has $n$ leading 1s and so is $I_n$. From (d) $\Rightarrow$ (c) we know that $A\mathbf{x} = \mathbf{0}$ has only the trivial solution, which implies that the column vectors of $A$ are linearly independent, since $A\mathbf{x}$ is just a linear combination of the column vectors of $A$.

(h) $\Rightarrow$ (i) If the column vectors of $A$ are linearly independent, then $A\mathbf{x} = \mathbf{0}$ has only the trivial solution. Thus, by (c) $\Rightarrow$ (b), $A\mathbf{x} = \mathbf{b}$ has a unique solution for every $\mathbf{b}$ in $\mathbb{R}^n$. This means that every vector $\mathbf{b}$ in $\mathbb{R}^n$ can be written as a linear combination of the column vectors of $A$, establishing (i).

(i) $\Rightarrow$ (j) If the column vectors of $A$ span $\mathbb{R}^n$, then $\text{col}(A) = \mathbb{R}^n$ by definition, so $\text{rank}(A) = \dim(\text{col}(A)) = n$. This is (f), and we have already established that (f) $\Rightarrow$ (h). We conclude that the column vectors of $A$ are linearly independent and so form a basis for $\mathbb{R}^n$, since, by assumption, they also span $\mathbb{R}^n$.

(j) $\Rightarrow$ (f) If the column vectors of $A$ form a basis for $\mathbb{R}^n$, then, in particular, they are linearly independent. It follows that the reduced row echelon form of $A$ contains $n$ leading 1s, and thus $\text{rank}(A) = n$.

The above discussion shows that (f) $\Rightarrow$ (d) $\Rightarrow$ (c) $\Rightarrow$ (h) $\Rightarrow$ (i) $\Rightarrow$ (j) $\Rightarrow$ (f) $\Leftrightarrow$ (g). Now recall that, by Theorem 3.25, $\text{rank}(A^T) = \text{rank}(A)$, so what we have just proved gives us the corresponding results about the column vectors of $A^T$. These are then results about the row vectors of $A$, bringing (k), (l), [...]

In many applications that can be modeled by a graph, the vertices are ordered by some type of relation that imposes a direction on the edges. For example, directed edges might be used to represent one-way routes in a graph that models a transportation network or predator-prey relationships in a graph modeling an ecosystem. A graph with directed edges is called a digraph. Figure 3.26 shows an example. An easy modification to the definition of adjacency matrices allows us to use them with digraphs.

Definition

If $G$ is a digraph with $n$ vertices, then its adjacency matrix is the $n \times n$ matrix $A$ [or $A(G)$] defined by

\[ a_{ij} = \begin{cases} 1 & \text{if there is an edge from vertex } i \text{ to vertex } j \\ 0 & \text{otherwise} \end{cases} \]

Figure 3.26 A digraph

Thus, the adjacency matrix for the digraph in Figure 3.26 is

\[ A = \begin{bmatrix} 0 & 1 & 0 & 1 \\ 0 & 0 & 0 & 1 \\ 1 & 0 & 0 & 0 \\ 1 & 0 & 1 & 0 \end{bmatrix} \]

Not surprisingly, the adjacency matrix of a digraph is not symmetric in general. (When would it be?) You should have no difficulty seeing that $A^k$ now contains the numbers of directed $k$-paths between vertices, where we insist that all edges along a path flow in the same direction. (See Exercise 54.) The next example gives an application of this idea.
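The path-counting claim can be explored numerically. The following Python sketch assumes the $4 \times 4$ adjacency matrix of Figure 3.26 as read from this copy; the helper names `mat_mul`, `mat_pow`, and `is_symmetric` are our own.

```python
def mat_mul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

def mat_pow(A, k):
    """A multiplied by itself k times (k >= 1)."""
    result = A
    for _ in range(k - 1):
        result = mat_mul(result, A)
    return result

def is_symmetric(A):
    n = len(A)
    return all(A[i][j] == A[j][i] for i in range(n) for j in range(n))

# Assumed adjacency matrix of the digraph in Figure 3.26
A = [[0, 1, 0, 1],
     [0, 0, 0, 1],
     [1, 0, 0, 0],
     [1, 0, 1, 0]]

A2 = mat_pow(A, 2)
print(is_symmetric(A))  # False
print(A2)               # [[1, 0, 1, 1], [1, 0, 1, 0], [0, 1, 0, 1], [1, 1, 0, 1]]
```

For instance, the $(1, 3)$ entry of $A^2$ is 1, corresponding to the single directed 2-path from vertex 1 to vertex 3 through vertex 4.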

Example 3.68

D

w

Five tennis players (Davenport , Graf, Hingis, Scles, and Williams) compete 111 a round-robin tournament in which each player plays every other player once. The digraph in FIgure 3.27 summarizes the results. A directed edge from vertex i to vertex j means thai player I defeated player j . (A digraph 11\ whic.h there is exactly one directed edge between ellery pair of vertices is called a tQurnamerll.) The adjacency matrix for the digraph in Figure 3.27 is

G

A-

S

flglrI 3.:n A tournament

H

0 I 0 I I 0 0 I I I I 0 0 I 0 0 0 0 0 I 0 0 I 0 0

where the order of the vertices (and hence the rows and columns of $A$) is determined alphabetically. Thus, Graf corresponds to row 2 and column 2, for example. Suppose we wish to rank the five players, based on the results of their matches. One way to do this might be to count the number of wins for each player. Observe that the number of wins each player had is just the sum of the entries in the corresponding row; equivalently, the vector containing all the row sums is given by the product $A\mathbf{j}$, where

\[ \mathbf{j} = \begin{bmatrix} 1 \\ 1 \\ 1 \\ 1 \\ 1 \end{bmatrix} \]

In our case, we have

\[ A\mathbf{j} = \begin{bmatrix} 0 & 1 & 0 & 1 & 1 \\ 0 & 0 & 1 & 1 & 1 \\ 1 & 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 0 & 1 \\ 0 & 0 & 1 & 0 & 0 \end{bmatrix} \begin{bmatrix} 1 \\ 1 \\ 1 \\ 1 \\ 1 \end{bmatrix} = \begin{bmatrix} 3 \\ 3 \\ 2 \\ 1 \\ 1 \end{bmatrix} \]

which produces the following ranking:

First: Davenport, Graf (tie)
Second: Hingis
Third: Seles, Williams (tie)

Are the players who tied in this ranking equally strong? Davenport might argue that since she defeated Graf, she deserves first place. Seles would use the same type of argument to break the tie with Williams. However, Williams could argue that she has two "indirect" victories because she beat Hingis, who defeated two others; furthermore, she might note that Seles has only one indirect victory (over Williams, who then defeated Hingis). Since one player might not have defeated all the others with whom she ultimately ties, the notion of indirect wins seems more useful. Moreover, an indirect victory corresponds to a 2-path in the digraph, so we can use the square of the adjacency matrix. To compute both wins and indirect wins for each player, we need the row sums of the matrix $A + A^2$, which are given by

\[ (A + A^2)\mathbf{j} = \left( \begin{bmatrix} 0 & 1 & 0 & 1 & 1 \\ 0 & 0 & 1 & 1 & 1 \\ 1 & 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 0 & 1 \\ 0 & 0 & 1 & 0 & 0 \end{bmatrix} + \begin{bmatrix} 0 & 0 & 2 & 1 & 2 \\ 1 & 0 & 1 & 1 & 1 \\ 0 & 1 & 0 & 1 & 2 \\ 0 & 0 & 1 & 0 & 0 \\ 1 & 0 & 0 & 1 & 0 \end{bmatrix} \right) \begin{bmatrix} 1 \\ 1 \\ 1 \\ 1 \\ 1 \end{bmatrix} = \begin{bmatrix} 0 & 1 & 2 & 2 & 3 \\ 1 & 0 & 2 & 2 & 2 \\ 1 & 1 & 0 & 2 & 2 \\ 0 & 0 & 1 & 0 & 1 \\ 1 & 0 & 1 & 1 & 0 \end{bmatrix} \begin{bmatrix} 1 \\ 1 \\ 1 \\ 1 \\ 1 \end{bmatrix} = \begin{bmatrix} 8 \\ 7 \\ 6 \\ 2 \\ 3 \end{bmatrix} \]

Thus, we would rank the players as follows: Davenport, Graf, Hingis, Williams, Seles. Unfortunately, this approach is not guaranteed to break all ties.
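The two rankings in this example can be reproduced with a few lines of code. The following is a minimal sketch in plain Python; the helper names (`mat_mul`, `mat_add`, `row_sums`) are illustrative and are not part of the text.

```python
# Tournament adjacency matrix; rows and columns are in alphabetical order.
PLAYERS = ["Davenport", "Graf", "Hingis", "Seles", "Williams"]
A = [
    [0, 1, 0, 1, 1],
    [0, 0, 1, 1, 1],
    [1, 0, 0, 1, 0],
    [0, 0, 0, 0, 1],
    [0, 0, 1, 0, 0],
]

def mat_mul(X, Y):
    """Product of two matrices stored as lists of rows."""
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y)))
             for j in range(len(Y[0]))] for i in range(len(X))]

def mat_add(X, Y):
    """Entrywise sum of two matrices of the same size."""
    return [[x + y for x, y in zip(rx, ry)] for rx, ry in zip(X, Y)]

def row_sums(X):
    """Row sums of X, i.e. the product X j with j the all-ones vector."""
    return [sum(row) for row in X]

wins = row_sums(A)                           # direct wins: A j
scores = row_sums(mat_add(A, mat_mul(A, A))) # wins plus indirect wins: (A + A^2) j

# Sort player indices by decreasing score.
order = sorted(range(len(PLAYERS)), key=lambda i: -scores[i])
ranking = [PLAYERS[i] for i in order]
print(wins, scores, ranking)
```

Because a row sum is the same as multiplying by the all-ones vector, no explicit vector `j` is needed in the code.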

Error-Correcting Codes

Section 1.4 discussed examples of error-detecting codes. We turn now to the problem of designing codes that can correct as well as detect certain types of errors. Our message will be a vector $\mathbf{x}$ in $\mathbb{Z}_2^k$ for some $k$, and we will encode it by using a matrix transformation $T: \mathbb{Z}_2^k \to \mathbb{Z}_2^n$ for some $n > k$. The vector $T(\mathbf{x})$ will be called a code vector. A simple example will serve to illustrate the approach we will take, which is a generalization of the parity-check vectors in Example 1.31.

Example 3.69

Suppose the message is a single binary digit: 0 or 1. If we encode the message by simply repeating it twice, then the code vectors are [0, 0] and [1, 1]. This code can detect single errors. For example, if we transmit [0, 0] and an error occurs in the first component, then [1, 0] is received and an error is detected, because this is not a legal code vector. However, the receiver cannot correct the error, since [1, 0] would also be the result of an error in the second component if [1, 1] had been transmitted. We can solve this problem by making the code vectors longer: repeating the message digit three times instead of two. Thus, 0 and 1 are encoded as [0, 0, 0] and [1, 1, 1], respectively. Now if a single error occurs, we can not only detect it but also correct it. For example, if [0, 1, 0] is received, then we know it must have been the result of a single error in the transmission of [0, 0, 0], since a single error in [1, 1, 1] could not have produced it.

Note that the code in Example 3.69 ... The matrix $P$ is called a parity check matrix for the code. Observe that

\[ PG = \begin{bmatrix} 0 \\ 0 \end{bmatrix} = O \]

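To see how these matrices come into play in the correction of errors, suppose we send 1 as $\mathbf{c} = [1\ 1\ 1]^T$, but a single error causes it to be received as $\mathbf{c}' = [1\ 0\ 1]^T$. We compute

\[ P\mathbf{c}' = \begin{bmatrix} 1 & 1 & 0 \\ 1 & 0 & 1 \end{bmatrix} \begin{bmatrix} 1 \\ 0 \\ 1 \end{bmatrix} = \begin{bmatrix} 1 \\ 0 \end{bmatrix} \neq \mathbf{0} \]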

so we know that $\mathbf{c}'$ cannot be a code vector. Where is the error? Notice that $P\mathbf{c}'$ is the second column of the parity check matrix $P$. This tells us that the error is in the second component of $\mathbf{c}'$ (which we will prove in Theorem 3.34 below) and allows us to correct the error. (Of course, in this example we could find the error faster without using matrices, but the idea is a useful one.) To generalize the ideas in the last example, we make the following definitions.
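The correction step just described (compute $P\mathbf{c}'$, match it against the columns of $P$, flip that component) can be sketched in code. This is an illustrative sketch only: it assumes the parity check matrix of the triple-repetition code is the $2 \times 3$ matrix over $\mathbb{Z}_2$ that encodes the checks $c_1 + c_2 = 0$ and $c_1 + c_3 = 0$, and the function names are hypothetical.

```python
# Assumed parity check matrix for the triple-repetition code (entries in Z_2).
P = [[1, 1, 0],
     [1, 0, 1]]

def syndrome(P, c):
    """Compute P c over Z_2 (all arithmetic modulo 2)."""
    return [sum(p * x for p, x in zip(row, c)) % 2 for row in P]

def correct_single_error(P, c):
    """Return c with at most one flipped bit repaired.

    A zero syndrome means c passes every parity check. Otherwise the
    syndrome is compared with the columns of P; a match in column j
    points at an error in component j.
    """
    s = syndrome(P, c)
    if not any(s):
        return list(c)
    columns = [[row[j] for row in P] for j in range(len(c))]
    if s not in columns:
        raise ValueError("syndrome matches no column; more than one error?")
    fixed = list(c)
    fixed[columns.index(s)] ^= 1  # flip the offending bit
    return fixed

received = [1, 0, 1]
print(syndrome(P, received), correct_single_error(P, received))
```

The lookup works here because the columns of this `P` are nonzero and distinct, so each single-bit error produces its own syndrome.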

Definitions

If $k < n$, then any $n \times k$ matrix of the form $G = \begin{bmatrix} I_k \\ A \end{bmatrix}$, where $A$ is an $(n - k) \times k$ matrix over $\mathbb{Z}_2$, is called a standard generator matrix for an $(n, k)$ binary code $T: \mathbb{Z}_2^k \to \mathbb{Z}_2^n$. Any $(n - k) \times n$ matrix of the form $P = [\,B \;\; I_{n-k}\,]$, where $B$ is an $(n - k) \times k$ matrix over $\mathbb{Z}_2$, is called a standard parity check matrix. The code is said to have length $n$ and dimension $k$.

Here is what we need to know: (a) When is $G$ the standard generator matrix for an error-correcting binary code? (b) Given $G$, how do we find an associated standard parity check matrix $P$? It turns out that the answers are quite easy, as shown by the following theorem.

Theorem 3.34

If $G = \begin{bmatrix} I_k \\ A \end{bmatrix}$ is a standard generator matrix and $P = [\,B \;\; I_{n-k}\,]$ is a standard parity check matrix, then $P$ is the parity check matrix associated with $G$ if and only if $A = B$. The corresponding $(n, k)$ binary code is (single) error-correcting if and only if the columns of $P$ are nonzero and distinct.

Before we prove the theorem, let's consider another, less trivial example that illustr... are linearly dependent is false. It follows that $\mathbf{v}_1, \mathbf{v}_2, \ldots, \mathbf{v}_m$ must be linearly independent.

/n exercISes /- 12, compute (a) the characteristic polynomial of A, (b) the eigenvalues ofA, (c) a basIS for eacl! eigenspace of A, lIlid (d) tile algebmic alld geometric multiplicity ofeach eIgenvalue. I. A=[

3. A =

5. A =

I

- 2

~l

2.A =[

~l

2

-\

0

\

0

0

\

3

\

\

0 0 3

0 0 0

\

- 2

\

2

- \

0 0

- 2

\

4. A = 0

\

0

3

\

\

- \

\

0

\

\

6. A =

3 2

-, 0

10. A

\

\

- \

\

,-

11. A =

- \

0

2

0

-\

- \

\

\

0

0 \ 4 0 0 3 0 0 0

5

2

\

0

0

- \

0 0 0 0

\

\

\

\

4

0

0

9.A =

3

- \

8. A =

2 2

\

\

2

7. A =

\

\

0 0 0 0

\

\

0 2 3 - \ 0 4

=

\

\

2

296

Chapter 4

Eigenvalues and Eigenvectors

I

0

12. A :::

4 0 0 4 0 0 0 0

I

I

I

2 0

3

(b) Using Theorem 4.18 and Exercise 22, find the eigenvalues and eigenspaces of $A^{-1}$, $A - 2I$, and $A + 2I$.

24. Let $A$ and $B$ be $n \times n$ matrices with eigenvalues $\lambda$ and $\mu$, respectively. (a) Give an example to show that $\lambda + \mu$ need not be

13. Prove Theorem 4.18(b).

14. Prove Theorem 4.18(c). [Hint: Combine the proofs of parts (a) and (b) and see the fourth Remark following Theorem 3.9 (p. 167).]

I" Exercises IS and 16, A tors V ,

::: [ _ : ]

ami V2

IS

a 2X2 matrix with eigenvec-

::: [ :

J corresponcling to elgenvailles

!

AI ::: and Al = 2, respectively, and x :::

[~].

15. Find Al(lx.

an eigenvalue of $A + B$. (b) Give an example to show that $\lambda\mu$ need not be an eigenvalue of $AB$. (c) Suppose $\lambda$ and $\mu$ correspond to the same eigenvector $\mathbf{x}$. Show that, in this case, $\lambda + \mu$ is an eigenvalue of $A + B$ and $\lambda\mu$ is an eigenvalue of $AB$.

25. If $A$ and $B$ are two row equivalent matrices, do they necessarily have the same eigenvalues? Either prove that they do or give a counterexample.

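Let $p(x)$ be the polynomial

\[ p(x) = x^n + a_{n-1}x^{n-1} + \cdots + a_1 x + a_0 \]

The companion matrix of $p(x)$ is the $n \times n$ matrix

\[ C(p) = \begin{bmatrix} -a_{n-1} & -a_{n-2} & \cdots & -a_1 & -a_0 \\ 1 & 0 & \cdots & 0 & 0 \\ 0 & 1 & \cdots & 0 & 0 \\ \vdots & \vdots & \ddots & \vdots & \vdots \\ 0 & 0 & \cdots & 1 & 0 \end{bmatrix} \tag{4} \]

26. Find the companion matrix of $p(x) = x^2 - 7x + 12$ and then find the characteristic polynomial of $C(p)$.

16. Find $A^k\mathbf{x}$. What happens as $k$ becomes large (i.e., $k \to \infty$)?

In Exercises 17 and 18, $A$ is a $3 \times 3$ matrix with eigenvectors

\[ \mathbf{v}_1 = \begin{bmatrix} 1 \\ 0 \\ 0 \end{bmatrix}, \quad \mathbf{v}_2 = \begin{bmatrix} 1 \\ 1 \\ 0 \end{bmatrix}, \quad \text{and} \quad \mathbf{v}_3 = \begin{bmatrix} 1 \\ 1 \\ 1 \end{bmatrix} \]

corresponding to eigenvalues $\lambda_1 = -\tfrac{1}{3}$, $\lambda_2 = \tfrac{1}{3}$, and $\lambda_3 = 1$, respectively, and

\[ \mathbf{x} = \begin{bmatrix} 2 \\ 1 \\ 2 \end{bmatrix} \]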

17. Find $A^{10}\mathbf{x}$.

18. Find $A^k\mathbf{x}$. What happens as $k$ becomes large (i.e., $k \to \infty$)?

19. (a) Show that, for any square matrix $A$, $A^T$ and $A$ have the same characteristic polynomial and hence the same eigenvalues. (b) Give an example of a $2 \times 2$ matrix $A$ for which $A^T$ and $A$ have different eigenspaces.

20. Let $A$ be a nilpotent matrix (that is, $A^m = O$ for some $m > 1$). Show that $\lambda = 0$ is the only eigenvalue of $A$.

21. Let $A$ be an idempotent matrix (that is, $A^2 = A$). Show that $\lambda = 0$ and $\lambda = 1$ are the only possible eigenvalues of $A$.

22. If $\mathbf{v}$ is an eigenvector of $A$ with corresponding eigenvalue $\lambda$ and $c$ is a scalar, show that $\mathbf{v}$ is an eigenvector of $A - cI$ with corresponding eigenvalue $\lambda - c$.

23. (a) Find the eigenvalues and eigenspaces of A =

[~ ~]

27. Find the companion matrix of $p(x) = x^3 + 3x^2 - 4x + 12$ and then find the characteristic polynomial of $C(p)$.

28. (a) Show that the companion matrix $C(p)$ of $p(x) = x^2 + ax + b$ has characteristic polynomial $\lambda^2 + a\lambda + b$. (b) Show that if $\lambda$ is an eigenvalue of the companion matrix $C(p)$ in part (a), then $\begin{bmatrix} \lambda \\ 1 \end{bmatrix}$ is an eigenvector of $C(p)$ corresponding to $\lambda$.

29. (a) Show that the companion matrix $C(p)$ of $p(x) = x^3 + ax^2 + bx + c$ has characteristic polynomial $-(\lambda^3 + a\lambda^2 + b\lambda + c)$. (b) Show that if $\lambda$ is an eigenvalue of the companion matrix $C(p)$ in part (a), then $\begin{bmatrix} \lambda^2 \\ \lambda \\ 1 \end{bmatrix}$ is an eigenvector of $C(p)$ corresponding to $\lambda$.

Section 4.3

30. Construct a nontriangular $2 \times 2$ matrix with eigenvalues 2 and 5. (Hint: Use Exercise 28.)

33. Verify the Cayley- Hamilton Theorcm fo r A =

2, the companion matrix C( p) of p(x) "'" x" + tl.. I x "~ I + ... + a , x + ao has characteristic polynomial (- \ )"p (A). 1HlMt: Expand by (ofacton along the last colum n. You may find it helpfu l to introduce the polynomial q (x) = ( p(x) - '\I)/x.1 (b) Show that if A IS an eigenvalue of the compan ion matrix C( p } in equation (4), then an eigenvector corresponding to A is given by II

~

- I

0

1

- 2

I

0

powers mId inverses of flit/trices. I-o r example, if A is a 2X 2 matrix with ,/ramaer;st j, poiynomitll ' ,,( A) ::< A2 + aA + b, thenA 2 + aA + bl= O,so

+

A

~

I

~

a" _ I A~-1

+ . + a,A + aoI

An important theorem in advanced linear algebra says that if $c_A(\lambda)$ is the characteristic polynomial of the matrix $A$, then $c_A(A) = O$ (in words, every matrix satisfies its characteristic equation). This is the celebrated Cayley-Hamilton Theorem, named after Arthur Cayley (1821-1895) and Sir William Rowan Hamilton (see page 2). Cayley proved this theorem in 1858. Hamilton discovered it, independently, in his work on quaternions, a generalization of the complex numbers.

- aA - bl

AI = AA2 = A( - (11\ - bf) = - aA 2

A~

~].

'n,e Cayley- Hamilton TI ,corem can be used to ca1cultJfe

and

I\A ) "'"

-

34. Verify the Cayley-Hamilton Theorcm for A :: I I 0

A 2::11:

If p{x) = x" + tl..~ I X "- 1 + ... + alx + ao and A is a square matrix, we alii define tl sqllflrc rnatflX P(A) by

[~

That is, find the characteristic polynomial c,,( A) of A and show that cA(A) = 0.

3 1. Const ruct a tlont riangular 3 X3 matrix with eigenvalues - 2, I,and 3. ( H int: Use ExerclSoo

a. $\lambda_1 > 0$
b. $\lambda_1$ has a corresponding positive eigenvector.
c. If $\lambda$ is any other eigenvalue of $A$, then $|\lambda| \le \lambda_1$.

Intuitively, we can see why the first two statements should be true. Consider the case of a $2 \times 2$ positive matrix $A$. The corresponding matrix transformation maps the first quadrant of the plane properly into itself, since all components are positive. If we repeatedly allow $A$ to act on the images we get, they necessarily converge toward some ray in the first quadrant (Figure 4.17). A direction vector for this ray will be a positive vector $\mathbf{x}$, which must be mapped into some positive multiple of itself (say, $\lambda_1$), since $A$ leaves the ray fixed. In other words, $A\mathbf{x} = \lambda_1\mathbf{x}$, with $\mathbf{x}$ and $\lambda_1$ both positive.

Proof

For some nonzero vectors $\mathbf{x}$, $A\mathbf{x} \ge \lambda\mathbf{x}$ for some scalar $\lambda$. When this happens, then $A(k\mathbf{x}) \ge \lambda(k\mathbf{x})$ for all $k > 0$; thus, we need only consider unit vectors $\mathbf{x}$. In Chapter 7, we will see that $A$ maps the set of all unit vectors in $\mathbb{R}^n$ (the unit sphere) into a "generalized ellipsoid." So, as $\mathbf{x}$ ranges over the nonnegative vectors on this unit sphere, there will be a maximum value of $\lambda$ such that $A\mathbf{x} \ge \lambda\mathbf{x}$. (See Figure 4.18.) Denote this number by $\lambda_1$ and the corresponding unit vector by $\mathbf{x}_1$.

Figure 4.17

Figure 4.18

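We now show that $A\mathbf{x}_1 = \lambda_1\mathbf{x}_1$. If not, then $A\mathbf{x}_1 > \lambda_1\mathbf{x}_1$, and, applying $A$ again, we obtain

\[ A(A\mathbf{x}_1) > A(\lambda_1\mathbf{x}_1) = \lambda_1(A\mathbf{x}_1) \]

where the inequality is preserved, since $A$ is positive. (See Exercise 40.) But then $\mathbf{y} = (1/\|A\mathbf{x}_1\|)A\mathbf{x}_1$ is a unit vector that satisfies $A\mathbf{y} > \lambda_1\mathbf{y}$, so there will be some $\lambda_2 > \lambda_1$ such that $A\mathbf{y} \ge \lambda_2\mathbf{y}$. This contradicts the fact that $\lambda_1$ was the maximum value with this property. Consequently, it must be the case that $A\mathbf{x}_1 = \lambda_1\mathbf{x}_1$; that is, $\lambda_1$ is an eigenvalue of $A$. Now $A$ is positive and $\mathbf{x}_1$ is positive, so $\lambda_1\mathbf{x}_1 = A\mathbf{x}_1 > \mathbf{0}$. This means that $\lambda_1 > 0$ and $\mathbf{x}_1 > \mathbf{0}$, which completes the proof of (a) and (b).

To prove (c), suppose $\lambda$ is any other (real or complex) eigenvalue of $A$ with corresponding eigenvector $\mathbf{z}$. Then $A\mathbf{z} = \lambda\mathbf{z}$, and, taking absolute values, we have

\[ A|\mathbf{z}| = |A||\mathbf{z}| \ge |A\mathbf{z}| = |\lambda\mathbf{z}| = |\lambda||\mathbf{z}| \tag{4} \]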

where the middle inequality follows from the Triangle Inequality. (See Exercise 40.) Since $|\mathbf{z}| > \mathbf{0}$, the unit vector $\mathbf{u}$ in the direction of $|\mathbf{z}|$ is also positive and satisfies $A\mathbf{u} \ge |\lambda|\mathbf{u}$. By the maximality of $\lambda_1$ from the first part of this proof, we must have $|\lambda| \le \lambda_1$.

In fact, more is true. It turns out that $\lambda_1$ is dominant, so $|\lambda| < \lambda_1$ for any eigenvalue $\lambda \neq \lambda_1$. It is also the case that $\lambda_1$ has algebraic, and hence geometric, multiplicity 1. We will not prove these facts.

Perron's Theorem can be generalized from positive to certain nonnegative matrices. Frobenius did so in 1912. The result requires a technical condition on the matrix. A square matrix $A$ is called reducible if, subject to some permutation of the rows and the same permutation of the columns, $A$ can be written in block form as

\[ \begin{bmatrix} B & C \\ O & D \end{bmatrix} \]

where $B$ and $D$ are square. Equivalently, $A$ is reducible if there is some permutation matrix $P$ such that

\[ PAP^T = \begin{bmatrix} B & C \\ O & D \end{bmatrix} \]

(See page 185.) For example, the matrix

\[ A = \begin{bmatrix} 2 & 0 & 0 & 1 & 3 \\ 4 & 2 & 1 & 5 & 5 \\ 1 & 2 & 7 & 3 & 0 \\ 6 & 0 & 0 & 2 & 1 \\ 1 & 0 & 0 & 7 & 2 \end{bmatrix} \]

is reducible, since interchanging rows 1 and 3 and then columns 1 and 3 produces

\[ \left[ \begin{array}{cc|ccc} 7 & 2 & 1 & 3 & 0 \\ 1 & 2 & 4 & 5 & 5 \\ \hline 0 & 0 & 2 & 1 & 3 \\ 0 & 0 & 6 & 2 & 1 \\ 0 & 0 & 1 & 7 & 2 \end{array} \right] \]

(This is just $PAP^T$, where

\[ P = \begin{bmatrix} 0 & 0 & 1 & 0 & 0 \\ 0 & 1 & 0 & 0 & 0 \\ 1 & 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 0 & 1 \end{bmatrix} \]

Check this!) A square matrix $A$ that is not reducible is called irreducible. If $A^k > O$ for some $k$, then $A$ is called primitive. For example, every regular Markov chain has a primitive transition matrix, by definition. It is not hard to show that every primitive matrix is irreducible. (Do you see why? Try showing the contrapositive of this.)
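The "Check this!" step and the definition of a primitive matrix both lend themselves to a quick computation. The sketch below is illustrative: it assumes the $5 \times 5$ example matrix reads as entered here, applies the row and column interchange as an index permutation instead of forming $P$ explicitly, and tests primitivity by examining powers up to $(n-1)^2 + 1$ (Wielandt's bound, which is not stated in the text). All function names are hypothetical.

```python
# Assumed reading of the 5x5 example matrix.
A = [
    [2, 0, 0, 1, 3],
    [4, 2, 1, 5, 5],
    [1, 2, 7, 3, 0],
    [6, 0, 0, 2, 1],
    [1, 0, 0, 7, 2],
]

def permute(M, perm):
    """Apply the same permutation to rows and columns: entry (i, j) of the
    result is M[perm[i]][perm[j]], which is P M P^T for the matching P."""
    return [[M[i][j] for j in perm] for i in perm]

def has_zero_lower_left_block(M, k):
    """True if rows k.. of M vanish in columns 0..k-1 (block form [B C; O D])."""
    return all(M[i][j] == 0 for i in range(k, len(M)) for j in range(k))

def mat_mul(X, Y):
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y)))
             for j in range(len(Y[0]))] for i in range(len(X))]

def is_primitive(M):
    """Check whether some power M^k is entrywise positive, for k up to
    (n-1)^2 + 1. M is assumed square and nonnegative."""
    n = len(M)
    power = M
    for _ in range((n - 1) ** 2 + 1):
        if all(x > 0 for row in power for x in row):
            return True
        power = mat_mul(power, M)
    return False

B = permute(A, [2, 1, 0, 3, 4])  # interchange indices 1 and 3 (0-based 0 and 2)
print(B)
print(has_zero_lower_left_block(B, 2), is_primitive(A), is_primitive([[1, 1], [1, 0]]))
```

The zero block survives every power of the permuted matrix, which is one way to see why a reducible matrix can never be primitive.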

Theorem 4.37

The Perron-Frobenius Theorem

Let $A$ be an irreducible nonnegative $n \times n$ matrix. Then $A$ has a real eigenvalue $\lambda_1$ with the following properties:

a. $\lambda_1 > 0$
b. $\lambda_1$ has a corresponding positive eigenvector.
c. If $\lambda$ is any other eigenvalue of $A$, then $|\lambda| \le \lambda_1$. If $A$ is primitive, then this inequality is strict.
d. If $\lambda$ is an eigenvalue of $A$ such that $|\lambda| = \lambda_1$, then $\lambda$ is a (complex) root of the equation $\lambda^n - \lambda_1^n = 0$.
e. $\lambda_1$ has algebraic multiplicity 1.

See Matrix Analysis by R. A. Horn and C. R. Johnson (Cambridge, England: Cambridge University Press, 1985).

The interested reader can find a proof of the Perron-Frobenius Theorem in many texts on nonnegative matrices or matrix analysis. The eigenvalue $\lambda_1$ is often called the Perron root of $A$, and a corresponding probability eigenvector (which is necessarily unique) is called the Perron eigenvector of $A$.

Linear Recurrence Relations

The Fibonacci numbers are the numbers in the sequence 0, 1, 1, 2, 3, 5, 8, 13, 21, ..., where, after the first two terms, each new term is obtained by summing the two terms preceding it. If we denote the $n$th Fibonacci number by $f_n$, then this sequence is completely defined by the equations $f_0 = 0$, $f_1 = 1$, and, for $n \ge 2$,

\[ f_n = f_{n-1} + f_{n-2} \]

This last equation is an example of a linear recurrence relation. We will return to the Fibonacci numbers, but first we will consider linear recurrence relations somewhat more generally.
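As a small illustration of how the recurrence and its two initial values determine the whole sequence, here is a minimal sketch in Python. The function name `fibonacci` is an illustrative choice, not notation from the text.

```python
def fibonacci(count):
    """Return the first `count` Fibonacci numbers f_0, f_1, ..., f_{count-1}.

    The sequence is built directly from f_0 = 0, f_1 = 1 and
    f_n = f_{n-1} + f_{n-2} for n >= 2.
    """
    terms = [0, 1]
    while len(terms) < count:
        terms.append(terms[-1] + terms[-2])
    return terms[:count]

print(fibonacci(9))
```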

where

\[ \mathbf{x}_n = \begin{bmatrix} x_n \\ x_{n-1} \end{bmatrix} \quad \text{and} \quad A = \begin{bmatrix} a & b \\ 1 & 0 \end{bmatrix} \]

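Since $A$ has distinct eigenvalues, it can be diagonalized. The rest of the details are left for Exercise 51.

(b) We will show that $x_n = c_1\lambda^n + c_2 n\lambda^n$ satisfies the recurrence relation $x_n = ax_{n-1} + bx_{n-2}$ or, equivalently,

\[ x_n - ax_{n-1} - bx_{n-2} = 0 \tag{6} \]

if $\lambda^2 - a\lambda - b = 0$. Since

\[ x_{n-1} = c_1\lambda^{n-1} + c_2(n-1)\lambda^{n-1} \quad \text{and} \quad x_{n-2} = c_1\lambda^{n-2} + c_2(n-2)\lambda^{n-2} \]

substitution into equation (6) yields

\begin{align*}
x_n - ax_{n-1} - bx_{n-2} &= (c_1\lambda^n + c_2 n\lambda^n) - a\bigl(c_1\lambda^{n-1} + c_2(n-1)\lambda^{n-1}\bigr) - b\bigl(c_1\lambda^{n-2} + c_2(n-2)\lambda^{n-2}\bigr) \\
&= c_1(\lambda^n - a\lambda^{n-1} - b\lambda^{n-2}) + c_2\bigl(n\lambda^n - a(n-1)\lambda^{n-1} - b(n-2)\lambda^{n-2}\bigr) \\
&= c_1\lambda^{n-2}(\lambda^2 - a\lambda - b) + c_2 n\lambda^{n-2}(\lambda^2 - a\lambda - b) + c_2\lambda^{n-2}(a\lambda + 2b) \\
&= c_1\lambda^{n-2}(0) + c_2 n\lambda^{n-2}(0) + c_2\lambda^{n-2}(a\lambda + 2b) \\
&= c_2\lambda^{n-2}(a\lambda + 2b)
\end{align*}

But since $\lambda$ is a double root of $\lambda^2 - a\lambda - b = 0$, we must have $a^2 + 4b = 0$ and $\lambda = a/2$, using the quadratic formula. Consequently, $a\lambda + 2b = a^2/2 + 2b = -4b/2 + 2b = 0$, so

\[ x_n - ax_{n-1} - bx_{n-2} = c_2\lambda^{n-2}(0) = 0 \]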

Suppose the initial conditions are $x_0 = r$ and $x_1 = s$. Then, in either (a) or (b) there is a unique solution for $c_1$ and $c_2$. (See Exercise 52.)

Example 4.42

Solve the recurrence relation $x_0 = 1$, $x_1 = 6$, and $x_n = 6x_{n-1} - 9x_{n-2}$ for $n \ge 2$.

Solution

The characteristic equation is $\lambda^2 - 6\lambda + 9 = 0$, which has $\lambda = 3$ as a double root. By Theorem 4.38(b), we must have $x_n = c_1 3^n + c_2 n 3^n = (c_1 + c_2 n)3^n$. Since $1 = x_0 = c_1$ and $6 = x_1 = (c_1 + c_2)3$, we find that $c_2 = 1$, so

\[ x_n = (1 + n)3^n \]

The techniques outlined in Theorem 4.38 can be extended to higher order recurrence relations. We state, without proof, the general result.

Theorem 4.39

Let $x_n = a_{m-1}x_{n-1} + a_{m-2}x_{n-2} + \cdots + a_0 x_{n-m}$ be a recurrence relation of order $m$ that is satisfied by a sequence $(x_n)$. Suppose the associated characteristic polynomial

\[ \lambda^m - a_{m-1}\lambda^{m-1} - a_{m-2}\lambda^{m-2} - \cdots - a_0 \]

factors as $(\lambda - \lambda_1)^{m_1}(\lambda - \lambda_2)^{m_2} \cdots (\lambda - \lambda_k)^{m_k}$, where $m_1 + m_2 + \cdots + m_k = m$. Then $x_n$ has the form

\[ x_n = (c_{11}\lambda_1^n + c_{12}n\lambda_1^n + c_{13}n^2\lambda_1^n + \cdots + c_{1m_1}n^{m_1-1}\lambda_1^n) + \cdots + (c_{k1}\lambda_k^n + c_{k2}n\lambda_k^n + c_{k3}n^2\lambda_k^n + \cdots + c_{km_k}n^{m_k-1}\lambda_k^n) \]
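A closed form with a repeated root, such as the one obtained for $x_n = 6x_{n-1} - 9x_{n-2}$ with $x_0 = 1$, $x_1 = 6$, is easy to check against the recurrence itself. The following sketch does so; the function names are illustrative, and the check covers only the first few terms, so it is a sanity test and not a proof.

```python
def recurrence_terms(a, b, x0, x1, count):
    """First `count` terms of x_n = a*x_{n-1} + b*x_{n-2} with the given x_0, x_1."""
    terms = [x0, x1]
    while len(terms) < count:
        terms.append(a * terms[-1] + b * terms[-2])
    return terms[:count]

def closed_form(n):
    """Candidate solution x_n = (1 + n) * 3^n, i.e. c_1 = c_2 = 1 and lambda = 3."""
    return (1 + n) * 3 ** n

by_recurrence = recurrence_terms(6, -9, 1, 6, 10)
by_formula = [closed_form(n) for n in range(10)]
print(by_recurrence[:4], by_recurrence == by_formula)
```

Integer arithmetic keeps the comparison exact, which would not be the case if the roots were irrational and floating point were used.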

Systems of Linear Differential Equations

In calculus, you learn that if $x = x(t)$ is a differentiable function satisfying a differential equation of the form $x' = kx$, where $k$ is a constant, then the general solution is $x = Ce^{kt}$, where $C$ is a constant. If an initial condition $x(0) = x_0$ is specified, then, by substituting $t = 0$ in the general solution, we find that $C = x_0$. Hence, the unique solution to the differential equation that satisfies the initial condition is

\[ x = x_0 e^{kt} \]

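Suppose we have $n$ differentiable functions of $t$ (say, $x_1, x_2, \ldots, x_n$) that satisfy a system of differential equations

\begin{align*}
x_1' &= a_{11}x_1 + a_{12}x_2 + \cdots + a_{1n}x_n \\
x_2' &= a_{21}x_1 + a_{22}x_2 + \cdots + a_{2n}x_n \\
&\;\;\vdots \\
x_n' &= a_{n1}x_1 + a_{n2}x_2 + \cdots + a_{nn}x_n
\end{align*}

We can write this system in matrix form as $\mathbf{x}' = A\mathbf{x}$, where $A = [a_{ij}]$ and

\[ \mathbf{x}(t) = \begin{bmatrix} x_1(t) \\ x_2(t) \\ \vdots \\ x_n(t) \end{bmatrix}, \qquad \mathbf{x}'(t) = \begin{bmatrix} x_1'(t) \\ x_2'(t) \\ \vdots \\ x_n'(t) \end{bmatrix} \]

... $O$ and $C \ge D \ge O$, then $AC \ge BD \ge O$,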

It call be sllOwl/ that a nonnegatil1e /I X /I mat rix is irretillCilJle if and Dilly if ( / + A) ,,-I > 0. b. Exercises 32-35,

32. A ~

Ibl IA + BI siAl + 181

42. a l

= 128, an = a n_ I / 2

43, Yo = 0,11 ... I, y" =

for

II

2:

2

Y,,-2 for N 2' 2

Yn-l -

0

I

0

0

I

0

0

0

0

I

0

0

0

0

I

0

0

0

0

0

I

0

I

0

0

0

0

I

I

0

I

0

I

I

0

0

0

I

hi Exercises 45-50, solve Ihe recllrrence relatioll IVillr tile givell Irlilial cOlldilions,

0

0

I

I

0

0

0

I

0

0

45. Xc!

I

0

0

0

0

0

0

0

I

I

46, Xu = 0,

35. A ::

44. /)0

36. (a) If $A$ is the adjacency matrix of a graph $G$, show that $A$ is irreducible if and only if $G$ is connected. (A graph is connected if there is a path between every pair of vertices.) (b) Which of the graphs in Section 4.0 have an irreducible adjacency matrix? Which have a primitive adjacency matrix?

37. Let $G$ be a bipartite graph with adjacency matrix $A$. (a) Show that $A$ is not primitive. (b) Show that if $\lambda$ is an eigenvalue of $A$, so is $-\lambda$. [Hint: Use Exercise 60 in Section 3.7 and partition an eigenvector for $\lambda$ so that it is compatible with this partitioning of $A$. Use this partitioning to find an eigenvector for $-\lambda$.]

38. A graph is called $k$-regular if $k$ edges meet at each vertex. Let $G$ be a $k$-regular graph. (a) Show that the adjacency matrix $A$ of $G$ has $\lambda = k$ as an eigenvalue. (Hint: Adapt Theorem 4.30.)

= I , /)1 = 1, b

n

= 0, x 1 = XI

S,x"

49.

"

= 3x

n_

1

= I, x" = 4xn_ 1

47. YI = \ 'Y2 = 6,y" 48. ('0

= 2bn _ 1 + b"_2 for II 2: 2

+ 4X n_l -

fOTIi

>

2

3X,,_2 for /I i?:: 2

=

4Yn_1 - 4Y,,_2 for 1/ i?:: 3

= 4, " I = I, a" =

a,,_1 - a,,_z/4 for II i?:: 2

bo = 0, bl

= I, b" = 2b n _ 1

+ 2b,,_2 for"

2:

2

50. The recu rrence relation in Exercise 43. Show that your solut ion agrees with the answer to Exercise 43, 5 1. Complete the proof of Theorem 4.38(a ) by showing that jf the recurrence relation x" = ax"_ 1 + bX,,_2has distlilct eigenvalues Al '\2' then the solution will be

'*

of the form

( Hilll: Show that the method of Example 4.40 wo rks in general.)

52. Show that for any choice of mltial conditio ns Xu = r and x, = S, the scalars c i and C:! can be fo und, as stated jn Theorem 4,38(a) and (b).

Section 4.6

Applications and the Perron-Frobenius Theorem

T he area of the square is 64 square u mls, but the rectangle's area is 65 square u nits! Where did the extra square coille fro m? (Him: What does this have to do wi th the Fibonacci sequence?)

53. The Fibonacci recurrence f" = /,,- 1+ /,,-2 has the associated matrix equation x ~ = Ax n _ p where

(a) With fo = 0 and f.. = I, use mathematical ind uction to prove that

A"

~ [f"+' f.

f,

/~- I

359

54. You have a supply of three kmds of tiles: two ki nds of 1 X2 tiles and one kind of I X 1 tile, as shown in Figure 4.29.

1 figure 4.29

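(a) $\Rightarrow$ (c) Assume that $Q$ is orthogonal. Then $Q^TQ = I$, and we have

\[ Q\mathbf{x} \cdot Q\mathbf{y} = (Q\mathbf{x})^T Q\mathbf{y} = \mathbf{x}^T Q^T Q \mathbf{y} = \mathbf{x}^T I \mathbf{y} = \mathbf{x}^T\mathbf{y} = \mathbf{x} \cdot \mathbf{y} \]

(c) $\Rightarrow$ (b) Assume that $Q\mathbf{x} \cdot Q\mathbf{y} = \mathbf{x} \cdot \mathbf{y}$ for every $\mathbf{x}$ and $\mathbf{y}$ in $\mathbb{R}^n$. Then, taking $\mathbf{y} = \mathbf{x}$, we have $Q\mathbf{x} \cdot Q\mathbf{x} = \mathbf{x} \cdot \mathbf{x}$, so $\|Q\mathbf{x}\| = \sqrt{Q\mathbf{x} \cdot Q\mathbf{x}} = \sqrt{\mathbf{x} \cdot \mathbf{x}} = \|\mathbf{x}\|$.

(b) $\Rightarrow$ (a) Assume that property (b) holds and let $\mathbf{q}_i$ denote the $i$th column of $Q$. Using Exercise 49 in Section 1.2 and property (b), we have

\begin{align*}
\mathbf{x} \cdot \mathbf{y} &= \tfrac{1}{4}\bigl(\|\mathbf{x} + \mathbf{y}\|^2 - \|\mathbf{x} - \mathbf{y}\|^2\bigr) \\
&= \tfrac{1}{4}\bigl(\|Q(\mathbf{x} + \mathbf{y})\|^2 - \|Q(\mathbf{x} - \mathbf{y})\|^2\bigr) \\
&= \tfrac{1}{4}\bigl(\|Q\mathbf{x} + Q\mathbf{y}\|^2 - \|Q\mathbf{x} - Q\mathbf{y}\|^2\bigr) \\
&= Q\mathbf{x} \cdot Q\mathbf{y}
\end{align*}

for all $\mathbf{x}$ and $\mathbf{y}$ in $\mathbb{R}^n$. [This shows that (b) $\Rightarrow$ (c).] Now if $\mathbf{e}_i$ is the $i$th standard basis vector, then $\mathbf{q}_i = Q\mathbf{e}_i$. Consequently,

$\mathbf{q}_i \cdot \mathbf{q}_j =$ ... we also have $W = \ell^\perp$. Figure 5.5 illustrates this situation.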

e

e

318

ChapteT 5 OrThogonalilY

In Example 5.8, the orthogonal complement of a subspace turned out to be another subspace. Also, the complement of the complement of a subspace was the original subspace. These properties are true in general and are proved as properties (a) and (b) of Theorem 5.9. Properties (c) and (d) will also be useful. (Recall that the intersection $A \cap B$ of sets $A$ and $B$ consists of their common elements. See Appendix A.)

Theorem 5.9

Let $W$ be a subspace of $\mathbb{R}^n$.
a. $W^\perp$ is a subspace of $\mathbb{R}^n$.
b. $(W^\perp)^\perp = W$
c. $W \cap W^\perp = \{\mathbf{0}\}$
d. If $W = \operatorname{span}(\mathbf{w}_1, \ldots, \mathbf{w}_k)$, then $\mathbf{v}$ is in $W^\perp$ if and only if $\mathbf{v} \cdot \mathbf{w}_i = 0$ for all $i = 1, \ldots, k$.

Proof

(a) Since $\mathbf{0} \cdot \mathbf{w} = 0$ for all $\mathbf{w}$ in $W$, $\mathbf{0}$ is in $W^\perp$. Let $\mathbf{u}$ and $\mathbf{v}$ be in $W^\perp$ and let $c$ be a scalar. Then

\[ \mathbf{u} \cdot \mathbf{w} = \mathbf{v} \cdot \mathbf{w} = 0 \quad \text{for all } \mathbf{w} \text{ in } W \]

Therefore,

\[ (\mathbf{u} + \mathbf{v}) \cdot \mathbf{w} = \mathbf{u} \cdot \mathbf{w} + \mathbf{v} \cdot \mathbf{w} = 0 + 0 = 0 \]

so $\mathbf{u} + \mathbf{v}$ is in $W^\perp$. We also have

\[ (c\mathbf{u}) \cdot \mathbf{w} = c(\mathbf{u} \cdot \mathbf{w}) = c(0) = 0 \]

from which we see that $c\mathbf{u}$ is in $W^\perp$. It follows that $W^\perp$ is a subspace of $\mathbb{R}^n$.
(b) We will prove this property as Corollary 5.12.
(c) You are asked to prove this property in Exercise 23.
(d) You are asked to prove this property in Exercise 24.

We can now express some fundamental relationships involving the subspaces associated with an $m \times n$ matrix.

Theorem 5.10

Let $A$ be an $m \times n$ matrix. Then the orthogonal complement of the row space of $A$ is the null space of $A$, and the orthogonal complement of the column space of $A$ is the null space of $A^T$:

\[ (\operatorname{row}(A))^\perp = \operatorname{null}(A) \quad \text{and} \quad (\operatorname{col}(A))^\perp = \operatorname{null}(A^T) \]

Proof

If $\mathbf{x}$ is a vector in $\mathbb{R}^n$, then $\mathbf{x}$ is in $(\operatorname{row}(A))^\perp$ if and only if $\mathbf{x}$ is orthogonal to every row of $A$. But this is true if and only if $A\mathbf{x} = \mathbf{0}$, which is equivalent to $\mathbf{x}$ being in $\operatorname{null}(A)$, so we have established the first identity. To prove the second identity, we simply replace $A$ by $A^T$ and use the fact that $\operatorname{row}(A^T) = \operatorname{col}(A)$.

Thus, an $m \times n$ matrix has four subspaces: $\operatorname{row}(A)$, $\operatorname{null}(A)$, $\operatorname{col}(A)$, and $\operatorname{null}(A^T)$. The first two are orthogonal complements in $\mathbb{R}^n$, and the last two are orthogonal complements in $\mathbb{R}^m$. The $m \times n$ matrix $A$ defines a linear transformation from $\mathbb{R}^n$ into $\mathbb{R}^m$ whose range is $\operatorname{col}(A)$. Moreover, this transformation sends $\operatorname{null}(A)$ to $\mathbf{0}$ in $\mathbb{R}^m$. Figure 5.6 illustrates these ideas schematically. These four subspaces are called the fundamental subspaces of the $m \times n$ matrix $A$.

Figure 5.6 The four fundamental subspaces
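The two orthogonality relations among the fundamental subspaces can be checked by hand on a tiny matrix. The sketch below uses a $2 \times 3$ matrix of rank 1 chosen purely for illustration (it is not taken from the text), with null space bases worked out by inspection; the name `dot` is a hypothetical helper.

```python
def dot(u, v):
    """Dot product of two vectors of equal length."""
    return sum(a * b for a, b in zip(u, v))

# Illustrative matrix: the second row is twice the first, so the rank is 1.
A = [[1, 2, 3],
     [2, 4, 6]]
A_T = [list(col) for col in zip(*A)]  # rows of A^T are the columns of A

# Bases found by inspection (assumptions of this sketch):
null_A = [[-2, 1, 0], [-3, 0, 1]]     # solutions of x1 + 2*x2 + 3*x3 = 0
null_A_T = [[-2, 1]]                  # solutions of y1 + 2*y2 = 0

# Every row of A is orthogonal to every vector in null(A) ...
rows_vs_null = [dot(r, x) for r in A for x in null_A]
# ... and every column of A is orthogonal to every vector in null(A^T).
cols_vs_left_null = [dot(c, y) for c in A_T for y in null_A_T]
print(rows_vs_null, cols_vs_left_null)
```

Here the dimensions also add up as expected: $1 + 2 = 3$ in $\mathbb{R}^3$ for the row space and null space, and $1 + 1 = 2$ in $\mathbb{R}^2$ for the column space and the null space of the transpose.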

Example 5.9

Find bases fo r the four fu ndamental subspaccs of

A ~

1

1

3

1

6

2

- 1

0

1

- 1

-3

2

1

- 2

1

lll)' where

Also, null {A) = spa n (x •• Xl)' where

x,

~

-1

1

-2

-3 0

1

0 0

,

X2

=

-. 1

To show thaI (row (A».!. = null (A ), it is enough to show that every u, is orthogonal to each x" which IS an easy exercise. (Why is th is sufficierll?)

311

Chapter 5

Orthogon:llity

The colu m n space of A is col (A) "" span {a J, a 2• a J ), where ]

2 , - 3

3 1 ""

=

32

]

]

- ]

]

2

4

,

a,

=

- 2

]

]

We still need 10 compute the null space of AT. Row reduction produces ]

2

- 3

4 0

]

0

0

]

]

- ]

2

0

6 0

0

]

]

- 2

-]

]

0 0 0

]

]

0 0 0 0

]

IA'I O)= 3

0 0 ] 0 3 0

3 0 0 0 0 0

•

]

•

•

0 0

0

So, if y is in the n ull space of AT, then Y. ""' - Y., Y2 "" - 6y., and Yj = - 3Y4. It fo llo\ys that

nuU(A')

- 6y. - 3y.

=

"" span

vCClO r

-3 ]

)'.

and it is easy to check tha t this

-. - ]

-Y.

is orthogonal to a l> a z• and a y

The method of Example 5.9 is easily adapted to other situatio ns.

(umple 5,10

Let W be the subspace of [R5 spanned by ]

wl =

-3 5 , w1 = 0

5

-]

0

]

- ]

2 , - 2 3

Wj=

4 - ]

5

Find a basis for W J.. .

Salallall

The subspace W spanned by wI'

w~ .

and wJ is the same as the column

space of

A=

]

- ]

0

-3

]

- ]

0

2 -2

4 - ]

5

3

5

5

Seclion 5.2

Orthogonal Complements and Orthogonal Project ions

319

Therefore, by Theorem 5.1 0, W i = (col(A))'" = null (AT ), and we may p roceed as in the p re vio us exam ple. We com pute

[A' loJ-

1

- 3

- I

1

°

5 2

- I 4

°J ° 5

•

- 2 0 - \ 5 0

° °0 01° o1 32 °° 1

°°

3

4

1

Hence, yisin W.L ifandonlY lf Yi = - 3Y4 - 4YS'Y2 = - Y4 - 3Ys, and YJ = - 2Ys. lt follows that

- 3

- 3Y4 - 4Y5 -Y4 - 3Y5 - 2ys

W l. =

-4 - I - 3 0 , -2 1 0 0 1

= span

y,

y, and these two vectors for m a basis for W-i

,

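Orthogonal Projections

Recall that, in $\mathbb{R}^2$, the projection of a vector $\mathbf{v}$ onto a nonzero vector $\mathbf{u}$ is given by

\[ \operatorname{proj}_{\mathbf{u}}(\mathbf{v}) = \left( \frac{\mathbf{u} \cdot \mathbf{v}}{\mathbf{u} \cdot \mathbf{u}} \right) \mathbf{u} \]

Furthermore, the vector $\operatorname{perp}_{\mathbf{u}}(\mathbf{v}) = \mathbf{v} - \operatorname{proj}_{\mathbf{u}}(\mathbf{v})$ is orthogonal to $\operatorname{proj}_{\mathbf{u}}(\mathbf{v})$, and we can decompose $\mathbf{v}$ as

\[ \mathbf{v} = \operatorname{proj}_{\mathbf{u}}(\mathbf{v}) + \operatorname{perp}_{\mathbf{u}}(\mathbf{v}) \]

as shown in Figure 5.7. If we let $W = \operatorname{span}(\mathbf{u})$, then $\mathbf{w} = \operatorname{proj}_{\mathbf{u}}(\mathbf{v})$ is in $W$ and $\mathbf{w}^\perp = \operatorname{perp}_{\mathbf{u}}(\mathbf{v})$ is in $W^\perp$. We therefore have a way of "decomposing" $\mathbf{v}$ into the sum of two vectors, one from $W$ and the other orthogonal to $W$, namely, $\mathbf{v} = \mathbf{w} + \mathbf{w}^\perp$. We now generalize this idea to $\mathbb{R}^n$.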
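The decomposition of $\mathbf{v}$ into a projection and a perpendicular component is short enough to compute directly. Here is a minimal sketch using exact rational arithmetic; the function names `proj` and `perp` mirror the notation above, and the vectors `u` and `v` are arbitrary illustrative choices.

```python
from fractions import Fraction

def dot(u, v):
    """Dot product of two vectors of equal length."""
    return sum(a * b for a, b in zip(u, v))

def proj(u, v):
    """Projection of v onto the nonzero vector u: ((u.v)/(u.u)) u.
    Integer entries are assumed so that Fraction stays exact."""
    scale = Fraction(dot(u, v), dot(u, u))
    return [scale * a for a in u]

def perp(u, v):
    """Component of v orthogonal to u: v - proj_u(v)."""
    return [b - p for b, p in zip(v, proj(u, v))]

u = [2, 1]
v = [-1, 3]
w = proj(u, v)
w_perp = perp(u, v)
print(w, w_perp, dot(u, w_perp))
```

Using `Fraction` avoids rounding error, so the orthogonality check comes out as an exact zero.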

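Definition

Let $W$ be a subspace of $\mathbb{R}^n$ and let $\{\mathbf{u}_1, \ldots, \mathbf{u}_k\}$ be an orthogonal basis for $W$. For any vector $\mathbf{v}$ in $\mathbb{R}^n$, the orthogonal projection of $\mathbf{v}$ onto $W$ is defined as

\[ \operatorname{proj}_W(\mathbf{v}) = \left( \frac{\mathbf{u}_1 \cdot \mathbf{v}}{\mathbf{u}_1 \cdot \mathbf{u}_1} \right) \mathbf{u}_1 + \cdots + \left( \frac{\mathbf{u}_k \cdot \mathbf{v}}{\mathbf{u}_k \cdot \mathbf{u}_k} \right) \mathbf{u}_k \]

The component of $\mathbf{v}$ orthogonal to $W$ is the vector

\[ \operatorname{perp}_W(\mathbf{v}) = \mathbf{v} - \operatorname{proj}_W(\mathbf{v}) \]

Each summand in the definition of $\operatorname{proj}_W(\mathbf{v})$ is also a projection onto a single vector (or, equivalently, the one-dimensional subspace spanned by it, in our previous sense). Therefore, with the notation of the preceding definition, we can write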

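\[ \operatorname{proj}_W(\mathbf{v}) = \operatorname{proj}_{\mathbf{u}_1}(\mathbf{v}) + \cdots + \operatorname{proj}_{\mathbf{u}_k}(\mathbf{v}) \]

...

A quadratic form $f(\mathbf{x}) = \mathbf{x}^T A \mathbf{x}$ is classified as one of the following:

1. positive definite if $f(\mathbf{x}) > 0$ for all $\mathbf{x} \neq \mathbf{0}$.
2. positive semidefinite if $f(\mathbf{x}) \ge 0$ for all $\mathbf{x}$.
3. negative definite if $f(\mathbf{x}) < 0$ for all $\mathbf{x} \neq \mathbf{0}$.
4. negative semidefinite if $f(\mathbf{x}) \le 0$ for all $\mathbf{x}$.
5. indefinite if $f(\mathbf{x})$ takes on both positive and negative values.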

41&

Cha pter 5 Orl hogollaluy

A symmetric matrix $A$ is called positive definite, positive semidefinite, negative definite, negative semidefinite, or indefinite if the associated quadratic form $f(\mathbf{x}) = \mathbf{x}^T A \mathbf{x}$ has the corresponding property.

The quadratic forms in parts (a), (b), (c), and (d) of Figure 5.12 are positive definite, negative definite, indefinite, and positive semidefinite, respectively. The Principal Axes Theorem makes it easy to tell if a quadratic form has one of these properties.

Theorem 5.24

Let $A$ be an $n \times n$ symmetric matrix. The quadratic form $f(\mathbf{x}) = \mathbf{x}^T A \mathbf{x}$ is
a. positive definite if and only if all of the eigenvalues of $A$ are positive.
b. positive semidefinite if and only if all of the eigenvalues of $A$ are nonnegative.
c. negative definite if and only if all of the eigenvalues of $A$ are negative.
d. negative semidefinite if and only if all of the eigenvalues of $A$ are nonpositive.
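For a $2 \times 2$ symmetric matrix the eigenvalue test can be carried out with a closed formula, which makes for a compact illustration. The sketch below is an assumption-laden example and not a general routine: it handles only matrices of the form $\begin{bmatrix} a & b \\ b & d \end{bmatrix}$, uses a small floating-point tolerance `tol` to decide when an eigenvalue counts as zero, reports the strongest label that applies, and labels a matrix indefinite when it has eigenvalues of both signs. The function names are hypothetical.

```python
import math

def eigenvalues_sym2(a, b, d):
    """Eigenvalues of the symmetric matrix [[a, b], [b, d]], smallest first.

    They are (a + d)/2 -+ sqrt(((a - d)/2)^2 + b^2).
    """
    mean = (a + d) / 2
    radius = math.hypot((a - d) / 2, b)
    return mean - radius, mean + radius

def classify(a, b, d, tol=1e-12):
    """Classify the quadratic form x^T A x by the signs of the eigenvalues of A."""
    lo, hi = eigenvalues_sym2(a, b, d)
    if lo > tol:
        return "positive definite"
    if hi < -tol:
        return "negative definite"
    if lo < -tol and hi > tol:
        return "indefinite"
    if lo >= -tol:
        return "positive semidefinite"  # smallest eigenvalue is (numerically) zero
    return "negative semidefinite"      # largest eigenvalue is (numerically) zero

for entries in [(2, 0, 3), (1, 2, 1), (1, 1, 1), (-2, 0, -1), (-1, 1, -1)]:
    print(entries, classify(*entries))
```

Note that a positive definite form is also positive semidefinite under the definitions above; the function simply returns the more specific label.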

a O (c) a pair of straight lines o r an imaginary conic if k 'l 0, and prove that if x lies o n this ell ipse, so does Ax.J

• ., Dellnltions and fundamental subspaces of a matrix.. 377 Gram·Schmidt Process. 386 orthogonal basis, 36i orthogonal complemen t of a subspace, 375 orthogonal matrix. 37 1

orthonormal set of vectors. 369 propcn ies of onhogonal mat rices. 373 QR faclOrization. 390 Rank Theorem, 383 spectral decompositIOn , 402 Spectr,,1 Theorem , 400

o nhogonal proJKtion , 379 o rthogonal set of vecto rs, 366 Orthogonal Decomposi tion Theorem. 381 o rthogonally diagonalizable matrix. 397 o nhonormal basis, 369

Review Questions 7

1. Mark e"ch of the following statements t rue or false: (a) Every ortho normal set of vttto rs is linearly

(b) (c) (d) (e)

(0 (g)

(h) (i) (j)

independent. Every nonzero subspace of R~ has an orthogonal basis If A is a square matrix with orthonormal rows, then A is an orthogo nal matTlx. Every o rthogol1,,1 matrix is invertible. If A is a mat rix with det A :: I, then A is an orthogonal matrix. If A is an m X " matrix such that (row(A» .l = R~. then A must be the zero matrix. If W is a subspace of IR~ and v is a vector in R" such that pro; lO'( v) = O, lhen v must be the zero vector. If A is a symmetTlc,orthogonal matrix, then AZ = I. E\'ery o rthogonally diagonaliz.l ble matrix is Invertible. Given any /I real numbers AI! . .. ,An' there exists a symmetric /I X III11"tfl)C with AI' ... , An as Its eigenvalues.

2. Find all values of (/ and b such that I

2. 3

4

•

I , b - 2 3

is an o rthogonal set of vectors.

3. Find the coordmate vector [V]6 of v =

- 3 with 2

respect to the orthogonal basis I

I

0,

I ,

I

- I

- I

2

of R'.

1

4. The coordina te vector of a veClOr v with respect to an

orthonorm:llbasis 6 "" {v l, vl}of R l is [V]6 = If VI =

J/5] [

[ l/~] '

4/ 5, find allposslblevectorsv.

6! 7 2! 7 3! 7 5. Show that - I!V, o 2! V, 4! 7Vs - 15!7V, 2! 7V,

•

IS an

o rthogonal matTix. 6. If

[ 1~2

:] is an o rthogonal matrix, fi nd all possible

values of a, h, and c. 7. If Qis an orthogonal " X II matrix and {VI> ...• v(} is an orthonormal sct In Rn, prove that {Q v l , . . . , QVt} is an o rthonormal sct.

431

Chapter 5 Orthogonali ty

8. If Q IS an " X " mat rix such that the angles L (Qx, Qy) and L (x , y ) a re eq ual for all vectors x and y in lQ:", prove that Q IS an orthogonal matrix .

In Questions 9-12, find a basis for $W^\perp$.

9. $W$ is the line in $\mathbb{R}^2$ with general equation $2x - 5y = 0$.

10. $W$ is the line in $\mathbb{R}^3$ with parametric equations $x = t$, $y = 2t$, $z = [\ldots]$

11. $W = \operatorname{span}\{[\ldots]\}$

12. $W = \operatorname{span}\{[\ldots]\}$

13. Find bases for each of the four fundamental subspaces of $A = [\ldots]$

14. Find the orthogonal decomposition of $\mathbf{v} = [\ldots]$ with respect to $W = \operatorname{span}\{[\ldots]\}$

15. (a) Apply the Gram-Schmidt Process to $\mathbf{x}_1 = [\ldots]$, $\mathbf{x}_2 = [\ldots]$, $\mathbf{x}_3 = [\ldots]$ to find an orthogonal basis for $W = \operatorname{span}\{\mathbf{x}_1, \mathbf{x}_2, \mathbf{x}_3\}$.
(b) Use the result of part (a) to find a QR factorization of $A = [\ldots]$

16. Find an orthogonal basis for $\mathbb{R}^4$ that contains the vectors $[\ldots]$

17. Find an orthogonal basis for the subspace
$$W = \left\{ \begin{bmatrix} x_1 \\ x_2 \\ x_3 \\ x_4 \end{bmatrix} : x_1 + x_2 + x_3 + x_4 = 0 \right\}$$
of $\mathbb{R}^4$.

18. Let $A = [\ldots]$
(a) Orthogonally diagonalize $A$.
(b) Give the spectral decomposition of $A$.

19. Find a symmetric matrix with eigenvalues $\lambda_1 = \lambda_2 = 1$, $\lambda_3 = -2$ and eigenspaces $E_1 = \operatorname{span}([\ldots])$, $E_{-2} = \operatorname{span}([\ldots])$

20. If $\{\mathbf{v}_1, \mathbf{v}_2, \ldots, \mathbf{v}_n\}$ is an orthonormal basis for $\mathbb{R}^n$ and
$$A = c_1\mathbf{v}_1\mathbf{v}_1^T + c_2\mathbf{v}_2\mathbf{v}_2^T + \cdots + c_n\mathbf{v}_n\mathbf{v}_n^T$$
prove that $A$ is a symmetric matrix with eigenvalues $c_1, c_2, \ldots, c_n$ and corresponding eigenvectors $\mathbf{v}_1, \mathbf{v}_2, \ldots, \mathbf{v}_n$.

Algebra is generous; she often gives more than is asked of her.
- Jean le Rond d'Alembert (1717-1783), in Carl B. Boyer, A History of Mathematics, Wiley, 1968, p. 481

6.0 Introduction: Fibonacci in (Vector) Space

The Fibonacci sequence was introduced in Section 4.6. It is the sequence

$$0, 1, 1, 2, 3, 5, 8, 13, \ldots$$

of nonnegative integers with the property that after the first two terms, each term is the sum of the two terms preceding it. Thus $0 + 1 = 1$, $1 + 1 = 2$, $1 + 2 = 3$, $2 + 3 = 5$, and so on. If we denote the terms of the Fibonacci sequence by $f_0, f_1, f_2, \ldots$, then the entire sequence is completely determined by specifying that

$$f_0 = 0,\ f_1 = 1 \quad\text{and}\quad f_n = f_{n-1} + f_{n-2} \ \text{ for } n \ge 2$$

By analogy with vector notation, let's write a sequence $x_0, x_1, x_2, x_3, \ldots$ as

$$\mathbf{x} = [x_0, x_1, x_2, x_3, \ldots)$$

The Fibonacci sequence then becomes

$$\mathbf{f} = [f_0, f_1, f_2, f_3, \ldots) = [0, 1, 1, 2, \ldots)$$

We now generalize this notion.

Definition

A Fibonacci-type sequence is any sequence $\mathbf{x} = [x_0, x_1, x_2, x_3, \ldots)$ such that $x_0$ and $x_1$ are real numbers and $x_n = x_{n-1} + x_{n-2}$ for $n \ge 2$.

For example, $[1, \sqrt{2}, 1 + \sqrt{2}, 1 + 2\sqrt{2}, 2 + 3\sqrt{2}, \ldots)$ is a Fibonacci-type sequence.

Problem 1 Write down the first five terms of three more Fibonacci-type sequences.

By analogy with vectors again, let's define the sum of two sequences $\mathbf{x} = [x_0, x_1, x_2, \ldots)$ and $\mathbf{y} = [y_0, y_1, y_2, \ldots)$ to be the sequence

$$\mathbf{x} + \mathbf{y} = [x_0 + y_0, x_1 + y_1, x_2 + y_2, \ldots)$$

If $c$ is a scalar, we can likewise define the scalar multiple of a sequence by

$$c\mathbf{x} = [cx_0, cx_1, cx_2, \ldots)$$
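The definitions above can be tried out numerically before attempting the problems that follow. The sketch below is only an illustration: the helper names (`fib_type`, `is_fib_type`, `seq_add`, `seq_scale`) are made up for this example, and checking a finite prefix of a sequence is evidence, not a proof.

```python
from fractions import Fraction

def fib_type(x0, x1, n_terms=8):
    """Return the first n_terms of the Fibonacci-type sequence with x_0 = x0, x_1 = x1."""
    terms = [x0, x1]
    while len(terms) < n_terms:
        terms.append(terms[-1] + terms[-2])
    return terms[:n_terms]

def is_fib_type(terms):
    """Check x_n = x_(n-1) + x_(n-2) for every n >= 2 in a finite prefix."""
    return all(terms[n] == terms[n - 1] + terms[n - 2] for n in range(2, len(terms)))

def seq_add(x, y):
    """Termwise sum of two sequence prefixes."""
    return [a + b for a, b in zip(x, y)]

def seq_scale(c, x):
    """Termwise scalar multiple of a sequence prefix."""
    return [c * a for a in x]

f = fib_type(0, 1)                 # the Fibonacci sequence
x = fib_type(Fraction(5, 2), -3)   # an arbitrary Fibonacci-type sequence

print(f)
print(is_fib_type(seq_add(f, x)))                    # sum of two Fibonacci-type sequences
print(is_fib_type(seq_scale(Fraction(-7, 3), x)))    # scalar multiple
```

Exact rational arithmetic (`Fraction`) is used so that the recurrence check is not disturbed by rounding.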


Chapter 6

Vector Spaces

Problem 2 (a) Using your examples from Problem 1 or other examples, compute the sums of various pairs of Fibonacci-type sequences. Do the resulting sequences appear to be Fibonacci-type? (b) Compute various scalar multiples of your Fibonacci-type sequences from Problem 1. Do the resulting sequences appear to be Fibonacci-type?

Problem 3 (a) Prove that if $\mathbf{x}$ and $\mathbf{y}$ are Fibonacci-type sequences, then so is $\mathbf{x} + \mathbf{y}$. (b) Prove that if $\mathbf{x}$ is a Fibonacci-type sequence and $c$ is a scalar, then $c\mathbf{x}$ is also a Fibonacci-type sequence.

Let's denote the set of all Fibonacci-type sequences by Fib. Problem 3 shows that, like $\mathbb{R}^n$, Fib is closed under addition and scalar multiplication. The next exercises show that Fib has much more in common with $\mathbb{R}^n$.

Problem 4 Review the algebraic properties of vectors in Theorem 1.1. Does Fib satisfy all of these properties? What Fibonacci-type sequence plays the role of $\mathbf{0}$? For a Fibonacci-type sequence $\mathbf{x}$, what is $-\mathbf{x}$? Is $-\mathbf{x}$ also a Fibonacci-type sequence?

Problem 5 In $\mathbb{R}^n$, we have the standard basis vectors $\mathbf{e}_1, \mathbf{e}_2, \ldots, \mathbf{e}_n$. The Fibonacci sequence $\mathbf{f} = [0, 1, 1, 2, \ldots)$ can be thought of as the analogue of $\mathbf{e}_2$ because its first two terms are 0 and 1. What sequence $\mathbf{e}$ in Fib plays the role of $\mathbf{e}_1$? What about $\mathbf{e}_3, \mathbf{e}_4, \ldots$? Do these vectors have analogues in Fib?

Problem 6 Let $\mathbf{x} = [x_0, x_1, x_2, \ldots)$ be a Fibonacci-type sequence. Show that $\mathbf{x}$ is a linear combination of $\mathbf{e}$ and $\mathbf{f}$.

Problem 7 Show that $\mathbf{e}$ and $\mathbf{f}$ are linearly independent. (That is, show that if $c\mathbf{e} + d\mathbf{f} = \mathbf{0}$, then $c = d = 0$.)

Problem 8 Given your answers to Problems 6 and 7, what would be a sensible value to assign to the "dimension" of Fib? Why?

Problem 9 Are there any geometric sequences in Fib? That is, if

$$[1, r, r^2, r^3, \ldots)$$

is a Fibonacci-type sequence, what are the possible values of $r$?

Problem 10 Find a "basis" for Fib consisting of geometric Fibonacci-type sequences.

Problem 11 Using your answer to Problem 10, give an alternative derivation of Binet's formula [formula (5) in Section 4.6]:

$$f_n = \frac{1}{\sqrt{5}}\left(\frac{1 + \sqrt{5}}{2}\right)^n - \frac{1}{\sqrt{5}}\left(\frac{1 - \sqrt{5}}{2}\right)^n$$

for the terms of the Fibonacci sequence $\mathbf{f} = [f_0, f_1, f_2, \ldots)$. (Hint: Express $\mathbf{f}$ in terms of the basis from Problem 10.)

The Lucas sequence is named after Edouard Lucas (see page 333).

The Lucas sequence is the Fibonacci-type sequence

$$\mathbf{l} = [l_0, l_1, l_2, l_3, \ldots) = [2, 1, 3, 4, \ldots)$$

Problem 12 Use the basis from Problem 10 to find an analogue of Binet's formula for the $n$th term $l_n$ of the Lucas sequence.

Problem 13 Prove that the Fibonacci and Lucas sequences are related by the identity

$$f_{n-1} + f_{n+1} = l_n \quad\text{for } n \ge 1$$

[Hint: The Fibonacci-type sequences $\mathbf{f}^- = [1, 1, 2, 3, \ldots)$ and $\mathbf{f}^+ = [1, 0, 1, 1, \ldots)$ form a basis for Fib. (Why?)]

In this Introduction, we have seen that the collection Fib of all Fibonacci-type sequences behaves in many respects like $\mathbb{R}^2$, even though the "vectors" are actually infinite sequences. This useful analogy leads to the general notion of a vector space that is the subject of this chapter.
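Binet's formula and the Fibonacci-Lucas identity stated above can be spot-checked numerically before they are proved. The following is a minimal sketch, not a derivation: `fib_type` and `binet` are illustrative names, floating-point arithmetic is used for the square roots, and the check covers only the first 30 terms.

```python
import math

def fib_type(x0, x1, n_terms):
    """First n_terms of the Fibonacci-type sequence starting x0, x1."""
    terms = [x0, x1]
    while len(terms) < n_terms:
        terms.append(terms[-1] + terms[-2])
    return terms[:n_terms]

def binet(n):
    """Right-hand side of Binet's formula, evaluated in floating point."""
    s5 = math.sqrt(5)
    return (((1 + s5) / 2) ** n - ((1 - s5) / 2) ** n) / s5

N = 30
f = fib_type(0, 1, N)       # Fibonacci sequence
lucas = fib_type(2, 1, N)   # Lucas sequence

# Binet's formula agrees with the recurrence after rounding to the nearest integer.
binet_ok = all(round(binet(n)) == f[n] for n in range(N))

# f_(n-1) + f_(n+1) = l_n for n >= 1, checked on the computed prefix.
identity_ok = all(f[n - 1] + f[n + 1] == lucas[n] for n in range(1, N - 1))

print(binet_ok, identity_ok)
```

Rounding is needed only because $\sqrt{5}$ is irrational; for much larger $n$ the floating-point error would eventually exceed 1/2 and an exact method would be preferable.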

6.1 Vector Spaces and Subspaces

In Chapters 1 and 3, we saw that the algebra of vectors and the algebra of matrices are similar in many respects. In particular, we can add both vectors and matrices, and we can multiply both by scalars. The properties that result from these two operations (Theorem 1.1 and Theorem 3.2) are identical in both settings. In this section, we use these properties to define generalized "vectors" that arise in a wide variety of examples. By proving general theorems about these "vectors," we will therefore simultaneously be proving results about all of these examples. This is the real power of algebra: its ability to take properties from a concrete setting, like $\mathbb{R}^n$, and abstract them into a general setting.

Definition

Let $V$ be a set on which two operations, called addition and scalar multiplication, have been defined. If $\mathbf{u}$ and $\mathbf{v}$ are in $V$, the sum of $\mathbf{u}$ and $\mathbf{v}$ is denoted by $\mathbf{u} + \mathbf{v}$, and if $c$ is a scalar, the scalar multiple of $\mathbf{u}$ by $c$ is denoted by $c\mathbf{u}$. If the following axioms hold for all $\mathbf{u}$, $\mathbf{v}$, and $\mathbf{w}$ in $V$ and for all scalars $c$ and $d$, then $V$ is called a vector space and its elements are called vectors.

1. $\mathbf{u} + \mathbf{v}$ is in $V$. (Closure under addition)
2. $\mathbf{u} + \mathbf{v} = \mathbf{v} + \mathbf{u}$ (Commutativity)
3. $(\mathbf{u} + \mathbf{v}) + \mathbf{w} = \mathbf{u} + (\mathbf{v} + \mathbf{w})$ (Associativity)
4. There exists an element $\mathbf{0}$ in $V$, called a zero vector, such that $\mathbf{u} + \mathbf{0} = \mathbf{u}$.
5. For each $\mathbf{u}$ in $V$, there is an element $-\mathbf{u}$ in $V$ such that $\mathbf{u} + (-\mathbf{u}) = \mathbf{0}$.
6. $c\mathbf{u}$ is in $V$. (Closure under scalar multiplication)
7. $c(\mathbf{u} + \mathbf{v}) = c\mathbf{u} + c\mathbf{v}$ (Distributivity)
8. $(c + d)\mathbf{u} = c\mathbf{u} + d\mathbf{u}$ (Distributivity)
9. $c(d\mathbf{u}) = (cd)\mathbf{u}$
10. $1\mathbf{u} = \mathbf{u}$

The German mathematician Hermann Grassmann (1809-1877) is generally credited with first introducing the idea of a vector space (although he did not call it that) in 1844. Unfortunately, his work was very difficult to read and did not receive the attention it deserved. One person who did study it was the Italian mathematician Giuseppe Peano (1858-1932). In his 1888 book Calcolo Geometrico, Peano clarified Grassmann's earlier work and laid down the axioms for a vector space as we know them today. Peano's book is also remarkable for introducing operations on sets. His notations $\cup$, $\cap$, and $\in$ (for "union," "intersection," and "is an element of") are the ones we still use, although they were not immediately accepted by other mathematicians. Peano's axiomatic definition of a vector space also had very little influence for many years. Acceptance came in 1918, after Hermann Weyl (1885-1955) repeated it in his book Space, Time, Matter, an introduction to Einstein's general theory of relativity.

Remarks

• By "scalars" we will usually mean the real numbers. Accordingly, we should refer to $V$ as a real vector space (or a vector space over the real numbers). It is also possible for scalars to be complex numbers or to belong to $\mathbb{Z}_p$, where $p$ is prime. In these cases, $V$ is called a complex vector space or a vector space over $\mathbb{Z}_p$, respectively. Most of our examples will be real vector spaces, so we will usually omit the adjective "real." If something is referred to as a "vector space," assume that we are working over the real number system. In fact, the scalars can be chosen from any number system in which, roughly speaking, we can add, subtract, multiply, and divide according to the usual laws of arithmetic. In abstract algebra, such a number system is called a field.

• The definition of a vector space does not specify what the set $V$ consists of. Neither does it specify what the operations called "addition" and "scalar multiplication" look like. Often, they will be familiar, but they need not be. See Example 6 below and Exercises 5-7.

We will now look at several examples of vector spaces. In each case, we need to specify the set $V$ and the operations of addition and scalar multiplication and to verify axioms 1 through 10. We need to pay particular attention to axioms 1 and 6 (closure), axiom 4 (the existence of a zero vector in $V$), and axiom 5 (each vector in $V$ must have a negative in $V$).

Example 6.1

For any $n \ge 1$, $\mathbb{R}^n$ is a vector space with the usual operations of addition and scalar multiplication. Axioms 1 and 6 follow from the definitions of these operations, and the remaining axioms follow from Theorem 1.1.

Example 6.2

The set of all $2 \times 3$ matrices is a vector space with the usual operations of matrix addition and matrix scalar multiplication. Here the "vectors" are actually matrices. We know that the sum of two $2 \times 3$ matrices is also a $2 \times 3$ matrix and that multiplying a $2 \times 3$ matrix by a scalar gives another $2 \times 3$ matrix; hence, we have closure. The remaining axioms follow from Theorem 3.2. In particular, the zero vector $\mathbf{0}$ is the $2 \times 3$ zero matrix, and the negative of a $2 \times 3$ matrix $A$ is just the $2 \times 3$ matrix $-A$. There is nothing special about $2 \times 3$ matrices. For any positive integers $m$ and $n$, the set of all $m \times n$ matrices forms a vector space with the usual operations of matrix addition and matrix scalar multiplication. This vector space is denoted $M_{mn}$.

Example 6.3

Let $\mathscr{P}_2$ denote the set of all polynomials of degree 2 or less with real coefficients. Define addition and scalar multiplication in the usual way. (See Appendix D.) If

$$p(x) = a_0 + a_1x + a_2x^2 \quad\text{and}\quad q(x) = b_0 + b_1x + b_2x^2$$

are in $\mathscr{P}_2$, then

$$p(x) + q(x) = (a_0 + b_0) + (a_1 + b_1)x + (a_2 + b_2)x^2$$

has degree at most 2 and so is in $\mathscr{P}_2$. If $c$ is a scalar, then

$$cp(x) = ca_0 + ca_1x + ca_2x^2$$

is also in $\mathscr{P}_2$. This verifies axioms 1 and 6. The zero vector $\mathbf{0}$ is the zero polynomial, that is, the polynomial all of whose coefficients are zero. The negative of a polynomial $p(x) = a_0 + a_1x + a_2x^2$ is the polynomial $-p(x) = -a_0 - a_1x - a_2x^2$. It is now easy to verify the remaining axioms. We will check axiom 2 and leave the others for Exercise 12. With $p(x)$ and $q(x)$ as above, we have

$$\begin{aligned} p(x) + q(x) &= (a_0 + a_1x + a_2x^2) + (b_0 + b_1x + b_2x^2) \\ &= (a_0 + b_0) + (a_1 + b_1)x + (a_2 + b_2)x^2 \\ &= (b_0 + a_0) + (b_1 + a_1)x + (b_2 + a_2)x^2 \\ &= (b_0 + b_1x + b_2x^2) + (a_0 + a_1x + a_2x^2) \\ &= q(x) + p(x) \end{aligned}$$

where the third equality follows from the fact that addition of real numbers is commutative.
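The operations in $\mathscr{P}_2$ act coefficient by coefficient, which can be made concrete with a small sketch. The representation below (a polynomial $a_0 + a_1x + a_2x^2$ stored as the tuple `(a0, a1, a2)`) and the function names are assumptions chosen for illustration only.

```python
def poly_add(p, q):
    """Add two polynomials in P_2 given as coefficient tuples (a0, a1, a2)."""
    return tuple(a + b for a, b in zip(p, q))

def poly_scale(c, p):
    """Multiply a polynomial in P_2 by the scalar c."""
    return tuple(c * a for a in p)

def evaluate(p, x):
    """Evaluate a0 + a1*x + a2*x**2 at the number x."""
    return sum(a * x**k for k, a in enumerate(p))

p = (1, -2, 3)    # 1 - 2x + 3x^2
q = (4, 0, -3)    # 4 - 3x^2

s = poly_add(p, q)          # (5, -2, 0): degree drops to 1, still "at most 2"
print(s, poly_add(q, p))    # axiom 2: the two sums agree
print(poly_scale(2, p))     # axiom 6: still three coefficients, so still in P_2
print(evaluate(s, 2), evaluate(p, 2) + evaluate(q, 2))
```

The sum here has a zero $x^2$ coefficient, which is why closure is stated for degree "at most" 2 rather than exactly 2.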


In general, for any fixed $n \ge 0$, the set $\mathscr{P}_n$ of all polynomials of degree less than or equal to $n$ is a vector space, as is the set $\mathscr{P}$ of all polynomials.

Example 6.4

Let $\mathscr{F}$ denote the set of all real-valued functions defined on the real line. If $f$ and $g$ are two such functions and $c$ is a scalar, then $f + g$ and $cf$ are defined by

$$(f + g)(x) = f(x) + g(x) \quad\text{and}\quad (cf)(x) = cf(x)$$

In other words, the value of $f + g$ at $x$ is obtained by adding together the values of $f$ and $g$ at $x$ [Figure 6.1(a)]. Similarly, the value of $cf$ at $x$ is just the value of $f$ at $x$ multiplied by the scalar $c$ [Figure 6.1(b)]. The zero vector in $\mathscr{F}$ is the constant function $f_0$ that is identically zero; that is, $f_0(x) = 0$ for all $x$. The negative of a function $f$ is the function $-f$ defined by $(-f)(x) = -f(x)$ [Figure 6.1(c)]. Axioms 1 and 6 are obviously true. Verification of the remaining axioms is left as Exercise 13. Thus, $\mathscr{F}$ is a vector space.

Figure 6.1 The graphs of (a) $f$, $g$, and $f + g$, (b) $f$, $2f$, and $-3f$, and (c) $f$ and $-f$


In Example 4, we could also have considered only those functions defined on some closed interval $[a, b]$ of the real line. This approach also produces a vector space, denoted by $\mathscr{F}[a, b]$.

Example 6.5

The set $\mathbb{Z}$ of integers with the usual operations is not a vector space. To demonstrate this, it is enough to find that one of the ten axioms fails and to give a specific instance in which it fails (a counterexample). In this case, we find that we do not have closure under scalar multiplication. For example, the multiple of the integer 2 by the scalar $\frac{1}{3}$ is $(\frac{1}{3})(2) = \frac{2}{3}$, which is not an integer. Thus, it is not true that $cx$ is in $\mathbb{Z}$ for every $x$ in $\mathbb{Z}$ and every scalar $c$ (i.e., axiom 6 fails).

Example 6.6

Let $V = \mathbb{R}^2$ with the usual definition of addition but the following definition of scalar multiplication:

$$c\begin{bmatrix} x \\ y \end{bmatrix} = \begin{bmatrix} cx \\ 0 \end{bmatrix}$$

Then, for example,

$$1\begin{bmatrix} 2 \\ 3 \end{bmatrix} = \begin{bmatrix} 2 \\ 0 \end{bmatrix} \ne \begin{bmatrix} 2 \\ 3 \end{bmatrix}$$

so axiom 10 fails. [In fact, the other nine axioms are all true (check this), but we do not need to look into them because $V$ has already failed to be a vector space. This example shows the value of looking ahead, rather than working through the list of axioms in the order in which they have been given.]
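A quick way to hunt for a failing axiom is to test each one on sample data. The sketch below assumes the nonstandard rule $c[x, y] = [cx, 0]$ for scalar multiplication (one rule for which axiom 10 fails while the other scalar axioms hold); the names `add`, `smul`, and `checks` are hypothetical. A single sample can refute an axiom but can never prove one.

```python
from fractions import Fraction

def add(u, v):
    """Usual addition in R^2."""
    return (u[0] + v[0], u[1] + v[1])

def smul(c, u):
    """Assumed nonstandard scalar multiplication: c[x, y] = [cx, 0]."""
    return (c * u[0], 0)

u, v = (2, 3), (-1, 5)
c, d = Fraction(1, 2), 4

checks = {
    "axiom 7": smul(c, add(u, v)) == add(smul(c, u), smul(c, v)),
    "axiom 8": smul(c + d, u) == add(smul(c, u), smul(d, u)),
    "axiom 9": smul(c, smul(d, u)) == smul(c * d, u),
    "axiom 10": smul(1, u) == u,
}

for name, holds in checks.items():
    print(name, "holds on this sample" if holds else "FAILS on this sample")
```

On this sample only axiom 10 is reported as failing, which is already enough to show that the structure is not a vector space.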

Example 6.7

Let $\mathbb{C}^2$ denote the set of all ordered pairs of complex numbers. Define addition and scalar multiplication as in $\mathbb{R}^2$, except here the scalars are complex numbers. For example,

$$\begin{bmatrix} 1 + i \\ 2 - 3i \end{bmatrix} + \begin{bmatrix} -3 + 2i \\ 4 \end{bmatrix} = \begin{bmatrix} -2 + 3i \\ 6 - 3i \end{bmatrix}$$

and

$$(1 - i)\begin{bmatrix} 1 + i \\ 2 - 3i \end{bmatrix} = \begin{bmatrix} (1 - i)(1 + i) \\ (1 - i)(2 - 3i) \end{bmatrix} = \begin{bmatrix} 2 \\ -1 - 5i \end{bmatrix}$$

Using properties of the complex numbers, it is straightforward to check that all ten axioms hold. Therefore, $\mathbb{C}^2$ is a complex vector space.

In general, $\mathbb{C}^n$ is a complex vector space for all $n \ge 1$.

Example 6.8

If $p$ is prime, the set $\mathbb{Z}_p^n$ (with the usual definitions of addition and multiplication by scalars from $\mathbb{Z}_p$) is a vector space over $\mathbb{Z}_p$ for all $n \ge 1$.
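The arithmetic in these two examples is easy to reproduce. The sketch below uses Python's built-in complex numbers for $\mathbb{C}^2$ and reduction modulo $p$ for $\mathbb{Z}_p^n$; the helper names and the sample vectors in $\mathbb{Z}_5^3$ are illustrative assumptions, not taken from the text.

```python
def vec_add(a, b):
    """Componentwise sum of two vectors."""
    return tuple(x + y for x, y in zip(a, b))

def vec_scale(c, a):
    """Componentwise scalar multiple."""
    return tuple(c * x for x in a)

# Vectors in C^2, with complex scalars.
u = (1 + 1j, 2 - 3j)
v = (-3 + 2j, 4 + 0j)
total = vec_add(u, v)            # [-2 + 3i, 6 - 3i]
scaled = vec_scale(1 - 1j, u)    # [2, -1 - 5i]

def add_mod(a, b, p):
    """Sum of two vectors in Z_p^n."""
    return tuple((x + y) % p for x, y in zip(a, b))

def scale_mod(c, a, p):
    """Multiple of a vector in Z_p^n by a scalar c from Z_p."""
    return tuple((c * x) % p for x in a)

# Hypothetical sample vectors in Z_5^3.
w = add_mod((1, 4, 3), (2, 3, 4), 5)
z = scale_mod(3, (1, 4, 3), 5)

print(total, scaled, w, z)
```

Every result of `add_mod` and `scale_mod` has entries in $\{0, 1, \ldots, p - 1\}$, which is the closure property needed for axioms 1 and 6.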


Before we consider further examples, we state a theorem that contains some useful properties of vector spaces. It is important to note that, by proving this theorem for vector spaces in general, we are actually proving it for every specific vector space.

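Theorem 6.1

Let $V$ be a vector space, $\mathbf{u}$ a vector in $V$, and $c$ a scalar.

a. $0\mathbf{u} = \mathbf{0}$
b. $c\mathbf{0} = \mathbf{0}$
c. $(-1)\mathbf{u} = -\mathbf{u}$
d. If $c\mathbf{u} = \mathbf{0}$, then $c = 0$ or $\mathbf{u} = \mathbf{0}$.

Proof We prove properties (b) and (d) and leave the proofs of the remaining properties as exercises.

(b) We have

$$c\mathbf{0} = c(\mathbf{0} + \mathbf{0}) = c\mathbf{0} + c\mathbf{0}$$

by vector space axioms 4 and 7. Adding the negative of $c\mathbf{0}$ to both sides produces

$$c\mathbf{0} + (-c\mathbf{0}) = (c\mathbf{0} + c\mathbf{0}) + (-c\mathbf{0})$$

which implies

$$\begin{aligned} \mathbf{0} &= c\mathbf{0} + (c\mathbf{0} + (-c\mathbf{0})) && \text{By axioms 5 and 3} \\ &= c\mathbf{0} + \mathbf{0} && \text{By axiom 5} \\ &= c\mathbf{0} && \text{By axiom 4} \end{aligned}$$

(d) Suppose $c\mathbf{u} = \mathbf{0}$. To show that either $c = 0$ or $\mathbf{u} = \mathbf{0}$, let's assume that $c \ne 0$. (If $c = 0$, there is nothing to prove.) Then, since $c \ne 0$, its reciprocal $1/c$ is defined, and

$$\begin{aligned} \mathbf{u} &= 1\mathbf{u} && \text{By axiom 10} \\ &= \left(\frac{1}{c}\,c\right)\mathbf{u} \\ &= \frac{1}{c}(c\mathbf{u}) && \text{By axiom 9} \\ &= \frac{1}{c}\mathbf{0} \\ &= \mathbf{0} && \text{By property (b)} \end{aligned}$$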

We will write $\mathbf{u} - \mathbf{v}$ for $\mathbf{u} + (-\mathbf{v})$, thereby defining subtraction of vectors. We will also exploit the associativity property of addition to unambiguously write $\mathbf{u} + \mathbf{v} + \mathbf{w}$ for the sum of three vectors and, more generally,

$$c_1\mathbf{u}_1 + c_2\mathbf{u}_2 + \cdots + c_n\mathbf{u}_n$$

for a linear combination of vectors.

Subspaces

We have seen that, in $\mathbb{R}^n$, it is possible for one vector space to sit inside another one, giving rise to the notion of a subspace. For example, a plane through the origin is a subspace of $\mathbb{R}^3$. We now extend this concept to general vector spaces.


Definition

A subset $W$ of a vector space $V$ is called a subspace of $V$ if $W$ is itself a vector space with the same scalars, addition, and scalar multiplication as $V$.

As in $\mathbb{R}^n$, checking to see whether a subset $W$ of a vector space $V$ is a subspace of $V$ involves testing only two of the ten vector space axioms. We prove this observation as a theorem.

Theorem 6.2

Let $V$ be a vector space and let $W$ be a nonempty subset of $V$. Then $W$ is a subspace of $V$ if and only if the following conditions hold:

a. If $\mathbf{u}$ and $\mathbf{v}$ are in $W$, then $\mathbf{u} + \mathbf{v}$ is in $W$.
b. If $\mathbf{u}$ is in $W$ and $c$ is a scalar, then $c\mathbf{u}$ is in $W$.

Proof Assume that $W$ is a subspace of $V$. Then $W$ satisfies vector space axioms 1 to 10. In particular, axiom 1 is condition (a) and axiom 6 is condition (b).

Conversely, assume that $W$ is a subset of a vector space $V$, satisfying conditions (a) and (b). By hypothesis, axioms 1 and 6 hold. Axioms 2, 3, 7, 8, 9, and 10 hold in $W$ because they are true for all vectors in $V$ and thus are true in particular for those vectors in $W$. (We say that $W$ inherits these properties from $V$.) This leaves axioms 4 and 5 to be checked. Since $W$ is nonempty, it contains at least one vector $\mathbf{u}$. Then condition (b) and Theorem 6.1(a) imply that $0\mathbf{u} = \mathbf{0}$ is also in $W$. This is axiom 4. If $\mathbf{u}$ is in $W$, then, by taking $c = -1$ in condition (b), we have that $-\mathbf{u} = (-1)\mathbf{u}$ is also in $W$, using Theorem 6.1(c). This is axiom 5.

Remark Since Theorem 6.2 generalizes the notion of a subspace from the context of $\mathbb{R}^n$ to general vector spaces, all of the subspaces of $\mathbb{R}^n$ that we encountered in Chapter 3 are subspaces of $\mathbb{R}^n$ in the current context. In particular, lines and planes through the origin are subspaces of $\mathbb{R}^3$.

Example 6.9

We have already shown that the set $\mathscr{P}_n$ of all polynomials with degree at most $n$ is a vector space. Hence, $\mathscr{P}_n$ is a subspace of the vector space $\mathscr{P}$ of all polynomials.

Example 6.10

Let $W$ be the set of symmetric $n \times n$ matrices. Show that $W$ is a subspace of $M_{nn}$.

Solution Clearly, $W$ is nonempty, so we need only check conditions (a) and (b) in Theorem 6.2. Let $A$ and $B$ be in $W$ and let $c$ be a scalar. Then $A^T = A$ and $B^T = B$, from which it follows that

$$(A + B)^T = A^T + B^T = A + B$$

Therefore, $A + B$ is symmetric and, hence, is in $W$. Similarly,

$$(cA)^T = cA^T = cA$$

so $cA$ is symmetric and, thus, is in $W$. We have shown that $W$ is closed under addition and scalar multiplication. Therefore, it is a subspace of $M_{nn}$, by Theorem 6.2.
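The two closure conditions of Theorem 6.2 can be illustrated on randomly generated symmetric matrices. This is a sketch under stated assumptions: matrices are plain lists of rows, the helper names are invented for the example, and a random test only illustrates the argument above; it does not replace it.

```python
import random

def transpose(M):
    """Transpose of a matrix given as a list of rows."""
    return [list(row) for row in zip(*M)]

def mat_add(A, B):
    """Entrywise sum of two matrices of the same size."""
    return [[a + b for a, b in zip(ra, rb)] for ra, rb in zip(A, B)]

def mat_scale(c, A):
    """Scalar multiple of a matrix."""
    return [[c * a for a in row] for row in A]

def is_symmetric(M):
    """True when M equals its transpose."""
    return M == transpose(M)

def random_symmetric(n, rng):
    """Random n x n symmetric matrix with small integer entries."""
    M = [[0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i, n):
            M[i][j] = M[j][i] = rng.randint(-9, 9)
    return M

rng = random.Random(0)            # fixed seed so the run is repeatable
A = random_symmetric(4, rng)
B = random_symmetric(4, rng)

closed_add = is_symmetric(mat_add(A, B))       # condition (a)
closed_scale = is_symmetric(mat_scale(-3, A))  # condition (b)
print(closed_add, closed_scale)
```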

Example 6.11

Let $\mathscr{C}$ be the set of all continuous real-valued functions defined on $\mathbb{R}$ and let $\mathscr{D}$ be the set of all differentiable real-valued functions defined on $\mathbb{R}$. Show that [...] is a subspace of [...] and $M_{22}$. Typical elements of these vector spaces are, respectively,

$$\begin{bmatrix} a \\ b \\ c \\ d \end{bmatrix}, \qquad p(x) = a + bx + cx^2 + dx^3, \qquad \begin{bmatrix} a & b \\ c & d \end{bmatrix}$$

In the words of Yogi Berra, "It's déjà vu all over again."

Any calculations involving the vector space operations of addition and scalar multiplication are essentially the same in all three settings. To highlight the similarities, in the next example we will perform the necessary steps in the three vector spaces side by side.

Example 6.13

(a) Show that the set $W$ of all vectors of the form $\begin{bmatrix} a \\ b \\ -b \\ a \end{bmatrix}$ is a subspace of $\mathbb{R}^4$.

(b) Show that the set $W$ of all polynomials of the form $a + bx - bx^2 + ax^3$ is a subspace of $\mathscr{P}_3$.

(c) Show that the set $W$ of all matrices of the form $\begin{bmatrix} a & b \\ -b & a \end{bmatrix}$ is a subspace of $M_{22}$.

Solution

(a) $W$ is nonempty because it contains the zero vector $\mathbf{0}$. (Take $a = b = 0$.) Let $\mathbf{u}$ and $\mathbf{v}$ be in $W$, say,

$$\mathbf{u} = \begin{bmatrix} a \\ b \\ -b \\ a \end{bmatrix} \quad\text{and}\quad \mathbf{v} = \begin{bmatrix} c \\ d \\ -d \\ c \end{bmatrix}$$

Then

$$\mathbf{u} + \mathbf{v} = \begin{bmatrix} a + c \\ b + d \\ -b - d \\ a + c \end{bmatrix} = \begin{bmatrix} a + c \\ b + d \\ -(b + d) \\ a + c \end{bmatrix}$$

so $\mathbf{u} + \mathbf{v}$ is also in $W$ (because it has the right form). Similarly, if $k$ is a scalar, then

$$k\mathbf{u} = \begin{bmatrix} ka \\ kb \\ -kb \\ ka \end{bmatrix}$$

so $k\mathbf{u}$ is in $W$. Thus, $W$ is a nonempty subset of $\mathbb{R}^4$ that is closed under addition and scalar multiplication. Therefore, $W$ is a subspace of $\mathbb{R}^4$, by Theorem 6.2.

(b) $W$ is nonempty because it contains the zero polynomial. (Take $a = b = 0$.) Let $p(x)$ and $q(x)$ be in $W$, say,

$$p(x) = a + bx - bx^2 + ax^3 \quad\text{and}\quad q(x) = c + dx - dx^2 + cx^3$$

Then

$$p(x) + q(x) = (a + c) + (b + d)x - (b + d)x^2 + (a + c)x^3$$

so $p(x) + q(x)$ is also in $W$ (because it has the right form). Similarly, if $k$ is a scalar, then

$$kp(x) = ka + kbx - kbx^2 + kax^3$$

so $kp(x)$ is in $W$. Thus, $W$ is a nonempty subset of $\mathscr{P}_3$ that is closed under addition and scalar multiplication. Therefore, $W$ is a subspace of $\mathscr{P}_3$, by Theorem 6.2.

(c) $W$ is nonempty because it contains the zero matrix $O$. (Take $a = b = 0$.) Let $A$ and $B$ be in $W$, say,

$$A = \begin{bmatrix} a & b \\ -b & a \end{bmatrix} \quad\text{and}\quad B = \begin{bmatrix} c & d \\ -d & c \end{bmatrix}$$

Then

$$A + B = \begin{bmatrix} a + c & b + d \\ -(b + d) & a + c \end{bmatrix}$$

so $A + B$ is also in $W$ (because it has the right form). Similarly, if $k$ is a scalar, then

$$kA = \begin{bmatrix} ka & kb \\ -kb & ka \end{bmatrix}$$

so $kA$ is in $W$. Thus, $W$ is a nonempty subset of $M_{22}$ that is closed under addition and scalar multiplication. Therefore, $W$ is a subspace of $M_{22}$, by Theorem 6.2.

Example 6.13 shows that it is often possible to relate examples that, on the surface, appear to have nothing in common. Consequently, we can apply our knowledge of $\mathbb{R}^n$ to polynomials, matrices, and other examples. We will encounter this idea several times in this chapter and will make it pre[...]

[...] determine whether $r(x) = 1 - 4x + 6x^2$ is in $\operatorname{span}(p(x), q(x))$, where

$$p(x) = 1 - x + x^2 \quad\text{and}\quad q(x) = 2 + x - 3x^2$$

Solution We are looking for scalars $c$ and $d$ such that $cp(x) + dq(x) = r(x)$. This means that

$$c(1 - x + x^2) + d(2 + x - 3x^2) = 1 - 4x + 6x^2$$

Regrouping according to powers of $x$, we have

$$(c + 2d) + (-c + d)x + (c - 3d)x^2 = 1 - 4x + 6x^2$$

Equating the coefficients of like powers of $x$ gives

$$\begin{aligned} c + 2d &= 1 \\ -c + d &= -4 \\ c - 3d &= 6 \end{aligned}$$

which is easily solved to give $c = 3$ and $d = -1$. Therefore, $r(x) = 3p(x) - q(x)$, so $r(x)$ is in $\operatorname{span}(p(x), q(x))$. (Check this.)
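The "Check this" can be carried out mechanically. The sketch below solves the first two equations by Cramer's rule (a choice made here for brevity, not the method of the text) and then confirms that the third equation and the polynomial identity both hold; the variable names are illustrative.

```python
from fractions import Fraction

# Each row (a, b, r) encodes the equation a*c + b*d = r:
#    c + 2d =  1
#   -c +  d = -4
#    c - 3d =  6
rows = [(1, 2, 1), (-1, 1, -4), (1, -3, 6)]

# Solve the first two equations by Cramer's rule.
(a1, b1, r1), (a2, b2, r2) = rows[0], rows[1]
det = a1 * b2 - a2 * b1
c = Fraction(r1 * b2 - r2 * b1, det)
d = Fraction(a1 * r2 - a2 * r1, det)

# The overdetermined system is consistent only if all three equations hold.
consistent = all(a * c + b * d == r for a, b, r in rows)

# Same check on coefficient tuples (constant, x, x^2).
p, q, r = (1, -1, 1), (2, 1, -3), (1, -4, 6)
combo = tuple(c * pk + d * qk for pk, qk in zip(p, q))

print(c, d, consistent, combo == r)
```

Had the third equation failed, $r(x)$ would not lie in the span, since three equations in two unknowns need not be consistent.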

Example 6.20

In $\mathscr{F}$, determine whether $\sin 2x$ is in $\operatorname{span}(\sin x, \cos x)$.

Solution We set $c \sin x + d \cos x = \sin 2x$ and try to determine $c$ and $d$ so that this equation is true. Since these are functions, the equation must be true for all values of $x$. Setting $x = 0$, we have

$$c \sin 0 + d \cos 0 = \sin 0 \quad\text{or}\quad c(0) + d(1) = 0$$

from which we see that $d = 0$. Setting $x = \pi/2$, we get

$$c \sin(\pi/2) + d \cos(\pi/2) = \sin(\pi) \quad\text{or}\quad c(1) + d(0) = 0$$

giving $c = 0$. But this implies that $\sin 2x = 0(\sin x) + 0(\cos x) = 0$ for all $x$, which is absurd, since $\sin 2x$ is not the zero function. We conclude that $\sin 2x$ is not in $\operatorname{span}(\sin x, \cos x)$.

Remark It is true that $\sin 2x$ can be written in terms of $\sin x$ and $\cos x$. For example, we have the double angle formula $\sin 2x = 2 \sin x \cos x$. However, this is not a linear combination.
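The argument by evaluation at particular points can be mirrored numerically. In this sketch the function name `residual` and the third test point $x = \pi/4$ are choices made for illustration; floating-point values stand in for the exact ones, so comparisons use a small tolerance.

```python
import math

def residual(c, d, x):
    """Value of c*sin(x) + d*cos(x) - sin(2x); zero wherever the equation holds."""
    return c * math.sin(x) + d * math.cos(x) - math.sin(2 * x)

# x = 0 gives c*0 + d*1 = sin 0, which forces d.
d = math.sin(0.0) / math.cos(0.0)

# x = pi/2 gives c*1 + d*0 = sin(pi), which forces c (zero up to rounding).
c = math.sin(math.pi) / math.sin(math.pi / 2)

# With c and d forced to (essentially) zero, the equation fails at x = pi/4,
# where sin(2x) = 1.
gap = residual(c, d, math.pi / 4)
print(c, d, gap)
```

The nonzero residual at a third point is the numerical counterpart of the contradiction reached in the example.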

Example 6.21

In $M_{22}$, describe the span of $A = [\ldots]$, $B = [\ldots]$, and $C = [\ldots]$

Solution Every linear combination of $A$, $B$, and $C$ is of the form

$$cA + dB + eC = [\ldots]$$

[...] $\mathbf{v}_2, \ldots, \mathbf{v}_k$. Then, since $W$ is closed under addition and scalar multiplication, it contains every linear combination $c_1\mathbf{v}_1 + c_2\mathbf{v}_2 + \cdots + c_k\mathbf{v}_k$ of $\mathbf{v}_1, \mathbf{v}_2, \ldots, \mathbf{v}_k$. Therefore, $\operatorname{span}(\mathbf{v}_1, \mathbf{v}_2, \ldots, \mathbf{v}_k)$ is contained in $W$.

Exercises 6.1

In Exercises 1-11, determine whether the given set, together with the specified operations of addition and scalar multiplication, is a vector space. If it is not, list all of the axioms that fail to hold.

1. The set of all vectors in $\mathbb{R}^2$ of the form [...], with the usual vector addition and scalar multiplication

2. The set of all vectors $\begin{bmatrix} x \\ y \end{bmatrix}$ in $\mathbb{R}^2$ with $x \ge 0$, $y \ge 0$ (i.e., the first quadrant), with the usual vector addition and scalar multiplication

3. The set of all vectors $\begin{bmatrix} x \\ y \end{bmatrix}$ in $\mathbb{R}^2$ with $xy \ge 0$ (i.e., the union of the first and third quadrants), with the usual vector addition and scalar multiplication

4. The set of all vectors $\begin{bmatrix} x \\ y \end{bmatrix}$ in $\mathbb{R}^2$ with $x \ge y$, with the usual vector addition and scalar multiplication

5. $\mathbb{R}^2$, with the usual addition but scalar multiplication defined by [...]

6. $\mathbb{R}^2$, with the usual scalar multiplication but addition defined by
$$\begin{bmatrix} x_1 \\ y_1 \end{bmatrix} + \begin{bmatrix} x_2 \\ y_2 \end{bmatrix} = \begin{bmatrix} x_1 + x_2 + 1 \\ y_1 + y_2 + 1 \end{bmatrix}$$

7. The set of all positive real numbers, with addition $\oplus$ defined by $x \oplus y = xy$ and scalar multiplication $\odot$ defined by $c \odot x = x^c$

8. The set of all rational numbers, with the usual addition and multiplication

9. The set of all upper triangular $2 \times 2$ matrices, with the usual matrix addition and scalar multiplication

10. The set of all $2 \times 2$ matrices of the form $\begin{bmatrix} a & b \\ c & d \end{bmatrix}$, where $ad = 0$, with the usual matrix addition and scalar multiplication

11. The set of all skew-symmetric $n \times n$ matrices, with the usual matrix addition and scalar multiplication (see Exercises 3.2).

12. Finish verifying that $\mathscr{P}_2$ is a vector space (see Example 6.3).

13. Finish verifying that $\mathscr{F}$ is a vector space (see Example 6.4).


In Exercises 14-17, determine whether the given set, together with the specified operations of addition and scalar multiplication, is a complex vector space. If it is not, list all of the axioms that fail to hold.

14. The set of all vectors in $\mathbb{C}^2$ of the form [...], with the usual vector addition and scalar multiplication

15. The set $M_{mn}(\mathbb{C})$ of all $m \times n$ complex matrices, with the usual matrix addition and scalar multiplication

16. The set $\mathbb{C}^2$, with the usual vector addition but scalar multiplication defined by [...]

17. $\mathbb{R}^n$, with the usual vector addition and scalar multiplication

In Exercises 18-21, determine whether the given set, together with the specified operations of addition and scalar multiplication, is a vector space over the indicated $\mathbb{Z}_p$. If it is not, list all of the axioms that fail to hold.

18. The set of all vectors in $\mathbb{Z}_2^n$ with an even number of 1s, over $\mathbb{Z}_2$ with the usual vector addition and scalar multiplication

19. The set of all vectors in $\mathbb{Z}_2^n$ with an odd number of 1s, over $\mathbb{Z}_2$ with the usual vector addition and scalar multiplication

20. The set $M_{mn}(\mathbb{Z}_p)$ of all $m \times n$ matrices with entries from $\mathbb{Z}_p$, over $\mathbb{Z}_p$ with the usual matrix addition and scalar multiplication

21. $\mathbb{Z}_6$, over $\mathbb{Z}_3$ with the usual addition and multiplication (Think this one through carefully!)

22. Prove Theorem 6.1(a).

23. Prove Theorem 6.1(c).

In Exercises 24-45, use Theorem 6.2 to determine whether $W$ is a subspace of $V$.

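25. $V = \mathbb{R}^3$, $W = \left\{ \begin{bmatrix} a \\ -a \\ 2a \end{bmatrix} \right\}$

26. $V = \mathbb{R}^3$, $W = \left\{ \begin{bmatrix} a \\ b \\ a + b + 1 \end{bmatrix} \right\}$

27. $V = \mathbb{R}^3$, $W = \left\{ \begin{bmatrix} a \\ b \\ |a| \end{bmatrix} \right\}$

28. $V = M_{22}$, $W = \{[\ldots]\}$

29. $V = M_{22}$, $W = \left\{ \begin{bmatrix} a & b \\ c & d \end{bmatrix} : ad \ge bc \right\}$

30. $V = M_{nn}$, $W = \{A \text{ in } M_{nn} : \det A = 1\}$

31. $V = M_{nn}$, $W$ is the set of diagonal $n \times n$ matrices

32. $V = M_{nn}$, $W$ is the set of idempotent $n \times n$ matrices

33. $V = M_{nn}$, $W = \{A \text{ in } M_{nn} : AB = BA\}$, where $B$ is a given (fixed) matrix

34. $V = \mathscr{P}_2$, $W = \{bx + cx^2\}$

35. $V = \mathscr{P}_2$, $W = \{a + bx + cx^2 : a + b + c = 0\}$

36. $V = \mathscr{P}_2$, $W = \{a + bx + cx^2 : abc = 0\}$

37. $V = \mathscr{P}$, $W$ is the set of all polynomials of degree 3

38. $V = \mathscr{F}$, $W = \{f \text{ in } \mathscr{F} : f(-x) = f(x)\}$

39. $V = \mathscr{F}$, $W = \{f \text{ in } \mathscr{F} : f(-x) = -f(x)\}$

40. $V = \mathscr{F}$, $W = \{f \text{ in } \mathscr{F} : f(0) = 1\}$

41. $V = \mathscr{F}$, $W = \{f \text{ in } \mathscr{F} : f(0) = 0\}$

42. $V = \mathscr{F}$, $W$ is the set of all integrable functions

43. $V = \mathscr{D}$, $W = \{f \text{ in } \mathscr{D} : f'(x) \ge 0 \text{ for all } x\}$

44. $V = \mathscr{F}$, $W = \mathscr{C}^{(2)}$, the set of all functions with continuous second derivatives

45. $V = \mathscr{F}$, $W = \{f \text{ in } \mathscr{F} : \lim_{x \to 0} f(x) = \infty\}$

46. Let $V$ be a vector space with subspaces $U$ and $W$. Prove that $U \cap W$ is a subspace of $V$.

47. Let $V$ be a vector space with subspaces $U$ and $W$. Give an example with $V = \mathbb{R}^2$ to show that $U \cup W$ need not be a subspace of $V$.

48. Let $V$ be a vector space with subspaces $U$ and $W$. Define the sum of $U$ and $W$ to be

$$U + W = \{\mathbf{u} + \mathbf{w} : \mathbf{u} \text{ is in } U, \mathbf{w} \text{ is in } W\}$$

(a) If $V = \mathbb{R}^3$, $U$ is the $x$-axis, and $W$ is the $y$-axis, what is $U + W$?
(b) If $U$ and $W$ are subspaces of a vector space $V$, prove that $U + W$ is a subspace of $V$.

49. If $U$ and $V$ are vector spaces, define the Cartesian product of $U$ and $V$ to be

$$U \times V = \{(\mathbf{u}, \mathbf{v}) : \mathbf{u} \text{ is in } U \text{ and } \mathbf{v} \text{ is in } V\}$$

Prove that $U \times V$ is a vector space.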

50. Let $W$ be a subspace of a vector space $V$. Prove that $\Delta = \{(\mathbf{w}, \mathbf{w}) : \mathbf{w} \text{ is in } W\}$ is a subspace of $V \times V$.

In Exercises 51 and 52, let $A = [\ldots]$ and $B = [\ldots]$. Determine whether $C$ is in $\operatorname{span}(A, B)$.

51. $C = [\ldots]$

52. $C = [\ldots]$

In Exercises 53 and 54, let $p(x) = 1 - 2x$, $q(x) = x - x^2$, and $r(x) = -2 + 3x + x^2$. Determine whether $s(x)$ is in $\operatorname{span}(p(x), q(x), r(x))$.

53. $s(x) = 3 - 5x - x^2$

54. $s(x) = 1 + x + x^2$

In Exercises 55-58, let $f(x) = \sin^2 x$ and $g(x) = \cos^2 x$. Determine whether $h(x)$ is in $\operatorname{span}(f(x), g(x))$.

55. $h(x) = 1$

56. $h(x) = \cos 2x$

57. $h(x) = \sin 2x$

58. $h(x) = \sin x$

59. Is $M_{22}$ spanned by [...]?

60. Is $M_{22}$ spanned by [...]?

61. Is $\mathscr{P}_2$ spanned by $1 + x$, $x + x^2$, $1 + x^2$?

62. Is $\mathscr{P}_2$ spanned by $1 + x + 2x^2$, $2 + x + 2x^2$, $-1 + x + 2x^2$?

63. Prove that every vector space has a unique zero vector.

64. Prove that for every vector $\mathbf{v}$ in a vector space $V$, there is a unique $\mathbf{v}'$ in $V$ such that $\mathbf{v} + \mathbf{v}' = \mathbf{0}$.

6.2 Linear Independence, Basis, and Dimension

In this section, we extend the notions of linear independence, basis, and dimension to general vector spaces, generalizing the results of Sections 2.3 and 3.5. In most cases, the proofs of the theorems carry over; we simply replace $\mathbb{R}^n$ by the vector space $V$.

Linear Independence

Definition

A set of vectors $\{\mathbf{v}_1, \mathbf{v}_2, \ldots, \mathbf{v}_k\}$ in a vector space $V$ is linearly dependent if there are scalars $c_1, c_2, \ldots, c_k$, at least one of which is not zero, such that

$$c_1\mathbf{v}_1 + c_2\mathbf{v}_2 + \cdots + c_k\mathbf{v}_k = \mathbf{0}$$

A set of vectors that is not linearly dependent is said to be linearly independent.

As in $\mathbb{R}^n$, $\{\mathbf{v}_1, \mathbf{v}_2, \ldots, \mathbf{v}_k\}$ is linearly independent in a vector space $V$ if and only if

$$c_1\mathbf{v}_1 + c_2\mathbf{v}_2 + \cdots + c_k\mathbf{v}_k = \mathbf{0} \quad\text{implies}\quad c_1 = 0, c_2 = 0, \ldots, c_k = 0$$

We also have the following useful alternative formulation of linear dependence.

Theorem 6.4

A set of vectors $\{\mathbf{v}_1, \mathbf{v}_2, \ldots, \mathbf{v}_k\}$ in a vector space $V$ is linearly dependent if and only if at least one of the vectors can be expressed as a linear combination of the others.

Proof The proof is identical to that of Theorem 2.5.

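As a spe[...] $\{\mathbf{e}_1, \ldots, \mathbf{e}_n\}$ is a basis for $\mathbb{R}^n$, [...] $\{1, x, x^2, \ldots, x^n\}$ is a basis for $\mathscr{P}_n$, called the standard basis for $\mathscr{P}_n$.

[...] $\{1 + x, x + x^2, 1 + x^2\}$ is a basis for $\mathscr{P}_2$.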

Solution We have already shown that $\mathcal{B}$ is linearly independent, in Example 6.26. To show that $\mathcal{B}$ spans $\mathscr{P}_2$, let $a + bx + cx^2$ be an arbitrary polynomial in $\mathscr{P}_2$. We must show that there are scalars $c_1$, $c_2$, and $c_3$ such that

$$c_1(1 + x) + c_2(x + x^2) + c_3(1 + x^2) = a + bx + cx^2$$

or, equivalently,

$$(c_1 + c_3) + (c_1 + c_2)x + (c_2 + c_3)x^2 = a + bx + cx^2$$

Equating coefficients of like powers of $x$, we obtain the linear system

$$\begin{aligned} c_1 + c_3 &= a \\ c_1 + c_2 &= b \\ c_2 + c_3 &= c \end{aligned}$$

which has a solution, since the coefficient matrix $\begin{bmatrix} 1 & 0 & 1 \\ 1 & 1 & 0 \\ 0 & 1 & 1 \end{bmatrix}$ has rank 3 and, hence, is invertible. (We do not need to know what the solution is; we only need to know that it exists.) Therefore, $\mathcal{B}$ is a basis for $\mathscr{P}_2$.

Remark Observe that the matrix $\begin{bmatrix} 1 & 0 & 1 \\ 1 & 1 & 0 \\ 0 & 1 & 1 \end{bmatrix}$ is the key to Example 6.32. We can immediately obtain it using the correspondence between $\mathscr{P}_2$ and $\mathbb{R}^3$, as indicated in the Remark following Example 6.26.
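The invertibility claim can be confirmed with a determinant, and although the example notes that the explicit solution is not needed, computing it makes the spanning argument concrete. The sketch below is an illustration only: `det3` and `coords` are names invented here, and the closed-form solution in `coords` was obtained by adding and subtracting the three equations of the system.

```python
from fractions import Fraction

# Coefficient matrix of the system c1 + c3 = a, c1 + c2 = b, c2 + c3 = c.
M = [[1, 0, 1],
     [1, 1, 0],
     [0, 1, 1]]

def det3(m):
    """Determinant of a 3 x 3 matrix by cofactor expansion along the first row."""
    (a, b, c), (d, e, f), (g, h, i) = m
    return a * (e * i - f * h) - b * (d * i - f * g) + c * (d * h - e * g)

def coords(a, b, c):
    """Solve c1 + c3 = a, c1 + c2 = b, c2 + c3 = c for (c1, c2, c3)."""
    c1 = Fraction(a + b - c, 2)
    c2 = Fraction(b + c - a, 2)
    c3 = Fraction(a + c - b, 2)
    return c1, c2, c3

print(det3(M))                       # nonzero, so the matrix is invertible
c1, c2, c3 = coords(2, -3, 5)        # sample polynomial 2 - 3x + 5x^2
print(c1, c2, c3)
print(c1 + c3, c1 + c2, c2 + c3)     # recovers the coefficients 2, -3, 5
```

A nonzero determinant is equivalent to the rank-3 condition used in the example, so either test establishes that the system is solvable for every choice of $a$, $b$, and $c$.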


Example 6.33

Show that $\mathcal{B} = \{1, x, x^2, \ldots\}$ is a basis for $\mathscr{P}$.

Solution In Example 6.28, we saw that $\mathcal{B}$ is linearly independent. It also spans $\mathscr{P}$, since clearly every polynomial is a linear combination of (finitely many) powers of $x$.

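Example 6.34

Find bases for the three vector spaces in Example 6.13:

(a) $W_1 = \left\{ \begin{bmatrix} a \\ b \\ -b \\ a \end{bmatrix} \right\}$ (b) $W_2 = \{a + bx - bx^2 + ax^3\}$ (c) $W_3 = \left\{ \begin{bmatrix} a & b \\ -b & a \end{bmatrix} \right\}$

Solution Once again, we will work the three examples side by side to highlight the similarities among them. In a strong sense, they are all the same example, but it will take us until Section 6.5 to make this idea perfectly precise.

(a) Since

$$\begin{bmatrix} a \\ b \\ -b \\ a \end{bmatrix} = a\begin{bmatrix} 1 \\ 0 \\ 0 \\ 1 \end{bmatrix} + b\begin{bmatrix} 0 \\ 1 \\ -1 \\ 0 \end{bmatrix}$$

we have $W_1 = \operatorname{span}(\mathbf{u}, \mathbf{v})$, where

$$\mathbf{u} = \begin{bmatrix} 1 \\ 0 \\ 0 \\ 1 \end{bmatrix} \quad\text{and}\quad \mathbf{v} = \begin{bmatrix} 0 \\ 1 \\ -1 \\ 0 \end{bmatrix}$$

Since $\{\mathbf{u}, \mathbf{v}\}$ is clearly linearly independent, it is also a basis for $W_1$.

(b) Since

$$a + bx - bx^2 + ax^3 = a(1 + x^3) + b(x - x^2)$$

we have $W_2 = \operatorname{span}(u(x), v(x))$, where

$$u(x) = 1 + x^3 \quad\text{and}\quad v(x) = x - x^2$$

Since $\{u(x), v(x)\}$ is clearly linearly independent, it is also a basis for $W_2$.

(c) Since

$$\begin{bmatrix} a & b \\ -b & a \end{bmatrix} = a\begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix} + b\begin{bmatrix} 0 & 1 \\ -1 & 0 \end{bmatrix}$$

we have $W_3 = \operatorname{span}(U, V)$, where

$$U = \begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix} \quad\text{and}\quad V = \begin{bmatrix} 0 & 1 \\ -1 & 0 \end{bmatrix}$$

Since $\{U, V\}$ is clearly linearly independent, it is also a basis for $W_3$.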

Coordinates

Section 3.5 introduced the idea of the coordinates of a vector with respect to a basis for subspaces of $\mathbb{R}^n$. We now extend this concept to arbitrary vector spaces.

Theorem 6.5

Let $V$ be a vector space and let $\mathcal{B}$ be a basis for $V$. For every vector $\mathbf{v}$ in $V$, there is exactly one way to write $\mathbf{v}$ as a linear combination of the basis vectors in $\mathcal{B}$.

Proof

The proof is the same as the proof of Theorem 3.29. It works even if the basis $\mathcal{B}$ is infinite, since linear combinations are, by definition, finite.

Section 6.2 Linear Independence, Basis, and Dimension

The converse of Theorem 6.5 is also true. That is, if $\mathcal{B}$ is a set of vectors in a vector space $V$ with the property that every vector in $V$ can be written uniquely as a linear combination of the vectors in $\mathcal{B}$, then $\mathcal{B}$ is a basis for $V$ (see Exercise 30). In this sense, the unique representation property characterizes a basis. Since representation of a vector with respect to a basis is unique, the next definition makes sense.

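Definition

Let $\mathcal{B} = \{\mathbf{v}_1, \mathbf{v}_2, \ldots, \mathbf{v}_n\}$ be a basis for a vector space $V$. Let $\mathbf{v}$ be a vector in $V$, and write $\mathbf{v} = c_1\mathbf{v}_1 + c_2\mathbf{v}_2 + \cdots + c_n\mathbf{v}_n$. Then $c_1, c_2, \ldots, c_n$ are called the coordinates of $\mathbf{v}$ with respect to $\mathcal{B}$, and the column vector

$$[\mathbf{v}]_{\mathcal{B}} = \begin{bmatrix} c_1 \\ c_2 \\ \vdots \\ c_n \end{bmatrix}$$

is called the coordinate vector of $\mathbf{v}$ with respect to $\mathcal{B}$.

Observe that if the basis $\mathcal{B}$ of $V$ has $n$ vectors, then $[\mathbf{v}]_{\mathcal{B}}$ is a (column) vector in $\mathbb{R}^n$.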

Example 6.35

Find the coordinate vector $[p(x)]_{\mathcal{B}}$ of $p(x) = 2 - 3x + 5x^2$ with respect to the standard basis $\mathcal{B} = \{1, x, x^2\}$ of $\mathscr{P}_2$.

Solution The polynomial $p(x)$ is already a linear combination of $1$, $x$, and $x^2$, so

$$[p(x)]_{\mathcal{B}} = \begin{bmatrix} 2 \\ -3 \\ 5 \end{bmatrix}$$

This is the correspondence between $\mathscr{P}_2$ and $\mathbb{R}^3$ that we remarked on after Example 6.26, and it can easily be generalized to show that the coordinate vector of a polynomial

$$p(x) = a_0 + a_1x + a_2x^2 + \cdots + a_nx^n \text{ in } \mathscr{P}_n$$

with respect to the standard basis $\mathcal{B} = \{1, x, x^2, \ldots, x^n\}$ is just the vector

$$[p(x)]_{\mathcal{B}} = \begin{bmatrix} a_0 \\ a_1 \\ a_2 \\ \vdots \\ a_n \end{bmatrix} \text{ in } \mathbb{R}^{n+1}$$

Remark The order in which the basis vectors appear in $\mathcal{B}$ affects the order of the entries in a coordinate vector. For example, in Example 6.35, assume that the standard basis vectors are ordered as $\mathcal{B}' = \{x^2, x, 1\}$. Then the coordinate vector of $p(x) = 2 - 3x + 5x^2$ with respect to $\mathcal{B}'$ is

$$[p(x)]_{\mathcal{B}'} = \begin{bmatrix} 5 \\ -3 \\ 2 \end{bmatrix}$$

Example 6.36

Find the coordinate vector $[A]_{\mathcal{B}}$ of $A = \begin{bmatrix} 2 & -1 \\ 4 & 3 \end{bmatrix}$ with respect to the standard basis $\mathcal{B} = \{E_{11}, E_{12}, E_{21}, E_{22}\}$ of $M_{22}$.

Solution Since

$$A = 2E_{11} - E_{12} + 4E_{21} + 3E_{22}$$

we have

$$[A]_{\mathcal{B}} = \begin{bmatrix} 2 \\ -1 \\ 4 \\ 3 \end{bmatrix}$$

This is the correspondence between $M_{22}$ and $\mathbb{R}^4$ that we noted before the introduction to Example 6.13. It too can easily be generalized to give a correspondence between $M_{mn}$ and $\mathbb{R}^{mn}$.

Example 6.37

Find the coordinate vector $[p(x)]_{\mathcal{C}}$ of $p(x) = 1 + 2x - x^2$ with respect to the basis $\mathcal{C} = \{1 + x,\ x + x^2,\ 1 + x^2\}$ of $\mathscr{P}_2$.

Solution We need to find $c_1$, $c_2$, and $c_3$ such that

$$c_1(1 + x) + c_2(x + x^2) + c_3(1 + x^2) = 1 + 2x - x^2$$

or, equivalently,

$$(c_1 + c_3) + (c_1 + c_2)x + (c_2 + c_3)x^2 = 1 + 2x - x^2$$

As in Example 6.32, this means we need to solve the system

$$\begin{aligned} c_1 + c_3 &= 1 \\ c_1 + c_2 &= 2 \\ c_2 + c_3 &= -1 \end{aligned}$$

whose solution is found to be $c_1 = 2$, $c_2 = 0$, $c_3 = -1$. Therefore,

$$[p(x)]_{\mathcal{C}} = \begin{bmatrix} 2 \\ 0 \\ -1 \end{bmatrix}$$

(Since this result says that $p(x) = 2(1 + x) - (1 + x^2)$, it is easy to check that it is correct.)
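The computation in Example 6.37 amounts to solving a small linear system whose coefficient columns are the standard-basis coordinates of the vectors in $\mathcal{C}$. The sketch below illustrates this under that assumption; the helper name `solve` is hypothetical, and it assumes the coefficient matrix is invertible.

```python
from fractions import Fraction

def solve(A, b):
    """Solve A c = b exactly by Gauss-Jordan elimination (A assumed invertible)."""
    n = len(A)
    aug = [[Fraction(x) for x in row] + [Fraction(rhs)] for row, rhs in zip(A, b)]
    for col in range(n):
        # Find a usable pivot and move it into position.
        pivot = next(i for i in range(col, n) if aug[i][col] != 0)
        aug[col], aug[pivot] = aug[pivot], aug[col]
        # Scale the pivot row so the pivot is 1.
        aug[col] = [x / aug[col][col] for x in aug[col]]
        # Clear the rest of the column.
        for i in range(n):
            if i != col:
                factor = aug[i][col]
                aug[i] = [a - factor * p for a, p in zip(aug[i], aug[col])]
    return [row[-1] for row in aug]

# Columns: coordinates of 1 + x, x + x^2, 1 + x^2 in the basis {1, x, x^2}.
C_columns = [[1, 0, 1],
             [1, 1, 0],
             [0, 1, 1]]
p_standard = [1, 2, -1]  # p(x) = 1 + 2x - x^2

p_C = solve(C_columns, p_standard)
print([int(c) for c in p_C])  # [2, 0, -1]
```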

The next theorem shows that the process of forming coordinate vectors is compatible with the vector space operations of addition and scalar multiplication.

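Theorem 6.6

Let $\mathcal{B} = \{\mathbf{v}_1, \mathbf{v}_2, \ldots, \mathbf{v}_n\}$ be a basis for a vector space $V$. Let $\mathbf{u}$ and $\mathbf{v}$ be vectors in $V$ and let $c$ be a scalar. Then

a. $[\mathbf{u} + \mathbf{v}]_{\mathcal{B}} = [\mathbf{u}]_{\mathcal{B}} + [\mathbf{v}]_{\mathcal{B}}$
b. $[c\mathbf{u}]_{\mathcal{B}} = c[\mathbf{u}]_{\mathcal{B}}$

Proof

We begin by writing $\mathbf{u}$ and $\mathbf{v}$ in terms of the basis vectors, say, as

$$\mathbf{u} = c_1\mathbf{v}_1 + c_2\mathbf{v}_2 + \cdots + c_n\mathbf{v}_n \quad\text{and}\quad \mathbf{v} = d_1\mathbf{v}_1 + d_2\mathbf{v}_2 + \cdots + d_n\mathbf{v}_n$$

Then, using vector space properties, we have

$$\mathbf{u} + \mathbf{v} = (c_1 + d_1)\mathbf{v}_1 + \cdots + (c_n + d_n)\mathbf{v}_n \quad\text{and}\quad c\mathbf{u} = (cc_1)\mathbf{v}_1 + \cdots + (cc_n)\mathbf{v}_n$$

so

$$[\mathbf{u} + \mathbf{v}]_{\mathcal{B}} = \begin{bmatrix} c_1 + d_1 \\ \vdots \\ c_n + d_n \end{bmatrix} = \begin{bmatrix} c_1 \\ \vdots \\ c_n \end{bmatrix} + \begin{bmatrix} d_1 \\ \vdots \\ d_n \end{bmatrix} = [\mathbf{u}]_{\mathcal{B}} + [\mathbf{v}]_{\mathcal{B}}$$

and

$$[c\mathbf{u}]_{\mathcal{B}} = \begin{bmatrix} cc_1 \\ \vdots \\ cc_n \end{bmatrix} = c\begin{bmatrix} c_1 \\ \vdots \\ c_n \end{bmatrix} = c[\mathbf{u}]_{\mathcal{B}}$$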

An easy corollary to Theorem 6.6 states that coordinate vectors preserve linear combinations:

$$[c_1\mathbf{u}_1 + \cdots + c_k\mathbf{u}_k]_{\mathcal{B}} = c_1[\mathbf{u}_1]_{\mathcal{B}} + \cdots + c_k[\mathbf{u}_k]_{\mathcal{B}} \qquad (1)$$

You are asked to prove this corollary in Exercise 31. The most useful aspe

51. Find a basis for $\operatorname{span}(1 - x,\ x - x^2,\ 1 - x^2,\ 1 - 2x + x^2)$ in $\mathscr{P}_2$.

52. Find a basis for

span ([~ ~]. [~

'] [-' ']

O·

[ - 1, - I' ]) ;nM"--.

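53. Find a basis for $\operatorname{span}(\sin^2 x,\ \cos^2 x,\ \cos 2x)$ in $\mathscr{F}$.

55. Let $S = \{\mathbf{v}_1, \ldots, \mathbf{v}_n\}$ be a spanning set for a vector space $V$. Show that if $\mathbf{v}_n$ is in $\operatorname{span}(\mathbf{v}_1, \ldots, \mathbf{v}_{n-1})$, then $S' = \{\mathbf{v}_1, \ldots, \mathbf{v}_{n-1}\}$ is still a spanning set for $V$.

56. Prove Theorem 6.10(f).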

I

- I '

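58. Let $\{\mathbf{v}_1, \ldots, \mathbf{v}_n\}$ be a basis for a vector space $V$. Prove that

$$\{\mathbf{v}_1,\ \mathbf{v}_1 + \mathbf{v}_2,\ \mathbf{v}_1 + \mathbf{v}_2 + \mathbf{v}_3,\ \ldots,\ \mathbf{v}_1 + \cdots + \mathbf{v}_n\}$$

is also a basis for $V$.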

Let $a_0, a_1, \ldots, a_n$ be $n + 1$ distinct real numbers. Define polynomials $p_0(x), p_1(x), \ldots, p_n(x)$ by

$$p_i(x) = \frac{(x - a_0) \cdots (x - a_{i-1})(x - a_{i+1}) \cdots (x - a_n)}{(a_i - a_0) \cdots (a_i - a_{i-1})(a_i - a_{i+1}) \cdots (a_i - a_n)}$$

These are called the Lagrange polynomials associated with $a_0, a_1, \ldots, a_n$. [Joseph-Louis Lagrange (1736-1813) was born in Italy but spent most of his life in Germany and France. He made important contributions to such fields as number theory, algebra, astronomy, mechanics, and the calculus of variations. In 1773, Lagrange was the first to give the volume interpretation of a determinant (see Chapter 4).]

59. (a) Compute the Lagrange polynomials associated with $a_0 = 1$, $a_1 = 2$, $a_2 = 3$.
(b) Show, in general, that

$$p_i(a_j) = \begin{cases} 0 & \text{if } i \neq j \\ 1 & \text{if } i = j \end{cases}$$

60. (a) Prove that the set $\mathcal{B} = \{p_0(x), p_1(x), \ldots, p_n(x)\}$ of Lagrange polynomials is linearly independent in $\mathscr{P}_n$. [Hint: Set $c_0p_0(x) + \cdots + c_np_n(x) = 0$ and use Exercise 59(b).]
(b) Deduce that $\mathcal{B}$ is a basis for $\mathscr{P}_n$.

61. If $q(x)$ is an arbitrary polynomial in $\mathscr{P}_n$, it follows from Exercise 60(b) that

$$q(x) = c_0p_0(x) + \cdots + c_np_n(x) \qquad (1)$$

for some scalars $c_0, \ldots, c_n$.

(a) Show that $c_i = q(a_i)$ for $i = 0, \ldots, n$, and deduce that $q(x) = q(a_0)p_0(x) + \cdots + q(a_n)p_n(x)$ is the unique representation of $q(x)$ with respect to the basis $\mathcal{B}$.
(b) Show that for any $n + 1$ points $(a_0, c_0), (a_1, c_1), \ldots, (a_n, c_n)$ with distinct first components, the function $q(x)$ defined by equation (1) is the unique polynomial of degree at most $n$ that passes through all of the points. This formula is known as the Lagrange interpolation formula. (Compare this formula with Problem 19 in Exploration: Geometric Applications of Determinants in Chapter 4.)
(c) Use the Lagrange interpolation formula to find the polynomial of degree at most 2 that passes through the points
(i) $(1, 6)$, $(2, -1)$, and $(3, -2)$ (ii) $(-1, 10)$, $(0, 5)$, and $(3, 2)$
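The Lagrange interpolation formula described in Exercises 59 to 61 can be sketched in code. This is an illustrative sketch only: the sample points are made up for the demonstration (they are not the points in the exercises), polynomials are stored as coefficient lists `[c0, c1, ...]`, and the function names are hypothetical.

```python
from fractions import Fraction

def poly_mul(p, q):
    """Multiply two polynomials given as coefficient lists [c0, c1, ...]."""
    out = [Fraction(0)] * (len(p) + len(q) - 1)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            out[i + j] += a * b
    return out

def lagrange_basis(nodes, i):
    """Coefficients of p_i(x): equal to 1 at nodes[i] and 0 at every other node."""
    poly = [Fraction(1)]
    for j, a_j in enumerate(nodes):
        if j != i:
            denom = Fraction(nodes[i] - a_j)
            # Multiply by the linear factor (x - a_j) / (a_i - a_j).
            poly = poly_mul(poly, [-a_j / denom, 1 / denom])
    return poly

def interpolate(points):
    """q(x) = c_0 p_0(x) + ... + c_n p_n(x) for points (a_i, c_i) with distinct a_i."""
    nodes = [a for a, _ in points]
    q = [Fraction(0)] * len(points)
    for i, (_, c_i) in enumerate(points):
        p_i = lagrange_basis(nodes, i)
        q = [s + c_i * t for s, t in zip(q, p_i)]
    return q

# Hypothetical sample points, chosen to lie on 1 + x + x^2.
points = [(0, 1), (1, 3), (2, 7)]
q = interpolate(points)
print([str(c) for c in q])  # ['1', '1', '1'], i.e. q(x) = 1 + x + x^2
```

Because each $p_i$ is 1 at its own node and 0 at the others, the coordinates of $q$ with respect to the Lagrange basis are simply the prescribed values $c_i$, which is why `interpolate` needs no linear solve.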

62. Use the Lagrange interpolation formula to show that if a polynomial in

and $P_{\mathcal{B} \leftarrow \mathcal{C}}$ for the bases $\mathcal{B} = \{1, x, x^2\}$ and $\mathcal{C} = \{1 + x,\ x + x^2,\ 1 + x^2\}$ of $\mathscr{P}_2$. Then find the coordinate vector of $p(x) = 1 + 2x - x^2$ with respect to $\mathcal{C}$.

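Solution Changing to a standard basis is easy, so we find $P_{\mathcal{B} \leftarrow \mathcal{C}}$ first. Observe that the coordinate vectors for $\mathcal{C}$ in terms of $\mathcal{B}$ are

$$[1 + x]_{\mathcal{B}} = \begin{bmatrix} 1 \\ 1 \\ 0 \end{bmatrix}, \quad [x + x^2]_{\mathcal{B}} = \begin{bmatrix} 0 \\ 1 \\ 1 \end{bmatrix}, \quad [1 + x^2]_{\mathcal{B}} = \begin{bmatrix} 1 \\ 0 \\ 1 \end{bmatrix}$$

(Look back at the Remark following Example 6.26.) It follows that

$$P_{\mathcal{B} \leftarrow \mathcal{C}} = \begin{bmatrix} 1 & 0 & 1 \\ 1 & 1 & 0 \\ 0 & 1 & 1 \end{bmatrix}$$

To find $P_{\mathcal{C} \leftarrow \mathcal{B}}$, we could express each vector in $\mathcal{B}$ as a linear combination of the vectors in $\mathcal{C}$ (do this), but it is much easier to use the fact that $P_{\mathcal{C} \leftarrow \mathcal{B}} = (P_{\mathcal{B} \leftarrow \mathcal{C}})^{-1}$, by Theorem 6.12(c). We find that

$$P_{\mathcal{C} \leftarrow \mathcal{B}} = (P_{\mathcal{B} \leftarrow \mathcal{C}})^{-1} = \begin{bmatrix} \tfrac12 & \tfrac12 & -\tfrac12 \\ -\tfrac12 & \tfrac12 & \tfrac12 \\ \tfrac12 & -\tfrac12 & \tfrac12 \end{bmatrix}$$

It now follows that

$$[p(x)]_{\mathcal{C}} = P_{\mathcal{C} \leftarrow \mathcal{B}}[p(x)]_{\mathcal{B}} = \begin{bmatrix} \tfrac12 & \tfrac12 & -\tfrac12 \\ -\tfrac12 & \tfrac12 & \tfrac12 \\ \tfrac12 & -\tfrac12 & \tfrac12 \end{bmatrix}\begin{bmatrix} 1 \\ 2 \\ -1 \end{bmatrix} = \begin{bmatrix} 2 \\ 0 \\ -1 \end{bmatrix}$$

which agrees with Example 6.37.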
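As a check on the arithmetic, the inverse and the matrix-vector product can be reproduced with exact fractions. This is a sketch under the assumption that the change-of-basis matrix is invertible (as Theorem 6.12 guarantees); the helper names `inverse` and `matvec` are hypothetical.

```python
from fractions import Fraction

def inverse(M):
    """Invert a square matrix exactly by row-reducing [M | I] to [I | M^-1]."""
    n = len(M)
    aug = [[Fraction(x) for x in row] + [Fraction(int(i == j)) for j in range(n)]
           for i, row in enumerate(M)]
    for col in range(n):
        pivot = next(i for i in range(col, n) if aug[i][col] != 0)
        aug[col], aug[pivot] = aug[pivot], aug[col]
        aug[col] = [x / aug[col][col] for x in aug[col]]
        for i in range(n):
            if i != col:
                factor = aug[i][col]
                aug[i] = [a - factor * p for a, p in zip(aug[i], aug[col])]
    return [row[n:] for row in aug]

def matvec(M, v):
    """Matrix-vector product."""
    return [sum(a * b for a, b in zip(row, v)) for row in M]

# Columns are the B-coordinates of the vectors in C = {1 + x, x + x^2, 1 + x^2}.
P_B_from_C = [[1, 0, 1],
              [1, 1, 0],
              [0, 1, 1]]
P_C_from_B = inverse(P_B_from_C)

p_B = [1, 2, -1]                 # p(x) = 1 + 2x - x^2 in the standard basis
p_C = matvec(P_C_from_B, p_B)
print([str(c) for c in p_C])     # ['2', '0', '-1']
```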

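Remark

If we do not need $P_{\mathcal{C} \leftarrow \mathcal{B}}$ explicitly, we can find $[p(x)]_{\mathcal{C}}$ from $[p(x)]_{\mathcal{B}}$ and $P_{\mathcal{B} \leftarrow \mathcal{C}}$ using Gaussian elimination. Row reduction produces

$$[\,P_{\mathcal{B} \leftarrow \mathcal{C}} \mid [p(x)]_{\mathcal{B}}\,] \longrightarrow [\,I \mid [p(x)]_{\mathcal{C}}\,]$$

(See the next section on using Gauss-Jordan elimination.) It is worth repeating the observation in Example 6.46: Changing to a standard basis is easy. If $\mathcal{E}$ is the standard basis for a vector space $V$ and $\mathcal{B}$ is any other basis, then the columns of $P_{\mathcal{E} \leftarrow \mathcal{B}}$ are the coordinate vectors of $\mathcal{B}$ with respect to $\mathcal{E}$, and these are usually "visible." We make use of this observation again in the next example.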

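In $M_{22}$, let $\mathcal{B}$ be the basis $\{E_{11}, E_{21}, E_{12}, E_{22}\}$ and let $\mathcal{C}$ be the basis $\{A, B, C, D\}$, where

$$A = \begin{bmatrix} 1 & 0 \\ 0 & 0 \end{bmatrix}, \quad B = \begin{bmatrix} 1 & 1 \\ 0 & 0 \end{bmatrix}, \quad C = \begin{bmatrix} 1 & 1 \\ 1 & 0 \end{bmatrix}, \quad D = \begin{bmatrix} 1 & 1 \\ 1 & 1 \end{bmatrix}$$

Find the change-of-basis matrix $P_{\mathcal{C} \leftarrow \mathcal{B}}$ and verify that $[X]_{\mathcal{C}} = P_{\mathcal{C} \leftarrow \mathcal{B}}[X]_{\mathcal{B}}$ for $X = \begin{bmatrix} 1 & 2 \\ 3 & 4 \end{bmatrix}$.

Solution 1 To solve this problem directly, we must find the coordinate vectors of $\mathcal{B}$ with respect to $\mathcal{C}$. This involves solving four linear combination problems of the form $X = aA + bB + cC + dD$, where $X$ is in $\mathcal{B}$ and we must find $a$, $b$, $c$, and $d$. However, here we are lucky, since we can find the required coefficients by inspection. Clearly, $E_{11} = A$, $E_{21} = -B + C$, $E_{12} = -A + B$, and $E_{22} = -C + D$. Thus,

$$[E_{11}]_{\mathcal{C}} = \begin{bmatrix} 1 \\ 0 \\ 0 \\ 0 \end{bmatrix}, \quad [E_{21}]_{\mathcal{C}} = \begin{bmatrix} 0 \\ -1 \\ 1 \\ 0 \end{bmatrix}, \quad [E_{12}]_{\mathcal{C}} = \begin{bmatrix} -1 \\ 1 \\ 0 \\ 0 \end{bmatrix}, \quad [E_{22}]_{\mathcal{C}} = \begin{bmatrix} 0 \\ 0 \\ -1 \\ 1 \end{bmatrix}$$

so

$$P_{\mathcal{C} \leftarrow \mathcal{B}} = \begin{bmatrix} 1 & 0 & -1 & 0 \\ 0 & -1 & 1 & 0 \\ 0 & 1 & 0 & -1 \\ 0 & 0 & 0 & 1 \end{bmatrix}$$

If $X = \begin{bmatrix} 1 & 2 \\ 3 & 4 \end{bmatrix}$, then

$$[X]_{\mathcal{B}} = \begin{bmatrix} 1 \\ 3 \\ 2 \\ 4 \end{bmatrix}$$

and

$$P_{\mathcal{C} \leftarrow \mathcal{B}}[X]_{\mathcal{B}} = \begin{bmatrix} 1 & 0 & -1 & 0 \\ 0 & -1 & 1 & 0 \\ 0 & 1 & 0 & -1 \\ 0 & 0 & 0 & 1 \end{bmatrix}\begin{bmatrix} 1 \\ 3 \\ 2 \\ 4 \end{bmatrix} = \begin{bmatrix} -1 \\ -1 \\ -1 \\ 4 \end{bmatrix}$$

This is the coordinate vector with respect to $\mathcal{C}$ of the matrix

$$-A - B - C + 4D = -\begin{bmatrix} 1 & 0 \\ 0 & 0 \end{bmatrix} - \begin{bmatrix} 1 & 1 \\ 0 & 0 \end{bmatrix} - \begin{bmatrix} 1 & 1 \\ 1 & 0 \end{bmatrix} + 4\begin{bmatrix} 1 & 1 \\ 1 & 1 \end{bmatrix} = \begin{bmatrix} 1 & 2 \\ 3 & 4 \end{bmatrix} = X$$

as it should be.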

Solution 2 We can compute $P_{\mathcal{C}}$

and compare your answer with the one found in part (a).

3. x =

1

1

o

0 ,6 =

0 , 0

1 , 0

-1 1 c~

1

,

1

0

0

1

, 0

1

1

3

4.x =

I ,8

5 c ~

1

I ,

o,

0

0

1

0

0

I ,

I , 0

0

1

Pco- B

I

in

~j

1

=[_: -~ J

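16. Let $\mathcal{B}$ and $\mathcal{C}$ be bases for $\mathscr{P}_2$. If $\mathcal{B} = \{x,\ 1 + x,\ 1 - x + x^2\}$ and the change-of-basis matrix from $\mathcal{B}$ to $\mathcal{C}$ is

$$\begin{bmatrix} 1 & 0 & 0 \\ 0 & 2 & 1 \\ -1 & 1 & 1 \end{bmatrix}$$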

find C. Xl,

X + x 2},

C= {I, 1 + x,xl}in QJ>2

In calculus, you learn that a Taylor polynomial of degree $n$

In Exercises 9 and 10, follow the instructions for Exercises 1-4 using $A$ instead of $\mathbf{x}$.

~ {[~

{ [ ~], [ ~]} and

find 6 .

8.p(x) = 4 - 2x- x 2,6= {x, 1+

C

In Exercises 11 and 12, follow the instructions for Exercises 1-4 using $f(x)$ instead of $\mathbf{x}$.
11. $f(x) = 2\sin x - 3\cos x$, $\mathcal{B} = \{\sin x + \cos x,\ \cos x\}$, $\mathcal{C} = \{\sin x,\ \cos x\}$ in $\operatorname{span}(\sin x, \cos x)$
12. $f(x) = \sin x$, $\mathcal{B} = \{\sin x + \cos x,\ \cos x\}$, $\mathcal{C} = \{\cos x - \sin x,\ \sin x + \cos x\}$ in $\operatorname{span}(\sin x, \cos x)$

the change-of-basis matrix from B to C is

,

C={I,x,x Z}in~2

o

~ {[~ ~]. [~ C ~ {[~ :J. [ :

B

15. U=I Band C be bases for R'. If C =

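In Exercises 5-8, follow the instructions for Exercises 1-4 using $p(x)$ instead of $\mathbf{x}$.
5. $p(x) = 2 - x$, $\mathcal{B} = \{1, x\}$, $\mathcal{C} = \{x,\ 1 + x\}$ in $\mathscr{P}_1$
6. $p(x) = 1 + 3x$, $\mathcal{B} = \{1 + x,\ 1 - x\}$, $\mathcal{C} = \{2x,\ 4\}$ in $\mathscr{P}_1$
7. $p(x) = 1 + x^2$, $\mathcal{B} = \{1 + x + x^2,\ x + x^2,\ x^2\}$,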

9. A = [ 4

:].

14. Repeat Exercise 13 with $\theta = 135°$.

in W

o

1

,

1

0 ~

415

13. Rotate the $xy$-axes in the plane counterclockwise through an angle $\theta = 60°$ to obtain new $x'y'$-axes. Use the methods of this section to find (a) the $x'y'$-coordinates of the point whose $xy$-coordinates are $(3, 2)$ and (b) the $xy$-coordinates of the point whose $x'y'$-coordinates are $(4, -4)$.

0

o

IO. A=[ :

Change of Basis

about $a$ is a polynomial of the form

2 ], 6 = the standard basis,

$$a_0 + a_1(x - a) + a_2(x - a)^2 + \cdots + a_n(x - a)^n$$

where $a_n \neq 0$. In other words, it is a polynomial that has

-:J. [:

been expanded in terms of powers of $x - a$ instead of powers of $x$. Taylor polynomials are very useful for approximating functions that are "well behaved" near $x = a$.

- 1

~J. [~

:J. [~

~]} in M"

p(x)

=

at


The set $\mathcal{B} = \{1,\ x - a,\ (x - a)^2,\ \ldots,\ (x - a)^n\}$ is a basis for $\mathscr{P}_n$ for any real number $a$. (Do you see a quick way to show this? Try using Theorem 6.7.) This fact allows us to use the techniques of this section to rewrite a polynomial as a Taylor polynomial about a given $a$.
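One way to carry out this rewriting is to treat it as a change of basis from the standard basis to the shifted basis. The sketch below assumes this approach; the function names are hypothetical, and the binomial theorem is used to write each $(x - a)^k$ in the standard basis.

```python
from fractions import Fraction
from math import comb

def shifted_basis_matrix(n, a):
    """Columns are the standard-basis coordinate vectors of (x - a)^k, k = 0..n.

    By the binomial theorem, the coefficient of x^j in (x - a)^k is
    C(k, j) * (-a)^(k - j) when j <= k, and 0 otherwise.
    """
    return [[Fraction(comb(k, j)) * Fraction(-a) ** (k - j) if j <= k else Fraction(0)
             for k in range(n + 1)]
            for j in range(n + 1)]

def taylor_coordinates(coeffs, a):
    """Coordinates of p(x) = sum(coeffs[j] * x^j) with respect to
    the basis {1, x - a, ..., (x - a)^n}."""
    n = len(coeffs) - 1
    P = shifted_basis_matrix(n, a)
    c = [Fraction(0)] * (n + 1)
    # P is upper triangular with 1s on the diagonal, so back substitution suffices.
    for j in range(n, -1, -1):
        c[j] = Fraction(coeffs[j]) - sum(P[j][k] * c[k] for k in range(j + 1, n + 1))
    return c

# p(x) = x^2 rewritten about a = 1: x^2 = 1 + 2(x - 1) + (x - 1)^2
print([str(c) for c in taylor_coordinates([0, 0, 1], 1)])  # ['1', '2', '1']
```

The matrix built here plays the role of the change-of-basis matrix from the shifted basis to the standard basis; solving against it gives the Taylor coefficients without any differentiation.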

21. Let $\mathcal{B}$, $\mathcal{C}$, and $\mathcal{D}$ be bases for a finite-dimensional vector space $V$. Prove that

22. Let $V$ be an $n$-dimensional ve

$\ldots \to \mathscr{P}_n$ and $T: \mathscr{P}_n \to \mathscr{P}_n$ by

$$S(p(x)) = p(x + 1) \quad\text{and}\quad T(p(x)) = p'(x)$$

Find $(S \circ T)(p(x))$ and $(T \circ S)(p(x))$. (Hint: Remember the Chain Rule.)

28. Define linear transformations $S: \mathscr{P}_n \to \mathscr{P}_n$ and $T: \mathscr{P}_n \to \mathscr{P}_n$ by

$$S(p(x)) = p(x + 1) \quad\text{and}\quad T(p(x)) = xp'(x)$$

Find $(S \circ T)(p(x))$ and $(T \circ S)(p(x))$.

y

y

3x

x- y - 3x+4y

y

and T :1R1---+1R2

1

30. S: Q/' I ---+ eJ> I d efined by S( a + bx) = ( - 4a and T: 2 ~ [R2 be the linear t.ransformation defined by

+

bx

+ a! }

=

c ["b -+ b]

B ~[

[~l

- I

-I] 1

1-I]

- I

1

~ 13.

(ii) x - x 2

(iii) I

+x-

x2

(b) Which, if any, of the following vectors are in range($T$)? (i)

1

12. $T: M_{22} \to M_{22}$ defined by $T(A) = AB - BA$, where

(a) Which, if any, of the following polynomials are in ker($T$)? (i) $1 + x$

T ~,-+R' d'finedbyT(p(x» ~ [~~~n

11. $T: M_{22} \to M_{22}$ defined by $T(A) = AB$, where

(iii) I/ Vi

(c) Describe ker($T$) and range($T$).

'{{a

T:Mu _ W definedbYT[~ ~] = [~ = ;]

(ii)

[~l

(iii)

[~l

(c) Describe ker(T) and range(T). 4. Let T: 'lP 1 -+ 'lP 1 be the linear t ransformatIon defined by T(p(x» ~ xp'(x),

$T: \mathscr{P}_2 \to \mathbb{R}$ defined by $T(p(x)) = p'(0)$
14. $T: M_{33} \to M_{33}$ defined by $T(A) = A - A^T$

In Exercises 15-20, determine whether the linear transformation $T$ is (a) one-to-one and (b) onto.

15. $T: \mathbb{R}^2 \to \mathbb{R}^2$ defined by $T\begin{bmatrix} x \\ y \end{bmatrix} = \begin{bmatrix} 2x - y \\ x + 2y \end{bmatrix}$


31. Show that $\mathscr{C}[0, 1] \cong \mathscr{C}[0, 2]$.
32. Show that $\mathscr{C}[a, b] \cong \mathscr{C}[c, d]$ for all $a < b$ and $c < d$.

x - 2y 3x

+Y

x +y 2a - b

a + b - 3c

18. H I" --+11' dcfi ncd by 'I11'(x» = n

19. T : llt~ M22 d efi n edby T U

,

[;~~n

a+b+ [ b - 2c

C

2']

,

(a) Prove that if $S$ and $T$ are both one-to-one, so is $S \circ T$. (b) Prove that if $S$ and $T$ are both onto, so is $S \circ T$.

34. Let $S: V \to W$ and $T: U \to V$ be linear transformations.

__ ["" +bb bb -+ ,,]

(a) Prove that if $S \circ T$ is one-to-one, so is $T$. (b) Prove that if $S \circ T$ is onto, so is $S$.
35. Let $T: V \to W$ be a linear transformation between two finite-dimensional vector spaces.

a 20. T: R J --'" W defi ned by T b

33. Let $S: V \to W$ and $T: U \to V$ be linear transformations.

=

b, where W is the vector space o f a- ,

all symmet ric 2 X2 matrices

(a) Prove that if $\dim V < \dim W$, then $T$ cannot be onto. (b) Prove that if $\dim V > \dim W$, then $T$ cannot be one-to-one.
36. Let $a_0, a_1, \ldots, a_n$ be $n + 1$ distinct real numbers. Define $T: \mathscr{P}_n \to \mathbb{R}^{n+1}$ by

In Exercises 21-26, determine whether $V$ and $W$ are

T(p(x) =

isomorphic. If they are, give an explicit isomorphism $T: V \to W$.
22. $V = S_3$ (symmetric $3 \times 3$ matrices), $W = U_3$ (upper triangular $3 \times 3$ matrices)

53 (skcw-

24. V = !j>,. IV = (P(x) in !j>" P(O ) = 0) •• 101

25. V = C, W = R2 26. V = {A in M" , u(A) = 0). W =

Il'

27. Show that $T: \mathscr{P}_n \to \mathscr{P}_n$ defined by $T(p(x)) = p(x) + p'(x)$ is an isomorphism.
28. Show that $T: \mathscr{P}_n \to \mathscr{P}_n$ defined by $T(p(x)) = p(x - 2)$ is an isomorphism.
29. Show that $T: \mathscr{P}_n \to \mathscr{P}_n$ defined by $T(p(x)) = x^n p\!\left(\tfrac{1}{x}\right)$ is an isomorphism.
30. (a) Show that $\mathscr{C}[0, 1] \cong \mathscr{C}[2, 3]$. [Hint: Define $T: \mathscr{C}[0, 1] \to \mathscr{C}[2, 3]$ by letting $T(f)$ be the function whose value at $x$ is $(T(f))(x) = f(x - 2)$ for $x$ in $[2, 3]$.]
(b) Show that

ars, on the surface, to be a calculus problem. We will explore this idea further in Example 6.83.

Example 6.80

Let V be an ,,·dimensional vector space and let I be tne identity transformation on V. What is the matrix of I with respc, -+ R' d,fin, d by T( p(x» B - {x' ,x, I},e V ""

7. T:

p(x) = a + bx

-,

{:] -

th" cos x. t h" Sin x ).

, B-

{ [;].[_~]},

b

I I o, I , I I 0 0

8. Repeat Exercise 7 with v

,

v -

[-~]

= [:].

9. T: M Z1 -+ M12 defi ned by T(A) = AT, B = C = {E' l' Eu.

~l' ~l}' V = A = [:

!]

10. Repeat Excrcise 9 w ith B "" (Ell' Ell ' Eu. Ell} and C = {Ell> E2l , E22, Ell}' II. T: MI.2 -+ M22 defined by T(A ) = All - BA, where

B- [

I -I], B _ e - {E" , E E", E,,}, - I I

v - A - [:

(a) Find the matrix of $D$ with respect to $\mathcal{B} = \{e^{2x},\ e^{2x}\cos x,\ e^{2x}\sin x\}$. (b) Compute the derivative of $f(x) = 3e^{2x} - e^{2x}\cos x + 2e^{2x}\sin x$ indirectly, using Theorem 6.26, and verify that it agrees with $f'(x)$ as computed directly.

+ (X l

I

e -

~ 15. ConSider the subspace IV of~, given by W = span (t h",

W].[:]},

R2 --+ RJ d efi ned by a + 2b

rex)

~ 16.

Consider the subspace $W$ of $\mathscr{D}$, given by $W = \operatorname{span}(\cos x,\ \sin x,\ x\cos x,\ x\sin x)$. (a) Find the matrix of $D$ with respect to $\mathcal{B} = \{\cos x,\ \sin x,\ x\cos x,\ x\sin x\}$. (b) Compute the derivative of $f(x) = \cos x + 2x\cos x$ indirectly, using Theorem 6.26, and verify that it agrees with $f'(x)$ as computed directly.

In Exercises 17 and 18, $T: U \to V$ and $S: V \to W$ are linear transformations and $\mathcal{B}$, $\mathcal{C}$, and $\mathcal{D}$ are bases for $U$, $V$, and $W$, respectively. Compute $[S \circ T]_{\mathcal{D} \leftarrow \mathcal{B}}$ in two ways: (a) by finding $S \circ T$ directly and then computing its matrix and (b) by finding the matrices of $S$ and $T$ separately and using Theorem 6.27.
17. $T: \mathscr{P}_1 \to \mathbb{R}^2$ defined by $T(p(x)) =$

u,

:]

d'finodby

[~~~]. S: RI-+ RI

J.Jl a]_ [a 2b],B _ {I,x}, 2a - b b

C = V = {e l . e2}

12. T:M21 -+M11 definedbyT(A) "" A - AT,B =

c= {Ell,E12'~l>~l}' V =

511

1l. 14. Consider the subspace W o f 2b, given by W - span (C'", e- 1., "),

+d

5. T ;@>l --+ R d efinedbyT(p(x» =

The MatriJ( of a Linea r TransformatIon

A = [:

~]

13. Consider the subspace $W$ of $\mathscr{D}$, given by $W = \operatorname{span}(\sin x, \cos x)$. (a) Show that the differential operator $D$ maps $W$ into itself. (b) Find the matrix of $D$ with respect to $\mathcal{B} = \{\sin x, \cos x\}$. (c) Compute the derivative of $f(x) = 3\sin x - 5\cos x$ indirectly, using Theorem 6.26, and verify that it agrees with $f'(x)$ as computed directly.
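The idea of differentiating "indirectly" through a matrix can be sketched as follows. This is an illustration only: it assumes coordinates are taken with respect to $\{\sin x, \cos x\}$, and it uses a made-up function $f(x) = 2\sin x + 7\cos x$ rather than the one in the exercise.

```python
# Coordinates are taken with respect to the basis {sin x, cos x}.
# D(sin x) =  cos x  has coordinate vector [0, 1]
# D(cos x) = -sin x  has coordinate vector [-1, 0]
# These coordinate vectors are the columns of the matrix of D.
D_matrix = [[0, -1],
            [1,  0]]

def apply(M, v):
    """Matrix-vector product: the coordinate vector of the image."""
    return [sum(a * b for a, b in zip(row, v)) for row in M]

f_coords = [2, 7]                      # hypothetical f(x) = 2 sin x + 7 cos x
df_coords = apply(D_matrix, f_coords)
print(df_coords)                       # [-7, 2], i.e. f'(x) = -7 sin x + 2 cos x
```

Differentiating directly gives $f'(x) = 2\cos x - 7\sin x$, which matches the coordinates produced by the matrix.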

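18. $T: \mathscr{P}_1 \to \mathscr{P}_2$ defined by $T(p(x)) = p(x + 1)$, $S: \mathscr{P}_2 \to \mathscr{P}_2$ defined by $S(p(x)) = p(x + 1)$,

$\mathcal{B} = \{1, x\}$, $\mathcal{C} = \mathcal{D} = \{1, x, x^2\}$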

In Exercises 19-26, determine whether the linear transformation $T$ is invertible by considering its matrix with respect to the standard bases. If $T$ is invertible, use Theorem 6.28 and the method of Example 6.82 to find $T^{-1}$.
19. $T$ in Exercise 1

20. Tin Exercise 5

21. Tin ExeTClse3 22. T: (jJ> I -+ '!P 2 defi ned by T(p(x» = p' (xl . T: ~l-+'!Pl defined by T(p(x»

= p(x) + p' (x)


24. T: Mn --+ MZ2 defi ned by T(A ) = AU, when!:

it to compute the o rthogonal projection of v o nto W, where 3 v -

25. T in Exercise II

2

26. T In Exercise 12

Compare your answer with Example 5.11. [Hint: Find an orthogonal decomposition of $\mathbb{R}^3$ as $\mathbb{R}^3 = W + W^\perp$ using an orthogonal basis for $W$. See Example 5.3.]
39. Let $T: V \to W$ be a linear transformation between finite-dimensional vector spaces and let $\mathcal{B}$ and $\mathcal{C}$ be bases for $V$ and $W$, respectively. Show that the matrix of $T$ with respect to $\mathcal{B}$ and $\mathcal{C}$ is unique. That is, if $A$ is a matrix such that $A[\mathbf{v}]_{\mathcal{B}} = [T(\mathbf{v})]_{\mathcal{C}}$ for all $\mathbf{v}$ in $V$, then $A = [T]_{\mathcal{C} \leftarrow \mathcal{B}}$. [Hint: Find values of $\mathbf{v}$ that will show this, one column at a time.]

In Exercises 27-30, use the method of Example 6.83 to evaluate the given integral.

27. $\int (\sin x - 3\cos x)\,dx$ (See Exercise 13.)
28. $\int 5e^{-2x}\,dx$ (See Exercise 14.)
29. $\int (e^{2x}\cos x - 2e^{2x}\sin x)\,dx$ (See Exercise 15.)
30. $\int (x\cos x + x\sin x)\,dx$ (See Exercise 16.)

In Exercises 31-36, a linear transformation $T: V \to V$ is given. If possible, find a basis $\mathcal{C}$ for $V$ such that the matrix $[T]_{\mathcal{C}}$ of $T$ with respect to $\mathcal{C}$ is diagonal.

'1. T: R2 -t Rl definedbyJal ~ [ - 4b 1 Jl a +5b

33.

1:]

=

(tl

41. Show that rank(T) = rank(A).

[: +~]

T :gJI, --+ ~, d e finedbyT( a + bx) = (4a

42. If V = Wand lJ = C, show that Tis diagonalizable if and only if A is diagonalizable. + 2b)

+

+ 3b)x

34. T: @I, --+ ~ Zdcfined by T(p(x )) = p{x + I) l.&.35. T :@I,--+gJIl defined by T(p(x» = p(x) + xp'(x) 36. T : t~\ --+ ~2 definedby T(p(x)) = p(3x+ 2) 37. Let

ebe the line thro ugh the o n gin in R' with dire" p( - x) ~ p(x))

In Questions 11-13, determine whether $T$ is a linear transformation.
11. $T: \mathbb{R}^2 \to \mathbb{R}^2$ defined by $T(\mathbf{x}) = \mathbf{y}\mathbf{x}^T\mathbf{y}$, where $\mathbf{y} = \begin{bmatrix} 1 \\ 2 \end{bmatrix}$

12. $T: M_{nn} \to M_{nn}$ defined by $T(A) = A^TA$
13. $T: \mathscr{P}_n \to \mathscr{P}_n$ defined by $T(p(x)) = p(2x - 1)$

14. If T: W' l --+ M21 is a linear transfo rmation such that

T(I) ~[ ~ nT(I +X)~ [~ T( J + x + x 2 ) =

0 -I] [ I

:],nd

O. fi nd T(5 - 3x + 2x2 ).

15. Find the nullity of the linear transformation $T: M_{nn} \to \mathbb{R}$ defined by $T(A) = \operatorname{tr}(A)$.
16. Let $W$ be the vector space of upper triangular $2 \times 2$ matrices.

=a+ c=b+d } 4. V ~

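10. Find the change-of-basis matrices $P_{\mathcal{C} \leftarrow \mathcal{B}}$ and $P_{\mathcal{B} \leftarrow \mathcal{C}}$ with respect to the bases $\mathcal{B} = \{1,\ 1 + x,\ 1 + x + x^2\}$ and $\mathcal{C} = \{1 + x,\ x + x^2,\ 1 + x^2\}$ of $\mathscr{P}_2$.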

(a) Find a linear transformation $T: M_{22} \to M_{22}$ such that ker($T$) $= W$.
(b) Find a linear transformation $T: M_{22} \to M_{22}$ such that range($T$) $= W$.

17. Find the matrix $[T]_{\mathcal{C} \leftarrow \mathcal{B}}$ of the linear transformation $T$ in Question 14 with respect to the standard bases $\mathcal{B} = \{1, x, x^2\}$ of $\mathscr{P}_2$ and $\mathcal{C} = \{E_{11}, E_{12}, E_{21}, E_{22}\}$ of $M_{22}$.
18. Let $S = \{\mathbf{v}_1, \ldots, \mathbf{v}_n\}$ be a set of vectors in a vector space $V$ with the property that every vector in $V$ can be written as a linear combination of $\mathbf{v}_1, \ldots, \mathbf{v}_n$ in exactly one way. Prove that $S$ is a basis for $V$.
19. If $T: U \to V$ and $S: V \to W$ are linear transformations such that range($T$) $\subseteq$ ker($S$), what can be deduced about $S \circ T$?
20. Let $T: V \to V$ be a linear transformation, and let $\{\mathbf{v}_1, \ldots, \mathbf{v}_n\}$ be a basis for $V$ such that $\{T(\mathbf{v}_1), \ldots, T(\mathbf{v}_n)\}$ is also a basis for $V$. Prove that $T$ is invertible.

Chapter 7 Distance and Approximation

A straight line may be the shortest distance between two points, but it is by no means the most interesting.
-Doctor Who, in "The Time Monster" by Robert Sloman, BBC, 1972

Although this may seem a paradox, all exact science is dominated by the idea of approximation.
-Bertrand Russell, in W. H. Auden and L. Kronenberger, eds., The Viking Book of Aphorisms, Viking, 1962, p. 263

Figure 7.1 Taxicab distance

7.0 Introduction: Taxicab Geometry

We live in a three-dimensional Euclidean world, and therefore, concepts from Euclidean geometry govern our way of looking at the world. In particular, imagine stopping people on the street and asking them to fill in the blank in the following sentence: "The shortest distance between two points is a ______." They will almost certainly respond with "straight line." There are, however, other equally sensible and intuitive notions of distance. By allowing ourselves to think of "distance" in a more flexible way, we will open the door to the possibility of having a "distance" between polynomials, functions, matrices, and many other objects that arise in linear algebra.

In this section, you will discover a type of "distance" that is every bit as real as the straight-line distance you are used to from Euclidean geometry (the one that is a consequence of Pythagoras' Theorem). As you'll see, this new type of "distance" still behaves in some familiar ways.

Suppose you are standing at an intersection in a city, trying to get to a restaurant at another intersection. If you ask someone how far it is to the restaurant, that person is unlikely to measure distance "as the crow flies" (i.e., using the Euclidean version of distance). Instead, the response will be something like "It's five blocks away." Since this is the way taxicab drivers measure distance, we will refer to this notion of "distance" as taxicab distance.

Figure 7.1 shows an example of taxicab distance. The shortest path from $A$ to $B$ requires traversing the sides of five city blocks. Notice that although there is more than one route from $A$ to $B$, all shortest routes require three horizontal moves and two vertical moves, where a "move" corresponds to the side of one city block. (How many shortest routes are there from $A$ to $B$?) Therefore, the taxicab distance from $A$ to $B$ is 5.

Idealizing this situation, we will assume that all blocks are unit squares, and we will use the notation $d_t(A, B)$ for the taxicab distance from $A$ to $B$.
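The block-counting idea can be sketched in a few lines. This is an illustration, not the text's definition: it assumes points are given by coordinates on a unit grid, and the coordinates used for $A$ and $B$ are hypothetical, chosen only to match the "three horizontal, two vertical" description of Figure 7.1.

```python
def taxicab_distance(A, B):
    """Number of unit blocks travelled: horizontal moves plus vertical moves."""
    return sum(abs(a - b) for a, b in zip(A, B))

# Hypothetical coordinates: three blocks across and two blocks up.
A = (0, 0)
B = (3, 2)
print(taxicab_distance(A, B))  # 5
```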

Problem 1 Find the taxicab distance between the following pairs of points:

(a) $(1, 2)$ and $(5, 5)$

(b) $(2, 4)$ and $(3, -2)$

(c) $(0, 0)$ and $(-4, -3)$

(d) $(-2, 3)$ and $(1, 3)$

(e) (I, D and{ -L D

(f) $(2.5, 4.6)$ and $(3.1, 1.5)$


Problem 2 Which of the following is the correct formula for the taxicab distance $d_t(A, B)$ between $A = (a_1, a_2)$ and $B = (b_1, b_2)$?

(a) $d_t(A, B) = (a_1 - b_1) + (a_2 - b_2)$
(b) $d_t(A, B) = (|a_1| - |b_1|) + (|a_2| - |b_2|)$
(c) $d_t(A, B) = |a_1 - b_1| + |a_2 - b_2|$

We can define the taxicab norm of a ve

(For example, if $p(x) = 1 - 5x + 3x^2$ and $q(x) = 6 + 2x - x^2$, then $\langle p(x), q(x) \rangle = 1 \cdot 6 + (-5) \cdot 2 + 3 \cdot (-1) = -7$.)

Solution Since $\mathscr{P}_2$ is isomorphic to $\mathbb{R}^3$, we need only show that the dot product in $\mathbb{R}^3$ is an inner product, which we have already established.

Section 7.1 Inner Product Spaces

Example 7.5

Let $f$ and $g$ be in $\mathscr{C}[a, b]$, the vector space of all continuous functions on the closed interval $[a, b]$. Show that

$$\langle f, g \rangle = \int_a^b f(x)g(x)\,dx$$

defines an inner product on $\mathscr{C}[a, b]$.

Solution We have

$$\langle f, g \rangle = \int_a^b f(x)g(x)\,dx = \int_a^b g(x)f(x)\,dx = \langle g, f \rangle$$

Also, if $h$ is in $\mathscr{C}[a, b]$, then

$$\begin{aligned} \langle f, g + h \rangle &= \int_a^b f(x)(g(x) + h(x))\,dx \\ &= \int_a^b (f(x)g(x) + f(x)h(x))\,dx \\ &= \int_a^b f(x)g(x)\,dx + \int_a^b f(x)h(x)\,dx \\ &= \langle f, g \rangle + \langle f, h \rangle \end{aligned}$$

If $c$ is a scalar, then

$$\langle cf, g \rangle = \int_a^b cf(x)g(x)\,dx = c\int_a^b f(x)g(x)\,dx = c\langle f, g \rangle$$

Finally, $\langle f, f \rangle = \int_a^b (f(x))^2\,dx \geq 0$, and it follows from a theorem of calculus that, since $f$ is continuous, $\langle f, f \rangle = \int_a^b (f(x))^2\,dx = 0$ if and only if $f$ is the zero function. Therefore, $\langle f, g \rangle$ is an inner product on $\mathscr{C}[a, b]$.

Example 7.5 also defines an inner product on any subspace of $\mathscr{C}[a, b]$. For example, we could restrict our attention to polynomials defined on the interval $[a, b]$. Suppose we consider $\mathscr{P}[0, 1]$, the vector space of all polynomials on the interval $[0, 1]$. Then, using the inner product of Example 7.5, we have

$$\langle x^2, 1 + x \rangle = \int_0^1 x^2(1 + x)\,dx = \int_0^1 (x^2 + x^3)\,dx = \left[\frac{x^3}{3} + \frac{x^4}{4}\right]_0^1 = \frac13 + \frac14 = \frac{7}{12}$$
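For polynomials, this integral inner product can be evaluated exactly, since each power of $x$ has an elementary antiderivative. The following is a minimal sketch under the assumption that polynomials are stored as coefficient lists `[c0, c1, ...]`; the function names are hypothetical.

```python
from fractions import Fraction

def poly_mul(p, q):
    """Multiply two polynomials given as coefficient lists [c0, c1, ...]."""
    out = [0] * (len(p) + len(q) - 1)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            out[i + j] += a * b
    return out

def inner(p, q, a=0, b=1):
    """<p, q> = integral from a to b of p(x) q(x) dx, computed exactly."""
    prod = poly_mul(p, q)
    # The integral of c x^k over [a, b] is c (b^(k+1) - a^(k+1)) / (k + 1).
    return sum(Fraction(c) * (Fraction(b) ** (k + 1) - Fraction(a) ** (k + 1)) / (k + 1)
               for k, c in enumerate(prod))

# <x^2, 1 + x> on [0, 1]
print(inner([0, 0, 1], [1, 1]))  # 7/12
```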


Chapter 7

Distance and Ap proxi mation

Properties of Inner Products

The following theorem summarizes some additional properties that follow from the definition of inner product.

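Theorem 7.1

Let $\mathbf{u}$, $\mathbf{v}$, and $\mathbf{w}$ be vectors in an inner product space $V$ and let $c$ be a scalar.

a. $\langle \mathbf{u} + \mathbf{v}, \mathbf{w} \rangle = \langle \mathbf{u}, \mathbf{w} \rangle + \langle \mathbf{v}, \mathbf{w} \rangle$
b. $\langle \mathbf{u}, c\mathbf{v} \rangle = c\langle \mathbf{u}, \mathbf{v} \rangle$
c. $\langle \mathbf{u}, \mathbf{0} \rangle = \langle \mathbf{0}, \mathbf{v} \rangle = 0$

Proof We prove property (a), leaving the proof of properties (b) and (c) as Exercises 23 and 24. Referring to the definition of inner product, we have

$$\begin{aligned} \langle \mathbf{u} + \mathbf{v}, \mathbf{w} \rangle &= \langle \mathbf{w}, \mathbf{u} + \mathbf{v} \rangle && \text{By (1)} \\ &= \langle \mathbf{w}, \mathbf{u} \rangle + \langle \mathbf{w}, \mathbf{v} \rangle && \text{By (2)} \\ &= \langle \mathbf{u}, \mathbf{w} \rangle + \langle \mathbf{v}, \mathbf{w} \rangle && \text{By (1)} \end{aligned}$$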

Length, Distance, and Orthogonality

In an inner product space, we can define the length of a vector, distance between vectors, and orthogonal vectors, just as we did in Section 1.2. We simply have to replace every use of the dot product $\mathbf{u} \cdot \mathbf{v}$ by the more general inner product $\langle \mathbf{u}, \mathbf{v} \rangle$.

Definition

Let $\mathbf{u}$ and $\mathbf{v}$ be vectors in an inner product space $V$.

1. The length (or norm) of $\mathbf{v}$ is $\|\mathbf{v}\| = \sqrt{\langle \mathbf{v}, \mathbf{v} \rangle}$.
2. The distance between $\mathbf{u}$ and $\mathbf{v}$ is $d(\mathbf{u}, \mathbf{v}) = \|\mathbf{u} - \mathbf{v}\|$.
3. $\mathbf{u}$ and $\mathbf{v}$ are orthogonal if $\langle \mathbf{u}, \mathbf{v} \rangle = 0$.

Note that $\|\mathbf{v}\|$ is always defined, since $\langle \mathbf{v}, \mathbf{v} \rangle \geq 0$ by the definition of inner product, so we can take the square root of this nonnegative quantity. As in $\mathbb{R}^n$, a vector of length 1 is called a unit vector. The unit sphere in $V$ is the set $S$ of all unit vectors in $V$.
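These three definitions translate directly into code once an inner product is fixed. The sketch below assumes the integral inner product on $\mathscr{P}[0, 1]$ described earlier, with polynomials stored as coefficient lists; the function names and the sample polynomials are hypothetical choices for illustration.

```python
from fractions import Fraction
from math import sqrt, isclose

def inner(p, q):
    """<p, q> = integral over [0, 1] of p(x) q(x) dx for coefficient lists.

    The integral of x^(i + j) over [0, 1] is 1 / (i + j + 1).
    """
    return sum(Fraction(a) * Fraction(b) / (i + j + 1)
               for i, a in enumerate(p) for j, b in enumerate(q))

def norm(p):
    """Length of p: the square root of <p, p>."""
    return sqrt(inner(p, p))

def distance(p, q):
    """Distance between p and q: the norm of p - q."""
    n = max(len(p), len(q))
    diff = [(p[k] if k < len(p) else 0) - (q[k] if k < len(q) else 0)
            for k in range(n)]
    return norm(diff)

def orthogonal(p, q):
    """p and q are orthogonal when <p, q> = 0 (tested exactly)."""
    return inner(p, q) == 0

print(isclose(norm([0, 1]), sqrt(1 / 3)))   # True: ||x||^2 = 1/3
print(orthogonal([1], [-1, 2]))             # True: 1 and 2x - 1 are orthogonal
```

Note that orthogonality depends on the inner product chosen: $1$ and $2x - 1$ are orthogonal here, although their coefficient vectors are not orthogonal under the dot product.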

Example 1.6

Consider the inner product on