Linear Algebra: A Modern Introduction
Second Edition

David Poole
Trent University

Thomson Brooks/Cole
Australia • Canada • Mexico • Singapore • Spain • United Kingdom • United States
Linear Algebra: A Modern Introduction, Second Edition
David Poole

Executive Publisher: Curt Hinrichs
Executive Editor: Jennifer Laugier
Editor: John-Paul Ramin
Assistant Editor: Stacy Green
Editorial Assistant: Leata Holloway
Technology Project Manager: Earl Perry
Marketing Manager: Tom Ziolkowski
Marketing Assistant: Erin Mitchell
Advertising Project Manager: Bryan Vann
Project Manager, Editorial Production: Kelsey McGee
Art Director: Vernon Boes
Print Buyer: Judy Inouye
Permissions Editor: Joohee Lee
Production Service: Matrix Productions
Text Designer: John Rokusek
Photo Research: Sarah Evertson/Image Quest
Copy Editor: Connie Day
Illustration: Scientific Illustrators
Cover Design: Kim Rokusek
Cover Images: Getty Images
Cover Printing, Printing and Binding: Transcontinental Printing/Louiseville
Compositor: Interactive Composition Corporation

Asia (including India)
Thomson Learning
5 Shenton Way #01-01 UIC Building
Singapore 068808
© 2006 Thomson Brooks/Cole, a part of The Thomson Corporation. Thomson, the Star logo, and Brooks/Cole are trademarks used herein under license. ALL RIGHTS RESERVED. No part of this work covered by the copyright hereon may be reproduced or used in any form or by [...]

[...]

$$\mathbf{v} = \frac{1}{\sqrt{14}}\begin{bmatrix} 2 \\ -1 \\ 3 \end{bmatrix} = \begin{bmatrix} 2/\sqrt{14} \\ -1/\sqrt{14} \\ 3/\sqrt{14} \end{bmatrix}$$
Since property (b) of Theorem 1.3 describes how length behaves with respect to scalar multiplication, natural curiosity suggests that we ask whether length and vector addition are compatible. It would be nice if we had an identity such as $\|\mathbf{u} + \mathbf{v}\| = \|\mathbf{u}\| + \|\mathbf{v}\|$, but for almost any choice of vectors $\mathbf{u}$ and $\mathbf{v}$ this turns out to be false. [See Exercise 42(a).] However, all is not lost, for it turns out that if we replace the = sign by ≤, the resulting inequality is true. The proof of this famous and important result, the Triangle Inequality, relies on another important inequality, the Cauchy-Schwarz Inequality, which we will prove and discuss in more detail in Chapter 7.
Theorem 1.4 The Cauchy-Schwarz Inequality

For all vectors $\mathbf{u}$ and $\mathbf{v}$ in $\mathbb{R}^n$,

$$|\mathbf{u} \cdot \mathbf{v}| \le \|\mathbf{u}\| \, \|\mathbf{v}\|$$

[Figure 1.26: The Triangle Inequality]
See Exercises 55 and 56 for algebraic and geometric approaches to the proof of this inequality. In $\mathbb{R}^2$ or $\mathbb{R}^3$, where we can use geometry, it is clear from a diagram such as Figure 1.26 that $\|\mathbf{u} + \mathbf{v}\| \le \|\mathbf{u}\| + \|\mathbf{v}\|$ for all vectors $\mathbf{u}$ and $\mathbf{v}$. We now show that this is true more generally.
Theorem 1.5 The Triangle Inequality

For all vectors $\mathbf{u}$ and $\mathbf{v}$ in $\mathbb{R}^n$,

$$\|\mathbf{u} + \mathbf{v}\| \le \|\mathbf{u}\| + \|\mathbf{v}\|$$
Chapter 1 Vectors
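Proof Since both sides of the inequality are nonnegative, showing that the square of the left-hand side is less than or equal to the square of the right-hand side is equivalent to proving the theorem. (Why?) We compute

$$
\begin{aligned}
\|\mathbf{u} + \mathbf{v}\|^2 &= (\mathbf{u} + \mathbf{v}) \cdot (\mathbf{u} + \mathbf{v}) \\
&= \mathbf{u} \cdot \mathbf{u} + 2(\mathbf{u} \cdot \mathbf{v}) + \mathbf{v} \cdot \mathbf{v} && \text{By Example 1.9} \\
&\le \|\mathbf{u}\|^2 + 2|\mathbf{u} \cdot \mathbf{v}| + \|\mathbf{v}\|^2 \\
&\le \|\mathbf{u}\|^2 + 2\|\mathbf{u}\| \, \|\mathbf{v}\| + \|\mathbf{v}\|^2 && \text{By Cauchy-Schwarz} \\
&= (\|\mathbf{u}\| + \|\mathbf{v}\|)^2
\end{aligned}
$$

as required.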
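A numerical spot check can make the two inequalities concrete. The following is only a minimal sketch in Python; the helper names `dot`, `norm`, and `add` and the sample vectors are our own illustrative choices, not part of the text, and a check on one pair of vectors is of course no substitute for the proof.

```python
import math

def dot(u, v):
    """Dot product of two vectors given as equal-length lists."""
    return sum(a * b for a, b in zip(u, v))

def norm(v):
    """Length of v, computed as the square root of v . v."""
    return math.sqrt(dot(v, v))

def add(u, v):
    """Componentwise sum u + v."""
    return [a + b for a, b in zip(u, v)]

# Hypothetical sample vectors in R^3 (chosen only for illustration).
u = [1.0, -2.0, 3.0]
v = [4.0, 0.0, -1.0]

# Cauchy-Schwarz: |u . v| <= ||u|| ||v||
cauchy_schwarz_holds = abs(dot(u, v)) <= norm(u) * norm(v)

# Triangle Inequality: ||u + v|| <= ||u|| + ||v||
triangle_holds = norm(add(u, v)) <= norm(u) + norm(v)

print(cauchy_schwarz_holds, triangle_holds)
```

Trying parallel vectors (for example, v replaced by a positive multiple of u) is a good way to see when the inequalities come close to being equalities.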
Distance

The distance between two vectors is the direct analogue of the distance between two points on the real number line or two points in the Cartesian plane. On the number line (Figure 1.27), the distance between the numbers $a$ and $b$ is given by $|a - b|$. (Taking the absolute value means that we do not need to know which of $a$ or $b$ is larger.) This distance is also equal to $\sqrt{(a - b)^2}$, and its two-dimensional generalization is the familiar formula for the distance $d$ between points $(a_1, a_2)$ and $(b_1, b_2)$, namely,

$$d = \sqrt{(a_1 - b_1)^2 + (a_2 - b_2)^2}$$

[Figure 1.27: $d = |a - b| = |-2 - 3| = 5$]

In terms of vectors, if $\mathbf{a} = \begin{bmatrix} a_1 \\ a_2 \end{bmatrix}$ and $\mathbf{b} = \begin{bmatrix} b_1 \\ b_2 \end{bmatrix}$, then $d$ is just the length of $\mathbf{a} - \mathbf{b}$, as shown in Figure 1.28. This is the basis for the next definition.

[Figure 1.28: $d = \sqrt{(a_1 - b_1)^2 + (a_2 - b_2)^2} = \|\mathbf{a} - \mathbf{b}\|$]

Definition The distance $d(\mathbf{u}, \mathbf{v})$ between vectors $\mathbf{u}$ and $\mathbf{v}$ in $\mathbb{R}^n$ is defined by

$$d(\mathbf{u}, \mathbf{v}) = \|\mathbf{u} - \mathbf{v}\|$$
Section 1.2 Length and Angle: The Dot Product
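Example 1.13

Find the distance between $\mathbf{u} = \begin{bmatrix} \sqrt{2} \\ 1 \\ -1 \end{bmatrix}$ and $\mathbf{v} = \begin{bmatrix} 0 \\ 2 \\ -2 \end{bmatrix}$.

Solution We compute $\mathbf{u} - \mathbf{v} = \begin{bmatrix} \sqrt{2} \\ -1 \\ 1 \end{bmatrix}$, so

$$d(\mathbf{u}, \mathbf{v}) = \|\mathbf{u} - \mathbf{v}\| = \sqrt{(\sqrt{2})^2 + (-1)^2 + 1^2} = \sqrt{4} = 2$$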
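The computation in Example 1.13 can be reproduced in a few lines of Python. This is a minimal sketch; the function name `distance` is our own, and because the first component is stored as a floating-point approximation of $\sqrt{2}$, the result should be compared with 2 only up to rounding.

```python
import math

def distance(u, v):
    """Distance d(u, v) = ||u - v|| between two vectors of equal length."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))

# The vectors of Example 1.13.
u = [math.sqrt(2), 1, -1]
v = [0, 2, -2]

d = distance(u, v)
print(d)  # approximately 2, up to floating-point rounding
```

Note that `distance(u, v)` and `distance(v, u)` agree, since each difference of components is squared.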
The dot product can also be used to calculate the angle between a pair of vectors. In $\mathbb{R}^2$ or $\mathbb{R}^3$, the angle between the nonzero vectors $\mathbf{u}$ and $\mathbf{v}$ will refer to the angle $\theta$ determined by these vectors that satisfies $0 \le \theta \le 180°$ (see Figure 1.29).

[Figure 1.29: The angle between u and v]

In Figure 1.30, consider the triangle with sides $\mathbf{u}$, $\mathbf{v}$, and $\mathbf{u} - \mathbf{v}$, where $\theta$ is the angle between $\mathbf{u}$ and $\mathbf{v}$. Applying the law of cosines to this triangle yields

$$\|\mathbf{u} - \mathbf{v}\|^2 = \|\mathbf{u}\|^2 + \|\mathbf{v}\|^2 - 2\|\mathbf{u}\| \, \|\mathbf{v}\| \cos\theta$$

[Figure 1.30]

Expanding the left-hand side and using $\|\mathbf{v}\|^2 = \mathbf{v} \cdot \mathbf{v}$ several times, we obtain

$$\|\mathbf{u}\|^2 - 2(\mathbf{u} \cdot \mathbf{v}) + \|\mathbf{v}\|^2 = \|\mathbf{u}\|^2 + \|\mathbf{v}\|^2 - 2\|\mathbf{u}\| \, \|\mathbf{v}\| \cos\theta$$

which, after simplification, leaves us with $\mathbf{u} \cdot \mathbf{v} = \|\mathbf{u}\| \, \|\mathbf{v}\| \cos\theta$. From this we obtain the following formula for the cosine of the angle $\theta$ between nonzero vectors [...]

[...] determine whether $\ell$ and $\mathscr{P}$ are parallel, perpendicular, or neither:
(a) $2x + 3y - z = 1$
(b) $4x - y + 5z = 0$
(c) $x - y - z = 3$
(d) $4x + 6y - 2z = 0$

19. The plane $\mathscr{P}_1$ has the equation $4x - y + 5z = 2$. For each of the planes $\mathscr{P}$ in Exercise 18, determine whether $\mathscr{P}_1$ and $\mathscr{P}$ are parallel, perpendicular, or neither.
20. Find the vector form of the equation of the line in $\mathbb{R}^2$ that passes through $P = (2, -1)$ and is perpendicular to the line with general equation $2x - 3y = 1$.

21. Find the vector form of the equation of the line in $\mathbb{R}^2$ that passes through $P = (2, -1)$ and is parallel to the line with general equation $2x - 3y = 1$.

22. Find the vector form of the equation of the line in $\mathbb{R}^3$ that passes through $P = (-1, 0, 3)$ and is perpendicular to the plane with general equation $x - 3y + 2z = 5$.

23. Find the vector form of the equation of the line in $\mathbb{R}^3$ that passes through $P = (-1, 0, 3)$ and is parallel to the line with parametric equations

$$x = 1 - t, \quad y = 2 + 3t, \quad z = -2 - t$$

24. Find the normal form of the equation of the plane that passes through $P = (0, -2, 5)$ and is parallel to the plane with general equation $6x - y + 2z = 3$.

[...]

28. $Q = (0, 1, 0)$, $\ell$ with equation $\begin{bmatrix} x \\ y \\ z \end{bmatrix} = \begin{bmatrix} 1 \\ 1 \\ 1 \end{bmatrix} + t\begin{bmatrix} -2 \\ 0 \\ 3 \end{bmatrix}$

In Exercises 29 and 30, find the distance from the point $Q$ to the plane $\mathscr{P}$.

29. $Q = (2, 2, 2)$, $\mathscr{P}$ with equation $x + y - z = 0$

30. $Q = (0, 0, 0)$, $\mathscr{P}$ with equation $x - 2y + 2z = 1$

Figure 1.63 suggests a way to use vectors to locate the point $R$ on $\ell$ that is closest to $Q$.

31. Find the point $R$ on $\ell$ that is closest to $Q$ in Exercise 27.

32. Find the point $R$ on $\ell$ that is closest to $Q$ in Exercise 28.

[Figure 1.63: $\mathbf{r} = \mathbf{p} + \overrightarrow{PR}$]

Section 1.3 Lines and Planes

Figure 1.64 suggests a way to use vectors to locate the point $R$ on $\mathscr{P}$ that is closest to $Q$.

[Figure 1.64: $\mathbf{r} = \mathbf{p} + \overrightarrow{PQ} + \overrightarrow{QR}$]

33. Find the point $R$ on $\mathscr{P}$ that is closest to $Q$ in Exercise 29.

34. Find the point $R$ on $\mathscr{P}$ that is closest to $Q$ in Exercise 30.

In Exercises 35 and 36, find the distance between the parallel lines.

35. [...]

36. [...]

In Exercises 37 and 38, find the distance between the parallel planes.

37. $2x + y - 2z = 0$ and $2x + y - 2z = 5$

38. $x + y + z = 1$ and $x + y + z = 3$

39. Prove equation 3 on page 40.

40. Prove equation 4 on page 41.

41. Prove that, in $\mathbb{R}^2$, the distance between parallel lines with equations $\mathbf{n} \cdot \mathbf{x} = c_1$ and $\mathbf{n} \cdot \mathbf{x} = c_2$ is given by $\dfrac{|c_1 - c_2|}{\|\mathbf{n}\|}$.

42. Prove that the distance between parallel planes with equations $\mathbf{n} \cdot \mathbf{x} = d_1$ and $\mathbf{n} \cdot \mathbf{x} = d_2$ is given by $\dfrac{|d_1 - d_2|}{\|\mathbf{n}\|}$.

If two nonparallel planes $\mathscr{P}_1$ and $\mathscr{P}_2$ have normal vectors $\mathbf{n}_1$ and $\mathbf{n}_2$ and $\theta$ is the angle between $\mathbf{n}_1$ and $\mathbf{n}_2$, then we define the angle between $\mathscr{P}_1$ and $\mathscr{P}_2$ to be either $\theta$ or $180° - \theta$, whichever is an acute angle. (Figure 1.65)

[Figure 1.65]

In Exercises 43-44, find the acute angle between the planes with the given equations.

43. $x + y + z = 0$ and $2x + y - 2z = 0$

44. $3x - y + 2z = 5$ and $x + 4y - z = 2$

In Exercises 45-46, show that the plane and line with the given equations intersect, and then find the acute angle of intersection between them.

45. The plane given by $x + y + 2z = 0$ and the line given by $x = 2 + t$, $y = 1 - 2t$, $z = 3 + t$

46. The plane given by $4x - y - z = 6$ and the line given by $x = t$, $y = 1 + 2t$, $z = 2 + 3t$

Exercises 47-48 explore one approach to the problem of finding the projection of a vector onto a plane. As Figure 1.66 shows, if $\mathscr{P}$ is a plane through the origin in $\mathbb{R}^3$ with normal vector $\mathbf{n}$, and $\mathbf{v}$ is a vector in $\mathbb{R}^3$, then $\mathbf{p} = \operatorname{proj}_{\mathscr{P}}(\mathbf{v})$ is a vector in $\mathscr{P}$ such that $\mathbf{v} - c\mathbf{n} = \mathbf{p}$ for some scalar $c$.

[Figure 1.66: Projection onto a plane]

47. Using the fact that $\mathbf{n}$ is orthogonal to every vector in $\mathscr{P}$ (and hence to $\mathbf{p}$), solve for $c$ and thereby find an expression for $\mathbf{p}$ in terms of $\mathbf{v}$ and $\mathbf{n}$.

48. Use the method of Exercise 47 to find the projection of $\mathbf{v} = \begin{bmatrix} 1 \\ 0 \\ -2 \end{bmatrix}$ onto the planes with the following equations:
(a) $x + y + z = 0$
(b) $3x - y + z = 0$
(c) $x - 2z = 0$
(d) $2x - 3y + z = 0$
The Cross Product

It would be convenient if we could easily convert the vector form $\mathbf{x} = \mathbf{p} + s\mathbf{u} + t\mathbf{v}$ of the equation of a plane to the normal form $\mathbf{n} \cdot \mathbf{x} = \mathbf{n} \cdot \mathbf{p}$. What we need is a process that, given two nonparallel vectors $\mathbf{u}$ and $\mathbf{v}$, produces a third vector $\mathbf{n}$ that is orthogonal to both $\mathbf{u}$ and $\mathbf{v}$. One approach is to use a construction known as the cross product of vectors. Only valid in $\mathbb{R}^3$, it is defined as follows:

Definition The cross product of $\mathbf{u} = \begin{bmatrix} u_1 \\ u_2 \\ u_3 \end{bmatrix}$ and $\mathbf{v} = \begin{bmatrix} v_1 \\ v_2 \\ v_3 \end{bmatrix}$ is the vector $\mathbf{u} \times \mathbf{v}$ defined by

$$\mathbf{u} \times \mathbf{v} = \begin{bmatrix} u_2 v_3 - u_3 v_2 \\ u_3 v_1 - u_1 v_3 \\ u_1 v_2 - u_2 v_1 \end{bmatrix}$$

A shortcut that can help you remember how to calculate the cross product of two vectors is illustrated below. Under each complete vector, write the first two components of that vector. Ignoring the two components on the top line, consider each block of four: Subtract the products of the components connected by dashed lines from the products of the components connected by solid lines. (It helps to notice that the first component of $\mathbf{u} \times \mathbf{v}$ has no 1s as subscripts, the second has no 2s, and the third has no 3s.)

[Diagram: the components $u_1, u_2, u_3, u_1, u_2$ and $v_1, v_2, v_3, v_1, v_2$ written in two columns, with solid and dashed lines producing $u_2 v_3 - u_3 v_2$, $u_3 v_1 - u_1 v_3$, and $u_1 v_2 - u_2 v_1$]
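The definition translates directly into code. Here is a minimal sketch in Python, assuming vectors are stored as three-element lists; the function names `cross` and `dot` and the sample vectors are our own illustrative choices. The check at the end illustrates, for one pair of vectors only, the orthogonality property that motivated the construction.

```python
def cross(u, v):
    """Cross product of two vectors in R^3, component by component
    as in the definition: no 1s, then no 2s, then no 3s as subscripts."""
    u1, u2, u3 = u
    v1, v2, v3 = v
    return [u2 * v3 - u3 * v2,
            u3 * v1 - u1 * v3,
            u1 * v2 - u2 * v1]

def dot(u, v):
    """Dot product of two vectors of equal length."""
    return sum(a * b for a, b in zip(u, v))

# Hypothetical sample vectors (not taken from the text).
u = [1, 2, 3]
v = [4, 5, 6]

n = cross(u, v)
print(n)                     # [-3, 6, -3]
print(dot(n, u), dot(n, v))  # 0 0, so n is orthogonal to both u and v
```

Swapping the arguments, `cross(v, u)`, negates every component, which is a quick way to see that the order of the factors matters.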
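The following problems briefly explore the cross product.

1. Compute $\mathbf{u} \times \mathbf{v}$.
(a) $\mathbf{u} = \begin{bmatrix} 0 \\ 1 \\ 1 \end{bmatrix}$, $\mathbf{v} = \begin{bmatrix} 3 \\ -1 \\ 2 \end{bmatrix}$
(b) [...]
(c) [...]
(d) [...]

[Figure 1.67: $\mathbf{u} \times \mathbf{v}$]

2. Show that $\mathbf{e}_1 \times \mathbf{e}_2 = \mathbf{e}_3$, $\mathbf{e}_2 \times \mathbf{e}_3 = \mathbf{e}_1$, and $\mathbf{e}_3 \times \mathbf{e}_1 = \mathbf{e}_2$.

3. Using the definition of a cross product, prove that $\mathbf{u} \times \mathbf{v}$ (as shown in Figure 1.67) is orthogonal to $\mathbf{u}$ and $\mathbf{v}$.

4. Use the cross product to help find the normal form of the equation of the plane.
(a) The plane passing through $P = (1, 0, -2)$, parallel to $\mathbf{u} = \begin{bmatrix} 0 \\ 1 \\ 1 \end{bmatrix}$ and $\mathbf{v} = \begin{bmatrix} 3 \\ -1 \\ 2 \end{bmatrix}$
(b) The plane passing through $P = (0, -1, 1)$, $Q = (2, 0, 2)$, and $R = (1, 2, -1)$

5. Prove the following properties of the cross product:
(a) $\mathbf{v} \times \mathbf{u} = -(\mathbf{u} \times \mathbf{v})$
(b) $\mathbf{u} \times \mathbf{0} = \mathbf{0}$
(c) $\mathbf{u} \times \mathbf{u} = \mathbf{0}$
(d) $\mathbf{u} \times k\mathbf{v} = k(\mathbf{u} \times \mathbf{v})$
(e) $\mathbf{u} \times k\mathbf{u} = \mathbf{0}$
(f) $\mathbf{u} \times (\mathbf{v} + \mathbf{w}) = \mathbf{u} \times \mathbf{v} + \mathbf{u} \times \mathbf{w}$

6. Prove the following properties of the cross product:
(a) $\mathbf{u} \cdot (\mathbf{v} \times \mathbf{w}) = (\mathbf{u} \times \mathbf{v}) \cdot \mathbf{w}$
(b) $\mathbf{u} \times (\mathbf{v} \times \mathbf{w}) = (\mathbf{u} \cdot \mathbf{w})\mathbf{v} - (\mathbf{u} \cdot \mathbf{v})\mathbf{w}$
(c) $\|\mathbf{u} \times \mathbf{v}\|^2 = \|\mathbf{u}\|^2 \|\mathbf{v}\|^2 - (\mathbf{u} \cdot \mathbf{v})^2$

7. Redo Problems 2 and 3, this time making use of Problems 5 and 6.

8. Let $\mathbf{u}$ and $\mathbf{v}$ be vectors in $\mathbb{R}^3$ and let $\theta$ be the angle between $\mathbf{u}$ and $\mathbf{v}$.
(a) Prove that $\|\mathbf{u} \times \mathbf{v}\| = \|\mathbf{u}\| \, \|\mathbf{v}\| \sin\theta$. [Hint: Use Problem 6(c).]
(b) Prove that the area $A$ of the triangle determined by $\mathbf{u}$ and $\mathbf{v}$ (as shown in Figure 1.68) is given by

$$A = \tfrac{1}{2}\|\mathbf{u} \times \mathbf{v}\|$$

[Figure 1.68]

(c) Use the result in part (b) to compute the area of the triangle with vertices $A = (1, 2, 1)$, $B = (2, 1, 0)$, and $C = (5, -1, 3)$.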
Section 1.4 Code Vectors and Modular Arithmetic
The modern theory of codes originated with the work of the American mathematician and computer scientist Claude Shannon (1916-2001), whose 1937 thesis showed how algebra could play a role in the design and analysis of electrical circuits. Shannon would later be instrumental in the formation of the field of information theory and give the theoretical basis for what are now called error-correcting codes.
Throughout history, people have transmitted information using codes. Sometimes the intent is to disguise the message being sent, such as when each letter in a word is replaced by a different letter according to a substitution rule. Although fascinating, these secret codes, or ciphers, are not of concern here; they are the focus of the field of cryptography. Rather, we will concentrate on codes that are used when data must be transmitted electronically. A familiar example of such a code is Morse code, with its system of dots and dashes. The advent of digital computers in the 20th century led to the need to transmit massive amounts of data quickly and accurately. Computers are designed to encode data as sequences of 0s and 1s. Many recent technological advancements depend on codes, and we encounter them every day without being aware of them: satellite communications, compact disc players, the universal product codes (UPC) associated with the bar codes found on merchandise, and the international standard book numbers (ISBN) found on every book published today are but a few examples.

In this section, we will use vectors to design codes for detecting errors that may occur in the transmission of data. In later chapters, we will construct codes that can not only detect but also correct errors. The vectors that arise in the study of codes are not the familiar vectors of $\mathbb{R}^n$ but vectors with only a finite number of choices for the components. These vectors depend on a different type of arithmetic, modular arithmetic, which will be introduced in this section and used throughout the book.
Binary Codes

Since computers represent data in terms of 0s and 1s (which can be interpreted as off/on, closed/open, false/true, or no/yes), we begin by considering binary codes, which consist of vectors each of whose components is either a 0 or a 1. In this setting, the usual rules of arithmetic must be modified, since the result of each calculation involving scalars must be a 0 or a 1. The modified rules for addition and multiplication are given below.

 + | 0  1         · | 0  1
---+------       ---+------
 0 | 0  1         0 | 0  0
 1 | 1  0         1 | 0  1
The only curiosity here is the rule that 1 + 1 = 0. This is not as strange as it appears; if we replace 0 with the word "even" and 1 with the word "odd," these tables simply summarize the familiar parity rules for the addition and multiplication of even and odd integers. For example, 1 + 1 = 0 expresses the fact that the sum of two odd integers is an even integer. With these rules, our set of scalars {0, 1} is denoted by $\mathbb{Z}_2$ and is called the set of integers modulo 2.

In $\mathbb{Z}_2$, 1 + 1 + 0 + 1 = 1 and 1 + 1 + 1 + 1 = 0. (These calculations illustrate the parity rules again: The sum of three odds and an even is odd; the sum of four odds is even.)

We are using the term length differently from the way we used it in $\mathbb{R}^n$. This should not be confusing, since there is no geometric notion of length for binary vectors.

With $\mathbb{Z}_2$ as our set of scalars, we now extend the above rules to vectors. The set of all $n$-tuples of 0s and 1s (with all arithmetic performed modulo 2) is denoted by $\mathbb{Z}_2^n$. The vectors in $\mathbb{Z}_2^n$ are called binary vectors of length $n$.
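Example 1.28

The vectors in $\mathbb{Z}_2^2$ are [0, 0], [0, 1], [1, 0], and [1, 1]. (How many vectors does $\mathbb{Z}_2^n$ contain, in general?)

Example 1.29

Let $\mathbf{u} = [1, 1, 0, 1, 0]$ and $\mathbf{v} = [0, 1, 1, 1, 0]$ be two binary vectors of length 5. Find $\mathbf{u} \cdot \mathbf{v}$.

Solution The calculation of $\mathbf{u} \cdot \mathbf{v}$ takes place in $\mathbb{Z}_2$, so we have

$$
\begin{aligned}
\mathbf{u} \cdot \mathbf{v} &= 1 \cdot 0 + 1 \cdot 1 + 0 \cdot 1 + 1 \cdot 1 + 0 \cdot 0 \\
&= 0 + 1 + 0 + 1 + 0 \\
&= 0
\end{aligned}
$$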
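Arithmetic in $\mathbb{Z}_2$ is easy to imitate on a computer by reducing ordinary integer results modulo 2. The sketch below is one way to do this in Python; the function names `sum_mod` and `dot_mod` and the parameter `m` are our own and are not notation from the text. It repeats the scalar sums computed earlier and the dot product of Example 1.29.

```python
def sum_mod(values, m=2):
    """Add a list of integers and reduce the result modulo m."""
    return sum(values) % m

def dot_mod(u, v, m=2):
    """Dot product of two vectors with all arithmetic performed modulo m."""
    return sum(a * b for a, b in zip(u, v)) % m

# Scalar sums in Z_2: three odds and an even, then four odds.
print(sum_mod([1, 1, 0, 1]))  # 1
print(sum_mod([1, 1, 1, 1]))  # 0

# The binary vectors of Example 1.29.
u = [1, 1, 0, 1, 0]
v = [0, 1, 1, 1, 0]
result = dot_mod(u, v)
print(result)                 # 0
```

Reducing once at the end gives the same answer as reducing after every operation, which is why a single `% m` suffices here.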
In practice, we have a message (consisting of words, numbers, or symbols) that we wish to transmit. We begin by encoding each "word" of the message as a binary vector.

Definition A binary code is a set of binary vectors (of the same length) called code vectors. The process of converting a message into code vectors is called encoding, and the reverse process is called decoding.

As we will see, it is highly desirable that a code have other properties as well, such as the ability to spot when an error has occurred in the transmission of a code vector and, if possible, to suggest how to correct the error.
Error-Detecting Codes

Suppose that we have already encoded a message as a set of binary code vectors. We now want to send the binary code vectors across a channel (such as a radio transmitter, a telephone line, a fiber optic cable, or a CD laser). Unfortunately, the channel may be "noisy" (because of electrical interference, competing signals, or dirt and scratches). As a result, errors may be introduced: Some of the 0s may be changed to 1s, and vice versa. How can we guard against this problem?
Example 1.30

We wish to encode and transmit a message consisting of one of the words up, down, left, or right. We decide to use the four vectors in $\mathbb{Z}_2^2$ as our binary code, as shown in Table 1.4. If the receiver has this table too and the encoded message is transmitted without error, decoding is trivial. However, let's suppose that a single error occurred. (By an error, we mean that one component of the code vector changed.) For example, suppose we sent the message "down" encoded as [0, 1] but an error occurred in the transmission of the first component and the 0 changed to a 1. The receiver would then see [1, 1] instead and decode the message as "right." (We will only concern ourselves with the case of single errors such as this one. In practice, it is usually assumed that the probability of multiple errors is negligibly small.) Even if the receiver knew (somehow) that a single error had occurred, he or she would not know whether the correct code vector was [0, 1] or [1, 0].

Table 1.4
Message   up       down     left     right
Code      [0, 0]   [0, 1]   [1, 0]   [1, 1]

But suppose we sent the message using a code that was a subset of $\mathbb{Z}_2^3$, in other words, a binary code of length 3, as shown in Table 1.5.

Table 1.5
Message   up          down        left        right
Code      [0, 0, 0]   [0, 1, 1]   [1, 0, 1]   [1, 1, 0]

This code can detect any single error. For example, if "down" was sent as [0, 1, 1] and an error occurred in one component, the receiver would read either [1, 1, 1] or [0, 0, 1] or [0, 1, 0], none of which is a code vector. So the receiver would know that an error had occurred (but not where) and could ask that the encoded message be retransmitted. (Why wouldn't the receiver know where the error was?)
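The claim that the length-3 code detects every single error, while the length-2 code detects none, can be checked exhaustively because both codes are tiny. The following Python sketch does this by brute force; the dictionary names `CODE_2` and `CODE_3` and the helper functions are our own illustrative names for the codes of Tables 1.4 and 1.5.

```python
# The codes of Table 1.4 and Table 1.5, stored as tuples of 0s and 1s.
CODE_2 = {"up": (0, 0), "down": (0, 1), "left": (1, 0), "right": (1, 1)}
CODE_3 = {"up": (0, 0, 0), "down": (0, 1, 1),
          "left": (1, 0, 1), "right": (1, 1, 0)}

def single_errors(vec):
    """Yield every vector obtained from vec by flipping exactly one component."""
    for i in range(len(vec)):
        yield vec[:i] + (1 - vec[i],) + vec[i + 1:]

def detects_all_single_errors(code):
    """True if no single error turns one code vector into another code vector."""
    words = set(code.values())
    return all(received not in words
               for sent in words
               for received in single_errors(sent))

print(detects_all_single_errors(CODE_2))  # False
print(detects_all_single_errors(CODE_3))  # True

# The three possible corruptions of "down" in the length-3 code.
print(list(single_errors(CODE_3["down"])))
```

The same brute-force test works for any small binary code, although the number of cases grows quickly with the number and length of the code vectors.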
The term parity comes from the Latin word par, meaning "equal or even." Two integers are said to have the same parity if they are both even or both odd.
The code in Table 1.5 is an example of an error-detecting code. Until the 1940s, this was the best that could be achieved. The advent of digital computers led to the development of codes that could correct as well as detect errors. We will consider these in Chapters 3, 6, and 7.

The message to be transmitted may itself consist of binary vectors. In this case, a simple but useful error-detecting code is a parity check code, which is created by appending an extra component, called a check digit, to each vector so that the parity (the total number of 1s) is even.
Example 1.31

If the message to be sent is the binary vector [1, 0, 0, 1, 0, 1], which has an odd number of 1s, then the check digit will be 1 (in order to make the total number of 1s in the code vector even) and the code vector will be [1, 0, 0, 1, 0, 1, 1]. Note that a single error will be detected, since it will cause the parity of the code vector to change from even to odd. For example, if an error occurred in the third component, the code vector would be received as [1, 0, 1, 1, 0, 1, 1], whose parity is odd because it has five 1s.
Let's look at this concept a bit more formally. Suppose the message is the binary vector $\mathbf{b} = [b_1, b_2, \ldots, b_n]$ in $\mathbb{Z}_2^n$. Then the parity check code vector is $\mathbf{v} = [b_1, b_2, \ldots, b_n, d]$ in $\mathbb{Z}_2^{n+1}$, where the check digit $d$ is chosen so that

$$b_1 + b_2 + \cdots + b_n + d = 0 \quad \text{in } \mathbb{Z}_2$$

or, equivalently, so that

$$\mathbf{1} \cdot \mathbf{v} = 0$$

where $\mathbf{1} = [1, 1, \ldots, 1]$, a vector whose every component is 1. The vector $\mathbf{1}$ is called a check vector. If vector $\mathbf{v}'$ is received and $\mathbf{1} \cdot \mathbf{v}' = 1$, then we can be certain that an error has occurred. (Although we are not considering the possibility of more than one error, observe that this scheme will not detect an even number of errors.)

Parity check codes are a special case of the more general check digit codes, which we will consider after first extending the foregoing ideas to more general settings.
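The parity check scheme can be sketched in a few lines of Python. The function names `encode` and `check_value` are our own; `check_value` computes the dot product of the received vector with the check vector of all 1s in $\mathbb{Z}_2$, which is simply the sum of the components modulo 2. The data are those of Example 1.31, together with a hypothetical double error to illustrate the limitation noted above.

```python
def encode(bits):
    """Append a check digit so that the total number of 1s is even."""
    check_digit = sum(bits) % 2
    return bits + [check_digit]

def check_value(received):
    """Dot product of the received vector with [1, 1, ..., 1] in Z_2.
    A value of 1 means an error has certainly occurred."""
    return sum(received) % 2

message = [1, 0, 0, 1, 0, 1]
code_vector = encode(message)
print(code_vector)                       # [1, 0, 0, 1, 0, 1, 1]
print(check_value(code_vector))          # 0: consistent with no error

# A single error in the third component, as in Example 1.31.
single_error = [1, 0, 1, 1, 0, 1, 1]
print(check_value(single_error))         # 1: error detected

# A hypothetical double error (third and fourth components) goes unnoticed.
double_error = [1, 0, 1, 0, 0, 1, 1]
print(check_value(double_error))         # 0: not detected
```

A check value of 0 therefore does not prove that transmission was error-free; it only rules out an odd number of errors.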
Modular Arithmetic

It is possible to generalize what we have just done for binary vectors to vectors whose components are taken from a finite set {0, 1, 2, ..., k} for $k \ge 2$. To do so, we must first extend the idea of binary arithmetic.
Example 1.32

The integers modulo 3 consist of the set $\mathbb{Z}_3 = \{0, 1, 2\}$ with addition and multiplication as given below:

 + | 0  1  2         · | 0  1  2
---+---------       ---+---------
 0 | 0  1  2         0 | 0  0  0
 1 | 1  2  0         1 | 0  1  2
 2 | 2  0  1         2 | 0  2  1
Observe that the result of each addition and multiplication belongs to the set {0, 1, 2}; we say that $\mathbb{Z}_3$ is closed with respect to the operations of addition and multiplication. It is perhaps easiest to think of this set in terms of a three-hour clock with 0, 1, and 2 on its face, as shown in Figure 1.69. The calculation 1 + 2 = 0 translates as follows: 2 hours after 1 o'clock, it is 0 o'clock. Just as 24:00 and 12:00 are the same on a 12-hour clock, so 3 and 0 are equivalent on this 3-hour clock. Likewise, all multiples of 3, positive and negative, are equivalent to 0 here; 1 is equivalent to any number that is 1 more than a multiple of 3 (such as −2, 4, and 7); and 2 is equivalent to any number that is 2 more than a multiple of 3 (such as −1, 5, and 8). We can visualize the number line as wrapping around a circle, as shown in Figure 1.70.

[Figure 1.69: Arithmetic modulo 3]

[Figure 1.70: the number line wrapped around a circle, with ..., −3, 0, 3, ... at 0; ..., −2, 1, 4, ... at 1; and ..., −1, 2, 5, ... at 2]
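The tables for $\mathbb{Z}_3$ can be generated rather than memorized. The sketch below, with the hypothetical helper name `tables`, builds the addition and multiplication tables for the integers modulo `m` as nested lists, and then uses Python's `%` operator to confirm the equivalences mentioned in the clock discussion. In Python, `%` with a positive modulus always returns a value from 0 to m − 1, even for negative integers, which matches the convention used here; this is not true of the remainder operator in every programming language.

```python
def tables(m):
    """Return the addition and multiplication tables for the integers modulo m.
    Entry [a][b] holds a + b (or a * b) reduced modulo m."""
    addition = [[(a + b) % m for b in range(m)] for a in range(m)]
    multiplication = [[(a * b) % m for b in range(m)] for a in range(m)]
    return addition, multiplication

add3, mul3 = tables(3)
for row in add3:
    print(row)
for row in mul3:
    print(row)

# Integers that are equivalent to 1 and to 2 modulo 3.
print([n % 3 for n in (-2, 4, 7)])  # [1, 1, 1]
print([n % 3 for n in (-1, 5, 8)])  # [2, 2, 2]
```

Calling `tables(2)` reproduces the binary addition and multiplication rules given earlier in the section.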
Example 1.33

To whi[...]

[...] linear combination of the vectors $\begin{bmatrix} 1 \\ 0 \\ 3 \end{bmatrix}$ and $\begin{bmatrix} -1 \\ 1 \\ -3 \end{bmatrix}$?
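Solution (a) We want to find scalars $x$ and $y$ such that

$$x\begin{bmatrix} 1 \\ 0 \\ 3 \end{bmatrix} + y\begin{bmatrix} -1 \\ 1 \\ -3 \end{bmatrix} = \begin{bmatrix} 1 \\ 2 \\ 3 \end{bmatrix}$$

Expanding, we obtain the system

$$
\begin{aligned}
x - y &= 1 \\
y &= 2 \\
3x - 3y &= 3
\end{aligned}
$$

whose augmented matrix is

$$\left[\begin{array}{rr|r} 1 & -1 & 1 \\ 0 & 1 & 2 \\ 3 & -3 & 3 \end{array}\right]$$

(Observe that the columns of the augmented matrix are just the given vectors; notice the order of the vectors, in particular, which vec[...]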
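[...] we have [...]

Thus, $\mathbb{R}^3 = \operatorname{span}(\mathbf{e}_1, \mathbf{e}_2, \mathbf{e}_3)$. You should have no difficulty seeing that, in general, $\mathbb{R}^n = \operatorname{span}(\mathbf{e}_1, \mathbf{e}_2, \ldots, \mathbf{e}_n)$.

When the span of a set of vectors in $\mathbb{R}^n$ is not all of $\mathbb{R}^n$, it is reasonable to ask for a description of the vectors' span.

Example 2.21

Find the span of $\begin{bmatrix} 1 \\ 0 \\ 3 \end{bmatrix}$ and $\begin{bmatrix} -1 \\ 1 \\ -3 \end{bmatrix}$. (See Example 2.18.)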
Chapter 2 Systems of Linear Equations
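Solution Thinking geometrically, we can see that the set of all linear combinations of $\begin{bmatrix} 1 \\ 0 \\ 3 \end{bmatrix}$ and $\begin{bmatrix} -1 \\ 1 \\ -3 \end{bmatrix}$ is just the plane through the origin with $\begin{bmatrix} 1 \\ 0 \\ 3 \end{bmatrix}$ and $\begin{bmatrix} -1 \\ 1 \\ -3 \end{bmatrix}$ as direction vectors [...]

$$\begin{bmatrix} x \\ y \\ z \end{bmatrix} = s\begin{bmatrix} 1 \\ 0 \\ 3 \end{bmatrix} + t\begin{bmatrix} -1 \\ 1 \\ -3 \end{bmatrix}$$

which is just another way of saying that $\begin{bmatrix} x \\ y \\ z \end{bmatrix}$ is in the span of $\begin{bmatrix} 1 \\ 0 \\ 3 \end{bmatrix}$ and $\begin{bmatrix} -1 \\ 1 \\ -3 \end{bmatrix}$.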
Two nonparallel ve[...]

[...]

In Exercises 13-16, describe the span of the given vectors (a) geometrically and (b) algebraically.

13.-16. [...]

17. The general equation of the plane that contains the points (1, 0, 3), (−1, 1, −3), and the origin is of the form $ax + by + cz = 0$. Solve for $a$, $b$, and $c$.

18. Prove that $\mathbf{u}$, $\mathbf{v}$, and $\mathbf{w}$ are all in $\operatorname{span}(\mathbf{u}, \mathbf{v}, \mathbf{w})$.

19. Prove that $\mathbf{u}$, $\mathbf{v}$, and $\mathbf{w}$ are all in $\operatorname{span}(\mathbf{u}, \mathbf{u} + \mathbf{v}, \mathbf{u} + \mathbf{v} + \mathbf{w})$.

20. (a) Prove that if $\mathbf{u}_1, \ldots, \mathbf{u}_m$ are vectors in $\mathbb{R}^n$, $S = \{\mathbf{u}_1, \mathbf{u}_2, \ldots, \mathbf{u}_k\}$, and $T = \{\mathbf{u}_1, \ldots, \mathbf{u}_k, \mathbf{u}_{k+1}, \ldots, \mathbf{u}_m\}$, then $\operatorname{span}(S) \subseteq \operatorname{span}(T)$. (Hint: Rephrase this question in terms of linear combinations.)
(b) Deduce that if $\mathbb{R}^n = \operatorname{span}(S)$, then $\mathbb{R}^n = \operatorname{span}(T)$ also.

21. (a) Suppose that vector $\mathbf{w}$ is a linear combination of vectors $\mathbf{u}_1, \ldots, \mathbf{u}_k$ and that each $\mathbf{u}_i$ is a linear combination of vectors $\mathbf{v}_1, \ldots, \mathbf{v}_m$. Prove that $\mathbf{w}$ is a linear combination of $\mathbf{v}_1, \ldots, \mathbf{v}_m$ and therefore $\operatorname{span}(\mathbf{u}_1, \ldots, \mathbf{u}_k) \subseteq \operatorname{span}(\mathbf{v}_1, \ldots, \mathbf{v}_m)$.
[...]

22.-31. [...]

In Exercises 32-41, determine if the sets of vectors in the given exercise are linearly independent by converting the vectors to row vectors and using the method of Example 2.25 and Theorem 2.7. For any sets that are linearly dependent, find a dependence relationship among the vectors.

32. Exercise 22
33. Exercise 23
34. Exercise 24
35. Exercise 25
36. Exercise 26
37. Exercise 27
38. Exercise 28
39. Exercise 29
40. Exercise 30
41. Exercise 31

42. (a) If the columns of an $n \times n$ matrix $A$ are linearly independent as vectors in $\mathbb{R}^n$, what is the rank of $A$? Explain.
(b) If the rows of an $n \times n$ matrix $A$ are linearly independent as vectors in $\mathbb{R}^n$, what is the rank of $A$? Explain.

43. (a) If vectors $\mathbf{u}$, $\mathbf{v}$, and $\mathbf{w}$ are linearly independent, will $\mathbf{u} + \mathbf{v}$, $\mathbf{v} + \mathbf{w}$, and $\mathbf{u} + \mathbf{w}$ also be linearly independent? Justify your answer.
(b) If vectors $\mathbf{u}$, $\mathbf{v}$, and $\mathbf{w}$ are linearly independent, will $\mathbf{u} - \mathbf{v}$, $\mathbf{v} - \mathbf{w}$, and $\mathbf{u} - \mathbf{w}$ also be linearly independent? Justify your answer.

44. Prove that two vectors are linearly dependent if and only if one is a scalar multiple of the other. (Hint: Separately consider the case where one of the vectors is $\mathbf{0}$.)

45. Give a "row vector proof" of Theorem 2.8.

46. Prove that every subset of a linearly independent set is linearly independent.

47. Suppose that $S = \{\mathbf{v}_1, \ldots, \mathbf{v}_k, \mathbf{v}\}$ is a set of vectors in some $\mathbb{R}^n$ and that $\mathbf{v}$ is a linear combination of $\mathbf{v}_1, \ldots, \mathbf{v}_k$. If $S' = \{\mathbf{v}_1, \ldots, \mathbf{v}_k\}$, prove that $\operatorname{span}(S) = \operatorname{span}(S')$. [Hint: Exercise 21(b) is helpful here.]

48. Let $\{\mathbf{v}_1, \ldots, \mathbf{v}_k\}$ be a linearly independent set of vectors in $\mathbb{R}^n$, and let $\mathbf{v}$ be a vector in $\mathbb{R}^n$. Suppose that $\mathbf{v} = c_1\mathbf{v}_1 + c_2\mathbf{v}_2 + \cdots + c_k\mathbf{v}_k$ with $c_1 \ne 0$. Prove that $\{\mathbf{v}, \mathbf{v}_2, \ldots, \mathbf{v}_k\}$ is linearly independent.
Applications

There are too many applications of systems of linear equations to do them justice in a single section. This section will introduce a few applications, to illustrate the diverse settings in which they arise.

Allocation of Resources

A great many applications of systems of linear equations involve allocating limited resources subject to a set of constraints.

Example 2.27

A biologist has placed three strains of bacteria (denoted I, II, and III) in a test tube, where they will feed on three different food sources (A, B, and C). Each day 2300 units of A, 800 units of B, and 1500 units of C are placed in the test tube, and each bacterium consumes a certain number of units of each food per day, as shown in Table 2.2. How many bacteria of each strain can coexist in the test tube and consume all of the food?

Table 2.2
          Bacteria    Bacteria     Bacteria
          Strain I    Strain II    Strain III
Food A    2           2            4
Food B    1           2            0
Food C    1           3            1
Solution Let $x_1$, $x_2$, and $x_3$ be the numbers of bacteria of strains I, II, and III, respectively. Since each of the $x_1$ bacteria of strain I consumes 2 units of A per day, strain I consumes a total of $2x_1$ units per day. Similarly, strains II and III consume a total of $2x_2$ and $4x_3$ units of food A daily. Since we w[...]
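The setup begun above leads to one linear equation per food source. As a sketch of how such a system could be solved mechanically, the Python code below applies Gauss-Jordan elimination with exact fractions. Two assumptions should be kept in mind: the function `solve` is our own helper, written only for square systems with a unique solution, and the augmented matrix is our reading of Table 2.2 (Food A: 2, 2, 4; Food B: 1, 2, 0; Food C: 1, 3, 1) together with the daily supplies 2300, 800, and 1500.

```python
from fractions import Fraction

def solve(augmented):
    """Gauss-Jordan elimination on an n x (n + 1) augmented matrix.
    Assumes the system has a unique solution; otherwise the pivot
    search below raises StopIteration."""
    rows = [[Fraction(x) for x in row] for row in augmented]
    n = len(rows)
    for col in range(n):
        # Find a row at or below the diagonal with a nonzero entry and swap it up.
        pivot = next(r for r in range(col, n) if rows[r][col] != 0)
        rows[col], rows[pivot] = rows[pivot], rows[col]
        # Scale the pivot row so that the leading entry is 1.
        lead = rows[col][col]
        rows[col] = [x / lead for x in rows[col]]
        # Clear the rest of the column.
        for r in range(n):
            if r != col and rows[r][col] != 0:
                factor = rows[r][col]
                rows[r] = [x - factor * y for x, y in zip(rows[r], rows[col])]
    return [row[-1] for row in rows]

# One row per food source: coefficients for strains I, II, III, then the supply.
augmented = [
    [2, 2, 4, 2300],  # Food A
    [1, 2, 0, 800],   # Food B
    [1, 3, 1, 1500],  # Food C
]

x1, x2, x3 = solve(augmented)
print(x1, x2, x3)  # 100 350 350
```

Under these assumptions the three strains would number 100, 350, and 350, and substituting back into each row confirms that all of the food is consumed. Exact fractions are used so that no rounding error can disguise a noninteger answer, which would be meaningless for a count of bacteria.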
[...]

Hence, there is a unique solution: $x_1 = 2$, $x_2 = 1$, $x_3 = 1$. In other words, we must push switch A twice and the other two switches once each. (Check this.)
Exercises 2.4

Allocation of Resources

1. Suppose that, in Example 2.27, 400 units of food A, 600 units of B, and 600 units of C are placed in the test tube each day and the data on daily food consumption by the bacteria (in units per day) are as shown in Table 2.4. How many bacteria of each strain can coexist in the test tube and consume all of the food?

Table 2.4
          Bacteria    Bacteria     Bacteria
          Strain I    Strain II    Strain III
Food A    1           2            0
Food B    2           1            1
Food C    1           1            2

2. Suppose that in Example 2.27, 400 units of food A, 500 units of B, and 600 units of C are placed in the test tube each day and the data on daily food consumption by the bacteria (in units per day) are as shown in Table 2.5. How many bacteria of each strain can coexist in the test tube and consume all of the food?

Table 2.5
          Bacteria    Bacteria     Bacteria
          Strain I    Strain II    Strain III
Food A    1           2            0
Food B    2           1            3
Food C    1           1            1

3. A florist offers three sizes of flower arrangements containing roses, daisies, and chrysanthemums. Each small arrangement contains one rose, three daisies, and three chrysanthemums. Each medium arrangement contains two roses, four daisies, and six chrysanthemums. Each large arrangement contains four roses, eight daisies, and six chrysanthemums. One day, the florist noted that she used a total of 24 roses, 50 daisies, and 48 chrysanthemums in filling orders for these three types of arrangements. How many arrangements of each type did she make?

4. (a) In your pocket you have some nickels, dimes, and quarters. There are 20 coins altogether and exactly twice as many dimes as nickels. The total value of the coins is $3.00. Find the number of coins of each type.
(b) Find all possible combinations of 20 coins (nickels, dimes, and quarters) that will make exactly $3.00.

5. A coffee merchant sells three blends of coffee. A bag of the house blend contains 300 grams of Colombian beans and 200 grams of French roast beans. A bag of the special blend contains 200 grams of Colombian beans, 200 grams of Kenyan beans, and 100 grams of French roast beans. A bag of the gourmet blend contains 100 grams of Colombian beans, 200 grams of Kenyan beans, and 200 grams of French roast beans. The merchant has on hand 30 kilograms of Colombian beans, 15 kilograms of Kenyan beans, and 25 kilograms of French roast beans. If he wishes to use up all of the beans, how many bags of each type of blend can be made?

6. Redo Exercise 5, assuming that the house blend contains 300 grams of Colombian beans, 50 grams of Kenyan beans, and 150 grams of French roast beans and the gourmet blend contains 100 grams of Colombian beans, 350 grams of Kenyan beans, and 50 grams of French roast beans. This time the merchant has on hand 30 kilograms of Colombian beans, 15 kilograms of Kenyan beans, and 15 kilograms of French roast beans. Suppose one bag of the house blend produces a profit of $0.50, one bag of the special blend produces a profit of $1.50, and one bag of the gourmet blend produces a profit of $2.00. How many bags of each type should the merchant prepare if he wants to use up all of the beans and maximize his profit? What is the maximum profit?
Balancing Chemical Equations

In Exercises 7-14, balance the chemical equation for each reaction.

7. $\mathrm{FeS_2 + O_2 \longrightarrow Fe_2O_3 + SO_2}$

8. $\mathrm{CO_2 + H_2O \longrightarrow C_6H_{12}O_6 + O_2}$ (This reaction takes place when a green plant converts carbon dioxide and water to glucose and oxygen during photosynthesis.)

9. $\mathrm{C_4H_{10} + O_2 \longrightarrow CO_2 + H_2O}$ (This reaction occurs when butane, $\mathrm{C_4H_{10}}$, burns in the presence of oxygen to form carbon dioxide and water.)

10. $\mathrm{C_7H_6O_2 + O_2 \longrightarrow H_2O + CO_2}$

11. $\mathrm{C_5H_{11}OH + O_2 \longrightarrow H_2O + CO_2}$ (This equation represents the combustion of amyl alcohol.)

12. $\mathrm{HClO_4 + P_4O_{10} \longrightarrow H_3PO_4 + Cl_2O_7}$

13. $\mathrm{Na_2CO_3 + C + N_2 \longrightarrow NaCN + CO}$

14. $\mathrm{C_2H_2Cl_4 + Ca(OH)_2 \longrightarrow C_2HCl_3 + CaCl_2 + H_2O}$

Network Analysis

15. Figure 2.18 shows a network of water pipes with flows measured in liters per minute.
(a) Set up and solve a system of linear equations to find the possible flows.
(b) If the flow through AB is restricted to 5 L/min, what will the flows through the other two branches be?
(c) What are the minimum and maximum possible flows through each branch?
(d) We have been assuming that flow is always positive. What would negative flow mean, assuming we allowed it? Give an illustration for this example.

[Figure 2.18: network of water pipes]

16. The downtown core of Gotham City consists of one-way streets, and the traffic flow has been measured at each intersection. For the city block shown in Figure 2.19, the numbers represent the average numbers of vehicles per minute entering and leaving intersections A, B, C, and D during business hours.
(a) Set up and solve a system of linear equations to find the possible flows $f_1, \ldots, f_4$.
(b) If traffic is regulated on CD so that $f_4 = 10$ vehicles per minute, what will the average flows on the other streets be?
(c) What are the minimum and maximum possible flows on each street?
(d) How would the solution change if all of the directions were reversed?

[Figure 2.19: traffic flows around a city block with intersections A, B, C, and D]
Section 2.4 Applications

17. A network of irrigation ditches is shown in Figure 2.20, with flows measured in thousands of liters per day.
(a) Set up and solve a system of linear equations to find the possible flows $f_1, \ldots, f_5$.
(b) Suppose DC is closed. What range of flow will need to be maintained through DB?
(c) From Figure 2.20 it is clear that DB cannot be closed. (Why not?) How does your solution in part (a) show this?
(d) From your solution in part (a), determine the minimum and maximum flows through DB.

[Figure 2.20: network of irrigation ditches]

18. (a) Set up and solve a system of linear equations to find the possible flows in the network shown in Figure 2.21.
(b) Is it possible for $f_1 = 100$ and $f_6 = 150$? (Answer this question first with reference to your solution in part (a) and then directly from Figure 2.21.)
(c) If $f_4 = 0$, what will the range of flow be on each of the other branches?

[Figure 2.21: flow network]

Electrical Networks

For Exercises 19 and 20, determine the currents for the given electrical networks.

19. [Circuit diagram for Exercise 19]

20. [Circuit diagram for Exercise 20]

21. (a) Find the currents $I, I_1, \ldots, I_5$ in the bridge circuit in Figure 2.22.
(b) Find the effective resistance of this network.
(c) Can you change the resistance in branch BC (but leave everything else unchanged) so that the current through branch CE becomes 0?

[Figure 2.22: bridge circuit]
Chapter 2 Systems of Linear Equations
22. The networks in parts (a) and (b) of Figure 2.23 show two resistors coupled in series and in parallel, respectively. We wish to find a general formula for the effective resistance of each network, that is, find $R_{\text{eff}}$ such that $E = R_{\text{eff}} I$.
(a) Show that the effective resistance $R_{\text{eff}}$ of a network with two resistors coupled in series [Figure 2.23(a)] is given by
$$R_{\text{eff}} = R_1 + R_2$$
(b) Show that the effective resistance $R_{\text{eff}}$ of a network with two resistors coupled in parallel [Figure 2.23(b)] is given by
$$R_{\text{eff}} = \frac{1}{\dfrac{1}{R_1} + \dfrac{1}{R_2}}$$

[Figure 2.23: Resistors in series and in parallel]

Finite Linear Games

23. (a) In Example 2.33, suppose all the lights are initially off. Can we push the switches in some order so that only the second and fourth lights will be on?
(b) Can we push the switches in some order so that only the second light will be on?

24. (a) In Example 2.33, suppose the fourth light is initially on and the other four lights are off. Can we push the switches in some order so that only the second and fourth lights will be on?
(b) Can we push the switches in some order so that only the second light will be on?

25. In Example 2.33, describe all possible configurations of lights that can be obtained if we start with all the lights off.

26. (a) In Example 2.34, suppose that all of the lights are initially off. Show that it is possible to push the switches in some order so that the lights are off, dark blue, and light blue, in that order.
(b) Show that it is possible to push the switches in some order so that the lights are light blue, off, and light blue, in that order.
(c) Prove that any configuration of the three lights can be achieved.

27. Suppose the lights in Example 2.33 can be off, light blue, or dark blue and the switches work as described in Example 2.34. (That is, the switches control the same lights as in Example 2.33 but cycle through the colors as in Example 2.34.) Show that it is possible to start with all of the lights off and push the switches in some order so that the lights are dark blue, light blue, dark blue, light blue, and dark blue, in that order.

28. For Exercise 27, describe all possible configurations of lights that can be obtained, starting with all the lights off.

29. Nine squares, each one either black or white, are arranged in a 3×3 grid. Figure 2.24 shows one possible arrangement. When touched, each square changes its own state and the states of some of its neighbors (black → white and white → black). Figure 2.25 shows how the state changes work. (Touching the square whose number is circled causes the states of the squares marked * to change.) The object of the game is to turn all nine squares black. [Exercises 29 and 30

[Figure 2.24: The nine squares puzzle]
[Figure 2.25: state changes for the nine squares puzzle]

(b) (3, 1), (2, 2), and (1, 5)

40. Through any three noncollinear points there also passes a unique circle. Find the circles (whose general equations are of the form $x^2 + y^2 + ax + by + c = 0$) that pass through the sets of points in Exercise 39. (To check the validity of your answer, find the center and radius of each circle and draw a sketch.)

The process of adding rational functions (ratios of polynomials) by placing them over a common denominator is the analogue of adding rational numbers. The reverse process of taking a rational function apart by writing it as a sum of simpler rational functions is useful in several areas of mathematics;

... 0) can be established using mathematical induction (see Appendix B). One way to make an educated guess as to what the formulas are, though, is to observe that we can rewrite the two formulas above as

...

respectively. This leads to the conjecture that the sum of $p$th powers of the first $n$ natural numbers is a polynomial of degree $p + 1$ in the variable $n$.

45. Assuming that $1 + 2 + \cdots + n = an^2 + bn + c$, find $a$, $b$, and $c$ by substituting three values for $n$ and thereby obtaining a system of linear equations in $a$, $b$, and $c$.

46. Assume that $1^2 + 2^2 + \cdots + n^2 = an^3 + bn^2 + cn + d$. Find $a$, $b$, $c$, and $d$. (Hint: It is legitimate to use $n = 0$. What is the left-hand side in that case?)

47. Show that $1^3 + 2^3 + \cdots + n^3 = \left(n(n + 1)/2\right)^2$.
The Global Positioning System
The Global Positioning System (GPS) is used in a variety of situations for determining geographical locations. The military, surveyors, airlines, shipping companies, and hikers all make use of it. GPS technology is becoming so commonplace that some automobiles, cellular phones, and various handheld devices are now equipped with it. The basic idea of GPS is a variant on three-dimensional triangulation: A point on Earth's surface is uniquely determined by knowing its distances from three other points. Here the point we wish to determine is the location of the GPS receiver, the other points are satellites, and the distances are computed using the travel times of radio signals from the satellites to the receiver.

We will assume that Earth is a sphere on which we impose an $xyz$-coordinate system with Earth centered at the origin and with the positive $z$-axis running through the north pole and fixed relative to Earth. For simplicity, let's take one unit to be equal to the radius of Earth. Thus Earth's surface becomes the unit sphere with equation $x^2 + y^2 + z^2 = 1$. Time will be measured in hundredths of a second. GPS finds distances by knowing how long it takes a radio signal to get from one point to another. For this we need to know the speed of light, which is approximately equal to 0.47 (Earth radii per hundredths of a second).

Let's imagine that you are a hiker lost in the woods at point $(x, y, z)$ at some time $t$. You don't know where you are, and furthermore, you have no watch, so you don't know what time it is. However, you have your GPS device, and it receives simultaneous signals from four satellites, giving their positions and times as shown in Table 2.6. (Distances are measured in Earth radii and time in hundredths of a second past midnight.)
This application is based on the article "An Underdetermined Linear System for GPS" by Dan Kalman in The College Mathematics Journal, 33 (2002), pp. 384-390.

For a more in-depth treatment of the ideas introduced here, see G. Strang and K. Borre, Linear Algebra, Geodesy, and GPS (Wellesley-Cambridge Press, MA, 1997).
Table 2.6 Satellite Data

Satellite   Position              Time
1           (1.11, 2.55, 2.14)    1.29
2           (2.87, 0.00, 1.43)    1.31
3           (0.00, 1.08, 2.29)    2.75
4           (1.54, 1.01, 1.23)    4.06

Let $(x, y, z)$ be your position, and let $t$ be the time when the signals arrive. The goal is to solve for $x$, $y$, $z$, and $t$. Your distance from Satellite 1 can be computed as follows. The signal, traveling at a speed of 0.47 Earth radii/$10^{-2}$ sec, was sent at time 1.29 and arrived at time $t$, so it took $t - 1.29$ hundredths of a second to reach you. Distance equals velocity multiplied by (elapsed) time, so

$$d = 0.47(t - 1.29)$$

We can also express $d$ in terms of $(x, y, z)$ and the satellite's position (1.11, 2.55, 2.14) using the distance formula:

$$d = \sqrt{(x - 1.11)^2 + (y - 2.55)^2 + (z - 2.14)^2}$$

Combining these results leads to the equation

$$(x - 1.11)^2 + (y - 2.55)^2 + (z - 2.14)^2 = 0.47^2 (t - 1.29)^2 \tag{1}$$

Expanding, simplifying, and rearranging, we find that equation (1) becomes

$$2.22x + 5.10y + 4.28z - 0.57t = x^2 + y^2 + z^2 - 0.22t^2 + 11.95$$

Similarly, we can derive a corresponding equation for each of the other three satellites. We end up with a system of four equations in $x$, $y$, $z$, and $t$:

$$\begin{aligned}
2.22x + 5.10y + 4.28z - 0.57t &= x^2 + y^2 + z^2 - 0.22t^2 + 11.95 \\
5.74x + 2.86z - 0.58t &= x^2 + y^2 + z^2 - 0.22t^2 + 9.90 \\
2.16y + 4.58z - 1.21t &= x^2 + y^2 + z^2 - 0.22t^2 + 4.74 \\
3.08x + 2.02y + 2.46z - 1.79t &= x^2 + y^2 + z^2 - 0.22t^2 + 1.26
\end{aligned}$$

These are not linear equations, but the nonlinear terms are the same in each equation. If we subtract the first equation from each of the other three equations, we obtain a linear system:

$$\begin{aligned}
3.52x - 5.10y - 1.42z - 0.01t &= -2.05 \\
-2.22x - 2.94y + 0.30z - 0.64t &= -7.21 \\
0.86x - 3.08y - 1.82z - 1.22t &= -10.69
\end{aligned}$$

The augmented matrix row reduces as

$$\left[\begin{array}{rrrr|r}
3.52 & -5.10 & -1.42 & -0.01 & -2.05 \\
-2.22 & -2.94 & 0.30 & -0.64 & -7.21 \\
0.86 & -3.08 & -1.82 & -1.22 & -10.69
\end{array}\right]
\longrightarrow
\left[\begin{array}{rrrr|r}
1 & 0 & 0 & 0.36 & 2.97 \\
0 & 1 & 0 & 0.03 & 0.81 \\
0 & 0 & 1 & 0.79 & 5.91
\end{array}\right]$$

from which we see that

$$\begin{aligned}
x &= 2.97 - 0.36t \\
y &= 0.81 - 0.03t \\
z &= 5.91 - 0.79t
\end{aligned} \tag{2}$$

with $t$ free. Substituting these equations into (1), we obtain

$$(2.97 - 0.36t - 1.11)^2 + (0.81 - 0.03t - 2.55)^2 + (5.91 - 0.79t - 2.14)^2 = 0.47^2 (t - 1.29)^2$$

which simplifies to the quadratic equation

$$0.54t^2 - 6.65t + 20.32 = 0$$

There are two solutions:

$$t = 6.74 \quad \text{and} \quad t = 5.60$$

Substituting into (2), we find that the first solution corresponds to $(x, y, z) = (0.55, 0.61, 0.56)$ and the second solution to $(x, y, z) = (0.96, 0.65, 1.46)$. The second solution is clearly not on the unit sphere (Earth), so we reject it. The first solution produces $x^2 + y^2 + z^2 = 0.99$, so we are satisfied that, within acceptable roundoff error, we have located your coordinates as (0.55, 0.61, 0.56).

In practice, GPS takes significantly more factors into account, such as the fact that Earth's surface is not exactly spherical, so additional refinements are needed involving such techniques as least squares approximation (see Chapter 7). In addition, the results of the GPS calculation are converted from rectangular (Cartesian) coordinates into latitude and longitude, an interesting exercise in itself and one involving yet other branches of mathematics.
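The GPS calculation above can be retraced numerically. The following is a minimal sketch in plain Python, not part of the original text: the function names (`expanded_row`, `gauss_jordan`, `locate_receiver`) are hypothetical, the satellite data and the speed 0.47 are taken from the text, and the arithmetic is done in full floating-point precision rather than with the two-decimal rounding used in the hand calculation, so intermediate values may differ slightly from those printed.

```python
import math

C = 0.47  # signal speed from the text, in Earth radii per hundredth of a second
# Satellite data from Table 2.6: (position in Earth radii, time in 0.01 s)
SATELLITES = [
    ((1.11, 2.55, 2.14), 1.29),
    ((2.87, 0.00, 1.43), 1.31),
    ((0.00, 1.08, 2.29), 2.75),
    ((1.54, 1.01, 1.23), 4.06),
]


def expanded_row(pos, t0):
    """Row [2a, 2b, 2c, -2C^2 t0, a^2+b^2+c^2 - C^2 t0^2] of the expanded equation.

    It encodes 2ax + 2by + 2cz - 2C^2 t0 t = (x^2+y^2+z^2 - C^2 t^2) + constant.
    """
    a, b, c = pos
    return [2 * a, 2 * b, 2 * c, -2 * C * C * t0,
            a * a + b * b + c * c - C * C * t0 * t0]


def gauss_jordan(rows):
    """Reduce a 3x5 augmented matrix so that its first three columns become I."""
    m = [row[:] for row in rows]
    n = len(m)
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))  # partial pivoting
        m[col], m[pivot] = m[pivot], m[col]
        p = m[col][col]
        m[col] = [v / p for v in m[col]]
        for r in range(n):
            if r != col:
                f = m[r][col]
                m[r] = [v - f * w for v, w in zip(m[r], m[col])]
    return m


def locate_receiver():
    """Return (t, (x, y, z)) for the candidate lying closest to the unit sphere."""
    first = expanded_row(*SATELLITES[0])
    # Subtracting satellite 1's row cancels the shared nonlinear terms.
    linear = [[u - v for u, v in zip(expanded_row(*s), first)]
              for s in SATELLITES[1:]]
    reduced = gauss_jordan(linear)  # each row now reads x_i + w_i t = p_i
    p = [row[4] for row in reduced]
    w = [row[3] for row in reduced]
    sat, t0 = SATELLITES[0]
    e = [pi - si for pi, si in zip(p, sat)]
    # Substituting x_i = p_i - w_i t into equation (1) gives a t^2 + b t + c = 0.
    a = sum(wi * wi for wi in w) - C * C
    b = -2 * sum(ei * wi for ei, wi in zip(e, w)) + 2 * C * C * t0
    c = sum(ei * ei for ei in e) - C * C * t0 * t0
    root = math.sqrt(b * b - 4 * a * c)
    candidates = []
    for t in ((-b + root) / (2 * a), (-b - root) / (2 * a)):
        candidates.append((t, tuple(pi - wi * t for pi, wi in zip(p, w))))
    # Keep the candidate whose distance from the origin is closest to 1.
    return min(candidates,
               key=lambda cand: abs(math.sqrt(sum(v * v for v in cand[1])) - 1.0))


if __name__ == "__main__":
    t, position = locate_receiver()
    print(round(t, 2), [round(v, 2) for v in position])
```

Choosing the root by distance from the unit sphere mirrors the step in the text where the second solution is rejected for not lying on Earth's surface.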
Section 2.5 Iterative Methods for Solving Linear Systems

Table 2.7

n      0      1      2      3      4      5      6
x_1    0      0.714  0.914  0.976  0.993  0.998  0.999
x_2    0      1.400  1.829  1.949  1.985  1.996  1.999

The successive vectors $\begin{bmatrix} x_1 \\ x_2 \end{bmatrix}$ are called iterates, so, for example, when $n = 4$, the fourth iterate is $\begin{bmatrix} 0.993 \\ 1.985 \end{bmatrix}$. We can see that the iterates in this example are approaching $\begin{bmatrix} 1 \\ 2 \end{bmatrix}$, which is the exact solution of the given system. (Check this.) We say in this case that Jacobi's method converges.
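The Jacobi iterates can be reproduced with a few lines of Python. This is a minimal sketch, assuming the system under discussion is $7x_1 - x_2 = 5$, $3x_1 - 5x_2 = -7$ (the two lines plotted later in the geometric discussion), solved for $x_1$ and $x_2$ respectively; the function name `jacobi_2x2` is hypothetical.

```python
def jacobi_2x2(steps):
    """Jacobi iterates for 7*x1 - x2 = 5, 3*x1 - 5*x2 = -7, starting from (0, 0)."""
    x1, x2 = 0.0, 0.0
    iterates = [(x1, x2)]
    for _ in range(steps):
        # The tuple assignment evaluates both right-hand sides first, so each
        # update uses only values from the previous iterate (the Jacobi rule).
        x1, x2 = (5 + x2) / 7, (7 + 3 * x1) / 5
        iterates.append((x1, x2))
    return iterates


if __name__ == "__main__":
    for n, (x1, x2) in enumerate(jacobi_2x2(6)):
        print(n, round(x1, 3), round(x2, 3))
```

Rounded to three decimal places, the fourth iterate comes out as (0.993, 1.985) and the sixth as (0.999, 1.999).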
Jacobi's method calculates the successive iterates in a two-variable system according to the crisscross pattern shown in Table 2.8.

[Table 2.8: crisscross pattern of updates between $x_1$ and $x_2$ for $n = 0, 1, 2, 3$]

Before we consider Jacobi's method in the general case, we will look at a modification of it that often converges faster to the solution. The Gauss-Seidel method is the same as the Jacobi method except that we use each new value as soon as we can. So in our example, we begin by calculating $x_1 = (5 + 0)/7 = \frac{5}{7} \approx 0.714$ as before, but we now use this value of $x_1$ to get the next value of $x_2$:

$$x_2 = \frac{7 + 3 \cdot \frac{5}{7}}{5} \approx 1.829$$

We then use this value of $x_2$ to recalculate $x_1$, and so on. The iterates this time are shown in Table 2.9. We observe that the Gauss-Seidel method has converged faster to the solution. The iterates this time are calculated according to the zigzag pattern shown in Table 2.10.
Table 2.9

n      0      1      2      3      4      5
x_1    0      0.714  0.976  0.998  1.000  1.000
x_2    0      1.829  1.985  1.999  2.000  2.000
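For comparison with the Jacobi iteration, the Gauss-Seidel iterates can be sketched the same way. The system $7x_1 - x_2 = 5$, $3x_1 - 5x_2 = -7$ is assumed, and the function name `gauss_seidel_2x2` is hypothetical. The only change from a Jacobi loop is that $x_2$ is computed from the value of $x_1$ just obtained.

```python
def gauss_seidel_2x2(steps):
    """Gauss-Seidel iterates for 7*x1 - x2 = 5, 3*x1 - 5*x2 = -7 from (0, 0)."""
    x1, x2 = 0.0, 0.0
    iterates = [(x1, x2)]
    for _ in range(steps):
        x1 = (5 + x2) / 7      # uses the most recent x2
        x2 = (7 + 3 * x1) / 5  # uses the x1 computed on the line above
        iterates.append((x1, x2))
    return iterates


if __name__ == "__main__":
    for n, (x1, x2) in enumerate(gauss_seidel_2x2(5)):
        print(n, round(x1, 3), round(x2, 3))
```

To three decimal places the iterates settle at (1.000, 2.000) by the fourth step, whereas the Jacobi iteration is still at (0.993, 1.985) after four steps.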
[Table 2.10: zigzag pattern of updates between $x_1$ and $x_2$ for $n = 0, 1, 2, 3$]

The Gauss-Seidel method also has a nice geometric interpretation in the case of two variables. We can think of $x_1$ and $x_2$ as the coordinates of points in the plane. Our starting point is the point corresponding to our initial approximation, (0, 0). Our first calculation gives $x_1 = \frac{5}{7}$, so we move to the point $(\frac{5}{7}, 0) \approx (0.714, 0)$. Then we compute $x_2 = \frac{64}{35} \approx 1.829$, which moves us to the point $(\frac{5}{7}, \frac{64}{35}) \approx (0.714, 1.829)$. Continuing in this fashion, our calculations from the Gauss-Seidel method give rise to a sequence of points, each one differing from the preceding point in exactly one coordinate. If we plot the lines $7x_1 - x_2 = 5$ and $3x_1 - 5x_2 = -7$ corresponding to the two given equations, we find that the points calculated above fall alternately on the two lines, as shown in Figure 2.27. Moreover, they approach the point of intersection of the lines, which corresponds to the solution of the system of equations. This is what convergence means!

[Figure 2.27: Converging iterates]
The general cases of the two methods are analogous. Given a system of $n$ linear equations in $n$ variables,

$$\begin{aligned}
a_{11}x_1 + a_{12}x_2 + \cdots + a_{1n}x_n &= b_1 \\
a_{21}x_1 + a_{22}x_2 + \cdots + a_{2n}x_n &= b_2 \\
&\;\;\vdots \\
a_{n1}x_1 + a_{n2}x_2 + \cdots + a_{nn}x_n &= b_n
\end{aligned} \tag{2}$$

we solve the first equation for $x_1$, the second for $x_2$, and so on. Then, beginning with an initial approximation, we use these new equations to iteratively update each variable. Jacobi's method uses all of the values at the $k$th iteration to compute the $(k + 1)$st iterate, whereas the Gauss-Seidel method always uses the most recent value of each variable in every calculation. Example 2.37 below illustrates the Gauss-Seidel method in a three-variable problem.

At this point, you should have some questions and concerns about these iterative methods. (Do you?) Several come to mind: Must these methods converge? If not, when do they converge? If they converge, must they converge to the solution? The answer to the first question is no, as Example 2.36 illustrates.
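Example 2.36

Apply the Gauss-Seidel method to the system

$$\begin{aligned}
x_1 - x_2 &= 1 \\
2x_1 + x_2 &= 5
\end{aligned}$$

with initial approximation $\begin{bmatrix} 0 \\ 0 \end{bmatrix}$.

Solution We rearrange the equations to get

$$\begin{aligned}
x_1 &= 1 + x_2 \\
x_2 &= 5 - 2x_1
\end{aligned}$$

The first few iterates are given in Table 2.11. (Check these.) The actual solution to the given system is $\begin{bmatrix} x_1 \\ x_2 \end{bmatrix} = \begin{bmatrix} 2 \\ 1 \end{bmatrix}$. Clearly, the iterates in Table 2.11 are not approaching this point, as Figure 2.28 makes graphically clear in an example of divergence.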
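The divergence in this example is easy to observe numerically. The sketch below iterates the rearranged equations $x_1 = 1 + x_2$, $x_2 = 5 - 2x_1$ from the zero vector (an assumed starting point); the function name `gauss_seidel_diverging` is hypothetical.

```python
def gauss_seidel_diverging(steps):
    """Gauss-Seidel iterates for x1 = 1 + x2, x2 = 5 - 2*x1, starting at (0, 0)."""
    x1, x2 = 0, 0
    iterates = [(x1, x2)]
    for _ in range(steps):
        x1 = 1 + x2      # first equation solved for x1
        x2 = 5 - 2 * x1  # second equation solved for x2, using the new x1
        iterates.append((x1, x2))
    return iterates


if __name__ == "__main__":
    print(gauss_seidel_diverging(5))
    # [(0, 0), (1, 3), (4, -3), (-2, 9), (10, -15), (-14, 33)]
```

The distance of $x_1$ from the true value 2 is 1, 2, 4, 8, 16 in successive steps: the error doubles each time instead of shrinking.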
So when do these iterative methods converge? Unfortunately, the answer to this question is rather tricky. We will answer it completely in Chapter 7, but for now we will give a partial answer, without proof.

Let $A$ be the $n \times n$ matrix

$$A = \begin{bmatrix}
a_{11} & a_{12} & \cdots & a_{1n} \\
a_{21} & a_{22} & \cdots & a_{2n} \\
\vdots & \vdots & \ddots & \vdots \\
a_{n1} & a_{n2} & \cdots & a_{nn}
\end{bmatrix}$$

We say that $A$ is strictly diagonally dominant if

$$\begin{aligned}
|a_{11}| &> |a_{12}| + |a_{13}| + \cdots + |a_{1n}| \\
|a_{22}| &> |a_{21}| + |a_{23}| + \cdots + |a_{2n}| \\
&\;\;\vdots \\
|a_{nn}| &> |a_{n1}| + |a_{n2}| + \cdots + |a_{n,n-1}|
\end{aligned}$$

That is, the absolute value of each diagonal entry $a_{11}, a_{22}, \ldots, a_{nn}$ is greater than the sum of the absolute values of the remaining entries in that row.
Theorem 2.9

If a system of $n$ linear equations in $n$ variables has a strictly diagonally dominant coefficient matrix, then it has a unique solution and both the Jacobi and the Gauss-Seidel method converge to it.
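The hypothesis of Theorem 2.9 is simple to test mechanically. The following sketch checks the row condition directly; the function name `is_strictly_diagonally_dominant` is hypothetical, and the two sample matrices are assumed to be the coefficient matrices of $7x_1 - x_2 = 5$, $3x_1 - 5x_2 = -7$ and of $x_1 - x_2 = 1$, $2x_1 + x_2 = 5$.

```python
def is_strictly_diagonally_dominant(a):
    """True if |a[i][i]| exceeds the sum of |a[i][j]| over j != i, in every row."""
    return all(
        abs(row[i]) > sum(abs(v) for j, v in enumerate(row) if j != i)
        for i, row in enumerate(a)
    )


if __name__ == "__main__":
    print(is_strictly_diagonally_dominant([[7, -1], [3, -5]]))  # True
    print(is_strictly_diagonally_dominant([[1, -1], [2, 1]]))   # False
```

A result of `False` says nothing about convergence by itself, since the theorem is only a one-way implication.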
Remark Be warned! This theorem is a one-way implication. The fact that a system is not strictly diagonally dominant does not mean that the iterative methods diverge. They may or may not converge. (See Exercises 15-19.) Indeed, there are examples in which one of the methods converges and the other diverges. However, if either of these methods converges, then it must converge to the solution; it cannot converge to some other point.

Theorem 2.10

If the Jacobi or the Gauss-Seidel method converges for a system of $n$ linear equations in $n$ variables, then it must converge to the solution of the system.
Proof We will illustrate the idea behind the proof by sketching it out for the case of Jacobi's method, using the system of equations in Example 2.35. The general proof is similar.

Convergence means that from some iteration on, the values of the iterates remain the same. This means that $x_1$ and $x_2$ converge to $r$ and $s$, respectively, as shown in Table 2.12.

Table 2.12

n      ...   k     k+1   k+2   ...
x_1    ...   r     r     r     ...
x_2    ...   s     s     s     ...

We must prove that $\begin{bmatrix} x_1 \\ x_2 \end{bmatrix} = \begin{bmatrix} r \\ s \end{bmatrix}$ is the solution of the system of equations. In other words, at the $(k + 1)$st iteration, the values of $x_1$ and $x_2$ must stay the same as at the $k$th iteration. But the calculations give $x_1 = (5 + x_2)/7 = (5 + s)/7$ and $x_2 = (7 + 3x_1)/5 = (7 + 3r)/5$. Therefore,

$$\frac{5 + s}{7} = r \quad \text{and} \quad \frac{7 + 3r}{5} = s$$

Rearranging, we see that

$$\begin{aligned}
7r - s &= 5 \\
3r - 5s &= -7
\end{aligned}$$

Thus, $x_1 = r$, $x_2 = s$ satisfy the original equations, as required.
By now you may be wondering: If iterative methods don't always converge to the solution, what good are they? Why don't we just use Gaussian elimination? First, we have seen that Gaussian elimination is sensitive to roundoff errors, and this sensitivity can lead to inaccurate or even wildly wrong answers. Also, even if Gaussian elimination does not go astray, we cannot improve on a solution once we have found it. For example, if we use Gaussian elimination to calculate a solution to two decimal places, there is no way to obtain the solution to four decimal places except to start over again and work with increased accuracy. In contrast, we can achieve additional accuracy with iterative methods simply by doing more iterations. For large systems, particularly those with sparse coefficient matrices, iterative methods are much faster than direct methods when implemented on a computer. In many applications, the systems that arise are strictly diagonally dominant, and thus iterative methods are guaranteed to converge. The next example illustrates one such application.
Example 2.37

Suppose we heat each edge of a metal plate to a constant temperature, as shown in Figure 2.29.

[Figure 2.29: A heated metal plate]

Eventually the temperature at the interior points will reach equilibrium, where the following property can be shown to hold:

The temperature at each interior point P on a plate is the average of the temperatures on the circumference of any circle centered at P inside the plate (Figure 2.30).

[Figure 2.30]

To apply this property in an actual example requires techniques from calculus. As an alternative, we can approximate the situation by overlaying the plate with a grid, or mesh, that has a finite number of interior points, as shown in Figure 2.31.

[Figure 2.31: The discrete version of the heated plate problem]

The discrete analogue of the averaging property governing equilibrium temperatures is stated as follows:

The temperature at each interior point P is the average of the temperatures at the points adjacent to P.
For the example shown in Figure 2.31, there are three interior points, and each is adjacent to four other points. Let the equilibrium temperatures of the interior points be $t_1$, $t_2$, and $t_3$, as shown. Then, by the temperature-averaging property, we have

$$\begin{aligned}
t_1 &= \frac{100 + 100 + t_2 + 50}{4} \\
t_2 &= \frac{t_1 + t_3 + 0 + 50}{4} \\
t_3 &= \frac{100 + 100 + 0 + t_2}{4}
\end{aligned} \tag{3}$$

or

$$\begin{aligned}
4t_1 - t_2 &= 250 \\
-t_1 + 4t_2 - t_3 &= 50 \\
-t_2 + 4t_3 &= 200
\end{aligned}$$

Notice that this system is strictly diagonally dominant. Notice also that equations (3) are in the form required for Jacobi or Gauss-Seidel iteration. With an initial approximation of $t_1 = 0$, $t_2 = 0$, $t_3 = 0$, the Gauss-Seidel method gives the following iterates.

Iteration 1:

$$\begin{aligned}
t_1 &= \frac{100 + 100 + 0 + 50}{4} = 62.5 \\
t_2 &= \frac{62.5 + 0 + 0 + 50}{4} = 28.125 \\
t_3 &= \frac{100 + 100 + 0 + 28.125}{4} = 57.031
\end{aligned}$$

Iteration 2:

$$\begin{aligned}
t_1 &= \frac{100 + 100 + 28.125 + 50}{4} = 69.531 \\
t_2 &= \frac{69.531 + 57.031 + 0 + 50}{4} = 44.141 \\
t_3 &= \frac{100 + 100 + 0 + 44.141}{4} = 61.035
\end{aligned}$$

Continuing, we find the iterates listed in Table 2.13. We work with five-significant-digit accuracy and stop when two successive iterates agree within 0.001 in all variables. Thus, the equilibrium temperatures at the interior points are (to an accuracy of 0.001) $t_1 = 74.108$, $t_2 = 46.430$, and $t_3 = 61.607$. (Check these calculations.) By using a finer grid (with more interior points), we can get as precise information as we like about the equilibrium temperatures at various points on the plate.
Table 2.13

n      0     1        2        3        ...   7        8
t_1    0     62.500   69.531   73.535   ...   74.107   74.107
t_2    0     28.125   44.141   46.143   ...   46.429   46.429
t_3    0     57.031   61.035   61.536   ...   61.607   61.607
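The heated-plate iteration generalizes directly to any square system. Below is a minimal sketch of a general Gauss-Seidel solver, applied to the system $4t_1 - t_2 = 250$, $-t_1 + 4t_2 - t_3 = 50$, $-t_2 + 4t_3 = 200$ from Example 2.37. The function name `gauss_seidel`, its parameters, and the stopping rule (largest change between successive iterates below `tol`) are illustrative assumptions. Because it works in full floating-point precision rather than five significant digits, the number of iterations it reports need not match Table 2.13 exactly.

```python
def gauss_seidel(a, b, tol=0.001, max_iter=100):
    """Solve a*x = b by Gauss-Seidel iteration from the zero vector.

    Returns (x, iterations). Stops when successive iterates differ by less
    than tol in every variable; raises RuntimeError if that never happens.
    """
    n = len(b)
    x = [0.0] * n
    for k in range(1, max_iter + 1):
        previous = x[:]
        for i in range(n):
            # x[j] already holds the newest value for j < i (Gauss-Seidel rule).
            s = sum(a[i][j] * x[j] for j in range(n) if j != i)
            x[i] = (b[i] - s) / a[i][i]
        if max(abs(u - v) for u, v in zip(x, previous)) < tol:
            return x, k
    raise RuntimeError("no convergence within max_iter iterations")


if __name__ == "__main__":
    plate_matrix = [[4, -1, 0], [-1, 4, -1], [0, -1, 4]]
    plate_rhs = [250, 50, 200]
    temps, iterations = gauss_seidel(plate_matrix, plate_rhs)
    print([round(t, 3) for t in temps], iterations)
```

The first pass of the loop reproduces Iteration 1 of the example (62.5, 28.125, 57.031...), and the final values agree with the equilibrium temperatures to within the stated tolerance.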
In Exercises 1-6, apply Jacobi's method to the given system. Take the zero vector as the initial approximation and work with four-significant-digit accuracy until two successive iterates agree within 0.001 in each variable. In each case, compare your answer with the exact solution found using any direct method you like.

1. $$\begin{aligned} 7x_1 - x_2 &= 6 \\ x_1 - 5x_2 &= -4 \end{aligned}$$

2. $$\begin{aligned} 2x_1 + x_2 &= 5 \\ x_1 - x_2 &= 1 \end{aligned}$$

3. $$\begin{aligned} 4.5x_1 - 0.5x_2 &= 1 \\ x_1 - 3.5x_2 &= -1 \end{aligned}$$

4. $$\begin{aligned} 20x_1 + x_2 - x_3 &= 17 \\ x_1 - 10x_2 + x_3 &= 13 \\ -x_1 + x_2 + 10x_3 &= 18 \end{aligned}$$

5. $$\begin{aligned} 3x_1 + x_2 &= 1 \\ x_1 + 4x_2 + x_3 &= 1 \\ x_2 + 3x_3 &= 1 \end{aligned}$$

6. $$\begin{aligned} 3x_1 - x_2 &= 1 \\ -x_1 + 3x_2 - x_3 &= 0 \\ -x_2 + 3x_3 - x_4 &= 1 \\ -x_3 + 3x_4 &= 1 \end{aligned}$$

In Exercises 7-12, repeat the given exercise using the Gauss-Seidel method. Take the zero vector as the initial approximation and work with four-significant-digit accuracy until two successive iterates agree within 0.001 in each variable. Compare the number of iterations required by the Jacobi and Gauss-Seidel methods to reach such an approximate solution.

7. Exercise 1
8. Exercise 2
9. Exercise 3
10. Exercise 4
11. Exercise 5
12. Exercise 6

In Exercises 13 and 14, draw diagrams to illustrate the convergence of the Gauss-Seidel method with the given system.

13. The system in Exercise 1
14. The system in Exercise 2

In Exercises 15 and 16, compute the first four iterates, using the zero vector as the initial approximation, to show that the Gauss-Seidel method diverges. Then show that the equations can be rearranged to give a strictly diagonally dominant coefficient matrix, and apply the Gauss-Seidel method to obtain an approximate solution that is accurate to within 0.001.

15. $$\begin{aligned} x_1 - 2x_2 &= 3 \\ 3x_1 + 2x_2 &= 1 \end{aligned}$$

16. $$\begin{aligned} x_1 - 4x_2 + 2x_3 &= 2 \\ 2x_2 + 4x_3 &= 1 \\ 6x_1 - x_2 - 2x_3 &= 1 \end{aligned}$$

17. Draw a diagram to illustrate the divergence of the Gauss-Seidel method in Exercise 15.

In Exercises 18 and 19, the coefficient matrix is not strictly diagonally dominant, nor can the equations be rearranged to make it so. However, both the Jacobi and the Gauss-Seidel method converge anyway. Demonstrate that this is true of the Gauss-Seidel method, starting with the zero vector as the initial approximation and obtaining a solution that is accurate to within 0.01.

18. $$\begin{aligned} -4x_1 + 5x_2 &= 14 \\ x_1 - 3x_2 &= -7 \end{aligned}$$

19. $$\begin{aligned} 5x_1 - 2x_2 + 3x_3 &= -8 \\ x_1 + 4x_2 - 4x_3 &= 102 \\ -2x_1 - 2x_2 + 4x_3 &= -90 \end{aligned}$$

20. Continue performing iterations in Exercise 18 to obtain a solution that is accurate to within 0.001.

21. Continue performing iterations in Exercise 19 to obtain a solution that is accurate to within 0.001.

In Exercises 22-24, the metal plate has the constant temperatures shown on its boundaries. Find the equilibrium temperature at each of the indicated interior points by setting up a system of linear equations and applying either the Jacobi or the Gauss-Seidel method. Obtain a solution that is accurate to within 0.001.

22. [Figure: heated plate for Exercise 22]

23. [Figure: heated plate for Exercise 23]

24. [Figure: heated plate for Exercise 24]

In Exercises 25 and 26, we refine the grid used in Exercises 22 and 24 to obtain more accurate information about the equilibrium temperatures at interior points of the plates. Obtain solutions that are accurate to within 0.001, using either the Jacobi or the Gauss-Seidel method.

25. [Figure: refined grid for the plate of Exercise 22]

26. [Figure: refined grid for the plate of Exercise 24, with interior points $t_1, \ldots, t_{16}$]

Exercises 27 and 28 demonstrate that sometimes, if we are lucky, the form of an iterative problem may allow us to use a little insight to obtain an exact solution.

27. A narrow strip of paper 1 unit long is placed along a number line so that its ends are at 0 and 1. The paper is folded in half, right end over left, so that its ends are now at 0 and $\frac{1}{2}$. Next, it is folded in half again, this time left end over right, so that its ends are at $\frac{1}{4}$ and $\frac{1}{2}$. Figure 2.32 shows this process. We continue folding the paper in half, alternating right-over-left and left-over-right. If we could continue indefinitely, it is clear that the ends of the paper would converge to a point. It is this point that we want to find.
(a) Let $x_1$ correspond to the left-hand end of the paper and $x_2$ to the right-hand end. Make a table with the first six values of $[x_1, x_2]$ and plot the corresponding points on $x_1$-$x_2$ coordinate axes.
(b) Find two linear equations of the form $x_2 = ax_1 + b$ and $x_1 = cx_2 + d$ that determine the new values of the endpoints at each iteration. Draw the corresponding lines on your coordinate axes and show that this diagram would result from applying the Gauss-Seidel method to the system of linear equations you have found. (Your diagram should resemble Figure 2.27.)
(c) Switching to decimal representation, continue applying the Gauss-Seidel method to approximate the point to which the ends of the paper are converging to within 0.001 accuracy.
(d) Solve the system of equations exactly and compare your answers.

[Figure 2.32: Folding a strip of paper]

28. An ant is standing on a number line at point A. It walks halfway to point B and turns around. Then it walks halfway back to point A, turns around again, and walks halfway to point B. It continues to do this indefinitely. Let point A be at 0 and point B be at 1. The ant's walk is made up of a sequence of overlapping line segments. Let $x_1$ record the positions of the left-hand endpoints of these segments and $x_2$ their right-hand endpoints. (Thus, we begin with $x_1 = 0$ and $x_2 = \frac{1}{2}$. Then we have $x_1 = \frac{1}{4}$ and $x_2 = \frac{1}{2}$, and so on.) Figure 2.33 shows the start of the ant's walk.
(a) Make a table with the first six values of $[x_1, x_2]$ and plot the corresponding points on $x_1$-$x_2$ coordinate axes.
(b) Find two linear equations of the form $x_2 = ax_1 + b$ and $x_1 = cx_2 + d$ that determine the new values of the endpoints at each iteration. Draw the corresponding lines on your coordinate axes and show that this diagram would result from applying the Gauss-Seidel method to the system of linear equations you have found. (Your diagram should resemble Figure 2.27.)
(c) Switching to decimal representation, continue applying the Gauss-Seidel method to approximate the values to which $x_1$ and $x_2$ are converging to within 0.001 accuracy.
(d) Solve the system of equations exactly and compare your answers. Interpret your results.

[Figure 2.33: The ant's walk]
Key Definitions and Concepts

augmented matrix, back substitution, coefficient matrix, consistent system, convergence, divergence, elementary row operations, free variable, Gauss-Jordan elimination, Gauss-Seidel method, Gaussian elimination, homogeneous system, inconsistent system, iterate, Jacobi's method, leading variable (leading 1), linear equation, linearly dependent vectors, linearly independent vectors, pivot, rank of a matrix, Rank Theorem, reduced row echelon form, row echelon form, row equivalent matrices, span of a set of vectors, spanning set, system of linear equations
Review Questions

1. Mark each of the following statements true or false:
(a) Every system of linear equations has a solution.
(b) Every homogeneous system of linear equations has a solution.
(c) If a system of linear equations has more variables than equations, then it has infinitely many solutions.
(d) If a system of linear equations has more equations than variables, then it has no solution.
(e) Determining whether $\mathbf{b}$ is in $\operatorname{span}(\mathbf{a}_1, \ldots, \mathbf{a}_n)$ is equivalent to determining whether the system $[A \mid \mathbf{b}]$ is consistent, where $A = [\mathbf{a}_1 \cdots \mathbf{a}_n]$.
(f) In $\mathbb{R}^3$, $\operatorname{span}(\mathbf{u}, \mathbf{v})$ is always a plane through the origin.
(g) In $\mathbb{R}^3$, if nonzero vectors $\mathbf{u}$ and $\mathbf{v}$ are not parallel, then they are linearly independent.
(h) In $\mathbb{R}^3$, if a set of vectors can be drawn head to tail, one after the other so that a closed path (polygon) is formed, then the vectors are linearly dependent.
(i) If a set of vectors has the property that no two vectors in the set are scalar multiples of one another, then the set of vectors is linearly independent.
(j) If there are more vectors in a set of vectors than the number of entries in each vector, then the set of vectors is linearly dependent.
2. Find the rank o f the mat rix.
] 3
2  ]
o
3
2
]
3
4
3
4
2
 3
o
 5
 I
6
2 2
II. Find the gener;al equatio n of the plane span ned by
1
3
1 and
2
1
]
2 12. Determ ine whet her independent.
u
~
,v =
]
4. Solve the linear system
 ]
,v =
over 2 7,
6. Solve the linear sys tem
3x +2y = 1 x + 4y = 2
7. For what value(s) of k is the linear system with
2 I] inconsistent?
2k 1
9. Find the point of intersectio n of the fo llowing lines, if it exists.
Y
2 + , I 3 2
,
10. Determine whether 1
and
2 2
x
1
and
y
,
5 2  4
 I
+
~
 I
1
3
5 is in the span of
1
3
a1 a,l. What are the possible val
t
1
17. Show that if u and v are linearly independent vecto rs, thensoare u + vand u  v.
18. Show that span(u, v) = span(u, u u and v.
+ v) fo r any vectors
19. In order for a linear system with augmented mat rix [A I b l to be consisten t, what mus t be true about the ran ks of A and [ A I b j? 1
1 1
 ]
w
16. What is the maximum rank o f a 5 X 3 matrix? What is the minimum rank of a 5 X 3 matrix?
8. Find parametric equations fo r the line of intersection of the planesx+ 2y+ 3z = 4 and 5x + 6y+ 7z = 8.
1
,
15. Let a i' a 2, aJ be linearly dependen : vectors in R', not all zero, and let A = [a l ues o f the rank of A?
x
0 1
0
(a) The reduced row echelo n for m o f A is 13' (b ) The rank of A is 3. (e) The system [A I b] has a unique solution for any vector b in [RJ. (d) (a), (b ), and (c) are all true. (e) (a) and (b ) are both true, but nOI (el.
2x+ 3y = 4 x + 2y = 3
k
1
w ~
14. Let a i' a 2, a J be linearl y independent vectors in [RJ, and let A = ta l a~ a JI. Which o f the following s tatements are true?
5. Solve the linear system
augmented matrix [ I
 2
1
0
3w+ 8x 18y+ z = 35 w + 2x  4y = II w+ 3x 7y+ z = iO
9 are linearly
0
]
 I
,
= span{u, v, w) if:
,
0
1
Cbl u ~
2
1
0
x + y  2z = 4 x + 3y  z = 7 2x+ y  5z = 7
I
3
]
Cal
3. Solve the linear system
,
]
13. Determine whether R'
3
1
20. Arc the matrices
I
I
2 3  I and  1 4 1 row equivalent? Why or why not?
I
0
 1
I 0
I I
I 3
3 Matrices
We [Halmos and Kaplansky] share a philosophy about linear algebra: we think basis-free, we write basis-free, but when the chips are down we close the office door and compute with matrices like fury.
Irving Kaplansky, in Paul Halmos: Celebrating 50 Years of Mathematics, J. H. Ewing and F. W. Gehring, eds., Springer-Verlag, 1991, p. 88
3.0 Introduction: Matrices in Action

In this chapter, we will study matrices in their own right. We have already used matrices, in the form of augmented matrices, to record information about and to help streamline calculations involving systems of linear equations. Now you will see that matrices have algebraic properties of their own, which enable us to calculate with them, subject to the rules of matrix algebra. Furthermore, you will observe that matrices are not static objects, recording information and data; rather, they represent certain types of functions that "act" on vectors, transforming them into other vectors. These "matrix transformations" will begin to play a key role in our study of linear algebra and will shed new light on what you have already learned about vectors and systems of linear equations. Furthermore, matrices arise in many forms other than augmented matrices; we will explore some of the many applications of matrices at the end of this chapter.

In this section, we will consider a few simple examples to illustrate how matrices can transform vectors. In the process, you will get your first glimpse of "matrix arithmetic." Consider the equations
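\[ y_1 = x_1 + 2x_2, \qquad y_2 = 3x_2 \tag{1} \]

We can view these equations as describing a transformation of the vector $\mathbf{x} = \begin{bmatrix} x_1 \\ x_2 \end{bmatrix}$ into the vector $\mathbf{y} = \begin{bmatrix} y_1 \\ y_2 \end{bmatrix}$. If we denote the matrix of coefficients of the right-hand side by $F$, then $F = \begin{bmatrix} 1 & 2 \\ 0 & 3 \end{bmatrix}$, and we can rewrite the transformation as

\[ \begin{bmatrix} y_1 \\ y_2 \end{bmatrix} = \begin{bmatrix} 1 & 2 \\ 0 & 3 \end{bmatrix} \begin{bmatrix} x_1 \\ x_2 \end{bmatrix} \]

or, more succinctly, $\mathbf{y} = F\mathbf{x}$. (Think of this expression as analogous to the functional notation $y = f(x)$ you are used to: $\mathbf{x}$ is the independent "variable" here, $\mathbf{y}$ is the dependent "variable," and $F$ is the name of the "function.")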
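Thus, if $\mathbf{x} = \begin{bmatrix} -2 \\ 1 \end{bmatrix}$, then the equations (1) give

\[ y_1 = -2 + 2 \cdot 1 = 0, \qquad y_2 = 3 \cdot 1 = 3 \]

or $\mathbf{y} = \begin{bmatrix} 0 \\ 3 \end{bmatrix}$. We can write this expression as

\[ \begin{bmatrix} 0 \\ 3 \end{bmatrix} = \begin{bmatrix} 1 & 2 \\ 0 & 3 \end{bmatrix} \begin{bmatrix} -2 \\ 1 \end{bmatrix}. \]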
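The computation $\mathbf{y} = F\mathbf{x}$ above can be checked with a few lines of code. The following is only a minimal sketch: the helper name `mat_vec` and the list-of-rows representation are illustrative choices, not notation from the text.

```python
def mat_vec(M, x):
    """Multiply a matrix M (a list of rows) by a column vector x (a list)."""
    if any(len(row) != len(x) for row in M):
        raise ValueError("each row of M must have as many entries as x")
    # Entry i of the result is the dot product of row i of M with x.
    return [sum(m * xi for m, xi in zip(row, x)) for row in M]


# Coefficient matrix of equations (1): y1 = x1 + 2*x2, y2 = 3*x2.
F = [[1, 2],
     [0, 3]]

x = [-2, 1]
y = mat_vec(F, x)
print(y)  # [0, 3]
```

The same helper can be applied to any other vector x to see where F sends it.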
Problem 1 Compute $F\mathbf{x}$ for the following vectors $\mathbf{x}$:
Problem 2 The heads of the four vectors $\mathbf{x}$ in Problem 1 locate the four corners of a square in the $x_1x_2$ plane. Draw this square and label its corners A, B, C, and D, corresponding to parts (a), (b), (c), and (d) of Problem 1. On separate coordinate axes (labeled $y_1$ and $y_2$), draw the four points determined by $F\mathbf{x}$ in Problem 1. Label these points A', B', C', and D'. Let's make the (reasonable) assumption that the line segment AB is transformed into the line segment A'B', and likewise for the other three sides of the square ABCD. What geometric figure is represented by A'B'C'D'?
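Problem 3 The center of square ABCD is the origin $\mathbf{0} = \begin{bmatrix} 0 \\ 0 \end{bmatrix}$. What is the center of A'B'C'D'? What algebraic calculation confirms this?

Now consider the equations

\[ z_1 = y_1 - y_2, \qquad z_2 = -2y_1 \tag{2} \]

that transform a vector $\mathbf{y} = \begin{bmatrix} y_1 \\ y_2 \end{bmatrix}$ into the vector $\mathbf{z} = \begin{bmatrix} z_1 \\ z_2 \end{bmatrix}$. We can abbreviate this transformation as $\mathbf{z} = G\mathbf{y}$, where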
\[ G = \begin{bmatrix} 1 & -1 \\ -2 & 0 \end{bmatrix} \]

Problem 4 We are going to find out how $G$ transforms the figure A'B'C'D'. Compute $G\mathbf{y}$ for each of the four ve …

… If $m = n$ (that is, if $A$ has the same number of rows as columns), then $A$ is called a square matrix. A square matrix whose nondiagonal entries are all zero is called a diagonal matrix. A diagonal matrix all of whose diagonal entries are the same is called a scalar matrix. If the scalar on the diagonal is 1, the scalar matrix is called an identity matrix. For example, let

\[ A = \begin{bmatrix} 2 & 5 & 0 \\ -1 & 4 & 1 \end{bmatrix}, \quad B = \begin{bmatrix} 3 & 1 \\ 4 & 5 \end{bmatrix}, \quad C = \begin{bmatrix} 3 & 0 & 0 \\ 0 & 6 & 0 \\ 0 & 0 & 2 \end{bmatrix}, \quad D = \begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{bmatrix} \]
The diagonal entries of $A$ are 2 and 4, but $A$ is not square; $B$ is a square matrix of size 2×2 with diagonal entries 3 and 5; $C$ is a diagonal matrix; $D$ is a 3×3 identity matrix. The $n \times n$ identity matrix is denoted by $I_n$ (or simply $I$ if its size is understood).

Since we can view matrices as generalizations of vectors (and, indeed, matrices can and should be thought of as being made up of both row and column vectors), many of the conventions and operations for vectors carry through (in an obvious way) to matrices.

Two matrices are equal if they have the same size and if their corresponding entries are equal. Thus, if $A = [a_{ij}]_{m \times n}$ and $B = [b_{ij}]_{r \times s}$, then $A = B$ if and only if $m = r$ and $n = s$ and $a_{ij} = b_{ij}$ for all $i$ and $j$.
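Example 3.1

Consider the matrices

\[ A = \begin{bmatrix} a & b \\ c & d \end{bmatrix}, \quad B = \begin{bmatrix} 2 & 0 \\ 5 & 3 \end{bmatrix}, \quad C = \begin{bmatrix} 2 & 0 & x \\ 5 & 3 & y \end{bmatrix} \]

Neither $A$ nor $B$ can be equal to $C$ (no matter what the values of $x$ and $y$), since $A$ and $B$ are 2×2 matrices and $C$ is 2×3. However, $A = B$ if and only if $a = 2$, $b = 0$, $c = 5$, and $d = 3$.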
Example 3.2

Consider the matrices

\[ R = \begin{bmatrix} 1 & 4 & 3 \end{bmatrix} \quad \text{and} \quad C = \begin{bmatrix} 1 \\ 4 \\ 3 \end{bmatrix} \]

Despite the fact that $R$ and $C$ have the same entries in the same order, $R \neq C$ since $R$ is 1×3 and $C$ is 3×1. (If we read $R$ and $C$ aloud, they both sound the same: "one, four, three.") Thus, our distinction between row matrices/vectors and column matrices/vectors is an important one.
Matrix Addition and Scalar Multiplication

Generalizing from vector addition, we define matrix addition componentwise. If $A = [a_{ij}]$ and $B = [b_{ij}]$ are $m \times n$ matrices, their sum $A + B$ is the $m \times n$ matrix obtained by adding the corresponding entries. Thus,

\[ A + B = [a_{ij} + b_{ij}] \]

[We could equally well have defined $A + B$ in terms of vector addition by specifying that each column (or row) of $A + B$ is the sum of the corresponding columns (or rows) of $A$ and $B$.] If $A$ and $B$ are not the same size, then $A + B$ is not defined.
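Example 3.3

Let

\[ A = \begin{bmatrix} 1 & 4 & 0 \\ -2 & 6 & 5 \end{bmatrix}, \quad B = \begin{bmatrix} -3 & 1 & -1 \\ 3 & 0 & 2 \end{bmatrix}, \quad \text{and} \quad C = \begin{bmatrix} 4 & 3 \\ 2 & 1 \end{bmatrix} \]

Then

\[ A + B = \begin{bmatrix} -2 & 5 & -1 \\ 1 & 6 & 7 \end{bmatrix} \]

but neither $A + C$ nor $B + C$ is defined.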
The componentwise definition of scalar multiplication will come as no surprise. If $A$ is an $m \times n$ matrix and $c$ is a scalar, then the scalar multiple $cA$ is the $m \times n$ matrix obtained by multiplying each entry of $A$ by $c$. More formally, we have

\[ cA = c[a_{ij}] = [ca_{ij}] \]

[In terms of vectors, we could equivalently stipulate that each column (or row) of $cA$ is $c$ times the corresponding column (or row) of $A$.]
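Example 3.4

For matrix $A$ in Example 3.3,

\[ 2A = \begin{bmatrix} 2 & 8 & 0 \\ -4 & 12 & 10 \end{bmatrix}, \quad \tfrac{1}{2}A = \begin{bmatrix} \tfrac{1}{2} & 2 & 0 \\ -1 & 3 & \tfrac{5}{2} \end{bmatrix}, \quad \text{and} \quad (-1)A = \begin{bmatrix} -1 & -4 & 0 \\ 2 & -6 & -5 \end{bmatrix} \]

The matrix $(-1)A$ is written as $-A$ and called the negative of $A$. As with vectors, we can use this fact to define the difference of two matrices: If $A$ and $B$ are the same size, then

\[ A - B = A + (-B) \]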
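Example 3.5

For matrices $A$ and $B$ in Example 3.3,

\[ A - B = \begin{bmatrix} 1 & 4 & 0 \\ -2 & 6 & 5 \end{bmatrix} - \begin{bmatrix} -3 & 1 & -1 \\ 3 & 0 & 2 \end{bmatrix} = \begin{bmatrix} 4 & 3 & 1 \\ -5 & 6 & 3 \end{bmatrix} \]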
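The componentwise definitions of sum, scalar multiple, and difference translate almost word for word into code. The sketch below is illustrative only: the function names are invented here, and the entries of `A`, `B`, and `C` are the matrices of Example 3.3 as best they can be read from this copy, so treat them as an assumption.

```python
def shape(M):
    """Return (rows, columns) of a matrix stored as a list of rows."""
    return (len(M), len(M[0]))


def mat_add(A, B):
    """Componentwise sum; only defined when A and B have the same size."""
    if shape(A) != shape(B):
        raise ValueError("A + B is not defined for matrices of different sizes")
    return [[a + b for a, b in zip(ra, rb)] for ra, rb in zip(A, B)]


def scalar_mul(c, A):
    """Multiply every entry of A by the scalar c."""
    return [[c * a for a in row] for row in A]


def mat_sub(A, B):
    """Difference defined as A + (-1)B, mirroring the text."""
    return mat_add(A, scalar_mul(-1, B))


# Assumed entries (read from Example 3.3).
A = [[1, 4, 0], [-2, 6, 5]]
B = [[-3, 1, -1], [3, 0, 2]]
C = [[4, 3], [2, 1]]

print(mat_add(A, B))     # [[-2, 5, -1], [1, 6, 7]]
print(scalar_mul(2, A))  # [[2, 8, 0], [-4, 12, 10]]
print(mat_sub(A, B))     # [[4, 3, 1], [-5, 6, 3]]
```

Calling `mat_add(A, C)` raises `ValueError`, which is the code-level counterpart of "A + C is not defined."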
A matrix all of whose entries are zero is called a zero matrix and denoted by $O$ (or $O_{m \times n}$ if it is important to specify its size). It should be clear that if $A$ is any matrix and $O$ is the zero matrix of the same size, then

\[ A + O = A = O + A \]

and

\[ A - A = O = -A + A \]
Matrix Multiplication

Mathematicians are sometimes like Lewis Carroll's Humpty Dumpty: "When I use a word," Humpty Dumpty said, "it means just what I choose it to mean, neither more nor less" (from Through the Looking Glass).

The Introduction in Section 3.0 suggested that there is a "product" of matrices that is analogous to the composition of functions. We now make this notion more precise. The definition we are about to give generalizes what you should have discovered in Problems 5 and 7 in Section 3.0. Unlike the definitions of matrix addition and scalar multiplication, the definition of the product of two matrices is not a componentwise definition. Of course, there is nothing to stop us from defining a product of matrices in a componentwise fashion; unfortunately such a definition has few applications and is not as "natural" as the one we now give.
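Definition If $A$ is an $m \times n$ matrix and $B$ is an $n \times r$ matrix, then the product $C = AB$ is an $m \times r$ matrix. The $(i, j)$ entry of the product is computed as follows:

\[ c_{ij} = a_{i1}b_{1j} + a_{i2}b_{2j} + \cdots + a_{in}b_{nj} \]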
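The entry formula in the definition can be turned directly into a short routine. This is a sketch rather than a recommended implementation; the name `mat_mul` and the sample matrices `A` and `B` are hypothetical and chosen only to show a 2×3 matrix times a 3×2 matrix.

```python
def mat_mul(A, B):
    """Product of an m x n matrix A and an n x r matrix B (lists of rows)."""
    n = len(A[0])
    if len(B) != n:
        raise ValueError("columns of A must equal rows of B")
    r = len(B[0])
    # Entry (i, j) is the sum over k of A[i][k] * B[k][j].
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(r)]
            for i in range(len(A))]


# Hypothetical example matrices: A is 2 x 3, B is 3 x 2.
A = [[1, 3, -1],
     [-2, -1, 1]]
B = [[-4, 0],
     [5, -2],
     [-1, 2]]

print(mat_mul(A, B))  # [[12, -8], [2, 4]]
```

Here `mat_mul(B, A)` is also defined but is 3×3, while `mat_mul(A, A)` raises `ValueError` because the inner sizes (3 and 2) do not match.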
Remarks Notice that $A$ and $B$ need not be the same size. However, the number of columns of $A$ must be the same as the number of rows of $B$. If we write the sizes of $A$, $B$, and $AB$ in order, we can see at a gl …

\[ \ldots \ A'(A\mathbf{x}) = A'\mathbf{b} \Rightarrow (A'A)\mathbf{x} = A'\mathbf{b} \Rightarrow I\mathbf{x} = A'\mathbf{b} \Rightarrow \mathbf{x} = A'\mathbf{b} \]

(Why would each of these steps be justified?) Our goal in this section is to determine precisely when we can find such a matrix $A'$. In fact, we are going to insist on a bit more: We want not only $A'A = I$ but also $AA' = I$. This requirement forces $A$ and $A'$ to be square matrices. (Why?)
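Definition If $A$ is an $n \times n$ matrix, an inverse of $A$ is an $n \times n$ matrix $A'$ with the property that

\[ AA' = I \quad \text{and} \quad A'A = I \]

where $I = I_n$ is the $n \times n$ identity matrix. If such an $A'$ exists, then $A$ is called invertible.

Example 3.22

If $A = \begin{bmatrix} 2 & 5 \\ 1 & 3 \end{bmatrix}$, then $A' = \begin{bmatrix} 3 & -5 \\ -1 & 2 \end{bmatrix}$ is an inverse of $A$, since

\[ AA' = \begin{bmatrix} 2 & 5 \\ 1 & 3 \end{bmatrix} \begin{bmatrix} 3 & -5 \\ -1 & 2 \end{bmatrix} = \begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix} \quad \text{and} \quad A'A = \begin{bmatrix} 3 & -5 \\ -1 & 2 \end{bmatrix} \begin{bmatrix} 2 & 5 \\ 1 & 3 \end{bmatrix} = \begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix} \]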
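The two conditions in the definition of an inverse are easy to test mechanically. The following sketch assumes the entries of Example 3.22 are $A = \begin{bmatrix} 2 & 5 \\ 1 & 3 \end{bmatrix}$ and $A' = \begin{bmatrix} 3 & -5 \\ -1 & 2 \end{bmatrix}$ (the signs are an assumption where the printed copy is unclear); the helper names are illustrative.

```python
def mat_mul(A, B):
    """Product of two matrices stored as lists of rows."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)]
            for row in A]


def identity(n):
    """The n x n identity matrix."""
    return [[1 if i == j else 0 for j in range(n)] for i in range(n)]


def is_inverse(A, A_prime):
    """Check both defining conditions: A A' = I and A' A = I."""
    I = identity(len(A))
    return mat_mul(A, A_prime) == I and mat_mul(A_prime, A) == I


A = [[2, 5], [1, 3]]
A_prime = [[3, -5], [-1, 2]]   # assumed entries of the claimed inverse

print(is_inverse(A, A_prime))  # True
```

Checking both products follows the definition literally; nothing in this sketch relies on the later result that one of the two checks is enough.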
Example 3.23

Show that the following matrices are not invertible:

\[ \text{(a)} \ O = \begin{bmatrix} 0 & 0 \\ 0 & 0 \end{bmatrix} \qquad \text{(b)} \ B = \begin{bmatrix} 1 & 2 \\ 2 & 4 \end{bmatrix} \]

Solution (a) It is easy to see that the zero matrix $O$ does not have an inverse. If it did, then there would be a matrix $O'$ such that $OO' = I = O'O$. But the product of the zero matrix with any other matrix is the zero matrix, and so $OO'$ could never equal the identity matrix $I$. (Notice that this proof makes no reference to the size of the matrices and so is true for $n \times n$ matrices in general.)

(b) Suppose $B$ has an inverse $B' = \begin{bmatrix} w & x \\ y & z \end{bmatrix}$. The equation $BB' = I$ gives

\[ \begin{bmatrix} 1 & 2 \\ 2 & 4 \end{bmatrix} \begin{bmatrix} w & x \\ y & z \end{bmatrix} = \begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix} \]

from which we get the equations

\[ \begin{aligned} w + 2y &= 1 \\ x + 2z &= 0 \\ 2w + 4y &= 0 \\ 2x + 4z &= 1 \end{aligned} \]

Subtracting twice the first equation from the third yields $0 = -2$, which is clearly absurd. Thus, there is no solution. (Row reduction gives the same result but is not really needed here.) We deduce that no such matrix $B'$ exists; that is, $B$ is not invertible. (In fact, it does not even have an inverse th …

\[ \ldots \ BXA = B^{-3}AB^{-3}A \Rightarrow B^{-1}BXAA^{-1} = B^{-1}B^{-3}AB^{-3}AA^{-1} \Rightarrow IXI = B^{-4}AB^{-3}I \Rightarrow X = B^{-4}AB^{-3} \]

(Can you justify each step?) Note the careful use of Theorem 3.9(c) and the expansion of $(A^{-1}B^3)^2$. We have also made liberal use of the associativity of matrix multiplication to simplify the placement (or elimination) of parentheses.
Elementary Matrices

We are going to use matrix multiplication to take a different perspective on the row reduction of matrices. In the process, you will discover many new and important insights into the nature of invertible matrices.
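If

\[ E = \begin{bmatrix} 1 & 0 & 0 \\ 0 & 0 & 1 \\ 0 & 1 & 0 \end{bmatrix} \quad \text{and} \quad A = \begin{bmatrix} 5 & 7 \\ -1 & 0 \\ 8 & 3 \end{bmatrix} \]

we find that

\[ EA = \begin{bmatrix} 5 & 7 \\ 8 & 3 \\ -1 & 0 \end{bmatrix} \]

In other words, multiplying $A$ by $E$ (on the left) has the same effect as interchanging rows 2 and 3 of $A$. What is significant about $E$? It is simply the matrix we obtain by applying the same elementary row operation, $R_2 \leftrightarrow R_3$, to the identity matrix $I_3$. It turns out that this always works.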
Definition
An elementary matrix is any matrix that can be obtained by performing an elementary row operation on an identity matrix.
Since there are three types of elementary row operations, there are three corresponding types of elementary matrices. Here are some more elementary matrices.
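Example 3.27

Let

\[ E_1 = \begin{bmatrix} 1 & 0 & 0 & 0 \\ 0 & 3 & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \end{bmatrix}, \quad E_2 = \begin{bmatrix} 0 & 0 & 1 & 0 \\ 0 & 1 & 0 & 0 \\ 1 & 0 & 0 & 0 \\ 0 & 0 & 0 & 1 \end{bmatrix}, \quad \text{and} \quad E_3 = \begin{bmatrix} 1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & -2 & 0 & 1 \end{bmatrix} \]

Each of these matrices has been obtained from the identity matrix $I_4$ by applying a single elementary row operation. The matrix $E_1$ corresponds to $3R_2$, $E_2$ to $R_1 \leftrightarrow R_3$, and $E_3$ to $R_4 - 2R_2$. Observe that when we left-multiply a $4 \times n$ matrix by one of these elementary matrices, the corresponding elementary row operation is performed on the matrix. For example, if

\[ A = \begin{bmatrix} a_{11} & a_{12} & a_{13} \\ a_{21} & a_{22} & a_{23} \\ a_{31} & a_{32} & a_{33} \\ a_{41} & a_{42} & a_{43} \end{bmatrix} \]

then

\[ E_1A = \begin{bmatrix} a_{11} & a_{12} & a_{13} \\ 3a_{21} & 3a_{22} & 3a_{23} \\ a_{31} & a_{32} & a_{33} \\ a_{41} & a_{42} & a_{43} \end{bmatrix}, \quad E_2A = \begin{bmatrix} a_{31} & a_{32} & a_{33} \\ a_{21} & a_{22} & a_{23} \\ a_{11} & a_{12} & a_{13} \\ a_{41} & a_{42} & a_{43} \end{bmatrix}, \]

and

\[ E_3A = \begin{bmatrix} a_{11} & a_{12} & a_{13} \\ a_{21} & a_{22} & a_{23} \\ a_{31} & a_{32} & a_{33} \\ a_{41} - 2a_{21} & a_{42} - 2a_{22} & a_{43} - 2a_{23} \end{bmatrix} \]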
Example 3.27 and Exercises 24-30 should convince you that any elementary row operation on any matrix can be accomplished by left-multiplying by a suitable elementary matrix. We record this fact as a theorem, the proof of which is omitted.

Theorem 3.10

Let $E$ be the elementary matrix obtained by performing an elementary row operation on $I_n$. If the same elementary row operation is performed on an $n \times r$ matrix $A$, the result is the same as the matrix $EA$.
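Theorem 3.10 can be spot-checked numerically: build $E$ by applying a row operation to the identity, then compare $EA$ with the result of applying the same operation to $A$ directly. The sketch below uses illustrative helper names and 0-based row indices (so `swap_rows(M, 1, 2)` is $R_2 \leftrightarrow R_3$); the matrix `A` is taken as the 3×2 matrix used above to introduce elementary matrices, with entries assumed where the copy is unclear.

```python
def identity(n):
    return [[1 if i == j else 0 for j in range(n)] for i in range(n)]


def mat_mul(A, B):
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)]
            for row in A]


# The three elementary row operations; each returns a new matrix.
def swap_rows(M, i, j):
    M = [row[:] for row in M]
    M[i], M[j] = M[j], M[i]
    return M


def scale_row(M, i, c):
    M = [row[:] for row in M]
    M[i] = [c * v for v in M[i]]
    return M


def add_multiple(M, i, j, c):
    """Replace row i by (row i) + c * (row j)."""
    M = [row[:] for row in M]
    M[i] = [a + c * b for a, b in zip(M[i], M[j])]
    return M


A = [[5, 7], [-1, 0], [8, 3]]        # assumed entries

E = swap_rows(identity(3), 1, 2)     # elementary matrix for R2 <-> R3
print(mat_mul(E, A))                 # [[5, 7], [8, 3], [-1, 0]]
print(mat_mul(E, A) == swap_rows(A, 1, 2))  # True
```

As the text notes, in practice one performs the row operation directly; forming $E$ is useful for reasoning, not for computing.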
From a computational point of view, it is not a good idea to use elementary matrices to perform elementary row operations; just do them directly. However, elementary matrices can provide some valuable insights into invertible matrices and the solution of systems of linear equations. We have already observed that every elementary row operation can be "undone," or "reversed." This same observation applied to elementary matrices shows us that they are invertible.
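Example 3.28

Let

\[ E_1 = \begin{bmatrix} 1 & 0 & 0 \\ 0 & 0 & 1 \\ 0 & 1 & 0 \end{bmatrix}, \quad E_2 = \begin{bmatrix} 1 & 0 & 0 \\ 0 & 4 & 0 \\ 0 & 0 & 1 \end{bmatrix}, \quad \text{and} \quad E_3 = \begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ -2 & 0 & 1 \end{bmatrix} \]

Then $E_1$ corresponds to $R_2 \leftrightarrow R_3$, which is undone by doing $R_2 \leftrightarrow R_3$ again. Thus, $E_1^{-1} = E_1$. (Check by showing that $E_1^2 = E_1E_1 = I$.) The matrix $E_2$ comes from $4R_2$, which is undone by performing $\tfrac{1}{4}R_2$. Thus,

\[ E_2^{-1} = \begin{bmatrix} 1 & 0 & 0 \\ 0 & \tfrac{1}{4} & 0 \\ 0 & 0 & 1 \end{bmatrix} \]

which can be easily checked. Finally, $E_3$ corresponds to the elementary row operation $R_3 - 2R_1$, which can be undone by the elementary row operation $R_3 + 2R_1$. So, in this case,

\[ E_3^{-1} = \begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 2 & 0 & 1 \end{bmatrix} \]

(Again, it is easy to check this by confirming that the product of this matrix and $E_3$, in both orders, is $I$.)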
Notice that not only is each elementary matrix invertible, but its inverse is another elementary matrix of the same type. We record this finding as the next theorem.

Theo". 3.11
Each elementary matrix is invertible, and its inverse is an elementary matrix of the same type.
The Fundamental Theorem of Invertible Matrices

We are now in a position to prove one of the main results in this book: a set of equivalent characterizations of what it means for a matrix to be invertible. In a sense, much of linear algebra is connected to this theorem, either in the development of these characterizations or in their application. As you might expect, given this introduction, we will use this theorem a great deal. Make it your friend!

We refer to Theorem 3.12 as the first version of the Fundamental Theorem, since we will add to it in subsequent chapters. You are reminded that, when we say that a set of statements about a matrix $A$ are equivalent, we mean that, for a given $A$, the statements are either all true or all false.

Theorem 3.12 The Fundamental Theorem of Invertible Matrices: Version 1

Let $A$ be an $n \times n$ matrix. The following statements are equivalent:

a. $A$ is invertible.
b. $A\mathbf{x} = \mathbf{b}$ has a unique solution for every $\mathbf{b}$ in $\mathbb{R}^n$.
c. $A\mathbf{x} = \mathbf{0}$ has only the trivial solution.
d. The reduced row echelon form of $A$ is $I_n$.
e. $A$ is a product of elementary matrices.
Proof We will establish the theorem by proving the circular chain of implications

\[ \text{(a)} \Rightarrow \text{(b)} \Rightarrow \text{(c)} \Rightarrow \text{(d)} \Rightarrow \text{(e)} \Rightarrow \text{(a)} \]

(a) ⇒ (b) We have already shown that if $A$ is invertible, then $A\mathbf{x} = \mathbf{b}$ has the unique solution $\mathbf{x} = A^{-1}\mathbf{b}$ for any $\mathbf{b}$ in $\mathbb{R}^n$ (Theorem 3.7).

(b) ⇒ (c) Assume that $A\mathbf{x} = \mathbf{b}$ has a unique solution for any $\mathbf{b}$ in $\mathbb{R}^n$. This implies, in particular, that $A\mathbf{x} = \mathbf{0}$ has a unique solution. But a homogeneous system $A\mathbf{x} = \mathbf{0}$ always has $\mathbf{x} = \mathbf{0}$ as one solution. So in this case, $\mathbf{x} = \mathbf{0}$ must be the solution.

(c) ⇒ (d) Suppose that $A\mathbf{x} = \mathbf{0}$ has only the trivial solution. The corresponding system of equations is

\[ \begin{aligned} a_{11}x_1 + a_{12}x_2 + \cdots + a_{1n}x_n &= 0 \\ a_{21}x_1 + a_{22}x_2 + \cdots + a_{2n}x_n &= 0 \\ &\ \vdots \\ a_{n1}x_1 + a_{n2}x_2 + \cdots + a_{nn}x_n &= 0 \end{aligned} \]

and we are assuming that its solution is

\[ x_1 = 0, \quad x_2 = 0, \quad \ldots, \quad x_n = 0 \]

In other words, Gauss-Jordan elimination applied to the augmented matrix of the system gives

\[ [A \mid \mathbf{0}] = \left[\begin{array}{cccc|c} a_{11} & a_{12} & \cdots & a_{1n} & 0 \\ a_{21} & a_{22} & \cdots & a_{2n} & 0 \\ \vdots & \vdots & \ddots & \vdots & \vdots \\ a_{n1} & a_{n2} & \cdots & a_{nn} & 0 \end{array}\right] \longrightarrow \left[\begin{array}{cccc|c} 1 & 0 & \cdots & 0 & 0 \\ 0 & 1 & \cdots & 0 & 0 \\ \vdots & \vdots & \ddots & \vdots & \vdots \\ 0 & 0 & \cdots & 1 & 0 \end{array}\right] = [I_n \mid \mathbf{0}] \]

Thus, the reduced row echelon form of $A$ is $I_n$.

(d) ⇒ (e) If we assume that the reduced row echelon form of $A$ is $I_n$, then $A$ can be reduced to $I_n$ using a finite sequence of elementary row operations. By Theorem 3.10, each one of these elementary row operations can be achieved by left-multiplying by an appropriate elementary matrix. If the appropriate sequence of elementary matrices is $E_1, E_2, \ldots, E_k$ (in that order), then we have

\[ E_k \cdots E_2E_1A = I_n \]

According to Theorem 3.11, these elementary matrices are all invertible. Therefore, so is their product, and we have

\[ A = (E_k \cdots E_2E_1)^{-1}I_n = (E_k \cdots E_2E_1)^{-1} = E_1^{-1}E_2^{-1} \cdots E_k^{-1} \]

Again, each $E_i^{-1}$ is another elementary matrix, by Theorem 3.11, so we have written $A$ as a product of elementary matrices, as required.

(e) ⇒ (a) If $A$ is a product of elementary matrices, then $A$ is invertible, since elementary matrices are invertible and products of invertible matrices are invertible.
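Example 3.29

If possible, express $A = \begin{bmatrix} 2 & 3 \\ 1 & 3 \end{bmatrix}$ as a product of elementary matrices.

Solution We row reduce $A$ as follows:

\[ A = \begin{bmatrix} 2 & 3 \\ 1 & 3 \end{bmatrix} \xrightarrow{R_1 \leftrightarrow R_2} \begin{bmatrix} 1 & 3 \\ 2 & 3 \end{bmatrix} \xrightarrow{R_2 - 2R_1} \begin{bmatrix} 1 & 3 \\ 0 & -3 \end{bmatrix} \xrightarrow{R_1 + R_2} \begin{bmatrix} 1 & 0 \\ 0 & -3 \end{bmatrix} \xrightarrow{-\frac{1}{3}R_2} \begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix} = I_2 \]

Thus, the reduced row echelon form of $A$ is the identity matrix, so the Fundamental Theorem assures us that $A$ is invertible and can be written as a product of elementary matrices. We have $E_4E_3E_2E_1A = I$, where

\[ E_1 = \begin{bmatrix} 0 & 1 \\ 1 & 0 \end{bmatrix}, \quad E_2 = \begin{bmatrix} 1 & 0 \\ -2 & 1 \end{bmatrix}, \quad E_3 = \begin{bmatrix} 1 & 1 \\ 0 & 1 \end{bmatrix}, \quad E_4 = \begin{bmatrix} 1 & 0 \\ 0 & -\tfrac{1}{3} \end{bmatrix} \]

are the elementary matrices corresponding to the four elementary row operations used to reduce $A$ to $I$. As in the proof of the theorem, we have

\[ A = (E_4E_3E_2E_1)^{-1} = E_1^{-1}E_2^{-1}E_3^{-1}E_4^{-1} = \begin{bmatrix} 0 & 1 \\ 1 & 0 \end{bmatrix} \begin{bmatrix} 1 & 0 \\ 2 & 1 \end{bmatrix} \begin{bmatrix} 1 & -1 \\ 0 & 1 \end{bmatrix} \begin{bmatrix} 1 & 0 \\ 0 & -3 \end{bmatrix} \]

as required.

Remark Because the sequence of elementary row operations that transforms $A$ into $I$ is not unique, neither is the representation of $A$ as a product of elementary matrices. (Find a different way to express $A$ as a product of elementary matrices.)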
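A factorization into elementary matrices is easy to confirm by multiplying the factors back together. The sketch below assumes the matrix of Example 3.29 is $A = \begin{bmatrix} 2 & 3 \\ 1 & 3 \end{bmatrix}$ and uses the row operations $R_1 \leftrightarrow R_2$, $R_2 - 2R_1$, $R_1 + R_2$, $-\tfrac{1}{3}R_2$; exact fractions avoid rounding, and the variable names are illustrative.

```python
from fractions import Fraction
from functools import reduce


def mat_mul(A, B):
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)]
            for row in A]


A = [[2, 3], [1, 3]]   # assumed entries of the matrix being factored

# Elementary matrices for the four row operations, in the order applied.
E1 = [[0, 1], [1, 0]]                 # R1 <-> R2
E2 = [[1, 0], [-2, 1]]                # R2 - 2*R1
E3 = [[1, 1], [0, 1]]                 # R1 + R2
E4 = [[1, 0], [0, Fraction(-1, 3)]]   # (-1/3)*R2

# Their inverses undo each operation and are again elementary.
E1_inv = [[0, 1], [1, 0]]
E2_inv = [[1, 0], [2, 1]]
E3_inv = [[1, -1], [0, 1]]
E4_inv = [[1, 0], [0, -3]]

reduced = reduce(mat_mul, [E4, E3, E2, E1, A])             # E4 E3 E2 E1 A
rebuilt = reduce(mat_mul, [E1_inv, E2_inv, E3_inv, E4_inv])

print(reduced == [[1, 0], [0, 1]])  # True
print(rebuilt == A)                 # True
```

Note the reversed order in `rebuilt`: the inverse of a product is the product of the inverses in the opposite order.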
"Never bring a cannon on stage in Act I unless you intend to fire it by the last aC1."  Anton Chekhov
The Fundamental Theorem is surprisingly powerful. To illustrate its power, we consider two of its consequences. The first is that, although the definition of an invertible matrix states that a matrix $A$ is invertible if there is a matrix $B$ such that both $AB = I$ and $BA = I$ are satisfied, we need only check one of these equations. Thus, we can cut our work in half!

Theorem 3.13

Let $A$ be a square matrix. If $B$ is a square matrix such that either $AB = I$ or $BA = I$, then $A$ is invertible and $B = A^{-1}$.

Proof Suppose $BA = I$. Consider the equation $A\mathbf{x} = \mathbf{0}$. Left-multiplying by $B$, we have $BA\mathbf{x} = B\mathbf{0}$. This implies that $\mathbf{x} = I\mathbf{x} = \mathbf{0}$. Thus, the system represented by $A\mathbf{x} = \mathbf{0}$ has the unique solution $\mathbf{x} = \mathbf{0}$. From the equivalence of (c) and (a) in the Fundamental Theorem, we know that $A$ is invertible. (That is, $A^{-1}$ exists and satisfies $AA^{-1} = I = A^{-1}A$.) If we now right-multiply both sides of $BA = I$ by $A^{-1}$, we obtain

\[ BAA^{-1} = IA^{-1} \Rightarrow BI = A^{-1} \Rightarrow B = A^{-1} \]

(The proof in the case of $AB = I$ is left as Exercise 41.)

The next consequence of the Fundamental Theorem is the basis for an efficient method of computing the inverse of a matrix.
Theorem 3.14

Let $A$ be a square matrix. If a sequence of elementary row operations reduces $A$ to $I$, then the same sequence of elementary row operations transforms $I$ into $A^{-1}$.

Proof If $A$ is row equivalent to $I$, then we can achieve the reduction by left-multiplying by a sequence $E_1, E_2, \ldots, E_k$ of elementary matrices. Therefore, we have $E_k \cdots E_2E_1A = I$. Setting $B = E_k \cdots E_2E_1$ gives $BA = I$. By Theorem 3.13, $A$ is invertible and $A^{-1} = B$. Now applying the same sequence of elementary row operations to $I$ is equivalent to left-multiplying $I$ by $E_k \cdots E_2E_1 = B$. The result is

\[ E_k \cdots E_2E_1I = BI = B = A^{-1} \]

Thus, $I$ is transformed into $A^{-1}$ by the same sequence of elementary row operations.
The Gauss-Jordan Method for Computing the Inverse

We can perform row operations on $A$ and $I$ simultaneously by constructing a "superaugmented matrix" $[A \mid I]$. Theorem 3.14 shows that if $A$ is row equivalent to $I$ [which, by the Fundamental Theorem (d) ⇔ (a), means that $A$ is invertible], then elementary row operations will yield

\[ [A \mid I] \longrightarrow [I \mid A^{-1}] \]

If $A$ cannot be reduced to $I$, then the Fundamental Theorem guarantees us that $A$ is not invertible.

The procedure just described is simply Gauss-Jordan elimination performed on an $n \times 2n$, instead of an $n \times (n + 1)$, augmented matrix. Another way to view this procedure is to look at the problem of finding $A^{-1}$ as solving the matrix equation $AX = I_n$ for an $n \times n$ matrix $X$. (This is sufficient, by the Fundamental Theorem, since a right inverse of $A$ must be a two-sided inverse.) If we denote the columns of $X$ by $\mathbf{x}_1, \ldots, \mathbf{x}_n$, then this matrix equation is equivalent to solving for the columns of $X$, one at a time. Since the columns of $I_n$ are the standard unit vectors $\mathbf{e}_1, \ldots, \mathbf{e}_n$, we thus have $n$ systems of linear equations, all with coefficient matrix $A$:

\[ A\mathbf{x}_1 = \mathbf{e}_1, \ \ldots, \ A\mathbf{x}_n = \mathbf{e}_n \]

Since the same sequence of row operations is needed to bring $A$ to reduced row echelon form in each case, the augmented matrices for these systems, $[A \mid \mathbf{e}_1], \ldots, [A \mid \mathbf{e}_n]$, can be combined as

\[ [A \mid \mathbf{e}_1 \ \mathbf{e}_2 \ \cdots \ \mathbf{e}_n] = [A \mid I_n] \]

We now apply row operations to try to reduce $A$ to $I_n$, which, if successful, will simultaneously solve for the columns of $A^{-1}$, transforming $I_n$ into $A^{-1}$. We illustrate this use of Gauss-Jordan elimination with three examples.
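Example 3.30

Find the inverse of

\[ A = \begin{bmatrix} 1 & 2 & -1 \\ 2 & 2 & 4 \\ 1 & 3 & -3 \end{bmatrix} \]

if it exists.

Solution Gauss-Jordan elimination produces

\[ [A \mid I] = \left[\begin{array}{ccc|ccc} 1 & 2 & -1 & 1 & 0 & 0 \\ 2 & 2 & 4 & 0 & 1 & 0 \\ 1 & 3 & -3 & 0 & 0 & 1 \end{array}\right] \]

\[ \xrightarrow[R_3 - R_1]{R_2 - 2R_1} \left[\begin{array}{ccc|ccc} 1 & 2 & -1 & 1 & 0 & 0 \\ 0 & -2 & 6 & -2 & 1 & 0 \\ 0 & 1 & -2 & -1 & 0 & 1 \end{array}\right] \xrightarrow{-\frac{1}{2}R_2} \left[\begin{array}{ccc|ccc} 1 & 2 & -1 & 1 & 0 & 0 \\ 0 & 1 & -3 & 1 & -\tfrac{1}{2} & 0 \\ 0 & 1 & -2 & -1 & 0 & 1 \end{array}\right] \]

\[ \xrightarrow{R_3 - R_2} \left[\begin{array}{ccc|ccc} 1 & 2 & -1 & 1 & 0 & 0 \\ 0 & 1 & -3 & 1 & -\tfrac{1}{2} & 0 \\ 0 & 0 & 1 & -2 & \tfrac{1}{2} & 1 \end{array}\right] \xrightarrow[R_2 + 3R_3]{R_1 + R_3} \left[\begin{array}{ccc|ccc} 1 & 2 & 0 & -1 & \tfrac{1}{2} & 1 \\ 0 & 1 & 0 & -5 & 1 & 3 \\ 0 & 0 & 1 & -2 & \tfrac{1}{2} & 1 \end{array}\right] \]

\[ \xrightarrow{R_1 - 2R_2} \left[\begin{array}{ccc|ccc} 1 & 0 & 0 & 9 & -\tfrac{3}{2} & -5 \\ 0 & 1 & 0 & -5 & 1 & 3 \\ 0 & 0 & 1 & -2 & \tfrac{1}{2} & 1 \end{array}\right] \]

Therefore,

\[ A^{-1} = \begin{bmatrix} 9 & -\tfrac{3}{2} & -5 \\ -5 & 1 & 3 \\ -2 & \tfrac{1}{2} & 1 \end{bmatrix} \]

(You should always check that $AA^{-1} = I$ by direct multiplication. By Theorem 3.13, we do not need to check that $A^{-1}A = I$ too.)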
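The $[A \mid I] \rightarrow [I \mid A^{-1}]$ procedure is mechanical enough to sketch in code. The version below is a hedged illustration, not the book's algorithm verbatim: it clears each pivot column completely as it goes (the variant the text mentions as easier by hand), uses exact `Fraction` arithmetic, and returns `None` when no pivot can be found. The function name and the assumed entries of the test matrix are illustrative.

```python
from fractions import Fraction


def gauss_jordan_inverse(A):
    """Row reduce [A | I]; return A's inverse, or None if A is not invertible."""
    n = len(A)
    if any(len(row) != n for row in A):
        raise ValueError("only square matrices can be invertible")
    # Build the superaugmented matrix [A | I] with exact rational entries.
    M = [[Fraction(v) for v in row] +
         [Fraction(1 if i == j else 0) for j in range(n)]
         for i, row in enumerate(A)]
    for col in range(n):
        # Find a row at or below the diagonal with a nonzero entry.
        pivot = next((r for r in range(col, n) if M[r][col] != 0), None)
        if pivot is None:
            return None          # A cannot be reduced to I
        M[col], M[pivot] = M[pivot], M[col]      # row interchange
        p = M[col][col]
        M[col] = [v / p for v in M[col]]         # create the leading 1
        for r in range(n):                       # zeros above and below it
            if r != col and M[r][col] != 0:
                f = M[r][col]
                M[r] = [a - f * b for a, b in zip(M[r], M[col])]
    return [row[n:] for row in M]   # right half is now the inverse


A = [[1, 2, -1], [2, 2, 4], [1, 3, -3]]   # assumed entries of Example 3.30
for row in gauss_jordan_inverse(A):
    print([str(v) for v in row])
# ['9', '-3/2', '-5']
# ['-5', '1', '3']
# ['-2', '1/2', '1']
```

Returning `None` for a singular input is a design choice made for this sketch; raising an exception would be an equally reasonable convention.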
Remark Notice that we have used the variant of Gauss-Jordan elimination that first introduces all of the zeros below the leading 1s, from left to right and top to bottom, and then creates zeros above the leading 1s, from right to left and bottom to top. This approach saves on calculations, as we noted in Chapter 2, but you may find it easier, when working by hand, to create all of the zeros in each column as you go. The answer, of course, will be the same.
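Example 3.31

Find the inverse of

\[ A = \begin{bmatrix} 2 & 1 & -4 \\ -4 & -1 & 6 \\ -2 & 2 & -2 \end{bmatrix} \]

if it exists.

Solution We proceed as in Example 3.30, adjoining the identity matrix to $A$ and then trying to manipulate $[A \mid I]$ into $[I \mid A^{-1}]$.

\[ [A \mid I] = \left[\begin{array}{ccc|ccc} 2 & 1 & -4 & 1 & 0 & 0 \\ -4 & -1 & 6 & 0 & 1 & 0 \\ -2 & 2 & -2 & 0 & 0 & 1 \end{array}\right] \xrightarrow[R_3 + R_1]{R_2 + 2R_1} \left[\begin{array}{ccc|ccc} 2 & 1 & -4 & 1 & 0 & 0 \\ 0 & 1 & -2 & 2 & 1 & 0 \\ 0 & 3 & -6 & 1 & 0 & 1 \end{array}\right] \]

\[ \xrightarrow{R_3 - 3R_2} \left[\begin{array}{ccc|ccc} 2 & 1 & -4 & 1 & 0 & 0 \\ 0 & 1 & -2 & 2 & 1 & 0 \\ 0 & 0 & 0 & -5 & -3 & 1 \end{array}\right] \]

At this point, we see that it is not possible to reduce $A$ to $I$, since there is a row of zeros on the left-hand side of the augmented matrix. Conseq …

… in other words, working from top to bottom in each column. (Why?) In this example, we have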
[: 1~
1  I
fl. •.111,
1 I
0 1 3 3
R.  ~R ,
0
1
0
0
0
9
4
5
•
1  1
(c) ⇒ (h) If $\operatorname{rank}(A) = n$, then the reduced row echelon form of $A$ has $n$ leading 1s and so is $I_n$. From (d) ⇒ (c) we know that $A\mathbf{x} = \mathbf{0}$ has only the trivial solution, which implies that the column vectors of $A$ are linearly independent, since $A\mathbf{x}$ is just a linear combination of the column vectors of $A$.

(h) ⇒ (i) If the column vectors of $A$ are linearly independent, then $A\mathbf{x} = \mathbf{0}$ has only the trivial solution. Thus, by (c) ⇒ (b), $A\mathbf{x} = \mathbf{b}$ has a unique solution for every $\mathbf{b}$ in $\mathbb{R}^n$. This means that every vector $\mathbf{b}$ in $\mathbb{R}^n$ can be written as a linear combination of the column vectors of $A$, establishing (i).

(i) ⇒ (j) If the column vectors of $A$ span $\mathbb{R}^n$, then $\operatorname{col}(A) = \mathbb{R}^n$ by definition, so $\operatorname{rank}(A) = \dim(\operatorname{col}(A)) = n$. This is (f), and we have already established that (f) ⇒ (h). We conclude that the column vectors of $A$ are linearly independent and so form a basis for $\mathbb{R}^n$, since, by assumption, they also span $\mathbb{R}^n$.

(j) ⇒ (f) If the column vectors of $A$ form a basis for $\mathbb{R}^n$, then, in particular, they are linearly independent. It follows that the reduced row echelon form of $A$ contains $n$ leading 1s, and thus $\operatorname{rank}(A) = n$.

The above discussion shows that (f) ⇒ (d) ⇒ (c) ⇒ (h) ⇒ (i) ⇒ (j) ⇒ (f) ⇔ (g). Now recall that, by Theorem 3.25, $\operatorname{rank}(A^T) = \operatorname{rank}(A)$, so what we have just proved gives us the corresponding results about the column vectors of $A^T$. These are then results about the row vectors of $A$, bringing (k), (l), …
In many applications that can be modeled by a graph, the vertices are ordered by some type of relation that imposes a direction on the edges. For example, directed edges might be used to represent one-way routes in a graph that models a transportation network or predator-prey relationships in a graph modeling an ecosystem. A graph with directed edges is called a digraph. Figure 3.26 shows an example.

Figure 3.26 A digraph

An easy modification to the definition of adjacency matrices allows us to use them with digraphs.

Definition If $G$ is a digraph with $n$ vertices, then its adjacency matrix is the $n \times n$ matrix $A$ [or $A(G)$] defined by

\[ a_{ij} = \begin{cases} 1 & \text{if there is an edge from vertex } i \text{ to vertex } j \\ 0 & \text{otherwise} \end{cases} \]

Thus, the adjacency matrix for the digraph in Figure 3.26 is
A=
o o I I
I
0
I
0 0 I 000 0 I 0
Not surprisingly, the adjacency matrix of a digraph is not symmetric in general. (When would it be?) You should have no difficulty seeing that $A^k$ now contains the numbers of directed $k$-paths between vertices, where we insist that all edges along a path flow in the same direction. (See Exercise 54.) The next example gives an application of this idea.
Example 3.68

Five tennis players (Davenport, Graf, Hingis, Seles, and Williams) compete in a round-robin tournament in which each player plays every other player once. The digraph in Figure 3.27 summarizes the results. A directed edge from vertex $i$ to vertex $j$ means that player $i$ defeated player $j$. (A digraph in which there is exactly one directed edge between every pair of vertices is called a tournament.)

Figure 3.27 A tournament (vertices D, G, H, S, W)

The adjacency matrix for the digraph in Figure 3.27 is

\[ A = \begin{bmatrix} 0 & 1 & 0 & 1 & 1 \\ 0 & 0 & 1 & 1 & 1 \\ 1 & 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 0 & 1 \\ 0 & 0 & 1 & 0 & 0 \end{bmatrix} \]

where the order of the vertices (and hence the rows and columns of $A$) is determined alphabetically. Thus, Graf corresponds to row 2 and column 2, for example.

Suppose we wish to rank the five players, based on the results of their matches. One way to do this might be to count the number of wins for each player. Observe that the number of wins each player had is just the sum of the entries in the corresponding row; equivalently, the vector containing all the row sums is given by the product $A\mathbf{j}$, where

\[ \mathbf{j} = \begin{bmatrix} 1 \\ 1 \\ \vdots \\ 1 \end{bmatrix} \]

In our case, we have

\[ A\mathbf{j} = \begin{bmatrix} 0 & 1 & 0 & 1 & 1 \\ 0 & 0 & 1 & 1 & 1 \\ 1 & 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 0 & 1 \\ 0 & 0 & 1 & 0 & 0 \end{bmatrix} \begin{bmatrix} 1 \\ 1 \\ 1 \\ 1 \\ 1 \end{bmatrix} = \begin{bmatrix} 3 \\ 3 \\ 2 \\ 1 \\ 1 \end{bmatrix} \]

which produces the following ranking:

First: Davenport, Graf (tie)
Second: Hingis
Third: Seles, Williams (tie)

Are the players who tied in this ranking equally strong? Davenport might argue that since she defeated Graf, she deserves first place. Seles would use the same type of argument to break the tie with Williams. However, Williams could argue that she has two "indirect" victories because she beat Hingis, who defeated two others; furthermore, she might note that Seles has only one indirect victory (over Williams, who then defeated Hingis). Since one player might not have defeated all the others with whom she ultimately ties, the notion of indirect wins seems more useful. Moreover, an indirect victory corresponds to a 2-path in the digraph, so we can use the square of the adjacency matrix. To compute both wins and indirect wins for each player, we need the row sums of the matrix $A + A^2$, which are given by

\[ (A + A^2)\mathbf{j} = \left( \begin{bmatrix} 0 & 1 & 0 & 1 & 1 \\ 0 & 0 & 1 & 1 & 1 \\ 1 & 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 0 & 1 \\ 0 & 0 & 1 & 0 & 0 \end{bmatrix} + \begin{bmatrix} 0 & 0 & 2 & 1 & 2 \\ 1 & 0 & 1 & 1 & 1 \\ 0 & 1 & 0 & 1 & 2 \\ 0 & 0 & 1 & 0 & 0 \\ 1 & 0 & 0 & 1 & 0 \end{bmatrix} \right) \begin{bmatrix} 1 \\ 1 \\ 1 \\ 1 \\ 1 \end{bmatrix} = \begin{bmatrix} 0 & 1 & 2 & 2 & 3 \\ 1 & 0 & 2 & 2 & 2 \\ 1 & 1 & 0 & 2 & 2 \\ 0 & 0 & 1 & 0 & 1 \\ 1 & 0 & 1 & 1 & 0 \end{bmatrix} \begin{bmatrix} 1 \\ 1 \\ 1 \\ 1 \\ 1 \end{bmatrix} = \begin{bmatrix} 8 \\ 7 \\ 6 \\ 2 \\ 3 \end{bmatrix} \]

Thus, we would rank the players as follows: Davenport, Graf, Hingis, Williams, Seles. Unfortunately, this approach is not guaranteed to break all ties.
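The ranking computation in Example 3.68 amounts to two row-sum calculations, which the following sketch reproduces. The variable names are illustrative; the adjacency matrix is the one given in the example, with rows and columns in alphabetical order of the players.

```python
players = ["Davenport", "Graf", "Hingis", "Seles", "Williams"]

# A[i][j] == 1 means player i defeated player j.
A = [[0, 1, 0, 1, 1],
     [0, 0, 1, 1, 1],
     [1, 0, 0, 1, 0],
     [0, 0, 0, 0, 1],
     [0, 0, 1, 0, 0]]


def mat_mul(X, Y):
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*Y)]
            for row in X]


# Direct wins: row sums of A, i.e. the product A j with j all ones.
wins = [sum(row) for row in A]

# Entries of A^2 count 2-paths, i.e. "indirect" victories.
A2 = mat_mul(A, A)

# Wins plus indirect wins: row sums of A + A^2.
scores = [sum(a + b for a, b in zip(ra, rb)) for ra, rb in zip(A, A2)]

ranking = [p for _, p in sorted(zip(scores, players), reverse=True)]

print(wins)     # [3, 3, 2, 1, 1]
print(scores)   # [8, 7, 6, 2, 3]
print(ranking)  # ['Davenport', 'Graf', 'Hingis', 'Williams', 'Seles']
```

Because all five scores happen to be distinct here, the sort is unambiguous; as the text warns, that is not guaranteed in general.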
Error-Correcting Codes

Section 1.4 discussed examples of error-detecting codes. We turn now to the problem of designing codes that can correct as well as detect certain types of errors. Our message will be a vector $\mathbf{x}$ in $\mathbb{Z}_2^k$ for some $k$, and we will encode it by using a matrix transformation $T : \mathbb{Z}_2^k \to \mathbb{Z}_2^n$ for some $n > k$. The vector $T(\mathbf{x})$ will be called a code vector. A simple example will serve to illustrate the approach we will take, which is a generalization of the parity-check vectors in Example 1.31.

Example 3.69

Suppose the message is a single binary digit: 0 or 1. If we encode the message by simply repeating it twice, then the code vectors are [0, 0] and [1, 1]. This code can detect single errors. For example, if we transmit [0, 0] and an error occurs in the first component, then [1, 0] is received and an error is detected, because this is not a legal code vector. However, the receiver cannot correct the error, since [1, 0] would also be the result of an error in the second component if [1, 1] had been transmitted.

We can solve this problem by making the code vectors longer, repeating the message digit three times instead of two. Thus, 0 and 1 are encoded as [0, 0, 0] and [1, 1, 1], respectively. Now if a single error occurs, we can not only detect it but also correct it. For example, if [0, 1, 0] is received, then we know it must have been the result of a single error in the transmission of [0, 0, 0], since a single error in [1, 1, 1] could not have produced it.
Note that the code in Ex …

… The matrix $P$ is called a parity check matrix for the code. Observe that $PG = \begin{bmatrix} 0 \\ 0 \end{bmatrix} = O$.

To see how these matrices come into play in the correction of errors, suppose we send 1 as $\begin{bmatrix} 1 & 1 & 1 \end{bmatrix}^T$ but a single error causes it to be received as $\mathbf{c}' = \begin{bmatrix} 1 & 0 & 1 \end{bmatrix}^T$. We compute

\[ P\mathbf{c}' = \begin{bmatrix} 1 & 1 & 0 \\ 1 & 0 & 1 \end{bmatrix} \begin{bmatrix} 1 \\ 0 \\ 1 \end{bmatrix} = \begin{bmatrix} 1 \\ 0 \end{bmatrix} \neq \mathbf{0} \]

so we know that $\mathbf{c}'$ cannot be a code vector. Where is the error? Notice that $P\mathbf{c}'$ is the second column of the parity check matrix $P$; this tells us that the error is in the second component of $\mathbf{c}'$ (which we will prove in Theorem 3.34 below) and allows us to correct the error. (Of course, in this example we could find the error faster without using matrices, but the idea is a useful one.)

To generalize the ideas in the last example, we make the following definitions.
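The "match $P\mathbf{c}'$ against a column of $P$" idea can be sketched as a small decoder over $\mathbb{Z}_2$. This is an illustration under assumptions: the parity check matrix is taken to be $P = \begin{bmatrix} 1 & 1 & 0 \\ 1 & 0 & 1 \end{bmatrix}$ for the triple-repetition code, the function names are invented here, and the routine only claims to handle at most one flipped bit.

```python
# Assumed parity check matrix for the code {[0,0,0], [1,1,1]}.
P = [[1, 1, 0],
     [1, 0, 1]]


def syndrome(P, c):
    """Compute P c with all arithmetic done modulo 2."""
    return [sum(p * ci for p, ci in zip(row, c)) % 2 for row in P]


def correct_single_error(P, c):
    """Return c with at most one bit corrected, using the columns of P."""
    s = syndrome(P, c)
    if not any(s):
        return list(c)               # P c = 0: c is already a code vector
    columns = [list(col) for col in zip(*P)]
    if s not in columns:
        raise ValueError("syndrome matches no column of P")
    i = columns.index(s)             # position of the erroneous component
    fixed = list(c)
    fixed[i] ^= 1                    # flip that bit
    return fixed


received = [1, 0, 1]
print(syndrome(P, received))              # [1, 0]
print(correct_single_error(P, received))  # [1, 1, 1]
```

The lookup `columns.index(s)` is only unambiguous when the columns of $P$ are nonzero and distinct, which is exactly the condition that appears in Theorem 3.34.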
Definitions If $k < n$, then any $n \times k$ matrix of the form $G = \begin{bmatrix} I_k \\ A \end{bmatrix}$, where $A$ is an $(n - k) \times k$ matrix over $\mathbb{Z}_2$, is called a standard generator matrix for an $(n, k)$ binary code $T : \mathbb{Z}_2^k \to \mathbb{Z}_2^n$. Any $(n - k) \times n$ matrix of the form $P = [B \ \ I_{n-k}]$, where $B$ is an $(n - k) \times k$ matrix over $\mathbb{Z}_2$, is called a standard parity check matrix. The code is said to have length $n$ and dimension $k$.

Here is what we need to know: (a) When is $G$ the standard generator matrix for an error-correcting binary code? (b) Given $G$, how do we find an associated standard parity check matrix $P$? It turns out that the answers are quite easy, as shown by the following theorem.

Theorem 3.34

If $G = \begin{bmatrix} I_k \\ A \end{bmatrix}$ is a standard generator matrix and $P = [B \ \ I_{n-k}]$ is a standard parity check matrix, then $P$ is the parity check matrix associated with $G$ if and only if $A = B$. The corresponding $(n, k)$ binary code is (single) error-correcting if and only if the columns of $P$ are nonzero and distinct.
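Both parts of Theorem 3.34 can be explored with a short script: build $G$ and $P$ from the same block $A$, confirm $PG = O$ over $\mathbb{Z}_2$, and test whether the columns of $P$ are nonzero and distinct. Everything named below is hypothetical, including the two sample blocks; the first happens to give a (7, 4) code, the second a (4, 2) code whose parity check matrix has a repeated column.

```python
def standard_generator(A):
    """Stack I_k on top of the (n-k) x k block A."""
    k = len(A[0])
    I = [[1 if i == j else 0 for j in range(k)] for i in range(k)]
    return I + [list(row) for row in A]


def standard_parity_check(A):
    """Place the block A to the left of I_(n-k)."""
    m = len(A)
    return [list(row) + [1 if i == j else 0 for j in range(m)]
            for i, row in enumerate(A)]


def mat_mul_mod2(X, Y):
    return [[sum(a * b for a, b in zip(row, col)) % 2 for col in zip(*Y)]
            for row in X]


def is_single_error_correcting(P):
    """Columns of P must all be nonzero and pairwise distinct."""
    cols = [tuple(col) for col in zip(*P)]
    return all(any(col) for col in cols) and len(set(cols)) == len(cols)


# Hypothetical block A giving a (7, 4) binary code.
A_good = [[1, 1, 0, 1],
          [1, 0, 1, 1],
          [0, 1, 1, 1]]
G, P = standard_generator(A_good), standard_parity_check(A_good)
print(mat_mul_mod2(P, G))             # a 3 x 4 zero matrix
print(is_single_error_correcting(P))  # True

# Hypothetical block whose P has two equal columns.
A_bad = [[1, 1],
         [1, 1]]
print(is_single_error_correcting(standard_parity_check(A_bad)))  # False
```

That $PG = O$ follows from $PG = A I_k + I_{n-k} A = A + A$, which vanishes modulo 2.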
Before we prove the theorem, let's consider another, less trivial example that illustr …

… are linearly dependent is false. It follows that $\mathbf{v}_1, \mathbf{v}_2, \ldots, \mathbf{v}_m$ must be linearly independent.
In Exercises 1-12, compute (a) the characteristic polynomial of A, (b) the eigenvalues of A, (c) a basis for each eigenspace of A, and (d) the algebraic and geometric multiplicity of each eigenvalue.
[The matrices for Exercises 1-12 are not recoverable from the source.]
Chapter 4 Eigenvalues and Eigenvectors
13. Prove Theorem 4.18(b).
14. Prove Theorem 4.18(c). [Hint: Combine the proofs of parts (a) and (b) and see the fourth Remark following Theorem 3.9 (p. 167).]
In Exercises 15 and 16, A is a 2×2 matrix with eigenvectors $\mathbf{v}_1$ = […] and $\mathbf{v}_2$ = […] corresponding to eigenvalues $\lambda_1$ = […] and $\lambda_2 = 2$, respectively, and $\mathbf{x}$ = […].
15. Find $A^{10}\mathbf{x}$.
(b) Using Theorem 4.18 and Exercise 22, find the eigenvalues and eigenspaces of $A^{-1}$, $A - 2I$, and $A + 2I$.
24. Let A and B be $n \times n$ matrices with eigenvalues $\lambda$ and $\mu$, respectively. (a) Give an example to show that $\lambda + \mu$ need not be an eigenvalue of $A + B$. (b) Give an example to show that $\lambda\mu$ need not be an eigenvalue of $AB$. (c) Suppose $\lambda$ and $\mu$ correspond to the same eigenvector $\mathbf{x}$. Show that, in this case, $\lambda + \mu$ is an eigenvalue of $A + B$ and $\lambda\mu$ is an eigenvalue of $AB$.
25. If A and B are two row equivalent matrices, do they necessarily have the same eigenvalues? Either prove that they do or give a counterexample.
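16. Find $A^k\mathbf{x}$. What happens as k becomes large (i.e., $k \to \infty$)?
In Exercises 17 and 18, A is a 3×3 matrix with eigenvectors $\mathbf{v}_1 = \begin{bmatrix} 1 \\ 0 \\ 0 \end{bmatrix}$, $\mathbf{v}_2 = \begin{bmatrix} 1 \\ 1 \\ 0 \end{bmatrix}$, and $\mathbf{v}_3 = \begin{bmatrix} 1 \\ 1 \\ 1 \end{bmatrix}$ corresponding to eigenvalues $\lambda_1 = -\tfrac{1}{3}$, $\lambda_2 = \tfrac{1}{3}$, and $\lambda_3 = 1$, respectively, and $\mathbf{x} = \begin{bmatrix} 2 \\ 1 \\ 2 \end{bmatrix}$.

Let $p(x)$ be the polynomial

$$p(x) = x^n + a_{n-1}x^{n-1} + \cdots + a_1 x + a_0$$

The companion matrix of $p(x)$ is the $n \times n$ matrix

$$C(p) = \begin{bmatrix} -a_{n-1} & -a_{n-2} & \cdots & -a_1 & -a_0 \\ 1 & 0 & \cdots & 0 & 0 \\ 0 & 1 & \cdots & 0 & 0 \\ \vdots & \vdots & \ddots & \vdots & \vdots \\ 0 & 0 & \cdots & 1 & 0 \end{bmatrix} \qquad (4)$$

26. Find the companion matrix of $p(x) = x^2 - 7x + 12$ and then find the characteristic polynomial of $C(p)$.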
17. Find $A^{10}\mathbf{x}$.
18. Find $A^k\mathbf{x}$. What happens as k becomes large (i.e., $k \to \infty$)?
19. (a) Show that, for any square matrix A, $A^T$ and A have the same characteristic polynomial and hence the same eigenvalues. (b) Give an example of a 2×2 matrix A for which $A^T$ and A have different eigenspaces.
20. Let A be a nilpotent matrix (that is, $A^m = O$ for some $m > 1$). Show that $\lambda = 0$ is the only eigenvalue of A.
21. Let A be an idempotent matrix (that is, $A^2 = A$). Show that $\lambda = 0$ and $\lambda = 1$ are the only possible eigenvalues of A.
22. If $\mathbf{v}$ is an eigenvector of A with corresponding eigenvalue $\lambda$ and c is a scalar, show that $\mathbf{v}$ is an eigenvector of $A - cI$ with corresponding eigenvalue $\lambda - c$.
23. (a) Find the eigenvalues and eigenspaces of A =
[~ ~]
27. Find the companion matrix of $p(x) = x^3 + 3x^2 - 4x + 12$ and then find the characteristic polynomial of $C(p)$.
28. (a) Show that the companion matrix $C(p)$ of $p(x) = x^2 + ax + b$ has characteristic polynomial $\lambda^2 + a\lambda + b$. (b) Show that if $\lambda$ is an eigenvalue of the companion matrix $C(p)$ in part (a), then $\begin{bmatrix} \lambda \\ 1 \end{bmatrix}$ is an eigenvector of $C(p)$ corresponding to $\lambda$.
29. (a) Show that the companion matrix $C(p)$ of $p(x) = x^3 + ax^2 + bx + c$ has characteristic polynomial $-(\lambda^3 + a\lambda^2 + b\lambda + c)$. (b) Show that if $\lambda$ is an eigenvalue of the companion matrix $C(p)$ in part (a), then $\begin{bmatrix} \lambda^2 \\ \lambda \\ 1 \end{bmatrix}$ is an eigenvector of $C(p)$ corresponding to $\lambda$.
30. Construct a nontriangular 2×2 matrix with eigenvalues 2 and 5. (Hint: Use Exercise 28.)
31. Construct a nontriangular 3×3 matrix with eigenvalues −2, 1, and 3. (Hint: Use Exercise 29.)
32. (a) Prove that, for $n \ge 2$, the companion matrix $C(p)$ of $p(x) = x^n + a_{n-1}x^{n-1} + \cdots + a_1 x + a_0$ has characteristic polynomial $(-1)^n p(\lambda)$. [Hint: Expand by cofactors along the last column. You may find it helpful to introduce the polynomial $q(x) = (p(x) - a_0)/x$.] (b) Show that if $\lambda$ is an eigenvalue of the companion matrix $C(p)$ in equation (4), then an eigenvector corresponding to $\lambda$ is given by

$$\begin{bmatrix} \lambda^{n-1} \\ \lambda^{n-2} \\ \vdots \\ \lambda \\ 1 \end{bmatrix}$$

If $p(x) = x^n + a_{n-1}x^{n-1} + \cdots + a_1 x + a_0$ and A is a square matrix, we can define a square matrix $p(A)$ by

$$p(A) = A^n + a_{n-1}A^{n-1} + \cdots + a_1 A + a_0 I$$

An important theorem in advanced linear algebra says that if $c_A(\lambda)$ is the characteristic polynomial of the matrix A, then $c_A(A) = O$ (in words, every matrix satisfies its characteristic equation). This is the celebrated Cayley-Hamilton Theorem, named after Arthur Cayley (1821-1895) and Sir William Rowan Hamilton (see page 2). Cayley proved this theorem in 1858. Hamilton discovered it, independently, in his work on quaternions, a generalization of the complex numbers.

33. Verify the Cayley-Hamilton Theorem for A = […]. That is, find the characteristic polynomial $c_A(\lambda)$ of A and show that $c_A(A) = O$.
34. Verify the Cayley-Hamilton Theorem for A = […].

The Cayley-Hamilton Theorem can be used to calculate powers and inverses of matrices. For example, if A is a 2×2 matrix with characteristic polynomial $c_A(\lambda) = \lambda^2 + a\lambda + b$, then $A^2 + aA + bI = O$, so

$$A^2 = -aA - bI$$

and

$$A^3 = AA^2 = A(-aA - bI) = -aA^2 - bA = -a(-aA - bI) - bA = (a^2 - b)A + abI$$
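The 2×2 case of the Cayley-Hamilton Theorem is easy to check numerically. The sketch below is only an illustration under our own choices: the test matrix and the function names are hypothetical, and we use the fact that a 2×2 matrix has characteristic polynomial $\lambda^2 - (\operatorname{tr} A)\lambda + \det A$, so that $a = -\operatorname{tr} A$ and $b = \det A$.

```python
def matmul(X, Y):
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*Y)]
            for row in X]


def char_poly_coeffs(A):
    """Return (a, b) with c_A(lambda) = lambda^2 + a*lambda + b for a 2x2 matrix."""
    (p, q), (r, s) = A
    return -(p + s), p * s - q * r


def cayley_hamilton_residual(A):
    """Compute A^2 + aA + bI, which the theorem says is the zero matrix."""
    a, b = char_poly_coeffs(A)
    A2 = matmul(A, A)
    I = [[1, 0], [0, 1]]
    return [[A2[i][j] + a * A[i][j] + b * I[i][j] for j in range(2)]
            for i in range(2)]


def cube_via_cayley_hamilton(A):
    """A^3 = (a^2 - b)A + abI, obtained without multiplying matrices."""
    a, b = char_poly_coeffs(A)
    I = [[1, 0], [0, 1]]
    return [[(a * a - b) * A[i][j] + a * b * I[i][j] for j in range(2)]
            for i in range(2)]


A = [[1, 2], [3, 4]]                      # illustrative matrix, not from the text
residual = cayley_hamilton_residual(A)
A3_direct = matmul(matmul(A, A), A)
A3_shortcut = cube_via_cayley_hamilton(A)
```

The shortcut replaces two matrix multiplications by a scalar multiple of A plus a multiple of I, which is the point of using the theorem to compute powers.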
a. $\lambda_1 > 0$
b. $\lambda_1$ has a corresponding positive eigenvector.
c. If $\lambda$ is any other eigenvalue of A, then $|\lambda| \le \lambda_1$.
Intuitively, we can see why the first two statements should be true. Consider the case of a 2×2 positive matrix A. The corresponding matrix transformation maps the first quadrant of the plane properly into itself, since all components are positive. If we repeatedly allow A to act on the images we get, they necessarily converge toward some ray in the first quadrant (Figure 4.17). A direction vector for this ray will be a positive vector $\mathbf{x}$, which must be mapped into some positive multiple of itself (say, $\lambda_1$), since A leaves the ray fixed. In other words, $A\mathbf{x} = \lambda_1\mathbf{x}$, with $\mathbf{x}$ and $\lambda_1$ both positive.
Proof
For some nonzero vectors $\mathbf{x}$, $A\mathbf{x} \ge \lambda\mathbf{x}$ for some scalar $\lambda$. When this happens, then $A(k\mathbf{x}) \ge \lambda(k\mathbf{x})$ for all $k > 0$; thus, we need only consider unit vectors $\mathbf{x}$. In Chapter 7, we will see that A maps the set of all unit vectors in $\mathbb{R}^n$ (the unit sphere) into a "generalized ellipsoid." So, as $\mathbf{x}$ ranges over the nonnegative vectors on this unit sphere, there will be a maximum value of $\lambda$ such that $A\mathbf{x} \ge \lambda\mathbf{x}$. (See Figure 4.18.) Denote this number by $\lambda_1$ and the corresponding unit vector by $\mathbf{x}_1$.

Figure 4.17

Figure 4.18
Sed.ion 4.6
Applications and t he PerronFrobcnius Theorem
331
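We now show that $A\mathbf{x}_1 = \lambda_1\mathbf{x}_1$. If not, then $A\mathbf{x}_1 > \lambda_1\mathbf{x}_1$, and, applying A again, we obtain $A(A\mathbf{x}_1) > A(\lambda_1\mathbf{x}_1) = \lambda_1(A\mathbf{x}_1)$, where the inequality is preserved, since A is positive. (See Exercise 40.) But then $\mathbf{y} = (1/\|A\mathbf{x}_1\|)A\mathbf{x}_1$ is a unit vector that satisfies $A\mathbf{y} > \lambda_1\mathbf{y}$, so there will be some $\lambda_2 > \lambda_1$ such that $A\mathbf{y} \ge \lambda_2\mathbf{y}$. This contradicts the fact that $\lambda_1$ was the maximum value with this property. Consequently, it must be the case that $A\mathbf{x}_1 = \lambda_1\mathbf{x}_1$; that is, $\lambda_1$ is an eigenvalue of A. Now A is positive and $\mathbf{x}_1$ is positive, so $\lambda_1\mathbf{x}_1 = A\mathbf{x}_1 > \mathbf{0}$. This means that $\lambda_1 > 0$ and $\mathbf{x}_1 > \mathbf{0}$, which completes the proof of (a) and (b). To prove (c), suppose $\lambda$ is any other (real or complex) eigenvalue of A with corresponding eigenvector $\mathbf{z}$. Then $A\mathbf{z} = \lambda\mathbf{z}$, and, taking absolute values, we have

$$A|\mathbf{z}| = |A||\mathbf{z}| \ge |A\mathbf{z}| = |\lambda\mathbf{z}| = |\lambda||\mathbf{z}| \qquad (4)$$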
where the middle inequality follows from the Triangle Inequality. (See Exercise 40.) Since $|\mathbf{z}| > \mathbf{0}$, the unit vector $\mathbf{u}$ in the direction of $|\mathbf{z}|$ is also positive and satisfies $A\mathbf{u} \ge |\lambda|\mathbf{u}$. By the maximality of $\lambda_1$ from the first part of this proof, we must have $|\lambda| \le \lambda_1$.

In fact, more is true. It turns out that $\lambda_1$ is dominant, so $|\lambda| < \lambda_1$ for any eigenvalue $\lambda \ne \lambda_1$. It is also the case that $\lambda_1$ has algebraic, and hence geometric, multiplicity 1. We will not prove these facts.

Perron's Theorem can be generalized from positive to certain nonnegative matrices. Frobenius did so in 1912. The result requires a technical condition on the matrix. A square matrix A is called reducible if, subject to some permutation of the rows and the same permutation of the columns, A can be written in block form as

$$\begin{bmatrix} B & C \\ O & D \end{bmatrix}$$

where B and D are square. Equivalently, A is reducible if there is some permutation matrix P such that

$$PAP^T = \begin{bmatrix} B & C \\ O & D \end{bmatrix}$$
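(See page 185.) For example, the matrix

$$A = \begin{bmatrix} 2 & 0 & 0 & 1 & 3 \\ 4 & 2 & 1 & 5 & 5 \\ 1 & 2 & 7 & 3 & 0 \\ 6 & 0 & 0 & 2 & 1 \\ 1 & 0 & 0 & 7 & 2 \end{bmatrix}$$

is reducible, since interchanging rows 1 and 3 and then columns 1 and 3 produces

$$\left[\begin{array}{cc|ccc} 7 & 2 & 1 & 3 & 0 \\ 1 & 2 & 4 & 5 & 5 \\ \hline 0 & 0 & 2 & 1 & 3 \\ 0 & 0 & 6 & 2 & 1 \\ 0 & 0 & 1 & 7 & 2 \end{array}\right]$$

(This is just $PAP^T$, where

$$P = \begin{bmatrix} 0 & 0 & 1 & 0 & 0 \\ 0 & 1 & 0 & 0 & 0 \\ 1 & 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 0 & 1 \end{bmatrix}$$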
Check this!) A square matrix A that is not reducible is called irreducible. If $A^k > O$ for some k, then A is called primitive. For example, every regular Markov chain has a primitive transition matrix, by definition. It is not hard to show that every primitive matrix is irreducible. (Do you see why? Try showing the contrapositive of this.)
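The "Check this!" computation, and the definition of a primitive matrix, can be explored with a short sketch. Two assumptions should be flagged: the entries of A below are our reading of the example above, and the stopping rule in `is_primitive` uses Wielandt's bound (a nonnegative n×n matrix is primitive exactly when its power with exponent $(n-1)^2 + 1$ is positive), which the text does not mention. Function names are hypothetical.

```python
# Entries as read from the reducible example (an assumption about the source).
A = [[2, 0, 0, 1, 3],
     [4, 2, 1, 5, 5],
     [1, 2, 7, 3, 0],
     [6, 0, 0, 2, 1],
     [1, 0, 0, 7, 2]]


def permute(A, perm):
    """Apply the same permutation to rows and columns (this is P A P^T)."""
    return [[A[i][j] for j in perm] for i in perm]


def bool_matmul(X, Y):
    """Multiply zero/nonzero patterns; only positivity of entries matters."""
    return [[1 if any(x and y for x, y in zip(row, col)) else 0
             for col in zip(*Y)] for row in X]


def is_primitive(A):
    """Test whether some power A^k (k up to Wielandt's bound) is positive.

    Assumes A is a square matrix with nonnegative entries.
    """
    n = len(A)
    pattern = [[1 if x > 0 else 0 for x in row] for row in A]
    power = pattern
    for _ in range((n - 1) ** 2 + 1):     # checks powers 1, ..., (n-1)^2 + 1
        if all(all(row) for row in power):
            return True
        power = bool_matmul(power, pattern)
    return False


swapped = permute(A, [2, 1, 0, 3, 4])     # interchange indices 1 and 3 (0-based 0 and 2)
lower_left = [row[:2] for row in swapped[2:]]
```

The zero block in the lower left persists in every power of the permuted matrix, which is the idea behind the contrapositive argument that a reducible matrix cannot be primitive.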
Theorem 4.37
The Perron-Frobenius Theorem
Let A be an irreducible nonnegative $n \times n$ matrix. Then A has a real eigenvalue $\lambda_1$ with the following properties:
a. $\lambda_1 > 0$
b. $\lambda_1$ has a corresponding positive eigenvector.
c. If $\lambda$ is any other eigenvalue of A, then $|\lambda| \le \lambda_1$. If A is primitive, then this inequality is strict.
d. If $\lambda$ is an eigenvalue of A such that $|\lambda| = \lambda_1$, then $\lambda$ is a (complex) root of the equation $\lambda^n - \lambda_1^n = 0$.
e. $\lambda_1$ has algebraic multiplicity 1.

See Matrix Analysis by R. A. Horn and C. R. Johnson (Cambridge, England: Cambridge University Press, 1985).

The interested reader can find a proof of the Perron-Frobenius Theorem in many texts on nonnegative matrices or matrix analysis. The eigenvalue $\lambda_1$ is often called the Perron root of A, and a corresponding probability eigenvector (which is necessarily unique) is called the Perron eigenvector of A.
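For a positive matrix, the Perron root and Perron eigenvector can be approximated by repeatedly applying A and rescaling, in the spirit of the intuitive argument for Perron's Theorem. The following is a minimal sketch under our own assumptions: the test matrix, the iteration count, and the function name `perron` are illustrative, and convergence is only claimed here for positive (more generally, primitive) matrices.

```python
def perron(A, iterations=200):
    """Approximate the Perron root and the probability eigenvector of A.

    Assumes A is a positive square matrix. The iterate x is kept as a
    probability vector (entries sum to 1), so the sum of the entries of
    A x estimates the eigenvalue.
    """
    n = len(A)
    x = [1.0 / n] * n
    lam = 0.0
    for _ in range(iterations):
        y = [sum(a * xi for a, xi in zip(row, x)) for row in A]
        lam = sum(y)
        x = [yi / lam for yi in y]
    return lam, x


A = [[1.0, 2.0],
     [3.0, 4.0]]                          # illustrative positive matrix
root, vector = perron(A)
```

For this matrix the characteristic polynomial is $\lambda^2 - 5\lambda - 2$, so the value returned should be close to $(5 + \sqrt{33})/2$.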
Linear Recurrence Relations
The Fibonacci numbers are the numbers in the sequence 0, 1, 1, 2, 3, 5, 8, 13, 21, ..., where, after the first two terms, each new term is obtained by summing the two terms preceding it. If we denote the nth Fibonacci number by $f_n$, then this sequence is completely defined by the equations $f_0 = 0$, $f_1 = 1$, and, for $n \ge 2$,

$$f_n = f_{n-1} + f_{n-2}$$

This last equation is an example of a linear recurrence relation. We will return to the Fibonacci numbers, but first we will consider linear recurrence relations somewhat more generally.
where

$$\mathbf{x}_n = \begin{bmatrix} x_n \\ x_{n-1} \end{bmatrix} \quad \text{and} \quad A = \begin{bmatrix} a & b \\ 1 & 0 \end{bmatrix}$$
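Since A has distinct eigenvalues, it can be diagonalized. The rest of the details are left for Exercise 51.

(b) We will show that $x_n = c_1\lambda^n + c_2 n\lambda^n$ satisfies the recurrence relation $x_n = ax_{n-1} + bx_{n-2}$ or, equivalently,

$$x_n - ax_{n-1} - bx_{n-2} = 0 \qquad (6)$$

if $\lambda^2 - a\lambda - b = 0$. Since

$$x_{n-1} = c_1\lambda^{n-1} + c_2(n-1)\lambda^{n-1} \quad \text{and} \quad x_{n-2} = c_1\lambda^{n-2} + c_2(n-2)\lambda^{n-2}$$

substitution into equation (6) yields

$$\begin{aligned} x_n - ax_{n-1} - bx_{n-2} &= (c_1\lambda^n + c_2 n\lambda^n) - a(c_1\lambda^{n-1} + c_2(n-1)\lambda^{n-1}) - b(c_1\lambda^{n-2} + c_2(n-2)\lambda^{n-2}) \\ &= c_1(\lambda^n - a\lambda^{n-1} - b\lambda^{n-2}) + c_2(n\lambda^n - a(n-1)\lambda^{n-1} - b(n-2)\lambda^{n-2}) \\ &= c_1\lambda^{n-2}(\lambda^2 - a\lambda - b) + c_2 n\lambda^{n-2}(\lambda^2 - a\lambda - b) + c_2\lambda^{n-2}(a\lambda + 2b) \\ &= c_1\lambda^{n-2}(0) + c_2 n\lambda^{n-2}(0) + c_2\lambda^{n-2}(a\lambda + 2b) \\ &= c_2\lambda^{n-2}(a\lambda + 2b) \end{aligned}$$

But since $\lambda$ is a double root of $\lambda^2 - a\lambda - b = 0$, we must have $a^2 + 4b = 0$ and $\lambda = a/2$, using the quadratic formula. Consequently, $a\lambda + 2b = a^2/2 + 2b = -4b/2 + 2b = 0$, so

$$x_n - ax_{n-1} - bx_{n-2} = c_2\lambda^{n-2}(a\lambda + 2b) = c_2\lambda^{n-2}(0) = 0$$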
Suppose the initial conditions are $x_0 = r$ and $x_1 = s$. Then, in either (a) or (b) there is a unique solution for $c_1$ and $c_2$. (See Exercise 52.)
Example 4.42
Solve the recurrence relation $x_0 = 1$, $x_1 = 6$, and $x_n = 6x_{n-1} - 9x_{n-2}$ for $n \ge 2$.

Solution
The characteristic equation is $\lambda^2 - 6\lambda + 9 = 0$, which has $\lambda = 3$ as a double root. By Theorem 4.38(b), we must have $x_n = c_1 3^n + c_2 n 3^n = (c_1 + c_2 n)3^n$. Since $1 = x_0 = c_1$ and $6 = x_1 = (c_1 + c_2)3$, we find that $c_2 = 1$, so

$$x_n = (1 + n)3^n$$

The techniques outlined in Theorem 4.38 can be extended to higher order recurrence relations. We state, without proof, the general result.
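The closed form found in Example 4.42 can be checked against the recurrence itself. The sketch below is a simple sanity check, not a proof; the helper names are our own.

```python
def recurrence_terms(x0, x1, a, b, count):
    """First `count` terms of x_n = a*x_(n-1) + b*x_(n-2) with x_0, x_1 given."""
    terms = [x0, x1]
    while len(terms) < count:
        terms.append(a * terms[-1] + b * terms[-2])
    return terms[:count]


def closed_form(n):
    """The solution from Example 4.42: x_n = (1 + n) * 3^n."""
    return (1 + n) * 3 ** n


terms = recurrence_terms(1, 6, 6, -9, 10)        # x_n = 6x_(n-1) - 9x_(n-2)
formula = [closed_form(n) for n in range(10)]
```

Because everything is integer arithmetic, the two lists can be compared exactly.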
Theorem 4.39
Let $x_n = a_{m-1}x_{n-1} + a_{m-2}x_{n-2} + \cdots + a_0 x_{n-m}$ be a recurrence relation of order m that is satisfied by a sequence $(x_n)$. Suppose the associated characteristic polynomial

$$\lambda^m - a_{m-1}\lambda^{m-1} - a_{m-2}\lambda^{m-2} - \cdots - a_0$$

factors as $(\lambda - \lambda_1)^{m_1}(\lambda - \lambda_2)^{m_2} \cdots (\lambda - \lambda_k)^{m_k}$, where $m_1 + m_2 + \cdots + m_k = m$. Then $x_n$ has the form

$$x_n = (c_{11}\lambda_1^n + c_{12}n\lambda_1^n + c_{13}n^2\lambda_1^n + \cdots + c_{1m_1}n^{m_1-1}\lambda_1^n) + \cdots + (c_{k1}\lambda_k^n + c_{k2}n\lambda_k^n + c_{k3}n^2\lambda_k^n + \cdots + c_{km_k}n^{m_k-1}\lambda_k^n)$$
Systems of Linear Differential Equations
In calculus, you learn that if $x = x(t)$ is a differentiable function satisfying a differential equation of the form $x' = kx$, where k is a constant, then the general solution is $x = Ce^{kt}$, where C is a constant. If an initial condition $x(0) = x_0$ is specified, then, by substituting $t = 0$ in the general solution, we find that $C = x_0$. Hence, the unique solution to the differential equation that satisfies the initial condition is

$$x = x_0 e^{kt}$$
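Suppose we have n differentiable functions of t (say, $x_1, x_2, \ldots, x_n$) that satisfy a system of differential equations

$$\begin{aligned} x_1' &= a_{11}x_1 + a_{12}x_2 + \cdots + a_{1n}x_n \\ x_2' &= a_{21}x_1 + a_{22}x_2 + \cdots + a_{2n}x_n \\ &\ \ \vdots \\ x_n' &= a_{n1}x_1 + a_{n2}x_2 + \cdots + a_{nn}x_n \end{aligned}$$

We can write this system in matrix form as $\mathbf{x}' = A\mathbf{x}$, where

$$\mathbf{x}(t) = \begin{bmatrix} x_1(t) \\ x_2(t) \\ \vdots \\ x_n(t) \end{bmatrix}, \quad \mathbf{x}'(t) = \begin{bmatrix} x_1'(t) \\ x_2'(t) \\ \vdots \\ x_n'(t) \end{bmatrix}$$

… $\ge O$ and $C \ge D \ge O$, then $AC \ge BD \ge O$.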
It can be shown that a nonnegative $n \times n$ matrix is irreducible if and only if $(I + A)^{n-1} > O$.
Exercises 32-35: [the instructions and matrices for these exercises are not recoverable from the source.]
36. (a) If A is the adjacency matrix of a graph G, show that A is irreducible if and only if G is connected. (A graph is connected if there is a path between every pair of vertices.) (b) Which of the graphs in Section 4.0 have an irreducible adjacency matrix? Which have a primitive adjacency matrix?
37. Let G be a bipartite graph with adjacency matrix A. (a) Show that A is not primitive. (b) Show that if $\lambda$ is an eigenvalue of A, so is $-\lambda$. [Hint: Use Exercise 60 in Section 3.7 and partition an eigenvector for $\lambda$ so that it is compatible with this partitioning of A. Use this partitioning to find an eigenvector for $-\lambda$.]
38. A graph is called k-regular if k edges meet at each vertex. Let G be a k-regular graph. (a) Show that the adjacency matrix A of G has $\lambda = k$ as an eigenvalue. (Hint: Adapt Theorem 4.30.)
… (b) $|A + B| \le |A| + |B|$
42. $a_1 = 128$, $a_n = a_{n-1}/2$ for $n \ge 2$
43. $y_0 = 0$, $y_1 = 1$, $y_n = y_{n-1} - y_{n-2}$ for $n \ge 2$
44. $b_0 = 1$, $b_1 = 1$, $b_n = 2b_{n-1} + b_{n-2}$ for $n \ge 2$
In Exercises 45-50, solve the recurrence relation with the given initial conditions.
45. $x_0 = 0$, $x_1 = 5$, $x_n = 3x_{n-1} + 4x_{n-2}$ for $n \ge 2$
46. $x_0 = 0$, $x_1 = 1$, $x_n = 4x_{n-1} - 3x_{n-2}$ for $n \ge 2$
47. $y_1 = 1$, $y_2 = 6$, $y_n = 4y_{n-1} - 4y_{n-2}$ for $n \ge 3$
48. $a_0 = 4$, $a_1 = 1$, $a_n = a_{n-1} - a_{n-2}/4$ for $n \ge 2$
49. $b_0 = 0$, $b_1 = 1$, $b_n = 2b_{n-1} + 2b_{n-2}$ for $n \ge 2$
50. The recurrence relation in Exercise 43. Show that your solution agrees with the answer to Exercise 43.
51. Complete the proof of Theorem 4.38(a) by showing that if the recurrence relation $x_n = ax_{n-1} + bx_{n-2}$ has distinct eigenvalues $\lambda_1 \ne \lambda_2$, then the solution will be of the form

$$x_n = c_1\lambda_1^n + c_2\lambda_2^n$$

(Hint: Show that the method of Example 4.40 works in general.)
52. Show that for any choice of initial conditions $x_0 = r$ and $x_1 = s$, the scalars $c_1$ and $c_2$ can be found, as stated in Theorem 4.38(a) and (b).
53. The Fibonacci recurrence $f_n = f_{n-1} + f_{n-2}$ has the associated matrix equation $\mathbf{x}_n = A\mathbf{x}_{n-1}$, where

$$\mathbf{x}_n = \begin{bmatrix} f_n \\ f_{n-1} \end{bmatrix} \quad \text{and} \quad A = \begin{bmatrix} 1 & 1 \\ 1 & 0 \end{bmatrix}$$

… The area of the square is 64 square units, but the rectangle's area is 65 square units! Where did the extra square come from? (Hint: What does this have to do with the Fibonacci sequence?)
54. You have a supply of three kinds of tiles: two kinds of 1×2 tiles and one kind of 1×1 tile, as shown in Figure 4.29.

Figure 4.29

(a) With $f_0 = 0$ and $f_1 = 1$, use mathematical induction to prove that

$$A^n = \begin{bmatrix} f_{n+1} & f_n \\ f_n & f_{n-1} \end{bmatrix}$$
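for all $n \ge 1$.

(a) ⇒ (c) Assume that Q is orthogonal. Then $Q^TQ = I$, and we have

$$Q\mathbf{x} \cdot Q\mathbf{y} = (Q\mathbf{x})^T Q\mathbf{y} = \mathbf{x}^T Q^T Q\mathbf{y} = \mathbf{x}^T I \mathbf{y} = \mathbf{x}^T\mathbf{y} = \mathbf{x} \cdot \mathbf{y}$$

(c) ⇒ (b) Assume that $Q\mathbf{x} \cdot Q\mathbf{y} = \mathbf{x} \cdot \mathbf{y}$ for every $\mathbf{x}$ and $\mathbf{y}$ in $\mathbb{R}^n$. Then, taking $\mathbf{y} = \mathbf{x}$, we have $Q\mathbf{x} \cdot Q\mathbf{x} = \mathbf{x} \cdot \mathbf{x}$, so $\|Q\mathbf{x}\| = \sqrt{Q\mathbf{x} \cdot Q\mathbf{x}} = \sqrt{\mathbf{x} \cdot \mathbf{x}} = \|\mathbf{x}\|$.

(b) ⇒ (a) Assume that property (b) holds and let $\mathbf{q}_i$ denote the ith column of Q. Using Exercise 49 in Section 1.2 and property (b), we have

$$\begin{aligned} \mathbf{x} \cdot \mathbf{y} &= \tfrac{1}{4}(\|\mathbf{x} + \mathbf{y}\|^2 - \|\mathbf{x} - \mathbf{y}\|^2) \\ &= \tfrac{1}{4}(\|Q(\mathbf{x} + \mathbf{y})\|^2 - \|Q(\mathbf{x} - \mathbf{y})\|^2) \\ &= \tfrac{1}{4}(\|Q\mathbf{x} + Q\mathbf{y}\|^2 - \|Q\mathbf{x} - Q\mathbf{y}\|^2) \\ &= Q\mathbf{x} \cdot Q\mathbf{y} \end{aligned}$$

for all $\mathbf{x}$ and $\mathbf{y}$ in $\mathbb{R}^n$. [This shows that (b) ⇒ (c).] Now if $\mathbf{e}_i$ is the ith standard basis vector, then $\mathbf{q}_i = Q\mathbf{e}_i$. Consequently, $\mathbf{q}_i \cdot \mathbf{q}_j = Q\mathbf{e}_i \cdot Q\mathbf{e}_j = \mathbf{e}_i \cdot \mathbf{e}_j$ …

… we also have $W = \ell^\perp$. Figure 5.5 illustrates this situation.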
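The equivalences proved above (Q orthogonal, Q preserves lengths, Q preserves dot products) can be illustrated numerically with a rotation matrix. This is a minimal sketch; the angle, the test vectors, and the helper names are our own choices, and floating-point comparisons are made only up to a tolerance.

```python
import math


def matvec(Q, x):
    return [sum(q * xi for q, xi in zip(row, x)) for row in Q]


def dot(x, y):
    return sum(a * b for a, b in zip(x, y))


def norm(x):
    return math.sqrt(dot(x, x))


theta = 0.7                               # arbitrary angle in radians
Q = [[math.cos(theta), -math.sin(theta)],
     [math.sin(theta),  math.cos(theta)]]

x = [3.0, -1.0]
y = [2.0, 5.0]
Qx, Qy = matvec(Q, x), matvec(Q, y)

# Columns of Q: orthonormality is the defining property of an orthogonal matrix.
q1, q2 = [Q[0][0], Q[1][0]], [Q[0][1], Q[1][1]]
```

One would expect `dot(Qx, Qy)` to agree with `dot(x, y)` and `norm(Qx)` to agree with `norm(x)`, up to rounding.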
Chapter 5 Orthogonality
In Example 5.8, the orthogonal complement of a subspace turned out to be another subspace. Also, the complement of the complement of a subspace was the original subspace. These properties are true in general and are proved as properties (a) and (b) of Theorem 5.9. Properties (c) and (d) will also be useful. (Recall that the intersection $A \cap B$ of sets A and B consists of their common elements. See Appendix A.)

Theorem 5.9
Let W be a subspace of $\mathbb{R}^n$.
a. $W^\perp$ is a subspace of $\mathbb{R}^n$.
b. $(W^\perp)^\perp = W$
c. $W \cap W^\perp = \{\mathbf{0}\}$
d. If $W = \operatorname{span}(\mathbf{w}_1, \ldots, \mathbf{w}_k)$, then $\mathbf{v}$ is in $W^\perp$ if and only if $\mathbf{v} \cdot \mathbf{w}_i = 0$ for all $i = 1, \ldots, k$.

Proof (a) Since $\mathbf{0} \cdot \mathbf{w} = 0$ for all $\mathbf{w}$ in W, $\mathbf{0}$ is in $W^\perp$. Let $\mathbf{u}$ and $\mathbf{v}$ be in $W^\perp$ and let c be a scalar. Then

$$\mathbf{u} \cdot \mathbf{w} = \mathbf{v} \cdot \mathbf{w} = 0 \quad \text{for all } \mathbf{w} \text{ in } W$$

Therefore, $(\mathbf{u} + \mathbf{v}) \cdot \mathbf{w} = \mathbf{u} \cdot \mathbf{w} + \mathbf{v} \cdot \mathbf{w} = 0 + 0 = 0$, so $\mathbf{u} + \mathbf{v}$ is in $W^\perp$. We also have

$$(c\mathbf{u}) \cdot \mathbf{w} = c(\mathbf{u} \cdot \mathbf{w}) = c(0) = 0$$

from which we see that $c\mathbf{u}$ is in $W^\perp$. It follows that $W^\perp$ is a subspace of $\mathbb{R}^n$.
(b) We will prove this property as Corollary 5.12.
(c) You are asked to prove this property in Exercise 23.
(d) You are asked to prove this property in Exercise 24.
We can now express some fundamental relationships involving the subspaces associated with an $m \times n$ matrix.

Theorem 5.10
Let A be an $m \times n$ matrix. Then the orthogonal complement of the row space of A is the null space of A, and the orthogonal complement of the column space of A is the null space of $A^T$:

$$(\operatorname{row}(A))^\perp = \operatorname{null}(A) \quad \text{and} \quad (\operatorname{col}(A))^\perp = \operatorname{null}(A^T)$$

Proof
If $\mathbf{x}$ is a vector in $\mathbb{R}^n$, then $\mathbf{x}$ is in $(\operatorname{row}(A))^\perp$ if and only if $\mathbf{x}$ is orthogonal to every row of A. But this is true if and only if $A\mathbf{x} = \mathbf{0}$, which is equivalent to $\mathbf{x}$ being in $\operatorname{null}(A)$, so we have established the first identity. To prove the second identity, we simply replace A by $A^T$ and use the fact that $\operatorname{row}(A^T) = \operatorname{col}(A)$.

Thus, an $m \times n$ matrix has four subspaces: $\operatorname{row}(A)$, $\operatorname{null}(A)$, $\operatorname{col}(A)$, and $\operatorname{null}(A^T)$. The first two are orthogonal complements in $\mathbb{R}^n$, and the last two are orthogonal complements in $\mathbb{R}^m$. The $m \times n$ matrix A defines a linear transformation from $\mathbb{R}^n$ into $\mathbb{R}^m$ whose range is $\operatorname{col}(A)$. Moreover, this transformation sends $\operatorname{null}(A)$ to $\mathbf{0}$ in $\mathbb{R}^m$. Figure 5.6 illustrates these ideas schematically. These four subspaces are called the fundamental subspaces of the $m \times n$ matrix A.

Figure 5.6 The four fundamental subspaces
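Example 5.9
Find bases for the four fundamental subspaces of

$$A = \begin{bmatrix} 1 & 1 & 3 & 1 & 6 \\ 2 & -1 & 0 & 1 & -1 \\ -3 & 2 & 1 & -2 & 1 \\ 4 & 1 & 6 & 1 & 3 \end{bmatrix}$$

The row space of A is $\operatorname{row}(A) = \operatorname{span}(\mathbf{u}_1, \mathbf{u}_2, \mathbf{u}_3)$, where

$$\mathbf{u}_1 = [1\ \ 0\ \ 1\ \ 0\ \ {-1}], \quad \mathbf{u}_2 = [0\ \ 1\ \ 2\ \ 0\ \ 3], \quad \mathbf{u}_3 = [0\ \ 0\ \ 0\ \ 1\ \ 4]$$

Also, $\operatorname{null}(A) = \operatorname{span}(\mathbf{x}_1, \mathbf{x}_2)$, where

$$\mathbf{x}_1 = \begin{bmatrix} -1 \\ -2 \\ 1 \\ 0 \\ 0 \end{bmatrix}, \quad \mathbf{x}_2 = \begin{bmatrix} 1 \\ -3 \\ 0 \\ -4 \\ 1 \end{bmatrix}$$

To show that $(\operatorname{row}(A))^\perp = \operatorname{null}(A)$, it is enough to show that every $\mathbf{u}_i$ is orthogonal to each $\mathbf{x}_j$, which is an easy exercise. (Why is this sufficient?)

The column space of A is $\operatorname{col}(A) = \operatorname{span}(\mathbf{a}_1, \mathbf{a}_2, \mathbf{a}_3)$, where

$$\mathbf{a}_1 = \begin{bmatrix} 1 \\ 2 \\ -3 \\ 4 \end{bmatrix}, \quad \mathbf{a}_2 = \begin{bmatrix} 1 \\ -1 \\ 2 \\ 1 \end{bmatrix}, \quad \mathbf{a}_3 = \begin{bmatrix} 1 \\ 1 \\ -2 \\ 1 \end{bmatrix}$$

We still need to compute the null space of $A^T$. Row reduction produces

$$[A^T \mid \mathbf{0}] = \left[\begin{array}{cccc|c} 1 & 2 & -3 & 4 & 0 \\ 1 & -1 & 2 & 1 & 0 \\ 3 & 0 & 1 & 6 & 0 \\ 1 & 1 & -2 & 1 & 0 \\ 6 & -1 & 1 & 3 & 0 \end{array}\right] \longrightarrow \left[\begin{array}{cccc|c} 1 & 0 & 0 & 1 & 0 \\ 0 & 1 & 0 & 6 & 0 \\ 0 & 0 & 1 & 3 & 0 \\ 0 & 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 & 0 \end{array}\right]$$

So, if $\mathbf{y}$ is in the null space of $A^T$, then $y_1 = -y_4$, $y_2 = -6y_4$, and $y_3 = -3y_4$. It follows that

$$\operatorname{null}(A^T) = \left\{ \begin{bmatrix} -y_4 \\ -6y_4 \\ -3y_4 \\ y_4 \end{bmatrix} \right\} = \operatorname{span}\left( \begin{bmatrix} -1 \\ -6 \\ -3 \\ 1 \end{bmatrix} \right)$$

and it is easy to check that this vector is orthogonal to $\mathbf{a}_1$, $\mathbf{a}_2$, and $\mathbf{a}_3$.

The method of Example 5.9 is easily adapted to other situations.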
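Example 5.10
Let W be the subspace of $\mathbb{R}^5$ spanned by

$$\mathbf{w}_1 = \begin{bmatrix} 1 \\ -3 \\ 5 \\ 0 \\ 5 \end{bmatrix}, \quad \mathbf{w}_2 = \begin{bmatrix} -1 \\ 1 \\ 2 \\ -2 \\ 3 \end{bmatrix}, \quad \mathbf{w}_3 = \begin{bmatrix} 0 \\ -1 \\ 4 \\ -1 \\ 5 \end{bmatrix}$$

Find a basis for $W^\perp$.

Solution
The subspace W spanned by $\mathbf{w}_1$, $\mathbf{w}_2$, and $\mathbf{w}_3$ is the same as the column space of

$$A = \begin{bmatrix} 1 & -1 & 0 \\ -3 & 1 & -1 \\ 5 & 2 & 4 \\ 0 & -2 & -1 \\ 5 & 3 & 5 \end{bmatrix}$$

Therefore, by Theorem 5.10, $W^\perp = (\operatorname{col}(A))^\perp = \operatorname{null}(A^T)$, and we may proceed as in the previous example. We compute

$$[A^T \mid \mathbf{0}] = \left[\begin{array}{ccccc|c} 1 & -3 & 5 & 0 & 5 & 0 \\ -1 & 1 & 2 & -2 & 3 & 0 \\ 0 & -1 & 4 & -1 & 5 & 0 \end{array}\right] \longrightarrow \left[\begin{array}{ccccc|c} 1 & 0 & 0 & 3 & 4 & 0 \\ 0 & 1 & 0 & 1 & 3 & 0 \\ 0 & 0 & 1 & 0 & 2 & 0 \end{array}\right]$$

Hence, $\mathbf{y}$ is in $W^\perp$ if and only if $y_1 = -3y_4 - 4y_5$, $y_2 = -y_4 - 3y_5$, and $y_3 = -2y_5$. It follows that

$$W^\perp = \left\{ \begin{bmatrix} -3y_4 - 4y_5 \\ -y_4 - 3y_5 \\ -2y_5 \\ y_4 \\ y_5 \end{bmatrix} \right\} = \operatorname{span}\left( \begin{bmatrix} -3 \\ -1 \\ 0 \\ 1 \\ 0 \end{bmatrix}, \begin{bmatrix} -4 \\ -3 \\ -2 \\ 0 \\ 1 \end{bmatrix} \right)$$

and these two vectors form a basis for $W^\perp$.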
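Orthogonal Projections
Recall that, in $\mathbb{R}^2$, the projection of a vector $\mathbf{v}$ onto a nonzero vector $\mathbf{u}$ is given by

$$\operatorname{proj}_{\mathbf{u}}(\mathbf{v}) = \left( \frac{\mathbf{u} \cdot \mathbf{v}}{\mathbf{u} \cdot \mathbf{u}} \right)\mathbf{u}$$

Furthermore, the vector $\operatorname{perp}_{\mathbf{u}}(\mathbf{v}) = \mathbf{v} - \operatorname{proj}_{\mathbf{u}}(\mathbf{v})$ is orthogonal to $\operatorname{proj}_{\mathbf{u}}(\mathbf{v})$, and we can decompose $\mathbf{v}$ as

$$\mathbf{v} = \operatorname{proj}_{\mathbf{u}}(\mathbf{v}) + \operatorname{perp}_{\mathbf{u}}(\mathbf{v})$$

as shown in Figure 5.7. If we let $W = \operatorname{span}(\mathbf{u})$, then $\mathbf{w} = \operatorname{proj}_{\mathbf{u}}(\mathbf{v})$ is in W and $\mathbf{w}^\perp = \operatorname{perp}_{\mathbf{u}}(\mathbf{v})$ is in $W^\perp$. We therefore have a way of "decomposing" $\mathbf{v}$ into the sum of two vectors, one from W and the other orthogonal to W, namely, $\mathbf{v} = \mathbf{w} + \mathbf{w}^\perp$. We now generalize this idea to $\mathbb{R}^n$.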
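The decomposition of v into a projection onto u and a perpendicular part can be computed exactly with rational arithmetic. The following is a small sketch; the vectors u and v are arbitrary illustrative choices, and the function names mirror the notation above but are otherwise our own.

```python
from fractions import Fraction


def dot(x, y):
    return sum(a * b for a, b in zip(x, y))


def proj(u, v):
    """Projection of v onto the nonzero vector u: ((u.v)/(u.u)) u."""
    c = Fraction(dot(u, v), dot(u, u))
    return [c * ui for ui in u]


def perp(u, v):
    """Component of v orthogonal to u: v - proj_u(v)."""
    return [vi - pi for vi, pi in zip(v, proj(u, v))]


u = [2, 1]
v = [-1, 3]
w = proj(u, v)            # lies in W = span(u)
w_perp = perp(u, v)       # lies in the orthogonal complement of W
```

Using `Fraction` avoids rounding, so the orthogonality of the two parts and the identity v = w + w_perp hold exactly rather than approximately.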
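Definition
Let W be a subspace of $\mathbb{R}^n$ and let $\{\mathbf{u}_1, \ldots, \mathbf{u}_k\}$ be an orthogonal basis for W. For any vector $\mathbf{v}$ in $\mathbb{R}^n$, the orthogonal projection of $\mathbf{v}$ onto W is defined as

$$\operatorname{proj}_W(\mathbf{v}) = \left( \frac{\mathbf{u}_1 \cdot \mathbf{v}}{\mathbf{u}_1 \cdot \mathbf{u}_1} \right)\mathbf{u}_1 + \cdots + \left( \frac{\mathbf{u}_k \cdot \mathbf{v}}{\mathbf{u}_k \cdot \mathbf{u}_k} \right)\mathbf{u}_k$$

The component of $\mathbf{v}$ orthogonal to W is the vector

$$\operatorname{perp}_W(\mathbf{v}) = \mathbf{v} - \operatorname{proj}_W(\mathbf{v})$$

Each summand in the definition of $\operatorname{proj}_W(\mathbf{v})$ is also a projection onto a single vector (or, equivalently, the one-dimensional subspace spanned by it in our previous sense). Therefore, with the notation of the preceding definition, we can write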
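$$\operatorname{proj}_W(\mathbf{v}) = \operatorname{proj}_{\mathbf{u}_1}(\mathbf{v}) + \cdots + \operatorname{proj}_{\mathbf{u}_k}(\mathbf{v})$$

…

1. positive definite if $f(\mathbf{x}) > 0$ for all $\mathbf{x} \ne \mathbf{0}$.
2. positive semidefinite if $f(\mathbf{x}) \ge 0$ for all $\mathbf{x}$.
3. negative definite if $f(\mathbf{x}) < 0$ for all $\mathbf{x} \ne \mathbf{0}$.
4. negative semidefinite if $f(\mathbf{x}) \le 0$ for all $\mathbf{x}$.
5. indefinite if $f(\mathbf{x})$ takes on both positive and negative values.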
A symmetric matrix A is called positive definite, positive semidefinite, negative definite, negative semidefinite, or indefinite if the associated quadratic form $f(\mathbf{x}) = \mathbf{x}^T A\mathbf{x}$ has the corresponding property.

The quadratic forms in parts (a), (b), (c), and (d) of Figure 5.12 are positive definite, negative definite, indefinite, and positive semidefinite, respectively. The Principal Axes Theorem makes it easy to tell if a quadratic form has one of these properties.
Theorem 5.24
Let A be an $n \times n$ symmetric matrix. The quadratic form $f(\mathbf{x}) = \mathbf{x}^T A\mathbf{x}$ is
a. positive definite if and only if all of the eigenvalues of A are positive.
b. positive semidefinite if and only if all of the eigenvalues of A are nonnegative.
c. negative definite if and only if all of the eigenvalues of A are negative.
d. negative semidefinite if and only if all of the eigenvalues of A are nonpositive.
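For a 2×2 symmetric matrix the eigenvalues have a closed form, so the classification in the theorem can be sketched directly. This is an illustration under stated assumptions: the function names and the tolerance `tol` are our own, the test matrices are illustrative, and the "indefinite" branch uses the standard fact that a form is indefinite exactly when A has eigenvalues of both signs.

```python
import math


def eigenvalues_symmetric_2x2(A):
    """Eigenvalues (smaller, larger) of the symmetric matrix [[a, b], [b, d]]."""
    (a, b), (b2, d) = A
    if b != b2:
        raise ValueError("matrix must be symmetric")
    mean = (a + d) / 2
    radius = math.sqrt(((a - d) / 2) ** 2 + b * b)
    return mean - radius, mean + radius


def classify(A, tol=1e-12):
    """Classify the quadratic form x^T A x by the signs of the eigenvalues."""
    lo, hi = eigenvalues_symmetric_2x2(A)
    if lo > tol:
        return "positive definite"
    if hi < -tol:
        return "negative definite"
    if lo < -tol and hi > tol:
        return "indefinite"
    if lo >= -tol:
        return "positive semidefinite"    # smallest eigenvalue is (numerically) zero
    return "negative semidefinite"        # largest eigenvalue is (numerically) zero


labels = [classify(M) for M in (
    [[2, 0], [0, 3]],
    [[1, 1], [1, 1]],
    [[1, 2], [2, 1]],
    [[-2, 0], [0, -1]],
    [[-1, 1], [1, -1]],
)]
```

Note that the zero matrix is both positive and negative semidefinite; this sketch simply reports the first matching label.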
… (c) a pair of straight lines or an imaginary conic if $k \ne 0$ … and prove that if $\mathbf{x}$ lies on this ellipse, so does $A\mathbf{x}$.]
Key Definitions and Concepts
fundamental subspaces of a matrix
Gram-Schmidt Process
orthogonal basis
orthogonal complement of a subspace
Orthogonal Decomposition Theorem
orthogonal matrix
orthogonal projection
orthogonal set of vectors
orthogonally diagonalizable matrix
orthonormal basis
orthonormal set of vectors
properties of orthogonal matrices
QR factorization
Rank Theorem
spectral decomposition
Spectral Theorem
Review Questions
1. Mark each of the following statements true or false:
(a) Every orthonormal set of vectors is linearly independent.
(b) Every nonzero subspace of $\mathbb{R}^n$ has an orthogonal basis.
(c) If A is a square matrix with orthonormal rows, then A is an orthogonal matrix.
(d) Every orthogonal matrix is invertible.
(e) If A is a matrix with $\det A = 1$, then A is an orthogonal matrix.
(f) If A is an $m \times n$ matrix such that $(\operatorname{row}(A))^\perp = \mathbb{R}^n$, then A must be the zero matrix.
(g) If W is a subspace of $\mathbb{R}^n$ and $\mathbf{v}$ is a vector in $\mathbb{R}^n$ such that $\operatorname{proj}_W(\mathbf{v}) = \mathbf{0}$, then $\mathbf{v}$ must be the zero vector.
(h) If A is a symmetric, orthogonal matrix, then $A^2 = I$.
(i) Every orthogonally diagonalizable matrix is invertible.
(j) Given any n real numbers $\lambda_1, \ldots, \lambda_n$, there exists a symmetric $n \times n$ matrix with $\lambda_1, \ldots, \lambda_n$ as its eigenvalues.
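2. Find all values of a and b such that […] is an orthogonal set of vectors.
3. Find the coordinate vector $[\mathbf{v}]_{\mathcal{B}}$ of $\mathbf{v}$ = […] with respect to the orthogonal basis $\mathcal{B}$ = […] of $\mathbb{R}^3$.
4. The coordinate vector of a vector $\mathbf{v}$ with respect to an orthonormal basis $\mathcal{B} = \{\mathbf{v}_1, \mathbf{v}_2\}$ of $\mathbb{R}^2$ is $[\mathbf{v}]_{\mathcal{B}}$ = […]. If $\mathbf{v}_1 = \begin{bmatrix} 3/5 \\ 4/5 \end{bmatrix}$, find all possible vectors $\mathbf{v}$.
5. Show that

$$\begin{bmatrix} 6/7 & 2/7 & 3/7 \\ -1/\sqrt{5} & 0 & 2/\sqrt{5} \\ 4/(7\sqrt{5}) & -15/(7\sqrt{5}) & 2/(7\sqrt{5}) \end{bmatrix}$$

is an orthogonal matrix.
6. If $\begin{bmatrix} 1/2 & a \\ b & c \end{bmatrix}$ is an orthogonal matrix, find all possible values of a, b, and c.
7. If Q is an orthogonal $n \times n$ matrix and $\{\mathbf{v}_1, \ldots, \mathbf{v}_k\}$ is an orthonormal set in $\mathbb{R}^n$, prove that $\{Q\mathbf{v}_1, \ldots, Q\mathbf{v}_k\}$ is an orthonormal set.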
8. If Q is an $n \times n$ matrix such that the angles $\angle(Q\mathbf{x}, Q\mathbf{y})$ and $\angle(\mathbf{x}, \mathbf{y})$ are equal for all vectors $\mathbf{x}$ and $\mathbf{y}$ in $\mathbb{R}^n$, prove that Q is an orthogonal matrix.
In Questions 9-12, find a basis for $W^\perp$.
9. W is the line in $\mathbb{R}^2$ with general equation $2x - 5y = 0$.
10. W is the line in $\mathbb{R}^3$ with parametric equations $x = t$, $y = 2t$, $z$ = […].
11. W = span […]
12. W = span […]
13. Find bases for each of the four fundamental subspaces of A = […].
14. Find the orthogonal decomposition of $\mathbf{v}$ = […] with respect to W = span […].
15. (a) Apply the Gram-Schmidt Process to $\mathbf{x}_1$ = […], $\mathbf{x}_2$ = […], $\mathbf{x}_3$ = […] to find an orthogonal basis for $W = \operatorname{span}\{\mathbf{x}_1, \mathbf{x}_2, \mathbf{x}_3\}$.
(b) Use the result of part (a) to find a QR factorization of A = […].
16. Find an orthogonal basis for $\mathbb{R}^4$ that contains the vectors […].
17. Find an orthogonal basis for the subspace W = { […] : $x_1 + x_2 + x_3 + x_4 = 0$ } of $\mathbb{R}^4$.
18. Let A = […]. (a) Orthogonally diagonalize A. (b) Give the spectral decomposition of A.
19. Find a symmetric matrix with eigenvalues $\lambda_1 = \lambda_2 = 1$, $\lambda_3 = -2$ and eigenspaces $E_1$ = span […], $E_{-2}$ = span […].
20. If $\{\mathbf{v}_1, \mathbf{v}_2, \ldots, \mathbf{v}_n\}$ is an orthonormal basis for $\mathbb{R}^n$ and

$$A = c_1\mathbf{v}_1\mathbf{v}_1^T + c_2\mathbf{v}_2\mathbf{v}_2^T + \cdots + c_n\mathbf{v}_n\mathbf{v}_n^T$$

prove that A is a symmetric matrix with eigenvalues $c_1, c_2, \ldots, c_n$ and corresponding eigenvectors $\mathbf{v}_1, \mathbf{v}_2, \ldots, \mathbf{v}_n$.
Algebra is generous; she often gives more than is asked of her.
Jean le Rond d'Alembert (1717-1783)
In Carl B. Boyer, A History of Mathematics, Wiley, 1968, p. 481

6.0 Introduction: Fibonacci in (Vector) Space
The Fibonacci sequence was introduced in Section 4.6. It is the sequence

0, 1, 1, 2, 3, 5, 8, 13, ...

of nonnegative integers with the property that after the first two terms, each term is the sum of the two terms preceding it. Thus 0 + 1 = 1, 1 + 1 = 2, 1 + 2 = 3, 2 + 3 = 5, and so on. If we denote the terms of the Fibonacci sequence by $f_0, f_1, f_2, \ldots$, then the entire sequence is completely determined by specifying that

$$f_0 = 0,\ f_1 = 1 \quad \text{and} \quad f_n = f_{n-1} + f_{n-2} \text{ for } n \ge 2$$

By analogy with vector notation, let's write a sequence $x_0, x_1, x_2, x_3, \ldots$ as

$$\mathbf{x} = [x_0, x_1, x_2, x_3, \ldots)$$

The Fibonacci sequence then becomes

$$\mathbf{f} = [f_0, f_1, f_2, f_3, \ldots) = [0, 1, 1, 2, \ldots)$$

We now generalize this notion.
A Fibonaccitype sequence is any sequence x = (Xu, xI' X 2, Xl" such that Xu and X I are real numbers and xn = xn _ 1 + xn_2 for n > 2. For example, [ I, sequence.
Vi. I + V2. 1 + 2 V2. 2 + 3 v'2.... )
••
is a Fibonaccitype
Problem 1 Write down the first five terms of three more Fibonacci-type sequences.

By analogy with vectors again, let's define the sum of two sequences $\mathbf{x} = [x_0, x_1, x_2, \ldots)$ and $\mathbf{y} = [y_0, y_1, y_2, \ldots)$ to be the sequence

$$\mathbf{x} + \mathbf{y} = [x_0 + y_0, x_1 + y_1, x_2 + y_2, \ldots)$$

If c is a scalar, we can likewise define the scalar multiple of a sequence by

$$c\mathbf{x} = [cx_0, cx_1, cx_2, \ldots)$$
Chapter 6
Vector Spaces
Problem 2 (a) Using your examples from Problem 1 or other examples, compute the sums of various pairs of Fibonacci-type sequences. Do the resulting sequences appear to be Fibonacci-type? (b) Compute various scalar multiples of your Fibonacci-type sequences from Problem 1. Do the resulting sequences appear to be Fibonacci-type?
Problem 3 (a) Prove that if $\mathbf{x}$ and $\mathbf{y}$ are Fibonacci-type sequences, then so is $\mathbf{x} + \mathbf{y}$. (b) Prove that if $\mathbf{x}$ is a Fibonacci-type sequence and c is a scalar, then $c\mathbf{x}$ is also a Fibonacci-type sequence.

Let's denote the set of all Fibonacci-type sequences by Fib. Problem 3 shows that, like $\mathbb{R}^n$, Fib is closed under addition and scalar multiplication. The next exercises show that Fib has much more in common with $\mathbb{R}^n$.

Problem 4 Review the algebraic properties of vectors in Theorem 1.1. Does Fib satisfy all of these properties? What Fibonacci-type sequence plays the role of $\mathbf{0}$? For a Fibonacci-type sequence $\mathbf{x}$, what is $-\mathbf{x}$? Is $-\mathbf{x}$ also a Fibonacci-type sequence?
Problem 5 In $\mathbb{R}^n$, we have the standard basis vectors $\mathbf{e}_1, \mathbf{e}_2, \ldots, \mathbf{e}_n$. The Fibonacci sequence $\mathbf{f} = [0, 1, 1, 2, \ldots)$ can be thought of as the analogue of $\mathbf{e}_2$ because its first two terms are 0 and 1. What sequence $\mathbf{e}$ in Fib plays the role of $\mathbf{e}_1$? What about $\mathbf{e}_3, \mathbf{e}_4, \ldots$? Do these vectors have analogues in Fib?
Problem 6 Let $\mathbf{x} = [x_0, x_1, x_2, \ldots)$ be a Fibonacci-type sequence. Show that $\mathbf{x}$ is a linear combination of $\mathbf{e}$ and $\mathbf{f}$.
Problem 7 Show that $\mathbf{e}$ and $\mathbf{f}$ are linearly independent. (That is, show that if $c\mathbf{e} + d\mathbf{f} = \mathbf{0}$, then $c = d = 0$.)
Problem 8 Given your answers to Problems 6 and 7, what would be a sensible value to assign to the "dimension" of Fib? Why?
Problem 9 Are there any geometric sequences in Fib? That is, if

$$[1, r, r^2, r^3, \ldots)$$

is a Fibonacci-type sequence, what are the possible values of r?
Problem 10 Find a "basis" for Fib consisting of geometric Fibonacci-type sequences.
Problem 11 Using your answer to Problem 10, give an alternative derivation of Binet's formula [formula (5) in Section 4.6]:

$$f_n = \frac{1}{\sqrt{5}}\left( \frac{1 + \sqrt{5}}{2} \right)^n - \frac{1}{\sqrt{5}}\left( \frac{1 - \sqrt{5}}{2} \right)^n$$

for the terms of the Fibonacci sequence $\mathbf{f} = [f_0, f_1, f_2, \ldots)$. (Hint: Express $\mathbf{f}$ in terms of the basis from Problem 10.)

The Lucas sequence is named after Edouard Lucas (see page 333).

The Lucas sequence is the Fibonacci-type sequence

$$\mathbf{l} = [l_0, l_1, l_2, l_3, \ldots) = [2, 1, 3, 4, \ldots)$$

Problem 12 Use the basis from Problem 10 to find an analogue of Binet's formula for the nth term $l_n$ of the Lucas sequence.
Problem 13 Prove that the Fibonacci and Lucas sequences are related by the identity

$$f_{n-1} + f_{n+1} = l_n \quad \text{for } n \ge 1$$

[Hint: The Fibonacci-type sequences $\mathbf{f}^+ = [1, 1, 2, 3, \ldots)$ and $\mathbf{f}^- = [1, 0, 1, 1, \ldots)$ form a basis for Fib. (Why?)]

In this Introduction, we have seen that the collection Fib of all Fibonacci-type sequences behaves in many respects like $\mathbb{R}^2$, even though the "vectors" are actually infinite sequences. This useful analogy leads to the general notion of a vector space that is the subject of this chapter.
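Several of the claims explored in Problems 2, 11, and 13 can be checked experimentally on finite prefixes of the sequences. The sketch below is exploratory only and proves nothing; the function names and the prefix length of 30 terms are our own choices, and Binet's formula is evaluated in floating point and rounded.

```python
import math


def fib_type(x0, x1, count):
    """First `count` terms of the Fibonacci-type sequence starting x0, x1."""
    terms = [x0, x1]
    while len(terms) < count:
        terms.append(terms[-1] + terms[-2])
    return terms[:count]


def is_fib_type(terms):
    """Check x_n = x_(n-1) + x_(n-2) on a finite prefix."""
    return all(terms[n] == terms[n - 1] + terms[n - 2]
               for n in range(2, len(terms)))


def binet(n):
    """Binet's formula for the nth Fibonacci number (floating point)."""
    sqrt5 = math.sqrt(5)
    return (((1 + sqrt5) / 2) ** n - ((1 - sqrt5) / 2) ** n) / sqrt5


f = fib_type(0, 1, 30)                    # Fibonacci sequence
l = fib_type(2, 1, 30)                    # Lucas sequence

sum_seq = [a + b for a, b in zip(f, l)]   # termwise sum, as in Problem 2(a)
scaled = [3 * a for a in f]               # scalar multiple, as in Problem 2(b)
lucas_identity = all(f[n - 1] + f[n + 1] == l[n] for n in range(1, 29))
binet_matches = all(round(binet(n)) == f[n] for n in range(30))
```

Both `sum_seq` and `scaled` pass `is_fib_type`, which is what Problem 3 asks you to prove in general.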
6.1 Vector Spaces and Subspaces
In Chapters 1 and 3, we saw that the algebra of vectors and the algebra of matrices are similar in many respects. In particular, we can add both vectors and matrices, and we can multiply both by scalars. The properties that result from these two operations (Theorem 1.1 and Theorem 3.2) are identical in both settings. In this section, we use these properties to define generalized "vectors" that arise in a wide variety of examples. By proving general theorems about these "vectors," we will therefore simultaneously be proving results about all of these examples. This is the real power of algebra: its ability to take properties from a concrete setting, like $\mathbb{R}^n$, and abstract them into a general setting.
Definition Let $V$ be a set on which two operations, called addition and scalar multiplication, have been defined. If $\mathbf{u}$ and $\mathbf{v}$ are in $V$, the sum of $\mathbf{u}$ and $\mathbf{v}$ is denoted by $\mathbf{u} + \mathbf{v}$, and if $c$ is a scalar, the scalar multiple of $\mathbf{u}$ by $c$ is denoted by $c\mathbf{u}$. If the following axioms hold for all $\mathbf{u}$, $\mathbf{v}$, and $\mathbf{w}$ in $V$ and for all scalars $c$ and $d$, then $V$ is called a vector space and its elements are called vectors.

1. $\mathbf{u} + \mathbf{v}$ is in $V$. (Closure under addition)
2. $\mathbf{u} + \mathbf{v} = \mathbf{v} + \mathbf{u}$ (Commutativity)
3. $(\mathbf{u} + \mathbf{v}) + \mathbf{w} = \mathbf{u} + (\mathbf{v} + \mathbf{w})$ (Associativity)
4. There exists an element $\mathbf{0}$ in $V$, called a zero vector, such that $\mathbf{u} + \mathbf{0} = \mathbf{u}$.
5. For each $\mathbf{u}$ in $V$, there is an element $-\mathbf{u}$ in $V$ such that $\mathbf{u} + (-\mathbf{u}) = \mathbf{0}$.
6. $c\mathbf{u}$ is in $V$. (Closure under scalar multiplication)
7. $c(\mathbf{u} + \mathbf{v}) = c\mathbf{u} + c\mathbf{v}$ (Distributivity)
8. $(c + d)\mathbf{u} = c\mathbf{u} + d\mathbf{u}$ (Distributivity)
9. $c(d\mathbf{u}) = (cd)\mathbf{u}$
10. $1\mathbf{u} = \mathbf{u}$

The German mathematician Hermann Grassmann (1809-1877) is generally credited with first introducing the idea of a vector space (although he did not call it that) in 1844. Unfortunately, his work was very difficult to read and did not receive the attention it deserved. One person who did study it was the Italian mathematician Giuseppe Peano (1858-1932). In his 1888 book Calcolo Geometrico, Peano clarified Grassmann's earlier work and laid down the axioms for a vector space as we know them today. Peano's book is also remarkable for introducing operations on sets. His notations $\cup$, $\cap$, and $\in$ (for "union," "intersection," and "is an element of") are the ones we still use, although they were not immediately accepted by other mathematicians. Peano's axiomatic definition of a vector space also had very little influence for many years. Acceptance came in 1918, after Hermann Weyl (1885-1955) repeated it in his book Space, Time, Matter, an introduction to Einstein's general theory of relativity.
Remarks
• By "scalars" we will usually mean the real numbers. Accordingly, we should refer to $V$ as a real vector space (or a vector space over the real numbers). It is also possible for scalars to be complex numbers or to belong to $\mathbb{Z}_p$, where $p$ is prime. In these cases, $V$ is called a complex vector space or a vector space over $\mathbb{Z}_p$, respectively. Most of our examples will be real vector spaces, so we will usually omit the adjective "real." If something is referred to as a "vector space," assume that we are working over the real number system. In fact, the scalars can be chosen from any number system in which, roughly speaking, we can add, subtract, multiply, and divide according to the usual laws of arithmetic. In abstract algebra, such a number system is called a field.
• The definition of a vector space does not specify what the set $V$ consists of. Neither does it specify what the operations called "addition" and "scalar multiplication" look like. Often, they will be familiar, but they need not be. See Example 6 below and Exercises 5-7.
We will now look at several examples of vector spaces. In each case, we need to specify the set $V$ and the operations of addition and scalar multiplication and to verify axioms 1 through 10. We need to pay particular attention to axioms 1 and 6 (closure), axiom 4 (the existence of a zero vector in $V$), and axiom 5 (each vector in $V$ must have a negative in $V$).
Example 6.1

For any $n \ge 1$, $\mathbb{R}^n$ is a vector space with the usual operations of addition and scalar multiplication. Axioms 1 and 6 follow from the definitions of these operations, and the remaining axioms follow from Theorem 1.1.

Example 6.2

The set of all $2 \times 3$ matrices is a vector space with the usual operations of matrix addition and matrix scalar multiplication. Here the "vectors" are actually matrices. We know that the sum of two $2 \times 3$ matrices is also a $2 \times 3$ matrix and that multiplying a $2 \times 3$ matrix by a scalar gives another $2 \times 3$ matrix; hence, we have closure. The remaining axioms follow from Theorem 3.2. In particular, the zero vector $\mathbf{0}$ is the $2 \times 3$ zero matrix, and the negative of a $2 \times 3$ matrix $A$ is just the $2 \times 3$ matrix $-A$. There is nothing special about $2 \times 3$ matrices. For any positive integers $m$ and $n$, the set of all $m \times n$ matrices forms a vector space with the usual operations of matrix addition and matrix scalar multiplication. This vector space is denoted $M_{mn}$.
Example 6.3

Let $\mathscr{P}_2$ denote the set of all polynomials of degree 2 or less with real coefficients. Define addition and scalar multiplication in the usual way. (See Appendix D.) If

$$p(x) = a_0 + a_1 x + a_2 x^2 \quad \text{and} \quad q(x) = b_0 + b_1 x + b_2 x^2$$

are in $\mathscr{P}_2$, then

$$p(x) + q(x) = (a_0 + b_0) + (a_1 + b_1)x + (a_2 + b_2)x^2$$

has degree at most 2 and so is in $\mathscr{P}_2$. If $c$ is a scalar, then

$$cp(x) = ca_0 + ca_1 x + ca_2 x^2$$

is also in $\mathscr{P}_2$. This verifies axioms 1 and 6. The zero vector $\mathbf{0}$ is the zero polynomial, that is, the polynomial all of whose coefficients are zero. The negative of a polynomial $p(x) = a_0 + a_1 x + a_2 x^2$ is the polynomial $-p(x) = -a_0 - a_1 x - a_2 x^2$. It is now easy to verify the remaining axioms. We will check axiom 2 and leave the others for Exercise 12. With $p(x)$ and $q(x)$ as above, we have

$$\begin{aligned}
p(x) + q(x) &= (a_0 + a_1 x + a_2 x^2) + (b_0 + b_1 x + b_2 x^2) \\
&= (a_0 + b_0) + (a_1 + b_1)x + (a_2 + b_2)x^2 \\
&= (b_0 + a_0) + (b_1 + a_1)x + (b_2 + a_2)x^2 \\
&= (b_0 + b_1 x + b_2 x^2) + (a_0 + a_1 x + a_2 x^2) \\
&= q(x) + p(x)
\end{aligned}$$

where the third equality follows from the fact that addition of real numbers is commutative.
In general, for any fixed $n \ge 0$, the set $\mathscr{P}_n$ of all polynomials of degree less than or equal to $n$ is a vector space, as is the set $\mathscr{P}$ of all polynomials.
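One way to make the polynomial operations concrete is to store a polynomial as its list of coefficients. The sketch below is an illustration only: the list representation, the helper names `poly_add` and `poly_scale`, and the sample polynomials are all assumptions made for the example.

```python
from itertools import zip_longest

def poly_add(p, q):
    """Add two polynomials stored as coefficient lists [a0, a1, a2, ...]."""
    return [a + b for a, b in zip_longest(p, q, fillvalue=0)]

def poly_scale(c, p):
    """Multiply every coefficient of p by the scalar c."""
    return [c * a for a in p]

p = [1, -2, 3]    # 1 - 2x + 3x^2 (hypothetical sample)
q = [4, 0, -1]    # 4 - x^2       (hypothetical sample)
zero = [0, 0, 0]  # the zero polynomial

print(poly_add(p, q))                   # the sum again has at most 3 coefficients
print(poly_add(p, q) == poly_add(q, p)) # axiom 2 for this pair
print(poly_add(p, poly_scale(-1, p)))   # p + (-p) gives the zero polynomial
```

With this representation, addition and scalar multiplication in $\mathscr{P}_2$ are exactly the componentwise operations of $\mathbb{R}^3$, which is the correspondence the chapter returns to later.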
Example 6.4

Let $\mathscr{F}$ denote the set of all real-valued functions defined on the real line. If $f$ and $g$ are two such functions and $c$ is a scalar, then $f + g$ and $cf$ are defined by

$$(f + g)(x) = f(x) + g(x) \quad \text{and} \quad (cf)(x) = cf(x)$$

In other words, the value of $f + g$ at $x$ is obtained by adding together the values of $f$ and $g$ at $x$ [Figure 6.1(a)]. Similarly, the value of $cf$ at $x$ is just the value of $f$ at $x$ multiplied by the scalar $c$ [Figure 6.1(b)]. The zero vector in $\mathscr{F}$ is the constant function $f_0$ that is identically zero; that is, $f_0(x) = 0$ for all $x$. The negative of a function $f$ is the function $-f$ defined by $(-f)(x) = -f(x)$ [Figure 6.1(c)]. Axioms 1 and 6 are obviously true. Verification of the remaining axioms is left as Exercise 13. Thus, $\mathscr{F}$ is a vector space.
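The pointwise definitions translate directly into code: a "vector" is a function, and the operations return new functions. This is a minimal sketch; the names `add`, `scale`, and `zero`, and the choice of sine and cosine as sample functions, are assumptions for illustration.

```python
import math

def add(f, g):
    """Pointwise sum: (f + g)(x) = f(x) + g(x)."""
    return lambda x: f(x) + g(x)

def scale(c, f):
    """Pointwise scalar multiple: (cf)(x) = c * f(x)."""
    return lambda x: c * f(x)

zero = lambda x: 0.0                      # the zero vector: identically zero

h = add(math.sin, scale(2.0, math.cos))   # h = sin + 2 cos
neg_sin = scale(-1.0, math.sin)           # the negative of sin

print(h(0.0))                             # sin 0 + 2 cos 0
print(add(math.sin, neg_sin)(1.3))        # f + (-f) agrees with the zero function
```

Note that `add` and `scale` never inspect a formula for $f$; they only use its values, which mirrors how the operations on $\mathscr{F}$ are defined.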
[Figure 6.1 The graphs of (a) $f$, $g$, and $f + g$, (b) $f$, $2f$, and $-3f$, and (c) $f$ and $-f$]
In Example 4, we could also have considered only those functions defined on some closed interval $[a, b]$ of the real line. This approach also produces a vector space, denoted by $\mathscr{F}[a, b]$.

Example 6.5

The set $\mathbb{Z}$ of integers with the usual operations is not a vector space. To demonstrate this, it is enough to find that one of the ten axioms fails and to give a specific instance in which it fails (a counterexample). In this case, we find that we do not have closure under scalar multiplication. For example, the multiple of the integer 2 by the scalar $\frac{1}{3}$ is $\left(\frac{1}{3}\right)(2) = \frac{2}{3}$, which is not an integer. Thus, it is not true that $cx$ is in $\mathbb{Z}$ for every $x$ in $\mathbb{Z}$ and every scalar $c$ (i.e., axiom 6 fails).
Example 6.6

Let $V = \mathbb{R}^2$ with the usual definition of addition but the following definition of scalar multiplication:

$$c\begin{bmatrix} x \\ y \end{bmatrix} = \begin{bmatrix} cx \\ 0 \end{bmatrix}$$

Then, for example,

$$1\begin{bmatrix} 2 \\ 3 \end{bmatrix} = \begin{bmatrix} 2 \\ 0 \end{bmatrix} \neq \begin{bmatrix} 2 \\ 3 \end{bmatrix}$$

so axiom 10 fails. [In fact, the other nine axioms are all true (check this), but we do not need to look into them because $V$ has already failed to be a vector space. This example shows the value of looking ahead, rather than working through the list of axioms in the order in which they have been given.]
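A counterexample of this kind is easy to test mechanically. The sketch below assumes the nonstandard rule sends the second component to 0, that is, $c(x, y) = (cx, 0)$; the function names and sample vectors are hypothetical.

```python
def add(u, v):
    """Usual addition in R^2."""
    return (u[0] + v[0], u[1] + v[1])

def scalar_mult(c, u):
    """Nonstandard scalar multiplication (assumed rule): second component becomes 0."""
    return (c * u[0], 0)

u = (2, 3)
v = (5, -1)

# Axiom 10 (1u = u) fails for any vector with a nonzero second component.
axiom10_holds = scalar_mult(1, u) == u

# Axiom 7 (c(u + v) = cu + cv) still holds for this sample.
axiom7_holds = scalar_mult(4, add(u, v)) == add(scalar_mult(4, u), scalar_mult(4, v))

print(axiom10_holds, axiom7_holds)
```

A single failing instance is enough to rule out a vector space, whereas a passing check on samples, as for axiom 7 here, only suggests that the axiom holds and still needs a proof.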
Example 6.7

Let $\mathbb{C}^2$ denote the set of all ordered pairs of complex numbers. Define addition and scalar multiplication as in $\mathbb{R}^2$, except here the scalars are complex numbers. For example,

$$\begin{bmatrix} 1 + i \\ 2 - 3i \end{bmatrix} + \begin{bmatrix} -3 + 2i \\ 4 \end{bmatrix} = \begin{bmatrix} -2 + 3i \\ 6 - 3i \end{bmatrix}$$

and

$$(1 - i)\begin{bmatrix} 1 + i \\ 2 - 3i \end{bmatrix} = \begin{bmatrix} (1 - i)(1 + i) \\ (1 - i)(2 - 3i) \end{bmatrix} = \begin{bmatrix} 2 \\ -1 - 5i \end{bmatrix}$$

Using properties of the complex numbers, it is straightforward to check that all ten axioms hold. Therefore, $\mathbb{C}^2$ is a complex vector space.

In general, $\mathbb{C}^n$ is a complex vector space for all $n \ge 1$.

Example 6.8

If $p$ is prime, the set $\mathbb{Z}_p^n$ (with the usual definitions of addition and multiplication by scalars from $\mathbb{Z}_p$) is a vector space over $\mathbb{Z}_p$ for all $n \ge 1$.
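The sample computation in $\mathbb{C}^2$ from Example 6.7 can be reproduced with Python's built-in complex type. This is only an arithmetic check; the vectors are the ones read from that example, and the variable names are illustrative.

```python
# Vectors in C^2 stored as pairs of Python complex numbers (1j is the imaginary unit).
u = (1 + 1j, 2 - 3j)
v = (-3 + 2j, 4 + 0j)
c = 1 - 1j            # a complex scalar

total = (u[0] + v[0], u[1] + v[1])   # componentwise sum u + v
scaled = (c * u[0], c * u[1])        # scalar multiple c * u

print(total)
print(scaled)
```

The same componentwise pattern works over $\mathbb{Z}_p$ if every sum and product is reduced modulo $p$.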
Before we consider further examples, we state a theorem that contains some useful properties of vector spaces. It is important to note that, by proving this theorem for vector spaces in general, we are actually proving it for every specific vector space.

Theorem 6.1

Let $V$ be a vector space, $\mathbf{u}$ a vector in $V$, and $c$ a scalar.

a. $0\mathbf{u} = \mathbf{0}$
b. $c\mathbf{0} = \mathbf{0}$
c. $(-1)\mathbf{u} = -\mathbf{u}$
d. If $c\mathbf{u} = \mathbf{0}$, then $c = 0$ or $\mathbf{u} = \mathbf{0}$.
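Proof We prove properties (b) and (d) and leave the proofs of the remaining properties as exercises.

(b) We have

$$c\mathbf{0} = c(\mathbf{0} + \mathbf{0}) = c\mathbf{0} + c\mathbf{0}$$

by vector space axioms 4 and 7. Adding the negative of $c\mathbf{0}$ to both sides produces

$$c\mathbf{0} + (-c\mathbf{0}) = (c\mathbf{0} + c\mathbf{0}) + (-c\mathbf{0})$$

which implies

$$\begin{aligned}
\mathbf{0} &= c\mathbf{0} + (c\mathbf{0} + (-c\mathbf{0})) && \text{By axioms 5 and 3} \\
&= c\mathbf{0} + \mathbf{0} && \text{By axiom 5} \\
&= c\mathbf{0} && \text{By axiom 4}
\end{aligned}$$

(d) Suppose $c\mathbf{u} = \mathbf{0}$. To show that either $c = 0$ or $\mathbf{u} = \mathbf{0}$, let's assume that $c \neq 0$. (If $c = 0$, there is nothing to prove.) Then, since $c \neq 0$, its reciprocal $1/c$ is defined, and

$$\begin{aligned}
\mathbf{u} &= 1\mathbf{u} && \text{By axiom 10} \\
&= \left(\frac{1}{c}\,c\right)\mathbf{u} \\
&= \frac{1}{c}(c\mathbf{u}) && \text{By axiom 9} \\
&= \frac{1}{c}\mathbf{0} \\
&= \mathbf{0} && \text{By property (b)}
\end{aligned}$$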
We will write $\mathbf{u} - \mathbf{v}$ for $\mathbf{u} + (-\mathbf{v})$, thereby defining subtraction of vectors. We will also exploit the associativity property of addition to unambiguously write $\mathbf{u} + \mathbf{v} + \mathbf{w}$ for the sum of three vectors and, more generally,

$$c_1\mathbf{v}_1 + c_2\mathbf{v}_2 + \cdots + c_n\mathbf{v}_n$$

for a linear combination of vectors.

Subspaces

We have seen that, in $\mathbb{R}^n$, it is possible for one vector space to sit inside another one, giving rise to the notion of a subspace. For example, a plane through the origin is a subspace of $\mathbb{R}^3$. We now extend this concept to general vector spaces.
Definition

A subset $W$ of a vector space $V$ is called a subspace of $V$ if $W$ is itself a vector space with the same scalars, addition, and scalar multiplication as $V$.

As in $\mathbb{R}^n$, checking to see whether a subset $W$ of a vector space $V$ is a subspace of $V$ involves testing only two of the ten vector space axioms. We prove this observation as a theorem.

Theorem 6.2

Let $V$ be a vector space and let $W$ be a nonempty subset of $V$. Then $W$ is a subspace of $V$ if and only if the following conditions hold:

a. If $\mathbf{u}$ and $\mathbf{v}$ are in $W$, then $\mathbf{u} + \mathbf{v}$ is in $W$.
b. If $\mathbf{u}$ is in $W$ and $c$ is a scalar, then $c\mathbf{u}$ is in $W$.

Proof Assume that $W$ is a subspace of $V$. Then $W$ satisfies vector space axioms 1 to 10. In particular, axiom 1 is condition (a) and axiom 6 is condition (b). Conversely, assume that $W$ is a subset of a vector space $V$, satisfying conditions (a) and (b). By hypothesis, axioms 1 and 6 hold. Axioms 2, 3, 7, 8, 9, and 10 hold in $W$ because they are true for all vectors in $V$ and thus are true in particular for those vectors in $W$. (We say that $W$ inherits these properties from $V$.) This leaves axioms 4 and 5 to be checked. Since $W$ is nonempty, it contains at least one vector $\mathbf{u}$. Then condition (b) and Theorem 6.1(a) imply that $0\mathbf{u} = \mathbf{0}$ is also in $W$. This is axiom 4. If $\mathbf{u}$ is in $W$, then, by taking $c = -1$ in condition (b), we have that $-\mathbf{u} = (-1)\mathbf{u}$ is also in $W$, using Theorem 6.1(c). This is axiom 5.

Remark Since Theorem 6.2 generalizes the notion of a subspace from the context of $\mathbb{R}^n$ to general vector spaces, all of the subspaces of $\mathbb{R}^n$ that we encountered in Chapter 3 are subspaces of $\mathbb{R}^n$ in the current context. In particular, lines and planes through the origin are subspaces of $\mathbb{R}^3$.
Example 6.9

We have already shown that the set $\mathscr{P}_n$ of all polynomials with degree at most $n$ is a vector space. Hence, $\mathscr{P}_n$ is a subspace of the vector space $\mathscr{P}$ of all polynomials.

Example 6.10

Let $W$ be the set of symmetric $n \times n$ matrices. Show that $W$ is a subspace of $M_{nn}$.

Solution Clearly, $W$ is nonempty, so we need only check conditions (a) and (b) in Theorem 6.2. Let $A$ and $B$ be in $W$ and let $c$ be a scalar. Then $A^T = A$ and $B^T = B$, from which it follows that

$$(A + B)^T = A^T + B^T = A + B$$

Therefore, $A + B$ is symmetric and, hence, is in $W$. Similarly,

$$(cA)^T = cA^T = cA$$

so $cA$ is symmetric and, thus, is in $W$. We have shown that $W$ is closed under addition and scalar multiplication. Therefore, it is a subspace of $M_{nn}$, by Theorem 6.2.
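The two closure conditions of Theorem 6.2 can be spot-checked for symmetric matrices with a few lines of code. This is a sketch on hypothetical sample matrices, stored as nested lists; it illustrates the conditions but does not replace the proof.

```python
def transpose(m):
    """Transpose of a matrix stored as a list of rows."""
    return [list(row) for row in zip(*m)]

def mat_add(a, b):
    """Entrywise sum of two matrices of the same size."""
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]

def mat_scale(c, a):
    """Scalar multiple of a matrix."""
    return [[c * x for x in row] for row in a]

def is_symmetric(m):
    return m == transpose(m)

A = [[1, 2], [2, 5]]      # hypothetical symmetric samples
B = [[0, -3], [-3, 4]]

print(is_symmetric(mat_add(A, B)))        # condition (a) for this pair
print(is_symmetric(mat_scale(7, A)))      # condition (b) for this sample
print(is_symmetric([[1, 2], [3, 4]]))     # a matrix outside W, for contrast
```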
Example 6.11

Let $\mathscr{C}$ be the set of all continuous real-valued functions defined on $\mathbb{R}$ and let $\mathscr{D}$ be the set of all differentiable real-valued functions defined on $\mathbb{R}$. Show that [...] is a subspace of [...]

[...] $\mathscr{P}_3$, and $M_{22}$. Typical elements of these vector spaces are, respectively,

$$\mathbf{u} = \begin{bmatrix} a \\ b \\ c \\ d \end{bmatrix}, \quad p(x) = a + bx + cx^2 + dx^3, \quad \text{and} \quad A = \begin{bmatrix} a & b \\ c & d \end{bmatrix}$$

In the words of Yogi Berra, "It's déjà vu all over again."

Any calculations involving the vector space operations of addition and scalar multiplication are essentially the same in all three settings. To highlight the similarities, in the next example we will perform the necessary steps in the three vector spaces side by side.

Example 6.13

(a) Show that the set $W$ of all vectors of the form

$$\begin{bmatrix} a \\ b \\ -b \\ a \end{bmatrix}$$

is a subspace of $\mathbb{R}^4$.

(b) Show that the set $W$ of all polynomials of the form $a + bx - bx^2 + ax^3$ is a subspace of $\mathscr{P}_3$.

(c) Show that the set $W$ of all matrices of the form

$$\begin{bmatrix} a & b \\ -b & a \end{bmatrix}$$

is a subspace of $M_{22}$.
Solution

(a) $W$ is nonempty because it contains the zero vector $\mathbf{0}$. (Take $a = b = 0$.) Let $\mathbf{u}$ and $\mathbf{v}$ be in $W$, say,

$$\mathbf{u} = \begin{bmatrix} a \\ b \\ -b \\ a \end{bmatrix} \quad \text{and} \quad \mathbf{v} = \begin{bmatrix} c \\ d \\ -d \\ c \end{bmatrix}$$

Then

$$\mathbf{u} + \mathbf{v} = \begin{bmatrix} a + c \\ b + d \\ -b - d \\ a + c \end{bmatrix} = \begin{bmatrix} a + c \\ b + d \\ -(b + d) \\ a + c \end{bmatrix}$$

so $\mathbf{u} + \mathbf{v}$ is also in $W$ (because it has the right form). Similarly, if $k$ is a scalar, then

$$k\mathbf{u} = \begin{bmatrix} ka \\ kb \\ -kb \\ ka \end{bmatrix}$$

so $k\mathbf{u}$ is in $W$. Thus, $W$ is a nonempty subset of $\mathbb{R}^4$ that is closed under addition and scalar multiplication. Therefore, $W$ is a subspace of $\mathbb{R}^4$, by Theorem 6.2.

(b) $W$ is nonempty because it contains the zero polynomial. (Take $a = b = 0$.) Let $p(x)$ and $q(x)$ be in $W$, say,

$$p(x) = a + bx - bx^2 + ax^3 \quad \text{and} \quad q(x) = c + dx - dx^2 + cx^3$$

Then

$$p(x) + q(x) = (a + c) + (b + d)x - (b + d)x^2 + (a + c)x^3$$

so $p(x) + q(x)$ is also in $W$ (because it has the right form). Similarly, if $k$ is a scalar, then

$$kp(x) = ka + kbx - kbx^2 + kax^3$$

so $kp(x)$ is in $W$. Thus, $W$ is a nonempty subset of $\mathscr{P}_3$ that is closed under addition and scalar multiplication. Therefore, $W$ is a subspace of $\mathscr{P}_3$, by Theorem 6.2.

(c) $W$ is nonempty because it contains the zero matrix $O$. (Take $a = b = 0$.) Let $A$ and $B$ be in $W$, say,

$$A = \begin{bmatrix} a & b \\ -b & a \end{bmatrix} \quad \text{and} \quad B = \begin{bmatrix} c & d \\ -d & c \end{bmatrix}$$

Then

$$A + B = \begin{bmatrix} a + c & b + d \\ -(b + d) & a + c \end{bmatrix}$$

so $A + B$ is also in $W$ (because it has the right form). Similarly, if $k$ is a scalar, then

$$kA = \begin{bmatrix} ka & kb \\ -kb & ka \end{bmatrix}$$

so $kA$ is in $W$. Thus, $W$ is a nonempty subset of $M_{22}$ that is closed under addition and scalar multiplication. Therefore, $W$ is a subspace of $M_{22}$, by Theorem 6.2.
Example 6.13 shows that it is often possible to relate examples that, on the surface, appear to have nothing in common. Consequently, we can apply our knowledge of $\mathbb{R}^n$ to polynomials, matrices, and other examples. We will encounter this idea several times in this chapter and will make it precise [...]

[...] determine whether $r(x) = 1 - 4x + 6x^2$ is in $\text{span}(p(x), q(x))$, where

$$p(x) = 1 - x + x^2 \quad \text{and} \quad q(x) = 2 + x - 3x^2$$

Solution We are looking for scalars $c$ and $d$ such that $cp(x) + dq(x) = r(x)$. This means that

$$c(1 - x + x^2) + d(2 + x - 3x^2) = 1 - 4x + 6x^2$$

Regrouping according to powers of $x$, we have

$$(c + 2d) + (-c + d)x + (c - 3d)x^2 = 1 - 4x + 6x^2$$

Equating the coefficients of like powers of $x$ gives

$$\begin{aligned}
c + 2d &= 1 \\
-c + d &= -4 \\
c - 3d &= 6
\end{aligned}$$

which is easily solved to give $c = 3$ and $d = -1$. Therefore, $r(x) = 3p(x) - q(x)$, so $r(x)$ is in $\text{span}(p(x), q(x))$. (Check this.)
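The "(Check this.)" step can be carried out with exact rational arithmetic. The sketch below stores each polynomial as its coefficient list, solves the first two equations for $c$ and $d$ by Cramer's rule, and then tests all three coefficients; the variable names are illustrative and the method is one convenient choice, not the one used in the text.

```python
from fractions import Fraction

# Coefficient lists [constant, x, x^2] for the polynomials in the example.
p = [1, -1, 1]    # 1 - x + x^2
q = [2, 1, -3]    # 2 + x - 3x^2
r = [1, -4, 6]    # 1 - 4x + 6x^2

# Solve the first two equations  c*p[0] + d*q[0] = r[0],  c*p[1] + d*q[1] = r[1]
# by Cramer's rule.
det = Fraction(p[0] * q[1] - q[0] * p[1])
c = Fraction(r[0] * q[1] - q[0] * r[1]) / det
d = Fraction(p[0] * r[1] - r[0] * p[1]) / det

# r is in span(p, q) exactly when the same c and d also satisfy the third equation.
in_span = all(c * pi + d * qi == ri for pi, qi, ri in zip(p, q, r))
print(c, d, in_span)
```

Because there are three equations and only two unknowns, the final consistency test matters: a pair $(c, d)$ that solves two of the equations need not solve the third.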
Example 6.20

In $\mathscr{F}$, determine whether $\sin 2x$ is in $\text{span}(\sin x, \cos x)$.

Solution We set $c \sin x + d \cos x = \sin 2x$ and try to determine $c$ and $d$ so that this equation is true. Since these are functions, the equation must be true for all values of $x$. Setting $x = 0$, we have

$$c \sin 0 + d \cos 0 = \sin 0 \quad \text{or} \quad c(0) + d(1) = 0$$

from which we see that $d = 0$. Setting $x = \pi/2$, we get

$$c \sin(\pi/2) + d \cos(\pi/2) = \sin(\pi) \quad \text{or} \quad c(1) + d(0) = 0$$

giving $c = 0$. But this implies that $\sin 2x = 0(\sin x) + 0(\cos x) = 0$ for all $x$, which is absurd, since $\sin 2x$ is not the zero function. We conclude that $\sin 2x$ is not in $\text{span}(\sin x, \cos x)$.

Remark It is true that $\sin 2x$ can be written in terms of $\sin x$ and $\cos x$. For example, we have the double angle formula $\sin 2x = 2 \sin x \cos x$. However, this is not a linear combination.
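The argument in Example 6.20 amounts to sampling the equation at chosen points. A floating-point sketch of the same idea follows: it solves for $c$ and $d$ from the samples at $x = 0$ and $x = \pi/2$, then tests a third point. The third sample point $x = \pi/4$ and all variable names are choices made for this illustration.

```python
import math

x1, x2 = 0.0, math.pi / 2

# 2x2 system from the two sample points:
#   c*sin(x1) + d*cos(x1) = sin(2*x1)
#   c*sin(x2) + d*cos(x2) = sin(2*x2)
s1, k1, b1 = math.sin(x1), math.cos(x1), math.sin(2 * x1)
s2, k2, b2 = math.sin(x2), math.cos(x2), math.sin(2 * x2)

det = s1 * k2 - k1 * s2
c = (b1 * k2 - k1 * b2) / det   # Cramer's rule
d = (s1 * b2 - s2 * b1) / det

# Both coefficients are zero up to rounding, yet sin 2x is not zero at x = pi/4.
x3 = math.pi / 4
residual = abs(c * math.sin(x3) + d * math.cos(x3) - math.sin(2 * x3))
print(c, d, residual)
```

The nonzero residual at the third point is the numerical counterpart of the contradiction in the text: no single pair $(c, d)$ works for all $x$.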
Example 6.21

In $M_{22}$, describe the span of $A = [\ldots]$, $B = [\ldots]$, and $C = [\ldots]$.

Solution Every linear combination of $A$, $B$, and $C$ is of the form

$$cA + dB + eC = [\ldots]$$

[...] $\mathbf{v}_1, \mathbf{v}_2, \ldots, \mathbf{v}_k$. Then, since $W$ is closed under addition and scalar multiplication, it contains every linear combination $c_1\mathbf{v}_1 + c_2\mathbf{v}_2 + \cdots + c_k\mathbf{v}_k$ of $\mathbf{v}_1, \mathbf{v}_2, \ldots, \mathbf{v}_k$. Therefore, $\text{span}(\mathbf{v}_1, \mathbf{v}_2, \ldots, \mathbf{v}_k)$ is contained in $W$.
Exercises 6.1

In Exercises 1-11, determine whether the given set, together with the specified operations of addition and scalar multiplication, is a vector space. If it is not, list all of the axioms that fail to hold.

1. The set of all vectors in $\mathbb{R}^2$ of the form [...], with the usual vector addition and scalar multiplication
2. The set of all vectors $\begin{bmatrix} x \\ y \end{bmatrix}$ in $\mathbb{R}^2$ with $x \ge 0$, $y \ge 0$ (i.e., the first quadrant), with the usual vector addition and scalar multiplication
3. The set of all vectors $\begin{bmatrix} x \\ y \end{bmatrix}$ in $\mathbb{R}^2$ with $xy \ge 0$ (i.e., the union of the first and third quadrants), with the usual vector addition and scalar multiplication
4. The set of all vectors $\begin{bmatrix} x \\ y \end{bmatrix}$ in $\mathbb{R}^2$ with $x \ge y$, with the usual vector addition and scalar multiplication
5. $\mathbb{R}^2$, with the usual addition but scalar multiplication defined by [...]
6. $\mathbb{R}^2$, with the usual scalar multiplication but addition defined by
$$\begin{bmatrix} x_1 \\ y_1 \end{bmatrix} + \begin{bmatrix} x_2 \\ y_2 \end{bmatrix} = \begin{bmatrix} x_1 + x_2 + 1 \\ y_1 + y_2 + 1 \end{bmatrix}$$
7. The set of all positive real numbers, with addition $\oplus$ defined by $x \oplus y = xy$ and scalar multiplication $\odot$ defined by $c \odot x = x^c$
8. The set of all rational numbers, with the usual addition and multiplication
9. The set of all upper triangular $2 \times 2$ matrices, with the usual matrix addition and scalar multiplication
10. The set of all $2 \times 2$ matrices of the form $\begin{bmatrix} a & b \\ c & d \end{bmatrix}$, where $ad = 0$, with the usual matrix addition and scalar multiplication
11. The set of all skew-symmetric $n \times n$ matrices, with the usual matrix addition and scalar multiplication (see Exercises 3.2).
12. Finish verifying that $\mathscr{P}_2$ is a vector space (see Example 6.3).
13. Finish verifying that $\mathscr{F}$ is a vector space (see Example 6.4).
In Exercises 14-17, determine whether the given set, together with the specified operations of addition and scalar multiplication, is a complex vector space. If it is not, list all of the axioms that fail to hold.

14. The set of all vectors in $\mathbb{C}^2$ of the form [...], with the usual vector addition and scalar multiplication
15. The set $M_{mn}(\mathbb{C})$ of all $m \times n$ complex matrices, with the usual matrix addition and scalar multiplication
16. The set $\mathbb{C}^2$, with the usual vector addition but scalar multiplication defined by [...]
17. $\mathbb{R}^n$, with the usual vector addition and scalar multiplication

In Exercises 18-21, determine whether the given set, together with the specified operations of addition and scalar multiplication, is a vector space over the indicated $\mathbb{Z}_p$. If it is not, list all of the axioms that fail to hold.

18. The set of all vectors in $\mathbb{Z}_2^n$ with an even number of 1s, over $\mathbb{Z}_2$ with the usual vector addition and scalar multiplication
19. The set of all vectors in $\mathbb{Z}_2^n$ with an odd number of 1s, over $\mathbb{Z}_2$ with the usual vector addition and scalar multiplication
20. The set $M_{mn}(\mathbb{Z}_p)$ of all $m \times n$ matrices with entries from $\mathbb{Z}_p$, over $\mathbb{Z}_p$ with the usual matrix addition and scalar multiplication
21. $\mathbb{Z}_6$, over $\mathbb{Z}_3$ with the usual addition and multiplication (Think this one through carefully!)
22. Prove Theorem 6.1(a).
23. Prove Theorem 6.1(c).

In Exercises 24-45, use Theorem 6.2 to determine whether $W$ is a subspace of $V$.
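25. $V = \mathbb{R}^3$, $W = [\ldots]$
26. $V = \mathbb{R}^3$, $W = \left\{ \begin{bmatrix} a \\ b \\ a + b + 1 \end{bmatrix} \right\}$
27. $V = \mathbb{R}^3$, $W = [\ldots]$
28. $V = M_{22}$, $W = [\ldots]$
29. $V = M_{22}$, $W = \left\{ \begin{bmatrix} a & b \\ c & d \end{bmatrix} : ad \ge bc \right\}$
30. $V = M_{nn}$, $W = \{A \text{ in } M_{nn} : \det A = 1\}$
31. $V = M_{nn}$, $W$ is the set of diagonal $n \times n$ matrices
32. $V = M_{nn}$, $W$ is the set of idempotent $n \times n$ matrices
33. $V = M_{nn}$, $W = \{A \text{ in } M_{nn} : AB = BA\}$, where $B$ is a given (fixed) matrix
34. $V = \mathscr{P}_2$, $W = \{bx + cx^2\}$
35. $V = \mathscr{P}_2$, $W = \{a + bx + cx^2 : a + b + c = 0\}$
36. $V = \mathscr{P}_2$, $W = \{a + bx + cx^2 : abc = 0\}$
37. $V = \mathscr{P}$, $W$ is the set of all polynomials of degree 3
38. $V = \mathscr{F}$, $W = \{f \text{ in } \mathscr{F} : f(-x) = f(x)\}$
39. $V = \mathscr{F}$, $W = \{f \text{ in } \mathscr{F} : f(-x) = -f(x)\}$
40. $V = \mathscr{F}$, $W = \{f \text{ in } \mathscr{F} : f(0) = 1\}$
41. $V = \mathscr{F}$, $W = \{f \text{ in } \mathscr{F} : f(0) = 0\}$
42. $V = \mathscr{F}$, $W$ is the set of all integrable functions
43. $V = \mathscr{D}$, $W = \{f \text{ in } \mathscr{D} : f'(x) \ge 0 \text{ for all } x\}$
44. $V = \mathscr{F}$, $W = \mathscr{C}^{(2)}$, the set of all functions with continuous second derivatives
45. $V = \mathscr{F}$, $W = \{f \text{ in } \mathscr{F} : \lim f(x) = \infty\}$
46. Let $V$ be a vector space with subspaces $U$ and $W$. Prove that $U \cap W$ is a subspace of $V$.
47. Let $V$ be a vector space with subspaces $U$ and $W$. Give an example with $V = \mathbb{R}^2$ to show that $U \cup W$ need not be a subspace of $V$.
48. Let $V$ be a vector space with subspaces $U$ and $W$. Define the sum of $U$ and $W$ to be
$$U + W = \{\mathbf{u} + \mathbf{w} : \mathbf{u} \text{ is in } U, \mathbf{w} \text{ is in } W\}$$
(a) If $V = \mathbb{R}^3$, $U$ is the $x$-axis, and $W$ is the $y$-axis, what is $U + W$?
(b) If $U$ and $W$ are subspaces of a vector space $V$, prove that $U + W$ is a subspace of $V$.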
49. If $U$ and $V$ are vector spaces, define the Cartesian product of $U$ and $V$ to be
$$U \times V = \{(\mathbf{u}, \mathbf{v}) : \mathbf{u} \text{ is in } U \text{ and } \mathbf{v} \text{ is in } V\}$$
Prove that $U \times V$ is a vector space.
50. Let $W$ be a subspace of a vector space $V$. Prove that $\Delta = \{(\mathbf{w}, \mathbf{w}) : \mathbf{w} \text{ is in } W\}$ is a subspace of $V \times V$.

In Exercises 51 and 52, let $A = [\ldots]$ and $B = [\ldots]$. Determine whether $C$ is in $\text{span}(A, B)$.

51. $C = [\ldots]$
52. $C = [\ldots]$

In Exercises 53 and 54, let $p(x) = 1 - 2x$, $q(x) = x - x^2$, and $r(x) = -2 + 3x + x^2$. Determine whether $s(x)$ is in $\text{span}(p(x), q(x), r(x))$.

53. $s(x) = 3 - 5x - x^2$
54. $s(x) = 1 + x + x^2$

In Exercises 55-58, let $f(x) = \sin^2 x$ and $g(x) = \cos^2 x$. Determine whether $h(x)$ is in $\text{span}(f(x), g(x))$.

55. $h(x) = 1$
56. $h(x) = \cos 2x$
57. $h(x) = \sin 2x$
58. $h(x) = \sin x$
59. Is $M_{22}$ spanned by [...]?
60. Is $M_{22}$ spanned by [...]?
61. Is $\mathscr{P}_2$ spanned by $1 + x$, $x + x^2$, $1 + x^2$?
62. Is $\mathscr{P}_2$ spanned by [...]?
63. Prove that every vector space has a unique zero vector.
64. Prove that for every vector $\mathbf{v}$ in a vector space $V$, there is a unique $\mathbf{v}'$ in $V$ such that $\mathbf{v} + \mathbf{v}' = \mathbf{0}$.
6.2 Linear Independence, Basis, and Dimension

In this section, we extend the notions of linear independence, basis, and dimension to general vector spaces, generalizing the results of Sections 2.3 and 3.5. In most cases, the proofs of the theorems carry over; we simply replace $\mathbb{R}^n$ by the vector space $V$.

Linear Independence

Definition A set of vectors $\{\mathbf{v}_1, \mathbf{v}_2, \ldots, \mathbf{v}_k\}$ in a vector space $V$ is linearly dependent if there are scalars $c_1, c_2, \ldots, c_k$, at least one of which is not zero, such that

$$c_1\mathbf{v}_1 + c_2\mathbf{v}_2 + \cdots + c_k\mathbf{v}_k = \mathbf{0}$$

A set of vectors that is not linearly dependent is said to be linearly independent.

As in $\mathbb{R}^n$, $\{\mathbf{v}_1, \mathbf{v}_2, \ldots, \mathbf{v}_k\}$ is linearly independent in a vector space $V$ if and only if

$$c_1\mathbf{v}_1 + c_2\mathbf{v}_2 + \cdots + c_k\mathbf{v}_k = \mathbf{0} \quad \text{implies} \quad c_1 = 0, c_2 = 0, \ldots, c_k = 0$$
441
Chapter 6
Vector Spaces
Ii
Theorem 6.4
A set of vectors l VI' V2" . • , v k } in a vector space Vis linearly dependent if and only if alieasl one of the vectors can be expressed as a linear combination of the others.
Prill
•
The proof is .dentical to that of Theorem 25.
As a spc • •• , e,,\ is a basis for R ~,
xl> I is a basis for qJ>~, called the sumdard btlSis for qJ> ,..
Xl,
I + .0}isaba si s for ~l'
Solution We have already shown that $\mathcal{B}$ is linearly independent, in Example 6.26. To show that $\mathcal{B}$ spans $\mathscr{P}_2$, let $a + bx + cx^2$ be an arbitrary polynomial in $\mathscr{P}_2$. We must show that there are scalars $c_1$, $c_2$, and $c_3$ such that

$$c_1(1 + x) + c_2(x + x^2) + c_3(1 + x^2) = a + bx + cx^2$$

or, equivalently,

$$(c_1 + c_3) + (c_1 + c_2)x + (c_2 + c_3)x^2 = a + bx + cx^2$$

Equating coefficients of like powers of $x$, we obtain the linear system

$$\begin{aligned}
c_1 + c_3 &= a \\
c_1 + c_2 &= b \\
c_2 + c_3 &= c
\end{aligned}$$

which has a solution, since the coefficient matrix

$$\begin{bmatrix} 1 & 0 & 1 \\ 1 & 1 & 0 \\ 0 & 1 & 1 \end{bmatrix}$$

has rank 3 and, hence, is invertible. (We do not need to know what the solution is; we only need to know that it exists.) Therefore, $\mathcal{B}$ is a basis for $\mathscr{P}_2$.
Remark Observe that the matrix

$$\begin{bmatrix} 1 & 0 & 1 \\ 1 & 1 & 0 \\ 0 & 1 & 1 \end{bmatrix}$$

is the key to Example 6.32. We can immediately obtain it using the correspondence between $\mathscr{P}_2$ and $\mathbb{R}^3$, as indicated in the Remark following Example 6.26.
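The invertibility claim can be checked by computing a determinant of the matrix whose columns are the coefficient vectors of the candidate basis. The sketch below uses a hand-written $3 \times 3$ cofactor expansion; the second matrix, built from the hypothetical set $\{1 + x,\ x + x^2,\ 1 + 2x + x^2\}$, is included only as a contrasting dependent case.

```python
def det3(m):
    """Determinant of a 3x3 matrix by cofactor expansion along the first row."""
    (a, b, c), (d, e, f), (g, h, i) = m
    return a * (e * i - f * h) - b * (d * i - f * g) + c * (d * h - e * g)

# Columns are the coefficient vectors of 1 + x, x + x^2, 1 + x^2.
basis_matrix = [[1, 0, 1],
                [1, 1, 0],
                [0, 1, 1]]

# Hypothetical contrast: the third polynomial 1 + 2x + x^2 is the sum of the first two.
dependent_matrix = [[1, 0, 1],
                    [1, 1, 2],
                    [0, 1, 1]]

print(det3(basis_matrix))       # nonzero, so the three polynomials form a basis of P_2
print(det3(dependent_matrix))   # zero, so that set is linearly dependent
```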
Example 6.33

Show that $\mathcal{B} = \{1, x, x^2, \ldots\}$ is a basis for $\mathscr{P}$.

Solution In Example 6.28, we saw that $\mathcal{B}$ is linearly independent. It also spans $\mathscr{P}$, since clearly every polynomial is a linear combination of (finitely many) powers of $x$.
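Example 6.34

Find bases for the three vector spaces in Example 6.13:

$$\text{(a) } W_1 = \left\{ \begin{bmatrix} a \\ b \\ -b \\ a \end{bmatrix} \right\} \qquad \text{(b) } W_2 = \{a + bx - bx^2 + ax^3\} \qquad \text{(c) } W_3 = \left\{ \begin{bmatrix} a & b \\ -b & a \end{bmatrix} \right\}$$

Solution Once again, we will work the three examples side by side to highlight the similarities among them. In a strong sense, they are all the same example, but it will take us until Section 6.5 to make this idea perfectly precise.

(a) Since

$$\begin{bmatrix} a \\ b \\ -b \\ a \end{bmatrix} = a\begin{bmatrix} 1 \\ 0 \\ 0 \\ 1 \end{bmatrix} + b\begin{bmatrix} 0 \\ 1 \\ -1 \\ 0 \end{bmatrix}$$

we have $W_1 = \text{span}(\mathbf{u}, \mathbf{v})$, where

$$\mathbf{u} = \begin{bmatrix} 1 \\ 0 \\ 0 \\ 1 \end{bmatrix} \quad \text{and} \quad \mathbf{v} = \begin{bmatrix} 0 \\ 1 \\ -1 \\ 0 \end{bmatrix}$$

Since $\{\mathbf{u}, \mathbf{v}\}$ is clearly linearly independent, it is also a basis for $W_1$.

(b) Since

$$a + bx - bx^2 + ax^3 = a(1 + x^3) + b(x - x^2)$$

we have $W_2 = \text{span}(u(x), v(x))$, where

$$u(x) = 1 + x^3 \quad \text{and} \quad v(x) = x - x^2$$

Since $\{u(x), v(x)\}$ is clearly linearly independent, it is also a basis for $W_2$.

(c) Since

$$\begin{bmatrix} a & b \\ -b & a \end{bmatrix} = a\begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix} + b\begin{bmatrix} 0 & 1 \\ -1 & 0 \end{bmatrix}$$

we have $W_3 = \text{span}(U, V)$, where

$$U = \begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix} \quad \text{and} \quad V = \begin{bmatrix} 0 & 1 \\ -1 & 0 \end{bmatrix}$$

Since $\{U, V\}$ is clearly linearly independent, it is also a basis for $W_3$.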
Coordinates

Section 3.5 introduced the idea of the coordinates of a vector with respect to a basis for subspaces of $\mathbb{R}^n$. We now extend this concept to arbitrary vector spaces.

Theorem 6.5

Let $V$ be a vector space and let $\mathcal{B}$ be a basis for $V$. For every vector $\mathbf{v}$ in $V$, there is exactly one way to write $\mathbf{v}$ as a linear combination of the basis vectors in $\mathcal{B}$.

Proof The proof is the same as the proof of Theorem 3.29. It works even if the basis $\mathcal{B}$ is infinite, since linear combinations are, by definition, finite.

The converse of Theorem 6.5 is also true. That is, if $\mathcal{B}$ is a set of vectors in a vector space $V$ with the property that every vector in $V$ can be written uniquely as a linear combination of the vectors in $\mathcal{B}$, then $\mathcal{B}$ is a basis for $V$ (see Exercise 30). In this sense, the unique representation property characterizes a basis. Since representation of a vector with respect to a basis is unique, the next definition makes sense.

Definition

Let $\mathcal{B} = \{\mathbf{v}_1, \mathbf{v}_2, \ldots, \mathbf{v}_n\}$ be a basis for a vector space $V$. Let $\mathbf{v}$ be a vector in $V$, and write $\mathbf{v} = c_1\mathbf{v}_1 + c_2\mathbf{v}_2 + \cdots + c_n\mathbf{v}_n$. Then $c_1, c_2, \ldots, c_n$ are called the coordinates of $\mathbf{v}$ with respect to $\mathcal{B}$, and the column vector

$$[\mathbf{v}]_{\mathcal{B}} = \begin{bmatrix} c_1 \\ c_2 \\ \vdots \\ c_n \end{bmatrix}$$

is called the coordinate vector of $\mathbf{v}$ with respect to $\mathcal{B}$.

Observe that if the basis $\mathcal{B}$ of $V$ has $n$ vectors, then $[\mathbf{v}]_{\mathcal{B}}$ is a (column) vector in $\mathbb{R}^n$.
Example 6.35

Find the coordinate vector $[p(x)]_{\mathcal{B}}$ of $p(x) = 2 - 3x + 5x^2$ with respect to the standard basis $\mathcal{B} = \{1, x, x^2\}$ of $\mathscr{P}_2$.

Solution The polynomial $p(x)$ is already a linear combination of $1$, $x$, and $x^2$, so

$$[p(x)]_{\mathcal{B}} = \begin{bmatrix} 2 \\ -3 \\ 5 \end{bmatrix}$$

This is the correspondence between $\mathscr{P}_2$ and $\mathbb{R}^3$ that we remarked on after Example 6.26, and it can easily be generalized to show that the coordinate vector of a polynomial

$$p(x) = a_0 + a_1 x + a_2 x^2 + \cdots + a_n x^n \text{ in } \mathscr{P}_n$$

with respect to the standard basis $\mathcal{B} = \{1, x, x^2, \ldots, x^n\}$ is just the vector

$$[p(x)]_{\mathcal{B}} = \begin{bmatrix} a_0 \\ a_1 \\ \vdots \\ a_n \end{bmatrix} \text{ in } \mathbb{R}^{n+1}$$

Remark The order in which the basis vectors appear in $\mathcal{B}$ affects the order of the entries in a coordinate vector. For example, in Example 6.35, assume that the standard basis vectors are ordered as $\mathcal{B}' = \{x^2, x, 1\}$. Then the coordinate vector of $p(x) = 2 - 3x + 5x^2$ with respect to $\mathcal{B}'$ is

$$[p(x)]_{\mathcal{B}'} = \begin{bmatrix} 5 \\ -3 \\ 2 \end{bmatrix}$$

Example 6.36

Find the coordinate vector $[A]_{\mathcal{B}}$ of $A = \begin{bmatrix} 2 & -1 \\ 4 & 3 \end{bmatrix}$ with respect to the standard basis $\mathcal{B} = \{E_{11}, E_{12}, E_{21}, E_{22}\}$ of $M_{22}$.

Solution Since

$$A = 2E_{11} - E_{12} + 4E_{21} + 3E_{22}$$

we have

$$[A]_{\mathcal{B}} = \begin{bmatrix} 2 \\ -1 \\ 4 \\ 3 \end{bmatrix}$$

This is the correspondence between $M_{22}$ and $\mathbb{R}^4$ that we noted before the introduction to Example 6.13. It too can easily be generalized to give a correspondence between $M_{mn}$ and $\mathbb{R}^{mn}$.
Example 6.37

Find the coordinate vector $[p(x)]_{\mathcal{C}}$ of $p(x) = 1 + 2x - x^2$ with respect to the basis $\mathcal{C} = \{1 + x, x + x^2, 1 + x^2\}$ of $\mathscr{P}_2$.

Solution We need to find $c_1$, $c_2$, and $c_3$ such that

$$c_1(1 + x) + c_2(x + x^2) + c_3(1 + x^2) = 1 + 2x - x^2$$

or, equivalently,

$$(c_1 + c_3) + (c_1 + c_2)x + (c_2 + c_3)x^2 = 1 + 2x - x^2$$

As in Example 6.32, this means we need to solve the system

$$\begin{aligned}
c_1 + c_3 &= 1 \\
c_1 + c_2 &= 2 \\
c_2 + c_3 &= -1
\end{aligned}$$

whose solution is found to be $c_1 = 2$, $c_2 = 0$, $c_3 = -1$. Therefore,

$$[p(x)]_{\mathcal{C}} = \begin{bmatrix} 2 \\ 0 \\ -1 \end{bmatrix}$$

(Since this result says that $p(x) = 2(1 + x) - (1 + x^2)$, it is easy to check that it is correct.)
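Finding a coordinate vector comes down to solving a square linear system, which can be done exactly with rational arithmetic. The sketch below uses a small Gauss-Jordan routine written for this illustration (the function name `solve` is hypothetical); it assumes the coefficient matrix is invertible, as it is for a basis.

```python
from fractions import Fraction

def solve(matrix, rhs):
    """Solve a square system by Gauss-Jordan elimination over the rationals.

    Assumes the matrix is invertible; otherwise no pivot is found and an error is raised.
    """
    n = len(matrix)
    aug = [[Fraction(x) for x in row] + [Fraction(b)] for row, b in zip(matrix, rhs)]
    for col in range(n):
        # Find a row at or below the diagonal with a nonzero entry in this column.
        pivot = next(r for r in range(col, n) if aug[r][col] != 0)
        aug[col], aug[pivot] = aug[pivot], aug[col]
        # Scale the pivot row so that the pivot entry becomes 1.
        aug[col] = [x / aug[col][col] for x in aug[col]]
        # Clear the rest of the column.
        for r in range(n):
            if r != col:
                factor = aug[r][col]
                aug[r] = [x - factor * y for x, y in zip(aug[r], aug[col])]
    return [row[-1] for row in aug]

# Columns: coefficient vectors of 1 + x, x + x^2, 1 + x^2 in the order (1, x, x^2).
C_matrix = [[1, 0, 1],
            [1, 1, 0],
            [0, 1, 1]]
coords = solve(C_matrix, [1, 2, -1])   # p(x) = 1 + 2x - x^2
print(coords)
```

Using `Fraction` avoids rounding error, so the computed coordinates can be compared directly with the values found by hand.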
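The next theorem shows that the process of forming coordinate vectors is compatible with the vector space operations of addition and scalar multiplication.

Theorem 6.6

Let $\mathcal{B} = \{\mathbf{v}_1, \mathbf{v}_2, \ldots, \mathbf{v}_n\}$ be a basis for a vector space $V$. Let $\mathbf{u}$ and $\mathbf{v}$ be vectors in $V$ and let $c$ be a scalar. Then

a. $[\mathbf{u} + \mathbf{v}]_{\mathcal{B}} = [\mathbf{u}]_{\mathcal{B}} + [\mathbf{v}]_{\mathcal{B}}$
b. $[c\mathbf{u}]_{\mathcal{B}} = c[\mathbf{u}]_{\mathcal{B}}$

Proof We begin by writing $\mathbf{u}$ and $\mathbf{v}$ in terms of the basis vectors, say, as

$$\mathbf{u} = c_1\mathbf{v}_1 + c_2\mathbf{v}_2 + \cdots + c_n\mathbf{v}_n \quad \text{and} \quad \mathbf{v} = d_1\mathbf{v}_1 + d_2\mathbf{v}_2 + \cdots + d_n\mathbf{v}_n$$

Then, using vector space properties, we have

$$\mathbf{u} + \mathbf{v} = (c_1 + d_1)\mathbf{v}_1 + (c_2 + d_2)\mathbf{v}_2 + \cdots + (c_n + d_n)\mathbf{v}_n$$

and

$$c\mathbf{u} = (cc_1)\mathbf{v}_1 + (cc_2)\mathbf{v}_2 + \cdots + (cc_n)\mathbf{v}_n$$

so

$$[\mathbf{u} + \mathbf{v}]_{\mathcal{B}} = \begin{bmatrix} c_1 + d_1 \\ c_2 + d_2 \\ \vdots \\ c_n + d_n \end{bmatrix} = \begin{bmatrix} c_1 \\ c_2 \\ \vdots \\ c_n \end{bmatrix} + \begin{bmatrix} d_1 \\ d_2 \\ \vdots \\ d_n \end{bmatrix} = [\mathbf{u}]_{\mathcal{B}} + [\mathbf{v}]_{\mathcal{B}}$$

and

$$[c\mathbf{u}]_{\mathcal{B}} = \begin{bmatrix} cc_1 \\ cc_2 \\ \vdots \\ cc_n \end{bmatrix} = c\begin{bmatrix} c_1 \\ c_2 \\ \vdots \\ c_n \end{bmatrix} = c[\mathbf{u}]_{\mathcal{B}}$$

An easy corollary to Theorem 6.6 states that coordinate vectors preserve linear combinations:

$$[c_1\mathbf{u}_1 + \cdots + c_k\mathbf{u}_k]_{\mathcal{B}} = c_1[\mathbf{u}_1]_{\mathcal{B}} + \cdots + c_k[\mathbf{u}_k]_{\mathcal{B}} \qquad (1)$$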
(I )
You are asked to prove this corollary in Exercise 3 1. The most useful aspe l ' 51. Findabasisfor span (J  x,x X l , I  x 2 , 1  2x+ X l ) in f!J>2' 52. Fi nd a basis for
span ([~ ~]. [~
'] [' ']
O·
[  1,  I' ]) ;nM".
53. Find a baSIS for span(sin1x, cos 2x, cos 2x) in ~.
55. Let S == {VI" ..• v,,} be a spanning set for a vector space V. Show that ifv" IS in span (v l •• .. , V,,_ I)' then S' = {VI" .. • v n I} is still a spann ing set for V. 56. Prove Theorem 6. IO(f).
I
 I '
58. Let {Vi' ...• v ..} be a basis fora vector space V. Prove thai
{VI' VI + Vl, VI + v1 + V,' ...• VI + ... + v,,} is also a basis for V.
Let (Ie, Ill>' • . ,(I" be n + I dis/mel rea/ nllmbers. Defil1c polynomials pJ"x), plx), . . .• p.(x) by .I ) _
(x  IlO) ". (x  a' _I)(x  a,+I)'" (x  a. )
p" .'  (a, 
"0) . . (a,  (1, _1)( (I,

a,+I)' " (a,  an)
These are all/ell tlte lAgrange polY'lOmials associate,' with tIo, Ill" •. , an' IJoseph·wllis Lagmllse ( 1736 1813) was hom i,1 ffaly bllt spent most of his life ill GermallY ami Frallce. He made important cOllf riblltioll$ to mel! fields as /lumber theory, algebra, astronomy. mechanics, and the calculus of variatiOllS. I" 1773, lAgrnnge WtlS tile first to give the volume iflterpreltl tioll of a determinant (see Chapter 4).1 59. (a) Compute the Lagrange polynomials associ31cd With a." = l ,u 1 = 2, ° 2 :: 3. (b) Show, in general, that
p,(a,) =
t
ifi "'} if j = j
60. (a) Prove that the set 13 = {Alx), plx), .. . , p,,(x)} of Lagrange polynom ials is linearly independent III ~,..IHjm:Se t GJAix) + ... + c"p,,(x) == Oand use Exercise 59(b).] (b) Deduce that B IS a basis fo r flI'". 6 1. If q(x) is an arbit rary polynomial in qp M it follows from Exercise 6O(b) that
q(x) = '>Po(x ) + ... + "p,(x) for some sca l ars~, ... , c,..
(l )
(a) Show that c, == q(a,) for i = 0•. .. , n. and deduce th .. q(x) = q(a,)p,(x) + ... + q(a. )pJx) ;sth, unique represen tation of q(x) with respect to the basis B.
Section 6.2 Linear Independence, Basis, and Dimension

(b) Show that for any $n + 1$ points $(a_0, c_0), (a_1, c_1), \ldots, (a_n, c_n)$ with distinct first components, the function $q(x)$ defined by equation (1) is the unique polynomial of degree at most $n$ that passes through all of the points. This formula is known as the Lagrange interpolation formula. (Compare this formula with Problem 19 in Exploration: Geometric Applications of Determinants in Chapter 4.)
(c) Use the Lagrange interpolation formula to find the polynomial of degree at most 2 that passes through the points
(i) $(1, 6)$, $(2, -1)$, and $(3, -2)$
(ii) $(-1, 10)$, $(0, 5)$, and $(3, 2)$
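The construction in Exercises 59 to 61 can be sketched in a few lines of Python. This is only an illustrative sketch: the helper names (`poly_mul`, `lagrange_basis`, `interpolate`) and the sample points are assumptions made for the illustration, not part of the text. Polynomials are stored as coefficient lists with the constant term first, and exact fractions are used so that no rounding occurs.

```python
from fractions import Fraction

def poly_mul(p, q):
    """Multiply two polynomials given as coefficient lists (constant term first)."""
    out = [Fraction(0)] * (len(p) + len(q) - 1)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            out[i + j] += a * b
    return out

def lagrange_basis(nodes, i):
    """Coefficients of p_i(x): equal to 1 at nodes[i] and 0 at every other node."""
    coeffs = [Fraction(1)]
    for j, a_j in enumerate(nodes):
        if j != i:
            denom = Fraction(nodes[i] - a_j)
            # Multiply by the factor (x - a_j) / (a_i - a_j).
            coeffs = poly_mul(coeffs, [-a_j / denom, 1 / denom])
    return coeffs

def interpolate(points):
    """q(x) = c_0 p_0(x) + ... + c_n p_n(x) for points (a_i, c_i), as in equation (1)."""
    nodes = [a for a, _ in points]
    q = [Fraction(0)] * len(points)
    for i, (_, c_i) in enumerate(points):
        p_i = lagrange_basis(nodes, i)
        q = [q_k + c_i * p_k for q_k, p_k in zip(q, p_i)]
    return q

# Hypothetical sample data (not taken from the exercises).
points = [(0, 1), (1, 3), (2, 11)]
q = interpolate(points)
print(q)  # coefficients of 1 - x + 3x^2
```

Evaluating the returned polynomial at each first component recovers the corresponding second component, which is the content of Exercise 61(a).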
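62. Use the Lagrange interpolation formula to show that if a polynomial in …

… $P_{\mathcal{C} \leftarrow \mathcal{B}}$ and $P_{\mathcal{B} \leftarrow \mathcal{C}}$ for the bases $\mathcal{B} = \{1, x, x^2\}$ and $\mathcal{C} = \{1 + x, x + x^2, 1 + x^2\}$ of $\mathscr{P}_2$. Then find the coordinate vector of $p(x) = 1 + 2x - x^2$ with respect to $\mathcal{C}$.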
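Solution
Changing to a standard basis is easy, so we find $P_{\mathcal{B} \leftarrow \mathcal{C}}$ first. Observe that the coordinate vectors for $\mathcal{C}$ in terms of $\mathcal{B}$ are

$$[1 + x]_{\mathcal{B}} = \begin{bmatrix} 1 \\ 1 \\ 0 \end{bmatrix}, \quad [x + x^2]_{\mathcal{B}} = \begin{bmatrix} 0 \\ 1 \\ 1 \end{bmatrix}, \quad [1 + x^2]_{\mathcal{B}} = \begin{bmatrix} 1 \\ 0 \\ 1 \end{bmatrix}$$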
Section 6.3
Change of lJasis
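(Look back at the Remark following Example 6.26.) It follows that

$$P_{\mathcal{B} \leftarrow \mathcal{C}} = \begin{bmatrix} 1 & 0 & 1 \\ 1 & 1 & 0 \\ 0 & 1 & 1 \end{bmatrix}$$

To find $P_{\mathcal{C} \leftarrow \mathcal{B}}$, we could express each vector in $\mathcal{B}$ as a linear combination of the vectors in $\mathcal{C}$ (do this), but it is much easier to use the fact that $P_{\mathcal{C} \leftarrow \mathcal{B}} = (P_{\mathcal{B} \leftarrow \mathcal{C}})^{-1}$, by Theorem 6.12(c). We find that

$$P_{\mathcal{C} \leftarrow \mathcal{B}} = (P_{\mathcal{B} \leftarrow \mathcal{C}})^{-1} = \begin{bmatrix} \tfrac12 & \tfrac12 & -\tfrac12 \\ -\tfrac12 & \tfrac12 & \tfrac12 \\ \tfrac12 & -\tfrac12 & \tfrac12 \end{bmatrix}$$

It now follows that

$$[p(x)]_{\mathcal{C}} = P_{\mathcal{C} \leftarrow \mathcal{B}}\,[p(x)]_{\mathcal{B}} = \begin{bmatrix} \tfrac12 & \tfrac12 & -\tfrac12 \\ -\tfrac12 & \tfrac12 & \tfrac12 \\ \tfrac12 & -\tfrac12 & \tfrac12 \end{bmatrix} \begin{bmatrix} 1 \\ 2 \\ -1 \end{bmatrix} = \begin{bmatrix} 2 \\ 0 \\ -1 \end{bmatrix}$$

which agrees with Example 6.37.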
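The computation in this example can be checked numerically. The following is a minimal sketch, assuming the matrices are stored as nested Python lists; the helper names `inverse` and `mat_vec` are hypothetical and introduced only for this illustration. Exact fractions are used so the entries $\pm\tfrac12$ appear without rounding.

```python
from fractions import Fraction

def inverse(matrix):
    """Invert a square matrix by Gauss-Jordan elimination on [M | I].

    Assumes the matrix is invertible; a singular input raises StopIteration.
    """
    n = len(matrix)
    aug = [[Fraction(x) for x in row] + [Fraction(int(i == j)) for j in range(n)]
           for i, row in enumerate(matrix)]
    for col in range(n):
        # Find a row with a nonzero entry in this column and swap it into place.
        pivot = next(r for r in range(col, n) if aug[r][col] != 0)
        aug[col], aug[pivot] = aug[pivot], aug[col]
        piv = aug[col][col]
        aug[col] = [x / piv for x in aug[col]]
        # Clear the column in every other row.
        for r in range(n):
            if r != col:
                factor = aug[r][col]
                aug[r] = [x - factor * y for x, y in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]

def mat_vec(matrix, vec):
    """Matrix-vector product."""
    return [sum(a * b for a, b in zip(row, vec)) for row in matrix]

# Columns are the coordinate vectors of 1 + x, x + x^2, 1 + x^2 relative to {1, x, x^2}.
P_B_from_C = [[1, 0, 1],
              [1, 1, 0],
              [0, 1, 1]]
P_C_from_B = inverse(P_B_from_C)

p_B = [1, 2, -1]          # p(x) = 1 + 2x - x^2 relative to {1, x, x^2}
p_C = mat_vec(P_C_from_B, p_B)
print(p_C)                # coordinates of p(x) relative to C
```

The result says that $p(x) = 2(1 + x) + 0(x + x^2) - (1 + x^2)$, which expands back to $1 + 2x - x^2$.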
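Remark
If we do not need $P_{\mathcal{C} \leftarrow \mathcal{B}}$ explicitly, we can find $[p(x)]_{\mathcal{C}}$ from $[p(x)]_{\mathcal{B}}$ and $P_{\mathcal{B} \leftarrow \mathcal{C}}$ using Gaussian elimination. Row reduction produces

$$\left[\, P_{\mathcal{B} \leftarrow \mathcal{C}} \;\middle|\; [p(x)]_{\mathcal{B}} \,\right] \longrightarrow \left[\, I \;\middle|\; [p(x)]_{\mathcal{C}} \,\right]$$

(See the next section on using Gauss-Jordan elimination.)

It is worth repeating the observation in Example 6.46: Changing to a standard basis is easy. If $\mathcal{E}$ is the standard basis for a vector space $V$ and $\mathcal{B}$ is any other basis, then the columns of $P_{\mathcal{E} \leftarrow \mathcal{B}}$ are the coordinate vectors of $\mathcal{B}$ with respect to $\mathcal{E}$, and these are usually "visible." We make use of this observation again in the next example.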
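The row-reduction shortcut described in the Remark can be sketched as follows. This is an illustration under stated assumptions: the function name `solve` is hypothetical, the coefficient matrix is assumed invertible, and the data are those of Example 6.46.

```python
from fractions import Fraction

def solve(matrix, rhs):
    """Solve matrix * x = rhs by Gauss-Jordan elimination on the augmented matrix.

    Assumes the coefficient matrix is square and invertible.
    """
    n = len(matrix)
    aug = [[Fraction(x) for x in row] + [Fraction(b)] for row, b in zip(matrix, rhs)]
    for col in range(n):
        pivot = next(r for r in range(col, n) if aug[r][col] != 0)
        aug[col], aug[pivot] = aug[pivot], aug[col]
        piv = aug[col][col]
        aug[col] = [x / piv for x in aug[col]]
        for r in range(n):
            if r != col:
                factor = aug[r][col]
                aug[r] = [x - factor * y for x, y in zip(aug[r], aug[col])]
    # The left block is now the identity; the last column holds the solution.
    return [row[n] for row in aug]

P_B_from_C = [[1, 0, 1],
              [1, 1, 0],
              [0, 1, 1]]
p_B = [1, 2, -1]
p_C = solve(P_B_from_C, p_B)
print(p_C)
```

Reducing a single augmented column avoids computing the full inverse, which is the point of the Remark.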
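In $M_{22}$, let $\mathcal{B}$ be the basis $\{E_{11}, E_{21}, E_{12}, E_{22}\}$ and let $\mathcal{C}$ be the basis $\{A, B, C, D\}$, where

$$A = \begin{bmatrix} 1 & 0 \\ 0 & 0 \end{bmatrix}, \quad B = \begin{bmatrix} 1 & 1 \\ 0 & 0 \end{bmatrix}, \quad C = \begin{bmatrix} 1 & 1 \\ 1 & 0 \end{bmatrix}, \quad D = \begin{bmatrix} 1 & 1 \\ 1 & 1 \end{bmatrix}$$

Find the change-of-basis matrix $P_{\mathcal{C} \leftarrow \mathcal{B}}$ and verify that $[X]_{\mathcal{C}} = P_{\mathcal{C} \leftarrow \mathcal{B}}[X]_{\mathcal{B}}$ for $X = \begin{bmatrix} 1 & 2 \\ 3 & 4 \end{bmatrix}$.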
Chapter 6
Vector Spaces
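Solution 1
To solve this problem directly, we must find the coordinate vectors of $\mathcal{B}$ with respect to $\mathcal{C}$. This involves solving four linear combination problems of the form $X = aA + bB + cC + dD$, where $X$ is in $\mathcal{B}$ and we must find $a$, $b$, $c$, and $d$. However, here we are lucky, since we can find the required coefficients by inspection. Clearly, $E_{11} = A$, $E_{21} = -B + C$, $E_{12} = -A + B$, and $E_{22} = -C + D$. Thus,

$$[E_{11}]_{\mathcal{C}} = \begin{bmatrix} 1 \\ 0 \\ 0 \\ 0 \end{bmatrix}, \quad [E_{21}]_{\mathcal{C}} = \begin{bmatrix} 0 \\ -1 \\ 1 \\ 0 \end{bmatrix}, \quad [E_{12}]_{\mathcal{C}} = \begin{bmatrix} -1 \\ 1 \\ 0 \\ 0 \end{bmatrix}, \quad [E_{22}]_{\mathcal{C}} = \begin{bmatrix} 0 \\ 0 \\ -1 \\ 1 \end{bmatrix}$$

so

$$P_{\mathcal{C} \leftarrow \mathcal{B}} = \begin{bmatrix} 1 & 0 & -1 & 0 \\ 0 & -1 & 1 & 0 \\ 0 & 1 & 0 & -1 \\ 0 & 0 & 0 & 1 \end{bmatrix}$$

If $X = \begin{bmatrix} 1 & 2 \\ 3 & 4 \end{bmatrix}$, then

$$[X]_{\mathcal{B}} = \begin{bmatrix} 1 \\ 3 \\ 2 \\ 4 \end{bmatrix}$$

and

$$P_{\mathcal{C} \leftarrow \mathcal{B}}[X]_{\mathcal{B}} = \begin{bmatrix} 1 & 0 & -1 & 0 \\ 0 & -1 & 1 & 0 \\ 0 & 1 & 0 & -1 \\ 0 & 0 & 0 & 1 \end{bmatrix} \begin{bmatrix} 1 \\ 3 \\ 2 \\ 4 \end{bmatrix} = \begin{bmatrix} -1 \\ -1 \\ -1 \\ 4 \end{bmatrix}$$

This is the coordinate vector with respect to $\mathcal{C}$ of the matrix

$$-A - B - C + 4D = -\begin{bmatrix} 1 & 0 \\ 0 & 0 \end{bmatrix} - \begin{bmatrix} 1 & 1 \\ 0 & 0 \end{bmatrix} - \begin{bmatrix} 1 & 1 \\ 1 & 0 \end{bmatrix} + 4\begin{bmatrix} 1 & 1 \\ 1 & 1 \end{bmatrix} = \begin{bmatrix} 1 & 2 \\ 3 & 4 \end{bmatrix} = X$$

as it should be.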
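The verification in Solution 1 can be replayed mechanically. The sketch below assumes the four coordinate vectors found by inspection above; the variable names and the helper `combine` are hypothetical and introduced only for this check.

```python
# Basis C = {A, B, C, D} of M_22, stored as nested lists.
A = [[1, 0], [0, 0]]
B = [[1, 1], [0, 0]]
C = [[1, 1], [1, 0]]
D = [[1, 1], [1, 1]]

# Coordinate vectors [E11]_C, [E21]_C, [E12]_C, [E22]_C, found by inspection.
columns = [[1, 0, 0, 0], [0, -1, 1, 0], [-1, 1, 0, 0], [0, 0, -1, 1]]
# The change-of-basis matrix has these vectors as its columns.
P_C_from_B = [[col[i] for col in columns] for i in range(4)]

X = [[1, 2], [3, 4]]
# Coordinates of X relative to the ordered basis E11, E21, E12, E22.
X_B = [X[0][0], X[1][0], X[0][1], X[1][1]]
X_C = [sum(a * b for a, b in zip(row, X_B)) for row in P_C_from_B]

def combine(coeffs, mats):
    """Linear combination of 2x2 matrices with the given coefficients."""
    return [[sum(c * m[i][j] for c, m in zip(coeffs, mats)) for j in range(2)]
            for i in range(2)]

rebuilt = combine(X_C, [A, B, C, D])
print(X_C, rebuilt)
```

Rebuilding the matrix from its $\mathcal{C}$-coordinates and comparing with $X$ is the same check the text performs by hand.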
Solution 2
We can compute $P_{\mathcal{C}}$ …

… and compare your answer with the one found in part (a).
3. x =
1
1
o
0 ,6 =
0 , 0
1 , 0
1 1 c~
1
,
1
0
0
1
, 0
1
1
3
4.x =
I ,8
5 c ~
1
I ,
o,
0
0
1
0
0
I ,
I , 0
0
1
Pco B
I
in
~j
1
=[_: ~ J
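16. Let $\mathcal{B}$ and $\mathcal{C}$ be bases for $\mathscr{P}_2$. If $\mathcal{B} = \{x, 1 + x, 1 - x + x^2\}$ and the change-of-basis matrix from $\mathcal{B}$ to $\mathcal{C}$ is

$$\begin{bmatrix} 1 & 0 & 0 \\ 0 & 2 & 1 \\ -1 & 1 & 1 \end{bmatrix}$$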
find C. Xl,
X + x 2},
C= {I, 1 + x,xl}in QJ>2
In calculus, you learn that a Taylor polynomial of degree $n$
In Exercises 9 and 10, follow the instructions for Exercises 1–4 using $A$ instead of $x$.
~ {[~
{ [ ~], [ ~]} and
find 6 .
8.p(x) = 4  2x x 2,6= {x, 1+
C
In Exercises 11 and 12, follow the instructions for Exercises 1–4 using $f(x)$ instead of $x$.
11. $f(x) = 2\sin x - 3\cos x$, $\mathcal{B} = \{\sin x + \cos x, \cos x\}$, $\mathcal{C} = \{\sin x, \cos x\}$ in span($\sin x$, $\cos x$)
12. $f(x) = \sin x$, $\mathcal{B} = \{\sin x + \cos x, \cos x\}$, $\mathcal{C} = \{\cos x - \sin x, \sin x + \cos x\}$ in span($\sin x$, $\cos x$)
the changeofbasis matrix from B to C is
,
C={I,x,x Z}in~2
o
~ {[~ ~]. [~ C ~ {[~ :J. [ :
B
15. U=I Band C be bases for R'. If C =
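In Exercises 5–8, follow the instructions for Exercises 1–4 using $p(x)$ instead of $x$.
5. $p(x) = 2 - x$, $\mathcal{B} = \{1, x\}$, $\mathcal{C} = \{x, 1 + x\}$ in $\mathscr{P}_1$
6. $p(x) = 1 + 3x$, $\mathcal{B} = \{1 + x, 1 - x\}$, $\mathcal{C} = \{2x, 4\}$ in $\mathscr{P}_1$
7. $p(x) = 1 + x^2$, $\mathcal{B} = \{1 + x + x^2, x + x^2, x^2\}$,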
9. A = [ 4
:].
14. Repeat Exercise 13 with $\theta = 135°$.
in W
o
1
,
1
0 ~
415
13. Rotate the $xy$-axes in the plane counterclockwise through an angle $\theta = 60°$ to obtain new $x'y'$-axes. Use the methods of this section to find (a) the $x'y'$-coordinates of the point whose $xy$-coordinates are $(3, 2)$ and (b) the $xy$-coordinates of the point whose $x'y'$-coordinates are $(4, -4)$.
0
o
IO. A=[ :
about $a$ is a polynomial of the form
2 ], 6 = the standard basis,
$a_0 + a_1(x - a) + a_2(x - a)^2 + \cdots + a_n(x - a)^n$ where $a_n \neq 0$. In other words, it is a polynomial that has
:J. [:
been expanded in terms of powers of $x - a$ instead of powers of $x$. Taylor polynomials are very useful for approximating functions that are "well behaved" near $x = a$.
 1
~J. [~
:J. [~
~]} in M"
p(x)
=
at
The set $\mathcal{B} = \{1, x - a, (x - a)^2, \ldots, (x - a)^n\}$ is a basis for $\mathscr{P}_n$ for any real number $a$. (Do you see a quick way to show this? Try using Theorem 6.7.) This fact allows us to use the techniques of this section to rewrite a polynomial as a Taylor polynomial about a given $a$.
21. Let $\mathcal{B}$, $\mathcal{C}$, and $\mathcal{D}$ be bases for a finite-dimensional vector space $V$. Prove that
22. Let $V$ be an $n$-dimensional ve…

… $\mathscr{P}_n \to \mathscr{P}_n$ and $T: \mathscr{P}_n \to \mathscr{P}_n$ by

$$S(p(x)) = p(x + 1) \quad \text{and} \quad T(p(x)) = p'(x)$$

Find $(S \circ T)(p(x))$ and $(T \circ S)(p(x))$. [Hint: Remember the Chain Rule.]

28. Define linear transformations $S: \mathscr{P}_n \to \mathscr{P}_n$ and $T: \mathscr{P}_n \to \mathscr{P}_n$ by

$$S(p(x)) = p(x + 1) \quad \text{and} \quad T(p(x)) = x\,p'(x)$$

Find $(S \circ T)(p(x))$ and $(T \circ S)(p(x))$.
y
y
3x
x y  3x+4y
y
and T :1R1+1R2
1
30. S: Q/' I + eJ> I d efined by S( a + bx) = (  4a and T: 2 ~ [R2 be the linear t.ransformation defined by
+
bx
+ a! }
=
c ["b + b]
B ~[
[~l
 I
I] 1
1I]
 I
1
~ 13.
(ii) x  x 2
(iii) I
+x
x2
(b) Which, if any, of thc following vecto rs are in range( T)? (i)
1
12. T: M22 ~ MZ2 defined by T(A} = AB  BA, \",here
(a) Which, if any, of the following polynomials arc in ker(T)? (i ) 1 + x
T ~,+R' d'finedbyT(p(x» ~ [~~~n
11. T: M22 ~ M12 defined by 'It A) "" AB, where
(iii) I/ Vi
(cl Describe ker(T) and range( T).
'{{a
T:Mu _ W definedbYT[~ ~] = [~ = ;]
(ii)
[~l
(iii)
[~l
(c) Describe ker(T) and range(T). 4. Let T: 'lP 1 + 'lP 1 be the linear t ransformatIon defined by T(p(x» ~ xp'(x),
T : 1J>2_IR defi ned by T(p(x)) = p'(O) 14. T:MJJ ~MjJ de fi nedby 'J1A} = A  AT
In Exercises 15–20, determine whether the linear transformation $T$ is (a) one-to-one and (b) onto.
15. $T: \mathbb{R}^2 \to \mathbb{R}^2$ defined by $T\begin{bmatrix} x \\ y \end{bmatrix} = \begin{bmatrix} 2x - y \\ x + 2y \end{bmatrix}$
31. Show that $\mathscr{C}[0, 1] \cong \mathscr{C}[0, 2]$.
32. Show that $\mathscr{C}[a, b] \cong \mathscr{C}[c, d]$ for all $a < b$ and $c < d$.
x  2y 3x
+Y
x +y 2a  b
a + b  3c
18. H I" +11' dcfi ncd by 'I11'(x» = n
19. T : llt~ M22 d efi n edby T U
,
[;~~n
a+b+ [ b  2c
C
2']
,
(a) Prove that if $S$ and $T$ are both one-to-one, so is $S \circ T$. (b) Prove that if $S$ and $T$ are both onto, so is $S \circ T$.
34. Let $S: V \to W$ and $T: U \to V$ be linear transformations.
__ ["" +bb bb + ,,]
(a) Prove that if $S \circ T$ is one-to-one, so is $T$. (b) Prove that if $S \circ T$ is onto, so is $S$.
35. Let $T: V \to W$ be a linear transformation between two finite-dimensional vector spaces.
a 20. T: R J '" W defi ned by T b
33. Let $S: V \to W$ and $T: U \to V$ be linear transformations.
=
b, where W is the vector space o f a ,
all symmet ric 2 X2 matrices
(a) Prove that if $\dim V < \dim W$, then $T$ cannot be onto. (b) Prove that if $\dim V > \dim W$, then $T$ cannot be one-to-one.
36. Let $a_0, a_1, \ldots, a_n$ be $n + 1$ distinct real numbers. Define $T: \mathscr{P}_n \to \mathbb{R}^{n+1}$ by
In Exercises 21–26, determine whether $V$ and $W$ are
T(p(x) =
isomorphic. If they are, give an explicit isomorphism $T: V \to W$.
22. $V = S_3$ (symmetric $3 \times 3$ matrices), $W = U_3$ (upper triangular $3 \times 3$ matrices)
53 (skcw
24. V = !j>,. IV = (P(x) in !j>" P(O ) = 0) •• 101
25. V = C, W = R2 26. V = {A in M" , u(A) = 0). W =
Il'
27. Show that $T: \mathscr{P}_n \to \mathscr{P}_n$ defined by $T(p(x)) = p(x) + p'(x)$ is an isomorphism.
28. Show that $T: \mathscr{P}_n \to \mathscr{P}_n$ defined by $T(p(x)) = p(x - 2)$ is an isomorphism.
29. Show that $T: \mathscr{P}_n \to \mathscr{P}_n$ defined by $T(p(x)) = x^n p\!\left(\tfrac{1}{x}\right)$ is an isomorphism.
30. (a) Show that $\mathscr{C}[0, 1] \cong \mathscr{C}[2, 3]$. [Hint: Define $T: \mathscr{C}[0, 1] \to \mathscr{C}[2, 3]$ by letting $T(f)$ be the function whose value at $x$ is $(T(f))(x) = f(x - 2)$ for $x$ in $[2, 3]$.]
(b) Show that …

… appears, on the surface, to be a calculus problem. We will explore this idea further in Example 6.83.
Example 6.80
Let V be an ,,·dimensional vector space and let I be tne identity transformation on V. What is the matrix of I with respc, + R' d,fin, d by T( p(x» B  {x' ,x, I},e V ""
7. T:
p(x) = a + bx
,
{:] 
th" cos x. t h" Sin x ).
, B
{ [;].[_~]},
b
I I o, I , I I 0 0
8. Repeat Exercise 7 with v
,
v 
[~]
= [:].
9. T: M Z1 + M12 defi ned by T(A) = AT, B = C = {E' l' Eu.
~l' ~l}' V = A = [:
!]
10. Repeat Excrcise 9 w ith B "" (Ell' Ell ' Eu. Ell} and C = {Ell> E2l , E22, Ell}' II. T: MI.2 + M22 defined by T(A ) = All  BA, where
B [
I I], B _ e  {E" , E E", E,,},  I I
v  A  [:
(a) Find the matrix of D with respect to B = {th , eh" cos X, eh" sin x}. (b) Computethederivativeoff(x) = 3t h  tUcosx+ Zth" sin x indirectly. using Theorem 6.Z6, and verify that it agrees with 3S computed d irectly.
+ (X l
I
e 
~ 15. ConSider the subspace IV of~, given by W = span (t h",
W].[:]},
R2 + RJ d efi ned by a + 2b
rex)
~ 16.
Consider the subspace $W$ of $\mathscr{D}$, given by $W$ = span($\cos x$, $\sin x$, $x\cos x$, $x\sin x$). (a) Find the matrix of $D$ with respect to $\mathcal{B} = \{\cos x, \sin x, x\cos x, x\sin x\}$. (b) Compute the derivative of $f(x) = \cos x + 2x\cos x$ indirectly, using Theorem 6.26, and verify that it agrees with $f'(x)$ as computed directly.
In Exercises 17 and 18, $T: U \to V$ and $S: V \to W$ are linear transformations and $\mathcal{B}$, $\mathcal{C}$, and $\mathcal{D}$ are bases for $U$, $V$, and $W$, respectively. Compute $[S \circ T]_{\mathcal{D} \leftarrow \mathcal{B}}$ in two ways: (a) by finding $S \circ T$ directly and then computing its matrix and (b) by finding the matrices of $S$ and $T$ separately and using Theorem 6.27.
17. $T: \mathscr{P}_1 \to \mathbb{R}^2$ defined by $T(p(x)) =$
u,
:]
d'finodby
[~~~]. S: RI+ RI
J.Jl a]_ [a 2b],B _ {I,x}, 2a  b b
C = V = {e l . e2}
12. T:M21 +M11 definedbyT(A) "" A  AT,B =
c= {Ell,E12'~l>~l}' V =
511
1l. 14. Consider the subspace W o f 2b, given by W  span (C'", e 1., "),
+d
5. T ;@>l + R d efinedbyT(p(x» =
The MatriJ( of a Linea r TransformatIon
A = [:
~]
13. Consider the subspace $W$ of $\mathscr{D}$, given by $W$ = span($\sin x$, $\cos x$). (a) Show that the differential operator $D$ maps $W$ into itself. (b) Find the matrix of $D$ with respect to $\mathcal{B} = \{\sin x, \cos x\}$. (c) Compute the derivative of $f(x) = 3\sin x - 5\cos x$ indirectly, using Theorem 6.26, and verify that it agrees with $f'(x)$ as computed directly.
18. T: ~' + ~l definedbyT(P(x» = p(x+ I), S:~l+~ldefi n edbyS{p(x»  p(x+ I),
B: {I, x},e  V  {I,x,x'}
In Exercises 19–26, determine whether the linear transformation $T$ is invertible by considering its matrix with respect to the standard bases. If $T$ is invertible, use Theorem 6.28 and the method of Example 6.82 to find $T^{-1}$.
19. $T$ in Exercise 1
20. $T$ in Exercise 5
21. Tin ExeTClse3 22. T: (jJ> I + '!P 2 defi ned by T(p(x» = p' (xl . T: ~l+'!Pl defined by T(p(x»
= p(x) + p' (x)
Chapler 6 Vector Spaces
51.
24. T: Mn + MZ2 defi ned by T(A ) = AU, when!:
it to compute the o rthogonal projection of v o nto W, where 3 v 
25. T in Exercise II
2
26. T In Exercise 12
Compare your answer with Example 5.11. [Hint: Find an orthogonal decomposition of $v$ as $v = w + w^{\perp}$ using an orthogonal basis for $W$. See Example 5.3.]
39. Let $T: V \to W$ be a linear transformation between finite-dimensional vector spaces and let $\mathcal{B}$ and $\mathcal{C}$ be bases for $V$ and $W$, respectively. Show that the matrix of $T$ with respect to $\mathcal{B}$ and $\mathcal{C}$ is unique. That is, if $A$ is a matrix such that $A[v]_{\mathcal{B}} = [T(v)]_{\mathcal{C}}$ for all $v$ in $V$, then $A = [T]_{\mathcal{C} \leftarrow \mathcal{B}}$. [Hint: Find values of $v$ that will show this, one column at a time.]
In Exercises 27–30, use the method of Example 6.83 to evaluate the given integral.
f 28. f
27.
(si n x  3 cos x )llx (See Exercise 13.) Se z.. (ix (Sce Exercise 14. )
29.
J (;S cos x 
30.
f
2i" si n x) dx(See Exercise 15.)
(xcos x + xsin x ) dx(See Exercise 16.)
In Exercises 31–36, a linear transformation $T: V \to V$ is given. If possible, find a basis $\mathcal{C}$ for $V$ such that the matrix $[T]_{\mathcal{C}}$ of $T$ with respect to $\mathcal{C}$ is diagonal.
'1. T: R2 t Rl definedbyJal ~ [  4b 1 Jl a +5b
33.
1:]
=
(tl
41. Show that rank(T) = rank(A).
[: +~]
T :gJI, + ~, d e finedbyT( a + bx) = (4a
42. If V = Wand lJ = C, show that Tis diagonalizable if and only if A is diagonalizable. + 2b)
+
+ 3b)x
34. T: @I, + ~ Zdcfined by T(p(x )) = p{x + I) l.&.35. T :@I,+gJIl defined by T(p(x» = p(x) + xp'(x) 36. T : t~\ + ~2 definedby T(p(x)) = p(3x+ 2) 37. Let
ebe the line thro ugh the o n gin in R' with dire" p(  x) ~ p(x))
III QrlestiollS 11 13, determine whether T is a linear trmlsformation. II . T:
R2 + JRl d efined by "/1x) = yxTy, where y =
1 [2 ]
12. T : Mnn + M nn defined by T( A ) = A TA 13. T: Ql'n +9Pndefi ned by T(p(x)) = p(2x 1)
14. If T: W' l + M21 is a linear transfo rmation such that
T(I) ~[ ~ nT(I +X)~ [~ T( J + x + x 2 ) =
0 I] [ I
:],nd
O. fi nd T(5  3x + 2x2 ).
15. Find the nullity of the linear transformation $T: M_{nn} \to \mathbb{R}$ defined by $T(A) = \operatorname{tr}(A)$.
16. Let $W$ be the vector space of upper triangular $2 \times 2$ matrices.
=a+ c=b+d } 4. V ~
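10. Find the change-of-basis matrices $P_{\mathcal{C} \leftarrow \mathcal{B}}$ and $P_{\mathcal{B} \leftarrow \mathcal{C}}$ with respect to the bases $\mathcal{B} = \{1, 1 + x, 1 + x + x^2\}$ and $\mathcal{C} = \{1 + x, x + x^2, 1 + x^2\}$ of $\mathscr{P}_2$.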
(a) Find a linear transformation $T: M_{22} \to M_{22}$ such that $\ker(T) = W$. (b) Find a linear transformation $T: M_{22} \to M_{22}$ such that range($T$) $= W$.
17. Find the matrix $[T]_{\mathcal{C} \leftarrow \mathcal{B}}$ of the linear transformation $T$ in Question 14 with respect to the standard bases $\mathcal{B} = \{1, x, x^2\}$ of $\mathscr{P}_2$ and $\mathcal{C} = \{E_{11}, E_{12}, E_{21}, E_{22}\}$ of $M_{22}$.
18. Let $S = \{v_1, \ldots, v_n\}$ be a set of vectors in a vector space $V$ with the property that every vector in $V$ can be written as a linear combination of $v_1, \ldots, v_n$ in exactly one way. Prove that $S$ is a basis for $V$.
19. If $T: U \to V$ and $S: V \to W$ are linear transformations such that range($T$) $\subseteq \ker(S)$, what can be deduced about $S \circ T$?
20. Let $T: V \to V$ be a linear transformation, and let $\{v_1, \ldots, v_n\}$ be a basis for $V$ such that $\{T(v_1), \ldots, T(v_n)\}$ is also a basis for $V$. Prove that $T$ is invertible.
Distance and Approximation

A straight line may be the shortest distance between two points, but it is by no means the most interesting.
Doctor Who, in "The Time Monster" by Robert Sloman, BBC, 1972

Although this may seem a paradox, all exact science is dominated by the idea of approximation.
Bertrand Russell, in W. H. Auden and L. Kronenberger, eds., The Viking Book of Aphorisms, Viking, 1962, p. 263

[Figure 7.1: Taxicab distance, showing two intersections A and B on a city grid]
7.0 Introduction: Taxicab Geometry

We live in a three-dimensional Euclidean world, and therefore, concepts from Euclidean geometry govern our way of looking at the world. In particular, imagine stopping people on the street and asking them to fill in the blank in the following sentence: "The shortest distance between two points is a ______." They will almost certainly respond with "straight line." There are, however, other equally sensible and intuitive notions of distance. By allowing ourselves to think of "distance" in a more flexible way, we will open the door to the possibility of having a "distance" between polynomials, functions, matrices, and many other objects that arise in linear algebra.

In this section, you will discover a type of "distance" that is every bit as real as the straight-line distance you are used to from Euclidean geometry (the one that is a consequence of Pythagoras' Theorem). As you'll see, this new type of "distance" still behaves in some familiar ways.

Suppose you are standing at an intersection in a city, trying to get to a restaurant at another intersection. If you ask someone how far it is to the restaurant, that person is unlikely to measure distance "as the crow flies" (i.e., using the Euclidean version of distance). Instead, the response will be something like "It's five blocks away." Since this is the way taxicab drivers measure distance, we will refer to this notion of "distance" as taxicab distance.

Figure 7.1 shows an example of taxicab distance. The shortest path from A to B requires traversing the sides of five city blocks. Notice that although there is more than one route from A to B, all shortest routes require three horizontal moves and two vertical moves, where a "move" corresponds to the side of one city block. (How many shortest routes are there from A to B?) Therefore, the taxicab distance from A to B is 5.

Idealizing this situation, we will assume that all blocks are unit squares, and we will use the notation $d_t(A, B)$ for the taxicab distance from A to B.
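The idea of counting blocks can be simulated directly. The sketch below is only an illustration: it treats intersections as integer grid points, places A at $(0, 0)$ and B at $(3, 2)$ (coordinates assumed here to match the three horizontal and two vertical moves of Figure 7.1), and finds the fewest unit moves by breadth-first search on a bounded grid. The function name and the grid bounds are hypothetical.

```python
from collections import deque

def taxicab_distance_bfs(start, goal, lo=-10, hi=10):
    """Fewest unit moves (one block each) between two intersections.

    Searches the integer grid [lo, hi] x [lo, hi]; returns None if the goal
    lies outside that window.
    """
    queue = deque([(start, 0)])
    seen = {start}
    while queue:
        (x, y), dist = queue.popleft()
        if (x, y) == goal:
            return dist
        # A move goes one block east, west, north, or south.
        for dx, dy in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            nxt = (x + dx, y + dy)
            if nxt not in seen and lo <= nxt[0] <= hi and lo <= nxt[1] <= hi:
                seen.add(nxt)
                queue.append((nxt, dist + 1))
    return None

print(taxicab_distance_bfs((0, 0), (3, 2)))
```

Because breadth-first search explores intersections in order of the number of moves needed to reach them, the first time it reaches B it has used a shortest route.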
Problem 1 Find the taxicab distance between the following pairs of points:
(,) ( 1,2)ood(S,5)
(b) (2,4),nd (3,  2)
(e) (0,0) ,nd ( 4,  3)
(d) ( 2,3) ,nd (I, 3)
(e) (I, D and{ L D
(f) $(2.5, 4.6)$ and $(3.1, 1.5)$
Problem 2 Which of the following is the correct formula for the taxicab distance $d_t(A, B)$ between $A = (a_1, a_2)$ and $B = (b_1, b_2)$?
(a) $d_t(A, B) = (a_1 - b_1) + (a_2 - b_2)$
(b) $d_t(A, B) = (|a_1| - |b_1|) + (|a_2| - |b_2|)$
(c) $d_t(A, B) = |a_1 - b_1| + |a_2 - b_2|$

We can define the taxicab norm of a ve…

… $\mathscr{P}_2$. (For example, if $p(x) = 1 - 5x + 3x^2$ and $q(x) = 6 + 2x - x^2$, then $\langle p(x), q(x) \rangle = 1 \cdot 6 + (-5) \cdot 2 + 3 \cdot (-1) = -7$.)
Solution Since $\mathscr{P}_2$ is isomorphic to $\mathbb{R}^3$, we need only show that the dot product in $\mathbb{R}^3$ is an inner product, which we have already established.
Section 7.1 Inner Product Spaces

Example 7.5
Let $f$ and $g$ be in $\mathscr{C}[a, b]$, the vector space of all continuous functions on the closed interval $[a, b]$. Show that

$$\langle f, g \rangle = \int_a^b f(x)g(x)\,dx$$

defines an inner product on $\mathscr{C}[a, b]$.

Solution
We have

$$\langle f, g \rangle = \int_a^b f(x)g(x)\,dx = \int_a^b g(x)f(x)\,dx = \langle g, f \rangle$$

Also, if $h$ is in $\mathscr{C}[a, b]$, then

$$\langle f, g + h \rangle = \int_a^b f(x)(g(x) + h(x))\,dx = \int_a^b (f(x)g(x) + f(x)h(x))\,dx = \int_a^b f(x)g(x)\,dx + \int_a^b f(x)h(x)\,dx = \langle f, g \rangle + \langle f, h \rangle$$

If $c$ is a scalar, then

$$\langle cf, g \rangle = \int_a^b c f(x)g(x)\,dx = c\int_a^b f(x)g(x)\,dx = c\langle f, g \rangle$$

Finally, $\langle f, f \rangle = \int_a^b (f(x))^2\,dx \geq 0$, and it follows from a theorem of calculus that, since $f$ is continuous, $\langle f, f \rangle = \int_a^b (f(x))^2\,dx = 0$ if and only if $f$ is the zero function. Therefore, $\langle f, g \rangle$ is an inner product on $\mathscr{C}[a, b]$.
Example 7.5 also defines an inner product on any subspace of $\mathscr{C}[a, b]$. For example, we could restrict our attention to polynomials defined on the interval $[a, b]$. Suppose we consider $\mathscr{P}[0, 1]$, the vector space of all polynomials on the interval $[0, 1]$. Then, using the inner product of Example 7.5, we have

$$\langle x^2, 1 + x \rangle = \int_0^1 x^2(1 + x)\,dx = \int_0^1 (x^2 + x^3)\,dx = \left[\frac{x^3}{3} + \frac{x^4}{4}\right]_0^1 = \frac13 + \frac14 = \frac{7}{12}$$
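For polynomials, the integral inner product of Example 7.5 can be evaluated exactly term by term. The following is a small sketch, assuming polynomials are stored as coefficient lists with the constant term first; the function names `poly_mul`, `integrate`, and `inner` are hypothetical.

```python
from fractions import Fraction

def poly_mul(p, q):
    """Multiply two polynomials given as coefficient lists (constant term first)."""
    out = [0] * (len(p) + len(q) - 1)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            out[i + j] += a * b
    return out

def integrate(p, a, b):
    """Exact integral of the polynomial p over [a, b], using the power rule."""
    a, b = Fraction(a), Fraction(b)
    return sum(c * (b ** (k + 1) - a ** (k + 1)) / (k + 1) for k, c in enumerate(p))

def inner(p, q, a=0, b=1):
    """<p, q> = integral of p(x) q(x) over [a, b]."""
    return integrate(poly_mul(p, q), a, b)

x_squared = [0, 0, 1]   # x^2
one_plus_x = [1, 1]     # 1 + x
print(inner(x_squared, one_plus_x))  # 7/12
```

The symmetry property from the solution above can also be spot-checked: `inner(p, q)` and `inner(q, p)` return the same value for any pair of coefficient lists.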
Chapter 7 Distance and Approximation
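Properties of Inner Products

The following theorem summarizes some additional properties that follow from the definition of inner product.

Theorem 7.1
Let $u$, $v$, and $w$ be vectors in an inner product space $V$ and let $c$ be a scalar.
a. $\langle u + v, w \rangle = \langle u, w \rangle + \langle v, w \rangle$
b. $\langle u, cv \rangle = c\langle u, v \rangle$
c. $\langle u, 0 \rangle = \langle 0, v \rangle = 0$

Proof
We prove property (a), leaving the proof of properties (b) and (c) as Exercises 23 and 24. Referring to the definition of inner product, we have

$$\begin{aligned} \langle u + v, w \rangle &= \langle w, u + v \rangle && \text{by (1)} \\ &= \langle w, u \rangle + \langle w, v \rangle && \text{by (2)} \\ &= \langle u, w \rangle + \langle v, w \rangle && \text{by (1)} \end{aligned}$$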
Length, Distance, and Orthogonality

In an inner product space, we can define the length of a vector, distance between vectors, and orthogonal vectors, just as we did in Section 1.2. We simply have to replace every use of the dot product $u \cdot v$ by the more general inner product $\langle u, v \rangle$.

Definition
Let $u$ and $v$ be vectors in an inner product space $V$.
1. The length (or norm) of $v$ is $\|v\| = \sqrt{\langle v, v \rangle}$.
2. The distance between $u$ and $v$ is $d(u, v) = \|u - v\|$.
3. $u$ and $v$ are orthogonal if $\langle u, v \rangle = 0$.

Note that $\|v\|$ is always defined, since $\langle v, v \rangle \geq 0$ by the definition of inner product, so we can take the square root of this nonnegative quantity. As in $\mathbb{R}^n$, a vector of length 1 is called a unit vector. The unit sphere in $V$ is the set $S$ of all unit vectors in $V$.
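The three definitions translate directly into code once an inner product is fixed. As a sketch, and purely as an assumption for illustration, the block below uses the integral inner product of Example 7.5 on $[0, 1]$, restricted to polynomials stored as coefficient lists (constant term first). All function names are hypothetical.

```python
import math
from fractions import Fraction

def inner(p, q, a=0, b=1):
    """<p, q> = integral of p(x) q(x) over [a, b], computed exactly."""
    a, b = Fraction(a), Fraction(b)
    total = Fraction(0)
    for i, p_i in enumerate(p):
        for j, q_j in enumerate(q):
            k = i + j + 1  # the integral of x^(i+j) is x^k / k
            total += p_i * q_j * (b ** k - a ** k) / k
    return total

def norm(p):
    """Length of p: the square root of <p, p>."""
    return math.sqrt(inner(p, p))

def subtract(p, q):
    """Coefficientwise difference p - q, padding the shorter list with zeros."""
    n = max(len(p), len(q))
    p = list(p) + [0] * (n - len(p))
    q = list(q) + [0] * (n - len(q))
    return [s - t for s, t in zip(p, q)]

def distance(p, q):
    """d(p, q) = ||p - q||."""
    return norm(subtract(p, q))

def orthogonal(p, q):
    """True when <p, q> = 0 (exact, since the inner product is a Fraction)."""
    return inner(p, q) == 0

print(inner([0, 1], [0, 1]))        # <x, x>
print(distance([0, 1], [1]))        # distance between x and 1
print(orthogonal([1], [-1, 2]))     # is 1 orthogonal to 2x - 1 on [0, 1]?
```

With this inner product, $\langle x, x \rangle = \tfrac13$, so $x$ is not a unit vector, while the constant polynomial 1 and $2x - 1$ are orthogonal because $\int_0^1 (2x - 1)\,dx = 0$.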
Example 7.6
Consider the inner product on