Lec 15 | MIT 18.06 Linear Algebra, Spring 2005

Uploaded by MIT on 06.05.2009

OK, guys the -- we're almost ready to make this --
to make this lecture immortal. OK. Are we on?
All right. This is an important lecture. It's about projections.
And I'll -- let me start by just projecting a vector B down on a
vector A. So just to so you see what the
geometry looks like in when I'm in -- in just two dimensions, I'd like to
find the point along this line so that -- that line through A is a
one-dimensional subspace, so I'm starting with one dimension.
I'd like to find the point on that line closest to B.
Can I just take that problem first and then I'll explain why I want to
do it and why I want to project on other subspaces.
So wh- where's the point closest to B that's on that line?
It's somewhere there. And let me connect that
and -- and what's the whole point of my picture now?
Wh- what's the -- where does orthogonality come into
this picture? The whole point is that this --
this best point, that's the projection, P, of B onto the line,
uh where's orthogonality? It's the fact that that's a right
angle. That this -- the error -- this is like how much
I'm wrong by -- this is the difference between B and
P, the whole point is that that -- that that's
perpendicular to A. That's got to give us the equation.
That's got to tell us -- that's the -- that's the one fact we know,
that's got to tell us where that projection is.
Let me also say, look I -- I've drawn a triangle
there. So if we were doing trigonometry we would
do l- like we would have angles theta and distances that would
involve sine theta and cos theta that leads to lousy formulas
compared to linear algebra. The -- the -- the formula that we
want comes out nicely and what's the --
what do we know? We know that P, this projection,
is some multiple of A, right? It's on that line.
So we know it's -- it's in that one-dimensional
subspace, it's some multiple, let me call that multiple X, of A.
So really it's that number X I'd like to find.
So thi- this is going to be simple in one-D, so let's just carry it
through, and then see how it goes in high dimensions.
OK. The key fact is -- so the -- the -- the key --
the key to everything is that perpendicular.
The fact that -- that A is perpendicular to A is perpendicular
to E. Which is B minus AX, XA. I don't care what -- XA.
That that equals zero. Do you see that as the central
equation, that's saying that this A is perpendicular to this --
correction, that's going to tell us what X is.
Let me just raise the board and simplify that and out will come X.
OK. So if I simplify that, let's see, I'll move one to --
to -- one term to one side, the other term will be on the other side,
it looks to me like X times A transpose A is equal
to A transpose B. Right? I have A transpose B as one
f- one term, A transpose A as the other,
so right away here's my A transpose A . But it --
it's just a number now. And I divide by it. And I get the
answer. X is A transpose B over A transpose A.
And P, the projection I wanted, is that's the c- that's the right
multiple. That's got a cosine theta built in. But we don't need to look
at angles. It's -- we've just got vectors here.
And the projection is P is A times that X.
Or X times that A. But I -- I'm really going to --
eventually I'm going to want that X coming on the right-hand side.
So do you see that I've got two of the three
formulas already, right here, I've got the --
the -- the equat- that's the equation that that --
that that leads me to the answer, here's the answer for X, and here's
the projection. OK. can I do add just one more
thing to this one-dimensional problem?
One more like lift it up into linear algebra, into matrices.
Here's the last thing I want to do w- don't but don't forget those
formulas. A transpose B over A transpose A.
Actually let's look at that for a moment first.
Uh supposeand -- and -- and A well then I'll --
I'll let me I'll -- I'll t- let me take this next step.
So P is A times X. So can I write that then?
P is A times this neat number, A transpose B
over A transpose A. That's our projection.
Can I ask a couple of questions about it, just while we look,
get that digest that formula. Suppose B is doubled.
Suppose I change B to two B. What happens to the
projection? So suppose I instead of that vector
B that I drew on the board make it two B, twice as long,
what's -- what's the projection now? It's doubled too, right? It's
going to be twice as far, if B goes twice as far,
the projection will go twice as far. And you see it there. If I put in
an extra factor two, then -- then P's got that factor too.
Now what about if I double A? What if I double the vector A that
I'm projecting onto? What -- what changes?
N- the projection doesn't change at all. Right?
Because I'm just -- the line didn't change.
If I double A or I take minus A. It's still that same line. The
projection's still in the same place. And of course if I double
A I get a four up -- up above, and I get a four --
a extra four below, they cancel out, and the projection is the same. OK.
So really, this -- I want to look at this as
the projection is -- there's a matrix here.
The projection is being -- is carried out by some matrix that
I'm going to call the projection matrix. And in other words the
projection is some matrix that acts on this guy B and produces the
projection. The projection P is the projection matrix acting on
whatever the input is. The input is B, the projection
matrix is P. OK. Actually you can tell me right
away what this projection matrix is. So this is a pretty interesting
matrix. What matrix is multiplying B?
I'm just -- just from my formula I --
I see what P is. P, this projection matrix, is,
what is it? I see A A transpose above, and I see A
transpose A below. And those don't cancel.
That's not one. Right? That's a matrix. Because down here,
the A transpose A, that's just a number,
A transpose A, that's the length of A squared, and up above is
a column times a row. Column times a row is a matrix.
So this is a full-scale N by N matrix, if I --
if I'm in N dimensions. And it's kind of an interesting one.
And it's the one which if I multiply by B then I get this,
you see once again I'm putting parentheses in
different places. I'm putting the parentheses right
there. I'm saying OK, that's really the matrix that
produces this projection. OK. Now, tell me --
all right, what are the properties of that
matrix? I'm just using letters here, A and B, I could put in numbers,
but I think it's -- for once, it's clearer with letters,
because all formulas are simple, A transpose B over A transpose A,
that's the number that multiplies the A, and then I see wait a minute,
there's a matrix here and what's the rank of that matrix,
by the way? What's the rank of that matrix, yeah,
let me -- let me just ask you about that matrix.
Which looks a little strange, A A transpose over this number.
But uh well, I could ask you its column
space. Yeah, let me ask you its column space.
So what's the column space of a matrix? It's --
it's if you multiply that matrix by anything you always get in the
column space, right? The column space of a matrix is when
you multiply any vector by that matrix, any vector B,
by the matrix, you always land in the column space.
That's -- that's what column spaces work. Now what space
do we always land in? What's the column space of --
what's the result when I multiply this any vector B by my matrix?
So I have P times B, where am I? I'm on that line, right? The
column space, so here are facts about this matrix.
The column space of P, of this projection matrix,
is the line through A. And the rank of this matrix is
you can all say it at once one. Right. The rank is one. This is a
rank one matrix. Actually w- it's exactly the form
that we -- we're familiar with with rank one matrix --
ma-with a rank one matrix. A column times a row, that's a rank
one matrix, that column is the --
is the basis for the column space. Just one dimension. OK. So I know
that much about the matrix. But now there are two more facts
about the matrix that -- that I want to -- that I want to
notice. I- first of all is the matrix
symmetric? That's a natural question for
matrices. And the answer is yes. If I take the transpose of this,
there's a number down there, the transpose of A A transpose is A
A transpose. So P is symmetric. P transpose equals P.
So this is a key property. That the projection matrix is
symmetric. One more property now and this is
the real one. What happens if I do the projection
twice? So I'm looking -- looking for something,
some information about P squared. But just give me in terms of that
picture, in terms my picture, I take any vector B, I multiply it
by pr- my projection matrix, and that puts me there, so this is
PB. And now I project again.
What happens now? What happens when I apply the projection matrix a
second time? To this, so I'm applying it once
brings me here and the second time brings me
I stay put. Right? The projection for a point
on this line the projection is right where it is.
The projection is the same point. So that means that if I project
twice, I get the same answer as I did in the first projection.
So those are the two properties that tell me I'm looking at a
projection matrix. It's symmetric and it's square is
itself. Because if I project a second time, it's the same result
as the first result. OK. So that'sand --
and then here's the exact formula when I know what I'm projecting onto,
that line through A, then I know what P is.
So do you see that I have all the pieces here for projection
on a line? Now, and those --
please remember those. So there are three formulas to
remember. The formula for X, the formula for P, which is just AX,
and then the formula for capital P, which is the matrix. Good. Good.
OK. Now I want to move to more dimensions.
So we're going to have three formulas again but you'll have to --
they'll b- they'll be a little different because we won't have a
single line but a -- a plane or three-dimensional or a
N-dimensional subspace. OK. So so now I'm -- I'm move to
the next question. Maybe I -- maybe I'll say first --
I'll -- let me say first why I want this projection,
and then we'll figure out what it is, we'll go completely in parallel
there, and then we'll use it. OK, why do I want this projection?
Well, so why -- why project? It's because I'm as I mentioned last
time this new chapter deals with equations AX equal B
may have no solution. So that -- that's -- that's really
my problem, that I'm -- I'm given a --
a bunch of equations probably too many equations,
more equations than unknowns, and I can't solve them. OK. So
what do I do? I solve the closest problem that I
can solve. And what's the closest one?
Well, AX will always be in the column space of A.
That's my problem. My problem is AX has to be in the column space and
B is probably not in the column space. So I change B to what?
I choose the closest vector in the column space,
so I'll solve AX equal P instead. That -- that one I can do. Where P
is this is the projection of B onto the column space.
That's why I want to be able to do this. Because I --
I have to find a solution here, and I'm going to put a little hat
there to indicate that it's not the X, it's not the X that
doesn't exist, it's the X hat that's best possible.
So if -- so I must be able to figure out what's the good
projection there. What's the good right-hand side
that is in the column space that's as close
as possible to B and then I'm -- then I know what to do. OK. So now
I've got that problem. So th- that's -- that's why I have
the problem again but now let me say I'm in three dimensions,
so I have a plane maybe for example, and I have a vector B that's not in
the plane. And I want to project B down into
the plane. OK. So there's my question.
How do I project a vector and I'm -- I'm really what I'm looking for is a
nice formula, and I'm counting on linear algebra
to just come out right, a nice formula for the projection of
-- of B into the plane. The nearest point. So this again a
right angle is going to be crucial. OK. Now so what's --
first of all I've got to say what is that plane.
To get a formula I have to tell you what the plane is.
How am I going to tell you a plane? I'll tell you a -- a basis for the
plane, I'll tell you two vectors A one and A two that --
that give you a basis for the plane, so that -- let us say -- say there's
an A one and here's an a -- a vector A two. They don't have to
be perpendicular. But they better be independent,
because then that tells me the plane. The plane is the --
is the plane of A one and A two. And actually going back to my -- to
this -- to this connection, this plane is a column space,
it's the -- it's the column space of what matrix?
What what matrix, so how do I connect the two
questions? I -- I'm trying --
I'm pr- I'm thinking how do I project onto a plane
and I want to get a matrix in here. Everything's cleaner if I write it
in terms of a matrix. So what matrix has these --
has that column space? Well of course it's just the matrix
that has A one in the first column and A two in the
second column. Right, just just let's be sure
we've got the -- the question before we get to the
answer. So I'm looking for again I'm given a matrix A with
two columns. And really I'm ready once I get to two I'm
ready for N. So it could be two columns,
it could be N columns. I'll write the answer in terms of the matrix A.
And the point will be those two columns describe the plane,
they describe the column space, and I want to project.
OK. And I'm given a vector B that's probably not in the column space.
Of course, if B is in the column space, my projection is simple,
it's just B. But most likely I have an error E, this B minus P part,
which is probably not zero. OK. But the beauty is that I know
the -- from geometry or I could get it from calculus or I could get it
from linear algebra that that this this vector --
this this is the part of B that's that's perpendicular to the plane.
That that E is perpendicular is perpendicular to the plane.
If your intuition is saying that that's the crucial fact.
That that that's that that's going to give us the answer.
OK. So let me, that's the problem. Now for the answer. So so this is
a lecture that's really like moving along. Because uh
I'm I'm just plotting that problem up there and asking you what
combination -- now yeso what is it?
P. What is this projection P? This is projection P, is some
multiple of these basis guys,
right, some multiple of the columns. But I don't like writing out X one
A one plus X two A two, I would rather right that as AX.
Well, actually I should put if I'm really
doing everything right, I should put a little hat on it,
to remember that that that this X that those are the numbers and I
could have a put a hat way back there is right,
so this is this is the projection, P. P is AX bar.
And I'm looking for X bar. So that's what I want an equation
for. So so so now I've got hold of the problem.
The problem is find the right combination of the columns so that
the error vector is perpendicular to the plane.
Now let -- let me turn that into an equation. So w- so I'll raise the
board and just turn that turn what we've just done into an equation.
So let me I'll r- I'll write again the main point.
The projection is AX b- X hat. And our problem is find X hat.
And the key is that B minus AX hat, that's the error.
This is the E. Is perpendicular to the plane.
That's got to give me well what am I looking for, I'm looking for
two equations now because I've got an X one and an X two.
And I'll get two equations because so this thing E is perpendicular to
the plane. So what does that mean?
I guess it means it's perpendicular to A one
and also to A two. Right, those are two vectors in the
plane and the things that are perpendicular to the plane are
perpendicular to A one and A two. Let me just repeat. This this guy
then is perpendicular to the plane so it's perpendicular to that vector
and that vector. Not -- it's n- i- i- it's perpendicular to
that of course. But it's a- it's perpendicular to
everything I the plane. And the plane is really told me by
A one and A two. So really I have the equations A
one transpose B minus AX is zero. And also A two transpose B
minus AX is zero. Those are my two equations. But
I want those in matrix form. I want to put those two equations
together as a matrix equation and it's just comes out right.
Look at the matrix A transpose. Put A one A one transpose is its
first row, A two transpose is its second row,
that multiplies this B minus AX, and gives me the zero and the zero.
I'm you see the -- I -- I'm just --
this is one way to -- to -- to come up with this equation.
So the equation I'm coming up with is A transpose B minus AX hat is
zero. OK. That's my equation. All right.
Now I -- I -- I want to stop for a moment before I
solve it and just think about it. First of all do you see that that
equation back in the very first problem I solved on a line,
what was -- what was uh on a line the v- the matrix A only
had one column, it was just little A.
So in the first problem I solved, projecting on a line, this for --
for capital A you just change that to little A and you have the same
equation that we solved before. A transpose E equals zero.
OK. Now a second thing, second comment. I would like to since I
know about these four subspaces, I would like to get them into this
picture. So let me ask the question,
what subspace is this thing in? Which of the four subspaces is that
error vector E, this is this is nothing but E,
this is this guy, coming down the int- in down perpendicular
to the plane. What subspace is E in?
From this equation. Well the equation is saying A transpose E is
zero. So I'm learning here that E is in the null space
of A transpose. Right? That's my equation.
And now I just want to see hey of course that that was right.
Because things that are in the null space of A transpose,
what do we know about the null space of
A transpose? So that last lecture gave us the
sort of the geometry of these subspaces.
And the orthogonality of them. And do you remember what it was?
What on the -- on the -- on the right side of our big figure we
always have the null space of A transpose and the column space of
A, and they're orthogonal. So E in the null space of A
transpose is saying E is perpendicular to the column
space of A. Yes. I -- I -- I just feel OK,
the damn thing came out right. Uh the equation for the -- the
equation that I struggled to find for E really said
what I wanted, that the -- the --
the error E is perpendicular to the column space of A,
just right. And from our four fundamental subspaces we knew that
that is the same as that. To say E is in the null space of A
transpose says E's perpendicular to the column space.
OK. So we've got this equation. Now let's just solve it. All right.
Let me just rewrite it as A transpose A X hat equals A transpose
B. That's our equation. That gives us X.
And oh I -- allow me to keep remembering the
one-dimensional case. The one-dimensional case,
this was little A. So this was just a number, little A --
little A transpose, A transpose A was just a
vector row times a column, a number. And this was a number.
And X was the ratio of those numbers. But now we've got matrices,
this -- this one is N by N. A transpose A is an N by N matrix.
OK. So can I move to the next board
for the solution? OK. There -- this is the --
the -- the key equation. Now I'm ready for the formulas that we have
to remember. What's X hat?
What's the projection, what's the projection matrix,
those are my three questions. That we answered in the one-D case and
now we're ready for in the N-dimensional case.
So what is X hat? Well, what can I say but A transpose A
inverse, A transpose B.
OK. that's the solution to -- to our equation. What's the
projection? That's more interesting.
What's the projection? The projection is A X hat.
That's how X hat got into the picture in the first
place. X hat was the was the combination of columns in the I had
to look for those numbers and now I found them.
Was the combination of the columns of A that gave me the projection.
OK. So now I know what this guy is.
So it's just I multiply by A. A A transpose A inverse A transpose
B. That's looking a little messy but it's uh
not bad. That that combination is is our like magic combination.
This is the thing which is which use which w- which is likewhat's
it like, what was it in one dimension?
What was that we had this we must have had this thing way back at the
beginning of the lecture. What did we -- oh that A was just a
column so it was little A, little A transpose over A transpose
A, right, that's what it was in one-D.
You see what's happened in more dimensions,
I -- I'm not allowed to to just divide because because I don't have
a number, I have to put inverse, because I have an N by N matrix.
But same formula. And now tell me what's the
projection matrix? What matrix is multiplying B to
give the projection? Right there. Because there it i- I
n- n- I even already underlined it by accident. The projection matrix
which I use capital P is this, it's it's that thing, shall I write
it again, A times A transpose A inverse times A transpose.
Now if -- if you'll bear with me I'll think about what --
what I -- what have I done here. I've got this formula. Now the
first thing that occurs to me is -- is something bad.
Lookwhy don't I just you know here's a product of two matrices and
I want its inverse, why don't I just use the formula I
know for the inverse of a product and say OK,
that's A inverse times A transpose inverse,
what will happen if I do that? What will happen if I -- if I say
hey this is A inverse times A transpose inverse,
then shall I do it? It's going to go on videotape if I
do it, and I don't --
all right, I'll put it there, but just like don't take the
videotape quite so carefully. OK. So if I -- if I put that thing
it -- it would be A A inverse A transpose inverse A transpose
and what's that? That's the identity.
But what's going on? So why -- you see my question is
somehow I did s- I did something wrong. That that wasn't allowed.
And and and why is that? Because A is not a square matrix.
A is not a square matrix. It doesn't have an inverse.
So so I have to leave it that way. This is not OK. If if A was a
square invertible matrix, then then I couldn't complain.
Yeah I I'll think -- I -- let me think about that case.
But you but my main case, the whole reason I'm doing all this,
is that A is this matrix that has X too many rows,
it's just got a couple of columns, like A one and A two,
but lots of rows. Not square. And if it's not square,
this A transpose A is square but I can't pull it apart like this,
so I -- I -- I'm not allowed to -- to do this pull apart, except if A
was square. Now if A is square what's what's
going on if A is a square matrix?
A nice square inv- invertible matrix. Think.
Wh- what's up with that wh- what's what's with that case.
So this is auhis that the formula ought to work then too.
If A is a nice square invertible matrix what's its column space,
so it's a nice N by N invertible everything great matrix,
what's its column space, the whole of RN. So what's the projection
matrix if I'm projecting onto the whole space? It's the
identity matrix right? If I'm projecting B onto the whole
space, not just onto a plane, but onto all of three-D, then B is
already in the column space, the projection is the identity,
and this is gives me the correct formula, P is I. But
if I'm projecting onto a subspace then I can't split those apart and I
have to stay with that formula. OK. And what can I say if -- so --
so I remember this formula for one-D and that's what it looks
like in N dimensions. And what are the properties that I
expected for any projection matrix? And I still expect for this one?
That matrix should be symmetric and it is. P transpose is P.
Because if I transpose this, this guy's symmetric, and its
inverse is symmetric, and if I transpose this one when I
transpose it will pop up there, become A, that A transpose will pop
up here, and I'm back to P again. And do we dare try the other
property which is P squared equal P? It -- it's got to be right.
Because we know geometrically that the first projection pops us onto
the column space and the second one leaves us where we are.
So I expect that if I multiply by let me do it --
if I multiply by another P, so there's another A, another A
transpose A inverse A transpose, can you God --
eight As in a row is quite obscene butuh do you see that it works?
So I'm squaring that so what do I dohow do I see that
multiplication? Well, yeI just want to put
parentheses in good places, so I see what's happening,
yehere's an A transpose A sitting together,
so when that A transpose A multiplies its inverse,
all that stuff goes, right. And leaves just the A transpose at
the end, which is just what we want. So P squared equals P. So sure
enough those two properties hold. OK.
That's th- OK we really have got now all the formulas.
X hat, the projection P, and the projection matrix capital P.
And now my job is to use them. OK. So when would I --
when would Ihave a bunch of equations,
too many equations and yet I want the best answer and the --
the most important example, the most common example is if I
have points so here's the --
here's the application. V squared. Fitting by a line.
OK. So I -- I'll start this application today
and there's more in it thanI can do in this same lecture.
So that'll give me a chance to recap the formulas and there they
are, and recap the ideas. So let me s- let me start the
problemtoday. I'm given a bunch of data points.
And they lie close to a line but not on a line.
Let me take that. Say a T equal to one,
two and three, I have one,
and two and two again. So my data points are this is the
like the time direction and this is like well let me call that B or Y or
something. I'm given these three points
and I want to fit them by a line. By the best straight line. So the
problem is fit the pointsone, one is the first point, the second
point is T equals two, B equal one,
and the third point is T equal three, B equal to two.
So those are my three points, T equal sorry,that's two. Yeah,
OK. So this is the point one, one. This is the point two,
two, and that's the point three, two. And of course there isn't a --
a line that goes through them. So I'm looking for the best line.
I'm looking for a line that probably goes somewhere,
do you think it goes somewhere like that? I'm l-uh I didn't
mean to make it go through that point,
it won't. It'll kind of -- it'll s- it'll go between so the
error there and the error there and the error there are as small as I
can get them. OK, what I'd like to do is
find the matrix A. Because once I've found the matrix
A the formulas take over. So what I -- I'm looking for this
line, B is C plus DT. So this is in the homework that I
sent out for today. Find the best line.
So I'm looking for these numbers. C and D.
That tell me the line and I want them to be as close to going through
those three points as I can get. I can't get exactly so there are
three equations to go through the three points. It would it will go
exactly through that point if let's see that first point has T equal to
one, so that would say C plus D equaled one.
This is the one, one. The second point T is two.
So C plus two D should come out to equal two.
But I also want to get the third equation in and at that third
equation T is three so C plus three D equals only two.
That th- that's the key. Is to write down
what equations we would like to solve but can't.
Reason we if we could solve them that would mean that we could put a
line through all three points and of course if these numbers one,
two, two were different, we could do it.
But with those numbers, one, two, two, we can't.
So what is our equation AX equal AX equal B that we can't solve?
I -- I -- I want -- I just want to say what's the matrix here,
what's the unknown X, and what's the right-hand side.
So this is the matrix is one, one, one,
one, two, three. The unknown is C and D.
And the right-hand side if one, two, two. Right, I've just taken my
equations and I've said what is AX and what is B.
Then there's no solution, this is the typical case where I have three
equations, two unknowns,
no solution, but I'm still looking for the best solution.
And the best solution is taken is is to solve not this equation AX equal
B which has which has no solution but the equation that does have a
solution, which was this one. So that's the equation to
solve. That's the central equation of the subject.
I can't solve AX equal B but magically when I multiply both sides
by A transpose I get an equation that I can solve and its
solution gives me X, the best X,
the best projection, and I discover what's the matrix
that's behind it. OK. So next time I'll complete an
example, numerical example. T- t- today was all letters,
numbers next time. Thanks.