Uploaded by MIT on 06.05.2009

Transcript:

OK, guys the -- we're almost ready to make this --

to make this lecture immortal. OK. Are we on?

All right. This is an important lecture. It's about projections.

And I'll -- let me start by just projecting a vector B down on a

vector A. So just to so you see what the

geometry looks like in when I'm in -- in just two dimensions, I'd like to

find the point along this line so that -- that line through A is a

one-dimensional subspace, so I'm starting with one dimension.

I'd like to find the point on that line closest to B.

Can I just take that problem first and then I'll explain why I want to

do it and why I want to project on other subspaces.

So wh- where's the point closest to B that's on that line?

It's somewhere there. And let me connect that

and -- and what's the whole point of my picture now?

Wh- what's the -- where does orthogonality come into

this picture? The whole point is that this --

this best point, that's the projection, P, of B onto the line,

uh where's orthogonality? It's the fact that that's a right

angle. That this -- the error -- this is like how much

I'm wrong by -- this is the difference between B and

P, the whole point is that that -- that that's

perpendicular to A. That's got to give us the equation.

That's got to tell us -- that's the -- that's the one fact we know,

that's got to tell us where that projection is.

Let me also say, look I -- I've drawn a triangle

there. So if we were doing trigonometry we would

do l- like we would have angles theta and distances that would

involve sine theta and cos theta that leads to lousy formulas

compared to linear algebra. The -- the -- the formula that we

want comes out nicely and what's the --

what do we know? We know that P, this projection,

is some multiple of A, right? It's on that line.

So we know it's -- it's in that one-dimensional

subspace, it's some multiple, let me call that multiple X, of A.

So really it's that number X I'd like to find.

So thi- this is going to be simple in one-D, so let's just carry it

through, and then see how it goes in high dimensions.

OK. The key fact is -- so the -- the -- the key --

the key to everything is that perpendicular.

The fact that -- that A is perpendicular to A is perpendicular

to E. Which is B minus AX, XA. I don't care what -- XA.

That that equals zero. Do you see that as the central

equation, that's saying that this A is perpendicular to this --

correction, that's going to tell us what X is.

Let me just raise the board and simplify that and out will come X.

OK. So if I simplify that, let's see, I'll move one to --

to -- one term to one side, the other term will be on the other side,

it looks to me like X times A transpose A is equal

to A transpose B. Right? I have A transpose B as one

f- one term, A transpose A as the other,

so right away here's my A transpose A . But it --

it's just a number now. And I divide by it. And I get the

answer. X is A transpose B over A transpose A.

And P, the projection I wanted, is that's the c- that's the right

multiple. That's got a cosine theta built in. But we don't need to look

at angles. It's -- we've just got vectors here.

And the projection is P is A times that X.

Or X times that A. But I -- I'm really going to --

eventually I'm going to want that X coming on the right-hand side.

So do you see that I've got two of the three

formulas already, right here, I've got the --

the -- the equat- that's the equation that that --

that that leads me to the answer, here's the answer for X, and here's

the projection. OK. can I do add just one more

thing to this one-dimensional problem?

One more like lift it up into linear algebra, into matrices.

Here's the last thing I want to do w- don't but don't forget those

formulas. A transpose B over A transpose A.

Actually let's look at that for a moment first.

Uh supposeand -- and -- and A well then I'll --

I'll let me I'll -- I'll t- let me take this next step.

So P is A times X. So can I write that then?

P is A times this neat number, A transpose B

over A transpose A. That's our projection.

Can I ask a couple of questions about it, just while we look,

get that digest that formula. Suppose B is doubled.

Suppose I change B to two B. What happens to the

projection? So suppose I instead of that vector

B that I drew on the board make it two B, twice as long,

what's -- what's the projection now? It's doubled too, right? It's

going to be twice as far, if B goes twice as far,

the projection will go twice as far. And you see it there. If I put in

an extra factor two, then -- then P's got that factor too.

Now what about if I double A? What if I double the vector A that

I'm projecting onto? What -- what changes?

N- the projection doesn't change at all. Right?

Because I'm just -- the line didn't change.

If I double A or I take minus A. It's still that same line. The

projection's still in the same place. And of course if I double

A I get a four up -- up above, and I get a four --

a extra four below, they cancel out, and the projection is the same. OK.

So really, this -- I want to look at this as

the projection is -- there's a matrix here.

The projection is being -- is carried out by some matrix that

I'm going to call the projection matrix. And in other words the

projection is some matrix that acts on this guy B and produces the

projection. The projection P is the projection matrix acting on

whatever the input is. The input is B, the projection

matrix is P. OK. Actually you can tell me right

away what this projection matrix is. So this is a pretty interesting

matrix. What matrix is multiplying B?

I'm just -- just from my formula I --

I see what P is. P, this projection matrix, is,

what is it? I see A A transpose above, and I see A

transpose A below. And those don't cancel.

That's not one. Right? That's a matrix. Because down here,

the A transpose A, that's just a number,

A transpose A, that's the length of A squared, and up above is

a column times a row. Column times a row is a matrix.

So this is a full-scale N by N matrix, if I --

if I'm in N dimensions. And it's kind of an interesting one.

And it's the one which if I multiply by B then I get this,

you see once again I'm putting parentheses in

different places. I'm putting the parentheses right

there. I'm saying OK, that's really the matrix that

produces this projection. OK. Now, tell me --

all right, what are the properties of that

matrix? I'm just using letters here, A and B, I could put in numbers,

but I think it's -- for once, it's clearer with letters,

because all formulas are simple, A transpose B over A transpose A,

that's the number that multiplies the A, and then I see wait a minute,

there's a matrix here and what's the rank of that matrix,

by the way? What's the rank of that matrix, yeah,

let me -- let me just ask you about that matrix.

Which looks a little strange, A A transpose over this number.

But uh well, I could ask you its column

space. Yeah, let me ask you its column space.

So what's the column space of a matrix? It's --

it's if you multiply that matrix by anything you always get in the

column space, right? The column space of a matrix is when

you multiply any vector by that matrix, any vector B,

by the matrix, you always land in the column space.

That's -- that's what column spaces work. Now what space

do we always land in? What's the column space of --

what's the result when I multiply this any vector B by my matrix?

So I have P times B, where am I? I'm on that line, right? The

column space, so here are facts about this matrix.

The column space of P, of this projection matrix,

is the line through A. And the rank of this matrix is

you can all say it at once one. Right. The rank is one. This is a

rank one matrix. Actually w- it's exactly the form

that we -- we're familiar with with rank one matrix --

ma-with a rank one matrix. A column times a row, that's a rank

one matrix, that column is the --

is the basis for the column space. Just one dimension. OK. So I know

that much about the matrix. But now there are two more facts

about the matrix that -- that I want to -- that I want to

notice. I- first of all is the matrix

symmetric? That's a natural question for

matrices. And the answer is yes. If I take the transpose of this,

there's a number down there, the transpose of A A transpose is A

A transpose. So P is symmetric. P transpose equals P.

So this is a key property. That the projection matrix is

symmetric. One more property now and this is

the real one. What happens if I do the projection

twice? So I'm looking -- looking for something,

some information about P squared. But just give me in terms of that

picture, in terms my picture, I take any vector B, I multiply it

by pr- my projection matrix, and that puts me there, so this is

PB. And now I project again.

What happens now? What happens when I apply the projection matrix a

second time? To this, so I'm applying it once

brings me here and the second time brings me

I stay put. Right? The projection for a point

on this line the projection is right where it is.

The projection is the same point. So that means that if I project

twice, I get the same answer as I did in the first projection.

So those are the two properties that tell me I'm looking at a

projection matrix. It's symmetric and it's square is

itself. Because if I project a second time, it's the same result

as the first result. OK. So that'sand --

and then here's the exact formula when I know what I'm projecting onto,

that line through A, then I know what P is.

So do you see that I have all the pieces here for projection

on a line? Now, and those --

please remember those. So there are three formulas to

remember. The formula for X, the formula for P, which is just AX,

and then the formula for capital P, which is the matrix. Good. Good.

OK. Now I want to move to more dimensions.

So we're going to have three formulas again but you'll have to --

they'll b- they'll be a little different because we won't have a

single line but a -- a plane or three-dimensional or a

N-dimensional subspace. OK. So so now I'm -- I'm move to

the next question. Maybe I -- maybe I'll say first --

I'll -- let me say first why I want this projection,

and then we'll figure out what it is, we'll go completely in parallel

there, and then we'll use it. OK, why do I want this projection?

Well, so why -- why project? It's because I'm as I mentioned last

time this new chapter deals with equations AX equal B

may have no solution. So that -- that's -- that's really

my problem, that I'm -- I'm given a --

a bunch of equations probably too many equations,

more equations than unknowns, and I can't solve them. OK. So

what do I do? I solve the closest problem that I

can solve. And what's the closest one?

Well, AX will always be in the column space of A.

That's my problem. My problem is AX has to be in the column space and

B is probably not in the column space. So I change B to what?

I choose the closest vector in the column space,

so I'll solve AX equal P instead. That -- that one I can do. Where P

is this is the projection of B onto the column space.

That's why I want to be able to do this. Because I --

I have to find a solution here, and I'm going to put a little hat

there to indicate that it's not the X, it's not the X that

doesn't exist, it's the X hat that's best possible.

So if -- so I must be able to figure out what's the good

projection there. What's the good right-hand side

that is in the column space that's as close

as possible to B and then I'm -- then I know what to do. OK. So now

I've got that problem. So th- that's -- that's why I have

the problem again but now let me say I'm in three dimensions,

so I have a plane maybe for example, and I have a vector B that's not in

the plane. And I want to project B down into

the plane. OK. So there's my question.

How do I project a vector and I'm -- I'm really what I'm looking for is a

nice formula, and I'm counting on linear algebra

to just come out right, a nice formula for the projection of

-- of B into the plane. The nearest point. So this again a

right angle is going to be crucial. OK. Now so what's --

first of all I've got to say what is that plane.

To get a formula I have to tell you what the plane is.

How am I going to tell you a plane? I'll tell you a -- a basis for the

plane, I'll tell you two vectors A one and A two that --

that give you a basis for the plane, so that -- let us say -- say there's

an A one and here's an a -- a vector A two. They don't have to

be perpendicular. But they better be independent,

because then that tells me the plane. The plane is the --

is the plane of A one and A two. And actually going back to my -- to

this -- to this connection, this plane is a column space,

it's the -- it's the column space of what matrix?

What what matrix, so how do I connect the two

questions? I -- I'm trying --

I'm pr- I'm thinking how do I project onto a plane

and I want to get a matrix in here. Everything's cleaner if I write it

in terms of a matrix. So what matrix has these --

has that column space? Well of course it's just the matrix

that has A one in the first column and A two in the

second column. Right, just just let's be sure

we've got the -- the question before we get to the

answer. So I'm looking for again I'm given a matrix A with

two columns. And really I'm ready once I get to two I'm

ready for N. So it could be two columns,

it could be N columns. I'll write the answer in terms of the matrix A.

And the point will be those two columns describe the plane,

they describe the column space, and I want to project.

OK. And I'm given a vector B that's probably not in the column space.

Of course, if B is in the column space, my projection is simple,

it's just B. But most likely I have an error E, this B minus P part,

which is probably not zero. OK. But the beauty is that I know

the -- from geometry or I could get it from calculus or I could get it

from linear algebra that that this this vector --

this this is the part of B that's that's perpendicular to the plane.

That that E is perpendicular is perpendicular to the plane.

If your intuition is saying that that's the crucial fact.

That that that's that that's going to give us the answer.

OK. So let me, that's the problem. Now for the answer. So so this is

a lecture that's really like moving along. Because uh

I'm I'm just plotting that problem up there and asking you what

combination -- now yeso what is it?

P. What is this projection P? This is projection P, is some

multiple of these basis guys,

right, some multiple of the columns. But I don't like writing out X one

A one plus X two A two, I would rather right that as AX.

Well, actually I should put if I'm really

doing everything right, I should put a little hat on it,

to remember that that that this X that those are the numbers and I

could have a put a hat way back there is right,

so this is this is the projection, P. P is AX bar.

And I'm looking for X bar. So that's what I want an equation

for. So so so now I've got hold of the problem.

The problem is find the right combination of the columns so that

the error vector is perpendicular to the plane.

Now let -- let me turn that into an equation. So w- so I'll raise the

board and just turn that turn what we've just done into an equation.

So let me I'll r- I'll write again the main point.

The projection is AX b- X hat. And our problem is find X hat.

And the key is that B minus AX hat, that's the error.

This is the E. Is perpendicular to the plane.

That's got to give me well what am I looking for, I'm looking for

two equations now because I've got an X one and an X two.

And I'll get two equations because so this thing E is perpendicular to

the plane. So what does that mean?

I guess it means it's perpendicular to A one

and also to A two. Right, those are two vectors in the

plane and the things that are perpendicular to the plane are

perpendicular to A one and A two. Let me just repeat. This this guy

then is perpendicular to the plane so it's perpendicular to that vector

and that vector. Not -- it's n- i- i- it's perpendicular to

that of course. But it's a- it's perpendicular to

everything I the plane. And the plane is really told me by

A one and A two. So really I have the equations A

one transpose B minus AX is zero. And also A two transpose B

minus AX is zero. Those are my two equations. But

I want those in matrix form. I want to put those two equations

together as a matrix equation and it's just comes out right.

Look at the matrix A transpose. Put A one A one transpose is its

first row, A two transpose is its second row,

that multiplies this B minus AX, and gives me the zero and the zero.

I'm you see the -- I -- I'm just --

this is one way to -- to -- to come up with this equation.

So the equation I'm coming up with is A transpose B minus AX hat is

zero. OK. That's my equation. All right.

Now I -- I -- I want to stop for a moment before I

solve it and just think about it. First of all do you see that that

equation back in the very first problem I solved on a line,

what was -- what was uh on a line the v- the matrix A only

had one column, it was just little A.

So in the first problem I solved, projecting on a line, this for --

for capital A you just change that to little A and you have the same

equation that we solved before. A transpose E equals zero.

OK. Now a second thing, second comment. I would like to since I

know about these four subspaces, I would like to get them into this

picture. So let me ask the question,

what subspace is this thing in? Which of the four subspaces is that

error vector E, this is this is nothing but E,

this is this guy, coming down the int- in down perpendicular

to the plane. What subspace is E in?

From this equation. Well the equation is saying A transpose E is

zero. So I'm learning here that E is in the null space

of A transpose. Right? That's my equation.

And now I just want to see hey of course that that was right.

Because things that are in the null space of A transpose,

what do we know about the null space of

A transpose? So that last lecture gave us the

sort of the geometry of these subspaces.

And the orthogonality of them. And do you remember what it was?

What on the -- on the -- on the right side of our big figure we

always have the null space of A transpose and the column space of

A, and they're orthogonal. So E in the null space of A

transpose is saying E is perpendicular to the column

space of A. Yes. I -- I -- I just feel OK,

the damn thing came out right. Uh the equation for the -- the

equation that I struggled to find for E really said

what I wanted, that the -- the --

the error E is perpendicular to the column space of A,

just right. And from our four fundamental subspaces we knew that

that is the same as that. To say E is in the null space of A

transpose says E's perpendicular to the column space.

OK. So we've got this equation. Now let's just solve it. All right.

Let me just rewrite it as A transpose A X hat equals A transpose

B. That's our equation. That gives us X.

And oh I -- allow me to keep remembering the

one-dimensional case. The one-dimensional case,

this was little A. So this was just a number, little A --

little A transpose, A transpose A was just a

vector row times a column, a number. And this was a number.

And X was the ratio of those numbers. But now we've got matrices,

this -- this one is N by N. A transpose A is an N by N matrix.

OK. So can I move to the next board

for the solution? OK. There -- this is the --

the -- the key equation. Now I'm ready for the formulas that we have

to remember. What's X hat?

What's the projection, what's the projection matrix,

those are my three questions. That we answered in the one-D case and

now we're ready for in the N-dimensional case.

So what is X hat? Well, what can I say but A transpose A

inverse, A transpose B.

OK. that's the solution to -- to our equation. What's the

projection? That's more interesting.

What's the projection? The projection is A X hat.

That's how X hat got into the picture in the first

place. X hat was the was the combination of columns in the I had

to look for those numbers and now I found them.

Was the combination of the columns of A that gave me the projection.

OK. So now I know what this guy is.

So it's just I multiply by A. A A transpose A inverse A transpose

B. That's looking a little messy but it's uh

not bad. That that combination is is our like magic combination.

This is the thing which is which use which w- which is likewhat's

it like, what was it in one dimension?

What was that we had this we must have had this thing way back at the

beginning of the lecture. What did we -- oh that A was just a

column so it was little A, little A transpose over A transpose

A, right, that's what it was in one-D.

You see what's happened in more dimensions,

I -- I'm not allowed to to just divide because because I don't have

a number, I have to put inverse, because I have an N by N matrix.

But same formula. And now tell me what's the

projection matrix? What matrix is multiplying B to

give the projection? Right there. Because there it i- I

n- n- I even already underlined it by accident. The projection matrix

which I use capital P is this, it's it's that thing, shall I write

it again, A times A transpose A inverse times A transpose.

Now if -- if you'll bear with me I'll think about what --

what I -- what have I done here. I've got this formula. Now the

first thing that occurs to me is -- is something bad.

Lookwhy don't I just you know here's a product of two matrices and

I want its inverse, why don't I just use the formula I

know for the inverse of a product and say OK,

that's A inverse times A transpose inverse,

what will happen if I do that? What will happen if I -- if I say

hey this is A inverse times A transpose inverse,

then shall I do it? It's going to go on videotape if I

do it, and I don't --

all right, I'll put it there, but just like don't take the

videotape quite so carefully. OK. So if I -- if I put that thing

it -- it would be A A inverse A transpose inverse A transpose

and what's that? That's the identity.

But what's going on? So why -- you see my question is

somehow I did s- I did something wrong. That that wasn't allowed.

And and and why is that? Because A is not a square matrix.

A is not a square matrix. It doesn't have an inverse.

So so I have to leave it that way. This is not OK. If if A was a

square invertible matrix, then then I couldn't complain.

Yeah I I'll think -- I -- let me think about that case.

But you but my main case, the whole reason I'm doing all this,

is that A is this matrix that has X too many rows,

it's just got a couple of columns, like A one and A two,

but lots of rows. Not square. And if it's not square,

this A transpose A is square but I can't pull it apart like this,

so I -- I -- I'm not allowed to -- to do this pull apart, except if A

was square. Now if A is square what's what's

going on if A is a square matrix?

A nice square inv- invertible matrix. Think.

Wh- what's up with that wh- what's what's with that case.

So this is auhis that the formula ought to work then too.

If A is a nice square invertible matrix what's its column space,

so it's a nice N by N invertible everything great matrix,

what's its column space, the whole of RN. So what's the projection

matrix if I'm projecting onto the whole space? It's the

identity matrix right? If I'm projecting B onto the whole

space, not just onto a plane, but onto all of three-D, then B is

already in the column space, the projection is the identity,

and this is gives me the correct formula, P is I. But

if I'm projecting onto a subspace then I can't split those apart and I

have to stay with that formula. OK. And what can I say if -- so --

so I remember this formula for one-D and that's what it looks

like in N dimensions. And what are the properties that I

expected for any projection matrix? And I still expect for this one?

That matrix should be symmetric and it is. P transpose is P.

Because if I transpose this, this guy's symmetric, and its

inverse is symmetric, and if I transpose this one when I

transpose it will pop up there, become A, that A transpose will pop

up here, and I'm back to P again. And do we dare try the other

property which is P squared equal P? It -- it's got to be right.

Because we know geometrically that the first projection pops us onto

the column space and the second one leaves us where we are.

So I expect that if I multiply by let me do it --

if I multiply by another P, so there's another A, another A

transpose A inverse A transpose, can you God --

eight As in a row is quite obscene butuh do you see that it works?

So I'm squaring that so what do I dohow do I see that

multiplication? Well, yeI just want to put

parentheses in good places, so I see what's happening,

yehere's an A transpose A sitting together,

so when that A transpose A multiplies its inverse,

all that stuff goes, right. And leaves just the A transpose at

the end, which is just what we want. So P squared equals P. So sure

enough those two properties hold. OK.

That's th- OK we really have got now all the formulas.

X hat, the projection P, and the projection matrix capital P.

And now my job is to use them. OK. So when would I --

when would Ihave a bunch of equations,

too many equations and yet I want the best answer and the --

the most important example, the most common example is if I

have points so here's the --

here's the application. V squared. Fitting by a line.

OK. So I -- I'll start this application today

and there's more in it thanI can do in this same lecture.

So that'll give me a chance to recap the formulas and there they

are, and recap the ideas. So let me s- let me start the

problemtoday. I'm given a bunch of data points.

And they lie close to a line but not on a line.

Let me take that. Say a T equal to one,

two and three, I have one,

and two and two again. So my data points are this is the

like the time direction and this is like well let me call that B or Y or

something. I'm given these three points

and I want to fit them by a line. By the best straight line. So the

problem is fit the pointsone, one is the first point, the second

point is T equals two, B equal one,

and the third point is T equal three, B equal to two.

So those are my three points, T equal sorry,that's two. Yeah,

OK. So this is the point one, one. This is the point two,

two, and that's the point three, two. And of course there isn't a --

a line that goes through them. So I'm looking for the best line.

I'm looking for a line that probably goes somewhere,

do you think it goes somewhere like that? I'm l-uh I didn't

mean to make it go through that point,

it won't. It'll kind of -- it'll s- it'll go between so the

error there and the error there and the error there are as small as I

can get them. OK, what I'd like to do is

find the matrix A. Because once I've found the matrix

A the formulas take over. So what I -- I'm looking for this

line, B is C plus DT. So this is in the homework that I

sent out for today. Find the best line.

So I'm looking for these numbers. C and D.

That tell me the line and I want them to be as close to going through

those three points as I can get. I can't get exactly so there are

three equations to go through the three points. It would it will go

exactly through that point if let's see that first point has T equal to

one, so that would say C plus D equaled one.

This is the one, one. The second point T is two.

So C plus two D should come out to equal two.

But I also want to get the third equation in and at that third

equation T is three so C plus three D equals only two.

That th- that's the key. Is to write down

what equations we would like to solve but can't.

Reason we if we could solve them that would mean that we could put a

line through all three points and of course if these numbers one,

two, two were different, we could do it.

But with those numbers, one, two, two, we can't.

So what is our equation AX equal AX equal B that we can't solve?

I -- I -- I want -- I just want to say what's the matrix here,

what's the unknown X, and what's the right-hand side.

So this is the matrix is one, one, one,

one, two, three. The unknown is C and D.

And the right-hand side if one, two, two. Right, I've just taken my

equations and I've said what is AX and what is B.

Then there's no solution, this is the typical case where I have three

equations, two unknowns,

no solution, but I'm still looking for the best solution.

And the best solution is taken is is to solve not this equation AX equal

B which has which has no solution but the equation that does have a

solution, which was this one. So that's the equation to

solve. That's the central equation of the subject.

I can't solve AX equal B but magically when I multiply both sides

by A transpose I get an equation that I can solve and its

solution gives me X, the best X,

the best projection, and I discover what's the matrix

that's behind it. OK. So next time I'll complete an

example, numerical example. T- t- today was all letters,

numbers next time. Thanks.

to make this lecture immortal. OK. Are we on?

All right. This is an important lecture. It's about projections.

And I'll -- let me start by just projecting a vector B down on a

vector A. So just to so you see what the

geometry looks like in when I'm in -- in just two dimensions, I'd like to

find the point along this line so that -- that line through A is a

one-dimensional subspace, so I'm starting with one dimension.

I'd like to find the point on that line closest to B.

Can I just take that problem first and then I'll explain why I want to

do it and why I want to project on other subspaces.

So wh- where's the point closest to B that's on that line?

It's somewhere there. And let me connect that

and -- and what's the whole point of my picture now?

Wh- what's the -- where does orthogonality come into

this picture? The whole point is that this --

this best point, that's the projection, P, of B onto the line,

uh where's orthogonality? It's the fact that that's a right

angle. That this -- the error -- this is like how much

I'm wrong by -- this is the difference between B and

P, the whole point is that that -- that that's

perpendicular to A. That's got to give us the equation.

That's got to tell us -- that's the -- that's the one fact we know,

that's got to tell us where that projection is.

Let me also say, look I -- I've drawn a triangle

there. So if we were doing trigonometry we would

do l- like we would have angles theta and distances that would

involve sine theta and cos theta that leads to lousy formulas

compared to linear algebra. The -- the -- the formula that we

want comes out nicely and what's the --

what do we know? We know that P, this projection,

is some multiple of A, right? It's on that line.

So we know it's -- it's in that one-dimensional

subspace, it's some multiple, let me call that multiple X, of A.

So really it's that number X I'd like to find.

So thi- this is going to be simple in one-D, so let's just carry it

through, and then see how it goes in high dimensions.

OK. The key fact is -- so the -- the -- the key --

the key to everything is that perpendicular.

The fact that -- that A is perpendicular to A is perpendicular

to E. Which is B minus AX, XA. I don't care what -- XA.

That that equals zero. Do you see that as the central

equation, that's saying that this A is perpendicular to this --

correction, that's going to tell us what X is.

Let me just raise the board and simplify that and out will come X.

OK. So if I simplify that, let's see, I'll move one to --

to -- one term to one side, the other term will be on the other side,

it looks to me like X times A transpose A is equal

to A transpose B. Right? I have A transpose B as one

f- one term, A transpose A as the other,

so right away here's my A transpose A . But it --

it's just a number now. And I divide by it. And I get the

answer. X is A transpose B over A transpose A.

And P, the projection I wanted, is that's the c- that's the right

multiple. That's got a cosine theta built in. But we don't need to look

at angles. It's -- we've just got vectors here.

And the projection is P is A times that X.

Or X times that A. But I -- I'm really going to --

eventually I'm going to want that X coming on the right-hand side.

So do you see that I've got two of the three

formulas already, right here, I've got the --

the -- the equat- that's the equation that that --

that that leads me to the answer, here's the answer for X, and here's

the projection. OK. can I do add just one more

thing to this one-dimensional problem?

One more like lift it up into linear algebra, into matrices.

Here's the last thing I want to do w- don't but don't forget those

formulas. A transpose B over A transpose A.

Actually let's look at that for a moment first.

Uh supposeand -- and -- and A well then I'll --

I'll let me I'll -- I'll t- let me take this next step.

So P is A times X. So can I write that then?

P is A times this neat number, A transpose B

over A transpose A. That's our projection.

Can I ask a couple of questions about it, just while we look,

get that digest that formula. Suppose B is doubled.

Suppose I change B to two B. What happens to the

projection? So suppose I instead of that vector

B that I drew on the board make it two B, twice as long,

what's -- what's the projection now? It's doubled too, right? It's

going to be twice as far, if B goes twice as far,

the projection will go twice as far. And you see it there. If I put in

an extra factor two, then -- then P's got that factor too.

Now what about if I double A? What if I double the vector A that

I'm projecting onto? What -- what changes?

N- the projection doesn't change at all. Right?

Because I'm just -- the line didn't change.

If I double A or I take minus A. It's still that same line. The

projection's still in the same place. And of course if I double

A I get a four up -- up above, and I get a four --

a extra four below, they cancel out, and the projection is the same. OK.

So really, this -- I want to look at this as

the projection is -- there's a matrix here.

The projection is being -- is carried out by some matrix that

I'm going to call the projection matrix. And in other words the

projection is some matrix that acts on this guy B and produces the

projection. The projection P is the projection matrix acting on

whatever the input is. The input is B, the projection

matrix is P. OK. Actually you can tell me right

away what this projection matrix is. So this is a pretty interesting

matrix. What matrix is multiplying B?

I'm just -- just from my formula I --

I see what P is. P, this projection matrix, is,

what is it? I see A A transpose above, and I see A

transpose A below. And those don't cancel.

That's not one. Right? That's a matrix. Because down here,

the A transpose A, that's just a number,

A transpose A, that's the length of A squared, and up above is

a column times a row. Column times a row is a matrix.

So this is a full-scale N by N matrix, if I --

if I'm in N dimensions. And it's kind of an interesting one.

And it's the one which if I multiply by B then I get this,

you see once again I'm putting parentheses in

different places. I'm putting the parentheses right

there. I'm saying OK, that's really the matrix that

produces this projection. OK. Now, tell me --

all right, what are the properties of that

matrix? I'm just using letters here, A and B, I could put in numbers,

but I think it's -- for once, it's clearer with letters,

because all formulas are simple, A transpose B over A transpose A,

that's the number that multiplies the A, and then I see wait a minute,

there's a matrix here and what's the rank of that matrix,

by the way? What's the rank of that matrix, yeah,

let me -- let me just ask you about that matrix.

Which looks a little strange, A A transpose over this number.

But uh well, I could ask you its column

space. Yeah, let me ask you its column space.

So what's the column space of a matrix? It's --

it's if you multiply that matrix by anything you always get in the

column space, right? The column space of a matrix is when

you multiply any vector by that matrix, any vector B,

by the matrix, you always land in the column space.

That's -- that's what column spaces work. Now what space

do we always land in? What's the column space of --

what's the result when I multiply this any vector B by my matrix?

So I have P times B, where am I? I'm on that line, right? The

column space, so here are facts about this matrix.

The column space of P, of this projection matrix,

is the line through A. And the rank of this matrix is

you can all say it at once one. Right. The rank is one. This is a

rank one matrix. Actually w- it's exactly the form

that we -- we're familiar with with rank one matrix --

ma-with a rank one matrix. A column times a row, that's a rank

one matrix, that column is the --

is the basis for the column space. Just one dimension. OK. So I know

that much about the matrix. But now there are two more facts

about the matrix that -- that I want to -- that I want to

notice. I- first of all is the matrix

symmetric? That's a natural question for

matrices. And the answer is yes. If I take the transpose of this,

there's a number down there, the transpose of A A transpose is A

A transpose. So P is symmetric. P transpose equals P.

So this is a key property. That the projection matrix is

symmetric. One more property now and this is

the real one. What happens if I do the projection

twice? So I'm looking -- looking for something,

some information about P squared. But just give me in terms of that

picture, in terms my picture, I take any vector B, I multiply it

by pr- my projection matrix, and that puts me there, so this is

PB. And now I project again.

What happens now? What happens when I apply the projection matrix a

second time? To this, so I'm applying it once

brings me here and the second time brings me

I stay put. Right? The projection for a point

on this line the projection is right where it is.

The projection is the same point. So that means that if I project

twice, I get the same answer as I did in the first projection.

So those are the two properties that tell me I'm looking at a

projection matrix. It's symmetric and it's square is

itself. Because if I project a second time, it's the same result

as the first result. OK. So that'sand --

and then here's the exact formula when I know what I'm projecting onto,

that line through A, then I know what P is.

So do you see that I have all the pieces here for projection

on a line? Now, and those --

please remember those. So there are three formulas to

remember. The formula for X, the formula for P, which is just AX,

and then the formula for capital P, which is the matrix. Good. Good.

OK. Now I want to move to more dimensions.

So we're going to have three formulas again but you'll have to --

they'll b- they'll be a little different because we won't have a

single line but a -- a plane or three-dimensional or a

N-dimensional subspace. OK. So so now I'm -- I'm move to

the next question. Maybe I -- maybe I'll say first --

I'll -- let me say first why I want this projection,

and then we'll figure out what it is, we'll go completely in parallel

there, and then we'll use it. OK, why do I want this projection?

Well, so why -- why project? It's because I'm as I mentioned last

time this new chapter deals with equations AX equal B

may have no solution. So that -- that's -- that's really

my problem, that I'm -- I'm given a --

a bunch of equations probably too many equations,

more equations than unknowns, and I can't solve them. OK. So

what do I do? I solve the closest problem that I

can solve. And what's the closest one?

Well, AX will always be in the column space of A.

That's my problem. My problem is AX has to be in the column space and

B is probably not in the column space. So I change B to what?

I choose the closest vector in the column space,

so I'll solve AX equal P instead. That -- that one I can do. Where P

is this is the projection of B onto the column space.

That's why I want to be able to do this. Because I --

I have to find a solution here, and I'm going to put a little hat

there to indicate that it's not the X, it's not the X that

doesn't exist, it's the X hat that's best possible.

So if -- so I must be able to figure out what's the good

projection there. What's the good right-hand side

that is in the column space that's as close

as possible to B and then I'm -- then I know what to do. OK. So now

I've got that problem. So th- that's -- that's why I have

the problem again but now let me say I'm in three dimensions,

so I have a plane maybe for example, and I have a vector B that's not in

the plane. And I want to project B down into

the plane. OK. So there's my question.

How do I project a vector and I'm -- I'm really what I'm looking for is a

nice formula, and I'm counting on linear algebra

to just come out right, a nice formula for the projection of

-- of B into the plane. The nearest point. So this again a

right angle is going to be crucial. OK. Now so what's --

first of all I've got to say what is that plane.

To get a formula I have to tell you what the plane is.

How am I going to tell you a plane? I'll tell you a -- a basis for the

plane, I'll tell you two vectors A one and A two that --

that give you a basis for the plane, so that -- let us say -- say there's

an A one and here's an a -- a vector A two. They don't have to

be perpendicular. But they better be independent,

because then that tells me the plane. The plane is the --

is the plane of A one and A two. And actually going back to my -- to

this -- to this connection, this plane is a column space,

it's the -- it's the column space of what matrix?

What what matrix, so how do I connect the two

questions? I -- I'm trying --

I'm pr- I'm thinking how do I project onto a plane

and I want to get a matrix in here. Everything's cleaner if I write it

in terms of a matrix. So what matrix has these --

has that column space? Well of course it's just the matrix

that has A one in the first column and A two in the

second column. Right, just just let's be sure

we've got the -- the question before we get to the

answer. So I'm looking for again I'm given a matrix A with

two columns. And really I'm ready once I get to two I'm

ready for N. So it could be two columns,

it could be N columns. I'll write the answer in terms of the matrix A.

And the point will be those two columns describe the plane,

they describe the column space, and I want to project.

OK. And I'm given a vector B that's probably not in the column space.

Of course, if B is in the column space, my projection is simple,

it's just B. But most likely I have an error E, this B minus P part,

which is probably not zero. OK. But the beauty is that I know

the -- from geometry or I could get it from calculus or I could get it

from linear algebra that that this this vector --

this this is the part of B that's that's perpendicular to the plane.

That that E is perpendicular is perpendicular to the plane.

If your intuition is saying that that's the crucial fact.

That that that's that that's going to give us the answer.

OK. So let me, that's the problem. Now for the answer. So so this is

a lecture that's really like moving along. Because uh

I'm I'm just plotting that problem up there and asking you what

combination -- now yeso what is it?

P. What is this projection P? This is projection P, is some

multiple of these basis guys,

right, some multiple of the columns. But I don't like writing out X one

A one plus X two A two, I would rather right that as AX.

Well, actually I should put if I'm really

doing everything right, I should put a little hat on it,

to remember that that that this X that those are the numbers and I

could have a put a hat way back there is right,

so this is this is the projection, P. P is AX bar.

And I'm looking for X bar. So that's what I want an equation

for. So so so now I've got hold of the problem.

The problem is find the right combination of the columns so that

the error vector is perpendicular to the plane.

Now let -- let me turn that into an equation. So w- so I'll raise the

board and just turn that turn what we've just done into an equation.

So let me I'll r- I'll write again the main point.

The projection is AX b- X hat. And our problem is find X hat.

And the key is that B minus AX hat, that's the error.

This is the E. Is perpendicular to the plane.

That's got to give me well what am I looking for, I'm looking for

two equations now because I've got an X one and an X two.

And I'll get two equations because so this thing E is perpendicular to

the plane. So what does that mean?

I guess it means it's perpendicular to A one

and also to A two. Right, those are two vectors in the

plane and the things that are perpendicular to the plane are

perpendicular to A one and A two. Let me just repeat. This this guy

then is perpendicular to the plane so it's perpendicular to that vector

and that vector. Not -- it's n- i- i- it's perpendicular to

that of course. But it's a- it's perpendicular to

everything I the plane. And the plane is really told me by

A one and A two. So really I have the equations A

one transpose B minus AX is zero. And also A two transpose B

minus AX is zero. Those are my two equations. But

I want those in matrix form. I want to put those two equations

together as a matrix equation and it's just comes out right.

Look at the matrix A transpose. Put A one A one transpose is its

first row, A two transpose is its second row,

that multiplies this B minus AX, and gives me the zero and the zero.

I'm you see the -- I -- I'm just --

this is one way to -- to -- to come up with this equation.

So the equation I'm coming up with is A transpose B minus AX hat is

zero. OK. That's my equation. All right.

Now I -- I -- I want to stop for a moment before I

solve it and just think about it. First of all do you see that that

equation back in the very first problem I solved on a line,

what was -- what was uh on a line the v- the matrix A only

had one column, it was just little A.

So in the first problem I solved, projecting on a line, this for --

for capital A you just change that to little A and you have the same

equation that we solved before. A transpose E equals zero.

OK. Now a second thing, second comment. I would like to since I

know about these four subspaces, I would like to get them into this

picture. So let me ask the question,

what subspace is this thing in? Which of the four subspaces is that

error vector E, this is this is nothing but E,

this is this guy, coming down the int- in down perpendicular

to the plane. What subspace is E in?

From this equation. Well the equation is saying A transpose E is

zero. So I'm learning here that E is in the null space

of A transpose. Right? That's my equation.

And now I just want to see hey of course that that was right.

Because things that are in the null space of A transpose,

what do we know about the null space of

A transpose? So that last lecture gave us the

sort of the geometry of these subspaces.

And the orthogonality of them. And do you remember what it was?

What on the -- on the -- on the right side of our big figure we

always have the null space of A transpose and the column space of

A, and they're orthogonal. So E in the null space of A

transpose is saying E is perpendicular to the column

space of A. Yes. I -- I -- I just feel OK,

the damn thing came out right. Uh the equation for the -- the

equation that I struggled to find for E really said

what I wanted, that the -- the --

the error E is perpendicular to the column space of A,

just right. And from our four fundamental subspaces we knew that

that is the same as that. To say E is in the null space of A

transpose says E's perpendicular to the column space.

OK. So we've got this equation. Now let's just solve it. All right.

Let me just rewrite it as A transpose A X hat equals A transpose

B. That's our equation. That gives us X.

And oh I -- allow me to keep remembering the

one-dimensional case. The one-dimensional case,

this was little A. So this was just a number, little A --

little A transpose, A transpose A was just a

vector row times a column, a number. And this was a number.

And X was the ratio of those numbers. But now we've got matrices,

this -- this one is N by N. A transpose A is an N by N matrix.

OK. So can I move to the next board

for the solution? OK. There -- this is the --

the -- the key equation. Now I'm ready for the formulas that we have

to remember. What's X hat?

What's the projection, what's the projection matrix,

those are my three questions. That we answered in the one-D case and

now we're ready for in the N-dimensional case.

So what is X hat? Well, what can I say but A transpose A

inverse, A transpose B.

OK. that's the solution to -- to our equation. What's the

projection? That's more interesting.

What's the projection? The projection is A X hat.

That's how X hat got into the picture in the first

place. X hat was the was the combination of columns in the I had

to look for those numbers and now I found them.

Was the combination of the columns of A that gave me the projection.

OK. So now I know what this guy is.

So it's just I multiply by A. A A transpose A inverse A transpose

B. That's looking a little messy but it's uh

not bad. That that combination is is our like magic combination.

This is the thing which is which use which w- which is likewhat's

it like, what was it in one dimension?

What was that we had this we must have had this thing way back at the

beginning of the lecture. What did we -- oh that A was just a

column so it was little A, little A transpose over A transpose

A, right, that's what it was in one-D.

You see what's happened in more dimensions,

I -- I'm not allowed to to just divide because because I don't have

a number, I have to put inverse, because I have an N by N matrix.

But same formula. And now tell me what's the

projection matrix? What matrix is multiplying B to

give the projection? Right there. Because there it i- I

n- n- I even already underlined it by accident. The projection matrix

which I use capital P is this, it's it's that thing, shall I write

it again, A times A transpose A inverse times A transpose.

Now if -- if you'll bear with me I'll think about what --

what I -- what have I done here. I've got this formula. Now the

first thing that occurs to me is -- is something bad.

Lookwhy don't I just you know here's a product of two matrices and

I want its inverse, why don't I just use the formula I

know for the inverse of a product and say OK,

that's A inverse times A transpose inverse,

what will happen if I do that? What will happen if I -- if I say

hey this is A inverse times A transpose inverse,

then shall I do it? It's going to go on videotape if I

do it, and I don't --

all right, I'll put it there, but just like don't take the

videotape quite so carefully. OK. So if I -- if I put that thing

it -- it would be A A inverse A transpose inverse A transpose

and what's that? That's the identity.

But what's going on? So why -- you see my question is

somehow I did s- I did something wrong. That that wasn't allowed.

And and and why is that? Because A is not a square matrix.

A is not a square matrix. It doesn't have an inverse.

So so I have to leave it that way. This is not OK. If if A was a

square invertible matrix, then then I couldn't complain.

Yeah I I'll think -- I -- let me think about that case.

But you but my main case, the whole reason I'm doing all this,

is that A is this matrix that has X too many rows,

it's just got a couple of columns, like A one and A two,

but lots of rows. Not square. And if it's not square,

this A transpose A is square but I can't pull it apart like this,

so I -- I -- I'm not allowed to -- to do this pull apart, except if A

was square. Now if A is square what's what's

going on if A is a square matrix?

A nice square inv- invertible matrix. Think.

Wh- what's up with that wh- what's what's with that case.

So this is auhis that the formula ought to work then too.

If A is a nice square invertible matrix what's its column space,

so it's a nice N by N invertible everything great matrix,

what's its column space, the whole of RN. So what's the projection

matrix if I'm projecting onto the whole space? It's the

identity matrix right? If I'm projecting B onto the whole

space, not just onto a plane, but onto all of three-D, then B is

already in the column space, the projection is the identity,

and this is gives me the correct formula, P is I. But

if I'm projecting onto a subspace then I can't split those apart and I

have to stay with that formula. OK. And what can I say if -- so --

so I remember this formula for one-D and that's what it looks

like in N dimensions. And what are the properties that I

expected for any projection matrix? And I still expect for this one?

That matrix should be symmetric and it is. P transpose is P.

Because if I transpose this, this guy's symmetric, and its

inverse is symmetric, and if I transpose this one when I

transpose it will pop up there, become A, that A transpose will pop

up here, and I'm back to P again. And do we dare try the other

property which is P squared equal P? It -- it's got to be right.

Because we know geometrically that the first projection pops us onto

the column space and the second one leaves us where we are.

So I expect that if I multiply by let me do it --

if I multiply by another P, so there's another A, another A

transpose A inverse A transpose, can you God --

eight As in a row is quite obscene butuh do you see that it works?

So I'm squaring that so what do I dohow do I see that

multiplication? Well, yeI just want to put

parentheses in good places, so I see what's happening,

yehere's an A transpose A sitting together,

so when that A transpose A multiplies its inverse,

all that stuff goes, right. And leaves just the A transpose at

the end, which is just what we want. So P squared equals P. So sure

enough those two properties hold. OK.

That's th- OK we really have got now all the formulas.

X hat, the projection P, and the projection matrix capital P.

And now my job is to use them. OK. So when would I --

when would Ihave a bunch of equations,

too many equations and yet I want the best answer and the --

the most important example, the most common example is if I

have points so here's the --

here's the application. V squared. Fitting by a line.

OK. So I -- I'll start this application today

and there's more in it thanI can do in this same lecture.

So that'll give me a chance to recap the formulas and there they

are, and recap the ideas. So let me s- let me start the

problemtoday. I'm given a bunch of data points.

And they lie close to a line but not on a line.

Let me take that. Say a T equal to one,

two and three, I have one,

and two and two again. So my data points are this is the

like the time direction and this is like well let me call that B or Y or

something. I'm given these three points

and I want to fit them by a line. By the best straight line. So the

problem is fit the pointsone, one is the first point, the second

point is T equals two, B equal one,

and the third point is T equal three, B equal to two.

So those are my three points, T equal sorry,that's two. Yeah,

OK. So this is the point one, one. This is the point two,

two, and that's the point three, two. And of course there isn't a --

a line that goes through them. So I'm looking for the best line.

I'm looking for a line that probably goes somewhere,

do you think it goes somewhere like that? I'm l-uh I didn't

mean to make it go through that point,

it won't. It'll kind of -- it'll s- it'll go between so the

error there and the error there and the error there are as small as I

can get them. OK, what I'd like to do is

find the matrix A. Because once I've found the matrix

A the formulas take over. So what I -- I'm looking for this

line, B is C plus DT. So this is in the homework that I

sent out for today. Find the best line.

So I'm looking for these numbers. C and D.

That tell me the line and I want them to be as close to going through

those three points as I can get. I can't get exactly so there are

three equations to go through the three points. It would it will go

exactly through that point if let's see that first point has T equal to

one, so that would say C plus D equaled one.

This is the one, one. The second point T is two.

So C plus two D should come out to equal two.

But I also want to get the third equation in and at that third

equation T is three so C plus three D equals only two.

That th- that's the key. Is to write down

what equations we would like to solve but can't.

Reason we if we could solve them that would mean that we could put a

line through all three points and of course if these numbers one,

two, two were different, we could do it.

But with those numbers, one, two, two, we can't.

So what is our equation AX equal AX equal B that we can't solve?

I -- I -- I want -- I just want to say what's the matrix here,

what's the unknown X, and what's the right-hand side.

So this is the matrix is one, one, one,

one, two, three. The unknown is C and D.

And the right-hand side if one, two, two. Right, I've just taken my

equations and I've said what is AX and what is B.

Then there's no solution, this is the typical case where I have three

equations, two unknowns,

no solution, but I'm still looking for the best solution.

And the best solution is taken is is to solve not this equation AX equal

B which has which has no solution but the equation that does have a

solution, which was this one. So that's the equation to

solve. That's the central equation of the subject.

I can't solve AX equal B but magically when I multiply both sides

by A transpose I get an equation that I can solve and its

solution gives me X, the best X,

the best projection, and I discover what's the matrix

that's behind it. OK. So next time I'll complete an

example, numerical example. T- t- today was all letters,

numbers next time. Thanks.