Lec 3 | MIT 18.02 Multivariable Calculus, Fall 2007

Uploaded by MIT on 16.01.2009

Remember last time we learned about the cross product
of vectors in space. Remember the definition of
cross product is in terms of this determinant: det| i hat,
j hat, k hat; a1, a2, a3; b1, b2, b3|, where the second row holds the
components of A and the third row the components of B.
This is not an actual determinant because these are
not numbers. But it's a symbolic notation,
to remember what the actual formula is.
The actual formula is obtained by expanding the determinant.
So, we actually get the determinant of a2,
a3, b2, b3 times i hat, minus the determinant of a1,
a3, b1, b3 times j hat plus the determinant of a1,
a2, b1, b2, times k hat. And we also saw a more
geometric definition of the cross product.
We have learned that the length of the cross product is equal to
the area of the parallelogram with sides A and B.

We have also learned that the direction of the cross product
is given by taking the direction that's perpendicular to A and B.
If I draw A and B in a plane (they determine a plane),
then the cross product should go in the direction that's
perpendicular to that plane. Now, there's two different
possible directions that are perpendicular to a plane.
And, to decide which one it is, we use the right-hand rule,
which says if you extend your right hand in the direction of
the vector A, then curve your fingers in the
direction of B, then your thumb will go in the
direction of the cross product. One thing I didn't quite get to
say last time is that there are some funny manipulation rules.
What are we allowed to do or not do with cross products?
So, let me tell you right away the most surprising one if
you've never seen it before: A cross B and B cross A are not
the same thing. Why are they not the same thing?
Well, one way to see it is to think geometrically.
The parallelogram still has the same area, and it's still in the
same plane. So, the cross product is still
perpendicular to the same plane. But, what happens is that,
if you try to apply the right-hand rule but exchange the
roles of A and B, then you will either injure
yourself, or your thumb will end up
pointing in the opposite direction.
So, in fact, B cross A and A cross B are
opposite of each other. And you can check that in the
formula because, for example,
the i component is a2 b3 minus a3 b2.
If you swap the roles of A and B, you will also have to change
the signs. That's a slightly surprising
thing, but you will see one easily adjusts to it.
It just means one must resist the temptation to write AxB
equals BxA. Whenever you do that,
put a minus sign. Now, in particular,
what happens if I do A cross A? Well, I will get zero.
And, there's many ways to see that.
One is to use the formula. Also, you can see that the
parallelogram formed by A and A is completely flat,
and it has area zero. So, we get the zero vector.
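The component formula and the sign rule can be checked numerically. Here is a minimal Python sketch; the function name `cross` and the sample vectors are illustrative choices, not something from the lecture.

```python
def cross(a, b):
    """Cross product of two 3-vectors, from expanding the symbolic determinant."""
    a1, a2, a3 = a
    b1, b2, b3 = b
    return (
        a2 * b3 - a3 * b2,      # i component
        -(a1 * b3 - a3 * b1),   # j component (note the minus sign)
        a1 * b2 - a2 * b1,      # k component
    )

# Sample vectors, chosen only for illustration.
A = (1, 2, 3)
B = (4, 5, 6)

AxB = cross(A, B)
BxA = cross(B, A)
print(AxB)          # A cross B
print(BxA)          # B cross A: each component has the opposite sign
print(cross(A, A))  # the zero vector
```

Swapping the arguments flips the sign of every component, and crossing a vector with itself gives zero in every slot.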

Hopefully you got practice with cross products,
and computing them, in recitation yesterday.
Let me just point out one important application of cross
product that maybe you haven't seen yet.
Let's say that I'm given three points in space,
and I want to find the equation of the plane that contains them.
So, say I have P1, P2, P3, three points in space.
They determine a plane, at least if they are not
aligned, and we would like to find the equation of the plane
that they determine. That means, let's say that we
have a point, P, in space with coordinates x,
y, z. Well, to find the equation of
the plane containing P1,
P2, and P3, we need to find a condition on
the coordinates x, y, z,
telling us whether P is in the plane or not.
We have several ways of doing that.
For example, here is one thing we could do.
Let me just backtrack to determinants that we saw last
time. One way to think about it is to
consider these vectors, P1P2, P1P3, and P1P.
The question of whether they are all in the same plane is the
same as asking ourselves whether the parallelepiped that they
form is actually completely flattened.
So, if I try to form a parallelepiped with these three
sides, and P is not in the plane, then it will have some
volume. But, if P is in the plane,
then it's actually completely squished.
So, one possible answer, one possible way to think of
the equation of a plane, is that the determinant of these vectors
should be zero: det(vector P1P, vector P1P2, vector P1P3) = 0
(if you do it in a different order it doesn't really matter).
One possible way to express the condition that P is in the plane
is to say that the determinant of these three vectors has to be
zero. And, if I am given coordinates
for these points -- I'm not giving you numbers,
but if I gave you numbers, then you would be able to plug
those numbers in. So, you could compute these two
vectors P1P2 and P1P3 explicitly.
But, of course, P1P would depend on x,
y, and z. So, when you compute the
determinant, you get a formula that involves x,
y, and z. And you'll find that this
condition on x, y, z is the equation of a
plane. We're going to see more about
that pretty soon. Now, let me tell you a slightly
faster way of doing it. Actually, it's not much faster,
It's pretty much the same calculation, but it's maybe more
enlightening. Let me actually show you a nice
color picture that I prepared for this.
One thing that's on this picture that I haven't drawn
before is the normal vector to the plane.
Why is that? Well, let's say that we know
how to find a vector that's perpendicular to our plane.
Then, what does it mean for the point, P, to be in the plane?
It means that the direction from P1 to P has to be
perpendicular to this vector N. So here's another solution:
P is in the plane exactly when the vector P1P is perpendicular
to N, where N is some vector that's
perpendicular to the plane. N is called a normal vector.
How do we rephrase this condition?
Well, we've learned how to detect whether two vectors are
perpendicular to each other using dot product (that was the
first lecture). These two vectors are
perpendicular exactly when their dot product is zero.
So, concretely, if we have a point P1 given to
us, and say we have been able to
compute the vector N, then when we actually compute
what happens, here we will have the
coordinates x, y, z, of a point P,
and we will get some condition on x, y, z.
That will be the equation of a plane.
Now, why are these things the same?
Well, before I can tell you that, I should tell you how to
find a normal vector. Maybe you are already starting
to see what the method should be, because we know how to find
a vector perpendicular to two given vectors.
We know two vectors in that plane, for example,
P1P2, and P1P3. Actually, I could have used
another permutation of these points, but, let's use this.
So, if I want to find a vector that's perpendicular to both
P1P2 and P1P3 at the same time, all I have to do is take their
cross product. So, how do we find a vector
that's perpendicular to the plane?
The answer is just the cross product P1P2 cross P1P3.
Say you actually took the points in a different order,
and you took P1P3 x P1P2. You would get,
of course, the opposite vector. That is fine.
Any plane actually has infinitely many normal vectors.
You can just multiply a normal vector by any constant,
you will still get a normal vector.
So, that's going to be one of the main uses of cross product.
When we know two vectors in a plane, it lets us find the
normal vector to the plane, and that is what we need to
find the equation. Now, why is that the same as
our first answer over there? Well, the condition that we
have is that P1P dot N should be 0.
And we said N is actually P1P2 cross P1P3.
So, this is what we want to be zero.
Now, if you remember, a long time ago (that was
Friday) we introduced this thing and called it the triple
product. And what we saw is that the
triple product is the same thing as the determinant.
So, in fact, these two ways of thinking,
one saying that the box formed by these three vectors should be
flat and have volume zero, and the other one saying that
we can find a normal vector and then express the condition that
a vector is in the plane if it's perpendicular to the normal
vector, are actually giving us the same
formula in the end.
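Both descriptions of the plane can be compared on a concrete example. The following is a small Python sketch under stated assumptions: the three points are hypothetical, and the helper names (`sub`, `dot`, `cross`, `det3`, `plane_value`, `box_value`) are made up for illustration.

```python
def sub(p, q):
    """Vector from point q to point p."""
    return tuple(pi - qi for pi, qi in zip(p, q))

def dot(a, b):
    return sum(ai * bi for ai, bi in zip(a, b))

def cross(a, b):
    a1, a2, a3 = a
    b1, b2, b3 = b
    return (a2 * b3 - a3 * b2, a3 * b1 - a1 * b3, a1 * b2 - a2 * b1)

def det3(u, v, w):
    """Determinant of the 3x3 matrix whose rows are u, v, w."""
    return (u[0] * (v[1] * w[2] - v[2] * w[1])
            - u[1] * (v[0] * w[2] - v[2] * w[0])
            + u[2] * (v[0] * w[1] - v[1] * w[0]))

# Hypothetical points, chosen only for illustration.
P1, P2, P3 = (1, 0, 0), (0, 1, 0), (0, 0, 1)

# Normal vector N = P1P2 x P1P3.
N = cross(sub(P2, P1), sub(P3, P1))

def plane_value(P):
    """P1P . N, which is zero exactly when P lies in the plane."""
    return dot(sub(P, P1), N)

def box_value(P):
    """det(P1P, P1P2, P1P3), the triple product written as a determinant."""
    return det3(sub(P, P1), sub(P2, P1), sub(P3, P1))

for P in [(2, -1, 0), (1, 1, 1)]:
    print(P, plane_value(P), box_value(P))
```

For these points the normal vector comes out as (1, 1, 1), so the condition P1P . N = 0 reads (x - 1) + y + z = 0. The two functions return the same number for every P, whether or not P is in the plane.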

OK, any quick questions before we move on?
STUDENT QUESTION: are those two equal only when P
is in the plane, or no matter where it is?
So, these two quantities, P1P dot the cross product,
or the determinant of the three vectors, are always equal to
each other. They are always the same.
And now, if a point is not in the plane, then their numerical
value will be nonzero. If P is in the plane,
it will be zero. OK, let's move on and talk a
bit about matrices. Probably some of you have
learnt about matrices a little bit in high school,
but certainly not all of you. So let me just introduce you to
a little bit about matrices -- just enough for what we will
need later on in this class. If you want to know everything
about the secret life of matrices, then you should take
18.06 someday. OK, what's going to be our
motivation for matrices? Well, in life,
a lot of things are related by linear formulas.
And, even if they are not, maybe sometimes you can
approximate them by linear formulas.
So, often, we have linear relations between variables --
for example, if we do a change of coordinate systems.
For example, say that we are in space,
and we have a point. Its coordinates might be,
let me call them x1, x2, x3 in my initial coordinate
system. But then, maybe I need to
actually switch to different coordinates to better solve the
problem because they're more adapted to other things that
we'll do in the problem. And so I have other coordinates
axes, and in these new coordinates, P will have
different coordinates -- let me call them, say,
u1, u2, u3. And then, the relation between
the old and the new coordinates is going to be given by linear
formulas -- at least if I choose the same origin.
Otherwise, there might be constant terms,
which I will not insist on. Let me just give an example.
For example, maybe, let's say u1 could be
2 x1 + 3 x2 + 3 x3. u2 might be 2 x1 + 4 x2 + 5 x3.
u3 might be x1 + x2 + 2 x3. Do not ask me where these
numbers come from. I just made them up,
that's just an example of what might happen.
You can put here your favorite numbers if you want.
Now, in order to express this kind of linear relation,
we can use matrices. A matrix is just a table with
numbers in it. And we can reformulate this in
terms of matrix multiplication or matrix product.
So, instead of writing this, I will write that the matrix
|2, 3, 3; 2, 4, 5; 1, 1, 2| times the vector
<x1, x2, x3> is equal to
<u1, u2, u3>.
Hopefully you see that there is the same information content on
both sides. I just need to explain to you
what this way of multiplying tables of numbers means.
Well, what it means is really that we'll have exactly these
same quantities. Let me just say that more
symbolically: so maybe this matrix could be
called A, and this we could call X, and this one we could call U.
Then we'll say A times X equals U, which is a lot shorter than
that. Of course, I need to tell you
what A, X, and U are in terms of their entries for you to get the
formula. But it's a convenient notation.
So, what does it mean to do a matrix product?
The entries in the matrix product are obtained by taking
dot products. Let's say we are doing the
product AX. We do dot products between
the rows of A and the columns of X.
Here, A is a 3x3 matrix -- that just means there's three rows
and three columns. And X is a column vector,
which we can think of as a 3x1 matrix.
It has three rows and only one column.
Now, what can we do? Well, I said we are going to do
a dot product between a row of A: 2, 3, 3, and a column of X:
x1, x2, x3. That dot product will be two
times x1 plus three times x2 plus three times x3.
OK, it's exactly what we want to set u1 equal to.
Let's do the second one. I take the second row of A:
2, 4, 5, and I do the dot product with x1,
x2, x3. I will get two times x1 plus
four times x2 plus five times x3, which is u2.
And, same thing with the third one: one times x1 plus one times
x2 plus two times x3. So that's matrix multiplication.
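The row-by-row recipe is short enough to write out. Here is a minimal Python sketch; the matrix A is the one from the example, while the function names and the sample values for x1, x2, x3 are assumptions made for illustration.

```python
def dot(a, b):
    return sum(ai * bi for ai, bi in zip(a, b))

def mat_vec(A, x):
    """Each entry of A x is the dot product of one row of A with x."""
    return [dot(row, x) for row in A]

A = [[2, 3, 3],
     [2, 4, 5],
     [1, 1, 2]]

X = [1, 0, -1]       # hypothetical values for x1, x2, x3
U = mat_vec(A, X)    # [u1, u2, u3]
print(U)
```

Each entry of `U` is exactly one of the three linear formulas for u1, u2, u3, evaluated at the chosen x's.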
Let me restate things more generally.
If I want to find the entries of a product of two matrices,
A and B -- I'm saying matrices, but of course they could be
vectors. Vectors are now a special case
of matrices, just by taking a matrix of width one.
So, if I have my matrix A, and I have my matrix B,
then I will get the product, AB.
Let's say for example -- this works in any size -- let's say
that A is a 3x4 matrix. So, it has three rows,
four columns. And, here, I'm not going to
give you all the values because I'm not going to compute
everything. It would take the rest of the
lecture. And let's say that B is maybe
size 4x2. So, it has two columns and four
rows. And, let's say,
for example, that we have the second column:
0, 3, 0, 2. So, in A times B,
the entries should be the dot products between these rows and
these columns. Here, we have two columns.
Here, we have three rows. So, we should get three times
two different possibilities. And so the answer will have
size 3x2. We cannot compute most of them,
because I did not give you numbers, but one of them we can
compute. We can compute the value that
goes here, namely, this one in the second column.
So, I select the second column of B, and I take the first row
of A, and I find: 1 times 0: 0.
2 times 3: 6, plus 0, plus 8,
should make 14. So, this entry right here is 14.
In fact, let me tell you about another way to set it up so that
you can remember more easily what goes where.
One way that you can set it up is you can put A here.
You can put B up here, and then you will get the
answer here. And, if you want to find what
goes in a given slot here, then you just look to its left
and you look above it, and you do the dot product
between these guys. That's an easy way to remember.
First of all, it tells you what the size of
the answer will be. The size will be what fits
nicely in this box: it should have the same width
as B and the same height as A. And second, it tells you which
dot product to compute for each position.
You just look at what's to the left, and what's above the given
position. Now, there's a catch.
Can we multiply anything by anything?
Well, no. I wouldn't ask the question
otherwise. But anyway, to be able to do
this dot product, we need to have the same number
of entries here and here. Otherwise, we can't say "take
this times that, plus this times that,
and so on" if we run out of space on one of them before the
other. So, the condition -- and it's
important, so let me write it in red -- is that the width of A
must equal the height of B. (OK, it's a bit cluttered,
but hopefully you can still see what I'm writing.)
OK, now we know how to multiply matrices.
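The general rule, including the size condition, can be sketched in Python as follows. Only the second column of B and the products 0, 6, 0, 8 come from the example; every other entry is a placeholder assumed here so that the code runs, and `mat_mul` is an illustrative name.

```python
def mat_mul(A, B):
    """Product AB: entry (i, j) is the dot product of row i of A with column j of B."""
    if len(A[0]) != len(B):
        raise ValueError("width of A must equal height of B")
    columns_of_B = list(zip(*B))
    return [[sum(a * b for a, b in zip(row, col)) for col in columns_of_B]
            for row in A]

# A is 3x4. Its first row is chosen to reproduce the products 0, 6, 0, 8;
# its third entry and the other two rows are placeholders (assumptions).
A = [[1, 2, 0, 4],
     [0, 1, 0, 0],
     [0, 0, 1, 0]]

# B is 4x2. The second column (0, 3, 0, 2) is from the example;
# the first column is a placeholder (assumption).
B = [[1, 0],
     [0, 3],
     [0, 0],
     [0, 2]]

AB = mat_mul(A, B)
print(len(AB), len(AB[0]))   # height of A, width of B
print(AB[0][1])              # first row of A dotted with second column of B
```

Calling `mat_mul(B, A)` with these sizes raises the error, because the width of B (two) does not match the height of A (three).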

So, what does it mean to multiply matrices?
Of course, we've seen in this example that we can use a matrix
to tell us how to transform from x's to u's.
And, that's an example of multiplication.
But now, let's say that we have two matrices like that, telling
us how to transform from something to something else.
What does it mean to multiply them?

I claim that the product AB represents doing first the
transformation B, then transformation A.
That's a slightly counterintuitive thing,
because we are used to writing things from left to right.
Unfortunately, with matrices,
you multiply things from right to left.
If you think about it, say you have two functions,
f and g, and you write f(g(x)), it really means you apply first
g then f. It works the same way as that.
OK, so why is this? Well, if I write AB times X
where X is some vector that I want to transform,
it's the same as A times BX. This property is called
associativity. And, it's a good property of
well-behaved products -- not of cross product,
by the way. Matrix product is associative.
That means we can actually think of a product ABX and
multiply them in whichever order we want.
We can start with BX or we can start with AB.
So, now, BX means we apply the transformation B to X.
And then, multiplying by A means we apply the
transformation A. So, we first apply B,
then we apply A. That's the same as applying AB
all at once. Another thing -- a warning:
AB and BA are not the same thing at all.
You can probably see that already from this
interpretation. It's not the same thing to
convert oranges to bananas and then to carrots,
or vice versa. Actually, even worse:
this thing might not even be well defined.
If the width of A equals the height of B, we can do this
product. But it's not clear that the
width of B will equal the height of A, which is what we would
need for that one. So, the size condition,
to be able to do the product, might not make sense -- maybe
one of the products doesn't make sense.
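To make the order of operations concrete, here is a small Python sketch with hypothetical 2x2 matrices (the entries and the name `mat_mul` are assumptions for illustration). It checks that applying B and then A agrees with applying AB at once, and it computes BA for comparison.

```python
def mat_mul(A, B):
    if len(A[0]) != len(B):
        raise ValueError("width of A must equal height of B")
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)]
            for row in A]

# Hypothetical matrices and a column vector (written as a 2x1 matrix).
A = [[1, 2],
     [0, 1]]
B = [[0, 1],
     [1, 0]]
X = [[3],
     [5]]

BX = mat_mul(B, X)                      # apply B first
A_after_B = mat_mul(A, BX)              # then apply A
AB_at_once = mat_mul(mat_mul(A, B), X)  # apply AB in one step

AB = mat_mul(A, B)
BA = mat_mul(B, A)
print(A_after_B, AB_at_once)
print(AB, BA)
```

Here both products are defined because the matrices are square, yet `AB` and `BA` have different entries.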
Even if they both make sense, they are usually completely
different things. The next thing I need to tell
you about is something called the identity matrix.
The identity matrix is the matrix that does nothing.
What does it mean to do nothing? I don't mean the matrix is zero.
The matrix zero would take X and would always give you back
zero. That's not a very interesting
transformation. What I mean is the guy that
takes X and gives you X again. It's called I,
and it has the property that IX equals X for all X.
So, it's the transformation from something to itself.
It's the obvious transformation -- called the identity
transformation. So, how do we write that as a
matrix? Well, actually there's an
identity for each size because, depending on whether X has two
entries or ten entries, the matrix I needs to have a
different size. For example,
the identity matrix of size 3x3 has entries one,
one, one on the diagonal, and zero everywhere else.
OK, let's check. If we multiply this with a
vector -- start thinking about it.
What happens when we multiply this with the vector X?

OK, so let's say I multiply the matrix I with a vector x1,
x2, x3. What will the first entry be?
It will be the dot product between <1,0,0> and
<x1, x2, x3>. This vector is i hat.
If you do the dot product with i hat, you will get the first
component -- that will be x1: one times x1 plus zero plus zero.
Similarly here, if I do the dot product,
I get zero plus x2 plus zero. I get x2, and here I get x3.
OK, it works. Same thing if I put here a
matrix: I will get back the same matrix.
In general, the identity matrix in size n x n is an n x n matrix
with ones on the diagonal, and zeroes everywhere else.
You just put 1 at every diagonal position and 0
elsewhere. And then, you can see that if
you multiply that by a vector, you'll get the same vector.
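A quick way to convince yourself is to build the identity matrix in code and multiply. This is a minimal Python sketch; `identity` and `mat_mul` are illustrative names, and the vector X is made up.

```python
def identity(n):
    """n x n matrix with 1 on the diagonal and 0 everywhere else."""
    return [[1 if i == j else 0 for j in range(n)] for i in range(n)]

def mat_mul(A, B):
    if len(A[0]) != len(B):
        raise ValueError("width of A must equal height of B")
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)]
            for row in A]

I3 = identity(3)
X = [[7], [-2], [5]]     # hypothetical column vector
M = [[2, 3, 3],
     [2, 4, 5],
     [1, 1, 2]]

print(mat_mul(I3, X) == X)   # I X gives X back
print(mat_mul(I3, M) == M)   # I M gives M back
print(mat_mul(M, I3) == M)   # so does M I
```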

OK, let me give you another example of a matrix.
Let's say that in the plane we look at the transformation that
does rotation by 90°, let's say, counterclockwise.
I claim that this is given by the matrix R = |0, -1;
1, 0|. Let's try to see why that is
the case. Well, if I do R times i hat --
if I apply that to the first vector,
i hat: i hat will be <1,0>, so in this
product, first you will get 0,
and then you will get 1. You get j hat.
OK, so this thing sends i hat to j hat.
What about j hat? Well, you get negative one.
And then you get 0. So, that's minus i hat.
So, j is sent towards here. And, in general,
if you apply it to a vector with components x,y,
then you will get back -y,x, which is the formula we've seen
for rotating a vector by 90°. So, it seems to do what we want.
By the way, the columns in this matrix represent what happens to
each basis vector, to the vectors i and j.
This guy here is exactly what we get when we multiply R by i.
And, when we multiply R by j, we get this guy here.
So, what's interesting about this matrix?
Well, we can do computations with matrices in ways that are
easier than writing coordinate change formulas.
For example, if you compute R squared,
so if you multiply R with itself: I'll let you do it as an
exercise, but you will find that you get
|-1,0;0,-1|. So, that's minus the identity
matrix. Why is that?
Well, if I rotate something by 90° and then I rotate by 90°
again, then I will rotate by 180°.
That means I will actually just go to the opposite point around
the origin. So, I will take (x,y) to
(-x,-y). And if I applied R four times,
R^4 would be identity. OK, questions?
STUDENT QUESTION: when you said R equals that
matrix, is that the definition of R?
How did I come up with this R? Well, secretly,
I worked pretty hard to find the entries that would tell me
how to rotate something by 90° counterclockwise.
So, remember: what we saw last time or in the
first lecture is that, to rotate a vector by 90°,
we should change (x, y) to (-y, x).
And now I'm trying to express this transformation as a matrix.
So, maybe you can call these guys u and v,
and then you write that u equals 0x - 1y,
and that v equals 1x + 0y. So that's how I would find it.
Here, I just gave it to you already made,
so you didn't really see how I found it.
You will see more about rotations on the problem set.
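The claims about R are easy to check by direct multiplication. In this Python sketch, R has rows (0, -1) and (1, 0), read off from u = 0x - 1y and v = 1x + 0y; the function name `mat_mul` and the variable names are assumptions for illustration.

```python
def mat_mul(A, B):
    if len(A[0]) != len(B):
        raise ValueError("width of A must equal height of B")
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)]
            for row in A]

# Rotation by 90 degrees counterclockwise: (x, y) goes to (-y, x).
R = [[0, -1],
     [1, 0]]

i_hat = [[1], [0]]
j_hat = [[0], [1]]

print(mat_mul(R, i_hat))   # first column of R, which is j hat
print(mat_mul(R, j_hat))   # second column of R, which is minus i hat

R2 = mat_mul(R, R)         # rotate twice: 180 degrees
R4 = mat_mul(R2, R2)       # rotate four times: back to the start
print(R2, R4)
```

The two products with i hat and j hat simply read off the columns of R, which is a handy way to write down the matrix of any such transformation.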
OK, next I need to tell you how to invert matrices.
So, what's the point of matrices?
It's that it gives us a nice way to think about changes of
variables. And, in particular,
if we know how to express U in terms of X, maybe we'd like to
know how to express X in terms of U.
Well, we can do that, because we've learned how to
solve linear systems like this. So in principle,
we could start working, substituting and so on,
to find formulas for x1, x2, x3 as functions of u1,
u2, u3. And the relation will be,
again, a linear relation. It will, again,
be given by a matrix. Well, what's that matrix?
It's the inverse transformation.
It's the inverse of the matrix A.
So, we need to learn how to find the inverse of a matrix

The inverse of A, by definition,
is a matrix M, with the property that if I
multiply A by M, then I get identity.
And, if I multiply M by A, I also get identity.
The two properties are equivalent.
That means, if I apply first the transformation A,
then the transformation M, actually I undo the
transformation A, and vice versa.
These two transformations are the opposite of each other,
or I should say the inverse of each other.
For this to make sense, we need A to be a square
matrix. It must have size n by n.
It can be any size, but it must have the same
number of rows as columns. It's a general fact that you
will see more in detail in linear algebra if you take it.
Let's just admit it. The matrix M will be denoted by
A inverse. Then, what is it good for?
Well, for example, finding the solution to a
linear system. What's a linear system in our
new language? It's: a matrix times some
unknown vector, X, equals some known vector,
B. How do we solve that?
We just compute: X equals A inverse B.
Why does that work? How do I get from here to here?
Let's be careful.

(I'm going to reuse this matrix, but I'm going to erase
it nonetheless and I'll just rewrite it).

If AX=B, then let's multiply both sides by A inverse.
A inverse times AX is A inverse B.
And then, A inverse times A is identity, so I get:
X equals A inverse B. That's how I solved my system
of equations. So, if you have a calculator
that can invert matrices, then you can solve linear
systems very quickly. Now, we should still learn how
to compute these things. Yes?
STUDENT QUESTION: "How do you know that A inverse will be on
the left of B and not after it?" Well,
it's exactly this derivation. So, if you are not sure,
then just reproduce this calculation.
To get from here to here, what I did is I multiplied
things on the left by A inverse, and then these guys simplify.
If I had put A inverse on the right, I would have AX A
inverse, which might not make sense, and even if it makes
sense, it doesn't simplify. So, the basic rule is that you
have to multiply by A inverse on the left so that it cancels with
this A that's on the left. STUDENT QUESTION:
"And if you put it on the left on this side then it will be on
the left with B as well?" That's correct,
in our usual way of dealing with matrices,
where the vectors are column vectors.
It's just something to remember: if you have a square
matrix times a column vector, the product that makes sense is
with the matrix on the left, and the vector on the right.
The other one just doesn't work. You cannot take X times A if A
is a square matrix and X is a column vector.
This product AX makes sense. The other one XA doesn't make
sense. It's not the right size.
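Here is a minimal Python sketch of solving AX = B by multiplying on the left by A inverse. The 2x2 matrix, its inverse (written down by hand), and the right-hand side are hypothetical examples, not the matrices from the lecture.

```python
def mat_mul(A, B):
    if len(A[0]) != len(B):
        raise ValueError("width of A must equal height of B")
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)]
            for row in A]

# Hypothetical system: A is 2x2, B is a known column vector.
A = [[2, 1],
     [1, 1]]
A_inv = [[1, -1],
         [-1, 2]]      # inverse of A, written down by hand for this example
B = [[3],
     [2]]

print(mat_mul(A_inv, A))   # the identity matrix

X = mat_mul(A_inv, B)      # A inverse goes on the LEFT of B
print(X)
print(mat_mul(A, X) == B)  # X really solves A X = B
```

Trying `mat_mul(X, A)` raises the size error: X has width one and A has height two, so the product in that order is not defined.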
OK. What we need to do is to learn
how to invert a matrix. It's a useful thing to know,
first for your general knowledge, and second because
it's actually useful for things we'll see later in this class.
In particular, on the exam,
you will need to know how to invert a matrix by hand.
This formula is actually good for small matrices,
3x3, 4x4. It's not good at all if you
have a matrix of size 1,000x1,000.
So, in computer software, actually for small matrices
they do this, but for larger matrices,
they use other algorithms. Let's just see how we do it.
First of all I will give you the final answer.
And of course I will need to explain what the answer means.
We will have to compute something called the adjoint
matrix. I will tell you how to do that.
And then, we will divide by the determinant of A.
How do we get to the adjoint matrix?
Let's go through the steps on a 3x3 example -- the steps are the
same no matter what the size is, but let's do 3x3.
So, let's say that I'm giving you the matrix A -- let's say
it's the same as the one that I erased earlier.
That was the one relating our X's and our U's.
The first thing I want to do is find something called the
minors. What's a minor?
It's a slightly smaller determinant.
We've already seen them without calling them that way.
The matrix of minors will have again the same size.
Let's say we want this entry. Then, we just delete this row
and this column, and we are left with a 2x2
determinant. So, here, we'll put the
determinant |4, 5; 1, 2|, which is 4 times 2,
that's 8, minus 5 times 1: so, 3.
Let's do the next one. So, for this entry,
I'll delete this row and this column.
I'm left with |2, 5; 1, 2|. The determinant will be 2 times
2 minus 5, which is negative 1. The third one is minus 2
(I'm left with |2, 4; 1, 1|, which is 2 minus 4).
Then I get to the second row, so I get to this entry.
To find the minor here, I will delete this row and this
column. And I'm left with |3, 3; 1, 2|.
3 times 2 minus 3 is 3. Let me just cheat and give you
the others -- I think I've shown you that I can do them.
Let's just explain the last one again.
The last one is 2. To find the minor here,
I delete this column and this row, and I take this
determinant: 2 times 4 minus 2 times 3.
So it's the same kind of manipulation that we've seen
when we've taken determinants and cross products.
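The full matrix of minors, including the entries skipped above, can be computed with a short Python sketch. The function names are illustrative assumptions; the matrix A is the one relating the x's and the u's.

```python
def det2(m):
    """Determinant of a 2x2 matrix given as two rows."""
    (a, b), (c, d) = m
    return a * d - b * c

def minor(A, i, j):
    """2x2 determinant left after deleting row i and column j of a 3x3 matrix."""
    rest = [[A[r][c] for c in range(3) if c != j]
            for r in range(3) if r != i]
    return det2(rest)

A = [[2, 3, 3],
     [2, 4, 5],
     [1, 1, 2]]

minors = [[minor(A, i, j) for j in range(3)] for i in range(3)]
for row in minors:
    print(row)
```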
Step two: we go to another matrix that's called cofactors.
So, the cofactors are pretty much the same thing as the
minors except the signs are slightly different.
What we do is that we flip signs according to a
checkerboard diagram. You start with a plus in the
upper left corner, and you alternate pluses and
minuses. The rule is:
if there is a plus somewhere, then there's a minus next to it
and below it. And then, below a minus or to
the right of a minus, there's a plus.
So that's how it looks in size 3x3.
What do I mean by that? I don't mean,
make this positive, make this negative,
and so on. That's not what I mean.
What I mean is: if there's a plus,
that means leave it alone -- we don't do anything to it.
If there's a minus, that means we flip the sign.
So, here, we'd get: 3, 1, -2 in the first row;
-3, 1, 1 in the second row; and 3, -4, 2 in the third row.
OK, that step is pretty easy. The only hard step in terms of
calculations is the first one because you have to compute all
of these 2x2 determinants.

By the way, this minus sign here is actually related to the
way in which, when we do a cross product,
we have a minus sign for the second entry.
OK, we're almost done. The third step is to transpose.
What does it mean to transpose? It means: you read the rows of
your matrix and write them as columns, or vice versa.
So we switch rows and columns. What do we get?
Well, let's just read the matrix horizontally and write it
vertically. We read 3, 1, -2 and write it as the first column.
Then we read -3, 1, 1: that's the second column.
Then 3, -4, 2: the third column. That's pretty easy.
We're almost done. What we get here is the
adjoint matrix. So, the fourth and last step is
to divide by the determinant of A.
We have to compute the determinant -- the determinant
of A, not the determinant of this guy.
So: |2, 3, 3; 2, 4, 5; 1, 1, 2|. I'll let you check my
computation. I found that it's equal to 3.
So the final answer is that A inverse is one third of the
matrix we got there: |3, -3, 3; 1, 1, -4;
-2, 1, 2|. Now, remember,
A told us how to find the u's in terms of the x's.
This tells us how to find the x's in terms of the u's:
if you multiply u1, u2, u3 by this you get x1, x2, x3.
It also tells you how to solve a linear system:
A times X equals something.
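The four steps (minors, cofactors, transpose, divide by the determinant) can be put together in a short Python sketch. This is only an illustration for the 3x3 case; the function names are assumptions, and exact fractions are used to avoid rounding.

```python
from fractions import Fraction

def minor(A, i, j):
    """2x2 determinant left after deleting row i and column j of a 3x3 matrix."""
    rows = [ri for ri in range(3) if ri != i]
    cols = [ci for ci in range(3) if ci != j]
    (a, b), (c, d) = [[A[ri][ci] for ci in cols] for ri in rows]
    return a * d - b * c

def inverse3(A):
    # Step 1: matrix of minors.
    minors = [[minor(A, i, j) for j in range(3)] for i in range(3)]
    # Step 2: cofactors, flipping signs on the checkerboard pattern.
    cofactors = [[(-1) ** (i + j) * minors[i][j] for j in range(3)]
                 for i in range(3)]
    # Step 3: transpose to get the adjoint matrix.
    adjoint = [list(col) for col in zip(*cofactors)]
    # Step 4: divide by det(A), expanded here along the first row.
    det = sum(A[0][j] * cofactors[0][j] for j in range(3))
    if det == 0:
        raise ValueError("matrix is not invertible")
    return [[Fraction(entry, det) for entry in row] for row in adjoint]

def mat_mul(A, B):
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)]
            for row in A]

A = [[2, 3, 3],
     [2, 4, 5],
     [1, 1, 2]]

A_inv = inverse3(A)
print(A_inv)
print(mat_mul(A, A_inv))              # the 3x3 identity
print(mat_mul(A_inv, [[8], [11], [4]]))  # recover x1, x2, x3 from u1, u2, u3
```

The last line uses u = (8, 11, 4), which is what A gives for x = (1, 1, 1), so multiplying by the inverse returns the original x's.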