Uploaded by
MIT on 16.01.2009
so -- OK, so remember last time,
on Tuesday we learned about the chain rule,
and so for example we saw that if we have a function w that
depends, sorry, on three variables,
x, y, z, where x, y, z themselves depend on
some variable, t,
then you can find a formula for dw/dt by writing down
w_x dx/dt + w_y dy/dt + w_z dz/dt. And, the meaning of that
formula is that, well, the change in w is caused by changes in x,
y, and z; x, y, and z change at rates dx/dt,
dy/dt, dz/dt. And, this causes the function to
change accordingly; well, the partial derivatives
tell you how sensitive w is to changes in each variable.
OK, so, we are going to just rewrite this in a new notation.
So, I'm going to rewrite this in a more concise form as
gradient of w dot product with velocity vector dr/dt.
So, the gradient of w is a vector formed by putting
together all of the partial derivatives.
OK, so it's the vector whose components are the partials.
And, of course, it's a vector that depends on
x, y, and z, right? These guys depend on x, y, z.
So, it's actually one vector for each point,
x, y, z. You can talk about the gradient
of w at some point, x, y, z.
So, at each point, it gives you a vector.
That actually is what we will call later a vector field.
We'll get back to that later. And, dr/dt is just the velocity
vector dx/dt, dy/dt, dz/dt.
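As a quick numerical sanity check of the chain rule written as gradient of w dot dr/dt, here is a minimal sketch. The function w = xy + z^2 and the path r(t) = (cos t, sin t, t) are not from the lecture; they are assumed examples chosen only so that both sides can be computed.

```python
import math

# Assumed example function and path (not from the lecture).
def w(x, y, z):
    return x * y + z ** 2

def grad_w(x, y, z):
    # Partial derivatives (w_x, w_y, w_z), computed by hand.
    return (y, x, 2 * z)

def r(t):
    # Position (x(t), y(t), z(t)).
    return (math.cos(t), math.sin(t), t)

def velocity(t):
    # dr/dt = (dx/dt, dy/dt, dz/dt), computed by hand.
    return (-math.sin(t), math.cos(t), 1.0)

def dot(a, b):
    return sum(p * q for p, q in zip(a, b))

t0 = 0.7
chain_rule = dot(grad_w(*r(t0)), velocity(t0))

# Central finite difference of t -> w(r(t)) as an independent check.
h = 1e-6
finite_diff = (w(*r(t0 + h)) - w(*r(t0 - h))) / (2 * h)

print(chain_rule, finite_diff)
```

The two printed numbers should agree to many digits, which is all the formula dw/dt = gradient w dot dr/dt is claiming.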
OK, so the new definition for today is the definition of the
gradient vector. And, our goal will be to
understand a bit better, what does this vector mean?
What does it measure? And, what can we do with it?
But, you see that in terms of information content,
it's really the same information that's already in
the partial derivatives, or in the differential.
So, yes, and I should say, of course you can also use the
gradient in other things like approximation formulas and so
on. And so far, it's just notation.
It's a way to rewrite things. But, so here's the first cool
property of the gradient. So, I claim that the gradient
vector is perpendicular to the level surface corresponding to
setting the function, w, equal to a constant.
OK, so if I draw a contour plot of my function,
so, actually forget about z because I want to draw a two
variable contour plot. So, say I have a function of
two variables, x and y, then maybe it has some
contour plot. And, I'm saying if I take the
gradient of the function at this point, (x, y),
I will have a vector. Well, if I draw that vector on
top of a contour plot, it's going to end up being
perpendicular to the level curve.
Same thing if I have a function of three variables.
Then, I can try to draw its contour plot.
Of course, I can't really do it because the contour plot would
be living in space with x, y, and z.
But, it would be a bunch of level surfaces, and the gradient
vector would be a vector in space.
That vector is perpendicular to the level surfaces.
So, let's try to see that on a couple of examples.
So, let's do a first example. What's the easiest case?
Let's take a linear function of x, y, and z.
So, I will take w equals a1 times x plus a2 times y plus a3
times z. Well, so, what's the gradient
of this function? Well, the first component will
be a1. That's partial w partial x.
Then, a2, that's partial w partial y, and a3,
partial w partial z. Now, what are the level surfaces of this?
Well, if I set w equal to some constant, c, that means I look
at the points where a1 x + a2 y + a3 z equals c.
What kind of surface is that? It's a plane.
And, we know how to find a normal vector to this plane just
by looking at the coefficients. So, it's a plane whose normal
vector is exactly this gradient. And, in fact,
in a way, this is the only case you need to check because of
linear approximations. If you replace a function by
its linear approximation, that means you will replace the
level surfaces by their tangent planes.
And then, you'll actually end up in this situation.
But maybe that's not very convincing.
So, let's do a second example.
Let's say we look at the function w = x^2 + y^2.
OK, so now it's a function of just two variables because that
way we'll be able to actually draw a picture for you.
OK, so what are the level sets of this function?
Well, they're going to be circles, right?
w equals c is a circle, x^2 + y^2 = c.
So, I should say, maybe, sorry,
the level curve is a circle. So, the contour plot looks
something like that. Now, what's the gradient vector?
Well, the gradient of this function, so,
partial w partial x is 2x. And partial w partial y is 2y.
So, let's say I take a point, (x, y), and I try to draw my
gradient vector. So, here at (x, y),
I have to draw the vector <2x, 2y>.
What does it look like?
Well, it's going in that direction.
It's parallel to the position vector for this point.
It's actually twice the position vector.
So, I guess it goes more or less like this.
What's interesting, too, is it is perpendicular to
this circle. OK, so it's a general feature.
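To see the perpendicularity for w = x^2 + y^2 in numbers, here is a small sketch. It samples points on the level curve x^2 + y^2 = c and takes the dot product of the gradient <2x, 2y> with the tangent direction of the circle. The value c = 4 and the eight sample points are arbitrary choices made for illustration.

```python
import math

def grad_w(x, y):
    # Gradient of w = x^2 + y^2.
    return (2 * x, 2 * y)

c = 4.0                      # arbitrary level, w = c
radius = math.sqrt(c)
dots = []
for k in range(8):
    theta = 2 * math.pi * k / 8
    x, y = radius * math.cos(theta), radius * math.sin(theta)
    # Direction in which the circle is heading at (x, y).
    tangent = (-math.sin(theta), math.cos(theta))
    g = grad_w(x, y)
    dots.append(g[0] * tangent[0] + g[1] * tangent[1])

# Largest dot product in absolute value; zero up to rounding.
print(max(abs(d) for d in dots))
```

Every dot product comes out as zero up to rounding, so the gradient is perpendicular to the circle at each sampled point.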
Actually, let me show you more examples, oops,
not the one I want. So, I don't know if you can see
it so well. Well, hopefully you can.
So, here I have a contour plot of a function,
and I have a blue vector. That's the gradient vector at
the pink point on the plot. So, you can see,
I can move the pink point, and the gradient vector,
of course, changes because the gradient depends on x and y.
But, what doesn't change is that it's always perpendicular
to the level curves. Anywhere I am,
my gradient stays perpendicular to the level curve.
OK, is that convincing? Is that visible for people who
can't see blue? OK, so, OK, so we have a lot of
evidence, but let's try to prove the theorem because it will be
interesting. So, first of all,
sorry, any questions about the statement, the example,
anything, yes? Ah, very good question.
Does the gradient vector, why is the gradient vector
perpendicular in one direction rather than the other?
So, we'll see the answer to that in a few minutes.
But let me just tell you immediately, on the side,
which side it's pointing to: it's always pointing towards
higher values of the function. OK, and we'll see that in maybe
about half an hour. So, well, let me say it actually
points towards higher values of w.
OK, any other questions? I don't see any questions.
OK, so let's try to prove this theorem, at least this part of
the theorem. We're not going to prove that
just yet. That will come in a while.
So, well, maybe we want to understand first what happens if
we move inside the level curve, OK?
So, let's imagine that we are taking a moving point that stays
on the level curve or on the level surface.
And then, we know, well, what happens is that the
function stays constant. But, we can also know how
quickly the function changes using the chain rule up there.
So, maybe the chain rule will actually be the key to
understanding how the gradient vector and the motion on the
level surface relate. So, let's take a curve,
r equals r of t, that stays inside,
well, maybe I should say on the level surface,
w equals c. So, let's think about what that
means. So, just to get you used to
this idea, I'm going to draw a level surface of a function of
three variables. OK, so it's a surface given by
the equation w of x, y, z equals some constant,
c. And, so now I'm going to have a
point on that, and it's going to move on that
surface. So, I will have some parametric
curve that lives on this surface.
So, the question is, what's going to happen at any
given time? Well, the first observation is
that the velocity vector, what can I say about the
velocity vector of this motion? It's going to be tangent to the
level surface, right?
If I move on a surface, then at any point,
my velocity is tangent to the curve.
But, if it's tangent to the curve, then it's also tangent to
the surface because the curve is inside the surface.
So, OK, it's getting a bit cluttered.
Maybe I should draw a bigger picture.
Let me do that right away here. So, I have my level surface,
w equals c. I have a curve on that,
and at some point, I'm going to have a certain
velocity. So, the claim is that the
velocity, v = dr/dt, is tangent
to the level surface, w equals c, because it's tangent
to the curve, and the curve is inside the
level surface, OK?
Now, what else can we say? Well, we have,
the chain rule will tell us how the value of w changes.
So, by the chain rule, we have dw/dt.
So, the rate of change of the value of w as I move along this
curve is given by the dot product between the gradient and
the velocity vector. And, so, well,
maybe I can rewrite it as gradient w dot v, and that should be,
well, what should it be? What happens to the value of w
as t changes? Well, it stays constant because
we are moving on a curve. That curve might be
complicated, but it stays always on the level,
w equals c. So, it's zero because w of t
equals c, which is a constant. OK, is that convincing?
OK, so now if we have a dot product that's zero,
that tells us that these two guys are perpendicular.
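Here is a minimal sketch of that argument with actual numbers. Everything specific in it is an assumption made for illustration: the function w = x^2 + y^2 + z^2, the level surface w = 1 (the unit sphere), and a wiggly curve chosen so that it stays on that sphere.

```python
import math

def grad_w(x, y, z):
    # Gradient of the assumed example w = x^2 + y^2 + z^2.
    return (2 * x, 2 * y, 2 * z)

def r(t):
    # An assumed curve that stays on the level surface w = 1.
    return (math.cos(t) * math.cos(3 * t),
            math.sin(t) * math.cos(3 * t),
            math.sin(3 * t))

def v(t):
    # Velocity dr/dt, differentiated by hand.
    return (-math.sin(t) * math.cos(3 * t) - 3 * math.cos(t) * math.sin(3 * t),
            math.cos(t) * math.cos(3 * t) - 3 * math.sin(t) * math.sin(3 * t),
            3 * math.cos(3 * t))

samples = [0.0, 0.4, 1.1, 2.5]
# dw/dt = gradient w dot v at each sample time.
dw_dt = [sum(g * q for g, q in zip(grad_w(*r(t)), v(t))) for t in samples]
print(dw_dt)
```

The velocity is far from zero at these times, yet each dot product vanishes up to rounding, which is the statement that the gradient is perpendicular to v.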
So -- So the gradient vector is perpendicular to v.
OK, that's a good start. We know that the gradient is
perpendicular to this vector that's tangent to the
level surface. What about other vectors
tangent to the level surface? Well, in fact,
I could use any curve drawn on the level surface, w equals c.
So, I could move, really, any way I wanted on
that surface. In particular,
I claim that I could have chosen my velocity vector to be
any vector tangent to the surface.
OK, so let's write this. So this is true for any curve,
or, I'll say for any motion on the level surface,
w equals c. So that means v can be any
vector tangent to the level surface.
See, for example, OK, let me draw one more
picture. OK, so I have my level surface.
So, I'm drawing more and more levels, and they never quite
look the same. But I have a point.
And, at this point, I have the tangent plane to the
level surface. OK, so this is the tangent plane to
the level surface. Then, I can choose any vector in
that tangent plane. Let's say I choose the one that
goes in that direction. Then, I can actually find a
curve that goes in that direction, and stays on the
level. So, here, that would be a curve
that somehow goes from the right to the left, and of course it
has to end up going up or something like that.
OK, so given any vector, let's call that vector v,
tangent to the level surface, we get that the gradient is
perpendicular to v. So, the gradient is
perpendicular to this vector tangent to this curve,
but also to any vector that I can draw tangent to my
surface. So, what does that mean?
Well, that means the gradient is actually perpendicular to the
tangent plane or to the surface at this point.
So, the gradient is perpendicular.
And, well, here, I've illustrated things with a
three-dimensional example, but really it works the same if
you have only two variables. Then you have a level curve
that has a tangent line, and the gradient is
perpendicular to that line. OK, any questions?
No? OK, so, let's see.
That's actually pretty neat because there is a nice
application of this, which is to try to figure out,
now we know, actually, how to find the
tangent plane to anything, pretty much.
OK, so let's see. So, let's say that,
for example, I want to find -- -- the
tangent plane -- -- to the surface with equation,
let's say, x^2 + y^2 - z^2 = 4 at the point (2, 1, 1).
Let me write that.
So, how do we do that? Well, one way that we already
know, if we solve this for z,
so we can write z equals a function of x and y,
then we know tangent plane approximation for the graph of a
function, z equals some function of x and
y. But, that doesn't look like
it's the best way to do it. OK, the best way to do it,
now that we have the gradient vector, is actually to directly
say, oh, we know the normal vector to this plane.
The normal vector will just be the gradient.
Oh, I think I have a cool picture to show.
OK, so that's what it looks like.
OK, so here you have the surface x^2 + y^2 - z^2 equals four.
That's called a hyperboloid because it looks like what you
get when you spin a hyperbola around an axis.
And, here's a tangent plane at the given point.
So, it doesn't look very tangent because it crosses the
surface. But, it's really,
if you think about it, you will see it's really the
plane that's approximating the surface in the best way that you
can at this given point. It is really the tangent plane.
So, how do we find this plane? Well, you can plot it on a
computer. That's not exactly how you
would look for it in the first place.
So, the way to do it is that we compute the gradient.
So, a gradient of what? Well, a gradient of this
function. OK, so I should say,
this is the level set, w equals four,
where w equals x^2 + y^2 - z^2. And so, we know that the
gradient of this, well, what is it?
2x, then 2y, and then negative 2z.
So, at this given point, I guess we are at x equals two.
So, that's four. And then, y and z are one.
So, two, negative two. OK, and that's going to be the
normal vector to the surface or to the tangent plane.
That's one way to define the tangent plane.
All right, it has the same normal vector as the surface.
That's one way to define the normal vector to the surface,
if you prefer. Being perpendicular to the
surface means that you are perpendicular to its tangent
plane. OK, so the equation is,
well, 4x + 2y - 2z equals something, where something is,
well, we should just plug in that point.
We'll get eight plus two minus two; looks like we'll get eight.
And, of course, we could simplify dividing
everything by two, but it's not very important
here. OK, so now if you have a
surface given by an evil equation,
and a point on the surface, well, you know how to find the
tangent plane to the surface at that point.
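The recipe for the hyperboloid example can be sketched in a few lines: evaluate the gradient at the point to get the normal vector, then plug the point in to get the right-hand side. The helper name grad_w is just a label chosen here.

```python
def grad_w(x, y, z):
    # Gradient of w = x^2 + y^2 - z^2.
    return (2 * x, 2 * y, -2 * z)

point = (2, 1, 1)
normal = grad_w(*point)

# Tangent plane: normal . (x, y, z) = normal . point
rhs = sum(n * p for n, p in zip(normal, point))

print(normal, rhs)   # (4, 2, -2) 8
```

So the tangent plane is 4x + 2y - 2z = 8, matching the computation done by hand.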
OK, any questions? No.
OK, let me give just another reason why, another way that we
could have seen this. So, I claim,
in fact, we could have done this without the gradient,
or using the gradient in a somehow disguised way.
So, here's another way. So, the other way to do it
would be to start with a differential,
OK? dw, well, it's pretty much the
same content, but let me write it as a
differential, dw is 2x dx + 2y dy - 2z dz.
So, at a given point, at (2, 1, 1),
this is 4 dx + 2 dy - 2 dz. Now, if we want to change this
into an approximation formula, we can.
We know that the change in w is approximately equal to 4 delta x
+ 2 delta y - 2 delta z. OK, so when do we stay on the
level surface? Well, we stay on the level
surface when w doesn't change, so, when this becomes zero,
OK? Now, what does this
approximation sign mean? Well, it means for small
changes in x, y, z, this guy will be close to
that guy. It also means something else.
Remember, these approximation formulas, they are linear
approximations. They mean that we replace the
function, actually, by some closest linear formula
that will be nearby. And so, in particular,
if we set this equal to zero instead of approximately zero,
it means we'll actually be moving on the tangent plane to
the level set. If you want, strict equality
in the approximation means that we replace the function by its
tangent approximation.
So -- [APPLAUSE] OK, so the level corresponds to
delta w equals zero, and its tangent plane
corresponds to four delta x plus two delta y minus two delta z
equals zero. That's what I'm trying to say,
basically. And, what's delta x?
Well, that means it's the change in x.
So, what's the change in x here? That means, well,
we started with x equals two, and we moved to some other
value, x. So, that's actually x - 2, right?
That's how much x has changed compared to 2.
So, we get four times (x - 2) plus two times (y - 1) minus two
times (z - 1) = 0.
That's the equation of a tangent plane.
It's the same equation as the one over there.
These are just two different methods to get it.
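To make the approximation point of view concrete, here is a small sketch comparing the true change in w = x^2 + y^2 - z^2 near (2, 1, 1) with the linear formula 4 delta x + 2 delta y - 2 delta z. The two displacement vectors are arbitrary small values picked for illustration; the second one is chosen so that the linear formula gives exactly zero, meaning it lies in the tangent plane.

```python
def w(x, y, z):
    return x ** 2 + y ** 2 - z ** 2

def linear_change(dx, dy, dz):
    # Delta w is approximately 4 dx + 2 dy - 2 dz at (2, 1, 1).
    return 4 * dx + 2 * dy - 2 * dz

base = (2.0, 1.0, 1.0)

# An arbitrary small displacement: compare actual and approximate change.
step = (1e-3, -2e-3, 5e-4)
actual = w(*(b + s for b, s in zip(base, step))) - w(*base)
approx = linear_change(*step)
error = abs(actual - approx)

# A displacement inside the tangent plane: 4 dx + 2 dy - 2 dz = 0.
in_plane = (1e-3, 1e-3, 3e-3)
drift = w(*(b + s for b, s in zip(base, in_plane))) - w(*base)

print(error, drift)
```

Both numbers are of the order of the squared step size, much smaller than the steps themselves: moving in the tangent plane keeps w constant to first order, though not exactly.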
OK, so this one explains to you what's going on in terms of
approximation formulas. This one goes right away,
by using the gradient vector. So, in a way,
with this one, you don't have to think nearly
as much. But, you can use either one.
OK, questions? No?
OK, so let's move on to a new topic, which is another
application of the gradient vector, and that is directional
derivatives.
OK, so let's say that we have a function of two variables,
x and y. Well, we know how to compute
partial w over partial x or partial w over partial y,
which measure how w changes if I move in the direction of the x
axis or in the direction of the y axis.
So, what about moving in other directions?
Well, of course, we've seen other approximation
formulas and so on. But, we can still ask,
is there a derivative in every direction?
And that's basically, yes, that's the directional
derivative. OK, so these are derivatives in
the direction of i hat or j hat, the vectors that go along the x
or the y axis. So, what if we move in another
direction, let's say, the direction of some unit
vector, let's call it u. OK, so if I give you a unit
vector, you can ask yourself, if I move in that direction,
how quickly will my function change?
So -- So, let's look at a straight-line trajectory.
What this should mean is I start at some value,
x, y, and there I have my vector u.
And, I'm going to move in a straight line in the direction
of u. And, I have the graph of my
function -- -- and I'm asking myself how quickly does the
value change when I move on the graph in that direction?
OK, so let's look at a straight-line trajectory. So,
we have a position vector, r, that will depend on some
parameter which I will call s. You'll see why very soon,
in such a way that the derivative is this given unit
vector u hat. So, why do I use s for my
parameter rather than t? Well, it's a convention.
I'm moving at unit speed along this line.
So that means that actually, I'm parameterizing things by
the distance that I've traveled along a curve,
sorry, along this line. So, here it's called s in the
sense of arc length. Actually, it's not really an
arc because it's a straight line, so it's the distance along
the line. OK, so because we are
parameterizing by distance, we are just using s as a
convention just to distinguish it from other situations.
And, so, now, the question will be,
what is dw/ds? What's the rate of change of w
when I move like that? Well, of course we know the
answer because that's a special case of the chain rule.
So, that's how we will actually compute it.
But, in terms of what it means, it really means we are asking
ourselves, we start at a point and we
change the variables in a certain direction,
which is not necessarily the x or the y direction,
but really any direction. And then, what's the derivative
in that direction? OK, does that make sense as a
concept? Kind of?
I see some faces that are not completely convinced.
So, maybe I should show more pictures.
Well, let me first write down a bit more and show you something.
So I just want to give you the actual definition.
Sorry, first of all in case you wonder what this is all about,
so let's say the components of our unit vector are two numbers,
a and b. Then, it means we'll move along
the line x of s equals some initial value x0,
the point where we actually want the directional derivative,
plus s times a, or I meant to say plus a times
s. And, y of s equals y0 + b s.
And then, we plug that into w. And then we take the derivative.
So, we have a notation for that which is going to be dw/ds with
a subscript in the direction of u to indicate in which direction
we are actually going to move. And, that's called the
directional derivative -- -- in the direction of u.
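The definition can be tried out directly: move along the line x(s) = x0 + a s, y(s) = y0 + b s, plug into w, and differentiate in s. The sketch below does that with a finite difference. The function w = x^2 + y^2, the point (1, 2), and the unit vector <3/5, 4/5> are assumed examples, and the helper name directional_derivative is made up for this illustration.

```python
def w(x, y):
    # Assumed example function.
    return x ** 2 + y ** 2

def directional_derivative(f, x0, y0, a, b, h=1e-6):
    # Differentiate s -> f(x0 + a*s, y0 + b*s) at s = 0,
    # using a central difference with step h.
    return (f(x0 + a * h, y0 + b * h) - f(x0 - a * h, y0 - b * h)) / (2 * h)

a, b = 3 / 5, 4 / 5          # components of a unit vector u = <a, b>
slope = directional_derivative(w, 1.0, 2.0, a, b)
print(slope)                 # close to 4.4
```

For this example the exact value is 2(1)(3/5) + 2(2)(4/5) = 4.4, so the printed slope should be very close to that.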
OK, so, let's see what it means geometrically.
So, remember, we've seen things about partial
derivatives, and we see that the partial
derivatives are the slopes of slices of the graph by vertical
planes that are parallel to the x or the y directions.
OK, so, if I have a point, at any point,
I can slice the graph of my function by two planes,
one that's going along the x, one along the y direction.
And then, I can look at the slices of the graph.
Let me see if I can use that thing.
So, we can look at the slices of the graph that are drawn
here. In fact, we look at the tangent
lines to the slices, and we look at the slope and
that gives us the partial derivatives in case you are on
that side and want to see also the pointer that was here.
So, now, similarly, the directional derivative
means, actually, we'll be slicing our graph by
the vertical plane. It's not really colorful,
something more colorful. We'll be slicing things by a
plane that is now in the direction of this vector,
u, and we'll be looking at the slope of the slice of the graph.
So, here's what that looks like; that's the same applet
that you've used on your problem set in case you are
wondering. So, now, I'm picking a point on
the contour plot. And, at that point,
I slice the graph. So, here I'm starting by
slicing in the direction of the x axis.
So, in fact, what I'm measuring here by the
slope of the slice is the partial in the x direction.
It's really partial f partial x, which is also the directional
derivative in the direction of i.
And now, if I rotate the slice, then I have all of these
planes. So, you see at the bottom left,
I have the direction in which I'm going.
There's this, like, rotating line that tells
you in which direction I'm going to be moving.
And for each direction, I have a plane.
And, when I slice by that plane, I will get,
so I have this direction here going maybe to the southwest.
So, that gives me a slice of my graph by a vertical plane,
and the slice has a certain slope.
And, the slope is going to be the directional derivative in
that direction. OK, I think that's as graphic
as I can get. OK, any questions about that?
No? OK, so let's see how we compute
that guy. So, let me just write again
just in case you want to, in case you didn't hear me it's
the slope of the slice of the graph by a vertical plane -- --
that contains the given direction,
that's parallel to the direction, u.
So, how do we compute it? Well, we can use the chain rule.
The chain rule implies that dw/ds is actually the gradient
of w dot product with the velocity vector dr/ds.
But, remember we say that we are going to be moving at unit
speed in the direction of u. So, in fact,
that's just gradient w dot product with the unit vector u.
OK, so the formula that we remember is really dw/ds in the
direction of u is gradient w dot product with u.
And, maybe I should also say in words, this is the component of
the gradient in the direction of u.
And, maybe that makes more sense.
So, for example, the directional derivative in
the direction of i hat is the component along the x axis.
That's the same as, indeed, the partial derivative
in the x direction. Things make sense.
dw/ds in the direction of i hat is, sorry, gradient w dot i hat,
which is w_x, or maybe I should write, partial w over partial x.
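A short sketch of the formula dw/ds = gradient w dot u, checking that the directions i hat and j hat give back the partial derivatives. The function w = x^2 + y^2 and the point (1, 2) are assumptions chosen for illustration.

```python
import math

def grad_w(x, y):
    # Gradient of the assumed example w = x^2 + y^2.
    return (2 * x, 2 * y)

def directional_derivative(grad, u):
    # dw/ds in the direction u is gradient w dot u (u must be a unit vector).
    return grad[0] * u[0] + grad[1] * u[1]

g = grad_w(1.0, 2.0)
along_i = directional_derivative(g, (1.0, 0.0))   # partial w / partial x
along_j = directional_derivative(g, (0.0, 1.0))   # partial w / partial y
diag = directional_derivative(g, (1 / math.sqrt(2), 1 / math.sqrt(2)))

print(along_i, along_j, diag)
```

The first two values are just the components of the gradient, 2.0 and 4.0, and the diagonal direction mixes them with equal weights.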
OK, now, so that's basically what we need to know to compute
these guys. So now, let's go back to the
gradient and see what this tells us about the gradient.
[APPLAUSE] I see you guys are having fun.
OK, OK, let's do a little bit of geometry here.
That should calm you down. So, we said dw/ds in the
direction of u is gradient w dot u.
That's the same as the length of gradient w times the length
of u, well, that happens to be one
because we are taking a unit vector, times the cosine of the
angle between the gradient and the given unit vector,
u. So, we have this angle, theta. OK, that's another way of
saying we are taking the component of the gradient in the
direction of u. But now, what does that tell us?
Well, let's try to figure out in
which directions w changes the fastest,
in which direction it increases the most or decreases the most,
or doesn't actually change. So, when is this going to be
the largest? If I fix a point,
if I set a point, then the gradient vector at
that point is given to me. But, the question is,
in which direction does it change the most quickly?
Well, what I can change is the direction, and this will be the
largest when the cosine is one. So, this is largest when the
cosine of the angle is one. That means the angle is zero.
That means u is actually in the direction of the gradient.
OK, so that's a new way to think about the direction of a
gradient. The gradient is the direction
in which the function increases the most quickly at that point.
So, the direction of gradient w is the direction of fastest
increase of w at the given point.
And, what is the magnitude of the gradient of w? Well, it's actually the
directional derivative in that direction.
OK, so if I go in that direction, which gives me the
fastest increase, then the corresponding slope
will be the length of the gradient.
And, what's the direction of the fastest decrease?
It's going in the opposite direction, right?
I mean, if you are on a mountain, and you know that you
are facing the mountain, that's the direction of fastest
increase. The direction of fastest
decrease is behind you straight down.
OK, so, the minimal value of dw/ds is achieved when cosine of
theta is minus one. That means theta equals 180°.
That means u is in the direction of minus the gradient.
It points opposite to the gradient.
And, finally, when do we have dw/ds equals
zero? So, in which direction does the
function not change? Well, we have two answers to
that. One is to just use the formula.
So, that's when cosine theta equals zero.
That means theta equals 90°. That means that u is
perpendicular to the gradient. The other way to think about
it, the direction in which the value doesn't change is a
direction that's tangent to the level surface.
If we are not changing w, it means we are moving along
the level. And, that's the same thing --
-- as being tangent to the level.
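These three statements can be checked by brute force: fix a gradient vector, let the unit vector u = <cos theta, sin theta> sweep through all directions, and look at where gradient w dot u is largest, smallest, and zero. The gradient <2, 4> and the grid of 3600 angles are arbitrary choices for this sketch.

```python
import math

g = (2.0, 4.0)                    # assumed gradient vector at some point
g_len = math.hypot(*g)            # length of the gradient
g_angle = math.atan2(g[1], g[0])  # direction of the gradient

def slope(theta):
    # dw/ds for the unit vector u = <cos theta, sin theta>.
    return g[0] * math.cos(theta) + g[1] * math.sin(theta)

angles = [2 * math.pi * k / 3600 for k in range(3600)]
best = max(angles, key=slope)     # direction of fastest increase
worst = min(angles, key=slope)    # direction of fastest decrease
flat = slope(g_angle + math.pi / 2)   # perpendicular to the gradient

print(best, g_angle, slope(best), g_len, flat)
```

Up to the grid resolution, the best direction is the gradient direction and the slope there is the length of the gradient; the worst direction is the opposite one, and the perpendicular direction gives slope zero.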
So, let me just show that on the picture here.
So, if I actually show you the gradient, you can't really see
it here. I need to move it a bit.
So, the gradient here is pointing straight up at the
point that I have chosen. Now, if I choose a slice
in a direction that's
perpendicular to the gradient,
so that's actually tangent to
the level curve, then you see that my slice is
flat. I don't actually have any slope.
The directional derivative in a direction that's perpendicular
to the gradient is basically zero.
Now, if I rotate, then the slope sort of
increases, increases, increases, and it becomes the
largest when I'm going in the direction of the gradient.
So, here, I have, actually, a pretty big slope.
And now, if I keep rotating, then the slope will decrease
again. Then it becomes zero when I'm
perpendicular, and then it becomes negative.
It's the most negative when I'm pointing away from the gradient
and then becomes zero again when I'm back perpendicular.
OK, so for example, if I give you a contour plot,
and I ask you to draw the direction of the gradient
vector, well, at this point,
for example, you would look at the picture.
The gradient vector would be going perpendicular to the
level curve. And, it would be going towards
higher values of the function. I don't know if you can see the
labels, but the thing in the middle is a minimum.
So, it will actually be pointing in this kind of
direction. OK, so that's it for today.