On Tuesday, when I explained this stuff, I was motivated by
the geometric aspects of directional derivatives,
and graphs, and tangent planes and tangent lines.
So now, I want to start with sort of a slightly different
end, which is just one of the applications of this technique.
In other words, what is the real-life situation where we
might be interested in finding directional derivatives?
And, as I explained last time, the directional derivative is just
a fancy term for the rate of change.
So, here's a typical example which you may have already seen
in the book or in your section work, which is imagine a
mountain, OK, and there is an ocean
somewhere, next to the beach.
But, we'll talk about this later.
Focus on the mountain for now.
This is a mountain.
Now, if I draw it like this, it's not clear whether it is
a mountain or just a curve, right.
So, in order to give it an illusion of 3-D, of a
three-dimensional picture, what I usually do as a reflex, and
what you would normally do, or what you see when you read the
book, is that you draw some curves on it.
You are kind of trying to give it a three-dimensional feel, right.
So, my first point is what are these curves?
These are the level curves.
So, that's the first thing I want to say.
Which of course, I've said before, but I just want to
emphasize it one more time.
Even to visualize this three-dimensional object on the
plane, we find it very convenient, very useful to
imagine not just the contour, the general contour of this
object, but this collection of curves, which are kind of
parallel to each other.
What are they?
Well, they are just obtained by taking sections of the
mountain by parallel planes, by parallel horizontal
planes, which are parallel to the floor -- to the ground, OK
-- that's what they are.
Well, in fact, I have drawn just the visible parts
of those curves, right.
There is also a back side for each curve.
For example, this one also has a backside.
We don't see it unless it's a transparent object.
We don't see it, so that's why usually we indicate like this.
So, each of them is actually, has sort of a second
half, which is behind, which we don't see.
This, we also don't see when we look at the actual mountain,
but when we try to draw it, draw the picture, then we,
you know, we draw this.
Okay, so that's the first point.
So, the second point is what does this have to do with
directional derivatives.
The directional derivative is the rate of change.
So, change of what?
So, in this particular setting, there is a very
good example of this.
Which is that suppose that there is somebody here
on this mountain, who is climbing it, OK.
So, there is climber.
And, so the climber wants to decide which way they should
go, and depending on which way they go, what will be the rate
of, you know, how steep will the climb be.
That's the question.
So, the rate of change will be, here, the rate of the climber's
altitude change -- his or her altitude change, right.
And, the higher the rate, the steeper is the climb, in case
we're going towards the top of the mountain, or the
steeper the descent, if we're going down.
And, you know, likewise the smaller the rate of change, the
smaller is the steepness.
So, how do we measure this?
The point is there isn't a single number, there isn't a
single number to give us the steepness of the climb.
Because, the climber could go in many different directions,
and for each direction, there is a particular steepness rate.
For instance, the climber may be tired, and doesn't want to
climb anymore, so in that case the climber could just go
along the level curve.
So, then the altitude of the climber, the height of this
point over the sea level, will not change at all.
So, then the rate of change, or the steepness level, is 0,
steepness rate is 0, right.
So that's one possibility, that would correspond to
going in this direction.
But, on the other hand, the climber could choose
the steepest path, which would actually be, if you think
about it just intuitively, you could guess that it should be
something which is perpendicular to
the level curve.
In fact, this is something that we have confirmed
by calculation.
I will go over it one more time in a couple minutes.
So, in that case, the rate of change is the highest possible.
So, in this direction, the rate of change is 0.
In this direction, rate of change is maximal.
And, if we go in the direction which is perpendicular to the
level curve, if we go down, then it's going to be minimal,
the smallest possible.
Well, its absolute value is still going to be the highest
possible, but it's going to have a negative sign.
So, as a number, it will be the smallest possible value.
So here, the rate is minimal.
OK.
And, so in order to talk about the rate of
change, you have to choose a direction.
That's why it's called the directional derivative.
There isn't a single derivative for a function of
two variables, but there is a whole variety of derivatives.
A derivative is determined by the choice of
the direction, OK.
So, what are the variables here, which variables
am I talking about.
Well, the variables are somewhere here, so there is --
let's actually do it, let's do it on this board, because I
don't want to mess up the picture.
So, I want to draw the coordinate system.
And, let's say this plane -- the x, y plane
-- is the sea level.
And the z then will correspond to the height, to the altitude
above the sea level.
Now, our point has a projection onto the plane, and so
on the plane it has coordinates, x and y.
Maybe x0 and y0 to emphasize that these are some
fixed numbers.
So, these are the coordinates of -- this corresponds to the
position of the climber.
See, the point is that the climber is in space, right.
So, a priori, the climber has three coordinates
-- x, y and z.
But, because the climber is not flying, you know, is not
jumping with a parachute, he is on the mountain.
Because he is on the mountain, as soon as we know the x, y
coordinates, we know the z coordinate, unless the mountain
has a shape which sort of comes back, right.
But normally, for a normal mountain, for a normal
mountain it's not going to happen, right.
So, then the z coordinate is actually determined by
the x and y coordinates.
That's why the only parameters here are x
and y, not x, y and z.
In fact, you can think of this surface as a
graph of a function.
The graph represents the mountain, OK.
And, so now when we talk about the direction, we can think
about the direction as being the direction on the mountain,
but we could also think about the direction as being the
direction on the x, y plane.
And, so in other words, what we can do is we can drop
the level curve down here.
It's not going to look exactly the same, because I have
magnified the picture compared to the picture here.
This is bigger.
I have used a different scale for the bottom picture as
opposed to the top picture, so the level curve will
look like an ellipse.
But, I have magnified it, I have scaled it to make it
bigger, so that it is easier to draw.
So, then the directions which we have talked about
here are the following.
This one is parallel to the level curve.
That's the direction along the x, y plane or in the x, y
plane, which would correspond to the movement on the mountain
parallel to the level curve.
This is the direction on the x, y plane which will correspond
to the path of steepest ascent, for which the rate is
maximal. And, this will be the direction for which the
rate will be minimal.
So, this is what we call the direction of steepest descent.
Now, I would like to draw these vectors in such a way that
they are of the same length.
Because, they are supposed to be unit vectors.
This is a convention.
We agree from the beginning that we will measure directions
by unit vectors, OK.
And, the point is that these are perpendicular.
So, what are these vectors.
This again is a vector parallel to the level curve, and this
vector is a vector perpendicular to
the level curve.
And, this is what we discussed last time.
I mean both of these vectors are perpendicular
to the level curve.
And, the one which corresponds to the steepest ascent is the
one which points along the gradient vector.
So, this is actually the gradient vector.
So it points inward.
Because, to get the steepest ascent, you have to go inside
this level curve, right?
Because you go towards the center of the mountain.
And, this one will be its negative.
OK, and the point is I explained last time why this
gradient vector is actually perpendicular to the tangent
vector, or in other words, perpendicular to the tangent
line to the level curve.
This is some calculation which involved the knowledge of
equations for lines on the plane, OK.
So, that's the picture, but in principle there are
many other directions.
We can also draw a direction like this say.
Again, some unit vector, u, which would be a,b, which would
have two coordinates a and b.
And, if a climber goes in this direction, then her path would
go along that, and it will correspond to some
path on the mountain.
Like this.
Which is neither the steepest descent nor is
it parallel to the level curve.
In this case, it goes down, because the vector
points outward.
So, this direction or more precisely this line, which
contains this vector, will correspond on the mountain
to some specific path which starts at this point, and
then goes somewhere, OK.
And, what we are calculating is just the slope of that
curve, of that path.
So, the directional derivative,
D sub u of f at x0, y0,
with respect to this vector u, is the slope of the path on the
mountain, or on the graph, corresponding to the
line containing u.
You see, this is the line I'm talking about,
this yellow line.
I didn't draw it very well.
Maybe it's more like goes like this.
So, if I look at this line, this line will
give me that path.
What do I mean by give me?
If I lift that path, there's a path on the x,y plane, but it
has a unique lift to the graph.
This is the yellow path on the graph.
In other words, this line or this half line, is the
projection of that path.
The unique path on the mountain, which starts at that
point and whose projection onto the x, y plane is the
half line directed by u.
You see what I mean?
Are there any questions about this?
OK.
So, the point is that the graph is two-dimensional.
It's a surface.
But, once you choose a direction, you cut a path or
a curve on that surface.
So, you are back to one dimension, to the
one-dimensional case.
And, in the one-dimensional case, you can actually
talk about the slope.
Because, you get the graph of a function of one variable,
namely the variable along this line, and you can talk
about its slope.
That slope is the rate of change along that path.
That's what we call the directional derivative.
And, finally we have a formula for it, which involves
the gradient vector, and this formula tells us when this
directional derivative takes some particular values.
For example, the maximum value corresponds to u
pointing along nabla f, the minimal value to the opposite
direction, and the 0 value corresponds to the u which is tangent
to the level curve, OK.
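As a small numerical sketch of these three cases, assuming the formula from last time, D sub u of f equals nabla f dot u: the sample "mountain" f(x, y) = 10 - x^2 - 2y^2 and the point (1, 1) below are assumptions chosen only for illustration, they are not from the lecture.

```python
import math

def f(x, y):
    # Hypothetical mountain: height above sea level over the point (x, y).
    return 10.0 - x**2 - 2.0 * y**2

def gradient(x, y):
    # Partial derivatives of the sample f, worked out by hand.
    return (-2.0 * x, -4.0 * y)

def directional_derivative(x, y, u):
    # D_u f = (gradient of f) dot u, where u = (a, b) is a unit vector.
    gx, gy = gradient(x, y)
    return gx * u[0] + gy * u[1]

x0, y0 = 1.0, 1.0
gx, gy = gradient(x0, y0)
norm = math.hypot(gx, gy)

u_up = (gx / norm, gy / norm)     # along the gradient: steepest ascent
u_down = (-u_up[0], -u_up[1])     # opposite direction: steepest descent
u_level = (-u_up[1], u_up[0])     # perpendicular to the gradient: along the level curve

rate_up = directional_derivative(x0, y0, u_up)
rate_down = directional_derivative(x0, y0, u_down)
rate_level = directional_derivative(x0, y0, u_level)
print(rate_up, rate_down, rate_level)
```

For this sample function the gradient at (1, 1) is (-2, -4), which points back toward the peak at the origin, so the maximal rate is the length of the gradient, the minimal rate is its negative, and the rate along the level curve is 0.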
But, this we already knew from, by analyzing this picture just
on the grounds of common sense.
We didn't need to do any calculation to figure this out.
In fact, when you are climbing the mountain, you're not
pulling out a paper pad and a pen and starting to calculate
what is the best way to reach the top of the mountain.
You kind of follow your intuition.
And, what your intuition will always tell you is that if you
want to go, reach the top, in the fastest possible way, you
have to go perpendicular to the level curve.
In the direction perpendicular to the level curve, and
likewise, if you want to go down the fastest way, you also
go perpendicular to the level curve.
Is that clear?
OK.
So, intuitively it's clear.
But, now we have proved that because we found the formula
for the rate of change.
And, from this formula, which is written in terms of dot
product, it's plain obvious when it takes the maximum
value, the minimum value or the 0 value.
And, that was one of the main conclusions last time, but now
I have illustrated it in this way.
OK.
So now, one more -- that's odd.
Here is the problem -- that's called Catch 22.
Alright, so I have -- this is a small inconvenience.
And, that's very clever.
I will not try to get it out of there, because I don't want
to pull the second one.
So, yes, what is that symbol?
It is called nabla.
It's a symbol which looks like the Greek capital delta written upside down.
We're using it for the gradient.
I'm sorry?
That's right, this is a notation for the gradient.
OK.
So, and one other thing which I wanted to mention in this
regard, is we talked about equations of-- we have talked
about equations of tangent lines and tangent planes.
And, I know this could be confusing, because you have
many different objects at the same time for which you could
look at tangent lines and tangent planes.
Seems like there are many different discussions going on.
Let's focus, okay.
So, there are tangent lines and tangent planes.
And, so I just want to summarize the stuff once more,
so that there's no ambiguity or there is no confusion.
So, the first is the two variable case.
In the two variable case, we look at a level curve
of a function f of x, y.
So, it is given by this equation, f of x, y equals k.
So, that's
this level curve, that's this curve of equal height, or
equal altitude, which I drew over there.
OK?
And so then, we could look at the tangent line
to this curve, and the equation of this tangent line at the
point x0, y0 is like this.
It's
f sub x at x0, y0, times x minus x0, plus f sub y at x0, y0,
times y minus y0, equals 0.
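Here is a minimal sketch of that tangent line equation on a concrete example. The function f(x, y) = x^2 + 4y^2 (whose level curves are ellipses) and the point (2, 1) are assumptions made for illustration only.

```python
def f(x, y):
    # Sample function (an assumption): its level curves are ellipses.
    return x**2 + 4.0 * y**2

def f_x(x, y):
    return 2.0 * x

def f_y(x, y):
    return 8.0 * y

def tangent_line_lhs(x, y, x0, y0):
    # Left-hand side of f_x(x0, y0)(x - x0) + f_y(x0, y0)(y - y0) = 0.
    return f_x(x0, y0) * (x - x0) + f_y(x0, y0) * (y - y0)

x0, y0 = 2.0, 1.0
k = f(x0, y0)                              # level of the curve through (x0, y0)
tangent = (-f_y(x0, y0), f_x(x0, y0))      # direction perpendicular to the gradient

s = 0.5
on_line = (x0 + s * tangent[0], y0 + s * tangent[1])
lhs = tangent_line_lhs(on_line[0], on_line[1], x0, y0)

# Close to (x0, y0) the line hugs the level curve: f changes only to second order.
t = 1e-3
drift = f(x0 + t * tangent[0], y0 + t * tangent[1]) - k
print(k, lhs, drift)
```

Points obtained by moving from (x0, y0) in the direction perpendicular to the gradient satisfy the equation exactly, and for a small step the value of f drifts away from k only by a term of order t squared.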
OK.
But, in fact, we can now look at the case of three
variables as well, right.
Let me actually, let me do it here.
Three variables, the three variable case.
In the three variable case, we would want to take, instead of
a function in two variables, f of x, y, a
function, maybe capital F, a function of three
variables, x, y, z.
And, by analogy, we would have to look at a level, not
curve, but now level surface of this function.
And, so that would be F of x, y, z equals k.
Maybe I should say that k is some number.
So, for example, in this discussion k would be the
height, so, I don't know, 1,000 feet.
Whereas x and y and z are variables.
So, that's the difference.
It's an equation.
It's one equation for three variables, because in this
equation this is some number, like 1,000.
So, then you can ask what is the equation of the tangent,
but now not line, but plane to the surface at the
point x0, y0, z0.
And you see, the point is that the answer is given by
something which looks exactly the same, except now we
have three variables.
So, we have to add one more term, which involves
the third variable, z.
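So, the answer is the following. You
have to take the partial derivative with respect to x,
multiplied by x minus x0, plus the partial derivative with
respect to y, multiplied by y minus y0, plus the partial
derivative with respect to z now, also, multiplied by z minus
z0, and all of this equals 0, with the partial derivatives
evaluated at the point x0, y0, z0.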
So, the difference between the two variable case and the three
variable case is that we now have an extra variable.
So, everything gets dimension one higher, the dimension
gets bumped by one.
We had a curve now we have a surface, we had a line
now we have a plane.
The equation here involved two partial derivatives, and
had this very simple form.
And, now the equation involves all three partial derivatives,
but has the same form.
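Written out in symbols, the two equations being compared are the following (the notation f, F, k and the points are the ones used in the lecture; only the typesetting is added here):

```latex
% Tangent line to the level curve f(x, y) = k at the point (x_0, y_0):
\[
  f_x(x_0, y_0)\,(x - x_0) + f_y(x_0, y_0)\,(y - y_0) = 0
\]
% Tangent plane to the level surface F(x, y, z) = k at the point (x_0, y_0, z_0):
\[
  F_x(x_0, y_0, z_0)\,(x - x_0) + F_y(x_0, y_0, z_0)\,(y - y_0)
    + F_z(x_0, y_0, z_0)\,(z - z_0) = 0
\]
```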
So, I will not derive this formula.
It is derived in the same way as in this case, in
the case of two variables.
But, I hope it looks convincing to you, because you can
clearly see the analogy.
And, in fact, if you want to prove it, you can prove it
in exactly the same way.
Now, what is slightly confusing in this is that there is a
special case of this three variable, of this three
variable picture.
And the special case is, when F of x, y, z is
f of x,y, minus z.
So, you can ask, why would we even bother to look
at this special case.
And, the reason is very simple.
Because, in this special case, if I look at the equation
F of x, y, z equals 0.
Which is a special case of a level surface.
Namely, the case when k is 0, right.
This is just equation z equals f of x,y.
And, this equation defines a graph, a graph of a function
of two variables.
So, it's kind of funny that a function of two variables
shows up in two different contexts.
It shows up here in the context of level curves, OK, but it
also can show up here, for functions of three variables.
Even though it is a function of two variables.
But even when we have a function of two variables,
and we think about graphs, we automatically go to the
three-dimensional situation, right.
And, so the graph of a function in two variables can be thought
of as a level surface for a function in three variables.
Which function.
Well, this function, f of x,y minus z.
It's kind of like the simplest concoction you can make out
of f and the new variable z.
So, we can apply this general formula for the tangent plane
to this special case, and what do we get?
Let's observe that capital F sub x is just small f sub x.
Because when you take the partial derivative of this function big
F with respect to x, you have to differentiate this one, that would
be just f sub x.
And you differentiate this one, but this one is
independent of x.
So this doesn't change anything.
So the partial derivative of this whole
function with respect to x is just the partial derivative of
this part, so that's small f sub x.
Partial derivative with respect to y is f sub y.
Partial derivative with respect to z is what?
It's negative 1.
It's negative 1, because that's the derivative of this function
negative z with respect to z.
This guy doesn't depend on z, so its partial derivative
with respect to z is 0.
But, the partial derivative of this term is negative 1.
So, we get this, three partial derivatives, which we
substitute into this formula, and what do we get?
We get f sub x of x0, y0 times x minus x0, plus f sub y of x0, y0
times y minus y0, minus z minus z0, equals 0.
And now, we recognize the equation of the tangent plane
to the graph, which we have known already.
This is the one which we got already two weeks ago, when we
talked about differentials and linear approximation.
The only difference is that now I put negative z minus
z0 on the left-hand side.
But in our old discussion, we would write equals z minus z0.
And, then we would switch the left- and right-hand
sides too, but that's a minor issue, right.
So, this is just a slightly different form of writing the
same equation, just putting everything on one side.
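To see numerically that the two ways of writing the tangent plane agree, here is a small sketch. The function f(x, y) = x^2 + 4y^2 and the chosen points are assumptions for illustration; the structure F = f(x, y) - z, with partial derivatives f sub x, f sub y and negative 1, is the one from the lecture.

```python
def f(x, y):
    # Sample function of two variables (an assumption).
    return x**2 + 4.0 * y**2

def f_x(x, y):
    return 2.0 * x

def f_y(x, y):
    return 8.0 * y

def plane_lhs(x, y, z, x0, y0):
    # F_x (x - x0) + F_y (y - y0) + F_z (z - z0) for F = f(x, y) - z,
    # so F_x = f_x, F_y = f_y, F_z = -1, and z0 = f(x0, y0).
    z0 = f(x0, y0)
    return f_x(x0, y0) * (x - x0) + f_y(x0, y0) * (y - y0) - (z - z0)

def tangent_plane_z(x, y, x0, y0):
    # The same plane solved for z: z = z0 + f_x (x - x0) + f_y (y - y0).
    return f(x0, y0) + f_x(x0, y0) * (x - x0) + f_y(x0, y0) * (y - y0)

x0, y0 = 2.0, 1.0
x, y = 2.5, 0.75
z_on_plane = tangent_plane_z(x, y, x0, y0)
residual = plane_lhs(x, y, z_on_plane, x0, y0)
print(z_on_plane, residual)
```

A point taken from the old form, z equals z0 plus the linear terms, makes the left-hand side of the level surface form vanish, which is just the statement that the two equations describe the same plane.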
And, now you see that the case of
tangent planes to graphs of functions in two variables can be
thought of in two different ways.
OK.
Namely, you can think that you started with a function of two
variables, and you just look at the graph and you look
at the tangent plane.
But, you can also think of it as a special case of the more
general case of functions of three variables, except you
take as a function of three variables this very special
form: f of x, y, minus z.
Either way you approach it, you get the same answer.
But now, you can appreciate more the connection between
this answer and this.
Many people asked me after the last lecture why, when we
look at the equation of the tangent line to the level curve of a
function of two variables, it's as though we are dropping this
term, z minus z0.
So, there's this negative 1, which we just dropped.
Well, geometrically it's clear.
In fact, the board stayed since Tuesday, so I guess nobody
likes this small board, except for me, which is good.
So, this is its tangent line, and this tangent line
corresponds to, along this tangent line to the level
curve, we have the same value of z, z equals
z0.
So, that's why we drop this term, to go from
this equation to this.
So, you can get this equation from this by dropping z minus
z0, because z is equal to z0 along the level curve.
But, also you can now understand that this formula is
a special case of the formula for the tangent plane to a
general level surface, which actually looks like this one.
Except, we have a third variable.
In the special case, when the function is like this, this third
term becomes extremely simple.
It just gets a coefficient of negative 1.
So, you get minus z minus z0.
OK.
And, to make the analogy complete, let's actually look,
let's fill in this square.
You know, it's like when you do IQ tests.
I've never done it.
But it's easy to find them online.
And, I think the problem is often like fill in the square,
so this is exactly the kind of question here.
What should be here.
Right.
In other words, this is a case of three variables, and this
is a special case of that.
Now, what's the analog of the special case for functions
of two variables?
That's the case when this function of two
variables has a special form.
When this f of x, y is some function of one variable,
let's call it g of x, minus y.
And, so you see in this case, in fact, you know what I'm
going to do to make it look more like an analogy,
let's actually, let's re-position the board.
There we go.
So, now I think it's more clear what I mean by
filling in the square.
I want to find a special case of this, which is analogous to
how we found the special case of three variables.
And, that's the case.
Now, our function of two variables is equal to
another function, of one variable, minus y.
OK.
In this case, the equation, f of x y equals 0,
means y equals g of x.
Right?
And, this is a graph.
Graph of the function g of x.
So, a level curve for a function of two variables
can become the graph of a function of one variable,
when this function of two variables has
this special form.
Did you have a question?
[INAUDIBLE]
Good question.
So, what would happen if we put some k?
Right.
If we put some k, it will be here, I would put minus k.
Right.
So, then I could just absorb k into the definition
of the function g of x.
If I redefine my function g of x by subtracting k, then I
would get back the level 0.
So, that's why we don't lose any generality by looking at
the case of level 0, as opposed
to the general case.
So, that's a graph.
So now, this formula for the tangent line, note that f sub
x of x,y now is g prime of x.
Just like here, the partial derivatives of the big function
F with respect to x and y were just the derivatives
of the small f.
And now, the role of the small f is played by g, so the
partial derivative, like this, is just g prime.
And, the partial derivative with respect to the second
variable is negative 1 again.
Because this, minus y, plays the same role as
minus z played here.
So, when we take the derivative, we get negative 1.
So now, this formula for the tangent line becomes g prime
of x0 times x minus x0, minus y minus y0 equals 0.
Let me rewrite this.
This is equivalent to saying y is equal to f prime of x0
times x minus x0 plus y0.
We recover the old formula for the equation of the tangent
line to the graph of a function of one variable.
That formula is exactly this one.
The slope is f prime, you multiply by x minus x0 and you add
the value of the function at the point x0, which is y0.
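For reference, the chain of steps in this special case can be written out as follows, with g denoting the function of one variable introduced above (only the typesetting is added; the steps are the ones from the lecture):

```latex
% Special two-variable case: f(x, y) = g(x) - y, so the level curve f = 0 is the graph y = g(x).
\[
  f_x(x, y) = g'(x), \qquad f_y(x, y) = -1 .
\]
% Substituting into the tangent line equation f_x (x - x_0) + f_y (y - y_0) = 0:
\[
  g'(x_0)\,(x - x_0) - (y - y_0) = 0
  \quad\Longleftrightarrow\quad
  y = g'(x_0)\,(x - x_0) + y_0 , \qquad y_0 = g(x_0) .
\]
```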
Right.
So, there is nothing mysterious in this formula.
In this special case, we get back the old formula
we've known all along.
And, also this now sheds some new light on this coefficient
negative 1, which many of you have found mysterious.
It's not mysterious, it's as mysterious as this coefficient
negative 1, which shows up in the old formula for the
tangent line to the graph.
We were not surprised to write the formula for the tangent line to
the graph as y equals f prime times, you know, in this form.
But, if you have it in this form, you can rewrite
it like this.
When you rewrite it like this, you find the
coefficient negative 1.
The reason it appears is exactly the same reason
why this negative 1 appears.
Any questions about this?
Yes.
[INAUDIBLE]
Why do we choose a special case?
That's a very good question.
Why do we even choose a special case?
Well, let's talk about this case.
From the point of view of functions in two variables,
this sounds strange.
Why would you write it like this, and not f of x minus
x,y or something, right?
So from the point of view of functions of two variables,
it doesn't make any sense.
It makes a lot of sense, however, from the point of view
of the theory of functions in one variable.
When we studied functions in one variable, we liked
to visualize them by graphs, right?
When we draw a graph of a function of one variable, we
introduce one more variable, and we look at the graph,
which is y equals f of x.
What I'm saying now is that within this formalism, this
formalism that we are developing, we can think of the graph of g
of x, which normally we would write as y equals g of
x, just in this form.
And when we write it in this form, we never say the
word level curve or anything like this.
We just say graph.
But, we have to realize, it is important to realize, to see
the connection between different formulas.
It's important to realize that this graph actually can be
thought of as a level curve for a function of two variables.
And, that function just happens to be this function, even
though there's no reason a priori
to study such functions.
We have introduced them, because we started from the
point of view of functions of one variable, and then this
naturally fell out, once we started to look at the graphs.
And, likewise in this case.
Yes?
[INAUDIBLE]
That's right.
Could be a point, or finitely many points -- let's look at the function.
And, for a good reason, right: the dimension of a level
curve, or a level surface, and so on,
is going to be the number of
variables involved minus 1.
If you have two variables, it's a level curve,
so the dimension is 1.
With three variables it's a level surface, dimension 2.
If it's a function of one variable, it will
be of dimension 0.
Zero-dimensional objects are just collections of points.
And, the way it works is just like this.
Let's look at, for example, you have a parabola, level curve
consists of 2 points, but, you know, if you have a cubic
parabola, like this, there would be 3 points.
And, if you have a cosine, you would have
infinitely many points.
Infinitely many points if the level is between
1 and negative 1.
And, if the level is higher than 1 or lower than negative
1, then it will be empty.
Level curve could be empty, or level surface.
In this case, it's sort of level point.
We don't have a good word for a collection of points.
So, it's like a level -- zero-dimensional object.
Manifold, as a mathematician will call it.
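These zero-dimensional level sets can be counted numerically in a rough way. The sketch below is only an illustration under stated assumptions: the helper count_level_points is hypothetical, it simply counts sign changes of g(x) - k on a fine grid over a bounded window, and the particular parabola, cubic, levels and windows are chosen here, not taken from the lecture. It would miss a point where the graph only touches the level without crossing it.

```python
import math

def count_level_points(g, k, a, b, n=20000):
    # Count the points of [a, b] where g(x) = k by detecting sign changes
    # of g(x) - k between neighbouring grid points (a rough numerical sketch).
    count = 0
    prev = g(a) - k
    for i in range(1, n + 1):
        x = a + (b - a) * i / n
        cur = g(x) - k
        if prev * cur < 0 or (cur == 0 and prev != 0):
            count += 1
        prev = cur
    return count

parabola_points = count_level_points(lambda x: x**2, 1.0, -3.0, 3.0)
cubic_points = count_level_points(lambda x: x**3 - 3.0 * x, 0.0, -3.0, 3.5)
cosine_points = count_level_points(math.cos, 0.5, -10.0, 10.0)
empty_level = count_level_points(math.cos, 2.0, -10.0, 10.0)
print(parabola_points, cubic_points, cosine_points, empty_level)
```

For the cosine the count is finite only because the window is bounded; on the whole line a level between negative 1 and 1 is attained at infinitely many points, and a level above 1 is never attained.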
Any other questions about this?
Yes.
[INAUDIBLE].
Oh yes, I'm sorry.
Thank you.
That was just a, that was a mistake.
Thank you.
Yeah, it's g prime of course, I'm just saying this formula
becomes this formula.
I called it g right, sorry.
Yeah, I completely messed it up.
Good job.
So, that will do it for us in this topic, and actually we are
running out of time, and we need to talk about something
else today, also.
I did want to go over this slowly to emphasize the
connection between these different objects, because I
think that there are different dimensions and different numbers
of variables at play, and it could be very confusing.
But, I think that if you put this in this
picture, where you have these four squares, the two variable
case, the three variable case, and the special cases for two
variables and for three variables, then I think it
becomes much more clear.
Alright, the next topic we'll discuss concerns finding
maxima and minima of functions.
And, as is always the case, it's actually instructive to
look at this question already in the one-dimensional,
one variable case.
Because, we can already gain some insights into the problem
by looking at this very special, simplest
possible case.
If you have a function in one variable, it's a natural
question to ask where this function attains maximum
and minimum values.
That's important, because this function could correspond to
something in real life, and you may want to maximize
that or minimize that.
And, so the first point I want to emphasize is that there are
two different types of maxima and minima, the local
and the global.
The global ones are called absolute.
I like to think local, global.
I like this terminology because this terminology is
a little bit better.
So, what do I mean by local?
So, let me draw this.
For a function of one variable, it is very convenient to
analyze everything by using graphs of functions.
And, graphs, again, are curves on the plane.
Right.
So, we introduce the new variable y, and we write
a graph as given by the equation y equals f of x.
So, let's look at this kind of function.
That's a very typical example.
So, I want to focus on this point.
So clearly, this point -- the value of the function at this
point, this will be the point x0, and that's the
value of the function.
This is f of x here.
The value of this function, at this point, is greater than
the value at nearby points.
So, that's an example of a local maximum.
A point is a local maximum if there is a small neighborhood
of this point such that if you restrict your function to this
neighborhood, which is this little interval in this case,
then this function will -- this will be the maximum
value on the interval.
Okay.
But is it the global maximum.
Clearly not, because I have a point here, for example,
x1, for which the value
is higher.
So, that's not a global maximum.
That's not a global maximum either.
In fact, in this example, there is no global maximum,
because I'm assuming the function keeps growing,
it keeps increasing as x is increasing, OK.
If that's the case, there is no global maximum.
So, global maximum is a completely different, finding
global maxima is a completely different game than
finding a local maximum.
Finding local maxima just involves analyzing the function
on a very small interval around this point.
Finding a global one sort of involves analyzing all
points in your domain.
The way I have phrased the question
so far, it is as though we were studying global maxima on the
entire line, on the entire x line.
OK.
And, you see clearly that that question often
doesn't have an answer.
In other words, there is no global maximum, simply because
for any point, there will be another point at which you will
have a higher value, higher value, higher value,
and so on, OK.
The question of finding global maxima is better to phrase on
domains which are bounded.
Not on the entire line, but on bounded domains.
Bounded means that it is finite.
So, it's better to say, what is the maximum of this
function on this interval.
OK.
This is an example of a closed bounded domain
in the following sense.
First of all, it's bounded because it's finite.
It doesn't go to infinity.
Second, it is closed because it contains the endpoints.
And, these are the kind of domains we should look at if we
want to ask questions about global maxima or absolute
maxima or minima.
So, let's look at this question in this particular case.
In this particular case, we see that the maximum value is
actually taken at this point.
This is a maximum.
So, now you can appreciate why you have to
include the endpoint.
If we did not include the endpoint, there wouldn't be a
maximum, because no matter how close you are to this point,
there would be another point even closer for which the
value would be even higher.
So, therefore there would be no
maximum.
So, in order to guarantee that you have a positive answer
to the question of existence of a maximum, or minimum for that
matter, you should really look at closed and bounded
intervals, and then what happens is that the maximum can
be attained either at the boundary, which is the case
here, or it could be some local maximum which lies in the
interior of this interval.
In this particular case, you do have a candidate, you do have
a candidate for a maximum.
This one, because it is a local maximum, and it is within this
interval, but it's not a global maximum on this interval,
because the value of this function at the endpoint is just bigger.
But, if I were to take a different interval, if I were
to take an interval like this, for example, then
this guy would win.
Because, at the boundary, the value would be smaller.
You see.
At the boundary, the value would be smaller.
So, this guy would have the highest possible value
on this interval.
So, the bottom line, the upshot of all of this, is that the
absolute maximum can be found in a finite set of points.
Those points are, first of all, the endpoints, and then all
the points where you have potentially a local
maximum or minimum.
So, the global maximum of some function f on a closed
bounded interval, let's call it [a,b],
can be found at one of
the following points.
The endpoints, which are a, b, and the points of
potential local maximum.
And here, it is important to emphasize the word potential.
And, those are the points for which f prime of x is 0.
Because certainly, if it's a point of local maximum, then
the slope of the tangent line, at this point, is
going to be 0, right.
Because, if you have a non-0 slope, you just move away from
this point and you'll get a bigger or smaller value.
And, one side bigger and the other side smaller.
So, the only way you could have a, or possibly have a local
maximum, is to have slope 0.
A slope is a derivative, so that's why these are the points
for which the derivative is equal to 0.
So, this is the first statement that I want you to remember, or
recall, perhaps from one variable calculus, which is
that, if you're looking for global maxima, what you need to
do is simply evaluate the function
at the endpoints.
Evaluate at this point and evaluate at this point.
Next, find all the points where the derivative is 0, and
evaluate the function.
So, you get the finite list, and then just pick the
one, or the ones, where the value is maximum.
These are the values, these are the maxima of this
function on the interval.
In other words, you don't have to look through all
the points on the interval.
There are infinitely many.
But you only look at the endpoints and the points
where f prime is equal to 0.
That's the algorithm for finding maxima of a function.
Likewise for the minimum, just replace the word
maximum by minimum.
So, it's exactly the same.
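The algorithm just described can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function x^3 - 3x, the interval [-2, 3], and the helper name closed_interval_extrema are made up for the example and are not from the lecture, and the points where f prime is 0 are worked out by hand and passed in.

```python
def closed_interval_extrema(f, a, b, critical_points):
    """Return ((x_max, f_max), (x_min, f_min)) for f on [a, b].

    critical_points: points where f'(x) = 0, found by hand beforehand.
    Only the endpoints and the critical points inside (a, b) are checked.
    """
    candidates = [a, b] + [c for c in critical_points if a < c < b]
    values = [(x, f(x)) for x in candidates]
    best_max = max(values, key=lambda pair: pair[1])
    best_min = min(values, key=lambda pair: pair[1])
    return best_max, best_min


# Hypothetical example: f(x) = x^3 - 3x, so f'(x) = 3x^2 - 3 = 0 at x = -1, 1.
f = lambda x: x**3 - 3 * x
global_max, global_min = closed_interval_extrema(f, -2, 3, [-1, 1])
print(global_max)  # (3, 18): the maximum is attained at an endpoint
print(global_min)  # the minimum value is -2, taken at x = -2 and at x = 1
```

In this example the local maximum at x = -1 loses to the endpoint x = 3, which is the situation in the picture where the boundary value is bigger.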
Now, before I go and generalize it to the case of two
variables, I want to explain what I mean by the word
potentially, potential local maximum.
In other words, if the point x is a local maximum or minimum,
then the derivative is 0.
I already explained this, because the slope has to be 0.
If it's a maximum or minimum, the slope has to be 0.
If the slope is non-0, it means you can increase or decrease
the value by moving a little bit away from the point.
So, this is true, but the converse is not true.
In other words, if the derivative of your function at
your point is 0, it doesn't mean it's a local
maximum or minimum.
And, the reason is the following: there is a very
simple counter-example to this, namely the function x cubed.
So, f of x is x cubed, f prime is 3x squared,
so f prime of 0 is 0.
Which we see, right?
We do see that the slope at 0 is 0.
Right?
But, is it the point of a local maximum or minimum?
It's not, because if you go this way it increases, and if
you go that way it decreases.
And, in fact, if you think in terms of monomials, the
same thing will happen if you have x to the n
where n is odd, like 3, 5, 7 and so on.
Because, the derivative of x to the n is n times x to the n minus 1.
So it's still always 0 at x equals 0 for this function.
But, if n is odd, for positive x the value of x to the n is positive and
for negative x it is negative.
So, it's going to look like this.
But if you have y is x to the n where n is even, so it's 2, 4,
and so on, then it is going to look like this.
So, in that case it is okay.
It is actually a point of local maximum or minimum: a local
minimum in this case, and if the coefficient were negative,
so that the graph were like this, it
would be a maximum.
So, in other words there are many possible scenarios where
you have a derivative equals 0, and it is a maximum, and there
are many scenarios where the derivative is 0, but it's
not a maximum or minimum.
So, it only goes this way, if there is a maximum or minimum,
then the derivative is 0.
That's why I said, those are the points which potentially
could be maxima or minima.
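A quick numeric illustration of the monomial examples, as a sketch rather than a proof: the helper name and the step size h are assumptions made for this example. It compares the value of x to the n just to the left and right of 0 with the value at 0.

```python
def looks_like_extremum_at_zero(n, h=1e-3):
    """Compare x**n just left and right of 0 with its value at 0.

    Returns "minimum", "maximum", or "neither" based on two sample points.
    This is a numeric illustration, not a proof.
    """
    left, mid, right = (-h) ** n, 0.0, h ** n
    if left > mid and right > mid:
        return "minimum"
    if left < mid and right < mid:
        return "maximum"
    return "neither"


for n in (2, 3, 4, 5):
    # d/dx x**n = n * x**(n-1), which is 0 at x = 0 for every n >= 2
    slope_at_zero = n * 0.0 ** (n - 1)
    print(n, slope_at_zero, looks_like_extremum_at_zero(n))
```

For every n in the loop the slope at 0 is 0, but only the even powers report a minimum; the odd powers report neither.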
OK.
So, in principle, you could rule some of them out, from the
outset, by saying, well, at these points
f prime is 0, but the point is not a maximum or minimum.
So, then it cannot possibly contribute to the list of
suspicious points, or candidates, for global maximum or minimum.
But, I think it's just much easier to just take all of
them, because it could be finitely many, and just
evaluate your function f at all of them, and then compare.
Where do you get the largest value, and where do you
get the smallest value.
Okay, good.
So, that's why, the way I formulated this, I did not want
at this level to try to differentiate between the ones
which are actually maxima and minima and the ones which are not.
Let's look at all which are potentially maxima.
So, that's the one-dimensional case.
So, now, in some sense we already know everything we
need to know, because in a two-dimensional case, it's
going to look exactly the same.
The criterion will be slightly more complicated.
Maybe I'll say one more thing, which is that there is a
criterion to see whether the point is a maximum or
minimum in this case.
Namely, suppose that f prime is 0 at a particular point,
which was the point 0 in my previous examples.
Let's call it x0.
If f prime of x0 is 0, and f double prime of x0 is
positive, then it's a local minimum.
And if f prime of x0 is 0, and f double prime of
x0 is less than 0, then it's a local maximum.
In other words, if you think about this in
terms of Taylor series.
Oftentimes, you can approximate
a smooth function by its Taylor series.
And, the first terms in the Taylor series are going to be
given by the value of the function, and the derivative,
and then the second derivative.
So, the point is that if the first derivative vanishes,
that's a necessary condition to have a local
maximum or minimum.
But then, it depends on which term in the Taylor
series is non-0 next.
So, for example, if the second-order term is non-0, that means your
function looks like x minus x0 squared times some
coefficient, right?
What I'm trying to say, what I'm trying to explain is
the following, let me do it more slowly.
The Taylor series looks like this.
Okay.
So, this is just the value of the function.
Let's assume without loss of generality
that it is equal to 0.
I mean, after all we could just subtract this
value from this side.
It's not going to change anything.
So, let's assume it's 0.
So, the next comes this term, which is the first derivative,
and the first derivative has to vanish otherwise it can't be a
maximum or minimum as we just discussed.
So, this also vanishes.
So, the next term is the second derivative, right.
And, then there are some additional terms.
But, the additional terms are negligible compared
to this term when x is very close to x0.
So, you might as well replace your function by this function.
But, this function is just the parabola, I mean the graph of
this function is just a parabola.
And, the parabola we know, the parabola would be like this if
the coefficient is positive, and it would be like this if
the coefficient is negative.
So, in this case, clearly this is a local minimum.
For this one.
For this one it's a local maximum.
And, the other terms don't matter.
So, that's the reason why we get this criterion.
But if this term is also 0, then we can't really
tell, because we don't know what comes next.
If the non-0 term is a cubic term, we know it's not going to
be maximum or minimum, because we looked at a cubic parabola,
and it's like this.
It doesn't have a maximum or minimum, right?
But, if the cubic one vanishes, and the quartic one is non-0,
then again it's a U shape.
So, there's no telling, we should really then look at the
higher terms in the expansion, and that's much more difficult.
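The formula on the board is not captured in the transcript. Assuming it was the standard Taylor expansion of a smooth function f around x0, the argument above can be written out as follows.

```latex
f(x) = f(x_0) + f'(x_0)\,(x - x_0) + \frac{f''(x_0)}{2}\,(x - x_0)^2
       + \frac{f'''(x_0)}{6}\,(x - x_0)^3 + \cdots

% With f(x_0) = 0 (assumed without loss of generality) and f'(x_0) = 0:
f(x) \approx \frac{f''(x_0)}{2}\,(x - x_0)^2 \quad \text{for } x \text{ close to } x_0 .
```

The right-hand side is a parabola that opens upward when f''(x0) is positive (a local minimum) and downward when f''(x0) is negative (a local maximum). If f''(x0) is 0, the first non-0 higher-order term decides, which is why the test says nothing in that case.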
So, that's why we just stop here, and we say
here is a criterion.
If the first derivative is 0, and the second derivative is
positive, it's a local minimum.
And if the second derivative is negative, it's a local maximum.
And, we just stop right there.
In other words, it does not exhaust all possible
cases, but it helps us in the
cases where the second derivative is non-0.
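As a small sketch, the one-variable criterion can be written as a function of the two numbers f prime of x0 and f double prime of x0. The function name and the sample values are chosen here for illustration and are not part of the lecture.

```python
def second_derivative_test_1d(fp_x0, fpp_x0):
    """Classify x0 from the values f'(x0) and f''(x0)."""
    if fp_x0 != 0:
        return "not a critical point"
    if fpp_x0 > 0:
        return "local minimum"
    if fpp_x0 < 0:
        return "local maximum"
    return "inconclusive"


# f(x) = x^2:  f'(0) = 0, f''(0) = 2
print(second_derivative_test_1d(0, 2))   # local minimum
# f(x) = -x^2: f'(0) = 0, f''(0) = -2
print(second_derivative_test_1d(0, -2))  # local maximum
# f(x) = x^3 and f(x) = x^4 both have f'(0) = f''(0) = 0
print(second_derivative_test_1d(0, 0))   # inconclusive
```

The last call shows the limitation: x cubed has no extremum at 0 and x to the fourth has a minimum there, yet both give the same inputs to the test.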
And, there is a similar criterion also for
functions in two variables.
So, now we switch to functions in two variables.
[INAUDIBLE]
I wrote what?
On the top of what?
Oh, they are both local minima.
Wow.
It's kind of pessimistic.
Thank you, I have to correct.
You definitely should correct that.
I know it looks like we'll never reach maximum.
Okay, I think now it's good.
Right, because if it's negative it is shaped like
this, so it's maximum.
Okay.
So now we switch to functions in two variables.
So again, we have local things, local maxima
and minima and global.
And, searching for them is, again, sort of two
different games.
For local maxima or minima, the first, step one is to check
that the two partial derivatives are 0.
Just like for functions in one variable, the first step is to
look at the first derivative.
Well, now we have a function in two variables, so there are two
different partial derivatives.
So, both of them have to vanish
in order for us to have a local maximum or minimum.
Well, I'm assuming now that both of them exist.
There is another possibility which is that, say one
of them may not exist.
So, in that case, that's also a possible case for local
maximum or minimum.
But, let's assume, in this discussion, that the partial
derivatives always exist.
So then we don't have to worry about this.
If they do exist, then a given point, x0, y0, will be a local
maximum or minimum, only if those partial
derivatives vanish.
So, you kind of narrow down your search: first
you throw everything else away.
You just keep the points for which both partial
derivatives are 0.
But, this does not guarantee that it is maximum or
minimum, just like in the one variable case.
The best we can do, is to have a criterion involving
second partial derivatives.
So the criterion -- we would like to say something like if
the second derivative is positive, it's a minimum, if
it's negative it's a maximum.
But, there are now three different second
partial derivatives.
We have fxx, fxy, and fyy.
OK.
So, in fact the rule is as follows.
We have to calculate the following expression. So,
remember when we did cross products, we used determinants.
So, let's make a determinant of this 2 by 2 matrix, which
is very easy to memorize.
The rows will correspond to the first index.
So, the first index here is x and here is y.
And, the columns will correspond to the second index, which
here is x and here is y.
So, you put the four possible second partial derivatives
in this matrix.
Then, we know by Clairaut's theorem that fxy is
the same as fyx.
But, let's not yet worry about this.
This is just an easy way to remember.
OK, and then we take the determinant of this.
So, that's fxx fyy minus fxy fyx.
But fyx is equal to fxy -- OK, now we remember it.
And we just put fxy squared.
So let's call this D.
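Written out, with all second partial derivatives evaluated at the point (x0, y0) being tested, the expression described above is:

```latex
D = \begin{vmatrix} f_{xx} & f_{xy} \\ f_{yx} & f_{yy} \end{vmatrix}
  = f_{xx}\, f_{yy} - f_{xy}\, f_{yx}
  = f_{xx}\, f_{yy} - \left(f_{xy}\right)^2 ,
```

where the last step uses Clairaut's theorem, f_{xy} = f_{yx}.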
So the criterion is that if both partial derivatives are
0, and D is greater than 0, then it's a local maximum or minimum.
That's number one.
Number two, if both partial derivatives are 0 and D is
negative, then it's not a maximum and not a minimum.
And finally, if D is 0, it's inconclusive.
Don't know.
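The rule can be sketched as a small Python function. The function name is made up for illustration. One detail goes beyond what is stated at this point in the lecture: when D is positive, the sketch uses the sign of fxx to tell a minimum from a maximum, which is the standard form of the second derivative test.

```python
def second_derivative_test_2d(fxx, fxy, fyy):
    """Classify a critical point from second partials evaluated there.

    Assumes both first partial derivatives are already 0 at the point.
    """
    D = fxx * fyy - fxy ** 2
    if D > 0:
        # Not stated at this point in the lecture: the sign of fxx
        # separates the two cases (standard second derivative test).
        return "local minimum" if fxx > 0 else "local maximum"
    if D < 0:
        return "not a maximum, not a minimum"
    return "inconclusive"


print(second_derivative_test_2d(2, 0, 2))    # x^2 + y^2 at (0, 0)
print(second_derivative_test_2d(-2, 0, -2))  # -x^2 - y^2 at (0, 0)
print(second_derivative_test_2d(2, 0, -2))   # x^2 - y^2 at (0, 0)
print(second_derivative_test_2d(0, 0, 0))    # e.g. x^4 + y^4 at (0, 0)
```

The four calls cover the three cases of the rule: D positive (twice, once for each shape), D negative, and D equal to 0.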
OK.
First point, think of this as an analog of the one-variable rule, because
in the case of one variable, there is also a rule which
involves a second derivative.
However, this rule is much more complicated, because there
are three different partial derivatives of second order.
And, we make some complicated combination of them.
Whereas, there we just took the second
derivative on the nose, and we just looked at whether
it's positive or negative.
But there is clearly an analogy between the two,
because this involves second partial derivatives and
that involves the second derivative.
Now, but it looks very mysterious.
Why do I look at this combination, and not
at other combinations.
To understand this, think of the case of the parabola.
Because, I explained how the one-variable rule came
about, by looking at the parabolas, right.
The parabolas, because the parabolas approximate
your graph.
Just because of the Taylor expansion argument, you can
see that the parabolas are going to approximate your
graph near the point where the first
derivative vanished.
Right.
So, think about the parabolas.
And, in the case of two variables, first of all the parabola now becomes
a paraboloid, but there are two types of paraboloid.
There is an elliptic paraboloid, and there is
a hyperbolic paraboloid.
OK.
And, just look at the examples of elliptic paraboloids, and
you will see that for elliptic paraboloids, the first
condition will be satisfied.
And, for a hyperbolic paraboloid, the second
condition will be satisfied.
So, let's say f of x,y is
ax squared plus by squared.
So, what are the derivatives in this case?
fxx is 2a, right?
fyy is 2b, right?
And fxy is 0.
So there is a simplification in this case,
because there is no cross term.
OK?
So this matrix looks like this: it's 2a and 2b on the diagonal, and the determinant is 4ab.
That's the D in this case.
So, to say that D is positive means to say that
both a,b are positive or both of them are negative.
If both a,b are positive, it's going to look like this.
If both a,b are negative, it's going to look like this.
So, in this case it's a local minimum, and this
case a local maximum.
But, in both cases you see a,b both positive,
a,b both negative.
The combination a times b, or 4a times b.
In both cases it's positive.
So, that's why we get into the first condition, in the
situation of the first condition where D is positive.
So, in this case, we can say for sure that it's a maximum
or a minimum, but we cannot say which one.
So, we have to look at it more closely.
OK.
And, what if D is negative.
If D is negative, that means that a,b have different signs.
And in this case, so a good example of this would be z equals
x squared minus y squared.
And, that's a hyperbolic paraboloid.
And for a hyperbolic paraboloid, I drew this picture
before, it looks like a saddle, and on the saddle, there is a
point from which you can either increase the function if you go
along one of the parabolas, which opens up this way, or you
could also decrease the function by traveling on
a different parabola perpendicular one where it
opens downward, right.
So, this point clearly, this point on the saddle is not a
point of maximum or minimum.
So, that is the explanation of this criterion in the case of
quadratic functions, functions which are combinations of
x squared and y squared.
And, the point is that all other functions can be reduced
to this one by certain procedures, and that's
how you get this rule.
OK.
So, that's how we get this rule for local maxima and minima.
And, that takes care of that issue.
The last remaining topic is how to find the absolute
maximum and minimum on particular domains.
And, this I will illustrate very quickly by a
concrete example.
This is step 1 and this is step 2.
Let me give you an example of how to find maximum
and minimum, global maximum and minimum.
I have just enough time to explain this.
So, let's say you have a function, f of x y, which is
x squared plus y squared plus x squared y plus 4.
Find global or absolute maxima and minima on the domain of points x,y
where the absolute value of x is less than or equal to 1,
and the absolute value of y is less than or equal to 1.
So, the first step is to sketch the domain.
Sketch, it's very easy, right.
This is just the square whose sides lie on the lines x equals 1,
x equals negative 1, y equals 1 and y equals negative 1, parallel to the axes.
Step 2 is to find the boundary, identify the boundary.
This is the boundary.
And, now we are going to make a list of suspicious points, or
points which are candidates for being maxima or minima.
OK.
And, this list will include three kinds of points.
The first are points in the interior, where both
partial derivatives are 0.
What do I mean by interior?
Interior means everything except the boundary.
So, I have to calculate what is fx and what is fy? fx is
2x plus 2xy and fy is 2y plus x squared.
Right.
So, we have to set this equals to 0 and this equals to 0.
Since I'm running out of time, let me just go
to the next step.
You solve these equations, it's very easy.
Right.
This is the first group of points that you
get on your list.
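The solution is skipped in the lecture for lack of time. A sketch of it, taking f_x = 2x + 2xy, which is what differentiating x^2 + y^2 + x^2 y + 4 with respect to x gives:

```latex
f_x = 2x + 2xy = 2x\,(1 + y) = 0 \;\Longrightarrow\; x = 0 \ \text{or}\ y = -1 ,
\qquad
f_y = 2y + x^2 = 0 .

% Case x = 0:   2y = 0, so y = 0, giving the point (0, 0).
% Case y = -1:  x^2 = 2, so x = \pm\sqrt{2}, which violates |x| \le 1.

\text{Only candidate in the interior: } (0, 0), \qquad f(0, 0) = 4 .
```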
The second group of points are points on the boundary, but
which belong to the smooth part of the boundary.
Two is smooth part of the boundary.
Yes?
[INAUDIBLE]
2x plus 2xy.
I'm sorry, that's right.
All right.
Smooth part of the boundary.
What I mean by this, well exclude -- I mean, maybe
it's not a good idea to say smooth part.
Maybe it's not like this.
Let's just say, let's just call it the components of the
boundary, components of the boundary.
So, what I mean by components, I mean these four intervals.
So, break your boundary into four pieces, each of which can
be represented by a nice equation, like here.
Here it's like x is equal to 1 and y is between
negative 1 and 1.
So, then you
restrict your function.
Restrict your function to this component.
It will effectively become a function in one variable.
Solve the problem for this one variable function.
One minute left.
OK.
So, let me give you an example.
Say, one of the components is y equals 1 and x is
between negative 1 and 1.
So, I substitute this y equals 1 into this formula, and I get
f of x,1 is x squared plus 1, plus x squared plus 4.
So, that's 2x squared plus 5.
I got a function in one variable on this interval.
Find the absolute maximum and minimum of this
function on that interval.
OK.
And, then the same for each of the other components.
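The lecture works out only the component y = 1. For reference, here is a sketch of the restriction of f(x, y) = x^2 + y^2 + x^2 y + 4 to each of the four sides of the square, obtained by direct substitution:

```latex
f(x, 1)  = x^2 + 1 + x^2 + 4 = 2x^2 + 5,      \quad -1 \le x \le 1, \\
f(x, -1) = x^2 + 1 - x^2 + 4 = 5,             \quad -1 \le x \le 1, \\
f(1, y)  = 1 + y^2 + y + 4 = y^2 + y + 5,     \quad -1 \le y \le 1, \\
f(-1, y) = 1 + y^2 + y + 4 = y^2 + y + 5,     \quad -1 \le y \le 1 .
```

Setting the one-variable derivative to 0 on each side gives x = 0 on y = 1 (value 5) and y = -1/2 on x = 1 and x = -1 (value 19/4); on y = -1 the function is constant, equal to 5.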
And, that's not all.
It would have been all, it would have been all, if you
didn't have the corners.
Because you have corners you have to include them, because
in principle it could happen that the maximum or minimum is
attained at the corners, so include the corners.
So, now you compile the list, and you evaluate the function,
and you choose the one where the value is maximum.
So, that's how you do it.
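Putting the three kinds of points together for this example, a minimal Python sketch of the final comparison might look like this. The candidate points are worked out by hand rather than computed, and the variable names are chosen for illustration.

```python
def f(x, y):
    return x**2 + y**2 + x**2 * y + 4


# 1. Interior points where fx = 2x + 2xy = 0 and fy = 2y + x^2 = 0.
interior = [(0.0, 0.0)]

# 2. Points on each side where the restricted one-variable
#    function has derivative 0 (worked out by hand):
#    y = 1:            f = 2x^2 + 5,    derivative 0 at x = 0
#    x = 1 and x = -1: f = y^2 + y + 5, derivative 0 at y = -1/2
#    y = -1:           f = 5 is constant; one sample point is enough.
sides = [(0.0, 1.0), (1.0, -0.5), (-1.0, -0.5), (0.0, -1.0)]

# 3. The four corners of the square.
corners = [(1.0, 1.0), (1.0, -1.0), (-1.0, 1.0), (-1.0, -1.0)]

candidates = interior + sides + corners
values = {p: f(*p) for p in candidates}
max_value = max(values.values())
min_value = min(values.values())
print("max", max_value, [p for p, v in values.items() if v == max_value])
print("min", min_value, [p for p, v in values.items() if v == min_value])
```

With these candidates the largest value is 7, at the two corners (1, 1) and (-1, 1), and the smallest is 4, at the interior point (0, 0), so in this example the corners really do matter.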
All right, have a good weekend.