Uploaded by UCBerkeley on 17.11.2009

Transcript:

On Tuesday, when I explained this stuff, I was motivated by

the geometric aspects of directional derivatives,

and graphs, and tangent planes and tangent lines.

So now, I want to start with sort of a slightly different

end, which is just one of the applications of this technique.

In other words, what is the real-life situation where we

might be interested in finding directional derivatives.

And, as I explained last time, the directional derivative is just

a fancy term for the rate of change.

So, here's a typical example which you may have already seen

in the book or in your section work, which is imagine a

mountain, imagine a mountain, OK, and there is an ocean

somewhere, next to the beach.

But, we'll talk about this later.

Focus on the mountain for now.

This is a mountain.

Now, if I draw it like this, it's not clear whether it is

a mountain or just a curve, right.

So, in order to give it an illusion of 3-D, of

three-dimensional picture, what I usually do as a reflex, and

what you would normally do or when you read the book, what

you see there is that you draw some curves on it.

You're kind of trying to give it a three-dimensional feel, right.

So, my first point is what are these curves.

These are the level curves.

So, that's the first thing I want to say.

Which of course, I've said before, but I just want to

emphasize it one more time.

Even to visualize this three-dimensional object on the

plane, we find it very convenient, very useful to

imagine not just the contour, the general contour of this

object, but this collection of curves, which are kind of

parallel to each other.

What are they.

Well, they are just obtained by taking sections of the

mountain, by parallel planes, or by parallel horizontal

planes, which are parallel to the floor -- to the ground, OK

-- that's what they are.

Well, in fact, I have drawn just the visible parts

of those curves, right.

There is also a back side for each curve.

For example, this one also has a backside.

We don't see it unless it's a transparent object.

We don't see it, so that's why usually we indicate like this.

So, each of them is actually, has sort of a second

half, which is behind, which we don't see.

This, we also don't see when we look at the actual mountain,

but when we try to draw it, draw the picture, then we,

you know, we draw this.

Okay, so that's the first point.

So, the second point is what does this have to do with

directional derivatives.

The directional derivative is the rate of change.

So, change of what?

So, in this particular setting, there is a very

good example of this.

Which is that suppose that there is somebody here

on this mountain, who is climbing it, OK.

So, there is a climber.

And, so the climber wants to decide which way they should

go, and depending on which way they go, what will be the rate

of, you know, how steep will the climb be.

That's the question.

So, the rate of change will be, here, the rate of its

altitude change -- his or her altitude change, right.

And, the higher the rate, the steeper is the climb, in case

we're going towards the top of the mountain, or the

steeper the descent, if we're going down.

And, you know, likewise the smaller the rate of change, the

smaller is the steepness.

So, how do we measure this.

The point is there isn't a single number, there isn't a

single number to give us the steepness of the climb.

Because, the climber could go in many different directions,

and for each direction, there is a particular steepness rate.

For instance, the climber may be tired, and doesn't want to

climb anymore, so in that case the climber could just go

along the level curve.

So, then the altitude of the climber, the height of this

point over the sea level, will not change at all.

So, then the rate of change, or the steepness level, is 0,

steepness rate is 0, right.

So that's one possibility, that would correspond to

going in this direction.

But, on the other hand, the climber could go, could choose

the steepest path, which would actually be, if you think

about it just intuitively, you could guess that it should be

something which is perpendicular to

the level curve.

In fact, this is something that we have confirmed

by calculation.

I will go over it one more time in a couple minutes.

So, in that case, the rate of change is the highest possible.

So, in this direction, the rate of change is 0.

In this direction, rate of change is maximal.

And, if we go in the direction which is perpendicular to the

level curve, if we go down, then it's going to be minimal,

the smallest possible.

Well, its absolute value is still going to be the highest

possible, but it's going to have a negative sign.

So, as a number, it will be the smallest possible value.

So here, the rate is minimal.

OK.

And, so the way we -- in order to talk about the rate of

change, you have to choose a direction.

That's why it's called the directional derivative.

It's not, there isn't a single derivative for a function of

two variables, but there is a whole variety of derivatives.

A derivative is determined by the choice of

the direction, OK.

So, what are the variables here, which variables

am I talking about.

Well, the variables are somewhere here, so there is --

let's actually do it, let's do it on this board, because I

don't want to mess up the picture.

So, I want to draw the coordinate system.

And, let's say this plateau, this plane -- the x, y plane

-- is the sea level.

And the z then will correspond to the height, to the altitude

above the sea level.

Now, our point has a projection onto the plane, and so

on the plane it has coordinates, x and y.

Maybe x0 and y0 to emphasize that these are some

fixed numbers.

So, these are the coordinates of -- this corresponds to the

position of the climber.

See, the point is that the climber is in space, right.

So, a priori, the climber has three coordinates

-- x, y and z.

But, because the climber is not flying, you know, is not

jumping with a parachute, he is on the mountain.

Because he is on the mountain, as soon as we know the x, y

coordinates, we know the z coordinate, unless the mountain

has a shape which sort of comes back, right.

But normally, for a normal mountain, for a normal

mountain it's not going to happen, right.

So, then the z coordinate is actually determined by

the x and y coordinates.

That's why the only parameters here are x

and y, not x, y and z.

In fact, you can think of this surface as a

graph of a function.

The graph represents the mountain, OK.

And, so now when we talk about the direction, we can think

about the direction as being the direction on the mountain,

but we could also think about the direction as being the

direction on the x, y plane.

And, so in other words, what we can do is we can drop

the level curve down here.

It's not going to look exactly the same, because I have to

magnify the picture compared to the picture here.

This is not -- this is bigger.

I have used a different scale for the bottom picture as

opposed to the top picture, so the level curve will

look like an ellipse.

But, I have magnified it, I have scaled it to make it

bigger, so that it is easier to draw.

So, then the directions which we have talked about

here are the following.

This one is parallel to the level curve.

That's the direction along the x, y plane or in the x, y

plane, which would correspond to the movement on the mountain

parallel to the level curve.

This is the direction on the x, y plane, which will correspond

to the path of steepest ascent, for which the rate is

maximal. And, this will be the direction for which the

rate will be minimal.

So, this is what we call the rate of steepest descent.

Now, I would like to draw these vectors in such a way that

they are of the same length.

Because, they are supposed to be unit vectors.

This is a convention.

We agree from the beginning that we will measure directions

by unit vectors, OK.

And, the point is that these are perpendicular.

So, what are these vectors.

This again is a vector parallel to the level curve, and this

vector is a vector perpendicular to

the level curve.

And, this is what we discussed last time.

I mean both of these vectors are perpendicular

to the level curve.

And, the one which corresponds to the steepest ascent, is the

one which is the gradient vector.

So, this is actually the gradient vector.

So this is nabla f.

Because, to get the steepest ascent, you have to go inside

this level curve, right?

Because you go towards the center of the mountain.

And, this one will be negative.

OK, and the point is I explained last time why this

gradient vector is actually perpendicular to the tangent

vector, or in other words, perpendicular to the tangent

line to the level curve.

This is some calculation which involved the knowledge of

equations for lines on the plane, OK.

So, that's the picture, but in principle there are

many other directions.

We can also draw a direction like this say.

Again, some unit vector, u, which would be a,b, which would

have two coordinates a and b.

And, if a climber goes in this direction, then her path would

go along that, and it will correspond to some

path on the mountain.

Like this.

Which, is neither the steepest descent nor is

it parallel to the slope.

In this case, it goes down, because the vector

points outward.

So, this direction or more precisely this line, which

contains this vector, will correspond on the mountain

to some specific path which starts at this point, and

then go somewhere, OK.

And, what we are calculating is just the slope of that

curve, of that path.

So, the directional derivative -- directional derivative

D sub u of f at x0, y0,

with respect to this vector u, is the slope of the path on the

mountain, or on the graph, corresponding to the

line containing u.

You see, this is the line I'm talking about,

this yellow line.

I didn't draw it very well.

Maybe it's more like goes like this.

So, if I look at this line, this line will

give me that path.

What do I mean by give me.

If I lift that path, there's a path on the x,y plane, but it

has a unique lift to the graph.

This is the yellow path on the graph.

In other words, this line or this half line, is the

projection of that path.

The unique path on the mountain, which starts at that

point and whose projection onto the x, y plane is the

half line directed by u.

You see what I mean?

Are there any questions about this?

OK.

So, the point is that the graph is two-dimensional.

It's a surface.

But, once you choose a direction, you cut a path or

a curve on that surface.

So, you are back to one-dimension, to the

one-dimensional case.

And, in the one-dimensional case, you can actually

talk about the slope.

Because, you get the graph of a function of one variable,

namely the variable along this line, and you can talk

about its slope.

That slope is the rate of change along that path.

That's what we call the directional derivative.

And, finally we will have a formula for it, which involves

the gradient vector, and this formula tells us when this

directional derivative takes some particular values.

For example, the maximum value, maximum value corresponds to u

pointing along nabla f, minimal value to its opposite, and the 0 value

corresponds to the tangent, to the u which is tangent

to the level curve, OK.

But, this we already knew from, by analyzing this picture just

on the grounds of common sense.

We didn't need to do any calculation to figure this out.

In fact, when you are climbing the mountain, you're not

pulling out a paper pad and a pen and starting to calculate

what is the best way to reach the top of the mountain.

You kind of follow your intuition.

And, what your intuition will always tell you is that if you

want to go, reach the top, in the fastest possible way, you

have to go perpendicular to the level curve.

In the direction perpendicular to the level curve, and

likewise, if you want to go down the fastest way, you also

go perpendicular to the level curve.

Is that clear?

OK.

So, intuitively it's clear.

But, now we have proved that because we found the formula

for the rate of change.

And, from this formula, which is written in terms of dot

product, it's plain obvious when it takes the maximum

value, the minimum value or the 0 value.

And, that was one of the main conclusions last time, but now

I have illustrated it in this way.
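
As a rough numerical illustration of the dot product formula, D sub u of f equals nabla f dot u, here is a small sketch. The sample mountain f(x, y) = 1000 - x^2 - 2y^2, the point (1, 1), and all the helper names are made up for this illustration; none of them come from the lecture.

```python
import math

def f(x, y):
    # Hypothetical mountain: height above sea level (an assumption, not from the lecture).
    return 1000.0 - x**2 - 2.0 * y**2

def gradient(x, y):
    # Partial derivatives f_x and f_y of the sample f, computed by hand.
    return (-2.0 * x, -4.0 * y)

def unit(v):
    # Rescale a vector to length 1, since directions are measured by unit vectors.
    n = math.hypot(v[0], v[1])
    return (v[0] / n, v[1] / n)

def directional_derivative(x0, y0, u):
    # D_u f(x0, y0) = gradient at (x0, y0) dotted with the unit vector u = (a, b).
    gx, gy = gradient(x0, y0)
    return gx * u[0] + gy * u[1]

def slope_along_path(x0, y0, u, h=1e-6):
    # Finite-difference slope of the path on the graph lying over the line through u.
    return (f(x0 + h * u[0], y0 + h * u[1]) - f(x0, y0)) / h

x0, y0 = 1.0, 1.0
g = gradient(x0, y0)
up = unit(g)                    # along the gradient: steepest ascent
down = (-up[0], -up[1])         # opposite to the gradient: steepest descent
along = unit((-g[1], g[0]))     # perpendicular to the gradient: tangent to the level curve

rate_up = directional_derivative(x0, y0, up)
rate_down = directional_derivative(x0, y0, down)
rate_along = directional_derivative(x0, y0, along)

print("steepest ascent :", rate_up)
print("steepest descent:", rate_down)
print("along level curve:", rate_along)
print("finite-difference slope going up:", slope_along_path(x0, y0, up))
```

For this sample function the three rates come out as the length of the gradient, minus that length, and zero, which matches the picture with the climber.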

OK.

So now, one more -- that's odd.

Here is the problem -- that's called a Catch-22.

Alright, so I have -- this is a small inconvenience.

And, that's very clever.

I will not try to get it out of there, because I don't want

to pull the second one.

So, yes, what is that symbol?

It is called nabla.

It's a Greek letter, which is written opposite of delta.

We're using it for the gradient.

I'm sorry?

That's right, this is a notation for the gradient.

OK.

So, and one other thing which I wanted to mention in this

regard, is we talked about equations of-- we have talked

about equations of tangent lines and tangent planes.

And, I know this could be confusing, because you have

many different objects at the same time for which you could

look at tangent lines and tangent planes.

Seems like there are many different discussions going on.

Let's focus, okay.

So, there are tangent lines and tangent planes.

And, so I just want to summarize the stuff once more,

so that there's no ambiguity or there is no confusion.

So, the first is a two variable case.

In a two variable case, we look at the level curve

of a function f of x, y.

So, it is given by this equation f of x, y equals k.

So, that's

this level curve, that's this curve of equal height, or

equal altitude, which I drew over there.

OK?

And so then, we could look at the tangent line, tangent line

to this curve, and the equation of this tangent line at the

point x0, y0 is like this.

It's

f sub x at x0, y0 times x minus x0, plus f sub y at x0, y0,

times y minus y0, equals 0.

OK.
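
Written in symbols, the tangent line equation that was just read aloud would look as follows. This is only a transcription of the spoken formula, assuming (x_0, y_0) is a point on the level curve f(x, y) = k.

```latex
f_x(x_0, y_0)\,(x - x_0) + f_y(x_0, y_0)\,(y - y_0) = 0
```

Read as a dot product, it says that the gradient vector at (x_0, y_0) is perpendicular to every vector (x - x_0, y - y_0) along the tangent line.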

But, in fact, we can now look at the case of three

variables as well, right.

Let me actually, let me do it here.

Three variables, three variables case.

In the three-variable case, we would want to take, instead of

a function in two variables, f of x y, we would want to take a

function, maybe F capital, a function of three

variables, x, y, z.

And, by analogy, we would have to look at the level, but

not curve, but now level surface of this function.

And, so that would be F of x y z equals k.

Maybe I should say that k is some number.

So, that for example, in this discussion k would be the

height, so I don't know 1,000 feet.

Whereas x and y and z are variables.

So, that's the difference.

It's an equation.

It's one equation for three variables, because in this

equation this is some number, like 1,000.

So, then you can ask what is the equation of the tangent,

but now not line, but plane to the surface at the

point x0, y0 and z0.

And you see, the point is that the answer is given by

something which looks exactly the same, except now we

have three variables.

So, we have to add one more term, which involves

the third variable, z.

So, what the answer is, the answer is the following. You

have to take the partial derivative with respect to x,

multiplied by x minus x0, plus the partial derivative with

respect to y, plus the partial derivative with respect

to z now, also.
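
The spoken version skips the factors multiplying the second and third partial derivatives. Written out in full, and assuming all three partial derivatives are evaluated at the point (x_0, y_0, z_0) on the level surface F(x, y, z) = k, the tangent plane equation would read:

```latex
F_x(x_0, y_0, z_0)\,(x - x_0) + F_y(x_0, y_0, z_0)\,(y - y_0) + F_z(x_0, y_0, z_0)\,(z - z_0) = 0
```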

So, the difference between the two variable case and the three

variable case is that we now have an extra variable.

So, everything gets dimension one higher, the dimension

gets bumped by one.

We had a curve now we have a surface, we had a line

now we have a plane.

The equation here involved two partial derivatives, and

had this very simple form.

And, now the equation involves all three partial derivatives,

but has the same form.

So, I will not derive this formula.

It is derived in the same way as in this case, in

the case of two variables.

But, I hope it looks convincing to you, because you can

clearly see the analogy.

And, in fact, if you want to prove it, you can prove it

in exactly the same way.

Now, what is slightly confusing in this is that there is a

special case of this three variable, of this three

variable picture.

And the special case is, when F of x, y, z is

f of x,y, minus z.

So, you can ask, why would we even bother to look

at this special case.

And, the reason is very simple.

Because, in this special case, if I look at the equation

F of x, y, z equals 0.

Which is a special case of a level surface.

Namely, the case when k is 0, right.

This is just equation z equals f of x,y.

And, this equation defines a graph, a graph of a function

of two variables.

So, it's kind of funny that a function of two variables

shows up in two different contexts.

It shows up here in the context of level curves, OK, but it

also can show up here, for functions with three variables.

Even though it is a function of two variables.

But even when we have a function with two variables,

and we think about graphs, we automatically go to the

three-dimensional situation, right.

And, so the graph of a function in two variables can be thought

of as a level surface for a function in three variables.

Which function.

Well, this function, f of x,y minus z.

It's kind of like the simplest concoction you can make out

of f and the new variable z.

So, we can apply this general formula for the tangent plane

to this special case, and what do we get?

Let's observe that F sub x is just f small sub x.

Because when you take partial derivative of this function big

F, you have to differentiate this one, that would

be just f sub x.

And you differentiate this one, but this one is

independent of x.

So this doesn't change anything.

So partial derivative of this function, of this whole

function with respect to x is just the partial derivative for

this part, so that's small f.

Partial derivative with respect to y is f sub y.

Partial derivative with respect to z is what?

It's negative 1.

It's negative 1, because that's the derivative of this function

negative z with respect to z.

This guy doesn't depend on z, so its partial derivative

with respect to z is 0.

But, the partial derivative of this term is negative 1.

So, we get this, three partial derivatives, which we

substitute into this formula, and what do we get?

We get f sub x of x0, y0 times x minus x0, plus f sub y of x0, y0, times

y minus y0, minus z minus z0 equals 0.

And now, we recognize the equation of the tangent plane

to the graph, which we have known already.

This is the one which we got already two weeks ago, when we

talked about differentials and linear approximation.

The only difference is that now I put negative z minus

z0 on the left-hand side.

But in our old discussion, we would write equals z minus z0.

And, then we would switch the left- and right-hand

sides too, but that's a minor issue, right.

So, this is just a slightly different form of writing the

same equation, just putting everything on one side.
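
For reference, the substitution just described can be laid out step by step. This is a sketch of the same computation in symbols, with z_0 = f(x_0, y_0) assumed, since the point lies on the graph.

```latex
\begin{aligned}
F(x, y, z) &= f(x, y) - z, \qquad F_x = f_x, \quad F_y = f_y, \quad F_z = -1, \\
0 &= f_x(x_0, y_0)\,(x - x_0) + f_y(x_0, y_0)\,(y - y_0) - (z - z_0), \\
z - z_0 &= f_x(x_0, y_0)\,(x - x_0) + f_y(x_0, y_0)\,(y - y_0).
\end{aligned}
```

The last line is the familiar tangent plane to the graph from the discussion of linear approximation.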

And, now you see that the case of graphs of -- the case of

tangent planes to graphs of functions of two variables can be

thought of in two different ways.

OK.

Namely, you can think that you started with a function of two

variables, and you just look at the graph and you look

at the tangent plane.

But, you can also think of it as a special case, of the more

general case, of functions of three variables, except you

take as a function of three variables this very special

form: f of x y minus z.

Either way you approach it, you get the same answer.

But now, you can appreciate more the connection between

this answer and this.

Many people asked me after the last lecture why, when we

look at the equation of the tangent line, for a function of

two variables, it's as though we are dropping this

term, z minus z0.

So, there's this negative 1, which we just dropped.

Well, geometrically it's clear.

In fact, the board stayed since Tuesday, so I guess nobody

likes this small board, except for me, which is good.

So, this is its tangent line, and this tangent line

corresponds to, along this tangent line to the level

curve, we have the same value of z, z equals

z0.

So, that's why we drop this term, to go from

this equation to this.

So, you can get this equation from this, by dropping z minus

z0, because z is equal to z0, along the level curve.

But, also you can now understand that this formula is

a special case of the formula for the tangent plane to a

general level surface, which actually looks like this one.

Except, we have a third variable.

In a special case, when the function is like this, this third

term becomes extremely simple.

It just gets a coefficient of negative 1.

So, you get minus z minus z0.

OK.

And, to make the analogy complete, let's actually look,

let's fill in this square.

You know, it's like when you do IQ tests.

I've never done it.

But it's easy to find them online.

And, I think the problem is often like fill in the square,

so this is exactly the kind of question here.

What should be here.

Right.

In other words, this is a case of three variables, and this

is a special case of that.

Now, what's the analog of the special case for functions

of two variables?

That's the case when this function of two

variables, special case.

When this f of x, y is some function in one variable,

let's call it g of x minus y.

And, so you see in this case, in fact, you know what I'm

going to do to make it look more like an analogy,

let's actually, let's re-position the board.

There we go.

So, now I think it's more clear what I mean by

filling in the square.

I want to find a special case of this, which is analogous to

how we found the special case of three variables.

And, that's the case.

Now, our function of two variables is equal to

another function of one variable minus y.

OK.

In this case, the equation, f of x y equals 0,

means y equals g of x.

Right?

And, this is a graph.

Graph of the function g of x.

So, a level curve for the function of two variables

can become the graph of a function of one variable.

When this function of two variables has

this special form.

Did you have a question?

[INAUDIBLE]

Good question.

So, what would happen if we put some k?

Right.

If we put some k, it will be here, I would put minus k.

Right.

So, then I could just absorb k into the definition

of the function g of x.

If I redefine my function g of x by subtracting k, then I

would get back the level 0.

So, that's why we don't lose any generality by looking at

the case of level 0, rather than, as opposed

to the general case.

So, that's a graph.

So now, this formula for the tangent line, note that f sub

x of x,y now is g prime of x.

Just like here, the partial derivatives of the big function

F with respect to x and y were just the derivatives

of the small f.

And now, the role of the small f is played by g, so the

partial derivative, like this, is just g prime.

And, the partial derivative with respect to the second

variable is negative 1 again.

Because this, minus y, plays the same role as

minus z played here.

So, when we take the derivative, we get negative 1.

So now, this formula for the tangent line becomes g prime

of x0 times x minus x0, minus y minus y0 equals 0.

Let me rewrite this.

This is equivalent to saying y is equal to f prime of x0

times x minus x0 plus y0.

We recover the old formula for the equation of the tangent

line to the graph of a function of one variable.

That formula is exactly this one.

The slope is f prime, you multiply by x minus x0 and you add

the value of the function at the point x0, which is y0.

Right.

So, there is nothing mysterious in this formula.

In this special case, we get back the old formula

we've known all along.
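
Here is the same chain of steps in symbols, as a sketch. The one-variable function is written as g throughout, and y_0 = g(x_0) is assumed because the point lies on the graph.

```latex
\begin{aligned}
f(x, y) &= g(x) - y, \qquad f_x = g'(x), \quad f_y = -1, \\
0 &= g'(x_0)\,(x - x_0) - (y - y_0), \\
y &= g'(x_0)\,(x - x_0) + y_0, \qquad y_0 = g(x_0).
\end{aligned}
```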

And, also this now sheds some new light on this coefficient

negative 1, which many of you have found mysterious.

It's not mysterious, it's as mysterious as this coefficient

negative 1, which shows up in the old formula for the

tangent line to the graph.

We were not surprised to write the formula for the graph of

the function y equals f prime times, you know, in this form.

But, if you have it in this form, you can rewrite

it like this.

When you rewrite it like this, you find the

coefficient negative 1.

The reason it appears is exactly the same reason

why this negative 1 appears.

Any questions about this?

Yes.

[INAUDIBLE]

Why do we choose a special case?

That's a very good question.

Why do we even choose a special case.

Well from the point of view, let's talk about this case.

From the point of view of functions of two variables,

this sounds strange.

Why would you write it like this, and not f of x minus

x,y or something, right?

So from one point of view, functions of two variables,

it doesn't make any sense.

It makes a lot of sense, however, from the point of view

of the theory of functions in one variable.

When we studied functions in one variable, we liked

to visualize them by graphs, right?

When we draw a graph of a function with one variable, we

introduce one more variable, and we look at the graph,

which is y equals f of x.

What I'm saying now is that within this formula, this

formula that we are developing, we can think of the graph of g

of x, which normally we would write as y equals g of

x, just in this form.

And when we write it in this form, we never say the

word level curve or anything like this.

We just say graph.

But, we have to realize, it is important to realize, to see

the connection between different formulas.

It's important to realize that this graph actually can be

thought of as a level curve for a function of two variables.

And, that function just happens to be this function, even

though it looks kind of -- there's no reason a priori

to study such functions.

We have introduced them, because we started from the

point of view of functions of one variable, and then this

naturally fell out, once we started to look at the graphs.

So that, likewise in this case.

Yes?

[INAUDIBLE]

That's right.

Could be a point, or finitely many points, because you could

be -- let's look at the function.

And, for a good reason right, the dimension of the level

curve is going to be a number of variables, or a level curve

or a level surface, and so on, would be the number of

variables involved minus 1.

If you have two variables, it's a level curve

so the dimension is 1.

And three variables it's a level surface, dimension 2.

If it's a function of one variable, it will

be of dimension 0.

Zero-dimensional objects are just collections of points.

And, the way it works is just like this.

Let's look at, for example, you have a parabola, level curve

consists of 2 points, but, you know, if you have a cubic

parabola, like this, there would be 3 points.

And, if you have a cosine, you would have

infinitely many points.

Infinitely many points if the level is between

1 and negative 1.

And, if the level is higher than 1 or lower than negative

1, then it will be empty.

Level curve could be empty, or level surface.

In this case, it's sort of level point.

We don't have a good word for a collection of points.

So, it's like level -- zero-dimensional object.

Manifold, as a mathematician will call it.
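
To make the "level points" of a function of one variable concrete, here is a small sketch that counts the solutions of g(x) = k inside a finite window by looking for sign changes on a grid. The specific functions, levels, windows, and the helper name count_level_points are assumptions chosen for illustration. The cosine really has infinitely many level points on the whole line; the window only shows finitely many of them.

```python
import math

def count_level_points(g, k, lo, hi, steps=10000):
    # Count points in [lo, hi] where g(x) = k, by detecting sign changes
    # of g(x) - k between neighboring grid points (tangential touches may be missed).
    count = 0
    prev = g(lo) - k
    if prev == 0.0:
        count += 1
    for i in range(1, steps + 1):
        x = lo + (hi - lo) * i / steps
        cur = g(x) - k
        if cur == 0.0 or prev * cur < 0.0:
            count += 1
        prev = cur
    return count

parabola = lambda x: x**2
cubic = lambda x: x**3 - 3.0 * x

n_parabola = count_level_points(parabola, 1.0, -3.0, 3.0)            # x^2 = 1
n_cubic = count_level_points(cubic, 1.0, -3.0, 3.0)                  # x^3 - 3x = 1
n_cosine = count_level_points(math.cos, 0.5, 0.0, 4.0 * math.pi)     # cos x = 1/2 on two periods
n_empty = count_level_points(math.cos, 2.0, 0.0, 4.0 * math.pi)      # level above 1: nothing

print(n_parabola, n_cubic, n_cosine, n_empty)
```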

Any other questions about this.

Yes.

[INAUDIBLE].

Oh yes, I'm sorry.

Thank you.

That was just a, that was a mistake.

Thank you.

Yeah, it's g prime of course, I'm just saying this formula

becomes this formula.

I called it g right, sorry.

Yeah, I completely messed it up.

Good job.

So, that will do it for us in this topic, and actually we are

running out of time, so we need to talk about something

else today, also.

I did want to go over this slowly to emphasize the

connection between these different objects, because I

think that there are different dimensions and different numbers

of variables at play, and it could be very confusing.

But, I think that if you put this in this sort of -- in this

picture, where you have these four squares, two variable

case, three variable case, special case, and two variable

special case, and three variables, then I think it

becomes much more clear.

Alright, but the next topic we'll discuss concerns finding

maxima and minima of functions.

And, as is always the case, it's actually instructive to

look at this question already in the one-dimensional,

one variable case.

Because, we can already obtain some insights into the problem

by looking at this very special, simplest

possible case.

If you have a function in one variable, it's a natural

question to ask where this function attains maximum

and minimum values.

That's important, because this function could correspond to

something in real life, and you may want to maximize

that or minimize that.

And, so the first point I want to emphasize is that there are

two different types of maxima and minima, the local

and the global.

The global ones are called absolute.

I like to think local, global.

I like this terminology because terminology is

a little bit better.

So, what do I mean by local.

So, let me draw this.

For a function of one variable, it is very convenient to

analyze everything by using graphs of functions.

And, graphs, again, are curves on the plane.

Right.

So, we introduce the new variable y, and we write

a graph as given by the equation y equals f of x.

So, let's look at this kind of function.

That's a very typical example.

So, I want to focus on this point.

So clearly, this point -- the value of the function at this

point, this will be the point x0, and that's the

value of the function.

This is f of x here.

The value of this function, at this point, is greater than

the value at nearby points.

So, that's an example of a local maximum.

A point is a local maximum if there is a small neighborhood

of this point such that if you restrict your function to this

neighborhood, which is this little interval in this case,

then this function will -- this will be the maximum

value on the interval.

Okay.

But is it the global maximum.

Clearly not, because I have a point here, for example,

x1, for which the value

is higher.

So, that's not a global maximum.

That's not a global maximum either.

In fact, in this example, there is no global maximum,

because I'm assuming the function keeps growing,

it keeps increasing as x is increasing, OK.

If that's the case, there is no global maximum.

So, global maximum is a completely different, finding

global maxima is a completely different game than

finding a local maximum.

Finding local maxima just involves analyzing the function

on a very small interval around this point.

Finding a global one sort of involves analyzing all

points in your domain.

The way I phrased the question -- I have raised the question

so far, is as though we were studying global maxima on the

entire line, on the entire x line.

OK.

And, you see clearly that that question often

doesn't have an answer.

In other words, there is no global maximum, simply because

for any point, there will be another point at which you will

have a higher value, higher value, higher value,

and so on, OK.

The question of finding global maxima is better to phrase on

domains which are bounded.

Not on the entire line, but on bounded domains.

Bounded means that it is finite.

So, it's better to say, what is the maximum of this

function on this interval.

OK.

This is an example of a closed bounded domain

in the following sense.

First of all, it's bounded because it's finite.

It doesn't go to infinity.

Second, it is closed because it contains the endpoints.

And, these are the kind of domains we should look at if we

want to ask questions about global maxima or absolute

maxima or minima.

So, let's look at this question in this particular case.

In this particular case, we see that the maximum value is

actually taken at this point.

This is a maximum.

So, now you can appreciate why you have to

include the endpoint.

If we did not include the endpoint, there wouldn't be a

maximum, because no matter how close you are to this point,

there would be another point even closer for which the

value would be even higher.

So, therefore there would be no

maximum.

So, in order to guarantee that you have a positive answer

to the question of existence of a maximum, or minimum for that

matter, you should really look at closed and bounded

intervals, and then what happens is that the maximum can

be attained either at the boundary, which is the case

here, or it could be some local maximum which lies in the

interior of this interval.

In this particular case, you do have a candidate, you do have

a candidate for a maximum.

This one, because it is a local maximum, and it is within this

interval, but it's not a global maximum, on this interval,

because the value of this function at the endpoint is just bigger.

But, if I were to take a different interval, if I were

to take an interval like this, for example, then

this guy would win.

Because, at the boundary, the value would be smaller.

You see.

At the boundary, the value would be smaller.

So, this guy would have the highest possible value

on this interval.

So, the bottom line, the upshot of all of this, is that the

absolute maximum can be found in a finite set of points.

Those points are -- first of all the endpoints, and all

the points where you have potentially a local

maximum or minimum.

So, the global maximum on the bounded interval, on a closed

interval, on the bounded interval, let's call it [a,b],

can be found, maximum of some function f, at one of

the following points.

The endpoints, which are a, b, and the points of

potential local maximum.

And here, it is important to emphasize the word potential.

And, those are the points for which f prime of x is 0.

Because certainly, if it's a point of local maximum, then

the slope of the tangent line, at this point, is

going to be 0, right.

Because, if you have a non-0 slope, you just move away from

this point and you'll get a bigger or smaller value.

And, one side bigger and the other side smaller.

So, the only way you could have a, or possibly have a local

maximum, is to have slope 0.

A slope is a derivative, so that's why these are the points

for which the derivative is equal to 0.

So, this is the first statement that I want you to remember, or

recall, perhaps from the one variable calculus, which is

that, if you're looking for global maxima, what you need to

do is simply measure or evaluate the function

at the endpoints.

Evaluate at this point and evaluate at this point.

Next, find all the points where the derivative is 0, and

evaluate the function.

So, you get a finite list, and then just pick the

one, or the ones, where the value is maximum.

These are the values, these are the maxima of this

function on the interval.

In other words, you don't have to look through all

the points on the interval.

There are infinitely many.

But you only look at the endpoints and the points

where f prime is equal to 0.

That's the algorithm for finding maxima of a function.

Likewise for minimum, just replace the word

maximum by minimum.

So, it's exactly the same.
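
The procedure just described (evaluate at the endpoints, evaluate at the points where f prime is 0, pick the largest value) can be sketched in a few lines of Python. This is a minimal illustration and not part of the lecture: the helper name `global_extrema`, the sample function x cubed minus 3x, and the choice to let the caller supply the critical points are all assumptions made for the example.

```python
def global_extrema(f, a, b, critical_points):
    """Return ((x_max, f_max), (x_min, f_min)) for f on the closed interval [a, b].

    critical_points holds the solutions of f'(x) = 0, supplied by the caller;
    any of them lying outside the open interval (a, b) are ignored.
    """
    # Finite candidate list: both endpoints plus the interior critical points.
    candidates = [a, b] + [x for x in critical_points if a < x < b]
    values = [(x, f(x)) for x in candidates]
    best_max = max(values, key=lambda pair: pair[1])
    best_min = min(values, key=lambda pair: pair[1])
    return best_max, best_min


# Hypothetical example: f(x) = x^3 - 3x, whose derivative 3x^2 - 3
# vanishes at x = -1 and x = 1.
def f(x):
    return x**3 - 3 * x


# On [-2, 3] the right endpoint wins; on [-2, 1.5] the interior
# local maximum at x = -1 wins instead.
print(global_extrema(f, -2.0, 3.0, [-1.0, 1.0]))
print(global_extrema(f, -2.0, 1.5, [-1.0, 1.0]))
```

The two calls mirror the two intervals discussed above: changing the interval changes which candidate wins, but the candidate list is always finite.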

Now, before I go and generalize it to the case of two

variables, I want to explain what I mean by the word

potentially, potential local maximum.

In other words, if the point x is a local maximum or minimum,

then the derivative is 0.

I already explained this, because the slope has to be 0.

If it's a maximum or minimum, the slope has to be 0.

If the slope is non-0, it means you can increase or decrease

the value by moving a little bit away from the point.

So, this implication is true, but the converse is not true.

In other words, if the derivative of your function at

your point is 0, it doesn't mean it's a local

maximum or minimum.

And, the reason is the following: there is a very

simple counter-example to this, namely the function x cubed.

So, f of x is x cubed, f prime is 3x squared,

so f prime of 0 is 0.

Which we see, right?

We do see that the slope at 0 is 0.

Right?

But, is it the point of a local maximum or minimum?

It's not, because if you go this way it increases, and if

you go that way it decreases.

And, in fact, if you think in terms of monomials, the

same thing will happen if you have x to the n

where n is odd, like 3, 5, 7 and so on.

Because, the derivative of x to the n is n times x to the n minus 1.

So the derivative at 0 is always 0 for this function.

But, if n is odd, for positive x the value is positive and

for negative x it is negative.

So, it's going to look like this.

But if you have y equals x to the n where n is even, so it's 2, 4,

and so on, then it is going to look like this.

So, in that case it is okay.

It is actually a point of local maximum or minimum: a local

minimum in this case, and if the coefficient were negative,

if it were like this, it

would be a maximum.

So, in other words there are many possible scenarios where

you have a derivative equals 0, and it is a maximum, and there

are many scenarios where the derivative is 0, but it's

not a maximum or minimum.

So, it only goes this way, if there is a maximum or minimum,

then the derivative is 0.

That's why I said, those are the points which potentially

could be maxima or minima.
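
A small numerical check of the monomial examples, written in Python. It is only an illustration under stated assumptions: the helper name `behaviour_at_zero` and the step size `h` are made up for this sketch, and comparing two sample points next to 0 is a sanity check that happens to work for monomials, not a proof.

```python
def behaviour_at_zero(n, c=1.0, h=1e-3):
    """Describe x = 0 for f(x) = c * x**n with n >= 2, where f(0) = 0 and f'(0) = 0.

    Compares f slightly to the left and to the right of 0 with f(0) = 0.
    """
    left = c * (-h) ** n
    right = c * h ** n
    if left > 0 and right > 0:
        return "local minimum"
    if left < 0 and right < 0:
        return "local maximum"
    return "neither"


# The derivative vanishes at 0 in every case below, yet the outcomes differ.
for n in (2, 3, 4, 5):
    print(n, behaviour_at_zero(n))
print(4, behaviour_at_zero(4, c=-1.0))
```

Odd powers come out as "neither" and even powers as a minimum (or a maximum when the coefficient is negative), which is why a zero derivative only marks a potential maximum or minimum.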

OK.

So, in principle, you could rule some of them out, from the

outset, by saying, well, at these points f

prime is 0, but the point is not a maximum or minimum.

So, then it cannot possibly contribute to the list of

suspicious points, or candidates, for global maximum or minimum.

But, I think it's just much easier to just take all of

them, because there would be finitely many, and just

evaluate your function f at all of them, and then compare.

Where do you get the largest value, and where do you

get the smallest value.

Okay, good.

So, that's why, the way I formulated this, I did not want

at this level to try to differentiate between the ones

which are actually maxima and minima and which are not.

Let's look at all which are potentially maxima.

So, that's the one-dimensional case.

So, now, in some sense we already know everything we

need to know, because in a two-dimensional case, it's

going to look exactly the same.

The criteria will be slightly more complicated.

Maybe I'll say one more thing, which is that there is a

criterion to see whether the point is a maximum or

minimum in this case.

Namely, suppose that f prime is 0 at a point x0.

Let me emphasize that there is a particular

point x0, which was the point 0 in my previous examples.

Let's call it x0.

If f prime of x0 is 0, and f double prime of x0

is positive, then it's a minimum.

A minimum, local minimum.

And if f prime of x0 is 0, and f double prime of

x0 is less than 0, then it's a local maximum.

In other words, if you think about this in

terms of Taylor series.

Oftentimes, you can approximate a function,

a smooth function, by its Taylor series.

And, the first terms in the Taylor series are going to be

given by the value of the function, and the derivative,

and then the second derivative.

So, the point is that if the first derivative vanishes,

that's a necessary condition to have a local

maximum or minimum.

But then, it depends on which term in the Taylor

series is non-0 next.

So, for example, if the second term is non-0, that means your

function looks like x minus x0 squared times some

coefficient, right?

What I'm trying to say, what I'm trying to explain is

the following, let me do it more slowly.

The Taylor series looks like this.

Okay.

So, this is just the value of the function.

Let's assume without loss of generality

that it is equal to 0.

I mean, after all we could just subtract this

value from this side.

It's not going to change anything.

So, let's assume it's 0.

So, the next comes this term, which is the first derivative,

and the first derivative has to vanish otherwise it can't be a

maximum or minimum as we just discussed.

So, this also vanishes.

So, the next term is the second derivative, right.

And, then there are some additional terms.

But, the additional terms are negligible compared

to this term when x is very close to x0.

So, you might as well replace your function by this function.

But, this function is just the parabola, I mean the graph of

this function is just a parabola.

And, the parabola we know, the parabola would be like this if

the coefficient is positive, and it would be like this if

the coefficient is negative.

So, in this case, clearly this is a local minimum.

For this one.

For this one it's a local maximum.

And, the other terms don't matter.

So, that's the reason why we get this criterion.
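
The expansion pointed to on the board is not captured in the transcript. Assuming the standard Taylor expansion of f around x0, the argument can be written out as follows:

```latex
f(x) = f(x_0) + f'(x_0)\,(x - x_0) + \frac{f''(x_0)}{2}\,(x - x_0)^2 + \cdots

% With f(x_0) = 0 (subtract a constant) and f'(x_0) = 0 (necessary condition):
f(x) \approx \frac{f''(x_0)}{2}\,(x - x_0)^2 \quad \text{for } x \text{ close to } x_0
```

The right-hand side is a parabola with vertex at x0: it opens upward, giving a local minimum, when f double prime of x0 is positive, and it opens downward, giving a local maximum, when f double prime of x0 is negative.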

But if it is 0, if this term is also 0, then we can't really

tell, because we don't know what comes next.

If the non-0 term is a cubic term, we know it's not going to

be maximum or minimum, because we looked at a cubic parabola,

and it's like this.

It doesn't have a maximum or minimum, right?

But, if the cubic one vanishes, and the quartic one is non-0,

then again it's a U shape.

So, there's no telling, we should really then look at the

higher terms in the expansion, and that's much more difficult.

So, that's why we just stop here, and we say

here is a criterion.

If the first derivative is 0, and the second derivative is

positive, it's a local minimum.

And if the second derivative is negative, it's a local maximum.

And, we just stop right there.

In other words, it does not exhaust all possible

cases, but it helps us in the

cases where the second derivative is non-0.
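
The one-variable criterion can be sketched in Python as follows. This is an illustrative helper, not something from the lecture: the name `second_derivative_test`, the tolerance `tol`, and the choice to pass the derivatives in as hand-written functions are assumptions of the sketch.

```python
def second_derivative_test(f_prime, f_double_prime, x0, tol=1e-12):
    """Apply the one-variable second derivative test at x0."""
    if abs(f_prime(x0)) > tol:
        return "not a critical point"
    curvature = f_double_prime(x0)
    if curvature > tol:
        return "local minimum"
    if curvature < -tol:
        return "local maximum"
    # Second derivative is 0: the test says nothing either way.
    return "inconclusive"


# Derivatives are written out by hand for each sample function.
print(second_derivative_test(lambda x: 2 * x, lambda x: 2.0, 0.0))           # f(x) = x^2
print(second_derivative_test(lambda x: -2 * x, lambda x: -2.0, 0.0))         # f(x) = -x^2
print(second_derivative_test(lambda x: 3 * x**2, lambda x: 6 * x, 0.0))      # f(x) = x^3
print(second_derivative_test(lambda x: 4 * x**3, lambda x: 12 * x**2, 0.0))  # f(x) = x^4
```

Both x cubed and x to the fourth come back as inconclusive, even though the first has no extremum at 0 and the second has a minimum there. That is exactly the limitation described above: when the second derivative vanishes, one would have to look at higher terms.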

And, there is a similar criterion also for

functions in two variables.

So, now we switch to functions in two variables.

[INAUDIBLE]

I wrote what?

On the top of what?

Oh, they are both local minima.

Wow.

It's kind of pessimistic.

Thank you, I have to correct.

You definitely should correct that.

I know it looks like we'll never reach maximum.

Okay, I think now it's good.

Right, because if it's negative it is shaped like

this, so it's maximum.

Okay.

So now we switch to functions in two variables.

So again, we have local things, local maxima

and minima and global.

And, searching for them is sort of, they are two

different games for this.

For local maxima or minima, the first, step one is to check

that the two partial derivatives are 0.

Just like for functions in one variable, the first step is to

look at the first derivative.

Well, now we have a function in two variables, so there are two

different partial derivatives.

So, both of them have to vanish.

In order for us to have a local maximum or minimum.

Well, I'm assuming now that both of them exist.

There is another possibility which is that, say one

of them may not exist.

So, in that case, that's also a possible case for local

maximum or minimum.

But, let's assume, in this discussion, that the partial

derivatives always exist.

So then we don't have to worry about this.

If they do exist, then a given point, x0, y0, will be a local

maximum or minimum, only if those partial

derivatives vanish.

So, when you kind of narrow down your search, you first

have to, you throw everything away, everything else away.

You just keep the points for which both partial

derivatives are 0.

But, this does not guarantee that it is maximum or

minimum, just like in the one variable case.

The best we can do, is to have a criterion involving

second partial derivatives.

So the criterion -- we would like to say something like if

the second derivative is positive, it's a minimum, if

it's negative it's a maximum.

But, there are now three different second

partial derivatives.

We have fxx, fxy, and fyy.

OK.

So, in fact the rule is as follows.

We have to calculate the following expression: So,

remember when we did cross products, we used determinants.

So, let's make a determinant of this 2 by 2 matrix, which

is very easy to memorize.

Think of this one as corresponding to the

first index: the rows will correspond

to the first index.

So, the first index here is x and here is y.

And, the columns will correspond to the second index, which

here is x and here is y.

So, you put four possible partial derivatives

in this matrix.

Then, we know by Clairaut's theorem that this is

the same as this.

But, let's not yet worry about this.

This is just an easy way to remember.

OK, and then we take the determinant of this.

So, that's fxx fyy minus fxy fyx.

But fyx is equal to fxy -- OK, now we remember it.

And we just put fxy squared.

So let's call this D.
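
The 2 by 2 matrix drawn on the board is not captured in the transcript. Following the description (rows indexed by the first subscript, columns by the second), it can be reconstructed as:

```latex
D = \begin{vmatrix} f_{xx} & f_{xy} \\ f_{yx} & f_{yy} \end{vmatrix}
  = f_{xx} f_{yy} - f_{xy} f_{yx}
  = f_{xx} f_{yy} - (f_{xy})^2
```

The last equality uses the fact that the two mixed partial derivatives are equal, and all second partial derivatives are evaluated at the point (x0, y0) where both first partial derivatives vanish.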

So the criterion is that if both partial derivatives are

0, and D is greater than 0, then it's a local maximum or minimum.

That's number one.

Number two, if both partial derivatives are 0 and D is

negative, then it's not a maximum and not a minimum.

And finally, if D is 0, it's inconclusive.

Don't know.

OK.

First point, think of this as an analog of this rule, because

in the case of one variable, there is also a rule which

involves a second derivative.

However, this rule is much more complicated, because there

are three different partial derivatives of second order.

And, we make some complicated combination of them.

Whereas, here we just took the second

derivative on the nose, and we just looked at whether

it's positive or negative.

But there is an analogy between the two,

clearly, because this involves second partial derivatives and

this involves the second derivative.

Now, but it looks very mysterious.

Why do I make this

-- why do I look at this combination, and not

at other combinations.

To understand this, think of

the case of the parabola, and the look of the parabola.

Because, I explained how this rule came

about, by looking at the parabolas, right.

The parabolas, because the parabolas approximate

your graph.

Just because of the Taylor expansion argument, you can

see that the parabolas

are going to approximate your graph near the point where

the first derivative vanished.

Right.

So, think about the parabolas.

And, in the case of two variables, first of all

the parabola now becomes a

paraboloid, but there are two types of paraboloid.

There is an elliptic paraboloid, and there is

a hyperbolic paraboloid.

OK.

And, just look at the examples of elliptic paraboloids, and

you will see that for elliptic paraboloids, the first

condition will be satisfied.

And, for a hyperbolic paraboloid, the second

condition will be satisfied.

So, let's say f of x,y is

ax squared plus by squared.

So, what are the derivatives in this case?

fxx is 2a, right?

fyy is 2b, right?

And fxy is 0.

So there is a simplification in this case.

Because there is no cross term.

OK?

So this matrix looks like this: it's 2a and 2b on the diagonal, and the determinant is 4ab.

That's the D in this case.

So, to say that D is positive means to say that

both a,b are positive or both of them are negative.

If both a,b are positive, it's going to look like this.

If both a,b are negative, it's going to look like this.

So, in this case it's a local minimum, and this

case a local maximum.

But, in both cases you see a,b both positive,

a,b both negative.

The combination a times b, or 4a times b.

In both cases it's positive.

So, that's why we get into the first condition, in the

situation of the first condition where D is positive.

So, in this case, we can say for sure that it's a maximum

or a minimum, but we cannot say which one.

So, we have to look at it more closely.

OK.

And, what if D is negative.

If D is negative, that means that a,b have different signs.

And in this case, so a good example of this would be z equals

x squared minus y squared.

And, that's a hyperbolic paraboloid.

And for a hyperbolic paraboloid, I drew this picture

before, it looks like a saddle, and on the saddle, there is a

point from which you can either increase the function if you go

along one of the parabolas, which opens up this way, or you

could also decrease the function by traveling on

a different parabola, a perpendicular one, which

opens downward, right.

So, this point clearly, this point on the saddle is not a

point of maximum or minimum.

So, that is the explanation of this criterion in the case of

quadratic functions, functions which are combinations of

x squared and y squared.

And, the point is that all other functions can be reduced

to this one by certain procedures, and that's

how you get this rule.
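
The rule for two variables, checked against the quadratic examples, can be sketched in Python like this. The function names `classify_critical_point` and `classify_quadratic` are made up for the illustration. One assumption goes beyond what is said above: when D is positive, the sketch uses the sign of fxx to decide between minimum and maximum, which is the standard refinement of the test, whereas the lecture only says to look more closely.

```python
def classify_critical_point(fxx, fxy, fyy):
    """Classify a critical point from its second partial derivatives.

    Assumes both first partial derivatives already vanish at the point.
    """
    d = fxx * fyy - fxy**2
    if d > 0:
        # D > 0 forces fxx != 0, so its sign separates the two cases.
        return "local minimum" if fxx > 0 else "local maximum"
    if d < 0:
        return "neither (saddle)"
    return "inconclusive"


def classify_quadratic(a, b):
    """f(x, y) = a*x^2 + b*y^2 at the origin: fxx = 2a, fxy = 0, fyy = 2b, so D = 4ab."""
    return classify_critical_point(2 * a, 0, 2 * b)


print(classify_quadratic(1, 3))    # a, b both positive
print(classify_quadratic(-1, -3))  # a, b both negative
print(classify_quadratic(1, -1))   # x^2 - y^2, the saddle
print(classify_quadratic(1, 0))    # D = 0
```

The first two calls have D positive and differ only in the sign of a and b, the third has D negative, and the last has D equal to 0, so the rule gives no answer.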

OK.

So, that's how we get this rule for local maxima and minima.

And, that takes care of that issue.

The last remaining topic is how to find the absolute

maximum and minimum on particular domains.

And, this I will illustrate very quickly by a

concrete example.

This is step 1 and this is step 2.

Let me give you an example of how to find maximum

and minimum, global maximum and minimum.

I have just enough time to explain this.

So, let's say you have a function, f of x y, which is

x squared plus y squared plus x squared y plus 4.

Find global or absolute maxima and minima on the domain x,y

where the absolute value of x is less than or equal to 1,

and the absolute value of y is less than or equal to 1.

So, the first step is to sketch the domain.

Sketch, it's very easy, right.

This is just the square whose sides lie on the lines x equals 1 and negative 1,

and y equals 1 and negative 1, parallel to the axes.

Step 2 is to find the boundary, identify the boundary.

This is the boundary.

And, now we are going to make a list of suspicious points, or

points which are candidates for being maxima or minima.

OK.

And, this list will include three kinds of points.

The first are points in the interior, where both

partial derivatives are 0.

What do I mean by interior?

Interior means everything except the boundary.

So, I have to calculate what is fx and what is fy. fx is

2x plus 2xy and fy is 2y plus x squared.

Right.

So, we have to set this equal to 0 and this equal to 0.

Since I'm running out of time, let me just go

to the next step.

You solve these equations, it's very easy.

Right.

This is the first group of points that you

get on your list.

The second group of points are points on the boundary, but

which belong to the smooth part of the boundary.

Two is smooth part of the boundary.

Yes?

[INAUDIBLE]

2x plus 2xy.

I'm sorry, that's right.

All right.

Smooth part of the boundary.

What I mean by this, well exclude -- I mean, maybe

it's not a good idea to say smooth part.

Maybe it's not like this.

Let's just say, let's just call it the components of the

boundary, components of the boundary.

So, what I mean by components, I mean these four intervals.

So, break your boundary into four, into pieces, which can

be represented by a nice equation, like here.

Here it's like x is equal to 1 and y is between

negative 1 and 1.

So, then you

restrict your function.

Restrict your function to this component.

It will effectively become a function in one variable.

Solve the problem for this one variable function.

One minute left.

OK.

So, let me give you an example.

Say, one of the components is y equals 1 and x is

between negative 1 and 1.

So, I substitute this y equals 1 into this formula, and I get

f of x, 1 is x squared plus 1, plus x squared plus 4.

So, that's 2x squared plus 5.

I got a function in one variable on this interval.

Find the absolute maximum of this

function on that interval.

OK.

And, then the same for each other component.

And, that's not all.

It would have been all, it would have been all, if you

didn't have the corners.

Because you have corners you have to include them, because

in principle it could happen that the maximum or minimum is

attained at the corners, so look at, include the corners.

So, now you compile the list, and you evaluate the function,

and you choose the one where the value is maximum.

So, that's how you do it.
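
The lecture ran out of time before finishing this example, so here is a sketch in Python that carries the three steps through for f of x, y equal to x squared plus y squared plus x squared y plus 4 on the square. The candidate points below were worked out by hand for this sketch, using fx = 2x + 2xy and fy = 2y + x squared; they are not stated in the lecture, and the variable names are illustrative.

```python
def f(x, y):
    return x**2 + y**2 + x**2 * y + 4


candidates = []

# 1. Interior critical points: fx = 2x(1 + y) = 0 and fy = 2y + x^2 = 0.
#    x = 0 forces y = 0; y = -1 forces x = +/-sqrt(2), which is outside the square.
candidates.append((0.0, 0.0))

# 2. Edges, each reduced to a one-variable problem.
#    y = 1:   f = 2x^2 + 5, derivative 4x vanishes at x = 0.
candidates.append((0.0, 1.0))
#    y = -1:  f = 5 for every x (constant), so one representative point is enough.
candidates.append((0.0, -1.0))
#    x = +/-1: f = y^2 + y + 5, derivative 2y + 1 vanishes at y = -1/2.
candidates += [(1.0, -0.5), (-1.0, -0.5)]

# 3. Corners of the square.
candidates += [(sx, sy) for sx in (-1.0, 1.0) for sy in (-1.0, 1.0)]

# Evaluate f on the finite list and compare.
values = {p: f(*p) for p in candidates}
max_value = max(values.values())
min_value = min(values.values())
print("max", max_value, [p for p, v in values.items() if v == max_value])
print("min", min_value, [p for p, v in values.items() if v == min_value])
```

With these candidates the largest value is attained at two of the corners and the smallest at the interior critical point, which shows why the corners have to be on the list.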

All right, have a good weekend.

the geometric aspects of direction of derivatives,

and graphs, and tangent planes and tangent lines.

So now, I want to start with sort of a slightly different

end, which is just one of the applications of this technique.

In other words, what is the real-life situation where we

might be interested in finding direction of derivatives.

And, I explained last time, the direction of derivates is just

a fancy term for the rate of change.

So, here's a typical example which you may have already seen

in the book or in your section work, which is imagine a

mountain, imagine a mountain, OK, and under is an ocean

somewhere, next to the beach.

But, we'll talk about this later.

Focus on the mountain for now.

This is a mountain.

Now, if I draw it like this, it's not clear whether it is

a mountain or just a curve, right.

So, in order to give it an illusion of 3-D, of

three-dimensional picture, what I usually do as a reflect, and

what you would normally do or when you read the book, what

you see there is that you draw some curves on it.

You kind of trying to give it a three-dimensional feel, right.

So, my first point is what are these curves.

These are the level curves.

So, that's the first thing I want to say.

Which of course, I've said before, but I just want to

emphasize it one more time.

Even to visualize this three-dimensional object on the

plane, we find it very convenient, very useful to

imagine not just the contour, the general contour of this

object, but this collection of curves, which are kind of

parallel to each other.

What are they.

Were they just obtained by taking the section of the

mountain, by parallel planes, or by parallel horizontal

planes, which parallel to the floor -- to the ground, OK

-- that's what they are.

Well, in fact, I have drawn just the visible parts

of those curves, right.

There is also a back side for each curve.

For example, this one also has a backside.

We don't see it unless it's a transparent object.

We don't see it, so that's why usually we indicate like this.

So, each of them is actually, has sort of a second

half, which is behind, which we don't see.

This, we also don't see when we look at the actual mountain,

but when we try to draw it, draw the picture, then we,

you know, we draw this.

Okay, so that's the first point.

So, the second point is what does this have to do with

direction of derivatives.

Direction of derivatives is the rate of change.

So, change of what?

So, in this particular setting, there is a very

good example of this.

Which is that suppose that there is somebody here

on this mountain, who is climbing it, OK.

So, there is climber.

And, so the climber wants to decide which way they should

go, and depending on which way they go, what will be the rate

of, you know, how steep will the climb be.

That's the question.

So, the rate of change will be, here, the rate of its

altitude change -- his or her altitude change, right.

And, the higher the rate, the steeper is the climb, in case

we're going towards the top of the mountain, or the

steeper the descent, if we're going down.

And, you know, likewise the smaller the rate of change, the

smaller is that, the steepest.

So, how do we measure this.

The point is there isn't a single number, there isn't a

single number to give us the steepness of the climb.

Because, the climber could go in many different directions,

and for each direction, there is a particular steepness rate.

For instance, the climber may be tired, and doesn't want to

climb anymore, so in that case the climber could just go

along the level curve.

So, then the altitude of the climber, the height of this

point over the sea level, will not change at all.

So, then the rate of change, or the steepness level, is 0,

steepness rate is 0, right.

So that's one possibility, that would correspond to

going in this direction.

But, on the other hand, the climber good go, could choose

the most steepest path, which would actually be, if you think

about it just intuitively, you could guess that it should be

something which is perpendicular to

the level curve.

In fact, this is something that we have confirmed

by calculation.

I will go over it one more time in a couple minutes.

So, in that case, the rate of change is the highest possible.

So, in this direction, the rate of change is 0.

In this direction, rate of change is maximal.

And, if we go in the direction which is perpendicular to the

level curve, if we go down, then it's going to be minimal,

the smallest possible.

Well, its absolute value is still going to be the highest

possible, but it's going to have a negative sign.

So, as a number, it will be the smallest possible value.

So here, the rate is minimal.

OK.

And, so the way we -- in order to talk about the rate of

change, you have to choose a direction.

That's why it's the direction of derivative.

It's not, there isn't a single derivative for a function of

two variables , but there is a whole variety of derivatives.

A derivative is determined by the choice of

the direction, OK.

So, what are the variables here, which variables

am I talking about.

Well, the variables are somewhere here, so there is --

let's actually do it, let's do it on this board, because I

don't want to mess up the picture.

So, I want to draw the coordinate system.

And, let's say this plato, this plane -- x, y plane

-- is the sea level.

And the z then will correspond to the height, to the altitude

above the sea level.

Now, our point as a projection onto the plane, and so

on the plane it has coordinates, x and y.

Maybe x0 and y0 to emphasize that these are some

fixed numbers.

So, these are the coordinates of -- this corresponds to the

position of the climber.

See, the point is that the climber is in space, right.

So, a priori, the climber has three coordinates

-- x, y and z.

But, because the climber is not flying, you know, is not

jumping with a parachute, he is on the mountain.

Because he is on the mountain, as soon as we know the x, y

coordinates, we know the z coordinate, unless the mountain

has a shape which sort of comes back, right.

But normally, for a normal mountain, for a normal

mountain it's not going to happen, right.

So , then the z coordinate is actually determined by

the x and y coordinates.

That's why the only parameters here are x

and y, not x, y and z.

In fact, you can think of this surface as a

graph of a function.

The graph represents the mountain, OK.

And, so now when we talk about the direction, we can think

about the direction as being the direction on the mountain,

but we could also think about the direction as being the

direction on the x, y plane.

And, so in other words, what we can do is we can drop

the level curve down here.

It's not going to look exactlly the same, because I have to

magnify the picture compared to the picture here.

This is not -- this is bigger.

I have used a different scale for the bottom picture as

opposed the top picture, so the level curve will

look like an ellipse.

But, I have magnified it, I have scaled it to make it

bigger, so that it easier to draw.

So, then the directions which we have talked about

here are the falling.

This one is both parallel to the level curve.

That's the direction along the x, y plane or in the x, y

plane, which would correspond to the movement on the mountain

parallel to the level curve.

This is the direction on the x, y plane, which will correspond

to the path of steepest descent, for which the rate is

maximal And, this will be the direction for which the

rate will be minimal.

So, this is what we call the rate of steepest descent.

Now, I would like to draw this vector in such a way that

they are of the same length.

Because, they are supposed to be unit vectors.

This is a convention.

We agree from the beginning that we will measure directions

by unit vectors, OK.

And, the point is that these are perpendicular.

So, what are these vectors.

This again is a vector parallel to the level curve, and this

vector is a vector perpendicular to

the level curve.

And, this is what we discussed last time.

I mean both of these vectors are perpendicular

to the level curve.

And, the one which corresponds to the steepest descent, is the

one which is the gradient vector.

So, this is actually the gradient vector.

So this is an outlier.

Because, to get the steepest ascent, you have to go inside

this level curve, right?

Because you go towards the center of the mountain.

And, this one will be negative.

OK, and the point is I explained last time why this

gradient vector is actually perpendicular to the tangent

vector, or in other words, perpendicular to the tangent

line to the level curve.

This is some calculation which involved the knowledge of

equations for lines on the plane, OK.

So, that's the picture, but in principle there are

many other directions.

We can also draw a direction like this say.

Again, some unit vector, u, which would be a,b, which would

have two coordinates a and b.

And, if a climber goes in this direction, then her path would

go along that, and it will correspond to some

path on the mountain.

Like this.

Which, is neither the steepest descent nor is

it parallel to the slope.

In this case, it goes down, because the vector

point is outward.

So, this direction or more precisely this line, which

contains this vector, will correspond on the mountain

to some specific path which starts at this point, and

then go somewhere, OK.

And, what we are calculating is just the slope of that

curve, of that path.

So, the direction of derivative -- direction of derivative

D sub u, f x0 y0.

With respect to this vector u, is the slope of the path on the

mountain, or on the graph, corresponding to the

line containing u.

You see, this is the line I'm talking about,

this yellow line.

I didn't draw it very well.

Maybe it's more like goes like this.

So, if I look at this line, this line will

you give me that path.

What do I mean by give me.

If I lift that path, there's a path on the x,y plane, but it

has a unique lift to the graph.

This is the yellow path on the graph.

In other words, this line or this half line, is the

projection of that path.

The unique path on the mountain, which starts at that

point and whose projection onto the x, y plane is the

half line directed by u.

You see what I mean?

Is there any questions about this?

OK.

So, the point is that the graph is two-dimensional.

It's a surface.

But, once you choose a direction, you cut a path or

a curve on that surface.

So, you are back to one-dimension, to the

one-dimensional case.

And, in the one-dimensional case, you can actually

talk about the slope.

Because, you get the graph of a function of one variable,

namely the variable along this line, and you can talk

about its slope.

That slope is the rate of change along that path.

That's what we call the direction of derivatives.

And, finally we will have a formula for it, which involves

the gradient vector, and this formula tells us when this

directional derivative takes some particular values.

For example, the maximum value corresponds to u pointing

along nabla f, the minimal value to the opposite direction, and the 0 value

corresponds to the u which is tangent

to the level curve, OK.

But, this we already knew from, by analyzing this picture just

on the grounds of common sense.

We didn't need to do any calculation to figure this out.

In fact, when you are climbing the mountain, you're not

pulling out a paper pad and a pen and starting to calculate

what is the best way to reach the top of the mountain.

You kind of follow your intuition.

And, what your intuition will always tell you is that if you

want to go, reach the top, in the fastest possible way, you

have to go perpendicular to the level curve.

In the direction perpendicular to the level curve, and

likewise, if you want to go down the fastest way, you also

go perpendicular to the level curve.

Is that clear?

OK.

So, intuitively it's clear.

But, now we have proved that because we found the formula

for the rate of change.

And, from this formula, which is written in terms of the dot

product, it's plainly obvious when it takes the maximum

value, the minimum value or the 0 value.

And, that was one of the main conclusions last time, but now

I have illustrated it in this way.
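As a hedged illustration of the dot-product formula, here is a minimal Python sketch. The function `height`, the point (1, 1), and all helper names are assumptions made up for this example, not taken from the lecture, and the gradient is only estimated by central differences. It checks that the rate of change is largest along the gradient, most negative in the opposite direction, and roughly 0 along the tangent to the level curve.

```python
import math

def gradient(f, x, y, h=1e-6):
    """Central-difference estimate of (f_x, f_y) at the point (x, y)."""
    fx = (f(x + h, y) - f(x - h, y)) / (2 * h)
    fy = (f(x, y + h) - f(x, y - h)) / (2 * h)
    return fx, fy

def directional_derivative(f, x, y, angle):
    """D_u f at (x, y) for the unit vector u = (cos(angle), sin(angle))."""
    fx, fy = gradient(f, x, y)
    return fx * math.cos(angle) + fy * math.sin(angle)

# Hypothetical "mountain": the height drops as we move away from the origin.
def height(x, y):
    return 10.0 - x**2 - 2.0 * y**2

x0, y0 = 1.0, 1.0
fx, fy = gradient(height, x0, y0)
grad_angle = math.atan2(fy, fx)  # direction of the gradient vector

steepest_up = directional_derivative(height, x0, y0, grad_angle)
steepest_down = directional_derivative(height, x0, y0, grad_angle + math.pi)
along_level = directional_derivative(height, x0, y0, grad_angle + math.pi / 2)

print(steepest_up, steepest_down, along_level)
```

For this particular function the gradient at (1, 1) is (-2, -4), so the largest rate of change is its length, the square root of 20.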

OK.

So now, one more -- that's odd.

Here is the problem -- that's called Catch-22.

Alright, so I have -- this is a small inconvenience.

And, that's very clever.

I will not try to get it out of there, because I don't want

to pull the second one.

So, yes, what is that symbol?

It is called nabla.

It's a symbol, which is written as an upside-down capital delta.

We're using it for the gradient.

I'm sorry?

That's right, this is a notation for the gradient.

OK.

So, and one other thing which I wanted to mention in this

regard, is we talked about equations of-- we have talked

about equations of tangent lines and tangent planes.

And, I know this could be confusing, because you have

many different objects at the same time for which you could

look at tangent lines and tangent planes.

Seems like there are many different discussions going on.

Let's focus, okay.

So, there are tangent lines and tangent planes.

And, so I just want to summarize the stuff once more,

so that there's no ambiguity or there is no confusion.

So, the first is the two variable case.

In the two variable case, we look at the level curve

of a function f of x, y.

So, it is given by this equation, f of x, y equals k.

So, that's

this level curve, that's this curve of equal height, or

equal altitude, which I drew over there.

OK?

And so then, we could look at the tangent line, tangent line

to this curve, and the equation of this tangent line at the

point x0, y0 is like this.

It's

f sub x at x0, y0, times x minus x0, plus f sub y at x0, y0,

times y minus y0, equals 0.
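Written out in symbols, the tangent line equation just stated reads as follows. The circle example underneath is an illustration chosen here, not one from the lecture.

```latex
% Tangent line to the level curve f(x, y) = k at the point (x_0, y_0)
f_x(x_0, y_0)\,(x - x_0) + f_y(x_0, y_0)\,(y - y_0) = 0

% Illustration (assumed example): f(x, y) = x^2 + y^2, k = 25, point (3, 4)
% f_x = 2x = 6, \quad f_y = 2y = 8
6\,(x - 3) + 8\,(y - 4) = 0
```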

OK.

But, in fact, we can now look at the case of three

variables as well, right.

Let me actually, let me do it here.

Three variables, the three variable case.

In the three variable case, we would want to take, instead of

a function in two variables, f of x y, we would want to take a

function, maybe F capital, a function of three

variables, x, y, z.

And, by analogy, we would have to look at the level, but

not curve, but now level surface of this function.

And, so that would be F of x y z equals k.

Maybe I should say that k is some number.

So, that for example, in this discussion k would be the

height, so I don't know 1,000 feet.

Whereas x and y and z are variables.

So, that's the difference.

It's an equation.

It's one equation for three variables, because in this

equation this is some number, like 1,000.

So, then you can ask what is the equation of the tangent,

but now not line, but plane to the surface at the

point x0, y0 and z0.

And you see, the point is that the answer is given by

something which looks exactly the same, except now we

have three variables.

So, we have to add one more term, which involves

the third variable, z.

So, the answer is the following. You

have to take the partial derivative with respect to x,

multiplied by x minus x0, plus the partial derivative with

respect to y, multiplied by y minus y0, plus the partial derivative with respect

to z now, also, multiplied by z minus z0, and set the sum equal to 0.
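In symbols, the tangent plane to a level surface has the same shape as the tangent line equation, with one more term. The sphere underneath is an illustration chosen here, not an example from the lecture.

```latex
% Tangent plane to the level surface F(x, y, z) = k at (x_0, y_0, z_0);
% all partial derivatives are evaluated at (x_0, y_0, z_0)
F_x\,(x - x_0) + F_y\,(y - y_0) + F_z\,(z - z_0) = 0

% Illustration (assumed example): F = x^2 + y^2 + z^2, k = 9, point (1, 2, 2)
% F_x = 2, \quad F_y = 4, \quad F_z = 4
2\,(x - 1) + 4\,(y - 2) + 4\,(z - 2) = 0
```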

So, the difference between the two variable case and the three

variable case is that we now have an extra variable.

So, everything gets dimension one higher, the dimension

gets bumped by one.

We had a curve now we have a surface, we had a line

now we have a plane.

The equation here involved two partial derivatives, and

had this very simple form.

And, now the equation involves all three partial derivatives,

but has the same form.

So, I will not derive this formula.

It is derived in the same way as in this case, in

the case of two variables.

But, I hope it looks convincing to you, because you can

clearly see the analogy.

And, in fact, if you want to prove it, you can prove it

in exactly the same way.

Now, what is slightly confusing in this is that there is a

special case of this three variable, of this three

variable picture.

And the special case is, when F of x, y, z is

f of x,y, minus z.

So, you can ask, why would we even bother to look

at this special case.

And, the reason is very simple.

Because, in this special case, if I look at the equation

F of x, y, z equals 0.

Which is a special case of a level surface.

Namely, the case when k is 0, right.

This is just equation z equals f of x,y.

And, this equation defines a graph, the graph of a function

of two variables.

So, it's kind of funny that a function of two variables

shows up in two different contexts.

It shows up here in the context of level curves, OK, but it

also can show up here, for functions of three variables.

Even though it is a function of two variables.

But even when we have a function of two variables,

and we think about graphs, we automatically go to the

three-dimensional situation, right.

And, so the graph of a function in two variables can be thought

of as a level surface for a function in three variables.

Which function.

Well, this function, f of x,y minus z.

It's kind of like the simplest concoction you can make out

of f and the new variable z.

So, we can apply this general formula for the tangent plane

to this special case, and what do we get?

Let's observe that F sub x is just f small sub x.

Because when you take partial derivative of this function big

F, you have to differentiate this one, that would

be just f sub x.

And you differentiate this one, but this one is

independent of x.

So this doesn't change anything.

So partial derivative of this function, of this whole

function with respect to x is just the partial derivative of

this part, so that's small f sub x.

Partial derivative with respect to y is f sub y.

Partial derivative with respect to z is what?

It's negative 1.

It's negative 1, because that's the derivative of this function

negative z with respect to z.

This guy doesn't depend on z, so its partial derivative

with respect to z is 0.

But, the partial derivative of this term is negative 1.

So, we get this, three partial derivatives, which we

substitute into this formula, and what do we get?

We get f sub x of x0, y0 times x minus x0, plus f sub y of x0, y0,

times y minus y0, minus z minus z0 equals 0.

And now, we recognize the equation of the tangent plane

to the graph, which we have known already.

This is the one which we got already two weeks ago, when we

talked about differentials and linear approximation.

The only difference is that now I put negative z minus

z0 on the left-hand side.

But in our old discussion, we would write equals z minus z0.

And, then we would switch the left- and right-hand

sides too, but that's a minor issue, right.

So, this is just a slightly different form of writing the

same equation, just putting everything on one side.
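The substitution described here can be written out step by step. This is only a restatement of the spoken derivation in symbols.

```latex
% Special case: F(x, y, z) = f(x, y) - z, level k = 0, so the surface is z = f(x, y)
F_x = f_x, \qquad F_y = f_y, \qquad F_z = -1

% Substituting into the general tangent plane equation at (x_0, y_0, z_0), z_0 = f(x_0, y_0):
f_x(x_0, y_0)\,(x - x_0) + f_y(x_0, y_0)\,(y - y_0) - (z - z_0) = 0

% Moving the last term to the other side gives the familiar form:
z - z_0 = f_x(x_0, y_0)\,(x - x_0) + f_y(x_0, y_0)\,(y - y_0)
```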

And, now you see that the case of graphs of -- the case of

tangent planes of graphs of functions of two variables can be

thought of in two different ways.

OK.

Namely, you can think that you started with a function of two

variables, and you just look at the graph and you look

at the tangent plane.

But, you can also think of it as a special case of the more

general case, functions of three variables, except you

take as a function of three variables this very special

form: f of x, y minus z.

Either way you approach it, you get the same answer.

But now, you can appreciate more the connection between

this answer and this.

Many people asked me after last lecture why, when we

look at the equation of the tangent line, for a function of

two variables, it's as though we are dropping this

term, z minus z0.

So, there's this negative 1, which we just dropped.

Well, geometrically it's clear.

In fact, the board stayed since Tuesday, so I guess nobody

likes this small board, except for me, which is good.

So, this is its tangent line, and this tangent line

corresponds to, along this tangent line to the level

curve, we have the same value of z, z equals

z0.

So, that's why we drop this term, to go from

this equation to this.

So, you can get this equation from this, by dropping z minus

z0, because z is equal to z0 along the level curve.

But, also you can now understand that this formula is

a special case of the formula for the tangent plane to a

general level surface, which actually looks like this one.

Except, we have a third variable.

In a special case, when the function is like this, this third

term becomes extremely simple.

It just gets a coefficient of negative 1.

So, you get minus z minus z0.

OK.

And, to make the analogy complete, let's actually look,

let's fill in this square.

You know, it's like when you do IQ tests.

I've never done it.

But it's easy to find them online.

And, I think the problem is often like fill in the square,

so this is exactly the kind of question here.

What should be here.

Right.

In other words, this is a case of three variables, and this

is a special case of that.

Now, what's the analog of the special case for functions

of two variables?

That's the case when this function of two

variables, special case.

When this f of x, y is some function of one variable,

let's call it g of x, minus y.

And, so you see in this case, in fact, you know what I'm

going to do to make it look more like an analogy,

let's actually, let's re-position the board.

There we go.

So, now I think it's more clear what I mean by

filling in the square.

I want to find a special case of this, which is analogous to

how we found the special case of three variables.

And, that's the case.

Now, our function of two variables is equal to

another function, of one variable, minus y.

OK.

In this case, the equation, f of x y equals 0,

means y equals g of x.

Right?

And, this is a graph.

Graph of the function g of x.

So, a level curve for the function of two variables

can become the graph of a function of one variable.

When this function of two variables has

this special form.

Did you have a question?

[INAUDIBLE]

Good question.

So, what would happen if we put some k?

Right.

If we put some k, it will be here, I would put minus k.

Right.

So, then I could just absorb k into the definition

of the function g of x.

If I redefine my function g of x by subtracting k, then I

would get back the level 0.

So, that's why we don't lose any generality by looking at

the case of level 0, as opposed

to the general case.

So, that's a graph.

So now, this formula for the tangent line, note that f sub

x of x,y now is g prime of x.

Just like here, the partial derivatives of the big function

F with respect to x and y were just the derivatives

of the small f.

And now, the role of the small f is played by g, so the

partial derivative, like this, is just g prime.

And, the partial derivative with respect to the second

variable is negative 1 again.

Because this, minus y, plays the same role as

minus z played here.

So, when we take the derivative, we get negative 1.

So now, this formula for the tangent line becomes g prime

of x0 times x minus x0, minus y minus y0 equals 0.

Let me rewrite this.

This is equivalent to saying y is equal to f prime of x0

times x minus x0 plus y0.

We recover the old formula for the equation of the tangent

line to the graph of a function of one variable.

That formula is exactly this one.

The slope is f prime, you multiply by x minus x0 and you add

the value of the function at the point x0, which is y0.

Right.
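In symbols, writing g throughout for the function of one variable, the special case works out as follows. This is a restatement of the board computation, not new material.

```latex
% Special case: f(x, y) = g(x) - y, level 0, so the level curve is the graph y = g(x)
f_x = g'(x), \qquad f_y = -1

% Tangent line to the level curve at (x_0, y_0), with y_0 = g(x_0):
g'(x_0)\,(x - x_0) - (y - y_0) = 0

% Solving for y recovers the one-variable tangent line formula:
y = g'(x_0)\,(x - x_0) + y_0
```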

So, there is nothing mysterious in this formula.

In this special case, we get back the old formula

we've known all along.

And, also this now sheds some new light on this coefficient

negative 1, which many of you have found mysterious.

It's not mysterious, it's as mysterious as this coefficient

negative 1, which shows up in the old formula for the

tangent line to the graph.

We were not surprised to write the formula for the tangent line to the graph of

the function as y equals f prime times, you know, in this form.

But, if you have it in this form, you can rewrite

it like this.

When you rewrite it like this, you find the

coefficient negative 1.

The reason it appears is exactly the same reason

why this negative 1 appears.

Any questions about this?

Yes.

[INAUDIBLE]

Why do we choose a special case?

That's a very good question.

Why do we even choose a special case.

Well from the point of view, let's talk about this case.

From the point of view of functions of two variables,

this sounds strange.

Why would you write it like this, and not f of x minus

x,y or something, right?

So from one point of view, functions of two variables,

it doesn't make any sense.

It makes a lot of sense, however, from the point of view

of the theory of functions in one variable.

When we studied functions in one variable, we would like

to visualize them by graphs, right?

When we draw a graph of a function of one variable, we

introduce one more variable, and we look at the graph,

which is y equals f of x.

What I'm saying now is that within this formalism, this

formalism that we are developing, we can think of the graph of g

of x, which normally we would write as y equals g of

x, just in this form.

And when we write it in this form, we never say the

word level curve or anything like this.

We just say graph.

But, we have to realize, it is important to realize, to see

the connection between different formulas.

It's important to realize that this graph actually can be

thought of as a level curve for a function of two variables.

And, that function just happens to be this function, even

though it looks kind of -- there's no reason a priori

to study such functions.

We have introduced them, because we started from the

point of view of functions of one variable, and then this

naturally fell out, once we started to look at the graphs.

So that, likewise in this case.

Yes?

[INAUDIBLE]

That's right.

Could be a point, or finitely many points, because you could

be -- let's look at the function.

And, for a good reason, right: the dimension of the level

curve, or level surface, and so on,

is going to be the number of

variables involved minus 1.

If you have two variables, it's a level curve,

so the dimension is 1.

And three variables it's a level surface, dimension 2.

If it's a function of one variable, it will

be of dimension 0.

Zero-dimensional objects are just collections of points.

And, the way it works is just like this.

Let's look at, for example, you have a parabola, level curve

consists of 2 points, but, you know, if you have a cubic

parabola, like this, there would be 3 points.

And, if you have a cosine, you would have

infinitely many points.

Infinitely many points if the level is between

1 and negative 1.

And, if the level is higher than 1 or lower than negative

1, then it will be empty.

Level curve could be empty, or level surface.

In this case, it's sort of level point.

We don't have a good word for a collection of points.

So, it's like a level -- zero-dimensional object.

Manifold, as a mathematician will call it.
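To make the "level set of a function of one variable is a collection of points" remark concrete, here is a small Python sketch. The helper `level_points`, the grid size, and the sample intervals are all assumptions for illustration. It looks for crossings of the level on a finite window only, so for the cosine it finds finitely many of the infinitely many points.

```python
import math

def level_points(g, k, a, b, n=10_000):
    """Approximate the points x in [a, b] with g(x) = k.

    Scans a uniform grid for sign changes of g(x) - k and refines each
    one by bisection. Points where the graph only touches the level
    without crossing it are not detected by this simple scan.
    """
    points = []
    step = (b - a) / n
    for i in range(n):
        lo, hi = a + i * step, a + (i + 1) * step
        f_lo, f_hi = g(lo) - k, g(hi) - k
        if f_lo == 0.0:
            points.append(lo)
        elif f_lo * f_hi < 0:
            for _ in range(60):  # bisection on the bracketing interval
                mid = (lo + hi) / 2
                if (g(lo) - k) * (g(mid) - k) <= 0:
                    hi = mid
                else:
                    lo = mid
            points.append((lo + hi) / 2)
    return points

parabola = level_points(lambda x: x * x, 1.0, -3.0, 3.0)          # two points
cubic = level_points(lambda x: x**3 - x, 0.1, -3.0, 3.0)          # three points
cosine_inside = level_points(math.cos, 0.5, 0.0, 4 * math.pi)     # level between -1 and 1
cosine_above = level_points(math.cos, 2.0, 0.0, 4 * math.pi)      # level above 1: empty

print(len(parabola), len(cubic), len(cosine_inside), len(cosine_above))
```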

Any other questions about this.

Yes.

[INAUDIBLE].

Oh yes, I'm sorry.

Thank you.

That was just a, that was a mistake.

Thank you.

Yeah, it's g prime of course, I'm just saying this formula

becomes this formula.

I called it g right, sorry.

Yeah, I completely messed it up.

Good job.

So, that will do it for us in this topic, and actually we are

running out of time, so we need to talk about something

else today, also.

I did want to go over this slowly to emphasize the

connection between these different objects, because I

think that there are different dimensions and different numbers

of variables at play, and it could be very confusing.

But, I think that if you put this in this sort of -- in this

picture, where you have these four squares, the two variable

case, the three variable case, the special case of two variables,

and the special case of three variables, then I think it

becomes much more clear.

Alright, but the next topic we'll discuss concerns finding

maxima and minima of functions.

And, as is always the case, it's actually instructive to

look at this question already in the one-dimensional,

one variable case.

Because, we can already obtain some insights into the problem

by looking at this very special, simplest

possible case.

If you have a function in one variable, it's a natural

question to ask where this function attains maximum

and minimum values.

That's important, because this function could correspond to

something in real life, and you may want to maximize

that or minimize that.

And, so the first point I want to emphasize is that there are

two different types of maxima and minima, the local

and the global.

The global ones are called absolute.

I like to think local, global.

I like this terminology because it is

a little bit better.

So, what do I mean by local.

So, let me draw this.

For a function of one variable, it is very convenient to

analyze everything by using graphs of functions.

And, graphs, again, are curves on the plane.

Right.

So, we introduce the new variable y, and we write

a graph as given by the equation y equals f of x.

So, let's look at this kind of function.

That's a very typical example.

So, I want to focus on this point.

So clearly, this point -- the value of the function at this

point, this will be the point x0, and that's the

value of the function.

This is f of x here.

The value of this function, at this point, is greater than

the value at nearby points.

So, that's an example of a local maximum.

A point is a local maximum if there is a small neighborhood

of this point such that if you restrict your function to this

neighborhood, which is this little interval in this case,

then this function will -- this will be the maximum

value on the interval.

Okay.

But is it the global maximum.

Clearly not, because I have a point here, for example,

x1, for which the value

is higher.

So, that's not a global maximum.

That's not a global maximum either.

In fact, in this example, there is no global maximum,

because I'm assuming the function keeps growing,

it keeps increasing as x is increasing, OK.

If that's the case, there is no global maximum.

So, global maximum is a completely different, finding

global maxima is a completely different game than

finding a local maximum.

Finding local maxima just involves analyzing the function

on a very small interval around this point.

Finding global ones sort of involves analyzing all

points in your domain.

The way I phrased the question -- I have phrased the question

so far, is as though we were studying global maxima on the

entire line, on the entire x line.

OK.

And, you see clearly that that question often

doesn't have an answer.

In other words, there is no global maximum, simply because

for any point, there will be another point at which you will

have a higher value, higher value, higher value,

and so on, OK.

The question of finding global maxima is better to phrase on

domains which are bounded.

Not on the entire line, but on bounded domains.

Bounded means that it is finite.

So, it's better to say, what is the maximum of this

function on this interval.

OK.

This is an example of a closed bounded domain

in the following sense.

First of all, it's bounded because it's finite.

It doesn't go to infinity.

Second, it is closed because it contains the endpoints.

And, these are the kind of domains we should look at if we

want to ask questions about global maxima or absolute

maxima or minima.

So, let's look at this question in this particular case.

In this particular case, we see that the maximum value is

actually taken at this point.

This is a maximum.

So, now you can appreciate why you have to

include the endpoint.

If we did not include the endpoint, there wouldn't be a

maximum, because no matter how close you are to this point,

there would be another point even closer for which the

value would be even higher.

So, therefore there would be no

maximum.

So, in order to guarantee that you have a positive answer

to the question of existence of a maximum, or minimum for that

matter, you should really look at closed and bounded

intervals, and then what happens is that the maximum can

be attained either at the boundary, which is the case

here, or it could be some local maximum which lies in the

interior of this interval.

In this particular case, you do have a candidate, you do have

a candidate for a maximum.

This one, because it is a local maximum, and it is within this

interval, but it's not a global maximum, on this interval,

because the value of the function at the endpoint is just bigger.

But, if I were to take a different interval, if I were

to take an interval like this, for example, then

this guy would win.

Because, at the boundary, the value would be smaller.

You see.

At the boundary, the value would be smaller.

So, this guy would have the highest possible value

on this interval.

So, the bottom line, the upshot of all of this, is that the

absolute maximum can be found in a finite set of points.

Those points are -- first of all the endpoints, and all

the points where you have potentially a local

maximum or minimum.

So, the global maximum of some function f on a closed,

bounded interval, let's call it [a,b],

can be found at one of

the following points.

The endpoints, which are a, b, and the points of

potential local maximum.

And here, it is important to emphasize the word potential.

And, those are the points for which f prime of x is 0.

Because certainly, if it's a point of local maximum, then

the slope of the tangent line, at this point, is

going to be 0, right.

Because, if you have a non-0 slope, you just move away from

this point and you'll get a bigger or smaller value.

And, one side bigger and the other side smaller.

So, the only way you could have a, or possibly have a local

maximum, is to have slope 0.

A slope is a derivative, so that's why these are the points

for which the derivative is equal to 0.

So, this is the first statement that I want you to remember, or

recall, perhaps from the one variable calculus, which is

that, if you're looking for global maxima, what you need to

do is simply measure or evaluate the function

at the endpoints.

Evaluate at this point and evaluate at this point.

Next, find all the points where the derivative is 0, and

evaluate the function.

So, you get a finite list, and then just pick the

one, or the ones, where the value is maximum.

These are the values, these are the maxima of this

function on the interval.

In other words, you don't have to look through all

the points on the interval.

There are infinitely many.

But you only look at the endpoints and the points

where f prime is equal to 0.

That's the algorithm for finding maxima of a function.

Likewise for minimum, just replace the word

maximum by minimum.

So, it's exactly the same.
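The procedure just described can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `global_max_on_interval` is made up here, the zeros of f prime are supplied by hand rather than solved for, and the sample function x cubed minus 3x is not from the lecture.

```python
def global_max_on_interval(f, a, b, critical_points):
    """Closed-interval method: compare f at the endpoints a, b and at
    the supplied zeros of f' that lie strictly inside (a, b)."""
    candidates = [a, b] + [c for c in critical_points if a < c < b]
    best = max(candidates, key=f)
    return best, f(best)

# Hypothetical example: f(x) = x^3 - 3x, so f'(x) = 3x^2 - 3
# vanishes at x = -1 (a local maximum) and x = 1 (a local minimum).
def f(x):
    return x**3 - 3 * x

crit = [-1.0, 1.0]

# On [-2, 3] the right endpoint wins: f(3) = 18 beats f(-1) = 2.
wide = global_max_on_interval(f, -2.0, 3.0, crit)

# On [-2, 1.5] the interior local maximum wins: f(-1) = 2.
narrow = global_max_on_interval(f, -2.0, 1.5, crit)

print(wide, narrow)
```

Replacing `max` with `min` gives the corresponding search for the global minimum. The two intervals mirror the two situations in the lecture: on one the boundary point wins, on the other the local maximum does.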

Now, before I go and generalize it to the case of two

variables, I want to explain what I mean by the word

potentially, potential local maximum.

In other words, if the point x is a local maximum or minimum,

then the derivative is 0.

I already explained this, because the slope has to be 0.

If it's a maximum or minimum, the slope has to be 0.

If the slope is non-0, it means you can increase or decrease

the value by moving a little bit away from the point.

So, this is true, but this is not true.

In other words, if the derivative of your function at

your point is 0, it doesn't mean it's a local

maximum or minimum.

And, the reason is the following: there is a very

simple counter-example to this, namely the function x cubed.

So, f of x is x cubed, f prime is 3x squared,

so f prime of 0 is 0.

Which we see, right?

We do see that the slope at 0 is 0.

Right?

But, is it the point of a local maximum or minimum?

It's not, because if you go this way it increases, and if

go that way it decreases.

And, in fact, if you think in terms of monomials, the

same thing will happen if you have x to an odd power.

x to the n, where n is odd, like 3, 5, 7 and so on.

Because, the derivative of x to the n is n times x to the n minus 1.

So it's still, it's always 0 at 0 for this function.

But, if n is odd, for positive x this value is positive and

for negative x it is negative.

So, it's going to look like this.

But if you have y equals x to the n where n is even, so it's 2, 4,

and so on, then it is going to look like this.

So, in that case it is okay.

It is actually a point of local maximum or minimum, local

maximum in this case, and if it were negative it would be,

sorry this is minimum, but if it were like this, it

would be maximum.

So, in other words there are many possible scenarios where

you have a derivative equals 0, and it is a maximum, and there

are many scenarios where the derivative is 0, but it's

not a maximum or minimum.

So, it only goes this way, if there is a maximum or minimum,

then the derivative is 0.

That's why I said, those are the points which potentially

could be maxima or minima.

OK.

So, in principle, you could rule some of them out, from the

outset, by saying well, at these points, actually, f

prime is 0, but the point is not a maximum or minimum.

So, then it cannot possibly contribute to the list of

suspects or candidates for global maximum or minimum.

But, I think it's just much easier to just take all of

them, because there would be only finitely many, and just

evaluate your function f at all of them, and then compare.

Where do you get the largest value, and where do you

get the smallest value.

Okay, good.

So, that's why, the way I formulated this, I did not want

at this level to try to differentiate between the ones

which are actually maxima and minima and which are not.

Let's look at all which are potentially maxima.

So, that's the one-dimensional case.

So, now, in some sense we already know everything we

need to know, because in a two-dimensional case, it's

going to look exactly the same.

The criterion will be slightly more complicated.

Maybe I'll say one more thing, which is that there is a

criterion to see whether the point is a maximum or

minimum in this case.

Namely, suppose that f prime is 0, but f double

prime -- that's x.

At this point, let me emphasize that there is a particular

point x0, which was the point 0 in my previous examples.

Let's call it x0.

This is positive, then it's a maximum.

It's a minimum, sorry.

A minimum, local minimum.

And if -- I said "but," it should be

"and." If f prime of x0 is 0, and f double prime of

x0 is less than 0, then it's a local minimum.

In other words, if you think about this in

terms of Taylor series.

You can oftentimes approximate a function,

a smooth function, by its Taylor series.

And, the first terms in the Taylor series are going to be

given by the value of the function, and the derivative,

and then the second derivative.

So, the point is that if the first derivative vanishes,

that's a necessary condition to have a local

maximum or minimum.

But then, it depends on which term in the Taylor

series is non-0 next.

So, for example, if the second term is non-0, that means your

function looks like x minus x0 squared times some

coefficient, right?

What I'm trying to say, what I'm trying to explain is

the following, let me do it more slowly.

The Taylor series looks like this.

Okay.

So, this is just the value of the function.

Let's assume without loss of generality

that it is equal to 0.

I mean, after all we could just subtract this

value from this side.

It's not going to change anything.

So, let's assume it's 0.

So, the next comes this term, which is the first derivative,

and the first derivative has to vanish otherwise it can't be a

maximum or minimum as we just discussed.

So, this also vanishes.

So, the next term is the second derivative, right.

And, then there are some additional terms.

But, the additional terms are negligible compared

to this term when x is very close to x0.

So, you might as well replace your function by this function.

But, this function is just the parabola, I mean the graph of

this function is just a parabola.

And, the parabola we know, the parabola would be like this if

the coefficient is positive, and it would be like this if

the coefficient is negative.

So, in this case, clearly this is a local minimum.

For this one.

For this one it's a local maximum.

And, the other terms don't matter.

So, that's the reason why we get this criterion.
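The expansion referred to on the board is presumably the standard Taylor series about x0. Written out, the argument reads:

```latex
% Taylor expansion of a smooth function f about x_0
f(x) = f(x_0) + f'(x_0)\,(x - x_0) + \frac{f''(x_0)}{2}\,(x - x_0)^2 + \cdots

% At a candidate point f'(x_0) = 0, and after subtracting the constant f(x_0),
% near x_0 the function behaves like the parabola
f(x) - f(x_0) \approx \frac{f''(x_0)}{2}\,(x - x_0)^2

% f''(x_0) > 0: parabola opens upward, local minimum
% f''(x_0) < 0: parabola opens downward, local maximum
```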

But if the second derivative is 0, if this term is also 0, then we can't really

tell, because we don't know what comes next.

If the non-0 term is a cubic term, we know it's not going to

be maximum or minimum, because we looked at a cubic parabola,

and it's like this.

It doesn't have a maximum or minimum, right?

But, if the cubic one vanishes, but the quartic one is non-0,

then again it's a U shape.

So, there's no telling, we should really then look at the

higher terms in the expansion, and that's much more difficult.

So, that's why we just stop here, and we say

here is a criterion.

If the first derivative is 0, and the second derivative is

positive, it's a local minimum.

And if it is negative, it's a local maximum.

And, we just stop right there.

In other words, it does not exhaust all possible

cases, but it helps us in the

cases where the second derivative is non-0.
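Here is a minimal Python sketch of this one-variable criterion, including the inconclusive case. The function name `second_derivative_test`, the step size `h`, and the tolerance `tol` are assumptions for illustration, and the derivatives are estimated by finite differences rather than computed exactly.

```python
import math

def second_derivative_test(f, x0, h=1e-4, tol=1e-6):
    """Classify x0 using finite-difference estimates of f'(x0) and f''(x0)."""
    d1 = (f(x0 + h) - f(x0 - h)) / (2 * h)
    d2 = (f(x0 + h) - 2 * f(x0) + f(x0 - h)) / (h * h)
    if abs(d1) > tol:
        return "not a critical point"   # slope is non-zero
    if d2 > tol:
        return "local minimum"          # parabola opens upward
    if d2 < -tol:
        return "local maximum"          # parabola opens downward
    return "inconclusive"               # second derivative vanishes too

print(second_derivative_test(lambda x: x**2, 0.0))   # local minimum
print(second_derivative_test(math.cos, 0.0))         # local maximum
print(second_derivative_test(lambda x: x**3, 0.0))   # inconclusive
print(second_derivative_test(lambda x: x**4, 0.0))   # inconclusive
```

The last two lines show why the test stops short: x cubed has no maximum or minimum at 0, while x to the fourth does have a minimum there, yet in both cases the second derivative is 0 and the test cannot tell them apart.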

And, there is a similar criterion also for

functions of two variables.

So, now we switch to functions in two variables.

[INAUDIBLE]

I wrote what?

On the top of what?

Oh, they are both local minima.

Wow.

It's kind of pessimistic.

Thank you, I have to correct.

You definitely should correct that.

I know it looks like we'll never reach maximum.

Okay, I think now it's good.

Right, because if it's negative it is shaped like

this, so it's maximum.

Okay.

So now we switch to functions of two variables.

So again, we have local things, local maxima

and minima and global.

And, searching for them is sort of, they are two

different games for this.

For local maxima or minima, the first, step one is to check

that the two partial derivatives are 0.

Just like for functions in one variable, the first step is to

look at the first derivative.

Well, now we have a function in two variables, so there are two

different partial derivatives.

So, both of them have to vanish.

In order for us to have a local maximum or minimum.

Well, I'm assuming now that both of them exist.

There is another possibility which is that, say one

of them may not exist.

So, in that case, that's also a possible case for local

maximum or minimum.

But, let's assume, in this discussion, that the partial

derivatives always exist.

So then we don't have to worry about this.

If they do exist, then a given point, x0, y0, will be a local

maximum or minimum, only if those partial

derivatives vanish.

So, when you kind of narrow down your search, you first

have to, you throw everything away, everything else away.

You just keep the points for which both partial

derivatives are 0.

But, this does not guarantee that it is maximum or

minimum, just like in the one variable case.

The best we can do, is to have a criterion involving

second partial derivatives.

So the criterion -- we would like to say something like if

the second derivative is positive, it's a minimum, if

it's negative it's a maximum.

But, there are now three different second

partial derivatives.

We have fxx, fxy, and fyy.

OK.

So, in fact the rule is as follows.

We have to calculate the following expression: So,

remember when we did cross products, we used determinants.

So, let's make a determinant of this 2 by 2 matrix, which

is very easy to memorize.

Think of it this way: the rows will correspond

to the first index.

So, the first index here is x and here is y.

And, the columns will correspond to the second index, which

here is x and here is y.

So, you put four possible partial derivatives

in this matrix.

Then, we know by Clairaut's theorem that this is

the same as this.

But, let's not yet worry about this.

This is just an easy way to remember.

OK, and then we take the determinant of this.

So, that's fxx fyy minus fxy fyx.

But fyx is equal to fxy -- OK, now we remember it.

And we just put fxy squared.

So let's call this D.
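
Written out, the determinant described above would look as follows. This is a reconstruction from the spoken description (rows indexed by the first subscript, columns by the second), not a copy of the board.

```latex
D = \det \begin{pmatrix} f_{xx} & f_{xy} \\ f_{yx} & f_{yy} \end{pmatrix}
  = f_{xx} f_{yy} - f_{xy} f_{yx}
  = f_{xx} f_{yy} - (f_{xy})^2
```

The last equality uses Clairaut's theorem, which gives f_{yx} = f_{xy}.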

So the criterion is that if both partial derivatives are

0, and D is greater than 0, then it's a maximum.

That's number one.

Number two, if both partial derivatives are 0 and D is

negative, then it's a minimum.

And finally, I'm sorry, I'm not saying it correctly,

it's maxima, No, sorry, it's worse than that.

It's maximum or minimum.

And this one is not.

Let's just say not.

I don't have enough space.

Not a maximum, not a minimum if it's negative.

And, if it is 0, it's inconclusive.

Don't know.
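
The rule, as finally corrected, can be sketched in Python as below. The function name is an assumption for this illustration. Deciding between maximum and minimum by the sign of fxx when D is positive is the standard refinement of the test; the lecture itself only says that in this case one has to look more closely.

```python
def second_derivative_test_2d(fxx, fyy, fxy):
    """Classify a critical point (fx = fy = 0) from its second partials."""
    D = fxx * fyy - fxy ** 2
    if D > 0:
        # Maximum or minimum. D > 0 forces fxx and fyy to be non-zero
        # with the same sign, so the sign of fxx settles which one
        # (a standard fact, added here beyond what the lecture states).
        return "local minimum" if fxx > 0 else "local maximum"
    if D < 0:
        return "not a maximum, not a minimum"
    return "inconclusive"
```

For example, second partials (2, 2, 0) give a local minimum, (-2, -2, 0) give a local maximum, and (2, -2, 0) give neither.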

OK.

First point, think of this as an analog of this rule, because

in the case of one variable, there is also a rule which

involves a second derivative.

However, this rule is much more complicated, because there

are three different partial derivatives of second order.

And, we make some complicated combination of them.

Whereas, here we just took the second

derivative on the nose, and we just looked at whether

it's positive or negative.

But there is clearly an analogy between the two,

because this involves second partial derivatives and

this involves the second derivative.

Now, but it looks very mysterious.

Why do I make this

-- why do I look at this combination, and not

at other combinations.

To understand this, think of

the case of the parabola, and the look of the parabola.

Because I explained how this rule came

about by looking at the parabolas, right.

The parabolas, because the parabolas approximate

your graph.

Just because of the Taylor expansion argument, you can

see that the parabolas

are going to approximate your graph near the point where

the first

derivative vanished.

Right.

So, think about the parabolas.

And, in the case of the parabola,

first of all, the parabola now becomes a

paraboloid, but there are two types of paraboloid.

There is an elliptic paraboloid, and there is

a hyperbolic paraboloid.

OK.

And, just look at the examples of elliptic paraboloids, and

you will see that for elliptic paraboloids, the first

condition will be satisfied.

And, for a hyperbolic paraboloid, the second

condition will be satisfied.

So, if z is equal to -- let's say if f of x,y is

ax squared plus y squared.

Ah, ax squared plus by squared.

So, what are the derivatives in this case?

fxx is 2a, right?

fyy is 2b, right?

And fxy is 0.

So there is a simplification in this case.

Because there is no cross term.

OK?

So this matrix looks like this: it's 2a and 2b on the diagonal, and the determinant is 4ab.

That's the D in this case.

So, to say that D is positive means to say that

both a,b are positive or both of them are negative.

If both a,b are positive, it's going to look like this.

If both a,b are negative, it's going to look like this.

So, in this case it's a local minimum, and this

case a local maximum.

But, in both cases you see a,b both positive,

a,b both negative.

The combination a times b, or 4a times b.

In both cases it's positive.

So, that's why we get into the first condition, in the

situation of the first condition where D is positive.

So, in this case, we can say for sure that it's maximum

or minimum, but we cannot say which one.

So, we have to look at it more closely.

OK.

And, what if D is negative.

If D is negative, that means that a,b have different signs.

And in this case, so a good example of this would be z equals

x squared minus y squared.

And, that's a hyperbolic paraboloid.

And for a hyperbolic paraboloid, I drew this picture

before, it looks like a saddle, and on the saddle, there is a

point from which you can either increase the function if you go

along one of the parabolas, which opens up this way, or you

could also decrease the function by traveling on

a different, perpendicular parabola, one which

opens downward, right.

So, this point clearly, this point on the saddle is not a

point of maximum or minimum.
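
To make the quadratic examples concrete, here is a small Python sketch that computes D = 4ab for f(x,y) = ax squared plus by squared and checks whether the function goes up or down when stepping away from the origin along each axis. The helper names, the sample coefficients and the step size h are assumptions chosen for this illustration.

```python
def make_quadratic(a, b):
    """Return f(x, y) = a*x^2 + b*y^2."""
    return lambda x, y: a * x**2 + b * y**2


def discriminant(a, b):
    # fxx = 2a, fyy = 2b, fxy = 0, so D = fxx*fyy - fxy^2 = 4ab.
    return (2 * a) * (2 * b) - 0 ** 2


h = 0.1  # small step away from the critical point (0, 0)
results = {}
for a, b in [(1, 1), (-1, -1), (1, -1)]:
    f = make_quadratic(a, b)
    rises_along_x = f(h, 0) > f(0, 0)
    rises_along_y = f(0, h) > f(0, 0)
    results[(a, b)] = (discriminant(a, b), rises_along_x, rises_along_y)
```

With a = 1, b = -1 the function rises along the x direction and falls along the y direction, which is the saddle behavior described above, and D comes out negative. With both coefficients of the same sign, D is positive and the function moves the same way in both directions.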

So, that is the explanation of this criterion in the case of

quadratic functions, functions which are combinations of

x squared and y squared.

And, the point is that all other functions can be reduced

to this one by certain procedures, and that's

how you get this rule.

OK.

So, that's how we get this rule for local maxima and minima.

And, that takes care of that issue.

The last remaining topic is how to find the absolute

maximum and minimum on particular domains.

And, this I will illustrate very quickly by a

concrete example.

This is step 1 and this is step 2.

Let me give you an example of how to find maximum

and minimum, global maximum and minimum.

I have just enough time to explain this.

So, let's say you have a function, f of x y, which is

x squared plus y squared plus x squared y plus 4.

Find global or absolute maxima and minima on the domain of x,y

where the absolute value of x is less than or equal to 1,

and the absolute value of y is less than or equal to 1.

So, the first step is to sketch the domain.

Sketch, it's very easy, right.

This is just the square, where the sides are on lines parallel to

the x, y axes, at 1 and negative 1.

Step 2 is to find the boundary, identify the boundary.

This is the boundary.

And, now we are going to make a list of suspicious points, or

points which are candidates for being maxima or minima.

OK.

And, this list will include three kinds of points.

The first are points in the interior, where both

partial derivatives are 0.

What do I mean by interior?

Interior means everything except the boundary.

So, I have to calculate what is fx and what is fy? fx is

2x plus 2y and fy is 2y plus x squared.

Right.

So, we have to set this equals to 0 and this equals to 0.

Since I'm running out of time, let me just go

to the next step.

You solve these equations, it's very easy.

Right.

This is the first group of points that you

get on your list.

The second group of points are points on the boundary, but

which belong to the smooth part of the boundary.

Two is smooth part of the boundary.

Yes?

[INAUDIBLE]

2x plus 2xy.

I'm sorry, that's right.

All right.

Smooth part of the boundary.

What I mean by this, well exclude -- I mean, maybe

it's not a good idea to say smooth part.

Maybe it's not like this.

Let's just say, let's just call it the components of the

boundary, components of the boundary.

So, what I mean by components, I mean these four intervals.

So, break your boundary into four, into pieces, which can

be represented by a nice equation, like here.

Here it's like x is equal to 1 and y is between

negative 1 and 1.

So, then you

restrict your function.

Restrict your function to this component.

It will effectively become a function in one variable.

Solve the problem for this one variable function.

One minute left.

OK.

So, let me give you an example.

Say, one of the components is y equals 1 and x is

between negative 1 and 1.

So, I substitute this y equals 1 into this formula, and I get

f of x, 1 is x squared plus 1, plus x squared plus 4.

So, that's 2x squared plus 5.

I got a function in one variable on this interval.

Find the absolute maximum of this

function on that interval.

OK.
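
The restriction to the component y equals 1 can be checked with a short Python sketch. The names f and g and the list of candidate points are assumptions for this illustration; the candidates are the two endpoints of the interval plus the point where the derivative of 2x squared plus 5, namely 4x, is 0.

```python
def f(x, y):
    return x**2 + y**2 + x**2 * y + 4


def g(x):
    """f restricted to the edge y = 1; this should equal 2x^2 + 5."""
    return f(x, 1)


# Endpoints of the interval and the critical point of g (where 4x = 0).
candidates = [-1, 0, 1]
edge_values = {x: g(x) for x in candidates}
edge_max = max(edge_values.values())
edge_min = min(edge_values.values())
```

On this edge the largest value is attained at the endpoints and the smallest at x equals 0.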

And, then the same for each of the other components.

And, that's not all.

It would have been all, it would have been all, if you

didn't have the corners.

Because you have corners you have to include them, because

in principle it could happen that the maximum or minimum is

attained at the corners, so include the corners.

So, now you compile the list, and you evaluate the function,

and you choose the ones where the value is largest and smallest.

So, that's how you do it.
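
The whole procedure for this example can be sketched in Python as follows. The lecture does not carry out the algebra, so the candidate points listed here were worked out by hand for this illustration, using the corrected fx = 2x + 2xy and fy = 2y + x squared; treat them as an assumption to be checked rather than as part of the lecture.

```python
def f(x, y):
    return x**2 + y**2 + x**2 * y + 4


candidates = [
    # 1. Interior: fx = 2x + 2xy = 0 and fy = 2y + x^2 = 0 give (0, 0).
    #    (The other solutions have y = -1, x^2 = 2, outside the square.)
    (0.0, 0.0),
    # 2. Components of the boundary, each a one-variable problem:
    (0.0, 1.0),    # y = 1:  f = 2x^2 + 5,    derivative 4x = 0
    (1.0, -0.5),   # x = 1:  f = y^2 + y + 5, derivative 2y + 1 = 0
    (-1.0, -0.5),  # x = -1: f = y^2 + y + 5, derivative 2y + 1 = 0
    (0.0, -1.0),   # y = -1: f is constantly 5; one representative point
    # 3. Corners of the square.
    (1.0, 1.0), (1.0, -1.0), (-1.0, 1.0), (-1.0, -1.0),
]

# Evaluate the function on the list and pick the extreme values.
values = {point: f(*point) for point in candidates}
global_max = max(values.values())
global_min = min(values.values())
max_points = [p for p, v in values.items() if v == global_max]
min_points = [p for p, v in values.items() if v == global_min]
```

Under these assumptions the maximum is attained at two of the corners and the minimum at the interior critical point, which shows why all three kinds of points have to be on the list.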

All right, have a good weekend.