Uploaded by UCBerkeley on 17.11.2009

Transcript:

Question.

Sorry?

No, I'm working, not walking out.

I will be here on Thursday, so inasmuch as I sympathize with

the organizers, and I think that they really have a very

valid point, they raise a very valid point, I also feel that

my job is to be here, so I will be here Thursday.

Any other questions?

Perhaps even more important, next Thursday we'll have

our first mid-term exam.

Yeah, everybody's excited.

So I already put an announcement on the bSpace

page, which, by the way, you should check bSpace site

because there is all kinds of interesting stuff in there,

and useful information also.

So, homework solutions, something I have already told

you about, but also there will be some information

about the exam.

This week I'm going to post a mock mid-term so that you will

have a chance to practice for the mid-term, and we'll have a

review lecture next Tuesday, a week from today.

And the exam is on Thursday exactly during the class hour

-- one hour, 20 minutes, more precisely.

So, we'll start at 3:40 and finish at 5:00.

Because this room is not big enough, you know, because

everybody would be packed, so I have requested another room, so

we'll have to separate into two groups.

I will post information about this and I will tell you about

this and everything about this on Thursday or next Tuesday.

And now I go back to what we talked about last

week, which is limits.

I just want to say a few words about limits, and then we will

move on to the next subject, which is partial derivatives.

So about the limits -- too many late arrivals today.

I'm going to charge a late arrival tax if this continues.

So let's just quiet a little bit because it will be good

for everybody if we are more focused.

So limits.

I illustrated limits by way of an example and I looked at

the particular function in two variables, namely x squared

divided by x squared plus y squared near xy equals 0,0.

And the reason is that at this point the denominator becomes

equal to 0, and so this expression becomes problematic,

it may or may not have a limit.

If we did not have that, if, for example, we looked at this

function near a point xy equals 1, 0 or 1, 1, a point where the

denominator is not equal to 0, that would be an easy

question to handle.

But precisely when the denominator becomes equal to 0,

we have to be careful, and we have to analyze it

more precisely.

And so in this particular case, I explained that this function

doesn't have a limit.

This function does not have a limit at this point.

And I explained this, explained the reason.

The reason is that we could find -- the reason I gave was I

showed two different paths, on the xy plane which approach 0.

One was the path when x is equal to 0, so

it's a green arrow.

And the other one was the path when y is equal to 0, and

that's the red arrow.

And we have seen that along the first path, when x is equal to

0, when we look at this function on this path, very

close to 0 but not quite equal to 0.

What we get is 0 divided by 0 plus y squared.

But as I said, we really look at it not at this point

but just near this point.

So near this point y is not 0.

So this is not 0 near the point.

Because it's near the point, we can actually evaluate this, and

we see that for all values of y near this point, this is

actually just plain 0 because you divide 0

by something non-zero.

And so this means that along this path, there is a

limit and it's equal to 0.

On the other hand, if we look at this path, we get x squared

divided by x squared plus 0, and again, we will assume that

x is very close to 0 but not quite 0 yet.

So this is actually a non-zero expression, and therefore, we

can cancel them out -- this again is non-zero, and

this actually gives us 1.

So that means that along this path, again, there is a limit

along this path, there exists a limit, there is a limit

along this path, namely 1.

So we have found two paths along which we achieve a

different limit, we obtain a different limit.

That means that the function itself does not have a limit,

does not have a limit, because to say that the function has a

limit, if along any path approaching the point the limit

exists, then all of them are the same.

In this particular case, clearly they are not the same

because there are two paths, at least two paths along which

the limits are different.
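
The two-path argument can be checked numerically. The following is a minimal sketch in Python, not something shown in the lecture; the helper name values_along and the sample step sizes are assumptions chosen for illustration. It evaluates x squared divided by x squared plus y squared at points close to, but not equal to, the origin along x equals 0, along y equals 0, and along the diagonal x equals y.

```python
# Evaluate f(x, y) = x^2 / (x^2 + y^2) along three straight paths into (0, 0).
# The function is never evaluated at the origin itself, only near it.

def f(x, y):
    return x**2 / (x**2 + y**2)

def values_along(path, steps=(1e-1, 1e-3, 1e-6)):
    """Return f at the points path(t) for a few shrinking nonzero t (hypothetical helper)."""
    return [f(*path(t)) for t in steps]

along_y_axis = values_along(lambda t: (0.0, t))   # the path x = 0
along_x_axis = values_along(lambda t: (t, 0.0))   # the path y = 0
along_diagonal = values_along(lambda t: (t, t))   # the path x = y

print("x = 0:", along_y_axis)
print("y = 0:", along_x_axis)
print("x = y:", along_diagonal)
```

The values stay at 0 on the first path, at 1 on the second, and at 1/2 on the diagonal, no matter how small the step, which is the disagreement the argument relies on.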

So how can it be that a function has different limits

along different paths, in other words, what is

the meaning of this?

What is the geometric representation of this?

Well, to explain that we can already see an analog of this

phenomenon for functions in one variable, which I

explained last time.

In the case of one variable, we look at the graph of a function

as a geometric representation, and we can have the following

situation where the graph is discontinuous.

So graph is discontinuous, for example, like this.

So there's some point, x sub 0, and if you approach from the

left, you're going to end up at this point and this value, and

if you approach on the right you end up with this value.

So, that means that the function does not have a

limit at this point, because along different paths you

get different limits.

They are both finite, so in some sense you can say, you can

argue that it's a more benign situation than let's say

situation of a hyperbola where actually it goes to infinity.

Here, by the way, on the right it goes to plus infinity and

the left goes to minus infinity.

So, in some sense it's even worse than just going to

infinity, it actually goes to two different

infinities in some sense.

But first of all, if it goes to infinity along any path,

it's already, we already say it doesn't have a limit.

But even if it's finite, this situation, which is in some

sense more benign, it still doesn't have a limit, it has a

limit from the left, has a limit from the right.

In the case of one variable, there are only two possible

paths which converge to this point.

You can go sort of with different speed, different

velocity, but it doesn't matter, I mean geometrically

it's the same path, either this one or this one, only two.

And already that creates trouble.

In the case of functions in two variables there

are many more paths.

I have drawn two paths here -- here is one, here is another,

but you have many more, right, you also have a path like this.

Finally, a path doesn't have to be a straight line,

it could be a spiral.

So, there are many, many different paths, and that's

the difference between two variables and one variable.

To say that the function has a limit for a function in two

variables is a very strong statement, it's a statement

that along any of those paths, you're going to achieve

the same limit.

So, this actually shows that in a way it's much easier to

disprove something here, to show that the function does not

have a limit than to show that the function does have a limit.

Indeed, to show that it doesn't have a limit, it's sufficient

to just exhibit two different paths along which you have

different limits, and usually it's pretty clear which

ones you should take.

For example, in this case, you just look at

x equal 0, y equals 0.

Sometimes you might need to look at a linear path, like

this one where x is equal to y, again, to convince yourself

that indeed the function doesn't have a limit.

But to prove that actually the function does have a limit is

much more difficult because it wouldn't be enough, for

example, to say that the limit along this path and along

this path is the same.

You would also need to show that the limit along this

path is the same, and along infinitely many other paths.

So to show that it has a limit is more difficult, and this is

not a very efficient way to show it.

It is very efficient -- this way of argument is very

efficient to show that it does not have a limit, because then

it's enough to show just two for which the limits are

different, but to show that it has a limit, it wouldn't be

enough to show that along two paths you get the same answer.

You have to show that the same goes

for all paths.

So, for practical purposes, what you need to know is the

argument showing that it doesn't exist.

You need to know this way of argument.

When the limit doesn't exist you should be able to

demonstrate that there exist two paths along which the

limits are different.

Yes?

[INAUDIBLE].

The distance, right, the distance, what really matters

is the distance -- what should matter -- so the question is

what really matters, the shape of the spiral or the shape

of the curve or the angle.

It depends on the situation.

The point is that to have a limit means that as soon as you

get close, say within 1 over 100 of an inch of the origin,

the answer is going to be within some small neighborhood

of the value that you claim is the limit.

It doesn't matter how you approach it, it should

be within that limit.

If it's 1 over 1,000 of an inch it should be even closer, if

it's one over a million should be even closer and so on.

So the notion of a limit should not rely on the way you

approach, it should be about, it should be uniform with

respect to all directions and all points in the neighborhood

of the point, of the point 0, 0.

So, in a sense this argument that I gave you is

kind of misleading.

It's a very nice argument to disprove the existence of a

limit, to show that the limit does not exist.

But it is misleading if you try to think in this way about

the existence of limits.

So, for existence of limits, you have to use a different

kind of argument.

And now I'm not going to require that on the exam, but

I'm going to, just to give you an idea how it works, I'm

going to explain it in the following case.

Suppose you have a function which is just slightly

different from this at first glance, namely instead of x

squared I will take x cubed.

You see, so what happened?

So, the problem here, the problem with this function

was that both numerator and denominator had degree two.

So both numerator and denominator in some sense are

going to 0 at roughly the same speed, but not exactly.

It depends along which direction you go.

If you go on this one, it will be 0, along this one it's one,

along this one, for example, if x is equal to y, you can

see it's going to be 1/2.

But because the powers are the same, that's why you end

up with different answers.

What happens now is that I put the numerator, I choose as the

numerator x to the third power, and the third power goes to 0

much faster than second power.

So that's why this will dominate and this will kill

this guy, so it will become 0.

Well, it's hardly kill, because they both go to 0, so in some

sense it goes faster to 0 so it doesn't really kill it.

But it depends on your point of view.

So, how would I show that this actually has a limit?

So I claim, I claim that this function does

have a limit at 0,0.

Namely the limit is equal to 0.

How would I show that?

Well, for that I would actually have to estimate the value of

this function, I would have to estimate the value of this

function when I approach 0.

So I will say let's suppose that xy belongs to the small

disk of some radius r, and I purposefully don't want to use

delta and epsilon, not because I don't like the Greek alphabet,

but because I know that people immediately feel violated when

I try to talk about epsilon and delta.

So, if I use a different letter you will feel much more

comfortable, some of you will feel more comfortable.

There is a certain -- just the way it sounds -- actually, one

of the students told me that it reminds him of going to

a dentist's office, epsilon and delta.

So, let's call it r, and let's look at the disk of radius r.

I'm drawing it as a big disk, but actually you should think

of it as very small, I just magnify it.

So, this is D sub r and I want to look at all the points here,

and I want to estimate, I want to estimate the value

of this function for all points within this disk.

So the function, as I explained already many times is a rule,

which assigns to each of these points a certain number, which

is the value of the function, right, so that's

the function, f.

And I want to estimate the value of the function.

What I want to show is that the closer I get here, the closer

I'm going to get to the neighborhood of 0

value on this line.

So how can I estimate this?

Well, what does it mean that a point belongs to this?

It means that x squared, the square root of x squared

plus y squared is less than or equal to r.

That's what it means, because that's the distance, that's how

we measure the distance to 0. To belong to a disk of radius r

means that for your coordinates x and y, the square root of x

squared plus y squared is less than or equal to r.

So that means that x squared plus y squared is less than

or equal to r squared.

But you see both of these are positive, so if the x squared

plus y squared together are less than r squared,

this also implies that x squared is less than r squared,

so x is actually less than or equal to r.

This is actually also clear geometrically because if you

have all points in the disk, you will see that all of them

will have the x coordinate less than r, except for the two

points which lie on the intersection of the

circle with the x-axis.

So, I know this as soon as my point is in D sub r,

and I also know this.

So now, what I would like to do is I would like to

write the following.

I would like to write x cubed divided by x squared plus

y squared, and I want to measure the absolute value.

What matters is that absolute value.

When I say it's close to 0, it doesn't have to be

positive or negative.

It should be very close to 0 in either side.

So that's why, to estimate it, it's better to take the

absolute value rather than just the value.

And so now I want to write it like this.

I want to write as x times x squared divided by x squared

plus y squared -- I've done nothing, I just pulled

one x out of this fraction.

And then I want to write it as x times x squared divided by

x squared plus y squared.

So now I want to look at this, and clearly, this is less than

or equal to 1, because in the numerator I have x squared,

but in the denominator I have x

squared plus y squared, and this guy's always

positive or 0.

So the largest value this can attain is 1, when y squared is

0, but if y squared is not 0, the numerator will be strictly

less than denominator.

So, this fraction is less than or equal to 1, and x, as I have

just explained, is less than r -- less than or equal to r.

So that means that the whole thing is less

than or equal to r.
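
Written out as one chain of inequalities, the estimate looks as follows. This is a sketch of the board work reconstructed from the spoken argument, so the exact layout is an assumption; the absolute value on x is made explicit, and D_r denotes the disk of radius r around the origin.

```latex
\[
\left|\frac{x^{3}}{x^{2}+y^{2}}\right|
  = |x| \cdot \frac{x^{2}}{x^{2}+y^{2}}
  \le |x| \cdot 1
  \le \sqrt{x^{2}+y^{2}}
  \le r
  \qquad \text{for all } (x,y) \in D_r,\ (x,y) \neq (0,0).
\]
```

The middle step uses that x squared is at most x squared plus y squared, and the next one uses that the absolute value of x is at most the distance from the point to the origin.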

So you see I have been able to effectively estimate the

value of this function.

I can not find it for all x and y -- I mean they're

all different for different x and y.

But I can say for sure that as long as x and y, as long as the

point xy is within that disk of radius r, the value is going to

be less than r -- less than or equal to, it doesn't matter.

So that means that if I make this disk smaller and smaller,

in other words, as I take r closer and closer to 0, this

value of the function -- more precisely the absolute value of

the function, will also tend to 0, because for all points

within that disk, all points, this value is going to be less

than or equal to r, and I have control over r because I

can take my points in a smaller and smaller disk.

Nowhere in this argument am I talking about particular paths

approaching 0, I'm talking about the entire disk of radius

r, and then I'm squeezing that disk, I'm taking smaller and

smaller and smaller, and by being able to use this estimate

I can control the value as a function of the radius.

I can say that if I am within the radius 1 inch, the value

of the function is going to be less than 1.

If I'm within the radius one over 100 of an inch, it's

going to be less than one over 100 -- one over a million.

In other words, I can get as close as I want to 0 for

the value by choosing a sufficiently small disk.

Now that's the argument, that's the proof that this function

has a limit and the limit is equal to 0.
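
The disk estimate can also be tested numerically. The following is a minimal sketch in Python under stated assumptions: the function name max_abs_in_disk, the sample count, the random seed, and the chosen radii are all illustrative and not from the lecture. It samples points throughout the disk of radius r, not along any particular path, and records the largest absolute value of x cubed divided by x squared plus y squared that it sees.

```python
import math
import random

def f(x, y):
    return x**3 / (x**2 + y**2)

def max_abs_in_disk(r, samples=10_000, seed=0):
    """Largest |f| seen over random points of the punctured disk of radius r (hypothetical helper)."""
    rng = random.Random(seed)
    worst = 0.0
    for _ in range(samples):
        # rng.random() lies in [0, 1), so rho lies in (0, r]: the origin is never sampled.
        rho = r * math.sqrt(1.0 - rng.random())
        theta = rng.uniform(0.0, 2.0 * math.pi)
        x, y = rho * math.cos(theta), rho * math.sin(theta)
        worst = max(worst, abs(f(x, y)))
    return worst

radii = [1.0, 1e-2, 1e-6]
worst_values = [max_abs_in_disk(r) for r in radii]

for r, worst in zip(radii, worst_values):
    print(f"r = {r:g}: largest |f| sampled = {worst:.3g}")
```

A sampling run like this is only evidence, not a proof; the proof is the inequality itself, which holds for every point of the disk at once.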

And that's what traditionally mathematicians explain in

the epsilon delta language.

But that's all it is, that's all it is is just saying that

if you take a disk of radius r, then the value of the function

for all points within that disk is going to be less than r or

less than something which becomes smaller with r.

So that's the argument.

Are there any questions about this?

Yes?

So, they ultimately have to prove that that function

will be smaller than r.

That's right.

In general it will not necessarily be r, it could be

-- let's suppose I prove, let's suppose I had a different

function, for example, I had 2 times x cubed, then I would

prove it's less than 2r.

That would still be OK, because I have to get something which

will become smaller and smaller as r becomes smaller.

Or if I get r to the 1/2, the square root of r, I would

also be OK, or r squared.

It will not be OK if I just say that it's less than 1, that the

whole thing is less than 1.

That wouldn't help me.

That would just tell me that I get within a certain range, but

that range doesn't get smaller as the domain gets smaller.

I have to make sure that the range will get smaller as

the domain gets smaller.

And that's a perfect situation where I didn't have to

make any adjustments.

I get r on the nose as the estimate, and the largest

possible value for this.

Another question?

[INAUDIBLE]?

Would that only work -- the question is whether this

is a general argument.

This is not a general case because in general you're

going to have maybe some polynomial in x and y

with additional terms.

I have really looked at the simplest case.

But in general the argument is going to be very similar and if

you like in the book there are more examples of this

type which are analyzed.

But as I said, I'm not going to require you to know this, so my

point here is just to give you an idea of how this

kind of proof works.

And I think that even though this is the simplest example,

it already illustrates this idea.

Yes?

[INAUDIBLE].

L'Hopital's Rule.

Well, L'Hopital's rule is really specific to functions

in one variable because you have to differentiate.

So, the question was about L'Hopital's Rule, which was

one of the powerful methods for finding limits for

functions in one variable.

And so the way it works is that if you have a function which is

say p of x divided by q of x and both of these tend to 0,

say, and you don't know what the limit is, you can estimate

the limit by taking derivatives.

So now the situation is different because now we have,

this is one variable case, and now we have two variables, and

in two variables we have functions in two, say,

p of xy and q of xy.

So let's say I wanted to generalize this, I would have

to take some sort of derivative here, right, and this actually

brings out the question which we study next, which is what

kind of derivatives can we do for functions in two

and three variables.

So clearly, there isn't a single derivative because

derivative is about the rate of change, the rate of change.

In a 1-dimensional case there is only one direction in which

you can change your variables.

You can increase that -- you know you can

just go away from x.

Apart from the fact that you can change it going left and

right, there is essentially only one way to change it.

More precisely, there is only one degree of freedom that you

can change, you can only change in one direction.

But in two variables there are more directions in which you

can change and estimate the rate of change, and therefore,

there are many more derivatives that are possible.

So in fact this is very close to what we discussed up to now.

There are so many different ways to approach 0, and we

have to be able to take care of all of them.

So, what L'Hopital's Rule in two dimensions would give us at

best is a way to approximate the limit along a

particular direction.

So let's say if I were to take the derivative with respect to

the x direction, then I would be estimating what happens when

I approach along this line.

And if I were to take the derivative with respect to y, I

would be estimating along the green line, along the y-axis.

But neither would give me the full picture.

The full picture I can only get as I have argued, by sort of

looking at paths, right, looking at all possible directions.

So in that sense the L'Hopital's Rule

doesn't help us.

Now, sometimes, sometimes it happens that you can convert

this function that you have in two variables, into a

function in one variable.

One of the exercises, I think it's the last one on homework

is about this where you can use polar coordinates to realize

that the function you have is something like -- it's in both

x squared plus y squared.

I think it's something like logarithm of y squared times

x squared plus y squared, something like this.

So you realize that actually it only looks like it's a function

in two variables, but actually, it's a function in one

variable, namely this.

And then you are back to one variable case, and then it

becomes a fair game to use all the methods that you know

in one variable case.

A general function is not going to be like this.

For example, this one is not like this.

So, I can not directly use L'Hopital's Rule for two

variables, and there isn't any obvious way to use it because

there is more than one possible derivative, which actually

brings up the question as to what are possible derivatives

for functions in two variables, and that's our next subject.

So, I have already kind of alluded to the answer.

Because there are two variables, we can actually

differentiate with respect to either one of them and we get a

meaningful derivative, and these are called partial derivatives.

So what are partial derivatives?

So we have a function, let's say we have a function f of xy

in two variables, and when we talk about derivatives, we

should, first of all we should fix the point, we should fix

the point at which we are taking the derivative, because

for a different point you have a different derivative.

The same thing happens for functions in one variable.

So let's say we have a point which has coordinates a and b.

What we can do is we can convert this function into a

function in one variable, but by freezing one

of the variables.

So, for example, we can say y is equal to b -- freeze the second

variable and say that y, the variable y is equal to b,

which is the second coordinate of this point.

So what we get then is f of x and b, and let me indicate the

fact that I have frozen, I have frozen this by red, so red

would be a fixed value.

So here also would be, I will put them in red to indicate

that these are numbers like 1 or 5 or 27 over 11,

whatever you want.

But x is a variable, so for x we can plug in any number you want

and you'll get an answer.

So you want to view it still as a function, but because you

have frozen one of the two variables, there is only one

variable that's remaining.

Therefore, what you get is a function in one variable only.

And once we get a function in one variable, we can then

differentiate it just in the usual way how we differentiate

functions in one variable.

Differentiate it, so we'll get a g-prime and then we can

substitute the value that we wanted, the value a.

So then finally we get a number.

So in other words, we have a function in two variables,

first we freeze one of the two variables, and then we take the

derivative with respect to the second variable at the

particular value of that variable, namely a.

So the result of this is what's called the first partial

derivative, or partial derivative with respect to x.

Partial derivative with respect to x at this

point, at the point ab.

And the notation for this is f sub x of ab.

Likewise, I could freeze the second variable -- I mean the

first variable x, I could say x is equal to a.

Then I get a function again in one variable where the first

variable is frozen but the second one is free.

So I get a function of one variable, let's call it h, and

then I can differentiate it.

So what I get is this h-prime of b.

And that's called the partial derivative with respect to y

for which the notation -- I'm just abbreviating the same

sentences I have at the top of this board.

The notation for this is, obviously, f sub y of ab.

So we got two derivatives for functions in two variables,

the first and the second.
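
In symbols, the two definitions read as follows. This is a sketch that uses the names from the lecture, g for the function with y frozen and h for the function with x frozen; the exact board notation is an assumption.

```latex
\[
g(x) = f(x, b), \qquad f_x(a, b) = g'(a),
\]
\[
h(y) = f(a, y), \qquad f_y(a, b) = h'(b).
\]
```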

So now let's look at the example of what

this looks like.

Let's say f of xy is x to the 5 plus x times y cubed plus

cosine x times e to the y.

And we would like to find the partial derivatives.

Now, when I defined them, I was insisting that the value of --

that the derivative has to be evaluated for particular values

of a and b -- sorry, particular values of x and y, which

I denoted by a and b.

Just like in the case of a function in one variable, let's

say if you have a function, let's say for a function in one

variable, say f of x is equal to x cubed.

I could say that the f-prime for any value a will be --

well, will be 3a squared, right.

That's the rule, because I know the rule.

The rule is that the derivative of x cubed is three times x

squared, and then if I substitute x equals

a, then I get this.

So usually we don't write it like this.

Usually we just write f-prime of x is x squared.

In other words, we would like to look at not just one value

of the derivative for particular value of x, namely

a, but at all of them.

For all possible values of x, we would like to know what the

value of the derivative is, and then we can substitute x equals

a, for example, 1/2, then you will get -- I'm sorry, I forgot

3, 3x squared, and no one corrects me or at least I

didn't hear. 3x squared, of course, for the function, and

we substitute x equals a and we get 3a squared.

But it's too pedantic to go this long way each time and say

well, if I ask what is the derivative of this function,

you say well, for a given value a, the derivative at the point

a is going to be 3a squared.

Instead we just write f-prime of x is 3x squared.

And that, what is understood is that if I want the value at the

particular, for a particular a, I'll just plug a into this

formula and I'll get the answer, right.

So we will use the same shorthand for

partial derivatives.

In other words, we will not be, I will not be writing each time

that fx of ab, I will just write fx of xy.

I will write just fx sometimes, and that would be

a function of x and y so that if I substitute a instead of x,

b instead of y, I will get the value of the derivative, of

this particular partial derivative at that point.

So let's see how it works in this case.

In fact, nothing could be easier.

You just look at this function and in order to calculate the

partial derivative with respect to x, you just view y as a

parameter, but not a variable.

This is exactly what I meant when I said that we freeze y.

It just means that we view y as a parameter.

And then you just differentiate what you see.

Well, what do you see, you see x to the 5, so you get 5 x to

the fourth plus this, you differentiate this, it's y cubed

plus differentiate this you get negative sine x

times e to the y.

And that's it.

That's the answer.

That's the way you write the answer.

Now, if you are given some x and y, some values for x and

y, like a and b, you can substitute them and

you'll get a number.

But in fact you can view this first partial derivative

with respect to x as a function of x and y, which I just

obtained in this way.

Likewise, we view x as a parameter, and then take the

derivative with respect to y.

So if x is a parameter, then from this point of view

this is just a constant.

It's independent of y.

Therefore, its derivative is 0.

So it's going to be 0 plus here it's also parameter, so we

just differentiate y cubed, so we get 3x y squared.

Of course, cosine x is also constant, and the derivative

of e to the y is e to the y.

So that's the answer for the second partial derivative.
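
Both answers can be sanity-checked numerically. The following is a minimal sketch in Python, assuming a sample point (a, b) = (1.2, 0.7) and a central-difference helper, neither of which comes from the lecture. It freezes one variable, differentiates the remaining one-variable function numerically, and compares the result with the formulas obtained above.

```python
import math

def f(x, y):
    return x**5 + x * y**3 + math.cos(x) * math.exp(y)

def f_x(x, y):
    # Partial derivative with respect to x: y is treated as a parameter.
    return 5 * x**4 + y**3 - math.sin(x) * math.exp(y)

def f_y(x, y):
    # Partial derivative with respect to y: x is treated as a parameter.
    return 3 * x * y**2 + math.cos(x) * math.exp(y)

def central_difference(g, t, step=1e-5):
    """Numerical derivative of a one-variable function g at t (hypothetical helper)."""
    return (g(t + step) - g(t - step)) / (2 * step)

a, b = 1.2, 0.7  # an arbitrary sample point, chosen only for illustration

numeric_fx = central_difference(lambda x: f(x, b), a)  # freeze y = b, vary x
numeric_fy = central_difference(lambda y: f(a, y), b)  # freeze x = a, vary y

print("f_x:", f_x(a, b), "numerical:", numeric_fx)
print("f_y:", f_y(a, b), "numerical:", numeric_fy)
```

The two lambda expressions are exactly the frozen one-variable functions from the definition: the first varies x with y held at b, the second varies y with x held at a.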

Is that clear?

So this is really straightforward.

You only need to know how to differentiate functions

in one variable.

Yes?

[INAUDIBLE].

Why doesn't cosine go away?

In which one?

In this one.

Well, let's suppose instead of this you had

5 times e to the y.

Then the derivative would still be 5 e to the y, or any other

constant would just show up as overall factor.

So in this case, the constant is cosine x.

That's what I mean when I say we treat x as a parameter.

If we treat x as a parameter it's treated as a number, and

so any expression involving x, like cosine x, is

a fixed number.

So it just shows up as an extra factor.

Any other questions?

So next I would like to explain the geometric meaning of this,

because as you see in this course, algebra and geometry go

hand-in-hand, and all of the concepts that we discuss

algebraically, they have geometric interpretation,

which is very important.

So, for functions in one variable, the derivative of the

function has to do with the slope of the tangent line.

In one variable, the derivative gives the slope of a

tangent line to the graph.

And the way we draw it is like this.

We have xy plane, we have a function, f of x, we have y

equals f of x with the graph.

Note again that y here has a totally different meaning than

y in here. y in here is a second variable, so it's on

equal footing with x -- x and y are two independent variables.

But now I'm talking about functions in one variable, so

there's only one variable x, and y is not a variable, it's

actually -- it denotes the value of the function.

I already talked about this.

It's an unfortunate choice of notation, but that's how it is

so I'm not going to change it.

So, we pick a point, let's say x equals a and we draw a

tangent line, and we know that the derivative, let's say the

angle is theta, that the tangent of theta is a

derivative f-prime of a.

That's the geometric meaning of the derivative of the

functions in one variable.

So then it's natural to ask what is the meaning for

functions in two variables.

To understand that, we have to look at the graph of the

function in two variables.

What does that look like?

Well, for a function in one variable, a graph is a curve on

the plane, and I already talked about it many times why do we

need a plane, because to represent the graph, you have

to have your variables and you have to throw in one additional

variable which will represent the value of the function.

For a function in two variables, there are already two variables

to begin with, and to draw a graph we have to throw in one

more, one more variable, which will represent the

value of the function.

So, as the result, the graph of a function in two variables is

going to live in 3-dimensional space, so it will have

coordinates x, y and z, and we will have a graph of this

function which will be a surface.

So I would like to just draw part of it, which lives

in the first octant.

On the plane this coordinate system breaks the plane

into four parts, which are called

quadrants, right, because there are four of them.

In space the coordinate planes break the entire 3-dimensional

space into 8 pieces, which are called octants.

So this is one octant, it's looking at us like this.

And so the graph actually lives everywhere, but I have just

drawn the intersection of the graph with each of the

coordinate planes.

And so you should think of this as something like a dome,

like a sphere, like part of a sphere.

It's not necessarily a sphere, I'm just -- just like

this is not a circle.

I mean I'm just doing a sample graph.

And to emphasize this, I want to show a particular

point on this.

So let's say I take this point, and so this

point has coordinates.

To find the coordinates I have to drop a perpendicular onto the xy

plane, so that's going to look like this, and then that's the

z coordinate, maybe a little higher.

So this point is -- what is this point?

So this point has coordinates a and b, and the third coordinate

is a value of the function, because it lives on

this yellow surface.

I don't want to shade it because otherwise it will not

be clear what am I shading, am I shading this, am I shading

the plane and so on.

So just try to imagine that there is something here which

looks like it's part of a sphere, and that's the

point which belongs to it.

And that's the graph of a function, f of xy, so it's

defined by the equation z equals f of xy.

And this is a particular point, let's go with p which has

coordinates ab -- these are given, these are given, these

are just the values of x and y.

And what about the z coordinate?

Well, since it's a graph, the z coordinate has to be f of

the x and y coordinates.

So that means that I have f of ab.

So that's what this point is.

So this is f of ab.

Is it clear so far?

So now what is the slope?

What is the slope of the graph?

Well, first of all, it's not the slope of a graph, it's

the slope of a tangent line.

So here, actually, it doesn't make sense to talk about the

tangent line to the graph, because a line is

1-dimensional, and the graph is 2-dimensional.

How can it not be 2-dimensional if we have a function

in two variables?

A function in one variable will have a graph which is a curve,

but a function in two variables has a graph which is a surface,

so it's 2-dimensional.

So it doesn't make sense to talk about a tangent line

unless we make some choices, give some additional

information.

So in fact, the proper notion here is a tangent plane, and

this is something we'll talk about on Thursday.

So that's really ultimately what we would like to

understand is the analog of this picture in two dimensions, and

the full -- to get the full analog of this picture, we

should really talk about the tangent plane.

But for now I have a more limited goal.

I want to illustrate the concept of partial derivatives.

And when I talked about partial derivatives, I said that I

freeze one of the two variables, and then I basically

go back to the 1-dimensional case, so the case of a

function in one variable.

So that's what I would like to do.

I don't want to talk immediately about the tangent plane, the

entire tangent plane, I want to see what I get when I freeze

one of the variables.

So in my algebraic calculation on that board,

I first froze the second variable y.

So what happens if I freeze y?

If I freeze y it means that I look at the part of the graph

which has a fixed y coordinate, namely b, so it means that

I cut this graph with the plane, which is y equals b.

So the result is going to look something like this.

So the blue, this blue, is the intersection with

the plane y equals b.

That's what I get.

So now, instead of a surface, I get a curve.

This intersection is actually a curve because now y is frozen,

y is equal to b, and so it's out of the game.

So the game now is between x and z.

And it's the same game -- the same game as for

functions in one variable.

So, in fact, I can draw this curve as a graph of a

function in one variable x, which I get by

substituting y equals b.

This is, by the way, the function which I called

g of x on that board.

So let me draw this.

So now, as I said, I only have x and z variables remaining,

and this blue curve is going to look like this, and of course,

it continues somewhere, but since I didn't draw it on the

big picture, I'm not going to draw it much beyond

the first quadrant.

It would be tempting to draw it here.

I know you might be wondering why am I drawing it like this

and not like this, but the point is you have to look at it

not from this angle but from the back of the blackboard

where x and z become the oriented coordinate system.

You see what I mean?

You have to turn this, you have to turn this coordinate system

like this, 90 degrees in this way to make x go to the

right and z go vertically up.

Go up.

If I look like this it would be x will go here,

so I don't want it.

I want to look like this.

And that's what I will see.

If I turn this this is what I will see.

So this is, in fact, the graph, this is a graph of what I

called g of x, which is obtained by taking f of xb.

It is part of the surface which is a graph of the entire

function, but I have frozen one of the variables, so I actually

was able to reduce my problem to the problem of a

function in one variable.

I get a graph of a function in one variable, namely

z equals g of x.

And now I can calculate for the value of x equal to a.

I can calculate the slope of the tangent line.

So let me draw this tangent line in white, so there is this

tangent line and it makes an angle, and the tangent of the

angle is the derivative g-prime at the point a, which is what we

call the first partial derivative of the function

f with respect to x at the point ab.

You see what I mean?

I'm doing geometrically here precisely what I did

algebraically on this board.

Algebraically I freeze one of the variables, I get a function

in one variable, it's called g of x, and I differentiate

it at x equal a.

Now I'm doing the same geometrically.

Freezing the second variable means intersecting the graph

with the plane y equals b, then I'm down to two variables.

I can look at it as a graph of a function in one variable, and

then I look at the tangent line to this graph at this point,

and I measure the slope of this tangent line.

The slope is that derivative which we were looking for,

namely the partial derivative with respect to x.
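
The freezing procedure described here can be checked numerically. The following is only a minimal sketch: it assumes the sample function from the worked example later in this lecture, the point (a, b) = (1, 0.5) is an arbitrary choice, and the helper names `slice_in_x` and `derivative` are made up for this illustration.

```python
import math

def f(x, y):
    # Sample function (borrowed from the worked example later in the lecture).
    return x**5 + x * y**3 + math.cos(x) * math.exp(y)

def slice_in_x(func, b):
    """Freeze y = b and return the one-variable function g(x) = func(x, b)."""
    return lambda x: func(x, b)

def derivative(g, a, step=1e-5):
    """Central-difference estimate of g'(a)."""
    return (g(a + step) - g(a - step)) / (2 * step)

a, b = 1.0, 0.5                    # arbitrary point chosen for the illustration
g = slice_in_x(f, b)               # the "blue curve": z = g(x) = f(x, b)
slope = derivative(g, a)           # slope of the tangent line to the blue curve
theta = math.atan(slope)           # angle of that tangent line, in radians

# f_x(a, b) computed by hand: 5x^4 + y^3 - sin(x) e^y
exact = 5 * a**4 + b**3 - math.sin(a) * math.exp(b)
print(slope, exact)
```

The two printed numbers should agree to several decimal places, which is the statement that g-prime of a is the partial derivative of f with respect to x at the point ab.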

Any questions?

So let me draw it now on this board.

So the tangent line that I drew over there is going

to look like this.

That's the tangent line.

It's not the entire tangent plane, it's one line

on that tangent plane.

If you think of the tangent plane as this -- it doesn't

want to turn anymore.

I didn't know that it had some knobs and some

things to play with.

If this is the tangent plane, then I have drawn just one

line on it, and that's the intersection with the xy plane.

Maybe it's better to draw it like this.

If you think of the -- not the xy plane, sorry, it's

a plane of y equals b.

So think of the y equal b plane as this vertical, kind of

vertical plane, then that's the tangent line that I got.

So this green plane is not yet on the picture.

I have not drawn it yet.

I have only drawn this.

And so now I'm going to draw the second one, so that's my

point, that's what point -- yellow.

And now I will talk about the second tangent line, which

corresponds to freezing the other variable.

So this corresponds to y equals b.

And this corresponds to x equals a.

So, I intersect now with the plane x equals a, and I'm going

to use a different color for this, so it's going to

be something like this.

And now, so the red curve is the intersection with

the plane x equals a.

So x equals a would be like this.

So what's the tangent line to this.

Well, this

already looks like tangent line, but I want to erase this

part so we don't get confused.

The second tangent line is going to look like this, and

that's the second white line which I draw on that board.

So if you want the tangent plane looks like this,

this is a tangent plane.

The tangent plane is spanned by both of these lines,

both of the tangent lines.

So the tangent plane is the green board, but I cannot put

it there, so think of the graph as being a kind of a part of a

sphere which is just below this plane, so that this plane just

touches it; it is just the tangent plane

to this graph.

But on this tangent plane I can clearly distinguish two lines.

One of them corresponds to the intersection with the y equal

b plane and the other one with the x equal a plane.

When I intersect

with those planes, I get pictures just like this.

This is the first one and here's a second one.

In the second one, I have two variables left, also, but those

variables are y and z because I have fixed x now -- x is equal

to a, but y remains a free variable.

So I'm talking about this red curve, and this red

curve looks like this.

The way I have drawn it, it looks almost identical, the

blue and red, but I just did it to simplify the picture.

Of course, in general they're going be totally different.

And on this curve, I pick the value of y equal b -- so, this

is my yellow, emphasize -- I forgot to put the yellow in

that place, but I'm sure you understand it.

And then the tangent line is here, and then maybe let's

call this theta prime.

The tangent of the theta prime -- ah, maybe I should say that

this is a graph, z equals h of y, which is f of a,y.

And the tangent is the derivative h-prime of

b, which is fy of ab.

So this line, this tangent line I have drawn here is the

tangent line to the red curve on the graph.

So that's the picture.

So to summarize this, in the case of one variable you only

have one derivative, and that one derivative corresponds to

the slope of the tangent line at a given point.

In two variables you have a tangent plane, and what partial

derivatives give you, they give you the slopes of two tangent

lines which belong to this plane, namely the lines which

are obtained -- like these two lines -- the two lines which

are obtained by intersecting the tangent plane with

the plane y equal b, or the plane x equal a.

That's the idea.

So we just kind of look at it from two different angles.

We'll look at this tangent plane from two different

angles, and what we get is two different lines, and once

you get lines you can talk about slopes.

You can not talk about the slope of a whole plane --

a plane doesn't have a slope, lines have slopes.

And so, there are sort of two independent slopes that we can

talk about, one with respect to x and one with respect to y,

and they correspond to the two partial derivatives, one with

respect to x and one with respect to y.
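
As a small illustration of this summary, the sketch below computes both slopes for a dome-shaped graph and builds the two tangent lines that span the tangent plane. Everything specific in it is an assumption made for the example: the dome is taken to be the upper half of a sphere of radius 3, the point is (a, b) = (1, 2), and the helper names are made up.

```python
import math

def f(x, y):
    # Hypothetical dome: upper half of the sphere x^2 + y^2 + z^2 = 9.
    return math.sqrt(9 - x**2 - y**2)

def partial_x(func, a, b, step=1e-6):
    """Freeze y = b and differentiate in x (central difference)."""
    return (func(a + step, b) - func(a - step, b)) / (2 * step)

def partial_y(func, a, b, step=1e-6):
    """Freeze x = a and differentiate in y (central difference)."""
    return (func(a, b + step) - func(a, b - step)) / (2 * step)

def cross(u, v):
    """Cross product of two 3-vectors."""
    return (u[1] * v[2] - u[2] * v[1],
            u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0])

a, b = 1.0, 2.0
slope_x = partial_x(f, a, b)       # slope of the tangent line inside the plane y = b
slope_y = partial_y(f, a, b)       # slope of the tangent line inside the plane x = a

# Direction vectors of the two tangent lines in (x, y, z) space.
line_in_y_equals_b = (1.0, 0.0, slope_x)
line_in_x_equals_a = (0.0, 1.0, slope_y)

# A vector perpendicular to both lines, hence to the plane they span.
normal = cross(line_in_y_equals_b, line_in_x_equals_a)
print(slope_x, slope_y, normal)
```

For a sphere centered at the origin the tangent plane is perpendicular to the radius, so `normal` should come out proportional to the point (1, 2, 2) on the dome. That is a quick sanity check that the two lines really span the tangent plane.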

Any questions?

Yes?

[INAUDIBLE].

Not a good notation, because prime is for derivative

-- let's go with tilde.

I just wanted to distinguish from the other theta, I

wanted to make sure it's not the same as that theta.

But putting in a prime is like the worst possible notation because

then it looks like I'm taking derivative of

theta, which I'm not.

Even better to call it something else, epsilon maybe.

Alpha, that's a good compromise.

It's Greek but not epsilon or delta, which are taboo.

OK.

So what else can we do?

In the case of a function in one variable, we could also

take further derivatives.

We don't have to take just one derivative, we can take the

second derivative, third derivative and so on.

So it's natural to ask whether we can do something similar for

functions in two variables.

And the answer is yes.

We can also take, for example, a second derivative.

So, in other words, we start with the function f of x and y,

and then we can take the first derivative with respect to x

and we can take derivative with respect to y.

So we got two new functions, which I'll give you an

example of this for a particular case of f.

So both of them are also functions in two variables, so

we can again apply the same procedure and do partial

derivatives for this function.

Then if we go this way, we obtain fxx of xy.

What do I mean by this?

I mean that I take f sub x, this function, and I take

the derivative with respect to x one more time.

That means again freezing y and then taking derivative

with respect to x.

OK, if I go this way, I get fyy of xy, which means I take fy,

the derivative of f with respect to y, and I take the

derivative with respect to y one more time.

But, of course, I can also do mixed derivatives.

For example, here I can take this and can take

the derivative of this with respect to y.

So that I will denote as fxy of xy.

That means taking first the derivative with respect to

x and then with respect to y.

But I can also do fyx, which is first with respect to

y and then with respect to x.

And then, of course, the natural question is whether

I get the same answer if I apply these derivatives

in a different order.

The question is whether these are actually equal.

And in my example, in my example, let's

see if I remember.

I think it was f of xy equals x to the 5 plus -- what was it? -- xy cubed

plus cosine x times e to the y.

So what will be -- so let me write f of xy, so then the

derivative with respect to x was 5x to the 4 plus y cubed

minus sine x e to the y.

If I do one more derivative I get 20x cubed.

Now y cubed is a constant as a function of x.

I view y as a parameter, so it doesn't depend on the x,

therefore, the derivative vanishes, so it disappears.

And then I take one more derivative of minus sine, I get

minus cosine x times e to the y.

On the other hand, I can take derivative with respect to y,

I get 3x y squared plus cosine x e to the y.

One more derivative -- 6xy plus cosine x e to the y.

And now the most interesting thing, I can -- let me do it

like this so that we don't lose track of where we are.

So first we take the derivative of this with respect to y, this

disappears, this becomes 3y squared minus sine x e to

the y, so that's this way.

And if I go this way -- I'm sorry, not this way, this way.

I get the same, right, I get 3y squared and I take the

derivative with respect to x, so minus sine x e to the y.

So clearly I get the same answer.
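
The calculation on the board can be double-checked numerically. This is only a sketch: the test point (0.7, 0.3) and the finite-difference helpers `d_dx` and `d_dy` are assumptions of the illustration, while the closed-form second derivatives are the ones computed above.

```python
import math

def f(x, y):
    return x**5 + x * y**3 + math.cos(x) * math.exp(y)

# Second derivatives as computed by hand in the lecture.
def f_xx(x, y):
    return 20 * x**3 - math.cos(x) * math.exp(y)

def f_yy(x, y):
    return 6 * x * y + math.cos(x) * math.exp(y)

def f_xy(x, y):
    return 3 * y**2 - math.sin(x) * math.exp(y)

def d_dx(func, step=1e-4):
    """Return a numerical partial derivative of func with respect to x."""
    return lambda x, y: (func(x + step, y) - func(x - step, y)) / (2 * step)

def d_dy(func, step=1e-4):
    """Return a numerical partial derivative of func with respect to y."""
    return lambda x, y: (func(x, y + step) - func(x, y - step)) / (2 * step)

point = (0.7, 0.3)                 # arbitrary test point
num_xx = d_dx(d_dx(f))(*point)
num_yy = d_dy(d_dy(f))(*point)
num_xy = d_dy(d_dx(f))(*point)     # first x, then y
num_yx = d_dx(d_dy(f))(*point)     # first y, then x

print(num_xx, f_xx(*point))
print(num_yy, f_yy(*point))
print(num_xy, num_yx, f_xy(*point))
```

Each printed row should show values that agree up to the finite-difference error, and in particular the two mixed derivatives should match each other.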

So this is actually a general result, which

is called Clairaut's theorem.

Or I guess if I pronounce it as in French, with a French

accent, it will be Clairaut.

So the Clairaut theorem says that under favorable

conditions, which is essentially the condition that

in a small neighborhood of a given point, you have all

partial derivatives up to the second order, and they are

continuous functions.

Under these favorable conditions, the two mixed

derivatives are the same.

So this is, in fact, Clairaut's theorem under some

conditions of continuity.

In all our examples, this

will be satisfied.

So this is actually great because what it means is that

if you think of a way of doing partial derivatives for a

function in two variables as I explained, where if you go this

way you differentiate with respect to x, and if you go

this way you differentiate with respect to y.

Right, you could do that.

We could continue this picture.

I can go one more step and it will be like fxxx, or I could go this

way and it will be fxxy -- always the last one, the

new one is the last one.

And then if I go more, it will be, for example, fyx.

But the point is that it doesn't matter in which order

you take them; what matters is how many times you differentiated in x and

how many times you differentiated in y.

So for instance, fyx is equal to fxy, but likewise, fxyx is

the same as fxxy, the same as fyxx, again, under favorable

conditions when functions in question are continuous

and differentiable.

So all that matters is not the order, but the number of times

you differentiate with respect to x and y, which is kind of

nice so it has the same commutative structure as the

structure that you have for the variables x and y themselves.

In fact, differentiation is in some sense the process which is

opposite to the process of multiplication by x or y.

So that you have two operations of multiplication by x and y,

but you also have two operations of differentiation

by x and by y.

And multiplication by x and y commute -- two multiplications

commute, and the derivatives also commute.
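
For polynomials, this commuting can be verified exactly, with no rounding at all. The sketch below is a toy under stated assumptions: a polynomial is stored as a dictionary mapping exponent pairs (i, j) to coefficients, the sample polynomial is made up, and the names `d_dx`, `d_dy`, `times_x` and `times_y` are hypothetical helpers for this example.

```python
def d_dx(poly):
    """Differentiate with respect to x; poly maps (i, j) to the coefficient of x^i y^j."""
    return {(i - 1, j): c * i for (i, j), c in poly.items() if i > 0}

def d_dy(poly):
    """Differentiate with respect to y."""
    return {(i, j - 1): c * j for (i, j), c in poly.items() if j > 0}

def times_x(poly):
    """Multiply the polynomial by x."""
    return {(i + 1, j): c for (i, j), c in poly.items()}

def times_y(poly):
    """Multiply the polynomial by y."""
    return {(i, j + 1): c for (i, j), c in poly.items()}

# Hypothetical sample: x^5 + x y^3 + 2 x^2 y^2
p = {(5, 0): 1, (1, 3): 1, (2, 2): 2}

f_xyx = d_dx(d_dy(d_dx(p)))        # x, then y, then x
f_xxy = d_dy(d_dx(d_dx(p)))        # x, then x, then y
f_yxx = d_dx(d_dx(d_dy(p)))        # y, then x, then x

print(f_xyx, f_xxy, f_yxx)
print(times_x(times_y(p)) == times_y(times_x(p)))
```

All three third derivatives come out as the same dictionary, which encodes the polynomial 8y: only the number of x-derivatives and y-derivatives matters, not their order.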

By the way, there is actually kind of a better notation

for this operation.

Because for now the operation is denoted by inscribing

this additional subscript next to the function.

But there is another notation, so I go back to this, this is

our notation, but another notation is ∂f/∂x.

Also, if you wish at ab, but it doesn't have to be.

And likewise, a notation for the second one, the derivative with respect to y, is ∂f/∂y.

So this is a particular notation.

This is not to be confused -- this should not be confused with the

straight d, with just a straight letter d.

It's not the same.

In fact, this actually makes sense, which I will

explain on Thursday.

I will finally explain what dx means.

But this by itself doesn't make any sense, this

makes sense, ∂/∂x.

∂/∂x is a procedure, which you can apply to a function

and it gives you the first partial derivative.

This is an operation which -- in the case of one variable

we just denote by prime.

Also in the case of one variable we write, in the case

of one variable we write, you have f of x, you write f-prime

of x or you write df dx.

But now we cannot write like this, as I will explain in

more detail on Thursday.

To differentiate you have to specify in which direction

you differentiate and this is one way to do it.

Say you choose to differentiate with respect to the x direction

and then you get this.

But this is not to say -- the numerator by itself doesn't

make any sense as a notation, and the denominator also

doesn't make any sense.

Only these two things together make sense.

This is a notion of partial derivative, and likewise, you

have the notion of partial derivative with respect to y

which takes any function to its derivative with respect to y. This is f

sub x and this is f sub y.

OK?

This on the other hand, df, is an entirely different

object, the differential.

This is an entirely different object, it's called

a differential.

And it's not the same, likewise, it's not df.

So this is not even a letter, if you think about it.

It's not even a letter of any reasonable alphabet.

It's just a mathematical notation for

partial derivative.

So this, of course, begs the question as to what

is the differential.

What is the differential and what on earth does this mean?

Because this is something we've been using quite a lot,

but never really -- I've never really spelled out

what we mean by this.

But actually it has a very precise meaning, differential

and dx and dy, and this is what we're going to discuss next.

In fact, I have about five minutes left, so I'll give you

a little preview of what's coming on Thursday, and it's

really very important, that's a very important subject, which

unfortunately has been made really, really obscure

by a very unfortunate choice of notation.

The very bad notation makes it very obscure and very

difficult to understand.

So I remember when I was learning this for the first

time it was impossible to understand.

So it took me a long time to figure it out, but I'm happy to

share it with you, I have to tell you, because it's actually

very simple, and we already know everything that we

need to know about this.

So what I want to do is just tell you just a little bit,

just a couple things about it.

And I will, as always, I will start with a function in one

variable; that's already a very good example where you can

understand what the differential is and what all

this notation means.

So in fact, I shouldn't have erased it because I'm going to

draw it again, but I just wanted to draw it in a slightly

different way, in a more -- the way I usually draw which

is kind of a -- this is the optimistic view of

reality as it goes up.

The other one goes down, that's why I erased it.

So, we talk about tangent lines, and we talk about the

importance of tangent lines, and the importance of a tangent

line really is that it gives you a very useful approximation

to a complicated function on a very small scale.

So the differential, the differential really is

the function whose graph the tangent line is.

So the funny thing is that we talk about a function in one

variable, so in this case let's say you have a function f of x.

And this yellow curve represents the graph of this

function, that is, the set of solutions to the equation

y equals f of x.

So for this function we have two objects -- we have the

function named f, and we have the graph which is the yellow

curve, we have two objects.

And then we talk about the tangent line, and tangent line

of course we understand geometrically very clearly,

we choose a particular point, so let's say

x0, right, but we never talk about the function which

gives us this tangent line as the graph.

Somehow we usually ignore this question.

So the yellow curve is the graph of this function.

But the tangent line is also the graph of a function, a much

simpler function, which is actually a linear function,

and this function is the differential.

This one is also the graph of a function, namely df.

This is what we mean by df.

Let me write it in words.

Namely the differential of f at this point.

That's what the differential is.

The only subtle point is that for the differential we shift

the coordinates -- we choose a new coordinate system where

the origin is at our point, that's all.

In other words, we view now this line as a graph with

respect to a new coordinate system where the origin is not

here -- it's not at some arbitrary point, but actually

at the point which we are analyzing.

And if you write down the function whose graph this

tangent line is, you will have precisely the differential.

That's how it works for functions in one variable.
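
To make this concrete, here is a minimal sketch of the differential of a function in one variable as a linear function of the shifted coordinate dx. The sample function, the point x0 = 1 and the helper name `differential` are assumptions made for the illustration.

```python
import math

def f(x):
    # Hypothetical sample function.
    return math.sin(x) + x**2

def f_prime(x):
    # Its derivative, computed by hand.
    return math.cos(x) + 2 * x

def differential(x0):
    """Return df at x0: the linear function dx -> f'(x0) * dx.

    Its graph, drawn in coordinates whose origin is the point (x0, f(x0)),
    is the tangent line to the graph of f at that point.
    """
    slope = f_prime(x0)
    return lambda dx: slope * dx

x0 = 1.0
df = differential(x0)

# Compare the true change of f with the value of the differential.
steps = (0.1, 0.01, 0.001)
errors = [abs((f(x0 + dx) - f(x0)) - df(dx)) for dx in steps]
for dx, err in zip(steps, errors):
    print(dx, err)
```

When dx shrinks by a factor of 10, the error shrinks by roughly a factor of 100. This is the sense in which the tangent line is a very useful approximation on a very small scale.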

And now it's absolutely clear what will happen for

functions in two variables.

For functions in two variables, we are going to look at the

tangent plane to this graph, which is represented here, or

if you wish, that's the tangent plane which I was

talking about.

And then you will think of this tangent plane also as a graph

of a function, of a linear function, and that linear

function is the differential of the function in two variables

you started with.

So that's the short version of this, and I will give you

more details on Thursday.
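
As a preview under stated assumptions, the sketch below writes that linear function down explicitly. It uses the standard fact that the tangent plane is spanned by the two tangent lines whose slopes are the partial derivatives, so the linear function sends (dx, dy) to f_x times dx plus f_y times dy. The sample function is the one from the earlier example; the point (1, 0.5) and the helper names are made up, and the details are left to the next lecture.

```python
import math

def f(x, y):
    return x**5 + x * y**3 + math.cos(x) * math.exp(y)

# First partial derivatives, as computed by hand earlier in the lecture.
def f_x(x, y):
    return 5 * x**4 + y**3 - math.sin(x) * math.exp(y)

def f_y(x, y):
    return 3 * x * y**2 + math.cos(x) * math.exp(y)

def differential(a, b):
    """Return the linear function (dx, dy) -> f_x(a, b) * dx + f_y(a, b) * dy.

    In coordinates whose origin is the point (a, b, f(a, b)), its graph
    is the tangent plane to the graph of f at that point.
    """
    slope_x, slope_y = f_x(a, b), f_y(a, b)
    return lambda dx, dy: slope_x * dx + slope_y * dy

a, b = 1.0, 0.5
df = differential(a, b)

def gap(h):
    """Distance between the graph and the tangent plane after moving by (h, h)."""
    return abs(f(a + h, b + h) - f(a, b) - df(h, h))

print(gap(0.1), gap(0.01))
```

Moving only in the x direction, `df(dx, 0)` reduces to the slope of the first tangent line times dx, and likewise for y, which matches the picture of the two lines on the tangent plane.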

Sorry?

No, I'm working out but walking out.

I will be here on Thursday, so inasmuch as I sympathize with

the organizers, and I think that they really have a very

valid point, they raise a very valid point, I also feel that

my job is to be here, so I will be here Thursday.

Any other questions?

Perhaps even more important, next Thursday we'll have

our first mid-term exam.

Yeah, everybody's excited.

So I already put an announcement on the bSpace

page, which, by the way, you should check bSpace site

because there are all kind of interesting stuff in there,

and useful information also.

So, homework solutions, something I have already told

you about, but also there will be some information

about the exam.

This week I'm going to post a mock mid-term so that you will

have a chance to practice for the mid-term, and we'll have a

review lecture next Tuesday, a week from today.

And the exam is on Thursday exactly during the class hour

-- one hour, 20 minutes, more precisely.

So, we'll start at 3:40 and finish at 5:00.

Because this room is not big enough, you know, because

everybody would be packed, so I have requested another room, so

we'll have to separate into two groups.

I will post information about this and I will tell you about

this and everything about this on Thursday or next Tuesday.

And now I go back to what we talked about last

week, which is limits.

I just want to say a few words about limits, and then we will

move on to the next subject, which is partial derivative.

So about the limits -- too many late arrivals today.

I'm going to charge a late arrival tax if this continues.

So let's just quiet a little bit because it will be good

for everybody if we are more focused.

So limits.

I illustrated limits but by way of an example and I looked at

the particular function in two variables, namely x squared

divided by x squared plus y squared near xy equals 0,0.

And the reason is that at this point the denominator becomes

equal to 0, and so this expression becomes problematic,

it may or may not have a limit.

If we did not have that, if, for example, we looked at this

function near a point xy equals 1, 0 or 1, 1, a point where the

denominator is not equal to 0, that would be an easy

question to handle.

But precisely when the denominator becomes equal to 0,

we have to be careful, and we have to analyze it

more precisely.

And so in this particular case, I explained that this function

doesn't have a limit.

This function does not have a limit at this point.

And I explained this, explained the reason.

The reason is that we could find -- the reason I gave was I

showed two different paths, on the xy plane which approach 0.

One was the path when x is equal to 0, so

it's a green arrow.

And the other one was the path when y is equal to 0, and

that's the red arrow.

And we have seen that along the first path, when x is equal to

0, when we look at this function on this path, very

close to 0 but not quite equal to 0.

What we get is 0 divided by 0 plus y squared.

But as I said, we really look at it not at this point

but just near this point.

So near this point y is not 0.

So this is not 0 near the point.

Because it's near the point, we can actually evaluate this, and

we see that for all values of y near this point, this is

actually just plain 0 because you divide 0

by something non-zero.

And so this means that along this path, there is a

limit and it's equal to 0.

On the other hand, if we look at this path, we get x squared

divided by x squared plus 0, and again, we will assume that

x is very close to 0 but not quite 0 yet.

So this is actually a non-zero expression, and therefore, we

can cancel them out -- this again is non-zero, and

this actually gives us 1.

So that means that along this path, again, there is a limit

along this path, there exists a limit, there is a limit

along this path, namely 1.

So we have found two paths along which we achieve a

different limit, we obtain a different limit.

That means that the function itself does not have a limit,

does not have a limit, because to say that the function has a

limit, if along any path approaching the point the limit

exists, then all of them are the same.

In this particular case, clearly they are not the same

because there are two path, at least two paths along which

the limits are different.

So how can it be that a function has different limits

along different paths, in other words, what is

the meaning of this?

What is the geometric representation of this?

Well, to explain that we can already see an analog of this

phenomenon for functions in one variable, which I

explained last time.

In the case of one variable, we look at the graph of a function

as a geometric representation, and we can have the following

situation where the graph this discontinuous.

So graph is discontinuous, for example, like this.

So there's some point, x, 0, and if you approach from the

left, you're going to end up at this point and this value, and

if you approach on the right you end up with this value.

So, that means that the function does not have a

limit at this point, because along different paths you

get different limits.

They are both finite, so in some sense you can say, you can

argue that it's a more benign situation than let's say

situation over hyperbola where actually it goes to infinity.

Here, by the way, on the right it goes to plus infinity and

the left goes to minus infinity.

So, in some sense it's even worse than just going to

infinity, it actually goes to two different

infinities in some sense.

But first of all, if it goes to infinity along any path,

it's already, we already say it doesn't have a limit.

But even if it's finite, this situation, which is in some

sense more benign, it still doesn't have a limit, it has a

limit from the left, has a limit from the right.

In the case of one variable, there are only two possible

paths which converge to this point.

You can go sort of with different speed, different

velocity, but it doesn't matter, I mean geometrically

it's the same path, either this one or this one, only two.

And already that creates trouble.

In the case of functions in two variable there

are many more paths.

I have drawn two paths here -- here is one, here is another,

but you have many more, right, you also have a path like this.

Finally, a path doesn't have to be straight line,

it could be a spiral.

So, there are many, many different paths, and that's

the difference between two variables and one variable.

To say that the function has a limit for a function in two

variables is a very strong statement, it's a statement

that along any of those path, you're going to achieve

the same limit.

So, this actually shows that a way it's much easier to

disprove something here, to show that the function does not

have a limit than to show that the function does have a limit.

Indeed, to show that it doesn't have a limit, it's sufficient

to just exhibit two different paths along which you have

different limits, and usually it's pretty clear which

ones you should take.

For example, in this case, you just look at

x equal 0, y equals 0.

Sometimes you might need to look at a linear path, like

this one where x is equal to y, again, to convince yourself

that indeed the function doesn't have a limit.

But to prove that actually the function does have a limit is

much more difficult because it wouldn't be enough, for

example, to say that the limit along this path and along

this path is this same.

You would also need to show that the limit along this

path is the same, and along infinitely many other paths.

So to show that it has a limit is more difficult, and this is

not a very efficient way to show it.

It is very efficient -- this way of argument is very

efficient to show that it does not have a limit, because then

it's enough to show just two for which the limits are

different, but to show that it has a limit, it wouldn't be

enough to show that along two paths you get the same answer.

You have show that the same goes

for all paths.

So, for practical purposes, what you need to know is the

argument showing that it doesn't exist.

You need to know this way of argument.

When the limit doesn't exist you should be able to

demonstrate that there exist two paths along which the

limits are different.

Yes?

[INAUDIBLE].

The distance, right, the distance, what really matters

is the distance -- what should matter -- so the question is

what really matters, the shape of the spiral or the shape

of the curve or the angle.

It depends on the situation.

The point is that to have a limit means that as soon as you

get close, say within 1 over 100 of an inch of the origin,

the answer is going to be within some small neighborhood

of the value that you claim is the limit.

It doesn't matter how you approach it, it should

be within that limit.

If it's 1 over 1,000 of an inch it should be even closer, if

it's one over a million should be even closer and so on.

So the notion of a limit should not rely on the way you

approach, it should be about, it should be uniform with

respect to all directions and all points in the neighborhood

of the point, of the point 0, 0.

So, in a sense this argument that I gave you is

kind of misleading.

It's a very nice argument to disprove the existence a

limit, to show that the limit does not exist.

But it is misleading if you try to think in this way about

the existence of limits.

So, for existence of limits, you have to use a different

kind of argument.

And now I'm not going to require that on the exam, but

I'm going to, just to give you an idea how it works, I'm

going to explain it in the following case.

Suppose you have a function which is just slightly

different from this at first glance, namely instead of x

squared I will take x cubed.

You see, so what happened?

So, the problem here, the problem with this function

was that both numerator and denominator had degree two.

So both numerator and denominator in some sense are

going to 0 at roughly the same speed, but not exactly.

It depends along which direction you go.

If you go along this one, it will be 0, along this one it's one,

along this one, for example, if x is equal to y, you can

see it's going to be 1/2.

But because the powers are the same, that's why you end

up with different answers.

What happens now is that I put the numerator, I choose as the

numerator x to the third power, and the third power goes to 0

much faster than second power.

So that's why this will dominate and this will kill

this guy, so it will become 0.

Well, it's hardly kill, because they both go to 0, so in some

sense it goes faster to 0 so it doesn't really kill it,

but it depends on your point of view.

So, how would I show that this actually has a limit?

So I claim, I claim that this function does

have a limit at 0,0.

Namely the limit is equal to 0.

How would I show that?

Well, for that I would actually have to estimate the value of

this function, I would have to estimate the value of this

function when I approach 0.

So I will say let's suppose that xy belongs to the small

disk of some radius r, and I purposefully don't want to use

delta and epsilon, not because I don't like the Greek alphabet,

but because I know that people immediately feel violated when

I try to talk about epsilon and delta.

So, if I use a different letter you will feel much more

comfortable, some of you will feel more comfortable.

There is a certain -- just the way it sounds -- actually, one

of the students told me that it reminds him of going to

a dentist's office, epsilon and delta.

So, let's call it r, and let's look at the disk of radius r.

I'm drawing it as a big disk, but actually you should think

of it as very small, I just magnify it.

So, this is D sub r and I want to look at all the points here,

and I want to estimate, I want to estimate the value

of this function for all points within this disk.

So the function, as I explained already many times is a rule,

which assigns to each of these points a certain number, which

is the value of the function, right, so that's

the function, f.

And I want to estimate the value of the function.

What I want to show is that the closer I get here, the closer

I'm going to get to the neighborhood of 0

value on this line.

So how can I estimate this?

Well, what does it mean that a point belongs to this disk?

It means that x squared, the square root of x squared

plus y squared is less than or equal to r.

That's what it means, because that's the distance, that's how

we measure the distance to 0. To belong to a disk of radius r

means that for your coordinates x and y, the square root of x

squared plus y squared is less than or equal to r.

So that means that x squared plus y squared is less than

or equal to r squared.

But you see both of these are positive, so if the x squared

plus y squared together are less than r squared,

this also implies that x squared is less than or equal to r squared,

so the absolute value of x is actually less than or equal to r.

This is actually also clear geometrically because if you

have all points in the disk, you will see that all of them

will have the x coordinate less than r, except for the two

points which lie on the intersection of the

circle with the x-axis.

So, I know this as soon as my point is in D sub r,

and I also know this.

So now, what I would like to do is I would like to

write the following.

I would like to write x cubed divided by x squared plus

y squared, and I want to measure the absolute value.

What matters is that absolute value.

When I say it's close to 0, it doesn't have to be

positive or negative.

It should be very close to 0 in either side.

So that's why, to estimate, it's better to take the

absolute value rather than just the value.

And so now I want to write it like this.

I want to write as x times x squared divided by x squared

plus y squared -- I've done nothing, I just pulled

one x out of this fraction.

And then I want to write it as the absolute value of x times x squared divided by

x squared plus y squared.

So now I want to look at this, and clearly, this is less than

or equal to 1, because in the numerator I have x squared,

but in the denominator I have x

squared plus y squared, and this guy's always

positive or 0.

So the largest value this can attain is 1, when y squared is

0, but if y squared is not 0, the numerator will be strictly

less than denominator.

So, this fraction is less than or equal to 1, and x, as I have

just explained, is less than r -- less than or equal to r.

So that means that the whole thing is less

than or equal to r.

So you see I have been able to effectively estimate the

value of this function.

I can not find it for all x and y -- I mean they're

all different for different x and y.

But I can say for sure that as long as x and y, as long as the

point xy is within that disk of radius r, the value is going to

be less than r -- less than or equal to, it doesn't matter.

So that means that if I make this disk smaller and smaller,

in other words, as I take r closer and closer to 0, this

value of the function -- more precisely the absolute value of

the function, will also tend to 0, because for all points

within that disk, all points, this value is going to be less

than or equal to r, and I have control over r because I

can take my points in a smaller and smaller disk.

Nowhere in this argument am I talking about particular paths

approaching 0, I'm talking about the entire disk of radius

r, and then I'm squeezing that disk, I'm taking smaller and

smaller and smaller, and by being able to use this estimate

I can control the value as a function of the radius.

I can say that if I am within the radius 1 inch, the value

of the function is going to be less than 1.

If I'm within the radius one over 100 of an inch, it's

going to be less than one over 100 -- or one over a million.

In other words, I can get as close as I want to 0 for

the value by choosing a sufficiently small disk.

Now that's the argument, that's the proof that this function

has a limit and the limit is equal to 0.

And that's what traditionally mathematicians explain in

the epsilon delta language.

But that's all it is, that's all it is is just saying that

if you take a disk of radius r, then the value of the function

for all points within that disk is going to be less than r or

less than something which becomes smaller with r.

So that's the argument.
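
The estimate can also be checked numerically. The following is a minimal Python sketch, assuming we sample points of the disk in polar form; the name max_abs_on_disk and the sampling density are choices made for this illustration. Sampling is only a sanity check of the inequality, not a replacement for the proof.

```python
import math

def f(x, y):
    # x^3 / (x^2 + y^2), undefined at the origin.
    return x**3 / (x**2 + y**2)

def max_abs_on_disk(r, n=200):
    # Sample points inside the disk of radius r (origin excluded)
    # and return the largest |f| that was seen.
    largest = 0.0
    for i in range(1, n + 1):
        rho = r * i / n                      # distance from the origin, 0 < rho <= r
        for j in range(n):
            theta = 2 * math.pi * j / n
            x, y = rho * math.cos(theta), rho * math.sin(theta)
            largest = max(largest, abs(f(x, y)))
    return largest

for r in (1.0, 0.01, 1e-6):
    # The sampled maximum does not exceed r (up to rounding), so it shrinks with r.
    print(r, max_abs_on_disk(r))
```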

Are there any questions about this?

Yes?

So, they ultimately have to prove that that function

will be smaller than r.

That's right.

In general it will not necessarily be r, it could be

-- let's suppose I prove, let's suppose I had a different

function, for example, I had 2 times x cubed, then I would

prove it's less than 2r.

That would still be OK, because I have to get something which

will become smaller and smaller as r becomes smaller.

Or if I get r to the 1/2, the square root of r, I would

also be OK, or r squared.

It will not be OK if I just say that it's less than 1, that the

whole thing is less than 1.

That wouldn't help me.

That would just tell me that I get within a certain range, but

that range doesn't get smaller as the domain gets smaller.

I have to make sure that the range will get smaller as

the domain gets smaller.

And that's a perfect situation where I didn't have to

make any adjustments.

I get r on the nose as the estimate, and the largest

possible value for this.
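
To see why a bound that does not shrink is useless, here is a small Python sketch comparing the two functions on circles of shrinking radius r. The helper sampled_max and the names quadratic and cubic are assumptions made for this illustration.

```python
import math

def sampled_max(func, r, n=360):
    # Largest |func| over n sample points on the circle of radius r.
    return max(
        abs(func(r * math.cos(2 * math.pi * k / n),
                 r * math.sin(2 * math.pi * k / n)))
        for k in range(n)
    )

def quadratic(x, y):
    # Bounded by 1 everywhere, but has no limit at the origin.
    return x**2 / (x**2 + y**2)

def cubic(x, y):
    # Bounded by r on the disk of radius r, so the limit at the origin is 0.
    return x**3 / (x**2 + y**2)

for r in (1.0, 1e-2, 1e-4):
    # First maximum stays at about 1; second maximum shrinks like r.
    print(r, sampled_max(quadratic, r), sampled_max(cubic, r))
```

The bound "less than 1" holds for the first function at every radius, and that is exactly why it says nothing about a limit.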

Another question?

[INAUDIBLE]?

Would that only work -- the question is whether this

is a general argument.

This is not a general case because in general you're

going to have maybe some polynomial in x and y

with additional terms.

I have really looked at the simplest case.

But in general the argument is going to be very similar and if

you like in the book there are more examples of this

type which are analyzed.

But as I said, I'm not going to require you to know this, so my

point here is just to give you an idea of how this

kind of proof works.

And I think that even though this is the simplest example,

it already illustrates this idea.

Yes?

[INAUDIBLE].

L'Hopital's Rule.

Well, L'Hopital's rule is really specific to functions

in one variable because you have to differentiate.

So, the question was about L'Hopital's Rule, which was

one of the powerful methods for finding limits of

functions in one variable.

And so the way it works is that if you have a function which is

say p of x divided by q of x and both of these tend to 0,

say, and you don't know what the limit is, you can estimate

the limit by taking derivative.

So now the situation is different because now we have,

this is one variable case, and now we have two variables, and

in two variables we have functions in two, say,

p of xy and q of xy.

So let's say I wanted to generalize this, I would have

to take some sort of derivative here, right, and this actually

brings out the question which we study next, which is what

kind of derivatives can we do for functions in two

and three variables.

So clearly, there isn't a single derivative because

derivative is about the rate of change, the rate of change.

In a 1-dimensional case there is only one direction in which

you can change your variables.

You can increase that -- you know you can

just go away from x.

Apart from the fact that you can change it going left and

right, there is essentially only one way to change it.

More precisely, there is only one degree of freedom that you

can change, you can only change in one direction.

But in two variables there are more directions in which you

can change and estimate the rate of change, and therefore,

there are many more derivatives that are possible.

So in fact this is very close to what we discussed up to now.

There are so many different ways to approach 0, and we

have to be able to take care of all of them.

So, what L'Hopital's Rule in two dimensions would give us at

best is a way to approximate the limit along a

particular direction.

So let's say if I were to take the derivative with respect to

the x direction, then I would be estimating what happens when

I approach along this line.

And if I were to take the derivative with respect to y, I

would be estimating along the green line, along the y-axis.

But neither would give me the full picture.

The full picture I can only get as I have argued, by sort of

looking at paths, right, looking at all possible directions.

So in that sense the L'Hopital's Rule

doesn't help us.

Now, sometimes, sometimes it happens that you can convert

this function that you have in two variables, into a

function in one variable.

One of the exercises, I think it's the last one on homework

is about this where you can use polar coordinates to realize

that the function you have is something like -- it involves

x squared plus y squared.

I think it's something like logarithm of y squared times

x squared plus y squared, something like this.

So you realize that actually it only looks like it's a function

in two variables, but actually, it's a function in one

variable, namely this.

And then you are back to one variable case, and then it

becomes a fair game to use all the methods that you know

in one variable case.

A general function is not going to be like this.

For example, this one is not like this.

So, I cannot directly use L'Hopital's Rule for two

variables, and there isn't any obvious way to use it because

there is more than one possible derivative, which actually

brings up the question as to what are possible derivatives

for functions in two variables, and that's our next subject.

So, I have already kind of alluded to the answer.

Because there are two variables, we can actually

differentiate with respect to either one of them and we get a

meaningful derivative, and these are called partial derivatives.

So what are partial derivatives?

So we have a function, let's say we have a function f of xy

in two variables, and when we talk about derivatives, we

should, first of all we should fix the point, we should fix

the point at which we are taking the derivative, because

for a different point you have different derivative.

The same thing happens for functions in one variable.

So let's say we have a point which has coordinates a and b.

What we can do is we can convert this function into a

function in one variable, but by freezing one

of the variables.

So, for example, we can say y is equal to b -- freeze the second

variable and say that y, the variable y is equal to b,

which is the second coordinate of this point.

So what we get then is a f of x and b, and let me indicate the

fact that I have frozen, I have frozen this by red, so red

would be a fixed value.

So here also would be, I will put them in red to indicate

that these are numbers like 1 or 5 or 27 over 11,

whatever you want.

But x is a variable, so for x you can plug in any number you want

and you'll get an answer.

So you want to view it still as a function, but because you

have frozen one of the two variables, there is only one

variable that's remaining.

Therefore, what you get is a function in one variable only; let's call it g of x.

And once we get a function in one variable, we can then

differentiate it just in the usual way how we differentiate

functions in one variable.

Differentiate it, so we'll get a g-prime and then we can

substitute the value that we wanted, the value a.

So then finally we get a number.

So in other words, we have a function in two variables,

first we freeze one of the two variables, and then we take the

derivative with respect to the remaining variable at the

particular value of that variable, namely a.

So the result of this is what's called the first partial

derivative, or partial derivative with respect to x.

Partial derivative with respect to x at this

point, at the point ab.

And the notation for this is f sub x of ab.

Likewise, I could freeze the second variable -- I mean the

first variable x, I could say x is equal to a.

Then I get a function again in one variable where the first

variable is frozen but the second one is free.

So I get a function of one variable, let's call it h, and

then I can differentiate it.

So what I get is this h-prime of b.

And that's called the partial derivative with respect to y

for which the notation -- I'm just abbreviating the same

sentences I have at the top of this board.

The notation for this is, obviously, f sub y of ab.

So we got two derivatives for functions in two variables,

the first and the second.
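
The freeze-then-differentiate recipe can be written out directly. This is a minimal Python sketch, assuming a central-difference approximation for the one-variable derivative; the helper names derivative, partial_x, partial_y and the test function x squared times y are hypothetical and not from the lecture.

```python
def derivative(g, t, step=1e-6):
    # Central-difference approximation to the one-variable derivative g'(t).
    return (g(t + step) - g(t - step)) / (2 * step)

def partial_x(f, a, b):
    g = lambda x: f(x, b)      # freeze y = b; only x is left free
    return derivative(g, a)    # g'(a) approximates f sub x of (a, b)

def partial_y(f, a, b):
    h = lambda y: f(a, y)      # freeze x = a; only y is left free
    return derivative(h, b)    # h'(b) approximates f sub y of (a, b)

# Hypothetical test function: f(x, y) = x^2 * y.
f = lambda x, y: x**2 * y
print(partial_x(f, 3.0, 2.0))  # close to 2*x*y = 12
print(partial_y(f, 3.0, 2.0))  # close to x^2 = 9
```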

So now let's look at the example of what

this looks like.

Let's say f of xy is x to the 5 plus x times y cubed plus

cosine x times e to the y.

And we would like to find the partial derivatives.

Now, when I defined them, I was insisting that the value of --

that the derivative has to be evaluated for particular values

of a and b -- sorry, particular values of x and y, which

I denoted by a and b.

Just like in the case of a function in one variable, let's

say if you have a function, let's say for a function in one

variable, say f of x is equal to x cubed.

I could say that the f-prime for any value a will be --

well, will be 3a squared, right.

That's the rule, because I know the rule.

The rule is that the derivative of x cubed is three times x

squared, and then if I substitute x equals

a, then I get this.

So usually we don't write it like this.

Usually we just write f-prime of x is x squared.

In other words, we would like to look at not just one value

of the derivative for particular value of x, namely

a, but at all of them.

For all possible values of x, we would like to know what the

value of the derivative is, and then we can substitute x equals

a, for example, 1/2, then you will get -- I'm sorry, I forgot

3, 3x squared, and no one corrects me or at least I

didn't hear. 3x squared, of course, for the function, and

we substitute x equals a and we get 3a squared.

But it's too pedantic to go this long way each time and say

well, if I ask what is the derivative of this function,

you say well, for a given value a, the derivative at the point

a is going to be 3a squared.

Instead we just write f-prime of x is 3x squared.

And what is understood is that if I want the value

for a particular a, I'll just plug a into this

formula and I'll get the answer, right.

So we will use the same shorthand for

partial derivatives.

In other words, we will not be, I will not be writing each time

fx of ab, I will just write fx of xy.

I will write just fx sometimes, and that would be

a function of x and y so that if I substitute a instead of x,

b instead of y, I will get the value of the derivative, of

this particular partial derivative at that point.

So let's see how it works in this case.

In fact, nothing could be easier.

You just look at this function and in order to calculate the

partial derivative with respect to x, you just view y as a

parameter, but not a variable.

This is exactly what I meant when I said that we freeze y.

It just means that we view y as a parameter.

And then you just differentiate what you see.

Well, what do you see, you see x to the 5, so you get 5 x to

the fourth, plus this, you differentiate this, it's y cubed,

plus differentiate this, you get negative sine x

times e to the y.

And that's it.

That's the answer.

That's the way you write the answer.

Now, if you are given some x and y, some values for x and

y, like a and b, you can substitute them and

you'll get a number.

But in fact you can view this first partial derivative

with respect to x as a function of x and y, which I just

obtained in this way.

Likewise, we view x as a parameter, and then take the

derivative with respect to y.

So if x is a parameter, then from this point of view

this is just a constant.

It's independent of y.

Therefore, its derivative is 0.

So it's going to be 0, plus here x is also a parameter, so we

just differentiate y cubed, so we get 3x y squared.

Of course, cosine x is also a constant, and the derivative

of e to the y is e to the y, so the last term stays cosine x times e to the y.

So that's the answer for the second partial derivative.
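
Both answers can be sanity-checked against finite differences. This is a minimal Python sketch; the sample point (a, b) and the step size are arbitrary choices made for this illustration.

```python
import math

def f(x, y):
    # The example: x^5 + x*y^3 + cos(x)*e^y.
    return x**5 + x * y**3 + math.cos(x) * math.exp(y)

def f_x(x, y):
    # Partial derivative with respect to x (y treated as a parameter).
    return 5 * x**4 + y**3 - math.sin(x) * math.exp(y)

def f_y(x, y):
    # Partial derivative with respect to y (x treated as a parameter).
    return 3 * x * y**2 + math.cos(x) * math.exp(y)

a, b, step = 1.2, 0.7, 1e-6   # arbitrary sample point and step size
approx_x = (f(a + step, b) - f(a - step, b)) / (2 * step)
approx_y = (f(a, b + step) - f(a, b - step)) / (2 * step)

print(f_x(a, b), approx_x)   # the two numbers should agree closely
print(f_y(a, b), approx_y)   # the two numbers should agree closely
```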

Is that clear?

So this is really straightforward.

You only need to know how to differentiate functions

in one variable.

Yes?

[INAUDIBLE].

Why doesn't cosine go away?

In which one?

In this one.

Well, let's suppose instead of this you had

5 times e to the y.

Then the derivative would still be 5 e to the y, or any other

constant would just show up as overall factor.

So in this case, the constant is cosine x.

That's what I mean when I say we treat x as a parameter.

If we treat x as a parameter it's treated as a number, and

so any expression involving x, like cosine x, is

a fixed number.

So it just shows up as an extra factor.

Any other questions?

So next I would like to explain the geometric meaning of this,

because as you see in this course, algebra and geometry go

hand-in-hand, and all of the concepts that we discuss

algebraically, they have geometric interpretation,

which is very important.

So, for functions in one variable, the derivative of the

function has to do with the slope of the tangent line.

In one variable, the derivative gives the slope of a

tangent line to the graph.

And the way we draw it is like this.

We have xy plane, we have a function, f of x, we have y

equals f of x with the graph.

Note again that y here has a totally different meaning than

y in here. y in here is a second variable, so it's on

equal footing with x -- x and y are two independent variables.

But now I'm talking about functions in one variable, so

there's only one variable x, and y is not a variable, it's

actually -- it denotes the value of the function.

I already talked about this.

It's an unfortunate choice of notation, but that's how it is

so I'm not going to change it.

So, we pick a point, let's say x equals a and we draw a

tangent line, and we know that the derivative, let's say the

angle is theta, that the tangent of theta is the

derivative f-prime of a.

That's the geometric meaning of the derivative of the

functions in one variable.

So then it's natural to ask what is the meaning for

functions in two variables.

To understand that, we have to look at the graph of the

function in two variables.

What does that look like?

Well, for a function in one variable, a graph is a curve on

the plane, and I already talked about it many times why do we

need a plane, because to represent the graph, you have

to have your variables and you have to throw in one additional

variable which will represent the value of the function.

For a function in two variables, there are already two variables

to begin with, and to draw a graph we have to throw in one

more, one more variable, which will represent the

value of the function.

So, as a result, the graph of a function in two variables is

going to live in 3-dimensional space, so it will have

coordinates x, y and z, and we will have a graph of this

function which will be a surface.

So I would like to just draw part of it, which lives

in the first octant.

On the plane this coordinate system breaks the plane

into four pieces, which are called

quadrants, right, because there are four of them.

In space the coordinate planes break the entire 3-dimensional

space into 8 pieces, which are called octants.

So this is one octant, it's looking at us like this.

And so the graph actually lives everywhere, but I have just

drawn the intersection of the graph with each of the

coordinate planes.

And so you should think of this as something like a dome,

like a sphere, like part of a sphere.

It's not necessarily a sphere, I'm just -- just like

this is not a circle.

I mean I'm just doing a sample graph.

And to emphasize this, I want to show a particular

point on this.

So let's say I take this point, and so this

point has coordinates.

To find the coordinates I have to drop a perpendicular onto the xy

plane, so that's going to look like this, and then that's the

z coordinate, maybe a little higher.

So this point is -- what is this point?

So this point has coordinates a and b, and the third coordinate

is a value of the function, because it lives on

this yellow surface.

I don't want to shade it because otherwise it will not

be clear what am I shading, am I shading this, am I shading

the plane and so on.

So just try to imagine that there is something here which

looks like it's part of a sphere, and that's the

point which belongs to it.

And that's the graph of a function, f of xy, so it's

defined by the equation z equals f of xy.

And this is a particular point, let's go with p which has

coordinates ab -- these are given, these are given, these

are just the values of x and y.

And what about the z coordinate?

Well, since it's a graph, the z coordinate has to be f of

the x and y coordinates.

So that means that I have f of ab.

So that's what this point is.

So this is f of ab.

Is it clear so far?

So now what is the slope?

What is the slope of the graph?

Well, first of all, it's not the slope of a graph, it's

the slope of a tangent line.

So here, actually, it doesn't make sense to talk about the

tangent line to the graph, because a line is

1-dimensional, and the graph is 2-dimensional.

How can it not be 2-dimensional if we have a function

in two variables.

A function in one variable will have a graph which is a curve,

but a function in two variables has a graph which is a surface,

so it's 2-dimensional.

So it doesn't make sense to talk about a tangent line

unless we make some choices, give some additional

information.

So in fact, the proper notion here is a tangent plane, and

this is something we'll talk about on Thursday.

So that's really ultimately what we would like to

understand is the analog of this picture in 2-dimension and

the full -- to get the full analog of this picture, we

should really talk about the tangent plane.

But for now I have a more limited goal.

I want to illustrate the concept of partial derivatives.

And when I talked about partial derivatives, I said that I

freeze one of the two variables, and then I basically

go back to the 1-dimensional case, so the case of a

function in one variable.

So that's what I would like to do.

I don't want to talk immediately about the tangent plane, the

entire tangent plane, I want to see what I get when I freeze

one of the variables.

So in my algebraic calculation on that board,

I first froze the second variable y.

So what happens if I freeze y?

If I freeze y it means that I look at the part of the graph

which has a fixed y coordinate, namely b, so it means that

I cut this graph with the plane, which is y equals b.

So the result is going to look something like this.

So the blue, this blue, is the intersection with

the plane y equals b.

That's what I get.

So now, instead of a surface, I get a curve.

This intersection is actually a curve because now y is frozen,

y is equal to b, and so it's out of the game.

So the game now is between x and z.

And it's the same game -- the same game as for

functions in one variable.

So, in fact, I can draw this curve as a graph of the

function in one variable x, which I get by

substituting y equals b.

This is, by the way, the function which I called

g of x on that board.

So let me draw this.

So now, as I said, I only have x and z variables remaining,

and this blue curve is going to look like this, and of course,

it continues somewhere, but since I didn't draw it on the

big picture, I'm not going to draw it much beyond

the first quadrant.

It would be tempting to draw it here.

I know you might be wondering why am I drawing it like this

and not like this, but the point is you have to look at it

not from this angle but from the back of the blackboard

where x and z become the oriented coordinate system.

You see what I mean?

You have to turn this, you have to turn this coordinate system

like this, 90 degrees in this way to make x go to the right

and z go vertically up.

Go up.

If I look like this it would be x will go here,

so I don't want it.

I want to look like this.

And that's what I will see.

If I turn this this is what I will see.

So this is, in fact, the graph, this is a graph of what I

called g of x, which is obtained by taking f of xb.

It is part of the surface which is a graph of the entire

function, but I have frozen one of the variables, so actually

I was able to reduce my problem to the problem of

a function in one variable.

I get a graph of a function in one variable, namely

z equals g of x.

And now I can calculate for the value of x equal to a.

I can calculate the slope of the tangent line.

So let me draw this tangent line in white, so there is this

tangent line and it has a slope angle, and the tangent of that

angle is the derivative g-prime at the point a, which is what we

call the first partial derivative of a function

f at the point ab.

You see what I mean?

I'm doing geometrically here precisely what I did

algebraically on this board.

Algebraically I freeze one of the variables, I get a function

in one variable, it's called g of x and I differentiate

it at x equal a.

Now I'm doing the same geometrically.

Freezing the second variable means intersecting the graph

with the plane y equals b, then I'm down to two variables.

I can look at it as a graph of a function in one variable, and

then I look at the tangent line to this graph at this point,

and I measure the slope on this tangent line.

The slope is that derivative which we were looking for,

namely the partial derivative with respect to x.

Any questions?

So let me draw it now on this board.

So the tangent line that I drew over there is going

to look like this.

That's the tangent line.

It's not the entire tangent plane, it's one line

on that tangent plane.

If you think of the tangent plane as this -- it doesn't

want to turn anymore.

I didn't know that it had some knobs and some

things to play with.

If this is the tangent plane, then I have drawn just one

line on it, and that's the intersection with the xy plane.

Maybe it's better to draw it like this.

If you think of the -- not the xy plane, sorry, it's

a plane of y equals b.

So think of the y equal b plane as this vertical, kind of

vertical plane, then that's the tangent line that I got.

So this green plane is not yet on the picture.

I have not drawn it yet.

I have only drawn this.

And so now I'm going to draw the second one, so that's my

point, that's what point -- yellow.

And now I will talk about the second tangent line, which

corresponds to freezing the other variable.

So this corresponds to y equals b.

And this corresponds to x equals a.

So, I intersect now with the plane x equals a, and I'm going

to use a different color for this, so it's going to

be something like this.

And now, so the red curve is the intersection with

the plane x equals a.

So x equals a would be like this.

So what's the tangent line to this?

Well, this

already looks like tangent line, but I want to erase this

part so we don't get confused.

The second tangent line is going to look like this, and

that's the second white line which I drew on that board.

So, if you want, the tangent plane looks like this,

this is a tangent plane.

The tangent plane is spanned by both of these lines,

both of the tangent lines.

So the tangent plane is the green board, but I can not put

it there, so think of the graph as being a kind of a part of a

sphere which is just below this plane, so that this plane just

touches it, it's just the tangent plane

to this graph.

But on this tangent plane I can clearly distinguish two lines.

One of them corresponds to the intersection with the y equal

b plane and the other one with the x equal a plane.

When I intersect

with those planes, I get pictures just like this.

This is the first one and here's a second one.

In the second one, I have two variables left, also, but those

variables are y and z because I have fixed x now -- x is equal

to a, but y remains a free variable.

So I'm talking about this red curve, and this red

curve looks like this.

The way I have drawn it, it looks almost identical, the

blue and red, but I just did it to simplify the picture.

Of course, in general they're going to be totally different.

And on this curve, I pick the value of y equal b -- so, this

is my yellow, emphasize -- I forgot to put the yellow in

that place, but I'm sure you understand it.

And then the tangent line is here, and then maybe let's

call this theta prime.

The tangent of the theta prime -- ah, maybe I should say that

this is a graph, z equals h of y, which is f of a,y.

And the tangent is the derivative h-prime of

b, which is fy of ab.

So this line, this tangent line I have drawn here is the

tangent line to the red curve on the graph.

So that's the picture.

So to summarize this, in the case of one variable you only

have one derivative, and that one derivative corresponds to

the slope of the tangent line to the graph at a given point.

In two variables you have a tangent plane, and what partial

derivatives give you, they give you the slopes of two tangent

lines which belong to this plane, namely the lines which

are obtained -- like these two lines -- the two lines which

are obtained by intersecting the tangent plane with

the plane y equal b, or the plane x equal a.

That's the idea.

So we just kind of look at it from two different angles.

We'll look at this tangent plane from two different

angles, and what we get is two different lines, and once

you get lines you can talk about slopes.

You can not talk about the slope of a whole plane --

a plane doesn't have a slope, lines have slopes.

And so, there are sort of two independent slopes that we can

talk about, one with respect to x and one with respect to y,

and they correspond to the two partial derivatives, one with

respect to x and one with respect to y.
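
As a rough written summary of this picture, the two slopes can be recorded as follows. The name g for the slice along y = b is an assumption introduced here for illustration; h is the name used above for the slice along x = a.

```latex
\begin{align*}
g(x) &= f(x, b), & g'(a) &= f_x(a, b) && \text{slope of the tangent line in the plane } y = b,\\
h(y) &= f(a, y), & h'(b) &= f_y(a, b) && \text{slope of the tangent line in the plane } x = a.
\end{align*}
```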

Any questions?

Yes?

[INAUDIBLE].

Not a good notation, because prime is for derivative

-- let's go with tilde.

I just wanted to distinguish from the other theta, I

wanted to make sure it's not the same as that theta.

But putting prime is like the worst possible notation because

then it looks like I'm taking the derivative of

theta, which I'm not.

Even better to call it something else, epsilon maybe.

Alpha, that's a good compromise.

It's Greek but not epsilon or delta, which are taboo.

OK.

So what else can we do?

In the case of a function in one variable, we could also

take further derivatives.

We don't have to take just one derivative, we can take the

second derivative, third derivative and so on.

So it's natural to ask whether we can do something similar for

functions in two variables.

And the answer is yes.

We can also take, for example, a second derivative.

So, in other words, we start with the function f of x and y,

and then we can take the first derivative with respect to x

and we can take derivative with respect to y.

So we get two new functions; I'll give you an

example of this for a particular case of f.

So both of them are also functions in two variables, so

we can again apply the same procedure and do partial

derivatives for this function.

Then if we go this way, we obtain fxx of xy.

What do I mean by this?

I mean that I take f sub x, this function, and I take

the derivative with respect to x one more time.

That means again freezing y and then taking derivative

with respect to x.

OK, if I go this way, I get fyy of xy, which means I take fy,

the derivative of f with respect to y, and I take the

derivative with respect to y one more time.

But, of course, I can also do mixed derivatives.

For example, here I can take this and can take

the derivative of this with respect to y.

So that I will denote as fxy of xy.

That means taking first the derivative with respect to

x and then with respect to y.

But I can also do fyx, which is first with respect to

y and then with respect to x.

And then, of course, the natural question is whether

I get the same answer if I apply these derivatives

in a different order.

The question is whether these are actually equal.

And in my example, in my example, let's

see if I remember.

I think it was f of xy equals x to the 5 -- what was it? -- plus xy cubed

plus cosine x e to the y.

So what will be -- so let me write f of xy, so then

derivative with respect to x was 5x to the 4 plus y cubed

minus sine x e to the y.

If I do one more derivative I get 20x cubed.

Now y cubed is a constant as a function of x.

I view y as a parameter, so it doesn't depend on the x,

therefore, the derivative vanishes, so it disappears.

And then I take one more derivative of sine, I get

cosine, so this term is minus cosine x times e to the y.

On the other hand, I can take derivative with respect to y,

I get 3x y squared plus cosine x e to the y.

One more derivative -- 6xy plus cosine x e to the y.
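
Written out in symbols, the example and the derivatives computed so far would read as follows. This is reconstructed from the spoken formulas, so treat it as a sketch; note the minus sign in f_xx, which comes from differentiating minus sine.

```latex
\begin{align*}
f(x, y) &= x^5 + x y^3 + \cos x \, e^y,\\
f_x(x, y) &= 5x^4 + y^3 - \sin x \, e^y, & f_{xx}(x, y) &= 20x^3 - \cos x \, e^y,\\
f_y(x, y) &= 3x y^2 + \cos x \, e^y, & f_{yy}(x, y) &= 6xy + \cos x \, e^y.
\end{align*}
```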

And now the most interesting thing, I can -- let me do it

like this so that we don't lose track of where we are.

So first we take the derivative of this with respect to y, this

disappears, this becomes 3y squared minus sine x e to

the y, so that's this way.

And if I go this way -- I'm sorry, not this way, this way.

I get the same, right, I get 3y squared and I take the

derivative with respect to x, so minus sine x e to the y.

So clearly I get the same answer.
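
One way to double-check this by machine is a small numerical experiment. The sketch below is not part of the lecture: it approximates the two mixed derivatives with central differences and compares them with 3y^2 - sin(x) e^y at an arbitrarily chosen point. The helper names d_dx and d_dy, the step size h, and the sample point are all choices made here for illustration.

```python
import math

def f(x, y):
    # The example function: x^5 + x*y^3 + cos(x)*e^y
    return x**5 + x * y**3 + math.cos(x) * math.exp(y)

def d_dx(g, h=1e-4):
    # Central-difference approximation of the partial derivative in x (y frozen).
    return lambda x, y: (g(x + h, y) - g(x - h, y)) / (2 * h)

def d_dy(g, h=1e-4):
    # Central-difference approximation of the partial derivative in y (x frozen).
    return lambda x, y: (g(x, y + h) - g(x, y - h)) / (2 * h)

f_xy = d_dy(d_dx(f))  # first x, then y
f_yx = d_dx(d_dy(f))  # first y, then x

def mixed_exact(x, y):
    # The formula obtained by hand for both mixed derivatives.
    return 3 * y**2 - math.sin(x) * math.exp(y)

x0, y0 = 0.7, -0.3  # arbitrary sample point
print(f_xy(x0, y0), f_yx(x0, y0), mixed_exact(x0, y0))
```

The three printed numbers should agree to several decimal places; the small discrepancies come from the finite step size and rounding, not from the order of differentiation.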

So this is actually a general result, which

is called Clairaut.

Or I guess if I pronounce it in French, with a French

accent, it will be Clairaut.

So the Clairaut theorem, which says that under favorable

conditions, which is essentially the condition that

in a small neighborhood of a given point, you have all

partial derivatives which are continuous functions up

to the second order.

Under these favorable conditions, the two mixed

derivatives are the same.

So this is, in fact, Clairaut's theorem under some

conditions of continuity.

And in all our examples, this

will be satisfied.
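
For reference, the statement can be written compactly as below. The wording of the hypothesis (a disk D around the point) is the usual textbook formulation and is an assumption about what "favorable conditions" means here.

```latex
If $f$ is defined on a disk $D$ containing $(a, b)$, and $f_{xy}$ and $f_{yx}$
are both continuous on $D$, then
\[
  f_{xy}(a, b) = f_{yx}(a, b).
\]
```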

So this is actually great because what it means is that

if you think of a way of doing partial derivatives for

functions in two variables as explained, where if you go this

way you differentiate with respect to x, and if you go

this way you differentiate with respect to y.

Right, you could do that.

We could continue this picture.

I can go one more step; it will be like fxxx, or I could go this

way and it will be fxxy -- always the last one, the

new one is the last one.

And then if I go more, it will be, for example, fyx.

But the point is that it doesn't matter in which order

you take, what matters is how many times you differentiated with respect to x and

how many times you differentiated with respect to y.

So for instance, fyx is equal to fxy, but likewise, fxyx is

the same as fxxy, the same as fyxx, again, under favorable

conditions when functions in question are continuous

and differentiable.

So all that matters is not the order, but the number of times

you differentiate with respect to x and y, which is kind of

nice so it has the same commutative structure as the

structure that you have for the variables x and y themselves.

In fact, differentiation is in some sense the process which is

opposite to the process of multiplication by x or y.

So that you have two operations of multiplication by x and y,

but you also have two operations of differentiation

by x and by y.

And multiplication by x and y commute -- two multiplications

commute, and the derivatives also commute.

Which, by the way, actually is kind of a better notation

for this iteration.

Because for now the iteration is denoted by inscribing

this additional subscript next to the function.

But there is another notation, so I go back to this, this is

our notation, but another notation is ∂f/∂x.

Also, if you wish, at ab, but it doesn't have to be.

And likewise, a notation for the second one, the derivative with respect to y, is ∂f/∂y.

So this is a particular notation.

This should not be confused with the

straight d, with just a straight letter d.

It's not the same.

In fact, this actually makes sense, which I will

explain on Thursday.

I will finally explain what dx means.

But this by itself doesn't make any sense, this

makes sense, ∂/∂x.

∂/∂x is a procedure, which you can apply to a function

and it gives you the first partial derivative.

This is an operation which -- in the case of one variable

we just denote by prime.

In the case of one variable, you have f of x, you write f-prime

of x or you write df/dx.

But now we cannot write like this, as I will explain in

more detail on Thursday.

To differentiate you have to specify in which direction

you differentiate and this is one way to do it.

Say you choose to differentiate with respect to the x direction

and then you get this.

But this is not to say -- the numerator by itself doesn't

make any sense as a notation, and the denominator also

doesn't make any sense.

Only these two things together make sense.

This is a notion of partial derivative, and likewise, you

have the notion of partial derivative with respect to y,

which takes any function to its derivative with respect to y. This is f

sub x and this is f sub y.

OK?
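
The two notations can be lined up side by side. The second-order entries are the standard convention, not something stated in the lecture so far, so take them as a reference sketch: in subscript notation the newest derivative is written last, while in the curly-d notation the newest operator is written first (leftmost).

```latex
\begin{align*}
f_x &= \frac{\partial f}{\partial x}, &
f_y &= \frac{\partial f}{\partial y},\\
f_{xx} &= \frac{\partial}{\partial x}\!\left(\frac{\partial f}{\partial x}\right) = \frac{\partial^2 f}{\partial x^2}, &
f_{xy} &= \frac{\partial}{\partial y}\!\left(\frac{\partial f}{\partial x}\right) = \frac{\partial^2 f}{\partial y\,\partial x}.
\end{align*}
```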

This on the other hand, df, is an entirely different

object, the differential.

This is an entirely different object, it's called

a differential.

And it's not the same, likewise, it's not df.

So this is not even a letter, if you think about it.

It's not even a letter of any reasonable alphabet.

It's just a mathematical notation for

partial derivative.

So this, of course, begs the question as to what

is the differential.

What is the differential and what on earth does this mean?

Because this is something we've been using quite a lot,

but never really -- I've never really spelled out

what we mean by this.

But actually it has a very precise meaning, differential

and dx and dy, and this is what we're going to discuss next.

In fact, I have about five minutes left, so I'll give you

a little preview of what's coming on Thursday, and it's

really a very important subject, which

unfortunately has been made really, really obscure

by a very unfortunate choice of notation.

A very bad notation makes it very obscure and very

difficult to understand.

So I remember when I was learning this for the first

time it was impossible to understand.

So it took me a long time to figure it out, but I'm happy to

share it with you, I have to tell you, because it's actually

very simple, and we already know everything that we

need to know about this.

So what I want to do is just tell you just a little bit,

just a couple things about it.

And I will, as always, I will start with the function one

variable, because that's already a very good example where you can

understand what the differential is and what all

this notation means.

So in fact, I shouldn't have erased it because I'm going to

draw it again, but I just wanted to draw it in a slightly

different way, in a more -- the way I usually draw which

is kind of a -- this is the optimistic view of

reality as it goes up.

The other one's down, that's why I erased it.

So, we talk about tangent lines, and we talk about

importance of tangent lines, and the importance of tangent

line really is that it gives you a very useful approximation

to a complicated function on a very small scale.

So the differential, the differential really is

the function whose graph the tangent line is.

So the funny thing is that we talk about a function in one

variable, so in this case let's say you have a function f of x.

And this yellow curve represents the graph of this

function, that is, the set of solutions to the equation

y equals f of x.

So for this function we have two objects -- we have the

function named f, and we have the graph which is the yellow

curve, we have two objects.

And then we talk about the tangent line, and tangent line

of course we understand geometrically very clearly,

we choose a particular point, so let's say

x0, right, but we never talk about the function which

gives us this tangent line as the graph.

Somehow we usually ignore this question.

So the yellow curve is the graph of this function.

But the tangent line is also the graph of a function, a much

simpler function, which is actually a linear function,

and this function is the differential.

This one is also the graph of a function, namely df.

This is what we mean by df.

Let me write it in words.

Namely the differential of f at this point.

That's what the differential is.

The only subtle point is that for the differential we shift

the coordinates -- we choose a new coordinate system where

the origin is at our point, that's all.

In other words, we view now this line as a graph with

respect to a new coordinate system where the origin is not

here -- it's not at some arbitrary point, but actually

at the point which we are analyzing.

And if you write down the function whose graph this

tangent line is, you will have precisely the differential.

That's how it works for functions in one variable.
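
To make this concrete, here is a minimal sketch, not taken from the lecture, of the differential as a function of the shifted coordinate dx = x - x0. The helper name differential and the choice f(x) = sin(x) at x0 = 1 are assumptions made purely for illustration. The loop compares the change along the curve with the change along the tangent line for smaller and smaller dx.

```python
import math

def differential(fprime, x0):
    """Return df at x0: the linear function dx -> f'(x0) * dx."""
    slope = fprime(x0)
    return lambda dx: slope * dx

# Illustrative choice (not from the lecture): f(x) = sin(x), f'(x) = cos(x).
f, fprime = math.sin, math.cos
x0 = 1.0
df = differential(fprime, x0)

for dx in (0.1, 0.01, 0.001):
    actual_change = f(x0 + dx) - f(x0)  # change along the curve
    linear_change = df(dx)              # change along the tangent line
    print(dx, actual_change, linear_change, abs(actual_change - linear_change))
```

The last printed column is the gap between the curve and its tangent line; it shrinks much faster than dx itself, which is the sense in which the tangent line is a good approximation on a very small scale.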

And now it's absolutely clear what will happen for

functions in two variables.

For functions in two variables, we are going to look at the

tangent plane to this graph, which is represented here, or

if you wish, that's the tangent plane which I was

talking about.

And then you will think of this tangent plane also as a graph

of a function, of a linear function, and that linear

function is the differential of the function in two variables

you started with.

So that's the short version of this, and I will give you

more details on Thursday.
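
As a hedged preview in symbols (the details are promised for Thursday, so this is the standard formula rather than something derived in this lecture): with shifted coordinates dx = x - a, dy = y - b, and the origin moved to the point on the graph, the tangent line and the tangent plane become graphs of linear functions.

```latex
\begin{align*}
\text{one variable:}\quad & y - f(x_0) = f'(x_0)\,(x - x_0) &&\Longrightarrow\quad df = f'(x_0)\,dx,\\
\text{two variables:}\quad & z - f(a, b) = f_x(a, b)\,(x - a) + f_y(a, b)\,(y - b) &&\Longrightarrow\quad df = f_x(a, b)\,dx + f_y(a, b)\,dy.
\end{align*}
```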